The Journal of Neuroscience. 2017 Jul 19;37(29):7023–7035. doi: 10.1523/JNEUROSCI.0692-17.2017

Contrasting Effects of Medial and Lateral Orbitofrontal Cortex Lesions on Credit Assignment and Decision-Making in Humans

MaryAnn P Noonan 1,2, Bolton KH Chau 3, Matthew FS Rushworth 1,4, Lesley K Fellows 2
PMCID: PMC6705719  PMID: 28630257

Abstract

The orbitofrontal cortex is critical for goal-directed behavior. Recent work in macaques has suggested the lateral orbitofrontal cortex (lOFC) is relatively more concerned with assignment of credit for rewards to particular choices during value-guided learning, whereas the medial orbitofrontal cortex (often referred to as ventromedial prefrontal cortex in humans; vmPFC/mOFC) is involved in constraining the decision to the relevant options. We examined whether people with damage restricted to subregions of prefrontal cortex showed the patterns of impairment observed in prior investigations of the effects of lesions to homologous regions in macaques. Groups of patients with either lOFC (predominantly right hemisphere), mOFC/vmPFC, or dorsomedial prefrontal (DMF) lesions, and a comparison group of healthy age- and education-matched controls performed a probabilistic 3-choice decision-making task. We report anatomically specific patterns of impairment. We found that credit assignment, as indexed by the normal influence of contingent relationships between choice and reward, is reduced in lOFC patients compared with Controls and mOFC/vmPFC patients. Moreover, the effects of reward contingency on choice were similar for patients with lesions in DMF or mOFC/vmPFC, compared with Controls. By contrast, mOFC/vmPFC-lesioned patients made more stochastic choices than Controls when the decision was framed by valuable distracting alternatives, suggesting that value comparisons were no longer independent of irrelevant options. Once again, there was evidence of regional specialization: patients with lOFC lesions were unimpaired relative to Controls. As in macaques, human lOFC and mOFC/vmPFC are necessary for contingent learning and value-guided decision-making, respectively.

SIGNIFICANCE STATEMENT The lateral and medial regions of the orbitofrontal cortex are cytoarchitectonically distinct and have different anatomical connections. Previous investigations in macaques have shown these anatomical differences are accompanied by functional specialization for learning and decision-making. Here, for the first time, we test the predictions made by macaque studies in an experiment with humans with frontal lobe lesions, asking whether behavioral impairments can be linked to lateral or medial orbitofrontal cortex. Using equivalent tasks and computational analyses, our findings broadly replicate the pattern reported after selective lesions in monkeys. Patients with lateral orbitofrontal damage had impaired credit assignment, whereas damage to medial orbitofrontal cortex meant that patients were more likely to be distracted by irrelevant options.

Keywords: credit assignment, decision-making, orbitofrontal cortex, prefrontal cortex, reward, ventromedial prefrontal cortex

Introduction

The ventral and orbital frontal lobes have been implicated in various aspects of reinforcement-guided learning and decision-making. Patients with damage here have difficulty flexibly learning from feedback (Rolls et al., 1994; Hornak et al., 2004; Tsuchida et al., 2010) and making decisions (Fellows and Farah, 2007; Clark et al., 2008). Such studies have focused on patients with damage to medial orbitofrontal cortex (mOFC), often referred to as ventromedial prefrontal cortex (vmPFC), although lesions often extend into adjacent regions, including lateral orbitofrontal cortex (lOFC). There is evidence for anatomical and connectional differences between lOFC and mOFC/vmPFC in both humans and macaques (Carmichael and Price, 1994, 1995a, b; Ongür and Price, 2000; Kahnt et al., 2012; Zald et al., 2014; Neubert et al., 2015). Here we investigate whether learning and decision-making impairments in human patients can be linked to lOFC instead of, or as well as, mOFC/vmPFC.

In monkeys, there is a double dissociation between lOFC and mOFC/vmPFC lesion effects (Noonan et al., 2010; Walton et al., 2010). lOFC lesions impair credit assignment: the ability to assign reward outcomes to particular stimulus choices. Normal monkeys attribute expected value to a stimulus as a function of the precise history of reward received in association with the choice of that particular stimulus, in accordance with the “Law of Effect” (Thorndike, 1933a). In contrast, animals with lOFC lesions are no longer able to associate a reward outcome with the corresponding choice on which it was contingent. Instead, animals value a particular stimulus as a proximity-weighted function of the history of all rewards received approximately at the time of choice, even when the rewards were actually caused by choices of the alternative stimuli on preceding and subsequent trials, a phenomenon that Thorndike termed “Spread of Effect.”

By contrast, mOFC/vmPFC lesions appear to impair reward-guided decision-making because they disrupt the comparison of choice values (Noonan et al., 2010). Rather than choosing the highest value option, macaques with mOFC/vmPFC lesions were more likely than controls to choose the second best option. This impairment manifested partly as a function of the expected value of a third option. Economic theory suggests that rational decisions between any given pair of options should be made in the same manner, regardless of other available options (Luce, 1959; Ray, 1973). A 3-choice task involves value comparisons between each pair of options, with the third option essentially a “distractor” for each comparison. The value of this distractor, irrelevant in principle, nonetheless has an impact on behavior in mOFC/vmPFC-lesioned animals. Without the mOFC/vmPFC, animals are less likely to choose the best in a pair when another valuable alternative was present, as if they rely more on a decision mechanism subject to divisive normalization, such as that identified in the intraparietal sulcus (Louie et al., 2011; Chau et al., 2014).

Macaque mOFC and lOFC share many similarities with human mOFC/vmPFC and lOFC (Neubert et al., 2015). Functional variation is seen in vmPFC/mOFC and lOFC activity measured with fMRI (Noonan et al., 2011; Howard et al., 2015; Howard et al., 2016). However, the differential effects of lesions in these subregions have not been directly compared in humans in relation to credit assignment and decision-making. Fifteen people with lesions of prefrontal cortex and 22 healthy controls (Controls) performed probabilistic 3-armed bandit tasks. Patients had lesions that primarily affected mOFC/vmPFC, lOFC (predominantly right hemisphere), or dorsomedial frontal (DMF) cortex, the latter serving as a control group for nonspecific effects of frontal damage. Building on the macaque work described above, and taking a similar analytic approach, the study tested two hypotheses. First, patients with lesions that included lOFC, but not mOFC/vmPFC or DMF, would be impaired in credit assignment. Second, the choices of mOFC/vmPFC patients would be influenced by the expected value of the irrelevant third option. We predicted that this disrupted decision-making would be selectively associated with damage to mOFC/vmPFC, but not lOFC or DMF.

Materials and Methods

Participants.

Fifteen people (11 female) with focal lesions involving the frontal lobes were recruited from the Cognitive Neuroscience Research Registry at McGill University. Patients were eligible if they had a lesion affecting either region of interest: vmPFC/mOFC or lOFC. Patients with damage to DMF and sparing mOFC/vmPFC and lOFC were also included, as a lesioned control group. Age- and education-matched healthy control (Controls) subjects (n = 22, 11 female) were recruited through local advertisement in Montreal. They were free from neurological or psychiatric disease and not taking any psychoactive medication. Controls completed screening tests for mild cognitive impairment and depression. All scored ≥26 on the Montreal Cognitive Assessment (Nasreddine et al., 2005) and <12 on the Beck Depression Inventory. Patients completed a more extensive neuropsychological screening battery testing memory, language, attention, and executive functions (see Table 3).

Table 3.

Neuropsychological screening tests for the three patient groups [a]

| Group | BDI (/63) | IQ estimate (ANART) | Animal fluency | F-A-S fluency | Picture naming (% correct) | Incidental memory (% correct) | 2-Back working memory (% correct) |
|---|---|---|---|---|---|---|---|
| lOFC | 16.0 (12.1) | 117.2 (12.6) | 17.2 (4.5) | 34.0 (19.7) | 92.5 (5.0) [b] | 75.5 (8.1) [b] | 95.5 (5.7) [b] |
| mOFC/vmPFC | 10.2 (5.5) | 123.3 (8.9) [c] | 19.4 (2.9) | 37.0 (14.6) | 94.0 (5.5) | 82.8 (12.6) | 95.2 (7.5) |
| DMF | 8.2 (7.8) | 123.3 (6.0) [b] | 17.7 (4.9) | 33.0 (18.2) | 95.8 (5.8) | 70.3 (13.3) | 95.8 (4.2) |

[a] Data are mean (SD). ANART, American National Adult Reading Test.

[b] Score not available for 1 patient.

[c] Scores not available for 2 patients.

Motivated by the prior work in monkeys (Noonan et al., 2010; Walton et al., 2010), patients were separated into three groups a priori based on the location of their damage, assessed on their most recent MR or CT imaging. Five patients (4 female) had lesions in orbital cortex lateral to the medial orbitofrontal sulcus (four right hemisphere, one left hemisphere). The lesions encompassed an lOFC region that has been linked to credit assignment in neuroimaging studies (Chau et al., 2015; Akaishi et al., 2016; Jocham et al., 2016). We therefore refer to this group as the lOFC group but note that, in some cases, the damage extends into adjacent ventrolateral prefrontal cortex; indeed, based on functional imaging, the lOFC region of interest is at the boundary between OFC and ventrolateral prefrontal cortex. Five patients (3 women) had damage affecting vmPFC, with injury including cortex medial to the medial orbitofrontal sulcus and ventral to the cingulate sulcus. We refer to this lesion group as the mOFC/vmPFC group but note that the lesions compromised adjacent tissue to varying degrees across individuals. One patient had a lesion affecting both lOFC and mOFC/vmPFC and was therefore included in both groups (see Statistical analysis). Six patients (4 female) had lesions that spared both ventral regions of interest, involving cortex dorsal to the cingulate sulcus and dorsomedial to the superior frontal sulcus. One patient also had damage to the parietal lobe. This group is referred to as DMF. Groupwise lesion overlap images were generated by registering patients' lesions to the MNI brain. The overlap images for the three anatomically defined groups are shown in Figure 1C.

Figure 1.

A, Experimental design of the 3-armed bandit task. Intertrial interval, ITI. B, Reward probabilities of the three options fluctuated independently across trials. C, Lesion locations in mOFC/vmPFC patients (n = 5; area of maximum overlap in 3 patients), DMF (n = 6; area of maximum overlap in 4 patients), and lOFC (n = 5; area of maximum overlap found in 3 patients). Colors represent degree of lesion overlap, as indicated by the color bar. The patient whose lesion covers mOFC/vmPFC and lOFC is included in both lesion maps.

Patients were studied at least 6 months after injury (median time since injury = 6.5 years, range = 2.4–11.8 years). Damage to mOFC/vmPFC was caused by tumor resection in 2 cases and hemorrhage in 3 cases. Damage to lOFC was caused by tumor resection in 4 cases and ischemic stroke in 1 case. Damage to DMF was caused by tumor resection in 3 cases and ischemic stroke in 2 cases.

All subjects provided written informed consent in accordance with the Declaration of Helsinki and were compensated for their time with a nominal fee, plus earnings based on the rewards gained in the task. The study was approved by the MNI's research ethics board.

Procedure and equipment.

Subjects were invited to play a 3-armed bandit task. The game was contextualized in terms of a free trip to the casino. The subjects were told that each different picture represented a different slot machine. They were reminded that a slot machine has a hidden likelihood of paying out on each play and that this game worked in the same way.

During the testing session, three novel distinguishable fractal stimuli were presented on the screen of a laptop computer (Fujitsu, Lifebook T, with Windows Vista) running Presentation (Neurobehavioral Systems, version 14.9). In each trial, the stimuli were presented in one of three computer-randomized locations within a triangle configuration. A question mark at the center of the screen would disappear once the subject made a choice (see Fig. 1A). The subjects selected a stimulus by pressing the arrow on the keyboard corresponding to their chosen stimulus' current location (left, upper, or right). Registration of a correct response resulted in the appearance of a green checkmark at the center of the screen, according to the probability defined for that stimulus for the schedule under which the subject was currently being tested. The correct feedback caused a green money bar at the bottom of the screen to increase by a fixed number of pixels, keeping a tally of each subject's winnings. Feedback was presented for 1500 ms. Stimuli would remain on screen until feedback. Subjects were told that their job was to accumulate points throughout the task and that this would be converted into payment added to the amount they would receive as compensation for participating in the experiment (no more than an additional $5). Incorrect choices resulted in a red cross appearing at the center of the screen. The intertrial interval was 1000 ms.

Before the main experiment, all subjects first briefly practiced with a 2-choice game and then a 3-choice game, to make sure they understood the instructions. The reward schedule used in the main experimental session was 500 trials long and contained three options. Regardless of what the subject chose, the best option could change after ∼25 trials (see Fig. 1B). Subjects completed two schedules with new stimuli for each schedule, and the order of stimuli used for each schedule was counterbalanced across subjects. Subjects had the opportunity to take a break halfway through each of the two reward schedules and at the end of the first schedule. Testing took ∼1.5 h to complete. All trials completed are included in each of the analyses.

Patients were tested either in a quiet room of their home or in a quiet experimental testing room at the MNI. All healthy control subjects were tested at the MNI.

Statistical analysis.

The behavioral analyses were performed using MATLAB 6.5 (The MathWorks) and SPSS (version 22). Equal variance could not generally be assumed so, where appropriate, corrected t tests are reported, while ANOVAs were Huynh-Feldt-corrected. All data points were log-transformed if the analysis set contained samples that were more than 3 SD from the mean. Unless otherwise stated, data from each of the two key groups of interest, lOFC and mOFC/vmPFC groups, were analyzed separately and compared with the data from the age-matched Controls. The two experimental groups were compared directly with one another only after the patient whose lesion involved both regions of interest had been removed from both groups. DMF patients were compared directly with Controls, and also compared with each respective experimental lesion group.

Demographic measures (age and years of education) were compared between healthy Controls and all patient groups, whereas the DMF brain-damaged control group was compared with the experimental patient groups on the screening tests (Beck Depression Inventory-II, American National Adult Reading Test, Animal Fluency, F-A-S Fluency, Picture Naming, Incidental Memory, and 2-Back working memory) using independent samples t tests. Lesion volumes in patient groups were compared with independent t tests.

1. Total success and best choices.

The total number of rewards (green checkmarks) received was calculated for each subject. Control group scores were compared with those of each patient group in independent-samples t tests. The data were also analyzed as a function of the subjective expected values of all three stimuli. In this task, expected value corresponds to the estimate of reward probability associated with each stimulus. This is based on the outcomes experienced over the preceding trials. We estimated it using a standard Rescorla–Wagner learning model with a Boltzmann action selection rule. The reward learning rate (α) was fitted individually to each subject's data using standard nonlinear minimization procedures. These data were used to estimate the expected value of each of the three stimuli on every trial (the same learning rate was used for all three stimuli). The aim was to identify the best (V1), second best (V2), and worst (V3) expected value stimulus for every trial and determine the probability that subjects chose the best option. The proportions of choices of the best and worst options were compared in independent 2 (group: Controls vs lOFC | mOFC/vmPFC | DMF) × 2 (reward schedule: 1 vs 2) × 2 (half: first vs second halves of reward schedule) mixed ANOVAs.
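The value-estimation step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' MATLAB code: the grid search stands in for the unspecified "standard nonlinear minimization procedures", and the fixed inverse temperature `beta`, the starting value `v0 = 0.5`, and all function names are assumptions introduced here.

```python
import math

def softmax(values, beta):
    """Boltzmann choice probabilities for a list of expected values."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def rw_negative_log_likelihood(choices, rewards, alpha, beta, n_options=3, v0=0.5):
    """Run Rescorla-Wagner updates over a session.

    Returns the negative log-likelihood of the observed choices and the
    expected values held *before* each trial's choice.
    """
    values = [v0] * n_options  # assumed starting value, not from the source
    nll = 0.0
    history = []
    for choice, reward in zip(choices, rewards):
        history.append(list(values))
        probs = softmax(values, beta)
        nll -= math.log(probs[choice])
        # Only the chosen option is updated, with a single learning rate.
        values[choice] += alpha * (reward - values[choice])
    return nll, history

def fit_learning_rate(choices, rewards, beta=5.0, grid=None):
    """Pick the learning rate on a grid that minimizes the negative log-likelihood."""
    grid = grid or [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda a: rw_negative_log_likelihood(choices, rewards, a, beta)[0])

def rank_options(values):
    """Return option indices ordered best (V1), second best (V2), worst (V3)."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)
```

With the per-trial values in hand, the proportion of trials on which the chosen option equals `rank_options(values)[0]` gives the probability of choosing the best option.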

2. Credit assignment.

We examined how each subject's current choice was influenced by choices they had made and rewards they had received in the prior few trials. We have illustrated all of the possible combinations of choice and outcome in the recent past (trials n − 1 to n − 4) as a matrix in Figure 3A and will describe four particularly important areas within this matrix in relation to current choice behavior.

Figure 3.

Distribution of reinforcement. A, Labeled matrix of components included in logistic regression. Red diagonal represents appropriate, contingent links between choices and rewards. Green section represents the influence of the association between the most recent reward and the choices made on previous trials. Blue section represents the influence of the association between the most recent choice and the rewards received on past trials. B, Z-transformed β regression weights for this matrix for each group. Bright pixels represent larger regression weights. C, Plots of mean influence of labeled marked cells in A for Controls, lOFC, DMF, and mOFC/vmPFC. Ci, Red diagonal in A. Cii, Green leftmost column in A. Ciii, Blue top row in A. The first trial in the past is common to all three graphs (Ci, Cii, Ciii). Symbols and bars represent mean ± SEM. “o,” individual subject weights. Controls are plotted against lOFC, DMF, and mOFC/vmPFC from left to right. Rightmost panel replots lOFC against mOFC/vmPFC. Ci–Ciii, Right-hand graphs, Black “o,” subject whose lesion affects both mOFC/vmPFC and lOFC. Statistics for direct comparisons between mOFC/vmPFC and lOFC lesions leave out this subject. Compared with Controls (Ci, left) and mOFC/vmPFC patients (Ci, right), lOFC patients exhibit impaired credit assignment with reduced β weights reflecting the loss of influence of the immediate past reward from the immediate past choice (R−1xC−1) on the current choice. By contrast, lOFC patients relative to Controls (Cii, left) and mOFC/vmPFC patients (Cii, right) show significantly greater Spread of Effect of reinforcement received on the last trial to each of the earlier choices (R−1xC−2:4). Significance bars represent main effects of group. These positive effects correspond to analyses 1 and 3 outlined in Materials and Methods (2, credit assignment). *Denotes statistical difference.

  1. Current choice is often predicted by the immediate preceding choice (C−1) and by whether or not a reward was received for that choice (R−1). This corresponds to R−1xC−1 in the upper left corner marked as red to represent a contingent relationship between choice and reward in Figure 3A. Thorndike (1933a) argued that, if the immediately preceding action is rewarded, then that action is reinforced and is likely to be made again in the future: the “Law of Effect.”

  2. Further past choices and their contingent outcomes beyond the immediate past choice can also influence current choice. This corresponds to the red diagonal (R−2xC−2, R−3xC−3, R−4xC−4, abbreviated to R−2:4xC−2:4) in Figure 3A.

Thorndike and others (Thorndike, 1933b; Noonan et al., 2010; Walton et al., 2010; Jocham et al., 2016) have observed that outcomes can be erroneously associated with temporally adjacent but unrelated choices and can go on to influence current choices, the phenomenon termed “Spread of Effect.” In other words, current choice can be influenced by choices and outcomes that are “off the diagonal” in Figure 3A.

  3. The current choice can be influenced by rewards that were actually caused by choices of the alternative stimuli on preceding trials. Figure 3A (green leftmost column) illustrates this backward spread of reward. This column shows how reward that was received on the immediately preceding trial (R−1) may interact with choices made two (C−2), three (C−3), or four (C−4) trials ago and influence the current choice. We denote these factors as R−1xC−2, R−1xC−3, and R−1xC−4 (abbreviated to R−1xC−2:4).

  4. The current choice can also be influenced by past rewards, caused by choices of the alternative stimuli, reinforcing more recent choices. Figure 3A (blue uppermost row) illustrates this forward spread of reward. This row shows how reward that was received two (R−2), three (R−3), or four (R−4) trials ago may interact with the choice made on the immediately preceding trial (C−1) and influence the current choice. We denote these factors as R−2xC−1, R−3xC−1, and R−4xC−1 (abbreviated to R−2:4xC−1).

We ran a multiple logistic regression analysis (Barraclough et al., 2004; Lau and Glimcher, 2007; Walton et al., 2010) to determine which combination of factors best explained choices. All of the possible combinations of choice and outcome in the recent past (trials n − 1 to n − 4) were included as regressors. To reiterate, this allowed us to investigate the influence of specific choice-outcome associations on current behavior (see Fig. 3A, red diagonal, comparisons 1 and 2). In other words, it revealed how subjects assigned outcome credit or reinforcement to their choices. This analysis also identified potential false associations when outcomes were assigned to choices made earlier (see Fig. 3A, green area, comparison 3) or on subsequent trials (see Fig. 3A, blue area, comparison 4).
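As a rough illustration of how the choice-by-reward history regressors might be assembled, the sketch below builds one row of 16 interaction terms per trial for a single option. The 0/1 coding, the per-option framing, and the names `lagged_regressors` and `cell_group` are assumptions made for this example; the source does not specify the coding, and the logistic regression fit itself is left out.

```python
def lagged_regressors(choices, rewards, option, n_lags=4):
    """Build choice-by-reward interaction regressors for one option.

    Cell "R-jxC-i" is 1 when `option` was chosen on trial n-i and a reward
    was delivered on trial n-j, else 0 (a simplifying coding assumption).
    The target is 1 when `option` is chosen on the current trial n.
    """
    rows, targets = [], []
    for n in range(n_lags, len(choices)):
        row = {}
        for i in range(1, n_lags + 1):      # choice lag
            for j in range(1, n_lags + 1):  # reward lag
                chose = choices[n - i] == option
                rewarded = rewards[n - j] == 1
                row[f"R-{j}xC-{i}"] = int(chose and rewarded)
        rows.append(row)
        targets.append(int(choices[n] == option))
    return rows, targets

def cell_group(name):
    """Classify a regressor by its position in the Figure 3A matrix."""
    reward_part, choice_part = name.split("x")
    j, i = int(reward_part[2:]), int(choice_part[2:])
    if i == j:
        return "contingent"       # red diagonal: R-1xC-1 ... R-4xC-4
    if j == 1:
        return "backward_spread"  # green column: R-1xC-2:4
    if i == 1:
        return "forward_spread"   # blue row: R-2:4xC-1
    return "other"
```

Grouping the fitted weights with `cell_group` would then yield the diagonal, green-column, and blue-row sets that the four comparisons draw on.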

We interrogated the four key choice-outcome associations described above.

  1. We assessed the influence the immediately preceding choice-outcome association has on the current choice (R−1xC−1). Patients' beta values from this single cell of the regression matrix were compared with those of Controls in independent-samples one-tailed t tests to determine whether they were lower than in Controls; lower values would indicate a failure to assign rewards to choices in a contingent manner on the preceding trial, which is usually particularly influential in determining choice on the next trial.

  2. We examined the extended history of choice and reward conjunction on even earlier trials. The beta values from the red diagonal cells were subjected to a 3 (past choice × past reward: R−2:4xC−2:4) × 2 (group: Controls, lOFC | DMF | mOFC/vmPFC) mixed ANOVA.

  3. In a third analysis, we examined the false influence of the reward just received in conjunction with the previous history of choices. The beta values from the three green vertical cells were entered into a 3 (past choices: R−1xC−2:4) × 2 (group: Controls, lOFC | DMF | mOFC/vmPFC) mixed ANOVA.

  4. Finally, we examined false associations between the choice just made and the previous reward history. The beta values from the three blue horizontal cells were entered into a 3 (past reward: R−2:4xC−1) × 2 (group: Controls, lOFC | DMF | mOFC/vmPFC) mixed ANOVA.

These four analyses were also performed to compare the lOFC and mOFC/vmPFC groups, and the lOFC and DMF groups, directly.

If a credit assignment mechanism is impaired, then instead of learning specific choice-reward conjunctions, subjects may rely on a system that forms associations between overall recency-weighted histories of choices and the overall recency-weighted histories of outcomes. In some cases, both the Law of Effect mechanism and the Spread of Effect mechanism predict the same choice, but in others they predict different choices. We compared these two situations here. We looked at the pattern of choices when a new stimulus is chosen (e.g., option B) after a long history of choice on another stimulus (i.e., option A). Options A, B, and C in this analysis refer to sequences of choosing the same option (rather than one specific stimulus). We examined the effect of an outcome, reward or no reward, on a newly chosen stimulus B, after different histories of A choices. If credit is correctly assigned, participants should always be more likely to reselect B on the following trial (n) if the choice on the previous trial (n − 1) was rewarded than if it did not result in reward. By corollary, they should be less likely to switch back to A after Bs that are rewarded than those that are not (Law of Effect). Moreover, this effect should be independent of choice history if all credit is properly assigned to the new choice, B. By contrast, if the credit for the new outcome is assigned not to the choice that causes the outcome, but instead to the integrated history of choices, then credit for a reward after choosing option B will be assigned partly to previous choices of option A (Spread of Effect). Choices of option C, which was not chosen on any of the trials, should not be affected.

We assessed the differential influence of a reward (or no reward, ±) for a previous choice of option B on subsequent choice of options A, B, or C (subsequent choice is denoted as “?”). Further, to assess the impact of choice history on subsequent choice, we compared sequences that contained different numbers of past A choices. Trials were binned in two categories: subjects had either chosen a single A and then a B option (A, B±, ?) or they had previously chosen two or three A options before the B choice (A, A, B±, ? and A, A, A, B±, ?). Differential influences on choices A and B were compared across groups in a 2 (subsequent choice: ? = A vs ? = B) × 2 (choice history: AB? vs AAB? + AAAB?) × 2 (group: Controls vs lOFC | DMF | mOFC/vmPFC) mixed ANOVA. This was followed by selective mixed ANOVAs for all three subsequent choice types (choice history × group for Choice A, B, or C).
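One way to tabulate these sequences is sketched below. It is an illustration under stated assumptions: option B is taken to be any choice that differs from the immediately preceding choice, runs of A are counted up to three trials back (so longer runs fall in the long-history bin), and the function names are hypothetical.

```python
from collections import defaultdict

def switch_outcome_table(choices, rewards, max_run=3):
    """Count what is chosen after a switch from option A to a new option B.

    Keys are (history, rewarded, next_choice), where history is "AB"
    (one prior A) or "AAB+" (two or three prior A choices), rewarded is
    the outcome of the B choice, and next_choice is "A", "B", or "C".
    """
    counts = defaultdict(int)
    for t in range(1, len(choices) - 1):
        b, a = choices[t], choices[t - 1]
        if a == b:
            continue  # no switch on this trial
        run = 1
        while run < max_run and t - 1 - run >= 0 and choices[t - 1 - run] == a:
            run += 1
        history = "AB" if run == 1 else "AAB+"
        nxt = choices[t + 1]
        label = "A" if nxt == a else "B" if nxt == b else "C"
        counts[(history, bool(rewards[t]), label)] += 1
    return dict(counts)

def p_next(counts, history, rewarded, label):
    """Proportion of matching sequences followed by the given choice."""
    total = sum(n for (h, r, _), n in counts.items() if h == history and r == rewarded)
    if total == 0:
        return float("nan")
    return counts.get((history, rewarded, label), 0) / total
```

The difference `p_next(counts, h, True, "A") - p_next(counts, h, False, "A")` for each history bin `h` is one way to express the differential influence of reward on returning to A.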

3. Value-based decision-making.

We tested whether participants' choices between the three options (A, B, C) were predicted by the options' expected values (VA, VB, VC), as estimated from the Rescorla–Wagner learning model while controlling for the value interactions between them (VA × VB, VA × VC, and VB × VC). Options A, B, and C refer to particular stimulus options. Critically, the following analyses reframed the 3-choice decision as two binary value comparisons between pairs of options.

The first analysis contained three steps before group-level statistical comparisons. First, we applied a multinomial logistic regression analysis, which can be considered an extension of a binomial logistic regression to allow for a dependent variable with more than two categories. The model predicted the proportion of choices of either A or C from the options' expected values, with choices of option B assigned as a reference category. The analysis yielded two general linear equations, in which the resulting beta coefficients express the influence of the options' values and interactions on the logarithmic odds of each binary comparison: predicting option A choices relative to B choices (Eq. 1) and predicting option C choices relative to B choices (Eq. 2), as follows:

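$$\ln\frac{P(A)}{P(B)} = \beta_0 + \beta_1 V_A + \beta_2 V_B + \beta_3 V_C + \beta_4 V_A V_B + \beta_5 V_A V_C + \beta_6 V_B V_C \tag{1}$$

$$\ln\frac{P(C)}{P(B)} = \beta'_0 + \beta'_1 V_C + \beta'_2 V_B + \beta'_3 V_A + \beta'_4 V_C V_B + \beta'_5 V_C V_A + \beta'_6 V_B V_A \tag{2}$$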

In Equations 1 and 2, the same expressions were associated with different behavioral meanings. For example, VA was the expected value of a decision-relevant option in Equation 1, but it was the value of a decision-irrelevant option, “distractor,” in Equation 2. Next, Equations 1 and 2 were generalized such that options X and Y are the options being compared, with option Y as the reference, option X as the comparator, and option D denoting the irrelevant option.

For Equation 1, where X = A, Y = B, D = C, this results in the following:

$$\ln\frac{P(X)}{P(Y)} = \beta_0 + \beta_1 V_X + \beta_2 V_Y + \beta_3 V_D + \beta_4 V_X V_Y + \beta_5 V_X V_D + \beta_6 V_Y V_D \tag{3}$$

For Equation 2, where X = C, Y = B, D = A, this results in the following:

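$$\ln\frac{P(X)}{P(Y)} = \beta'_0 + \beta'_1 V_X + \beta'_2 V_Y + \beta'_3 V_D + \beta'_4 V_X V_Y + \beta'_5 V_X V_D + \beta'_6 V_Y V_D \tag{4}$$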

Finally, because the regression weights of the two resultant equations now had comparable meanings with respect to whether they were relevant to the options compared, the two sets of regression weights were averaged.
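The relabeling and averaging step can be sketched as below. The dictionary layout, the key names such as `"VAxVC"`, and the numbers in the usage lines are hypothetical and chosen only to make the role mapping (comparator X, reference Y, distractor D) concrete; they are not values from the study.

```python
def relabel_by_role(weights, x, y, d):
    """Map weights keyed by option (e.g. "VA", "VAxVC") onto roles X, Y, D."""
    def pair(p, q):
        # Interaction keys are assumed to be stored alphabetically, e.g. "VAxVC".
        first, second = sorted([p, q])
        return f"V{first}xV{second}"
    return {
        "VX": weights[f"V{x}"],
        "VY": weights[f"V{y}"],
        "VD": weights[f"V{d}"],
        "VXxVY": weights[pair(x, y)],
        "VXxVD": weights[pair(x, d)],
        "VYxVD": weights[pair(y, d)],
    }

def average_weights(first, second):
    """Average two role-labeled weight sets term by term."""
    return {k: (first[k] + second[k]) / 2 for k in first}

# Made-up weights for the A-versus-B and C-versus-B comparisons.
a_vs_b = {"VA": 2.0, "VB": -1.0, "VC": 0.25, "VAxVB": 0.0, "VAxVC": 0.5, "VBxVC": -0.5}
c_vs_b = {"VA": 0.75, "VB": -2.0, "VC": 3.0, "VAxVB": 0.25, "VAxVC": 1.5, "VBxVC": 1.0}

eq3 = relabel_by_role(a_vs_b, x="A", y="B", d="C")  # X = A, Y = B, D = C
eq4 = relabel_by_role(c_vs_b, x="C", y="B", d="A")  # X = C, Y = B, D = A
averaged = average_weights(eq3, eq4)
```

Note that the same fitted term changes role between the two comparisons: the weight on VA is a comparator weight in `eq3` but a distractor weight in `eq4`.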

The average regression weights from VX, VY, and VD (from Eqs. 3, 4) were compared in Controls in a one-way repeated-measures ANOVA, followed by post hoc t tests against zero. Each lesion group was compared with Controls in separate 3 (decision term: VX, VY, VD) × 2 (group: Controls vs mOFC/vmPFC | lOFC | DMF) mixed ANOVAs.

The second analysis aimed to assess the impact of the value of the decision-irrelevant option (VD) on the choice between the two relevant options (VX and VY). Previous work in macaques suggested that mOFC/vmPFC plays a key role in focusing the choice on the decision-relevant options, particularly when there are distracting alternatives. We hypothesized that decisions between two options would not only depend on expected value differences between any two options but also on the expected value of a third irrelevant option (Noonan et al., 2010).

The second analysis had three additional steps from Equations 3 and 4 before group-level statistical comparisons. First, the general linear Equations 3 and 4 were rearranged to isolate the (VX − VY)VD term. This is achieved by noting that the additive influence of the value of the two options (e.g., β1VX + β2VY) is equivalent to the value difference between the options (e.g., ((β1 − β2)/2)(VX − VY)) added to the sum of the values (e.g., ((β1 + β2)/2)(VX + VY)). The additive influence of β1VX and β2VY, as well as β5VXVD and β6VYVD, can therefore be expressed as shown in Equations 5 and 6, respectively. Substituting Equations 5 and 6 into Equation 3 yields Equation 7, which examines more complex decision contexts. Again, Equation 7 generalizes and the appropriate substitutions can be made into the C versus B comparison in Equation 4 as follows:

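$$\beta_1 V_X + \beta_2 V_Y = \frac{\beta_1 - \beta_2}{2}(V_X - V_Y) + \frac{\beta_1 + \beta_2}{2}(V_X + V_Y) \tag{5}$$

$$\beta_5 V_X V_D + \beta_6 V_Y V_D = \frac{\beta_5 - \beta_6}{2}(V_X - V_Y)V_D + \frac{\beta_5 + \beta_6}{2}(V_X + V_Y)V_D \tag{6}$$

$$\ln\frac{P(X)}{P(Y)} = \beta_0 + \frac{\beta_1 - \beta_2}{2}(V_X - V_Y) + \frac{\beta_1 + \beta_2}{2}(V_X + V_Y) + \beta_3 V_D + \beta_4 V_X V_Y + \frac{\beta_5 - \beta_6}{2}(V_X - V_Y)V_D + \frac{\beta_5 + \beta_6}{2}(V_X + V_Y)V_D \tag{7}$$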

Consequently, the resulting two sets of regression weights reflecting Equation 7 for each binary comparison were averaged.

We tested how the expected value of the irrelevant option VD could affect the comparison between X and Y (i.e., ((β5 − β6)/2)(VX − VY)VD) while controlling for the effects of VX − VY, VX + VY, VX × VY, VD, and (VX + VY)VD. Average β weights for the (VX − VY)VD term from the two binary comparisons were compared between Controls and each lesion group in a priori independent-samples t tests. This analysis was also performed to compare mOFC/vmPFC and lOFC groups, and mOFC/vmPFC and DMF groups directly. Based on previous findings, we expected a reduction in the impact of VX and VY on choices between X and Y in the presence of high value (vs low value) decision-irrelevant distractors in vmPFC-lesioned groups relative to Controls and lesion control groups. Therefore, one-tailed statistics were applied.
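The rearrangement into difference and sum terms can be checked numerically. The sketch below converts the weights on VX, VY, VX×VD, and VY×VD into the four rearranged weights and evaluates both forms on the same values; the function names and the numbers in the usage lines are illustrative assumptions, not fitted results.

```python
def to_difference_sum(b1, b2, b5, b6):
    """Convert weights on VX, VY, VX*VD, VY*VD into difference/sum weights."""
    return {
        "VX-VY": (b1 - b2) / 2,
        "VX+VY": (b1 + b2) / 2,
        "(VX-VY)VD": (b5 - b6) / 2,  # the distractor-dependent term of interest
        "(VX+VY)VD": (b5 + b6) / 2,
    }

def original_terms(b1, b2, b5, b6, vx, vy, vd):
    """Contribution of the four terms in their original form."""
    return b1 * vx + b2 * vy + b5 * vx * vd + b6 * vy * vd

def rearranged_terms(w, vx, vy, vd):
    """Contribution of the same four terms after rearrangement."""
    return (w["VX-VY"] * (vx - vy)
            + w["VX+VY"] * (vx + vy)
            + w["(VX-VY)VD"] * (vx - vy) * vd
            + w["(VX+VY)VD"] * (vx + vy) * vd)

# Made-up weights and values, used only to confirm the two forms agree.
w = to_difference_sum(2.0, -1.0, -0.5, 0.25)
before = original_terms(2.0, -1.0, -0.5, 0.25, vx=0.75, vy=0.5, vd=0.25)
after = rearranged_terms(w, vx=0.75, vy=0.5, vd=0.25)
```

In this example a negative `w["(VX-VY)VD"]` means that the value difference between X and Y has less influence on choice as the distractor value rises, which is the direction of effect the one-tailed tests look for.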

Results

Lesion overlaps and demographics

Lesion patients were divided a priori into three subgroups: mOFC/vmPFC, lOFC, and DMF based on the location of their damage assessed on their most recent MRI or CT. Figure 1C shows the overlap image of lesion tracings manually registered to the MNI brain for each group. Lesion and cluster volumes are reported in Table 1. Table 1 and the main statistical tests on behavioral performance were pooled over left and right hemisphere lesion sites. However, for illustration purposes in the figure, we show lesion data for left and right hemisphere lOFC lesions separately. We calculated the maximal number of patients with voxels damaged within a lesioned area. Despite variability in lesion location and extent, voxels in a cluster of 209.4 cm3 (centered on MNI coordinates of 6, 31, −20) were damaged in 3 of 5 mOFC/vmPFC patients. To calculate lOFC overlap extent, the single left hemisphere lOFC patient's lesion mask was flipped into the right hemisphere. In total, across left and right lOFC lesion groups, voxels extending over 489.4 cm3 (centered on MNI coordinates of 46, 26, 3) were damaged in 4 of 5 patients. Finally, voxels in a cluster of 44.3 cm3 (centered on MNI coordinates of −5, 19, 56) were damaged in 4 of 6 DMF patients. There was no difference in mean lesion volume between the two experimental groups and the DMF brain-damaged control group (lOFC vs DMF, t(7.73) = −0.91, p = 0.393; mOFC/vmPFC vs DMF, t(8.57) = 1.05, p = 0.321).
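The overlap statistics described above (maximum number of patients sharing a voxel, number of voxels at that maximum, and their center of gravity) could be computed along the following lines. This is a simplified sketch that assumes each lesion mask is a set of integer (x, y, z) voxel coordinates already registered to MNI space; the function names are hypothetical and the actual imaging pipeline is not described in the source.

```python
from collections import Counter

def flip_hemisphere(mask):
    """Mirror a lesion mask across the midline (x -> -x in MNI space)."""
    return {(-x, y, z) for (x, y, z) in mask}

def overlap_summary(masks):
    """Summarize voxelwise overlap across a group of lesion masks."""
    # Each mask is a set, so a voxel counts at most once per patient.
    counts = Counter(voxel for mask in masks for voxel in mask)
    peak = max(counts.values())
    peak_voxels = [v for v, n in counts.items() if n == peak]
    n_peak = len(peak_voxels)
    centre = tuple(sum(v[k] for v in peak_voxels) / n_peak for k in range(3))
    return {
        "max_overlap": peak,           # most patients sharing any voxel
        "n_peak_voxels": n_peak,       # voxels damaged in that many patients
        "centre_of_gravity": centre,   # mean coordinate of those voxels
        "total_voxels": len(counts),   # voxels damaged in at least one patient
    }
```

For the lOFC group, the single left-hemisphere mask would be passed through `flip_hemisphere` before calling `overlap_summary`, mirroring the flipping step described in the text.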

Table 1.

Lesion overlap and cluster information for the lesion groups (a)

| Group | Mean (SD) lesion volume (mm3) | Total lesion coverage (mm3) | Maximum overlap (no. of patients) | No. of voxels in maximum overlap | Center of gravity of maximum overlap (MNI) |
| lOFC (collapsed across hemispheres) | 4991.20 (3208.38) | 30,157.1 | 4 | 4894 | 46, 26, 3 |
| mOFC/vmPFC | 3374.55 (2603.88) | 8735.3 | 3 | 2094 | 6, 31, −20 |
| DMF | 2004.56 (1678.00) | 14,273.8 + 1625.3 | 4 | 443 | −5, 19, 56 |

(a) The lOFC group is reported with the single left lOFC patient's lesion mask flipped into the right hemisphere.

While some voxels were damaged in more than one lesion group, this between-group overlap was not in regions of interest. Between mOFC/vmPFC and DMF groups, 758.0 cm3 was mutually damaged in one patient group and also damaged in no more than 2 patients in the other group (one mOFC/vmPFC patient overlaps with 1 DMF patient = 646.9 cm3 and 2 DMF patients = 2.2 cm3). Two mOFC/vmPFC patients overlap with 1 DMF patient (108.9 cm3). Inspection of the lesions suggests that 1 DMF patient, whose lesion extended anteriorly into frontopolar cortex, accounts for much of the overlap. Overlap between the lOFC and DMF groups shows that 170.2 cm3 was mutually damaged in 1 patient group and in no more than 3 patients in the other group. This again is driven mainly by a single DMF patient whose lesion overlaps with the lesion in 3 lOFC patients, with decreasing extent, in medial white matter pathways (1 DMF patient's lesion overlaps with 1 lOFC patient = 69.6 cm3, 2 lOFC patients = 13.8 cm3, and 3 lOFC patients = 3.4 cm3).

Demographic information and neuropsychological screening results are provided in Tables 2 and 3, respectively. Controls were comparable to all patient groups in age (all p values > 0.148) and education (all p values > 0.781). Further, the DMF patient control group was not different from the experimental patient groups on the Beck Depression Inventory (p values > 0.254), estimated IQ (p values > 0.397), Animal Fluency (p values > 0.486), F-A-S Fluency (p values > 0.696), picture naming (p values > 0.366), Incidental memory (p values > 0.147), or letter 2-Back working memory (p values > 0.871).

Table 2.

Demographic information for the four groups (a)

| Group | Age (yr) | Education (yr) |
| Controls | 52.18 (11.42) | 15.45 (3.25) |
| lOFC | 54.4 (16.41) | 15.6 (3.36) |
| mOFC/vmPFC | 55.4 (16.64) | 15.4 (3.97) |
| DMF | 59.33 (9.40) | 15 (3.34) |

(a) Data are mean (SD).

Total rewards earned and subjective best choices

In this challenging multioption learning environment, we first examined global performance in all groups. The total rewards earned by each patient group did not significantly differ from the Control group (p ≥ 0.209). However, there was a significant difference in the way the lOFC patients distributed their choices among the options based on their subjective value. We compared the rate at which they chose the best value option (V1) as opposed to the worst value option (V3) and found a significant difference compared with Controls (group × option × reward schedule × half: F(1,25) = 5.141, p = 0.032; Fig. 2). This suggests that the lOFC group chose the subjectively worst option, at the expense of the best option, more often than the Control group, particularly in the first half of the first testing session. By contrast, DMF and mOFC/vmPFC patients were no different from Controls.

Figure 2.

Summary of the proportion of choices of V1 minus proportion of V3 choices for Controls (green), DMF (orange), lOFC (pink), and mOFC/vmPFC (blue) groups. The lOFC group chose the subjectively worst option, at the expense of the best option, more often than the Control group, particularly in the first half of the first testing session. *Denotes statistical difference.

Credit assignment

The impaired performance in the lOFC patients may, as in lOFC-lesioned monkeys (Walton et al., 2010), reflect a loss in understanding the causal relationships between particular choices and their contingent outcomes. To test this idea, we ran a multiple logistic regression analysis to determine which combination of factors best explained choices, including all possible combinations of choice and outcome in the recent past (Fig. 3A). The regression weights of these choice-outcome combinations are plotted as matrices in Figure 3B, where brighter colors represent larger weights. As suggested by Figure 3Bi, Ci, Control subjects' choices were strongly influenced by the stimuli they had recently selected and by the outcomes received for each of those choices (R−1xC−1, t(21) = 10.13, p < 0.001), an effect that diminished with increasing separation from the current trial (R−2:4xC−2:4, F(2,42) = 17.06, p < 0.001). Controls were therefore associating specific choices with the specific outcomes that followed. By contrast, supporting our hypothesis, lOFC patients had weaker assignment of credit to the immediately previous reward and choice than Controls (R−1xC−1 group: t(25) = 1.70, p = 0.038). This is apparent in Figure 3Biii (Ci, first panel), where the immediately preceding choice and reward conjunction influences current choice less in the lOFC group (top left corner is brighter in Controls than lOFC patients). There was also a trend for a similar difference between lOFC and the DMF control group (R−1xC−1 group: t(6.64) = 1.79, p = 0.060), whereas the DMF group did not differ significantly from Controls (R−1xC−1 group: t(26) = 0.19, p = 0.851; Fig. 3Bii, Ci, second panel). 
Unlike previous reports in lOFC-lesioned monkeys, the influence of past choices and their contingent outcomes beyond the first past choice did not differ between the lOFC and Control groups; while the diagonal line running from top left to bottom right in the figure is brighter in Controls than in lOFC patients, this numerical difference in the influences of these earlier past choice and reward conjunctions in the lOFC and Controls groups was not statistically significant (R−2:4xC−2:4 group: F(1,25) = 1.88, p = 0.668; Fig. 3Ci, first panel).
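To make the structure of this regression concrete, the following Python sketch builds the grid of reward-lag by choice-lag conjunction regressors for a single option. It is an illustration under stated assumptions, not the authors' code: the binary 0/1 coding, the function name `conjunction_regressors`, and the toy session are all hypothetical, and the row and column orientation need not match the matrices in Figure 3B.

```python
def conjunction_regressors(choices, rewards, option, n_back=4):
    """Build reward-lag x choice-lag regressors predicting choice of `option`.

    Entry [i-1][j-1] of each trial's matrix is 1 when trial t-i was rewarded
    AND `option` was chosen on trial t-j, else 0 (assumed binary coding).
    """
    X, y = [], []
    for t in range(n_back, len(choices)):
        matrix = [
            [int(rewards[t - i] == 1 and choices[t - j] == option)
             for j in range(1, n_back + 1)]   # choice lag j
            for i in range(1, n_back + 1)     # reward lag i
        ]
        X.append(matrix)
        y.append(int(choices[t] == option))   # outcome: was `option` chosen now?
    return X, y

# Hypothetical session: options coded 0, 1, 2; rewards coded 0/1
choices = [0, 0, 1, 0, 2, 0]
rewards = [1, 0, 1, 1, 0, 1]
X, y = conjunction_regressors(choices, rewards, option=0)
contingent = [X[0][k][k] for k in range(4)]   # R-k x C-k: true choice-reward pairings
spread = [X[0][0][j] for j in range(1, 4)]    # R-1 x C-2:4: noncontingent pairings
```

Each trial's matrix would be flattened into one row of a design matrix and entered into a logistic regression on `y`. The diagonal entries correspond to the contingent terms (e.g., R−1xC−1), and the off-diagonal entries to the noncontingent conjunctions examined next.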

We next tested for evidence of Spread of Effect in Controls, corresponding to the erroneous association of outcomes with temporally adjacent but unrelated choices (Walton et al., 2010). It is clear from Figure 3Bi that the leftmost column of the results matrix is much darker than the diagonal. This means that illusory (noncontingent) conjunctions between the last reward (trials R−1) and the prior history of choices (C−2, C−3, and C−4) did not lead to the choice being taken again on the next trial. Indeed, in Controls there was a negative influence of immediately previous reward on choice history; receiving reward on the previous trial did not make it more likely that even earlier choices would be made again (R−1xC−2:4 average past choice: t(21) = 5.46, p < 0.0001; Fig. 3Cii). This pattern was significantly different in lOFC patients. In line with our hypothesis, lOFC patients exhibited a significant reduction in this negative influence of past reinforcement on past choice compared with Controls (R−1xC−2:4 group: F(1,25) = 4.23, p = 0.050; Fig. 3Cii, first panel). Again, there was a trend for a similar difference between lOFC and DMF patients (R−1xC−2:4 group: F(1,9) = 4.29, p = 0.068), but there was no evidence for a difference between DMF and Controls (R−1xC−2:4 group: F(1,26) = 0.19, p = 0.663; Fig. 3Cii, second panel).

We also examined the degree to which subjects erroneously associated the choice just made with the rewards received on earlier trials (Fig. 3A, blue boxes). Here again, an initial negative influence was apparent in Controls that decreased with increasing distance from current choice and eventually reversed sign (R−2:4xC−1, past rewards: F(1.58,33.18) = 18.6, p < 0.001; Fig. 3Bi, Ciii, first panel). This pattern was not present in lOFC patients (R−2:4xC−1, F(2,8) = 1.42, p = 0.297; Fig. 3Biii, Ciii, first panel), although they did not differ significantly from Controls (R−2:4xC−1 group: F(1,25) = 0.04, p = 0.852, group × past rewards: F(1.58,39.54) = 0.22, p = 0.752). The DMF group also did not differ from healthy Controls (R−2:4xC−1 group: F(1,26) = 0.40, p = 0.532, second panel).

In monkeys, impairments in credit assignment were restricted to animals with lOFC lesions and not found in animals with mOFC/vmPFC lesions (Noonan et al., 2010). We examined whether human mOFC/vmPFC patients were similarly unimpaired compared with Controls. Patients with lesions to mOFC/vmPFC were no different from Controls in the way that they assigned the credit for the last reward to the appropriate choice either on the immediately preceding choice (R−1xC−1 group: t(25) = −0.73, p = 0.474, third panel) or on earlier trials; they were influenced by past choice-reward conjunctions in a similar way (R−2:4xC−1:4 group: F(1,25) = 0.76, p = 0.785, third panel). Similarly, Controls and mOFC/vmPFC patients did not differ in the way the current choice was influenced by interactions between the effect of the immediately preceding reward and even earlier choices (R−1xC−2:4 group: F(1,25) = 0.08, p = 0.780; past trial × group: F(2,50) = 0.46, p = 0.636, Fig. 3A, left-hand column, Cii, third panel) or by interactions between the immediately preceding choice and earlier rewards (R−2:4xC−1 group: F(1,25) = 1.27, p = 0.271; past trial × group: F(2,50) = 2.00, p = 0.147; Fig. 3Ciii, third panel, top row).

To confirm regional specialization, we removed the patient whose lesion affected both mOFC/vmPFC and lOFC regions, and compared the lOFC and mOFC/vmPFC patient groups directly in these key analyses. Confirming our hypothesis, compared with mOFC/vmPFC patients, the lOFC-lesioned group attributed less weight to the immediately past reward and choice (R−1xC−1 group: t(3.20) = −3.48, p = 0.036; Fig. 3Ci, fourth panel) and misassigned relatively greater credit for the most recent reward to earlier choices (R−1xC−2:4 group: F(1,6) = 13.99, p = 0.010; Fig. 3Cii, fourth panel).

The results above suggest that, just like lOFC-lesioned monkeys, patients with lesions to homologous lOFC regions are relying on their history of choices, rather than particular conjoint choice-reward associations, to update their expected value estimates for each option. If this is the case, then lOFC-lesioned patients should be less likely than Controls to reselect a newly chosen stimulus (e.g., option B) after a history of choices on another stimulus (i.e., option A) when there was a recent reward. We examined the differential effect of an outcome, reward or no reward, on a newly chosen stimulus, after sequences of choices directed toward the same option (Fig. 4). Consistent with correct credit assignment, Control subjects were more likely to reselect B on the following trial (n) if their choice on the previous trial (n − 1) was rewarded than if it did not result in reward (Fig. 4B, green). They were also less likely to switch back to A after B choices that were rewarded than those that were not (Fig. 4A, green). lOFC patients, by contrast, showed a significantly reduced influence of a rewarded B choice on subsequent choice (group × choice: F(1,25) = 5.89, p = 0.023). Compared with Controls, lOFC patients were less likely to reselect B after a reward if the subject had recently chosen A (group: F(1,25) = 6.01, p = 0.022; Fig. 4B). Indeed, lOFC patients were more likely than Controls to reselect A (group: F(1,25) = 4.41, p = 0.046; Fig. 4A) as if the credit for the new outcome was not assigned to the choice that caused the outcome, but instead to the integrated history of choices, with the reinforcement for choosing option B being partly assigned to previous choices of option A. In lOFC-lesioned monkeys, these effects increased with the length of the history of A choices. By contrast, in humans, there was no differential effect of length of choice history between lOFC-lesioned patients and Controls (previous A choices: F(1,25) = 0.00, p = 0.991). 
Finally, we ran the control analysis; instead of looking at whether receiving reward for choosing option B after option A resulted in spread of reinforcement to the previous option, we tested whether it spread to the other option, C, that had not been taken recently. Just as is the case in monkeys, human lOFC patients and Controls did not differ in their subsequent choice of option C as a function of reward delivery during the sequence of A and B choices (group: F(1,25) = 2.07, p = 0.163; Fig. 4C). DMF and mOFC/vmPFC patients were no different from Controls in these analyses (main effects and interactions of group, p > 0.571).
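The switch analysis described above can be sketched in Python as follows. This is a simplified illustration with hypothetical names and toy data: it tabulates what is chosen on the trial after a switch from A to B, split by whether the B choice was rewarded, and it ignores the length of the preceding run of A choices, which the reported analysis does take into account.

```python
from collections import Counter

def post_switch_choices(choices, rewards):
    """Tabulate the choice after a switch, split by whether the switch was rewarded.

    A switch is a trial t whose choice (B) differs from trial t-1 (A). The
    choice on t+1 is labelled 'stay' (B again), 'return' (back to A), or
    'other' (the third option, C).
    """
    counts = {True: Counter(), False: Counter()}
    for t in range(1, len(choices) - 1):
        prev, new, nxt = choices[t - 1], choices[t], choices[t + 1]
        if new == prev:
            continue  # not a switch trial
        label = "stay" if nxt == new else "return" if nxt == prev else "other"
        counts[bool(rewards[t])][label] += 1
    rates = {}
    for rewarded, c in counts.items():
        total = sum(c.values())
        rates[rewarded] = {k: c[k] / total if total else float("nan")
                           for k in ("stay", "return", "other")}
    return rates

# Hypothetical session, for illustration only
choices = ["A", "A", "B", "B", "A", "B", "A", "C"]
rewards = [1, 0, 1, 0, 1, 0, 0, 1]
rates = post_switch_choices(choices, rewards)
```

In these terms, correct credit assignment predicts a higher 'stay' rate and a lower 'return' rate after rewarded switches than after unrewarded ones, whereas the 'other' rate corresponds to the control analysis on option C.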

Figure 4.

Different likelihood of choosing option A (A), B (B), or C (C) on trial n after previously selecting option B on trial n − 1 as a function of whether or not reward was received for this B choice. Data are plotted based on the length of choice history on A. Left side of each plot represents 1 previous choice of A (A, B±, ?). Right side of each plot represents 2 or 3 previous choices of A (A, A, B±, ? and A, A, A, B±, ?). lOFC patients were significantly more likely than Controls to reselect A (A) and significantly less likely to reselect B (B), as if the reward for choosing B was not assigned to the choice that caused the outcome but instead to the integrated history of choices of A. *Denotes statistical difference.

Value-guided decision-making

A multinomial logistic regression was performed to test how choice decisions were biased by the expected value of all options while accounting for the variance explained by the interactions between each pair of option values. Hence, we applied a GLM of six expressions on the choice data, VA, VB, VC, VA × VB, VA × VC, and VB × VC, focusing the analysis on the main effects of option value. The first step considers the 3-choice decision as two binary comparisons: option A versus option B and option C versus option B. In this analysis, choices of option B were taken as a reference such that regression weights were estimated for biases for each binary choice comparison. Next, the regression weights for each binary comparison were visualized in terms of whether they were relevant or irrelevant to the choice (i.e., when weighing A vs B, the value of C is irrelevant). Finally, each pair of regression weights for A versus B and C versus B was averaged to produce beta regression weights for the values of the relevant options (relevant options were relabeled X and Y; see Materials and Methods), and the expected value of the irrelevant option (relabeled D; see Materials and Methods).

Figure 5A shows the regression weights of each factor on VX, VY, and VD. In Controls, the variance explained by the model's main effects was unevenly distributed (F(1.16,24.44) = 77.86, p < 0.001). Unsurprisingly, the VX had an overall significant positive (t(21) = 8.71, p < 0.001) and VY a negative effect (t(21) = −8.60, p < 0.001) on choice of X over Y. This indicates that larger expected values of X and smaller values of Y lead to a choice of X over Y, as expected. In addition, VD, irrelevant to the decision between X and Y, did not show a significant impact on biasing X versus Y decisions (t(21) = 0.01, p = 0.989). Comparing the main effect beta weights between controls and lesion groups revealed no significant differences of group (Controls vs DMF | vmPFC | lOFC, F values ≤ 2.01, p values ≥ 0.168), or interactions between group and decision factor (Controls vs DMF | vmPFC | lOFC F values ≤ 3.23, p values ≥ 0.076).

Figure 5.

Weights of influence of expected value differences on choices of X relative to Y. A, Log-normalized mean weights and SEM of beta weights from the main effects of VX, VY, and VD are plotted from a GLM controlling the interactions between option values for Controls, mOFC/vmPFC, lOFC, and DMF. There is a positive influence of VX, and a negative influence of VY, on choosing X over Y in all subject groups. B, The GLM is rearranged to isolate the effect of the expected value of the irrelevant (“distractor”) third option on the value difference between X and Y. Controls are plotted against mOFC/vmPFC (Bi), DMF (Bii), and lOFC (Biii). Biv, mOFC/vmPFC and lOFC patients are replotted. “o,” Individual subject scores. Biv, Black “o,” subject whose lesion affects both mOFC/vmPFC and lOFC. Statistics for direct comparisons between mOFC/vmPFC and lOFC lesions leave out this subject. High expected values of the irrelevant third option significantly reduce the influence of VX − VY on choice only in mOFC/vmPFC (Bi) and DMF (Bii) groups relative to Controls. lOFC patients do not differ from Controls in the effect of the expected value of the third option (VD) on their choices between X and Y (Biii). *Denotes statistical difference.

Next, in the same GLM, we rearranged the regression terms of VX × VD and VY × VD into (VX − VY)VD and (VX + VY)VD (see Materials and Methods). This enabled us to better understand how expected value differences between any given pair of options influenced decision-making, by testing the critical effects of (VX − VY)VD while controlling for VX − VY, VX + VY, VX × VY, VD, and (VX + VY)VD. In light of previous findings in the macaque, we hypothesized a negative (VX − VY)VD effect in vmPFC-lesioned patients. This would support the idea that, when the value of the third option in each binary comparison is high (vs low), the positive impact of VX and negative impact of VY on guiding choices of X become weaker, despite this third option being, in principle, irrelevant to the choice between X and Y.

In line with our hypothesis, we found a negative effect of (VX − VY)VD with high expected value distractors negatively affecting the bias of VX and VY on vmPFC-lesioned patients' choice between relevant options X and Y (Fig. 5B). The regression beta weight for this factor was significantly reduced in vmPFC patients relative to controls (t(25) = 2.33, p = 0.014; Fig. 5Bi). However, contrary to expectations, the DMF brain-damaged control group were also significantly different from controls (t(26) = 2.14, p = 0.042; Fig. 5Bii) and not different from vmPFC patients (t(9) = 0.30, p = 0.386).

Previous work in macaques demonstrated regional specialization between the effects of mOFC/vmPFC and lOFC damage on the impact of the distractor on choice (Noonan et al., 2010). We replicate that here, showing that the (VX − VY)VD regression beta weight in lOFC patients was no different from controls (t(25) = 0.34, p = 0.735; Fig. 5Biii) but significantly greater than vmPFC patients (t(6) = 2.19, p = 0.036; Fig. 5Biv). Collectively, these results showed that larger values of decision-irrelevant options were related to more stochastic decision-making between decision-relevant options only after lesions in the mPFC (mOFC/vmPFC and DMF). Further, these analyses suggest inferences drawn from work in monkeys can aid understanding of patterns of impairment in humans and confirm that high value irrelevant options distract patients with mOFC/vmPFC lesions from identifying or attending to the best option when making a decision.

Discussion

Here, predictions based on selective lesion effects observed in macaques (Noonan et al., 2010; Walton et al., 2010) were tested in humans. We predicted that lOFC damage would disrupt credit assignment during value learning, whereas mOFC/vmPFC damage would disrupt value-guided decision-making. Using similar 3-choice tasks and equivalent analytical approaches as in the macaque work, we report a comparable pattern of regionally dissociable deficits. Directly replicating findings in macaques (Walton et al., 2010), we show that the normal positive influence of the contingent relationship between past choice and past reward is reduced in lOFC patients compared with Controls and mOFC/vmPFC patients. Moreover, patients with lesions in DMF and mOFC/vmPFC were no different from Controls in their ability to link outcomes to choices. By contrast, despite spared associative learning, relative to Controls, mOFC/vmPFC-lesioned patients struggled to use values of relevant options to guide decision-making. Within a 3-choice decision, if a high-value third option was present while these patients were evaluating pairs of options, they made more random choices. Once again, there was evidence of regional specificity; patients with lOFC lesions were unimpaired relative to Controls on this measure.

Several theoretical models of value-guided decision-making draw at least a partial distinction between decision-making and learning mechanisms (Rushworth et al., 2011; Levy and Glimcher, 2012). To the extent that subregional effects have been examined in humans, studies of reward learning have focused on mOFC/vmPFC rather than lOFC. There have been demonstrations that damage to human mOFC/vmPFC disrupts value-guided decision-making (Fellows and Farah, 2007; Camille et al., 2011; Henri-Bhargava et al., 2012). Although there is some evidence of mOFC/vmPFC involvement in probabilistic reward learning (Tsuchida et al., 2010), it is not clear whether this could be attributed to failures of associative learning seen following OFC lesions in many animal models (Walton et al., 2010; Schoenbaum et al., 2011; Rudebeck and Murray, 2014; Stalnaker et al., 2015). Indeed, recent work found no impairment in credit assignment in a 2-armed bandit task after mOFC/vmPFC lesions (Kumaran et al., 2015). Here, we studied the effects of more lateral OFC lesions, finding evidence that human lOFC was essential for credit assignment and mediating Thorndike's Law of Effect (Thorndike, 1933a). Patients with lOFC lesions were less influenced by the precise history of contingent choice and reward conjunctions than Controls and patients with lesions in other frontal areas. Instead, patients' choices were more influenced by credit that had been misassigned to noncontingent past choices.

By contrast, mOFC/vmPFC patients' deficits in this task were partly a function of the range of values on offer. This three-option task shows how the expected value difference between any two options is influenced by the value of the third option. Chau et al. (2014) found that healthy subjects perform better when this third option is also valuable. They argue that this creates higher levels of inhibition within a mOFC/vmPFC network that mediates decision-making. As these levels increase, decisions become more accurate and the best option is more likely to be chosen. Here, patients with mOFC/vmPFC damage did not exhibit the beneficial effects relating to the value of the irrelevant alternative, instead making more stochastic choices, suggesting that they are sensitive to the range of available values in a manner that is “irrational” in formal economic terms (Louie et al., 2011).

These dissociable deficits in credit assignment and value-based choice are consistent with the different anatomical connections of these two regions. Although mOFC and lOFC are interconnected with many of the same brain regions involved in reward and reinforcement, there are important points of differentiation of the two networks. The lOFC receives input from nearly all sensory regions, such as temporal lobe area TE and perirhinal cortex, and these may be important when credit is assigned to representations of specific visual stimuli (Carmichael and Price, 1995b; Kondo et al., 2005).

Representations of the choices that have just been made are activated in lOFC at the time of reward feedback (Tsujimoto et al., 2009) or when they are informative for behavior (Noonan et al., 2011). Although representations of choice history and reward history might be relatively widely distributed throughout the brain (Seo et al., 2014), only a few regions seem to represent the conjoint history of choices and rewards. Subregions of OFC might update associations between specific choices and outcomes (Sul et al., 2010; Noonan et al., 2011; Rudebeck et al., 2013). Further, the lOFC may not work in isolation during credit assignment (Akaishi et al., 2016). When subjects are not able to learn contingently, they rely on noncontingent, statistical learning mechanisms linked to amygdala or sensorimotor corticostriatal circuitry. This supports the existence of multiple, parallel reward learning mechanisms (Cisek, 2012; Kolling et al., 2012, 2016a, b; Hunt and Hayden, 2017).

By contrast, the mOFC/vmPFC has a distinguishable set of anatomical connections (Ongür and Price, 2000). Connections to sensory regions are weak or absent. Instead mOFC/vmPFC is more strongly interconnected with anterior cingulate cortex (ACC), which is, in turn, connected with cingulate motor areas (Carmichael and Price, 1996). Through such connections, and the region's prominent value signals (Noonan et al., 2011; Howard et al., 2015), including the presence of chosen and unchosen value signals with different signs, the mOFC/vmPFC may be well placed to influence decision-making (Boorman et al., 2009; Basten et al., 2010; Philiastides et al., 2010; Rushworth et al., 2011; Jocham et al., 2012; Kolling et al., 2012; Chau et al., 2014; Economides et al., 2014; Hunt et al., 2015). The mOFC/vmPFC may not work alone in this function. There may be distinct mechanisms (1) for making decisions about the rewards that should be the focus of behavior and attention, involving the mOFC/vmPFC; and (2) for making decisions about the actions that should be made to obtain those rewards, engaging the ACC (Rushworth et al., 2012). This circuitry may explain the behavioral similarities observed between mOFC/vmPFC and DMF-lesioned groups. It may be that DMF injury, alone or with injury to the cingulum bundle, a frontotemporal tract known to connect area 32, SMA, and orbital areas 14 and 11 (Schmahmann and Pandya, 2006), disrupted mOFC/vmPFC-ACC connections that normally support the coordination between these two mechanisms to guide adaptive behavior.

Our mOFC/vmPFC patients' impairment is compatible with reports of irrational decisions in patients with such lesions (Fellows and Farah, 2005; Camille et al., 2011; Henri-Bhargava et al., 2012) and could be due to a failure to attend to the relevant aspects of the decision (Damasio, 1994; Fellows, 2006). Such accounts suggest the potential for one suboptimal choice to be taken if it is identified as being better than an even worse alternative, even if there are also better choices. A role for the mOFC/vmPFC in attention-dependent decision-making is supported by attention-dependent mOFC/vmPFC value signals (Lim et al., 2011) and by findings that lesions abolish the normal attentional advantage of reward-associated stimuli (Vaidya and Fellows, 2015) and diminish the influence of currently relevant stimulus dimensions on choice (Vaidya and Fellows, 2016). Decisions made after mOFC/vmPFC lesions may unmask the operation of other brain areas, such as the striatum, ACC, and intraparietal sulcus (for review, see Rushworth et al., 2011; Gottlieb et al., 2014), which may not represent all aspects of choice value accurately. Our results are therefore in line with a greater influence of value signals in other brain regions that, rather than facilitating decision-making, allow high V3 values to impair decision-making.

We took care to recruit subjects with lesions selectively affecting mOFC/vmPFC, lOFC, and DMF. The resulting small samples, unequal distribution of lesions across the hemispheres, and the likelihood of associated white matter damage mean that caution is needed in making precise links between cognitive processes and specific brain regions (Rudebeck et al., 2013). Of note, our sample included only 1 left lOFC patient. Although this subject's learning performance fell close to the group mean, we cannot definitively exclude the possibility of right hemisphere lateralization of the lOFC effects. However, there is no consensus on potential lateralization of OFC functions: recent fMRI experiments suggest that a posterior bilateral lOFC region may be important for reversal learning and credit assignment in both macaques (Chau et al., 2015) and humans (Akaishi et al., 2016; Jocham et al., 2016). In monkeys, the region extends from anterior insula into the orbital part of area 12/47, a region within the lesion territory linked with impaired credit assignment (Walton et al., 2010).

Reversal learning has been linked to the lOFC. However, previous work relating lOFC lesion impairments or lOFC BOLD activity to credit assignment shows that these effects occur independently of reversals in stimulus–reward links (Noonan et al., 2010), the availability of more than two options (Chau et al., 2015; Jocham et al., 2016), and the probabilistic nature of the task (Fellows and Farah, 2003). The reversals in the present task are likely to contribute to the effects only insofar as the changeable reward environments make credit assignment important.

In conclusion, mOFC and lOFC play different roles in credit assignment and value-based decision-making. The present findings are consistent with predictions made on the basis of previous macaque lesion and neuroimaging studies (Noonan et al., 2010; Walton et al., 2010; Chau et al., 2015). They also support claims for homologies between human and macaque frontal cortex (Mackey and Petrides, 2010; Neubert et al., 2014, 2015) and are consistent with very recent neuroimaging work conducted in humans implicating the posterior lOFC in credit assignment (Akaishi et al., 2016; Jocham et al., 2016).

Footnotes

This work was supported by Canadian Institutes of Health Research Operating Grant MOP 97821, Fonds de Recherche en Santé du Québec Chercheur-Boursier award to L.K.F., Medical Research Council fellowship to M.F.S.R., Jeanne Timmins Costello Postdoctoral Fellow and St John's College Oxford Early Career Teaching Fellowship to M.P.N., Hong Kong Research Grants Council (25610316) to B.K.H.C., and McGill-Oxford Neuroscience Collaboration. We thank Tim Behrens, Mark Walton, and Nils Kolling for support with analyses, experimental design, and interpretation; Linda Yu for collecting crucial pilot data for this project; and Arlene Berg for assisting in subject recruitment.

The authors declare no competing financial interests.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
