Brain. 2017 May 24;140(6):1743–1756. doi: 10.1093/brain/awx105

Selective impairment of goal-directed decision-making following lesions to the human ventromedial prefrontal cortex

Justin Reber 1,2, Justin S Feinstein 3, John P O’Doherty 4, Mimi Liljeholm 4,5, Ralph Adolphs 4, Daniel Tranel 1,2
PMCID: PMC6075075  PMID: 28549132

See Manohar and Akam (doi:10.1093/brain/awx119) for a scientific commentary on this article.

Using a food reward devaluation procedure, Reber et al. show that patients with ventromedial prefrontal cortex lesions are selectively impaired in instrumental choices following satiation. This argues for a role for the human vmPFC in goal-directed decision-making, while showing that it is not necessary for hedonic experience of outcome value.

Keywords: vmPFC, devaluation, reward, instrumental, decision-making

Abstract

Neuroimaging studies suggest that the human ventromedial prefrontal cortex is a key region for goal-directed behaviour. However, it remains unclear whether the ventromedial prefrontal cortex is necessary for such behaviour. Here we used a canonical test from the animal literature designed to distinguish goal-directed from habit-based choice: namely, outcome devaluation. Patients with focal damage to the ventromedial prefrontal cortex showed deficits in goal-directed choice by persistently selecting actions for a food outcome that had been devalued through selective satiation. By contrast, the same patients had entirely intact acquisition of instrumental contingencies, demonstrating preserved habitual control, and also gave normal ratings of the hedonic value of the devalued food. These findings for the first time demonstrate a necessary and selective role for the human ventromedial prefrontal cortex in goal-directed choice, reconciling prior neuroimaging results in humans with lesion studies in animals, and providing a mechanistic explanation of the real-life deficits in decision-making that have been documented in patients with damage to this structure.

Introduction

It has long been suggested that two distinct systems compete to control behaviour. A slow but deliberative and flexible system is often postulated to complement (and sometimes compete with) a rapid yet relatively inflexible automatic system (Schneider and Shiffrin, 1977; Norman and Shallice, 1980; Kahneman, 2011). In animal learning theory, these two systems are described as ‘goal-directed’ and ‘habitual’ control, respectively (Dickinson, 1985; Balleine and Dickinson, 1998). In goal-directed control, actions are selected with reference to the predicted incentive value of the goal or outcome that those actions engender. In the habitual control system, actions are selected in a more reflexive manner through associations with antecedent stimuli acquired on the basis of past reinforcement, without the flexibility to take into account the current value of the goal that those actions produce (Daw et al., 2005; Balleine and Ostlund, 2007; Rangel et al., 2008; de Wit and Dickinson, 2009). Evidence for the existence of these two systems arose initially from behavioural studies in rodents, in which the two systems were shown to control behaviour differentially after varying amounts of training (Adams and Dickinson, 1981; Dickinson et al., 1983). Computational theories of goal-directed and habitual choice likewise postulate the existence of multiple control signals in the brain, specifically one for so-called model-based control (thought to subserve goal-directed choice) and a second for so-called model-free control (thought to subserve habit-based choice) (Doya et al., 2002; Daw et al., 2005).

Attempts to investigate neural correlates of goal-directed control have focused on the prefrontal cortex. Early neuropsychological investigations of the prefrontal cortex revealed that damage to this area impacts the capacity for integrated thought and the flexible control of behaviour (Milner, 1963; Luria, 1966), and more recent neuropsychological, neuroimaging and neurophysiological studies in both humans and non-human primates have confirmed a role for the prefrontal cortex in deliberative control (Owen, 1997; Miller and Cohen, 2001; Wallis et al., 2001). More specifically, within the prefrontal cortex, the ventromedial prefrontal cortex (vmPFC), which includes the medial orbitofrontal cortex and adjacent medial prefrontal cortex, has been especially implicated in flexible value-based decision-making and reward-related learning in both monkeys and humans (Bechara et al., 1994; Rolls et al., 1994; Baxter et al., 2000; Bechara et al., 2000; Gottfried et al., 2003; Kable and Glimcher, 2009; Rudebeck and Murray, 2011; Rushworth et al., 2011; Rudebeck et al., 2013).

To definitively identify the neural systems involved in goal-directed control, it is necessary to use behavioural protocols that are designed to separate out goal-directed from habitual responding. One such procedure is the instrumental reinforcer devaluation task (Adams and Dickinson, 1981). In this protocol, a subject is trained to associate several distinct stimuli with one of several outcomes (receiving specific foods). One of these outcomes is subsequently devalued by (for example) feeding to satiety. The critical test measures the instrumental action performed for the now devalued outcome. Habit-based control is exemplified by continued responding on the devalued action, whereas a decrease in such responding (relative to responding with another, still valued, action) demonstrates goal-directed behaviour, as it is only goal-directed control that is sensitive to the current incentive value of the outcome.

Using this and related protocols, lesion studies in rodents (Balleine and Dickinson, 1998; Corbit and Balleine, 2003) have mapped the systems for goal-directed control to prelimbic sectors of the prefrontal cortex and dorsomedial striatum. In monkeys, goal-directed control has been found to involve the orbitofrontal cortex including the medial orbitofrontal area (Rhodes and Murray, 2013). Functional MRI studies in humans have provided convergent evidence—vmPFC activation has been correlated with goal outcomes that are sensitive to outcome value (Plassmann et al., 2007; Valentin et al., 2007; de Wit et al., 2009; McNamee et al., 2015).

However, these studies leave unclear the precise and necessary contribution of the human vmPFC to decision-making: is it implicated in such a broad range of studies because it plays multiple roles in valuation and choice (perhaps through intermingled neuronal populations that neuroimaging cannot resolve), or does it play an essential role only in a specific aspect of value-based decision-making?

The proposal that the human vmPFC is a necessary component of the system for goal-directed choice, whether selective for this function or not, has not been put to a definitive empirical test—that is, no study so far has demonstrated specific behavioural impairments in this process when that brain region is damaged. Note that our use of the term ‘goal-directed choice’ here refers specifically to the ability to use knowledge about the current value of goal states to guide instrumental actions, as opposed to the capacity to select goals based on instruction or the capacity to resolve decision or response conflict.

We hypothesized that patients with damage to the human vmPFC would show impaired goal-directed choice, whereas their habit-based choice would be unaffected. We adapted a free operant instrumental devaluation task from the rodent literature and administered the protocol to six human patients with focal lesions to the vmPFC, using real foods delivered through modified capsule vending machines. Comparisons to 20 healthy participants and seven patients with lesions outside the vmPFC allowed us to establish neuroanatomical specificity for our findings.

Materials and methods

Participants

Our participant sample consisted of three groups: a target patient group with focal, adult-onset lesions to the vmPFC (n = 6), a brain-damaged comparison (BDC) group of patients with non-vmPFC brain lesions (n = 7), and another comparison group of neurologically healthy individuals (n = 20). The neurological patients were chosen from the Patient Registry in the University of Iowa’s Department of Neurology (Division of Neuropsychology and Cognitive Neuroscience).

The patients had focal, stable lesions that have been well-characterized through structural imaging (MRI scans for most patients and CT scans for those with conditions that preclude MRIs) (Frank et al., 1997). Each patient in the Registry has been screened by a neurologist and a neuropsychologist to exclude individuals with a history of learning disabilities, psychiatric disorders, substance abuse, or other neurological conditions (Keifer and Tranel, 2013). Patients were further screened to ensure that they had no diagnosable mood disorder at the time of the current experiment. Due to the nature of the task used in the current experiment, all potential participants were carefully screened to exclude individuals with any history of eating disorders, specific food allergies, diabetes, hypertension, or other medical conditions with dietary restrictions. Five patients with bilateral vmPFC lesions from our patient registry were disqualified from participation in the current study because they were diabetic (lesion locations and volumes for the five excluded patients did not differ significantly from those of the six included vmPFC patients). All neuropsychological, neuroanatomical, and experimental data were collected at least 3 months after lesion onset, in the chronic epoch of recovery. All participants received detailed information describing the experimental procedures, approved by the Institutional Review Board for Human Subjects Research at the University of Iowa, and all participants gave informed consent before participating in the study, in accordance with the Declaration of Helsinki.

Demographics

Participants’ demographic data are presented in Table 1. Thirteen additional participants (n = 1 vmPFC, n = 9 BDC, and n = 3 neurologically normal) who completed the study were excluded from the analysis and demographic comparisons. Specifically, two BDC patients and one neurologically normal participant discontinued because of a failure to fast at least 4 h before the task, and 10 participants—one vmPFC, seven BDC, and two neurologically normal—were excluded from the analysis based on insufficient consumption of food (fewer than 150 calories) during the satiation phase. This latter exclusionary criterion (designed a priori) avoided the possibility that some participants might find the foods insufficiently rewarding, or refuse to fully satiate (e.g. because they were reluctant to consume food during the experiment). Because participants did not consume the rewards that they won during the pre-satiation phase, the number of calories consumed during satiation was a proxy measurement of whether participants’ pre-satiation scores were reasonably valid representations of their desire to eat each reward and their understanding of the cue-action-reward relationship. Participants unwilling or unable to consume enough of the food provided during the satiation phase, despite winning rewards during the pre-satiation test, were deemed unlikely to have given a valid baseline measurement, and were therefore excluded from the analyses. Further investigation of these 10 excluded participants showed that the key findings of the study remained statistically significant even when these excluded individuals were included in the analyses.

Table 1.

Lesion descriptions and neuropsychological test scores for vmPFC and BDC patients, with group demographics for the vmPFC, BDC, and neurologically normal groups

Lesion information

| Patient ID | Group | Lesion description | Lesion chronicity (years) | Lesion volume (mm³) | Medications |
| 318 | vmPFC | Bilateral frontal meningioma resection | 37 | 117 676 | |
| 2112 | vmPFC | L frontal haemorrhagic stroke | 18 | 14 245 | Antiepileptic |
| 2391 | vmPFC | Bilateral frontal meningioma resection | 15 | 71 033 | |
| 2352 | vmPFC | Subarachnoid haemorrhage; R anterior communicating artery aneurysm clip | 16 | 13 632 | |
| 3534 | vmPFC | Bilateral frontal meningioma resection | 5 | 42 103 | |
| 3535 | vmPFC | R frontal meningioma resection | 6 | 23 264 | Antiepileptic |
| 1971 | BDC | L temporal lobe resection | 20 | 33 500 | Antiepileptic |
| 3058 | BDC | R temporal lobe resection, anterior choroidal artery stroke | 11 | 32 843 | Antiepileptic |
| 3575 | BDC | Bilateral occipital/cerebellar stroke | 4 | 3694 | Antiepileptic |
| 2403 | BDC | L temporal lobe resection | 16 | 31 899 | |
| 3386 | BDC | R temporal lobe resection | 8 | 26 200 | Antiepileptic |
| 3695 | BDC | R frontal haematoma resection | 4 | Unavailable | |
| 3767 | BDC | R frontal haemorrhagic stroke | 1 | Unavailable | Antiepileptic |

Neuropsychological test scores

| Patient ID | Group | WAIS Verbal Comprehension | WAIS Perceptual Reasoning | WAIS Working Memory | WAIS Processing Speed | WAIS Full-scale IQ | AVLT Trial 5 Raw Score | AVLT Delayed Recall | BDI-II | Wechsler General Memory Index |
| 318 | vmPFC | 127 | 119 | 108 | 102 | 119 | 14 | 10 | 0 | 109 |
| 2112 | vmPFC | 148 | 148 | 130 | NA | 149 | 15 | 15 | 0 | 111 |
| 2352 | vmPFC | 103 | 99 | 111 | 122 | 106 | 14 | 11 | 8 | 109 |
| 2391 | vmPFC | 116 | 127 | 100 | 111 | 118 | 15 | 14 | 4 | 132 |
| 3534 | vmPFC | 107 | 117 | 105 | 100 | 110 | 15 | 12 | 8 | 117 |
| 3535 | vmPFC | 118 | 107 | 119 | 122 | 120 | 15 | 12 | 1 | 140 |
| vmPFC group average | | 119.8 (14.8) | 119.5 (15.6) | 112.2 (9.9) | 111.4 (9.4) | 120.3 (13.8) | 14.7 (0.5) | 12.3 (1.7) | 3.5 (3.5) | 119.7 (12.1) |
| BDC group average | | 103.9 (8.8) | 103.1 (9.3) | 108.0 (8.5) | 104.6 (11.0) | 105.1 (7.8) | 12.4 (1.7) | 10.1 (2.6) | 2.7 (3.5) | 100.8 (14.0) |

Group demographics

| Group | Sex (F:M) | Average age | Average years of education | Average BMI | Aetiology | Laterality |
| vmPFC | 4:2 | 69.3 (5.6) | 14.5 (2.2) | 30.8 (5.4) | Benign tumour resection: 4; stroke: 2 | Bilateral: 4; left: 1; right: 1 |
| BDC | 4:3 | 48.1 (8.7) | 15.9 (3.1) | 26.8 (5.3) | Surgical resection: 5; stroke: 2 | Bilateral: 1; left: 2; right: 4 |
| NL | 15:5 | 66.4 (4.3) | 15.6 (2.1) | 28.3 (3.2) | NA | NA |

Standard deviations in parentheses. AVLT = Auditory-Verbal Learning Test; BDI = Beck Depression Inventory; NL = neurologically normal; WAIS = Wechsler Adult Intelligence Scale (III & IV).

There were no left-handed individuals in the vmPFC group; the BDC group included one left-handed patient. The neurologically normal group contained one left-handed individual and two people who did not report handedness. The groups did not differ significantly in years of education [F(2,30) = 0.630, P = 0.539, partial η² = 0.040] or body mass index (BMI) [F(2,30) = 1.528, P = 0.233, partial η² = 0.092], and the three groups had comparable sex ratios. The vmPFC and neurologically normal groups were well-matched on age [t(24) = 1.372, P = 0.183]; the BDC group was significantly younger than the vmPFC group [t(11) = 5.105, P < 0.001; Table 1]. We ran all the main analyses using age as a covariate, and none of the main outcomes were affected, nor was age meaningfully related to the variables of interest [F(1,28) = 0.428, P = 0.518, partial η² = 0.015]. Lesion aetiologies for the vmPFC group included benign meningioma resection (n = 4) and stroke (n = 2). In the BDC group, lesions were due to surgical resections to treat pharmacoresistant epilepsy (n = 4), surgical haematoma resection (n = 1), and stroke (n = 2). The patients showed no evidence of abnormal microvascular disease in subcortical regions such as the basal ganglia. Two patients in the vmPFC group had unilateral lesions, one right-sided and one left-sided, and four had bilateral lesions. Six of the BDC patients had unilateral lesions, four right-sided and two left-sided, and one had a bilateral lesion. Despite the uneven distribution of bilateral lesions among the groups, there were no significant differences in overall lesion volume between the vmPFC and the BDC patients for whom we had lesion volumes available [t(10) = 1.117, P = 0.293], and lesion volume did not correlate significantly with the main variables of interest (R = 0.202, P = 0.551). Our groups were too small to allow a definitive analysis of bilaterality versus unilaterality as a factor, but we have addressed this qualitatively in the results section.

Each lesion was reconstructed in three dimensions onto a normal template brain by an expert tracer using the MAP-3 technique (Frank et al., 1997). The resulting lesion maps were then overlapped onto a template volume, and a colour-coded graphic representation of the number of lesions overlapping at any given voxel was created (Fig. 1). The lesions of all six patients in the vmPFC group overlapped primarily in the vmPFCs, with some individual lesions extending superiorly into the anterior cingulate and medial prefrontal cortices. The majority of the BDC group’s lesions were located in the temporal lobes, and the group also included two patients with non-vmPFC frontal lobe lesions and one patient with an occipital lesion (Figs 1C and 2, and Table 1). All of the patients’ lesions affected both cortex and (variably) white matter tracts, so it is difficult to draw conclusions about whether the effects we observed were primarily due to local cortical damage or disconnection of other involved regions. This issue will require further investigations that go well beyond the scope of the current report, and we acknowledge this as a limitation of the current study.

Figure 1.

Lesion overlap of vmPFC and BDC patients and photographs of the experimental set-up. (A) The lesions of all six patients in the vmPFC group overlapped primarily in the vmPFCs, with individual lesions extending superiorly into the anterior cingulate and medial prefrontal cortices. Colours represent the number of overlapping lesions at each voxel, ranging from 1 (blue) to 6 (red). (B) These lesions also overlap heavily with the regions identified in a coordinate-based meta-analysis as reliably associated with the subjective valuation of stimuli across all modalities (Clithero and Rangel, 2013; adapted with permission). (C) The lesions of five of the patients in the BDC group overlapped primarily in the left and right temporal poles. (D and E) The task used two computer-controlled candy vending machines to dispense capsules with calorically-matched food rewards chosen from 14 snack foods. ALF = anterior lateral fasciculus; dPCC = dorsal posterior cingulate cortex; vPCC = ventral posterior cingulate cortex; VSTR = ventral striatum.

Figure 2.

MRI and CT brain images of patients in the vmPFC and BDC groups. Subject IDs are presented next to perisagittal views of patients’ brains and 6–12 coronal cross-sections, in radiological convention (left hemisphere on the right and vice versa). Lesions are visible as hypodense (black or dark grey) areas in these images.

Procedure

In order that participants would be reasonably hungry during the task, they were instructed to refrain from eating for at least 4 h before coming to the laboratory for the experiment (the majority of participants skipped breakfast before completing the experiment in the morning). At the beginning of the task, participants were presented with individual pieces of 14 snack foods, divided evenly into predominantly salty and predominantly sweet categories, which they subsequently tasted and individually rated on pleasantness and desirability on 21- and 11-point Likert-type scales, respectively. Participants were provided with water to wash the taste of the previous food out of their mouths if they wished. The salty foods consisted of low-sodium Planter’s whole cashews, Cheetos®, Fritos corn chips, Goldfish® cheese-flavoured crackers, Jolly Time Crispy ‘n White popcorn, Ritz Bits cheese crackers, and pretzels. The sweet foods consisted of Sour Brite Crawlers gummy worms, mini marshmallows, Haribo gummy bears, M&M’s®, Skittles, mini Oreos, and Reese’s mini peanut butter cups.

After the experimenter (who was blind to the details of the neurological participants’ lesions and neuropsychological profiles) had calculated each participant’s overall preference score for each food (the sum of the pleasantness and desirability ratings), each participant was presented with two forced choices: one between two salty foods and one between two sweet foods. The foods offered were the two highest-rated foods in each category that were similar in rating to the highest-rated foods in the other category, so that the two chosen foods were similar in combined pleasantness and desirability. Typically, this meant that participants chose between their top two sweet foods and between their top two salty foods. However, if (for instance) a participant’s highest-rated sweet foods were more than two points higher than their most-preferred salty foods, the first choice would be between the two preferred salty foods and the second between the two sweet foods rated most similarly to the salty food chosen. At the end of the forced choices, participants had two foods (one sweet, one salty, and both similarly preferred) that were used for the remainder of the task. In pre-screening of the 14 different foods, all participant groups gave ratings of pleasantness and desire to eat that were highly correlated across the foods [Spearman’s correlations of desirability and pleasantness were calculated across all foods and participants by group; vmPFC: rs(84) = 0.873, P < 0.001; neurologically normal: rs(280) = 0.896, P < 0.001; BDC: rs(97) = 0.856, P < 0.001]. Hence, patients with vmPFC lesions, like both comparison groups, were rational in this respect: they desired the foods they found most pleasant.
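The food-matching step described above can be illustrated with a minimal Python sketch. It is an interpretation of the prose, not the authors' procedure: the function names, the alphabetical tie-breaking, and all ratings are hypothetical, and the rating scale anchors are not given in the text.

```python
from typing import Dict, List


def preference_scores(pleasantness: Dict[str, int],
                      desirability: Dict[str, int]) -> Dict[str, int]:
    """Overall preference score: pleasantness rating + desirability rating."""
    return {food: pleasantness[food] + desirability[food] for food in pleasantness}


def top_two(scores: Dict[str, int]) -> List[str]:
    """Two highest-scoring foods; ties are broken alphabetically (an assumption)."""
    return sorted(scores, key=lambda food: (-scores[food], food))[:2]


def closest_two(scores: Dict[str, int], target: int) -> List[str]:
    """Two foods whose scores lie nearest to `target` (ties broken alphabetically)."""
    return sorted(scores, key=lambda food: (abs(scores[food] - target), food))[:2]


# Hypothetical ratings for one participant (not data from the study).
salty = preference_scores({"pretzels": 6, "popcorn": 5, "cashews": 3},
                          {"pretzels": 6, "popcorn": 6, "cashews": 5})
sweet = preference_scores({"gummy bears": 9, "marshmallows": 8, "Skittles": 6, "mini Oreos": 5},
                          {"gummy bears": 9, "marshmallows": 8, "Skittles": 7, "mini Oreos": 6})

# Salty is the lower-rated category in this example, so it is offered first.
first_pair = top_two(salty)
chosen_salty = first_pair[0]  # suppose the participant picks this food

# If the top sweet food is more than two points above the top salty food,
# offer the two sweet foods rated most similarly to the chosen salty food.
gap = max(sweet.values()) - max(salty.values())
second_pair = (closest_two(sweet, salty[chosen_salty]) if gap > 2
               else top_two(sweet))
```

With these made-up ratings the top sweet food outscores the top salty food by more than two points, so the second forced choice is drawn from the sweet foods closest in score to the chosen salty food rather than from the two highest-rated sweet foods.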

Following the food preference determination, participants completed the 26-item version of the Eating Attitudes Test (EAT-26) (Garner et al., 1982) to assess disordered eating attitudes and behaviours. A score >20 on the assessment was an exclusion criterion, but no participant reached this cut-off. Using an 11-point Likert-type scale, participants completed the first of four self-report questionnaires in which they rated their hunger, fullness, and desire for the sweet and salty rewards that they had chosen. During this time, the experimenter filled two computer-controlled capsule vending machines with the selected food rewards, randomly counterbalanced so that the machine nearest to the participant dispensed the sweet treat for half of the participants and the salty treat for the other half (Fig. 1).

Training phase

Once participants had completed the initial hunger/satiation questionnaire, they began the training phase of the main task. The task was presented on a Dell Inspiron laptop using MATLAB software (The Mathworks Inc., Natick, MA, USA). During the training phase, participants completed a free-operant task in which responses to two distinct fractal cues were rewarded on a variable ratio (VR) schedule with capsules filled with roughly calorie-matched amounts (ranging from 4.4 to 18 calories) of their two selected foods, dispensed from the vending machines. Participants were instructed to press the ‘A’ and ‘L’ keys in any combination that they wished while the fractals were on screen in order to learn the best strategies to win rewards. The fractals appeared one at a time on the screen for 5–15 s, during which participants could earn one of the two rewards from the vending machines by pressing either the ‘A’ or the ‘L’ key on the keyboard between 10 and 20 times, a threshold that was pseudorandomly generated with each fractal presentation (Fig. 3). Both the fractal-to-machine association and button-to-fractal associations were counterbalanced across participants.

Figure 3.

The four phases of the experimental procedure. During the Training phase, participants learned the correct fractal-to-button-to-reward relationship, and the phase ended once each participant had won 10 sweet and 10 salty rewards. Participants would then have 2 min during the pre-satiation phase to win as many of either or both rewards as they desired. The experimenter would then present the participants with a bowl full of 1000 calories of the food that they had preferred during the pre-satiation phase, and the participants would be given 5 min to consume as much of the food as possible. Subsequently, the participants would once again have 2 min to win as many rewards as they desired during the post-satiation phase. Devaluation scores were calculated from the difference between the number of each reward won during the pre-satiation and post-satiation phase.

Whenever a participant pressed the keys the correct number of times in response to the fractal, ending on at least five consecutive presses of the key associated with the displayed fractal, the computer would display ‘You Win’ and the vending machine associated with the key/fractal combination would dispense a small plastic capsule filled with the corresponding food reward. Participants then had 10 s to consume the food and discard the capsule. The training phase ended once the participant had won and consumed 10 of each reward, an amount that fell short of satiation but permitted acquisition of the instrumental contingencies of the task. Although the fractals were presented in a pseudorandom order, once a participant had won 10 of a single reward, the corresponding fractal stopped appearing so as to avoid unequal satiation during the training phase. After completing the training phase and learning the appropriate associations between button, fractal, and food, the participants filled out a second self-report questionnaire about their hunger, fullness, and desire to consume each food (as described earlier for the initial measurement).
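The reward rule for a single fractal presentation can be sketched in Python as follows. This is one reading of the description, not the authors' MATLAB code: it assumes that presses of either key count towards the threshold and that a win additionally requires the last five presses to be the key paired with the fractal. The function and parameter names are hypothetical.

```python
import random
from typing import List


def draw_threshold(rng: random.Random, low: int = 10, high: int = 20) -> int:
    """Pseudorandom press threshold for one fractal presentation (inclusive bounds)."""
    return rng.randint(low, high)


def is_win(presses: List[str], correct_key: str, threshold: int,
           run_length: int = 5) -> bool:
    """True if enough presses were made and the final `run_length` presses
    were all on the key associated with the displayed fractal."""
    if len(presses) < threshold:
        return False
    tail = presses[-run_length:]
    return len(tail) == run_length and all(key == correct_key for key in tail)


rng = random.Random(0)            # fixed seed only to make the sketch repeatable
threshold = draw_threshold(rng)   # some value between 10 and 20
```

Under this reading, a participant who explores both keys early in a presentation can still win, provided the final run of presses is on the correct key.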

Pre-satiation baseline

Next, participants moved on to the pre-satiation phase, in which both fractals were displayed side-by-side on the screen for 120 s. Participants were instructed to use the strategies and associations that they had learned during the training phase to freely win as many of the rewards as they would like to eat. Rather than consuming the treats immediately, participants set aside the rewards they won during this phase. To prevent participants from simply winning as many of both rewards as possible without regard to their immediate desire to consume them, participants were told to ‘keep the food in the capsules and set them aside for later, but only win what you want to eat’. Once the 2 min were finished, the experimenter transferred all of the participant’s winnings to a plastic bag for the participant to take home, covertly counting the number of sweet and salty treats that the participant had won. Although there was no direct index of whether participants were following the directions to only win as much as they would like to consume, there was evidence that all participants understood and complied with these instructions: every single participant consumed at least as many calories during the following phase as they had won during the pre-satiation test, indicating that none of them won more than they would have been willing to consume.

Reinforcer devaluation through satiation

The experimenter then prepared the satiation phase, filling a large bowl with 1000 calories (determined by weight) of the food most won by the participant during the pre-satiation phase. Participants were given the bowl and instructed to eat as much as they could during the next 5 min. The experimenter would occasionally check in on the participants to provide them with water and to ensure that they were actually consuming the food during the phase. After 5 min, the experimenter returned and removed the bowl, weighing it again to approximate the number of calories consumed during satiation. The satiation phase was followed by a third hunger/fullness/desirability questionnaire.

Post-satiation devaluation test

The final step involved placing the participants back in the test situation in which they could freely respond again on the two instrumental actions, one action associated with the now-devalued food outcome, and the other action associated with a still-valued food. Participants completed the post-satiation test, which was nearly identical to the free-choice task used in the pre-satiation phase. Both fractals were once again displayed for 120 s, and participants were again told to use the strategies that they had learned to win the rewards they wished to eat and to set them aside until the end of the phase. For this final post-satiation phase, however, participants were informed that they would eat everything that they won after the task was finished. Individuals whose behaviour is goal-directed would be expected to decrease responding on the now-devalued action relative to the action associated with the still-valued outcome, consistent with the prior literature. On the other hand, if behaviour is under habitual control, individuals should show no such decrease in responding on the now-devalued action. After consuming all of their final winnings, participants filled out the fourth and final hunger/fullness/desirability questionnaire and a post-experiment manipulation check.

Results

Initial learning of instrumental reward contingencies

During the training phase of the task, all three groups showed clear patterns of learning (Fig. 4). Crucially, there were no significant differences between groups in the number of training trials needed to reach criterion [one-way ANOVA, F(2,30) = 0.898, P = 0.418, partial η² = 0.056]. These results indicate that patients with vmPFC lesions have the capacity to learn instrumental actions for rewards, suggesting that they have an intact habitual system, an intact goal-directed system, or both.

Figure 4.

Group and individual learning curves from the training phase. All three groups won increasing quantities of both rewards during the training phase of the task, demonstrating that they learned the button-to-reward and cue-to-reward relationships, and all participants won and consumed a total of 20 food rewards by the completion of the training phase. Bars represent group means, and symbols represent individual number of rewards won during each quarter of trials needed to reach criterion. Error bars show 95% confidence intervals. NL = neurologically normal.

Reinforcer devaluation phase

All three groups consumed a similar caloric amount of the satiated reward during the satiation phase [one-way ANOVA, F(2,30) = 0.314, P = 0.733, partial η² = 0.021; grand mean = 291 calories] and in total before the post-satiation test [one-way ANOVA, F(2,30) = 0.632, P = 0.539, partial η² = 0.040; grand mean = 356 calories].

Test procedure to disambiguate goal-directed and habitual control

To quantify the extent to which participants shifted their instrumental responses (button presses) from one reward to the other after selective satiation, we calculated a devaluation score (score = responses associated with the satiated reward during the pre-satiation task − responses associated with the satiated reward during the post-satiation task). The more positive the devaluation score, the greater the reduction in responses from baseline. To control for differences in response rates across participants and potential ceiling effects from using a change score as our primary dependent measure, we included a measure of participants’ key-presses for the satiated food during the pre-satiation test as a covariate in our analysis.
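As a minimal illustration of the devaluation score, the following Python sketch applies the definition to made-up key-press counts. The function name and all numbers are hypothetical, and the covariate adjustment used in the ANCOVA is not reproduced here.

```python
def devaluation_score(pre_responses: int, post_responses: int) -> int:
    """Responses for a reward before satiation minus responses after satiation."""
    return pre_responses - post_responses


# Hypothetical key-press counts for the satiated reward (not data from the study).
goal_directed = devaluation_score(pre_responses=60, post_responses=10)  # large drop
habitual = devaluation_score(pre_responses=60, post_responses=58)       # little change
increased = devaluation_score(pre_responses=30, post_responses=45)      # more presses after satiation
```

A large positive score corresponds to the pattern expected under goal-directed control, a score near zero to persistent responding, and a negative score to an increase in responding for the satiated reward.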

Consistent with the continued contribution of goal-directed control, participants in the BDC and neurologically normal groups had large devaluation scores, reflecting a large decrease in the number of times they chose the action associated with the now-devalued outcome during the test. Participants in the vmPFC group, by contrast (and in direct contradiction with their self-reported reduced desire to consume the satiated food), had significantly lower devaluation scores than both comparison groups, showing little to no reduction of the action associated with the satiated reward [Fig. 5A; ANCOVA; F(2,30) = 6.549, P = 0.004, partial η² = 0.311; planned comparisons: t-tests with Dunn–Šidák correction, vmPFC versus neurologically normal, t(24) = 3.560, P = 0.003, Hedges’ g = 1.60; vmPFC versus BDC, t(11) = 2.786, P = 0.035, Hedges’ g = 1.44].
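For readers unfamiliar with the two statistics named above, the following Python sketch shows one common textbook form of each: Hedges' g as a pooled-standard-deviation effect size with a small-sample correction, and the Dunn–Šidák adjustment of a P-value for m comparisons. The text does not state which exact variants were used, so these formulas are assumptions, and the example numbers are not study data.

```python
import math
from statistics import mean, variance
from typing import Sequence


def hedges_g(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standardized mean difference with the usual small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled standard deviation from the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    # Approximate correction factor J = 1 - 3 / (4 * (n_a + n_b) - 9).
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return d * correction


def sidak_adjust(p_value: float, n_comparisons: int) -> float:
    """Dunn-Sidak adjusted P-value: 1 - (1 - p)^m."""
    return 1 - (1 - p_value) ** n_comparisons


# Hypothetical scores for two small groups (not data from the study).
g_example = hedges_g([2, 4, 6], [1, 3, 5])
p_example = sidak_adjust(0.01, 2)
```

The correction factor shrinks the effect size slightly in small samples, which matters for groups as small as those in this study.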

Figure 5.

Devaluation scores. (A) Patients with lesions to the vmPFC decreased their responses to the satiated rewards significantly less during the post-satiation test than normal comparisons and BDC patients, showing a reduced devaluation effect. (B) The groups did not significantly differ in their devaluation of non-satiated foods during the post-satiation test. Group means represent devaluation scores (score = responses for reward type before satiation – responses for reward type after satiation) adjusted for the effects of the covariate (total responses during the pre-satiation test for the given reward type); error bars show 95% confidence intervals. (C) The histogram shows the estimated null distribution sampled by 10 000 random permutations of all devaluation scores for the satiated reward; the solid line shows the observed mean of the vmPFC lesion patients’ devaluation scores, lower than all but 40 of the randomly generated null means. (D and E) Group averages with individual scores on pre- and post-satiation measures for both rewards. Two patients with vmPFC lesions increased their responses to the satiated reward after satiation. Error bars show 95% confidence intervals. *P < 0.05, **P < 0.01. NL = neurologically normal.

Additionally, this behavioural difference in the vmPFC group was not seen for the non-satiated reward: a similar analysis of the devaluation scores associated with the non-satiated reward, controlling for the number of responses during the pre-satiation test to that reward, showed no significant group differences [Fig. 5B; ANCOVA; F(2,30) = 0.930, P = 0.406, pη2 = 0.060]. Furthermore, to counteract the statistical limitations of our small sample sizes, we also ran a permutation test on the devaluation scores for the satiated reward. Six scores were randomly sampled with replacement from all 33 participants for 10 000 iterations, generating an estimated null distribution. Of these 10 000 mean devaluation scores, only 40 were equal to or lower than the vmPFC lesion patients’ observed mean devaluation score (Fig. 5C).
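The resampling procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name, the seed, and the use of the standard-library random module are choices made here. It draws a fixed number of scores with replacement from the pooled scores, records the mean of each draw, and reports the proportion of simulated means at or below the observed group mean. By this logic, 40 of 10 000 simulated means corresponds to a proportion of 0.004.

```python
import random
from statistics import mean


def resampling_p_value(all_scores, group_size, observed_mean,
                       n_iter=10_000, seed=0):
    """Proportion of resampled group means <= the observed group mean."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    at_or_below = 0
    for _ in range(n_iter):
        # Draw group_size scores with replacement from the pooled scores.
        sample = rng.choices(all_scores, k=group_size)
        if mean(sample) <= observed_mean:
            at_or_below += 1
    return at_or_below / n_iter
```

A call would pass the pooled devaluation scores of all 33 participants, a group size of 6, and the vmPFC group's observed mean; those inputs are not reproduced here because the individual scores are not listed in the text.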

Self-report ratings of food desirability, fullness and hunger

All groups reported a decreased desire to eat the satiated food post-satiation, and there were no group differences in the pre- to post-satiation changes in these self-reported ratings of: (i) desire to consume the satiated food [Fig. 6A; one-way ANOVA, F(2,30) = 0.349, P = 0.708, pη2 = 0.023]; (ii) desire to consume the non-satiated food [Fig. 6B; ANOVA, F(2,30) = 1.732, P = 0.194, pη2 = 0.103]; (iii) fullness [one-way ANOVA, F(2,30) = 1.372, P = 0.269, pη2 = 0.084]; or (iv) hunger [one-way ANOVA, F(2,30) = 2.539, P = 0.096, pη2 = 0.145]. Furthermore, all groups reported significant reductions in their desire to consume the satiated reward after satiation [Fig. 7A; ANOVA, F(1,30) = 34.993, P < 0.001, pη2 = 0.538]. These results demonstrate that patients with vmPFC lesions have an intact capacity to exhibit reinforcer devaluation through selective satiation on these particular dependent measures—lesions to the vmPFC spare the abilities to process the experienced reward-value of food, to change the experienced value of those foods by eating to satiety, and to alter the desire to consume food rewards as a function of changes in the value of food. Additionally, the total number of calories consumed did not differ between groups [Fig. 7E; F(2,30) = 0.073, P = 0.930, pη2 = 0.005], demonstrating that the vmPFC group’s impaired devaluation was not driven by a global difference in appetite at the time of the experiment.

Figure 6.

Self-reported changes in desirability of rewards. Patients with lesions to the vmPFC, normal comparison (NL), and BDC patients showed no significant differences in how much they changed their self-reported ratings of their desire to eat both (A) the satiated food and (B) the non-satiated food after satiation. Error bars show 95% confidence intervals.

Figure 7.

Self-reported desirability of satiated and non-satiated rewards, hunger and fullness at different time points. (A) All three groups reported a decreased desire to consume the satiated food after satiation, but (B) desirability of the non-satiated reward did not change as dramatically after the satiation phase. (C) Hunger decreased throughout the experiment for all three groups, while (D) fullness increased over the course of the experiment. Error bars show 95% confidence intervals. ***P < 0.001. (E) Average total calories consumed by group. Despite their impairments in devaluing the action associated with the satiated reward, patients with vmPFC lesions consumed, on average, a similar number of calories to the normal comparison and brain-damaged comparison participants over the duration of the task. Error bars show 95% confidence intervals. NL = neurologically normal.

To further explore test-retest reliability and demonstrate the reproducibility of our findings within individual subjects, we retested two vmPFC patients on the entire experimental protocol at a subsequent follow-up visit. Both patients showed minimal differences between test and retest on the main measures of interest: devaluation score for the satiated reward [mean difference of 85.5, or 0.7 standard deviations (SD)], calories consumed (mean difference of 87.9, or 0.5 SD), and number of training trials needed to reach criterion (mean difference of 11.5, or 0.4 SD).

One additional factor we addressed is whether unilateral lesions had a smaller effect on patients’ reinforcer devaluation than bilateral lesions. Although we lacked the statistical power to investigate this quantitatively, it is notable that the vmPFC patient with the lowest devaluation score for the satiated reward had a unilateral lesion, while the BDC patient with a bilateral lesion had the median score in that group.

We would also note that the BDC group included four patients with temporal lobe resections (two left hemisphere, two right hemisphere). In these cases, the amygdala was included in the resection, along with other anterior temporal and medial temporal lobe structures. All four of these BDC patients showed normal devaluation and performed like the neurologically normal group. It will be important in future work to investigate the role of the human amygdala in reinforcer devaluation, for example, by studying patients with more focal unilateral and bilateral amygdala damage.

Discussion

Our results demonstrate for the first time that the human vmPFC is necessary for goal-directed control, a major aspect of adaptive behaviour in which actions are selected based on the current value of reward outcomes. Unlike our two comparison populations (brain-damaged and healthy participants), patients with vmPFC lesions failed to reduce instrumental choices for a food outcome they themselves no longer deemed valuable in self-report. In fact, two of the six patients increased their responses to the satiated food, a behaviour that was never observed in any of the 27 participants in the two comparison groups. These results show that in humans, as in rodents, this region of prefrontal cortex plays an essential role in enabling goal-directed behavioural control on instrumental tasks. These findings not only provide an important complement to animal studies and a burgeoning functional MRI literature implicating this brain region in value-based decision-making, but also sharpen our knowledge of its necessary role: while essential for updating the value representations that guide goal-directed choice (or linking such representations, even if successfully updated, to instrumental choice), the vmPFC is not essential for instrumental learning as such, nor is it essential for updating cognitive preferences or interoception of satiety.

These preserved abilities we found with our behavioural task also allow us to rule out alternative explanations for the observed effects. The fact that the patients with vmPFC lesions performed normally on the initial acquisition of instrumental reward contingencies can be taken as evidence that they have an intact ability to learn and maintain instrumental actions. Most importantly, the vmPFC patients gave entirely normal ratings of the hedonic value of food rewards, along with a marked decrease in subjective pleasantness ratings for the devalued foods as well as reports of increased fullness and reduced hunger. Thus, a basic impairment in subjective valuation and selective satiation per se cannot account for our findings.

The fact that patients with lesions to the vmPFC were intact in their capacity to acquire instrumental actions for rewards in the first place suggests a relatively preserved habit-based learning system. This finding is consistent with studies in rodents, in which lesions of prelimbic cortex do not impair the initial acquisition of instrumental actions for reward (Balleine and Dickinson, 1998; Corbit and Balleine, 2003). It is of course important to note that humans with vmPFC lesions have often been found to be impaired in the acquisition of more complicated types of decision-making behaviours, such as performance on gambling tasks or reward-related reversal learning tasks (Bechara et al., 1994; Rolls et al., 1994; Hornak et al., 2004; Fellows and Farah, 2005). One possible explanation for the difference between the current findings and those previous findings is that such complex tasks likely impose greater demands on a goal-directed (or model-based) system to enable the acquisition of a successful behavioural strategy, whereas in the simpler case of instrumental action learning without complex (or switching) contingencies, a habitual system can effectively manage a successful learning strategy (Hampton et al., 2006; Jang et al., 2015). For instance, both the Iowa Gambling Task and reversal learning may at least in part involve the capacity to re-evaluate the current ongoing value of a particular action after an initial learning phase, a function that may depend in part on the capacity to link actions to the current incentive value of outcomes (indeed, patients with vmPFC lesions typically show intact initial learning on the Iowa Gambling Task, and manifest impairments only on later trials) (Bechara et al., 2000). Alternatively, the fact that human vmPFC lesions impair reversal learning may be due to white matter damage, as recent primate studies have demonstrated. Aspiration lesions of primate orbitofrontal cortex impair reversal learning and instrumental devaluation, whereas excitotoxic lesions that leave the underlying white matter intact only impair instrumental devaluation (Rudebeck et al., 2013). As noted earlier, we cannot sort this out in our current lesion patients, and follow-up investigations will be needed to address this issue definitively in humans.

Another notable finding in our study is that vmPFC lesions left intact the capacity not only to judge the pleasantness of a food reward, but even to judge it as less pleasant when it was devalued through satiation. The patients with vmPFC lesions, similar to the comparison participants, showed robust devaluation effects on their subjective ratings and on their eating behaviour for the satiated food, in striking contrast to their instrumental behaviour on our main task. Taken together, these findings suggest that the vmPFC may not be essential for the capacity to judge the hedonic value of food rewards or for direct consumption behaviour when unmediated through the selection of instrumental actions. These results are of considerable importance because functional MRI studies in humans (O’Doherty et al., 2000; Small et al., 2001; Kringelbach et al., 2003; Rolls et al., 2003) and electrophysiology studies in non-human primates (Critchley and Rolls, 1996) frequently report the presence of neural activity in this region and adjacent orbitofrontal cortex in response to the value of an experienced food reward (and many other types of rewards). The present findings suggest that, at least within the vmPFC (including medial and central parts of the orbitofrontal cortex), this activity may not be required for the ‘hedonic’ experience of food rewards per se. One intriguing implication of this finding is that experienced reward representations in the vmPFC are not essential to give rise to experienced hedonics (Kringelbach and Rolls, 2004), but instead, such representations may be used for other purposes, such as facilitating the learning of goal-directed associations or computing prediction errors. It will be important in future studies to determine the contribution of the specific impairment we report here to the real-world behaviour of patients with damage to the vmPFC. None of our vmPFC participants evidenced any gross eating disorder (and BMIs were comparable across the groups), suggesting that intact habit-based control, together with intact basic food preferences and cognitive knowledge, can be sufficient to compensate for the absence of goal-directed control.

Our main finding that the vmPFC is an essential component of the system for goal-directed control gives precision to the broader literature implicating this brain region in reward-related decision-making. The role of this region in decision-making appears to be a very specific contribution to a cognitive process in which the incentive values of outcomes are updated and retrieved at the time of decision-making, allowing them to be deployed in goal-directed decisions.

Acknowledgements

We would like to thank Nash Witkin, Raul Samrah, Victoria Spring, Chris Kovach, and Anthony Gómez for their contributions to this project.

Glossary

Abbreviations

BDC = brain-damaged comparison

vmPFC = ventromedial prefrontal cortex

Funding

Supported in part by a McDonnell Foundation Collaborative Action Award [#220020387], the Kiwanis Foundation, a NIH Predoctoral Training Award [T32 GM108540], and a Conte Center from NIMH (1 P50 MH094258-04A1).

References

  1. Adams CD, Dickinson A. Instrumental responding following reinforcer devaluation. Q J Exp Psychol 1981; 33: 109–21.
  2. Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 1998; 37: 407–19.
  3. Balleine BW, Ostlund SB. Still at the choice‐point: action selection and initiation in instrumental conditioning. Ann N Y Acad Sci 2007; 1104: 147–71.
  4. Baxter MG, Parker A, Lindner CC, Izquierdo AD, Murray EA. Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex. J Neurosci 2000; 20: 4311–19.
  5. Bechara A, Tranel D, Damasio H. Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions. Brain 2000; 123: 2189–202.
  6. Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 1994; 50: 7–15.
  7. Clithero JA, Rangel A. Informatic parcellation of the network involved in the computation of subjective value. Soc Cogn Affect Neurosci 2013; 9: 1289–302.
  8. Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental conditioning. Behav Brain Res 2003; 146: 145–57.
  9. Critchley HD, Rolls ET. Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J Neurophysiol 1996; 75: 1673–86.
  10. Daw ND, Niv Y, Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci 2005; 8: 1704–11.
  11. de Wit S, Corlett PR, Aitken MR, Dickinson A, Fletcher PC. Differential engagement of the ventromedial prefrontal cortex by goal-directed and habitual behavior toward food pictures in humans. J Neurosci 2009; 29: 11330–8.
  12. de Wit S, Dickinson A. Associative theories of goal-directed behaviour: a case for animal–human translational models. Psychol Res 2009; 73: 463–76.
  13. Dickinson A. Actions and habits: the development of a behavioural autonomy. Philos Trans R Soc Lond B Biol Sci 1985; 308: 67–78.
  14. Dickinson A, Nicholas D, Adams CD. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Q J Exp Psychol 1983; 35: 35–51.
  15. Doya K, Samejima K, Katagiri K, Kawato M. Multiple model-based reinforcement learning. Neural Comput 2002; 14: 1347–69.
  16. Fellows LK, Farah MJ. Different underlying impairments in decision-making following ventromedial and dorsolateral frontal lobe damage in humans. Cereb Cortex 2005; 15: 58–63.
  17. Frank R, Damasio H, Grabowski TJ. Brainvox: an interactive, multimodal visualization and analysis system for neuroanatomical imaging. Neuroimage 1997; 5: 13–30.
  18. Garner DM, Olmsted MP, Bohr Y, Garfinkel PE. The eating attitudes test: psychometric features and clinical correlates. Psychol Med 1982; 12: 871–8.
  19. Gottfried JA, O’Doherty JA, Dolan RJ. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 2003; 301: 1104–7.
  20. Hampton AN, Bossaerts P, O’Doherty JP. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J Neurosci 2006; 26: 8360–7.
  21. Hornak J, O’Doherty J, Bramham J, Rolls ET, Morris RG, Bullock PR, et al. Reward-related reversal learning after surgical excisions in orbito-frontal or dorsolateral prefrontal cortex in humans. J Cogn Neurosci 2004; 16: 463–78.
  22. Jang AI, Costa VD, Rudebeck PH, Chudasama Y, Murray EA, Averbeck BB. The role of frontal cortical and medial-temporal lobe brain areas in learning a bayesian prior belief on reversals. J Neurosci 2015; 35: 11751–60.
  23. Kable JW, Glimcher PW. The neurobiology of decision: consensus and controversy. Neuron 2009; 63: 733–45.
  24. Kahneman D. Thinking, fast and slow. New York: Macmillan; 2011.
  25. Keifer E, Tranel D. A neuropsychological investigation of the Delis-Kaplan executive function system. J Clin Exp Neuropsychol 2013; 35: 1048–59.
  26. Kringelbach ML, O’Doherty J, Rolls ET, Andrews C. Activation of the human orbitofrontal cortex to a liquid food stimulus is correlated with its subjective pleasantness. Cereb Cortex 2003; 13: 1064–71.
  27. Kringelbach ML, Rolls ET. The functional neuroanatomy of the human orbitofrontal cortex: evidence from neuroimaging and neuropsychology. Prog Neurobiol 2004; 72: 341–72.
  28. Luria AR. Higher cortical functions in man. Oxford: Basic Books; 1966.
  29. McNamee D, Liljeholm M, Zika O, O’Doherty JP. Characterizing the associative content of brain structures involved in habitual and goal-directed actions in humans: a multivariate FMRI study. J Neurosci 2015; 35: 3764–71.
  30. Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annu Rev Neurosci 2001; 24: 167–202.
  31. Milner B. Effects of different brain lesions on card sorting: the role of the frontal lobes. Arch Neurol 1963; 9: 90–100.
  32. Norman DA, Shallice T. Attention to action: willed and automatic control of behavior. In: Davidson RJ, Schwartz GE, Shapiro D, editors. Consciousness and self-regulation. New York: Springer US; 1986: 1–18.
  33. O’Doherty J, Rolls ET, Francis S, Bowtell R, McGlone F, Kobal G, et al. Sensory-specific satiety-related olfactory activation of the human orbitofrontal cortex. Neuroreport 2000; 11: 893–7.
  34. Owen AM. Cognitive planning in humans: neuropsychological, neuroanatomical and neuropharmacological perspectives. Prog Neurobiol 1997; 53: 431–50.
  35. Plassmann H, O’Doherty J, Rangel A. Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci 2007; 27: 9984–8.
  36. Rangel A, Camerer C, Montague PR. A framework for studying the neurobiology of value-based decision making. Nat Rev Neurosci 2008; 9: 545–56.
  37. Rhodes SE, Murray EA. Differential effects of amygdala, orbital prefrontal cortex, and prelimbic cortex lesions on goal-directed behavior in rhesus macaques. J Neurosci 2013; 33: 3380–9.
  38. Rolls ET, Hornak J, Wade D, McGrath J. Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. J Neurol Neurosurg Psychiatry 1994; 57: 1518–24.
  39. Rolls ET, Kringelbach ML, De Araujo IE. Different representations of pleasant and unpleasant odours in the human brain. Eur J Neurosci 2003; 18: 695–703.
  40. Rudebeck P, Murray EA. Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. J Neurosci 2011; 31: 10569–78.
  41. Rudebeck PH, Saunders RC, Prescott AT, Chau LS, Murray EA. Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nat Neurosci 2013; 16: 1140–5.
  42. Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron 2011; 70: 1054–69.
  43. Schneider W, Shiffrin RM. Controlled and automatic human information processing: I. Detection, search, and attention. Psychol Rev 1977; 84: 1–66.
  44. Small DM, Zatorre RJ, Dagher A, Evans AC, Jones-Gotman M. Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain 2001; 124 (Pt 9): 1720–33.
  45. Valentin VV, Dickinson A, O’Doherty JP. Determining the neural substrates of goal-directed learning in the human brain. J Neurosci 2007; 27: 4019–26.
  46. Wallis JD, Anderson KC, Miller EK. Single neurons in prefrontal cortex encode abstract rules. Nature 2001; 411: 953–6.

Articles from Brain are provided here courtesy of Oxford University Press
