The Journal of Neuroscience. 2003 Aug 27;23(21):7931–7939. doi: 10.1523/JNEUROSCI.23-21-07931.2003

Dissociating Valence of Outcome from Behavioral Control in Human Orbital and Ventral Prefrontal Cortices

John O'Doherty 1, Hugo Critchley 1, Ralf Deichmann 1, Raymond J Dolan 1
PMCID: PMC6740603  PMID: 12944524

Abstract

The precise role of orbitofrontal cortex (OFC) in affective processing is still debated. One view suggests OFC represents stimulus reward value and supports learning and relearning of stimulus-reward associations. An alternate view implicates OFC in behavioral control after rewarding or punishing feedback. To discriminate between these possibilities, we used event-related functional magnetic resonance imaging in subjects performing a reversal task in which, on each trial, selection of the correct stimulus led to a 70% probability of receiving a monetary reward and a 30% probability of obtaining a monetary punishment. The incorrect stimulus had the reverse contingency. In one condition (choice), subjects had to choose which stimulus to select and switch their response to the other stimulus once contingencies had changed. In another condition (imperative), subjects had simply to track the currently rewarded stimulus. In some regions of OFC and medial prefrontal cortex, activity was related to valence of outcome, whereas in adjacent areas activity was associated with behavioral choice, signaling maintenance of the current response strategy on a subsequent trial. Caudolateral OFC-anterior insula was activated by punishing feedback preceding a switch in stimulus in both the choice and imperative conditions, indicating a possible role for this region in signaling a change in reward contingencies. These results suggest functional heterogeneity within the OFC, with a role for this region in representing stimulus-reward values, signaling changes in reinforcement contingencies and in behavioral control.

Keywords: reward, instrumental learning, behavioral control, orbitofrontal cortex, event-related, fMRI

Introduction

The orbitofrontal cortex (OFC) is arguably the least understood subdivision of prefrontal cortex (PFC). According to one view, OFC represents stimulus reward value and subserves learning and relearning of associations between arbitrary neutral stimuli and rewards or punishments (Rolls, 2000). Consistent with this, single-unit studies in nonhuman animals and human neuroimaging studies report OFC responses during the presentation of rewarding or punishing stimuli in different modalities (Thorpe et al., 1983; Critchley and Rolls, 1996; Zald and Pardo, 1997; Small et al., 1999; Elliott et al., 2000a; Breiter et al., 2001; Gottfried et al., 2002).

An alternative view proposes that OFC is involved in response selection in the context of rewarding or punishing outcomes, especially in the inhibition or suppression of responses that were previously associated with reward (Dias et al., 1996; Elliott et al., 2000b; Roberts and Wallis, 2000). Evidence for the response selection/inhibition hypothesis arises predominantly from lesion studies conducted in both nonhuman primates and human patients, in which during performance of instrumental reward tasks, OFC lesions lead to difficulties in extinguishing or switching responses from a previously rewarded stimulus once contingencies have altered and that stimulus is no longer rewarded (Butter, 1969; Iversen and Mishkin, 1970; Rolls et al., 1994; Dias et al., 1996).

The aim of this study was to determine whether activity related to response selection in the OFC and in adjacent ventromedial and lateral prefrontal cortices could be distinguished from activity related to rewarding and punishing feedback itself. To accomplish this, we used a probabilistic reversal task in which the average magnitude of rewards and punishments obtainable after choice of the correct or incorrect stimulus was kept constant. The only factor that distinguished the correct and incorrect stimuli was the probability of obtaining a reward or punishment. A similar design was used in an event-related functional magnetic resonance imaging (fMRI) study by Cools et al. (2002). However, these authors did not obtain signal in the OFC because of susceptibility artifact, so it was not possible to distinguish positive and negative feedback from response selection in this region.

In the present study, we used two main conditions (Fig. 1). In the “choice” condition, on each trial, subjects were free to choose which stimulus to select and could change their choice of stimulus on any trial. In the “imperative” condition, subjects were rewarded and punished after the selection of a stimulus, but this time they did not choose which stimulus to select. Instead, the choice was made for them by the computer. Within the choice condition, punishment trials that were followed by a change in stimulus choice could be compared with punishment trials that were not followed by such a change, thus isolating neural events signaling response switching from neural events related to punishment itself. Comparisons between the choice and imperative tasks also provided a means to examine the effects of response selection, because these mechanisms were predicted to be engaged only in the choice condition.

Figure 1.


Illustration of task display for choice and imperative reversal task. Subjects were presented with two abstract visual stimuli. At the beginning, one stimulus was designated the correct stimulus and the other the incorrect stimulus. In the choice task, subjects selected a stimulus, which then increased in brightness and was followed by a monetary outcome (winning or losing 10 or 20 pence). In the imperative task, subjects did not select the stimulus, but instead this selection was made by computer. Subjects had to respond to indicate which of the two stimuli had been selected and then received rewarding and punishing feedback.

Materials and Methods

Subjects

Fifteen healthy, right-handed subjects, 10 of whom were female, were included in the experiment. The subjects were preassessed to exclude those with a prior history of neurological or psychiatric illness. All subjects gave informed consent, and the study was approved by the local research ethics committee.

Choice reversal task description

Two unfamiliar and easily discriminable fractal patterns were displayed on a gray background, positioned to the left and right of a central fixation cross. The total score was displayed numerically in the center of the screen above the fixation cross. The two fractals were assigned randomly to either the left or the right of the screen on each trial. After a subject selected a stimulus, the chosen stimulus increased in brightness, and 1 sec later a message appeared below the stimulus, indicating how much money the subject had won or lost, together with a picture of the amount won or lost (an image of either a 20 pence or a 10 pence piece) (Fig. 1). On losing trials, a red cross was superimposed over the image of the amount lost. The feedback remained on the screen for 1.3 sec; the screen then cleared and a fixation cross appeared. The next trial was triggered after 2000 msec.

At the beginning of the task, one of the stimuli was arbitrarily designated the “correct stimulus,” and the other the “incorrect” stimulus. Selection of the correct stimulus led to a monetary win with probability of 0.7 and a monetary loss with probability of 0.3. Selection of the incorrect stimulus led to a monetary win with probability of 0.3 and a monetary loss with probability of 0.7. Consistent selection of the correct stimulus, therefore, led to an overall monetary gain. Conversely, consistent selection of the incorrect stimulus led to an overall monetary loss. The magnitudes of rewards and punishments also varied, in that on trials in which a monetary reward occurred, there was an equal probability that it would be 10 pence or 20 pence. Similarly, on trials in which a monetary loss was received, there was an equal probability that it would be 10 pence or 20 pence. Criterion was five touches of the correct stimulus. Once criterion was reached, reversal occurred after a Poisson process, such that there was a probability of 0.25 that a reversal took place on any given post-criterion trial. Once reversal occurred, another reversal was not triggered until criterion was reached on the new correct stimulus.
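The reward schedule described above can be sketched in a few lines of Python. This is an illustrative simulation only, not the authors' task code. The probabilities, magnitudes, criterion, and reversal probability come from the text; the function names, the choice to count criterion cumulatively rather than consecutively, and the choice to allow a reversal from the criterion trial onward are assumptions. Note also that a fixed per-trial reversal probability gives geometrically distributed waiting times, which is how the reversal rule is modeled here.

```python
import random

P_WIN_CORRECT = 0.7     # probability of a win after choosing the correct stimulus
P_WIN_INCORRECT = 0.3   # probability of a win after choosing the incorrect stimulus
MAGNITUDES = (10, 20)   # pence, equally likely for wins and for losses
CRITERION = 5           # selections of the correct stimulus (assumed cumulative)
P_REVERSAL = 0.25       # reversal probability on each post-criterion trial


def expected_value(p_win):
    """Expected outcome in pence for a single selection with win probability p_win."""
    mean_magnitude = sum(MAGNITUDES) / len(MAGNITUDES)
    return p_win * mean_magnitude - (1 - p_win) * mean_magnitude


def simulate(choices, rng):
    """Return one (choice, correct, outcome_pence) tuple per trial.

    `choices` is a sequence of stimulus indices (0 or 1), one per trial.
    """
    correct = rng.randrange(2)  # arbitrary initial designation
    n_correct = 0               # correct selections since the last reversal
    trials = []
    for choice in choices:
        p_win = P_WIN_CORRECT if choice == correct else P_WIN_INCORRECT
        sign = 1 if rng.random() < p_win else -1
        trials.append((choice, correct, sign * rng.choice(MAGNITUDES)))
        if choice == correct:
            n_correct += 1
        # Once criterion is met, each trial may trigger a reversal.
        if n_correct >= CRITERION and rng.random() < P_REVERSAL:
            correct = 1 - correct
            n_correct = 0  # criterion must be reached again on the new stimulus
    return trials
```

Under these numbers, `expected_value(0.7)` is about +6 pence per trial and `expected_value(0.3)` about -6 pence, which is why consistent selection of the correct stimulus yields an overall gain even though the mean magnitude of wins and losses is identical for both stimuli.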

Imperative reversal task description

The imperative reversal task was identical to the choice reversal task in terms of presentation (although two different fractal stimuli were used), except that in this case subjects had no choice about which stimulus they would select on a given trial. Instead, the computer selected one of the stimuli according to the selections made and feedback obtained by another subject while performing the choice task. Thus, each subject's imperative task was yoked to the choice condition of another subject.
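The yoking arrangement can be illustrated with a short sketch. The article states only that each subject's imperative task replayed another subject's choice-task selections and feedback; it does not say how partners were assigned. The rotation rule, the function name `yoke`, and the log format below are therefore assumptions made for illustration.

```python
def yoke(choice_logs):
    """Assign each subject the choice-task log of a different subject.

    `choice_logs` maps a subject id to a list of (stimulus, outcome_pence)
    trials recorded in the choice task. Partners are assigned by rotating
    through the sorted subject ids (a hypothetical rule, not the authors').
    """
    subjects = sorted(choice_logs)
    if len(subjects) < 2:
        raise ValueError("yoking needs at least two subjects")
    yoked = {}
    for i, subject in enumerate(subjects):
        partner = subjects[(i + 1) % len(subjects)]
        # The imperative session replays the partner's selections and feedback.
        yoked[subject] = {"partner": partner, "trials": list(choice_logs[partner])}
    return yoked
```

Whatever assignment rule is used, the key property is that the sequence of selected stimuli and outcomes in the imperative task is matched to a real choice-task sequence, so the two conditions differ in who makes the selection rather than in the statistics of the feedback.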

As in the choice task, each trial began with the presentation of two arbitrary neutral stimuli on either side of a fixation cross. Unlike the choice task, 500 msec into the trial, one of the two stimuli spontaneously increased in brightness, indicating that the computer had chosen that stimulus. Once a stimulus had been chosen, but before feedback was obtained, subjects were instructed to make a response indicating whether the selected stimulus was on the left or right of the screen. This ensured that subjects were attending to the relevant stimulus and allowed motor confounds to be removed in comparisons with the choice task.

Experimental procedure

Prescanning training phase.

Before scanning, subjects were trained with a modified version of the choice reversal task used in the scanner. Subjects were instructed in the first instance that they had to find out which one of the two stimuli was correct, without any reference to the fact that contingencies would reverse. Once subjects had reached criterion for the first time (which in the case of the training task was 10 selections of the correct stimulus), a message on the screen informed them that they had found the correct stimulus (this message was only present in the training phase). The task was then paused, and subjects were instructed that once the task resumed, the contingencies would at some point reverse. The subjects were told that they had to work out when a reversal occurred and then switch their choice of stimulus. The stimuli used in the training task differed from the ones used during the actual scanning phase. Training was complete once subjects had attained at least two reversals after the first acquisition. Subjects were informed that at the end of the study they would be able to keep the total amount of money accumulated during task performance when in the scanner (for both the choice and imperative tasks). This total amount did not exceed 10 pounds for any subject, and at the end of the experiment, subjects were paid 10 pounds irrespective of their individual performance.

Scanning phase. The task was presented on a projector screen positioned ∼10 cm away from the subject's face. On each trial, the subject used one of two buttons to select the stimulus positioned on either the left or right side of the screen. The order of presentation of the choice and imperative tasks was counterbalanced across subjects. To rule out stimulus-specific effects, the six fractal stimuli used in the experiment (including the two used in the prescanning training phase) were randomly assigned to either the training phase or the choice or imperative tasks for each individual subject. Subjects performed the choice and imperative tasks in two separate 15 min sessions. In each session, 60 low-level baseline trials were randomly intermixed with the task-related trials. These involved the presentation of a fixation cross for 3 sec. Subjects completed an average of 184 task-related trials in the 15 min provided.

Imaging procedure

The functional imaging was conducted by using a 2 Tesla Siemens Vision MRI scanner to acquire gradient echo T2*-weighted echo-planar images with blood oxygenation level-dependent contrast. We used a special sequence designed to optimize functional sensitivity in the OFC and medial temporal lobes (Deichmann et al., 2003). This consisted of tilted acquisition in an oblique orientation at 30° to the anterior-posterior commissure line, as well as application of a preparation pulse with a duration of 1 msec and an amplitude of −2 mT/m in the slice selection direction. This sequence has been shown to produce robust activation in the OFC and medial temporal lobes in a previous study (Gottfried et al., 2002). The sequence enabled 39 axial slices of 3.67 mm thickness and 3 mm in-plane resolution to be acquired with a repetition time of 2.78 sec. Subjects were placed in a light head restraint within the scanner to limit head movement during acquisition. A T1-weighted structural image was also acquired for each subject. Functional imaging data were acquired in two separate 15 min (336 vol) sessions in each subject during performance of the choice and imperative tasks.
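As a quick consistency check on the acquisition parameters, the session length and slab coverage can be worked out from the numbers quoted above. The variable names are illustrative, and the coverage figure assumes that the 3.67 mm slice thickness includes any inter-slice gap, which the text does not specify.

```python
TR_SEC = 2.78        # repetition time per volume
VOLUMES = 336        # volumes per session
SLICES = 39          # axial slices per volume
SLICE_MM = 3.67      # slice thickness (assumed to include any gap)

session_sec = VOLUMES * TR_SEC       # total acquisition time in seconds
session_min = session_sec / 60       # the same duration in minutes
coverage_mm = SLICES * SLICE_MM      # extent covered along the slice direction

print(f"{session_sec:.2f} s = {session_min:.2f} min; coverage {coverage_mm:.2f} mm")
```

The product of 336 volumes and 2.78 sec is about 934 sec, or roughly 15.6 min, in line with the stated 15 min sessions.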

Image analysis

The images were analyzed using SPM99 (Wellcome Department of Imaging Neuroscience, London, UK). To correct for subject motion, the images were realigned to the first volume (Friston et al., 1995). The images were then spatially normalized to a standard T2* template with a resampled voxel size of 3 mm³, and spatial smoothing was applied using a Gaussian kernel with a full width at half-maximum of 8 mm. Intensity normalization and high-pass temporal filtering (using a filter width of twice the minimum inter-trial interval) were also applied to the data.

Statistical analysis

Statistical analysis was carried out using the general linear model, in which each single event was modeled as a delta function convolved with the hemodynamic response function and its temporal derivative.
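To make the event model concrete, the sketch below builds one regressor pair (response function plus temporal derivative) for a list of event onsets. The article does not give the functional form of the hemodynamic response function; the double-gamma shape used here, its parameters (shapes 6 and 16, undershoot ratio 1/6), the forward-difference derivative, and all function names are assumptions chosen to resemble a commonly used canonical form, not a reproduction of the SPM99 implementation.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density at time t (zero for t <= 0)."""
    if t <= 0:
        return 0.0
    x = t / scale
    return x ** (shape - 1) * math.exp(-x) / (math.gamma(shape) * scale)


def hrf(t):
    """Assumed double-gamma response: early peak minus a smaller late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def hrf_derivative(t, step=0.1):
    """Forward-difference approximation to the temporal derivative of hrf."""
    return (hrf(t + step) - hrf(t)) / step


def build_regressors(onsets, n_scans, tr=2.78):
    """Sample the event model at each scan time.

    Convolving a delta function at each onset with the response function is
    the same as summing copies of the response shifted to each onset.
    """
    main, deriv = [], []
    for scan in range(n_scans):
        t = scan * tr
        main.append(sum(hrf(t - onset) for onset in onsets))
        deriv.append(sum(hrf_derivative(t - onset) for onset in onsets))
    return main, deriv
```

For example, `build_regressors([10.0, 40.0], 30)` returns two lists of 30 values, one per scan. Including the derivative column lets the fitted response shift slightly earlier or later than the assumed peak latency.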

Events were divided up into positive (reward) and negative outcomes, according to whether money was won or lost after stimulus selection. The time of onset of each event was locked to the point in the trial when the subject received the outcome after having made a stimulus selection. We differentiated between negative outcomes that led to a switch of stimulus choice on the next trial and negative outcomes that did not lead to such a switch.

In a preliminary analysis, we subdivided switch events into those that occurred after five or more consecutive selections of the previously chosen stimulus and those that did not. The rationale for this was to determine whether switch events that occurred after a subject had responded consistently to a particular stimulus could be differentiated from more spontaneous switch events in which the subject had not previously established a persistent response-set to the other stimulus. The majority of switch events occurred after five or more selections (mean number of such events across subjects, 14.7); the next most frequent switch event was that after two or fewer previous selections of the other stimulus (mean number across subjects, 11.3). Events with three or four consecutive selections were the least common (mean, 5.3 events across subjects). In the main analysis reported here, we pooled over all switch events irrespective of the number of selections of the previous stimulus, because the preliminary analysis did not reveal any significant differences (at p < 0.001) between the two types of switch event defined above.

We only modeled positive outcomes not leading to a switch of stimulus choice, because switches after a rewarding outcome were rare (mean occurrence, 3.8 events across subjects). The two different outcome magnitudes (10 and 20 pence) were also modeled separately for each event type. For each individual subject, motion parameters were included as regressors of no interest for both sessions, to take into account additional effects of head motion not removed at the motion correction stage.
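The event definitions above can be summarized as a small labeling routine. This is a sketch under stated assumptions rather than the authors' analysis code: the trial format, the function names, and the decision to leave the final trial unlabeled (it has no following trial from which to judge a switch) are all hypothetical. The labels `reward`, `pun_noswch`, and `pun_swch` follow the names used in the article.

```python
def classify_outcomes(trials):
    """Label each trial by its outcome and by the choice made on the next trial.

    `trials` is a list of (choice, outcome_pence) tuples in presentation order.
    Returns one label per trial: "reward", "pun_noswch", "pun_swch", or None
    for events that are not modeled (rewards followed by a switch, and the
    final trial).
    """
    labels = []
    for i, (choice, outcome) in enumerate(trials):
        if i + 1 >= len(trials):
            labels.append(None)  # no subsequent trial to compare against
            continue
        switched = trials[i + 1][0] != choice
        if outcome > 0:
            labels.append(None if switched else "reward")
        else:
            labels.append("pun_swch" if switched else "pun_noswch")
    return labels


def run_length(trials, i):
    """Number of consecutive selections of trial i's stimulus, ending at trial i."""
    n = 1
    while i - n >= 0 and trials[i - n][0] == trials[i][0]:
        n += 1
    return n
```

`run_length` corresponds to the quantity used in the preliminary analysis, which split switch events by whether five or more consecutive selections preceded the switch.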

Linear contrasts were performed between the regressors to test for differential effects at the single subject level. These were then taken to the group random effects level by performing one-sample t tests on the contrast images derived from each single subject. In the main analysis reported here, we averaged over the different outcome magnitudes for each event type.
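At the group level, the random effects test reduces, at each voxel, to a one-sample t test on one contrast value per subject. The sketch below shows that computation for a single voxel; the function name and input format are illustrative assumptions, and the conversion from t to the reported Z-scores is not shown.

```python
import math


def one_sample_t(values):
    """Return (t statistic, degrees of freedom) for H0: population mean is zero.

    `values` holds one contrast estimate per subject at a single voxel.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two subjects")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    if variance == 0:
        raise ValueError("zero variance across subjects")
    return mean / math.sqrt(variance / n), n - 1
```

With one contrast image from each of the fifteen subjects, such a test would have 14 degrees of freedom.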

We tested for the effects of valence by comparing trials in which rewards were obtained (reward) with trials in which punishments were obtained that were not followed by a subsequent switch of stimulus choice (pun_noswch). We tested for the effects of response selection by comparing trials that were not followed by a change in stimulus choice on the subsequent trial (reward and pun_noswch trials) with those that were followed by a switch in stimulus choice (pun_swch trials). The contrast to detect areas with increased responses during trials not preceding a switch in stimulus was [reward + pun_noswch]/2 - pun_swch, whereas the inverse contrast detected areas with increased activity during punishing trials preceding a switch in stimulus choice. We also tested for a difference in the effects of valence and response selection between the choice and imperative conditions by subtracting the relevant contrast in the imperative condition from that in the choice condition. Finally, we tested for common activations relating to valence and response selection in the choice and imperative tasks by performing a conjunction analysis between the relevant contrasts from the two conditions.
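The contrasts described in this section can be written as weights over the three modeled event types. The sketch below is illustrative: the dictionary keys for the contrasts and the function names are assumptions, while the weights follow the formulas in the text, and the choice-minus-imperative direction follows the labels in Table 3.

```python
CONTRASTS = {
    # reward - pun_noswch
    "valence_reward_gt_punishment": {"reward": 1.0, "pun_noswch": -1.0, "pun_swch": 0.0},
    # [reward + pun_noswch]/2 - pun_swch
    "maintenance_gt_switching": {"reward": 0.5, "pun_noswch": 0.5, "pun_swch": -1.0},
    # pun_swch - [reward + pun_noswch]/2
    "switching_gt_maintenance": {"reward": -0.5, "pun_noswch": -0.5, "pun_swch": 1.0},
}


def contrast_value(betas, weights):
    """Weighted sum of parameter estimates for one subject at one voxel."""
    return sum(weights[name] * betas[name] for name in weights)


def choice_minus_imperative(betas_choice, betas_imperative, weights):
    """Difference of the same contrast between the two task conditions."""
    return contrast_value(betas_choice, weights) - contrast_value(betas_imperative, weights)
```

Each set of weights sums to zero, so a contrast value is nonzero only when the event types differ from one another, not when all responses are uniformly raised or lowered.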

We report results at p < 0.001 uncorrected for multiple comparisons in regions of interest, which we define for the purposes of this study as being in the OFC and adjacent ventral medial and lateral prefrontal cortices, as well as anterior cingulate cortex, amygdala, and striatum.

Results

Valence of outcome

Regions showing valence-related responses are summarized in Table 1 and detailed below.

Table 1.

Effects of valence of outcome




| Region | Laterality | X | Y | Z | Peak Z-score |
|---|---|---|---|---|---|
| Effects of valence of outcome: reward-punishment | | | | | |
| Choice: rew - pun_noswch | | | | | |
| Ventral medial PFC | Right | 9 | 42 | −12 | 3.54 |
| | | 9 | 66 | 6 | 3.46 |
| Medial OFC | Right | 12 | 36 | −18 | 3.11 |
| | Right | 3 | 15 | −12 | 3.22 |
| Lateral OFC | Left | −39 | 42 | −15 | 3.9 |
| Posterior cingulate cortex | Left | −9 | −33 | 45 | 3.97 |
| | Right | −9 | −42 | 45 | 4.53 |
| Imperative: rew - pun_noswch | | | | | |
| Caudolateral OFC | Left | −27 | 18 | −15 | 3.23 |
| Medial OFC | Left | −6 | 18 | −12 | 3.12 |
| Amygdala | Left | −27 | −3 | −27 | 3.58 |
| Border of ventral amygdala | Right | 33 | −6 | −33 | 3.37 |
| Conjunction of choice and imperative: rew - pun_noswch | | | | | |
| Medial OFC | Right | 3 | 18 | −12 | 3.45 |
| Caudolateral OFC | Left | −24 | 18 | −15 | 3.47 |
| Ventral striatum (nucleus accumbens) | Right | 12 | 9 | −9 | 3.28 |
| Amygdala | Left | −27 | −3 | −30 | 3.78 |
| Border of ventral amygdala | Right | 27 | −3 | −33 | 3.72 |
| Anterior cingulate cortex | Right | 12 | 0 | 45 | 3.68 |
| Mid dorsal insular cortex | Right | 39 | 3 | 15 | 4.02 |
| Effects of valence of outcome: punishment-reward | | | | | |
| Choice: pun_noswch - rew | | | | | |
| Anterior insular cortex | Right | 42 | 27 | 0 | 3.15 |
| Lateral prefrontal cortex | Right | 51 | 24 | 0 | 4.19 |
| Imperative: pun_noswch - rew | | | | | |
| Lateral OFC | Right | 45 | 36 | −9 | 3.21 |
| Dorsal medial PFC | | 0 | 48 | 21 | 3.65 |
| Conjunction of choice and imperative: pun_noswch - rew | | | | | |
| Lateral PFC | Right | 51 | 21 | −3 | 3.68 |

Reward > punishment

Areas showing greater responses to reward than punishment (in the absence of a behavioral switch) in the choice task include medial PFC and left lateral OFC (Fig. 2). In the imperative task, effects were found in the medial OFC/subgenual cingulate, left caudolateral OFC, and bilateral amygdala (on the right, the locus of activation is at the border of ventral amygdala). A conjunction of reward-punishment trials between the choice and imperative tasks revealed effects in medial PFC, medial OFC/subgenual cingulate cortex, right ventral striatum, and bilateral amygdala (Fig. 3).

Figure 2.


Areas of ventral PFC showing reward-related responses in the choice task. A, Group random effects results are shown superimposed on coronal and sagittal slices from the subject-averaged structural MRI image [at the Montreal Neurological Institute (MNI) coordinates indicated in the top right corner of each image]. Significant effects are shown at p < 0.001 in yellow, and to show the full extent of the activations, at p < 0.01 in red. A plot of effect sizes from medial PFC (the area circled) is shown for each trial type (reward, pun_noswch and pun_swch). B, Results from the same contrast are shown for a subset of single subjects superimposed on each subject's individual structural MRI. The threshold is set at p < 0.01 for illustration.

Figure 3.


Areas activated in conjunction of reward-pun_noswch contrast between the choice and imperative tasks. Group random effects results are shown superimposed on coronal slices at the MNI coordinates indicated (top right corner of each image). Significant effects are shown at p < 0.001 in yellow and at p < 0.01 in red (to show the full extent of the activations). mPFC, Medial PFC; mOFC, medial OFC; lOFC, lateral OFC; nACC, nucleus accumbens; Amyg, amygdaloid area.

Punishment > reward

In the choice task, right dorsal insula showed increased activity to punishing relative to rewarding outcomes, as did part of the ventral lateral PFC. No significant effects were detected in the OFC. In the imperative task, significant effects were found only in a part of dorsal anterior cingulate cortex. Furthermore, a conjunction analysis revealed common activity to punishing-rewarding outcomes in the right lateral PFC.

Response selection

Areas showing response selection-related effects in the choice task are listed in Table 2 and detailed below.

Table 2.

Effects of response selection




| Region | Laterality | X | Y | Z | Peak Z-score |
|---|---|---|---|---|---|
| Effects of response selection: response maintenance-response switching | | | | | |
| Choice: [rew + pun_noswch]/2 - pun_swch | | | | | |
| Medial OFC | | 0 | 42 | −18 | 3.37 |
| Lateral OFC | Left | −42 | 36 | −18 | 3.53 |
| Medial PFC | Left | −3 | 54 | −9 | 3.23 |
| Posterior cingulate cortex | Left | −3 | −54 | 15 | 3.86 |
| Effects of response selection: response switching-response maintenance | | | | | |
| Choice: pun_swch - [rew + pun_noswch]/2 | | | | | |
| Anterior insula-caudolateral OFC | Right | 36 | 27 | −9 | 3.78 |
| Dorsal anterior cingulate cortex | Left | −3 | 21 | 42 | 4.58 |
| | Right | 3 | 33 | 36 | 4.71 |
| Para-cingulate cortex | Left | −6 | 12 | 51 | 5.05 |

Table 3.

Comparison between choice and imperative tasks




| Region | Laterality | X | Y | Z | Peak Z-score |
|---|---|---|---|---|---|
| Choice-imperative: effects of valence of outcome (reward > punishment) | | | | | |
| Medial OFC | Right | 9 | 39 | −18 | 3.56 |
| Lateral OFC | Left | −42 | 42 | −15 | 4.35 |
| Central OFC | Right | 18 | 39 | −9 | 3.43 |
| Choice-imperative: effects of response maintenance | | | | | |
| Medial OFC | Right | 3 | 27 | −12 | 3.27 |
| Central OFC | Right | 24 | 36 | −6 | 4.26 |
| Choice-imperative: effects of response switching | | | | | |
| Dorsal anterior cingulate cortex | Left | −3 | 21 | 39 | 3.67 |
| | Right | 6 | 27 | 33 | 3.76 |
| | Right | 9 | 21 | 39 | 3.58 |
| Caudate nucleus | Left | −6 | 3 | 0 | 3.78 |
| | Right | 21 | 0 | 18 | 3.67 |
| | Right | 15 | 18 | 3 | 3.92 |
| Putamen | Right | 21 | 0 | −3 | 3.39 |
| Conjunction of choice and imperative tasks: response maintenance | | | | | |
| Caudate nucleus | Right | 33 | 3 | 6 | 4.73 |
| Posterior cingulate cortex | | 0 | −15 | 48 | 5.21 |
| Lateral OFC | Left | −30 | 42 | −9 | 3.28 |
| Conjunction of choice and imperative tasks: response switching | | | | | |
| Anterior insula-caudolateral OFC | Left | −33 | 21 | −18 | 3.69 |
| Anterior insula-caudolateral OFC | Right | 39 | 24 | −12 | 4.08 |

Response maintenance > response switching

Regions with increased activity during trials in the choice task in which the subject maintained responding to the current stimulus on the subsequent trial were medial OFC, right central OFC, and medial PFC, as well as a part of the left lateral OFC. Group random effects results from orbital and medial PFC are shown in Fig. 4, together with activation maps and evoked-response plots from a subset of single subjects. A direct comparison of pun_noswch to pun_swch events revealed significant differences between these two event types at the coordinates described above, albeit at a lower threshold of p < 0.005. For comparison, areas demonstrating response maintenance effects and areas sensitive to rewarding outcomes are shown superimposed on the same structural MRI in Figure 5.

Figure 4.


Areas related to response maintenance in the choice task. A, Group random effects results are shown superimposed on coronal and sagittal slices from the subject-averaged structural MRI image (at the MNI coordinates indicated in the top right corner of each image). Significant effects are shown at p < 0.001 in yellow and at p < 0.01 in red (to show the full extent of the activations). A plot of effect sizes from medial OFC (the area circled) is shown for each trial type (reward, pun_noswch, and pun_swch). B, Results from the same contrast are shown for a subset of single subjects superimposed on each subject's individual structural MRI. The threshold is set at p < 0.01 for illustration. C, Plots of fitted event-related responses obtained from peak voxels in medial OFC of each single subject shown in B.

Figure 5.


Valence (reward-punishment) and response maintenance-related effects in the choice task. Activations related to rewarding outcomes and response maintenance are shown superimposed on the same coronal and sagittal slices. Reward-related effects are shown in red (at p < 0.01) and yellow (at p < 0.001), and response maintenance-related effects are shown in blue (at p < 0.01) and cyan (at p < 0.001).

Response switching > response maintenance

Areas with increased activity on trials immediately preceding a switch of stimulus in the choice task were a part of right agranular insula extending into caudolateral OFC and dorsal anterior cingulate cortex (Fig. 6A). The direct contrast of pun_swch - pun_noswch also revealed significant activation in this agranular transitional region at p < 0.001, with a separate locus in caudolateral OFC, providing evidence that this area is not related to punishment per se but is activated only during punishing outcomes that are followed in the choice task by a subsequent switch in stimulus choice (Fig. 6B). Activation maps and evoked responses are shown from two single subjects in Figure 6C.

Figure 6.


Areas related to response switching in the choice task. A, Group random effects results of the contrast of pun_swch - [rewacq + pun_noswch]/2 are shown superimposed on coronal and sagittal slices from the subject-averaged MRI image (at the MNI coordinates indicated in the top right corner of each image). Significant effects are shown at p < 0.001 in yellow and at p < 0.01 in red (to show the full extent of the activations). A plot of effect sizes from anterior insula/caudolateral OFC (the area circled) is shown for each trial type (reward, pun_noswch, and pun_swch). B, Results from the contrast of pun_swch-pun_noswch at the group random effects level, showing a separate locus of activity in caudolateral OFC. C, Results from the contrast pun_swch - [rewacq + punacq]/2 are shown for a subset of single subjects superimposed on each subject's individual structural MRI. The threshold is set at p < 0.01 for illustration.

Direct comparison between choice and imperative tasks

Areas showing significantly greater responses during the choice task than the imperative task to rewarding-punishing feedback include medial OFC, left lateral OFC, and right central OFC. Areas showing significantly greater responses during the choice task than the imperative task to response maintenance were medial OFC and right central OFC. The reverse contrast, identifying regions that responded more to response switching in the choice task than in the imperative task, revealed effects in dorsal anterior cingulate cortex and striatum (bilateral caudate nucleus and right putamen). Interestingly, no differential effects were found in this contrast in right insula/caudolateral OFC.

Response selection or sensitivity to contingency changes?

An alternative interpretation pertaining to response selection-related activity in the choice task is that rather than signaling whether or not responses should be maintained or switched, these areas signal whether contingencies have changed or not. If a change in contingency has not been detected, then this would be equivalent to signaling that responses should be maintained in the choice task. Similarly, if a change in contingency has been detected, then this would be equivalent to signaling that responses should be altered in the choice task. The way in which these two possibilities can be disambiguated is if the same areas are recruited during the equivalent comparisons between response maintenance and response-switching events in the imperative task as in the choice task. If so, under the assumption that response selection is present only in the choice task, contingency change detection would be a more likely explanation of observed OFC activity.

Conjunction of response selection effects between choice and imperative tasks

A conjunction of response maintenance effects in the choice and imperative tasks revealed significant effects in the left lateral OFC, right caudate nucleus, and posterior cingulate cortex. Consistent with the contingency change interpretation, a conjunction of response-switching effects in the choice task with the equivalent contrast in the imperative task revealed activity in the same region of bilateral caudolateral OFC-anterior insula found to be associated with response switching in the choice task (above). This is shown in Figure 7.

Figure 7.


Response switching or contingency change detection? Effects of a conjunction of the contrast of pun_swch - [rewacq + punacq]/2 between the choice and imperative tasks is shown superimposed on a coronal slice from the subject averaged MRI image (at the MNI coordinate shown top right). This result illustrates that anterior insula/caudolateral OFC is recruited during both the choice and imperative tasks, suggesting that this area may be more related to detection of contingency changes than response inhibition per se. Significant effects are shown at p < 0.001 in yellow and at p < 0.01 in red (to show the full extent of the activations).

Discussion

In this study, we show that different subregions of ventral PFC have distinct roles during affective learning. First, regions of medial and orbital PFC are involved in representing outcome, with increased responses to rewarding outcomes. This finding is consistent with previous studies (Breiter et al., 2001; Knutson et al., 2001; O'Doherty et al., 2001; Elliott et al., 2003). Furthermore, although outcome-related activity was evident during both choice and imperative conditions, we also show in a direct contrast of choice-imperative conditions that outcome-related activity in medial and central OFC was greater during the choice than imperative conditions. This may reflect cognitive modulation of outcome representations, in that in the choice task, knowledge of the value of outcomes is critical for future behavioral choice, whereas this is not the case in the imperative task.

Here, the main finding is that responses in ventromedial and orbital cortex do not merely represent valence of outcome but also signal subsequent behavioral choice. In the choice task, enhanced responses in medial and left lateral OFC were evident to both rewarding and punishing feedback not followed by a change of stimulus choice, relative to punishing feedback that was followed by a change in behavior. This suggests that after the receipt of outcomes on the previous trial, activity in ventral PFC predicts the behavioral decision of the subject on the subsequent trial. Furthermore, parts of medial and central OFC were significantly more active during the choice condition than the imperative condition. The implication of this finding is that these areas may be engaged under conditions when behavioral decision making is required. This result is compatible with the idea that orbitofrontal and medial PFC is involved in integrating rewarding and punishing feedback for affective decision making (Bechara et al., 2000; Krawczyk, 2002).

We observed a different pattern of responses in agranular insula, contiguous with caudolateral OFC. Once again, in the choice task, activity in this region was related to behavioral choice. However, effects were in the opposite direction to that found in anterior medial and lateral OFC. As shown in Figure 3, activity in this region was increased when subjects received punishing feedback that on the following trial was associated with a switch in stimulus choice. This region was not engaged by a punishing outcome after which the subject did not subsequently switch stimulus choice, or by rewarding outcomes. We note that the locus of this activity is close to but more ventral than coordinates reported by Cools et al. (2002), as an area activated immediately preceding a reversal of stimulus choice. Interestingly, a conjunction of switch versus stay outcomes between the choice and imperative tasks revealed significant effects in this region bilaterally. The finding of activity in this region during the imperative as well as choice tasks complicates an interpretation that activity in this region reflects changes in response selection or inhibition of the previously selected response. An alternative explanation is that in the imperative task, even though subjects do not choose responses, they do, on average, detect a change in contingencies on trials preceding a switch in stimulus (given that the imperative trials from one subject are yoked to the choice task from another subject). Thus, activity in anterior insula/caudolateral OFC may relate to detection of a change in contingencies or, more specifically, a decrease in the average reward value of the currently chosen stimulus. These findings provide an important insight into the nature of deficits in reversal learning after lesions of orbital PFC (Rolls et al., 1994; Dias et al., 1996). Our results argue against a characterization of the effects of such lesions as being caused by a difficulty in inhibiting the previously selected response (Roberts and Wallis, 2000). Rather, our results suggest that lesions of this area may impair the ability to detect a change in reward contingencies.
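The idea of detecting "a decrease in the average reward value of the currently chosen stimulus" can be made concrete with a small sketch. The code below is not a model fitted or proposed in this study; it assumes, purely for illustration, that outcomes are coded +1 (reward) and -1 (punishment), that value is tracked as an exponentially weighted running average, and that a change is flagged when that average falls below a fixed threshold. The function name `first_value_drop` and the parameters `alpha` and `threshold` are hypothetical.

```python
# Illustrative sketch only: a running-average account of contingency-change
# detection. All names and parameter values are assumptions.

def first_value_drop(outcomes, alpha=0.5, threshold=-0.5):
    """Return the index of the first trial on which the running-average
    value of the chosen stimulus falls below `threshold`, or None.

    outcomes: sequence of +1 (reward) or -1 (punishment) for the chosen stimulus.
    alpha:    weight given to the most recent outcome (hypothetical).
    """
    value = 0.0
    for trial, outcome in enumerate(outcomes):
        # Move the estimate part of the way toward the latest outcome.
        value += alpha * (outcome - value)
        if value < threshold:
            return trial
    return None

# An isolated punishment (trial 2) lowers the estimate but does not cross the
# threshold; a run of punishments (trials 4 to 6) does.
outcomes = [+1, +1, -1, +1, -1, -1, -1]
print(first_value_drop(outcomes))
```

Under these assumptions, a single punishment embedded in a rewarded sequence does not trigger detection, whereas repeated punishment does, which mirrors the distinction drawn above between punishing feedback that precedes a switch and punishing feedback that does not.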

Responses in anterior insula/caudolateral OFC during the choice and imperative tasks can be contrasted with those in dorsal anterior cingulate cortex. This region was active during punishing trials preceding a switch in the choice task. Moreover, this region was significantly more active during the choice task than the imperative task. These findings are consistent with an fMRI study of a reward-based motor selection task (which was essentially a reversal task) by Bush et al. (2002). These authors reported anterior cingulate responses related to a decrease in reward that was also a precursor to a shift in action choice. The finding in the present study that activity in this part of anterior cingulate cortex was modulated by choice suggests that this area is not merely involved in detecting a change in reward value, but that it is particularly related to signaling a shift in response strategy after a change in contingencies. This interpretation is compatible with the proposal by Shima and Tanji (1998) that neurons in dorsal anterior cingulate cortex are involved in the voluntary control of reward-based movements. This result could also be compatible with observed anterior cingulate involvement in the generation and control of autonomic arousal states, particularly during volitional task engagement (Critchley et al., 2001a,b,c). Within this conceptual model, cingulate-driven autonomic responses may prospectively facilitate behavioral response switching.

Elliott et al. (2000b), on the basis of a review of neuroimaging findings, proposed a refinement of the response selection/inhibition hypothesis, in which medial OFC is suggested to be involved in the monitoring of reward values and lateral OFC is suggested to be involved in the inhibition or suppression of previously rewarded responses. In a previous neuroimaging study of a reversal learning paradigm, O'Doherty et al. (2001) found differential responses in OFC to abstract reward and punishment (play money), such that medial OFC was more activated after rewarding feedback, and lateral OFC was more activated after punishment. These findings were interpreted as indicating that medial and lateral OFC are differentially involved in representing abstract rewards and punishments, respectively. We note that in this previous study, signals related to response selection and valence of outcome were confounded. Other studies have also found medial-lateral dissociations for rewarding versus punishing stimuli (Small et al., 2001; Gottfried et al., 2002; O'Doherty et al., 2003). Nevertheless, in the present study, we did not observe a clear dissociation between medial and lateral OFC. Consistent with previous findings, we did obtain activity in medial orbital/medial PFC related to rewarding outcomes. However, contrary to previous results, parts of left lateral OFC also showed increased activity to reward, indicating that lateral OFC can, under some circumstances, be activated by rewarding outcomes (Elliott et al., 2003). This cautions against a simple interpretation of medial and lateral OFC functional dissociation in terms of valence or even response selection. One caveat in relation to the current study is that there was also an anticipatory component on each trial, because there was a 1 sec interval after stimulus selection before outcome presentation. It should be noted that in the study by Elliott et al. (2003), in which lateral orbital activity was also observed to reward, the authors used a block design that did not control for expectation-related effects. This raises the possibility that in our study and that of Elliott et al. (2003), responses in lateral OFC to reward may relate to an anticipatory component.

In addition to outcome valence-related activity in PFC, significant effects were also found in amygdala and ventral striatum to rewarding versus punishing feedback. Interestingly, these results emerged in a conjunction across tasks, although significant effects were present in amygdala in the imperative task alone. The fact that these areas did not emerge in the difference between choice and imperative tasks suggests that these areas are not modulated in the same manner as ventral PFC by the degree to which feedback is required for behavioral choice. Significant effects of reward in amygdala and nucleus accumbens have also been reported in previous studies (Knutson et al., 2001; Elliott et al., 2003). In the case of nucleus accumbens, activity may be related to reward prediction rather than being related to feedback itself (Knutson et al., 2001; Pagnoni et al., 2002). Amygdala responses have also been found in previous studies to be related to anticipation of reward (Knutson et al., 2001; O'Doherty et al., 2002).

To conclude, our findings suggest a heterogeneous response profile in human orbital medial and lateral prefrontal cortices during performance of an affective learning choice task. Some regions represent valence irrespective of behavioral choice, other regions are sensitive to response maintenance, and other regions are involved in detecting a change in contingencies. In future neuropsychological investigations, it will be of interest to determine whether discrete lesions in subregions of ventral PFC produce distinct behavioral deficits in reversal learning along the lines described here, by testing for differences in the effects of lesions of anterior insula-caudolateral OFC and ventromedial PFC. The results of the present study are compatible with the hypotheses that orbital and adjacent cortices are involved in representing rewarding and punishing feedback, as well as being critically involved in decision making and behavioral choice.

Footnotes

R.J.D. is supported by a Wellcome Trust Programme grant. H.C. is supported by a Wellcome Clinician Scientist Fellowship award.

Correspondence should be addressed to Dr. John O'Doherty, Wellcome Department of Imaging Neuroscience, 12 Queen Square, London WC1N 3BG, UK. E-mail: j.odoherty@fil.ion.ucl.ac.uk.

Copyright © 2003 Society for Neuroscience 0270-6474/03/237931-09$15.00/0

References

  1. Bechara A, Damasio H, Damasio AR (2000) Emotion, decision making and the orbitofrontal cortex. Cereb Cortex 10: 295-307.
  2. Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P (2001) Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron 30: 619-639.
  3. Bush G, Vogt BA, Holmes J, Dale AM, Greve D, Jenike MA, Rosen BR (2002) Dorsal anterior cingulate cortex: a role in reward-based decision making. Proc Natl Acad Sci USA 99: 523-528.
  4. Butter CM (1969) Perseveration in extinction and in discrimination reversal tasks following selective prefrontal ablations in Macaca mulatta. Physiol Behav 4: 163-171.
  5. Cools R, Clark L, Owen AM, Robbins TW (2002) Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci 22: 4563-4567.
  6. Critchley HD, Rolls ET (1996) Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J Neurophysiol 75: 1673-1686.
  7. Critchley HD, Mathias CJ, Dolan RJ (2001a) Neuroanatomical basis for first- and second-order representations of bodily states. Nat Neurosci 4: 207-212.
  8. Critchley HD, Mathias CJ, Dolan RJ (2001b) Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 29: 537-545.
  9. Critchley HD, Melmed RN, Featherstone E, Mathias CJ, Dolan RJ (2001c) Brain activity during biofeedback relaxation: a functional neuroimaging investigation. Brain 124: 1003-1012.
  10. Deichmann R, Gottfried JA, Hutton C, Turner R (2003) Optimized EPI for fMRI studies of the orbitofrontal cortex. NeuroImage 19: 430-441.
  11. Dias R, Robbins TW, Roberts AC (1996) Dissociation in prefrontal cortex of affective and attentional shifts. Nature 380: 69-72.
  12. Elliott R, Friston KJ, Dolan RJ (2000a) Dissociable neural responses in human reward systems. J Neurosci 20: 6159-6165.
  13. Elliott R, Dolan RJ, Frith CD (2000b) Dissociable functions in the medial and lateral orbitofrontal cortex: evidence from human neuroimaging studies. Cereb Cortex 10: 308-317.
  14. Elliott R, Newman JL, Longe OA, Deakin JF (2003) Differential response patterns in the striatum and orbitofrontal cortex to financial reward in humans: a parametric functional magnetic resonance imaging study. J Neurosci 23: 303-307.
  15. Friston KJ, Ashburner J, Poline JB, Frith CD, Heather JD, Frackowiak RS (1995) Spatial registration and normalisation of images. Hum Brain Mapp 2: 165-189.
  16. Gottfried JA, Deichmann R, Winston JS, Dolan RJ (2002) Functional heterogeneity in human olfactory cortex: an event-related functional magnetic resonance imaging study. J Neurosci 22: 10819-10828.
  17. Iversen SD, Mishkin M (1970) Perseverative interference in monkeys following selective lesions of the inferior prefrontal convexity. Exp Brain Res 11: 376-386.
  18. Knutson B, Fong GW, Adams CM, Varner JL, Hommer D (2001) Dissociation of reward anticipation and outcome with event-related fMRI. NeuroReport 12: 3683-3687.
  19. Krawczyk DC (2002) Contributions of the prefrontal cortex to the neural basis of human decision making. Neurosci Biobehav Rev 26: 631-664.
  20. O'Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C (2001) Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosci 4: 95-102.
  21. O'Doherty J, Deichmann R, Critchley HD, Dolan RJ (2002) Neural responses during anticipation of a primary taste reward. Neuron 33: 815-826.
  22. O'Doherty J, Winston J, Critchley H, Perrett D, Burt DM, Dolan RJ (2003) Beauty in a smile: the role of medial orbitofrontal cortex in facial attractiveness. Neuropsychologia 41: 147-155.
  23. Pagnoni G, Zink CF, Montague PR, Berns GS (2002) Activity in human ventral striatum locked to errors of reward prediction. Nat Neurosci 5: 97-98.
  24. Roberts AC, Wallis JD (2000) Inhibitory control and affective processing in the prefrontal cortex: Neuropsychological studies in the common marmoset. Cereb Cortex 10: 252-262.
  25. Rolls ET (2000) The orbitofrontal cortex and reward. Cereb Cortex 10: 284-294.
  26. Rolls ET, Hornak J, Wade D, McGrath J (1994) Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. J Neurol Neurosurg Psychiatry 57: 1518-1524.
  27. Shima K, Tanji J (1998) Role for cingulate motor area cells in voluntary movement selection based on reward. Science 282: 1335-1338.
  28. Small DM, Zald DH, Jones-Gotman M, Zatorre RJ, Pardo JV, Frey S, Petrides M (1999) Human cortical gustatory areas: a review of functional neuroimaging data. NeuroReport 10: 7-14.
  29. Small DM, Zatorre RJ, Dagher A, Evans AC, Jones-Gotman M (2001) Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain 124: 1720-1733.
  30. Thorpe SJ, Rolls ET, Maddison S (1983) Neuronal activity in the orbitofrontal cortex of the behaving monkey. Exp Brain Res 49: 93-115.
  31. Zald DH, Pardo JV (1997) Emotion, olfaction, and the human amygdala: amygdala activation during aversive olfactory stimulation. Proc Natl Acad Sci USA 94: 4119-4124.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
