Abstract
A dominant focus in studies of learning and decision-making is the neural coding of scalar reward value. This emphasis ignores the fact that choices are strongly shaped by a rich representation of potential rewards. Here, using fMRI adaptation, we demonstrate that responses in the human orbitofrontal cortex (OFC) encode a representation of the specific type of food reward predicted by a visual cue. By controlling for value across rewards and by linking each reward with two distinct stimuli, we could test for representations of reward–identity that were independent of associative information. Our results show reward–identity representations in a medial-caudal region of OFC, independent of the associated predictive stimulus. This contrasts with a more rostro-lateral OFC region encoding reward–identity representations tied to the predictive stimulus. This demonstration of adaptation in OFC to reward-specific representations opens an avenue for investigation of more complex decision mechanisms that are not immediately accessible in standard analyses, which focus on correlates of average activity.
Introduction
During learning and decision making, humans and other animals make use of a rich representation of the reward environment. When different stimuli predict different types of reward, learning on each stimulus is enhanced (the differential outcome effect) (Jones and White, 1994; Savage, 2001; Noonan et al., 2011). Furthermore, when choosing between stimuli that predict different reward types, outcome-specific devaluations (e.g., illness paired with a particular food) are immediately accounted for (Balleine and Dickinson, 1998; Ostlund and Balleine, 2007; Rudebeck and Murray, 2011). Indeed, the defining feature of goal-based control is the explicit representation and evaluation of potential outcomes for different choices (Valentin et al., 2007; Padoa-Schioppa, 2011; McDannald et al., 2012).
One brain structure implicated in this capacity for goal-based control is the orbitofrontal cortex (OFC) (Burke et al., 2008). In macaques, single-unit activity in the OFC contains a rich representation of reward outcomes, including their visual appearance, taste, smell, and texture (Rolls and Baylis, 1994; Rolls et al., 1999). During decision-making, spatially overlapping but discrete populations of cells in the OFC respond to stimuli that predict different types of juice reward, independent of quantity (Padoa-Schioppa and Assad, 2008). These findings suggest that the OFC explicitly encodes the identity of rewards.
However, it remains ambiguous whether these reward–identity representations are independent from information associated with reward because the identity of a reward can often be predicted by multiple different stimuli, which bear no physical resemblance to the reward itself. It is plausible that our ability to represent, for example, coffee as a single reward type, but additionally distinguish between the labels of two brands of coffee, requires two distinct types of neural code: (1) the coding of reward–identity regardless of predictive stimuli (“coffee”); (2) the coding of stimulus–reward associations, where different representations code for each distinct stimulus that predicts reward (“label-reward,” or more specifically “label-coffee” association).
Given the increasing evidence for anatomical and functional dissociations within OFC, we hypothesized that distinct subregions of OFC may encode these two distinct types of reward code. Whereas more medial regions receive input from visceral and gustatory regions and play a role in outcome valuation and choice (Plassmann et al., 2007; Noonan et al., 2010; Hunt et al., 2012), more lateral regions receive highly processed visual input from anterior temporal cortex and appear important for updating associations between reward outcomes and predictive stimuli (Carmichael and Price, 1996; Walton et al., 2010; Noonan et al., 2011). We predicted that more medial-caudal regions may encode reward–identity representations that are invariant to predictive stimuli, whereas more rostro-lateral regions may contain reward representations paired to specific stimuli.
We therefore designed a novel fMRI paradigm that allowed us to identify encoding of food reward–identity in human subjects and compare this with encoding of stimulus–reward associations. Because responses to different rewards are spatially overlapping at the resolution of MRI, they cannot be dissociated using standard imaging paradigms. However, repetition suppression is a phenomenon in which sequential presentation of information sharing a common feature leads to attenuation in the response of neural populations sensitive to that feature. Thus, using repetition suppression, we could test for encoding of food-specific reward–identity and stimulus–reward associations while controlling for subjective value across available food rewards. By choosing trials in which either the reward alone or both the stimulus and reward were repeated, we could identify brain regions encoding reward–identity representations and contrast them to those encoding conjoined stimulus–reward associations.
Materials and Methods
Participants
A total of 21 healthy volunteers participated in the fMRI experiment. Two volunteers were excluded from the experiment because of (1) excessive head movement (>6 mm in any one dimension for one of the three sessions) and (2) failing to stay awake during the scan. The remaining 19 participants (mean age 24.8 ± 1.0 years, 13 females) were included in the analyses. All participants gave informed written consent, and the study was approved by the local research ethics committee. Participants were asked to refrain from eating 1 h before the start of the experiment. They were paid £15 for their time; in addition, they were given 50 pence for every correct response to questions during the scan (48 questions in total).
Behavioral testing
Choice of food rewards with equal subjective value.
Participants were first asked to rate six different food items, on a scale of 1 to 10, according to their subjective desirability (strawberry, tangerine segment, polo, crisp, brazil nut, chocolate; Fig. 1A). The experimenter chose two food types, A and B, to use in the remainder of the experiment under the constraint that they had been given high and similar ratings. To further minimize value-related variance, which might have reduced the sensitivity of the experiment, we used an indifference test procedure to adjust the quantity of each food item and ensure that, for each participant, all stimuli paired with food predicted rewards of equivalent subjective value. We asked participants to make a series of binary choices between A and B, where the quantity of A and B was independently varied between one and six portions on each trial. Choices were fitted to a sigmoid function, and the ratio of quantities at which participants chose the two food types equally frequently, and therefore showed equal preference for A and B, was determined (Padoa-Schioppa and Assad, 2008). Whole-unit quantities of foods A and B were chosen such that the indifference ratio was best approximated using whole numbers between 1 and 6. These “indifferent quantities” were then used for the remainder of the experiment. Importantly, in addition to minimizing value effects in the elicited rewards described here, we also eliminated any effects of value in the design of the fMRI contrasts (see below). Thus, the adaptation procedure was not biased by value-related correlations.
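The final step of this procedure, choosing whole-unit quantities that best approximate a fitted indifference ratio, can be illustrated with a minimal sketch. It is not the authors' code: the function name `closest_whole_quantities`, the definition of the ratio as quantity of B over quantity of A, and the tie-breaking rule (fewest total portions) are assumptions made for illustration only.

```python
from itertools import product


def closest_whole_quantities(indifference_ratio, max_portions=6):
    """Return (q_a, q_b), whole numbers in 1..max_portions, whose ratio
    q_b / q_a is closest to the fitted indifference ratio.

    Ties are broken in favour of the smallest total number of portions
    (an assumption of this sketch, not stated in the text).
    """
    candidates = product(range(1, max_portions + 1), repeat=2)
    return min(
        candidates,
        key=lambda q: (abs(q[1] / q[0] - indifference_ratio), q[0] + q[1]),
    )


# Hypothetical example: a fitted ratio of 2.4 portions of B per portion of A.
q_a, q_b = closest_whole_quantities(2.4)
```

With only 36 candidate pairs, an exhaustive search is sufficient and keeps the rounding rule explicit.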
To establish whether participants changed their valuation of A and B during the course of the experiment, the indifference test was repeated after the scanning session. This second indifference test was used to recalculate the value of B relative to A, and this value was compared with that obtained from the first indifference test.
Learning of stimulus–item associations.
Participants were familiarized with four different items: the two food types and their respective quantities, and two neutral objects (a 3 × 3 × 3 cm cardboard box and a dark red marble with radius 1.5 cm). Each item was assigned two abstract yellow shapes (Fig. 1B), and the eight pairings were shown passively to participants over a total of 16 trials.
After the passive presentation, participants were then actively trained on the stimulus–item pairings using a reaction time task. On each trial, one of the eight abstract yellow shapes was shown for 400 ms before all four possible items were presented across the screen. The position of the items was randomized across trials, and each position mapped onto one of four buttons on the keyboard. Participants were instructed to use their right hand to press the button associated with the correct item as quickly and accurately as possible. Feedback for their choice was given after every trial. At the end of each block (64 trials), subjects were informed about their average reaction time and accuracy.
Participants were required to do the stimulus–item learning task for at least four blocks. If their average accuracy across the entire fourth block of 64 trials was >90% (i.e., up to six mistakes), the training was terminated. Otherwise, participants were required to continue with the task until they reached the 90% criterion. In addition, before commencing with the scan, the experimenter asked the participant to confirm that upon seeing a yellow shape the representation of the corresponding food or neutral item could be elicited automatically. To ensure subjects could elicit a vivid, multisensory representation of each of the items, they were asked to hold each of the food and neutral items in turn and familiarize themselves with their texture and weight. For the food items, participants were additionally required to taste one portion of each food type. Furthermore, for each item, participants were familiarized with four adjectives describing outcome attributes (e.g., “sweet” or “red”) that would later be used to refer to the item during the scan.
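The stopping rule for training can be written out as a short sketch. The function name `training_complete` and its arguments are hypothetical; only the four-block minimum and the >90% criterion come from the text.

```python
def training_complete(block_accuracies, min_blocks=4, criterion=0.90):
    """Return True once at least `min_blocks` blocks have been completed
    and accuracy on the most recent block exceeds `criterion`.

    `block_accuracies` holds the proportion correct for each completed
    block of 64 trials, in order.
    """
    return len(block_accuracies) >= min_blocks and block_accuracies[-1] > criterion


# 58 of 64 correct (six mistakes) passes; 57 of 64 (seven mistakes) does not.
passes = training_complete([0.80, 0.85, 0.90, 58 / 64])
fails = training_complete([0.80, 0.85, 0.90, 57 / 64])
```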
Scan procedure
During scanning, visual stimuli were presented via a computer monitor projected onto a screen. On each trial, two abstract yellow shapes were shown consecutively, each for 700 ms with an interstimulus interval of 400 ms, and participants were instructed to vividly represent the food reward or neutral object associated with each of the abstract shapes (Fig. 1C). The intertrial interval was selected from a truncated γ distribution with a shape parameter of 6 and scale parameter of 1000 (mean of 6 s, minimum 3.5 s and maximum 10 s).
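One way to draw intertrial intervals of this kind is rejection sampling from a gamma distribution, sketched below. The text does not state how the truncation was implemented, so the rejection step, the function name `sample_iti_ms`, and the use of milliseconds for the scale parameter are assumptions.

```python
import random


def sample_iti_ms(rng, shape=6.0, scale_ms=1000.0, lo_ms=3500.0, hi_ms=10000.0):
    """Draw one intertrial interval in ms from a gamma(shape, scale)
    distribution truncated to [lo_ms, hi_ms], by rejection sampling
    (the truncation method is an assumption of this sketch).
    """
    while True:
        x = rng.gammavariate(shape, scale_ms)  # alpha = shape, beta = scale
        if lo_ms <= x <= hi_ms:
            return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible
itis = [sample_iti_ms(rng) for _ in range(1000)]
```

Note that the 6 s mean quoted in the text is the mean of the untruncated distribution (shape × scale); truncation shifts the realized mean slightly.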
This resulted in different trial types, which were used to identify regions encoding stimulus-independent reward–identity and conjoined stimulus–reward information. We reasoned that a brain region whose cells encode reward–identity, independent of the predictive stimuli associated with the reward, should show adaptation (reduced signal) whenever two different stimuli predict the same reward, but not when two different stimuli predict the same neutral object. On the other hand, a brain region whose cells predominantly represent the pairing of a reward with its associated stimulus should show adaptation when the same stimulus is presented consecutively if that stimulus predicts a food reward, but not a neutral item. Crucially, within each trial, participants were presented with three possible types of stimulus pairs, which occurred in two different conditions (food or neutral): those where the stimuli and the food or neutral item were the same (same-stimulus-same-item [SSSI]), those where the stimuli were different but the associated items were the same (different-stimulus-same-item [DSSI]), and those that had different stimuli and different associated items (different-stimulus-different-item [DSDI]).
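The three within-trial types can be expressed as a small classification function. This is an illustrative sketch: the representation of each presentation as a `(stimulus, item)` tuple and the labels `"s1"`, `"foodA"`, and so on are hypothetical.

```python
def within_trial_type(first, second):
    """Classify a trial from its two (stimulus, item) presentations."""
    stim1, item1 = first
    stim2, item2 = second
    if stim1 == stim2:
        return "SSSI"  # same stimulus, hence same item
    if item1 == item2:
        return "DSSI"  # different stimuli predicting the same item
    return "DSDI"      # different stimuli, different items


# Hypothetical pairing: shapes s1 and s2 predict food A, s3 predicts food B.
examples = [
    within_trial_type(("s1", "foodA"), ("s1", "foodA")),
    within_trial_type(("s1", "foodA"), ("s2", "foodA")),
    within_trial_type(("s1", "foodA"), ("s3", "foodB")),
]
```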
The possibility of adaptation between trials led to further categorization of trials, determined by the similarity of each trial to that presented previously. If either item elicited on the current trial had been elicited on the previous trial, then the trial was categorized as a same-item trial (SI). If both items elicited on the current trial were different from those on the previous trial (but from the same category, i.e., food or neutral), then the trial was categorized as a different-item trial (DI). If either of the stimulus–item pairings on the current trial had been elicited on the previous trial, then the trial was categorized as a same-stimulus–item trial (SSI). If both the stimulus–item pairings on the current trial were different from those on the previous trial (but from the same category, i.e., food or neutral), then the trial was categorized as a different-stimulus–item trial (DSI).
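The between-trial labels can be sketched in the same way. The item names, the `CATEGORY` mapping, and the choice to return `None` when the previous trial belonged to the other category are assumptions of this sketch; the text only defines the labels for consecutive trials of the same category.

```python
# Hypothetical item names; each trial is a tuple of two (stimulus, item) pairs.
CATEGORY = {"foodA": "food", "foodB": "food", "box": "neutral", "marble": "neutral"}


def between_trial_types(prev_trial, curr_trial):
    """Return (item_label, pairing_label) for the current trial.

    item_label is "SI" or "DI"; pairing_label is "SSI" or "DSI".
    Both are None when no label applies (category change between trials).
    """
    prev_items = {item for _, item in prev_trial}
    curr_items = {item for _, item in curr_trial}
    same_category = (
        {CATEGORY[i] for i in prev_items} == {CATEGORY[i] for i in curr_items}
    )

    item_label = pairing_label = None
    if curr_items & prev_items:
        item_label = "SI"   # an item from the previous trial is repeated
    elif same_category:
        item_label = "DI"   # all items differ, same category

    if set(curr_trial) & set(prev_trial):
        pairing_label = "SSI"  # a stimulus-item pairing is repeated
    elif same_category:
        pairing_label = "DSI"  # all pairings differ, same category
    return item_label, pairing_label
```

Because each item is predicted by two shapes, a trial can be SI without being SSI (the item repeats under a different shape), which is what allows the two between-trial analyses to be separated.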
Participants completed three scan sessions of 144 trials each. For each session, trial types were equally divided between food and neutral items and further divided between the three within-trial types, with the order of presentation randomized. Furthermore, in each scan session, 16 yes/no questions were interspersed at random into the trial sequence and presented immediately after a pair of abstract shapes, before the intertrial interval. These questions were included to verify that participants were eliciting representations of the associated food or neutral items. Each question specified which of the two preceding shapes it was referring to, and concerned properties of the associated item, for example, “First outcome salty?” The adjectives used were chosen to encourage participants to elicit multisensory representations of each item and concerned the appearance and texture of both food and neutral items, and additionally taste and smell for food items. As described above, participants were familiarized with all descriptions used to refer to each item before entering the scanner. There was an equal number of yes and no questions, and these were each divided equally between the four food and neutral items and the first and second stimulus. At the end of each session, participants were told how many questions they had answered correctly and how much money they had earned, with each correct question being rewarded with 50 pence.
At the end of each session, participants were reminded of the stimulus–item pairs through completion of 24 trials of the learning task used before scanning. Choices to both the questions and the learning task were indicated with the right hand using an MRI-compatible button box. Reaction times and accuracy were measured.
fMRI data acquisition and preprocessing
T2*-weighted echo-planar images (EPI) with blood oxygen level-dependent (BOLD) contrast were acquired using a 32-channel head coil on a 3 Tesla Trio MRI scanner (Siemens). A special sequence was used to minimize signal drop out in the OFC region (Weiskopf et al., 2006) and included an echo time (TE) of 70 ms, a tilt of 30° relative to the rostro-caudal axis, and a local z-shim with a moment of −0.4 mT/m ms applied to the OFC region. To achieve whole-brain coverage, we used 43 transverse slices of 2 mm thickness, with an interslice gap of 1 mm, and in-plane resolution of 3 × 3 mm, and collected slices in an ascending order. This led to a repetition time (TR) of 3.01 s. In each session, 419 volumes were collected (∼20 min), and the first five volumes were discarded to allow for T1 equilibration effects. A single T1-weighted structural image with 1 × 1 × 1 mm voxel resolution was acquired and coregistered with the EPI images to permit anatomical localization. A fieldmap with dual echo-time images (TE1 = 10 ms, TE2 = 14.76 ms, whole brain coverage, voxel size 3 × 3 × 3 mm) was obtained for each subject to allow for corrections in geometric distortions induced in the EPIs at high field strength.
Physiological measures were collected during the EPI acquisition. The cardiac pulse was recorded using an MRI-compatible pulse oximeter (Model 8600 F0, Nonin Medical), and thoracic movement was monitored using a custom-made pneumatic belt positioned around the abdomen. The pneumatic pressure changes were converted into an analog voltage using a pressure transducer (Honeywell International) before digitization, as reported in Hutton et al. (2011).
Preprocessing and statistical analyses were performed using SPM8 (Wellcome Trust Centre for Neuroimaging, London, UK; www.fil.ion.ucl.ac.uk/spm). Image preprocessing consisted of correction for signal bias, realignment of images to the first volume, distortion correction using fieldmaps, normalization to a standard EPI template, and smoothing using an 8 mm full-width at half maximum Gaussian kernel. Movement parameters were inspected visually to check for head movement, and one subject with excessive head movement was removed from the analysis (displacement > 6 mm in any one direction).
Data analysis
Images were analyzed in an event-related manner using a general linear model (GLM) involving 12 explanatory variables (EVs). Six EVs corresponded to the three different within-trial conditions (SSSI, DSSI, and DSDI) for both food and neutral trials. A further four EVs corresponded to the two item-specific between-trial conditions (SI and DI) for both food and neutral trials. Finally, the last two EVs described the time of question presentation and response. The duration of events in all EVs for both the within- and between-trial analyses was kept constant for any given participant and was dependent upon the trial duration and the participant's mean reaction time measured from the learning task performed inside the scanner (700 ms + 400 ms + 700 ms + mean reaction time). For the within-trial analysis, the event onset was set to the presentation of the first stimulus of the trial and for the between-trial analysis, the event onset was set to the first stimulus of the second trial.
An additional 23 nuisance regressors were included in the GLM because the anatomical location of the OFC makes the BOLD signal in this region particularly sensitive to both subject motion and physiological noise. First, to account for motion-related artifacts that had not been eliminated in rigid-body motion correction, the six motion regressors obtained during realignment were included. Second, to remove variance accounted for by cardiac and respiratory responses, a physiological noise model was constructed using an in-house developed Matlab toolbox (Hutton et al., 2011). Models for cardiac and respiratory phase and their aliased harmonics were based on RETROICOR (Glover et al., 2000). The model for changes in respiratory volume was based on Birn et al. (2006). To model fluctuations arising from the cardiac phase, it was necessary to choose a reference slice in each volume (Hutton et al., 2011). Because the EPI sequence was tilted by 30°, slice 7 was used as the reference slice given its proximity to the OFC. This resulted in 17 physiological regressors in total: 10 for cardiac phase, six for respiratory phase, and one for respiratory volume. The GLM thus included a total of 35 EVs for each session, and each session was modeled separately within a single GLM.
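The regressor counts above are consistent with a RETROICOR-style Fourier expansion of the cardiac and respiratory phase. The sketch below assumes sine and cosine terms at five cardiac and three respiratory harmonics; these orders are inferred from the counts of 10 and six, not stated in the text, and the function `phase_regressors` is hypothetical rather than part of the toolbox used by the authors.

```python
import math


def phase_regressors(phases, order):
    """Expand a phase time series (radians, one value per volume) into
    sine and cosine terms for harmonics 1..order.

    Returns one row per volume with 2 * order columns.
    """
    return [
        [f(k * phase) for k in range(1, order + 1) for f in (math.sin, math.cos)]
        for phase in phases
    ]


N_TASK = 12    # task EVs described in the text
N_MOTION = 6   # realignment parameters

# Two made-up phase samples, just to show the shape of the output.
cardiac = phase_regressors([0.0, math.pi / 2], order=5)      # assumed order
respiratory = phase_regressors([0.0, math.pi / 2], order=3)  # assumed order

n_physio = len(cardiac[0]) + len(respiratory[0]) + 1  # +1 respiratory volume
n_total = N_TASK + N_MOTION + n_physio
```

Under these assumptions the tally reproduces the 17 physiological regressors and the 35 EVs per session reported above.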
The primary aim of our analysis was to identify brain areas representing information about reward–identity. The GLM allowed us to examine effects of both within-trial as well as between-trial adaptation to reward information. To identify areas that represented the identity of food (our example primary reward), but not neutral objects, we tested for within-trial adaptation effects using the contrast [(DSDIf − DSSIf) − (DSDIn − DSSIn)], and for between-trial adaptation effects using [(DIf − SIf) − (DIn − SIn)]. To obtain group statistics, the resulting contrast images of all participants were entered into a second level random-effects analysis using a one-sample t test across participants. In addition, before concluding that brain regions identified using these contrasts were encoding representations of reward–identity, we performed post hoc tests to ensure that the effect was also significant within the food condition alone.
A second GLM was used to test for between-trial adaptation to stimulus–reward pairings. Four subjects had to be excluded from this analysis because they did not have multiple examples of repeated stimuli in every block. This GLM was identical to the first, except the between-trial conditions SI and DI were replaced by SSI and DSI, whereas the corresponding within-trial comparison was performed using the first GLM.
This second analysis was used to test for adaptation to stimulus–reward pairings rather than stimulus-independent reward representations, and contrasts for both within- and between-trial stimulus–reward pairings were defined in a similar way to those contrasts used to measure reward–identity. To control for adaptation to visual stimulus features, we contrasted all adaptation effects to food-specific stimulus–reward information with the equivalent neutral condition ([((DSDIf + DSSIf) − 2SSSIf) − ((DSDIn + DSSIn) − 2SSSIn)] and [(DSIf − SSIf) − (DSIn − SSIn)]).
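The four interaction contrasts described in this section can be written as weights over the condition EVs, as in the sketch below. The dictionary layout and the helper `apply_contrast` are illustrative assumptions; in practice the weights would be entered as contrast vectors in the analysis software.

```python
# Contrast weights keyed by condition EV (f = food, n = neutral).
IDENTITY_WITHIN = {"DSDIf": 1, "DSSIf": -1, "DSDIn": -1, "DSSIn": 1}
IDENTITY_BETWEEN = {"DIf": 1, "SIf": -1, "DIn": -1, "SIn": 1}
PAIRING_WITHIN = {
    "DSDIf": 1, "DSSIf": 1, "SSSIf": -2,
    "DSDIn": -1, "DSSIn": -1, "SSSIn": 2,
}
PAIRING_BETWEEN = {"DSIf": 1, "SSIf": -1, "DSIn": -1, "SSIn": 1}


def apply_contrast(weights, betas):
    """Weighted sum of parameter estimates for one contrast."""
    return sum(w * betas[name] for name, w in weights.items())


# Made-up parameter estimates: adaptation for food (5 vs 2), none for neutral.
betas = {"DSDIf": 5, "DSSIf": 2, "DSDIn": 3, "DSSIn": 3}
effect = apply_contrast(IDENTITY_WITHIN, betas)  # (5 - 2) - (3 - 3)
```

Each set of weights sums to zero, so a response common to all conditions does not contribute to the contrast.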
It should be noted that, in addition to adjusting the quantity of food items A and B such that value differences were minimized, the formulation of all the above contrasts further ensured that value effects were eliminated from the results. Because food A and food B appear equally frequently in both elements of the subtraction, any residual value differences were automatically controlled: [VAL(A + B) + VAL(B + A)] − [VAL(A + A) + VAL(B + B)] = 0.
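The cancellation can be checked numerically. The sketch below assumes that the value of a trial is the sum of the values of its two elicited items, which is one reading of the VAL(·) notation above; the function names and the example values are hypothetical.

```python
def pair_value(first_item, second_item, value):
    """Summed value of the two items elicited on one trial
    (additivity is an assumption of this sketch)."""
    return value[first_item] + value[second_item]


def residual_value_effect(value):
    """Value difference between different-item and same-item food trials
    when foods A and B appear equally often in both."""
    different = pair_value("A", "B", value) + pair_value("B", "A", value)
    same = pair_value("A", "A", value) + pair_value("B", "B", value)
    return different - same


# Even with clearly unequal subjective values, the difference is zero.
residual = residual_value_effect({"A": 3, "B": 7})
```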
Because we had two independent measures of adaptation (within- and between-trial), we were, in each case, able to use a contrast defined by one measure to test the adaptation effect in the other measure. This ensures that all reported results are replicated by two independent measures, and obviates questions of multiple comparisons by performing tests in regions of interest (ROIs) defined from contrasts approximately orthogonal to the contrast of interest. For example, ROIs for the within-trial contrasts were obtained from the equivalent between-trial contrast, and vice versa, thresholded at either p < 0.05 or p < 0.01 (uncorrected). Indeed, the two contrasts exhibited a small negative correlation (reward–identity adaptation: ρ = −0.18; stimulus–reward adaptation: ρ = −0.15), inducing a slight bias in the null distribution against replicating the adaptation across conditions. Despite this, we were able to detect significant effects.
For statistical comparisons, we extracted a participant's average parameter estimate for a given contrast or EV of interest from all voxels in the corresponding orthogonal ROI. The obtained parameter estimates were then subjected to two-tailed paired t tests and repeated-measures ANOVAs as reported in the Results.
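A paired t test on ROI-averaged parameter estimates reduces to a one-sample t test on the per-participant differences, as in the minimal sketch below. The function name and the example numbers are made up; p values would additionally require the t distribution, which is omitted here.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return (t, degrees of freedom) for a paired t test.

    Each list holds one ROI-averaged parameter estimate per participant,
    in the same participant order.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Made-up estimates for four participants in two conditions.
t_value, dof = paired_t([3, 4, 5, 6], [1, 1, 3, 3])
```

With 19 participants this gives 18 degrees of freedom, matching the t(18) statistics reported in the Results.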
Bar plots in Figures 2 and 3 show parameter estimates extracted from the orthogonal ROIs for each of the six within-trial and four between-trial conditions. The plots depicting the time course of the BOLD signal for different trial types were obtained by extracting the BOLD time series from the preprocessed data of each participant using the corresponding orthogonal group ROI (as described in Behrens et al., 2007). In brief, the obtained signal was resampled with a resolution of 300 ms, divided into trials, separately averaged for each of the food trial types, and regressed against the same EVs as those included in the SPM GLM. The plots show the normalized averaged BOLD signal across each condition after having regressed out and subtracted the variance explained by all other EVs. In both Figure 2D and Figure 3D, two adjacent axial slices (z = −19 and z = −20; z = −9 and z = −10, respectively) were merged onto one slice for better visualization of the bilateral OFC regions.
Results
Behavioral results
Before entering the scanner, participants were familiarized with four different items: two food types (A and B), and two nonrewarding neutral objects. To control for value-related variance between A and B, these two food types were chosen from six different options. Participants were first asked to rate the six different food types out of 10, and the experimenter chose two food types that were given similar and high ratings. The average rating given to the chosen food types was 6.52, and the average absolute difference between the ratings of the two chosen foods was 1.42. We then used an indifference test to establish each participant's subjective value for A and B and adjusted the quantity of B relative to A such that participants showed equal preference between the two foods. The indifference quantities determined, rounded to whole units, were then used for the remainder of the experiment. After completion of the scan, the indifference test was repeated to index possible change in preference. Recalculating the quantity of B given both the original quantity of A and the new indifference ratio, the mean absolute change in quantity of B relative to A was found to be 0.684 (range, 3; median, 0). Thus, in the majority of participants, individual preferences remained stable across the duration of the experiment and value differences between the two food types were successfully controlled.
Before entering the scanner, participants also performed a reaction time task to learn the association between each of the food and neutral items and two different abstract shapes to which they were paired (Fig. 1B). On average, participants needed to complete 269 ± 8 trials of the reaction time task to approach performance with a 600 ms reaction time and 100% accuracy (when performance was averaged across an entire block of 64 trials). The average reaction time on the last training block was 588 ± 19 ms with a mean accuracy of 96.7 ± 1.3%. Note that the task required participants not only to determine the item associated with the presented stimulus, but also to map their response onto one of four different buttons according to the position of the correct outcome on the screen while maintaining a suitable balance between speed and accuracy.
During scanning, two abstract yellow shapes were shown consecutively on each trial, and participants were instructed to elicit a vivid mental representation of the corresponding item associated with each abstract shape (Fig. 1C). In each block of scanning, 16 yes/no questions concerning the attributes of the elicited items were presented to probe whether appropriate item representations were successfully elicited. Participants' mean response accuracy was 90 ± 2% (number of correct responses: minimum, 33; mean, 43 of a total of 48).
Imaging results
Reward–identity encoding
The main aim of the study was to isolate representations of reward–identity in the OFC. Our design allowed us to examine effects of both within-trial and between-trial adaptation to reward–identity. Crucially, this allowed us to provide an independent replication of our results within the same experiment.
First, we used the contrasts of interest for within- and between-trial reward–identity adaptation to identify regions encoding food-specific reward–identity representations. Namely, we contrasted representation of the same food item in response to two different predictive stimuli with representations of two different food items ([DSDIf − DSSIf] and [DIf − SIf] for within- and between-trial analyses, respectively). To demonstrate that the adaptation effect was specific to rewarding items, we further subtracted the equivalent contrast for neutral items ([(DSDIf − DSSIf) − (DSDIn − DSSIn)] and [(DIf − SIf) − (DIn − SIn)] for within- and between-trial analyses, respectively). We verified post hoc the specificity of the resulting effect by ensuring the effect was also significant in the food condition alone.
In both contrasts ([(DSDIf − DSSIf) − (DSDIn − DSSIn)] and [(DIf − SIf) − (DIn − SIn)]), bilateral activity was revealed in a caudal region of OFC. Each contrast was then used, independent of the other, to define an ROI in the caudal OFC, resulting in two “orthogonal” ROIs (defined at p < 0.05, uncorrected, see Materials and Methods for actual correlations). Figure 2A and Figure 2D, respectively, illustrate the within- and between-trial contrasts for reward–identity adaptation (both p < 0.05, uncorrected) and show the location of the ROI in caudal OFC extracted from each contrast. The two orthogonal ROIs provided a means to perform an unbiased corrected test for within-trial reward–identity adaptation using the between-trial ROI, and vice versa. For example, for the within-trial contrast used to identify representations of reward–identity ([(DSDIf − DSSIf) − (DSDIn − DSSIn)]), the ROI defined from the equivalent between-trial contrast ([(DIf − SIf) − (DIn − SIn)]), was used to test for statistical significance of the adaptation effect.
For all participants, average parameter estimates for each of the six within-trial and four between-trial conditions were extracted from all voxels contained in the respective orthogonal ROI. This revealed bilateral adaptation specific to food identity in caudal OFC both within and between trials. When the same food item was elicited consecutively, either in the same or next trial (DSSIf and SIf), the BOLD response in caudal OFC was attenuated compared with when two different food items were elicited (within: [(DSDIf − DSSIf) − (DSDIn − DSSIn)], t(18) = 3.65, p = 0.002, peak coordinate (−18, 5, −23); between: [(DIf − SIf) − (DIn − SIn)], t(18) = 2.62, p = 0.017, peak coordinate (−15, 8, −23); Figure 2B,E). This effect was also evident in the peristimulus BOLD time courses extracted from the preprocessed data of each participant in the respective orthogonal ROIs (Fig. 2C,F). To verify the specificity of this effect to representations of rewarding food items, we confirmed that the difference between the extracted parameter estimates for the food conditions was also significant for both within- and between-trial comparisons (within: [DSDIf − DSSIf], t(18) = 2.60, p = 0.018, peak coordinate (−18, 8, −23); between: [DIf − SIf], t(18) = 2.32, p = 0.032, peak coordinate (−12, 5, −23); Figure 2B,E). Furthermore, the extracted parameter estimates for all neutral conditions were found not to differ significantly from each other (within: DSSIn vs DSDIn: t(18) = 1.24, p = 0.227; between: SIn vs DIn: t(18) = 0.52, p = 0.609).
No other brain regions identified in the overlap between the within- and between-trial contrasts of interest showed a bilateral response pattern consistent with reward–identity encoding. Those regions that showed a unilateral response pattern consistent with reward–identity encoding are listed in Table 1. Regions were selected from the within-trial contrast masked by the between-trial contrast, and the between-trial contrast masked by the within-trial contrast (both p < 0.05, uncorrected). No bilateral brain regions survived correction using a whole-brain corrected approach in either the within- or between-trial adaptation contrasts for reward–identity.
Table 1.
Within or between | Location | Coordinate x | Coordinate y | Coordinate z |
---|---|---|---|---|
Within | Right caudal OFC | 15 | 8 | −20 |
Within | Left caudal OFC | −18 | 2 | −23 |
Within | Left anterior cingulate | −4 | 24 | 34 |
Between | Right caudal OFC | 9 | 8 | −17 |
Between | Left caudal OFC | −15 | 8 | −23 |
Between | Right prefrontal cortex | 16 | 38 | 40 |
Between | Left anterior cingulate | −6 | 24 | 34 |
Between | Right hippocampus | 30 | −22 | −16 |
Between | Left ventral medial prefrontal cortex | −10 | 42 | −2 |
Between | Right fusiform gyrus | 42 | −30 | −26 |
Between | Left posterior insula | −44 | −6 | −4 |
“Within” and “between” refer to the corresponding within- and between-trial adaptation contrasts to reward–identity described in the main text. The only bilateral region showing a response pattern consistent with reward–identity encoding was the medial-caudal OFC region shown in Figure 2.
Stimulus–reward mappings
Next, we investigated whether OFC also represents conjoined stimulus–reward information. We compared trials with repeated representation of the same food item in response to the same predictive stimulus, with trials in which the representation of the two food items was predicted by two different stimuli. To further control for adaptation to visual stimulus features, we contrasted the food-specific conditions with the equivalent neutral conditions to give the following contrasts: [((DSDIf + DSSIf) − (2 × SSSIf)) − ((DSDIn + DSSIn) − (2 × SSSIn))] and [(DSIf − SSIf) − (DSIn − SSIn)] for within- and between-trial analyses, respectively. We used each of the contrasts to construct an independent ROI for within-trial and between-trial adaptation (defined at p < 0.01, uncorrected). This resulted in two orthogonal bilateral ROIs located in the lateral OFC (lOFC). Figure 3A,D illustrate the within- and between-trial contrasts for stimulus–reward adaptation (both p < 0.05, uncorrected for visualization) and show the location of the ROIs in lOFC extracted from each contrast.
The parameter estimates extracted from the respective orthogonal ROIs revealed adaptation to stimulus–reward identity information. For both within- and between-trial conditions, there was relative suppression in response to trials where the same stimulus–reward association was elicited consecutively, compared with trials in which either the stimulus or both stimulus and reward were not repeated (within: [((DSDIf + DSSIf) − (2 × SSSIf)) − ((DSDIn + DSSIn) − (2 × SSSIn))], t(18) = 3.54, p = 0.002, peak coordinate (27, 38, −11); between: [(DSIf − SSIf) − (DSIn − SSIn)], t(14) = 3.46, p = 0.004, peak coordinate (24, 38, −11); Figure 3B,E). The peristimulus BOLD time courses extracted from the appropriate independent ROIs illustrate this effect (Fig. 3C,F). Notably, this effect was also significant when considering only those trials with rewarding food items (within: [(DSDIf + DSSIf) − (2 × SSSIf)], t(18) = 2.50, p = 0.022, peak coordinate (24, 38, −11); between: [DSIf − SSIf], t(14) = 2.53, p = 0.024, peak coordinate (27, 35, −11); Figure 3B,E). Similarly, contrasting trials with different predictive stimuli against those with the same predictive stimulus, but restricted to only those conditions with repeated presentations of the same associated reward, also revealed a significant adaptation effect (DSSIf − SSSIf, t(18) = 2.30, p = 0.033; Fig. 3B). Together, these results show robust encoding of stimulus–reward associations in lOFC. The only other bilateral brain region in the overlap between the within- and between-trial interaction contrasts (both p < 0.01) found to have a response pattern appropriate for adaptation to conjoined stimulus–reward information was in the superior frontal gyrus (Table 2). All unilateral brain regions that showed a response pattern consistent with stimulus–reward encoding are listed in Table 2.
No bilateral brain regions survived whole-brain correction in either the within- or between-trial stimulus–reward adaptation contrasts.
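The ROI statistics above are reported as t values with their degrees of freedom, which is the form a one-sample t test on per-participant contrast values would take. The following is a minimal sketch of that calculation, assuming one contrast value per participant. The function name and the example values are hypothetical and are not data from the study.

```python
import math
import statistics


def one_sample_t(values, popmean=0.0):
    """Return (t, df) for a one-sample t test of `values` against `popmean`."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - popmean) / standard_error, n - 1


# Made-up contrast values for five hypothetical participants.
contrast_values = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = one_sample_t(contrast_values)
print(t_stat, df)
```

Because the degrees of freedom equal the number of values minus one, a statistic reported as t(18) would correspond to 19 contrast values and t(14) to 15.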
Table 2.
Within or between | Location | x | y | z |
---|---|---|---|---|
Within | Right lateral OFC | 27 | 38 | −11 |
Within | Left lateral OFC | −21 | 38 | −11 |
Within | Right anterior insula/gustatory taste cortex | 52 | 18 | −2 |
Within | Bilateral superior frontal gyrus (right) | 16 | 12 | 64 |
Within | Bilateral superior frontal gyrus (left) | −18 | 12 | 64 |
Within | Right posterior middle frontal gyrus | 32 | 4 | 60 |
Within | Right anterior middle frontal gyrus | 40 | 44 | 24 |
Within | Left anterior middle frontal gyrus | −36 | 40 | 16 |
Between | Right lateral OFC | 24 | 35 | −11 |
Between | Left lateral OFC | −24 | 35 | −8 |
Between | Right posterior middle frontal gyrus | 32 | 4 | 60 |
“Within” and “between” refer to the corresponding within- and between-trial adaptation contrasts to stimulus–reward encoding described in the main text. The bilateral response pattern in lateral OFC is shown in Figure 3.
The relative suppression to repeated presentation of stimulus–reward information was only observed in response to food items (Fig. 3B). However, in the neutral condition, rather than observing effect sizes that were not significantly different from each other, we found a relative increase in activation of lOFC in response to any pair of stimuli (same or different) that shared the same associated neutral item (Fig. 3B). Therefore, unlike when presented with a stimulus associated with a rewarding item, the lOFC does not show adaptation either to the identity of a neutral item (p = 0.227, see above) or to the associated stimulus predicting the neutral item (within: [DSSIn − SSSIn], t(18) = 0.16, p = 0.876; between: [DSIn − SSIn], t(14) = 0.99, p = 0.338). Consequently, there is no evidence to suggest that a specific representation of the neutral object or stimulus–object association is maintained. We term this pattern, in which repetition suppression occurs for stimuli only if they predict rewards, the encoding of a stimulus–reward association. The absence of repetition suppression for neutral items rules out a possible explanation that relies on pure stimulus coding.
Different time scale of adaptation in orbitofrontal and visual cortices
The lateral and caudal OFC both encoded information at two different time scales, within and between trials. Given the rather long intertrial intervals ranging between 3.5 and 10 s, this suggests that the encoding of stimulus–reward and reward–identity information in OFC is sustained over several seconds. Indeed, the effect sizes of reward-specific within- and between-trial adaptation in each OFC region were all significantly different from 0 (lOFC: t(18) = 2.50, p = 0.022, for ((DSDIf + DSSIf) − (2 × SSSIf)), and t(14) = 2.53, p = 0.024 for (DSIf − SSIf); caudal OFC: t(18) = 2.61, p = 0.018 for (DSDIf − DSSIf), and t(18) = 2.33, p = 0.032 for (DIf − SIf); Figures 3B,E, and 2B,E, respectively). Interestingly, this finding is consistent with reports of single-cell activity in primate and rodent OFC, which demonstrate sustained outcome-specific responses over a timescale of several seconds (Hikosaka and Watanabe, 2000; Schoenbaum et al., 2003), and with the demonstration that rodents with OFC lesions show impairments when maintaining reward expectations over delays of similar periods (Rudebeck et al., 2006).
In contrast, the time scale of adaptation effects reported in visual cortices with fMRI is typically in the range of 250–400 ms (Henson and Rugg, 2003). To test for visual adaptation at the longer time scale between trials, we defined an ROI in visual cortex using the response to any visual event, which was orthogonal to effects of visual adaptation (p < 0.05, uncorrected). Within this ROI, we compared trials with repeated presentation of the same stimulus against all other trials: ([(DSDI + DSSI) − (2 × SSSI)] and (DSI − SSI) for within- and between-trial analyses, respectively). For visualization, Figure 4A shows this contrast (at p < 0.01, uncorrected), masked by the ROI. An interaction analysis revealed within- but not between-trial stimulus adaptation in visual cortex. Repetition suppression effects only occurred when stimulus presentation was repeated after the short within-trial interstimulus time interval of 400 ms and not at the longer between-trial interval of 6000 ms (within vs between: t(14) = 2.17, p = 0.047; post hoc t tests for within: t(14) = 3.98, p = 0.001; between: t(14) = −0.96, p = 0.351). This difference of within- versus between-trial adaptation was notably more pronounced in neutral trials compared with food trials.
Finally, we performed a direct statistical comparison of within- and between-trial adaptation effects across visual and orbitofrontal cortices. An interaction analysis comparing adaptation effects in orbitofrontal with visual cortices for within- and between-trial time intervals was performed on the mean adaptation effects obtained across the lateral and caudal OFC ROIs (obtained from the same orthogonal ROIs as before). By considering the difference between the parameter estimates for the adaptation condition and the corresponding control condition in visual and orbitofrontal cortices, a significant interaction between type of adaptation (within vs between) and region (visual vs OFC) was revealed (2 × 2 ANOVA: F(1,44) = 7.00, p = 0.019; Fig. 4B). In OFC, adaptation effects occurred both within and between trials, whereas in visual cortex adaptation was significantly more pronounced within a trial. This suggests that, whereas the OFC has the propensity to retain reward representations and reward-associated information across time periods extending up to 6000 ms, visual feature processing in sensory regions may be more short-lasting.
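The region by time-scale interaction described above amounts to a difference of differences between adaptation effects. The following sketch shows only that descriptive arithmetic and is not the ANOVA reported in the text. The function names and the numerical values are hypothetical.

```python
def adaptation_effect(control_estimate, adapted_estimate):
    """Adaptation effect: control parameter estimate minus adapted (repeated) estimate."""
    return control_estimate - adapted_estimate


def region_by_timescale_interaction(ofc_within, ofc_between, visual_within, visual_between):
    """Difference of differences: (visual within - between) minus (OFC within - between)."""
    return (visual_within - visual_between) - (ofc_within - ofc_between)


# Made-up mean adaptation effects (illustration only).
ofc_within = adaptation_effect(1.5, 1.0)      # adaptation present within trials
ofc_between = adaptation_effect(1.5, 1.0)     # and equally present between trials
visual_within = adaptation_effect(2.0, 1.0)   # adaptation present within trials
visual_between = adaptation_effect(1.0, 1.0)  # but absent between trials

interaction = region_by_timescale_interaction(
    ofc_within, ofc_between, visual_within, visual_between
)
print(interaction)
```

A value near zero would indicate that the two regions change similarly from within- to between-trial intervals. A positive value, as in this made-up example, indicates that adaptation falls off more between trials in visual cortex than in OFC.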
Discussion
In this study, we developed a novel repetition suppression paradigm using fMRI to measure representations of reward–identity and stimulus–reward associations within human OFC. Our data revealed a response profile in medial-caudal OFC consistent with that expected for a region encoding the identity of a food reward, whereas a more rostral and lateral OFC region showed encoding of stimulus–reward associations.
To test for encoding of reward–identity, we first isolated regions that showed suppression in the BOLD response after consecutive representation of the same, relative to different, food items. Single-cell recordings from macaque OFC have identified neurons that encode representations suggestive of reward–identity information (Tremblay and Schultz, 1999; Padoa-Schioppa and Assad, 2006, 2008). However, without controlling simultaneously for both stimulus and value information, and without comparing with neutral items, the interpretation of the reward–identity representations in these studies remains ambiguous. The repetition suppression paradigm used in this experiment was specifically designed to access reward–identity representations independent of stimulus and value information. We controlled for the subjective value attributed to both food items and assigned two different stimuli to each item, allowing us to distinguish the processing of stimulus–reward associations and reward–identity. The response pattern of medial-caudal OFC showed suppression to consecutive representations of the same food items relative to consecutive representations of different food items. Notably, this suppression effect was independent of whether the food item was predicted by two identical or two different stimuli. Furthermore, such a repetition–suppression was not observed in response to stimuli that predicted the same neutral items. Together, these findings demonstrate that the medial-caudal region of OFC encodes food-specific reward–identity representations, an example of reward–identity coding, whereby reward information is encoded independent of associative value or stimulus information.
The encoding of reward–identity in the medial region of caudal OFC is consistent with the known anatomical and functional subdivisions of OFC. The caudal OFC receives multisensory inputs, particularly from taste and olfactory cortices (Carmichael and Price, 1996; Rolls, 2000), all of which provide essential components for constructing representations of reward–identity. Functionally, there is substantial evidence to suggest that medial OFC is activated by consumption, expectation, and valuation of food items (Del Parigi et al., 2002; O'Doherty et al., 2002; Gottfried et al., 2003; Simmons et al., 2005; Plassmann et al., 2010). Indeed, previous functional data suggest that there is a gradation in OFC reward-related representations, with primary rewards represented caudally, and secondary rewards more rostrally (Sescousse et al., 2010). Such a gradient is consistent with the caudo-medial OFC location that shows reward–identity coding in our study, where rewards were primarily dissociated by taste. It is an intriguing suggestion that rewards whose identity varies along more abstract axes may be distinguished in other more rostral OFC locations.
At the statistical threshold with which we defined the ROI in the caudo-medial OFC, there was similar repetition suppression in the caudo-lateral OFC, within trials. Unlike caudo-medial OFC, this more lateral region did not replicate this repetition suppression between trials and so should not be interpreted as statistically robust. However, it is notable that this most caudal part of lOFC is strongly connected with the medial orbital network (Carmichael and Price, 1996) and indeed shows stronger resting fMRI correlations with these medial structures than with the remainder of lOFC (Kahnt et al., 2012).
Our second finding relates to a rostro-lateral region of OFC. Although there was a positive response to consecutive representation of different stimuli predicting the same reward type, a relative suppression was observed in response to consecutive representation of the same stimulus if that stimulus predicted a reward. This pattern of results suggests that cellular activity in the lOFC contains a representation of the particular stimulus that predicts a reward. By contrast, repetition suppression could not be observed in this region for stimuli that predicted neutral outcomes. We have termed this pattern of activity the coding of a stimulus–reward association. It is notable that, because we were unable to compare this result with adaptation from repeated representation of a stimulus that predicts multiple reward types, this stimulus–reward association may include encoding of both stimuli that predict general and specific reward types.
The coding of stimulus–reward associations in the lOFC supports a previously proposed functional delineation of OFC regions. The lOFC contributes to a relatively distinct anatomical network from the medial OFC (Croxson et al., 2005; Price, 2007). In particular, the lateral region receives highly processed visual information (Carmichael and Price, 1996), consistent with a functional role attributed to the lOFC in environment-centered reward evaluation and the learning of stimulus values (Bouret and Richmond, 2010; Walton et al., 2010; Noonan et al., 2011; Rushworth et al., 2011). The representation of stimulus–reward associations in lOFC is thus consistent with, and may even explain, the type of stimulus–reward credit assignment deficits observed in macaques and rodents with lOFC damage (Gallagher et al., 1999; Walton et al., 2010; Takahashi et al., 2011).
Although we did not observe any repetition suppression for stimuli or objects in lOFC in the neutral condition, we did observe a somewhat counterintuitive repetition enhancement for neutral objects. However, this response is consistent with previous studies reporting lOFC activation during cognitive tasks. When subjects are required to either maintain or switch their behavioral strategy, the lOFC shows a positive response to maintaining cognitive strategies (Rushworth et al., 2002; Hampton and O'Doherty, 2007). Indeed, this same repetition enhancement has been observed in response to repeated presentation of the same picture of a nonrewarding object (Bar et al., 2001, 2006). In light of these studies, it is perhaps particularly surprising in the current study that this known repetition enhancement is not observed if the stimulus in question predicts a reward. The simple pairing of a stimulus with a reward induces a different coding scheme in the lOFC, one that codes for the specific reward-predicting stimulus. Furthermore, the absence of this suppression for neutral items rules out a possible explanation that relies on pure stimulus coding.
A number of different biophysical mechanisms have been proposed for fMRI adaptation, even in visual cortex (Grill-Spector et al., 2006), and this uncertainty is amplified in regions, such as OFC, where repetition suppression has not to our knowledge been studied in single-unit activity. However, these mechanistic concerns can be at least partially mitigated by careful experimental design. By always comparing trials that are identical except for a repetition on one dimension, any repetition suppression can only be ascribed to one of two causes: either neural activity distinguishes repetitive events per se or neural activity reflects the content that was repeated. Furthermore, by comparing sets of trials with identical form, we were able to mitigate potential concerns pertaining to the levels of stimulus-elicited activity in OFC. fMRI effects in OFC are typically negative compared with baseline (Raichle et al., 2001; Fox et al., 2009), even when parametrically modulated (e.g., Boorman et al., 2009). This lack of a clear baseline would, for example, cause difficulties when interpreting alternative repetition suppression designs that compare trials with single against repeated presentations.
In a final step of our analysis, we examined differences in the time scale of adaptation effects in visual cortex to visual feature adaptation, compared with the adaptation effects in OFC to more cognitively complex information. Although reward and stimulus–reward adaptation in OFC were present both when repetition occurred after a short time interval (within a trial) and after a longer time interval (between trials), this was not the case for stimulus adaptation within visual cortex. Rather, in visual cortex, adaptation to visual features was evident only when repetition occurred after a short time interval (within a trial). Although repetition suppression effects have been reported across a range of time scales, previous studies typically report optimal adaptation effects at short time scales, similar to the within-trial time interval of 400 ms used in this analysis.
Notably, we find that adaptation effects in OFC occurred in response to repetition intervals that varied across a range of time scales, from milliseconds to the order of seconds. Both the encoding of reward–identity in caudal OFC and stimulus–reward associations in lOFC survive stringent tests for both within- and between-trial adaptation. Consistent with the role of OFC and surrounding frontal regions in maintaining sustained patterns of neural activity, this result highlights OFC's ability to hold online representations of reward–identity and related associative information across a time interval of several seconds. Indeed, in our task, OFC appears to maintain a sustained pattern of reward-related activity. Although the task does not require participants to hold onto representations across the intertrial interval, it is possible that subjects nevertheless represent the elicited rewards across this period.
Having established a paradigm that enables measurement of reward–identity representations in human OFC, it will now be possible to use the technique for fine-grained analysis to facilitate our understanding of human OFC and its contribution to economic decision making. Whereas single-unit electrophysiology studies in nonhuman species are essential to our understanding of the mechanisms of prefrontal cortex, human data are of particular importance in this brain structure, which is among the most modified during human evolution (Semendeferi et al., 2002; Schoenemann et al., 2005), and which exhibits patterns of activity that may be particularly specialized for tasks that are not learned over extensive training (Hunt et al., 2012). The adaptation paradigm described here provides a means by which distinct representations of reward information can be measured. It also provides a potentially powerful means by which fMRI studies could reveal more complex coding patterns in reward processing, value construction, and decision mechanisms, which are currently not immediately amenable as correlates of average activity.
Footnotes
This work was supported by the Wellcome Trust Grant WT088312AIA to T.E.J.B., 4-year PhD studentship 086120/Z08/Z to M.C.K.-F., and the Medical Research Council 4-year PhD studentship G1000411 to H.C.B. R.J.D. and the Wellcome Trust Centre for Neuroimaging are supported by core funding from the Wellcome Trust Strategic Award Grant 091593/Z/10/Z.
The authors declare no competing financial interests.
References
- Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998;37:407–419. doi: 10.1016/s0028-3908(98)00033-1.
- Bar M, Tootell RB, Schacter DL, Greve DN, Fischl B, Mendola JD, Rosen BR, Dale AM. Cortical mechanisms specific to explicit visual object recognition. Neuron. 2001;29:529–535. doi: 10.1016/s0896-6273(01)00224-0.
- Bar M, Kassam KS, Ghuman AS, Boshyan J, Schmid AM, Schmidt AM, Dale AM, Hämäläinen MS, Marinkovic K, Schacter DL, Rosen BR, Halgren E. Top-down facilitation of visual recognition. Proc Natl Acad Sci U S A. 2006;103:449–454. doi: 10.1073/pnas.0507062103.
- Behrens TE, Woolrich MW, Walton ME, Rushworth MF. Learning the value of information in an uncertain world. Nat Neurosci. 2007;10:1214–1221. doi: 10.1038/nn1954.
- Birn RM, Diamond JB, Smith MA, Bandettini PA. Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI. Neuroimage. 2006;31:1536–1548. doi: 10.1016/j.neuroimage.2006.02.048.
- Boorman ED, Behrens TE, Woolrich MW, Rushworth MF. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron. 2009;62:733–743. doi: 10.1016/j.neuron.2009.05.014.
- Bouret S, Richmond BJ. Ventromedial and orbital prefrontal neurons differentially encode internally and externally driven motivational values in monkeys. J Neurosci. 2010;30:8591–8601. doi: 10.1523/JNEUROSCI.0049-10.2010.
- Burke KA, Franz TM, Miller DN, Schoenbaum G. The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature. 2008;454:340–344. doi: 10.1038/nature06993.
- Carmichael ST, Price JL. Connectional networks within the orbital and medial prefrontal cortex of macaque monkeys. J Comp Neurol. 1996;371:179–207. doi: 10.1002/(SICI)1096-9861(19960722)371:2<179::AID-CNE1>3.0.CO;2-#.
- Croxson PL, Johansen-Berg H, Behrens TE, Robson MD, Pinsk MA, Gross CG, Richter W, Richter MC, Kastner S, Rushworth MF. Quantitative investigation of connections of the prefrontal cortex in the human and macaque using probabilistic diffusion tractography. J Neurosci. 2005;25:8854–8866. doi: 10.1523/JNEUROSCI.1311-05.2005.
- Del Parigi A, Gautier J-F, Chen K, Salbe AD, Ravussin E, Reiman E, Tataranni PA. Neuroimaging and obesity: mapping the brain responses to hunger and satiation in humans using positron emission tomography. Ann N Y Acad Sci. 2002;967:389–397.
- Fox MD, Zhang D, Snyder AZ, Raichle ME. The global signal and observed anticorrelated resting state brain networks. J Neurophysiol. 2009;101:3270–3283. doi: 10.1152/jn.90777.2008.
- Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
- Glover GH, Li TQ, Ress D. Image-based method for retrospective correction of physiological motion effects in fMRI: RETROICOR. Magn Reson Med. 2000;44:162–167. doi: 10.1002/1522-2594(200007)44:1<162::aid-mrm23>3.0.co;2-e.
- Gottfried JA, O'Doherty J, Dolan RJ. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science. 2003;301:1104–1107. doi: 10.1126/science.1087919.
- Grill-Spector K, Henson R, Martin A. Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn Sci (Regul Ed) 2006;10:14–23. doi: 10.1016/j.tics.2005.11.006.
- Hampton AN, O'Doherty JP. Decoding the neural substrates of reward-related decision making with functional MRI. Proc Natl Acad Sci U S A. 2007;104:1377–1382. doi: 10.1073/pnas.0606297104.
- Henson RN, Rugg MD. Neural response suppression, haemodynamic repetition effects, and behavioural priming. Neuropsychologia. 2003;41:263–270. doi: 10.1016/s0028-3932(02)00159-8.
- Hikosaka K, Watanabe M. Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb Cortex. 2000;10:263–271. doi: 10.1093/cercor/10.3.263.
- Hunt LT, Kolling N, Soltani A, Woolrich MW, Rushworth MF, Behrens TE. Mechanisms underlying cortical activity during value-guided choice. Nat Neurosci. 2012;15:470–476. doi: 10.1038/nn.3017.
- Hutton C, Josephs O, Stadler J, Featherstone E, Reid A, Speck O, Bernarding J, Weiskopf N. The impact of physiological noise correction on fMRI at 7T. Neuroimage. 2011;57:101–112. doi: 10.1016/j.neuroimage.2011.04.018.
- Jones BM, White KG. An investigation of the differential-outcomes effect within sessions. J Exp Anal Behav. 1994;61:389–406. doi: 10.1901/jeab.1994.61-389.
- Kahnt T, Chang LJ, Park SQ, Heinzle J, Haynes JD. Connectivity-based parcellation of the human orbitofrontal cortex. J Neurosci. 2012;32:6240–6250. doi: 10.1523/JNEUROSCI.0257-12.2012.
- McDannald MA, Takahashi YK, Lopatina N, Pietras BW, Jones JL, Schoenbaum G. Model-based learning and the contribution of the orbitofrontal cortex to the model-free world. Eur J Neurosci. 2012;35:991–996. doi: 10.1111/j.1460-9568.2011.07982.x.
- Noonan MP, Walton ME, Behrens TE, Sallet J, Buckley MJ, Rushworth MF. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc Natl Acad Sci U S A. 2010;107:20547–20552. doi: 10.1073/pnas.1012246107.
- Noonan MP, Mars RB, Rushworth MF. Distinct roles of three frontal cortical areas in reward-guided behavior. J Neurosci. 2011;31:14399–14412. doi: 10.1523/JNEUROSCI.6456-10.2011.
- O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ. Neural responses during anticipation of a primary taste reward. Neuron. 2002;33:815–826. doi: 10.1016/s0896-6273(02)00603-7.
- Ostlund SB, Balleine BW. Orbitofrontal cortex mediates outcome encoding in Pavlovian but not instrumental conditioning. J Neurosci. 2007;27:4819–4825. doi: 10.1523/JNEUROSCI.5443-06.2007.
- Padoa-Schioppa C. Neurobiology of economic choice: a good-based model. Annu Rev Neurosci. 2011;34:333–359. doi: 10.1146/annurev-neuro-061010-113648.
- Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
- Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat Neurosci. 2008;11:95–102. doi: 10.1038/nn2020.
- Plassmann H, O'Doherty J, Rangel A. Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci. 2007;27:9984–9988. doi: 10.1523/JNEUROSCI.2131-07.2007.
- Plassmann H, O'Doherty JP, Rangel A. Appetitive and aversive goal values are encoded in the medial orbitofrontal cortex at the time of decision making. J Neurosci. 2010;30:10799–10808. doi: 10.1523/JNEUROSCI.0788-10.2010.
- Price JL. Definition of the orbital cortex in relation to specific connections with limbic and visceral structures and other cortical regions. Ann N Y Acad Sci. 2007;1121:54–71. doi: 10.1196/annals.1401.008.
- Raichle ME, MacLeod AM, Snyder AZ, Powers WJ, Gusnard DA, Shulman GL. A default mode of brain function. Proc Natl Acad Sci U S A. 2001;98:676–682. doi: 10.1073/pnas.98.2.676.
- Rolls ET. The orbitofrontal cortex and reward. Cereb Cortex. 2000;10:284–294. doi: 10.1093/cercor/10.3.284.
- Rolls ET, Baylis LL. Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. J Neurosci. 1994;14:5437–5452. doi: 10.1523/JNEUROSCI.14-09-05437.1994.
- Rolls ET, Critchley HD, Browning AS, Hernadi I, Lenard L. Responses to the sensory properties of fat of neurons in the primate orbitofrontal cortex. J Neurosci. 1999;19:1532–1540. doi: 10.1523/JNEUROSCI.19-04-01532.1999.
- Rudebeck PH, Murray EA. Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. J Neurosci. 2011;31:10569–10578. doi: 10.1523/JNEUROSCI.0091-11.2011.
- Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MF. Separate neural pathways process different decision costs. Nat Neurosci. 2006;9:1161–1168. doi: 10.1038/nn1756.
- Rushworth MF, Hadland KA, Paus T, Sipila PK. Role of the human medial frontal cortex in task switching: a combined fMRI and TMS study. J Neurophysiol. 2002;87:2577–2592. doi: 10.1152/jn.2002.87.5.2577.
- Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron. 2011;70:1054–1069. doi: 10.1016/j.neuron.2011.05.014.
- Savage LM. In search of the neurobiological underpinnings of the differential outcomes effect. Integr Physiol Behav Sci. 2001;36:182–195. doi: 10.1007/BF02734092.
- Schoenbaum G, Setlow B, Saddoris MP, Gallagher M. Encoding predicted outcome and acquired value in orbitofrontal cortex during cue sampling depends upon input from basolateral amygdala. Neuron. 2003;39:855–867. doi: 10.1016/s0896-6273(03)00474-4.
- Schoenemann PT, Sheehan MJ, Glotzer LD. Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nat Neurosci. 2005;8:242–252. doi: 10.1038/nn1394.
- Semendeferi K, Lu A, Schenker N, Damasio H. Humans and great apes share a large frontal cortex. Nat Neurosci. 2002;5:272–276. doi: 10.1038/nn814.
- Sescousse G, Redouté J, Dreher JC. The architecture of reward value coding in the human orbitofrontal cortex. J Neurosci. 2010;30:13095–13104. doi: 10.1523/JNEUROSCI.3501-10.2010.
- Simmons WK, Martin A, Barsalou LW. Pictures of appetizing foods activate gustatory cortices for taste and reward. Cereb Cortex. 2005;15:1602–1608. doi: 10.1093/cercor/bhi038.
- Takahashi YK, Roesch MR, Wilson RC, Toreson K, O'Donnell P, Niv Y, Schoenbaum G. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat Neurosci. 2011;14:1590–1597. doi: 10.1038/nn.2957.
- Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525.
- Valentin VV, Dickinson A, O'Doherty JP. Determining the neural substrates of goal-directed learning in the human brain. J Neurosci. 2007;27:4019–4026. doi: 10.1523/JNEUROSCI.0564-07.2007.
- Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027.
- Weiskopf N, Hutton C, Josephs O, Deichmann R. Optimal EPI parameters for reduction of susceptibility-induced BOLD sensitivity losses: a whole-brain analysis at 3 T and 1.5 T. Neuroimage. 2006;33:493–504. doi: 10.1016/j.neuroimage.2006.07.029.