Abstract
Goal-directed behavior is sensitive to the current value of expected outcomes. This requires independent representations of specific rewards, which have been linked to orbitofrontal cortex (OFC) function. However, the mechanisms by which the human brain updates specific goals on the fly, and translates those updates into choices, have remained unknown. Here we implemented selective devaluation of appetizing food odors in combination with pattern-based neuroimaging and a decision-making task. We found that in a hungry state, participants chose to smell high-intensity versions of two value-matched food odor rewards. After eating a meal corresponding to one of the two odors, participants switched choices toward the low intensity of the sated odor but continued to choose the high intensity of the nonsated odor. This sensory-specific behavioral effect was mirrored by pattern-based changes in fMRI signal in lateral posterior OFC, where specific reward identity representations were altered after the meal for the sated food odor but retained for the nonsated counterpart. In addition, changes in functional connectivity between the OFC and general value coding in ventromedial prefrontal cortex (vmPFC) predicted individual differences in satiety-related choice behavior. These findings demonstrate how flexible representations of specific rewards in the OFC are updated by devaluation, and how functional connections to vmPFC reflect the current value of outcomes and guide goal-directed behavior.
SIGNIFICANCE STATEMENT The orbitofrontal cortex (OFC) is critical for goal-directed behavior. A recent proposal is that OFC fulfills this function by representing a variety of state and task variables (“cognitive maps”), including a conjunction of expected reward identity and value. Here we tested how identity-specific representations of food odor reward are updated by satiety. We found that fMRI pattern-based signatures of reward identity in lateral posterior OFC were modulated after selective devaluation, and that connectivity between this region and general value coding ventromedial prefrontal cortex (vmPFC) predicted choice behavior. These results provide evidence for a mechanism by which devaluation modulates a cognitive map of expected reward in OFC and thereby alters general value signals in vmPFC to guide goal-directed behavior.
Keywords: decision making, devaluation, fMRI, olfaction, orbitofrontal cortex, reward
Introduction
A central function of the brain is to direct behavior toward essential rewards, such as food, shelter, and mates. Behavioral control mechanisms that can flexibly adapt to changes in the subjective value of these rewards are operationally defined as “goal-directed behaviors” (Balleine and Dickinson, 1998; O'Doherty et al., 2017). Previous work indicates that the orbitofrontal cortex (OFC) is a key neural substrate for supporting goal-directed behavior. Neural signals in this region reflect predictive reward information (Schoenbaum et al., 1998; Tremblay and Schultz, 1999; O'Doherty et al., 2002), and these anticipatory responses are modulated according to changes in value without the need for additional stimulus–outcome learning (Gottfried et al., 2003; Valentin et al., 2007; Gremel and Costa, 2013). Moreover, animals with lesions to the OFC continue to respond to cues predicting devalued outcomes (Gallagher et al., 1999; Rhodes and Murray, 2013; Rudebeck et al., 2013; Murray et al., 2015), further demonstrating that this region is critical for goal-directed behavior.
A fundamental feature of any goal-directed system is that representations of goals themselves must be specific (Cardinal et al., 2002; Rudebeck and Murray, 2014). This specificity ensures that, for example, a decrease in the value of food after a meal does not affect the value of shelter or mates. Recent work in animals (Burke et al., 2008; McDannald et al., 2014; Stalnaker et al., 2014) and humans (Sescousse et al., 2010; Klein-Flügge et al., 2013; Howard et al., 2015; Boorman et al., 2016) has linked the OFC to the processing of such identity-specific reward signals (Rudebeck and Murray, 2014). These findings raise the possibility that identity-specific goal representations in OFC are altered to reflect changes in the value of a goal. However, whether and how identity-specific alterations are implemented within the OFC and the broader reward network is not known.
In principle, specific updates could be implemented either directly by changing reward identity representations in OFC or by changing the assignment of value to these rewards, either within the OFC or in downstream regions such as ventromedial prefrontal cortex (vmPFC). In contrast to OFC, activity in the vmPFC reflects decision values regardless of reward identity (Plassmann et al., 2007; Chib et al., 2009; Lebreton et al., 2009; Levy and Glimcher, 2011; McNamee et al., 2013; Howard et al., 2015). Moreover, whereas identity-specific OFC signals are linked to reward expectations before a decision (Burke et al., 2008; Stalnaker et al., 2014; Howard et al., 2015; Rich and Wallis, 2016), identity-general signals are typically observed at the time of choice (Daw et al., 2006; Padoa-Schioppa and Assad, 2006; Hampton and O'Doherty, 2007; Plassmann et al., 2007; Kennerley et al., 2011; McNamee et al., 2013; Strait et al., 2014). Based on these findings, we hypothesized that identity-specific signals are directly updated in OFC and that they provide critical input to representations of decision values in vmPFC.
To test this hypothesis, we implemented a value-based choice task in conjunction with fMRI and a selective devaluation paradigm, using food odors as appetitive unconditioned stimuli (US). During fMRI scanning, participants made choices among visual conditioned stimuli (CS) to receive either low-intensity (low-value) or high-intensity (high-value) versions of two distinct food odors. Scanning was conducted first while participants were hungry and then immediately after they had eaten to satiety a meal related to one of the two odors. Pattern-based fMRI analyses revealed that in the lateral posterior OFC (pOFC), satiety modulated anticipatory reward identity representations for the sated (SA) odor, whereas fMRI patterns related to the nonsated (NS) food remained intact. Moreover, satiety-related changes in functional connectivity between lateral pOFC and general value coding vmPFC predicted individual differences in how satiety altered choice behavior. Together, our results suggest a mechanism by which specific reward signals in lateral OFC are flexibly and independently updated by devaluation, and how functional connections with vmPFC support goal-directed behavior.
Materials and Methods
Subjects
Nineteen healthy human participants with no history of psychiatric illness (seven male; age, 20–34; mean ± SD, 25.0 ± 3.45 years) gave informed written consent to participate in this study. The study protocol was approved by the Northwestern University Institutional Review Board. One participant was excluded from analysis because of excessive head motion (>4 mm), and one was excluded because of a high number of missed responses (>15%), resulting in data from 17 total subjects presented here.
Odor stimuli and presentation
Eight food odors, including four sweet (strawberry, caramel, cupcake, gingerbread) and four savory (potato chips, pot roast, sautéed onions, garlic), were provided by International Flavors and Fragrances. For all experimental tasks, odors were delivered directly to participants' noses using a custom-built computer-controlled olfactometer capable of redirecting medical-grade air with precise timing at a constant flow rate of 3.6 L/min through the headspace of amber bottles containing liquid solutions of the food odors. The olfactometer was equipped with two independent mass flow controllers (Alicat), allowing for dilution of any given odorant with odorless air.
Experimental design
The experiment consisted of 3 separate days of testing. On all 3 days, participants were instructed to arrive in a hungry state, having fasted for at least 6 h before testing. All behavioral ratings were made on visual analog scales using a scroll wheel and mouse button press. Pleasantness rating anchors were “most liked sensation imaginable” (10) and “most disliked sensation imaginable” (−10). Intensity rating anchors were “strongest imaginable” and “undetectable.” Identity rating anchors corresponded to two-letter abbreviations of the two food odor rewards (e.g., SB for strawberry and PC for potato chip). Subjects were compensated with $20 per hour of behavioral testing and $40 per hour of fMRI scanning.
Day 1: stimulus selection.
Participants first provided pleasantness ratings of the eight food odors. Based on these ratings, one sweet odor and one savory odor were chosen such that they were matched as closely as possible in pleasantness. Next, we acquired pleasantness ratings for the two selected odors across a range of odor concentrations, diluted to varying degrees with odorless air. Based on these ratings, we selected two intensity levels for each odor, such that the two low-intensity odors had the same pleasantness and the two high-intensity odors had the same pleasantness. Participants then provided independent pleasantness and intensity ratings on these four odors to verify the relationship between intensity and pleasantness (i.e., value).
Day 2: classical conditioning and test trials.
After a brief reminder of the selected odors, participants completed a training session consisting of alternating blocks of classical conditioning and probe test trials in which they learned associations between visual symbols (CS) and the food odors (US). Two unique symbols were randomly designated to be paired with each of the four odors. On each trial of the conditioning task, the CS was first presented on the screen above a white crosshair for 2.5 s. The crosshair then changed from white to blue for 2 s, indicating that an odor was present and cuing the participant to make a sniff. The odor presentation was immediately followed by an odor pleasantness rating and an odor identity rating (order randomized) and a 5–8 s interstimulus interval. Conditioning proceeded first in a series of “blocked” trials, in which a given CS/US pair was repeated for five consecutive trials (2 symbols per odor × 4 odors × 5 repeats = 40 blocked trials). This was followed by 16 randomly sequenced conditioning trials, in which each CS/US pair was presented twice. We then assessed the strength of the CS/US association with 16 “test” trials, in which participants first saw a symbol and then were asked to which odor and which intensity it corresponded. If performance was <90% on the test trials, participants received an additional 16 conditioning trials, followed by another series of test trials. Conditioning proceeded in this way until >90% was achieved in a test session, up to a maximum of four cycles.
Day 3: choice task and fMRI scanning.
Before fMRI scanning, one of the two selected odors was randomly designated as the SA odor and the other odor was designated as NS odor. The SA odor was associated with the meal eaten by participants between the presatiety and postsatiety fMRI scanning sessions. Each scanning session consisted of five fMRI runs, and each run consisted of 24 randomly sequenced choice trials. Each choice trial started with an “offer” phase, which consisted of two CSs simultaneously presented on the screen on either side of a white crosshair for 4 s (left/right position randomized). The crosshair then turned green, indicating the onset of the “choice” phase in which participants were free to choose, via mouse click, either of the two CSs. After the 3 s choice phase, the center crosshair turned blue, cuing the participant to sniff the odor US paired with the chosen CS. After this 2 s “outcome” phase, participants rated either the pleasantness or identity of the odor (rating type randomized), followed by a 0–4 s interstimulus interval. Two-thirds of the choice trials were “high versus low” (HL), such that one of the CSs predicted a high-intensity US and the other predicted a low-intensity US of the same identity. The remaining one-third of the trials were “low versus low” (LL), such that both CSs predicted the same low-intensity US.
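The run structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' task code: it assumes that the six offer conditions later used as GLM regressors are repeated equally often (four times each), which is one way to arrive at 24 trials with two-thirds HL and one-third LL. The field name `first_symbol_side` and the constant `REPEATS_PER_CONDITION` are hypothetical.

```python
import random

# The six offer conditions are those named for the GLM regressors in the text.
CONDITIONS = ["SAHL_setI", "SAHL_setII", "SALL", "NSHL_setI", "NSHL_setII", "NSLL"]
REPEATS_PER_CONDITION = 4  # assumption: equal repeats, giving 24 trials per run


def build_run(seed=None):
    """Return one run as a shuffled list of trial dicts."""
    rng = random.Random(seed)
    trials = []
    for condition in CONDITIONS:
        for _ in range(REPEATS_PER_CONDITION):
            trials.append({
                "condition": condition,
                # Hypothetical field: which side the first symbol appears on.
                "first_symbol_side": rng.choice(["left", "right"]),
            })
    rng.shuffle(trials)  # random trial sequence within the run
    return trials


run = build_run(seed=0)
n_hl = sum("HL" in t["condition"] for t in run)
n_ll = sum("LL" in t["condition"] for t in run)
print(len(run), n_hl, n_ll)  # 24 16 8
```

Under this assumption, 16 of the 24 trials are HL and 8 are LL, matching the stated two-thirds to one-third split.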
Immediately before each fMRI scanning session, participants provided additional pleasantness ratings for the four odor USs. After the presatiety session, participants were removed from the scanner and taken to a separate testing room where they first provided ratings of hunger, desire to eat, fullness, and satiety on visual analog scales. They were then presented with a meal consisting of an abundant amount of one food item corresponding as closely as possible to the SA odor (e.g., a large bowl of potato chips if the potato chip odor was designated as SA), as well as water to drink as needed. Participants were instructed to eat as much as they wanted and to consider this a meal rather than a snack, and were told that more food was available if they wanted it. They were then left alone and instructed to notify the experimenters when they were done eating. After the meal, participants again provided ratings of hunger, desire to eat, fullness, and satiety. They were then given an opportunity to use the restroom if needed and were escorted back to the scanner for the postsatiety session, which was identical to the presatiety session except for independent randomization of trial sequences. This experimental paradigm is similar to that used in a prior study of reward devaluation (Gottfried et al., 2003), except that we used an instrumental choice task instead of a Pavlovian task and we focused on anticipatory activity in advance of the receipt of an expected odor reward.
fMRI data acquisition
MRI data were acquired on the Siemens 3T PRISMA system equipped with a 64-channel head–neck coil. Echoplanar imaging (EPI) volumes were acquired with the following parameters: repetition time, 2 s; echo time, 20 ms; flip angle, 90°; slice thickness, 2 mm, no gap; number of slices, 35; interleaved slice acquisition order; matrix size, 110 × 110 voxels; field of view, 220 × 220 mm. The functional scanning window was tilted ∼30° from axial to minimize susceptibility artifacts in OFC. Each fMRI run lasted 6.4 min and consisted of 192 EPI volumes covering the ventral part of the PFC, the anterior temporal lobe, and the basal ganglia. To aid in normalization of the functional scans, we also acquired 10 EPI volumes for each participant covering the entire brain, with the same parameters as described above except 95 slices and a repetition time of 5.25 s. A 1 mm isotropic T1-weighted structural scan was also acquired at the end of the postsatiety scanning session for each participant.
Sniff recording and analysis
During scanning, breathing activity was monitored using a respiratory effort band (BIOPAC Systems) affixed around the participant's torso and recorded using PowerLab equipment (ADInstruments) at a sampling rate of 1 kHz. Breathing traces for each fMRI run were temporally smoothed using a moving window spanning 500 ms, high-pass filtered (cutoff, 50 s) to remove slow-frequency drifts, and normalized by subtracting the mean and dividing by the SD across the entire run trace. The onset of inhalation initiated by the choice task sniff cue (center crosshair turning blue) on each choice trial was determined by finding the time of the minimum signal value within a window spanning 1 s on either side of the sniff cue presentation. We then calculated trial-by-trial sniff amplitude and volume, which were used as nuisance regressors in statistical modeling of the fMRI data (see below).
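The preprocessing steps above (smoothing, drift removal, normalization, and locating the inhalation onset) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: the text does not name the high-pass filter, so subtracting a slow moving average stands in for it here, and the population SD is used for normalization. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def moving_average(x, width):
    """Centered moving average via prefix sums; the window is truncated at the edges."""
    half = width // 2
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def preprocess(trace, fs, smooth_s=0.5, highpass_s=50.0):
    """Smooth, remove slow drift, and z-score one run's breathing trace."""
    smoothed = moving_average(trace, int(smooth_s * fs))
    # Assumed stand-in for the high-pass filter: subtract a 50 s moving average.
    drift = moving_average(smoothed, int(highpass_s * fs))
    detrended = [s - d for s, d in zip(smoothed, drift)]
    mu, sd = mean(detrended), pstdev(detrended)
    return [(v - mu) / sd for v in detrended]


def inhalation_onset(trace, cue_s, fs, half_window_s=1.0):
    """Time (s) of the minimum signal value within 1 s on either side of the sniff cue."""
    lo = max(0, int((cue_s - half_window_s) * fs))
    hi = min(len(trace), int((cue_s + half_window_s) * fs) + 1)
    idx = min(range(lo, hi), key=trace.__getitem__)
    return idx / fs


fs = 100  # Hz; reduced from the 1 kHz in the text to keep the demo small
# Synthetic breathing trace: a 4 s cycle sampled for 60 s.
trace = [math.sin(2 * math.pi * (i / fs) / 4.0) for i in range(60 * fs)]
clean = preprocess(trace, fs)
onset = inhalation_onset(clean, cue_s=10.5, fs=fs)
print(round(onset, 2))
```

With this synthetic trace, the detected onset falls at the trough of the breathing cycle nearest the simulated cue. Trial-by-trial sniff measures would then be computed relative to this onset; how they were defined is not specified in the text, so that step is left out.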
fMRI data preprocessing
To correct for head motion during scanning, for each subject all functional images across presatiety and postsatiety scanning sessions were aligned to the first acquired image using SPM12 (www.fil.ion.ucl.ac.uk/spm/, RRID: SCR_007037). The 10 whole-brain EPIs were motion corrected, averaged, and coregistered to the T1 structural image. The functional EPI time series was then coregistered to the mean whole-brain EPI, using the mean functional EPI as the source image. Spatial normalization was performed by normalizing the T1-weighted structural image to the MNI (Montreal Neurological Institute) EPI template volume. For multivariate analyses, the resulting deformation fields were applied to brain maps of decoder classification accuracy. For the functional connectivity analysis, the deformation fields were applied to the functional EPI volumes. In both cases, the resulting normalized volumes were spatially smoothed with a 6 × 6 × 6 mm full-width half-maximum Gaussian kernel before group-level statistical testing.
Multivoxel pattern analysis for reward identity coding
As with our previous studies (Kahnt et al., 2010, 2014; Howard et al., 2015, 2016), we implemented a searchlight-based decoding approach combined with a linear kernel support vector machine (SVM) to test for information in fMRI activity patterns without potential biases attributable to voxel selection. We first specified separate general linear models (GLMs) using the realigned functional images from each fMRI run. The GLMs included six event-related regressors of interest specifying the onset of the choice offers for the following conditions: (1) SAHL set I; (2) SAHL set II; (3) SALL; (4) NSHL set I; (5) NSHL set II; (6) NSLL. We also included event-related regressors locked to the time of choice, the onset of odor presentation, and the onset of the rating. Nuisance regressors included the following: trial-by-trial calculations of sniff amplitude and sniff duration, the six motion parameters calculated during realignment, the derivatives of each motion parameter, one regressor for the absolute signal difference between even and odd slices (to account for signal caused by within-scan motion), and the derivative of this signal. Additional regressors were included as needed to model individual volumes that exhibited strong head motion.
Spatial patterns of parameter estimates corresponding to the offer onset conditions were then subjected to SVM classification using the LIBSVM implementation (http://www.csie.ntu.edu.tw/~cjlin/libsvm/, RRID: SCR_010243). To test for brain regions that coded for the specific identity of expected odor rewards while controlling for differences in CS visual features, at each searchlight (sphere with radius of ∼4 voxels) we trained the classifier to discriminate SAHL and NSHL set I patterns from four of the five runs and tested it on SAHL and NSHL set II patterns from the left out run. This procedure was repeated using set II as the training and set I as the test, and with each run in turn left out. The average classification accuracy resulting from this procedure was then mapped back to the center voxel of the searchlight sphere. Searchlight spheres consisted only of gray matter voxels specified by inclusively masking functional volumes with the tissue probability map provided by SPM, thresholded at 0.2 and inverse normalized to subject-specific native space. The classification described above was first conducted in the presatiety data set to identify regions that encoded the identity of the expected food rewards (see Fig. 4A). Within these regions, we then tested whether the same classification analysis conducted in the postsatiety data set would produce above-chance classification.
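The cross-validation scheme described above combines leave-one-run-out with a swap of symbol sets between training and test. The sketch below only enumerates which patterns enter each fold; it does not run a classifier, and the function name and tuple layout are illustrative assumptions rather than the authors' code.

```python
from itertools import product

RUNS = [1, 2, 3, 4, 5]
SETS = ["set I", "set II"]
IDENTITY_CONDITIONS = ("SAHL", "NSHL")


def identity_folds(runs=RUNS, sets=SETS):
    """Enumerate train/test splits: train on one symbol set in four runs,
    test on the other symbol set in the left-out run."""
    folds = []
    for test_run, train_set in product(runs, sets):
        test_set = sets[1 - sets.index(train_set)]
        train = [(run, cond, train_set)
                 for run in runs if run != test_run
                 for cond in IDENTITY_CONDITIONS]
        test = [(test_run, cond, test_set) for cond in IDENTITY_CONDITIONS]
        folds.append({"train": train, "test": test})
    return folds


folds = identity_folds()
print(len(folds))                # 10 folds: 5 left-out runs x 2 set directions
print(len(folds[0]["train"]))    # 8 training patterns per fold
print(folds[0]["test"])
```

Because training and test patterns never share a run or a symbol set, above-chance accuracy in such a scheme cannot be attributed to the visual features of a particular symbol. The accuracy mapped to each searchlight center would be the mean over these folds.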
Figure 4.
Neural representations of reward identity. A, A classifier was trained to discriminate sated versus nonsated offer-related fMRI activity related to one set of symbols (e.g., set I, solid outlines) and tested on the same conditions but evoked by the other set (e.g., set II, dashed outlines). Classification was performed first on presatiety data to identify regions that encoded the identity of the expected reward outcomes. Effects that survived small-volume correction for multiple comparisons using a priori regions are shown here. B, C, Identity-based classification was performed separately in the presatiety and postsatiety data (B), and resulting accuracies were averaged across voxels in medial pOFC and lateral pOFC clusters (C). D, A classifier was trained to discriminate odor identities in the presatiety data and tested across sessions once with postsatiety sated patterns (TEST SA) and once with postsatiety nonsated patterns (TEST NS). E, Accuracies resulting from these two classification tests were extracted from medial pOFC and lateral pOFC. Error bars depict mean and SEM for n = 17. *p < 0.05, t tests.
To test whether codes of SA or NS identity were specifically modulated after satiety, we implemented a cross-session classification analysis (see Fig. 4D). For this, the classifier was trained on SAHL and NSHL set I patterns from the presatiety data and tested on set II patterns in two ways: (1) presatiety SAHL and postsatiety NSHL and (2) postsatiety SAHL and presatiety NSHL. Classification accuracy can only be above chance in either case if the activity patterns encoding the identity of SA or NS rewards in the presatiety session remain similar in the postsatiety session. Accuracies for these analyses were averaged across voxels in the regions identified in the original identity-based decoding analysis and compared between postsatiety SAHL and postsatiety NSHL.
Multivoxel pattern analysis for general value coding
To test for regions that coded the general value of the expected rewards at the offer phase of choice trials, we used the same run-wise GLMs described above. In this case, however, we trained the classifier to discriminate between parameter estimates of SAHL (averaged across set I and set II) and SALL from four of the five fMRI runs and tested on NSHL and NSLL from the left out run (see Fig. 5A). We then trained on NSHL and NSLL from four of the five fMRI runs and tested on SAHL and SALL from the left out run, such that each identity was used as training and test in each run-based iteration. With this design, classification accuracy can only be above chance if activity patterns coding for information about the value of an offer generalize across reward identities. To test for regions encoding general reward value at the time a choice was made, we specified a new set of run-wise GLMs that was identical to the ones described previously except the six condition-specific regressors of interest were time-locked to the onset of choice and a single regressor was time-locked to all offer onsets. As with the identity-based decoding analysis, we first tested for regions encoding general reward value in presatiety data only and in significant regions tested whether the same decoding was above chance in the postsatiety session.
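The logic of the cross-identity value decoding can be illustrated with toy data. The sketch below is an assumption-laden stand-in: it uses a nearest-centroid classifier in place of the linear SVM, and synthetic eight-voxel patterns in which value and identity are carried by separate voxels. All names and the data-generating rule are hypothetical.

```python
import random

N_RUNS = 5
rng = random.Random(0)


def synth(identity, value):
    """Toy pattern: value on voxels 0-3 (shared across identities), identity on voxels 4-7."""
    base = [float(value == "HL")] * 4 + [float(identity == "SA")] * 4
    return [b + rng.gauss(0, 0.1) for b in base]


def centroid(patterns):
    n = len(patterns)
    return [sum(p[i] for p in patterns) / n for i in range(len(patterns[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid_accuracy(train, test):
    """train and test map each label to a list of patterns."""
    cents = {label: centroid(pats) for label, pats in train.items()}
    correct = total = 0
    for label, pats in test.items():
        for p in pats:
            pred = min(cents, key=lambda lab: sq_dist(p, cents[lab]))
            correct += pred == label
            total += 1
    return correct / total


def cross_identity_accuracy(patterns):
    """Train HL vs LL on one identity in four runs, test on the other identity
    in the left-out run, in both directions, and average over folds."""
    accs = []
    for test_run in range(N_RUNS):
        for train_id, test_id in (("SA", "NS"), ("NS", "SA")):
            train = {v: [patterns[(train_id, v, r)]
                         for r in range(N_RUNS) if r != test_run]
                     for v in ("HL", "LL")}
            test = {v: [patterns[(test_id, v, test_run)]] for v in ("HL", "LL")}
            accs.append(nearest_centroid_accuracy(train, test))
    return sum(accs) / len(accs)


patterns = {(ident, val, run): synth(ident, val)
            for ident in ("SA", "NS") for val in ("HL", "LL") for run in range(N_RUNS)}
accuracy = cross_identity_accuracy(patterns)
print(accuracy)
```

In this toy case the value code is shared across identities by construction, so the classifier generalizes. If value were instead encoded differently for each identity, training on one identity would carry no information about the other and accuracy would fall to chance.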
Figure 5.
Neural representations of general reward value. A, The classifier was trained to discriminate [high versus low] versus [low versus low] offer- or choice-related fMRI activity for one odor identity and tested on the same conditions but evoked by the other identity. Classification was performed first on presatiety data to identify regions that encoded the general value of the expected reward outcomes. Effects that survived small-volume correction for multiple comparisons using a priori regions are shown here. B, C, General value decoding was performed separately in the presatiety and postsatiety data (B), and resulting accuracies were averaged across voxels in VS and vmPFC clusters (C). D, A classifier was trained to discriminate [high versus low] versus [low versus low] offers or choices for one identity in presatiety and tested on the same conditions but for the other identity in the postsatiety data. E, Accuracies were averaged across voxels in the VS and vmPFC clusters. Error bars depict mean and SEM for n = 17. *p < 0.05, t tests.
We also conducted a cross-session general value decoding analysis to test whether either the SA or NS condition exhibited specific modulation in general value information from presatiety to postsatiety. To test for changes in general value in the NS condition, we trained the classifier on SAHL versus SALL in presatiety and tested it on NSHL versus NSLL conditions in postsatiety. Conversely, to test for satiety-related changes in general value in the SA condition, we trained the classifier on NSHL versus NSLL in presatiety and tested it on SAHL versus SALL conditions in postsatiety. Above-chance classification in this cross-session analysis would indicate that activity patterns coding general reward value were similar in both scanning sessions.
Psychophysiological interaction analysis
We conducted a psychophysiological interaction (PPI) analysis using the gPPI toolbox (McLaren et al., 2012) to test for sensory-specific satiety-related connectivity changes between general value coding vmPFC and reward identity coding pOFC regions. For each subject, we estimated a PPI model on normalized and smoothed functional EPIs. Activity in the vmPFC seed (identified in the general value analysis) was included as the physiological factor, and odor type (SA vs NS) and session (presatiety vs postsatiety) were included as the psychological factors. Psychophysiological regressors were included for SAHL and NSHL trials at the time of choice for presatiety and postsatiety sessions. The subject-wise models also included regressors for the onset of all offers, all odor presentations, and all ratings. Using the parameter estimates from the psychophysiological regressors, contrasts were calculated for each condition, scanning session, and subject and averaged across voxels within the medial and lateral pOFC.
Group-level statistical analysis
To test for brain regions encoding reward identity and general reward value, we performed group-level one-sample t tests on normalized and smoothed decoder accuracy maps. For the identity-based analyses, these group-level models included a regressor for subject-by-subject calculations of the difference in rated pleasantness between high-intensity SA and high-intensity NS. This was done to ensure that decoding accuracy was driven by information about reward identity independent of potential value differences between the two odors. Similarly, the general value group-level models included a regressor for the difference in pleasantness between high- and low-intensity SA and high- and low-intensity NS to ensure that observed value coding was independent of relative value differences between high and low intensity between the two conditions. Significance at the group level was set at p < 0.05, small-volume corrected for multiple comparisons [family-wise error rate (FWE)] at the voxel level using corresponding anatomical regions defined in the Neuromorphometrics brain atlas (included in SPM12) as follows: medial pOFC (regions 146 and 186), lateral pOFC (region 178), left ventral striatum (regions 37, 30, and 58), right ventral striatum (regions 23, 36, and 57), and vmPFC (regions 124, 125, 140, and 141). For display purposes, activations are shown at p < 0.005, uncorrected. To test for interactions between experimental conditions and testing sessions (presatiety and postsatiety), we used repeated-measures ANOVAs, two-tailed. For directed and undirected tests between conditions, we used paired t tests, one-tailed and two-tailed, respectively. All fMRI decoding results reported here were initially tested against the theoretical chance level of 50%. To ensure that significant above-chance accuracies represent true effects, we also derived an empirical chance level by conducting post hoc decoding analyses with randomly permuted (n = 1000) labels in the training data. All decoding accuracies reported here are also significant when tested against empirical chance.
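The idea of an empirical chance level obtained by permuting training labels can be sketched as follows. This is a toy illustration only: it uses a one-dimensional nearest-mean classifier rather than the SVM searchlight, and the data, function names, and the choice to summarize the null distribution by its mean are assumptions not taken from the text.

```python
import random


def nearest_mean_accuracy(train_x, train_y, test_x, test_y):
    """Classify each test value by the closer class mean of the training data."""
    labels = sorted(set(train_y))  # sorted for a deterministic tie-break
    means = {}
    for lab in labels:
        vals = [x for x, y in zip(train_x, train_y) if y == lab]
        means[lab] = sum(vals) / len(vals)
    preds = [min(labels, key=lambda lab: abs(x - means[lab])) for x in test_x]
    return sum(p == t for p, t in zip(preds, test_y)) / len(test_y)


def empirical_chance(train_x, train_y, test_x, test_y, n_perm=1000, seed=0):
    """Mean accuracy obtained after shuffling the training labels n_perm times."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_perm):
        shuffled = list(train_y)
        rng.shuffle(shuffled)  # break the link between patterns and labels
        accs.append(nearest_mean_accuracy(train_x, shuffled, test_x, test_y))
    return sum(accs) / len(accs)


train_x = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
train_y = ["A"] * 4 + ["B"] * 4
test_x = [0.05, 0.25, 1.05, 1.25]
test_y = ["A", "A", "B", "B"]

observed = nearest_mean_accuracy(train_x, train_y, test_x, test_y)
chance = empirical_chance(train_x, train_y, test_x, test_y)
print(observed, round(chance, 3))
```

For balanced two-class data such as this, the permutation-based chance level should sit near the theoretical 50%. An observed accuracy would then be compared against this empirical value instead of, or in addition to, the theoretical one.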
Results
For each participant (n = 17), from an initial panel of eight possible food odors we selected one sweet and one savory odor such that they were matched in rated pleasantness (Fig. 1A). We then set a low-intensity and high-intensity version of these odors, establishing a two-factorial design with identity and intensity as factors (Fig. 1B). Independent pleasantness ratings confirmed a main effect of intensity (repeated-measures ANOVA: F(1,16) = 43.4, p = 6.24 × 10⁻⁶) and no difference between either the two high-intensity (paired t test: t(16) = 0.83, p = 0.41) or the two low-intensity (t(16) = 0.53, p = 0.60) odors (repeated-measures ANOVA, intensity-by-identity interaction: F(1,16) = 0.0056, p = 0.94; Fig. 1C). Participants then learned associations between two sets of visual CSs and the selected food odors (USs; Fig. 1D).
Figure 1.
Stimulus selection and classical conditioning. A, The initial set of eight food odors included four sweet and four savory odors. B, A high-intensity and a low-intensity version of each selected odor was established, comprising a two-factorial design with odor identity (purple, sweet; blue, savory) and intensity (dark, high; light, low) as factors. C, Follow-up ratings confirmed that higher-intensity odors were matched in pleasantness (value) and were significantly more pleasant than their low-intensity counterparts (*post hoc paired t tests, sweet high vs sweet low: t(16) = 4.84, p = 8.97 × 10⁻⁵; savory high vs savory low: t(16) = 5.26, p = 3.86 × 10⁻⁵). Error bars depict mean and SEM for n = 17. D, To control for potential confounds attributable to visual stimulus features, each of the four odors was paired with two unique visual symbols (designated set I and set II) in a classical conditioning session. Symbol–odor pairings were randomly assigned for each participant.
On the subsequent day, participants performed a choice task during fMRI scanning involving the previously learned CS/US pairs, first when hungry and then immediately after eating a meal corresponding to one of the two food odors (counterbalanced across subjects; Fig. 2A,B,D). Choice trials either involved a value difference between the two options (HL) or paired both options with the same low-value odor (LL). Importantly, all choices were within-identity, allowing us to observe behavioral and fMRI activity changes specifically related to the sated (SAHL, SALL) and the nonsated (NSHL, NSLL) choice trials (Fig. 2C).
Figure 2.
fMRI scanning paradigm. A, After the presatiety fMRI session, participants were presented with an abundant amount of a single food item (e.g., potato chips) corresponding to one of the two food odors. B, Choice trials started with an offer phase, in which two symbols were shown on either side of a white fixation crosshair (randomly determined). When the crosshair turned green (choice phase), participants were free to choose either symbol with a left or right button press. The outcome phase was initiated by a blue crosshair, cueing the participants to make a sniff to receive the odor paired with the chosen symbol. Participants then rated either the pleasantness or the identity of the odor, followed by a randomly jittered intertrial fixation interval. C, Choices were either high versus low, in which one symbol predicted the high intensity and the other predicted the low intensity of the same odor identity, or low versus low, in which both symbols predicted the same low-intensity odor. Choice types are shown here both as they appeared on the screen and below in schematics denoting the identity of the offered odor (sated, orange; nonsated, green), the intensity of the offered odor (lighter shade is lower intensity), and the set to which the symbol belonged (solid outline, set I; dashed outline, set II). D, Designation of which odor was paired with a meal during the main experiment (orange, odor paired with food, i.e., sated; green, odor not paired with food, i.e., nonsated) was pseudorandomized such that approximately half of the subjects (n = 8) received a sweet meal whereas the other half (n = 9) received a savory meal.
Selective feeding differentially alters choice behavior
Ratings confirmed that participants were hungry and had a strong desire to eat before the meal and felt full and sated after the meal (Fig. 3A). Pleasantness ratings of the SA and NS odors revealed clear evidence for sensory-specific satiety (repeated-measures ANOVA, session-by-odor identity interaction: F(1,16) = 36.3, p = 1.75 × 10−5; Fig. 3B). Post hoc paired t tests demonstrated a reduction in pleasantness after the meal for both the high-intensity (t(16) = 5.92, p = 1.08 × 10−5) and low-intensity (t(16) = 2.22, p = 0.021) versions of the SA odor, whereas there was no change for the NS odor (high and low intensity, p values >0.56). Importantly, there was a session-by-identity-by-intensity interaction (repeated-measures ANOVA: F(1,16) = 6.21, p = 0.024), indicating that the high-intensity SA odor decreased in pleasantness more than the low-intensity SA odor.
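As a hedged illustration of the interaction tests reported above: in a fully within-subject 2 × 2 design, the interaction F(1, n − 1) equals the square of a one-sample t statistic computed on each participant's difference of differences. The sketch below uses made-up ratings for four hypothetical participants, not the study data, and the function and variable names are illustrative only.

```python
import math
import statistics

def interaction_t(pre_sa, post_sa, pre_ns, post_ns):
    """One-sample t test on (post_SA - pre_SA) - (post_NS - pre_NS).

    Returns the t statistic and its degrees of freedom (n - 1).
    In a 2 x 2 within-subject design, t squared equals the interaction F.
    """
    diff_of_diffs = [
        (b - a) - (d - c)
        for a, b, c, d in zip(pre_sa, post_sa, pre_ns, post_ns)
    ]
    n = len(diff_of_diffs)
    mean = statistics.mean(diff_of_diffs)
    sem = statistics.stdev(diff_of_diffs) / math.sqrt(n)
    return mean / sem, n - 1

# Hypothetical pleasantness ratings (NOT the study data)
pre_sa = [6.0, 5.0, 7.0, 4.0]
post_sa = [2.0, 1.0, 4.0, 2.0]
pre_ns = [5.0, 6.0, 6.0, 5.0]
post_ns = [5.0, 5.0, 6.0, 6.0]

t_stat, df = interaction_t(pre_sa, post_sa, pre_ns, post_ns)
f_stat = t_stat ** 2  # interaction F(1, df)
print(f"t({df}) = {t_stat:.2f}, F(1,{df}) = {f_stat:.1f}")
```

By the same identity, the reported session-by-odor identity interaction of F(1,16) = 36.3 corresponds to |t(16)| ≈ 6.02.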
Figure 3.
Behavioral results. A, Ratings of hunger (anchors “extremely hungry” and “not at all hungry”: pre vs post, t(16) = 13.8, p = 1.33 × 10−10), desire to eat (anchors “extreme desire to eat” and “no desire to eat”: pre vs post, t(16) = 11.7, p = 1.43 × 10−9), fullness (anchors “extremely full” and “not at all full”: post vs pre, t(16) = 16.4, p = 9.79 × 10−12), and satiety (anchors “extremely sated” and “not at all sated”: post vs pre, t(16) = 13.3, p = 2.39 × 10−10) were made on a visual analog scale (range, 0–10) immediately before and after the meal. B, Odor pleasantness ratings (anchors −10 = most disliked sensation imaginable, 10 = most liked sensation imaginable) were acquired at the beginning of each scanning session. C, D, Proportion of trials in which the high intensity was chosen for high versus low trials in each fMRI run (C) and averaged across runs for each scanning session (D). E, Trial-by-trial sniff traces, time-locked to the onset of the cued sniff, were sorted by condition and session and averaged across trials and subjects. Measures of sniff amplitude and volume were included as nuisance regressors in run-wise GLMs. F, Choice reaction times were reduced after satiety for both the sated (t(16) = 2.85, p = 0.012) and nonsated (t(16) = 2.37, p = 0.031) odor. Error bars depict mean and SEM for n = 17. *p < 0.05, post hoc t tests.
Analysis of choice behavior during SAHL and NSHL trials revealed that high-intensity odors were preferred throughout the presatiety session (p values <0.001, run-wise t tests on SAHL and NSHL vs 50% chance; p values >0.38, run-wise t tests on SAHL vs NSHL; Fig. 3C). Across sessions, high-intensity choices for SAHL were significantly reduced compared with the NSHL condition (repeated-measures ANOVA, session-by-condition interaction: F(1,16) = 21.8, p = 2.59 × 10−4; p values <0.01, postsatiety run-wise t tests of NSHL vs SAHL; Fig. 3C,D). This effect was evident in the very first choice after satiety (paired t test: t(16) = 2.70, p = 0.008, NSHL vs SAHL proportion high-intensity choice on first postsatiety trial; Fig. 3C, asterisks), indicating model-based updating of value (Gläscher et al., 2010; Daw et al., 2011; Jones et al., 2012), rather than additional learning in the postsatiety session. Notably, participants continued to choose the high-intensity option for the NSHL condition after the meal (p values <0.001, postsatiety run-wise t tests on NS vs chance), demonstrating that the NS odor maintained its value even in the context of general changes in hunger and desire to eat.
We also tested for effects of selective devaluation on sniff behavior, which was recorded throughout the fMRI scanning sessions. By analyzing sniff traces that were time-locked to cued sniff onsets (Fig. 3E), we found a main effect of condition on both sniff peak amplitude (F(1,16) = 6.18, p = 0.024) and sniff volume (F(1,16) = 7.29, p = 0.016). Although there was no condition-by-session interaction in either of these measures that mirrored the sensory-specific changes in pleasantness rating or choice behavior (p values >0.22), we included trial-by-trial measures of sniff amplitude and volume as nuisance regressors in our run-wise GLMs to avoid potential confounds attributable to differences in odor sampling (see Materials and Methods). There was also a main effect of session on choice reaction times (F(1,16) = 9.27, p = 0.0077; Fig. 3F); however, given a lack of condition-by-session interaction (p = 0.51), this effect is unlikely to reflect specific changes in odor reward value.
Identity representations of devalued rewards are altered in lateral pOFC after satiety
We hypothesized that satiety-related changes in odor value are mirrored in specific reward representations in the OFC. To test this, we implemented a pattern-based fMRI analysis using linear SVM classification and a searchlight approach (Kriegeskorte et al., 2006; Kahnt et al., 2010).
In a first step, we focused on the presatiety data and decoded the identity of the expected odors at the offer onset. To decode predicted reward identity independent of visual symbol identity, we trained a classifier on fMRI activity patterns corresponding to HL choice trials with different odor identities in one stimulus set (e.g., SAHL vs NSHL set I presatiety) and tested the classifier on activity patterns from HL choice trials in the second stimulus set (e.g., SAHL vs NSHL set II presatiety; see Materials and Methods; Fig. 4A). This classifier revealed significant above-chance accuracy for reward identity in a medial (x, y, z, coordinates: 10, 16, −20; t(16) = 4.75, pFWE = 0.016) and lateral (22, 14, −24; t(16) = 3.93, pFWE = 0.026) aspect of the pOFC.
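The cross-set logic described above can be sketched as follows. This is a toy illustration only: it swaps in a nearest-centroid classifier for the linear SVM used in the study, omits the searchlight, and runs on simulated four-voxel patterns in which one voxel carries odor identity and another carries a symbol-set offset. Averaging the two train/test directions is an assumption, and all names are hypothetical.

```python
import random

def centroid(patterns):
    """Voxel-wise mean of a list of activity patterns."""
    n = len(patterns)
    return [sum(p[v] for p in patterns) / n for v in range(len(patterns[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(data):
    """data: {label: [pattern, ...]} -> {label: class centroid}."""
    return {label: centroid(pats) for label, pats in data.items()}

def accuracy(model, data):
    """Fraction of test patterns assigned to the correct label."""
    correct = total = 0
    for label, pats in data.items():
        for p in pats:
            predicted = min(model, key=lambda k: sq_dist(p, model[k]))
            correct += predicted == label
            total += 1
    return correct / total

def simulate(identity, set_offset, rng, n_trials=8):
    """Toy patterns: voxel 0 = odor identity, voxel 1 = symbol set, rest = noise."""
    return [
        [identity + rng.uniform(-0.1, 0.1),
         set_offset + rng.uniform(-0.1, 0.1),
         rng.uniform(-0.1, 0.1),
         rng.uniform(-0.1, 0.1)]
        for _ in range(n_trials)
    ]

rng = random.Random(0)
set_1 = {"SAHL": simulate(+1.0, +0.5, rng), "NSHL": simulate(-1.0, +0.5, rng)}
set_2 = {"SAHL": simulate(+1.0, -0.5, rng), "NSHL": simulate(-1.0, -0.5, rng)}

# Train on one symbol set, test on the other, then swap and average.
acc_1_to_2 = accuracy(train(set_1), set_2)
acc_2_to_1 = accuracy(train(set_2), set_1)
cross_set_accuracy = (acc_1_to_2 + acc_2_to_1) / 2
print(cross_set_accuracy)
```

Because the classifier never sees the test symbols during training, above-chance accuracy in this scheme cannot be driven by the visual identity of the symbols. In the toy data the identity signal is large relative to the noise, so every test trial is classified correctly.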
Next, we tested whether reward identity representations in these regions were modulated after the meal by contrasting identity-based classification in presatiety and postsatiety data (Fig. 4B). In both regions, reward identity information was significantly reduced after the meal (paired t tests; medial pOFC: t(16) = 3.59, p = 0.0012; lateral pOFC: t(16) = 1.83, p = 0.043; Fig. 4C). However, although this analysis shows that reward identity information was modulated by satiety, it does not indicate in which representations (SA or NS) these changes occurred.
To specifically test which reward identity representations were altered (or not) by satiety, we conducted two cross-session classification analyses: one for SA and one for NS. These decoding models were trained on activity patterns from the presatiety data in one stimulus set (e.g., SAHL vs NSHL set I presatiety) and tested, separately for SA and NS trials, on activity patterns from the postsatiety data of the second stimulus set (e.g., NSHL set II postsatiety; Fig. 4D). In the lateral pOFC, classification was significantly above chance for the NSHL condition (t(16) = 2.62, p = 0.009) and significantly greater than for the SAHL condition (t(16) = 2.85, p = 0.006), suggesting that activity patterns for the identity of only the SA reward were specifically altered in this region (Fig. 4E). By contrast, in medial pOFC, activity patterns for the expected reward identity of both SA and NS odors were modulated after the meal (paired t tests, medial pOFC: SA, t(16) = 0.28, p = 0.39; NS, t(16) = 0.56, p = 0.29; Fig. 4E) and did not differ between SA and NS (t(16) = −0.32, p = 0.75). The difference between NS and SA cross-session decoding was significantly greater in lateral pOFC than in the anatomically adjacent medial pOFC (paired t test: t(16) = 1.78, p = 0.047), suggesting that specific reward representations in these two regions undergo different devaluation-related changes.
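To make the cross-session logic concrete, the following toy sketch trains a classifier on simulated presatiety patterns from one symbol set and then scores postsatiety patterns from the other set separately for each condition. It again uses a nearest-centroid classifier as a stand-in for the linear SVM, and the simulated patterns (in which the SA code becomes inconsistent across trials after the meal while the NS code is preserved) are assumptions chosen purely for illustration.

```python
import random

def centroid(patterns):
    """Voxel-wise mean of a list of activity patterns."""
    n = len(patterns)
    return [sum(p[v] for p in patterns) / n for v in range(len(patterns[0]))]

def predict(model, pattern):
    """Label of the nearest class centroid (squared Euclidean distance)."""
    def dist(label):
        return sum((x - c) ** 2 for x, c in zip(pattern, model[label]))
    return min(model, key=dist)

def condition_accuracy(model, patterns, true_label):
    """Accuracy for test patterns that all belong to a single condition."""
    hits = sum(predict(model, p) == true_label for p in patterns)
    return hits / len(patterns)

rng = random.Random(1)

def noise():
    return rng.uniform(-0.1, 0.1)

def trial(identity, set_offset):
    # voxel 0: odor identity code; voxel 1: symbol-set offset; voxels 2-3: noise
    return [identity + noise(), set_offset + noise(), noise(), noise()]

# Presatiety, set I: both identities carry a consistent code (+1 = SA, -1 = NS).
pre = {"SA": [trial(+1.0, +0.5) for _ in range(8)],
       "NS": [trial(-1.0, +0.5) for _ in range(8)]}
model = {label: centroid(pats) for label, pats in pre.items()}

# Postsatiety, set II: NS code preserved; SA code flips sign from trial to trial.
post_ns = [trial(-1.0, -0.5) for _ in range(8)]
post_sa = [trial(+0.5 if i % 2 == 0 else -0.5, -0.5) for i in range(8)]

acc_ns = condition_accuracy(model, post_ns, "NS")
acc_sa = condition_accuracy(model, post_sa, "SA")
print(acc_ns, acc_sa)
```

In this toy case the preserved NS code is decoded perfectly across sessions, whereas the altered SA code is decoded at chance (half of the SA trials are labeled NS), which is the qualitative pattern the cross-session analysis is designed to detect.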
Olfactory cortex encodes expected odor identity across sessions
Prior studies of olfactory sensory processing and odor expectation indicate that odor identity is represented in distributed patterns of activity in piriform cortex (PC) (Howard et al., 2009; Stettler and Axel, 2009; Zelano et al., 2011). Although we did not find robust evidence for expected odor identity coding in PC in our initial analysis, we conducted a modified version of the decoding analysis using an anatomically defined PC region of interest and tested whether expected odor identity in this region was represented across both sessions. We found that in right PC, identity-based classification was significantly above chance in both the presatiety (paired t test, t(16) = 2.10, p = 0.026) and postsatiety (t(16) = 2.18, p = 0.022) sessions and did not change across the two (t(16) = 1.22, p = 0.24). This suggests that, whereas representations in pOFC underwent satiety-related changes, those in PC supported stable codes of expected odor identity.
vmPFC encodes general reward value
To test for reward representations that generalize across odor outcome identities, we implemented a separate classification analysis using both the HL and LL choice trials (Fig. 5A). By training on a value-based contrast within one odor identity (e.g., NSHL vs NSLL) and testing on a value-based contrast in the other odor identity (e.g., SAHL vs SALL), this analysis targeted brain regions in which value-based activity patterns are similar for both expected odor identities (Howard et al., 2015). In the presatiety data, we found representations of general reward value at the time of the offer in the bilateral ventral striatum (VS; left VS: −20, 14, −8, t(16) = 8.37, pFWE = 1.34 × 10−4; right VS: 16, 18, −10, t(16) = 9.42, pFWE = 7.48 × 10−5, Fig. 5A). In line with previous studies (Daw et al., 2006; Plassmann et al., 2007; McNamee et al., 2013), we found general value coding at the time of choice in vmPFC (0, 46, −20, t(16) = 5.11, pFWE = 0.016). Note that differences between HL and LL trials may reflect state value signals at the time of the offer and decision values at the time of choice.
By comparing general value coding between presatiety and postsatiety (Fig. 5B), we found that in both the VS and the vmPFC, classification accuracy was reduced after satiety (paired t tests; VS: t(16) = 4.64, p = 1.36 × 10−4; vmPFC: t(16) = 2.60, p = 0.0096; Fig. 5C). A cross-session analysis that tested for general value patterns in the postsatiety data, separately for SA and NS (Fig. 5D), revealed that in both VS and vmPFC, classification was significantly above chance for the NS condition (VS: t(16) = 1.92, p = 0.036; vmPFC: t(16) = 1.90, p = 0.038; Fig. 5E) but not for the SA condition (VS: t(16) = 1.45, p = 0.083; vmPFC: t(16) = 0.96, p = 0.18). However, cross-session accuracy for NS was not significantly greater than for SA in either region (VS: t(16) = −0.11, p = 0.46; vmPFC: t(16) = 0.32, p = 0.38), suggesting that general motivational changes induced by satiety may have blunted general value information in these regions.
Connectivity between lateral pOFC and vmPFC predicts choice behavior
We next tested whether specific reward signals in the lateral pOFC are related to general value signals in the vmPFC by means of functional connections between these two regions. We also tested whether satiety-related changes in choice behavior depended on this functional connection. We predicted that selectively altered reward identity signals in lateral pOFC may be accompanied by a change in functional coupling with vmPFC, reflecting a change in the value of sated rewards. To this end, we implemented a PPI analysis, with vmPFC as the seed region and odor identity and session as psychological variables (see Materials and Methods; Fig. 6A). Indeed, we found a significant session-by-identity interaction on connectivity between vmPFC and lateral pOFC (repeated-measures ANOVA, F(1,16) = 11.37, p = 0.0039; Fig. 6B) that was driven by significantly lower connectivity for SA compared with NS in the postsatiety data (post hoc paired t test: t(16) = 3.58, p = 0.0012) and no significant difference between NS and SA in the presatiety data (t(16) = 0.92, p = 0.37). Most importantly, across subjects, the magnitude of this differential connectivity modulation was predictive of the satiety-related change in choice behavior (Pearson correlation, r = 0.57, p = 0.017; Fig. 6C). In contrast, there was no corresponding change in connectivity between vmPFC and medial pOFC (session-by-identity interaction, F(1,16) = 0.0034, p = 0.95; Fig. 6D), and connectivity was not correlated with changes in choice behavior (Pearson correlation, r = 0.042, p = 0.87; Fig. 6E). These findings suggest that functional coupling between lateral pOFC and vmPFC reflects the value of an expected outcome, with direct relevance for goal-directed behavior.
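As a worked check on the brain-behavior correlation above, a Pearson r can be converted to a t statistic with n − 2 degrees of freedom. The sketch below is illustrative only: the per-participant values are invented, and only the r-to-t conversion is applied to the reported numbers.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson correlation."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical per-participant values (NOT the study data)
connectivity_change = [0.1, 0.4, 0.2, 0.8, 0.5]
choice_change = [0.0, 0.3, 0.1, 0.5, 0.6]
r_toy = pearson_r(connectivity_change, choice_change)

# Conversion applied to the reported correlation (r = 0.57, n = 17)
t_reported = r_to_t(0.57, 17)
print(f"toy r = {r_toy:.2f}; reported r = 0.57 gives t(15) = {t_reported:.2f}")
```

For r = 0.57 and n = 17, the conversion gives t(15) ≈ 2.69, in line with the reported p = 0.017.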
Figure 6.
Functional connectivity analysis. A, vmPFC was used as the seed region in the PPI analysis. Connectivity parameters were averaged across voxels in the medial and lateral pOFC clusters encoding reward identity. B, Connectivity between vmPFC and lateral pOFC was specifically reduced after satiety for the SA condition compared with the NS condition. Condition-by-session interaction, *p < 0.05, repeated-measures ANOVA. C, The condition-by-session interaction of the connectivity between vmPFC and lateral pOFC was significantly correlated with the sensory-specific change in choice behavior. D, E, There was no change in connectivity between vmPFC and medial pOFC (D) and no correlation between these parameters and choice behavior (E). Error bars depict mean and SEM for n = 17.
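The core idea of a PPI model can be sketched as a regression in which the target time course is explained by the seed time course, a psychological variable, and their product; the coefficient on the product term indexes condition-dependent coupling. The example below is a simplified assumption-laden sketch, not the pipeline used in the study: it uses a single psychological variable (NS vs SA) rather than identity and session, noiseless simulated time courses, and no hemodynamic deconvolution or convolution, which PPI implementations typically apply. All names are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(columns, y):
    """Least-squares betas via the normal equations (X'X) b = X'y."""
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)

n_scans = 60
# Toy seed (vmPFC) time course, mean-centered
raw_seed = [math.sin(0.7 * t) + 0.3 * math.cos(0.2 * t) for t in range(n_scans)]
seed_mean = sum(raw_seed) / n_scans
seed = [s - seed_mean for s in raw_seed]
# Psychological variable: +1 during NS blocks, -1 during SA blocks
psych = [1.0 if (t // 10) % 2 == 0 else -1.0 for t in range(n_scans)]
# PPI regressor: element-wise product of seed and psychological variable
ppi = [s * p for s, p in zip(seed, psych)]
const = [1.0] * n_scans

# Toy target (lateral pOFC): coupling with the seed is stronger in NS than SA blocks
target = [0.5 * s + 0.2 * p + 0.8 * i for s, p, i in zip(seed, psych, ppi)]

b_const, b_seed, b_psych, b_ppi = ols([const, seed, psych, ppi], target)
print(f"seed beta = {b_seed:.3f}, PPI beta = {b_ppi:.3f}")
```

A positive PPI coefficient here means that seed-target coupling is stronger in NS than in SA blocks; in the study, such connectivity parameters were averaged across voxels within the medial and lateral pOFC clusters and then compared across sessions and conditions.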
Discussion
Goal-directed behavior requires predictive neural representations of specific rewards that can be independently modulated to reflect changes in value. Such signals have been identified in the OFC across species (Burke et al., 2008; Sescousse et al., 2010; Klein-Flügge et al., 2013; McDannald et al., 2014; Stalnaker et al., 2014; Howard et al., 2015; Boorman et al., 2016). Importantly, updating of these representations must occur on the fly, and thus without the need for additional stimulus–outcome learning (Dickinson and Balleine, 1994). Here we used a devaluation paradigm to induce a reduction in the value of one of two distinct food odors. Using multivariate pattern analysis techniques, we show that in a hungry state the identity of these odor rewards is encoded in medial and lateral aspects of the pOFC. After selective satiety, both sated and nonsated reward representations were modulated in medial pOFC, whereas in lateral pOFC the identity-specific representation of only the sated odor was altered.
Prior studies using devaluation tasks across a variety of model species have identified OFC as a key substrate for mediating goal-directed behavior (Rolls et al., 1989; Gallagher et al., 1999; O'Doherty et al., 2000; Gottfried et al., 2003; Pickens et al., 2003; Valentin et al., 2007; Gremel and Costa, 2013; Rhodes and Murray, 2013; Murray et al., 2015). However, whether satiety-related modulations in the OFC are indicative of changes in general reward value, or whether they are tied to the unique sensory properties of the outcome, remained unknown. Here we show that devaluation modulates identity-specific representations of sated food odors in the lateral pOFC. It is important to point out that these findings reflect not merely changes in neural value codes but changes in how prospective outcome identity is encoded after devaluation.
At the level of neuronal ensembles, reward identity representations could reflect the firing of cell populations coding for the specific identity of the expected outcome. Such responses have previously been identified in single units in both rodent (Stalnaker et al., 2014) and monkey (Padoa-Schioppa and Assad, 2006) OFC. Our results indicate that satiety selectively alters outcome identity representations. Interestingly, a follow-up univariate analysis of fMRI activity evoked by SA and NS offers in medial and lateral pOFC revealed no evidence for mean signal changes or interactions in either region (p values >0.23). Thus satiety-related alterations for SA outcomes could reflect a reduced number of cells coding for the devalued outcome, a narrowed coding range in these neurons, or less consistent firing across trials. Although our results cannot distinguish between these possibilities, they indicate that devaluation acts directly on OFC representations of specific goals and not merely on value signals associated with them.
It is important to note that we did not explicitly control for potential differences in which stimulus was attended in the offer phase. Thus, it is possible that whereas participants attended to CS predicting NSH presatiety and postsatiety, they may have attended to CS predicting SAH presatiety, but attended to CS predicting SAL postsatiety. In principle, such a change in attention specific to stimuli in the SA condition could explain the differential cross-session decoding results in Figure 4, D and E. However, based on the results presented in Figure 4, B and C, we believe that this interpretation is not very likely. Specifically, if participants were selectively attending to (and reliably representing) SAL during the postsatiety session, cross-session decoding for SA would indeed decrease, but we would still expect to see significant identity decoding when training and testing a classifier within the postsatiety phase. We did not observe significant identity decoding in the postsatiety session (Fig. 4B,C), suggesting that potential changes in attention are unlikely to account for the observed changes in how outcome identity was represented after satiety (Fig. 4D,E).
In contrast to identity-specific signals in pOFC, we found that vmPFC represents different rewards using a general reward value code. This is in line with human imaging studies showing value processing in vmPFC (Plassmann et al., 2007; Chib et al., 2009; Lebreton et al., 2009; Levy and Glimcher, 2011; McNamee et al., 2013; Howard et al., 2015), as well as studies showing that lesions to the vmPFC and adjacent OFC impair value-based choices (Bechara et al., 2000; Izquierdo et al., 2004; Fellows and Farah, 2007; Camille et al., 2011). However, to our knowledge no human or animal study has tested the effects of vmPFC lesions on behavior in a devaluation task, and it is therefore unclear whether vmPFC is necessary for goal-directed behavior.
A fundamental question is how representations of specific rewards are linked with general reward signals in vmPFC to control goal-directed behavior. Here we show that functional connectivity between the lateral pOFC and vmPFC was specifically modulated by satiety, indicating that the strength of coupling between these regions reflects the value of the expected outcome. Supporting this notion, functional connectivity between these regions was predictive of satiety-related changes in choice behavior. These findings are in line with the idea that lateral OFC is associated with specific reward information, whereas vmPFC and anterior cingulate implement value comparisons to guide choice behavior (Rushworth et al., 2012).
Our findings support a recent proposal that OFC represents a cognitive map for reinforcement learning (Wilson et al., 2014; Schuck et al., 2016). According to this framework, value is not necessarily a component of OFC representations but is assigned to the represented states via connections to other brain areas. Our results extend this idea by suggesting that connectivity to vmPFC is a potential substrate for assigning value to OFC state representations, and by suggesting that devalued goals may be less coherently represented in the OFC state space and may therefore evoke weaker value signals in vmPFC. The implication is that by preferentially signaling valued states, the OFC plays a fundamentally affirmative role in decision making (Rudebeck and Murray, 2014).
Previous work has suggested that updating of reward associations involves interactions between OFC and amygdala (Baxter et al., 2000; Rhodes and Murray, 2013; Zeeb and Winstanley, 2013). Here we did not find specific changes in outcome representations in the amygdala, and functional connectivity between the amygdala and the pOFC was not modulated by sensory-specific satiety. This may be explained by differences in study design in that our experiment involved instrumental choices rather than pavlovian responding, which has previously been linked to amygdala activity (Gottfried et al., 2003; Prévost et al., 2013). However, this may also reflect the fact that amygdala–OFC interactions are important during the satiety phase of devaluation but not afterward during choices (Wellman et al., 2005).
We show that identity-specific representations in medial and lateral pOFC undergo distinct satiety-related changes. Both of these regions are located in caudal OFC, which has been implicated in food reward identity coding using methods that similarly control for value such as those used here (Klein-Flügge et al., 2013). Given the direct input from olfactory and gustatory sensory cortex into caudal OFC, this region is well positioned to integrate olfactory information into representations of identity-specific food rewards (Carmichael and Price, 1995; Rolls, 2015). [Note that reward identity signals in posterior OFC contrast with identity-specific value signals in anterior OFC (Howard et al., 2015), as well as with studies showing that anterior OFC tracks satiety-related changes in reward value (Small et al., 2001; Gottfried et al., 2003), raising the possibility that anterior and posterior OFC differentially contribute to goal-directed behavior (Murray et al., 2015).] However, based on human cytoarchitecture, our medial and lateral pOFC clusters lie in distinct locations corresponding to regions Fo2 and Fo3, respectively (Henssen et al., 2016). In terms of anatomical connectivity, primate tracing studies have identified a “medial” network associated with limbic and visceromotor functions and an “orbital” network involved in sensory integration and association (Carmichael and Price, 1996). Given the close homology between primate and human OFC (Mackey and Petrides, 2010; Wallis, 2011; Neubert et al., 2015), our medial pOFC cluster likely lies in the medial network, whereas the lateral pOFC cluster lies at the intersection between the two networks. Access to both sensory and limbic substrates may therefore enable lateral pOFC to differentially update specific reward representations.
In contrast to lateral pOFC, the medial pOFC exhibited a nonspecific change after satiety. Such uniform changes may be related to hunger level and could be implemented by dopaminergic signaling. Indeed, fMRI signals in posteromedial pOFC have been shown to correlate with midbrain activity (Kahnt et al., 2012) in a dopamine-dependent manner (Kahnt and Tobler, 2017), and midbrain-derived dopamine levels are reduced after feeding in the rodent VS and medial prefrontal cortex (Ahn and Phillips, 1999; Roitman et al., 2004; de Araujo et al., 2012). Although speculative, this may indicate that nonspecific changes in medial pOFC depend on satiety-induced changes in dopamine, which may ultimately result in altered representations for both food odors. This nonspecific, homeostatic system could operate in parallel with the goal-directed pathway in lateral pOFC/vmPFC.
In summary, our results suggest that representations of specific rewards in medial and lateral pOFC undergo differential devaluation-related changes. In addition, they provide novel evidence for a mechanism by which these outcome-specific substrates are functionally connected to vmPFC to support adaptive choice behavior.
Footnotes
This work was supported by National Institute of Neurological Disorders and Stroke Grant T32NS047987 (J.D.H.) and National Institute on Deafness and Other Communication Disorders Grant R01DC015426 (T.K.). We thank International Flavors and Fragrances (S. Warrenburg and A. Dumer) for providing food odorants, Rachel Reynolds for assistance in fMRI data acquisition, and Jay A. Gottfried for helpful comments on this manuscript.
The authors declare no competing financial interests.
References
- Ahn S, Phillips AG (1999) Dopaminergic correlates of sensory-specific satiety in the medial prefrontal cortex and nucleus accumbens of the rat. J Neurosci 19:RC29(1–6).
- Balleine BW, Dickinson A (1998) Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37:407–419. 10.1016/S0028-3908(98)00033-1
- Baxter MG, Parker A, Lindner CC, Izquierdo AD, Murray EA (2000) Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex. J Neurosci 20:4311–4319.
- Bechara A, Tranel D, Damasio H (2000) Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions. Brain 123:2189–2202.
- Boorman ED, Rajendran VG, O'Reilly JX, Behrens TE (2016) Two anatomically and computationally distinct learning signals predict changes to stimulus-outcome associations in hippocampus. Neuron 89:1343–1354. 10.1016/j.neuron.2016.02.014
- Burke KA, Franz TM, Miller DN, Schoenbaum G (2008) The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature 454:340–344. 10.1038/nature06993
- Camille N, Griffiths CA, Vo K, Fellows LK, Kable JW (2011) Ventromedial frontal lobe damage disrupts value maximization in humans. J Neurosci 31:7527–7532. 10.1523/JNEUROSCI.6527-10.2011
- Cardinal RN, Parkinson JA, Hall J, Everitt BJ (2002) Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex. Neurosci Biobehav Rev 26:321–352.
- Carmichael ST, Price JL (1995) Sensory and premotor connections of the orbital and medial prefrontal cortex of macaque monkeys. J Comp Neurol 363:642–664. 10.1002/cne.903630409
- Carmichael ST, Price JL (1996) Connectional networks within the orbital and medial prefrontal cortex of macaque monkeys. J Comp Neurol 371:179–207. 10.1002/(SICI)1096-9861(19960722)371:2%3C179::AID-CNE1%3E3.0.CO;2-%23
- Chib VS, Rangel A, Shimojo S, O'Doherty JP (2009) Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J Neurosci 29:12315–12320. 10.1523/JNEUROSCI.2575-09.2009
- Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441:876–879. 10.1038/nature04766
- Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ (2011) Model-based influences on humans' choices and striatal prediction errors. Neuron 69:1204–1215. 10.1016/j.neuron.2011.02.027
- de Araujo IE, Ferreira JG, Tellez LA, Ren X, Yeckel CW (2012) The gut-brain dopamine axis: a regulatory system for caloric intake. Physiol Behav 106:394–399. 10.1016/j.physbeh.2012.02.026
- Dickinson A, Balleine B (1994) Motivational control of goal-directed action. Anim Learn Behav 22:1–18. 10.3758/BF03199951
- Fellows LK, Farah MJ (2007) The role of ventromedial prefrontal cortex in decision making: judgment under uncertainty or judgment per se? Cereb Cortex 17:2669–2674. 10.1093/cercor/bhl176
- Gallagher M, McMahan RW, Schoenbaum G (1999) Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci 19:6610–6614.
- Gläscher J, Daw N, Dayan P, O'Doherty JP (2010) States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66:585–595. 10.1016/j.neuron.2010.04.016
- Gottfried JA, O'Doherty J, Dolan RJ (2003) Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301:1104–1107. 10.1126/science.1087919
- Gremel CM, Costa RM (2013) Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun 4:2264. 10.1038/ncomms3264
- Hampton AN, O'Doherty JP (2007) Decoding the neural substrates of reward-related decision making with functional MRI. Proc Natl Acad Sci U S A 104:1377–1382. 10.1073/pnas.0606297104
- Henssen A, Zilles K, Palomero-Gallagher N, Schleicher A, Mohlberg H, Gerboga F, Eickhoff SB, Bludau S, Amunts K (2016) Cytoarchitecture and probability maps of the human medial orbitofrontal cortex. Cortex 75:87–112. 10.1016/j.cortex.2015.11.006
- Howard JD, Plailly J, Grueschow M, Haynes JD, Gottfried JA (2009) Odor quality coding and categorization in human posterior piriform cortex. Nat Neurosci 12:932–938. 10.1038/nn.2324
- Howard JD, Gottfried JA, Tobler PN, Kahnt T (2015) Identity-specific coding of future rewards in the human orbitofrontal cortex. Proc Natl Acad Sci U S A 112:5195–5200. 10.1073/pnas.1503550112
- Howard JD, Kahnt T, Gottfried JA (2016) Converging prefrontal pathways support associative and perceptual features of conditioned stimuli. Nat Commun 7:11546. 10.1038/ncomms11546
- Izquierdo A, Suda RK, Murray EA (2004) Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci 24:7540–7548. 10.1523/JNEUROSCI.1921-04.2004
- Jones JL, Esber GR, McDannald MA, Gruber AJ, Hernandez A, Mirenzi A, Schoenbaum G (2012) Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science 338:953–956. 10.1126/science.1227489
- Kahnt T, Heinzle J, Park SQ, Haynes JD (2010) The neural code of reward anticipation in human orbitofrontal cortex. Proc Natl Acad Sci U S A 107:6010–6015. 10.1073/pnas.0912838107
- Kahnt T, Chang LJ, Park SQ, Heinzle J, Haynes JD (2012) Connectivity-based parcellation of the human orbitofrontal cortex. J Neurosci 32:6240–6250. 10.1523/JNEUROSCI.0257-12.2012
- Kahnt T, Park SQ, Haynes JD, Tobler PN (2014) Disentangling neural representations of value and salience in the human brain. Proc Natl Acad Sci U S A 111:5000–5005. 10.1073/pnas.1320189111
- Kahnt T, Tobler PN (2017) Dopamine modulates the functional organization of the orbitofrontal cortex. J Neurosci 37:1493–1504. 10.1523/jneurosci.2827-16.2016
- Kennerley SW, Behrens TE, Wallis JD (2011) Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat Neurosci 14:1581–1589. 10.1038/nn.2961
- Klein-Flügge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TE (2013) Segregated encoding of reward-identity and stimulus-reward associations in human orbitofrontal cortex. J Neurosci 33:3202–3211. 10.1523/JNEUROSCI.2532-12.2013
- Kriegeskorte N, Goebel R, Bandettini P (2006) Information-based functional brain mapping. Proc Natl Acad Sci U S A 103:3863–3868. 10.1073/pnas.0600244103 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lebreton M, Jorge S, Michel V, Thirion B, Pessiglione M (2009) An automatic valuation system in the human brain: evidence from functional neuroimaging. Neuron 64:431–439. 10.1016/j.neuron.2009.09.040 [DOI] [PubMed] [Google Scholar]
- Levy DJ, Glimcher PW (2011) Comparing apples and oranges: using reward-specific and reward-general subjective value representation in the brain. J Neurosci 31:14693–14707. 10.1523/JNEUROSCI.2218-11.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mackey S, Petrides M (2010) Quantitative demonstration of comparable architectonic areas within the ventromedial and lateral orbital frontal cortex in the human and the macaque monkey brains. Eur J Neurosci 32:1940–1950. 10.1111/j.1460-9568.2010.07465.x [DOI] [PubMed] [Google Scholar]
- McDannald MA, Esber GR, Wegener MA, Wied HM, Liu TL, Stalnaker TA, Jones JL, Trageser J, Schoenbaum G (2014) Orbitofrontal neurons acquire responses to “valueless” Pavlovian cues during unblocking. Elife 3:e02653. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McLaren DG, Ries ML, Xu G, Johnson SC (2012) A generalized form of context-dependent psychophysiological interactions (gPPI): a comparison to standard approaches. Neuroimage 61:1277–1286. 10.1016/j.neuroimage.2012.03.068 [DOI] [PMC free article] [PubMed] [Google Scholar]
- McNamee D, Rangel A, O'Doherty JP (2013) Category-dependent and category-independent goal-value codes in human ventromedial prefrontal cortex. Nat Neurosci 16:479–485. 10.1038/nn.3337 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murray EA, Moylan EJ, Saleem KS, Basile BM, Turchi J (2015) Specialized areas for value updating and goal selection in the primate orbitofrontal cortex. Elife 4:e11695. 10.7554/eLife.11695 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Neubert FX, Mars RB, Sallet J, Rushworth MF (2015) Connectivity reveals relationship of brain areas for reward-guided learning and decision making in human and monkey frontal cortex. Proc Natl Acad Sci U S A 112:E2695–E2704. 10.1073/pnas.1410767112
- O'Doherty J, Rolls ET, Francis S, Bowtell R, McGlone F, Kobal G, Renner B, Ahne G (2000) Sensory-specific satiety-related olfactory activation of the human orbitofrontal cortex. Neuroreport 11:893–897. 10.1097/00001756-200003200-00046
- O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ (2002) Neural responses during anticipation of a primary taste reward. Neuron 33:815–826. 10.1016/S0896-6273(02)00603-7
- O'Doherty JP, Cockburn J, Pauli WM (2017) Learning, reward, and decision making. Annu Rev Psychol 68:73–100. 10.1146/annurev-psych-010416-044216
- Padoa-Schioppa C, Assad JA (2006) Neurons in the orbitofrontal cortex encode economic value. Nature 441:223–226. 10.1038/nature04676
- Pickens CL, Saddoris MP, Setlow B, Gallagher M, Holland PC, Schoenbaum G (2003) Different roles for orbitofrontal cortex and basolateral amygdala in a reinforcer devaluation task. J Neurosci 23:11078–11084.
- Plassmann H, O'Doherty J, Rangel A (2007) Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci 27:9984–9988. 10.1523/JNEUROSCI.2131-07.2007
- Prévost C, McNamee D, Jessup RK, Bossaerts P, O'Doherty JP (2013) Evidence for model-based computations in the human amygdala during Pavlovian conditioning. PLoS Comput Biol 9:e1002918. 10.1371/journal.pcbi.1002918
- Rhodes SE, Murray EA (2013) Differential effects of amygdala, orbital prefrontal cortex, and prelimbic cortex lesions on goal-directed behavior in rhesus macaques. J Neurosci 33:3380–3389. 10.1523/JNEUROSCI.4374-12.2013
- Rich EL, Wallis JD (2016) Decoding subjective decisions from orbitofrontal cortex. Nat Neurosci 19:973–980. 10.1038/nn.4320
- Roitman MF, Stuber GD, Phillips PE, Wightman RM, Carelli RM (2004) Dopamine operates as a subsecond modulator of food seeking. J Neurosci 24:1265–1271. 10.1523/JNEUROSCI.3823-03.2004
- Rolls ET. (2015) Taste, olfactory, and food reward value processing in the brain. Prog Neurobiol 127–128:64–90. 10.1016/j.pneurobio.2015.03.002
- Rolls ET, Sienkiewicz ZJ, Yaxley S (1989) Hunger modulates the responses to gustatory stimuli of single neurons in the caudolateral orbitofrontal cortex of the macaque monkey. Eur J Neurosci 1:53–60. 10.1111/j.1460-9568.1989.tb00774.x
- Rudebeck PH, Murray EA (2014) The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron 84:1143–1156. 10.1016/j.neuron.2014.10.049
- Rudebeck PH, Saunders RC, Prescott AT, Chau LS, Murray EA (2013) Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nat Neurosci 16:1140–1145. 10.1038/nn.3440
- Rushworth MF, Kolling N, Sallet J, Mars RB (2012) Valuation and decision-making in frontal cortex: one or many serial or parallel systems? Curr Opin Neurobiol 22:946–955. 10.1016/j.conb.2012.04.011
- Schoenbaum G, Chiba AA, Gallagher M (1998) Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci 1:155–159. 10.1038/407
- Schuck NW, Cai MB, Wilson RC, Niv Y (2016) Human orbitofrontal cortex represents a cognitive map of state space. Neuron 91:1402–1412. 10.1016/j.neuron.2016.08.019
- Sescousse G, Redouté J, Dreher JC (2010) The architecture of reward value coding in the human orbitofrontal cortex. J Neurosci 30:13095–13104. 10.1523/JNEUROSCI.3501-10.2010
- Small DM, Zatorre RJ, Dagher A, Evans AC, Jones-Gotman M (2001) Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain 124:1720–1733. 10.1093/brain/124.9.1720
- Stalnaker TA, Cooch NK, McDannald MA, Liu TL, Wied H, Schoenbaum G (2014) Orbitofrontal neurons infer the value and identity of predicted outcomes. Nat Commun 5:3926.
- Stettler DD, Axel R (2009) Representations of odor in the piriform cortex. Neuron 63:854–864. 10.1016/j.neuron.2009.09.005
- Strait CE, Blanchard TC, Hayden BY (2014) Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron 82:1357–1366. 10.1016/j.neuron.2014.04.032
- Tremblay L, Schultz W (1999) Relative reward preference in primate orbitofrontal cortex. Nature 398:704–708. 10.1038/19525
- Valentin VV, Dickinson A, O'Doherty JP (2007) Determining the neural substrates of goal-directed learning in the human brain. J Neurosci 27:4019–4026. 10.1523/JNEUROSCI.0564-07.2007
- Wallis JD. (2011) Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci 15:13–19. 10.1038/nn.2956
- Wellman LL, Gale K, Malkova L (2005) GABAA-mediated inhibition of basolateral amygdala blocks reward devaluation in macaques. J Neurosci 25:4577–4586. 10.1523/JNEUROSCI.2257-04.2005
- Wilson RC, Takahashi YK, Schoenbaum G, Niv Y (2014) Orbitofrontal cortex as a cognitive map of task space. Neuron 81:267–279. 10.1016/j.neuron.2013.11.005
- Zeeb FD, Winstanley CA (2013) Functional disconnection of the orbitofrontal cortex and basolateral amygdala impairs acquisition of a rat gambling task and disrupts animals' ability to alter decision-making behavior after reinforcer devaluation. J Neurosci 33:6434–6443. 10.1523/JNEUROSCI.3971-12.2013
- Zelano C, Mohanty A, Gottfried JA (2011) Olfactory predictive codes and stimulus templates in piriform cortex. Neuron 72:178–187. 10.1016/j.neuron.2011.08.010