Proceedings of the National Academy of Sciences of the United States of America. 2010 Mar 15;107(13):6010–6015. doi: 10.1073/pnas.0912838107

The neural code of reward anticipation in human orbitofrontal cortex

Thorsten Kahnt a,b,c,1, Jakob Heinzle a, Soyoung Q Park c,d, John-Dylan Haynes a,b,c,e,1
PMCID: PMC2851854  PMID: 20231475

Abstract

An optimal choice among alternative behavioral options requires precise anticipatory representations of their possible outcomes. A fundamental question is how such anticipated outcomes are represented in the brain. Reward coding at the level of single cells in the orbitofrontal cortex (OFC) follows a more heterogeneous coding scheme than suggested by studies using functional MRI (fMRI) in humans. Using a combination of multivariate pattern classification and fMRI we show that the reward value of sensory cues can be decoded from distributed fMRI patterns in the OFC. This distributed representation is compatible with previous reports from animal electrophysiology that show that reward is encoded by different neural populations with opposing coding schemes. Importantly, the fMRI patterns representing specific values during anticipation are similar to those that emerge during the receipt of reward. Furthermore, we show that the degree of this coding similarity is related to subjects’ ability to use value information to guide behavior. These findings narrow the gap between reward coding in humans and animals and corroborate the notion that value representations in OFC are independent of whether reward is anticipated or actually received.

Keywords: expected reward value, functional MRI, multivariate decoding, distributed coding, decision making


Decisions are influenced by the reward value we expect to obtain from choosing different options. One key cortical region for reward processing is the orbitofrontal cortex (OFC) (1–5). Lesions of the OFC lead to continued responding in reinforcer devaluation tasks (6–8), whereas normal animals show decreased conditioned responding if the outcome of a specific response is devalued after training. Consistent with these findings, electrophysiological data show that single-unit activity in OFC signals reward expectancies (9–12) that change during reversal and devaluation (13–15).

Human fMRI studies have shown positive correlations between reward value and fMRI signals in OFC, where activity increases with increasing expected reward (16–21) and decreases after devaluation of the predicted outcome (22). On a macroscopic level, some neuroimaging studies suggest a large-scale functional distinction between different OFC subregions; whereas medial regions respond to rewards, lateral regions are thought to respond to punishments (2, 23). However, electrophysiological studies in animals suggest a more complicated link between reward value and neural activity. Single-unit recordings in rat and monkey OFC have shown that different local subpopulations of neurons either increase or decrease their firing rates with increasing reward (5, 24–27). Thus, OFC appears to contain different neural subpopulations with opposing response schemes for value. Such findings challenge the sensitivity of methods like fMRI that sample average signals across extended brain regions and thus might not be able to access the full information about value signals in prefrontal areas (25). Furthermore, multiunit recordings in animals suggest that OFC might encode reward value in a population rather than in a single-cell code (28–30). The information encoded in the distributed activity of cell populations will not be adequately accessible to conventional neuroimaging approaches using artificially smoothed data at independent spatial positions. More recently, multivariate pattern recognition has emerged as a unique approach to investigate the information present in patterns of fMRI signals (31, 32). The term “fMRI pattern” is considered in these approaches as the spatial response profile of a given set of fMRI voxels. Pattern recognition could potentially allow us to assess the value-related information in the entire ensemble of OFC voxels without requiring specific assumptions about the single-neuron coding scheme.

Understanding how value is represented in OFC as measured with fMRI is particularly important to assess whether the representation is similar during anticipation and receipt of reward. OFC activity has been proposed to act as working memory for expected rewards in a dorsolateral prefrontal cortex-like fashion (33, 34). However, previous fMRI studies have yielded different views on the link between anticipation and receipt of reward. Some human fMRI studies have shown that OFC activity correlates with the value of both reward-predicting cues and rewarding outcomes (16–19). Others have suggested that anticipation and receipt differentially recruit distinct brain regions (35, 36). Pattern recognition techniques could potentially resolve this issue by directly testing whether similar spatial response topographies represent reward value during anticipation and receipt.

Here we investigate to what degree information about reward value can be decoded from fMRI patterns in the human OFC. Specifically, we asked (i) whether fMRI patterns contain information about value during anticipation and (ii) whether these patterns are similar during anticipation and receipt of reward. To address these questions, we probed the information in fMRI patterns during a simple decision-making task by means of multivariate decoding techniques (31, 32). In each trial of the task, subjects saw a rotating ensemble of colored dots for 2 s (Fig. 1A). Rotation coherence ranged stepwise from 100% counterclockwise (CCW) to 100% clockwise (CW), and color ranged from 100% green (G) to 100% red (R). Using cues defined by these two feature dimensions (color, rotation) allowed us to dissociate reward representations from sensory representations following a logical XOR. For each subject, only specific conjunctions of color and rotation were predictive of a certain reward value, whereas the individual sensory features were not correlated with the reward value at all. Figure 1B shows one example of the association between sensory cues and reward value. After a variable time interval (4–8 s), subjects had to report either the rotation direction or the color of the dots (randomized). Given a correct response, the value (ranging from 0 to 10 points) of that particular cue was delivered to the subject as reward (Fig. 1A).
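The XOR logic of the cue design can be illustrated with a minimal sketch. It assumes a reduced 2 × 2 version of the design with binary value labels (1 = high, 0 = low) and the example pairing of Fig. 1B; the function names `value_of` and `mean_value_by` are hypothetical and not part of the study's materials.

```python
# Reduced 2 x 2 illustration of the XOR cue-value mapping (example pairing
# only; the actual pairings were counterbalanced across subjects).
COLORS = ("G", "R")
ROTATIONS = ("CW", "CCW")


def value_of(color, rotation):
    """Return 1 (high value) or 0 (low value) for a color/rotation conjunction."""
    # High value for G&CW and R&CCW, low value for G&CCW and R&CW.
    return int((color == "G") == (rotation == "CW"))


def mean_value_by(feature_index):
    """Average value per level of a single feature (0 = color, 1 = rotation)."""
    cues = [(c, r) for c in COLORS for r in ROTATIONS]
    levels = sorted({cue[feature_index] for cue in cues})
    result = {}
    for level in levels:
        matching = [cue for cue in cues if cue[feature_index] == level]
        result[level] = sum(value_of(*cue) for cue in matching) / len(matching)
    return result


print(mean_value_by(0))  # average value for each color alone
print(mean_value_by(1))  # average value for each rotation direction alone
```

In this toy version, each single feature level is paired equally often with high and low value, so only the conjunction of color and rotation is informative.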

Fig. 1.

Experimental design and behavioral results. (A) After presentation of a sensory cue, subjects had to judge either the main rotation direction or the color of the dots (randomized). The reward outcome was delivered after correct responses. (B) Sensory cues consisted of colored rotating dots that were associated with reward value in a logical XOR fashion. The combination of rotation direction and color was reward predicting, whereas color and rotation direction alone were not informative about the outcome. Two stimulus combinations that do not share any sensory properties predict high rewards (e.g., G&CW and R&CCW) and two predict low rewards (e.g., G&CCW and R&CW). CW, clockwise; CCW, counterclockwise. An example pairing is shown here (the actual pairings were counterbalanced across subjects). All 16 cells were presented once in each run. To compensate for the overrepresentation of intermediate values, the four extreme values were presented two additional times each (16 + 2 × 4 = 24 trials). (C) Subjective ratings from a postscanning rating session increased as a function of reward value, suggesting that subjects indeed were aware of the link between cues and reward levels. All differences are significant (P < 0.001, Bonferroni corrected). Error bars for SEM are smaller than the symbols.

In our analyses, we used multivariate pattern classification to decode the reward value of the sensory cues from fMRI patterns during anticipation. Furthermore, we trained the classifier on patterns during receipt of reward and tested its performance on patterns during anticipation. This allowed us to identify brain regions in which similar fMRI patterns correlate with value during anticipation as well as receipt of reward.

Results

Behavioral Results.

During the experiment, task performance (93.5 ± 1.38% correct, mean ± SEM) and reaction time (RT, 685 ± 15 ms) were significantly modulated by the reward value of the sensory cues (mean regression coefficient b = 0.28, t = 2.73, P < 0.05 and b = −1.48, t = −4.70, P < 0.05, for percent correct and RT, respectively). Thus, subjects were more accurate and faster with increasing reward value, indicating that they established reward expectations (10). In a postscanning rating session, subjects were asked to rate the value of each cue on a continuous rating scale. Subjective ratings increased with the value of the cues [Fig. 1C; ANOVA, main effect of value; F(3, 45) = 232.50, P < 0.001], with significant differences between all value levels (P < 0.001, Bonferroni corrected). Within each subject, subjective ratings were significantly explained by a linear model of the reward value (all Fs > 21.95, all R2 values > 0.37, P < 0.001). Hence, subjects learned the association between the sensory cues and the reward value that they predict.

Neuroimaging Results.

Focusing on the fMRI data, we asked whether distributed fMRI patterns contain information about reward value during anticipation independent of the sensory properties of the cue. We systematically searched through the entire brain for information about reward value. This was done using a searchlight decoding technique (37, 38) that investigates how much information can be extracted from any local spherical cluster of voxels (radius = 4 voxels; Fig. 2 and Materials and Methods). In a region that represents reward value independent of the sensory properties of the cue, the fMRI pattern to a cue with a specific value should be the same as that of a second cue with the same reward value but different sensory properties. To test this, we trained a support vector classifier (SVC) on one pair of high- vs. low-value cues (e.g., G&CW vs. R&CW) and predicted the value of a different pair of high- vs. low-value cues (e.g., R&CCW vs. G&CCW). Importantly, to ensure complete independence, training and test data were taken from different scanning runs. A 10-fold leave-one-run-out cross-validation procedure was performed for all possible training-test combinations (Fig. 2B).
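The local spherical cluster used by a searchlight can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: distance is measured in voxel units with an inclusive boundary, the volume shape (64 × 64 × 33) is taken from the acquisition parameters, and the function names are hypothetical.

```python
from itertools import product


def sphere_offsets(radius):
    """Integer voxel offsets (dx, dy, dz) within `radius` voxels of the center."""
    span = range(-radius, radius + 1)
    return [
        (dx, dy, dz)
        for dx, dy, dz in product(span, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]


def searchlight_voxels(center, shape, radius=4):
    """Coordinates of the local cluster around `center`, clipped to the volume."""
    cx, cy, cz = center
    cluster = []
    for dx, dy, dz in sphere_offsets(radius):
        x, y, z = cx + dx, cy + dy, cz + dz
        # Keep only voxels that lie inside the imaged volume.
        if 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
            cluster.append((x, y, z))
    return cluster


# Assumed volume shape: 64 x 64 in-plane matrix, 33 slices.
cluster = searchlight_voxels(center=(32, 32, 16), shape=(64, 64, 33))
print(len(cluster))  # number of voxels contributing to one local pattern
```

Under these assumptions a full sphere of radius 4 voxels contains 257 voxels, so each local pattern vector has at most 257 dimensions; clusters near the edge of the volume are smaller.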

Fig. 2.

Multivariate pattern classification. (A) Cues were sorted into four groups according to their sensory properties. Two cues predicted a high reward (7 and 10 points), whereas the remaining two predicted a low reward (0 and 3 points). (B) A support vector classifier (SVC) was trained on "training" data from nine scanning runs to classify fMRI patterns evoked by one specific pair of low- vs. high-value cues (e.g., G&CCW vs. R&CCW). From the remaining test data set (run 10), fMRI patterns to cues that also predicted low and high values but had different sensory properties were used to test the performance of the SVC (e.g., R&CW vs. G&CW). In total, this procedure was performed on four different training-test pairs each time as a 10-fold leave-one-out cross-validation. (C) We searched in every local cluster of brain activity for information about the reward value during anticipation using a searchlight approach (37, 38). For every voxel in the brain, the fMRI patterns in the local cluster surrounding this voxel were extracted for each cue and each scanning run separately. Then the decoding procedure described in B was performed on that data.

We found several brain regions with significant decoding accuracy [P < 0.05, familywise error (FWE) corrected for the whole brain]. These included the medial OFC (BA 11; Fig. 3), the ventral striatum, the dorsomedial and dorsolateral PFC, the frontopolar cortex, the precuneus, the lateral posterior parietal cortex, and the amygdala/hippocampus (Fig. S1). Our analysis decoded information about expected reward independent of the sensory properties of the cue. In contrast to OFC, decoding accuracy in visual areas followed the sensory properties of the cues (Fig. S2). It is important to note that these findings by no means exclude the possibility that there might be sensory specific signaling in the OFC as well. Single OFC neurons could still signal sensory properties that we cannot detect using our method due to the lack of spatial resolution of the fMRI signal compared with single-unit recordings.

Fig. 3.

Decoding of reward value during anticipation. (A) Distributed fMRI patterns in the medial OFC [MNI coordinates: (3, 33, −6), t = 6.90] and the ventral striatum [VS (6, 6, −6), t = 5.65] represent the value of anticipated outcomes independent of the sensory properties of the cues. T map based on the decoding accuracies of all four training-test pairs is thresholded at P < 0.05, FWE whole-brain corrected with a cluster extent threshold of k = 30 voxels, and overlaid on a normalized T1-weighted image averaged across subjects. (B) Bar graphs show average decoding accuracy across subjects (% correct classified, chance level is 50%) for the different training-test pairs (nos. 1–4; see Fig. 2B) and error bars depict SEM. Please note that the decoding accuracy only provides a lower bound on information. The predictive accuracy at the level of populations of single cells could potentially be substantially higher if only a subpopulation of cells is modulated by reward, as suggested by electrophysiological studies in primates (10, 24, 26).

A second question we aimed to address is whether reward value is represented by similar fMRI patterns during anticipation and receipt. To test this we used the patterns obtained from the anticipation phase as training data and the fMRI patterns evoked by the receipt of reward as test data set (Materials and Methods). Conversely, we also tested whether fMRI patterns during receipt of reward generalize to the anticipation phase. These analyses should reveal regions in which reward value is represented by similar patterns during anticipation and receipt of reward. Only the medial OFC (BA 11; Fig. 4A), the medial PFC (mPFC, BA 10), and the dorsal anterior cingulate cortex (dACC, BA 32) showed significant decoding accuracy (P < 0.05, FWE whole-brain corrected). Thus, in these regions, the fMRI patterns that contain information about the reward value during receipt of reward already emerge during anticipation. Figure 4B shows the distributed voxel selectivity patterns in the medial OFC from the SVC trained on anticipation and receipt of reward, respectively, as well as their similarity (correlation) for an exemplary subject (see Figs. S3 and S4 for results from all subjects). As can be seen from Fig. 4B, there are reliably reproducible fine-grained subregions within OFC that either increase or decrease their activity with increasing reward value, which is compatible with previous electrophysiological findings on reward coding in animal OFC (5, 24–27).

Fig. 4.

Similar value-coding fMRI patterns in the OFC during anticipation and receipt of reward. (A) In the medial OFC [MNI coordinates: (3, 54, −15), t = 6.20] similar fMRI patterns represent value during both anticipation and receipt of reward. The t map based on decoding accuracies from both training-test pairs is thresholded at P < 0.05 (FWE whole-brain corrected; cluster extent threshold k = 30 voxels) and overlaid on a normalized T1-weighted image averaged across subjects. (B) The surface plot depicts voxel selectivities (support vector weights, SV weights) in the spherical cluster surrounding the individual peak voxel in medial OFC for one subject. The selectivity of each voxel for either low or high values is color coded in blue and yellow, respectively. (Left) SV weights from the SVC trained on fMRI patterns during anticipation and (Right) during receipt of reward. Scatter plot in the middle illustrates the similarity between the voxel selectivities during anticipation (x axis) and receipt of reward (y axis). 3D patterns from all subjects are shown in Figs. S3 and S4. (C) Significant relationship (r = 0.51, P < 0.05) between the correlation of the fMRI patterns in the medial OFC during anticipation and receipt of reward (pattern similarity, x axis) and the coefficients of determination (R2) describing the subjective association between the sensory cue and reward value (subjective association, y axis) obtained from the postscanning ratings. There was no significant relationship in the mPFC (P = 0.65) or the dACC cluster (P = 0.09). (D) Significant relationship (r = 0.61, P < 0.05) between pattern similarity in OFC (x axis) and the modulation of performance (% correct) by expected value (y axis) during the task. There was no significant relationship in the mPFC (P = 0.14) or the dACC (P = 0.29).

We compared these findings to a more conventional analysis of OFC activity based on a general linear model with smoothed data, where the fine-grained activation patterns will be averaged out. In our case this resulted in no significant difference in the medial OFC between different values during the anticipation phase (P = 0.39), but significant differences at the reward receipt phase (P < 0.001; Fig. S5). Thus, based on our data, a conventional fMRI analysis would lead to the opposite conclusion that reward value is differently represented in medial OFC during anticipation and receipt of reward. This discrepancy might help explain why previous studies have differed in terms of the role of OFC in anticipation (18, 19, 35, 36).

We then proceeded to investigate whether the neural link between anticipation and receipt phase had any influence on behavioral performance. If the subjective association between sensory cue and reward value relies on the similarity between fMRI patterns during anticipation and receipt of reward, the degree to which subjects represent the value during the cue should be reflected in this pattern similarity (Materials and Methods). Indeed, across subjects, the pattern similarity in the medial OFC was positively correlated with the coefficient of determination (R2; Materials and Methods) obtained from the postscanning ratings (r = 0.51, P < 0.05, Fig. 4C). That is, the higher the similarity of the fMRI patterns during anticipation and receipt, the stronger the subjective association between sensory cue and reward value. Furthermore, we found a significant correlation between pattern similarity and the degree to which performance (% correct) was modulated by value (r = 0.61, P < 0.05, Fig. 4D). However, the negative correlation between pattern similarity and the modulation of RT by value failed to reach significance threshold (r = −0.34, P = 0.20), which might be due to a ceiling effect of RT. Together, these findings suggest that the pattern similarity is functionally relevant for using the value information provided by the sensory cues to guide behavior.
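The pattern-similarity measure can be sketched as a correlation between two voxel-selectivity vectors, one from a classifier trained on anticipation and one from a classifier trained on receipt. The sketch assumes a Pearson correlation; the weight values are invented for illustration and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical support vector weights for five voxels (illustrative values only).
weights_anticipation = [0.8, -0.5, 0.3, -0.9, 0.1]
weights_receipt = [0.7, -0.4, 0.2, -1.0, 0.3]

similarity = pearson_r(weights_anticipation, weights_receipt)
print(round(similarity, 3))
```

One such similarity value per subject could then be related across subjects to a behavioral measure, again with a correlation.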

Discussion

Our findings provide insight into an important question in behavioral neuroscience. We show that in the OFC, reward value can be decoded from distributed patterns of brain activity as measured by fMRI. Whereas previous human fMRI studies showed increasing OFC responses with increasing reward value (16–22), single-unit and multiunit recordings in animals suggest a more complex, heterogeneous mapping between value and neural firing (5, 24–27). Relating findings between monkey electrophysiology and human neuroimaging studies is vital for neuroscientific research (39). By demonstrating that fMRI patterns in the OFC represent reward value, our findings narrow the gap between these conflicting results. In the nervous system, information is often encoded in distributed population activity, where each unit has a different tuning property (40, 41). Indeed, besides demonstrations of single-unit coding of expected reward (9–12), previous studies have shown population coding of different reward magnitudes (29) and expected reward probabilities (30) in rat OFC.

This raises the important question of how population coding on a neuronal level is translated into distributed fMRI patterns. One hypothesis is that the reward-related patterning of fMRI signals in OFC reflects “biased sampling” of cortical cells by individual fMRI voxels. This could be caused by random fluctuations in the distribution of cells coding differently for reward value such that by chance one voxel happens to be populated slightly more by one type of cell and another voxel happens to be populated slightly more by another type of cell (Fig. S6). Such a link between information encoded in a fine-grained patterning of neural selectivity and fMRI patterns has been established both empirically and theoretically in the early visual system (42, 43). Similar informative pattern signals have also been shown for cognitive and decision processes in prefrontal cortex (38, 44, 45). Although the detailed topographic microarchitecture of prefrontal cortex still remains to be understood, anatomical tracing studies suggest that local projections within prefrontal cortex are patchy, which might point toward a quasi-columnar architecture in prefrontal cortex similar to that in visual areas (46). Our data suggest that in OFC a fine-grained topography of reward coding schemes can be expected beyond the distinction between medial and lateral regions (2, 23). It is important to note that although fMRI pattern analysis goes far beyond conventional fMRI methods, it can only provide a coarse estimation of the information contained in populations of single cells because the fMRI signal predominantly reflects the local field potential (39), and each voxel can contain over half a million neurons (for details, see Fig. S6).

Furthermore, we have shown that value-encoding fMRI patterns in medial OFC are similar during anticipation and receipt of reward. One possible reason for this result could be that the BOLD response to the anticipation phase extends into the receipt phase. We believe this to be unlikely, because we used long randomized delays (4–8 s) that allow the statistical separation of two phases (47). To further assess this we performed a simulation of such a “bleeding” effect (Fig. S7). Also, an additional analysis shows that bleeding does not occur for the encoding of sensory cues (Fig. S8). Taken together, this confirms that a single extended response during the anticipation phase is an unlikely cause of our finding. We therefore conclude that the neural representation of expected values is similar to that of the actual outcome. However, the limited temporal resolution of fMRI does not allow us to tell whether the neural signals in the OFC are temporally sustained and span from anticipation to the receipt phase, or whether the receipt phase is a transient reactivation of the same neurons. These interpretations are both in line with the idea that activity in the OFC acts as working memory for expected rewards (33, 34). Activity patterns in the OFC could maintain value representations in working memory, which could enable flexible updating with new information and ensure that important outcomes receive behavioral priority (8). Indeed, stronger maintenance of gustatory information in the OFC compared with dorsolateral PFC has recently been shown, which also points toward a more general role of OFC in reward working memory (48).

Our findings have several implications for psychological theories of reward learning and decision making. Specifically, reinforcement learning theory predicts that during learning, the reward value of the outcome is shifted toward the sensory cue (49, 50). Beyond that, our data suggest that the very same neural code that represents the value of the outcome is taken over by the cue. In other words, learning is accomplished by forming fine-grained activity patterns that represent a certain reward value in response to specific cues. Furthermore, so-called “as-if” feelings have been suggested by the somatic marker hypothesis (51). Although we do not experience them per se, these anticipatory representations are thought to guide our choices toward the more rewarding option (52). Based on observations in patients with frontal-lobe lesions, the somatic marker hypothesis predicts that these as-if feelings are represented in the OFC (51, 52). Our finding supports this idea by demonstrating that activity patterns in this region contain the same information about the reward value during both the reward-predicting cue and the actual outcome. Furthermore, the degree of similarity between response patterns during anticipation and receipt of reward is related to subjects’ ability to use the value information provided by the cue to guide behavior. That is, the closer the match between neural response patterns during anticipation and receipt of reward, the closer the internal value representation matches the actual reward value. Damage to the OFC would lead to the inability to represent anticipatory value information, which could explain the observed decision-making impairments in these patients (53).

In summary, our findings suggest that sensory cues establish value representations by eliciting distributed neural response patterns in the OFC. Importantly, the pattern responses during anticipation are similar to those during the actual outcome—a simple but highly effective strategy. Furthermore, this similarity is functionally relevant to use the anticipated value to guide behavior. This mechanism may serve as a general neural process that is used by the brain to represent future rewarding outcomes and events. It remains to be shown how these patterns develop during learning and devaluation and whether this mechanism extends also to domains other than reward learning and decision making.

Materials and Methods

Participants.

Sixteen right-handed, healthy subjects participated in the experiment (mean age 26.4; SD ± 2.78 years). Subjects were free of neurological and psychiatric history and gave informed consent to participate in the fMRI experiment. The experimental procedure was approved by the local ethics review board of the Charité – Universitätsmedizin Berlin.

Stimuli and Task.

In each trial of the experiment, subjects saw one visual cue consisting of colored rotating dots for 2 s. Rotation of dots ranged from 100% counterclockwise (CCW) and 0% clockwise (CW) to 0% CCW and 100% CW (100% CCW and 0% CW; 75% CCW and 25% CW; 25% CCW and 75% CW; 0% CCW and 100% CW). The color of the dots ranged from green (G) to red (R) (100% G and 0% R; 66% G and 33% R; 33% G and 66% R; 0% G and 100% R). For each subject, specific conjunctions of rotation and color were reward predicting (Fig. 1B), whereas the individual dimensions were not correlated with the reward value. After a variable delay (4–8 s), subjects had to report either the main rotation direction (CCW vs. CW) or the color (G vs. R) of the dots by means of a button press (maximum RT = 1 s). The chosen response was briefly surrounded by a white frame, and feedback (0–10 points) was delivered after correct responses (Fig. 1A). Trials were separated by a variable delay of 4–8 s. After scanning, subjects received 2 cents for each point they acquired during the scanning session (∼22 € in total). The order of rotation and color judgments was randomized to ensure that subjects attended to both rotation and color on each trial. RT and performance (% correct) did not differ between rotation and color judgments (paired t test; t = −0.49, P = 0.63, and t = 0.17, P = 0.87, respectively), indicating similar levels of difficulty for both judgment tasks. From each subject, 10 scanning runs were acquired. Each of the four value levels was presented six times per run, resulting in 24 trials. Before scanning, subjects performed three training runs to learn the association between the cues and the reward outcomes. After the fMRI experiment, subjects saw each cue from the main experiment twice in pseudorandomized order and were asked to rate the corresponding reward value on a continuous rating scale (without labeling). These ratings were used to assess the degree to which subjects established associations between the cues and the reward values that they predict.

Behavioral Data Analyses.

RT and task performance (% correct) were analyzed using linear regression. Specifically, the trial-wise reward values were regressed against RT and task performance, respectively. Thus regression coefficients indicate the strength of modulation of behavior as a function of reward value. To characterize the degree to which subjects were able to perceive and report the value of each cue, we fitted a linear regression model to the subjective ratings obtained from the postscanning rating session. Specifically, the cue value was regressed against the subjective value ratings. This resulted in a coefficient of determination (R2) for each subject which describes the strength of the subjective association between cues and reward value.
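The per-subject regression and its coefficient of determination can be sketched with ordinary least squares. This is a minimal illustration, assuming a simple linear model with one predictor; the rating values below are invented, and `linear_fit` is a hypothetical helper name.

```python
def linear_fit(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, R^2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("fit is undefined when x or y is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data for one subject: four value levels, each cue rated twice.
cue_values = [0, 3, 7, 10, 0, 3, 7, 10]
ratings = [0.5, 2.8, 7.4, 9.6, 0.2, 3.5, 6.6, 9.9]

slope, intercept, r_squared = linear_fit(cue_values, ratings)
print(slope, intercept, r_squared)
```

The same helper applies to the trial-wise analysis of RT or accuracy, where the slope indicates how strongly behavior is modulated by reward value.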

fMRI Data Acquisition and Preprocessing.

Functional imaging was conducted on a 3-Tesla Siemens Trio scanner equipped with a 12-channel head coil. In each of the 10 scanning runs, 196 T2*-weighted gradient-echo echo-planar images (EPI) containing 33 slices (3 mm thick) separated by a gap of 0.75 mm were acquired. Imaging parameters were as follows: repetition time (TR) 2,000 ms, echo time (TE) 30 ms, flip angle 90°, matrix size 64 × 64 and a field of view (FOV) of 192 mm, resulting in a voxel size of 3 × 3 × 3.75 mm. A T1-weighted structural data set was collected for the purpose of anatomical localization. The parameters were as follows: TR 1,900 ms, TE 2.52 ms, matrix size 256 × 256, FOV 256 mm, 192 slices (1 mm thick), flip angle 9°.
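As a quick consistency check, the reported voxel size and the duration of one run follow from the stated acquisition parameters. The variable names are chosen here for illustration only.

```python
# Functional acquisition parameters as reported above.
fov_mm = 192.0            # field of view
matrix_size = 64          # in-plane matrix (64 x 64)
slice_thickness_mm = 3.0
slice_gap_mm = 0.75
n_volumes = 196           # EPI volumes per run
tr_s = 2.0                # repetition time in seconds

in_plane_mm = fov_mm / matrix_size                    # in-plane voxel edge length
slice_spacing_mm = slice_thickness_mm + slice_gap_mm  # center-to-center slice distance
run_duration_s = n_volumes * tr_s                     # scan time of one run

print(in_plane_mm, in_plane_mm, slice_spacing_mm)  # effective voxel size in mm
print(run_duration_s)                              # seconds per run
```

The in-plane resolution is the field of view divided by the matrix size, and the effective through-plane extent per slice is the slice thickness plus the gap.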

Preprocessing, parameter estimation, and group statistics of the functional data were performed using SPM2 (Wellcome Department of Imaging Neuroscience, Institute of Neurology, London). For preprocessing, images were slice-time corrected and realigned. Importantly, to preserve fine-grained sampling biases of individual voxels, no spatial normalization or spatial smoothing was applied at this point of the analysis.

Decoding Anticipated Reward Values.

To identify where in the brain the reward value of sensory cues is represented, we used a multivariate decoding technique (31, 32). In a first step, preprocessed data were analyzed in the framework of the general linear model (GLM). For each run, a GLM was applied to the unsmoothed data with four cue-locked regressors of interest: (i) G&CCW; (ii) R&CCW; (iii) G&CW; and (iv) R&CW. For half of the subjects the cue-reward association was: G&CCW = low, R&CCW = high, G&CW = high, and R&CW = low. For the other half of the subjects this mapping was reversed. Additionally, an outcome-locked regressor of no interest was included in the GLM. These regressors were convolved with a hemodynamic response function (HRF) and then simultaneously regressed against the BOLD signal in each voxel. The resulting parameter estimates reflect how strongly each regressor contributes to the BOLD signal in that voxel.

The parameter estimates were then used to search for brain regions that carry spatially distributed information about the reward value of sensory cues. For this, we used a “searchlight” approach (37, 38), which examines the information in local fMRI patterns surrounding each voxel vi (see Fig. 2). This approach allows us to extract information from locally distributed activity patterns without a potentially biasing prior voxel selection. For a given voxel vi we first defined a small spherical cluster (radius = 4 voxels) centered on vi. For each voxel in this local cluster we extracted the unsmoothed parameter estimates separately for each of the four cues, and separately for all 10 scanning runs. This yielded four multidimensional pattern vectors for each run, representing the spatially distributed response patterns to the different sensory cues in that local cluster. Importantly, because we were only interested in information encoded in the distributed fMRI patterns, the mean of each pattern was subtracted. This ensures that the average signal of the pattern vectors does not contain any information.

We then used 9 of the 10 runs to train a linear support vector classifier (SVC) to classify one pair of low- vs. high-value cues (e.g., G&CCW vs. G&CW). The amount of value-related information present within this local cluster, independent of the sensory properties of the cue, was then assessed by examining how well a different pair of low- vs. high-value cues (e.g., R&CW vs. R&CCW) in the remaining independent test data set (run 10) was classified. Specifically, decoding accuracy was defined as the percentage of correct classifications in the independent test data set. In total, the training and test procedure was repeated 10 times, each time with a different run assigned as the test data set, yielding an average prediction accuracy in the local environment of the central voxel vi (10-fold cross-validation). Note that this cross-validation using independent training and test data sets avoids “double dipping” and circular analyses (54). This procedure was repeated for all four possible training-test combinations (Fig. 2B) and all spatial positions, i.e., voxels vi. For each subject, this resulted in four 3D maps of decoding accuracy, one for each training-test pair. The SVC was performed using the LIBSVM implementation (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) with a linear kernel and a standard cost parameter c = 1.
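The searchlight steps described above (spherical neighborhood, mean subtraction, training on one low- vs. high-value cue pair in nine runs and testing on a different pair in the left-out run) can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the array layout, the function names, and the synthetic demo data are assumptions, and a nearest-centroid linear rule stands in for the LIBSVM linear SVC so that the example depends only on NumPy.

```python
import numpy as np

def sphere_offsets(radius):
    """Integer voxel offsets lying within `radius` voxels of the center voxel."""
    r = int(np.ceil(radius))
    grid = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1].reshape(3, -1).T
    return grid[np.sum(grid ** 2, axis=1) <= radius ** 2]

def center_patterns(patterns):
    """Subtract each pattern's own mean across voxels (last axis)."""
    return patterns - patterns.mean(axis=-1, keepdims=True)

def fit_centroid_classifier(low, high):
    """Stand-in linear classifier (NOT an SVC): w @ x + b > 0 means 'high'."""
    mu_low, mu_high = low.mean(axis=0), high.mean(axis=0)
    w = mu_high - mu_low
    b = -w @ (mu_high + mu_low) / 2.0
    return w, b

def cross_classify(betas, train_pair, test_pair):
    """Leave-one-run-out decoding accuracy in percent.

    betas: (n_runs, n_cues, n_voxels) parameter estimates of one cluster.
    train_pair / test_pair: (low_cue_index, high_cue_index).
    """
    betas = center_patterns(betas)
    n_runs = betas.shape[0]
    fold_acc = []
    for test_run in range(n_runs):
        train_runs = np.delete(np.arange(n_runs), test_run)
        w, b = fit_centroid_classifier(betas[train_runs, train_pair[0]],
                                       betas[train_runs, train_pair[1]])
        scores = betas[test_run, list(test_pair)] @ w + b  # [low, high]
        fold_acc.append(np.mean([scores[0] <= 0, scores[1] > 0]))
    return 100.0 * float(np.mean(fold_acc))

# Synthetic demo (hypothetical data): 10 runs, 4 cues, one radius-4 cluster.
rng = np.random.default_rng(0)
n_vox = len(sphere_offsets(4))
value_pattern = rng.standard_normal(n_vox)
# Cue order G&CCW, R&CCW, G&CW, R&CW with values low, high, high, low.
signs = np.array([-1.0, 1.0, 1.0, -1.0])
betas = (signs[None, :, None] * value_pattern
         + 0.5 * rng.standard_normal((10, 4, n_vox)))
accuracy = cross_classify(betas, train_pair=(0, 2), test_pair=(3, 1))
```

In a whole-brain analysis the same call would be repeated for every center voxel and every training-test combination, with the offsets from `sphere_offsets` added to the center coordinate to select the voxels of each cluster.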

Group statistics were performed on a voxel-by-voxel basis to examine how well the value of the sensory cues could be decoded on average across all subjects from each position in the brain. For this purpose the prediction accuracy maps were spatially normalized to a standard T2* template of the Montreal Neurological Institute (MNI) and smoothed with a 6-mm FWHM Gaussian kernel. The normalized and smoothed accuracy maps were then entered into a one-way ANOVA with four levels, defining each training-test pair (see Fig. 2B). Regions that contained information about the reward value were identified using voxel-wise t tests based on the four training-test pairs. To identify significant clusters, we used a threshold of P < 0.05, family-wise error (FWE) corrected for the whole brain (all searchlight positions), and a cluster extent threshold of 30 voxels. This particularly conservative threshold was used to minimize the possibility of false positive results.
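The logic of a voxel-wise group test on accuracy maps can be illustrated with a deliberately simplified sketch. It is not the SPM2 ANOVA used here: it averages the training-test pairs within each subject and computes a one-sample t statistic against an assumed chance level of 50% for a two-class problem, and it applies no FWE correction or cluster extent threshold. The function name and the array layout are hypothetical.

```python
import numpy as np

def group_t_map(acc_maps, chance=50.0):
    """Voxel-wise one-sample t statistic of decoding accuracy against chance.

    acc_maps: (n_subjects, n_pairs, x, y, z) accuracies in percent, already
    normalized and smoothed. Returns an (x, y, z) map of t values.
    """
    subj_mean = acc_maps.mean(axis=1)        # average over training-test pairs
    n_subjects = subj_mean.shape[0]
    diff = subj_mean - chance
    sd = diff.std(axis=0, ddof=1)            # sample standard deviation
    with np.errstate(divide="ignore", invalid="ignore"):
        return diff.mean(axis=0) / (sd / np.sqrt(n_subjects))
```

Voxels with zero variance across subjects yield infinite or undefined values and would have to be masked before thresholding.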

Common Value Coding fMRI Patterns During Anticipation and Receipt of Reward.

To investigate the similarity between the value coding fMRI patterns during anticipation and receipt of reward, we set up a second GLM with four regressors of interest: (i) high-value cues; (ii) low-value cues; (iii) high-value outcomes; and (iv) low-value outcomes. These regressors were again convolved with an HRF and simultaneously regressed against the BOLD signal in each voxel. Importantly, because all regressors were simultaneously regressed against the BOLD signal, parameter estimates of cue and outcome do not account for the same variance in the BOLD signal. In other words, parameter estimates of high- and low-value cues are independent of activity to high- and low-value outcomes and vice versa. To identify brain regions in which similar fMRI patterns represent the reward value during anticipation and receipt of reward, we repeated the searchlight procedure described above. This time, however, we used the fMRI patterns during anticipation (low- vs. high-value cues) to train the SVC and tested its performance on fMRI patterns during receipt of reward (low- vs. high-reward outcome) and vice versa. Again, the mean of each fMRI pattern was subtracted to remove any information possibly present in the average signal. Group statistics were performed on normalized and smoothed accuracy maps using a one-way ANOVA with two levels defining both training-test pairs (train on cue, test on receipt; and train on receipt, test on cue). Regions that represent value by similar response patterns during anticipation and receipt of reward were identified using voxel-wise t tests based on the two training-test pairs. The same FWE-corrected threshold of P < 0.05, k = 30 voxels, was used to identify significant clusters.
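How simultaneous regression separates cue-related from outcome-related signal can be illustrated with an ordinary least-squares fit of HRF-convolved event regressors. The sketch below is an illustration under stated assumptions and not the SPM2 model: it uses a single-gamma response function instead of SPM's canonical HRF, and the repetition time, the onsets, and the simulated amplitudes are invented for the demo.

```python
import math
import numpy as np

def gamma_hrf(tr, duration=32.0, shape=6.0, scale=1.0):
    """Single-gamma response sampled every `tr` seconds (assumed HRF shape)."""
    t = np.arange(0.0, duration, tr)
    h = t ** (shape - 1) * np.exp(-t / scale) / (scale ** shape * math.gamma(shape))
    return h / h.sum()

def design_matrix(onsets_per_regressor, n_scans, tr):
    """One HRF-convolved column per event type, plus a constant column."""
    hrf = gamma_hrf(tr)
    cols = []
    for onsets in onsets_per_regressor:
        sticks = np.zeros(n_scans)
        sticks[np.round(np.asarray(onsets) / tr).astype(int)] = 1.0
        cols.append(np.convolve(sticks, hrf)[:n_scans])
    cols.append(np.ones(n_scans))
    return np.column_stack(cols)

def fit_glm(X, y):
    """Estimate all regressors at once by least squares."""
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas

# Hypothetical timing in seconds; outcomes follow cues after a variable delay.
tr, n_scans = 2.0, 200
onsets = [
    [10, 90, 170, 250, 330],   # high-value cues
    [50, 130, 210, 290, 370],  # low-value cues
    [16, 98, 176, 260, 336],   # high-value outcomes
    [58, 136, 218, 296, 378],  # low-value outcomes
]
X = design_matrix(onsets, n_scans, tr)
true_betas = np.array([2.0, -1.0, 0.5, 3.0, 100.0])  # last entry: baseline
y = X @ true_betas                                   # noise-free voxel signal
est = fit_glm(X, y)
```

Because cue and outcome regressors overlap in time but are fitted jointly, each estimate captures only the signal that the other regressors cannot explain; in the noise-free demo the simulated amplitudes are recovered.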

To obtain a measure of the similarity between fMRI patterns during anticipation and receipt of reward, we extracted the voxel selectivities, that is, the support vector weights (SV weights) from the searchlight cluster surrounding the individual peak voxel in the medial OFC [within the medial OFC cluster with significant (P < 0.05, FWE corrected) decoding accuracy at group level]. Per subject, this resulted in two spatial patterns of voxel selectivities, one from training the SVC on data during anticipation and one from training the SVC on data during the receipt of reward (Fig. 4B and Fig. S3). To quantify their similarity for each subject, Pearson’s correlation coefficient was computed between the spatial patterns (Fig. 4B Inset and Fig. S4).
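The similarity measure can be sketched in a few lines. For a linear kernel, the weight vector equals the sum of the support vectors weighted by their signed dual coefficients, and the two resulting voxel-weight patterns are then correlated. The function names and the toy numbers below are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

def linear_svm_weights(support_vectors, dual_coefs):
    """w = sum_i (alpha_i * y_i) * x_i for a linear-kernel SVM.

    support_vectors: (n_sv, n_voxels); dual_coefs: (n_sv,) signed alphas.
    """
    sv = np.asarray(support_vectors, dtype=float)
    return np.asarray(dual_coefs, dtype=float) @ sv

def pattern_similarity(w_a, w_b):
    """Pearson correlation between two voxel-weight patterns."""
    return float(np.corrcoef(w_a, w_b)[0, 1])

# Hypothetical toy values for one cluster of five voxels.
w_anticipation = linear_svm_weights([[1.0, 0.0, 2.0, -1.0, 0.5],
                                     [0.0, 1.0, 0.0, 1.0, -0.5]], [0.8, -0.8])
w_receipt = linear_svm_weights([[0.9, 0.1, 1.8, -1.2, 0.4],
                                [0.1, 0.9, 0.2, 0.8, -0.6]], [0.6, -0.6])
r = pattern_similarity(w_anticipation, w_receipt)
```

The sign of a weight vector depends on which class the solver treats as positive, so both classifiers need the same label coding (e.g., high value as the positive class) before their weights are correlated.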

Supplementary Material

Supporting Information

Acknowledgments

We thank the two anonymous reviewers for their useful comments and suggestions. This work was funded by the Bernstein Computational Neuroscience Program of the German Federal Ministry of Education and Research BMBF Grant 01GQ0411, the Excellence Initiative of the German Federal Ministry of Education and Research DFG Grant GSC86/1-2009, and the Max Planck Society.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0912838107/DCSupplemental.

References

  • 1. Rolls ET. The orbitofrontal cortex and reward. Cereb Cortex. 2000;10:284–294. doi: 10.1093/cercor/10.3.284.
  • 2. Kringelbach ML. The human orbitofrontal cortex: Linking reward to hedonic experience. Nat Rev Neurosci. 2005;6:691–702. doi: 10.1038/nrn1747.
  • 3. Murray EA, O'Doherty JP, Schoenbaum G. What we know and do not know about the functions of the orbitofrontal cortex after 20 years of cross-species studies. J Neurosci. 2007;27:8166–8169. doi: 10.1523/JNEUROSCI.1556-07.2007.
  • 4. O'Doherty JP. Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards and choices. Ann N Y Acad Sci. 2007;1121:254–272. doi: 10.1196/annals.1401.036.
  • 5. Schoenbaum G, Saddoris MP, Stalnaker TA. Reconciling the roles of orbitofrontal cortex in reversal learning and the encoding of outcome expectancies. Ann N Y Acad Sci. 2007;1121:320–335. doi: 10.1196/annals.1401.001.
  • 6. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
  • 7. Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci. 2004;24:7540–7548. doi: 10.1523/JNEUROSCI.1921-04.2004.
  • 8. Pickens CL, et al. Different roles for orbitofrontal cortex and basolateral amygdala in a reinforcer devaluation task. J Neurosci. 2003;23:11078–11084. doi: 10.1523/JNEUROSCI.23-35-11078.2003.
  • 9. Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci. 1998;1:155–159. doi: 10.1038/407.
  • 10. Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525.
  • 11. Wallis JD, Miller EK. Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur J Neurosci. 2003;18:2069–2081. doi: 10.1046/j.1460-9568.2003.02922.x.
  • 12. Roesch MR, Olson CR. Neuronal activity related to reward value and motivation in primate frontal cortex. Science. 2004;304:307–310. doi: 10.1126/science.1093223.
  • 13. Rolls ET, Critchley HD, Mason R, Wakeman EA. Orbitofrontal cortex neurons: Role in olfactory and visual association learning. J Neurophysiol. 1996;75:1970–1981. doi: 10.1152/jn.1996.75.5.1970.
  • 14. Schoenbaum G, Chiba AA, Gallagher M. Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. J Neurosci. 1999;19:1876–1884. doi: 10.1523/JNEUROSCI.19-05-01876.1999.
  • 15. Critchley HD, Rolls ET. Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J Neurophysiol. 1996;75:1673–1686. doi: 10.1152/jn.1996.75.4.1673.
  • 16. Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron. 2001;30:619–639. doi: 10.1016/s0896-6273(01)00303-8.
  • 17. O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ. Neural responses during anticipation of a primary taste reward. Neuron. 2002;33:815–826. doi: 10.1016/s0896-6273(02)00603-7.
  • 18. Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006;441:876–879. doi: 10.1038/nature04766.
  • 19. Kim H, Shimojo S, O'Doherty JP. Is avoiding an aversive outcome rewarding? Neural substrates of avoidance learning in the human brain. PLoS Biol. 2006;4:e233. doi: 10.1371/journal.pbio.0040233.
  • 20. Tom SM, Fox CR, Trepel C, Poldrack RA. The neural basis of loss aversion in decision-making under risk. Science. 2007;315:515–518. doi: 10.1126/science.1134239.
  • 21. Plassmann H, O'Doherty J, Shiv B, Rangel A. Marketing actions can modulate neural representations of experienced pleasantness. Proc Natl Acad Sci USA. 2008;105:1050–1054. doi: 10.1073/pnas.0706929105.
  • 22. Gottfried JA, O'Doherty J, Dolan RJ. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science. 2003;301:1104–1107. doi: 10.1126/science.1087919.
  • 23. O'Doherty JP, Kringelbach ML, Rolls ET, Hornak J, Andrews C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosci. 2001;4:95–102. doi: 10.1038/82959.
  • 24. Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
  • 25. Kennerley SW, Dahmubed AF, Lara AH, Wallis JD. Neurons in the frontal lobe encode the value of multiple decision variables. J Cogn Neurosci. 2009;21:1162–1178. doi: 10.1162/jocn.2009.21100.
  • 26. Kennerley SW, Wallis JD. Evaluating choices by single neurons in the frontal lobe: Outcome value encoded across multiple decision variables. Eur J Neurosci. 2009;29:2061–2073. doi: 10.1111/j.1460-9568.2009.06743.x.
  • 27. Morrison SE, Salzman CD. The convergence of information about rewarding and aversive stimuli in single neurons. J Neurosci. 2009;29:11471–11483. doi: 10.1523/JNEUROSCI.1815-09.2009.
  • 28. Schoenbaum G, Eichenbaum H. Information coding in the rodent prefrontal cortex. II. Ensemble activity in orbitofrontal cortex. J Neurophysiol. 1995;74:751–762. doi: 10.1152/jn.1995.74.2.751.
  • 29. van Duuren E, Lankelma J, Pennartz CM. Population coding of reward magnitude in the orbitofrontal cortex of the rat. J Neurosci. 2008;28:8590–8603. doi: 10.1523/JNEUROSCI.5549-07.2008.
  • 30. van Duuren E, et al. Single-cell and population coding of expected reward probability in the orbitofrontal cortex of the rat. J Neurosci. 2009;29:8965–8976. doi: 10.1523/JNEUROSCI.0005-09.2009.
  • 31. Haynes JD, Rees G. Decoding mental states from brain activity in humans. Nat Rev Neurosci. 2006;7:523–534. doi: 10.1038/nrn1931.
  • 32. Norman KA, Polyn SM, Detre GJ, Haxby JV. Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends Cogn Sci. 2006;10:424–430. doi: 10.1016/j.tics.2006.07.005.
  • 33. Goldman-Rakic P. In: Handbook of Physiology: The Nervous System. Mountcastle VB, Plum F, Geiger SR, editors. Bethesda: Am Physiol Soc; 1987. pp. 373–417.
  • 34. Wallis JD. Orbitofrontal cortex and its contribution to decision-making. Annu Rev Neurosci. 2007;30:31–56. doi: 10.1146/annurev.neuro.30.051606.094334.
  • 35. Knutson B, Fong GW, Adams CM, Varner JL, Hommer D. Dissociation of reward anticipation and outcome with event-related fMRI. Neuroreport. 2001;12:3683–3687. doi: 10.1097/00001756-200112040-00016.
  • 36. Knutson B, Fong GW, Bennett SM, Adams CM, Hommer D. A region of mesial prefrontal cortex tracks monetarily rewarding outcomes: Characterization with rapid event-related fMRI. Neuroimage. 2003;18:263–272. doi: 10.1016/s1053-8119(02)00057-5.
  • 37. Kriegeskorte N, Goebel R, Bandettini P. Information-based functional brain mapping. Proc Natl Acad Sci USA. 2006;103:3863–3868. doi: 10.1073/pnas.0600244103.
  • 38. Haynes JD, et al. Reading hidden intentions in the human brain. Curr Biol. 2007;17:323–328. doi: 10.1016/j.cub.2006.11.072.
  • 39. Logothetis NK, Pauls J, Augath M, Trinath T, Oeltermann A. Neurophysiological investigation of the basis of the fMRI signal. Nature. 2001;412:150–157. doi: 10.1038/35084005.
  • 40. Oram MW, Földiák P, Perrett DI, Sengpiel F. The ‘Ideal Homunculus’: Decoding neural population signals. Trends Neurosci. 1998;21:259–265. doi: 10.1016/s0166-2236(97)01216-2.
  • 41. Quian Quiroga R, Panzeri S. Extracting information from neuronal populations: Information theory and decoding approaches. Nat Rev Neurosci. 2009;10:173–185. doi: 10.1038/nrn2578.
  • 42. Kamitani Y, Tong F. Decoding the visual and subjective contents of the human brain. Nat Neurosci. 2005;8:679–685. doi: 10.1038/nn1444.
  • 43. Haynes JD, Rees G. Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat Neurosci. 2005;8:686–691. doi: 10.1038/nn1445.
  • 44. Hampton AN, O'Doherty JP. Decoding the neural substrates of reward-related decision making with functional MRI. Proc Natl Acad Sci USA. 2007;104:1377–1382. doi: 10.1073/pnas.0606297104.
  • 45. Soon CS, Brass M, Heinze HJ, Haynes JD. Unconscious determinants of free decisions in the human brain. Nat Neurosci. 2008;11:543–545. doi: 10.1038/nn.2112.
  • 46. Pucak ML, Levitt JB, Lund JS, Lewis DA. Patterns of intrinsic and associational circuitry in monkey prefrontal cortex. J Comp Neurol. 1996;376:614–630. doi: 10.1002/(SICI)1096-9861(19961223)376:4<614::AID-CNE9>3.0.CO;2-4.
  • 47. Friston KJ, Zarahn E, Josephs O, Henson RN, Dale AM. Stochastic designs in event-related fMRI. Neuroimage. 1999;10:607–619. doi: 10.1006/nimg.1999.0498.
  • 48. Lara AH, Kennerley SW, Wallis JD. Encoding of gustatory working memory by orbitofrontal neurons. J Neurosci. 2009;29:765–774. doi: 10.1523/JNEUROSCI.4637-08.2009.
  • 49. Rescorla RA, Wagner AR. In: Classical Conditioning II: Current Research and Theory. Black AH, Prokasy WF, editors. New York: Appleton-Century-Crofts; 1972.
  • 50. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
  • 51. Damasio AR. The somatic marker hypothesis and the possible functions of the prefrontal cortex. Philos Trans R Soc Lond B Biol Sci. 1996;351:1413–1420. doi: 10.1098/rstb.1996.0125.
  • 52. Bechara A, Damasio H, Tranel D, Damasio AR. Deciding advantageously before knowing the advantageous strategy. Science. 1997;275:1293–1295. doi: 10.1126/science.275.5304.1293.
  • 53. Anderson SW, Bechara A, Damasio H, Tranel D, Damasio AR. Impairment of social and moral behavior related to early damage in human prefrontal cortex. Nat Neurosci. 1999;2:1032–1037. doi: 10.1038/14833.
  • 54. Kriegeskorte N, Simmons WK, Bellgowan PS, Baker CI. Circular analysis in systems neuroscience: The dangers of double dipping. Nat Neurosci. 2009;12:535–540. doi: 10.1038/nn.2303.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
