Abstract
Prior experience plays a critical role in decision making. It enables explicit representation of potential outcomes and provides training to valuation mechanisms. However, we can also make choices in the absence of prior experience, by merely imagining the consequences of a new experience. Here, using fMRI repetition suppression in humans, we show how neuronal representations of novel rewards can be constructed and evaluated. A likely novel experience is constructed by invoking multiple independent memories within hippocampus and medial prefrontal cortex. This construction persists for only a short time period, during which new associations are observed between the memories for component items. Together these findings suggest that in the absence of direct experience, co-activation of multiple relevant memories can provide a training signal to the valuation system which allows the consequences of new experiences to be imagined and acted upon.
Humans display remarkable flexibility in their behavior. Like many other animals, we guide our behavior through direct experience, but we can also infer the likely consequences of actions never previously taken1,2. Through generalizing principles and applying them to new situations3,4, we can predict new relationships and statistical structures in our environment and use these to estimate the value of new events1,5,6. Whilst some progress has been made in uncovering the brain regions that underlie these complex abilities1,3–7, little or no progress has been made in understanding how neuronal networks support these complex computations, partly because it is unclear to what extent such computations exist in species where we can readily measure single cell activity.
One potential mechanism that allows for upcoming events to be evaluated involves using past experience to predict consequences of future possible scenarios. In rodents, hippocampal firing sequences at choice points predict or ‘preplay’ the forthcoming environment8, and the likely outcomes of their decision can later be decoded in the orbitofrontal cortex9. By contrast, when choosing between novel options, there is no direct experience from which to preplay and evaluate future options. However, it is possible that the representation of an upcoming novel outcome may be constructed by combining multiple distinct relevant experiences, preplayed simultaneously.
To test these predictions we required access to the information content of neural populations underlying the representation of a novel experience. Despite the poor spatial resolution of fMRI, there are well-validated strategies that can reveal underlying cellular representations. For example, fMRI adaptation takes advantage of the fact that activated cellular ensembles within a voxel show a relative suppression in their activity in response to repetition of a stimulus to which they recently responded. Despite ambiguity in the biophysical mechanism underlying repetition suppression10, when combined with careful experimental design the technique allows for inferences to be made about the underlying neuronal representations12,13.
Here we used fMRI adaptation to probe the neural representation of a novel food reward. We hypothesized that if the representation of a novel food was constructed by explicit combination of multiple distinct experiences, we would observe fMRI adaptation when subjects evaluated a novel reward immediately after evaluating a component ingredient. Furthermore, if multiple experiences were replayed simultaneously, plasticity may result between the underlying neuronal assemblies. Hence, experiences used to construct the same novel good would later adapt to each other. Lastly, we hypothesized that this complex construction process would not be required after an independent neuronal representation of the novel good had been established. We should therefore observe a reduction in each adaptation effect after allowing the subjects either to experience the novel good directly, or to simulate the novel good repeatedly. This repetition suppression paradigm therefore allowed us to probe the neural mechanisms that underlie human capacity for flexible, online, value construction.
Results
Deciding between novel goods
We created thirteen ‘novel goods’ whose values were unknown to the subjects. However, each good was a novel combination of two different familiar foods (Fig. 1a). Participants were given the opportunity to observe these novel goods without being allowed to sample them either by taste or smell.
To first establish that these goods activate known value-related brain regions, we measured fMRI activity in 19 subjects whilst they evaluated and chose between pairs of these novel goods (Fig. 1b). After the scan session, subjects performed a Becker-DeGroot-Marschak (BDM) auction14 that allowed us to measure subjects’ constructed value for each good. Consistent with reports in simpler valuation contexts, we observed a signal that correlated with the value of the chosen option in a network of brain regions that included ventral and dorsal medial prefrontal cortex ((v/d)mPFC) and posterior cingulate cortex (mPFC: p = 0.001 FWE corrected on cluster level, peak t(17) = 6.30, Fig. 2a). The involvement of both vmPFC and dmPFC is of particular interest given that the task requires subjects to construct, and evaluate, a model of a future outcome. This involvement accords with recent evidence that vmPFC encodes value preference for executable choices while dmPFC does so for choices that are modeled abstractly7.
To evaluate these novel goods, subjects could not rely on pre-learnt values. Hence, their only recourse was to construct, online, an expectation of the compound’s value from knowledge of the individual components. A key question therefore is whether subjects constructed a novel representation of the compound by explicitly combining the representations of each component and, if so, which brain regions support this construction process? We reasoned that this construction process could be measured using fMRI adaptation. Activity relating to the construction of the compound value would be suppressed when preceded by a related component if, and only if, the subject had engaged the neuronal ensembles of the components when constructing a representation of the compound.
Constructing representations of novel goods using memories
For each participant, we selected two of the thirteen novel compounds, here referred to as ‘AB’ and ‘CD’, made up of four ‘familiar’ individual components (‘A’, ‘B’, ‘C’ and ‘D’) that subjects had tasted immediately prior to the experiment (Fig. 1a). To avoid visual confounds in a later analysis, we trained subjects to associate each of the 6 component and compound foods (‘A’, ‘B’, ‘C’, ‘D’, ‘AB’, ‘CD’) with two different abstract shapes (Fig. 1c). Participants trained extensively on these associations between food items and abstract shapes. In the final block of trials the mean accuracy was 97.8%, with mean reaction time of 845.2ms.
On each trial in the scanner, we presented a distinct shape that served as an instruction cue for subjects to elicit an explicit mental representation of the associated food (Fig. 1d). The key comparison of interest here was brain activity elicited by novel goods when preceded by related components (e.g. ‘A’ or ‘B’ followed by ‘AB’) compared to novel goods when preceded by unrelated components (e.g. ‘C’ or ‘D’ followed by ‘AB’).
Early in the experiment (block 1 out of 3) we observed fMRI adaptation between the representation of novel goods and their constituent components in both mPFC (p<0.001 FWE-corrected on cluster level, peak t(18) = 4.45, Fig. 2b) and bilateral hippocampus (t(18) = 2.55, p = 0.010 using ROI analysis (see Methods), Fig. 2b). These two brain regions are components of a network commonly activated in studies of value7,15–18, episodic memory4,19,20 and spatial navigation12. The present result implies that these brain regions construct a value representation of a novel item from component memories, and do so by simultaneously engaging neuronal representations of these components.
Plasticity between simultaneously active memories
If this is in fact the case then it follows that during the construction of the compound good ‘AB’, the neuronal ensembles representing components ‘A’ and ‘B’ should be simultaneously active. We reasoned that this simultaneous activity, which first occurred during the stimulus-item training phase prior to scanning, would induce experience-dependent plasticity between cellular elements in these two ensembles, a plasticity that would be evident in the scanning trials as a shadow of this value construction process. For example, after constructing a representation of ‘tea jelly’, we reasoned that cellular representations of ‘tea’ would induce activity in jelly-preferring ensembles, and vice versa. This can also be tested using fMRI adaptation, which predicts a differential effect for components that were part of the same compound compared to components that were not.
Indeed, when we compared early trials of ‘A’ that were preceded by ‘B’, to those that were preceded by ‘C’, we again found relative suppression in medial prefrontal cortex activity (p = 0.014 FWE corrected on cluster level, peak t(18) = 4.24, Fig. 2c), but not hippocampus (t(18) = 0.34, p = 0.367 using ROI analysis, see Methods).
Notably, across all three blocks, the extent to which individual participants showed adaptation between related components in mPFC, but also in the hippocampus, was predicted by the average value of the novel items (mPFC: r = 0.47 and p = 0.040; hippocampus: r = 0.58 and p = 0.010) but not component items (mPFC: r = −0.05, p = 0.833; hippocampus: r = −0.09, p = 0.730). This suggests that the mechanism underlying this suppression occurred during the earlier construction of the novel good, and not during the participant’s elicitation of the component item at the time that this signal was measured. Indeed, in both structures, the correlation with the value of the novel good survived the removal of any signal attributable to the component values (mPFC: r = 0.51, p = 0.015, Fig. 2d; hippocampus: r = 0.60, p = 0.004, Fig. 2e; see Supplementary Fig. 1 and corresponding discussion). Together these findings support value dependent plasticity in related components as a consequence of co-activation during construction of the novel goods.
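The logic of testing whether a correlation with compound value survives removal of component value can be illustrated with a first-order partial correlation. This is a minimal sketch of one common approach and is not necessarily the procedure used for Fig. 2d–e (see Supplementary Fig. 1); the function names and the toy numbers are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy values only (assumed for illustration, one entry per participant):
adaptation = [1.0, 2.0, 3.0, 4.0]        # adaptation effect
compound_value = [1.0, 3.0, 2.0, 4.0]    # value of the novel goods
component_value = [1.0, -1.0, -1.0, 1.0] # value of the component items

r_partial = partial_corr(adaptation, compound_value, component_value)
```

In this toy case the component values are uncorrelated with both other variables, so the partial correlation equals the simple correlation between adaptation and compound value.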
It is important to note here that these three de facto tests of mPFC function (valuation, construction and plasticity) do not rely on any of the same data. Despite slight differences in thresholded peak locations of the two adaptation effects, they each show similar patterns of activity within mPFC (Fig. 2f). Medial prefrontal cortex can therefore evaluate novel goods by constructing explicit representations of expected outcomes from familiar components, a process that engenders plasticity between simultaneously active component representations.
The influence of sensory experience upon construction
We then asked whether consummatory exposure to the novel goods would reduce a need to construct value online. To test this idea, we repeated the experiment in a second group of 20 subjects with one important difference. This second group (familiar) were given a single sample of each of the 13 novel compound goods to taste before the experiment. Notably, both groups underwent the same item-stimulus learning task prior to entering the scanner, and there was no significant difference between groups in reaction time or accuracy on the final block of trials during the learning task (see Supplementary Table 1). Any difference between the two groups in the representation or evaluation of novel goods could therefore be attributed to the impact of sensory exposure.
We first assessed value effects during decision trials. Both groups showed similar consistency in their choices (see Supplementary Fig. 2). As was the case for the unfamiliar group, the familiar group encoded chosen value activity in a network of value-related brain regions including mPFC (Fig. 3a). In both groups, the neural activity observed in mPFC was consistent with a role for this brain region in the evaluation of compound goods (Fig. 3b–c).
To test whether this single experience was enough to reduce a need for online value construction, we compared adaptation effects across the two groups. To avoid selection bias, we used ROIs derived from whole-brain adaptation effects averaged across both adaptation contrasts in the two groups (Fig. 4a–b, see Methods). A between-group comparison within these ROIs revealed significant differences in the adaptation effects between the familiar and unfamiliar participants in both mPFC (3-way ANOVA (see Methods): group*condition interaction, p = 0.018, F(1,144) = 5.76), and hippocampus (3-way ANOVA (see Methods): group*adaptation_type*condition interaction, p = 0.035, F(1,144) = 4.52). Using post-hoc two-sample t-tests to decompose these interactions, we found that, relative to the unfamiliar group, the familiar group showed reduced adaptation between the novel goods and their related components in mPFC (group difference: trend, t(18) = 1.70, p = 0.053, Fig. 4c), and in the hippocampus (group difference: t(18) = 3.11, p = 0.003, Fig. 4c). Furthermore, the familiar group did not show plasticity in mPFC between the representation of the constituent components of a novel good (group difference: t(18) = 1.96, p = 0.033, Fig. 4c). Crucially, there was no significant difference between groups in their ability to accurately elicit the correct representations during the imagination task (group comparison of accuracy: p = 0.82, and reaction time: p = 0.89), nor in the average subjective value assigned to any of the novel goods used in the adaptation task (see Supplementary Fig. 3). This result therefore suggests that even a single previous experience of a good is sufficient to reduce a requirement for online value construction. This is particularly notable given that extensive experience is required to reduce goal-oriented behavior and establish habitual actions21.
Temporal dynamics of the construction mechanism
If experiential and constructed valuation use distinct neural mechanisms, it is possible that the value construction mechanism could itself substitute for a direct experience and train experiential valuation mechanisms. As the experiment progressed, subjects had substantial experience of constructing the representation of the novel good. We asked whether, after multiple previous simulations of an experience, it was still necessary to construct and evaluate the representation of novel goods anew on each trial. Alternatively, were values learnt despite participants never having experienced the novel good? As our experiment extended over three separate blocks, we were able to study changes in value construction-related adaptation effects over time.
Previous studies have found goal directed choice mechanisms exhibit marked differences early and late in choice experiments17. In the present study, we used a three-way ANOVA (see Methods) to identify attenuation of adaptation effects in mPFC and hippocampus in the unfamiliar group across the scanning session (block*condition interaction for mPFC, p = 0.004, F(1,144) = 8.44, and block*adaptation-type interaction for hippocampus, p = 0.011, F(1,144) = 6.56). Post-hoc t-tests comparing block 1 with all remaining blocks revealed a significant reduction in adaptation over time of a novel-good to its related component (mPFC: t(18) = 2.12, p = 0.024, and hippocampus: t(18) = 2.13, p = 0.024; Fig. 5a), and in the plasticity between related components (mPFC: t(18) = 1.85, p = 0.041, but not hippocampus: t(18) = 0.81, p = 0.785; Fig. 6a).
To ensure sensitivity to the construction process was maintained across the duration of the experiment, we also considered temporal dynamics of other adaptation effects, and of value signals encoded on decision trials. In the unfamiliar group, both adaptation in mPFC to repetition of any item (but not stimulus) and adaptation in visual areas to repetition of a stimulus did not show reduction over time (one-tailed paired t-tests: t(18) = 0.46, p = 0.326, Fig. 5a; and t(18) = 0.50, p = 0.312, Supplementary Fig. 4a respectively). Furthermore, the chosen value signal encoded by mPFC also did not reduce over time, but instead remained consistent across sessions (Fig. 3c). In addition, performance on the imagination task improved across blocks (Fig. 5b–c). Rather than a loss of sensitivity, this suggests that the diminishing adaptation effects demonstrate that simulated experience is sufficient to establish an independent representation of the novel good that no longer needs to be reconstructed anew on each trial.
Despite the overall reduction of cross-component suppression over the course of the experiment, this was not true for components that had been used to construct high value novel goods. When averaging across the final two blocks, both the mPFC and hippocampus showed a significant positive correlation with the value of the compound items (mPFC: r = 0.64, p = 0.002, Fig. 6b; hippocampus: r = 0.63, p = 0.003, Fig. 6d; after accounting for variance explained by the value of the component items in both cases). After accounting for variance explained by component value, a median split of participants according to the value assigned to the novel goods verified that there was long lasting plasticity in mPFC and hippocampus in the final two blocks for those participants who attributed high but not low values to the novel goods (mPFC: ‘High’ t(8) = 2.84, p = 0.022, and ‘High vs Low’ t(8) = 2.68, p = 0.028, Fig. 6c; Hippocampus: ‘High’ t(8) = 3.52, p = 0.008, and ‘High vs Low’ t(8) = 5.36, p<0.001, Fig. 6e). Suggestive evidence that value-dependent adaptation between related component items emerged later in hippocampus relative to mPFC (‘High’, Fig. 6c vs 6e), could not be verified statistically (t(8) = 1.30, p = 0.229). Together these results suggest that the plasticity is long lasting if value is attributed to the original association.
Discussion
The role of memory in prospective evaluation and inference has previously been emphasized in both animals22 and humans3,4,20. Simulation and preplay can be used to explore an ‘internal model’ of the environment and evaluate anticipated outcomes8,23. However, the neural mechanisms by which these processes are achieved have remained unclear, particularly in circumstances where anticipated outcomes have not previously been experienced. Here we used repetition suppression in fMRI to reveal a neuronal mechanism that supports prospective representation and evaluation of novel experiences.
Repetition suppression has been used extensively in sensory brain regions to probe the information content of neural activations, and more recently in more frontal brain regions including orbitofrontal cortex24. However, a number of different hypotheses have been proposed to explain the underlying physiological mechanisms behind the phenomenon, including fatigue, sparse coding and predictive coding10,25–27. Although there is not yet a consensus on which mechanism provides the most appropriate explanation for the phenomenon, when used in a carefully controlled experimental design, the consequences of this ambiguity are mitigated since all models make the same prediction: if a neural population is sensitive to a particular feature or dimension, then suppression will occur in response to a repetition of this feature, but not others.
The repetition suppression paradigm used here was designed to allow interrogation of the underlying representation of a novel reward. By asking people to imagine and evaluate novel rewards in the scanner, we found that the neural representation of a novel reward was dependent upon representations of multiple related and previously experienced rewards. Our data suggest that neuronal networks can construct a novel experience by simultaneous activation of multiple previous memories, in order that this constructed experience may be evaluated. Whilst signals in the anterior hippocampus were found to be related to construction, those in medial prefrontal cortex were related to both construction and valuation.
Crucially, unlike other goal directed decision mechanisms that have been reported previously21,23,28,29, we only found evidence for a construction mechanism when subjects had no direct experience of an outcome, and even then only fleetingly. It is possible, therefore, that constructed value can provide a substitute for direct experience, and train the experiential goal-directed systems that have been studied previously. This training signal may be considered analogous to off-line training of an habitual system which makes use of simulations from an internal goal-directed model23,30–32. Whereas the teaching signal provided to a habitual system replicates, or fine-tunes, previous sensory experience, the teaching signal provided to a goal-directed system may establish an internal model of the future world by repeated imagination of a novel experience.
During the construction process, a second repetition suppression effect was observed between distinct and previously unassociated memories that contributed to the construction. This effect implies that the neural representation of related, compared to unrelated, component items became more similar as a consequence of the pre-scan training task, during which the participants were first exposed to the novel compounds. Notably, since the suppression was not observed in the familiar group it seems highly unlikely that this suppression effect reflects inherent similarity between related compared to unrelated components. Rather, the most plausible explanation for this change is that through repeated representation of a novel compound, previously unrelated memories were recruited simultaneously, inducing a form of plasticity between the underlying representations of necessary components.
Within both brain regions involved in construction, the mPFC and hippocampus, plasticity between related components was dependent upon the value of the novel compounds, but not the value of components. This value dependence effect suggests that the representations of the component memories were simultaneously present during valuation of the novel compounds. A number of different mechanistic explanations may underlie this dependency. For example, the occurrence of greater BOLD activity at the time of pairing may induce more plasticity, or alternatively when representing a higher value compound, the enhanced availability of neuromodulators, such as dopamine, may serve to facilitate plasticity.
Given that on average participants showed a reduction over time in the initial plasticity observed in mPFC, with comparable dynamics to the construction mechanism, it must be acknowledged that it remains ambiguous whether the adaptation observed between related components reflects classical Hebbian plasticity, or even occurs in the same regions as those where the repetition suppression is observed. However, those participants who assigned high value to the novel goods showed plasticity in mPFC that outlasted the construction process. In the hippocampus, where plasticity was not observed early on in the experiment, the same participants showed plasticity late in the experiment. Therefore, both the extent to which neural representations of related components became more similar to one another and the durability of the effect were dependent upon the value attributed to the novel compounds. Irrespective of the underlying nature of the plasticity, the influence of compound value upon component memories thus supports the claim that these representations are paired together at the time of construction of the novel compounds.
The medial prefrontal cortex is regularly activated in studies of valuation33,17,34–36,18, and is particularly notable amongst such reward-related regions for the flexibility of the value signals that it contains. These computations may, for example, rely on an understanding of the complex structure of the environment5, the generalisation of concepts learnt in different situations3, or the integration of several disparate sources of information37. If subjects are asked to ignore all of their own experiences and preferences, and instead to guess what a very different individual would choose, mPFC value signals can immediately reflect the preferences of this new individual7,38. Such online evaluation is a hallmark property of ‘goal-directed’ choices, which are frequently contrasted with ‘habitual’ or overlearnt choices in studies of animal and human behaviour6,21,23,29,39,40. Previous studies of goal-oriented behaviour have, however, focused on situations where values are known, but must be associated with a particular course of action by inferring the structure of the world1,23,29. Our data suggest that medial prefrontal cortex can combine previous experiences to construct prospective outcomes de novo on each trial, and can then evaluate these constructed outcomes.
Hippocampal preplay mechanisms are known to be important substrates for goal-directed spatial decisions in rodents8,41, and hippocampal value signals can be recorded in situations where outcomes must be inferred from knowledge of relationships between stimuli in the world1,42. Notably, hippocampal activity is often recorded in concert with a network involving mPFC in studies of spatial memory and scene construction12,19,43. Consistent with the proposed function of memory in prospective inference44,45, the formation of associative links46,47, and constructive episodic simulation48,49, our data suggest that hippocampal activity can also play an active role in constructing de novo experiences in non-spatial contexts.
These findings show that a potential new experience can be prospectively represented and evaluated by invoking multiple memories simultaneously within hippocampus and medial prefrontal cortex. By highlighting this neuronal mechanism we provide unique insight into the neuronal computations underlying flexible behaviors that dominate human decision making and which are difficult to study in animal models.
Online Methods
Participants
Thirty-nine healthy volunteers participated in the fMRI experiment, and were assigned to one of two groups (unfamiliar and familiar) by drawing from Matlab’s pseudo-random number generator. One participant (from the familiar group) was excluded from all further analyses due to poor performance (less than 80% accuracy on task performance during any one session). All remaining participants (19 unfamiliar participants with mean age: 28.0, 13 females, and 19 familiar participants with mean age: 27.3, 10 females) were included in further analyses with the exception of one participant (from the unfamiliar group) who was excluded from analysis of the decision task due to parameter estimates being more than 3.5 standard deviations away from the group mean. The final sample size was comparable to that commonly used in fMRI studies. All participants refrained from eating for two hours prior to the start of the experiment. The study was approved by a local UCL ethics committee (ref. number 3486/001) and all participants gave informed written consent.
Experimental task
Behavioral training
Thirteen different novel food combinations, or ‘goods’, were presented to the participants along with their names: tea-jelly, tomato-jam, popcorn-jelly beans, beetroot-custard, onion-mints, pea-mousse, olive-strawberry, pesto-nutella, spinach-pineapple smoothie, raspberry-avocado smoothie, vanilla-salt, yoghurt-pretzels, and coffee-yoghurt. Each good was formed by combining two familiar component food types which had not previously been tasted together (Fig. 1a). The experimenter chose two novel goods, AB and CD, for each participant, under the constraint that the participant liked all four individual component foods (A and B, C and D) from which the two novel goods were formed. All participants were given a small sample of the components (A, B, C, D) to eat, but only participants in the familiar group were allowed to taste, smell and handle the novel goods.
The experimenter randomly assigned two abstract pink shapes to each novel good (AB and CD) and to their respective components (Fig. 1c). Participants were then actively trained on the twelve stimulus-item pairings using a reaction time task. On each trial one of the twelve abstract shapes was shown for 400ms before all six possible items were presented in randomized positions across the screen. Participants were instructed to press the button associated with the correct item as quickly and accurately as possible. Participants were required to continue with this stimulus-item learning task until their average reaction time per block approached 800ms with 100% accuracy.
Scanning
Whilst in the scanner, participants performed two different tasks, a decision making task and an imagination task. The tasks were evenly divided across the three twenty-minute scan sessions with 26 decision trials, and 240 imagination trials in each session.
During the decision task, participants performed 78 choice trials. On each trial, photographs of two novel goods were shown on the screen and participants were given four seconds to evaluate the two options, before being prompted to indicate their preference (Fig. 1b). To encourage veridical evaluation, participants agreed to eat the chosen food from one of their decision trials (chosen at random by the computer) after exiting the scanner.
During the imagination task, an abstract pink shape was presented for 800ms on each trial and this served as an instruction cue to vividly imagine the food item associated with the shape (Fig. 1d). The inter-trial interval was selected from a truncated gamma distribution with mean 2.5s. The trials were sorted into seven principal categories with thirty-two trials of each category per scan session, presented in a randomized order. The seven different categories were as follows: (1) novel good preceded by related component (AB preceded by A); (2) novel good preceded by unrelated component (AB preceded by C); (3) component preceded by the related component (A preceded by B); (4) any food item preceded by another unrelated food item in the same category, such as component preceded by unrelated component or novel good preceded by the other novel good (A preceded by C, or AB preceded by CD); (5) any food item (component or novel-good) preceded by the same food item but predicted by a different abstract stimulus (A preceded by A, or AB preceded by AB); (6) any food item (component or novel-good) preceded by the same food item and predicted by the same abstract stimulus (A preceded by A, or AB preceded by AB); (7) abstract stimulus which had no association with a food outcome preceded by itself. The remaining sixteen trials were necessary to ensure equal numbers of trials in each of the seven conditions.
Of particular interest for our analysis were conditions 1 to 4. We reasoned that neuronal assemblies which used the components to construct the novel goods would show adaptation (reduced response) in (1) relative to (2). Furthermore, if imagination of a component caused activation of the related component, we would expect reduced response in (3) relative to (4).
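The seven trial categories can be made concrete by labelling each trial from the identity of the current and preceding item. The following is a minimal sketch of that labelling rule as we read it from the category definitions above; the function name, the integer shape identifiers and the "none" label for the no-outcome stimulus are hypothetical, and transitions that fall outside the seven categories (for example a component preceded by a novel good) are returned as None.

```python
from typing import Optional

# The two novel goods and their familiar components.
COMPOUNDS = {"AB": ("A", "B"), "CD": ("C", "D")}
# Map each component to the novel good it belongs to.
PARENT = {part: name for name, parts in COMPOUNDS.items() for part in parts}
NO_OUTCOME = "none"  # hypothetical label for the stimulus with no food outcome

def classify(prev_item: str, prev_shape: int,
             cur_item: str, cur_shape: int) -> Optional[int]:
    """Return the category (1-7) of the current trial, or None if uncategorised."""
    if prev_item == NO_OUTCOME and cur_item == NO_OUTCOME:
        return 7 if prev_shape == cur_shape else None
    if NO_OUTCOME in (prev_item, cur_item):
        return None
    if prev_item == cur_item:
        # Same food item: category depends on whether the shape also repeats.
        return 6 if prev_shape == cur_shape else 5
    cur_is_compound = cur_item in COMPOUNDS
    prev_is_compound = prev_item in COMPOUNDS
    if cur_is_compound and not prev_is_compound:
        # Novel good preceded by a related (1) or unrelated (2) component.
        return 1 if prev_item in COMPOUNDS[cur_item] else 2
    if not cur_is_compound and not prev_is_compound:
        # Component preceded by its partner (3) or an unrelated component (4).
        return 3 if PARENT[prev_item] == PARENT[cur_item] else 4
    if cur_is_compound and prev_is_compound:
        return 4  # novel good preceded by the other novel good
    return None  # component preceded by a novel good

# Trial budget per session: seven categories of 32 trials plus 16 remaining.
trials_per_session = 7 * 32 + 16
```

Under this labelling, the construction effect corresponds to a lower response in category 1 than in category 2, and the cross-component effect to a lower response in category 3 than in category 4.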
For each scan session, fourteen yes/no questions were randomly presented during the imagination task. Each question concerned properties of the food associated with the abstract shape on the last trial, for example ‘Was the outcome salty?’. Participants received fifty pence for each correct response. The adjectives used were chosen to encourage participants to elicit multisensory representations of each item, and concerned the appearance, texture, taste and smell of the food items.
Post-scan behavioral task
At the end of the scanning session, after participants had not eaten for a total of five hours, they were sold one of the novel-goods using a Becker-DeGroot-Marschak (BDM) auction procedure14, following a previously reported protocol50. The BDM is known to elicit a measure of a participant’s willingness to pay for a good35, therefore providing a measure of subjective value for each novel good.
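For readers unfamiliar with the procedure, a single BDM round can be sketched as follows. The uniform price distribution, the maximum price and the function name are assumptions made for illustration; the exact parameters used here are those of the cited protocol and are not reproduced in this sketch.

```python
import random

def bdm_round(bid, max_price, rng):
    """Run one Becker-DeGroot-Marschak auction round.

    Returns (bought, price_paid). The participant buys only if the bid is at
    least the randomly drawn price, and then pays the drawn price, not the bid.
    """
    # Assumption: the price is drawn uniformly and independently of the bid.
    price = rng.uniform(0.0, max_price)
    if bid >= price:
        return True, price
    return False, 0.0

rng = random.Random(1)
bought, paid = bdm_round(bid=1.20, max_price=3.00, rng=rng)
```

Because the price paid does not depend on the bid itself, overbidding or underbidding cannot improve the outcome, which is why the bid is typically interpreted as willingness to pay.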
fMRI data acquisition and pre-processing
T2*-weighted echo-planar images (EPI) with blood oxygen level-dependent (BOLD) contrast were acquired using a 32-channel head coil on a 3 Tesla Trio MRI scanner (Siemens, Erlangen, Germany). A special sequence was used to minimize signal drop out in the OFC region and included an echo time (TE) of 70ms, a tilt of 30° relative to the rostro-caudal axis and a local z-shim with a moment of −0.4 mT/m ms applied to the OFC region. To achieve whole-brain coverage, we used 43 transverse slices of 2mm thickness, with an inter-slice gap of 1mm and in-plane resolution of 3×3 mm, and collected slices in an ascending order. This led to a repetition time (TR) of 3.01 seconds. In each session roughly 430 volumes were collected (~20 minutes) and the first five volumes were discarded to allow for T1 equilibration effects. A fieldmap with dual echo-time images (TE1 = 10ms, TE2 = 14.76ms, whole brain coverage, voxel size 3×3×3 mm) and a single T1-weighted structural image with 1×1×1mm voxel resolution were acquired for each participant to correct for geometric distortions and co-register the EPI images, respectively.
Preprocessing and statistical analyses were carried out using SPM8 (Wellcome Trust Centre for Neuroimaging, London, UK, www.fil.ion.ucl.ac.uk/spm). After discarding the first five volumes, images were corrected for signal bias, realigned to the first volume, corrected for distortion using fieldmaps, normalized to a standard EPI template, and smoothed using an 8mm full-width at half maximum Gaussian kernel.
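The smoothing kernel is specified by its full-width at half maximum (FWHM). As a small worked conversion, the standard deviation of the corresponding Gaussian follows from sigma = FWHM / (2·sqrt(2·ln 2)). The function name below is illustrative and is not part of SPM8.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full-width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

sigma_mm = fwhm_to_sigma(8.0)  # roughly 3.4 mm for the 8 mm kernel

# Sanity check: the Gaussian falls to half its peak at a distance of FWHM / 2.
half_height = math.exp(-(4.0 ** 2) / (2.0 * sigma_mm ** 2))
```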
Data analysis
Images were analyzed in an event-related manner using a general linear model (GLM) involving thirty-two explanatory variables. Twenty-six explanatory variables corresponded to conditions 1 to 6 which were further divided into the different food types. Four additional explanatory variables described the ‘no-outcome’ trials, the time of question presentation, the response to question, and the time of evaluation during the decision trials. Two parametric regressors were included, corresponding to the participant’s subjective value of the two novel-goods and time locked to the onset of the decision trials. An additional twenty-three nuisance regressors were included in the GLM to account for motion-related artifacts and physiological noise.
The primary aim of our analysis was to identify the neural mechanism underlying the construction of a novel-good. To detect brain regions involved in evaluating the novel goods we looked for activity modulated by chosen value during the decision task. To detect brain regions involved in constructing the novel-goods (‘component-to-compound’) we used the contrast [(AB preceded by C) - (AB preceded by A)], averaging across all possible permutations (i.e. explanatory variables (2) - (1) from above). To detect plasticity effects between the related components (‘component-to-component’) we used the contrast [(A preceded by C) - (A preceded by B)], again averaging across all possible permutations (i.e. explanatory variables (4) - (3) from above). To detect brain regions showing adaptation to a repeated item but not stimulus (‘item-to-self’) we used the contrast [(item preceded by different item) - (item preceded by itself but paired with a different stimulus)], i.e. explanatory variables (4) - (5) from above. To detect brain regions showing adaptation to a repeated stimulus (‘stimulus adaptation’) we used the contrast [(stimulus preceded by different stimulus and different item) - (stimulus preceded by itself)], i.e. explanatory variables (4) - (6) from above. The contrast images of all participants were entered into a second level random effects analysis.
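The four adaptation contrasts can be written as weight vectors over conditions 1 to 6, as in the minimal sketch below. It is a simplification: in the GLM described here each condition was further divided by food type, so the actual contrast vectors would average over those explanatory variables. The helper names are hypothetical.

```python
# Condition numbers follow the trial categories of the imagination task.
CONDITIONS = [1, 2, 3, 4, 5, 6]

def contrast(plus, minus):
    """Weight vector over CONDITIONS for the contrast (plus) - (minus)."""
    return [1 if c == plus else -1 if c == minus else 0 for c in CONDITIONS]

# In each case the control condition is weighted +1 and the adaptation
# condition -1, so a positive value indicates repetition suppression.
CONTRASTS = {
    "component-to-compound": contrast(plus=2, minus=1),
    "component-to-component": contrast(plus=4, minus=3),
    "item-to-self": contrast(plus=4, minus=5),
    "stimulus adaptation": contrast(plus=4, minus=6),
}

def apply_contrast(weights, betas):
    """Weighted sum of per-condition parameter estimates."""
    return sum(w * b for w, b in zip(weights, betas))
```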
For our initial analyses, we assessed contrasts using whole-brain family-wise error (FWE) corrected statistical significance. The cluster defining threshold was p<0.01 uncorrected and the corrected significance level defined as p<0.05. In the unfamiliar group, effects in mPFC were significant at the FWE corrected cluster level (shown in Figure 2a–c). To statistically assess hippocampal activity in our contrasts of interest (which did not survive cluster-based FWE thresholding), we tested the average signal from within a region of interest (ROI). This ROI approach was also used to test for a difference in mPFC signal across groups and across task blocks, and to test for repetition suppression effects in visual regions.
All ROIs were defined from contrasts that were orthogonal to the contrasts of interest, to allow statistical tests to be performed in an unbiased fashion. To define an ROI in the hippocampi we used the contrast identifying adaptation of ‘item-to-self’ averaged across all blocks (see above, thresholded at p<0.01 uncorrected). Firstly, a hippocampal ROI was defined from the unfamiliar group alone (see Fig. 2b), and secondly from the average of both groups (shown in Fig. 4b, Fig. 5a, and Fig. 6a, and used thereafter). To assess all construction related adaptation effects in mPFC, across groups and between blocks, we defined an ROI from the average of the two construction related contrasts (‘component-to-compound’ and ‘component-to-component’) from both groups and across all blocks (thresholded at p<0.001 uncorrected; ROI shown in Fig. 4a, Fig. 5a, Fig. 6a).
To compare the value signals in mPFC encoded by the two groups during the decision task, an ROI was defined from the average of the contrast for chosen value across the two groups (thresholded at p<0.01 uncorrected; shown in Fig. 3b). To investigate whether adaptation of component ‘item-to-self’ in mPFC reduced across the duration of the experiment, an ROI was defined using the ‘item-to-self’ contrast when including only component trials, and averaged across all blocks in the unfamiliar group (thresholded at p<0.01 uncorrected; shown in Fig. 5a). A final ROI was defined in visual regions to test the specificity and temporal dynamics of adaptation effects, and defined from a contrast identifying a main effect to any visual event, averaged across all blocks (thresholded at p<0.01 uncorrected; shown in Supplementary Fig. 4b). Note that differences between blocks or groups are, by definition, orthogonal to the group and block average effect.
The ROIs were then used to extract parameter estimates (as shown in the bar plots in Fig. 3–6) to test for significance between groups and across time. To assess differences between groups, we used a 3-way ANOVA to test for a main effect of and an interaction between: ‘group’, unfamiliar/familiar; ‘adaptation type’, component-to-compound/component-to-component; and ‘condition’, control/adaptation trial specific to the relevant adaptation type. To assess differences across time, we used a 3-way ANOVA to test for a main effect of and an interaction between: ‘block’, block 1/block 2 and 3; ‘adaptation type’, component-to-compound/component-to-component; and ‘condition’, control/adaptation trial specific to the relevant adaptation type. Post-hoc t-tests were then used to decompose the results of the ANOVA, using one-tailed t-tests to assess changes in signal within a group, and one-tailed two sample t-tests to assess differences across groups. The Kolmogorov-Smirnov goodness-of-fit hypothesis test was used to check that data were approximately normally distributed.
In the unfamiliar group, first a partial correlation was performed between ‘component-to-component’ suppression effects and the average value participants assigned to the two novel goods (during the BDM), after removing signal attributable to the component value (see Fig. 2d–e and Fig. 6b,d). The adaptation signal was extracted from the mPFC ROI shown in Fig. 4a, 5a, and 6a, and the hippocampus ROI shown in Fig. 4b, 5a and 6a, averaged across all blocks (Fig. 2d–e), and the analysis was then repeated using the adaptation signal from the final two blocks of trials (Fig. 6b,d). The former correlations were compared with one between adaptation effect size and average component value, using a similar partial correlation with average component value after removing effects attributable to compound value (see Supplementary Fig. 1). One participant was excluded from these correlations due to missing data for the value of items. Finally, participants were divided using a median split according to high and low value attributed to the novel goods, and two-tailed t-tests and two-tailed paired t-tests were used to assess adaptation in the latter two blocks between related components after variance attributable to the average component value had been removed.
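A first-order partial correlation of the kind described above can be sketched with the standard formula based on pairwise Pearson correlations. Here x would stand for the adaptation effect, y for the average novel-good value and z for the average component value; the function names are illustrative, and the numbers are toy values, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy numbers for illustration only.
adaptation = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]
novel_value = [1.0, 1.5, 0.8, 2.2, 1.1, 1.9]
component_value = [0.9, 1.0, 1.2, 1.6, 0.7, 1.4]
r_partial = partial_corr(adaptation, novel_value, component_value)
```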
Acknowledgments
This study was supported by the Wellcome Trust (grant WT088312AIA to T.E.J.B., and Senior Investigator Award to R.J.D. 098362/Z/12/Z), and the Medical Research Council (4-year PhD studentship, G1000411, to H.C.B.). The Wellcome Trust Centre for Neuroimaging is supported by core funding from the Wellcome Trust Strategic Award Grant 091593/Z/10/Z. We thank P. Dayan, E. A. Maguire, N. Burgess, S. W. Kennerley, L.T. Hunt, and E. D. Boorman for helpful comments on an earlier draft of the manuscript; and H. Blumenthal’s Fat Duck Cookbook for recipe inspiration.
References
- 1. Wimmer GE, Shohamy D. Preference by association: how memory mechanisms in the hippocampus bias decisions. Science. 2012;338:270–273. doi: 10.1126/science.1223252.
- 2. Peters J, Büchel C. Episodic future thinking reduces reward delay discounting through an enhancement of prefrontal-mediotemporal interactions. Neuron. 2010;66:138–148. doi: 10.1016/j.neuron.2010.03.026.
- 3. Kumaran D, Summerfield JJ, Hassabis D, Maguire EA. Tracking the emergence of conceptual knowledge during human decision making. Neuron. 2009;63:889–901. doi: 10.1016/j.neuron.2009.07.030.
- 4. Zeithamova D, Dominick AL, Preston AR. Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron. 2012;75:168–179. doi: 10.1016/j.neuron.2012.05.010.
- 5. Hampton AN, Bossaerts P, O’Doherty JP. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J. Neurosci. 2006;26:8360–8367. doi: 10.1523/JNEUROSCI.1010-06.2006.
- 6. Gläscher J, Daw N, Dayan P, O’Doherty JP. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron. 2010;66:585–595. doi: 10.1016/j.neuron.2010.04.016.
- 7. Nicolle A, et al. An agent independent axis for executed and modeled choice in medial prefrontal cortex. Neuron. 2012;75:1114–1121. doi: 10.1016/j.neuron.2012.07.023.
- 8. Johnson A, Redish AD. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 2007;27:12176–12189. doi: 10.1523/JNEUROSCI.3761-07.2007.
- 9. Steiner AP, Redish AD. The road not taken: neural correlates of decision making in orbitofrontal cortex. Front Neurosci. 2012;6:131. doi: 10.3389/fnins.2012.00131.
- 10. Grill-Spector K, Henson R, Martin A. Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn. Sci. (Regul. Ed.) 2006;10:14–23. doi: 10.1016/j.tics.2005.11.006.
- 11. Grill-Spector K, Henson R, Martin A. Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn. Sci. (Regul. Ed.) 2006;10:14–23. doi: 10.1016/j.tics.2005.11.006.
- 12. Doeller CF, Barry C, Burgess N. Evidence for grid cells in a human memory network. Nature. 2010;463:657–661. doi: 10.1038/nature08704.
- 13. Kourtzi Z, Kanwisher N. Representation of perceived object shape by the human lateral occipital complex. Science. 2001;293:1506–1509. doi: 10.1126/science.1061133.
- 14. Becker GM, DeGroot MH, Marschak J. Measuring utility by a single-response sequential method. Behav Sci. 1964;9:226–232. doi: 10.1002/bs.3830090304.
- 15. Kable JW, Glimcher PW. The neurobiology of decision: consensus and controversy. Neuron. 2009;63:733–745. doi: 10.1016/j.neuron.2009.09.003.
- 16. Boorman ED, Behrens TEJ, Woolrich MW, Rushworth MFS. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron. 2009;62:733–743. doi: 10.1016/j.neuron.2009.05.014.
- 17. Hunt LT, et al. Mechanisms underlying cortical activity during value-guided choice. Nat. Neurosci. 2012;15:470–476. S1–3. doi: 10.1038/nn.3017.
- 18. Jocham G, Hunt LT, Near J, Behrens TEJ. A mechanism for value-guided choice based on the excitation-inhibition balance in prefrontal cortex. Nat. Neurosci. 2012;15:960–961. doi: 10.1038/nn.3140.
- 19. Hassabis D, Kumaran D, Maguire EA. Using imagination to understand the neural basis of episodic memory. J. Neurosci. 2007;27:14365–14374. doi: 10.1523/JNEUROSCI.4549-07.2007.
- 20. Schacter DL, Addis DR. Constructive memory: the ghosts of past and future. Nature. 2007;445:27. doi: 10.1038/445027a.
- 21. Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998;37:407–419. doi: 10.1016/s0028-3908(98)00033-1.
- 22. Tolman EC. Cognitive maps in rats and men. Psychol Rev. 1948;55:189–208. doi: 10.1037/h0061626.
- 23. Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. Model-based influences on humans’ choices and striatal prediction errors. Neuron. 2011;69:1204–1215. doi: 10.1016/j.neuron.2011.02.027.
- 24. Klein-Flügge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TEJ. Segregated Encoding of Reward–Identity and Stimulus–Reward Associations in Human Orbitofrontal Cortex. J. Neurosci. 2013;33:3202–3211. doi: 10.1523/JNEUROSCI.2532-12.2013.
- 25. Wiggs CL, Martin A. Properties and mechanisms of perceptual priming. Curr. Opin. Neurobiol. 1998;8:227–233. doi: 10.1016/s0959-4388(98)80144-x.
- 26. Desimone R. Neural mechanisms for visual memory and their role in attention. Proc. Natl. Acad. Sci. U.S.A. 1996;93:13494–13499. doi: 10.1073/pnas.93.24.13494.
- 27. Summerfield C, Trittschuh EH, Monti JM, Mesulam M-M, Egner T. Neural repetition suppression reflects fulfilled perceptual expectations. Nat Neurosci. 2008;11:1004–1006. doi: 10.1038/nn.2163.
- 28. Johnson A, Van der Meer MA, Redish AD. Integrating hippocampus and striatum in decision-making. Current Opinion in Neurobiology. 2007;17:692–697. doi: 10.1016/j.conb.2008.01.003.
- 29. Jones JL, et al. Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science. 2012;338:953–956. doi: 10.1126/science.1227489.
- 30. Sutton RS. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the seventh international conference on machine learning. 1990;216:224.
- 31. Johnson A, Redish AD. Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model. Neural Netw. 2005;18:1163–1171. doi: 10.1016/j.neunet.2005.08.009.
- 32. Gershman SJ, Markman AB, Otto AR. Retrospective Revaluation in Sequential Decision Making: A Tale of Two Systems. J Exp Psychol Gen. 2012. doi: 10.1037/a0030844.
- 33. Kable JW, Glimcher PW. The neural correlates of subjective value during intertemporal choice. Nat. Neurosci. 2007;10:1625–1633. doi: 10.1038/nn2007.
- 34. FitzGerald THB, Seymour B, Dolan RJ. The role of human orbitofrontal cortex in value comparison for incommensurable objects. J. Neurosci. 2009;29:8388–8395. doi: 10.1523/JNEUROSCI.0717-09.2009.
- 35. Plassmann H, O’Doherty J, Rangel A. Orbitofrontal Cortex Encodes Willingness to Pay in Everyday Economic Transactions. J. Neurosci. 2007;27:9984–9988. doi: 10.1523/JNEUROSCI.2131-07.2007.
- 36. Levy I, Lazzaro SC, Rutledge RB, Glimcher PW. Choice from non-choice: predicting consumer preferences from blood oxygenation level-dependent signals obtained during passive viewing. J. Neurosci. 2011;31:118–125. doi: 10.1523/JNEUROSCI.3214-10.2011.
- 37. Behrens TEJ, Hunt LT, Woolrich MW, Rushworth MFS. Associative learning of social value. Nature. 2008;456:245–249. doi: 10.1038/nature07538.
- 38. Janowski V, Camerer C, Rangel A. Empathic choice involves vmPFC value signals that are modulated by social processing implemented in IPL. Soc Cogn Affect Neurosci. 2012. doi: 10.1093/scan/nsr086.
- 39. Daw ND, Niv Y, Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 2005;8:1704–1711. doi: 10.1038/nn1560.
- 40. McDannald MA, et al. Model-based learning and the contribution of the orbitofrontal cortex to the model-free world. Eur. J. Neurosci. 2012;35:991–996. doi: 10.1111/j.1460-9568.2011.07982.x.
- 41. Dragoi G, Tonegawa S. Preplay of future place cell sequences by hippocampal cellular assemblies. Nature. 2011;469:397–401. doi: 10.1038/nature09633.
- 42. Doeller CF, Burgess N. Distinct error-correcting and incidental learning of location relative to landmarks and boundaries. Proc. Natl. Acad. Sci. U.S.A. 2008;105:5909–5914. doi: 10.1073/pnas.0711433105.
- 43. Hassabis D, Maguire EA. Deconstructing episodic memory with construction. Trends in Cognitive Sciences. 2007;11:299–306. doi: 10.1016/j.tics.2007.05.001.
- 44. Klein SB, Loftus J, Kihlstrom JF. Memory and Temporal Experience: the Effects of Episodic Memory Loss on an Amnesic Patient’s Ability to Remember the Past and Imagine the Future. Social Cognition. 2002;20:353–379.
- 45. Buckner RL. The role of the hippocampus in prediction and imagination. Annu Rev Psychol. 2010;61:27–48. C1–8. doi: 10.1146/annurev.psych.60.110707.163508.
- 46. Davachi L. Item, context and relational episodic encoding in humans. Current Opinion in Neurobiology. 2006:693–700. doi: 10.1016/j.conb.2006.10.012.
- 47. Rudy JW, Sutherland RJ. The hippocampal formation is necessary for rats to learn and remember configural discriminations. Behavioural Brain Res. 1989;34:97–109. doi: 10.1016/s0166-4328(89)80093-2.
- 48. Schacter DL, Addis DR. The cognitive neuroscience of constructive memory: remembering the past and imagining the future. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 2007;362:773–786. doi: 10.1098/rstb.2007.2087.
- 49. Addis DR, Wong AT, Schacter DL. Remembering the past and imagining the future: common and distinct neural substrates during event construction and elaboration. Neuropsychologia. 2007;45:1363–1377. doi: 10.1016/j.neuropsychologia.2006.10.016.
- 50. Harris A, Adolphs R, Camerer C, Rangel A. Dynamic construction of stimulus values in the ventromedial prefrontal cortex. PLoS ONE. 2011;6:e21074. doi: 10.1371/journal.pone.0021074.