Author manuscript; available in PMC: 2021 Nov 11.
Published in final edited form as: Neuron. 2020 Sep 3;108(3):526–537.e4. doi: 10.1016/j.neuron.2020.08.010

Processing in Lateral Orbitofrontal Cortex Is Required for Estimating Subjective Preference during Initial, but Not Established, Economic Choice

Matthew PH Gardner 1,*, Davied Sanchez 1, Jessica C Conroy 1, Andrew M Wikenheiser 2,3, Jingfeng Zhou 1, Geoffrey Schoenbaum 1,4,*
PMCID: PMC7666073  NIHMSID: NIHMS1622777  PMID: 32888408

SUMMARY

The orbitofrontal cortex (OFC) is proposed to be critical to economic decision making. Yet one can inactivate OFC without affecting well-practiced choices. One possible explanation of this lack of effect is that well-practiced decisions are codified into habits or configural-based policies not normally thought to require OFC. Here, we tested this idea by training rats to choose between different pellet pairs across a set of standard offers and then inactivating OFC subregions during choices between novel offers of previously experienced pairs or between novel pairs of previously experienced pellets. Contrary to expectations, controls performed as well on novel as experienced offers yet had difficulty initially estimating their subjective preference on novel pairs, difficulty exacerbated by lateral OFC inactivation. This pattern of results indicates that established economic choice reflects the use of an underlying model or goods space and that lateral OFC is only required for normal behavior when the established framework must incorporate new information.

In Brief

In the current study, Gardner et al. show that the OFC is not necessary for established economic choice behavior, despite its apparent dependence on a goods space or model; however, the lateral part of OFC is necessary for initially establishing or modifying that underlying model.

INTRODUCTION

Research into the function of the orbitofrontal cortex has revolved around a juncture of two prominent fields of behavioral neuroscience—economic decision making (Fellows, 2011; Levy and Glimcher, 2011; Padoa-Schioppa and Assad, 2006; Padoa-Schioppa and Conen, 2017; Plassmann et al., 2007, 2010; Rich and Wallis, 2016; Rudebeck and Murray, 2014) and the study of model-based behaviors, including those revealing the use of so-called cognitive maps (Bradfield et al., 2015b; Constantinescu et al., 2016; Gallagher et al., 1999; Howard et al., 2020; Izquierdo et al., 2004; Jones et al., 2012; Parkes et al., 2018; Takahashi et al., 2013; Wang et al., 2020; West et al., 2011; Wilson et al., 2014). Work within each field has remained relatively independent, which has led to a difficulty in resolving a more generalized theory of orbitofrontal function (Padoa-Schioppa and Schoenbaum, 2015).

Recent research from our lab found that the orbitofrontal cortex (OFC) is not always required for economic choice. For instance, the OFC was not necessary for economic decisions made in a well-established goods space (Gardner et al., 2017, 2018); however, it became necessary for those same choices when the relative values of the goods were modified (Gardner et al., 2019). This dichotomous finding suggests an intriguing link between OFC’s role in economic choice behavior and its role in behavior dependent on cognitive maps. Cognitive maps are particularly advantageous for novel situations in which valid predictions of future events are impossible without knowledge of a viable underlying model (Wilson et al., 2014), as occurs in the case of an economic choice between known goods after revaluation.

Based on these recent findings, we hypothesized that economic choice would be OFC dependent specifically when the underlying models for economic decisions must be updated to accommodate new information. To test this proposal, we set out to determine whether OFC is required for two different types of novel decisions. In one, subjects were presented with novel offers of previously experienced pellet pairs: never-experienced offers, such as 6 pellets of food A versus 4 pellets of food B, were presented alongside previously experienced offers of food A versus food B (6:1, 4:1, 3:1, 2:1, 1:1, 1:2, 1:3, 1:4, and 1:6). In the other experiment, subjects were presented with novel pairs of previously experienced pellets: the same set of standard offers was used, but with a pair of food pellets, such as types A and C, that the rats had experienced individually but had never chosen between. We reasoned that novel offers and novel pairs would give us substantial insight into the underlying basis of economic choice, while at the same time identifying more specifically when and why the OFC is required for such behavior.

RESULTS

Rats were trained on the economic choice task used in previous studies (Gardner et al., 2017, 2018, 2019). Briefly, within this task, hungry rats were presented with choices between different amounts of unique food pellets. On each trial, two different visual stimuli were presented, each signaling the type and amount of a particular food pellet available on that trial (Figure 1A). Rats chose by touching the screen with the preferred option after a 1-s viewing period, during which rats must maintain a nosepoke hold at a central port. Rats learned 8 visual cue → food-type associations, resulting in 28 possible pairs. Two of the cues were withheld in the initial training set of experienced pairs so they could be used for the novel pair experiment. Training sessions on the full task ranged from ~100 to 300 trials, over which 11 different offers of a particular pellet pair were randomly presented (see STAR Methods for more detail). The choice behavior across each of the offers in a session was used to construct a psychometric curve reflecting the subjects’ relative preference between the two pellets on offer.
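The pair count quoted above follows from simple combinatorics: 8 cue-pellet associations give 8 × 7 / 2 = 28 unordered pairs. The snippet below is only an illustrative check; the cue labels are placeholders and do not correspond to the actual stimuli or pellets.

```python
from itertools import combinations

# Hypothetical labels for the 8 visual cue -> food-type associations.
# These names are placeholders, not the stimuli used in the study.
CUES = ["A", "B", "C", "D", "E", "F", "G", "H"]


def all_pairs(cues):
    """Return every unordered pair of distinct cues."""
    return list(combinations(cues, 2))


pairs = all_pairs(CUES)
print(len(pairs))  # number of possible pellet pairs
```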

Figure 1. General Task Design and Description of the Novel Offer and Novel Pairs Experiments.

(A) Schematic of a single trial on the economic choice task. Rats must nosepoke at a central port following onset of an auditory stimulus. The rats must then maintain the nosepoke hold for 2 s, which is indicated by cessation of the auditory cue. After the first second of the hold, visual stimuli are presented on the screens indicating the current offer. After completing the hold, rats can make a decision by pressing one of the screens.

(B) The set of visual stimuli used in the experiment. Rats were trained to associate the symbol (row) of each visual stimulus with a specific type of food pellet and the number of segmentations within the stimulus (column) with the number of pellets available. Visual symbol - food pellet associations remained constant throughout the experiment. The bottom two stimuli were used for only the novel pairs experiment.

(C) Design for the novel offers experiment. One pair of the 10 possible symbol-pellet pairs was randomly chosen, and the 11 standard offers rats experienced during training (black circles) were given for a warm-up session. On the subsequent experimental session, rats were given two additional novel offer types (orange circles), in which the numbers of each pellet offered had never been chosen between before. To assess the contribution of the lateral and medial portions of the OFC, brain regions were either inactivated or not during all trials (during the choice period) of an experimental session. An example set of offers presented during an experimental session is shown with the standard offers (gray) and two novel offers (orange).

(D) Design for the novel pairs experiment. Rats were tested on up to 8 food pellet comparisons that they had never previously experienced (dotted lines). On the first day of experiencing a “novel pair,” brain regions were either inactivated or not on every trial (during the choice period) for 16 trials of each of the 11 standard offers for 176 total trials. Rats were then tested on a second day in which no inactivation occurred in order to be used as a reference for stable behavior. An example set of offers is shown for an experimental session. Unlike the “novel offer” experiment, the rats never had any previous experience with any of the offers because the two food pellets had never been included in the same session.

After reaching proficiency on this task and displaying transitivity in choices between several different pellets, rats in the current study underwent surgery, in which a virus containing NpHR was infused into lateral (n = 6) or medial OFC (n = 7) and fibers were implanted bilaterally over each area to allow optogenetic inactivation of neurons there during task performance (Figures 2A and 2B). After ~10–12 weeks for recovery, viral expression, and re-acclimatization to the task while tethered to fiber-optic cables, the rats were tested in two behavioral experiments, in which they made decisions for the first time on the economic choice task—one involving novel offers and another involving novel pairs—during optogenetic inactivation of OFC.

Figure 2. Histological Verification of Fiber Placements and Viral Expression in Lateral and Medial Orbitofrontal Cortex.

(A) Schematic histological assessment of the extent of viral expression (middle) and the terminal fiber placement (left) for each of the rats in the lateral OFC inactivation group at ~3.0 mm anterior of bregma. Example of NpHR3.0-eYFP expression (green) and DAPI (blue; right) is shown.

(B) Same as (A) but for the medial OFC inactivation group, ~4.7 mm anterior of bregma.

Responding to Novel Offers Accurately Reflects Subjective Preference without New Learning and Does Not Depend on Orbitofrontal Processing

The first experiment was designed to probe whether well-trained rats use the configurations of particular offers in order to make reliable choices for established pellet pairs, as if using a large set of stimulus-response (S-R) associations or policies (Figure 1C). In other words, the rats might simply learn to emit a particular response to each pattern of visual stimuli presented on the two screens, e.g., 1 triangle on the left screen and 4 crescents on the right screen means it is best to press the right screen. This sort of “model-free” response strategy, which treats each cue combination as a complex cue and relies on what might be termed cached values or policies to generate the appropriate response, would in theory be OFC independent, thereby explaining why established behavior in prior studies has been insensitive to OFC inactivation.

To test for this, rats were given choices involving a previously experienced pellet pair but with the inclusion of novel offer types (e.g., 3:6, 2:4, 3:8, …), which were not part of the offer set they had previously experienced (Figure 1C). We reasoned that, if rats were using a strategy based on unique cue configurations to govern their behavior, then they would not be able to respond appropriately to novel offers, because these consisted of never-seen-before combinations of the cues. As a result, their performance on novel offers, at least initially, would have a substantially different profile than their performance on experienced offers of a similar ratio. On the other hand, if the rats were attending to each cue individually, recalling its value, and then comparing the two in some manner, then they should respond to these novel offers in the same framework as the well-learned offers. For a given test session, two novel offers were chosen at random from the remaining possible offers between the two familiar pellets being presented (2:2, 3:3, 4:4, 6:6, 6:8, 3:4, 2:3, 4:6, 2:4, 3:6, 4:8, 3:8, and 2:6). Pellet pairs changed for each test and were never repeated.
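To make the notion of "experienced offers of a similar ratio" concrete, the sketch below places each novel offer on a log-ratio axis and finds the closest standard offer. It is only an illustration. It uses the nine standard ratios listed in the Introduction, assumes offers are written as A:B, and uses a base-2 log, none of which is specified in the text.

```python
import math

# Standard offers as listed in the Introduction (assumed to be written A:B).
STANDARD = [(6, 1), (4, 1), (3, 1), (2, 1), (1, 1), (1, 2), (1, 3), (1, 4), (1, 6)]

# Novel offers listed in the text (same A:B orientation assumed).
NOVEL = [(2, 2), (3, 3), (4, 4), (6, 6), (6, 8), (3, 4), (2, 3),
         (4, 6), (2, 4), (3, 6), (4, 8), (3, 8), (2, 6)]


def log_ratio(offer):
    """Position of an offer on the log axis: log2(quantity of B / quantity of A)."""
    a, b = offer
    return math.log2(b / a)


def nearest_standard(offer):
    """Return the standard offer whose log ratio is closest to this offer's."""
    x = log_ratio(offer)
    return min(STANDARD, key=lambda s: abs(log_ratio(s) - x))


for offer in NOVEL:
    print(offer, "->", nearest_standard(offer))
```

Several novel offers (for example 3:6 or 2:6) share an exact ratio with a standard offer but use a cue combination the rats have never seen, which is what lets them separate a configural strategy from a value-comparison strategy.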

Contrary to the proposal that the rats were using configural S-R policies to perform the task, we found that the rats exhibited similar stability in measures of their pellet preference on the novel and established offer pairs (Figure 3A). This is evident in a visual inspection of the novel offers, which were generally positioned close to the psychometric curve, describing their preference for the two pellets. In order to quantify this effect, residuals of both offer types were determined using a leave-one-out method, in which a generalized linear model was fit without the offer of interest. We then compared the residuals of the novel offers to the residuals of the nearest adjacent established offers. Adjacent offers were used to mitigate issues of heteroscedasticity of choice behavior across the range of offers, as there is typically higher variance in choice behavior at the indifference point (IP) than at the tails of the sigmoid. The distribution of the residuals for the novel and the adjacent established offers is shown in Figure 3B. Although there is a small increase in the residuals of the novel offers in the control sessions (n = 45), this difference was not significant, as revealed by a mixed-effects one-way ANOVA (F1,134 = 0.55; p = 0.46). Further, the residuals for the novel offers were significantly less than the potential residuals for each of the novel offers (Figure 3C; F1, 134 = 294; p = 1.4e–35). To be sure that we were not missing an effect early in the session due to rapid learning, we ran the same analysis with just 4 trials of exposure to each novel pair. This analysis also revealed no significant effect of offer type (F1, 134 = 0.17; p = 0.68). Thus, performance on the novel offers was indistinguishable from performance on the comparable established offers, and there was no evidence of learning during the course of the session.
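The leave-one-out residual described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it substitutes a coarse grid search for the generalized linear model fit, uses a probit curve parameterized by an indifference point `mu` and a spread `sigma` on a log2 offer axis, and runs on made-up choice counts (16 trials per offer).

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_log_lik(mu, sigma, data):
    """Binomial negative log-likelihood of the curve P(B) = phi((x - mu) / sigma)."""
    total = 0.0
    for x, n_b, n in data:
        p = min(max(phi((x - mu) / sigma), 1e-9), 1.0 - 1e-9)
        total -= n_b * math.log(p) + (n - n_b) * math.log(1.0 - p)
    return total


def fit_probit(data):
    """Coarse grid search (a stand-in for a GLM fit) returning (mu, sigma)."""
    mus = [-2.0 + 0.05 * i for i in range(81)]    # -2.0 .. 2.0
    sigmas = [0.05 * (i + 1) for i in range(40)]  # 0.05 .. 2.0
    grid = ((m, s) for m in mus for s in sigmas)
    return min(grid, key=lambda ms: neg_log_lik(ms[0], ms[1], data))


def leave_one_out_residuals(data):
    """For each offer, fit without it and return observed minus predicted P(B)."""
    residuals = []
    for i, (x, n_b, n) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        mu, sigma = fit_probit(rest)
        residuals.append(n_b / n - phi((x - mu) / sigma))
    return residuals


# Hypothetical session: B:A ratios, number of B choices out of 16 trials each.
RATIOS = [1 / 6, 1 / 4, 1 / 3, 1 / 2, 1, 2, 3, 4, 6]
COUNTS = [0, 0, 0, 0, 4, 12, 15, 16, 16]
data = [(math.log2(r), c, 16) for r, c in zip(RATIOS, COUNTS)]

mu, sigma = fit_probit(data)
residuals = leave_one_out_residuals(data)
print(mu, sigma)
print([round(r, 3) for r in residuals])
```

Fitting without the offer of interest keeps that offer from pulling the curve toward itself, so its residual measures how far it sits from the preference implied by the other offers.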

Figure 3. Choices for Novel Offers Are Consistent with Established Preferences and Do Not Require OFC.

(A) Single session examples in which offer types rats had never previously experienced were given. Standard offers (black circles) are plotted alongside the two novel offer types (shown in red) for control sessions (first column), lateral OFC inactivation sessions (second column), and medial OFC inactivation sessions (third column). The percentage of trials in which the non-preferred pellet (pellet B) was chosen is plotted for each of the offer ratios (x axis; log scale).

(B) Histograms of the residuals for standard offers adjacent to the novel offer types (adjacent, dark color) and novel offer types (novel, lighter color). Residuals were based on a sigmoidal fit (probit analysis) using a leave-one-out approach. Columns are the same as in (A).

(C) Mean residuals for the different offer types are shown for each group, as well as the mean residuals of the potential range of disruption for the novel offer types. Columns are the same as in (B).

(D) The average paired difference in residuals between the standard adjacent and novel offers (left) and corresponding histograms (right) for each group (gray, controls; blue, lateral OFC inactivation; magenta, medial OFC inactivation). Error bars are plotted as SEM.

The above results suggest that rats are not using a strategy based on associating particular actions with configural cue combinations experienced during training, because they were able to immediately show their normal choice preference for novel offers, which involve novel cue combinations. This assumes of course that the established and novel offers both depend on the same underlying processes. If this is the case, then we would expect behavior to the novel offers to be insensitive to OFC inactivation, just like behavior to the established offers. To confirm this, we conducted additional test sessions (identical to and interleaved with the above), during which we optogenetically inactivated both the lateral (n = 10) and medial (n = 19) OFC during the choice phase of each trial. In accordance with our prior studies indicating that neither area is necessary for performance of the task in well-practiced behavior, we found no effect of inactivation on the inverse slope or the IP for either group as determined by a mixed-effects one-way ANOVA (inverse slope: lateral: F1, 26 = 0.05, p = 0.82; medial: F1, 33 = 0.90, p = 0.35; IP: lateral: F1, 26 = 2.50, p = 0.13; medial: F1, 33 = 0.88, p = 0.36). To test for effects of the inactivation on the novel offers, we again compared the residuals for controls and both experimental groups on the established and novel offers (Figures 3B and 3C). Two-factor mixed-effects ANOVAs with offer type (adjacent/novel) and group (control/inactivation) as the within- and between-subjects factors, respectively, clearly revealed no significant interaction of offer type x group for the lateral OFC (F1, 97 = 0.39; p = 0.54) but an interaction that approached significance for the medial OFC (F1, 121 = 3.57, p = 0.061; all main effects for either cohort: F ≤ 2.82, p > 0.079; see table for full results).

To determine whether the near-significant interaction within the medial OFC inactivation group was indicative of a more nuanced significant effect, we conducted several additional and more focused analyses. We first tested for whether medial OFC (mOFC) inactivation might be having a stronger effect early in training, as an effect could potentially be mitigated by learning, by comparing the residuals from the first to second half of the session (F1, 130 = 0.23; p = 0.63). We also examined whether mOFC inactivation caused a shift of the IP toward indifference as if disrupting specific information about the two pellets (offer type x group: F1, 121 = 1.53; p = 0.22) or caused shallowing of the slope of the sigmoidal choice curve as if promoting random choice behavior (offer type x group: F1, 121 = 1.53; p = 0.22). The failure of any of these more focused assessments to reach significance indicates that the effect of mOFC inactivation on novel offers, if real, was the result of random shifts in choice around the model fit. Though speculative, such an effect could occur if the area were important for maintaining stable value estimates during the comparison process, a function that might be more important for novel than established offers. Although our findings are equivocal, this would be in keeping with other work that has found significant involvement of medial orbitofrontal manipulations in similar settings (Baylis and Gaffan, 1991; Boorman et al., 2009; Fellows, 2006; Noonan et al., 2010; Rudebeck and Murray, 2011; Strait et al., 2014).

These results indicate that the rats are not relying on configural S-R policies to guide behavior, and they further show that OFC is not generally required for responding to either established or novel offers. Once a subjective preference has been established for a given pair of goods, both medial and lateral OFC can be taken offline with little or no impact on the resultant economic choice behavior, even when it is necessary to decide between quantities not previously experienced.

Responding to Novel Pairs Requires New Learning to Accurately Reflect Subjective Preference

The ability of the rats to spontaneously exhibit stable subjective preferences to novel offers combined with the insensitivity of even this behavior to inactivation of either medial or lateral OFC could be explained if the rats were still using a strategy based on cached values or policies but linked to each cue itself. These independent quantities could then be compared on the fly during the decision process. If this were the case, then as long as the rats had prior training on the relevant cues, they could be presented in any combination and performance would be normal. And if that performance were based on cached values, then it would arguably not require the OFC.

One way to test for such a strategy is to present novel pairs of food pellets, which the rats have learned about previously in combination with other pellets, and compare how faithfully the initial choices between the two pellets correspond to the subjective preferences later in training. A comparator process that simply utilizes the cached values of the cues acquired during prior training predicts that the correspondence will be quite high, similar to that across a period of training after the pairs have been experienced. Interestingly, this is also the prediction made by economic theory; because choices are transitive, the prediction is that, if we take two pellets from previously experienced pairs, the rats should already know their subjective preferences between them.

To test for this, rats were presented with novel pairs of pellets in test sessions (n = 48). These were simply pellets from the already learned stimulus set that had not yet been paired in a session (Figures 1B and 1D). Rats were tested on the novel pairs for 2 consecutive days, with testing on different novel pairs separated by a rest day. We compared the behavior on these novel pair sessions to behavior on similar paired training days with experienced pellets from rats in prior experiments (n = 120; Gardner et al., 2017, 2018). To determine whether rats maintained stable choice for novel pairs, choice behavior was analyzed across 4 separate 44-trial epochs, which afforded small but reasonable numbers of trials for estimating the slope and IP (4 trials of each of the 11 offers). Figure 4A shows examples from individual sessions, illustrating the shift in these defining measures across these epochs. The IP and slope measures were extracted and shown for all sessions in Figure 4B relative to their day 2 behavior. If rats exhibit stable behavior from the beginning of the novel pairs, this would indicate that values of different pellets are simply accessed and compared without any re-evaluation of the features of the new pellet-pair comparison. Although the clusters in Figure 4B seemed relatively consistent across quarters for the experienced pairs, the novel pair sessions showed a tightening of behavior between the first and second quarters. To quantify the changes in choice behavior, the differences in measures were compared relative to the behavior on day 2. This provides a standardizing reference for each session and amounts to a comparison of the lines linking the paired day 1 and day 2 sessions in Figure 4B. The average changes in the measures across a moving window of 44 trials (11 trial increments; Figure 4C) show how choice for the novel pairs requires some experience before the value measures become stable, an effect that is lacking in the experienced pairs. 
To compare the changes in choice behavior shown in Figure 4C, a mixed-effects two-way multivariate ANOVA (MANOVA) was performed that included the changes in inverse slope and IP as the two measures and group (established pairs and novel pairs) and time (44-trial quarters) as the between- and within-subject factors, respectively. This analysis revealed a significant group × time interaction (F3,1014 = 7.34; p = 7.2e–5) and a significant main effect of time (F3,1014 = 15.5; p = 7.4e–10). Analyses implementing one-way MANOVAs for each 44-trial quarter revealed that the significant group × time interaction was mainly due to a significant group effect in the first 44 trials (first quarter: F1, 166 = 9.15, p = 2.9e–3; all other quarters: F < 0.41, p > 0.52), indicating that this behavioral adjustment during novel pairs occurred early in each session. Thus, when presented with a new pairing of previously experienced pellets, the rats had to adjust their subjective preference through trial and error in the initial trials.
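The windowing scheme used in these analyses can be sketched as below, assuming a 176-trial session (16 trials of each of the 11 offers), a 44-trial window, and an 11-trial step. The helper `change_from_day2` is hypothetical and assumes the IP is already expressed on a log scale; it only illustrates how each window is referenced to day 2.

```python
def moving_windows(n_trials=176, width=44, step=11):
    """Return (start, stop) trial indices for each window; stop is exclusive."""
    return [(start, start + width)
            for start in range(0, n_trials - width + 1, step)]


def change_from_day2(ip_window, inv_slope_window, ip_day2, inv_slope_day2):
    """Hypothetical helper: absolute change in (log-scale) IP and signed
    change in inverse slope for one window, relative to day 2."""
    return abs(ip_window - ip_day2), inv_slope_window - inv_slope_day2


windows = moving_windows()
quarters = windows[::4]  # every fourth window gives the non-overlapping quarters

print(len(windows))  # number of overlapping windows across the session
print(quarters)      # the four 44-trial epochs used in the quarter analyses
```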

Figure 4. Decisions for Novel Pairs Require Learning for Accurate Reflection of Subjective Preferences.

(A) Example sessions of behavior for experienced pellet pairs (left, yellow) and novel or first-time pairs (right, gray). First day choice behavior on a pair is split into quarters (44-trial bins), with the corresponding sigmoidal fits displayed from light (first quarter) to dark (last quarter) colors. Behavior including all trials on the second day of the novel pairs is displayed (black circles) as well as the associated sigmoidal fit (black). The percentage of times the rat chose pellet B (y axis) is plotted for the different offer types (x axis, log scale).

(B) Scatterplots of the behavioral measures, indifference point (IP) (x axis), and inverse slope (y axis) for each quarter of day 1 (large circles) linked (by black lines) to behavior on day 2 (small circles, dark gray) for experienced pairs (left) and novel pairs (right). Colors are the same as in (A). Oval area plots (gray, background) show the mean ± 2 standard deviations of the data shown for day 1. Sessions for the experienced pairs were downsampled for display purposes to match the number of novel pair sessions.

(C) The change in behavior across the first day on a pellet pair referenced to the second day on the pair using a 44-trial moving window stepped by 11 trials. Bins are plotted from light, first 44 trials of day 1, to dark, last 44 trials of day 1. x axis, absolute change in the IP (log scale) from the 44 trial bins on day 1 to all trials of day 2; y axis, corresponding change of the inverse slope. Columns are the same as in (B).

(D) Same data as in (C) plotted across the 11 trial steps of the moving window for the absolute change in IP (left) and the change in inverse slope (right) for both the experienced (yellow) and novel (gray) pairs.

For (C) and (D), both axes are referenced to the experienced pair differences for day 1. Error bars are plotted as SEM.

That learning is required suggests that the rats are not able to make their choices on a novel pellet pair by simply calling up and comparing values cached or stored in the visual cues during prior training with the individual pellets, because if they were doing this, their performance should not require adjustment. Further, and perhaps more interestingly, it also implies that the model-based “goods space” underlying the decision process requires updating when two new things are compared, even when they have been experienced previously. Although contrary to assertions that subjective preferences are not learned, the need for experience is perhaps not surprising. Choosing between two things that have not been directly compared requires one to imagine or estimate subjective preference. This has been shown to occur for novel combinations of previously experienced outcomes (i.e., tea-jelly versus snail-porridge) and to involve activation of representations of the constituent outcomes in parts of human OFC (Barron et al., 2013). The current data suggest that this period of estimation occurs even when the items on offer have been experienced extensively in isolation. Because estimated and actual preferences may differ, it is then not surprising that the measured subjective preference will shift when such choices are made repeatedly, as is the case here. What is remarkable is to see this process playing out in rats, even for choices between things with which they have some experience.

Lateral (but Not Medial) OFC Is Necessary for Adjusting Initial Estimated Subjective Preferences to Match Actual Subjective Preferences during Novel Economic Choice

That initial choices with novel pairs (a circumstance that probably best approximates actual economic choice behavior) reflect imagined or estimated preferences that differ from those derived from experience raises the very interesting question of whether initial choices might depend on the OFC in a way that established choices do not. Such a dichotomization of economic choice would parallel results in much simpler procedures, where inactivation of the OFC, particularly lateral OFC, affects value-based judgements as well as learning when inference or imagination is required, but not when the same behaviors depend on direct experience (Gallagher et al., 1999; Howard et al., 2020; Jones et al., 2012; Takahashi et al., 2013).

To test for this, we compared performance on the novel pairs in the control sessions with sessions in which either the medial or lateral OFC had been optogenetically inactivated during the choice period on all trials during day 1 (Figure 1D). Examination of changes in the IP and inverse slope during individual sessions revealed that lateral OFC inactivation affected the rats’ ability to reach stable behavior (Figure 5A), dramatically increasing the distance between the IP and slope on the initial day of training relative to the second day, particularly in the first quarter of the session (Figure 5B). A moving window analysis of the mean changes in IP and slope across the first day on the novel pairs showed significant disruption on these measures after lateral OFC inactivation relative to the control and medial inactivation groups (Figures 5C and 5D). A two-way, mixed-effects MANOVA with IP and inverse slope as multivariate measures and group (levels: control, lateral OFC inactivation, and medial OFC inactivation) and time (session split into quarters) as the between- and within-subjects factors revealed a significant main effect of group (F2, 81 = 5.36; p = 6.5e-3) and time (F3,573 = 27.0; p = 2.4e–16). Visual inspection of the average change in IP and inverse slope (Figures 5C and 5D, respectively) over the course of the session revealed this effect was primarily due to larger differences in the lateral OFC inactivation group.

Figure 5. Accurate Updating of Subjective Preferences for Novel Pairs Relies on Lateral, but Not Medial, OFC.

(A) Example sessions of behavior for control novel pellet pairs (left, gray), for lateral OFC inactivation (middle, blue), and for medial OFC inactivation (right, magenta). Choice behavior for the first day on a pair is split into quarters (44-trial bins), with the corresponding sigmoidal fits displayed from light (first quarter) to dark (last quarter). Behavior including all trials on the second day of the novel pairs is displayed (black circles) as well as the associated sigmoidal fit (black). The percentage of times the rat chose pellet B (y axis) is plotted for the different offer types (x axis, log scale).

(B) Scatterplots of the behavioral measures, IP (x axis) and inverse slope (y axis), for each quarter of day 1 (large circles; colors same as in A) linked (by black lines) to behavior on day 2 (small circles, dark gray) for controls (left), lateral OFC inactivation (middle), and medial OFC inactivation (right). Oval area plots (gray, background) show the mean ± 2 standard deviations of the data shown for day 1.

(C) The change in behavior across the first day on a pellet pair referenced to the second day on the pair using a 44-trial moving window stepped by 11 trials. Bins are plotted from light, first 44 trials of day 1, to dark, last 44 trials of day 1. x axis, absolute change in the IP (log scale) from the 44 trial bins on day 1 to all trials of day 2; y axis, corresponding change of the inverse slope. Columns are the same as in (B).

(D) Same data as in (C) plotted across the 11 trial steps of the moving window for the absolute change in IP (left) and the change in inverse slope (right) for the control (gray), lateral OFC inactivation (blue), and medial OFC inactivation (magenta) groups.

For (C) and (D), both axes are referenced to the experienced pair differences for day 1. Error bars are plotted as SEM.

Subsequent two-way, mixed-effects MANOVAs comparing the lateral and medial groups independently showed the effect was primarily driven by inactivation of lateral OFC (group: F1, 35 = p = 0.012; time: F3, 246 = 17.5, p = 2.5e–10; group × time: F3,246 = 1.23, p = 0.30). The failure of the interaction to reach significance indicates that the inactivated group failed to fully catch up to the control group during this first session; this is evident in the final position of the measures on the last 44 trials (Figure 5C), which approached the values for day 2 in controls but did not when lateral OFC was inactivated. Interestingly, these effects were specific to lateral inactivation; the effects of medial OFC inactivation did not reach significance (group: F1, 47 = 0.19, p = 0.67; time: F3, 324 = p = 2.4e–6; group × time: F3, 324 = 0.246, p = 0.86).

The effects of lateral OFC inactivation reflected effects on the IP as well as the slope. Two-way, mixed-effects ANOVAs run separately for each parameter revealed significant effects of group (IP: F1,35 = 6.52, p = 0.015; inverse slope: F1, 35 = 4.31, p = 0.045) and time (IP: F3,120 = 9.58, p = 1.0e–5; inverse slope: F3,120 = 11.5, p = 1.2e–6) for both parameters. As with the MANOVA run with both parameters, there was no significant interaction of group × time for either measure (IP: F3,120 = 1.82, p = 0.15; inverse slope: F3,120 = 0.80, p = 0.49).

We previously showed that inactivating medial or lateral OFC has no effect on well-trained behavior in our task (Gardner et al., 2017, 2018). Again, in the rats used in this study, we did not find significant effects of OFC inactivation in well-trained animals that had previous experience with the pairs tested for either the IP or the inverse slope measures as revealed with the null effects in the novel offer experiment (Figure 2). The complete lack of effect of OFC inactivation, here and previously, on choices made on well-experienced pairs contrasts with the significant effect of inactivation of lateral OFC on novel pairs, described above. To be sure that the effect of lateral OFC inactivation on choices made on novel pairs was specific to the initial choices between two pellets, we inactivated lateral OFC (lOFC) in these same rats after they had just 2 additional days of experience on the novel pairs. We found that inactivation of lOFC at this point in training no longer had any effect on behavior (Figure 6). A two-way, mixed-effects MANOVA run identically to the novel pairs above revealed no significant effect of group (F1, 30 = 0.540; p = 0.47) or group × time (F3, 216 = 0.464; p = 0.71) and a significant effect of time (F3, 216 = 8.00; p = 4.4e–5).

Figure 6. Mild Experience with Food Pairs Removes Dependence of Choice Behavior on Lateral OFC.

(A) The change in behavior across the first day on a pellet pair referenced to the second day on the pair using a 44-trial moving window stepped by 11 trials. Bins are plotted from light, first 44 trials of day 1, to dark, last 44 trials of day 1. x axis, absolute change in the IP (log scale) from the 44 trial bins on day 1 to all trials of day 2; y axis, corresponding change of the inverse slope. Left, control novel pairs, gray; middle, control mild-experienced pairs, red; right, lateral OFC inactivation mild-experienced pairs, green.

(B) Same data as in (A) plotted across the 11 trial steps of the moving window for the absolute change in IP (left) and the change in inverse slope (right) for the control (gray), mild-experienced controls (red), and lateral OFC inactivation mild-experience (green) groups. Both axes are referenced to the experienced pair differences for day 1. Error bars are plotted as SEM.

DISCUSSION

Here, we probed the basis of economic choice in an attempt to better understand when and why the OFC is important to this iconic class of value-based behavior. For this, we trained rats to make choices between differently flavored food pellets. In each session, a single pellet pair was presented in a standard set of offers. Once the rats were proficient making choices across a series of pairs, they underwent testing in which they were presented with novel offers of previously experienced pairs or with novel pairs of previously experienced pellets. Contrary to expectations, we found that established economic choice behavior in our version of the task, which we have shown to be OFC independent, could not be explained by appealing to the operation of configural S-R habits or even to comparison of individual cached values, instead appearing to reflect model-based processing within an established framework. We reached this conclusion because naive controls performed normally on novel offers, ruling out a configural S-R solution, while at the same time exhibiting difficulty initially estimating their actual subjective preference when confronted with novel pairs, indicating that they were not simply comparing individual cached values.

As in our prior work (Gardner et al., 2017, 2018, 2019), inactivation of OFC, medial or lateral, had little effect on the ability of rats to use this established framework—or cognitive map—when making choices between experienced pellet pairs; in the current study, this was true even when novel offers were presented, particularly for inactivation of the lOFC. As long as the rats had some prior experience with the comparison, OFC processing was not necessary for the rats to properly estimate their subjective preference between the pellets on offer. However, when the rats had no prior experience with a particular pellet pair, inactivation of lOFC significantly exacerbated the slight difficulty they showed normally in properly estimating their actual subjective preference. Below, we will consider the implications of these somewhat unexpected findings for our understanding of economic choice as well as OFC function.

Economic choice refers to behavior in which choices are made between goods of different values. It is somewhat unique when compared to other forms of value-based behavior in that it involves an overt decision between two items whose value differs subjectively along many different dimensions (Levy and Glimcher, 2011; Padoa-Schioppa, 2011). This type of behavior has been modeled experimentally in animals by giving repeated choices between various goods in order to construct a series of psychometric curves (Padoa-Schioppa and Assad, 2006). Unlike most of our real-world economic behavior, this approach provides the subject with extensive experience making choices between the different goods on offer. Two assumptions are made in using this approach to model real-world behavior. One is that subjective preferences are not learned. According to this idea, we and our mammalian cousins innately know how much we prefer one good relative to another, at least once we have experienced the goods in isolation. The other assumption is that the choice procedure is inherently model based. That is, because of its complexity—the need to compare across many different dimensions, many different offer ratios, and even among many different goods to generate many different responses—it is not something that can be “habitized” or converted into a policy.

Our results support the second assumption, because if the rats were using a large set of configural habits, then they should have exhibited increased variance and larger residuals in their behavior when confronted with novel offers, and if they were comparing cached values, then they should not have required any experience to reach proficiency when confronted with novel pairs. That the rats were able to respond as accurately on novel offers as on experienced offers and yet required learning with novel pairs suggests that, even in very well-trained subjects, the choices are being made by reference to some sort of model or goods space. This conclusion is in line with arguments that behavior is inherently model based in such complex designs. However, this then raises a conundrum, because the standard behavior in this design is immune to inactivation of the OFC, at least in rats that have become proficient on the task (but see Kuwabara et al., 2020, in which inactivation of lOFC in mice affects behavior on an odor-guided economic choice task, and also our comment on this paper in Gardner et al., 2019). This insensitivity to inactivation of either medial OFC or lOFC contradicts the general interpretation that OFC-independent behaviors gain that independence because they can be mediated without reference to such internal models.

This brings us to the first assumption, which is that the subjective preferences underlying economic choice behavior are not learned. Despite the substantial experience of the rats in the current study with each of the pellets being used and the close similarity between the different pellets, both of which should have minimized the need for any further evaluation of the pellets, we still found that rats required experience making choices between two pellets in order to establish stable preferences. This suggests that, even when two goods are part of an established goods space, there are still adjustments that must be made through trial and error before the precise relationship between them can be established. Although contrary to the idea that subjective preferences are not learned, this makes intuitive sense if one considers how often we are unhappy with our choices, particularly ones we have not had a chance to make hundreds or thousands of times.

And it was this particular ability—the ability to adjust the established goods space to quickly incorporate and then use this new information to more precisely control choice behavior—that was disrupted by lOFC inactivation. Specifically, inactivation of the lOFC made the rats’ initial choices more variable and prevented the normal adaptation we saw during the initial choices between novel pellet pairs in control sessions. The increase in the variability of the choice in addition to the increased preference offset indicates that rats were at a loss to constrain behavior during the adaptation, a finding raising interesting parallels to proposals that lOFC is involved in confidence (Kepecs et al., 2008; Masset et al., 2020). This effect was not observed with inactivation of medial OFC, nor was it evident when lOFC was again inactivated in the same rats just several sessions later, after the goods space was presumably established. The effect of lOFC inactivation was only apparent during this initial period of adjustment. The specificity of the role of lOFC is at odds with the proposal that this region is specifically required for the comparison of values determined from model-based associations (Padoa-Schioppa, 2011); however, it is consistent with the idea that lOFC is especially important for the evaluation of such associations when prior direct experience with the evaluation is non-existent or perhaps incomplete or inadequate.

If our results, with regard to lateral orbitofrontal function, can be generalized outside the boundaries of economic choice, it would suggest a much narrower and more restricted role for the lateral subdivision in supporting behaviors based on mental simulation, model-based processing, or cognitive mapping. Rather than being necessary for all such operations, the lOFC may instead be specialized, within a circuit of structures, for incorporating new information into existing maps and for generalizing from old to new situations. This idea recognizes that most of the well-controlled behaviors dependent on lOFC require both model-based processing and either generalization to new situations or the integration of new information into an established map. These are inherently intertwined processes because the onus of showing that a behavior is model based requires display of the model within a new situation. For example, OFC-dependent deficits are observed in reinforcer devaluation (Gallagher et al., 1999; Gardner et al., 2017; Howard et al., 2020; Izquierdo et al., 2004; Parkes et al., 2018; West et al., 2011), sensory preconditioning (Jones et al., 2012; Wang et al., 2020), over-expectation (Takahashi et al., 2009, 2013), and specific transfer (Lichtenberg et al., 2017; Ostlund and Balleine, 2007). In each, the lOFC is necessary in the final probe test, where there is a requirement for a cognitive map but also a need to extend it to incorporate new information or to adapt behavior to new circumstances.

This is also true in less well-controlled settings, such as reversal learning, where deficits can be observed after OFC manipulations but only on the first few reversals (Schoenbaum et al., 2002), when arguably a new cognitive map is being developed to handle the procedure. With experience, OFC becomes unnecessary, even for new reversal problems. Although this has been interpreted as reflecting the development of a policy, it could reflect the independence of established cognitive maps.

Indeed, a similar pattern is evident with economic choice. Mice given experience on only a single pellet pair still show some impact of OFC inactivation on their ability to properly estimate their subjective preferences (Kuwabara et al., 2020), and even in rats given extensive experience across an entire series of pellets, we can easily recover sensitivity to OFC inactivation by forcing a reconfiguration of the established goods space by revaluing one of the pellets immediately before the test (Gardner et al., 2019). Although speculative, the proposal that lOFC is particularly critical for establishing, changing, or updating the cognitive map is also in accord with recent studies showing OFC is required for behavior specifically when underlying complex task structure is changing (Constantinople et al., 2019; Miller et al., 2018; Parkes et al., 2018). It is also consistent with imaging data showing that the human OFC is selectively engaged by manipulations, such as outcome devaluation (Gottfried et al., 2003; Howard and Kahnt, 2017; Howard et al., 2020) or when people must imagine choosing between novel combinations of previously experienced food items (Barron et al., 2013); although such work has been typically viewed as supporting a role for OFC in utilizing established models, the selective engagement under these conditions could be due to a more specific role in updating and modifying models maintained more widely. Such a function would differentiate the role of OFC from the role of other areas important in cognitive mapping, such as hippocampus and striatum.

STAR★METHODS

RESOURCE AVAILABILITY

Lead Contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Geoffrey Schoenbaum (geoffrey.schoenbaum@nih.gov).

Materials Availability

All materials used, including the customized pellet-types, can be found on the Key Resources Table. CAD designs for the customized equipment can be found at https://github.com/mphgardner/RatEconChoiceTask.

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
DAPI-Fluorescent-G | Electron Microscopy Services | Cat No. 17984-24
Triton X-100 | Sigma-Aldrich | Cat No. X100-500ML
5-TUL banana flavored food pellets (20 mg) | TestDiet | 1813985
5-TUL peanut butter flavored food pellets (20 mg) | TestDiet | 1817952
5-TUM food pellets (20 mg) | TestDiet | 1811143
Fruit punch flavored pellets, 67% cellulose (20 mg) | TestDiet | 1817960-371
Chocolate flavored pellets, 75% cellulose (20 mg) | TestDiet | 1817259-371
5-TUM raspberry flavored pellets, 67% cellulose (20 mg) | TestDiet | 1817958-371
5-TUW cellulose food pellets (20 mg) | TestDiet | 1811557
Custom bacon flavored pellets (20 mg) | BioServ | F07382
Bacterial and Virus Strains
AAV5/CamKIIa-eNpHR3.0-eYFP | UNC Vector Core | N/A
AAV5/CamKIIa-eYFP | UNC Vector Core | N/A
Experimental Models: Organisms/Strains
Long-Evans Rat | Charles River | RRID: RGD_2308852
Software and Algorithms
MATLAB | Mathworks | RRID: SCR_001622
Other
Doric dual optical commutators | Doric Lenses | Cat No. FRJ_1x2i_FC-2FC_0.22
200 micron diameter fiber optic patch cable | Thor Labs | M72L01
Fiber optic cannulae | Thor Labs | Cat No. CFM12U-20
ceramic zirconia ferrule bore 230um | Precision Fiber Products | Cat No MM-FER2002S15-P
FC multimode connector | Precision Fiber Products | Cat No. MM-CON2004-2300-2-BLK
543 nm DPSS Laser | Shanghai Lasers | Cat No. GL543T3-100
Arduino Mega | Adafruit Industries | Cat No. 191
Raspberry Pi 3 B | Adafruit Industries | Cat No. 3055
3.5” Resistive Touch Screen | Adafruit Industries | Cat No. 2050

Data and Code Availability

Scripts for the behavioral paradigm can be found at https://github.com/mphgardner/RatEconChoiceTask. Data and analysis scripts are available upon request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Fourteen male Long-Evans rats (275–300 g, Charles River Laboratories), aged approximately 3 months at the start of the experiment, were trained and tested at the National Institute on Drug Abuse Intramural Research Program (Baltimore, MD) in accordance with the National Institutes of Health guidelines determined by the Animal Care and Use Committee. All rats had ad libitum access to water during the experiment and were fed 16–20 g of food per day, including rat chow and pellets consumed during the behavioral task. Rats were initially food restricted to 85% of their baseline weight to begin training. Behavioral testing was performed during the light phase of the light/dark schedule.

METHOD DETAILS

Apparatus

Rats were trained and tested in modified standard behavioral boxes (12” × 10” × 12”, Coulbourn Instruments, Holliston, MA) that were controlled by a Raspberry Pi 3 (Raspberry Pi Foundation, Cambridge, UK) using custom-written code in Python (https://www.Python.org) (Gardner et al., 2017, 2018). Both custom-made equipment and Coulbourn components were used in the apparatus. Touchscreens (Adafruit Industries, New York, NY; 2.8” for initial training and 3.5” for later training and testing) were housed in custom-made walls and were controlled by individual microcontrollers (Arduino Mega, Arduino, https://www.arduino.cc/), which communicated with the Raspberry Pi 3 to display the current offers and provide screen-press feedback. Custom-designed nosepoke ports (1.5” H × 1.25” W × 1.5” D) with infrared photodetectors to determine whether a poke had occurred were fixed to the floor of the box about one inch from the wall. The primary configuration of the box had touchscreens and accompanying wall mounts oriented at 30° from the plane of the left side wall to facilitate better viewing of the screen while the rats were nosepoking at the central port. A tall recessed food magazine (Med-Associates, Fairfax, VT) was placed on the center of the right wall opposite to the nosepoke and touchscreens. Pellets from two separate externally mounted feeders were dispensed into the food magazine. The speaker used for playing the white noise cue (75 dB) to indicate the beginning of a trial was placed externally to the conditioning chamber. During the optogenetic inhibition phase of the experiment, solid state lasers (532 nm; Laser Century, Shanghai, China) were controlled in analog mode (8 bit depth) by a microcontroller (Arduino Uno, Arduino, https://www.arduino.cc/) using custom-written software (Gardner et al., 2017, 2018).

Choice Task

Each trial started with a white noise cue, which indicated that the rat could nosepoke at the central port. After a 1 s nosepoke at the port, the current offers were displayed on the two screens situated on either side of the nosepoke. After another 1 s period, during which the rats were required to remain in the nosepoke, the white noise ended, indicating that a choice could be made by touching either of the screens to receive the offer-type and pellet number displayed. Immediately following the choice, the pellets were delivered into the food magazine on the opposite side of the chamber. Rats then waited 6–16 s before the next trial started; this interval comprised a random component (uniform distribution, range: 0–4 s) and an additional period based on the number of pellets delivered on the prior trial (1 additional second for each pellet). The pellet-dependent period was determined empirically so that, after food consumption, rats did not wait longer for the next trial following trials in which only 1 or 2 pellets were delivered. Failure to hold the nosepoke for the first second restarted the 1 s timer, and failure to hold the nosepoke once the screens were displayed resulted in the termination and repeat of the trial. Rats performed ~150–350 trials per session.

Food-Pellet Reinforcers

All rats received the same menu of pellet offers, arranged in the following average preference order: highly palatable banana flavored pellets, Test-Diet 5-TUL (1813985); bacon flavored pellets containing lactose and 1.4% NaCl, Bio-Serv, custom formulation (F07382); grain flavored pellets, Test-Diet 5-TUM (1811143); fruit punch flavored pellets with 33% sucrose and 67% cellulose, Test-Diet, custom formulation (1817455–371); chocolate flavored pellets with 25% sucrose and 75% cellulose, Test-Diet, custom formulation (1817259–371); and 100% cellulose pellets, Test-Diet 5-TUW (1811557). These were the same as previously reported (Gardner et al., 2017, 2018, 2019). The additional pellets trained for the novel pairs experiment were highly palatable peanut-butter flavored pellets, Test-Diet 5-TUL (1817952), and raspberry flavored pellets with 67% cellulose, based on the Test-Diet 5-TUM formulation (1817958–371, as listed in the Key Resources Table). Visual cues predicting the different offer-types consisted of different shapes, indicating the type of pellet available, and different numbers of segmentations of the symbol, indicating the number of pellets available. Each rat received unique cue-pellet pairings that remained constant throughout testing.

Shaping and Pre-Surgical Training

Initial training on the task lasted 3–4 months before rats experienced any of the tested pairs of pellets and progressed through several stages that introduced different aspects of the task. Before starting, rats were food restricted to ~85% of their body weight. They were first trained to touch a single illuminated touchscreen to receive unflavored sucrose pellets, after which they began training to discriminate two visual cues that resulted in either an unflavored sucrose pellet or nothing (these images were not used for any subsequent aspect of the task). After rats showed discriminative behavior to the two visual cues, a central nosepoke was introduced to the box and rats were progressively trained to hold in the port for 2 s (1 s with no cues on and 1 s with visual cues displayed) when the white noise cue was turned on. Upon acquisition of the nosepoke, rats were introduced to the full task. To learn each of the cue-pellet associations, rats were trained for several days on each of the 5 flavored pellets versus a non-preferred cellulose pellet. After rats showed stable preferences for each of the pellets versus cellulose, they were exposed to other pellet-pairs. In each session, rats were given 11 possible offers including the 1:0 and 0:1 offers. The other 9 offers ranged either from 1:6 to 6:1 or from 1:4 to 8:1 (X:Y, Y being the presumed preferred pellet-type), drawn from the offer set [1:8, 1:6, 1:4, 1:3, 1:2, 1:1, 2:1, 3:1, 4:1, 6:1, 8:1] depending on the presumed pair preference.
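The construction of a session's offers can be illustrated with a short Python sketch. It is only an illustration of the description above, not the authors' task code: the names `OFFER_SET`, `FORCED_OFFERS`, `session_offers`, and `log_ratio` are hypothetical, and the use of the natural log of X/Y as the offer ratio is an assumption.

```python
import math

# Graded offer set from the text, written as (X, Y) pellet counts,
# where Y is the presumed preferred pellet-type.
OFFER_SET = [(1, 8), (1, 6), (1, 4), (1, 3), (1, 2), (1, 1),
             (2, 1), (3, 1), (4, 1), (6, 1), (8, 1)]
# The 1:0 and 0:1 offers, in which only one pellet-type is available.
FORCED_OFFERS = [(1, 0), (0, 1)]


def session_offers(first, last):
    """Return the forced offers plus the graded offers from first to last, inclusive."""
    i, j = OFFER_SET.index(first), OFFER_SET.index(last)
    return FORCED_OFFERS + OFFER_SET[i:j + 1]


def log_ratio(offer):
    """Natural log of X/Y for a graded offer (not defined for forced offers)."""
    x, y = offer
    return math.log(x / y)


balanced = session_offers((1, 6), (6, 1))  # 1:6 to 6:1
shifted = session_offers((1, 4), (8, 1))   # 1:4 to 8:1
```

Both ranges select 9 graded offers, so either session contains 11 offer-types once the two forced offers are included.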

Surgery

Surgical procedures followed guidelines for aseptic technique. Rats received AAV-CaMKIIa-eNpHR3.0-eYFP (Gene Therapy Center at University of North Carolina at Chapel Hill) bilaterally into the lateral OFC (n = 7) or the medial OFC (n = 7) under stereotaxic guidance. The lateral OFC was targeted at AP +3.0 mm, ML ±3.2 mm, and DV −4.4 mm from the brain surface, and the medial OFC was targeted at AP 4.7 mm, ML ±0.6 mm, and DV −3.6 mm from the brain surface. A total of 1 μl of virus (titer ~10^12) per hemisphere was delivered at a rate of ~0.1 μl/min by infusion pump. Immediately following viral infusions, optic fibers (200 μm in core diameter; Thorlabs, Newton, NJ) were implanted bilaterally for the lateral OFC at A/P: 3.0 mm, M/L: ±3.2 mm, and D/V: −4.2 mm (from dura) at an angle of 10 degrees in the M/L plane, and for the medial OFC at A/P: 4.7 mm, M/L: ±0.6 mm, and D/V: −3.4 mm (from dura) at an angle of 12 degrees in the M/L plane (Bradfield et al., 2015a). Cephalexin (15 mg/kg p.o.) was administered daily for 10 days post-operatively to prevent infection.

Post-Surgical Testing

Following a 2–3 week recovery from surgery, rats were retrained on the full task and accustomed to performing with two fiber-optic patch cables attached to an optic commutator (Doric Lenses, Quebec, Canada). Cables were constructed with blocking covers to reduce leakage of light into the box. However, it is impossible to completely eliminate light leakage. To control for effects of such light leakage during laser-on trials, ‘dummy’ fiber-optic cables were employed during retraining and testing. The ‘dummy’, or blocked, cables were identical to the patent-fiber cables except that the optical fiber was blocked at the end of the cable and permitted no light transmittance into the brain. Specifically, the optical fiber was terminated at the ferrule, ~1 cm from the animal-side terminal of the patch cable, and a solid metal wire was inserted into the ferrule and epoxied into place in order to block the light. All blocked-fiber cables were tested after construction as well as on a periodic basis using a Fiber Optic Power Meter (ThorLabs). After rats were familiarized with the blocked-fiber cables and the laser being turned on, testing was begun. Overall, rats had several days of experience on all pellet pairs before moving to the experimental portion of the study. The novel offer experiment was conducted prior to the novel pairs study.

Novel Offers

Rats’ behavior was assessed on decisions for novel offer-types (Figure 1C). In addition to the standard offer set described above, experimental sessions included two of the following never-before experienced novel offer-types: [2:2, 3:3, 4:4, 6:6, 6:8, 3:4, 2:3, 4:6, 2:4, 3:6, 4:8, 3:8, 2:6]. One of the 10 possible pairs of pellets (not including the cellulose pellet) was randomly chosen and rats performed the task while attached to the blocked- or patent-fibers (fully counterbalanced) with the laser turned on for all trials. If the connection of the cables became loose by the end of the session, the session was discarded from the analysis. The lasers (532 nm, 16–18 mW; Laser Century, Shanghai, China) were controlled by a microcontroller (Arduino Uno, Arduino) and were turned on concurrently with the white noise cue that indicated a trial could be initiated. Lasers were turned off at the time of decision using a linear ramp over 300 ms to avoid the possibility of rebound excitation. To minimize the duration of the laser, the white noise and laser were on for 5 s before a timeout period occurred. Rats also had a maximum of 5 s to make a choice once the nosepoke hold was fulfilled. Sessions lasted 2–2.5 hours.
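The 300 ms linear ramp at laser offset can be pictured as a sequence of 8-bit output levels (the analog mode with 8 bit depth is described under Apparatus). The Python sketch below is only an illustration under stated assumptions: the function name `ramp_down_levels`, the 10 ms update interval, and the full-scale starting level of 255 are hypothetical and are not taken from the authors' microcontroller code.

```python
def ramp_down_levels(start_level=255, duration_ms=300, step_ms=10):
    """Return 8-bit output levels for a linear ramp from start_level down to 0.

    One level is produced per update tick; the final level is always 0.
    start_level and step_ms are illustrative assumptions, not reported values.
    """
    if not 0 <= start_level <= 255:
        raise ValueError("start_level must fit in 8 bits (0-255)")
    n_steps = duration_ms // step_ms
    if n_steps < 1:
        raise ValueError("duration_ms must be at least one step long")
    # Integer arithmetic keeps every level an exact 8-bit value.
    return [start_level * (n_steps - i) // n_steps for i in range(1, n_steps + 1)]


levels = ramp_down_levels()
```

With these assumed defaults the ramp has 30 steps and decreases monotonically to zero, in contrast to an abrupt offset.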

Novel Pairs

Following completion of the novel offers experiment, rats were introduced to two new cue-outcome pairs with 6 days of training. Rats showed stable preference between the two pellets at the end of the six days of training.

Following training of the two new pellets, rats were presented with decisions between a pair of food pellets which they had never chosen between previously. These pairs were chosen randomly from the set of possible novel pairs. On the first day of exposure to the novel pair of food pellets, the rats performed the task while attached to the blocked- or patent-fibers (fully counterbalanced) with the laser turned on for all trials. Sessions lasted for 16 presentations of each of the 11 offer-types, resulting in 176 trials. Rats were then run on the same pellet-pair the subsequent day with the blocked-fibers attached in order to determine stable preferences and slopes against which to compare the first-day behavior. Because the preferences between different pellet-types are subjective, behavior on the second day was used as a reference point to compare dynamic changes in preference and choice variance across the first day. Following completion of the novel pairs testing, which consisted of 8 pairs tested, 4 with the blocked- and 4 with the patent-fibers connected (sessions in which the cables became disconnected were not included in the study), rats were given another day to experience the pellet-pairs and then were retested using the same procedure as above in order to determine whether the effect was specific to the first time rats made decisions between two pellets. Timing of the onset and offset of the laser was as described in the above section.
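The 44-trial moving window stepped by 11 trials (Figures 5 and 6) can be laid out over the 176 first-day trials as in the following Python sketch. The function names are hypothetical, and reading the "absolute change in the IP (log scale)" as the absolute difference of log IPs is an assumption made for illustration.

```python
import math


def window_bounds(n_trials=176, width=44, step=11):
    """Return (start, stop) trial indices, 0-based and half-open, for each window."""
    return [(s, s + width) for s in range(0, n_trials - width + 1, step)]


def abs_log_ip_change(ip_window, ip_day2):
    """Absolute difference between two indifference points on a log scale.

    Assumed reading of the caption: a day-1 window IP referenced to the day-2 IP.
    """
    return abs(math.log(ip_window) - math.log(ip_day2))


windows = window_bounds()
```

Under these settings the first window covers trials 0 to 43 and the last covers trials 132 to 175, giving 13 windows per session.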

Histology

After completion of the experiment, rats were perfused with phosphate buffer saline followed by 4% PFA. The brains were then immersed in 30% sucrose for at least 24 hr and frozen. The brains were sliced at 40 μm and stained with DAPI (through Vectashield-DAPI, Vector Lab, Burlingame, CA). The location of the fiber tip and NpHR-eYFP was verified using an Olympus confocal microscope.

QUANTIFICATION AND STATISTICAL ANALYSIS

Indifference Point and Inverse Slope Estimation

Raw data were collected using custom-written code in Python. All further analysis was performed using MATLAB. As described previously (Padoa-Schioppa and Assad, 2006), in order to estimate a scalar relative value of two goods from a limited subset of all possible offers, an assumption must be made about the function relating the two goods in offer space. Here we assume a linear indifference curve (within a reasonable set of offer space), which entails that the ratio of the number of each good offered leading to indifferent behavior remains constant as the number of goods offered increases. In order to estimate the relative value of two goods from the choice behavior, we performed a probit regression for each session (Padoa-Schioppa and Assad, 2008), which uses the cumulative distribution function of the normal distribution to predict the choice behavior given the log ratio of the offers. This provides estimated parameters $\hat{\mu}$ and $\hat{\sigma}$ of the fitted normal distribution, which were used as estimates for the log of the indifference point (IP), the estimated relative value, and the inverse slope parameter, respectively. This analysis was performed using the fitglm function in MATLAB, which fits a generalized linear model of the choice data using an inverse normal cumulative distribution, ‘probit’, function as the link function and assumes a Bernoulli distribution for the binary choice response variable, resulting in the model

$$\Phi^{-1}_{\mu,\sigma}(y) = \beta_0 + \beta_1 x + \varepsilon$$

in which Φ is the normal cumulative distribution function

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt$$

and the predictor, x, is the log of the offer ratios. The estimated parameters of the normal distribution are $\hat{\mu} = -\beta_0/\beta_1$ and $\hat{\sigma} = 1/\beta_1$. Estimated indifference points, IP = $\exp(\hat{\mu})$, greater than an 8:1 ratio (non-preferred:preferred pellet) were treated as an 8:1 IP for analyses.
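As a rough illustration of this estimation step, the Python sketch below fits a probit model to grouped choice counts by Fisher scoring and converts the coefficients to an indifference point and inverse slope. It is not the authors' MATLAB fitglm analysis. It assumes the standard probit parameterization P(choice) = Φ((x − μ)/σ), so that μ̂ = −β0/β1 and σ̂ = 1/β1, it assumes x is the log of the non-preferred:preferred offer ratio, and it uses hypothetical function names with purely synthetic data.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def fit_probit(x, k, n, max_iter=100, tol=1e-10):
    """Fit P(choice) = Phi(b0 + b1 * x) to grouped data by Fisher scoring.

    x: log offer ratio per offer-type; k: number of choices of the option
    whose probability rises with x; n: number of trials per offer-type.
    """
    b0, b1 = 0.0, 1.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ki, ni in zip(x, k, n):
            eta = b0 + b1 * xi
            p = min(max(norm_cdf(eta), 1e-12), 1.0 - 1e-12)
            d = norm_pdf(eta)
            score = (ki - ni * p) * d / (p * (1.0 - p))
            weight = ni * d * d / (p * (1.0 - p))
            g0 += score
            g1 += score * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        # Solve the 2x2 system (expected information) * step = score.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def ip_and_inverse_slope(b0, b1, cap=8.0):
    """Convert probit coefficients to (IP, inverse slope), capping the IP at 8:1."""
    mu_hat = -b0 / b1
    sigma_hat = 1.0 / b1
    return min(math.exp(mu_hat), cap), sigma_hat


# Synthetic example: noise-free expected counts for a known IP and inverse slope.
ratios = [1 / 8, 1 / 6, 1 / 4, 1 / 3, 1 / 2, 1, 2, 3, 4, 6, 8]
x = [math.log(r) for r in ratios]
n = [16] * len(x)
true_mu, true_sigma = math.log(2.0), 0.5
k = [ni * norm_cdf((xi - true_mu) / true_sigma) for xi, ni in zip(x, n)]

b0, b1 = fit_probit(x, k, n)
ip, inverse_slope = ip_and_inverse_slope(b0, b1)
```

Because the synthetic counts are generated without noise, the fit should recover the generating values; real sessions would of course yield noisy estimates.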

Estimates of Residuals for the Novel Offers

In order to determine an unbiased estimate of behavior on the novel offer-types presented, a leave-one-out method was employed to determine the residuals of the choice for a particular offer-type based on the best fit of the remaining offers. The generalized linear model was fit as described above for each of the n offer-types in the offer set x with the offer of interest, $x_i$, left out

$$\Phi^{-1}_{\mu,\sigma}(y) = \beta_0 + \beta_1 x_{1:i-1,\,i+1:n} + \varepsilon$$

in order to determine the residual, $\varepsilon_i$, for offer $x_i$, so that an estimate of the residual would not be influenced by the particular offer being included in the model fit. The novel offers were compared to the nearest offer-type, referred to in the results as the adjacent offer, in the set of the 11 standard offer-types. In many cases, these were the same ratios, or same value of x (e.g., the novel offer-type of 3B:3A would be compared to the standard offer-type of 1B:1A). This was done to avoid effects of variability in the distribution of the residuals across the range of offers producing a bias in the comparison of the novel to standard offers and to allow for a one-to-one correspondence in a paired test.
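The leave-one-out procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code: the function names are hypothetical, the probit fit is a plain Fisher-scoring routine, and the residual is taken here as the observed minus the predicted choice fraction, which is an assumption about the scale on which residuals were computed. The data are synthetic, with one offer deliberately shifted by 2 choices out of 16.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_probit(x, k, n, max_iter=100, tol=1e-10):
    """Fit P(choice) = Phi(b0 + b1 * x) to grouped counts by Fisher scoring."""
    b0, b1 = 0.0, 1.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ki, ni in zip(x, k, n):
            eta = b0 + b1 * xi
            p = min(max(norm_cdf(eta), 1e-12), 1.0 - 1e-12)
            d = math.exp(-0.5 * eta * eta) / math.sqrt(2.0 * math.pi)
            score = (ki - ni * p) * d / (p * (1.0 - p))
            weight = ni * d * d / (p * (1.0 - p))
            g0 += score
            g1 += score * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def leave_one_out_residuals(x, k, n):
    """For each offer, fit on all other offers and return observed minus predicted."""
    residuals = []
    for i in range(len(x)):
        keep = [j for j in range(len(x)) if j != i]
        b0, b1 = fit_probit([x[j] for j in keep],
                            [k[j] for j in keep],
                            [n[j] for j in keep])
        predicted = norm_cdf(b0 + b1 * x[i])
        residuals.append(k[i] / n[i] - predicted)
    return residuals


# Synthetic example: noise-free counts, then shift the 1:1 offer by 2 choices.
ratios = [1 / 8, 1 / 6, 1 / 4, 1 / 3, 1 / 2, 1, 2, 3, 4, 6, 8]
x = [math.log(r) for r in ratios]
n = [16] * len(x)
k = [ni * norm_cdf((xi - math.log(2.0)) / 0.5) for xi, ni in zip(x, n)]
k[5] += 2.0

residuals = leave_one_out_residuals(x, k, n)
```

Because the shifted offer is excluded from its own fit, its residual reflects the full shift (2/16 of choices) rather than being partly absorbed by the fitted curve, which is the motivation for leaving the offer of interest out.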

To determine the possible range of the residuals for the novel offers, the model estimate for each novel offer type was compared to the maximum of either the nearest of the floor (0% of the non-preferred chosen) or ceiling (100% of the non-preferred chosen) or chance-level choice (50% of the non-preferred chosen). This method was used with the assumption that behavior would potentially move to either chance level, or to the nearest of the floor or ceiling.

Statistical Analyses

Nested experimental designs were used for both the novel offer and novel pair experiments. All mixed-effects ANOVAs and MANOVAs for the novel offers and novel pairs experiments included the blocking factors of Subject and Session, nested within Subject, which were both modeled as random effects. All other factors were fixed. The repeated-measures ANOVAs included Subject as a blocking factor. For the models comparing the experienced pairs with the novel pair controls, subjects were nested within the Group factor as a between-subjects design. All analyses of the indifference point were performed in log scale. MATLAB was used for all analyses.

Highlights.

  • OFC is not necessary for established economic choice

  • Such established economic choice does not reflect habits or configural policies

  • Lateral OFC is necessary for estimating initial subjective preferences

  • OFC is critical to choice when the underlying goods space is being established

ACKNOWLEDGMENTS

This work was supported by the Intramural Research Program at NIDA (G.S.). The authors thank Melissa Sharpe and Kaue Costa for their helpful insights and Dr. Karl Deisseroth and the Gene Therapy Center at the University of North Carolina at Chapel Hill for providing viral reagents. The opinions expressed in this article are the authors’ own and do not reflect the view of the NIH/DHHS.

Footnotes

DECLARATION OF INTERESTS

The authors declare no competing interests.

REFERENCES

  1. Barron HC, Dolan RJ, and Behrens TE (2013). Online evaluation of novel choices by simultaneous representation of multiple memories. Nat. Neurosci 16, 1492–1498.
  2. Baylis LL, and Gaffan D (1991). Amygdalectomy and ventromedial prefrontal ablation produce similar deficits in food choice and in simple object discrimination learning for an unseen reward. Exp. Brain Res 86, 617–622.
  3. Boorman ED, Behrens TE, Woolrich MW, and Rushworth MF (2009). How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron 62, 733–743.
  4. Bradfield LA, Dezfouli A, van Holstein M, Chieng B, and Balleine BW (2015a). Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron 88, 1268–1280.
  5. Bradfield LA, Dezfouli A, van Holstein M, Chieng B, and Balleine BW (2015b). Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron 88, 1268–1280.
  6. Constantinescu AO, O’Reilly JX, and Behrens TEJ (2016). Organizing conceptual knowledge in humans with a gridlike code. Science 352, 1464–1468.
  7. Constantinople CM, Piet AT, Bibawi P, Akrami A, Kopec C, and Brody CD (2019). Lateral orbitofrontal cortex promotes trial-by-trial learning of risky, but not spatial, biases. eLife 8, e49744.
  8. Fellows LK (2006). Deciding how to decide: ventromedial frontal lobe damage affects information acquisition in multi-attribute decision making. Brain 129, 944–952.
  9. Fellows LK (2011). Orbitofrontal contributions to value-based decision making: evidence from humans with frontal lobe damage. Ann. N Y Acad. Sci 1239, 51–58.
  10. Gallagher M, McMahan RW, and Schoenbaum G (1999). Orbitofrontal cortex and representation of incentive value in associative learning. J. Neurosci 19, 6610–6614.
  11. Gardner MPH, Conroy JS, Shaham MH, Styer CV, and Schoenbaum G (2017). Lateral orbitofrontal inactivation dissociates devaluation-sensitive behavior and economic choice. Neuron 96, 1192–1203.e4.
  12. Gardner MPH, Conroy JC, Styer CV, Huynh T, Whitaker LR, and Schoenbaum G (2018). Medial orbitofrontal inactivation does not affect economic choice. eLife 7, e38963.
  13. Gardner MPH, Conroy JC, Sanchez DC, Zhou J, and Schoenbaum G (2019). Real-time value integration during economic choice is regulated by orbitofrontal cortex. Curr. Biol 29, 4315–4322.e4.
  14. Gottfried JA, O’Doherty J, and Dolan RJ (2003). Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104–1107.
  15. Howard JD, and Kahnt T (2017). Identity-specific reward representations in orbitofrontal cortex are modulated by selective devaluation. J. Neurosci 37, 2627–2638.
  16. Howard JD, Reynolds R, Smith DE, Voss JL, Schoenbaum G, and Kahnt T (2020). Targeted stimulation of human orbitofrontal networks disrupts outcome-guided behavior. Curr. Biol 30, 490–498.e4.
  17. Izquierdo A, Suda RK, and Murray EA (2004). Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J. Neurosci 24, 7540–7548.
  18. Jones JL, Esber GR, McDannald MA, Gruber AJ, Hernandez A, Mirenzi A, and Schoenbaum G (2012). Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science 338, 953–956.
  19. Kepecs A, Uchida N, Zariwala HA, and Mainen ZF (2008). Neural correlates, computation and behavioural impact of decision confidence. Nature 455, 227–231.
  20. Kuwabara M, Kang N, Holy TE, and Padoa-Schioppa C (2020). Neural mechanisms of economic choices in mice. eLife 9, e49669.
  21. Levy DJ, and Glimcher PW (2011). Comparing apples and oranges: using reward-specific and reward-general subjective value representation in the brain. J. Neurosci 31, 14693–14707.
  22. Lichtenberg NT, Pennington ZT, Holley SM, Greenfield VY, Cepeda C, Levine MS, and Wassum KM (2017). Basolateral amygdala to orbitofrontal cortex projections enable cue-triggered reward expectations. J. Neurosci 37, 8374–8384.
  23. Masset P, Ott T, Lak A, Hirokawa J, and Kepecs A (2020). Behavior- and modality-general representation of confidence in orbitofrontal cortex. Cell 182, 112–126.e18.
  24. Miller KJ, Botvinick MM, and Brody CD (2018). Value representations in orbitofrontal cortex drive learning, not choice. bioRxiv. 10.1101/245720.
  25. Noonan MP, Walton ME, Behrens TE, Sallet J, Buckley MJ, and Rushworth MF (2010). Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc. Natl. Acad. Sci. USA 107, 20547–20552.
  26. Ostlund SB, and Balleine BW (2007). Orbitofrontal cortex mediates outcome encoding in Pavlovian but not instrumental conditioning. J. Neurosci 27, 4819–4825.
  27. Padoa-Schioppa C (2011). Neurobiology of economic choice: a good-based model. Annu. Rev. Neurosci 34, 333–359.
  28. Padoa-Schioppa C, and Assad JA (2006). Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226.
  29. Padoa-Schioppa C, and Assad JA (2008). The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat. Neurosci 11, 95–102.
  30. Padoa-Schioppa C, and Conen KE (2017). Orbitofrontal cortex: a neural circuit for economic decisions. Neuron 96, 736–754.
  31. Padoa-Schioppa C, and Schoenbaum G (2015). Dialogue on economic choice, learning theory, and neuronal representations. Curr. Opin. Behav. Sci 5, 16–23.
  32. Parkes SL, Ravassard PM, Cerpa JC, Wolff M, Ferreira G, and Coutureau E (2018). Insular and ventrolateral orbitofrontal cortices differentially contribute to goal-directed behavior in rodents. Cereb. Cortex 28, 2313–2325.
  33. Plassmann H, O’Doherty J, and Rangel A (2007). Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J. Neurosci 27, 9984–9988.
  34. Plassmann H, O’Doherty JP, and Rangel A (2010). Appetitive and aversive goal values are encoded in the medial orbitofrontal cortex at the time of decision making. J. Neurosci 30, 10799–10808.
  35. Rich EL, and Wallis JD (2016). Decoding subjective decisions from orbitofrontal cortex. Nat. Neurosci 19, 973–980.
  36. Rudebeck PH, and Murray EA (2011). Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. J. Neurosci 31, 10569–10578.
  37. Rudebeck PH, and Murray EA (2014). The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron 84, 1143–1156.
  38. Schoenbaum G, Nugent SL, Saddoris MP, and Setlow B (2002). Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport 13, 885–890.
  39. Strait CE, Blanchard TC, and Hayden BY (2014). Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron 82, 1357–1366.
  40. Takahashi YK, Roesch MR, Stalnaker TA, Haney RZ, Calu DJ, Taylor AR, Burke KA, and Schoenbaum G (2009). The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron 62, 269–280.
  41. Takahashi YK, Chang CY, Lucantonio F, Haney RZ, Berg BA, Yau H-J, Bonci A, and Schoenbaum G (2013). Neural estimates of imagined outcomes in the orbitofrontal cortex drive behavior and learning. Neuron 80, 507–518.
  42. Wang F, Howard JD, Voss JL, Schoenbaum G, and Kahnt T (2020). Targeted stimulation of an orbitofrontal network disrupts decisions based on inferred, not experienced outcomes. bioRxiv. 10.1101/2020.04.24.059808.
  43. West EA, DesJardin JT, Gale K, and Malkova L (2011). Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. J. Neurosci 31, 15128–15135.
  44. Wilson RC, Takahashi YK, Schoenbaum G, and Niv Y (2014). Orbitofrontal cortex as a cognitive map of task space. Neuron 81, 267–279.

Data Availability Statement

Scripts for the behavioral paradigm can be found at https://github.com/mphgardner/RatEconChoiceTask. Data and analysis scripts are available upon request.
