Summary
Neural correlates implicate the orbitofrontal cortex (OFC) in value-based or economic decision-making [1–3]. Yet inactivation of OFC in rats performing a rodent-version of the standard economic choice task is without effect [4, 5], a finding more in accord with ideas that the OFC is primarily necessary for behavior when new information must be taken into account [6–9]. Neural activity in the OFC spontaneously updates to reflect new information, particularly about outcomes [10–16], and OFC is necessary for adjustments to learned behavior only under these conditions [4, 16–25]. Here we merge these two independent lines of research by inactivating lateral OFC during an economic choice that requires new information about the value of the predicted outcomes to be incorporated into an already established choice. Outcome value was changed by pre-feeding the rats one of the two food options before testing. In control rats, this pre-feeding resulted in divergent changes in choice behavior that depended on the rats’ preference for the pre-fed food. Optogenetic inactivation of OFC disrupted this bi-directional effect of pre-feeding, without affecting other measures describing the underlying choice behavior. This finding unifies the role of the OFC in economic choice with its role in a host of other behaviors, causally demonstrating that the OFC is not necessary for economic choice per se, unless that choice must incorporate new information about the outcomes.
ETOC
Appropriate decision-making depends on up-to-date information about the available offers. Here Gardner et al. show that immediate adjustments in choice behavior following revaluation of an offer require the orbitofrontal cortex to be online at the time of the choice.
Results
Rats were trained on the economic choice task used in previous studies [4, 5]. Briefly, within this task, hungry rats are presented with choices between different amounts of unique food pellets. Two different visual stimuli are presented on each trial, each signaling, by shape and number of segmentations respectively, the availability of a particular type and amount of a unique food pellet (Figure 1A,B). Rats choose by touching the screen with the preferred option after a 1 second viewing period, during which they must maintain a nosepoke hold at a central port. Rats learn 6 visual cue -> food-type associations, resulting in 15 possible pairs. Standard sessions range from ~100–300 trials, over which 11 different offers of a particular pellet pair are randomly presented (see methods for more detail). The choice behavior across each of the offers in a session is used to construct a psychometric curve through which the relative value of the two food-types can be estimated. This estimate is expressed as the indifference point (IP), defined as the ratio of the two pellets at which the subject chooses equally between them.
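As an illustration of how an indifference point can be read off a psychometric curve, the following is a minimal sketch of a probit fit on the log offer ratio, in line with the probit regression described in the Methods. It is not the authors' analysis code (the Methods state that MATLAB's fitglm was used). The function names, the convention that x = log(B/A) with P(choose B) rising in x, the synthetic offers, and the use of expected (fractional) choice counts are all assumptions made for the example.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def fit_probit(log_ratios, n_trials, n_chose_b, iters=100, tol=1e-12):
    """Fit P(choose B) = Phi(b0 + b1 * x) by Fisher scoring (IRLS).

    Returns (mu, sigma), where mu = -b0 / b1 is the log indifference point
    and sigma = 1 / b1 is the inverse slope of the psychometric curve.
    """
    b0, b1 = 0.0, 1.0  # hypothetical starting values
    for _ in range(iters):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for x, n, k in zip(log_ratios, n_trials, n_chose_b):
            eta = b0 + b1 * x
            p = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = max(norm_pdf(eta), 1e-10)
            w = n * d * d / (p * (1.0 - p))   # IRLS weight
            z = eta + (k / n - p) / d         # working response
            s_w += w
            s_wx += w * x
            s_wxx += w * x * x
            s_wz += w * z
            s_wxz += w * x * z
        det = s_w * s_wxx - s_wx * s_wx
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return -b0 / b1, 1.0 / b1

# Synthetic session (assumed values): true IP of 2 B:A, inverse slope 0.5.
true_mu, true_sigma = math.log(2.0), 0.5
ratios = [1 / 3, 1 / 2, 1, 2, 3, 4, 6]            # B:A offer ratios
xs = [math.log(r) for r in ratios]
trials = [20] * len(xs)
chose_b = [n * norm_cdf((x - true_mu) / true_sigma) for x, n in zip(xs, trials)]

mu_hat, sigma_hat = fit_probit(xs, trials, chose_b)
ip_hat = math.exp(mu_hat)  # indifference point as a B:A ratio
```

Because the synthetic counts are the exact expected values under the assumed curve, the fit recovers the assumed indifference point; with real binary choices the estimate would carry sampling error that depends on the number of trials per offer.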
After reaching proficiency on this task and displaying transitivity in choices between several different pellets, rats in the current study (n = 13) underwent surgery in which a virus containing NpHR was infused into lateral OFC and fibers were implanted bilaterally overlying the area to allow optogenetic inactivation of neurons in the region during task performance (Figure 1D). After ~10–12 weeks for recovery, viral expression, and re-acclimatization to the task while tethered to fiber-optic cables, the rats were tested on one of the pellet pairs across two sessions: a standard “baseline” session, in which the full suite of offers was presented to the rats in their normally deprived state, and a “probe” session, in which several offers around the baseline session IP were presented, immediately following pre-feeding on one of the pellet-types (Figure 1C). The intent of the pre-feeding was to selectively revalue one of the pellets outside the context of the choice task [26, 27]. Light was delivered into the fiber-optic cables during cue presentation on all trials in the probe session; sometimes the rats wore a patent-fiber cable (n = 23 session-pairs), which allowed transmission of light into the OFC, while other times the rats wore a blocked-fiber cable (n = 24 session-pairs), which prevented light from entering the skull. Each rat contributed data to both types of sessions with the order counterbalanced, and each session-pair utilized a unique pair of pellets to avoid overtraining and practice effects and any lingering effects of pre-feeding on preference.
Also, since calculating the indifference point requires a substantial number of trials, we were not able to assess the effect of pre-feeding under extinction conditions, as is normally done in revaluation studies. By focusing on offers near the IP from the baseline session, we hoped to maximize our ability to see shifts caused by pre-feeding in the fewest trials possible, thereby minimizing the impact of learning about the new value of the pellet. Additionally, we only delivered light into OFC during the choice phase; light delivery was terminated when the rat made a choice, well before the pellets were retrieved from the food cup and consumed, which is presumably when learning would occur. Finally, our reading of the literature suggests that any contribution from learning in response to exposure to the revalued pellet should be insensitive to OFC inactivation, since OFC is typically unnecessary for shifts in behavior caused by directly-experienced changes in reward, as occurs in discrimination learning [28, 29], Pavlovian conditioning [17] and extinction by reward omission [30], or even after reward revaluation, as long as the reward is presented [31].
Established economic choice is sensitive to outcome revaluation
This approach identified clear, if subtle, shifts in economic choice behavior after pre-feeding of one of the pellets in the blocked-fiber control sessions; interestingly, the degree of the effect was variable, such that sometimes the shift was substantial, sometimes trivial, and in some cases seemed to be toward rather than away from the pre-fed pellet (Figure 1E, example sessions). To better determine the source of this variability, we plotted the change in the IP between the baseline and probe sessions against each rat’s preference for the pre-fed pellet, determined in the baseline session. This analysis revealed an inverse correlation between preference for the pre-fed pellet and the IP shift (Figure 2A, R = 0.56, p = 4.3 × 10⁻³), with the direction of shift reversing close to the point of equivalence or 1:1 preference (x-intercept of the regression = 0.93 B:A ratio). In other words, if the rats were pre-fed on the non-preferred pellet, they exhibited a classic devaluation effect [26], shifting their IP away from that pellet in the subsequent probe test (10 of 11 sessions, points right of 1:1), whereas if they were pre-fed on the preferred pellet, they exhibited what has been referred to as an appetizer or “potato chip” effect (Peter Holland, personal communication), shifting their IP toward that pellet (9 of 13 sessions, points left of 1:1; chi-square test for independence, χ²(1) = 8.86, p = 2.9 × 10⁻³). Isolating baseline sessions according to whether the pre-fed pellet was preferred or non-preferred (Figure 2A filled circles, and S1A for individual rats) revealed that the IP shifted toward the pre-fed pellet when it was preferred (grey whisker plot, one-sided t-test, t(10) = 2.20, p = 0.030) and away from the pre-fed pellet when it was non-preferred (magenta whisker plot, one-sided t-test, t = −3.01, n = 7, p = 0.012).
Importantly, this effect was present as early as it was possible to reliably estimate the IP (20 trials, p = 0.048, one-sided Kolmogorov-Smirnov test for all data; p = 0.014 for sessions with significant baseline preferences) and did not appear to increase over time. This was evident in an analysis of IP across a moving 20-trial window (Figure 2C, Figure S2A), in which the IPs estimated using the first 20 trials (median IP shift = 24.3%) were not significantly different from those of any other 20-trial interval over the subsequent 40 trials (maximum median IP shift = 25.3%, one-sided two-sample Kolmogorov-Smirnov test: p > 0.19). This result suggests that the effect was a spontaneous effect of pre-feeding rather than learning within the probe session.
Revaluation-sensitive changes in established economic choice require lateral OFC
Consistent with the idea that the OFC would be required for integrating new information about predicted outcomes into an established choice behavior, and not for the established behavior itself, the relationship between pellet preference and IP shift was selectively abolished in the patent-fiber sessions. Rats exhibited the same overall average and range of IPs in the blocked- and patent-fiber baseline sessions (mean IP blocked: 0.98 B:A ± 0.08%, IP patent: 0.98 B:A ± 0.11%; range IP blocked: [0.43, 2.17] B:A, range IP patent: [0.36, 2.67] B:A) and consumed the same amounts of both the preferred and non-preferred pellets during pre-feeding before the respective probe sessions (Blocked Fiber, n = 24: 15.3 ± 0.8 grams; Patent Fiber, n = 23: 15.7 ± 3.4 grams; two-sample t-test, t(25) = −0.11, p > 0.05). However, when OFC was inactivated in the probe test, the changes in IP from baseline were no longer related to the rats’ preference for the pre-fed pellet (Figure 2B, R = 0.095, p = 0.67). A comparison of the residuals of both regressions revealed a significant reduction in correlation in the patent-fiber condition (one-sided z-test, p = 0.042, z = 1.73; sessions with significant IPs: p = 5.2 × 10⁻³, z = 2.56). This conclusion was also supported by a within-subjects regression analysis including Fiber-Type and Baseline Preference as predictors and Subjects as a blocking factor. This analysis showed a significant interaction between Fiber-Type and Baseline Preference (β = 0.66, t(31) = 2.10, p = 0.043) with no effects of any of the other predictors (t(31) ≤ 1.34, p ≥ 0.19; see Figure 2 legend for details). This interaction was due to the aforementioned loss of coupling between the IP shift and preference that had been observed in the blocked-fiber condition.
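One common way to compare two independent correlations is Fisher's r-to-z transformation. The text does not name the exact procedure behind the reported z-test, so the sketch below is an assumption rather than a reconstruction of the authors' analysis. Using the reported correlation magnitudes (0.56 and 0.095) and session counts (24 and 23), it yields a z statistic of about 1.72 and a one-sided p of about 0.043, close to the reported z = 1.73 and p = 0.042. The function name is hypothetical.

```python
import math

def fisher_z_compare(r1, n1, r2, n2):
    """Compare two independent correlations via Fisher's r-to-z transform.

    Returns the z statistic and the one-sided p-value for r1 > r2.
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p_one_sided

# Reported values: blocked-fiber R = 0.56 (n = 24), patent-fiber R = 0.095 (n = 23).
z_stat, p_val = fisher_z_compare(0.56, 24, 0.095, 23)
```

The small gap between this result and the reported statistic is what one would expect if the published R values were rounded before being entered here.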
Importantly, this loss was present even in the earliest part of the probe session; a comparison of the cumulative distribution of IP shifts in the blocked- and patent-fiber sessions at 20 trials (Figure 2C and D) revealed a significant difference (p = 0.031, one-sided two-sample Kolmogorov-Smirnov test).
To affirm the robustness of the above effects based on the IP, we also examined the choice behavior directly by analyzing the percentage of trials in which pellet B (the pre-fed pellet) was chosen in the probe test (Figure 4A and B). We ran a linear regression with Offer, Baseline Preference, and Fiber-Type as predictors and Subjects as a blocking factor. Consistent with the effects of pre-feeding and OFC inactivation on IP described above, this analysis showed a significant Preference × Fiber-Type interaction overall (β = 0.39, t(148) = 3.20, p = 1.6 × 10⁻³; see Table S1 for full results).
OFC inactivation affected the relationship between changes in IP and pellet preference despite having no impact on general motivational changes in response latency and the slope of the psychometric curve induced by pre-feeding (Figure 3 and Figure 4C and D, respectively). Two-way repeated measures ANOVAs (Fiber × Day) comparing changes in response latency or slope from baseline to the probe session showed that while rats were generally slower and had shallower slopes in their choice behavior as a result of pre-feeding (main effect of Day for slope: F(1,78) = 4.04, p = 0.041; main effect of Day for latency: see figure legends), these effects were independent of OFC inactivation (Fiber × Day for slope: F(1,78) = 1.19, p = 0.28; Fiber × Day for latency: F(1,78) = 1.16, p = 0.28; see legends for Figures 3 and 4, respectively, for full results). Unlike the IPs, these effects did not interact with the baseline preference, as revealed by linear regressions identical to those performed for the IP (Fiber × Preference for the slope: β = −0.59, t(20) = 1.44, p = 0.17; Fiber × Preference for the latency: β = −0.59, t(20) = 1.44, p = 0.17; see Tables S2 and S3 for full results).
The impact of pre-feeding was disrupted despite a similar distribution of shifts in IP for the two conditions at test (Figures 2A and S2A; two-sample Kolmogorov-Smirnov test, p = 0.51, D = 0.22). Thus the patent-fiber condition showed variance in the IP difference across the two sessions similar to that of controls, but that variance was decoupled from the normal effect of pre-feeding; the IP shifted toward the pre-fed pellet in about half the sessions, regardless of whether it was preferred (6 of 13 sessions) or non-preferred (6 of 10 sessions; chi-square test for independence, χ²(1) = 0.43, p = 0.51), and while the IP followed the baseline preference in 19 of 24 sessions in the blocked-fiber control condition, it did so in only 10 of 23 sessions in the patent-fiber condition (chi-square test for independence, χ²(1) = 6.33, p = 0.011). Consequently, there were no significant changes in choice behavior in the probe test in the patent-fiber condition, whether we isolated sessions in which the preferred (Figure 2B, gold whisker plot, one-sided t-test, t(8) = 0.45, p = 0.67) or non-preferred pellet was pre-fed (green whisker plot, one-sided t-test, t(6) = −0.16, p = 0.88) or analyzed all the sessions together (one-sided t-test, t(15) = 0.48, p = 0.64).
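The session counts given in the text are enough to recompute the chi-square tests for independence. The sketch below uses an uncorrected Pearson chi-square on 2 × 2 tables. The absence of a continuity correction is an assumption, although it reproduces the reported statistics to within rounding. The function name and table layouts are illustrative.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table (no Yates correction).

    Returns (chi2, p), where p is the upper-tail probability for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, df = 1
    return chi2, p

# Blocked fiber: rows = non-preferred / preferred pre-fed; columns = away / toward.
chi2_blocked, p_blocked = chi_square_2x2([[10, 1], [4, 9]])

# Patent fiber: rows = preferred / non-preferred pre-fed; columns = toward / away.
chi2_patent, p_patent = chi_square_2x2([[6, 7], [6, 4]])

# Shift followed baseline preference: rows = blocked / patent; columns = yes / no.
chi2_follow, p_follow = chi_square_2x2([[19, 5], [10, 13]])
```

For small cell counts such as these, an exact test or a continuity correction would give more conservative p-values, which is worth keeping in mind when reading the reported values.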
Finally, to better visualize these divergent effects of pre-feeding in controls and their disappearance in the experimental condition, we plotted the choice behavior from the baseline and probe tests contingent on whether the preferred or non-preferred pellet had been pre-fed (Figure 4A and B). This consideration of the baseline preference as a binary factor (preferred/non-preferred) was consistent with previous analyses treating baseline preference as a continuous factor (regression analysis of choice behavior). A three-way repeated measures ANOVA (Fiber-Type × Offers × Baseline Preference) comparing changes in choice behavior across days revealed a significant Fiber-Type × Baseline Preference interaction (F(1,43) = 6.64, p = 0.013; see Table S4 for the full results). Tests of the simple effects for the blocked- and patent-fiber conditions revealed a strong main effect of preference in the blocked-fiber condition and a non-significant main effect of preference in the patent-fiber condition. The resultant average psychometric curves in Figure 4 provide a concise illustration of these preference-related effects of revaluation on economic choice behavior and their dependence upon lateral OFC.
Discussion
Here we have shown that established economic choice behavior, tested in rats in an experimental setting, is sensitive to changes in the current value of one of the outcomes on offer. Revaluing one of the two outcomes prior to a test session via pre-feeding resulted in reliable shifts in the outcomes’ relative value as revealed by the rats’ choice behavior. Importantly, while our design required us to violate “best practices” for revaluation testing by delivering the food pellets during the critical probe test, we found that the effects of pre-feeding were present in controls in the very earliest block of trials and did not change thereafter. This, along with the subsequent OFC-dependence of this behavior, strongly suggests that it was a spontaneous effect of pre-feeding rather than learning within the probe session. Thus our results provide the first demonstration of which we are aware that behavior in laboratory versions of economic choice, which normally involve substantial training, can remain sensitive to transient changes in the current or real-time value of the goods on offer.
This demonstration is important because, while revaluation-sensitive behavior has been upheld as synonymous with economic choice behavior [32], experimental studies using tasks such as the one employed here have generally paid little or no attention to whether the behavior is in fact based on this type of value as opposed to simply reflecting ingrained or habitual policies acquired with extensive experience on the task. Our current data suggest that there is a bit of both, since the indifference point was shifted by revaluation, showing that real-time value plays a role in its determination, but the shift was not extreme, averaging ~25% of the IP, consistent with the idea that value “cached” in the cues during prior experience plays a substantial role. We would speculate that the balance between such cached versus real-time values reflects the amount of experience the subject has had operating within a particular goods space; this would be consistent with general ideas regarding the habitization of behavior.
Interestingly, pre-feeding in the context of economic choice did not result in the unitary “devaluation” effect often observed in simple, Pavlovian or even instrumental settings [26, 27, 33, 34]. Rather than simply avoiding the pre-fed pellet, we found that the rats’ behavior interacted with their preference for the pellet. When it was non-preferred, they avoided it, showing the classic devaluation effect, whereas when it was preferred, they sought it out, showing what has been referred to as an appetizer effect (Peter Holland, personal communication). While appetizer effects have been described previously, they tend to be unreliable in experimental settings – an observation which may be due to lack of consideration of the food preference as a predictive factor (Peter Holland, personal communication). We would speculate that the robust, bidirectional effect seen here may reflect the use of a highly sensitive choice procedure – the economic choice task – to derive a precise estimate of the relative value of the pre-fed and control outcomes. The use of choice combined with the sensitivity of the procedure may bring out the appetizer effect in a way that other simpler procedures do not.
Consistent with the general hypothesis that the OFC is important for behavior when that behavior requires the integration of new information about impending outcomes or events [i.e. inferred or model-based information; 4, 35, 36], we found that optogenetic inactivation of the lateral OFC during the choice period in the task selectively abolished the effects of pre-feeding. Rats in the experimental condition exhibited the same behavior as controls at baseline, and ate the same amount of the pre-fed pellets, whether preferred or not. Further, they showed similar variance in their raw choice behavior and estimated indifference point between the baseline and probe test after pre-feeding. However, the shifts in their indifference point were uncoupled from the principled relationship to pellet preference that was observed in controls. This was true in both directions; that is, OFC inactivation disrupted both the devaluation effect observed when the non-preferred pellet was pre-fed and the appetizer effect observed when the preferred pellet was pre-fed. Combined with our prior negative results [4, 5], these findings indicate that the lateral region in OFC is necessary for economic choice behavior to the extent that behavior requires integration of new information into the model or goods-space used to guide the behavior. Notably, lateral OFC in rodents is arguably homologous with areas of lateral OFC in primates most closely associated with economic choice as well as revaluation-sensitive behavior [37–39]. Thus a selective role for this area in economic choice after revaluation is consistent with prior causal work on OFC function in other behavioral contexts [4, 15, 17–21, 24, 36] and with correlative evidence that neural representations in OFC update to reflect such integration [10–16].
Importantly, this framework predicts that economic choice behavior in which the goods space is new or not fully explored should be sensitive to OFC inactivation, since under these conditions the choice behavior requires the subject to infer or estimate the relative values of the new items from prior experience with items of varying similarity. This process of mental simulation is what we would speculate requires OFC-dependent mechanisms, which are engaged by the revaluation in the current experiment. This prediction is consistent with evidence from human studies – most correlative, one causal – indicating OFC’s involvement in economic choice when offers are more unique [3, 40], and with data indicating effects of OFC inactivation on slope and other measures of economic choice in mice that have only been trained on a single pellet pair and therefore lack extensive experience generalizing across goods [41]. We would suggest that, as with other behaviors, as economic choices become more a repetition of past actions and less dependent on inference and estimation, they will become less dependent on OFC.
Lastly, these findings provide, to the best of our knowledge, one of the first demonstrations that the OFC’s contribution to revaluation-sensitive changes in behavior is bidirectional. Previous work generally has only documented a role for OFC in settings when the new value must be used to stop or redirect a previously learned response. For example, in classic OFC-dependent devaluation designs, the OFC is required to stop responding to the cue when the outcome predicted is no longer desired [4, 15–21, 24]. This sort of deficit is often interpreted as reflecting deficits in response inhibition. The current data join a prior report by Gremel et al. [42] to show that the OFC’s role is not simply to inhibit behavior, since the OFC is equally necessary to reduce or increase responding. Instead, they support the interpretation that what the OFC provides is the ability to integrate novel information into the associative framework that is used to guide the behavior [7].
STAR Methods
Lead Contact and Materials Availability
Contact for Reagent and Resource Sharing
See the Key Resources Table for information about resources. Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Geoffrey Schoenbaum (geoffrey.schoenbaum@nih.gov).
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
DAPI-Fluorescent- G | Electron Microscopy Services | Cat No. 17984-24 |
Triton™ X-100 | Sigma-Aldrich | Cat No. X100-500ML |
Bacterial and Virus Strains | ||
AAV5/CamKIIa-eNpHR3.0-eYFP | UNC Vector Core | n/a |
AAV5/CamKIIa-eYFP | UNC Vector Core | n/a |
Biological Samples | ||
Chemicals, Peptides, and Recombinant Proteins | ||
Critical Commercial Assays | ||
Deposited Data | ||
Experimental Models: Cell Lines | ||
Experimental Models: Organisms/Strains | ||
Long-Evans Rat | Charles River | RRID: RGD_2308852 |
Oligonucleotides | ||
Recombinant DNA | ||
Software and Algorithms | ||
Matlab | Mathworks | RRID: SCR_001622 |
Graphic State | Coulbourn Instruments | Cat No. G4.0-UP |
Other | ||
Doric dual optical commutators | Doric Lenses | Cat No. FRJ_1x2i_FC-2FC_0.22 |
200 micron diameter fiber optic patch cable | Thorlabs | Cat No. M72L01 |
Fiber optic cannulae | Thorlabs | Cat No. CFM12U-20 |
ceramic zirconia ferrule bore 230 μm | Precision Fiber Products | Cat No. MM-FER2002S15-P |
FC multimode connector | Precision Fiber Products | Cat No. MM-CON2004-2300-2-BLK |
543 nm DPSS Laser | Shanghai Lasers | Cat No. GL543T3-100 |
Arduino Mega | Adafruit Industries | Cat No. 191 |
Raspberry Pi 3 B | Adafruit Industries | Cat No. 3055 |
3.5” Resistive Touch Screen | Adafruit Industries | Cat No. 2050 |
Experimental Model and Subject Details
Fifteen male Long-Evans rats (275–300 g, Charles River Laboratories), aged approximately 3 months at the start of the experiment, were trained and tested at the National Institute on Drug Abuse Intramural Research Program (Baltimore, MD) in accordance with the National Institutes of Health guidelines determined by the Animal Care and Use Committee. All rats had ad libitum access to water during the experiment and were fed 16–20 grams of food per day, including rat chow and pellets consumed during the behavioral task. Rats were initially food restricted to 85% of their baseline weight to begin training. Behavior was performed during the light phase of the light/dark schedule.
Method Details
Apparatus
Rats were trained and tested in modified standard behavioral boxes (12” x 10” x 12”, Coulbourn Instruments, Holliston, MA) that were controlled by a Raspberry Pi 3 (Raspberry Pi Foundation, Cambridge, UK) using custom-written code in Python (Python.org) [4, 5]. Both custom-made equipment and Coulbourn components were used in the apparatus. Touchscreens (Adafruit Industries, New York, NY; 2.8” for initial training and 3.5” for later training and testing) were housed in custom-made walls and were controlled by individual microcontrollers (Arduino Mega, Arduino, www.arduino.cc), which communicated with the Raspberry Pi 3 to display the current offers and provide screen press feedback. Custom-designed nosepoke ports (1.5” H X 1.25” W X 1.5” D) with infrared photodetectors to determine whether a poke had occurred were fixed to the floor of the box about one inch from the wall. The primary configuration of the box had touchscreens and accompanying wall mounts oriented at 30° from the plane of the left side wall to facilitate better viewing of the screen while the rats were nosepoking at the central port. A tall recessed food magazine (Med-Associates, Fairfax, VT) was placed on the center of the right wall opposite to the nosepoke and touchscreens. Pellets from two separate externally mounted feeders were dispensed into the food magazine. The speaker used for playing the white noise cue (75 dB) to indicate the beginning of a trial was placed outside the conditioning chamber. During the optogenetic inhibition phase of the experiment, solid state lasers (532 nm; Laser Century, Shanghai, China) were controlled in analog mode (8 bit depth) by a microcontroller (Arduino Uno, Arduino, www.arduino.cc).
Choice Task
Each trial started with a white noise cue, which indicated that the rat could nosepoke at the central port. After a 1 second nosepoke at the port, the current offers were displayed on the two screens situated on either side of the nosepoke. After another 1 second period, during which the rats were required to remain in the nosepoke, the white noise ended, indicating that a choice could be made by touching either of the screens to receive the offer-type and pellet number displayed. Immediately following the choice, the pellets were delivered into the food magazine on the opposite side of the chamber. Rats then waited 6–16 seconds before the next trial started; this delay depended on a random component as well as the number of pellets delivered on the prior trial. The delay was determined empirically such that rats were not waiting for longer periods of time for the next trial to start following trials in which only 1 or 2 pellets were delivered. Failure to hold the nosepoke for the first second restarted the 1 second timer, and failure to hold the nosepoke once the screens were displayed resulted in the termination of the trial. Rats performed ~150–350 trials per session.
Food-Pellet Reinforcers
All rats received the same menu of pellet offers, listed here in average order of preference: highly palatable banana flavored pellets, Test-Diet 5-TUL (1813985); bacon flavored pellets containing lactose and 1.4% NaCl, Bio-Serv, custom formulation (F07382); grain flavored pellets, Test-Diet 5-TUM (1811143); grape flavored pellets with 50% sucrose and 50% cellulose, Test-Diet, custom formulation (1817455–371); chocolate flavored pellets with 25% sucrose and 75% cellulose, Test-Diet, custom formulation (1817259–371); and 100% cellulose pellets, Test-Diet 5-TUW (1811557). Visual cues predicting the different offer-types consisted of different shapes, indicating the type of pellet available, and different numbers of segmentations of the symbol, indicating the number of pellets available [4]. Each rat received unique cue-pellet pairings that remained constant throughout testing.
Shaping and Pre-Surgical Training
Initial training on the task lasted 3–4 months before rats experienced any of the tested pairs of pellets and progressed through several stages that introduced different aspects of the task. Before starting, rats were food restricted to ~85% of their body weight. They were first trained to touch a single illuminated touchscreen to receive unflavored sucrose pellets, after which they began training to discriminate two visual cues, one of which resulted in an unflavored sucrose pellet and the other in nothing (these images were not used for any subsequent aspect of the task). After rats showed discriminative behavior to the two visual cues, a central nosepoke was introduced to the box and rats were progressively trained to hold in the port for 2 seconds (1 second with no cues on and 1 second with visual cues displayed) when the white noise cue was turned on. Upon acquisition of the nosepoke, rats were introduced to the full task. To learn each of the cue-pellet associations, rats were trained for several days on each of the 5 flavored pellets versus a non-preferred cellulose pellet. After rats showed stable preferences for each of the pellets versus cellulose, they were exposed to other pellet-pairs. In each session, rats were given 11 possible offers including the 1:0 and 0:1 offers. The other 9 offers ranged either from 1:6 to 6:1 or from 1:4 to 8:1 (X:Y, Y being the presumed preferred pellet-type) from the offer set [1:8, 1:6, 1:4, 1:3, 1:2, 1:1, 2:1, 3:1, 4:1, 6:1, 8:1], depending on the presumed pair preference.
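The offer structure described above can be checked with a short sketch: two forced offers plus a contiguous run of 9 offers from the 11-step offer set, and the number of distinct pellet pairs available from 6 pellet types (or from the 5 flavored types). The names `OFFER_LADDER`, `FORCED_OFFERS`, and `session_offers` are hypothetical and are not taken from the authors' task code.

```python
from itertools import combinations

# Offer set from the text, written as (X, Y) pellet counts.
OFFER_LADDER = [(1, 8), (1, 6), (1, 4), (1, 3), (1, 2), (1, 1),
                (2, 1), (3, 1), (4, 1), (6, 1), (8, 1)]
FORCED_OFFERS = [(1, 0), (0, 1)]

def session_offers(lowest, highest):
    """Return the forced offers plus the contiguous ladder run from lowest to highest."""
    start = OFFER_LADDER.index(lowest)
    stop = OFFER_LADDER.index(highest)
    return FORCED_OFFERS + OFFER_LADDER[start:stop + 1]

symmetric_offers = session_offers((1, 6), (6, 1))  # 1:6 through 6:1
skewed_offers = session_offers((1, 4), (8, 1))     # 1:4 through 8:1

# Distinct pellet pairs: all 6 pellet types, and the 5 flavored types only.
all_pairs = list(combinations(range(6), 2))
flavored_pairs = list(combinations(range(5), 2))
```

Both ranges yield 11 offers per session, and the pair counts come out to 15 and 10, matching the numbers given in the Results and in the post-surgical testing description.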
Surgery
Surgical procedures followed guidelines for aseptic technique. Rats received AAV-CaMKIIa-eNpHR3.0-eYFP (Gene Therapy Center at University of North Carolina at Chapel Hill) bilaterally into the lateral OFC under stereotaxic guidance at AP 3.0 mm, ML ±3.2 mm, and DV −4.4 mm from the brain surface. A total of 1 μl of virus (titer ~10¹²) per hemisphere was delivered at a rate of ~0.1 μl/min by infusion pump [43]. Immediately following viral infusions, optic fibers (200 μm core diameter; Thorlabs, Newton, NJ) were implanted bilaterally at AP 3.0 mm, ML ±3.2 mm, and DV −4.2 mm (from dura) at an angle of 10 degrees in the M/L plane. Cephalexin (15 mg/kg p.o.) was administered daily for 10 days post-operatively to prevent infection.
Post-Surgical Testing
Following a 2–3 week recovery from surgery, rats were retrained on the full task and accustomed to performing with two fiber-optic patch cables attached to an optic commutator (Doric Lenses, Quebec, Canada). Cables were constructed with blocking covers to reduce leakage of light into the box; however, it is impossible to completely eliminate light leakage. To control for effects of such light leakage during laser-on trials, ‘dummy’ fiber-optic cables were employed during retraining and testing. The ‘dummy’, or blocked-fiber, cables were constructed identically to the patent-fiber cables with one exception: the optical fiber was terminated at the ferrule, ~1 cm from the animal-side terminal of the patch cable, and a solid metal wire was inserted into the ferrule and epoxied into place to block light transmission into the brain. All blocked-fiber cables were tested after construction as well as on a periodic basis using a Fiber Optic Power Meter (Thorlabs). After rats were familiarized with the blocked-fiber cables and the laser being turned on, testing was begun.
One of the 10 possible pairs of pellets (not including the cellulose pellet) was randomly chosen and rats performed the full choice task while attached to the blocked-fiber cables with the laser turned on for all trials. This first day of re-exposure to a particular pellet-pair was used to obtain an estimate of the rats’ baseline IP for the particular pair. On the second day of the experiment, rats were first exposed to one of the pellet-types, counterbalanced by whether the pre-fed pellet was the more or less preferred pellet based on the IP acquired from the prior day. Rats had 2-hour access to 20 grams of one of the pellet-types used on the prior day. Immediately following this pre-exposure, rats were run on the task with a limited offer set of 3–4 offers centered on the IP of the prior day. For example, if the estimated IP on day 1 was 2B:1A, offers [1B:1A, 2B:1A, 3B:1A] were used during the probe test. This limited offer set was used to obtain the best possible IP estimate from a reduced number of trials following pre-feeding. A small subset of the sessions also included the forced offers [1B:0A, 0A:1B], which were not considered in any of the analyses. The first 60 trials of probe sessions were used for analysis in order to minimize the effect of learning the new reinforcer values. Sessions with fewer than 20 trials were excluded from the analysis because they provided insufficient data for estimating the IP (3 sessions). For 6 of the sessions, rats completed between 40 and 60 trials. Overall, rats completed a median of 102 trials during the probe test and a median of 223 trials during the baseline session. During the probe test, rats were attached to either the blocked-fiber or the patent-fiber cables and the laser was turned on for all trials. If the connection of the cables became loose by the end of the session, the session was discarded from the analysis.
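The probe-session rules above can be sketched in a few lines. The selection rule used here is an assumption: it takes the offer from the training offer set closest to the baseline IP in log space, plus its neighbors. The text only says that the 3–4 offers were centered on the prior day's IP, but this rule does reproduce the worked example (an IP of 2B:1A giving 1B:1A, 2B:1A, 3B:1A). The function names and the trial-list representation are hypothetical.

```python
import math

# Offers as (B, A) pellet counts; assumed to come from the training offer set.
OFFER_LADDER_B_A = [(1, 8), (1, 6), (1, 4), (1, 3), (1, 2), (1, 1),
                    (2, 1), (3, 1), (4, 1), (6, 1), (8, 1)]

def probe_offers(baseline_ip, n_offers=3):
    """Pick n_offers contiguous offers centered on the offer nearest the baseline IP.

    baseline_ip is the B:A ratio at indifference; distance is measured in log space.
    """
    log_ip = math.log(baseline_ip)
    distances = [abs(math.log(b / a) - log_ip) for b, a in OFFER_LADDER_B_A]
    center = distances.index(min(distances))
    start = center - n_offers // 2
    start = max(0, min(start, len(OFFER_LADDER_B_A) - n_offers))  # stay on the ladder
    return OFFER_LADDER_B_A[start:start + n_offers]

def usable_probe_trials(trials, min_trials=20, max_trials=60):
    """Apply the stated rules: exclude short sessions, keep only the first 60 trials."""
    if len(trials) < min_trials:
        return None  # session excluded: too few trials to estimate the IP
    return trials[:max_trials]

example_offers = probe_offers(2.0)
kept_full = usable_probe_trials(list(range(102)))
kept_short = usable_probe_trials(list(range(45)))
excluded = usable_probe_trials(list(range(15)))
```

Centering in log space rather than on raw ratios treats a 2:1 and a 1:2 offer as equally far from 1:1, which matches the log-ratio scale used for the probit fit.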
The lasers (532 nm, 16–18 mW; Laser Century, Shanghai, China) were controlled by a microcontroller (Arduino Uno, Arduino) and were turned on concurrently with the white noise cue that signaled that a trial could be initiated. Lasers were turned off at the time of decision using a linear ramp over 300 ms to avoid the possibility of rebound excitation. To minimize the duration of laser illumination, the white noise and laser remained on for 5 seconds before a timeout period occurred. Rats also had a maximum of 5 seconds to make a choice once the nosepoke hold was fulfilled. Sessions lasted 2–2.5 hours.
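The 300 ms linear ramp-down described above can be illustrated with a short sketch. This is not the Arduino control code used in the experiment; the function name `laser_power_mw` and the 17 mW peak (an arbitrary value within the reported 16–18 mW range) are assumptions made for illustration only.

```python
def laser_power_mw(t_ms, peak_mw=17.0, ramp_ms=300.0):
    """Commanded laser power t_ms milliseconds after the decision.

    Assumes full power up to the decision (t_ms <= 0), a linear
    decrease over ramp_ms, and zero power afterwards.
    peak_mw is a hypothetical value within the reported 16-18 mW range.
    """
    if t_ms <= 0:
        return peak_mw
    if t_ms >= ramp_ms:
        return 0.0
    return peak_mw * (1.0 - t_ms / ramp_ms)


if __name__ == "__main__":
    # Print the commanded power every 50 ms across the ramp.
    for t in range(0, 351, 50):
        print(t, laser_power_mw(t))
```

Halfway through the ramp (150 ms after the decision) the commanded power in this sketch is half of the peak value.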
Histology
After completion of the experiment, rats were perfused with phosphate-buffered saline followed by 4% PFA. The brains were then immersed in 30% sucrose for at least 24 hr and frozen. The brains were sliced at 40 μm and stained with DAPI (through Vectashield-DAPI, Vector Lab, Burlingame, CA). The location of the fiber tip and NpHR-eYFP or eYFP expression was verified using an Olympus confocal microscope.
Quantification and Statistical Analysis
Indifference Point and Inverse Slope Estimation
Raw data were collected using custom-written code in Python. All further analysis was performed using MATLAB. As described previously [1], in order to estimate a scalar relative value of two goods from a limited subset of all possible offers, an assumption must be made about the function relating the two goods in offer space. Here we assume a linear indifference curve (within a reasonable set of offer space), which entails that the ratio of the number of each good offered leading to indifferent behavior remains constant as the number of goods offered increases. In order to estimate the relative value of two goods from the choice behavior we performed a probit regression for each session [44], which uses the cumulative distribution function of the normal distribution to predict the choice behavior given the log ratio of the offers. This provides estimated parameters $\hat{\mu}$ and $\hat{\sigma}$ of the fitted normal distribution, which were used as estimates for the log of the indifference point (IP), that is, the estimated relative value, and the inverse slope parameter, respectively. This analysis was performed using the fitglm function in MATLAB, which fits a generalized linear model of the choice data using the inverse normal cumulative distribution (‘probit’) function as the link function and assumes a Bernoulli distribution for the binary choice response variable, resulting in the model

$$P(\text{choice} = 1 \mid x) = \Phi(\beta_0 + \beta_1 x)$$

in which Φ is the normal cumulative distribution function

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^{2}/2}\, dt$$

and the predictor, x, is the log of the offer ratios, resulting in the estimated parameters $\hat{\mu} = -\beta_0/\beta_1$ and $\hat{\sigma} = 1/\beta_1$. Sessions with relative pellet values outside of the offer range tested were not included in the analysis. This was done by excluding sessions (n = 1) with estimated indifference points $\mathrm{IP} = \exp(\hat{\mu})$ greater than a 6:1 ratio (non-preferred:preferred pellet) during the baseline day of the experiment.
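As an illustration of the probit fit described above, the following is a minimal standard-library Python sketch. It is not the MATLAB fitglm analysis used in the study: the function names (`fit_probit`, `indifference_point`), the Fisher-scoring optimizer, and the toy choice data are all assumptions made for illustration. It assumes a model of the form Φ(β0 + β1·x), under which the fitted mean is −β0/β1 and the inverse slope is 1/β1.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def fit_probit(x, y, n_iter=100, tol=1e-12):
    """Fit P(y = 1) = Phi(b0 + b1 * x) by Fisher scoring.

    x: log offer ratios, one per trial; y: binary choices (0 or 1).
    Returns the maximum-likelihood estimates (b0, b1).
    """
    b0, b1 = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            p = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = norm_pdf(eta)
            score = d * (yi - p) / (p * (1.0 - p))   # d(log-lik)/d(eta)
            weight = d * d / (p * (1.0 - p))          # expected information
            g0 += score
            g1 += score * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def indifference_point(b0, b1):
    """Return (IP, inverse slope), with log(IP) = -b0 / b1 and sigma = 1 / b1."""
    return math.exp(-b0 / b1), 1.0 / b1


# Toy session (hypothetical data): 10 trials at each of 1B:1A, 2B:1A, 4B:1A,
# with good B chosen on 2, 5 and 8 of those trials respectively.
offers = [math.log(1.0)] * 10 + [math.log(2.0)] * 10 + [math.log(4.0)] * 10
choices = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
b0_hat, b1_hat = fit_probit(offers, choices)
ip_hat, sigma_hat = indifference_point(b0_hat, b1_hat)
```

Because the toy data are symmetric around the 2B:1A offer on the log scale, the estimated IP in this example sits at a 2:1 ratio; real sessions would of course yield asymmetric data and noisier estimates.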
Correlation and Linear Regression
The correlation between the change in IP from day 1 to day 2 and the baseline IP was measured using Pearson’s correlation coefficient. The correlation coefficients for the two conditions (Blocked and Patent) were then compared using a one-sided z-test (the primary hypothesis of the experiment was that OFC inactivation would disrupt revaluation effects) following application of Fisher’s Z-transformation to normalize the correlation coefficients. An ordinary least squares regression of the form

$$\Delta \mathrm{IP} = \beta_1 \left( \mathrm{IP}_{\text{baseline}} - \beta_0 \right)$$

was performed with the baseline IP as the predictor and the change in IP from day 1 to day 2 as the response variable in order to determine the point at which the change in preference reversed, $\beta_0$, the x-intercept.
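The two analyses in this subsection can be sketched in standard-library Python as follows. This is an illustrative sketch, not the MATLAB code used in the study: the function names are hypothetical, the direction of the one-sided test (first correlation greater than the second) is an assumption, and the x-intercept is obtained from an ordinary slope-and-intercept fit as −intercept/slope.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def compare_correlations(r1, n1, r2, n2):
    """One-sided z-test on Fisher Z-transformed correlations.

    Tests whether r1 > r2 (direction chosen here for illustration).
    Returns (z, one_sided_p) for two independent samples of size n1 and n2.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p


def ols_x_intercept(x, y):
    """Ordinary least squares fit of y on x; returns (slope, x_intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, -intercept / slope
```

For example, `ols_x_intercept(baseline_ip, delta_ip)` (with hypothetical per-session lists) would return the slope of the relationship and the baseline IP at which the fitted change in IP crosses zero.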
Average Choice Behavior Alignment
The average choice behavior across sessions (Figure 4A and B) was computed by subtracting the estimated IP from the log of the offer ratios for each session. The relative offer ratios were then binned into the intervals shown in Figure 4A and B. To visualize the average shift in IP across sessions in which either the preferred or the non-preferred pellet was pre-fed, preferred pre-fed sessions were reversed.
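The alignment step can be sketched as below. This is a simplified illustration under stated assumptions: the IP is taken in log units so that it can be subtracted from the log offer ratio, trials are pooled across sessions within each bin (rather than averaging per-session means), and the bin edges are hypothetical because the intervals of Figure 4 are not reproduced here. The function name `align_and_bin` is likewise an assumption.

```python
import math


def align_and_bin(sessions, edges):
    """Align each session's offers to its own IP and average choices per bin.

    sessions: list of (log_ip, trials), where trials is a list of
              (log_offer_ratio, chose_b) pairs with chose_b in {0, 1}.
    edges:    ascending bin edges for the relative log offer ratio.
    Returns a dict mapping (low, high) bins to the fraction of B choices.
    """
    counts = {}
    for log_ip, trials in sessions:
        for log_ratio, chose_b in trials:
            rel = log_ratio - log_ip  # offer relative to the indifference point
            for low, high in zip(edges[:-1], edges[1:]):
                if low <= rel < high:
                    total, n = counts.get((low, high), (0, 0))
                    counts[(low, high)] = (total + chose_b, n + 1)
                    break
    return {k: total / n for k, (total, n) in counts.items()}


# Hypothetical example: two sessions with IPs of 2:1 and 4:1.
example_sessions = [
    (math.log(2), [(math.log(2), 1), (math.log(4), 1), (0.0, 0)]),
    (math.log(4), [(math.log(4), 0), (math.log(8), 1)]),
]
example_edges = [-1.0, -0.5, 0.5, 1.0]
binned = align_and_bin(example_sessions, example_edges)
```

After alignment, an offer equal to a session's own IP falls at zero on the relative axis regardless of the pellet pair, which is what allows sessions with different baseline preferences to be averaged together.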
Data and Code Availability
Scripts and CAD files for the behavioral task and equipment can be found at https://github.com/mphgardner/RatEconChoiceTask. Additional code and the dataset will be made available upon request from the lead contact, Geoffrey Schoenbaum (geoffrey.schoenbaum@nih.gov).
Highlights.
Rats show immediate changes in choice behavior following reinforcer revaluation.
Direction of satiety-specific revaluation depends on the baseline food preference.
Orbitofrontal inactivation disrupts behavior following reinforcer revaluation.
Acknowledgments
This work was supported by the Intramural Research Program at NIDA (GS). The authors thank Andrew Wikenheiser, Melissa Sharpe, and Kaue Costa for their helpful insights, and Dr. Karl Deisseroth and the Gene Therapy Center at the University of North Carolina at Chapel Hill for providing viral reagents. The opinions expressed in this article are the authors’ own and do not reflect the view of the NIH/DHHS.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Declaration of Interests
The authors declare no competing interests.
References
1. Padoa-Schioppa C, and Assad JA (2006). Neurons in orbitofrontal cortex encode economic value. Nature 441, 223–226.
2. Levy DJ, and Glimcher PW (2011). Comparing apples and oranges: Using reward-specific and reward-general subjective value representation in the brain. Journal of Neuroscience 31, 14693–14707.
3. Plassmann H, O’Doherty J, and Rangel A (2007). Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. Journal of Neuroscience 27, 9984–9988.
4. Gardner MPH, Conroy JS, Shaham MH, Styer CV, and Schoenbaum G (2017). Lateral orbitofrontal inactivation dissociates devaluation-sensitive behavior and economic choice. Neuron 96, 1192–1203.
5. Gardner MPH, Conroy JC, Styer CV, Huynh T, Whitaker LR, and Schoenbaum G (2018). Medial orbitofrontal inactivation does not affect economic choice. eLife 7, e38963.
6. Rudebeck PH, and Murray EA (2014). The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron 84, 1143–1156.
7. Wilson RC, Takahashi YK, Schoenbaum G, and Niv Y (2014). Orbitofrontal cortex as a cognitive map of task space. Neuron 81, 267–279.
8. Bradfield J, and Hart G (2019). Medial and lateral orbitofrontal cortices represent unique components of cognitive maps of task space. PsyArXiv.
9. Delamater AR (2007). The role of the orbitofrontal cortex in sensory-specific encoding of associations in pavlovian and instrumental conditioning. Ann N Y Acad Sci 1121, 152–173.
10. Gottfried JA, O’Doherty J, and Dolan RJ (2003). Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104–1107.
11. Howard JD, and Kahnt T (2017). Identity-specific reward representations in orbitofrontal cortex are modulated by selective devaluation. Journal of Neuroscience 37, 2627–2638.
12. O’Doherty J, Rolls ET, Francis S, Bowtell R, McGlone F, Kobal G, Renner B, and Ahne G (2000). Sensory-specific satiety-related olfactory activation of the human orbitofrontal cortex. Neuroreport 11, 893–897.
13. Valentin VV, Dickinson A, and O’Doherty JP (2007). Determining the neural substrates of goal-directed learning in the human brain. Journal of Neuroscience 27, 4019–4026.
14. Critchley HD, and Rolls ET (1996). Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. Journal of Neurophysiology 75, 1673–1686.
15. Gremel CM, and Costa RM (2013). Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nature Communications 4, 2264.
16. Sadacca BF, Wied HM, Lopatina N, Saini GK, Nemirovsky D, and Schoenbaum G (2018). Orbitofrontal neurons signal sensory associations underlying model-based inference in a sensory preconditioning task. eLife 7, e30373.
17. Gallagher M, McMahan RW, and Schoenbaum G (1999). Orbitofrontal cortex and representation of incentive value in associative learning. Journal of Neuroscience 19, 6610–6614.
18. Izquierdo AD, Suda RK, and Murray EA (2004). Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. Journal of Neuroscience 24, 7540–7548.
19. Reber J, Feinstein JS, O’Doherty JP, Liljeholm M, Adolphs R, and Tranel D (2017). Selective impairment of goal-directed decision-making following lesions to the human ventromedial prefrontal cortex. Brain 140, 1743–1756.
20. Murray EA, Moylan EJ, Saleem KS, Basile BM, and Turchi J (2015). Specialized areas for value updating and goal selection in the primate orbitofrontal cortex. eLife 10.7554/eLife.11695.001.
21. West EA, DesJardin JT, Gale K, and Malkova L (2011). Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. Journal of Neuroscience 31, 15128–15135.
22. Panayi MC, and Killcross S (2018). Functional heterogeneity within the rodent lateral orbitofrontal cortex dissociates outcome devaluation and reversal learning deficits. eLife 7.
23. Miller KJ, Botvinick MM, and Brody CD (2018). Value Representations in Orbitofrontal Cortex Drive Learning, not Choice. bioRxiv, 245720.
24. Howard JD, Reynolds R, Smith DE, Voss JL, Schoenbaum G, and Kahnt T (2019). Targeted stimulation of human orbitofrontal networks disrupts outcome-guided behavior. bioRxiv 10.1101/740399.
25. Keiflin R, Reese RM, Woods CA, and Janak PH (2013). The orbitofrontal cortex as part of a hierarchical neural system mediating choice between two good options. J Neurosci 33, 15989–15998.
26. Holland PC, and Rescorla RA (1975). The effects of two ways of devaluing the unconditioned stimulus after first and second-order appetitive conditioning. Journal of Experimental Psychology: Animal Behavior Processes 1, 355–363.
27. Johnson AW, Gallagher M, and Holland PC (2009). The basolateral amygdala is critical to the expression of pavlovian and instrumental outcome-specific reinforcer devaluation effects. J Neurosci 29, 696–704.
28. Schoenbaum G, Nugent S, Saddoris MP, and Setlow B (2002). Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport 13, 885–890.
29. Walton ME, Behrens TEJ, Buckley MJ, Rudebeck PH, and Rushworth MFS (2010). Separable learning systems in the macaque brain and the role of the orbitofrontal cortex in contingent learning. Neuron 65, 927–939.
30. Takahashi Y, Roesch MR, Stalnaker TA, Haney RZ, Calu DJ, Taylor AR, Burke KA, and Schoenbaum G (2009). The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron 62, 269–280.
31. Bradfield LA, Dezfouli A, van Holstein M, Chieng B, and Balleine BW (2015). Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron 88, 1268–1280.
32. Padoa-Schioppa C (2011). Neurobiology of economic choice: a goods-based model. Annual Review of Neuroscience 34, 333–359.
33. Dickinson A, and Balleine BW (1994). Motivational control of goal-directed action. Animal Learning and Behavior 22, 1–18.
34. Colwill RM (1993). An associative analysis of instrumental learning. Current Directions in Psychological Science 2, 111–116.
35. Schoenbaum G, and Esber G (2010). How do you (estimate you will) like them apples? Integration as a defining trait of orbitofrontal function. Current Opinion in Neurobiology 20, 205–211.
36. Jones JL, Esber GR, McDannald MA, Gruber AJ, Hernandez A, Mirenzi A, and Schoenbaum G (2012). Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science 338, 953–956.
37. Stalnaker TA, Cooch NK, and Schoenbaum G (2015). What the orbitofrontal cortex does not do. Nature Neuroscience 18, 620–627.
38. Ongur D, and Price JL (2000). The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cerebral Cortex 10, 206–219.
39. Heilbronner SR, Rodriquez-Romaguera J, Quirk GJ, Groenewegen HJ, and Haber SN (2016). Circuit based cortico-striatal homologies between rat and primate. Biological Psychiatry 80, 509–521.
40. Camille N, Griffiths CA, Vo K, Fellows LK, and Kable JW (2011). Ventromedial frontal lobe damage disrupts value maximization in humans. Journal of Neuroscience 31, 7527–7532.
41. Kuwabara M, Holy TE, and Padoa-Schioppa C (2019). Neural mechanisms of economic choices in mice. bioRxiv.
42. Baltz ET, Yalcinbas EA, Renteria R, and Gremel CM (2018). Orbital frontal cortex updates state-induced value change for decision-making. eLife 7, e35988.
43. Takahashi YK, Chang CY, Lucantonio F, Haney RZ, Berg BA, Yau H-J, Bonci A, and Schoenbaum G (2013). Neural estimates of imagined outcomes in the orbitofrontal cortex drive behavior and learning. Neuron 80, 507–518.
44. Padoa-Schioppa C, and Assad JA (2008). The representation of economic value in the orbitofrontal cortex is invariant for changes in menu. Nature Neuroscience 11, 95–102.