Abstract
Reciprocal connections between the orbitofrontal cortex (OFC) and the basolateral nucleus of the amygdala (BLA) provide a critical circuit for guiding normal behavior when information about expected outcomes is required. Recently, we reported that outcome signaling by OFC neurons is also necessary for learning in the face of unexpected outcomes during a Pavlovian over-expectation task. Key to learning in this task is the ability to build on prior learning to infer or estimate an amount of reward never previously received. OFC was critical to this process. Notably, in parallel work, we found that BLA was not necessary for learning in this setting. This suggested a dissociation in which the BLA might be critical for acquiring information about the outcomes but not for subsequently using it to make novel predictions. Here we evaluated this hypothesis by recording single-unit activity from BLA in rats during the same Pavlovian over-expectation task used previously. We found that spiking activity recorded in BLA in control rats did reflect novel outcome estimates derived from the integration of prior learning; however, consistent with a model in which this process occurs in the OFC, these correlates were entirely abolished by ipsilateral OFC lesions. These data indicate that information about these novel predictions is represented in the BLA, supported via direct or indirect input from the OFC, even though the BLA does not appear to be necessary for learning.
SIGNIFICANCE STATEMENT The basolateral nucleus of the amygdala (BLA) and the orbitofrontal cortex (OFC) are involved in behavior that depends on knowledge of impending outcomes. Recently, we found that only the OFC was necessary for using such information for learning in a Pavlovian over-expectation task. The current experiment was designed to search for neural correlates of this process in the BLA and, if present, to ask whether they would still be dependent on OFC input. We found that although spiking activity in BLA in control rats did reflect the novel outcome estimates underlying learning, these correlates were entirely abolished by OFC lesions.
Keywords: amygdala, extinction, orbitofrontal, over-expectation, rat, single-unit
Introduction
Outcome-dependent behavioral control allows individuals to promptly adapt to changes in the environment. Historically, the basolateral amygdala (BLA) and the orbitofrontal cortex (OFC) have been regarded as two key components of the circuit important for such behavioral control (Jones and Mishkin, 1972). More recently, numerous studies have shown that both areas are necessary for changes in behavioral paradigms that reflect a knowledge of impending outcomes, such as reinforcer devaluation (Hatfield et al., 1996; Málková et al., 1997; Gallagher et al., 1999; Baxter et al., 2000; Balleine et al., 2003; Izquierdo et al., 2004; Machado and Bachevalier, 2007; Ostlund and Balleine, 2008; Rudebeck et al., 2013b; Zeeb and Winstanley, 2013), and associative encoding in the two areas is typically interdependent (Schoenbaum et al., 2003; Saddoris et al., 2005; Hampton et al., 2007; Rudebeck et al., 2013a). Although there is still some debate about the relative involvement of the two regions in learning versus using information (Blundell et al., 2001; Pickens et al., 2003, 2005; Wellman et al., 2005; Johnson et al., 2009; West et al., 2011; Zhang et al., 2013; Gore et al., 2015), these studies illustrate the close relationship between BLA and OFC in acquiring and then using outcome expectancies to guide behavior.
There is also evidence that OFC can use information about expected outcomes to facilitate learning. Although this is evident in a variety of settings (Jones and Mishkin, 1972; Walton et al., 2010; McDannald et al., 2011; Jones et al., 2012), it is perhaps most clear in Pavlovian over-expectation. In this task (Rescorla, 1970), a subject learns that two cues are independent predictors of reward. Subsequently, these two cues are presented together for several sessions, still followed by the normal amount of reward, and are finally presented separately in a probe test. Most subjects show a sudden decline in conditioned responding when the cues are separated. This decline occurs on the very first trial and shows all the hallmarks of extinction learning (Rescorla, 2006, 2007). However, unlike typical extinction, which is induced by omitting the outcome that has been received in prior training, here extinction is induced by tricking the subjects into expecting more than the amount normally delivered via compounding of the cues. Interestingly, the OFC is necessary for this type of extinction (Takahashi et al., 2009, 2013), even when it is not required for extinguishing the prior learning in response to omission (Burke et al., 2009; Takahashi et al., 2009). Further, OFC neurons show changes in firing at the time of compound cue presentation that suggest its role is to integrate the prior predictions to generate the novel outcome estimate (Takahashi et al., 2013).
In contrast, inactivation of the BLA has no effect on this form of extinction learning (Haney et al., 2010). This provides a novel dissociation between these two areas. Combined with evidence of a relative imbalance between BLA and OFC in the acquisition versus the manipulation/use of such associative information (Pickens et al., 2003), this suggests a model in which the BLA might represent the retrospectively acquired associative information, whereas the OFC might be more important for taking that information and using it prospectively. This model would be consistent with prior studies in which BLA damage affected associative correlates in OFC broadly (Schoenbaum et al., 2003), whereas OFC damage had a more circumscribed effect on outcome-anticipatory firing after a response was made (Saddoris et al., 2005). If this were true, then we would predict that single-unit firing in the BLA would be primarily related to the previously acquired elemental associations, and that any neural correlates related to the compound cue and the generation of the novel, heightened estimates of reward, which are generated on-the-fly, would be entirely dependent on input from OFC.
Here we tested this hypothesis by recording single-unit activity from BLA in rats during Pavlovian over-expectation. Consistent with the proposed model, we found that although spiking activity in BLA in control rats did reflect novel outcome estimates derived from the integration of prior learning, these correlates were entirely abolished by ipsilateral OFC lesions. These data indicate that information about these novel predictions is represented in the BLA, supported via direct or indirect input from the OFC.
Materials and Methods
Subjects.
Twenty-five male Long–Evans rats (Charles River Laboratories) weighing 250–275 g upon arrival were housed individually on a 12 h light/dark schedule (lights on at 7:00 A.M.). All rats were given ad libitum access to food except during testing periods. During behavioral testing (over-expectation training), rats were food deprived to 85% of their baseline weight, by giving 10 g of pellets each day until the desired weight was reached and then maintained at 85% with 15 g of pellets per day. Water was freely available throughout the experiments. No statistical test was run to determine sample size a priori. The sample sizes we chose are similar to those used in previous publications. All testing was conducted at the National Institute on Drug Abuse Intramural Research Program in accordance with NIH guidelines.
Stereotaxic surgeries and histology.
Drivable bundles of 10- to 25-μm-diameter FeNiCr recording electrodes (Stablohm 675, California Fine Wire) were implanted unilaterally in BLA under stereotaxic guidance at 3.0 mm posterior and 5.0 mm lateral to bregma and 7.5 mm ventral to the brain surface. Electrodes were advanced subsequently into final positions within BLA during recording. These rats were divided into three groups (Naive, Sham, and OFC-lesioned). Rats in the OFC-lesioned group (n = 9) received unilateral infusions of NMDA (12.5 μg/μl) into OFC to create neurotoxic lesions ipsilateral to the recording electrode. Rats in the sham group (n = 10) received infusions of saline vehicle at the same location, and rats in the nonsurgical group (n = 6) received no injections. Coordinates for OFC infusions were as follows: 0.05 μl: AP: +3.0 mm; ML: +3.2 mm; DV: −5.2 mm; 0.1 μl: AP: +3.0 mm; ML: +4.2 mm; DV: −5.2 mm; 0.1 μl: AP: +4.0 mm; ML: +2.2 mm; DV: −3.8 mm; 0.1 μl: AP: +4.0 mm; ML: +3.7 mm; DV: −3.8 mm.
At the end of the single-unit recording experiment, the brains were removed from the skulls and processed for histology using standard techniques to verify the final electrode position and the lesioned area. To verify the final electrode position, rats were deeply anesthetized and the final electrode position was marked by the passage of a current through each microwire to create a small iron deposit. The rats were then perfused with 4% PFA and potassium ferrocyanide solution to visualize the iron deposit.
Pavlovian over-expectation training.
Following recovery, rats began Pavlovian over-expectation training. Rats received the training in aluminum chambers ∼18 inches on each side with sloping walls narrowing to an area of 12 × 12 inches at the bottom. A food cup was recessed in the center of one end wall. Entries were monitored by photobeam. Two food dispensers containing 45 mg sucrose pellets (banana- or grape-flavored; Bio-Serv) delivered pellets to the food cup. White noise or a tone, each measuring ∼76 dB, was delivered via a wall speaker. A clicker (2 Hz) and a 6 W bulb were also mounted on that wall.
Rats were shaped to retrieve food pellets, and then underwent 12 conditioning sessions. In each session, the rats received eight 30 s presentations of each of three different auditory stimuli (A1, A2, and A3) and one visual stimulus (V). Each session consisted of eight blocks, and each block consisted of four presentations of a cue. The order of cue-blocks was counterbalanced and randomized. For all conditioning, V consisted of a cue light, and A1, A2, and A3 consisted of a tone, clicker, or white noise, respectively (counterbalanced). Two differently flavored sucrose pellets (banana and grape, designated as O1 and O2, counterbalanced) were used as rewards. A1 and V terminated with delivery of three pellets of O1, and A2 terminated with delivery of three pellets of O2. A3 was paired with no food. After completion of conditioning training, rats received a single session of compound probe (CP). During the first half of the session, the initial conditioning continued, with six trials each of four cues, in a blocked design, with order counterbalanced. During the second half of the session, compound training began with six trials of concurrent A1 and V presentation, followed by delivery of three pellets, the same amount of reward received during initial conditioning. A2, A3, and V continued to be presented as in initial conditioning, with six trials of each stimulus. These cues were also presented in a blocked design with order counterbalanced. After the compound probe, rats received 3 d of compound training sessions (CP2–CP4) with 12 presentations of A1V, A2, A3, and V. One day after the last compound training, rats received a single extinction probe session (PB). During the first half of the session, the compound training continued with six presentations of A1V, A2, A3, and V. During the second half of the session, rats received eight non-reinforced presentations of A1, A2, and A3, with the order mixed and counterbalanced.
Following the probe test, the electrode was typically advanced to a new location, and the rats repeated days 11 and 12 of conditioning and then underwent additional rounds of over-expectation training to acquire additional data. This was done up to two times for a given rat, resulting in 12 rounds of training for the nonsurgical group, 24 rounds of training for the sham group and 17 rounds of training for the OFC-lesion group. Neural data from the initial compound and extinction days were not statistically different from data gathered in later rounds of training and thus these neurons were analyzed together in the text.
The primary measure of conditioning to cues was the percentage of time that each rat spent with its head in the food cup during the last 20 s of conditioned stimulus (CS) presentation, as indicated by disruption of the photobeam. We also measured the percentage of time that each rat showed rearing behavior during the last 20 s of the CS period. To correct for time spent rearing, the percentage of responding during the last 20 s of the CS was calculated as follows: percentage of responding = 100 × ([% of time in food cup]/[100 − % of time rearing]).
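The rearing correction above can be written as a short function. The following is only an illustrative sketch, not the authors' analysis code: the function name `corrected_responding` and the input validation are assumptions added here, and both inputs are taken to be percentages of the last 20 s of the CS.

```python
def corrected_responding(pct_food_cup: float, pct_rearing: float) -> float:
    """Percentage of responding corrected for time spent rearing.

    Implements: 100 * (% time in food cup) / (100 - % time rearing).
    Both arguments are percentages (0-100) of the last 20 s of the CS.
    The name and the validation checks are illustrative assumptions.
    """
    if not 0 <= pct_food_cup <= 100 or not 0 <= pct_rearing <= 100:
        raise ValueError("percentages must lie between 0 and 100")
    available = 100.0 - pct_rearing  # share of the window not spent rearing
    if available <= 0:
        raise ValueError("rearing filled the whole window; measure undefined")
    return 100.0 * pct_food_cup / available


# A rat that spent 30% of the window in the food cup and 25% rearing
# responded during 40% of the time it was not rearing.
example = corrected_responding(30.0, 25.0)
```

With no rearing, the corrected measure reduces to the raw percentage of time in the food cup.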
Single-unit recording.
Throughout the Pavlovian over-expectation training, rats were attached to the recording cable and before each session, wires were screened for activity. Active wires were selected for recording, and the session was begun. On the rare occasion that <4/8 wires were active, the electrode assembly was advanced 40 or 80 μm at the end of the session. Otherwise, the electrode was kept in the same position between sessions within a single round of over-expectation training. After the extinction probe test, ending a round of training, the electrode assembly was advanced 80 μm regardless of the number of active wires to acquire activity from a new group of neurons in any subsequent training.
Neural activity was recorded using two identical Plexon Multichannel Acquisition Processor Systems, interfaced with the training chambers described above. After amplification and filtering, waveforms (>2.5:1 signal-to-noise) were extracted from active channels and recorded to disk by an associated workstation with event timestamps. Units were sorted using Offline Sorter software from Plexon, using a template-matching algorithm. Sorted files were processed in Neuroexplorer to extract unit timestamps and relevant event markers and analyzed in MATLAB.
Firing activity in the last 20 s of each CS was compared with activity in the last 20 s of the pre-CS period by t test (p < 0.05). Neurons with significantly higher activity during at least one of the four cues were defined as “cue-responsive” as described in the main text. Normalized firing rate was calculated by dividing the mean firing rate during the last 20 s of CS by the mean firing rate in the last 20 s of pre-CS period.
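The normalized firing rate described above can be sketched in a few lines. This is a hypothetical Python illustration rather than the MATLAB analysis used in the study. It assumes that the pre-CS period is the 20 s immediately preceding CS onset, that spike and event times are in seconds, and that the helper names (`window_rate`, `cs_and_pre_cs_rates`, `normalized_firing_rate`) are free choices. The t test used to classify cue-responsive neurons is not reproduced here.

```python
from statistics import mean

CS_DURATION_S = 30.0  # cue length stated in the text
WINDOW_S = 20.0       # analysis window: last 20 s of each period


def window_rate(spike_times, start, stop):
    """Firing rate (spikes/s) in the half-open interval [start, stop)."""
    n_spikes = sum(1 for t in spike_times if start <= t < stop)
    return n_spikes / (stop - start)


def cs_and_pre_cs_rates(spike_times, cs_onsets):
    """Per-trial rates in the last 20 s of the CS and of the pre-CS period.

    Assumption: the pre-CS window ends at CS onset.
    """
    cs_rates, pre_rates = [], []
    for onset in cs_onsets:
        cs_end = onset + CS_DURATION_S
        cs_rates.append(window_rate(spike_times, cs_end - WINDOW_S, cs_end))
        pre_rates.append(window_rate(spike_times, onset - WINDOW_S, onset))
    return cs_rates, pre_rates


def normalized_firing_rate(cs_rates, pre_rates):
    """Mean CS rate divided by mean pre-CS rate."""
    baseline = mean(pre_rates)
    if baseline == 0:
        raise ValueError("no pre-CS spikes; ratio undefined")
    return mean(cs_rates) / baseline


# Toy data: one trial with CS onset at 100 s. Two spikes fall in the
# pre-CS window [80, 100), four in the CS window [110, 130), and the
# spike at 105 s (first 10 s of the CS) is ignored.
spikes = [85.0, 95.0, 105.0, 112.0, 115.0, 120.0, 128.0]
cs_rates, pre_rates = cs_and_pre_cs_rates(spikes, [100.0])
ratio = normalized_firing_rate(cs_rates, pre_rates)
```

A ratio above 1 indicates higher firing during the cue than at baseline; a ratio below 1 indicates suppression.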
Results
We recorded single-unit activity from the BLA in naive, sham and OFC-lesioned rats during training on a Pavlovian over-expectation task (Rescorla, 1970; Fig. 1a). The Pavlovian over-expectation task was identical to that used in prior recording studies (Takahashi et al., 2013; Lucantonio et al., 2014). This task consists of three phases: initial conditioning, compound training, and extinction testing. In initial conditioning, rats are trained that each of several different cues predicts the same amount of reward in the same location. In subsequent compound training, two of the cues are presented together still followed by the same reward. Rats often show increased responding to this compound cue, termed summation, which is thought to reflect a novel and immediate expectation for increased reward. However, because the compound cue yields the same reward as each individual cue, the novel, increased expectation results in a negative prediction error and hence new learning. This is made apparent by an immediate decline in responding to one of the compounded cues when it is presented later, by itself, in the extinction test. As in previous recording studies, to examine firing from the same single-units across the critical transition points between simple conditioning and compound training and between compound training and extinction testing, these sessions were compressed into two “probe” sessions (Fig. 1a). All other data come from sessions separated by at least 1 d; we will not make any claims about whether we are recording the same neurons across days (Table 1 shows a full account of the numbers of neurons recorded in different phases).
Table 1.
Session | Learned | All: Control | All: OFC-lesion | Increase: Control | Increase: OFC-lesion | Decrease: Control | Decrease: OFC-lesion
---|---|---|---|---|---|---|---
Conditioning | CD1–2 | 41 | 30 | 8 | 3 | 10 | 7
Conditioning | CD3–4 | 40 | 18 | 9 | 3 | 11 | 5
Conditioning | CD5–6 | 37 | 35 | 8 | 11 | 12 | 8
Conditioning | CD7–8 | 43 | 34 | 14 | 11 | 12 | 8
Conditioning | CD9–10 | 59 | 45 | 15 | 12 | 13 | 11
Conditioning | CD11–12 | 101 | 55 | 36 | 16 | 24 | 14
Compound probe | CP | 87 | 54 | 19 | 15 | 16 | 12
Compound training | CP2 | 81 | 31 | 23 | 9 | 15 | 7
Compound training | CP3 | 68 | 27 | 15 | 6 | 13 | 4
Compound training | CP4 | 68 | 44 | 20 | 9 | 14 | 8
Extinction probe | PB | 80 | 58 | 16 | 11 | 17 | 12
Electrodes were implanted before any training (Fig. 1b). Rats in the OFC-lesioned group (n = 9) also received unilateral infusions of NMDA (12.5 μg/μl) into OFC to create neurotoxic lesions ipsilateral to the recording electrode. We used unilateral lesions ipsilateral to our recording electrodes to remove the primary source of OFC input to our recording site without confounding any neural changes with changes in behavior. Indeed, bilateral OFC lesions would have prevented both summation and the resultant extinction learning (Takahashi et al., 2009). Rats in the sham group (n = 10) received the same infusions of saline vehicle. Rats in the naive group (n = 6) received no infusions. Neural and behavioral data from naive rats were not statistically different from data gathered in sham rats (F values <0.0012, p values >0.97), and thus these animals were analyzed together in the text and we will refer to them as a control group. NMDA infusions targeted the ventral and lateral orbital areas and ventral and dorsal agranular insular areas (Fig. 1c).
Conditioning (Fig. 1a) consisted of presentations of three auditory cues (A1, A2, and A3, counterbalanced) and a visual cue (V). Each of these was paired with three sucrose pellets, except A3, which served as a CS−. Control and OFC-lesioned rats showed similar increases in conditioned responding to A1, A2, and V across sessions (Fig. 2a,b). A three-factor ANOVA (session × cue × treatment) comparing conditioned responding during cue presentation demonstrated significant main effects of both cue and session (cue: F(3,69) = 21.8, p < 0.01; session: F(5,115) = 71.7, p < 0.01), as well as a significant interaction between them (F(15,345) = 30.9, p < 0.01). However, there were neither significant main effects nor any interactions with treatment (F values <2.07, p values >0.11). Post hoc testing also showed that there were no differences in responding to A1 and A2 at any point in training in either group.
Conditioning was paralleled by a modest increase in the prevalence of cue-evoked neural activity in both control (Fig. 2c) and OFC-lesioned rats (Fig. 2d). Cue-evoked activity was present in ∼40% of the BLA neurons recorded in the first two sessions of conditioning, consisting of neurons that either increased or decreased firing to at least one of the four cues. With subsequent training, the proportion of neurons that showed a phasic increase in firing grew, whereas the proportion of neurons that suppressed firing did not change substantially (χ2 test; Fig. 2c,d). This suggests that the inhibitory neurons were not representing the relevant associations. Thus, the analyses presented here are focused on the population of neurons that showed excitatory phasic responses to the cues. Please note that neurons that showed inhibitory phasic responses to cues did not show any of the critical effects discussed below (data not shown).
At the end of conditioning, rats were trained in a compound probe session (Fig. 1a, CP). This session consisted of additional conditioning (CP 1/2) followed by compound training (CP 2/2), in which A1 and V were presented concurrently (A1V) followed by delivery of three pellets, the same amount of reward received in initial conditioning. A2, A3, and V were presented individually throughout compound training and were followed by the same reward as in initial conditioning. Both groups showed a significant increase in responding to A1 when it was presented in compound with V (Fig. 3a,b). A three-factor ANOVA (cue × phase × treatment) showed a significant interaction between cue and phase (F(3,69) = 0.76, p < 0.01), due to a significant increase in responding to A1 when it was paired with V (Fig. 3b). However, there were neither significant main effects nor any interactions with treatment (F values <1.24, p values >0.27). Notably, the increased responding to the A1V compound cue was specific; neither group showed any change in responding to the A2 control cue between the two phases.
We recorded 87 neurons from BLA in control rats and 54 neurons from BLA in OFC-lesioned rats during the compound probe session. These populations included 19 in the control group and 15 in the OFC-lesioned group that exhibited an excitatory phasic response to at least one of the cues during the conditioning phase. Behavioral summation at the start of compound training in the control group was accompanied by a sudden increase in the phasic neural response to the compound cue in these neurons in the controls (Fig. 3c,e) but not in the OFC-lesioned group (Fig. 3d,f). Two-factor ANOVAs (treatment × phase) showed significant effects of treatment and phase and a significant interaction between treatment and phase on the pattern of firing to A1 (treatment: F(1,32) = 4.55, p = 0.040; phase: F(1,32) = 7.93, p = 0.008; interaction: F(1,32) = 13.42, p = 0.0009) but not A2 (treatment: F(1,32) = 2.78, p = 0.10; phase: F(1,32) = 0.19, p = 0.66; interaction: F(1,32) = 0.57, p = 0.45), due to a significant increase in firing to A1 in the control but not in the OFC-lesioned group at the start of compound training. Post hoc analysis showed a significant difference between firing to A1 in control group during the compound phase compared with firing to A1 during the conditioning phase (p = 0.0003), and a significant difference to firing to A1 in compound and conditioning phases in OFC-lesioned animals (p < 0.02).
The contrast between control and OFC-lesioned rats was also evident in index scores, capturing the change in neural activity in each cue-responsive neuron to A1 and A2 between conditioning and compound training. In the control group, the distribution of these index scores shifted significantly above zero for A1 but not for A2 (Fig. 3g,h), whereas in the OFC-lesioned group, the distribution of the index scores did not shift for either cue (Fig. 3i,j). A1 also differed significantly between groups (Mann–Whitney U test, z = 2.74, p = 0.006). The shift in firing to the A1 cue in the control group was directly correlated with increased conditioned responding to the compound cue (Fig. 3k), confirming that neural summation in BLA predicted behavioral summation in the control rats. No correlation was found in OFC-lesion rats (Fig. 3l).
Importantly, the spontaneous increase in firing to the A1V compounded cue observed in the control group was not simply a reflection of the increased sensory input associated with the sudden combination of the two cues, but rather tracked the elevated expectations of reward. The A1/A2 firing ratio increased significantly at the start of compound training in control rats but then declined in subsequent sessions, consistent with learning (Fig. 3m; F(1,18) = 6.52, p = 0.02). By contrast, in the OFC-lesion group, activity to A1 and A2 remained similar and was stable from the end of conditioning to the last compound session (Fig. 3n; F(1,14) = 0.23, p = 0.64).
At the end of compound training, the rats were tested in an extinction probe session (Fig. 1a, PB). This session consisted of additional compound training (PB 1/2) followed by extinction (PB 2/2), in which A1 and the other auditory cues were presented alone without the food reward. As expected, in both groups, when A1 was separated from V at the start of extinction, the rats showed a sudden and selective decline in responding, which persisted throughout the extinction phase (Fig. 4a,b). Importantly, the reduction in responding to A1 was evident on the first trial of extinction. A three-factor ANOVA (cue × trial × treatment) revealed a significant interaction between cue and trial (F(14,322) = 21.9, p < 0.01), due to a significant decline in responding to A1 when it was separated from V. However, there were neither significant main effects nor any interactions with treatment (F values <3.56, p values >0.07).
We recorded 80 neurons from BLA in control rats and 58 neurons from BLA in OFC-lesioned rats during the extinction probe session, including 16 in the control group and 11 in the OFC-lesioned group exhibiting an excitatory phasic response to at least one of the cues. In the control group, firing to A1 spontaneously declined at the start of extinction training (Fig. 4c; F(1,15) = 13.45, p = 0.002), whereas firing to A2 did not (Fig. 4e; F(1,15) = 2.01, p = 0.17). Neurons recorded from OFC-lesioned rats showed no significant change in firing (Fig. 4d–f; A1: F(1,10) = 2.40; p = 0.15; A2: F(1,10) = 2.77, p = 0.13).
This contrast in effects was also evident in the distribution of index scores comparing firing of each neuron to A1 and A2 at the end of compound training versus the first trial in extinction. In the control group, the distribution of these scores was shifted below zero for A1 but not A2 (Fig. 4g,h), and the shift to the A1 cue on the first trial was directly correlated with reduced responding in that session (Fig. 4k). Notably, reduced behavioral responding to A1 was inversely correlated with neural summation measured in the first compound training session (Fig. 4m). Thus, the stronger the neural response to the compound cue at the start of compound training, the weaker conditioned responding to the A1 cue at the start of extinction. In the OFC-lesioned group, the distribution of the index scores did not shift for either cue (Fig. 4i,j), and there were no correlations between conditioned responding and neural activity (Fig. 4l,n). Thus, neural estimates of outcomes in BLA were predictive of both behavior and learning in rats that had an intact OFC but not in rats that had an OFC lesion ipsilateral to the recording electrode.
Discussion
Here we examined the potential basis of the apparent dissociation between the BLA and OFC in inferring novel outcomes for the purpose of learning in a Pavlovian over-expectation task. This is a task in which extinction learning is driven by an experimental manipulation, compound presentation of two previously conditioned cues, which is designed to increase the amount of reward expected, rather than by reward omission (Rescorla, 1970). Importantly, the increased expectation of reward induced by the compounding of the two cues requires the subject to integrate the historical or retrospective significance of the two cues to infer or estimate an amount of reward that may come in the future. We have shown previously that this prospective, integrative function depends critically upon the OFC. Rats lacking OFC function due to lesions (Takahashi et al., 2009), pharmacological inactivation (Takahashi et al., 2009), or optogenetic inhibition specifically at the time of cue integration (Takahashi et al., 2013) fail to learn, even in the face of normal extinction from reward omission (Burke et al., 2009; Takahashi et al., 2009).
Interestingly, in a single study, inactivation of the BLA had no effect on extinction induced by over-expectation; rats that received infusions of GABA agonists into BLA before compound training sessions showed entirely normal extinction assessed in later probe testing (Haney et al., 2010). Given the close historical correspondence in the functions of the OFC and BLA, this dissociation is striking as it suggests a novel distinction in how these two areas might be involved in using associative information. For example, the BLA might be more important for the acquisition and representation of information based on past experience, whereas the OFC might be more important for using that information to make predictions about future events. Normally these two functions would be closely related; however, under some circumstances, as for example when prior information must be used to make somewhat novel predictions, they might diverge, leading to impairments after inactivation of one area but not the other. For example, the BLA is important for the acquisition of information necessary for Pavlovian reinforcer devaluation (Hatfield et al., 1996; Málková et al., 1997; Machado and Bachevalier, 2007); however, it is sometimes not required for the use of this information after it is acquired and perhaps updated (Pickens et al., 2003; Wellman et al., 2005; but see Johnson et al., 2009). By contrast, the OFC is typically necessary whenever such information must be used, on-the-fly, to generate predictions about future outcomes (Pickens et al., 2003, 2005; Takahashi et al., 2009; West et al., 2011; Jones et al., 2012).
With this background, the current experiment was designed simply to examine neural correlates of this generative process in the BLA and to ask whether any such correlates might be dependent on input, either direct or indirect, from the OFC. Our hypothesis had two main predictions. The first and strongest prediction was simply that single-unit activity in the BLA would not reflect the novel reward estimates required for learning during compound training. This outcome would be reminiscent of the distinction between early learning theories (Bush and Mosteller, 1951a,b), in which predictions are elemental and tied to specific cues, and more modern theories (Rescorla and Wagner, 1972; Sutton and Barto, 1981), in which predictions are explicitly summed across all available cues. Clearly neural systems can and likely do employ both strategies. One explanation of the behavioral results would be if this theoretical dichotomy were reflected in the neural processing of these two areas.
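The distinction between elemental and summed predictions can be made concrete with a toy simulation. The sketch below is an illustration of the two update rules only, not a model fitted to the present data. The learning rate `alpha`, the trial counts, the use of pellets as the unit of associative strength, and the function name `train` are all assumptions chosen for clarity.

```python
def train(trials, alpha=0.1, summed=True, strengths=None):
    """Update associative strengths over a list of (cues, reward) trials.

    summed=True  : error uses the prediction summed across all cues
                   present (Rescorla-Wagner style).
    summed=False : each cue is compared with the reward on its own
                   (elemental, Bush-Mosteller style).
    Returns a new dict of strengths; the input dict is not modified.
    """
    v = dict(strengths or {})
    for cues, reward in trials:
        total = sum(v.get(c, 0.0) for c in cues)  # computed before updating
        for c in cues:
            prediction = total if summed else v.get(c, 0.0)
            v[c] = v.get(c, 0.0) + alpha * (reward - prediction)
    return v


REWARD = 3.0  # three pellets, as in the task

# Initial conditioning: A1, V, and A2 each predict three pellets.
conditioning = [(("A1",), REWARD), (("V",), REWARD), (("A2",), REWARD)] * 100
after_cond = train(conditioning)

# Compound training: A1 and V together, still three pellets.
compound = [(("A1", "V"), REWARD), (("A2",), REWARD)] * 50
after_summed = train(compound, summed=True, strengths=after_cond)
after_elemental = train(compound, summed=False, strengths=after_cond)
```

Under the summed rule, the compound initially predicts about six pellets, the resulting negative prediction error drives the strength of A1 down toward half of the delivered reward, and the control cue A2 is unchanged. Under the elemental rule, compounding generates no error and A1 keeps its value, so no over-expectation effect would be expected.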
The second, weaker alternative prediction was that activity in BLA might reflect these novel estimates but that they would be entirely dependent upon the OFC. This outcome would indicate that this information is present and available in the BLA, but that its availability depends on OFC function. This would be consistent with a model in which BLA acquires and perhaps represents independently the historical, retrospective information, while requiring feedback from OFC to signal the integrated predictions.
Our results are in accord with the latter prediction. Single units in the BLA developed responses to the cues with training, and the firing of these neurons increased with compounding (and decreased with uncompounding) of cues in a way that seemed to directly track the inferred reward predictions. Indeed, even though we have found that learning does not depend on BLA in this setting (Haney et al., 2010), the increase in firing to the compound cue predicted subsequent evidence of learning. In each case, the neural correlates were similar to what we have observed in OFC (Takahashi et al., 2013).
However, all of these neural features depended entirely on the OFC because they were abolished by ipsilateral OFC lesions. Single units recorded in lesioned rats did not change firing when the cues were compounded or uncompounded, nor did their firing relate to behavior. This is not because the rats did not learn. Lesions were unilateral, and the rats showed entirely normal behavior both during compound training and during extinction. The effects of ipsilateral OFC lesions were also restricted to the neural phenomenon, integration of the predictions, thought to be OFC-dependent. Lesioned rats exhibited similar levels of cue-evoked firing during conditioning and showed the same change in this cue-evoked activity with learning. Thus, acquisition of the historical, retrospective associations by BLA neurons was not appreciably affected by the OFC lesions. It was only in the ability of the BLA to represent (and learn from) the novel predictions that the effect was evident. Interestingly, this is somewhat similar to what we have observed previously in BLA of OFC-lesioned rats learning odor discriminations (Saddoris et al., 2005). In that study, we found that the representations established to the predictive cues with learning were relatively unaffected by OFC lesions; however, the outcome-expectant firing that occurred after the rat had responded and was waiting for reward was much weaker in the rats lacking an OFC. To the extent that this firing reflects a prediction about the reward that is generated on-the-fly based on the prior learning, its sensitivity to OFC damage would parallel our current results.
Yet, if the BLA has information about the novel prediction during over-expectation, then why is it not necessary for learning in this context? The obvious answer is that information can be represented in an area, even though that area is not strictly necessary for any one task that requires that information. In the case of Pavlovian over-expectation, the information is likely present across an array of structures, starting perhaps with prefrontal areas such as the OFC. Indeed, the OFC itself may drive learning by presenting this information to downstream areas, such as the midbrain dopamine neurons, without the involvement of the amygdala (Schoenbaum et al., 2009; Takahashi et al., 2009, 2011). This is not to minimize the importance of the presence of the information within BLA; its presence likely has very important implications for what the BLA is able to monitor and adjust in more complex behavioral settings. For example, the BLA is not required for using information to support Pavlovian reinforcer devaluation induced by illness when only a single outcome is at stake (Pickens et al., 2003). However, there is evidence that the BLA becomes more important when multiple outcomes are used and/or outcome value is manipulated by selective satiation (Johnson et al., 2009). Moreover, it has been shown that gustatory areas just caudal to the OFC exhibit BLA-dependent anticipatory encoding of outcomes (Samuelsen et al., 2012). The role of OFC-dependent outcome expectancies in BLA might be more apparent in updating representations downstream in tasks that emphasize tracking multiple outcomes and more complex forms of learning. For example, if the compounded cues predict different outcomes, rats may be able to appropriately assign credit for the one that is omitted in compound training. Such an ability might require the BLA to receive updated outcome predictions from OFC in order to generate more specific teaching signals.
In contrast, we have previously shown that inactivation of the central nucleus of the amygdala (CeN) disrupts learning in response to over-expectation (Haney et al., 2010). Thus, the over-expectation paradigm provides a behavioral setting in which BLA and CeN may function in parallel rather than in series. One possibility is that the role of CeN in over-expectation reflects the involvement of this area in supporting attentional function (Holland and Gallagher, 1993; Maddux et al., 2007; Calu et al., 2010) rather than any selective role in representing certain types of associative information. Additional studies are required to evaluate these possible explanations.
In conclusion, the current data support the dichotomy between BLA and the OFC in representing associative information acquired through past, direct experience versus manipulating that same information to derive novel estimates to guide future behavior and learning. They do not, however, support a strict division of labor, such as that between formal learning theory accounts. Instead, perhaps not surprisingly, they show that, in an intact brain, information is represented across multiple components of the circuit. Its precise role in any one area will be governed by that area's function and also by what other parts of the circuit are capable of supporting in isolation.
Footnotes
This work was supported by funding from NIDA (G.S.). The opinions expressed in this article are the authors' own and do not reflect the view of the National Institutes of Health, the Department of Health and Human Services, or the United States government.
The authors declare no competing financial interests.
References
- Balleine BW, Killcross AS, Dickinson A. The effect of lesions of the basolateral amygdala on instrumental conditioning. J Neurosci. 2003;23:666–675. doi: 10.1523/JNEUROSCI.23-02-00666.2003.
- Baxter MG, Parker A, Lindner CC, Izquierdo AD, Murray EA. Control of response selection by reinforcer value requires interaction of amygdala and orbitofrontal cortex. J Neurosci. 2000;20:4311–4319. doi: 10.1523/JNEUROSCI.20-11-04311.2000.
- Blundell P, Hall G, Killcross S. Lesions of the basolateral amygdala disrupt selective aspects of reinforcer representation in rats. J Neurosci. 2001;21:9018–9026. doi: 10.1523/JNEUROSCI.21-22-09018.2001.
- Burke KA, Takahashi YK, Correll J, Brown PL, Schoenbaum G. Orbitofrontal inactivation impairs reversal of Pavlovian learning by interfering with “disinhibition” of responding for previously unrewarded cues. Eur J Neurosci. 2009;30:1941–1946. doi: 10.1111/j.1460-9568.2009.06992.x.
- Bush RR, Mosteller F. A model for stimulus generalization and discrimination. Psychol Rev. 1951a;58:413–423. doi: 10.1037/h0054576.
- Bush RR, Mosteller F. A mathematical model for simple learning. Psychol Rev. 1951b;58:313–323. doi: 10.1037/h0054388.
- Calu DJ, Roesch MR, Haney RZ, Holland PC, Schoenbaum G. Neural correlates of variations in event processing during learning in central nucleus of amygdala. Neuron. 2010;68:991–1001. doi: 10.1016/j.neuron.2010.11.019.
- Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
- Gore F, Schwartz EC, Brangers BC, Aladi S, Stujenske JM, Likhtik E, Russo MJ, Gordon JA, Salzman CD, Axel R. Neural representations of unconditioned stimuli in basolateral amygdala mediate innate and learned responses. Cell. 2015;162:134–145. doi: 10.1016/j.cell.2015.06.027.
- Hampton AN, Adolphs R, Tyszka MJ, O'Doherty JP. Contributions of the amygdala to reward expectancy and choice signals in human prefrontal cortex. Neuron. 2007;55:545–555. doi: 10.1016/j.neuron.2007.07.022.
- Haney RZ, Calu DJ, Takahashi YK, Hughes BW, Schoenbaum G. Inactivation of the central but not the basolateral nucleus of the amygdala disrupts learning in response to over-expectation of reward. J Neurosci. 2010;30:2911–2917. doi: 10.1523/JNEUROSCI.0054-10.2010.
- Hatfield T, Han JS, Conley M, Gallagher M, Holland P. Neurotoxic lesions of basolateral, but not central, amygdala interfere with Pavlovian second-order conditioning and reinforcer devaluation effects. J Neurosci. 1996;16:5256–5265. doi: 10.1523/JNEUROSCI.16-16-05256.1996.
- Holland PC, Gallagher M. Effects of amygdala central nucleus lesions on blocking and unblocking. Behav Neurosci. 1993;107:235–245. doi: 10.1037/0735-7044.107.2.235.
- Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci. 2004;24:7540–7548. doi: 10.1523/JNEUROSCI.1921-04.2004.
- Johnson AW, Gallagher M, Holland PC. The basolateral amygdala is critical to the expression of Pavlovian and instrumental outcome-specific reinforcer devaluation effects. J Neurosci. 2009;29:696–704. doi: 10.1523/JNEUROSCI.3758-08.2009.
- Jones B, Mishkin M. Limbic lesions and the problem of stimulus-reinforcement associations. Exp Neurol. 1972;36:362–377. doi: 10.1016/0014-4886(72)90030-1.
- Jones JL, Esber GR, McDannald MA, Gruber AJ, Hernandez A, Mirenzi A, Schoenbaum G. Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science. 2012;338:953–956. doi: 10.1126/science.1227489.
- Lucantonio F, Takahashi YK, Hoffman AF, Chang CY, Bali-Chaudhary S, Shaham Y, Lupica CR, Schoenbaum G. Orbitofrontal activation restores insight lost after cocaine use. Nat Neurosci. 2014;17:1092–1099. doi: 10.1038/nn.3763.
- Machado CJ, Bachevalier J. The effects of selective amygdala, orbital frontal cortex or hippocampal formation lesions on reward assessment in nonhuman primates. Eur J Neurosci. 2007;25:2885–2904. doi: 10.1111/j.1460-9568.2007.05525.x.
- Maddux JM, Kerfoot EC, Chatterjee S, Holland PC. Dissociation of attention in learning and action: effects of lesions of the amygdala central nucleus, medial prefrontal cortex, and posterior parietal cortex. Behav Neurosci. 2007;121:63–79. doi: 10.1037/0735-7044.121.1.63.
- Málková L, Gaffan D, Murray EA. Excitotoxic lesions of the amygdala fail to produce impairment in visual learning for auditory secondary reinforcement but interfere with reinforcer devaluation effects in rhesus monkeys. J Neurosci. 1997;17:6011–6020. doi: 10.1523/JNEUROSCI.17-15-06011.1997.
- McDannald MA, Lucantonio F, Burke KA, Niv Y, Schoenbaum G. Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning. J Neurosci. 2011;31:2700–2705. doi: 10.1523/JNEUROSCI.5499-10.2011.
- Ostlund SB, Balleine BW. Differential involvement of the basolateral amygdala and mediodorsal thalamus in instrumental action selection. J Neurosci. 2008;28:4398–4405. doi: 10.1523/JNEUROSCI.5472-07.2008.
- Paxinos G, Watson C. The rat brain in stereotaxic coordinates. Ed 4. San Diego: Academic; 1998.
- Pickens CL, Setlow B, Saddoris MP, Gallagher M, Holland PC, Schoenbaum G. Different roles for orbitofrontal cortex and basolateral amygdala in a reinforcer devaluation task. J Neurosci. 2003;23:11078–11084. doi: 10.1523/JNEUROSCI.23-35-11078.2003.
- Pickens CL, Saddoris MP, Gallagher M, Holland PC. Orbitofrontal lesions impair use of cue-outcome associations in a devaluation task. Behav Neurosci. 2005;119:317–322. doi: 10.1037/0735-7044.119.1.317.
- Rescorla RA. Reduction in the effectiveness of reinforcement after prior excitatory conditioning. Learn Motiv. 1970;1:372–381. doi: 10.1016/0023-9690(70)90101-3.
- Rescorla RA. Spontaneous recovery from overexpectation. Learn Behav. 2006;34:13–20. doi: 10.3758/BF03192867.
- Rescorla RA. Renewal from overexpectation. Learn Behav. 2007;35:19–26. doi: 10.3758/BF03196070.
- Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical conditioning II: current research and theory. New York: Appleton-Century-Crofts; 1972. pp. 64–99.
- Rudebeck PH, Mitz AR, Chacko RV, Murray EA. Effects of amygdala lesions on reward-value coding in orbital and medial prefrontal cortex. Neuron. 2013a;80:1519–1531. doi: 10.1016/j.neuron.2013.09.036.
- Rudebeck PH, Saunders RC, Prescott AT, Chau LS, Murray EA. Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nat Neurosci. 2013b;16:1140–1145. doi: 10.1038/nn.3440.
- Saddoris MP, Gallagher M, Schoenbaum G. Rapid associative encoding in basolateral amygdala depends on connections with orbitofrontal cortex. Neuron. 2005;46:321–331. doi: 10.1016/j.neuron.2005.02.018.
- Samuelsen CL, Gardner MP, Fontanini A. Effects of cue-triggered expectation on cortical processing of taste. Neuron. 2012;74:410–422. doi: 10.1016/j.neuron.2012.02.031.
- Schoenbaum G, Setlow B, Saddoris MP, Gallagher M. Encoding predicted outcome and acquired value in orbitofrontal cortex during cue sampling depends upon input from basolateral amygdala. Neuron. 2003;39:855–867. doi: 10.1016/S0896-6273(03)00474-4.
- Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci. 2009;10:885–892. doi: 10.1038/nrn2753.
- Sutton RS, Barto AG. Toward a modern theory of adaptive networks: expectation and prediction. Psychol Rev. 1981;88:135–170. doi: 10.1037/0033-295X.88.2.135.
- Takahashi YK, Roesch MR, Stalnaker TA, Haney RZ, Calu DJ, Taylor AR, Burke KA, Schoenbaum G. The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron. 2009;62:269–280. doi: 10.1016/j.neuron.2009.03.005.
- Takahashi YK, Roesch MR, Wilson RC, Toreson K, O'Donnell P, Niv Y, Schoenbaum G. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat Neurosci. 2011;14:1590–1597. doi: 10.1038/nn.2957.
- Takahashi YK, Chang CY, Lucantonio F, Haney RZ, Berg BA, Yau HJ, Bonci A, Schoenbaum G. Neural estimates of imagined outcomes in the orbitofrontal cortex drive behavior and learning. Neuron. 2013;80:507–518. doi: 10.1016/j.neuron.2013.08.008.
- Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF. Separable learning systems in the macaque brain and the role of the orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027.
- Wellman LL, Gale K, Malkova L. GABAA-mediated inhibition of basolateral amygdala blocks reward devaluation in macaques. J Neurosci. 2005;25:4577–4586. doi: 10.1523/JNEUROSCI.2257-04.2005.
- West EA, DesJardin JT, Gale K, Malkova L. Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. J Neurosci. 2011;31:15128–15135. doi: 10.1523/JNEUROSCI.3295-11.2011.
- Zeeb FD, Winstanley CA. Functional disconnection of the orbitofrontal cortex and basolateral amygdala impairs acquisition of a rat gambling task and disrupts animals' ability to alter decision-making behavior after reinforcer devaluation. J Neurosci. 2013;33:6434–6443. doi: 10.1523/JNEUROSCI.3971-12.2013.
- Zhang W, Schneider DM, Belova MA, Morrison SE, Paton JJ, Salzman CD. Functional circuits and anatomical distribution of response properties in the primate amygdala. J Neurosci. 2013;33:722–733. doi: 10.1523/JNEUROSCI.2970-12.2013.