Abstract
To make adaptive decisions, we build an internal model of the associative relationships in an environment and use it to make predictions and inferences about specific available outcomes. Detailed, identity-specific cue-reward memories are a core feature of such cognitive maps. Here we used fiber photometry, cell-type and pathway-specific optogenetic manipulation, Pavlovian cue-reward conditioning, and decision-making tests in male and female rats, to reveal that ventral tegmental area dopamine (VTADA) projections to the basolateral amygdala (BLA) drive the encoding of identity-specific cue-reward memories. Dopamine is released in the BLA during cue-reward pairing; VTADA→BLA activity is necessary and sufficient to link the identifying features of a reward to a predictive cue, but does not assign general incentive properties to the cue or mediate reinforcement. These data reveal a dopaminergic pathway for the learning that supports adaptive decision making and help explain how VTADA neurons achieve their emerging multifaceted role in learning.
Keywords: learning, memory, decision making, Pavlovian conditioning, Pavlovian-to-instrumental transfer, ventral tegmental area, basolateral amygdala, appetitive, model-based learning
Dopamine has long been known to contribute to learning. Midbrain dopamine neurons can signal errors in reward prediction1–3. These learning signals have canonically been interpreted to cache the general value of a reward to its predictor and reinforce response policies that rely on past success, rather than forethought of specific outcomes4–6. But adaptive decision making often requires such forethought. For example, if you see both pizza and donut boxes outside the seminar room, assuming you like both, you need to use these cues to represent the identity of the predicted foods in order to make the snack choice that is optimal in your current circumstances (e.g., are you craving something sweet or savory, or have you just had donuts for breakfast?). So, to ensure flexible behavior, humans and other animals do not just learn the general value of predictive events, but also encode the relationships between these cues and the identifying features of their associated outcomes7, 8. Such identity-specific cue-reward memories are fundamental components of the internal model of environmental relationships, aka cognitive map9, we use to generate the predictions and inferences needed for many forms of flexible, advantageous decision making. Little is known of how we form these cue-reward memories. But recent evidence suggests dopamine might actually contribute10–17. New data have challenged the value-centric dogma of dopamine function, indicating it plays a much broader role in learning than originally thought18–21. How dopamine contributes to identity-specific cue-reward learning is unknown, yet critical for understanding dopamine’s emerging multifaceted function in learning.
One candidate pathway through which dopamine might contribute to cue-reward learning is the VTA dopamine (VTADA) projection to the basolateral amygdala (BLA)22, 23. This pathway has received much less attention than the more popular VTADA projections to nucleus accumbens and prefrontal cortex, so little is known of its function. The BLA itself was recently shown to be crucial for forming detailed, identity-specific, cue-reward memories24. Therefore, here we combined a systems neuroscience toolkit with Pavlovian cue-reward conditioning and tests of the nature of learning and its influence on decision making to evaluate VTADA→BLA pathway function in linking the unique features of rewarding events to predictive cues, i.e., encoding the identity-specific reward memories that support adaptive decision making.
RESULTS
Dopamine is released in the BLA during cue-reward learning.
We first asked whether and when dopamine is released in the BLA during the encoding of identity-specific cue-reward memories. We used fiber photometry to record fluorescent activity of the G-protein-coupled receptor-activation-based dopamine sensor (GRABDA) in the BLA of male and female rats during Pavlovian conditioning (Figure 1a–c). Rats were food deprived and received 8 sessions of Pavlovian long-delay conditioning during which 2 distinct auditory cues (aka, conditioned stimuli) each predicted a unique food reward (e.g., white noise—sucrose/click—pellets). During each session, each cue was presented 8 times (variable 2.5-min mean intertrial interval, ITI) for 30 s and terminated in the delivery of its associated reward (Figure 1c). This conditioning has been shown to engender the encoding of identity-specific cue-reward memories as evidenced by the ability of the cues to subsequently promote instrumental choice of the specific predicted reward25 and sensitivity of the conditional goal-approach response to devaluation of the predicted reward26. Across training, rats developed a Pavlovian conditional goal-approach response (Figure 1d). Like BLA neuronal responses (Extended Data 1), dopamine was released in the BLA at both cue onset and offset/reward delivery across training (Figure 1e–f; full statistical reporting provided in Supplemental Table 1; see also Extended Data 2 for data from each of the 8 training sessions, and Extended Data 3 for data aligned to reward collection). Thus, dopamine is released in the BLA in response to both cues and rewards, as well as their pairing during Pavlovian conditioning.
Figure 1. Dopamine is released in the BLA during cue-reward learning.

(a-f) Fiber photometry recording of BLA dopamine release during Pavlovian long-delay conditioning. (a) Top: Representative fluorescent image of BLA GRABDA2h expression and fiber placement. Bottom: Fiber photometry approach for imaging GRABDA fluorescence changes in BLA neurons. (b) Schematic representation of BLA GRABDA2h expression and optical fiber tips for all subjects. Brain slides from56. (c) Pavlovian long-delay conditioning procedure. CS, 30-s conditioned stimulus (aka, “cue”, white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (d) Food-port entry rate during Pavlovian conditioning. Two-way RM ANOVA, Training × Cue: F(2.77, 22.15) = 14.69, P < 0.0001. (e) Area under the BLA GRABDA Z-scored curve (AUC). Two-way RM ANOVA, Event: F(1.83, 14.61) = 7.63, P = 0.006. (f) GRABDA fluorescence changes (Z-score) across Pavlovian conditioning. Ticks represent time of reward collection for each subject. Data from the last six sessions were averaged across 2-session bins (3/4, 5/6, and 7/8). N = 9, 5 male rats. (g-l) Fiber photometry recording of BLA dopamine release during Pavlovian trace conditioning. (g) Top: Representative fluorescent image of BLA GRABDA expression and fiber placement. Bottom: Fiber photometry approach. (h) Schematic representation of GRABDA expression and placement of optical fiber tips in BLA for all subjects. (i) Pavlovian trace conditioning procedure. CS, 10-s conditioned stimulus (white noise or click) followed by a 1.5-s trace interval before reward outcome (O, chocolate or unflavored purified pellets). (j) Percentage of time spent in the food-port during Pavlovian conditioning. Two-way RM ANOVA, Training × Cue: F(3.21, 28.92) = 7.77, P = 0.0005. (k) BLA GRABDA Z-scored AUC across Pavlovian conditioning. Two-way RM ANOVA, Training × Event: F(2.85, 25.69) = 3.72, P = 0.03. (l) GRABDA fluorescence changes across Pavlovian conditioning. N = 10 rats (GRABDA2h: N = 4, 3 male; GRABDA2m: N = 6, 3 male). 
Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. See Supplemental Table 1 for full statistical reporting.
To further reveal how BLA dopamine relates to cue-reward learning, we recorded GRABDA in the BLA during Pavlovian trace conditioning (Figure 1g–j). A new group of food-deprived rats received 8 sessions of Pavlovian trace conditioning during which 2 distinct auditory cues each predicted a unique food reward after a brief delay (e.g., white noise—chocolate pellets/click—unflavored pellets; Figure 1i–j). During each session, each cue was presented 8 times (variable 2.5-min ITI) for 10 s and its associated reward was delivered 1.5 s after cue offset. The trace interval temporally separated reward delivery from cue offset, allowing us to resolve dopamine signals to these discrete events. We used a shorter cue-reward interval to better enable subjects to predict reward delivery. This conditioning has also been shown to engender the encoding of identity-specific cue-reward memories as evidenced by sensitivity of the conditional goal-approach response to devaluation of the predicted reward27. Again, we found that BLA dopamine is associated with cue-reward learning (Figure 1k–l). Dopamine was initially released in response to reward delivery. Thus, across types of cue-reward learning, dopamine is released in the BLA during cue-reward pairing, the critical window for encoding the cue-reward association. In this task, we found that reward-evoked BLA dopamine attenuated with training. Conversely, cue-evoked BLA dopamine release was initially small and grew with training. Indeed, the slope of the BLA dopamine reward response across training was negative (β = -0.13, confidence interval -0.37 to 0.10) and significantly different (F(1,96) = 9.09, P = 0.003) from the slope of the BLA dopamine cue-onset response across training, which was positive (β = 0.25, confidence interval 0.15 to 0.36). Thus, unpredicted rewards trigger dopamine release in the BLA and this response backpropagates to reward predictors with training.
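To illustrate the slope comparison above, the following minimal Python sketch estimates how a response measure changes across training sessions by ordinary least squares. It is a sketch under stated assumptions, not the analysis used here: the function name `training_slope` and the per-session values are hypothetical, and neither the confidence intervals nor the test of the difference between slopes is reproduced.

```python
def training_slope(session_means):
    """Ordinary least-squares slope of a response measure across sessions.

    Sessions are numbered 1..n. Assumes one summary value per session
    (e.g., a mean AUC); this is an illustrative choice, not the study's model.
    """
    y = [float(v) for v in session_means]
    n = len(y)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a slope")
    x = list(range(1, n + 1))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


# Hypothetical per-session values, for illustration only (not data from the study).
reward_auc = [2.0, 1.8, 1.7, 1.5, 1.4, 1.2, 1.1, 0.9]  # response that attenuates
cue_auc = [0.2, 0.5, 0.7, 1.0, 1.2, 1.5, 1.7, 2.0]     # response that grows

reward_slope = training_slope(reward_auc)
cue_slope = training_slope(cue_auc)
```

With values of this shape, a negative `reward_slope` alongside a positive `cue_slope` corresponds to the pattern described above, in which the reward response declines as the cue response grows.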
After training, we detected BLA dopamine responses to unpredicted reward delivery, which were graded by reward magnitude (Extended Data 4a–b). Consistent with a prior report28, we also detected BLA dopamine responses to a mildly aversive event (unpredicted puffs of air to the face; Extended Data 4c–d). Thus, dopamine is released in the BLA during salient appetitive and aversive events and during multiple forms of cue-reward learning.
VTADA→BLA projection activity mediates cue-reward learning.
Having found that dopamine is released in the BLA during cue-reward pairing, we next asked whether this mediates the encoding of identity-specific cue-reward memories (Figure 2a–d). We cre-dependently expressed the inhibitory opsin archaerhodopsin T (ArchT) or tdTomato control bilaterally in VTADA neurons of male and female tyrosine hydroxylase (Th)-cre rats (Figure 2a–b) and implanted optical fibers bilaterally over BLA (Figure 2c) to allow us to, in ArchT-expressing subjects, transiently inactivate VTADA axons and terminals in the BLA. Rats first received instrumental conditioning, without manipulation, in which each of two different lever-press actions earned one of two distinct food rewards (e.g., left press→sucrose/right press→pellets; 11 sessions; Figure 2e). Rats then received Pavlovian long-delay conditioning, during which each of 2 distinct, 30-s, auditory cues predicted the immediate delivery of one of the food rewards at cue offset (e.g., white noise—sucrose/click—pellets; 8 of each cue/session; variable 2.5-min mean ITI; 8 sessions). VTADA→BLA projections were optically inhibited (532 nm, 10 mW, 3 s) coincident with each reward delivery during each Pavlovian conditioning session. We restricted optical inhibition to reward delivery because this is the time at which the cue-reward pairing occurs and when we detected robust dopamine release in the BLA. Optical inhibition of VTADA→BLA projections did not disrupt outcome collection (Extended Data 5a). It also did not impede the development of a Pavlovian conditional goal-approach response (Figure 2f), even if we inhibited throughout the entire cue and reward period (Extended Data 6). Thus, VTADA→BLA projections are not required to reinforce an appetitive conditional goal-approach response.
Figure 2. Optical inhibition of VTADA→BLA projections during cue-reward pairing attenuates the encoding of identity-specific cue-reward memories.

(a-i) Optical inhibition of VTADA→BLA projections during Pavlovian long-delay conditioning with outcome-specific Pavlovian-to-instrumental transfer test. (a) Bottom: Representative fluorescent image of ArchT-tdTomato expression in VTADA neurons. Middle: Strategy for bilateral optogenetic inhibition of VTADA→BLA projections. Top: Representative image of fiber placement in the vicinity of immunofluorescent ArchT-tdTomato-expressing VTADA axons and terminals in BLA. (b) Schematic representation of ArchT-tdTomato expression in VTA and (c) placement of optical fiber tips in BLA for all subjects. (d) Pavlovian long-delay conditioning and Pavlovian-to-instrumental transfer procedure. A, action (left or right lever press); CS, 30-s conditioned stimulus (aka, “cue”, white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (e) Lever-press rate averaged across levers and across the final 2 instrumental sessions. (f) Food-port entry rate across Pavlovian conditioning. Three-way RM ANOVA, Training × Cue: F(4.09, 77.71) = 5.73, P = 0.0004. (g-i) Outcome-specific Pavlovian-to-instrumental transfer test. (g) Lever-press rates on the lever that earned the “Same” outcome as predicted by the forthcoming or current cue or on the other available lever (Different). *P < 0.05, planned comparisons cue same presses v. preCue same presses and cue different presses v. preCue different presses. (h) Elevation in pressing [(Presses during cue)/(Presses during cue + preCue presses)]. Two-way RM ANOVA, Virus × Lever: F(1, 19) = 9.22, P = 0.007. (i) Food-port entry rate. Two-way RM ANOVA, Cue: F(1, 19) = 15.18, P = 0.001. ArchT, N = 11, 6 male rats; tdTomato, N = 10, 5 male rats. (j-q) Optical inhibition of VTADA→BLA projections during Pavlovian trace conditioning with outcome-specific devaluation test. (j) Bottom: Representative fluorescent image of ArchT-tdTomato expression in VTADA neurons. 
Middle: Strategy for bilateral optogenetic inhibition of VTADA→BLA projections. Top: Representative image of fiber placement in the vicinity of immunofluorescent ArchT-tdTomato-expressing VTADA axons and terminals in BLA. (k) Schematic representation of ArchT-tdTomato expression in VTA and (l) placement of optical fiber tips in BLA for all subjects. (m) Pavlovian trace conditioning and outcome-specific devaluation procedure. CS, 10-s conditioned stimulus (white noise or tone) followed by a 1.5-s trace interval before reward outcome (O, chocolate or unflavored purified pellets); LiCl, lithium chloride 0.3M, 1.5% volume/weight. (n) Percentage of time in the food-delivery port during Pavlovian conditioning. Three-way RM ANOVA, Training × Cue: F(1.61, 16.13) = 31.49, P < 0.0001. (o-p) Outcome-specific devaluation probe test. (o) Percentage of time in the food port during baseline, cue signaling the devalued reward and cue signaling the non-devalued (valued) reward. *P<0.05, **P<0.01, planned comparisons cue valued % time in port v. preCue % time in port and cue devalued % time in port v. preCue % time in port. (p) Elevation in percent time in food port [(CS % time in port)/(CS % time in port + preCue % time in port)]. Two-way RM ANOVA, Virus × Cue: F(1, 10) = 5.20, P = 0.046. (q) Amount out of 100 available pellets consumed during post-test consumption choice. Two-way RM ANOVA, Value: F(1, 10) = 249.00, P < 0.0001. ArchT, N = 5, 4 male rats; Control, N = 7, 4 male rats (3 WT/cre-dependent ArchT; 4 Th-cre/cre-dependent tdTomato). Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. ^P = 0.059, *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. See Supplemental Table 1 for full statistical reporting.
Conditional approach to the shared goal location does not require subjects to have learned the identifying details of the predicted rewards. So, to ask whether VTADA→BLA projections are needed for encoding identity-specific cue-reward memories, we next gave subjects an outcome-specific Pavlovian-to-instrumental transfer (PIT) test. During this test both levers were present, but lever pressing was not reinforced. Each cue was presented 4 times (also without accompanying reward), with intervening cue-free baseline periods (fixed 2.5-min ITI), to assess its influence on action performance and selection in the novel choice scenario. Because the cues are never directly associated with the instrumental actions, this test assesses the ability to use the cues to retrieve a representation of the specific predicted reward to motivate choice of the action known to earn that same unique outcome25, 29. No manipulation was given on test. If subjects had encoded identity-specific cue-reward memories, then cue presentation should cause them to selectively increase presses on the lever that, during training, earned the same specific reward as predicted by that cue. Controls showed this outcome-specific PIT effect, increasing presses during the cues on the lever associated with the same predicted reward, but not on the lever associated with the different reward. Conversely, the cues were not capable of selectively guiding lever-press choice in the group for which VTADA→BLA projections had been inhibited during Pavlovian conditioning (Figure 2g–h). Rather, for these subjects, the cues increased pressing on both levers, significantly so on the different lever. Thus, inhibition of VTADA→BLA projections during learning disrupts the encoding of identity-specific cue-reward memories. The general cue-induced increase in pressing in these subjects during PIT suggests the general incentive properties of the cues were intact. 
Indeed, inhibition of VTADA→BLA projections throughout the entire cue and reward period during learning prevented encoding of identity-specific cue-reward memories, but still did not disrupt the ability of cues to indiscriminately motivate instrumental action (Extended Data 6). As in training, during the PIT test the conditional goal-approach response was similar between groups (Figure 2i; see also Extended Data 6 for similar results with longer duration inhibition). Thus, VTADA→BLA projections are active at the time of cue-reward pairing and this activity is needed to link the identifying details of the reward to the predictive cue, but not to reinforce a conditional response or to assign general incentive properties to the cue to support general motivation.
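The elevation ratio defined in the Figure 2 legend, (Presses during cue)/(Presses during cue + preCue presses), can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used here: the function name `elevation_ratio` is hypothetical, and returning NaN when a subject makes no presses in either period is an assumption, since the handling of that case is not specified.

```python
import math


def elevation_ratio(cue_presses, precue_presses):
    """Elevation in pressing: cue / (cue + preCue).

    0.5 means the cue did not change responding relative to the preCue
    baseline; values above 0.5 mean responding increased during the cue.
    """
    total = cue_presses + precue_presses
    if total == 0:
        # Assumption: the ratio is undefined when no presses occurred.
        return math.nan
    return cue_presses / total


# Hypothetical press rates for one subject (illustrative only).
same_lever = elevation_ratio(cue_presses=15.0, precue_presses=5.0)
different_lever = elevation_ratio(cue_presses=6.0, precue_presses=6.0)
```

In this made-up example the ratio is 0.75 for the Same lever and 0.5 for the Different lever, the selective pattern expected when the cue biases choice toward the action that earns the predicted reward. Comparable ratios above 0.5 on both levers would instead correspond to the non-specific increase in pressing described above.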
To provide converging evidence that VTADA→BLA projections support the encoding of identity-specific cue-reward memories, we next asked whether this pathway mediates the cue-reward learning that enables subjects to adapt their conditional responses following devaluation of the predicted reward (Figure 2j–m). We used the Pavlovian trace conditioning task to further generalize VTADA→BLA function across different types of cue-reward learning. During each conditioning session, each of 2 distinct 10-s auditory cues was presented 8 times and its associated reward was delivered 1.5 s after cue offset (e.g., white noise—chocolate pellets/pulsed tone—unflavored pellets; 5 sessions; Figure 2m). VTADA→BLA projections were optically inhibited (532 nm, 10 mW, 3 s) with each reward delivery to attenuate the BLA dopamine reward response (Figure 1l) during learning. Again, optical inhibition of VTADA→BLA projections did not disrupt reward collection (Extended Data 5b) or impede the development of a Pavlovian conditional goal-approach response (Figure 2n), confirming VTADA→BLA projections are not required to reinforce an appetitive Pavlovian response.
To ask whether VTADA→BLA projections are needed to encode identity-specific cue-reward memories, we next devalued one of the food rewards by pairing it with lithium chloride (LiCl; 0.3M, 1.5% volume/weight; 8 sessions) in the absence of the associated cue. The devaluation was effective. Rats fully rejected the LiCl-paired but not the unpaired food (Figure 2q). During test, each cue was presented 8 times (without accompanying reward), with intervening cue-free baseline periods (variable 2.5-min ITI). No manipulation was given on test. If subjects had encoded identity-specific cue-reward memories, they should use the cues to retrieve a representation of the specific predicted reward and increase entries into the food-port during the cue signaling the non-devalued reward but not during the cue signaling the devalued reward8, 30, 31. Controls showed this outcome-specific devaluation effect. Conversely, the conditional food-port approach response of subjects for which we inhibited VTADA→BLA projections during learning was insensitive to devaluation (Figure 2o–p). Rats continued to enter the food port during the cue, even if the predicted outcome was devalued. Combined, these data indicate that dopamine is released in the BLA during reward experience and this is necessary to link the identifying features of the reward to a predictive cue to enable subsequent cue-induced reward predictions for adaptive decision making.
VTADA→BLA projection activity drives cue-reward learning.
Since VTADA→BLA projection activity during cue-reward pairing mediates the encoding of identity-specific cue-reward memories, we reasoned that activation of these projections might drive the formation of such memories. To test this, we first needed to attenuate the encoding of cue-reward memories to serve as a platform to neurobiologically rescue learning. To achieve this, we took advantage of classic Kamin blocking procedures32 by using a visual cue that already reliably predicts a particular reward to block formation of an association between a novel auditory cue and that specific reward33 (Figure 3a). Male and female rats first received instrumental conditioning in which each of two lever-press actions earned a unique food reward (e.g., left press→sucrose/right press→pellets; 11 sessions; Figure 3b). Subjects then received visual cue Pavlovian conditioning. For subjects in the Blocking group, two distinct 30-s visual cues each terminated in the delivery of a unique food outcome (e.g., house light—sucrose/flashing light—pellet; 16 of each cue/session; 2.5-min mean variable ITI; 12 sessions). Controls received equated conditioning in which a third distinct 30-s visual stimulus predicted both reward types (16 trials alternating lights-sucrose/16 trials alternating lights-pellet). Subjects acquired Pavlovian conditional goal-approach responses to these visual cues (Figure 3c). All subjects then received compound conditioning, during which each of the two visual cues previously conditioned for the Blocking group was presented concurrent with an auditory cue for 30 s terminating in the delivery of one of the distinct food outcomes (e.g., house light + white noise—sucrose/flashing light + click—pellet; 8 of each compound cue/session; 4 sessions). For subjects in the blocking group, each compound cue was paired with the reward previously associated with the visual cue. Thus, the visual component of the compound cue already reliably predicted the outcome. 
However, for controls neither the visual nor auditory component of the compound cue had been previously associated with the outcome. All subjects showed conditional goal-approach responses to the compound cues (Figure 3d). To assess acquisition of the unique, identity-specific auditory cue-reward relationships, rats were given a PIT test during which action selection was evaluated in the presence of the auditory cues. Controls showed evidence that they learned the auditory cue-reward relationships by being able to use the cues to selectively increase presses on the lever associated with the same specific reward (Figure 3e–f). If the previously encoded visual cue-reward memory blocked encoding of the relationship between the auditory cue and identifying features of the reward, then subjects in the blocking group should not be able to use the auditory cues to represent the specific predicted reward and guide their choices towards the action associated with that outcome during the PIT test. This is what we found. Subjects in the blocking group displayed a non-specific increase in pressing across both levers during the PIT test, indicating they were unable to use the auditory cues to represent the specific predicted reward to guide choice and, thus, had not learned the identity-specific auditory cue-reward memories (Figure 3e–f). Despite disrupted PIT performance, expression of conditional goal-approach response was preserved in the blocking group (Figure 3g). Thus, as has been shown previously33, we were able to effectively attenuate the encoding of identity-specific cue-reward memories.
Figure 3. Previously learned cue-reward relationships block encoding of new identity-specific cue-reward memories.

(a) Procedure. A, action (left or right lever press); CS, 30-s conditioned stimulus (aka, “cue”, CSA/B: house light or flashing lights; CSC: alternating lights on either side of the chamber; CS1/CS2: white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (b) Lever-press rate averaged across levers and across the final 2 instrumental conditioning sessions. (c) Food-port entry rate during visual cues Pavlovian conditioning. Three-way RM ANOVA, Training × Cue: F(4.55, 136.40) = 30.77, P < 0.0001. (d) Food-port entry rate during compound conditioning. Three-way RM ANOVA, Cue: F(1,30) = 173.60, P < 0.0001. (e-g) Auditory cue outcome-specific Pavlovian-to-instrumental transfer test. (e) Lever-press rates on the lever that earned the “Same” outcome as predicted by the forthcoming or current cue or on the other available lever (Different). Three-way RM ANOVA, Group × Cue: F(1, 30) = 4.54, P = 0.04. **P < 0.01, ***P<0.001, planned comparisons cue same presses v. preCue same presses and cue different presses v. preCue different presses. (f) Elevation in pressing [(Presses during cue)/(Presses during cue + preCue presses)]. (g) Food-port entry rate. Two-way RM ANOVA, Cue: F(1, 30) = 154.70, P < 0.0001. Blocking, N = 16, 11 male rats; Control, N = 16, 11 male rats. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. See Supplemental Table 1 for full statistical reporting.
Using this blocking procedure, we next asked whether activation of VTADA→BLA projections is sufficient to rescue, or unblock, the encoding of identity-specific cue-reward memories (Figure 4a–d). We cre-dependently expressed the excitatory opsin channelrhodopsin (ChR2) or eYFP control in VTADA neurons of male and female Th-cre rats (Figure 4a–b) and implanted optical fibers bilaterally over BLA (Figure 4c) to allow us to, in ChR2-expressing subjects, transiently stimulate VTADA axons and terminals in the BLA. Rats first received instrumental conditioning, without manipulation, to learn two action-reward relationships (e.g., left press→sucrose/right press→pellets; Figure 4e). They then received visual cue Pavlovian conditioning, also manipulation-free. All subjects received blocking conditions and, thus, during Pavlovian conditioning had two distinct visual cues each paired with a unique food outcome (e.g., house light—sucrose/flashing light—pellet). Both groups developed Pavlovian conditional goal-approach responses to the visual cues (Figure 4f). Rats next received compound conditioning during which each of the visual cues was presented concurrent with an auditory cue for 30 s terminating in the delivery of the same outcome already associated with the visual cue (e.g., house light + white noise—sucrose/flashing light + click—pellet). During each compound conditioning session, VTADA→BLA projections were optically stimulated (473 nm; 20 Hz, 10 mW, 25-ms pulse width, 3 s) during reward delivery, when the cue-reward pairing happens and, thus, learning can occur. VTADA→BLA stimulation had no effect on reward collection (Extended Data 7). It also did not affect goal-approach responses to the compound cue (Figure 4g). To assess the encoding of identity-specific cue-reward memories, we gave rats a PIT test with the auditory cues, without manipulation. We replicated the blocking effect in the eYFP controls. 
For these subjects, the auditory cues were not capable of guiding choice behavior during the PIT test. Stimulation of VTADA→BLA projections during compound training did, however, drive the encoding of identity-specific cue-reward memories. Rats in this group were able to use the auditory cues to know which specific outcome was predicted to selectively increase presses on the lever associated with that same reward (Figure 4h–i). Both groups showed similar goal-approach responses to the cues (Figure 4j), indicating that optical stimulation of VTADA→BLA projections did not augment reinforcement of a general conditional approach response. Similarly, rats did not self-stimulate VTADA→BLA projections, indicating stimulation at this frequency, which reflects the upper endogenous firing rate of dopamine neurons in response to rewarding events3, 34, was not itself reinforcing (Extended Data 8). Thus, activation of VTADA→BLA projections concurrent with reward experience is sufficient to drive the encoding of identity-specific cue-reward memories, but does not promote reinforcement.
Figure 4. Optical stimulation of VTADA→BLA projections during cue-reward pairing unblocks encoding of identity-specific cue-reward memories.

(a) Bottom: Representative fluorescent image of ChR2-eYFP expression in VTADA neurons. Middle: Strategy for bilateral optogenetic stimulation of VTADA→BLA projections. Top: Representative image of fiber placement in the vicinity of immunofluorescent ChR2-eYFP-expressing VTADA axons and terminals in the BLA. (b) Schematic representation of ChR2-eYFP expression in VTA and (c) placement of optical fiber tips in BLA for all subjects. (d) Training procedures. A, action (left or right lever press); CS, 30-s conditioned stimulus (aka, “cue”, CSA/B: house light or flashing lights; CS1/CS2: white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (e) Lever-press rate averaged across levers and across the final 2 instrumental conditioning sessions. (f) Food-port entry rate during visual cue Pavlovian conditioning. Three-way RM ANOVA, Training × Cue: F(4.15, 91.32) = 25.86, P < 0.0001. (g) Food-port entry rate during compound conditioning. Three-way RM ANOVA, Training × Cue period: F(2.28, 50.21) = 9.06, P = 0.0002. (h-k) Auditory cue outcome-specific Pavlovian-to-instrumental transfer test. (h) Test procedure. (i) Lever-press rates on the lever that earned the “Same” outcome as predicted by the forthcoming or current cue or on the other available lever (Different). Three-way RM ANOVA, Virus × Lever × Cue: F(1, 22) = 4.48, P = 0.046. **P < 0.01, planned comparisons cue same presses v. preCue same presses and cue different presses v. preCue different presses. (j) Elevation in pressing [(Presses during cue)/(Presses during cue + preCue presses)]. Two-way RM ANOVA, Virus × Lever: F(1, 22) = 5.72, P = 0.03. (k) Food-port entry rate. Two-way RM ANOVA, Cue: F(1, 22) = 36.10, P < 0.0001. ChR2, N = 11, 6 male rats; eYFP, N = 13, 6 male rats. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. 
See Supplemental Table 1 for full statistical reporting.
DISCUSSION
Here we explored the function of dopamine input to the BLA in cue-reward learning. We found that dopamine is released in the BLA during cue-reward pairing and can backpropagate from reward to predictors with learning. VTADA→BLA projection activity at cue-reward pairing is both necessary and sufficient to drive the encoding of identity-specific, cue-reward memories to enable subsequent adaptive decision making. It does not, however, mediate reinforcement or assign general incentive properties to cues to support non-specific motivation. These data reveal the VTADA→BLA pathway as a critical contributor to the formation of detailed, identity-specific, cue-reward memories, fundamental components of the internal model of environmental relationships, aka cognitive map, that supports flexible decision making.
Dopamine is released in the BLA during cue-reward learning. Across two different forms of cue-reward pairing, we detected robust dopamine responses to reward delivery. Thus, dopamine is released in the BLA when a cue-reward association can be formed. This is also when the BLA is, itself, active24. During long-delay conditioning, cue-reward pairing continued to be accompanied by dopamine release throughout training, perhaps owing to the co-occurrence of cue offset and reward delivery and/or the difficulty of precisely timing reward delivery with a long cue-reward interval. With a shorter cue-reward interval and temporal separation of reward from cue offset, we found that the reward-evoked BLA dopamine response attenuated with learning. Correspondingly, it backpropagated to the reward predictor with training. VTADA neurons are well known to support learning by signaling errors in reward prediction1–3. The backpropagating pattern is consistent with such a signal, as is the finding that dopamine responses to unpredicted reward scaled with reward magnitude. However, a pure reward prediction error signal would dip in response to aversive events35. To the contrary, but consistent with the activity of VTADA→BLA terminals28, we found that an unpredicted aversive event increased BLA dopamine release. Thus, dopamine is released in the BLA in response to both unpredicted rewarding and aversive events and, therefore, at least in bulk, does not signal valence or solely reward prediction error. We also detected small dopamine responses to the cues when they were novel on the first conditioning session. Combined, these findings are consistent with a model in which dopamine integrates salience and reward prediction error36 and with evidence that dopamine can reflect perceived salience to support the attention needed for learning37. Indeed, BLA dopamine can shape attention-related learning signals in the BLA38.
VTADA neurons have recently been implicated in myriad learning-related processes18–21, 37, 39, 40. Thus, further work is needed to reveal the precise processes that BLA dopamine encodes to support learning, including the possibility that BLA dopamine release may be heterogeneous based on individual VTADA→BLA cell activity, microenvironment, and/or type of learning. Critically, dopamine is released in the BLA to cues, rewards, and their pairing, when cues can become linked to the identifying features of the rewards they predict.
VTADA→BLA pathway activity at the time of cue-reward pairing drives the encoding of identity-specific reward memories. Inhibiting this activity attenuated the ability to link the identifying details of the reward to the predictive cue such that subjects were unable to later use that information to inform decision making in a novel situation. This was supported across two different forms of cue-reward learning (long-delay and trace conditioning) and two different types of decision making (instrumental choice and sensitivity to outcome devaluation). Stimulation of VTADA→BLA projections was sufficient to rescue the encoding of identity-specific cue-reward memories when it would otherwise be blocked, such that subjects were later able to use these memories to inform decision making. Although the terminal inhibition indicates the VTADA→BLA pathway regulates identity-specific reward learning, if VTADA→BLA neurons collateralize, the stimulation results could, in part, be due to antidromic activation of collaterals triggering dopamine release elsewhere in the brain. Nonetheless, the data indicate that VTADA→BLA activity is both necessary and sufficient for the formation of identity-specific reward memories. This is consistent with prior evidence that VTADA neurons track learning from unexpected changes in outcome identity41 and can signal the identifying features of a reward42. It also accords with evidence that VTADA neuron activity mediates unblocking driven by changes in outcome identity15 and drives learning about the identifying features of predicted rewards needed for sensitivity of cue responses to outcome devaluation14. The present data indicate VTADA neurons mediate the encoding of identity-specific reward memories and that this is achieved, at least in part, via projections to BLA.
The VTADA→BLA pathway does not mediate reinforcement or assign general incentive properties to cues. The canonical theory of dopamine function is that it provides a teaching signal to cache the general value of future rewarding events to a predictive cue and reinforce response policies based on past success1–5. If VTADA→BLA projections mediate reinforcement, then we should have found their inhibition to disrupt the development of the Pavlovian conditional approach response. To the contrary, conditional responses were preserved following VTADA→BLA inhibition. If VTADA→BLA pathway activity is sufficient to promote reinforcement, then we should have found activation of this pathway to be reinforcing itself or to promote the reinforcement of conditional responses. We found no evidence of this either. Rather, VTADA→BLA projections regulate the cue-reward learning that enables subsequent choices in new situations, absent any prior opportunity for those choices to have been reinforced. Two pieces of evidence indicate that VTADA→BLA projection activity does not cache general incentive properties to cues. First, following inhibition of VTADA→BLA projections during learning, cues were still capable of generally invigorating instrumental activity, akin to general Pavlovian-to-instrumental transfer29. Second, stimulating VTADA→BLA projections during cue-reward pairing did not cause cues to non-discriminately motivate action. These null effects on reinforcement and general motivational value are consistent with evidence that the BLA itself is dispensable for these processes24, 43. Thus, any contribution of dopamine to general value and reinforcement learning is likely via pathways other than those to the BLA. VTADA→BLA projections may be specialized for encoding the identity-specific memories that support adaptive decision making.
By establishing a function for the VTADA→BLA pathway in identity-specific reward memory, these data open new and important questions for future investigation. One is the mechanism through which VTADA→BLA projections contribute to learning. Dopamine is positioned to influence learning via modulation of neuronal plasticity in the BLA. Dopamine can act on GABAergic interneurons to increase spontaneous inhibitory network activity28, 44, 45 and enhance long-term potentiation through suppression of feedforward inhibition46. Like dopamine function in the prefrontal cortex47, this balance could enhance signal-to-noise by filtering out weak inputs to ensure only strong inputs conveying important information are potentiated. Dopamine can also enhance the excitability of BLA projection neurons44 and activation of VTADA→BLA projections can elevate the second messenger cyclic adenosine monophosphate and enhance BLA responses to cues48. Dopamine may gate plasticity in BLA48–50, as it does in striatal circuits51. Separate populations of BLA neurons can encode unique rewards52, 53. VTADA→BLA projections may contribute to identity-specific associative learning by facilitating the formation of these neuronal groups. This is a ripe question for future investigation. Another is the excitatory synapses that dopamine signaling may potentiate. One candidate is lateral orbitofrontal cortex projections to the BLA, which mediate the encoding of identity-specific reward memories24, 54. At least in mice, some VTADA→BLA projections can corelease glutamate to activate BLA interneurons28. That BLA dopamine release coincides with cue-reward pairing suggests dopamine is likely to be involved, but the extent to which glutamate corelease contributes is another important open question.
Findings from this study have important implications for how we conceptualize VTADA function. We found that dopamine release in the BLA contributes to the learning that enables flexible decision making in novel situations. These results contribute to the emerging understanding that VTADA neurons have a multifaceted role in learning18–21, 39. This and other recent work on more canonical dopamine pathways14, 55 indicates dopamine’s multifaceted contribution to learning is likely dictated by the function of downstream target regions. As we further explore the function of distinct dopamine pathways we may reveal core principles of dopamine function, e.g., learning and/or plasticity modulation, but we will most certainly find diversity of function based on projection target.
Here we show that the VTADA→BLA pathway drives the formation of an association between a cue and the unique reward it predicts. Such identity-specific reward memories are fundamental components of the internal model of environmental relationships, aka cognitive map, that enables us to generate the predictions and inferences needed for flexible decision making, including that in novel situations. This core form of memory can support a diverse array of behavioral and decision functions. Thus, VTADA→BLA projections may also support identity-specific social, drug, and/or aversive memories. An inability to properly encode predicted outcomes can lead to ill-informed motivations and decisions. This is characteristic of the cognitive symptoms underlying many psychiatric diseases. Thus, these data may also aid our understanding and treatment of substance use disorder and mental illnesses marked by disruptions to both dopamine function and decision making.
METHODS
Subjects
Male and female wildtype Long-Evans rats and transgenic Long-Evans rats expressing Cre recombinase under control of the tyrosine hydroxylase promoter (Th-cre) aged 8 – 11 weeks at the time of surgery served as subjects. Rats were housed in a temperature (68–79°F) and humidity (30–70%) regulated vivarium. They were initially housed in same-sex pairs and then following surgery housed individually to preserve implants. Rats were provided with water ad libitum in the home cage and were maintained on a food-restricted 12–14 g daily diet (Lab Diet, St. Louis, MO) to maintain approximately 85–90% free-feeding body weight. Rats were handled for 3–5 days prior to the onset of each experiment. Separate groups of naïve rats were used for each experiment. Experiments were performed during the dark phase of a 12:12 hr reverse dark/light cycle (lights off at 7AM). All procedures were conducted in accordance with the NIH Guide for the Care and Use of Laboratory Animals and were approved by the UCLA Institutional Animal Care and Use Committee.
Surgery
We used standard surgical procedures described previously24, 54, 57, 58. Rats were anesthetized with isoflurane (4–5% induction, 1–2% maintenance), and a nonsteroidal anti-inflammatory agent was administered pre- and postoperatively to minimize pain and discomfort. Surgical details for each experiment are described below. In all cases, surgery occurred prior to the onset of behavioral training.
Behavioral procedures
Apparatus
Training took place in Med Associates conditioning chambers (East Fairfield, VT) housed within sound- and light-attenuating boxes, described previously59. Each chamber had grid floors and contained 2 retractable levers that could be inserted to the left and right of a recessed food-delivery port (magazine) on the front wall. Stimulus lights were positioned above each of these levers. A photobeam entry detector was positioned at the entry to the food port. Each chamber was equipped with a syringe pump to deliver 20% sucrose solution in 0.1 ml increments through a stainless-steel tube into one well of the food port and a pellet dispenser to deliver 45-mg food pellets (Bio-Serv, Frenchtown, NJ) into another well of the same port. Both a white noise and tone generator were attached to independent speakers on the wall opposite the levers and food-delivery port. A clicker was also mounted on this wall. A fan mounted to the outer chamber provided ventilation and external noise reduction. A 3-watt, 24-volt house light mounted on the top of the back wall opposite the food port provided illumination, except in Pavlovian blocking experiments for which it was used as a conditioned stimulus. For the Pavlovian blocking behavioral experiment, two stimulus lights were also positioned facing up outside, but immediately adjacent to the chamber at floor level on the front left and back right corners. Chambers used for intracranial self-stimulation contained 2 nose poke ports on the wall with the house light, a smooth plexiglass floor, and rounded wall opposite the nose pokes. They did not contain levers or a food-delivery port. For optogenetic manipulations, chambers were outfitted with an Intensity Division Fiberoptic Rotary Joint (Doric Lenses, Quebec, QC, Canada) connecting the output fiber optic patch cords to a laser (Dragon Lasers, ChangChun, JiLin, China) positioned outside of the chamber.
Pavlovian long-delay conditioning
Magazine conditioning.
Rats first received 2 days of training to learn where to receive the sucrose (20%, 0.1 ml/delivery) and food pellet (45 mg grain; Bio-Serv) outcomes. Each day included 2 sessions, separated by approximately 1 hr, order counterbalanced across days, one with 30 non-contingent deliveries of sucrose and one with 30 grain pellet deliveries (60-s intertrial interval, ITI).
Preexposure.
To reduce the initial saliency of the auditory stimuli used in subsequent Pavlovian conditioning, subjects received one day of preexposure to the click and white noise stimuli. Click and noise were presented pseudo-randomly for 30 s, 4 times each with a variable 1.5 – 3-min ITI (mean = 2.5 min).
Pavlovian conditioning.
All rats received 8 sessions of Pavlovian conditioning (1 session/day on consecutive days) to learn to associate each of 2 auditory cues (aka conditioned stimuli; 80–82 dB), click (10 Hz) and white noise, with a specific food outcome, sucrose solution or grain pellets. Each 30-s cue terminated with the delivery of its associated outcome. For half the subjects, click terminated in the delivery of sucrose and noise predicted pellets, with the other half receiving the opposite arrangement. Each session consisted of 8 click and 8 white noise presentations. Cues were delivered pseudo-randomly with a variable 1.5 – 3-min ITI (mean = 2.5 min).
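The session structure described above (8 presentations of each 30-s cue, order randomized, variable ITI) can be sketched as a small schedule generator. This is an illustrative sketch only, not the authors' control code: the function name `build_session`, the plain shuffle, and the uniform ITI draw are all assumptions. The source reports only the ITI range and mean, not the sampling distribution or any constraints on the pseudo-random cue order.

```python
import random


def build_session(n_per_cue=8, iti_range_s=(90.0, 180.0), cue_s=30.0, seed=0):
    """Return a list of (cue_onset_s, cue_name) tuples for one session.

    Assumptions (not from the source): cue order is a plain shuffle and
    each ITI is drawn uniformly from the stated 1.5-3 min range.
    """
    rng = random.Random(seed)
    cues = ["click"] * n_per_cue + ["noise"] * n_per_cue
    rng.shuffle(cues)

    t = 0.0
    schedule = []
    for cue in cues:
        t += rng.uniform(*iti_range_s)  # ITI precedes each cue
        schedule.append((t, cue))
        t += cue_s                      # outcome is delivered at cue offset
    return schedule


if __name__ == "__main__":
    for onset, cue in build_session(seed=1):
        print(f"{onset:8.1f} s  {cue}")
```

Counterbalancing of cue-outcome assignments across subjects is not modeled here; it would be applied when mapping `click` and `noise` to sucrose or pellets.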
Pavlovian trace conditioning
Magazine conditioning.
Rats first received 2 days of training to learn where to receive the chocolate purified pellet (45 mg; Bio-Serv) and unflavored purified pellet (45 mg; Bio-Serv) food rewards. Each day included 2 sessions, separated by approximately 1 hr, order counterbalanced across days, one with 30 non-contingent deliveries of chocolate pellets and one with 30 unflavored pellet deliveries (60-s ITI).
Preexposure.
To reduce the initial saliency of the auditory stimuli used in subsequent Pavlovian conditioning, subjects received one day of preexposure to the click and white noise stimuli. Click and noise were presented pseudo-randomly for 30-s durations, 4 times each with a variable 1.5 – 3-min ITI (mean = 2.5 min).
Pavlovian conditioning.
All rats received 8 sessions of Pavlovian conditioning (1 session/day on consecutive days) to learn to associate each of 2 auditory cues (80–82 dB), click (10 Hz) and white noise, with a specific food outcome, chocolate or unflavored purified pellets. Each 10-s cue was followed, after a 1.5-s trace interval, by delivery of its associated outcome. For half the subjects, click predicted chocolate pellets and noise predicted unflavored pellets, with the other half receiving the opposite arrangement. Each session consisted of 8 click and 8 white noise presentations. Cues were delivered pseudo-randomly with a variable 1 – 4-min ITI (mean = 2.5 min).
Unpredicted food reward.
Following training, rats received a single session during which they received non-contingent food-pellet deliveries (chocolate or unflavored, counterbalanced; average 60-s ITI, range = 20–110 s). Rats received 5 trials each in which either 1, 2, or 3 food pellets were delivered into the food-delivery port. Trial order was pseudorandomized.
Unpredicted aversive airpuff.
Rats received a single session in a context different from that of training during which they received 3–8 non-contingent presentations of a single puff of air (Dust Off; Falcon Safety Products, Branchburg NJ) delivered to the top part of the face with an average ITI of 35 s.
Pavlovian long-delay conditioning with Pavlovian-to-instrumental transfer test
Magazine conditioning.
Rats first received 2 days of training to learn where to receive the sucrose (20%, 0.1 ml/delivery) and food pellet (45 mg grain; Bio-Serv) rewards. Each day included 2 sessions, separated by approximately 1 hr, order counterbalanced across days, one with 30 non-contingent deliveries of sucrose and one with 30 grain pellet deliveries (60-s ITI).
Instrumental conditioning.
Rats next received 11 days, minimum, of instrumental conditioning. Each day consisted of 2 training sessions, one with the left lever and one with the right lever, separated by at least 1 hr with order alternated across days. Each action was reinforced with one of the different food outcomes (e.g., left press→grain pellets/right press→sucrose solution). Lever-outcome pairings were counterbalanced at the start of the experiment within each group. Each session terminated after 20 outcomes had been earned or 45 min had elapsed. Actions were continuously reinforced on the first day and then escalated ultimately to a random-ratio (RR) 20 schedule of reinforcement in which a variable number of presses (average = 20) were required to earn a reward.
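The random-ratio schedule described above can be illustrated with a short simulation. This sketch assumes the conventional definition of a random-ratio schedule, in which every press is reinforced independently with probability 1/ratio, so the number of presses per reward is geometrically distributed with a mean equal to the ratio. The function names are hypothetical, and the 45-min session limit is not modeled.

```python
import random


def presses_until_reward(ratio, rng):
    """Presses needed to earn one reward on an RR-`ratio` schedule.

    Each press is reinforced with probability 1/ratio (assumed standard
    random-ratio definition). ratio=1 is continuous reinforcement.
    """
    presses = 1
    while rng.random() >= 1.0 / ratio:
        presses += 1
    return presses


def simulate_session(ratio=20, max_outcomes=20, seed=0):
    """Press requirements for each outcome earned in one session."""
    rng = random.Random(seed)
    return [presses_until_reward(ratio, rng) for _ in range(max_outcomes)]


if __name__ == "__main__":
    requirements = simulate_session()
    print(requirements, sum(requirements) / len(requirements))
```

With only 20 outcomes per session, the within-session average can differ noticeably from 20; the mean converges to the nominal ratio only over many rewards.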
Pavlovian conditioning.
All rats received 8 sessions of Pavlovian conditioning (1 session/day on consecutive days) to learn to associate each of 2 auditory cues (80–82 dB), click (10 Hz) and white noise, with a specific food outcome, sucrose solution or grain pellets. Each 30-s cue terminated with the delivery of its associated outcome. For half the subjects, click terminated in the delivery of sucrose and noise predicted pellets, with the other half receiving the opposite arrangement. Cue-reward pairings were counterbalanced within groups and with respect to instrumental lever-outcome pairings. Each session consisted of 8 click and 8 white noise presentations. Cues were delivered pseudo-randomly with a variable 1.5 – 3-min ITI (mean = 2.5 min).
Instrumental retraining and extinction.
Following Pavlovian conditioning, rats received one instrumental retraining session on the RR-20 reinforcement schedule. Rats then received one session of instrumental extinction to establish a low level of pressing. During this single 30-min session both levers were available but pressing was not reinforced.
Outcome-specific Pavlovian-to-instrumental transfer tests.
Rats next received an outcome-specific Pavlovian-to-instrumental transfer (PIT) test. During the PIT test, both levers were continuously present, but pressing was not reinforced. After 5 min of lever-pressing extinction, each 30-s cue was presented separately 4 times, separated by a fixed 2.5-min ITI, in alternating order. Cue order was counterbalanced across subjects. No outcomes were delivered following cue presentation. Rats next received 2 instrumental retraining sessions. This was followed by 1 Pavlovian retraining session. After retraining, rats were given a second PIT test. This test was identical to the first except the pre-extinction phase was 10 min and each rat received the cues in opposite order to the first test.
Pavlovian trace conditioning with outcome-specific devaluation test
Magazine conditioning.
Rats first received 2 days of training to learn where to receive the chocolate purified pellet (45 mg; Bio-Serv) and unflavored purified pellet (45 mg; Bio-Serv) rewards. Each day included 2 sessions, separated by approximately 1 hr, order counterbalanced across days, one with 30 non-contingent deliveries of chocolate pellets and one with 30 unflavored pellet deliveries (60-s intertrial interval, ITI).
Pavlovian conditioning.
All rats received 8 sessions of Pavlovian conditioning (1 session/day on consecutive days) to learn to associate each of 2 auditory cues (aka conditioned stimuli; 80–82 dB), pulsed tone (1.5 kHz; 2 s on/2 s off) and white noise, with a specific food outcome, chocolate or unflavored purified pellets. Each 10-s cue was followed, after a 1.5-s trace interval, by delivery of its associated outcome. For half the subjects, tone predicted chocolate pellets and noise predicted unflavored pellets, with the other half receiving the opposite arrangement. Each session consisted of 8 tone and 8 white noise presentations. Cues were delivered pseudo-randomly with a variable 1 – 4-min ITI (mean = 2.5 min).
Outcome-specific devaluation by conditioned taste aversion.
Following training, one of the food rewards was devalued by pairing with the malaise-inducing agent lithium chloride (LiCl). In the conditioning chambers, rats were given 30 non-contingent deliveries of one pellet type (60-s ITI) followed immediately by an i.p. injection of LiCl (0.3 M, 1.5% volume/weight). For the control, rats were given 30 non-contingent deliveries of the other pellet type (60-s ITI) in the conditioning chamber, without subsequent LiCl injection. Rats received 1 session/day with 16 total sessions (8 devaluation and 8 control) in the order 3 devaluation, 3 control, 3 devaluation, 3 control, 1 devaluation, 2 control, 1 devaluation.
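As a quick consistency check on the session order given above, the block structure can be expanded into a day-by-day list and tallied. This is a minimal sketch with hypothetical names (`BLOCKS`, `expand_blocks`); it encodes only the order stated in the text.

```python
from collections import Counter

# Block order as stated in the text: (session type, number of consecutive days).
BLOCKS = [
    ("devaluation", 3),
    ("control", 3),
    ("devaluation", 3),
    ("control", 3),
    ("devaluation", 1),
    ("control", 2),
    ("devaluation", 1),
]


def expand_blocks(blocks):
    """Expand (type, n_days) blocks into one entry per daily session."""
    return [kind for kind, n_days in blocks for _ in range(n_days)]


if __name__ == "__main__":
    days = expand_blocks(BLOCKS)
    print(len(days), Counter(days))
```

The expanded list contains 16 sessions with 8 of each type, which agrees with the totals reported in the text.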
Outcome-specific devaluation probe test.
24 hr after the last session, rats next received an outcome-specific devaluation probe test. Each 10-s cue was presented separately 8 times, separated by a variable 2.5-min ITI, in alternating order. Cue order was counterbalanced across subjects. No outcomes were delivered following cue presentation.
Outcome-specific devaluation consumption test.
After the devaluation probe test, rats next received a consumption choice test to confirm the efficacy of the conditioned taste aversion. Rats were given access to 100 pellets of each type in a choice and allowed to consume freely for 20 min.
Outcome-specific blocking and Pavlovian-to-instrumental transfer
Magazine conditioning.
Rats first received 2 days of training to learn where to receive the sucrose (20%, 0.1 ml/delivery) and food pellet (45 mg grain; Bio-Serv) rewards. Each day included 2 sessions, separated by approximately 1 hr, order counterbalanced across days, one with 30 non-contingent deliveries of sucrose and one with 30 grain pellet deliveries (60-s ITI). The house light was off during these sessions.
Instrumental conditioning.
Rats next received 11 days, minimum, of instrumental conditioning. Each day consisted of 2 separate training sessions, one with the left lever and one with the right lever, separated by at least 1 hr with order alternated across days. Each action was reinforced with one of the different food rewards (e.g., left press→grain pellets/right press→sucrose solution). Lever-outcome pairings were counterbalanced at the start of the experiment within each group. Each session terminated after 20 outcomes had been earned or 45 min had elapsed. Actions were continuously reinforced on the first day and then escalated ultimately to a RR-20 schedule of reinforcement. The house light was off during these sessions.
Pavlovian conditioning.
Rats received 12 sessions of visual cue Pavlovian conditioning (1 session/day on consecutive days) in a dark operant chamber to learn to associate visual cues with the food outcomes. For rats in the blocking group, each of 2 30-s visual cues, house light or flashing stimulus lights (2 Hz), was paired with a specific food outcome, sucrose (20%, 0.1 ml/delivery) or grain pellets (45 mg; Bio-Serv; e.g., house light—sucrose/flashing light—pellet). Cue-reward pairings were counterbalanced within groups and with respect to instrumental lever-outcome pairings. For half the subjects, the house light terminated in the delivery of sucrose and flashing lights predicted pellets, with the other half receiving the opposite arrangement. Each session consisted of 16 house light and 16 flashing light presentations. Cues were delivered pseudo-randomly with a variable 1.5 – 3-min ITI (mean = 2.5 min). Subjects in the control group (behavioral experiment only) were trained to associate a third distinct, 30-s visual stimulus with both food outcomes. Each session consisted of 32 presentations of lights on either side of the outside of the chamber alternating every 2 s (30-s duration; variable 1.5 – 3-min ITI, mean = 2.5 min). On half the trials the 30-s alternating outside lights cue terminated in the delivery of sucrose (20%, 0.1 ml/delivery) and on the other half in grain pellets (45 mg; Bio-Serv), in pseudorandom order.
Instrumental retraining and extinction.
Following Pavlovian conditioning, rats received one instrumental retraining session on the RR-20 reinforcement schedule. Rats then received one session of instrumental extinction to establish a low level of pressing. During this single 30-min session both levers were available but pressing was not reinforced.
Preexposure.
Rats received one day of preexposure to the auditory stimuli. Click and noise were independently presented pseudo-randomly for 30 s, 8 times each with a variable 1.5 – 3-min ITI (mean = 2.5 min).
Compound conditioning.
Rats next received 4 compound conditioning sessions (1 session/day on consecutive days) in which the house light and flashing stimulus light cues were each presented in compound with a distinct auditory stimulus, click (10 Hz) or white noise (80–82 dB). For half the subjects in each group, the house light was presented simultaneously with the click for 30 s and the flashing lights simultaneously with the noise for 30 s. The other half of subjects received the opposite arrangement. Visual-auditory cue pairings were counterbalanced within groups and with respect to instrumental and visual cue-reward contingencies. For subjects in the blocking group, each compound stimulus terminated in the reward paired with the visual stimulus during initial Pavlovian conditioning (e.g., house light + white noise—sucrose/flashing light + click—pellet). Compound cue-reward pairings were counterbalanced across subjects in the control group. Each compound conditioning session consisted of 8 30-s presentations of each compound cue, terminating in the delivery of the associated food outcome. Compound cues were delivered pseudo-randomly with a variable 1.5 – 3-min ITI (mean = 2.5 min).
Outcome-specific Pavlovian-to-instrumental transfer tests.
Rats next received an outcome-specific PIT test. During the PIT test, both levers were continuously present, but pressing was not reinforced. After 5 min of lever-pressing extinction, each 30-s cue was presented separately 4 times, separated by a fixed 2.5-min ITI, in alternating order. Cue order was counterbalanced across subjects. No outcomes were delivered following cue presentation. The house light was off at test. Rats in the behavioral experiment next received two instrumental retraining sessions, one session of Pavlovian retraining with only visual cue presentations and one day of compound retraining prior to a second PIT test. This test was identical to the first except the pre-extinction phase was 10 min and each rat received the cues in opposite order to the first test.
Data collection
Discrete entries and time spent in the food-delivery port and/or lever presses were recorded continuously for each session. For Pavlovian training and PIT test sessions, the 30-s periods prior to each cue onset served as the baseline for comparison of cue-induced changes in lever pressing and/or food-port entries.
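The baseline comparison described above, together with the elevation ratio defined in the figure legend as (presses during cue)/(presses during cue + preCue presses), can be sketched as follows. The function names are hypothetical, and returning NaN when a subject makes no presses in either period is an assumption, since the source does not say how that case was handled.

```python
import math


def rate_per_min(count, duration_s=30.0):
    """Convert a response count within a period into responses per minute."""
    return count * 60.0 / duration_s


def elevation_ratio(cue_presses, precue_presses):
    """Elevation in pressing: cue / (cue + preCue).

    0.5 means no change from the preCue baseline; values above 0.5 mean
    the cue increased pressing. Returns NaN if there were no presses in
    either period (assumed handling, not stated in the source).
    """
    total = cue_presses + precue_presses
    if total == 0:
        return math.nan
    return cue_presses / total


if __name__ == "__main__":
    # Hypothetical counts for one lever during one 30-s cue and its 30-s preCue period.
    print(rate_per_min(6), rate_per_min(2), elevation_ratio(6, 2))
```

Because the cue and preCue periods have equal duration here, the ratio is the same whether it is computed from counts or from rates.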
Fiber photometry recordings of dopamine release in the BLA during Pavlovian long-delay conditioning
Subjects
Nine male (N = 5) and female (N = 4) Long Evans rats (Th-cre- littermates, N = 6; Charles River Laboratories, N = 3) aged 9–11 weeks at the time of surgery were used to record dopamine release in the BLA across Pavlovian long-delay conditioning. Subjects without fiber photometry GRABDA2h signal of sufficient quality (N = 2) were excluded from the dataset prior to analysis. An additional 3 subjects expressing GFP (2 male) were used to record GFP fluorescence changes as a control during the last Pavlovian conditioning session.
Surgery
Rats were infused bilaterally with AAV encoding the GPCR-activation-based dopamine sensor GRABDA2h (pAAV9-hsyn-GRAB_DA2h, Addgene) or control fluorophore (AAV8-hSYN-GFP). Virus (0.3 μl) was infused bilaterally into the BLA (AP: -2.7; ML: ±5.0; DV: -8.7 males or -8.6 mm females, from bregma). 5 min later, viral injectors were dorsally repositioned in the BLA for a second viral infusion (0.3 μl; DV: -8.4 males or -8.3 mm females). Subjects included in the control experiment received a single viral infusion (0.5 μl; DV: -8.6 mm). Optical fibers (400-μm diameter, 0.37 NA, Neurophotometrics) were implanted bilaterally 0.2 mm dorsal to the first infusion site. Virus was infused at a rate of 0.1 μl/min using 28-gauge injectors and injectors were left in place for 10 min after the second infusion. Experiments commenced approximately 4 weeks after surgery to allow sufficient expression in the BLA.
Fiber photometry recordings
Animals were habituated to the optical tether during the magazine conditioning sessions, but no light was delivered. Following magazine training, fiber photometry was used to image GRABDA2h fluorescent changes in BLA neurons throughout each Pavlovian long-delay conditioning session (N = 9) or GFP fluorescence during the last Pavlovian conditioning session (N = 3) using a commercial fiber photometry system (Neurophotometrics Ltd.). 470 nm excitation light was adjusted to approximately 80–100 μW at the tip of the patch cord (fiber core diameter: 400 μm; Doric Lenses). Fluorescence emission was passed through a 535 nm bandpass filter and focused onto the complementary metal-oxide semiconductor (CMOS) camera sensor through a tube lens. Samples were collected at 20 Hz using a custom Bonsai60 workflow. Time stamps of task events were collected simultaneously through an additional synchronized camera aimed at the Med Associates interface, which sent light pulses coincident with task events. Signals were saved using Bonsai software and exported to MATLAB (MathWorks, Natick, MA) for analysis. Recordings were collected unilaterally from the hemisphere with the strongest fluorescence signal at the start of the experiment.
Fiber photometry recordings of calcium activity in BLA neurons during Pavlovian long-delay conditioning
Subjects
Eight male (N = 4) and female (N = 4) wildtype rats (Charles River Laboratories, Wilmington, MA) aged approximately 9 weeks at the time of surgery were included in this study to assess BLA neuronal activity changes across Pavlovian long-delay conditioning. No subjects were excluded.
Surgery
Rats were infused bilaterally with adeno-associated virus (AAV) expressing the genetically encoded calcium indicator GCaMP6f under control of the calcium/calmodulin-dependent protein kinase (CaMKII) promoter (pENN.AAV5.CAMKII.GCaMP6f.WPRE.SV40, Addgene, Watertown, MA). Virus (0.5 μl) was bilaterally infused into the BLA (AP: -2.9; ML: ± 5.0; DV: -8.8 mm from bregma) at a rate of 0.1 μl/min using 28-gauge injectors. Injectors were left in place for 10 additional min following infusion. Optical fibers (200-μm diameter, 0.37 numerical aperture (NA), Neurophotometrics, San Diego, CA) were implanted bilaterally 0.2 mm dorsal to the infusion site. Experiments commenced approximately 4 weeks after surgery to allow sufficient expression in BLA cell bodies.
Fiber photometry recordings
Animals were habituated to the optical tether during the magazine conditioning sessions, but no light was delivered. Following magazine training, fiber photometry was used to image bulk calcium activity in BLA neurons throughout each Pavlovian conditioning session. We simultaneously imaged GCaMP6f and control fluorescence in the BLA using a commercial fiber photometry system (Neurophotometrics Ltd.). Light from two LEDs (470 nm: Ca2+-dependent GCaMP fluorescence; 415 nm: autofluorescence, motion artifact, Ca2+-independent GCaMP fluorescence) was reflected off dichroic mirrors and coupled via a patch cord (fiber core diameter: 200 μm; Doric Lenses, Quebec, Canada) to the implanted optical fiber. The intensity of excitation light was adjusted to ∼80 μW at the tip of the patch cord. Samples were collected at 20 Hz interleaved between the 415 nm and 470 nm excitation channels. Recordings were collected unilaterally from the hemisphere with the strongest fluorescence signal in the 470 nm channel at the start of the experiment.
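Because frames alternate between the 415 nm and 470 nm excitation channels, a first processing step is to split the recorded stream into one trace per channel. The sketch below is illustrative only and is not the authors' analysis pipeline: the function name `deinterleave` and the `first_channel` argument are hypothetical, and which channel comes first in a given recording is an assumption that must be checked against the acquisition log. The text also does not specify whether 20 Hz refers to the combined or the per-channel rate.

```python
def deinterleave(frames, first_channel="415"):
    """Split an alternating 415/470 nm frame sequence into two traces.

    `frames` is a sequence of per-frame fluorescence values in acquisition
    order. A trailing unpaired frame is dropped so both traces have the
    same length.
    """
    if first_channel not in ("415", "470"):
        raise ValueError("first_channel must be '415' or '470'")

    even = list(frames[0::2])  # frames 0, 2, 4, ...
    odd = list(frames[1::2])   # frames 1, 3, 5, ...
    n = min(len(even), len(odd))

    if first_channel == "415":
        return {"415": even[:n], "470": odd[:n]}
    return {"415": odd[:n], "470": even[:n]}


if __name__ == "__main__":
    # Hypothetical values: low numbers stand in for 415 nm, high for 470 nm.
    stream = [1.0, 10.0, 2.0, 20.0, 3.0]
    print(deinterleave(stream))
```

The resulting 415 nm trace is what would typically serve as the activity-independent reference for correcting the 470 nm signal, though the specific correction used in this study is not described in this section.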
Fiber photometry recordings of dopamine release in the BLA during Pavlovian trace conditioning
Subjects
Ten male (GRABDA2h, N = 3; GRABDA2m, N = 3) and female (GRABDA2h, N = 1; GRABDA2m, N = 3) Long Evans rats (Th-cre- littermates, N = 5; Gad-cre-, N = 2; Charles River Laboratories, N = 3) aged 7–9 weeks at the time of surgery were used to record dopamine release in the BLA across Pavlovian trace conditioning. Subjects without sufficient quality GRABDA signal (N = 20) were excluded from the dataset prior to analysis. There were no statistically significant differences in task-related signal (AUC) between GRABDA2h and GRABDA2m (F(1, 8) = 1.39, P = 0.27) and no interaction between sensor and any other variable of interest (lowest P: F(4, 32) = 1.17, P = 0.34), so subjects were combined into a single group. An additional 4 (2 male) subjects served as GFP controls.
Surgery
Rats were infused bilaterally with AAV encoding GRABDA (pAAV9-hsyn-GRAB_DA2h, Addgene or AAV9-hsyn-DA4.4, WZ Biosciences) or control fluorophore (AAV8-hSYN-GFP). Virus (0.3 μl) was infused into the BLA (AP: -2.7; ML: ±5.0; DV: -8.7 mm for males or -8.6 mm for females, from bregma). Five min later, viral injectors were dorsally repositioned in the BLA for a second viral infusion (0.3 μl; DV: -8.4 mm for males or -8.3 mm for females). Optical fibers (400-μm diameter, 0.37 NA, Neurophotometrics) were implanted bilaterally 0.2 mm dorsal to the first infusion site. Virus was infused at a rate of 0.1 μl/min using 28-gauge injectors and injectors were left in place for 10 min after the second infusion. Experiments commenced approximately 4 weeks after surgery to allow sufficient expression in the BLA.
Fiber photometry recordings
Animals were habituated to the optical tether during the magazine conditioning sessions, but no light was delivered. Following magazine training, fiber photometry was used to image GRABDA or GFP fluorescence in BLA neurons throughout each Pavlovian trace conditioning session using a commercial fiber photometry system (Neurophotometrics Ltd.). 470 nm excitation light was adjusted to approximately 80–100 μW at the tip of the patch cord (fiber core diameter: 400 μm; Doric Lenses) and samples were collected at 20 Hz. In a subset of subjects, recordings were made during aversive airpuffs to the face (GRABDA2h: N = 2, 2 male; GRABDA2m: N = 6, 3 male) and, in a separate session, unpredicted food-pellet reward deliveries (GRABDA2h: N = 2, 2 male; GRABDA2m: N = 5, 3 male).
Optogenetic inhibition of VTADA→BLA terminals at reward delivery during Pavlovian long-delay conditioning with Pavlovian-to-instrumental transfer test
Subjects
Twenty-one male (N = 11) and female (N = 10) transgenic Th-cre+ (hemizygous) Long Evans rats aged approximately 10 weeks at the time of surgery were used in this study to assess the necessity of VTADA→BLA projection activity for cue-reward learning. Eleven (6 males) served in the experimental group and 10 (5 males) served as controls. Subjects with misplaced optic fibers (N = 3) were excluded from the dataset.
Surgery
Th-cre rats were randomly assigned to a viral group and infused bilaterally with a cre-dependent AAV encoding either the inhibitory opsin archaerhodopsin T (ArchT; N = 11; 6 males; AAV5-CAG-FLEX-ArchT-tdTomato, Addgene) or a tdTomato fluorescent protein control (tdTomato; N = 10; 5 males; AAV5-CAG-FLEX-tdTomato, University of North Carolina Vector Core, Chapel Hill, NC). Virus (0.2 μl) was infused bilaterally at a rate of 0.1 μl/min into the VTA (AP: -5.3; ML: ±0.7; DV: -8.3 mm from bregma) using a 28-gauge injector. Injectors were left in place for 10 min following infusion. Optical fibers (200-μm core, 0.39 NA, Thorlabs, Newton, NJ) held in ceramic ferrules (Kientec Systems, Stuart, FL) were implanted bilaterally in the BLA (AP: -2.7; ML: ±5.0; DV: -8.2 mm from bregma). Experiments commenced 4–5 weeks after surgery to allow sufficient expression in VTADA→BLA terminals at the time of manipulation (7–9 weeks post-surgery).
Optogenetic inhibition of VTADA→BLA projections
Rats received magazine and instrumental training as above. Animals were habituated to the optical tether (200 μm, 0.22 NA, Doric Lenses) for at least the last 2 sessions of instrumental conditioning, but no light was delivered. Optogenetic inhibition was used to attenuate the activity of ArchT-expressing VTADA axons and terminals in the BLA at the time of cue-reward pairing during each Pavlovian long-delay conditioning session. During each Pavlovian conditioning session, green light (532 nm; 10 mW) was delivered to the BLA via a laser (Dragon Lasers) connected through a ceramic mating sleeve (Thorlabs) to the ferrule implanted on the rat. Light was delivered continuously for 3 s concurrent with each outcome delivery (occurring at cue offset). If the reward was retrieved after the laser had gone off, then the retrieval entry (first food-port entry after reward delivery) triggered an additional 3-s illumination. Light effects were estimated to be restricted to the BLA based on predicted irradiance values (https://web.stanford.edu/group/dlab/cgi-bin/graph/chart.php). Following Pavlovian conditioning, rats proceeded to the PIT tests as described above, during which they were tethered to the optical patch cords, but no light was delivered. The same light delivery procedures were used during Pavlovian retraining in between PIT tests.
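The light-delivery rule described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name, argument names, and example times are hypothetical, and the sketch assumes that a retrieval entry counts as "after the laser had gone off" only if it occurs strictly after the end of the first 3-s illumination.

```python
def illumination_windows(reward_time, retrieval_time, light_duration=3.0):
    """Return (start, end) light-on windows, in seconds, for a single trial.

    reward_time: time of outcome delivery (cue offset in long-delay conditioning).
    retrieval_time: first food-port entry after reward delivery, or None if
        the reward was not retrieved during the trial.
    """
    # Continuous light for `light_duration` s, concurrent with outcome delivery.
    windows = [(reward_time, reward_time + light_duration)]
    # Assumption: a retrieval strictly after light offset triggers one
    # additional illumination of the same duration.
    if retrieval_time is not None and retrieval_time > reward_time + light_duration:
        windows.append((retrieval_time, retrieval_time + light_duration))
    return windows


# Hypothetical trial: reward delivered at 30 s, retrieved at 40 s.
print(illumination_windows(30.0, 40.0))
```

With these hypothetical times, the reward is retrieved after the first illumination has ended, so the sketch returns two 3-s windows.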
Optogenetic inhibition of VTADA→BLA terminals during entire cue-reward period during Pavlovian long-delay conditioning with Pavlovian-to-instrumental transfer test
Subjects
Fifteen male (N = 8) and female (N = 7) transgenic Long Evans rats aged approximately 9 weeks at the time of surgery were used in this study to assess the necessity of VTADA→BLA projection activity for cue-reward learning. Seven (4 males) Th-cre+ (hemizygous) rats served in the experimental group. 1 subject with insufficient viral expression was excluded from the dataset. Eight subjects served in the control group, 4 (2 male) Th-cre+ (hemizygous) rats and 4 (2 male) wildtype Th-cre- littermates. Behavioral data did not differ between the two control types (lowest P: F(1, 6) = 1.61, P = 0.25) and so they were collapsed into a single control group.
Surgery
Th-cre rats were randomly assigned to a viral group and infused bilaterally with a cre-dependent AAV encoding either ArchT (N = 7; 4 males; AAV5-CAG-FLEX-ArchT-tdTomato, Addgene) or a tdTomato fluorescent protein control (tdTomato; N = 4; 2 males; AAV5-CAG-FLEX-tdTomato). Four (2 male) wildtype Th-cre- littermates were infused bilaterally with the cre-dependent AAV encoding ArchT (AAV5-CAG-FLEX-ArchT-tdTomato, Addgene). Virus (0.2 μl) was infused bilaterally at a rate of 0.1 μl/min into the VTA (AP: -5.3; ML: ±0.7; DV: -8.3 mm from bregma) using a 28-gauge injector. Injectors were left in place for 10 min following infusion. Optical fibers (200-μm core, 0.39 NA, Thorlabs, Newton, NJ) held in ceramic ferrules (Kientec Systems, Stuart, FL) were implanted bilaterally in the BLA (AP: -2.7; ML: ±5.0; DV: -8.2 mm from bregma). Experiments commenced 4–5 weeks after surgery to allow sufficient expression in VTADA→BLA terminals at the time of manipulation (7–9 weeks after surgery).
Optogenetic inhibition of VTADA→BLA projections
Rats received magazine and instrumental training as above. Animals were habituated to the optical tether (200 μm, 0.22 NA, Doric Lenses) for at least the last 2 sessions of instrumental conditioning, but no light was delivered. Optogenetic inhibition was used to attenuate the activity of ArchT-expressing VTADA axons and terminals in the BLA throughout cue and reward presentation during each Pavlovian long-delay conditioning session. During each Pavlovian conditioning session, green light (532 nm; 10 mW) was delivered to the BLA via a laser (Dragon Lasers) connected through a ceramic mating sleeve (Thorlabs) to the ferrule implanted on the rat. Light was delivered continuously beginning at the onset of each cue and ending 3 s after cue offset (33 s total). Thus, we inhibited during the entire cue-reward period. If the reward was retrieved after the laser had gone off, then the retrieval entry triggered an additional 3-s illumination. Following Pavlovian conditioning, rats proceeded to the PIT tests as described above, during which they were tethered to the optical patch cords, but no light was delivered. The same light delivery procedures were used during Pavlovian retraining in between PIT tests.
Optogenetic inhibition of VTADA→BLA terminals at reward delivery during Pavlovian trace conditioning with outcome-specific devaluation test
Subjects
Twelve male (N = 8) and female (N = 4) transgenic Long Evans rats aged approximately 10 weeks at the time of surgery were used in this study to assess the necessity of VTADA→BLA projection activity for cue-reward learning. Five (4 males) Th-cre+ (hemizygous) rats served in the experimental group. Subjects that died prior to test (N = 2) or with misplaced optic fibers (N = 1) were excluded from the dataset. Seven total subjects served in the control group, 4 (3 male) Th-cre+ (hemizygous) rats and 3 (1 male) wildtype Th-cre- littermates. 1 subject that died prior to test was excluded from the dataset. Behavioral data did not differ between the two types of controls (lowest P: F(1, 5) = 1.18, P = 0.33) and so they were collapsed into a single control group.
Surgery
Th-cre rats were randomly assigned to a viral group and infused bilaterally with a cre-dependent AAV encoding either ArchT (N = 5; 4 males; AAV5-CAG-FLEX-ArchT-tdTomato, Addgene) or a tdTomato fluorescent protein control (tdTomato; N = 4; 3 males; AAV5-CAG-FLEX-tdTomato). Three (1 male) wildtype Th-cre- littermates were infused bilaterally with the cre-dependent AAV encoding ArchT (AAV5-CAG-FLEX-ArchT-tdTomato, Addgene). Virus (0.2 μl) was infused bilaterally at a rate of 0.1 μl/min into the VTA (AP: -5.3; ML: ±0.7; DV: -8.3 mm from bregma) using a 28-gauge injector. Injectors were left in place for 10 min following infusion. Optical fibers (200-μm core, 0.39 NA, Thorlabs, Newton, NJ) held in ceramic ferrules (Kientec Systems, Stuart, FL) were implanted bilaterally in the BLA (AP: -2.7; ML: ±5.0; DV: -8.2 mm from bregma). Experiments commenced 4–5 weeks after surgery to allow sufficient expression in VTADA→BLA terminals at the time of manipulation (7–9 weeks after surgery).
Optogenetic inhibition of VTADA→BLA projections
Rats received magazine conditioning as above. Animals were habituated to the optical tether (200 μm, 0.22 NA, Doric Lenses) during this training, but no light was delivered. Optogenetic inhibition was used to attenuate the activity of ArchT-expressing VTADA axons and terminals in the BLA at the time of cue-reward pairing during each Pavlovian trace conditioning session. During each Pavlovian conditioning session, green light (532 nm; 10 mW) was delivered to the BLA via a laser (Dragon Lasers) connected through a ceramic mating sleeve (Thorlabs) to the ferrule implanted on the rat. Light was delivered continuously for 3 s concurrent with each reward delivery (occurring 1.5 s after cue offset). If the outcome was retrieved after the laser had gone off, then the retrieval entry triggered an additional 3-s illumination. Thus, we inhibited at each reward delivery, without inhibiting at cue offset. Rats received 5 conditioning sessions to avoid negatively deflecting VTADA→BLA activity once BLA dopamine reward responses attenuate with learning. Following Pavlovian conditioning, rats proceeded to the outcome-specific devaluation and probe test as described above. Rats were tethered to the optical patch cords during the probe test, but no light was delivered.
Outcome-specific Pavlovian blocking and Pavlovian-to-instrumental transfer
Subjects
Thirty-two male (N = 22) and female (N = 10) Long Evans rats (Charles River) aged approximately 8 weeks at the start of the experiment were used in this study to evaluate the extent to which previously learned cues could block the encoding of novel identity-specific cue-reward memories. Prior to the start of behavioral training, subjects were randomly assigned to Blocking (N = 16, 11 male) or Control (N = 16, 11 male) groups. Rats were trained and tested using the Outcome-specific blocking and Pavlovian-to-instrumental transfer procedures described above.
Optical stimulation of VTADA→BLA terminals at reward delivery during Pavlovian blocking with Pavlovian-to-instrumental transfer
Subjects
Twenty-four male (N = 12) and female (N = 12) transgenic Th-cre+ (hemizygous) Long Evans rats aged 9–12 weeks at the time of surgery were used in this study. Subjects with misplaced optical fibers (N = 2) or lacking viral expression (N = 2) were excluded from the dataset.
Surgery
Th-cre rats were randomly assigned to a viral group and infused bilaterally with a cre-dependent AAV encoding either the excitatory opsin channelrhodopsin (ChR2; N = 11, 6 male; AAV5-EF1a-DIO-hChR2(H134R)-eYFP, University of North Carolina Vector Core) or an enhanced yellow fluorescent protein control (eYFP; N = 13, 6 males; pAAV5-Ef1a-DIO-eYFP, Addgene). Virus (0.2 μl) was infused bilaterally at a rate of 0.1 μl/min into the VTA (AP: -5.3; ML: ±0.7; DV: -8.3 mm from bregma) using a 28-gauge injector. Injectors were left in place for 10 min following viral infusions. Optical fibers (200 μm core, 0.39 NA, Thorlabs) held in ceramic ferrules (Kientec Systems) were implanted bilaterally in the BLA (AP: -2.7; ML: ±5.0; DV: -8.2 mm from bregma). Experiments commenced approximately 2 weeks after surgery to allow sufficient expression in VTADA→BLA axon terminals at the time of optical manipulation (7–8 weeks after surgery).
Optogenetic stimulation of VTADA→BLA projections
Rats received magazine conditioning, instrumental training, and visual cue Pavlovian conditioning as described for the Outcome-specific blocking and Pavlovian-to-instrumental transfer procedures above. All subjects received the blocking condition. Animals were habituated to the optical tether (200 μm, 0.22 NA, Doric Lenses) for at least the last 2 sessions of instrumental conditioning and the last two days of visual cue Pavlovian conditioning, but no light was delivered. Optogenetic excitation was used to stimulate the activity of ChR2-expressing VTADA axons and terminals in the BLA at the time of each cue-reward pairing during each compound conditioning session. During each compound conditioning session, blue light (473 nm; 10 mW; 25-ms pulse width) was delivered to the BLA via a laser (Dragon Lasers) for 3 s at a rate of 20 Hz concurrent with each reward delivery. We selected this stimulation frequency to match the upper end firing rate of VTADA neurons detected in response to reward3, 34 similar to prior work on the VTA→BLA pathway61. Following compound conditioning, rats proceeded to the PIT test as described above, during which they were tethered to the optical patch cords, but no light was delivered.
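As a quick check on the stimulation parameters stated above (3 s at 20 Hz with a 25-ms pulse width), the following Python sketch lays out the pulse onsets and offsets within one stimulation epoch. The function and variable names are hypothetical and the sketch is not the laser-control code used in the study.

```python
def pulse_train(duration_s=3.0, rate_hz=20.0, pulse_width_s=0.025):
    """Return (onset, offset) times in seconds for each light pulse."""
    period = 1.0 / rate_hz                    # 50 ms between pulse onsets at 20 Hz
    n_pulses = round(duration_s * rate_hz)    # 60 pulses in a 3-s train
    return [(i * period, i * period + pulse_width_s) for i in range(n_pulses)]


pulses = pulse_train()
duty_cycle = 0.025 * 20.0  # fraction of each period with light on

print(len(pulses), duty_cycle)
```

Under these parameters each 3-s epoch contains 60 pulses, with light on for half of each 50-ms period.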
Intracranial self-stimulation
Following the PIT test, rats received 2 sessions (1 session/day) of intracranial self-stimulation (ICSS) testing. This occurred in a distinct context from the prior conditioning and testing. This context had a smooth plexiglass rather than grid floor, round right-side wall and no levers or food-delivery port. During each 1-hr session animals were allowed to nose poke in 2 ports positioned on the left and right side of the left wall of the operant chamber. Nose pokes into the active port triggered 1-s blue light (473 nm; 10 mW; 25-ms pulse width; 20 Hz) delivery to the BLA. Subsequent nose pokes during the 1-s light-delivery period were recorded but did not extend light delivery. Inactive port pokes were also recorded. For half of the subjects in each group, the left port was active and the right inactive, with the opposite arrangement for the other half.
Histology
Following behavioral experiments, rats were deeply anesthetized with Nembutal and transcardially perfused with phosphate buffered saline (PBS) followed by 4% paraformaldehyde (PFA). Brains were removed and post-fixed in 4% PFA overnight, placed into 30% sucrose solution, then sectioned into 30-μm slices using a cryostat and stored in cryoprotectant. Slices were rinsed in a DAPI solution for 4 min (5 mg/mL stock, 1:10000), washed 3 times in PBS for 15 min, mounted on slides and coverslipped with ProLong Gold mounting medium. Images were acquired using a Keyence BZ-X710 microscope (Keyence, El Segundo, CA) with 4x, 10x, and 20x objectives (CFI Plan Apo), CCD camera, and BZ-X Analyze software.
GFP fluorescence was used to confirm expression of GCaMP6f in BLA cell bodies. Immunofluorescence was used to confirm expression of GRABDA in the BLA. Floating coronal sections were washed 3 times in 1x PBS for 30 min and then blocked for 1–1.5 hr at room temperature in a solution of 3% normal goat serum and 0.3% Triton X-100 dissolved in PBS. Sections were then washed 3 times in PBS for 15 min and incubated in blocking solution containing chicken anti-GFP polyclonal antibody (1:1000; Abcam, Cambridge, MA) with gentle agitation at 4°C for 18–22 hr. Sections were next rinsed 3 times in PBS for 30 min and incubated with goat anti-chicken IgY, Alexa Fluor 488 conjugate (1:500; Abcam) in blocking solution at room temperature for 2 hr. Sections were washed a final 2 times in PBS for 10 min.
tdTomato fluorescence with a Th costain was used to confirm expression of ArchT-tdTomato in VTADA neurons. Floating coronal sections were washed 3 times in 1x PBS for 30 min and then blocked for 2 hr at room temperature in a solution of 3% normal donkey serum and 0.2% Triton X-100 dissolved in PBS. Sections were then washed 3 times in PBS for 15 min and incubated in blocking solution containing rabbit anti-TH antibody (1:1000; EMD Millipore, Burlington, MA) with gentle agitation at 4°C for 44–48 hr. Sections were next rinsed 3 times in PBS for 30 min and incubated with goat anti-rabbit IgG, Alexa Fluor 488 conjugate (1:500; Thermofisher Scientific, Waltham, MA) in blocking solution at room temperature for 2 hr. Sections were washed a final 2 times in PBS for 10 min. Immunofluorescence was also used to confirm expression of ArchT-tdTomato in axons and terminals in the BLA. Floating coronal sections were washed 2 times in 1x PBS for 10 min and then blocked for 2 hr at room temperature in a solution of 10% normal goat serum and 0.5% Triton X-100 dissolved in PBS. Sections were then washed 3 times in PBS for 15 min and incubated in blocking solution containing rabbit anti-DsRed polyclonal antibody (1:1000; EMD Millipore, Burlington, MA) with gentle agitation at 4°C for 18–22 hr. Sections were next rinsed 3 times in blocking solution for 30 min and incubated with goat anti-rabbit IgG, Alexa Fluor 594 conjugate (1:500; Thermofisher Scientific) in blocking solution at room temperature for 2 hr. Sections were washed a final 2 times in PBS for 10 min.
eYFP fluorescence with a Th costain was used to confirm expression of ChR2-eYFP in VTADA neurons. Staining procedures were as described above using a secondary goat anti-rabbit Alexa 594 antibody (Thermofisher Scientific). Immunofluorescence, following the procedures described above for GFP amplification, was used to confirm expression of ChR2 in axons and terminals in the BLA.
Data analysis
Behavioral analysis
Behavioral data were processed with Microsoft Excel (Microsoft, Redmond, WA). Press rates on the last 2 sessions of instrumental training were averaged across levers then across days and compared between groups to test for any pre-existing group differences in instrumental behavior. For Pavlovian long-delay conditioning, conditional food-port approach responses during the Pavlovian and compound conditioning sessions were assessed by comparing the rate of entries into the food-delivery port (entries/min) during the 30-s cue periods relative to the 30-s baseline periods prior to cue onset (preCue). Because cue periods were shorter, for trace conditioning, Pavlovian conditional food-port approach responses during the Pavlovian conditioning sessions were assessed by comparing the percentage of time spent in the food-delivery port during the 10-s cue periods relative to the 10-s baseline preCue periods. Data were averaged across trials for each cue and then averaged across the two cues. For PIT tests, entry rate into the food-port during the 30-s cues was also compared to the baseline 30-s preCue periods. Data were averaged across trials for each cue and then averaged across cues. Lever press rates (presses/min) during the 30-s baseline preCue periods were compared to those during the 30-s cue periods to capture the cue-induced change in lever pressing. Lever presses were separated for presses on the lever that, during training, earned the same outcome as the upcoming or presented cue (Same presses) versus those on the other available lever (Different presses). Data were separated into Same v. Different presses for each preCue and cue period, averaged across trials, then averaged across cue types.
To account for baseline press rates and evaluate the selective cue-induced change in lever pressing, we computed an elevation ratio for each lever [(Cue:Same presses)/(Cue:Same presses + preCue:Same presses)] and [(Cue:Different presses)/(Cue:Different presses + preCue:Different presses)]. When two PIT tests were conducted, food-port entry rate, lever-press rates, and elevation ratios were averaged across PIT tests. For devaluation tests, percent time in the food-delivery port was averaged across the 10-s baseline preCue periods and compared with that during the 10-s cue periods, separated by the cue that signaled the non-devalued outcome v. that signaling the currently devalued (i.e., pre-fed) outcome. Elevation ratios were computed for each cue [(Cue: non-Devalued %time)/(Cue: non-Devalued %time + preCue %time)] and [(Cue: Devalued %time)/(Cue: Devalued %time + preCue %time)]. For ICSS sessions, the total number of nose pokes into the active and inactive ports were compared across the two sessions.
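The elevation ratio defined above can be written as a one-line function. The Python sketch below is illustrative only: the function name and example press rates are hypothetical, and returning NaN when both the cue and preCue values are zero is an assumption, since the handling of that case is not stated here.

```python
def elevation_ratio(cue_value, precue_value):
    """Cue / (Cue + preCue); 0.5 indicates no cue-induced change from baseline."""
    total = cue_value + precue_value
    if total == 0:
        # Assumption: the ratio is treated as undefined when there is no
        # responding in either period.
        return float("nan")
    return cue_value / total


# Hypothetical press rates (presses/min) for one subject.
same_ratio = elevation_ratio(cue_value=12.0, precue_value=6.0)      # Same lever
different_ratio = elevation_ratio(cue_value=5.0, precue_value=5.0)  # Different lever
print(same_ratio, different_ratio)
```

Values above 0.5 indicate that responding was elevated during the cue relative to the preCue baseline, and values below 0.5 indicate suppression. The same function applies to the percent-time measure used in the devaluation tests.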
GRABDA fiber photometry analysis
Data were pre-processed using a custom-written pipeline in MATLAB (MathWorks). The 470 nm signal was resampled to 19.5 Hz and then was divided by a second-order exponential fitted to the raw data to account for attenuation in fluorescence resulting from photobleaching across the session. The resampled and fitted data were then Z-scored. Area under the curve (AUC) was calculated for each individual aligned trace within each session using a trapezoidal function. For Pavlovian long-delay conditioning, we used 2-s preCue baseline, cue onset, and cue offset/reward delivery windows. To match the duration of the trace interval, for Pavlovian trace conditioning, we used 1.5-s preCue baseline, cue onset, cue offset, and reward delivery windows. We compared data across conditioning sessions 1, 2, 3/4, 5/6, and 7/8. Thus, data from the mid and latter training sessions were averaged across 2-session bins. All subjects had reliable data from at least one session per bin. Session data were excluded if artifactual signal due to excessive motion or patch cord twisting was detected for at least half of the trials (Long-delay conditioning: N = 3 sessions from N = 2 subjects; Trace conditioning: N = 1 sessions from N = 2 subjects). Two subjects without reliable data from at least one session per bin were excluded from the long-delay conditioning experiment, and two subjects were excluded from the trace conditioning experiment. We were able to obtain reliable imaging data from all 8 training sessions from N = 7/9 subjects for the long-delay conditioning experiment and from N = 8/10 subjects for the trace conditioning experiment (Extended Data Figure 2).
When evaluating the data from subjects for which we collected reliable recordings from each training session, there was no statistically significant interaction between Session and Events of interest across the data within each bin (Long-delay conditioning: lowest P: F(1.11, 6.68) = 1.66, P = 0.25; Trace conditioning: lowest P: F(1.68, 11.73) = 0.97, P = 0.39), justifying the collapse across training bins.
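The Z-scoring and trapezoidal AUC steps of the GRABDA analysis can be sketched in Python as follows. This is a minimal illustration, not the MATLAB pipeline used in the study: it assumes the photobleaching correction (division by the fitted exponential) has already been applied, it uses the population standard deviation for the Z-score (the source does not specify which was used), and all function names and the toy trace are hypothetical.

```python
from statistics import mean, pstdev


def zscore(signal):
    """Z-score a session-long signal (assumption: population SD)."""
    mu = mean(signal)
    sd = pstdev(signal)
    return [(x - mu) / sd for x in signal]


def trapezoid_auc(trace, fs, start_s, end_s):
    """Trapezoidal area under `trace` between start_s and end_s.

    trace: samples of a single aligned trial.
    fs: sampling rate in Hz (19.5 Hz after resampling in the text above).
    start_s, end_s: window edges in seconds from the start of `trace`.
    """
    dt = 1.0 / fs
    i0 = round(start_s * fs)
    i1 = round(end_s * fs)
    segment = trace[i0:i1 + 1]
    # Sum the areas of the trapezoids between consecutive samples.
    return sum((a + b) / 2.0 * dt for a, b in zip(segment, segment[1:]))


# Toy example: a linear ramp sampled at 1 Hz, integrated from 0 to 4 s.
toy_auc = trapezoid_auc([0.0, 1.0, 2.0, 3.0, 4.0], fs=1.0, start_s=0, end_s=4)
print(toy_auc)
```

In practice the same function would be applied to each aligned trial with the 2-s (long-delay) or 1.5-s (trace) event windows described above.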
GCaMP6f fiber photometry analysis
Data were pre-processed using a custom-written pipeline in MATLAB (MathWorks, Natick, MA). Using least-squares linear regression, the 415 nm signal was fit to the 470 nm signal. Change in fluorescence (ΔF/F) at each time point was calculated by subtracting the fitted 415 nm signal from the 470 nm signal and normalizing to the fitted 415 nm data [(470 - fitted 415)/(fitted 415)]. The ΔF/F data were resampled to 19.5 Hz then Z-scored [(ΔF/F - mean ΔF/F)/std(ΔF/F)]. Using a custom MATLAB workflow, Z-scored traces were then aligned to cue onset for each trial. Peak magnitude was calculated on the Z-scored trace for each trial using 5-s preCue baseline, cue onset, and postCue offset/outcome delivery windows. Data were averaged across trials and then across cues. Session data were excluded if no transient calcium fluctuations were detected on the 470 nm channel above the isosbestic channel or if poor linear fit was detected due to excessive motion artifact (N = 2 sessions from N = 2 subjects). To examine the progression in BLA activity across training, we compared data across conditioning sessions 1, 2, 3/4, 5/6, and 7/8. Thus, data from the mid and latter training sessions were averaged across 2-session bins. All subjects had reliable data from at least one session per bin.
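The isosbestic correction described above (fit the 415 nm signal to the 470 nm signal, compute ΔF/F, Z-score, then take the peak in a window) can be sketched in Python. This is a simplified illustration rather than the MATLAB pipeline used in the study: resampling and trial alignment are omitted, the population standard deviation is assumed for the Z-score, and the function names and toy values are hypothetical.

```python
from statistics import mean, pstdev


def fit_isosbestic(sig_415, sig_470):
    """Least-squares linear fit of the 415 nm signal onto the 470 nm signal."""
    mx, my = mean(sig_415), mean(sig_470)
    sxx = sum((x - mx) ** 2 for x in sig_415)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sig_415, sig_470))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [slope * x + intercept for x in sig_415]


def dff_zscore(sig_415, sig_470):
    """Z-scored dF/F, with dF/F = (470 - fitted 415) / fitted 415."""
    fitted = fit_isosbestic(sig_415, sig_470)
    dff = [(s - f) / f for s, f in zip(sig_470, fitted)]
    mu, sd = mean(dff), pstdev(dff)
    return [(d - mu) / sd for d in dff]


def peak_in_window(trace, fs, start_s, end_s):
    """Maximum of `trace` between start_s and end_s (seconds from trace start)."""
    i0, i1 = round(start_s * fs), round(end_s * fs)
    return max(trace[i0:i1])


# Toy signals in arbitrary fluorescence units.
z_trace = dff_zscore([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
print(z_trace)
```

For the analysis described above, `peak_in_window` would be called on each cue-aligned trial with the 5-s preCue, cue onset, and outcome delivery windows.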
Statistical analysis
Datasets were analyzed by two-tailed, paired and unpaired Student’s t tests, two- or three-way repeated-measures analysis of variance (ANOVA), and simple linear regression as appropriate (GraphPad Prism, GraphPad, San Diego, CA; SPSS, IBM, Chicago, IL). For the few datasets that were slightly non-normal, results were cross-checked using non-parametric statistics and the findings were identical. We opted to use parametric statistics for consistency across experiments and given evidence that ANOVA is robust to slight non-normality62, 63. For well-established behavioral effects (PIT), multiple pairwise comparisons were used for a priori post hoc comparisons based on a logical extension of Fisher's protected least significant difference procedure for controlling familywise Type I error rates64. All other post hoc tests were corrected for multiple comparisons using the Bonferroni method and used to clarify main and interaction effects. Greenhouse-Geisser correction was applied to mitigate the influence of unequal variance between conditions. Alpha levels were set at P < 0.05. Full statistical reporting is provided in Supplemental Table 1.
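For readers unfamiliar with the Bonferroni method mentioned above, the following Python sketch shows the standard adjustment, in which each P value in a family of m comparisons is multiplied by m and capped at 1. It is a generic illustration with hypothetical P values and is not the Prism or SPSS implementation used for the reported analyses.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P values for one family of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical family of three post hoc comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

With these hypothetical values only the first comparison remains below the 0.05 alpha level after adjustment.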
Sex as a biological variable
Male and female rats were used in approximately equal numbers for each experiment, but the N per sex was underpowered to examine sex differences. Sex was therefore not included as a factor in statistical analyses, though individual data points are visually disaggregated by sex.
Rigor and reproducibility
Group sizes were estimated a priori based on prior work using male Long Evans rats in this behavioral task59, 65, 66 and to ensure counterbalancing of Cue-reward and Lever-outcome pairings. Investigators were not blinded to viral group because they were required to administer virus. All behaviors were scored using automated software (MedPC). Each experiment included at least 1 replication cohort and cohorts were balanced by viral group, Cue-reward and Lever-reward pairings, hemisphere etc. prior to the start of the experiment.
Extended Data
Extended Data Figure 1. BLA neurons are active during cue-reward encoding.

To characterize the endogenous activity of BLA neurons, we used fiber photometry to record fluorescent activity of the genetically encoded calcium indicator GCaMP6f67 in the BLA of male and female rats. (a) Top: Representative fluorescent image of GCaMP6f expression and fiber placement in the BLA. Bottom: Fiber photometry approach for bulk calcium imaging in BLA neurons. (b) Schematic representation of GCaMP6f expression and placement of optical fiber tips in BLA for all subjects. (c) Pavlovian long-delay conditioning procedure schematic. CS, 30-s conditioned stimulus (aka, “cue”, white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (d) Food-port entry rate during the cue relative to the preCue baseline period, averaged across the 2 cues for each Pavlovian conditioning session. Across training, rats developed a Pavlovian conditional approach response of entering the food-delivery port during cue presentation. Two-way RM ANOVA Training × Cue: F(2.44, 17.07) = 7.97, P = 0.002; Training: F(3.30, 23.10) = 4.85, P = 0.008; Cue: F(1, 7) = 80.33, P < 0.0001. *P < 0.05, **P < 0.01. N = 8, 4 male rats. (e-f) BLA neurons are active during the encoding of cue-reward memories. BLA neurons were robustly activated both at cue onset and offset when the outcome was delivered. Cue onset responses beginning on the first conditioning sessions have been detected previously2. These novelty responses rapidly attenuate if the stimuli are not associated with reward24. (e) Quantification of maximal (peak) GCaMP6f Z-score ΔF/F during the 5-s period following cue onset or outcome delivery compared to the equivalent baseline period immediately prior to cue onset. Two-way RM ANOVA Training × Event: F(2.52, 17.61) = 3.94, P = 0.03; Event: F(1.39, 9.71) = 58.63, P < 0.0001; Training F(1.71, 11.97) = 2.30, P = 0.15. (f) GCaMP6f fluorescence changes (Z-score ΔF/F) in response to cue presentation (blue) and outcome delivery across days of training. 
Tick marks represent time of outcome collection for each subject. Data from the last six sessions were averaged across 2-session bins (3/4, 5/6, and 7/8). N = 8, 4 male rats. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. Consistent with prior evidence24, BLA neurons are activated by rewards and their predictors. BLA activation is particularly robust when the cues can become linked to the identifying features of the rewards they predict. Although these data likely reflect both somatic and non-somatic calcium activity, they are consistent with prior electrophysiological evidence that BLA neurons respond to reward during learning68–71.
Extended Data Figure 2. Dopamine release in BLA during cue-reward learning across each of the 8 Pavlovian conditioning sessions.

(a) GRABDA2h fluorescence changes (Z-score) in response to cue presentation (blue) and reward delivery across each of the 8 Pavlovian conditioning sessions. (b) Quantification of BLA GRABDA Z-scored signal AUC during the 2-s period following cue onset or reward delivery compared to the equivalent baseline period immediately prior to cue onset. Two-way RM ANOVA Event: F(1.85, 11.07) = 4.90, P = 0.03; Training: F(2.34, 14.03) = 1.13, P = 0.36; Training × Event: F(3.45, 20.99) = 0.59, P = 0.65. *P < 0.05, relative to preCue baseline, Bonferroni correction. N = 7, 4 male rats. (c) GRABDA fluorescence changes (Z-score) in response to cue presentation and reward delivery across each of the 8 Pavlovian conditioning sessions. (d) Quantification of BLA GRABDA Z-scored signal AUC during the 1.5-s period following cue onset, cue offset (trace interval), or reward delivery compared to the equivalent baseline period immediately prior to cue onset. Two-way RM ANOVA Event: F(2.06, 14.40) = 13.24, P = 0.0005; Training: F(3.62, 25.33) = 2.43, P = 0.08; Training × Event: F(3.60, 25.17) = 2.60, P = 0.07. *P < 0.05, **P < 0.01, ***P < 0.001, relative to preCue baseline, Bonferroni correction. (GRABDA2h: N = 3, 2 male; GRABDA2m: N = 5, 3 male). The slope of the BLA dopamine reward response across training was significantly negative (β = −0.13, confidence interval −0.25 – −0.007; F(1,62) = 4.49, P = 0.04) and significantly different (F(1,124) = 13.33, P = 0.0004) from the slope of the BLA dopamine cue-onset response across training, which was significantly positive (β = 0.13, confidence interval 0.06 – 0.20; F(1,62) = 13.53, P = 0.0005). Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons.
Extended Data Figure 3: GRABDA responses to reward collection.

(a) GRABDA2h fluorescence changes (Z-score) in response to reward collection across Pavlovian long-delay conditioning. Data from the last six sessions were averaged across 2-session bins (3/4, 5/6, and 7/8). N = 9, 5 male rats. (b) GRABDA fluorescence changes (Z-score) in response to reward collection across Pavlovian trace conditioning. Data from the last six sessions were averaged across 2-session bins (3/4, 5/6, and 7/8). GRABDA2h: N = 4, 3 male; GRABDA2m: N = 6, 3 male. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points.
Extended Data Figure 4: GRABDA responses to unpredicted rewarding and aversive events.

(a) GRABDA fluorescence changes (Z-score) in response to unpredicted delivery of 1, 2, or 3 food pellets. (b) Quantification of BLA GRABDA Z-scored signal AUC during the 20-s period following pellet delivery. Two-way RM ANOVA Reward period × Magnitude: F(1.92, 11.50) = 12.46, P = 0.001; Magnitude: F(1.94, 11.66) = 11.04, P = 0.002; Reward: F(1, 6) = 7.86, P = 0.03. GRABDA2h: N = 2, 2 male; GRABDA2m: N = 5, 3 male. (c) GRABDA fluorescence changes (Z-score) in response to unpredicted puff of air to the face. (d) Quantification of BLA GRABDA Z-scored trace AUC during the 5-s period following airpuff delivery relative to 5-s preAirpuff baseline. Two-tailed paired sample t-test t(7) = 5.88, P = 0.0006. GRABDA2h: N = 2, 2 male; GRABDA2m: N = 6, 3 male. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, ***P<0.001 Bonferroni-corrected post-hoc comparisons.
Extended Data Figure 5. Inhibition of VTADA→BLA projections does not disrupt reward collection during Pavlovian conditioning.

There was no effect of optical inhibition of VTADA→BLA projections at reward delivery on collection of the food outcomes. (a) Entries into the food-delivery port during the 30-s periods before and after cue presentation during Pavlovian long-delay conditioning. Rats entered the food-delivery port during the 30-s postcue/reward-delivery period more than the preCue baseline period and similarly between groups. Training × Period: F(4.94,93.85) = 3.00, P = 0.02; Training: F(3.13, 59.48) = 8.51, P < 0.0001; Period: F(1,19) = 72.60, P < 0.0001; Virus: F(1,19) = 0.47, P = 0.50; Training × Virus: F(7,133) = 0.65, P = 0.72; Virus × Period: F(1,19) = 0.87, P = 0.36; Training × Virus × Period: F(7,133) = 0.71, P = 0.66. ArchT, N = 11, 6 male rats; tdTomato, N = 10, 5 male rats. (b) Percent time spent in the food-delivery port during the 10-s preCue baseline and 10-s postCue offset (including trace interval and reward delivery period) periods during Pavlovian trace conditioning. Rats entered the food-delivery port during the 10-s postCue period more than the preCue period and similarly between groups. Training × Period: F(1.93,19.27) = 9.68, P = 0.001; Training: F(2.59, 25.88) = 9.28, P = 0.0004; Period: F(1,10) = 138.50, P < 0.0001; Virus: F(1,10) = 14.94, P = 0.003; Training × Virus: F(4, 40) = 1.35, P = 0.27; Virus × Period: F(1,10) = 1.37, P = 0.27; Training × Virus × Period: F(4, 40) = 0.05, P = 0.996. ArchT, N = 5, 4 male rats; Control, N = 7, 4 male rats (3 WT/cre-dependent ArchT; 4 Th-cre/cre-dependent tdTomato). Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons.
Extended Data Figure 6. Optical inhibition of VTADA→BLA projections throughout cue and reward during learning attenuates the encoding of identity-specific cue-reward memories.

We cre-dependently expressed ArchT bilaterally in VTADA neurons of male and female Th-cre rats and implanted optical fibers bilaterally over BLA. (a) Bottom: Representative fluorescent image of cre-dependent ArchT-tdTomato expression in VTA cell bodies with coexpression of Th in Th-Cre rats. Middle: Strategy for bilateral optogenetic inhibition of VTADA axons and terminals in the BLA of Th-cre rats. Top: Representative image of fiber placement in the vicinity of immunofluorescent ArchT-tdTomato-expressing VTADA axons and terminals in the BLA. (b) Schematic representation of cre-dependent ArchT-tdTomato expression in VTA and (c) placement of optical fiber tips in BLA for all subjects. For half of the control group, we expressed cre-dependent tdTomato in the VTA of Th-cre male and female rats. For the other half, wildtype rats were infused with cre-dependent ArchT (which did not express owing to the lack of cre recombinase) into the VTA. Both groups received bilateral optical fibers above the BLA. Thus, we control for light delivery, viral expression, and genotype. There were no significant behavioral differences between each type of control (lowest P: F(1, 6) = 1.61, P = 0.25). (d) Procedure. A, action (left or right lever press); CS, 30-s conditioned stimulus (aka, “cue”, white noise or click) followed immediately by reward outcome (O, sucrose solution or grain pellet). (e) Rats first received 11 sessions of instrumental conditioning, without manipulation, in which one of two different lever-press actions each earned one of two distinct food rewards (e.g., left press→sucrose/right press→pellets). Lever-press rate averaged across levers and across the final 2 instrumental conditioning sessions. Two-tailed independent sample t-test t(13) = 1.20, P = 0.25. (f) Rats then received Pavlovian conditioning. 
During each of the 8 Pavlovian conditioning sessions, each of 2 distinct, 30-s, auditory cues was presented 8 times and terminated in the delivery of one of the food rewards (e.g., white noise→sucrose/click→pellets). VTADA→BLA projections were optically inhibited (532 nm, 10 mW, 33 s) during the entirety of each cue-reward period. Light turned on at the onset of each cue and off 3 s following reward delivery. Optical inhibition of VTADA→BLA projections through the cue and reward period did not disrupt development of a Pavlovian conditional goal-approach response. Food-port entry rate during the cue relative to the preCue baseline period, averaged across trials and across the 2 cues for each Pavlovian conditioning session. Thin lines represent individual subjects. Three-way RM ANOVA Training × CS: F(3.30, 42.87) = 20.69, P < 0.0001; CS: F(1, 13) = 295.60, P < 0.0001; Training: F(3.03, 39.42) = 4.13, P = 0.01; Virus: F(1,13) = 1.61, P = 0.23; Training × Virus: F(7,91) = 0.37, P = 0.92; Virus × CS: F(1,13) = 3.05, P = 0.10; Training × Virus × CS: F(7,91) = 2.17, P = 0.04. By the end of training both groups showed similar elevation in food-port approach during the cues. (g-i) We next gave subjects an outcome-specific Pavlovian-to-instrumental transfer (PIT) test, without manipulation. Controls learned the identity-specific cue-reward memories as evidenced by their ability to use the cues to selectively elevate pressing on the lever associated with the same outcome as predicted by the cue. Conversely, the cues were not capable of guiding lever-press choice in the group for which VTADA→BLA projections were inhibited during Pavlovian conditioning. Rather, for these subjects, the cues caused a general increase in pressing across both levers. 
(g) Lever-press rates during the preCue baseline periods compared to press rates during the cue periods separated for presses on the lever that, in training, delivered the same outcome as predicted by the cue (Same) and pressing on the other available lever (Different). Three-way RM ANOVA Virus × Lever × Cue: F(1, 13) = 7.35, P =0.02; Virus: F(1, 13) = 4.59, P = 0.05; Lever: F(1, 13) = 5.76, P = 0.03; Cue: F(1, 13) = 58.87, P < 0.0001; Virus × Lever: F(1, 13) = 1.91, P = 0.19; Virus × Cue: F(1, 13) = 12.00, P = 0.004; Lever × Cue period: F(1, 13) = 7.56, P = 0.02. *P<0.05, **P < 0.01, planned comparisons cue same presses v. preCue same presses and cue different presses v. preCue different presses. Inhibition of VTADA→BLA projections during cue-reward learning prevents subjects from learning identity-specific cue-reward memories, but does not prevent the assignment of general incentive properties to the cues that supports non-discriminate cue-induced motivation. (h) Elevation in lever presses on the Same lever [(Same lever presses during cue)/(Same presses during cue + Same presses during preCue)], relative to the elevation in pressing on the Different lever [(Different lever presses during cue)/(Different presses during cue + Different presses during preCue)], averaged across cues during the PIT test. Two-way RM ANOVA Virus: F(1, 13) = 2.21, P = 0.16; Lever: F(1, 13) = 1.67, P = 0.22; Virus × Lever: F(1, 13) = 1.14, P = 0.30. (i) As in training, during the PIT test the conditional goal-approach response was similar between groups, further indicating that even longer duration inhibition of VTADA→BLA projections during cue-reward learning does not disrupt development of conditional responses. Food-port entry rate during the cues relative to the preCue baseline periods, averaged across cues during the PIT test. Two-way RM ANOVA Cue: F(1, 13) = 44.71, P < 0.0001; Virus: F(1, 13) = 0.08, P = 0.79; Virus × Cue: F(1, 13) = 0.61, P = 0.45. 
*P < 0.05, **P < 0.01, ***P < 0.001, Bonferroni correction. ArchT, N = 7, 4 male rats; Control, N = 8 (Th-cre/cre-dependent tdTomato: N = 4, 2 male rats; wildtype/cre-dependent ArchT: N = 4, 2 male rats). Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons. These data confirm that VTADA→BLA projections are needed to link the identifying details of the reward to a predictive cue, but not to reinforce a conditional response or to assign general incentive properties to the cue to support general motivation.
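The elevation ratio defined in panel (h) can be written out as a short Python sketch. The function name, the example press counts, and the handling of a subject with no presses in either period are illustrative assumptions, not details taken from the authors' code.

```python
def elevation_ratio(cue_presses, precue_presses):
    """Cue presses / (cue presses + preCue presses) for one lever.

    A value of 0.5 means the cue did not change pressing relative to the
    preCue baseline; values above 0.5 mean the cue elevated pressing.
    Returning None when there are no presses at all is an assumption;
    the legend does not say how such cases were handled.
    """
    total = cue_presses + precue_presses
    if total == 0:
        return None
    return cue_presses / total


# Hypothetical counts for one subject, averaged across cues.
same_ratio = elevation_ratio(cue_presses=12, precue_presses=4)       # 0.75
different_ratio = elevation_ratio(cue_presses=5, precue_presses=5)   # 0.5
```

In this toy case the cue elevates pressing only on the Same lever, the pattern expected when the cue retrieves an identity-specific reward memory. Similar elevation on both levers would instead indicate the non-discriminate, general increase in pressing described for the inhibited group.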
Extended Data Figure 7. Stimulation of VTADA→BLA projections does not affect reward collection during compound conditioning.

There was no effect of optical stimulation of VTADA→BLA projections paired with reward delivery on collection of the food outcomes. Rats entered the food-delivery port during the 30-s postCue/reward-delivery period more than the preCue baseline period and similarly between groups. Three-way RM ANOVA Period: F(1, 22) = 46.80, P < 0.0001; Training: F(1.50, 32.90) = 3.70, P = 0.047; Virus: F(1, 22) = 1.89, P = 0.18; Training × Virus: F(3, 66) = 1.48, P = 0.23; Training × Period: F(2.55, 56.04) = 0.22, P = 0.85; Virus × Period: F(1, 22) = 0.04, P = 0.84; Training × Virus × Period: F(3, 66) = 0.51, P = 0.68. *P < 0.05, **P < 0.01 relative to preCue baseline, Bonferroni correction. ChR2, N = 11, 6 male rats; eYFP, N = 13, 6 male rats. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. *P<0.05, **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons.
Extended Data Figure 8. Stimulation of VTADA→BLA projections is not reinforcing.

To assess the reinforcing properties of VTADA→BLA activation, rats were given 2 sessions of intracranial self-stimulation (ICSS) in a context different from that of prior conditioning. Nose pokes in the active port triggered 1-s blue light delivery (473 nm; 10 mW; 25 ms pulse width; 20 Hz). Data show total active nose pokes compared to inactive nose pokes across 2, 1-hr ICSS sessions. Activation of VTADA→BLA projections was not reinforcing. Rats expressing ChR2 showed similar levels of active nose pokes as the eYFP control group in the first session and this decreased to the level of the inactive nose pokes in the second session. Three-way RM ANOVA Session × Virus × Nose poke: F(1, 22) = 5.00, P = 0.04; Virus × Nose poke: F(1, 22) = 5.18, P = 0.03; Session × Virus: F(1,22) = 5.18, P = 0.03; Session × Nose poke: F(1, 22) = 1.24, P = 0.28; Session: F(1, 22) = 3.05, P = 0.09; Virus: F(1, 22) = 1.94, P = 0.18; Nose poke: F(1, 22) = 54.66, P < 0.0001. Elevated active v. inactive port nose poking in both the eYFP and ChR2 groups could have resulted from the prior association formed between blue light and reward delivery during compound conditioning. If true, then this could have extinguished by the second session in the ChR2 group, potentially indicating that VTADA→BLA projection activity during either initial learning or online during the ICSS session may contribute to the reward expectation and/or learning processes that contribute to extinction. Alternatively, the nose poking in both groups could reflect salience of the light delivery, which could habituate more quickly in the ChR2 group. ChR2, N = 11, 6 male rats; eYFP, N = 13, 6 male rats. Data presented as trial-averaged, between-subject mean ± s.e.m. with individual data points. **P<0.01, ***P<0.001 Bonferroni-corrected post-hoc comparisons.
Supplementary Material
ACKNOWLEDGEMENTS
This research was supported by NIH grants DA035443 and MH106972 (KMW), NIH grant DA057084 (KMW & MJS), NSF GRFP (ACS), NSF CAREER 2143910 (MJS), the Staglin Center for Behavior and Brain Sciences, and the Wendell Jeffrey and Bernice Wenzel Term Chair in Behavioral Neuroscience to KMW.
Footnotes
COMPETING FINANCIAL INTERESTS
The authors declare no biomedical financial interests or potential conflicts of interest.
Code availability
Custom-written MATLAB code is available from the corresponding author upon request and the basic code is available via Dryad (https://doi.org/10.5068/D1109S).
Data availability
All data that support the findings of this study are available as supplemental files.
REFERENCES
1. Steinberg EE, et al. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci 16, 966–973 (2013).
2. Schultz W, Dayan P & Montague PR A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
3. Eshel N, Tian J, Bukwich M & Uchida N Dopamine neurons share common response function for reward prediction error. Nat Neurosci 19, 479–486 (2016).
4. Waelti P, Dickinson A & Schultz W Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001).
5. Schultz W Predictive reward signal of dopamine neurons. J Neurophysiol 80, 1–27 (1998).
6. Montague PR, Dayan P & Sejnowski TJ A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 16, 1936–1947 (1996).
7. Delamater AR On the nature of CS and US representations in Pavlovian learning. Learn Behav 40, 1–23 (2012).
8. Fanselow MS & Wassum KM The Origins and Organization of Vertebrate Pavlovian Conditioning. Cold Spring Harb Perspect Biol (2015).
9. Tolman EC Cognitive maps in rats and men. Psychol Rev 55, 189–208 (1948).
10. Sharpe MJ, et al. Dopamine transients are sufficient and necessary for acquisition of model-based associations. Nature Neuroscience 20, 735 (2017).
11. Sharp ME, Foerde K, Daw ND & Shohamy D Dopamine selectively remediates ‘model-based’ reward learning: a computational approach. Brain 139, 355–364 (2015).
12. Stalnaker TA, et al. Dopamine neuron ensembles signal the content of sensory prediction errors. Elife 8 (2019).
13. Howard JD & Kahnt T Identity prediction errors in the human midbrain update reward-identity expectations in the orbitofrontal cortex. Nature communications 9, 1611 (2018).
14. Keiflin R, Pribut HJ, Shah NB & Janak PH Ventral Tegmental Dopamine Neurons Participate in Reward Identity Predictions. Curr Biol 29, 93–103.e103 (2019).
15. Chang CY, Gardner M, Di Tillio MG & Schoenbaum G Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features. Curr Biol 27, 3480–3486.e3483 (2017).
16. Wunderlich K, Smittenaar P & Dolan RJ Dopamine enhances model-based over model-free choice behavior. Neuron 75, 418–424 (2012).
17. Seitz BM, Hoang IB, DiFazio LE, Blaisdell AP & Sharpe MJ Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner. Curr Biol 32, 3210–3218.e3213 (2022).
18. Langdon AJ, Sharpe MJ, Schoenbaum G & Niv Y Model-based predictions for dopamine. Curr Opin Neurobiol 49, 1–7 (2018).
19. Gardner MPH, Schoenbaum G & Gershman SJ Rethinking dopamine as generalized prediction error. Proc Biol Sci 285 (2018).
20. Nasser HM, Calu DJ, Schoenbaum G & Sharpe MJ The Dopamine Prediction Error: Contributions to Associative Models of Reward Learning. Front Psychol 8, 244 (2017).
21. Keiflin R & Janak PH Error-Driven Learning: Dopamine Signals More Than Value-Based Errors. Curr Biol 27, R1321–R1324 (2017).
22. Brinley-Reed M & McDonald AJ Evidence that dopaminergic axons provide a dense innervation of specific neuronal subpopulations in the rat basolateral amygdala. Brain Res 850, 127–135 (1999).
23. Beier KT, et al. Circuit Architecture of VTA Dopamine Neurons Revealed by Systematic Input-Output Mapping. Cell 162, 622–634 (2015).
24. Sias A, et al. A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories. eLife 10 (2021).
25. Colwill RM & Motzkin DK Encoding of the unconditioned stimulus in Pavlovian conditioning. Animal Learning & Behavior 22, 384–394 (1994).
26. Rescorla RA Preservation of Pavlovian associations through extinction. The Quarterly Journal of Experimental Psychology: Section B 49.3, 245–258 (1996).
27. Costa KM, et al. The role of the orbitofrontal cortex in creating cognitive maps. bioRxiv, 2022.2001.2025.477716 (2022).
28. Lutas A, et al. State-specific gating of salient cues by midbrain dopaminergic input to basal amygdala. Nat Neurosci 22, 1820–1833 (2019).
29. Corbit LH & Balleine BW Learning and Motivational Processes Contributing to Pavlovian-Instrumental Transfer and Their Neural Bases: Dopamine and Beyond. Curr Top Behav Neurosci (2016).
30. Holland PC & Straub JJ Differential effects of two ways of devaluing the unconditioned stimulus after Pavlovian appetitive conditioning. J Exp Psychol Anim Behav Process 5, 65–78 (1979).
31. Delamater AR & Oakeshott S Learning about multiple attributes of reward in Pavlovian conditioning. Ann N Y Acad Sci 1104, 1–20 (2007).
32. Kamin L Predictability, surprise, attention, and conditioning. in SYMP. ON PUNISHMENT (1967).
33. Rescorla RA Learning about qualitatively different outcomes during a blocking procedure. Animal Learning & Behavior 27, 140–151 (1999).
34. Coddington LT & Dudman JT The timing of action determines reward prediction signals in identified midbrain dopamine neurons. Nat Neurosci 21, 1563–1573 (2018).
35. Niv Y & Schoenbaum G Dialogues on prediction errors. Trends Cogn Sci 12, 265–272 (2008).
36. Schultz W Dopamine reward prediction-error signalling: a two-component response. Nat Rev Neurosci 17, 183–195 (2016).
37. Kutlu MG, et al. Dopamine release in the nucleus accumbens core signals perceived saliency. Curr Biol 31, 4748–4761.e4748 (2021).
38. Esber GR, et al. Attention-related Pearce-Kaye-Hall signals in basolateral amygdala require the midbrain dopaminergic system. Biol Psychiatry 72, 1012–1019 (2012).
39. Seitz BM, Blaisdell AP & Sharpe MJ Higher-Order Conditioning and Dopamine: Charting a Path Forward. Front Behav Neurosci 15, 745388 (2021).
40. Berke JD What does dopamine mean? Nature Neuroscience 21, 787–793 (2018).
41. Takahashi YK, et al. Dopamine Neurons Respond to Errors in the Prediction of Sensory Features of Expected Rewards. Neuron 95, 1395–1405.e1393 (2017).
42. Stalnaker T, et al. Dopamine neuron ensembles signal the content of sensory prediction errors. BioRxiv, 723908 (2019).
43. Corbit LH & Balleine BW Double dissociation of basolateral and central amygdala lesions on the general and outcome-specific forms of pavlovian-instrumental transfer. J Neurosci 25, 962–970 (2005).
44. Kröner S, Rosenkranz JA, Grace AA & Barrionuevo G Dopamine modulates excitability of basolateral amygdala neurons in vitro. J Neurophysiol 93, 1598–1610 (2005).
45. Lorétan K, Bissière S & Lüthi A Dopaminergic modulation of spontaneous inhibitory network activity in the lateral amygdala. Neuropharmacology 47, 631–639 (2004).
46. Bissière S, Humeau Y & Lüthi A Dopamine gates LTP induction in lateral amygdala by suppressing feedforward inhibition. Nat Neurosci 6, 587–592 (2003).
47. Vander Weele CM, et al. Dopamine enhances signal-to-noise ratio in cortical-brainstem encoding of aversive stimuli. Nature 563, 397–401 (2018).
48. Lutas A, Fernando K, Zhang SX, Sambangi A & Andermann ML History-dependent dopamine release increases cAMP levels in most basal amygdala glutamatergic neurons to control learning. Cell Rep 38, 110297 (2022).
49. Rosenkranz JA & Grace AA Dopamine-mediated modulation of odour-evoked amygdala potentials during pavlovian conditioning. Nature 417, 282–287 (2002).
50. Li C & Rainnie DG Bidirectional regulation of synaptic plasticity in the basolateral amygdala induced by the D1-like family of dopamine receptors and group II metabotropic glutamate receptors. J Physiol 592, 4329–4351 (2014).
51. Speranza L, di Porzio U, Viggiano D, de Donato A & Volpicelli F Dopamine: The Neuromodulator of Long-Term Synaptic Plasticity, Reward and Movement Control. Cells 10 (2021).
52. Liu J, et al. Neural Coding of Appetitive Food Experiences in the Amygdala. Neurobiol Learn Mem 155, 261–275 (2018).
53. Courtin J, et al. A neuronal mechanism for motivational control of behavior. Science 375, eabg7277 (2022).
54. Malvaez M, Shieh C, Murphy MD, Greenfield VY & Wassum KM Distinct cortical–amygdala projections drive reward value encoding and retrieval. Nature Neuroscience (2019).
55. Saunders BT, Richard JM, Margolis EB & Janak PH Dopamine neurons create Pavlovian conditioned stimuli with circuit-defined motivational properties. Nat Neurosci 21, 1072–1083 (2018).
56. Paxinos G & Watson C The rat brain in stereotaxic coordinates (Academic Press, 1998).
57. Collins AL, et al. Nucleus Accumbens Cholinergic Interneurons Oppose Cue-Motivated Behavior. Biol Psychiatry (2019).
58. Lichtenberg NT, et al. The medial orbitofrontal cortex - basolateral amygdala circuit regulates the influence of reward cues on adaptive behavior and choice. J Neurosci (2021).
59. Malvaez M, et al. Basolateral amygdala rapid glutamate release encodes an outcome-specific representation vital for reward-predictive cues to selectively invigorate reward-seeking actions. Sci Rep 5, 12511 (2015).
60. Lopes G, et al. Bonsai: an event-based framework for processing and controlling data streams. Front Neuroinform 9, 7 (2015).
61. Morel C, et al. Midbrain projection to the basolateral amygdala encodes anxiety-like but not depression-like behaviors. Nat Commun 13, 1532 (2022).
62. Schmider E, Ziegler M, Danay E, Beyer L & Bühner M Is it really robust? Reinvestigating the robustness of ANOVA against violations of the normal distribution assumption. Methodology: European journal of research methods for the behavioral & social sciences 6, 147–151 (2010).
63. Knief U & Forstmeier W Violating the normality assumption may be the lesser of two evils. Behav Res Methods 53, 2576–2590 (2021).
64. Levin JR, Serlin RC & Seaman MA A controlled powerful multiple-comparison strategy for several situations. Psychological Bulletin 115, 153–159 (1994).
65. Lichtenberg NT & Wassum KM Amygdala mu-opioid receptors mediate the motivating influence of cue-triggered reward expectations. Eur J Neurosci (2016).
66. Lichtenberg NT, et al. Basolateral amygdala to orbitofrontal cortex projections enable cue-triggered reward expectations. J Neurosci (2017).
67. Chen TW, et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013).
68. Schoenbaum G, Chiba AA & Gallagher M Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci 1, 155–159 (1998).
69. Sugase-Miyamoto Y & Richmond BJ Neuronal signals in the monkey basolateral amygdala during reward schedules. J Neurosci 25, 11071–11083 (2005).
70. Fontanini A, Grossman SE, Figueroa JA & Katz DB Distinct subtypes of basolateral amygdala taste neurons reflect palatability and reward. J Neurosci 29, 2486–2495 (2009).
71. Roesch MR, Calu DJ, Esber GR & Schoenbaum G Neural correlates of variations in event processing during learning in basolateral amygdala. J Neurosci 30, 2464–2471 (2010).
