Author manuscript; available in PMC: 2025 Oct 1.
Published in final edited form as: Nature. 2025 Feb 19;640(8059):722–731. doi: 10.1038/s41586-024-08580-w

A dual-pathway architecture for stress to disrupt agency and promote habit

Jacqueline R Giovanniello 1, Natalie Paredes 1, Anna Wiener 1, Kathia Ramírez-Armenta 1, Chukwuebuka Oragwam 1, Hanniel O Uwadia 1, Abigail L Yu 2, Kayla Lim 3, Jenna S Pimenta 1, Gabriela E Vilchez 1, Gift Nnamdi 1, Alicia Wang 1, Megha Sehgal 1, Fernando MCV Reis 1, Ana C Sias 1, Alcino J Silva 1,4,5, Avishek Adhikari 1,4,5, Melissa Malvaez 1, Kate M Wassum 1,4,5
PMCID: PMC12011321  NIHMSID: NIHMS2070292  PMID: 39972126

Abstract

Chronic stress can change how we learn and, thus, how we make decisions1–5. Here we investigated the neuronal circuit mechanisms that enable this. Using a multifaceted systems neuroscience approach in male and female mice, we reveal a dual pathway, amygdala-striatal neuronal circuit architecture by which a recent history of chronic stress disrupts the action-outcome learning underlying adaptive agency and promotes the formation of inflexible habits. We found that the basolateral amygdala projection to the dorsomedial striatum is activated by rewarding events to support the action-outcome learning needed for flexible, goal-directed decision making. Chronic stress attenuates this to disrupt action-outcome learning and, therefore, agency. Conversely, the central amygdala projection to the dorsomedial striatum mediates habit formation. Following stress this pathway is progressively recruited to learning to promote the premature formation of inflexible habits. Thus, stress exerts opposing effects on two amygdala-striatal pathways to disrupt agency and promote habit. These data provide neuronal circuit insights into how chronic stress shapes learning and decision making, and help understand how stress can lead to the disrupted decision making and pathological habits that characterize substance use disorders and mental health conditions.

Keywords: learning, decision making, instrumental conditioning, basolateral amygdala, central amygdala, reward, striatum


When making a decision, we can use what we have learned about our actions and their outcomes to prospectively evaluate the consequences of our potential choices6. This goal-directed strategy supports our agency. It allows us to choose actions that cause desirable consequences and avoid those that lead to outcomes that are not currently beneficial. This strategy is, thus, highly flexible. Yet we don’t always think about the consequences of our behavior. Often this is fine. Such habits allow us to efficiently execute routine behaviors based on past success, without forethought of their consequences6,7. The brain balances goal-directed and habitual control to allow behavior to be adaptive when needed, but efficient when appropriate8. Disrupted agency and overreliance on habit can cause inadequate consideration of consequences, disrupted decision making, inflexible behavior, and a lower threshold for compulsivity9–11. This can contribute to cognitive symptoms in numerous diseases, including substance use disorder12–14, obsessive-compulsive disorder15, obesity16, schizophrenia17,18, depression18,19, anxiety20, and autism21. Chronic stress tips the balance of behavioral control towards habit1–5. Stress can change how we learn and, thus, how we make decisions, by attenuating agency and promoting the formation of inflexible habits. Because stress is a major predisposing factor for addiction and other psychiatric conditions22–25, understanding how stress promotes habit will illuminate an avenue of vulnerability for these conditions. Yet, despite its importance for understanding adaptive and maladaptive behavior, little is known of the neuronal circuits that support the learning underlying agency and habits and even less of those that enable stress to potentiate habit formation.

Amygdala-striatal projections are potential candidate pathways by which stress could influence learning and behavioral control strategy. The dorsomedial striatum (DMS) is an evolutionarily conserved hub for the action-outcome learning that supports goal-directed decision making26,27. Suppression of DMS activity attenuates such agency and promotes inflexible habits28. The basolateral amygdala (BLA) is also needed for goal-directed behavior29. It sends a direct excitatory projection to the DMS30,31. Little is known of the function of the BLA→DMS pathway, though it is well-positioned to facilitate the action-outcome learning that supports agency. Conversely, the central amygdala (CeA) has been implicated in habit32. It sends a direct, likely inhibitory33, projection to the striatum30,34,35 and is, thus, poised to oppose striatal activity. Both the BLA and CeA are highly implicated in stress processing36,37. Therefore, here we investigated the function of the BLA→DMS and CeA→DMS pathways in action-outcome and habit learning and asked whether chronic stress acts via these amygdala-striatal pathways to attenuate agency and promote the formation of inflexible habits.

RESULTS

Stress disrupts agency and promotes habit

We first designed a behavioral procedure to model stress-potentiated habit formation in male and female mice (Figure 1a). Mice received 14 consecutive days of chronic mild unpredictable stress (“stress”) including daily, pseudorandom, exposure to 2 of 6 stressors: damp bedding (4–16 hr), tilted cage (4–16 hr), white noise (80 dB; 2–16 hr), continuous illumination during the dark phase (12 hr), physical restraint (2 hr), and footshock (0.7-mA, 1-s, 5 shocks/10 min). This models aspects of the repeated and varied nature of stress experienced by humans, including uncontrollable physical aversive events, disrupted sleep, and poor environmental conditions. Controls received equated handling. Demonstrating efficacy, serum corticosterone was higher (Figure 1b; see Supplemental Table 1 for full statistical reporting) and body weight was lower (Figure 1c) in stressed mice than controls. This procedure was intentionally mild to model low-level, chronic stress. Accordingly, it did not cause major anxiety- or depression-like phenotypes in classic assays of such behavior (Extended Data Figure 1). Twenty-four hours after the last stressor, mice were trained to lever press to earn a food-pellet reward. We used 4 sessions of training on a random-ratio schedule of reinforcement in which a variable number of presses (average = 1–10, escalated each training session) was required to earn each reward. The tight press-reward relationship of this regime encourages action-outcome learning and, together with the short training duration, the use of such knowledge to support agency and goal-directed decision making38. Mice were food-deprived and body weight did not significantly differ between control and stressed mice during training (Supplemental Table 2). Both control and stressed mice similarly acquired the instrumental behavior (Figure 1d). Thus, stress did not cause general learning, motivational, or locomotor impairments.
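The random-ratio schedule described above can be illustrated with a minimal sketch. It assumes that each press is rewarded independently with probability 1/ratio, so that the number of presses per reward varies around the stated average. The function name, the seeded generator, and the per-press implementation are illustrative assumptions; the text states only that a variable number of presses (average 1 to 10) was required per reward.

```python
import random


def simulate_random_ratio(n_presses, mean_ratio, seed=0):
    """Return indices of rewarded presses under a random-ratio schedule.

    Assumption (not from the source): each press is rewarded independently
    with probability 1 / mean_ratio, which yields a variable number of
    presses per reward averaging `mean_ratio`.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    p_reward = 1.0 / mean_ratio
    return [i for i in range(n_presses) if rng.random() < p_reward]


if __name__ == "__main__":
    # Hypothetical session: 10,000 presses on RR-10.
    rewarded = simulate_random_ratio(10_000, mean_ratio=10)
    print("presses per reward:", 10_000 / len(rewarded))
```

With a ratio of 1 every press is rewarded, which corresponds to the fixed-ratio 1 case mentioned in the Figure 1 caption.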
To evaluate behavioral control strategy, we used the gold-standard outcome-specific devaluation test6,39. Mice were given 90-min, non-contingent access to the food pellet earned during training to induce a sensory-specific satiety rendering that specific food pellet temporarily devalued. Lever pressing was assessed in a 5-min, non-reinforced probe test immediately following the prefeeding. Performance was compared to that following satiation on an alternate food pellet to control for general satiety (Valued state; test order counterbalanced). Both control and stressed mice consumed similar amounts during the prefeed (Supplemental Table 3), indicating that stress did not alter food consumption. Stress also did not affect food pellet discrimination or devaluation efficacy (Supplemental Table 4). If subjects have learned the action-outcome relationship and are using this to support prospective consideration of action consequences for flexible, goal-directed decision making, they will reduce lever pressing when the outcome is devalued. We saw such agency in control subjects (Figure 1e–f; see also Extended Data Figure 2 for data on entries into the food-delivery port). Stressed mice were insensitive to devaluation, indicating disrupted agency. Such lack of consideration of action consequences marks inflexible habits8,26.

Figure 1: Chronic stress disrupts action-outcome learning and potentiates habit formation.

(a) Procedure. Stress, chronic unpredictable mild stress; RR-10, presses earned food pellet rewards on a random-ratio reinforcement schedule prior to devaluation tests. (b) Blood serum corticosterone 24 hr after 14 d of 1 stressor/d, 2 stressors/d, or daily handling (Control). 1-way ANOVA: Stress: F(2, 20)=17.35, P<0.0001. Control N=8 (4 male), 1x stress N=7 (3 male), 2x stress N=8 (4 male). (c) Percent change (Δ) in body weight averaged across the first 10 d of stress. 2-sided t-test: t(14)=4.50, P=0.0005, 95% confidence interval (CI) −6.95 – −2.46. N=8/group (4 male) mice. (d) Training press rate (beginning with the last day of fixed-ratio 1 training). 2-way ANOVA: Training: F(2.12, 95.32)=168.20, P<0.0001. See Supplemental Table 1 for full statistical reporting. (e) Devaluation test press rate. 2-way ANOVA: Stress x Value: F(1, 45)=4.43, P=0.04. (f) Devaluation index [(Devalued condition presses)/(Valued condition presses + Devalued condition presses)]. 2-sided t-test: t(45)=2.99, P=0.005, 95% CI 0.05 – 0.24. Control N=22 (13 male), Stress N=25 (12 male) mice. (g) Procedure. P(Reward|Press)=0.1, presses earned pellets with a probability of 0.1 prior to contingency degradation and test. (h) Training press rate. 2-way ANOVA: Training: F(1.66, 41.39)=211.10, P<0.0001. (i) Press rate during the post-contingency degradation lever-pressing probe test. 2-way ANOVA: Stress x Contingency Degradation Group: F(1, 25)=12.75, P=0.002. Control, Non-degraded N=7 (3 male), Control, Degraded N=7 (3 male), Stress, Non-degraded N=7 (3 male), Stress, Degraded N=8 (4 male) mice. Males=closed circles/solid lines, Females=open circles/dashed lines. Data presented as mean +/− SEM. **P <0.01, ***P<0.001, corrected for multiple comparisons.
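The devaluation index defined in panel (f) can be written out as a short function. This is a sketch of the stated formula only; the function name and the example press counts are hypothetical and are not data from the study. Under this definition an index near 0.5 indicates equal pressing in the two conditions (insensitivity to devaluation), and values below 0.5 indicate reduced pressing for the devalued outcome.

```python
def devaluation_index(valued_presses, devalued_presses):
    """Devalued presses / (Valued presses + Devalued presses).

    Follows the formula given in the Figure 1f caption. Raises an error
    when no presses were made, since the ratio is then undefined.
    """
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("index is undefined when no presses were made")
    return devalued_presses / total


if __name__ == "__main__":
    # Hypothetical press counts, for illustration only.
    print(devaluation_index(valued_presses=30, devalued_presses=10))
    print(devaluation_index(valued_presses=20, devalued_presses=20))
```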

To provide converging evidence that stress disrupts the action-outcome learning that supports agency, we conducted a second experiment, this time assessing behavioral control strategy using the other gold-standard test: contingency degradation6,40 (Figure 1g). Mice received chronic stress or daily handling control prior to being trained to lever press to earn food-pellet rewards. During training, each press earned reward with a probability that became progressively leaner [P(Reward | Press) = 1.0 – 0.1]. Control and stressed mice, again, similarly acquired the instrumental behavior (Figure 1h). Half the subjects in each group received a 20-min contingency degradation session during which lever pressing continued to earn reward with a probability of 0.1, but reward was also delivered non-contingently with the same probability [P(Reward | Press) = 0.1; P(Reward | NoPress) = 0.1]. Thus, reward was no longer contingent on pressing. The other half received a non-degraded control session in which rewards remained contingent on pressing ([P(Reward | Press) = 0.1; P(Reward | NoPress) = 0]; see Extended Data Figure 3 for data from the contingency degradation session). Lever pressing was assessed in a 5-min, non-reinforced probe test the next day. If subjects learned the action-outcome contingency and used it to support their agency, their actions should be sensitive to the change in this contingent relationship, such that they will reduce lever pressing when it is no longer needed to earn reward40. Controls were sensitive to contingency degradation. Stressed mice were not (Figure 1i). Together these data show that a recent history of chronic stress causes an inability to engage one’s agency and flexibly adapt behavior when its consequence is not currently beneficial or when it is no longer required to earn reward. Thus, chronic stress disrupts action-outcome learning to attenuate agency and, instead, causes the premature formation of inflexible habits.
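The logic of the contingency degradation manipulation can be made concrete with a small calculation. The sketch below uses the difference P(Reward | Press) − P(Reward | NoPress) as a summary of how much pressing changes the chance of reward. This difference measure and the function name are assumptions introduced for illustration; the text itself reports only the two conditional probabilities for each session type.

```python
def action_outcome_contingency(p_reward_given_press, p_reward_given_no_press):
    """Difference between the two conditional reward probabilities.

    Assumption (not from the source): contingency is summarized as
    P(Reward | Press) - P(Reward | NoPress). A value of 0 means reward
    is equally likely whether or not the subject presses.
    """
    for p in (p_reward_given_press, p_reward_given_no_press):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_reward_given_press - p_reward_given_no_press


if __name__ == "__main__":
    # Probabilities taken from the session descriptions in the text.
    print("non-degraded:", action_outcome_contingency(0.1, 0.0))
    print("degraded:", action_outcome_contingency(0.1, 0.1))
```

In the degraded session the difference is zero, which is the sense in which reward was no longer contingent on pressing.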

Stress oppositely affects BLA→DMS and CeA→DMS

We next confirmed the existence of direct BLA and CeA projections to dorsal striatum using both anterograde and retrograde tracing. We found that both BLA and CeA directly project to the DMS (Extended Data Figure 4). We then characterized the activity of these BLA→DMS and CeA→DMS pathways during action-outcome learning and asked whether it is influenced by chronic stress. We used fiber photometry to record fluorescent activity of the genetically encoded calcium indicator GCaMP8s expressed using an intersectional approach in BLA or CeA neurons that project to the DMS (Figure 2a–j). Mice received chronic stress or daily handling control prior to being trained to lever press to earn food-pellet rewards on a random-ratio reinforcement schedule (Figure 2c). Both control and stressed mice similarly acquired the instrumental behavior (Figure 2e, k, see also Extended Data Figure 5 for food-port entry data). Fiber photometry (473 nm calcium-dependent, 415 nm isosbestic) recordings were made during each training session. BLA→DMS neurons were robustly activated by earned reward during learning (Figure 2f–i). Thus, the BLA→DMS pathway is active when subjects are able to link the rewarding consequence to their actions, thereby forming the action-outcome knowledge that supports agency. This activity was absent in stressed mice (Figure 2g, i). Chronic stress attenuated the BLA→DMS activity associated with action-outcome learning. Conversely, CeA→DMS neurons were not robustly active during this form of instrumental learning in control subjects, indicating CeA→DMS projection activity is not associated with action-outcome learning. Stress caused the CeA→DMS pathway to be progressively engaged around earned reward with training (Figure 2l–o). The CeA→DMS response to earned reward was long lasting, taking approximately 30 seconds to return to baseline after reward collection (Extended Data Figure 6).
Thus, a recent history of chronic stress causes the CeA→DMS pathway to be recruited to instrumental learning. We detected similar patterns in response to unpredicted rewards in both pathways (Extended Data Figure 6). Both BLA→DMS and CeA→DMS projections were acutely activated by unpredicted aversive events (footshock; Extended Data Figure 6), indicating that neither BLA→DMS nor CeA→DMS bulk activity is valence-specific. These aversive responses were not altered by stress (Extended Data Figure 6), providing a positive control for our ability to detect signal in all groups. Chronic stress did, however, reduce post-shock fear-related BLA→DMS activity, consistent with its effects on reward signals in this pathway. Chronic stress did not alter baseline spontaneous calcium activity in either pathway, indicating it does not generally increase or decrease excitability in these pathways (Extended Data Figure 6). Together these data indicate that a recent history of chronic stress oppositely modulates BLA→DMS and CeA→DMS pathway activity. BLA→DMS projections are normally activated by rewarding events, but stress prevents this learning-related activity and, instead, causes the CeA→DMS pathway to be progressively recruited during learning.

Figure 2: Chronic stress attenuates BLA→DMS activity during action-outcome learning and progressively recruits CeA→DMS activity.

(a) Intersectional BLA→DMS or CeA→DMS fiber photometry calcium imaging approach. (b) Expression and fiber map for all subjects. Adapted from77. (c) Procedure. Stress, chronic unpredictable stress; RR-10, random-ratio reinforcement schedule. (d-i) Fiber photometry recordings of GCaMP8s in BLA→DMS neurons during learning. (d) Images of retro-cre expression in DMS and immunofluorescent staining of cre-dependent GCaMP8s expression and fiber placement in BLA. Scale bars = 200 μm. (e) Training press rate. 2-way ANOVA: Training: F(1.72, 32.66)=81.40, P<0.0001. (f-g) Trial-averaged Z-scored Δf/F BLA→DMS GCaMP8s fluorescence changes aligned to bout-initiating presses (f) and reward collection (g) across training. (h-i) Area under curve (AUC) 3-s prior to initiating presses (h; 2-way ANOVA: Training: F(2.49, 47.38)=0.91, P=0.43) or following reward collection (i; 2-way ANOVA: Stress: F(1, 19)=24.13, P<0.0001). Control N=9 (4 male), Stress N=12 (5 male) mice. (j-o) Fiber photometry recordings of GCaMP8s in CeA→DMS neurons during learning. (j) Immunofluorescent image of retro-cre expression in DMS and cre-dependent GCaMP8s expression and fiber placement in CeA. (k) Training press rate. 2-way ANOVA: Training: F(1.51, 30.23)=65.61, P<0.0001. (l-m) Trial-averaged Z-scored Δf/F CeA→DMS GCaMP8s fluorescence changes aligned to bout-initiating presses (l) and reward collection (m) across training. (n-o) AUC 3-s prior to initiating presses (n; 2-way ANOVA: Stress: F(1, 20)=0.74, P=0.40) or following reward collection (o; 2-way ANOVA: Training x Stress: F(3, 60)=4.51, P=0.006). Control N=11 (6 male), Stress N=11 (4 male) mice. Males=closed circles/solid lines, Females=open circles/dashed lines. Data presented as mean +/− SEM. *P <0.05, **P <0.01, ***P<0.001, corrected for multiple comparisons.
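The quantities named in this caption (Z-scored Δf/F and a 3-s area under the curve) can be sketched as follows. This is one common way to process dual-wavelength photometry data and is offered only as an assumption: the analysis pipeline actually used is not described in this section. The least-squares fit of the 415 nm isosbestic trace to the 473 nm trace, the choice of baseline window, and all function names are illustrative.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def delta_f_over_f(signal_473, control_415):
    """(signal - fitted control) / fitted control, sample by sample.

    Assumption: the isosbestic trace is linearly scaled to the
    calcium-dependent trace before subtraction.
    """
    slope, intercept = fit_line(control_415, signal_473)
    fitted = [slope * c + intercept for c in control_415]
    return [(s - f) / f for s, f in zip(signal_473, fitted)]


def zscore_to_baseline(trace, baseline):
    """Z-score a trace using the mean and SD of a baseline slice."""
    base = trace[baseline]
    mu, sd = mean(base), pstdev(base)
    return [(v - mu) / sd for v in trace]


def area_under_curve(trace, dt):
    """Trapezoidal area for evenly sampled data with spacing dt (s)."""
    return sum((a + b) / 2 * dt for a, b in zip(trace, trace[1:]))
```

For the 3-s windows reported in panels (h-i) and (n-o), `area_under_curve` would be applied to the samples of the Z-scored trace that fall within 3 s before a bout-initiating press or within 3 s after reward collection; the sampling interval `dt` depends on the acquisition rate, which is not given here.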

BLA→DMS mediates agency learning

BLA→DMS projections are activated by rewards to support action-outcome learning for flexible, goal-directed decision making.

BLA→DMS projections are activated by earned rewards. This experience is an opportunity to link the reward to the action that earned it, forming the action-outcome knowledge that supports agency. So, we reasoned that such BLA→DMS activity might be critical for action-outcome learning. If this is true, then inhibiting reward-evoked BLA→DMS activity should suppress action-outcome learning and, thereby, disrupt flexible goal-directed decision making. We tested this by optogenetically inhibiting BLA→DMS projection activity during instrumental learning. We expressed the inhibitory opsin archaerhodopsin (Arch) or fluorophore control in the BLA and implanted optical fibers in the DMS in the vicinity of Arch-expressing BLA axons and terminals (Figure 3a–b). Mice were trained to lever press to earn food-pellet rewards on a random-ratio schedule of reinforcement. We optically (532 nm, 10 mW, 5 s) inhibited BLA terminals in the DMS during each earned reward (Figure 3c). BLA→DMS inhibition did not affect acquisition of the instrumental behavior (Figure 3d; see also Extended Data Figure 7 for food-port entry data). Training was followed by a set of outcome-specific devaluation tests, as above. No manipulation was given at test to allow us to isolate BLA→DMS function in action-outcome learning rather than the expression of such learning during decision making. Controls were sensitive to outcome devaluation, indicating action-outcome learning for goal-directed decision making. Inhibition of BLA→DMS projections during learning caused subsequent insensitivity to outcome devaluation (Figure 3e–f). BLA→DMS inhibition was not inherently rewarding or aversive (Extended Data Figure 8). Thus, BLA→DMS projections are normally activated by rewarding events to enable the action-outcome learning that supports agency.

Figure 3: BLA→DMS mediates action-outcome learning and is suppressed by stress to disrupt agency and promote habit formation.

(a-f) Optogenetic BLA→DMS inactivation at reward during instrumental learning. (a) Optogenetic inhibition approach. (b) Top, immunofluorescent images of Arch expression in BLA and optical fiber tip in the vicinity of Arch-expressing BLA terminals in DMS. Scale bars = 200 μm. Bottom, expression and fiber map for all subjects. (c) Procedure. RR-10, random-ratio reinforcement schedule. (d) Training press rate. 2-way ANOVA: Training: F(1.70, 32.34)=41.26, P<0.0001. (e) Devaluation test press rate. 2-way ANOVA: Stress x Value: F(1, 19)=14.35, P=0.001. (f) Devaluation index [(Devalued presses)/(Valued presses + Devalued presses)]. 2-sided t-test: t(19)=5.03, P<0.0001, 95% CI 0.18 – 0.44. eYFP N=10 (5 male), Arch N=11 (5 male) mice. (g-l) Optogenetic BLA→DMS activation at reward during post-stress learning. (g) Intersectional optogenetic activation approach. (h) Top, immunofluorescent images of retro-cre expression in DMS and cre-dependent ChR2 expression and fiber in BLA. Bottom, expression and fiber map for all subjects. (i) Procedure. Stress, chronic unpredictable stress; RI-30s, random-interval reinforcement schedule. (j) Training press rate. 2-way ANOVA: Training: F(1.95, 64.18)=30.17, P<0.0001. (k) Devaluation test press rate. 3-way ANOVA: Value x Stress x Virus: F(1, 33)=6.74, P=0.01. Control groups, 2-way ANOVA: Value x Virus: F(1, 16)=3.13, P=0.10. Stress groups, 2-way ANOVA: Value x Virus: F(1, 17)=4.23, P=0.05. (l) Devaluation index. 2-way ANOVA: Stress x Virus: F(1, 33)=9.64, P=0.004. Control eYFP N=11 (7 male), Control ChR2 N=7 (4 male), Stress eYFP N=9 (2 male), Stress ChR2 N=10 (3 male) mice. (m-r) Chemogenetic BLA→DMS activation during post-stress learning. (m) Intersectional chemogenetic activation approach. (n) Top, immunofluorescent images of retro-cre expression in DMS and cre-dependent hM3Dq expression in BLA. Bottom, expression map for all subjects. (o) Procedure. CNO, clozapine-N-oxide. (p) Training press rate. 2-way ANOVA: Training: F(2.04, 67.36)=73.32, P<0.0001. (q) Devaluation test press rate. Planned comparisons 2-sided t-test valued v. devalued, Control mCherry: t(11)=2.76, P=0.01, 95% CI 1.20 – 7.97; Control hM3Dq: t(5)=0.89, P=0.38, 95% CI −2.69 – 6.89; Stress mCherry: t(8)=1.25, P=0.22, 95% CI −1.51 – 6.31; Stress hM3Dq: t(9)=2.9, P=0.007, 95% CI 1.57 – 8.99. (r) Devaluation index. 2-way ANOVA: Stress x Virus: F(1, 33)=11.60, P=0.002. Control mCherry N=12 (7 male), Control hM3Dq N=6 (3 male), Stress mCherry N=9 (5 male), Stress hM3Dq N=10 (5 male) mice. Males=closed circles/solid lines, Females=open circles/dashed lines. Data presented as mean +/− SEM. *P<0.05, **P<0.01, ***P<0.001, corrected for multiple comparisons.

BLA→DMS activation restores agency after stress

Stress-induced suppression of BLA→DMS projections disrupts action-outcome learning and enables premature habit formation.

Since BLA→DMS projections are critical for action-outcome learning, we next reasoned that the stress-induced suppression of BLA→DMS activity might disrupt such learning. We tested this by asking whether activating BLA→DMS projections during learning, to counter the effects of stress, is sufficient to restore action-outcome learning and, thus, goal-directed decision making in stressed mice. We did this in two ways. Because chronic stress abolishes reward-evoked BLA→DMS activity during learning, we first used optogenetics to stimulate BLA→DMS projections at the time of earned reward during learning following chronic stress. Using an intersectional approach (Figure 3g), we expressed the excitatory opsin Channelrhodopsin 2 (ChR2) or fluorophore control in DMS-projecting BLA neurons (Figure 3h). Following chronic stress or daily handling control, mice were trained to lever press to earn food-pellet rewards. We used a random-interval schedule of reinforcement in which a variable (average 30-s) period of time had to elapse after an earned reward before a press would earn another reward. Limited training on this regime allows action-outcome learning for goal-directed decision making41. However, the looser action-outcome relationship is more permissive for habits than a ratio reinforcement schedule38,42, thereby making it more difficult to neurobiologically prevent stress-potentiated habit and the results more robust if such an effect were to occur. We optically (473 nm, 10 mW, 20 Hz, 2 s) stimulated DMS-projecting BLA neurons during collection of each earned reward (Figure 3i). Neither stress nor BLA→DMS stimulation significantly altered acquisition of the instrumental behavior (Figure 3j). Training was followed by the outcome-devaluation test, conducted without manipulation.
Whereas controls were sensitive to subsequent outcome devaluation, indicating action-outcome learning and flexible goal-directed decision making, stressed mice were insensitive to devaluation, indicating premature habit formation (Figure 3k–l). Optogenetic activation of BLA→DMS projections during learning restored normal action-outcome learning enabling agency, as evidenced by sensitivity to devaluation, in stressed mice (Figure 3k–l). Thus, activation of BLA→DMS projections during reward learning is sufficient to overcome the effect of prior chronic stress and restore action-outcome learning to enable agency for flexible, goal-directed decision making.
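The random-interval schedule used in this experiment can be sketched in the same spirit. After each earned reward a waiting period is drawn, and the first press after that period has elapsed earns the next reward. The exponential distribution of waiting periods, the timing of the first period from session start, and the function name are assumptions made for illustration; the text states only that the period was variable with an average of 30 s.

```python
import random


def simulate_random_interval(press_times, mean_interval_s=30.0, seed=0):
    """Return the press times (s) that earn reward on a random-interval schedule.

    Assumptions (not from the source): waiting periods are exponentially
    distributed with the given mean, and the first period starts at t = 0.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rate = 1.0 / mean_interval_s
    available_at = rng.expovariate(rate)
    rewarded = []
    for t in sorted(press_times):
        if t >= available_at:
            rewarded.append(t)
            # The next waiting period begins at the earned reward.
            available_at = t + rng.expovariate(rate)
    return rewarded


if __name__ == "__main__":
    # Hypothetical session: one press per second for one hour.
    presses = [float(s) for s in range(3600)]
    print("rewards earned:", len(simulate_random_interval(presses)))
```

Because reward availability depends on elapsed time rather than on the number of presses, pressing faster earns little additional reward, which illustrates the looser action-outcome relationship noted in the text.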

To provide converging evidence, we conducted a second experiment in which we activated the BLA→DMS pathway during post-stress learning using chemogenetics. Using an intersectional approach (Figure 3m), we expressed the excitatory designer receptor human M3 muscarinic receptor (hM3Dq) or fluorophore control in DMS-projecting BLA neurons (Figure 3n). Following chronic stress or daily handling control, mice were trained to lever press to earn food-pellet rewards on a random-interval reinforcement schedule (Figure 3o). Prior to each instrumental training session, mice received the hM3Dq ligand clozapine-N-oxide (CNO; 0.2 mg/kg43,44 i.p.) to activate BLA→DMS projections. Neither stress nor chemogenetic BLA→DMS activation altered instrumental acquisition (Figure 3p). Mice then received devaluation tests. Whereas controls were sensitive to outcome devaluation, stressed mice were, again, insensitive to devaluation (Figure 3q–r). Chemogenetic activation of BLA→DMS projections during learning replicated the effect of optogenetic activation, restoring action-outcome learning to enable goal-directed decision making, as evidenced by sensitivity to devaluation, in stressed mice (Figure 3q–r). Neither optogenetic nor chemogenetic activation of BLA→DMS projections significantly impacted learning in subjects without a history of chronic stress. Behavior was, however, variable in these groups with some marginal evidence of an influence on action-outcome learning, perhaps due to disruption of neurotypical activity. Together these data reveal that BLA→DMS projections are activated by rewards to enable the action-outcome learning that supports flexible, goal-directed decision making and chronic stress attenuates this to disrupt agency and promote premature habit formation.

CeA→DMS mediates habit formation

CeA→DMS projections mediate the formation of routine habits.

The CeA is necessary for habit32. This function may be achieved, at least in part, via its direct inhibitory projection to DMS. It is, therefore, perhaps not surprising that the CeA→DMS pathway is not typically active during action-outcome learning. Rather the CeA is activated by rewards following overtraining45. Therefore, we reasoned that the CeA→DMS pathway might mediate the natural habit formation that occurs for routine behaviors. To test this, we asked whether CeA→DMS projection activity is necessary for habit formation by optogenetically inhibiting CeA→DMS projections at the time of earned reward during learning and overtraining. We expressed the inhibitory opsin Arch or fluorophore control in the CeA and implanted optical fibers in the DMS in the vicinity of Arch-expressing CeA axons and terminals (Figure 4a–c). Mice were trained to lever press to earn food-pellet rewards on a random-interval schedule of reinforcement and were overtrained to promote natural habit formation (Figure 4c). We optically (532 nm, 10 mW, 5 s) inhibited CeA terminals in the DMS during each earned reward (Figure 4c). Training was followed by the devaluation test. No manipulation was given on test to allow us to isolate CeA→DMS function in habit learning from habit expression. Optogenetic CeA→DMS inhibition did not alter acquisition of the instrumental behavior (Figure 4d; see also Extended Data Figure 9 for food-port entry data). It did, however, prevent habit formation. Controls formed routine habits, evidenced by insensitivity to devaluation. Mice for which we inhibited the CeA→DMS pathway during overtraining continued to show flexible goal-directed decision making, as evidenced by sensitivity to devaluation (Figure 4e–f). Thus, the CeA→DMS pathway mediates the natural habit formation that occurs with repeated practice of an instrumental routine.

Figure 4: CeA→DMS mediates habit formation and is recruited by chronic stress to promote premature habit.

(a-f) Optogenetic inactivation of CeA→DMS projections at reward during natural habit formation. (a) Optogenetic inhibition approach. (b) Top, immunofluorescent images of Arch expression in CeA and optical fiber tip in the vicinity of Arch-expressing CeA terminals in the DMS. Scale bars = 200 μm. Bottom, expression and fiber map for all subjects. (c) Procedure. RI-30s, random-interval overtrain reinforcement schedule. (d) Training press rate. 2-way ANOVA: Training: F(1.46, 29.09)=15.69, P=0.0001. (e) Devaluation test press rate. 2-way ANOVA: Virus x Value: F(1, 20)=4.72, P=0.04. (f) Devaluation index [(Devalued presses)/(Valued presses + Devalued presses)]. 2-sided t-test: t(20)=2.80, P=0.01, 95% CI −0.45 – −0.06. eYFP N=11 (3 male), Arch N=11 (7 male) mice. (g-l) Optogenetic CeA→DMS inactivation at reward during post-stress learning. (g) Optogenetic inhibition approach. (h) Top, immunofluorescent images of Arch expression in CeA and optical fiber tip in the vicinity of Arch-expressing CeA terminals in the DMS. Bottom, expression and fiber map for all subjects. (i) Procedure. Stress, chronic unpredictable stress. (j) Training press rate. Training: F(2.15, 68.91)=31.05, P<0.0001. (k) Devaluation test press rate. 3-way ANOVA: Value x Stress x Virus: F(1, 32)=4.14, P=0.05. Control groups, 2-way ANOVA: Value x Virus: F(1, 18)=0.15, P=0.70. Stress groups, 2-way ANOVA: Value x Virus: F(1, 14)=12.88, P=0.003. (l) Devaluation index. 2-way ANOVA: Stress x Virus: F(1, 32)=4.47, P=0.04. Control eYFP N=9 (5 male), Control Arch N=11 (4 male), Stress eYFP N=7 (6 male), Stress Arch N=9 (5 male) mice. (m-r) Chemogenetic CeA→DMS inhibition during post-stress learning. (m) Intersectional chemogenetic inhibition approach. (n) Top, immunofluorescent images of retro-cre expression in DMS. Bottom, expression map for all subjects. (o) Procedure. CNO, clozapine-N-oxide. (p) Training press rate. 2-way ANOVA: Training: F(1.54, 63.31)=21.12, P<0.0001. (q) Devaluation test press rate. Planned comparisons 2-sided t-tests valued v. devalued, Control mCherry: t(11)=4.59, P<0.0001, 95% CI 4.57 – 11.73; Control hM4Di: t(12)=0.73, P=0.46, 95% CI −2.18 – 4.71; Stress mCherry: t(10)=0.47, P=0.64, 95% CI −4.62 – 2.87; Stress hM4Di: t(8)=2.41, P=0.02, 95% CI 0.79 – 9.07. (r) Devaluation index. 2-way ANOVA: Stress x Virus: F(1, 41)=5.99, P=0.02. Control mCherry N=12 (5 male), Control hM4Di N=13 (8 male), Stress mCherry N=11 (5 male), Stress hM4Di N=9 (4 male) mice. (s-x) Optogenetic CeA→DMS stimulation at reward during learning following subthreshold chronic stress. (s) Intersectional optogenetic stimulation approach. (t) Top, images of retro-cre expression in DMS and immunofluorescent staining of cre-dependent ChR2 expression and fibers in CeA. Bottom, expression and fiber map for all subjects. (u) Procedure. Subthreshold stress, 1x daily chronic unpredictable stress; RR-10, random-ratio reinforcement schedule. (v) Training press rate. 2-way ANOVA: Training: F(2.30, 45.90)=71.93, P<0.0001. (w) Devaluation test press rate. 2-way ANOVA: Virus x Value: F(1, 20)=7.40, P=0.01. (x) Devaluation index. 2-sided t-test: t(20)=4.29, P=0.0004, 95% CI 0.19 – 0.55. eYFP N=10 (4 male), ChR2 N=12 (6 male) mice. Males=closed circles/solid lines, Females=open circles/dashed lines. Data presented as mean +/− SEM. ^P=0.069, *P<0.05, **P<0.01, ***P <0.001, corrected for multiple comparisons.

Stress promotes habit via CeA→DMS

Stress-induced recruitment of CeA→DMS projections mediates premature habit formation.

Given that the CeA→DMS pathway mediates habit formation, we next reasoned that the stress-induced recruitment of this pathway to learning may enable stress to promote premature habit formation. If this is true, then preventing the stress-induced increase in CeA→DMS activity during learning should prevent premature habit formation and restore action-outcome learning and, therefore, agency. We tested this in two ways. Because chronic stress engages the CeA→DMS pathway at reward experience during learning, we first optogenetically inhibited CeA→DMS projections at the time of earned reward during learning following stress. We expressed the inhibitory opsin Arch or fluorophore control in the CeA and implanted optical fibers in the DMS (Figure 4gh). Following chronic stress or daily handling control, mice were trained to lever press to earn food-pellet rewards and we optically (532 nm, 10 mW, 5 s) inhibited CeA terminals in the DMS during each earned reward (Figure 4i). We used a random-interval schedule of reinforcement to increase the robustness of the results. Neither stress nor CeA→DMS inhibition altered acquisition of the instrumental behavior (Figure 4j). Training was followed by the devaluation tests, conducted without manipulation. At test, we again found evidence of goal-directed decision making, sensitivity to devaluation, in control subjects and potentiated habit formation, insensitivity to devaluation, in stressed subjects (Figure 4kl). Optogenetic inhibition of CeA→DMS activity at reward during learning restored action-outcome learning to enable goal-directed decision making in stressed mice, as evidenced by sensitivity to devaluation (Figure 4kl). Thus, stress-induced activation of CeA→DMS projections during reward learning is necessary to promote premature habit formation.

To provide converging evidence, we conducted a second experiment in which we chemogenetically inhibited CeA→DMS projections during learning following stress. We used an intersectional approach (Figure 4m) to express the inhibitory designer receptor human M4 muscarinic receptor (hM4Di) or a fluorophore control in DMS-projecting CeA neurons (Figure 4mn). Following chronic stress or daily handling control, mice were trained to lever press to earn food-pellet rewards on a random-interval reinforcement schedule (Figure 4o). Prior to each training session, mice received the hM4Di ligand CNO (2.0 mg/kg46,47 i.p.) to inactivate CeA→DMS projections. Neither stress nor chemogenetic CeA→DMS inactivation altered acquisition of the instrumental behavior (Figure 4p). Chemogenetic inhibition of CeA→DMS projections during learning replicated the effects of optogenetic inhibition, restoring action-outcome learning to enable goal-directed decision making, sensitivity to devaluation, in stressed mice (Figure 4qr). Neither optogenetic nor chemogenetic CeA→DMS inhibition significantly impacted learning or behavioral control strategy in subjects without a history of chronic stress. Together, these data indicate that chronic stress engages CeA→DMS projections during subsequent reward learning experience to promote the premature formation of inflexible habits.

CeA→DMS projection activity is sufficient to promote premature habit formation following subthreshold chronic stress.

We next asked whether CeA→DMS pathway activity at reward during learning is sufficient to promote habit formation. We used an intersectional approach (Figure 4s) to express the excitatory opsin ChR2 or a fluorophore control in DMS-projecting CeA neurons and implanted optic fibers above the CeA (Figure 4t). We first optically (473 nm, 10 mW, 20 Hz, 25-ms pulse width, 2 s) stimulated CeA→DMS neurons with each earned reward during instrumental learning on a random-ratio schedule of reinforcement in mice without a history of chronic stress. This neither affected acquisition of the lever-press behavior, nor the action-outcome learning needed to support flexible, goal-directed decision making during the devaluation test (Extended Data Figure 10). Thus, activation of the CeA→DMS pathway during reward learning experience is not alone sufficient to disrupt action-outcome learning or promote habit formation.

We next reasoned that activation of CeA→DMS projections might be sufficient to tip the balance of behavioral control towards habit in the context of a very mild chronic stress experience. To test this, we repeated the experiment this time in mice with a history of once daily stress for 14 consecutive days (Figure 4su). Again, neither CeA→DMS activation nor stress altered acquisition of the instrumental behavior (Figure 4v). The less frequent chronic stress was itself insufficient to cause premature habit formation. Mice were sensitive to devaluation, indicating preserved action-outcome learning and agency (Figure 4wx). Activation of CeA→DMS projections at reward during learning was sufficient to cause premature habit formation, as evidenced by greater insensitivity to devaluation in subjects that received stimulation relative to those that did not (Figure 4wx). Thus, activation of CeA→DMS projections during learning is sufficient to amplify the effects of prior subthreshold chronic stress to promote habit formation. CeA→DMS stimulation was not inherently rewarding or aversive in either control or stressed subjects (Extended Data Figure 8). Together, these data indicate that chronic stress recruits the CeA→DMS pathway to subsequent learning to promote the premature formation of inflexible habits.

DISCUSSION

These data reveal a dual pathway neuronal circuit architecture by which a recent history of chronic stress shapes learning to disrupt adaptive agency and promote inflexible habits. Both the BLA and CeA send direct projections to the DMS. The BLA→DMS pathway is activated by rewarding events to support the action-outcome learning needed for flexible, goal-directed decision making. Chronic stress attenuates this activity to disrupt action-outcome learning and, therefore, agency. Conversely, the CeA→DMS pathway mediates habit formation. Stress recruits this pathway to learning to promote the premature formation of inflexible habits. Thus, chronic stress disrupts agency and promotes habit formation by flipping the amygdala input to the DMS that supports learning.

Here we provide a model for the function of amygdala-striatal projections. Whereas the BLA→DMS pathway mediates action-outcome learning to support agency, the CeA→DMS pathway mediates the formation of routine habits. BLA→DMS pathway function in action-outcome learning is consistent with evidence that BLA lesion or BLA/DMS disconnection disrupts goal-directed behavior29,48,49. We implicate direct BLA→DMS projections. The data show that this pathway is activated by rewarding events to link those rewards to the actions that earned them to enable the prospective consideration of action consequences needed for flexible decision making. These data do not accord with evidence that BLA→DMS ablation does not disrupt action-outcome learning50. Such ablations may allow compensatory mechanisms that are not possible with temporally-specific manipulation. Unlike the BLA→DMS pathway, the CeA→DMS pathway is not typically activated during action-outcome learning. Rather CeA→DMS projections mediate the natural habit formation that occurs with repeated practice of a routine. This is consistent with evidence that CeA neurons are activated by rewards with overtraining45, that CeA lesion disrupts habit32, and that CeA→DMS projections oppose flexible adjustment of behavior when an action is no longer rewarded35. Unlike valence-processing models of amygdala function51,52, our data indicate that BLA and CeA projections to DMS are unlikely to convey simple positive or negative valence, but rather differentially shape the content of learning. The data support a parallel model53 whereby, via distinct outputs to the DMS, the amygdala actively gates the nature of learning to regulate the balance of behavioral control strategies. An important question opened by this model is how different reward learning experiences, schedules of reinforcement, and training regimes recruit activity in these pathways and how this intersects with stress and other life experiences.

Stressful life events can disrupt one’s agency and promote the formation of inflexible, potentially maladaptive, habits. Indeed, after chronic stress, people become less able to adapt their behavior when its outcome has been devalued15. Using two independent tests, we provide evidence in male and female mice that a recent history of chronic stress disrupts the action-outcome knowledge needed for agency and instead causes the formation of inflexible habits. Habit formation in stressed subjects was premature. Whereas we showed habits formed naturally with overtraining on a random-interval schedule, stressed mice form habits with only a limited amount of such training. Stress disrupts agency and promotes habit regardless of whether behavior is reinforced on the agency-promoting random-ratio schedule or the habit-promoting random-interval schedule. We found that chronic stress disrupts action-outcome learning and promotes habit formation by flipping the activity of BLA and CeA inputs to the DMS.

Chronic stress attenuates reward-learning-related activity in the BLA→DMS pathway to disrupt action-outcome learning and agency and instead recruits activity in the CeA→DMS pathway to promote the formation of inflexible habits. That agency could be rescued by manipulations to oppose these stress effects during only the learning phase indicates that stress influences behavioral control by shaping learning. The stress-induced attenuation of BLA→DMS activity was surprising because the BLA is, generally, hyperactive following chronic stress54–61 (cf. 62). This may suggest that the effect of stress on BLA neurons depends on their projection target. Elevated CeA→DMS activity following stress is consistent with evidence that stress increases CeA activity63–65. Whereas stress attenuated BLA→DMS activity throughout learning, the CeA→DMS pathway was progressively recruited across training in stressed subjects. This could indicate that stress-induced CeA→DMS engagement requires repeated reward learning or reinforcement opportunity. It could also suggest the CeA→DMS pathway is engaged to compensate for the stress-induced attenuation of the BLA→DMS pathway activity needed for action-outcome learning. Indeed, the transition of behavioral control to habit systems requires a shift of behavioral control from BLA to CeA66. Such speculations require further evaluation of amygdala-striatal activity using dual-pathway recordings and manipulations. Interestingly, activation of the CeA→DMS pathway was not sufficient itself to promote habit formation. CeA→DMS activation did, however, tip the balance towards habit following a subthreshold mild chronic stress experience. Thus, stress may prime the CeA→DMS pathway to be recruited during subsequent learning. CeA→DMS activation may work along with a confluence of disruptions, likely to the BLA→DMS pathway, but also to cortical inputs to DMS5,42 to promote habit formation. The CeA can also work indirectly, likely via the midbrain67,68, with the dorsolateral striatum to regulate habit formation32,66. Thus, the CeA may promote habit through both direct and indirect pathways to the striatum. Although evidence from the terminal optogenetic inhibition experiments confirms involvement of direct amygdala projections to DMS, both pathways may collateralize and such collaterals may, too, be involved in learning and affected by stress.

The discoveries here open the door to many important future questions. One is the mechanisms through which chronic stress affects amygdala-striatal activity. That chronic stress occurred before training and did not alter spontaneous activity in either pathway suggests that it may lay down neuroplastic changes in these pathways that become influential during subsequent learning opportunities. How such changes occur is an important question for future research. They likely involve a combination of stress action in the amygdala, perhaps via canonical stress systems such as corticotropin releasing hormone69 and/or kappa/dynorphin70, and stress action at regions upstream to the amygdala. Epigenetic mechanisms may also be involved71. An equally substantial next question is how these pathways influence downstream DMS activity. Indeed, DMS neuronal activity, especially plasticity in dopamine D1 receptor-expressing neurons72, is critical for the action-outcome learning that supports goal-directed decision making and when suppressed promotes inflexible habits26–28. A reasonable speculation is that the excitatory BLA→DMS pathway promotes downstream learning-related activity in DMS to support action-outcome learning and that the inhibitory CeA→DMS pathway dampens such activity to encourage habit formation. Amygdala-striatal inputs may coordinate in this regard with corticostriatal inputs known to be important for supporting action-outcome learning5,42 and susceptible to chronic stress5. Both amygdala subregions and DMS participate in drug-seeking66,73,74 and active-avoidance behavior75. The central amygdala is particularly implicated in compulsive drug seeking and drug seeking after extended use, dependence and withdrawal, or stress66,76. Thus, more broadly, our results indicate chronic stress could oppositely modulate BLA→DMS and CeA→DMS pathways to promote maladaptive drug-seeking and/or avoidance habits. Towards this end, whether individual differences in BLA→DMS and/or CeA→DMS activity confer resilience or susceptibility to stress-potentiated habit formation is an important future question.

Adaptive decision making often requires understanding your agency in a situation: knowing that your actions can produce desirable or undesirable consequences and using this to make thoughtful, deliberate, goal-directed decisions. Chronic stress can disrupt agency and promote inflexible, habitual control over behavior. We found that stress does this with a one-two punch to the brain. Chronic stress dials down the BLA→DMS pathway activity needed to learn the association between an action and its consequence to enable flexible, well-informed decisions. It also dials up activity in the CeA→DMS pathway, causing the formation of rigid, inflexible habits. These data provide neuronal circuit insights into how chronic stress shapes how we learn and, thus, how we decide. This helps us understand how stress can lead to the disrupted decision making and pathological habits that characterize substance use disorders and mental illness.

This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1038/s41586-024-08580-w https://www.nature.com/articles/s41586-024-08580-w

METHODS

See Supplemental Table 5 for key reagents.

Subjects

Male and female wildtype C57BL/6J mice (Jackson Laboratories, Bar Harbor, ME) aged 9–12 weeks old at the time of surgery served as subjects. Rabies tracing was conducted with Drd1a-Cre and Adora2A-Cre transgenic mice bred in house and aged 8–16 weeks at the time of surgery. Mice were housed in a temperature (68–79 °F) and humidity (30–70%) regulated vivarium on a 12:12 hr reverse dark/light cycle (lights off at 7 AM). Behavioral experiments were performed during the dark phase. Mice were group housed in same-sex groups of 3–4 mice/cage prior to onset of behavioral experiments and subsequently single-housed for the remainder of the experiment to facilitate food deprivation and preserve implants. Unless noted below, mice were provided with food (standard rodent chow, Lab Diet, St. Louis, MO) and water ad libitum in the home cage. Mice were handled for 3–5 days prior to the start of behavioral training for each experiment. All procedures were conducted in accordance with the NIH Guide for the Care and Use of Laboratory Animals and were approved by the UCLA Institutional Animal Care and Use Committee.

Surgery

Mice were anesthetized with isoflurane (3% induction, 1% maintenance), and positioned in a digital stereotaxic frame (Kopf, Tujunga, CA). Subcutaneous Rimadyl (Carprofen; 5 mg/kg; Zoetis, Parsippany, NJ) was given pre-operatively for analgesia and anti-inflammatory purposes. Small cranial holes (1–2 mm²) were drilled, through which virus or fluorescent tracers were delivered via a guide cannula (DMS: 28 ga, BLA/CeA: 33 ga; PlasticsOne, Roanoke, VA) connected to a 1-mL syringe (Hamilton Company, Reno, NV) by intramedic polyethylene tubing (BD; Franklin Lakes, NJ) and controlled by a syringe pump (Harvard Apparatus, Holliston, MA). Coordinates (from Bregma) were determined using a mouse brain reference atlas77 and were as follows: CeA, AP −1.2, ML ±2.8, DV −4.6 mm; BLA, AP −1.5, ML ±3.2, DV −5.0 mm; DMS, AP +0.2, ML ±1.8, DV −2.65 mm. Virus or tracers were infused at a rate of 0.1 μL/min and cannulae were left in place for at least 10 min post-injection. For injection-only surgeries, the skin was re-closed with Vetbond tissue adhesive (3M, Saint Paul, MN). For surgeries requiring fiber-optic cannulae, fibers were placed 0.3 mm above the target region for optogenetic experiments and at the infusion site for fiber photometry experiments, secured to the skull using RelyX Unicem Universal Self-Adhesive Resin (3M) and a head cap was created using C&B Metabond quick adhesive cement system (Parkell Inc., Brentwood, NY), followed by opaque dental cement (Lang Dental Manufacturing, Wheeling, IL). After surgery, mice were kept on a heating pad maintained at 35 °C for 1 hr and then single-housed in a new homecage for recovery and monitoring. Mice received chow containing the antibiotic TMS for 7 days following surgery to prevent infection, after which they were returned to standard rodent chow. Specific surgical details for each experiment are described below. In all cases, surgery occurred prior to the onset of stress and/or behavioral training.

Chronic mild unpredictable stress

The chronic mild unpredictable stress (“stress”) procedure was modified from prior work5,78–81. Mice assigned to the stress group were exposed to 2 stressors/day (foot shock, physical restraint, tilted cage, white noise, continuous illumination, or damp bedding) for 14 days in a pseudorandomized manner at variable time onset and for varying durations between 2 and 16 hr. Each stress protocol was consistent across subjects within a cohort. Control subjects received equated daily handling in the vivarium by the same experimenter administering the stress. Stress was administered in a separate, enclosed laboratory space distinct from both the vivarium and behavioral testing rooms. Stressed mice had home-cage nesting material removed for the duration of the stress exposure82. Mice were transported to the stress space in individual 16-oz clear polyethylene containers and on a dedicated transport cart and placed into individual cages in the stress space. Stress efficacy was assessed by daily body weight measurements83. Sub-threshold stress exposure was identical to stress except mice received only 1 stressor/day. An example stress protocol is provided in Supplemental Table 6.
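As a rough illustration of how a pseudorandomized schedule of this kind could be generated, the Python sketch below draws stressors for each of the 14 days from the six listed above. The function name, the fixed seed, and the rule that a stressor is not repeated within a day are assumptions made for illustration only. The sketch omits onset times and durations and does not reproduce the protocol in Supplemental Table 6.

```python
import random

# Stressor names taken from the procedure described above.
STRESSORS = [
    "foot shock", "physical restraint", "tilted cage",
    "white noise", "continuous illumination", "damp bedding",
]

def make_stress_schedule(n_days=14, stressors_per_day=2, seed=0):
    """Return a list with one tuple of stressors per day.

    Hypothetical helper: sampling without replacement within a day is an
    assumption, not a rule stated in the protocol.
    """
    rng = random.Random(seed)  # fixed seed so every subject in a cohort shares one schedule
    return [tuple(rng.sample(STRESSORS, stressors_per_day)) for _ in range(n_days)]

stress = make_stress_schedule(stressors_per_day=2)        # 2 stressors/day
subthreshold = make_stress_schedule(stressors_per_day=1)  # 1 stressor/day
```

Reusing one seed per cohort mirrors the statement that each stress protocol was consistent across subjects within a cohort.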

Stressors

Footshock:

Subjects were placed in the conditioning chamber for 2 min to acclimate and then exposed to 5, 2–3 s, 0.7-mA footshocks with a variable intertrial interval averaging 60 s (30–90 s range). The footshock chamber had a similar grid floor to the behavioral testing chambers (described below) but was otherwise distinct in wall shape (round), pattern (monochrome polka dot), lack of bedding, scent (75% ethanol), and lighting (off). The chambers also lacked food ports and levers. Chambers were cleaned with 75% ethanol between subjects.

Physical Restraint:

Subjects were immobilized in modified 50-mL polypropylene conical tubes with 4 air holes per side, 1 at the top, and 1 in the cap for the tail (10 total). Mice were scruffed and placed inside the conical tube for 2 hr in their stress cage.

Tilted cage:

Stress cages were placed on chocks to tilt each cage at approximately a 45-degree angle for 6 – 16 hr.

White noise:

100-dB white noise was played in the stress space for all stress mice for a duration of 6 – 16 hr.

Continuous illumination:

Overhead lights were turned on during the dark phase of the light cycle (7 AM – 7 PM).

Damp bedding:

~200 mL of water was mixed with the stress cage corncob bedding. Mice were placed in their stress cage with this damp bedding for 6 – 16 hr. Mice were returned to a new home cage with clean, dry bedding afterwards.

Corticosterone ELISA

N = 11 male and N = 12 female mice were used for corticosterone measurements of blood serum after exposure to 0, 1, or 2 stressors per day for 14 days. Measurements were taken 24 hr after the final stress exposure. Mice were decapitated and trunk blood was collected in 1.7-mL sample tubes on ice. Tubes were centrifuged at 2000 g for 10 min at 4 °C. Clear supernatant was collected and placed in new 1.7-mL sample tubes and frozen at −20 °C. Samples were diluted 1:40 in sample dilution buffer. Serum corticosterone levels were assessed using a Corticosterone ELISA kit as directed (Enzo Biosciences; Farmingdale, NY) and quantified on a microplate reader (Molecular Devices, San Jose, CA).

Behavioral procedures

Instrumental conditioning and tests

Instrumental conditioning procedures were adapted from our prior work41.

Apparatus.

Training took place in Med Associates wide mouse conditioning chambers (East Fairfield, VT) housed within sound- and light-attenuating boxes. Each chamber had metal grid floors and contained a retractable lever to the left of a recessed food-delivery port (magazine) on the front wall. A photobeam entry detector was positioned at the entry to the food port. Each chamber was equipped with 2 pellet dispensers to deliver either 20-mg grain or chocolate-flavored purified pellets (Bio-Serv, Frenchtown, NJ) into the food port. A fan mounted to the outer chamber provided ventilation and external noise reduction. A 3-watt, 24-volt house light mounted on the top of the back wall opposite the food port provided illumination. To monitor subject behavior, monochrome digital cameras (Med Associates) were positioned over top of the conditioning chambers. For optogenetic manipulations, chambers were outfitted with an Intensity Division Fiberoptic Rotary Joint (Doric Lenses, Quebec, QC, Canada) connecting the output fiber optic patch cords to a 473-nm or 593-nm laser (Dragon Lasers, ChangChun, JiLin, China) positioned outside of the chamber.

Food deprivation.

3 – 5 days prior to the start of behavioral training, mice were food-deprived to maintain 85%–90% of their free-feeding body weight. Mice were given 1.5 – 3.0 g of their home chow at the same time daily at least 2 hr after training sessions. For experiments involving stress, food deprivation began during the last 3 days of the stress procedure. Owing to food deprivation, body weights did not differ between groups at the start or end of training (see Supplemental Table 2).

Outcome pre-exposure.

To familiarize subjects with the food pellet that would become the instrumental outcome, mice were given 1 session of outcome pre-exposure. Mice were placed in a clean, empty cage and allowed to consume 20 – 30 of the food pellets from a metal cup. If any pellets remained, they were placed in the home cage overnight for consumption.

Magazine conditioning.

Mice received 1 session of training in the operant chamber to learn where to receive the food pellets (20-mg grain or chocolate-purified pellets). Mice received 20 – 30 non-contingent pellet deliveries from the food port with a fixed 60-s intertrial interval.

Instrumental conditioning.

Mice received 4 sessions (1 session/day consecutively), minimum, of instrumental conditioning in which lever presses earned delivery of a single food pellet. Earned pellet type (grain or chocolate) was counterbalanced across subjects within each group of each experiment. Each session began with the illumination of the house light and extension of the lever, and ended with the retraction of the lever and turning off of the house light. Sessions ended after the total available outcomes (20 or 30, as noted for each experiment below) had been earned or a maximum time limit (20 or 30 min, as noted below) had been reached. In all cases, training began on a fixed-ratio 1 schedule (FR-1), in which each action was reinforced with one food pellet. Once mice completed 2 sessions in which they earned 80% of the max outcomes, the reinforcement schedule was escalated to either random-interval (RI) or random-ratio (RR) as described for each experiment below. For the RI protocol, mice received 1 session on an RI-15 s schedule then 2 – 3 sessions on the final RI-30 s schedule (variable average 15-s or 30-s interval must elapse following a reinforcer for another press to be reinforced). Subjects on the RR protocol received 1 session each of RR-2, RR-5, and RR-10 schedule of reinforcement (variable press requirement average of 2, 5, or 10 presses to earn the food pellet). For the overtraining protocol, mice received 8 total training sessions, 1 on an RI-15 s schedule then 7 sessions on the final RI-30 s schedule.
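The two reinforcement schedules can be sketched in Python as follows. This is a minimal illustration, not the authors' Med Associates code. The class names are hypothetical. Reinforcing each press with probability 1/n for RR-n, and drawing exponentially distributed intervals for RI, are common implementation choices assumed here rather than details given in the text.

```python
import random

class RandomRatio:
    """RR-n sketch: each press is reinforced with probability 1/n,
    so the press requirement averages n (assumed implementation)."""

    def __init__(self, mean_presses, rng=None):
        self.p = 1.0 / mean_presses
        self.rng = rng if rng is not None else random.Random()

    def press(self, t=None):
        # Time is ignored; only the number of presses matters on a ratio schedule.
        return self.rng.random() < self.p


class RandomInterval:
    """RI-t sketch: after each reinforcer a variable interval (mean t seconds)
    must elapse before another press is reinforced.
    Exponentially distributed intervals are an assumption."""

    def __init__(self, mean_interval_s, rng=None):
        self.mean = mean_interval_s
        self.rng = rng if rng is not None else random.Random()
        self.available_at = self._next_time(0.0)

    def _next_time(self, now):
        return now + self.rng.expovariate(1.0 / self.mean)

    def press(self, t):
        # A press at time t (seconds) is reinforced only once the interval has elapsed.
        if t >= self.available_at:
            self.available_at = self._next_time(t)
            return True
        return False
```

Under these assumptions RR-1 reduces to FR-1. On the interval schedule, pressing faster does not proportionally increase the reward rate, which is one common account of why interval schedules favor habit.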

For subjects in the contingency degradation experiment, following FR-1 training, they received 2 days of training in which each press was reinforced with a probability of 0.2 [P(Reward | Press) = 0.2] and a final session in which each press earned reward with a probability of 0.1 [P(Reward | Press) = 0.1].

Alternate outcome exposures.

To equate exposure of the non-trained pellet, all mice were given non-contingent access to the same number of the alternate food pellets (e.g., chocolate pellets if grain pellets served as the training outcome) as the earned pellet type in a different context (clear plexiglass cage) a minimum of 2 hr before or after (alternated daily) each RI or RR instrumental training session.

Sensory-specific satiety outcome devaluation test.

Testing began 24 hr after the final instrumental conditioning session. Mice were given 1 – 1.5 hr access to either 4 g of the food pellets previously earned by lever pressing (Devalued condition) or 4 g of the non-trained pellets to control for general satiety (Valued condition). The remaining pellets were weighed following prefeeding to measure total consumption. Consumption did not significantly differ between the Devalued v. Valued conditions for any experiment (Supplemental Table 3). Immediately after this prefeeding, lever pressing was assessed during a 5-min, non-reinforced probe test. Following the probe test, mice were given a 10-min consumption choice test with simultaneous access to 1 g of both pellet types to ensure rejection of the devalued outcome. In all cases, mice consumed less of the prefed pellet than non-prefed pellet, indicating successful sensory-specific satiety devaluation (Supplemental Table 4). 24 hr after the first devaluation test, mice received 1 session of instrumental retraining on the final reinforcement schedule (RI-30 s or RR-10), followed the next day by a second devaluation test in which they were prefed the opposite food pellet. Thus, each mouse was tested in both the Valued and Devalued conditions, with test order counterbalanced across subjects within each group for each experiment.
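The devaluation index reported in the figure legends is defined as Devalued presses / (Valued presses + Devalued presses). A minimal Python sketch of that calculation follows. The function name and the handling of a subject with zero presses are assumptions for illustration.

```python
def devaluation_index(valued_presses, devalued_presses):
    """Devalued / (Valued + Devalued), as defined in the figure legend.

    Values near 0.5 mean similar pressing in both conditions (insensitivity
    to devaluation); values below 0.5 mean fewer presses when the outcome
    is devalued (sensitivity to devaluation).
    """
    total = valued_presses + devalued_presses
    if total == 0:
        # Assumed handling: the index is undefined without any presses.
        raise ValueError("index undefined when no presses were made")
    return devalued_presses / total
```

For example, 30 presses in the Valued test and 10 in the Devalued test give an index of 0.25, whereas 20 presses in each give 0.5.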

Contingency degradation test.

24 hr after the final instrumental conditioning session, mice received a 20-min contingency degradation session during which lever pressing continued to earn reward with a probability of 0.1, but reward was also delivered freely with the same probability even if mice did not press (i.e., non-contingent; [P(Reward | Press) = 0.1; P(Reward | NoPress) = 0.1]). Thus, lever pressing was no longer necessary to earn reward. This session was identical for non-degraded controls, except they did not receive non-contingent rewards [P(Reward | Press) = 0.1; P(Reward | NoPress) = 0]. 24 hr following the contingency degradation session, the effects of this contingency change were assessed in a 5-min non-reinforced probe test.
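One common way to summarize the difference between the degraded and non-degraded conditions is the contingency measure ΔP = P(Reward | Press) − P(Reward | NoPress). The Python sketch below applies it to the probabilities given above. ΔP is introduced here only for illustration and is not a measure the authors report; the function name is hypothetical.

```python
def delta_p(p_reward_given_press, p_reward_given_no_press):
    """Action-outcome contingency: how much pressing raises reward probability."""
    return p_reward_given_press - p_reward_given_no_press

degraded = delta_p(0.1, 0.1)      # free rewards match earned rewards
non_degraded = delta_p(0.1, 0.0)  # reward still depends on pressing
```

With these values the degraded condition has a contingency of 0 and the non-degraded condition retains a contingency of 0.1, which is the sense in which pressing is no longer necessary to earn reward.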

Real-time place preference/avoidance test

Procedure was conducted as described previously84. Mice were habituated to a 2-sided opaque plexiglass chamber (20 × 42 × 27 cm) for 10 min, during which their baseline preference for the left or right side of the chamber was measured. During the first 10-min test session, one side of the chamber was assigned to the light-delivery side (counterbalanced across subjects within each group). Mice were placed in the non-stimulation side to start the experiment. Light (Dragon Laser; Changchun, China) was delivered upon entry into the light-paired side and continued until the subject exited that side (optical stimulation: 473 nm, 5-ms pulse width, 20 Hz, ~8–10 mW at fiber tip; optical inhibition: 593 nm, continuous, ~8–10 mW). Mice then received a second test, identical to the first, in which the opposite side of the chamber served as the light-paired side. Sessions were video-recorded using a CCD camera. This camera interfaced with Biobserve software (Biobserve GmbH, Germany) and a Pulse Pal (Sanworks, Rochester, NY), to track subject position in real time and trigger laser delivery. The apparatus was cleaned with 75% ethanol after each session. Distance traveled, movement velocity, and time spent in each chamber were generated by Biobserve software post-session. Time spent in the laser-paired chamber was compared between groups to assess preference for or aversion to laser delivery.

Open-field test

Procedure was conducted as described previously84. Mice were placed in an opaque plexiglass arena (34 × 34 × 34 cm) for a single 10-min session. Sessions were video recorded using a CCD camera interfaced with Anymaze (Stoelting Co., Wood Dale, IL) software, which was used to track subject position in real time. Center region was defined as the innermost third of the floor area. Brightness above the OFT was ~70 lux. The apparatus was cleaned with 75% ethanol after each subject. Distance traveled, movement velocity, and time spent in either center or surrounding outer area were generated by Biobserve software and compared between groups.
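To make the center-zone definition concrete, the Python sketch below computes the side length of a center square covering one third of the 34 × 34 cm floor and classifies a tracked position. It assumes the center zone is a square concentric with the arena and that coordinates are in cm measured from one corner. Neither assumption is stated in the text, and the function names are hypothetical.

```python
import math

ARENA_SIDE_CM = 34.0  # floor dimensions given above

def center_side(arena_side=ARENA_SIDE_CM, area_fraction=1 / 3):
    """Side of a concentric square covering `area_fraction` of the floor."""
    return arena_side * math.sqrt(area_fraction)

def in_center(x, y, arena_side=ARENA_SIDE_CM, area_fraction=1 / 3):
    """True if position (x, y), in cm from one corner, lies in the center zone."""
    half = center_side(arena_side, area_fraction) / 2
    mid = arena_side / 2
    return abs(x - mid) <= half and abs(y - mid) <= half
```

Under these assumptions the center square is roughly 19.6 cm on a side, leaving a border of about 7.2 cm along each wall.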

Light/dark emergence test

The dark side of a 2-chamber apparatus was made of black opaque plexiglass and completely enclosed except for a small entry through the middle divider. The light side was made of white opaque plexiglass and was open to the light above. Brightness in the light chamber was ~70 lux. Mice were placed in the open portion of the apparatus to initiate a 10-min session. Each session was video recorded using a CCD camera, which interfaced with Anymaze software to track subject location. The apparatus was cleaned with 75% ethanol after each session. Distance traveled, movement velocity, and time spent in the light chamber were generated by Biobserve software and compared between groups.

Elevated plus maze

Procedure was conducted as described previously85. The dimensions of the elevated plus maze (EPM) arms were 30 cm × 7 cm, and the height of the closed arm walls was 20 cm. The maze was elevated 65 cm from the floor and was placed in the center of the behavior room away from other stimuli. Brightness above the EPM was ~70 lux. For the 10-min EPM test, mice were placed in the center of the EPM facing a closed arm. Each session was video recorded using a CCD camera, which interfaced with Anymaze software to track subject location in real time. The apparatus was cleaned with 75% ethanol after each session. Distance traveled, movement velocity, and time spent in the center, open arms, or closed arms were generated by Biobserve software and compared between groups.

Sucrose-preference test

Mice first received habituation to 2 standard home-cage water bottles filled with water in the home cage for 16 hr. Subsequently, one water bottle was replaced with a bottle of 10% sucrose. Bottles were left in place for 24 hr and weighed before and after placement. Bottle positions were switched for another 24-hr period and subsequently weighed again. Amount of sucrose and water consumed, as well as a ratio of the two, during the 48-hr period was compared between groups.
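The consumption measures can be sketched as follows. Bottle weights before and after each 24-hr period give the amount consumed, and the two periods are summed over the 48 hr. The text does not give the exact form of the ratio, so the preference index used here, sucrose / (sucrose + water), is an assumption, as are the function and argument names.

```python
def sucrose_preference(sucrose_before, sucrose_after, water_before, water_after):
    """Return (sucrose consumed, water consumed, preference index).

    Each argument is a list of bottle weights (g), one entry per 24-hr period.
    Assumption: preference = sucrose / (sucrose + water); the text only
    says "a ratio of the two".
    """
    sucrose = sum(b - a for b, a in zip(sucrose_before, sucrose_after))
    water = sum(b - a for b, a in zip(water_before, water_after))
    total = sucrose + water
    preference = sucrose / total if total > 0 else float("nan")
    return sucrose, water, preference


# Hypothetical weights for one mouse across the two 24-hr periods.
print(sucrose_preference([200, 190], [194, 185], [200, 198], [198, 195]))
```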

Progressive ratio test

Mice were trained on the instrumental training protocol above to a reinforcement schedule of RR-10. They were then given a progressive ratio test in which the number of lever presses required to receive a pellet increased by 4 with each reinforcer delivered (e.g., 1, 5, 9, 13, 17, 21, etc.). The session ended after a >5-min break in pressing or a maximum duration of 4 hr. Session duration, rewards delivered, total presses, and the break point (last completed press requirement) were collected and compared between groups.
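A minimal sketch of how the progressive ratio measures could be scored from press timestamps is given below. It follows the rules stated above (requirement increasing by 4, a >5-min pause or 4 hr ending the session) but is not the authors' code. Timing the first pause from session start and ending the session 5 min after the last press are assumptions, and all names are hypothetical.

```python
def pr_requirements(n_ratios, start=1, step=4):
    """First n_ratios press requirements: 1, 5, 9, ... with the defaults."""
    return [start + step * i for i in range(n_ratios)]


def score_pr_session(press_times, start=1, step=4,
                     max_break_s=5 * 60, max_session_s=4 * 3600):
    """Score a session from sorted press times (s from session start)."""
    requirement, presses_in_ratio = start, 0
    rewards = total_presses = break_point = 0
    last_time = 0.0  # assumption: the first pause is timed from session start
    for t in press_times:
        # Stop at the 4-hr cap or after a pause longer than 5 min.
        if t > max_session_s or t - last_time > max_break_s:
            break
        total_presses += 1
        presses_in_ratio += 1
        last_time = t
        if presses_in_ratio == requirement:
            rewards += 1
            break_point = requirement  # last completed press requirement
            requirement += step
            presses_in_ratio = 0
    # Assumption: the session ends once the pause criterion is met.
    duration = min(last_time + max_break_s, max_session_s)
    return {"duration_s": duration, "rewards": rewards,
            "total_presses": total_presses, "break_point": break_point}
```

For example, 20 presses at 1-s intervals complete the requirements of 1, 5, and 9 presses, leaving 5 presses toward the requirement of 13, so the break point is 9.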

Effects of chronic mild unpredictable stress on instrumental learning and sensitivity to outcome devaluation

Male and female (Control: Final N = 22, 13 male; Stress: N = 25, 12 male) naïve mice were used in this experiment to assess how a recent history of chronic stress impacts instrumental learning and behavioral control strategy. 6 subjects (not included in above N) were excluded because they did not meet instrumental training performance criteria. Mice were randomly assigned to Control v. Stress groups. Mice were given 14 consecutive days of twice daily stress or daily handling as described above. 24 hr after the final stress exposure, mice began instrumental conditioning as described above. After completion of FR-1, mice received 1 session each of training on an RR-2, RR-5, and RR-10 reinforcement schedule (max 20 outcomes/20 min/session). We chose an RR reinforcement schedule for this experiment because it tends to promote action-outcome learning and goal-directed decision making38,39,42,86 and would, thus, make it more difficult for prior stress to induce habits, increasing the robustness of the results. Following training, mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above.

Effects of chronic mild unpredictable stress on action-outcome learning

Male and female (Control, Non-degraded: Final N = 7, 3 male; Control, Contingency degradation N = 3, 3 male; Stress, Non-degraded: N = 7, 3 male; Stress, Contingency degradation N = 8, 4 male) naïve mice were used in this experiment to assess how a recent history of chronic stress impacts the ability to learn an action-outcome contingency. 3 subjects (not included in above N) were excluded because they did not meet instrumental training performance criteria. Mice were randomly assigned to Control v. Stress groups. Mice were given 14 consecutive days of twice daily stress or daily handling as described above. 24 hr after the final stress exposure, mice began instrumental conditioning as described above. After completion of FR-1, mice received 2 sessions of training in which lever presses were reinforced with a probability of 0.2 [P(Reward | Press) = 0.2] and one session in which they were reinforced with a probability of 0.1 [P(Reward | Press) = 0.1] (max 20 outcomes/20 min/session). Following training, mice received a single contingency degradation or non-degraded control session, as described above. This was followed the next day by a lever-pressing probe test, described above.
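The probabilistic reinforcement rule used in training can be sketched as a simple simulation: each press is reinforced independently with probability P(Reward | Press), and the session stops at 20 outcomes or 20 min. This is an illustrative sketch, not the Med Associates program used in the study; the function name, the timestamp input, and the injectable random number generator are assumptions.

```python
import random


def simulate_session(press_times, p_reward, max_outcomes=20,
                     max_session_s=20 * 60, rng=None):
    """Return the times of reinforced presses under P(Reward | Press) = p_reward.

    press_times: sorted press timestamps (s from session start).
    The session ends at max_outcomes rewards or max_session_s seconds.
    """
    rng = rng or random.Random()
    reward_times = []
    for t in press_times:
        if t > max_session_s or len(reward_times) >= max_outcomes:
            break
        if rng.random() < p_reward:  # each press is an independent draw
            reward_times.append(t)
    return reward_times


# Example: one press per second at P(Reward | Press) = 0.2, seeded for repeatability.
rewards = simulate_session(range(1, 1201), 0.2, rng=random.Random(0))
print(len(rewards))
```

On the usual reading of a random-ratio schedule, RR-n corresponds to a per-press reward probability of 1/n, so P(Reward | Press) = 0.2 and 0.1 would match RR-5 and RR-10, respectively.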

Effects of chronic mild unpredictable stress on common indices of anxiety- and depression-like behavior

Male and female (Control: Final N = 12, 6 male; Stress: N = 12, 6 male) naïve mice were used in this experiment to assess how a recent history of chronic stress impacts performance in common indices of anxiety- and depression-like behavior. Mice were randomly assigned to Control v. Stress groups. Mice were given 14 consecutive days of twice daily stress or daily handling as described above. 24 hr after the final stress exposure, mice began testing, as described above. Mice were given tests in the order: open field test, light-dark emergence test, elevated plus maze, sucrose preference test, progressive ratio test.

Tracing

Anterograde tracing of CeA neurons was performed as previously described87. Male (N = 2) and female (N = 2) naïve mice were infused bilaterally with the anterograde tracer AAV8-Syn-mCherry (Addgene, Watertown, MA) in the CeA (0.2 μL). Virus was allowed to express for 4 weeks, following which mice were perfused and histology was processed as described below to identify fluorescently labeled fibers in the dorsal striatum.

For retrograde tracing of DMS-projecting amygdala neurons, male (N = 2) and female (N = 2) naïve mice were infused with Fluorogold (Sigma, St. Louis, MO; 4% in sterile saline) in the DMS (0.2 μL). The tracer was allowed to transport for 5 days, following which mice were perfused and histology was processed as described below to identify fluorescently labeled cell bodies in CeA and BLA.

For retrograde tracing of monosynaptic inputs onto Drd1a+ or A2A+ DMS neurons, male (N = 4) and female (N = 4) Drd1a-cre or male (N = 3) and female (N = 2) Adora2A-cre naïve mice were infused with 0.3 μL AAV2-hSyn-FLEX-TVA-P2A-eGFP-2A-oG (Salk Gene Transfer, Targeting and Therapeutics Facility) in the DMS. Three weeks later, mice were infused with 0.3 μL EnvA G-deleted Rabies-mCherry at the same DMS coordinates. Mice were perfused 1 week later and tissue was processed as described below to identify monosynaptically-labeled inputs in CeA and BLA. 4 Drd1a-cre and 1 Adora2A-cre subjects were removed due to starter virus spillover in the BNST.

Fiber photometry calcium imaging of CeA→DMS or BLA→DMS projections during instrumental learning following stress

Male and female (BLA→DMS Control: Final N = 9, 4 male; BLA→DMS Stress: N = 12, 5 male; CeA→DMS Control: N = 11, 6 male; CeA→DMS Stress: N = 11, 4 male) naïve mice were used in this experiment to monitor calcium fluctuations in CeA→DMS and BLA→DMS projections during instrumental conditioning after stress. 18 subjects (not included in above N) with off-target viral expression and/or fiber location were excluded from the dataset. 4 subjects were excluded for loss of optic fibers/headcaps. 4 subjects were excluded for missing recording data from one session. 3 subjects that did not complete instrumental conditioning were also excluded. Mice were randomly assigned to Virus and Stress groups. At surgery, mice received unilateral infusion (left/right hemisphere counterbalanced across subjects within each group) of a retrogradely trafficked AAV encoding cre-recombinase (AAVrg-Syn-Cre-P2A-dTomato, Addgene) into the DMS (0.3 μl) and of an AAV encoding the cre-dependent genetically encoded calcium indicator GCaMP8s (AAV9-Syn-FLEX-GcAMP8s-GFP, Addgene) into either the CeA or BLA (0.1–0.2 μl). Optic fiber cannulae (5.0-mm length (BLA) or 4.6 mm (CeA), 200-μm diameter, 0.37 NA, Inper, Hangzhou, China) were implanted over the GCaMP infusion site for calcium imaging at cell bodies. Mice were given 1 – 2 weeks to recover post-surgery, followed by 14 consecutive days of twice daily stress or daily handling as described above. Mice were habituated to restraint during the final 3 days of the stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning as described above. Each session began with a 3-minute baseline period prior to the start of the instrumental session for assessment of changes in baseline calcium activity. After completion of FR-1, mice received 1 session each of training on an RR-2, RR-5, and RR-10 reinforcement schedule (max 20 outcomes/20 min/session).

Fiber photometry was used to image bulk calcium activity in CeA→DMS or BLA→DMS neurons for 3 min prior to and throughout each instrumental conditioning session using a commercial fiber photometry system (Neurophotometrics Ltd., San Diego, CA). Two LEDs (470 nm: Ca2+-dependent GCaMP fluorescence; 415 nm: autofluorescence, motion artifact, Ca2+-independent GCaMP fluorescence) were reflected off dichroic mirrors and coupled via a patch cord (200 μm; 0.37 NA, Inper) to the implanted optical fiber. The intensity of excitation light was adjusted to ~100 μW at the tip of the patch cord. Fluorescence emission was passed through a 535-nm bandpass filter and focused onto the complementary metal-oxide semiconductor (CMOS) camera sensor through a tube lens. Samples were collected at 20 Hz interleaved between the 415 nm and 470 nm excitation channels using a custom Bonsai workflow. Time stamps of task events were collected simultaneously through an additional synchronized camera aimed at the Med Associates interface, which sent light pulses coincident with task events (onset, press, entry, reward). Signals were saved using Bonsai software and exported to MATLAB (MathWorks, Natick, MA) for analysis.

To assess the response to appetitive and aversive stimuli and provide a positive signal control, fiber photometry measurements were made during subsequent non-contingent reward and footshock sessions. In the first session, mice received 10 non-contingent food-pellet deliveries with a variable 60-s intertrial interval. 24 hr later, they received a session of 5 footshocks (2 s, 0.7 mA) with a variable 60-s intertrial interval. Calcium signal was aligned to reward collection or shock onset using timestamps collected as above. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber location.

Fiber photometry analysis

Data were pre-processed using a custom-written pipeline in MATLAB (MathWorks, Natick, MA) as described previously88. The 415 nm and 470 nm signals were fit using an exponential curve. Change in fluorescence (ΔF/F) at each time point was calculated by subtracting the fitted 415 nm signal from the 470 nm signal and normalizing to the fitted 415 nm data [(470 − fitted 415)/(fitted 415)]. The ΔF/F data were Z-scored to the average of the whole session [(ΔF/F − mean ΔF/F)/std(ΔF/F)]. Z-scored traces were then aligned to behavioral event timestamps throughout each session. Area under the curve (AUC) was calculated for each individual aligned trace within each session using a trapezoidal function. We used the 3-s period prior to initiating presses to quantify activity related to the initiation of actions. We used the 3-s period following reward collection to quantify activity related to the earned reward and unpredicted reward. We used the 1-s period following shock onset to quantify acute shock responses and the 2-s post-shock period to quantify activity following the shock. Quantifications and signal aligned to events were averaged across trials within a session and compared across sessions and between groups. Spontaneous activity was recorded during a 3-minute baseline period in the instrumental training context prior to each training session. Calcium events were identified as described previously89. We defined a series of sliding windows (15-s window, 1-s step) along the trace in which we filtered out high-amplitude events (greater than 2x the median of the 15-s window) and calculated the median absolute deviation of the resultant trace. Calcium transients with local maxima greater than 2x above the median absolute deviation were selected as events. These events were used to calculate spontaneous event frequency and amplitude for BLA→DMS and CeA→DMS pathways.
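The ΔF/F, Z-score, and AUC steps can be sketched as below. This is a Python illustration of the formulas given above, not the authors' MATLAB pipeline. It takes the fitted 415 nm signal as an input rather than performing the exponential fit, uses the sample standard deviation (the default of MATLAB's std, assumed here), and takes the per-channel sampling rate `fs` as a parameter; all function names are hypothetical.

```python
from statistics import mean, stdev


def delta_f_over_f(sig470, fitted415):
    """(470 - fitted 415) / (fitted 415) at each time point."""
    return [(s - f) / f for s, f in zip(sig470, fitted415)]


def zscore(trace):
    """Z-score to the whole-session mean and (sample) standard deviation."""
    m, sd = mean(trace), stdev(trace)
    return [(x - m) / sd for x in trace]


def trapezoid_auc(y, dt):
    """Trapezoidal area under evenly spaced samples y with spacing dt."""
    return sum((a + b) / 2 * dt for a, b in zip(y, y[1:]))


def event_auc(z, event_idx, fs, start_s, stop_s):
    """AUC of z over [start_s, stop_s] seconds relative to an event sample.

    Example windows from the text: (-3, 0) before a press,
    (0, 3) after reward collection, (0, 1) after shock onset.
    """
    i0 = max(0, event_idx + round(start_s * fs))
    i1 = event_idx + round(stop_s * fs)
    return trapezoid_auc(z[i0:i1 + 1], 1 / fs)
```

Per-event AUC values computed this way would then be averaged across trials within a session, as described above.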

Optogenetic inhibition of BLA→DMS projections during instrumental learning

Male and female (eYFP Final N = 10, 5 male; Arch N = 11, 5 male) naïve mice were used in this experiment to assess the necessity of BLA→DMS projection activity at outcome experience during training for the action-outcome learning that supports goal-directed decision making. 13 subjects with off-target viral expression or fiber location and 6 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Viral group. At surgery, mice received bilateral infusion of an AAV encoding the inhibitory opsin archaerhodopsin (AAVDJ-Syn-eArch-YFP, Stanford Vector Core) or fluorophore control (AAVDJ-Syn-eYFP; Addgene) into the BLA (0.1–0.2 μl). Optic fiber cannulae (2.5-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the DMS. Mice were given 3 weeks to recover and allow for viral expression. Mice were habituated to restraint for attaching optical fibers for 3 days immediately prior to instrumental conditioning. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 593-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 5-s pulse, 8–10 mW). After completion of FR-1, mice received 1 session each of instrumental conditioning on an RR-2, RR-5, and RR-10 reinforcement schedule (max 20 outcomes/20 min/session). We chose an RR schedule of reinforcement for this experiment because it tends to promote action-outcome learning and goal-directed decision making 38,39,42,86 and, thus, would make it more difficult to neurobiologically induce habit formation, increasing the robustness of the results. 
Following training mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. Mice received laser as in training during the intervening retraining session. After instrumental training and testing, mice were tested in the RTPP test as described above. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Optogenetic activation of BLA→DMS projections during instrumental learning following stress

Male and female (Control eYFP: Final N = 11, 7 male; Control ChR2: N = 7, 4 male; Stress eYFP: N = 9, 2 male; Stress ChR2: N = 10, 3 male) mice were used in this experiment to assess whether activation of BLA→DMS projections during learning is sufficient to rescue action-outcome learning for goal-directed decision making in subjects with a history of stress. 4 subjects with off-target viral expression or fiber location and 2 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Virus and Stress groups. At surgery, mice received bilateral infusion of a retrogradely trafficked AAV encoding cre-recombinase (AAVrg-Syn-Cre-P2A-dTomato, Addgene) into the DMS (0.3 μl) and AAV encoding the cre-inducible excitatory opsin ChR2 (AAV8-Syn-DIO-ChR2-eYFP, Stanford Vector Core) or fluorophore control (AAV8-Syn-DIO-eYFP, Stanford Vector Core) into the BLA (0.1–0.2 μl). Optic fiber cannulae (5.0-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the BLA. Mice were given 1–2 weeks to recover post-surgery, followed by 14 consecutive days of twice daily stress or daily handling as described above. Mice were habituated to restraint for attaching optical fibers during the final 3 days of the stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning, as described above. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 473-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 2-s duration, 20 Hz, 5-ms pulse width, 8–10 mW). 
After completion of FR-1, mice received 1 training session on an RI-15s reinforcement schedule and 2 training sessions on the RI-30s schedule (max 20 outcomes/20 min/session). We chose an RI reinforcement schedule for this experiment because it tends to promote habit formation 38,39,42,86 and, thus, would make it more difficult to neurobiologically prevent stress-potentiated habit, increasing the robustness of the results. Following training, mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. Mice received laser as in training during the intervening retraining session. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Chemogenetic activation of BLA→DMS projections during instrumental learning following stress

Male and female (Control, mCherry: Final N = 12, 7 male; Control, hM3Dq: N = 6, 3 male; Stress, mCherry: N = 9, 5 male; Stress, hM3Dq: N = 10, 5 male) naïve mice were used in this experiment to assess whether activation of BLA→DMS projections during learning is sufficient to rescue action-outcome learning for goal-directed decision making in subjects with a history of stress. 15 subjects with off-target viral expression and 2 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Viral and Stress groups. At surgery, all mice received bilateral infusion of the retrogradely-trafficked canine-adenovirus encoding cre-recombinase (CAV2-Cre-GFP; Plateforme de Vectorologie de Montpellier, Montpellier, France) into the DMS (0.3 μl) and AAV encoding the cre-inducible excitatory designer receptor human M3 muscarinic receptor (hM3Dq; AAV2-Syn-DIO-hM3Dq-mCherry; Addgene, Watertown, MA) or fluorophore control (AAV2-Syn-DIO-mCherry; Addgene) into the BLA (0.1–0.2 μl). Mice were given 1–2 weeks to recover post-surgery, followed by 14 consecutive days of twice daily stress or daily handling, as described above. Mice were habituated to i.p. injections during the final 3 days of the stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning, as described above. All subjects received an intraperitoneal (i.p.) injection of clozapine-n-oxide (water soluble CNO; 0.2 mg/kg43,44,90–92; Hello Bio, Princeton, NJ) 30 min prior to each instrumental conditioning session. Upon completion of FR-1 (80% max rewards delivered), mice received 1 training session on the RI-15s reinforcement schedule followed by 2 sessions on an RI-30s schedule (max 30 outcomes/30 min/session). 
We chose an RI schedule of reinforcement for this experiment because it tends to promote habit formation 38,39,42,86 and, thus, would make it more difficult to neurobiologically prevent stress-potentiated habit, increasing the robustness of the results. Following training mice received a counterbalanced pair of sensory-specific satiety outcome-specific devaluation tests, as above. No CNO was given on test days. CNO was given prior to the retraining session (RI-30s) in between tests. After instrumental training and testing, mice were perfused and brain tissue was processed using standard histology procedures described below to assess viral expression location and spread.

Optogenetic inactivation of CeA→DMS projections during instrumental overtraining

Male and female (control, eYFP: N = 11, 3 male; control, Arch: N = 11, 7 male) naïve mice were used in this experiment to assess the necessity of CeA→DMS projection activity at outcome experience during learning for the natural habit formation that occurs with overtraining. 2 subjects with off-target viral expression or fiber location were excluded from the dataset. Mice were randomly assigned to Virus group. At surgery, mice received bilateral infusion of an AAV encoding the inhibitory opsin Arch (AAVDJ-Syn-eArch-eYFP, Stanford Vector Core) or fluorophore control (AAVDJ-Syn-eYFP; Addgene) into the CeA (0.1–0.2 μl). Optic fiber cannulae (2.5-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the DMS. Mice were given 1 week to recover post-surgery. Mice were habituated to restraint for attaching optical fibers. Mice then received instrumental overtraining on the RI-30s schedule as described above. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 593-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 5-s pulse, 8–10 mW). After completion of FR-1, mice received 1 training session on an RI-15s reinforcement schedule and 7 training sessions on the RI-30s schedule (max 20 outcomes/20 min/session). We chose an RI reinforcement schedule for this experiment because it tends to promote habit formation 38,39,42,86. We overtrained subjects to also promote the formation of habits naturally in control subjects. Following training, mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. 
Mice received laser as in training during the intervening retraining session. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Optogenetic inactivation of CeA→DMS projections during instrumental learning following stress

Male and female (Control, eYFP: N = 9, 5 male; Control, Arch: N = 11, 4 male; Stress, eYFP: N = 7, 6 male; Stress, Arch: N = 9, 5 male) naïve mice were used in this experiment to assess the necessity of CeA→DMS projection activity at outcome experience during learning for stress-potentiated habit formation. 12 subjects with off-target viral expression or fiber location and 2 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Virus and Stress groups. At surgery, mice received bilateral infusion of an AAV encoding the inhibitory opsin Arch (AAVDJ-Syn-eArch-eYFP, Stanford Vector Core) or fluorophore control (AAVDJ-Syn-eYFP; Addgene) into the CeA (0.1–0.2 μl). Optic fiber cannulae (2.5-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the DMS. Mice were given 1–2 weeks to recover post-surgery, followed by 14 consecutive days of twice daily stress or daily handling as described above. Mice were habituated to restraint for attaching optical fibers during the final 3 days of the stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning as described above. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 593-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 5-s pulse, 8–10 mW). After completion of FR-1, mice received 1 training session on an RI-15s reinforcement schedule and 2 training sessions on the RI-30s schedule (max 20 outcomes/20 min/session). 
We chose an RI reinforcement schedule for this experiment because it tends to promote habit formation 38,39,42,86 and, thus, would make it more difficult to neurobiologically prevent stress-potentiated habit, increasing the robustness of the results. Following training, mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. Mice received laser as in training during the intervening retraining session. After instrumental training and testing, mice were tested in the RTPP test as described above. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Chemogenetic inactivation of CeA→DMS projections during instrumental learning following stress

Male and female (Control, mCherry: N = 12, 5 male; Control, hM4Di: N = 13, 8 male; Stress, mCherry: N = 11, 5 male; Stress, hM4Di: N = 9, 4 male) naïve mice were used in this experiment to assess the necessity of CeA→DMS projection activity during learning for stress-potentiated habit formation. 16 subjects with off-target viral expression and 3 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Viral and Stress groups. At surgery, all mice received bilateral infusion of the retrogradely-trafficked CAV encoding cre-recombinase (CAV2-Cre-GFP; Plateforme de Vectorologie de Montpellier) into the DMS (0.3 μl) and AAV encoding the cre-inducible inhibitory designer receptor human M4 muscarinic receptor (hM4Di; AAV2-Syn-DIO-hM4Di-mCherry; Addgene) or fluorophore control (AAV2-Syn-DIO-mCherry; Addgene) into the CeA (0.1–0.2 μl). Mice were given 1–2 weeks to recover post-surgery, followed by 14 consecutive days of twice daily stress or daily handling as described above. Mice were habituated to i.p. injections during the final 3 days of the stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning as described above. All subjects received an intraperitoneal (i.p.) injection of CNO (2 mg/kg46,47,90,93; Hello Bio) 30 min prior to each instrumental conditioning session. Upon completion of FR-1, mice received 1 session of training on an RI-15s reinforcement schedule followed by 2 sessions on the RI-30s schedule (max 30 outcomes/30 min/session). We chose an RI reinforcement schedule for this experiment because it tends to promote habit formation 38,39,42,86 and, thus, would make it more difficult to neurobiologically prevent stress-potentiated habit, increasing the robustness of the results. Following training, mice received a counterbalanced pair of sensory-specific satiety outcome-specific devaluation tests, as above. No CNO was given on test days. 
CNO was given prior to the retraining session. After instrumental training and testing, mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location and spread.

Optogenetic activation of CeA→DMS projections during instrumental learning

Male and female (eYFP N = 17, 9 male; ChR2: N = 6, 3 male) naïve mice were used in this experiment to assess whether CeA→DMS projection activation at outcome experience during learning is sufficient to promote habit formation. 11 subjects with off-target viral expression or fiber location and 4 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to Virus group. Given the low density of CeA→DMS projections, we chose to activate DMS-projecting CeA cell bodies. At surgery, mice received bilateral infusion of a retrogradely trafficked AAV encoding cre-recombinase (AAVrg-Syn-Cre-P2A-dTomato, Addgene) into the DMS (0.3 μl) and AAV encoding the cre-inducible excitatory opsin channelrhodopsin 2 (ChR2; AAV8-Syn-DIO-ChR2-eYFP, Stanford Vector Core) or fluorophore control (AAV8-Syn-DIO-eYFP, Stanford Vector Core) into the CeA (0.1–0.2 μl). Optic fiber cannulae (5.0-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the CeA. Mice were given 3 weeks to recover and allow for viral expression. Mice were habituated to restraint for 3 days prior to instrumental conditioning. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 473-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 2-s duration, 20 Hz, 5-ms pulse width, 8–10 mW). After completion of FR-1, mice received 1 day each of training on an RR-2, RR-5, and RR-10 reinforcement schedule (max 20 outcomes/20 min/session). 
We chose an RR schedule of reinforcement for this experiment because it tends to promote action-outcome learning and goal-directed decision making 38,39,42,86 and, thus, would make it more difficult to neurobiologically induce habit formation, increasing the robustness of the results. Following training mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. Mice received laser as in training during the intervening retraining session. After instrumental training and testing, mice were tested in the RTPP test, as described above. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Optogenetic activation of CeA→DMS projections during instrumental learning following sub-threshold stress

Male and female (eYFP N = 10, 4 male; ChR2: N = 12, 6 male) naïve mice were used in this experiment to assess whether CeA→DMS projection activation at outcome experience during learning is sufficient to promote habit formation in mice with a history of less-frequent stress (subthreshold for promoting habit formation). 11 subjects with off-target viral expression or fiber location and 4 subjects that did not complete instrumental conditioning were excluded from the dataset. Mice were randomly assigned to viral groups. Similar to optogenetic activation of CeA→DMS neurons in control mice, we chose to target cell bodies with this approach. At surgery, mice received bilateral infusion of a retrogradely trafficked AAV encoding cre-recombinase (AAVrg-Syn-Cre-P2A-dTomato, Addgene) into the DMS (0.3 μl) and AAV encoding the cre-inducible excitatory opsin ChR2 (AAV8-Syn-DIO-ChR2-eYFP, Stanford Vector Core) or fluorophore control (AAV8-Syn-DIO-eYFP, Stanford Vector Core) into the CeA (0.1–0.2 μl). Optic fiber cannulae (5.0-mm length, 100-μm diameter, 0.22 NA, Inper) were implanted over the CeA. Mice were given 1–2 weeks to recover post-surgery, followed by 14 consecutive days of once-daily stress or daily handling as described above. Mice were habituated to restraint for attaching optical fibers during the final 3 days of the subthreshold stress/handling period. 24 hr after the final stress exposure, mice began instrumental conditioning, as described above. During instrumental conditioning, mice were tethered to a 100-μm diameter optic fiber bifurcated patchcord (Inper) attached to a 473-nm laser (Dragon Laser) via a rotary joint. Mice were habituated to the tether during the magazine training session, but no laser was delivered. Beginning with the first FR-1 session, all subjects received laser delivery during reward collection (first magazine entry after reward delivery; 2-s duration, 20 Hz, 5-ms pulse width, 8–10 mW). 
After completion of FR-1, mice received 1 day each of training on an RR-2, RR-5, and RR-10 reinforcement schedule (max 20 outcomes/20 min/session). Following training mice received a counterbalanced set of sensory-specific satiety outcome-specific devaluation tests, as above. Mice were tethered but no laser was delivered on test days. Mice received laser during the intervening retraining session. After instrumental training and testing, mice were tested in the RTPP test, as described above. Mice were then perfused and brain tissue was processed with standard histology procedures described below to assess viral expression location/spread and fiber placement.

Immunohistochemistry

Mice were anesthetized with isoflurane and transcardially perfused with ice-cold PBS followed by cold 4% paraformaldehyde. The brains were removed, post-fixed in 4% paraformaldehyde, then cryoprotected in 30% sucrose in PBS. Coronal slices (30 μm) were taken on a cryostat and collected in PBS. Immunohistochemical analysis was performed as described previously88,94–96. Briefly, floating sections were blocked for 1 hr at room temperature in blocking solution (3% normal goat serum (NGS, Jackson ImmunoResearch Laboratories), 0.3% Triton X-100 (Fisher)) in PBS and then incubated overnight with gentle agitation at 4 °C in blocking solution plus 1:1000 dilution primary antibody (chicken anti-GFP polyclonal, Abcam; rabbit anti-dsRed polyclonal, Takara Bio). Sections were then incubated covered with gentle agitation for 2 hr at room temperature in blocking solution plus 1:500 dilution secondary antibody (goat anti-rabbit IgG Alexa Fluor 594 conjugate; goat anti-chicken IgG Alexa Fluor 488 conjugate, Invitrogen). All sections were washed 3 times for 5 min each in PBS before and after each incubation step and mounted on slides using ProLong Gold antifade reagent with DAPI (Invitrogen). All images were acquired using a Keyence (BZ-X710) microscope with 4X, 10X, and 20X objectives (CFI Plan Apo), CCD camera, and BZ-X Analyze software, and a Zeiss Confocal LSM with 2.5X and 20X objectives and Zeiss ZEN (blue edition) image acquisition software.

Statistics and reproducibility

Statistical Analysis

Datasets were analyzed by 2-tailed t-tests, or 1-, 2-, or 3-way repeated-measures analysis of variance (ANOVA), as appropriate (GraphPad Prism versions 9–11, GraphPad, San Diego, CA; SPSS, IBM, Chicago, IL). For chemogenetic replications of optogenetic results, we used planned comparisons for test press rate data. Some datasets were slightly non-normally distributed. For these datasets, statistical tests were also run using non-parametric analyses and the results were highly consistent. We opted to use parametric statistics for consistency across experiments and given evidence that ANOVA is robust to slight non-normality97,98. Bonferroni-corrected post hoc tests were performed to clarify statistical interactions. Greenhouse-Geisser correction was applied to mitigate the influence of unequal variance between conditions. The alpha level was set at 0.05.
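
As an illustration of the multiple-comparisons step, a generic Bonferroni adjustment multiplies each raw P value by the number of comparisons (capped at 1) before comparing it with the alpha level. The sketch below is a textbook version under that assumption; the P values are hypothetical, and the exact procedure implemented by the statistics packages named above may differ in detail.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of raw P values (generic textbook form)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


ALPHA = 0.05
raw_p = [0.004, 0.03, 0.20]          # hypothetical post hoc P values, not data from this study
adjusted_p = bonferroni(raw_p)
significant = [p < ALPHA for p in adjusted_p]
```

With three comparisons, a raw P value of 0.03 no longer falls below 0.05 after adjustment, which is why post hoc results are reported as corrected for multiple comparisons.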

Sex as a biological variable

For the initial behavioral finding, sex was included as a factor in the ANOVA and found not to account significantly for variance (no main effect of Sex on lever-pressing acquisition: F(1, 43) = 0.43, P = 0.51; devaluation test press rate: F(1, 43) = 0.60, P = 0.44; or devaluation index: F(1, 43) = 0.04, P = 0.84). Therefore, data from male and female mice were combined for analyses. For subsequent experiments, male and female mice were used in approximately equal numbers, but the N per sex was underpowered to examine sex differences. Sex was therefore not included as a factor in statistical analyses, though individual data points are visually disaggregated by sex.

Rigor and reproducibility

Group sizes were estimated based on prior work with this behavioral task41 and to ensure counterbalancing of virus, stress, pellet type, and devaluation test order. Investigators were not blinded to viral or stress group because they were required to administer infusions and stress exposure. All behaviors were scored using automated software (Med Associates). Each experiment included at least 1 replication cohort and cohorts were balanced by Viral group, Stress group, and hemisphere (for photometry recordings and tracing) prior to the start of the experiment. Investigators were blinded to group when performing histological validation and determining exclusions based on viral spread or mistargeted implant.

Extended Data

Extended Data Figure 1: Chronic mild unpredictable stress does not cause classic anxiety- and depression-like phenotypes.

Mice received 14 consecutive days of chronic mild unpredictable stress (stress), including twice-daily exposure to 1 of 6 mild stressors at pseudorandom times and orders: damp bedding (16 hr), tilted cage (16 hr), white noise (80 dB; 2 hr), continuous illumination (8 hr), physical restraint (2 hr), footshock (0.7-mA, 1-s, 5 shocks/10 min), prior to subsequent testing in a battery of behavioral assays classically used to assess anxiety- and depression-like behavior. (a-c) Open field test. Distance traveled (a; 2-sided t-test: t(22) = 0.32, P = 0.75, 95% CI −4.43 – 3.24), time spent in center zone (b; 2-sided t-test: t(22) = 1.10, P = 0.28), and entries into center zone (c; 2-sided t-test: t(22) = 0.63, P = 0.54, 95% CI −10.03 – 5.36). (d-f) Elevated plus maze. Distance traveled (d; 2-sided t-test: t(22) = 0.08, P = 0.94, 95% CI −2.72 – 2.92), time spent in open arms (e; 2-sided t-test: t(22) = 0.01, P = 0.92, 95% CI −26.17 – 23.70), and entries into open arms (f; 2-sided t-test: t(22) = 0.23, P = 0.82, 95% CI −6.56 – 5.23). (g-i) Light-dark emergence test. Distance traveled in light zone (g; 2-sided t-test: t(22) = 0.97, P = 0.34, 95% CI −0.73 – 2.01), time spent in light zone (h; 2-sided t-test: t(22) = 1.57, P = 0.13, 95% CI −11.93 – 86.98), and entries into light zone (i; 2-sided t-test: t(22) = 1.37, P = 0.19, 95% CI −1.708 – 8.041). (j-k) Sucrose preference test. Average amount consumed of water and 10% sucrose over 24 hr (j; 2-way ANOVA: Solution: F(1, 22) = 113.20, P < 0.0001; Stress: F(1, 22) = 0.14, P = 0.71; Solution x Stress: F(1, 22) = 0.02, P = 0.89) and ratio of sucrose:water consumed (k; 2-sided t-test: t(22) = 0.03, P = 0.98, 95% CI −0.064 – 0.063). (l-m) Progressive ratio tests. Total presses (l; 2-sided t-test: t(22) = 2.13, P = 0.04, 95% CI 72.94 – 5346) and breakpoint (m; final ratio completed; 2-sided t-test: t(22) = 2.12, P = 0.046, 95% CI 1.02 – 94.31). Control N = 12 (6 male), Stress N = 12 (6 male) mice. Males = closed circles, Females = open circles.
Data presented as mean +/− SEM. *P < 0.05, ***P < 0.001. Our stress procedure does not affect general locomotor activity or avoidance of anxiogenic spaces, nor does it create an anhedonia phenotype. Rather, this stress procedure appears to cause elevated motivation to exert effort to obtain reward. This contrasts with more severe, longer-lasting stress procedures, which do produce anxiety- and depression-like phenotypes in these tasks80,100,101. Thus, our stress procedure models chronic, low-level stress.

Extended Data Figure 2: Food-port entries during training and probe tests following handling control or chronic stress.

(a) Food-port entry rate across training for subjects in the devaluation experiment. 2-way ANOVA: Training: F(2.42, 108.90) = 3.17, P = 0.04; Stress: F(1, 45) = 0.07, P = 0.79; Training x Stress: F(3, 135) = 0.57, P = 0.64. (b) Food-port entries during the devaluation probe tests. 2-way ANOVA: Value: F(1, 45) = 6.77, P = 0.01; Stress: F(1, 45) = 0.29, P = 0.60; Stress x Value: F(1, 45) = 2.42, P = 0.13. Control N = 22 (13 male), Stress N = 25 (12 male) mice. (c) Food-port entry rate across training for subjects in the contingency degradation experiment. 3-way ANOVA: Training: F(2.84, 62.10) = 6.44, P = 0.001; Stress: F(1, 25) = 0.01, P = 0.91; Future Contingency Degradation group: F(1, 25) = 1.27, P = 0.27; Training x Stress: F(3, 75) = 1.62, P = 0.19; Training x Group: F(3, 75) = 0.24, P = 0.87; Stress x Group: F(1, 25) = 0.004, P = 0.95; Training x Stress x Group: F(3, 75) = 1.49, P = 0.23. (d) Food-port entries during the probe test 24 hr following contingency degradation or non-degraded control. 2-way ANOVA: Stress x Contingency Degradation Group: F(1, 25) = 18.88, P = 0.0002; Contingency Degradation: F(1, 25) = 4.29, P = 0.05; Stress: F(1, 25) = 1.41, P = 0.25. Control, Non-degraded N = 7 (3 male), Control, Degraded N = 7 (3 male), Stress, Non-degraded N = 7 (3 male), Stress, Degraded N = 8 (4 male) mice. Males = solid lines, Females = dashed lines. Data presented as mean +/− SEM. *P < 0.05, **P < 0.01, corrected for multiple comparisons.

Extended Data Figure 3: Lever presses and food-port entries during contingency degradation.

(a) Contingency degradation procedure. Following stress and training, half the subjects in each group received a 20-min contingency degradation session during which lever pressing continued to earn reward with a probability of 0.1, but reward was also delivered non-contingently with the same probability. This session was identical for non-degraded controls, except they did not receive free rewards. (b) Press rate in 1-min bins during the contingency degradation session. 3-way ANOVA: Time x Contingency Degradation Group: F(19, 475) = 2.03, P = 0.0063; Time x Stress: F(19, 475) = 2.43, P = 0.0007; Stress x Group: F(1, 25) = 0.0001, P = 0.99; Time: F(9.17, 229.20) = 2.13, P = 0.03; Stress: F(1, 25) = 1.36, P = 0.26; Degradation Group: F(1, 25) = 68.23, P < 0.0001; Time x Stress x Degradation Group: F(19, 475) = 1.30, P = 0.19. Contingency degradation caused lower press rates across the session in both control (Time x Contingency Degradation Group: F(19, 228) = 2.47, P = 0.0009; Time: F(6.62, 79.39) = 2.47, P = 0.03; Degradation Group: F(1, 12) = 45.16, P < 0.0001) and stressed (Contingency Degradation Group: F(1, 13) = 28.22, P = 0.0001; Time: F(6.01, 78.16) = 2.19, P = 0.05; Time x Contingency Degradation Group: F(19, 247) = 1.10, P = 0.35) mice. (c) Rate of entry into the food-delivery port in 1-min bins during the contingency degradation session. 3-way ANOVA: Time x Contingency Degradation Group: F(19, 475) = 3.80, P < 0.0001; Time x Stress: F(19, 475) = 1.20, P = 0.26; Stress x Group: F(1, 25) = 0.006, P = 0.94; Time: F(6.26, 156.60) = 7.53, P < 0.0001; Stress: F(1, 25) = 2.51, P = 0.13; Degradation Group: F(1, 25) = 1.37, P = 0.5; Time x Stress x Degradation Group: F(19, 475) = 0.86, P = 0.63. Control, Non-degraded N = 7 (3 male), Control, Degraded N = 7 (3 male), Stress, Non-degraded N = 7 (3 male), Stress, Degraded N = 8 (4 male) mice. Males = closed circles/solid lines, Females = open circles/dashed lines. Data presented as mean +/− SEM.
*P < 0.05, **P < 0.01, corrected for multiple comparisons.
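
The logic of the contingency degradation procedure in panel a can be illustrated with the difference between the probability of reward given a press and the probability of reward given no press (often written ΔP). This is a minimal sketch under the assumption that ΔP is an adequate summary of the action-outcome contingency; ΔP is not a quantity reported in this legend, and the function name is hypothetical.

```python
def delta_p(p_reward_given_press, p_reward_given_no_press):
    """Action-outcome contingency as a probability difference (illustrative summary only)."""
    return p_reward_given_press - p_reward_given_no_press


# Non-degraded controls: pressing earns reward with probability 0.1, no free rewards.
non_degraded = delta_p(0.1, 0.0)

# Degraded group: the same earned probability, plus free reward with the same probability.
degraded = delta_p(0.1, 0.1)
```

Under this summary, pressing remains predictive of reward for non-degraded controls, whereas for the degraded group pressing no longer changes the likelihood of reward, which is the condition an agent sensitive to the action-outcome contingency should detect.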

Extended Data Figure 4: BLA and CeA directly project to DMS.

(a) Top: Anterograde tracing approach. Infusion of an AAV expressing mCherry into the CeA. Bottom: mCherry labeling at infusion site in CeA (left) and mCherry-labeled fibers in the DMS (right). N = 4 (2 male) mice. We observed mCherry-expressing putative fibers in the DMS but not dorsolateral striatum. Expression was also detected in other well-known CeA projection targets such as the bed nucleus of the stria terminalis. (b) Top: Retrograde tracing approach. We infused the fluorescently labeled retrograde tracer Fluorogold into the DMS. Bottom: Fluorogold labeling at infusion site in DMS (left) and Fluorogold-labeled, DMS-projecting cell bodies in BLA and CeA (middle), with CeA magnified (right). Labeled cells were detected in both BLA and CeA, indicating that both BLA and CeA directly project to DMS. Labeling was greater in BLA than CeA, indicating the BLA→DMS pathway is denser than the CeA→DMS pathway. N = 4 (2 male) mice. (c) Top: Approach for rabies trans-synaptic retrograde tracing of DMS Drd1+ striatal neurons. We used rabies tracing to confirm monosynaptic amygdala projections onto DMS neurons. We infused a starter virus expressing cre-dependent TVA-oG-GFP into the DMS of mice expressing cre-recombinase under the control of dopamine receptor 1 (D1-Cre) or adenosine 2a receptor (A2A-Cre) genes102,103, followed by ΔG-deleted rabies-mCherry to retrogradely label cells that synapse onto DMS D1 or A2A neurons. Bottom: Starter oG virus (green) and ΔG-deleted rabies-mCherry (red) expression in DMS Drd1+ neurons (left) and rabies-labeled, DMS D1-projecting cell bodies in the BLA and CeA (right), consistent with prior reports30,34. Representative example from N = 4 (3 male) mice. (d) Top: Approach for rabies trans-synaptic retrograde tracing of DMS Adora2a+ neurons. Bottom: Starter oG virus (green) and ΔG-deleted rabies-mCherry (red) expression in DMS Adora2a+ neurons (left) and rabies-labeled, DMS A2A-projecting cell bodies in the BLA and CeA (right).
Representative example from N = 4 (3 male) mice. Scale bars = 200 μm. Combined, these data confirm that both BLA and CeA directly project to the DMS and are, thus, poised to influence the learning that supports goal-directed decision making and habit formation.

Extended Data Figure 5: Food-port entries during training with fiber photometry recording of BLA→DMS or CeA→DMS calcium activity following handling control or chronic stress.

(a) Food-port entry rates across training for BLA→DMS GCaMP8s mice. 2-way ANOVA: Training: F(2.47, 46.99) = 0.65, P = 0.56; Stress: F(1, 19) = 0.05, P = 0.82; Training x Stress: F(3, 57) = 0.24, P = 0.87. BLA Control N = 9 (4 male), BLA Stress N = 12 (5 male) mice. (b) Food-port entry rates across training for CeA→DMS GCaMP8s mice. 2-way ANOVA: Training: F(2.36, 47.19) = 0.89, P = 0.43; Stress: F(1, 20) = 2.71, P = 0.12; Training x Stress: F(3, 60) = 0.09, P = 0.96. CeA Control N = 11 (6 male), CeA Stress N = 11 (4 male) mice. Males = solid lines, Females = dashed lines. Data presented as mean +/− SEM.

Extended Data Figure 6: BLA→DMS and CeA→DMS pathway baseline activity and pathway responses to unpredicted rewarding and aversive events in control and stressed mice.

(a-j) Following instrumental training (Figure 2), we used fiber photometry to record GCaMP8s fluorescence changes in either BLA (top) or CeA (bottom) neurons that project to the DMS in response to unpredicted food-pellet reward deliveries or unpredicted 2-s, 0.7-mA footshocks in control and stressed mice. (a) Trial-averaged Z-scored Δf/F BLA→DMS GCaMP8s fluorescence changes around unpredicted food-pellet reward delivery. (b) Trial-averaged quantification of area under the BLA→DMS GCaMP8s Z-scored Δf/F curve (AUC) during the 3-s period prior to (baseline) and following reward collection. 2-way ANOVA: Stress x Reward: F(1, 18) = 10.88, P = 0.004; Reward: F(1, 18) = 1.19, P = 0.03; Stress: F(1, 18) = 1.77, P = 0.20. (c) Trial-averaged Z-scored Δf/F CeA→DMS GCaMP8s fluorescence changes around unpredicted food-pellet reward delivery. (d) Trial-averaged quantification of CeA→DMS GCaMP8s Z-scored Δf/F AUC during the 3-s period prior to and following reward collection. 2-way ANOVA: Stress x Reward: F(1, 20) = 11.79, P = 0.02; Reward: F(1, 20) = 8.14, P = 0.01; Stress: F(1, 20) = 4.49, P = 0.05. (e) Trial-averaged Z-scored Δf/F BLA→DMS GCaMP8s fluorescence changes around unpredicted footshock. (f) Trial-averaged quantification of BLA→DMS GCaMP8s Z-scored Δf/F AUC during the 1-s acute shock response compared to a 1-s pre-shock baseline. 2-way ANOVA: Shock: F(1, 18) = 8.533, P = 0.01; Stress: F(1, 18) = 0.1433, P = 0.71; Stress x Shock: F(1, 18) = 1.725, P = 0.21. (g) Trial-averaged quantification of BLA→DMS GCaMP8s Z-scored Δf/F AUC during 2-s post-shock period. 2-sided t-test: t(18) = 2.26, P = 0.04, 95% CI −2.681 to −0.09544. (h) Trial-averaged Z-scored Δf/F CeA→DMS GCaMP8s fluorescence changes around unpredicted footshock. (i) Trial-averaged quantification of CeA→DMS GCaMP8s Z-scored Δf/F AUC during the 1-s acute shock response, compared to baseline. 2-way ANOVA: Shock: F(1, 20) = 28.24, P < 0.0001; Stress: F(1, 20) = 0.22, P = 0.64; Stress x Shock: F(1, 20) = 3.201, P = 0.09.
(j) Trial-averaged quantification of CeA→DMS GCaMP8s Z-scored Δf/F AUC during 2-s post-shock period. 2-sided t-test: t(20) = 0.88, P = 0.39, 95% CI −0.99 – 2.43. BLA Control N = 8 (4 male), BLA Stress N = 12 (5 male) mice. CeA Control N = 11 (6 male), CeA Stress N = 11 (4 male) mice. BLA→DMS projections are activated by unpredicted rewards and this is attenuated by prior chronic stress. Conversely, CeA→DMS projections are not normally robustly activated by unpredicted rewards, but are activated by unpredicted rewards following chronic stress. Interestingly, unpredicted rewards robustly activated CeA→DMS projections here, but rewards did not evoke such a response early in instrumental training (Figure 2m). Rather, reward responses developed with training. This indicates that stress-induced engagement of the CeA→DMS pathway may require repeated reward experience, which may reflect engagement of this pathway with repeated reinforcement and/or opportunity to learn the value or salience of the reward. We speculate this CeA→DMS engagement could be a compensatory mechanism triggered in response to the lack of engagement of the BLA→DMS pathway. Both BLA→DMS and CeA→DMS pathways are acutely activated by unpredicted footshock regardless of prior stress. Chronic stress reduces post-shock activity in the BLA→DMS pathway. (k-l) Frequency (k; 2-way ANOVA: Training: F(2.41, 45.69) = 0.17, P = 0.88; Stress: F(1, 19) = 0.08, P = 0.78; Training x Stress: F(3, 57) = 0.85, P = 0.47) and amplitude (l; 2-way ANOVA: Training: F(2.48, 47.10) = 0.86, P = 0.45; Stress: F(1, 19) = 0.034, P = 0.85; Training x Stress: F(3, 57) = 1.37, P = 0.26) of Z-scored Δf/F spontaneous calcium activity of BLA→DMS projections during the 3-min baseline period prior to each training session in handled control and stressed mice.
(m-n) Frequency (m; 2-way ANOVA: Training: F(2.70, 53.97) = 0.21, P = 0.88; Stress: F(1, 20) = 3.03, P = 0.10; Training x Stress: F(3, 60) = 0.55, P = 0.65) and amplitude (n; 2-way ANOVA: Training: F(2.59, 51.83) = 0.32, P = 0.78; Stress: F(1, 20) = 3.70, P = 0.07; Training x Stress: F(3, 60) = 0.75, P = 0.52) of Z-scored Δf/F spontaneous calcium activity of CeA→DMS projections during the 3-min baseline period prior to each training session in handled control and stressed mice. Chronic stress did not alter baseline spontaneous calcium activity in either pathway. (o) Trial-averaged Z-scored Δf/F CeA→DMS GCaMP8s fluorescence changes aligned to reward collection during training, with 40-s post-collection window. Blue line is the average time of the next lever press (light blue bar = s.e.m.). In stressed mice, CeA→DMS neurons respond to earned reward and this activity takes ~30 s on average to come back to baseline. Control N = 11 (6 male), Stress N = 11 (4 male) mice. Males = solid lines, Females = dashed lines. Data presented as mean +/− SEM. **P < 0.01, corrected for multiple comparisons.
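
The windowed AUC measure used in panels b, d, f, g, i, and j can be sketched as a trapezoidal integral of an already Z-scored Δf/F trace over a pre-event and a post-event window. This is a minimal illustration, assuming an evenly sampled trace and trapezoidal integration; the sampling rate, the synthetic trace, and the function names are hypothetical and do not reproduce the analysis code deposited by the authors.

```python
def trapezoid_auc(samples, dt_s):
    """Trapezoidal area under evenly spaced samples (sample interval dt_s, in seconds)."""
    return sum((a + b) / 2.0 * dt_s for a, b in zip(samples, samples[1:]))


def window_auc(trace, times_s, start_s, stop_s):
    """AUC of the trace restricted to start_s <= t <= stop_s."""
    dt_s = times_s[1] - times_s[0]
    window = [y for t, y in zip(times_s, trace) if start_s <= t <= stop_s]
    return trapezoid_auc(window, dt_s)


# Hypothetical Z-scored trace sampled at 10 Hz from -3 s to +3 s around the event (t = 0):
# flat at 0 up to and including the event, then a step to z = 1.
fs_hz = 10
times_s = [i / fs_hz for i in range(-30, 31)]
trace = [0.0 if t <= 0 else 1.0 for t in times_s]

baseline_auc = window_auc(trace, times_s, -3.0, 0.0)   # 3-s pre-event baseline window
post_auc = window_auc(trace, times_s, 0.0, 3.0)        # 3-s post-event window
```

The same helper applies to the shorter shock windows (1-s acute response, 2-s post-shock period) by changing `start_s` and `stop_s`.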

Extended Data Figure 7: Food-port entries during training with BLA→DMS manipulations and devaluation probe tests.

(a-b) Optogenetic inactivation of BLA→DMS projections at reward during instrumental learning. (a) Food-port entries across training. 2-way ANOVA: Training: F(2.03, 38.55) = 3.30, P = 0.05; Virus: F(1, 19) = 0.14, P = 0.71; Training x Virus: F(3, 57) = 0.43, P = 0.73. (b) Food-port entry rates during devaluation probe tests. 2-way ANOVA: Virus x Value: F(1, 19) = 4.38, P = 0.05; Virus: F(1, 19) = 0.47, P = 0.50; Value: F(1, 19) = 0.39, P = 0.54. eYFP N = 10 (5 male), Arch N = 11 (5 male) mice. (c-d) Optogenetic activation of BLA→DMS projections during post-stress instrumental learning. (c) Food-port entry rate across training. 3-way ANOVA: Training: F(2.5, 82.82) = 6.47, P = 0.001; Stress: F(1, 33) = 3.78, P = 0.06; Virus: F(1, 33) = 0.02, P = 0.89; Training x Stress: F(3, 99) = 0.67, P = 0.57; Training x Virus: F(3, 99) = 0.45, P = 0.72; Stress x Virus: F(1, 33) = 2.18, P = 0.15; Training x Stress x Virus: F(3, 99) = 0.26, P = 0.86. (d) Food-port entry rate during the devaluation probe tests. 3-way ANOVA: Value: F(1, 33) = 15.65, P = 0.0004; Stress: F(1, 33) = 0.23, P = 0.63; Virus: F(1, 33) = 0.20, P = 0.65; Value x Stress: F(1, 33) = 2.75, P = 0.11; Value x Virus: F(1, 33) = 0.09, P = 0.76; Virus x Stress: F(1, 33) = 0.17, P = 0.68; Value x Stress x Virus: F(1, 33) = 1.73, P = 0.20. Control, Value: F(1, 16) = 12.42, P = 0.003; Virus: F(1, 16) = 0.0007, P = 0.98; Value x Virus: F(1, 16) = 0.40, P = 0.53. Stress, Value: F(1, 17) = 3.46, P = 0.08; Virus: F(1, 17) = 0.45, P = 0.51; Value x Virus: F(1, 17) = 1.71, P = 0.21. Control eYFP N = 11 (7 male), Control ChR2 N = 7 (4 male), Stress eYFP N = 9 (2 male), Stress ChR2 N = 10 (3 male) mice. (e-f) Chemogenetic activation of BLA→DMS projections during post-stress instrumental learning. (e) Food-port entry rate across training.
3-way ANOVA: Training: F(2.55, 84.12) = 1.64, P = 0.19; Stress: F(1, 33) = 0.05, P = 0.95; Virus: F(1, 33) = 0.08, P = 0.78; Training x Stress: F(3, 99) = 0.16, P = 0.92; Training x Virus: F(3, 99) = 0.21, P = 0.89; Stress x Virus: F(1, 33) = 0.02, P = 0.89; Training x Stress x Virus: F(3, 99) = 3.07, P = 0.03. (f) Food-port entry rate during the devaluation probe test. Planned comparisons 2-sided t-test valued v. devalued, Control mCherry: t(20) = 1.88, P = 0.07, 95% CI −0.21 – 5.41; Control hM3Dq: t(10) = 1.32, P = 0.20, 95% CI −1.40 – 6.54; Stress mCherry: t(16) = 0.75, P = 0.46, 95% CI −2.04 – 4.44; Stress hM3Dq: t(18) = 3.36, P = 0.002, 95% CI 2.01 – 8.16. Control mCherry N = 12 (7 male), Stress mCherry N = 9 (5 male), Stress hM3Dq N = 10 (5 male) mice. Males = solid lines, Females = dashed lines. Data presented as mean +/− SEM. **P < 0.01, corrected for multiple comparisons.

Extended Data Figure 8: Manipulation of BLA or CeA terminals in DMS is neither rewarding nor aversive.

(a) Following training and testing (Figure 3h–n), mice received a real-time place preference test in which 1 side of a 2-chamber apparatus was paired with optogenetic inhibition of BLA axons and terminals in the DMS. Average percent time spent in light-paired chamber across 2, 10-minute sessions (one with light paired with each side). 2-sided t-test: t(19) = 0.65, P = 0.52, 95% CI −0.04 – 0.08. eYFP N = 10 (5 male), Arch N = 11 (5 male) mice. Males = closed circles, Females = open circles. Data presented as mean +/− SEM. (b-c) Following training and testing, mice received a real-time place preference test in which 1 side of a 2-chamber apparatus was paired with optogenetic stimulation of DMS-projecting CeA neurons. (b) Average percent time spent in light-paired chamber across 2, 10-minute sessions (one with light paired with each side) in handled control subjects. 2-sided t-test: t(21) = 1.75, P = 0.10, 95% CI −0.79 – 9.06. eYFP N = 17 (9 male), ChR2 N = 6 (3 male) mice. (c) Average percent time spent in light-paired chamber across 2, 10-minute sessions (one with light paired with each side) in subjects with prior once-daily stress for 14 d. 2-sided t-test: t(16) = 0.52, P = 0.61, 95% CI −3.74 – 6.17. eYFP N = 8 (4 male), ChR2 N = 10 (6 male) mice. Males = closed circles, Females = open circles. Data presented as mean +/− SEM.

Extended Data Figure 9: Food-port entries during training with CeA→DMS manipulations and devaluation probe tests.

(a-b) Optogenetic inhibition of CeA→DMS projections during instrumental overtraining. (a) Food-port entry rates across training. 2-way ANOVA: Training: F(2.29, 45.82) = 1.81, P = 0.17; Virus: F(1, 20) = 0.67, P = 0.42; Training x Virus: F(8, 160) = 0.60, P = 0.77. (b) Food-port entry rates during the devaluation probe tests. 2-way ANOVA: Virus x Value: F(1, 20) = 4.51, P = 0.046; Value: F(1, 20) = 1.47, P = 0.24; Virus: F(1, 20) = 0.41, P = 0.53. eYFP N = 11 (3 male), Arch N = 11 (7 male) mice. (c-d) Optogenetic inactivation of CeA→DMS projections at reward during post-stress learning. (c) Food-port entry rates across training. 3-way ANOVA: Training: F(2.63, 84.18) = 3.21, P = 0.03; Stress: F(1, 32) = 0.60, P = 0.44; Virus: F(1, 32) = 4.75, P = 0.04; Training x Stress: F(3, 96) = 1.55, P = 0.21; Training x Virus: F(3, 96) = 2.42, P = 0.07; Stress x Virus: F(1, 32) = 0.04, P = 0.84; Training x Stress x Virus: F(3, 96) = 1.14, P = 0.34. (d) Food-port entry rate during the devaluation probe test. 3-way ANOVA: Value x Stress x Virus: F(1, 32) = 0.03, P = 0.86; Value: F(1, 32) = 6.44, P = 0.02; Stress: F(1, 32) = 2.02, P = 0.16; Virus: F(1, 32) = 1.09, P = 0.30; Value x Stress: F(1, 32) = 0.99, P = 0.33; Value x Virus: F(1, 32) = 0.02, P = 0.89; Virus x Stress: F(1, 32) = 0.24, P = 0.63. Control groups, 2-way ANOVA: Value x Virus: F(1, 18) = 0.09, P = 0.77; Value: F(1, 18) = 1.99, P = 0.17; Virus: F(1, 18) = 0.21, P = 0.65. Stress groups, 2-way ANOVA: Value x Virus: F(1, 14) = 0.0005, P = 0.98; Value: F(1, 14) = 3.94, P = 0.06; Virus: F(1, 14) = 0.85, P = 0.87. Control eYFP N = 9 (5 male), Control Arch N = 11 (4 male), Stress eYFP N = 7 (6 male), Stress Arch N = 9 (5 male) mice. (e-f) Chemogenetic inhibition of CeA→DMS projections during post-stress instrumental learning. (e) Food-port entry rates across training.
3-way ANOVA: Training: F(1.85, 75.67) = 2.02, P = 0.14; Stress: F(1, 41) = 4.42, P = 0.04; Virus: F(1, 41) = 0.41, P = 0.53; Training x Stress: F(3, 123) = 3.08, P = 0.03; Training x Virus: F(3, 123) = 0.64, P = 0.59; Stress x Virus: F(1, 41) = 0.20, P = 0.66; Training x Stress x Virus: F(3, 123) = 3.23, P = 0.02. (f) Food-port entry rates during the devaluation probe tests. Planned comparisons 2-sided t-test valued v. devalued, Control mCherry: t(11) = 1.94, P = 0.06, 95% CI −0.25 – 12.07; Control hM4Di: t(12) = 0.38, P = 0.71, 95% CI −4.81 – 7.03; Stress mCherry: t(10) = 0.05, P = 0.96, 95% CI −6.33 – 5.99; Stress hM4Di: t(8) = 0.47, P = 0.64, 95% CI −5.47 – 8.76. Control mCherry N = 12 (5 male), Control hM4Di N = 13 (8 male), Stress mCherry N = 11 (5 male), Stress hM4Di N = 9 (4 male) mice. (g-h) Optogenetic stimulation of CeA→DMS projections at reward during learning following subthreshold once-daily stress (SubStress). (g) Food-port entry rate across training. 2-way ANOVA: Training: F(1.73, 34.50) = 0.89, P = 0.41; Virus: F(1, 20) = 0.46, P = 0.51; Training x Virus: F(3, 60) = 0.39, P = 0.76. (h) Food-port entry rate during the devaluation probe test. 2-way ANOVA: Virus x Value: F(1, 20) = 1.37, P = 0.26; Virus: F(1, 20) = 0.005, P = 0.94; Value: F(1, 20) = 1.36, P = 0.26. eYFP N = 10 (4 male), ChR2 N = 12 (6 male) mice. Males = solid lines, Females = dashed lines. Data presented as mean +/− SEM.

Extended Data Figure 10: Optogenetic stimulation of CeA→DMS projections in control mice.

(a) We used an intersectional approach to express the excitatory opsin Channelrhodopsin 2 (ChR2), or a fluorophore control, in DMS-projecting CeA neurons and implanted optic fibers above the CeA. (b) Representative images of retro-cre expression in DMS and immunofluorescent staining of cre-dependent ChR2 expression in CeA (scale bars = 200 μm) and map of retro-cre in DMS and cre-dependent ChR2 expression in CeA for all subjects. (c) Procedure. Lever presses earned food pellet rewards on a random-ratio (RR) reinforcement schedule. We used blue light (473 nm, 10 mW, 20 Hz, 25-ms pulse width, 2 s) to stimulate CeA→DMS neurons during the collection of each earned reward in mice without a history of stress. Mice were then given a lever-pressing probe test in the Valued state, prefed on untrained food-pellet type to control for general satiety, and Devalued state, prefed on trained food-pellet type to induce sensory-specific satiety devaluation (order counterbalanced). (d) Press rates across training. 2-way ANOVA: Training: F(1.85, 38.75) = 62.18, P < 0.0001; Virus: F(1, 21) = 0.23, P = 0.64; Training x Virus: F(3, 63) = 0.05, P = 0.98. (e) Food-port entries across training. 2-way ANOVA: Training: F(2.42, 50.77) = 2.00, P = 0.14; Virus: F(1, 21) = 1.85, P = 0.19; Training x Virus: F(3, 63) = 0.22, P = 0.88. (f) Press rate during the devaluation probe test. 2-way ANOVA: Value: F(1, 21) = 20.32, P = 0.0002; Virus: F(1, 21) = 0.92, P = 0.35; Virus x Value: F(1, 21) = 1.17, P = 0.29. (g) Devaluation index. 2-sided t-test: t(21) = 1.37, P = 0.19, 95% CI −0.25 – 0.05. (h) Food-port entries during the devaluation probe tests. 2-way ANOVA: Value: F(1, 21) = 30.07, P < 0.0001; Virus: F(1, 21) = 0.12, P = 0.73; Virus x Value: F(1, 21) = 3.45, P = 0.08. eYFP N = 17 (9 male), ChR2 N = 6 (3 male) mice. Data presented as mean +/− SEM. **P < 0.01, ***P < 0.001, corrected for multiple comparisons.
Optogenetic activation of CeA→DMS projections at reward during learning affects neither acquisition of the lever-press behavior nor the action-outcome learning needed to support flexible goal-directed decision making during the devaluation test.
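
The random-ratio (RR) schedule named in panel c can be illustrated as a schedule in which each lever press is reinforced independently with probability 1/n (so that n presses are required on average). The sketch below assumes that standard definition and applies a per-session cap of 20 outcomes as an illustrative parameter; the 20-min session time limit is not modeled, and the function name and seed are hypothetical.

```python
import random


def run_rr_session(n_presses, ratio, max_outcomes=20, seed=0):
    """Simulate presses on a random-ratio schedule (illustrative, not the task code).

    Each press is reinforced with probability 1/ratio. Returns a tuple of
    (outcomes earned, presses made before the session ended).
    """
    rng = random.Random(seed)
    outcomes = 0
    for press in range(1, n_presses + 1):
        if rng.random() < 1.0 / ratio:
            outcomes += 1
            if outcomes == max_outcomes:
                return outcomes, press   # session ends once the outcome cap is reached
    return outcomes, n_presses


fr1_result = run_rr_session(n_presses=100, ratio=1)    # ratio 1 reinforces every press
rr10_result = run_rr_session(n_presses=50, ratio=10)   # on average 1 reward per 10 presses
```

Because reinforcement is probabilistic on each press, the number of presses separating successive rewards varies from reward to reward, unlike a fixed-ratio schedule.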

Supplementary Material

1

ACKNOWLEDGEMENTS

This research was supported by NIH R01DA046679 (KMW), NIH R01DA058374 (KMW), NIH R01DA035443 (KMW), NIH T32DA024635 (JRG), NIH F32DA056201 (JRG), A.P. Giannini Fellowship (JRG), NIH K99MH135177 (JRG), NIH TL4GM118977 (NP), NIH R01MH119089 (AA), and the Staglin Center for Behavior and Brain Sciences. UCLA Behavioral Testing Core provided space and behavioral testing equipment for the open field, elevated plus maze and light-dark emergence test.

Footnotes

COMPETING FINANCIAL INTERESTS

The authors have no biomedical financial interests or potential conflicts of interest to declare.

Code availability

Custom-written MATLAB code is accessible via Dryad repository https://doi.org/10.5061/dryad.2jm63xt00 and available from the corresponding author upon request.

Data availability

All data that support the findings of this study are available in the source data accompanying this paper and from the corresponding author upon request.

REFERENCES

  • 1. Schwabe L & Wolf OT. Stress prompts habit behavior in humans. J Neurosci 29, 7191–7198 (2009).
  • 2. Pool ER et al. Determining the effects of training duration on the behavioral expression of habitual control in humans: a multilaboratory investigation. Learn Mem 29, 16–28 (2022). doi: 10.1101/lm.053413.121
  • 3. Friedel E et al. How Accumulated Real Life Stress Experience and Cognitive Speed Interact on Decision-Making Processes. Front Hum Neurosci 11, 302 (2017). doi: 10.3389/fnhum.2017.00302
  • 4. Schwabe L, Dalm S, Schächinger H & Oitzl MS. Chronic stress modulates the use of spatial and stimulus-response learning strategies in mice and man. Neurobiol Learn Mem 90, 495–503 (2008). doi: 10.1016/j.nlm.2008.07.015
  • 5. Dias-Ferreira E et al. Chronic stress causes frontostriatal reorganization and affects decision-making. Science 325, 621–625 (2009). doi: 10.1126/science.1171203
  • 6. Balleine BW. The Meaning of Behavior: Discriminating Reflex and Volition in the Brain. Neuron 104, 47–62 (2019). doi: 10.1016/j.neuron.2019.09.024
  • 7. Graybiel AM. Habits, rituals, and the evaluative brain. Annu Rev Neurosci 31, 359–387 (2008). doi: 10.1146/annurev.neuro.29.051605.112851
  • 8. Dickinson A. Actions and Habits: the development of behavioural autonomy. Philosophical Transactions of the Royal Society of London B308, 67–78 (1985).
  • 9. Redish AD, Jensen S & Johnson A. A unified framework for addiction: vulnerabilities in the decision process. Behav Brain Sci 31, 415–437; discussion 437–487 (2008). doi: 10.1017/S0140525X0800472X
  • 10. Vandaele Y & Ahmed SH. Habit, choice, and addiction. Neuropsychopharmacology 46, 689–698 (2021). doi: 10.1038/s41386-020-00899-y
  • 11. Voon V et al. Disorders of compulsivity: a common bias towards learning habits. Mol Psychiatry 20, 345–352 (2015). doi: 10.1038/mp.2014.44
  • 12. Belin D, Belin-Rauscent A, Murray JE & Everitt BJ. Addiction: failure of control over maladaptive incentive habits. Curr Opin Neurobiol (2013). doi: 10.1016/j.conb.2013.01.025
  • 13. Hogarth L, Balleine BW, Corbit LH & Killcross S. Associative learning mechanisms underpinning the transition from recreational drug use to addiction. Ann N Y Acad Sci 1282, 12–24 (2013). doi: 10.1111/j.1749-6632.2012.06768.x
  • 14. Ray LA et al. Capturing habitualness of drinking and smoking behavior in humans. Drug Alcohol Depend 207, 107738 (2020). doi: 10.1016/j.drugalcdep.2019.107738
  • 15. Gillan CM et al. Disruption in the balance between goal-directed behavior and habit learning in obsessive-compulsive disorder. Am J Psychiatry 168, 718–726 (2011). doi: 10.1176/appi.ajp.2011.10071062
  • 16. Horstmann A et al. Slave to habit? Obesity is associated with decreased behavioural sensitivity to reward devaluation. Appetite 87, 175–183 (2015). doi: 10.1016/j.appet.2014.12.212
  • 17. Morris RW, Cyrzon C, Green MJ, Le Pelley ME & Balleine BW. Impairments in action-outcome learning in schizophrenia. Transl Psychiatry 8, 54 (2018). doi: 10.1038/s41398-018-0103-0
  • 18. Griffiths KR, Morris RW & Balleine BW. Translational studies of goal-directed action as a framework for classifying deficits across psychiatric disorders. Front Syst Neurosci 8, 101 (2014). doi: 10.3389/fnsys.2014.00101
  • 19. Byrne KA, Six SG & Willis HC. Examining the effect of depressive symptoms on habit formation and habit-breaking. J Behav Ther Exp Psychiatry 73, 101676 (2021). doi: 10.1016/j.jbtep.2021.101676
  • 20. Alvares GA, Balleine BW & Guastella AJ. Impairments in goal-directed actions predict treatment response to cognitive-behavioral therapy in social anxiety disorder. PLoS One 9, e94778 (2014). doi: 10.1371/journal.pone.0094778
  • 21. Alvares GA, Balleine BW, Whittle L & Guastella AJ. Reduced goal-directed action control in autism spectrum disorder. Autism Res (2016). doi: 10.1002/aur.1613
  • 22. Agid O, Kohn Y & Lerer B. Environmental stress and psychiatric illness. Biomed Pharmacother 54, 135–141 (2000). doi: 10.1016/S0753-3322(00)89046-0
  • 23.Baumeister D, Lightman SL & Pariante CM The Interface of Stress and the HPA Axis in Behavioural Phenotypes of Mental Illness. Curr Top Behav Neurosci 18, 13–24 (2014). 10.1007/7854_2014_304 [DOI] [PubMed] [Google Scholar]
  • 24.Brady KT & Sinha R Co-occurring mental and substance use disorders: the neurobiological effects of chronic stress. Am J Psychiatry 162, 1483–1493 (2005). 10.1176/appi.ajp.162.8.1483 [DOI] [PubMed] [Google Scholar]
  • 25.Duffing TM, Greiner SG, Mathias CW & Dougherty DM Stress, substance abuse, and addiction. Curr Top Behav Neurosci 18, 237–263 (2014). 10.1007/7854_2014_276 [DOI] [PubMed] [Google Scholar]
  • 26.Malvaez M & Wassum K Regulation of habit formation in the dorsal striatum. Current Opinion in Behavioral Sciences 20, 67–74 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Balleine BW & O’Doherty JP Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69 (2010). 10.1038/npp.2009.131 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Yin HH, Ostlund SB, Knowlton BJ & Balleine BW The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci 22, 513–523 (2005). [DOI] [PubMed] [Google Scholar]
  • 29.Balleine BW, Killcross AS & Dickinson A The effect of lesions of the basolateral amygdala on instrumental conditioning. J Neurosci 23, 666–675 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Pan WX, Mao T & Dudman JT Inputs to the dorsal striatum of the mouse reflect the parallel circuit architecture of the forebrain. Front Neuroanat 4, 147 (2010). 10.3389/fnana.2010.00147 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Gowrishankar R et al. Endogenous opioid dynamics in the dorsal striatum sculpt neural activity to control goal-directed action. BioRxiv (2024). 10.1101/2024.05.20.595035 [DOI] [Google Scholar]
  • 32.Lingawi NW & Balleine BW Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits. J Neurosci 32, 1073–1081 (2012). 10.1523/JNEUROSCI.4806-11.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Swanson LW & Petrovich GD What is the amygdala? Trends Neurosci 21, 323–331 (1998). [DOI] [PubMed] [Google Scholar]
  • 34.Wall NR, De La Parra M, Callaway EM & Kreitzer AC Differential innervation of direct- and indirect-pathway striatal projection neurons. Neuron 79, 347–360 (2013). 10.1016/j.neuron.2013.05.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Heaton EC, Seo EH, Butkovich LM, Yount ST & Gourley SL Control of goal-directed and inflexible actions by dorsal striatal melanocortin systems, in coordination with the central nucleus of the amygdala. Prog Neurobiol 238, 102629 (2024). 10.1016/j.pneurobio.2024.102629 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Moscarello JM & Penzo MA The central nucleus of the amygdala and the construction of defensive modes across the threat-imminence continuum. Nat Neurosci 25, 999–1008 (2022). 10.1038/s41593-022-01130-5 [DOI] [PubMed] [Google Scholar]
  • 37.Roozendaal B, McEwen BS & Chattarji S Stress, memory and the amygdala. Nat Rev Neurosci 10, 423–433 (2009). 10.1038/nrn2651 [DOI] [PubMed] [Google Scholar]
  • 38.Dickinson AD, Nicholas J & Adams CD The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Quarterly Journal of Experimental Psychology 35.1, 35–51 (1983). [Google Scholar]
  • 39.Adams CD & Dickinson A Instrumental responding following reinforcer devaluation. The Quarterly Journal of Experimental Psychology 33, 109–121 (1981). [Google Scholar]
  • 40.Hammond LJ The effect of contingency upon the appetitive conditioning of free-operant behavior. J Exp Anal Behav 34, 297–304 (1980). 10.1901/jeab.1980.34-297 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Malvaez M et al. Habits Are Negatively Regulated by Histone Deacetylase 3 in the Dorsal Striatum. Biol Psychiatry (2018). 10.1016/j.biopsych.2018.01.025 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Gremel CM & Costa RM Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun 4, 2264 (2013). 10.1038/ncomms3264 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Alexander GM et al. Remote control of neuronal activity in transgenic mice expressing evolved G protein-coupled receptors. Neuron 63, 27–39 (2009). 10.1016/j.neuron.2009.06.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Zhu H et al. Cre-dependent DREADD (Designer Receptors Exclusively Activated by Designer Drugs) mice. Genesis 54, 439–446 (2016). 10.1002/dvg.22949 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Amaya KA et al. Habit learning shapes activity dynamics in the central nucleus of the amygdala. BioRxiv (2024). 10.1101/2024.02.20.580730 [DOI] [Google Scholar]
  • 46.Tipps M, Marron Fernandez de Velasco E, Schaeffer A & Wickman K Inhibition of Pyramidal Neurons in the Basal Amygdala Promotes Fear Learning. eNeuro 5 (2018). 10.1523/ENEURO.0272-18.2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Tuscher JJ, Taxier LR, Fortress AM & Frick KM Chemogenetic inactivation of the dorsal hippocampus and medial prefrontal cortex, individually and concurrently, impairs object recognition and spatial memory consolidation in female mice. Neurobiol Learn Mem 156, 103–116 (2018). 10.1016/j.nlm.2018.11.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Corbit LH, Leung BK & Balleine BW The role of the amygdala-striatal pathway in the acquisition and performance of goal-directed instrumental actions. J Neurosci 33, 17682–17690 (2013). 10.1523/JNEUROSCI.3271-13.2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Ostlund SB & Balleine BW Differential involvement of the basolateral amygdala and mediodorsal thalamus in instrumental action selection. J Neurosci 28, 4398–4405 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Fisher SD, Ferguson LA, Bertran-Gonzalez J & Balleine BW Amygdala-Cortical Control of Striatal Plasticity Drives the Acquisition of Goal-Directed Action. Curr Biol (2020). 10.1016/j.cub.2020.08.090 [DOI] [PubMed] [Google Scholar]
  • 51.Namburi P et al. A circuit mechanism for differentiating positive and negative associations. Nature 520, 675–678 (2015). 10.1038/nature14366 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Tye KM Neural Circuit Motifs in Valence Processing. Neuron 100, 436–452 (2018). 10.1016/j.neuron.2018.10.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Balleine BW & Killcross S Parallel incentive processing: an integrated view of amygdala function. Trends Neurosci 29, 272–279 (2006). [DOI] [PubMed] [Google Scholar]
  • 54.Ugolini A, Sokal DM, Arban R & Large CH CRF1 receptor activation increases the response of neurons in the basolateral nucleus of the amygdala to afferent stimulation. Front Behav Neurosci 2, 2 (2008). 10.3389/neuro.08.002.2008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Liu ZP et al. Chronic stress impairs GABAergic control of amygdala through suppressing the tonic GABAA receptor currents. Mol Brain 7, 32 (2014). 10.1186/1756-6606-7-32 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Rosenkranz JA, Venheim ER & Padival M Chronic stress causes amygdala hyperexcitability in rodents. Biol Psychiatry 67, 1128–1136 (2010). 10.1016/j.biopsych.2010.02.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Hetzel A & Rosenkranz JA Distinct effects of repeated restraint stress on basolateral amygdala neuronal membrane properties in resilient adolescent and adult rats. Neuropsychopharmacology 39, 2114–2130 (2014). 10.1038/npp.2014.60 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Rau AR, Chappell AM, Butler TR, Ariwodola OJ & Weiner JL Increased Basolateral Amygdala Pyramidal Cell Excitability May Contribute to the Anxiogenic Phenotype Induced by Chronic Early-Life Stress. J Neurosci 35, 9730–9740 (2015). 10.1523/JNEUROSCI.0384-15.2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Sharp BM Basolateral amygdala and stress-induced hyperexcitability affect motivated behaviors and addiction. Transl Psychiatry 7, e1194 (2017). 10.1038/tp.2017.161 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Masneuf S et al. Glutamatergic mechanisms associated with stress-induced amygdala excitability and anxiety-related behavior. Neuropharmacology 85, 190–197 (2014). 10.1016/j.neuropharm.2014.04.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Lowery-Gionta EG et al. Chronic stress dysregulates amygdalar output to the prefrontal cortex. Neuropharmacology 139, 68–75 (2018). 10.1016/j.neuropharm.2018.06.032 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Blume SR, Padival M, Urban JH & Rosenkranz JA Disruptive effects of repeated stress on basolateral amygdala neurons and fear behavior across the estrous cycle in rats. Sci Rep 9, 12292 (2019). 10.1038/s41598-019-48683-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Partridge JG et al. Stress increases GABAergic neurotransmission in CRF neurons of the central amygdala and bed nucleus stria terminalis. Neuropharmacology 107, 239–250 (2016). 10.1016/j.neuropharm.2016.03.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.He F, Ai H, Wang M, Wang X & Geng X Altered Neuronal Activity in the Central Nucleus of the Amygdala Induced by Restraint Water-Immersion Stress in Rats. Neurosci Bull 34, 1067–1076 (2018). 10.1007/s12264-018-0282-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Giovanniello J et al. A Central Amygdala-Globus Pallidus Circuit Conveys Unconditioned Stimulus-Related Information and Controls Fear Learning. J Neurosci 40, 9043–9054 (2020). 10.1523/JNEUROSCI.2090-20.2020 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Murray JE et al. Basolateral and central amygdala differentially recruit and maintain dorsolateral striatum-dependent cocaine-seeking habits. Nat Commun 6, 10088 (2015). 10.1038/ncomms10088 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Liu J et al. Differential efferent projections of GABAergic neurons in the basolateral and central nucleus of amygdala in mice. Neurosci Lett 745, 135621 (2021). 10.1016/j.neulet.2020.135621 [DOI] [PubMed] [Google Scholar]
  • 68.Sah P, Faber ES, Lopez De Armentia M & Power J The amygdaloid complex: anatomy and physiology. Physiol Rev 83, 803–834 (2003). 10.1152/physrev.00002.2003 [DOI] [PubMed] [Google Scholar]
  • 69.Limoges A, Yarur HE & Tejeda HA Dynorphin/kappa opioid receptor system regulation on amygdaloid circuitry: Implications for neuropsychiatric disorders. Front Syst Neurosci 16, 963691 (2022). 10.3389/fnsys.2022.963691 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Daviu N, Bruchas MR, Moghaddam B, Sandi C & Beyeler A Neurobiological links between stress and anxiety. Neurobiol Stress 11, 100191 (2019). 10.1016/j.ynstr.2019.100191 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.McEwen BS, Nasca C & Gray JD Stress Effects on Neuronal Structure: Hippocampus, Amygdala, and Prefrontal Cortex. Neuropsychopharmacology 41, 3–23 (2016). 10.1038/npp.2015.171 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Shan Q, Ge M, Christie MJ & Balleine BW The acquisition of goal-directed actions generates opposing plasticity in direct and indirect pathways in dorsomedial striatum. J Neurosci 34, 9196–9201 (2014). 10.1523/JNEUROSCI.0313-14.2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Belin-Rauscent A, Everitt BJ & Belin D Intrastriatal shifts mediate the transition from drug-seeking actions to habits. Biol Psychiatry 72, 343–345 (2012). 10.1016/j.biopsych.2012.07.001 [DOI] [PubMed] [Google Scholar]
  • 74.Corbit LH, Nie H & Janak PH Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biol Psychiatry 72, 389–395 (2012). 10.1016/j.biopsych.2012.02.024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Wendler E et al. The roles of the nucleus accumbens core, dorsomedial striatum, and dorsolateral striatum in learning: performance and extinction of Pavlovian fear-conditioned responses and instrumental avoidance responses. Neurobiol Learn Mem 109, 27–36 (2014). 10.1016/j.nlm.2013.11.009 [DOI] [PubMed] [Google Scholar]
  • 76.Weera MM, Schreiber AL, Avegno EM & Gilpin NW The role of central amygdala corticotropin-releasing factor in predator odor stress-induced avoidance behavior and escalated alcohol drinking in rats. Neuropharmacology 166, 107979 (2020). 10.1016/j.neuropharm.2020.107979 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Franklin KBJ & Paxinos G The Mouse Brain in Stereotaxic Coordinates. 3rd edn (Elsevier, 2008). [Google Scholar]

ADDITIONAL REFERENCES

  • 78.Tye KM et al. Dopamine neurons modulate neural encoding and expression of depression-related behaviour. Nature 493, 537–541 (2013). 10.1038/nature11740 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Cerniauskas I et al. Chronic Stress Induces Activity, Synaptic, and Transcriptional Remodeling of the Lateral Habenula Associated with Deficits in Motivated Behaviors. Neuron 104, 899–915.e898 (2019). 10.1016/j.neuron.2019.09.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Monteiro S et al. An efficient chronic unpredictable stress protocol to induce stress-related responses in C57BL/6 mice. Front Psychiatry 6, 6 (2015). 10.3389/fpsyt.2015.00006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Bavley CC, Fischer DK, Rizzo BK & Rajadhyaksha AM Ca. Neurobiol Stress 7, 27–37 (2017). 10.1016/j.ynstr.2017.02.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Freymann J, Tsai PP, Stelzer HD, Mischke R & Hackbarth H Impact of bedding volume on physiological and behavioural parameters in laboratory mice. Lab Anim 51, 601–612 (2017). 10.1177/0023677217694400 [DOI] [PubMed] [Google Scholar]
  • 83.Pałucha-Poniewiera A, Podkowa K, Rafało-Ulińska A, Brański P & Burnat G The influence of the duration of chronic unpredictable mild stress on the behavioural responses of C57BL/6J mice. Behav Pharmacol 31, 574–582 (2020). 10.1097/FBP.0000000000000564 [DOI] [PubMed] [Google Scholar]
  • 84.La-Vu MQ et al. Sparse genetically defined neurons refine the canonical role of periaqueductal gray columnar organization. Elife 11 (2022). 10.7554/eLife.77115 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Reis FM et al. Dorsal periaqueductal gray ensembles represent approach and avoidance states. Elife 10 (2021). 10.7554/eLife.64934 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Hilário MR, Clouse E, Yin HH & Costa RM Endocannabinoid signaling is critical for habit formation. Front Integr Neurosci 1, 6 (2007). 10.3389/neuro.07.006.2007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Lichtenberg NT et al. The Medial Orbitofrontal Cortex-Basolateral Amygdala Circuit Regulates the Influence of Reward Cues on Adaptive Behavior and Choice. J Neurosci 41, 7267–7277 (2021). 10.1523/JNEUROSCI.0901-21.2021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Sias A et al. A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories. eLife 10 (2021). 10.7554/eLife.68617 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Sherathiya VN, Schaid MD, Seiler JL, Lopez GC & Lerner TN GuPPy, a Python toolbox for the analysis of fiber photometry data. Sci Rep 11, 24212 (2021). 10.1038/s41598-021-03626-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Roth BL DREADDs for Neuroscientists. Neuron 89, 683–694 (2016). 10.1016/j.neuron.2016.01.040 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Vazey EM & Aston-Jones G Designer receptor manipulations reveal a role of the locus coeruleus noradrenergic system in isoflurane general anesthesia. Proc Natl Acad Sci U S A 111, 3859–3864 (2014). 10.1073/pnas.1310025111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Qiu MH, Chen MC, Fuller PM & Lu J Stimulation of the Pontine Parabrachial Nucleus Promotes Wakefulness via Extra-thalamic Forebrain Circuit Nodes. Curr Biol 26, 2301–2312 (2016). 10.1016/j.cub.2016.07.054 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Pomrenze MB et al. A Corticotropin Releasing Factor Network in the Extended Amygdala for Anxiety. J Neurosci 39, 1030–1043 (2019). 10.1523/JNEUROSCI.2143-18.2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Malvaez M, Shieh C, Murphy MD, Greenfield VY & Wassum KM Distinct cortical–amygdala projections drive reward value encoding and retrieval. Nature Neuroscience (2019). 10.1038/s41593-019-0374-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Collins AL et al. Nucleus Accumbens Cholinergic Interneurons Oppose Cue-Motivated Behavior. Biol Psychiatry (2019). 10.1016/j.biopsych.2019.02.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Lichtenberg NT et al. The medial orbitofrontal cortex - basolateral amygdala circuit regulates the influence of reward cues on adaptive behavior and choice. J Neurosci (2021). 10.1523/JNEUROSCI.0901-21.2021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Schmider E, Ziegler M, Danay E, Beyer L & Bühner M Is it really robust? Reinvestigating the robustness of ANOVA against violations of the normal distribution assumption. Methodology : European journal of research methods for the behavioral & social sciences 6, 147–151 (2010). [Google Scholar]
  • 98.Knief U & Forstmeier W Violating the normality assumption may be the lesser of two evils. Behav Res Methods 53, 2576–2590 (2021). 10.3758/s13428-021-01587-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Wassum K et al. (Dryad, 2021). [Google Scholar]
  • 100.Mineur YS, Belzung C & Crusio WE Effects of unpredictable chronic mild stress on anxiety and depression-like behavior in mice. Behav Brain Res 175, 43–50 (2006). 10.1016/j.bbr.2006.07.029 [DOI] [PubMed] [Google Scholar]
  • 101.Fang X et al. Chronic unpredictable stress induces depression-related behaviors by suppressing AgRP neuron activity. Mol Psychiatry 26, 2299–2315 (2021). 10.1038/s41380-020-01004-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Gong S et al. Targeting Cre recombinase to specific neuron populations with bacterial artificial chromosome constructs. J Neurosci 27, 9817–9823 (2007). 10.1523/JNEUROSCI.2707-07.2007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Valjent E, Bertran-Gonzalez J, Hervé D, Fisone G & Girault JA Looking BAC at striatal signaling: cell-specific analysis in new transgenic mice. Trends Neurosci 32, 538–547 (2009). 10.1016/j.tins.2009.06.005 [DOI] [PubMed] [Google Scholar]

Data Availability Statement

All data that support the findings of this study are available in the source data accompanying this paper and from the corresponding author upon request.
