Summary:
At the core of value-based learning is the nucleus accumbens (NAc). D1- and D2-receptor containing medium spiny neurons (MSNs) in the NAc core are hypothesized to have opposing valence-based roles in behavior. Using optical imaging and manipulation approaches in mice, we show that neither D1 nor D2 MSNs signal valence. D1 MSN responses were evoked by stimuli regardless of valence or contingency. D2 MSN responses were evoked by both cues and outcomes, changed dynamically with learning, and tracked valence-free prediction error at the population and individual neuron level. Finally, D2 MSN responses to cues were necessary for associative learning. Thus, D1 and D2 MSNs work in tandem, rather than in opposition, by signaling specific properties of stimuli to control learning.
Keywords: striatum, reinforcement learning, fear conditioning, motivation, aversion, calcium imaging
Graphical Abstract

eTOC Blurb
A large body of work has focused on how the nucleus accumbens (NAc) is critical to reward encoding. Zachry*, Kutlu* et al. show that medium spiny neurons (MSNs) within the NAc of mice do not signal reward. Rather, D1 MSN responses are evoked by salient stimuli, while D2 MSNs track prediction errors.
INTRODUCTION
The ability to adaptively navigate an environment relies on associative learning, where animals form associations between predictive cues and external stimuli1,2. This process is at the core of nearly all adaptive behavior and its dysregulation is also a key feature of a wide range of psychopathologies3-9. Associative learning is dependent on both the ability to identify the valence of external stimuli, as well as various other factors such as salience, prediction, and prediction error10-16. These valence-independent factors are important for helping animals navigate changing environments, where the same stimulus can often elicit opposite behavioral responses depending on the context in which it is encountered17,18. For example, an unavoidable footshock will induce freezing, but an avoidable footshock will elicit an escape response in rodents - even though the footshock in both cases has negative valence18-21. Without understanding the relationship between stimulus processing and behavioral action our understanding of neural circuit control of behavior and its dysregulation in disease is incomplete.
The nucleus accumbens (NAc) is a hub for learning, selecting, and executing goal-oriented behaviors associated with both appetitive and aversive stimuli22-26. The NAc is a heterogeneous region primarily composed of medium spiny neurons (MSNs) that are classified based on their expression of D1- (which signal through Gαs) or D2-type dopamine receptors (which signal through Gαi) 27-30. Critically, these populations integrate information from dopaminergic inputs from the midbrain and glutamatergic inputs from across the brain, which combine to control their activity patterns31,32. The current understanding of D1 and D2 MSNs has been largely derived from work in the dorsal striatum showing that these two populations are segregated based on their projection targets into the direct (D1 MSNs) and indirect (D2 MSNs) pathways that drive action initiation and inhibition, respectively33-36. However, this pathway segregation is more nuanced in the NAc, and both D1 and D2 MSNs project to overlapping downstream areas, such as the ventral pallidum30. These fundamental organizational differences preclude our ability to infer MSN subtype-specific behavioral control based on this previous work.
In the NAc, much of our understanding of the role these populations play in behavioral control has been based on their role in reward-based behaviors and in response to drugs of abuse. Together, these studies, which span physiology, transcriptional plasticity, and pharmacological/optical manipulations, have led to the hypothesis that these populations have opposing roles in behavior, with D1 MSNs promoting reward and D2 MSNs promoting aversion/preventing reward seeking37-45. In this framework, these MSN populations function in opposition to one another, with D1 MSNs increasing and D2 MSNs decreasing activity in response to reward-based stimuli. However, emerging evidence is beginning to show that the functions of these neurons extend beyond simple reward-based coding46-53, and it is currently unclear exactly what environmental factors elicit responses in these populations and how these temporal dynamics are linked to behavioral control. By combining optical recording and manipulation approaches with behaviors that include both operant and Pavlovian contingencies, we show the precise valence-independent role that these populations play in learning.
RESULTS
Aversive stimuli evoke both D1 and D2 MSN responses
To record temporally defined neural activity in awake and behaving animals, the genetically encoded calcium indicator, GCaMP6f, was selectively expressed in either D1 or D2 MSNs in the NAc core (using D1-cre or A2A-cre mice, respectively). Because D2 receptors are expressed in some interneuron populations in the NAc, A2A-cre mice were used to select specifically for D2 MSNs54. Using fiber photometry, population-level calcium transients were monitored in awake and behaving animals (Figure 1A; see Figure S1A). First, D1 and D2 MSNs were recorded during a positive reinforcement task where mice emitted an operant response during an auditory cue for a sucrose reinforcer. In line with previous valence-based predictions43,44, at the time of sucrose collection D1 MSNs showed a positive response (Figure 1B), while D2 MSNs showed a small decrease (Figure 1C). However, when mice were presented with unsignaled footshocks of varying intensity (0.3 or 1.0 mA) at random intervals, both D1 MSNs (Figure 1D) and D2 MSNs (Figure 1E) showed positive shock-evoked responses that increased with increasing shock intensity (Figure S1D-E).
Figure 1. D1 and D2 MSNs do not track stimulus valence.
(A) Cre-dependent GCaMP6f (AAV5.hsyn.flex.GCaMP6f) was expressed in D1 MSNs (D1-cre mice) or D2 MSNs (A2A-cre mice). (right) Example of GCaMP6f expression in NAc core.
(B) D1 MSNs showed a positive response to sucrose retrieval in a positive reinforcement operant task (two-tailed independent sample t-test, t45 = 2.897, p = 0.0058, n = 5 mice). Dark grey dots are individual trials across all animals, light grey dots are averaged responses for each animal.
(C) D2 MSNs showed a decrease to sucrose retrieval in the same task (two-tailed independent sample t-test, t60 = 6.287, p < 0.0001, n = 5 mice).
(D) D1 MSNs showed an intensity-dependent positive response to unsignaled shock (nested ANOVA, F(1,39) = 6.53, p = 0.0159, n = 5 mice).
(E) D2 MSNs showed an intensity-dependent positive response to unsignaled shock (nested ANOVA, F(1,47) = 5.04, p = 0.031, n = 6 mice).
(F) Intracranial self-stimulation (ICSS) task design. An excitatory opsin (ChR2; AAV5.Ef1a.DIO.hChR2) or a control vector (eYFP; AAV5.hSyn1.eYFP) was expressed in D1 MSNs or D2 MSNs in the NAc core. Nose pokes resulted in laser illumination (14 Hz, 2 s, 8 mW, 470 nm). Viral expression of ChR2 in the NAc core.
(G) Mice were trained to nose poke for optical stimulation of either D1 MSNs or D2 MSNs over four days.
(H) D1-Cre (D1 MSN) and A2A-Cre (D2 MSN) mice showed a preference for the active nose poke as compared to eYFP controls (repeated measures ANOVA, trial × group interaction F(6,42) = 3.168, p = 0.0118).
(I) D1-Cre (D1 MSNs, n = 5 mice) and A2A-Cre (D2 MSNs, n = 7 mice) mice showed a greater percentage of total responses on the active operandum as compared to the eYFP controls (n = 5 mice, one-way ANOVA, F(2,14) = 8.955, p = 0.0031; Dunnett’s post-hoc eYFP versus D1, p = 0.0360; eYFP versus D2, p = 0.0016).
(J) Training-dependent increase in responses in D1-Cre and A2A-Cre mice as compared to eYFP (one-way ANOVA, F(2,14) = 4.602, p = 0.0291; Dunnett’s post-hoc eYFP versus D1, p = 0.0248; eYFP versus A2A, p = 0.0485).
Data represented as mean ± S.E.M.; * p < 0.05; ** p < 0.01, ****p < 0.0001.
Stimulation of D1 and D2 MSNs supports reinforcement
Next, to test whether the stimulation of D1 MSNs or D2 MSNs supported reinforcement, the excitatory opsin (channelrhodopsin) was expressed in each MSN population (Figure S1B, Figure S1F-K). Mice nose poked for optical stimulation of either D1 or D2 MSNs (470nm, 2s, 14Hz, 8mW) (Figure 1F,G). Stimulation of either D1 or D2 MSNs supported reinforcement, while eYFP controls performed at chance levels over training (Figure 1H). Mice showed higher response rates when behavior was reinforced by either D1 or D2 MSN optical stimulation as compared to the unreinforced, eYFP, condition (Figure 1I). Finally, mice in both the D1 MSN and D2 MSN stimulation groups increased their responses with additional training sessions as compared to eYFP controls (Figure 1J). Together, these data showed that there is no clear demarcation between D1 and D2 MSN functions in the NAc core based on valence. D1 MSN responses (in the positive direction) were evoked by both appetitive and aversive stimuli, rather than decreasing to aversive stimuli as would be predicted for a reward-based signal55,56. Further, D2 MSN stimulation supported reinforcement, suggesting that these neurons were not transmitting an aversive signal or preventing motivated responding.
D2, but not D1, MSNs scale with learning
One of the difficulties in defining the precise role of neural populations in behavioral control is dissociating multiple behavioral factors. Thus, we first utilized a behavioral task developed in our laboratory [MCOAT17] to delineate the relationship between multiple task parameters (e.g., valence, action initiation, prediction) and the resulting neural signals. Mice were first trained in a positive reinforcement task where an auditory cue predicted that an operant response would result in the delivery of sucrose (Figure 2A-C, Figure S2A,B). Next, mice were trained in a negative reinforcement task where a distinct auditory cue indicated that a response on a second operandum prevented the delivery of a series of footshocks (Figure 2D-F, Figure S2C,D). Importantly, the operant response in the two phases is the same (nose poke), and both outcomes (sucrose retrieval or removal of shock) are positive; however, the maintaining stimulus has opposite valence (shock, negative; sucrose, positive). Thus, this behavioral task allows for the dissociation of motivational responding (the same between conditions) from the valence of the stimulus maintaining the behavior (the opposite).
Figure 2. D2 MSN responses to predictive cues scale with learning, while D1 MSN responses do not change.
(A) A discriminative cue (Sd) indicated that responses on a fixed-ratio 1 schedule resulted in sucrose delivery.
(B) Mice acquired this task (>60 responses on the active operandum).
(C) Nearly all responses were made during the cue period, indicating that mice learned the value of this cue (two-tailed independent sample t-test, t11 = 11.23, p < 0.0001, n = 12 mice; chance = 50%).
(D) Mice responded during an Sd to prevent shock presentation.
(E) Mice avoided almost all possible shocks.
(F) Almost all responses were made during the cue period, indicating that mice learned the cue value (two-tailed independent sample t-test, t9 = 14.68, p < 0.0001, n = 10 mice; critical value = 75.1%).
(G) D1 MSNs showed a response to the Sd that signaled positive reinforcement. This response did not change with training (nested ANOVA, F(1,176) = 0.63, p = 0.427, n = 6 mice). Dark grey dots are individual trials across all animals, light grey dots are the first response in each session for each animal.
(H) Heatmap of D1 MSN responses pre-training and post-training during positive reinforcement. All heatmaps are individual trials ordered by response magnitude to the cue.
(I) D1 MSNs showed a positive response to the Sd signaling negative reinforcement that did not change with experience (nested ANOVA F(1,91) = 1.21, p = 0.2747, n = 5 mice).
(J) Heatmap of D1 MSN responses pre-training and post-training during negative reinforcement.
(K) D2 MSNs showed an increase in response to the Sd signaling positive reinforcement between pre- and post-training (nested ANOVA F(1,166) = 16.38, p < 0.0001, n = 6 mice).
(L) Heatmap of D2 MSN responses.
(M) D2 MSNs showed a learning-dependent increase to the Sd signaling negative reinforcement (nested ANOVA F(1,88) = 5.35, p = 0.0234, n = 5 mice).
(N) Heatmap of trial responses pre-training and post-training during negative reinforcement.
Data represented as mean ± S.E.M. * p < 0.05, ** p < 0.01, **** p < 0.0001.
In the positive reinforcement phase, mice learned quickly to respond on the active nose poke (Figure 2B) and responded with greater probability during the cue than during the inter-trial interval (Figure 2C; Figure S2A-B). During negative reinforcement, mice avoided greater than 80% of potential shocks within a session and had a higher probability of responding during the cue than during the inter-trial interval (Figure 2E-F; Figure S2C-D). Calcium transients to the discriminative cues that predicted positive or negative reinforcement were assessed during the first training session and again in the same mice after they had met acquisition criteria.
In response to the discriminative cue that predicted positive reinforcement, we observed a positive calcium transient in D1 MSNs; however, this response did not change over training (Figure 2G-H). The same pattern was present in the negative reinforcement task, where the discriminative cue evoked a positive, time-locked calcium transient in D1 MSNs and this cue-evoked response was not sensitive to training history (Figure 2I-J; Figure SE-F). D2 MSNs also showed a positive time-locked calcium transient in response to both the discriminative cue that signaled positive reinforcement (Figure 2K,L) and the discriminative cue that signaled negative reinforcement (Figure 2M,N). In contrast to D1 MSNs, however, the D2 MSN response in both conditions increased further as performance improved over training (also see Figure SG-H). These data, combined with the data above, suggest that D1 MSNs respond to stimuli generally, while D2 MSNs dynamically respond to changes in predictive cues.
MSN responses are consistent across tasks
While these results were consistent across different operant tasks, it was not clear if these signals were specific to motivated action. D1/D2 MSN responses could signal information about cues that was not sensitive to contingencies/actions (e.g., whether, or how strongly, the cue predicted an outcome). We hypothesized that D2 MSN responses dynamically change with the predictive value of cues, regardless of whether the task required an action or not. We tested this hypothesis using a Pavlovian fear conditioning paradigm. Mice were presented with a 5s cue followed by a brief, unavoidable footshock. Mice were trained over four fear conditioning sessions (6 trials per session, Figure 3A,i), followed by four sessions of extinction where the cue, but no shock, was presented (Figure 3A, ii). Freezing was analyzed 1) during fear conditioning session 1 (FC1), when the cue-shock association was forming, 2) during the final fear conditioning session (FC4), when the association was established, and 3) during the final session of extinction (EXT4), when the cue-shock association was extinguished. Both groups of mice (D1-cre and A2A-cre) increased freezing during acquisition and decreased freezing following extinction (Figure 3B; Figure S3A-B).
Figure 3. D2 MSN responses to cues increase over learning, while D1 MSN responses do not change, in Pavlovian tasks.
(A) (i) Mice received a five second cue followed by a half-second shock in a Pavlovian fear conditioning task. (ii) During extinction, the cue was presented for five seconds but the shock was not delivered.
(B) Freezing across trials during fear conditioning session 1 (FC1), fear conditioning session 4 (FC4), and extinction session 4 (EXT4) (repeated measures ANOVA, trial × group interaction F(5.017,55.19) = 2.879, p = 0.0221).
(C) D1 MSN response to the cue in the first (FC1) and last (FC4) fear conditioning session.
(D, E) D1 MSNs showed no difference in response to the cue (nested ANOVA, F(1,71) = 0.02, p = 0.8977, n = 6 mice) or the shock (nested ANOVA F(1,71) = 1.73, p = 0.1938, n = 6 mice) over sessions. Dark grey dots are individual trials across all animals, light grey dots are averaged responses for each animal.
(F) D2 MSN response to the cue in FC1 and FC4.
(G) D2 MSNs showed an increase in response to the cue over sessions (nested ANOVA, F(1,71) = 11.28, p = 0.0014, n = 6 mice).
(H) The peak responses to the shock in D2 MSNs normalized to the pre-trial baseline (nested ANOVA, F(1,71) = 0.42, p = 0.5218, n = 6 mice).
(I) D1 MSN cue and shock responses in the last fear conditioning session (FC4) compared to the last session of extinction (data from FC4, replotted from panel C).
(J) There was no change in the D1 MSN response to the cue following extinction (nested ANOVA F(1,71) = 0, p = 0.9907, n = 6 mice).
(K) D1 MSN responses to shock period. In FC4 the shock was presented and in extinction (EXT4) it was not (nested ANOVA, F(1,71) = 68.59, p < 0.0001, n = 6 mice).
(L) D2 MSN response to the cue and shock in FC4 as compared to extinction (FC4 data plotted from panel F).
(M) There was a decrease in the cue response following extinction (nested ANOVA, F(1,71) = 16.32, p = 0.0002, n = 6 mice).
(N) The response during the shock period was also reduced (nested ANOVA, F(1,71) = 56.31, p < 0.0001, n = 6 mice). In FC4 the shock was presented and in extinction (EXT4) it was not.
(O, P) Heatmap of D1 MSN (O) and D2 MSN (P) responses to task parameters ordered by response magnitude to cue (from largest at top to smallest at bottom) during early (FC1) and late (FC4) fear conditioning and late fear extinction (EXT4).
Data represented as mean ± S.E.M. *** p < 0.001, **** p < 0.0001. [fear conditioning session 1 (FC1); fear conditioning session 4 (FC4); extinction session 4 (EXT4)].
We observed a time-locked positive D1 MSN calcium transient in response to both the cue and the shock in early fear conditioning (FC1) and late fear conditioning (FC4; Figure 3C-E,O; Figure S3C-D). The magnitude of this response did not change over learning. Conversely, D2 MSNs exhibited a positive, cue-evoked calcium transient that was further increased with learning (Figure 3F-H,P; Figure SE-F).
When mice went through extinction, freezing was reduced (Figure 3B). However, we observed no difference in the positive cue-evoked D1 MSN calcium transient (Figure 3I-J,O; see Figure S7G-H for additional analysis). Also, the positive, shock-evoked D1 MSN calcium transient disappeared during extinction (as the shock was not presented in extinction trials, Figure 3I,K). The positive, cue-evoked D2 MSN calcium transients were reduced with extinction (Figure 3L,M,P; Figure SI-J). Further, there was no D2 MSN response in either direction at the time of the omitted shock during the final extinction session when shock was not presented and no longer expected (Figure 3L,N).
These data suggest that cue-evoked calcium transients in D2 MSNs track the strength of the cue-outcome association. This occurs in the same direction (a positive response) regardless of whether the outcome value is positive or negative (Figure 2, 3) and does not depend on the action required, as this same pattern was present for both reinforcement (motivated action, Figure 2) and fear conditioning (freezing, Figure 3). Thus, D2 MSNs appear to track predictions, while D1 MSNs respond to stimuli generally. These data also further rule out that the D2 MSN response tracks general motor responses – freezing or otherwise – as the same signal was observed when the animals were freezing (Figure 3) and when mice were making a motivated operant response (Figure 2, Figure S2G-H).
D2 MSNs track prediction errors
We hypothesized that D2 MSN responses tracked a valueless prediction error. In the case of valueless prediction error, responses to cues are initially small when they are neutral and novel (regardless of valence) and increase as those cues acquire associative value. Conversely, in response to the stimulus itself (what the cue predicts), initially the response is positive and large, and this response decreases as the stimulus becomes more predicted by the antecedent cue. Thus, one key feature of valueless and other prediction-error signals is that in addition to cue responses changing over learning, relative stimulus responses also change15.
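The qualitative pattern described above can be illustrated with a minimal sketch. It assumes a Rescorla-Wagner-style learning rule, which is chosen here only for illustration and is not a model fit to the data in this study. The function name, learning rate, trial count, and outcome magnitudes are all hypothetical. The cue response is taken as the magnitude of the learned prediction and the outcome response as the magnitude of the prediction error, so both are unsigned and therefore independent of valence.

```python
def simulate_unsigned_pe(outcome, n_trials=20, alpha=0.3):
    """Simulate cue and outcome responses under an unsigned prediction error.

    outcome: signed outcome value (e.g. +1.0 for sucrose, -1.0 for shock).
    Returns two lists (cue_response, outcome_response), one entry per trial.
    All parameters are illustrative assumptions, not values from the study.
    """
    v = 0.0  # learned cue-outcome association; zero for a novel, neutral cue
    cue_response, outcome_response = [], []
    for _ in range(n_trials):
        # Cue response reflects the strength of the prediction, not its sign.
        cue_response.append(abs(v))
        # Outcome response is the magnitude of the prediction error.
        delta = outcome - v
        outcome_response.append(abs(delta))
        # Rescorla-Wagner update (assumed learning rule).
        v += alpha * delta
    return cue_response, outcome_response


# Same simulation for an appetitive and an aversive outcome.
sucrose_cue, sucrose_out = simulate_unsigned_pe(+1.0)
shock_cue, shock_out = simulate_unsigned_pe(-1.0)
```

Under these assumptions the cue response grows and the outcome response shrinks across trials, and the two curves are the same for the appetitive and the aversive outcome, which is the signature the hypothesis predicts.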
While we observed in Figure 1C that D2 MSNs showed a negative calcium transient at the time of sucrose collection in well-trained animals, when unexpected sucrose was given to naïve mice D2 MSN calcium responses were positive in response to sucrose consumption (time-locked to the first sucrose lick, Figure 4A). Thus, the D2 MSN calcium response to an unexpected reward was positive (Figure 4B; see Figure S4A), and this response to the same rewarding stimulus was reduced as the reward became more predicted (Figure 1C). We observed similar increases in D2 MSN calcium responses that occurred in response to unpredicted vs predicted aversive stimuli.
Figure 4. D2 MSN responses to stimuli are modulated by prior predictions.
(A) Mice were given ad libitum access to sucrose in the delivery port and signals were analyzed around the first lick in the first lick bout.
(B) When sucrose was not predicted, D2 MSN response was increased (independent sample t-test, t25 = 4.41, p = 0.0002, n = 4 mice), rather than decreased (Fig 1C). We next determined if D1 MSN or D2 MSN responses to footshocks were changed based on prediction.
(C) Signals were z-scored around the baseline preceding the onset of the shock in the first two trials of fear conditioning session 1 (FC1), when the mouse experiences the cue and the shock for the first and second time, and of fear conditioning session 4 (FC4), when the mouse has extensive experience with the cue-shock association. Dark grey dots are trial 1 responses, light grey dots are trial 2 responses. There was no effect on D1 MSN responses to the shock under these conditions (nested ANOVA F(1,23) = 0.74, p = 0.4055, n = 6 mice).
(D) D2 MSN responses to the footshock became smaller as the prediction between cue and shock became stronger (nested ANOVA, F(1,23) = 18.22, p = 0.0011, n = 6 mice).
(E) The likelihood of a cue-shock pairing was manipulated (shock occurs 10% of the trials in the session or 75% of the trials in the session).
(F, left) Fiber photometry trace from D2 MSNs during the shock responses depending on the probability of the prediction.
(F, right) D2 MSNs showed an increase in the cue response (paired t-test, t4 = 4.540, p = 0.0105, n = 5 mice) and decrease in the magnitude of the shock response (paired t-test, t4 = 3.117, p = 0.0356, n = 5 mice) with greater predictability of the shock outcome.
Data represented as mean ± S.E.M. * p < 0.05, *** p < 0.001. [fear conditioning session 1 (FC1); fear conditioning session 4 (FC4)].
The same pattern was present with aversive stimuli. Shock-evoked transients were assessed during the first 2 trials of fear conditioning (FC1, when animals did not expect the footshock) and during the first 2 trials of the final fear conditioning session (FC4, when the footshock was anticipated). First, we analyzed D1 and D2 MSN responses aligned to the onset of the shock (instead of the cue as in Figure 3D,G) to probe the shock response relative to baseline. Shock-evoked D1 MSN responses were positive but did not change with expectation (Figure 4C; Figure S4B). This was true whether data were assessed over days (fear conditioning session 1 versus 4) or within-session (trial 1 versus trial 6; Figure S4D-E). However, the positive D2 MSN calcium transient evoked by the shock was smaller when it was expected (Figure 4D; Figure S4C).
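Aligning a signal to shock onset and expressing it relative to the preceding baseline can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, window boundaries, and fluorescence values are hypothetical, and the exact baseline window and normalization used for the recordings are not specified in this section.

```python
from statistics import mean, stdev


def zscore_to_baseline(trace, baseline_slice):
    """Z-score a single-trial trace against its own pre-event baseline."""
    baseline = trace[baseline_slice]
    mu = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation of the baseline samples
    if sd == 0:
        raise ValueError("baseline has zero variance; cannot z-score")
    return [(x - mu) / sd for x in trace]


def peak_response(trace, baseline_slice, response_slice):
    """Peak z-scored value inside the response window."""
    z = zscore_to_baseline(trace, baseline_slice)
    return max(z[response_slice])


# Hypothetical fluorescence samples: five pre-shock samples, then the response.
trial = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 4.0, 2.0]
peak = peak_response(trial, slice(0, 5), slice(5, 8))
```

Because each trial is normalized to the samples immediately before the shock, a cue-evoked change that precedes the shock does not inflate or mask the shock response itself.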
Finally, we directly manipulated the probability of shock presentation in two separate sessions: a first session in which the cue was followed by the shock on 10% of trials, and a second session in which the cue was followed by the shock on 75% of trials (Figure 4E). Accordingly, the size of the positive cue-evoked calcium transient in the D2 MSNs reflected how well the shock was predicted: the higher-probability cue resulted in a stronger positive D2 MSN calcium response to the cue (Figure 4F, Figure S4F-H) and a reduced response to the shock, an effect that was consistent across repeated testing (Figure SF-H). Thus, as the cue becomes more predictive, the cue response becomes larger and the outcome response becomes smaller, similar to what has been observed for other prediction-based signals57,58.
Learning increases D2 MSN recruitment
To further understand how these signals developed over learning, we employed microendoscopic cellular resolution calcium imaging (see Figure S1C for the GRIN lens placements). GCaMP6m was expressed in either D1 or D2 MSNs as described above, and using a GRIN lens for optical access, we recorded single-cell calcium transients in these identified populations during fear conditioning (Figure 5A-C; see Figure S5A-B and Figure S5C-D).
Figure 5. D2 MSN recruitment is increased over learning and individual D2 MSNs respond to both the cue and the shock.
(A) D1 and D2 MSN responses were recorded via cell-type specific expression of GCaMP6m as described. A GRIN lens was implanted above the NAc core for optical access.
(B) Fear conditioning. Mice received a ten-second cue followed by a 0.5s shock.
(C) Freezing responses increased over training (repeated measures ANOVA, trial × group interaction F(3.990,35.91) = 4.212, p = 0.0068). Session 1 (FC1), Session 4 (FC4).
(D) D1 MSN responses across detected cells (157 cells in FC1, 180 cells in FC4).
(E) There was a moderate decrease in the peak response to the cue in the last session as compared to the first (independent sample t-test, t335 = 2.505, p = 0.0127, n = 5 mice).
(F) The shock response did not change (independent sample t-test, t335 = 0.6697, p = 0.5035, n = 5 mice).
(G, H) Percentage of the total D1 MSNs detected in each session that increased (positive), decreased (negative), or showed no response (no response) to the cue or shock in the first (G, FC1) or last session (H, FC4). D1 MSN responses did not change over learning.
(I) D2 MSN responses (107 cells in FC1, 111 cells in FC4).
(J) D2 MSN responses to the cue were increased over sessions (independent sample t-test, t216 = 3.435, p = 0.0007, n = 5 mice).
(K) No difference in the shock response (independent sample t-test, t216 = 0.71, p = 0.4779, n = 5 mice).
(L, M) Percentage of D2 MSNs that increased (positive), decreased (negative), or did not respond (no response) to the cue and the shock in the first session (L, FC1) and the last (M, FC4). The number of D2 MSNs that responded to the cue changed over learning.
Data represented as mean ± S.E.M. * p < 0.05, *** p < 0.001. [fear conditioning session 1 (FC1); fear conditioning session 4 (FC4)].
Replicating our results, we found that D1 MSNs, when represented as a whole field trace or the average of all identified single cell responses, responded to both the cue and the footshock during fear conditioning session 1 (FC1; Figure 5D-F; Figure S5C-D for whole field calcium traces; Figure S5E-H for area under the curve). While there was heterogeneity in the responses (Figure 5G, left), most cells (55.41% of all identified cells recorded during the session) responded positively to the shock (Figure 5G, right). Following extended training, when behavioral responses were asymptotic (fear conditioning session 4, FC4), the proportions of D1 MSN response types did not change (Figure 5H). Consistent with the idea that D1 MSNs respond to the presence of stimuli, the percentage of D1 MSNs that were active at the time of the previous footshock was reduced when the shock was not present in extinction (to 23.81%; Figure S5I-M). Thus, D1 MSNs track stimulus presence, and these data show that the population-level responses in the NAc core are representative of a majority of the population.
D2 MSNs showed the same population response patterns observed above, both when represented as field of view and when presented as the average of all identified single cell responses (Figure 5I-K; Figure S5C-D; Figure S5E-H). Cue- and shock-responsive cells were identified as any cell that had a significantly increased response (>1.96 z-scores from baseline) within 2 seconds of cue onset or within 2 seconds of shock onset, respectively. The increase in population-level response to the cue over training was explained by an increase in the size of the D2 MSN ensemble that was evoked by the presentation of the predictive cue, which was larger on the final fear conditioning session (FC4) than on the initial fear conditioning session (FC1). Indeed, the percentage of cue-responsive D2 MSNs increased from 39.26% to 75.68% of identified cells between FC1 and FC4 (Figure 5L-M, left). Shock-responsive D2 MSNs also increased, from 54.21% to 81.99% of the population (Figure 5L-M, right). Critically, the opposite was observed when animals moved from fear conditioning to fear extinction, where the percentages of responsive cells were reduced to 42.86% for the cue and to 34.92% for the footshock (Figure S5Q-R). These results support the hypothesis that D2 MSNs signal predictions, as this signal progressively develops with learning (over fear conditioning) and is updated when new information is encountered (during extinction). More specifically, the D2 MSN response to the cue becomes stronger as the animals learn the association between the cue and the footshock and weakens as this association is extinguished (Figure 5I-K; Figure S5N-R).
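The responsiveness criterion above (a response exceeding 1.96 z-scores from baseline within 2 seconds of event onset) can be sketched as follows. This is an illustrative sketch only: the function names, the 1 Hz sampling rate, and the toy traces are hypothetical, and it assumes that each cell's trace has already been z-scored to its baseline.

```python
def is_responsive(z_trace, onset_idx, fs_hz, window_s=2.0, threshold=1.96):
    """Return True if the z-scored trace exceeds `threshold` within
    `window_s` seconds after the event onset."""
    n_samples = int(round(window_s * fs_hz))
    window = z_trace[onset_idx:onset_idx + n_samples]
    return any(z > threshold for z in window)


def classify_cells(z_traces, cue_idx, shock_idx, fs_hz):
    """Label each cell as 'cue', 'shock', 'both', or 'none'."""
    labels = []
    for z in z_traces:
        cue = is_responsive(z, cue_idx, fs_hz)
        shock = is_responsive(z, shock_idx, fs_hz)
        if cue and shock:
            labels.append("both")
        elif cue:
            labels.append("cue")
        elif shock:
            labels.append("shock")
        else:
            labels.append("none")
    return labels


def percent_responsive(labels, event):
    """Percentage of cells responsive to `event` ('cue' or 'shock'),
    counting cells labeled 'both' toward either event."""
    hits = sum(1 for lab in labels if lab in (event, "both"))
    return 100.0 * hits / len(labels)


# Toy z-scored traces sampled at a hypothetical 1 Hz:
# cue onset at index 2, shock onset at index 5.
cells = [
    [0.0, 0.0, 2.5, 0.0, 0.0, 0.0, 0.0, 0.0],  # cue only
    [0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0],  # shock only
    [0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 2.2, 0.0],  # both
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # neither
]
labels = classify_cells(cells, cue_idx=2, shock_idx=5, fs_hz=1.0)
```

Counting cells labeled 'both' toward each event mirrors how the cue-responsive and shock-responsive percentages are reported separately, while the four-way labels correspond to the cue-only, shock-only, and cue-and-shock categories.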
D2 MSN dynamics change over learning
To better understand how D1 and D2 MSN populations were changing over experience, we used dimensionality reduction approaches to visualize how these populations were changing both within and across sessions (Figure 6Ai-iv). We performed a principal component analysis (PCA) and plotted the trajectory summarizing the activity of all recorded neurons through the space defined by the first and second principal components (PC). The large variation of the trajectory during the cue period in the final fear conditioning session (FC4, Figure 6Aiv) as compared to the first fear conditioning session (FC1, Figure 6Aiii) is consistent with the finding that more neurons are engaged at this time in FC4 compared to FC1 (Figure 5L-M). This difference was not apparent when comparing FC1 and FC4 trajectories of D1-MSNs (Figure 6Ai, ii), again supporting that only D2-MSNs undergo changes at the population level. While learning-dependent changes occur in D2 MSNs, we wanted to further understand the cell response patterns that characterized these changes and how the temporal patterns of these responses changed over time.
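A neural trajectory of this kind can be sketched as follows: the population activity at each time point is projected onto the first two principal components, and the distance between consecutive points indicates how quickly activity moves through that space. This is a minimal sketch under stated assumptions. The toy activity matrix is hypothetical, the function names are ours, and power iteration with deflation is used here only to keep the example free of external libraries; it stands in for whatever eigendecomposition routine an analysis pipeline would normally call.

```python
import math


def center_columns(X):
    """Subtract each neuron's (column's) mean across time points."""
    n_time, n_cells = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n_time for j in range(n_cells)]
    return [[row[j] - means[j] for j in range(n_cells)] for row in X]


def covariance(Xc):
    """Neuron-by-neuron sample covariance of column-centered data."""
    n_time, n_cells = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n_time - 1)
             for j in range(n_cells)] for i in range(n_cells)]


def leading_eigenpair(C, n_iter=500):
    """Power iteration for the largest eigenvalue and its eigenvector."""
    p = len(C)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no remaining variance to extract
            break
        v = [x / norm for x in w]
    Cv = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, Cv)), v


def pca_trajectory(X, n_components=2):
    """Project each time point onto the first principal components."""
    Xc = center_columns(X)
    C = covariance(Xc)
    p = len(C)
    components = []
    for _ in range(n_components):
        lam, v = leading_eigenpair(C)
        components.append(v)
        # Deflation: remove the variance explained by this component.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return [[sum(r[j] * pc[j] for j in range(p)) for pc in components]
            for r in Xc]


def step_lengths(trajectory):
    """Distance between consecutive time points along the trajectory."""
    return [math.dist(a, b) for a, b in zip(trajectory, trajectory[1:])]


# Toy data: rows are time points, columns are three hypothetical neurons.
activity = [
    [0.0, 0.0, 0.0],  # baseline
    [0.0, 0.0, 0.0],  # baseline
    [1.0, 1.0, 0.0],  # cue
    [2.0, 2.0, 0.0],  # cue
    [1.0, 1.0, 0.0],  # cue
    [0.0, 0.0, 1.0],  # shock
]
trajectory = pca_trajectory(activity)
speeds = step_lengths(trajectory)
```

In this toy example the two baseline points coincide, so the first step length is zero, while the cue and shock periods move the trajectory away from baseline. A population that recruits more cells during the cue would trace a larger excursion during that period.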
Figure 6. D2 MSNs, but not D1 MSNs, change dynamically over learning in both the pattern and timing of responses to learned cues.
(A) Neural trajectories summarizing the activity of D1-MSNs in fear conditioning session 1 [FC1, (i), n=157 neurons], D1-MSNs in fear conditioning session 4 [FC4, (ii), n=180 neurons], D2-MSNs in FC1 [(iii), n=107 neurons], and D2-MSNs in FC4 [(iv), n=111 neurons]. Each time point is depicted as an arrow pointing in the direction of the next time point. The size of each arrow is proportional to the delay until the next timepoint (i.e., how fast the activity is moving along the trajectory with large arrows depicting more rapid changes). The pre-cue baseline period is colored light grey, the cue period is color-coded (D2-MSN/FC1: red, D2-MSN/FC4: orange, D1-MSN/FC1: dark blue, D1-MSN/FC4: light blue), and the shock period is colored dark grey. As mice learn the cue-footshock contingency, D2 MSN cue responses, but not D1 MSN responses, become more variable.
(B) D2 MSNs were categorized based on observed activity patterns in the NAc during the initial fear conditioning session (FC1) into the following categories: (i) response only to the cue; (ii) response only to the shock; (iii) response to both the cue and shock.
(C) In D1 MSNs, most of the cells only responded to the shock during the initial fear conditioning session (FC1).
(D) The D1 MSN cell recruitment to the cue and the shock was similar in the last fear conditioning session (FC4), with a majority of cells responding only to the shock.
(E) In D2 MSNs, initially (on FC1) only a small percentage of cells responded to both the cue and shock.
(F) In FC4, a majority of D2 MSNs responded to both the cue and shock.
(G) D2 MSNs were recorded on the first session (FC1) and cells detected during this session were longitudinally co-registered with cells in the last session (FC4) based on activity during each session.
(H) Most of the cells that only responded to the cue in FC1 were not detected as active during the final fear conditioning session (FC4, only 13% co-registered). The majority of D2 MSNs that responded to the shock (either shock alone, or both cue and shock) were re-recruited in FC4.
(I) Heatmaps showing cue responses for fear conditioning sessions 1 and 4, ordered by the time of response following cue presentation.
(J) Histogram of event numbers for each second of the cue period, superimposed on the z-scored averaged calcium responses. Event analysis showed that the number of D2 MSN events within the cue period increased with learning (chi square=34.32, p<0.0001) and the amplitude of those events became larger as well [(i) the whole cue period, unpaired t-test, t718 = 4.26, p < 0.0001, n = 239-481 events]. When clustered based on the timing of the response, the peak event amplitude was larger in FC4 during the early segment [(ii) from the cue onset to 3 sec; unpaired t-test, t301 = 3.76, p = 0.0002, n = 78-225 events] but not during the middle [(iii) 3.5 sec to 6.5 sec, unpaired t-test, t169 = 1.13, p = 0.26, n = 57-114 events] or the late [(iv) 7 sec to the cue offset, unpaired t-test, t191 = 1.33, p = 0.18, n = 73-120 events] segments of the cue period.
(K) The event onset was earlier in FC4 compared to FC1 (unpaired t-test, t718 = 3.61, p = 0.0003, n = 239-481 events).
Data represented as mean ± S.E.M., *** p < 0.001, **** p < 0.0001, ns = not significant. Fear conditioning session 1 (FC1); fear conditioning session 4 (FC4).
D2 MSNs bridge cues and outcomes
For D2 MSNs to function as a prediction error signal, both the cue and the outcome response should be represented in the same cells, rather than in two separate populations. Analysis of the single cell responses during the first fear conditioning session (FC1) revealed 3 groups of D2 MSNs: 1) D2 MSNs that showed a positive response only to the cue (Cue only); 2) D2 MSNs that showed a positive response only to the shock (Shock only); and 3) D2 MSNs that showed a positive response to both the cue and shock (Both cue and shock; Figure 6B). Consistent with the data presented in Figure 6A showing little change in the D1 MSN responses to stimuli across sessions, we found that more than half of D1 MSNs (55.1%) responded to the shock only (Figure 6C) and this pattern was roughly the same between FC1 (Figure 6C) and FC4 (Figure 6D).
In contrast, D2 MSN response patterns changed over learning. Initially, only a small percentage of the D2 MSNs responded to both the cue and shock (14.28%), while the rest of the cells responded only to the cue (46.94%) or only to the shock (38.78%; Figure 6E). However, once animals had learned the association between the cue and shock - in the fourth fear conditioning session (FC4) - the majority of the D2 MSNs showed a positive response to both the cue and the shock (65.38%; Figure 6F). Next, we used activity-based co-registration to determine if cells showing one of the three activity signatures were more or less likely to be active in subsequent sessions. Co-registered cells were identified based on activity during the first fear conditioning session (FC1) and compared against activity signatures in the last fear conditioning session (FC4). Thus, only cells that were active in both FC1 and FC4 were considered co-registered (Figure 6G; see Figure S6A-B for co-registration approach). Critically, cells that responded to the cue only were not likely to be identified in FC4 (13% co-registered), while a majority of the D2 MSNs that responded only to the shock (84% co-registered) or both to the cue and shock (71.5% co-registered) were re-recruited and active in FC4 (Figure 6H). Additionally, of the cells that were shock responsive in FC1 and co-registered in FC4, >60% showed a response to the cue in FC4 (Figure S6C).
Temporal pattern of D2 MSN response changes with learning
We next used a hierarchical clustering approach to identify different patterns of responses that were present within the D2 MSN population in response to the cue during fear conditioning (Figure S6D-E). There were three major clusters of cells: cells that responded early in the cue period, cells that responded in the middle of the cue period, and cells that responded near the end of the cue period. Between FC1 and FC4, the cluster of cells that responded early in the cue increased both in magnitude and in the number of cells that made up this population (Figure S6E), suggesting that with learning, D2 MSN responses to the cue become more time-locked to cue onset. Indeed, when D2 MSN cue responses were ordered based on response time during the cue period for fear conditioning session 1 (top) and fear conditioning session 4 (bottom), there appeared to be a shift in the timing, where responses occurred closer to cue onset (Figure 6I). To statistically test this, D2 MSN calcium events that occurred any time within the cue window were identified and analyzed. Between FC1 and FC4 the amplitude of events increased (Figure 6Ji, all events) and the number of D2 MSN events within the cue period increased (Figure 6J, bottom histogram), without any change in the kinetics of individual events (Figure S6F).
When events were clustered and analyzed based on the timing of the response during the cue period, these effects were largest during the first 3 seconds of the 10-second cue period (Figure 6Jii-iv). In the last fear conditioning session (FC4) over 55% of the events occurred in the first 3 seconds following cue presentation, and 10% occurred in the last 3 seconds (Figure S6G), suggesting a shift in the timing of the response towards earlier in the cue period over learning. Indeed, there was a significant decrease in the average response time of all events (i.e., earlier in the cue period) between FC1 and FC4 (Figure 6K).
Together these data show that the number of D2 MSNs that respond to a cue is increased over learning, the cells that are recruited evolve to respond to both the cue and the shock over learning, and the temporal profile of these responses shifts to be more aligned with the cue onset as the cue becomes more predictive.
D2 MSN cue responses are necessary for learning
Finally, to test whether D2 MSN activity at the time of a cue was necessary for associative learning, we expressed halorhodopsin, an inhibitory opsin, selectively in D1 or D2 MSNs, using D1- and A2A-cre mice, respectively (Figure 7A). Mice underwent fear conditioning, as described above, where a 5-second cue was paired with a brief shock. On each trial, D1 and D2 MSNs were inhibited for the duration of the cue presentation (590nm, 8mW, constant; Figure 7B). Inhibiting D2 MSNs during the cue presentation resulted in a deficit in learning the cue-outcome association as compared to the D1-cre and eYFP controls (Figure 7C-D). D1 MSN inhibition at the time of the predictive cue did not have any effect on the trajectory of associative learning (Figure 7C-D). These effects cannot be explained by changes in motor activity, as these stimulations had no effect on open-field locomotion (Figure S7A-C). Additionally, when D2 MSNs were inhibited during appetitive conditioning, this also slowed the trajectory of learning for sucrose in a reinforcement task (Figure S7D-F). In the larger context of our study, these results strongly suggest that the positive and cue-evoked D2 MSN signal in the NAc that occurs at the time of the predictive cue causally mediates cue-outcome associations, regardless of valence.
Figure 7. Optogenetic inhibition of D2 MSN responses during the cue slows associative learning.
(A) The inhibitory opsin halorhodopsin (AAV5.hSyn.DIO.eNpHR3.0) was selectively expressed in D1 or D2 MSNs. Representative histology.
(B) Laser light (constant, 5 s, 8 mW, 590 nm) was delivered at the time of cue onset.
(C) When D2, but not D1 MSNs, were inhibited mice developed a freezing response at a slower rate (RM ANOVA trial × group interaction F(2,18) = 8.17, p = 0.0030; multiple comparison D2 MSN versus eYFP session 3, p = 0.0005).
(D) D2 MSN inhibition reduced freezing during the 3 training trials as compared to eYFP animals (one-way ANOVA F(2,13) = 7.52, p = 0.0067; Bonferroni’s post-hoc eYFP versus D1, p > 0.9999; eYFP versus A2A, p = 0.018).
Data represented as mean ± S.E.M. * p < 0.05, ** p < 0.01.
DISCUSSION
Here, using a wide range of paradigms that result in behavioral responses to stimuli of both positive and negative valence, we characterized the precise role that NAc core D1 and D2 MSN activity plays in behavioral control. At the population and single cell level, we showed that D1 MSNs in the NAc core responded to the presence of unconditioned stimuli – regardless of valence. Conversely, D2 MSNs in the NAc core responded in a prediction-based fashion, increasing with learning, scaling with prediction, and causally controlling the trajectory of associative learning. Overall, these data show that D1 and D2 MSNs in the NAc core do not have opposing roles in behavioral control. Rather, they work in tandem to provide information regarding specific valence-independent aspects of associative learning.
While the NAc has been referred to as a reward-associated brain region23,32,59,60, these studies are not the first to show that signaling within the NAc, and also the dorsal striatum61, is causally related to both aversive and appetitive stimulus processing17,19,21-26,55,62,63. While some of these studies explained these results based on the idea of bidirectional valence coding, the results from our experiments, which examine a wider range of behavioral responses and contingencies, suggest that signaling in the NAc - especially the core region63-67 - may best be explained by valence-independent factors that allow for adaptive behavioral control across contexts and conditions. This is also consistent with data across a range of experiments and fields showing that D1 and D2 MSNs receive overlapping inputs from sensory and cortical systems68 and that stimulation of inputs into the NAc from the prefrontal cortex, hippocampus, or basolateral amygdala is reinforcing31,69,70. Indeed, we and others show that the stimulation of both D1 and D2 MSNs supports reinforcement47,49-51, indicating that these populations do not track valence or opposing motivational drive. It is important to note, however, that there is regional heterogeneity when it comes to behavioral control in the NAc71-73 and it is possible that these effects are specific to the NAc core. Nevertheless, these data still show that the idea of opposing valence cannot explain the role of these populations ubiquitously throughout the ventral striatum.
The framework presented within this manuscript is consistent with a large amount of emerging data on the role of these populations in behavioral control. Recent work has shown that manipulating D2 MSNs alters behavior, especially under unpredictable conditions (risky choice) and when behavioral updating is necessary53. Zalocusky et al. (2016) showed that in a model of risk preference in rats, stimulation of D2 MSNs during a period where animals were presented with a choice to emit an operant response or not resulted in fewer risky choices. Here we show that D2 MSNs track the predictability of cues and are engaged during similar types of operant tasks in a fashion that tracks performance. Nishioka et al. (2021) showed that optogenetic inhibition of D2 MSNs in the NAc during trials where there were errors in outcomes negatively impacted performance in the subsequent trial. Here we show that D2 MSNs track predictions and signal when outcomes are unexpected, something that would be necessary for error-based updating.
These data are also consistent with previous work using optogenetics, where activation of D2 MSNs during a reward-predictive cue enhances motivation50,74. This would be predicted based on the idea that D2 MSNs transmit a prediction signal, and this would enhance the association between cues and outcomes, thus increasing operant performance. This same work also showed that triggering activity during the food pellet delivery reduces motivation50,74, which would also be expected if D2 MSNs were transmitting a prediction error signal as an increase in the D2 MSN response at the time of the outcome would signal an error in prediction – i.e. that food is no longer presented following that action - and reduce future responding. The data included within this manuscript are also consistent with other recent work showing that both D1 and D2 MSN activity is evoked in response to food rewards75-77, where we show here that unexpected sucrose delivery results in increased D1 MSN responses (signaling stimulus presence) and D2 MSN responses (signaling an unexpected outcome). Together, the data presented within the current manuscript and data from others converge to show that D2 MSN activity is critical in situations where updating is necessary. Overall, the ability of D1 and D2 MSNs to signal precise types of information along with the fact that there is evidence of collateral transmission between MSNs suggests an ongoing relationship between D1 and D2 MSNs78,79, which likely allows for the ability to refine responses to relevant environmental stimuli.
Finally, these data are critical for conceptualizing how experience-dependent changes within these two cellular populations give rise to behavioral maladaptation associated with disease states. There has been extensive characterization of the effects of drugs of abuse on D1 and D2 MSNs at the molecular and cellular level. Acute drug exposure has been shown to enhance D1 MSN, while suppressing D2 MSN activity - likely through activation of dopamine receptors on these populations38,80. Additionally, repeated drug exposure has been shown to result in long-lasting enhancement of synaptic activity in D1 MSNs relative to D2 MSNs38,39,81-84. D2 MSNs have also been linked to drug-associated plasticity and seeking, especially as it relates to cue-driven behavior like drug seeking, or drug-taking under more variable reinforcement schedules37,85, a result that would be predicted based on these data as well. Given the critical role we have identified here for D1 and D2 MSNs in associative learning, it is plausible that drugs of abuse may disrupt the balance critical for forming associations between predictive cues and outcomes33-35,37,43,86.
By experimentation that integrates learning across contexts, behavioral action, and valence, we provide a framework for accumbal MSNs in valence-independent behavioral control. We conclude that NAc core MSNs help coordinate the learning of associations to drive adaptive behavior in all cases. The roles that D1 and D2 MSNs play in adaptive behavior thus provide new insights into many of the psychopathologies associated with disruptions in this brain region – including substance use disorders, gambling, and depression among others 3-5,7,8,87. In effect, we can frame many of these disorders as dysregulation in adaptive associative learning processes, a perspective that can ultimately reshape how we treat them.
STAR METHODS
RESOURCE AVAILABILITY
Lead Contact.
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Erin Calipari (erin.calipari@vanderbilt.edu).
Materials Availability.
This study did not generate new unique reagents.
Data and Code Availability
Data reported in this paper are available from the lead contact upon reasonable request.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS
Subjects.
Male and female 6- to 8-week-old D1:Cre (Jax: #030329) and A2A:Cre (MMRRC RRID: MMRRC_036158-UCD) mice were obtained from Jackson Laboratories and the Mutant Mouse Resource and Research Centers (MMRRC) repository (University of Missouri, Columbia, MO), respectively. All animals were maintained on a 12h reverse light/dark cycle. Animals had ad-libitum access to food and water except for the studies that employed sucrose as the main reinforcer, where mice were food restricted to 90% of free-feeding weight for the duration of the studies. Mice were weighed every other day to ensure that weight was maintained. All experiments were conducted in accordance with the guidelines of the Institutional Animal Care and Use Committee (IACUC) at Vanderbilt University School of Medicine, which approved and supervised all animal protocols.
METHOD DETAILS
General Surgical Procedures.
Ketoprofen (5mg/kg; subcutaneous injection) was administered at least 30 min before any surgical procedure. Ophthalmic ointment was applied to the eyes before and throughout the duration of surgery. Under isoflurane anesthesia (induced at 4%, maintained at 1.5% in O2), mice were positioned in a stereotaxic frame (Kopf Instruments, Tujunga, CA). Using aseptic techniques, a midline incision was made down the scalp and a craniotomy was made using a dental drill. A 10-μL Nanofil Hamilton syringe (WPI) with a 34-gauge beveled metal needle was used to infuse viral constructs into the NAc core as follows:
For fiber photometry and optogenetics:
The NAc core was targeted for virus injection (from bregma, A/P = +1.4 mm; M/L = +1.5 mm; D/V = −4.3 mm; 10° angle). Targeting was unilateral for fiber photometry and optogenetic excitation (channelrhodopsin) experiments and bilateral for optogenetic inhibition (halorhodopsin) experiments. Virus was infused at a rate of 50 nL/min for a total of 500 nL. All viruses were adeno-associated viruses (AAV), and the viruses used for each experiment are outlined in the experimental details. Following AAV infusion, the needle was kept at the injection site for seven minutes and then slowly withdrawn. Permanent implantable 2.5 mm fiber optic cannulae were implanted in the NAc [(Doric Lenses, Quebec, Canada) (Fiber Photometry: 400 μm fiber core, 0.48 NA; Optogenetics: 200 μm fiber core, 0.22 NA)]. Fiber optic cannulae were positioned above the viral injection site (from bregma, A/P = +1.4 mm; M/L = +1.5 mm; D/V = −4.2 mm; 10° angle) and were cemented to the skull using the C&B Metabond adhesive cement system (C&B Metabond; Parkell).
For microendoscopic imaging:
Viral injection targeting of the NAc core was done as described above (from bregma, A/P = +1.4 mm; M/L = +1.0 mm; D/V = −3.8 mm; 0° angle). These coordinates targeted the same area of the NAc core as above but without an angle. After viral infusion, the craniotomy was enlarged, and the dura mater was removed with a 26-gauge beveled needle. No tissue was aspirated. Proview integrated lenses (Inscopix) were used for all mice in these studies. These lenses are a combination of a traditional gradient index (GRIN) lens with a baseplate already attached, allowing for lens implantation and baseplate surgery to be done in the same surgical procedure. A 0.6 mm diameter GRIN lens (7.3 mm length, Inscopix) was lowered stereotaxically through the craniotomy −3.7 mm into the brain. The remaining portion of the lens was above the skull and attached to the baseplate. Lenses were implanted slightly above the track for virus infusions to avoid tissue damage in the imaging plane. The lens was positioned above the viral injection site (from bregma, A/P = +1.4 mm; M/L = +1.0 mm; D/V = −3.7 mm; 0° angle) and was cemented to the skull using adhesive cement (C&B Metabond; Parkell). The baseplate, which is attached to the lens, was cemented around the lens and to the skull to support the connection of the miniaturized microscope for freely moving imaging. Mice were allowed to recover for 6-8 weeks before testing for signal and beginning experimental testing.
Histology:
At the end of each of the behavioral/imaging experiments outlined below, subjects were deeply anesthetized with an intraperitoneal injection of Ketamine/Xylazine (100mg/kg/10mg/kg) and transcardially perfused with 10 mL of PBS solution followed by 10 mL of cold 4% PFA in 1x PBS. Animals were quickly decapitated, and the brain was extracted, placed in 4% PFA solution, and stored at 4 °C for at least 48 hours. Brains were then transferred to a 30% sucrose solution in 1x PBS and allowed to sit at 4 °C until they sank to the bottom of the conical tube. After sinking, brains were sectioned at 35 μm on a freezing sliding microtome (Leica SM2010R). Sections were stored in a cryoprotectant solution (7.5% sucrose + 15% ethylene glycol in 0.1 M PB) at −20 °C until immunohistochemical processing. We immunohistochemically stained all NAc slices with an anti-GFP antibody (chicken anti-GFP; Abcam #AB13970, 1:2000 in 5% BSA in PBS; room temperature overnight) for GCaMP6f, GCaMP6m, channelrhodopsin, and halorhodopsin for the validation of viral placement. Sections were then incubated with secondary antibodies [gfp: goat anti-chicken AlexaFluor 488 (Life Technologies #A-11039, 1:1000 in 5% BSA in PBS)] overnight at 4 °C. After washing, sections were incubated for 5 min with DAPI (NucBlue, Invitrogen) to achieve counterstaining of nuclei before mounting in Prolong Gold (Invitrogen). Fluorescent images were taken using a Keyence BZ-X700 inverted fluorescence microscope (Keyence), under a dry 10x objective (Nikon). The injection site location and the fiber implant or GRIN lens placements were determined via serial imaging in all animals. We identified sections that displayed the NAc core, viral expression, and fiber optic tip.
FIBER PHOTOMETRY
Fiber photometry general approach:
A Cre-recombinase-dependent virus carrying the fluorescent calcium indicator GCaMP6f (AAV5.hSyn.Flex.GCaMP6f.WPRE.SV40) was expressed in the NAc core of transgenic mice that express Cre-recombinase in either D1 MSNs (D1-Cre mice) or D2 MSNs [Adora2a (A2A)-Cre mice]. Resulting fluorescent signals were recorded through a permanently implanted fiber optic (400 μm fiber diameter, 0.48 NA) affixed to each mouse’s skull (described in surgical procedures). The fiber photometry recording system uses two light-emitting diodes (LED, Thorlabs) controlled by an LED driver (Thorlabs) at 490 nm (run through a 470 nm filter to produce 470 nm excitation - the excitation peak of GCaMP) and 405 nm (an isosbestic control channel, 88-91). LED emissions pass through several filters and are reflected off of a series of dichroic mirrors (Fluorescence MiniCube, Doric), allowing for excitation and the recording of resulting emission through the same optical system. LEDs were controlled by a real-time signal processor (RZ5P; Tucker-Davis Technologies) and emission signals from each LED stimulation were determined via multiplexing. The fluorescent signals were collected via a photoreceiver (Newport Visible Femtowatt Photoreceiver Module, Doric). Synapse software (Tucker-Davis Technologies) was used to control the timing and intensity of the LEDs and to record the emitted fluorescent signals. The LED intensity was set to 125 μW for each LED and was measured daily to ensure that it was constant across trials and experiments. For each event of interest (e.g., predictive cue, head entries, licks, shock), transistor-transistor logic (TTL) signals were used to timestamp onset times from Med-PC V software (Med Associates Inc.) and were detected via the RZ5P in the Synapse software (explained in more detail in the analysis section below).
Fiber photometry analysis.
The analysis of the fiber photometry data was conducted using a custom Matlab pipeline, as we have described previously 19,92. Raw 470 nm (F470 channel) and isosbestic 405 nm (F405 channel) traces were collected at a rate of 1000 samples per second (1 kHz). The raw data from each individual channel (405 nm or 470 nm) were then minimally filtered using a lowess filter before calculating Δf/f values via polynomial curve fitting. For the lowess filter, the span (the fraction of data points used to calculate each filtered value) was set to 0.0004 (values closer to 1 indicate stronger smoothing). Δf/f for the entire trace was calculated as (F470nm-F405nm)/F405nm. This transformation uses the isosbestic F405nm channel, which is not responsive to fluctuations in calcium, to control for calcium-independent fluctuations in the signal and to control for photobleaching. Then, the data from the resulting Δf/f trace were cropped around behavioral events using TTL pulses. For each experiment 2s of pre-TTL and 18s of post-TTL Δf/f values were analyzed. Z-scores were calculated from the cropped trace by taking the pre-TTL Δf/f values as baseline (z-score = (TTLsignal - b_mean)/b_stdev, where the TTL signal is the Δf/f value for each post-TTL time point, b_mean is the baseline mean, and b_stdev is the baseline standard deviation). This allowed for the determination of calcium events that occurred at the precise moment of each significant behavioral event. For statistical analysis, peak height and area under the curve (AUC) values were calculated for each individual trace around identified behaviorally relevant events, with AUC computed via trapezoidal numerical integration on each of the z-scores across a fixed timescale which varied based on experiment. The duration of the peak height and AUC data collection was determined by limiting the analysis to the z-scores between time 0 (TTL signal onset) and the time where the calcium signal returns to the pre-event baseline.
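The core arithmetic of this pipeline can be illustrated with a minimal Python sketch. It is not the custom Matlab code used in the study: the lowess smoothing and polynomial fitting steps are omitted, the function names are hypothetical, and the use of the population standard deviation (ddof = 0) for the baseline is an assumption.

```python
import numpy as np

def delta_f_over_f(f470, f405):
    """Isosbestic-referenced dF/F, computed as (F470 - F405) / F405."""
    f470 = np.asarray(f470, dtype=float)
    f405 = np.asarray(f405, dtype=float)
    return (f470 - f405) / f405

def peri_event_zscore(dff, event_idx, fs=1000, pre_s=2.0, post_s=18.0, ddof=0):
    """Crop dF/F around one TTL sample index and z-score the post-TTL
    samples against the pre-TTL baseline (2 s pre, 18 s post by default)."""
    n_pre = int(round(pre_s * fs))
    n_post = int(round(post_s * fs))
    baseline = dff[event_idx - n_pre:event_idx]
    post = dff[event_idx:event_idx + n_post]
    # ddof is an assumption; the text does not state which estimator was used.
    return (post - baseline.mean()) / baseline.std(ddof=ddof)

def peak_and_auc(z, fs=1000, window_s=None):
    """Peak height and trapezoidal AUC of a z-scored trace, restricted to
    the first window_s seconds after the TTL (whole trace if None)."""
    z = np.asarray(z, dtype=float)
    n = len(z) if window_s is None else int(round(window_s * fs))
    seg = z[:n]
    # Trapezoidal rule with a sample spacing of 1/fs seconds.
    auc = np.sum((seg[1:] + seg[:-1]) / 2.0) / fs
    return float(seg.max()), float(auc)
```

In practice the analysis window passed as window_s would be chosen per experiment, ending where the signal returns to the pre-event baseline, as described above.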
CELLULAR RESOLUTION CALCIUM IMAGING VIA MICROENDOSCOPES
Single cell imaging general approach:
For calcium imaging at the single cell level, we used endoscopic miniature microscopes (nVista miniature microscope, Inscopix). Calcium signals were recorded via the calcium sensor GCaMP6m (AAV5.CAG.Flex.GCaMP6m.WPRE.SV40). Signals were recorded on the miniature microscopes through the GRIN lens, allowing for the resolution of single cells. Single cell activity in the NAc core was recorded in awake and behaving animals. During each behavioral session, the miniscope was attached to the baseplate that was implanted previously (see surgical section for description). The imaging parameters (gain, LED power, focus) were determined for each animal to ensure recording quality and kept constant throughout the study. At the end of the recording session, the miniscope was removed and the baseplate cover was replaced.
Image processing and signal extraction:
Data were acquired at 20 frames per second using nVista miniature microscopes (Inscopix). Image processing was accomplished using Mosaic software (v1.3.1; Inscopix Data Processing Software v1.3.1, Mountain View, CA). Raw videos were pre-processed by applying 2x spatial downsampling to reduce file size and processing time, and isolated dropped frames were corrected. No temporal downsampling was applied. Lateral movement was corrected by using a portion of a single reference frame using Inscopix Data Processing Software (IDPS v1.3.1). Images were cropped to remove post-registration borders and sections in which cells were not observed. Videos were then exported as TIF stacks for analysis. After motion correction and cropping, a constrained non-negative matrix factorization algorithm optimized for micro-endoscopic imaging (CNMF-E) was utilized to extract fluorescence traces from neurons 93,94. CNMF-E cell detection parameters were as follows: patch_dims = 50, 50; K = 20; gSiz = 20; gSig = 12; min_pnr = 20; min_corr = 0.8; max_tau = 0.400. Considering calcium fluctuations can exhibit negative transients, associated with a pause in firing 69, we did not constrain temporal components to >=0. The Δf/f values were computed for the whole field of view as the output pixel value was represented as a relative percent change from the baseline. Raw CNMF-E traces were used for all analyses. The spatial mask and calcium time series of each cell were manually inspected using the IDPS interface. Cells found to be duplicated or misdetected due to neuropil or other artifacts were discarded.
Calcium activity quantification:
Stimulus-evoked activity was assessed by aligning calcium activity traces around cue or stimulus onset. Transistor-transistor logic (TTL) signals from MedPC were directly fed to the nVista system. The behavioral apparatus was interfaced with the miniature microscope via BNC cables and the onset of each behavioral event - cues and shocks - was associated with a frame of the video using TTLs. The Δf/f values for each movie frame were calculated as M’(x,y,t) = (M(x,y,t)-Fbaseline(x,y))/Fbaseline(x,y) where M’ is the output movie with Δf/f values, M(x,y,t) is the value for the pixel coordinate (x,y) at the t frame of the movie, and Fbaseline(x,y) is the baseline value for the (x,y) coordinate. The raw Δf/f data from the CNMF-E program were exported and used for peri-event analysis. Data were cropped around each significant event (cue presentations; TTL) and z-scored to normalize for baseline differences. Z-scores were calculated by taking the pre-TTL Δf/f values as baseline (z-score = (TTLsignal - b_mean)/b_stdev, where TTL signal is the Δf/f value for each post-TTL time point, b_mean is the baseline mean, and b_stdev is the baseline standard deviation). For traces aligned around a stimulus, a baseline window of 2 seconds prior to stimulus onset was used. Z-scored traces were then averaged across trials to create one trace per neuron for each stimulus type. To quantify the magnitude of the response, peak height (the maximum value of the calcium transient) and area under the curve (AUC) for each animal’s averaged population trace were calculated. Values across a two second window beginning at stimulus onset were averaged, and this value was used to determine the response profile of each individual cell. We then calculated whether the cell responses to the cues and shock outcomes were significant to determine responsive and non-responsive cells as well as the direction of the response (positive, negative, or no response).
For this analysis, we calculated peak heights of the cue or shock evoked response from an individual cell averaged from 6 fear conditioning trials as the maximum z-score achieved during a 2 second post-TTL window. We then ran two separate one-tailed independent t-tests to determine whether the response was significantly higher than +1.96 or lower than −1.96, the critical z-score for significance at p=0.05. For the AUCs, we used one-tailed independent t-tests to determine whether the mean AUC value was significantly different than 0. The cells that showed an averaged maximum response through 6 trials higher than the threshold were labeled as “Positive” cells. The cells that showed a significant negative response were labeled “Negative” cells. All other cells were determined as “No response” cells. We conducted this analysis for the cue and shock cell responses.
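As an illustration of this classification step, the following Python sketch labels one cell from its per-trial z-scored traces. It reflects one possible reading of the procedure and is not the authors' code: it assumes one-sample, one-tailed t-tests on the per-trial peaks (or troughs) against the ±1.96 threshold, gives the positive label precedence if both tests are significant, and uses a hypothetical function name.

```python
import numpy as np
from scipy import stats

Z_CRIT = 1.96  # critical z-score quoted in the text

def classify_cell(trial_z, fs=20, window_s=2.0, alpha=0.05):
    """Label a cell 'Positive', 'Negative', or 'No response'.

    trial_z: array (n_trials, n_timepoints) of z-scored traces, aligned so
    that column 0 is the first post-TTL frame.
    """
    trial_z = np.asarray(trial_z, dtype=float)
    n_win = int(round(window_s * fs))
    window = trial_z[:, :n_win]
    peaks = window.max(axis=1)    # per-trial maximum in the 2 s window
    troughs = window.min(axis=1)  # per-trial minimum in the 2 s window
    # Assumption: one-sample, one-tailed tests against the fixed thresholds.
    p_pos = stats.ttest_1samp(peaks, Z_CRIT, alternative="greater").pvalue
    p_neg = stats.ttest_1samp(troughs, -Z_CRIT, alternative="less").pvalue
    if p_pos < alpha:
        return "Positive"
    if p_neg < alpha:
        return "Negative"
    return "No response"
```

The same function would be applied separately to cue-aligned and shock-aligned traces to assign each cell to the cue-only, shock-only, or both categories.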
Longitudinal co-registration:
To identify cells and their responses across days within the same animal, we used the longitudinal registration pipeline defined in the Inscopix Data Processing Software (IDPS) Guide (Section 4.9.1 Longitudinal Registration) to identify the same cell across recording sessions in a longitudinal series. Briefly, the algorithm uses the CNMF-E processed cell-map, which is an average of cellular activity responses across a recording session. The program then takes the number of cell segmentation outputs and creates an aligned "summary" cell map made of binarized spatial maps of all the ROIs that appear in the cell sets for each recording session. Each cell map was then aligned to the first cell map using a rigid transformation that accounts for translation and rotation. The aligned cell map is then designated global IDs and a normalized cross correlation (NCC) value is calculated. This NCC value is a measure of the ratio of overlap between a pair of ROIs from the “summary” cell map and the individual cell map. Matches are selected by identifying the maximal NCC value from the correlation matrix of ROI comparisons. Because cells may only be active in particular cell sets, a “match” means that active cells in FC1 and FC4 were located in the same place in the field. For this set of experiments the images of the first cell set (the first day of fear conditioning, FC1) were defined as the global cell set against which the other cell sets are matched (this map was compared to the map that was detected on the final session of fear conditioning, FC4, and then independently to the cell map on the final session of extinction, EXT4). We then find the pair of cell images between the global cell sets and other aligned cell sets that maximize the normalized cross correlation (NCC). The program then generates an output that aligns the same cell from across sessions. We are thus able to determine if the same cell is active across sessions.
To validate the accuracy of the co-registration pipeline, we also manually overlaid the co-registered cell maps onto the FC4 cell map (Fig. S11-12). We then manually identified the co-registered cells to ensure there was no misalignment. We found that the co-registration maps are highly accurate in terms of identifying the cells active both during FC1 and FC4.
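The matching step of the co-registration described above (selecting, for each ROI, the partner with the maximal NCC) can be illustrated with a minimal sketch. This is not the IDPS implementation: the function names, the zero-shift form of the NCC on binarized masks, and the `min_ncc` cutoff are assumptions made for illustration, and the sketch presumes the cell maps have already been aligned.

```python
import numpy as np


def ncc(mask_a, mask_b):
    """Zero-shift normalized cross-correlation between two ROI masks."""
    a = np.asarray(mask_a, dtype=float).ravel()
    b = np.asarray(mask_b, dtype=float).ravel()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0


def match_rois(global_masks, session_masks, min_ncc=0.5):
    """For each global ROI, pick the session ROI with the maximal NCC.

    Returns a dict {global_index: session_index}. ROIs whose best NCC falls
    below min_ncc (a hypothetical cutoff) are left unmatched.
    """
    scores = np.array([[ncc(g, s) for s in session_masks] for g in global_masks])
    matches = {}
    for gi, row in enumerate(scores):
        si = int(np.argmax(row))
        if row[si] >= min_ncc:
            matches[gi] = si
    return matches
```

Note that this simple version does not enforce one-to-one assignment; two global ROIs could in principle select the same session ROI.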
Neural trajectory analysis:
This analysis was performed using custom code written in Python (v3.10.9). For each cell, we computed the mean activity across the six trials of each session. We applied a Gaussian filter with sigma = 1 bin (0.05 s) and standardized the resulting smoothed average using the scipy.stats.zscore() method (scipy v1.10.0). All recorded neurons for each condition were combined into a timepoint x neuron matrix (one matrix for each of D1-MSN/FC1, D1-MSN/FC4, D2-MSN/FC1, D2-MSN/FC4). We then performed principal component analysis (PCA) by computing the covariance matrix using numpy.cov() followed by eigenvalue decomposition using numpy.linalg.eigh() (numpy v1.23.5). The resulting principal components were sorted by eigenvalue and the top two were used for visualizing the trajectory through PCA space.
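The steps above can be sketched as follows. This is a minimal reconstruction under stated assumptions rather than the original analysis code: the function name is hypothetical, `scipy.ndimage.gaussian_filter1d` is assumed as the Gaussian filter, and standardization is assumed to be applied per neuron across time.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.stats import zscore


def neural_trajectory(trials, sigma_bins=1.0, n_components=2):
    """Project trial-averaged population activity onto its top principal components.

    trials: array of shape (n_neurons, n_trials, n_timepoints).
    Returns an array of shape (n_timepoints, n_components).
    """
    mean_activity = np.mean(trials, axis=1)               # neurons x time
    smoothed = gaussian_filter1d(mean_activity, sigma=sigma_bins, axis=1)
    standardized = zscore(smoothed, axis=1)               # z-score each neuron
    data = standardized.T                                 # time x neurons
    cov = np.cov(data, rowvar=False)                      # neurons x neurons
    eigvals, eigvecs = np.linalg.eigh(cov)                # eigenvalues ascending
    order = np.argsort(eigvals)[::-1]                     # sort descending
    top = eigvecs[:, order[:n_components]]
    centered = data - data.mean(axis=0)
    return centered @ top                                 # trajectory in PC space
```

Plotting the first column against the second gives the two-dimensional trajectory for one condition (for example D2-MSN/FC4).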
Hierarchical clustering:
Using the D2 MSN calcium traces from all cells detected in fear conditioning session 4 (FC4), we employed a hierarchical clustering approach to group cells based on their cue responses in an unbiased way. We used the “clustergram” Matlab function with the “correlation” distance metric to group the cell activity.
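For readers working outside Matlab, a comparable grouping can be sketched in Python with SciPy. This is an illustrative analogue, not the analysis that was run: the function name is hypothetical, and average linkage and the number of clusters are assumptions, since only the distance metric is specified above.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_cue_responses(traces, n_clusters=2):
    """Group cells by the shape of their cue responses.

    traces: array of shape (n_cells, n_timepoints).
    Returns one integer cluster label per cell.
    """
    distances = pdist(traces, metric="correlation")   # 1 - Pearson r per cell pair
    tree = linkage(distances, method="average")       # average linkage (assumed)
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Because the correlation distance ignores response amplitude, cells are grouped by the time course of their cue response rather than by its size.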
BEHAVIORAL EXPERIMENTS:
Apparatus.
Mice were trained and tested daily in individual Med Associates (St. Albans, Vermont) operant conditioning chambers fitted with two illuminated nose pokes on either side of an illuminated sucrose delivery port, all of which featured an infrared beam break to assess head entries and nose pokes, as well as a lickometer (Med Associates) to record tongue contacts on the sipper. One nose poke functioned as the active and the other as the inactive nose poke depending on the phase of the experiment (described below). Responses on both nose pokes were recorded throughout the duration of the experiments. Chambers were fitted with additional visual stimuli including a standard house light and two yellow LEDs located above each nose poke. Auditory stimuli included a white noise generator (which was used at 85 dB in these experiments) and a 16-channel tone generator capable of outputting frequencies between 1 and 20 kHz (also presented at 85 dB). Each box was outfitted with an infrared camera that recorded behavioral sessions and allowed for the analysis of freezing behavior (described below).
Each program written to run MED-PC V for the behavioral tests below also sent a TTL pulse at the onset of each behaviorally relevant stimulus. The pulse was sent through a BNC connection that interfaced with either the TDT Synapse fiber photometry software (for fiber photometry recordings) or the Inscopix recording software (for microendoscopic imaging). A series of behavioral experiments was run throughout this study to link activity within D1 and D2 MSNs to behavioral responding in Pavlovian and reinforcement contexts. They are outlined in detail below:
Positive Reinforcement.
Mice were trained to nose poke on an active nose poke, denoted by its illumination, for delivery of sucrose in a trial-based fashion. Following a correct response, the sucrose delivery port was illuminated for 5 seconds, and sucrose was delivered (1 s duration of delivery, 10% sucrose w/v, 10 µl volume per delivery). To create a trial-based procedure, a discriminative stimulus (Sd, sucrose), an auditory cue, signaled that responses emitted during the presentation of the Sd would result in the delivery of sucrose. In these experiments, the Sd was white noise (85dB). Responses made during the Sd resulted in sucrose delivery, and responses made at any other time in the session were recorded but not reinforced. During the initial training, the Sd was presented throughout the entirety of each 1-hour session and animals could respond for sucrose without interruption. When animals reached ≥ 60 active responses in a single session, they were moved to a discrete trial-based structure in subsequent 1-hour sessions, wherein the Sd was presented for 30 seconds at the beginning of each trial with a variable inter-trial interval (ITI); on average the interval was 30 seconds, with interval times ranging from 20 to 40 seconds. Each trial ended following a correct response and associated sucrose delivery or at the end of a 30 second period with no active response. At the end of the trial both the Sd and sucrose were terminated. Animals that exhibited active responses in ≥ 80% of trials during a session then proceeded to the final phase of training, wherein the duration of the Sd, sucrose was reduced to 10 seconds. Upon reaching the 80% criterion during this phase (i.e., acquisition), post-training calcium responses in D1 and D2 MSNs were recorded over a 30 min session.
Negative Reinforcement.
Mice were trained to emit an operant response on the opposite, non-sucrose-paired nose poke for shock avoidance. A second auditory discriminative stimulus (Sd, shock; 85dB at 2.5kHz) was presented at the beginning of each trial following a variable inter-trial interval (ITI); on average the interval was 76 seconds, with interval times ranging from 30 to 150 seconds. In each trial, the discriminative stimulus was presented for 30 seconds, after which a series of 20 footshocks (1mA, 0.5 second duration) was delivered with a 10 second inter-stimulus interval. Trials ended when animals responded on the correct nose poke or at the end of the shock period. The end of the shock period was denoted by the presentation of a house light cue that signaled the end of the trial and was illuminated for one second. During these trials, mice could respond during the initial 30 second Sd, shock period to avoid shocks completely, respond any time during the shock period to terminate the remaining shocks, or not respond at all. If mice did not respond, both the trial and the Sd, shock were terminated after all 20 shocks had been presented (230 seconds total). Acquisition during negative reinforcement training was defined as receiving fewer than 20% of total shocks in a single one-hour session.
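The trial timing and acquisition criterion described above can be made concrete with a short sketch. The names are hypothetical, and the sketch assumes that the first shock coincides with the end of the 30 second Sd-only period, so that 20 shocks at 10 second intervals give the stated 230 second maximum.

```python
SD_DURATION_S = 30.0     # Sd presented alone before the first shock
N_SHOCKS = 20            # shocks per trial
SHOCK_INTERVAL_S = 10.0  # inter-stimulus interval between shocks


def shock_onsets():
    """Shock onset times in seconds from trial start (first shock at 30 s, assumed)."""
    return [SD_DURATION_S + i * SHOCK_INTERVAL_S for i in range(N_SHOCKS)]


def shocks_received(response_time_s=None):
    """Number of shocks delivered before a correct response ends the trial."""
    if response_time_s is None:              # no response: all shocks delivered
        return N_SHOCKS
    return sum(1 for onset in shock_onsets() if onset < response_time_s)


def max_trial_duration_s():
    """Trial length when the animal never responds."""
    return SD_DURATION_S + N_SHOCKS * SHOCK_INTERVAL_S


def met_acquisition(response_times):
    """True if fewer than 20% of all possible shocks were received in a session."""
    received = sum(shocks_received(t) for t in response_times)
    return received / (N_SHOCKS * len(response_times)) < 0.20
```

Here `response_times` holds one entry per trial: the latency of the correct response in seconds, or `None` for a trial without a response.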
Unsignaled footshocks of varying intensities:
Behaviorally naive mice were placed into the same operant boxes used above; however, none of the ports or cues were active. A total of 8 footshocks were delivered in a non-contingent and inescapable fashion over a 12-minute period. Shocks were delivered at 0.3mA and 1mA intensities (4 presentations for each shock intensity). Shocks were delivered in a pseudo-random order with variable inter-stimulus intervals (mean ITI = 30 sec). All shock intensities were presented within the same test session. Shocks were delivered by the Standalone Aversive Stimulator/Scrambler (Med Associates, Env 4145). Shock intensities were measured prior to the session via ammeter (Med Associates, ENV-421-B) with two clip leads attached to the grid rods in the Med Associates chamber. During the session, shock intensities were manually changed on the Standalone Aversive Stimulator/Scrambler interface.
Unsignaled sucrose delivery:
Mice with no prior training history were placed into the operant box and given ad libitum access to the port, which was manually filled with sucrose (10% sucrose w/v) prior to the session. Sessions were 30 minutes in length. Tongue contacts on the sucrose cup were recorded using an electrical lickometer (sampled at 1 kHz), and continuous video was collected with an overhead camera. Each lick triggered a TTL output to the fiber photometry recording software, which allowed signal alignment around the first sucrose lick in each mouse. If mice did not lick, the session was repeated the next day until mice made a lick for sucrose.
Fear conditioning followed by extinction.
Mice received a single footshock (1mA, 0.5 second duration) immediately following a 5 second auditory cue (5kHz tone; 85dB) for 6 pairings. Mice underwent 4 fear conditioning sessions over four consecutive days, followed by 4 extinction sessions in which the cue was presented but shocks were omitted entirely. For cellular resolution imaging experiments, the cue was presented for 10s, instead of 5, during the fear conditioning and fear extinction sessions. This was due to the reduced movement of the mice in response to the sizeable scope. Every session was captured with an infrared camera. Freezing was assessed during the cue presentation for each animal and was hand-scored by an observer who was blind to the experimental conditions. Time spent freezing (immobility, defined as the lack of any movement including sniffing) was recorded from cue onset to cue offset.
Fear conditioning with changing cue-outcome probabilities.
Mice underwent two sessions of fear conditioning in which the probability of the cue being followed by a footshock was changed to probe how neural responses to the cue and the shock changed under these conditions. In session 1, on 10% of the trials the auditory cue (5 second; 1kHz tone; 85dB) was followed by a single brief shock (1mA, 0.5 second duration) at the offset of the cue. On the remaining 90% of trials, the same cue was presented without the shock. Trials were interleaved in a pseudo-random order. In session 2, 75% of the trials were cue-shock pairings and 25% were cue-no shock pairings. We analyzed the cue and shock response on the trial in which mice had received 10% of the cue-shock pairings (session 1) or 75% (session 2). In a separate experiment, two sessions were run on two consecutive days; in each, on 10% of the trials the auditory cue (5 second; 1kHz tone; 85dB) was followed by a single brief shock (1mA, 0.5 second duration) at the offset of the cue. On the remaining 90% of trials, the same cue was presented without the shock.
OPTOGENETICS
Intracranial Self-Stimulation (ICSS).
In a group of D1- or A2A-Cre mice, channelrhodopsin [excitatory opsin (AAV5.Ef1a.DIO.hChR2)] or eYFP [control vector (AAV5.hSyn1.eYFP)] was unilaterally expressed in the NAc core as described above in the surgical procedures section. A 200 µm fiber optic implant (Doric, 0.22 NA; MFC_200/240-0.22_4.7mm_MF2.5_FLT) was placed into the NAc core above the viral injection site. Mice were placed in an operant chamber where responses on an active nose poke resulted in laser stimulation (470nm, 2s, 14Hz, 8mW), and responses on the inactive nose poke had no programmed consequences. No other stimuli were present during this task (house lights, cue lights, etc.). The number of active and inactive nose pokes was recorded during the two-hour session for four days. The active and inactive nose pokes were counterbalanced across mice.
Optical inhibition of D1- and D2-MSNs during fear conditioning.
Halorhodopsin [inhibitory opsin (AAV5.hSyn.DIO.eNpHR3.0.YFP)] or eYFP [control vector (AAV5.hSyn1.eYFP)] was bilaterally expressed in D1-Cre and A2A-Cre mice using the strategies previously described. In a fear conditioning session, mice received a single footshock (1mA, 0.5 second duration) immediately following a 5 second auditory cue for 3 pairings. The cue was a tone (5kHz tone; 85dB). During this initial training session, optical inhibition (5 sec, 8 mW, continuous laser) was delivered for the duration of each cue presentation. Freezing was hand-scored for the 5 second pre-footshock cue period for each trial in a blind fashion. The freezing response was defined as the time (seconds) that mice were immobile (lack of any movement including sniffing) during the tone period and was calculated as the percentage of total cue time.
Optical inhibition of D1- and D2-MSNs during positive reinforcement.
Halorhodopsin [inhibitory opsin (AAV5.hSyn.DIO.eNpHR3.0.YFP)] or eYFP [control vector (AAV5.hSyn1.eYFP)] was bilaterally expressed in A2A-Cre mice using the strategies described for the fear conditioning experiments above. All mice received the same positive reinforcement training described for the fiber photometry experiments above. During the initial training, the Sd was presented throughout the entirety of each 1-hour session and animals could respond for sucrose without interruption. When animals reached ≥ 60 active responses in a single session, they were moved to a discrete trial-based structure in subsequent 1-hour sessions, wherein the Sd was presented for 30 seconds at the beginning of each trial with a variable ITI. During this discrete cue session, optical inhibition (590 nm, 5 sec, 8 mW, continuous laser) was delivered during the initial 10 seconds of each cue presentation. Mice received 4 such training sessions and their nose poke responses during the cue presentations were recorded (active nose pokes). We converted the number of active nose pokes in each session to a percentage of the number of active nose pokes in the first training session (%Baseline active response) in order to eliminate initial baseline learning differences between mice.
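The baseline normalization used for this experiment amounts to dividing each session's count by the first session's count. A minimal sketch, with a hypothetical function name and made-up counts used purely for illustration:

```python
def percent_baseline(active_pokes_per_session):
    """Express each session's active nose pokes as a percentage of session 1."""
    baseline = active_pokes_per_session[0]
    if baseline == 0:
        raise ValueError("baseline session has no active responses")
    return [100.0 * n / baseline for n in active_pokes_per_session]


# Illustrative counts for one mouse across the 4 sessions (not real data).
example = percent_baseline([40, 50, 30, 60])
```

By construction, the first session is always 100%, so group differences reflect change from each animal's own starting point.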
Optical inhibition of D1- and D2-MSNs during open field test.
Halorhodopsin [inhibitory opsin (AAV5.hSyn.DIO.eNpHR3.0.YFP)] or eYFP [control vector (AAV5.hSyn1.eYFP)] was bilaterally expressed in D1-Cre and A2A-Cre mice using the strategies previously described. For the locomotor activity testing, all mice were placed in an open field arena and, following a 5-minute habituation phase, were given yellow laser stimulations on a random interval schedule (590 nm, 5 sec, 8 mW, continuous laser). The behavioral session was recorded via a camera above the open field arena. Velocity (cm/second) and distance traveled (cm) during the intervals between each laser stimulation were measured using EthoVision XT video tracking software (Noldus). Data are reported as average velocity per minute (cm/s/min) and average distance traveled per minute (cm/min).
Optogenetic excitation approach validation via fiber photometry:
A group of D1-Cre and A2A-Cre mice was injected with Chrimson.FLEX: AAV5-Syn-FLEX-rc[ChrimsonR-tdTomato] (Chrimson; Addgene) into the NAc core as described above. A Cre-recombinase-dependent virus carrying the fluorescent calcium indicator GCaMP6f (AAV5.hSyn.Flex.GCaMP6f.WPRE.SV40) was also expressed in the NAc core of transgenic mice as described for the fiber photometry experiments. Control mice (No Chrimson) received only the GCaMP6f injections. A 400 μm fiber optic was implanted into the NAc core. Using the same stimulation parameters described above for optogenetic stimulation, a laser stimulation (590nm, 1 second, 20Hz, 8mW) was delivered into the NAc core while recording fluorescent signals emitted from GCaMP6f in the same animal using fiber photometry. Chrimson and No Chrimson control mice received 6 laser stimulations during Session 1 and again 24 hours later during Session 2. The size of the calcium response elicited by the Chrimson stimulations was recorded.
QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analyses were performed using GraphPad Prism (version 9; GraphPad Software, Inc, La Jolla, CA) and Matlab (Mathworks, Natick, MA). Z-scores were calculated as explained above (see fiber photometry analysis). We used nested ANOVAs as well as paired, unpaired and independent t-tests where appropriate for analyzing peak height and AUC values for fiber photometry experiments. Repeated measures ANOVAs were used for the behavioral data from reinforcement studies (positive and negative reinforcement) as well as for the optogenetic studies with multiple trials (ICSS and cue inhibition). For the positive and negative reinforcement tasks, the probability of responding during the discriminative cue was calculated as the number of nose pokes on the active port during the predictive cue only divided by the total number of nose pokes on the active side during both the cue and the ITI. We then ran independent sample tests to compare this probability to chance in each task. For positive reinforcement, the chance level for making a response on the active port during the cue presentation was 50% (30 second cue period versus 30 second averaged ITI period). For negative reinforcement, the chance level for making a response on the active port during the predictive cue was 75.1% (230 second cue period versus 76 second averaged ITI period). We also used one-way ANOVAs for the optogenetic studies, which were followed by Dunnett or Bonferroni post-hoc analyses to compare groups. For the single cell imaging experiments, maximum z-scores were identified and statistical difference from the critical z-scores at the p=0.05 level (+1.96 or −1.96) was computed using independent t-tests. We further compared FC1 and FC4 cell responses (peak height or AUC) using paired t-tests. Alpha was 0.05 for all statistical analyses. All data are depicted as group mean ± standard error of the mean (S.E.M.). We assumed normal distribution of sample means for all t and F statistics.
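The chance levels quoted above follow from the share of each cue-plus-ITI cycle that the cue occupies. The sketch below, with hypothetical function names, reproduces that arithmetic along with the response probability ratio; it is an illustration of the calculation and not the analysis script itself.

```python
def chance_level(cue_s, mean_iti_s):
    """Fraction of a cue + ITI cycle occupied by the cue."""
    return cue_s / (cue_s + mean_iti_s)


def cue_response_probability(cue_pokes, iti_pokes):
    """Active nose pokes during the cue divided by all active nose pokes."""
    total = cue_pokes + iti_pokes
    if total == 0:
        raise ValueError("no active nose pokes recorded")
    return cue_pokes / total


positive_chance = chance_level(30, 30)    # 30 s cue, 30 s mean ITI
negative_chance = chance_level(230, 76)   # 230 s cue period, 76 s mean ITI
```

With these durations the positive reinforcement chance level is exactly 0.5, and the negative reinforcement value is 230/306, or roughly 0.75, in line with the figure given in the text.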
KEY RESOURCES TABLE
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | | |
| anti-GFP | Abcam | #AB13970 |
| anti-chicken AlexaFluor 488 | Life Technologies | #A-11039 |
| Bacterial and virus strains | | |
| AAV5.hSyn.Flex.GCaMP6f.WPRE.SV40 | Chen et al. (2013) Nature doi: 10.1038/nature12354 | Addgene #100833-AAV5 |
| AAV5.CAG.Flex.GCaMP6m.WPRE.SV40 | Chen et al. (2013) Nature doi: 10.1038/nature12354 | Addgene #100839-AAV5 |
| AAV5.hSyn.DIO.NpHR3.0-YFP | Deisseroth Lab unpublished | Addgene #26972-AAV5 |
| AAV5.Ef1a.DIO.hChR2 | Mattis et al. (2011) Nature Methods doi: 10.1038/nmeth.1808 | Addgene #35509-AAV5 |
| AAV5.hSyn1.eYFP | Challis et al. (2019) Nature Protocols doi: 10.1038/s41596-018-0097-3 | Addgene #117382-AAV5 |
| AAV5-Syn-FLEX-rc[ChrimsonR-tdTomato] | Klapoetke et al. (2014) Nature Methods doi: 10.1038/nmeth.2836 | Addgene #62723-AAV5 |
| Experimental models: Organisms/strains | | |
| C57BL/6J mice | Jackson Laboratories | SN: 000664 |
| D1-Cre mice | Jackson Laboratories | SN: 030329 |
| Adora2a (A2A)-Cre mice | Mutant Mouse Resource and Research Centers | SN: 036158-UCD |
| Software and algorithms | | |
| Matlab R2019b | Mathworks | N/A |
| Prism 9.0 | GraphPad | N/A |
| Constrained non-negative matrix factorization algorithm (CNMF-E) | Zhou et al. (2018) eLife doi: 10.7554/eLife.28728 | N/A |
| Inscopix Data Process Software (IDPS) | Inscopix | N/A |
Highlights.
D1 and D2 MSNs in the NAc of mice do not signal reward or valence
D1 and D2 MSNs do not have opposing activity patterns during behavior
D1 MSN responses are evoked by stimuli regardless of valence or contingency
D2 MSNs track valence-free prediction error at the individual neuron level
ACKNOWLEDGEMENTS
This work was supported by NIH grants DA048931, AA030931, and DA052317 to E.S.C., R21MH132052 and KL2TR002245 to M.G.K and GM07628 to J.E.Z. as well as by funds from the VUMC Faculty Research Scholar Award to M.G.K., the Pfeil Foundation to M.G.K., Brain and Behavior Research Foundation to M.G.K and E.S.C, the Whitehall Foundation to E.S.C., and the Edward Mallinckrodt, Jr. Foundation to E.S.C.
DECLARATION OF INTERESTS
The authors declare no competing interests.
REFERENCES
- 1.Cardinal RN (2006). Neural systems implicated in delayed and probabilistic reinforcement. Neural Networks 19, 1277–1301. 10.1016/j.neunet.2006.03.004. [DOI] [PubMed] [Google Scholar]
- 2.Green L, and Myerson J (2004). A Discounting Framework for Choice With Delayed and Probabilistic Rewards. Psychol Bull 130, 769–792. 10.1037/0033-2909.130.5.769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Avanzi M, Uber E, and Bonfà F (2004). Pathological gambling in two patients on dopamine replacement therapy for Parkinson’s disease. Neurol Sci 25, 98–101. 10.1007/s10072-004-0238-z. [DOI] [PubMed] [Google Scholar]
- 4.Chang C-J, Guo W, Zhang J, Newman J, Sun S-H, and Wilson M (2021). Behavioral clusters revealed by end-to-end decoding from microendoscopic imaging. bioRxiv, 2021.04.15.440055. 10.1101/2021.04.15.440055. [DOI] [Google Scholar]
- 5.Dodd ML, Klos KJ, Bower JH, Geda YE, Josephs KA, and Ahlskog JE (2005). Pathological Gambling Caused by Drugs Used to Treat Parkinson Disease. Archives of Neurology 62, 1377–1381. 10.1001/archneur.62.9.noc50009. [DOI] [PubMed] [Google Scholar]
- 6.Keiflin R, and Janak PH (2015). Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry. Neuron 88, 247–263. 10.1016/j.neuron.2015.08.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Redish AD (2004). Addiction as a Computational Process Gone Awry. Science 306, 1944–1947. 10.1126/science.1102384. [DOI] [PubMed] [Google Scholar]
- 8.Schultz W (2011). Potential Vulnerabilities of Neuronal Reward, Risk, and Decision Mechanisms to Addictive Drugs. Neuron 69, 603–617. 10.1016/j.neuron.2011.02.014. [DOI] [PubMed] [Google Scholar]
- 9.Wise RA, and Koob GF (2014). The Development and Maintenance of Drug Addiction. Neuropsychopharmacology 39, 254–262. 10.1038/npp.2013.261. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Bussey T (1999). Novelty in the brain. Trends in Cognitive Sciences 3, 126. 10.1016/S1364-6613(99)01305-4. [DOI] [Google Scholar]
- 11.Lubow RE (1973). Latent inhibition. Psychological Bulletin 79, 398–407. 10.1037/h0034425. [DOI] [PubMed] [Google Scholar]
- 12.Lubow RE (1973). Latent Inhibition as a Means of Behavior Prophylaxis. Psychol Rep 32, 1247–1252. 10.2466/pr0.1973.32.3c.1247. [DOI] [PubMed] [Google Scholar]
- 13.Parkin AJ (1997). Human memory: Novelty, association and the brain. Current Biology 7, R768–R769. 10.1016/S0960-9822(06)00400-3. [DOI] [PubMed] [Google Scholar]
- 14.Quintero E, Díaz E, Vargas JP, Schmajuk N, López JC, and De la Casa LG (2011). Effects of context novelty vs. familiarity on latent inhibition with a conditioned taste aversion procedure. Behavioural Processes 86, 242–249. 10.1016/j.beproc.2010.12.011. [DOI] [PubMed] [Google Scholar]
- 15.Rescorla R, and Wagner A (1972). A theory of Pavlovian conditioning : Variations in the effectiveness of reinforcement and nonreinforcement. In Classical Conditioning: Current research and theory 2, 64–99. [Google Scholar]
- 16.Russell JA (2003). Core affect and the psychological construction of emotion. Psychological Review 110, 145–172. 10.1037/0033-295X.110.1.145. [DOI] [PubMed] [Google Scholar]
- 17.Kutlu MG, Zachry JE, Brady LJ, Melugin PR, Kelly SJ, Sanders C, Tat J, Johnson AR, Thibeault K, Lopez AJ, et al. (2020). A novel multidimensional reinforcement task in mice elucidates sex-specific behavioral strategies. Neuropsychopharmacol. 45, 1463–1472. 10.1038/s41386-020-0692-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Thibeault KC, Kutlu MG, Sanders C, and Calipari ES (2019). Cell-type and projection-specific dopaminergic encoding of aversive stimuli in addiction. Brain Research 1713, 1–15. 10.1016/j.brainres.2018.12.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Kutlu MG, Zachry JE, Melugin PR, Cajigas SA, Chevee MF, Kelly SJ, Kutlu B, Tian L, Siciliano CA, and Calipari ES (2021). Dopamine release in the nucleus accumbens core signals perceived saliency. Current Biology 31, 4748–4761.e8. 10.1016/j.cub.2021.08.052. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Oleson EB, Gentry RN, Chioma VC, and Cheer JF (2012). Subsecond Dopamine Release in the Nucleus Accumbens Predicts Conditioned Punishment and Its Successful Avoidance. J. Neurosci 32, 14804–14808. 10.1523/JNEUROSCI.3087-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Wenzel JM, Oleson EB, Gove WN, Cole AB, Gyawali U, Dantrassy HM, Bluett RJ, Dryanovski DI, Stuber GD, Deisseroth K, et al. (2018). Phasic dopamine signals in the nucleus accumbens that cause active avoidance require endocannabinoid mobilization in the midbrain. Curr Biol 28, 1392–1404.e5. 10.1016/j.cub.2018.03.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Bin Saifullah MA, Nagai T, Kuroda K, Wulaer B, Nabeshima T, Kaibuchi K, and Yamada K (2018). Cell type-specific activation of mitogen-activated protein kinase in D1 receptor-expressing neurons of the nucleus accumbens potentiates stimulus-reward learning in mice. Sci Rep 8, 14413. 10.1038/s41598-018-32840-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Day JJ, and Carelli RM (2007). The Nucleus Accumbens and Pavlovian Reward Learning. Neuroscientist 13, 148–159. 10.1177/1073858406295854. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.de Jong JW, Afjei SA, Pollak Dorocic I, Peck JR, Liu C, Kim CK, Tian L, Deisseroth K, and Lammel S (2019). A Neural Circuit Mechanism for Encoding Aversive Stimuli in the Mesolimbic Dopamine System. Neuron 101, 133–151.e7. 10.1016/j.neuron.2018.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Ray MH, Russ AN, Walker RA, and McDannald MA (2020). The Nucleus Accumbens Core is Necessary to Scale Fear to Degree of Threat. J. Neurosci 40, 4750–4760. 10.1523/JNEUROSCI.0299-20.2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Sugam JA, Saddoris MP, and Carelli RM (2014). Nucleus Accumbens Neurons Track Behavioral Preferences and Reward Outcomes During Risky Decision Making. Biological Psychiatry 75, 807–816. 10.1016/j.biopsych.2013.09.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Albin RL, Young AB, and Penney JB (1989). The functional anatomy of basal ganglia disorders. Trends in Neurosciences 12, 366–375. 10.1016/0166-2236(89)90074-X. [DOI] [PubMed] [Google Scholar]
- 28.Gerfen CR (1992). The Neostriatal Mosaic: Multiple Levels of Compartmental Organization in the Basal Ganglia. Annual Review of Neuroscience 15, 285–320. 10.1146/annurev.ne.15.030192.001441. [DOI] [PubMed] [Google Scholar]
- 29.Gerfen CR, Engber TM, Mahan LC, Susel Z, Chase TN, Monsma FJ, and Sibley DR (1990). D1 and D2 Dopamine Receptor-regulated Gene Expression of Striatonigral and Striatopallidal Neurons. Science 250, 1429–1432. 10.1126/science.2147780. [DOI] [PubMed] [Google Scholar]
- 30.Kupchik YM, Brown RM, Heinsbroek JA, Lobo MK, Schwartz DJ, and Kalivas PW (2015). Coding the direct/indirect pathways by D1 and D2 receptors is not valid for accumbens projections. Nat Neurosci 18, 1230–1232. 10.1038/nn.4068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Britt JP, Benaliouad F, McDevitt RA, Stuber GD, Wise RA, and Bonci A (2012). Synaptic and Behavioral Profile of Multiple Glutamatergic Inputs to the Nucleus Accumbens. Neuron 76, 790–803. 10.1016/j.neuron.2012.09.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Russo SJ, and Nestler EJ (2013). The brain reward circuitry in mood disorders. Nat Rev Neurosci 14, 609–625. 10.1038/nrn3381. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Bateup HS, Santini E, Shen W, Birnbaum S, Valjent E, Surmeier DJ, Fisone G, Nestler EJ, and Greengard P (2010). Distinct subclasses of medium spiny neurons differentially regulate striatal motor behaviors. Proceedings of the National Academy of Sciences 107, 14845–14850. 10.1073/pnas.1009874107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Kravitz AV, Freeze BS, Parker PRL, Kay K, Thwin MT, Deisseroth K, and Kreitzer AC (2010). Regulation of parkinsonian motor behaviours by optogenetic control of basal ganglia circuitry. Nature 466, 622–626. 10.1038/nature09159. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Kravitz AV, Tye LD, and Kreitzer AC (2012). Distinct roles for direct and indirect pathway striatal neurons in reinforcement. Nat Neurosci 15, 816–818. 10.1038/nn.3100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Parker JG, Marshall JD, Ahanonu B, Wu Y-W, Kim TH, Grewe BF, Zhang Y, Li JZ, Ding JB, Ehlers MD, et al. (2018). Diametric neural ensemble dynamics in parkinsonian and dyskinetic states. Nature 557, 177–182. 10.1038/s41586-018-0090-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Bock R, Shin JH, Kaplan AR, Dobi A, Markey E, Kramer PF, Gremel CM, Christensen CH, Adrover MF, and Alvarez VA (2013). Strengthening the accumbal indirect pathway promotes resilience to compulsive cocaine use. Nat Neurosci 16, 632–638. 10.1038/nn.3369. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Calipari ES, Bagot RC, Purushothaman I, Davidson TJ, Yorgason JT, Peña CJ, Walker DM, Pirpinias ST, Guise KG, Ramakrishnan C, et al. (2016). In vivo imaging identifies temporal signature of D1 and D2 medium spiny neurons in cocaine reward. Proc Natl Acad Sci U S A 113, 2726–2731. 10.1073/pnas.l521238113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Creed M, Ntamati NR, Chandra R, Lobo MK, and Lüscher C (2016). Convergence of Reinforcing and Anhedonic Cocaine Effects in the Ventral Pallidum. Neuron 92, 214–226. 10.1016/j.neuron.2016.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Danjo T, Yoshimi K, Funabiki K, Yawata S, and Nakanishi S (2014). Aversive behavior induced by optogenetic inactivation of ventral tegmental area dopamine neurons is mediated by dopamine D2 receptors in the nucleus accumbens. Proceedings of the National Academy of Sciences 111, 6455–6460. 10.1073/pnas.1404323111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Francis TC, Chandra R, Friend DM, Finkel E, Dayrit G, Miranda J, Brooks JM, Iñiguez SD, O’Donnell P, Kravitz A, et al. (2015). Nucleus Accumbens Medium Spiny Neuron Subtypes Mediate Depression-Related Outcomes to Social Defeat Stress. Biol Psychiatry 77, 212–222. 10.1016/j.biopsych.2014.07.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Hearing M, Graziane N, Dong Y, and Thomas MJ (2018). Opioid and psychostimulant plasticity: targeting overlap in nucleus accumbens glutamate signaling. Trends Pharmacol Sci 39, 276–294. 10.1016/j.tips.2017.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Lobo MK, Covington HE, Chaudhury D, Friedman AK, Sun H, Damez-Werno D, Dietz DM, Zaman S, Koo JW, Kennedy PJ, et al. (2010). Cell Type–Specific Loss of BDNF Signaling Mimics Optogenetic Control of Cocaine Reward. Science 330, 385–390. 10.1126/science.1188472. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Lobo MK, and Nestler EJ (2011). The Striatal Balancing Act in Drug Addiction: Distinct Roles of Direct and Indirect Pathway Medium Spiny Neurons. Front Neuroanat 5, 1–11. 10.3389/fnana.2011.00041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Jaskir A, and Frank MJ (2023). On the normative advantages of dopamine and striatal opponency for learning and choice. eLife 12, e85107. 10.7554/eLife.85107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Al-Hasani R, McCall JG, Shin G, Gomez AM, Schmitz GP, Bernardi JM, Pyo C-O, Park SI, Marcinkiewcz CM, Crowley NA, et al. (2015). Distinct Subpopulations of Nucleus Accumbens Dynorphin Neurons Drive Aversion and Reward. Neuron 87, 1063–1077. 10.1016/j.neuron.2015.08.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Cole SL, Robinson MJF, and Berridge KC (2018). Optogenetic self-stimulation in the nucleus accumbens: D1 reward versus D2 ambivalence. PLOS ONE 13, e0207694. 10.1371/journal.pone.0207694. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Nishioka T, Macpherson T, Hamaguchi K, and Hikida T (2021). Distinct Roles of Dopamine D1 and D2 Receptor-expressing Neurons in the Nucleus Accumbens for a Strategy Dependent Decision Making. Preprint at bioRxiv, 10.1101/2021.08.05.455353 10.1101/2021.08.05.455353. [DOI] [Google Scholar]
- 49.Soares-Cunha C, Coimbra B, Sousa N, and Rodrigues AJ (2016). Reappraising striatal D1- and D2-neurons in reward and aversion. Neuroscience & Biobehavioral Reviews 68, 370–386. 10.1016/j.neubiorev.2016.05.021.
- 50.Soares-Cunha C, Coimbra B, Domingues AV, Vasconcelos N, Sousa N, and Rodrigues AJ (2018). Nucleus Accumbens Microcircuit Underlying D2-MSN-Driven Increase in Motivation. eNeuro 5. 10.1523/ENEURO.0386-18.2018.
- 51.Soares-Cunha C, de Vasconcelos NAP, Coimbra B, Domingues AV, Silva JM, Loureiro-Campos E, Gaspar R, Sotiropoulos I, Sousa N, and Rodrigues AJ (2020). Nucleus accumbens medium spiny neurons subtypes signal both reward and aversion. Mol Psychiatry 25, 3241–3255. 10.1038/s41380-019-0484-3.
- 52.Soares-Cunha C, Domingues AV, Correia R, Coimbra B, Vieitas-Gaspar N, de Vasconcelos NAP, Pinto L, Sousa N, and Rodrigues AJ (2022). Distinct role of nucleus accumbens D2-MSN projections to ventral pallidum in different phases of motivated behavior. Cell Reports 38, 110380. 10.1016/j.celrep.2022.110380.
- 53.Zalocusky KA, Ramakrishnan C, Lerner TN, Davidson TJ, Knutson B, and Deisseroth K (2016). Nucleus accumbens D2R cells signal prior outcomes and control risky decision-making. Nature 531, 642–646. 10.1038/nature17400.
- 54.Gallo EF, Meszaros J, Sherman JD, Chohan MO, Teboul E, Choi CS, Moore H, Javitch JA, and Kellendonk C (2018). Accumbens dopamine D2 receptors increase motivation by decreasing inhibitory transmission to the ventral pallidum. Nat Commun 9, 1086. 10.1038/s41467-018-03272-2.
- 55.Roitman MF, Wheeler RA, and Carelli RM (2005). Nucleus Accumbens Neurons Are Innately Tuned for Rewarding and Aversive Taste Stimuli, Encode Their Predictors, and Are Linked to Motor Output. Neuron 45, 587–597. 10.1016/j.neuron.2004.12.055.
- 56.Schultz W, Dayan P, and Montague PR (1997). A neural substrate of prediction and reward. Science 275, 1593–1599. 10.1126/science.275.5306.1593.
- 57.Fiorillo CD, Tobler PN, and Schultz W (2003). Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons. Science 299, 1898–1902. 10.1126/science.1077349.
- 58.Tobler PN, Fiorillo CD, and Schultz W (2005). Adaptive Coding of Reward Value by Dopamine Neurons. Science 307, 1642–1645. 10.1126/science.1105370.
- 59.Nicola SM (2007). The nucleus accumbens as part of a basal ganglia action selection circuit. Psychopharmacology 191, 521–550. 10.1007/s00213-006-0510-4.
- 60.Willmore L, Cameron C, Yang J, Witten IB, and Falkner AL (2022). Behavioural and dopaminergic signatures of resilience. Nature 611, 124–132. 10.1038/s41586-022-05328-2.
- 61.Bloem B, Huda R, Amemori K, Abate AS, Krishna G, Wilson AL, Carter CW, Sur M, and Graybiel AM (2022). Multiplexed action-outcome representation by striatal striosome-matrix compartments detected with a mouse cost-benefit foraging task. Nat Commun 13, 1541. 10.1038/s41467-022-28983-5.
- 62.Iordanova MD, Yau JO-Y, McDannald MA, and Corbit LH (2021). Neural substrates of appetitive and aversive prediction error. Neuroscience & Biobehavioral Reviews 123, 337–351. 10.1016/j.neubiorev.2020.10.029.
- 63.Li SSY, and McNally GP (2015). A role of nucleus accumbens dopamine receptors in the nucleus accumbens core, but not shell, in fear prediction error. Behav Neurosci 129, 450–456. 10.1037/bne0000071.
- 64.Ambroggi F, Ghazizadeh A, Nicola SM, and Fields HL (2011). Roles of Nucleus Accumbens Core and Shell in Incentive-Cue Responding and Behavioral Inhibition. J. Neurosci 31, 6820–6830. 10.1523/JNEUROSCI.6491-10.2011.
- 65.Corbit LH, Muir JL, and Balleine BW (2001). The Role of the Nucleus Accumbens in Instrumental Conditioning: Evidence of a Functional Dissociation between Accumbens Core and Shell. J. Neurosci 21, 3251–3260. 10.1523/JNEUROSCI.21-09-03251.2001.
- 66.Floresco SB, McLaughlin RJ, and Haluk DM (2008). Opposing roles for the nucleus accumbens core and shell in cue-induced reinstatement of food-seeking behavior. Neuroscience 154, 877–884. 10.1016/j.neuroscience.2008.04.004.
- 67.West EA, and Carelli RM (2016). Nucleus Accumbens Core and Shell Differentially Encode Reward-Associated Cues after Reinforcer Devaluation. J Neurosci 36, 1128–1139. 10.1523/JNEUROSCI.2976-15.2016.
- 68.Guo Q, Wang D, He X, Feng Q, Lin R, Xu F, Fu L, and Luo M (2015). Whole-Brain Mapping of Inputs to Projection Neurons and Cholinergic Interneurons in the Dorsal Striatum. PLOS ONE 10, e0123381. 10.1371/journal.pone.0123381.
- 69.Otis JM, Namboodiri VMK, Matan AM, Voets ES, Mohorn EP, Kosyk O, McHenry JA, Robinson JE, Resendez SL, Rossi MA, et al. (2017). Prefrontal cortex output circuits guide reward seeking through divergent cue encoding. Nature 543, 103–107. 10.1038/nature21376.
- 70.Stuber GD, Sparta DR, Stamatakis AM, van Leeuwen WA, Hardjoprajitno JE, Cho S, Tye KM, Kempadoo KA, Zhang F, Deisseroth K, et al. (2011). Excitatory transmission from the amygdala to nucleus accumbens facilitates reward seeking. Nature 475, 377–380. 10.1038/nature10194.
- 71.Chen G, Lai S, Bao G, Ke J, Meng X, Lu S, Wu X, Xu H, Wu F, Xu Y, et al. (2023). Distinct reward processing by subregions of the nucleus accumbens. Cell Reports 42, 112069. 10.1016/j.celrep.2023.112069.
- 72.Zhou K, Xu H, Lu S, Jiang S, Hou G, Deng X, He M, and Zhu Y (2022). Reward and aversion processing by input-defined parallel nucleus accumbens circuits in mice. Nat Commun 13, 6244. 10.1038/s41467-022-33843-3.
- 73.West EA, and Carelli RM (2016). Nucleus Accumbens Core and Shell Differentially Encode Reward-Associated Cues after Reinforcer Devaluation. J Neurosci 36, 1128–1139. 10.1523/JNEUROSCI.2976-15.2016.
- 74.Soares-Cunha C, Coimbra B, David-Pereira A, Borges S, Pinto L, Costa P, Sousa N, and Rodrigues AJ (2016). Activation of D2 dopamine receptor-expressing neurons in the nucleus accumbens increases motivation. Nat Commun 7, 11829. 10.1038/ncomms11829.
- 75.Baldo BA, and Kelley AE (2007). Discrete neurochemical coding of distinguishable motivational processes: insights from nucleus accumbens control of feeding. Psychopharmacology 191, 439–459. 10.1007/s00213-007-0741-z.
- 76.Carlezon WA, and Thomas MJ (2009). Biological substrates of reward and aversion: A nucleus accumbens activity hypothesis. Neuropharmacology 56, 122–132. 10.1016/j.neuropharm.2008.06.075.
- 77.Natsubori A, Tsutsui-Kimura I, Nishida H, Bouchekioua Y, Sekiya H, Uchigashima M, Watanabe M, de d’Exaerde AK, Mimura M, Takata N, et al. (2017). Ventrolateral Striatal Medium Spiny Neurons Positively Regulate Food-Incentive, Goal-Directed Behavior Independently of D1 and D2 Selectivity. J. Neurosci 37, 2723–2733. 10.1523/JNEUROSCI.3377-16.2017.
- 78.Burke DA, Rotstein HG, and Alvarez VA (2017). Striatal Local Circuitry: A New Framework for Lateral Inhibition. Neuron 96, 267–284. 10.1016/j.neuron.2017.09.019.
- 79.Dobbs LK, Kaplan AR, Lemos JC, Matsui A, Rubinstein M, and Alvarez VA (2016). Dopamine Regulation of Lateral Inhibition between Striatal Neurons Gates the Stimulant Actions of Cocaine. Neuron 90, 1100–1113. 10.1016/j.neuron.2016.04.031.
- 80.Luo Z, Volkow ND, Heintz N, Pan Y, and Du C (2011). Acute Cocaine Induces Fast Activation of D1 Receptor and Progressive Deactivation of D2 Receptor Striatal Neurons: In Vivo Optical Microprobe [Ca2+]i Imaging. J. Neurosci 31, 13180–13190. 10.1523/JNEUROSCI.2369-11.2011.
- 81.MacAskill AF, Cassel JM, and Carter AG (2014). Cocaine Exposure Reorganizes Cell-Type and Input-Specific Connectivity in the Nucleus Accumbens. Nat Neurosci 17, 1198–1207. 10.1038/nn.3783.
- 82.Pascoli V, Terrier J, Espallergues J, Valjent E, O’Connor EC, and Lüscher C (2014). Contrasting forms of cocaine-evoked plasticity control components of relapse. Nature 509, 459–464. 10.1038/nature13257.
- 83.Joffe ME, and Grueter BA (2016). Cocaine experience enhances thalamo-accumbens N-methyl-D-aspartate receptor function. Biol Psychiatry 80, 671–681. 10.1016/j.biopsych.2016.04.002.
- 84.Joffe ME, Turner BD, Delpire E, and Grueter BA (2018). Genetic loss of GluN2B in D1-expressing cell types enhances long-term cocaine reward and potentiation of thalamo-accumbens synapses. Neuropsychopharmacology 43, 2383–2389. 10.1038/s41386-018-0131-8.
- 85.Heinsbroek JA, Neuhofer DN, Griffin WC, Siegel GS, Bobadilla A-C, Kupchik YM, and Kalivas PW (2017). Loss of Plasticity in the D2-Accumbens Pallidal Pathway Promotes Cocaine Seeking. J Neurosci 37, 757–767. 10.1523/JNEUROSCI.2659-16.2016.
- 86.Durieux PF, Bearzatto B, Guiducci S, Buch T, Waisman A, Zoli M, Schiffmann SN, and de Kerchove d’Exaerde A (2009). D2R striatopallidal neurons inhibit both locomotor and drug reward processes. Nat Neurosci 12, 393–395. 10.1038/nn.2286.
- 87.Xu L, Nan J, and Lan Y (2020). The Nucleus Accumbens: A Common Target in the Comorbidity of Depression and Addiction. Front Neural Circuits 14, 37. 10.3389/fncir.2020.00037.
- 88.Akerboom J, Chen T-W, Wardill TJ, Tian L, Marvin JS, Mutlu S, Calderón NC, Esposti F, Borghuis BG, Sun XR, et al. (2012). Optimization of a GCaMP Calcium Indicator for Neural Activity Imaging. J. Neurosci 32, 13819–13840. 10.1523/JNEUROSCI.2601-12.2012.
- 89.Dana H, Sun Y, Mohar B, Hulse BK, Kerlin AM, Hasseman JP, Tsegaye G, Tsang A, Wong A, Patel R, et al. (2019). High-performance calcium sensors for imaging activity in neuronal populations and microcompartments. Nat Methods 16, 649–657. 10.1038/s41592-019-0435-6.
- 90.Lerner TN, Shilyansky C, Davidson TJ, Evans KE, Beier KT, Zalocusky KA, Crow AK, Malenka RC, Luo L, Tomer R, et al. (2015). Intact-Brain Analyses Reveal Distinct Information Carried by SNc Dopamine Subcircuits. Cell 162, 635–647. 10.1016/j.cell.2015.07.014.
- 91.Tian L, Hires SA, Mao T, Huber D, Chiappe ME, Chalasani SH, Petreanu L, Akerboom J, McKinney SA, Schreiter ER, et al. (2009). Imaging neural activity in worms, flies and mice with improved GCaMP calcium indicators. Nat Methods 6, 875–881. 10.1038/nmeth.1398.
- 92.Kutlu MG, Zachry JE, Melugin PR, Tat J, Cajigas S, Isiktas AU, Patel DD, Siciliano CA, Schoenbaum G, Sharpe MJ, et al. (2022). Dopamine signaling in the nucleus accumbens core mediates latent inhibition. Nat Neurosci 25, 1071–1081. 10.1038/s41593-022-01126-1.
- 93.Pnevmatikakis EA, Soudry D, Gao Y, Machado TA, Merel J, Pfau D, Reardon T, Mu Y, Lacefield C, Yang W, et al. (2016). Simultaneous Denoising, Deconvolution, and Demixing of Calcium Imaging Data. Neuron 89, 285–299. 10.1016/j.neuron.2015.11.037.
- 94.Zhou P, Resendez SL, Rodriguez-Romaguera J, Jimenez JC, Neufeld SQ, Giovannucci A, Friedrich J, Pnevmatikakis EA, Stuber GD, Hen R, et al. (2018). Efficient and accurate extraction of in vivo calcium signals from microendoscopic video data. eLife 7, e28728. 10.7554/eLife.28728.
Associated Data
Data Availability Statement
Data reported in this paper are available from the lead contact upon reasonable request.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.







