Author manuscript; available in PMC: 2021 Apr 11.
Published in final edited form as: Cell Rep. 2020 Dec 15;33(11):108492. doi: 10.1016/j.celrep.2020.108492

A Comparison of Dopaminergic and Cholinergic Populations Reveals Unique Contributions of VTA Dopamine Neurons to Short-Term Memory

Jung Yoon Choi 1,2, Hee Jae Jang 1, Sharon Ornelas 1, Weston T Fleming 1, Daniel Fürth 3, Jennifer Au 1, Akhil Bandi 1, Esteban A Engel 1, Ilana B Witten 1,2,4,*
PMCID: PMC8038523  NIHMSID: NIHMS1657051  PMID: 33326775

SUMMARY

We systematically compare the contributions of two dopaminergic and two cholinergic ascending populations to a spatial short-term memory task in rats. In ventral tegmental area dopamine (VTA-DA) and nucleus basalis cholinergic (NB-ChAT) populations, trial-by-trial fluctuations in activity during the delay period relate to performance with an inverted-U, despite the fact that both populations have low activity during that time. Transient manipulations reveal that only VTA-DA neurons, and not the other three populations we examine, contribute causally and selectively to short-term memory. This contribution is most significant during the delay period, when both increases and decreases in VTA-DA activity impair short-term memory. Our results reveal a surprising dissociation between when VTA-DA neurons are most active and when they have the biggest causal contribution to short-term memory, and they also provide support for classic ideas about an inverted-U relationship between neuromodulation and cognition.

In Brief

Choi et al. find that dopamine neurons in the VTA contribute to short-term memory during the delay period, despite low activity during that time. Trial-by-trial fluctuations in delay period activity correlate with accuracy with an inverted-U, and either increasing or decreasing DA activity impairs performance.

INTRODUCTION

Short-term memory (Baddeley, 1986; Baddeley and Hitch, 1974; Erlich et al., 2011; Funahashi et al., 1993; Fuster and Alexander, 1971; Inagaki et al., 2019; Kamigaki and Dan, 2017; Kopec et al., 2015; Kubota and Niki, 1971; Liu et al., 2014; Miller et al., 2018; Romo et al., 1999) is a fundamental cognitive process with distinct temporal components: a “sample period” in which new information is updated into short-term memory, a “delay period” in which the memory is maintained, and ultimately a behavioral readout based on the memory (“choice period”). Although neuromodulators have been implicated in short-term memory (Brozoski et al., 1979; Clark and Noudoost, 2014; Croxson et al., 2011; Everitt and Robbins, 1997; Hasselmo and Stern, 2006; Ott and Nieder, 2019; Sun et al., 2017), it remains unclear which neuromodulators are most relevant and which temporal component of short-term memory they support.

For example, dopamine (DA) has been implicated in short-term memory through pioneering experiments that pharmacologically manipulated DA receptors in prefrontal cortex (PFC) in monkeys performing short-term memory tasks (Arnsten et al., 1994; Cai and Arnsten, 1997; Floresco and Phillips, 2001; Murphy et al., 1996; Sawaguchi and Goldman-Rakic, 1991; Vijayraghavan et al., 2007; Williams and Goldman-Rakic, 1995; Zahrt et al., 1997). This line of work suggested that DA has an “inverted-U” influence on short-term memory and on memory-related activity during the delay period. In other words, too much or too little DA is detrimental to short-term memory, while intermediate levels enhance short-term memory. From these experiments, the idea arose that optimal levels of DA in PFC during the delay period serve to stabilize memory-related activity (Figure 1A; Arnsten, 1997; Arnsten et al., 2012; Cools and D’Esposito, 2011; Gibbs and D’Esposito, 2005).

Figure 1. Fiber Photometry Recordings of VTA-DA, SNc-DA, NB-ChAT, and MS-ChAT Neurons in Rats Performing a Spatial Short-Term Memory Task.

(A) Schematic of inverted-U hypothesis adapted from Cools and Robbins (2004). In this framework, DA contributes to maintaining a memory item during the delay period.

(B) Schematic of gating hypothesis, adapted from Hazy et al. (2007). In this framework, DA contributes to updating of new information during the sample period.

(C) Schematic of the delayed non-match to position (DNMTP) task. The sample period starts with the sample lever presentation on either the left or right side of the chamber (“sample presentation”). Pressing the sample lever (“sample press”) triggers the nosepoke on the back of the chamber to be illuminated. The delay period initiates when the rat makes a nosepoke, which turns off the nosepoke light (“delay start”). After the delay period (1, 5, or 10 s), the nosepoke is again illuminated, signaling that the delay period is over (“delay end”). Upon making another nosepoke, choice levers are presented (“choice presentation”). The rat needed to press the “non-match” lever (“choice press”) to be rewarded with a drop of water during the outcome period. Colored bars delineate the duration of the sample, delay, choice, and outcome periods relative to the task events. Note that choice period denotes the choice readout period, as opposed to when the choice is necessarily being made.

(D) Performance of trained rats during fiber photometry recordings (n = 34 rats; bars and error bars indicate mean ± SEM across rats). Rats showed delay-dependent impairment in accuracy (one-way ANOVA, accuracy explained by delay duration; p < 0.001 for delay; n = 34 rats).

(E) Cell-type-specific expression of GCaMP6f was obtained using TH::Cre or ChAT::Cre rats and Cre-dependent GCaMP virus (AAV2/5-CAG-DIO or FLEX-GCaMP6f).

(F) Example recording trace showing simultaneous acquisition of time-varying GCaMP6f fluorescence from NB-ChAT neurons (top), timestamps of each task event (middle), and the rat’s speed in the chamber (bottom).

(G) Schematic of midbrain DA system. Two nuclei of interest are the ventral tegmental area (VTA-DA) and substantia nigra pars compacta (SNc-DA).

(H and I) GCaMP6f (green) is specifically expressed in DA neurons (red) in the VTA (H) and SNc (I).

(J) Schematic of basal forebrain ChAT system. Two nuclei of interest are the nucleus basalis (NB-ChAT) and medial septum (MS-ChAT).

(K and L) GCaMP6f (green) is specifically expressed in ChAT neurons (red) in the NB (K) and MS (L). Scale bars: 1 mm (top), 500 μm (bottom left), and 25 μm (bottom right) (H and I, K and L).

See also Figure S1.

However, integrating these findings with the understanding that has emerged based on direct recordings of activity in DA neurons has presented a challenge. DA neurons with cell bodies in the ventral tegmental area (VTA-DA) and substantia nigra (SNc-DA) project to the striatum, PFC, and other forebrain regions. These neurons, which are thought to provide the major source of DA to their forebrain targets, are known to respond transiently to unexpected rewards and reward-predicting cues (Bayer and Glimcher, 2005; Cohen et al., 2012; Ellwood et al., 2017; Ljungberg et al., 1991; Parker et al., 2016; Roesch et al., 2007; Schultz, 1986, 1998; Schultz et al., 1993). This signal has been interpreted as a reward prediction error, which is thought to support reinforcement learning (Chang et al., 2016; Parker et al., 2016; Steinberg et al., 2013). On the other hand, DA neurons are not known to be active during the delay period of tasks with short-term memory components, when rewards and reward-predicting cues are absent (Cohen et al., 2012; Ljungberg et al., 1991; Matsumoto and Takada, 2013).

Thus, the “gating” theory of short-term memory has been proposed to integrate the role of DA in encoding a reward prediction error signal with the idea that it regulates short-term memory (Figure 1B; Braver and Cohen, 1999, 2000; O’Reilly and Frank, 2006). In this model, phasic bursts of DA at the times of reward-predicting events serve to open the “gate” and update relevant items into short-term memory. Low levels of DA during the delay period allow the gate to remain closed and prevent distractors from overwriting the memory item.

In particular, the gating theory suggests that phasic DA at the time of updating is critical to short-term memory, while the classic ideas based on pharmacology suggest that tonic levels of DA during the delay period are more important (Figures 1A and 1B). In order to directly test these two ideas, we must understand when DA contributes to short-term memory—does DA affect the updating of short-term memory with new information during the sample period, or is it more important during the delay period?

Addressing this question requires first knowing which DA subregions are relevant to short-term memory. The two major ascending sources of DA to the forebrain arise from the VTA and SNc. In addition to determining which DA neurons are relevant to short-term memory, and when they contribute, we also compared the role of DA to that of other neuromodulators. We focused on ascending cholinergic (ChAT) neurons arising from the basal forebrain regions—nucleus basalis (NB-ChAT) and medial septum (MS-ChAT)—given previous work implicating these populations in short-term memory and other cognitive processes (Croxson et al., 2011; Hasselmo, 2006; Hasselmo and Sarter, 2011; Hasselmo and Stern, 2006; Sun et al., 2017).

RESULTS

Rats Performed a DNMTP Task during Optical Recording of DA and ChAT Neurons

Rats were trained on a rodent spatial short-term memory task known as delayed non-match to position (DNMTP; Figure 1C; Akhlaghpour et al., 2016; Dunnett et al., 1988). In the DNMTP task, rats are presented with a sample lever in one of two possible locations on the front wall of the chamber (“sample presentation”). Upon pressing the lever (“sample press”), the lever retracts and the nosepoke on the back wall of the chamber is illuminated. The rat then initiates the delay period by entering the nosepoke (“delay start”). After a delay of either 1, 5, or 10 s, when the rat re-enters the nosepoke, both levers are presented on the front wall (“choice presentation”). To obtain a water reward, the rat must press the lever that does not match the initial sample lever (“choice press”). Trained rats performed well above chance and displayed a delay-dependent decline in performance (Figure 1D; one-way ANOVA, accuracy explained by delay; p < 0.001 for delay; n = 34 rats).

After training, rats were injected with a Cre-dependent AAV2/5 GCaMP6f virus in the VTA or SNc in the case of TH::Cre rats (Figures 1G–1I) or in the NB or MS in the case of ChAT::Cre rats (Figures 1J–1L) and implanted with an optical fiber at the same location for fiber photometry recordings (Figure 1E; Figure S1). We recorded time-varying GCaMP fluorescence during the task, along with the animal’s head position in the chamber and the timestamps for each task event (Figure 1F).

VTA-DA, SNc-DA, and NB-ChAT, but Not MS-ChAT, Populations Primarily Encode Task Events Rather Than the Rats’ Speed

Before examining in detail the neural correlates of behavioral events, we determined whether the animals’ movement in the chamber provided a better explanation of neural activity than the events themselves. This is a possible confound in interpreting neural correlates of events, given that in an operant task, an animal’s movement may correlate with the timing of task events; therefore, apparent neural correlates of task events may be better explained as neural correlates of movement.

Thus, we compared the predictive power of linear encoding models (Engelhard et al., 2019; Lovett-Barron et al., 2019; Musall et al., 2019) in which the GCaMP signal was predicted based on different sets of predictors: either only speed (“speed model”), only task events (“event model”), or a full model based on both task events and speed (“event and speed model,” model schematic in Figure 2A; see STAR Methods for details on encoding models). This revealed that the time-varying GCaMP signal in VTA-DA, SNc-DA, and NB-ChAT was better explained by the task events than by speed, whereas speed explained GCaMP in MS-ChAT as well as task events did (Figure 2B; one-way ANOVA, R² over 3-fold cross-validation for different encoding models: p < 0.001 for VTA-DA, p < 0.001 for SNc-DA, p < 0.001 for NB-ChAT, p < 0.001 for MS-ChAT; post hoc pairwise t test between speed model and task events model, with Bonferroni correction: p < 0.001 for VTA-DA, p < 0.001 for SNc-DA, p < 0.001 for NB-ChAT, p = 0.86 for MS-ChAT; n = 10 sites for VTA-DA, n = 13 for SNc-DA, n = 18 for NB-ChAT, n = 8 for MS-ChAT; Figures S2A–S2F for the speed encoding of MS-ChAT; Figures S2G–S2L for the visualization of task event kernels learned from the full model).
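The logic of this model comparison can be illustrated with a minimal sketch. It is not the analysis code used in the study: it assumes ordinary least squares, replaces the spline basis with simple lagged event indicators, uses contiguous cross-validation folds, and runs on a synthetic trace built to be event-driven. All function and variable names (`fit_ols`, `cv_r2`, `kernel`, and so on) are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y, ridge=1e-8):
    """Least-squares weights via the normal equations (tiny ridge for stability)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def r_squared(X, y, w):
    """Fraction of variance in y explained by the linear prediction X w."""
    pred = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def cv_r2(X, y, k=3):
    """Mean held-out R^2 over k contiguous folds (fold layout is an assumption)."""
    n = len(y)
    scores = []
    for fold in range(k):
        lo, hi = fold * n // k, (fold + 1) * n // k
        w = fit_ols(X[:lo] + X[hi:], y[:lo] + y[hi:])
        scores.append(r_squared(X[lo:hi], y[lo:hi], w))
    return sum(scores) / k

# Synthetic, event-driven "GCaMP" trace; speed is unrelated to the signal.
random.seed(0)
T = 600
event_times = set(range(0, T, 50))
kernel = [1.0, 0.8, 0.5, 0.3, 0.1]  # assumed response at lags 0..4 after an event
speed = [random.random() for _ in range(T)]
signal = [sum(a for lag, a in enumerate(kernel) if (t - lag) in event_times)
          + random.gauss(0.0, 0.05) for t in range(T)]

# Event predictors: one lagged indicator per kernel lag (stand-in for a spline basis).
event_cols = [[1.0 if (t - lag) in event_times else 0.0 for lag in range(len(kernel))]
              for t in range(T)]
# Speed predictors: first-, second-, and third-degree polynomials of speed.
speed_cols = [[s, s ** 2, s ** 3] for s in speed]

X_speed = [[1.0] + sc for sc in speed_cols]
X_event = [[1.0] + ec for ec in event_cols]
X_full = [[1.0] + ec + sc for ec, sc in zip(event_cols, speed_cols)]

r2_speed = cv_r2(X_speed, signal)
r2_event = cv_r2(X_event, signal)
r2_full = cv_r2(X_full, signal)
print(f"speed: {r2_speed:.2f}  event: {r2_event:.2f}  full: {r2_full:.2f}")
```

On this synthetic trace the event model should explain most of the held-out variance while the speed model explains almost none, which is the pattern described for VTA-DA, SNc-DA, and NB-ChAT. A population such as MS-ChAT would correspond to a trace in which the speed terms carry comparable predictive power.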

Figure 2. VTA-DA, SNc-DA, and NB-ChAT, but Not MS-ChAT, Better Encode Task Events Than Speed.

(A) Schematic of the full encoding model, in which GCaMP at each time point was predicted based on both task events and the rat’s speed. For each task event, a set of 10 predictors was created by convolving that task event’s timestamps with a spline basis set, in order to allow temporally delayed versions of each event to predict GCaMP fluorescence. Speed predictors include first-, second-, and third-degree polynomials of the animal’s speed at each point in time.

(B) Three encoding models (x axis) were generated and compared on held-out data: (1) a model with only speed predictors, (2) a model with only task event predictors, and (3) the full model with both task event and speed predictors. In the VTA-DA, SNc-DA, and NB-ChAT populations, the task events model performed better than the speed model, while in the MS-ChAT population, the speed model and the task events model were comparable (one-way ANOVA, R² explained by each encoding model: p < 0.001 for VTA-DA, p < 0.001 for SNc-DA, p < 0.001 for NB-ChAT, p < 0.001 for MS-ChAT; post hoc pairwise t test comparing difference between speed model and task events model, with Bonferroni correction: p < 0.001 for VTA-DA, p < 0.001 for SNc-DA, p < 0.001 for NB-ChAT, p = 0.86 for MS-ChAT; n = 10 VTA-DA, 13 SNc-DA, 18 NB-ChAT, 8 MS-ChAT sites). Bars and error bars indicate mean ± SEM across recording sites. Each dot represents a recording site. R² for each recording site was obtained by averaging over 3-fold cross-validations.

See also Figure S2.

VTA-DA, SNc-DA, and NB-ChAT Neurons Have Elevated Activity during the Sample, Choice, and Outcome Periods, but Not during the Delay Period

Given that task events were good predictors of the variance of GCaMP fluorescence in VTA-DA, SNc-DA, and NB-ChAT neurons, we further examined how neural activity correlated with each event in those populations by time-locking the GCaMP signal to each event (Figures 3A–3I; Figures S3A–S3I). We observed some commonalities in the activity profiles across these task-encoding populations. For example, transient elevation of GCaMP fluorescence in relation to task events was evident across the sample, choice, and reward periods in all three populations (Figures 3J, 3L, and 3N; one-way ANOVA, average GCaMP explained by sample, delay, choice, or outcome epoch; p < 0.001 for VTA-DA, p < 0.003 for SNc-DA, p = 0.002 for NB-ChAT; n = 10 sites for VTA-DA, n = 13 for SNc-DA, n = 18 for NB-ChAT).
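Time-locking a Z-scored trace to task events can be sketched as follows. This is an illustrative simplification, not the study's pipeline: it assumes the Z-score is computed over the whole trace and that events are given as sample indices. The names `zscore`, `event_locked_average`, `pre`, and `post` are hypothetical.

```python
from statistics import mean, pstdev

def zscore(trace):
    """Z-score a trace using its own mean and (population) standard deviation."""
    mu, sd = mean(trace), pstdev(trace)
    return [(x - mu) / sd for x in trace]

def event_locked_average(trace, event_idx, pre, post):
    """Average the Z-scored trace in windows [event - pre, event + post).

    Events whose window would run past either end of the trace are skipped.
    """
    z = zscore(trace)
    windows = [z[i - pre:i + post] for i in event_idx
               if i - pre >= 0 and i + post <= len(z)]
    return [mean(column) for column in zip(*windows)]

# Toy trace: a brief transient follows each of three events.
events = [20, 60, 100]
trace = [0.0] * 120
for e in events:
    trace[e] = 1.0
    trace[e + 1] = 0.5

avg = event_locked_average(trace, events, pre=5, post=10)
peak_offset = avg.index(max(avg)) - 5  # samples relative to the event
print(len(avg), peak_offset)
```

In this toy example the averaged response peaks at the event itself (offset 0), because the transient was placed there by construction.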

Figure 3. During the Delay Period, GCaMP Fluorescence in VTA-DA and NB-ChAT Relates to Performance with an Inverted-U, Despite Relatively Low Fluorescence at That Time.

(A) Schematic of fiber photometry recordings from VTA-DA.

(B) Z-scored GCaMP fluorescence from VTA-DA recordings, time-locked to each task event during the sample, delay, and choice periods (n = 10 sites). Data from all 10 s delay trials.

(C) Z-scored GCaMP fluorescence from VTA-DA recordings during the outcome period, separated by rewarded and unrewarded trials (n = 10 sites). Data from all 10 s delay trials.

(D–F) Same as in (A)–(C) but for SNc-DA recordings (n = 13 sites).

(G–I) Same as (A)–(C) but for NB-ChAT recordings (n = 18 sites).

(J) Average Z-scored GCaMP fluorescence during the sample, delay, choice, and outcome periods from VTA-DA recordings (10 s delay trials, n = 10 sites). Average GCaMP activity was significantly lower in the delay period than in the sample, choice, or outcome periods (post hoc pairwise t test with Bonferroni correction; p < 0.001 between delay and sample, p < 0.001 between delay and choice, p = 0.001 between delay and outcome, p < 0.001 between sample and choice; n = 10 sites).

(K) Left: accuracy relative to delay period fluorescence in VTA-DA. To relate delay period fluorescence to accuracy, we ranked all trials according to their average delay period fluorescence for each recording site (n = 10 sites) and delay period duration (n = 3 delay durations). For trials in each quintile for each delay period duration, we plotted the average accuracy versus the fluorescence quintile, averaging across delay period duration and then recording sites. Note that calculating GCaMP fluorescence quintiles separately for each delay accounted for delay-dependent differences in fluorescence and allowed visualization of the delay-independent relationship between fluorescence and accuracy. Right: accuracy relative to delay period fluorescence predicted from the model fit to the data on the left (mixed-effect linear regression, accuracy predicted based on first- and second-degree polynomial of delay period fluorescence quintile, delay, and random effect of individual recording site; p = 0.005 for the second-degree polynomial; n = 10 sites). See Figure S6 for additional statistical analyses of the inverted-U relationship.

(L) Same as (J) but for SNc-DA recordings (n = 13 sites). Average GCaMP fluorescence was significantly lower in the delay period than in the choice period, but not different from the sample or outcome periods (post hoc pairwise t test with Bonferroni correction; p = 1.00 between delay and sample, p < 0.001 between delay and choice, p = 0.28 between delay and outcome, p = 0.002 between sample and choice, p < 0.05 between sample and outcome; n = 13 SNc-DA sites).

(M) Same as (K) but for SNc-DA recording sites (mixed-effect linear regression, accuracy predicted based on first- and second-degree polynomial of delay period fluorescence quintile, delay, and random effect of individual recording site; p = 0.45 for the second-degree polynomial, n = 13 sites).

(N) Same as (J) but for NB-ChAT recordings (n = 18 sites). Average GCaMP activity was significantly lower in the delay period than in the choice and outcome periods, but not different from the sample period (post hoc pairwise t test with Bonferroni correction; p = 1.00 between delay and sample, p = 0.03 between delay and choice, p = 0.002 between delay and outcome, p = 0.09 between sample and choice, p = 0.003 between sample and outcome; n = 18 NB-ChAT sites).

(O) Same as (K) but in NB-ChAT recording sites (mixed-effect linear regression, accuracy predicted based on first- and second-degree polynomial of delay period fluorescence quintile, delay duration, and random effect of individual recording site; p = 0.008 for the second-degree polynomial, n = 18 sites). All error bars and shaded regions denote SEM.

See also Figures S3, S4, S5, and S6.

We did not observe elevation of GCaMP fluorescence during the delay period in these populations. In VTA-DA, fluorescence during the delay period was significantly lower than during the sample or choice periods (Figure 3J; post hoc pairwise t test with Bonferroni correction; p < 0.001 between delay and sample, p < 0.001 between delay and choice; n = 10 VTA-DA sites). In SNc-DA and NB-ChAT recordings, the delay period fluorescence was not significantly different from that of the sample period but was significantly lower than that of the choice period (Figures 3L and 3N; post hoc pairwise t test with Bonferroni correction; p = 1.00 for SNc-DA, p = 1.00 for NB-ChAT between delay and sample; p < 0.001 for SNc-DA, p = 0.03 for NB-ChAT between delay and choice; n = 13 SNc-DA, 18 NB-ChAT sites). Finally, in both VTA-DA and SNc-DA populations, activity was higher during the choice period than during the sample period (Figures 3J and 3L; post hoc pairwise t test with Bonferroni correction; p < 0.001 for VTA-DA, p = 0.002 for SNc-DA between sample and choice; n = 10 VTA-DA, 13 SNc-DA sites). The higher activity during the choice period can be interpreted as modulation by a temporally discounted reward expectation function (Fiorillo et al., 2008; Kobayashi and Schultz, 2008; Mazur, 1987; Richards et al., 1997; Roesch et al., 2007; Starkweather et al., 2017) and is therefore consistent with reward prediction error. This is because the sample and the choice periods involved the same stimulus and action, but the choice period was more proximal to the reward period than the sample period was. Choice period activity was not correlated with delay duration, after accounting for the baseline offset at the end of the delay (Figures S3M and S3N).

By comparison, in the NB-ChAT population, choice period activity was not significantly higher than that of the sample period (Figure 3N; post hoc pairwise t test with Bonferroni correction; p = 0.09 between sample and choice; n = 18 NB-ChAT sites). Another distinction between the NB-ChAT and the VTA-DA populations was that NB-ChAT responded preferentially to the lever press action, while VTA-DA responded to the lever presentation (SNc-DA had mixed selectivity; Figures S3O–S3Q).

As a control, we recorded from animals in which GFP and not GCaMP was expressed. In that case, we did not observe a similar pattern of modulation of fluorescence relative to task events (Figures S3J–S3L).

Finally, we examined the spatial distribution of response profiles within each region based on anatomical reconstruction of recording fiber placement. Pairwise correlations between recording sites revealed that GCaMP recordings were highly homogeneous in VTA-DA and MS-ChAT populations and heterogeneous in SNc-DA and NB-ChAT populations (Figures S4A and S4B). Interestingly, NB-ChAT responses were spatially organized along the medio-lateral axis (Figures S4C–S4E). Furthermore, the medial and lateral subregions of NB-ChAT received topographic input from the medial and lateral subregions of the striatum, respectively (Figure S5).

In summary, neural correlates in VTA-DA neurons during this task were consistent with the gating theory (Figure 1B) in that there was elevated activity during the sample period and suppressed activity during the delay period. Activity was further elevated during the choice period, which can be considered consistent with reward prediction error and therefore with the gating theory.

During the Delay Period, VTA-DA and NB-ChAT Activity Relates to Performance with an Inverted-U Relationship

Although delay period activity was relatively low in VTA-DA and NB-ChAT, we found an interesting relationship between activity during the delay period and task accuracy in both populations. Specifically, task accuracy as a function of the trial-averaged delay period fluorescence followed an inverted-U relationship (Figures 3K, 3M, and 3O). Statistically, this was confirmed with a mixed-effect linear regression in which accuracy was predicted based on the first- and second-degree polynomial of delay period fluorescence quintile (as well as delay period duration and a random effect of individual recording site). For both VTA-DA and NB-ChAT, but not SNc-DA, the second-degree polynomial of delay period fluorescence was statistically significant, indicative of an inverted-U shape (p = 0.005 for VTA-DA, p = 0.008 for NB-ChAT, p = 0.45 for SNc-DA for second-degree polynomial of delay period fluorescence quintile; n = 10 VTA-DA, 13 SNc-DA, 18 NB-ChAT sites). Additionally, the Sasabuchi-Lind-Mehlum tests for inverted-U further validated the inverted-U relationship between accuracy and delay period fluorescence in VTA-DA and NB-ChAT populations (Figure S6A; for detail on tests, see STAR Methods, Inverted-U quantification). In contrast to the delay period, fluorescence was not related to accuracy with an inverted-U according to the same sets of tests in any of these regions during the sample and choice periods.
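The quintile analysis can be illustrated with a simplified sketch for a single recording site and a single delay duration. It is not the mixed-effect regression reported here: it fits an ordinary least-squares quadratic to the five quintile accuracies and reads off the sign of the second-degree coefficient. The data are synthetic, and the names `quintile_accuracy` and `quadratic_coefficient` are hypothetical.

```python
def quintile_accuracy(fluorescence, correct):
    """Rank trials by delay-period fluorescence and return accuracy per quintile."""
    order = sorted(range(len(fluorescence)), key=lambda i: fluorescence[i])
    n = len(order)
    accuracy = []
    for q in range(5):
        idx = order[q * n // 5:(q + 1) * n // 5]
        accuracy.append(sum(correct[i] for i in idx) / len(idx))
    return accuracy

def quadratic_coefficient(acc):
    """Second-degree coefficient of the least-squares fit acc ~ b0 + b1*q + b2*q^2.

    For five equally spaced quintiles, b2 equals the projection of acc onto the
    orthogonal quadratic contrast [2, -1, -2, -1, 2] (sum of squares = 14).
    """
    contrast = [2, -1, -2, -1, 2]
    return sum(c * a for c, a in zip(contrast, acc)) / 14.0

# Synthetic session: 100 trials, accuracy highest at intermediate fluorescence.
hits_per_quintile = [12, 16, 18, 16, 12]  # correct trials out of 20 per quintile
fluorescence, correct = [], []
for q, hits in enumerate(hits_per_quintile):
    for j in range(20):
        fluorescence.append(q * 20 + j)
        correct.append(1 if j < hits else 0)
# Reverse the trial order so that the ranking step is actually exercised.
fluorescence.reverse()
correct.reverse()

acc = quintile_accuracy(fluorescence, correct)
b2 = quadratic_coefficient(acc)
print(acc, b2)
```

A negative `b2` indicates a concave relationship. A negative quadratic term alone does not establish that accuracy rises and then falls within the observed range, which is why a dedicated inverted-U test (such as the Sasabuchi-Lind-Mehlum test used here) is applied in addition.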

To control for the possibility that the rat’s position during the delay period could contribute to the inverted-U relationship between fluorescence and accuracy, we repeated the same analysis using the subset of the delay period data during which the animal’s head was near the nosepoke (Figures S6B–S6D). The significant inverted-U relationship between the delay period GCaMP fluorescence and accuracy in VTA-DA and NB-ChAT populations was maintained in this subset of the data (mixed-effect linear regression in which accuracy was predicted based on the first- and second-degree polynomial of delay period fluorescence quintile, delay period duration, and a random effect of individual recording site; p = 0.003 for VTA-DA, p = 0.09 for SNc-DA, p = 0.002 for NB-ChAT).

Thus, we observed a neural correlate of the inverted-U relationship between neuromodulation and short-term memory performance, specifically during the delay period.

Optogenetic Inhibition of VTA-DA Neurons Selectively Impairs Short-Term Memory, While Inhibition of SNc-DA, NB-ChAT, and MS-ChAT Neurons Does Not

To determine whether the activity we measured in neuromodulatory populations contributes causally to task performance, we optogenetically inhibited each population throughout a trial, on a subset of trials. To this end, we injected Cre-dependent NpHR into the VTA or SNc of TH::Cre rats and implanted bilateral fibers above the injection site (Figures 4A and 4B; Figures S7 and S8).

Figure 4. Optogenetic Inhibition of VTA-DA, but Not SNc-DA, Selectively Impairs Short-Term Memory.

(A) Schematic of VTA-DA and SNc-DA targeting strategy using TH::Cre rats and Cre-dependent AAV2/5 DIO-NpHR-YFP (or DIO-YFP for the control group) virus injected into the VTA (top) or SNc (bottom), respectively.

(B) Example histology from the VTA (top) and SNc (bottom), showing the co-localization of TH (red) and NpHR (green). Scale bars: 50 μm (top) and 75 μm (bottom).

(C) Schematic of experimental design for the entire-trial inhibition experiment. Continuous illumination (532 nm, ~5–6 mW light power) was presented throughout the entirety of a trial, on a randomly selected 20% of trials, every other day.

(D) Inhibition of VTA-DA neurons during the memory-guided DNMTP task impaired accuracy (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p < 0.001 for light, n = 13 NpHR rats).

(E) In YFP control animals, the effect of light was not significant (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p = 0.23 for light; n = 7 YFP rats), and there was a significant light × group interaction (D and E combined: mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of group, light, delay duration, and random effect of individual rat; p < 0.001 for light × group, n = 13 NpHR + 7 YFP rats).

(F) In the VTA-DA NpHR group, virus expression level as measured by fluorescence intensity correlates with the behavioral effect size as measured by change in accuracy between light-on and light-off trials for each rat (y axis, two-way ANOVA, light impairment in accuracy explained by opsin expression level and delay duration; p = 0.002 for opsin expression, p = 0.07 for delay duration; R² = 0.32; n = 13 NpHR rats).

(G) Optogenetic inhibition of VTA-DA neurons during the cue-guided DNMTP task. Accuracy (y axis) was not impaired in light-on trials (green bar) compared with light-off trials (gray bar; mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p = 0.14 for light; n = 7 NpHR rats).

(H–K) Same as (D)–(G) but in the SNc-DA group. Unlike VTA-DA, the accuracy was impaired in both the memory-guided (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p < 0.001 for light; n = 12 NpHR rats) and cue-guided DNMTP task (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p < 0.001 for light; n = 6 NpHR rats). All error bars denote SEM.

See also Figures S7 and S8.

Full-trial inhibition of DA neurons in the VTA significantly impaired rats’ performance in the memory-guided DNMTP task (Figures 4C and 4D; mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 13 NpHR rats). The level of opsin expression (as assessed by fluorescence intensity in histology) was significantly correlated with the optogenetic impairment (Figure 4F; two-way ANOVA, light impairment explained by fluorescence level and delay; p = 0.002 for fluorescence, p = 0.07 for delay; R² = 0.32; n = 13 NpHR rats). There was no significant light-induced impairment in the YFP control animals (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.23 for light; n = 7 YFP rats; Figure 4E; Figure S7B), and there was a significant light × group interaction between the NpHR and YFP groups (Figures 4D and 4E, mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, NpHR/YFP opsin group, and random effect of individual rat; p < 0.001 for light × group; n = 13 NpHR, 7 YFP rats). In contrast to the effect on accuracy, choice omission rate did not show significant light-induced change (Figure S7E; mixed-effect logistic regression, choice omission/completion predicted based on fixed effects of light, delay, NpHR/YFP opsin group, memory-guided/cue-guided task type, and random effect of individual rat; p = 0.23 for light; n = 13 rats for memory-guided NpHR, n = 7 rats for memory-guided YFP, n = 7 rats for cue-guided NpHR).
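The structure of these regressions, in which a correct or incorrect choice is predicted from light and delay, can be sketched with a plain logistic regression. This is a deliberately reduced illustration: it omits the random effect of individual rat, so it is a fixed-effects-only model rather than the mixed-effect model reported here, and it is fit by Newton's method on invented trial counts. The names `fit_logistic` and `cells` are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(cells, n_iter=25):
    """Fit P(correct) = sigmoid(b0 + b . x) by Newton's method.

    cells: list of (predictors, n_correct, n_total) for each condition.
    Returns [intercept, coefficient for each predictor].
    """
    p = len(cells[0][0]) + 1
    w = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]  # negative Hessian of the log-likelihood
        for x, k, n in cells:
            row = [1.0] + list(x)
            eta = sum(wi * xi for wi, xi in zip(w, row))
            prob = 1.0 / (1.0 + math.exp(-eta))
            for i in range(p):
                grad[i] += row[i] * (k - n * prob)
                for j in range(p):
                    info[i][j] += row[i] * row[j] * n * prob * (1.0 - prob)
        step = solve(info, grad)
        w = [wi + si for wi, si in zip(w, step)]
    return w

# Invented counts: (light on/off, delay in s), correct trials, total trials.
cells = [
    ((0, 1), 90, 100), ((0, 5), 80, 100), ((0, 10), 70, 100),
    ((1, 1), 80, 100), ((1, 5), 68, 100), ((1, 10), 56, 100),
]
intercept, b_light, b_delay = fit_logistic(cells)
print(f"light: {b_light:.2f}  delay: {b_delay:.2f}")
```

With these invented counts, both coefficients come out negative, corresponding to lower accuracy on light-on trials and at longer delays. A full analysis would add a per-rat random effect, for which a dedicated mixed-model package is the appropriate tool.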

To determine whether the impairment induced by VTA-DA inhibition was specifically attributable to the short-term memory component of the task, we compared performance to a control variant of the task in which rats did not have to use short-term memory (Figure 4G). In the cue-guided task, the motor requirements were identical, but a light cue directly above the correct choice lever was illuminated during the choice period to signal which lever was correct. Optogenetic inhibition of DA neurons in VTA did not affect performance in the cue-guided task (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.14 for light; n = 7 NpHR rats; Figure 4G; Figure S7C). Thus, the effect of optogenetic inhibition of VTA-DA appeared to be dependent on the task having a short-term memory component.

Next, we investigated whether SNc-DA neurons also contributed causally to short-term memory. We performed an identical set of inhibition experiments in the SNc as we had in the VTA. Optogenetic inhibition of SNc-DA neurons impaired accuracy in the memory-guided task (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 12 NpHR rats; Figure 4H; Figure S8A). This effect was not present in control rats expressing YFP in SNc-DA (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.12 for light; n = 7 YFP rats; Figure 4I; Figure S8B). However, unlike VTA-DA, there was no significant interaction between light on/off and NpHR/YFP group (Figures 4H and 4I; mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, NpHR/YFP opsin group, and random effect of individual rat; p = 0.27 for light × group; n = 12 NpHR, 7 YFP rats), and the level of opsin expression did not correlate with the optogenetic impairment (Figure 4J; two-way ANOVA, light impairment explained by fluorescence level and delay; p = 0.99 for fluorescence, p = 0.69 for delay; R² = 0.02; n = 12 NpHR rats).

Moreover, unlike VTA-DA neurons, optogenetic inhibition of SNc-DA neurons during the control cue-guided task significantly impaired accuracy (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 6 NpHR rats; Figure 4K; Figure S8C). The presence of a light effect in both the memory-guided and cue-guided tasks suggests that the effect of SNc-DA inhibition is different from that of VTA-DA and cannot be specifically attributed to a short-term memory deficit.

We next asked whether ascending ChAT neurons in the NB and MS contributed causally to short-term memory (Figure 5). To address this, on a subset of trials, we inhibited ChAT neurons in the MS and NB throughout the trial in ChAT::Cre rats performing the DNMTP task (Figures 5A and 5B; Figure S9). We found that the inhibition of neither ChAT population affected short-term memory performance (Figure 5C, NB-ChAT group, mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.9 for light; n = 5 NpHR rats; Figure 5D, MS-ChAT group, mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.17 for light; n = 3 NpHR rats).

Figure 5. Optogenetic Inhibition of NB-ChAT and MS-ChAT Does Not Impair Short-Term Memory.

(A) Schematic of NB-ChAT and MS-ChAT targeting strategy using ChAT::Cre rats and AAV2/5 DIO-NpHR-YFP virus injected into the NB (top) or MS (bottom).

(B) Example histology from the NB (top) and MS (bottom), showing co-localization of ChAT (red) and NpHR (green). Scale bar, 50 μm.

(C) Optogenetic inhibition of NB-ChAT does not affect performance in the DNMTP task (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat, p = 0.9 for light; n = 5 NpHR rats; continuous illumination, 532 nm, ~5–6 mW light power).

(D) Optogenetic inhibition of MS-ChAT does not affect short-term memory (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration and random effect of individual rat; p = 0.17 for light; n = 3 NpHR rats; continuous illumination, 532 nm, ~5–6 mW light power). All error bars denote SEM.

See also Figure S9.

Taken together, these results suggest that the causal contribution of VTA-DA to short-term memory is unique relative to the other neuromodulatory populations that we examined.

Optogenetic Inhibition of VTA-DA Neurons during the Delay Period Impairs Short-Term Memory, Despite the Suppressed Activity during That Time

After determining that VTA-DA neurons contribute to short-term memory, we next asked when they do so: during the sample, delay, or choice period of the task. To address this, on a subset of trials, and in a randomly interleaved manner, we inhibited VTA-DA neurons during one of the three epochs (Figure 6A; Figure S7A).

Figure 6. Optogenetic Inhibition of VTA-DA during the Delay Produces Impairments in Short-Term Memory, Despite the Suppressed Activity during That Time.

(A) Schematic of experimental design for sub-trial inhibition. Continuous light was presented during either the sample, delay, or choice period of a trial in an interleaved manner, with each manipulation occurring on 10% of trials (532 nm, ~5–6 mW light power).

(B–D) Accuracy for light-on (green bar) versus light-off (gray bar) trials for the sample (B), delay (C), and choice (D) period. Mixed-effect logistic regression (n = 13 NpHR rats) to predict correct/incorrect choice based on fixed effects of light (light off, light during sample, light during delay, light during choice), delay duration, and random effect of individual rat reveals a significant effect of light during sample (p = 0.002) and delay (p < 0.001), but not choice (p = 0.3). All error bars denote SEM.

See also Figure S7.
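The interleaved sub-trial design can be sketched in a few lines of Python. This is only an illustration: the condition labels are hypothetical, the 70% light-off share is inferred by assuming no other manipulation was delivered in the same sessions, and drawing each trial independently is an assumption (the source says only that manipulations were randomly interleaved, each on 10% of trials).

```python
import random

# Hypothetical condition labels. The 10% per-epoch rate is from the text;
# the 70% light-off rate assumes no other manipulations in the session.
CONDITION_PROBS = {
    "light_off": 0.70,
    "light_sample": 0.10,
    "light_delay": 0.10,
    "light_choice": 0.10,
}


def draw_conditions(n_trials, seed=None):
    """Independently draw one light condition per trial (an assumption)."""
    rng = random.Random(seed)
    labels = list(CONDITION_PROBS)
    weights = list(CONDITION_PROBS.values())
    return rng.choices(labels, weights=weights, k=n_trials)


conditions = draw_conditions(1000, seed=0)
# Tally how often each condition was drawn in this simulated session.
counts = {label: conditions.count(label) for label in CONDITION_PROBS}
```

Because each trial is drawn independently, the realized proportions in any one session will scatter around the nominal 10% rather than match it exactly.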

Optogenetic inhibition of VTA-DA neurons during the sample and delay period, but not the choice period, impaired short-term memory (Figures 6B–6D; mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light epoch, delay, and random effect of individual rat; p = 0.002 for light on sample, p < 0.001 for light on delay, p = 0.3 for light on choice; n = 13 NpHR rats). The effect size for delay period inhibition was larger than that for the sample period (regression coefficient for delay: b = −0.45 ± 0.076; for sample: b = −0.25 ± 0.081), which is surprising given the suppressed activity in this population during the delay period (Figures 3B and 3J).
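To give a feel for the size of these coefficients, the sketch below converts them from log-odds into odds ratios and into an accuracy change. It assumes the coefficients are on the log-odds scale of the logistic regression; the 80% baseline accuracy is a hypothetical value chosen for illustration, not a number reported here.

```python
import math

# Regression coefficients (log-odds scale) reported in the text.
B_DELAY = -0.45
B_SAMPLE = -0.25

# Hypothetical light-off accuracy, used only to illustrate the conversion.
BASELINE = 0.80


def odds_ratio(b):
    """Multiplicative change in the odds of a correct choice."""
    return math.exp(b)


def shifted_accuracy(baseline, b):
    """Accuracy after adding b to the log-odds of a baseline accuracy."""
    log_odds = math.log(baseline / (1.0 - baseline)) + b
    return 1.0 / (1.0 + math.exp(-log_odds))


print(round(odds_ratio(B_DELAY), 2))                   # 0.64
print(round(odds_ratio(B_SAMPLE), 2))                  # 0.78
print(round(shifted_accuracy(BASELINE, B_DELAY), 2))   # 0.72
print(round(shifted_accuracy(BASELINE, B_SAMPLE), 2))  # 0.76
```

Under these assumptions, delay-period inhibition would lower an 80% baseline to roughly 72%, and sample-period inhibition to roughly 76%.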

Optogenetic Activation of VTA-DA Neurons during the Delay, but Not the Sample, Period Impairs Short-Term Memory

The inverted-U hypothesis posits that too much or too little DA would impair short-term memory. This would suggest that not only inhibition but also activation of VTA-DA neurons would impair short-term memory. On the other hand, the gating theory would suggest that more DA during the sample period could enhance short-term memory.

Thus, we next injected an AAV2/5 expressing Cre-dependent ChR2 into the VTA of TH::Cre rats (Figures 7A and 7B; Figure S7D). We briefly activated VTA-DA neurons at the time of the sample presentation to simulate the phasic response observed with fiber photometry (Figure 7C; 5 ms pulse duration, 5 pulses of 20 Hz stimulation, ~13–15 mW). Optogenetic activation did not improve short-term memory, which was not consistent with predictions from the gating theory (Figure 7D, mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p = 0.13 for light; n = 9 ChR2 rats).

Figure 7. Optogenetic Activation of VTA-DA Neurons during the Delay, but Not Sample, Impairs Short-Term Memory.

(A) Schematic of VTA-DA targeting strategy using TH::Cre rats and AAV2/5 DIO-ChR2-YFP virus.

(B) Histology of ChR2 expression in VTA-DA neurons. Scale bar, 50 μm.

(C) Schematic of experimental design for optogenetic activation at sample lever presentation. VTA-DA was activated at sample lever presentation (5 pulses at 20 Hz, 5 ms pulse duration, 447 nm, ~13–15 mW light power).

(D) Performance in DNMTP task for light-on (blue bar) versus light-off (gray bar) trials for VTA-DA activation using the protocol described in (C). VTA-DA activation at sample presentation did not modulate accuracy (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p = 0.13 for light; n = 9 ChR2 rats).

(E) Schematic of experimental design for burst activation of VTA-DA during the delay period (5 pulses at 20 Hz per burst, 1 burst/s, 5 ms pulse duration, 447 nm, ~13–15 mW light power).

(F) Performance in the DNMTP task for light-on (blue bar) versus light-off (gray bar) trials for VTA-DA activation during the delay period using the protocol described in (E). VTA-DA activation in bursts during the delay period significantly impaired accuracy (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay duration, and random effect of individual rat; p < 0.001 for light; n = 10 ChR2 rats).

(G) Schematic of experimental design for tonic optogenetic activation of VTA-DA during the delay period (1 pulse/s, 5 ms pulse duration, 447 nm, ~13–15 mW light power).

(H) Performance in the DNMTP task for light-on (blue bar) versus light-off (gray bar) trials for VTA-DA activation during the delay period using the protocol described in (G). VTA-DA tonic activation during the delay period significantly impaired accuracy (mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 10 ChR2 rats). All error bars denote SEM.

See also Figure S7.

Next, we activated VTA-DA neurons during the delay period, which is when we observed the most impairment of short-term memory from the optogenetic inhibition (Figure 7E; 5 pulses at 20 Hz per burst, 1 burst/s, 5 ms pulse duration, ~13–15 mW light power). We found that the optogenetic activation of VTA-DA neurons resulted in significant impairment of task performance, similar to our results from inhibition of this population (Figure 7F, mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 10 ChR2 rats, Figure S7D). In fact, even mild optogenetic activation (1 pulse/s, 5 ms pulse duration) of VTA-DA neurons during the delay period resulted in a significant impairment in performance (Figures 7G and 7H; mixed-effect logistic regression, correct/incorrect choice predicted based on fixed effects of light, delay, and random effect of individual rat; p < 0.001 for light; n = 10 ChR2 rats).
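The two delay-period stimulation protocols differ only in how many pulses are delivered each second, which a short sketch makes concrete. The function names are hypothetical. The alignment of the first pulse to delay onset and the restriction to whole seconds of delay are both assumptions; the pulse counts, rates, and 5 ms pulse width come from the text.

```python
PULSE_WIDTH_S = 0.005  # 5 ms pulse duration, from the text


def burst_onsets(delay_s, pulses_per_burst=5, pulse_hz=20.0, bursts_per_s=1.0):
    """Pulse onset times (s from delay onset) for burst stimulation.

    Assumes the first burst starts at delay onset and that only whole
    bursts are delivered.
    """
    onsets = []
    n_bursts = int(delay_s * bursts_per_s)
    for b in range(n_bursts):
        burst_start = b / bursts_per_s
        for p in range(pulses_per_burst):
            onsets.append(round(burst_start + p / pulse_hz, 6))
    return onsets


def tonic_onsets(delay_s, pulses_per_s=1.0):
    """Tonic protocol: a 'burst' of one pulse, delivered once per second."""
    return burst_onsets(delay_s, pulses_per_burst=1, bursts_per_s=pulses_per_s)


def light_on_fraction(onsets, delay_s):
    """Fraction of the delay period during which the laser is on."""
    return len(onsets) * PULSE_WIDTH_S / delay_s
```

For a 5 s delay, this gives 25 pulses (light on for 2.5% of the delay) in the burst protocol and 5 pulses (0.5%) in the tonic protocol.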

DISCUSSION

In summary, we found that VTA-DA neurons, and not the other three neuromodulatory populations we examined (SNc-DA, NB-ChAT, and MS-ChAT), contribute to short-term memory. Specifically, VTA-DA neurons do so preferentially during the delay period, despite the low activity in that population during that time. Both neural correlates and neural manipulations of VTA-DA neurons are consistent with an inverted-U relationship between population-level activity and accuracy.

Support for an Inverted-U Relationship between DA and Short-Term Memory Maintenance

Midbrain DA neurons are known to respond to reward-predicting cues and unexpected rewards; in other words, they encode errors in the prediction of reward (Bayer and Glimcher, 2005; Cohen et al., 2012; Roesch et al., 2007; Schultz, 1986, 1998; Schultz et al., 1997). In addition, DA has been implicated in short-term memory, primarily through pharmacological manipulations in monkeys (Arnsten et al., 1994; Cai and Arnsten, 1997; Sawaguchi and Goldman-Rakic, 1991; Williams and Goldman-Rakic, 1995). However, it has been unclear whether and how to integrate these literatures. In particular, pharmacological experiments had suggested that DA is most important to short-term memory during the delay period (Figure 1A; Vijayraghavan et al., 2007; Williams and Goldman-Rakic, 1995). Because there are usually no reward-predicting cues or rewards during the delay period, it is not obvious whether and why there would be DA activity at that time.

Thus, to reconcile the role of DA in reinforcement learning with one in short-term memory, it was proposed that DA contributes to the updating of short-term memory with new information (Figure 1B; Braver and Cohen, 1999, 2000; O’Reilly and Frank, 2006), which should occur at the time of reward-predicting stimuli, rather than to the maintenance of short-term memory during the delay period, which was the original hypothesis from pharmacological experiments. Because pharmacology is too slow to distinguish between a role in updating versus maintaining short-term memory, these hypotheses have remained untested. Thus, a major goal of this study was to directly measure and manipulate DA neuron activity during a short-term memory task with a distinct “sample period” in which short-term memory is updated as well as a “delay period” in which short-term memory is maintained, to determine which aspect of memory DA supports.

Our recordings revealed activity in DA neurons that was consistent with reward prediction error and therefore the “gating theory.” We observed relatively low activity in VTA-DA neurons during the delay period and elevated activity during the sample, choice, and outcome periods. This is consistent with VTA-DA responses primarily being explained by reward prediction error: reward-predicting cues appear during the sample and choice periods (the lever presentation), and reward occurs during the outcome period. Furthermore, the choice lever presentation elicited higher activity than the sample lever presentation, which is also consistent with a reward prediction error, assuming a temporally discounted reward expectation function (Fiorillo et al., 2008; Kobayashi and Schultz, 2008; Mazur, 1987; Richards et al., 1997; Roesch et al., 2007; Starkweather et al., 2017). Similar to VTA-DA, SNc-DA neurons also did not have elevated activity during the delay period, although the activity was not as low as VTA-DA.

Based on these neural correlates, we expected that DA might be causally involved in the sample period of short-term memory, consistent with the gating theory. In fact, we did observe a mild impairment in short-term memory as a result of inhibiting during the sample period, providing some causal support for that hypothesis. However, this effect was relatively small, and we observed no effect of activation during this period.

In addition, our recordings provided insight into the relationship between endogenous VTA-DA activity and short-term memory performance: we observed an “inverted-U” relationship between the delay period activity and performance. This correlational evidence provides a different form of support of classic ideas that had emerged from pharmacological manipulations, which had artificially manipulated receptor activation but provided no insight into the natural activity patterns.

Furthermore, bi-directional optogenetic manipulations revealed that the delay period was most relevant to short-term memory, as inhibition or activation led to relatively large impairments in performance, despite the low activity at that time. Thus, our manipulation of cell bodies very much resembled the dose-dependent “inverted-U” effects of D1 receptor agonist treatment in monkey PFC during spatial short-term memory (Cai and Arnsten, 1997; Murphy et al., 1996; Sawaguchi and Goldman-Rakic, 1991; Vijayraghavan et al., 2007; Williams and Goldman-Rakic, 1995; Zahrt et al., 1997). These findings highlight a dissociation between when DA neurons are most active (sample/choice period) and when their activity most affects short-term memory (delay period) and reveal new correlational and causal support that are consistent with classic ideas of an “inverted-U” relationship between DA and cognition.

Previous Work Measuring DA Neural Activity or DA Efflux during Short-Term Memory

Matsumoto and Takada (2013) recorded from VTA-DA and SNc-DA in non-human primates in a short-term memory task and found that SNc-DA responded to the sample stimulus only when the subject was required to store it in short-term memory. From these neural correlates, they concluded that only SNc-DA, but not VTA-DA, activity reflects short-term memory demand. However, they did not manipulate neural activity in these populations to assess causality, and in fact our finding that VTA-DA and not SNc-DA contributes to short-term memory, and does so preferentially during the delay period, provides another potential interpretation of their results. Specifically, our results suggest that the SNc-DA response to the sample stimulus observed in their study may only be correlational and not causal to short-term memory. An important caveat in connecting our study with this previous work is that that study was performed in primates, and there may be species differences in the contribution of VTA-DA and SNc-DA to short-term memory (Williams and Goldman-Rakic, 1998; Beckstead et al., 1979; Fallon and Moore, 1978).

Although we compared effects of VTA-DA and SNc-DA inhibition in a short-term memory task and a cue-guided task (Figure 4), we did not directly compare neural correlates in the DA system of these two tasks. Watanabe et al. (1997) used in vivo microdialysis to demonstrate an increase in DA in the principal sulcus in primates after performance of a short-term memory task, but not a cue-guided task. Whether such differences exist in the fast dynamics of VTA-DA activity within a trial remains to be established.

Another issue left unaddressed in the present study is the potential functional difference in distinct VTA-DA subpopulations. There is increased appreciation that VTA-DA neurons have heterogeneous and non-canonical signals during certain tasks (Bromberg-Martin et al., 2010; Cai et al., 2020; Coddington and Dudman, 2018; Engelhard et al., 2019; Howe and Dombeck, 2016; Howe et al., 2013; Lee et al., 2019; Lerner et al., 2015; Menegas et al., 2018; Parker et al., 2016; da Silva et al., 2018). For example, there might be a population of neurons with elevated activity during the delay period, and it is specifically those neurons that are contributing causally to short-term memory performance, but not the population as a whole.

Distinctions and Similarities in Neural Correlates of Short-Term Memory across DA and ChAT Populations

Aside from clarifying the temporal contribution of DA to short-term memory, another goal of this work was to directly compare the dopaminergic and cholinergic contribution to short-term memory. Perhaps the most prominent difference between the populations was observed in the MS-ChAT neurons, which encoded speed much more than the other populations. Additionally, VTA-DA preferentially responded to lever presentation cue, whereas NB-ChAT preferentially responded to lever press action during the sample and choice periods.

The most striking similarity we observed was across the three task-encoding populations (NB-ChAT, VTA-DA, SNc-DA), all of which had reward responses and elevated activity during the sample and choice periods. This is consistent with previous reports of reward responses not only in DA neurons but also in NB-ChAT neurons (Hangya et al., 2015; Teles-Grilo Ruivo et al., 2017). Another similarity was between the NB-ChAT and VTA-DA population; both had an inverted-U shaped relationship between delay period activity and performance.

SNc-DA, NB-ChAT, and MS-ChAT Neurons Do Not Contribute Causally and Selectively to Short-Term Memory

In contrast to some of the similarities we observed in the neural correlates of the task across neuromodulatory populations, the causal contributions were more distinct. Only VTA-DA neurons contributed selectively to the short-term memory task, as SNc-DA inhibition affected both the short-term memory task and a control task. In addition, NB-ChAT and MS-ChAT populations were not causally involved in short-term memory. To ascertain effective inhibition of NB-ChAT and MS-ChAT neurons, we performed in vitro electrophysiological validation and also examined targeting and expression (Figure S9).

The lack of involvement of the NB-ChAT populations is not aligned with classic lesion studies that used non-specific excitotoxins to lesion NB (i.e., ibotenic acid, quisqualic acid) and reported deficits in a battery of spatial memory tests such as the Morris water maze (Connor et al., 1991; Mandel and Thal, 1988; Mandel et al., 1989) and radial maze (Hodges et al., 1989; Lerer and Warner, 1986; Turner et al., 1992). However, our negative result with the NB-ChAT population is consistent with subsequent and more specific studies with cholinergic neuron-selective neurotoxin IgG-saporin (Baxter and Bucci, 2013; Baxter et al., 1995; Torres et al., 1994; Wenk et al., 1994).

In contrast to NB-ChAT neurons, MS-ChAT neurons have been implicated in certain spatial short-term memory tasks with IgG-saporin (Torres et al., 1994). However, our neural correlate demonstrated that MS-ChAT population primarily encodes the animal’s movement rather than task events, providing little reason to believe that these neurons would be selectively involved in short-term memory. One possibility may be that the septo-hippocampal ChAT pathway is only selectively involved in short-term memory in the case of novel stimuli (Hasselmo and Sarter, 2011; Hasselmo and Stern, 2006), perhaps by contributing to the generation of exploratory behavior.

STAR☆METHODS

RESOURCE AVAILABILITY

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Ilana B. Witten (iwitten@princeton.edu).

Materials availability

This study did not generate new unique reagents.

Data and code availability

The datasets and code supporting the current study are available from the Lead Contact, I.B.W., upon reasonable request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

All experimental procedures were conducted in accordance with the National Institutes of Health guidelines and were approved by the Princeton University Institutional Animal Care and Use Committee.

TH::Cre (Horizon TGRA8400) or ChAT::Cre rats (RRRC 658) were maintained on a Long Evans background (Brown et al., 2013; Liu et al., 2016; Witten et al., 2011). A total of 109 rats (108 male and 1 female rats; 36 rats for fiber photometry, 62 rats for optogenetics, 5 rats for slice physiology, and 6 rats for rabies retrograde tracing) weighing >300 g/rat were used for experimentation. At the time of surgery, rats used for fiber photometry experiments were 19 ± 1.01 weeks old, for optogenetics experiments were 18.0 ± 0.77 weeks old, for slice physiology experiments were 14.34 ± 0.03 weeks old, and for rabies retrograde tracing experiments were 12.05 ± 0.39 weeks old. Rats were double-housed, unless they weighed over 500 g or had health-related concerns (e.g., fighting). Rats were maintained on a 12-hour light on – 12-hour light off schedule. All surgical and behavioral procedures were performed during the light off cycle.

METHOD DETAILS

Delayed non-match to sample task

Rats were water-restricted to 80%–85% of their ad-libitum weight and trained on a delayed non-match to position (DNMTP) spatial short-term memory task in operant chambers (Med-associates; Akhlaghpour et al., 2016; Dunnett et al., 1988). The operant chamber had two retractable levers on the front wall and a nose port on the opposite back wall (Figure 1C). In the DNMTP task, rats were trained to remember the position of the presented sample lever (either right or left) for a delay duration, and report the memory by pressing the “non-match” lever during the choice period. At the beginning of each trial, the sample period was initiated once the sample lever was presented (emerged from the wall) from one of two possible locations: either the right or left position. Upon pressing the sample lever, the lever retracted back into the wall, and the light in the back nose port was illuminated. The delay period started as the rat went to the back wall to poke its nose into the illuminated nose port. The delay period lasted for 1, 5, or 10 s (10, 20, 30 s or 5, 10, 15 s in a subset of experiments shown in Figures 4I and 7) in a randomly interleaved manner, so that the rat did not know when the delay period would end. At the end of the delay period, the nose port lit up again, and the rats then had to make a second nosepoke for both levers to extend from the front wall and begin the choice period. A correct response was to press the lever that did not match the sample lever. A small light in the reward receptacle lit up immediately following the correct lever press, providing feedback on the rat’s choice as well as signaling the presence of the water reward in the receptacle. Following the feedback light, rats entered the reward receptacle and consumed the water reward. The rats were given up to 15 s to press the sample lever and up to 5 s to perform a nosepoke in the illuminated nose port and to press the choice lever. All trials were followed by a 5 s inter-trial interval if the trial was correct, and by an 8 s inter-trial interval if the trial was incorrect or omitted.
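The scoring rule and inter-trial intervals described above can be summarized in a minimal sketch. The function and constant names are hypothetical, and representing an omitted trial as `None` is an assumption made for illustration; the time limits and intervals are taken from the text.

```python
# Timing values from the task description (seconds).
SAMPLE_PRESS_TIMEOUT_S = 15
NOSEPOKE_OR_CHOICE_TIMEOUT_S = 5
ITI_CORRECT_S = 5
ITI_ERROR_OR_OMISSION_S = 8


def correct_lever(sample_lever):
    """Return the rewarded (non-match) lever for a given sample lever."""
    if sample_lever not in ("left", "right"):
        raise ValueError("sample_lever must be 'left' or 'right'")
    return "right" if sample_lever == "left" else "left"


def score_trial(sample_lever, choice_lever):
    """Return (outcome, inter-trial interval in s) for one trial.

    choice_lever is None when the rat failed to respond in time
    (a hypothetical encoding of an omitted trial).
    """
    if choice_lever is None:
        outcome = "omitted"
    elif choice_lever == correct_lever(sample_lever):
        outcome = "correct"
    else:
        outcome = "incorrect"
    iti = ITI_CORRECT_S if outcome == "correct" else ITI_ERROR_OR_OMISSION_S
    return outcome, iti
```

For example, `score_trial("left", "right")` returns `("correct", 5)`, whereas pressing the matching lever or omitting the response yields an 8 s interval.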

In the beginning of training, water-deprived rats learned the behavioral sequence of the task to get a reward. Initially, rats spent 1–2 weeks learning a simpler “nosepoke-nosepoke-lever press” sequence. In the simpler sequence, rats had to make two nosepokes in the back of the chamber, which triggered a random lever to be presented. The pressing of the presented lever led to a drop of water reward. When the rats repeated 100 sequences within an hour of training, they moved onto the more difficult, full sequence, which consisted of “lever press-nosepoke-nosepoke-lever press.” In this stage, the behavioral sequence was the same as the DNMTP task, but the choice period was modified such that rats needed to simply press the presented lever, instead of making an overt choice between the two levers, as only one choice lever emerged from the wall. At the end of the full sequence, rats were rewarded with a drop of water. The rats learned the full sequence in a few days. Then, the rats were finally introduced to the DNMTP task, in which the two nosepokes were separated by a short time delay (1, 2, 3 s), two choice levers were presented, and pressing of the “non-match” to sample lever was rewarded. For the following 3–6 weeks, delays were lengthened (1, 3, 5 s, and then 1, 5, 10 s) and rats learned the “non-match” to sample rule, improving their performance accuracy (> 80%). In total, the rats received 1–2 months of training.

The cue-guided task served as a control task for DNMTP, as it does not require short-term memory. The task structure was the same as DNMTP with only one difference: the rats were “guided” to the correct choice lever with a cue light directly above the correct lever when the choice levers were presented.

Surgery

For all surgical procedures, rats were deeply anesthetized in 4%–5% isoflurane and placed in a stereotactic setup (Kopf Instruments, Tujunga, CA, USA). After the rats were deeply anesthetized, they were maintained on 1%–2% isoflurane throughout the surgery. The rats received baytril (5 mg/kg, i.m.) before surgery and meloxicam (2 mg/kg, s.c.) before and 24 h after surgery. Rats were allowed a 5-day postoperative recovery period.

Fiber photometry experiment

Data in Figures 2 and 3 are from a series of fiber photometry experiments, which consisted of VTA-DA, SNc-DA, NB-ChAT, MS-ChAT (n = 34 rats, 50 recording sites) and control GFP groups (n = 2 rats, 4 recording sites).

For the VTA-DA group (n = 7 rats, 10 recording sites), 1 µL of Cre-dependent GCaMP6f (AAV2/5-CAG-Flex-GCamP6f, Upenn Vector Core, titer: 1.17 × 10¹³ GC/mL or AAV2/5-CAG-DIO-RatOpt-GCaMP6f, PNI Vector Core, titer: 2.30 × 10¹³ GC/mL; Cameron et al., 2019) was injected into the VTA (A/P: −6.0 mm, M/L: ±0.8 mm, D/V: −8.0 mm) of TH::Cre rats.

For the SNc-DA group (n = 8 rats, 13 recording sites), 1 µL of Cre-dependent GCaMP6f (AAV2/5-CAG-Flex-GCamP6f, Upenn Vector Core, titer: 3.90 × 10¹² GC/mL or AAV2/5-CAG-DIO-RatOpt-GCaMP6f, PNI Vector Core, titer: 2.30 × 10¹³ GC/mL) was injected into the SNc (A/P: −5.6 mm, M/L: ±1.7–2.25 mm, D/V: −7.7 – −8.2 mm) of TH::Cre rats.

For the NB-ChAT group (n = 17 rats, 19 recording sites, note that one recording site was removed from the analysis, see “Encoding models” for details), 1 µL of Cre-dependent GCaMP6f (AAV2/5-CAG-Flex-GCamP6f, Upenn Vector Core, titer: 2.34 × 10¹² GC/mL or AAV2/5-CAG-DIO-RatOpt-GCaMP6f, PNI Vector Core, titer: 2.30 × 10¹³ GC/mL) was injected into the NB (A/P: −1.5 mm, M/L: ±2.8 – 3.3 mm, D/V: −7.0 mm) of ChAT::Cre rats.

For the MS-ChAT group (n = 7 rats, 8 recording sites), 0.75 µL of Cre-dependent GCaMP6f (AAV2/5-CAG-Flex-GCamP6f, Upenn Vector Core, titer: 2.34 × 10¹² GC/mL) was injected into the MS of ChAT::Cre rats (A/P: +0.5 mm, M/L: 0 mm, D/V: −7.0 mm, 10° angle).

For the control GFP group (n = 2 rats, 4 recording sites), 0.75 – 1 µL of Cre-dependent GFP virus (AAV2/5-CAG-Flex-eGFP, Upenn Vector Core, titer: 1.81 × 10¹² GC/mL) was injected into the NB (A/P: −1.5 mm, M/L: 3.0 mm, D/V: −7.0 mm) and MS (A/P: +0.5 mm, M/L: 0 mm, D/V: −7.0 mm) of ChAT::Cre rats.

After the virus injection, a fiber optic cannula (400 µm core diameter, low-autofluorescence, MFC_400/430–0.48_10mm_MF2.5_FLT, Doric Lenses) was implanted 0–0.7 mm above the injection site. Note that fiber optic cannula implantation into the MS and VTA, and virus injection into the MS, were at a 10° angle to avoid the superior sagittal sinus.

Eighteen rats contributed two recording sites each (bilaterally or from two different regions), and 18 rats contributed a single recording site each, resulting in a total of 54 recording sites from 36 animals.

Optogenetics experiment

For the optogenetic inhibition experiment, 1 µL of Cre-dependent NpHR (AAV2/5-EF1a-DIO-eNpHR3.0-eYFP, Upenn Vector Core, titer: 1.29 × 10¹³ GC/mL or PNI Vector Core, titer: 1.00 × 10¹⁴ GC/mL) was injected into the SNc and VTA of TH::Cre rats, and NB and MS of ChAT::Cre rats.

For the optogenetic activation experiment, 1 µL of Cre-dependent ChR2 (AAV2/5-EF1a-DIO-ChR2-eYFP, Upenn Vector Core, titer: 7.70 × 10¹² GC/mL or PNI Vector Core, titer: 7.0 × 10¹⁴ GC/mL) was injected into the SNc and VTA of TH::Cre rats.

For the control illumination experiment, 1 µL of Cre-dependent YFP virus (AAV2/5-EF1a-DIO-eYFP, PNI Vector Core, titer: 6.0 × 10¹³ GC/mL) was injected into the VTA of TH::Cre rats and SNc of ChAT::Cre rats.

After the virus injection, a fiber optic cannula (300 µm core diameter, custom made with MM-FER-2006SS-3300 from Precision Fiber Products and FT300UMT from Thorlabs) was implanted 0–0.7 mm above the injection sites. The optogenetic manipulations of VTA, SNc, and NB were bilateral (A/P: −6.0 mm, M/L: ± 0.8 mm, D/V: −8.0 mm for VTA; A/P: −5.6 mm, M/L: ± 1.7 to 2.25 mm, D/V: −7.7 – −8.2 mm for SNc; A/P: −1.5 mm, M/L: ± 2.8 to 3.3 mm, D/V: −7.0 mm for NB), and the optogenetic manipulation of MS was unilateral (A/P: +0.5 mm, M/L: 0 mm, D/V: −7.0 mm), since the structure is located on the midline. Also note that fiber optic cannula implantation into the MS and VTA, and virus injection into the MS, were at a 10° angle to avoid the superior sagittal sinus.

Rabies retrograde tracing experiment

In 6 ChAT::Cre rats, 1.5 µL of helper virus (AAV2/5-CMV-DIO-TVA66T-HA-P2A-N2cΔG, PNI Vector Core, titer: 2.0 × 10¹⁴ GC/mL) was injected into the NB (A/P: −1.5 mm, M/L: 0.75 µL at 2.8 mm, 0.75 µL at 3.5 mm, D/V: −7.2 mm). Four weeks later, 3 of them were assigned to the medial NB group and received 50, 100, or 200 nL of rabies virus injection (RabV-CVS-N2cΔG-mCherry/EnvA, The Center for Neuroanatomy with Neurotropic Viruses, CNNV, titer: 2.0 × 10⁸ ffu/mL) into the medial NB (A/P: −1.5 mm, M/L: 2.8 mm, D/V: −7.2 mm). The remaining 3 rats were assigned to the lateral NB group and received 50, 100, or 200 nL of rabies virus injection into the lateral NB (A/P: −1.5 mm, M/L: 3.5 mm, D/V: −7.2 mm).

Ex-vivo slice physiology experiment

In 5 ChAT::Cre rats, 1 µL of Cre-dependent NpHR virus (AAV2/5-EF1a-DIO-eNpHR3.0-eYFP, PNI Vector Core, titer: 2.20 × 10¹⁴ GC/mL) was injected bilaterally into the NB (A/P: −1.5 mm, M/L: ±3.0 mm, D/V: −7.2 mm). Additionally, 0.75 µL of the same virus was injected into the MS (A/P: +0.5 mm, M/L: 0 mm, D/V: −7.0 mm) at a 10° angle to avoid the superior sagittal sinus.

Fiber photometry

We recorded fluorescence through an implanted fiber (ferrule, Doric Lenses, MFC_400/430–0.48_10mm_MF2.5_FLT; patch cord, Doric Lenses, MFP_400/430/1100–0.57_0.45m_FCM-MF2.5_LAF) while the rats were performing the DNMTP task. We excited GCaMP (or GFP in the case of control rats) with two different wavelengths: 405 nm (intensity at fiber tip: 5–10 mW, sinusoidal frequency modulation: 531 Hz) and 488 nm (intensity at fiber tip: 15–25 mW, sinusoidal frequency modulation: 211 Hz) using an LED driver (Thorlabs DC4104). Emission light from GCaMP was collected through the same fiber using a photodetector (Newport, Femtowatt 215), and the analog data were digitized by the TDT system (RZ5D), which served as both an A-D converter and a lock-in amplifier. A small head-mounted LED was used to track the rat’s position in the chamber while recording. The position data were simultaneously acquired through the TDT video tracking system (RV2). The timestamps for task events were registered as TTL pulses from the operant chamber into the TDT fiber photometry system through the Med-associates interface connection. Thus, the TDT acquisition system synchronously acquired event timestamps through the Med-associates interface, the GCaMP signal through the photodetector, and the animal’s head position through the TDT RV2.

GCaMP signal preprocessing

With 488 nm excitation, the fluorescence of GCaMP is relatively calcium-dependent, but with 405 nm excitation, its fluorescence is largely calcium-independent (Akerboom et al., 2012; Tian et al., 2009). When calculating dF/F, we therefore utilized the 405 nm channel to calculate the baseline fluorescence in order to account for calcium-independent changes in fluorescence that may be caused by the rats’ movement in our freely moving operant task (Lerner et al., 2015).

The fluorescence signals were acquired at 381 Hz and then downsampled to 10 Hz using the “resample” function in MATLAB. These downsampled signals were processed according to the following steps:

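First, the control 405 nm signal $S_{\mathrm{control}}(t)$ was fit to the 488 nm GCaMP signal $S_{\mathrm{GCaMP}}(t)$ using least-squares regression to calculate the fitted control signal $S_{\mathrm{fitted}}(t)$:

$$S_{\mathrm{GCaMP}}(t) = \beta_0 + \beta_1 S_{\mathrm{control}}(t) + \varepsilon$$
$$S_{\mathrm{fitted}}(t) = \hat{\beta}_0 + \hat{\beta}_1 S_{\mathrm{control}}(t)$$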

Second, the relative change in fluorescence signal, $\Delta F/F(t)$, was calculated using $S_{\mathrm{GCaMP}}(t)$ and $S_{\mathrm{fitted}}(t)$:

$$\Delta F/F(t) = \frac{S_{\mathrm{GCaMP}}(t) - S_{\mathrm{fitted}}(t)}{S_{\mathrm{fitted}}(t)}$$

Lastly, $\Delta F/F(t)$ was z-scored to facilitate comparison across recording sessions and rats. The mean, $\mathrm{mean}(\Delta F/F(t))$, and the standard deviation, $\mathrm{std}(\Delta F/F(t))$, were calculated over each recording session.

$$\mathrm{Zscored}(\Delta F/F(t)) = \frac{\Delta F/F(t) - \mathrm{mean}(\Delta F/F(t))}{\mathrm{std}(\Delta F/F(t))}$$
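The three preprocessing steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' MATLAB code: the function names and the toy traces are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def fit_control(control, signal):
    """Least-squares fit of signal = b0 + b1 * control; returns (b0, b1)."""
    mc, ms = mean(control), mean(signal)
    sxx = sum((c - mc) ** 2 for c in control)
    sxy = sum((c - mc) * (s - ms) for c, s in zip(control, signal))
    b1 = sxy / sxx
    return ms - b1 * mc, b1

def zscored_dff(control, signal):
    """Fit the control channel to the signal, compute dF/F, then z-score it."""
    b0, b1 = fit_control(control, signal)
    fitted = [b0 + b1 * c for c in control]               # S_fitted(t)
    dff = [(s - f) / f for s, f in zip(signal, fitted)]   # dF/F(t)
    mu, sd = mean(dff), stdev(dff)  # assumption: sample standard deviation
    return [(d - mu) / sd for d in dff]

# Toy traces (hypothetical values), already downsampled to a common rate.
control_405 = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03]
signal_488 = [2.05, 2.10, 1.95, 2.40, 1.98, 2.08]
z = zscored_dff(control_405, signal_488)
```

By construction, the returned trace has zero mean and unit standard deviation within the session, which is what makes sessions and animals comparable.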

Immunohistochemistry

Rats were deeply anesthetized using euthasol (2 mg/kg, i.p.) and transcardially perfused first with phosphate-buffered saline (PBS), and then with 4% paraformaldehyde (PFA) in PBS. Brains were collected and post-fixed in 4% PFA overnight. The brains were then placed in 30% sucrose in PBS solution for 2–5 days at 4°C. Frozen brains were cut into 40–50 µm thick coronal sections using a cryostat.

One-third of the coronal sections near the target location were directly mounted from the cryostat and coverslipped with a mounting solution (fluoromount-G with DAPI, Southern Biotech) to obtain accurate fiber location and to confirm virus expression without any staining. These images were taken using a microscope (Nikon Ti2000E or Leica M205FA) or whole slide scanner (Hamamatsu Nanozoomer S60).

Another one-third of sections were stained for TH or ChAT, to observe co-localization with GCaMP, NpHR, or ChR2. These sections were placed in a blocking buffer (2% normal donkey serum and 1% bovine serum albumin in PBST; Sigma A7906-100G) for 30 min. Then for TH staining, sections were incubated overnight at 4°C in solution containing the primary antibody for tyrosine hydroxylase (Chicken-TH, 1:500 or 1:1000 dilutions, Aves lab TYH). For ChAT staining, sections were incubated for two days at 4°C in solution containing the primary antibody for choline acetyltransferase (Goat-ChAT, 1:100 dilution, Millipore AB144P). When enhancement of GCaMP, NpHR, and ChR2 signals was necessary, primary antibody for GFP was used (Rabbit-GFP, 1:1000 dilution, Thermo Fisher Scientific G10362). Sections were then washed with PBS for 30 min, and incubated overnight at 4°C in Alexa Fluor 647 or Cy3 (Donkey anti-Chicken-Cy3, 1:1000 dilution, Jackson ImmunoResearch, 703-165-155 or Donkey anti-Goat-647, 1:1000 dilution, Jackson ImmunoResearch, 705-605-147) and Alexa Fluor 488 (Donkey anti-Rabbit-488, 1:1000 dilution, Jackson ImmunoResearch, 711-545-152). After PBS washes, sections were mounted in a mounting solution (fluoromount-G with DAPI, Southern Biotech). To confirm colocalization, cellular resolution images were taken using a confocal microscope (Leica TCS SP8).

Reconstruction of fiber placements

Fiber tip locations of the fiber photometry recording (Figure S1) and optogenetics manipulation sites (Figures S7, S8, and S9) were reconstructed from the histology of coronal brain sections referencing the Paxinos Rat Atlas (Paxinos and Watson, 6th edition). A/P position of the fiber tip was approximated from the section with the deepest fiber track.

In the section with the deepest fiber tip location, M/L position of the fiber tip was carefully reconstructed by normalizing the measured M/L distance of the fiber tip to the reference M/L distance, and scaling that ratio to match the Paxinos Rat Atlas. These normalization-scaling steps effectively registered the measured M/L position into the Paxinos atlas, accounting for individual tissue shrinkage in each brain during histology. Reference distance utilized well-defined “landmarks” in the tissue, such as the distance from the midline to the outermost edge of the tissue (i.e., longest M/L). Then, we derived the atlas-referenced M/L distance of the fiber tip by equating the ratio of measured M/L distances of fiber tip and reference mark to the ratio of atlas-referenced M/L distances of the fiber tip and the reference landmark, and solving for the atlas-referenced M/L distance of the fiber tip.

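$$\mathrm{atlas\ M/L\ distance}_{\mathrm{fiber\ tip}} = \mathrm{measured\ M/L\ distance}_{\mathrm{fiber\ tip}} \times \frac{\mathrm{atlas\ M/L\ distance}_{\mathrm{reference}}}{\mathrm{measured\ M/L\ distance}_{\mathrm{reference}}}$$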

The D/V position of the fiber tip was also derived similarly by referencing the distance of well-known “landmarks” along the D/V (e.g., D/V distance from the top to bottom of the tissue along the midline).
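As a worked illustration of the normalization-scaling step, the sketch below applies the ratio formula to a single axis. The function name and the numbers are hypothetical; the same function would apply to D/V by passing D/V distances instead of M/L distances.

```python
def atlas_distance(measured_tip, measured_reference, atlas_reference):
    """Scale a measured fiber-tip distance into atlas space.

    Solves measured_tip / measured_reference = atlas_tip / atlas_reference
    for atlas_tip, which corrects for tissue shrinkage in the section.
    """
    if measured_reference <= 0:
        raise ValueError("measured reference distance must be positive")
    return measured_tip * atlas_reference / measured_reference

# Hypothetical example: the tissue shrank, so the midline-to-edge landmark
# measures 6.0 mm in the section but 8.0 mm in the atlas.
atlas_ml = atlas_distance(measured_tip=0.9, measured_reference=6.0,
                          atlas_reference=8.0)
```

In this toy case a tip measured 0.9 mm from the midline is placed at about 1.2 mm in atlas coordinates.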

Quantification of opsin expression levels

We quantified the fluorescence intensity as a measure of opsin expression level and correlated it with light-induced accuracy impairment (Figures 4F and 4J). To do so, we collected the sections with the deepest fiber track and imaged them under the same settings using a Leica M205FA microscope. Using Leica LAS X software, we manually drew the outline of fluorescent areas (around the VTA/SNc region). The fluorescence intensity inside the fluorescent area was measured, and the fluorescence intensity of a region of the same size within the same brain slice, but above the fluorescent area, was then subtracted as background.

Optogenetics experiments

About 6–7 weeks post virus injection (AAV2/5-EF1a-DIO-NpHR-eYFP in the experimental group, AAV2/5-EF1a-DIO-eYFP in the control group; for detail, see Methods, Surgery, Optogenetics experiment), rats were tested in the entire-trial inhibition experiment (Figures 4D, 4E, 4H, and 4I). In a randomly selected 20% of all trials, rats received green light bilaterally throughout the sample, delay, and choice periods (532 nm continuous illumination, ~5–6 mW light power) in SNc and VTA for 5 sessions. Rats in the VTA and SNc groups performed ~242 trials/session and ~212 trials/session on average, respectively.

For the entire-trial inhibition experiment in the NB and MS (Figures 5C and 5D), rats received green light bilaterally throughout the sample, delay, and choice periods (532 nm continuous illumination, ~5–6 mW light power) in the NB and MS for 2 sessions in 15% of all trials. Each test session was 1.5 h long and interleaved with a day where rats performed the task without illumination in order to reduce behavioral adaptation to the manipulation. Rats in the NB and MS groups performed ~294 trials/session and ~308 trials/session on average, respectively.

Rats expressing NpHR in VTA were then used for the sub-trial inhibition experiment (Figures 6B–6D). Rats received green light (532 nm continuous illumination, ~5–6 mW light power) in SNc or VTA in a randomly selected 30% of all trials for 10 testing sessions, with each test session interleaved with a day where rats performed the task without testing. The laser-on trials were randomly and equally distributed into sample light-on trials (10% of total), delay light-on trials (10% of total), and choice light-on trials (10% of total). Rats performed an average of ~238 trials/session.

A subset of rats from the aforementioned entire-trial inhibition experiments (n = 5 for VTA, n = 3 for SNc) and additional rats (n = 2 for VTA, n = 3 for SNc) were trained on the cue-guided task to use the cue light to guide their choice (Figures 4G and 4K). They quickly learned the new rule (in ~2 weeks) and reached >95% average accuracy in all delays (the delay-dependent accuracy impairment dissipated in re-trained rats). These rats received entire-trial inhibition using the same parameters as in the DNMTP entire-trial inhibition experiment (20% of trials, 532 nm continuous illumination, ~5–6 mW, 5 sessions). Rats in the VTA and SNc groups performed ~240 trials/session and ~225 trials/session on average, respectively.

For the ChR2 experiments, a separate cohort of rats was injected with DIO-ChR2-eYFP in the VTA and tested 6–7 weeks post-injection. For the sample period activation experiment (Figure 7D), rats received pulsed blue light in the VTA when the sample lever was presented (447 nm, 5 ms pulse duration, 1 burst of 5 pulses at the sample presentation, ~13–15 mW light power). For the delay period activation experiment (Figures 7F and 7H), rats received pulsed blue light in the VTA during the delay period (447 nm, 5 ms pulse duration, either one 20 Hz burst of 5 pulses per second or 1 pulse per second, ~13–15 mW light power). Stimulation took place on a randomly selected 20% of all trials for a total of 5 stimulation sessions, interleaved with nonstimulation sessions. Rats performed on average 187 trials/session.

Ex vivo electrophysiology recordings

To test the efficacy of optogenetic inhibition in NB-ChAT and MS-ChAT cells, we performed ex vivo electrophysiology in ChAT::Cre rats (Figure S9). Coronal slices containing the MS or NB were prepared from 5-month-old male ChAT::Cre rats 4 weeks after injection with DIO-NpHR virus. Rats were deeply anesthetized with an intraperitoneal injection of euthasol (2 mg/kg, i.p.) and decapitated. After extraction, the brain was immersed in ice-cold carbogenated N-methyl-D-glucamine (NMDG) artificial cerebrospinal fluid (ACSF) (92 mM NMDG, 2.5 mM KCl, 1.25 mM NaH2PO4, 30 mM NaHCO3, 20 mM HEPES, 25 mM glucose, 2 mM thiourea, 5 mM Na-ascorbate, 3 mM Na-pyruvate, 0.5 mM CaCl2·4H2O, 10 mM MgSO4·7H2O and 12 mM N-acetyl-L-cysteine) for 3 min. Afterward, coronal slices (300 µm) were sectioned using a vibratome (VT1200s, Leica) and then incubated in NMDG ACSF at 34°C for 12–14 min. Slices were then transferred into a holding solution of HEPES ACSF (92 mM NaCl, 2.5 mM KCl, 1.25 mM NaH2PO4, 30 mM NaHCO3, 20 mM HEPES, 25 mM glucose, 2 mM thiourea, 5 mM Na-ascorbate, 3 mM Na-pyruvate, 2 mM CaCl2·4H2O, 2 mM MgSO4·7H2O and 12 mM N-acetyl-L-cysteine, bubbled at room temperature with 95% O2, 5% CO2) for at least 45 min until recordings were performed. Whole cell recordings were performed using a Multiclamp 700B (Molecular Devices, Sunnyvale, CA) with pipettes with a resistance of 4–7 MΩ filled with a potassium-based internal solution containing 120 mM potassium gluconate, 0.2 mM EGTA, 10 mM HEPES, 5 mM NaCl, 1 mM MgCl2, 2 mM Mg-ATP and 0.3 mM Na-GTP, with the pH adjusted to 7.2 with KOH. ChAT neurons were identified for recordings based on YFP expression. Photostimulation parameters were 586 nm and 0.034–0.053 mW/mm^2. Neurons were held at −70 mV during photocurrent measurements. Baseline potential was calculated as the mean potential over a 1 s period just prior to stimulation. Peak hyperpolarization was calculated as the largest hyperpolarization relative to baseline potential. Steady-state hyperpolarization was calculated as the mean hyperpolarization during the last 1 s of stimulation. Peak and steady-state photocurrents were calculated using the same time intervals. To confirm the ability of photocurrents to eliminate action potentials in MS-ChAT cells, action potentials were induced by a positive current injection (200 pA, 25 ms pulse duration, 1 Hz). Action potentials in NB-ChAT cells were induced by a positive current injection (150 pA, 50 ms pulse duration, 4 Hz). Stimulation frequencies were chosen based on published in vivo firing frequencies of either cell population (Hedrick and Waters, 2010; Simon et al., 2006).
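The baseline, peak, and steady-state definitions above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function name, the sampling rate, and the toy voltage trace are hypothetical, and hyperpolarization is reported here as a positive magnitude (baseline minus membrane potential), which is a convention chosen for the example.

```python
from statistics import mean

def hyperpolarization_metrics(trace_mv, fs, stim_start_s, stim_end_s,
                              window_s=1.0):
    """Return (baseline, peak, steady_state) for one light-evoked response.

    baseline: mean potential over window_s just before stimulation
    peak: largest hyperpolarization relative to baseline during stimulation
    steady_state: mean hyperpolarization over the last window_s of stimulation
    """
    i0 = round(stim_start_s * fs)
    i1 = round(stim_end_s * fs)
    w = round(window_s * fs)
    baseline = mean(trace_mv[i0 - w:i0])
    peak = baseline - min(trace_mv[i0:i1])
    steady_state = baseline - mean(trace_mv[i1 - w:i1])
    return baseline, peak, steady_state

# Toy trace at 10 Hz (hypothetical): 1 s rest, 2 s light with an initial
# transient to -95 mV that relaxes to -85 mV, then 1 s recovery.
trace = [-70.0] * 10 + [-95.0] * 5 + [-85.0] * 15 + [-70.0] * 10
baseline, peak, steady = hyperpolarization_metrics(trace, fs=10,
                                                   stim_start_s=1.0,
                                                   stim_end_s=3.0)
```

The same windows could be applied to a current trace to obtain peak and steady-state photocurrents.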

QUANTIFICATION AND STATISTICAL ANALYSIS

Encoding models

To distinguish the relative contribution of locomotion and task events in predicting the GCaMP signal, we built and compared three encoding models, as shown in Figure 2B. The three models were based on linear regressions, in which the measured GCaMP signal was predicted by the weighted sum of predictors based on task events, animals’ speed in the chamber or the combination of task events and speed.

Event predictors

Task event predictors ($E_{i,j}$) were generated for each type of task event by convolving a time series of event times ($T_i$; 1 when the event occurred, 0 otherwise) with a 10 degrees-of-freedom spline basis set of 3 s duration ($B_j$, where j = [1..10]; see Engelhard et al., 2019; Park et al., 2014). For the ith task event and jth spline basis function, the task event predictor ($E_{i,j}$) is defined as follows:

$$E_{i,j}(t) = (T_i * B_j)(t) = \int T_i(t-\tau)\, B_j(\tau)\, d\tau$$

The 10 types of task events consisted of sample lever presentation, sample lever press, delay start, delay end, choice lever presentation, choice lever press, correct reward port entry, correct reward port exit, incorrect reward port entry, and incorrect reward port exit (therefore, i = [1..10]). Note the duration of the spline basis set for the reward response was longer than the others (10 s) to capture the prolonged reward consumption responses observed in some animals. To allow predictors to capture response kernels starting 1 s before each event, the event time series were shifted acausally by 1 s before performing the convolution (with the exception of the reward event, which was not shifted acausally).

The advantage of convolving each event with the spline basis set to generate our predictors is that it allows for a temporal delay in the relationship between neural activity and behavior, while minimizing the number of predictors by assuming smoothness in the response profiles (Engelhard et al., 2019; Park et al., 2014). A 10 degrees-of-freedom spline basis set was selected to preserve the shape of time-locked GCaMP signal in the response kernels learned from the model, while minimizing the number of total predictors.
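The construction of event predictors can be sketched as a discrete convolution. In this minimal example, evenly spaced triangular ("tent") functions stand in for the spline basis set, which is an assumption made to keep the code dependency-free; the function names, the 10 Hz sampling rate, and the toy event times are likewise hypothetical.

```python
def tent_basis(n_basis, n_samples):
    """n_basis triangular bumps tiling n_samples bins (stand-in for splines)."""
    width = (n_samples - 1) / (n_basis - 1)
    centers = [k * width for k in range(n_basis)]
    return [[max(0.0, 1.0 - abs(t - c) / width) for t in range(n_samples)]
            for c in centers]

def event_predictors(event_train, basis, shift=0):
    """Convolve a 0/1 event train with each basis function.

    shift > 0 moves every event earlier by that many samples, so the
    learned kernel can start before the event (acausal shift).
    """
    n = len(event_train)
    shifted = [0] * n
    for t, v in enumerate(event_train):
        if v and t - shift >= 0:
            shifted[t - shift] = v
    predictors = []
    for b in basis:
        column = [0.0] * n
        for t0, v in enumerate(shifted):
            if v:
                for tau, bv in enumerate(b):
                    if t0 + tau < n:
                        column[t0 + tau] += v * bv
        predictors.append(column)
    return predictors

fs = 10                                  # samples per second (hypothetical)
basis = tent_basis(n_basis=10, n_samples=3 * fs)   # 3 s kernel, 10 functions
train = [0] * 100
train[20] = 1                            # event at 2.0 s
train[60] = 1                            # event at 6.0 s
E = event_predictors(train, basis, shift=1 * fs)   # 1 s acausal shift
```

Each of the 10 columns in `E` is one predictor $E_{i,j}(t)$ for a single event type; stacking the columns for all event types gives the event part of the design matrix.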

Speed predictors

Animals’ movement speed was calculated from the tracked x, y position of the rats’ head using a small LED light attached to the fiber photometry tether, close to the rats’ head. The x, y positions were tracked and acquired at 102 Hz. Tracking was lost if the LED light was hidden by the chamber objects (i.e., underneath the lever or too far into the reward consumption inlet) or its reflection on the wall was captured. Missing tracking points were treated as NaN in MATLAB and R. The tracked x, y position in pixels was converted to centimeters by manually defining the outer edges of the tracked arena, whose dimensions were 32.5 cm × 24.5 cm. The position vectors were iteratively median-filtered three times (with a 100 ms window) to reduce noise and interpolate missing data from the tracking loss. The Euclidean distance, derived from the change in x, y position, was multiplied by the acquisition frequency to calculate instantaneous speed. The instantaneous speed was then downsampled to 10 Hz, using the “resample” function in MATLAB, to generate the speed predictor.

Speed predictors (Sk) were continuous variables which included first, second, and third-degree polynomials of the animal’s speed, to allow flexibility in the relationship between speed and GCaMP.
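A minimal sketch of the speed calculation and the polynomial expansion follows. It assumes positions are already in centimeters and free of missing samples; median filtering, NaN handling, and downsampling are omitted, and the function names and toy coordinates are hypothetical.

```python
import math

def instantaneous_speed(xs, ys, fs):
    """Speed in cm/s: Euclidean step length times the acquisition rate."""
    return [math.hypot(x1 - x0, y1 - y0) * fs
            for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])]

def polynomial_predictors(speed, degree=3):
    """One row per time point holding speed, speed^2, ..., speed^degree."""
    return [[s ** k for k in range(1, degree + 1)] for s in speed]

# Toy positions in cm (hypothetical), sampled at 102 Hz as in the text.
xs = [0.00, 0.03, 0.07]
ys = [0.00, 0.04, 0.04]
speed = instantaneous_speed(xs, ys, fs=102)
S = polynomial_predictors(speed)
```

The first step covers 0.05 cm, giving roughly 5.1 cm/s; each row of `S` then supplies the first, second, and third-degree speed terms for the regression.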

Encoding Models

The full encoding model to predict GCaMP (1), a reduced model with only the task event predictors (2), and an alternative reduced model with only the speed predictors (3) are expressed as follows:

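$$g(t) = \beta_0 + \sum_{i=1}^{N_{\mathrm{event}}} \sum_{j=1}^{N_{\mathrm{bs}}} \beta_{i,j} E_{i,j}(t) + \sum_{k=1}^{N_{\mathrm{poly}}} \beta_k [S(t)]^k + \varepsilon \qquad (1)$$
$$g(t) = \beta_0 + \sum_{i=1}^{N_{\mathrm{event}}} \sum_{j=1}^{N_{\mathrm{bs}}} \beta_{i,j} E_{i,j}(t) + \varepsilon \qquad (2)$$
$$g(t) = \beta_0 + \sum_{k=1}^{N_{\mathrm{poly}}} \beta_k [S(t)]^k + \varepsilon \qquad (3)$$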

where $g(t)$ is the GCaMP signal predicted from the task event predictors ($E_{i,j}$) and/or the animal’s speed ($S^k$). Through the linear regression, the model learned β weights ($\beta_0$, $\beta_{i,j}$, and $\beta_k$) for the predictors ($E_{i,j}$ and $S^k$). Parameters were: $N_{\mathrm{event}}$, the number of task events, which was 10; $N_{\mathrm{bs}}$, the degrees of freedom of the spline basis set, which was 10; and $N_{\mathrm{poly}}$, the degree of the polynomial used to model speed, which was 3.

Model evaluation using cross-validation

To examine the relative contribution of the animal’s movement versus task event predictors, the R^2 of the three models, as a measure of each model’s predictive power, was calculated and compared (Figure 2B). To generate the R^2 of a model, data from each recording site were divided into three folds, in which 2/3 of the data were used to train the model (using the “lm” function in R), and the remaining 1/3 of the data were held out to test the trained model. After the model was trained, the predicted GCaMP signal was generated using the “predict” function on the predictor matrix of the held-out data. R^2 was calculated on the held-out data. This training-testing process was repeated until each fold was used as the held-out data for testing (3-fold cross-validation). The resulting three R^2 values, one per fold, were averaged to create an average R^2 for each recording site. Note that rank-deficient fits were not used to calculate the average R^2, since rank deficiency suggested the data were not sufficient. This resulted in eliminating one NB recording site (1 out of 50 recording sites) from further analysis.
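The cross-validation procedure can be illustrated with a reduced example. The sketch below uses a single predictor and contiguous folds, both simplifications and assumptions (the text does not state how folds were assigned, and the actual models had many predictors fit with R's `lm`); the function names and the toy data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, computed on the held-out data."""
    my = mean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - my) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(x, y, n_folds=3):
    """Train on all but one fold, score on the held-out fold, average."""
    n = len(x)
    bounds = [k * n // n_folds for k in range(n_folds + 1)]
    scores = []
    for k in range(n_folds):
        lo, hi = bounds[k], bounds[k + 1]        # contiguous fold (assumption)
        b0, b1 = fit_line(x[:lo] + x[hi:], y[:lo] + y[hi:])
        y_pred = [b0 + b1 * v for v in x[lo:hi]]
        scores.append(r_squared(y[lo:hi], y_pred))
    return mean(scores)

# Toy data (hypothetical): a linear trend plus small deterministic noise.
x = [float(v) for v in range(12)]
noise = [0.1, -0.1, 0.05, -0.05] * 3
y = [2.0 * v + 1.0 + e for v, e in zip(x, noise)]
cv_r2 = cross_validated_r2(x, y)
```

Because R^2 is always scored on data the model has not seen, a model that merely memorizes noise in the training folds is penalized rather than rewarded.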

To fit the model with a linear regression, the “lm” function in R was used (Figure 2; Figures S2G–S2L). To validate that our model is not overfitting, we also fit the same model using a lasso regression (“glmnet” function in R), which uses regularization to select relevant predictors. The kernels generated from the lasso regression were similar to the kernels from the linear regression.

Generating event kernels from the model

The response kernel for a type of task event is the component of the neural response that can be specifically attributed to that type of task event in the encoding model. These response kernels learned from the model are reported in Figures S2G–S2L. To generate the response kernels, beta weights ($\beta_{i,j}$) for the task event predictors ($E_{i,j}$) were learned from the regression described above.

For the ith type of task event, the response kernel is the weighted ($\beta_{i,j}$) sum of the spline basis functions $B_j(\tau)$, as follows:

$$\sum_{j=1}^{N_{\mathrm{bs}}} \beta_{i,j} B_j(\tau)$$
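As a small worked example of this weighted sum, the sketch below combines three toy basis functions with three weights. The basis values and weights are hypothetical and far smaller than the 10-function spline set used in the paper.

```python
def response_kernel(betas, basis):
    """Weighted sum over basis functions: sum_j beta_j * B_j(tau)."""
    if len(betas) != len(basis):
        raise ValueError("need exactly one weight per basis function")
    n_samples = len(basis[0])
    return [sum(b * f[t] for b, f in zip(betas, basis))
            for t in range(n_samples)]

# Toy basis (hypothetical): three functions sampled at four time lags.
basis = [[1.0, 0.5, 0.0, 0.0],
         [0.0, 0.5, 1.0, 0.5],
         [0.0, 0.0, 0.0, 0.5]]
betas = [2.0, -1.0, 4.0]
kernel = response_kernel(betas, basis)   # -> [2.0, 0.5, -1.0, 1.5]
```

The kernel inherits its smoothness from the basis functions, so only one weight per basis function has to be estimated for each event type.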

Inverted-U quantification

To statistically test if there is an inverted-U relationship between fluorescence and accuracy (Figures 4K, 4M, and 4O; Figure S6), the average accuracy was predicted by a mixed-effect linear regression based on the following predictors: 1st and 2nd degree polynomial of delay period fluorescence quintile, delay period duration, and random effect of individual recording site (implemented with “lmer” function in R). Note that the random effect of individual recording sites allows the model to account for individual differences in average accuracy, while identifying the curve that best fits the entire dataset. The inverted-U was supported by the negative and statistically significant coefficient of the 2nd degree polynomial of delay period fluorescence quintile.

To justify our model selection process, we compared two mixed-effect linear regression models. In the first, full model, accuracy was predicted by both the first and second degree polynomial of delay period fluorescence quintile, delay duration, and random effect of individual recording sites. The second, reduced model was identical to the first, except that the second degree polynomial of delay period fluorescence quintile was omitted. Since the second model is nested within the first model, we performed a chi-square test of the two models to determine if the addition of the second degree polynomial term is justified. The addition of the second degree polynomial significantly improved the model fit only in the VTA-DA and NB-ChAT groups ($\chi^2_{6,7}$ = 8.22, p < 0.001 for VTA-DA; $\chi^2_{6,7}$ = 7.18, p < 0.001 for NB-ChAT), but not in the SNc-DA group ($\chi^2_{6,7}$ = 0.58, p = 0.45). The goodness of fit of the selected, full model was 0.58 for VTA-DA, 0.36 for SNc-DA, and 0.45 for NB-ChAT.

We further validated the inverted-U by incorporating an additional set of statistical tests, based on Lind and Mehlum (2010). These results are summarized in Figure S6A. They recommend that, in addition to the 2nd degree polynomial p value described above, an inverted-U should be confirmed through: i) significance of the positive slope on the lower data range, ii) significance of the negative slope on the upper data range, iii) joint significance of the left and right side slopes, and iv) checking that the maximum of the inverted-U and its confidence interval fall within the x-range of the data. Given the inverted-U equation ($y = \alpha + \beta x + \lambda x^2$), the significance of the positive and negative slopes was computed from one-sided t tests for the inequalities $\beta + 2\lambda x_l < 0$ and $\beta + 2\lambda x_h > 0$, where $x_l$ and $x_h$ were the minimum and maximum of the data range (in this case, the 1st and 5th quintile of fluorescence). The joint significance of the two slopes was tested from the composite hypothesis of the inequalities ($\beta + 2\lambda x_l < 0 \,\cup\, \beta + 2\lambda x_h > 0$, intersection-union test). These statistics were computed using the Stata module provided with the paper (https://econpapers.repec.org/software/bocbocode/s456874.htm).
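The slope and extremum quantities that enter these tests follow directly from the quadratic. The sketch below evaluates them as point estimates only; it does not reproduce the t tests, the intersection-union test, or the confidence interval computed by the Stata module, and the function name and coefficient values are hypothetical.

```python
def inverted_u_shape(beta, lam, x_low, x_high):
    """Point-estimate shape check for y = alpha + beta*x + lam*x**2.

    Returns the slopes at both ends of the data range, the location of
    the extremum, and whether the estimates describe an inverted U
    (rising at x_low, falling at x_high, peak inside the range).
    """
    slope_low = beta + 2 * lam * x_low
    slope_high = beta + 2 * lam * x_high
    peak = -beta / (2 * lam) if lam != 0 else None
    is_inverted_u = (lam < 0 and slope_low > 0 and slope_high < 0
                     and x_low < peak < x_high)
    return {"slope_low": slope_low, "slope_high": slope_high,
            "peak": peak, "is_inverted_u": is_inverted_u}

# Hypothetical coefficients over fluorescence quintiles 1..5.
shape = inverted_u_shape(beta=0.12, lam=-0.02, x_low=1, x_high=5)
```

With these toy values the slope is about +0.08 at the first quintile and −0.08 at the fifth, with the peak at the third quintile. A monotonic but concave curve would fail the check because its extremum would fall outside the data range.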

Contribution of lever presentation versus press

To compare the relative contribution of reward-predicting cues and reward-motivated actions in predicting GCaMP fluorescence (Figures S3O–S3Q), we quantified the reduction in variance explained when the predictor of interest was removed from the encoding model.

First, we compared the full model (as described earlier in the “Encoding models” section) with a cue-reduced model. The cue-reduced model was the same as the full model, except the “sample lever presentation” predictors (10 basis set predictors for the “sample lever presentation” event) were removed from the predictor matrix. The data were fit again to the cue-reduced model, using the “lm” function and 3-fold cross-validation. The contribution of the sample cue predictors ($C_{\mathrm{cue}}$) was defined as the reduction in the explained variance, $R^2$, of the reduced model compared to the full model (Engelhard et al., 2019; Lovett-Barron et al., 2019; Musall et al., 2019):

$$C_{\mathrm{cue}} = 1 - \frac{R^2_{\mathrm{cue\text{-}reduced}}}{R^2_{\mathrm{full}}}$$

We similarly compared the full model with an action-reduced model by removing the “sample lever press” predictors from the predictor matrix and calculating the contribution of sample lever press predictors ($C_{\mathrm{action}}$):

$$C_{\mathrm{action}} = 1 - \frac{R^2_{\mathrm{action\text{-}reduced}}}{R^2_{\mathrm{full}}}$$

Note that we removed sample presentation and sample press (and not choice presentation or choice lever press) to derive the cue-reduced and action-reduced models. This is because choice press coincided with the light cue for reward in our task design, thus we were unable to cleanly dissociate the reward cue from choice lever press action.

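Finally, the relative contribution of the predictor for each recording site was calculated as a fraction of the combined contribution of cue and action.

$$\mathrm{Relative}\ C_{\mathrm{cue}} = \frac{C_{\mathrm{cue}}}{C_{\mathrm{cue}} + C_{\mathrm{action}}}$$
$$\mathrm{Relative}\ C_{\mathrm{action}} = \frac{C_{\mathrm{action}}}{C_{\mathrm{cue}} + C_{\mathrm{action}}}$$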

To statistically compare the relative contribution of cues and actions to the explained variance, we performed pairwise t tests across the VTA-DA, SNc-DA, and NB-ChAT recording sites.
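The contribution measures reduce to simple arithmetic on cross-validated R^2 values, as the sketch below shows. The function names and the R^2 values are hypothetical and chosen only to make the calculation easy to follow.

```python
def contribution(r2_reduced, r2_full):
    """Fractional loss of explained variance when a predictor set is removed."""
    return 1.0 - r2_reduced / r2_full

def relative_contributions(c_cue, c_action):
    """Normalize the two contributions so that they sum to 1."""
    total = c_cue + c_action
    return c_cue / total, c_action / total

# Hypothetical cross-validated R^2 values for one recording site.
r2_full = 0.40
c_cue = contribution(r2_reduced=0.30, r2_full=r2_full)      # about 0.25
c_action = contribution(r2_reduced=0.35, r2_full=r2_full)   # about 0.125
rel_cue, rel_action = relative_contributions(c_cue, c_action)
```

Here removing the cue predictors costs twice as much explained variance as removing the action predictors, so the cue accounts for about two-thirds of the combined contribution.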

Rabies tracing and wholebrain quantification

To analyze input cells to medial and lateral subregions of the NB-ChAT population, we injected Cre-dependent helper virus and rabies virus into the NB of ChAT::Cre rats (for detail, see Methods, Surgery, Rabies retrograde tracing experiment; Figure S5; Reardon et al., 2016). Three weeks post surgery, rats (n = 6 rats, 3 rats in each of the medial and lateral NB-ChAT groups) were perfused and their brains were extracted for histology (for detail, see Immunohistochemistry). Brain sections covering the entire brain (approximate A/P range from +4 to −9 mm) at 100 µm spacing were mounted and coverslipped with a mounting solution, then imaged using a whole slide scanner (Hamamatsu Nanozoomer S60). These images (raw 16-bit TIFF) of brain sections were analyzed using a published platform, “WholeBrain” (Fürth et al., 2018).

The analysis of the brain sections consisted of three steps: registration to the Allen brain atlas, detection of input cells, and final registration to the Waxholm Space atlas of the Sprague Dawley rat brain (Papp et al., 2014). First, we visually identified the corresponding mouse A/P coordinate of all rat brain sections, referencing Openbrainmap (http://openbrainmap.org). Then each imaged section of the rat brain was registered to the Allen brain atlas at the same A/P coordinate, using the “registration” function from the “WholeBrain” package in R. Once the imaged section was registered, mCherry-labeled cells (input cells infected with RabV-CVS-N2cΔG-mCherry/EnvA virus) in the images were automatically detected using the “segment” function from the “WholeBrain” package in R, with visual inspection to detect outliers and manual correction when deemed necessary. When the registration and detection steps are complete, “WholeBrain” creates a data frame containing information on all counted mCherry-labeled cells, their location (A/P, M/L, D/V) in the Allen brain atlas, and the brain region in the ontology to which they belong. Finally, an additional registration process converted the mouse brain coordinates of the detected input cells into rat brain coordinates using the new “map.to.rat” function (WholeBrain version 0.1.36).

KEY RESOURCES TABLE

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
| --- | --- | --- |
| Antibodies | | |
| Chicken monoclonal anti-TH | Aves lab | Cat# TYH; RRID: AB_10013440 |
| Goat monoclonal anti-ChAT | Millipore | Cat# AB144P; RRID: AB_2079751 |
| Rabbit monoclonal anti-GFP | Thermo Fisher Scientific | Cat# G10362; RRID: AB_2536526 |
| Donkey anti-Chicken conjugated to Cy3 | Jackson Immuno Research | Cat# 703-165-155; RRID: AB_2340363 |
| Donkey anti-Goat conjugated to Alexa Fluor 647 | Jackson Immuno Research | Cat# 705-605-147; RRID: AB_2340437 |
| Donkey anti-Rabbit conjugated to Alexa Fluor 488 | Jackson Immuno Research | Cat# 711-545-152; RRID: AB_2313584 |
| Bacterial and Virus Strains | | |
| AAV2/5-CAG-Flex-GCamP6f | U Penn Vector Core | https://www.addgene.org/100835/; RRID: Addgene_100835 |
| AAV2/5-CAG-DIO-RatOpt-GCaMP6f | PNI Viral Core, Princeton | Cat# AAV-VC58 |
| AAV2/5-CAG-Flex-eGFP | U Penn Vector Core | https://www.addgene.org/51502/; RRID: Addgene_51502 |
| AAV2/5-EF1a-DIO-eNpHR3.0-eYFP | U Penn Vector Core | https://www.addgene.org/26966/; RRID: Addgene_26966 |
| AAV2/5-EF1a-DIO-ChR2-eYFP | U Penn Vector Core | https://www.addgene.org/20298/; RRID: Addgene_20298 |
| AAV2/5-EF1a-DIO-eNpHR3.0-eYFP | PNI Viral Core, Princeton | Cat# AAV-VC24 |
| AAV2/5-EF1a-DIO-ChR2-eYFP | PNI Viral Core, Princeton | Cat# AAV-VC53 |
| AAV2/5-EF1a-DIO-eYFP | PNI Viral Core, Princeton | Cat# AAV-VC93 |
| AAV2/5-CMV-DIO-TVA66T-HA-P2A-N2cΔG | PNI Viral Core, Princeton | Cat# AAV-VC178 |
| RabV-CVS-N2cΔG-mCherry/EnvA | Obtained by PNI Viral Core from The Center for Neuroanatomy with Neurotropic Viruses (CNNV) | https://www.addgene.org/73464/; RRID: Addgene_73464 |
| Experimental Models: Organisms/Strains | | |
| Rat: TH::Cre Long Evans | Horizon | Cat# TGRA8400 |
| Rat: ChAT::Cre Long Evans | RRRC | Cat# 658 |
| Software and Algorithms | | |
| U test stata module | Lind and Mehlum, 2010 | https://econpapers.repec.org/software/bocbocode/s456874.htm |
| Wholebrain R package | Fürth et al., 2018 | http://www.wholebrainsoftware.org/ |
| Other | | |
| Fibers for optogenetics | Thor labs | Cat# FT300UMT |
| Ferrules for optogenetics | Precision Fiber Products | Cat# MM-FER-2006SS-3300 |
| Patch cord for fiberphotometry | Doric Lenses | Cat# MFP_400/430/1100-0.57_0.45m_FCM-MF2.5_LAF |
| Ferrules for fiberphotometry | Doric Lenses | Cat# MFC_400/430-0.48_10mm_MF2.5_FLT |

Highlights.

  • Recording and manipulation of four neuromodulatory populations during short-term memory

  • Only dopamine neurons in VTA contribute selectively to short-term memory

  • Contribution is during the delay period, despite low activity at that time

  • Inverted-U relationship between VTA dopamine activity and performance

ACKNOWLEDGMENTS

We would like to thank members of the Witten lab for support and advice on this work as well as the PNI viral core, the Center for Neuroanatomy with Neurotropic Viruses (NIH 2P40OD010996–16), Matthias Schnell for providing N2c rabies virus, and Jihyun Bak for providing comments on the manuscript. This research was funded by NYSCF, Pew, McKnight, and NARSAD grants to I.B.W., ARO W911NF1710554, and the following NIH grants: 1R01MH106689–01A1, U19 NS104648–01, DP2 DA035149–01, and 1R01DA047869–01. I.B.W. is a New York Stem Cell Foundation–Robertson Investigator.

Footnotes

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.celrep.2020.108492.

DECLARATION OF INTERESTS

The authors declare no competing interests.

REFERENCES

  1. Akerboom J, Chen T-W, Wardill TJ, Tian L, Marvin JS, Mutlu S, Calderón NC, Esposti F, Borghuis BG, Sun XR, et al. (2012). Optimization of a GCaMP calcium indicator for neural activity imaging. J. Neurosci. 32, 13819–13840.
  2. Akhlaghpour H, Wiskerke J, Choi JY, Taliaferro JP, Au J, and Witten IB (2016). Dissociated sequential activity and stimulus encoding in the dorso-medial striatum during spatial working memory. eLife 5, e19507.
  3. Arnsten AF (1997). Catecholamine regulation of the prefrontal cortex. J. Psychopharmacol. 11, 151–162.
  4. Arnsten AF, Cai JX, Murphy BL, and Goldman-Rakic PS (1994). Dopamine D1 receptor mechanisms in the cognitive performance of young adult and aged monkeys. Psychopharmacology (Berl.) 116, 143–151.
  5. Arnsten AFT, Wang MJ, and Paspalas CD (2012). Neuromodulation of thought: flexibilities and vulnerabilities in prefrontal cortical network synapses. Neuron 76, 223–239.
  6. Baddeley AD (1986). Working Memory (Oxford University Press).
  7. Baddeley AD, and Hitch G (1974). Working Memory. Psychol. Learn. Motiv. 8, 47–89.
  8. Baxter MG, and Bucci DJ (2013). Selective immunotoxic lesions of basal forebrain cholinergic neurons: twenty years of research and new directions. Behav. Neurosci. 127, 611–618.
  9. Baxter MG, Bucci DJ, Gorman LK, Wiley RG, and Gallagher M (1995). Selective immunotoxic lesions of basal forebrain cholinergic cells: effects on learning and memory in rats. Behav. Neurosci. 109, 714–722.
  10. Bayer HM, and Glimcher PW (2005). Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141.
  11. Beckstead RM, Domesick VB, and Nauta WJ (1979). Efferent connections of the substantia nigra and ventral tegmental area in the rat. Brain Res. 175, 191–217.
  12. Braver TS, and Cohen JD (1999). Dopamine, cognitive control, and schizophrenia: the gating model. Prog. Brain Res. 121, 327–349.
  13. Braver TS, and Cohen JD (2000). On the control of control: The role of dopamine in regulating prefrontal function and working memory. In Control of Cognitive Processes: Attention and Performance XVIII, Monsell S and Driver J, eds. (MIT Press), pp. 713–737.
  14. Bromberg-Martin ES, Matsumoto M, and Hikosaka O (2010). Dopamine in motivational control: rewarding, aversive, and alerting. Neuron 68, 815–834.
  15. Brown AJ, Fisher DA, Kouranova E, McCoy A, Forbes K, Wu Y, Henry R, Ji D, Chambers A, Warren J, et al. (2013). Whole-rat conditional gene knockout via genome editing. Nat. Methods 10, 638–640.
  16. Brozoski TJ, Brown RM, Rosvold HE, and Goldman PS (1979). Cognitive deficit caused by regional depletion of dopamine in prefrontal cortex of rhesus monkey. Science 205, 929–932.
  17. Cai JX, and Arnsten AF (1997). Dose-dependent effects of the dopamine D1 receptor agonists A77636 or SKF81297 on spatial working memory in aged monkeys. J. Pharmacol. Exp. Ther. 283, 183–189.
  18. Cai LX, Pizano K, Gundersen GW, Hayes CL, Fleming WT, Holt S, Cox JM, and Witten IB (2020). Distinct signals in medial and lateral VTA dopamine neurons modulate fear extinction at different times. eLife 9, e54936.
  19. Cameron CM, Murugan M, Choi JY, Engel EA, and Witten IB (2019). Increased Cocaine Motivation Is Associated with Degraded Spatial and Temporal Representations in IL-NAc Neurons. Neuron 103, 80–91.e7.
  20. Chang CY, Esber GR, Marrero-Garcia Y, Yau H-J, Bonci A, and Schoenbaum G (2016). Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat. Neurosci. 19, 111–116.
  21. Clark KL, and Noudoost B (2014). The role of prefrontal catecholamines in attention and working memory. Front. Neural Circuits 8, 33.
  22. Coddington LT, and Dudman JT (2018). The timing of action determines reward prediction signals in identified midbrain dopamine neurons. Nat. Neurosci. 21, 1563–1573.
  23. Cohen JY, Haesler S, Vong L, Lowell BB, and Uchida N (2012). Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88.
  24. Connor DJ, Langlais PJ, and Thal LJ (1991). Behavioral impairments after lesions of the nucleus basalis by ibotenic acid and quisqualic acid. Brain Res. 555, 84–90.
  25. Cools R, and D’Esposito M (2011). Inverted-U-shaped dopamine actions on human working memory and cognitive control. Biol. Psychiatry 69, e113–e125.
  26. Cools R, and Robbins TW (2004). Chemistry of the adaptive mind. Philos. Trans. A Math. Phys. Eng. Sci. 362, 2871–2888.
  27. Croxson PL, Kyriazis DA, and Baxter MG (2011). Cholinergic modulation of a specific memory function of prefrontal cortex. Nat. Neurosci. 14, 1510–1512.
  28. da Silva JA, Tecuapetla F, Paixão V, and Costa RM (2018). Dopamine neuron activity before action initiation gates and invigorates future movements. Nature 554, 244–248. [DOI] [PubMed] [Google Scholar]
  29. Dunnett SB, Evenden JL, and Iversen SD (1988). Delay-dependent short-term memory deficits in aged rats. Psychopharmacology (Berl.) 96, 174–180. [DOI] [PubMed] [Google Scholar]
  30. Ellwood IT, Patel T, Wadia V, Lee AT, Liptak AT, Bender KJ, and Sohal VS (2017). Tonic or phasic stimulation of dopaminergic projections to prefrontal cortex causes mice to maintain or deviate from previously learned behavioral strategies. J. Neurosci 37, 8315–8329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Engelhard B, Finkelstein J, Cox J, Fleming W, Jang HJ, Ornelas S, Koay SA, Thiberge SY, Daw ND, Tank DW, and Witten IB (2019). Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons. Nature 570, 509–513. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Erlich JC, Bialek M, and Brody CD (2011). A cortical substrate for memory-guided orienting in the rat. Neuron 72, 330–343. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Everitt BJ, and Robbins TW (1997). Central cholinergic systems and cognition. Annu. Rev. Psychol 48, 649–684. [DOI] [PubMed] [Google Scholar]
  34. Fallon JH, and Moore RY (1978). Catecholamine innervation of the basal forebrain. IV. Topography of the dopamine projection to the basal forebrain and neostriatum. J. Comp. Neurol 180, 545–580. [DOI] [PubMed] [Google Scholar]
  35. Fiorillo CD, Newsome WT, and Schultz W (2008). The temporal precision of reward prediction in dopamine neurons. Nat. Neurosci 11, 966–973. [DOI] [PubMed] [Google Scholar]
  36. Floresco SB, and Phillips AG (2001). Delay-dependent modulation of memory retrieval by infusion of a dopamine D1 agonist into the rat medial prefrontal cortex. Behav. Neurosci 115, 934–939. [PubMed] [Google Scholar]
  37. Funahashi S, Chafee MV, and Goldman-Rakic PS (1993). Prefrontal neuronal activity in rhesus monkeys performing a delayed anti-saccade task. Nature 365, 753–756. [DOI] [PubMed] [Google Scholar]
  38. Fürth D, Vaissière T, Tzortzi O, Xuan Y, Märtin A, Lazaridis I, Spigolon G, Fisone G, Tomer R, Deisseroth K, et al. (2018). An interactive framework for whole-brain maps at cellular resolution. Nat. Neurosci 21, 139–149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Fuster JM, and Alexander GE (1971). Neuron activity related to short-term memory. Science 173, 652–654. [DOI] [PubMed] [Google Scholar]
  40. Gibbs SEB, and D’Esposito M (2005). A functional MRI study of the effects of bromocriptine, a dopamine receptor agonist, on component processes of working memory. Psychopharmacology (Berl.) 180, 644–653. [DOI] [PubMed] [Google Scholar]
  41. Hangya B, Ranade SP, Lorenc M, and Kepecs A (2015). Central Cholinergic Neurons Are Rapidly Recruited by Reinforcement Feedback. Cell 162, 1155–1168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Hasselmo ME (2006). The role of acetylcholine in learning and memory. Curr. Opin. Neurobiol 16, 710–715. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Hasselmo ME, and Sarter M (2011). Modes and models of forebrain cholinergic neuromodulation of cognition. Neuropsychopharmacology 36, 52–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Hasselmo ME, and Stern CE (2006). Mechanisms underlying working memory for novel information. Trends Cogn. Sci 10, 487–493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Hazy TE, Frank MJ, and O’Reilly RC (2007). Towards an executive without a homunculus: computational models of the prefrontal cortex/basal ganglia system. Philos. Trans. R. Soc. Lond. B Biol. Sci 362, 1601–1613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Hedrick T, and Waters J (2010). Physiological properties of cholinergic and non-cholinergic magnocellular neurons in acute slices from adult mouse nucleus basalis. PLoS ONE 5, e11046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Hodges H, Thrasher S, and Gray JA (1989). Improved radial maze performance induced by the benzodiazepine antagonist ZK 93 426 in lesioned and alcohol-treated rats. Behav. Pharmacol 1, 45–55. [PubMed] [Google Scholar]
  48. Howe MW, and Dombeck DA (2016). Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature 535, 505–510. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Howe MW, Tierney PL, Sandberg SG, Phillips PEM, and Graybiel AM (2013). Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Inagaki HK, Fontolan L, Romani S, and Svoboda K (2019). Discrete attractor dynamics underlies persistent activity in the frontal cortex. Nature 566, 212–217. [DOI] [PubMed] [Google Scholar]
  51. Kamigaki T, and Dan Y (2017). Delay activity of specific prefrontal interneuron subtypes modulates memory-guided behavior. Nat. Neurosci 20, 854–863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Kobayashi S, and Schultz W (2008). Influence of reward delays on responses of dopamine neurons. J. Neurosci 28, 7837–7846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Kopec CD, Erlich JC, Brunton BW, Deisseroth K, and Brody CD (2015). Cortical and Subcortical Contributions to Short-Term Memory for Orienting Movements. Neuron 88, 367–377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Kubota K, and Niki H (1971). Prefrontal cortical unit activity and delayed alternation performance in monkeys. J. Neurophysiol 34, 337–347. [DOI] [PubMed] [Google Scholar]
  55. Lee RS, Mattar MG, Parker NF, Witten IB, and Daw ND (2019). Reward prediction error does not explain movement selectivity in DMS-projecting dopamine neurons. eLife 8, e42992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Lerer BE, and Warner J (1986). Radial Maze Performance Deficits Following Lesions of Rat Basal Forebrain. In Alzheimer’s and Parkinson’s Disease: Strategies for Research and Development, Fisher A, Hanin I, and Lachman C, eds. (Springer), pp. 419–426. [Google Scholar]
  57. Lerner TN, Shilyansky C, Davidson TJ, Evans KE, Beier KT, Zalocusky KA, Crow AK, Malenka RC, Luo L, Tomer R, and Deisseroth K (2015). Intact-Brain Analyses Reveal Distinct Information Carried by SNc Dopamine Subcircuits. Cell 162, 635–647. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Lind JT, and Mehlum H (2010). With or Without U? The Appropriate Test for a U-Shaped Relationship. Oxf. Bull. Econ. Stat 72, 109–118. [Google Scholar]
  59. Liu D, Gu X, Zhu J, Zhang X, Han Z, Yan W, Cheng Q, Hao J, Fan H, Hou R, et al. (2014). Medial prefrontal activity during delay period contributes to learning of a working memory task. Science 346, 458–463. [DOI] [PubMed] [Google Scholar]
  60. Liu Z, Brown A, Fisher D, Wu Y, Warren J, and Cui X (2016). Tissue Specific Expression of Cre in Rat Tyrosine Hydroxylase and Dopamine Active Transporter-Positive Neurons. PLoS ONE 11, e0149379. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Ljungberg T, Apicella P, and Schultz W (1991). Responses of monkey midbrain dopamine neurons during delayed alternation performance. Brain Res 567, 337–341. [DOI] [PubMed] [Google Scholar]
  62. Lovett-Barron M, Chen R, Bradbury S, and Andalman AS (2019). Multiple overlapping hypothalamus-brainstem circuits drive rapid threat avoidance. bioRxiv 10.1101/745075. [DOI] [PMC free article] [PubMed]
  63. Mandel RJ, and Thal LJ (1988). Physostigmine improves water maze performance following nucleus basalis magnocellularis lesions in rats. Psychopharmacology (Berl.) 96, 421–425. [DOI] [PubMed] [Google Scholar]
  64. Mandel RJ, Chen AD, Connor DJ, and Thal LJ (1989). Continuous physostigmine infusion in rats with excitotoxic lesions of the nucleus basalis magnocellularis: effects on performance in the water maze task and cortical cholinergic markers. J. Pharmacol. Exp. Ther 251, 612–619. [PubMed] [Google Scholar]
  65. Matsumoto M, and Takada M (2013). Distinct representations of cognitive and motivational signals in midbrain dopamine neurons. Neuron 79, 1011–1024. [DOI] [PubMed] [Google Scholar]
  66. Mazur JE (1987). An adjusting procedure for studying delayed reinforcement. In Quantitative Analyses of Behavior, Volume 5. The Effect of Delay and of Intervening Events on Reinforcement Value, Commons ML, Mazur JE, Nevin JA, and Rachlin H, eds. (Lawrence Erlbaum Associates), pp. 55–73. [Google Scholar]
  67. Menegas W, Akiti K, Amo R, Uchida N, and Watabe-Uchida M (2018). Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli. Nat. Neurosci 21, 1421–1430. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Miller EK, Lundqvist M, and Bastos AM (2018). Working Memory 2.0. Neuron 100, 463–475. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Murphy BL, Arnsten AF, Goldman-Rakic PS, and Roth RH (1996). Increased dopamine turnover in the prefrontal cortex impairs spatial working memory performance in rats and monkeys. Proc. Natl. Acad. Sci. USA 93, 1325–1329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Musall S, Kaufman MT, Juavinett AL, Gluf S, and Churchland AK (2019). Single-trial neural dynamics are dominated by richly varied movements. Nat. Neurosci 22, 1677–1686. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. O’Reilly RC, and Frank MJ (2006). Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput 18, 283–328. [DOI] [PubMed] [Google Scholar]
  72. Ott T, and Nieder A (2019). Dopamine and Cognitive Control in Prefrontal Cortex. Trends Cogn. Sci 23, 213–234. [DOI] [PubMed] [Google Scholar]
  73. Papp EA, Leergaard TB, Calabrese E, Johnson GA, and Bjaalie JG (2014). Waxholm Space atlas of the Sprague Dawley rat brain. Neuroimage 97, 374–386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Park IM, Meister MLR, Huk AC, and Pillow JW (2014). Encoding and decoding in parietal cortex during sensorimotor decision-making. Nat. Neurosci 17, 1395–1403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, Davidson TJ, Daw ND, and Witten IB (2016). Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat. Neurosci 19, 845–854. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Reardon TR, Murray AJ, Turi GF, Wirblich C, Croce KR, Schnell MJ, Jessell TM, and Losonczy A (2016). Rabies Virus CVS-N2c(DG) Strain Enhances Retrograde Synaptic Transfer and Neuronal Viability. Neuron 89, 711–724. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Richards JB, Mitchell SH, de Wit H, and Seiden LS (1997). Determination of discount functions in rats with an adjusting-amount procedure. J. Exp. Anal. Behav 67, 353–366. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Roesch MR, Calu DJ, and Schoenbaum G (2007). Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci 10, 1615–1624. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Romo R, Brody CD, Hernández A, and Lemus L (1999). Neuronal correlates of parametric working memory in the prefrontal cortex. Nature 399, 470–473. [DOI] [PubMed] [Google Scholar]
  80. Sawaguchi T, and Goldman-Rakic PS (1991). D1 dopamine receptors in prefrontal cortex: involvement in working memory. Science 251, 947–950. [DOI] [PubMed] [Google Scholar]
  81. Schultz W (1986). Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. J. Neurophysiol 56, 1439–1461. [DOI] [PubMed] [Google Scholar]
  82. Schultz W (1998). Predictive reward signal of dopamine neurons. J. Neurophysiol 80, 1–27. [DOI] [PubMed] [Google Scholar]
  83. Schultz W, Apicella P, and Ljungberg T (1993). Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci 13, 900–913. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Schultz W, Dayan P, and Montague PR (1997). A neural substrate of prediction and reward. Science 275, 1593–1599. [DOI] [PubMed] [Google Scholar]
  85. Simon AP, Poindessous-Jazat F, Dutar P, Epelbaum J, and Bassant M-H (2006). Firing properties of anatomically identified neurons in the medial septum of anesthetized and unanesthetized restrained rats. J. Neurosci 26, 9038–9046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Starkweather CK, Babayan BM, Uchida N, and Gershman SJ (2017). Dopamine reward prediction errors reflect hidden-state inference across time. Nat. Neurosci 20, 581–589. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, and Janak PH (2013). A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci 16, 966–973. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Sun Y, Yang Y, Galvin VC, Yang S, Arnsten AF, and Wang M (2017). Nicotinic α4β2 Cholinergic Receptor Influences on Dorsolateral Prefrontal Cortical Neuronal Firing during a Working Memory Task. J. Neurosci 37, 5366–5377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Teles-Grilo Ruivo LM, Baker KL, Conway MW, Kinsley PJ, Gilmour G, Phillips KG, Isaac JTR, Lowry JP, and Mellor JR (2017). Coordinated Acetylcholine Release in Prefrontal Cortex and Hippocampus Is Associated with Arousal and Reward on Distinct Timescales. Cell Rep 18, 905–917. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Tian L, Hires SA, Mao T, Huber D, Chiappe ME, Chalasani SH, Petreanu L, Akerboom J, McKinney SA, Schreiter ER, et al. (2009). Imaging neural activity in worms, flies and mice with improved GCaMP calcium indicators. Nat. Methods 6, 875–881. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Torres EM, Perry TA, Blokland A, Wilkinson LS, Wiley RG, Lappi DA, and Dunnett SB (1994). Behavioural, histochemical and biochemical consequences of selective immunolesions in discrete regions of the basal forebrain cholinergic system. Neuroscience 63, 95–122. [DOI] [PubMed] [Google Scholar]
  92. Turner JJ, Hodges H, Sinden JD, and Gray JA (1992). Comparison of radial maze performance of rats after ibotenate and quisqualate lesions of the forebrain cholinergic projection system: effects of pharmacological challenge and changes in training regime. Behav. Pharmacol 3, 359–373. [PubMed] [Google Scholar]
  93. Vijayraghavan S, Wang M, Birnbaum SG, Williams GV, and Arnsten AFT (2007). Inverted-U dopamine D1 receptor actions on prefrontal neurons engaged in working memory. Nat. Neurosci 10, 376–384. [DOI] [PubMed] [Google Scholar]
  94. Watanabe M, Kodama T, and Hikosaka K (1997). Increase of extracellular dopamine in primate prefrontal cortex during a working memory task. J. Neurophysiol 78, 2795–2798. [DOI] [PubMed] [Google Scholar]
  95. Wenk GL, Stoehr JD, Quintana G, Mobley S, and Wiley RG (1994). Behavioral, biochemical, histological, and electrophysiological effects of 192 IgG-saporin injections into the basal forebrain of rats. J. Neurosci 14, 5986–5995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Williams GV, and Goldman-Rakic PS (1995). Modulation of memory fields by dopamine D1 receptors in prefrontal cortex. Nature 376, 572–575. [DOI] [PubMed] [Google Scholar]
  97. Williams SM, and Goldman-Rakic PS (1998). Widespread origin of the primate mesofrontal dopamine system. Cereb. Cortex 8, 321–345. [DOI] [PubMed] [Google Scholar]
  98. Witten IB, Steinberg EE, Lee SY, Davidson TJ, Zalocusky KA, Brodsky M, Yizhar O, Cho SL, Gong S, Ramakrishnan C, et al. (2011). Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721–733. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Zahrt J, Taylor JR, Mathew RG, and Arnsten AF (1997). Supranormal stimulation of D1 dopamine receptors in the rodent prefrontal cortex impairs spatial working memory performance. J. Neurosci 17, 8528–8535. [DOI] [PMC free article] [PubMed] [Google Scholar]

Data Availability Statement

The datasets and code supporting the current study are available from the Lead Contact, I.B.W., upon reasonable request.
