Abstract
There is compelling evidence that midbrain dopamine (DA) neurons and their projections to the ventral striatum provide a mechanism for motivating reward-seeking behavior, and for utilizing information about unexpected rewards (prediction errors, RPEs) to guide behavior based on current, rather than historical, outcomes. When this mechanism is compromised in addictions, it may produce patterns of maladaptive behavior that remain obdurate in the face of contrary information and even adverse consequences. Nonetheless, DAergic contributions to performance on behavioral tasks that rely on an ability to flexibly update stimulus-reward relationships remain incompletely understood. In the current study, we used a discrimination and reversal paradigm to monitor subsecond DA release in the mouse nucleus accumbens core (NAc) using in vivo fast-scan cyclic voltammetry (FSCV). We observed post-choice elevations in phasic NAc DA release; however, increased DA transients were only evident during early reversal when mice made responses at the newly-rewarded stimulus. Based on this finding, we used in vivo optogenetic (halorhodopsin (eNpHR3.0)) photosilencing and (Channelrhodopsin2 (ChR2)) photostimulation to assess the effects of manipulating VTA-DAergic fibers in the NAc on reversal performance. Photosilencing the VTA→NAc DAergic pathway during early reversal increased errors, while photostimulation did not demonstrably affect behavior. Taken together, these data provide additional evidence of the importance of NAc DA release as a neural substrate supporting adjustments in learned behavior after a switch in expected stimulus-reward contingencies. These findings have possible implications for furthering understanding of the role of DA in the persistent, maladaptive decision-making that characterizes addictions.
Keywords: mouse, reversal learning, voltammetry, optogenetics, prediction error
Graphical Abstract

Subsecond dopamine (DA) release in the nucleus accumbens core (NAc) was monitored using fast-scan cyclic voltammetry during discrimination and reversal learning. Post-choice elevations in phasic DA release were observed during early reversal while photosilencing of VTA-DAergic fibers disrupted performance. These data provide additional evidence of the importance of NAc DA release as a neural substrate supporting learned behavior after a switch in expected stimulus-reward contingencies.
Introduction
Forming associations between cues and rewards is fundamental to the ability to understand and successfully engage with the environment but this process can become dysfunctional in neuropsychiatric disorders. A growing literature indicates that the activity of midbrain dopamine (DA) neurons provides a key neuromodulatory substrate for this form of learning. For instance, some, although not all, studies have shown that rodents and nonhuman primates will develop a place preference or acquire an operant response for photoexcitation of DA neurons in the midbrain, ventral tegmental area (VTA) or substantia nigra pars compacta (SNc) (Tsai et al., 2009; Adamantidis et al., 2011; Witten et al., 2011; Kim et al., 2012; Rossi et al., 2013; Ilango et al., 2014a; Ilango et al., 2014b; Pascoli et al., 2015; Gremel & Lovinger, 2016; Stauffer et al., 2016). Furthermore, building upon classic electrical self-stimulation work (Olds, 1958; Howarth & Deutsch, 1962; Bozarth & Wise, 1981), optogenetic studies have shown that photostimulating putative DA neurons can support operant learning (Witten et al., 2011). In addition, in vivo single-unit recordings and BOLD measures in the human VTA (D’Ardenne et al., 2008) have detected patterns of activity in DA neurons consistent with a reward prediction error (RPE): such that activity increases in response to novel or unexpected rewards and decreases when expected rewards are omitted (Ljungberg et al., 1992; Schultz et al., 1997; Hollerman & Schultz, 1998; Ungless et al., 2004; Bayer & Glimcher, 2005; Pan et al., 2005; Roesch et al., 2007; Matsumoto & Hikosaka, 2009; Eshel et al., 2015).
Despite this compelling evidence, the precise contribution of specific DAergic circuits to how behaviors directed at reward-predictive stimuli develop and adapt remains incompletely understood (Berke, 2018). Of the various forebrain regions that are innervated by VTA-DA neurons, transient calcium signals and phasic DA release within the nucleus accumbens (NAc) reliably coincide with responding for rewards or the presentation of unexpected rewards or cues that predict reward and anticipate outcomes (Phillips et al., 2003b; Roitman et al., 2004; Day et al., 2007; Owesson-White et al., 2008; Owesson-White et al., 2009; Day et al., 2010; Nasrallah et al., 2011; Sugam et al., 2012; Clark et al., 2013; Wassum et al., 2013; Hart et al., 2014; Ostlund et al., 2014; Saddoris et al., 2015; Parker et al., 2016; Saddoris et al., 2017). Fluctuations in NAc DA are consistent with the known role of the NAc in cue-invigoration of instrumental responding for reward, and in the signaling of RPEs (Wyvell & Berridge, 2000; Berridge, 2007; Lex & Hauber, 2008; Pecina & Berridge, 2013; Wassum et al., 2013; Ostlund et al., 2014; Klanker et al., 2015; Aitken et al., 2016; Collins et al., 2016a; Corbit & Balleine, 2016; Klanker et al., 2017).
Supporting the behavioral relevance of these NAc-DA transients, DA D1R antagonists delivered into the NAc prevent VTA-driven conditioned place preference (Lammel et al., 2012) and social interaction (Gunaydin et al., 2014), and reduce the willingness of animals to self-stimulate VTA-NAc DA neurons (Cheer et al., 2007; Steinberg et al., 2014). Prior work has also shown that DA receptor agonists and antagonists infused into the NAc, or more specifically the NAc core (but not shell), bidirectionally modulate the DA-related ‘blocking’ learning effect (Iordanova et al., 2006; Li & McNally, 2015), while other studies report no effect on spatial reversal (Haluk & Floresco, 2009) or reversal of a cued operant response (Calaminus & Hauber, 2007). Moreover, an elegant recent report from Parker and colleagues showed that optogenetically silencing VTA neurons impaired probabilistic spatial reversal learning, paralleling increased calcium signals from VTA/SN neurons to the NAc in the same task (Parker et al., 2016).
To corroborate and extend the current literature, the goal of the present study was to first establish the role of phasic striatal DA release in a behavioral assay for choice-learning and choice-flexibility, and then causally interrogate the necessity of these circuits for flexible choice. A touchscreen-based pairwise visual discrimination/reversal task was employed (Brigman et al., 2009; Brigman et al., 2013; Bergstrom et al., 2018) as a behavioral preparation that offers the ability to probe the contribution of a DA-RPE signal in new choice learning (during discrimination), as well as flexible choice (re)learning driven by an explicit violation of reward prediction after stimulus-reward contingencies are reversed (Izquierdo et al., 2016). In vivo fast-scan cyclic voltammetry (FSCV) was used to monitor transient DA release in the NAc as mice made touchscreen choices to obtain a food reward. On the basis of the FSCV results obtained, experiments were then undertaken using pathway-specific, bidirectional in vivo photomanipulations to assess whether post-correct-choice stimulation or silencing of VTA DA input to the NAc affected choice flexibility after a stimulus-reward contingency reversal.
Materials and Methods
Subjects
Male C57BL/6J (RRID: IMSR_JAX:000664) and C57BL/6J-background DAT-Cre transgenic mice (B6.SJL-Slc6a3tm1.1(cre)Bkmn/J, RRID: IMSR_JAX:006660) were obtained from The Jackson Laboratory (Bar Harbor, ME, USA). DAT-Cre mice used for testing were heterozygous mutants generated by crossing heterozygous transgenic males with C57BL/6J females. Male and female C57BL/6J-background D2DAT null mutant mice (and D2R-floxed/DAT-IRES-Cre-littermate controls) were generated in-house by intercrossing homozygous D2-floxed females with hemizygous DAT-IRES-Cre males to excise D2R on DA neurons, as previously described (Bello et al., 2011; Holroyd et al., 2015). Mice were group-housed prior to surgery then single-housed (to maintain the integrity of intra-cranial implants) in a temperature- and humidity-controlled vivarium under a 12-hour light/dark cycle (lights on 0600 h). Experimental procedures were performed in accordance with the National Institutes of Health Guide for Care and Use of Laboratory Animals and approved by the local NIAAA Animal Care and Use Committee. The number of mice used in each experiment is indicated in the figure legends.
Discrimination and reversal task
Testing procedures were based on those previously reported (Brigman et al., 2009; Brigman et al., 2013) and used the Bussey-Saksida Touch Screen System (model 80614, Lafayette Instruments, Lafayette, IN, USA). The apparatus comprises a touch-sensitive screen on one wall of the chamber and a reward-delivery magazine on the opposite wall of the chamber (Figure 1A).
Figure 1: Experimental set-up and behavioral readouts from a mouse touchscreen-based visual discrimination and reversal paradigm.
(A) Mice were trained to reliably discriminate between two visual stimuli by making a choice on a touchscreen to obtain a food reward. The stimulus-reward contingencies were then reversed to test for choice flexibility. During both discrimination and reversal, mice initiated trials with a nose poke into the reward magazine. The task was self-guided, meaning unlimited time was allowed to initiate a trial, make a choice on the touchscreen, and collect a reward. A correct choice was followed by a pellet delivery and a 2-s tone while an incorrect choice was followed by a 15-s timeout during which new trials could not be initiated. (B-C) Phasic dopamine transients were recorded in the NAc. (D) Recordings were conducted on the first (early) and last (late) discrimination and reversal sessions and during the reversal session after mice attained 50% correct responses (mid). (E) Choice accuracy significantly improved from early (n=7) to late (n=7) discrimination performance stages, then decreased at early reversal (n=8) before increasing over mid (n=6) and late reversal (n=5) stages. (F) Choice errors decreased from early to late discrimination, were significantly higher at early reversal and then decreased again over mid to late reversal. (G) Latency to make a choice did not differ between sessions. (H) Latency to retrieve the reward did not differ between sessions. Data are means ± SEM. **P<.01 versus early discrimination, ††P<.05 versus late discrimination, ##P<.01 versus early reversal.
Prior to testing, body weight was reduced to 85% of free-feeding weight and maintained at that level throughout testing to motivate responding. Rewards were 14 mg food pellets (#F05684, BioServ, Frenchtown, NJ, USA), provided first in the home cage and then in the test chamber for 30 minutes in order to acclimate mice to the training environment and the reward (~10 reward pellets/mouse). Mice were next trained to associate the dispensing of reward with simultaneous presentation of a 2-second, 65-dB tone and illumination of the magazine light. Forty pellets were randomly dispensed and mice were required to consume all pellets in a 30-minute testing session.
There were 3 phases of pre-training. In phase 1, the mouse was required to initiate each trial with a head entry into the food magazine upon illumination, which extinguished the magazine light and resulted in the appearance of a 6.5 cm2 stimulus (selected randomly from a catalogue) in 1 of the 2 touchscreen windows for 10 seconds. Immediately after the choice, a 1-pellet reward was delivered, in conjunction with the tone and magazine light cues. The criterion for phase 1 was earning 30 pellets in <30 minutes. In phase 2, the mouse was required to touch the window containing the stimulus to receive a reward and the stimulus remained on the screen until the mouse made a response. The criterion for phase 2 was earning 30 pellets in <30 minutes. Phase 3 was the same as phase 2 with the exception that, to discourage indiscriminate touchscreen responding, touches at a blank window led to a 15-second ‘timeout’ period during which the house light was extinguished and a new trial could not be initiated. Following a blank-screen response, a ‘correction trial’ was given in which the same stimulus was presented in the same window. The next trial proper could not begin until a stimulus-window response was made on a correction trial. There were 30 trials (excluding any correction trials) per session/day. The criterion for phase 3 was touching the stimulus-containing screen on >75% of trials.
Discrimination.
Discrimination involved the simultaneous presentation of 2 novel 6.5 cm2 stimuli (‘fan’ and ‘marbles’) in a spatially randomized manner immediately on trial initiation (Figure S1). Stimuli remained on the screen until the mouse made a response (with no maximum cutoff time) or the session ended (max 60 minutes). Responses at the ‘fan’ stimulus produced a food reward and corresponding 2-second tone/light cue at a continuous rate of reinforcement. Trials with a correct response had a 5-second inter-trial interval after delivery of the reward. Responses at the ‘marbles’ stimulus (=‘errors’) produced no food reward and a 15-second ‘timeout’ period. Each error was followed by a correction trial in which the 2 stimuli were presented in the same spatial configuration. The next trial proper could not begin until a correct response was made on a correction trial. Mice were given 30 trials proper per session (1 session per day). Percent correct performance was calculated as the number of correct choices out of the 30 trials proper. Training continued until a performance criterion of >85% correct responses was attained on 2 consecutive sessions. All trials were self-initiated. Reward was delivered in conjunction with the tone and magazine light cues.
Reversal.
For reversal, the designation of stimuli as correct versus incorrect was reversed for each mouse (correct = marbles, error = fan). Training continued on the new contingency until a performance criterion of >85% correct responses (out of the 30 trials proper) was attained on 2 consecutive sessions. All trials were self-initiated. Reward was delivered in conjunction with the tone and magazine light cues.
The following dependent measures were analyzed to quantify discrimination and reversal performance (Graybeal et al., 2011; Brigman et al., 2013): percent correct responding ((correct responses/30 trials proper)*100), total errors, latency to touch a stimulus (timed from the appearance of the stimuli), and the latency to collect the reward (timed from a correct touch). In addition, the trial-by-trial sequence of responses (both trials proper and correction trials) was analyzed according to whether an error was followed by another error (lose→stay) or a correct response (lose→shift), and whether a correct response was followed by another correct response (win→stay) or an error (win→shift). For FSCV, the dependent measures were analyzed on specific stages of testing (Brigman et al., 2013) corresponding to the sessions when DA was monitored: the first discrimination session (=early discrimination), the next discrimination session after attainment of 85% correct performance (=late discrimination), the first reversal session (=early reversal), the next reversal session after attainment of 50% correct performance (=mid reversal), and the next reversal session after attainment of 85% correct performance (=late reversal). For optogenetic experiments, the dependent measures were analyzed on the first session of reversal, when the photomanipulations occurred.
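As an illustration only, the percent-correct and win/lose stay/shift measures described above can be sketched as follows. The function names and the boolean encoding of responses (True = correct, False = error) are assumptions made for this example and are not taken from the touchscreen software or the authors' analysis scripts.

```python
def percent_correct(n_correct, n_trials_proper=30):
    """Percent correct over the trials proper in one session."""
    return 100.0 * n_correct / n_trials_proper


def classify_transitions(outcomes):
    """Count response-to-response transitions within a session.

    `outcomes` is the ordered list of responses (trials proper and
    correction trials); True = correct, False = error (assumed encoding).
    """
    counts = {"win_stay": 0, "win_shift": 0, "lose_stay": 0, "lose_shift": 0}
    for prev, curr in zip(outcomes, outcomes[1:]):
        if prev and curr:
            counts["win_stay"] += 1      # correct followed by correct
        elif prev and not curr:
            counts["win_shift"] += 1     # correct followed by error
        elif not prev and not curr:
            counts["lose_stay"] += 1     # error followed by error
        else:
            counts["lose_shift"] += 1    # error followed by correct
    return counts


# Hypothetical response sequence, not real data.
session = [False, False, True, True, False, True]
print(classify_transitions(session))
print(percent_correct(24))
```

Each transition is scored from the outcome of the preceding response, so a session of n responses yields n − 1 classified transitions.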
FSCV monitoring of phasic DA release during behavior
Electrode construction.
Recording electrodes were custom-made according to methods previously described (Clark et al., 2010). Carbon fibers (Goodfellow Corporation, Devon, PA, USA) were inserted into cut segments of fused silica tubing (NAc: 5.5 mm; Polymicro Technologies, Phoenix, AZ, USA) while submerged in isopropyl alcohol. The silica segments were then sealed on one end using a 2-part epoxy (T-QS12 Epoxy, Super Glue, Ontario, CA, USA) and the exposed carbon fiber was cut to 150 μm. Silver epoxy was then used to attach a silver connector to the other side of the tubing. After drying overnight, a coat of 2-part epoxy was applied and the entire assembly was allowed to dry. Electrode connectors were assembled from stainless steel electrodes (MS303–2-A-20mm, Plastics One, Roanoke, VA, USA) by soldering one wire to a gold pin and sealing with clear epoxy and the other wire to a silver wire (#782500, A-M Systems, Sequim, WA, USA) and painting with silver epoxy. A 1–2 mm segment of silver wire left protruding from the silver epoxy was chlorinated by soaking in undiluted bleach overnight to form the Ag/AgCl reference electrode.
Electrode implantation.
C57BL/6J mice had electrodes implanted under isoflurane anesthesia (maintained at 2–5% by a precision vaporizer and delivered through a nose cone) using a stereotaxic alignment system (Kopf Instruments, Tujunga, CA, USA). The reference, attached to a connector, was first implanted and fixed with dental cement. A chronic recording electrode was then implanted in the contralateral NAc (AP: +1.0 mm, ML: ±1.2 mm from bregma and DV: −3.7 mm from the brain’s surface). The electrode was fixed to the skull with jeweler’s screws and dental cement (Grip Cement, Dentsply, Milford, DE, USA). Finally, the recording electrode was attached to the connector and cemented into place. Mice were kept on a warm water blanket during and after surgery and body temperature was monitored. After surgery, mice were placed in a clean cage and ketoprofen (5 mg/kg, s.c.) was administered for three days. Mice were allowed to recover for at least 1 week before beginning food restriction and then behavioral training, as described above. The first recording was made 4–6 weeks post-surgery.
DA recordings.
On the day of recording, electrodes were conditioned by applying a triangular waveform (−0.4 to +1.3 V) to the implanted carbon fiber at 60 Hz and then 10 Hz for at least 1 hour. As compared to acute glass electrodes, chronic electrodes require a longer time of exposure to the applied potential at a high frequency to overcome the loss of sensitivity that arises following chronic implantation (Phillips et al., 2003a; Heien et al., 2005). Recordings were made at a rate of 10 Hz. To ensure DA release could be detected prior to testing, a DA transient was visually examined in response to presentation of a food pellet (Figure S1). Recordings took place on each of 5 test sessions representing performance milestones in discrimination and reversal learning, as described above (Figure 1D). Due to failure of the electrodes in some mice by the latter part of testing, DA recordings were obtained in fewer mice at the later reversal stages.
On the completion of testing, electrode integrity was assessed by measuring DA transients in the home cage 5 minutes after intraperitoneal injection of a cocktail (10 mL/kg body weight, in saline vehicle) comprising 15 mg/kg of cocaine (Sigma Aldrich, St. Louis, MO, USA) and 2 mg/kg of the D2R antagonist, sulpiride. To verify electrode placements, a subset of mice was anaesthetized with isoflurane and an electrolytic lesion was made by passing current through the electrode. The next day, mice were deeply anaesthetized with a ketamine (100 mg/kg)/xylazine (10 mg/kg) cocktail, transcardially perfused with phosphate buffered saline, then 4% paraformaldehyde. The brain was removed and suspended in 4% paraformaldehyde overnight and then 0.1M phosphate buffer at 4°C for 1–2 days. Fifty-μm coronal sections were cut with a vibratome (Classic 1000 model, Vibratome, Bannockburn, IL, USA) and sections examined with the aid of an Olympus BX41 microscope (Olympus, Center Valley, PA, USA). Mice with incorrect electrode placement were removed from the analysis (for electrode placement locations, see Figure 1C).
Analysis.
FSCV data were analyzed using the High Definition Cyclic Voltammetry (HDCV) software suite (Wightman Lab, University of North Carolina, Chapel Hill, NC, USA) (Bucher et al., 2013). DA was isolated from voltammetric data using principal component analysis (Bucher et al., 2013) and converted to concentration using a calibration factor of 10 nA/μM, which gives an average of 10 nA current for 1 μM DA solution. This is the average calibration factor of chronic electrodes used previously in our laboratory (e.g., Mateo et al., 2017), as well as other laboratories (e.g., Willuhn et al., 2014) that use the same voltammetric input waveform (−0.4V-1.3V) as used here. A standard training set for DA and pH was built from spontaneous transients elicited by administration of the aforementioned, post-testing, cocaine/sulpiride cocktail. This was used to set the background subtraction for each mouse. The pH values obtained from the training set were accounted for in the chemometric analysis. Potential drift was not corrected for.
Digital input/output DA signals were recorded and aligned over the period from 5 seconds before, through the 30 seconds after, either 1) a correct response was made on the touchscreen, 2) an error was made on the touchscreen, or 3) a head entry was made into the magazine to collect a reward after a correct response was made. In addition, DA responses were calculated as the average DA signal 1) from a correct response through to reward collection, 2) over the 15 seconds after an error response, and 3) over the 10 seconds after reward collection. DA concentrations were smoothed with a 5-point running average and averaged across mice.
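The conversion, smoothing and event-alignment steps described above can be illustrated with a minimal sketch. This is not the HDCV pipeline: the function names, the treatment of the window edges in the running average, and the choice to express each trace as change from its value at the event sample are assumptions made for the example. Only the 10 nA/μM calibration factor, the 10 Hz sampling rate, the 5-point window and the −5 to +30 second alignment window come from the text.

```python
CAL_NA_PER_UM = 10.0  # calibration factor from the text: 10 nA per 1 uM DA
SAMPLE_HZ = 10        # recordings were made at 10 Hz


def current_to_nM(current_nA):
    """Convert DA-attributed current (nA) to concentration (nM)."""
    return [1000.0 * i / CAL_NA_PER_UM for i in current_nA]


def running_average(x, width=5):
    """Centered running average; the window shrinks at the edges (assumption)."""
    half = width // 2
    out = []
    for k in range(len(x)):
        lo, hi = max(0, k - half), min(len(x), k + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def align_to_event(trace, event_idx, pre_s=5, post_s=30):
    """Cut the window from pre_s before to post_s after an event sample.

    Values are returned as change from the value at the event sample.
    Returns None if the window falls outside the recording.
    """
    start = event_idx - pre_s * SAMPLE_HZ
    stop = event_idx + post_s * SAMPLE_HZ
    if start < 0 or stop > len(trace):
        return None
    ref = trace[event_idx]
    return [v - ref for v in trace[start:stop]]
```

In this sketch, a trace would be converted with `current_to_nM`, smoothed with `running_average`, and then cut around each choice or head-entry sample with `align_to_event` before averaging across trials and mice.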
VTA→NAc DA neuronal photosilencing and photostimulation
Surgical procedures.
For photosilencing, DAT-Cre+ mice and DAT-Cre- mice had rAAV5/Ef1a-DIO-eNpHR3.0-EYFP (titer 3.2 × 10^12 GC/mL, UNC Vector Core, Chapel Hill, NC, USA, RRID: SCR_002448) bilaterally infused (0.35 μL/hemisphere) into the VTA (AP: −2.9 mm, ML: ±0.4 mm, DV: −4.55 mm from bregma) over 10 minutes using a Hamilton syringe and 33-gauge needle. The needle was left in place for a further 10 minutes to ensure diffusion. For photostimulation, DAT-Cre+ mice and DAT-Cre- mice were infused with rAAV5/Ef1a-DIO-hChR2(H134R)-EYFP (titer 5.5 × 10^12 GC/mL, UNC Vector Core) into the VTA. During the same surgery for viral infusion, ceramic ferrules were bilaterally implanted 0.2 mm above the NAc (AP: +1.2 mm, ML: ±1.3 mm, DV: −4.3 mm from bregma) to direct optical fibers. Ferrules were secured to the skull with jeweler’s screws, cyanoacrylate and acrylic cement. Each fiber optic ferrule assembly consisted of a 200 μm diameter multimode fiber (0.39 NA, Thorlabs, Newton, NJ, USA) contained within a 230 μm diameter ceramic ferrule (Precision Fiber Products, South Hillview Milpitas, CA, USA).
In vivo photomanipulations.
Beginning 1 week after surgery, DAT-Cre mice were tested through to discrimination criterion, without any photomanipulation. On the first session of reversal testing (>7 weeks after virus injection), light was delivered immediately after each correct response (light on at a correct touchscreen response, light off at reward collection). For eNpHR3.0-expressing mice, green light was delivered continuously during the correct-to-reward period through 65.5 μm optical fibers (NA 0.37; Thorlabs, Newton, NJ, USA) coupled to a 200-mW, 532-nm laser system (Opto Engine, Midvale, UT, USA) interfaced with the touchscreen software to deliver TTL pulses to a laser driver. For ChR2-expressing mice, blue light was delivered in 5-millisecond pulses at 20 Hz through a 200-mW, 433-nm laser system (Opto Engine) that was also interfaced with the touchscreen software to deliver TTL pulses to a laser driver. Laser power at the tip of the fiber was calibrated before each test session using a power meter (PM20, Thorlabs, Newton, NJ, USA) and adjusted to achieve 9 mW in the case of eNpHR3.0 and 8 mW in the case of ChR2.
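To make the ChR2 stimulation schedule concrete, the following sketch computes the onset times of 5-millisecond pulses delivered at 20 Hz between a correct touch and reward collection. The function names and the rule that a pulse is emitted only if it finishes before reward collection are assumptions for illustration; the actual TTL logic of the touchscreen software is not described in the text.

```python
def pulse_onsets(t_choice, t_collect, freq_hz=20.0, pulse_ms=5.0):
    """Onset times (s) of light pulses between choice and reward collection.

    A pulse is included only if it ends at or before t_collect
    (an assumption about how the train is terminated).
    """
    period = 1.0 / freq_hz
    width = pulse_ms / 1000.0
    onsets = []
    k = 0
    while t_choice + k * period + width <= t_collect:
        onsets.append(t_choice + k * period)
        k += 1
    return onsets


def duty_cycle(freq_hz=20.0, pulse_ms=5.0):
    """Fraction of time the light is on during the pulse train."""
    return freq_hz * pulse_ms / 1000.0


# Hypothetical trial: correct touch at 0 s, reward collected 1 s later.
train = pulse_onsets(0.0, 1.0)
print(len(train), duty_cycle())
```

With these parameters the light is on for one tenth of the stimulation period, in contrast to the continuous illumination used for eNpHR3.0.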
After photomanipulations were conducted on the first session of reversal, subsequent reversal testing continued through to criterion without additional light delivery. Then, on the session after criterion was attained, light was shone, as above, during the period from trial initiation through to choice in order to test for any general (e.g., motoric) performance disturbances produced by photosilencing or photostimulating.
Viral expression verification.
At the completion of testing, mice were deeply anaesthetized with a ketamine (100 mg/kg)/xylazine (10 mg/kg) cocktail before transcardial perfusion with phosphate buffered saline, then 4% paraformaldehyde. The brain was removed and suspended in 4% paraformaldehyde overnight and then 0.1M phosphate buffer at 4°C for 1–2 days. Fifty-μm coronal sections were cut with a vibratome (Classic 1000 model, Vibratome, Bannockburn, IL, USA). Sections were coverslipped with Vectashield Hardset mounting medium and DAPI (Vector Laboratories, Inc., Burlingame, CA, USA) and examined with the aid of an Olympus BX41 microscope. One mouse was adjudged to have insufficient viral expression and was removed from the analysis.
To estimate the percentage of virally infected DA neurons in the VTA, sections from a subset of Cre- and Cre+/eNpHR3.0-expressing mice (n=5 per group) were obtained, as above. Given the poor somatic labeling typical of ChR2-expressing neurons, we did not attempt to quantify expression in this group. Sections were incubated with rat anti-DAT (cat # MAB369, 1:2000, Millipore, Billerica, MA, USA) and chicken anti-GFP (cat # ab13970, 1:2000, Abcam, Cambridge, MA, USA) antibodies over 48 hours in blocking solution at 4 degrees Celsius. The sections were then rinsed in PBS, incubated with goat anti-rat IgG H&L Alexa Fluor 555 (cat # A21434, 1:1000, Life Technologies, Waltham, MA, USA) and goat anti-chicken IgG H&L Alexa Fluor 488 (cat # A11039, 1:1000, Life Technologies) for 2 hours, and rinsed again in PBS. Afterwards, the sections were counterstained with Hoechst 33342 (5 μg/mL, Life Technologies) and coverslipped with Vectashield Hardset mounting medium (Vector Laboratories). To quantify the proportion of DAT+/YFP+ cells, 1–4 sections were imaged per mouse at 20× in the VTA at coordinates corresponding to the injection site (−2.9 mm anteroposterior to bregma). A total of 160 randomly selected DAT-labeled cells in Cre+ mice and 116 DAT-labeled cells in Cre- mice were counted and, of these, the number that were also EYFP-labeled was counted to estimate the percentage of the DAT-labeled population that was virally transfected.
Ex vivo FSCV photostimulation verification.
A test-naïve cohort of DAT-Cre mice expressing ChR2 in VTA-NAc neurons was sacrificed and the brain cooled in ice-cold, oxygenated (95% O2, 5% CO2) modified Krebs buffer containing (in mM): NaCl 126, KCl 2.5, NaH2PO4 1.2, CaCl2 2.4, MgCl2 1.2, NaHCO3 25, Glucose 11, HEPES 20, L-ascorbic acid 0.4, with pH adjusted to 7.4. Coronal slices containing the NAc were prepared with a vibrating tissue slicer (Leica Instruments, Wetzlar, Germany). Slices were placed in an interface chamber and continually perfused (2 mL/minute) with artificial cerebrospinal fluid (ACSF) containing (in mM): NaCl 126, KCl 2.5, NaH2PO4 1.2, CaCl2 2.4, MgCl2 1.2, NaHCO3 25, Glucose 11, HEPES 20, L-ascorbic acid 0.4, pH 7.4, temperature 32 degrees Celsius.
FSCV recordings of blue light-evoked DA release were performed using a glass-encased cylindrical carbon fiber (r = 3.5 μm, 100 μm exposed length; Hexcel Corporation, Stamford, CT, USA) placed in the NAc at a site expressing EYFP. Optical stimulation (20 Hz, 5 millisecond pulse width, ~2 mW) was delivered through a water-immersion 40× microscope objective to achieve whole-field illumination. The light source was a single-wavelength (470 nm) LED system (CoolLED pE-100, Andover, UK). Light pulse duration and timing were controlled by a Master-8 (A.M.P.I., Jerusalem, Israel). DA release was monitored using FSCV by applying a triangular waveform optimized for ex vivo recordings (−0.4 to +1.2 V at 400 V/s) at 10 Hz to the carbon fiber (Heien et al., 2003).
Genetic model of augmented striatal DA release
D2DAT mutants, along with their littermate floxed controls, were tested for discrimination and reversal, as above. Following the completion of reversal learning, mice were tested for locomotor behavior in a 45 × 45 cm open field for 30 minutes and total distance traveled measured using Ethovision video tracking software (Noldus, Leesburg, VA, USA).
Statistics
Behavioral and neurochemical data were analyzed using analysis of variance (ANOVA) followed by Holm-Sidak post hoc tests or Student’s t-tests, with Bonferroni correction for multiple comparisons, where applicable.
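For readers unfamiliar with the two multiplicity corrections named above, the following is a minimal sketch of how Bonferroni and Holm-Sidak adjusted P values can be computed from a set of raw P values. It is a generic illustration written for this purpose, not the statistical software used in the study, and the example P values are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted P values: each P multiplied by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm_sidak(pvals):
    """Holm-Sidak step-down adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Sidak correction over the (m - rank) hypotheses still in play.
        adj = 1.0 - (1.0 - pvals[i]) ** (m - rank)
        # Enforce monotonicity: adjusted P cannot decrease down the ordering.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw P values from three post hoc comparisons.
raw = [0.01, 0.04, 0.03]
print(bonferroni(raw))
print(holm_sidak(raw))
```

The step-down procedure is never more conservative than Bonferroni, because each successive comparison is corrected only for the hypotheses not yet tested.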
Results
Behavioral performance
Mice took 12.6 ± 1.1 sessions to attain late discrimination criterion from the start of discrimination training, 6.0 ± 1.2 sessions to attain mid reversal criterion and 10.3 ± 2.1 sessions to attain late reversal from the start of reversal training. None of the mice tested failed to attain the discrimination and reversal criteria.
Examination of behavioral performance across the 5 major stages of discrimination and reversal (Figure 1D) showed that the number of correct choices made, expressed as a percentage of the 30 trials proper per session, significantly increased from early to late discrimination (P=.0051), then decreased on early reversal (P<.0001) before progressively increasing again by mid reversal (P=.0004) and finally attaining criterion levels at late reversal (one-way ANOVA effect of stage: F4,28=15.75, P<.0001) (Figure 1E). Total errors showed the inverse pattern, with significant decreases in errors across early to late discrimination (P=.0006), then a marked increase at early reversal (P<.0001) followed by decreases over mid and late reversal (P<.0001) (one-way ANOVA effect of stage: F4,28=23.43, P<.0001) (Figure 1F). Total responses at each performance stage followed the same trends as the stage-wise changes in error rates (F4,28=10.78, P<.0001, data not shown). Finally, there was no overall effect of stage on latency to respond (ANOVA effects of stage: F4,28=1.48, P=.2341) (Figure 1G) or latency to collect a reward (ANOVA effects of stage: F4,28=1.66, P=.1865) (Figure 1H).
These behavioral data confirm that discrimination and reversal performance in electrode-implanted C57BL/6J mice is comparable with that in prior studies using this mouse strain and task (Izquierdo et al., 2006; Graybeal et al., 2011; Brigman et al., 2013), although latencies to respond and to collect reward tended to be longer here than in earlier studies. This is likely due to the large head-cap and tethering necessary for recordings, which may slow movement generally, even compared to the lighter head-caps used for in vivo optogenetics.
Phasic DA correlates of behavior in the NAc
Next, DAergic correlates of performance in the NAc were investigated using in vivo FSCV DA recordings. When the recording data were aligned to the time of response, an electrochemical signal consistent with DA was detected in the NAc during the period following a correct response (on early reversal) (Figure 2A). Electrode placement was confirmed in a subset of mice (the remaining electrodes were either used for ex vivo calibration or resulted in poor lesions) (Figure 1C) and DA release in those animals was found to be similar to subjects with unconfirmed electrode placement (average [DA] across all sessions in electrodes with verified placements: 21.5 ± 32.1 nM vs. unverified: 14.4 ± 5.1 nM). Moreover, there was an increase in DA transients in response to systemic administration of a cocktail of cocaine and sulpiride at the completion of testing (Figure 2B). This confirms that the ability to record DA was maintained through to the final stage of behavioral testing, with the caveat that the drug-evoked measure was likely stronger than behaviorally-related DA transients.
Figure 2: DA transients in the NAc associated with discrimination and reversal.
(A) Dopamine transients were recorded in the NAc. Example from one trial during early reversal. (B) Electrode functionality was verified after the last reversal session by recording spontaneous DA transients after a systemic injection of a cocaine + sulpiride cocktail. DA CV reflects time of peak DA concentration (arrow). (C) Average DA responses during the correct choice-to-reward period were significantly greater during early reversal, relative to a 5-second pre-choice baseline. (D) Large increases in DAergic responses in the NAc were observed after correct choices made during early reversal, but not other stages of reversal or discrimination. The y-axis represents Δ[DA] from the time of choice. Arrows denote average time of reward collection. (E) Average DA responses averaged over a 15-second post-error (lights off time-out) period were unaltered at any stage. (F) DA responses did not change following choice errors at any performance stage. The y-axis represents Δ[DA] from the time of choice. n=5–8. Data are means ± SEM. **P<.01 vs. pre-choice baseline.
Comparison of choice-aligned NAc DA transients across task stages revealed a significant increase after correct responses were made (relative to a 5-second pre-response baseline) during early reversal (Bonferroni corrected paired t-test: t7=4.56, P=.0026), but not any other performance stage (Figure 2C, D). By contrast, there was no significant change in DA transients aligned to errors, at any stage of behavior (Figure 2E, F). In follow-up analyses, there were no discernible within-session changes in choice-aligned DA transients across trials at early reversal or at any other session (data not shown). On the basis of a prior study using a rat operant task showing that higher accumbal DA responses correspond to longer delays to reward (Wanat et al., 2010), we performed a correlation between post-correct or post-error DA and latency to reward collection, but did not find a significant correlation (r= −0.10, P=.8137). Finally, when DA responses were re-aligned to the time of reward collection (i.e., magazine head entry after a correct choice), no significant change in transients was evident, irrespective of performance stage (Figure S1).
Taken together, these recording data indicate a marked increase in NAc DA release when mice made an unexpectedly rewarded correct response after a previously-learned stimulus-reward relationship was reversed, suggestive of a positive RPE.
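The baseline-versus-post-choice comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the per-mouse values are hypothetical, and it assumes that each mouse contributes one mean [DA] value for the 5-second pre-choice baseline and one for the choice-to-reward window at a given stage. Only the t statistic and degrees of freedom are computed; a P value would require a t-distribution routine from a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(baseline, post):
    """Return (t, df) for a paired t-test on per-animal values.

    baseline, post: equal-length sequences, one value per animal.
    """
    diffs = [p - b for b, p in zip(baseline, post)]
    n = len(diffs)
    # t = mean difference divided by the standard error of the differences
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def bonferroni(p, n_comparisons):
    """Bonferroni-adjust a P value for n_comparisons tests (capped at 1)."""
    return min(1.0, p * n_comparisons)


# Hypothetical per-mouse mean [DA] in nM (illustrative values, not study data)
baseline_nM = [1.0, 0.5, -0.2, 0.8, 0.1, 0.4, -0.1, 0.3]
post_choice_nM = [6.2, 4.1, 2.0, 7.5, 3.3, 1.9, 2.4, 5.0]

t_stat, df = paired_t(baseline_nM, post_choice_nM)
print(f"t({df}) = {t_stat:.2f}")
```

With eight animals the test has 7 degrees of freedom, matching the form of the reported statistic (t7). The `bonferroni` helper shows one common way of correcting across the five performance stages; whether the reported P value was adjusted in exactly this way is an assumption.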
DA VTA→NAc photosilencing and photostimulation during early reversal
Building on the observation of a DA response following correct-choices at early reversal, we virally-transfected VTA DA neurons with either eNpHR3.0 or ChR2 and tested the effects of selectively photosilencing or photostimulating, respectively, fibers in the NAc pathway during early reversal. Confirming the specificity and efficacy of the viral approach, there was expression of EYFP in >80% of putative (i.e., DAT-labeled) VTA DA cells, as well as in fibers in the NAc (Figure 3A–D, Figure S2). In addition, ex vivo FSCV recordings confirmed that DA transients were evoked by shining blue light on NAc slices containing ChR2-expressing DAergic fibers originating in the VTA (Figure S3). Because we did not measure glutamate or other neurotransmitters known to be co-released by VTA DA neurons (Hnasko et al., 2010; Stuber et al., 2010; Tecuapetla et al., 2010), we cannot exclude the possibility that photostimulation caused the release of these transmitters in addition to DA. We also did not attempt FSCV recordings in eNpHR3.0-expressing slices, given this would require electrically or optogenetically evoking DA release, thereby reducing the physiological relevance of any observed eNpHR3.0-mediated inhibition.
Figure 3: Impaired early reversal after VTA→NAc core photosilencing.
(A) AAV-eNpHR3.0 or AAV-ChR2 was injected into VTA of DAT-Cre mice (DAT). DAT immunostaining (upper) and EYFP expression in a Cre+ mouse injected with AAV-eNpHR3.0 (scale bar = 200 μm). (B) Higher magnification images showing DAT immunostaining (upper), EYFP expression (middle) and merged (lower) in a Cre+ mouse injected with AAV-eNpHR3.0 (scale bar = 100 μm). (C) AAV-eNpHR3.0 infected 80% of DAT+ VTA neurons in Cre+ mice but no neurons in Cre- controls. (D) Representative images depicting virus expression in VTA fibers in the NAc in a Cre+ mouse injected with AAV-eNpHR3.0 (scale bar = 500 μm; high magnification scale bar = 50 μm). (E) Optical fibers were targeted at the NAc and VTA DAergic inputs to NAc core were photosilenced or photostimulated after a correct choice was made during early reversal. (F) Placement of optical fibers in the NAc. (G) Photosilencing increased the number of errors during early reversal (n=7–13). (H) Photosilencing increased the number of lose-stay trials during early reversal. (I) Neither photosilencing nor photostimulation altered overall performance during early reversal. (J) Latencies to make a choice or collect a reward were not affected by photosilencing or photostimulation. (K) Photosilencing and photostimulation between trial initiation and choice was repeated during late reversal. (L-N) Neither photosilencing nor photostimulation affected behavioral performance during late reversal. Data are means ± SEM. *P<.05 versus DAT-Cre- controls.
Behaviorally, viral expression per se (i.e., in the absence of light) did not significantly affect discrimination performance (Cre- = 10.1±1.0, Cre+/eNpHR3.0 = 8.1±0.5, Cre+/ChR2 = 13.0±1.4 sessions to criterion). On early reversal, DA VTA→NAc neurons were either photosilenced or photostimulated immediately after each correct response was made (through to reward collection) by bilaterally shining green or blue light, respectively, in the NAc (Figure 3E–F). When compared to Cre- controls, photosilencing produced a significant increase in total errors in eNpHR3.0-expressing Cre+ mice (P=.0020) whereas photostimulation in ChR2-expressing Cre+ mice was without effect (P=.4383) (ANOVA effects of group: F2,26=7.75, P=.0023) (Figure 3G).
Trial-by-trial analysis also revealed a significant increase in lose→stay (ANOVA effects of group: F2,26=4.88, P=.0159; post hoc: P=.0140), lose→shift (ANOVA effects of group: F2,26=4.65, P=.0187; post hoc: P=.0253) and win→shift (ANOVA effects of group: F2,26=4.44, P=.0220; post hoc: P=.0296) trial-pairs in response to VTA→NAc photosilencing (Figure 3H). Despite an overall increase in errors and shifts in the microstructure of performance in the photosilenced group, correct choices expressed as a percentage of the 30 trials ‘proper’ per session (i.e., excluding trials in which correction errors were made) did not differ between the groups (ANOVA effects of group: P=.4229) (Figure 3I). There was also no overall effect of group (ANOVA effects of group: P=.1002) on latency to respond or collect reward, though there was a trend for both measures to be lower in the eNpHR3.0 group (Figure 3J).
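The trial-pair categories used in this analysis can be made concrete with a short sketch. It is an illustration under stated assumptions rather than the authors' scoring procedure: it assumes each trial is labeled by the stimulus chosen and by whether the choice was rewarded, and that "stay" means choosing the same stimulus on the next trial. How correction trials were handled in the original scoring is not specified here, so the hypothetical `classify_trial_pairs` function simply treats the input as one ordered sequence of trials.

```python
from collections import Counter


def classify_trial_pairs(choices, rewarded):
    """Count win/lose x stay/shift pairs over consecutive trials.

    choices:  sequence of stimulus labels chosen on each trial
    rewarded: sequence of booleans, True if that choice was rewarded
    """
    counts = Counter({"win-stay": 0, "win-shift": 0,
                      "lose-stay": 0, "lose-shift": 0})
    # Pair each trial (choice, outcome) with the choice made on the next trial
    for prev_choice, prev_win, next_choice in zip(choices, rewarded, choices[1:]):
        outcome = "win" if prev_win else "lose"
        action = "stay" if next_choice == prev_choice else "shift"
        counts[f"{outcome}-{action}"] += 1
    return dict(counts)


# Hypothetical early-reversal sequence in which stimulus "B" is now rewarded
choices = ["A", "A", "B", "B", "A"]
rewarded = [False, False, True, True, False]
pair_counts = classify_trial_pairs(choices, rewarded)
print(pair_counts)
```

In this toy sequence each of the four categories occurs once. A lose→stay pair (an unrewarded choice followed by the same choice) corresponds to continued responding at the previously rewarded stimulus.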
Following early reversal photomanipulations, mice were tested through to criterion, without light. There was no difference between groups in sessions to attain reversal criterion, indicating that the effect of photosilencing was limited to the session in which light was shone, without lasting effects on the subsequent light-free sessions. Mice were then retested with light now shone during the pre-response (i.e., trial-initiation-to-touchscreen-response) period (Figure 3K). Groups did not differ on any behavioral parameter (Figure 3L–O), discounting possible non-specific (e.g., locomotor) effects of the photomanipulation and suggesting that this pathway is not necessary for the expression of reversal learning when the new stimulus-reward contingency is well-established.
Genetic model of augmented striatal DA release
As a complementary approach to the FSCV recordings and optogenetics experiment, we tested D2DAT mutants in which DA release in the NAc is augmented due to loss of D2R autoreceptor-mediated inhibitory feedback onto DA neurons (Bello et al., 2011; Holroyd et al., 2015). Replicating previous findings (Bello et al., 2011), these mutants were hyperactive in a novel open field test, as compared to floxed controls (WT = 10.60 ± 0.75 m, mutant = 13.18 ± 0.73 m; t-test: t15=2.38, P=.0310, n=8–9). However, neither discrimination nor reversal performance was abnormal in the mutants (Figure S4).
Discussion
In the current study, using FSCV to monitor DA transients in vivo, a NAc DA signal was observed when a choice between two stimuli unexpectedly produced reward and reward-associated cues after a previously learned stimulus-reward contingency was reversed. In addition, selectively silencing DAergic VTA inputs to the NAc, using optogenetics, disrupted early reversal performance. These findings provide a novel demonstration of recruitment of a DAergic response in the NAc during the updating of stimulus-reward associations and show how the absence of this signal interferes with the ability to disengage from a previously rewarded stimulus in favor of a now-fruitful choice.
Prior studies have found that DA release is not necessary for learning under all circumstances (Cannon & Palmiter, 2003; Flagel et al., 2011), but generally support the functional importance of DA signals for stimulus-reward learning, particularly when information about prior outcomes is needed to update behavior and optimize performance (Zweifel et al., 2009; Parker et al., 2010). In line with this, the current FSCV data showed that NAc DA release was coincident with unexpectedly-rewarded choices at a stage in the task, early reversal, when expectations about stimulus-reward associations are violated. This increase in phasic NAc DA following correct choices at early reversal resembles FSCV findings obtained in a nose-poke based spatial reversal task in rats, wherein the magnitude of NAc, but not DLS, DA transients at this stage predicted successful reversal learning (Klanker et al., 2015; Klanker et al., 2017). The current data also show that post-correct DA signals diminished in tandem with reversal learning, tracking the establishment of revised expectations about the availability/unavailability of reward after choices.
In the current analysis, there was no indication of a decrease in NAc DA (i.e., a negative RPE) when mice made errors on early reversal choices that were unexpectedly unrewarded. This could be due to a technical limitation of our FSCV method, given studies using calcium imaging have been able to observe negative RPE-like signals in VTA/SN DA inputs to the NAc in a reversal setting (Parker et al., 2016). RPE-like signals in the VTA/SN→NAc pathway have also been detected with calcium imaging at the time of reward consumption in other mouse reversal tasks (Parker et al., 2016), whereas we also failed to observe DA transients following reward delivery. This again may simply reflect the methodological approach employed, or could be a function of the prolonged shaping and training mice receive in this touchscreen task, which make the reward fully predicted by the time discrimination and reversal training begins.
A phasic DA signal in the NAc core represents a plausible downstream mechanism for the increased midbrain DA neuron firing that contributes to RPEs (Schultz, 2013; Steinberg & Janak, 2013; Chang et al., 2016). Thus, an increase in DA after an unexpectedly rewarded choice at early reversal could be interpreted as evidence of a positive RPE in the VTA→NAc circuit that serves to update and adapt behavior in the face of the new stimulus-reward contingency. Alternatively, post-correct DA responses at this stage of the task could be explained by the unexpected occurrence of the tone and magazine light cues (and even the sound of the pellet being delivered) that signal reward availability after a correct response throughout training. Such reward-predictive cues are known to be potent evokers of DA in the NAc and have been shown to causally contribute to cue-invigoration of instrumental responding for reward (Wyvell & Berridge, 2000; Satoh et al., 2003; Roitman et al., 2004; Berridge, 2007; Lex & Hauber, 2008; Beierholm et al., 2013; Pecina & Berridge, 2013; Wassum et al., 2013; du Hoffmann & Nicola, 2014; Ostlund et al., 2014; Aitken et al., 2016; Collins et al., 2016a; Corbit & Balleine, 2016). It is worth noting here that the current task design does not allow for the alignment of DA responses to the time mice sample the stimuli prior to choice, because trials are initiated in the reward-magazine location at the side of the chamber opposite to the touchscreen in order to minimize attention to the magazine at the expense of the screen.
It seems reasonable to assume that at least some component of the DA transients following early reversal correct choices reflects cue-related motivation to collect the reward rather than a ‘pure’ RPE signal (Hamid et al., 2016). To test this further in future studies, it could prove informative to parse DA responses to presentation of reward-signaling cues from the delivery of the reward itself, for example by simply omitting the cues (Lederle et al., 2011). Notwithstanding, a motivational account of the current data would fit with the more general notion that NAc DA is recruited when the cost of responding is physically effortful and demands high incentive motivation (Day et al., 2010; Salamone et al., 2016). Indeed, ramping NAc DA responses, similar in pattern to those currently observed, have been associated with reward-anticipatory action-sequences directed toward the location in which reward is presented (Howe et al., 2013). Importantly, these responses scale with the size and proximity of the reward (Hamid et al., 2016), diminish with training and, in parallel with the currently observed DA response to unexpected reward, recover with an unexpected change in reward value (Collins et al., 2016b).
In this context, it was noteworthy that DA transients were not seen when mice made correct choices during the early discrimination phase, despite this being an unexpected outcome to touching what is a novel stimulus. This differs from prior FSCV studies showing DA responses at early stages of operant learning (Willuhn et al., 2012; Klanker et al., 2015; Collins et al., 2016a; Hamid et al., 2016; Klanker et al., 2017). Differences in electrode design (chronic vs. acute) and species (mouse vs. rat) may have contributed to differences between the observations reported here and in previous studies (Rodeberg et al., 2017). Another consideration, already alluded to above, is that by the start of discrimination testing, mice have already been through extensive pre-training, during which responses at a range of novel, interchangeable stimuli are rewarded. This experience may have lessened the surprise of being rewarded for touching the correct stimulus in the novel discrimination pair. This differs from early reversal; here the two stimuli are equally familiar and each now comes with a clear expectation about their respective outcomes.
Photosilencing VTA DAergic inputs to the NAc immediately after, but not before, unexpectedly rewarded choices produced disruptions to performance during early reversal. This aligns with our FSCV data showing NAc DA release at this stage and suggests a functional role for this pathway in updating behavior to accommodate the change in stimulus-reward contingency. However, impaired early reversal performance after VTA→NAc photosilencing differs from the finding that lesions of the NAc (core or shell), unlike lesions of the dorsomedial striatum, failed to affect spatial reversal learning in rats (Castane et al., 2010), as well as pharmacological data, again in rats, showing that NAc D1R or D2R blockade, unlike D2R agonism, left spatial reversal learning intact (Haluk & Floresco, 2009). There are a number of potential reasons for these apparent discrepancies, including species (rat versus mouse), task (spatial/lever-press versus visual/touchscreen), temporal resolution (permanent lesions and long-lasting pharmacological effects versus post-correct-choice-specific silencing), and the specificity of the manipulation (pan-regional lesion and receptor-restricted antagonism versus VTA input-specific). There is also the possibility that while virus and optical fibers were targeted to the NAc core, some inhibition may also have occurred in the NAc shell. Our photomanipulations could also have caused antidromic stimulation of DA neurons and resultant changes in activity in other VTA DA targets (Jennings & Stuber, 2014). Finally, and as already noted, there are unknown behavioral consequences of co-released glutamate or other transmitters from DA neurons (Hnasko et al., 2010; Stuber et al., 2010; Tecuapetla et al., 2010; Yoo et al., 2016).
While the current data show that VTA→NAc photosilencing increased errors on early reversal, this was not paralleled by a reduction in the overall number of correct choices made on the 30 session trials ‘proper’ (i.e., excluding correction error trials). The increase in errors did not appear to be due to a non-specific increase in responding because silencing applied, albeit at pre-choice, once mice had attained reversal criterion was without effect. This paradoxical combination of a high error rate with overall intact correct performance can occur in this task when the act of committing repeated errors on correction trials eventually benefits correct performance later in the session. An alternative, more straightforward explanation is that because early reversal correct responding is already low in controls, any deficit due to silencing would be difficult to detect. Nonetheless, the behaviorally impairing effect of silencing generally concurs with Parker et al.’s recent observation that silencing VTA/SN DA neurons, from trial initiation to reward receipt, reduces selection of the rewarded stimulus on subsequent trials (Parker et al., 2016), although in their study, limiting silencing to the post-choice period was ineffective. A valuable experiment to better compare these studies would involve adapting our procedure to silence after early reversal errors, rather than correct choices. Another useful design modification would be to test for effects at later performance stages by applying post-choice photomanipulations at other performance stages or, as in other recent work (Bergstrom et al., 2018), throughout the entirety of testing to criterion.
Electrical stimulation of VTA neurons disrupts utilization of negative feedback from previous trials to avoid making risky choices on subsequent trials (Stopper et al., 2014) and, in an operant spatial reversal task that is conceptually closer to ours, photostimulating VTA DA (TH-Cre+) neurons has been shown to improve performance (Adamantidis et al., 2011). By contrast, photostimulating the VTA→NAc pathway after early reversal correct choices did not produce a change in performance to mirror the decrement seen with photosilencing. Of course, low ChR2 expression, inadequate distribution of light or suboptimal electrode placement could all potentially account for this negative finding. An alternative explanation for this lack of an effect is that, because correct choices already produce a significant DA transient, artificially augmenting release to supra-threshold levels may be without benefit to performance.
Indirectly supporting this argument, mutant mice with augmented NAc DA release due to deletion of the D2 autoreceptor on DA neurons did not exhibit improved reversal. There are, however, reports of improved learning in these D2 autoreceptor mutants, albeit in tasks quite distinct from the current reversal paradigm (e.g., fixed ratio responding for cocaine) (Bello et al., 2011; Holroyd et al., 2015). It should also be borne in mind that we did not record DA release with FSCV and therefore cannot be certain there was in fact augmented task-related DA release in the mutants. Moreover, while this mutation affects DA release in the NAc, it does not spatially dissociate DA effects in the NAc and DS, or for that matter any other striatal subregion or non-striatal target of midbrain DA cells.
Nonetheless, the suggestion that photostimulation was unable to improve reversal performance because it was augmenting an already present DA signal agrees with prior work showing that DA neuronal photostimulation can serve as an RPE signal to promote learning by generating DA activity around stimuli that are in and of themselves insufficient to evoke an increase in DA (e.g., blocking, extinguished, preconditioned, or unconditioned stimuli) (Iordanova et al., 2006; Witten et al., 2011; Kim et al., 2012; Steinberg et al., 2013; Sharpe et al., 2017). One way to reconcile these findings with the current data would be to couple photostimulation with simultaneous DA recordings as mice perform the reversal task and determine whether or not NAc DA responses are exaggerated by photostimulating the VTA inputs. Such findings would provide further insight into whether artificially augmenting DA release is sufficient to facilitate reversal and other forms of cognitive flexibility.
In sum, the current data provide further evidence that DA release in the NAc is evoked when learned expectations about stimulus-reward relationships are violated, and also show that interfering with DA release in VTA→NAc neurons is sufficient to disrupt behavioral performance during early reversal. These findings offer new insight into how dysfunction of DA signaling in specific neural circuits might contribute to inflexible patterns of behavior characterizing addictions and other neuropsychiatric disorders.
Supplementary Material
Acknowledgements
We are grateful to Dr. Marcelo Rubinstein for provision of D2R-floxed mice, Dr. Alberto Castro for assistance with slice recordings, Ms. Jessika Brenin for assistance with behavioral data collection, and Mr. Gabriel Loewinger for assistance with FSCV. The work was supported by the National Institute on Alcohol Abuse and Alcoholism Intramural Research Program and NIH grants DA022340 and DA042595 (to J.F.C.).
Abbreviations
- DA: dopamine
- FSCV: fast-scan cyclic voltammetry
- NAc: nucleus accumbens
- RPE: reward prediction error
- VTA: ventral tegmental area
Footnotes
Conflict of Interest Statement
The authors declare no competing financial interests.
Data Accessibility Statement
Data will be made available upon request. Please send requests to Dr. Andrew Holmes (holmesan@mail.nih.gov).
References
- Adamantidis AR, Tsai HC, Boutrel B, Zhang F, Stuber GD, Budygin EA, Tourino C, Bonci A, Deisseroth K & de Lecea L (2011) Optogenetic interrogation of dopaminergic modulation of the multiple phases of reward-seeking behavior. J Neurosci, 31, 10829–10835.
- Aitken TJ, Greenfield VY & Wassum KM (2016) Nucleus accumbens core dopamine signaling tracks the need-based motivational value of food-paired cues. J Neurochem, 136, 1026–1036.
- Bayer HM & Glimcher PW (2005) Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141.
- Beierholm U, Guitart-Masip M, Economides M, Chowdhury R, Duzel E, Dolan R & Dayan P (2013) Dopamine modulates reward-related vigor. Neuropsychopharmacology, 38, 1495–1503.
- Bello EP, Mateo Y, Gelman DM, Noain D, Shin JH, Low MJ, Alvarez VA, Lovinger DM & Rubinstein M (2011) Cocaine supersensitivity and enhanced motivation for reward in mice lacking dopamine D2 autoreceptors. Nat Neurosci, 14, 1033–1038.
- Bergstrom HC, Lipkin AM, Lieberman AG, Pinard CR, Gunduz-Cinar O, Brockway ET, … & Rubio FJ (2018). Dorsolateral striatum engagement interferes with early discrimination learning. Cell Rep, 23, 2264–2272.
- Berke JD (2018). What does dopamine mean? Nat Neurosci, 21, 787–793.
- Berridge KC (2007) The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology, 191, 391–431.
- Bozarth MA & Wise RA (1981) Intracranial self-administration of morphine into the ventral tegmental area in rats. Life Sci, 28, 551–555.
- Brigman JL, Daut RA, Wright T, Gunduz-Cinar O, Graybeal C, Davis MI, Jiang Z, Saksida LM, Jinde S, Pease M, Bussey TJ, Lovinger DM, Nakazawa K & Holmes A (2013) GluN2B in corticostriatal circuits governs choice learning and choice shifting. Nat Neurosci, 16, 1101–1110.
- Brigman JL, Ihne J, Saksida LM, Bussey TJ & Holmes A (2009) Effects of Subchronic Phencyclidine (PCP) Treatment on Social Behaviors, and Operant Discrimination and Reversal Learning in C57BL/6J Mice. Frontiers Behav Neurosci, 3, 2.
- Calaminus C & Hauber W (2007) Intact discrimination reversal learning but slowed responding to reward-predictive cues after dopamine D1 and D2 receptor blockade in the nucleus accumbens of rats. Psychopharmacology, 191, 551–566.
- Cannon CM & Palmiter RD (2003) Reward without dopamine. J Neurosci, 23, 10827–10831.
- Castane A, Theobald DE & Robbins TW (2010) Selective lesions of the dorsomedial striatum impair serial spatial reversal learning in rats. Behav Brain Res, 210, 74–83.
- Chang CY, Esber GR, Marrero-Garcia Y, Yau HJ, Bonci A & Schoenbaum G (2016) Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat Neurosci, 19, 111–116.
- Cheer JF, Aragona BJ, Heien ML, Seipel AT, Carelli RM & Wightman RM (2007) Coordinated accumbal dopamine release and neural activity drive goal-directed behavior. Neuron, 54, 237–244.
- Clark JJ, Collins AL, Sanford CA & Phillips PE (2013) Dopamine encoding of Pavlovian incentive stimuli diminishes with extended training. J Neurosci, 33, 3526–3532.
- Clark JJ, Sandberg SG, Wanat MJ, Gan JO, Horne EA, Hart AS, Akers CA, Parker JG, Willuhn I, Martinez V, Evans SB, Stella N & Phillips PE (2010) Chronic microsensors for longitudinal, subsecond dopamine detection in behaving animals. Nat Methods, 7, 126–129.
- Collins AL, Aitken TJ, Greenfield VY, Ostlund SB & Wassum KM (2016a) Nucleus Accumbens Acetylcholine Receptors Modulate Dopamine and Motivation. Neuropsychopharmacology, 41, 2830–2838.
- Collins AL, Greenfield VY, Bye JK, Linker KE, Wang AS & Wassum KM (2016b) Dynamic mesolimbic dopamine signaling during action sequence learning and expectation violation. Sci Rep, 6, 20231.
- Corbit LH & Balleine BW (2016) Learning and Motivational Processes Contributing to Pavlovian-Instrumental Transfer and Their Neural Bases: Dopamine and Beyond. Curr Top Behav Neurosci, 27, 259–289.
- D’Ardenne K, McClure SM, Nystrom LE & Cohen JD (2008) BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science, 319, 1264–1267.
- Day JJ, Jones JL, Wightman RM & Carelli RM (2010) Phasic nucleus accumbens dopamine release encodes effort- and delay-related costs. Biol Psychiat, 68, 306–309.
- Day JJ, Roitman MF, Wightman RM & Carelli RM (2007) Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nat Neurosci, 10, 1020–1028.
- du Hoffmann J & Nicola SM (2014) Dopamine invigorates reward seeking by promoting cue-evoked excitation in the nucleus accumbens. J Neurosci, 34, 14349–14364.
- Eshel N, Bukwich M, Rao V, Hemmelder V, Tian J & Uchida N (2015) Arithmetic and local circuitry underlying dopamine prediction errors. Nature, 525, 243–246.
- Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, Akers CA, Clinton SM, Phillips PE & Akil H (2011) A selective role for dopamine in stimulus-reward learning. Nature, 469, 53–57.
- Graybeal C, Feyder M, Schulman E, Saksida LM, Bussey TJ, Brigman JL & Holmes A (2011) Paradoxical reversal learning enhancement by stress or prefrontal cortical damage: rescue with BDNF. Nat Neurosci, 14, 1507–1509.
- Gremel CM & Lovinger DM (2016) Associative and sensorimotor cortico-basal ganglia circuit roles in effects of abused drugs. Genes Brain Behav, 16, 71–85.
- Gunaydin LA, Grosenick L, Finkelstein JC, Kauvar IV, Fenno LE, Adhikari A, Lammel S, Mirzabekov JJ, Airan RD, Zalocusky KA, Tye KM, Anikeeva P, Malenka RC & Deisseroth K (2014) Natural neural projection dynamics underlying social behavior. Cell, 157, 1535–1551.
- Haluk DM & Floresco SB (2009) Ventral striatal dopamine modulation of different forms of behavioral flexibility. Neuropsychopharmacology, 34, 2041–2052.
- Hamid AA, Pettibone JR, Mabrouk OS, Hetrick VL, Schmidt R, Vander Weele CM, Kennedy RT, Aragona BJ & Berke JD (2016) Mesolimbic dopamine signals the value of work. Nat Neurosci, 19, 117–126.
- Hart AS, Rutledge RB, Glimcher PW & Phillips PE (2014) Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci, 34, 698–704.
- Heien ML, Phillips PE, Stuber GD, Seipel AT, & Wightman RM (2003) Overoxidation of carbon-fiber microelectrodes enhances dopamine adsorption and increases sensitivity. Analyst 128, 1413–19.
- Heien ML, Khan AS, Ariansen JL, Cheer JF, Phillips PE, Wassum KM & Wightman RM (2005) Real-time measurement of dopamine fluctuations after cocaine in the brain of behaving rats. P Natl Acad Sci USA, 102, 10023–10028.
- Hnasko TS, Chuhma N, Zhang H, Goh GY, Sulzer D, Palmiter RD, Rayport S & Edwards RH (2010) Vesicular glutamate transport promotes dopamine storage and glutamate corelease in vivo. Neuron, 65, 643–656.
- Hollerman JR & Schultz W (1998) Dopamine neurons report an error in the temporal prediction of reward during learning. Nat Neurosci, 1, 304–309.
- Holroyd KB, Adrover MF, Fuino RL, Bock R, Kaplan AR, Gremel CM, Rubinstein M & Alvarez VA (2015) Loss of feedback inhibition via D2 autoreceptors enhances acquisition of cocaine taking and reactivity to drug-paired cues. Neuropsychopharmacology, 40, 1495–1509.
- Howarth CI & Deutsch JA (1962) Drive decay: the cause of fast “extinction” of habits learned for brain stimulation. Science, 137, 35–36.
- Howe MW, Tierney PL, Sandberg SG, Phillips PE & Graybiel AM (2013) Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature, 500, 575–579.
- Ilango A, Kesner AJ, Broker CJ, Wang DV & Ikemoto S (2014a) Phasic excitation of ventral tegmental dopamine neurons potentiates the initiation of conditioned approach behavior: parametric and reinforcement-schedule analyses. Front Behav Neurosci, 8, 155. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ilango A, Kesner AJ, Keller KL, Stuber GD, Bonci A & Ikemoto S (2014b) Similar roles of substantia nigra and ventral tegmental dopamine neurons in reward and aversion. J Neurosci, 34, 817–822. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Iordanova MD, Westbrook RF & Killcross AS (2006) Dopamine activity in the nucleus accumbens modulates blocking in fear conditioning. Eur J Neurosci, 24, 3265–3270.
- Izquierdo A, Brigman JL, Radke AK, Rudebeck PH & Holmes A (2017) The neural basis of reversal learning: an updated perspective. Neuroscience, 345, 12–26.
- Izquierdo A, Wiedholz LM, Millstein RA, Yang RJ, Bussey TJ, Saksida LM & Holmes A (2006) Genetic and dopaminergic modulation of reversal learning in a touchscreen-based operant procedure for mice. Behav Brain Res, 171, 181–188.
- Jennings JH & Stuber GD (2014) Tools for resolving functional activity and connectivity within intact neural circuits. Curr Biol, 24, R41–R50.
- Kim KM, Baratta MV, Yang A, Lee D, Boyden ES & Fiorillo CD (2012) Optogenetic mimicry of the transient activation of dopamine neurons by natural reward is sufficient for operant reinforcement. PLoS One, 7, e33612.
- Klanker M, Fellinger L, Feenstra M, Willuhn I & Denys D (2017) Regionally distinct phasic dopamine release patterns in the striatum during reversal learning. Neuroscience, 345, 110–123.
- Klanker M, Sandberg T, Joosten R, Willuhn I, Feenstra M & Denys D (2015) Phasic dopamine release induced by positive feedback predicts individual differences in reversal learning. Neurobiol Learn Mem, 125, 135–145.
- Lammel S, Lim BK, Ran C, Huang KW, Betley MJ, Tye KM, Deisseroth K & Malenka RC (2012) Input-specific control of reward and aversion in the ventral tegmental area. Nature, 491, 212–217.
- Lederle L, Weber S, Wright T, Feyder M, Brigman JL, Crombag HS, Saksida LM, Bussey TJ & Holmes A (2011) Reward-related behavioral paradigms for addiction research in the mouse: performance of common inbred strains. PLoS One, 6, e15536.
- Lex A & Hauber W (2008) Dopamine D1 and D2 receptors in the nucleus accumbens core and shell mediate Pavlovian-instrumental transfer. Learn Mem, 15, 483–491.
- Li SS & McNally GP (2015) A role of nucleus accumbens dopamine receptors in the nucleus accumbens core, but not shell, in fear prediction error. Behav Neurosci, 129, 450–456.
- Ljungberg T, Apicella P & Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol, 67, 145–163.
- Mateo Y, Johnson KA, Covey DP, Atwood BK, Wang HL, Zhang S, Gildish I, Cachope R, Bellocchio L, Guzman M, Morales M, Cheer JF & Lovinger DM (2017) Endocannabinoid Actions on Cortical Terminals Orchestrate Local Modulation of Dopamine Release in the Nucleus Accumbens. Neuron, 96, 1112–1126 e1115.
- Matsumoto M & Hikosaka O (2009) Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature, 459, 837–841.
- Nasrallah NA, Clark JJ, Collins AL, Akers CA, Phillips PE & Bernstein IL (2011) Risk preference following adolescent alcohol use is associated with corrupted encoding of costs but not rewards by mesolimbic dopamine. P Natl Acad Sci USA, 108, 5466–5471.
- Olds J (1958) Self-stimulation of the brain; its use to study local effects of hunger, sex, and drugs. Science, 127, 315–324.
- Ostlund SB, LeBlanc KH, Kosheleff AR, Wassum KM & Maidment NT (2014) Phasic mesolimbic dopamine signaling encodes the facilitation of incentive motivation produced by repeated cocaine exposure. Neuropsychopharmacology, 39, 2441–2449.
- Owesson-White CA, Ariansen J, Stuber GD, Cleaveland NA, Cheer JF, Wightman RM & Carelli RM (2009) Neural encoding of cocaine-seeking behavior is coincident with phasic dopamine release in the accumbens core and shell. Eur J Neurosci, 30, 1117–1127.
- Owesson-White CA, Cheer JF, Beyene M, Carelli RM & Wightman RM (2008) Dynamic changes in accumbens dopamine correlate with learning during intracranial self-stimulation. P Natl Acad Sci USA, 105, 11957–11962.
- Pan WX, Schmidt R, Wickens JR & Hyland BI (2005) Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J Neurosci, 25, 6235–6242.
- Parker JG, Zweifel LS, Clark JJ, Evans SB, Phillips PE & Palmiter RD (2010) Absence of NMDA receptors in dopamine neurons attenuates dopamine release but not conditioned approach during Pavlovian conditioning. P Natl Acad Sci USA, 107, 13491–13496.
- Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, Davidson TJ, Daw ND & Witten IB (2016) Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat Neurosci, 19, 845–854.
- Pascoli V, Terrier J, Hiver A & Luscher C (2015) Sufficiency of Mesolimbic Dopamine Neuron Stimulation for the Progression to Addiction. Neuron, 88, 1054–1066.
- Pecina S & Berridge KC (2013) Dopamine or opioid stimulation of nucleus accumbens similarly amplify cue-triggered ‘wanting’ for reward: entire core and medial shell mapped as substrates for PIT enhancement. Eur J Neurosci, 37, 1529–1540.
- Phillips PE, Robinson DL, Stuber GD, Carelli RM & Wightman RM (2003a) Real-time measurements of phasic changes in extracellular dopamine concentration in freely moving rats by fast-scan cyclic voltammetry. Method Molec Med, 79, 443–464.
- Phillips PE, Stuber GD, Heien ML, Wightman RM & Carelli RM (2003b) Subsecond dopamine release promotes cocaine seeking. Nature, 422, 614–618.
- Rodeberg NT, Sandberg SG, Johnson JA, Phillips PE & Wightman RM (2017) Hitchhiker’s Guide to Voltammetry: Acute and Chronic Electrodes for in Vivo Fast-Scan Cyclic Voltammetry. ACS Chem Neurosci, 8, 221–234.
- Roesch MR, Calu DJ & Schoenbaum G (2007) Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci, 10, 1615–1624.
- Roitman MF, Stuber GD, Phillips PE, Wightman RM & Carelli RM (2004) Dopamine operates as a subsecond modulator of food seeking. J Neurosci, 24, 1265–1271.
- Rossi MA, Sukharnikova T, Hayrapetyan VY, Yang L & Yin HH (2013) Operant self-stimulation of dopamine neurons in the substantia nigra. PLoS One, 8, e65799.
- Saddoris MP, Sugam JA & Carelli RM (2017) Prior Cocaine Experience Impairs Normal Phasic Dopamine Signals of Reward Value in Accumbens Shell. Neuropsychopharmacology, 42, 766–773.
- Saddoris MP, Sugam JA, Stuber GD, Witten IB, Deisseroth K & Carelli RM (2015) Mesolimbic dopamine dynamically tracks, and is causally linked to, discrete aspects of value-based decision making. Biol Psychiat, 77, 903–911.
- Salamone JD, Pardo M, Yohn SE, Lopez-Cruz L, SanMiguel N & Correa M (2016) Mesolimbic Dopamine and the Regulation of Motivated Behavior. Curr Top Behav Neurosci, 27, 231–257.
- Satoh T, Nakai S, Sato T & Kimura M (2003) Correlated coding of motivation and outcome of decision by dopamine neurons. J Neurosci, 23, 9913–9923.
- Schultz W (2013) Updating dopamine reward signals. Curr Op Neurobiol, 23, 229–238.
- Schultz W, Dayan P & Montague PR (1997) A neural substrate of prediction and reward. Science, 275, 1593–1599.
- Sharpe MJ, Chang CY, Liu MA, Batchelor HM, Mueller LE, Jones JL, Niv Y & Schoenbaum G (2017) Dopamine transients are sufficient and necessary for acquisition of model-based associations. Nat Neurosci, 20, 735–742.
- Stauffer WR, Lak A, Yang A, Borel M, Paulsen O, Boyden ES & Schultz W (2016) Dopamine Neuron-Specific Optogenetic Stimulation in Rhesus Macaques. Cell, 166, 1564–1571 e1566.
- Steinberg EE, Boivin JR, Saunders BT, Witten IB, Deisseroth K & Janak PH (2014) Positive reinforcement mediated by midbrain dopamine neurons requires D1 and D2 receptor activation in the nucleus accumbens. PLoS One, 9, e94771.
- Steinberg EE & Janak PH (2013) Establishing causality for dopamine in neural function and behavior with optogenetics. Brain Res, 1511, 46–64.
- Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K & Janak PH (2013) A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci, 16, 966–973.
- Stopper CM, Tse MT, Montes DR, Wiedman CR & Floresco SB (2014) Overriding phasic dopamine signals redirects action selection during risk/reward decision making. Neuron, 84, 177–189.
- Stuber GD, Hnasko TS, Britt JP, Edwards RH & Bonci A (2010) Dopaminergic terminals in the nucleus accumbens but not the dorsal striatum corelease glutamate. J Neurosci, 30, 8229–8233.
- Sugam JA, Day JJ, Wightman RM & Carelli RM (2012) Phasic nucleus accumbens dopamine encodes risk-based decision-making behavior. Biol Psychiat, 71, 199–205.
- Tecuapetla F, Patel JC, Xenias H, English D, Tadros I, Shah F, Berlin J, Deisseroth K, Rice ME, Tepper JM & Koos T (2010) Glutamatergic signaling by mesolimbic dopamine neurons in the nucleus accumbens. J Neurosci, 30, 7105–7110.
- Tsai HC, Zhang F, Adamantidis A, Stuber GD, Bonci A, de Lecea L & Deisseroth K (2009) Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science, 324, 1080–1084.
- Ungless MA, Magill PJ & Bolam JP (2004) Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science, 303, 2040–2042.
- Wanat MJ, Kuhnen CM & Phillips PE (2010) Delays conferred by escalating costs modulate dopamine release to rewards but not their predictors. J Neurosci, 30, 12020–12027.
- Wassum KM, Ostlund SB, Loewinger GC & Maidment NT (2013) Phasic mesolimbic dopamine release tracks reward seeking during expression of Pavlovian-to-instrumental transfer. Biol Psychiat, 73, 747–755.
- Willuhn I, Burgeno LM, Everitt BJ & Phillips PE (2012) Hierarchical recruitment of phasic dopamine signaling in the striatum during the progression of cocaine use. P Natl Acad Sci USA, 109, 20703–20708.
- Willuhn I, Burgeno LM, Groblewski PA & Phillips PE (2014) Excessive cocaine use results from decreased phasic dopamine signaling in the striatum. Nat Neurosci.
- Witten IB, Steinberg EE, Lee SY, Davidson TJ, Zalocusky KA, Brodsky M, Yizhar O, Cho SL, Gong S, Ramakrishnan C, Stuber GD, Tye KM, Janak PH & Deisseroth K (2011) Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron, 72, 721–733.
- Wyvell CL & Berridge KC (2000) Intra-accumbens amphetamine increases the conditioned incentive salience of sucrose reward: enhancement of reward “wanting” without enhanced “liking” or response reinforcement. J Neurosci, 20, 8122–8130.
- Yoo JH, Zell V, Gutierrez-Reed N, Wu J, Ressler R, Shenasa MA, Johnson AB, Fife KH, Faget L & Hnasko TS (2016) Ventral tegmental area glutamate neurons co-release GABA and promote positive reinforcement. Nat Commun, 7, 13697.
- Zweifel LS, Parker JG, Lobb CJ, Rainwater A, Wall VZ, Fadok JP, Darvas M, Kim MJ, Mizumori SJ, Paladini CA, Phillips PE & Palmiter RD (2009) Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. P Natl Acad Sci USA, 106, 7281–7288.
Associated Data



