Abstract
Mesolimbic dopamine perturbations modulate performance of reward-seeking behavior, with tasks requiring high effort being especially vulnerable to disruption of dopamine signaling. Previous work primarily investigated longer-term perturbations such as receptor antagonism and dopamine depletion, which constrain the ability to assess dopamine contributions to effort expenditure in isolation from other behavioral events, such as reward consumption. It is also unclear whether dopamine is required for both initiation and maintenance when a sequence of multiple instrumental responses is required. Here we used optogenetic inhibition of midbrain TH+ neurons to probe the role of dopamine neuron activity during instrumental responding for reward by varying the time epoch of neural inhibition relative to the time of response initiation. Within a fixed-ratio procedure, requiring eight nosepoke responses per reinforcer delivery, or a progressive ratio procedure, in which within-session response requirements increased exponentially, inhibiting dopamine neurons while mice were engaged in response bouts decreased the probability of continued responding. If inhibition occurred during each attempted bout, the effect was to decrease total responses, and thus the number of rewards earned, over a session. In contrast, if inhibition was applied only during some bouts, mice increased the number of bouts initiated to earn control levels of reward. Inhibiting dopamine neurons while mice were not responding decreased the probability of initiating an instrumental response but had no effect on the amount of effort exerted over the entire session. We conclude that midbrain dopamine signaling promotes initiation of instrumental responding and maintains motivation to continue ongoing bouts of effortful responses.
Keywords: Motivation, instrumental learning, optogenetics, ventral tegmental area, progressive ratio
INTRODUCTION
Considerable evidence indicates that brief phasic dopamine (DA) neuron activity can serve as a reward prediction error (RPE) signal capable of supporting certain forms of associative learning. Neural response profiles congruent with this view have been observed in non-human primates and rodents (Cohen et al., 2012; Day et al., 2007; Eshel et al., 2015; Flagel et al., 2011; Hart et al., 2014; Ljungberg et al., 1992; Matsumoto and Hikosaka, 2009; Owesson-White et al., 2008; Roesch et al., 2007; Schultz et al., 1997, 2015; Waelti et al., 2001), and optogenetic manipulation of VTA DA neuron activity to mimic a positive or negative RPE at the time of reward receipt, or immediately following an action, predictably changes future behavior (Chang et al., 2016; Hamid et al., 2016; Parker et al., 2016; Steinberg et al., 2013; Tsai et al., 2009).
In addition to learning, a role for VTA DA in ongoing motivation, or behavioral activation, has long been appreciated (Berridge, 2007; Berridge and Robinson, 2003; Robbins and Everitt, 2007; Salamone, 2002; Salamone and Correa, 2012), and has been proposed to vary in relation to current subjective effort/energy utilization requirements (Beeler et al., 2012; Cagniard et al., 2006; Niv, 2007; Ostlund et al., 2012; Salamone and Correa, 2012). For example, DA antagonists and lesions of DA terminals in the nucleus accumbens (NAc) reduce effortful instrumental responding and promote behavioral switching from more to less effortful means to gain access to food reward (Aberman and Salamone, 1999; Hosking et al., 2015; Ostlund et al., 2012; Salamone et al., 1991).
Observations of two modes of DA neuron activity, phasic firing and tonic firing (Goto et al., 2007; Grace, 2000, 2016; Niv, 2007; Sulzer et al., 2016; Wanat et al., 2009), have led to the assignment of distinct contributions of phasic and tonic DA neuron activity to learning and motivation, respectively, especially when considering DA actions in the nucleus accumbens (NAc) (Niv et al., 2007; Parker et al., 2010; Schultz, 2007; Zweifel et al., 2009), with phasic DA changes mediating learning and changes in tonic levels over tens of seconds to minutes impacting motivational variables. However, multiple studies suggest that motivation may be related to changes in VTA/NAc DA levels over relatively short time scales that would typically not be considered ‘tonic’ (Collins et al., 2016; Hamid et al., 2016; Ko and Wanat, 2016; Phillips et al., 2003; Roitman et al., 2004; Wassum et al., 2012a), and have been referred to as ‘slow phasic’, representing changes occurring on the time scale of behavior, yet longer than the time required for a DA neuron burst (Salamone and Correa, 2012). For example, in a recent set of studies using fast scan cyclic voltammetry (FSCV), Collins and colleagues (2016) directly measured DA in NAc of rats trained to lever press one lever for access to a second lever whose depression led to reward. These authors found that DA concentration increases were sometimes observed to begin prior to lever approach, and to last on the order of 5–10 s (Collins et al., 2016). Features of these increases were correlated with latency to approach the first lever, suggesting that these briefer DA signals alter motivated responding directly (Collins et al., 2016). While these responses changed over days with learning, they also were predictive of behavioral response characteristics, in line with roles of DA in both motivation and learning (Collins et al., 2016; Wassum et al., 2012).
Similar conclusions emerge from a detailed analysis of NAc FSCV DA signals within a reinforcement learning framework; Hamid et al. (2016) found that the dynamics of the DA signal encode both immediate value that acutely impacts effort expenditure, as well as changes in value, or RPEs, that affect future response allocation. Interestingly, the authors found that optogenetic activation of VTA DA neurons time-locked to a trial initiation cue did not affect future choice, i.e., learning, but did decrease response latencies on that same trial. Together these findings suggest that VTA DA neuron activity can impact ongoing responding, congruent with a motivational interpretation, perhaps even on shorter time scales than typically considered to reflect ‘tonic’ DA.
While these and other prior studies indicate contributions to motivated behavior, they generally did not distinguish between a requirement for DA for initiation or maintenance of responding. In a careful series of studies, Nicola and colleagues provide evidence that DA levels in the NAc are critical for ongoing modulation of behavior, specifically for facilitation of approach behavior (Nicola, 2010, 2016), an idea that would place the requirement for DA at the initiation phase of action sequences, rather than the expression. On the other hand, findings of selective effects of DA lesions on responding under larger ratio requirements as compared with continuous reinforcement (Aberman and Salamone, 1999; Salamone et al., 2001), as well as the observation of DA increases from the initiation through the full execution of a response sequence (Collins et al., 2016) might suggest that DA also is important for maintenance of longer sequences of responses.
Here we tested whether suppression of DA neuron activity when mice were required to execute a sequence of responses to obtain reward would impair performance, using optogenetic inhibition of TH+ neurons in the midbrain. We trained TH-IRES-Cre mice on appetitive instrumental procedures for food reward, then triggered DA neuron inhibition during different behavioral epochs. Based on the prior findings discussed above, we chose to inhibit DA neurons at two distinct time points during these instrumental procedures. In the first condition, we triggered inhibition while mice were engaged in off-task behavior and measured the probability that mice would reinitiate instrumental responding while DA neurons were inhibited. In the second condition, we triggered inhibition just after mice initiated an instrumental response and were therefore actively engaged in a bout of responding. We found that TH+ neuron inhibition applied either prior to or during a sequence of instrumental actions made to obtain reward reduces the probability that reward-seeking actions will be elicited during the time of inhibition; this is an acute effect that fully recovers when the inhibition ceases. Thus inhibition can acutely decrease the probability of reward-seeking actions, in agreement with contributions of DA to ongoing responding during both the initiation and maintenance phases of instrumental behavior.
EXPERIMENTAL PROCEDURES
Subjects
Male tyrosine hydroxylase (TH)-IRES-Cre mice aged 8–12 weeks were housed individually on a reverse 12 h light/dark cycle (lights off at 10:00). All mice began behavioral training with standard rodent chow and water available ad libitum. Mice were tested under food restriction and were brought down to 90% of their free-fed body weight over 5 days before the start of the behavioral test. All experimental procedures were approved by the Institutional Animal Care and Use Committee of the Ernest Gallo Clinic and Research Center at the University of California, San Francisco and were consistent with the guidance described within the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
Surgical procedures and virus injection
Mice were anesthetized with 100mg/kg ketamine and 10mg/kg xylazine and placed in a stereotaxic frame. Mice received an infusion of either Cre-dependent halorhodopsin (AAV5/Ef1a-DIO-eNpHR3.0)-eYFP or a control Cre-dependent YFP virus (AAV5/Ef1a-DIO)-eYFP (viruses were ~10¹² infectious units ml⁻¹; UNC Viral Vector Core, Chapel Hill, NC, USA). A custom made 31-gauge infuser was used to deliver 1μl of virus targeting the VTA (target coordinates: ML±0.5mm, AP -3.5mm, DV -4.45mm relative to bregma) at a rate of 0.1μl per minute. The infuser was left in place for an additional 10 minutes following each infusion. Two additional holes were drilled (ML±1.5mm) and two chronic bilateral optical fibers were implanted at an angle of ±13.5° to a targeted depth of DV -4.45mm relative to skull. Optical fibers were made in-house with optical fiber (BFL37-200, Thorlabs) and a ceramic ferrule (MM-FER2007C-2500, Precision Fiber Products) and were secured to the skull surface with two metal screws and dental cement. As an analgesic, all mice were given ad libitum access to acetaminophen (80mg/100ml water, oral, baby aspirin) for five days following surgery. All mice were given 7 days to recover before starting handling and behavioral procedures. All behavioral tests were conducted at least 5 weeks postsurgery.
Behavioral procedures
After recovery from surgery, mice were trained in standard behavioral chambers (Med Associates) in ventilated sound-attenuating chambers. Behavioral chambers were outfitted with a 2.8W/100mA house light and two nosepoke ports flanking a recessed port into which the reward pellet was delivered. The back of one nosepoke port was illuminated with a cue light when active and a 500ms, 1kHz tone was delivered after each response requirement was completed on the active nosepoke, indicating that a reward pellet was available. Mice were first trained on a fixed ratio 1 (FR1) schedule of reinforcement to self-administer a high-fat food (35%, 20mg, Bio-serv) pellet. After acquiring the FR1 task, mice were then progressed to FR3 training for a minimum of 3 days before progressing to an FR8 or progressive ratio (PR) schedule for testing, depending upon the experiment. Under the PR schedule, the number of nosepoke responses required to earn a pellet was increased after each successive reward according to the equation x = (5 × e^(reward × 0.24)) − 5, which produced the following response requirement schedule: 1,2,4,6,9,12,15,20,25,32,40,50,62,77,95,118… In all schedules, a 500ms, 1kHz tone cue signaled the completion of the current ratio requirement and the availability of the reward.
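The progressive ratio equation can be checked numerically. The following is a minimal sketch, assuming each requirement is rounded to the nearest integer (the rounding rule is not stated in the text); the function name `pr_schedule` and the parameter `rate` are hypothetical. Under that assumption, the sixteen listed requirements are reproduced with an exponent coefficient of 0.2, whereas the coefficient of 0.24 printed in the equation yields a different sequence.

```python
import math

def pr_schedule(n_rewards, rate):
    """Response requirement for rewards 1..n_rewards: round(5 * e^(reward * rate) - 5)."""
    return [round(5 * math.exp(reward * rate) - 5) for reward in range(1, n_rewards + 1)]

# Requirement sequence as listed in the text (first 16 values).
listed = [1, 2, 4, 6, 9, 12, 15, 20, 25, 32, 40, 50, 62, 77, 95, 118]

# Compare the listed sequence against two candidate coefficients.
matches_0_20 = pr_schedule(16, 0.20) == listed
matches_0_24 = pr_schedule(16, 0.24) == listed
print("coefficient 0.20 reproduces listed schedule:", matches_0_20)
print("coefficient 0.24 reproduces listed schedule:", matches_0_24)
```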
For five days prior to testing and in sessions with optical inhibition of DA neurons, the fiber optic implants were attached to patch cables (MFP_200/240/900-0.22_2.0m_sma, Doric Lenses) that terminated with a ceramic ferrule with a fitted ceramic sleeve (SM-CS125S, Precision Fiber Products). The other end of the patch cable was attached to a bilateral optical commutator (Doric Lenses), which was connected via a second patch cable to a 200mW DPSS 532 nm laser (OEM Laser Systems). The timing of optical inhibition was triggered by a computer running Med PC IV (Med Associates) software, which also recorded nosepoke responses and reward port entries.
Histological verification of NpHR virus expression and fiber optic placement
Brains were post-fixed in paraformaldehyde and cryoprotected in 25% sucrose for 24 hrs before slicing. Sections (50μm) were washed in PBS and incubated with 0.2% bovine serum albumin (BSA) and 0.2% Triton X-100 for 20 min. 10% normal donkey serum (NDS) was added for an additional 30 min incubation. Sections were then incubated with primary antibodies overnight at 4 degrees Celsius in PBS with BSA and Triton X-100. Concentrations for primary antibodies were as follows: mouse anti-GFP (1:1500, Invitrogen), rabbit anti-TH (1:1500, Fisher Scientific). Sections were then washed with 2% NDS in PBS for 10 min before incubating with secondary antibodies for 2 hrs at room temperature. Secondary antibodies were used at a concentration of 1:200 and were conjugated to Alexa Fluor 488 or 594 dyes (Invitrogen). Sections were mounted onto glass microscope slides in phosphate buffered water, coverslipped with Vectashield mounting medium (Fisher Scientific), and imaged on a confocal microscope to verify locations of virus expression and fiber placements. Mice with insufficient virus expression or inaccurate fiber placements were excluded from the study.
Statistics
All values are expressed as mean±SEM. Statistical differences within groups were assessed using paired 2-tailed Student’s t-tests and statistical differences between groups were assessed using unpaired 2-tailed Student’s t-tests. A significance threshold of p < 0.05 was used. For analysis of the number of responses during laser-on periods for different response requirements in the progressive ratio procedure, p values were adjusted for multiple comparisons using the Benjamini–Hochberg correction with a corrected cutoff of p=0.05. We used this approach rather than a repeated measures ANOVA because only a subset of trials, which varied across subjects, contained one or more laser-on periods.
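For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal sketch of the adjustment follows. The function name `benjamini_hochberg` and the example p values are illustrative assumptions and are not data from this study; the analysis software actually used is not stated in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Illustrative raw p values only (hypothetical, one per response-requirement bin).
example_p = [0.50, 0.04, 0.01, 0.0001]
adjusted = benjamini_hochberg(example_p)
significant = [p <= 0.05 for p in adjusted]
print(adjusted, significant)
```

In this illustration the raw value of 0.04 no longer passes the corrected cutoff, because its adjusted value is 0.04 × 4 / 3 ≈ 0.053.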
RESULTS
We used in vivo optogenetic techniques with the aim of selectively inhibiting midbrain DA neurons of mice during behavior. To permit inhibition of TH+ neurons in the VTA, TH-IRES-Cre mice received injections of Cre-dependent AAV (AAV5-Ef1a-DIO-eNpHR3.0-eYFP) expressing the light-sensitive inhibitory chloride pump, halorhodopsin 3.0 (NpHR), in the VTA; chronic optical fiber implants were targeted dorsal to this region to allow for selective bilateral optogenetic DA neuron inhibition (Fig. 1). This inhibitory opsin approach has been used in TH-IRES-Cre mice in multiple published reports (Chaudhury et al., 2013; Ilango et al., 2014; Parker et al., 2016; Tan et al., 2012; Tye et al., 2013), and inhibition was confirmed in TH+ cells in vitro and in vivo, and little to no rebound excitation was observed including after long inhibitions (Chaudhury et al., 2013).
Mice were trained to respond, or ‘nosepoke’, for food reward by inserting their snout into a small hole in the experimental chamber, and the number of responses required for reward varied by experiment. However, in all experiments, completion of the correct number of nosepoke responses resulted in presentation of a 500 msec auditory tone and one high-fat food pellet. We tested the effects of DA neuron inhibition on initiation of nosepoke responding and on completion of a nosepoke response bout after nosepoking had begun. To accomplish this, we tracked behavior in real time and used the tracked behavioral events to trigger laser onset to inhibit midbrain DA neurons during each of these two behavioral epochs. Inhibition during the two behavioral epochs was tested separately in different sessions.
TH+ neuron inhibition decreases the probability of instrumental response initiation
To determine if DA is required to initiate a bout of nosepoke responses, TH-IRES-Cre mice expressing NpHR in the VTA were trained to respond under a fixed-ratio (FR8) instrumental schedule of reinforcement (eight sequential nosepokes required for delivery of one reward pellet). Fifteen-second laser pulses were delivered during off-task periods, specifically, when more than 15 seconds had elapsed since a subject collected a reward pellet with no nosepoke responses initiated during those 15 seconds (Fig. 2a). Mice were less likely to initiate nosepoking while the laser was on compared to a no-laser control session in which equivalent 15-s time intervals were analyzed (paired t-test, t(9)=4.046, p=0.003; Fig. 2b). This behavioral suppression recovered rapidly upon laser offset (Fig. 2c), as indicated by no difference in the mean latency to initiate responding between inhibited and non-inhibited sessions (paired t-test, t(9)=−0.18, p=0.86; Fig. 2d). Furthermore, we observed no decrease in the total number of nosepokes emitted during the 2-hour behavioral session (paired t-test, t(9)=−0.394, p=0.70; Fig. 2e). These results show that optical inhibition of DA neurons decreases the likelihood for initiation of instrumental nosepoke response bouts in a behavioral task with a fixed, moderate effort requirement.
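The off-task trigger rule and the initiation measure can be sketched offline from event timestamps. This is a simplified illustration under stated assumptions, not the Med PC IV program used in the experiments: it assumes at most one laser pulse per reward collection, treats timestamps as seconds, and uses hypothetical function names and example times.

```python
def off_task_laser_onsets(collection_times, nosepoke_times, wait_s=15.0):
    """Laser onsets: wait_s after a pellet collection if no nosepoke occurred in between."""
    onsets = []
    for t in collection_times:
        poked = any(t < p <= t + wait_s for p in nosepoke_times)
        if not poked:
            onsets.append(t + wait_s)
    return onsets

def fraction_initiated(onsets, nosepoke_times, laser_s=15.0):
    """Fraction of laser periods in which at least one nosepoke was made."""
    if not onsets:
        return float("nan")
    hits = sum(any(on <= p < on + laser_s for p in nosepoke_times) for on in onsets)
    return hits / len(onsets)

# Hypothetical timestamps in seconds (not data from this study).
collections = [0.0, 100.0, 200.0]
nosepokes = [5.0, 120.0, 240.0]
onsets = off_task_laser_onsets(collections, nosepokes)
initiated = fraction_initiated(onsets, nosepokes)
print(onsets, initiated)
```

For a no-laser control session, the same functions can be applied to obtain the equivalent 15-s windows without any light delivery.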
To expand on this, we tested the effect of TH+ neuron inhibition in mice trained under a progressive ratio (PR) schedule, in which the number of nosepokes required to earn a reward is systematically increased over trials. To assess the requirement for DA neuron activity in response initiation in the PR task we triggered 15-s laser pulses when mice had failed to nosepoke for > 20 seconds (Fig. 2f). As with the FR8 procedure, DA neuron inhibition decreased the percentage of nosepoke bouts initiated during the laser pulse when compared to a control no-inhibition session (t(10)=3.50, p=0.006; Fig. 2g,h). We observed a fast rebound in the probability of initiating a nosepoke bout upon laser offset (Fig. 2h), and no change in the latency to initiate nosepoking after DA neuron inhibition when compared to no-inhibition control sessions (t(10)=−0.976, p=0.35; Fig. 2i). There was no change in the mean total nosepokes in the PR task (t(10)=0.921, p=0.379; Fig. 2j).
Inhibiting mesolimbic TH+ neurons after nosepoke bout initiation decreases bout duration
To determine whether DA neuron activation is required to maintain an ongoing sequence of instrumental responses, we tested the effects of DA neuron inhibition on responding under the FR8 schedule after subjects initiated the first nosepoke in the bout of eight required for reward (laser onset 0.01 seconds after response initiation, laser duration 15 seconds; Fig. 3a). Inhibiting DA neurons at the beginning of the nosepoke bout reduced the number of nosepokes completed during the laser pulse compared to responding during comparable 15-s windows in a second no-laser test session (paired t-test, t(9)=5.65, p=0.0003; Fig. 3b). In addition, the probability of completing additional nosepokes during the laser pulse decreased precipitously when compared to the no-laser control session (Fig. 3c). As a consequence of this reduction in the number of responses, mice completed significantly fewer trials (i.e., more frequently failed to meet the FR8 response requirement; paired t-test, t(9)=−6.85, p<0.0001; Fig. 3d), and made fewer nosepokes over the entire session, compared to the no-laser session (paired t-test, t(9)=3.67, p=0.005; Fig. 3e). To examine whether the observed decrease in nosepokes was secondary to motor deficits caused by DA neuron inhibition, we measured the time intervals between nosepoke responses while DA neurons were and were not inhibited. We found no difference in the proportion of inter-response intervals (IRI) that were less than 1 second (IRI < 1 sec; 0.87 ± 0.04 vs 0.82 ± 0.04, paired t-test, t(9)=1.9, p=0.09), in agreement with previous findings that inhibiting VTA DA neurons does not affect locomotor behavior per se (Tye et al., 2013).
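The inter-response interval measure used as a motor control can be computed from nosepoke timestamps as in the following minimal sketch. The function name `fraction_short_iris` and the example timestamps are hypothetical; the 1-second threshold is taken from the text.

```python
def fraction_short_iris(nosepoke_times, threshold_s=1.0):
    """Fraction of inter-response intervals shorter than threshold_s."""
    times = sorted(nosepoke_times)
    # Interval between each nosepoke and the one that follows it.
    iris = [later - earlier for earlier, later in zip(times, times[1:])]
    if not iris:
        return float("nan")
    return sum(iri < threshold_s for iri in iris) / len(iris)

# Hypothetical nosepoke timestamps in seconds within one analysis window.
example_times = [0.0, 0.4, 0.9, 3.0, 3.5]
short_fraction = fraction_short_iris(example_times)
print(short_fraction)
```

A similar proportion of short intervals with and without inhibition would indicate that nosepokes, when made, are executed at a comparable pace.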
Because inhibiting DA neurons following the initiation of every response bout markedly decreased nosepoking behavior, we wondered if this effect could depend upon changes in overall motivation that outlasted the period of inhibition. However, when we inhibited TH+ neurons after initiation of only 50% of nosepoke bouts and compared behavior during equivalent 15-sec periods for the remaining 50% of bouts (Fig. 3f), we observed a decrease in mean nosepokes that was specific for laser-on bouts as compared with laser-off bouts (paired t-test, t(9)=4.62, p=0.001; Fig. 3g,h). Because mice completed significantly fewer ratios (FR8) while the laser was on (t(9)=5.28, p=0.001; Fig. 3j), yet still performed overall as well as they did during a session with no inhibition (total nosepokes: paired t-test, t(9)=−0.394, p=0.70; total rewards earned: paired t-test, t(9)=1.13, p=0.29; Fig. 3i), we hypothesized that mice were compensating by increasing the number of bouts initiated throughout the test session. Bout initiation was defined as nosepoke responding after >3 seconds had elapsed since a prior nosepoke. (We chose a 3-second IRI because this was the shortest time period for which DA receptor antagonists were shown to impair instrumental response initiation (Nicola, 2010).) We found that mice did indeed initiate more bouts in sessions in which they received TH+ neuron inhibition for 50% of their bouts when compared to control, no-laser sessions (t(9)=−3.65, p=0.005; Fig. 3k).
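The bout definition can be applied to a list of nosepoke timestamps as in the sketch below. It assumes, as an illustration only, that the first nosepoke of a session also counts as a bout initiation; the function name `count_bouts` and the example times are hypothetical.

```python
def count_bouts(nosepoke_times, gap_s=3.0):
    """Count bout initiations: nosepokes made more than gap_s after the previous one."""
    times = sorted(nosepoke_times)
    if not times:
        return 0
    # Assumption: the first nosepoke of the session starts the first bout.
    new_bouts = sum((later - earlier) > gap_s for earlier, later in zip(times, times[1:]))
    return 1 + new_bouts

# Hypothetical nosepoke timestamps in seconds.
example_times = [0.0, 0.5, 1.0, 6.0, 6.4, 20.0]
n_bouts = count_bouts(example_times)
print(n_bouts)
```

Comparing this count between an inhibition session and a no-laser session gives the compensation measure described above.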
We also examined the impact of DA neuron inhibition on maintenance of responding during the PR task by inhibiting DA neurons after bout initiation (15-s laser onset after completion of 3 nosepokes with IRI < 0.75 seconds on 50% of trials; Fig. 4a). The mean number of nosepokes performed during inhibition decreased when compared to initiated bouts with no inhibition in the same behavioral test session (t(9)=2.81, p=0.02; Fig. 4b). The percentage of IRIs < 1 second between inhibited and non-inhibited conditions was not different, arguing against a motor deficit due to DA neuron inhibition (p=0.16, inhibited: 79.46%±9.07, simulated laser: 91.07%±2.18). When responding was examined relative to increasing response requirement (Fig. 4c), the average number of nosepokes during 15-s inhibition periods was relatively constant in the face of increasing overall response requirements in between reward deliveries. In trials where DA neurons were not inhibited (simulated laser), mice increased the number of nosepokes performed as the response requirement increased, performing an average of 22.57±4.57 nosepokes in 15 seconds in the more demanding response requirements (greater than 100 nosepokes required to earn each reward). DA neuron inhibition had no effect on the number of nosepokes performed during the laser pulse for response requirements less than 10 (p=0.50, inhibited: 6.78±1.73, simulated laser: 5.39±1.11) (Fig. 4c). However, mice performed fewer nosepokes while DA neurons were inhibited when compared to no-laser trials with response requirements above 26 (RR26–50, p=0.01; RR51–100, p<0.0001; RR 100+, p<0.0001; Fig. 4c). Interestingly, inhibiting DA neurons on 50% of trials resulted in significantly fewer nosepokes performed across the 2-hour test session when compared to control sessions with no DA neuron inhibition (t(9)=2.825, p=0.02; Fig. 4d).
To determine if mice attempted to compensate for the reduction in nosepoke responding during DA neuron inhibition, we measured the total number of bouts initiated across the inhibition session compared to a no-inhibition session. In this test condition, we did not observe an increase in the number of nosepoke bouts initiated in the inhibition session (p=0.92, inhibited: 75.5±8.37, no inhibition: 76.2±10.99).
DISCUSSION
Here we demonstrate that inhibition of midbrain DA neurons impairs performance of motivated behavior. We first show that DA neuron activity at the time mice are not engaged in instrumental behavior is required to promote initiation of an instrumental response. In addition, DA neuron activity at the time mice are engaged in an instrumental response promotes the continuation of responding through completion of the response requirement. Finally, because in most cases mice readily compensate for nosepokes omitted while DA neurons are inhibited, the effects appear limited to acute alterations in responding.
These findings are generally consistent with many years of work showing that long-term DA depletions and receptor antagonism impair performance on instrumental tasks with moderate to high response costs (Aberman and Salamone, 1999; Aberman et al., 1998; Mingote et al., 2005; Nicola, 2010; Salamone and Correa, 2012; Salamone et al., 2001). Our ability to confine DA inhibition to fifteen-second epochs allowed us to specifically ask whether DA is required for the initiation of effortful responding. The clear decrease in the initiation of a nosepoke bout during inhibition indicates that DA is indeed important for initiation. This finding supports the flexible approach hypothesis proposed to account for the effects of NAc DA manipulations on instrumental responding (Nicola, 2010). Nicola has proposed that NAc DA is required to reinitiate an instrumental response after a period of off-task behavior (Nicola, 2010), and that the impact of DA manipulations on effortful responding is accounted for by reductions in subjects’ ability to re-engage with the operandum after a pause in responding. Our results are additionally consistent with reports that self-initiated action sequences are preceded by DA neuron firing (cf. Jin and Costa, 2010; Romo and Schultz, 1990) as well as mesolimbic DA release (Ko and Wanat, 2016; Phillips et al., 2003; Roitman et al., 2004; Stuber et al., 2005; Wassum et al., 2012b). In addition, brief, phasic activation of DA cell body regions promotes cocaine-seeking behavior, suggesting a causal role for DA neuron activity in promoting response initiation (Phillips et al., 2003). Our findings may reflect a broader role of NAc DA in approach behavior, both to instrumental operanda and cues, as suggested by multiple experimental findings (cf. Collins et al., 2016; Di Ciano et al., 2001; Flagel et al., 2011; Fraser and Janak, 2017; Salamone et al., 2016; Saunders et al., 2017; Wassum et al., 2012b) and as explored in a recent review (Nicola, 2016).
Although evidence suggests NAc DA promotes approach behavior, this doesn't necessarily imply that DA is required to maintain effort once an instrumental response sequence is initiated. However, we found that inhibiting DA neurons during a bout of nosepokes significantly decreased the ability to complete an FR8 response requirement, as well as the higher response requirements within a progressive ratio procedure. In contrast, some studies using DA antagonists suggest that DA is not required for behavioral responding during moderate to long sequences of instrumental actions (Nicola, 2010). There are multiple possible explanations for the difference between the current study and past studies using DA antagonists. First, optogenetic inhibition of midbrain DA neurons will also impact co-transmitters of these neurons, for example, glutamate (Morales and Margolis, 2017; Stuber et al., 2010; Yamaguchi et al., 2015). Therefore, it is possible that some of our behavioral effects may be dependent on decreased glutamate release as well as DA release in terminal regions, which could explain the discrepancy between our findings and those with DA antagonists. Second, the pharmacological impact of DA antagonists at presynaptic DA receptors and/or pharmacological actions over other behavioral epochs, in addition to actual responding, could produce different overall effects on behavior. Finally, it is possible that the effect of inhibition after initiation of a bout observed here was mediated by a reduction in return to responding after very brief disengagements by subjects on some trials; in this case, one might propose that re-initiation was impaired, but what might be the defining difference between re-initiation and maintenance? Independent of the terminology, while we did not find evidence for this type of deficit in the current study, it would be of interest to directly test the impact of brief forced response breaks within bouts.
We note that in the progressive ratio test we did not see a deficit in nosepoking behavior as a result of DA neuron inhibition for response requirements less than ten and only a minor deficit for response requirements less than twenty-five. These results might seem surprising given the marked decrease in nosepoking behavior observed during DA neuron inhibition in the FR8 task. This may be explained by a procedural difference in the inhibition; whereas laser onset occurred after one nosepoke in the FR8 tests, laser onset occurred after the third nosepoke in the progressive ratio task. While we made this adjustment to bias our inhibition to times when we imagined the mouse was ‘committed’ to performing a long bout, this change precludes a direct comparison of effect magnitude across the two procedures. It is also possible that this difference arises from the variance in relative response cost experienced by subjects in FR8 vs progressive ratio, as this variable can determine the impact of DA manipulations. In support of this idea, Ostlund and colleagues (2012) found that rats that experienced an upshift or no change in relative response cost during a test session as compared to prior training sessions (FR1 or FR10 train, FR10 test) showed a substantial deficit in task performance under DA receptor antagonists. On the other hand, animals that experienced a downshift in relative response cost (FR20 train, FR10 test) were not impaired in test performance after DA receptor antagonism. Therefore, DA may mediate a subjective, relative evaluation of the current effort requirements of a task rather than the absolute cost of obtaining a reward. Future studies to test this possibility using DA neuron inhibition in short time frames as in the present study would be beneficial.
We inhibited DA neuron firing at the cell-body level, in the VTA. In addition to ventral striatal projections, DA neurons in the VTA have non-striatal targets, and it is possible that inhibition of projections to the prefrontal cortex, hippocampus, or other regions, could account for some of the effects seen here, a possibility that must be tested.
Notably, the effects reported here were limited to the actual time epoch during which the laser was on. When within-bout inhibition only occurred on 50% of FR8 trials, mice increased the number of bouts initiated during times of no inhibition, thereby earning equivalent reward over the session. This indicates that, to a certain extent, mice are able to compensate for the acute effects of in-bout DA neuron inhibition. Likewise, while we observed pronounced acute effects of inhibiting DA neurons on response initiation, the delays in response initiation did not translate into a reduction in total nosepoke behavior in any of the tests performed, as subjects initiated sufficient response bouts during periods of no inhibition to earn control levels of reward over the session. We conclude that interrupting DA neuron activity at the time of performing a bout of operant responses substantially interferes with the ability of mice to overcome response costs and maintain motivation while the inhibition is occurring, but these effects are acute and recover rapidly.
These effects therefore are in line with a direct effect on motivated performance of reward-seeking behavior, rather than a learning effect revealed at a later time point. While we note that a minority of laser-on periods overlapped with reward delivery, which could be expected to extinguish reward-seeking behavior by acting as a negative reward prediction error (Chang et al., 2016; Parker et al., 2010; Steinberg and Janak, 2013), this appears not to be sufficient in the tests conducted here to alter later behavior outside of the laser-on periods. A direct test in which inhibition is consistently paired with reward delivery is required to explore this issue.
Highlights
Is midbrain dopamine neuron activity required for responding on fixed ratio and progressive ratio tasks in mice?
Optogenetic inhibition of dopamine neurons decreases the likelihood of initiating a response bout
Optogenetic inhibition of dopamine neurons after bout initiation decreases the likelihood of bout completion
Mice counteract depressant effects of dopamine inhibition during no-inhibition periods to maintain overall reward levels
Acknowledgments
This work was supported by National Institutes of Health grant R01 DA035943 and funds from the State of California through UCSF.
Footnotes
Author contributions: Experimental conception, design and interpretation: SFW, PHJ; data collection: SFW, RMR; data analysis: SFW; writing and editing: SFW, PHJ.
References
- Aberman JE, Salamone JD. Nucleus accumbens dopamine depletions make rats more sensitive to high ratio requirements but do not impair primary food reinforcement. Neuroscience. 1999;92:545–552. doi: 10.1016/s0306-4522(99)00004-4.
- Aberman JE, Ward SJ, Salamone JD. Effects of dopamine antagonists and accumbens dopamine depletions on time-constrained progressive-ratio performance. Pharmacol Biochem Behav. 1998;61:341–348. doi: 10.1016/s0091-3057(98)00112-9.
- Beeler JA, Frazier CRM, Zhuang X. Putting desire on a budget: dopamine and energy expenditure, reconciling reward and resources. Front Integr Neurosci. 2012;6:49. doi: 10.3389/fnint.2012.00049.
- Berridge KC. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl) 2007;191:391–431. doi: 10.1007/s00213-006-0578-x.
- Berridge KC, Robinson TE. Parsing reward. Trends Neurosci. 2003;26:507–513. doi: 10.1016/S0166-2236(03)00233-9.
- Cagniard B, Beeler JA, Britt JP, McGehee DS, Marinelli M, Zhuang X. Dopamine scales performance in the absence of new learning. Neuron. 2006;51:541–547. doi: 10.1016/j.neuron.2006.07.026.
- Chang CY, Esber GR, Marrero-Garcia Y, Yau HJ, Bonci A, Schoenbaum G. Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat Neurosci. 2016;19:111–116. doi: 10.1038/nn.4191.
- Chaudhury D, Walsh JJ, Friedman AK, Juarez B, Ku SM, Koo JW, Ferguson D, Tsai HC, Pomeranz L, Christoffel DJ, et al. Rapid regulation of depression-related behaviours by control of midbrain dopamine neurons. Nature. 2013;493:532–536. doi: 10.1038/nature11713.
- Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 2012;482:85–88. doi: 10.1038/nature10754.
- Collins AL, Greenfield VY, Bye JK, Linker KE, Wang AS, Wassum KM. Dynamic mesolimbic dopamine signaling during action sequence learning and expectation violation. Sci Rep. 2016;6:20231. doi: 10.1038/srep20231.
- Day JJ, Roitman MF, Wightman RM, Carelli RM. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nat Neurosci. 2007;10:1020–1028. doi: 10.1038/nn1923.
- Di Ciano P, Cardinal RN, Cowell RA, Little SJ, Everitt BJ. Differential involvement of NMDA, AMPA/kainate, and dopamine receptors in the nucleus accumbens core in the acquisition and performance of pavlovian approach behavior. J Neurosci Off J Soc Neurosci. 2001;21:9471–9477. doi: 10.1523/JNEUROSCI.21-23-09471.2001.
- Eshel N, Bukwich M, Rao V, Hemmelder V, Tian J, Uchida N. Arithmetic and local circuitry underlying dopamine prediction errors. Nature. 2015;525:243–246. doi: 10.1038/nature14855.
- Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, Akers CA, Clinton SM, Phillips PEM, Akil H. A selective role for dopamine in stimulus-reward learning. Nature. 2011;469:53–57. doi: 10.1038/nature09588.
- Fraser KM, Janak PH. Long-lasting contribution of dopamine in the nucleus accumbens core, but not dorsal lateral striatum, to sign-tracking. Eur J Neurosci. 2017;46:2047–2055. doi: 10.1111/ejn.13642.
- Goto Y, Otani S, Grace AA. The Yin and Yang of dopamine release: a new perspective. Neuropharmacology. 2007;53:583–587. doi: 10.1016/j.neuropharm.2007.07.007.
- Grace AA. The tonic/phasic model of dopamine system regulation and its implications for understanding alcohol and psychostimulant craving. Addict Abingdon Engl. 2000;95(Suppl 2):S119–128. doi: 10.1080/09652140050111690.
- Grace AA. Dysregulation of the dopamine system in the pathophysiology of schizophrenia and depression. Nat Rev Neurosci. 2016;17:524–532. doi: 10.1038/nrn.2016.57.
- Hamid AA, Pettibone JR, Mabrouk OS, Hetrick VL, Schmidt R, Vander Weele CM, Kennedy RT, Aragona BJ, Berke JD. Mesolimbic dopamine signals the value of work. Nat Neurosci. 2016;19:117–126. doi: 10.1038/nn.4173.
- Hart AS, Rutledge RB, Glimcher PW, Phillips PEM. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci Off J Soc Neurosci. 2014;34:698–704. doi: 10.1523/JNEUROSCI.2489-13.2014.
- Hosking JG, Floresco SB, Winstanley CA. Dopamine antagonism decreases willingness to expend physical, but not cognitive, effort: a comparison of two rodent cost/benefit decision-making tasks. Neuropsychopharmacol Off Publ Am Coll Neuropsychopharmacol. 2015;40:1005–1015. doi: 10.1038/npp.2014.285.
- Ilango A, Kesner AJ, Keller KL, Stuber GD, Bonci A, Ikemoto S. Similar roles of substantia nigra and ventral tegmental dopamine neurons in reward and aversion. J Neurosci Off J Soc Neurosci. 2014;34:817–822. doi: 10.1523/JNEUROSCI.1703-13.2014.
- Jin X, Costa RM. Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature. 2010;466:457–462. doi: 10.1038/nature09263.
- Ko D, Wanat MJ. Phasic Dopamine Transmission Reflects Initiation Vigor and Exerted Effort in an Action- and Region-Specific Manner. J Neurosci Off J Soc Neurosci. 2016;36:2202–2211. doi: 10.1523/JNEUROSCI.1279-15.2016.
- Ljungberg T, Apicella P, Schultz W. Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol. 1992;67:145–163. doi: 10.1152/jn.1992.67.1.145.
- Matsumoto M, Hikosaka O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature. 2009;459:837–841. doi: 10.1038/nature08028.
- Mingote S, Weber SM, Ishiwari K, Correa M, Salamone JD. Ratio and time requirements on operant schedules: effort-related effects of nucleus accumbens dopamine depletions. Eur J Neurosci. 2005;21:1749–1757. doi: 10.1111/j.1460-9568.2005.03972.x.
- Morales M, Margolis EB. Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat Rev Neurosci. 2017;18:73–85. doi: 10.1038/nrn.2016.165.
- Nicola SM. The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. J Neurosci Off J Soc Neurosci. 2010;30:16585–16600. doi: 10.1523/JNEUROSCI.3958-10.2010.
- Nicola SM. Reassessing wanting and liking in the study of mesolimbic influence on food intake. Am J Physiol Regul Integr Comp Physiol. 2016;311:R811–R840. doi: 10.1152/ajpregu.00234.2016.
- Niv Y. Cost, benefit, tonic, phasic: what do response rates tell us about dopamine and motivation? Ann N Y Acad Sci. 2007;1104:357–376. doi: 10.1196/annals.1390.018.
- Niv Y, Daw ND, Joel D, Dayan P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl) 2007;191:507–520. doi: 10.1007/s00213-006-0502-4.
- Ostlund SB, Kosheleff AR, Maidment NT. Relative response cost determines the sensitivity of instrumental reward seeking to dopamine receptor blockade. Neuropsychopharmacol Off Publ Am Coll Neuropsychopharmacol. 2012;37:2653–2660. doi: 10.1038/npp.2012.129.
- Owesson-White CA, Cheer JF, Beyene M, Carelli RM, Wightman RM. Dynamic changes in accumbens dopamine correlate with learning during intracranial self-stimulation. Proc Natl Acad Sci U S A. 2008;105:11957–11962. doi: 10.1073/pnas.0803896105.
- Parker JG, Zweifel LS, Clark JJ, Evans SB, Phillips PEM, Palmiter RD. Absence of NMDA receptors in dopamine neurons attenuates dopamine release but not conditioned approach during Pavlovian conditioning. Proc Natl Acad Sci U S A. 2010;107:13491–13496. doi: 10.1073/pnas.1007827107.
- Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, Davidson TJ, Daw ND, Witten IB. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat Neurosci. 2016;19:845–854. doi: 10.1038/nn.4287.
- Phillips PEM, Stuber GD, Heien MLAV, Wightman RM, Carelli RM. Subsecond dopamine release promotes cocaine seeking. Nature. 2003;422:614–618. doi: 10.1038/nature01476.
- Robbins TW, Everitt BJ. A role for mesencephalic dopamine in activation: commentary on Berridge (2006) Psychopharmacology (Berl) 2007;191:433–437. doi: 10.1007/s00213-006-0528-7.
- Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci. 2007;10:1615–1624. doi: 10.1038/nn2013.
- Roitman MF, Stuber GD, Phillips PEM, Wightman RM, Carelli RM. Dopamine operates as a subsecond modulator of food seeking. J Neurosci Off J Soc Neurosci. 2004;24:1265–1271. doi: 10.1523/JNEUROSCI.3823-03.2004.
- Romo R, Schultz W. Dopamine neurons of the monkey midbrain: contingencies of responses to active touch during self-initiated arm movements. J Neurophysiol. 1990;63:592–606. doi: 10.1152/jn.1990.63.3.592.
- Salamone JD. Functional significance of nucleus accumbens dopamine: behavior, pharmacology and neurochemistry. Behav Brain Res. 2002;137:1. doi: 10.1016/s0166-4328(02)00281-4.
- Salamone JD, Correa M. The mysterious motivational functions of mesolimbic dopamine. Neuron. 2012;76:470–485. doi: 10.1016/j.neuron.2012.10.021.
- Salamone JD, Steinpreis RE, McCullough LD, Smith P, Grebel D, Mahan K. Haloperidol and nucleus accumbens dopamine depletion suppress lever pressing for food but increase free food consumption in a novel food choice procedure. Psychopharmacology (Berl) 1991;104:515–521. doi: 10.1007/BF02245659.
- Salamone JD, Wisniecki A, Carlson BB, Correa M. Nucleus accumbens dopamine depletions make animals highly sensitive to high fixed ratio requirements but do not impair primary food reinforcement. Neuroscience. 2001;105:863–870. doi: 10.1016/s0306-4522(01)00249-4.
- Salamone JD, Pardo M, Yohn SE, López-Cruz L, SanMiguel N, Correa M. Mesolimbic Dopamine and the Regulation of Motivated Behavior. Curr Top Behav Neurosci. 2016;27:231–257. doi: 10.1007/7854_2015_383.
- Saunders BT, Richard JM, Margolis EB, Janak PH. Instantiation of incentive value and movement invigoration by distinct midbrain dopamine circuits. bioRxiv 2017
- Schultz W. Multiple dopamine functions at different time courses. Annu Rev Neurosci. 2007;30:259–288. doi: 10.1146/annurev.neuro.28.061604.135722.
- Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
- Schultz W, Carelli RM, Wightman RM. Phasic dopamine signals: from subjective reward value to formal economic utility. Curr Opin Behav Sci. 2015;5:147–154. doi: 10.1016/j.cobeha.2015.09.006.
- Steinberg EE, Janak PH. Establishing causality for dopamine in neural function and behavior with optogenetics. Brain Res. 2013;1511:46–64. doi: 10.1016/j.brainres.2012.09.036.
- Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci. 2013;16:966–973. doi: 10.1038/nn.3413.
- Stuber GD, Wightman RM, Carelli RM. Extinction of cocaine self-administration reveals functionally and temporally distinct dopaminergic signals in the nucleus accumbens. Neuron. 2005;46:661–669. doi: 10.1016/j.neuron.2005.04.036.
- Stuber GD, Hnasko TS, Britt JP, Edwards RH, Bonci A. Dopaminergic terminals in the nucleus accumbens but not the dorsal striatum corelease glutamate. J Neurosci Off J Soc Neurosci. 2010;30:8229–8233. doi: 10.1523/JNEUROSCI.1754-10.2010.
- Sulzer D, Cragg SJ, Rice ME. Striatal dopamine neurotransmission: regulation of release and uptake. Basal Ganglia. 2016;6:123–148. doi: 10.1016/j.baga.2016.02.001.
- Tan KR, Yvon C, Turiault M, Mirzabekov JJ, Doehner J, Labouèbe G, Deisseroth K, Tye KM, Lüscher C. GABA neurons of the VTA drive conditioned place aversion. Neuron. 2012;73:1173–1183. doi: 10.1016/j.neuron.2012.02.015.
- Tsai HC, Zhang F, Adamantidis A, Stuber GD, Bonci A, de Lecea L, Deisseroth K. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science. 2009;324:1080–1084. doi: 10.1126/science.1168878.
- Tye KM, Mirzabekov JJ, Warden MR, Ferenczi EA, Tsai HC, Finkelstein J, Kim SY, Adhikari A, Thompson KR, Andalman AS, et al. Dopamine neurons modulate neural encoding and expression of depression-related behaviour. Nature. 2013;493:537–541. doi: 10.1038/nature11740.
- Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48. doi: 10.1038/35083500.
- Wanat MJ, Willuhn I, Clark JJ, Phillips PEM. Phasic dopamine release in appetitive behaviors and drug addiction. Curr Drug Abuse Rev. 2009;2:195–213. doi: 10.2174/1874473710902020195.
- Wassum KM, Tolosa VM, Tseng TC, Balleine BW, Monbouquette HG, Maidment NT. Transient extracellular glutamate events in the basolateral amygdala track reward-seeking actions. J Neurosci Off J Soc Neurosci. 2012a;32:2734–2746. doi: 10.1523/JNEUROSCI.5780-11.2012.
- Wassum KM, Ostlund SB, Maidment NT. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol Psychiatry. 2012b;71:846–854. doi: 10.1016/j.biopsych.2011.12.019.
- Yamaguchi T, Qi J, Wang HL, Zhang S, Morales M. Glutamatergic and dopaminergic neurons in the mouse ventral tegmental area. Eur J Neurosci. 2015;41:760–772. doi: 10.1111/ejn.12818.
- Zweifel LS, Parker JG, Lobb CJ, Rainwater A, Wall VZ, Fadok JP, Darvas M, Kim MJ, Mizumori SJY, Paladini CA, et al. Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. Proc Natl Acad Sci U S A. 2009;106:7281–7288. doi: 10.1073/pnas.0813415106.