Abstract
Rationale
Dopamine (DA) activity in the nucleus accumbens (NAc) is related to the general motivational effects of rewarding stimuli. Dickinson and colleagues have shown that initial acquisition of instrumental responding reflects action–outcome relationships based on instrumental incentive learning, which establishes the value of an outcome. Given that the sensitivity of responding to outcome devaluation is not affected by NAc lesions, it is unlikely that incentive learning during the action–outcome phase is mediated by DA activity in the NAc.
Objectives
DA efflux in the NAc after limited and extended training was compared on the assumption that comparable changes would be observed during both action–outcome- and habit-based phases of instrumental responding for food. This study also tested the hypothesis that increase in NAc DA activity is correlated with instrumental responding during extinction maintained by a conditioned stimulus paired with food.
Methods
Rats were trained to lever press for food (random-interval 30 s schedule). On the 5th and 16th day of training, microdialysis samples were collected from the NAc or mediodorsal striatum (a control site for generalized activity) during instrumental responding in extinction and then for food reward, and analyzed for DA content using high performance liquid chromatography.
Results
Increase in DA efflux in the NAc accompanied responding for food pellets on both days 5 and 16, with the magnitude of increase significantly enhanced on day 16. DA efflux was also significantly elevated during responding in extinction only on day 16.
Conclusions
These results support a role for NAc DA activity in Pavlovian, but not instrumental, incentive learning.
Keywords: Instrumental learning, Action–outcome, Habit, Food reward, Uncertainty, Microdialysis, Dopamine, Nucleus accumbens, Mediodorsal striatum
Introduction
Despite more than 25 years of intensive research there is still a vigorous and ongoing debate concerning the specific aspects of complex and adaptive behaviors that are mediated by brain dopamine (DA) function (Ikemoto and Panksepp 1999; Redgrave et al. 1999; Salamone and Correa 2002; Schultz 2002; Berridge and Robinson 2003; Joseph et al. 2003; Wise 2004; Kelley et al. 2005; Salamone et al. 2005; Young et al. 2005). Much of the attention is focused on the nucleus accumbens (NAc) and a consensus has formed around the theory that DA innervation of this structure plays a key role in incentive motivation, a Pavlovian conditioned appetitive state that can influence the vigor of approach behavior (Fibiger and Phillips 1986; Robbins et al. 1989; Berridge and Robinson 1998; Cardinal et al. 2002; Parkinson et al. 2002). Consistent with the importance of conditional stimuli (CS+) in the initiation of approach behavior, visual and olfactory stimuli associated with natural rewards such as food or sexually receptive conspecifics can evoke significant increases in DA efflux in the NAc that precede similar changes observed during consummatory behavior (Fiorino et al. 1997; Ahn and Phillips 1999, 2002). Depletion or antagonism of DA function in the NAc eliminates or diminishes the influence of Pavlovian CSs on the level of instrumental responding, i.e., Pavlovian to instrumental transfer (Cardinal et al. 2002; Parkinson et al. 2002), and additionally, shifts response choice from that with higher work demand and a food reward of greater value to that with lower work cost and lower reward value (Salamone et al. 1994; Salamone and Correa 2002).
Contemporary views of instrumental conditioning propose that performance of an instrumental task is controlled by two distinct processes. Based on carefully designed experiments, Dickinson et al. (1995) have demonstrated that during initial learning trials (i.e., after exposure to 120 outcomes), an animal’s performance is based on the knowledge (or expectation) that an instrumental action will lead to a specific biologically significant outcome. After many more trials (i.e., after exposure to 360 outcomes), responses gradually shift from being outcome-dependent to habit-based. It is only in the early action–outcome controlled stage that instrumental performance is sensitive to manipulations that alter the incentive value of the outcome, underscoring the remarkable ability of rats to acquire and encode information relating current incentive value to an action–outcome contingency in as little as 120 outcome trials. Using outcome devaluation and Pavlovian to instrumental transfer tests, lesions of the basolateral amygdala have been reported to impair the capacity of rats to encode the relation between a specific action and the value of an outcome (Corbit and Balleine 2005).
Given the generally accepted role of DA in the NAc in Pavlovian-based incentive motivation, the question arises as to whether DA activity in this region is involved in action–outcome and/or habit-based stages of instrumental responding. A role for either the core or shell region of the NAc in the formation of action–outcome associations has not been confirmed (Balleine and Killcross 1994; Corbit et al. 2001; de Borchgrave et al. 2002), but deficits in the acquisition of instrumental responding have been reported after blockade of the NMDA class of glutamate receptors in the NAc core (Kelley et al. 1997). In a recent study, Yin et al. (2006) examined DA function during early action–outcome as distinct from later habit-based stages of instrumental responding. Mice with a knockdown of the DA transporter and chronically elevated levels of DA showed no deficits in acquisition of an instrumental task, and responding early in training was still sensitive to tests of outcome devaluation (Yin et al. 2006). However, in rats with bilateral 6-OHDA lesions of the nigrostriatal DA system, responding on instrumental tasks remained sensitive to reward devaluation despite extensive training sessions (Faure et al. 2005). Thus, the nigrostriatal DA system, and more specifically, its innervation of the posterior lateral striatum, appears to be necessary for transition of instrumental conditioning from an action–outcome stage to a habit-based stage.
Important insights into the role of DA transmission at different stages of instrumental behavior may be gained by examining in vivo changes in DA levels in major areas of DA innervation. Therefore, we conducted microdialysis experiments in the NAc and mediodorsal (MD) striatum at early (5th day) and later (16th day) stages of instrumental learning employing a protocol used successfully to demonstrate that responding after 120 but not 360 reinforced responses is sensitive to outcome devaluation and is therefore action–outcome as distinct from habit-based (Dickinson et al. 1995). Given the finding that quinolinic acid or NMDA-induced cytotoxic lesions of the NAc failed to alter suppression of instrumental responding after outcome devaluation (de Borchgrave et al. 2002), it is unlikely that activity in the NAc is involved in instrumental incentive learning. Rather, the finding that lesions restricted only to the NAc completely abolished Pavlovian to instrumental transfer is consistent with the involvement of DA function in this nucleus in conditioned appetitive states. Therefore, we hypothesize that DA efflux in the NAc may be comparable during both early action–outcome and later habit-based stages of instrumental responding. We also examined changes in DA levels in the MD striatum, as a control site for generalized activity; hence, we predicted no significant changes in DA efflux associated with instrumental responding. The design of this study also incorporated a “within-trial” extinction phase to test the hypothesis that DA efflux in the NAc but not MD striatum would be increased significantly by presentation of a Pavlovian CS+ paired previously with food pellets during instrumental responding in extinction. The relationship between response rate and magnitude of DA efflux in either NAc or MD striatum was also of interest.
Materials and methods
Surgery
Long–Evans male rats (Charles River, Canada) weighing 280–310 g were implanted bilaterally with stainless steel guide cannulae (19 gauge, 15 mm) under anesthesia induced by xylazine (7 mg/kg) and ketamine hydrochloride (100 mg/kg) delivered intraperitoneally. Cannulae were implanted 1 mm below dura immediately above the NAc [in millimeter: +1.7 anteroposteriorly (AP) and ±1.1 mediolaterally (ML) from bregma] or MD striatum (+1.2 AP, 2.5 ML) and secured with dental acrylic and four stainless steel screws. Stylets maintained patency of the cannulae until probe implantation. An additional sham-cannula was embedded within the back half of the acrylic head cap for purposes of habituation (see Behavioral apparatus and training).
Immediately after surgery, rats were moved to a reverse light-cycle (lights on 7 a.m. to 7 p.m.) colony room maintained at 20°C and housed individually in plastic bins lined with corncob bedding. Novel objects (e.g., egg cartons and paper rolls) were placed weekly in the cages to promote exploratory and play behavior. Rats were handled and weighed by the experimenter on a daily basis. Four days after surgery, rats were placed on a food restriction schedule for the duration of the experiment, maintaining their body weight at ∼85% of their free-feeding weight. The daily ration of food (20–25 g Purina Rat Chow) was given to the rats in their home cages following each day’s operant responding session. Water was available at all times except during the operant training and testing sessions.
Behavioral apparatus and training
Training and experiment sessions were conducted in a Plexiglas chamber (30×23×23 cm) with wire-mesh flooring. It was equipped with a retractable lever, a dispenser that delivered 45 mg Noyes pellets, and a light source (1 W, 12 V) located on the wall opposite the location of the lever. Pellets were dispensed into a photocell-equipped magazine located just left of the lever. The chamber was enclosed in a sound-attenuated ventilated box (Coulbourn Instruments; Allentown, PA, USA) which had a small hole in the ceiling to allow passage of dialysis lines.
Training began 8 days into the food-deprivation schedule. Before a rat was placed in a testing chamber, a stainless steel coil was attached to the sham-cannula. This allowed the rat to habituate to the weight and feel of being tethered to the stainless steel coil which normally sheathes microdialysis lines during experiments. Each session began with rats being placed in operant chambers under baseline conditions (coil attached, lever retracted, and light off), the length of which was varied from day to day. Illumination of the light and insertion of the lever signaled the availability of pellets contingent on lever presses. A computer was used to control the equipment and record the number of lever presses and nose poke entries into the magazine.
At the start of the study, rats were trained to approach the magazine to retrieve 30 pellets delivered on a random-time 60-s schedule. Instrumental training sessions began the next day (day 1) on a RI 2-s schedule of pellet delivery. The schedule was changed to RI 15-s on day 2 and then to RI 30-s on day 3, and remained so for the remainder of the training and microdialysis sessions. Each instrumental training session was terminated when rats had responded for 30 pellets, except on days 4 and 15 when rats were allowed to press for 60 pellets (see Fig. 2). To minimize the effect of stress before microdialysis tests, rats remained overnight in the chambers under baseline conditions with water, after the training session on day 2.
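The logic of a random-interval (RI) schedule can be illustrated with a minimal sketch. It assumes that the intervals at which a pellet becomes available are drawn from an exponential distribution with a mean of 30 s; the text above does not state how the intervals were generated, so this generator, the function name `simulate_ri_session`, and its parameters are hypothetical and serve illustration only.

```python
import random


def simulate_ri_session(press_times, mean_interval_s=30.0, max_pellets=30, seed=0):
    """Return the times of reinforced lever presses under an RI schedule.

    Assumption (not from the paper): the delay until the next pellet is
    "armed" is exponentially distributed with mean `mean_interval_s`.
    The first press made after arming is reinforced, then a new delay starts.
    """
    rng = random.Random(seed)
    rewarded = []
    # Time at which the next pellet becomes available.
    armed_at = rng.expovariate(1.0 / mean_interval_s)
    for t in sorted(press_times):
        if t >= armed_at:
            rewarded.append(t)
            if len(rewarded) >= max_pellets:
                break  # session ends once the pellet limit is reached
            armed_at = t + rng.expovariate(1.0 / mean_interval_s)
    return rewarded


# Hypothetical usage: one press per second for an hour, capped at 30 pellets.
pellet_times = simulate_ri_session(range(3600))
```

Under such a schedule the number of pellets earned depends mainly on elapsed time rather than on response rate, which is consistent with rats obtaining a similar number of pellets on both test days despite different rates of lever pressing.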
Microdialysis probes and high pressure liquid chromatography system
Microdialysis probes were constructed as previously described (Ahn and Phillips 1999; Fiorino et al. 1997). They were concentric in design with a 2-mm semipermeable membrane (340 μm outer diameter, 65 kD MW cut-off, Filtral 12, Hospal; Nürnberg, Germany) with PE50 inlet and silica outlet tubing. A probe assembled in this manner typically had, at 21°C, in vitro recoveries of ∼22% DA. Once a probe was implanted in the brain through the guide cannula (see “Microdialysis experiments” section below for implant coordinates), a cylindrical brass collar secured the probe in place. The inlet tubing was connected to a liquid swivel (Instech 375s; Plymouth Meeting, PA, USA) which was mounted on top of the Coulbourn box. Both inlet and outlet tubing were encased in a protective stainless steel coil which extended from the liquid swivel to the brass collar. A 2.5-ml gas-tight syringe (Unimetrics) and syringe pump (Model 22, Harvard Apparatus; South Natick, MA, USA) were used to perfuse a modified Ringer’s solution (in millimolar: 10 sodium phosphate buffer, 1.3 CaCl2, 3.0 KCl, 1.0 MgCl2, 147 NaCl, pH 7.4) through the probe at 1 μl/min.
DA in microdialysis samples was separated by high pressure liquid chromatography (HPLC) and quantified by electrochemical detection. The HPLC system consisted of, in sequence of flow, a Bio-Rad pump (Hercules, CA, USA), an SSI pulse damper (State College, PA, USA), a Valco Instruments two-position auto-injector (EC10W; Houston, TX, USA), a Beckman reverse-phase column (Ultrasphere, ODS 5 μm, 15 cm, 4.6 mm internal diameter; Fullerton, CA, USA), an ESA guard cell (Model 5020; Chelmsford, MA, USA), an ESA analytical cell (Model 5011), an ESA Coulochem II Electrochemical detector, and a dual channel chart recorder. The electrochemical detection parameters were: +450 mV for the oxidation channel, −300 mV for the reduction channel, and −450 mV for the guard cell output. The mobile phase (in millimolar: 83 sodium acetate buffer, 27 EDTA, and 1.30 sodium octyl sulfate at pH 3.5, 10% methanol) flowed through the system at 1 ml/min.
Microdialysis experiments
In all experiments, each rat was tested with microdialysis on days 5 and 16. The side of probe implantation (i.e., left or right brain structure) was counterbalanced for each experiment day. The two microdialysis sessions were conducted using an identical protocol. The day before an experiment, rats were implanted unilaterally with probes in the NAc (exposed membrane spanned −6.0 to −8.0 mm DV from dura) or MD striatum (−4.0 to −6.0 mm DV from dura) after the day’s training session and kept overnight (∼16–18 h) in the test chamber with water under baseline conditions. The next morning, microdialysis samples were collected at 10-min intervals (10 μl volume) and analyzed immediately with HPLC. Baseline conditions were continued until four consecutive samples showed stable DA levels (i.e., <10% fluctuation between samples; times 1–4).
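The baseline stability criterion can be sketched as a simple check. The sketch assumes that "fluctuation between samples" means the relative change between consecutive samples; the text does not define the measure more precisely, so this interpretation and the function name `is_stable_baseline` are hypothetical.

```python
def is_stable_baseline(samples, n=4, tolerance=0.10):
    """Return True if the last `n` samples form a stable baseline.

    Assumption (not from the paper): stability means that each of the last
    `n` samples differs from the one before it by less than `tolerance`,
    expressed as a fraction of the earlier sample. Concentrations must be
    positive.
    """
    if len(samples) < n:
        return False
    window = samples[-n:]
    return all(abs(b - a) / a < tolerance for a, b in zip(window, window[1:]))


# Hypothetical DA concentrations (nM) from consecutive 10-min samples.
print(is_stable_baseline([3.5, 2.5, 2.6, 2.55, 2.7]))
```

In practice such a check would be applied after each new sample, and the behavioral phase would begin only once it returned true.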
In “Experiment 1”, the baseline period was followed by the concurrent illumination of the chamber and insertion of the lever. Rats were then allowed a 30-min period (times 5–7) during which they could lever press for food on a RI 30-s schedule. The session concluded with the retraction of the lever and a 40-min time-out period with the lights off (times 8–11). In “Experiment 2”, a 30-min extinction phase was incorporated (times 5–7) before a food reward phase. All aspects of the extinction phase were identical to previous rewarded training sessions except that lever presses did not lead to food delivery. A 20-min time-out period under baseline conditions (times 8–9) preceded a 10-min priming period when the light was on and five pellets were delivered randomly and noncontingently (time 10). The start of a food-rewarded phase (times 11–13) was marked by the insertion of the lever into the already illuminated chamber. During this period, rats were allowed to lever press for pellets on a RI 30-s schedule for 30 min. An experiment concluded with an additional 40-min postsession baseline period (times 14–17).
Histology
Rats were deeply anesthetized with chloral hydrate and intracardially perfused first with 0.9% NaCl and then phosphate-buffered formalin (3.7% formaldehyde). The removed brains were stored in 15% (w/v) sucrose in formalin for at least 24 h before being prepared as 50 μm coronal sections on 2% gel-coated slides. Cresyl violet staining was used to help verify placement of probe tracts. Only data from those rats with tracts in the shell/core region of the NAc (16 of 18 rats) and MD striatum (six of seven rats) of both hemispheres were included in the statistical analyses.
Data analyses
Neurochemical data were transformed into percentage of change from baseline (i.e., 0% representing the average concentration of the three samples preceding the final 4th baseline sample). Neurochemical data were analyzed using either a one-way (time) or two-way (day × time) repeated measures ANOVA followed by the Dunnett method of multiple comparisons, using the final baseline sample (time 4) as the control sample. The Huynh–Feldt correction for nonsphericity was applied to the degrees of freedom for all within-subject analyses. Comparisons between two means were assessed using paired t tests. A coefficient of determination (R²) was computed based on a linear regression of a scatter plot between lever presses/10 min (y-axis) and corresponding percent changes in DA efflux (x-axis), for data obtained on test days 5 and 16. Statistical analyses were performed using Systat or SPSS statistical packages.
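The baseline transformation described above can be written as a short sketch. The function name `percent_change_from_baseline` and the example concentrations are hypothetical; only the rule that 0% equals the mean of the first three baseline samples is taken from the text.

```python
def percent_change_from_baseline(concentrations, n_baseline=3):
    """Express each sample as percent change from the baseline mean.

    The baseline (0%) is the mean of the first `n_baseline` samples, i.e.,
    the three samples preceding the final (4th) baseline sample, which is
    itself kept as the control for later comparisons.
    """
    baseline = sum(concentrations[:n_baseline]) / n_baseline
    return [100.0 * (c - baseline) / baseline for c in concentrations]


# Hypothetical DA concentrations (nM): four baseline samples, then one test sample.
print(percent_change_from_baseline([2.0, 2.0, 2.0, 2.0, 3.0]))
# prints [0.0, 0.0, 0.0, 0.0, 50.0]
```

Expressing data this way removes differences in absolute basal concentration between animals and test days, so that changes in efflux can be compared within subjects.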
Results
Experiment 1
Changes in extracellular DA levels in the NAc were compared after limited and extended training sessions, while rats lever pressed for food reward on a RI 30-s schedule. As shown in Fig. 1, the rates of responding approximately doubled from days 5 to 16, even though rats obtained and consumed a similar number of rewards during the two sessions [43.3 pellets (∼2.0 g) on day 5 and ∼46.2 pellets (∼2.1 g) on day 16]. The number of magazine entries also did not differ between test days. Separate one-way ANOVAs revealed a significant main effect of time on DA efflux on day 5 (F7,49=12.038, p<0.001) and day 16 (F7,49=5.325, p<0.044). Further analyses indicated that on both days, there was a significant increase in DA efflux above their respective baselines that remained elevated for the 30-min duration of instrumental responding for food pellets (Dunnett’s, p<0.05). Despite the doubling of response rates from days 5 to 16, the pattern and magnitudes of DA efflux on the two test days were not statistically different (maximal increase of 69±16% on day 5 and 71±28% on day 16), as a two-way repeated measures ANOVA failed to show a significant interaction of day × time on DA efflux (F7,98=0.598, p=0.528). However, a paired t test showed that during the first 10 min (time 5), DA efflux was significantly higher on day 16 than day 5 (68±24 vs 30±12%, respectively; p<0.05). Basal values of DA in the NAc (uncorrected for recovery) were 2.72±0.61 and 2.19±0.29 nM for days 5 and 16, respectively, and were not statistically different from each other.
Experiment 2
Instrumental behavior As shown in Fig. 2, all rats in the NAc and MD striatal microdialysis groups learned to lever press for food pellets on a RI 30-s schedule and to retrieve the pellets, as indicated by the number of magazine entries. Training data for one rat in the MD striatal group was lost and not included in the following analyses. Instrumental behavior became more efficient through the initial days of training, and during the third and fourth training sessions, rats made significantly more lever presses than magazine entries. This increase in ratio of lever presses to magazine entries, from limited to extended training experience, may be explained by proposing that rats learned to use the auditory click made by the dispenser with the delivery of a pellet. Thus, rats learned to enter the magazine only when they heard the click. On days (4 and 15) before microdialysis tests, rats were allowed to press for 60 pellets, rather than the normal 30 pellets available on other days, and accordingly increased the rate of lever presses on these days. The mean number of lever press responses per training session across the RI 30-s schedule tended to be higher in the NAc group (from 327 to 835) than in the MD striatal group (from 367 to 436). However, an ANOVA test revealed no main effect of group (F1,11=2.414, p=0.149). ANOVA of magazine entry data similarly indicated that counts were comparable between the NAc and MD striatal groups (F1,11=0.366, p=0.557).
On test days, rats were tested for instrumental responding in extinction and then with food reward. During the extinction phase, the pellet dispenser was disconnected from the magazine, but still produced an auditory click according to the RI 30-s schedule. During the rewarded phase, all rats consumed every pellet delivered during the lever press sessions. During the rewarded component of the experiment, rats consumed ∼45.7 pellets (∼2.1 g) on day 5 and ∼46.2 pellets (∼2.1 g) on day 16. The similarity in food consumption and difference in magnitude of DA efflux (see below), across days 5 and 16, again supports the view that DA response is not a function of food reward. In both the NAc and MD striatal groups (Figs. 3 and 4), rate of instrumental performance approximately doubled from day 5 to day 16, during both the extinction and rewarded phases of the sessions. Over the 30-min extinction phase, rats displayed a typical decline in rate of lever presses, whereas over the 30-min reward phase, there was a slight increase in rate of responding. In the NAc group, the number of lever presses was significantly higher on day 16 than on day 5 during the first 10 min of the extinction (t test, p=0.003) and rewarded (t test, p=0.002) phases. In the MD striatal group, the lever press rates during the first 10 min of the extinction phase were comparable on days 5 and 16, but differed significantly between the 2 days during the first 10 min of the rewarded phase (p=0.038). In both the NAc and MD striatal groups, an ANOVA indicated that the number of magazine entries did not differ significantly between days 5 and 16.
DA efflux Basal values of DA in the NAc (uncorrected for recovery) were 2.58±0.32 and 2.16±0.16 nM for days 5 and 16, respectively, and were not statistically different from each other. Basal values of DA in the MD striatum (uncorrected for recovery) were 2.71±0.23 and 2.95±0.34 nM for days 5 and 16, respectively, and also did not differ statistically.
In the NAc, the overall pattern of DA efflux across the different phases of the microdialysis test on day 5 appeared comparable to that observed on day 16 (Fig. 3), but statistical analyses revealed several key differences. Accordingly, an ANOVA identified a significant day × time interaction on DA efflux (F13,182=2.760, p=0.022), with a significant simple main effect of Time on day 5 (F13,91=8.690, p<0.001) and day 16 (F13,91=14.649, p<0.001). On both days, there was an increase in NAc DA levels during the initial 10 min of responding in extinction (10±7% above baseline on day 5 and 19±6% on day 16), but this increase was only significant after extended training on day 16 (paired samples t test, p=0.005). DA efflux then returned to baseline for the remainder of the extinction phase and time-out period. During the 10-min period preceding the rewarded responding phase, five pellets were noncontingently dispensed into the magazine; the purpose of this was to prime the rats to lever press again for food reward. During this period, DA levels did not differ from baseline values on day 5 (4±9%) but by day 16, were increased significantly above baseline (27±7%; Dunnett’s test, p<0.05). The reward phase of the session was accompanied by significant elevation of DA efflux on both days (maximal increase of 37±12% on day 5 and +83±15% on day 16; Dunnett’s test, p<0.05) that remained elevated after retraction of the lever and cessation of instrumental responding. DA levels then gradually declined towards baseline values over the remaining 60 min of the test session. During the initial 10 min of the reward phase (time 11), the magnitude of change on day 16 (83±15%) was significantly greater than the change in efflux observed on day 5 (37±12%; t test, p=0.013).
In the MD striatum, there were no statistically significant changes in DA levels during the entire instrumental responding session on both days 5 and 16 (Fig. 4). Despite performing lever presses and magazine entries at rates comparable to the NAc group, DA levels in this group showed only slight fluctuations around baseline.
Correlation between response rate and DA efflux
Based on a linear regression of a scatter plot between lever presses/10 min and percent change in DA efflux, R² values of 0.0014 for data from day 5 and 0.0065 on day 16 indicated that <1% of the variation in lever presses could be explained by a linear correlation between lever presses and DA levels, on both test days (Fig. 5).
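For a simple linear regression with one predictor, the coefficient of determination equals the squared Pearson correlation between the two variables. The following minimal sketch computes it directly from the sums of squares; the function name `r_squared` and the example values are hypothetical and are not the data shown in Fig. 5.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    With a single predictor, R^2 equals the squared Pearson correlation,
    Sxy^2 / (Sxx * Syy). Both variables must show some variation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy ** 2) / (sxx * syy)


# Hypothetical values: percent change in DA efflux (x) and lever presses/10 min (y).
da_change = [10.0, 20.0, 30.0, 40.0]
lever_presses = [55.0, 45.0, 45.0, 55.0]
print(r_squared(da_change, lever_presses))  # no linear relationship in this example
```

An R² of 0.0014 corresponds to 0.14% of the variance in lever presses being accounted for by a linear relationship with DA efflux, and 0.0065 corresponds to 0.65%.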
Histology
The locations of all microdialysis probes are presented in Fig. 6. The 2-mm lengths of the dialysis membrane were in the NAc (shell-core boundary) or medial aspect of the MD striatum (just dorsal and lateral of the anterior commissure).
Discussion
The present study examined the role of DA activity in the NAc in behaviors maintained by instrumental and Pavlovian incentive learning. DA efflux in the NAc was increased significantly during both early and later training stages of an instrumental response for food on a RI 30-s schedule of reinforcement (Figs. 1 and 3). It is important to note that this pattern of results was observed whether or not a period of extinction preceded a 30-min period of instrumental responding reinforced by food pellets. As such, the present findings confirmed previous reports of increased DA release in the NAc during lever pressing for food, employing fixed interval or ratio schedules of reinforcement (Salamone et al. 1994; Richardson and Gratton 1996; Cousins et al. 1999). Response actions during the early phase of training on interval schedules (i.e., in rats having received as few as 120 outcomes) have been characterized as goal- or outcome-directed (Adams and Dickinson 1981; Balleine and Dickinson 1992; Dickinson et al. 1995) and accordingly may represent instrumental incentives, as distinct from Pavlovian incentive processes related to incentive motivation (Parkinson et al. 2002). Continued performance of these instrumental actions leads to habitual responding which, unlike action–outcome learning, is impervious to outcome devaluation or contingency degradation (Dickinson et al. 1995). We failed to observe a selective increase in DA efflux when rats had limited as compared to extended training experience, as might be expected if dopaminergic activity in the NAc is related to instrumental incentive learning. Indeed, the magnitude of DA efflux was significantly greater after extended training when behavior is said to be no longer controlled by incentive learning and is based instead on habit. These increases in DA efflux were site-specific, as no significant changes in medial MD striatal DA efflux were observed throughout the different phases of this experiment (Fig. 4).
Performance on instrumental tasks is often conducted under nonrewarded or extinction conditions to evaluate the control of behavior by Pavlovian incentive stimuli, unconfounded by unconditioned reward stimuli. In the present study, a significant increase in DA efflux in the NAc was observed only during the initial 10-min sample of responding in extinction on training day 16, but not on day 5 (Fig. 3). The specific CS+ present in this experiment was a distinct auditory “click” of the pellet dispenser which occurred on a RI 30-s schedule, by itself during extinction or accompanied by delivery of food pellets into the magazine during rewarded responding. As such, these data are consistent with previous reports of increased DA efflux in the NAc elicited by a CS+ (Phillips et al. 1993; Datla et al. 2002). In a Pavlovian to instrumental transfer protocol, a CS+ previously paired noncontingently to food reward can facilitate the acquisition of an instrumental response (Parkinson et al. 2002). Systemic administration of DA receptor antagonists during Pavlovian pairings of a CS+ with food reward blocks Pavlovian to instrumental transfer (Beninger and Phillips 1981; Dickinson et al. 2000). Damage to the shell or core of the NAc spares the acquisition of Pavlovian to instrumental transfer, but disrupts the potentiation by intra-NAc amphetamine on responding for the CS+ (Parkinson et al. 2002). Together, these findings suggest that a phasic increase of DA in the NAc, shown to occur after treatment with amphetamine (Taepavarapruk and Phillips 2003; Brebner et al. 2005), may mediate the facilitatory effects of a Pavlovian CS+ on instrumental responding.
It is also of interest to note that the inclusion of an extinction session before a reinforced phase of instrumental responding attenuated the magnitude of DA efflux in the NAc observed when food reward was available on day 5, but not on day 16. This finding may be attributed to attenuation in the secondary reinforcement property of the CS+ associated with the delivery of food reward or possibly, an influence of frustrative nonreward engendered by extinction. In either case, it is apparent that these effects of extinction are restricted to the early phase of instrumental training.
In earlier studies, Salamone et al. (1994) proposed that an important aspect of dopaminergic activity in the NAc is related to behavioral activation, exertion of effort, and possibly cost benefit analyses relating effort to value of reward stimuli (Salamone et al. 2003, 2005). In support of this hypothesis, consumption of large quantities of freely available food pellets or lab chow was not accompanied by increased DA efflux. It must be noted in passing that consumption to satiety of a large meal of a palatable food such as fruit loops, onion rings (Ahn and Phillips 1999), or sucrose (Hajnal and Norgren 2002), is accompanied by a significant increase in DA efflux in both the NAc and medial prefrontal cortex. Salamone et al. (1994) also observed a significant relationship between response rates of individual rats and the magnitude of DA efflux in the NAc. Specifically, an increase in DA release was only observed in rats that responded at medium to high rates of responding, whereas in rats with low response output, this measure did not differ from controls.
Our data are also relevant to the relationship between response output and DA activity. In both Experiments 1 and 2, there was no evidence of a simple relationship between response rates and magnitude of DA efflux. As shown in Fig. 1, although the rate of lever presses was twice as high on day 16 compared to day 5, the magnitude of DA efflux did not differ significantly between the 2 days. In Fig. 3, on day 5, initial response rates during the extinction and reward phases of the test were comparable during the first 10 min (times 5 and 11), yet the corresponding magnitude of DA efflux during the reward phase was three times greater than the extinction phase. A similar pattern was observed on day 16. Thus, different rates of responding were associated with similar magnitudes of DA efflux, and similar rates of responding were associated with different magnitudes of DA activity. Accordingly, no positive correlations between rate of lever press responding and magnitude of DA efflux in the NAc were observed after limited and extended training (Fig. 5). Finally, with respect to the appealing hypothesis that dopaminergic activity in the NAc is related to behavioral activation (Salamone et al. 2003, 2005), it must be emphasized that although the present data challenge this hypothesis, it cannot be refuted simply on the basis of the lack of a correlation between magnitude of DA efflux and intensity or degree of behavioral activation.
The failure to observe a significant increase in DA efflux in the MD striatum during either action–outcome or habit-based instrumental responding provides a clear indication that DA transmission in this region of the striatum is not involved in instrumental conditioning or stimulus–response habit formation. These data are consistent with the finding that neither excitotoxic lesions nor reversible inactivation of the anterior MD striatum had any effect on acquisition or expression of action–outcome associations in instrumental conditioning (Yin et al. 2005). In contrast, lesions or inactivation of the MD striatum, posterior to the probe placements in the present study, impaired instrumental performance based on outcome–expectancy (Yin et al. 2004). Blockade of NMDA receptors in the dorsomedial striatum also disrupted action–outcome learning consistent with a role for glutamate-mediated synaptic plasticity in the encoding of action–outcome associations (Yin et al. 2005). The MD (“associative”) striatum receives inputs from association cortices (e.g., prelimbic region of the prefrontal cortex and premotor areas), as well as the basolateral amygdala, which appears to mediate the assignment of incentive value to the consequences of instrumental actions (Corbit and Balleine 2005).
Integrity of the dorsolateral striatum has been shown to be required for habit formation in instrumental learning, and furthermore, rats with damage to this region of the striatum reverted to a state in which instrumental actions were goal-directed (Yin et al. 2005). This finding implies that the system involving the dorsolateral striatum responsible for habit formation can inhibit the circuit that mediates action–outcome or goal-directed instrumental actions. This in turn raises the possibility that the increase in NAc DA efflux observed in the present study after extended training provides a representation of instrumental incentive learning that is held in check by activity in the dorsolateral striatum.
In conclusion, the present findings provide neurochemical evidence in support of previous data questioning the role of DA in the NAc in coupling incentive value to representations of instrumental outcomes (de Borchgrave et al. 2002). The data showing elevated DA efflux in the NAc during extinction in the presence of a Pavlovian CS+, in turn, are consistent with a role for the NAc in incentive motivation (Fibiger and Phillips 1986; Robbins et al. 1989; Phillips et al. 1993; Balleine and Killcross 1994; Berridge and Robinson 1998). Rats received an apportionment of ∼60 food pellets during the microdialysis tests on days 5 and 16, yet the magnitude of DA efflux was significantly greater after extended training sessions. These data refute the hypothesis that dopaminergic activity in the NAc is a reflection of either reward value or reinforcement of instrumental responses (Wise 2004). The present “within-subject” design revealed a hitherto unappreciated effect of extended training of instrumental responding with an interval schedule on the magnitude of DA efflux in the NAc. For reasons discussed above, this does not appear to be related to motor responding per se. Rather, we speculate that this effect may reflect the specific condition of a random or variable interval schedule of outcome presentation, in which extended training is necessary to appreciate that the probability of receiving a beneficial outcome at any particular time in the 30-min test session is always unpredictable. This degree of uncertainty may be highly compatible with the optimal conditions for activating midbrain DA neurons (Fiorillo et al. 2003), which in turn would result in a sustained increase in DA release in the NAc throughout a period of random reinforcement. This pattern of DA release could play an important role in maintaining a high level of motivation in the service of a variety of response strategies available to ensure access to objects essential for survival.
Acknowledgement
This work was funded by the Canadian Institutes for Health Research.
References
- Adams CD, Dickinson A (1981) Instrumental responding following reinforcer devaluation. Q J Exp Psychol 33B:109–122
- Ahn S, Phillips AG (1999) Dopaminergic correlates of sensory-specific satiety in the medial prefrontal cortex and nucleus accumbens of the rat. J Neurosci 19:RC29
- Ahn S, Phillips AG (2002) Modulation by central and basolateral amygdalar nuclei of dopaminergic correlates of feeding to satiety in the rat nucleus accumbens and medial prefrontal cortex. J Neurosci 22:10958–10965
- Balleine B, Dickinson A (1992) Signalling and incentive processes in instrumental reinforcer devaluation. Q J Exp Psychol B 45:285–301
- Balleine B, Killcross S (1994) Effects of ibotenic acid lesions of the nucleus accumbens on instrumental action. Behav Brain Res 65:181–193
- Beninger RJ, Phillips AG (1981) The effects of pimozide during pairing on the transfer of classical conditioning to an operant discrimination. Pharmacol Biochem Behav 14:101–105
- Berridge KC, Robinson TE (1998) What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res Brain Res Rev 28:309–369
- Berridge KC, Robinson TE (2003) Parsing reward. Trends Neurosci 26:507–513
- Brebner K, Ahn S, Phillips AG (2005) Attenuation of d-amphetamine self-administration by baclofen in the rat: behavioral and neurochemical correlates. Psychopharmacology (Berl) 177:409–417
- Cardinal RN, Parkinson JA, Hall J, Everitt BJ (2002) Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex. Neurosci Biobehav Rev 26:321–352
- Corbit LH, Balleine BW (2005) Double dissociation of basolateral and central amygdala lesions on the general and outcome-specific forms of Pavlovian-instrumental transfer. J Neurosci 25:962–970
- Corbit LH, Muir JL, Balleine BW (2001) The role of the nucleus accumbens in instrumental conditioning: evidence of a functional dissociation between accumbens core and shell. J Neurosci 21:3251–3260
- Cousins MS, Trevitt J, Atherton A, Salamone JD (1999) Different behavioral functions of dopamine in the nucleus accumbens and ventrolateral striatum: a microdialysis and behavioral investigation. Neuroscience 91:925–934
- Datla KP, Ahier RG, Young AMJ, Gray JA, Joseph MH (2002) Conditioned appetitive stimulus increases extracellular dopamine in the nucleus accumbens of the rat. Eur J Neurosci 16:1987–1993
- de Borchgrave R, Rawlins JN, Dickinson A, Balleine BW (2002) Effects of cytotoxic nucleus accumbens lesions on instrumental conditioning in rats. Exp Brain Res 144:50–68
- Dickinson A, Balleine B, Watt A, Gonzalez F, Boakes RA (1995) Motivational control after extended training. Animal Learn Behav 23:197–206
- Dickinson A, Smith J, Mirenowicz J (2000) Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav Neurosci 114:468–483
- Faure A, Haberland U, Condé F, El Massioui N (2005) Lesion to the nigrostriatal dopamine system disrupts stimulus–response habit formation. J Neurosci 25:2771–2780
- Fibiger HC, Phillips AG (1986) Reward, motivation and cognition: psychobiology of mesotelencephalic dopamine systems. In: Mountcastle VB, Bloom FE, Geiger SR (eds) Handbook of physiology: the nervous system, vol 4. American Physiological Society, Bethesda, pp 647–675
- Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299:1898–1902
- Fiorino DF, Coury A, Phillips AG (1997) Dynamic changes in nucleus accumbens dopamine efflux during the Coolidge effect in male rats. J Neurosci 17:4849–4855
- Hajnal A, Norgren R (2002) Repeated access to sucrose augments dopamine turnover in the nucleus accumbens. Neuroreport 13:2213–2216
- Ikemoto S, Panksepp J (1999) The role of nucleus accumbens dopamine in motivated behavior: a unifying interpretation with special reference to reward-seeking. Brain Res Brain Res Rev 31:6–41
- Joseph MH, Datla K, Young AMJ (2003) The interpretation of the measurement of nucleus accumbens dopamine by in vivo dialysis: the kick, the craving or the cognition? Neurosci Biobehav Rev 27:527–541
- Kelley AE, Smith-Roe SL, Holahan MR (1997) Response–reinforcement learning is dependent on N-methyl-d-aspartate receptor activation in the nucleus accumbens core. Proc Natl Acad Sci U S A 94:12174–12179
- Kelley AE, Baldo BA, Pratt WE, Will MJ (2005) Corticostriatal–hypothalamic circuitry and food motivation: integration of energy, action and reward. Physiol Behav 86:773–795
- Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, Rudarakanchana N, Halkerston KM, Robbins TW, Everitt BJ (2002) Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav Brain Res 137:149–163
- Paxinos G, Watson C (1997) The rat brain in stereotaxic coordinates, 4th edn. Academic, San Diego
- Phillips AG, Atkinson LJ, Blackburn JR, Blaha CD (1993) Increased extracellular dopamine in the nucleus accumbens of the rat elicited by a conditional stimulus for food: an electrochemical study. Can J Physiol Pharmacol 71:387–393
- Redgrave P, Prescott TJ, Gurney K (1999) Is the short-latency dopamine response too short to signal reward error? Trends Neurosci 22:146–151
- Richardson NR, Gratton A (1996) Behavior-relevant changes in nucleus accumbens dopamine transmission elicited by food reinforcement: an electrochemical study in rat. J Neurosci 16:8160–8169
- Robbins TW, Cador M, Taylor JR, Everitt BJ (1989) Limbic–striatal interactions in reward-related processes. Neurosci Biobehav Rev 13:155–162
- Salamone JD, Correa M (2002) Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine. Behav Brain Res 137:3–25
- Salamone JD, Cousins MS, McCullough LD, Carriero DL, Berkowitz RJ (1994) Nucleus accumbens dopamine release increases during instrumental lever pressing for food but not free food consumption. Pharmacol Biochem Behav 49:25–31
- Salamone JD, Correa M, Mingote S, Weber SM (2003) Nucleus accumbens dopamine and the regulation of effort in food-seeking behavior: implications for studies of natural motivation, psychiatry, and drug abuse. J Pharmacol Exp Ther 305:1–8
- Salamone JD, Correa M, Mingote SM, Weber SM (2005) Beyond the reward hypothesis: alternative functions of nucleus accumbens dopamine. Curr Opin Pharmacol 5:34–41
- Schultz W (2002) Getting formal with dopamine and reward. Neuron 36:241–263
- Taepavarapruk P, Phillips AG (2003) Neurochemical correlates of relapse to d-amphetamine self-administration by rats induced by stimulation of the ventral subiculum. Psychopharmacology (Berl) 168:99–108
- Wise RA (2004) Dopamine, learning and motivation. Nat Rev Neurosci 5:483–494
- Yin HH, Knowlton BJ, Balleine BW (2004) Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci 19:181–189
- Yin HH, Ostlund SB, Knowlton BJ, Balleine BW (2005) The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci 22:513–523
- Yin HH, Zhuang X, Balleine BW (2006) Instrumental learning in hyperdopaminergic mice. Neurobiol Learn Mem 85:283–288
- Young AMJ, Moran PM, Joseph MH (2005) The role of dopamine in conditioning and latent inhibition: what, when, where and how? Neurosci Biobehav Rev 29:963–976