The Journal of Neuroscience. 2000 Aug 15;20(16):6282–6288. doi: 10.1523/JNEUROSCI.20-16-06282.2000

NMDA, But Not Dopamine D2, Receptors in the Rat Nucleus Accumbens Are Involved in Guidance of Instrumental Behavior by Stimuli Predicting Reward Magnitude

Wolfgang Hauber 1, Ines Bohn 1, Christian Giertler 1
PMCID: PMC6772588  PMID: 10934279

Abstract

Expectancy of future reward is an important factor guiding the speed of instrumental behavior. The present study sought to explore whether signals transmitted via the NMDA subtype of glutamate receptors and via dopamine D2 receptors in the nucleus accumbens (NAc) are critical for the determination of reaction times (RTs) of instrumental responses by the expectancy of future reward. A simple RT task for rats demanding conditioned lever release was used in which the upcoming reward magnitude (5 or 1 pellet) was signaled in advance by discriminative stimuli. In trained rats, RTs of conditioned responses with expectancy of a high reward magnitude were found to be significantly shorter. The shortening of RTs by stimuli predictive of high reward to be obtained was dose-dependently impaired by bilateral intra-NAc infusion of the competitive NMDA antagonist dl-2-amino-5-phosphonovaleric acid (APV) (1, 2, or 10 μg in 0.5 μl/side), but not by infusion of the preferential dopamine D2 antagonist haloperidol (5 and 12.5 μg in 0.5 μl/side) or by infusion of vehicle (0.5 μl/side). In conclusion, the data reveal that in well trained animals stimulation of intra-NAc NMDA, but not of dopamine D2, receptors is critically involved in guiding the speed of instrumental responses according to stimuli predictive of the upcoming reward magnitude.

Keywords: nucleus accumbens, goal-directed behavior, reward, dopamine, glutamate, reaction time, rat


Reward expectancy is an important factor of guidance in adaptive motor behavior. Accordingly, the speed of instrumental responses has been found to be a function of the expected reward magnitude because reaction times (RTs) of rats were shortened by expectancy of signaled high reward (Brown and Bowman, 1995). Likewise, RTs of reaching movements (Hollerman et al., 1998) or saccadic eye movements (Kawagoe et al., 1998) of primates decreased as a function of the relative attractiveness of the expected reward.

The nucleus accumbens (NAc), as an interface between limbic and motor structures (Groenewegen et al., 1996), may play a key role in the control of goal-directed actions by reward (Mogenson et al., 1980). It is generally assumed that the NAc subserves motivated behaviors such as feeding, sexual behavior, or exploratory locomotion elicited by primary reward and by conditioned stimuli associated with reward (Robbins et al., 1989; Everitt, 1990; Mitchell and Gratton, 1994; Watanabe, 1996). For instance, lesions of the NAc abolished conditioned place preference (Everitt et al., 1991), suggesting that environmental cues predictive for reward no longer control behavior. Furthermore, neurons in the striatum show reward expectation-related activations triggered by reward-predicting stimuli (Apicella et al., 1991; Schultz et al., 1992; Kawagoe et al., 1998). Depending on the expected type of reinforcer, behavior-related neuronal activity of striatal neurons is influenced differentially, implying that these neurons incorporate information about the expected behavioral outcome (Hollerman et al., 1998). Although these data suggest that the NAc might be involved in processes guiding instrumental behavior according to predictive information on future reward magnitude, surprisingly little information is available on neurochemical mechanisms in the NAc that translate information about the expected reward magnitude into the speed of an instrumental response. The NAc receives convergent glutamatergic input from cortical and limbic regions concerned with the processing of the motivational significance of stimuli (Watanabe et al., 1996; Everitt et al., 1989; Schoenbaum et al., 1998; Schoenbaum et al., 1999) and mesolimbic dopaminergic input from the ventral tegmental area, which has been implicated in the rewarding properties of reinforcers (for review, see Wise and Bozarth, 1987).

In the present study we investigated whether signals in the NAc transmitted via the NMDA subtype of glutamate receptors (Cotman and Iversen, 1987) and via dopamine D2 receptors are critical for the guidance of instrumental behavior by the expected reward magnitude. The effects of an intra-NAc NMDA and dopamine D2 receptor blockade were examined in a lever release task for rats in which RTs of instrumental responses were a function of the expected food reward signaled in advance by discriminative instructive stimuli. Although it is well known that striatal NMDA and dopamine receptors play a key role in motor control (Hauber, 1996, 1998), the NAc does not control motor aspects of RT performance per se (Amalric and Koob, 1987; Brown and Robbins, 1989; Carli et al., 1989). Thus alterations of RTs induced by intra-NAc blockade of NMDA and dopamine D2 receptors should reflect changes in the translation of motivational information into response speed.

MATERIALS AND METHODS

Subjects

Twenty male Sprague Dawley rats (Charles River, Sulzfeld, Germany) were housed in groups of up to four animals in transparent plastic cages (Type IV; 35 × 55 × 10 cm; Ebeco, Castrop-Rauxel, Germany). Temperature (20 ± 2°C) and humidity (50 ± 10%) were kept constant in the animal house, and a 12 hr light/dark schedule was used with lights on between 6 A.M. and 6 P.M. All rats were given ad libitum access to water. Standard laboratory maintenance chow (Altromin, Lage, Germany) was restricted to 12 gm per animal and day. On the days that behavioral tests were given, rats received 2–8 gm of food reward (45 mg pellets; Bioserv, Frenchtown, NJ) in the testing apparatus, depending on the individual performance. On these days, the amount of standard laboratory chow was adapted individually to keep body weights constant. Rats weighed 200–250 gm on arrival and 270–350 gm at the time of surgery.

Surgery

For stereotaxic surgery, animals were anesthetized with sodium pentobarbital (50 mg/kg, i.p.) (Sigma-Aldrich, Steinheim, Germany) after pretreatment with atropine sulfate (0.05 mg/kg, i.p.) (Sigma-Aldrich) and secured in a Kopf stereotaxic frame (Kopf Instruments, Tujunga, CA). Bilateral stainless steel cannulae (outer diameter 0.8 mm) were aimed at the NAc and implanted using standard stereotaxic procedures. The coordinates with reference to the atlas of Pellegrino et al. (1981) were as follows: anteroposterior, 3.2 mm anterior to bregma; mediolateral, 1.7 mm; dorsoventral, −4.0 mm below dura with the toothbar 5 mm above the interaural line. Each rat was given at least 7 d to recover from surgery before postoperative training was started.

Drug infusion

On injection days, the obturators were removed, and bilateral injection cannulae (outer diameter 0.45 mm) were lowered to the final site of infusion and attached via polyethylene tubing to microliter syringes controlled by a microdrive pump (Kopf Instruments). The preferential dopamine D2 antagonist haloperidol (Sigma-Aldrich) (5 and 12.5 μg in 0.5 μl 1% lactate), the competitive NMDA receptor antagonist dl-2-amino-5-phosphonovaleric acid (APV) (Research Biochemical International, Koeln, Germany) (1, 2, or 10 μg in 0.5 μl saline), and respective vehicles (0.5 μl) were delivered bilaterally at a rate of 0.5 μl/min. Injection cannulae were left in place for an additional 1 min after each infusion to allow for diffusion. Each rat remained in its home cage for an additional 5 min before being placed in the test chamber.

Apparatus

Four operant test chambers (24 × 21 × 30 cm) (Med Associates, St. Albans, VT) were used. Test chambers were placed in separate sound-attenuating cubicles with fans providing a constant low level of background noise. Each chamber was supplied with a retractable lever, a food dispenser with receptacle, and two stimulus lights, one above the retractable lever, the other above the food receptacle. The experiments were controlled on-line (SmartControl Interfaces, Med Associates) by a computer system (MedPC-Software, Med Associates).

Behavioral procedure

RT task. A simple RT task was used in which discriminative stimuli indicate the upcoming reward magnitude. The task demands conditioned lever release (Amalric and Koob, 1987; Baunez et al., 1994), with instructive stimuli indicating the reward magnitude to be obtained after a subsequent imperative stimulus as described by Brown and Bowman (1995) in a hole box task.

Moreover, in intact rats, RTs have been found to be a function of lengthening of the foreperiod from trial onset until presentation of the imperative stimulus. This relationship probably reflects motor readiness (Brown and Robbins, 1991). We additionally introduced different foreperiods in the task used here and measured motor readiness. Motor readiness was used to monitor nonspecific motor effects of the treatments.

According to the protocols of Amalric and Koob (1987) and Baunez et al. (1994), rats had to press the lever and wait for the imperative stimulus, which was provided by the stimulus light above the lever after a variable foreperiod of 200, 500, or 800 msec. The imperative stimulus signaled to the rats to release the lever quickly and to respond to the food receptacle in which the food pellets were delivered (45 mg pellets; Bioserv).

On each trial, the rat received either one or five food pellets. The number of pellets for each trial was randomly determined in advance and signaled to the rats by two distinct brightness levels of the cue lights that provided the instructive stimulus (Brown and Bowman, 1995). The instructive stimulus was turned on at the beginning of each trial before lever press and remained present until delivery of food reward. To check for equal perception of instructive stimuli of the two different brightness levels, for 50% of the rats, a bright stimulus was associated with delivery of five pellets and a dim stimulus was associated with delivery of one pellet. For the other 50% of the rats, the opposite pattern was used. Results showed that rats discriminated bright and dim stimuli; therefore, RT data obtained with both stimulus patterns were pooled.

RTs defined as latency from the onset of the imperative stimulus to lever release were recorded with an accuracy of 10 msec. For a correct trial, animals had to release the lever within 100–1000 msec. Responses with RTs <100 msec were defined as “early” responses; responses with RTs >1000 msec were defined as “late” responses. A daily individual session demanded 72 correct trials, i.e., 12 correct trials for each foreperiod (200, 500, and 800 msec) and reward magnitude (one and five pellets), and lasted 15–25 min depending on the individual. A schematic representation of the order of trial events is given in Figure 1.
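The response windows and the session criterion described above can be sketched in a few lines of Python. This is only an illustration of the stated rules (early <100 msec, correct 100–1000 msec, late >1000 msec; 12 correct trials for each of 3 foreperiods and 2 reward magnitudes). The function names and the example RTs are hypothetical and do not come from the study or from the MedPC control software.

```python
from collections import Counter

# Response windows as stated in the text (milliseconds).
EARLY_LIMIT_MS = 100
LATE_LIMIT_MS = 1000

# Session criterion: 12 correct trials per foreperiod and reward magnitude.
FOREPERIODS_MS = (200, 500, 800)
REWARD_PELLETS = (1, 5)
CORRECT_PER_CONDITION = 12
SESSION_CRITERION = CORRECT_PER_CONDITION * len(FOREPERIODS_MS) * len(REWARD_PELLETS)


def classify_response(rt_ms):
    """Label a lever release as 'early', 'correct', or 'late' by its RT."""
    if rt_ms < EARLY_LIMIT_MS:
        return "early"
    if rt_ms > LATE_LIMIT_MS:
        return "late"
    return "correct"


def response_percentages(rts_ms):
    """Percentage of each response type among all trials (needs >= 1 trial)."""
    counts = Counter(classify_response(rt) for rt in rts_ms)
    total = len(rts_ms)
    return {label: 100.0 * counts[label] / total
            for label in ("early", "correct", "late")}


# Hypothetical RTs (msec) for illustration only; not data from the study.
example_rts = [80, 250, 310, 1200, 400, 95, 520, 330]
print(SESSION_CRITERION)
print(response_percentages(example_rts))
```

The same percentages are the accuracy parameters later reported per session, computed from the total number of trials rather than from correct trials only.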

Fig. 1.

Schematic representation of the order of events in a trial. At the beginning of a trial, the instructive stimulus presented by a cue light above the food receptacle was turned on at one of two brightness levels that were associated with different reward magnitudes (1 or 5 pellets). Thereafter a rat spontaneously pressed the lever. After a variable foreperiod (200, 500, or 800 msec), the imperative stimulus provided by a cue light above the lever signaled the animal to release the lever to get the food reward in the receptacle. Responses with RT within 100–1000 msec (top) were considered to be correct and were rewarded as indicated by the instructive stimulus. Early responses (RT < 100 msec) (middle) or late responses (RT > 1000 msec) (bottom) caused the trial to be repeated with the identical foreperiod and reward magnitude. RTs were defined as latency between presentation of the imperative stimulus and the lever release.

Training. Animals were trained for 8 weeks until behavior was stable, and thereafter the mean accuracy was ∼75%; i.e., on average, 96 trials were necessary to attain 72 correct responses. Then animals were subjected to surgery. After 7 d of recovery, postoperative training was given for 1 week to reach preoperative accuracy levels.

Experimental procedure

All animals were trained in one daily session on 5 d per week during the complete experimental period. Effects of drug and vehicle infusions were investigated in one experimental session per week. In each experimental session, one single drug dose and the respective vehicle were tested. A series of five different experimental sessions was performed to examine the effects of intra-NAc infusion of APV (1 μg), APV (2 μg), haloperidol (5 μg), haloperidol (12.5 μg), and APV (10 μg) in the order as given. Before each experimental session, animals were assigned at random to two treatment groups receiving either vehicle or drug infusions to prevent order effects of drug administration. Random assignments were made until two criteria were met: (1) mean RTs of both treatment groups had to be significantly shorter with expectancy of a high reward magnitude (five pellets) as compared with a low reward magnitude (one pellet), and (2) mean RTs of both groups had to be significantly shorter with longer foreperiods (for calculation see Data analysis). Each animal received a total of five infusions. Very rarely, animals showed pronounced irritation caused by the microinfusion procedure and were not tested subsequently. Also, a few animals developed permanent guide cannulae occlusion and were not used for further experiments. Therefore, sample sizes were different in each experimental session and became smaller toward the end of the experiment. Before any experimental treatment, all animals were subjected to a test session preceded by a vehicle infusion to familiarize them with the experimental procedure.
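The repeated random assignment described above amounts to re-drawing the split until both groups pass a check. A minimal sketch follows, under loud assumptions: the split into equal halves is an assumption (group sizes in the study were often unequal), and `groups_ok` is a hypothetical callback standing in for the two significance criteria, which are not implemented here.

```python
import random


def assign_groups(animals, groups_ok, rng, max_tries=1000):
    """Randomly split animals into (control, drug) groups.

    The split is re-drawn until groups_ok(control, drug) returns True.
    groups_ok is a hypothetical stand-in for the two criteria in the text
    (significant reward and foreperiod effects on RT in both groups).
    Equal halves are an assumption, not the study's actual group sizes.
    """
    animals = list(animals)
    for _ in range(max_tries):
        rng.shuffle(animals)
        half = len(animals) // 2
        control, drug = animals[:half], animals[half:]
        if groups_ok(control, drug):
            return control, drug
    raise RuntimeError("no acceptable assignment found")


# Hypothetical preinjection RT gains (msec) per rat; not data from the study.
gains = {"r1": 40, "r2": 55, "r3": 48, "r4": 60,
         "r5": 35, "r6": 52, "r7": 45, "r8": 50}


def both_groups_show_speeding(control, drug, threshold_ms=45):
    """Placeholder check: mean gain in each group reaches a threshold."""
    def mean_gain(group):
        return sum(gains[rat] for rat in group) / len(group)
    return mean_gain(control) >= threshold_ms and mean_gain(drug) >= threshold_ms


control, drug = assign_groups(gains, both_groups_show_speeding, random.Random(1))
print(sorted(control), sorted(drug))
```

A real implementation would replace the placeholder check with the statistical tests named under Data analysis.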

Data analysis

Treatment effects were assessed by within-subjects comparisons of rats assigned to control and drug groups. Because of considerable interindividual variability of baseline performance, a between-subjects design would be less powerful (Winer, 1971). The performance of animals that received a microinfusion of a single drug dose (“drug group”) or vehicle (“control group”) in the experimental session (“injection”) was compared with their respective performance in the preceding session (“preinjection”) on the day before without drug or vehicle infusion.

Drug effects on accuracy of task performance were determined by using the following parameters: (1) the mean of the overall number of trials to achieve the criterion of 72 correct responses (±SEM) and (2) percentage means of early, correct, and late responses from the total number of trials per session (±SEM) from each session. Means of each parameter from experimental sessions with drug and vehicle injection and from respective preinjection sessions were compared using one-way ANOVA.

The following calculations were conducted with RT data from correct responses (RT 100–1000 msec) of all preinjection and injection sessions. In control rats, RTs of responses with an expected high reward magnitude were significantly shorter than those with an expected low reward magnitude. This speeding of RTs was used as an index of RT guidance by reward expectancy. Treatment-induced effects were determined by comparing RT speeding in drug and control groups on preinjection and injection days. Mean RT differences (±SEM) of responses with high and low reward magnitudes were given and compared statistically by means of a two-way ANOVA with groups and treatment as factors followed by the least significant difference (LSD) post hoc test.
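The RT speeding index can be illustrated as a difference of means over correct trials. The sketch below assumes the difference is taken as low-reward mean minus high-reward mean, so that a positive value indicates speeding; the text does not state the sign convention. The function name `rt_gain` and the example trials are hypothetical.

```python
from statistics import mean


def rt_gain(trials):
    """Mean RT with 1 pellet minus mean RT with 5 pellets, correct trials only.

    trials: iterable of (rt_ms, pellets) pairs for one animal and session.
    A positive value means faster responses when high reward was expected.
    The sign convention is an assumption made for this sketch.
    """
    correct = [(rt, pellets) for rt, pellets in trials if 100 <= rt <= 1000]
    low = [rt for rt, pellets in correct if pellets == 1]
    high = [rt for rt, pellets in correct if pellets == 5]
    return mean(low) - mean(high)


# Hypothetical trials (RT in msec, pellets); not data from the study.
# The 1200 msec and 90 msec trials fall outside 100-1000 msec and are dropped.
example_trials = [(300, 1), (320, 1), (260, 5), (250, 5), (1200, 1), (90, 5)]
print(rt_gain(example_trials))
```

Computing this value once for the preinjection session and once for the injection session gives the two per-animal numbers that the two-way ANOVA compares across groups.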

The decrease of RT as a function of foreperiod reflecting motor readiness was characterized by the slope of the regression straight lines. Treatment effects on motor readiness were calculated by comparing slopes of straight regression lines of drug and control groups on preinjection and injection days. Mean slopes (±SEM) were given and compared statistically by means of a two-way ANOVA with groups and treatment as factors followed by the LSD post hoc test. The STATISTICA (version 5.1, StatSoft, Inc., Hamburg, Germany) statistical package was used for all statistical computations. The level of statistical significance (α-level) was set at p < 0.05.
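The slope used as the motor readiness index can be sketched with an ordinary least-squares fit of RT on foreperiod. Whether the authors fitted individual trials or per-foreperiod means is not stated, so fitting three mean RTs is an assumption here. The example RTs are hypothetical and were chosen only so that the slope matches the −0.18 msec/msec mean slope reported in the legend of Figure 2.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


foreperiods_ms = [200, 500, 800]
# Hypothetical mean RTs (msec) per foreperiod; not data from the study.
mean_rts_ms = [420.0, 366.0, 312.0]

slope = regression_slope(foreperiods_ms, mean_rts_ms)
print(round(slope, 2))  # slope in msec of RT per msec of foreperiod
```

With a slope of −0.18 msec/msec, the 600 msec span between the shortest and longest foreperiod corresponds to an RT decrease of about 108 msec.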

Histology

After completion of behavioral testing, animals were euthanized by an overdose of sodium pentobarbital (150 mg/kg, i.p.) (Sigma-Aldrich) to confirm correct placement of cannulae. Brains were removed, fixed in 10% formalin for 2.5 hr, and stored in 30% sucrose. Brain sections (20 μm) were cut with a cryostat (Reichert and Jung, Heidelberg, Germany), mounted on coated slides, and stained with cresyl violet. Placements were verified with reference to the atlas of Pellegrino et al. (1981).

RESULTS

Accuracy

As shown in Table 1, intra-NAc infusion of vehicle to rats of the control groups did not significantly alter the number of trials to reach criterion as compared with the preinjection session. Thus the infusion procedure per se did not interfere with this aspect of task performance. Likewise, infusion of haloperidol or APV into rats of the drug groups had no significant effect on the number of trials to reach criterion compared with the respective preinjection sessions (Table 1).

Table 1.

Mean number of trials (±SEM) to reach criterion (72 correct trials per session; RT: 100–1000 msec) of control and drug groups in sessions without intra-NAc infusions (preinjection) and in sessions with intra-NAc infusions (injection) of vehicle or drug

| Control: preinjection trials | Control: infusion | Control: injection trials | Control: n | Drug: preinjection trials | Drug: infusion (μg) | Drug: injection trials | Drug: n |
|---|---|---|---|---|---|---|---|
| 93 ± 3 | Vehicle | 94 ± 5 | 8 | 100 ± 7 | HP, 5 | 102 ± 4 | 8 |
| 102 ± 5 | Vehicle | 96 ± 5 | 7 | 86 ± 2 | HP, 12.5 | 96 ± 2 | 5 |
| 95 ± 4 | Vehicle | 89 ± 2 | 9 | 97 ± 4 | APV, 1 | 92 ± 3 | 7 |
| 94 ± 6 | Vehicle | 95 ± 6 | 9 | 98 ± 4 | APV, 2 | 103 ± 3 | 10 |
| 85 ± 1 | Vehicle | 89 ± 4 | 7 | 100 ± 3 | APV, 10 | 98 ± 4 | 5 |

HP, Haloperidol. ANOVA using within-subjects comparisons with injection day as factor revealed no significant differences.

An analysis of the response distribution further showed that in rats receiving vehicle infusions the percentage means of early, correct, and late responses were altered only moderately as compared with the respective preinjection sessions. As shown in Table 2, there was an increase in the percentage means of late responses after vehicle infusion in two control groups (controls of APV 2 μg: F(1,8) = 7.33, p < 0.05; controls of APV 10 μg: F(1,6) = 12.80, p < 0.05). Also, intra-NAc infusions of haloperidol or APV had moderate effects on the distribution of responses as indicated by the increased proportion of late responses induced by 5 μg haloperidol (F(1,7) = 8.71; p < 0.05) and by 1 μg APV (F(1,6) = 7.39; p < 0.05), 2 μg APV (F(1,9) = 7.33; p < 0.05), and 10 μg APV (F(1,4) = 12.8; p < 0.05) (Table 2).

Table 2.

Percentage means (±SEM) of correct (RT: 100–1000 msec), early (RT: <100 msec), and late (RT: >1000 msec) responses from the total number of trials of control and drug groups in sessions without intra-NAc infusions (preinjection) and in sessions with intra-NAc infusions (injection) of vehicle or drug

| Infusion | n | Correct (%): preinjection | Correct (%): injection | Early (%): preinjection | Early (%): injection | Late (%): preinjection | Late (%): injection |
|---|---|---|---|---|---|---|---|
| Vehicle | 8 | 79 ± 3 | 78 ± 4 | 21 ± 3 | 21 ± 4 | 1 ± 0 | 1 ± 0 |
| HP, 5 μg | 8 | 74 ± 5 | 72 ± 3 | 23 ± 5 | 14 ± 2 | 3 ± 1 | 14 ± 4* |
| Vehicle | 7 | 72 ± 3 | 76 ± 4 | 27 ± 3 | 22 ± 4 | 1 ± 0 | 2 ± 1 |
| HP, 12.5 μg | 5 | 84 ± 2 | 77 ± 6 | 14 ± 2 | 21 ± 7 | 2 ± 1 | 3 ± 2 |
| Vehicle | 9 | 77 ± 3 | 81 ± 2 | 22 ± 3 | 17 ± 2 | 1 ± 0 | 2 ± 1 |
| APV, 1 μg | 7 | 75 ± 3 | 79 ± 3 | 25 ± 3 | 19 ± 3 | 0 ± 0 | 2 ± 1* |
| Vehicle | 9 | 79 ± 5 | 78 ± 4 | 21 ± 5 | 18 ± 5 | 1 ± 0 | 4 ± 1* |
| APV, 2 μg | 10 | 75 ± 3 | 70 ± 2 | 23 ± 3 | 24 ± 3 | 2 ± 0 | 6 ± 1* |
| Vehicle | 7 | 85 ± 1 | 81 ± 3 | 14 ± 2 | 14 ± 4 | 1 ± 0 | 4 ± 1* |
| APV, 10 μg | 5 | 72 ± 2 | 74 ± 3 | 28 ± 2 | 20 ± 3 | 0 ± 0 | 6 ± 2* |

HP, Haloperidol. *p < 0.05; ANOVA using within-subjects comparisons with injection day as factor.

Reaction time

On completion of postoperative training, RTs were significantly shorter with the expectancy of a higher reward magnitude (main effect of pellet: F(1,18) = 92.37; p < 0.001) as shown in Figure 2. RTs were also faster as a function of lengthening of the foreperiod (main effect of the foreperiod: F(2,36) = 38.06; p < 0.001) (Fig. 2). No interaction between number of pellets and foreperiod was found, suggesting that independent mechanisms account for shortening of RTs by reward expectation and foreperiod (pellets × foreperiod; F(2,36) = 1.91).

Fig. 2.

The effect of the number of expected pellets and the lengthening of foreperiod on RT in the last postoperative training session (n = 19, n = 72 correct responses per animal) in animals without intra-NAc infusion. A, Mean RTs (±SEM) were significantly determined by the number of expected pellets. Expectancy of a high reward magnitude produced an RT speeding of 48 msec. B, RTs were significantly determined by lengthening of the foreperiod. The mean slope of the regression straight line was m = −0.18 msec/msec. *p < 0.001, ANOVA with reward magnitude and foreperiod as factors.

Reward expectancy

The shortening of RT with expectancy of the high reward magnitude was not significantly altered in control groups by vehicle infusion as shown in Figures 3 and 4. This suggests that the infusion procedure per se had no effect on this parameter.

Fig. 3.

Effects of intra-NAc infusion of APV or vehicle (VEH) on reward expectancy. RT differences between correct responses associated with expectancy of high (5 pellets) and low reward (1 pellet) are given as mean RT gain (±SEM). RT gain in drug and control groups from sessions with APV or vehicle infusion and from preceding sessions without infusion were compared. Although the low dose of APV had no significant effect (A), higher doses of APV (B, C) reduced speeding of RT induced by expectancy of high reward. *p < 0.05, ANOVA with groups and treatment as factors followed by the LSD test.

Fig. 4.

Effects of intra-NAc infusion of haloperidol (HP) or vehicle (VEH) on reward expectancy. RT differences between correct responses associated with expectancy of high (5 pellets) and low reward (1 pellet) are given as mean RT gain (±SEM). RT gain in drug and control groups from sessions with haloperidol or vehicle infusion and from preceding sessions without infusion were compared. A, B, Haloperidol tested in two doses did not significantly affect speeding of RT induced by expectancy of high reward (ANOVA with groups and treatment as factors followed by the LSD test).

After intra-NAc infusion of a low dose of APV (1 μg/side), RTs were not different from the preinjection session (Fig. 3A). By contrast, after infusion of an intermediate dose of APV (2 μg/side) (Fig. 3B), the speeding of RT by expectancy of a high reward was significantly reduced compared with the preinjection day (F(1,55) = 5.88; p < 0.01). Likewise, infusion of a high dose of APV (10 μg/side) (Fig. 3C) significantly reduced the speeding of RT associated with expectancy of a high reward (F(1,17) = 5.51; p < 0.03).

In contrast, intra-NAc infusion of haloperidol did not change the shortening of RT induced by expectancy of a high reward magnitude. As shown in Figure 4, a low dose of haloperidol (5 μg/side) as well as a high dose of haloperidol (12.5 μg/side) had no significant effect on RT speeding.

Motor readiness

There was no significant effect of vehicle injection on rats of the control groups on the slopes of the regression straight lines, indicating no change of motor readiness as shown in Figures 5 and 6. This suggests that the infusion procedure per se had no effect on motor readiness. In addition, there was no effect on the mean slopes of regression straight lines after infusion of APV as shown in Figure 5 or of haloperidol as depicted in Figure 6. Thus, infusion of APV or haloperidol did not affect motor readiness, i.e., the determination of RT by lengthening of the foreperiod.

Fig. 5.

Effects of intra-NAc infusion of APV or vehicle (VEH) on motor readiness. Slopes (±SEM) of regression straight lines from RTs as a function of the length of foreperiod in correct responses are given. Slopes from drug and control groups in sessions with APV or vehicle infusions and from preceding sessions without infusion were compared. A–C, APV tested in three doses did not significantly affect speeding of RT as a function of foreperiod lengthening (ANOVA with groups and treatment as factors followed by the LSD test).

Fig. 6.

Effects of intra-NAc infusion of haloperidol (HP) or vehicle (VEH) on motor readiness. Slopes (±SEM) of regression straight lines from RTs as a function of the length of foreperiod in correct responses are given. Slopes from drug and control groups in sessions with haloperidol or vehicle infusion and from preceding sessions without infusion were compared. A, B, Haloperidol tested in two doses did not significantly affect speeding of RTs as a function of foreperiod lengthening (ANOVA with groups and treatment as factors followed by the LSD test).

Histology

In all animals that were evaluated (n = 19), cannulae tip placements deviated <0.5 mm from target coordinates in the NAc. One animal was excluded because of misplacement of guide cannulae. The locations of cannulae tips for all evaluated rats are represented in Figure 7.

Fig. 7.

Location of cannulae tips in the NAc (black circles) for all rats used for data analysis. Plates are adaptations from the atlas of Pellegrino et al. (1981). Numbers beside each plate correspond to the anteroposterior distance from bregma (in millimeters).

DISCUSSION

Using an RT task demanding conditioned lever release, the present study demonstrates that in normal animals there was a speeding of RTs with expectancy of a high reward magnitude. Apparently, the predictive information provided by the instructive stimulus produced a reward expectancy that shortened RTs. RTs were also shorter with lengthening of the foreperiod, a relationship probably reflecting motor readiness (Brown and Robbins, 1991). Intra-NAc infusion of vehicle or of the preferential dopamine D2 antagonist haloperidol did not affect guidance of RT by the expected reward magnitude or motor readiness. By contrast, intra-NAc blockade of NMDA receptors with APV dose-dependently impaired determination of RT by the expected reward magnitude, but left motor readiness intact.

The RT task used here involves an adaptation of a hole box task with discriminative stimuli indicating upcoming reward magnitude (Brown and Bowman, 1995) to a lever release task described by Amalric and Koob (1987). Correspondingly, instructive visual stimuli signaling in advance two different reward magnitudes (five vs one pellet) and foreperiods of 200, 500, and 800 msec until presentation of the imperative visual stimulus were introduced in the present lever release task. RT differences between responses associated with high and low reward magnitudes were ∼50 msec and correspond well with those determined in a nine-hole box task (Brown and Bowman, 1995). Likewise, the effect of lengthening foreperiod on RT of ∼100 msec as measured here is in keeping with data from various hole box and Skinner box tasks (Brown and Robbins, 1991; Brown and Bowman, 1995; Brown et al., 1996; Brasted et al., 1997; Blokland, 1998; Brasted et al., 1998), although the length of foreperiod and the number of foreperiod intervals were not exactly identical across these studies.

Intra-NAc dopamine D2 receptors and reward expectancy

Intra-NAc infusion of haloperidol did not affect the number of trials to reach criterion, implying that retrieval of the task was intact. Therefore mnemonic deficits induced by a dopamine D2 receptor blockade in the NAc might be ruled out. In addition, there is no evidence for nonspecific motor impairments associated with intra-NAc infusion of haloperidol. Motor readiness was intact, and the minor increase of the proportion of late responses after haloperidol infusion was similar to the one found in control animals after vehicle infusions, suggesting that this change was a result of the infusion procedure per se. The doses of haloperidol used here have been shown to impair RT performance of rats after infusion into the caudate-putamen (Amalric and Koob, 1989; Blokland and Honig, 1999). The absence of RT deficits after intra-NAc infusion of haloperidol found here is consistent with earlier data that dopamine depletion of the NAc by 6-hydroxydopamine did not impair RT performance in a similar lever release task (Amalric and Koob, 1987) or a hole box task demanding nose pokes (Carli et al., 1989). This confirms the notion that impaired dopamine transmission in the NAc does not result in motor deficits per se interfering with RT performance (Amalric and Koob, 1987).

Moreover, blockade of intra-NAc dopamine D2 receptors by haloperidol did not change the determination of RT by the number of expected pellets. Thus, control of RT by stimuli predictive for future reward magnitude seems not to rely on dopamine D2 receptor-mediated signals in the NAc, at least in well trained animals as used here. Given the poor selectivity of haloperidol for dopamine D2 receptors (D2 receptors: Ki = 1.2 nM; D1 receptors: Ki = 80 nM) (Seeman and Van Tol, 1994), and given the high concentration of haloperidol infused (5 and 12.5 μg in 0.5 μl), an almost complete blockade not only of dopamine D2 but also of D1 receptors is likely. Thus one may assume that intra-NAc dopamine D1 as well as D2 receptors are not involved in RT control by the expected rewards. However, this hypothesis has to be tested in future experiments using selective dopamine D1 antagonists.
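The scale of the difference between the infused haloperidol concentration and the cited Ki values can be made concrete with a short unit conversion. This is a rough sketch, not part of the original analysis: the molecular weight of haloperidol (about 375.9 g/mol) is an assumption not stated in the article, and the result is the concentration in the infusate, which is higher than the concentration reaching receptors after diffusion into tissue.

```python
# Assumption: molecular weight of haloperidol, not given in the article.
HALOPERIDOL_MW_G_PER_MOL = 375.9

# Ki values cited in the text (Seeman and Van Tol, 1994), in mol/L.
KI_D2_MOLAR = 1.2e-9
KI_D1_MOLAR = 80e-9


def infusate_molarity(dose_ug, volume_ul, mw_g_per_mol):
    """Molar concentration (mol/L) of dose_ug dissolved in volume_ul."""
    grams_per_liter = dose_ug / volume_ul  # 1 ug/ul equals 1 g/L
    return grams_per_liter / mw_g_per_mol


for dose_ug in (5.0, 12.5):
    molar = infusate_molarity(dose_ug, 0.5, HALOPERIDOL_MW_G_PER_MOL)
    print(f"{dose_ug} ug in 0.5 ul: {molar * 1e3:.1f} mM, "
          f"{molar / KI_D1_MOLAR:.1e} x Ki(D1), "
          f"{molar / KI_D2_MOLAR:.1e} x Ki(D2)")
```

Under these assumptions both doses give infusate concentrations in the tens of millimolar, several orders of magnitude above either Ki, which is in line with the argument that D1 receptors were probably blocked as well.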

An extensive body of evidence suggests that the NAc plays a fundamental role in the transduction of motivation into action (Mogenson et al., 1980) and that the mesolimbic dopamine system is of major importance for the guidance of goal-directed behaviors by rewarding stimuli (for review, see Berridge and Robinson, 1998; Di Chiara, 1998; Schultz, 1998; Redgrave et al., 1999). In view of these hypotheses, the failure to detect an involvement of intra-NAc dopamine D2 receptors in control of behavior by reward expectancy might be surprising. However, most hypotheses concerning the role of dopamine in reward processes state that dopaminergic signals are particularly important during the initial, incentive part of learning when reward-predicting stimuli are novel and unpredictable (Schultz, 1998). In contrast, after extensive overtraining with stereotyped task performance as in the case of our study, the involvement of mesolimbic dopamine may be less important (Schultz, 1998). In line with this notion, predictable rewarding brain stimulation produced Fos-like immunoreactivity in many forebrain regions, but only very moderately in mesolimbic dopaminergic neurons (Hunt and McGregor, 1998). Thus a dopamine-dependent attribution of the relative salience (Berridge and Robinson, 1998) of stimulus–reward associations guiding RT performance may take place in early steps of training on the task used here. To summarize so far, our data show that in well trained animals the adaptation of instrumental behavior to the expected reward magnitude does not involve dopamine D2 receptor-mediated signals in the NAc.

Intra-NAc NMDA receptors and reward expectancy

Intra-NAc infusion of APV did not affect the number of trials to reach criterion, indicating that treatment-induced mnemonic deficits are unlikely. Correspondingly, intra-NAc infusion of APV impaired response–reinforcement learning only in the early stages of acquisition but not in well trained animals (Kelley et al., 1997). Furthermore, motor impairments after intra-NAc infusion of APV were not observed. Motor readiness was intact, and there was only a minor, albeit significant, increase of the proportion of late responses that was also found occasionally in control animals after vehicle infusions. Thus these latter changes were probably a result of the infusion procedure per se. The doses of APV used here have been shown to impair RT performance after infusion into the caudate-putamen (Baunez et al., 1994). The failure to detect performance deficits after intra-NAc APV infusion might be attributable to the fact that the NAc is not involved in control of pure motor aspects of RT performance as already discussed above with regard to mesolimbic dopamine. Accordingly, cell body lesions of the NAc did not induce RT deficits in a nine-hole box task (Brown and Robbins, 1989).

Blockade of intra-NAc NMDA receptors by APV dose-dependently impaired the speeding of RT associated with an upcoming high reward. Thus guidance of RT by stimuli predictive of different reward magnitudes depends on stimulation of intra-NAc NMDA receptors. An involvement of the NAc in guiding RT performance by predictive information about the reward magnitude to be obtained has been investigated previously in rats using ibotenic acid lesions (Brown and Bowman, 1995). In this elegant study, the determination of RT by the expected reward magnitude was not affected by lesions of the NAc. One might expect that lesion-induced inactivation of the NAc and the pharmacological blockade of intra-NAc NMDA receptors used in the present study produce some overlapping behavioral impairments. However, our data reveal that the impairment in RT determination by the expected reward magnitude was subtle after intra-NAc NMDA receptor blockade. If this deficit occurs only transiently after a lesion, it would be difficult to detect. Furthermore, functional reorganization after a lesion might take place, thereby compensating for this impairment.

There is evidence from in vivo electrophysiological recording experiments that neurons of the dorsal and ventral striatum are sensitive to motivationally significant stimuli that code reward magnitude. In primates tested in a task similar to the one used here, RTs were found to be determined by the expected type of reinforcer, which significantly influenced behavior-related neuronal activity (Hollerman et al., 1998). Also, the expectation of reward modulated electrophysiological responses of striatal neurons in primates, and the saccadic eye movements investigated occurred earlier and faster in the rewarded direction than in nonrewarded directions (Kawagoe et al., 1998). It is likely that reward-related signals are transmitted to the striatum by glutamatergic projections from cortical and limbic regions (McGeorge and Faull, 1989) such as the amygdala, prefrontal, or orbitofrontal cortex, which are involved in processing of the incentive properties of stimuli (Everitt et al., 1989; Watanabe, 1996; DeCoteau et al., 1997; Gallagher et al., 1999; Leon and Shadlen, 1999; Tremblay and Schultz, 1999). Input from these structures converges in the NAc on medium-sized striatal projection neurons involving NMDA and non-NMDA receptors (Albin et al., 1992). To the best of our knowledge, the present data show for the first time that stimulation of intra-NAc NMDA receptors is critically involved in guiding the speed of instrumental responding in well trained animals according to stimuli predictive of reward magnitude.

Footnotes

This research was supported by the Deutsche Forschungsgemeinschaft (Ha2340/3-1).

Correspondence should be addressed to Dr. Wolfgang Hauber, Abteilung Tierphysiologie, Biologisches Institut, Universität Stuttgart, Pfaffenwaldring 57, D-70550 Stuttgart, Germany. E-mail: hauber@po.uni-stuttgart.de.

REFERENCES

  • 1. Albin RL, Marcowiec RL, Hollingsworth ZR, Dure LS, Penney JB, Young AB. Excitatory amino acid binding sites in the basal ganglia of the rat: a quantitative autoradiographic study. Neuroscience. 1992;46:35–48. doi: 10.1016/0306-4522(92)90006-n.
  • 2. Amalric M, Koob GF. Depletion of dopamine in the caudate nucleus but not in nucleus accumbens impairs reaction-time performance in rats. J Neurosci. 1987;7:2129–2134. doi: 10.1523/JNEUROSCI.07-07-02129.1987.
  • 3. Amalric M, Koob GF. Dorsal pallidum as a functional motor output of the corpus striatum. Brain Res. 1989;483:389–394. doi: 10.1016/0006-8993(89)90186-8.
  • 4. Apicella P, Ljungberg T, Scarnati E, Schultz W. Responses to reward in monkey dorsal and ventral striatum. Exp Brain Res. 1991;85:491–500. doi: 10.1007/BF00231732.
  • 5. Baunez C, Nieoullon A, Amalric M. N-methyl-d-aspartate receptor blockade impairs behavioural performance of rats in a reaction time task: new evidence for glutamatergic-dopaminergic interactions in the striatum. Neuroscience. 1994;61:521–531. doi: 10.1016/0306-4522(94)90431-6.
  • 6. Berridge KC, Robinson TE. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res Rev. 1998;28:309–369. doi: 10.1016/s0165-0173(98)00019-8.
  • 7. Blokland A. Involvement of striatal cholinergic receptors in reaction time and fixed-interval responding in rats. Brain Res Bull. 1998;45:21–25. doi: 10.1016/s0361-9230(97)00275-x.
  • 8. Blokland A, Honig W. Intra-striatal haloperidol and scopolamine injections: effects on choice reaction time performance in rats. Eur Neuropsychopharmacol. 1999;9:523–531. doi: 10.1016/s0924-977x(99)00036-x.
  • 9. Brasted PJ, Humby T, Dunnett SB, Robbins TW. Unilateral lesions of the dorsal striatum in rats disrupt responding in egocentric space. J Neurosci. 1997;17:8919–8926. doi: 10.1523/JNEUROSCI.17-22-08919.1997.
  • 10. Brasted PJ, Dobrossy MD, Robbins TW, Dunnett SB. Striatal lesions produce distinctive impairments in reaction time performance in two different operant chambers. Brain Res Bull. 1998;46:487–493. doi: 10.1016/s0361-9230(98)00044-6.
  • 11. Brown VJ, Bowman EM. Discriminative cues indicating reward magnitude continue to determine reaction time of rats following lesions of the nucleus accumbens. Eur J Neurosci. 1995;7:2479–2485. doi: 10.1111/j.1460-9568.1995.tb01046.x.
  • 12. Brown VJ, Robbins TW. Elementary processes of response selection mediated by distinct regions of the striatum. J Neurosci. 1989;9:3760–3765. doi: 10.1523/JNEUROSCI.09-11-03760.1989.
  • 13. Brown VJ, Robbins TW. Simple and choice reaction time performance following unilateral striatal dopamine depletion in the rat. Brain. 1991;114:513–525. doi: 10.1093/brain/114.1.513.
  • 14. Brown VJ, Brasted PJ, Bowman EM. The effect of systemic d-amphetamine on motor versus motivational processes in the rat. Psychopharmacology. 1996;128:171–180. doi: 10.1007/s002130050122.
  • 15. Carli M, Jones GH, Robbins TW. Effects of unilateral dorsal and ventral striatal dopamine depletion on visual neglect in the rat: a neural and behavioural analysis. Neuroscience. 1989;29:309–327. doi: 10.1016/0306-4522(89)90059-6.
  • 16. Cotman CW, Iversen LL. Excitatory amino acids in the brain: focus on NMDA receptors. Trends Neurosci. 1987;10:263–269.
  • 17. DeCoteau WE, Kesner RP, Williams JM. Short-term memory for food reward magnitude: the role of the prefrontal cortex. Behav Brain Res. 1997;88:239–249. doi: 10.1016/s0166-4328(97)00044-2.
  • 18. Di Chiara G. A motivational learning hypothesis of the role of mesolimbic dopamine in compulsive drug use. J Psychopharmacol. 1998;12:54–67. doi: 10.1177/026988119801200108.
  • 19. Everitt BJ. Sexual motivation: a neural and behavioural analysis of the mechanisms underlying appetitive copulatory responses in male rats. Neurosci Biobehav Rev. 1990;14:217–232. doi: 10.1016/s0149-7634(05)80222-2.
  • 20. Everitt BJ, Cador M, Robbins TW. Interactions between the amygdala and ventral striatum in stimulus-reward associations: studies using a second-order schedule of sexual reinforcement. Neuroscience. 1989;30:63–75. doi: 10.1016/0306-4522(89)90353-9.
  • 21. Everitt BJ, Morris KA, O'Brien A, Robbins TW. The basolateral amygdala-ventral striatal system and conditioned place preference: further evidence of limbic-striatal interactions underlying reward-related processes. Neuroscience. 1991;42:1–18. doi: 10.1016/0306-4522(91)90145-e.
  • 22. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
  • 23. Groenewegen HJ, Wright CI, Beijer AVJ. The nucleus accumbens: gateway for limbic structures to reach the motor system? In: Holstege G, Bandler R, Saper CB, editors. Emotional motor system. Elsevier; Amsterdam: 1996. pp. 485–511.
  • 24. Hauber W. Impairments of movement initiation and execution induced by a blockade of dopamine D1 or D2 receptors are reversed by a blockade of N-methyl-d-aspartate receptors. Neuroscience. 1996;73:121–130. doi: 10.1016/0306-4522(96)00036-x.
  • 25. Hauber W. Involvement of basal ganglia transmitter systems in movement initiation. Prog Neurobiol. 1998;56:507–540. doi: 10.1016/s0301-0082(98)00041-0.
  • 26. Hollerman JR, Tremblay L, Schultz W. Influence of reward expectation on behaviour-related neuronal activity in primate striatum. J Neurophysiol. 1998;80:947–963. doi: 10.1152/jn.1998.80.2.947.
  • 27. Hunt GE, McGregor IS. Rewarding brain stimulation induces only sparse fos-like immunoreactivity in dopaminergic neurons. Neuroscience. 1998;83:501–515. doi: 10.1016/s0306-4522(97)00409-0.
  • 28. Kawagoe R, Takikawa Y, Hikosaka O. Expectation of reward modulates cognitive signals in the basal ganglia. Nat Neurosci. 1998;1:411–416. doi: 10.1038/1625.
  • 29. Kelley AE, Smith-Roe SL, Holahan MR. Response-reinforcement learning is dependent on N-methyl-d-aspartate receptor activation in the nucleus accumbens core. Proc Natl Acad Sci USA. 1997;94:12174–12179. doi: 10.1073/pnas.94.22.12174.
  • 30. Leon MI, Shadlen MN. Effect of expected reward magnitude on the response of neurons in the dorsolateral prefrontal cortex of the macaque. Neuron. 1999;24:415–425. doi: 10.1016/s0896-6273(00)80854-5.
  • 31. McGeorge AJ, Faull RLM. The organization of the projection from the cerebral cortex to the striatum in the rat. Neuroscience. 1989;29:503–537. doi: 10.1016/0306-4522(89)90128-0.
  • 32. Mitchell JB, Gratton A. Involvement of mesolimbic dopamine neurons in sexual behaviours: implications for the neurobiology of motivation. Rev Neurosci. 1994;5:217–329. doi: 10.1515/revneuro.1994.5.4.317.
  • 33. Mogenson GJ, Jones DL, Yim CY. From motivation to action: functional interface between the limbic system and the motor system. Prog Neurobiol. 1980;14:69–97. doi: 10.1016/0301-0082(80)90018-0.
  • 34. Pellegrino LJ, Pellegrino AS, Cushmann AJ. A stereotaxic atlas of the rat brain. Plenum; New York: 1981.
  • 35. Redgrave P, Prescott TJ, Gurney K. Is the short-latency dopamine response too short to signal reward error? Trends Neurosci. 1999;22:146–151. doi: 10.1016/s0166-2236(98)01373-3.
  • 36. Robbins TW, Cador M, Taylor JR, Everitt BJ. Limbic-striatal interactions in reward-related processes. Neurosci Biobehav Rev. 1989;13:155–162. doi: 10.1016/s0149-7634(89)80025-9.
  • 37. Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci. 1998;1:155–159. doi: 10.1038/407.
  • 38. Schoenbaum G, Chiba AA, Gallagher M. Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. J Neurosci. 1999;19:1876–1884. doi: 10.1523/JNEUROSCI.19-05-01876.1999.
  • 39. Schultz W. Predictive reward signal of dopamine neurons. J Neurophysiol. 1998;80:1–27. doi: 10.1152/jn.1998.80.1.1.
  • 40. Schultz W, Apicella P, Scarnati E, Ljungberg T. Neuronal activity in monkey ventral striatum related to the expectation of reward. J Neurosci. 1992;12:4595–4610. doi: 10.1523/JNEUROSCI.12-12-04595.1992.
  • 41. Seeman P, Van Tol HH. Dopamine receptor pharmacology. Trends Pharmacol Sci. 1994;15:264–270. doi: 10.1016/0165-6147(94)90323-9.
  • 42. Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525.
  • 43. Watanabe M. Reward expectancy in primate prefrontal neurons. Nature. 1996;382:629–632. doi: 10.1038/382629a0.
  • 44. Winer BJ. Statistical principles in experimental design. McGraw-Hill; New York: 1971.
  • 45. Wise RA, Bozarth MA. A psychomotor stimulant theory of addiction. Psychol Rev. 1987;94:469–492.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
