Response-reinforcement learning is dependent on N-methyl-d-aspartate receptor activation in the nucleus accumbens core

Ann E Kelley; Stephanie L Smith-Roe; Matthew R Holahan

doi:10.1073/pnas.94.22.12174

1997 Oct 28;94(22):12174–12179.


PMCID: PMC23741 PMID: 9342382

Abstract

The nucleus accumbens, a site within the ventral striatum, is best known for its prominent role in mediating the reinforcing effects of drugs of abuse such as cocaine, alcohol, and nicotine. Indeed, it is generally believed that this structure subserves motivated behaviors, such as feeding, drinking, sexual behavior, and exploratory locomotion, which are elicited by natural rewards or incentive stimuli. A basic rule of positive reinforcement is that motor responses will increase in magnitude and vigor if followed by a rewarding event. It is likely, therefore, that the nucleus accumbens may serve as a substrate for reinforcement learning. However, there is surprisingly little information concerning the neural mechanisms by which appetitive responses are learned. In the present study, we report that treatment of the nucleus accumbens core with the selective competitive N-methyl-d-aspartate (NMDA) antagonist 2-amino-5-phosphonopentanoic acid (AP-5; 5 nmol/0.5 μl bilaterally) impairs response-reinforcement learning in the acquisition of a simple lever-press task to obtain food. Once the rats learned the task, AP-5 had no effect, demonstrating the requirement of NMDA receptor-dependent plasticity in the early stages of learning. Infusion of AP-5 into the accumbens shell produced a much smaller impairment of learning. Additional experiments showed that AP-5 core-treated rats had normal feeding and locomotor responses and were capable of acquiring stimulus-reward associations. We hypothesize that stimulation of NMDA receptors within the accumbens core is a key process through which motor responses become established in response to reinforcing stimuli. Further, this mechanism may also play a critical role in the motivational and addictive properties of drugs of abuse.

The basal ganglia constitute a group of structures in the mammalian forebrain that have traditionally been considered as motor control regions. Indeed, one of the most prominent signs of basal ganglia disorders is motor disturbance or impairment (1). However, in more recent years empirical work has implicated the striatum (a major component of the basal ganglia) in cognition, reinforcement mechanisms, and sensorimotor integration (2, 3). Although the outflow of the striatum indeed reaches major extrapyramidal motor centers such as globus pallidus, substantia nigra, and subthalamic nucleus, the input to this structure, arising from the neocortex, limbic system, and midbrain, suggests that it plays a complex integrative role in adaptive motor actions (4, 5).

One region of the striatum that has received considerable attention in this regard is the nucleus accumbens. The nucleus accumbens is best known for its role in mediating the reinforcing and rewarding properties of drugs of abuse (6). Drugs such as cocaine, heroin, alcohol, and even nicotine are hypothesized to produce their rewarding effects via activation of accumbens dopamine (7), and it has been recently postulated that chronic neuroadaptations in this system may underlie the addiction process (8). Parallel research has indicated, not surprisingly, that the nucleus accumbens and its associated circuitry subserve behaviors linked to natural or biological rewards, such as feeding, drinking, sex, exploration, and appetitive learning (9).

Although a significant role in reward-related learning has been attributed to the nucleus accumbens as well as other regions of the striatum, there is surprisingly little information as to the key mechanisms underlying this postulated function. In vivo electrophysiological recording experiments suggest that ventral and dorsal striatal neurons are sensitive to motivationally significant stimuli in the environment and show firing properties during appetitive conditioning tasks consistent with adaptive changes during learning (10, 11). Moreover, much of the information reaching the main striatal output neurons, the medium spiny neurons, arises from cortical and limbic regions via a glutamate-coded projection (12). The medium spiny neurons receiving this information contain both NMDA (N-methyl-d-aspartate) and non-NMDA receptors (13). Within the accumbens, there is a unique convergence of afferent input originating in regions that are concerned with motivational and sensory processing, such as the amygdala, hippocampus, and prefrontal cortex (14). Thus, a logical hypothesis to consider would be that NMDA receptors, which are well known to be a critical component of the neural plasticity associated with long-term potentiation (15), mediate the synaptic modifications associated with reinforcement learning. In the present study we tested this hypothesis in rats undergoing training in a simple lever-pressing task. We first suspected that NMDA receptors within the core of the nucleus accumbens were important for spatial behavior when we found that 2-amino-5-phosphonopentanoic acid (AP-5) in this region disrupted learning in a spatial food-gathering task (16). The present study was initially designed as a control experiment, in which we hypothesized that similar treatment might not affect nonspatial operant responding. To our surprise, operant learning appeared to be even more disrupted than spatial learning.

We have found that selective antagonism of NMDA receptors in the core subregion of the nucleus accumbens during acquisition of this task impairs learning. However, once learning is acquired, NMDA receptors are no longer required for performance of the response. The results suggest that NMDA receptors in the accumbens core constitute an important substrate for basic response-reinforcement learning.

MATERIALS AND METHODS

Animals and Surgery.

A total of 54 male Sprague–Dawley rats was used for these experiments. Care of animals was in accordance with institutional guidelines. Rats were housed in groups, in a temperature- (21°C) and light-controlled (12-h light/12-h dark) animal colony. For cannula implantation, animals were anesthetized with a ketamine/xylazine mixture (100 mg/kg and 10 mg/kg, respectively; Research Biochemicals, Natick, MA). Standard stereotaxic procedures were used to implant bilateral 23-gauge stainless steel guide cannulae, with coordinates based on flat-skull stereotaxic orientation. Cannulae were secured with dental acrylic and stainless steel screws, and a wire stylet was placed into the guide to maintain patency. For the main experiment (experiment I), two groups of rats were implanted with cannulae aimed at the nucleus accumbens core or the nucleus accumbens shell. The coordinates for these sites were as follows: core, A–P, +1.4 mm; M–L, ±1.7 mm; D–V, −5.5 mm; and shell, A–P, +1.0 mm; M–L, ±1.0 mm; D–V, −5.3 mm. For experiments II and III, rats were implanted with cannulae aimed at the accumbens core using the coordinates described. Following several days' recovery from surgery, all rats were put on a restricted diet that maintained body weight at ≈85% of free-feeding weight. Water was freely available at all times in the home cage.

Drugs and Microinfusions.

The selective, competitive NMDA antagonist AP-5 (15) was obtained from Research Biochemicals. Pipradrol hydrochloride was a gift of Merrell Dow (Cincinnati, OH). Intracerebral microinfusions were bilateral in a volume of 0.5 μl. The dose of AP-5 was always 5 nmol (1 μg) per side. Infusions of drug or vehicle were given by lowering a 30-gauge injector cannula to the site of infusion (−7.8 mm from skull for both accumbens placements). A microdrive pump (Harvard Apparatus) was used to administer drug infusions to the site with an infusion time of 1 min 33 s, followed by 1 min of diffusion time. The injectors were then removed, the stylets were replaced, and the animals were placed in the test apparatus immediately. For all experiments, animals were given several preliminary “sham” injections, in which a dummy injector was lowered through the guide to adapt them to the procedure.
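The infusion parameters above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the molar mass is an assumption (free-acid AP-5, C5H12NO5P, about 197.13 g/mol; the text does not state which form was used), and the variable names are hypothetical.

```python
# Back-of-the-envelope check of the infusion parameters quoted in the text.
# Assumption: free-acid AP-5 (C5H12NO5P); the source does not specify the salt form.
AP5_MOLAR_MASS_G_PER_MOL = 197.13

dose_nmol = 5.0          # per side, from the text
volume_ul = 0.5          # per side, from the text
infusion_time_s = 93.0   # 1 min 33 s, from the text

# nmol per microliter is numerically equal to mmol per liter (mM).
concentration_mM = dose_nmol / volume_ul

# nmol * g/mol gives nanograms; divide by 1000 for micrograms.
dose_ug = dose_nmol * AP5_MOLAR_MASS_G_PER_MOL / 1000.0

# Mean pump rate implied by the stated volume and infusion time.
rate_ul_per_min = volume_ul / (infusion_time_s / 60.0)

print(f"concentration: {concentration_mM:.1f} mM")
print(f"dose per side: {dose_ug:.2f} ug")
print(f"infusion rate: {rate_ul_per_min:.2f} ul/min")
```

Under this assumption the mass per side comes out just under 1 μg, in line with the "5 nmol (1 μg)" stated above, and the implied pump rate is roughly a third of a microliter per minute.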

Behavioral Training and Experimental Procedure.

For the first experiment (acquisition of lever-press response for food), rats were trained in operant chambers (Coulbourn Instruments, Allentown, PA) equipped with two levers, a house light, and a red signal light. All stimulus events and data acquisition were controlled with a microcomputer (Paul Fray, Cambridge, U.K.). Before training, animals were adapted to the food pellets (45 mg sucrose pellets). Additionally, the rats had preexposure for two 10-min sessions to the operant test cages with several free pellets available in the food tray (with no levers present). On the first test day and all test days thereafter, rats were placed in the operant chamber for a 15-min session. Responding on one lever resulted in delivery of a food pellet on a variable ratio-2 schedule of reinforcement (an average of every two responses was rewarded). Responding on the other lever had no consequences. Whether the left or right lever was the correct lever varied among animals but was always the same for an individual animal. When a correct response was made, a food pellet was delivered into a food tray located in between the two levers. A photocell located in the food tray recorded nose-pokes. Pellet delivery was accompanied by house light offset and illumination of a red stimulus light on the response panel (3 s), as well as a light in the food tray. Dependent variables recorded included correct responses, incorrect responses, and nose-pokes. Rats were given the appropriate microinfusion (core group: AP-5, n = 8; vehicle, n = 8; shell group: AP-5, n = 8; vehicle, n = 6) immediately before the session for the first 4 test days. They were then tested without any infusion for days 5–9. On day 10, all animals (including the control group) were given a microinfusion of AP-5.
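For readers unfamiliar with ratio schedules, the reinforcement contingency on the correct lever can be sketched as follows. This is not the authors' control program: it assumes a random-ratio approximation in which each correct press is rewarded independently with probability 1/2, which yields an average of one pellet per two responses. How the ratio was actually drawn is not stated in the text, and the function name is hypothetical.

```python
import random


def run_vr2_session(n_correct_presses, rng, mean_ratio=2):
    """Count pellets earned for a number of correct-lever presses.

    Assumption: each press is reinforced independently with probability
    1 / mean_ratio (a random-ratio approximation of a variable-ratio schedule).
    Presses on the incorrect lever earn nothing and are not modeled here.
    """
    pellets = 0
    for _ in range(n_correct_presses):
        if rng.random() < 1.0 / mean_ratio:
            pellets += 1
    return pellets


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_presses = 10_000
pellets = run_vr2_session(n_presses, rng)
print(f"pellets per press: {pellets / n_presses:.3f}")
```

Over many presses the pellets-per-press ratio approaches 0.5, which is the defining property of the schedule described above.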

To ascertain the effects of AP-5 infusions into the accumbens core on motivation for feeding, in a second experiment a group of rats (n = 8) was tested in cages similar to their home cage in a 15-min test of food intake. These animals had a food-deprivation history similar to those used in the learning experiments. With an event recorder linked to a computer, an observer blind to treatment measured duration of feeding, locomotor activity (frequency of crossings of center of cage), and food intake in grams. Animals were given a microinfusion of AP-5 (5 nmol) or vehicle just prior to this test. Infusions were administered in a counterbalanced order, such that all animals received both AP-5 and vehicle treatment. These infusions were administered several days apart.

In a third experiment, a different training procedure was used in two additional groups of rats (AP-5, n = 8; vehicle, n = 8). The purpose of this experiment was to ascertain whether animals treated with infusions of AP-5 in the accumbens core were able to make stimulus-reward, in contrast to stimulus-response, associations. This experiment consisted of two phases. In the first phase, with no levers present, all rats were subjected to a classical (Pavlovian) conditioning procedure in which they were trained to associate the onset of a conditioned stimulus (CS) with delivery of a sucrose pellet (unconditioned stimulus, UCS). The CS was a visual-auditory (compound) stimulus that consisted of house light off, red light on (3 s duration), and the click of the pellet dispenser (1 s duration). This stimulus event was presented on a random interval, 30-s schedule. Subjects were trained not to respond prematurely (i.e., before the compound stimulus was presented) by introducing a 3-s delay in the next possible presentation if a premature nose-poke occurred. If the animal nose-poked during the 3 s of the CS presentation or within 5 s of pellet delivery, it was counted as a discriminated approach (correct) response. Approaches to the food tray outside of this time bin were recorded as premature responses. Animals were trained daily on this schedule for 5 days, with 45 trials per day. Immediately before each of the five sessions, rats were infused with either AP-5 or vehicle in the accumbens core.
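The scoring rule for nose-pokes in the conditioning phase can be written out explicitly. The sketch below is a hypothetical reconstruction, not the authors' software: the pellet delivery time is passed in as a separate argument because its exact timing relative to CS onset is not stated, and the choice of inclusive or exclusive window boundaries is an assumption.

```python
CS_DURATION_S = 3.0         # CS presentation window, from the text
POST_PELLET_WINDOW_S = 5.0  # window after pellet delivery, from the text


def classify_nose_poke(poke_time_s, cs_onset_s, pellet_time_s):
    """Label a nose-poke as 'correct' (discriminated approach) or 'premature'.

    Assumptions: times are in seconds on a common clock, and the window
    boundaries are treated as shown below (not specified in the source).
    """
    in_cs = cs_onset_s <= poke_time_s < cs_onset_s + CS_DURATION_S
    after_pellet = pellet_time_s <= poke_time_s <= pellet_time_s + POST_PELLET_WINDOW_S
    return "correct" if (in_cs or after_pellet) else "premature"


# Hypothetical trial: CS onset and pellet delivery both at t = 10 s.
for t in (4.0, 12.0, 14.5, 16.0):
    print(t, classify_nose_poke(t, cs_onset_s=10.0, pellet_time_s=10.0))
```

In the hypothetical trial, pokes at 12.0 s and 14.5 s fall inside the scoring windows, while pokes at 4.0 s and 16.0 s fall outside them and would be counted as premature responses.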

For the second phase of training in this experiment, which began 2 days following the end of discriminated approach training, and during which animals were given no brain infusions, two levers were introduced into the box. Responding on one [the conditioned reinforcer (CR) lever] resulted in presentation of the compound stimulus (the CR), but no sucrose pellet. Responding on the other lever (no CR lever) had no consequences. It is well established that animals will acquire operant responding reinforced solely by the CS (that becomes the CR), if the CS and the UCS have been explicitly paired (17). Moreover, this responding is greatly enhanced by psychostimulant drugs that increase synaptic dopamine, such as amphetamine, cocaine, and pipradrol (18). In the present case, animals were given three daily 30-min sessions in which responding on the two levers was recorded. On the fourth day, animals were administered pipradrol (5 mg/kg i.p.). On the fifth day, all animals were tested once again without drug.

Histological Analysis.

At the end of experiments, all rats were deeply anesthetized with sodium pentobarbital and perfused transcardially with isotonic saline followed by 10% formalin. The brains were stored in a 10% sucrose-formalin mixture for several days before sectioning. Brains were cut into 60-μm sections and stained for Nissl substance with cresyl violet. The sections were examined with light microscopy and estimated location of infusion sites was recorded on atlas sections.

Statistical Analysis.

Data were analyzed with one-, two-, or three-factor ANOVA, with treatment as the between-subjects factor, and days and lever as the within-subjects (repeated measures) factors in the multifactorial analyses.
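The degrees of freedom reported with the F statistics in the Results follow from the group sizes and the number of days analyzed. The sketch below assumes a standard balanced mixed (split-plot) partition with no missing data; the function name and dictionary keys are hypothetical, and no F or P values are computed.

```python
def mixed_anova_df(n_subjects, n_groups, n_days, n_levers=2):
    """Return (numerator df, error df) for the treatment effect and its interactions.

    Assumption: one between-subjects factor (treatment) and two fully crossed
    within-subjects factors (day, lever), with complete data for every subject.
    """
    g = n_groups - 1              # treatment df
    d = n_days - 1                # day df
    l = n_levers - 1              # lever df
    err = n_subjects - n_groups   # between-subjects error df
    return {
        "treatment": (g, err),
        "day x treatment": (g * d, err * d),
        "lever x treatment": (g * l, err * l),
        "day x lever x treatment": (g * d * l, err * d * l),
    }


# Core groups: 8 AP-5 + 8 vehicle rats, 9 days analyzed.
core = mixed_anova_df(n_subjects=16, n_groups=2, n_days=9)
# Shell groups: 8 AP-5 + 6 vehicle rats, 9 days analyzed.
shell = mixed_anova_df(n_subjects=14, n_groups=2, n_days=9)
print(core)
print(shell)
```

With these inputs the core analysis gives (1, 14) for the treatment effect and (8, 112) for the day × treatment interaction, and the shell analysis gives (1, 12) and (8, 96), matching the degrees of freedom quoted in the Results.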

RESULTS

AP-5 Treatment in Accumbens Core During Response-Reinforcement Learning.

Fig. 1A shows the effects of AP-5 treatment in the accumbens core on acquisition of the lever-press response for food. During the first two sessions, all animals sampled the levers and did not appear to discriminate between them. In the third and fourth sessions, the vehicle-treated animals began to show a preference for the rewarded lever and developed a robust response over the next few sessions. In contrast, animals with AP-5 infusions showed no evidence of learning while under the effects of the drug, and only began to discriminate between the two levers on the sixth and seventh sessions. Analysis of variance of lever press scores over the first 9 days indicated a significant effect of treatment [F(1,14) = 10.9, P < 0.005]. There were also significant day × treatment, lever × treatment, and day × lever × treatment interactions (see legend of Fig. 1 for further statistics). Analysis of the scores for days 5–9 indicated a significant difference between the two groups despite the fact that the treatment was no longer given [F(1,14) = 9.9, P < 0.007]. AP-5 had no significant effect on responding when administered to either group once they had learned the task (day 10). [The lower mean for the AP-5-treated group on days 9 and 10 is due to two rats that never learned. The performance of the remaining rats was similar to controls (mean on day 9 for vehicle rats = 156 ± 31, drug rats = 158 ± 17; mean on day 10 for vehicle rats = 182 ± 26, drug rats = 145 ± 23)].

Fig. 1. Influence of NMDA antagonist infusion into nucleus accumbens core on acquisition of lever-pressing for sucrose pellets. Animals received intra-accumbens infusion of AP-5 (5 nmol) or vehicle (saline) on the first four test days; on the remaining training days, no infusion was given except on day 10, when all animals (including those previously infused with vehicle) were infused with AP-5. (A) Lever presses. In addition to the treatment effect (∗∗), there were significant day × treatment (††) [F(8,112) = 4.7, P < 0.001], lever × treatment [F(1,14) = 17.6, P < 0.001], and day × lever × treatment [F(8,112) = 3.9, P < 0.001] interactions when all 9 days were analyzed. See text for further differences between groups. Also note that a separate ANOVA of days 3 and 4 (when animals were still receiving treatment) revealed a significant treatment effect [F(1,14) = 7.1, P < 0.01], day × treatment interaction [F(1,14) = 13.4, P < 0.01], lever × treatment interaction [F(1,14) = 16.7, P < 0.001], and day × lever × treatment interaction [F(1,14) = 18.9, P < 0.001]. These interactions indicate that on these days, control animals were beginning to learn to lever-press and to discriminate between correct and incorrect levers, whereas the AP-5-treated rats were not. Inc, incorrect responses. (B) Nose-pokes into food tray during learning. ∗∗, Significant effect of treatment; ††, day × treatment interaction, P < 0.01.

Nose-pokes into the food tray (the unconditioned response emitted to obtain food) also appeared to follow a learning curve (Fig. 1B). In the control animals, there was a progressive increase in the number of nose-pokes that closely correlated with learning the operant response. AP-5-treated rats also showed an increase in nose-pokes over time, but this too lagged behind the controls and only began to increase in the absence of drug treatment. Analysis of variance revealed a significant treatment effect for nose-pokes over the first 9 days of testing [F(1,14) = 9.6, P < 0.008]. There was also a significant day × treatment interaction, indicating that the development of this response was differentially affected by treatment [F(8,112) = 3.0, P < 0.004].

AP-5 Treatment in Accumbens Shell During Response-Reinforcement Learning.

AP-5 treatment into the nucleus accumbens shell had a measurable effect upon learning, although the magnitude of this deficit was considerably smaller than that induced by AP-5 in the core (Fig. 2A). There was no overall significant treatment effect [F(1,12) = 0.2, P < 0.64], nor was there a significant lever × treatment interaction (P < 0.4). However, there was a significant day × lever × treatment interaction [F(8,96) = 3.9, P < 0.001]. This effect indicates that on certain days, treatment differentially affected correct responding. It can be observed from Fig. 2A that on days 4 and 5, when control animals were beginning to show a preference for the rewarded lever, AP-5-treated animals showed less discrimination. ANOVA conducted on these 2 days revealed a significant lever × treatment interaction [F(1,12) = 4.9, P < 0.002]. Fig. 2B shows that all animals increased their nose-poke response over days, regardless of treatment.

Fig. 2. Influence of NMDA antagonist infusion into nucleus accumbens shell on acquisition of lever-pressing for sucrose pellets. See legend of Fig. 1 for further details. (A) Lever presses. ††, P < 0.01, day × lever × treatment interaction. ∗∗, P < 0.01, day × lever interaction for days 4 and 5. Inc, incorrect responses. (B) Nose-pokes into food tray.

Feeding and Locomotion in Food-Deprived Rats After AP-5 Treatment in Accumbens Core.

It can be observed from Table 1 that AP-5 treatment appeared not to affect any measures of feeding behavior or locomotor activity. Responses were comparable in both treatment groups, and no obvious differences were noted between the two groups in terms of overt behavior.

Table 1.

Influence of intra-accumbens AP-5 treatment on feeding and locomotor activity in food-deprived rats (15-min test)

Accumbens treatment (n = 8)	Food intake, g	Feeding duration, s	Locomotion
AP-5 (5 nmol)	4.4 ± 0.6	573 ± 49	20 ± 4
Vehicle	4.2 ± 0.8	631 ± 24	18 ± 3


Data represent means ± SEM. Locomotion is frequency of cage crossings. No significant differences between treatments.

AP-5 Treatment in Accumbens Core During Discriminated Approach Learning and Subsequent Acquisition of Responding for Conditioned Reinforcement.

The data from the 5 days of discriminated approach training (classical conditioning) are shown in Table 2. There was a significant effect of days and no interactions, indicating that both treatment groups learned to nose-poke during the time bin surrounding presentation of the CS over days. However, there was also a significant effect of treatment on responses, with AP-5 attenuating correct approaches on days 2–5 [F(1,14) = 5.2, P < 0.04]. Fig. 3 shows the results of the second phase, when levers were introduced into the box. Over the first 3 days, animals developed a preference for the lever that resulted in presentation of the CR. There was a main effect of lever [F(1,13) = 5.5, P < 0.04], but no effects of pretreatment or interactions. These results indicate that both groups of animals, whether pretreated with AP-5 or vehicle during the classical conditioning phase, formed stimulus-reward associations. Treatment with pipradrol, a potent dopamine-releasing drug, markedly and selectively enhanced lever pressing for the CR in both groups (Fig. 3B). Moreover, the treatment with pipradrol enhanced acquisition of reward-related responding; analysis of lever pressing on the days before and after pipradrol (Fig. 3A) indicated a significant effect of day [F(1,13) = 9.1, P < 0.01] and a significant day × lever interaction [F(1,13) = 4.83, P < 0.05].

Table 2.

Responses during discriminated approach training

Accumbens treatment	Day 1	Day 2	Day 3	Day 4	Day 5
Correct approaches
AP-5 (n = 8)^*	20 ± 4	24 ± 4	31 ± 3	35 ± 2	38 ± 3
Vehicle (n = 8)	21 ± 4	36 ± 3	40 ± 1	43 ± 1	44 ± 1
Premature responses
AP-5	344 ± 86	280 ± 40	437 ± 45	407 ± 54	298 ± 67
Vehicle	426 ± 81	423 ± 79	364 ± 59	237 ± 25	220 ± 45


Data are expressed as means ± SEM. Each conditioning session was approximately 40 min.

*P < 0.05, main treatment effect. Dose of AP-5, 5 nmol (bilaterally).

Fig. 3. Acquisition of responding for CR in animals treated previously with AP-5 or vehicle in the nucleus accumbens core during discriminated approach (classical conditioning) training, as shown in Table 2. (Note that animals did not have intra-accumbens treatment during the tests for which data are shown.) Animals were exposed to CR and NCR (no CR) levers on days 1–3, and on day 5, with no drug treatment. On day 4, all rats were given the dopamine-releasing drug pipradrol (5 mg/kg i.p.). (A) Comparison of responding between days 3 and 5 revealed a significant effect of day (∗∗, P < 0.01) and a significant day × lever interaction (†, P < 0.05), indicating that treatment with pipradrol on day 4 potentiated acquisition of responding. On days 1–3 animals also showed significant preference for the CR lever, regardless of pretreatment (see text). C, correct (CR) lever; I/Inc, incorrect (NCR) lever. (B) Treatment with pipradrol markedly and selectively enhanced responding for the CR lever, although the magnitude of enhancement varied considerably between rats.

DISCUSSION

We have found that administration of a selective antagonist of NMDA receptors in the nucleus accumbens blocks response-reinforcement learning. Although the basal ganglia have long been implicated in motor learning, we demonstrate a role for striatal NMDA receptors in the process whereby a simple motor response becomes “stamped in” when followed by a reward. This result suggests a fundamental role for glutamate-mediated mechanisms in the accumbens core in instrumental learning.

It can be noted that on the third and fourth day of testing for experiment I (see Fig. 1), control animals began to discriminate between the two levers and rapidly increased the efficacy of lever-pressing thereafter. AP-5-treated animals sampled the levers (correct responding being about 17–20 responses on the first few days) and ate the food that was delivered, but did not appear to make the connection between performance of the response and availability of reward. This effect cannot easily be accounted for by a lack of motivation, because in separate tests of hungry AP-5-treated animals, feeding behavior and intake were no different from those of vehicle-infused rats. Moreover, a motor impairment cannot explain the lack of responding, because AP-5 either has no effect on locomotor activity (present data) or in some cases slightly increases it (19). Additional evidence against a performance impairment unrelated to learning is the fact that once the behavior was learned, AP-5 no longer had any effect. The most parsimonious interpretation of these data is that exposure to the task in the presence of AP-5 prevented a process whereby the rat makes a connection between its behavioral actions and the availability of the rewarding stimulus. Moreover, this effect is much stronger in the accumbens core subregion than in the accumbens shell. AP-5 infusion into the accumbens shell, 0.7 mm medial to the core site, caused only a mild impairment, evident only on 2 of the 10 test days. This dissociation is particularly interesting, in that the core has preferential connections to motor basal ganglia circuits, which may be more important for motor learning (20).

It is interesting that the nose-poke response was also markedly impaired by core AP-5 infusion. In the control group, this response increases sharply in parallel with operant learning. (Although note that it also reaches an asymptote or even decreases in the later stages of learning.) We interpret this increase to be reflective of acquired incentive motivation; as responding becomes invigorated, animals receive many more pellets at the food tray. The increase in reward, presumably resulting in activation of dopaminergic neurons (11, 21, 22), may thus enhance appetitive motor responses. In the core AP-5 group, this behavior only develops once treatment has ceased. This effect is also anatomically specific inasmuch as nose-poking develops normally in animals treated with AP-5 in the shell of the accumbens. Thus, learning of both operant and unconditioned motor responses appears to depend on an NMDA-receptor-dependent mechanism. Furthermore, it is particularly noteworthy that once motor learning was established, core infusion of AP-5 did not influence performance or “retention” of the response. Thus, NMDA receptors appear to play very little role in established acquired responses.

The data further suggest a dissociation between response-reinforcement and stimulus-reward associative learning. In the test of acquisition of reward-related responding, both groups learned to lever-press selectively for the CR, indicating that this stimulus must have acquired affective value for the animal. Moreover, both groups showed similar response potentiation by pipradrol. It has been previously established that responding for conditioned reward does not develop, nor do stimulants increase it, if the CS and the UCS have been randomly paired (23). Therefore, it seems likely that treatment with AP-5 in the accumbens core did not impair the mechanism whereby a stimulus acquires motivational value. However, it is important to note that during the classical conditioning phase, AP-5 treatment induced a small but significant impairment in correct discriminated approaches. Although this response is not a contingency for obtaining food, the correct timing and efficacy of the response involves learned motor performance, and this too may be impaired by blockade of NMDA receptors. Thus, NMDA receptor activation in the core does not appear to play a major role in stimulus-reward associative learning, which may more likely involve the amygdala and other cortical systems (24).

There is much evidence for a major role of the basal ganglia in motor learning and reward processing. Neurons in the monkey ventral striatum are sensitive to both primary and conditioned rewards (11, 12). Moreover, neuronal response plasticity has been demonstrated in striatal neurons during behavioral learning. During acquisition of sensorimotor conditioning in monkeys, in which a cue predicts delivery of juice reward, there is a progressive increase in the number of tonically active neurons that respond to the cue (10). Indeed, an important theory of striatal function posits that this structure is crucial for the acquisition and performance of relatively automatic learned “habits,” or basic stimulus-response learning (25–28). Lesions of the ventral or dorsal striatum have been found to impair acquisition on a variety of learning tasks, particularly when animals are required to use fixed cues to improve performance (27, 29–31). However, although it is always assumed that the corticostriatal pathway mediates these effects, to our knowledge there has been no direct demonstration of glutamate-dependent processes in instrumental response learning.

If the striatum plays a key role in sensorimotor learning, it must have access to essential information arising from corticolimbic structures. The accumbens core receives converging input from midbrain, amygdala, hippocampus, mediodorsal thalamus, and prefrontal cortex, which together transmit salient signals concerning sensory, affective, and cognitive information. These afferents are coded by glutamate and synapse directly onto the medium spiny striatal output neurons (32, 33). These neurons contain both NMDA and non-NMDA receptors (13) and project to motor output systems. In other brain regions, primarily hippocampus, NMDA receptors are well established to be an initial key element in the postsynaptic activation and subsequent intracellular molecular events necessary for long-term potentiation and synaptic plasticity (15, 34, 35). Thus, it is logical to hypothesize that NMDA receptor activation in the accumbens core may be an important mechanism through which stimulus-response associative learning occurs. Electrophysiological evidence supports this notion; tetanic stimulation of hippocampal, amygdala, or prefrontal inputs to accumbens induces long-term potentiation or long-term depression in nucleus accumbens neurons (36–38).

Recently, several theoretical models have been proposed to explain reinforcement learning in corticostriatal systems (39–41). These models postulate an interaction of dopaminergic and corticostriatal synapses, and consequent integrated molecular signals, on the dendritic spines of striatal medium-size spiny output neurons. Activity in spiny neurons is largely dependent on excitatory input from cortex. Influx of calcium via NMDA receptors in association with dopamine-mediated intracellular changes (such as in the cAMP system) is proposed as essential for the cellular basis of reinforcement. In support of this notion, a recent report demonstrated long-term enhancement of synaptic strength when cortical striatal excitation and dopaminergic activation were temporally coordinated (42); it has also been found that dopamine selectively enhances NMDA-induced excitations in striatal slices (43). There is some evidence that enhanced dopaminergic function can improve learning and memory (44). Although the role of dopamine in the present study has not been examined, our results provide behavioral evidence for these models. The enhanced dopaminergic signal within the accumbens, provided by both food deprivation and availability of food reward (21, 22), undoubtedly plays an important role in modulating response learning. We further suggest that activation of NMDA receptors is a necessary component of this process. Whether other striatal regions in addition to accumbens core also contribute to this process awaits further study.

In addition to having implications for the understanding of brain mechanisms in relation to natural reinforcers, these results may also be considered to be relevant to the process of drug dependence. It is well established that the nucleus accumbens mediates drug reward (6), and it has been known for some time that NMDA receptors located within the mesocorticodopaminergic system may play a significant role in drug sensitization (45) and in the long-term neuronal adaptations that drugs of abuse induce (46). Therefore, it is possible that, as for natural reinforcers, the process of drug dependence involves plasticity-related neuroadaptations within the accumbens core.

Acknowledgments

We thank Leonard A. Levin for insightful discussions and Kenneth Sadeghian for technical assistance. This work was supported by Grant DA04788 from the National Institute on Drug Abuse.

ABBREVIATIONS

NMDA: N-methyl-d-aspartate
AP-5: 2-amino-5-phosphonopentanoic acid
CS: conditioned stimulus
UCS: unconditioned stimulus
CR: conditioned reinforcer

References

1. Albin R L, Young A B, Penney J B. Trends Neurosci. 1995;18:63–64.
2. Rolls E T. Rev Neurol (Paris) 1994;150:8–9.
3. Graybiel A M, Aosaki T, Flaherty A W, Kimura M. Science. 1994;265:1826–1831. doi: 10.1126/science.8091209.
4. Alexander G E, DeLong M R, Strick P L. Annu Rev Neurosci. 1986;9:357–381. doi: 10.1146/annurev.ne.09.030186.002041.
5. Nauta W J H. In: Neurology and Psychiatry: A Meeting of Minds. Mueller, editor. Basel: Karger; 1989. pp. 43–63.
6. Wise R A, Bozarth M A. Psychol Rev. 1987;94:469–492.
7. Di Chiara G, Imperato A. Proc Natl Acad Sci USA. 1988;85:5274–5278. doi: 10.1073/pnas.85.14.5274.
8. Nestler E J, Hope B T, Widnell K L. Neuron. 1993;11:995–1006. doi: 10.1016/0896-6273(93)90213-b.
9. Robbins T W, Everitt B J. Curr Opin Neurobiol. 1996;6:228–236. doi: 10.1016/s0959-4388(96)80077-8.
10. Aosaki T, Tsubokawa H, Ishida A, Watanabe K, Graybiel A M, Kimura M. J Neurosci. 1994;14:3969–3984. doi: 10.1523/JNEUROSCI.14-06-03969.1994.
11. Schultz W, Apicella P, Ljungberg T. J Neurosci. 1993;13:900–913. doi: 10.1523/JNEUROSCI.13-03-00900.1993.
12. Smith A D, Bolam J P. Trends Neurosci. 1990;13:259–265. doi: 10.1016/0166-2236(90)90106-k.
13. Tallaksen-Green S J, Wiley R G, Albin R L. Brain Res. 1992;594:165–170. doi: 10.1016/0006-8993(92)91044-f.
14. Groenewegen H J, Wright C I, Beijer A V J. Prog Brain Res. 1996;107:485–511. doi: 10.1016/s0079-6123(08)61883-x.
15. Davis S, Butcher S P, Morris R G M. J Neurosci. 1992;12:21–34. doi: 10.1523/JNEUROSCI.12-01-00021.1992.
16. Maldonado-Irizarry C S, Kelley A E. Behav Pharmacol. 1995;6:527–539.
17. Robbins T W. Psychopharmacologia (Berlin) 1975;45:103–114.
18. Robbins T W. Psychopharmacology. 1978;58:79–87. doi: 10.1007/BF00426794.
19. Kelley A E, Throne L C. Brain Res Bull. 1992;29:247–254. doi: 10.1016/0361-9230(92)90034-u.
20. Zahm D S, Brog J S. Neuroscience. 1992;50:751–767. doi: 10.1016/0306-4522(92)90202-d.
21. Hernandez L, Hoebel B G. Physiol Behav. 1988;44:599–606. doi: 10.1016/0031-9384(88)90324-1.
22. Wilson C, Nomikos G G, Collu M, Fibiger H C. J Neurosci. 1995;15:5169–5178. doi: 10.1523/JNEUROSCI.15-07-05169.1995.
23. Cador M, Taylor J R, Robbins T W. Psychopharmacology. 1991;104:377–385. doi: 10.1007/BF02246039.
24. Gallagher M, Holland P C. Proc Natl Acad Sci USA. 1994;91:11771–11776. doi: 10.1073/pnas.91.25.11771.
25. Mishkin M, Petri H L. In: Neuropsychology of Memory. Butters N, Squire L R, editors. New York: Guilford; 1984. pp. 287–296.
26. Packard M G, White N M. Behav Neural Biol. 1990;53:39–50. doi: 10.1016/0163-1047(90)90780-a.
27. Reading P J, Dunnett S B, Robbins T W. Behav Brain Res. 1991;45:147–161. doi: 10.1016/s0166-4328(05)80080-4.
28. Packard M G, McGaugh J L. Behav Neurosci. 1992;106:439–446. doi: 10.1037//0735-7044.106.3.439.
29. McDonald R J, White N M. Behav Neurosci. 1993;107:3–22. doi: 10.1037//0735-7044.107.1.3.
30. Sutherland R J, Rodriguez A J. Behav Brain Res. 1989;32:265–277. doi: 10.1016/s0166-4328(89)80059-2.
31. Annett L E, McGregor A, Robbins T W. Behav Brain Res. 1989;31:231–242. doi: 10.1016/0166-4328(89)90005-3.
32. Totterdell S, Smith A S. J Chem Neuroanat. 1989;2:285–298.
33. Sesack S R, Pickel V M. Brain Res. 1990;527:266–279. doi: 10.1016/0006-8993(90)91146-8.
34. Bliss T V, Collingridge G L. Nature (London) 1993;361:31–39. doi: 10.1038/361031a0.
35. Malenka R C. Cell. 1994;78:535–538. doi: 10.1016/0092-8674(94)90517-7.
36. Uno M, Ozawa N. Neurosci Res. 1991;12:251–262. doi: 10.1016/0168-0102(91)90115-f.
37. Pennartz C M, Ameerun R F, Groenewegen H J, Lopes da Silva F H. Eur J Neurosci. 1993;5:107–117. doi: 10.1111/j.1460-9568.1993.tb00475.x.
38. Kombian S B, Malenka R C. Nature (London) 1994;368:242–245. doi: 10.1038/368242a0.
39. Kötter R. Prog Neurobiol. 1994;44:163–196. doi: 10.1016/0301-0082(94)90037-x.
40. Wickens J, Kötter R. In: Models of Information Processing in the Basal Ganglia. Houk J C, Davis J L, Beiser D G, editors. Cambridge, MA: MIT Press; 1995. pp. 187–214.
41. Houk J C, Adams J L, Barto A G. In: Models of Information Processing in the Basal Ganglia. Houk J C, Davis J L, Beiser D G, editors. Cambridge, MA: MIT Press; 1995. pp. 249–270.
42. Wickens J R, Begg A J, Arbuthnott G W. Neuroscience. 1996;70:1–5. doi: 10.1016/0306-4522(95)00436-m.
43. Cepeda C, Buchwald N A, Levine M S. Proc Natl Acad Sci USA. 1993;90:9576–9580. doi: 10.1073/pnas.90.20.9576.
44. Packard M G, White N M. Behav Neurosci. 1991;105:295–306. doi: 10.1037//0735-7044.105.2.295.
45. Wolf M E, White F J, Hu X-T. J Neurosci. 1994;14:1735–1745. doi: 10.1523/JNEUROSCI.14-03-01735.1994.
46. Konradi C, Leveque J C, Hyman S E. J Neurosci. 1996;16:4231–4239. doi: 10.1523/JNEUROSCI.16-13-04231.1996.
