The Journal of Neuroscience. 2014 Jan 22;34(4):1358–1369. doi: 10.1523/JNEUROSCI.4592-13.2014

δ-Opioid and Dopaminergic Processes in Accumbens Shell Modulate the Cholinergic Control of Predictive Learning and Choice

Vincent Laurent 1,*, Jesus Bertran-Gonzalez 2,*, Billy C Chieng 1,*, Bernard W Balleine 1
PMCID: PMC3898294  PMID: 24453326

Abstract

Decision-making depends on the ability to extract predictive information from the environment to guide future actions. Outcome-specific Pavlovian-instrumental transfer (PIT) provides an animal model of this process in which a stimulus predicting a particular outcome biases choice toward actions earning that outcome. Recent evidence suggests that cellular adaptations of δ-opioid receptors (DORs) on cholinergic interneurons (CINs) in the nucleus accumbens shell (NAc-S) are necessary for PIT. Here we found that modulation of DORs in CINs critically influences D1-receptor (D1R)-expressing projection neurons in the NAc-S to promote PIT. First, we assessed PIT-induced changes in signaling processes in dopamine D1- and D2-receptor-expressing neurons using drd2-eGFP mice, and found that PIT-related signaling was restricted to non-D2R-eGFP-expressing neurons, suggesting major involvement of D1R-neurons. Next we confirmed the role of D1Rs pharmacologically: the D1R antagonist SCH-23390, but not the D2R antagonist raclopride, infused into the NAc-S abolished PIT in rats, an effect that depended on DOR activity. Moreover, asymmetrical infusion of SCH-23390 and the DOR antagonist naltrindole into the NAc-S also abolished PIT. DOR agonists were found to sensitize the firing responses of CINs in brain slices prepared immediately after the PIT test. We confirmed the opioid-acetylcholinergic influence over D1R-neurons by selectively blocking muscarinic M4 receptors in the NAc-S, which tightly regulate the activity of D1Rs, a treatment that rescued the deficit in PIT induced by naltrindole. We describe a model of NAc-S function in which DORs modulate CINs to influence both D1R-neurons and stimulus-guided choice between goal-directed actions.

Keywords: goal-directed action, choice, nucleus accumbens shell, δ-opioid receptor, dopamine receptor, muscarinic acetylcholine receptor M4

Introduction

Pavlovian conditioning imbues a stimulus with the ability to influence future actions; i.e., a stimulus associated with a particular outcome biases choice toward actions that earn that same outcome (Colwill and Rescorla, 1988; Dickinson and Balleine, 1994; Holmes et al., 2010). Considerable evidence suggests that this outcome-specific Pavlovian-instrumental transfer effect (PIT) requires activity in the nucleus accumbens shell (NAc-S); lesion (Corbit et al., 2001) or inactivation (Corbit and Balleine, 2011) of the NAc-S removes the influence of predictive stimuli on choice. Interestingly, these manipulations appeared not to affect either the stimulus–outcome or action–outcome relationships that interact in the expression of specific PIT, suggesting that the NAc-S extracts and integrates information from both Pavlovian and instrumental training to control the influence of outcome-related stimuli on choice between actions.

Recent evidence suggests that δ-opioid receptors (DORs) in the NAc-S play an essential role in specific PIT (Laurent et al., 2012; Bertran-Gonzalez et al., 2013). We first observed that specific PIT was selectively impaired by genetic deletion of DORs, as well as by systemic or local blockade of DORs in the NAc-S (Laurent et al., 2012). Subsequently, we found that successful outcome-specific transfer was associated with an increase of DOR expression in the somatic membrane of cholinergic interneurons (CINs) in the NAc-S. This plastic cellular response occurred during Pavlovian conditioning and required the presence of specific stimulus–outcome relationships (Bertran-Gonzalez et al., 2013). Together, these findings suggest that Pavlovian conditioning produced a persistent translocation of DORs to the somatic membrane of NAc-S CINs, and that this translocation is later necessary to promote the influence of outcome-related stimuli on choice between actions. Nevertheless, the effects of these local changes in cholinergic activity on the overall functioning of the NAc-S remain unknown.

It is well established that, like the rest of the striatum, the NAc-S is composed almost exclusively of medium-sized spiny neurons (MSNs), which constitute the only output neurons of this structure (Gerfen and Surmeier, 2011). These MSNs, which are intermingled in the tissue, can be separated into two distinct populations based on the dopamine receptor type they express, a distinction that, at least in dorsal areas, reflects the divergent pathways projecting from the striatum (Bertran-Gonzalez et al., 2010). Although this segregation appears less pronounced in ventral striatal regions (Smith et al., 2013), striatonigral MSNs generally express dopamine D1 receptors (D1Rs), whereas striatopallidal MSNs are enriched in dopamine D2 receptors (D2Rs). Given this distinction, the present experiments first aimed to determine which projection population in the NAc-S was primarily involved in the expression of specific PIT and, second, to investigate how DOR-mediated changes in cholinergic activity modulate MSN function to promote the influence of Pavlovian predictors on choice between actions.

Materials and Methods

Subjects

Swiss-Webster drd2-eGFP mice carrying a bacterial artificial chromosome (BAC) expressing eGFP under drd2 regulatory sequences were obtained from the GENSAT (Gene Expression Nervous System Atlas) program at Rockefeller University (New York, NY). These mice were crossed with C57BL/6 wild-type mice, and stable heterozygous mice from the F1 generation were used in this study (13 mice in total, male, and aged 8–10 weeks). They were housed in plastic boxes (two to five mice per box) located in a climate-controlled colony room and were maintained on a 12 h light/dark cycle. A total of 109 experimentally naive Long–Evans rats (aged 7–12 weeks) were obtained from Monash University Animal Research Platform. They were housed in plastic boxes (two or three rats per box) located in a separate colony room. Five days before the behavioral procedures, all animals were handled daily and were put on a food deprivation schedule to maintain them at ∼85% of their ad libitum feeding weight. The Animal Ethics Committee at the University of Sydney approved all experimental procedures.

Apparatus

Training and testing took place in 32 MED Associates operant chambers (16 for mice and 16 for rats) enclosed in sound- and light-resistant shells. Each chamber was equipped with a pump fitted with a syringe that could deliver 0.1 ml of a 20% sucrose solution into a recessed magazine in the chamber. Each chamber was also equipped with two pellet dispensers that could individually deliver either grain food pellets (20 mg for mice and 45 mg for rats; BioServe Biotechnologies) or chocolate food pellets (20 mg for mice) when activated. The chambers contained two retractable levers that could be inserted to the left and right side of the magazine. An infrared photobeam crossed the magazine opening, allowing for the detection of head entries. A 3 W, 24 V house light provided illumination of the operant chamber, and each chamber contained a Sonalert that, when activated, delivered a 3 kHz pure tone, a 28 V DC mechanical relay that was used to deliver a 2 Hz clicker stimulus, and a white noise generator (80 dB). A set of four microcomputers running MED Associates proprietary software (Med-PC) controlled all experimental events and recorded magazine entries and lever presses.

Drugs

R-(+)-SCH-23390 hydrochloride (SCH; Sigma-Aldrich), a selective D1R antagonist, S-(−)-raclopride (+)-tartrate salt (RAC; Sigma-Aldrich), a selective D2R antagonist, and Muscarinic Toxin 3 (MT3; Peptides International), a highly selective muscarinic M4 receptor (M4R) antagonist, were all dissolved in 0.9% (w/v) nonpyrogenic saline. Two concentrations (2.5 μg/μl and 1 μg/μl) of the D1 and D2 dopamine antagonists were used (Baldo et al., 2002; Bossert et al., 2007; Faure et al., 2008), whereas one concentration (4 μg/μl) of MT3 was used (Diehl et al., 2007). For intracranial infusions, the DOR antagonist naltrindole hydrochloride (NAL; Tocris Bioscience) was dissolved in 0.9% (w/v) saline containing 5% DMSO to obtain a final concentration of 5 μg/μl (Laurent et al., 2012). The same concentration (Perrine et al., 2006) was used for systemic (intraperitoneal) administration of NAL at a volume of 10 ml/kg; in that case, NAL was dissolved in distilled water. Either 0.9% (w/v) saline (SCH, RAC, and MT3), 0.9% (w/v) saline containing 5% DMSO (for intracranial injection of NAL), or distilled water (for systemic injection of NAL) was used as vehicle (VEH) in each case to control for any effect of the injection procedure per se. Working drug concentrations for electrophysiology were as follows: picrotoxin (100 μM; Sigma-Aldrich), CNQX disodium and DL-AP5 (10 and 100 μM, respectively; Ascent Scientific), deltorphin II (300 nM; Tocris Bioscience). During cell-attached and whole-cell recording experiments, stock solutions of all drugs were diluted to working concentrations in the extracellular solution immediately before use and applied by continuous superfusion.

Surgery and microinjections

At the time of surgery, rats weighed between 290 and 360 g. Rats were anesthetized with a continuous flow of mixed isoflurane and oxygen and then placed in a stereotaxic frame (Kopf Instruments) with the incisor bar set at −3.3 mm. The scalp was retracted to expose the skull, and 26 gauge guide cannulae (Plastics One) were bilaterally implanted through holes drilled in the skull in one of the targeted structures. Two different sets of coordinates (indicated in millimeters relative to bregma) were used for the core region of the nucleus accumbens: one for the left [anteroposterior (AP), +1.2; mediolateral (ML), −2.1; dorsoventral (DV), −6.0] and one for the right (AP, +1.2; ML, −3.2; DV, −6.2; angled 10° toward the midline in the coronal plane) hemisphere. The coordinates used for the shell region of the nucleus accumbens were the following: AP, +1.7; ML, ±0.75; DV, −6.4. The guide cannulae were maintained in position with dental cement, and dummy cannulae were kept in each guide at all times except during microinjections. Immediately after the surgical procedure, rats were injected i.p. with a prophylactic (0.4 ml) dose of 300 mg/kg solution of procaine penicillin. Rats were allowed 3 d to recover from surgery, during which time they were handled and weighed daily.

SCH-23390, raclopride, naltrindole, MT3, and vehicle were infused into either the core or the shell region of the nucleus accumbens by inserting a 33 gauge infusion cannula into the guide. The infusion cannulae were connected to a 25 μl glass syringe attached to an infusion pump (KD Scientific, SDR Clinical Technology) and projected 1 mm ventral to the tip of the guide. A total volume of 0.2 μl was delivered at a rate of 0.1 μl/min. The infusion cannula remained in place for a further 1 min after the infusion and was then removed. On the day before the first infusion, the dummy cannula was removed and the infusion pump was turned on for 2 min to familiarize the rats with the procedure and thereby minimize any stress it might produce when infusions occurred.

Behavioral procedures

Contingent Pavlovian training.

All animals received eight daily sessions of Pavlovian training during which the levers were retracted. Each session lasted 60 min and consisted of presentations of two conditioned stimuli (noise and clicker for mice, tone and clicker for rats), each paired with one of the two food outcomes used (grain or chocolate pellets for mice, grain pellets or sucrose solution for rats). Each stimulus lasted 2 min and was presented four times in a pseudorandom order with a variable intertrial interval of 5 min. The stimulus–outcome relationships were fully counterbalanced. The sucrose or pellets were delivered on a random-time 30 s schedule throughout the appropriate stimulus.
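As a rough illustration of the random-time 30 s schedule, the Python sketch below draws outcome deliveries within one 2 min stimulus and computes the expected number of deliveries per session. It assumes an independent draw every second with probability 1/30, which the text does not specify; the function names are hypothetical.

```python
import random

STIMULUS_S = 120       # each stimulus lasted 2 min
MEAN_INTERVAL_S = 30   # random-time 30 s schedule
PRESENTATIONS = 4      # presentations of each stimulus per session


def random_time_deliveries(rng, stimulus_s=STIMULUS_S,
                           mean_interval_s=MEAN_INTERVAL_S):
    """Return delivery times (s from stimulus onset) for one presentation.

    Assumption: one Bernoulli draw per second with p = 1 / mean_interval_s.
    """
    p = 1.0 / mean_interval_s
    return [t for t in range(stimulus_s) if rng.random() < p]


def expected_outcomes_per_session(stimulus_s=STIMULUS_S,
                                  mean_interval_s=MEAN_INTERVAL_S,
                                  presentations=PRESENTATIONS):
    """Expected deliveries of one outcome across a whole session."""
    return presentations * stimulus_s / mean_interval_s


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(random_time_deliveries(rng))
    print(expected_outcomes_per_session())
```

On these parameters, each outcome would be delivered about 16 times per session on average (4 presentations × 120 s / 30 s).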

Noncontingent Pavlovian training.

This training was identical to contingent training except the conditioned stimuli and the delivery of the outcomes were uncorrelated and dispersed across the entire session. Thus, the stimulus–outcome predictive relationships were weakened in this group, as the outcomes could be obtained in the absence or presence of either stimulus. The number of O1 and O2 delivered in one noncontingent training session was identical to the number of O1 and O2 given in one contingent training session.

Instrumental training.

Following Pavlovian training, all animals received 8 d of instrumental training during which two responses (left and right lever presses) were trained with the two different food outcomes in separate daily sessions. The order of the sessions was counterbalanced, as were the response–outcome relationships that were also counterbalanced with the stimulus–outcome relationships established during Pavlovian training. Each session ended when 20 outcomes were earned or when 30 min had elapsed. For the first 2 d, lever pressing was continuously reinforced (i.e., each response was reinforced). Then, the probability of the outcome given a response was gradually shifted over days using increasing random ratio schedules: a RR5 schedule (p = 0.2) was used on days 3–5 and a RR10 (p = 0.1) was used on days 6–8. For experiments involving cannulations, rats were then given ad libitum access to food and water for 5 consecutive days before undergoing surgery. Following recovery from surgery, rats were returned to the food deprivation schedule previously used and received 2 additional days of instrumental training on a RR10 schedule.
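The progression of reinforcement schedules can be sketched as follows. This is a minimal simulation under stated assumptions: each press is treated as an independent draw with the probability given in the text, the 30 min time limit is not modeled, and the function names are hypothetical.

```python
import random


def reinforcement_probability(day):
    """Probability that a lever press earns the outcome on training day 1-8."""
    if day in (1, 2):
        return 1.0   # continuous reinforcement
    if 3 <= day <= 5:
        return 0.2   # RR5
    if 6 <= day <= 8:
        return 0.1   # RR10
    raise ValueError("training days run from 1 to 8")


def simulate_session(n_presses, p, rng, max_outcomes=20):
    """Return (outcomes earned, presses made) for one session.

    The session stops as soon as max_outcomes have been earned.
    """
    earned = 0
    for press in range(1, n_presses + 1):
        if rng.random() < p:
            earned += 1
            if earned == max_outcomes:
                return earned, press
    return earned, n_presses


if __name__ == "__main__":
    rng = random.Random(0)
    for day in range(1, 9):
        print(day, simulate_session(500, reinforcement_probability(day), rng))
```

Under a random ratio schedule the mean number of presses per outcome is 1/p, so RR5 and RR10 require 5 and 10 presses per outcome on average.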

Pavlovian-instrumental transfer.

After the final day of RR10 training, animals were given a Pavlovian-instrumental transfer test. Both levers were inserted into the box, but no outcomes were delivered. Responding was extinguished on both levers for 8 min to reduce baseline performance. Each stimulus was then presented four times over the next 40 min in the following order: clicker-noise-noise-clicker-noise-clicker-clicker-noise. For rats, the noise was replaced by a tone. Stimulus presentations lasted 2 min and were separated by a 3 min fixed interval. In pharmacological experiments, all microinjections were given 15 min before test except for the systemic administration of naltrindole that occurred 30 min before. The order of these microinjections was fully counterbalanced. That is, rats that received vehicle infusion on Test 1 were infused with drug on Test 2, whereas rats infused with drug on Test 1 received vehicle on Test 2. In some experiments (i.e., DOR/D1R blockade and MT3 blockade), full counterbalancing was not possible due to the high number of different drug treatments. However, the order of microinjections used ensured maximal counterbalancing. Finally, “Same” in noncontingent animals was defined as left lever press rate minus baseline, whereas “Different” corresponded to right lever press rate minus baseline. This pseudorandom allocation of performance was justified because noncontingent training prevented the establishment of specific relationships between the stimuli and the outcomes.
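To make the Same/Different scoring concrete, the sketch below classifies lever presses made during one stimulus according to whether the lever earned the outcome that the stimulus predicted. The stimulus order and the 2 min duration come from the text; the outcome assignments and press counts in the demo are hypothetical, and the function names are illustrative only.

```python
# Stimulus order used at test (for rats, "noise" was replaced by a tone).
TEST_ORDER = ["clicker", "noise", "noise", "clicker",
              "noise", "clicker", "clicker", "noise"]
STIMULUS_MIN = 2  # duration of each stimulus presentation, in minutes


def classify_presses(stimulus_outcome, lever_outcome, presses):
    """Split presses made during one stimulus into (same, different).

    stimulus_outcome: outcome predicted by the current stimulus.
    lever_outcome:    mapping lever -> outcome it earned in training.
    presses:          mapping lever -> press count during the stimulus.
    """
    same = sum(n for lever, n in presses.items()
               if lever_outcome[lever] == stimulus_outcome)
    different = sum(n for lever, n in presses.items()
                    if lever_outcome[lever] != stimulus_outcome)
    return same, different


def rate_per_min(count, minutes=STIMULUS_MIN):
    """Convert a press count into presses per minute."""
    return count / minutes


if __name__ == "__main__":
    # Hypothetical counterbalancing assignment and press counts.
    lever_outcome = {"left": "grain", "right": "sucrose"}
    same, different = classify_presses("grain", lever_outcome,
                                       {"left": 12, "right": 4})
    print(rate_per_min(same), rate_per_min(different))
```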

Tissue preparation to control for cannulae placements

At the end of the experiment, the rats received a lethal dose of sodium pentobarbital (300 mg/kg; Virbac Pty. Ltd.). The brains were removed, frozen, and sectioned coronally with a cryostat (Leica Microsystems Australia) at 40 μm through the core or the shell region of the nucleus accumbens. Every third section was collected on a slide, and the sections were stained with cresyl violet. The location of cannulae tips was determined under a microscope by a trained observer who was unaware of the subjects' group designations using boundaries defined in the atlas of Paxinos and Watson (2006). Subjects with inaccurate cannulae placements or with extensive damage at the infusion site were excluded from the statistical analysis.

Transcardial fixation and brain sectioning for immunofluorescence

After the test, mice were rapidly anesthetized with sodium pentobarbital (500 mg/kg, i.p. in mice) and transcardially perfused with cold 4% paraformaldehyde in 0.1 M sodium phosphate buffer, pH 7.5. Brains were postfixed in the same solution at 4°C overnight. Coronal 30-μm-thick sections (+1.3 from bregma in mice) were cut with a vibratome (Leica Microsystems VT1000) and stored at −20°C in a solution containing 30% ethylene glycol, 30% glycerol, and 0.1 M sodium phosphate buffer, until they were processed for immunofluorescence.

Immunofluorescence

Individual free-floating sections were rinsed in Tris-buffered saline with NaF (TBS-NaF; 0.25 M Tris, 0.5 M NaCl, and 0.1 mM NaF, pH 7.5), incubated for 5 min in TBS-NaF containing 3% H2O2 and 10% methanol, and then rinsed three times for 10 min in TBS-NaF. After 20 min incubation in 0.2% Triton X-100 in TBS-NaF, sections were rinsed three times in TBS-NaF again. DARPP-32 and the double-phosphorylated form of ERK1/2 (phospho-Thr202/Tyr204-ERK1/2) were simultaneously detected in drd2-eGFP mice through incubation with combined purified mouse anti-DARPP-32 (1:300, catalog #611520; BD Biosciences) and polyclonal rabbit anti-phospho-Thr202/Tyr204 ERK1/2 (1:300, catalog #9101; Cell Signaling Technology) primary antibodies diluted in TBS-NaF (4°C, overnight). All sections were then rinsed three times for 10 min in TBS-NaF and incubated for 60 min with combined donkey anti-mouse Alexa Fluor 647-coupled and donkey anti-rabbit CY3-coupled antibodies diluted 1:400 in TBS. Sections were rinsed four times for 10 min in TBS before mounting in Vectashield fluorescence medium (Vector Laboratories).

Fluorescence analysis

In drd2-eGFP mice, 635.2 μm2 confocal images were obtained in the ventromedial extension of the nucleus accumbens shell (approximate coordinates AP, +1.3; ML, ±0.8; DV, −5) using sequential laser scanning confocal microscopy (Olympus FV1000). Samples contained simultaneous eGFP, phospho-ERK1/2, and DARPP-32 staining, which were scanned sequentially at 2519 pixels/μm resolution (Ch01, Ch02, and Ch03, respectively). Before quantification, all image files were randomly renumbered using a MS Excel plug-in (Bio-excel2007 by Romain Bouju, France). Cell counts were performed in Open Source ImageJ software (MacBiophotonics upgrade v. 1.43u, Wayne Rasband, National Institutes of Health, Bethesda, MD) as follows. (1) The total number of phospho-ERK1/2-immunoreactive neurons was marked and quantified in Ch02. (2) Phospho-ERK1/2 marks were superimposed on Ch01 (eGFP), and coincident phospho-ERK1/2- and eGFP-immunoreactive neurons were requantified (D2+). (3) Phospho-ERK1/2 marks were superimposed on Ch03 (DARPP-32), and phospho-ERK1/2-immunoreactive neurons that were negative for DARPP-32 were requantified (non-MSNs). The number of phospho-ERK1/2-immunoreactive cells (p-ERK1/2 neurons) contained in D1-MSNs (D1+) was calculated as (1) minus (2) minus (3). The small bias produced by cells that were pERK1/2-positive, eGFP-positive, and DARPP-32-negative (0.42% in our study) was not considered.
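The subtraction used to estimate p-ERK1/2-positive D1-MSNs can be written as a small function. The sketch below is illustrative only: the function name is hypothetical and the counts in the demo are invented, not taken from the study.

```python
def d1_perk_count(total_perk, perk_egfp, perk_darpp32_negative):
    """p-ERK1/2 neurons attributed to D1-MSNs: (1) minus (2) minus (3).

    total_perk:            (1) all p-ERK1/2-immunoreactive neurons (Ch02).
    perk_egfp:             (2) those that are also eGFP-positive (D2+).
    perk_darpp32_negative: (3) those that are DARPP-32-negative (non-MSNs).
    """
    d1 = total_perk - perk_egfp - perk_darpp32_negative
    if d1 < 0:
        raise ValueError("component counts exceed the total")
    return d1


if __name__ == "__main__":
    # Invented counts for a single image.
    print(d1_perk_count(total_perk=50, perk_egfp=4, perk_darpp32_negative=2))
```

A cell that is both eGFP-positive and DARPP-32-negative is counted in (2) and in (3), so it is subtracted twice; this is the small bias that the text reports as not considered.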

Brain slice preparation

Male Long–Evans rats (12–15 weeks old) that had undergone a prior PIT test were deeply anesthetized using isoflurane inhalation (4% in air) and decapitated, and their brains were removed. Horizontal brain slices (300 μm thick) containing the NAc-S were cut using a vibratome (Leica Microsystems VT1200S) in ice-cold oxygenated sucrose cutting solution containing the following (in mM): 241 sucrose, 28 NaHCO3, 11 glucose, 1.4 NaH2PO4, 3.3 KCl, 0.2 CaCl2, 7 MgCl2. Slices were hemisected at midline and maintained at 33°C in a submerged chamber containing physiological saline with the following composition (in mM): 126 NaCl, 2.5 KCl, 1.4 NaH2PO4, 1.2 MgCl2, 2.4 CaCl2, 11 glucose, and 25 NaHCO3, and equilibrated with 95% O2 and 5% CO2.

Electrophysiological recording and post hoc histological analysis

After equilibrating for 1 h, slices were transferred to a recording chamber and visualized under an upright microscope (Olympus BX50WI) using differential interference contrast (DIC) Dodt tube optics, and superfused continuously (1.5 ml min−1) with oxygenated physiological saline at 33°C. Cell-attached and whole-cell patch-clamp recordings were made using electrodes (2–5 MΩ) containing internal solution consisting of the following (in mM): 115 K gluconate, 20 NaCl, 1 MgCl2, 10 HEPES, 11 EGTA, 5 Mg-ATP, and 0.33 Na-GTP, pH 7.3, osmolarity 285–290 mOsm l−1. Biocytin (0.1%; Sigma-Aldrich) was routinely added to the internal solution for marking the sampled neurons during whole-cell recording. Data acquisition was performed with a Multiclamp 700B amplifier (Molecular Devices), connected to a Macintosh computer and interface ITC-18 (Instrutech). In cell-attached mode, action potentials were sampled at 5 kHz (low pass filter 2 kHz) and whole-cell currents were also sampled at 5 kHz (low pass filter 2 kHz; Axograph X, Molecular Devices). Whole-cell recordings were established immediately following data collection in cell-attached mode. To ensure that only highly viable neurons were included, data from cell-attached and whole-cell recordings were only included in analyses if (1) the neurons appeared healthy under DIC on the monitor screen, (2) cholinergic interneurons were spontaneously active during cell-attached recording, (3) action potential amplitudes were at least 60 mV above threshold after establishing whole-cell recording mode, and (4) neurons demonstrated physiological characteristics of cholinergic interneurons such as the presence of hyperpolarization-activated cation current Ih but no plateau low-threshold spiking (Kawaguchi et al., 1995).
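The four inclusion criteria amount to a simple filter over recordings. The sketch below expresses them as a predicate; the `Recording` fields and function name are hypothetical and only mirror the criteria as worded above, not any acquisition software.

```python
from dataclasses import dataclass

MIN_AP_AMPLITUDE_MV = 60.0  # criterion (3): amplitude above threshold


@dataclass
class Recording:
    healthy_under_dic: bool      # criterion (1)
    spontaneously_active: bool   # criterion (2), cell-attached mode
    ap_amplitude_mv: float       # criterion (3), whole-cell mode
    has_ih: bool                 # criterion (4): Ih present
    has_plateau_lts: bool        # criterion (4): must be absent


def include_recording(rec):
    """Return True only if the recording meets all four criteria."""
    return (rec.healthy_under_dic
            and rec.spontaneously_active
            and rec.ap_amplitude_mv >= MIN_AP_AMPLITUDE_MV
            and rec.has_ih
            and not rec.has_plateau_lts)


if __name__ == "__main__":
    # Invented example values.
    print(include_recording(Recording(True, True, 72.0, True, False)))
```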

Immediately after physiological recording, brain slices containing biocytin-filled neurons were fixed overnight in 4% paraformaldehyde/0.16 M phosphate buffer (PB) solution and then placed in 0.3% Triton X-100/PB for 3 d to permeabilize cells. Slices were then placed in 10% horse serum/PB for 1 h before being incubated in primary goat anti-choline acetyltransferase (anti-ChAT, 1:500; Millipore) for 2 d at 4°C to aid identification of CINs. The slices were rinsed in PB and then placed in a one-step incubation containing both Alexa Fluor 488-conjugated donkey anti-goat secondary antibody (1:500; Life Technologies) and Alexa Fluor 647-conjugated Streptavidin (1:1000; Life Technologies) for 2 h. Stained slices were rinsed, mounted onto glass slides, dried, and coverslipped with Vectashield mounting medium (Vector Laboratories). Images were obtained using sequential laser scanning confocal microscopy (Fluoview FV1000, BX61WI microscope, Olympus).

Statistical methods

Statistical analyses were conducted using within-subjects or mixed-model ANOVA depending on the experimental design (unless stated otherwise). For all analyses, significance was assessed against a type I error rate of 0.05. ANOVAs were followed by simple main effects analyses to establish the source of any significant interactions.

Results

Specific PIT increases ERK1/2 phosphorylation in D1R-expressing MSNs in the NAc-S

Based on the functional dichotomy of D1 and D2 receptors in striatal tissue (Gerfen and Surmeier, 2011), we first sought to examine which neuronal population was primarily engaged in the NAc-S during the expression of PIT. For this, we took advantage of drd2-eGFP mice, which endogenously label D2R-expressing neurons and have been found to be extremely useful for functional histology studies (Valjent et al., 2009), as accurate recognition of D2R-expressing neurons is possible without signal amplification due to the intensity of the fluorescence signal (Matamales et al., 2009). We combined drd2-eGFP fluorescence with DARPP-32 immunostaining, a reliable marker of both D1R- and D2R-expressing MSNs, to accurately distinguish between D2R-MSNs (i.e., those expressing DARPP-32 and eGFP fluorescence) and D1R-MSNs (i.e., those expressing DARPP-32 but not eGFP fluorescence; Matamales et al., 2009). In the present experiment, food-deprived drd2-eGFP mice were subjected to a specific PIT protocol involving three stages (Fig. 1A). In the Pavlovian stage, all mice were trained to associate two auditory stimuli (S1 and S2) with the delivery of two distinct food outcomes (O1 and O2). In the instrumental stage, mice learned that one action (A1; i.e., pressing one lever) delivered one of the food outcomes (O1), whereas another action (A2; i.e., pressing another lever) delivered the other outcome (O2). Mice were then separated into two groups. One group (Group PIT; n = 7) received a PIT test during which performance on the levers was assessed both in the absence and the presence of S1 and S2. The mice were rapidly anesthetized and transcardially perfused immediately after this stage. The other group of mice (Group No PIT; n = 6) received the same procedure except they did not receive the PIT test.

Figure 1.

Specific PIT exposure triggers strong ERK1/2 signaling responses selectively in D1R-MSNs of the NAc-S. A, All mice received Pavlovian and instrumental training. A single PIT test was given to one group of mice (Group PIT, black) whereas it was omitted for the other group (Group No PIT, red). B–D, Pavlovian training (B) produced a gradual increase in magazine entries during presentation of the conditioned stimulus. The lever press rate gradually increased across instrumental training (C). Group PIT exhibited a higher lever press rate when the stimulus predicted the same outcome as the response (Same) than when the stimulus predicted a different outcome (Different) or when there was no stimulus (Baseline; D). Immediately after the test, all mice were perfused and brain samples processed for optimal phosphorylation signal (see Materials and Methods). E, F, Triple-staining immunofluorescence (F) was conducted to label eGFP (identifying D2R-MSNs), phospho-ERK1/2 (p-ERK1/2; identifying activated neurons) and DARPP-32 (D-32; identifying all MSNs) in the NAc-S of trained drd2-eGFP mice exposed (Group PIT) or not exposed (Group No PIT) to the PIT test. Almost all p-ERK1/2-activated neurons were identified as MSNs (D-32 immunoreactive, data not shown). For quantification (E), neurons were classified as eGFP or Non-eGFP, depending on the presence or absence of D2R-eGFP. PIT exposure induced large numbers of phospho-ERK1/2-immunoreactive neurons in the NAc-S, which were almost entirely excluded from D2R-eGFP fluorescence. Error bars denote ±1 SEM.

Pavlovian training (Fig. 1B) proceeded smoothly: levels of magazine entry were higher in the presence of the stimuli (i.e., S period) than in their absence (i.e., Pre-S period; F(1,84) = 42.263; p < 0.001), and these levels increased across training (F(7,207) = 17.0; p < 0.001). There was no difference between the two groups of mice, and the gradual increase in responding did not depend on group allocation (F values <2.3). During instrumental training (Fig. 1C), all mice acquired lever press responding, which increased as the ratio requirements to earn the outcomes increased across days (F(7,77) = 5.3; p < 0.001). Again, mice in Groups PIT and No PIT did not differ (F values <1.2). Figure 1D shows the performance of Group PIT as the mean number of lever presses per minute when the stimuli predicted the same outcome as the action (Same), when the stimuli predicted a different outcome from the action (Different), or when there was no stimulus (Baseline). There was clear evidence of outcome-specific PIT, as a stimulus trained to predict a particular outcome elevated responding on the action earning that same outcome (Same vs Different: F(1,13) = 18.0; p < 0.01; Same vs Baseline: F(1,13) = 14.4; p < 0.01).

Next, to study which neuronal population was principally engaged during PIT, we used immunodetection of the doubly phosphorylated form of the MAPK-ERK1/2, a method extensively used to visualize activated neurons in the striatum (Valjent et al., 2000; Bertran-Gonzalez et al., 2008). We used confocal microscopy on drd2-eGFP mice combined with DARPP-32 immunostaining to assess which MSN population in the NAc-S expressed ERK1/2 phosphorylation following the test. The results, which are presented in Figure 1, E and F, revealed distinct patterns of activation depending on both the subpopulation concerned and the behavioral protocol administered (F(1,25) = 6.9; p < 0.05). There was a clear increase in the number of neurons expressing phospho-ERK1/2 immunoreactivity when trained animals were exposed to PIT compared with those trained but not given the specific PIT test (F(1,11) = 5.7; p < 0.05). Strikingly, the vast majority of phospho-ERK1/2-immunoreactive neurons in Group PIT, although expressing DARPP-32, did not express eGFP (F(1,11) = 7.2; p < 0.05; Fig. 1F). Animals in Group No PIT showed low ERK1/2 activation levels in all DARPP-32 neurons whether they expressed eGFP or not (F < 1). In our quantification (Fig. 1E), we classified p-ERK1/2-immunoreactive neurons as eGFP (i.e., D2R-expressing MSNs) and non-eGFP (i.e., D2R-negative MSNs), according to their fluorescence profile. Our results clearly showed that exposure to the PIT test was associated with an increase of p-ERK1/2 immunoreactivity in non-eGFP, DARPP-32-positive neurons, i.e., putative D1R MSNs in the NAc-S. These results suggest that specific PIT depends on activity in D1R-expressing projection neurons of the NAc-S.

D1R activity in the NAc-S is critical for specific PIT

Next, we used a pharmacological approach in rats to confirm the role played by D1R- and D2R-expressing MSNs in the NAc-S during specific PIT. Food-deprived rats were subjected to a specific PIT protocol similar to the one described above except for two differences (Fig. 2). The first was that rats were bilaterally implanted with cannulae into the NAc-S (Fig. 2E, left) during the course of instrumental training. The second was that there were two rounds of PIT tests (Fig. 2A). One round of tests (PIT 1) evaluated the effects of relatively high doses of D1R and D2R antagonists (High dose), whereas a second round (PIT 2; after retraining) assessed the effects of lower doses of the same drugs (Low dose).

Figure 2.

Effect of D1R and D2R blockade in the NAc shell and core on expression of specific Pavlovian-instrumental transfer. A, All rats received Pavlovian and instrumental training followed by two consecutive PIT tests. One group performed these two tests under NAc-S infusion of either VEH or a high dose of SCH, while another group was infused with either vehicle or a high dose of RAC. Following instrumental retraining, rats were again submitted to two consecutive PIT tests that were identical to the previous ones except that low doses of SCH and RAC were infused into the NAc-S. B, Rats infused with vehicle exhibited specific PIT. Although both doses of SCH impaired specific PIT, only the low dose did so without affecting baseline performance. C, Rats displayed specific PIT whether they were infused in the NAc-S with vehicle or the low and high dose of RAC. D, The performance during the tests revealed that animals exhibited specific PIT whether they had been infused into the NAc-C with VEH, SCH, or RAC. E, Placements of the injection cannula tips in the NAc-S (left) and NAc-C (right) for rats infused with SCH (blue) and RAC (red). Distances on the atlas templates are indicated in millimeters relative to bregma.

All rats learned the predictive relationships across Pavlovian training (data not shown) as indicated by higher levels of magazine entries in the presence of the stimuli than in their absence (F(1,98) = 91.3; p < 0.001). Further, these levels gradually increased across days (F(7,239) = 7.6; p < 0.001). During instrumental training (data not shown), all animals acquired lever press responding, which increased as the ratio schedules increased across days (F(9,149) = 96.5; p < 0.001). On each testing round (PIT 1 and PIT 2), two tests were given on 2 consecutive days (Fig. 2B,C). On PIT 1, two groups of rats (n = 8) performed the transfer tests after NAc-S infusion of either VEH or the D1R antagonist SCH (2.5 μg/μl). The D1R antagonist had a strong effect on outcome-specific transfer (Fig. 2B). ANOVA conducted using factors of Drug and of Period (separating Same, Different, and Baseline) showed a main effect of Drug (F(1,14) = 18.1; p < 0.001), Period (F(2,14) = 16.4; p < 0.001), and a Drug × Period interaction (F(2,47) = 10.7; p < 0.001). Simple-effects analyses on the significant interaction revealed significant PIT in vehicle-treated rats (F(2,14) = 17.5; p < 0.001) but not in rats infused with SCH (F < 3.5). However, the impairment was not specific to the presence of the stimuli as SCH-treated animals displayed lower instrumental performance during Baseline than did control animals (F(1,14) = 15.4; p < 0.01). This implies that SCH-23390 induced a general decrease in instrumental responding rather than a specific removal of outcome-specific PIT. As a consequence, the same animals were given additional instrumental training before being retested under a lower dose of the D1R antagonist (PIT 2; 1 μg/μl). The data presented in the right panel of Figure 2B clearly show that SCH-23390 prevented outcome-specific PIT. ANOVA revealed no main effect of Drug (F < 1), but a main effect of Period (F(2,14) = 17.7; p < 0.001) and a Drug × Period interaction (F(2,47) = 5.6; p < 0.05). Rats displayed a selective increase in instrumental responding during stimulus Same when treated with vehicle (F(1,15) = 18.8; p < 0.001) but not when given SCH-23390 (F < 0.3). Importantly, these effects only emerged during the stimuli; there was no difference between groups during test in the Baseline period (F < 1).

In another set of rats run in parallel, we also examined in two rounds of testing the effects of NAc-S infusions of high and low doses of the D2R antagonist raclopride (2.5 μg/μl and 1 μg/μl) on outcome-specific PIT. The left panel of Figure 2C shows that, at a high dose, raclopride had very little if any impact on outcome-specific PIT. The statistical analysis revealed a main effect of Period (F(2,12) = 15.6; p < 0.001) but no main effect of Drug and no Drug × Period interaction (F values <1.5). Not surprisingly, the same outcome was observed when a lower dose of raclopride was used in PIT 2 (Fig. 2C, right). Again, ANOVA showed a main effect of Period (F(2,12) = 11.9; p < 0.001) but no main effect of drug and no drug × lever interaction (F values <3.5). Thus, raclopride failed to impair specific PIT regardless of the dose infused into the NAc-S.

These effects of SCH-23390 were specific to infusions into the NAc-S. Two groups of rats received the same protocol except they were bilaterally cannulated in the nucleus accumbens core (NAc-C; Fig. 2E, right). One group (n = 9) received the PIT tests under infusion of either VEH or SCH-23390 (1 μg/μl), while the other group (n = 7) was infused with either VEH or raclopride (1 μg/μl). Pavlovian and instrumental training occurred without incident (data not shown). Rats discriminated between the S and pre-S periods (F(1,105) = 146.3; p < 0.001), and this discrimination grew larger over the course of training (F(7,255) = 17.3; p < 0.001), as did the instrumental performance (F(9,159) = 129.9; p < 0.001). The data of rats tested under SCH or vehicle are presented in the left panel of Figure 2D. ANOVA revealed a main effect of Period (F(2,16) = 45.1; p < 0.001), but no effect of Drug or a Drug × Period interaction (F values <3.5). Thus, the Same stimuli elevated responding on the action when they predicted the same outcome relative both to baseline (F(1,35) = 35.4; p < 0.001) and to the Different stimulus (F(1,35) = 24.7; p < 0.001). A similar effect was observed in the test of raclopride (Fig. 2D, right), as the analysis revealed a main effect of Period (F(2,12) = 9.3; p < 0.001), but no effect of Drug or a Drug × Period interaction (F values <2.2). Thus, neither D1R nor D2R activation in the NAc-C is essential for specific PIT.

D1R- and DOR-related processes interact in the NAc-S to promote specific PIT

Our current and past results point to the involvement of a dopaminergic process, through D1Rs expressed in postsynaptic projection neurons, and an opioidergic process, through DORs expressed on cholinergic interneurons, in the NAc-S as key factors promoting the influence of stimuli on choice between actions (Laurent et al., 2012). In the present experiment, we sought to assess whether these processes interact in driving the PIT effect or whether they provide independent sources of control over that effect. To achieve this, we gave rats asymmetrical infusions of the D1R and DOR antagonists SCH-23390 and naltrindole into the NAc-S. Rats were cannulated in the NAc-S (Fig. 3C) and then subjected to the specific PIT protocol described previously. All rats received four PIT tests that were conducted under the influence of distinct pharmacological treatments involving symmetrical or asymmetrical bilateral infusions in the NAc-S (Fig. 3A). Pavlovian conditioning was successful as rats exhibited more magazine entries during the stimulus period than during the prestimulus period (F(1,304) = 421.7; p < 0.001; data not shown). Further, the levels of magazine entries during the stimulus period increased as training progressed (F(7,319) = 11.4; p < 0.001; data not shown). All rats acquired lever press responding that increased over the course of instrumental training (F(1,190) = 32.5; p < 0.001; data not shown).

Figure 3.


D1Rs and DORs cooperate in the NAc-S to mediate the expression of choice. A, Schematic showing the five distinct pharmacological treatments involving bilateral infusions in the NAc-S. VEH, gray; NAL, green; SCH, blue. B, The lever press rate minus baseline revealed that rats bilaterally infused with vehicle (VEH-VEH) or unilaterally infused with either SCH (VEH-SCH) or NAL (NAL-VEH) displayed specific PIT. In contrast, specific PIT was removed by bilateral infusion of NAL (NAL-NAL) or asymmetrical infusion of NAL and SCH (NAL-SCH; gray dashed line, baseline). C, Placements of the injection cannula tips in the NAc-S. Distances on the atlas templates are indicated in millimeters relative to bregma.

After Pavlovian and instrumental training, we exposed these rats to the different pharmacological treatments (Fig. 3A) immediately before the outcome-specific transfer tests. The data from these tests are presented in Figure 3B and are shown as the mean lever presses per minute (stimulus minus baseline) when the stimulus predicted the same outcome as the action (Same) and when it predicted a different outcome (Different). Baseline levels of responding during the PIT test in the absence of the stimuli were not influenced by the various drug treatments and did not differ (F < 1). ANOVA found a main effect of Drug (F(4,150) = 3.8; p < 0.001), Period (F(1,150) = 42.7; p < 0.001), and a Drug × Period interaction (F(4,159) = 4.8; p < 0.001). Bilateral infusion of vehicle (Group VEH-VEH; F(1,44) = 21.9; p < 0.001), as well as unilateral infusions of the D1R antagonist SCH-23390 (Group VEH-SCH; F(1,26) = 13.2; p < 0.001) or the DOR antagonist naltrindole (Group NAL-VEH; F(1,30) = 22.7; p < 0.001) spared outcome-specific transfer. In contrast, this transfer was prevented by bilateral infusion of naltrindole (Group NAL-NAL), or unilateral naltrindole infusion combined with contralateral infusion of SCH-23390 (Group NAL-SCH) (both F values <1). The specific impairment produced by the combination of unilateral NAL and SCH infusions suggests that D1Rs and DORs, both essential to generate PIT, influence a common cellular circuit in the NAc-S responsible for the expression of outcome-specific transfer and, therefore, we next sought to establish the nature of the circuit mediating this interaction.
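The logic of this asymmetrical-infusion design can be made explicit with a minimal sketch. It assumes, as a simplification that is ours rather than a claim of the analysis above, that PIT is expressed whenever at least one hemisphere retains both DOR and D1R function. The names `Hemisphere` and `pit_expected` are hypothetical and serve only to illustrate the reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hemisphere:
    """Drug state of the NAc-S on one side of the brain (illustrative only)."""
    dor_blocked: bool = False  # naltrindole (NAL) infused on this side
    d1r_blocked: bool = False  # SCH-23390 (SCH) infused on this side

    def intact(self) -> bool:
        # Assumption: the DOR and D1R processes must both be available
        # within the same hemisphere for that hemisphere to support PIT.
        return not (self.dor_blocked or self.d1r_blocked)


def pit_expected(left: Hemisphere, right: Hemisphere) -> bool:
    """PIT is predicted if at least one hemisphere is fully intact."""
    return left.intact() or right.intact()


VEH = Hemisphere()
NAL = Hemisphere(dor_blocked=True)
SCH = Hemisphere(d1r_blocked=True)

# The five treatments described in the text.
TREATMENTS = {
    "VEH-VEH": (VEH, VEH),
    "VEH-SCH": (VEH, SCH),
    "NAL-VEH": (NAL, VEH),
    "NAL-NAL": (NAL, NAL),
    "NAL-SCH": (NAL, SCH),
}

predictions = {name: pit_expected(*pair) for name, pair in TREATMENTS.items()}
for name, expected in predictions.items():
    print(f"{name}: PIT {'expected' if expected else 'abolished'}")
```

Under this assumption, the sketch reproduces the qualitative pattern reported here: only NAL-NAL and NAL-SCH leave no hemisphere with both processes available. If the two processes acted independently, the NAL-SCH treatment would be expected to spare PIT, as each unilateral infusion did on its own.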

Specific PIT is associated with changes in the firing patterns of NAc-S CINs

Our previous studies in transgenic mice and in rats provided strong evidence for the involvement of DORs expressed in the NAc-S in specific PIT (Laurent et al., 2012), in particular those expressed on cholinergic interneurons (Bertran-Gonzalez et al., 2013). We found in the latter study that contingent Pavlovian training resulted in an increase of an irregular/burst firing pattern in CINs immediately after that training, which was accentuated by administration of the DOR agonist deltorphin. From these data we hypothesized that the training increased CIN regulation by DORs, an effect that rendered these neurons more sensitive to deltorphin treatment. In the present experiment, we aimed to (1) confirm the specific effect of Pavlovian contingency on firing responses of NAc-S CINs in rats, and (2) assess whether those firing changes persisted over further stages of training to be present at the PIT test.

Two groups of rats, which differed in the manner in which they were trained during the Pavlovian stage, were submitted to a PIT test. A contingent group (Group Contingent; n = 6) was trained as before, whereas a noncontingent group (Group Non Cont; n = 7) was exposed to uncorrelated presentations of the stimuli and outcomes (Fig. 4A). All rats were then given instrumental training followed by a single PIT test. The levels of magazine entries across Pavlovian training are presented in Figure 4B. The statistical analysis found that performance in the presence or absence of the stimuli depended on the day of training and the training protocol received (F(7,140) = 18.5; p < 0.001). Although noncontingent rats did distinguish between the pre-S and S periods (F(1,42) = 39.0; p < 0.001; F(1,111) = 5.8; p < 0.001), the stimulus–outcome relationships established were substantially weaker; noncontingent rats displayed lower levels of magazine entries in the presence of the stimuli than rats given contingent training (F(1,77) = 8.9; p < 0.05), but similar levels of magazine entry in the absence of the stimuli (F < 3.8). Importantly, contingent rats readily discriminated the two periods (F(1,35) = 77.4; p < 0.001), and this discrimination grew larger over trials (F(7,95) = 16.3; p < 0.001). Following Pavlovian training, all rats acquired the lever press responses that increased as training progressed (F(7,103) = 88.8; p < 0.001). There was no difference between the two groups of rats (F < 0.1; Fig. 4C). We next assessed the effect of noncontingent training on outcome-specific transfer (Fig. 4D). Performance is plotted as in the previous experiment, since there was no difference in baseline responding between the two groups (F < 0.1). ANOVA revealed no effect of Group (Cont vs Non Cont; F < 0.5) but a main effect of Period (F(1,25) = 7.3; p < 0.05) and a Group × Period interaction (F(1,25) = 7.5; p < 0.05). Thus, contingent rats elevated responding on the action delivering the outcome predicted by the stimulus that was presented (F(1,25) = 6.7; p < 0.05), whereas noncontingent rats distributed their responding on the two available actions (F < 0.1).

Figure 4.


Contingent Pavlovian training promotes persistent changes in NAc-S CINs that are present during PIT. A, Two groups of rats were initially submitted to Pavlovian training. A Noncontingent group (Non cont; red) received the same number of conditioned stimuli and outcomes (O) as the Contingent group (black), although presentation of stimuli did not predict the delivery of the outcomes (see Materials and Methods). B–D, Noncontingent training (B) prevented the gradual increase of conditioned responses, otherwise observed in the contingent group. Acquisition of instrumental performance (C) was equivalent in contingent and noncontingent groups. Levels of lever pressing minus baseline at test (D) revealed that contingent rats exhibited outcome-specific PIT, whereas noncontingent rats did not. E, A representative cholinergic interneuron of the NAc-S labeled with biocytin during electrophysiological recording. Post hoc immunofluorescence confirmed the ChAT phenotype of the recorded neuron (left insets). A voltage–current relationship for the labeled neuron in E is also shown (E1 inset). F–H, In the presence of synaptic blockers, the effect of the DOR agonist deltorphin (300 nM) on spontaneous action potentials in NAc-S CINs from slices of contingently and noncontingently trained rats submitted to PIT. In contingently trained rats, application of deltorphin increased the basal irregular firing pattern of NAc-S CINs (F, neuron in E) and enhanced the variance of action potential frequency (G), whereas overall action potential frequency did not differ significantly from Noncontingent rats (H). Error bars denote ±SEM.

Immediately after the test, the brains from rats in Groups Cont and Non Cont were processed for slice electrophysiology. In NAc-S-containing in vitro slices, using a cell-attached configuration of patch-clamp electrophysiology with least perturbation of intracellular content, we compared the action potential firing patterns and spike frequency in identified CINs (Fig. 4E). Compared with noncontingent controls in the presence of synaptic blockers, we found that the DOR agonist deltorphin (300 nM) increased the irregular/burst firing pattern of CINs when bath-applied to the NAc-S preparations of contingently trained rats (Fig. 4F,G; Mann–Whitney U test, p < 0.05). However, overall action potential frequency under deltorphin application did not differ between the two groups (Fig. 4H; Mann–Whitney U test, p = 0.9). These results extend our previous finding in mice that contingent Pavlovian training specifically influenced CIN firing (Bertran-Gonzalez et al., 2013), and confirm that the increase in DOR agonist sensitivity acquired during the initial Pavlovian phase persists throughout instrumental training and the subsequent PIT test. These data strongly suggest that Pavlovian conditioning produces cellular changes in NAc-S CINs that may ultimately contribute to the selective modulation of D1R-expressing projection neurons that takes place at the moment of PIT.
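For readers unfamiliar with the Mann–Whitney U test used for these group comparisons, the following minimal sketch shows how the U statistic is obtained from two independent samples. The numbers are invented for illustration and are not the recorded data; the function name `mann_whitney_u` is hypothetical, and no p value is computed here (in practice it would come from the exact U distribution or from a statistics package).

```python
def mann_whitney_u(x, y):
    """Return the U statistic of sample x relative to sample y.

    U counts the (x_i, y_j) pairs in which x_i exceeds y_j;
    tied pairs contribute one half.
    """
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u


# Hypothetical per-neuron values of firing-frequency variance under
# deltorphin (arbitrary units); NOT data from the experiment.
contingent = [4.1, 3.6, 5.2, 2.9, 4.8]
noncontingent = [1.2, 2.0, 0.9, 1.7, 2.4]

u_cont = mann_whitney_u(contingent, noncontingent)
u_non = mann_whitney_u(noncontingent, contingent)

# The two U values always sum to the number of pairs, n1 * n2.
print(u_cont, u_non, len(contingent) * len(noncontingent))
```

Because the test relies only on the ordering of values across the two samples, it makes no assumption about their distribution, which suits small samples of single-neuron recordings.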

The DOR–D1R interaction is mediated by acetylcholine M4 receptors

Our results so far are consistent with our original findings showing that (1) blockade of DORs in the NAc-S prevents the PIT effect, and (2) Pavlovian predictive learning triggers DOR translocation to the membrane of CINs, an accumulation that is likely to be necessary to subsequently guide choice between actions (Laurent et al., 2012; Bertran-Gonzalez et al., 2013). Indeed, our previous findings showed that the higher the levels of DORs in the membrane of NAc-S CINs, the larger the PIT effect is. In this current series, we have highlighted the involvement of D1Rs in postsynaptic projection neurons in this process, which appear to be tightly coupled to presynaptic cholinergic events in the NAc-S to generate PIT. To find the molecular link between postsynaptic D1R and presynaptic DOR processes in the NAc-S, we explored the involvement of M4Rs, which are enriched in ventral striatal areas and are expressed selectively on D1R-expressing MSNs (Tayebati et al., 2004; Lobo et al., 2006; Guo et al., 2010; Jeon et al., 2010).

In the next experiment, we tested the effects of M4R blockade during a PIT test conducted in control conditions or in the presence of the DOR antagonist naltrindole. To this end, we designed a pharmacological experiment in rats that combined systemic naltrindole, to induce a general blockade of DORs, with NAc-S infusion of MT3, a highly selective M4R antagonist (Jolkkonen et al., 1995; Wang et al., 1997; Guo et al., 2010). Rats were bilaterally implanted with cannulae in the NAc-S (Fig. 5C) and subjected to the three-stage protocol described previously. They underwent two transfer tests: one after infusion of vehicle and one after infusion of MT3, both into the NAc-S, combined with systemic injection of either vehicle or naltrindole (Fig. 5A). This made four drug treatments: groups given systemic injection of vehicle plus bilateral infusion of vehicle (Group VEH-VEH; n = 14) or MT3 (Group MT3-VEH; n = 14) in the NAc-S, and groups given systemic injection of naltrindole plus bilateral infusion of vehicle (Group VEH-NAL; n = 13) or MT3 (Group MT3-NAL; n = 12).

Figure 5.


Selective blockade of M4Rs in the NAc-S rescued the deficit in PIT produced by DOR blockade. A, Schematic showing the four distinct drug treatments administered. NAL (green) and VEH (gray) were given systemically, whereas bilateral NAc-S infusion involved either the muscarinic toxin 3 (MT3; red) or VEH (gray). B, Rats treated systemically with naltrindole (VEH/NAL) failed to display specific PIT. However, PIT could be rescued by bilaterally infusing rats with MT3 into the NAc-S (MT3/NAL). Rats only infused with MT3 (MT3/VEH) exhibited specific PIT similar to control animals (VEH/VEH; gray dashed line, baseline). C, Placements of the injection cannula tips in the NAc-S. Distances on the atlas templates are indicated in millimeters relative to bregma.

Rats successfully discriminated between the stimulus and prestimulus periods across Pavlovian training (F(1,182) = 211.2; p < 0.001), and this discrimination grew larger across days (F(7,431) = 29.2; p < 0.001; data not shown). Rats subsequently acquired lever press responding that increased over the course of instrumental training (F(9,269) = 93.2; p < 0.001; data not shown). Performance during the PIT tests is shown in Figure 5B as the mean number of lever presses per minute (stimulus minus Baseline) during a stimulus predicting the same outcome as a particular response (Same) and during a stimulus predicting a different outcome from a particular response (Different). The statistical analysis found no significant effect of Drug (F < 1), but it revealed a main effect of Period (F(1,98) = 31.6; p < 0.001) and a Period × Group interaction (F(3,105) = 3.1; p < 0.05). Indeed, M4R blockade exerted a different influence over outcome-specific transfer when that transfer occurred in the presence or absence of naltrindole. Animals that only received local infusion of MT3 (MT3-VEH; F(1,26) = 18.8; p < 0.001) were as able to display specific PIT as the control animals (VEH-VEH; F(1,26) = 16; p < 0.001). As previously shown (Bertran-Gonzalez et al., 2013), the PIT response was prevented by systemic naltrindole treatment (VEH-NAL; F(1,24) = 0.09; p = 0.77). Importantly, local infusion of the M4R antagonist MT3 blocked the effect of systemic naltrindole on PIT, and the PIT response was rescued in MT3 (NAc-S) + naltrindole (systemic)-treated rats (MT3-NAL; F(1,22) = 6.7; p < 0.001). These effects emerged during the stimuli; no difference was found among the groups when the stimuli were absent (F(1,49) = 1.9; p = 0.37; data not shown). Overall, our results have uncovered a mechanism through which cholinergic modulation of D1R-expressing projection neurons in the NAc-S is regulated by DORs through the latter's expression on CINs to mediate the appropriate choice of actions during specific Pavlovian-instrumental transfer.
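The baseline-subtracted measure used to plot PIT performance (lever presses per minute during the stimulus minus the rate during Baseline) can be illustrated with a minimal sketch. The press counts and period durations below are hypothetical, and the function names are ours; the sketch only shows the arithmetic, not the scoring procedure actually used.

```python
def presses_per_minute(count: int, seconds: float) -> float:
    """Convert a raw lever-press count into a rate per minute."""
    return count * 60.0 / seconds


def net_rate(stim_count: int, stim_seconds: float,
             base_count: int, base_seconds: float) -> float:
    """Stimulus-period press rate minus the baseline press rate."""
    return (presses_per_minute(stim_count, stim_seconds)
            - presses_per_minute(base_count, base_seconds))


# Hypothetical totals for one rat (NOT data from the experiment):
# presses on the lever earning the Same outcome as the stimulus,
# presses on the lever earning a Different outcome, and presses
# per lever during an equally long stimulus-free Baseline period.
same_net = net_rate(stim_count=30, stim_seconds=120, base_count=12, base_seconds=120)
diff_net = net_rate(stim_count=14, stim_seconds=120, base_count=12, base_seconds=120)

print(f"Same: {same_net:+.1f} presses/min above baseline")
print(f"Different: {diff_net:+.1f} presses/min above baseline")
```

On this measure, outcome-specific PIT appears as a Same score well above zero alongside a Different score near zero, which is why the dashed baseline line in the figures sits at zero.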

Discussion

The current results provide evidence that D1R-expressing projection neurons in the NAc-S play a key role in the effect of predictive learning on choice between goal-directed actions. Mice given the opportunity to express specific PIT exhibited high levels of phosphorylated ERK1/2 almost exclusively in putative D1R-containing MSNs of the NAc-S, demonstrating strong signaling activity specifically in this subset of NAc-S projection neurons. Confirming this result, D1R, but not D2R, blockade in the NAc-S was found to prevent specific PIT in rats. Moreover, consistent with our previous finding on the role of DORs during PIT (Laurent et al., 2012) and their plastic adaptations on CINs during Pavlovian learning (Bertran-Gonzalez et al., 2013), our current experiments revealed that PIT was associated with increased sensitivity of CINs to the DOR agonist deltorphin, confirming the involvement of DORs on these neurons during choice.

These results clearly imply cooperative activity between D1Rs and DORs in the NAc-S during PIT and, supporting this, we found that unilateral NAc-S antagonism of D1Rs combined with contralateral NAc-S antagonism of DORs prevented specific PIT. The role of DORs in this process was further confirmed, as both systemic and local NAc-S blockade of these receptors prevented PIT expression. In pursuit of the cooperative mechanism between DOR expression on CINs and dopamine, we considered the role of the M4Rs in PIT. These receptors are abundant in the striatum, particularly in the ventral areas (Tayebati et al., 2004). Moreover, M4Rs are coexpressed with D1Rs selectively in direct pathway neurons (Lobo et al., 2006), where they exert opposing influence on cAMP signaling through Gαi coupling (Guo et al., 2010; Jeon et al., 2010). We hypothesized that the increased sensitivity of DORs on CINs and the consequent changes in CIN firing pattern after Pavlovian conditioning (Bertran-Gonzalez et al., 2013) produced both a localized reduction in acetylcholine release and, by reducing inhibition by M4Rs, a decrease in the cholinergic “clamp” on corticostriatal inputs (Ding et al., 2010) to generate strong D1R-mediated cellular activity during PIT. Importantly, in support of this hypothesis, we found that selective blockade of M4Rs in the NAc-S reduced the inhibitory effect of naltrindole on PIT and rescued the rats' capacity to express appropriate choices during the PIT test.

Although dopamine constitutes a major source of modulation in the striatum (Gerfen and Surmeier, 2011), other modulators regulate striatal function in close interplay with dopamine. For example, cholinergic interneurons (Kawaguchi et al., 1995), which provide the main source of striatal acetylcholine (Bolam, 1984), ramify very widely to generate among the highest cholinergic activities in the brain (Sorimachi and Kataoka, 1975). Another important source of modulation in the striatum comes from opioidergic systems (Kieffer and Evans, 2009). Prepro-enkephalin, for example, is a neuropeptide generated exclusively in striatopallidal projection neurons that, once released as enkephalin to the extracellular space, acts on DORs expressed in different cellular compartments (Steiner and Gerfen, 1998; Le Merrer et al., 2009). We and others have reported expression of DORs in the somatic and proximal dendritic membranes of striatal CINs (Le Moine et al., 1994; Scherrer et al., 2006; Bertran-Gonzalez et al., 2013). Although functional interactions between some of these neurotransmitters have been reported in the past (Goldberg et al., 2012; Threlfell et al., 2012), how these systems cooperate with one another to modulate overall striatal function and produce relevant behaviors is complex and has remained elusive. Here, we show that these three neuromodulatory processes combine to tightly regulate output neurons in the NAc-S during PIT. In this context, we recently described the accumulation of DORs in the NAc-S in the somatic membrane of CINs as a consequence of Pavlovian conditioning (Bertran-Gonzalez et al., 2013). Given that neither DORs nor the entire NAc-S region is necessary for Pavlovian conditioned responding per se (Corbit et al., 2001; Corbit and Balleine, 2011; Laurent et al., 2012), we reasoned that these changes of DOR distribution were preparing ventral CINs for use in guiding instrumental actions during Pavlovian cues. Indeed, we found that the extent of DOR membrane accumulation in NAc-S CINs correlated with performance during the PIT test (Bertran-Gonzalez et al., 2013). Here we also confirmed the functional relevance of this plastic change on CINs showing that contingent Pavlovian training increased sensitivity of CIN firing to deltorphin, a direct agonist of DORs, a cellular adaptation that persisted at least until the moment of PIT.

Based on these results, we propose a functional model describing the hypothesized cellular interactions occurring in the NAc-S during PIT, and how the different neuromodulatory systems may cooperate to produce exclusive D1R responses (Fig. 6). In a drug-free situation, contingent Pavlovian training triggers the accumulation of DORs to the membrane of CINs, rendering these neurons more sensitive to the enkephalinergic signal even weeks after Pavlovian training. During PIT (Fig. 6A), intense MSN stimulation is expected to trigger, among other cellular responses, the release of enkephalin, which is produced in large amounts in D2R-MSNs and is the endogenous ligand of DORs (Steiner and Gerfen, 1998; Le Merrer et al., 2009). This paracrine signal can exert strong somatic inhibition over DOR-sensitized CINs, eliciting transient interruptions in their regular firing and a shift to irregular/burst firing. As shown by several studies, burst-pause firing responses are expected to generate large acetylcholine oscillations in the striatum, which are thought to be central for corticostriatal plasticity (Schulz and Reynolds, 2013). In this situation, the sensitive enkephalin influence on CINs can produce sudden drops of acetylcholine, which would relieve the inhibitory tone on both corticostriatal synapses and D1R-MSNs (the latter being mediated through M4Rs), permitting the expression of large cAMP-dependent signaling programs in these cells and the expression of NAc-S function, leading to congruent stimulus-driven choices. When naltrindole is present in the NAc-S during PIT (Fig. 6B), blockade of DORs is expected to antagonize the effects of endogenously released enkephalin, thus preventing the local drop in acetylcholine and enabling M4Rs to recover their inhibitory tone over D1Rs to prevent D1R-mediated cAMP responses. In contrast, when both naltrindole and the selective M4R antagonist MT3 are present in the NAc-S in the PIT test (Fig. 6C), the direct blockade of M4Rs is predicted to suppress their inhibitory tone over cAMP, allowing D1R-dependent responses to be expressed independently of the naltrindole-induced cholinergic state.
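The three conditions of this model can be summarized as a short chain of logical steps. The sketch below is a deliberately coarse, all-or-none caricature of the proposal, written only to make the predicted outcome of each condition explicit; the function name `d1r_camp_response` and the boolean treatment of each step are our simplifying assumptions, and real signaling is graded rather than binary.

```python
def d1r_camp_response(dor_blocked: bool, m4r_blocked: bool) -> bool:
    """Predict whether D1R-mediated cAMP signaling is expressed at PIT.

    Boolean caricature of the proposed model:
    enkephalin -> DORs on CINs -> drop in acetylcholine -> loss of M4R
    inhibitory tone -> D1R-dependent cAMP response in D1R-MSNs.
    """
    # Enkephalin can only interrupt CIN firing if DORs are available.
    acetylcholine_drops = not dor_blocked
    # M4Rs keep inhibiting cAMP while acetylcholine stays high,
    # unless the receptors themselves are blocked by MT3.
    m4r_inhibitory_tone = (not acetylcholine_drops) and (not m4r_blocked)
    return not m4r_inhibitory_tone


# Conditions A-C of the model, plus M4R blockade alone.
conditions = {
    "A: drug-free": dict(dor_blocked=False, m4r_blocked=False),
    "B: naltrindole": dict(dor_blocked=True, m4r_blocked=False),
    "C: naltrindole + MT3": dict(dor_blocked=True, m4r_blocked=True),
    "MT3 alone": dict(dor_blocked=False, m4r_blocked=True),
}

model_output = {name: d1r_camp_response(**state) for name, state in conditions.items()}
for name, expressed in model_output.items():
    print(f"{name}: D1R response {'expressed' if expressed else 'suppressed'}")
```

In this caricature only condition B suppresses the D1R response, matching the behavioral pattern in which naltrindole alone abolished PIT whereas MT3, alone or combined with naltrindole, left it intact.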

Figure 6.


Model of the cooperative interactions between opioidergic, acetylcholinergic, and dopaminergic systems in the NAc-S in the expression of choice between goal-directed actions in a test of Pavlovian-instrumental transfer. Each diagram (A–C) exemplifies the molecular interactions occurring in a cortico-direct pathway synapse (D1R-MSN; left) a CIN (center), and a cortico-indirect pathway synapse (D2R-MSN; right) at the moment of PIT. A corresponds to conditions without pharmacological intervention; B corresponds to a condition in which the selective DOR antagonist naltrindole (N, red) has been administered in the NAc-S; and C corresponds to a condition in which both naltrindole and the selective M4R blocker MT3 (black) are present in the NAc-S. The green contour represents the accumulated DORs in the somatic region of the CIN due to prior Pavlovian conditioning (Bertran-Gonzalez et al., 2013). Blue color represents overactivity. Red color represents hypoactivity. See Discussion for explanatory details. AC, adenylate cyclase; Ach, acetylcholine; Ca2+, intracellular calcium signal; cAMP, intracellular cAMP signal; DA, dopamine; ENK, enkephalin; Glut, glutamate; N, naltrindole.

Clearly, although this model is speculative and requires further experimentation, our evidence to date strongly suggests that opioids, through their activity at DORs expressed on CINs, promote stimulus-based choice via cholinergic modulation of D1R-expressing projection neurons in the NAc-S. Recent work in our laboratory has established that the projection from the nucleus accumbens shell to the medial ventral pallidum mediates specific PIT (Leung and Balleine, 2013), and the preferential role for D1R-expressing striatal MSNs in PIT suggests that this projection pathway is controlled by D1R-expressing MSNs. Widespread corticostriatal stimulation in the NAc-S during PIT is, however, required to coordinate substantial somatosensory, associative, and executive information to generate the organized behavioral pattern congruent with predictive cues on test. Accordingly, based on recent evidence for simultaneous activity in both D1R- and D2R-expressing neurons during voluntary movements (Cui et al., 2013; Isomura et al., 2013), as well as recent models of coherent corticostriatal oscillations (Koralek et al., 2013), we should expect both D1R- and D2R-expressing projection neurons to be involved in the oscillatory transmission of this signal, with upstream cellular events occurring in both MSN populations. In fact, if strong enough, cortical stimulation can produce downstream nuclear responses in virtually all striatal neurons, which could result in the release of neuropeptides, such as substance P in D1R-MSNs or enkephalin in D2R-MSNs (Sgambato et al., 1997; Fig. 6). Nevertheless, despite the existence of upstream events that are widespread across the two subpopulations, the dichotomy imposed by the dopaminergic signal through the distribution of D1Rs and D2Rs, and the opposing cAMP responses that these receptors promote, is translated into exquisitely segregated downstream molecular responses (Bertran-Gonzalez et al., 2008), and this segregation of output activity provides interesting clues as to the overall functional role of the D1R- and D2R-expressing MSNs in the NAc-S.

Although the NAc-S appears not to be necessary for Pavlovian or instrumental learning per se (Corbit et al., 2001; Corbit and Balleine, 2011), local changes induced by Pavlovian and instrumental training are necessary to support PIT. Any deficiency in the integrative processes supporting PIT will, therefore, strongly affect the stimulus control of action, and deficits in this function have been associated with a number of disorders, most notably stimulus-induced relapse in drug seeking after a period of abstinence, the stimulus control of food seeking in obesity, and the loss of control over perseverative actions in a number of psychiatric disorders including psychotic disorders and depression (Hyman, 2005; Seymour and Dolan, 2008; Simpson et al., 2010). There is, therefore, growing evidence of pathologies in decision-making involving the NAc-S (Kalivas and Volkow, 2005; Simon et al., 2011; Stopper and Floresco, 2011) and, given the numerous pharmacological agents that influence the opioidergic, acetylcholinergic, and dopaminergic systems, the current findings may help in designing potential pharmacological strategies for rescuing those deficits.

Footnotes

This research was supported by funding from the National Institute of Mental Health, Grant MH56646; the National Health and Medical Research Council, Grant 633267; and an Australian Laureate Fellowship from the Australian Research Council, FL0992409, to B.W.B.

References

  1. Baldo BA, Sadeghian K, Basso AM, Kelley AE. Effects of selective dopamine D1 or D2 receptor blockade within nucleus accumbens subregions on ingestive behavior and associated motor activity. Behav Brain Res. 2002;137:165–177. doi: 10.1016/S0166-4328(02)00293-0. [DOI] [PubMed] [Google Scholar]
  2. Bertran-Gonzalez J, Bosch C, Maroteaux M, Matamales M, Hervé D, Valjent E, Girault JA. Opposing patterns of signaling activation in dopamine D1 and D2 receptor-expressing striatal neurons in response to cocaine and haloperidol. J Neurosci. 2008;28:5671–5685. doi: 10.1523/JNEUROSCI.1039-08.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Bertran-Gonzalez J, Hervé D, Girault JA, Valjent E. What is the degree of segregation between striatonigral and striatopallidal projections? Front Neuroanat. 2010;4:136. doi: 10.3389/fnana.2010.00136. pii. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bertran-Gonzalez J, Laurent V, Chieng BC, Christie MJ, Balleine BW. Learning-related translocation of δ-opioid receptors on ventral striatal cholinergic interneurons mediates choice between goal-directed actions. J Neurosci. 2013;33:16060–16071. doi: 10.1523/JNEUROSCI.1927-13.2013.
  5. Bolam JP. Synapses of identified neurons in the neostriatum. Ciba Found Symp. 1984;107:30–47. doi: 10.1002/9780470720882.ch3.
  6. Bossert JM, Poles GC, Wihbey KA, Koya E, Shaham Y. Differential effects of blockade of dopamine D1-family receptors in nucleus accumbens core or shell on reinstatement of heroin seeking induced by contextual and discrete cues. J Neurosci. 2007;27:12655–12663. doi: 10.1523/JNEUROSCI.3926-07.2007.
  7. Colwill RM, Rescorla RA. Associations between the discriminative stimulus and the reinforcer in instrumental learning. J Exp Psychol Anim Behav Processes. 1988;14:155–164. doi: 10.1037/0097-7403.14.2.155.
  8. Corbit LH, Balleine BW. The general and outcome-specific forms of Pavlovian-instrumental transfer are differentially mediated by the nucleus accumbens core and shell. J Neurosci. 2011;31:11786–11794. doi: 10.1523/JNEUROSCI.2711-11.2011.
  9. Corbit LH, Muir JL, Balleine BW. The role of the nucleus accumbens in instrumental conditioning: evidence of a functional dissociation between accumbens core and shell. J Neurosci. 2001;21:3251–3260. doi: 10.1523/JNEUROSCI.21-09-03251.2001.
  10. Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, Costa RM. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013;494:238–242. doi: 10.1038/nature11846.
  11. Dickinson A, Balleine B. Motivational control of goal-directed action. Anim Learn Behav. 1994;22:1–18. doi: 10.3758/BF03199951.
  12. Diehl F, Fürstenau de Oliveira L, Sánchez G, Camboim C, de Oliveira Alvares L, Lanziotti VB, Cerveñansky C, Kornisiuk E, Jerusalinky D, Quillfeldt JA. Facilitatory effect of the intra-hippocampal pre-test administration of MT3 in the inhibitory avoidance task. Behav Brain Res. 2007;177:227–231. doi: 10.1016/j.bbr.2006.11.030.
  13. Ding JB, Guzman JN, Peterson JD, Goldberg JA, Surmeier DJ. Thalamic gating of corticostriatal signaling by cholinergic interneurons. Neuron. 2010;67:294–307. doi: 10.1016/j.neuron.2010.06.017.
  14. Faure A, Reynolds SM, Richard JM, Berridge KC. Mesolimbic dopamine in desire and dread: enabling motivation to be generated by localized glutamate disruptions in nucleus accumbens. J Neurosci. 2008;28:7184–7192. doi: 10.1523/JNEUROSCI.4961-07.2008.
  15. Gerfen CR, Surmeier DJ. Modulation of striatal projection systems by dopamine. Annu Rev Neurosci. 2011;34:441–466. doi: 10.1146/annurev-neuro-061010-113641.
  16. Goldberg JA, Ding JB, Surmeier DJ. Muscarinic modulation of striatal function and circuitry. Handb Exp Pharmacol. 2012:223–241. doi: 10.1007/978-3-642-23274-9_10.
  17. Guo ML, Fibuch EE, Liu XY, Choe ES, Buch S, Mao LM, Wang JQ. CaMKIIalpha interacts with M4 muscarinic receptors to control receptor and psychomotor function. EMBO J. 2010;29:2070–2081. doi: 10.1038/emboj.2010.93.
  18. Holmes NM, Marchand AR, Coutureau E. Pavlovian to instrumental transfer: a neurobehavioural perspective. Neurosci Biobehav Rev. 2010;34:1277–1295. doi: 10.1016/j.neubiorev.2010.03.007.
  19. Hyman SE. Addiction: a disease of learning and memory. Am J Psychiatry. 2005;162:1414–1422. doi: 10.1176/appi.ajp.162.8.1414.
  20. Isomura Y, Takekawa T, Harukuni R, Handa T, Aizawa H, Takada M, Fukai T. Reward-modulated motor information in identified striatum neurons. J Neurosci. 2013;33:10209–10220. doi: 10.1523/JNEUROSCI.0381-13.2013.
  21. Jeon J, Dencker D, Wörtwein G, Woldbye DPD, Cui Y, Davis AA, Levey AI, Schütz G, Sager TN, Mørk A, Li C, Deng CX, Fink-Jensen A, Wess J. A subpopulation of neuronal M4 muscarinic acetylcholine receptors plays a critical role in modulating dopamine-dependent behaviors. J Neurosci. 2010;30:2396–2405. doi: 10.1523/JNEUROSCI.3843-09.2010.
  22. Jolkkonen M, Van Giersbergen PL, Hellman U, Wernstedt C, Oras A, Satyapan N, Adem A, Karlsson E. Muscarinic toxins from the black mamba Dendroaspis polylepis. Eur J Biochem. 1995;234:579–585. doi: 10.1111/j.1432-1033.1995.579_b.x.
  23. Kalivas PW, Volkow ND. The neural basis of addiction: a pathology of motivation and choice. Am J Psychiatry. 2005;162:1403–1413. doi: 10.1176/appi.ajp.162.8.1403.
  24. Kawaguchi Y, Wilson CJ, Augood SJ, Emson PC. Striatal interneurones: chemical, physiological and morphological characterization. Trends Neurosci. 1995;18:527–535. doi: 10.1016/0166-2236(95)98374-8.
  25. Kieffer BL, Evans CJ. Opioid receptors: from binding sites to visible molecules in vivo. Neuropharmacology. 2009;56(Suppl 1):205–212. doi: 10.1016/j.neuropharm.2008.07.033.
  26. Koralek AC, Costa RM, Carmena JM. Temporally precise cell-specific coherence develops in corticostriatal networks during learning. Neuron. 2013;79:865–872. doi: 10.1016/j.neuron.2013.06.047.
  27. Laurent V, Leung B, Maidment N, Balleine BW. μ- and δ-opioid-related processes in the accumbens core and shell differentially mediate the influence of reward-guided and stimulus-guided decisions on choice. J Neurosci. 2012;32:1875–1883. doi: 10.1523/JNEUROSCI.4688-11.2012.
  28. Le Merrer J, Becker JA, Befort K, Kieffer BL. Reward processing by the opioid system in the brain. Physiol Rev. 2009;89:1379–1412. doi: 10.1152/physrev.00005.2009.
  29. Le Moine C, Kieffer B, Gaveriaux-Ruff C, Befort K, Bloch B. Delta-opioid receptor gene expression in the mouse forebrain: localization in cholinergic neurons of the striatum. Neuroscience. 1994;62:635–640. doi: 10.1016/0306-4522(94)90464-2.
  30. Leung BK, Balleine BW. The ventral striato-pallidal pathway mediates the effect of predictive learning on choice between goal-directed actions. J Neurosci. 2013;33:13848–13860. doi: 10.1523/JNEUROSCI.1697-13.2013.
  31. Lobo MK, Karsten SL, Gray M, Geschwind DH, Yang XW. FACS-array profiling of striatal projection neuron subtypes in juvenile and adult mouse brains. Nat Neurosci. 2006;9:443–452. doi: 10.1038/nn1654.
  32. Matamales M, Bertran-Gonzalez J, Salomon L, Degos B, Deniau JM, Valjent E, Hervé D, Girault JA. Striatal medium-sized spiny neurons: identification by nuclear staining and study of neuronal subpopulations in BAC transgenic mice. PLoS One. 2009;4:e4770. doi: 10.1371/journal.pone.0004770.
  33. Paxinos G, Watson C. The rat brain in stereotaxic coordinates: hard cover edition. San Francisco: Academic; 2006.
  34. Perrine SA, Hoshaw BA, Unterwald EM. Delta opioid receptor ligands modulate anxiety-like behaviors in the rat. Br J Pharmacol. 2006;147:864–872. doi: 10.1038/sj.bjp.0706686.
  35. Scherrer G, Tryoen-Tóth P, Filliol D, Matifas A, Laustriat D, Cao YQ, Basbaum AI, Dierich A, Vonesh JL, Gavériaux-Ruff C, Kieffer BL. Knockin mice expressing fluorescent delta-opioid receptors uncover G protein-coupled receptor dynamics in vivo. Proc Natl Acad Sci U S A. 2006;103:9691–9696. doi: 10.1073/pnas.0603359103.
  36. Schulz JM, Reynolds JN. Pause and rebound: sensory control of cholinergic signaling in the striatum. Trends Neurosci. 2013;36:41–50. doi: 10.1016/j.tins.2012.09.006.
  37. Seymour B, Dolan R. Emotion, decision making, and the amygdala. Neuron. 2008;58:662–671. doi: 10.1016/j.neuron.2008.05.020.
  38. Sgambato V, Abo V, Rogard M, Besson MJ, Deniau JM. Effect of electrical stimulation of the cerebral cortex on the expression of the Fos protein in the basal ganglia. Neuroscience. 1997;81:93–112. doi: 10.1016/S0306-4522(97)00179-6.
  39. Simon NW, Montgomery KS, Beas BS, Mitchell MR, LaSarge CL, Mendez IA, Bañuelos C, Vokes CM, Taylor AB, Haberman RP, Bizon JL, Setlow B. Dopaminergic modulation of risky decision-making. J Neurosci. 2011;31:17460–17470. doi: 10.1523/JNEUROSCI.3772-11.2011.
  40. Simpson EH, Kellendonk C, Kandel E. A possible role for the striatum in the pathogenesis of the cognitive symptoms of schizophrenia. Neuron. 2010;65:585–596. doi: 10.1016/j.neuron.2010.02.014.
  41. Smith RJ, Lobo MK, Spencer S, Kalivas PW. Cocaine-induced adaptations in D1 and D2 accumbens projection neurons (a dichotomy not necessarily synonymous with direct and indirect pathways). Curr Opin Neurobiol. 2013;23:546–552. doi: 10.1016/j.conb.2013.01.026.
  42. Sorimachi M, Kataoka K. High affinity choline uptake: an early index of cholinergic innervation in rat brain. Brain Res. 1975;94:325–336. doi: 10.1016/0006-8993(75)90065-7.
  43. Steiner H, Gerfen CR. Role of dynorphin and enkephalin in the regulation of striatal output pathways and behavior. Exp Brain Res. 1998;123:60–76. doi: 10.1007/s002210050545.
  44. Stopper CM, Floresco SB. Contributions of the nucleus accumbens and its subregions to different aspects of risk-based decision making. Cogn Affect Behav Neurosci. 2011;11:97–112. doi: 10.3758/s13415-010-0015-9.
  45. Tayebati SK, Di Tullio MA, Amenta F. Age-related changes of muscarinic cholinergic receptor subtypes in the striatum of Fisher 344 rats. Exp Gerontol. 2004;39:217–223. doi: 10.1016/j.exger.2003.10.016.
  46. Threlfell S, Lalic T, Platt NJ, Jennings KA, Deisseroth K, Cragg SJ. Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron. 2012;75:58–64. doi: 10.1016/j.neuron.2012.04.038.
  47. Valjent E, Corvol JC, Pages C, Besson MJ, Maldonado R, Caboche J. Involvement of the extracellular signal-regulated kinase cascade for cocaine-rewarding properties. J Neurosci. 2000;20:8701–8709. doi: 10.1523/JNEUROSCI.20-23-08701.2000.
  48. Valjent E, Bertran-Gonzalez J, Hervé D, Fisone G, Girault JA. Looking BAC at striatal signaling: cell-specific analysis in new transgenic mice. Trends Neurosci. 2009;32:538–547. doi: 10.1016/j.tins.2009.06.005.
  49. Wang JQ, Jolkkonen M, McGinty JF. The muscarinic toxin 3 augments neuropeptide mRNA in rat striatum in vivo. Eur J Pharmacol. 1997;334:43–47. doi: 10.1016/S0014-2999(97)01176-X.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience