Abstract
To investigate the involvement of dopaminergic projections to the prelimbic and infralimbic cortex in the control of goal-directed responses, a first experiment examined the effect of pretraining 6-OHDA lesions of these cortices. We used outcome devaluation and contingency degradation procedures to separately assess the representation of the outcome as a goal or the encoding of the contingency between the action and its outcome. All groups acquired the instrumental response at a normal rate, indicating that dopaminergic activity in the medial prefrontal cortex is not necessary for the acquisition of instrumental learning. Sham-operated animals showed sensitivity to both outcome devaluation and contingency degradation. Animals with dopaminergic lesions of the prelimbic cortex, but not the infralimbic cortex, failed to adapt their instrumental response to changes in contingency, whereas their response remained sensitive to outcome devaluation. In a second experiment, aimed at determining whether dopamine was specifically needed during contingency changes, we performed microinfusions of the dopamine D1/D2 receptor antagonist flupenthixol in the prelimbic cortex only before contingency degradation sessions. Animals with infusions of flupenthixol failed to adapt their response to changes in contingency, thus replicating the deficit of animals with dopaminergic lesions in Experiment 1. These results demonstrate that dissociable neurobiological mechanisms support action–outcome relationships and goal representation, dopamine signaling in the prelimbic cortex being necessary for the former but not the latter.
Introduction
Decision making in an ever-changing environment requires the selection of actions and constant monitoring of their consequences. Many of the basic cognitive mechanisms of selection processes can be identified during instrumental conditioning in both rodents (Dickinson, 1985; Balleine and Dickinson, 1998) and primates, including humans (Valentin et al., 2007; Tanaka et al., 2008).
In rats pressing a lever to gain access to a food reward, action is thought to be mediated by goal-directed action–outcome (A–O) associations, requiring both the encoding of the contingency between the action and its specific outcome and the representation of the outcome as the goal. These two aspects can be specifically assessed by using contingency degradation and outcome devaluation procedures (Dickinson, 1985; Balleine and Dickinson, 1998). Recent advances have demonstrated that the sensitivity of instrumental responses to both contingency degradation and outcome devaluation requires the integrity of a circuit involving the posterior part of the dorsomedial striatum (Yin et al., 2005a,b), the mediodorsal thalamus (Corbit et al., 2003), the basolateral nucleus of the amygdala (Balleine et al., 2003), and the prelimbic region (PLC) of the medial prefrontal cortex (Corbit and Balleine, 2003; Killcross and Coutureau, 2003; Ostlund and Balleine, 2005; Coutureau et al., 2009).
Considerable evidence points to a role of the dopaminergic system in reward-based learning (Costa, 2007), but much less is known about its contribution to goal-directed behavior (Wickens et al., 2007). Sensitization of the dopaminergic system with amphetamine promotes responding that is insensitive to outcome devaluation, i.e., mediated by a stimulus–response (S–R) habit system (Nelson and Killcross, 2006; Nordquist et al., 2007). In contrast, microinfusion of dopamine directly into the infralimbic cortex (ILC; a more ventral part of the medial prefrontal cortex) (Hitchcott et al., 2007), or damage to the nigrostriatal dopamine system using 6-hydroxy-dopamine (6-OHDA) (Faure et al., 2005), promotes the expression of A–O associations in conditions under which performance is normally mediated by the S–R system.
The present study examined the role of the mesoprefrontal dopaminergic system in goal-directed responding. In a first experiment, we assessed the effects of pretraining 6-OHDA lesions of dopaminergic fibers in the medial prefrontal cortex (mPFC) on the rats' performance in outcome devaluation and contingency degradation procedures, to evaluate both the representation of the outcome as the goal and the encoding of the relationship between action and outcome. In addition, we contrasted the effects of dopaminergic loss within the prelimbic or the infralimbic cortices, to further dissociate different neuroanatomical systems within the medial prefrontal cortex. In a second experiment, we assessed the ability of rats to perform goal-directed behavior under pharmacological blockade of dopaminergic D1/D2 receptors in the prelimbic cortex.
Materials and Methods
Subjects
Male naive Long–Evans rats (Centre d'Elevage Janvier, France), weighing between 250 and 300 g, were used. On their arrival at the laboratory, they were housed in pairs in polycarbonate cages (46 × 26 × 20 cm) in a temperature- and humidity-controlled room, and maintained under a 12 h light/dark cycle (light on at 7:00 A.M.). The experiments took place during the light phase of the cycle. Before the experiments, animals were given ad libitum access to food and water for 1 week and were handled every day. After recovery from surgery and throughout the duration of the experiments, rats were maintained at 90% of their original weight by restricting their food intake to ∼12 g/d. All experiments were conducted in agreement with the French (Directive 87-148, Ministère de l'Agriculture et de la Pêche) and international (Directive 86-609, November 24th, 1986, European Community) legislation.
Experiment 1: effect of pretraining 6-OHDA lesion
Surgical procedures
Animals were divided into three groups: SHAM group (sham-operated controls, n = 12), PL group (6-OHDA-injected in the prelimbic cortex, n = 12), and IL group (6-OHDA-injected in the infralimbic cortex, n = 12). Before surgery, animals were pretreated with the noradrenaline uptake blocker, desipramine hydrochloride (Sigma-Aldrich, 25 mg/kg, i.p.). Thirty minutes later, they were anesthetized with isoflurane and placed in a stereotaxic frame (Kopf Instruments) in a flat skull position. The bone above the mPFC was removed using a high-speed drill. The toxin 6-OHDA (Sigma-Aldrich, 4 μg/μl) was dissolved in a vehicle solution containing NaCl 0.9% and 0.1% of ascorbic acid. Intracerebral injections were performed through an elongated glass pipette (tip diameter, 30 μm) using a pressure ejection system (Picospritzer II, General Valve Corporation). For the lesioned groups, 0.2 μl of 6-OHDA was injected at the following coordinates from bregma: anteroposterior (AP), +3.8 mm; lateral (L), ±0.6 mm; dorsoventral (DV), −3.8 mm/AP, +3.2 mm; L, ±0.6 mm; DV, −3.6 mm/AP, +2.5 mm; L, ±0.6 mm; DV, −3.4 mm for the prelimbic cortex, and AP, +3 mm; L, ±0.7 mm; DV, −5.4 mm for the infralimbic cortex (Paxinos and Watson, 1998). Rats in the SHAM group received a similar surgical procedure with injections of vehicle. Half of the SHAM group received vehicle in the prelimbic cortex and the other half received vehicle in the infralimbic cortex. Injections were made at the rate of 0.1 μl/min and the pipette was left in place for 5 min after the injection to allow diffusion of the solution into the tissue. After surgery, animals were returned to the vivarium and left to recover for 2 weeks with ad libitum access to food and water.
Behavioral apparatus
Animals were trained in eight identical conditioning chambers (40 cm wide × 30 cm deep × 35 cm high, Imetronic), each located inside a sound- and light-attenuating wooden chamber (74 × 46 × 50 cm). Each chamber had a ventilation fan producing a background noise of 55 dB and four light-emitting diodes on the ceiling for illumination of the chamber. Each chamber had two opaque panels on the right and left sides, two clear Perspex walls on the back and front sides, and a stainless-steel grid floor (rod diameter: 0.5 cm; interrod distance: 1.5 cm). In the middle of the left wall, a magazine (6 × 4.5 × 4.5 cm) received food pellets (45 mg, F0165, Bio-Serv) from a dispenser located outside the operant chamber. In the middle of the right wall, an alternative magazine (7.9 × 5.6 × 9.7 cm) could receive fluid reinforcement (0.1 ml; sucrose solution 10%, Sigma-Aldrich) from a syringe pump located outside the operant chamber. Each magazine was equipped with infrared cells to detect the animal's visits. A retractable lever (4 × 1 × 2 cm) could be inserted next to each of the magazines. A personal computer connected to the operant chambers via an Imetronic interface and equipped with SKAA_PROG software (Imetronic) controlled the equipment and recorded the data.
To measure consumption behavior, eight Perspex cages (42 × 28 × 20 cm) located in a separate room were used. A glass dish (7.5 cm in diameter) containing food pellets could be fixed on the floor of each cage, or a bottle containing the sucrose solution could be inserted into the cage.
Immunohistochemistry
At the end of behavioral testing, rats were killed with an overdose of pentobarbital sodium (Ceva Santé Animale) and perfused transcardially with 300 ml of NaCl 0.9% solution, followed by 300 ml of paraformaldehyde (PFA) 4% solution in 0.1 M phosphate buffer (PB). The brains were removed, postfixed overnight in PFA 4% and transferred to a PB 0.1 M/sucrose 30% solution for 48 h at 4°C. Serial coronal sections (50 μm thick) were cut on a freezing microtome (Leica SM 2400), collected, and stored in a cryoprotective solution (PBS 0.1 M/azide 0.03%). The following primary antibodies were used: mouse monoclonal anti-tyrosine hydroxylase (TH; 1/500 in PBST 0.3% and normal goat serum 2%, Millipore Bioscience Research Reagents) and mouse monoclonal anti-dopamine-β-hydroxylase (DBH; 1/10,000 in PBST 0.3% and normal goat serum 2%, Millipore Bioscience Research Reagents). Free-floating sections were incubated with TH antibody or DBH antibody for 48 h at 4°C on a shaker. Sections were then incubated with biotinylated goat anti-mouse secondary antibody (1/1000 in PBST 0.3%, Jackson ImmunoResearch) for 90 min at room temperature. They were then incubated with avidin-biotin-peroxidase complex (1/200 in PBS, Vector Laboratories) for 90 min at room temperature. The final staining was made with a diaminobenzidine (DAB, Sigma-Aldrich) and hydrogen peroxide solution. Sections were rinsed with Tris buffer, collected on gelatin-coated slides, dehydrated with toluene, and mounted in Eukitt mounting medium.
Evaluation of fiber loss
To evaluate fiber loss, we examined labeled sections under an Olympus BX50 microscope with a 10× lens, connected to a Sony DXC-950 camera. Microphotographs of two sections for the prelimbic cortex (+4.2 and +2.2 mm from bregma) and one section for the infralimbic cortex (+2.2 mm from bregma) were examined. For each section, two adjacent square areas of 1600 μm² per hemisphere were considered. Quantification of TH- and DBH-positive fibers was performed using an automated method developed in the laboratory using ImageJ software. To determine the area covered by fibers in a central area of interest, a black and white digitized version of the microphotograph was smoothed with a Gaussian filter (diameter 24 pixels) and subtracted from the original picture to isolate high spatial frequencies. The picture was then subjected to a fixed threshold to identify stained elements. The surface covered by fibers was evaluated as the proportion of detected pixels in the region of interest. The results were expressed as a relative value with respect to the SHAM group and were submitted to an ANOVA with Lesion as factor, followed by Student–Newman–Keuls (SNK) post hoc analysis.
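The quantification steps described above (Gaussian smoothing, subtraction, fixed threshold, proportion of detected pixels) can be illustrated with a minimal sketch. This is not the laboratory's ImageJ routine: the function names, the threshold value, and the mapping of "diameter 24 pixels" to a Gaussian with sigma = 4 (kernel radius 12 pixels) are assumptions made for illustration, and the image is assumed to be coded so that stained elements have higher values than background.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def fiber_area_fraction(image, sigma=4.0, threshold=0.2):
    """Proportion of pixels whose high-pass value (original minus smoothed)
    exceeds a fixed threshold. sigma and threshold are hypothetical values."""
    smooth = gaussian_blur(image, sigma)
    detected = 0
    total = 0
    for raw_row, smooth_row in zip(image, smooth):
        for raw, low in zip(raw_row, smooth_row):
            total += 1
            if raw - low > threshold:
                detected += 1
    return detected / total

def relative_to_sham(fraction, sham_mean_fraction):
    """Express a covered-area fraction relative to the SHAM group mean."""
    return fraction / sham_mean_fraction

# Synthetic 20 x 20 region with a single one-pixel-wide "fiber" on row 10.
region = [[1.0 if y == 10 else 0.0 for _ in range(20)] for y in range(20)]
print(fiber_area_fraction(region))
```

Subtracting the smoothed image removes slow variations in background staining, so the fixed threshold responds mainly to thin, high-contrast elements such as fibers rather than to overall section darkness.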
Behavioral procedures
Magazine training.
Initially, all rats were trained for 2 d to collect rewards (sucrose solution or pellets) during two 30 min magazine training sessions per day (one session for each reward, separated by ∼2 h). Rewards were delivered on a random time 60 s schedule. The order of presentation of rewards was alternated each day and counterbalanced across animals.
Instrumental training.
All rats were then trained to press a lever to obtain a reward during two 30-min-long instrumental training sessions each day for 8 d (one session for each lever–reward association, in alternating order, with a 2 h break between them). The cage was illuminated and the lever inserted for the whole session. Different reinforcement schedules were used. The rats first received training under a continuous reinforcement, fixed ratio schedule FR1, for 2 d (i.e., each lever press was rewarded) until they had earned 100 of each outcome. Animals were then shifted to a random ratio schedule 5 for 2 d (RR-5; i.e., each response was rewarded with a probability of 0.2 on average) and then to a random ratio 10 schedule for 4 d (RR-10, i.e., each response was rewarded with a probability of 0.1 on average).
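The three schedules differ only in the per-press reward probability, which the following sketch makes explicit. It is a simulation under stated assumptions rather than the SKAA_PROG implementation: the function names, the schedule labels used as strings, and the number of simulated presses are hypothetical.

```python
import random

def reward_probability(schedule):
    """Per-press reward probability: FR1 rewards every press, RR-n rewards
    each press with probability 1/n."""
    if schedule == "FR1":
        return 1.0
    if schedule.startswith("RR-"):
        return 1.0 / int(schedule[3:])
    raise ValueError(f"unknown schedule: {schedule}")

def simulate_presses(schedule, n_presses, rng):
    """Count rewards earned over n_presses independent lever presses."""
    p = reward_probability(schedule)
    return sum(1 for _ in range(n_presses) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the simulation is repeatable
for name in ("FR1", "RR-5", "RR-10"):
    earned = simulate_presses(name, 1000, rng)
    print(name, reward_probability(name), earned)
```

Under a random ratio schedule the number of presses needed for the next reward varies from one reward to the next, but the long-run reward rate per press stays at 1/n.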
Outcome devaluation.
To familiarize the animals with the novel consumption cage, each rat was placed for 30 min in an individual consumption cage after the completion of the last session of instrumental training. On the following day, rats were given free access in their home cages to one of the two rewards (half receiving food pellet and half receiving sucrose solution) for 60 min. Immediately after the prefeeding treatment, rats were placed in their operant chambers for a 5 min test. During the test, both levers were inserted in the chamber, but no reward was delivered. The test session began with the illumination of the chamber and the insertion of both levers at the same time. Presses on each of the levers were then recorded for 5 min. To evaluate the effectiveness of the prefeeding procedure, animals were then transferred to the consumption cages and allowed access to 10 g of each of the two rewards successively (the order of the prefed and the nonprefed reward being counterbalanced across animals) for 15 min. Their overall consumption was measured. The following day, animals received two 30 min retraining sessions identical to those given at the end of training (RR-10 schedule). A second test was conducted on the final day. This was identical to the first one, except that rats were prefed with the alternative reward before the test. The consumption test was again conducted, counterbalanced for type of reward and order.
Contingency degradation.
After the devaluation procedure, rats received 2 d of retraining on the RR-10 schedule (average probability of reward: 0.1), followed by contingency degradation for 4 d. Two 20 min sessions were given each day (one for each reward, separated by ∼2 h) and the order of sessions was alternated. One of these sessions (nondegraded condition) was identical to the RR-10 instrumental training. In the other session (degraded condition), animals earned a reward (pellets or sucrose solution) as in RR-10 instrumental training, but additional deliveries of the same reward occurred independently of lever presses, so that the instrumental contingency was degraded. The noncontingent rewards were delivered with the same probability of 0.1 after each second without a lever press. For half the rats, the response–pellet contingency was degraded, and for the other half the response–sucrose contingency was degraded.
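One common way to summarize the manipulation is the difference between the probability of the outcome given a response and its probability in the absence of a response. The sketch below computes this difference for the programmed values and checks it against a simulated 20 min session. The function names, the 1 s time bins with at most one press per bin, and the press probability of 0.3 per second are assumptions introduced for illustration only.

```python
import random

def delta_p(p_outcome_given_press, p_outcome_given_no_press):
    """Action-outcome contingency as a difference of conditional probabilities."""
    return p_outcome_given_press - p_outcome_given_no_press

def simulate_session(press_prob, p_earned, p_free, n_seconds, rng):
    """Simulate a session in 1 s bins (assumption: at most one press per bin).
    Returns the observed (P(reward | press), P(reward | no press))."""
    press_bins = earned = idle_bins = free = 0
    for _ in range(n_seconds):
        if rng.random() < press_prob:
            press_bins += 1
            earned += rng.random() < p_earned   # reward earned by the press
        else:
            idle_bins += 1
            free += rng.random() < p_free       # noncontingent delivery
    p_press = earned / press_bins if press_bins else 0.0
    p_idle = free / idle_bins if idle_bins else 0.0
    return p_press, p_idle

# Programmed contingencies from the procedure described above.
print("nondegraded:", delta_p(0.1, 0.0))
print("degraded:   ", delta_p(0.1, 0.1))

# Observed contingency in one simulated degraded session of 20 min (1200 s).
rng = random.Random(0)
observed = simulate_session(0.3, 0.1, 0.1, 1200, rng)
print("simulated degraded session:", delta_p(*observed))
```

In the degraded condition the programmed difference is zero: pressing no longer changes the likelihood of receiving that reward, even though each press is still rewarded with probability 0.1.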
Data analysis
Instrumental performance was expressed as the ratio of the rate of lever presses to the baseline rate measured on the last day of training. A consumption index (i.e., amount of consumed reward/total amount available) was used to evaluate the effectiveness of satiety-specific devaluation. Statistical analyses were performed using ANOVAs with Lesion (SHAM, PL, IL) and Devaluation (Devalued or Nondevalued) as factors during the satiety-specific test, and Lesion and Session as factors for the contingency degradation procedures. Student–Newman–Keuls post hoc tests were also performed when required. Analyses were performed using StatView (version 5.0.1) and the α risk for rejection of the null hypothesis was fixed at 0.05.
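The two dependent measures defined above reduce to simple ratios, sketched here with made-up numbers. The function names, the presses-per-minute unit, and the example values are hypothetical and are not taken from the data set.

```python
def normalized_performance(test_presses, test_minutes, baseline_rate_per_min):
    """Response rate during a test, expressed as a proportion of the baseline
    rate measured on the last day of training."""
    return (test_presses / test_minutes) / baseline_rate_per_min

def consumption_index(consumed_g, available_g):
    """Amount of reward consumed divided by the total amount available."""
    return consumed_g / available_g

# Hypothetical rat: 50 presses in the 5 min test, baseline of 20 presses/min.
print(normalized_performance(50, 5, 20))   # 10 presses/min is half of baseline

# Hypothetical consumption test: 2.5 g eaten out of the 10 g offered.
print(consumption_index(2.5, 10))
```

Normalizing to each animal's own baseline removes between-subject differences in overall response rate, so that a value near 1 indicates unchanged responding and a value near 0 indicates strong suppression.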
Experiment 2: effect of dopamine D1/D2 receptors blockade
Surgical and histological procedures
For implantation of guide cannulae, rats were anesthetized with isoflurane and placed in a stereotaxic frame (Kopf Instruments) in a flat skull position. Guide cannulae (8 mm long, 36 gauge, Le Guellec) were lowered above the PLC at the following coordinates (in mm from bregma): anteroposterior, +3.2; lateral, ±0.6; ventral, −2.5. Guide cannulae were held in place by a block of dental cement (PalavitG) overlaying three small skull-screws. Removable stylets were inserted in the cannulae to protect them from dust. Rats were left in a warm postsurgical room for 24 h with ad libitum access to food and water. They were then returned to the vivarium for 1 week of recovery and were individually handled every day. After behavioral testing, animals received a lethal dose of sodium pentobarbital (Ceva Santé Animale) and were perfused transcardially with 60 ml of a NaCl 0.9% solution, followed by 120 ml of a 10% formaldehyde solution. Brains were stored in a 30% sucrose-formalin solution for 72 h before being cut in 60 μm sections with a freezing microtome (Leica SM2400). After being collected onto gelatin-coated slides, brain sections were left to dry for 48 h and were then stained with thionin. Infusion sites were located under a microscope. Section reconstructions were drawn in reference to the atlas of Paxinos and Watson (1998).
Microinfusions
During the last days of instrumental training, in the afternoon after the last session, rats were brought to the microinfusion room. There, they were familiarized with being wrapped in a cloth. The day before microinfusion, stylets were removed and cannulae were cleaned using an 8 mm long dental nerve broach (Micro-Mega). On the day of injection, intracerebral microinfusions were made bilaterally into the PLC. The rats were wrapped and gently handled, to insert stainless steel injection cannulae (39 gauge and 9.5 mm long; Le Guellec) into the guides. These cannulae were connected through catheters to two Hamilton syringes placed on a dual-syringe infusion pump (2.2 model, Harvard Apparatus). Microinfusions were made at the rate of 0.5 μl/min. Depending on the group, the rats received either 0.5 μl of a dopamine D1/D2 receptor antagonist, cis-(z)-flupenthixol dihydrochloride (Sigma-Aldrich) diluted at 30 μg/μl (FLU group) in phosphate buffer (Sigma-Aldrich), or 0.5 μl of phosphate buffer (VEH group). The injection cannulae were maintained in place for 2 min after microinfusion to allow diffusion of the solution.
Behavioral apparatus and procedures
In experiment 2, slight procedural changes were introduced. We used two sets of eight conditioning chambers identical to the ones of experiment 1 and located in two adjacent rooms. A distinctive type of food pellets served as reward in each set of conditioning chambers, providing similar but easily discriminable situations: one type of reward was identical to the one of experiment 1 (normal pellets: 45 mg, F0165, Bio-Serv) and the other one was alternative pellets (45 mg, AIN-76A pellets, TestDiet) that simply served to test for nonspecific effects of the drug. To minimize irreversible tissue damage, the number of microinfusions was limited and no attempt was made to compare degraded and nondegraded conditions in the same subjects. Initial conditioning training (magazine training, instrumental training) was conducted as in experiment 1. Each day, animals received one session in each set of operant chambers, with a break of 2 h between them. The order of presentation of rewards was alternated each day. On the day after the last session of instrumental training, to test whether flupenthixol might influence lever press performance (drug test), animals received microinjections immediately followed by an instrumental session on RR-10 schedule for alternative pellets.
In the following sessions, all rats were submitted to the contingency degradation procedure in the chambers that delivered the normal pellets; the other set of chambers was no longer used. The contingency degradation procedure was identical to that of experiment 1. The rats received the infusion appropriate to their group just before each of the four sessions. The sessions were separated by a break of 72 h to allow for the elimination of the drug (half-life of 19 h). An additional 5 min test session in the absence of drug and without food delivery was performed 72 h after the last degradation session. Finally, to rule out any changes in primary motivation resulting from flupenthixol infusion, a food consumption test was conducted: rats were given access to 20 g of the food pellets for 1 h in consumption cages, and the amount consumed by each animal was recorded.
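The rationale for the 72 h interval can be made concrete with a rough calculation. The sketch below assumes simple first-order (exponential) elimination with the stated half-life of 19 h; this kinetic model and the function name are assumptions for illustration, not a pharmacokinetic analysis reported in the study.

```python
def fraction_remaining(elapsed_h, half_life_h):
    """Fraction of a dose remaining after elapsed_h hours, assuming
    first-order elimination with the given half-life."""
    return 0.5 ** (elapsed_h / half_life_h)

remaining = fraction_remaining(72, 19)
print(f"{remaining:.3f}")
```

Because 72 h corresponds to almost four half-lives, only a small fraction of each dose (roughly 7% under this assumption) would remain by the time of the next infusion or of the drug-free test.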
Data analysis
Instrumental performance was expressed as the ratio of the rate of response (lever presses or magazine entries) to the baseline rate measured on the last day of training. Statistical analyses were performed using ANOVAs with Drug (VEH, FLU) and Session as factors for the contingency degradation procedure. Analyses were performed using StatView (version 5.0.1) and the α risk for rejection of the null hypothesis was fixed at 0.05.
Results
Experiment 1: effect of pretraining 6-OHDA lesion
After behavioral and histological procedures, six rats that did not acquire instrumental performance and one hydrocephalic rat were excluded from the analyses. SHAM animals which had received vehicle injection in either the prelimbic or the infralimbic cortex displayed similar patterns of behavioral responses. Because they did not significantly differ at any stage of the experiment, they were pooled to provide a SHAM group. The final group sizes were as follows: SHAM (n = 8); PL (n = 10); IL (n = 11).
6-OHDA lesions
Figure 1a shows immunostaining for TH fibers in the prefrontal cortex for a representative rat from each group. A large number of TH-immunoreactive fibers can be observed in both the prelimbic and infralimbic cortices of the SHAM animal. These are known to originate from the ventral tegmental area (VTA) (Heidbreder and Groenewegen, 2003; Hoover and Vertes, 2007; Lammel et al., 2008). Injections of 6-OHDA in the PL group drastically reduced TH staining in the prelimbic cortex but not in the infralimbic cortex. Injections of 6-OHDA in the IL group reduced TH staining in the infralimbic cortex, but also in the posterior part of the prelimbic cortex, thus suggesting a spread of the toxin to more dorsal regions. In fact, the loss was less important in the anterior part of prelimbic cortex which was more distant from the IL site of injection. Quantification of these effects in each group confirms these partially overlapping effects of injections in the PL and IL (Fig. 1b).
In the prelimbic cortex, the effects of injection were significantly different in the SHAM, PL, and IL groups (F(2,26) = 28.5; p < 0.001), but depended on the rostrocaudal position (F(2,26) = 5.86; p < 0.01). At the most posterior level (+2.2 mm with respect to bregma), the loss of TH immunostaining was similar (approximately −72%) in the PL and IL groups [SNK post hoc test, nonsignificant (ns)]. However, at the more anterior level (+4.2 mm with respect to bregma), the loss was more important (SNK test, p < 0.05) in the PL group (−91%) than in the IL group (−59%), indicating that a significant amount of dopamine activity was spared in the IL group.
Within the infralimbic region, the drastic loss of TH immunostaining (F(2,26) = 7.2; p < 0.01) was specific to the IL group (Fig. 1b, right). The post hoc test indicates significantly lower TH immunoreactivity in the IL group than in the two other groups (p < 0.01), confirming that the PL group displayed normal levels of TH immunoreactivity in this region.
TH immunoreactivity characterizes both dopaminergic and noradrenergic fibers, so that it was important to evaluate the loss of DBH-immunoreactive, noradrenergic fibers. The quantification of DBH immunostaining (Fig. 1c) shows a modest decrease in both the prelimbic and the infralimbic cortices for the PL and IL groups (23 and 33% of maximum loss, respectively), indicating that the protection of noradrenergic fibers using desipramine was not complete (Morrow et al., 1999). Although this decrease was not globally significant (F(2,26) = 2.34; ns), a significant loss was detected in the infralimbic cortex (F(2,26) = 4.87; p < 0.05). The loss of DBH immunoreactivity in this region was similar in the PL and IL groups, which both differed from the SHAM group. However, this limited reduction cannot explain the important loss of TH staining in lesioned groups, so that the behavioral effects observed in our study can be mostly attributed to the destruction of dopaminergic fibers.
Behavioral results
Instrumental training.
The three groups of animals acquired the instrumental response at the same rate (data not shown). A mixed ANOVA with Session and Lesion as factors showed an effect of Session (F(3,55) = 53.4; p < 0.001), but no effect of Lesion (F(2,55) = 0.002; ns) nor any significant Lesion × Session interaction (F(6,165) = 1.5; ns), indicating that dopaminergic loss within the prelimbic or infralimbic areas had no effect on the acquisition of the instrumental response. Mean response rates of the SHAM, PL, and IL groups at the completion of training were 29 ± 4, 25 ± 3, and 24 ± 2, respectively.
Devaluation test.
Figure 2a shows the instrumental performance during the 5 min devaluation test for the three experimental groups, as a proportion of their baseline. For both the SHAM and lesioned groups, the results are clear: the performance in the devalued condition (black bars) was markedly reduced compared with the nondevalued condition (white bars), indicating that satiety-induced devaluation was efficient in decreasing instrumental performance. The large difference between the devalued and the nondevalued conditions in the PL and IL groups indicates that dopaminergic loss within these two regions did not alter the sensitivity of instrumental performance to changes in the value of the outcome. This description of the data was confirmed by an ANOVA with Lesion (SHAM, PL, or IL) and Devaluation (Nondevalued or Devalued) as factors, showing a significant effect of Devaluation (F(1,55) = 75.86; p < 0.001) but no significant effect of Lesion (F(2,55) = 0.28; ns) and, importantly, no Lesion × Devaluation interaction (F(2,55) = 0.34; ns).
Consumption test.
The result of the consumption test confirmed that the prefeeding treatment induced a specific satiety that devalued the rewards in all three groups (Fig. 2b). All animals rejected the sated reward (black bars) but consumed high quantities of the nonsated reward (white bars). The ANOVA shows a significant effect of the Devaluation factor (F(1,55) = 137.77; p < 0.001), but no significant effect of Lesion (F(2,55) = 0.58; ns), as well as no significant Lesion × Devaluation interaction (F(2,55) = 2.19; ns). This indicates that dopamine depletion did not alter satiety.
Contingency degradation.
Figure 3 shows the effect of contingency degradation on instrumental responses. As shown on the left, all three groups continued to press readily in the nondegraded condition across the four training sessions. A repeated-measures ANOVA shows no effect related to Lesion or Session in this condition (largest F(6,78) = 1.12; ns).
The results of the degraded condition (middle) stand in marked contrast to those of the nondegraded condition. Indeed, both the SHAM and IL groups showed an important decrease in lever pressing, thus indicating that these animals correctly adapted their responding to the change of the action–outcome contingency. However, responding of the PL group remained stable and close to the baseline throughout training (mean of 0.81 ± 0.08 on the fourth day of training), a result suggesting that the dopaminergic lesion within the prelimbic region prevented adaptation of instrumental responses to contingency changes. The ANOVA with Session and Lesion as factors confirms this dissociation between PL and IL dopaminergic innervation. It reveals a significant effect of Session (F(3,26) = 5.21; p < 0.05) and Lesion (F(2,26) = 5.07; p < 0.05) and a significant interaction between Session and Lesion (F(6,78) = 2.49; p < 0.05), indicating that the contingency degradation procedure was acquired differently across groups. Specifically, the SHAM, PL, and IL groups did not differ from one another during the first day of contingency degradation training, but the PL group differed from the others during the last 2 d (SNK test, p < 0.01), confirming that dopamine depletion in the prelimbic cortex leads to an impaired adaptation to contingency changes.
This deficit cannot be an indirect consequence of lesion-induced alterations in food cup behavior since the three groups presented a similar mean rate of magazine entries during the degraded condition (Fig. 3, right). An ANOVA on the rate of visits to the magazine confirms the absence of significant effect of Lesion (F(2,26) = 1.70; ns).
Experiment 2: effect of dopamine D1/D2 receptors blockade
Histology
Figure 4 provides a schematic representation of the infusion sites within the prelimbic cortex. Twenty-one animals presented infusion sites located in the dorsal prelimbic cortex, allowing the diffusion into this region. Final group sizes were as follows: VEH group (n = 13) and FLU group (n = 8).
Behavioral results
Instrumental training.
Throughout training, the two groups of animals planned to receive either vehicle or flupenthixol acquired the two instrumental responses at the same rate (data not shown). Separate repeated-measures ANOVAs for the two instrumental responses with Session and Drug as factors showed an effect of Session (F(3,19) = 24.24 and F(3,19) = 23.90; p < 0.001) but no Drug effect (F(2,19) = 0.01 and F(2,19) = 0.30 respectively; ns) and no interaction between these two factors (F(3,57) = 0.68 and F(3,57) = 1.17 respectively; ns).
Contingency degradation.
Infusion of flupenthixol or vehicle in the prelimbic cortex had no effect on instrumental performance per se: both groups displayed similar levels of instrumental responding [VEH mean (m) = 0.67 ± 0.06; FLU m = 0.69 ± 0.09; F(1,19) = 0.06; ns], although responding in both groups was reduced relative to the baseline. Thus, there was no specific influence of flupenthixol infusion on instrumental performance.
Figure 5 (left) shows the effect of contingency degradation on instrumental response. The vehicle group showed an important decrease in lever pressing across sessions, indicating that these animals adapted their instrumental behavior to the modification of the action–outcome contingency. However, the FLU group displayed a more persistent and stable response than the VEH group throughout the degradation procedure. A repeated-measures ANOVA with Drug and Session as factors shows a significant effect of Drug (F(1,19) = 4.83; p < 0.05), Session (F(3,19) = 3.33; p < 0.05), and a significant interaction between the two factors (F(3,57) = 3.04; p < 0.05). Separate analyses for each day show that the FLU group significantly differed from VEH group from the second day of procedure (F(1,19) = 6.27; p < 0.05/F(1,19) = 9.25; p < 0.01/F(1,19) = 5.12; p < 0.05 for each day, respectively). This result strongly suggests that the blockade of dopamine transmission in the prelimbic cortex only during the contingency degradation procedure prevented the adaptation of instrumental response to contingency modifications. Moreover, the difference in response between the FLU and VEH groups persisted in the subsequent test in the absence of drug (VEH m = 0.56 ± 0.07; FLU m = 0.96 ± 0.11; F(1,19) = 9.57; p < 0.01).
This effect cannot be a consequence of drug-induced alterations in magazine approach behavior since the two groups displayed similar mean rate of magazine entries during the four sessions of contingency degradation stage (VEH m = 1.76 ± 0.11; FLU m = 2.21 ± 0.29). An ANOVA with Drug as factor shows no significant effect of Drug (F(1,19) = 2.78; ns). In addition, this behavioral effect does not result from alterations in primary motivation for reward, because VEH and FLU groups consumed equally high quantities of reward during the consumption test (11 and 12 g, respectively), where there was no significant effect of Drug (F(1,19) = 0.24; ns).
Discussion
The present study clearly demonstrates the involvement of the dopaminergic innervation of the prefrontal cortex in the control of goal-directed behavior. Experiment 1 demonstrates that the loss of dopaminergic signaling in the prelimbic cortex abolished the sensitivity of the instrumental response to contingency degradation but not to devaluation of the outcome by sensory-specific satiety. The effects of dopaminergic depletion therefore differ from those reported for cell body lesions, as earlier experiments have shown that both processes were disrupted in rats with pretraining cell body lesions of the PLC (Corbit and Balleine, 2003; Killcross and Coutureau, 2003). In contrast, the loss of dopaminergic signaling in the infralimbic cortex had no effect on the sensitivity of the instrumental response in these two procedures, thereby confirming important functional dissociations within the medial prefrontal cortex (Killcross and Coutureau, 2003). Experiment 2 further showed that the dopamine D1/D2 receptor antagonist flupenthixol, injected locally in the prelimbic cortex during contingency degradation, sufficed to disrupt goal-directed behavior, although normal functioning was preserved during both the acquisition of the instrumental task and the test.
Alternative behavioral interpretations of our data need to be ruled out. For instance, a different level of reward-elicited approach in the lesioned or flupenthixol groups might compete with lever press behavior. However, we showed that magazine activity did not differ significantly between lesioned or flupenthixol-injected rats and control rats. Alternatively, the DA-lesioned or flupenthixol-injected rats might show an alteration in their response to satiety, but the results of the consumption test show that they consumed the same amounts of sated or nonsated food as control rats, as described in other PFC-lesion studies (Killcross and Coutureau, 2003). Thus, neither changes in magazine approach behavior nor an alteration in satiety-related processes are likely to explain our pattern of results.
There is evidence that encoding of the outcome depends on dopaminergic mechanisms outside the medial prefrontal cortex. For example, lesions of the nucleus accumbens (NAC) core, which receives a dense dopaminergic projection from the VTA, have been shown to reduce sensitivity to outcome devaluation but not to contingency degradation (Corbit et al., 2001). In addition, Montague et al. (2004) have shown that VTA inhibition or dopamine antagonist infusions in the NAC diminished the animals' capacity to adapt their behavior according to reward value. These data suggest that dopaminergic innervation in the NAC, in contrast to the PLC, is involved in modulating the instrumental response according to changes in the value of the outcome. Together, these results show that outcome devaluation and contingency degradation can be doubly dissociated, i.e., that the two processes are implemented through distinct neural substrates. These might include brain areas such as the basolateral amygdala or the NAC (Heidbreder and Groenewegen, 2003; Vertes, 2004; Hoover and Vertes, 2007), which have strong reciprocal connections with the mPFC. The two processes also appear to require processing in the PLC during separate phases of the experiment, since mPFC lesions have been shown to affect outcome devaluation only when performed during the acquisition of the instrumental task (Ostlund and Balleine, 2005), whereas the present results indicate a later involvement of dopaminergic mechanisms when the contingency changes. Moreover, the dorsomedial striatum may integrate these various aspects of goal-directed performance, since outcome devaluation and contingency degradation are affected by both pretraining and posttraining lesions, as well as by pharmacological manipulations of this region (Yin et al., 2005a,b).
In the rat, the PLC has been shown to be necessary to adapt to changes in instrumental contingency (Balleine and Dickinson, 1998; Corbit and Balleine, 2003; Dalley et al., 2004). Our study focused on the acquisition phase of contingency degradation and directly addresses the updating of action–outcome relationships. We demonstrate here that dopaminergic signaling in the PLC plays a critical role in the detection of contingency changes. Moreover, when rats had acquired the task with intact brain function, blocking dopaminergic receptors during the contingency degradation stage clearly impaired adaptation in Experiment 2. This deficit is therefore not attributable to a long-lasting adaptation to dopaminergic denervation, but rather to a specific contribution of dopamine to the detection of contingency changes. This result is consistent with a putative role of dopamine neurons in encoding reward-prediction error (Schultz, 1998; McClure et al., 2003; Montague et al., 2004). Indeed, dopaminergic neurons display a phasic increase in their firing rate when reward occurrence exceeds expectations. Thus, in the degraded condition, the delivery of noncontingent rewards, which are not caused by lever pressing and therefore not expected by the animals, is likely to induce a positive dopaminergic error signal. This signal could then be used by the rats to adapt their behavior according to the new situation. In the PL group, the reduction or loss of such a signal precisely in the cortical area associated with contingency detection would prevent the animals from updating their action–outcome representation to guide behavior. In contrast, in the outcome devaluation procedure, there is no unexpected reward, so that a role of this error signal is unlikely. This might account for the absence of effect of denervation in the devaluation test.
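The reasoning in the preceding paragraph can be made concrete with a minimal sketch. It assumes two standard formalizations that the text itself does not spell out: the action–outcome contingency expressed as a difference of conditional probabilities, and the reward-prediction error expressed as received minus expected reward. The function names and all numerical values (P_PRESS, P_FREE, the reward magnitude) are hypothetical and chosen only for illustration; they are not the parameters used in the experiments.

```python
def delta_p(p_outcome_given_action: float, p_outcome_given_no_action: float) -> float:
    """Action-outcome contingency as a difference of conditional probabilities."""
    return p_outcome_given_action - p_outcome_given_no_action


def prediction_error(reward: float, expected: float) -> float:
    """Reward-prediction error: received reward minus expected reward."""
    return reward - expected


# Hypothetical probabilities, for illustration only (not taken from the study).
P_PRESS = 0.05  # probability of reward in a time bin containing a lever press
P_FREE = 0.05   # probability of a free reward in a time bin without a press

# During training, rewards are earned only by pressing the lever.
contingency_training = delta_p(P_PRESS, 0.0)

# During degradation, free rewards are as likely as earned ones.
contingency_degraded = delta_p(P_PRESS, P_FREE)

# A free reward arrives while the animal, not having pressed, expects nothing:
# the error is positive, as assumed for the degraded condition.
error_free_reward = prediction_error(reward=1.0, expected=0.0)

print(contingency_training, contingency_degraded, error_free_reward)
```

Under these assumptions, the contingency falls to zero once free rewards match earned rewards, while each unexpected free reward produces a positive error. On the account proposed here, this is the signal whose reduction or loss in the prelimbic cortex would prevent the action–outcome representation from being updated.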
The fact that in groups PL and IL equivalent levels of dopaminergic denervation were induced in the posterior part of the PLC suggests that the critical role in these processes is played by the more anterior region of the PLC, which was also the region displaying the most dramatic loss of dopaminergic fibers. The PL group indeed presented both an extreme loss of fibers in the more anterior parts of the prelimbic cortex and a nearly complete preservation of dopaminergic innervation in the infralimbic region.
Conversely, the absence of behavioral effects of the dopaminergic loss in the IL group confirms the existence of important dorsoventral dissociations within the mPFC (Peters et al., 2008). These behavioral results are consistent with previous reports showing that lesions of the ILC, unlike those of the PLC, do not impair goal-directed instrumental responses (Killcross and Coutureau, 2003). One may argue that the partial prelimbic denervation present in the IL group might have masked a deficit of a different kind, but there is at present no evidence for an effect of dopaminergic denervation in the ILC. Previous research has shown that infusions of dopamine into the ILC could restore the sensitivity of an overtrained instrumental response to posttraining devaluation of the outcome (Hitchcott et al., 2007), thus suggesting that dopamine within the ILC might be involved in the coordination of actions and habits. On this basis, habit-based responding, i.e., a response not sensitive to outcome devaluation, might have been expected after a dopamine lesion of the ILC. This hypothesis was not supported by Experiment 1, which instead replicated the effects of excitotoxic lesions (Killcross and Coutureau, 2003). Nevertheless, because of procedural differences, such as the devaluation being conducted using sensory-specific satiety (Killcross and Coutureau, 2003) versus conditioned food aversion (Hitchcott et al., 2007), this issue warrants further investigation.
The impact of dopaminergic lesions on adaptive instrumental behavior may also be viewed in relation to the general role of dopamine in the PFC. This region is especially important for working memory and behavioral flexibility (Miller and Cohen, 2001; Seamans and Yang, 2004; Ragozzino, 2007). Learning a new pattern of contingency requires the ability to flexibly update previously acquired representations and select new response strategies. Contingency degradation could therefore constitute a novel way to assess behavioral flexibility. A few studies suggest the importance of mesocortical dopaminergic projections in these processes (Floresco and Magyar, 2006; Van der Meulen et al., 2007). In particular, antagonists of either D1 or D2 dopaminergic receptors can impair attentional set-shifting performance (Floresco and Magyar, 2006). When dopaminergic activity is reduced, the errors appear to be mostly perseverative, indicating a reduced ability of the animals to flexibly adapt their behavior. It is therefore possible that, as a result of dopamine depletion in the prelimbic cortex, the animals were unable to change their response strategy when the contingencies were altered. Whether the deficit observed in dopamine-depleted animals results from an altered perception of the contingency change or from an inability to acquire and maintain a new response strategy remains to be elucidated.
In summary, the current findings provide evidence for a specific role of prefrontal dopaminergic innervation in instrumental learning. These results extend previous work suggesting the involvement of the prelimbic and infralimbic cortices in two dissociable neural networks for goal-directed actions and habits, respectively (Coutureau and Killcross, 2003), by showing that dopaminergic innervation of the prelimbic cortex is essential in goal-directed actions for the detection of changes in action–outcome relationships, but not for goal representation. These findings have important implications for our understanding of the cognitive and neural processes underlying decision making (Rangel et al., 2008).
Footnotes
This work was supported by grants from the Centre National de la Recherche Scientifique (Programme Interdisciplinaire Neuroinformatique) and Conseil Régional d'Aquitaine. F.N. is a fellow of the Ministère de l'Enseignement Supérieur. We thank L. Decorte and A. Faugère for their technical assistance and D. Panzeri, N. Argenta, and J. Huard for their help in animal breeding and care.
References
- Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998;37:407–419. doi: 10.1016/s0028-3908(98)00033-1.
- Balleine BW, Killcross AS, Dickinson A. The effect of lesions of the basolateral amygdala on instrumental conditioning. J Neurosci. 2003;23:666–675. doi: 10.1523/JNEUROSCI.23-02-00666.2003.
- Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental conditioning. Behav Brain Res. 2003;146:145–157. doi: 10.1016/j.bbr.2003.09.023.
- Corbit LH, Muir JL, Balleine BW. The role of the nucleus accumbens in instrumental conditioning: evidence of a functional dissociation between accumbens core and shell. J Neurosci. 2001;21:3251–3260. doi: 10.1523/JNEUROSCI.21-09-03251.2001.
- Corbit LH, Muir JL, Balleine BW. Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats. Eur J Neurosci. 2003;18:1286–1294. doi: 10.1046/j.1460-9568.2003.02833.x.
- Costa RM. Plastic corticostriatal circuits for action learning: what's dopamine got to do with it? Ann N Y Acad Sci. 2007;1104:172–191. doi: 10.1196/annals.1390.015.
- Coutureau E, Killcross S. Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats. Behav Brain Res. 2003;146:167–174. doi: 10.1016/j.bbr.2003.09.025.
- Coutureau E, Marchand AR, Di Scala G. Goal-directed responding is sensitive to lesions to the prelimbic cortex or basolateral nucleus of the amygdala but not to their disconnection. Behav Neurosci. 2009;123:443–448. doi: 10.1037/a0014818.
- Dalley JW, Cardinal RN, Robbins TW. Prefrontal executive and cognitive functions in rodents: neural and neurochemical substrates. Neurosci Biobehav Rev. 2004;28:771–784. doi: 10.1016/j.neubiorev.2004.09.006.
- Dickinson A. Actions and habits: the development of behavioural autonomy. Phil Trans R Soc Lond B Biol Sci. 1985;308:67–78.
- Faure A, Haberland U, Condé F, El Massioui N. Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci. 2005;25:2771–2780. doi: 10.1523/JNEUROSCI.3894-04.2005.
- Floresco SB, Magyar O. Mesocortical dopamine modulation of executive functions: beyond working memory. Psychopharmacology. 2006;188:567–585. doi: 10.1007/s00213-006-0404-5.
- Heidbreder CA, Groenewegen HJ. The medial prefrontal cortex in the rat: evidence for a dorso-ventral distinction based upon functional and anatomical characteristics. Neurosci Biobehav Rev. 2003;27:555–579. doi: 10.1016/j.neubiorev.2003.09.003.
- Hitchcott PK, Quinn JJ, Taylor JR. Bidirectional modulation of goal-directed actions by prefrontal cortical dopamine. Cereb Cortex. 2007;17:2820–2827. doi: 10.1093/cercor/bhm010.
- Hoover WB, Vertes RP. Anatomical analysis of afferent projections to the medial prefrontal cortex in the rat. Brain Struct Funct. 2007;212:149–179. doi: 10.1007/s00429-007-0150-4.
- Killcross S, Coutureau E. Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb Cortex. 2003;13:400–408. doi: 10.1093/cercor/13.4.400.
- Lammel S, Hetzel A, Häckel O, Jones I, Liss B, Roeper J. Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron. 2008;57:760–773. doi: 10.1016/j.neuron.2008.01.022.
- McClure SM, Daw ND, Montague PR. A computational substrate for incentive salience. Trends Neurosci. 2003;26:423–428. doi: 10.1016/s0166-2236(03)00177-2.
- Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annu Rev Neurosci. 2001;24:167–202. doi: 10.1146/annurev.neuro.24.1.167.
- Montague PR, Hyman SE, Cohen JD. Computational roles for dopamine in behavioural control. Nature. 2004;431:760–767. doi: 10.1038/nature03015.
- Morrow BA, Elsworth JD, Rasmusson AM, Roth RH. The role of mesoprefrontal dopamine neurons in the acquisition and expression of conditioned fear in the rat. Neuroscience. 1999;92:553–564. doi: 10.1016/s0306-4522(99)00014-7.
- Nelson A, Killcross S. Amphetamine exposure enhances habit formation. J Neurosci. 2006;26:3805–3812. doi: 10.1523/JNEUROSCI.4305-05.2006.
- Nordquist RE, Voorn P, de Mooij-van Malsen JG, Joosten RN, Pennartz CM, Vanderschuren LJ. Augmented reinforcer value and accelerated habit formation after repeated amphetamine treatment. Eur Neuropsychopharmacol. 2007;17:532–540. doi: 10.1016/j.euroneuro.2006.12.005.
- Ostlund SB, Balleine BW. Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. J Neurosci. 2005;25:7763–7770. doi: 10.1523/JNEUROSCI.1921-05.2005.
- Paxinos G, Watson C. The rat brain in stereotaxic coordinates. Ed 4. San Diego: Academic; 1998.
- Peters J, LaLumiere RT, Kalivas PW. Infralimbic prefrontal cortex is responsible for inhibiting cocaine seeking in extinguished rats. J Neurosci. 2008;28:6046–6053. doi: 10.1523/JNEUROSCI.1045-08.2008.
- Ragozzino ME. The contribution of the medial prefrontal cortex, orbitofrontal cortex and dorsomedial striatum to behavioral flexibility. Ann N Y Acad Sci. 2007;1121:355–375. doi: 10.1196/annals.1401.013.
- Rangel A, Camerer C, Montague PR. A framework for studying the neurobiology of value-based decision making. Nat Rev Neurosci. 2008;9:545–556. doi: 10.1038/nrn2357.
- Schultz W. Predictive reward signal of dopamine neurons. J Neurophysiol. 1998;80:1–27. doi: 10.1152/jn.1998.80.1.1.
- Seamans JK, Yang CR. The principal features and mechanisms of dopamine modulation in the prefrontal cortex. Prog Neurobiol. 2004;74:1–58. doi: 10.1016/j.pneurobio.2004.05.006.
- Tanaka SC, Balleine BW, O'Doherty JP. Calculating consequences: brain systems that encode the causal effects of actions. J Neurosci. 2008;28:6750–6755. doi: 10.1523/JNEUROSCI.1808-08.2008.
- Valentin VV, Dickinson A, O'Doherty JP. Determining the neural substrates of goal-directed learning in the human brain. J Neurosci. 2007;27:4019–4026. doi: 10.1523/JNEUROSCI.0564-07.2007.
- van der Meulen JA, Joosten RN, de Bruin JP, Feenstra MG. Dopamine and noradrenaline efflux in the medial prefrontal cortex during serial reversals and extinction of instrumental goal-directed behavior. Cereb Cortex. 2007;17:1444–1453. doi: 10.1093/cercor/bhl057.
- Vertes RP. Differential projections of the infralimbic and prelimbic cortex in the rat. Synapse. 2004;51:32–58. doi: 10.1002/syn.10279.
- Wickens JR, Horvitz JC, Costa RM, Killcross S. Dopaminergic mechanisms in actions and habits. J Neurosci. 2007;27:8181–8183. doi: 10.1523/JNEUROSCI.1671-07.2007.
- Yin HH, Knowlton BJ, Balleine BW. Blockade of NMDA receptors in the dorsomedial striatum prevents action–outcome learning in instrumental conditioning. Eur J Neurosci. 2005a;22:505–512. doi: 10.1111/j.1460-9568.2005.04219.x.
- Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci. 2005b;22:513–523. doi: 10.1111/j.1460-9568.2005.04218.x.