The Role of the Rat Medial Prefrontal Cortex in Adapting to Changes in Instrumental Contingency

Etienne Coutureau; Frederic Esclassan; Georges Di Scala; Alain R Marchand

doi:10.1371/journal.pone.0033302

2012 Apr 4;7(4):e33302. doi: 10.1371/journal.pone.0033302

Editor: Xiaoxi Zhuang

PMCID: PMC3319541 PMID: 22496747

Abstract

In order to select actions appropriate to current needs, a subject must identify relationships between actions and events. Control over the environment is determined by the degree to which action consequences can be predicted, as described by action-outcome contingencies, i.e. performing an action should affect the probability of the outcome. In a first experiment, we evaluated adaptation to contingency changes in rats with neurotoxic lesions of the medial prefrontal cortex. Results indicate that this brain region is not critical for adjusting instrumental responding to a negative contingency where the rats must refrain from pressing a lever, as this action prevents reward delivery. By contrast, this brain region is required to reduce responding in a non-contingent situation where the same number of rewards is freely delivered and actions no longer affect the outcome. In a second experiment, we determined that this effect does not result from a different perception of temporal relationships between actions and outcomes, since lesioned rats adapted normally to gradually increasing delays in reward delivery. These data indicate that the medial prefrontal cortex is not directly involved in evaluating the correlation between action and reward rates or in the perception of reward delays. The deficit in lesioned rats appears to consist of an abnormal response to the balance between contingent and non-contingent rewards. By highlighting the role of prefrontal regions in adapting to the causal status of actions, these data contribute to our understanding of the neural basis of choice tasks.

Introduction

Decision making requires adequate integration of actions with respect to their goal. A number of studies have demonstrated that this process depends on the identification of causal relationships between actions and events [1], which amounts to contingency learning. Contingency is usually defined as the difference between the probability of observing a given outcome in the presence of a given action and the probability of observing it in the absence of this action.
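The definition above can be illustrated with a minimal sketch. The function name `delta_p` and all counts are hypothetical and chosen only for illustration; they are not data from this study, and the paper does not state that contingency was computed this way.

```python
def delta_p(outcomes_with_action, n_with_action,
            outcomes_without_action, n_without_action):
    """Contingency as P(outcome | action) - P(outcome | no action).

    Arguments are counts of observation periods (hypothetical time bins):
    how many contained the action (or not), and how many of those were
    followed by the outcome.
    """
    if n_with_action <= 0 or n_without_action <= 0:
        raise ValueError("each condition needs at least one observation")
    p_given_action = outcomes_with_action / n_with_action
    p_given_no_action = outcomes_without_action / n_without_action
    return p_given_action - p_given_no_action


# Illustrative counts only (assumed, not measured).
positive = delta_p(30, 100, 0, 100)    # reward occurs only after the action
null = delta_p(20, 100, 20, 100)       # reward equally likely either way
negative = delta_p(0, 100, 30, 100)    # the action prevents the reward
```

Under this reading, a positive value corresponds to the training schedules, a value near zero to a non-contingent situation, and a negative value to omission.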

An increasing body of evidence points to a role of prefrontal regions in the representation of contingencies. In particular, activity within prefrontal areas in both primates and rodents is related to the acquisition and updating of contingency [2], [3], [4]. In rodents, the medial prefrontal cortex (mPFC) contributes to the learning of instrumental contingencies in animals pressing a lever for a food reward [5], [6]. This research has established that rats with damage to the mPFC learn the task at a normal rate [7], [8], but that their response is insensitive to manipulations of consequences such as contingency degradation i.e. weakening the correlation between food delivery and lever pressing [5], [6], [9], [10]. Dopaminergic mechanisms also appear to be involved since lesions of dopaminergic terminals in the mPFC alter normal adaptation to contingency degradation [11].

The mechanisms responsible for these effects, however, remain poorly understood. In a standard contingency degradation procedure, the outcome is equally probable in the presence or absence of action (see [12]). Thus both the causal and the temporal relationships between action and outcome are altered. Normal rats, but not mPFC-lesioned rats, respond to this new situation by reducing their lever-pressing rate. This deficit might reflect altered sensitivity either to the degree of control over the outcome or to the temporal relationship between response and outcome. The present study aims to distinguish between these two possibilities.

First, the mPFC might be required for adaptation when there is no clear relationship between the action and the outcome, i.e. under conditions of low or null contingency. We tested this hypothesis by comparing the performance of previously trained mPFC-lesioned animals in two contingency conditions. In a first condition, classically called omission (e.g. [13]), the animals had to refrain from pressing the lever for a fixed time (20 s) in order to obtain the food reward. Thus, although food could not be obtained by lever pressing, a relationship between action and outcome was preserved (negative contingency). In a second condition, reward delivery was independent of lever pressing, a non-contingent situation.

Second, since a condition of degraded contingency is characterized by a variable time interval between lever press and reward delivery, the deficit of mPFC-lesioned rats might result from an altered perception of the temporal relationship between action and reward delivery. We evaluated this hypothesis in a second experiment under delayed reward conditions that gradually disrupted the contiguity between lever press and reward.

Experiment 1

Materials and Methods

Ethics statement

All procedures involving animals and their care conformed to the institutional guidelines that comply with international (Directive 86-609, November 24, 1986, European Community) and national (council directive 87-848, October 19, 1987, Ministère de l'Agriculture et de la Forêt, Service Vétérinaire de la Santé et de la Protection Animales) laws and policies. They adhered to protocols approved by Région Aquitaine Veterinary Services (Direction Départementale de la Protection des Animaux, approval ID: A37-063). E.C. holds permission for animal experiments no. 33 06 008 from Ministère de l'Agriculture et de la Forêt. Surgery was performed under ketamine+xylazine anaesthesia (Expt. 1) or isoflurane anaesthesia (Expt. 2). Following surgery, animals were weighed and observed daily to detect and minimize pain or discomfort.

Subjects

Thirty-two male Long-Evans rats obtained from Centre d'Elevage Janvier (France) were used. Rats were housed in pairs and acclimated to the laboratory vivarium for one week. The vivarium was maintained at 21°C±1°C with the lights on from 7 a.m. to 7 p.m. All experiments were carried out during the light portion of the cycle. Following recovery from surgery, animals were maintained at about 90% of free-feeding weight (340–405 g) by providing them once daily with 15 g of rodent formula (laboratory chow, Purina).

Surgery

The rats were anaesthetised using a mixture of ketamine (90 mg/kg) and xylazine (10 mg/kg) and then placed in a Kopf stereotaxic frame (Kopf instruments, Tujunga, CA) in a flat skull position. Neurotoxic lesions were performed using multiple NMDA micro-injections. The bone above the injection sites was removed using a high-speed drill. NMDA (Sigma-Aldrich) 40 mM in PBS (pH = 7.4) was injected into the brain through a glass pipette glued onto the end of the needle of a 5-µl Hamilton syringe held with a microinjector (Imetronic, Pessac, France). For the lesioned group (mPFC, n = 16), 0.1 µl of NMDA was infused in the medial prefrontal cortex at the following coordinates (in mm from Bregma): A-P (antero-posterior) +3.8, L (lateral) ±0.6, V (ventral) −3.8; A-P +3.2, L ±0.6, V −3.6; A-P +3.0, L ±0.6, V −5.4; A-P +2.5, L ±0.6, V −3.4. Injections were made at a rate of 0.10 µl/min, then the pipette was left in place for 5 min to allow diffusion of the solution into the tissue. The control group (SHAM, n = 16) was given a similar surgical procedure but the dura was simply breached using a standard needle and no injection was given. All subjects recovered for a period of at least 7 days after surgery with ad lib access to food and water. Animals were then individually handled for 5 min on each of 3 days, after which the food deprivation schedule and behavioural experiments were initiated.

Apparatus

Eight identical operant chambers (40 cm wide × 30 cm deep × 35 cm high; Imetronic, Pessac, France) were used in this experiment. They were individually enclosed in ventilated, sound- and light-attenuating wooden cubicles. Each chamber had a stainless-steel grid floor above a sawdust tray. The left panel of the chamber featured a recessed food magazine in its centre and a retractable lever (2×4×1 cm) located on the left of the magazine, 7 cm above the grid floor. An external food dispenser delivered calibrated rodent formula pellets (Bioserv, NJ) into the magazine. All experiments were designed and controlled from a PC with real-time software (Imetronic, Pessac, France).

Behavioural procedures

During two daily magazine training sessions, rats were accustomed to the operant chambers and allowed to consume the food pellets used as rewards. During each 30 min session, 30 food pellets were delivered into the magazine at pseudo-random intervals. No lever was presented at this stage.

For initial lever press training, each training session began with the illumination of the houselight and insertion of the lever and ended with the retraction of the lever and turning off of the houselight. The rats were first trained for 2 sessions under a fixed interval 20 s (FI-20s) schedule, in which food pellets could be obtained every 20 s by pressing the lever. A session ended as soon as 50 rewards were earned or after 45 minutes had expired. The rats were then switched to a single session of variable-interval 30 s schedule (VI-30 range 7.5–75 s), under which a pellet became available every 30 s on average if the rat then pressed the lever, then to 4 sessions under a variable-interval 60 s schedule (VI-60 range 15–150 s). These sessions ended as soon as 30 rewards were earned or 45 minutes had expired. Throughout instrumental training, although only some of the lever presses were rewarded, food was never delivered in the absence of lever pressing, thereby ensuring that a positive contingency was in effect (Figure 1a).

Figure 1. A) During instrumental training (positive contingency), the lever becomes inactive for a variable interval (white rectangle) following each reward delivery. The first lever press after this interval triggers an immediate reward. No reward occurs in the absence of lever press. B) During omission training, rewards are delivered following a 20 s delay without lever press (black rectangle). A lever press during the delay resets the delay. Consecutive rewards are delivered at 20 s intervals in the absence of lever press activity (negative contingency). C) During yoked training, rewards are synchronized to the rewards of another rat trained in omission, regardless of the yoked rat's activity. Rewards may occur at any time with respect to lever presses (null contingency).

After the initial training, the action-outcome contingency was changed. Each group of rats was divided in two and each half was switched to one of two contingency conditions, either negative or null (Figure 1b, c). Within each lesion group, rats were associated in pairs, with one rat of each pair assigned to each condition. Within each pair, the rat in the negative contingency condition (omission schedule) obtained a pellet whenever 20 s had elapsed without the rat pressing the lever. The other rat in the pair (yoked) received pellets delivered at exactly the same instants, irrespective of its behaviour (null contingency condition). Thus, in the negative contingency condition, food deliveries occurred well apart from lever pressing, i.e. 20 s after the previous lever press or pellet delivery, whereas in the null contingency condition, reward delivery could occur at any time with respect to lever pressing. Importantly, this procedure equated the number of food pellets delivered in each group.
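The omission and yoked schedules described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the control software used in the study: the function names, the 100 s session length and all press times are hypothetical, and a press falling exactly on a reward instant is arbitrarily treated as occurring after the reward.

```python
import bisect


def omission_reward_times(press_times, session_s, delay_s=20.0):
    """Reward times under an omission schedule.

    A reward is delivered whenever delay_s seconds elapse without a lever
    press; each press or reward restarts the timer.
    """
    presses = sorted(t for t in press_times if 0.0 <= t < session_s)
    rewards = []
    timer_start = 0.0
    i = 0
    while timer_start + delay_s <= session_s:
        due = timer_start + delay_s
        if i < len(presses) and presses[i] < due:
            timer_start = presses[i]   # a press resets the delay
            i += 1
        else:
            rewards.append(due)        # delay elapsed without a press
            timer_start = due
    return rewards


def press_to_reward_delays(press_times, reward_times):
    """For each reward, time elapsed since the most recent lever press."""
    presses = sorted(press_times)
    delays = []
    for r in reward_times:
        k = bisect.bisect_right(presses, r)
        if k > 0:                      # skip rewards with no earlier press
            delays.append(r - presses[k - 1])
    return delays


# Hypothetical press times (s) for one pair of rats in a 100 s session.
omission_presses = [5.0, 10.0, 55.0]
yoked_presses = [29.0, 49.5, 74.0]

# The yoked rat receives rewards at the omission rat's reward times,
# whatever its own behaviour.
rewards = omission_reward_times(omission_presses, session_s=100.0)
omission_delays = press_to_reward_delays(omission_presses, rewards)
yoked_delays = press_to_reward_delays(yoked_presses, rewards)
```

In this toy example both rats receive the same four rewards, but every press-to-reward delay is at least 20 s for the omission rat, whereas the yoked rat can experience delays of a second or less.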

On the following day, the rats were returned to the operant chambers for a 30 min test session, in which the lever was inserted, but no food was delivered.

Histology

After behavioural testing, animals received a lethal dose of sodium pentobarbital and were perfused transcardially with saline (0.9%) followed by 10% buffered formalin. The brains were removed and post-fixed in a formalin-saccharose 30% solution for 2 days, then were frozen and cut into 40 µm-thick coronal sections with a freezing microtome (−20°C). The sections were collected onto gelatin-coated slides and dried before being stained with thionine. Histological analysis was performed under the microscope by an experimenter (F.E.) blind to lesion condition. Sections were examined for gross morphological changes, gliosis and scarring. The extent of lesions was reconstructed in reference to the atlas of Paxinos and Watson [14].

Data analysis

Rates of lever pressing and magazine entries were calculated over blocks of 5 min of training and over the whole test session. Statistical analyses were performed with StatView® software (SAS Institute Inc.) using ANOVA and Student-Newman-Keuls post-hoc tests, with lesion (Sham, mPFC) and condition (negative vs. null contingency) as between-subject factors and blocks as repeated measures when appropriate. The alpha value for rejection of the null hypothesis was 0.05 throughout. Complementary analyses and modelling of action sequences from this experiment are provided in a related paper [7].

Results

Histology

Figure 2 illustrates the extent of the mPFC lesion. For histological analysis, significant cell loss or gliosis in the targeted area and no significant damage to the neighbouring structures were used as criteria for inclusion.

Figure 2. a) Minimal (black area) and maximal (gray area) mPFC lesions affected both the prelimbic and infralimbic parts of the medial prefrontal cortex. b) Photomicrograph of a typical mPFC lesion, illustrating cell loss (outlined by arrowheads). Cg1: Cingulate Cortex 1; PL: Prelimbic Cortex; IL: Infralimbic Cortex.

The lesions were found acceptable in 12 rats. As shown, the damaged area primarily concerned the prelimbic and infralimbic cortices. In four rats, the rostral part of the anterior cingulate cortex was also affected. Two rats were discarded because they had only unilateral lesions, and their two yoked animals were therefore also discarded from the statistical analysis. The final group sizes were therefore as follows: SHAM-negative (n = 8), SHAM-null (n = 8); mPFC-negative (n = 6), mPFC-null (n = 6).

Instrumental training and baseline responding

Lesioned and control rats acquired the initial instrumental response at identical rates (Fs<1 for all effects involving groups) and attained a plateau in instrumental performance after three sessions of training (data not shown).

By the end of training (last VI-60 session), there was no difference in the levels of lever press responding between animals allocated to the various groups. The mean rates of responding were as follows: SHAM-negative: 13.3 responses/min; SHAM-null: 15.7; mPFC-negative: 15.3; mPFC-null: 15.2. An ANOVA with group (SHAM, mPFC) and protocol (negative, null) as factors revealed no effect of any of the factors (Fs<1). Thus, subsequent stages of the experiment were not biased by any difference in baseline responding.

Changes in action-outcome contingency

Figure 3a shows the effect of contingency changes on instrumental performance. Sham-operated animals (left panel) gradually learned to withhold lever pressing under the negative contingency condition where lever pressing prevented food delivery, as well as under the null contingency condition where lever pressing had no effect. By contrast, rats with lesions of the mPFC maintained a high level of responding throughout training in the null contingency condition (right panel). However, the mPFC-lesioned animals were able to reduce their responding appropriately in the negative contingency condition, like sham-operated animals.

Figure 3. a) Evolution of the rate of lever pressing during the session of contingency change in blocks of 5 min (mean + s.e.), according to lesion and condition. b) Final rate of response at test. Data are expressed as mean rates of responding. c) Evolution of the rate of entries into the empty magazine during the session of contingency change (mean + s.e.). d) Evolution of the absolute number of rewards delivered during the session of contingency change in blocks of 5 min, according to lesion. Equal rewards are delivered in both conditions. Negative: negative contingency condition; Null: null contingency condition.

Statistical analysis confirmed this description of the data. A mixed analysis of variance with between-subject factors ‘lesion’ (SHAM, mPFC) and ‘condition’ (negative, null) and the within-subject factor ‘acquisition’ (block of 5 min) revealed a significant effect of acquisition (F(5,120) = 21.4, P<0.001). More importantly, the analysis showed the existence of a significant three-way interaction (F(5,120) = 2.75, P = 0.022), indicating that contingency changes differentially affected lesioned vs. intact rats in the negative and null conditions.

Separate analysis of each lesion group indicated that sham-operated animals showed an effect of acquisition (F(5,70) = 14.7, P<0.001) but no acquisition×condition interaction (F<1). Their performance was therefore comparable in both contingency conditions. In contrast, a similar analysis performed in mPFC-lesioned rats showed a significant effect of acquisition (F(5,50) = 8.34, P<0.001) but also an acquisition×condition interaction (F(5,50) = 4.53, P = 0.002), with post-hoc comparisons revealing that performance under the negative and null contingencies differed at the end of the session (P<0.05) but not at the beginning. Indeed, under the null contingency condition, there was no significant decrease in instrumental performance in the mPFC-lesioned group (P>0.1), in contrast to sham-operated rats (P<0.001).

Delays between lever pressing and food delivery were consistently high in the negative contingency situation, being always 20 s to the first reward delivery, or more if no lever press occurred between rewards. By contrast, in the null contingency situation, these delays were quite variable and sometimes quite short, with a gradually decaying distribution extending to about 15 s. In this situation, mPFC-lesioned rats experienced on average shorter action-reward intervals than control rats (harmonic mean: 0.94 s vs. 1.54 s), largely due to their higher response rate.
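The harmonic mean used above gives more weight to short intervals than the arithmetic mean does, which is why it is informative when a few very brief action-reward delays matter. A minimal sketch with made-up delays (the values are assumptions for illustration, not data from this experiment):

```python
from statistics import harmonic_mean, mean

# Hypothetical action-reward delays in seconds (illustrative only).
delays_s = [0.5, 1.0, 1.0, 21.0]

arithmetic = mean(delays_s)          # pulled upward by the single long delay
harmonic = harmonic_mean(delays_s)   # dominated by the short delays
```

With these values the harmonic mean stays close to 1 s, whereas the arithmetic mean is several seconds.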

The results of the test without food delivery are shown in Figure 3b. Again, mPFC-lesioned rats displayed an abnormally high rate of lever pressing in the null contingency condition, and this observation was supported by a significant interaction between ‘condition’ and ‘lesion’ (F(1,24) = 7.05, P = 0.014).

Figure 3d shows the gradual increase in food delivery in both groups during the session of adaptation to contingency changes. There was no difference in food delivery between groups (Fs<1). Thus, the difference of behaviour between mPFC-lesioned and control groups in the null contingency condition cannot be attributed to a difference in the density of reward.

Figure 3c shows the mean rate of visits to the empty magazine during the contingency-change session. mPFC-lesioned rats displayed significantly lower magazine activity (F(1,24) = 4.50, P = 0.045). There was no evidence of significant changes in magazine entries across the session (F(5,120) = 1.15, P>0.1), nor of any other effect related to condition or lesion (largest F = 1.40, P>0.1). In order to further assess the role of response competition in reducing lever pressing, we evaluated the correlation between rates of magazine entries and rates of lever pressing for each rat over blocks of 2 min (15 measure pairs per rat). The Pearson correlation coefficients between these two measures were highly variable within each group (mPFC-negative: −0.80 to 0.52; mPFC-null: −0.38 to 0.53; SHAM-negative: −0.41 to 0.87; SHAM-null: −0.65 to 0.90). Only three negative correlations and five positive correlations were significant (two-tailed threshold: ±0.514). Thus, the decrease in lever-pressing performance, when present, was not necessarily associated with an increase in other behaviours such as waiting at the food magazine.
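The significance threshold quoted above (magnitude 0.514) is consistent with the standard critical value of Pearson's r for 15 pairs at a two-tailed alpha of 0.05. A minimal sketch of that check follows; the function name is hypothetical, and the critical t value for 13 degrees of freedom (2.160) is hard-coded from a standard t table rather than computed.

```python
import math


def critical_pearson_r(n_pairs, t_critical):
    """Smallest |r| that reaches significance for n_pairs observations.

    Uses r = t / sqrt(t^2 + df) with df = n_pairs - 2, where t_critical is
    the two-tailed critical value of Student's t for that df.
    """
    df = n_pairs - 2
    if df <= 0:
        raise ValueError("need at least three pairs")
    return t_critical / math.sqrt(t_critical ** 2 + df)


# 15 blocks of 2 min per rat; t(13) at two-tailed alpha = 0.05 is 2.160
# (standard table value, assumed here rather than derived).
r_crit = critical_pearson_r(n_pairs=15, t_critical=2.160)
```

A correlation is then significant when its absolute value exceeds `r_crit`, in either the positive or the negative direction.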

In all groups of rats, the occurrence of a non-contingent reward elicited a visit to the magazine and consumption of the food pellet, after which lever pressing resumed. We found no evidence for a differential pattern of response in the lesioned group, as would be expected if food delivery served to energize instrumental responding specifically in this group.

Experiment 2

The negative and null contingency conditions were characterized by different distributions of delays between lever pressing and reward delivery. Thus, the detection of changes in contingency might depend upon the degree of temporal contiguity between response and outcome. The aim of Experiment 2 was therefore to determine whether mPFC-lesioned rats were impaired in detecting changes in contiguity between an action and its outcome.