Author manuscript; available in PMC: 2021 May 1.
Published in final edited form as: Neurobiol Learn Mem. 2020 Feb 14;171:107189. doi: 10.1016/j.nlm.2020.107189

Inactivation of the Prelimbic Cortex Attenuates Operant Responding in Both Physical and Behavioral Contexts

Callum M P Thomas 1, Eric A Thrailkill 2, Mark E Bouton 2, John T Green 2
PMCID: PMC7198320  NIHMSID: NIHMS1566306  PMID: 32061995

Abstract

The present experiments aimed to expand our understanding of the role of the prelimbic cortex (PL) in the contextual control of instrumental behavior. Research has previously shown that the PL is involved when the “physical context,” or chamber in which an instrumental behavior is trained, facilitates performance of the instrumental response (Trask, Shipman, Green, & Bouton, 2017). Recently, evidence has suggested that when a sequence of two instrumental behaviors is required to earn a reinforcing outcome, the first response (rather than the physical chamber) can be the “behavioral context” for the second response (Thrailkill, Trott, Zerr, & Bouton, 2016). Could the PL also be involved in this kind of contextual control? Here rats first learned a heterogeneous behavior chain in which the first response (i.e., pressing a lever or pulling a chain) was cued by a discriminative stimulus and led to a second stimulus which cued a second response (i.e., pulling a chain or pressing a lever); the second response led to a sucrose reward. When the first and second responses were tested in isolation in the training context, pharmacological inactivation of the PL resulted in a reduction of the first response, but not the second response. When the second response was performed in the “context” of the first response (i.e., as part of the behavior chain), however, PL inactivation reduced the second response. Overall, these results support the idea that the PL is important for mediating the effects of a training context on instrumental responding, whether the context is physical or behavioral.

Introduction

Operant behaviors are weakened when they are tested outside the physical context (e.g., the Skinner box) in which they are learned (e.g., Thrailkill & Bouton, 2015a). The role of context in behavioral control has generally been attributed to the prefrontal cortex (PFC) (Miller & Cohen, 2001). Notably, the prelimbic subregion of the PFC (PL) has been identified as being involved in context-appropriate responding in choice paradigms (Haddon & Killcross, 2006; Marquis, Killcross, & Haddon, 2007) and context-based renewal of extinguished operant behaviors (Bossert et al., 2011; Eddy, Todd, Bouton, & Green, 2016; Fuchs, Eaddy, Su, & Bell, 2007; Fuchs et al., 2005; Palombo et al., 2017; Trask, Shipman, Green, & Bouton, 2017; Willcocks & McNally, 2013). Our work has suggested that rather than having a general role in renewal of extinguished behaviors, the PL selectively promotes the performance of operant behaviors in the context in which they are learned (i.e., acquisition context) (Trask et al., 2017). In that research, we demonstrated that pharmacological inactivation of the PL with baclofen and muscimol (GABA receptor agonists) reduced the response rate of an operant response when it was tested in the context in which it had been trained but not when tested in a non-training context (Trask et al., 2017). Moreover, this inactivation could also block ABA renewal of extinguished behavior (acquisition in context A, extinction in context B, renewal test in context A) but had no effect on ABC renewal (acquisition in context A, extinction in context B, renewal test in context C).

Recent evidence suggests that when two responses must be made in sequence to obtain a reinforcer, only the first response is sensitive to the physical context; the second response is sensitive to the performance of the first response (Thrailkill, Trott, Zerr, & Bouton, 2016), what can be termed the “behavioral context”. Briefly, in a heterogeneous instrumental chain procedure, a discriminative stimulus (S1) signals the rat to perform the first response (R1), which terminates S1 and initiates a second stimulus (S2) that signals the rat to perform the second response (R2), which terminates S2 and leads to a food pellet reward (Thrailkill & Bouton, 2015b). Each component of the chain can be tested separately, by presenting S1 or S2 and allowing the opportunity to make either R1 or R2. Using this procedure, we found that while R1 was sensitive to a change in the physical context (i.e., its rate decreased when tested in a non-trained physical context after acquisition), R2 was not (Thrailkill et al., 2016). In fact, R2 appeared to be sensitive to a change in the behavioral context (i.e., the response rate decreased when R2 was tested separately from R1). Further, after extinction of R2, R2 renewed when tested back in the behavioral context in which it had been trained (R1-R2) but not when it was tested in a non-trained behavioral context (R3-R2) (Steinfeld, Alcalá, Thrailkill, & Bouton, 2019; Thrailkill et al., 2016).

Here, we test the hypothesis that the PL is a key brain structure for the processing of not only physical context but also behavioral context as defined above. To that end, we trained rats to perform a discriminated heterogeneous instrumental chain and then tested both R1 and R2 in isolation from each other in the physical acquisition context (Context A), and then the entire chain and R2 in isolation in a non-trained physical context (Context B). In that way, R1 and R2 were tested both within their respective acquisition contexts (Context A for R1; after R1 for R2) and outside their respective acquisition contexts (Context B for R1; in the absence of R1 for R2). We hypothesized that PL inactivation would reduce R2 when tested with R1 (i.e., within its acquisition context) but not when R2 was tested without R1 (i.e., outside of its acquisition context). We also hypothesized that PL inactivation would reduce R1 when tested in Context A but not when tested in Context B, similar to our previous finding examining a single operant (Trask et al., 2017).

Method

Subjects.

The subjects were 32 male Wistar rats purchased from Charles River Laboratories (St. Constance, Quebec, CAN). They were 59-63 days old upon arrival and were individually housed in a room maintained on a 12:12 h light:dark cycle. Experimentation took place during the light period of the cycle. After recovery from the surgical procedures, food restriction was introduced in order to reduce body weights to 90% of baseline. Food restriction was maintained for the duration of the experiment.

Surgery.

Once acclimated to the colony room, rats were anesthetized with isoflurane and stereotaxic surgery was performed to bilaterally implant guide cannulae (26 gauge, Plastics One) in the PL region of the medial prefrontal cortex (mPFC). Rats were given 5.0 mg/kg of carprofen for analgesia both during surgery and 1 d postoperatively. During surgery, bupivacaine was also administered as a local anesthetic (0.15 ml) and 1 ml of lactated Ringer’s solution was administered for hydration. Using a 12 degree angle in the medio-lateral plane, guide cannulae were lowered into the brain at + 3.0 mm anterior from bregma, +/− 0.75 mm from midline and − 3.0 mm ventral from bregma (infusions were 1 mm below the guide cannula tips). After recovery, a new baseline weight was taken and rats began food restriction. One rat did not recover from anesthesia.

Apparatus.

Each Skinner box was housed in its own sound attenuation chamber. All boxes were of the same design (Med Associates model ENV008-VP). They measured 30.5 x 24.1 x 21.0 cm (l x w x h). A recessed 5.1 cm x 5.1 cm food cup was centered in the front wall 2.5 cm above the level of the floor. Two response manipulanda were used: a retractable lever (Med Associates model ENV112CM) was positioned to the left of the food cup, and a chain was suspended from the ceiling to the right of the food cup. Two 28 V panel lights, positioned near the lever (approx. 2.5 cm above) and chain (approx. 1.5 cm to the right), were used as discriminative stimuli (SD) for the two manipulanda. Each chamber was illuminated by one 7.5 W incandescent bulb mounted to the ceiling of the sound attenuation chamber, 34.9 cm from the grid floor at the front wall of the chamber. Ventilation fans provided background noise of 65 dBA.

In one set of boxes, the side walls and ceiling were made of clear acrylic plastic, whereas the front and rear walls were made of brushed aluminum. The floor was made of stainless steel grids (0.48 cm diameter) staggered such that odd- and even-numbered grids were mounted in two separate planes, one 0.5 cm above the other. This set of boxes had no distinctive visual cues on the walls or ceilings of the chambers. A dish containing 5 ml of lemon-scented Pine-Sol (Clorox) was placed outside of each chamber near the front wall.

The second set of boxes was similar to the lemon-scented boxes, except for the following features. In each box, one side wall had black diagonal stripes, 3.8 cm wide and 3.8 cm apart. The ceiling had similarly spaced stripes oriented in the same direction. The grids of the floor were mounted on the same plane and were spaced 1.6 cm apart (center-to-center). A distinct odor was continuously presented by placing 5 ml of pine-scented Pine-Sol (Clorox) in a dish outside the chamber.

In either set of boxes, the reinforcer was a 45 mg sucrose-based food pellet (5-TUT: 1811251, TestDiet) delivered to the magazine. The apparatus was controlled by computer equipment located in an adjacent room.

Procedure.

During training, one session was conducted each day, 7 days a week. Animals were handled daily and maintained at their target weight with supplemental feeding at approximately 2 hr post-session when necessary.

Acquisition.

On the first day of training, all rats received a session of magazine training consisting of 30 free pellets delivered on a variable time (VT) 30-s schedule. Training of R2 then began on Day 2. Only the R2 manipulandum (chain or lever, counterbalanced) was present. A response on the R2 manipulandum was reinforced according to a fixed ratio (FR) 1 schedule until 20 pellets were earned, then an additional 30 pellets could be earned according to FR 1 during presentations of the SD (S2); responses in the absence of the SD were not reinforced. Completing the FR 1 requirement turned S2 off, delivered a pellet, and initiated a variable 45-s intertrial interval (ITI). S2 was always the panel light near the R2 manipulandum. If a response was not made during S2, S2 terminated after 60 s and a new ITI was initiated. On Day 3, there were 30 presentations of S2 with the FR 1 requirement in effect. Beginning on Day 4, the second manipulandum (R1) was introduced to the chamber, and rats now received 30 presentations of S1 (the panel light near R1). A response on R1 turned off S1 and turned on S2, which set the occasion for reinforcement of a single R2 with a food pellet. There was a variable 45-s ITI. On Days 4 and 5, the response requirement for each link was increased to random ratio (RR) 2. The requirement was further increased to RR 4 for Days 6–13. During this period, the maximal duration of each S was gradually reduced from 60 s to 20 s.
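The random ratio requirement used in the later acquisition sessions can be illustrated with a short simulation. This is a minimal sketch and not the control program used in the experiment: it assumes that an RR n schedule lets each response satisfy the requirement independently with probability 1/n, and the function name `responses_to_complete` is hypothetical.

```python
import random

def responses_to_complete(ratio, rng):
    """Count responses until a random ratio (RR) requirement is met.

    Assumption: each response independently meets the requirement
    with probability 1/ratio, so a ratio of 1 behaves like FR 1.
    """
    count = 0
    while True:
        count += 1
        if rng.random() < 1.0 / ratio:
            return count

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [responses_to_complete(4, rng) for _ in range(10000)]
mean_rr4 = sum(samples) / len(samples)  # close to 4 responses on average
```

Under this assumption, an RR 4 link requires four responses on average, but the number needed on any single trial is unpredictable.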

Tests A.1 and A.2.

On Days 14 and 15, rats received tests, in the training context (Context A), of R1 alone (Test A.1) and R2 alone (Test A.2) in counterbalanced order. Between 30 minutes and one hour prior to each test, each rat received a bilateral intra-PL, 0.5 μL infusion of either baclofen/muscimol (1.0 mM/0.1 mM; Sigma-Aldrich) (Group 1) or vehicle (phosphate-buffered saline) (Group 2) at a rate of 0.25 μL per minute using a microinfusion pump. Infusions were made by inserting internal cannulae into guide cannulae; the internal cannulae tips protruded 1 mm below the guide cannulae tips. A given rat received the same infusate each day. In each test session, rats received 10 trials with both response manipulanda available. The R1 test consisted of 10 presentations of S1, and the R2 test consisted of 10 presentations of S2. Responses on the appropriate manipulandum turned off the corresponding S according to RR 4 but otherwise had no programmed consequences (Thrailkill & Bouton, 2016).

Reacquisition.

Since Tests A.1 and A.2 consisted of tests of each response in extinction, rats received 4 sessions of reacquisition to return response rates to a baseline similar to day 13 of acquisition. These sessions were identical to the last session of the acquisition phase and consisted of 30 trials of the behavioral chain with an RR 4 requirement and maximal stimulus duration of 20 s. On each of these 4 days, rats also received a 30-min exposure session to a second context (Context B). Context B for each rat was a location-matched box in the set in which they had not been trained (e.g., a rat trained in the upper left lemon-scented box was exposed to the upper left pine-scented box). Order of daily reacquisition sessions and exposure sessions was counterbalanced such that on each day half the rats received acquisition in their Context A and then received exposure to their Context B (i.e., A-B) while the other half received the opposite (i.e., B-A), and the order alternated each day (e.g., a rat received A-B on day one and B-A on day two).
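The alternating session order can be written out as a small schedule generator. This is only an illustrative sketch; the function name `session_order` and the string labels are hypothetical and not taken from the experimental software.

```python
def session_order(start_order, n_days):
    """Alternate 'A-B' and 'B-A' across days, beginning with start_order."""
    other = {"A-B": "B-A", "B-A": "A-B"}
    orders = [start_order]
    for _ in range(n_days - 1):
        orders.append(other[orders[-1]])
    return orders

# Half of the rats would start with A-B and the other half with B-A.
first_half = session_order("A-B", 4)
second_half = session_order("B-A", 4)
```

On every day the two halves receive opposite orders, and each rat experiences both orders equally often across the 4 reacquisition days.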

Tests B.1 and B.2.

On Days 20 and 21, rats received a test of the full chain (R1-R2; Test B.1) and R2 alone (Test B.2), in counterbalanced order, in the non-trained context, Context B. As before, 30-60 minutes prior to each test, each rat received a 0.5 μL bilateral intra-PL infusion of either baclofen/muscimol (1.0 mM/0.1 mM; Sigma-Aldrich) or vehicle (PBS). Rats received the opposite infusate from what they had received prior to Tests A.1 and A.2 (e.g., rats that had received B/M prior to Tests A.1 and A.2 received vehicle prior to Tests B.1 and B.2). In each test session, rats received 10 trials with both response manipulanda available. Trials of the B.1 test were similar to those of acquisition except that S2 presentation occurred after 20 s of S1 if R1’s RR 4 requirement was not met. Additionally, completion of the RR 4 requirement for R2 during S2 turned off S2 but otherwise had no programmed consequences. The R2 test (Test B.2) was identical to the previous R2 test (Test A.2) except that it took place in Context B.

Histology.

Following the final test, rats were injected with a lethal dose of sodium pentobarbital (150 mg/kg, i.p.) and then transcardially perfused with 0.9% saline and 10% buffered formalin. Following the perfusion, electrodes were inserted into the guide cannulae and protruded 1 mm below the guide cannula tip to match the infusion site. A marking lesion was created by 10 sec of 0.1 mA current delivery. Brains were removed and stored in 10% buffered formalin. Three to 4 days before embedding in optimal cutting temperature (OCT) compound, brains were transferred to a 30% sucrose/10% buffered formalin solution for cryoprotection. The brains were then frozen in OCT compound and sectioned at 70 μm with a cryostat. Sections were mounted onto chrome alum subbed slides and dried. Tissue was stained with cresyl violet (for cell bodies) and Prussian blue (for marking lesions) and allowed to dry before being coverslipped with Permount. Cannula placement was then verified and compared to a rat brain atlas (Paxinos, 2014).

Data analysis.

All data were subjected to ANOVA or t tests where appropriate, using SPSS 26. The rejection criterion was set to p < .05. In the analysis of the data, several different scores were calculated, including response rate elevation scores and proportion baseline scores. Response rate elevation scores are the response rate for an individual response (e.g., R1) during the 30 seconds prior to stimulus onset subtracted from the response rate for that response during the stimulus period and therefore represent the change in the response rate that can be attributed to the stimulus being present. Our analysis of elevation scores was not complicated by differences in pre-stimulus period responding. These data are omitted for brevity but are available upon request. Proportion baseline scores are the elevation score for an individual response during the test divided by the elevation score for that response on the most recent day of training (i.e., elevation scores for Tests A.1 and A.2 were divided by elevation scores on day 13 of acquisition and elevation scores for Tests B.1 and B.2 were divided by elevation scores on day 19).
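The two derived scores defined above can be expressed as a short calculation. This is a minimal sketch with hypothetical numbers; the function names and the example rates are illustrative assumptions, and the analyses reported here were run in SPSS.

```python
def elevation_score(stimulus_rate, prestimulus_rate):
    """Response rate during the stimulus minus the rate in the 30 s before it."""
    return stimulus_rate - prestimulus_rate

def proportion_baseline(test_elevation, training_elevation):
    """Test elevation score divided by the elevation score from the
    most recent training day."""
    if training_elevation == 0:
        raise ValueError("training elevation score is zero; ratio undefined")
    return test_elevation / training_elevation

# Hypothetical response rates for one rat, for illustration only.
training = elevation_score(stimulus_rate=30.0, prestimulus_rate=2.0)  # 28.0
test = elevation_score(stimulus_rate=16.0, prestimulus_rate=2.0)      # 14.0
score = proportion_baseline(test, training)                           # 0.5
```

A proportion baseline score of 1.0 would mean that responding at test matched the last training session, and values below 1.0 indicate a reduction.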

Results

A total of 12 rats were excluded from data analysis. One rat did not recover from anesthesia. Seven rats were excluded due to misplaced cannulae or excessive cannulae-related tissue damage. Four rats (Rats 7, 8, 25, 31) were excluded from the data set because in at least one test, their proportion baseline score was more than two standard deviations above the mean (z > 2) (Rat 7: Test B.1 z = 3.03, Rat 8: Test A.2 z = 2.97, Test B.1 z = 3.11, Rat 25: Test A.1 z = 2.05, Rat 31: Test A.2 z = 2.58). Data from the remaining 20 rats (10 that had received B/M in Tests A.1 and A.2 and vehicle in Tests B.1 and B.2 and 10 that had received the opposite) were analyzed.
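The z > 2 exclusion rule can be sketched as follows. The scores are hypothetical, and the sketch assumes that z scores were computed against the mean and sample standard deviation of all scores in a given test; the text does not specify whether the reference distribution was the whole sample or a single treatment group.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize scores against the sample mean and sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def flag_high_outliers(values, cutoff=2.0):
    """Return indices of scores more than `cutoff` SDs above the mean."""
    return [i for i, z in enumerate(z_scores(values)) if z > cutoff]

# Hypothetical proportion baseline scores for one test; the last is extreme.
scores = [0.40, 0.50, 0.45, 0.55, 0.60, 0.50, 0.48, 0.52, 0.47, 2.00]
flagged = flag_high_outliers(scores)
```

Only scores above the mean are flagged, which matches the one-sided criterion described in the text.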

Training.

Rats acquired the discriminated heterogeneous chain without incident. The results of the training phase are shown in Figure 1A. Response rate elevation scores were analyzed across the entire training phase with a Treatment (B/M, Vehicle) X Response (S1R1, S2R2) X Session (12) repeated-measures ANOVA. Mauchly’s test of sphericity indicated that the assumption of sphericity had been violated for both Session, χ²(65) = 109.7, p < .01, and Response X Session, χ²(65) = 116.7, p < .01, so a Greenhouse-Geisser correction was applied. Both R1 and R2 response rate elevation scores increased across the 12 sessions, as indicated by a significant main effect of session, F(11, 97.0) = 30.68, MSE = 106.85, p < .001. R2 elevation scores were additionally found to be greater than R1 elevation scores, as indicated by a significant main effect of response, F(1, 18) = 79.74, MSE = 1172.99, p < .001. No main effects or interactions involving treatment reached significance, largest F = 1.06.

Figure 1.

(A) Acquisition of the chained instrumental responses (R1 and R2) over 8 initial and then 4 reacquisition training sessions. Elevation scores represent the difference between responses on the given manipulandum during its corresponding stimulus (e.g., S1 and R1) and the prestimulus period. (B) Response rates during each stimulus period during sessions 8 and 12, the final training sessions prior to test sets A and B, respectively. Error bars indicate standard error of the mean. Group designation (Group 1 = black line and circles, Group 2 = red line and squares) indicates when animals received the baclofen/muscimol and saline infusions, with Group 1 receiving baclofen/muscimol prior to Tests A.1 and A.2 and saline prior to Tests B.1 and B.2, while Group 2 received the opposite treatment. Response rates and elevation scores are distinguished throughout using open (R1) or closed (R2) circles and squares.

Response rates during each of the three stimulus periods on day 13 (see Figure 1B), the last day of training prior to tests A.1 and A.2, were analyzed with separate Treatment (B/M, Vehicle) X Response (R1, R2) ANOVAs to confirm that R1 and R2 responding were controlled by their respective stimuli, S1 and S2. During the 30 seconds prior to S1 onset (pre-S1), there was a significant main effect of response, F(1, 18) = 12.93, MSE = 7.62, p = .002, indicating that R1 had a higher baseline than R2; however, response rates remained low for both. During the S1 presentation period, there was a significant main effect of response, F(1, 18) = 50.57, MSE = 35.29, p < .001, indicating that R1 responding was greater than R2 responding. During the S2 presentation period, a significant main effect of response, F(1, 18) = 187.40, MSE = 147.22, p < .001, indicated that R2 responding was now greater than R1 responding. None of the effects or interactions involving treatment (a dummy variable at this point) reached significance, largest F = 2.80, p > 0.10.

The same analyses were performed on response rates during each stimulus period on day 19, the last day of training prior to Tests B.1 and B.2. During the 30 seconds prior to S1 onset (pre-S1), there was a significant main effect of response, F(1, 18) = 12.31, MSE = 31.31, p = .003, indicating that baseline R1 responding was greater than R2 responding; however, response rates remained low for both. During the S1 presentation period, there was a significant main effect of response, F(1, 18) = 143.20, MSE = 73.11, p < .001, indicating that R1 responding was greater than R2 responding. During the S2 presentation period, a significant main effect of response, F(1, 18) = 175.44, MSE = 172.58, p < .001, indicated that R2 responding was greater than R1 responding. None of the effects or interactions involving treatment reached significance, largest F = 1.25. During both tests, pre-S1 rates did not differ across treatment groups, indicating that use of elevation scores would not introduce a shift in the data based solely upon baseline rates.

Tests A.1 and A.2.

Given our a priori hypotheses, that R1 would be sensitive to PL inactivation when tested within its training context (i.e., in Context A; Test A.1) and R2 would be insensitive to PL inactivation when tested outside of its training context (i.e., without R1; Test A.2), the data from Tests A.1 and A.2 were analyzed separately, using independent-samples t-tests to compare responding in each test after intra-PL B/M infusion vs. vehicle infusion (see Figure 2A). As expected based on our prior work, PL inactivation prior to Test A.1 resulted in a significant reduction in R1 as a proportion of baseline responding (Mdiff = .362, t(18) = 3.12, p = .006), but did not affect R2 in Test A.2 (Mdiff = .0293, t(18) = 0.30, p = .773).
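The independent-samples comparison used for these tests can be sketched in a few lines. This is a generic pooled-variance Student's t calculation, not the SPSS procedure itself; the function name `pooled_t` and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (mean difference, t, degrees of freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(group_a) - mean(group_b)
    return diff, diff / se, df

# Hypothetical proportion baseline scores, 10 rats per group.
vehicle = [0.9, 1.0, 0.8, 1.1, 0.7, 0.9, 1.0, 0.8, 1.2, 0.6]
inactivated = [0.5, 0.6, 0.4, 0.7, 0.3, 0.5, 0.6, 0.4, 0.8, 0.2]
m_diff, t_stat, df = pooled_t(vehicle, inactivated)
```

With 10 rats per group the test has 18 degrees of freedom, as in the reported statistics. Because t equals the mean difference divided by its standard error, the reported Mdiff = .362 with t(18) = 3.12 implies a standard error of roughly .116.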

Figure 2.

(A) Results of separate tests of R1 and R2 in Context A. (B) Results of the full chain test in Context B. (C) Results of the R2 test in Context B. All results are presented as proportion baseline scores (elevation score at test as a proportion of elevation score during each rat’s previous training session) with error bars indicating standard error of the mean. * indicates p < .05.

Tests B.1 and B.2.

Given our a priori hypothesis, that R2 would be sensitive to PL inactivation when tested within its training context (i.e., after R1 was performed; Test B.1) and R1 and R2 would be insensitive to PL inactivation when tested outside of their training contexts, context B (Test B.1) and in the absence of R1 (Test B.2), respectively, the data from Tests B.1 and B.2 were analyzed separately, using independent-samples t-tests to compare responding in each test after intra-PL B/M infusion vs. vehicle infusion (see Figures 2B and 2C). For analysis of R2 in Test B.1, Levene’s test indicated unequal variances (F = 10.21, p = .005), so degrees of freedom were adjusted from 18 to 12.59. As hypothesized, PL inactivation prior to Test B.1 resulted in a statistically significant reduction in R2 as a proportion of baseline responding, (Mdiff = .130, t(12.59) = 2.25, p = .043); no effect of PL inactivation on R1 was found, (Mdiff = .002, t(18) = 0.02, p = .981). In Test B.2, there was no effect of PL inactivation (Mdiff = .057, t(18) = 0.95, p = .354).
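The adjustment of degrees of freedom under unequal variances can be illustrated with the Welch-Satterthwaite approximation. This sketch assumes that the reported adjustment (from 18 to 12.59) corresponds to the standard unequal-variance t-test; the function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    va, vb = variance(group_a) / n_a, variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df

# Hypothetical data, 10 scores per group, with very different spreads.
narrow = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
wide = [0.10, 0.90, 0.30, 0.70, 0.20, 0.80, 0.40, 0.60, 0.50, 0.50]
t_stat, df = welch_t(wide, narrow)

# With equal variances and equal group sizes, the adjusted df equals n_a + n_b - 2.
_, df_equal = welch_t(list(range(1, 11)), list(range(2, 12)))
```

The more the two group variances differ, the further the adjusted degrees of freedom fall below 18, down to a floor of 9 for two groups of 10.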

Histology.

Rat brain sections were examined to determine the infusion location. Briefly, the most ventral point of the cannula track or the epicenter of blue staining (Prussian blue) was determined to be the site of infusion. If an animal had at least one infusion site that lay outside the specified region of the PL, then it was excluded. Additionally, in some animals, evidence of significant infection within the target region was found and these animals were likewise excluded. The infusion locations for the remaining rats are shown in Figure 3.

Figure 3.

Infusion cannula tip locations within the prelimbic cortex region (areas 32 and 24) (Paxinos, 2014). Numbers represent distance from bregma.

Discussion

Here we investigated the involvement of the PL in behaviors which are dependent on contexts that are either physical or behavioral. The current results indicate that the inactivation of the PL produces the same effect on behaviors which are associated with behavioral contexts as those that are associated with physical contexts. Beyond adding to the list of PL functions, this finding significantly extends our understanding of the role of the PL in contextual processing because it suggests that the PL treats “context” in a broad sense rather than being specifically sensitive to the physical, external environment (Bouton, 1993, 2019).

The current experiment utilized a discriminated behavioral chain, which previous results have shown, and we replicate here, produces individual responses (R1 and R2) that are dependent on these two forms of context (physical and behavioral, respectively) (Thrailkill et al., 2016). The discriminated behavioral chain procedure also allows for the testing of the two responses in isolation from each other. Given the nature of a behavioral context, the current design allowed us to test each R both in and out of their respective contexts. First, we tested each R by itself in the physical training context (Context A), and then tested the R1-R2 chain and R2 by itself in a non-trained physical context (Context B). We show that pharmacological inactivation of the PL reduced R1 when tested by itself in Context A, but had no effect on R2. In contrast, when tested in context B, pharmacological inactivation of the PL reduced R2 when R2 was tested with R1 (i.e., as part of the behavior chain) but not when R2 was tested alone. PL inactivation also had no effect on R1 when R1 was tested in Context B. These results are consistent with the idea that PL inactivation selectively reduces responding of either R1 or R2 in the presence of their specific contexts, physical and behavioral respectively.

Behavior chaining procedures have been used to understand the relationship between drug-seeking (R1) and drug-taking (R2) in addiction-like behavior (e.g., Singer, Fadanelli, Kawa, & Robinson, 2018; Zapata, Minney, & Shippenberg, 2010). The literature on the neural substrates of behavior chains is, however, relatively sparse. To our knowledge, only a handful of previous studies have examined the neural substrates of behavior chains. Those studies have examined either non-discriminated chains, in which there is no S1 or S2 and R1 and R2 must simply be executed in sequence (e.g., Ostlund et al., 2009), or partially-discriminated chains in which the R2 manipulandum is inserted into the chamber following completion of a continuously-available R1. (In the latter method, there is no explicit S for R1, but R2 insertion can be thought of as an S2; e.g., Wassum et al., 2012.) The previous studies have shown that secondary motor cortex, dorsolateral striatum, and the nucleus accumbens are important brain regions in chained behavior (Ostlund, Winterbauer, & Balleine, 2009; Singer et al., 2018; Wassum, Ostlund, & Maidment, 2012; Yin, 2009, 2010; Zapata et al., 2010). In contrast, the present study utilized a discriminated behavioral chain, which was experimentally useful because it allows for the testing of each response independently, but may also provide a better model of some forms of behavioral chains in humans since various stimuli (visual, auditory, etc.) are often associated with drug seeking and taking. Though the fully discriminated procedure could alter the neural substrates involved in the performance of the behavioral chain, previous work has demonstrated the same PL inactivation effect, as we observed on R1, using a non-discriminated single response design (Trask et al., 2017).

In the current study, PL inactivation reduced R2 when tested in a chain with R1 but not when tested in isolation. Our interpretation of this result is that the PL is important for R2 when it is in its “behavioral context” (following R1). We cannot completely rule out the possibility that PL is important for R2 when it is tested with any preceding response, even one it was not trained with (e.g., R3). However, several studies have demonstrated that R2 is associated specifically with the R1 with which it is trained. For example, Thrailkill and Bouton (2015b; Experiment 3) trained rats on two separate discriminated chains, S1R1-S2R2 and S3R3-S4R4. Then, one group underwent extinction of S1R1. Subsequent tests of S2R2 and S4R4 revealed a reduction only in R2. Thrailkill et al. (2016; Experiment 4) also trained two separate discriminated chains (S1R1-S2R2; S3R3-S4R4). They then extinguished S2R2 and S4R4, and showed that R2 renewed when tested with R1, but not when tested with R3; R4 renewed when tested with R3 but not when tested with R1. These results lend confidence to our conclusion that the PL is important for R2 only when it is in a chain with its associated R1.

To our knowledge, these are the first results to demonstrate that the PL is capable of processing behavioral information and adjusting future behavior based on this input. Given the differences between physical and behavioral contexts, the findings could suggest that an even wider variety of contextual stimuli might be incorporated into behavioral outputs through the PL. While further research is needed to confirm this hypothesis, the implications of the idea would be significant. If it were the case, manipulation of the PL would be especially powerful in its ability to effect behavioral change, and prevent ABA renewal, since non-physical aspects of context may be difficult or impossible to change externally.

Table 1.

Experimental procedure. A and B refer to the physical contexts in which training or test occurred, while [B] indicates exposure to context B. + designates reinforcement, − designates nonreinforcement. B/M indicates baclofen/muscimol-induced inactivation of the PL, and VEH indicates vehicle control infusion.

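Group | Acquisition | Infusion | Tests A.1 and A.2 | Reacquisition | Infusion | Tests B.1 and B.2
1 | A:S1R1→S2R2+ | B/M | A:S1R1−, A:S2R2− | A:S1R1→S2R2+, [B] | VEH | B:S1R1→S2R2−, B:S2R2−
2 | A:S1R1→S2R2+ | VEH | A:S1R1−, A:S2R2− | A:S1R1→S2R2+, [B] | B/M | B:S1R1→S2R2−, B:S2R2−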

Acknowledgments

This research was supported by the University of Vermont Department of Psychological Science and by NIH Grant R01 DA033123 to MEB. EAT was supported by NIH Grant K01 DA044456.

References

  1. Bossert JM, Stern AL, Theberge FR, Cifani C, Koya E, Hope BT, & Shaham Y (2011). Ventral medial prefrontal cortex neuronal ensembles mediate context-induced relapse to heroin. Nat Neurosci, 14(4), 420–422. doi: 10.1038/nn.2758
  2. Bouton ME (1993). Context, time, and memory retrieval in the interference paradigms of Pavlovian learning. Psychol Bull, 114(1), 80–99. doi: 10.1037/0033-2909.114.1.80
  3. Bouton ME (2019). Extinction of instrumental (operant) learning: interference, varieties of context, and mechanisms of contextual control. Psychopharmacology (Berl). doi: 10.1007/s00213-018-5076-4
  4. Eddy MC, Todd TP, Bouton ME, & Green JT (2016). Medial prefrontal cortex involvement in the expression of extinction and ABA renewal of instrumental behavior for a food reinforcer. Neurobiol Learn Mem, 128, 33–39. doi: 10.1016/j.nlm.2015.12.003
  5. Fuchs RA, Eaddy JL, Su ZI, & Bell GH (2007). Interactions of the basolateral amygdala with the dorsal hippocampus and dorsomedial prefrontal cortex regulate drug context-induced reinstatement of cocaine-seeking in rats. Eur J Neurosci, 26(2), 487–498. doi: 10.1111/j.1460-9568.2007.05674.x
  6. Fuchs RA, Evans KA, Ledford CC, Parker MP, Case JM, Mehta RH, & See RE (2005). The role of the dorsomedial prefrontal cortex, basolateral amygdala, and dorsal hippocampus in contextual reinstatement of cocaine seeking in rats. Neuropsychopharmacology, 30(2), 296–309. doi: 10.1038/sj.npp.1300579
  7. Haddon JE, & Killcross S (2006). Prefrontal cortex lesions disrupt the contextual control of response conflict. J Neurosci, 26(11), 2933–2940. doi: 10.1523/JNEUROSCI.3243-05.2006
  8. Marquis JP, Killcross S, & Haddon JE (2007). Inactivation of the prelimbic, but not infralimbic, prefrontal cortex impairs the contextual control of response conflict in rats. Eur J Neurosci, 25(2), 559–566. doi: 10.1111/j.1460-9568.2006.05295.x
  9. Miller EK, & Cohen JD (2001). An integrative theory of prefrontal cortex function. Annu Rev Neurosci, 24, 167–202. doi: 10.1146/annurev.neuro.24.1.167
  10. Ostlund SB, Winterbauer NE, & Balleine BW (2009). Evidence of action sequence chunking in goal-directed instrumental conditioning and its dependence on the dorsomedial prefrontal cortex. J Neurosci, 29(25), 8280. doi: 10.1523/JNEUROSCI.1176-09.2009
  11. Palombo P, Leao RM, Bianchi PC, de Oliveira PEC, Planeta CDS, & Cruz FC (2017). Inactivation of the prelimbic cortex impairs the context-induced reinstatement of ethanol seeking. Front Pharmacol, 8, 725. doi: 10.3389/fphar.2017.00725
  12. Paxinos G, & Watson C (2014). Paxinos and Watson’s The Rat Brain in Stereotaxic Coordinates. San Diego: Elsevier Academic Press.
  13. Singer BF, Fadanelli M, Kawa AB, & Robinson TE (2018). Are cocaine-seeking "habits" necessary for the development of addiction-like behavior in rats? J Neurosci, 38(1), 60–73. doi: 10.1523/JNEUROSCI.2458-17.2017
  14. Steinfeld M, Alcalá JA, Thrailkill EA, & Bouton ME (2019). Renewal in a heterogeneous behavior chain: Extinction of the first response prevents renewal of a second response when it is separately extinguished and returned to the chain. Learn Motiv, 68, 101587.
  15. Thrailkill EA, & Bouton ME (2015a). Contextual control of instrumental actions and habits. J Exp Psychol Anim Learn Cogn, 41(1), 69–80. doi: 10.1037/xan0000045
  16. Thrailkill EA, & Bouton ME (2015b). Extinction of chained instrumental behaviors: Effects of procurement extinction on consumption responding. J Exp Psychol Anim Learn Cogn, 41(3), 232–246. doi: 10.1037/xan0000064
  17. Thrailkill EA, & Bouton ME (2016). Extinction of chained instrumental behaviors: Effects of consumption extinction on procurement responding. Learn Behav, 44(1), 85–96. doi: 10.3758/s13420-015-0193-y
  18. Thrailkill EA, Trott JM, Zerr CL, & Bouton ME (2016). Contextual control of chained instrumental behaviors. J Exp Psychol Anim Learn Cogn, 42(4), 401–414. doi: 10.1037/xan0000112
  19. Trask S, Shipman ML, Green JT, & Bouton ME (2017). Inactivation of the Prelimbic Cortex Attenuates Context-Dependent Operant Responding. J Neurosci, 37(9), 2317–2324. doi: 10.1523/JNEUROSCI.3361-16.2017
  20. Wassum KM, Ostlund SB, & Maidment NT (2012). Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol Psychiatry, 71(10), 846–854. doi: 10.1016/j.biopsych.2011.12.019
  21. Willcocks AL, & McNally GP (2013). The role of medial prefrontal cortex in extinction and reinstatement of alcohol-seeking in rats. Eur J Neurosci, 37(2), 259–268. doi: 10.1111/ejn.12031
  22. Yin HH (2009). The role of the murine motor cortex in action duration and order. Front Integr Neurosci, 3, 23. doi: 10.3389/neuro.07.023.2009
  23. Yin HH (2010). The sensorimotor striatum is necessary for serial order learning. J Neurosci, 30(44), 14719–14723. doi: 10.1523/JNEUROSCI.3989-10.2010
  24. Zapata A, Minney VL, & Shippenberg TS (2010). Shift from goal-directed to habitual cocaine seeking after prolonged experience in rats. J Neurosci, 30(46), 15457–15463. doi: 10.1523/JNEUROSCI.4072-10.2010
