Abstract
Operant behavior is typically organized into sequences of responses that eventually lead to a reinforcer. Response elements can be categorized as those that directly lead to reward consumption (i.e., a consumption response), and those that lead to the opportunity to make the consumption response (i.e., a procurement response). These responses often differ topographically and in terms of the discriminative stimuli that set the occasion for them. We have recently shown that extinction of the procurement response acts to weaken the specific associated consumption response, and that active inhibition of the procurement response is required for this effect. To expand the analysis of the associative structure of chains, the present experiments asked the reverse question of whether extinction of consumption behavior results in a decrease in the associated procurement response in a discriminated heterogeneous chain. In Experiment 1, extinction of consumption alone led to an attenuation of the associated procurement response only when rats were allowed to make the consumption response in extinction. Exposure to the consumption stimulus alone was not sufficient to produce weakened procurement responding. In Experiment 2, rats learned two distinct heterogeneous chains; extinction of one consumption response specifically weakened the procurement response associated with it. The results add to evidence suggesting that rats learn a highly specific associative structure in behavior chains, and emphasize the role of learning response inhibition in extinction.
Keywords: Heterogeneous behavior chains, instrumental learning, extinction, response inhibition
Operant behavior often involves chains of linked responses that are each required in order to produce a primary reinforcer. For instance, following the terminology used by Collier (1981), one operant response (a consumption response) can lead directly to the reinforcer, whereas a second operant response (a procurement response) can be required to access an opportunity to make the consumption response. Behavior chains often include explicit discriminative stimuli (SDs) for each response. Thus, a procurement SD sets the occasion for the procurement response, which produces a consumption SD. The consumption SD then sets the occasion for a consumption response and perhaps reinforces the preceding procurement response (i.e., as a conditioned reinforcer, Gollub, 1977). In the laboratory, such discriminated heterogeneous behavior chains can be arranged as a sequence of linked responses across different manipulanda signaled by distinct SDs. Translationally, they are analogous to the chains of different but linked behaviors that humans engage in when procuring and consuming food or drugs (Conklin, Robin, Perkins, Salkeld, & McClernon, 2008; Ostlund & Balleine, 2008).
Surprisingly little research has examined the associative processes that underlie the performance of discriminated heterogeneous chains. In addition, very little research has studied their extinction (but see Catlin & Gleitman, 1972; Fantino, 1965), which is important to understand on both theoretical and translational grounds. Recently, Thrailkill and Bouton (2015) reported a series of experiments with discriminated heterogeneous chains that began to address these issues by characterizing the effects of extinction of the procurement response on the associated consumption response. Rats learned that a procurement response (e.g., lever press) in a procurement SD led to a consumption SD that set the occasion for a consumption response (e.g., chain pull) that earned a food pellet. In the first experiment, rats received extinction of either the entire chain or the procurement response alone, and were then tested on the consumption response. The key finding was that extinction of the procurement response alone weakened the consumption response. The results of a second experiment suggested that making the procurement response in extinction was required to produce the effect: Mere extinction exposure to the procurement SD without the opportunity to make the procurement response did not weaken the consumption response. In a final experiment, rats learned to perform two heterogeneous procurement-consumption chains prior to extinction of one of the procurement responses. Extinction of the procurement response specifically suppressed performance of the consumption response that had been associated with it in a chain. There was no evidence that extinction of the procurement response suppressed the other consumption response when it was compared to responding in a nonextinguished control group.
The results begin to characterize the associative structure underlying performance in heterogeneous instrumental chains: Overall, the evidence was consistent with the idea that making the procurement response in extinction led to the activation of the associated consumption response representation, which allowed the consumption response to undergo mediated extinction (Holland, 1990; Holland & Wheeler, 2009).
The present experiments were designed to further explore the associative structure of heterogeneous chains by testing the effect of the reverse operation, that is, the effects of consumption extinction on procurement responding. The results of two previous sets of experiments on heterogeneous chains suggest that separate manipulation of the value of the consumption response can indeed influence the associated procurement response (Olmstead, Lafond, Everitt, & Dickinson, 2001; Zapata, Minney, & Shippenberg, 2010). However, the precise interpretation of the results is not clear. In those experiments, rats learned to make a procurement response when a procurement lever was inserted into the operant chamber. The procurement response then caused the procurement lever to retract and a second consumption lever to be inserted. Responding on the second lever then led to the primary reinforcer (an intravenous infusion of cocaine). Note that the insertion and retraction of the levers served as SDs. Following acquisition, two groups received either extinction of the consumption response (i.e., the consumption lever was present continuously but pressing it did not lead to a cocaine infusion), or further reinforcement of the consumption response. In a subsequent test of procurement responding, the group that had received extinction of the consumption response showed less procurement responding than the group that had received reinforcement. But in the absence of a control group that received no treatment of the consumption response, it is not possible to know whether the results were due to the extinction of the response in the one group or reinforcement of the response in the other (or both). A second within-subject experiment in which subjects were trained with two heterogeneous chains and then received extinction and reinforcement of the two consumption responses is ambiguous in the same way, as were further results reported by Zapata et al. (2010).
At present, it is not certain whether extinction of the consumption response can weaken procurement in the same way that procurement extinction weakens consumption (Thrailkill & Bouton, 2015).
The present experiments were thus intended to extend the analysis of the effects of consumption extinction on procurement responding in a discriminated heterogeneous chain. Following Thrailkill and Bouton (2015), they were designed to ask two new questions: (1) whether consumption extinction weakens procurement responding in comparison to an untreated control group, and (2) whether making the consumption response in extinction is necessary to produce that effect. Rats were first trained to perform the heterogeneous chain studied by Thrailkill and Bouton (2015), in which procurement and consumption response manipulanda (a lever and a chain) were continuously available but responding on them was guided by distinct visual SDs. Experiment 1 demonstrated that extinction of consumption responding can indeed weaken procurement responding, and that the effect depends on the rats making the actual consumption response during extinction. Experiment 2 then ruled out nonspecific effects of extinction as a potential explanation for the effect, and further supported an account of the effect based on a direct association between the procurement and consumption responses.
Experiment 1
The design of the first experiment is shown in Table 1. It mirrored the design of a study reported by Thrailkill and Bouton (2015, Experiment 2). Rats were trained to perform a two-response chain that involved both lever-press and chain-pull responses (order counterbalanced). The table refers to the procurement and consumption responses as P and C, respectively, and the stimuli (SDs) that set the occasion for them as SP and SC. Following acquisition of the chain, three groups received extinction exposures to the consumption stimulus (SC) with either (1) both the procurement and consumption manipulanda present [Group SC-C (P)], (2) the consumption manipulandum but not the procurement manipulandum present (Group SC-C), or (3) neither manipulandum present (Group SC-only). The presence or absence of the different manipulanda arranged things so that Group SC-C (P) could make both responses, Group SC-C could only make the consumption response, and Group SC-only could not make either during extinction. The procurement SD was never presented. A fourth group, Group Handle, received identical handling and transport to the laboratory from the colony room without exposure to the SDs, responses, or operant chamber. All rats were then tested for procurement responding in the presence of the procurement SD alone. The extinction groups allowed us to assess whether learning to inhibit the consumption response is required for consumption extinction to weaken the procurement response. If mere exposure to the consumption SD were sufficient to weaken the procurement response, then Group SC-only would show the weakened procurement responding that we expected in Groups SC-C (P) and SC-C. The inclusion of both Group SC-C and SC-C (P) allowed us to assess any influence of the presence vs. absence of the procurement manipulandum when consumption responding was extinguished.
Table 1.
Group | Acquisition | Extinction | Test
---|---|---|---
Experiment 1 | | |
SC-C (P) | SP:P → SC:C + (P, C) | SC:C − (P, C) | SP:P − (P, C)
SC-C | SP:P → SC:C + (P, C) | SC:C − (C) | SP:P − (P, C)
SC-only | SP:P → SC:C + (P, C) | SC − | SP:P − (P, C)
Handle | SP:P → SC:C + (P, C) | ---- | SP:P − (P, C)
Experiment 2 | | |
Extinguished | SP1:P1 → SC1:C1 +; SP2:P2 → SC2:C2 + (P1, P2, C1, C2) | SC1:C1 − (C1) | SP1:P1 −; SP2:P2 − (P1, P2)
Handle | SP1:P1 → SC1:C1 +; SP2:P2 → SC2:C2 + (P1, P2, C1, C2) | ---- | SP1:P1 −; SP2:P2 − (P1, P2)
Note. P and C, and SP and SC refer to procurement and consumption responses, and SDs, respectively. + designates reinforcement, − designates nonreinforcement (extinction), ---- designates handling without exposure to the experimental apparatus. Parentheses indicate which response manipulanda were present in a given phase.
Method
Subjects
Thirty-two female rats (75–90 days old) were housed individually in suspended wire-mesh cages and maintained at 80% of their free-feeding weights. Rats had unlimited access to water in their home cages and were given supplementary feeding approximately 2 hr after each session.
Apparatus
The apparatus consisted of two unique sets of four conditioning chambers (model ENV-007-VP; Med Associates, St. Albans, VT) located in separate rooms of the laboratory. Each chamber was housed in its own sound-attenuating chamber. All boxes measured 31.75 × 24.13 × 29.21 cm (Length × Width × Height). The sidewalls consisted of clear acrylic panels, and the front and rear walls were made of brushed aluminum. A recessed food cup was centered on the front wall approximately 2.5 cm above the floor. A retractable lever (model ENV-112CM, Med Associates) was positioned to the left of the food cup. The lever was 4.8 cm wide and 6.3 cm above the grid floor. It protruded 2.0 cm from the front wall when extended. A removable chain-pull response manipulandum (model ENV-111C, Med Associates) was positioned to the right of the food cup. The chain was 23.5 cm long and 5.7 cm above the grid floor. It was spaced 2.0 cm from the front wall. Two 28-V (2.8 W) panel lights (diameter = 2.5 cm) were mounted on the wall near each manipulandum, 10.8 cm above the floor and 6.4 cm from the center of the food cup. One light was immediately above the lever and the other was behind the chain. The chambers could be illuminated by 7.5-W incandescent bulbs mounted to the ceiling of the sound attenuation chamber. Ventilation fans provided background noise of 65 dBA.
The two sets of chambers had unique features that allowed them to serve as different contexts in other experiments, although they were not used for that purpose here. In one set of boxes, the floor consisted of 0.5 cm diameter stainless steel floor grids spaced 1.6 cm apart (center-to-center) and mounted parallel to the front wall. The ceiling and side wall had black horizontal stripes, 3.8 cm wide and 3.8 cm apart. In the other set of chambers, the floor consisted of alternating stainless steel grids with different diameters (0.5 and 1.3 cm, spaced 1.6 cm apart). The ceiling and side wall were covered with dark dots (2 cm in diameter). Reinforcement consisted of the delivery of a 45 mg food pellet (MLab Rodent Tablets; TestDiet, Richmond, IN) into the food cup. The apparatus was controlled by computer equipment in an adjacent room.
Procedure
Acquisition
Sessions were conducted daily and lasted approximately 30 min. All rats experienced each training phase, and were given brief remedial training in a separate session if they failed to respond during the main session. Rats first received two 30-min sessions of magazine training with response manipulanda removed. In each session, there were 60 noncontingent pellet deliveries scheduled according to a random time (RT) 30 s schedule. Over the next two sessions, the consumption response was trained. At this time, only the consumption manipulandum was present and the consumption stimulus was presented on 30 trials with a 45 s variable ITI. Manipulanda (lever or chain) were counterbalanced across subjects; the consumption SD was always the panel light near the consumption manipulandum. A consumption response turned the SD off and immediately produced a food pellet according to a continuous reinforcement (CRF) schedule. A trial was terminated if a response was not made within 60 s of SD onset. In the following session, the procurement response manipulandum was added to the chamber. At the start of each of 30 trials, the new procurement SD (panel light near the procurement manipulandum) was now turned on. A single procurement response during the procurement SD turned off the stimulus, and immediately turned on the consumption SD, in the presence of which a single consumption response then produced a food pellet. Following two sessions of such CRF chain training, there were two sessions in which the reinforcement schedule in both links was a random ratio (RR) 2. For the remaining sessions, the schedule was always RR 4 in both links. Time allowed in the procurement and consumption stimuli to meet the RR 4 requirement decreased in steps from 60 s to 45 s, 30 s, and finally the terminal value of 20 s over the first four sessions of RR 4 training. The maximal stimulus duration of 20 s remained in effect for a final 4 sessions of acquisition.
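The logic of a single link in the chain can be sketched in a few lines of Python. This is only an illustration, not the control program used in the experiments: it assumes that an RR n schedule is arranged by letting each response satisfy the requirement with probability 1/n, and the function name and the list of response times are hypothetical.

```python
import random

def run_link(response_times, ratio, max_duration, rng):
    """Simulate one link (procurement or consumption) of a chain trial.

    response_times: seconds from SD onset at which responses occur (hypothetical input).
    ratio: RR value; each response meets the requirement with probability
           1 / ratio (an assumption about how the schedule is arranged).
    max_duration: seconds after which the SD turns off if the requirement is unmet.
    Returns (completed, time_sd_off).
    """
    for t in sorted(response_times):
        if t >= max_duration:
            break  # responses after the SD has timed out do not count
        if rng.random() < 1.0 / ratio:
            return True, t  # requirement met: SD turns off at this response
    return False, max_duration  # trial ends without completing the link
```

With `ratio=1` the function reduces to CRF (the first response inside the SD completes the link), and with `ratio=4` and `max_duration=20` it corresponds to the terminal parameters described above.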
Extinction
Rats were then randomly assigned to one of four groups (ns = 8). Over the next four sessions, three groups received extinction sessions in which there were 30 presentations of the consumption stimulus without reinforcement. In Groups SC-C (P) and SC-C, consumption responding terminated the stimulus on RR 4, but did not produce a food pellet. For Group SC-C (P), both the procurement and consumption manipulanda were available. For Group SC-C, only the consumption manipulandum was available (the procurement manipulandum was removed). For Group SC-only, neither manipulandum was present; these rats received thirty 20-s presentations of the consumption stimulus without reinforcement. The fourth group (Group Handle) received no extinction sessions but was brought from the colony room to the laboratory and handled each day in a manner equivalent to that of the other groups.
Procurement test
After four sessions of extinction, all rats received a test session in which both response manipulanda were present. There were 30 procurement trials, i.e., 30 occasions on which the procurement SD was presented without being followed by the consumption SD. Trials were separated by a variable 45-s ITI. Responses on the procurement manipulandum during the procurement SD turned off the SD according to RR 4, but did not produce the consumption SD or a food pellet. Procurement trials otherwise ended with the SD terminating after 20 s had elapsed.
Data analysis
To describe procurement and consumption responding occasioned by the corresponding SD, we calculated elevation scores by subtracting the response rate on procurement and consumption manipulanda during the 30 s immediately before the procurement stimulus was presented (the pre-procurement period) from the response rate during the procurement and consumption stimuli, respectively. The elevation scores and pre-procurement response rates were evaluated with analyses of variance (ANOVAs) using a rejection criterion of p < .05.
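The elevation score can be illustrated with a short Python sketch. The function names are hypothetical, and expressing rates as responses per minute is an assumption, since the unit is not stated in this section.

```python
def response_rate(n_responses, duration_s):
    """Response rate in responses per minute (the unit is an assumption)."""
    return 60.0 * n_responses / duration_s

def elevation_score(n_sd, sd_duration_s, n_pre, pre_duration_s=30.0):
    """Rate during the SD minus rate during the 30-s pre-procurement period."""
    return response_rate(n_sd, sd_duration_s) - response_rate(n_pre, pre_duration_s)

# Hypothetical trial: 10 responses in a 20-s SD, 3 responses in the 30-s pre period.
print(elevation_score(10, 20.0, 3))  # 30.0 - 6.0 = 24.0
```

A positive score indicates that responding was higher in the presence of the SD than in the period preceding the procurement stimulus.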
Results
One rat failed to acquire the chain and was dropped from the study. The final ns were 8, 8, 7, and 8 for groups SC-C (P), SC-C, SC-only, and Handle, respectively.
Acquisition
All but the one rat acquired the procurement-consumption chain without incident. Acquisition of procurement and consumption responding is presented in Figure 1. Figure 1a shows an increase in each response over the course of training sessions. Procurement elevation scores were compared in a Group (4) by Session (8) ANOVA. There was a significant effect of session, F(7, 189) = 19.14, MSE = 26.24, p < .01, and no group differences or interaction, Fs < 1. A similar analysis applied to consumption elevation scores revealed a significant effect of session, F(7, 189) = 21.03, MSE = 64.68, p < .01, and no group difference or interaction, largest F = 1.01. Average procurement response rates during the pre-SD period in the first session of acquisition for groups SC-C (P), SC-C, SC-only, and Handle were 8.3, 7.0, 4.3, and 6.4, respectively. Average procurement response rates during the pre-SD period in the final session of acquisition for groups SC-C (P), SC-C, SC-only, and Handle were 7.2, 6.5, 5.6, and 4.3, respectively. A Group by Session ANOVA comparing procurement response rates in the pre-procurement SD period found a significant effect of session, F(7, 189) = 7.68, MSE = 6.57, p < .01, but no group difference or interaction, Fs < 1. Average consumption response rates during the pre-SD period in the first session of acquisition for groups SC-C (P), SC-C, SC-only, and Handle were 5.3, 6.1, 3.9, and 4.4, respectively; the corresponding response rates during the last session were 3.3, 2.0, 1.2, and 4.3, respectively. A Group by Session ANOVA revealed a significant decrease across sessions, F(7, 189) = 5.83, MSE = 3.09, p < .01, but there were no group differences or interaction, largest F = 1.06.
Mean response rates on procurement and consumption manipulanda in the pre-procurement SD, procurement SD, and consumption SD periods in the last session of acquisition are presented in Figure 1b. Both responses were low in the pre-procurement SD period, then elevated during their respective SD periods, thus demonstrating strong stimulus control over responding. In the pre-procurement SD period, a Group (4) by Response (Procurement vs. Consumption) ANOVA found greater responding on the procurement manipulandum, F(1, 27) = 17.70, MSE = 18.94, p < .001, and no group differences, F(3, 27) = 1.21, MSE = 23.43, or interaction, F < 1. In the procurement SD period, rats responded significantly more on the procurement manipulandum than the consumption, F(1, 27) = 106.43, MSE = 73.22, p < .001, and there were no group differences or interaction, Fs < 1. In the consumption SD, rats responded significantly more on the consumption than procurement manipulandum, F(1, 27) = 195.45, MSE = 164.09, p < .001, and there were no group differences or interaction, Fs < 1.
Extinction
The results of the extinction phase are presented in Figure 1c; Groups SC-C (P) and SC-C each decreased their consumption responding within each session and over sessions of extinction, showing spontaneous recovery at the beginning of each session. A Group [SC-C (P) vs. SC-C] by Session (4) by Trial Block (6) ANOVA confirmed these observations with significant effects of Session, F(3, 42) = 45.04, MSE = 163.47, p < .001, Trial Block, F(5, 70) = 16.69, MSE = 94.18, p < .001, and Session by Block interaction, F(15, 210) = 3.67, MSE = 87.32, p < .001. The effect of Group did not reach significance, F(1, 14) = 3.87, MSE = 232.89, p = .07, and there were no other significant interactions, largest F = 1.42. Average pre-consumption SD consumption response rates in Groups SC-C (P) and SC-C were 2.3 and 0.8 in the first session, and 0.7 and 0.6 in the last session of extinction. An ANOVA revealed significantly greater responding in Group SC-C (P), F(1, 14) = 5.36, MSE = 9.03, p = .04; there were also significant effects of Session, F(3, 42) = 6.42, MSE = 2.83, p < .01, and Block, F(5, 70) = 3.96, MSE = 3.67, p < .01, as well as a Session-by-Group interaction, F(3, 42) = 3.27, p = .03. Greater responding in Group SC-C (P) may reflect better generalization from acquisition to extinction.
Recall that Group SC-only received the full 20-s consumption SD presentation during each trial of extinction. Making the consumption response [available to Groups SC-C (P) and SC-C] could shorten exposure to the consumption SD. A Group [SC-C (P) vs. SC-C] by Session (4) ANOVA revealed greater average consumption SD exposure (in seconds) in Group SC-C, F(1, 14) = 5.34, MSE = 5.59, p = .04, and a significant increase in SD exposure over sessions, as responding in Groups SC-C (P) and SC-C increasingly slowed, F(3, 42) = 72.05, MSE = 2.06, p < .01. The interaction did not reach significance, F = 1.76. All animals in Group SC-only received 20-s SD presentations and thus had no variance in exposure time; therefore, any statistical test would find significant differences between Group SC-only (20 s) and Groups SC-C (P) (16.6 s) and SC-C (17.6 s).
Procurement test
The results of the procurement test are presented in Figure 2. With the exception of the first block of 5 trials, Groups SC-C (P) and SC-C made fewer procurement responses than Group SC-only and Group Handle. These observations were supported by a Group (4) by Trial Block (6) ANOVA, which revealed significant effects of Group, F(3, 27) = 5.49, MSE = 183.63, p < .01, Trial Block, F(5, 135) = 21.41, MSE = 38.27, p < .01, and an interaction, F(15, 135) = 1.75, p < .05. A separate ANOVA on the groups that received extinction with manipulanda [Groups SC-C (P) and SC-C] found a significant effect of Block, F(5, 70) = 11.28, MSE = 37.27, p < .01, but no other significant effects, largest F = 1.36. Thus, responding in these groups did not differ. A similar comparison of responding in the remaining groups (Groups SC-only and Handle) found a significant effect of Block, F(5, 65) = 11.06, MSE = 39.36, p < .01, and a significant Group by Block interaction, F(5, 65) = 3.77, p < .01, but no group difference, F(1, 13) = 1.12, MSE = 220.40, p = .31. Separate planned comparisons were made between groups allowed to make the consumption response in extinction and Group SC-only. Group SC-C (P) had a lower rate of procurement than Group SC-only over the test session, F(1, 13) = 5.76, MSE = 198.37, p = .03; procurement in each group decreased over blocks, F(5, 65) = 5.98, MSE = 36.32, p < .001, but the two factors did not interact, F(5, 65) = 1.96, p = .09. Procurement was initially similar in Groups SC-C and SC-only, but decreased faster in Group SC-C. This was supported by a significant Group by Block interaction, F(5, 65) = 2.57, MSE = 42.87, p = .03, and an effect of Block, F(5, 65) = 7.87, p < .001, but no main effect of Group, F(1, 13) = 1.65, MSE = 206.62, p = .22. Overall, the results suggest that extinction treatments that allowed the rat to make the consumption response during extinction [Groups SC-C (P) and SC-C] were successful at weakening procurement responding.
The analysis of procurement SD elevation scores was not complicated by differences in procurement responding during the 30 s pre-procurement SD period. During these periods in the first block, mean procurement responding was 1.9, 5.3, 7.0, and 5.3 in Groups SC-C (P), SC-C, SC-only, and Handle, respectively, and 0.2, 0.0, 0.4, and 0.6 in the last block. A Group (4) by Block (6) ANOVA found a significant decrease in procurement responding during the pre-procurement SD period, F(5, 135) = 23.10, MSE = 4.30, p < .01, but no Group difference or interaction, largest F = 1.21.
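The degrees of freedom reported for the test analyses follow from the final group sizes and the number of trial blocks. The sketch below recomputes them, assuming a standard Group by Block mixed (split-plot) ANOVA with uncorrected degrees of freedom; the function name is hypothetical.

```python
def mixed_anova_df(group_sizes, n_within_levels):
    """Degrees of freedom for a Group x Within-factor mixed ANOVA.

    Assumes a standard split-plot design with uncorrected df.
    Returns {effect: (df_effect, df_error)}.
    """
    n_subjects = sum(group_sizes)
    n_groups = len(group_sizes)
    df_group = n_groups - 1
    df_subjects = n_subjects - n_groups        # between-subjects error
    df_within = n_within_levels - 1
    df_within_error = df_subjects * df_within  # within-subjects error
    return {
        "group": (df_group, df_subjects),
        "within": (df_within, df_within_error),
        "interaction": (df_group * df_within, df_within_error),
    }

# Procurement test: four groups (ns = 8, 8, 7, 8) by six trial blocks.
print(mixed_anova_df([8, 8, 7, 8], 6))
# {'group': (3, 27), 'within': (5, 135), 'interaction': (15, 135)}
```

The same function applied to Groups SC-only and Handle (ns = 7 and 8) gives (1, 13) for Group and (5, 65) for Block and the interaction, in agreement with the values reported for that comparison.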
Discussion
Almost all of the animals acquired the instrumental chain and demonstrated excellent stimulus control by the end of training. Groups that then received consumption extinction showed weakened procurement responding. As noted above, previous studies (Olmstead et al., 2001; Zapata et al., 2010) have found related results, but differed critically in that the comparison group received further reinforcement, rather than no treatment, of the consumption response. The present results are thus the first to indicate that extinction of consumption is sufficient to reduce procurement responding. Another new result is that performance of the actual consumption response in extinction [Groups SC-C (P) and SC-C] was required in order to observe this effect. The impact of SD-only (Group SC-only) exposure on the procurement response was weaker than response exposure even though the SD-only animals received more cumulative exposure time to the consumption SD. The present results mirror our recent work showing that extinguishing the procurement response can conversely weaken consumption (Thrailkill & Bouton, 2015).
Experiment 2
It is possible that the effect of consumption extinction on procurement responding in Experiment 1 was due to some non-specific effect of extinction. For example, if consumption extinction generated frustration, frustration might generally suppress all instrumental responses. Alternatively, there might have been some generalization between the lever pressing and chain pulling, although the highly specific allocation of lever pressing and chain pulling in acquisition raises doubts about such a possibility. Experiment 2 nonetheless asked whether the effect of consumption extinction is specific to the procurement response that was associated with it in the chain. The design, which is summarized in Table 1, was similar to one used by Thrailkill and Bouton (2015). Rats now learned two separate discriminated heterogeneous behavior chains. Two additional response manipulanda were added to the conditioning chambers so that two procurement responses (P1 and P2) were available along with two consumption responses (C1 and C2). All rats learned to perform two chains consisting of P1-C1 and P2-C2 each leading to the same food pellet reinforcer in the same sessions. Following acquisition, an experimental group received extinction of one consumption response (e.g., C1) and a control group received only equivalent handling. Both groups were then tested on each procurement response (P1 and P2) in the absence of consumption manipulanda. If extinction of consumption weakened procurement responding in Experiment 1 through frustration or response generalization, then extinction of C1 should weaken P1 and P2 to a similar extent. However, if extinction of consumption only weakens the procurement response that has been linked with it in a chain, then extinction of C1 will primarily weaken P1. In line with our previous work, a handling group was included in order to assess whether other nonspecific factors play an additional role.
Method
Subjects
Thirty-two female Wistar rats from the same supplier were used. Their age, housing, and maintenance conditions were identical to those described in Experiment 1.
Apparatus
The apparatus consisted of two unique sets of four conditioning chambers (model ENV-008-VP; Med Associates) housed in separate rooms of the laboratory. Each chamber was in its own sound attenuation chamber. All boxes measured 30.5 × 24.1 × 23.5 cm (Length × Width × Height). The side walls and ceiling were made of clear acrylic plastic, while the front and rear walls were made of brushed aluminum. A recessed food cup measured 5.1 cm × 5.1 cm and was centered on the front wall approximately 2.5 cm above the level of the floor. Chain pull (14.5 cm long) and nose poke (2.5 cm in diameter and 2 cm deep) were located on the front wall on either side of the food cup and 6.3 cm (to the bottom of the chain and to center of poke) above the chamber floor. The nose poke was near the side panel that functioned as the chamber door. Two retractable levers (model ENV-112CM, Med Associates) were located directly across from the chain pull and nose poke on the rear wall. The levers were each 4.8 cm long and 6.3 cm above the grid floor. Levers protruded 1.9 cm from the front wall when extended. Four 28-V (2.8 W) panel lights (diameter = 2.5 cm) were mounted on the walls above (behind in the case of the chain) each response, 10.8 cm above the floor and 6.4 cm from the center of the front or rear wall. The chambers were illuminated by two 7.5 W incandescent bulbs mounted to the ceiling of the sound attenuation chamber, 34.9 cm from the grid floor. Ventilation fans provided background noise of 65 dBA. The two sets of boxes had unique features that allowed them to serve as different contexts in other experiments, although they were not used for that purpose here. In one set of boxes, the grids of the floor were spaced 1.6 cm apart (center-to-center). In the other set of boxes, the floor consisted of alternating stainless steel grids with different diameters (0.5 and 1.3 cm, spaced 1.6 cm apart). There were no other distinctive features between the two sets of chambers.
The reinforcer and control of events were the same as in Experiment 1.
Procedure
Training was conducted seven days a week, with two sessions a day separated by approximately 3 hours. On the day prior to response training, rats received two sessions of magazine training in which 30 food pellets were delivered to the food cup according to an RT 60-s schedule.
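For readers unfamiliar with random time schedules, the following Python sketch generates pellet delivery times for a magazine training session. It is an illustration only: drawing inter-delivery intervals from an exponential distribution is an assumption about how an RT schedule might be arranged, not a description of the software used here, and the function name is hypothetical.

```python
import random

def random_time_schedule(n_deliveries, mean_interval_s, rng):
    """Return cumulative delivery times (s) for a response-independent RT schedule.

    Intervals are drawn from an exponential distribution with the given mean
    (an assumed implementation of the schedule).
    """
    times = []
    t = 0.0
    for _ in range(n_deliveries):
        t += rng.expovariate(1.0 / mean_interval_s)
        times.append(t)
    return times

# 30 pellets on an RT 60-s schedule, as in magazine training for Experiment 2.
deliveries = random_time_schedule(30, 60.0, random.Random(1))
```

Under this assumption, 30 deliveries on RT 60 s take 30 min on average, with the exact session length varying from one session to the next.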
Individual chain training
Rats were then trained to perform each of two chains individually. Procurement responses consisted of pressing the left or right levers (counterbalanced) and consumption responses consisted of the chain-pull or nose poke (also counterbalanced). Individual chain training was conducted with only two manipulanda in the chamber at one time (one for procurement and one for consumption). Rats first learned to perform one of the consumption responses. In the first two sessions, there were 20 presentations of a consumption SD separated by a variable 45-s ITI. If a consumption response was made within 60 s of stimulus onset, the stimulus turned off immediately and a pellet was delivered (CRF). Otherwise, the SD ended without a pellet after 60 s. In the next session, the procurement manipulandum (left or right lever) was introduced to the chamber and the first chain was trained. Procurement responses were counterbalanced such that half the rats were required to travel along the side walls to reach the associated consumption response, and half had to cross the chamber diagonally. Single chain sessions consisted of 20 presentations of the procurement SD separated by a variable 45-s ITI. Initially, if a single procurement response was made within 60 s, the procurement SD terminated and the consumption SD was turned on, allowing the consumption response to be reinforced. Presentations of the procurement SD that did not lead to a procurement response ended after 60 s without the presentation of the consumption SD or a food pellet. Training of the first chain occurred over six consecutive sessions. In the first two sessions, procurement and consumption responding were reinforced according to CRF (as described above). In the final 4 sessions, the response requirement for procurement and consumption was RR 2. The second chain was then trained in an identical manner with the manipulanda used for the first chain removed from the chamber.
As before, there were two sessions of training the consumption response and then 6 sessions of training the procurement-consumption chain.
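The random-ratio (RR) requirement used in the later chain sessions can be illustrated with a short simulation. This is only a sketch and not the control program used in the experiment: it assumes that under RR n each response independently satisfies the requirement with probability 1/n (so that CRF is the special case n = 1), and the function name `responses_to_criterion` is hypothetical.

```python
import random

def responses_to_criterion(ratio, rng):
    """Count the responses emitted until an RR `ratio` requirement is met.

    Assumption: each response independently meets the requirement with
    probability 1/ratio, so ratio=1 behaves like CRF.
    """
    count = 0
    while True:
        count += 1
        if rng.random() < 1.0 / ratio:
            return count

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
crf = [responses_to_criterion(1, rng) for _ in range(1000)]
rr2 = [responses_to_criterion(2, rng) for _ in range(10000)]
mean_rr2 = sum(rr2) / len(rr2)  # should be close to 2 responses per link
```

Under this assumption the number of responses per link varies from trial to trial, but its mean equals the nominal ratio.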
Multiple chain training
Following training of the second chain, animals were trained to perform both chains within the same session over the next 14 sessions. All four manipulanda were now present, and trials with the opportunity to perform each chain were presented in pseudorandom order. There were 40 chain trials in each session (20 with each chain). The response requirement for both procurement and consumption was reduced to CRF, and then increased in two-session increments to RR 2 and finally RR 4. The maximum stimulus durations that were allowed when there was no response were decreased from 60 s to 20 s. Rats finally received 6 sessions of training with the terminal schedule parameters (RR 4 in all links on both chains). Sessions lasted approximately 40 min.
“Probe trials” were also introduced after every tenth trial (Thrailkill & Bouton, 2015). A probe trial consisted of the presentation of one of the procurement stimuli; when the response requirement was met (or the 20-s stimulus had elapsed), both consumption SDs were presented simultaneously. A single correct consumption response, defined as a response on the consumption manipulandum associated with the probed procurement response, was immediately reinforced. A response on the wrong consumption manipulandum had no scheduled consequences. Probe trials ended without reinforcement if a correct response was not made before 60 s had elapsed. Probe trials provided a measure of whether the animals followed the chained structure of the task, as opposed to merely tracking the different SDs.
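The composition of a multiple-chain session (40 chain trials plus a probe after every tenth trial) can be sketched as follows. Several details here are assumptions that the text does not specify: the pseudorandom order is modeled as a plain shuffle, the probed chain is chosen at random, and the labels `chain1` and `chain2` and the function `build_session` are hypothetical.

```python
import random

def build_session(rng, trials_per_chain=20, probe_every=10):
    """Return a list of trial labels for one multiple-chain session.

    Assumptions (not given in the text): the pseudorandom order is a
    simple shuffle, and the chain tested on each probe is picked at random.
    """
    chain_trials = ["chain1"] * trials_per_chain + ["chain2"] * trials_per_chain
    rng.shuffle(chain_trials)
    session = []
    for i, trial in enumerate(chain_trials, start=1):
        session.append(trial)
        if i % probe_every == 0:  # a probe follows every tenth chain trial
            session.append("probe_" + rng.choice(["chain1", "chain2"]))
    return session

session = build_session(random.Random(1))
```

With these parameters a session contains 44 trials: 20 with each chain and 4 probes.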
Extinction
Following acquisition, the rats were assigned to two groups. One group received 3 sessions of extinction training with one consumption response (Group Extinguished). The selected consumption response was fully counterbalanced for parallel/diagonal, nosepoke/chain, and position in the order in which the response sequence was initially trained. There were 40 presentations of the consumption stimulus in each session separated by a 45-s ITI. Except for the single consumption manipulandum, all manipulanda were removed from the chamber. The remaining rats (Group Handle) received handling and transport in the same manner as the extinguished group, but were returned to the colony instead of being placed into the chamber for extinction.
Procurement test
All rats then received a test session in which each procurement response was tested in the presence of its procurement stimulus. Both procurement manipulanda were present in the chamber. The consumption manipulanda were absent. The test session consisted of 4 presentations of each procurement SD in an ABBA or BAAB order (counterbalanced). Procurement responses during a procurement SD turned off the stimulus according to RR 4, but did not produce the consumption SD or food. Procurement SDs were otherwise terminated after 20 s on each trial.
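The counterbalanced test order can be written out explicitly. The sketch below assumes that the 8 test trials are formed by repeating the ABBA (or BAAB) cycle twice, which yields 4 presentations of each stimulus; "A" and "B" are placeholder labels for the two stimuli, and `procurement_test_order` is a hypothetical helper.

```python
def procurement_test_order(first):
    """Return the 8-trial stimulus order for the test session.

    `first` is "A" or "B". Assumption: the ABBA (or BAAB) cycle is
    simply repeated twice to give 4 presentations of each stimulus.
    """
    if first not in ("A", "B"):
        raise ValueError("first must be 'A' or 'B'")
    cycle = "ABBA" if first == "A" else "BAAB"
    return list(cycle * 2)

order_a = procurement_test_order("A")
order_b = procurement_test_order("B")
```

In both orders each stimulus has the same mean serial position (4.5 of 8), which is the usual reason for choosing this arrangement.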
Results
Acquisition
All rats acquired the two chains. Procurement and consumption elevation scores for the acquisition phase are shown in Figure 3a. Both types of responses increased in rate over the course of acquisition. The two chains were arbitrarily distinguished by the location [left (L) or right (R)] of the procurement lever on the rear wall of the chamber (recall that the various manipulanda were counterbalanced). Procurement elevation scores were compared in a Group (Extinguished vs. Handle) by Response (L vs. R) by Session (12) ANOVA that found a significant effect of Session, F(11, 330) = 39.21, MSE = 63.19, p < .01, but no effect of Response, F(1, 30) = 2.11, MSE = 1185.14, p = .16, or Group, F(1, 30) = 0.55, MSE = 6116.75, p = .47. All interactions failed to reach significance, largest F = 1.24. Average procurement response rates during the pre-procurement SD periods for to-be Extinguished and Nonextinguished responses in Group Extinguished were 1.6 and 1.6 in Session 1, and 2.1 and 3.3 in Session 12. Average procurement response rates during the pre-procurement SD period for Left and Right responses in Group Handle were 2.3 and 2.1 in Session 1, and 3.6 and 2.9 in Session 12. A Group (Extinguished vs. Handle) by Response (L vs. R) by Session (12) ANOVA comparing procurement response rates during the pre-procurement SD period found a significant increase in procurement response rates across sessions, F(11, 330) = 4.72, MSE = 4.04, p < .01, and no effects of Group, F < 1, or Response, F(1, 30) = 1.22, MSE = 85.12, p = .28. All interactions also failed to reach significance, largest F = 1.05.
Consumption elevation scores were also analyzed in a Group (Extinguished vs. Handle) by Response (L vs. R) by Session (12) ANOVA. The analysis found a significant effect of Session, F(11, 330) = 22.50, MSE = 79.72, p < .01, but no effect of Response, F(1, 30) = 0.18, MSE = 6544.11, p = .67, or Group, F(1, 30) = 1.09, MSE = 807.03, p = .30. All interactions failed to reach significance, largest F = 1.18. Average consumption response rates during the pre-procurement SD periods for the consumption responses associated with Extinguished and Nonextinguished procurement responses in Group Extinguished were 1.6 and 0.6 in Session 1, and 1.5 and 0.5 in Session 12. Average consumption response rates during the pre-procurement SD period for consumption responses associated with the Left and Right procurement responses in Group Handle were 0.3 and 0.5 in Session 1, and 0.8 and 0.4 in Session 12. A Group (Extinguished vs. Handle) by Response (L vs. R) by Session (12) ANOVA comparing consumption response rates during the pre-procurement SD period found a significant increase in consumption response rates across sessions, F(11, 330) = 2.31, MSE = 0.56, p = .01, and no effects of Group, F < 1, or Response, F(1, 30) = 1.25, MSE = 43.69, p = .27. All interactions also failed to reach significance, largest F = 1.29.
Figure 3b shows procurement and consumption response rates during pre-procurement SD, procurement SD, and consumption SD periods of the final acquisition session. As in Experiment 1, stimulus control was clearly strong. A Response (Procurement vs. Consumption) by Status (correct or incorrect, in the sense that it would be reinforced or not on a particular trial) ANOVA on response rates during the pre-procurement SD period collapsed over Group and Chain found significantly greater Procurement responding, F(1, 15) = 7.96, MSE = 5.76, p = .01, and no effect of Status or interaction, Fs < 1. A similar ANOVA comparing response rates during the procurement SD found significant effects of Response, F(1, 15) = 558.29, MSE = 10.99, p < .01, and Status, F(1, 15) = 612.26, MSE = 9.36, p < .01, and a significant interaction, F(1, 15) = 615.93, MSE = 9.16, p < .01. The Response x Status interaction indicates the strong tendency to choose the correct procurement response in each procurement SD. The same pattern was found for consumption responding during consumption SDs. Animals were significantly more likely to make the correct consumption response in a particular consumption SD, as indicated by significant effects of Response and Status and a Response by Status interaction, smallest F = 1083.01.
Figure 3c shows accuracy from the probe trials over the 12 sessions of acquisition. Trials in which the first consumption response following the Left or Right lever response was the associated consumption response (nosepoke or chainpull, counterbalanced) were counted as correct. Accuracy was high from the start and then increased over sessions in both groups. A Group (Extinguished vs. Handle) by Session (12) ANOVA found a significant effect of Session, F(11, 330) = 3.62, MSE = 0.03, p < .01, but no effect of Group, F(1, 30) = 1.19, MSE = 0.16, p = .28, and no interaction, F = 1.16.
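The probe accuracy measure amounts to the proportion of probe trials on which the first consumption response matched the probed procurement response. A minimal sketch follows; the data format, the example mapping of levers to consumption responses (which was counterbalanced across rats), and the function name `probe_accuracy` are all assumptions made for illustration, and the trial records are invented.

```python
def probe_accuracy(trials, chain_map):
    """Proportion of probe trials scored as correct.

    `trials` is a list of (procurement, first_consumption) pairs and
    `chain_map` gives the consumption response trained with each
    procurement response for a given rat.
    """
    if not trials:
        raise ValueError("no probe trials to score")
    correct = sum(
        1 for procurement, first_consumption in trials
        if chain_map[procurement] == first_consumption
    )
    return correct / len(trials)

# Hypothetical rat: left lever -> nosepoke, right lever -> chainpull.
chain_map = {"left": "nosepoke", "right": "chainpull"}
# Invented probe records for one session (4 probes).
trials = [
    ("left", "nosepoke"),
    ("left", "chainpull"),
    ("right", "chainpull"),
    ("right", "chainpull"),
]
accuracy = probe_accuracy(trials, chain_map)
```

Because two consumption manipulanda are available on each probe, a rat responding without regard to the preceding procurement response would be expected to score near .50 on this measure.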
Extinction
Figure 3d shows extinction of elevation scores on Left and Right consumption responses in blocks of 4 trials. Consumption responding decreased within each session, and showed decreasing spontaneous recovery across sessions. This observation was confirmed in a Chain (Left vs. Right) by Session (3) by Block (10) ANOVA, which found significant effects of Session, F(2, 28) = 30.79, MSE = 176.02, p < .01, and Block, F(9, 126) = 37.91, MSE = 63.62, p < .01, as well as a Session by Block interaction, F(18, 252) = 12.61, MSE = 52.42, p < .01. There was no effect of Chain, F(1, 14) < 1, and no other interactions reached significance, largest F = 1.12. Consumption response rates in the pre-consumption SD period were similarly analyzed in a Chain (Left vs. Right) by Session (3) by Block (10) ANOVA. The analysis found no effects of Session, F(2, 28) = 1.49, MSE = 11.94, p = .24, Block, F = 1.62, MSE = 6.44, p = .12, or Chain, F < 1, and no interactions, largest F = 1.10.
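The blocking of extinction trials can be made concrete with a short example. The sketch assumes that an elevation score is the response rate during the SD minus the rate in the period just before it; that definition is not restated in this section, so it should be read as an assumption. The rates are invented and the helper names are hypothetical.

```python
def elevation_scores(sd_rates, pre_sd_rates):
    """Assumed definition: rate during the SD minus rate in the pre-SD period."""
    if len(sd_rates) != len(pre_sd_rates):
        raise ValueError("rate lists must have the same length")
    return [sd - pre for sd, pre in zip(sd_rates, pre_sd_rates)]

def block_means(values, block_size=4):
    """Average consecutive trials into non-overlapping blocks."""
    if len(values) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    return [
        sum(values[i:i + block_size]) / block_size
        for i in range(0, len(values), block_size)
    ]

# Invented rates for one 40-trial extinction session (arbitrary units).
sd_rates = [30 - 0.5 * t for t in range(40)]
pre_sd_rates = [2.0] * 40
blocks = block_means(elevation_scores(sd_rates, pre_sd_rates))
```

Forty trials in blocks of 4 give the 10 blocks per session that enter the analysis as the Block factor.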
Test
Figure 4 shows the results of the procurement response test. The results suggest that consumption extinction specifically weakened the procurement response that had been associated with it in the chain. In Group Extinguished, a Response (Extinguished vs. Nonextinguished) by Trial (4) ANOVA found that animals performed the procurement response associated with the extinguished consumption response at a significantly lower rate than the other procurement response, F(1, 15) = 5.43, MSE = 189.09, p = .03, with no effect of Trial or interaction, largest F = 1.31. Rats in Group Extinguished also responded on the procurement lever associated with the extinguished consumption response less than the average procurement responding in Group Handle, F(1, 30) = 7.99, MSE = 552.25, p = .01, with no effect of Trial or interaction, Fs < 1. In contrast, Group Extinguished’s responding on the procurement lever associated with the nonextinguished consumption response did not differ from the average responding in Group Handle, F(1, 30) = 1.79, MSE = 661.77, p = .20, and there was no effect of Trial or interaction, largest F = 1.03.
Interpretation of the preceding results was not complicated by different pre-procurement SD procurement response rates. In Group Extinguished, a Response (Extinguished vs. Nonextinguished) by Trial (4) ANOVA found that pre-procurement SD responding did not differ between the two procurement responses, F(1, 15) = 2.64, MSE = 87.52, p = .13, and there were no effects of Trial, F(3, 45) = 1.21, MSE = 43.68, p = .32, or interaction, F(3, 45) < 1, p = .47. Nor did pre-procurement SD responding for the response associated with the extinguished consumption response differ from the average procurement responding in Group Handle, F(1, 30) = 1.87, MSE = 129.53, p = .18. There was a marginal effect of Trial, F(3, 90) = 2.66, MSE = 32.03, p = .05, but no interaction, F(3, 90) < 1. The same analysis applied to the procurement response associated with the nonextinguished consumption response found no group difference in pre-procurement SD responding, F(1, 30) < 1, MSE = 51.61. There was a significant effect of Trial, F(3, 90) = 5.55, MSE = 21.49, p = .002, but no interaction, F < 1.
Discussion
Rats learned to perform two heterogeneous chains and demonstrated a high level of accuracy in making the correct consumption response after each procurement link during the probe tests. Most important, extinction of a consumption response selectively weakened the procurement response that had been associated with it during training. In addition, consumption extinction did not measurably suppress the procurement response from the other chain, as suggested by the lack of difference from responding in a group that received no extinction at all. The results thus suggest that consumption extinction can weaken procurement responding through a mechanism that does not reduce to response generalization or nonspecific effects such as frustration. The selective suppression of procurement responding also cannot be explained by a possible depression or inhibition of the animal’s representation of the primary reinforcer, which was common to both chains (and thus, both procurement responses). The findings extend those of a double-chain experiment reported by Olmstead et al. (2001), which did not discriminate between the reducing effects of extinction and the enhancing effects of reinforcement. They are also analogous to evidence that extinction of procurement specifically weakens a consumption response associated with it during heterogeneous chain training (Thrailkill & Bouton, 2015).
General Discussion
The present experiments further characterize extinction and the associative structure that underlies a discriminated heterogeneous instrumental chain. In both experiments, rats efficiently learned to perform behavior chains in which separate SDs were available to set the occasion for separate procurement and consumption behaviors. Presentation of either SD could demonstrably control the corresponding response; throughout training, presentation of SP set the occasion for procurement responding and began performance of the chain, and during consumption extinction, presentation of SC alone was shown to be sufficient to set the occasion for consumption responding. The main result, though, was that extinction of a consumption response weakened subsequent performance of a procurement response that had been associated with it in a chain (Olmstead et al., 2001; Zapata et al., 2010). Experiment 1 further demonstrated two new findings. First, extinction of the consumption response was sufficient to weaken procurement responding in comparison to a group that had received no further training with the consumption response (Group Handle). Second, extinction exposure to the consumption SD alone, without the opportunity to make the consumption response, had no impact on procurement responding (Group SC-only). Evidently, nonreinforcement of the consumption response is necessary to produce the effect on procurement responding. Experiment 2 then found that, after the training of two separate heterogeneous chains, extinction of one of the two consumption responses selectively weakened the procurement response that had been associated with it. Rats performed the other procurement response at a level that was not different from responding in a control group that had not received consumption extinction (Group Handle). In addition to clarifying and extending the results of Olmstead et al. (2001) and Zapata et al. (2010), the present findings provide an essentially perfect complement to previous studies of the effects of extinguishing procurement responding on consumption responding after discriminated heterogeneous chain training (Thrailkill & Bouton, 2015).
The present results continue to confirm the importance of emitting the instrumental response during instrumental extinction. As noted above, the results of Experiment 1 clearly suggest that extinction exposure to the consumption SD (SC) alone was not sufficient to weaken the associated procurement response. That result, coupled with the fact that in Experiment 2 an alternative procurement response (P2) was not depressed by extinction of a separate consumption response (C1), despite their connection with earning a common primary reinforcer, suggests that the suppression of the procurement response is not due to a suppression of a reinforcer representation that might be evoked by associated SDs or responses during extinction. Evidently, it is the decrease in strength of the consumption response, rather than of the consumption SD or the reinforcer representation, that weakened the procurement response here. The critical role of the response in extinction is consistent with the complementary findings of Thrailkill and Bouton (2015). It may also be consistent with other recent work from this laboratory on extinction of non-chained instrumental responses (Bouton, Todd, Vurbic, & Winterbauer, 2011; Todd, 2013; Todd, Vurbic, & Bouton, 2014). Those results have suggested a role for response inhibition in instrumental extinction (see Bouton & Todd, 2014, for review); the animal appears to learn to inhibit the instrumental response (in a specific context) when it undergoes extinction. The current results are consistent with the idea that similar inhibition of the consumption response may be necessary to weaken (or inhibit) the associated procurement response.
The present findings are also consistent with previous research on associative structures underlying serial Pavlovian learning. Holland and colleagues (1990; Holland & Ross, 1981) found that after serial compound conditioning (in which S1 is followed by S2, which is then followed by a reinforcer), extinction exposure to S2 weakens the response to S1, and extinction exposure to S1 weakens responding to S2. Holland and Ross (1981) also demonstrated specificity in serial compound learning with a within-subject procedure involving two serial compounds. They argued that the results supported the idea that the animal learns an S1-S2 association during serial compound Pavlovian conditioning. The present results, along with those of Thrailkill and Bouton (2015), provide a parallel in instrumental learning. After training with a serially-organized heterogeneous instrumental chain, extinction of the consumption response weakens procurement responding (present results) and extinction of the procurement response weakens consumption (Thrailkill & Bouton, 2015). We interpret the findings to suggest that in a representative discriminated heterogeneous chain, animals may learn an analogous R1-R2 association.
In summary, the present experiments are the first to demonstrate an unambiguous decrement in procurement behavior following extinction of an associated consumption behavior. This decrement appears to result from learning about the consumption response, and it is specific to the procurement response that was associated with the extinguished consumption response. The specificity of the effect suggests that inhibition of consumption weakens the procurement response through a direct response-response association.
Acknowledgments
This research was supported by Grant R01 DA033123 from the National Institute on Drug Abuse to MEB.
References
- Bouton ME, Todd TP, Vurbic D, Winterbauer NE. Renewal after the extinction of free operant behavior. Learning and Behavior. 2011;39:57–67. doi: 10.3758/s13420-011-0018-6.
- Catlin J, Gleitman H. Pattern of disruption in the extinction of a chain FR 7-FR 7 in pigeons. Animal Learning and Behavior. 1973;1:154–156.
- Collier GH. Determinants of choice. In: Bernstein DJ, editor. Nebraska Symposium on Motivation. Lincoln, NE: University of Nebraska Press; 1981. pp. 67–127.
- Conklin CA, Robin N, Perkins KA, Salkeld RP, McClernon FJ. Proximal versus distal cues to smoke: The effects of environments on smokers’ cue-reactivity. Experimental and Clinical Psychopharmacology. 2008;16:207–214. doi: 10.1037/1064-1297.16.3.207.
- Everitt BJ. Neural and psychological mechanisms underlying compulsive drug seeking habits and drug memories – Indications for novel treatments of addiction. European Journal of Neuroscience. 2014;40:2163–2182. doi: 10.1111/ejn.12644.
- Fantino E. Some data on the discriminative stimulus hypothesis of secondary reinforcement. Psychological Record. 1965;15:409–415.
- Gollub L. Conditioned reinforcement: schedule effects. In: Honig WK, Staddon JER, editors. Handbook of operant behavior. Englewood Cliffs, NJ: Prentice-Hall; 1977. pp. 288–312.
- Holland PC. Event representation in Pavlovian conditioning: Image and action. Cognition. 1990;37:105–131. doi: 10.1016/0010-0277(90)90020-k.
- Holland PC, Ross RT. Within-compound associations in serial compound conditioning. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:228–241.
- Holland PC, Wheeler DS. Representation-mediated food aversions. In: Reilly S, Schachtman T, editors. Conditioned Taste Aversion: Behavioral and Neural Processes. Oxford: Oxford University Press; 2009. pp. 196–225.
- Mazur JE, Fantino E. Choice. In: McSweeney FK, Murphy ES, editors. The Wiley Blackwell Handbook of Operant and Classical Conditioning. Hoboken, NJ: John Wiley & Sons, Ltd; 2014. pp. 195–210.
- Olmstead MC, Lafond MV, Everitt BJ, Dickinson A. Cocaine seeking by rats is a goal-directed action. Behavioral Neuroscience. 2001;115:394–402.
- Ostlund SB, Balleine BW. On habits and addiction: an associative analysis of compulsive drug seeking. Drug Discovery Today: Disease Models. 2008;5:235–245. doi: 10.1016/j.ddmod.2009.07.004.
- Thrailkill EA, Bouton ME. Extinction of chained instrumental behaviors: Effects of procurement extinction on consumption responding. Journal of Experimental Psychology: Animal Learning and Cognition. 2015; in press. doi: 10.1037/xan0000064.
- Todd TP, Vurbic D, Bouton ME. Mechanisms of renewal after the extinction of discriminated operant behavior. Journal of Experimental Psychology: Animal Learning and Cognition. 2014;40:355–368. doi: 10.1037/xan0000021.
- Zapata A, Minney VL, Shippenberg TS. Shift from goal-directed to habitual cocaine seeking after prolonged experience in rats. Journal of Neuroscience. 2010;30:15457–15463. doi: 10.1523/JNEUROSCI.4072-10.2010.