Abstract
Rationale
Drug-associated environmental stimuli may serve as conditioned reinforcers to enhance drug self-administration behaviors in humans and laboratory animals. However, it can be difficult to distinguish experimentally the conditioned reinforcing effects of a stimulus from other behavioral processes that can change rates of responding.
Objectives
To characterize the conditioned reinforcing effects of a stimulus paired with the μ-opioid agonist, remifentanil, using a new-response acquisition procedure in the rat.
Methods
First, in Pavlovian conditioning (PAV) sessions, rats received response-independent IV injections of remifentanil and presentations of a light-noise compound stimulus. In paired PAV groups, injections and stimulus presentations always co-occurred. In random PAV control groups, injections and stimulus presentations occurred with no consistent relationship. Second, in instrumental acquisition (ACQ) sessions, all animals could respond in an active nose-poke that produced the stimulus alone or in an inactive nose-poke that had no scheduled consequences.
Results
During ACQ, rats made significantly more active nose-pokes than inactive nose-pokes after paired PAV, but not after random PAV. Between groups, rats also made more active nose-pokes after paired PAV than after random PAV. After paired PAV, increased active responding was obtained under different schedules of reinforcement, persisted across multiple ACQ sessions, and depended on the number of PAV sessions conducted.
Conclusions
The remifentanil-paired stimulus served as a conditioned reinforcer for nose-poking: responding depended on both the contingency between the stimulus and remifentanil and the contingency between the nose-poke and the stimulus. Generally, new-response acquisition procedures may provide valid, flexible models for studying opioid-based conditioned reinforcement.
Keywords: conditioned reinforcement, cues, opioid, Pavlovian conditioning, response acquisition
Exposure to drug-associated environmental stimuli can significantly enhance drug self-administration behaviors in both humans and laboratory animals (Everitt and Robbins 2000; Olive and Kalivas 2011; Le Foll and Goldberg 2005; See 2005). Many of these effects are consistent with the drug-associated stimuli functioning as conditioned reinforcers to increase the frequency of drug-taking and/or drug-seeking responses; however, it can be difficult experimentally to distinguish conditioned reinforcement from the other associative and non-associative effects of drug exposure and stimulus presentation (Cunningham 1993; Kelleher and Gollub 1962; Mackintosh 1974; Williams 1994). Treatments for drug abuse and dependence are increasingly focused on techniques to reduce human drug takers’ reactions to drug-associated stimuli (e.g., Milton and Everitt 2010; Myers and Carlezon 2010; Taylor et al. 2009). To decrease problematic drug-related responses while minimizing the risk of disruption to other, more adaptive behaviors, these treatments should target precisely the specific learning mechanisms responsible for drug-stimulus associations and stimulus-maintained behavior (cf., Conklin and Tiffany 2002; Hogarth and Duka 2006). To help address the specific contributions that conditioned reinforcement can make to drug abuse and dependence (as distinguished, even, from other Pavlovian conditioned effects; Milton and Everitt 2010), thorough behavioral assessments are needed to characterize the conditioned reinforcing effects of drug-associated stimuli and to determine the necessary and sufficient conditions for such stimuli to act as conditioned reinforcers.
Three criteria must be satisfied to establish that a stimulus is, indeed, acting as a conditioned reinforcer (Mackintosh 1974, p 234). Changes in the rate of the response that produces the stimulus must (1) not depend on a current or historical association between the response and a primary reinforcer; rather, rates must depend (2) on the Pavlovian association between a primary reinforcer and the stimulus and (3) on the instrumental association between the response and the stimulus. Among the experimental procedures developed to study conditioned reinforcement (reviewed by Williams 1994), new-response acquisition is considered particularly rigorous because it can generate behavior that clearly satisfies all three of these criteria (e.g., Hyde 1976; Taylor and Robbins 1984; Sosa et al. 2011). In classical new-response acquisition procedures, animals are first given response-independent pairings of a primary reinforcer and exteroceptive stimulus. Subsequently, the stimulus alone is programmed as the consequence of a novel instrumental response, and the ability of animals to learn to make that response is assessed. In this case, animals do not have the opportunity to associate the instrumental response with the primary reinforcer, as the response that produces the stimulus does not and did not produce the primary reinforcer, and if adequate controls are included, the effects of the specified Pavlovian and instrumental associations can also be established.
New-response acquisition procedures have been used widely to study the conditioned reinforcing effects of food- or water-associated stimuli, and the basic behavioral procedures have been adapted for more complex studies of the associative and neurobiological determinants of performance with conditioned reinforcement (e.g., Beninger and Ranaldi 1994; Beninger and Rolfe 1995; Burke et al. 2007; de Borchgrave et al. 2002; Olausson et al. 2004; Parkinson et al. 1999, 2005; Snycerski et al. 2005). Despite these advances with non-drug reinforcers, new-response acquisition has not been extensively used to study stimuli paired with drugs of abuse. Early work by Davis, Smith and colleagues (reviewed by Davis and Smith 1987; see also Marcus et al. 1976; Goddard and Leri 2006) showed that rats would increase their responding on a lever that produced a buzzer noise after the noise was paired with response-independent IV injections of morphine or amphetamine, compared to a pre-conditioning baseline period when lever-presses produced the noise and IV saline injection. These results are consistent with the noise becoming a conditioned reinforcer by Pavlovian association with the drug. However, it is difficult to exclude alternative explanations, as these studies did not include a second, inactive lever or other control for nonspecific changes in behavior and/or an associative control to account for potential effects of drug exposure regardless of the programmed drug-stimulus association (see Cunningham 1993 for more on interpreting such pre-vs.-post conditioning designs).
More recently, new-response acquisition procedures have been developed in which self-administration of a drug is trained using one type of manipulandum (e.g., a nose-poke), with each IV drug injection accompanied by a particular stimulus, and then responding on a second type of manipulandum (e.g., a lever) is trained with the stimulus alone. These procedures have been used most commonly to study responding with cocaine-paired stimuli (Di Ciano 2008; Di Ciano and Everitt 2004; Di Ciano et al. 2007, 2008; Hutcheson et al. 2011; Panlilio et al. 2007; Samaha et al. 2011) or nicotine-paired stimuli (Palmatier et al. 2007, 2008). Crucially, among these studies, several studies with both cocaine (Di Ciano and Everitt 2004; Panlilio et al. 2007) and nicotine (Palmatier et al. 2007, 2008) have included both a control manipulandum and an associative control condition to assess the sensitivity of responding to the drug-stimulus pairing. Corresponding studies have not, to our knowledge, been performed with opioid-paired stimuli. In addition to studying cocaine-paired stimuli, Di Ciano and Everitt (2004) did measure rats’ acquisition of responding with heroin-paired stimuli; however, no associative control was included for the heroin-paired stimulus in the heroin-trained animals, whereas an unpaired stimulus control condition was included for cocaine-trained animals.
The present experiments characterized rats’ acquisition of a novel instrumental response (nose-poking) that produced a light-noise stimulus that had been paired with the potent, short-acting μ-opioid agonist, remifentanil. To establish that acquisition depended on, or was sensitive to, the Pavlovian contingency between the stimulus and remifentanil, animals exposed to stimulus-remifentanil pairings were compared to animals given remifentanil injections and stimulus presentations without consistent pairing (a “truly random” control, Rescorla 1967). To establish that acquisition depended on the instrumental contingency between a particular response and the stimulus, animals were allowed to choose between an active nose-poke manipulandum, which produced the stimulus, and an inactive nose-poke manipulandum, which had no scheduled consequences. Three experiments were conducted. Experiment 1 characterized rats’ responding in 2 instrumental acquisition sessions after 5 Pavlovian conditioning sessions. In Experiment 2, animals were tested in 7 instrumental acquisition sessions after 5 Pavlovian conditioning sessions. These additional acquisition sessions were conducted to assess the persistence of responding with the stimulus. Experiment 3 assessed the influence of the number of drug-stimulus pairings, giving animals 7 acquisition sessions after only 1 Pavlovian conditioning session. Finally, to investigate the influence of the schedule of reinforcement on new-response acquisition, the active response produced the stimulus under either a random ratio (RR) 2 or fixed ratio (FR) 1 schedule in Experiments 2 and 3.
Methods
Subjects
Male Sprague-Dawley rats weighing at least 250 g were obtained from Harlan (Indianapolis, IN) to serve as subjects in all experiments. Experimental groups contained 8–12 rats. Animals were housed individually in a temperature (21–23 °C) and humidity controlled facility on a 12 h light/dark cycle (lights on at 7:00 am). Experimental sessions were conducted 6–7 days/week during the light phase of the cycle. All animals had unrestricted access to tap water and standard pellet chow in the home cage for the duration of the experiment. All studies were performed in accordance with the Guide for the Care and Use of Laboratory Animals (Institute of Laboratory Animal Research 1996), as adopted and promulgated by the National Institutes of Health, and all experimental procedures were approved by the University of Michigan Committee on the Use and Care of Animals.
Surgery
After at least 7 days of acclimation to the facility, each rat was implanted with a chronic indwelling femoral vein catheter to allow for IV drug administration. Catheterization surgery was performed under ketamine/xylazine (90:10 mg/kg, IP) anesthesia. Catheters, custom made from polyurethane tubing (MRE 040, Braintree Scientific; Braintree, MA) and Tygon tubing (S-54-HL, Norton Performance Plastics; Akron, OH), were inserted into the left femoral vein and routed subcutaneously to the area between the scapulae for externalization. At the scapulae, the catheter was attached to 22 ga stainless steel tubing that was passed through and secured to a Dacron mesh back-plate (DC95BS, Instech Laboratories; Plymouth Meeting, PA, USA). Rats were allowed at least 5 days to recover from surgery before starting experimental sessions. Catheters were flushed with 0.25 ml of saline with heparin (50 U/ml) each day during recovery, as well as before and after experimental sessions to ensure patency.
Apparatus
Experimental sessions were conducted in two experimental chambers (ENV-008, Med Associates; St. Albans, VT) contained inside light- and sound-attenuating cubicles. Each experimental chamber was located in a separate room of the laboratory. The right wall of each experimental chamber contained a white incandescent houselight (ENV-215M, Med Associates) and a sound generator and speaker (ENV-230 and ENV-224AM, Med Associates). Two nose-poke manipulanda with built-in LED stimulus lights (ENV-114BM, Med Associates) could also be inserted into the right wall. When present, the nose-pokes were located 2.5 cm above the grid floor. The right nose-poke was located 4 cm from the front wall of the experimental chamber, whereas the left nose-poke was located 4 cm from the rear wall. The houselight was centered horizontally between the nose-pokes and located 9 cm above the grid floor. The speaker was located above the right nose-poke, 7.5 cm from the floor. Blank aluminum panels were inserted when the nose-pokes were removed, but all other elements of the experimental chamber remained in place.
IV drug injections were delivered by motorized syringe drivers (PHM-107; Med Associates) through Tygon tubing (S-54-HL, Norton Performance Plastics; Akron, OH) connected to a fluid swivel (375/22PS; Instech Laboratories, Plymouth Meeting, PA or QCS-D; Strategic Applications Inc., Lake Villa, IL) and spring tether, which were mounted to a counterbalanced arm. The syringe drivers were located outside of the light- and sound-attenuating cubicles.
Pavlovian conditioning
After recovery from catheterization surgery, rats received either “paired” or “random” Pavlovian conditioning (PAV) sessions. During all PAV sessions, the nose-pokes were removed from the experimental chambers, and all animals received response-independent IV injections of remifentanil (3.2 μg/kg delivered in a volume of 100 μl/kg) and response-independent deliveries of a light-noise compound stimulus. The dose of remifentanil was chosen based on previous work in the laboratory on remifentanil self-administration (Cooper et al. 2008). The light-noise stimulus consisted of houselight illumination and white noise (80±5 dB as measured at the center of the chamber). Injections and stimuli lasted 2.0±0.5 s, depending on the weight of the individual animal. In the paired PAV groups, a single variable time (VT) 3 min schedule controlled both remifentanil injection and stimulus delivery, and injections and stimuli always co-occurred. In the random PAV control groups, remifentanil injection and stimulus delivery were each controlled by independent VT 3 min schedules. Injections and stimuli were not explicitly unpaired. For both paired PAV and random PAV, inter-injection/inter-stimulus intervals ranged from 0.0 to 6.0 min. The 3 min average inter-injection interval was chosen based on the half-life of remifentanil (Crespo et al. 2005) to allow for extensive metabolism between injections. PAV sessions lasted until 20 injections and 20 stimuli were delivered (approximately 60 min). In Experiments 1 and 2, separate groups of animals received paired PAV or random PAV for 5 consecutive sessions (100 total injections/stimulus deliveries). In Experiment 3, all groups of animals received 1 session of paired PAV (20 total injections/stimulus deliveries).
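The difference between the paired and random PAV arrangements can be illustrated with a short simulation. This is only a sketch, not the control program used in the study: the function names are hypothetical, and drawing intervals uniformly between 0.0 and 6.0 min is an assumption, since the text specifies only the range and the 3 min mean.

```python
import random


def vt_event_times(n_events=20, mean_min=3.0, rng=None):
    """Cumulative event times (min) under a variable-time (VT) schedule.

    Assumption: intervals are uniform on [0, 2 * mean_min], which gives
    the stated 0.0-6.0 min range and 3 min mean for VT 3 min.
    """
    rng = rng if rng is not None else random.Random()
    times, t = [], 0.0
    for _ in range(n_events):
        t += rng.uniform(0.0, 2.0 * mean_min)
        times.append(t)
    return times


def paired_session(n_events=20, rng=None):
    """Paired PAV: one VT schedule drives both events, so they co-occur."""
    times = vt_event_times(n_events, rng=rng)
    return {"injections": times, "stimuli": list(times)}


def random_session(n_events=20, rng=None):
    """Random PAV: two independent VT schedules, one per event type."""
    rng = rng if rng is not None else random.Random()
    return {
        "injections": vt_event_times(n_events, rng=rng),
        "stimuli": vt_event_times(n_events, rng=rng),
    }


if __name__ == "__main__":
    session = random_session(rng=random.Random(1))
    print(session["injections"][:3], session["stimuli"][:3])
```

In the random arrangement, an injection and a stimulus can still coincide by chance, which matches the statement that the events were not explicitly unpaired.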
Instrumental acquisition
Instrumental acquisition (ACQ) sessions began the day after the conclusion of PAV. ACQ sessions were conducted the same way following paired PAV and random PAV. During ACQ sessions, the two nose-pokes were present in the experimental chambers. The start of each ACQ session was indicated by the illumination of the stimulus lights inside both nose-pokes, and both nose-pokes remained illuminated for the duration of the session. In each group, the right nose-poke was active for one half of the animals, whereas the left nose-poke was active for the other half of the animals. Responses in the active nose-poke produced the light-noise stimulus alone. No remifentanil injections were given: animals were attached to the tether, but saline replaced remifentanil on the syringe driver, and the driver did not run at any point. In Experiment 1, responses in the active nose-poke produced the stimulus under a modified RR2 schedule. Under the RR2 schedule, the first response in the active nose-poke in each session produced the stimulus with a probability of 1.0, whereas each subsequent response in the session produced the stimulus with a probability of 0.5. In Experiments 2 and 3, in separate paired PAV and random PAV groups, responses in the active nose-poke produced the light-noise stimulus under the RR2 schedule or under a FR1 schedule. In all groups, responses in the inactive nose-poke were recorded but had no scheduled consequences. Active and inactive responses made during stimulus presentation were not recorded. All ACQ sessions lasted for 60 min. In Experiment 1, ACQ was conducted for 2 consecutive sessions for all animals. In Experiments 2 and 3, ACQ was conducted for 7 consecutive sessions for all animals.
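The modified RR2 rule described above can be written as a small decision procedure. The sketch below is illustrative only: the class name and interface are hypothetical, and it omits the exclusion of responses made during stimulus presentation. Setting the probability to 1.0 yields FR1 behavior.

```python
import random


class ModifiedRR2:
    """Decides whether an active nose-poke produces the stimulus.

    The first response of the session is always reinforced; each later
    response is reinforced with probability p (0.5 for RR2).
    """

    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng if rng is not None else random.Random()
        self.first_response = True

    def respond(self):
        """Return True if this response produces the stimulus."""
        if self.first_response:
            self.first_response = False
            return True
        # random() lies in [0, 1), so p = 1.0 always reinforces (FR1).
        return self.rng.random() < self.p


if __name__ == "__main__":
    schedule = ModifiedRR2(rng=random.Random(0))
    outcomes = [schedule.respond() for _ in range(10)]
    print(outcomes)
```

Guaranteeing the first stimulus presentation ensures that every animal that responds at least once contacts the response-stimulus contingency early in the session.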
Data analysis
Based on the acquisition criteria of Cunningham (1993, p 375), two hypotheses were tested: (1) a remifentanil-associated conditioned reinforcer will produce differential responding, i.e., animals will make more active responses than inactive responses after paired PAV but not after random PAV, and (2) a remifentanil-associated conditioned reinforcer will increase responding compared to control animals, i.e., animals will make more active responses after paired PAV than after random PAV. In Experiments 1 and 2, for each schedule of reinforcement, the mean numbers of active and inactive nose-pokes made in each ACQ session were analyzed using a three-way ANOVA with the within-subjects factors of manipulandum (active vs. inactive) and session (ACQ1–2 in Experiment 1, ACQ1–7 in Experiment 2) and the between-subjects factor of PAV history (paired vs. random). Paired t-tests were then used to compare the active and inactive responses of each group in each ACQ session. Following a significant PAV history X manipulandum interaction and nonsignificant interactions involving PAV history and session, responding was averaged across sessions, and unpaired t-tests were used to compare the mean active responses of the paired PAV vs. random PAV groups and the mean inactive responses of the paired PAV vs. random PAV groups. The Holm-Bonferroni method was used to correct for multiple pairwise comparisons. In Experiment 3, for each schedule of reinforcement, the mean numbers of active and inactive nose-pokes made in each ACQ session were analyzed using a two-way ANOVA with the within-subjects factors of manipulandum and session. Analyses were performed using Prism 5.0 (GraphPad Software; La Jolla, CA) or SPSS Statistics 20.0 (IBM; Armonk, NY). Differences were considered significant when p < .05, two-tailed.
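For readers unfamiliar with the Holm-Bonferroni correction, the step-down adjustment can be sketched as follows. This is a generic illustration, not the Prism or SPSS routine used for the reported analyses; the function name is hypothetical, and the p-values in the usage example are made-up placeholders, not data from these experiments.

```python
def holm_bonferroni(p_values):
    """Return Holm-adjusted p-values in the original order.

    The smallest raw p-value is multiplied by m, the next by m - 1, and
    so on; adjusted values are capped at 1.0 and forced to be
    non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    # Placeholder p-values for three pairwise comparisons (not study data).
    raw = [0.01, 0.04, 0.03]
    adjusted = holm_bonferroni(raw)
    significant = [p < 0.05 for p in adjusted]
    print(adjusted, significant)
```

Because the multiplier shrinks with each step, the procedure is less conservative than a plain Bonferroni correction while still controlling the familywise error rate.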
Drugs
Remifentanil was obtained from the hospital pharmacy of the University of Michigan Health System (Ultiva brand, GlaxoSmithKline; Uxbridge, Middlesex, UK) and dissolved in sterile saline (0.9 % w/v).
Results
Experiment 1: Responding in 2 ACQ sessions after 5 PAV sessions
Figure 1 presents the nose-poke responses of rats in 2 ACQ sessions after 5 sessions of either paired PAV (Figure 1a) or random PAV (Figure 1b). Animals responded differently in the active vs. inactive nose-poke [main effect of manipulandum: F(1,18) = 6.04, p = .024; session X manipulandum: F(1,18) = 4.45, p = .049]. By pairwise comparison, animals that received paired PAV made significantly more active responses than inactive responses in ACQ2 [t(9) = 3.55, p = .012], whereas the active and inactive responses of animals that received random PAV were not different in either ACQ session [0.12 < t(9) < 1.61, all p’s > .10]. Between groups, however, the effects of PAV history were not significant [main effect and all interactions: 0.24 < F(1,18) < 2.82, all p’s > .10].
Fig. 1.

Acquisition of a novel nose-poke response when responses in the active nose-poke produce a stimulus that was previously paired with response-independent IV remifentanil injection. a: Active and inactive nose-poke responses made by rats (n = 10) after 5 sessions of paired PAV. b: Active and inactive nose-poke responses made by control rats (n = 10) after 5 sessions of random PAV. * p < .05. Significant difference between active and inactive responding in the given ACQ session as assessed by paired t-test. All data are presented as the mean ± SEM.
Experiment 2: Responding in 7 ACQ sessions after 5 PAV sessions
Figure 2 presents the nose-poke responses of rats in 7 ACQ sessions after 5 sessions of either paired or random PAV. Animals responded under either the RR2 (Figures 2a–2c) or FR1 (Figures 2d–2f) schedules of reinforcement.
Fig. 2.
Persistence of responding across ACQ sessions with the remifentanil-paired stimulus under both the RR2 and FR1 schedules of reinforcement. a: Active and inactive nose-poke responses made by rats (n = 12) under the RR2 schedule after 5 sessions of paired PAV. b: Active and inactive nose-poke responses made by control rats (n = 10) under the RR2 schedule after 5 sessions of random PAV. c: Mean active and inactive responses made from ACQ1–7 under the RR2 schedule after paired or random PAV. d: Active and inactive nose-poke responses made by rats (n = 12) under the FR1 schedule after 5 sessions of paired PAV. e: Active and inactive nose-poke responses made by control rats (n = 8) under the FR1 schedule after 5 sessions of random PAV. f: Mean active and inactive responses made from ACQ1–7 under the FR1 schedule after paired or random PAV. * p < .05; ** p < .01. Significant difference between active and inactive responding in the given ACQ session as assessed by paired t-test. # p < .05; ## p < .01. Significant difference between paired and random PAV as assessed by unpaired t-test. All data are presented as the mean ± SEM.
Under the RR2 schedule, animals responded differently in the active vs. inactive nose-poke [main effect of manipulandum: F(1,20) = 16.48, p < .001; session X manipulandum: F(6,120) = 2.47, p = .027]. By pairwise comparison, animals that received paired PAV made significantly more active responses than inactive responses in each session from ACQ2–7 [Figure 2a; 3.20 < t(11) < 4.64, all p’s < .05]. After random PAV, animals’ active and inactive responses were not different in any ACQ session [Figure 2b; 0.20 < t(9) < 1.99, all p’s > .10]. Between groups, animals responded differently after paired PAV vs. random PAV [main effect of PAV history: F(1,20) = 6.69, p = .018], and the effects of PAV history differed for active vs. inactive responding [PAV history X manipulandum: F(1,20) = 9.63, p = .006]. Responding changed across ACQ sessions [main effect of session: F(6,120) = 3.48, p = .003], but the effects of PAV history did not depend on the session [session X PAV history: F(6,120) = 1.08, p = .37; session X PAV history X manipulandum: F(6,120) = 1.56, p = .16]. Collapsing across sessions to characterize the PAV history X manipulandum interaction (Figure 2c), animals made more active responses after paired PAV than after random PAV [t(20) = 2.91, p = .017], whereas inactive responding was not different after paired PAV vs. random PAV [t(20) = 1.40, p = .17].
Under the FR1 schedule, numerically, animals made more active responses than inactive responses after paired PAV and more inactive responses than active responses after random PAV. The main effect of manipulandum was not significant [F(1,18) = 2.97, p = .10], but responding differed significantly within the paired PAV group. By pairwise comparison, animals that received paired PAV made significantly more active responses than inactive responses in ACQ2 and ACQ4–7 [Figure 2d; 3.41 < t(11) < 4.75, all p’s < .05]. After random PAV, animals’ active and inactive responses were not different in any ACQ session [Figure 2e; 0.0 < t(7) < 2.93, all p’s > .10]. Animals’ responding under the FR1 schedule was affected by their PAV history as it was under the RR2 schedule: under the FR1 schedule, as well, animals responded differently after paired PAV vs. random PAV [main effect of PAV history: F(1,18) = 7.17, p = .015], and the effects of PAV history differed for active vs. inactive responding [PAV history X manipulandum: F(1,18) = 15.48, p < .001]. Responding changed across ACQ sessions [main effect of session: F(6,108) = 16.23, p < .001], but the effects of PAV history did not depend on the session [session X PAV history: F(6,108) = 1.03, p = .40; session X PAV history X manipulandum: F(6,108) = 1.22, p = .30]. Collapsing across sessions to characterize the PAV history X manipulandum interaction (Figure 2f), animals made more active responses after paired PAV than after random PAV [t(18) = 3.60, p = .004], whereas inactive responding did not differ by PAV history [t(18) = 0.37, p = .71].
Experiment 3: Responding in 7 ACQ sessions after 1 PAV session
Figure 3 presents the active and inactive responses of rats in 7 ACQ sessions after 1 session of paired PAV. Animals responded under either the RR2 (Figure 3a) or FR1 (Figure 3b) schedules of reinforcement. Under the RR2 schedule, responding did not differ by nose-poke [main effect of manipulandum: F(1,7) = 2.07, p = .19] or across sessions [main effect of session: F(6,42) = 1.74, p = .13; session X manipulandum interaction: F(6,42) = 1.14, p = .35]. Under the FR1 schedule, likewise, responding did not differ by nose-poke [main effect of manipulandum: F(1,9) = 3.96, p = .078] or across sessions [main effect of session: F(6,54) = 0.90, p = .49; session X manipulandum interaction: F(6,54) = 0.99, p = .43]. The trend toward a difference between the nose-pokes under the FR1 schedule is caused by a slight, but persistent, preference for the inactive response over the active response. Because paired PAV did not produce any significant changes in ACQ responding, control groups with 1 session of random PAV were not tested.
Fig. 3.

After 1 session of PAV, rats do not acquire nose-poke responding with the remifentanil-paired stimulus. a: Active and inactive nose-poke responses made by rats (n = 8) under the RR2 schedule after 1 session of paired PAV. b: Active and inactive nose-poke responses made by rats (n = 10) under the FR1 schedule after 1 session of paired PAV. All data are presented as the mean ± SEM.
Discussion
Various behavioral processes can change rates of responding when animals are exposed to a drug-paired environmental stimulus. These processes may be related to exposure to the drug itself, exposure to the stimulus itself, and/or the drug-stimulus pairing. In addition to the conditioned reinforcing effects of the stimulus, responding may be altered by the primary reinforcing effects of the drug, primary reinforcing effects of the stimulus (i.e., sensory reinforcement), discriminative effects of the stimulus, unconditioned effects of drug exposure, nonassociative learning (e.g., habituation to the sensory aspects of the stimulus), and other influences (Cunningham 1993; Kelleher and Gollub 1962; Mackintosh 1974; Williams 1994). The present study, therefore, used a behaviorally stringent new-response acquisition procedure to characterize the conditioned reinforcing effects of a light-noise stimulus that was paired with the μ-opioid agonist, remifentanil.
After 5 sessions of paired PAV, rats acquired a novel nose-poke response that produced the light-noise stimulus alone. Under both the RR2 and FR1 schedules of reinforcement, significant preferences for the active response developed rapidly (by ACQ2, Experiments 1 and 2) and persisted across multiple testing sessions (active > inactive even in ACQ7, Experiment 2). Control rats did not acquire nose-poking when the stimulus and remifentanil were not consistently paired: after 5 sessions of random PAV, no significant preference for the active response was observed in any ACQ session. With the 7 ACQ sessions in Experiment 2, furthermore, rats made more active responses after paired PAV than after random PAV, and pairing the stimulus with remifentanil selectively affected active responding, as inactive responding did not differ by PAV history under either schedule. Thus, the remifentanil-paired stimulus maintained both differential responding (active > inactive within-subjects) and increased responding (active > active between-subjects). Different criteria may be used to determine when a response has been successfully acquired with either conditioned or primary reinforcement; however, in experimental designs that include two manipulanda, testing for both within-group and between-group differences in active responding may provide a more comprehensive account of the response strength obtained, even if it is not always used as the minimum requirement for a successful demonstration of reinforcement (Cunningham 1993; Snycerski et al. 2005).
In contrast to the effects of 5 sessions of paired PAV, rats did not acquire responding under either schedule of reinforcement after 1 session of paired PAV. These results are consistent with earlier studies of the effects of pairing number on the conditioned reinforcing effects of food-associated stimuli, as well as more general notions of “associative strength” or the degree of association underlying other behaviors that depend on Pavlovian learning (reviewed by Kelleher and Gollub 1962; Mackintosh 1974).
Responding with the remifentanil-paired stimulus, therefore, satisfies the three criteria for a sufficient demonstration of conditioned reinforcement (Mackintosh 1974, p 234). First, the absence of the nose-poke manipulanda during PAV and the absence of remifentanil during ACQ prevented direct association of the nose-poke response with remifentanil as a primary reinforcer. Rather, the differences between the paired PAV and random PAV groups show that acquisition depended on the Pavlovian pairing of the stimulus with remifentanil. Prior exposure to remifentanil and stimulus presentation without consistent pairing did not produce differential responding during ACQ or as many active responses as paired PAV. Finally, the differences between active and inactive nose-poke responding during ACQ indicate that acquisition was sensitive to the instrumental association between a particular response and the stimulus as a consequence of that response. The side of the active nose-poke (left vs. right) was counterbalanced across animals in each group, and the houselight and speaker were not consistently located above the active nose-poke. It is, therefore, unlikely that either a spatial bias or Pavlovian conditioned approach was the sole basis for differential responding. Likewise, both nose-pokes simply remained illuminated for the duration of the session, and so the differences in responding are unlikely to have emerged from a difference in the sensory aspects of the active vs. inactive manipulanda themselves.
This is not to say that independently programmed or randomized presentations of drugs and environmental stimuli have no effect on behavior, or that the random control groups learned nothing during their PAV sessions. Even with the significant differences between the paired PAV and random PAV groups reviewed above, it is noteworthy that the animals in the random PAV groups responded throughout ACQ, making ~5–10 active and inactive responses per session. This responding may be due to associative processes (e.g., from pairing the experimental chamber generally with remifentanil) and/or nonassociative processes (e.g., reactions to the nose-poke manipulanda as novel objects). Some of these same processes may have also influenced the responding of the paired PAV groups, in addition to the effects of the remifentanil-stimulus pairing. Presently, a random control procedure was chosen to ensure that the experimental and control groups were matched for their exposure to the individual experimental elements—total remifentanil exposure and total exposure to the light-noise stimulus—during PAV (Cunningham 1993). However, there continues to be debate about the procedures that comprise adequate controls for Pavlovian conditioning (e.g., Kirkpatrick and Church 2004; Miller and Matzel 1989; Papini and Bitterman 1990). The present study cannot address the presence or absence of learning in the random PAV groups, except to note that whatever learning occurred did not produce the same effect on nose-poke responding that paired PAV did, and so the differences between the groups in this target behavior are still relevant to understanding how a specific drug-paired stimulus can influence a specific behavior.
In human drug abuse and dependence generally, Pavlovian drug-associated stimuli are thought to play a number of distinct, but interacting, roles in maintaining drug self-administration behaviors and provoking relapse (reviewed by Milton and Everitt 2010). As conditioned reinforcers, specifically, drug-paired stimuli may help to sustain (1) prolonged sequences or chains of behavior that ultimately lead to drug consumption and (2) drug-seeking responses in extinction, when the drug itself is unavailable (Milton and Everitt 2010). Human drug abusers are often required to engage in long, complex sequences of behavior to obtain and prepare drugs prior to consuming them, and laboratory animals can also be trained to produce extended multioperant chains with self-administered drug (e.g., Thompson and Pickens 1969, Figure 9). Reducing the conditioned reinforcing effects of drug-paired stimuli may disrupt the performance of such chains, reducing access to drugs and drug-taking in their terminal links. Next, by maintaining existing responses and/or training new responses in the absence of the drug itself, conditioned reinforcers may both complicate the detoxification process, as individuals attempt to break ongoing patterns of drug self-administration, and contribute to relapse after extended abstinence. The persistent preference for the active response observed in the present study is noteworthy in this regard. Historically, researchers have questioned whether new-response acquisition behavior is too transient to be of practical use in studying conditioned reinforcement: because responses during instrumental acquisition necessarily present the stimulus in the absence of the primary reinforcer, Pavlovian extinction may rapidly reduce or eliminate the conditioned reinforcing effects of the stimulus (Mackintosh 1974; Williams 1994). Many of the detailed interactions of Pavlovian and instrumental learning remain to be clarified (e.g., Palmatier et al. 2008), but it is becoming increasingly clear that sustained response-acquisition performance can be obtained with drug-paired stimuli (see also Di Ciano and Everitt 2004; Di Ciano et al. 2008). Altogether, therefore, interventions that reduce the conditioned reinforcing effects of drug-paired stimuli may help to make drug-taking and -seeking behaviors less diverse and less sustainable. It is important to note that, to date, such cue exposure therapies have not produced consistent clinical benefits, but they may be refined as new research into animal learning is translated into work with human drug users (e.g., Conklin and Tiffany 2002; Xue et al. 2012). New-response acquisition procedures may provide useful models for studying the enduring control that drug-paired stimuli can exert over behavior because of their conditioned reinforcing effects, specifically, and for testing interventions designed to alter that control.
Acknowledgments
We thank Davina Barron, Alyssa Cunningham, Tomas Davaloz, and Adam Kynaston for their excellent technical assistance. We thank Gail Winger and Nhu Truong for their comments on an earlier draft of the manuscript.
This research was supported by NIDA grants DA 020669, DA 024897, and DA 032943; and by the NSF Graduate Research Fellowship Program under grant DGE 0718128.
Footnotes
The authors have no conflicts of interest to disclose. All experiments comply with the current laws of the United States of America, the country in which they were performed.
Contributor Information
Jeremiah W. Bertz, Email: jwbertz@umich.edu, Department of Psychology, 1012 East Hall, 530 Church Street, University of Michigan, Ann Arbor, MI 48109-1043, USA, Tel.: +1-734-764-2307, Fax: +1-734-764-7118
James H. Woods, Department of Psychology, 1012 East Hall, 530 Church Street, University of Michigan, Ann Arbor, MI 48109-1043, USA. Department of Pharmacology, 1301 MSRB III, 1150 West Medical Center Drive, University of Michigan Medical School, Ann Arbor, MI 48109-0632, USA
Works cited
- Beninger RJ, Ranaldi R. Dopaminergic agents with different mechanisms of action differentially affect responding for conditioned reward. In: Palomo T, Archer T, editors. Strategies for studying brain disorders, vol 1, depressive, anxiety and drug abuse disorders. Farrand Press; London: 1994. pp. 411–428.
- Beninger RJ, Rolfe NG. Dopamine D1-like receptor agonists impair responding for conditioned reward in rats. Behav Pharmacol. 1995;6:785–793.
- Burke KA, Franz TM, Miller DN, Schoenbaum G. Conditioned reinforcement can be mediated by either outcome-specific or general affective representations. Front Integr Neurosci. 2007 doi: 10.3389/neuro.07/002.2007.
- Conklin CA, Tiffany ST. Applying extinction research and theory to cue-exposure addiction treatments. Addiction. 2002;97:155–167. doi: 10.1046/j.1360-0443.2002.0055a.x.
- Cooper ZD, Truong YN-T, Shi Y-G, Woods JH. Morphine deprivation increases self-administration of the fast- and short-acting μ-opioid receptor agonist remifentanil in the rat. J Pharmacol Exp Ther. 2008;326:920–929. doi: 10.1124/jpet.108.139196.
- Crespo JA, Sturm K, Saria A, Zernig G. Simultaneous intra-accumbens remifentanil and dopamine kinetics suggest that neither determines within-session operant responding. Psychopharmacology (Berl) 2005;183:201–209. doi: 10.1007/s00213-005-0180-7.
- Cunningham CL. Pavlovian drug conditioning. In: van Haaren F, editor. Methods in behavioral pharmacology. Elsevier; New York: 1993. pp. 349–381.
- Davis WM, Smith SG. Conditioned reinforcement as a measure of the rewarding properties of drugs. In: Bozarth MA, editor. Methods of assessing the reinforcing properties of abused drugs. Springer-Verlag; New York: 1987. pp. 199–210.
- de Borchgrave R, Rawlins JNP, Dickinson A, Balleine BW. Effects of cytotoxic nucleus accumbens lesions on instrumental conditioning in rats. Exp Brain Res. 2002;144:50–68. doi: 10.1007/s00221-002-1031-y.
- Di Ciano P, Everitt BJ. Conditioned reinforcing properties of stimuli paired with self-administered cocaine, heroin, or sucrose: implications for the persistence of addictive behaviour. Neuropharmacology. 2004;47:202–213. doi: 10.1016/j.neuropharm.2004.06.005.
- Di Ciano P, Benham-Hermetz J, Fogg AP, Osborne GEC. Role of the prelimbic cortex in the acquisition, re-acquisition or persistence of responding for a drug-paired conditioned reinforcer. Neuroscience. 2007;150:291–298. doi: 10.1016/j.neuroscience.2007.09.016.
- Di Ciano P, Robbins TW, Everitt BJ. Differential effects of nucleus accumbens core, shell, or dorsal striatal inactivations on the persistence, reacquisition, or reinstatement of responding for a drug-paired conditioned reinforcer. Neuropsychopharmacology. 2008;33:1413–1425. doi: 10.1038/sj.npp.1301522.
- Di Ciano P. Facilitated acquisition but not persistence of responding for a cocaine-paired conditioned reinforcer following sensitization with cocaine. Neuropsychopharmacology. 2008;33:1426–1431. doi: 10.1038/sj.npp.1301542.
- Everitt BJ, Robbins TW. Second-order schedules of drug reinforcement in rats and monkeys: measurement of reinforcing efficacy and drug-seeking behaviour. Psychopharmacology (Berl) 2000;153:17–30. doi: 10.1007/s002130000566.
- Goddard B, Leri F. Reinstatement of conditioned reinforcing properties of cocaine-conditioned stimuli. Pharmacol Biochem Behav. 2006;83:540–546. doi: 10.1016/j.pbb.2006.03.015.
- Hogarth L, Duka T. Human nicotine conditioning requires explicit contingency knowledge: is addictive behavior cognitively mediated? Psychopharmacology (Berl) 2006;184:553–566. doi: 10.1007/s00213-005-0150-0.
- Hutcheson DM, Quarta D, Halbout B, Rigal A, Valerio E, Heidbreder C. Orexin-1 receptor antagonist SB-334867 reduces the acquisition and expression of cocaine-conditioned reinforcement and the expression of amphetamine-conditioned reward. Behav Pharmacol. 2011;22:173–181. doi: 10.1097/FBP.0b013e328343d761.
- Hyde TS. The effect of Pavlovian stimuli on the acquisition of a new response. Learn Motiv. 1976;7:223–239. doi: 10.1016/0023-9690(76)90030-8.
- Institute of Laboratory Animal Research CoLS, National Research Council. Guide for the care and use of laboratory animals. 7th ed. National Academies Press; Washington DC: 1996.
- Kelleher RT, Gollub LR. A review of positive conditioned reinforcement. J Exp Anal Behav. 1962;5:543–597. doi: 10.1901/jeab.1962.5-s543.
- Kirkpatrick K, Church RM. Temporal learning in random control procedures. J Exp Psychol Anim Behav Process. 2004;30:213–228. doi: 10.1037/0097-7403.30.3.213.
- Le Foll B, Goldberg SR. Control of the reinforcing effects of nicotine by associated environmental stimuli in animals and humans. Trends Pharmacol Sci. 2005;26:287–293. doi: 10.1016/j.tips.2005.04.005.
- Mackintosh NJ. The psychology of animal learning. Academic Press; New York: 1974.
- Marcus R, Carnathan G, Meyer RE, Cochin J. Morphine-based secondary reinforcement: effects of different doses of naloxone. Psychopharmacology (Berl) 1976;48:247–250. doi: 10.1007/BF00496856.
- Miller RR, Matzel LD. Contingency and relative associative strength. In: Klein SB, Mowrer RR, editors. Contemporary learning theories: Pavlovian conditioning and the status of traditional learning theory. Lawrence Erlbaum Associates; Hillsdale, NJ: 1989. pp. 61–84.
- Milton AL, Everitt BJ. The psychological and neurochemical mechanisms of drug memory reconsolidation: implications for the treatment of addiction. Eur J Neurosci. 2010;31:2308–2319. doi: 10.1111/j.1460-9568.2010.07249.x.
- Myers KM, Carlezon WA. Extinction of drug- and withdrawal-paired cues in animal models: relevance to the treatment of addiction. Neurosci Biobehav Rev. 2010;35:285–302. doi: 10.1016/j.neubiorev.2010.01.011.
- Olausson P, Jentsch JD, Taylor JR. Repeated nicotine exposure enhances responding with conditioned reinforcement. Psychopharmacology (Berl) 2004;173:98–104. doi: 10.1007/s00213-003-1702-9.
- Olive MF, Kalivas PW. Conditioning of addiction. In: Johnson BA, editor. Addiction medicine. Springer; New York: 2011. pp. 159–178.
- Palmatier MI, Liu X, Matteson GL, Donny EC, Caggiula AR, Sved AF. Conditioned reinforcement in rats established with self-administered nicotine and enhanced by noncontingent nicotine. Psychopharmacology (Berl) 2007;195:235–243. doi: 10.1007/s00213-007-0897-6.
- Palmatier MI, Coddington SB, Liu X, Donny EC, Caggiula AR, Sved AF. The motivation to obtain nicotine-conditioned reinforcers depends on nicotine dose. Neuropharmacology. 2008;55:1425–1430. doi: 10.1016/j.neuropharm.2008.09.002.
- Panlilio LV, Thorndike EB, Schindler CW. Blocking of conditioning to a cocaine-paired stimulus: testing the hypothesis that cocaine perpetually produces a signal of larger-than-expected reward. Pharmacol Biochem Behav. 2007;86:774–777. doi: 10.1016/j.pbb.2007.03.005.
- Papini MR, Bitterman ME. The role of contingency in classical conditioning. Psychol Rev. 1990;97:396–403. doi: 10.1037/0033-295X.97.3.396.
- Parkinson JA, Olmstead MC, Burns LH, Robbins TW, Everitt BJ. Dissociation in effects of lesions of the nucleus accumbens core and shell on appetitive Pavlovian approach behavior and the potentiation of conditioned reinforcement and locomotor activity by D-amphetamine. J Neurosci. 1999;19:2401–2411. doi: 10.1523/JNEUROSCI.19-06-02401.1999.
- Parkinson JA, Roberts AC, Everitt BJ, Di Ciano P. Acquisition of instrumental conditioned reinforcement is resistant to the devaluation of the unconditioned stimulus. Q J Exp Psychol B. 2005;58:19–30. doi: 10.1080/02724990444000023.
- Rescorla RA. Pavlovian conditioning and its proper control procedures. Psychol Rev. 1967;74:71–80. doi: 10.1037/h0024109.
- Samaha A-N, Minogianis E-A, Nachar W. Cues paired with either rapid or slower self-administered cocaine injections acquire similar conditioned rewarding properties. PLoS One. 2011;6:e26481. doi: 10.1371/journal.pone.0026481.
- See RE. Neural substrates of cocaine-cue associations that trigger relapse. Eur J Pharmacol. 2005;526:140–146. doi: 10.1016/j.ejphar.2005.09.034.
- Snycerski S, Laraway S, Poling A. Response acquisition with immediate and delayed conditioned reinforcement. Behav Processes. 2005;68:1–11. doi: 10.1016/j.beproc.2004.08.004.
- Sosa R, dos Santos CV, Flores C. Training a new response using conditioned reinforcement. Behav Processes. 2011;87:231–236. doi: 10.1016/j.beproc.2011.03.001.
- Taylor JR, Robbins TW. Enhanced behavioural control by conditioned reinforcers following microinjections of d-amphetamine into the nucleus accumbens. Psychopharmacology (Berl) 1984;84:405–412. doi: 10.1007/BF00555222.
- Taylor JR, Olausson P, Quinn JJ, Torregrossa MM. Targeting extinction and reconsolidation mechanisms to combat the impact of drug cues on addiction. Neuropharmacology. 2009;56:186–195. doi: 10.1016/j.neuropharm.2008.07.027.
- Thompson T, Pickens R. Drug self-administration and conditioning. In: Steinberg H, editor. Scientific basis of drug dependence. J & A Churchill; London: 1969. pp. 177–198.
- Williams BA. Conditioned reinforcement: experimental and theoretical issues. Behav Anal. 1994;17:261–285. doi: 10.1007/BF03392675.
- Xue YX, Luo YX, Wu P, Shi HS, Xue LF, Chen C, Zhu WL, Ding ZB, Bao YP, Shi J, Epstein DH, Shaham Y, Lu L. A memory retrieval-extinction procedure to prevent drug craving and relapse. Science. 2012;336:241–245. doi: 10.1126/science.1215070.

