Abstract
Drug reinforcement learning is relevant for the development of addiction. The present study investigated how changes in the magnitude of drug-unconditioned stimulus during associative learning modulate the acquisition and extinction of cocaine-induced conditioned place preference (CPP). B6;129S F2 mice were conditioned by three dosing schedules of cocaine: a) ascending, b) fixed, and c) descending daily doses. Following acquisition of CPP, extinction was induced by a) context reexposure, b) reconditioning by saline, and c) reconditioning by descending doses of cocaine. The magnitude of CPP following conditioning by daily ascending doses of cocaine (2,4,8 and 16 mg/kg) was significantly higher than that obtained from conditioning by either a fixed daily dose (16mg/kg × 4 days) or daily descending doses (24,12,6 and 3mg/kg). Extinction following context reexposure showed persistent CPP in the “ascending” group compared to the other two groups. However, extinction via reconditioning by saline was equally effective in all groups. Interestingly, reconditioning by descending doses of cocaine a) extinguished CPP and b) resulted in partial resistance to the reinstatement of conditioned response by cocaine priming. Results underscore the significance of daily changes in cocaine dosage in the development and extinction of drug-induced conditioned response. Increases and decreases in cocaine dosage strengthen and weaken cocaine-associated memory, respectively. Moreover, extinction by “tapering down” drug reward may be superior to extinction by saline.
Keywords: cocaine, conditioned place preference, conditioned response, extinction, reinforcement learning
INTRODUCTION
Reinforcement learning is important for motivation, decision making and survival; this form of learning also has a major role in the development of drug addiction (Hyman et al. 2006; Robbins et al. 2008). Pavlovian conditioning entails reinforcement learning; pairing of an unconditioned stimulus (US) with a neutral context or cue confers conditioned stimulus (CS) properties to these stimuli. When the US is appetitive, reexposure to the CS elicits approach behavior, which is a conditioned response. The conditioned place preference (CPP) paradigm has been used extensively to investigate the motivational effects of drugs of abuse. The extent to which CPP is relevant for the study of “reward” and “reinforcement” may be a matter of debate. Some consider the CPP paradigm relevant to the measure of “reward” (Bardo & Bevins 2000; Tzschentke 2007) and others consider this paradigm relevant for measuring the reinforcing properties of drugs of abuse (Sanchis-Segura & Spanagel 2006; Schenider et al. 2010). Drugs of abuse act as reinforcers because their effects are associated with learning and memory processes (White 1996; White & Milner 1992). The face validity of the CPP paradigm lies in that it can model learning and memory processes pertinent to addictive behavior (Itzhak et al. 2010; White & Carr, 1985). These include acquisition, extinction, and reinstatement of drug-induced conditioned response (Itzhak & Martin, 2002; Mueller & Stewart, 2000; Sanchis-Segura & Spanagel, 2006). These behavioral phenotypes are relevant for the development of drug-seeking behavior, the learning to extinguish drug-seeking, and the potential for relapse (reinstatement of conditioned behavior). In fact, drug-associated cues gain the properties of an excitatory CS, and exposure to such cues induces powerful craving that can precipitate relapse in abstinent drug users (Childress et al. 1999; Newlin 1992; Robbins et al. 1999).
Studies on natural reinforcement learning suggest that learning is dependent on the discrepancy between the reward and its prediction. The more the subject is “surprised” during training, the greater the opportunity for learning (Rescorla & Wagner, 1972). The present study investigated how changes in the magnitude of drug-unconditioned stimulus during associative learning modulate the acquisition and extinction of cocaine-induced CPP. We hypothesized that conditioning by daily incremental increase in cocaine would strengthen the memory of cocaine-associated context, while a progressive decline in cocaine dosage would weaken such memory. In addition, we posited that conditioning by ascending doses of cocaine mirrors, at least in part, the human practice of drug use. Routinely, studies on conditioned reward have used a fixed daily dose of the drug, which does not mimic the human pattern of drug abuse. We also sought to investigate whether reconditioning by descending doses of cocaine could extinguish cocaine-induced CPP. The latter is significant for extinction learning by tapering down drug reward, as occurs with nicotine replacement therapy for tobacco addiction (Ross & Peselow, 2009).
We report that conditioning by daily ascending dosage of cocaine, which we believe simulates escalation in drug use over the course of addiction, solidified drug-associated memory. On the other hand, conditioning by descending doses of cocaine weakened drug-associated memory and extinguished cocaine-induced conditioned behavior. These results underscore the significance of changes in the magnitude of drug-unconditioned stimulus for the development and extinction of drug-induced conditioned behavior.
MATERIALS AND METHODS
Animals
Adult female C57BL/6J and male 129S1/SvImJ mice were purchased from Jackson Laboratories (Bar Harbor, Maine). These two mouse strains were bred to generate an F2 progeny: B6;129S F2. Mice were bred in our facilities at the University of Miami, Miller School of Medicine, Miami, FL, as we previously described in detail (Balda et al. 2006; Itzhak & Anderson, 2008). After weaning (postnatal day 21), male and female mice were separately housed in groups of five per cage. To avoid “litter effect,” each cage contained mice from 4–5 different litters. In all subsequent experiments described herein, adult (8–10 week old) male B6;129S F2 mice were used. Animals were housed in a temperature- (22±0.5°C) and humidity- (50%) controlled room and maintained on a 12-h light/dark schedule with free access to food and water. Animal care was in accordance with the Guide for the Care and Use of Laboratory Animals (National Research Council, National Academy Press, 1996) and approved by the University of Miami Animal Care and Use Committee.
Place conditioning apparatus
Custom-designed Plexiglas cages (42L × 20W × 20H cm; Opto-Max Activity Meter v2.16; Columbus Instruments, Columbus, OH) were used. The training context consisted of two compartments, separated by a removable guillotine door, one comprising four black walls with a smooth black floor and the other four white walls and a floor covered with sandpaper (fine grit 150C, Norton). Each cage was equipped with 2 horizontal sensors mounted alongside opposing lengths. The two compartments (21 × 20 × 20 cm) were each scanned by 7 infrared beams at a rate of 10Hz (2.54cm intervals). A null zone 8 cm wide was assigned at the interface of the two compartments to ensure that only full entry into each compartment was registered as ‘real’ time spent in each zone. Information collected from the sensors, e.g., time spent in each compartment and horizontal locomotor activity, was recorded and analyzed by the Opto-Max interface and software.
General conditioning procedure
Conditioning and testing were carried out in a test room separate from the housing room. This room had dimmed lighting provided by a reading lamp, which was placed in one corner of the room facing a wall, and contained two 18-inch F15T8 white fluorescent bulbs, 15 Watts each. On the first day, between 12:00–14:00h, mice were habituated (20min) to the training context; time spent in each compartment was recorded to determine preconditioning compartment-preference/aversion. To ensure a strictly unbiased training design, mice that showed initial preconditioning preference of more than 10–12% of the total time (20min) to either compartment were removed from the study (15–20% of the mice). No significant difference in the number of subjects that had bias towards the black or white compartment of the apparatus was observed. After this elimination, pre-conditioning average times spent in the black, and white compartments and in the null zone, during 1200 seconds, were 478±31, 491±29, and 212±11 seconds, respectively. For the next 4 days (days 2–5) mice were trained by a morning (10:00–12:00h) saline session and an afternoon (14:00–16:00h) cocaine session, each lasting 30min. Saline and cocaine were administered intraperitoneally (IP).
Mice were conditioned for four days by three different schedules of cocaine: a) ascending dosage, b) a fixed daily dose and c) descending dosage. Two schedules of ascending dosage of cocaine were investigated: one consisted of 3,6,12 and 24mg/kg, and the second consisted of 2,4,8, and 16 mg/kg. Because results of these two schedules were very similar (Fig. 1a), subsequent experiments were carried out with the lower schedule of 2,4,8 and 16mg/kg. The fixed dose schedule consisted of daily injections of 16mg/kg cocaine. Because 16mg/kg cocaine was the highest dose given in the ascending schedule, we wanted to investigate whether the high magnitude of CPP observed in that group was due to the relatively high dose of cocaine (16mg/kg) administered on the last conditioning day. Three schedules of descending doses of cocaine were investigated (24,12,6 and 3mg/kg; 8,4,2 and 1 mg/kg and 8,4,2 and 0mg/kg).
For the unbiased design, training was counterbalanced: half of the subjects were trained with drug in the black compartment and the other half in the white compartment. The order of injections (saline-drug) was not counterbalanced because drug administration in the morning session may have an effect lingering into the afternoon session. Cocaine was administered immediately before the animal was placed into the appropriate compartment. Following each training session, the sandpaper was removed, the cage was thoroughly cleaned with diluted laboratory-grade detergent followed by water and then dried.
Expression of CPP
To ensure a) the elimination of residual drug from the last training session, and b) the consolidation of long-term memory of drug reward, expression of CPP was tested 3 days after the last training session (see details in Table 1). Testing was carried out between 12:00–14:00h, which is the same time period that pretraining habituation had been recorded. Each test was performed in a drug-free state and lasted for 20min.
Table 1.
Experimental design and timeline
Habituation | Conditioning (cocaine) | Expression Test | Extinction Training | Extinction Test | Free Exploration | Reinstatement (cocaine) |
---|---|---|---|---|---|---|
Day 1 | Days 2–5 | Day 8 | Exp. 2 and 3: Free Exploration, Days 9–14 | Days 9–14 | N/A | N/A |
Day 1 | Days 2–5 | Day 8 | Exp. 4, 5 and 6: Reconditioning, Days 9–12 | Day 15 | Days 16–18 | Day 19 |
N/A, not applicable
Extinction procedures
Free exploration
Twenty-four hours after the CPP expression test, mice were re-exposed to the conditioning context and were allowed free exploration of both compartments of the cage for six days in the absence of drug or saline injections (Table 1). Repeated exposure to the context (CS) in the absence of drug (US) elicits extinction learning (Bouton 2004). Each daily session lasted for 20min and time spent in each compartment was recorded. Extinction was defined as a significant decrease in the magnitude of CPP following the expression test.
Reconditioning
Different groups of mice were trained to extinguish their place preference through reconditioning either by saline or by descending doses of cocaine. For saline reconditioning, mice received a morning saline session (in the saline-paired context) and a second saline session in the afternoon (in the context that had previously been paired with cocaine). For reconditioning by cocaine, mice received a morning saline session (in the saline-paired context) and an afternoon cocaine session (in the context that had also been paired with cocaine during initial conditioning), whereupon the daily dose of cocaine was reduced. As in the conditioning phase, each reconditioning session lasted 30min. Place preference test (20min) was carried out 3 days following the reconditioning phase. To ensure extinction, mice were allowed three more days of free exploration (20min) and time spent in saline- and drug-paired context were recorded daily. On the following day, a challenge injection of cocaine was administered to determine the levels of CPP reinstatement (Table 1).
Experiment 1: Conditioned place preference and locomotor activity following conditioning by ascending, fixed, and descending doses of cocaine
Two groups of mice (n=10 each) were conditioned by two different ascending doses of cocaine. One group was conditioned by 3,6,12 and 24mg/kg cocaine and a second group by 2,4,8, and 16 mg/kg cocaine. A third group (n=10) was conditioned by a fixed daily dose of cocaine (16 mg/kg × 4 days). A fourth group (n=10) was conditioned by descending doses of cocaine (24,12,6, and 3mg/kg). A fifth group (n=10) served as a control; cocaine session was substituted by saline (0mg/kg × 4). Mice were conditioned for 4 days and the expression of place preference and locomotor activity were recorded after 3 days for 20min.
Experiment 2: Conditioning by ascending and descending doses of cocaine and extinction by free exploration
This experiment investigated the rate of extinction of CPP following conditioning by ascending and descending doses of cocaine; additional groups of mice were therefore conditioned as follows. One group of mice (n=10) was conditioned by daily ascending doses of cocaine (3,6,12 and 24mg/kg) and a second group (n=14) was conditioned by daily descending doses of cocaine (24,12,6 and 3mg/kg). Following the CPP expression test, mice were re-exposed to the training context for the next six days. Preference for the cocaine- and saline-paired contexts was recorded daily (20min).
Experiment 3: Conditioning by ascending and fixed doses of cocaine and extinction by free exploration
This experiment investigated the rate of extinction of CPP following conditioning by ascending and fixed doses of cocaine; additional groups of mice were conditioned as follows. One group of mice (n=8) was conditioned by daily ascending doses of cocaine (2,4,8 and 16mg/kg) and a second group (n=10) was conditioned by a fixed daily dose of cocaine (16mg/kg). Following the CPP expression test, mice were re-exposed to the training context for the next six days. Preference for the cocaine- and saline-paired contexts was recorded daily (20min).
Experiment 4: Conditioning by ascending and fixed doses of cocaine and extinction following reconditioning by saline
Because the rates of extinction through “free exploration” differed between the groups, we sought to investigate extinction following reconditioning by saline. One group of mice (n=12) was conditioned by ascending doses of cocaine (2,4,8 and 16mg/kg) and a second group (n=15) was conditioned by a fixed daily dose of cocaine (16mg/kg). One day following the CPP expression test, mice were reconditioned by saline for 4 days as described under the Reconditioning section. Place preference was tested 3 days after the reconditioning phase, after which mice were allowed three more days of free exploration of the CPP apparatus. On the following day, reinstatement of place preference was determined immediately following administration of a priming injection of cocaine (3.5mg/kg). The latter dose was chosen as an average daily dose of cocaine given during the extinction trials described in Experiments 5 and 6 below.
Experiment 5: Conditioning by ascending doses of cocaine and extinction following reconditioning by descending doses of cocaine or saline
The goal of this experiment was to investigate how a daily decrease in cocaine dosage influences the rate of extinction and the reinstatement of place preference in mice that had been conditioned by the optimal ascending dose schedule of cocaine. Mice (n=32) were conditioned by 2,4,8 and 16mg/kg cocaine. Following the CPP expression test, one group (n=16) was reconditioned by two saline sessions for 4 days. A second group (n=16) was reconditioned by a morning saline session and an afternoon cocaine session for 4 days. Starting with a dose of 8mg/kg cocaine, the daily dosage decreased to 4, 2 and then 1mg/kg. Place preference was tested 3 days after the reconditioning phase, and then mice were allowed three more days of free exploration of the CPP apparatus. On the following day, reinstatement of place preference was determined immediately following administration of priming injection of cocaine (3.5mg/kg).
Experiment 6: Conditioning by ascending doses of cocaine and extinction following reconditioning by descending and fixed doses of cocaine
The aim of this experiment was twofold. First, to investigate extinction by a descending schedule of cocaine that terminates with 0mg/kg instead of 1mg/kg (Experiment 5). Second, to investigate whether reconditioning by a fixed low dose of cocaine, which equals the average daily dose of 3.5mg/kg cocaine, induces extinction. Two groups of mice (n=15 each) were first conditioned by daily ascending doses of cocaine (2,4,8 and 16mg/kg), and three days later expression of CPP was tested. On the following day, one group was reconditioned by descending doses of cocaine (8,4,2 and 0mg/kg) and a second group was reconditioned by a low daily dose of cocaine (3.5mg/kg). Reconditioning procedures and subsequent tests were conducted as described in Experiment 5.
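As a quick arithmetic check on the 3.5mg/kg dose used for priming and for fixed-dose reconditioning, the short Python sketch below averages the two descending reconditioning schedules listed in Experiments 5 and 6. The variable names are illustrative only and are not part of the original protocol.

```python
from statistics import mean

# Descending reconditioning schedules (mg/kg per day), as listed in Experiments 5 and 6.
schedules = {
    "Experiment 5 (8,4,2,1)": [8, 4, 2, 1],
    "Experiment 6 (8,4,2,0)": [8, 4, 2, 0],
}

# Average daily dose for each schedule.
average_dose = {name: mean(doses) for name, doses in schedules.items()}

for name, avg in average_dose.items():
    print(f"{name}: {avg} mg/kg per day")
```

The 3.5mg/kg value equals the mean of the 8,4,2,0 schedule; the 8,4,2,1 schedule averages slightly higher, at 3.75mg/kg.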
Statistical analysis
Acquisition of CPP (Fig. 1)
The magnitude of place preference across the groups that were conditioned by ascending, fixed, and descending doses of cocaine or saline was calculated as the difference between the times spent in drug- and saline-paired compartments. Results of the magnitude of place preference (Fig. 1a) and results of locomotor activity during the place preference test (Fig. 1b) across the five groups were analyzed by analysis of covariance (ANCOVA), where CPP magnitude was assigned as the dependent variable and locomotion as the covariate.
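The place preference score defined above is a simple difference, which can be sketched in Python as follows. The function name and the example times are hypothetical and are not data from the study; time spent in the null zone is ignored in this sketch.

```python
def cpp_magnitude(time_drug_paired_s: float, time_saline_paired_s: float) -> float:
    """Return the place preference score (delta, in seconds).

    Positive values indicate preference for the drug-paired compartment,
    negative values indicate preference for the saline-paired compartment.
    """
    return time_drug_paired_s - time_saline_paired_s


# Hypothetical example for one 20-min (1200 s) test; NOT data from the study.
example_delta = cpp_magnitude(time_drug_paired_s=700.0, time_saline_paired_s=300.0)
print(example_delta)
```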
Extinction and reinstatement of CPP (Figs. 2 through 5)
Experiments that included extinction and reinstatement tests were analyzed by two-way ANOVA (group × time). Comparisons of the magnitude of CPP reinstatement and of locomotor activity across the different groups were analyzed by ANCOVA (Fig. 6). When ANCOVA/ANOVA was significant, a Bonferroni post hoc multiple comparison test was performed. A p value less than 0.05 was considered significant. All statistical analyses were performed using PASW Statistics 18 software.
RESULTS
Experiment 1: Conditioned place preference and locomotor activity following conditioning by ascending, fixed, and descending doses of cocaine
This experiment was designed to investigate how different schedules of cocaine influence the magnitude of place preference and whether place preference was influenced by motor behavior. The magnitude of place preference was calculated as the difference between the times spent in cocaine- and saline-paired compartments (delta value). Results in Fig. 1a depict the magnitude of place preference and results in Fig. 1b depict the locomotor activity recorded during the CPP test. Results of CPP and locomotor activity were analyzed by ANCOVA. The outcome of CPP, as the dependent variable, showed significant differences between the groups F[4,45]=5.88, p<0.01. Bonferroni post hoc analysis showed that the magnitude of place preference of the two ascending groups (3,6,12, 24 and 2,4,8,16 mg/kg) was significantly higher than that of the fixed dose group (p<0.05) and the descending group (p<0.05). No significant differences were observed between the delta of the higher-dose ascending group (557±43 sec) and the delta of the lower-dose ascending group (569±39 sec). Likewise, the difference between the delta of the fixed dose group (336±50 sec) and the delta of the descending dose group (252±35 sec) was not significant. The test of between-subject effects showed the following parameters: corrected model F[5,44]=8.23, p<0.001; intercept F[1,44]=9.41, p<0.01; locomotion F[1,44]=0.693, p=0.41; group F[4,44]=9.88, p<0.001. Results suggest that locomotor activity (covariate) had no significant influence on the magnitude of place preference.
Three major findings suggest that the magnitude of place preference was not dose-dependent but rather reliant on the daily changes in cocaine dosage. First, the average daily dose of cocaine was the same in the 3,6,12,24 mg/kg ascending group and 24,12,6,3 mg/kg descending group, yet the magnitude of CPP in the ascending group was twice as high as the descending group (Fig. 1a). Second, the average daily dose of cocaine in one ascending group (3,6,12,24 mg/kg) was 50% higher than the second ascending group (2,4,8,16 mg/kg), yet the magnitude of CPP in both groups was similar (Fig. 1a). Third, the average daily dose of cocaine in the ascending group of 2,4,8,16 mg/kg was 7.5mg/kg, yet the magnitude of CPP in this group was significantly higher than in the fixed dose group that received 16mg/kg cocaine for 4 days (Fig. 1a).
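The dose comparisons in the three points above can be verified with a few lines of Python. This is only an arithmetic check on the schedules listed in the Methods; the dictionary keys are illustrative labels.

```python
from statistics import mean

# Conditioning schedules (mg/kg per day) from Experiment 1.
schedules = {
    "ascending_high": [3, 6, 12, 24],
    "ascending_low": [2, 4, 8, 16],
    "fixed": [16, 16, 16, 16],
    "descending": [24, 12, 6, 3],
}

# Average daily dose for each schedule.
avg = {name: mean(doses) for name, doses in schedules.items()}

# Ratio of the two ascending schedules (1.5 means 50% higher).
ratio = avg["ascending_high"] / avg["ascending_low"]

for name, value in avg.items():
    print(f"{name}: {value} mg/kg per day")
print(f"ascending_high / ascending_low = {ratio}")
```

The higher ascending schedule and the descending schedule both average 11.25mg/kg per day, the lower ascending schedule averages 7.5mg/kg per day, and the fixed schedule delivers 16mg/kg per day.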
Experiment 2: Conditioning by ascending and descending doses of cocaine and extinction by free exploration
This experiment was designed to investigate the rate of extinction of CPP following conditioning by ascending and descending doses of cocaine. The magnitude of place preference in mice conditioned by ascending doses of cocaine (3,6,12,24 mg/kg) was twice as high as that of the group conditioned by descending doses (24,12,6,3 mg/kg) (Fig. 2a; *p<0.05). Following the CPP expression test (day 1), mice were allowed to explore both compartments of the conditioning cage for 6 consecutive days. A two-way ANOVA showed a significant group effect F[1,154]=37.67; p<0.001 and no significant time effect F[6,154]=0.82; p=0.55. Bonferroni post hoc analysis showed significant differences between the two groups on all days (*p<0.05) except day 4. In the descending group, CPP on day 7 was significantly lower than on day 1 (#p<0.05), suggesting extinction of CPP; no such differences were observed in the ascending group (Fig. 2a). Results show that the ascending-dose group consistently maintained higher preference for the drug-paired compartment than the descending-dose group.
Experiment 3: Conditioning by ascending and fixed doses of cocaine and extinction by free exploration
This experiment was designed to investigate the rate of extinction of CPP following conditioning by ascending and fixed doses of cocaine. The magnitude of place preference in mice conditioned by ascending doses of cocaine (2,4,8,16 mg/kg) was significantly higher than in the group conditioned by a fixed dose of cocaine (16mg/kg × 4) (p<0.05; Fig. 2b). Following the CPP expression test, mice were allowed to explore both compartments of the conditioning cage for 6 consecutive days. A two-way ANOVA showed a significant group effect F[1,112]=49.79; p<0.001, a significant time effect F[6,112]=3.761; p<0.01 and no significant interaction. Bonferroni post hoc analysis showed significant difference between the two groups on free exploration days 1, 3, 4, 6 and 7 (p<0.05). In the fixed dose group, CPP on days 6 and 7 was significantly lower than on day 1 (#p<0.05), suggesting extinction of CPP; no such differences were observed in the ascending group (Fig. 2b). Results show that the ascending-dose group consistently maintained higher preference for the drug-paired compartment than the fixed-dose group.
Experiment 4: Conditioning by ascending and fixed doses of cocaine and extinction following reconditioning by saline
Experiment 3 showed that following free exploration, mice that had been conditioned by ascending doses of cocaine were more resistant to extinction than those conditioned by a fixed dose of cocaine. The purpose of this experiment was to investigate whether this difference is observed following extinction by reconditioning with saline. Mice were conditioned by ascending doses of cocaine (2,4,8,16mg/kg) or a fixed-dose schedule (16mg/kg × 4). Subsequently, expression, extinction following reconditioning by saline, and reinstatement by cocaine priming were investigated (Fig. 3). Two-way ANOVA showed a significant group effect F[1,150]=4.39; p<0.05, a significant time effect F[5,150]=42.55; p<0.001 and a significant interaction F[5,150]=4.269; p<0.01. Expression of CPP on day 8 in the fixed dose group was significantly lower than that of the ascending dose group (“&” p<0.05; Fig. 3). The test following the reconditioning phase (day 15; Fig. 3) and the free exploration days (days 16–18; Fig. 3) showed significant reduction of CPP in both groups compared to the expression test (*p<0.05). A reinstatement test on the following day showed a significant increase in CPP in both groups (#p<0.05; Fig. 3).
Experiment 5: Conditioning by ascending doses of cocaine and extinction following reconditioning by descending doses of cocaine or saline
The aim of this experiment was to test the hypothesis that reconditioning by descending doses of cocaine may extinguish previously acquired place preference. Mice were first conditioned by ascending doses of cocaine (2,4,8,16 mg/kg) because this schedule resulted in optimal place preference (Fig. 1a). Following the CPP expression test, one group was reconditioned by descending doses of cocaine (8,4,2, and 1mg/kg) and a second group was reconditioned by saline. Subsequent tests were performed as described in Fig. 4. Two-way ANOVA showed a significant group effect F[1,180]=9.175; p<0.01, a significant time effect F[5,180]=58.43; p<0.001 and a significant interaction F[5,180]=12.39; p<0.001. Given that all mice were conditioned by ascending doses of cocaine (2,4,8 and 16mg/kg), the expression of CPP was similar in both groups (Fig. 4). The test on day 15, following reconditioning by descending doses of cocaine (8,4,2 and 1mg/kg) or saline, showed that both groups extinguished CPP compared to the expression test (Fig. 4; *p<0.05 for both groups). The results suggest that extinction by descending doses of cocaine was as effective as extinction by saline (Fig. 4). During the next three exploration days (days 16–18) the levels of place preference in both groups were significantly lower in magnitude than the CPP observed in the first expression test (day 8; *p<0.05). Interestingly, a priming injection of cocaine (3.5mg/kg) reinstated robust CPP in the saline-reconditioned group but not in the cocaine-reconditioned group (Fig. 4).
Experiment 6: Conditioning by ascending doses of cocaine and extinction following reconditioning by descending and fixed doses of cocaine
Experiment 5 showed that reconditioning by descending doses of cocaine successfully extinguished and prevented reinstatement of place preference. The aim of this experiment was twofold. First, to investigate extinction following reconditioning by a descending schedule of cocaine that terminates with 0mg/kg instead of 1mg/kg (Experiment 5). Second, to investigate whether reconditioning by a fixed low dose of cocaine, which equals the average daily dose of 3.5mg/kg cocaine, induces extinction.
Two groups of mice were first conditioned by ascending doses of cocaine (2,4,8,16 mg/kg); the magnitude of CPP in both groups was similar (Fig. 5). Following the CPP expression test, one group was reconditioned by descending doses of cocaine (8,4,2, and 0 mg/kg) and a second group was reconditioned by daily 3.5mg/kg cocaine. Subsequent tests were performed as described in Fig. 5. Two-way ANOVA showed a significant group effect F[1,168]=41.99; p<0.001, a significant time effect F[5,168]=52.56; p<0.001 and a significant interaction F[5,168]=11.21; p<0.001. The test on day 15 showed complete extinction of CPP in the descending dose group but not in the fixed dose group. Yet, the fixed dose group showed less CPP on day 15 compared to day 8 (p<0.05; Fig. 5), suggesting partial extinction. From day 15 through 18 the fixed dose group maintained higher place preference compared to the descending dose group (p<0.05; Fig. 5). A priming injection of cocaine given to the descending group on day 19 resulted in higher CPP than that observed on day 18 (p<0.05), suggesting reinstatement. However, the magnitude of CPP on day 19 was lower than on day 8 (p<0.05; Fig. 5), suggesting only partial reinstatement of CPP in the descending dose group. Cocaine priming had no significant effect on the fixed dose group (Fig. 5).
These results suggested that reinstatement of place preference is dependent on the schedule of extinction training. Thus, we compared the levels of reinstatement of CPP by cocaine (3.5 mg/kg) given to groups that underwent extinction by: a) saline (Experiment 5), b) descending doses of cocaine (8,4,2 and 1mg/kg; Experiment 5), and c) descending doses of cocaine that terminated with saline on the last day of reconditioning (8,4,2, and 0 mg/kg; Experiment 6). Results of CPP reinstatement (Fig. 6a) and locomotor activity (Fig. 6b) were analyzed by ANCOVA. Results of the CPP as the dependent variable showed significant differences between the groups F[2,44]=5.71; p<0.01. Pairwise Bonferroni comparison showed significant differences between each of the three groups (p<0.05), suggesting that the magnitude of CPP reinstatement was dependent on the different extinction protocols. The test of between-subject effects showed the following parameters: corrected model F[3,43]=15.02, p<0.001; intercept F[1,43]=4.62, p=0.037; locomotion F[1,43]=0.04, p=0.83; group F[2,43]=22.17, p<0.001. Results suggest that locomotor activity had no significant influence on place preference. Hence, it appears that resistance to reinstatement is dependent not only on reconditioning by a descending dose schedule of cocaine but also on the particular treatment that was given on the last day of extinction training.
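The error degrees of freedom reported throughout the Results can be cross-checked against the group sizes given in the Methods. The Python sketch below assumes that every observation is treated as independent, so that the residual df equals the number of observations minus the number of design cells minus the number of covariates. This assumption is ours, made only for the arithmetic check; it is not a statement about how the analyses were configured in the statistics software. The function name is hypothetical.

```python
def residual_df(group_sizes, n_timepoints=1, n_covariates=0):
    """Residual (error) df when each observation is treated as independent."""
    n_observations = sum(group_sizes) * n_timepoints
    n_cells = len(group_sizes) * n_timepoints
    return n_observations - n_cells - n_covariates


# Two-way ANOVA (group x time): (group sizes, number of test days).
# Experiments 2 and 3: expression test plus six free-exploration days = 7 tests.
# Experiments 4 to 6: days 8, 15, 16, 17, 18 and 19 = 6 tests.
two_way_designs = {
    "Experiment 2": ([10, 14], 7),
    "Experiment 3": ([8, 10], 7),
    "Experiment 4": ([12, 15], 6),
    "Experiment 5": ([16, 16], 6),
    "Experiment 6": ([15, 15], 6),
}
two_way_df = {
    name: residual_df(sizes, days) for name, (sizes, days) in two_way_designs.items()
}

# Single-factor comparisons, without and with the locomotion covariate.
exp1_anova_df = residual_df([10] * 5)
exp1_ancova_df = residual_df([10] * 5, n_covariates=1)
reinstatement_anova_df = residual_df([16, 16, 15])
reinstatement_ancova_df = residual_df([16, 16, 15], n_covariates=1)

print(two_way_df)
print(exp1_anova_df, exp1_ancova_df, reinstatement_anova_df, reinstatement_ancova_df)
```

Under this assumption the computed values agree with the error df reported in the F statistics for Experiments 1 through 6 and for the reinstatement comparison.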
DISCUSSION
In the present study, we hypothesized that changes in the magnitude of drug-unconditioned stimulus during associative learning modulate the acquisition and extinction of cocaine-induced place preference. Indeed, mice conditioned by ascending doses of cocaine showed the highest and the longest-lasting preference for the cocaine-associated context compared to groups conditioned by either a fixed dose or descending doses of cocaine. Hence, the ranking order for optimal associative learning appears to be dependent on the nature of the schedule of cocaine administration: ascending > fixed > descending.
Three major findings suggest that the acquisition of cocaine-induced place preference was not dose-dependent but rather reliant on the nature of the schedule of drug administration. First, the average daily dose of cocaine was the same in the 3,6,12,24 mg/kg ascending group and 24,12,6,3 mg/kg descending group, yet the magnitude of CPP in the ascending group was twice as high as the descending group (Fig. 1a). Second, the average daily dose of cocaine in one ascending group (3,6,12,24 mg/kg) was 50% higher than the second ascending group (2,4,8,16 mg/kg), yet the magnitude of CPP in both groups was not significantly different (Fig. 1a). Third, the average daily dose of cocaine in the ascending group of 2,4,8,16 mg/kg was 7.5mg/kg, yet the magnitude of CPP in this group was significantly higher than in the fixed dose group that received 16mg/kg cocaine for 4 days (Fig. 1a).
We posit that these results may be reminiscent of the “prediction error” theory of natural reinforcement learning (Bayer & Glimcher 2005; Fiorillo et al. 2008; Schultz 2007). Evidence from studies on natural reinforcement suggests that learning is dependent on the discrepancy between expected and obtained reward. If reward outcome is greater than expected, a positive prediction error is encoded. While phasic changes in dopaminergic neurotransmission encode prediction error-dependent learning (Schultz, 2007), in the present study, changes in cocaine dosage may modulate tonic release of mesolimbic dopamine. Currently, the relationship between results of the present study and the prediction error theory is not clear. However, the finding that daily increments in cocaine dosage led to the highest and the longest-lasting magnitude of place preference (Fig. 1a) suggests that this schedule elicited a) daily increase in reward magnitude and b) strengthening of cocaine-associated memory.
Conditioning by daily descending doses of cocaine induced a) the lowest and the shortest-lasting magnitude of place preference, and b) extinction learning. These findings suggest that the subjects learned to “lower their expectation” of drug reward and consequently “lost interest” in the drug-paired environment. The learning of the significance of reduction in reward magnitude may be associated with the encoding of a “negative prediction error” signal. Regardless, our findings are in agreement with a recent study showing that negative prediction error in cocaine reward reduces lever pressing for cocaine infusion in rats (Marks et al. 2010).
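For readers unfamiliar with the terminology, the positive and negative prediction errors invoked above can be written in a generic delta-rule form, in the spirit of Rescorla & Wagner (1972). This is a textbook-style sketch, not a model fitted or tested in the present study. The symbols are assumptions introduced here, and how cocaine dose maps onto the obtained reward is not established by our data.

```latex
% Assumed notation (not defined in the study):
%   r_t      reward magnitude obtained on conditioning day t
%   V_t      reward magnitude expected on day t
%   \delta_t prediction error on day t
%   \alpha   learning rate
\begin{align}
\delta_t &= r_t - V_t \\
V_{t+1}  &= V_t + \alpha\,\delta_t, \qquad 0 < \alpha \le 1
\end{align}
```

Under this notation, a reward larger than expected gives a positive error and raises the expectation, whereas a reward smaller than expected gives a negative error and lowers it. If reward magnitude tracked dose, an ascending schedule would keep the error positive and a descending schedule would keep it negative, but that mapping is only a working assumption.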
It is interesting to note how ascending doses of cocaine during acquisition of CPP influenced subsequent extinction by free exploration. In this paradigm, the subject is exposed to the conditioned stimulus in the absence of the unconditioned stimulus (Bouton 2004). Apparently, when the association between drug and context was strengthened (the ascending group), and when the subject had a free choice between drug- and saline-paired contexts, the preference for the drug-paired context was maintained (Fig. 2). In contrast, the association between drug and context was weaker following conditioning by a fixed daily dose of cocaine or by descending doses of cocaine, and a rapid decline in CPP was observed (Fig. 2). Conversely, the results of extinction through reconditioning by saline showed no differences between the various conditioned groups; both extinguished CPP (Fig. 3). Note that extinction training by free exploration lasted for 6 days (Fig. 2), while extinction training by saline reconditioning lasted only 4 days (Fig. 3). These findings suggest that reconditioning facilitates extinction learning more effectively than the free-choice exploration paradigm.
We next examined how a descending dose schedule of cocaine during the extinction phase influences extinction and reinstatement of CPP. Two major findings were revealed: a) reconditioning by saline or by descending doses of cocaine both extinguished CPP, and b) resistance to reinstatement was dependent on the extinction training protocol. Extinction by saline reconditioning resulted in the highest level of reinstatement. Extinction by reconditioning with a descending schedule of cocaine that terminated with 0 mg/kg cocaine resulted in partial reinstatement, while training that terminated with 1 mg/kg cocaine resulted in resistance to reinstatement (Fig. 6a). The reason for the resistance to reinstatement is unclear, but perhaps training by descending doses of cocaine ultimately reduced the sensitivity to cocaine. The significance of tapering down cocaine reward for extinction learning is shown by the finding that reconditioning by a fixed daily dose of cocaine (3.5 mg/kg), which equals the average of the 8, 4, 2 and 0 mg/kg schedule, resulted in only a small reduction in CPP (Fig. 5). Further studies are required to determine the effectiveness of other schedules of extinction training and how they influence the reinstatement of CPP by cocaine. Nevertheless, the findings suggest that extinction by tapering down of drug reward may be a better tool for extinction of drug-seeking behavior than an abrupt discontinuation of drug use. This paradigm may resemble the standard ‘nicotine replacement therapy’ used for the treatment of tobacco addiction.
In summary, the results suggest that changes in the magnitude of the drug-unconditioned stimulus during Pavlovian conditioning influence the acquisition and extinction of approach behavior to a drug-associated context. We posit that conditioning by ascending doses of cocaine better simulates the human practice of drug use, e.g., escalation in the quantity of drug used over time. Typical CPP studies have utilized a fixed dosage of drug, administered daily over several days. Based on our current findings, we suggest using an ascending dose schedule, not only because it results in a greater magnitude of CPP but also because it better resembles human drug use. Finally, the demonstration that the descending dose schedule of cocaine produced extinction of approach behavior that is partially resistant to reinstatement suggests that this technique may be practical for the treatment of drug addiction and the prevention of relapse.
Acknowledgments
This work was supported by grants R01DA026878 and R21DA029404 from the National Institute on Drug Abuse, National Institutes of Health, USA.
References
- Balda MA, Anderson KL, Itzhak Y. Adolescent and adult responsiveness to the incentive value of cocaine reward in mice: role of neuronal nitric oxide synthase (nNOS) gene. Neuropharmacology. 2006;51:341–349. doi: 10.1016/j.neuropharm.2006.03.026.
- Bardo MT, Bevins RA. Conditioned place preference: what does it add to our preclinical understanding of drug reward? Psychopharmacology. 2000;153:31–43. doi: 10.1007/s002130000569.
- Bayer HM, Glimcher PW. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005;47:129–141. doi: 10.1016/j.neuron.2005.05.020.
- Bouton ME. Context and behavioral processes in extinction. Learning and Memory. 2004;11:485–494. doi: 10.1101/lm.78804.
- Childress AR, Mozley PD, McElgin W, Fitzgerald J, Reivich M, O’Brien CP. Limbic activation during cue-induced cocaine craving. The American Journal of Psychiatry. 1999;156:11–18. doi: 10.1176/ajp.156.1.11.
- Fiorillo CD, Newsome WT, Schultz W. The temporal precision of reward prediction in dopamine neurons. Nature Neuroscience. 2008;11:966–973. doi: 10.1038/nn.2159.
- Hyman SE, Malenka RC, Nestler EJ. Neural mechanisms of addiction: the role of reward-related learning and memory. Annual Review of Neuroscience. 2006;29:565–598. doi: 10.1146/annurev.neuro.29.051605.113009.
- Itzhak Y, Anderson KL. Ethanol-induced behavioral sensitization in adolescent and adult mice: role of the nNOS gene. Alcoholism, Clinical and Experimental Research. 2008;32:1839–1848. doi: 10.1111/j.1530-0277.2008.00766.x.
- Itzhak Y, Martin JL. Cocaine-induced conditioned place preference in mice: induction, extinction and reinstatement by related psychostimulants. Neuropsychopharmacology. 2002;26:130–134. doi: 10.1016/S0893-133X(01)00303-7.
- Itzhak Y, Roger-Sanchez C, Kelley JB, Anderson KL. Discrimination between cocaine-associated context and cue in a modified conditioned place preference paradigm: role of the nNOS gene. The International Journal of Neuropsychopharmacology. 2010;13:171–180. doi: 10.1017/S1461145709990666.
- Marks KR, Kearns DN, Christensen CJ, Silberberg A, Weiss SJ. Learning that a cocaine reward is smaller than expected: a test of Redish’s computational model of addiction. Behavioural Brain Research. 2010;212:204–207. doi: 10.1016/j.bbr.2010.03.053.
- Mueller D, Stewart J. Cocaine-induced conditioned place preference: reinstatement by priming injections of cocaine after extinction. Behavioural Brain Research. 2000;115:39–47. doi: 10.1016/s0166-4328(00)00239-4.
- Newlin DB. A comparison of drug conditioning and craving for alcohol and cocaine. Recent Developments in Alcoholism. 1992;10:147–164. doi: 10.1007/978-1-4899-1648-8_8.
- Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical conditioning II. New York: Appleton-Century-Crofts; 1972. pp. 64–99.
- Robbins SJ, Ehrman RN, Childress AR, O’Brien CP. Comparing levels of cocaine cue reactivity in male and female outpatients. Drug and Alcohol Dependence. 1999;53:223–230. doi: 10.1016/s0376-8716(98)00135-5.
- Robbins TW, Ersche KD, Everitt BJ. Drug addiction and the memory systems of the brain. Annals of the New York Academy of Sciences. 2008;1141:1–21. doi: 10.1196/annals.1441.020.
- Ross S, Peselow E. Pharmacotherapy of addictive disorders. Clinical Neuropharmacology. 2009;32:277–289. doi: 10.1097/wnf.0b013e3181a91655.
- Sanchis-Segura C, Spanagel R. Behavioural assessment of drug reinforcement and addictive features in rodents: an overview. Addiction Biology. 2006;11:2–38. doi: 10.1111/j.1369-1600.2006.00012.x.
- Schneider M, Heise V, Spanagel R. Differential involvement of the opioid receptor antagonist naloxone in motivational and hedonic aspects of reward. Behavioural Brain Research. 2010;208:466–472. doi: 10.1016/j.bbr.2009.12.013.
- Schultz W. Multiple dopamine functions at different time courses. Annual Review of Neuroscience. 2007;30:259–288. doi: 10.1146/annurev.neuro.28.061604.135722.
- Tzschentke TM. Measuring reward with the conditioned place preference (CPP) paradigm: update of the last decade. Addiction Biology. 2007;12:227–462. doi: 10.1111/j.1369-1600.2007.00070.x.
- White NM. Addictive drugs as reinforcers: multiple partial actions on memory systems. Addiction. 1996;91:921–949.
- White NM, Carr GD. The conditioned place preference is affected by two independent reinforcement processes. Pharmacology, Biochemistry, and Behavior. 1985;23:37–42. doi: 10.1016/0091-3057(85)90127-3.
- White NM, Milner PM. The psychobiology of reinforcers. Annual Review of Psychology. 1992;43:433–471. doi: 10.1146/annurev.ps.43.020192.002303.