Learning & Memory. 2008 May;15(5):299–303. doi: 10.1101/lm.762508

Pavlovian influences on goal-directed behavior in mice: The role of cue-reinforcer relations

Hans S Crombag 1,2,3, Ezequiel M Galarce 1,2, Peter C Holland 1,2
PMCID: PMC3097034  NIHMSID: NIHMS290755  PMID: 18441288

Abstract

Two experiments refined procedures to study Pavlovian influences on goal-directed behavior in mice and studied the effects of CS–US relations in Pavlovian-instrumental interactions. Independent groups of mice underwent Pavlovian training to associate either a 10-sec or 2-min auditory stimulus (CS) with reward. We next assessed the ability of the response-contingent CS presentations to reinforce novel instrumental responding (conditioned reinforcement; CRf) or the ability of noncontingent CS presentations to increase ongoing instrumental responding (Pavlovian-instrumental transfer; PIT). Whereas 10-sec training conditions produced strong CRf (and no PIT), 2-min training conditions produced robust PIT (but no CRf).


Contemporary theories of appetitive motivation emphasize roles for Pavlovian conditioned stimuli (CSs) in guiding and energizing goal-directed behavior (Toates 1986; Berridge and Robinson 2003; Dickinson and Balleine 2007). For example, CSs paired with unconditioned food rewards (USs) can modulate the rate of ongoing instrumental responding for food (Pavlovian-instrumental transfer, or PIT), and/or serve as reinforcers to support acquisition of new instrumental responding (conditioned reinforcement, or CRf). Recently, there has been a resurgence of interest in these phenomena because of their potential role in the development and maintenance of different psychopathologies, including eating disorders and substance abuse.

Although a number of models for studying appetitive Pavlovian-instrumental interactions in rats and monkeys have been developed, and some critical variables that influence performance have been identified, few of these procedures have been carefully adapted for use in mice. Given the ever-expanding use of transgenic mouse models targeting cellular and molecular mechanisms in synaptic plasticity and learning and memory, development of mouse models of Pavlovian-instrumental interactions will be valuable.

Here, we report on two procedures that produce robust PIT and CRf in mice and that demonstrate the importance of CS–US relations in Pavlovian learning (Rescorla 1988). Independent groups of mice received Pavlovian pairings of either a 10-sec or 2-min auditory stimulus (CS) with reward delivery. Next, the ability of the CS to serve as a conditioned reinforcer for the acquisition of instrumental responding (CRf, Experiment 1) or to modulate the rate of previously trained lever pressing for milk (PIT, Experiment 2) was assessed. We observed that, whereas training with longer CSs that had longer and variable temporal intervals between CS and US reward presentations (interstimulus interval or ISI) generated greater PIT and little CRf, training with shorter CSs that had short and constant ISIs generated more robust CRf, and little PIT. This double dissociation is consistent with previous suggestions that CS–US temporal and/or predictive relations can markedly alter the nature or content of conditioning as well as its amount (e.g., Holland 1980; Silva and Timberlake 1998).

Results

Experiment 1: Conditioned reinforcement

The ability of the Pavlovian CSs to serve as conditioned reinforcers for the acquisition of novel instrumental nose-poke responding was assessed in a single test session (Fig. 1A) in the absence of primary reinforcement. Nose-pokes to one port (CS port) produced presentation of the 3-sec CS, whereas nose-pokes to the other port (no-CS port) had no consequences. Mice trained with 10-sec CSs performed significantly more nose-pokes to the CS port than to the no-CS port, indicating that the CS was a potent conditioned reinforcer. In contrast, mice trained with the 2-min CS showed no significant CRf effect, as response levels producing the CS did not differ from baseline (no-CS) levels. Thus, the conditioned reinforcing effect of the CS differed depending on the CS duration used during training.

Figure 1.


Results of the conditioned reinforcement (CRf) test in Experiment 1 of mice trained using a 2-min or 10-sec Pavlovian CS. (A) Dark bars show the average (+ SEM) number of nose-poke responses into the port producing 3-sec CS presentations, and white bars show responses into the no-CS port. A Group (10-sec or 2-min CSs in training) X Stimulus (CS or no-CS) ANOVA showed a significant main effect of Stimulus (F(1,14) = 13.3, P < 0.01), indicating that CS-reinforced instrumental responding was increased compared with nonreinforced instrumental response levels. Additionally, there was a significant main effect of Group (F(1,14) = 10.5, P < 0.01), and a significant interaction between Group and Stimulus (F(1,14) = 4.9, P < 0.05), indicating that the conditioned reinforcing effect of the CS differed depending on the CS duration used during training. Post-hoc analyses indicated that mice trained with the 10-sec CS responded more to the CS port than to the no-CS port (P < 0.01), whereas mice trained in the 2-min CS condition failed to show such enhanced responding to the CS-reinforced nose-poke port. (B) Average (+ SEM) number of entries into the liquid reward magazine during the test for CRf. Mice trained in the 2-min CS condition made significantly more liquid receptacle entries than mice trained in the 10-sec condition (t = 4.5, P < 0.001). No milk rewards were presented during the 40-min CRf test session.

Although the mice trained in the 2-min condition showed little evidence for CRf of nose-poking, they showed substantial liquid receptacle entry during the test session (Fig. 1B). Thus, despite their not being effective conditioned reinforcers, the 3-sec response-contingent CSs nevertheless maintained some aspects of their original Pavlovian conditioning. Because receptacle entries were higher in the mice trained in the 2-min condition, it is possible that their failure to perform the nose-poke response was the consequence of competition from liquid receptacle responses. Alternatively, greater receptacle responding in the 2-min condition may reflect reduced competition from CRf nose-poking. Although the correlation between these two responses was not significant (Pearson’s r = −0.39, P = 0.14), we cannot fully exclude these possibilities (see Discussion).

Experiment 2: Pavlovian-instrumental transfer

Mice rapidly acquired instrumental lever responding during training (data not shown). Mean ± SEM responses/min on the final session were 9.0 ± 0.9 in the 10-sec group and 9.9 ± 1.5 in the 2-min group.

Figure 2A shows instrumental lever-press rates during the 2-min CS and ITI periods, averaged over the entire test. PIT (enhanced responding during the CS vs. ITI periods) was substantial in the 2-min group, but not in the 10-sec group. Thus, whereas CS presentations increased response rates in the 2-min group (P < 0.001), they failed to do so in the 10-sec group. Importantly, these effects of training condition were seen only for responding on the previously rewarded lever and not for control lever responding (P’s > 0.6).

Figure 2.


Results of the Pavlovian-instrumental transfer (PIT) test in Experiment 2 of mice trained using a 10-sec or 2-min Pavlovian CS. (A) Average (+ SEM) number of responses on the previously reinforced lever during the 2-min no-CS (ITI; white bars) and CS (dark bars) periods. A Group (10-sec or 2-min CS durations in training) X Stimulus (CS or ITI) ANOVA yielded a significant effect of Stimulus (F(1,13) = 48.27, P < 0.0001), no significant effect of Group (F(1,13) = 3.9, P = 0.07), and a significant interaction of Stimulus with Group (F(1,13) = 14, P < 0.01). (B) Average (+ SEM) number of entries into the liquid reward magazine during the 2-min ITI and CS periods. A Group X Stimulus ANOVA showed a significant effect of Stimulus (F(1,13) = 32.4, P < 0.0001), but no Group or Stimulus X Group interaction effect (P’s > 0.4). No milk rewards were presented during the 32-min PIT test session.

Figure 2B shows liquid receptacle entries during CS and ITI periods in the PIT test. Although the CS enhanced lever presses only in mice trained with a 2-min CS, it evoked similar rates of receptacle entry in both groups. Thus, the mice trained with 10-sec CSs displayed evidence for the maintenance of Pavlovian CRs to the 2-min test CSs, despite their failure to show PIT throughout those CSs.

Notably, the temporal distribution of lever-press responding during CS presentations in the PIT test differed substantially between groups. Figure 3 illustrates the strong CS stimulus control over instrumental lever responding observed during testing, which was most pronounced in the 2-min group. In this group (Fig. 3B; dark circles in Fig. 3C), CS presentations increased lever responding selectively during the period of CS presentation, and response levels remained at low baseline levels during ITI periods. Although Figure 3C suggests a gradual increase in lever-press rate over the course of the CS presentation interval, inspection of Figure 3B shows that mice often made rapid transitions from low to high rates of responding within the CS period. In contrast, in the 10-sec group (Fig. 3A), CS presentations produced only small increases in lever responding over baseline and, as observed by Holland and Gallagher (2003) in rats trained with 10-sec CSs, these elevations were not evident until 20–30 sec after CS onset.

Figure 3.


Temporal pattern (in 10-sec bins) of the active lever responses during the Pavlovian-instrumental transfer (PIT) test of mice trained using a 10-sec (A) or 2-min (B) Pavlovian CS. Shaded area delineates 2-min CS presentation periods. (C) Average (±SE) active lever responding combined across all eight CS presentations in the 2-min group (●) and the 10-sec group (◦). BL reflects the average baseline response level during the 10 sec prior to CS onset.

Discussion

Depending on training conditions, both CRf and PIT were very robust in C57BL/6 mice. In Experiment 1, mice in the 10-sec group performed nose-poke responses that earned a previously reinforced CS at double the rate that they performed a nose-poke that had no such consequence. In Experiment 2, presentations of a previously reinforced CS tripled the baseline rate of instrumental lever pressing of mice in the 2-min group. Thus, these procedures for revealing learned incentive motivational properties of Pavlovian CSs are well suited for use in mice.

At the same time, there was a double dissociation between the conditions of Pavlovian CS training that favored PIT versus CRf. CRf was substantial, but PIT was very small in the 10-sec groups, in which the CS was 10 sec in duration and reward presentation always coincided with the last 5 sec of the CS. Conversely, PIT was substantial but CRf small in the 2-min groups, in which the CS was 2 min in duration and reward was delivered at random times throughout the CS period, on average every 30 sec.

Several factors may have contributed to this double dissociation. First, it is possible that CRf and PIT reflect distinct psychological processes, which are differentially favored by our two Pavlovian training conditions. The two training conditions differed in both the delay and variability of US deliveries following CS onset. Whereas in the 10-sec condition, reinforcement always followed CS onset by 5 sec and coterminated with it, in the 2-min condition, delivery of the US occurred at a minimum of 15 sec after CS onset and was variable during the 2-min CS period. These two different CS–US conditions could have had qualitatively different consequences for the type of information acquired by the CS. This notion is similar to previous discussions of how CS–US interval and/or expected delay-to-reward affects the type of conditioned response system activated in conditioning (e.g., Konorski 1967; Lovibond 1981; Wagner and Brandon 1989). These investigators noted that long CSs favor the development of more preparatory CRs, which modulate ongoing behavior (as in PIT), whereas shorter duration CSs favor consummatory responses. Our results suggest, additionally, that shorter CSs might also be more likely to acquire conditioned reinforcement properties. Thus, our findings are consistent with the notion that CS and US relations critically determine the impact that CSs have on motivated behavior.

In this regard, it is notable that a variety of data indicate that CRf and PIT rely on different neurobiological substrates. For example, with simple, single-reinforcer procedures like those used here, lesions of the central nucleus of the amygdala disrupt PIT, but not CRf, whereas lesions of the basolateral amygdala disrupt CRf but not PIT (for review, see Everitt et al. 2000). More recently, using transgenic mouse models targeting molecular mechanisms involved in synaptic plasticity (LTP and LTD), Mead and Stephens (2003a, b) and Crombag et al. (2006) found PIT and CRf to be affected differentially by select glutamate receptor mutations. This neurobiological dissociation is consistent with the aforementioned notion that our different Pavlovian training conditions may have recruited qualitatively different response systems.

Second, the conditions used for assessing PIT (2-min CS durations) were more similar to the training conditions of the 2-min groups, and the conditions used for assessing CRf (3-sec CS durations) were more similar to the training conditions of the 10-sec groups. Thus, poorer performance during tests may have reflected “generalization decrement” induced by stimulus change between training and testing. Arguing against this account, however, are the conditioned liquid receptacle entry data, which show little evidence of such generalization decrement. In PIT testing, similar liquid receptacle entry rates during 2-min CS presentations were observed in both groups, despite the much greater PIT in the 2-min group. Likewise, although 3-sec CS presentations were more effective as conditioned reinforcers in the 10-sec group, more liquid receptacle responses were observed during testing in the 2-min group.

Third, conditioned responses often reflect the timing of reward delivery (e.g., Meck and Church 1984; Lattal 1999; Holland 2000; Delamater and Oakeshott 2007). If incentive motivational responses also displayed such timing, then one might expect them to reach maximal strength 5–10 sec after CS onset in the 10-sec group, rising and falling rapidly before and after that time interval, but to display a lower peak and a broader distribution across time in the 2-min group. Thus, peak incentive value might occur much sooner after nose-poke performance in the 10-sec groups than in the 2-min groups; because reinforcement (in general) is typically more effective the shorter the delay of reward, conditioned reinforcement would similarly be expected to be greater in the 10-sec group. In contrast, in the PIT tests, the broad temporal extent of the incentive response in the 2-min group would influence instrumental responding for a longer period of time than in the 10-sec group, producing a more robust overall PIT effect. However, the data shown in Figure 3 do not support this account. In the 10-sec group, no early spike in PIT was observed, and the slight elevation in instrumental responding that was observed persisted for nearly the entire CS duration. Specifically, mean time to peak responding during the test for PIT was 67.1 ± 11.5 sec for the 10-sec group and 82.5 ± 10.8 sec for the 2-min group (unpaired t-test, P = 0.18). Thus, the temporal distribution of PIT observed in both groups is more consistent with the idea of a slowly recruiting conditioned emotional response.

Fourth, as noted earlier, it is possible that the low level of CRf we observed in the 2-min condition was due in part to competition between nose-poke and food receptacle behaviors. Interestingly, other investigators have implicated such competition in failures to observe PIT. For instance, Baxter and Zamble (1982) reported that short duration CSs can be made to potentiate ongoing instrumental responding as long as the animals are prevented from acquiring competing (e.g., magazine approach) responses. Similarly, Delamater and Oakeshott (2007) found that Pavlovian extinction experience actually facilitates (sensory-specific) PIT, perhaps reflecting extinction-induced reduction of competing Pavlovian approach responding. Notably, we found no evidence for competition effects in our PIT tests: 10-sec and 2-min groups showed very different levels of PIT despite comparable food receptacle responding. Indeed, although we cannot fully discount a contribution of response competition to our CRf results, the lack of differential receptacle responding or competition effects in our PIT test makes such an account less likely.

Finally, although we favor the view that shorter CS–US intervals favor conditioned reinforcement, whereas longer intervals favor PIT, we note again that the two Pavlovian training conditions differed in a number of other ways, including the mean wait time to US delivery and the variance in that time and, as a consequence, the ratio of CS–US to intertrial interval, the probabilistic nature of US delivery, as well as the relative density of CS–US pairings. These variables are known to affect the amount (e.g., Wynne et al. 1996) and form (e.g., Holland 1980) of conditioned responding, and may have contributed to our results. Clearly, future experiments will be required to better determine which of these are most critical in determining the ability of Pavlovian CSs to support PIT versus CRf.

Regardless, the conclusion remains that temporal CS–US variables influence the observation of CRf and PIT. Although the implications of these results for understanding psychopathological conditions such as eating disorders or addiction are not straightforward, they illustrate the complexity and dynamics of the learning and memory processes involved. We suggest, therefore, that future studies into the molecular basis of these disorders, including those using mice with genetic mutations, carefully consider how seemingly minor differences in training parameters can fundamentally alter the psychological and, presumably, neurobiological mechanisms involved.

Materials and Methods

Twenty-four C57BL/6 mice were housed four per cage in a climate-controlled facility (12:12-h light:dark cycle), in accordance with NIH and Johns Hopkins standards. Starting 3 d prior to training and for the duration of the experiments, mice were food restricted to about 85%–90% of their free-feeding weight.

Training and testing were conducted in standard conditioning chambers (ENV-307W, Med Associates, Inc.) located inside sound-attenuating enclosures. A fan provided constant ventilation and low-level background noise, and an incandescent house light provided low-level illumination. Each chamber was equipped with a motorized dipper that delivered 30% sweetened milk in 0.01-cc volumes into a recessed receptacle. Photocells recorded entries into the receptacle. A speaker, used to present auditory stimuli, was located on the wall opposite the liquid receptacle. In the CRf test of Experiment 1, nose-poke devices were located 5 cm to each side of the receptacle. In the instrumental training and transfer phases of Experiment 2, levers were placed in these locations.

Pavlovian training

Pavlovian training was identical in the two experiments. Mice first received two 40-min sessions to train them to approach the liquid receptacle and consume the milk (US). Next, mice were given Pavlovian training sessions to establish an association between an auditory cue (CS) and milk delivery (US). The CS was a broadband white-noise stimulus with the amplitude set 5 dB above background. For the 10-sec group, the CS was presented for 10 sec, and milk presentation coincided with the last 5 sec of the CS and coterminated with it. In each 20-min session, 40 CS–US trials were given; the time between CS presentations (intertrial interval or ITI) was variable (mean = 30 sec; range = 15–45 sec). For the 2-min group, the CS was presented for a fixed period of 120 sec and milk was presented at random times during the CS, with an average interval of 30 sec. Thus, during every 2-min CS, about four USs were presented. The ITI period between CS presentations was again varied and averaged 120 sec (range 60–180 sec). A total of 10 CSs were presented during each 40-min training session. Thus, both groups received a total of about 40 US presentations in each session. Entries into the liquid receptacle were recorded, but because in the 2-min group liquid could be delivered at any time, these entries were not a meaningful measure of learned responding, and are not presented here.
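The session arithmetic above can be checked with a short sketch. This is an illustration only, not the authors' control software: the class and field names are hypothetical, and the 2-min condition is summarized by its stated mean US interval rather than by the actual (unreported) randomization procedure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PavlovianCondition:
    """Summary of one training condition (names are illustrative assumptions)."""
    name: str
    cs_duration_s: float
    cs_per_session: int
    # None means exactly one US per CS (the 10-sec condition).
    mean_us_interval_s: Optional[float] = None

    def expected_us_per_cs(self) -> float:
        if self.mean_us_interval_s is None:
            return 1.0
        return self.cs_duration_s / self.mean_us_interval_s

    def expected_us_per_session(self) -> float:
        return self.cs_per_session * self.expected_us_per_cs()

    def cs_time_per_session_s(self) -> float:
        return self.cs_per_session * self.cs_duration_s


# Parameter values taken from the Pavlovian training description.
TEN_SEC = PavlovianCondition("10-sec", cs_duration_s=10.0, cs_per_session=40)
TWO_MIN = PavlovianCondition("2-min", cs_duration_s=120.0, cs_per_session=10,
                             mean_us_interval_s=30.0)

for cond in (TEN_SEC, TWO_MIN):
    print(cond.name,
          cond.expected_us_per_session(),   # expected USs per session
          cond.cs_time_per_session_s())     # total seconds of CS per session
```

Under these assumptions both conditions yield about 40 USs per session, but the 2-min condition spreads them over three times as much CS time (1200 sec versus 400 sec), which is one of the confounded variables acknowledged in the Discussion.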

Experiment 1: Conditioned reinforcement

At the conclusion of Pavlovian training, CRf was assessed in a single 40-min test, with nose-poke ports mounted on each side of the liquid receptacle. For both groups of mice, responding into one port resulted in a 3-sec presentation of the CS and responding into the alternate port had no consequences. The CS-reinforced port (left or right) was counterbalanced within each group. Mice were tested under extinction conditions; no milk USs were presented at any time. The response measures included responses to the CS-reinforced and nonreinforced nose-poke ports and entries into the liquid receptacle.
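The response contingency of the CRf test can be sketched as follows. This is a minimal sketch, not the authors' program: the function name and the event format are hypothetical, and the rule that a poke made while the CS is already sounding is counted but does not restart the CS is an assumption, since the text does not say how such pokes were handled.

```python
def run_crf_test(pokes, cs_port="left", cs_duration_s=3.0, session_s=40 * 60):
    """Apply the CRf contingency to a list of (time_s, port) nose-poke events.

    Returns the response counts per port and the onset times of the
    CS presentations earned. No US is ever delivered (extinction test).
    """
    counts = {"cs_port": 0, "no_cs_port": 0}
    cs_onsets = []
    cs_off_at = float("-inf")  # time at which the current CS ends

    for t, port in sorted(pokes):
        if not 0 <= t < session_s:
            continue  # ignore events outside the 40-min session
        if port == cs_port:
            counts["cs_port"] += 1
            # Assumption: a poke during an ongoing CS does not restart it.
            if t >= cs_off_at:
                cs_onsets.append(t)
                cs_off_at = t + cs_duration_s
        else:
            counts["no_cs_port"] += 1  # this port has no programmed consequence
    return counts, cs_onsets


# Hypothetical events: three pokes to the CS port, one to the other port,
# and one after the session has ended.
events = [(1.0, "left"), (2.0, "left"), (5.0, "right"),
          (10.0, "left"), (2500.0, "left")]
counts, cs_onsets = run_crf_test(events)
print(counts, cs_onsets)
```

The CRf effect reported in Figure 1A corresponds to comparing the two counts returned here across mice.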

Experiment 2: Pavlovian-instrumental transfer

After Pavlovian training, mice underwent training to acquire stable operant lever responding on a variable-interval (VI) schedule of reinforcement for the same milk reward. For each mouse, responses on either the right or left lever (counterbalanced within groups) resulted in the delivery of a milk reward for 5 sec, whereas presses on the alternate (control) lever had no consequences. Mice were first trained in 30-min sessions, in which each response was reinforced. After robust levels of lever responding were obtained, the schedule requirement for reinforcement was changed across sessions such that responding was reinforced on average every 15 sec (VI-15 sec), then every 30 sec (VI-30 sec), and finally every 60 sec (VI-60 sec).
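For readers unfamiliar with variable-interval schedules, the logic can be sketched as below: a reward becomes available after a variable time, and the first lever press after that point is reinforced. The sketch is illustrative only. The function names are hypothetical, and drawing intervals from an exponential distribution is an assumption; the text gives only the mean interval for each schedule.

```python
import random


def simulate_vi(press_times_s, draw_interval):
    """Return the press times that are reinforced under a VI schedule.

    draw_interval is a zero-argument callable giving the next wait (in sec)
    before a reward is armed; its mean defines the schedule (e.g., VI-60 sec).
    """
    rewarded = []
    armed_at = draw_interval()  # first reward becomes available at this time
    for t in sorted(press_times_s):
        if t >= armed_at:
            rewarded.append(t)               # first press after arming pays off
            armed_at = t + draw_interval()   # then the next interval starts
    return rewarded


def exponential_intervals(mean_s, seed=0):
    """Assumed interval generator: exponential waits with the given mean."""
    rng = random.Random(seed)
    return lambda: rng.expovariate(1.0 / mean_s)


# Deterministic illustration with a constant 15-sec interval.
fixed = simulate_vi([5, 16, 20, 31, 50], lambda: 15.0)
print(fixed)

# One press per second for a 30-min session on an assumed VI-60 sec schedule.
presses = [float(s) for s in range(1800)]
earned = simulate_vi(presses, exponential_intervals(60.0))
print(len(earned))
```

Because reward availability depends on elapsed time rather than on the number of presses, pressing faster earns little extra reward, which is why such schedules tend to produce the steady response baselines needed for a PIT test.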

PIT was evaluated in a final 32-min test session with both levers available but under extinction conditions. The CS was presented for 2-min periods separated by fixed 2-min ITI periods for a total of eight CS presentations. The impact of CS presentation on instrumental responding was determined by comparing lever response rates during the CS period with ITI levels of responding. Responding on the control lever and liquid receptacle entry responses were also recorded during the test session.
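The CS versus ITI comparison described above, and the 10-sec binning used in Figure 3, can be sketched from a list of lever-press timestamps. This is a hedged illustration rather than the authors' analysis code: the function names are hypothetical, and the assumption that each session starts with an ITI period followed by a CS period is ours, as the text does not state the order.

```python
def pit_rates(press_times_s, n_cs=8, period_s=120.0, iti_first=True):
    """Return (cs_rate, iti_rate) in responses/min for one PIT test session."""
    cycle_s = 2 * period_s               # one ITI period plus one CS period
    session_s = n_cs * cycle_s           # 8 cycles of 240 sec = 32 min
    cs_presses = iti_presses = 0
    for t in press_times_s:
        if not 0 <= t < session_s:
            continue
        phase = t % cycle_s
        # Assumption: with iti_first, the CS occupies the second half of a cycle.
        in_cs = (phase >= period_s) if iti_first else (phase < period_s)
        if in_cs:
            cs_presses += 1
        else:
            iti_presses += 1
    minutes_per_condition = n_cs * period_s / 60.0
    return cs_presses / minutes_per_condition, iti_presses / minutes_per_condition


def bin_counts(press_times_s, session_s=1920.0, bin_s=10.0):
    """Count presses in consecutive 10-sec bins across the session."""
    n_bins = int(session_s // bin_s)
    counts = [0] * n_bins
    for t in press_times_s:
        if 0 <= t < session_s:
            counts[int(t // bin_s)] += 1
    return counts


# Hypothetical timestamps (sec): two ITI presses, three CS presses,
# and one press after the session has ended.
presses = [10.0, 130.0, 140.0, 250.0, 370.0, 2000.0]
cs_rate, iti_rate = pit_rates(presses)
bins = bin_counts(presses)
print(cs_rate, iti_rate, len(bins), sum(bins))
```

A PIT effect corresponds to the CS rate exceeding the ITI rate on the previously rewarded lever; the same function applied to control-lever presses gives the comparison reported for that lever.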

Acknowledgments

We thank Jonathan Nucum, Anjana Muralidharan, and Jeffery Sutton for their excellent assistance in conducting the experiments. This research was supported by center grant P40-RR-017688 to the Neurogenetics and Behavior Center.


References

  1. Baxter D., Zamble E. Reinforcer and response specificity in appetitive transfer of control. Anim. Learn. Behav. 1982;10:201–210.
  2. Berridge K.C., Robinson T.E. Parsing reward. Trends Neurosci. 2003;26:507–513. doi: 10.1016/S0166-2236(03)00233-9.
  3. Crombag H.S., Sutton J.M., Takamiya K., Holland P.C., Huganir R.L. A unique and selective role for GluR1 s831 phosphorylation in incentive learning. Society for Neuroscience Conference, Atlanta, GA; 2006.
  4. Delamater A.R., Oakeshott S. Learning about multiple attributes of reward in Pavlovian conditioning. Ann. N. Y. Acad. Sci. 2007;1104:1–20. doi: 10.1196/annals.1390.008.
  5. Dickinson A., Balleine B. The role of learning in the operation of motivational systems. In: Gallistel C.R., editor. Stevens' handbook of experimental psychology: Learning, motivation and emotion. Vol. 3. Wiley and Sons; New York: 2007. pp. 497–534.
  6. Everitt B.J., Cardinal R.N., Hall J., Parkinson J.A., Robbins T.W. Differential involvement of amygdala subsystems in appetitive conditioning and drug addiction. In: Aggleton J., editor. The amygdala: A functional analysis. Oxford University Press; New York: 2000. pp. 353–390.
  7. Holland P.C. CS-US interval as a determinant of the form of Pavlovian appetitive conditioned responses. J. Exp. Psychol. Anim. Behav. Process. 1980;6:155–174.
  8. Holland P.C. Trial and intertrial durations in appetitive conditioning in rats. Anim. Learn. Behav. 2000;28:121–135.
  9. Holland P.C., Gallagher M. Double dissociation of the effects of lesions of basolateral and central amygdala on conditioned stimulus-potentiated feeding and Pavlovian-instrumental transfer. Eur. J. Neurosci. 2003;17:1680–1694. doi: 10.1046/j.1460-9568.2003.02585.x.
  10. Konorski J. Integrative activity of the brain; an interdisciplinary approach. University of Chicago Press; Chicago, IL: 1967.
  11. Lattal K.M. Trial and intertrial durations in Pavlovian conditioning: Issues of learning and performance. J. Exp. Psychol. Anim. Behav. Process. 1999;25:433–450. doi: 10.1037/0097-7403.25.4.433.
  12. Lovibond P.F. Appetitive Pavlovian-instrumental interactions: Effects of inter-stimulus interval and baseline reinforcement conditions. Q. J. Exp. Psych. B. 1981;33:257–269. doi: 10.1080/14640748108400811.
  13. Mead A.N., Stephens D.N. Involvement of AMPA receptor GluR2 subunits in stimulus-reward learning: Evidence from glutamate receptor gria2 knock-out mice. J. Neurosci. 2003a;23:9500–9507. doi: 10.1523/JNEUROSCI.23-29-09500.2003.
  14. Mead A.N., Stephens D.N. Selective disruption of stimulus-reward learning in glutamate receptor gria1 knock-out mice. J. Neurosci. 2003b;23:1041–1048. doi: 10.1523/JNEUROSCI.23-03-01041.2003.
  15. Meck W.H., Church R.M. Simultaneous temporal processing. J. Exp. Psychol. Anim. Behav. Process. 1984;10:1–29.
  16. Rescorla R. Behavioral studies of Pavlovian conditioning. Annu. Rev. Neurosci. 1988;11:329–352. doi: 10.1146/annurev.ne.11.030188.001553.
  17. Silva K.M., Timberlake W. The organization and temporal properties of appetitive behavior in rats. Anim. Learn. Behav. 1998;26:182–195.
  18. Toates F.M. Motivational systems. Cambridge University Press; Cambridge and New York: 1986.
  19. Wagner A.R., Brandon S.E. Evolution of a structured connectionist model of Pavlovian conditioning (AESOP). In: Klein S.B., Mowrer R.R., editors. Contemporary learning theories: Pavlovian conditioning and the status of traditional learning theory. Erlbaum; Hillsdale, NJ: 1989. pp. 149–189.
  20. Wynne C.D., Staddon J.E., Delius J.D. Dynamics of waiting in pigeons. J. Exp. Anal. Behav. 1996;65:603–618. doi: 10.1901/jeab.1996.65-603.

Articles from Learning & Memory are provided here courtesy of Cold Spring Harbor Laboratory Press
