Abstract
Sign-tracking is a form of autoshaping in which animals develop conditioned responding directed toward stimuli predictive of an outcome even though the outcome is not contingent on the animal's behavior. Sign-tracking behaviors are thought to arise out of the attribution of incentive salience (i.e., motivational value) to reward-predictive cues. It is not known how incentive salience would be attributed to serially occurring cues, despite cues often occurring in a sequence in the real world as reward approaches. The experiments presented here demonstrate that reward-proximal cue responding is not altered by the presence of a distal reward cue (Experiment 1), and similarly that reward-distal cue responding, which animals favor, is not altered by the presence of a reward-proximal cue (Experiment 2). Extinction of reward-proximal cues after training of the serial sequence leads to a generalized reduction in lever responding (Experiment 3). Together, we show that both Pavlovian serial lever cues acquire motivational value. These experiments also provide support for the notion that sign-tracking responses are insensitive to changes in outcome value, and that responding to serial cues creates a distinct context for outcome value.
Incentive motivation can take different forms in behavior. A common one is approach to a reward when its availability is indicated by a cue. When cues are discrete and localizable, motivated behavior often can also be directed to the cue itself (Brown and Jenkins 1968; De Tommaso et al. 2017). This behavior, also known as sign-tracking, is thought to reflect in part the attribution of incentive value to the cue by virtue of its pairing with reward (Berridge 2004; Flagel and Robinson 2017). Behavioral and neuroscientific investigations have shed light on how incentive salience, as manifest in sign-tracking, relates to cue-reward relationships and how it is enabled by neural systems (Berridge 2007; Tomie et al. 2008; Flagel and Robinson 2017). However, such work involves the use of a single reward-predictive cue, typically an audio/visual cue or lever insertion cue, while in the real world motivated behavior is often guided by a series of stimuli leading up to a reward. For example, a meal is signaled by many sequential cues including food sights, sounds, and smells. It remains unknown how incentive salience attribution works in the situation of serial reward cues that draw in motivated behavior.
Prior work using a serial design with two cues found a bias in neural responses (in the ventral pallidum) to a distal auditory cue, but when motivational state was boosted with dopaminergic or opioid drugs, that response became accentuated to the proximal auditory cue and in parallel motivated behavior was enhanced (Smith et al. 2011). This finding highlighted a potential distal cue bias for motivation in normal conditions, with motivational state changes having their greatest effect on proximal cues. According to learning models, such as those rooted in prediction-error learning and informed by the activity dynamics of dopamine neurons, the proximal cue in a distal-proximal sequence would initially gain the most associative strength due to acquiring a stronger prediction error (Ludvig et al. 2012). For example, a nictitating membrane study in rabbits demonstrated that proximal cues facilitate responding to distal cues; without proximal cues, little, if any, responding occurs to the distal cue. The proximal cue dominance for learning is especially true when the interval between proximal and US delivery is short (Kehoe et al. 1979). In contrast, however, there is also evidence that stimulus learning can progress with time to favor the distal cue as, plausibly, it grows to carry the greatest predictive strength regarding the timing of events to follow. A bias in neural responses toward a distal cue has been documented in several motivation-related brain areas including in the ventral pallidum, as noted, and in dopamine neurons (Schultz et al. 1993; Schultz 2002; Smith et al. 2011; Collins et al. 2016). There is also clear evidence that sequential cues can become associatively linked, such as from experiments (Rescorla 1972; Holland and Ross 1981) showing that the extinction of a stimulus in sequence will lead to reduction of conditioned responding specific to temporally paired stimuli.
In instrumental learning conditions that use levers for operant requirements, reward-distal and reward-proximal stimuli can also be processed differently. Work on heterogeneous instrumental chains, as well as second order conditioning, has shown that actions temporally nearer to reward can be more sensitive to shifts in motivational state, while those further away can be more sensitive to incentive learning (Holland and Rescorla 1975; Balleine et al. 1995), and a distal-action bias has been found in dopaminergic signaling as well (Wassum et al. 2012). However, while two sequential actions can be dissociated in motivational sensitivity, there is also evidence that they can become associatively coupled. For example, the extinction of one action can reduce responding to the other paired instrumental action (Olmstead et al. 2001; Thrailkill and Bouton 2017a), and, in some circumstances, devaluation of the outcome can affect both actions as though they were bound together as a “chunk” (Ostlund et al. 2009; Thrailkill and Bouton 2017a).
The mostly open question of how a motivational attraction to serial reward cues arises is an important one given the range of disorders involving excessive motivation that are thought to involve dysfunctional incentive salience processes. It is also a nontrivial question, as the sign-tracking readout of incentive salience does not obey expectations derived from studies on other forms of Pavlovian conditioned responses or instrumental behaviors. This is particularly true when a manipulable cue, like the insertion of a lever, is used. Curious features of sign-tracking to levers are that it is resistant to associative blocking by other stimuli (Holland et al. 2014), is partly persistent in the face of reward omission when it occurs (Chang and Smith 2016), and is insensitive to devaluation of the outcome (Morrison et al. 2015; De Tommaso et al. 2017) while also being immediately sensitive to new appetite states relevant to the outcome (Robinson and Berridge 2013). It is thus difficult to derive predictions about sign-tracking to serial lever cues: would motivated responding be biased toward one cue or be equivalent? Would a change in the value of the expected reward affect responding to one cue, both cues, or neither? Is the value that is added to the cues equivalent?
Here, three experiments were conducted to address these questions. Experiments 1 and 2 compared sign-tracking to serial reward-distal and reward-proximal cues to a traditional single reward-proximal cue (Experiment 1) or to a single reward-distal cue (Experiment 2). These experiments additionally evaluated sign-tracking sensitivity to reward devaluation in each condition. Experiment 3 assessed the impact of extinguishing responding to the reward-proximal cue on responding to the serial cue sequence. In all experiments, two distinct lever cues were used given that, at least for other cues, those of the same modality tend to promote responding and support stimulus associations (Holland and Ross 1981).
Results
Experiment 1
Outline
Experiment 1 assessed the structure and outcome-sensitivity of sign-tracking to serial lever cues (Group Serial: CSdist → CSprox → US; N = 8) in comparison to a single cue (Group Proximal: CSprox → US; N = 8) (Fig. 1A). Rats in Group Serial were given a magazine training session followed by 12 daily sign-tracking sessions in which a lever cue (CSdist) was presented for 10 sec then retracted and immediately followed by a subsequent lever cue (CSprox) for 10 sec, and then followed by delivery of two reward pellets upon retraction of CSprox. Sign-tracking training sessions for rats in Group Proximal included a single CS+ lever (CSprox) presented for 10 sec and two pellets delivered upon its retraction, and a separate 10-sec nonreinforced lever presented during the ITI to equate total lever presentation time. After training, all rats were exposed to an outcome devaluation procedure. This consisted of a predevaluation extinction probe session, predevaluation rewarded session, pairings of lithium-chloride with reward pellets in holding boxes to induce aversion, a post-devaluation extinction probe session (i.e., with no rewards provided), and a post-devaluation reacquisition session (i.e., with rewards provided normally as during training). Reward consumption was measured in holding cages during the devaluation and again at the very end of testing in a follow-up test.
Figure 1.
The presence of a distal lever does not alter proximal press rate responding throughout the 12 sessions of sign-tracking. Within Group Serial, a distal preference develops over time. (A) Timeline of experimental sessions for Group Serial (top) and Group Proximal (bottom). (Mag train) magazine training, (Ext) extinction test, (Reacq) reacquisition test, (Consum) consumption test in holding box. (B) Proximal press rates in presses per minute (ppm) in Group Serial (n = 8, white) and Group Proximal (n = 8, gray). Press rates were calculated by dividing the total presses within a session by the total cue availability expressed in minutes (250 sec divided by 60). Error bars represent ±SEM. (C) Proximal (white) versus distal (black) press rates within Group Serial (n = 8) over 12 sessions of sign-tracking. (D) Difference scores of total proximal presses minus total distal presses were calculated for each animal within the serial group for each session and plotted as an average score per session. Lines projecting above the x-axis represent a positive score indicating proximal lever bias. (E) Difference scores of total proximal presses minus total distal presses were divided by the sum of all presses for each rat and session, providing a number from −1 to 1. Scores were then split with 0 to 1 categorized as proximal bias for graphing purposes. Graph depicts percent of animals for each session with a proximal or distal bias. For example, session 5 depicts 75% (six out of eight serial animals) with a distal bias. Error bars represent ±SEM.
Training
To compare rates of pressing the proximal lever between Group Serial and Group Proximal, a linear mixed model was fit with lever presses per minute (ppm) on the CSprox as the dependent variable and with fixed effects of group, session (sessions 1–12), and the group by session interaction; random intercepts for individual animals, as well as learning curves, were included. There was neither a significant main effect of group (estimate: 3.44 ppm; CI: −4.69–10.89; P = 0.418) nor session (estimate: 0.58 ppm; CI: −0.14–1.35; P = 0.131), indicating that Group Serial and Group Proximal did not differ on average in presses per minute on the CSprox. There was also no significant group by session interaction (estimate: −0.15 ppm; CI: −1.61–1.28; P = 0.830) (Fig. 1B), indicating that Group Serial and Group Proximal did not press at differing rates over sessions to the CSprox. Together, these results suggest that the presence or absence of a distal lever has no detectable effect on responding to a CSprox.
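One plausible way to write the model described above is given below. This is a sketch under stated assumptions, not the authors' confirmed specification: the symbols are introduced here for illustration, session is treated as a continuous predictor, and "learning curves" is read as a per-animal random slope for session.

```latex
% y_{ij}: press rate (ppm) on the CSprox for animal i in session j
% G_i: group indicator (assumed coding: 0 or 1), S_j: session number (1 to 12)
% u_{0i}: random intercept for animal i
% u_{1i}: random slope for session (assumed reading of "learning curves")
\[
  y_{ij} = \beta_0 + \beta_1 G_i + \beta_2 S_j + \beta_3 \, G_i S_j
           + u_{0i} + u_{1i} S_j + \varepsilon_{ij}
\]
```

Under this reading, the reported group, session, and group by session estimates correspond to \(\beta_1\), \(\beta_2\), and \(\beta_3\), respectively.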
Further investigation into CSdist versus CSprox responses in Group Serial revealed trends in lever preference over sessions. A linear mixed model to compare press rates between the CSdist and CSprox within the serial group used ppm as the dependent measure, with fixed effects of lever type and session and with random intercepts for individual animals as well as learning curves. There was neither a significant main effect of lever (estimate: −2.01 ppm; CI: −6.07–2.43; P = 0.387), nor session (estimate: 0.81 ppm; CI: −0.10–1.63; P = 0.115). However, there was a significant interaction for lever type by session (estimate: 0.62 ppm; CI: 0.01–1.18; P = 0.048), showing that as sessions progressed, rats favored the CSdist by approximately 0.62 additional presses per minute with each session (Fig. 1D). This general distal bias was further confirmed by assessing the absolute difference in pressing within-animals over sessions. To do this, we assessed the percent of animals exhibiting a lever bias using a preference index of responding per rat per session ([CSprox − CSdist]/[CSprox + CSdist]; <0 indicated a distal bias; >0 indicated a proximal bias). While, as a whole, Group Serial had a preference for the CSdist, at the individual level there was a mix (Fig. 1E). Early sessions consisted of rats biased toward the distal or proximal lever in roughly equivalent numbers, while as sessions progressed a moderate population preference for the distal lever emerged.
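The press-rate and preference-index arithmetic can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the 250 sec default is the per-session cue availability stated in the Figure 1 caption, and returning None for a session with no presses is an assumption, since the source does not say how such sessions were handled.

```python
def presses_per_minute(total_presses, cue_seconds=250.0):
    """Convert a session's press count to presses per minute of cue time.

    cue_seconds defaults to 250 sec, the total cue availability per
    training session given in the Figure 1 caption.
    """
    cue_minutes = cue_seconds / 60.0
    return total_presses / cue_minutes


def preference_index(prox_presses, dist_presses):
    """Return (CSprox - CSdist) / (CSprox + CSdist), a value from -1 to 1.

    Negative values indicate a distal bias, positive values a proximal bias.
    Returns None when the animal made no presses (assumed handling).
    """
    total = prox_presses + dist_presses
    if total == 0:
        return None
    return (prox_presses - dist_presses) / total


def percent_distal_biased(indices):
    """Percent of animals in a session whose preference index is below 0."""
    scored = [i for i in indices if i is not None]
    if not scored:
        return None
    distal = sum(1 for i in scored if i < 0)
    return 100.0 * distal / len(scored)
```

With eight animals, six negative indices and two positive ones give a distal-biased share of 75%, matching the form of the session 5 example in the Figure 1 caption.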
Group Proximal exhibited a clear preference toward the reinforced lever over sessions compared to the unreinforced lever (data not shown). A linear mixed model of press rates on the proximal lever against the unreinforced lever in Group Proximal confirmed a significant main effect of lever (estimate: −6.15 ppm; CI: −10.3–(−1.99); P = 0.004) as well as a significant lever by session interaction (estimate: −0.92 ppm; CI: −1.46–(−0.364); P = 0.002) indicating that as sessions progressed the proximal, reinforced lever was preferred. There was not a significant main effect of session (estimate: 0.20 ppm; CI: −0.533–0.197; P = 0.580).
Devaluation sensitivity in extinction
We next evaluated the sensitivity of sign-tracking in both groups to outcome devaluation. Devaluation conducted in holding cages was successful in that both groups acquired an aversion and rejected pellets to a criterion (<1 g consumed; average LiCl injections received: 2.25, range: 2–3) (Fig. 2C). A linear mixed model of ppm on the CSprox by group and session (pre- versus post-devaluation extinction sessions) showed nonsignificant results of group effects (estimate: 0.72 ppm; CI: −1.31–3.23; P = 0.551), session (i.e., before versus after devaluation, estimate: 0.09 ppm; CI: −0.43–0.57; P = 0.731), and a nonsignificant session by group interaction (estimate: 0.22 ppm; CI: −0.77–1.19; P = 0.666) (Fig. 2A). Thus, neither group diverged from the other in press rates after devaluation occurred. A linear mixed model looking at Group Serial in ppm during extinction showed no main effect of lever type (estimate: 0.62 ppm; CI: −0.53–1.76; P = 0.311), nor session (estimate: −0.62 ppm; CI: −1.47–0.15; P = 0.153), nor session by lever type interaction (estimate: −0.85 ppm; CI: −2.51–0.76; P = 0.323) (Fig. 2A). Thus, responding to both levers in Group Serial remained similar and unaffected by devaluation. We note that press rates were generally lower but above zero in extinction sessions compared to reinforced sessions, though were equivalent pre- versus post-devaluation (see also insensitivity to devaluation in reinforced sessions below with higher press rates, indicating a lack of a potential floor effect). In short, both Group Proximal and Group Serial maintained similar rates of CSprox pressing despite outcome devaluation during extinction tests. However, as noted below, the groups exhibited major differences in how generalized the devaluation was to the task condition.
Figure 2.
All animals are outcome insensitive in both extinction and reacquisition tests following devaluation. Consumption in holding boxes shows decrease in eating with successive LiCl pairings followed by increased eating in serial animals in post-devaluation reacquisition tests. (A) Press rates in presses per minute (ppm) on the proximal only group (n = 8, gray), proximal lever in the serial group (n = 8, white), and distal lever in the serial group (black) in extinction testing pre- and post-devaluation. Press rates were calculated by dividing overall presses within a session by total cue availability (∼60 sec) divided by total time (∼15 min). Error bars represent +SEM. (B) Press rates in presses per minute (ppm) in Group Proximal (gray), proximal lever in Group Serial (white), and distal lever in Group Serial (black) during reacquisition tests pre- and post-devaluation. Error bars represent +SEM. (C) Total grams consumed of pellets during devaluation sessions with holding box LiCl pairings in the Group Serial (white) and Group Proximal (gray) with error bars representing +SEM. The far right shows holding box consumption test conducted after the final reacquisition test. (D) Total pellets remaining in the chamber after reacquisition sessions in Group Serial (white) and Group Proximal (gray). Error bars represent +SEM.
Magazine entries pre- and post-devaluation (in extinction)
Because both groups maintained lever pressing after devaluation, we also evaluated magazine entries. A linear mixed model of total magazine entries per minute (epm) during extinction was fit with session (before and after devaluation during extinction tests), group, and the session by group interaction. There was no main effect of group (estimate: −0.22 epm; CI: −5.15–4.38; P = 0.925), nor group by session interaction (estimate: −2.19 epm; CI: −7.81–3.80; P = 0.435). However, there was a significant main effect of session (estimate: 2.92 epm; CI: −0.07–5.79; P = 0.0499), with all animals in the post-devaluation session entering the magazine an estimated 2.92 entries per minute fewer than in the predevaluation session (data not shown). These results indicate that regardless of lever sequence (serial or proximal-only), all animals responded to devaluation with a decrease in magazine entries despite their maintenance of lever pressing rates.
Devaluation sensitivity in reacquisition
Pre- and post-devaluation reacquisition sessions were next compared. A linear mixed model analyzing ppm made on the CSprox between Group Serial and Group Proximal showed no main effect of group in that neither group significantly differed in press rates on the proximal lever (estimate: 3.42 ppm; CI: −8.78–15.5; P = 0.579). The main effect of post-devaluation session was not significant (estimate: 4.14 ppm; CI: −2.18–10.4; P = 0.195), nor was the group by session interaction term (estimate: −3.30; CI: −15.2–8.76; P = 0.596) (Fig. 2B), such that neither group differed in lever press rates after devaluation in reacquisition sessions. Further, analysis of Group Serial alone did not show any significant preference of lever type before or after devaluation, with nonsignificant main effects of lever (estimate: 1.29 ppm; CI: −7.39–9.49; P = 0.768), session (estimate: −2.61 ppm; CI: −8.43–2.92; P = 0.402), and lever by session interaction (estimate: −0.24 ppm; CI: −12.2–12.0; P = 0.969) (Fig. 2B). This result indicates that sign-tracking in both groups was robust in the face of reward devaluation even when the reward was being delivered.
Magazine entries pre- and post-devaluation (in reacquisition)
All animals entered the magazine (which contained the devalued reward) at a lesser rate throughout the post-devaluation reacquisition session (data not shown). A linear mixed model revealed no significant main effect of group (estimate: −2.96 epm; CI: −8.26–2.90; P = 0.292), nor group by session interaction (estimate: −3.473 epm; CI: −9.99–1.93; P = 0.266). However, there was again a significant main effect of session (estimate: 5.88 epm; CI: 1.33–7.11; P = 0.015).
Devaluation generalization
Analysis of reward consumption during the reacquisition test by measuring remaining pellets in the magazine after testing uncovered a surprising addition to this outcome-insensitivity in sign-tracking behavior. Group Proximal rejected the pellets being delivered after devaluation, despite having sign-tracked consistently. In stark contrast, paired t-tests revealed Group Serial consumed nearly all of the “devalued” pellets (though they had entered the magazine at a lesser rate than previously). Their pellet consumption was significantly greater than Group Proximal (t(7.93) = 3.242, P = 0.012) (Fig. 2D). We confirmed that the aversion remained present in the holding boxes in a follow-up consumption test; neither Group Serial (t(7) = 1.986, P = 0.087) nor Group Proximal (t(5) = 1.000, P = 0.3632) consumed significantly greater than zero pellets (Fig. 2C, right). We note that animals had already returned to ad libitum feeding before follow-up feeding tests occurred.
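The follow-up comparisons of holding-box consumption against zero have the form of a one-sample t-test, where the degrees of freedom equal the number of animals minus one. The sketch below shows that arithmetic under the assumption that a standard one-sample test was used. The function name is hypothetical, no data from the study are reproduced, and the P value (which requires the t distribution) is left out.

```python
import math


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test.

    values: per-animal measurements (e.g., grams consumed); hypothetical input.
    mu0: the reference value being tested against (zero consumption here).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Sample variance with the n - 1 (Bessel) correction
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    standard_error = math.sqrt(variance / n)
    if standard_error == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return (mean - mu0) / standard_error, n - 1
```

A group of eight animals yields seven degrees of freedom, which is the form of the t(7) values reported for Group Serial.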
Several conclusions were supported by Experiment 1. Sign-tracking is acquired equivalently to serial lever CSs, with a bias toward the distal lever in responding over sessions. However, sign-tracking to a reward-proximal lever is unaffected by the presence of a reward-distal lever preceding it. Sign-tracking to a single proximal cue is also insensitive to outcome devaluation, consistent with prior reports (Morrison et al. 2015; Nasser et al. 2015; De Tommaso et al. 2017). Remarkably, while rats with a single cue continue to sign-track but reject the rewards received, rats with serial cues continue to respond and continue to consume the reward, evidence of a contextual limit to the devaluation that they had acquired (see Discussion).
Experiment 2
Outline
Results from Experiment 1 provide some evidence for a distal preference generated over sessions in serial sign-tracking. However, it is not clear if a distal lever alone could carry the same motivational value without the subsequent presentation of a proximal cue. Experiment 2, therefore, examined if a distal cue alone, with a trace gap of 10 sec (i.e., a lack of proximal cue), would acquire a robust sign-tracking response. Additionally, Experiment 2 aimed to replicate the distal preference-by-session interaction within Group Serial as well as the surprising outcome devaluation results. As part of this effort, Experiment 2 animals were not put on ad libitum feeding prior to the final consumption test (after post-devaluation reacquisition) in order to ensure that the low rates of consumption seen in the first experiment were not due to satiation.
Specifically, Experiment 2 compared the serial cue condition (Group Serial: CSdist → CSprox → US; N = 8) to a distal-only cue condition (Group Distal: CSdist → 10 sec → US for sign-tracking; N = 8). For Group Serial, all procedures followed Experiment 1. For Group Distal, training consisted of the presentation of the distal CS+ lever cue: 10 sec CSdist lever insertion, then a 10 sec fixed delay, and then pellet delivery (Fig. 3A). Similar to Group Proximal in Experiment 1, Group Distal also received a nonreinforced lever presentation during the ITI. The training timeline, devaluation protocol, behavior measurement, and analyses followed Experiment 1.
Figure 3.
The presence of a proximal lever does not alter distal press rate responding throughout the 12 sessions of sign-tracking. Within serial animals, a distal preference develops over time. (A) Timeline of experimental sessions for Group Serial (top) and Group Distal (bottom). (B) Distal press rates in presses per minute (ppm) in Group Serial (n = 8, black) and Group Distal (n = 8, gray). Error bars represent ±SEM. (C) Proximal (white) versus distal (black) press rates within Group Serial (n = 8) over 12 sessions of training. Error bars represent ±SEM. (D) Difference scores of total proximal presses minus total distal presses were calculated for each animal within Group Serial for each session and plotted as an average score per session with error bars representing ±SEM. Lines projecting above the x-axis represent a positive score indicating proximal lever bias. (E) Difference scores of total proximal presses minus total distal presses were divided by the sum of all presses for each rat and session, providing a number from −1 to 1. Scores were then split with 0 to 1 categorized as proximal bias for graphing purposes. Graph depicts percent of animals for each session with a proximal or distal bias (determined by difference scores).
Training
The same linear mixed model structure used in Experiment 1 to compare CSprox press rates between groups was used here to compare CSdist press rates. There was not a significant main effect of group (estimate: 5.03 ppm; CI: −4.37–12.8; P = 0.249), indicating that the two groups engaged with the CSdist similarly. A significant main effect of session (estimate: 1.25 ppm; CI: 0.06–2.32; P = 0.049) showed that rats in both groups were learning to sign-track over sessions. A nonsignificant group by session interaction (estimate: 0.65 ppm; CI: −1.01–2.16; P = 0.444) (Fig. 3B) indicated that the groups did not significantly differ in how their CSdist press rates changed over sessions (note more variance in distal-only pressing).
The same linear mixed model structure as in Experiment 1 was used to compare pressing rates between the CSprox and CSdist within Group Serial. A significant main effect of session (estimate: 1.29 ppm; CI: 0.34–2.37; P = 0.040) was found as well as a significant main effect of lever (estimate: −5.52 ppm; CI: −9.92–(−1.39); P = 0.015), showing that the CSdist generated on average 5.52 presses per min more than the CSprox. There was a replicated significant interaction for lever type by session (estimate: 1.24 ppm; CI: 0.69–1.88; P < 0.001), indicating that as sessions progressed, rats favored the CSdist by approximately 1.24 additional presses per minute with each session (Fig. 3C). Pressing difference and the population distribution of pressing bias showed similar effects to Experiment 1. Again, what began as a roughly equivalent pressing amount or individual bias toward the CSprox or CSdist evolved over sessions to a pattern that favored the CSdist in pressing (Fig. 3D,E).
The same linear mixed model structure was used to compare press rates within Group Distal toward the reinforced (distal) lever and the unreinforced lever presented during the ITI. There was not a significant main effect of lever (estimate: −1.40 ppm; CI: −6.37–3.57; P = 0.583) nor session (estimate: 0.634 ppm; CI: −0.072–1.34; P = 0.125); however, a significant lever by session interaction (estimate: −1.24 ppm; CI: −1.91–(−0.560); P < 0.001) indicated that over time, the reinforced (i.e., distal) lever was preferred over the nonreinforced lever (data not shown).
Devaluation sensitivity in extinction
Both groups formed an aversion to the pellets by our criterion (<1 g consumed) and required on average 2.68 LiCl pairings (from 2 to 4 pairings) to meet this rejection criterion. A linear mixed model of ppm on the CSdist by group and session (pre- and post-devaluation) during extinction showed nonsignificant results of group effects (estimate: 0.37 ppm; CI: −2.80–3.68; P = 0.818), session (i.e., before versus after devaluation, estimate: 1.23 ppm; CI: −1.96–4.09; P = 0.440), and a nonsignificant session by group interaction term (estimate: −3.72 ppm; CI: −9.79–2.56; P = 0.250) (Fig. 4A). Thus, devaluation did not affect sign-tracking to the CSdist in either group when responding was assessed in extinction. Moreover, a linear mixed model looking only at Group Serial in ppm showed no main effect of lever type (distal or proximal; estimate: 0.30 ppm; CI: −2.75–3.14; P = 0.848), nor session (estimate: −0.81 ppm; CI: −1.82–0.15; P = 0.114), nor session by lever type interaction (estimate: 0.36 ppm; CI: −1.40–2.34; P = 0.716) (Fig. 4A), indicating a lack of devaluation effect on sign-tracking to either CSprox or CSdist in Group Serial. It is noteworthy, however, that press rates in extinction were similar to Experiment 1 in that they were generally reduced compared to previous testing.
Figure 4.
All animals are outcome insensitive in both extinction tests and reacquisition tests following devaluation. Consumption testing shows decrease in eating during LiCl pairings in holding box followed by increased eating in Group Serial in post-devaluation reacquisition tests. (A) Press rates in presses per minute (ppm) in Group Distal (n = 8, gray), proximal lever in Group Serial (n = 8, white), and distal lever in Group Serial (black) in extinction testing pre- and post-devaluation. Error bars represent ±SEM. (B) Press rates in presses per minute (ppm) in Group Distal (gray), proximal lever in Group Serial (white), and distal lever in Group Serial (black) in reacquisition tests pre- and post-devaluation. Error bars represent +SEM. (C) Total grams consumed of pellets during devaluation sessions with LiCl pairings in Group Serial (white) and Group Distal (gray) with error bars representing +SEM. The far right represents follow-up feeding tests conducted after the final reacquisition testing. (D) Total pellets remaining in the chamber after reacquisition sessions in Group Serial (white) and Group Distal (gray); larger bars indicate greater rejection of pellets. Error bars represent +SEM.
Magazine entries pre- and post-devaluation (in extinction)
A linear mixed model revealed a significant effect of group in the number of magazine entries per min pre- and post-devaluation (estimate: −6.09 epm; CI: −9.83–(−2.18); P = 0.008; data not shown), specifically that Group Serial made 6.09 fewer magazine entries per min than Group Distal. A significant effect of session indicated that both groups entered the magazine at an attenuated rate after devaluation (estimate: −3.95 epm; CI: −6.07–(−1.50); P = 0.004), specifically by a magnitude of about four entries fewer per minute. There was not a significant session by group interaction; both groups decreased magazine entries from pre- to post-devaluation sessions at a similar rate (estimate: 3.00 epm; CI: −1.52–7.76; P = 0.209).
Devaluation sensitivity in reacquisition
During the reacquisition tests, a linear mixed model analyzing ppm made on the CSdist between Group Serial and Group Distal showed that neither group significantly reduced pressing after devaluation, with no main effect of group (estimate: 13.4 ppm; CI: −1.98–28.4; P = 0.103), post-devaluation session (estimate: −1.74 ppm; CI: −7.37–3.34; P = 0.535), nor group by session interaction term (estimate: 2.04 ppm; CI: −9.02–13.9; P = 0.715) (Fig. 4B). Further, analysis of Group Serial showed a significant preference of lever type during reacquisition, with a linear mixed model showing a significant main effect of lever (estimate: 9.80 ppm; CI: 3.02–16.6; P = 0.013), but not session (estimate: 0.57 ppm; CI: −8.84–11.3; P = 0.912) nor lever by session interaction (estimate: −1.29 ppm; CI: −16.0–12.3; P = 0.860) (Fig. 4B). Together with Experiment 1, these results show that sign-tracking resists outcome devaluation whether there are serial cues, only a reward-proximal cue, or only a reward-distal cue.
Magazine entries pre- and post-devaluation (in reacquisition)
A linear mixed model of magazine entry rate revealed a main effect of group indicating that the serial group generally entered the magazine at a decreased rate (estimate: −9.01 epm; CI: −14.4–(−3.81); P = 0.005; data not shown). Interestingly, the effect of session was not replicated; both groups maintained similar entry rates pre- and post-devaluation (estimate: 0.83 epm; CI: −3.30–4.39; P = 0.675). The session by group interaction was not significant (estimate: 2.93 epm; CI: −4.73–10.4; P = 0.464).
Devaluation generalization
Concerning reward consumption, the effect of Experiment 1 was replicated. All animals in Group Serial consumed all available “devalued” pellets in the task during reacquisition, yet maintained a relative aversion to the pellets in the final consumption test in holding boxes, with consumption not different from zero (t(7) = 1.806, P = 0.114; though note some consumption due to the hungry state of animals here or slight aversion decay) (Fig. 4C, right). Group Distal behaved roughly in between the behavior of Group Serial here and Group Proximal above: they rejected some, but not all, of the pellets in the task and ate a minimal but significant amount in the final feeding test (compared to zero consumption; t(7) = 3.231, P = 0.0144) (Fig. 4C). As a result, for in-task consumption, Group Serial and Group Distal were not different (t(7) = 1.301, P = 0.235) (Fig. 4D).
In sum, Experiment 2 shows that animals will acquire a robust sign-tracking response to a lever distal to reward that is followed by a 10 sec trace interval. This response is also devaluation insensitive. Results for Group Serial replicated all findings from Experiment 1, including the CSdist bias, the devaluation insensitivity, and the lack of generalization of pellet aversion from the holding boxes to the task.
Experiment 3
Outline
Despite evidence that serial auditory/visual cues, or serial actions, can be associatively linked (Rescorla 1972; Holland and Ross 1981; Thrailkill and Bouton 2017b), the uniqueness of sign-tracking to lever stimuli noted earlier raises the question of whether they become linked as well. Following the logic that sign-tracking reflects the attribution of incentive salience to the cues, the issue additionally becomes one of whether the value being added to one cue in a sequence is tied to, or the same as, the value added to another cue in the sequence. That is, are their motivational attributes equivalent or distinct? Based on published evidence that extinction of a reward-proximal cue or action reduces responses to a distal cue or action (see references above), we addressed this question by extinguishing sign-tracking responses to the CSprox and evaluating the effect of this procedure on responding to the learned CSdist → CSprox sequence.
All rats were trained for 12 sessions under the CSdist → CSprox → US serial cue protocol as above. Then, rats in Group Extinction (N = 8) received extinction of the CSprox cue through 250 nonreinforced presentations over five sessions. The pellet dispenser noise remained as during training, but no pellets were delivered. Rats in Group Control (N = 8) instead were placed in operant chambers for an equivalent session length with the same ambient light and fan noise. Subsequently, rats in both groups were given a 60 min test session in which the serial CSdist → CSprox sequence occurred in extinction. Then, a 60 min reacquisition session (CSdist → CSprox → US) was administered to test the recovery of pressing behavior (Fig. 5A). Sign-tracking measurement and statistical analysis followed Experiments 1 and 2.
Figure 5.
Serial animals develop a distal preference over time. (A) Timeline of experimental sessions. All animals received the same magazine and serial sign-tracking programs and were then split into Group Extinction, with just the proximal lever presented (top), and Group Control, with context exposure (bottom). (B) Proximal (white) versus distal (black) press rates over 12 training sessions (n = 16). Error bars represent ±SEM. (C) Difference scores of total proximal presses minus total distal presses were calculated for each animal for each session and plotted as an average score per session with error bars representing ±SEM. Lines projecting above the x-axis represent a positive score indicating proximal lever bias. (D) Difference scores of total proximal presses minus total distal presses were divided by the sum of all presses for each rat, for each session, providing a number from −1 to 1. Scores were then split with 0 to 1 categorized as proximal bias for graphing purposes. Graph depicts percent of animals for each session with a proximal or distal bias. For example, session 3 depicts ∼63% (five out of eight serial animals) with a distal bias.
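The difference score and normalized bias score described for panels C and D can be illustrated with a minimal sketch. The function names, the example press counts, and the handling of a rat with no presses are assumptions made for illustration only; they are not taken from the authors' analysis code. A score of exactly 0 is counted as proximal, following the caption's "0 to 1" category.

```python
from typing import List, Tuple


def difference_score(proximal: float, distal: float) -> float:
    """Total proximal presses minus total distal presses (one rat, one session)."""
    return proximal - distal


def bias_score(proximal: float, distal: float) -> float:
    """Difference score divided by the sum of all presses, giving a value from -1 to 1."""
    total = proximal + distal
    if total == 0:
        # Assumption: a rat with no presses is scored as 0 (no measurable bias).
        return 0.0
    return difference_score(proximal, distal) / total


def percent_distal_bias(press_counts: List[Tuple[float, float]]) -> float:
    """Percent of rats whose bias score is below 0 (distal bias)."""
    scores = [bias_score(prox, dist) for prox, dist in press_counts]
    n_distal = sum(1 for s in scores if s < 0)
    return 100.0 * n_distal / len(scores)


# Hypothetical (proximal, distal) press totals for eight rats in one session.
example_session = [
    (10, 30), (5, 20), (12, 18), (8, 40),
    (15, 25), (30, 10), (22, 11), (16, 16),
]

print(bias_score(30, 10))                    # 0.5
print(percent_distal_bias(example_session))  # 62.5
```

With these made-up counts, five of the eight rats have a negative score, which gives the same ∼63% figure used as the example in the caption.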
Training
Rats acquired the sign-tracking response over sessions, and once again exhibited a CSdist preference indicated by a significant lever by session interaction in a linear mixed model (estimate: 0.85 ppm; CI: 0.36–1.39; P = 0.001) such that the preference for the CSdist grew by 0.85 presses per minute per session. Session was a significant factor (estimate: 0.97 ppm; CI: 0.54–1.35; P < 0.001) with press rates steadily increasing from early sessions to later sessions. As with previous experiments, there was not a significant main effect of lever type (estimate: −0.86 ppm; CI: −4.69–3.02; P = 0.655), reiterating that the distal preference effect emerges only over repeated exposures to the serial sequence across sessions. It should be noted that no group differences were found in the first 12 sessions of training for the subsequently split Extinction versus Control groups (estimate: −4.54 ppm; CI: −13.4–3.71; P = 0.284) as all animals were similar in responding leading up to the extinction phase (Fig. 5B). A preference toward the distal lever in pressing and population bias was replicated again (Fig. 5C,D).
Effect of proximal extinction (serial extinction test)
Rats in Group Extinction significantly decreased pressing on the CSprox from the last rewarded serial session by ∼5 ppm per session (estimate: −5.20 ppm; CI: −6.29–(−4.19); P < 0.001) to a fraction of the press rate previously established (Fig. 5E), in a manner similar to Ahrens et al. (2016). In the subsequent test session, all animals were presented with the serial sequence in extinction. A linear mixed model was fit to presses per trial, with fixed effects of trial (the 25 trials in the session), lever (CSdist or CSprox), and group (Extinction or Control), interaction terms of group by lever and lever by trial, and random effects of trial and animal. Contrasts of Group Extinction versus Group Control lever identity were constructed. Group Control pressed more overall in this test session, regardless of lever, as indicated by a significant main effect of group (estimate: 1.67 presses; CI: 0.79–2.54; P = 0.001). There was also a main effect of lever (estimate: −0.55 presses; CI: −1.10–(−0.03); P = 0.035), with both groups pressing fewer times overall to the CSdist. A main effect of trial (estimate: −0.12 presses; CI: −0.15–(−0.09); P < 0.001) showed that both groups decreased pressing on both levers as the extinction trials progressed. However, there was a lever by group interaction (estimate: 0.54 presses; CI: 0.00–1.04; P = 0.036), indicating that Group Extinction pressed significantly less to the CSdist relative to the degree to which Group Control pressed less to the CSdist. It should be noted that because Group Extinction pressed less overall, pressing to the CSdist was nearly nonexistent, which restricts the degree of difference available. It appears that for Group Control, exposure to the sequence in extinction produced a greater loss of value for the distal lever than for the proximal lever. Even so, the proximal lever extinction group had a significant reduction in pressing toward the distal lever compared to controls (as noted in the lever by group interaction). Finally, there was a nonsignificant lever by trial interaction indicating that no preference was generated to either lever over trial progression (estimate: 0.02 presses; CI: −0.01–0.06; P = 0.19) (Fig. 6A,B).
Figure 6.
Animals given extinction on the proximal lever show reduced pressing compared to controls. All animals rapidly reacquire pressing on both levers. (A) Group Extinction (n = 8, white) versus Group Control (n = 8, black) average lever press rates in extinction test of serial sequence. Error bars represent +SEM. (B) Group Extinction (n = 8, solid line) versus Group Control (n = 8, dotted line) presses per trial over 25 trials in extinction. Proximal lever (white) presses per trial and distal presses per trial (black), each point represents the average of five trials, error bars represent +SEM. (C) Group Extinction (n = 8, white) versus Group Control (n = 8, black) average lever press rates in reacquisition test of serial sequence. Error bars represent +SEM. (D) Group Extinction (n = 8, solid line) versus Group Control (n = 8, dotted line) presses per trial over 25 trials in reacquisition. Proximal lever (white) presses per trial and distal presses per trial (black), each point represents the average of five trials, error bars represent +SEM.
Effect of sign-tracking extinction (reacquisition test)
In the following reacquisition test, no significant main effect of group was found (estimate: 0.67 presses; CI: −0.49–2.07; P = 0.237), indicating that the groups did not differ in presses per trial. No main effect of lever (estimate: 0.43 presses; CI: −0.25–1.08; P = 0.212) or trial (estimate: 0.00 presses; CI: −0.04–0.04; P = 0.898) was found, indicating neither a general preference for either lever nor a significant increase in overall presses over trials. The interaction effect for group by lever was not significant (estimate: −0.41 presses; CI: −0.99–0.18; P = 0.221), indicating that both groups had similar press rates to both levers. However, the interaction effect for lever by trial was significant (estimate: 0.09 presses; CI: 0.05–0.14; P < 0.001) (Fig. 6C,D), indicating that as trials progressed the CSdist was preferred by both groups just as it had been during initial training.
To confirm a rapid reacquisition of sign-tracking responding from the previous extinction day, extinction test press rates were compared to the reacquisition test press rates in a linear mixed model of press rates by fixed effects of group, lever, trial, and session with random effects of trial, day, and individual animal press rates. A main effect of session (estimate: 3.42 presses; CI: 2.63–4.23; P < 0.001) showed that press rates increased by 3.42 presses in reacquisition. Further, this effect was the same for both groups, with a nonsignificant group by session interaction (estimate: −1.13 presses; CI: −2.56–0.316; P = 0.112).
We conclude that extinction of responding to the CSprox leads to a generalized decrease in responding to both levers during follow-up testing, suggesting that the motivational value applied to the CSprox is shared with the CSdist. The reinstatement of pressing to both levers in the reacquisition session after sign-tracking extinction to the CSprox is notably rapid, indicating that the valuation of those cues had remained stored but inhibited in behavioral expression during the extinction procedure. Intriguingly, responding to the CSdist is favored during training, reduced in both groups over the course of trials in the extinction test, and rapidly favored again in the reacquisition test.
Discussion
Through the lens of sign-tracking behavior, three experiments aimed to address the question of how motivational value (i.e., incentive salience) becomes attributed to cues for reward that occur in a sequence.
Serial cues both acquire incentive salience with a bias toward the reward-distal cue
Both serial cues engaged a robust response across experiments. However, as a whole, serial presentations led to the reward-distal cue evoking greater pressing over sessions. This distal preference is reminiscent of the accentuated responses to distal cues in a serial sequence that emerge in dopamine and ventral pallidum neurons (Schultz et al. 1997; Smith et al. 2011; but see Pan et al. 2005), both of which are implicated in signaling the motivational properties of reward cues (Schultz 2002; Smith et al. 2011; Wassum et al. 2012) and thus could underlie the behavioral patterns found here. Although our favored interpretation of the sign-tracking response relates to a motivational valuation of the cues, it is also plausible that the distal cue in the series came, with learning, to be the earliest consistent predictor of reward (Smith and Berridge 2005; Smith et al. 2011) and thus engaged the strongest response as a function of its predictive strength.
Results also suggested that the level of responding to either serial cue was not affected by the presence or absence of the other paired cue. This result was unexpected, particularly as previous reports have shown that intervening stimuli (i.e., proximal stimuli) facilitate learning of the CS → US relationship (Rescorla 1982) in a manner similar to second order conditioning in instrumental learning (Spence 1947). This implies that cues occurring with a subsequent delay or trace before reward (e.g., our distal-cue-only condition) are more difficult to learn than cues that occur without a delay (e.g., our proximal cue). This has been directly found in a pigeon autoshaping study, in which a distal-only visual cue did not generate key peck responses compared to a condition in which the same cue was followed by a proximal cue (Kaplan and Hearst 1982). This absence of a response to a distal-only cue conflicts with our finding of similar response rates to solitary distal cues and to distal cues in the serial condition. Thus, the notion that serial cues aid each other in facilitating a conditioned response is not necessarily supported here, potentially due to the distinct properties of learning and responding to levers as cues. Instead, it seems that both lever cues acquire value and support conditioning. Indeed, a paradigm such as ours with a solitary distal cue is in some ways related to Pavlovian cues with uncertainty in their reward prediction, which gain greater conditioned responding than certain cues (Anselme et al. 2013). Further work is needed to assess how trace length affects sign-tracking and at what length sign-tracking is no longer supported.
Sign-tracking to serial cues imposes a contextual limit on outcome value
The outcome devaluation procedure (conditioned aversion to the food in holding boxes) led to several intriguing insights about the persistence of sign-tracking behavior. It has previously been shown using single CS+ conditions that sign-tracking animals are insensitive to outcome devaluation, which is not the case in animals that instead use the CS+ to inform goal entry behavior (i.e., goal tracking; Morrison et al. 2015). Further, animals that have a tendency toward sign-tracking are more likely to continue responding after outcome devaluation, while both sign- and goal-tracking animals respond equally to audio or visual second order stimuli to lever cues (Nasser et al. 2015). In humans, cues that previously predicted a liquid reward were devalued by quenching participants' thirst and then tested among distractor stimuli; these cues continued to demand attention from participants and are therefore considered outcome insensitive (De Tommaso et al. 2017). Our results in the solitary cue condition in Experiments 1 and 2 essentially replicate these findings; animals in this group did not alter their sign-tracking response after reward devaluation either in extinction or reacquisition sessions, despite maintaining an aversion to the reward (reflected in minimal consumption during reacquisition and in the subsequent free-feeding test, as well as avoiding the food cup). In other words, the aversion generalized to the task rather normally in these animals, yet they did not correspondingly adjust their sign-tracking response.
Commonly, outcome insensitivity in a behavioral response is interpreted as reflecting an underlying habitual stimulus-response association (Dickinson 1985; Balleine and Dickinson 1998). However, sign-tracking can exhibit non-habit-like characteristics: it can change flexibly in its response details after an omission criterion is imposed (Chang and Smith 2016), and it can emerge spontaneously in a novel appetite state (Robinson and Berridge 2013). An alternative interpretation is therefore also worth considering, namely that the response reflects a dissociation of the motivational value of the cue from the value of the reward itself, with cue valuation thus being resistant to change when reward value changes (Robinson and Berridge 2008; De Tommaso et al. 2017). Regardless, the sign-tracking response to a single cue, as well as serial cues, is notably rigid.
An entirely distinct type of rigidity was observed for rats that learned the serial cue sequence in Experiments 1 and 2. These rats not only continued to sign-track to both cues for the devalued reward, but when the reward was presented during reacquisition they consumed it as well. And yet, the aversion had remained present as confirmed by little consumption in free-feeding tests (i.e., zero consumption in the sated state (Experiment 1); weak consumption in the hungry state (Experiment 2)). These results are unlike previous reports of reward-proximal actions being directly sensitive to motivational shifts (Balleine et al. 1995), or a series of actions for reward being sensitive in tandem (Thrailkill and Bouton 2017a,b), suggesting again a uniqueness to sign-tracking as a response to levers rather than pressing them as a superstitious instrumental requirement. The possible argument that the aversion could not generalize well to the task is weakened by the decent generalization in the proximal-only group as above. We instead interpret this finding as reflecting the formation of a distinct context for the task condition as established by the serial cues in a manner that was not established by the proximal cue alone. In this view, the aversion remained specific to the environment in which it was acquired (holding box) and did not generalize to the task environment in which prior experience with the reward as valuable had occurred following serial lever cues. It is possible that this is because the serial cues occurred as a sequence, or because the total CS+ cue duration was greater than in single-cue groups. In either case, the persistence of a CS+ response appears to be due to a contextual limit on the devaluation of the UCS reminiscent of context-mediated second order conditioning in LiCl–CS aversions as in Bills et al. (2003).
Context-specific aversion can occur that is independent of Pavlovian cues (Loy et al. 1993; Boakes et al. 1997), and context can drive taste aversion in autoshaping procedures (Archer et al. 1984). Our results indicate, surprisingly enough, that aversions can also be made to be more context-dependent when the application test is one in which the food is preceded by a series of learned lever cues. Results from Experiment 2 also showed that a distal-only cue produces some resistance to devaluation generalization as well, suggesting a potential factor for interval length between lever cue and reward delivery. Still, we emphasize that it remains unknown how specific this unusual result is to our experimental methods, and add that the level of the “true aversion” established could be dissociated from what we measured in holding-box intake.
Another important consideration is the notion that the food cup is situated as potentially the final cue in a series leading up to the delivery of food reward. In this view, we cannot dissociate the decrease in food cup entries after devaluation as a result of it being a distinct conditioned response or the food cup being the most reward-proximal stimulus. In a similar vein, we did not observe any behavior which would be categorized as goal-tracking (dominant food cup approach during the CS+). All animals developed a conditioned response to the levers. Sign-tracking is the dominant response when multiple lever cues are present (Chang and Smith 2016), while the goal-tracking phenotype typically develops in studies using a single CS+ cue.
The incentive salience of serial cues can be shared and rapidly recovered
Experiment 3 showed that extinction of the proximal cue after training of the serial sequence led to a reduction in overall responding to both cues. These data support the idea that a representational link can form between serial cues or actions for reward. For instance, extinction of audio-visual proximal cues in sequence leads to reduction in conditioned responding to distal cues (Holland and Ross 1981), and instrumental chains show similar reductions in associated actions after extinction of one (Thrailkill and Bouton 2017b). Although the interpretation of such results is typically one of sequential cues or actions becoming associated with one another, we offer an additional interpretation related to sign-tracking results that the incentive salience of the cues becomes linked. In this view, reducing the motivational draw of the proximal cue by removing reward led to a parallel reduction of the motivational draw to the distal cue as well, which would suggest that the value of those cues can be equivalent or shared. Such an idea is consistent with observations that cues can acquire the hedonic or motivational attributes of the reward itself (Jenkins and Moore 1973; Davey et al. 1989; Berridge 2004; Holland and Wheeler 2008), meaning here that both serial cues might have acquired the same attributes related to the grain pellet reward. The resistance of sign-tracking to devaluation, however, indicates that these shared attributes might not be sensory-specific to the reward.
There was also a rapid, essentially immediate, reacquisition of the sign-tracking response once reward was again provided after extinction. This finding supports the theory that extinction reduces conditioned responses and associations but does not abolish them (Pavlov 1927; Rescorla 1997; Delamater 2004). Considering the incentive salience interpretation of sign-tracking, a possibility is that the motivational value of serial cues too can become suppressed but not lost with extinction learning. The rapid reacquisition here could be relevant to disorders of excessive motivation, including addiction, which are often characterized by their high probability of relapse in the presence of incentive drug stimuli (Robinson and Berridge 1993).
Materials and Methods
Animals
Experimentally naïve male Long Evans rats (N = 48) were obtained from Charles River (Indianapolis, IN, USA). Rats were single housed in ventilated plastic cages in a climate-controlled colony room set to a 12 h light–dark cycle (lights on at 7:00 a.m.). Experiments were conducted during the light cycle. Food and water were available ad libitum until 7 d before magazine training. Weight was restricted to 85% of ad libitum weight prior to testing and maintained throughout all experimental procedures. Rats were provided with standard rat chow after each testing session to maintain 85% weight and were given free access to water the remaining 23 h/d. All procedures were approved by the Dartmouth College Animal Care and Use Committee.
Apparatus
Sign-tracking training and testing were conducted in standard operant chambers (Med Associates), which were enclosed in sound- and light-attenuating cabinets outfitted with fans for ventilation and white noise. Chambers contained two retractable levers to the left and right of a recessed magazine. Lever deflections were recorded automatically. Magazine entries were recorded through breaks in an infrared beam adjacent to the magazine.
Sign-tracking
Training began with a 30-min magazine acclimation session in which pellets were delivered with a probability of 1 pellet every 30 sec. Rats in Group Serial were then exposed to 12 daily, 60 min sessions of conditioning. In each session, rats received a 10 sec lever presentation (CSdist) followed by lever retraction and 10 sec insertion of the second lever (CSprox). Two pellets (BioServ, 45 mg Dustless Precision Pellets) were delivered following retraction of the CSprox lever, followed by a 120 sec ITI. Twenty-five such trials occurred per session. Rats in Group Proximal (Experiment 1) were given 12 sessions, each with 25 trials, during which only the CSprox lever was presented followed by reward pellets. Rats in Group Distal (Experiment 2) were given 12 sessions, with 25 trials each, where the CSdist lever was presented for 10 sec and followed by a 10 sec interval before reward delivery. In both of these single-CS+ conditions, the solitary reinforced lever trials were pseudorandomly interspersed with presentations of a 10 sec unreinforced (CS−) lever during 45–70 sec of the intertrial interval. Both groups (serial-cue and single-cue) then received a single 15-min extinction session (i.e., levers were presented and feeder noises occurred as during training but no pellets were given), followed by a single 60-min reacquisition session (i.e., a rewarded session identical to earlier training sessions). All groups were counterbalanced with right or left levers as proximal cues.
Experiment 3 animals were given a single session of magazine training followed by 12 conditioning sessions (60 min each) of serial lever sign-tracking. Each session contained 25 presentations of CSdist for 10 sec, followed by CSprox for 10 sec and the delivery of two reward pellets. Across animals, proximal cues were counterbalanced with either left or right lever. Following the final conditioning session, animals were pseudorandomly separated into an extinction group and a control group for an even number of left and right proximal cue animals in each group. Extinction of the proximal lever occurred in five 60 min training sessions with a total of 250 nonreinforced trials in Group Extinction. Group Control was placed in operant boxes for the equivalent session length with the same ambient light and fan noise. After the extinction phase, all animals were presented with the serial sequence in extinction in a 60 min, 25 trial session. The following session, all animals received the rewarded serial sequence in a 60 min, 25 trial session. In both post-extinction serial tests, animals received the same lever assignment as they experienced during training.
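As a rough check on the serial-cue session structure described above, the following sketch lays out the event timeline for one Group Serial session using the stated durations (10 sec CSdist, 10 sec CSprox, 120 sec ITI, 25 trials). It assumes that the session begins at the first CSdist insertion, that pellet delivery takes no time, and that the ITI starts at CSprox retraction; none of these details is specified in the text, and the names used are hypothetical.

```python
from dataclasses import dataclass
from typing import List

CS_DIST_S = 10      # CSdist lever presentation (sec)
CS_PROX_S = 10      # CSprox lever presentation (sec)
ITI_S = 120         # intertrial interval (sec)
TRIALS = 25         # trials per session


@dataclass
class Event:
    time_s: int     # seconds from session start
    label: str


def serial_session_schedule(n_trials: int = TRIALS) -> List[Event]:
    """Build the event list for one serial-cue session under the stated assumptions."""
    events: List[Event] = []
    t = 0
    for _ in range(n_trials):
        events.append(Event(t, "CSdist inserted"))
        t += CS_DIST_S
        events.append(Event(t, "CSdist retracted, CSprox inserted"))
        t += CS_PROX_S
        events.append(Event(t, "CSprox retracted, two pellets delivered"))
        t += ITI_S
    events.append(Event(t, "session end"))
    return events


schedule = serial_session_schedule()
total_minutes = schedule[-1].time_s / 60
print(round(total_minutes, 1))  # 58.3
```

Under these assumptions the 25 trials occupy about 58 min, which is consistent with the 60 min session length given in the text.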
Outcome devaluation and post-devaluation testing
Devaluation of the grain pellets was carried out in clear plastic holding boxes (used during transfer from colony to testing room during training procedures) affixed with a plastic petri dish containing 10 g of pellets. Rats were permitted to consume the pellets for 20 min followed by injection of lithium chloride (LiCl; 0.3 M; 10 mg/kg; in deionized water). Rats remained in the boxes for an additional 20 min following injection and were then returned to their homecages. After 48 h, rats were tested similarly for consumption. Additional pairings of LiCl were delivered until rats consumed <1 g. All rats rejected the reward pellets after three pairings. Forty-eight hours after the final devaluation, rats were exposed to the conditioning task again in an extinction setting, followed by one session of reacquisition. Pellet intake was then again assessed in holding boxes after these sessions to evaluate retention of the LiCl-induced aversion. Experiment 1 follow-up feeding tests were conducted after rats had already been returned to ad libitum feeding. It should be noted that two animals in the single-cue group were not tested in follow-up feeding tests as they had already been repurposed for other experiments. Experiment 2 follow-up feeding tests were conducted under continued 85% weight restriction to ensure that effects seen in Experiment 1 were not due to satiation.
Behavioral measures and analyses
Lever deflections, magazine entries, and time spent in magazine were recorded through MedPC. During devaluation procedures, pellets were weighed to calculate grams consumed. Magazines were visually inspected after testing to note and count any remaining pellets. All statistical tests were carried out using R (R Core Team 2016). Individual linear mixed models (R package “lme4” from CRAN; Bates et al. 2015) were used to analyze effects on the dependent variable (e.g., lever presses per minute [ppm]) of the fixed effects of experimental group or lever type, and session, while accounting for random effects of differences in individual starting values for the dependent variable in session one and differences in individual learning rates over sessions. Zero-sum contrasts were made for categorical variables (i.e., group and lever type) when appropriate. Linear mixed models were fit by maximum likelihood, and t-tests used Satterthwaite approximations of degrees of freedom (R; “lmerMod”). The reported statistics include parameter estimates (β values), confidence intervals (95% bootstrapped confidence intervals around dependent variable), and P-values (“lmerTest”, Kuznetsova et al. 2016). Comparisons of extinction and reacquisition sessions pre- and post-devaluation were analyzed with paired t-tests (R Core Team 2016).
Linear mixed models were used because they consider aspects of the data structure that repeated measures ANOVA cannot and allow for safer generalization to larger populations of animals. For instance, mixed models expect that repeated measures taken from one animal over time tend to be more similar than measures taken across animals, and can account for these trends (Boisgontier and Cheval 2016). The ability to account for this source of error is especially useful in autoshaping analysis, where animals tend to have individualized responses (Flagel et al. 2009; Meyer et al. 2012), so that this error is not mistaken for a significant effect. Our models here are constructed to account for individual starting values on day 1, as well as individual learning rates over multiple sessions. Graphs and figures were constructed using GraphPad Prism version 7.0a and Adobe Illustrator.
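One way to write the general form of the models described above is as a linear mixed model with a random intercept and a random session slope for each animal. The notation below is an illustrative assumption (the symbols and the exact random-effects structure are not given in the text), shown for the case of a group by session analysis of press rate:

```latex
% y_{ij}: press rate (ppm) of animal i in session j
% G_i:    zero-sum coded group (or lever type) for animal i
% S_j:    session number
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 G_i + \beta_2 S_j + \beta_3 G_i S_j
          + u_{0i} + u_{1i} S_j + \varepsilon_{ij}, \\
(u_{0i}, u_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \Sigma), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2}).
\end{aligned}
```

Here the fixed effects $\beta_1$, $\beta_2$, and $\beta_3$ correspond to the reported group (or lever), session, and interaction estimates, $u_{0i}$ captures each animal's starting value, and $u_{1i}$ captures each animal's learning rate over sessions.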
Acknowledgments
We thank Travis Todd for advice on the task design and manuscript, and Vassiki Chauhan and Alex daSilva for statistical support. This work was supported by funding from The Whitehall Foundation to KSS (2014-05-77).
Author contributions: Cues in the environment that predict events often occur in a sequence. For example, early morning, aromatic smells, and brewing sounds reliably signal a cup of coffee and draw in motivation. Here, we look at how animals respond to, and are motivated toward, lever cues presented in sequence before the delivery of a food reward. These responses are known as sign-tracking and are thought to be a reflection of the cue's motivational value. We find that levers in a sequence produce responding, but that eventually a preference develops for the first cue, the cue temporally distal to reward. Additionally, we show that levers in sequence do not develop distinct responses from levers presented alone. We also confirm that sign-tracking responses to any cue are maintained even when the reward is devalued, and that serial-cue sign-tracking produces an unusual contextual barrier for devaluation. Finally, we show that cues in sequence share a learned association or common value.
Footnotes
Article is online at http://www.learnmem.org/cgi/doi/10.1101/lm.046599.117.
References
- Ahrens AM, Singer BF, Fitzpatrick CJ, Morrow JD, Robinson TE. 2016. Rats that sign-track are resistant to Pavlovian but not instrumental extinction. Behav Brain Res 296: 418–430.
- Anselme P, Robinson MJ, Berridge KC. 2013. Reward uncertainty enhances incentive salience attribution as sign-tracking. Behav Brain Res 238: 53–61.
- Archer T, Sjödén PO, Nilsson L. 1984. The importance of contextual elements in taste-aversion learning. Scand J Psychol 25: 251–257.
- Balleine BW, Dickinson A. 1998. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37: 407–419.
- Balleine BW, Garner C, Gonzalez F, Dickinson A. 1995. Motivational control of heterogeneous instrumental chains. J Exp Psychol Anim Behav Process 21: 203–217.
- Bates D, Maechler M, Bolker B, Walker S. 2015. Fitting linear mixed-effects models using lme4. J Stat Softw 67: 1–48.
- Berridge KC. 2004. Motivation concepts in behavioral neuroscience. Physiol Behav 81: 179–209.
- Berridge KC. 2007. The debate over dopamine's role in reward: the case for incentive salience. Psychopharmacology (Berl) 191: 391–431.
- Bills C, Smith S, Myers N, Schachtman TR. 2003. Effects of context exposure during conditioning on conditioned taste aversion. Anim Learn Behav 31: 369.
- Boakes RA, Westbrook RF, Elliott M, Swinbourne AL. 1997. Context dependency of conditioned aversions to water and sweet tastes. J Exp Psychol Anim Behav Process 23: 56–67.
- Boisgontier KC, Cheval B. 2016. The anova to mixed model transition. Neurosci Biobehav Rev 68: 1004–1005. [DOI] [PubMed] [Google Scholar]
- Brown PL, Jenkins HM. 1968. Auto-shaping of the pigeon's key-peck. J Exp Anal Behav 11: 1–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chang SE, Smith KS. 2016. An omission procedure reorganizes the microstructure of sign-tracking while preserving incentive salience. Learn Mem 23: 151–155. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Collins AL, Greenfield VY, Bye JK, Linker KE, Wang AS, Wassum KM. 2016. Dynamic mesolimbic dopamine signaling during action sequence learning and expectation violation. Sci Rep 6: 20231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Davey GCL, Phillips JH, Witty S. 1989. Signal-directed behavior in the rat: Interactions between the nature of the CS and the nature of the UCS. Anim Learn Behav 17: 447. [Google Scholar]
- Delameter AR. 2004. Experimental extinction in Pavlovian conditioning: behavioural and neuroscience perspectives. Q J Exp Psychol B 57: 97–132. [DOI] [PubMed] [Google Scholar]
- De Tommaso M, Mastropasqua T, Turatto M. 2017. The salience of a reward cue can outlast reward devaluation. Behav Neurosci 131: 226–234. [DOI] [PubMed] [Google Scholar]
- Dickinson A. 1985. Actions and habits: the development of behavioural autonomy. Philos Trans R Soc Lond B Biol Sci 308: 67–78. [Google Scholar]
- Flagel SB, Akil H, Robinson TE. 2009. Individual differences in the attribution of incentive salience toward reward-related cues: Implications for addiction. Neuropharmacology 56: 139–148. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flagel SB, Robinson TE. 2017. Neurobiological basis of individual variation in stimulus-reward learning. Curr Opin Behav Sci 13: 178–185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Holland PC, Asem JS, Galvin CP, Keeney CH, Hsu M, Miller A, Zhou V. 2014. Blocking in autoshaped lever-pressing procedures with rats. Learn Behav 42: 1–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Holland PC, Rescorla RA. 1975. The effect of two ways of devaluing the unconditioned stimulus after first- and second-order appetitive conditioning. J Exp Psychol Anim Behav Process 1: 355–363. [DOI] [PubMed] [Google Scholar]
- Holland PC, Ross RT. 1981. Within-compound associations in serial compound conditioning. J Exp Psychol 7: 228–241. [Google Scholar]
- Holland PC, Wheeler DS. 2008. Representation-mediated food aversions. In Conditioned taste aversion: behavioral and neural processes (ed. Reilly S, Schachtman T). Oxford University Press, New York, New York, Oxford. [Google Scholar]
- Jenkins HM, Moore BR. 1973. The form of the auto-shaped response with food or water reinforcers. J Exp Anal Behav 20: 163–181. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kaplan PS, Hearst E. 1982. Bridging temporal gaps between CS and US in autoshaping: insertion of other stimuli before, during, and after CS. J Exp Psychol Anim Behav Process 8: 187–203. [PubMed] [Google Scholar]
- Kehoe EJ, Gibbs CM, Garcia E, Gormezano I. 1979. Associative transfer and stimulus selection in classical conditioning of the rabbit's nictitating membrane response to serial compound CSs. J Exp Psychol Anim Behav Process 5: 1–18. [DOI] [PubMed] [Google Scholar]
- Kuznetsova A, Brockhoff PB, Christensen RHB. 2016. lmerTest: tests in linear mixed effects models. R package version 2.0–33. [Google Scholar]
- Loy I, Alvarez R, Rey V, Lopez M. 1993. Context-US associations rather than occasion setting in taste aversion learning. Learn Motiv 24: 55–72. [Google Scholar]
- Ludvig EA, Sutton RS, Kehoe EJ. 2012. Evaluating the TD model of classical conditioning. Learn Behav 40: 305–319. [DOI] [PubMed] [Google Scholar]
- Meyer PJ, Lovic V, Saunders BT, Yager LM, Flagel SB, Morrow JD, Robinson TE. 2012. Quantifying individual variation in the propensity to attribute incentive salience to reward cues. PLoS One 7: e38987. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morrison SE, Bamkole MA, Nicola SM. 2015. Sign tracking, but not goal tracking, is resistant to outcome devaluation. Front Neurosci 9: 468. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nasser HM, Chen Y-W, Fiscella K, Calu DJ. 2015. Individual variability in behavioral flexibility predicts sign-tracking tendency. Front Behav Neurosci 9: 289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Olmstead MC, Lafond MV, Everitt BJ, Dickinson A. 2001. Cocaine seeking by rats is a goal-directed action. Behav Neurosci 115: 394–402. [PubMed] [Google Scholar]
- Ostlund SB, Winterbauer NE, Balleine BW. 2009. Evidence of action sequence chunking in goal-directed instrumental conditioning and its dependence on the dorsomedial prefrontal cortex. J Neurosci 29: 8280–8287. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pan WX, Schmidt R, Wickens JR, Hyland BI. 2005. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J Neurosci 25: 6235–6242. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pavlov IP. 1927. Conditioned reflexes. Oxford University Press, Oxford. [Google Scholar]
- R Core Team. 2016. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. [Google Scholar]
- Rescorla RA. 1972. Informational variables in Pavlovian conditioning. Psychol Learn Motiv 6: 1–44. [Google Scholar]
- Rescorla RA. 1982. Simultaneous second-order conditioning produces S-S learning in conditioned suppression. J Exp Psychol Anim Behav Process 8: 23–32. [PubMed] [Google Scholar]
- Rescorla RA. 1997. Spontaneous recovery after Pavlovian conditioning with multiple outcomes. Anim Learn Behav 25: 99–107. [Google Scholar]
- Robinson TE, Berridge KC. 1993. The neural basis for drug craving: an incentive-sensitization theory of addiction. Brain Res Rev 18: 247–291. [DOI] [PubMed] [Google Scholar]
- Robinson TE, Berridge KC. 2008. The incentive sensitization theory of addiction: some current issues. Philos Trans R Soc Lond B Biol Sci 363: 3137–3146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Robinson MJF, Berridge KC. 2013. Instant transformation of learned repulsion into motivational “wanting”. Curr Biol 23: 282–289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schultz W. 2002. Getting formal with dopamine and reward. Neuron 36: 241–263. [DOI] [PubMed] [Google Scholar]
- Schultz W, Apicella P, Ljungberg T. 1993. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci 13: 900–913. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science 275: 1593–1599. [DOI] [PubMed] [Google Scholar]
- Smith KS, Berridge KC. 2005. The ventral pallidum and hedonic reward: neurobiological maps of sucrose “liking” and food intake. J Neurosci 25: 8637–8649. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith KS, Berridge KC, Aldridge JW. 2011. Disentangling pleasure from incentive salience and learning signals in brain reward circuitry. Proc Natl Acad Sci 108: 255–264. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Spence KW. 1947. The role of secondary reinforcement in delayed reward learning. Psychol Rev 54: 1–8. [Google Scholar]
- Thrailkill EA, Bouton ME. 2017a. Effects of outcome devaluation on instrumental behaviors in a discriminated heterogeneous chain. J Exp Psychol Anim Learn Cogn 43: 88–95. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thrailkill EA, Bouton ME. 2017b. Factors that influence the persistence and relapse of discriminated behavior chains. Behav Process 141: 3–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tomie A, Grimes KL, Pohorecky LA. 2008. Behavioral characteristics and neurobiological substrates shared by Pavlovian sign-tracking and drug abuse. Brain Res Rev 58: 121–135. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wassum KM, Ostlund SB, Maidment NT. 2012. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol Psychiatry 71: 846–854. [DOI] [PMC free article] [PubMed] [Google Scholar]