What motivates us to perform specific actions? On one hand, action selection may be influenced predominantly by the stimuli that we have learned to associate with responses in the presence of rewards (i.e., stimulus–response associations). On the other hand, action selection could be driven by an expectation of the consequent outcome (i.e., response–outcome associations). These two types of learned associations have long been hypothesized to compete with each other and drive different forms of operant behavior (Dickinson et al., 1995). A prominent hypothesis is that the dorsolateral striatum, at least in rodents and primates, encodes stimulus–response associations.
Drugs of abuse, such as cocaine, are thought to strengthen stimulus–response encoding in the dorsolateral striatum (Everitt, 2014). This hypothesis is based on the finding that proper dorsolateral striatal functioning is required for the expression of habitual behavior, which itself is thought to rely on stimulus–response associations and underlie drug-seeking (Corbit et al., 2012; Lingawi and Balleine, 2012; Vandaele and Janak, 2017). Furthermore, repeated cocaine exposure accelerates the development of habitual “stimulus–response” behavior, possibly by facilitating glutamate transmission in the striatum and consequently sensitizing dorsolateral striatal neurons (LeBlanc et al., 2013; Parikh et al., 2014; O'Hare et al., 2016). If this is true, then recording from dorsolateral striatal neurons during operant learning in animals that have received repeated cocaine exposure should reveal enhanced stimulus–response encoding.
This was precisely the goal of an experiment recently published in The Journal of Neuroscience (Burton et al., 2017). Single-unit recordings were made in the dorsolateral striatum as rats completed a decision-making task that involved multiple cues, actions, and outcomes. On any given trial, rats were required to hold their nose in a port until an odor was presented that instructed them to turn to the left (odor 1), right (odor 2), or either direction (odor 3) to receive a liquid sucrose reward from a fluid well. There were thus four different odor–action combinations that were rewarded: odor 1-left, odor 2-right, odor 3-left, and odor 3-right. Across blocks of trials, the value and identity of the sucrose rewards in each well were varied. During the first block of a session, the same magnitude of sucrose was delivered to each well, but with different delays (short or long). In the second block, these contingencies were switched. In the third block, the reward delay was the same for both wells, but the magnitude differed (small vs big). These contingencies switched in the fourth block. This created eight different action–outcome combinations: left-short, left-long, left-small, left-big, right-short, right-long, right-small, and right-big.
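The combinatorics of the task described above can be laid out in a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and the assignment of the short delay and the big reward to the left well in the first and third blocks is an assumption, because the text states only that the contingencies switched between blocks.

```python
from itertools import product

# Odor -> rewarded turn directions, as described in the text.
ODOR_RULES = {
    "odor 1": ("left",),
    "odor 2": ("right",),
    "odor 3": ("left", "right"),  # free-choice odor
}

# The four rewarded odor-action combinations.
odor_action = [
    (odor, action)
    for odor, actions in ODOR_RULES.items()
    for action in actions
]

# The four outcomes manipulated across blocks.
OUTCOMES = ("short", "long", "small", "big")

# The eight action-outcome combinations.
action_outcome = list(product(("left", "right"), OUTCOMES))

# Block schedule. ASSUMPTION: which well starts with which outcome
# is not given in the text; only the switches between blocks are.
BLOCKS = [
    {"left": "short", "right": "long"},   # block 1: delays differ
    {"left": "long", "right": "short"},   # block 2: delays switched
    {"left": "big", "right": "small"},    # block 3: magnitudes differ
    {"left": "small", "right": "big"},    # block 4: magnitudes switched
]

# Every action-outcome combination occurs in exactly one block.
covered = {(well, outcome) for block in BLOCKS for well, outcome in block.items()}

print(len(odor_action), len(action_outcome), covered == set(action_outcome))
```

Crossing odor-action pairs with the block schedule is what lets the same turn be paired with different outcomes, and the same outcome with different turns, within a single session.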
This task is useful because it allows researchers to identify neurons that modulate their firing rates during the decision period (between odor onset and port exit) in response to specific odor–action combinations, regardless of the value or identity of the consequent outcome (i.e., stimulus–response encoding). For example, a neuron that displayed a higher firing rate when odor 3 was presented and the rat turned left, compared with when the rat turned right in the presence of odor 3 or left in the presence of odors 1 and 2, would be considered a stimulus–response encoding neuron. Neural activity could also be analyzed to uncover response–outcome encoding by detecting changes in firing rates in accordance with changes in the action–outcome contingencies while collapsing across odor cues. For example, a neuron that displayed a higher firing rate when the rat turned left and a short-delay reward was available in the left well, compared with all the other trial types during which the rat turned left or right, would be considered a response–outcome encoding neuron. To examine the effects of cocaine on associative encoding in the dorsolateral striatum, some rats were allowed to repeatedly self-administer cocaine over 12 days, followed by a month-long withdrawal period before single-unit recordings. Control rats were either allowed to self-administer sucrose pellets instead of cocaine or were never allowed to engage in self-administration of any kind.
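The logic of the two analyses described above can be illustrated with a toy example. The sketch below is not the analysis used by Burton et al. (2017), which relied on statistical tests of firing rates; it only shows what "collapsing across" a task variable means. The function names, the trial format, and the firing rates of the simulated neuron are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def mean_rate_by(trials, keys):
    """Average firing rate for each combination of the given trial fields,
    collapsing across every field that is not listed in `keys`."""
    groups = defaultdict(list)
    for trial in trials:
        groups[tuple(trial[k] for k in keys)].append(trial["rate"])
    return {combo: mean(rates) for combo, rates in groups.items()}

def preferred(trials, keys):
    """Return the combination with the highest mean rate and its margin
    over the runner-up (a margin of 0 means no single preferred combination)."""
    ranked = sorted(mean_rate_by(trials, keys).items(),
                    key=lambda item: item[1], reverse=True)
    (best, best_rate), (_, second_rate) = ranked[0], ranked[1]
    return best, best_rate - second_rate

# HYPOTHETICAL neuron: fires at 10 spikes/s on odor 3-left trials and at
# 4 spikes/s otherwise, whatever outcome is available. Made-up numbers.
ODOR_ACTION = [("odor 1", "left"), ("odor 2", "right"),
               ("odor 3", "left"), ("odor 3", "right")]
trials = [
    {"odor": odor, "action": action, "outcome": outcome,
     "rate": 10.0 if (odor, action) == ("odor 3", "left") else 4.0}
    for odor, action in ODOR_ACTION
    for outcome in ("short", "long", "small", "big")
]

# Stimulus-response view: group by odor and action, collapse across outcomes.
sr_best, sr_margin = preferred(trials, ("odor", "action"))
# Response-outcome view: group by action and outcome, collapse across odors.
ro_best, ro_margin = preferred(trials, ("action", "outcome"))

print(sr_best, sr_margin, ro_margin)
```

For this simulated neuron, the stimulus–response grouping singles out one odor–action combination, whereas the response–outcome grouping shows no single preferred action–outcome combination. A real analysis would also need to test whether such differences exceed trial-to-trial variability.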
The experiment yielded three main findings. First, there was no difference between cocaine-exposed and control rats in the proportion of dorsolateral striatal neurons that were significantly modulated by the odor cue, the chosen action, or combinations of cues and actions. In other words, extended cocaine exposure did not result in enhanced stimulus–response encoding in the dorsolateral striatum. Second, a higher proportion of dorsolateral striatal neurons were significantly modulated by specific action–outcome combinations in cocaine-exposed rats than in controls, suggesting that cocaine exposure enhanced response–outcome encoding. Third, the nature of the response–outcome encoding in cocaine-exposed rats was different from that observed in control rats. At the population level, dorsolateral striatal neurons in control rats fired maximally before rats turned toward one of the fluid wells and when only one of the four outcomes (small magnitude, large magnitude, short delay, long delay) was available in that well. In contrast, neurons in cocaine-exposed rats showed population activity that reflected a more abstract contingency between an action and an outcome (i.e., firing maximally when one of the four outcomes was available in one of the wells regardless of which action the rat ended up choosing).
It is unsurprising that correlates of response–outcome associations were found in the dorsolateral striatum of control and cocaine-exposed rats because previous experiments in rats that used the same task also found robust response–outcome encoding in this brain region (Stalnaker et al., 2010; Burton et al., 2014). Nonetheless, this remains a curious finding because disrupting dorsolateral striatal function does not disrupt goal-directed actions that rely on response–outcome associations (Yin et al., 2004; Corbit and Janak, 2010; Gremel and Costa, 2013). Even more curious is the finding that extended cocaine exposure enhanced response–outcome encoding in the dorsolateral striatum because this type of drug experience has been found to abolish goal-directed control of operant behavior. Specifically, under conditions where rodents are trained to press a lever for a food reward and then the food reward is devalued, prior cocaine exposure prevents the animals from using knowledge about the lever-food association, thus preventing the suppression of lever pressing (LeBlanc et al., 2013; Corbit et al., 2014).
This conundrum can be resolved in three possible ways. First, the types of tasks that are normally used to reveal the weakening, suppression, or erasure of response–outcome associations by cocaine are free-operant tasks with a single response–outcome contingency. The task used by Burton et al. (2017) was not free-operant (operant responses were cued by odors), and there were multiple response–outcome contingencies. Indeed, as the authors point out, the presence of multiple response–outcome contingencies seems to preserve knowledge of response–outcome associations and encourage goal-directed action (e.g., Kosaki and Dickinson, 2010). Cocaine exposure might result in quite different patterns of neural activity in free-operant, single response–outcome contingency situations. Second, the ability of cocaine to weaken, suppress, or erase knowledge of response–outcome associations might work by modifying associative information in the dorsomedial striatum, not the dorsolateral striatum (Corbit et al., 2014). Third, the patterns of neural activity observed by Burton et al. (2017) might not reflect response–outcome associations, but action value. It is tempting to propose that a neuron that fired at a high rate when a large magnitude outcome was available in the left fluid well did not fire at an equally high rate when the short-delay outcome was available in that same well because the neuron codes for a specific response–outcome association, not the value of turning left. However, how rats valued each of the four outcomes cannot be determined without a rigorous behavioral economic assessment. Such a neuron might have shown an equally high firing rate if the identity of the outcome was changed (e.g., by changing its flavor), but its value was held constant. This would be consistent with other evidence showing that striatal neurons encode the values of actions, even when those actions are not chosen, much like what was observed in cocaine-exposed rats (Lau and Glimcher, 2008).
Last, it is quite revealing that cocaine-exposed rats did not show heightened stimulus–response encoding in the dorsolateral striatum. This suggests that we might be thinking incorrectly about how drugs of abuse influence dorsolateral striatal function. Despite the current consensus, there is only weak evidence that cocaine enhances stimulus–response learning or that the dorsolateral striatum is responsible for encoding stimulus–response associations. Cocaine seems to be doing more than just strengthening stimulus–response associations. For example, cocaine enhances the impact of conditioned stimuli on operant lever pressing in rats (LeBlanc et al., 2013; Ostlund et al., 2014). It is unlikely that this can be achieved by a general enhancement of stimulus–response encoding because, in these experiments, magazine entry is the response that is reinforced in the presence of the cue, and the invigoration of this response in the presence of the cue is not enhanced in cocaine-exposed rats (LeBlanc et al., 2013). In addition, habitual responding, that form of action control that is facilitated by cocaine exposure and for which proper dorsolateral striatal functioning is crucial, seems to stem not from enhanced stimulus–response encoding but from increased temporal uncertainty of outcomes or from poor contiguity between actions and outcomes (Derusso et al., 2010). The data from Burton et al. (2017) provide another source of evidence that reinforces the idea that modification of stimulus–response encoding is not an accurate way to describe what the dorsolateral striatum is doing or how drugs of abuse impact dorsolateral striatal function.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/preparing-manuscript#journalclub.
The author declares no competing financial interests.
References
- Burton AC, Bissonette GB, Lichtenberg NT, Kashtelyan V, Roesch MR (2014) Ventral striatum lesions enhance stimulus and response encoding in dorsal striatum. Biol Psychiatry 75:132–139. doi:10.1016/j.biopsych.2013.05.023
- Burton AC, Bissonette GB, Zhao AC, Patel PK, Roesch MR (2017) Prior cocaine self-administration increases response–outcome encoding that is divorced from actions selected in dorsal lateral striatum. J Neurosci 37:7737–7747. doi:10.1523/JNEUROSCI.0897-17.2017
- Corbit LH, Janak PH (2010) Posterior dorsomedial striatum is critical for both selective instrumental and Pavlovian reward learning. Eur J Neurosci 31:1312–1321. doi:10.1111/j.1460-9568.2010.07153.x
- Corbit LH, Nie H, Janak PH (2012) Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biol Psychiatry 72:389–395. doi:10.1016/j.biopsych.2012.02.024
- Corbit LH, Chieng BC, Balleine BW (2014) Effects of repeated cocaine exposure on habit learning and reversal by N-acetylcysteine. Neuropsychopharmacology 39:1893–1901. doi:10.1038/npp.2014.37
- Derusso AL, Fan D, Gupta J, Shelest O, Costa RM, Yin HH (2010) Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front Integr Neurosci 4:17. doi:10.3389/fnint.2010.00017
- Dickinson A, Balleine B, Watt A, Gonzalez F, Boakes RA (1995) Motivational control after extended instrumental training. Anim Learn Behav 23:197–206. doi:10.3758/BF03199935
- Everitt BJ (2014) Neural and psychological mechanisms underlying compulsive drug seeking habits and drug memories: indications for novel treatments of addiction. Eur J Neurosci 40:2163–2182. doi:10.1111/ejn.12644
- Gremel CM, Costa RM (2013) Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun 4:1–12. doi:10.1038/ncomms3264
- Kosaki Y, Dickinson A (2010) Choice and contingency in the development of behavioral autonomy during instrumental conditioning. J Exp Psychol Anim Behav Process 36:334–342. doi:10.1037/a0016887
- Lau B, Glimcher PW (2008) Value representations in the primate striatum during matching behavior. Neuron 58:451–463. doi:10.1016/j.neuron.2008.02.021
- LeBlanc KH, Maidment NT, Ostlund SB (2013) Repeated cocaine exposure facilitates the expression of incentive motivation and induces habitual control in rats. PLoS One 8:4. doi:10.1371/journal.pone.0061355
- Lingawi NW, Balleine BW (2012) Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits. J Neurosci 32:1073–1081. doi:10.1523/JNEUROSCI.4806-11.2012
- O'Hare JK, Ade KK, Sukharnikova T, Van Hooser SD, Palmeri ML, Yin HH, Calakos N (2016) Pathway-specific striatal substrates for habitual behavior. Neuron 89:472–479. doi:10.1016/j.neuron.2015.12.032
- Ostlund SB, LeBlanc KH, Kosheleff AR, Wassum KM, Maidment NT (2014) Phasic mesolimbic dopamine signaling encodes the facilitation of incentive motivation produced by repeated cocaine exposure. Neuropsychopharmacology 39:2441–2449. doi:10.1038/npp.2014.96
- Parikh V, Naughton SX, Shi X, Kelley LK, Yegla B, Tallarida CS, Rawls SM, Unterwald EM (2014) Cocaine-induced neuroadaptations in the dorsal striatum: glutamate dynamics and behavioral sensitization. Neurochem Int 75:54–65. doi:10.1016/j.neuint.2014.05.016
- Stalnaker TA, Calhoon GG, Ogawa M, Roesch MR, Schoenbaum G (2010) Neural correlates of stimulus–response and response–outcome associations in dorsolateral versus dorsomedial striatum. Front Integr Neurosci 4:12. doi:10.3389/fnint.2010.00012
- Vandaele Y, Janak PH (2017) Defining the place of habit in substance use disorders. Prog Neuropsychopharmacol Biol Psychiatry. Advance online publication. Retrieved Jun 27, 2017. doi:10.1016/j.pnpbp.2017.06.029
- Yin HH, Knowlton BJ, Balleine BW (2004) Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci 19:181–189. doi:10.1111/j.1460-9568.2004.03095.x