Author manuscript; available in PMC: 2013 Sep 1.
Published in final edited form as: Biol Psychiatry. 2012 Mar 20;72(5):389–395. doi: 10.1016/j.biopsych.2012.02.024

Habitual alcohol seeking: Time course and the contribution of subregions of the dorsal striatum

Laura H Corbit 1,3, Hone Nie 1, Patricia H Janak 1,2
PMCID: PMC3674580  NIHMSID: NIHMS361147  PMID: 22440617

Abstract

Background

Addictions are defined by a loss of flexible control over behavior. The development of response habits may reflect early changes in behavioral control. The following experiments examined the flexibility of alcohol-seeking following different durations of self-administration training and tested the role of the dorsal striatum in the control of flexible and habitual alcohol self-administration.

Methods

Rats were trained to lever-press to earn unsweetened ethanol (10%). The sensitivity of the lever-press response to devaluation was assessed by prefeeding the rats either ethanol or sucrose prior to an extinction test following different amounts of training (1, 2, 4, and 8 weeks). We subsequently tested the role of the dorsomedial striatum (DMS) and dorsolateral striatum (DLS) in controlling alcohol seeking using reversible inactivation techniques (baclofen/muscimol: 1.0/0.1 mM, 0.3 μl per side).

Results

We find that operant responding for ethanol early in training is goal-directed and reduced by devaluation, but after 8 weeks of daily operant training, control has shifted to a habit-based system no longer sensitive to devaluation. Further, following relatively limited training, when responding is sensitive to devaluation, inactivation of the DMS greatly attenuates the alcohol-seeking response whereas inactivation of the DLS is without effect. In contrast, responding that is insensitive to devaluation following 8 weeks of training becomes sensitive to devaluation following inactivation of the DLS, but unaffected by inactivation of the DMS.

Conclusions

These experiments demonstrate that extended alcohol self-administration produces habit-like responding and that response control shifts from the DMS to the DLS across the course of training.

Keywords: goal-directed, habit learning, devaluation, dorsolateral striatum, dorsomedial striatum, ethanol


Addiction is characterized by the loss of flexible control over drug use despite negative consequences (1). This might reflect the development of habitual drug seeking (2–5); however, few studies have examined whether self-administration of drugs of abuse, including alcohol, is in fact habitual. Several studies have shown that prolonged exposure to alcohol renders consumption insensitive to quinine adulteration (6–8), consistent with inflexible intake, yet changes in the flexibility of responses performed to gain access to alcohol have not been explicitly examined. Further, as most recreational drug use presumably starts as a flexible behavior, it is unclear at what point control shifts to a habit-based system. Within the animal learning field, specific tests have been developed that distinguish flexible from habitual control over responding, and these tests can be employed to address this question.

Flexible performance relies on an expectancy related to the outcome of a particular action. Thus, responding tracks the current value of that outcome and is normally reduced when outcome value is decreased (9). In contrast, automatic or habitual responding is elicited by antecedent stimuli and not directly controlled by outcome expectancy. When responding is habitual, therefore, changes in the value of the rewarding outcome have no immediate effect on performance of that response (10; 11). By specifically manipulating outcome value and observing consequent effects on performance, revaluation tests (9; 12; 13) have become the preferred diagnostic for identifying goal-directed versus habitual responses (reliant on response-outcome or R-O vs. stimulus-response or S-R associative architectures, respectively).

Previous studies have employed devaluation to examine control of responding for alcohol (14). However, several limitations make these results difficult to interpret: the devaluation was nonspecific, affecting responding for both alcohol and a non-alcohol reward, and a sucrose substitution procedure was used to induce responding for alcohol, thereby promoting associations of the lever with both sucrose and alcohol. Therefore, the aim of Experiment 1 was to test the prediction that responding for alcohol should be flexible following relatively limited training but should shift to habitual control following more extensive practice.

A role for the basal ganglia, especially the dorsal striatum (DS), in habit or stimulus-response (S-R) learning has been demonstrated in humans (15; 16) and other animals (17–21). Rats with lesions or inactivation of the dorsolateral striatum (DLS) remain sensitive to outcome devaluation under conditions in which control subjects have lost this sensitivity (22; 23), indicating that the DLS supports habitual instrumental performance. Flexible performance, therefore, must be controlled by a neural circuit that does not include the DLS. Indeed, pharmacological inactivation or NMDA antagonist infusion into medial regions of the DS (DMS) prior to instrumental training (24; 25) or devaluation testing (26) renders responding insensitive to outcome devaluation, suggesting that the DMS mediates performance based on the response-outcome relationship. Hence, functionally distinct circuits contribute to instrumental responding, with the DLS mediating habit/S-R learning and the DMS mediating R-O learning (13; 27).

Based on its role in habit learning, the dorsal striatal system may contribute to the development of a drinking habit. Recent studies found that alcohol self-administration enhances activity at NR2B-containing NMDA receptors in the DMS, and is decreased by DMS infusion of the NMDA receptor antagonist, ifenprodil (28). Further, alcohol self-administration is decreased by BDNF infusion into the DLS, and increased by local reductions in BDNF expression in the same region (29). These and other studies (30–34) support a critical role of the DS in controlling alcohol self-administration. To explore the nature of that role, Experiment 2 tested the hypothesis that flexible responding for alcohol relies on the DMS whereas responding for alcohol that is habitual, following extended training, relies on the DLS.

Methods and Materials

Experiment 1a: Sensitivity to Outcome Devaluation Across Extended Training: Repeated Testing of Subjects after 1, 2, 4, and 8 Weeks of Training

Subjects and apparatus

Nineteen male Long-Evans rats (~300g; Harlan, Indianapolis, IN) were singly housed with free access to food and water. All procedures were approved by the Institutional Animal Care and Use Committee of the Ernest Gallo Clinic and Research Center at the University of California, San Francisco. Training and testing took place in Med Associates (East Fairfield, VT) operant chambers housed within sound-attenuating shells. Each chamber was equipped with a pump fitted with a syringe that delivered a fixed volume of solution into a recessed magazine in the chamber when activated. The chambers contained retractable levers to the left and right of the magazine. A houselight mounted on the top-center of the opposite wall provided illumination.

Alcohol Acclimation in the home cage

To familiarize them with alcohol, rats initially were given free access to 10% ethanol (10E; v/v) in filtered water in the home cage, for 24 hrs a day for 14 days, followed by 14 days of 1 hr access to 10E at the time that training would subsequently occur. Water was always available in a separate bottle fixed to the home cage. Rats were weighed daily and EtOH consumption recorded.

Instrumental training

Rats underwent a single 30 min magazine training session wherein 10E was delivered under a RT-60s schedule. Rats were next trained to make a lever-press response to deliver small aliquots (0.1 ml) of 10E in 60 min sessions. The first two days of training were under a continuous reinforcement schedule; reinforcement was then shifted to a random ratio (RR) 2 schedule for three days, followed by a RR3 schedule. Animals failing to respond at levels sufficient to achieve alcohol intake of at least 0.3 g/kg for 5 out of 7 days a week were excluded from the study (2 animals excluded according to this criterion leaving 17 in this group). The reward receptacle was examined at the end of each session to ensure that the earned rewards were consumed; this was always the case. Training data are shown in Figure S1 in the Supplement.
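The 0.3 g/kg inclusion criterion can be related to the number of 0.1 ml reinforcers earned with a little arithmetic. The following sketch is illustrative only and is not part of the original methods: it assumes an ethanol density of 0.789 g/ml, and the function names and the 300 g example body weight (the approximate starting weight given above) are chosen here for illustration.

```python
import math

# Assumption: density of pure ethanol near room temperature, in g/ml.
ETHANOL_DENSITY_G_PER_ML = 0.789


def ethanol_dose_g_per_kg(volume_ml, body_weight_g, concentration_vv=0.10):
    """Ethanol dose (g/kg) delivered by a given volume of ethanol solution."""
    ethanol_g = volume_ml * concentration_vv * ETHANOL_DENSITY_G_PER_ML
    return ethanol_g / (body_weight_g / 1000.0)


def reinforcers_needed(target_g_per_kg, body_weight_g,
                       aliquot_ml=0.1, concentration_vv=0.10):
    """Smallest number of aliquots whose summed dose reaches the target."""
    per_aliquot = ethanol_dose_g_per_kg(aliquot_ml, body_weight_g, concentration_vv)
    # Small tolerance guards against floating-point noise at exact multiples.
    return math.ceil(target_g_per_kg / per_aliquot - 1e-9)


if __name__ == "__main__":
    weight_g = 300  # hypothetical body weight
    n = reinforcers_needed(0.3, weight_g)
    dose = ethanol_dose_g_per_kg(n * 0.1, weight_g)
    print(f"{n} reinforcers give {dose:.3f} g/kg for a {weight_g} g rat")
```

Under these assumptions, a heavier animal needs proportionally more reinforcers to reach the same g/kg criterion, which is why intake is expressed per unit body weight rather than as a volume.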

Devaluation testing

For each test, rats were divided into 2 groups, devalued and non-devalued. For the devalued condition, rats were given 45 min of free access to 10E in the home cage. For the non-devalued condition, rats were given 45 min of free access to 1% sucrose (1S; wt/vol); this concentration was chosen because it produces consumption volumes similar to those found with 10E (see Figure S2 in the Supplement). A consumption criterion of 3 ml was required for an animal’s data to be included. Immediately following home-cage pre-feeding, rats were tested for lever-press responding in a 10-min extinction test. Following this first test, rats received 2 days of retraining and were tested again such that rats that had received the devaluation treatment now received the non-devalued treatment and vice versa. Tests were conducted following 1, 2, 4, and 8 weeks of training; following each pair of tests, rats underwent daily EtOH training sessions as above.

Experiment 1b: Sensitivity to Outcome Devaluation Following Extended Training: Comparison of Separate Groups Trained for 2 or 8 Weeks

Subjects and apparatus

Rats were assigned to a 2 week (n=11) or 8 week (n=10) group. Home cage ethanol exposure was identical to that described above.

Instrumental training and devaluation testing

Rats were trained to lever-press for 10E as described above. The 2 week group underwent 14 daily sessions and the 8 week group underwent 56 daily sessions prior to the first devaluation test, conducted as described above, with each rat tested in the devalued and non-devalued conditions (order counterbalanced).

Experiment 1c: Sensitivity of a Sucrose-Seeking Response to Outcome Devaluation Following Two or Eight Weeks of Training

Subjects and apparatus

Rats were assigned to a 2 week (n=9) or 8 week (n=9) group or an 8 week plus ethanol group (SucEth; n= 13). Rats had 48 hours of free access to a 2% sucrose solution (2S; wt./vol. in filtered water) in the home cage prior to training. The total volume of 2S made available was equivalent to the average total volume of 10E consumed by rats in the pre-exposure phase of Experiment 1b above. The 2S solution was chosen in an attempt to produce similar response rates as 10E (see Figure S3 in the Supplement).

Instrumental training and devaluation testing

Animals were trained to lever-press for 0.1 ml aliquots of 2S as above, with 14 (2-week group) or 56 daily sessions (8-week group) prior to the first devaluation test. To address whether non-contingent exposure to ethanol accelerates habit formation, a SucEth group was given 4 weeks of home cage 10E, followed by training to lever-press for sucrose. This group received 30 min ethanol access in the home cage 4 h after each of 56 daily sucrose self-administration sessions. Devaluation followed the test procedures described above except that for the devalued condition rats were given 45-min of free access to 2S whereas for the non-devalued condition rats were given access to a 5% polycose solution which produced similar consumption levels (Figure S3 in the Supplement). Each rat underwent two tests, one in the devalued condition and one in the non-devalued condition (order counterbalanced).

Experiment 2: The Role of the Dorsomedial and Dorsolateral Striatum Following Limited or Extended Training

Subjects and apparatus

Subjects were 46 rats, with housing conditions, testing apparatus, and initial acclimation to 10E identical to Experiment 1a.

Instrumental training

Rats were trained to lever-press for 10E as described above. Rats were divided into a 2-week (14 daily sessions) and an 8-week (56 daily sessions) group (see Figure S4 in the Supplement for training data).

Surgery

Rats in the 2-week and 8-week groups were further divided into lateral or medial groups after attempting to equate baseline instrumental response rates (N=11–12 per group). Surgery was performed after ~1 week of training for the 2-week group and after ~7 weeks of training for the 8-week group to allow one week of post-surgery responding prior to testing. Stereotaxic surgery was conducted under isoflurane anaesthesia to implant 26 gauge guide cannulae (Plastics One, Roanoke, VA) targeted at either the dorsolateral striatum (DLS; AP: +1.2 mm, ML: ±3.4 mm, DV: −1.0 mm) or dorsomedial striatum (DMS; AP: +1.2 mm, ML: ±1.5 mm, DV: −1.4 mm; coordinates relative to bregma). Guide cannulae tips were positioned 3 mm dorsal to the intended infusion site; thus, final DV coordinates for the infuser tips were −4.0 and −4.4 mm ventral to dura for DLS and DMS targets, respectively.

Infusions

Each animal underwent a total of 4 tests to allow testing in both the devalued and non-devalued conditions under both saline and inactivation conditions (order counterbalanced for each placement group and devaluation condition). Inactivation was achieved with a cocktail of the GABA-B receptor agonist, baclofen, and the GABA-A receptor agonist, muscimol (BAC/MUS; 1.0/0.1 mM, Sigma, St Louis MO) via infusion cannulae (33 gauge; Plastics One) extending 3 mm below the guide cannula tip (0.3 μl/min/hemisphere) 10 minutes prior to test (1 min infusion). Saline vehicle was administered as the control treatment.
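As a rough check on the infusion parameters, the amount of each drug delivered per hemisphere follows from concentration times volume, and the four tests follow from crossing infusion with devaluation. The sketch below is illustrative only; the function names are hypothetical and the calculation assumes the full 1 min infusion volume is delivered.

```python
from itertools import product


def infused_volume_ul(rate_ul_per_min, duration_min):
    """Volume delivered per hemisphere, in microlitres."""
    return rate_ul_per_min * duration_min


def amount_nmol(concentration_mm, volume_ul):
    """Amount of drug in nmol; a 1 mM solution contains 1 nmol per microlitre."""
    return concentration_mm * volume_ul


def within_subject_conditions():
    """The 2 x 2 set of tests each animal receives (infusion x devaluation)."""
    return list(product(("saline", "BAC/MUS"), ("devalued", "non-devalued")))


if __name__ == "__main__":
    volume = infused_volume_ul(0.3, 1.0)  # 0.3 ul/min for 1 min
    print(f"volume per hemisphere: {volume:.2f} ul")
    print(f"baclofen: {amount_nmol(1.0, volume):.2f} nmol per hemisphere")
    print(f"muscimol: {amount_nmol(0.1, volume):.2f} nmol per hemisphere")
    for infusion, pretreatment in within_subject_conditions():
        print(infusion, pretreatment)
```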

Histology

Coronal sections (50 μm) of formalin-fixed tissue were sliced, mounted, and stained with Nissl stain to allow verification of cannulae placement.

Data Analysis

Data were analyzed using repeated-measures or mixed analysis of variance (ANOVA) as appropriate. Devaluation (devalued vs. non-devalued) was a within-subjects factor in all experiments; additional within-subjects factors were training duration (1, 2, 4, and 8 weeks) for Expt. 1a and infusion (saline vs. baclofen/muscimol) for Expt. 2. For Expts. 1b and 1c, the between-subjects factor was duration (2 vs. 8 weeks). For Expt. 2, the between-subjects factor was region (DMS vs. DLS).
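For a within-subjects factor with only two levels, such as devalued vs. non-devalued, the repeated-measures F statistic equals the square of the paired t statistic, with degrees of freedom (1, n − 1). The sketch below shows that calculation under this equivalence; it is not the authors' analysis code, the function name is hypothetical, and the lever-press counts are invented for illustration.

```python
import math


def paired_anova_f(devalued, nondevalued):
    """F statistic and degrees of freedom for a two-level within-subject factor.

    Uses the identity F(1, n-1) = t**2, where t is the paired t statistic.
    Raises ValueError if the inputs are mismatched, too short, or have
    identical differences (zero variance).
    """
    n = len(devalued)
    if n != len(nondevalued) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [nd - d for d, nd in zip(devalued, nondevalued)]
    mean_diff = sum(diffs) / n
    var_diff = sum((x - mean_diff) ** 2 for x in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance")
    t = mean_diff / math.sqrt(var_diff / n)
    return t * t, (1, n - 1)


if __name__ == "__main__":
    # Hypothetical lever presses for five rats in each test condition.
    devalued = [12, 8, 15, 10, 9]
    nondevalued = [20, 14, 18, 17, 16]
    f_value, df = paired_anova_f(devalued, nondevalued)
    print(f"F{df} = {f_value:.2f}")
```

With 17 subjects tested in both conditions, this gives the (1, 16) degrees of freedom reported for the per-time-point comparisons in Experiment 1a.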

Results

Experiment 1a. Sensitivity to Outcome Devaluation Across Extended Training. Repeated Testing of Subjects after 1, 2, 4, and 8 Weeks of Training

We tested the hypothesis that responding for alcohol would initially be goal-directed, but would become habitual with extended training, by examining sensitivity to devaluation by ethanol pre-exposure. After limited training, we expected responding for ethanol to be reduced by pre-feeding with ethanol, but not by pre-feeding with sucrose, while after extended training, we expected no selective decrease in responding after ethanol pre-feeding. The results from the test sessions are in line with these predictions (Figure 1). ANOVA revealed an effect of training length [F(3,48)=9.2, p<.01] and devaluation [F(1,16)=5.6, p<.05]. The interaction between these factors was marginal [F(3,48)=2.7, p=.058]; however, one-way ANOVAs conducted for each time point indicated that following limited training (1 week: Figure 1A; F(1,16)=10.96, p<.01; 2 weeks: Figure 1B; F(1,16)=6.4, p<.05), EtOH pre-feeding reduced responding on the EtOH lever compared to 1S pre-feeding, demonstrating a significant devaluation effect. Following 4 weeks of training, responding was numerically reduced; however, the devaluation effect was no longer significant [Figure 1C; F(1,16)=2.1, p>.05]. At the 8-week time point, rats no longer decreased their responding for EtOH following EtOH pre-exposure [Figure 1D; F(1,16)<1]. This insensitivity to reward devaluation suggests that 8 weeks of daily training is sufficient for the development of habitual responding for alcohol in rats.

Figure 1. Time course for the development of habitual alcohol seeking.

Mean lever presses (+/− SEM) in a 10-min test following pre-feeding of either alcohol (devalued) or sucrose (non-devalued) after different amounts of training. Responding was significantly reduced by devaluation after (A) 1 week or (B) 2 weeks of self-administration, but not after (C) 4 weeks or (D) 8 weeks of training, suggesting that 8 weeks of daily training is sufficient for the development of an alcohol “habit” in rats. N=17; ** indicates p<.01; * indicates p<.05. Separate groups of rats were trained and tested only once following either two (E) or eight (F) weeks of training. Again, rats decreased responding following devaluation of the alcohol outcome following two weeks but not eight weeks of training. N=10–11 per group.

Experiment 1b. Sensitivity to Outcome Devaluation Following Extended Training. Comparison of Separate Groups Trained for 2 or 8 Weeks

It is possible that repeated testing could account for the lack of sensitivity to outcome devaluation observed above. To address this possibility, separate groups of rats were trained for either 2 or 8 weeks and then tested for sensitivity to devaluation. Rats tested following two weeks of training decreased responding following devaluation of the alcohol outcome (Figure 1E) whereas those tested following eight weeks of training were not sensitive to this manipulation (Figure 1F). ANOVA revealed no effect of Group [F(1,21)<1] but a significant effect of devaluation [F(1,21)=8.6, p<.01] and an interaction between these factors [F(1,21)=4.1, p<.05]. One-way ANOVAs conducted for each group revealed a significant devaluation effect in the two-week group [F(1,10)=7.0, p<.05] but not the eight week group [F(1,10)=1.0, p>.05].

Experiment 1c. Sensitivity of a Sucrose-Seeking Response to Outcome Devaluation Following Two or Eight Weeks of Training

To address whether the development of habitual responding is invariably produced by the instrumental training used above, we examined sensitivity to devaluation following either limited or extended training with sucrose reward. As shown in Figure 2A–B, responding for sucrose was sensitive to devaluation at both time points. ANOVA revealed no effect of Group [F(1,17)<1], a significant effect of devaluation [F(1,17)=10.4, p<.01] and no interaction between these factors [F(1,17)=1.3, p>.05]. Separate rats were given non-contingent ethanol during training to respond for sucrose reward. Their average daily intake in the 3 days prior to testing was 2.5 ml, which corresponded to an average ethanol dose of 0.42 g/kg. When tested, this group failed to show sensitivity to devaluation [Figure 2C; F(1,12)=3.2, p>.05]. As rats differed in the amount of ethanol consumed, we explored the relationship between total ethanol consumed across the experiment and the magnitude of the devaluation effect (non-devalued responses minus devalued responses). We found a significant negative correlation between these measures [Pearson’s r=−.62, p<.05], indicating that rats that drank more showed reduced sensitivity to devaluation and that sensitivity potentially remained for lower drinkers.
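The correlation analysis above can be sketched as follows: the devaluation effect is computed per rat as non-devalued minus devalued responses and then correlated with total ethanol consumed. This is a minimal illustration, not the authors' code; the function names are hypothetical and all numbers are invented, so the resulting r is not the reported value.

```python
import math


def devaluation_effect(nondevalued, devalued):
    """Per-subject devaluation effect: non-devalued minus devalued responses."""
    return [nd - d for nd, d in zip(nondevalued, devalued)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data for five rats: total ethanol consumed (ml) and
    # lever presses in the two test conditions.
    total_ethanol_ml = [10, 20, 30, 40, 50]
    nondevalued = [30, 28, 25, 22, 20]
    devalued = [10, 12, 18, 20, 21]
    effect = devaluation_effect(nondevalued, devalued)
    print(f"r = {pearson_r(total_ethanol_ml, effect):.2f}")
```

A negative r in this arrangement means that animals with higher ethanol intake show a smaller difference between the non-devalued and devalued tests.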

Figure 2. Sensitivity of responding for sucrose reward to devaluation following either two or eight weeks of training.

Mean lever presses (+/− SEM) in 10-min following pre-feeding of either sucrose (devalued) or polycose (non-devalued). Responding for sucrose was sensitive to devaluation at both the two week (A) and eight week (B) time points suggesting that responding for sucrose does not transition to habitual performance within this time frame. However, rats given home cage access to ethanol (C) over the course of instrumental training did not show sensitivity to devaluation indicating that non-contingent ethanol promoted habit formation. N=9–13 per group. * indicates p<.05.

Experiment 2. The Role of the Dorsomedial and Dorsolateral Striatum Following Limited or Extended Training

Limited Training

The effects of inactivation of the DMS or DLS on sensitivity to devaluation were assessed following two weeks of training. As shown in Figure 3A&B, following saline infusion, rats in both the DMS and DLS groups decreased responding following devaluation of the alcohol outcome. This result was unaffected by DLS inactivation; however, responding in both the devalued and non-devalued conditions was reduced by inactivation of the DMS. ANOVA indicated no main effect of Group [F(1,20) <1], but there was an effect of both inactivation [F(1,20)=9.6, p<.01], and devaluation [F(1,20)=14.6, p<.01]. None of the two-way interactions were significant, but importantly, there was a significant group x inactivation x devaluation interaction [F(1,20)= 6.4, p<.05]. Further analyses of the DMS group indicated an effect of DMS inactivation [F(1,10)=10.0, p<.05] and only a marginal effect of devaluation [F(1,10)=4.1, p=.07] but a significant interaction between these factors [F(1,10)=5.7, p<.05] explained by a significant devaluation effect following saline [F(1,10)=9.0, p<.05] but not inactivation treatment [F(1,10)<1]. The decrease in performance of the non-devalued response suggests that an intact DMS is required for performance of this goal-directed action. In contrast, further analyses of the DLS group revealed an effect of inactivation [F(1,10)=4.7, p<.05] and devaluation [F(1,10)=13.6, p<.01] but no interaction [F(1,10)<1] suggesting that while responding was somewhat lower following inactivation, the sensitivity to devaluation was preserved.

Figure 3. Effects of inactivation of the DMS or DLS on sensitivity to devaluation following either two or eight weeks of training.

Mean lever presses (+/− SEM) in 10 min following devaluation or control treatment and infusion of saline or baclofen/muscimol into either the DMS or DLS. (A&B) A significant effect of devaluation was observed following two weeks of training for rats in both the DMS and DLS groups following saline infusion. (A) This sensitivity was lost following inactivation of the DMS. (B) Sensitivity to devaluation was unaffected by inactivation of the DLS. (C&D) Following 8 weeks of training, performance of rats in both the DMS and DLS groups was not sensitive to devaluation of the alcohol outcome. (C) Rats in the DMS group remained insensitive to devaluation following inactivation. (D) Inactivation of the DLS led to a renewed sensitivity to devaluation. N=11 or 12 per group. * indicates p<.05.

Extended Training

We assessed the effects of inactivation of the DMS or DLS on sensitivity to devaluation following eight weeks of training. Following saline infusion, rats in the DMS and DLS groups showed evidence of habitual responding for alcohol; performance was not sensitive to alcohol devaluation (Figure 3C&D). DMS inactivation did not affect this pattern (Fig. 3C). In contrast, DLS inactivation led to a renewed sensitivity to devaluation (Fig. 3D). ANOVA revealed a significant effect of inactivation [F(1,22)=10.3, p<.01], an interaction between inactivation and devaluation [F(1,22)=4.8, p<.05], as well as a three-way interaction between inactivation, devaluation and group [F(1,22)=5.4, p<.05]. No other main effects or interactions were significant. Further analyses of the DMS group indicated no effect of inactivation [F(1,11)=2.6, p>.05], devaluation [F(1,11)<1] or interaction between these factors [F(1,11)<1]. In contrast, rats in the DLS group showed a significant effect of inactivation [F(1,11)=14.4, p<.01], no effect of devaluation [F(1,11)<1] and an interaction between these factors [F(1,11)=5.5, p<.05]. This was explained by a significant effect of devaluation in this group following inactivation [F(1,11)=8.7, p<.05] but not saline infusion [F(1,11)<1].

Schematic representation of the cannulae placements for both the 2 and 8 week groups is shown in Figure 4.

Figure 4. Schematic representation of cannulae placements within the DMS or DLS for rats in Experiment 2.

The location of the injector tips is represented by squares for the DMS group and circles for the DLS group. Numbers on the left indicate the distance anterior to bregma. Coronal images after (48).

Discussion

We have shown that responding for alcohol is flexible and sensitive to changes in outcome value following limited instrumental self-administration training. In contrast, following extended training, responding for alcohol no longer shows sensitivity to changes in outcome value, the hallmark of automatic or habitual performance. This effect can be accounted for, at least in part, by the effects of chronic exposure to alcohol as rats given equivalent ethanol but trained to respond to earn sucrose reward similarly failed to show sensitivity to devaluation. Selective performance following devaluation relies on an intact DMS as inactivation of this structure reduces responding and eliminates the devaluation effect. Habitual responding is controlled by the DLS as inactivation of this structure causes the habitual responding observed under control conditions to revert to the goal-directed or flexible system that is sensitive to changes in the current value of the earned outcome. Together these results suggest that, with extended training and alcohol exposure, instrumental responding for alcohol loses flexibility, and control of performance shifts from medial to lateral regions of the DS. The return of flexible responding following inactivation of the DLS further suggests that these two response strategies exist in parallel. Specifically, while the relative dominance of the habit vs. flexible systems in response control changes across training and accumulating ethanol exposure, flexible responding can still be recovered under conditions where habitual performance might typically dominate. These results have important implications for theories of both instrumental conditioning and addiction.

The schedule of reinforcement on which animals are trained influences development of habitual responding, with interval schedules producing rapid (days) habit development and ratio schedules being relatively insensitive to this process (35). It has been argued that this difference stems from the fact that the perceived causal relationship between an individual response and outcome is weaker for interval schedules compared to ratio schedules (11). Differences in reinforcement schedules may explain why previous studies have shown sensitivity to devaluation following extended training in a procedure where a block of responses was required to initiate a single drinking episode, thus providing fewer S-R pairings than the current procedure (36). Nonetheless, for an addict, the transition from recreational to habitual or even compulsive drug use occurs over time, and actions made to obtain alcohol or other drugs may be more closely modelled by ratio than interval schedules. Thus, we suggest that the use of ratio schedules that yield goal-directed responding at early time points, yet have the capacity to transition to habitual control following extensive practice, may provide a more valid model for the development of a drug habit within which further questions about neural and pharmacological control can be addressed. Future studies extending this model, for example, using progressive ratio responding or punishment, or testing under alcohol-dependent conditions, are needed to evaluate potential motivational changes and their relation to habitual responding across the course of extended training (7; 37).
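The contrast between ratio and interval schedules can be illustrated with idealized feedback functions relating response rate to expected reward rate. The sketch below is not taken from the present study, which used only ratio schedules. It assumes that on a random-ratio schedule each response is rewarded with probability 1/ratio, and that on a random-interval schedule a reward is armed at random with the stated mean interval and then held until the next response, with responses occurring at random at a constant average rate. The function names and the 1-min interval are hypothetical.

```python
def rr_reward_rate(responses_per_min, ratio):
    """Expected rewards per minute on a random-ratio schedule."""
    return responses_per_min / ratio


def ri_reward_rate(responses_per_min, mean_interval_min):
    """Expected rewards per minute on an idealized random-interval schedule.

    Each reward cycle lasts the mean arming time plus the mean wait for the
    next response (1 / response rate), so the rate is the inverse of that sum.
    """
    if responses_per_min <= 0:
        return 0.0
    return 1.0 / (mean_interval_min + 1.0 / responses_per_min)


if __name__ == "__main__":
    for rate in (5, 10, 20, 40):
        rr = rr_reward_rate(rate, ratio=3)                 # RR3, as used in training
        ri = ri_reward_rate(rate, mean_interval_min=1.0)   # hypothetical RI 1-min
        print(f"{rate:>2} resp/min  RR3: {rr:5.2f}  RI-1min: {ri:4.2f} rewards/min")
```

Under these assumptions, doubling the response rate doubles earnings on the ratio schedule but barely changes them on the interval schedule, which is one way to express the weaker response-outcome relationship attributed to interval schedules.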

The effects of DMS inactivation are consistent with previous studies examining the control of flexible responding for food reward; these studies found that an intact DMS is required for both acquisition and expression of selective response-outcome associations (24–26). Notably, DMS inactivation, and thus the functional loss of the goal-directed system, may explain the overall low performance observed in the present study early in training when the habit system remains relatively weak (26). In contrast, we suggest that the lack of effect of DMS inactivation on overall responding later in training is because the contribution of the habit system is relatively strong and thus able to support performance, albeit without sensitivity to devaluation. Our finding that DLS inactivation eliminates habitual performance is also consistent with previous demonstrations with food reward (38), and the recent report that performance of a seeking-taking chain for cocaine delivery is sensitive to devaluation (achieved by extinguishing the taking response) following limited but not extended training and relies on the DLS (23). A role of the DLS in cocaine-seeking is also supported by the finding that DLS DA antagonist microinfusion reduces responding for cocaine, as do unilateral accumbens core lesions paired with DA antagonist infusion into the contralateral DLS (39). This latter finding implicates connectivity among striatal regions via descending striatal projections to midbrain DAergic nuclei that in turn project to striatal territories (spiralling striato-nigral-striatal circuits, as described in (5; 40)). The possibility that the shift in control from DMS to DLS we observed is mediated by DMS/DLS interaction through striatonigral loops remains to be tested.
Interestingly, eight weeks of responding for sucrose reward did not produce insensitivity to devaluation unless animals were also exposed to alcohol, suggesting that alcohol rewards may have a greater propensity to support habit development, consistent with previous findings (14; 37; 41).

Our finding that non-contingent ethanol accelerates habit formation in sucrose-trained animals is consistent with previous experiments demonstrating that non-contingent exposure to amphetamine leads to rapid development of habitual performance for natural rewards (42; 43). Whether similar results would be observed following alcohol exposure only prior to training awaits further testing. Acute and chronic alcohol exposure, including alcohol self-administration, alters physiological properties of neurons within the striatum (28; 44; 45). For example, in the DMS, acute alcohol leads to long-term depression of synapses that normally show long-term potentiation (45), while 24-hr withdrawal from alcohol produces a long-term facilitation of activity of NR2B-containing NMDA receptors (28; 44). These findings suggest the possibility that alcohol’s pharmacological actions within the DMS alter the normal learning processes mediated by this region perhaps favoring a shift to DLS control. It is also possible that alcohol-induced suppression of R-O mechanisms/DMS function may slow or prevent the reversal of automatic responding contributing to the persistence of alcohol addiction.

Consistent with views in animal learning theory, it has been suggested that human drug use is likewise controlled by both cognitive and non-cognitive (automatic) processes (46). While on the one hand, drug users may experience intense desire for drugs which could be considered exaggerated valuation, it has also been shown that environmental stimuli may activate brain circuitry involved in drug desire without conscious awareness (47), and ritualized aspects of drug seeking routines may become automatic or habitual with repeated practice, suggesting that multiple cognitive processes contribute to drug seeking behaviors. The development of automaticity of drug seeking seen in the current model and the apparent separation of performance from evaluation may capture some aspects of the addictive process.

Acknowledgments

This research was supported by grants from the NIH (R01 AA018025; PHJ) and ABMRF (LHC). We are grateful to Lacey Sahuque and Natalie Warrick for contributions to animal training.

Footnotes

Financial Disclosures

The authors declare no biomedical financial interests or potential conflicts of interest.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 4th ed., Text Revision. Washington, DC: American Psychiatric Association; 2000.
  • 2. Belin D, Jonkman S, Dickinson A, Robbins TW, Everitt BJ. Parallel and interactive learning processes within the basal ganglia: relevance for the understanding of addiction. Behav Brain Res. 2009;199:89–102. doi: 10.1016/j.bbr.2008.09.027.
  • 3. Gerdeman GL, Partridge JG, Lupica CR, Lovinger DM. It could be habit forming: drugs of abuse and striatal synaptic plasticity. Trends Neurosci. 2003;26:184–192. doi: 10.1016/S0166-2236(03)00065-1.
  • 4. Vanderschuren LJ, Everitt BJ. Behavioral and neural mechanisms of compulsive drug seeking. Eur J Pharmacol. 2005;526:77–88. doi: 10.1016/j.ejphar.2005.09.037.
  • 5. Everitt BJ, Robbins TW. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat Neurosci. 2005;8:1481–1489. doi: 10.1038/nn1579.
  • 6. Lesscher HM, van Kerkhof LW, Vanderschuren LJ. Inflexible and indifferent alcohol drinking in male mice. Alcohol Clin Exp Res. 2010;34:1219–1225. doi: 10.1111/j.1530-0277.2010.01199.x.
  • 7. Hopf FW, Chang SJ, Sparta DR, Bowers MS, Bonci A. Motivation for alcohol becomes resistant to quinine adulteration after 3 to 4 months of intermittent alcohol self-administration. Alcohol Clin Exp Res. 2010;34:1565–1573. doi: 10.1111/j.1530-0277.2010.01241.x.
  • 8. Wolffgramm J. An ethopharmacological approach to the development of drug addiction. Neurosci Biobehav Rev. 1991;15:515–519. doi: 10.1016/s0149-7634(05)80142-3.
  • 9. Adams CD, Dickinson A. Instrumental responding following reinforcer devaluation. Q J Exp Psychol B. 1981;33:109–121.
  • 10. Adams CD. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q J Exp Psychol B. 1982;34:77–98.
  • 11. Dickinson A. Actions and habits: the development of behavioral autonomy. Philos Trans R Soc Lond B. 1985;308:67–78.
  • 12. Balleine BW, O’Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology. 2010;35:48–69. doi: 10.1038/npp.2009.131.
  • 13. Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci. 2006;7:464–476. doi: 10.1038/nrn1919.
  • 14. Dickinson A, Wood N, Smith JW. Alcohol seeking by rats: action or habit? Q J Exp Psychol B. 2002;55:331–348. doi: 10.1080/0272499024400016.
  • 15. Knowlton BJ, Mangels JA, Squire LR. A neostriatal habit learning system in humans. Science. 1996;273:1399–1402. doi: 10.1126/science.273.5280.1399.
  • 16. Tricomi E, Balleine BW, O’Doherty JP. A specific role for posterior dorsolateral striatum in human habit learning. Eur J Neurosci. 2009;29:2225–2232. doi: 10.1111/j.1460-9568.2009.06796.x.
  • 17. Atallah HE, Lopez-Paniagua D, Rudy JW, O’Reilly RC. Separate neural substrates for skill learning and performance in the ventral and dorsal striatum. Nat Neurosci. 2007;10:126–131. doi: 10.1038/nn1817.
  • 18. Barnes TD, Kubota Y, Hu D, Jin DZ, Graybiel AM. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature. 2005;437:1158–1161. doi: 10.1038/nature04053.
  • 19. Faure A, Haberland U, Conde F, El Massioui N. Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci. 2005;25:2771–2780. doi: 10.1523/JNEUROSCI.3894-04.2005.
  • 20. Featherstone RE, McDonald RJ. Dorsal striatum and stimulus-response learning: lesions of the dorsolateral, but not dorsomedial, striatum impair acquisition of a stimulus-response-based instrumental discrimination task, while sparing conditioned place preference learning. Neuroscience. 2004;124:23–31. doi: 10.1016/j.neuroscience.2003.10.038.
  • 21. Jog MS, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM. Building neural representations of habits. Science. 1999;286:1745–1749. doi: 10.1126/science.286.5445.1745.
  • 22. Yin HH, Knowlton BJ. Contributions of striatal subregions to place and response learning. Learn Mem. 2004;11:459–463. doi: 10.1101/lm.81004.
  • 23. Zapata A, Minney VL, Shippenberg TS. Shift from goal-directed to habitual cocaine seeking after prolonged experience in rats. J Neurosci. 2010;30:15457–15463. doi: 10.1523/JNEUROSCI.4072-10.2010.
  • 24. Corbit LH, Janak PH. Posterior dorsomedial striatum is critical for both selective instrumental and Pavlovian reward learning. Eur J Neurosci. 2010;31:1312–1321. doi: 10.1111/j.1460-9568.2010.07153.x.
  • 25. Yin HH, Knowlton BJ, Balleine BW. Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. Eur J Neurosci. 2005;22:505–512. doi: 10.1111/j.1460-9568.2005.04219.x.
  • 26. Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci. 2005;22:513–523. doi: 10.1111/j.1460-9568.2005.04218.x.
  • 27. Balleine BW, Liljeholm M, Ostlund SB. The integrative function of the basal ganglia in instrumental conditioning. Behav Brain Res. 2009;199:43–52. doi: 10.1016/j.bbr.2008.10.034.
  • 28. Wang J, Lanfranco MF, Gibb SL, Yowell QV, Carnicella S, Ron D. Long-lasting adaptations of the NR2B-containing NMDA receptors in the dorsomedial striatum play a crucial role in alcohol consumption and relapse. J Neurosci. 2010;30:10187–10198. doi: 10.1523/JNEUROSCI.2268-10.2010.
  • 29. Jeanblanc J, He DY, Carnicella S, Kharazia V, Janak PH, Ron D. Endogenous BDNF in the dorsolateral striatum gates alcohol drinking. J Neurosci. 2009;29:13494–13502. doi: 10.1523/JNEUROSCI.2243-09.2009.
  • 30. McGough NN, He DY, Logrip ML, Jeanblanc J, Phamluong K, Luong K, et al. RACK1 and brain-derived neurotrophic factor: a homeostatic pathway that regulates alcohol addiction. J Neurosci. 2004;24:10542–10552. doi: 10.1523/JNEUROSCI.3714-04.2004.
  • 31. Xia JX, Li J, Zhou R, Zhang XH, Ge YB, Ru Yuan X. Alterations of rat corticostriatal synaptic plasticity after chronic ethanol exposure and withdrawal. Alcohol Clin Exp Res. 2006;30:819–824. doi: 10.1111/j.1530-0277.2006.00095.x.
  • 32. Chen J, Nam HW, Lee MR, Hinton DJ, Choi S, Kim T, et al. Altered glutamatergic neurotransmission in the striatum regulates ethanol sensitivity and intake in mice lacking ENT1. Behav Brain Res. 2010. doi: 10.1016/j.bbr.2010.01.011.
  • 33. Jeanblanc J, He DY, McGough NN, Logrip ML, Phamluong K, Janak PH, Ron D. The dopamine D3 receptor is part of a homeostatic pathway regulating ethanol consumption. J Neurosci. 2006;26:1457–1464. doi: 10.1523/JNEUROSCI.3786-05.2006.
  • 34. Vengeliene V, Leonardi-Essmann F, Sommer WH, Marston HM, Spanagel R. Glycine transporter-1 blockade leads to persistently reduced relapse-like alcohol drinking in rats. Biol Psychiatry. 2010;68:704–711. doi: 10.1016/j.biopsych.2010.05.029.
  • 35. Dickinson A, Nicholas DJ, Adams CD. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Q J Exp Psychol B. 1983;35:35–51.
  • 36. Samson HH, Cunningham CL, Czachowski CL, Chappell A, Legg B, Shannon E. Devaluation of ethanol reinforcement. Alcohol. 2004;32:203–212. doi: 10.1016/j.alcohol.2004.02.002.
  • 37. Vanderschuren LJ, Everitt BJ. Drug seeking becomes compulsive after prolonged cocaine self-administration. Science. 2004;305:1017–1019. doi: 10.1126/science.1098975.
  • 38. Yin HH, Knowlton BJ, Balleine BW. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci. 2004;19:181–189. doi: 10.1111/j.1460-9568.2004.03095.x.
  • 39. Belin D, Everitt BJ. Cocaine seeking habits depend upon dopamine-dependent serial connectivity linking the ventral with the dorsal striatum. Neuron. 2008;57:432–441. doi: 10.1016/j.neuron.2007.12.019.
  • 40. Haber SN, Fudge JL, McFarland NR. Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J Neurosci. 2000;20:2369–2382. doi: 10.1523/JNEUROSCI.20-06-02369.2000.
  • 41. Miles FJ, Everitt BJ, Dickinson A. Oral cocaine seeking by rats: action or habit? Behav Neurosci. 2003;117:927–938. doi: 10.1037/0735-7044.117.5.927.
  • 42. Nelson A, Killcross S. Amphetamine exposure enhances habit formation. J Neurosci. 2006;26:3805–3812. doi: 10.1523/JNEUROSCI.4305-05.2006.
  • 43. Nordquist RE, Voorn P, de Mooij-van Malsen JG, Joosten RN, Pennartz CM, Vanderschuren LJ. Augmented reinforcer value and accelerated habit formation after repeated amphetamine treatment. Eur Neuropsychopharmacol. 2007;17:532–540. doi: 10.1016/j.euroneuro.2006.12.005.
  • 44. Wang J, Carnicella S, Phamluong K, Jeanblanc J, Ronesi JA, Chaudhri N, et al. Ethanol induces long-term facilitation of NR2B-NMDA receptor activity in the dorsal striatum: implications for alcohol drinking behavior. J Neurosci. 2007;27:3593–3602. doi: 10.1523/JNEUROSCI.4749-06.2007.
  • 45. Yin HH, Park BS, Adermark L, Lovinger DM. Ethanol reverses the direction of long-term synaptic plasticity in the dorsomedial striatum. Eur J Neurosci. 2007;25:3226–3232. doi: 10.1111/j.1460-9568.2007.05606.x.
  • 46. Tiffany ST, Conklin CA. A cognitive processing model of alcohol craving and compulsive alcohol use. Addiction. 2000;95(Suppl 2):S145–153. doi: 10.1080/09652140050111717.
  • 47. Childress AR, Ehrman RN, Wang Z, Li Y, Sciortino N, Hakun J, et al. Prelude to passion: limbic activation by “unseen” drug and sexual cues. PLoS One. 2008;3:e1506. doi: 10.1371/journal.pone.0001506.
  • 48. Paxinos G, Watson C. The rat brain in stereotaxic coordinates. 4th ed. San Diego: Academic Press; 1998.
