Author manuscript; available in PMC: 2020 Dec 1.
Published in final edited form as: J Neurosci Res. 2020 Jan 8;98(6):1031–1045. doi: 10.1002/jnr.24581

Interfacing Behavioral and Neural Circuit Models for Habit Formation

Talia N Lerner 1,*
PMCID: PMC7183881  NIHMSID: NIHMS1546967  PMID: 31916623

Abstract

Habits are an important mechanism by which organisms can automate the control of behavior to alleviate cognitive demand. However, transitions to habitual control are risky because they lead to inflexible responding in the face of change. The question of how the brain controls transitions into habit is thus an intriguing one. How do we regulate when our repeated actions become automated? When is it advantageous or disadvantageous to release actions from cognitive control? Decades of research have identified a variety of methods for eliciting habitual responding in animal models. Progress has also been made to understand which brain areas and neural circuits control transitions into habit. Here, I discuss existing research on behavioral and neural circuit models for habit formation (with an emphasis on striatal circuits), and discuss strategies for combining information from different paradigms and levels of analysis to prompt further progress in the field.

Introduction

How does the brain control behavior? Some actions are goal-directed: we imagine the consequences of particular choices and take careful measures to ensure good, cost-efficient outcomes to our actions. Other actions are habitual: we respond to familiar situations by relying on established routines and practiced skills. Both goal-directed and habitual strategies may be useful for survival, depending on context. Automating a subset of routine behaviors by creating habits allows fast, efficient responding without significant cognitive demand, but leads to inflexible responding in the face of change (Dickinson, 1985; Packard and Knowlton, 2002; Yin and Knowlton, 2006). Thus, the brain must decide when it is appropriate to create habits from repeated actions and when it is more advantageous to stay goal-directed. Importantly, not every brain will balance between goal-directed and habitual control in the same way: individual differences in habit learning rates may contribute to a variety of individual differences in reward seeking strategies, and may also contribute to an individual’s risk for disorders such as drug addiction (George and Koob, 2017). Habit formation is thus a key area for further study, to better understand how we use habits to navigate our daily lives and how we can manipulate habit formation circuits to mitigate disease risk and treat existing patients.

Maladaptive habit formation mechanisms have been hypothesized to contribute to a variety of neuropsychiatric problems, including obsessive-compulsive disorder (OCD), autism, and drug addiction (Alvares et al., 2016; Everitt and Robbins, 2016; Gillan et al., 2014). While these disorders are distinct from each other when considered as a whole, they share the characteristic that problematic behavioral sequences are repeatedly executed and are difficult to inhibit. However, the exact contributions of habit per se to the particular symptoms of each disorder remain unclear. For example, in the context of drug addiction, it has been observed both that habit-associated brain areas become engaged in drug seeking with extended training (Everitt and Robbins, 2016) and that this engagement of habit areas is not necessary: constantly solving for new action-outcome contingencies to receive drug reward, which prevents habit formation, preserves many characteristics of drug addiction in a rodent model (including escalating use and punishment-resistant drug seeking; Singer et al., 2018). Thus, the role of habits in drug addiction has been questioned. Do habits play a role in some aspects of drug addiction? Perhaps in some individuals but not others?

In fact, to determine how dysfunction of the habit system contributes to the development of a brain disorder such as addiction, we need two major things. First, we must better formalize how habits are defined behaviorally. As detailed below, many studies of habit use different methodologies, and while the tasks used may all be related to one another, there are also potentially important differences. These differences do not need to be erased, but understood and related to each other. In other words, we should take care not to artificially narrow our view of habit in pursuit of a clean definition; rather, the goal should be to understand how the primary features of habit contribute to many varied circumstances. Second, we must develop a circuit model for how habitual behavior is produced, such that the statement that “habit circuits” are engaged or disrupted is meaningful across analyses. One way to answer the question of whether habit is involved in controlling a behavior is with a behavioral probe such as outcome devaluation. Another way to connect across behavioral paradigms would be to ask whether similar neural circuits are engaged by related, putatively habit-inducing, tasks. Below, I summarize knowledge and progress on these two issues, with a view as to how the field can proceed to develop a better interface between behavioral and circuit level models of habit.

Tasks to Probe Habits in Animal Models

Colloquially, habits are simply actions that are performed regularly and are resistant to change, a definition which influences our intuitive understanding of habit and our communications with the public on the findings of our research about habits. Scientifically, however, habits have a narrower, more specific definition. A habit is developed when a stimulus-response association is formed. The stimulus is a familiar sensory cue or environmental context, which then triggers a responsive action without consideration of the expected outcome of that action and/or without consideration of the value of the action’s outcome to the animal. Thus, habitual actions are performed automatically, even when they appear to be maladaptive.

When a habitual action does produce an adaptive outcome, it can be difficult to determine that the action was produced by force of habit rather than by goal-directed control. But under the stimulus-response definition of habit, one can test for habitual behavior by creating a situation in which habitual and goal-directed control systems will differ in the actions they produce. Generally, experimenters do this by manipulating the value of an outcome or by manipulating the action-outcome contingency within a task. Both approaches to probing for habit have been employed frequently in the literature, with variations in how the action-outcome contingency or outcome value is manipulated. As other reviews in this issue rightly point out, the probes chosen to evaluate habitual behavior can significantly influence study outcomes and interpretations (Schreiner et al., Woon et al., this issue). Table 1 summarizes the most common approaches that have been taken to probe habit. Some tests manipulate outcome values: they reduce the animals’ motivation for the outcome (satiety-specific devaluation, in which an animal is pre-fed a reinforcer to reduce its drive to obtain that particular reinforcer) or they induce a negative valence to the outcome (LiCl pairing, in which an animal learns to associate a previously palatable reinforcer with malaise). Other tests manipulate the action-outcome contingency. Omission probes reverse the contingency of actions and outcomes, requiring animals to withhold their responding to earn rewards. Contingency degradation delivers rewards regardless of responding. Since these two action-outcome contingency manipulations may read out slightly different aspects of behavioral flexibility, it is imperative to closely examine the methods used to measure habit in the existing literature on habit formation.
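The logic of these probes can be made concrete with a toy sketch. It assumes, purely for illustration, that a goal-directed controller scales its responding by the current outcome value and action-outcome contingency, whereas a habitual (stimulus-response) controller ignores both. The function names, the multiplicative form, and the 0.8 threshold are hypothetical choices, not quantities taken from the studies cited here, and omission probes (where the contingency is reversed) are not modeled.

```python
def predicted_response_rate(controller, baseline_rate, outcome_value, contingency):
    """Toy prediction of response rate during a probe test.

    controller    : "goal_directed" or "habitual"
    baseline_rate : responses per minute before any manipulation
    outcome_value : 1.0 = fully valued, 0.0 = fully devalued (assumed scale)
    contingency   : 1.0 = intact, 0.0 = fully degraded (assumed scale)
    """
    if controller == "goal_directed":
        # Assumption: responding tracks both outcome value and contingency.
        return baseline_rate * outcome_value * contingency
    if controller == "habitual":
        # Assumption: a stimulus-response habit is insensitive to both.
        return baseline_rate
    raise ValueError(f"unknown controller: {controller!r}")


def looks_habitual(valued_rate, devalued_rate, threshold=0.8):
    """Flag responding as habit-like if it persists after devaluation.

    The threshold is an arbitrary illustrative cutoff, not a field standard.
    """
    if valued_rate <= 0:
        raise ValueError("valued_rate must be positive")
    return devalued_rate >= threshold * valued_rate


if __name__ == "__main__":
    for controller in ("goal_directed", "habitual"):
        valued = predicted_response_rate(controller, 10.0, 1.0, 1.0)
        devalued = predicted_response_rate(controller, 10.0, 0.0, 1.0)
        print(controller, valued, devalued, looks_habitual(valued, devalued))
```

Note that a readout like this only detects insensitivity to the manipulation; on its own it says nothing about whether that insensitivity reflects a strong stimulus-response association or a weak goal-directed system.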

Table 1.

Methods for Probing Habit Formation

| Probe type | Variable manipulated | Selected references |
| --- | --- | --- |
| Satiety-specific devaluation | Outcome value | DeRusso et al., 2010; Gremel and Costa, 2013; Gremel et al., 2016; Vandaele et al., 2017; Yin et al., 2005a, 2005b |
| LiCl taste aversion devaluation | Outcome value | Smith et al., 2012; Vandaele et al., 2017; Yin et al., 2004 |
| Omission | Action-outcome contingency | DeRusso et al., 2010; Rossi and Yin, 2012; Yu et al., 2009 |
| Contingency degradation | Action-outcome contingency | Gourley et al., 2013; Vandaele et al., 2017; Yin et al., 2005b |

Another important issue is that under this methodology of identifying habits, habits are defined simply as the impairment of goal-directed behavior. A goal-directed behavior should be responsive to both action-outcome contingency changes and changes in outcome value (Dickinson and Balleine, 1994), so the loss of either is used as evidence for habit. However, this definition of habit may be problematic. Using the probes described in Table 1, it is impossible to determine whether an apparent “habit” is the result of a strengthened stimulus-response association or a weakening of goal-directed mechanisms (Vandaele and Janak, 2018). Additionally, it has been argued that habitual and goal-directed control mechanisms operate in a hierarchical organization or in parallel rather than being mutually exclusive (Dezfouli and Balleine, 2013; Lee et al., 2014). In this case, traditional probe tasks would fail to capture important dynamics of the system. The use of so-called two-step decision tasks to assess model-free vs model-based learning is one attempt to simultaneously and non-exclusively measure contributions from habitual and goal-directed control systems to behavioral output (Daw et al., 2005, 2011), but there is still substantial disagreement in the field about whether model-free and model-based learning map onto habitual vs goal-directed behavior as elicited by more traditional operant tasks and probe tests. Indeed, alternative computational frameworks to explain habit have recently been proposed (Miller et al., 2019). The traditional operant tasks used to elicit habit formation are discussed in the next section of this review. For a review of the use of the two-step task, see Geramita et al. (this issue).
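To illustrate what a non-exclusive, parallel readout might look like, the sketch below mixes model-free and model-based values for first-stage actions using a single weight w. This weighted-mixture form is one common way such hybrid accounts are written down; it is offered here as an assumption-laden sketch rather than the specific model fit in any cited study. The function names and the example transition probabilities and values are hypothetical.

```python
def model_based_values(transition_probs, stage2_values):
    """Expected value of each first-stage action under a known transition model.

    transition_probs[a][s] : P(second-stage state s | first-stage action a)
    stage2_values[s]       : learned values of the actions available in state s
    """
    return [
        sum(p * max(stage2_values[s]) for s, p in enumerate(row))
        for row in transition_probs
    ]


def hybrid_values(q_model_free, q_model_based, w):
    """Blend cached (model-free) and planned (model-based) values.

    w = 1.0 means purely model-based; w = 0.0 means purely model-free.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [w * mb + (1.0 - w) * mf for mf, mb in zip(q_model_free, q_model_based)]


if __name__ == "__main__":
    # Illustrative numbers only.
    transitions = [[0.7, 0.3], [0.3, 0.7]]
    stage2 = [[1.0, 0.0], [0.0, 0.0]]
    q_mb = model_based_values(transitions, stage2)
    q_mf = [0.2, 0.6]  # hypothetical cached values
    for w in (0.0, 0.5, 1.0):
        print(w, hybrid_values(q_mf, q_mb, w))
```

Under this kind of scheme, the fitted weight is a graded estimate of the balance between the two systems, rather than the binary habitual-or-not verdict produced by a probe test.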

What Kind of Operant Training Induces Habits?

The strategy that an animal uses to control its behavior is dependent (at least in part) on the external structure of the task it is asked to perform. A number of different tasks have been developed to elicit habit formation as measured by the probe approaches in Table 1. Random/Variable ratio (RR) or random/variable interval (RI) schedules are the most commonly used. In an RR schedule, multiple responses (e.g. lever presses or nosepokes) are required for the subject to earn a reward. The exact number of responses, however, is variable. In an RI schedule, rewards are only available to be earned (by performing a lever press or nosepoke) after a certain period of time, which is variable. The subject must continue to respond to check if the response will be rewarded. In rodents, RI schedules are more effective than RR schedules at producing habits (Dickinson and Charnock, 1985; Dickinson et al., 1983; Gremel and Costa, 2013; Yin and Knowlton, 2006). Both of these random schedules of reinforcement are in turn much more effective at eliciting habit than a fixed ratio schedule, where the relationship between the action and outcome is entirely predictable and stable (DeRusso et al., 2010). However, we have never fully understood why this should be true.
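One way to build intuition for the difference between the two random schedules is to simulate them. The sketch below assumes that an RR schedule can be approximated by rewarding each response with probability 1/(mean ratio), and an RI schedule by arming a reward after an exponentially distributed wait that the next response then collects. These are common modeling simplifications, not details drawn from the cited experiments, and the function names are hypothetical.

```python
import random


def simulate_random_ratio(n_presses, mean_ratio, rng):
    """RR sketch: every press is rewarded with probability 1 / mean_ratio."""
    p_reward = 1.0 / mean_ratio
    return sum(1 for _ in range(n_presses) if rng.random() < p_reward)


def simulate_random_interval(press_times, mean_interval, rng):
    """RI sketch: a reward is armed after a random wait (seconds).

    The first press at or after the arming time collects the reward and
    restarts the timer; presses made before then earn nothing.
    """
    rewards = 0
    next_available = rng.expovariate(1.0 / mean_interval)
    for t in sorted(press_times):
        if t >= next_available:
            rewards += 1
            next_available = t + rng.expovariate(1.0 / mean_interval)
    return rewards


if __name__ == "__main__":
    rng = random.Random(1)
    session_s = 3600
    slow = [float(t) for t in range(session_s)]       # 1 press per second
    fast = [t / 2.0 for t in range(2 * session_s)]    # 2 presses per second
    print("RR20 slow/fast:",
          simulate_random_ratio(len(slow), 20, rng),
          simulate_random_ratio(len(fast), 20, rng))
    print("RI60 slow/fast:",
          simulate_random_interval(slow, 60.0, rng),
          simulate_random_interval(fast, 60.0, rng))
```

In this simplified picture, doubling the response rate roughly doubles earned rewards on the RR schedule but barely changes them on the RI schedule, so the experienced relationship between responding and reward is weaker under RI. Whether that weaker relationship is what makes RI schedules more habit-promoting remains, as noted above, an open question.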

In fact, some recent work challenges the view that it is merely uncertainty which promotes habitual responding. Vandaele and colleagues demonstrated that a fixed ratio schedule (FR5) can in fact lead to rapid habit formation when it is bracketed by lever insertion and removal (Vandaele et al., 2017). This “discrete trials” version of the FR5 task, termed DT5, suggests that habit formation is accelerated by cued task bracketing, which seems to stand in contrast to uncertainty. However, cues may help accelerate habit formation by creating clear stimuli for stimulus-response associations to form around, and by bracketing tasks into clearly defined action sequences. The fact that both an RI60 task and a DT5 task are effective at eliciting habitual responding raises the question of whether these apparently very differently structured tasks actually engage different circuit-level routes to habitual performance. In vivo recordings during these tasks may help to clarify, and will be discussed further in the second part of the review.

Some additional operant tasks have also been designed to elicit habits. In particular, Graybiel and colleagues have taken advantage of a T-maze task in multiple studies of habit (Kubota et al., 2009; Smith and Graybiel, 2016; Smith et al., 2012; Thorn et al., 2010). In the T-maze task, rats run down the long arm of a T-maze and are cued halfway down as to which direction they should turn at the end to receive reward. This task differs from a classic RI reinforcement schedule in important ways: animals are rewarded every time they make a correct decision (no uncertain waiting periods that induce high response rates), and they must perform a sensory discrimination to determine the correct decision for each trial. Nevertheless, after overtraining, rats are unable to adjust their behavior after outcome devaluation, continuing to run down the T-maze and turn to the devalued side when instructed (Smith et al., 2012). Results from these studies are compelling and form a consistent body of literature, yet it remains unclear (as for the DT5 task) whether habitual performance in the T-maze is elicited via the same circuit-level mechanisms as habitual performance observed after RI60 training, or via different ones.

Motor Skill Learning as Habit

Habit formation and motor skill learning are related to each other. They are often discussed in parallel, and sometimes conflated. Motor skill learning involves the chunking of action sequences into fluidly-executed motions requiring minimal cognitive engagement. The ability to learn new motor skills depends on similar brain areas as habit formation. For example, skill learning in mice on an accelerating rotarod test depends on dopamine-dependent shifts in encoding between the dorsomedial and dorsolateral striatum (Yin et al., 2009), similar to the shifts in encoding observed as habitual performance emerges during operant training (Yin et al., 2004, 2005a, 2005b), which is also dependent on dopamine (Faure et al., 2005). Learning on the accelerating rotarod has also been used to model acquired repetitive behaviors in mouse models of autism, which have coincident changes in striatal circuitry (Rothwell et al., 2014). In human patients with Parkinson’s disease, in whom nigrostriatal dopamine signaling is impaired, there are deficits in new motor skill acquisition (Kawashima et al., 2018) as well as in habit (Bannard et al., 2019; Knowlton et al., 1996; Witt et al., 2002).

Motor skill acquisition has also been assessed in rodents using a variety of skilled reaching tasks and fast, timing-dependent sequences of lever pressing (Jin and Costa, 2010; Jin et al., 2014; Kawai et al., 2015; Xu et al., 2009). Basal ganglia circuit function and striatal dopamine inputs are again at the heart of these learned behaviors. Dopamine cells projecting to the dorsal striatum signal the beginning and end of learned action sequences, and help to control learned sequence-related activity in the striatum (Jin and Costa, 2010; Jin et al., 2014). Motor skill acquisition also stabilizes dendritic spines in motor cortex (Xu et al., 2009), but provocatively, motor cortex was found to be dispensable for the execution of a previously learned motor task, suggesting that subcortical circuits can independently support motor execution after learning has occurred (Kawai et al., 2015). Indeed, lesions of the dorsolateral striatum (DLS), which prevent habit formation (Yin et al., 2004), also prevent learned motor skill execution (Dhawale et al., 2019).

Another model system in which motor skill learning has been investigated is songbirds. Songbirds have a specialized song learning circuit called the anterior forebrain pathway, which includes areas analogous to cortex, basal ganglia, and thalamus in mammals (Doupe et al., 2005). In songbirds such as the zebra finch, song is a highly stereotyped and easily quantified motor output. The combination of such an elegant motor output with a brain circuit that is dedicated to producing it (and separated from the circuits controlling other movements) makes birdsong a very appealing system for studying the relationship between brain activity and behavior. From studies of birdsong we know that the song-related basal ganglia, and its dopaminergic inputs, are required for song learning to take place (Brainard and Doupe, 2000; Gadagkar et al., 2016). The implication is that this type of skill learning too may bear relationships with habit learning in mammals. Thus, advances in our understanding of how song production is controlled in the avian brain stand to inform many of our studies in mammals, including those involving habit.

Whether singing in birds or mice is a “habit” is not obvious. Despite the similarities in brain structures required for motor skill learning and habit formation, the relationship between these behaviors, especially as defined by performance in the probe tests listed in Table 1, remains to be formalized. Skills such as singing are performed in the absence of external rewards like sucrose pellets, but changes in behavior can be driven by sensory feedback and internal template matching, which also drives dopaminergic reward prediction error signals (Gadagkar et al., 2016). Eventually, skilled singing is rewarded by mating opportunities in the wild, but the behavior is learned well before mating occurs (e.g. given an appropriate tutor, male zebra finches learn and crystallize their song around the same time they reach sexual maturity, ~90 days post hatching). In lab animals, actual mating may never occur as a consequence of singing, yet the behavior is still learned and performed. Thus, song learning is an interesting but perhaps exceptional context in which a motor skill is acquired due to an innate drive. Still, the study of song learning has provided important principles for motor learning more generally, such as the critical role of variability in motor performance to learning (Dhawale et al., 2017). The variability that drives motor learning appears to be created by basal ganglia circuits, which also support habit formation (Dhawale et al., 2017). What role does behavioral variability play in habit formation? And could that be a key to understanding why different reward schedules promote it?

A primary difference between tests of motor skills and habits is timing. Motor skills generally involve precision of action on the millisecond time scale, whereas habits encoding relationships between lever pressing and reward retrieval from a separate reward delivery port involve learning about events separated by seconds. Whether there are common striatal and dopaminergic mechanisms capable of mediating both millisecond- and second-scale feedback to alter behavior, particularly transitions to habit, is largely unknown. One study in songbirds showed that birds are capable of learning from millisecond-scale auditory feedback; however, the authors did not explore whether a dopaminergic mechanism mediates that effect (Charlesworth et al., 2011). In rodents, millisecond timescale dorsal striatal dopamine signals can bias animals towards changes in action, a plausible mechanism for inducing fast behavioral adaptations in a motor sequence in response to salient feedback (Howe and Dombeck, 2016; Jin and Costa, 2010; da Silva et al., 2018).

Grooming Behavior

Grooming is a repetitive behavior that mice perform spontaneously without training. It follows a stereotyped sequence, starting from the nose and working back across the face and body. Grooming meets our intuitive or colloquial definition of habit as a regularly performed behavior, and thus is often discussed in relation to habit. Grooming is also similar to the skilled motor tasks described above, requiring fine coordinated sequences of movement to execute. Like birdsong, it is a stereotyped behavior acquired early in life. But is grooming a habit in the formal psychological sense? Self-injurious overgrooming is observed in several mouse models of obsessive-compulsive disorder (OCD) and autism, contributing to the hypothesis that habit plays a role in these disorders (Peça et al., 2011; Shmelkov et al., 2010; Welch et al., 2007). The fact that mice with OCD/autism-related mutations will continue to groom even when the behavior is apparently harmful does suggest a connection to habit: these mice seem unable to discontinue a behavior even when the action is leading to a maladaptive outcome.

In addition to spontaneous grooming, mice can be induced to groom when a water drop is applied to the head (Burguière et al., 2013). In a mouse model of OCD (Sapap3 model; Welch et al., 2007), water-induced grooming behavior transitions into additional spontaneous grooming bouts, providing support for the idea that repetitive behaviors in OCD are the result of exuberant habit formation that quickly disconnects actions from desired outcomes. However, to what degree grooming behavior is or is not related to other types of habit, such as learned operant behaviors and motor skills, is not well established.

Compulsive Behavior

Habits have been widely hypothesized to contribute to addiction. In particular, habits may contribute to compulsive drug seeking, usually defined as drug seeking in the face of negative consequences. While in humans the negative consequences of drug seeking typically involve the loss of money, jobs, and important social relationships, as well as negative long-term health effects, in animals these negative consequences are often modeled simply as electrical shocks. Other simple methods for modeling the negative consequences of drug taking in animal models include inducing malaise associated with the drug via LiCl or histamine treatment, or adding bitterants to the drug (primarily quinine added to alcohol) to cause an aversive taste response (Vanderschuren et al., 2017). Extended drug taking in rodents leads to the perseverance of drug seeking behavior even when shocks are also delivered as a consequence for seeking and it is hypothesized that habits play a role in this perseverance (Belin et al., 2008; Chen et al., 2013; Pelloux et al., 2007; Vanderschuren and Everitt, 2004).

While generally studied in the context of drug abuse, compulsive responding is not limited to responding for drugs. Rodents will also tolerate electrical shocks to receive sucrose in some circumstances (Datta et al., 2018; Nieh et al., 2015). As mentioned above, mice with OCD-linked gene mutations will continue to groom even when it causes pain and injury, an obvious negative consequence. Thus, it is important to understand how negative feedback plays a role in shaping the emergence of habitual behavior, and whether habit formation circuits drive punishment-resistant reward seeking both generally and in particular circumstances or disorders.

Many tests for compulsive responding ask rodents to learn a new action-outcome association between their previously learned action and a new aversive outcome such as a shock. The tests also potentially change the perceived cost of obtaining a rewarding outcome or indirectly reduce the value of the outcome since it is paired with aversion, depending on the timing of the aversive feedback. Thus, tests for “compulsivity” are similar to the probes designed to test for habit formation. Shock paradigms to test for compulsion vary in their methods. Some punish lever pressing with certainty, while others deliver shock probabilistically (e.g. Chen et al., 2013; Deroche-Gamonet et al., 2004). Some studies in monkeys delay the aversive outcome, although most rodent studies using shock deliver it immediately (Epstein and Kowalczyk, 2018; Vanderschuren et al., 2017; Woolverton et al., 2012). One important difference between shock delivery and aversive pairing (e.g. LiCl pairing) is that aversive pairing directly degrades the value of the reward, whereas shocks that occur immediately as a consequence for lever pressing punish the action but leave the reward value intact (for further commentary on this issue see: Epstein and Kowalczyk, 2018; Vanderschuren et al., 2017). Differentiating between compulsive and habitual responding is thus a challenge, though one which may be surmountable with the addition of circuit-level investigations demonstrating whether similar or different neural mechanisms are involved in each.

Notably, habitual and compulsive responding do not always track together. When Singer and colleagues trained rats to solve a new operant “puzzle” each day to get cocaine, they found that rats still escalated their cocaine intake and continued to seek cocaine when a footshock consequence was imposed (Singer et al., 2018). However, the behavior in theory could not be fully automated, since the actions required to get the outcome were changing each day. Additionally, blocking dopamine signaling in the DLS did not interrupt cocaine-seeking behavior in this paradigm, in contrast to other studies (Giuliano et al., 2019; Murray et al., 2014; Vanderschuren et al., 2005). These results demonstrate that compulsive drug seeking does not absolutely require dopamine signaling in the DLS. However, the data do not preclude the involvement of “habit” defined in the behavioral sense e.g. by outcome devaluation procedures.

Another example of the dissociation between compulsive and habitual responding comes from Willuhn and colleagues. They found that dopamine signaling in the DLS in response to cocaine self-administration develops over weeks of training (Willuhn et al., 2012). DLS dopamine signaling in this case was required for the selection of drug seeking actions; however, the behavioral paradigm used (short access self-administration, 1 hour/day) is generally not sufficient to achieve compulsive (shock-resistant) drug seeking. Therefore, a model emerges in which habit-related brain systems may become engaged in behavior independently of compulsive responding. The involvement of DLS may precede the development of compulsive behavior, but, intriguingly, a transition to reliance on the DLS system predicts vulnerability to compulsivity (Giuliano et al., 2019). Still, based on these few studies it is difficult to fully assess the relationship between these two closely related concepts. Additionally, it is not known whether cocaine hijacks habitual and compulsive neural mechanisms in a non-naturalistic way or whether responding for natural rewards such as sucrose would produce similar dissociations. One study found that the development of compulsive sucrose-seeking behavior in rats could not predict the development of compulsive cocaine-seeking behavior, suggesting that the neurobiological basis for the engagement of habit may differ under conditions of cocaine use (Datta et al., 2018).

Avoidance Learning

The vast majority of behavioral studies of habit have focused on paradigms in which animals receive valued rewards for their actions (positive reinforcement). However, animals also learn from aversive outcomes (positive punishment), from the relief of aversive outcomes (negative reinforcement), and from the removal of rewarding outcomes (negative punishment). It unfortunately remains unclear whether and how striatal habit mechanisms are engaged by feedback mechanisms other than positive reinforcement. Studies of compulsivity help address the role of positive punishment, but what about the roles of negative reinforcement and punishment? It has been suggested that active avoidance learning, in which animals perform an action to prevent a shock from occurring, may invoke habit (LeDoux et al., 2017). Habit is an appealing explanation for why animals continue to perform actions that prevent negative consequences, since as the animal correctly performs preventative actions they essentially begin to perform the actions in extinction (i.e. if the animal’s actions prevent the negative consequence from occurring 100% of the time, then no obvious outcomes occur as the performance of the behavior continues).
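The four kinds of feedback named above follow a simple two-by-two scheme: a stimulus is either added or removed as a consequence of the action, and that stimulus is either rewarding or aversive. The lookup below is a minimal sketch of that scheme; the function name and the string labels for its arguments are hypothetical conveniences.

```python
def classify_consequence(stimulus_change, stimulus_valence):
    """Name the operant contingency from what happens after the action.

    stimulus_change  : "added" or "removed"
    stimulus_valence : "rewarding" or "aversive"
    """
    table = {
        ("added", "rewarding"): "positive reinforcement",
        ("added", "aversive"): "positive punishment",
        ("removed", "aversive"): "negative reinforcement",
        ("removed", "rewarding"): "negative punishment",
    }
    try:
        return table[(stimulus_change, stimulus_valence)]
    except KeyError:
        raise ValueError(
            f"unrecognized combination: {stimulus_change!r}, {stimulus_valence!r}"
        ) from None


if __name__ == "__main__":
    # Active avoidance: the action prevents (removes) an expected shock.
    print(classify_consequence("removed", "aversive"))
```

In this scheme, "positive" and "negative" refer to whether a stimulus is added or removed, not to whether the experience is pleasant.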

Understanding how habits contribute to avoidance is a key question in the study of anxiety disorders and OCD. Human OCD patients show stronger learning of avoidance habits than control subjects (Gillan et al., 2014). However, this enhanced habit formation is associated with an increase in activity in the caudate (analogous to rodent DMS), an area for goal-directed control, but not with changes in the putamen (analogous to rodent DLS), suggesting that avoidance habits in OCD may be the result of impaired goal-directed systems rather than strengthened habit learning (Gillan et al., 2015). Human studies of avoidance habits also show that a history of early-life stress, which is associated with vulnerability to a number of psychiatric disorders, promotes the development of avoidance habits (Patterson et al., 2019). Despite these interesting human findings, animal studies on avoidance learning have largely not considered habit. Geramita et al. (this issue) discuss some of the reasons why, with suggestions for moving forward.

If we can develop better ways to model avoidance habits in rodents, there are many interesting circuit-level hypotheses to explore, including the roles for dopamine and dorsal striatal circuits. Dopamine, which is thought to be important for habit formation when learning from positive reinforcement, is also likely important for learning from aversive outcomes. Subsets of dopamine neurons increase their activity for aversive outcomes and for cues predicting aversive outcomes (Lammel et al., 2012; Lerner et al., 2015; Matsumoto and Hikosaka, 2009; Menegas et al., 2018), which could allow for the invigoration of actions by punishment. Additionally, some dopamine neurons projecting to the NAc respond to safety cues (indicating that shocks will not occur) and encode a “safety prediction error” signal (Stelly et al., 2019). Dopamine neurons projecting to the caudal tail of the striatum are also potentially interesting in the context of avoidance learning, as ablation of these neurons has been shown to reduce avoidance (Menegas et al., 2018). However, whether or not the activity of any of these dopamine neurons can control habit formation in the context of avoidance learning is not yet determined. Future studies more thoroughly examining the role of habit-related neural circuitry in learning from different types of reinforcement and punishment may help the field to clarify its definitions of habitual control over behavior.

Towards a Circuit Model for Habit Formation

As behavioral work on habit and related tasks has proceeded as described above, so too has work to create a convincing circuit model for habit formation. Such a model is essential for progress in the field. Without a circuit model for habit formation, we cannot be sure if the various tasks being used to study habits and other potentially related behaviors (see Table 2) converge on similar circuits. Furthermore, without a strong working model of normal habit formation, we are limited in our ability to systematically test whether habit circuits are altered in animal models of neuropsychiatric disease.

Table 2.

Behavioral Paradigms for Habit Formation and Related Behaviors

| Behavioral paradigm | Key features |
| --- | --- |
| Random Interval (RI) Training | Positive reinforcement. Uncertainty in timing leads to high levels of responding. |
| Fixed Ratio Discrete Trials (DT5) Training | Positive reinforcement. Cueing of discrete trials is a key to habit formation. |
| T-Maze | Positive reinforcement. Sensory discrimination task leading to habit with extensive overtraining. |
| Motor Skill Learning (e.g. Accelerating Rotarod, Skilled Reaching Tasks, Vocal Learning) | Mixed reinforcement/punishments depending on the task, or can be performed without explicit external feedback. Precise motor timing requirements may engage habit mechanisms to ensure fluid action sequences. |
| Grooming | Robust innate repetitive behavior. Self-injurious overgrooming may invoke positive punishment and model OCD symptoms such as excessive hand washing and trichotillomania. |
| Compulsive Drug or Sucrose Seeking | Positive punishment for seeking drug or sucrose. Tests animals’ sensitivity to the addition of an aversive outcome for seeking positive reinforcement. |
| Avoidance Learning | Negative reinforcement. Animals learn to act to avoid aversive outcomes. Important model for determining how habits contribute to avoidance, e.g. in anxiety disorders. |
| Two-Step Task | Many different types of reinforcement or punishment may be used. The two-step task allows one to assess the parallel contributions of “model-free” vs “model-based” behavior to performance. |

What is the current state of circuit models for habit formation? Extensive work has implicated striatal learning systems in habit formation, and this work provides us with a set of brain regions on which to focus. Specifically, the dorsolateral striatum (DLS) is essential for supporting habit formation and motor skill acquisition (Yin and Knowlton, 2006; Yin et al., 2004, 2009). Lesions to the DLS prevent habit formation (Yin et al., 2004), as do lesions of the dopaminergic inputs to the DLS from the substantia nigra pars compacta (SNc; Faure et al., 2005). Pharmacological blockade of dopaminergic signaling in the DLS also impairs motor skill acquisition on the accelerating rotarod (Yin et al., 2009) and habitual cocaine- and heroin-seeking (Belin and Everitt, 2008; Hodebourg et al., 2019; Willuhn et al., 2012).

This DLS learning system works in parallel with other striatal learning systems centered around the dorsomedial striatum (DMS) and ventral striatum (nucleus accumbens, NAc) to regulate reward processing, incentive motivation, and action selection. Lesions to the DMS generally bias rodents away from goal-directed instrumental behavior and towards habit (Gremel and Costa, 2013; Yin et al., 2005a, 2005b), but the effects of DMS lesions differ depending on whether the anterior or posterior DMS is targeted. Anterior DMS (aDMS) lesions do not have major effects on habit formation, as measured by outcome devaluation or by contingency degradation. Posterior DMS (pDMS) lesions reduce instrumental performance and increase habit formation (Yin et al., 2005b). Lesions of the pDMS, but not aDMS, also bias rodents towards egocentric rather than allocentric navigation strategies in a T-maze task, a finding that is consistent with increased striatal-driven habit learning (Yin and Knowlton, 2004). Lesions to the NAc do not have major effects on measures of habit such as outcome devaluation and contingency degradation (de Borchgrave et al., 2002; Corbit et al., 2001). However, NAc core lesions impair instrumental performance and NAc medial shell lesions impair Pavlovian-instrumental transfer (Balleine and Killcross, 1994; Corbit et al., 2001).

Together, these lesion studies have crudely mapped the striatal subregions participating in different aspects of instrumental learning and habit formation, but a critical outstanding question in the field is to what degree these systems interact. Are the NAc, DMS, and DLS systems all engaged simultaneously in learning, and to what degree and at what level in the circuitry do they coordinate or compete to control behavioral output?

There is behavioral evidence for an interaction between striatal subregions in gating the transition to habit. Using NAc lesions paired with contralateral infusions of dopamine receptor antagonists in the DLS (to disconnect NAc activity from the control of DLS dopamine activity), Belin and Everitt demonstrated that crosstalk between the NAc and DLS is important for habitual cocaine seeking (Belin and Everitt, 2008). However, this study, while foundational, did not provide circuit-level insight as to the nature of the interaction taking place.

The Ascending Spiral Hypothesis

One prominent and influential hypothesis in the field regarding the interaction between striatal subsystems is the “ascending spiral” hypothesis (Haber et al., 2000; Yin and Knowlton, 2006). The Ascending Spiral Hypothesis posits that the NAc disinhibits DMS dopamine signaling, causing dopamine-dependent plasticity of corticostriatal connections in the DMS. In turn, DMS disinhibits dopamine signaling and dopamine-dependent corticostriatal plasticity in the DLS. The Ascending Spiral Hypothesis originally arose from anatomical data collected in monkeys. Haber et al. (2000) used combinations of anterograde and retrograde tracers injected into the striatum to demonstrate a plausible route of indirect information flow from more ventromedial to more dorsolateral regions of the striatum through the dopaminergic midbrain. Axons originating from the ventral striatum overlapped with cell bodies of dopamine neurons projecting to the central striatum, and axons originating from the central striatum overlapped with cell bodies of dopamine neurons projecting to the DLS. While intriguing, a major limitation of this study is that the authors could not determine whether synapses were actually made between the labeled axons and cell bodies in their preparations; the argument was made based purely on proximity of the labels rather than functional measurements. In fact, notably, the Ascending Spiral Hypothesis does not propose that direct connections are made between striatal axons and midbrain dopamine neurons. Since the striatum contains only GABAergic projection neurons, direct connections between the central striatum axons and DLS-projecting cell bodies, for example, would be inhibitory. Thus, it was proposed that there are disinhibitory connections, in which GABAergic striatal projection neurons would contact GABAergic cells in the nearby substantia nigra pars reticulata (SNr), which would then be the cells to contact the dopamine neurons projecting back to the DLS. 
Despite the appeal of this hypothesis for learning theories, the original data do not speak to the possibility of disynaptic disinhibition. In fact, the Ascending Spiral Hypothesis is potentially in conflict with the observation that DMS lesions (at least of the pDMS) accelerate the emergence of habitual control over behavior, the opposite effect of what might be expected in this framework (Gremel and Costa, 2013; Yin et al., 2005a, 2005b). Thus, it is imperative to test the Ascending Spiral Hypothesis more rigorously to determine its appropriate role in a circuit model for habit formation.

Disinhibitory inputs, which are central to the Ascending Spiral Hypothesis, are posited on the basis of separate knowledge of striatal inputs to midbrain SNr GABA neurons, and SNr GABA inputs to dopamine neurons. The direct pathway of the striatum sends GABAergic projections to SNr cells (Albin et al., 1989; DeLong, 1990), although the SNr is not uniformly inhibited by direct pathway stimulation in vivo (Freeze et al., 2013). In turn, dopamine neurons in the substantia nigra pars compacta (SNc) receive strong GABAergic inputs from the SNr (Tepper and Lee, 2007; Tepper et al., 1995). SNr GABA neurons have tonic, linear current-frequency relationships (Richards et al., 1997), meaning that a disinhibition circuit through these neurons would likely lead to corresponding graded changes in SNc dopamine neuron tonic firing rather than inducing bursts. Dopamine burst firing relevant to habit formation could be induced by concurrent excitatory inputs, whose efficacy might be strengthened by decreased inhibition from the SNr, but such a circuit then needs to be included explicitly in the Ascending Spiral Hypothesis model.
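The reasoning in the preceding paragraph can be illustrated with a toy firing-rate sketch. This is not a model taken from any of the cited studies: the function names, baseline rates, and synaptic weights below are all hypothetical values chosen only for illustration. The sketch assumes that SNr GABA neurons respond linearly to their net input and that SNc dopamine neuron tonic firing is reduced in proportion to SNr firing. Under those assumptions, equal steps in striatal direct pathway drive produce equal, graded steps in SNc tonic rate, with no burst-like jump.

```python
# Toy rate model of disynaptic disinhibition: striatum -| SNr -| SNc.
# All parameter values are hypothetical and chosen for illustration only.

def relu(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)

def snr_rate(striatal_drive, baseline=30.0, w_str_snr=1.0):
    """SNr GABA neuron rate (Hz): linear in net input, floored at zero."""
    return relu(baseline - w_str_snr * striatal_drive)

def snc_tonic_rate(snr, baseline=8.0, w_snr_snc=0.1):
    """SNc dopamine tonic rate (Hz): baseline drive minus SNr inhibition."""
    return relu(baseline - w_snr_snc * snr)

def disinhibition_curve(drives):
    """SNc tonic rate for each level of striatal direct pathway drive."""
    return [snc_tonic_rate(snr_rate(d)) for d in drives]

drives = [0.0, 5.0, 10.0, 15.0, 20.0]
rates = disinhibition_curve(drives)
for d, r in zip(drives, rates):
    print(f"striatal drive {d:5.1f} -> SNc tonic rate {r:4.1f} Hz")
```

In this sketch, generating bursts would require an additional term (for example a concurrent excitatory input), which is the point made above about what an explicit version of the Ascending Spiral Hypothesis would need to include.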

A careful study of the morphology of SNr GABA neurons showed that SNr axons extend into the SNc in a longitudinal band across the ventral tier. This band encompasses the location of SNc dopamine neurons projecting to the striatal subregion from which the traced SNr cell would receive inputs, but may also extend beyond those boundaries (Mailly et al., 2003). However, since this study was morphological and did not measure functional synaptic strengths, at present we still do not know if there is specific connectivity from DMS to SNr GABA neurons that project to DLS-projecting dopamine neurons, or the strength of that connectivity if it does exist. Additionally, it does not appear that all SNr GABA neurons make synapses onto dopamine neurons (although further work is needed to characterize different streams of SNr output; Rizzi and Tan, 2019). Disinhibition could potentially work only in closed reciprocal loops (e.g. DMS disinhibiting DMS-projecting dopamine neurons) or in a “descending spiral” (e.g. DLS disinhibiting DMS-projecting dopamine neurons) as well as in an ascending spiral.

A second oft-cited reference related to the Ascending Spiral Hypothesis is Ikemoto et al (2007), which was conducted in rats. In this study, the retrograde tracer Fluoro-Gold was injected into various striatal sites and the locations of labeled dopaminergic cell bodies in the midbrain were reported. Indeed, there was a clear organization of dopamine cell bodies found, and dopaminergic projections to the dorsal striatum were found to arise primarily from the SNc. However, no distinction was made in this study between DMS and DLS within the dorsal striatum, and no anterograde tracing (parallel to what Haber et al (2000) completed in monkeys) was done to examine the overlap of output-defined dopamine neuron cell bodies with inputs from distinct striatal subregions. Additionally, as was true in the Haber et al. (2000) study, no experiments (e.g. electrophysiological measurements) were carried out to verify functional synaptic connections within a striatonigrostriatal spiral, meaning there is still no direct evidence that such a circuit could mediate disinhibition during habit formation.

More direct evidence of disinhibitory control over dopamine neurons exists in the ventral tegmental area (VTA), which contains dopamine neurons projecting to the NAc. However, the patterns of disinhibitory control do not clearly follow predictions of the Ascending Spiral Hypothesis. NAc neurons projecting to the VTA preferentially target VTA GABA neurons, leading to a disinhibition of VTA dopamine signaling following optogenetic stimulation of NAc terminals in the VTA (Bocklisch et al., 2013). This disinhibition appears to operate in a closed reciprocal loop, rather than an open ascending spiral. Supporting this finding, in a study looking at specific striatal subregions it was found that NAc lateral shell neurons disinhibit dopamine neurons that project back to the lateral shell in a reciprocal loop through VTA GABA neurons (Yang et al., 2018). Reciprocal loop dopamine disinhibition may also be important for songbird vocal learning (Gale and Perkel, 2010).

It has been suggested that the ventral pallidum (VP) is well suited to mediate a disinhibitory ascending spiral connecting the NAc and dorsal striatum (Root et al., 2015). The VP is a source of inhibitory afferent control for SNc dopamine neurons that receives inputs from the NAc, making this suggestion plausible. Indeed, the VP is required for Pavlovian-instrumental transfer (Leung and Balleine, 2013) as would be predicted for such a circuit. However, the role of the VP in mediating transitions from goal-directed instrumental behavior to habit is not established. This transition may require a different mechanism, hypothesized by the Ascending Spiral Hypothesis to be a disinhibitory connection between the DMS and DLS.

Disynaptic disinhibition of dopamine neurons is not the only route by which striatal activity might influence dopamine release. It is also important to consider the role of direct striatal inputs to dopamine neurons, which constitute a major source of their afferent control. Monosynaptic rabies tracing experiments have provided a useful overview of the brain-wide inputs to midbrain dopamine neurons (Beier et al., 2015; Lerner et al., 2015; Menegas et al., 2015; Watabe-Uchida et al., 2012). These experiments confirmed that dopamine neurons receive direct inputs from striatum and demonstrated the relative numbers of inputs received in comparison with other brain areas. Notably, SNc dopamine neurons receive ~50% of their inputs from the dorsal striatum and an additional significant portion from the NAc (Lerner et al., 2015; Watabe-Uchida et al., 2012). Dopamine neurons in the VTA also receive inputs from both the dorsal and ventral striatum (Beier et al., 2015, 2019). Therefore, in both the SNc and VTA, there is potential for direct monosynaptic inhibition in addition to disynaptic disinhibition of dopamine neurons.

Direct inhibitory inputs are not a part of the Ascending Spiral Hypothesis as it is currently set forth, and in fact these inputs appear to follow an opposite pattern: DLS inputs to DMS-projecting dopamine neurons are common and strong, as measured by both rabies tracing and electrophysiology (Lerner et al., 2015). Similarly, rabies tracing experiments showed that the dorsal striatum sends large numbers of inputs to dopamine neurons that project to the NAc lateral shell (although this NAc lateral shell-projector population also sent inputs to DMS and DLS, complicating the interpretation; Beier et al., 2015).

Since both inhibition and disinhibition circuits may connect striatal activity to dopamine neuron activity, it is reasonable to ask which type of modulation dominates at behaviorally relevant time points. It is not clear if the inhibition and disinhibition circuits operate together (i.e. are active at the same times during behavior), especially as these circuits may arise from different striatal neuron populations. Striosomes (also known as patches) within the striatum project directly to dopamine neurons, whereas the matrix compartment of the striatum contains direct pathway projections to SNr GABA neurons. Striosome and matrix neurons receive different cortical inputs, which may drive their engagement in behavior separately (Friedman et al., 2015; Smith et al., 2016). In general, striosomes receive input from more “limbic” areas, as opposed to the associational and sensorimotor cortical inputs to DMS and DLS matrix neurons, respectively. In vivo imaging of striosome and matrix neurons shows some differences in activity patterns between compartments, with striosome neurons responding more strongly to reward-predicting cues than matrix neurons (Bloem et al., 2017). Thus, one can hypothesize that striatal inhibition of dopamine neurons dominates during cue presentation, especially after extensive training.
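The question of which modulation dominates can be made concrete with a minimal linear sketch. It is an assumption-laden illustration rather than a model from the cited work: the function name, the weights, and the drive values are hypothetical, and the sketch ignores rate floors, synaptic dynamics, and any excitatory inputs. Striosome drive is treated as direct monosynaptic inhibition of dopamine neurons, and matrix drive as disynaptic disinhibition through SNr GABA neurons.

```python
# Toy linear comparison of direct inhibition vs disynaptic disinhibition.
# All weights and drive values are hypothetical, for illustration only.

def net_dopamine_change(striosome_drive, matrix_drive,
                        w_direct=0.2, w_str_snr=1.0, w_snr_da=0.1):
    """Signed change in dopamine tonic rate relative to baseline.

    Negative values mean net inhibition, positive values net disinhibition.
    """
    # Striosome -> dopamine neuron: monosynaptic GABAergic inhibition.
    direct_inhibition = -w_direct * striosome_drive
    # Matrix -| SNr -| dopamine neuron: two inhibitory steps, net positive.
    disinhibition = w_snr_da * w_str_snr * matrix_drive
    return direct_inhibition + disinhibition

# Hypothetical cue period: striosomes respond more strongly than matrix.
cue_change = net_dopamine_change(striosome_drive=10.0, matrix_drive=5.0)
# Hypothetical period in which matrix activity dominates.
matrix_change = net_dopamine_change(striosome_drive=2.0, matrix_drive=10.0)

print(f"cue period:    {cue_change:+.2f} Hz")
print(f"matrix period: {matrix_change:+.2f} Hz")
```

With these illustrative numbers the sign of the net effect flips between the two periods, which is the sense in which the relative timing and strength of striosome and matrix engagement would need to be measured before either circuit can be said to dominate.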

Striosomes are likely an important part of the habit formation circuit. Partial ablation of striosome neurons with a selective toxin called dermorphin-saporin causes deficits in learning on the rotarod (Lawhorn et al., 2009) as well as in habit formation in a more traditional operant test (Jenrette et al., 2019). One possible mechanism for these effects on learning could be a resulting imbalance in the regulation of striatal dopamine release (Shumilov et al., 2018). Imbalances between activity in the striosome and matrix compartments have been proposed to contribute to the development of neurological and psychiatric disorders including Huntington’s disease, L-DOPA-induced dyskinesias, dystonia, and drug addiction (Crittenden and Graybiel, 2011). Understanding these imbalances and the circuit mechanisms by which they might contribute to symptomatology will be a key to generating new clinical interventions.

In conclusion, while the Ascending Spiral Hypothesis has been influential in the habit formation field, convincing circuit- and synaptic-level evidence of disinhibition has not been demonstrated, leaving the door open to other possibilities. Although striatonigrostriatal loops might mediate NAc to DMS to DLS information transfer, we should not focus on them to the exclusion of other possibilities. Other possible circuits that could promote communication between the DMS and DLS include corticostriatothalamic loops, lateral connections made between striatal subregions (including through interneurons), and basal ganglia loops downstream of the striatum (e.g. through the globus pallidus externa, which sends projections back to the striatum). The Ascending Spiral might also work in parallel with systems that dampen rather than promote habit. Silencing of the DLS, particularly direct pathway striatal neurons in the DLS, promotes early goal-directed instrumental learning and PFC-DMS circuit engagement (Bergstrom et al., 2018). Thus, inputs from the DLS onto midbrain DMS dopamine circuits may serve to slow the acquisition of habits through a “descending spiral.”

Shifting Patterns of DMS and DLS Involvement in Behavior with Habit Formation

In vivo electrophysiological recordings show that patterns of activity in the DMS and DLS change with habit formation and motor skill acquisition, but different tasks can produce different results, calling into question whether the same circuits and plasticity mechanisms are engaged by each (Gremel and Costa, 2013; Thorn et al., 2010; Vandaele et al., 2019; Yin et al., 2009). Thorn et al. (2010) used the T-maze task (described above) paired with tetrode recordings in the DMS and DLS. Similar percentages of task-responsive neurons were found in each striatal subregion; however, the patterns of activity differed across training. Responses in the DLS tended to occur at action boundaries of the task (locomotion onset, turn, goal). Goal responses in particular seemed to emerge and strengthen with overtraining, after rats reached a performance criterion, perhaps reflecting an emerging reward responsiveness. In contrast, DMS neurons responded most strongly in the middle of the task as the rats progressed down the long arm of the T-maze track. Strong responses to the cue onset (the signal telling rats which direction to turn) occurred mid-training, but faded with overtraining. These results seem in line with the idea that DMS is most actively engaged in action-outcome learning during an earlier phase of task experience, whereas DLS becomes engaged in creating habits later on.

Gremel and Costa (2013) recorded DMS and DLS neurons using a more traditional operant task. They trained mice to pursue rewards on an RI schedule (promoting habitual responding) in one context and an RR schedule (promoting goal-directed responding) in another context. This clever study design allowed them to assess within-subject differences depending on the training context. Similar to Thorn et al. (2010), this group found roughly equal percentages of task-responsive neurons in DMS and DLS. While some neurons responded specifically in one context, many were modulated in both the RI and RR contexts. The observation of task-responsive neurons in both DMS and DLS in both contexts questions the notion of a hard distinction between the two systems as habitual control emerges. When looking at the magnitude of the changes observed in DMS and DLS, however, some differences were observed in this study. After training, DMS neurons had a larger increase in their lever-press associated firing in the RR context when the reward had been devalued. DLS neurons had a smaller increase in their lever-press associated firing in the RR context when the reward was valued. In contrast to Thorn et al. (2010), Gremel and Costa did not find disengagement of DMS task-responsive neurons over training in the habitual context, nor did they find any changes in DLS task-responsiveness with habit (RI context).

Another study by Vandaele et al. (2019) used the DT5 task (Table 2, described above). Like Gremel and Costa (2013), this group observed that both the DMS and DLS remained substantially task responsive late into training, in this case many weeks after habits (as assessed by satiety-specific devaluation) had formed. The continued engagement of DMS in habitual behavior questions the notion that behavioral control completely shifts to DLS circuits with overtraining. DLS may still be required for the initial transition to habit, but the consolidation of habit memory may take place elsewhere. Indeed, in this study pharmacological inactivation of DLS late in training had modest effects on behavior, slightly decreasing lever press rates, but overall did not prevent performance of the task.

Finally, Yin et al. (2009) used the accelerating rotarod for their study examining the participation of DMS and DLS neurons in motor skill acquisition. They found a pattern of early DMS engagement in the task and later DLS engagement as the task was mastered and performance plateaued. In this case, the findings appear more similar to Thorn et al. (2010), with DMS disengaging later.

These four studies clearly drive home the point that DMS and DLS engagement in behavior may be highly task dependent. They leave future researchers with the difficult job of parsing which responses are truly required for habit formation in general, and which are task-specific. The reasons that certain tasks maintain DMS engagement while others cause it to diminish will be a particularly interesting avenue for future work. Such investigations will be important for clarifying whether there are multiple neural circuit implementations of habit available to an animal. Better connecting the emergence of habitual behavior with each of these recordings will also be key. Since behavioral probes for habit (Table 1) can only be done at discrete time points, it can be difficult to assess exactly when an individual animal is transitioning to habitual control, limiting the power of analyses. In the Vandaele et al. (2019) study, habit occurred early in training (after 10 sessions). Thus, habit per se could not be correlated with the late changes observed in DMS after many weeks. In contrast, the T-maze task used by Thorn et al. (2010) is more complicated to train. Training to criterion and then further overtraining until rats are insensitive to outcome devaluation generally takes much longer than when using the DT5 task (Smith et al., 2012). Whether these differences in training time or other aspects of the tasks are important for determining how DMS and DLS are engaged remains to be seen.

Notably, all of these studies which compared the in vivo activities of DMS and DLS neurons used relatively anterior recording coordinates. It remains unclear how the activities of posterior striatal regions are correlated with the emergence of habitual behavior, and this is a potentially important question. Lesions and inactivations of aDMS and pDMS differ in their effects, with pDMS lesions being more effective at promoting the early emergence of habitual control (Yin et al., 2005b). However, since most recordings are done in the aDMS it is difficult to know how to align the two literatures. Additionally, the posterior DLS (pDLS), including the far caudal tail of the striatum, is an understudied area, rich in cells projecting directly to substantia nigra dopamine neurons (Lerner et al., 2015; Menegas et al., 2015). Dopamine cells projecting to the caudal tail of the striatum also have unique input connectivity patterns (Menegas et al., 2015). Thus, it will be illuminating for future studies to determine the functions of circuits involving the pDLS and caudal tail of the striatum in the emergence of habitual behavior and to incorporate these striatal subregions into a refined circuit model of habit formation.

Plasticity of Habit Circuits with Learning

To assess the validity of any circuit model of habit formation that is developed, we must determine what types of plasticity take place during training to mechanistically cause the observed in vivo shifts in striatal function over time. A growing body of evidence points to the involvement of corticostriatal plasticity mechanisms in habit formation. Long-term synaptic plasticity at cortical inputs onto DMS and DLS neurons depends critically on dopamine, and blocking dopamine signaling during learning impairs habit or motor skill acquisition (Faure et al., 2005; Yin et al., 2009). Additionally, inhibition of adenosine A2A receptors or endocannabinoid CB1 receptors, both of which are known to be important actors in corticostriatal plasticity pathways (Lerner and Kreitzer, 2011, 2012; Lerner et al., 2010; Shen et al., 2008; Surmeier et al., 2009), interferes with habit formation (Gremel et al., 2016; Hilário et al., 2007; Li et al., 2016; Yu et al., 2009).

Using an accelerating rotarod task, Yin et al. (2009) showed that AMPA:NMDA ratios at excitatory inputs onto DLS neurons are decreased specifically in the later stages of learning, after performance has plateaued. Additionally, LTD in the DLS was more readily observed in slices made from mice trained to the late stages of learning, suggesting that LTP had occurred in vivo. Another study using the rotarod task found that the engagement of cortical inputs to the DMS and DLS changes dynamically during learning (Kupferschmidt et al., 2017). PFC inputs to the DMS peak in activity early in learning and disengage later, while M1 motor cortical inputs to the DLS remain strong. However, a limitation in both of these studies is that differential changes onto direct vs indirect pathway striatal neurons were not examined.

Corticostriatal plasticity may act disparately on the direct and indirect pathways within the striatum over the course of goal-directed and habitual learning. In the pDMS, AMPA:NMDA ratios increase onto direct pathway neurons but decrease onto indirect pathway neurons after training to a goal-directed stage of behavior (Shan et al., 2014). No changes in AMPA:NMDA ratios were observed in the DLS at this early stage in learning. After longer training on a RI60 schedule to induce habitual control, indirect pathway neurons in the DLS showed a reduced amplitude of spontaneous EPSCs (sEPSCs), suggesting that LTD onto these neurons had occurred (Shan et al., 2015). The average amplitude of recorded sEPSCs from each mouse negatively correlated with its press rate in the last RI60 training session, suggesting that this reduction in sEPSC amplitude is specifically involved in the escalation of responding behavior associated with habit.

In addition to plasticity in the strengths of corticostriatal coupling to the direct and indirect pathways, shifts in timing may play a role in habit formation. In a study using acute brain slices containing the DLS, O’Hare et al. found that changes in the relative timing of direct vs indirect pathway activity in response to cortical stimulation correlated with habitual behavior: direct pathway striatal neurons fired before indirect pathway striatal neurons in habitual mice, whereas the inverse was true in goal-directed mice (O’Hare et al., 2016).

Corticostriatal plasticity is further implicated in habit-related behaviors because mice prone to developing OCD-like repetitive overgrooming behaviors all have corticostriatal synaptic deficits in common (Peça et al., 2011; Shmelkov et al., 2010; Welch et al., 2007), and overgrooming can be induced by repetitive corticostriatal stimulation (Ahmari et al., 2013) perhaps by engaging endocannabinoid-dependent long-term depression mechanisms important for the development of habitual responding (Gremel et al., 2016).

The fact that corticostriatal plasticity is gated not only by dopamine, but by a host of other neuromodulators suggests that dopaminergic circuits like those invoked by the Ascending Spiral Hypothesis may not be the only mechanism by which transitions to habit are influenced. Alterations to circuits that gate the release of neuromodulators like adenosine, acetylcholine, and endogenous opioids in the striatum could also contribute to habit formation under different circumstances and in different disorders.

Sites of plasticity other than corticostriatal synapses may additionally play a role in shaping the function of the striatal circuitry regulating habit. Plasticity in cortical circuits upstream of the corticostriatal projections is one example. As another example, if dopamine inputs to the DMS and DLS are regulated by an ascending spiral from other striatal regions as proposed, then plasticity of inputs onto SNr GABA and/or SNc dopamine neurons might regulate habit formation through the spiral. Some inputs to SNc dopamine neurons are altered by exposure to drugs of abuse such as cocaine (Beaudoin et al., 2018), which could provide a basis for understanding how these drugs engage habit circuits. Overall, there are likely many distinct sites of plasticity engaged during habit formation. Plasticity events at these different sites might act together and be interdependent. Understanding which synaptic changes occur at which points in training could help answer the question of why certain reinforcement schedules lead to the emergence of habitual control on different timescales.

Conclusion

As a field, we have developed an array of behavioral tasks to study habit. What is now required is to better formalize our definitions of habit, thinking broadly across behavioral fields to integrate studies of instrumental responding, motor skill learning, repetitive behaviors, compulsive behaviors, and avoidance learning. Furthermore, a circuit model for habit, encompassing specific descriptions of circuits and synaptic changes that mediate changes in network activity occurring with habit formation, will provide a foundation to compare mechanisms across tasks. This review has focused on striatal mechanisms, but in fact many additional brain circuits may play a role as well and should be incorporated into our theories. Ultimately, a convincing circuit model for habit is indispensable for understanding the complex relationships between habit and habit-related behavioral tasks, and is required to make substantive progress on addressing the question of whether dysfunctions in habit circuits indeed contribute to the symptoms observed in various neuropsychiatric disorders such as OCD, autism, and addiction.

Significance:

This article gives an overview of how habits have been conceptualized and studied behaviorally. It also reviews findings and hypotheses about the neural implementation of habits, with an emphasis on striatal circuits. The aim is to integrate discussions of behavioral and circuit-level approaches to the study of habit, and to motivate new research directions at the interface between these levels of investigation.

Acknowledgments:

This work was funded by the NIH (R00MH109569 and DP2MH122401) and by a NARSAD Young Investigator Award from the Brain & Behavior Research Foundation.

Footnotes

Conflict of Interest Statement:

The author has no conflicts of interest to declare.

References

  1. Ahmari SE, Spellman T, Douglass NL, Kheirbek MA, Simpson HB, Deisseroth K, Gordon JA, and Hen R (2013). Repeated Cortico-Striatal Stimulation Generates Persistent OCD-Like Behavior. Science 340, 1234–1239.
  2. Albin RL, Young AB, and Penney JB (1989). The functional anatomy of basal ganglia disorders. Trends Neurosci. 12, 366–375.
  3. Alvares GA, Balleine BW, Whittle L, and Guastella AJ (2016). Reduced goal-directed action control in autism spectrum disorder. Autism Res. Off. J. Int. Soc. Autism Res. 9, 1285–1293.
  4. Balleine B, and Killcross S (1994). Effects of ibotenic acid lesions of the nucleus accumbens on instrumental action. Behav. Brain Res. 65, 181–193.
  5. Bannard C, Leriche M, Bandmann O, Brown CH, Ferracane E, Sánchez-Ferro Á, Obeso J, Redgrave P, and Stafford T (2019). Reduced habit-driven errors in Parkinson’s Disease. Sci. Rep 9, 3423.
  6. Beaudoin GMJ, Gomez JA, Perkins J, Bland JL, Petko AK, and Paladini CA (2018). Cocaine Selectively Reorganizes Excitatory Inputs to Substantia Nigra Pars Compacta Dopamine Neurons. J. Neurosci. Off. J. Soc. Neurosci 38, 1151–1159.
  7. Beier KT, Steinberg EE, DeLoach KE, Xie S, Miyamichi K, Schwarz L, Gao XJ, Kremer EJ, Malenka RC, and Luo L (2015). Circuit Architecture of VTA Dopamine Neurons Revealed by Systematic Input-Output Mapping. Cell 162, 622–634.
  8. Beier KT, Gao XJ, Xie S, DeLoach KE, Malenka RC, and Luo L (2019). Topological Organization of Ventral Tegmental Area Connectivity Revealed by Viral-Genetic Dissection of Input-Output Relations. Cell Rep. 26, 159–167.e6.
  9. Belin D, and Everitt BJ (2008). Cocaine seeking habits depend upon dopamine-dependent serial connectivity linking the ventral with the dorsal striatum. Neuron 57, 432–441.
  10. Belin D, Mar AC, Dalley JW, Robbins TW, and Everitt BJ (2008). High Impulsivity Predicts the Switch to Compulsive Cocaine-Taking. Science 320, 1352–1355.
  11. Bergstrom HC, Lipkin AM, Lieberman AG, Pinard CR, Gunduz-Cinar O, Brockway ET, Taylor WW, Nonaka M, Bukalo O, Wills TA, et al. (2018). Dorsolateral Striatum Engagement Interferes with Early Discrimination Learning. Cell Rep. 23, 2264–2272.
  12. Bloem B, Huda R, Sur M, and Graybiel AM (2017). Two-photon imaging in mice shows striosomes and matrix have overlapping but differential reinforcement-related responses. ELife 6.
  13. Bocklisch C, Pascoli V, Wong JCY, House DRC, Yvon C, de Roo M, Tan KR, and Lüscher C (2013). Cocaine Disinhibits Dopamine Neurons by Potentiation of GABA Transmission in the Ventral Tegmental Area. Science 341, 1521–1525.
  14. de Borchgrave R, Rawlins JNP, Dickinson A, and Balleine BW (2002). Effects of cytotoxic nucleus accumbens lesions on instrumental conditioning in rats. Exp. Brain Res. 144, 50–68.
  15. Brainard MS, and Doupe AJ (2000). Interruption of a basal ganglia-forebrain circuit prevents plasticity of learned vocalizations. Nature 404, 762–766.
  16. Burguière E, Monteiro P, Feng G, and Graybiel AM (2013). Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science 340, 1243–1246.
  17. Charlesworth JD, Turner EC, Warren TL, and Brainard MS (2011). Learning the microstructure of successful behavior. Nat. Neurosci 14, 373–380.
  18. Chen BT, Yau H-J, Hatch C, Kusumoto-Yoshida I, Cho SL, Hopf FW, and Bonci A (2013). Rescuing cocaine-induced prefrontal cortex hypoactivity prevents compulsive cocaine seeking. Nature 496, 359–362.
  19. Corbit LH, Muir JL, and Balleine BW (2001). The Role of the Nucleus Accumbens in Instrumental Conditioning: Evidence of a Functional Dissociation between Accumbens Core and Shell. J. Neurosci 21, 3251–3260.
  20. Crittenden JR, and Graybiel AM (2011). Basal Ganglia disorders associated with imbalances in the striatal striosome and matrix compartments. Front. Neuroanat 5, 59.
  21. Datta U, Martini M, Fan M, and Sun W (2018). Compulsive sucrose- and cocaine-seeking behaviors in male and female Wistar rats. Psychopharmacology (Berl.) 235, 2395–2405.
  22. Daw ND, Niv Y, and Dayan P (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci 8, 1704–1711.
  23. Daw ND, Gershman SJ, Seymour B, Dayan P, and Dolan RJ (2011). Model-Based Influences on Humans’ Choices and Striatal Prediction Errors. Neuron 69, 1204–1215.
  24. DeLong MR (1990). Primate models of movement disorders of basal ganglia origin. Trends Neurosci. 13, 281–285.
  25. Deroche-Gamonet V, Belin D, and Piazza PV (2004). Evidence for Addiction-like Behavior in the Rat. Science 305, 1014–1017.
  26. DeRusso A, Fan D, Gupta J, Shelest O, Costa RM, and Yin HH (2010). Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front. Integr. Neurosci 4.
  27. Dezfouli A, and Balleine BW (2013). Actions, Action Sequences and Habits: Evidence That Goal-Directed and Habitual Action Control Are Hierarchically Organized. PLOS Comput. Biol 9, e1003364.
  28. Dhawale AK, Smith MA, and Ölveczky BP (2017). The Role of Variability in Motor Learning. Annu. Rev. Neurosci 40, 479–498.
  29. Dhawale AK, Wolff SBE, Ko R, and Ölveczky BP (2019). The basal ganglia can control learned motor sequences independently of motor cortex. BioRxiv 827261.
  30. Dickinson A (1985). Actions and Habits: The Development of Behavioural Autonomy. Philos. Trans. R. Soc. B Biol. Sci 308, 67–78.
  31. Dickinson A, and Balleine B (1994). Motivational control of goal-directed action. Anim. Learn. Behav. 22, 1–18.
  32. Dickinson A, and Charnock DJ (1985). Contingency effects with maintained instrumental reinforcement. Q. J. Exp. Psychol. Sect. B 37, 397–416.
  33. Dickinson A, Nicholas DJ, and Adams CD (1983). The Effect of the Instrumental Training Contingency on Susceptibility to Reinforcer Devaluation. Q. J. Exp. Psychol. Sect. B 35, 35–51.
  34. Doupe AJ, Perkel DJ, Reiner A, and Stern EA (2005). Birdbrains could teach basal ganglia research a new song. Trends Neurosci. 28, 353–363.
  35. Epstein DH, and Kowalczyk WJ (2018). Compulsive Seekers: Our take. Two Clinicians’ Perspective on a New Animal Model of Addiction. Neuropsychopharmacol. Off. Publ. Am. Coll. Neuropsychopharmacol. 43, 677–679.
  36. Everitt BJ, and Robbins TW (2016). Drug Addiction: Updating Actions to Habits to Compulsions Ten Years On. Annu. Rev. Psychol 67, 23–50.
  37. Faure A, Haberland U, Condé F, and El Massioui N (2005). Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J. Neurosci. Off. J. Soc. Neurosci 25, 2771–2780.
  38. Freeze BS, Kravitz AV, Hammack N, Berke JD, and Kreitzer AC (2013). Control of Basal Ganglia Output by Direct and Indirect Pathway Projection Neurons. J. Neurosci 33, 18531–18539.
  39. Friedman A, Homma D, Gibb LG, Amemori K-I, Rubin SJ, Hood AS, Riad MH, and Graybiel AM (2015). A Corticostriatal Path Targeting Striosomes Controls Decision-Making under Conflict. Cell 161, 1320–1333.
  40. Gadagkar V, Puzerey PA, Chen R, Baird-Daniel E, Farhang AR, and Goldberg JH (2016). Dopamine neurons encode performance error in singing birds. Science 354, 1278–1282.
  41. Gale SD, and Perkel DJ (2010). A Basal Ganglia Pathway Drives Selective Auditory Responses in Songbird Dopaminergic Neurons via Disinhibition. J. Neurosci 30, 1027–1037.
  42. George O, and Koob GF (2017). Individual differences in the neuropsychopathology of addiction. Dialogues Clin. Neurosci. 19, 217–229.
  43. Gillan CM, Morein-Zamir S, Urcelay GP, Sule A, Voon V, Apergis-Schoute AM, Fineberg NA, Sahakian BJ, and Robbins TW (2014). Enhanced avoidance habits in obsessive-compulsive disorder. Biol. Psychiatry 75, 631–638.
  44. Gillan CM, Apergis-Schoute AM, Morein-Zamir S, Urcelay GP, Sule A, Fineberg NA, Sahakian BJ, and Robbins TW (2015). Functional neuroimaging of avoidance habits in obsessive-compulsive disorder. Am. J. Psychiatry 172, 284–293.
  45. Giuliano C, Belin D, and Everitt BJ (2019). Compulsive Alcohol Seeking Results from a Failure to Disengage Dorsolateral Striatal Control over Behavior. J. Neurosci. Off. J. Soc. Neurosci 39, 1744–1754.
  46. Gourley SL, Olevska A, Gordon J, and Taylor JR (2013). Cytoskeletal Determinants of Stimulus-Response Habits. J. Neurosci 33, 11811–11816.
  47. Gremel CM, and Costa RM (2013). Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat. Commun 4, 2264.
  48. Gremel CM, Chancey JH, Atwood BK, Luo G, Neve R, Ramakrishnan C, Deisseroth K, Lovinger DM, and Costa RM (2016). Endocannabinoid Modulation of Orbitostriatal Circuits Gates Habit Formation. Neuron 90, 1312–1324.
  49. Haber SN, Fudge JL, and McFarland NR (2000). Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J. Neurosci. Off. J. Soc. Neurosci 20, 2369–2382.
  50. Hilário MRF, Clouse E, Yin HH, and Costa RM (2007). Endocannabinoid signaling is critical for habit formation. Front. Integr. Neurosci 1, 6.
  51. Hodebourg R, Murray JE, Fouyssac M, Puaud M, Everitt BJ, and Belin D (2019). Heroin seeking becomes dependent on dorsal striatal dopaminergic mechanisms and can be decreased by N-acetylcysteine. Eur. J. Neurosci 50, 2036–2044.
  52. Howe MW, and Dombeck DA (2016). Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature 535, 505–510.
  53. Jenrette TA, Logue JB, and Horner KA (2019). Lesions of the Patch Compartment of Dorsolateral Striatum Disrupt Stimulus-Response Learning. Neuroscience 415, 161–172.
  54. Jin X, and Costa RM (2010). Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature 466, 457–462.
  55. Jin X, Tecuapetla F, and Costa RM (2014). Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat. Neurosci 17, 423–430.
  56. Kawai R, Markman T, Poddar R, Ko R, Fantana AL, Dhawale AK, Kampff AR, and Ölveczky BP (2015). Motor cortex is required for learning but not for executing a motor skill. Neuron 86, 800–812.
  57. Kawashima S, Ueki Y, Kato T, Ito K, and Matsukawa N (2018). Reduced striatal dopamine release during motor skill acquisition in Parkinson’s disease. PloS One 13, e0196661.
  58. Knowlton BJ, Mangels JA, and Squire LR (1996). A neostriatal habit learning system in humans. Science 273, 1399–1402.
  59. Kubota Y, Liu J, Hu D, DeCoteau WE, Eden UT, Smith AC, and Graybiel AM (2009). Stable encoding of task structure coexists with flexible coding of task events in sensorimotor striatum. J. Neurophysiol 102, 2142–2160.
  60. Kupferschmidt DA, Juczewski K, Cui G, Johnson KA, and Lovinger DM (2017). Parallel, but Dissociable, Processing in Discrete Corticostriatal Inputs Encodes Skill Learning. Neuron 96, 476–489.e5.
  61. Lammel S, Lim BK, Ran C, Huang KW, Betley MJ, Tye KM, Deisseroth K, and Malenka RC (2012). Input-specific control of reward and aversion in the ventral tegmental area. Nature 491, 212–217.
  62. Lawhorn C, Smith DM, and Brown LL (2009). Partial ablation of mu-opioid receptor rich striosomes produces deficits on a motor-skill learning task. Neuroscience 163, 109–119.
  63. LeDoux JE, Moscarello J, Sears R, and Campese V (2017). The birth, death and resurrection of avoidance: a reconceptualization of a troubled paradigm. Mol. Psychiatry 22, 24–36.
  64. Lee SW, Shimojo S, and O’Doherty JP (2014). Neural computations underlying arbitration between model-based and model-free learning. Neuron 81, 687–699.
  65. Lerner TN, and Kreitzer AC (2011). Neuromodulatory control of striatal plasticity and behavior. Curr. Opin. Neurobiol 21, 322–327.
  66. Lerner TN, and Kreitzer AC (2012). RGS4 is required for dopaminergic control of striatal LTD and susceptibility to parkinsonian motor deficits. Neuron 73, 347–359.
  67. Lerner TN, Horne EA, Stella N, and Kreitzer AC (2010). Endocannabinoid signaling mediates psychomotor activation by adenosine A2A antagonists. J. Neurosci. Off. J. Soc. Neurosci 30, 2160–2164.
  68. Lerner TN, Shilyansky C, Davidson TJ, Evans KE, Beier KT, Zalocusky KA, Crow AK, Malenka RC, Luo L, Tomer R, et al. (2015). Intact-Brain Analyses Reveal Distinct Information Carried by SNc Dopamine Subcircuits. Cell 162, 635–647.
  69. Leung BK, and Balleine BW (2013). The Ventral Striato-Pallidal Pathway Mediates the Effect of Predictive Learning on Choice between Goal-Directed Actions. J. Neurosci 33, 13848–13860.
  70. Li Y, He Y, Chen M, Pu Z, Chen L, Li P, Li B, Li H, Huang Z-L, Li Z, et al. (2016). Optogenetic Activation of Adenosine A2A Receptor Signaling in the Dorsomedial Striatopallidal Neurons Suppresses Goal-Directed Behavior. Neuropsychopharmacol. Off. Publ. Am. Coll. Neuropsychopharmacol 41, 1003–1013.
  71. Mailly P, Charpier S, Menetrey A, and Deniau J-M (2003). Three-Dimensional Organization of the Recurrent Axon Collateral Network of the Substantia Nigra Pars Reticulata Neurons in the Rat. J. Neurosci 23, 5247–5257.
  72. Matsumoto M, and Hikosaka O (2009). Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841.
  73. Menegas W, Bergan JF, Ogawa SK, Isogai Y, Umadevi Venkataraju K, Osten P, Uchida N, and Watabe-Uchida M (2015). Dopamine neurons projecting to the posterior striatum form an anatomically distinct subclass. ELife 4, e10032.
  74. Menegas W, Akiti K, Amo R, Uchida N, and Watabe-Uchida M (2018). Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli. Nat. Neurosci 21, 1421–1430.
  75. Miller KJ, Shenhav A, and Ludvig EA (2019). Habits without values. Psychol. Rev 126, 292–311.
  76. Murray JE, Dilleen R, Pelloux Y, Economidou D, Dalley JW, Belin D, and Everitt BJ (2014). Increased impulsivity retards the transition to dorsolateral striatal dopamine control of cocaine seeking. Biol. Psychiatry 76, 15–22.
  77. Nieh EH, Matthews GA, Allsop SA, Presbrey KN, Leppla CA, Wichmann R, Neve R, Wildes CP, and Tye KM (2015). Decoding Neural Circuits that Control Compulsive Sucrose Seeking. Cell 160, 528–541.
  78. O’Hare JK, Ade KK, Sukharnikova T, Van Hooser SD, Palmeri ML, Yin HH, and Calakos N (2016). Pathway-Specific Striatal Substrates for Habitual Behavior. Neuron 89, 472–479.
  79. Packard MG, and Knowlton BJ (2002). Learning and memory functions of the Basal Ganglia. Annu. Rev. Neurosci 25, 563–593.
  80. Patterson TK, Craske MG, and Knowlton BJ (2019). Enhanced Avoidance Habits in Relation to History of Early-Life Stress. Front. Psychol 10.
  81. Peça J, Feliciano C, Ting JT, Wang W, Wells MF, Venkatraman TN, Lascola CD, Fu Z, and Feng G (2011). Shank3 mutant mice display autistic-like behaviours and striatal dysfunction. Nature 472, 437–442.
  82. Pelloux Y, Everitt BJ, and Dickinson A (2007). Compulsive drug seeking by rats under punishment: effects of drug taking history. Psychopharmacology (Berl.) 194, 127–137.
  83. Richards CD, Shiroyama T, and Kitai ST (1997). Electrophysiological and immunocytochemical characterization of GABA and dopamine neurons in the substantia nigra of the rat. Neuroscience 80, 545–557.
  84. Rizzi G, and Tan KR (2019). Synergistic Nigral Output Pathways Shape Movement. Cell Rep. 27, 2184–2198.e4.
  85. Root DH, Melendez RI, Zaborszky L, and Napier TC (2015). The ventral pallidum: Subregion-specific functional anatomy and roles in motivated behaviors. Prog. Neurobiol 130, 29–70.
  86. Rossi MA, and Yin HH (2012). Methods for studying habitual behavior in mice. Curr. Protoc. Neurosci Chapter 8, Unit 8.29.
  87. Rothwell PE, Fuccillo MV, Maxeiner S, Hayton SJ, Gokce O, Lim BK, Fowler SC, Malenka RC, and Südhof TC (2014). Autism-associated neuroligin-3 mutations commonly impair striatal circuits to boost repetitive behaviors. Cell 158, 198–212.
  88. Shan Q, Ge M, Christie MJ, and Balleine BW (2014). The Acquisition of Goal-Directed Actions Generates Opposing Plasticity in Direct and Indirect Pathways in Dorsomedial Striatum. J. Neurosci 34, 9196–9201.
  89. Shan Q, Christie MJ, and Balleine BW (2015). Plasticity in striatopallidal projection neurons mediates the acquisition of habitual actions. Eur. J. Neurosci 42, 2097–2104.
  90. Shen W, Flajolet M, Greengard P, and Surmeier DJ (2008). Dichotomous dopaminergic control of striatal synaptic plasticity. Science 321, 848–851.
  91. Shmelkov SV, Hormigo A, Jing D, Proenca CC, Bath KG, Milde T, Shmelkov E, Kushner JS, Baljevic M, Dincheva I, et al. (2010). Slitrk5 deficiency impairs corticostriatal circuitry and leads to obsessive-compulsive-like behaviors in mice. Nat. Med 16, 598–602, 1p following 602.
  92. Shumilov K, Real MÁ, Valderrama-Carvajal A, and Rivera A (2018). Selective ablation of striatal striosomes produces the deregulation of dopamine nigrostriatal pathway. PLOS ONE 13, e0203135.
  93. da Silva JA, Tecuapetla F, Paixão V, and Costa RM (2018). Dopamine neuron activity before action initiation gates and invigorates future movements. Nature 554, 244–248.
  94. Singer BF, Fadanelli M, Kawa AB, and Robinson TE (2018). Are Cocaine-Seeking “Habits” Necessary for the Development of Addiction-Like Behavior in Rats? J. Neurosci. Off. J. Soc. Neurosci 38, 60–73.
  95. Smith KS, and Graybiel AM (2016). Habit formation coincides with shifts in reinforcement representations in the sensorimotor striatum. J. Neurophysiol 115, 1487–1498.
  96. Smith JB, Klug JR, Ross DL, Howard CD, Hollon NG, Ko VI, Hoffman H, Callaway EM, Gerfen CR, and Jin X (2016). Genetic-based dissection unveils the inputs and outputs of striatal patch and matrix compartments. Neuron 91, 1069–1084.
  97. Smith KS, Virkud A, Deisseroth K, and Graybiel AM (2012). Reversible online control of habitual behavior by optogenetic perturbation of medial prefrontal cortex. Proc. Natl. Acad. Sci. U. S. A 109, 18932–18937.
  98. Stelly CE, Haug GC, Fonzi KM, Garcia MA, Tritley SC, Magnon AP, Ramos MAP, and Wanat MJ (2019). Pattern of dopamine signaling during aversive events predicts active avoidance learning. Proc. Natl. Acad. Sci. U. S. A 116, 13641–13650.
  99. Surmeier DJ, Plotkin J, and Shen W (2009). Dopamine and synaptic plasticity in dorsal striatal circuits controlling action selection. Curr. Opin. Neurobiol 19, 621–628.
  100. Tepper JM, and Lee CR (2007). GABAergic control of substantia nigra dopaminergic neurons. Prog. Brain Res. 160, 189–208.
  101. Tepper JM, Martin LP, and Anderson DR (1995). GABAA receptor-mediated inhibition of rat substantia nigra dopaminergic neurons by pars reticulata projection neurons. J. Neurosci. Off. J. Soc. Neurosci 15, 3092–3103.
  102. Thorn CA, Atallah H, Howe M, and Graybiel AM (2010). Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66, 781–795.
  103. Vandaele Y, and Janak PH (2018). Defining the place of habit in substance use disorders. Prog. Neuropsychopharmacol. Biol. Psychiatry 87, 22–32.
  104. Vandaele Y, Pribut HJ, and Janak PH (2017). Lever Insertion as a Salient Stimulus Promoting Insensitivity to Outcome Devaluation. Front. Integr. Neurosci 11.
  105. Vandaele Y, Mahajan NR, Ottenheimer DJ, Richard JM, Mysore SP, and Janak PH (2019). Distinct recruitment of dorsomedial and dorsolateral striatum erodes with extended training. ELife 8.
  106. Vanderschuren LJMJ, and Everitt BJ (2004). Drug Seeking Becomes Compulsive After Prolonged Cocaine Self-Administration. Science 305, 1017–1019.
  107. Vanderschuren LJ, Minnaard AM, Smeets JA, and Lesscher HM (2017). Punishment models of addictive behavior. Curr. Opin. Behav. Sci 13, 77–84.
  108. Vanderschuren LJMJ, Ciano PD, and Everitt BJ (2005). Involvement of the Dorsal Striatum in Cue-Controlled Cocaine Seeking. J. Neurosci. 25, 8665–8670.
  109. Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, and Uchida N (2012). Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron 74, 858–873.
  110. Welch JM, Lu J, Rodriguiz RM, Trotta NC, Peca J, Ding J-D, Feliciano C, Chen M, Adams JP, Luo J, et al. (2007). Cortico-striatal synaptic defects and OCD-like behaviours in Sapap3-mutant mice. Nature 448, 894–900.
  111. Willuhn I, Burgeno LM, Everitt BJ, and Phillips PEM (2012). Hierarchical recruitment of phasic dopamine signaling in the striatum during the progression of cocaine use. Proc. Natl. Acad. Sci 109, 20703–20708.
  112. Witt K, Nuhsman A, and Deuschl G (2002). Dissociation of habit-learning in Parkinson’s and cerebellar disease. J. Cogn. Neurosci 14, 493–499.
  113. Woolverton WL, Freeman KB, Myerson J, and Green L (2012). Suppression of cocaine self-administration in monkeys: effects of delayed punishment. Psychopharmacology (Berl.) 220, 509–517.
  114. Xu T, Yu X, Perlik AJ, Tobin WF, Zweig JA, Tennant K, Jones T, and Zuo Y (2009). Rapid formation and selective stabilization of synapses for enduring motor memories. Nature 462, 915–919.
  115. Yang H, de Jong JW, Tak Y, Peck J, Bateup HS, and Lammel S (2018). Nucleus Accumbens Subnuclei Regulate Motivated Behavior via Direct Inhibition and Disinhibition of VTA Dopamine Subpopulations. Neuron 97, 434–449.e4.
  116. Yin HH, and Knowlton BJ (2004). Contributions of striatal subregions to place and response learning. Learn. Mem. Cold Spring Harb. N 11, 459–463.
  117. Yin HH, and Knowlton BJ (2006). The role of the basal ganglia in habit formation. Nat. Rev. Neurosci 7, 464.
  118. Yin HH, Knowlton BJ, and Balleine BW (2004). Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur. J. Neurosci 19, 181–189.
  119. Yin HH, Knowlton BJ, and Balleine BW (2005a). Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. Eur. J. Neurosci 22, 505–512.
  120. Yin HH, Ostlund SB, Knowlton BJ, and Balleine BW (2005b). The role of the dorsomedial striatum in instrumental conditioning. Eur. J. Neurosci 22, 513–523.
  121. Yin HH, Mulcare SP, Hilário MRF, Clouse E, Holloway T, Davis MI, Hansson AC, Lovinger DM, and Costa RM (2009). Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat. Neurosci 12, 333–341.
  122. Yu C, Gupta J, Chen J-F, and Yin HH (2009). Genetic deletion of A2A adenosine receptors in the striatum selectively impairs habit formation. J. Neurosci. Off. J. Soc. Neurosci 29, 15100–15103.
