PLOS One. 2020 Jan 8;15(1):e0224715. doi: 10.1371/journal.pone.0224715

Lesion of striatal patches disrupts habitual behaviors and increases behavioral variability

Jacob A Nadel 1,2, Sean S Pawelko 1, Della Copes-Finke 1, Maya Neidhart 1, Christopher D Howard 1,*
Editor: Jeff A Beeler 3
PMCID: PMC6948820  PMID: 31914121

Abstract

Habits are automated behaviors that are insensitive to changes in behavioral outcomes. Habitual responding is thought to be mediated by the striatum, with medial striatum guiding goal-directed action and lateral striatum promoting habits. However, interspersed throughout the striatum are neurochemically differing subcompartments known as patches, which are characterized by distinct molecular profiles relative to the surrounding matrix tissue. These structures have been thoroughly characterized neurochemically and anatomically, but little is known regarding their function. Patches have been shown to be selectively activated during inflexible motor stereotypies elicited by stimulants, suggesting that patches may subserve habitual behaviors. To explore this possibility, we utilized transgenic mice (Sepw1 NP67) preferentially expressing Cre recombinase in striatal patch neurons to target these neurons for ablation with a virus driving Cre-dependent expression of caspase 3. Mice were then trained to press a lever for sucrose rewards on a variable interval schedule to elicit habitual responding. Mice were not impaired on the acquisition of this task, but lesioning striatal patches disrupted behavioral stability across training, and lesioned mice utilized a more goal-directed behavioral strategy during training. Similarly, when mice were forced to omit responses to receive sucrose rewards, habitual responding was impaired in lesioned mice. To rule out effects of lesion on motor behaviors, mice were then tested for impairments in motor learning on a rotarod and locomotion in an open field. We found that patch lesions partially impaired initial performance on the rotarod without modifying locomotor behaviors in open field. This work indicates that patches promote behavioral stability and habitual responding, adding to a growing literature implicating striatal patches in stimulus-response behaviors.

Introduction

Organisms must optimize behavioral strategies in order to be successful in their environments. However, various strategies exist for this purpose; optimization can be rapid and strongly dependent on outcomes or slow and resistant to change. Behaviors have therefore been divided into two main categories: goal-directed and habitual behaviors [1]. Goal-directed, or action-outcome behaviors, are sensitive to the relationship between action and outcome and are thus highly flexible. In contrast, habitual, or stimulus-response strategies, are insensitive to changes in action-outcome relationships and lead to the continued use of behaviors that do not necessarily result in positive outcomes. While habitual strategies are evolutionarily advantageous by improving cognitive efficiency, maladaptive habit formation underlies pathological states including Obsessive Compulsive Disorder [2–4], drug addiction [5–7], and Tourette’s Syndrome [8]. These disorders are characterized by compulsive and maladaptive behaviors with common neuroanatomical alterations.

Habits have been studied in animal models by measuring perseverance of instrumental behaviors (e.g., lever pressing) following changes in reward value, or by measuring flexibility in responding during probes manipulating action-outcome contingency [9,10]. Distinct neural circuits supporting goal-directed and habitual behaviors have been identified using this approach [11,12]. Impairment of the dorsomedial striatum, prelimbic cortex, or orbitofrontal cortex tends to disrupt goal-directed behaviors and animals become less sensitive to changes in outcomes [13–16]. In contrast, the lateral striatum functions as a key ‘habit center’, as lesions of this region promote flexibility [17]. This idea is consistent with human imaging studies, which find habitual behaviors correspond to overreliance on the putamen, the primate homolog of the dorsolateral striatum [18,19]. A model has therefore been established suggesting that the dorsomedial striatum and frontal cortical inputs facilitate goal-directed actions, while the dorsolateral striatum promotes habitual behaviors [11], but see [20].

In addition to a medial-lateral divide, the dorsal striatum contains neurochemically distinct compartments: patches or striosomes compose approximately 15% of striatal volume and are surrounded by the remaining 85% of the striatum, known as the matrix [21,22]. Patches were discovered nearly 50 years ago [23], and have since been identified in the human, monkey, cat, and rodent [24]. Despite decades of research into the neuroanatomy and connectivity of striatal patches, their function remains poorly understood. Patches are heavily interconnected with limbic circuits, and they provide the only direct inhibition to midbrain dopamine neurons from the striatum [25–27], but see [28]. After repeated exposure, stimulant drugs of abuse drive expression of immediate early genes such as c-fos selectively in patches, and this expression is predictive of motor stereotypies [21,29,30]. Similarly, lesions of striatal patches reduce stimulant-induced motor stereotypies [31,32], suggesting patches may subserve compulsive behaviors. Recent work has found that pharmacological ablation of μ-opioid containing neurons, which are enriched in patches, disrupts habitual responding for sucrose rewards in rats [33]. In aggregate, these studies indicate a role for patches in compulsive, habitual motor behaviors. To investigate patch involvement in habitual behaviors, we utilized transgenic mice (Sepw1 NP67) which express Cre-recombinase preferentially in striatal patch neurons [28,34]. We used a virus driving Cre-dependent expression of caspase 3 to selectively ablate patch neurons before training mice on a variable interval schedule of reinforcement, which has been previously used to establish habitual responding [35]. During training, we noted significantly increased day-to-day variability in response rates in lesioned mice relative to controls.
Additionally, lesioning striatal patches disrupted behavioral stability across training and lesioned mice utilized a more goal-directed behavioral strategy during training. When mice were forced to omit responses in order to earn rewards, lesioned mice had diminished response rates relative to control mice, suggesting impaired habitual responding. Lesioned mice were also slightly impaired on acquisition of motor learning as assessed by performance on an accelerating, rotating balance rod (rotarod), though these mice show no generalized locomotor impairments in open field. Together, this work supports the notion that patches subserve habitual behaviors by promoting behavioral stability, an effect that cannot be solely attributed to deficits in motor control.

Materials and methods

Animals

All experiments were in accordance with protocols approved by the Oberlin College Institutional Animal Care and Use Committee. Mice were maintained on a 12 hr/12 hr light/dark cycle and unless otherwise noted, were provided ad libitum access to water and food. Experiments were carried out during the light cycle. Overall, 29 male and female Sepw1-Cre/Rosa26-EGFP mice between 2 and 5 months of age were used in this study. Sepw1-Cre mice were generously provided by Charles Gerfen (National Institutes of Health) and Nathaniel Heintz (Rockefeller University). These mice show preferential Cre recombinase expression in striatal patches [28,34].

Reagents

Isoflurane anesthesia was obtained from Patterson Veterinary (Greeley, CO, USA). Sterile and filtered phosphate buffered saline (PBS, 1X) was obtained from GE Life Sciences (Pittsburgh, PA, USA). Unless otherwise noted, all other reagents were obtained through VWR (Radnor, PA, USA).

Viral injections

To selectively ablate striatal patches, Sepw1 NP67 X Rosa26-EGFP mice were anesthetized with isoflurane (4% at 2 L/sec O2 for induction, 0.5–1.5% at 0.5 L/sec O2 afterward), placed in a stereotactic frame (David Kopf Instruments, Tujunga, CA, USA), and bilaterally injected with AAV5-flex-taCasp3-TEVp (UNC viral vector core). Cre-dependent expression of caspase 3 has been previously shown to drive apoptosis in neurons while limiting necrosis in surrounding tissue [36]. Briefly, two burr holes were drilled above dorsal striatum (+0.9 AP, ±1.8 ML, and 2.5 DV), and a 33-gauge needle was slowly lowered to the DV coordinate over 2 minutes and held in place for 1 min prior to injections. A 5 μl syringe (Hamilton) was used to inject 0.5 μl of virus over 5 min and the needle was left in place for 5 min following injections. The needle was then slowly retracted over 5 min. Mice were sutured and received Carprofen (5 mg/kg, s.c.) as postoperative analgesia. All mice were given 3 weeks to recover before behavioral training began. Control (non-lesion control) mice underwent an identical surgical procedure but received 0.5 μl of sterile, filtered phosphate-buffered saline (PBS).

Variable Interval Training

Mice were trained on a variable interval schedule to induce habitual responding ([35], see Fig 1J for experimental design). Throughout training, mice were food deprived and kept at 85% of initial weight by daily feeding of 1.5–2.5 g of standard mouse chow. Operant conditioning was performed in standard operant chambers (Med Associates). Each chamber had two retractable levers on either side of a food magazine, where sucrose rewards were delivered (20% sucrose solution, 20 μl), and a house light on the opposite side of the chamber. Mice first underwent three days of continuous reinforcement training (CRF/FR1, one lever press yields one reward). At the start of the session, the house light was illuminated, and the left lever was inserted into the chamber. After 60 min or 50 rewards, the light was shut off, the lever was retracted, and the session ended. Animals that failed to obtain >10 rewards during FR1 were given an extra day of FR1 training and were excluded if they did not reach this criterion. Next, mice were trained on a variable-interval 30 task, in which they were rewarded on average every 30 seconds (15–45 sec, possible intervals separated by 3 sec) contingent on lever pressing. To determine how patch lesions modified habit formation across training, lesion and control mice were divided into three groups experiencing either 3, 5, or 7 days of training on a VI60 schedule (rewarded every 60 seconds on average, ranging from 30–90 sec, possible intervals separated by 3 sec). Variable interval sessions ended after 60 min or when 50 rewards had been earned.
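
As an illustration of the interval structure described above, the following minimal Python sketch enumerates the candidate intervals for the VI30 and VI60 schedules (half to one and a half times the mean, in 3 sec steps). The function names are hypothetical, and the uniform draw is an assumption: the text specifies the set of possible intervals but not how the operant software sampled from it.

```python
import random

def vi_intervals(mean_s, step_s=3):
    """Candidate intervals (sec) from mean/2 to 1.5*mean in step_s increments."""
    lo = mean_s // 2
    hi = mean_s + mean_s // 2
    return list(range(lo, hi + 1, step_s))

def draw_interval(mean_s, rng):
    """Pick the next required wait. ASSUMPTION: uniform draw over the candidates."""
    return rng.choice(vi_intervals(mean_s))

vi30 = vi_intervals(30)  # 15, 18, ..., 45 sec
vi60 = vi_intervals(60)  # 30, 33, ..., 90 sec

rng = random.Random(0)   # seeded only so the sketch is reproducible
next_wait = draw_interval(60, rng)
```

Because both candidate sets are symmetric around their midpoint, a uniform draw gives mean intervals of 30 and 60 sec, matching the schedule names.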

Fig 1. Schematic of experimental design.

A. Schematic representation of injection sites in a coronal mouse brain section. Sepw1-Cre mice preferentially express Cre recombinase in striatal patches and ‘exo-patches’ (see text; green). AAV5-AAV-flex-taCasp3-TEVp (0.5 μl) or sterile PBS (control) was injected bilaterally into the dorsal striatum of Sepw1-Cre mice to preferentially lesion patches. B. Representative image of intact striatum of Sepw1-Cre X Rosa26-EGFP mice displaying dense GFP expression in striatal patches. Dotted line denotes border of the striatum and solid white line denotes striatal patches. C. In lesioned mice, GFP+ cells are greatly reduced and striatal patches are reduced in number. D-F. Representative μ-opioid receptor (D) and GFP expression (E) and overlay (F) in intact striatum. G-I. Representative μ-opioid receptor (G) and GFP expression (H) and overlay (I) in lesioned striatum. J. Experimental design. Mice were trained to respond on a continuous reinforcement schedule (CRF) before beginning variable interval 30 training (VI30). This was followed by variable interval 60 (VI60) training to establish habitual responding. After training, mice experienced counterbalanced valuation/devaluation probes (Val, Deval, respectively), followed by a day of reinstatement (VI60), and two days of omission (Omis). See Methods for details of each behavioral schedule.

Probe tests

Following completion of VI training, a devaluation test was conducted over two days. Here, mice were allowed free access to either chow (valuation) or sucrose solution (devaluation) for one hour. Immediately after, mice were given a 5-min probe test in which the lever was extended and presses were recorded, but no rewards were delivered. The order of the valued and devalued condition tests was randomized for each mouse. Mice that experienced 7 days of VI60 training underwent only a single day of devaluation, because we found a significant change in response rate across probe days regardless of probe condition (see Results). One day after valuation and devaluation probe tests, mice were reinstated on the VI60 task to reestablish response rates. The following two days, mice were tested with a 60-minute omission test in which the action-outcome contingency was reversed such that mice were required to refrain from pressing the lever for 20 seconds in order to receive rewards, and pressing the lever reset the counter. Omission is a robust means of testing habitual responding [11,37], and was used to probe goal-directed control.
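
The omission contingency can be made concrete with a short simulation. The sketch below is a minimal Python illustration, not the Med Associates program used in the study; the function name is hypothetical, and it assumes that the 20 sec counter restarts immediately after each reward delivery, which the text does not state.

```python
def omission_rewards(press_times_s, session_s=3600, hold_s=20):
    """Return reward delivery times (sec) for a list of lever-press times.

    A reward is delivered whenever hold_s seconds pass without a press;
    any press resets the counter.
    ASSUMPTION: the counter also restarts right after each reward.
    """
    presses = sorted(press_times_s)
    rewards = []
    timer_start = 0.0
    i = 0
    while True:
        due = timer_start + hold_s
        if due > session_s:
            break
        if i < len(presses) and presses[i] < due:
            # A press before the reward is due resets the counter.
            timer_start = presses[i]
            i += 1
        else:
            rewards.append(due)
            timer_start = due
    return rewards

# A mouse that never presses earns a reward every 20 sec.
no_press = omission_rewards([], session_s=60)
# Presses at 10 and 25 sec each reset the counter, delaying the first reward.
two_presses = omission_rewards([10, 25], session_s=60)
```

Under these assumptions, continued pressing directly reduces the number of rewards earned, which is why persistent responding in omission is read as habitual.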

Rotarod

Deficits in operant behaviors could be due to changes in habit formation or due to generalized motor deficits. Therefore, following omission tests, mice were returned to ad libitum access to chow for at least one week prior to assessment of motor learning. We next sought to determine how lesions of striatal patches might affect motor learning using a rotarod (Ugo Basile). Mice were initially habituated to the rod by first walking for 5 min at a slow, constant rate of 4 rpm. Lesion or control animals were then trained with four trials per day for four days where the rotarod accelerated from 3–40 rotations per min over 360 sec [38]. Each trial ended when the mouse fell from the rod or after 360 sec had elapsed. A resting period of at least 15 min separated trials. Latency to fall was recorded and compared between lesion and control groups.

Open field

Following rotarod training, caspase-lesioned mice and controls were individually placed in a square activity chamber (42 cm wide x 42 cm long x 30 cm tall) and video-monitored from above for 30 minutes. After session completion, the distance moved, velocity, and rotation of each mouse were extracted from the video file using Ethovision (Noldus) and compared between control and lesion groups.

Immunohistochemistry

Following the completion of behavioral experiments, mice were anesthetized with isoflurane and transcardially perfused with 0.9% saline and 4% paraformaldehyde (PFA) using a peristaltic pump or manual injection. Brains were removed and allowed to post-fix in 4% PFA at 4°C for 24 h. Brains were then transferred to a 30% sucrose solution and returned to 4°C. Following sinking, brains were sectioned on a freezing microtome into 25 μm sections, which were stored in a cryoprotectant solution before being washed 3X in Tris buffered saline (TBS) and blocked in 3% horse serum and 0.25% Triton X-100. Sections were then incubated in a 1:500 dilution of anti-GFP polyclonal guinea pig antibody (Synaptic Systems, cat#132–004), and/or anti-μ-opioid receptor polyclonal rabbit antibody (Immunostar, cat #24216) for 24–48 h at 4°C on a shaker. Following incubation, sections were washed 2x15 minutes in TBS to remove excess primary antibody, then blocked for 30 minutes before incubating in Alexa Fluor® 488 AffiniPure Donkey Anti-Guinea Pig IgG (Jackson ImmunoResearch, cat#706-545-148, diluted 1:250) and/or Cy®3 AffiniPure Donkey Anti-Rabbit IgG (Jackson ImmunoResearch, cat#711-165-152, diluted 1:250) for 2 hours at room temperature. Tissue was then washed 3x15 min in TBS to reduce background staining. Slices were subsequently floated in 0.1M phosphate buffer (PB) and mounted on slides. After drying, sections were covered using mounting media (Aqua-Poly/Mount, Polysciences, 18606–20) with DAPI (Sigma-Aldrich D9542; 1:1000). Tissue was visualized using a Leica DM4000B fluorescent microscope.

Data and statistical analysis

Mean press rates and normalized press rates were compared for each probe test and reinstatement days. Devaluation probe rates for each mouse were normalized to valuation press rates (LPr; [39]) or average press rates across all VI60 days. Reinstatement press rates were normalized to press rates during the final day of VI60. Omission press rates were normalized to press rate during the reinstatement day following devaluation probes. Autocorrelation (lag 1) of press rates across VI60 training and cross-correlation were determined using MATLAB (R2018b, Mathworks). We intended to investigate the effects of patch lesions across different VI60 training durations (3, 5, or 7 days), but found no effect of training days across multiple task metrics, including press rates on the final day of VI60 training, and normalized response rates during valuation/devaluation probes, reinstatement day, or omission days (p > 0.05). Therefore, we collapsed these three groups for subsequent analysis. However, due to fewer training days in the 3-day group, variability and behavioral strategy analysis was reserved for mice that received 5 or 7 days of training.
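
To illustrate the day-to-day variability measures, the following minimal Python sketch computes a coefficient of variation and a lag-1 autocorrelation coefficient from one mouse's daily press rates. The original analysis was run in MATLAB; the function names, the use of the sample standard deviation, the normalization of the autocorrelation, and the example press rates are all assumptions made for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(rates):
    """Standard deviation divided by mean. ASSUMPTION: sample SD (n - 1)."""
    return stdev(rates) / mean(rates)

def lag1_autocorrelation(rates):
    """Lag-1 autocorrelation: covariance of each day with the next,
    normalized by the total sum of squared deviations."""
    m = mean(rates)
    dev = [r - m for r in rates]
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    den = sum(d * d for d in dev)
    return num / den

# Hypothetical press rates (presses/min) across five VI60 days.
stable = [10, 11, 12, 13, 14]   # smooth drift: each day predicts the next
erratic = [10, 16, 8, 15, 9]    # day-to-day swings around a similar mean

cv_stable, cv_erratic = coefficient_of_variation(stable), coefficient_of_variation(erratic)
ac_stable, ac_erratic = lag1_autocorrelation(stable), lag1_autocorrelation(erratic)
```

In this toy example the erratic series has the larger coefficient of variation and a negative lag-1 autocorrelation, which is the qualitative pattern the two measures are meant to separate.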

Statistical analysis was conducted using MATLAB (R2018b, Mathworks) or GraphPad Prism 7 (GraphPad). Press rates in VI30, VI60, devaluation probes, LPr, reinstatement day, and change across omission days, as well as distance moved, velocity, and rotations in open field were compared between lesion and control groups with unpaired Student’s t-tests. Devaluation and valuation presses were compared within groups using a paired Student’s t-test. Efficiency was assessed by dividing the number of presses or head entries by the number of rewards, which was calculated for day 1 and day 5. Day 5 efficiency was then normalized to day 1 and was compared using a one-sample t-test comparing means to 100% (no change). Similarly, press rates across omission days were compared using a one-sample t-test, where day 2 press rates were normalized to day 1. Press rates across learning, probe days, omission, performance in rotarod across trials, and cross-correlations were compared using two-way repeated measures ANOVA. For ANOVAs, Sidak’s multiple comparisons test was used for post-hoc tests except for histograms and cross-correlation, where a Bonferroni-corrected multiple comparisons test was performed. Distributions of inter-press and inter-head-entry-interval were compared using a non-parametric two-sample Kolmogorov-Smirnov test of distribution. Finally, Pearson’s correlation was used to compare average press rate across VI60 to press rate in omission day 1. Significance was defined as p ≤ 0.05.
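
The efficiency comparison can be sketched as follows. This is a minimal Python illustration with hypothetical function names and made-up counts; it computes only the t statistic and degrees of freedom, and the p value would then be read from a t distribution in a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

def efficiency(n_responses, n_rewards):
    """Responses (presses or head entries) emitted per reward earned."""
    return n_responses / n_rewards

def normalized_efficiency(day1, day5):
    """Day 5 efficiency as a percentage of day 1 (100 = no change)."""
    return 100.0 * day5 / day1

def one_sample_t(values, mu0=100.0):
    """One-sample t statistic and degrees of freedom against mu0."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / sqrt(n))
    return t, n - 1

# Hypothetical (presses, rewards) counts for three mice on day 1 and day 5.
day1 = [efficiency(400, 50), efficiency(350, 50), efficiency(500, 50)]
day5 = [efficiency(320, 50), efficiency(315, 50), efficiency(500, 50)]
percent_of_day1 = [normalized_efficiency(a, b) for a, b in zip(day1, day5)]
t_stat, df = one_sample_t(percent_of_day1)
```

Note that with this definition a lower ratio means fewer responses per reward, so values below 100% after normalization correspond to the improved efficiency described in the Results.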

Results

Lesion of striatal patches enhances behavioral variability

To explore patch contribution to habitual behaviors, we used Sepw1-Cre mice, which express Cre recombinase in patches [34], and an AAV encoding a modified caspase 3 to preferentially lesion striatal patches. Injection of AAV led to deletion of GFP+ neurons in the dorsal striatum (Fig 1A–1C). Patches have been defined by expression of μ-opioid receptor (MOR; [21]), so we next characterized the expression of MOR in intact and lesioned tissue. GFP+ neurons preferentially aggregate in MOR-enriched striatal patches, though, as previously reported, the Sepw1 line also expresses Cre in “exo-patches,” or striatal neurons outside of patches that are ‘patch-like’ in terms of receptor expression and development (Fig 1D–1F; [28,34]). Injection of virus encoding caspase 3 led to loss of GFP+ neurons from patches and a reduction of exo-patch neurons in both dorsomedial and dorsolateral striatum. This change was accompanied by diffuse expression of μ-opioid receptor and loss of discrete patch expression in the dorsal striatum (Fig 1G–1I). Three weeks after injection of virus (n = 14) or vehicle (n = 15), mice were trained on a variable interval schedule of reinforcement, which has been shown to induce habitual responding in mice ([35], Fig 1J). Both lesioned and control mice increased press rates across FR1, VI30, and VI60 training (two-way repeated-measures ANOVA, significant effect of day, F(8,216) = 24.9, p < 0.0001) and lesioned mice were not impaired in acquisition of the task relative to controls (non-significant effect of group, F(1,27) = 0.2706, p = 0.6071; non-significant interaction, F(8,216) = 1.687, p = 0.1028; Fig 2A). Interestingly, across training, control mice were more consistent in their day-to-day press rates relative to patch lesioned mice. Fig 2B and 2C show the daily press rate of one mouse subtracted from the average press rate for that mouse across VI60 training in both a representative control (Fig 2B) and lesioned mouse (Fig 2C).
Here, larger bars reflect increased variance across days. Indeed, across VI60 training days, lesioned mice displayed significantly increased behavioral variability in response rates (unpaired t-test, t = 2.797, df = 27, p = 0.0094; Fig 2D). Similarly, press rates in control mice were more predictive of press rates the following day, as they demonstrated significantly greater autocorrelation coefficients (at lag 1) relative to lesioned mice (unpaired t-test: t = 2.144, df = 21, p = 0.0439, Fig 2E). This suggests that lesioning patches may disrupt the stabilization of lever press rate across training, which may indicate increased behavioral flexibility. Despite this, press rates did not differ between patch lesioned or control mice in VI60 (t = 0.3034, df = 27, p = 0.7639, Fig 2F). Together, this suggests that lesioning striatal patches does not impair acquisition of action-outcome contingencies in VI60 training, though lesions may enhance behavioral variability across days.

Fig 2. Lesioning striatal patches increases response variability.

A. Across CRF (FR1), variable interval 30 (VI30), and variable interval 60 (VI60) training, lesion (red) and non-lesion control (blue) mice have similar increases in press rates. B-C. Representative day-to-day variation of press rates for a control (B) and lesioned (C) mouse. The line at 0 represents the mean press rate across all VI60 days for each respective mouse and bars represent the difference from the mean on each day. D. Coefficient of variation in press rates across VI60 training days is significantly increased in lesioned mice relative to controls. E. Autocorrelation coefficient at lag 1 is reduced in patch lesioned mice relative to controls. F. Press rates across all VI60 days are not different between lesioned and control mice. * indicates p < 0.05.

Lesion of striatal patches alters behavioral strategy and efficiency

Increased behavioral variability suggested that lesioned mice may display other differences in responding across VI training. Therefore, we plotted distributions of inter-press intervals across both groups in day 1 and day 5 of VI60 training (Fig 3A and 3B). The distribution of inter-press intervals between groups demonstrated a similar bimodal shape suggesting similar response rates between groups. Over training, control mice tended to increase their pressing around 2 sec, though the distribution did not significantly change across training (Two-sample Kolmogorov-Smirnov test, p > 0.05; Fig 3A), while lesioned mice tended to suppress responses at this interval (Two-sample Kolmogorov-Smirnov test, p < 0.05; Fig 3B). Ultimately, this resulted in a significant increase in efficiency in lesioned mice over training (one-sample t-test, t = 2.377, df = 10, p = 0.0388, Fig 3C), while control mice displayed no change in press:reward efficiency from day 1 to 5 (one-sample t-test, t = 0.2779, df = 11, p = 0.7862, Fig 3C). We next repeated this analysis for head entries into the food magazine by plotting inter-head-entry-intervals and comparing efficiency. Control mice significantly altered their distribution of inter-entry-interval, suggesting these mice increase stereotyped head entries across training at 2–4 sec intervals (Two-sample Kolmogorov-Smirnov test, p < 0.05; Fig 3D). On the other hand, lesioned mice tended to reduce head entries, though distributions did not significantly change across training (Two-sample Kolmogorov-Smirnov test, p > 0.05; Fig 3E). This resulted in a partial increase in head-entry:reward efficiency in lesioned mice (one-sample t-test, t = 1.917, df = 10, p = 0.0842, Fig 3F) and no change in control mice (one-sample t-test, t = 0.4354, df = 11, p = 0.6717, Fig 3F).
Together, this suggests that control mice develop a less efficient strategy to obtain rewards relative to lesioned mice, potentially due to emergence of habitual, stereotyped magazine entry across learning in controls, and due to reduced pressing across learning in lesioned mice.
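
The distribution comparison above can be sketched in a few lines. The following minimal Python example derives inter-press intervals from press timestamps and computes the two-sample Kolmogorov-Smirnov statistic D, along with the standard large-sample critical value. The function names and timestamps are hypothetical, and whether the original analysis used this asymptotic critical value or an exact test is an assumption.

```python
from math import log, sqrt

def inter_event_intervals(times_s):
    """Differences between successive event times (sec)."""
    t = sorted(times_s)
    return [b - a for a, b in zip(t[:-1], t[1:])]

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value for D (asymptotic approximation)."""
    c_alpha = sqrt(-0.5 * log(alpha / 2))
    return c_alpha * sqrt((n + m) / (n * m))

# Hypothetical press timestamps (sec) from day 1 and day 5 of one mouse.
day1_ipi = inter_event_intervals([0.0, 2.0, 4.1, 6.0, 8.2, 10.1])
day5_ipi = inter_event_intervals([0.0, 5.0, 11.0, 16.5, 22.0, 28.5])
d_value = ks_statistic(day1_ipi, day5_ipi)
differs = d_value > ks_critical(len(day1_ipi), len(day5_ipi))
```

The asymptotic critical value is only a rough guide for samples as small as this toy example; real sessions contain far more intervals.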

Fig 3. Lesioned mice develop a more efficient behavioral strategy.

A-B. Distribution of inter-press interval for lesioned (A) and control mice (B) on VI60 day 1 and day 5. Solid lines represent mean and dotted lines of the same color are SEM. * indicates significantly different bins based on Kolmogorov-Smirnov test (Dvalue > Dcritical). C. Lesioned mice become more efficient (change in # presses / # rewards) across training, while controls do not. D-E. Distribution of inter-entry-interval for lesioned (D) and control mice (E) on VI60 day 1 and day 5. Solid lines represent mean and dotted lines of the same color are SEM. * indicates significantly different bins based on Kolmogorov-Smirnov test (Dvalue > Dcritical). F. Lesioned mice become slightly more efficient (# head-entries / # rewards) across training, while controls do not. G-H. Cross-correlation of press rate and head entry rate in 100 ms bins for control (G) and lesioned (H) mice (lags -50 to 50; see text for details). # indicates p < 0.1; * indicates p < 0.05.

The differences in behavioral efficiency between lesioned and control mice may reflect differences in press/head entry patterns. That is, improved efficiency (press or entry:reward ratio) may reflect animals better learning the interval, pacing presses during the interval, and then making a head entry to determine the outcome of a press (press-check responding). On the other hand, making repeated head entries or entries followed by a press (check-press responding) may be associated with reduced efficiency by mandating multiple entries. We therefore sought to characterize the structure of responding across variable interval training for each of these groups. To characterize response patterns over time, we performed a cross-correlation analysis of presses and head-entries. Briefly, press and head-entry counts were taken across 100 ms bins for day 1 and 5 and presses were correlated to head entry at a range of intervals (lags -50 to 50). Highly correlated responding at lag 0 indicates that presses were predictive of head entries in the same 100 ms bin. Correlation at lag -50 suggests presses were predictive of head entries 5 sec later (press-check responding), and correlation at lag 50 suggests head entries were predictive of presses 5 sec later (check-press responding). Lags between these extremes represent correlation at a shorter interval between press and entry rates. Between day 1 and 5, control mice show a change in responding with both an increase in correlation between press-check responses, and an increase in check-press responding (two-way repeated measures ANOVA, both factors repeated measures, significant interaction, F(99,1089) = 4.232, p < 0.0001, significant Bonferroni-corrected post-hoc tests shown on figure; Fig 3G). This suggests that control mice increase stereotyped press-check and check-press sequences, which is accompanied by no change in overall efficiency (Fig 3C–3F).
On the other hand, lesioned mice subtly modify their responding across training, with an increased correlation in short latency press-check responding (two-way repeated measures ANOVA, both factors repeated measures, significant interaction, F(99,990) = 3.545, p < 0.0001, significant Bonferroni-corrected post-hoc tests shown on figure; Fig 3H). Thus, control mice increase both press-check and check-press response patterns that may indicate the emergence of reflexive, stereotyped head-entries. However, lesioned mice never increase this check-press behavior and improve their press-check responding, which is associated with increased efficiency. This improvement may suggest that patch lesioned mice maintain goal-directed responding across learning.
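
The lag convention used in this analysis can be illustrated with a small example. The sketch below is a minimal Python version, not the MATLAB code used in the study: it bins event times into 100 ms bins and computes a Pearson correlation between press counts and head-entry counts at each lag. The function names are hypothetical, and the sign convention (negative lags mean entries follow presses) and the per-lag Pearson normalization are assumptions chosen to match the description in the text.

```python
from math import sqrt

def bin_counts(event_times_ms, session_ms, bin_ms=100):
    """Count events in consecutive bins of bin_ms milliseconds."""
    counts = [0] * (session_ms // bin_ms)
    for t in event_times_ms:
        idx = t // bin_ms
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def cross_correlation(press, entry, max_lag=50):
    """r[k] = corr(press[t], entry[t - k]).

    ASSUMED convention: k < 0 pairs each press bin with a LATER entry bin
    (press-check); k > 0 pairs it with an EARLIER entry bin (check-press).
    """
    n = len(press)
    out = {}
    for k in range(-max_lag, max_lag + 1):
        if k < 0:
            p, e = press[:n + k], entry[-k:]
        else:
            p, e = press[k:], entry[:n - k]
        out[k] = pearson(p, e)
    return out

# Hypothetical session: each press is followed by a head entry 500 ms later.
press = bin_counts([1000, 4000, 7000], session_ms=10000)
entry = bin_counts([1500, 4500, 7500], session_ms=10000)
cc = cross_correlation(press, entry, max_lag=10)
peak_lag = max(cc, key=cc.get)
```

In this toy session the correlation peaks at lag -5 (five 100 ms bins), the press-check pattern; a peak at a positive lag would instead indicate check-press responding.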

Lesion of striatal patches does not disrupt devaluation press rates

Habitual behavior is operationally defined by resistance to outcome devaluation; that is, habitual organisms will continue to respond for a reinforcer even after being given free access to the reinforcer [9,10]. Thus, after the completion of training, mice were given free access to either home chow (valuation condition) or the sucrose reward they received in the operant task (devaluation condition), randomized across two days (Fig 1J). Press rates in 5 min devaluation and valuation probes did not differ in either control (paired t-test, t = 1.462, df = 11, p = 0.1717; Fig 4A) or lesioned mice (paired t-test, t = 0.6923, df = 10, p = 0.5045; Fig 4B). Further, patch lesions did not significantly impact mean devaluation press rates between groups (unpaired t-test, t = 1.362, df = 27, p = 0.1843; Fig 4C), or devaluation press rates normalized to VI60 press rates (unpaired t-test, t = 1.298, df = 27, p = 0.2054; Fig 4D). We next quantified habitual behavior by normalizing lever press rate in devaluation tests to press rates in valuation tests (LPr, see [39]) to compare the effects of reward-specific valuation to generalized satiation. Similar to devaluation tests, this metric was also not different between lesioned and control mice (unpaired t-test, t = 0.09028, df = 21, p = 0.9289; Fig 4E). However, we did observe a significant decrease in lever pressing across probe days (two-way repeated-measures ANOVA, significant effect of day, F(1,21) = 21.38, p < 0.0001; Fig 4F), demonstrating that mice tended to decrease pressing across days similarly between lesion and control mice (non-significant effect of group, F(1,21) = 0.0156, p = 0.9018, no significant interaction F(1,21) = 0.1939, p = 0.6642). This significant decrease in press rates across subsequent probe tests is not commonly reported and indicates that Sepw1 mice rapidly extinguish responding across subsequent probe tests. 
Due to the effect of day occluding any effect of probe condition, we were unable to draw conclusive inferences about the degree of habit formation from these data.

Fig 4. Lesions of striatal patches do not change response rate during devaluation.

A-B. Press rates do not differ between devaluation and valuation probes for control (A) or lesioned mice (B). C. Press rates do not differ between lesioned (red) and control (blue) mice in devaluation. D. Devaluation press rates normalized to baseline press rate (average VI60 rate) are not different between lesion and control mice. E. Devaluation press rates normalized to valuation press rates (LPr, see text) did not differ between lesioned and control mice. F. Lesioned and control mice both decrease response rates across subsequent probe days.

Lesion of striatal patches alters performance during retraining and omission

Since this effect of time complicates interpretation of devaluation results, we next retrained mice with one additional day of VI60 (reinstatement) to reestablish high press rates. We then performed two days of omission as a further assessment of habitual responding. Here, the press contingency was reversed and mice were rewarded every 20 seconds if they refrained from lever pressing, and any presses reset this timer. This approach is more efficient at extinguishing behaviors than extinction, and can be used to assess habits [11]. We first compared flexibility of mice to update their response rates to changes in task design across devaluation/valuation, reinstatement on VI60, and omission. Similar to VI60 training, lesions of striatal patches significantly increased the variance of response rates across days relative to control mice (unpaired t-test, t = 2.163, df = 27, p = 0.0396; Fig 5A), suggesting control mice maintain more consistent press rates across these probe and retraining days. Following devaluation, reinstatement to a VI60 schedule did not alter mean press rates between control and lesioned mice (unpaired t-test, t = 1.138, df = 27, p = 0.265; Fig 5B). However, lesioned mice reinstated lever pressing to a greater extent than control mice when press rates were normalized to the final day of VI60 training (unpaired t-test, t = 2.698, df = 27, p = 0.0119; Fig 5C), further indicating enhanced behavioral flexibility following patch lesions. During omission, mean press rates did not differ between lesioned and control mice (two-way repeated-measures ANOVA, non-significant effect of group or interaction, p > 0.05; Fig 5D), though both groups suppressed responding across days (two-way repeated-measures ANOVA, significant effect of time, F(1,27) = 31.42, p < 0.0001; Fig 5D).
However, when press rates were normalized to reinstatement press rates to control for between-subject variance, lesioned mice demonstrated diminished press rates relative to control mice (two-way repeated-measures ANOVA, significant time x group interaction, F(1,27) = 5.17, p = 0.0311; Fig 5E), suggesting habitual responding is impaired in these mice. Post-hoc tests revealed that control mice had elevated press rates on the first day of omission compared to lesioned mice (Sidak’s multiple comparisons test, Day 1, p = 0.0288). We next analyzed the press rates within the first and second halves of this first omission day. Both lesioned and control mice tended to decrease their press rate over time (two-way repeated-measures ANOVA, significant effect of time, F(1,27) = 83.76, p < 0.0001; Fig 5F) though lesioned mice had suppressed response rates over both halves relative to controls (significant effect of group, F(1,27) = 6.028, p = 0.0208, no group x time interaction, F(1,27) = 0.7304, p = 0.4003). Next, we assessed if average response rates across VI60 were predictive of response rates during omission to determine if mice are behaving in a stereotyped manner across time. Control mice display a significant correlation between press rate in VI60 and omission press rate (Pearson’s Correlation, r² = 0.6345, p = 0.0004; Fig 5G), while this correlation was not significant for lesioned mice (Pearson’s Correlation, r² = 0.01697, p = 0.6571; Fig 5H), further suggesting control mice are more consistent in press rate across days. Finally, we compared press rates across omission days to determine if mice are more habitual from omission day 1 to day 2. When press rates in omission day 2 are normalized to press rates in omission day 1, there is no significant decrease in responding in control mice between days (one-sample t-test, t = 0.2079, df = 14, p = 0.8383; Fig 5I, left) suggesting habitual responding. 
On the other hand, lesioned mice significantly decrease responding over time (one-sample t-test, t = 6.889, df = 13, p < 0.0001; Fig 5I, right). Together, these data suggest that control mice maintain a more stereotyped response rate across probe and retraining days, suggesting stronger habit formation in these mice.
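The omission contingency described above (a reward every 20 seconds without a press, with any press resetting the timer) can be illustrated with a minimal sketch. The function name `omission_rewards`, the press timestamps, and the session length are hypothetical. The sketch also assumes that the countdown starts at session onset and restarts after each reward, details the text does not specify.

```python
def omission_rewards(press_times, session_length, interval=20.0):
    """Return reward delivery times (s) under an omission schedule.

    A reward is delivered whenever `interval` seconds pass without a press.
    Each press restarts the countdown. Assumptions (not stated in the text):
    the countdown begins at time 0 and restarts after every reward.
    """
    rewards = []
    timer_start = 0.0            # when the current countdown began
    presses = sorted(press_times)
    i = 0                        # index of the next unprocessed press
    while True:
        next_reward = timer_start + interval
        if next_reward > session_length:
            break                # session ends before the countdown completes
        if i < len(presses) and presses[i] < next_reward:
            timer_start = presses[i]   # a press resets the timer
            i += 1
        else:
            rewards.append(next_reward)
            timer_start = next_reward  # assumed restart after reward
    return rewards


# Hypothetical 60 s session: a mouse that never presses versus one that presses
# every 10 s and therefore never completes a 20 s press-free interval.
print(omission_rewards([], 60.0))
print(omission_rewards([10, 20, 30, 40, 50], 60.0))
```

Under these assumptions, steady pressing at intervals shorter than 20 seconds earns nothing, which is why persistent pressing during omission is read as habitual responding.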

Fig 5. Lesions of striatal patches enhance variance across probes and alter performance in omission.

A. Variance in press rate across devaluation/valuation probe, reinstatement, and omission day 1 suggest caspase lesions increase variability in press rates across days. B. During reinstatement to a VI60 schedule following devaluation/valuation probes, mean press rates do not differ between groups. C. When controlling for differences in baseline press rate, lesioned mice increase responding to a greater extent than controls during reinstatement to the VI60 schedule (data normalized to final day of VI60). D. Mice then underwent omission across two days. Mean press rates did not differ between control and lesioned mice in either day of omission. E. When controlling for differences in baseline press rate, lesioned mice rapidly reduce press rates relative to controls in day 1 of omission (data normalized to VI60 reinstatement rates). F. Press rates within the first and second half of omission day 1 suggest reduced responding in lesion mice relative to control mice. G-H. There is a significant correlation between average VI60 press rates and press rates in the first day of omission for control (G) but not lesioned mice (H). I. One sample t-test comparing change in response from reinstatement to omission suggests that control animals display habitual responding across days (left), while lesioned animals do not (right). * indicates p < 0.05.

To mirror the analysis of behavioral structure performed for VI60 training (Fig 3), we next compared the distributions of inter-press and inter-head-entry intervals between control and lesioned mice for devaluation, valuation, and omission trials. We also assessed these distributions within treatment across devaluation and valuation days. Finally, we compared the structure of behavioral responses between groups in devaluation, valuation, and omission trials. Ultimately, no significant differences were noted for these analyses (p > 0.05), suggesting patch lesions did not alter behavioral strategy during probe tests.

Lesion of striatal patches impairs motor learning, but not locomotion

Deficits in operant conditioning may be due to differences in habit formation or to generalized motor deficits. Therefore, after the completion of variable interval training, we assessed the effect of lesioning patches on motor learning using an accelerating rotarod. Mice performed four trials per day for four days, and latency to fall was measured (maximum 360 seconds; [38]). Both lesioned and control mice increased performance across days, as indicated by a significant effect of day (two-way ANOVA with multiple comparisons, main effect of day: F(3,81) = 49.58, p < 0.0001). However, no effect of lesion was noted across all four tested days (non-significant effect of group: F(1,27) = 2.119, p = 0.1570; non-significant interaction: F(3,81) = 1.513, p = 0.2173; Fig 6A). Within the first day of testing, lesioned and control mice improved performance (two-way repeated-measures ANOVA, significant effect of trial, F(3,81) = 12.54, p < 0.0001), though lesioned mice were slightly impaired relative to controls as indicated by a trending effect of group (F(1,27) = 3.944, p = 0.0573; non-significant interaction, p > 0.05; Fig 6B). However, by day 4, this difference was not present (two-way repeated-measures ANOVA, non-significant effect of group, F(1,27) = 0.1248, p = 0.7267; Fig 6C) and performance stabilized (non-significant effect of time, F(3,81) = 0.2656, p = 0.8500; non-significant interaction, p > 0.05). This indicates that lesion of patches may partially disrupt initial motor learning, but with time, patch-lesioned mice were able to perform at the same level as control mice.
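The degrees of freedom reported in the statistics above can be cross-checked with a short sketch. It assumes a standard two-group mixed-design ANOVA with complete data and no sphericity correction. It also assumes group sizes of 15 control and 14 lesioned mice, which are inferred here from the one-sample t-tests (df = 14 and df = 13) rather than stated in this section. The function names are hypothetical.

```python
def unpaired_t_df(n_a, n_b):
    """Degrees of freedom for an unpaired (pooled-variance) t-test."""
    return n_a + n_b - 2


def mixed_anova_df(n_a, n_b, n_levels):
    """(numerator, denominator) df for each effect in a two-group design
    with one repeated factor of `n_levels` levels (e.g., days or trials).

    Assumes complete data and no sphericity correction.
    """
    n_groups = 2
    df_between_error = n_a + n_b - n_groups   # subjects within groups
    df_within = n_levels - 1                  # repeated factor
    df_within_error = df_between_error * df_within
    return {
        "group": (n_groups - 1, df_between_error),
        "time": (df_within, df_within_error),
        "interaction": ((n_groups - 1) * df_within, df_within_error),
    }


# Inferred group sizes: 15 control, 14 lesioned (an assumption, see above).
print(unpaired_t_df(15, 14))          # unpaired comparisons in the text report df = 27
print(mixed_anova_df(15, 14, 4))      # rotarod: four days, or four trials within a day
print(mixed_anova_df(15, 14, 2))      # omission: two days
```

With these group sizes the sketch gives F(3,81) for the four-level effects and F(1,27) for group and two-level effects, matching the reported values. The open-field subset (13 control, 11 lesioned) likewise gives df = 22.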

Fig 6. Patch lesions slightly impair motor learning but not overall performance or locomotion.

A. Performance on the rotarod across days does not differ between lesioned and control mice. B. Performance during day 1 of rotarod training suggests that lesioned mice are slightly impaired relative to control mice. C. Performance on each trial within day 4 of rotarod training suggests that lesioned mice perform similarly late in training. D-F. Performance in open field suggests that lesioned mice are not different than controls in the overall distance moved (D), overall velocity (E), or in number of rotations (F). # indicates p < 0.1.

To assess overall motor activity, a subset of mice (n = 13 control, n = 11 lesion) were placed in an open field and distance moved, velocity, and rotations were quantified. We observed no differences in overall movement (unpaired t-test, t = 0.7784, df = 22, p = 0.4446), average velocity (unpaired t-test, t = 0.7835, df = 22, p = 0.4417), or rotation (unpaired t-test, t = 0.1968, df = 22, p = 0.8458; Fig 6D–6F). These data indicate that patches may play a role in early motor learning, but that lesioning patches does not affect motor functioning.

Discussion

Here, we investigated a role for striatal patches in habit formation and motor behaviors. To do this, we selectively lesioned patches using a Cre-dependent caspase 3 virus in Sepw1 NP67 mice, and noted loss of striatal patches (Fig 1). Mice with patch lesions demonstrated normal learning on a variable interval task, but displayed greater day-to-day variability in response rates across training. Further, control mice developed check-press patterns of responding during training, which may reflect the development of stereotyped, habitual head entry during learning. Lesioned mice did not alter check-press behavior, but increased press-check patterns of responding, resulting in increased efficiency. Lesioned mice did not display impaired devaluation press rates, though this result is complicated by a generalized decrease in response rates across valuation and devaluation probe days. Lesioned mice also suppressed press rates faster than control mice when they were placed on an omission task, where responses had to be withheld to earn rewards, and lesioned mice were more variable in their performance across probe and retraining days. Taken together, these results indicate that patch lesioned mice demonstrated weakened habitual behaviors and impaired behavioral stability across training and changes in task design, suggesting that striatal patches may be a key site of behavioral stability. Finally, patch lesioned mice showed slight impairment in acquisition of a new motor skill on a rotarod and no impairments in baseline locomotor activity, suggesting patches may regulate motor learning, but not motor execution per se, and that deficits in operant behaviors are not simply attributable to motor deficits.

In the current study, we noted that patch lesions impaired habitual responding during omission, where mice had to suppress response rates to obtain rewards (Fig 5E–5I). Omission and devaluation conditions are commonly utilized to assess habitual behaviors. However, these tasks assess different aspects of habit. Specifically, devaluation manipulates the value of the reinforcer to determine if responding persists [9,10,40]. On the other hand, omission reverses a learned action-outcome contingency, while leaving the value of the reinforcer intact. Previous work has noted that this approach results in a rapid decrease in responding, more rapid even than extinction [11,37], and it has been used to assess aspects of flexibility [39,41]. Omission is thought to reflect both the extinguishing of one behavior (e.g., lever pressing) and reinforcement of other behaviors (e.g., waiting by the food magazine), emphasizing both breakdown of an old, and learning of a new, action-outcome contingency [37]. Therefore, in the current study, while extinction was noted in both groups, lesioned mice displayed faster goal-directed control or enhanced flexibility of updating both old and new action-outcome contingencies. Indeed, while the standard view of habit formation during VI schedules is that goal-directed behaviors degrade as habits are formed, a recent study suggests that with extensive training, goal-directed behavior will eventually emerge [42]. Therefore, it is possible that, rather than disrupting habit formation, lesions of patches facilitate the emergence of goal-directed control.

Our finding of impaired habitual responses in omission is consistent with a recent study that used a conjugated cytotoxin (dermorphin-saporin) to selectively ablate μ-opioid neurons in the striatum and that found that habit formation was impaired [33]. These findings are also consistent with studies suggesting lesions of patches impair inflexible motor stereotypies [31,32]. Jenrette et al. noted deficits in press rates when sucrose rewards were paired with lithium chloride to devalue sucrose rewards through taste aversion. However, the current study did not find a deficit in devaluation press rates when mice were provided free access to sucrose. We attribute this difference to two main factors. First, the method of devaluation (free access to reward vs. taste aversion) may not similarly devalue rewards, and it is possible that taste aversion is a more robust manipulation. Second, we noted a significant effect of probe day such that mice pressed less on day 2 regardless of probe condition (Fig 4F), indicating that the counterbalancing of days confounded any effects of probe condition. The reasons for this remain unclear, as multiple papers have successfully used this probe paradigm to assess habitual behavior [20,39]. Two factors may contribute to this finding. First, the use of home-cage chow and liquid sucrose rewards could represent an asymmetrical manipulation between devaluation and valuation probe trials, which may have impacted the results of these probe trials. However, the approach used here has been utilized in a previous study, and these mice remained sensitive to devaluation [43]. Nevertheless, the lack of difference in our probes could be attributed to potential asymmetry in consumption before devaluation and valuation probes. Another factor that might have impacted this result was length of probe trial. 
Our probe trial duration (5 min) greatly exceeded the delay experienced during variable interval training (30–90 sec), which might have resulted in rapid extinguishing of pressing. Other groups have used probe trials that more closely match delay times that mice experienced during training [39]. Therefore, future studies of habit using mice should be mindful of symmetry in designing valuation and devaluation probes, and in length of probes relative to variable interval delays.

It remains unclear how patches encode habitual behaviors. It is possible that disruption of striatal patches leads to over-reliance on brain circuits subserving goal-directed behaviors, including the prefrontal cortex, nucleus accumbens, and dorsomedial striatum [33]. Activity in striatal patches is tied to reward processing [44,45], and patches support intracranial self-stimulation [46], suggesting that patches have a role in reinforcement. Patch spiny projection neurons also have direct inputs to dopamine neurons [25–27] and a recent dissertation indicates they may suppress dopamine activity through GABAA-mediated inward currents [47]. Lesions to patches may therefore influence spiraling basal ganglia circuits [48] by causing dysregulation of striatal dopamine release that may manifest as impaired reinforcement or disrupted decision making processes [49,50]. Indeed, dopamine signaling shifts from ventromedial to lateral striatum with extended training [51], and this process may be impacted by lesions to patches. Future studies should examine the interplay between patches and dopamine across habit formation to explore this possibility.

Alternatively, patches may mediate habitual behaviors through the endocannabinoid system in the striatum. CB1 receptors are crucial for striatal plasticity and synaptic depression [52,53], and these receptors are enriched in both striatal patches [54] and in striatal projections from the orbitofrontal cortex [14]. Indeed, the orbitofrontal cortex is thought to be key in habit and cognitive flexibility [55,56], and orbitostriatal projections are central in the transition from goal-directed to habitual strategies [13,57]. Further, knockout of CB1 receptors from orbitostriatal terminals impairs habit formation [14]. Thus, CB1 receptors are in a prime position to mediate habit-related plasticity in striatal patches. Loss of striatal patches might impair this process, which may disrupt the transfer from goal-oriented to habitual behavior.

Importantly, the use of a virus encoding caspase 3 at volumes utilized here resulted in loss of patches from the dorsal striatum spanning both medial and lateral subregions (Fig 1). Based on proposed models of striatal functioning, the medial striatum is thought to guide goal-directed behaviors [16], whereas the dorsolateral striatum and its dopamine inputs are thought to be necessary for habit formation [11,17,58], though caveats to this view have been reported [20,59]. Patches are uniformly distributed across the dorsomedial and dorsolateral striatum, forming extended compartments across anterior and posterior ends [60–62]. It is possible that medial and lateral patches have a differential role in habit formation that could reflect the larger medial-lateral divide across the striatum. Future work should investigate this possibility.

While Sepw1 NP67 mice have preferential expression of Cre recombinase in patch projection neurons, a limitation of the current work is the expression of Cre in ‘exo-patch’ neurons [28], resulting in the lesioning of both patch and exo-patch neurons (Fig 1). Exo-patch neurons have similar gene expression and connectivity profiles to patch neurons [28], but they fall outside of traditionally defined patches [21]. The Sepw1 NP67 line has been previously used to study patch connectivity [28,54] and activity [45]. Other recent studies have utilized alternative Cre lines to target patches, including Mash1-CreER [44], 599-CreER [63], or Oprm1-Cre [64], though each of these lines also has some off-target labeling of exo-patch or matrix neurons. Thus, while the current work suggests lesions preferentially targeting patches impair aspects of habitual behavior, we cannot rule out the contribution of exo-patch neurons in this process.

An unexpected finding from the current work was increased day-to-day behavioral variability in patch lesioned mice (Fig 2B–2E, Fig 5A). These data suggest that lesions of striatal patches may generally increase behavioral variability across days. This could suggest that patches play a general role in regulating crystallization of motor patterns, thus establishing habits. Many organisms crystallize motor patterns beyond habit formation in operant conditioning: across development, seasons, or lifespan. For example, many species of songbird show elevated variability in song production either as juveniles or during winter seasons; this variability is eventually reduced over time [65]. Indeed, the basal ganglia is thought to modulate variability in song production in birds [66]. Moreover, spiny projection neuron distribution and patch organization differ between vocal and non-vocal songbird species [67]. Similarly, in rodents, spontaneous variation in foraging patterns is common, even following reinforcement of prior exploration (a win-shift pattern) [68,69]. Non-specific lesions of dorsal striatum impair this behavioral variability and can increase spontaneous alternation in ‘win-stay’ conditions, where rodents need to return to previously rewarded areas [70,71]. Future studies could investigate striatal patches as a site for stabilizing behavioral patterns in motor behaviors and reinforcement learning beyond operant conditioning.

Similarly, during habit formation, discrete behavioral elements may become chunked into larger behavioral sequences with repetition [72,73]. Indeed, as habits form, the likelihood of a given action to follow a preceding action increases [74,75]. Sensory input may therefore drive selection of concatenated behaviors once habits form, and action-outcome contingencies may be updated on a sequence-level [73,74]. It is possible that striatal patches may play a role in this aggregation of behavioral elements. Indeed, the striatum has been shown to be critical for expression of innate behavioral sequences [76,77] and learning of new behavioral sequences is particularly dependent on the lateral striatum [78]. Further, striatal neurons encode the beginning and ends of behavioral sequences as learning occurs [79,80], with differential contribution of striatal direct and indirect pathways [81]. Future studies should investigate correlates of behavioral chunking in patch neurons across habit formation.

While habitual strategies free cognitive resources and are therefore more efficient overall, goal-directed animals are sensitive to reward outcomes and might be more likely to optimize their behavioral strategy. Indeed, here, control mice begin making more stereotyped presses and head-entries and increase check-press sequences over training, establishing an inefficient, habitual checking strategy (Fig 3). On the other hand, mice with lesioned patches fail to establish this checking behavior and only improve press-check responses, resulting in an increase in efficiency. Repetitive head-entries may result in overtraining, which could enhance the establishment of inflexible responding [40]. On the other hand, the propensity of control mice to develop these behaviors may be reflective of ongoing habit formation; that is, repeated head-entries may be a marker of the establishment of habits, which is disrupted in mice with lesioned patches. Indeed, several differing views have emerged regarding why habits develop. First, it is thought that repeated pairings of behavior and reward result in habits [82]. Alternatively, tasks where the link between action and outcome is more difficult to predict drive habitual responding, explaining why random ratio schedules maintain more goal-directed responding relative to random interval schedules [40]. A related, but novel idea has been recently put forward: that tasks where animals are able to pay less attention to their responding and the outcome of behavior may drive habits [83]. Here, sham controls may be able to pay less attention to their responding due to the automaticity afforded by intact patches, while lesioned mice must attend to outcomes, resulting in efficient and goal-directed behavior. Future studies utilizing variable interval schedules of reinforcement should investigate changes in responding during training that might predict habit formation.

Consistent with previous reports [84], patch lesioned mice also have slight deficits in early motor learning, but not in general movement parameters (Fig 6). Notably, minor dopamine dysfunction also leads to deficits in motor learning, but not general motor deficits [85], again raising the possibility that these deficits are partially mediated by dysfunctional dopamine regulation following patch lesions. Indeed, recent work suggests that patch lesions may drive dopamine dysfunction in the striatum, which may directly affect early motor learning [86]. Despite deficits in learning on the rotarod, it remains unlikely that motor learning is the only function of patch compartments, as our results also suggest learning of lever-pressing, locomotion, and final performance on rotarod all remain intact following patch lesion. Other studies investigating fine motor control have found that selective inhibition of matrix neurons using DREADDs disrupts performance in reaching and grasping tasks [87]. Patch compartments have been better studied in decision making [49,50] and reward processing [44,45]. Together, this suggests that matrix neurons may regulate motor execution, whereas patch neurons regulate timing and selection of actions. Indeed, this notion is consistent with computational models [88], which hold that patches bias matrix neurons towards specific actions.

In sum, this work adds to a growing literature suggesting striatal patches support habit formation [29,33]. Lesioning patches may lead to overactivation of brain structures that support goal-oriented behaviors, including the dorsomedial striatum or prefrontal cortex. Alternatively, patch lesions may alter dopamine signaling in the striatum [25,27]. Finally, brain regions supporting inflexible behaviors have been implicated in the pathology of Obsessive Compulsive Disorder [2–4], drug addiction [5–7], and Tourette’s Syndrome [8]. Future studies should investigate the contribution of striatal patches to these disease states.

Supporting information

S1 File. A GraphPad Prism file containing the complete data sets used in this study.

(PZFX)

Acknowledgments

J.A.N. was supported by the Nu Rho Psi Undergraduate Research Grant and the Robert Rich Student Research Grant through Oberlin College. The authors would like to thank Drs. Charles Gerfen (National Institute of Mental Health) and Nathaniel Heintz (The Rockefeller University) for generously providing Sepw1 NP67 mice. Additionally, the authors would like to thank Professor Pat Simen for fruitful discussion regarding cross correlation analysis, and Colin Dawson for discussion on analyzing variability. The authors would also like to thank Claire Geddes, Jared Smith, and Armando Salinas for discussion and feedback on the manuscript. Finally, the authors would also like to thank Lori Lindsay, Forrest Rose, Dorothy Auble, Gigi Knight, Bill Mohler, Chris Mohler and Laurie Holcomb for research support.

Data Availability

Data are all available in a GraphPad Prism file present in the submission.

Funding Statement

This work was partially funded by a grant from Nu Rho Psi to JAN. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Dolan RJ, Dayan P. Goals and Habits in the Brain. Neuron. 2013;80: 312–325. doi: 10.1016/j.neuron.2013.09.007
  • 2. Gillan CM, Papmeyer M, Morein-Zamir S, Sahakian BJ, Fineberg NA, Robbins TW, et al. Disruption in the balance between goal-directed behavior and habit learning in obsessive-compulsive disorder. Am J Psychiatry. 2011;168: 718–726. doi: 10.1176/appi.ajp.2011.10071062
  • 3. Gillan CM, Robbins TW. Goal-directed learning and obsessive-compulsive disorder. Philos Trans R Soc B Biol Sci. 2014;369: 20130475–20130475. doi: 10.1098/rstb.2013.0475
  • 4. Voon V, Derbyshire K, Rück C, Irvine MA, Worbe Y, Enander J, et al. Disorders of compulsivity: a common bias towards learning habits. Mol Psychiatry. 2015;20: 345–352. doi: 10.1038/mp.2014.44
  • 5. Nelson A, Killcross S. Amphetamine exposure enhances habit formation. J Neurosci Off J Soc Neurosci. 2006;26: 3805–3812. doi: 10.1523/JNEUROSCI.4305-05.2006
  • 6. Sjoerds Z, Luigjes J, van den Brink W, Denys D, Yücel M. The Role of Habits and Motivation in Human Drug Addiction: A Reflection. Front Psychiatry. 2014;5. doi: 10.3389/fpsyt.2014.00008
  • 7. Smith RJ, Laiks LS. Behavioral and neural mechanisms underlying habitual and compulsive drug seeking. Prog Neuropsychopharmacol Biol Psychiatry. 2018;87: 11–21. doi: 10.1016/j.pnpbp.2017.09.003
  • 8. Delorme C, Salvador A, Valabrègue R, Roze E, Palminteri S, Vidailhet M, et al. Enhanced habit formation in Gilles de la Tourette syndrome. Brain. 2016;139: 605–615. doi: 10.1093/brain/awv307
  • 9. Dickinson A, Nicholas DJ, Adams CD. The Effect of the Instrumental Training Contingency on Susceptibility to Reinforcer Devaluation. Q J Exp Psychol Sect B. 1983;35: 35–51. doi: 10.1080/14640748308400912
  • 10. Adams CD, Dickinson A. Instrumental Responding following Reinforcer Devaluation. Q J Exp Psychol Sect B. 1981;33: 109–121. doi: 10.1080/14640748108400816
  • 11. Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci. 2006;7: 464–476. doi: 10.1038/nrn1919
  • 12. Packard MG, Knowlton BJ. Learning and memory functions of the Basal Ganglia. Annu Rev Neurosci. 2002;25: 563–593. doi: 10.1146/annurev.neuro.25.112701.142937
  • 13. Gremel CM, Costa RM. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun. 2013;4. doi: 10.1038/ncomms3264
  • 14. Gremel CM, Chancey JH, Atwood BK, Luo G, Neve R, Ramakrishnan C, et al. Endocannabinoid Modulation of Orbitostriatal Circuits Gates Habit Formation. Neuron. 2016;90: 1312–1324. doi: 10.1016/j.neuron.2016.04.043
  • 15. Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci Off J Soc Neurosci. 2004;24: 7540–7548. doi: 10.1523/JNEUROSCI.1921-04.2004
  • 16. Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning: Striatum and instrumental conditioning. Eur J Neurosci. 2005;22: 513–523. doi: 10.1111/j.1460-9568.2005.04218.x
  • 17. Yin HH, Knowlton BJ, Balleine BW. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci. 2004;19: 181–189. doi: 10.1111/j.1460-9568.2004.03095.x
  • 18. McNamee D, Liljeholm M, Zika O, O’Doherty JP. Characterizing the associative content of brain structures involved in habitual and goal-directed actions in humans: a multivariate FMRI study. J Neurosci Off J Soc Neurosci. 2015;35: 3764–3771. doi: 10.1523/JNEUROSCI.4677-14.2015
  • 19. Tricomi E, Balleine BW, O’Doherty JP. A specific role for posterior dorsolateral striatum in human habit learning. Eur J Neurosci. 2009;29: 2225–2232. doi: 10.1111/j.1460-9568.2009.06796.x
  • 20. Malvaez M, Greenfield VY, Matheos DP, Angelillis NA, Murphy MD, Kennedy PJ, et al. Habits Are Negatively Regulated by Histone Deacetylase 3 in the Dorsal Striatum. Biol Psychiatry. 2018. doi: 10.1016/j.biopsych.2018.01.025
  • 21. Crittenden JR, Graybiel AM. Basal Ganglia Disorders Associated with Imbalances in the Striatal Striosome and Matrix Compartments. Front Neuroanat. 2011;5. doi: 10.3389/fnana.2011.00059
  • 22. Gerfen CR. The neostriatal mosaic: multiple levels of compartmental organization. Trends Neurosci. 1992;15: 133–139. doi: 10.1016/0166-2236(92)90355-c
  • 23. Kuhar MJ, Pert CB, Snyder SH. Regional distribution of opiate receptor binding in monkey and human brain. Nature. 1973;245: 447–450. doi: 10.1038/245447a0
  • 24. Graybiel AM, Ragsdale CW. Histochemically distinct compartments in the striatum of human, monkeys, and cat demonstrated by acetylthiocholinesterase staining. Proc Natl Acad Sci U S A. 1978;75: 5723–5726. doi: 10.1073/pnas.75.11.5723
  • 25. Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, Uchida N. Whole-Brain Mapping of Direct Inputs to Midbrain Dopamine Neurons. Neuron. 2012;74: 858–873. doi: 10.1016/j.neuron.2012.03.017
  • 26. Gerfen CR. The neostriatal mosaic. I. compartmental organization of projections from the striatum to the substantia nigra in the rat. J Comp Neurol. 1985;236: 454–476. doi: 10.1002/cne.902360404
  • 27. Fujiyama F, Sohn J, Nakano T, Furuta T, Nakamura KC, Matsuda W, et al. Exclusive and common targets of neostriatofugal projections of rat striosome neurons: a single neuron-tracing study using a viral vector. Eur J Neurosci. 2011;33: 668–677. doi: 10.1111/j.1460-9568.2010.07564.x
  • 28. Smith JB, Klug JR, Ross DL, Howard CD, Hollon NG, Ko VI, et al. Genetic-Based Dissection Unveils the Inputs and Outputs of Striatal Patch and Matrix Compartments. Neuron. 2016;91: 1069–1084. doi: 10.1016/j.neuron.2016.07.046
  • 29. Canales JJ, Graybiel AM. A measure of striatal function predicts motor stereotypy. Nat Neurosci. 2000;3: 377–383. doi: 10.1038/73949
  • 30. Canales J. Stimulant-induced adaptations in neostriatal matrix and striosome systems: Transiting from instrumental responding to habitual behavior in drug addiction. Neurobiol Learn Mem. 2005;83: 93–103. doi: 10.1016/j.nlm.2004.10.006
  • 31. Murray RC, Gilbert YE, Logan AS, Hebbard JC, Horner KA. Striatal patch compartment lesions alter methamphetamine-induced behavior and immediate early gene expression in the striatum, substantia nigra and frontal cortex. Brain Struct Funct. 2014;219: 1213–1229. doi: 10.1007/s00429-013-0559-x
  • 32. Murray RC, Logan MC, Horner KA. Striatal patch compartment lesions reduce stereotypy following repeated cocaine administration. Brain Res. 2015;1618: 286–298. doi: 10.1016/j.brainres.2015.06.012
  • 33. Jenrette TA, Logue JB, Horner KA. Lesions of the Patch Compartment of Dorsolateral Striatum Disrupt Stimulus–Response Learning. Neuroscience. 2019;415: 161–172. doi: 10.1016/j.neuroscience.2019.07.033
  • 34. Gerfen CR, Paletzki R, Heintz N. GENSAT BAC cre-recombinase driver lines to study the functional organization of cerebral cortical and basal ganglia circuits. Neuron. 2013;80: 1368–1383. doi: 10.1016/j.neuron.2013.10.016
  • 35. Rossi MA, Yin HH. Methods for Studying Habitual Behavior in Mice. In: Crawley JN, Gerfen CR, Rogawski MA, Sibley DR, Skolnick P, Wray S, editors. Current Protocols in Neuroscience. Hoboken, NJ, USA: John Wiley & Sons, Inc.; 2012. doi: 10.1002/0471142301.ns0829s60
  • 36. Yang CF, Chiang MC, Gray DC, Prabhakaran M, Alvarado M, Juntti SA, et al. Sexually dimorphic neurons in the ventromedial hypothalamus govern mating in both sexes and aggression in males. Cell. 2013;153: 896–909. doi: 10.1016/j.cell.2013.04.017
  • 37. Davis J, Bitterman ME. Differential reinforcement of other behavior (DRO): a yoked-control comparison. J Exp Anal Behav. 1971;15: 237–241. doi: 10.1901/jeab.1971.15-237
  • 38. Durieux PF, Schiffmann SN, de Kerchove d’Exaerde A. Differential regulation of motor control and response to dopaminergic drugs by D1R and D2R neurons in distinct dorsal striatum subregions: Dorsal striatum D1R- and D2R-neuron motor functions. EMBO J. 2012;31: 640–653. doi: 10.1038/emboj.2011.400
  • 39. O’Hare JK, Ade KK, Sukharnikova T, Van Hooser SD, Palmeri ML, Yin HH, et al. Pathway-Specific Striatal Substrates for Habitual Behavior. Neuron. 2016;89: 472–479. doi: 10.1016/j.neuron.2015.12.032
  • 40. Dickinson A. Actions and Habits: The Development of Behavioural Autonomy. Philos Trans R Soc B Biol Sci. 1985;308: 67–78. doi: 10.1098/rstb.1985.0010
  • 41. Yu C, Gupta J, Chen J-F, Yin HH. Genetic deletion of A2A adenosine receptors in the striatum selectively impairs habit formation. J Neurosci Off J Soc Neurosci. 2009;29: 15100–15103. doi:10.1523/JNEUROSCI.4215-09.2009
  • 42. Garr E, Bushra B, Tu N, Delamater AR. Goal-directed control on interval schedules does not depend on the action-outcome correlation. J Exp Psychol Anim Learn Cogn. 2019. doi:10.1037/xan0000229
  • 43. Li Y, Pan X, He Y, Ruan Y, Huang L, Zhou Y, et al. Pharmacological Blockade of Adenosine A2A but Not A1 Receptors Enhances Goal-Directed Valuation in Satiety-Based Instrumental Behavior. Front Pharmacol. 2018;9. doi:10.3389/fphar.2018.00393
  • 44. Bloem B, Huda R, Sur M, Graybiel AM. Two-photon imaging in mice shows striosomes and matrix have overlapping but differential reinforcement-related responses. eLife. 2017;6. doi:10.7554/eLife.32353
  • 45. Yoshizawa T, Ito M, Doya K. Reward-Predictive Neural Activities in Striatal Striosome Compartments. eNeuro. 2018;5: ENEURO.0367-17.2018. doi:10.1523/ENEURO.0367-17.2018
  • 46. White NM, Hiroi N. Preferential localization of self-stimulation sites in striosomes/patches in the rat striatum. Proc Natl Acad Sci U S A. 1998;95: 6486–6491. doi:10.1073/pnas.95.11.6486
  • 47. Faust T. Influence of the Neostriatal Patch System on the Prediction-Based Coding of Midbrain Dopaminergic Neurons. PhD, Rutgers University; 2017.
  • 48. Ikeda H, Koshikawa N, Cools AR. Accumbal core: essential link in feed-forward spiraling striato-nigro-striatal in series connected loop. Neuroscience. 2013;252: 60–67. doi:10.1016/j.neuroscience.2013.07.066
  • 49. Friedman A, Homma D, Gibb LG, Amemori K, Rubin SJ, Hood AS, et al. A Corticostriatal Path Targeting Striosomes Controls Decision-Making under Conflict. Cell. 2015;161: 1320–1333. doi:10.1016/j.cell.2015.04.049
  • 50. Friedman A, Homma D, Bloem B, Gibb LG, Amemori K, Hu D, et al. Chronic Stress Alters Striosome-Circuit Dynamics, Leading to Aberrant Decision-Making. Cell. 2017;171: 1191–1205.e28. doi:10.1016/j.cell.2017.10.017
  • 51. Willuhn I, Burgeno LM, Everitt BJ, Phillips PEM. Hierarchical recruitment of phasic dopamine signaling in the striatum during the progression of cocaine use. Proc Natl Acad Sci. 2012;109: 20703–20708. doi:10.1073/pnas.1213460109
  • 52. Gerdeman GL, Ronesi J, Lovinger DM. Postsynaptic endocannabinoid release is critical to long-term depression in the striatum. Nat Neurosci. 2002;5: 446–451. doi:10.1038/nn832
  • 53. Atwood BK, Lovinger DM, Mathur BN. Presynaptic long-term depression mediated by Gi/o-coupled receptors. Trends Neurosci. 2014;37: 663–673. doi:10.1016/j.tins.2014.07.010
  • 54. Davis MI, Crittenden JR, Feng AY, Kupferschmidt DA, Naydenov A, Stella N, et al. The cannabinoid-1 receptor is abundantly expressed in striatal striosomes and striosome-dendron bouquets of the substantia nigra. Lee J, editor. PLOS ONE. 2018;13: e0191436. doi:10.1371/journal.pone.0191436
  • 55. Torregrossa MM, Quinn JJ, Taylor JR. Impulsivity, compulsivity, and habit: the role of orbitofrontal cortex revisited. Biol Psychiatry. 2008;63: 253–255. doi:10.1016/j.biopsych.2007.11.014
  • 56. Schoenbaum G, Saddoris MP, Stalnaker TA. Reconciling the Roles of Orbitofrontal Cortex in Reversal Learning and the Encoding of Outcome Expectancies. Ann N Y Acad Sci. 2007;1121: 320–335. doi:10.1196/annals.1401.001
  • 57. Burguiere E, Monteiro P, Feng G, Graybiel AM. Optogenetic Stimulation of Lateral Orbitofronto-Striatal Pathway Suppresses Compulsive Behaviors. Science. 2013;340: 1243–1246. doi:10.1126/science.1232380
  • 58. Faure A, Haberland U, Condé F, El Massioui N. Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci Off J Soc Neurosci. 2005;25: 2771–2780. doi:10.1523/JNEUROSCI.3894-04.2005
  • 59. Okada K, Nishizawa K, Fukabori R, Kai N, Shiota A, Ueda M, et al. Enhanced flexibility of place discrimination learning by targeting striatal cholinergic interneurons. Nat Commun. 2014;5: 3778. doi:10.1038/ncomms4778
  • 60. Desban M, Kemel ML, Glowinski J, Gauchy C. Spatial organization of patch and matrix compartments in the rat striatum. Neuroscience. 1993;57: 661–671. doi:10.1016/0306-4522(93)90013-6
  • 61. Johnston JG, Gerfen CR, Haber SN, van der Kooy D. Mechanisms of striatal pattern formation: conservation of mammalian compartmentalization. Brain Res Dev Brain Res. 1990;57: 93–102. doi:10.1016/0165-3806(90)90189-6
  • 62. Morigaki R, Goto S. Putaminal Mosaic Visualized by Tyrosine Hydroxylase Immunohistochemistry in the Human Neostriatum. Front Neuroanat. 2016;10: 34. doi:10.3389/fnana.2016.00034
  • 63. McGregor MM, McKinsey GL, Girasole AE, Bair-Marshall CJ, Rubenstein JLR, Nelson AB. Functionally Distinct Connectivity of Developmentally Targeted Striosome Neurons. Cell Rep. 2019;29: 1419–1428.e5. doi:10.1016/j.celrep.2019.09.076
  • 64. Märtin A, Calvigioni D, Tzortzi O, Fuzik J, Wärnberg E, Meletis K. A Spatiomolecular Map of the Striatum. Preprint, bioRxiv; 2019 May. doi:10.1101/613596
  • 65. Brainard MS, Doupe AJ. Translating Birdsong: Songbirds as a Model for Basic and Applied Medical Research. Annu Rev Neurosci. 2013;36: 489–517. doi:10.1146/annurev-neuro-060909-152826
  • 66. Kao MH, Doupe AJ, Brainard MS. Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature. 2005;433: 638–643. doi:10.1038/nature03127
  • 67. Garcia-Calero E, Bahamonde O, Martinez S. Differences in number and distribution of striatal calbindin medium spiny neurons between a vocal-learner (Melopsittacus undulatus) and a non-vocal learner bird (Colinus virginianus). Front Neuroanat. 2013;7: 46. doi:10.3389/fnana.2013.00046
  • 68. Charnov EL. Optimal foraging, the marginal value theorem. Theor Popul Biol. 1976;9: 129–136. doi:10.1016/0040-5809(76)90040-x
  • 69. Compton D. Behavior strategy learning in rat: effects of lesions of the dorsal striatum or dorsal hippocampus. Behav Processes. 2004;67: 335–342. doi:10.1016/j.beproc.2004.06.002
  • 70. Sakamoto T, Okaichi H. Use of Win-Stay and Win-Shift Strategies in Place and Cue Tasks by Medial Caudate Putamen (MCPu) Lesioned Rats. Neurobiol Learn Mem. 2001;76: 192–208. doi:10.1006/nlme.2001.4006
  • 71. McDonald RJ, White NM. A triple dissociation of memory systems: hippocampus, amygdala, and dorsal striatum. Behav Neurosci. 1993;107: 3–22. doi:10.1037//0735-7044.107.1.3
  • 72. Graybiel AM. Habits, rituals, and the evaluative brain. Annu Rev Neurosci. 2008;31: 359–387. doi:10.1146/annurev.neuro.29.051605.112851
  • 73. Lingawi NW, Dezfouli A, Balleine BW. The Psychological and Physiological Mechanisms of Habit Formation. In: Murphy RA, Honey RC, editors. The Wiley Handbook on the Cognitive Neuroscience of Learning. Chichester, UK: John Wiley & Sons, Ltd; 2016. pp. 409–441. doi:10.1002/9781118650813.ch16
  • 74. Dezfouli A, Lingawi NW, Balleine BW. Habits as action sequences: hierarchical action control and changes in outcome value. Philos Trans R Soc Lond B Biol Sci. 2014;369. doi:10.1098/rstb.2013.0482
  • 75. Matsumoto N, Hanakawa T, Maki S, Graybiel AM, Kimura M. Role of [corrected] nigrostriatal dopamine system in learning to perform sequential motor tasks in a predictive manner. J Neurophysiol. 1999;82: 978–998. doi:10.1152/jn.1999.82.2.978
  • 76. Berridge KC, Whishaw IQ. Cortex, striatum and cerebellum: control of serial order in a grooming sequence. Exp Brain Res. 1992;90: 275–290. doi:10.1007/bf00227239
  • 77. Van den Bercken JH, Cools AR. Evidence for a role of the caudate nucleus in the sequential organization of behavior. Behav Brain Res. 1982;4: 319–327. doi:10.1016/0166-4328(82)90058-4
  • 78. Yin HH. The sensorimotor striatum is necessary for serial order learning. J Neurosci Off J Soc Neurosci. 2010;30: 14719–14723. doi:10.1523/JNEUROSCI.3989-10.2010
  • 79. Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, et al. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013;494: 238–242. doi:10.1038/nature11846
  • 80. Jin X, Tecuapetla F, Costa RM. Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat Neurosci. 2014;17: 423–430. doi:10.1038/nn.3632
  • 81. Geddes CE, Li H, Jin X. Optogenetic Editing Reveals the Hierarchical Organization of Learned Action Sequences. Cell. 2018;174: 32–43.e15. doi:10.1016/j.cell.2018.06.012
  • 82. Thorndike EL. Animal intelligence; experimental studies. New York: The Macmillan Company; 1911. doi:10.5962/bhl.title.55072
  • 83. Thrailkill EA, Trask S, Vidal P, Alcalá JA, Bouton ME. Stimulus control of actions and habits: A role for reinforcer predictability and attention in the development of habitual behavior. J Exp Psychol Anim Learn Cogn. 2018;44: 370–384. doi:10.1037/xan0000188
  • 84. Lawhorn C, Smith DM, Brown LL. Partial ablation of mu-opioid receptor rich striosomes produces deficits on a motor-skill learning task. Neuroscience. 2009;163: 109–119. doi:10.1016/j.neuroscience.2009.05.021
  • 85. Ogura T, Ogata M, Akita H, Jitsuki S, Akiba L, Noda K, et al. Impaired acquisition of skilled behavior in rotarod task by moderate depletion of striatal dopamine in a pre-symptomatic stage model of Parkinson’s disease. Neurosci Res. 2005;51: 299–308. doi:10.1016/j.neures.2004.12.006
  • 86. Shumilov K, Real MÁ, Valderrama-Carvajal A, Rivera A. Selective ablation of striatal striosomes produces the deregulation of dopamine nigrostriatal pathway. Beeler JA, editor. PLOS ONE. 2018;13: e0203135. doi:10.1371/journal.pone.0203135
  • 87. Lopez-Huerta VG, Nakano Y, Bausenwein J, Jaidar O, Lazarus M, Cherasse Y, et al. The neostriatum: two entities, one structure? Brain Struct Funct. 2016;221: 1737–1749. doi:10.1007/s00429-015-1000-4
  • 88. Shivkumar S, Muralidharan V, Chakravarthy VS. A Biologically Plausible Architecture of the Striatum to Solve Context-Dependent Reinforcement Learning Tasks. Front Neural Circuits. 2017;11. doi:10.3389/fncir.2017.00045

Decision Letter 0

Jeff A Beeler

7 Nov 2019

PONE-D-19-28957

Lesion of striatal patches disrupts habitual behaviors and increases behavioral variability

PLOS ONE

Dear Mr. Nadel,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

In addressing the reviewers' concerns, please note my comment below regarding reviewer #2's request that you repeat the experiments.

==============================

We would appreciate receiving your revised manuscript by Dec 22 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as a separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as a separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as a separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jeff A Beeler

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Our internal editors have looked over your manuscript and determined that it may be within the scope of our Neuroscience of Reward and Decision Making Call for Papers. This collection of papers is headed by a team of Guest Editors for PLOS ONE: Stephanie Groman, Satoshi Ikemoto, Jane Taylor and Robert Whelan. With this Collection we hope to bring together researchers working on a wide range of disciplines, from animal subjects research to computational approaches and patient-centered research. Additional information can be found on our announcement page: https://collections.plos.org/s/reward-and-decision-making. If you would like your manuscript to be considered for this collection, please let us know in your cover letter and we will ensure that your paper is treated as if you were responding to this call. Agreeing to be part of the call-for-papers will not affect the date your manuscript is published. If you would prefer to remove your manuscript from collection consideration, please specify this in the cover letter.

3. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

4. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that corresponds with these findings to a stable repository (such as Figshare or Dryad) and provide any URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data.

Additional Editor Comments:

Reviewer #2 asks that you repeat the experiments. As the other two reviewers did not make a similar request, the authors may respond by addressing the reviewer's concerns with either a justification or noting the issue in the manuscript. Entirely repeating the experiments is not necessary, but please address the concern.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper reports the effects of striosome/patch lesions in the dorsolateral striatum on instrumental behavior in mice. Using a Cre driver line that allows selective lesion of patch neurons, the authors used standard procedures such as devaluation and omission to test the hypothesis that patch neurons could contribute to habit formation. They concluded that patch lesions weakened habitual control. The experiments are well designed and novel, and on the whole the paper is well written. I do have a few questions concerning the interpretation. It'd be helpful if the authors could address these.

1. The results show that patch lesions did not affect devaluation, but increased sensitivity to imposition of the omission contingency. While both tests have been used to test for habits, they are obviously different and in principle could be dissociated. This should be explicitly discussed. The original use of the omission schedule also focuses on differential reinforcement of other behaviors, which could be relevant here (Davis and Bitterman 1971).

2. One interesting feature of the data is increased behavioral variability in the lesion mice. However, the same analysis was not performed on behavior on devaluation and omission tests. Why not? It'd be helpful to see this analysis.

3. The discussion of increased variability includes the possibility of impaired reward memory consolidation. I'm not sure if I can follow the logic here. This is very briefly mentioned with no citation. Some elaboration is needed to clarify this point, or the authors should remove it.

4. The discussion of motor pattern crystallization is interesting. Dezfouli and Balleine published a series of papers examining the relationship between behavioral chunking, sequence learning, and habit formation (e.g. 2013). These should be discussed in relation to the current finding. Also, the work showing DLS contribution to sequence learning (e.g. Yin 2010) seems relevant.

Reviewer #2: This paper examines how lesions of the patch compartment in the dorsal striatum affect habitual control of instrumental performance in mice. While the main research focus has been on lateral versus medial dorsal striatum, the contributions of the patch and matrix compartments are unknown. The authors find that patch lesions induced by caspase 3 result in increased lever pressing variability and altered micro-structure of lever pressing. During selective satiety tests conducted after instrumental training, the lesion animals appear to not differ from sham animals—both seem to show instrumental insensitivity to devaluation. However, during omission tests, lesion animals show lower lever pressing rates, which suggests greater sensitivity to the action-outcome association. This is despite no apparent differences in overall motor activity, as assessed by a number of tests.

Major comments:

The method of reward devaluation is problematic. The reason for using a specific satiety procedure is to control for general satiety, but using chow as a control when the mice are food-deprived does not accomplish that. The mice are food-deprived, and they will consume more chow than sucrose solution. This makes the tests an unfair comparison, because the mice are differentially sated during the tests. If half the mice were trained to earn sucrose and the other half chow, this method would be slightly less problematic. However, given that all mice were trained to earn sucrose solution, the appropriate control is another type of solution (e.g. maltodextrin). This is an especially important point given that the authors’ goal is to elicit habits. Given that the ‘valued’ test is likely associated with increased consumption relative to the ‘devalued’ test, it’s no surprise that the authors found no difference in instrumental performance between valued and devalued test. But it’s unclear if this can be attributed to a habit or asymmetrical consumption—thus undermining the main point of the paper. I recommend running the experiment again, but using a better control for general satiety that equates consumption between valued and devalued tests.

The authors present and analyze normalized pressing rates to gauge habitual responding, but I strongly recommend presenting and analyzing the raw pressing rates for devaluation and omission tests, either instead of or in addition to the normalized rates. Raw pressing rates provide a straightforward measure of performance, while also removing any suspicion associated with normalized data.

Line 262-263: “In the context of the variable interval schedule, a single press followed by head entry is the most efficient strategy to obtain a reward, while head-entries followed by presses are less efficient”. This is confusing. To the naïve reader, it sounds like the authors are saying that on a VI, it is optimal to always check the magazine after a single lever press, and it is not optimal to check the magazine after a series of presses. Is this what the authors are suggesting? If so, it does not make much sense, and needs to be explained. It would help to present a rigorous definition of efficiency.

Line 295-296: “This finding is not consistent with prior reports”. Are the authors suggesting that the decrease in instrumental performance across two days of extinction is inconsistent with prior reports? If so, that is not true. Instrumental performance usually decreases with increased extinction experience.

For the data in Fig 5B, the results of the interaction test are not reported, and the main effect of group is not statistically significant. Yet, the authors perform a post-hoc test anyway. This seems inappropriate.

The results from the omission test are interesting, and the authors interpret the data to mean that patch lesions disrupt habit formation. An alternative interpretation that should be mentioned is that patch lesions facilitate the emergence of goal-directed control. The standard way of thinking about action control on a VI schedule is that goal-directed control appears early and then erodes with more extensive training to make way for a habit. However, a recent paper (Garr et al., 2019; DOI: 10.1037/xan0000229) has proposed a revision of this belief. Specifically, those authors argue that goal-directed control on VI schedules will eventually emerge with truly extensive training, but is also mediated by the average action-outcome interval. In light of this view, the omission data could be taken to mean that patch lesions facilitate goal-directed control, and for the sake of balance, the authors should mention it.

Minor comments:

How were the intervals distributed on the VI schedules? Also, why did the authors choose to implement a lower bound? This seems unusual to me.
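To make the question about interval distributions concrete, the following is a minimal sketch of one common way to generate a variable interval schedule with a lower bound. It is an illustration only: the exponential distribution, the rejection step used to enforce the bound, and the names `sample_vi_intervals`, `mean_s`, and `lower_bound_s` are assumptions, not the procedure used in the manuscript.

```python
import random


def sample_vi_intervals(mean_s, n, lower_bound_s=0.0, seed=None):
    """Draw n inter-reward intervals (in seconds) for a hypothetical VI schedule.

    Intervals come from an exponential distribution with mean `mean_s`
    (an assumption). Draws shorter than `lower_bound_s` are rejected and
    redrawn, which is one possible way to implement a lower bound.
    """
    rng = random.Random(seed)
    intervals = []
    while len(intervals) < n:
        draw = rng.expovariate(1.0 / mean_s)
        if draw >= lower_bound_s:
            intervals.append(draw)
    return intervals


# Note: because the exponential distribution is memoryless, rejecting draws
# below the bound shifts the expected interval to mean_s + lower_bound_s.
schedule = sample_vi_intervals(mean_s=60.0, n=10, lower_bound_s=5.0, seed=1)
```

Other choices are equally plausible, for example a uniform distribution around the nominal interval or clamping short draws to the bound, and each changes the effective mean interval differently, which is why the distribution is worth reporting.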

There is a section titled ‘Probe trials’. I recommend changing the title to ‘Probe tests’, because the task is free-operant without any discrete trials.

Line 142-143: “Omission is a more robust means of extinguishing habitual responding…”. More robust than what?

In Fig 3A, B, D, E, there are 5 lines per plot. I assume that some of those lines represent SEM bounds, but this is not stated in the caption nor is it clearly depicted. Part of the problem is that the figures are blurry.

In Fig 4D, it’s unclear what the data are normalized to.

Line 319-320: “We next analyzed the press rates within the first and second halves of this first omission trial”. What do the authors mean by “trial”? The test is free-operant without any discrete trials, correct?

Reviewer #3: Re: PONE-D-19-28957

The present manuscript evaluates the contributions of striatal patches to instrumental learning and motor learning. Recent works have shown similar findings, using different approaches. Here, they rely on a Cre line expressed in patches, and then use a Cre-dependent caspase virus to delete neurons within a patch. They find that learned lever press responding is somewhat more variable in patch-lesioned mice, and that some aspects of what has been termed habitual control are disrupted. There do not seem to be large differences in motor effects. Together, this may add novel data to the literature about how patches support the learning and performance of actions. There are interesting ideas about stabilizing response patterns, which, in the context of Bouton’s recent work, is an interesting finding. A few comments made below would help to clarify and solidify the findings. In addition, there may be concerns about the stability of the Cre line used.

DATA

- I do not believe it is appropriate to use ANOVAs for analysis of the variability data. The data are bimodal in their distribution, and an assumption when using ANOVA is that the data follow a normal distribution. That being said, it is tricky to implement non-parametric tests as well, but probably necessary here. As it stands, it is hard to see the argument for this being appropriate.

- Continuing with statistics, there are places where post-hoc comparisons in the absence of an a priori hypothesis are not warranted. Following up main effects within a group is fine, but comparing data points between groups is not appropriate if there is not a significant interaction. One such place is in the rotarod data, and the use of a post hoc test to justify day 1 investigations.

- Reliance on the presence/absence of GFP may not be sufficient to make the claim of patch deletion. While two previous papers were cited (one of which included one of the present authors), a quick check of those papers did not alleviate these concerns. The line seems quite leaky, and it is not clear how many cells are taken out of patches (as assessed by mu opioid staining). This doesn’t seem to be patch-selective expression, but just more expression within a patch.
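As a sketch of the kind of distribution-free comparison the first point above alludes to, a two-group permutation test avoids the normality assumption entirely. The example below is illustrative only: the function name `permutation_test`, the use of the difference in means as the test statistic, and the example values are assumptions, and it handles a single between-group comparison rather than the repeated-measures design of the study.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding on ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical variability scores, not data from the manuscript.
sham = [0.10, 0.12, 0.11, 0.40, 0.42, 0.09]
lesion = [0.45, 0.50, 0.14, 0.48, 0.52, 0.47]
p_value = permutation_test(sham, lesion, n_perm=5000, seed=0)
```

A rank-based test such as Mann-Whitney U would be another reasonable option; the permutation approach is shown here only because it makes no assumption about the shape of the distribution and needs nothing beyond the standard library.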

CLARITY

- line 45: changes in action-outcome contingencies are not determined via outcome devaluation, but by probes that examine contingency.

- line 70: there is no evidence that lesioned mice are “suppressing” unnecessary responses.

- lines 142 and 374: omission is robust at testing for habitual responding, and extinction is observed, but it is faster under goal-directed control.

- line 207: press rate results are discussed, but are then followed by the number of lever presses, which makes looking at the graphs confusing initially.

- line 288: noticed a significant effect on press rate.

- for comparisons on the presence of “habitual behavior”, omission day press rate should be compared to baseline with a one-sample t-test, to show it in Sham but not caspase.

- more should be said about the distribution of patches across the striatum, as these are lesions to DMS and DLS, which support seemingly different roles.

- line 453: this does not seem to need a reference, or at least going to the references and seeing reviews is a bit confusing here.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No


PLoS One. 2020 Jan 8;15(1):e0224715. doi: 10.1371/journal.pone.0224715.r002

Author response to Decision Letter 0


4 Dec 2019

We would like to thank the reviewers for their thoughtful and expeditious review of our manuscript. We feel that addressing these concerns will make this work stronger overall, and we are grateful for an opportunity to revise our work. As a brief overview, we have split Figure 4 into the new Figure 4 and Figure 5, we have added 13 new panels to the figures, with new statistical analyses on 10 panels, and 6 new histological panels. Further, we have included ~6 new paragraphs to the text, and many more wording changes to improve the precision of the document. Below you will find our responses to each concern with reviewer comments italicized, changes to text in bold, and our own comments in plain text.

Reviewer #1: This paper reports the effects of striosome/patch lesions in the dorsolateral striatum on instrumental behavior in mice. Using a Cre driver line that allows selective lesion of patch neurons, the authors used standard procedures such as devaluation and omission to test the hypothesis that patch neurons could contribute to habitual formation. They concluded that patch lesions weakened habitual control. The experiments are well designed and novel, and on the whole the paper is well written. I do have a few questions concerning the interpretation. It'd be helpful if the authors could address these.

1. The results show that patch lesions did not affect devaluation, but increased sensitivity to imposition of the omission contingency. While both tests have been used to test for habits, they are obviously different and in principle could be dissociated. This should be explicitly discussed. The original use of the omission schedule also focuses on differential reinforcement of other behaviors, which could be relevant here (Davis and Bitterman 1971).

This is a very good point and we agree the text should highlight this important difference between these differential tests of habit. We have therefore included the following paragraph in the discussion:

“Omission and devaluation conditions are commonly utilized to assess habitual behaviors. However, these tasks assess different aspects of habit. Specifically, devaluation manipulates the value of the reinforcer to determine if responding persists (Adams and Dickinson 1981; Dickinson 1985). On the other hand, omission reverses a learned action-outcome contingency, while leaving the value of the reinforcer intact. Previous work has noted that this approach results in a rapid decrease in responding, more rapid even than extinction (Davis and Bitterman 1971; Yin and Knowlton 2006), and it has been used to assess aspects of flexibility (O'Hare et al. 2016; Yu et al. 2009). Omission is thought to reflect both the extinguishing of one behavior (e.g., lever pressing) and reinforcement of other behaviors (e.g., waiting by the food magazine), emphasizing both breakdown of an old, and learning of a new, action-outcome contingency (Davis and Bitterman 1971). Therefore, in the current study, while extinction was noted in both groups, lesioned mice displayed faster goal-directed control or enhanced flexibility of updating both old and new action-outcome contingencies.”

2. One interesting feature of the data is increased behavioral variability in the lesion mice. However, the same analysis was not performed on behavior on devaluation and omission tests. Why not? It'd be helpful to see this analysis.

Agreed. We had not initially investigated behavioral variance across probe and omission days because the structure of the task changed. However, we have conducted this analysis and included it as the new Figure 5A. Interestingly, across probe trials, reinstatement and the first day of omission, lesioned mice are significantly more variable between days, which may reflect increased variability in these mice, or rapid updating of A-O contingencies. We then aimed to determine if press rates across VI60 training were predictive of press rates during omission. Interestingly, a significant correlation was noted for control animals. However, this correlation is not significant for lesioned mice, further suggesting rapid updating of A-O contingencies in omission, or of a general increase in variability across days. This ultimately led to a significant change in press rate in saline mice (one-sample t-test, see Reviewer 3 comment below) suggesting habitual responding on day 1 of omission, and no change in response across omission days for caspase mice, suggesting weakened habit formation. The text and figure legends have been updated accordingly.
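As an illustration of the one-sample t-test mentioned above, the following minimal sketch computes the t statistic and degrees of freedom for a set of normalized press rates against a baseline value. The function name `one_sample_t`, the baseline `mu0`, and the example numbers are hypothetical and are not taken from the study; the p-value is left to a t-table or statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0):
    """Return (t, df) for a one-sample t-test of `values` against `mu0`.

    `values` might be per-mouse press rates on omission day 1, normalized so
    that the baseline equals `mu0` (an assumption made for illustration).
    """
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # sample SD (n - 1 denominator)
    t = (mean(values) - mu0) / standard_error
    return t, n - 1


# Hypothetical normalized press rates, tested against a baseline of 1.0
t_stat, df = one_sample_t([0.9, 1.1, 1.3, 0.8, 1.2], mu0=1.0)
```

A t statistic near zero indicates press rates close to baseline; the sign shows the direction of any change.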

Next, we have analyzed distribution of inter-press-interval and inter-entry-intervals in devaluation, valuation, and the first day of omission. We have also performed cross-correlation analysis of presses and head entry to investigate structure of these behavioral sessions. Ultimately, we noted no significant differences between saline and caspase mice in valuation, devaluation, or omission. We then compared between valuation and devaluation trials within saline and caspase mice, and similarly found no significant differences. We have summarized these findings in the figure shown below (NOTE: we could not put this figure in the text box, see .pdf file of Response to Reviewers to see figure).

We opted, ultimately, not to include this as a new figure to streamline readability of the document. However, if the reviewer feels this analysis should be included as a new figure, we are happy to include it. We have included acknowledgement of this analysis in the text which can be found in results:

“Finally, to mirror the analysis of behavioral structure performed for VI60 training (Figure 3), we next compared the distribution of inter-press- and inter-head-entry-interval between control and lesioned mice for devaluation, valuation, and omission trials. Further, we also assessed these distributions within treatment across devaluation and valuation days. Finally, we compared the structure of behavioral responses between groups in devaluation, valuation, and omission trials. Ultimately, no significant differences were noted for these analyses (p>0.05), suggesting patch lesions did not alter behavioral strategy during probe tests.”

3. discussion of increased variability includes the possibility of impaired reward memory consolidation. I'm not sure if I can follow the logic here. This is very briefly mentioned with no citation. Some elaboration is needed to clarify this point, or the authors should remove it.

We attempted to raise the possibility that some information regarding reward value could be stored in striosomes, and lesioning of patches may disrupt consolidation of memory that occurs overnight between behavioral sessions (Stickgold and Walker 2007), resulting in day-to-day variability. Indeed, striatum is thought to encode aspects of value (Knutson et al. 2009). Nevertheless, this discussion is somewhat speculative, and for clarity, we have removed this line from discussion. The text now reads:

“This could suggest that patches play a general role in regulating crystallization of motor patterns, thus establishing habits.”

4. The discussion of motor pattern crystalization is interesting. Dezfouli and Balleine published a series of papers examining the relationship between behavioral chunking, sequence learning, and habit formation (e.g. 2013). These should be discussed in relation to the current finding. Also, the work showing DLS contribution to sequence learning (e.g. Yin 2010) seems relevant.

Agreed, this will enrich the discussion. We have included the following text:

“Similarly, during habit formation, discrete behavioral elements may become chunked into larger behavioral sequences with repetition (Graybiel 2008; Lingawi et al. 2016). Indeed, as habits form, the likelihood of a given action to follow a preceding action increases (Dezfouli and Balleine 2013; Matsumoto et al. 1999). Sensory input may therefore drive selection of concatenated behaviors once habits form, and action-outcome contingencies may be updated on a sequence-level (Dezfouli et al. 2014; Lingawi et al. 2016). It is possible that striatal patches may play a role in this aggregation of behavioral elements. Indeed, the striatum has been shown to be critical for expression of innate behavioral sequences (Berridge and Whishaw 1992; Van den Bercken and Cools 1982) and learning of new behavioral sequences is particularly dependent on the lateral striatum (Yin 2010). Further, striatal neurons encode the beginning and ends of behavioral sequences as learning occurs (Cui et al. 2013; Jin et al. 2014; Jin and Costa 2015), with differential contribution of striatal direct and indirect pathways (Geddes et al. 2018). Future studies should investigate correlates of behavioral chunking in patch neurons across habit formation.”

Reviewer #2: This paper examines how lesions of the patch compartment in the dorsal striatum affect habitual control of instrumental performance in mice. While the main research focus has been on lateral versus medial dorsal striatum, the contributions of the patch and matrix compartments are unknown. The authors find that patch lesions induced by caspase 3 result in increased lever pressing variability and altered micro-structure of lever pressing. During selective satiety tests conducted after instrumental training, the lesion animals appear to not differ from sham animals—both seem to show instrumental insensitivity to devaluation. However, during omission tests, lesion animals show lower lever pressing rates, which suggests greater sensitivity to the action-outcome association. This is despite no apparent differences in overall motor activity, as assessed by a number of tests.

Major comments:

The method of reward devaluation is problematic. The reason for using a specific satiety procedure is to control for general satiety, but using chow as a control when the mice are food-deprived does not accomplish that. The mice are food-deprived, and they will consume more chow than sucrose solution. This makes the tests an unfair comparison, because the mice are differentially sated during the tests. If half the mice were trained to earn sucrose and the other half chow, this method would be slightly less problematic. However, given that all mice were trained to earn sucrose solution, the appropriate control is another type of solution (e.g. maltodextrin). This is an especially important point given that the authors’ goal is to elicit habits. Given that the ‘valued’ test is likely associated with increased consumption relative to the ‘devalued’ test, it’s no surprise that the authors found no difference in instrumental performance between valued and devalued test. But it’s unclear if this can be attributed to a habit or asymmetrical consumption—thus undermining the main point of the paper. I recommend running the experiment again, but using a better control for general satiety that equates consumption between valued and devalued tests.

The reviewer raises a valid concern and good points here. It is possible that mice might have consumed more chow than sucrose, or that consumption of chow had a more potent effect on general satiety. Accordingly, many studies have used counterbalanced liquid rewards (sucrose and maltodextrin) to accomplish reward-specific satiety in assessing habitual responding (A. Nelson and Killcross 2006; A. J. Nelson and Killcross 2013). Similarly, groups have employed differing types of solid food with the same goal (O'Hare et al. 2016). Many studies have also used a cross-modality approach (solid and liquid rewards) similar to our own (Gremel and Costa 2013; He et al. 2016; Li et al. 2018; Renteria et al. 2018; Sieburg et al. 2019) and have found significant differences between valuation and devaluation. Indeed, the current approach using home-cage chow and liquid sucrose has been successfully employed in the literature (He et al. 2016; Li et al. 2018). Moreover, anecdotally, our own pilot experiments using C57/BL6 did not reveal statistically significant across-day decreases in lever-press rates. This point led us to suggest that rapid extinguishing across probe days may be an effect of using the Sepw1 line.

An additional concern beyond the reviewer’s was brought to our attention recently: our variable interval schedule had, on average, 60 second delays between response and reward. However, the probe trial duration exceeded this experienced time interval. It is therefore possible that use of 5 min probe trials may be problematic, as intervals beyond the learned variable interval delay may drive rapid extinguishing of behaviors. However, probe trials exceeding delays learned during VI training are also common (e.g., Jenrette et al. 2019). Nevertheless, the mouse line used, length of probe trials, and potentially asymmetric valuation and devaluation may have all culminated in our devaluation probe results. We do feel that including our data is important for the field: many habit researchers have noted across-day decreases in pressing (through word of mouth), though these results are often downplayed in the literature. Discussion of these factors will benefit this work and acknowledge shortcomings of our approach.

We have therefore added a new paragraph to Discussion to address the devaluation result:

“Two factors may contribute to this finding. First, the use of home-cage chow and liquid sucrose rewards could represent an asymmetrical manipulation between devaluation and valuation probe trials, which may have impacted the results of these probe trials. Many studies have used counterbalanced liquid rewards (sucrose and maltodextrin) to accomplish reward-specific satiety in assessing habitual responding (A. Nelson and Killcross 2006; A. J. Nelson and Killcross 2013). Similarly, groups have employed differing types of solid food with the same goal (O'Hare et al. 2016). However, the approach used here has been utilized elsewhere (He et al. 2016; Li et al. 2018) and, more generally, the combination of liquid and solid reinforcers used across devaluation and valuation trials is common (Gremel and Costa 2013; Gremel et al. 2016; He et al. 2016; Li et al. 2018; Renteria et al. 2018). Nevertheless, the lack of a difference in our devaluation and valuation trials could be attributed to this asymmetry in reward, which Sepw1 mice may be particularly sensitive to. Another factor that might have impacted this result was length of probe trial. Our probe trial duration (5 min) greatly exceeded the delay experienced during variable interval training (30-90 sec), which might have resulted in rapid extinguishing of pressing. Other groups have used probe trials that more closely match delay times that mice experienced during training (O'Hare et al. 2016). Therefore, future studies of habit using mice should be mindful of symmetry in designing valuation and devaluation probes, and in length of probes relative to variable interval delays.”

The authors present and analyze normalized pressing rates to gauge habitual responding, but I strongly recommend presenting and analyzing the raw pressing rates for devaluation and omission tests, either instead of or in addition to the normalized rates. Raw pressing rates provide a straightforward measure of performance, while also removing any suspicion associated with normalized data.

This is also a good suggestion. The use of normalized press rates is a common means of controlling for variable baseline press rates between individuals (e.g., Gremel and Costa 2013; Hilario et al. 2007; O'Hare et al. 2016; Renteria et al. 2018; Yin et al. 2004). Our intent was to model this approach, particularly with a dataset with significant variability, as well as to streamline readability of the document. We have nevertheless split Figure 4 into Figure 4 and Figure 5, and provided raw press rates for Devaluation, Reinstatement, and Omission, as well as including additional variance analysis per reviewer #1. The text has been amended to support this change.

Press rates in devaluation, reinstatement, and omission do not differ between groups, suggesting the largest effects are found across days. Together, these results suggest that patch-lesioned mice are more flexible in updating their action-outcome contingencies to new reinforcement schedules across days, which is consistent with our overarching argument: that patch lesions make animals more flexible, either by impacting habit formation or by facilitating goal-directed responding.

Line 262-263: “In the context of the variable interval schedule, a single press followed by head entry is the most efficient strategy to obtain a reward, while head-entries followed by presses are less efficient”. This is confusing. To the naïve reader, it sounds like the authors are saying that on a VI, it is optimal to always check the magazine after a single lever press, and it is not optimal to check the magazine after a series of presses. Is this what the authors are suggesting? If so, it does not make much sense, and needs to be explained. It would help to present a rigorous definition of efficiency.

We apologize for the lack of clarity here. Our attempt was to suggest that the difference in efficiency between these groups may reflect some difference in press/head entry patterns. If the average variable interval is well learned, a mouse that is behaving optimally may wait, press a minimal number of times, and immediately seek feedback following pressing by making a head entry. On the other hand, if mice continually make head entries that did not follow a press, it is less likely these head entries will be rewarded (resulting in reduced efficiency). The language was, however, imprecise and we have attempted to clarify this point by rewording this section. We have also included our definition of efficiency, and attempted to make the language clearer by implementing a simplified discussion of the meaning of the cross-correlation data. We have updated the language in Discussion to match these edits:

“The differences in behavioral efficiency between lesioned and control mice may reflect differences in press/head entry patterns. That is, improved efficiency (press or entry:reward ratio) may reflect animals better learning the interval, pacing presses during the interval, and then making a head entry to determine the outcome of a press (press-check responding). On the other hand, making repeated head entries or entries followed by a press (check-press responding) may be associated with reduced efficiency by mandating multiple entries. We therefore sought to characterize the structure of responding across variable interval training for each of these groups… Correlation at lag -50 suggests presses were predictive of head entries 5 sec later (press-check responding), and correlation at lag 50 suggests head entries were predictive of presses 5 sec later (check-press responding). Lags between these extremes represent correlation at a shorter interval between press and entry rates. Between day 1 and 5, control mice show a change in responding with both an increase in correlation between press-check responses, and an increase in check-press responding (two-way repeated measures ANOVA, both factors repeated measures, significant interaction, F(99,1089) = 4.232, p < 0.0001, significant Bonferroni-corrected post-hoc tests shown on figure; Fig 3G). This suggests that control mice increase stereotyped press-check and check-press sequences, which is accompanied by no change in overall efficiency (Fig 3 C + F). On the other hand, lesioned mice subtly modify their responding across training, with an increased correlation in short latency press-check responding (two-way repeated measures ANOVA, both factors repeated measures, significant interaction, F(99,990) = 3.545, p < 0.0001, significant Bonferroni-corrected post-hoc tests shown on figure; Fig 3H).
Thus, control mice increase both press-check and check-press response patterns that may indicate the emergence of reflexive, stereotyped head-entries. Lesioned mice never increase this check-press behavior and improve their press-check responding, which is associated with increased efficiency. This improvement may suggest that patch lesioned mice maintain goal-directed responding across learning.”
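The lag convention described in the quoted passage can be made concrete with a short sketch. It assumes 100 ms bins (so that lag 50 corresponds to 5 sec, as the quoted text implies) and a plain Pearson correlation at each lag; the function names are hypothetical, and this is not the authors' analysis code. It does not reproduce the repeated-measures ANOVA applied to the resulting curves.

```python
from math import sqrt


def bin_events(times, session_len, bin_size=0.1):
    """Count events (timestamps in sec) in consecutive bins of `bin_size` sec."""
    n_bins = int(round(session_len / bin_size))
    counts = [0] * n_bins
    for t in times:
        idx = int(t / bin_size)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cross_correlation(press, entry, max_lag=50):
    """Correlate press[t] with entry[t - lag] for each lag in bins.

    Negative lags pair presses with LATER entries (press-check);
    positive lags pair presses with EARLIER entries (check-press).
    """
    n = len(press)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            p, e = press[:n + lag], entry[-lag:]
        else:
            p, e = press[lag:], entry[:n - lag]
        result[lag] = pearson(p, e)
    return result
```

With this convention, a mouse that reliably checks the magazine 5 sec after each press would produce a peak at lag -50.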

Line 295-296: “This finding is not consistent with prior reports”. Are the authors suggesting that the decrease in instrumental performance across two days of extinction is inconsistent with prior reports? If so, that is not true. Instrumental performance usually decreases with increased extinction experience.

We apologize for the lack of clarity here. While extinction has been repeatedly shown to drive decreases in response rate (Davis and Bitterman 1971; Yin and Knowlton 2006), significantly decreased response rates across two probe trials are not commonly reported. We have amended this statement accordingly:

“This significant decrease in press rate across subsequent probe trials is not commonly reported.”

For the data in Fig 5B, the results of the interaction test are not reported, and the main effect of group is not statistically significant. Yet, the authors perform a post-hoc test anyway. This seems inappropriate.

We have amended this by removing post hoc analysis to address this concern and those of Reviewer #3. Additionally, we have softened the language in the text by adding the word ‘slightly’ to describe impairments in rotarod, which better reflects the trending effect of lesion in day 1 of rotarod training. We have updated the figure, figure legend, and text accordingly. Similarly, the new figure 5F has also had post hoc tests removed as the interaction was not significant. A significant effect of lesion is now shown on the figure.

The results from the omission test are interesting, and the authors interpret the data to mean that patch lesions disrupt habit formation. An alternative interpretation that should be mentioned is that patch lesions facilitate the emergence of goal-directed control. The standard way of thinking about action control on a VI schedule is that goal-directed control appears early and then erodes with more extensive training to make way for a habit. However, a recent paper (Garr et al., 2019; DOI: 10.1037/xan0000229) has proposed a revision of this belief. Specifically, those authors argue that goal-directed control on VI schedules will eventually emerge with truly extensive training, but is also mediated by the average action-outcome interval. In light of this view, the omission data could be taken to mean that patch lesions facilitate goal-directed control, and for the sake of balance, the authors should mention it.

We absolutely agree, lesions of patches could reflect increases in goal-directed control, which may reflect rapid updating of response rates across omission and probe tests. We have therefore added a section to Discussion to address this possibility:

“While the standard view of habit formation during VI schedules is that goal-directed behaviors degrade as habits are formed, a recent study suggests that with extensive training, goal-directed behavior will eventually emerge (Garr et al. 2019). Therefore, it is possible that, rather than disrupting habit formation, lesions of patches facilitate the emergence of goal-directed control.”

Minor comments:

How were the intervals distributed on the VI schedules? Also, why did the authors choose to implement a lower bound? This seems unusual to me.

VI30 spanned 15-45 sec while VI60 spanned 30-90 sec. Intervals at each trial were randomly selected from a list of possible intervals, each separated by 3 sec (30, 33, 36, etc.). Lower and upper bounds were selected to reflect time intervals across a wide range of times to mirror probability-based random interval schedules. This approach led to escalating press rates similar to previous reports (Fig 2A) (Hilario et al. 2007). To clarify this point we have expanded details in Methods:

“30 seconds (15-45 sec, possible intervals separated by 3 sec)... (rewarded every 60 seconds on average, ranging from 30-90 sec, possible intervals separated by 3 sec)”
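The interval lists described here can be sketched in a few lines. The sketch assumes that each interval is drawn uniformly and independently (with replacement) from the list, which the response does not state explicitly; the function names and the seed are hypothetical.

```python
import random


def vi_intervals(low, high, step=3):
    """Candidate intervals (sec) from `low` to `high` inclusive, `step` sec apart."""
    return list(range(low, high + 1, step))


def draw_schedule(low, high, n_rewards, step=3, seed=None):
    """Draw one interval per reward, uniformly with replacement (an assumption)."""
    rng = random.Random(seed)
    candidates = vi_intervals(low, high, step)
    return [rng.choice(candidates) for _ in range(n_rewards)]


vi30 = vi_intervals(15, 45)  # 15, 18, ..., 45
vi60 = vi_intervals(30, 90)  # 30, 33, ..., 90
session = draw_schedule(30, 90, n_rewards=20, seed=0)
```

Because each list is symmetric around its midpoint, the mean of the candidate intervals is 30 sec for VI30 and 60 sec for VI60, matching the schedule names.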

There is a section titled ‘Probe trials’. I recommend changing the title to ‘Probe tests’, because the task is free-operant without any discrete trials.

Done. Further, we have converted the term “trial” to read test when referring to probe tests, we have deleted the word trial when referencing omission days, and we have only left the word trial when referring to rotarod trials that took place on the same day.

Line 142-143: “Omission is a more robust means of extinguishing habitual responding…”. More robust than what?

This line has been amended to address this and the concern raised by reviewer #3 below:

“Omission is a robust means of testing habitual responding (Davis and Bitterman) [19], and was used to probe goal-directed control”

In Fig 3A, B, D, E, there are 5 lines per plot. I assume that some of those lines represent SEM bounds, but this is not stated in the caption nor is it clearly depicted. Part of the problem is that the figures are blurry.

This may be partly due to compression during submission. Our .PNGs were exported at 300 dpi, which should be standard, publication quality resolution, though we see that the images in the pdf are heavily compressed. There are 6 lines per figure, and as the reviewer assumed, the dotted lines are SEM. We have updated figure legends accordingly:

“Solid lines represent mean and dotted lines of the same color are SEM”

In Fig 4D, it’s unclear what the data are normalized to.

Information about normalization is located in Data Analysis (“Reinstatement press rates were normalized to press rates during the final day of VI60”), but we have amended the figure legend to be more clear:

“Lesioned mice increased responding to a greater extent than controls during reinstatement to the VI60 schedule (data normalized to final day of VI60).”

Line 319-320: “We next analyzed the press rates within the first and second halves of this first omission trial”. What do the authors mean by “trial”? The test is free-operant without any discrete trials, correct?

This was imprecise throughout the text. We have converted the term “trial” to read test when referring to probe tests, we have deleted the word trial when referencing omission days, and we have only left the word trial when referring to rotarod trials that took place on the same day.

Reviewer #3: Re: PONE-D-19-28957

The present manuscript evaluates the contributions of striatal patches to instrumental learning and motor learning. Recent works have shown similar findings, using different approaches. Here, the authors rely on a Cre line expressed in patches, and then use a Cre-dependent caspase virus to delete neurons within a patch. They find that learned lever press responding is somewhat more variable in patch-lesioned mice, and that some aspects of what has been termed habitual control are disrupted. There do not seem to be large differences in motor effects. Together, this may add novel data to the literature about how patches support the learning and performance of actions. There are interesting ideas about stabilizing response patterns, and in the context of Bouton’s recent work this is an interesting finding. A few comments made below would help to clarify and solidify the findings. In addition, there may be concerns about the stability of the Cre line used.

DATA

- I do not believe it is appropriate to use ANOVAs for analysis of the variability data. The data are bimodal in their distribution, and an assumption when using ANOVA is that the data follow a normal distribution. That being said, it is tricky to implement non-parametric tests as well, but probably necessary here. As it stands, it is hard to see the argument for this being appropriate.

This is a good point. Further, these data are distributions of inter-response-intervals, and an analysis of distribution is a more appropriate approach than ANOVA. We therefore reanalyzed this data set utilizing a two-sample Kolmogorov-Smirnov test, which is non-parametric and compares distribution shape. We find that the distribution of inter-press-intervals changes across training in lesioned mice, but not in controls. Lesioned mice increase efficiency (responses/reward) by decreasing stereotyped pressing across training. Control mice, on the other hand, do not increase efficiency, and this is accompanied by no change in IPI distribution. Caspase mice do not significantly alter their distribution of inter-head-entry-intervals across learning, though they display a trend for increased efficiency. Controls significantly alter their distribution of inter-head-entry-intervals by increasing stereotyped entries, which is associated with no increase in efficiency over training.

This analysis has replaced the previous ANOVA results in the text and figure:

“Over training, control mice tend to increase their pressing around 2 sec, though the distribution does not significantly change across training (Two-sample Kolmogorov-Smirnov test, p>0.05; Fig 3A) , while lesioned mice tended to suppress responses at this interval (Two-sample Kolmogorov-Smirnov test, p<0.05, Fig 3B).”

“Control mice significantly alter their distribution of inter-entry-interval, suggesting these mice increase stereotyped head entries across training at 2-4 sec intervals (Two-sample Kolmogorov-Smirnov test, p<0.05; Fig 3D). On the other hand, lesioned mice tended to reduce head entries, though distributions do not significantly change across training (Two-sample Kolmogorov-Smirnov test, p>0.05; Fig 3E). This resulted in a partial increase in head-entry:reward efficiency in lesioned mice (one-sample t-test, t = 1.917, df = 10, p = 0.0842, Fig 3F) and no change in control mice (one-sample t-test, t = 0.4354, df = 11, p = 0.6717, Fig 3F). Together, this suggests that control mice develop a less efficient strategy to obtain rewards relative to lesioned mice, potentially due to emergence of habitual magazine entry across learning in controls, and due to reduced pressing across learning in lesioned mice.”
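For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, the sketch below shows how it compares two sets of inter-response intervals: the statistic D is the largest gap between the two empirical cumulative distributions. The p-value uses a standard asymptotic series approximation, which is an assumption made for illustration; the software and exact method the authors used are not stated here, and the function names are hypothetical.

```python
from math import exp, sqrt


def ks_statistic(a, b):
    """Largest vertical distance between the empirical CDFs of `a` and `b`."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both samples past x so that ties are handled together
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value for D (approximation; less reliable for small samples)."""
    en = sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the value is ~1
    total = sum((-1) ** (k - 1) * exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * total))


# Hypothetical inter-press intervals (sec) from an early and a late session
early = [0.8, 1.1, 1.9, 2.2, 3.5, 4.0, 6.1]
late = [1.8, 2.0, 2.1, 2.3, 2.4, 2.6, 5.0]
d = ks_statistic(early, late)
p = ks_pvalue(d, len(early), len(late))
```

Unlike ANOVA, this comparison makes no assumption that the intervals are normally distributed, which is why it suits bimodal interval data.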

- Continuing with statistics, there are places where post-hoc comparisons in the absence of an a priori hypothesis are not warranted. Following up main effects within a group is fine, but comparisons of data points between groups is not appropriate if there is not a significant interaction. One such place is in the motor rod data, and the use of a post hoc to justify day 1 investigations.

The reviewer is also correct on this point. We have removed this post hoc analysis to address this concern, and that of reviewer #2 above. Additionally, we have softened the language in the text by adding the word ‘slightly’ to describe impairments in rotarod, which better reflects the trending effect of lesion in day 1 of rotarod training. We have updated the figure, figure legend, and text accordingly. Additionally, this was also true of old figure 4F/ new figure 5F. The post hoc tests have been omitted and group effects are now indicated on this panel.

- Reliance on the use of presence/absence of GFP may not be sufficient to make the claim of patch deletion. While two previous papers were cited (one of which one author was on), a quick check of those papers did not alleviate these concerns. It seems like it is quite leaky, and how many cells are taken out of patches (by mu opioid staining) is not clear. This doesn’t seem to be patch selective, but just more expression within a patch.

The reviewer raises a valid concern. Indeed, in our own histological analysis of this line, we have noted Cre expression outside of patches, which can be seen in Figure 1. The work the reviewer mentions characterized these so-called ‘exo-patch’ neurons (Cre+ neurons outside of traditionally defined patches) and found similarities in µ-opioid receptor expression, D1/D2 receptor expression, and embryological development between patches and exo-patches, suggesting they may be similar in nature (supplemental figures 1-3)(Smith et al. 2016). Further, the Sepw1 line has been shown to have preferential output to SNc dopamine neurons, consistent with established patch architecture (Gerfen et al. 2013; Smith et al. 2016). Nevertheless, the language in the current work can be softened to acknowledge the possibility that our lesions likely occurred in patch and “exo-patch” SPNs. Specifically, use of the term “selective” when describing Cre-expression or lesion to patches has been converted to “preferential” throughout the text.

The reviewer’s point about GFP vs MOR expression is also duly noted. We have now performed this stain and reworked Figure 1 to include representative images of MOR expression in intact and lesioned striatum. We see loss of GFP+ cells in dorsal striatum, and diffuse MOR expression in this region that is not well defined into discrete patches (though ventral patches are still present). This stain suggests that both GFP and MOR expression are disrupted following injection of virus encoding caspase 3. The text has been amended to support this change in the figure:

“…an AAV encoding a modified caspase 3 virus to preferentially lesion striatal patches. Injection of AAV led to deletion of GFP+ neurons in the dorsal striatum (Fig 1A-C). Patches have been defined by expression of µ-opioid receptor (MOR; Crittenden and Graybiel, 2011), so we next characterized the expression of MOR in intact and lesioned tissue. GFP+ neurons preferentially aggregate in MOR-enriched striatal patches, though, as previously reported, the Sepw1 line also expresses Cre in “exo-patches”, or striatal neurons outside of patches that are ‘patch-like’ in terms of receptor expression and development (Fig 1 D-F; Smith et al. 2016). Injection of virus encoding caspase 3 led to loss of GFP+ neurons from patches and a reduction of exo-patch neurons in both dorsomedial and dorsolateral striatum. This change was accompanied by diffuse expression of µ-opioid receptor and loss of discrete patch expression in the dorsal striatum (Fig 1G-I).”

CLARITY

-line 45, changes in action-outcomes contingencies are not determined via outcome devaluation, but by probes that examine contingency

True, we have updated the text to be more precise:

“Habits have been studied in animal models by measuring perseverance of instrumental behaviors (e.g. lever pressing) following changes in reward value [9,10], or by measuring flexibility in responding during probes manipulating action-outcome contingency (Davis and Bitterman 1971; Yin and Knowlton 2006).”

- line 70- there is no evidence lesioned mice are “suppressing” unnecessary response.

This has been omitted:

“Additionally, lesioning striatal patches disrupted behavioral stability across training and lesioned mice utilized a more goal-directed behavioral strategy during training.”

-142, line 374, omission is robust at testing for habitual responding, and extinction is observed, but faster in goal-directed control

142: “Omission is a robust means of testing habitual responding (Davis and Bitterman) [19], and was used to probe goal-directed control”

374: “Therefore, in the current study, while extinction was noted in both groups, lesioned mice displayed faster goal-directed control or enhanced flexibility of updating both old and new action-outcome contingencies.”

-line 207- press rates results are discussed, but then followed by number of lever presses, makes looking at the graphs confusing initially.

The reviewer makes a good point here: the use of press number is problematic because the length of the behavioral session changes between days. We have converted this figure to press rate and updated the text accordingly. It now reads:

“Figure 2B+C show the daily press rate of one mouse subtracted from the average press rate for that mouse across VI60 training in both a representative control (Fig 2B) and lesioned mouse (Fig 2C).”
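The conversion described above can be sketched as follows. This is a hedged illustration with hypothetical numbers: the function names, the press counts, and the session lengths are assumptions, and the sign convention (mean minus daily rate) simply follows the literal wording of the revised sentence.

```python
def press_rates(presses, session_minutes):
    """Convert daily press counts to presses per minute."""
    return [p / m for p, m in zip(presses, session_minutes)]


def deviation_from_mean(rates):
    """Subtract each daily rate from the mouse's mean rate across days."""
    avg = sum(rates) / len(rates)
    return [avg - r for r in rates]


# Hypothetical mouse: raw counts rise on day 3 only because the session
# was longer, which is why counts alone are misleading.
presses = [60, 90, 120]
session_minutes = [30, 30, 60]

rates = press_rates(presses, session_minutes)
deviations = deviation_from_mean(rates)
print(rates)
print(deviations)
```

Normalizing by session length puts days of different duration on a common scale, so day-to-day variability reflects behavior rather than session length.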

-line 288 – noticed a significant effect on press rate

“Patch lesions did not significantly impact press rates during devaluation tests”

- for comparisons on presence of “habitual behavior”, omission day press rate should be compared to baseline with a one-sample t-test, to show it in Sham but not caspase.

This has now been performed and is included as the new Figure 5I. Relative to baseline pressing (Reinstatement press rate), sham mice persist in press rates during the first day of omission, indicating habitual behavior. On the other hand, caspase lesioned mice have significantly reduced press rates in omission relative to baseline, suggesting impaired habitual responding. The text and figure legends have been updated to reflect this change. This comment also inspired us to investigate press rates across VI60 to determine if they reflected press rates in omission. Interestingly, there is a significant correlation in control, but not lesioned mice, suggesting lesioned mice rapidly update A-O contingencies in omission (Figure 5G-H).
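A one-sample t-test against baseline, as requested by the reviewer, can be sketched as below. This is not the authors' code: the function names, the normalization of omission press rate to a baseline of 1.0, and the example values are assumptions for illustration. The two-sided p-value is obtained by numerically integrating the Student t density, which avoids any dependency beyond the standard library.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density with Simpson's rule."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3.0  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def one_sample_t(values, mu0):
    """Return (t, df, p) testing whether mean(values) differs from mu0.

    Assumes at least two values with non-zero variance.
    """
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / math.sqrt(n))
    return t, n - 1, two_sided_p(t, n - 1)


# Hypothetical omission press rates, each divided by that mouse's baseline.
normalized = [0.55, 0.70, 0.42, 0.91, 0.63, 0.38, 0.77, 0.60]
print(one_sample_t(normalized, 1.0))

# Consistency check on a statistic quoted earlier in this response
# (t = 1.917, df = 10): the result is about 0.084.
print(round(two_sided_p(1.917, 10), 4))
```

A group whose normalized rates do not differ from 1.0 is behaving as it did at baseline (persistence), whereas a significant drop below 1.0 indicates that responding was adjusted during omission.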

-more should be said about the distribution of patches across striatum, as this is lesions to DMS and DLS, which support seemingly different roles.

We agree. Indeed, when we presented parts of this work at the recent Society for Neuroscience conference, this question was the most commonly asked about the data set. To address this, we have added a brief paragraph to Discussion exploring different functions of medial and lateral patches:

“Importantly, use of a virus encoding caspase 3 at volumes utilized here resulted in loss of patches from the dorsal striatum spanning both medial and lateral subregions (Figure 1). Based on proposed models of striatal functioning, the medial striatum is thought to guide goal-directed behaviors (Yin et al. 2005), whereas the dorsolateral striatum and its dopamine inputs are thought to be necessary for habit formation (Faure et al. 2005; Yin et al. 2004; Yin and Knowlton 2006), though caveats to this view have been reported (Malvaez et al. 2018; Okada et al. 2014). Patches are uniformly distributed across the dorsomedial and dorsolateral striatum, forming extended compartments across anterior and posterior ends (Desban et al. 1993; Johnston et al. 1990; Morigaki and Goto 2016). It is possible that medial and lateral patches have a differential role in habit formation that could reflect the larger medial-lateral divide across the striatum. Future work should investigate this possibility.”

-line 453, does not seem to need a reference, or at least going to the references and seeing reviews is a bit confusing here.

This citation has been removed.

References

Adams C. D. and Dickinson A. (1981) Instrumental responding following reinforcer devaluation. The Quarterly Journal of Experimental Psychology Section B. 33, 109-121.

Berridge K. C. and Whishaw I. Q. (1992) Cortex, striatum and cerebellum: control of serial order in a grooming sequence. Exp. Brain Res. 90, 275-290.

Cui G., Jun S. B., Jin X., Pham M. D., Vogel S. S., Lovinger D. M. and Costa R. M. (2013) Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 494, 238-242.

Davis J. and Bitterman M. E. (1971) Differential reinforcement of other behavior (DRO): a yoked-control comparison. J. Exp. Anal. Behav. 15, 237-241.

Desban M., Kemel M. L., Glowinski J. and Gauchy C. (1993) Spatial organization of patch and matrix compartments in the rat striatum. Neuroscience. 57, 661-671.

Dezfouli A. and Balleine B. W. (2013) Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLoS Comput. Biol. 9, e1003364.

Dezfouli A., Lingawi N. W. and Balleine B. W. (2014) Habits as action sequences: hierarchical action control and changes in outcome value. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 369, 10.1098/rstb.2013.0482.

Dickinson A. (1985) Actions and Habits: The Development of Behavioural Autonomy. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 308, 67-78.

Faure A., Haberland U., Conde F. and El Massioui N. (2005) Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J. Neurosci. 25, 2771-2780.

Garr E., Bushra B., Tu N. and Delamater A. R. (2019) Goal-directed control on interval schedules does not depend on the action–outcome correlation. Journal of Experimental Psychology: Animal Learning and Cognition, No Pagination Specified.

Geddes C. E., Li H. and Jin X. (2018) Optogenetic Editing Reveals the Hierarchical Organization of Learned Action Sequences. Cell. 174, 32-43.e15.

Gerfen C. R., Paletzki R. and Heintz N. (2013) GENSAT BAC cre-recombinase driver lines to study the functional organization of cerebral cortical and basal ganglia circuits. Neuron. 80, 1368-1383.

Graybiel A. M. (2008) Habits, rituals, and the evaluative brain. Annu. Rev. Neurosci. 31, 359-387.

Gremel C. M., Chancey J. H., Atwood B. K., Luo G., Neve R., Ramakrishnan C., Deisseroth K., Lovinger D. M. and Costa R. M. (2016) Endocannabinoid Modulation of Orbitostriatal Circuits Gates Habit Formation. Neuron. 90, 1312-1324.

Gremel C. M. and Costa R. M. (2013) Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat. Commun. 4, 2264.

He Y., Li Y., Chen M., Pu Z., Zhang F., Chen L., Ruan Y., Pan X., He C., Chen X., Li Z. and Chen J. F. (2016) Habit Formation after Random Interval Training Is Associated with Increased Adenosine A2A Receptor and Dopamine D2 Receptor Heterodimers in the Striatum. Front. Mol. Neurosci. 9, 151.

Hilario M. R., Clouse E., Yin H. H. and Costa R. M. (2007) Endocannabinoid signaling is critical for habit formation. Front. Integr. Neurosci. 1, 6.

Jenrette T. A., Logue J. B. and Horner K. A. (2019) Lesions of the Patch Compartment of Dorsolateral Striatum Disrupt Stimulus-Response Learning. Neuroscience. 415, 161-172.

Jin X. and Costa R. M. (2015) Shaping action sequences in basal ganglia circuits. Curr. Opin. Neurobiol. 33, 188-196.

Jin X., Tecuapetla F. and Costa R. M. (2014) Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat. Neurosci. 17, 423-430.

Johnston J. G., Gerfen C. R., Haber S. N. and van der Kooy D. (1990) Mechanisms of striatal pattern formation: conservation of mammalian compartmentalization. Brain Res. Dev. Brain Res. 57, 93-102.

Knutson B., Delgado M. R. and Phillips P. E. M. (2009) Chapter 25 - Representation of Subjective Value in the Striatum. Neuroeconomics, 389-406.

Li Y., Pan X., He Y., Ruan Y., Huang L., Zhou Y., Hou Z., He C., Wang Z., Zhang X. and Chen J. F. (2018) Pharmacological Blockade of Adenosine A2A but Not A1 Receptors Enhances Goal-Directed Valuation in Satiety-Based Instrumental Behavior. Front. Pharmacol. 9, 393.

Lingawi N. W., Dezfouli A. and Balleine B. W. (2016) The Psychological and Physiological Mechanisms of Habit Formation. The Wiley Handbook on the Cognitive Neuroscience of Learning, 409-441.

Malvaez M., Greenfield V. Y., Matheos D. P., Angelillis N. A., Murphy M. D., Kennedy P. J., Wood M. A. and Wassum K. M. (2018) Habits Are Negatively Regulated by Histone Deacetylase 3 in the Dorsal Striatum. Biol. Psychiatry. 84, 383-392.

Matsumoto N., Hanakawa T., Maki S., Graybiel A. M. and Kimura M. (1999) Role of [corrected] nigrostriatal dopamine system in learning to perform sequential motor tasks in a predictive manner. J. Neurophysiol. 82, 978-998.

Morigaki R. and Goto S. (2016) Putaminal Mosaic Visualized by Tyrosine Hydroxylase Immunohistochemistry in the Human Neostriatum. Front. Neuroanat. 10, 34.

Nelson A. J. and Killcross S. (2013) Accelerated habit formation following amphetamine exposure is reversed by D1, but enhanced by D2, receptor antagonists. Front. Neurosci. 7, 76.

Nelson A. and Killcross S. (2006) Amphetamine exposure enhances habit formation. J. Neurosci. 26, 3805-3812.

O'Hare J. K., Ade K. K., Sukharnikova T., Van Hooser S. D., Palmeri M. L., Yin H. H. and Calakos N. (2016) Pathway-Specific Striatal Substrates for Habitual Behavior. Neuron. 89, 472-479.

Okada K., Nishizawa K., Fukabori R., Kai N., Shiota A., Ueda M., Tsutsui Y., Sakata S., Matsushita N. and Kobayashi K. (2014) Enhanced flexibility of place discrimination learning by targeting striatal cholinergic interneurons. Nat. Commun. 5, 3778.

Renteria R., Baltz E. T. and Gremel C. M. (2018) Chronic alcohol exposure disrupts top-down control over basal ganglia action selection to produce habits. Nat. Commun. 9, 21-9.

Sieburg M. C., Ziminski J. J., Margetts-Smith G., Reeve H., Brebner L. S., Crombag H. S. and Koya E. (2019) Reward devaluation attenuates cue-evoked sucrose seeking and is associated with the elimination of excitability differences between ensemble and non-ensemble neurons in the nucleus accumbens. eNeuro.

Smith J. B., Klug J. R., Ross D. L., Howard C. D., Hollon N. G., Ko V. I., Hoffman H., Callaway E. M., Gerfen C. R. and Jin X. (2016) Genetic-Based Dissection Unveils the Inputs and Outputs of Striatal Patch and Matrix Compartments. Neuron. 91, 1069-1084.

Stickgold R. and Walker M. P. (2007) Sleep-dependent memory consolidation and reconsolidation. Sleep Med. 8, 331-343.

Van den Bercken J. H. and Cools A. R. (1982) Evidence for a role of the caudate nucleus in the sequential organization of behavior. Behav. Brain Res. 4, 319-327.

Yin H. H. (2010) The sensorimotor striatum is necessary for serial order learning. J. Neurosci. 30, 14719-14723.

Yin H. H. and Knowlton B. J. (2006) The role of the basal ganglia in habit formation. Nat. Rev. Neurosci. 7, 464-476.

Yin H. H., Knowlton B. J. and Balleine B. W. (2005) Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. Eur. J. Neurosci. 22, 505-512.

Yin H. H., Knowlton B. J. and Balleine B. W. (2004) Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur. J. Neurosci. 19, 181-189.

Yu C., Gupta J., Chen J. F. and Yin H. H. (2009) Genetic deletion of A2A adenosine receptors in the striatum selectively impairs habit formation. J. Neurosci. 29, 15100-15103.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Jeff A Beeler

18 Dec 2019

PONE-D-19-28957R1

Lesion of striatal patches disrupts habitual behaviors and increases behavioral variability

PLOS ONE

Dear Mr. Nadel,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The reviewers are mostly satisfied with the submitted revision, though two minor lingering points need to be addressed. (1) The genetic mutation is not entirely specific to striosomes and is 'leaky.' While this in no way diminishes the value of the reported findings, this needs to be clearly stated in the discussion as a caveat, with a very brief description of how it is not completely specific to striosomes. (2) Another reviewer continues to request raw data. As this does not seem to be a concern for the other reviewers, I would ask that the authors provide me (editor) an explanation of why raw data is not included. The manuscript will not need to go out for review again. The authors' responses will be evaluated promptly by the editor and a final decision rendered.

We would appreciate receiving your revised manuscript by Feb 01 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jeff A Beeler

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: My concerns have been addressed by the authors. The manuscript has been improved considerably. The discussion now includes better acknowledgment of previous work, including the difference between omission and devaluation as assays for habitual performance. The additional discussion of different aspects of habitual behavior is also appropriate. I think the paper is now publishable.

Reviewer #2: Overall, the authors have done a good job at addressing my previous comments. My additional comments are listed below.

- Regarding my primary concern about the devaluation procedure, the authors claim that “the current approach using home-cage chow and liquid sucrose has been successfully employed in the literature (He et al. 2016; Li et al. 2018)”. The results of the He et al. paper alleviate my concern because the mice are food-deprived and when they are given selective satiety tests after training on a random ratio schedule, they show instrumental sensitivity to devaluation. This is encouraging. The Li et al. paper is less relevant because the mice are not reported as food-deprived, so I recommend removing this citation. The authors also appeal to the fact that other researchers have used a combination of liquid and solid reinforcers during devaluation tests, but this does not relate to my concern and distracts from the main issue. My concern is a higher rate of consumption during devalued vs. valued tests (I would be concerned about this even if the authors used wildtype mice, so translating this concern into a caveat should generalize beyond Sepw1 mice). If the assignment of solid and liquid reinforcers is balanced across subjects, this concern goes away (mostly). I strongly recommend the authors rephrase the following paragraph by removing the first sentence and amending the second sentence to mean mice in general (not just Sepw1 mice):

“The combination of liquid and solid reinforcers used across devaluation and valuation trials is common (Gremel and Costa 2013; Gremel et al. 2016; He et al. 2016; Li et al. 2018; Renteria et al. 2018). Nevertheless, the lack of difference in our devaluation and valuation trials could be attributed to this asymmetry in reward, which Sepw1 mice may be particularly sensitive to”.

- I previously recommended that the authors display and analyze the raw pressing rates from the selective satiety tests. However, in the revised paper the authors have only presented and analyzed raw pressing rates from the devalued condition but not the valued condition. I apologize if I was not clear, but this is the standard way of analyzing devaluation test data (comparing valued vs. devalued for each group). Without the raw pressing rates from the valued condition, there is no way to compare figures 4A and 4C. If the authors are going to present the raw pressing rates from the devalued condition, they must also present the raw pressing rates from the valued condition.

Reviewer #3: The authors have largely addressed my concerns. The main one being the leakiness of the line. It still feels a bit misleading to state investigation of a patch deletion on these behaviors. The question of patch-like due to development is interesting, but a separate investigation.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Jan 8;15(1):e0224715. doi: 10.1371/journal.pone.0224715.r004

Author response to Decision Letter 1


18 Dec 2019

Dear Dr. Beeler,

We would like to again thank you and the reviewers for insightful comments and prompt processing of our manuscript. Below we address each comment raised by the reviewers. We have created two new panels in Figure 4, and have updated our GraphPad Prism file accordingly. Moreover, we have included new discussion and edited wording in several places in the document. Again, we feel the work has been made stronger through this process. Below you will find our responses to each concern with reviewer comments italicized, changes to text in bold, and our own comments in plain text.

Reviewer #1: My concerns have been addressed by the authors. The manuscript has been improved considerably. The discussion now includes better acknowledgment of previous work, including the difference between omission and devaluation as assays for habitual performance. The additional discussion of different aspects of habitual behavior is also appropriate. I think the paper is now publishable.

We again thank the reviewer for their constructive feedback on this work. We agree that the manuscript has benefited significantly from the inclusion of these topics.

Reviewer #2: Overall, the authors have done a good job at addressing my previous comments. My additional comments are listed below.

- Regarding my primary concern about the devaluation procedure, the authors claim that “the current approach using home-cage chow and liquid sucrose has been successfully employed in the literature (He et al. 2016; Li et al. 2018)”. The results of the He et al. paper alleviate my concern because the mice are food-deprived and when they are given selective satiety tests after training on a random ratio schedule, they show instrumental sensitivity to devaluation. This is encouraging. The Li et al. paper is less relevant because the mice are not reported as food-deprived, so I recommend removing this citation.

This citation has been removed.

The authors also appeal to the fact that other researchers have used a combination of liquid and solid reinforcers during devaluation tests, but this does not relate to my concern and distracts from the main issue. My concern is a higher rate of consumption during devalued vs. valued tests (I would be concerned about this even if the authors used wildtype mice, so translating this concern into a caveat should generalize beyond Sepw1 mice). If the assignment of solid and liquid reinforcers is balanced across subjects, this concern goes away (mostly). I strongly recommend the authors rephrase the following paragraph by removing the first sentence and amending the second sentence to mean mice in general (not just Sepw1 mice):

“The combination of liquid and solid reinforcers used across devaluation and valuation trials is common (Gremel and Costa 2013; Gremel et al. 2016; He et al. 2016; Li et al. 2018; Renteria et al. 2018). Nevertheless, the lack of difference in our devaluation and valuation trials could be attributed to this asymmetry in reward, which Sepw1 mice may be particularly sensitive to”.

We apologize for the confusion on this point. This has been done. Lines 458-465 now read:

“First, the use of home-cage chow and liquid sucrose rewards could represent an asymmetrical manipulation between devaluation and valuation probe trials, which may have impacted the results of these probe trials. However, the approach used here has been utilized in a previous study, and these mice remained sensitive to devaluation [44]. Nevertheless, the lack of difference in our probes could be attributed to potential asymmetry in consumption before devaluation and valuation probes.”

- I previously recommended that the authors display and analyze the raw pressing rates from the selective satiety tests. However, in the revised paper the authors have only presented and analyzed raw pressing rates from the devalued condition but not the valued condition. I apologize if I was not clear, but this is the standard way of analyzing devaluation test data (comparing valued vs. devalued for each group). Without the raw pressing rates from the valued condition, there is no way to compare figures 4A and 4C. If the authors are going to present the raw pressing rates from the devalued condition, they must also present the raw pressing rates from the valued condition.

We apologize for the confusion here as well. We have now performed this analysis and included two new panels in Figure 4. As shown below, presses between devaluation and valuation trials were not different in either control or lesioned groups (A-B). This result has been updated in Results (Lines 305-307) and in the legend for Figure 4 (Lines 321-322), and the paired Student’s t-test used to compare these is now described in Statistical Analysis (Lines 194-195).

Reviewer #3: The authors have largely addressed my concerns. The main one being the leakiness of the line. It still feels a bit misleading to state investigation of a patch deletion on these behaviors. The question of patch-like due to development is interesting, but a separate investigation.

The reviewer highlights an important limitation of the current work. While the Sepw1 line has been used in many other studies to determine patch function (Crittenden et al. 2016; Smith et al. 2016; Yoshizawa et al. 2018), the off-target labeling of exo-patch neurons remains problematic. Other Cre lines are available and provide viable alternatives, including Mash1-CreER (Bloem et al. 2017), 599-CreER (McGregor et al. 2019), or Oprm1-Cre (Märtin et al. 2019), though each of these lines suffers from some level of exo-patch Cre expression (or expression of Cre in matrix). In acknowledgment of this shortcoming, we have added the following paragraph to the Discussion:

“While Sepw1 NP67 mice have preferential expression of Cre recombinase in patch projection neurons, a limitation of the current work is the expression of Cre in ‘exo-patch’ neurons (Smith et al 2017), resulting in the lesioning of both patch and exo-patch neurons (Figure 1). Exo-patch neurons have similar gene expression and connectivity profiles to patch neurons (Smith et al 2017), but they fall outside of traditionally defined patches (Crittenden et al 2011). The Sepw1 NP67 line has been previously used to study patch connectivity (Smith et al, Crittenden et al 2016) and activity (Yoshizawa et al 2018). Other recent studies have utilized alternative Cre lines to target patches, including Mash1-CreER (Bloem et al 2017), 599-CreER (McGregor et al 2019), or Oprm1-Cre (Märtin et al 2019), though each of these lines also has some off-target labeling of exo-patch or matrix neurons. Thus, while the current work suggests lesions preferentially targeting patches impair aspects of habitual behavior, it remains unknown what role exo-patch neurons play in this process.”

References

Bloem B., Huda R., Sur M. and Graybiel A. M. (2017) Two-photon imaging in mice shows striosomes and matrix have overlapping but differential reinforcement-related responses. Elife. 6, 10.7554/eLife.32353.

Crittenden J. R., Tillberg P. W., Riad M. H., Shima Y., Gerfen C. R., Curry J., Housman D. E., Nelson S. B., Boyden E. S. and Graybiel A. M. (2016) Striosome-dendron bouquets highlight a unique striatonigral circuit targeting dopamine-containing neurons. Proc. Natl. Acad. Sci. U. S. A. 113, 11318-11323.

Märtin A., Calvigioni D., Tzortzi O., Fuzik J., Wärnberg E. and Meletis K. (2019) A Spatiomolecular Map of the Striatum. bioRxiv, 613596. (Preprint)

McGregor M. M., McKinsey G. L., Girasole A. E., Bair-Marshall C. J., Rubenstein J. L. R. and Nelson A. B. (2019) Functionally Distinct Connectivity of Developmentally Targeted Striosome Neurons. Cell. Rep. 29, 1419-1428.e5.

Smith J. B., Klug J. R., Ross D. L., Howard C. D., Hollon N. G., Ko V. I., Hoffman H., Callaway E. M., Gerfen C. R. and Jin X. (2016) Genetic-Based Dissection Unveils the Inputs and Outputs of Striatal Patch and Matrix Compartments. Neuron. 91, 1069-1084.

Yoshizawa T., Ito M. and Doya K. (2018) Reward-Predictive Neural Activities in Striatal Striosome Compartments. eNeuro. 5, 10.1523/ENEURO.0367-17.2018.

Again, we thank you and the reviewers for careful consideration of our work. We look forward to re-review of our manuscript.

We would be grateful to be considered for the Neuroscience of Reward and Decision Making Call for Papers, and we have no changes to our financial disclosure.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 2

Jeff A Beeler

23 Dec 2019

Lesion of striatal patches disrupts habitual behaviors and increases behavioral variability

PONE-D-19-28957R2

Dear Dr. Nadel,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Jeff A Beeler

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Congratulations. You have succeeded in a rigorous review process. I look forward to publication of your paper!

Reviewers' comments:

Acceptance letter

Jeff A Beeler

30 Dec 2019

PONE-D-19-28957R2

Lesion of striatal patches disrupts habitual behaviors and increases behavioral variability

Dear Dr. Nadel:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Jeff A Beeler

Academic Editor

PLOS ONE

Associated Data


    Supplementary Materials

    S1 File. A GraphPad Prism file containing the complete data sets used in this study.

    (PZFX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    Data are all available in a GraphPad Prism file present in the submission.


    Articles from PLoS ONE are provided here courtesy of PLOS
