Author manuscript; available in PMC: 2012 Oct 22.
Published in final edited form as: Nature. 2010 Jul 22;466(7305):457–462. doi: 10.1038/nature09263

Start/Stop Signals Emerge in Nigrostriatal Circuits during Sequence Learning

Xin Jin 1, Rui M Costa 1,2,*
PMCID: PMC3477867  NIHMSID: NIHMS410319  PMID: 20651684

Summary

Learning new action sequences subserves a plethora of different abilities like escaping a predator, playing a piano, or producing fluent speech. Proper initiation and termination of each action sequence is critical for the organization of behavior, and is compromised in nigrostriatal disorders like Parkinson's and Huntington's disease. Using a self-paced operant task in which mice learn to perform a particular sequence of actions to obtain an outcome, we uncovered neural activity in nigrostriatal circuits specifically signaling the initiation or the termination of each action sequence. This start/stop activity emerged during sequence learning, was specific for particular actions, and did not reflect interval timing, movement speed or action value. Furthermore, genetically altering the function of striatal circuits disrupted the development of start/stop activity and selectively impaired sequence learning. These results have important implications for understanding the functional organization of actions, and sequence initiation and termination impairments observed in basal ganglia disorders.


Animal behavior can be organized as sequences of particular actions or movements1,2. The organization of behavior as sequences of actions is complex and requires the precise timing and ordering of movements within a sequence. It also requires the proper initiation and termination of the sequence, i.e. identifying the first and the last elements within the behavioral sequence. Although the study of innate behavioral sequences and fixed action patterns controlled by central pattern generators has received substantial attention3,4, the neural mechanisms underlying the learning and execution of acquired behavioral sequences are still largely unknown. The dorsal striatum and its dopaminergic afferents have been implicated in skill learning5,6 and action “chunking”7,8,9. Importantly, the initiation and termination of sequences of voluntary movements is impaired in disorders affecting the striatum and its dopaminergic inputs, like Parkinson's10,11,12 and Huntington's disease11,13. Consistently, the learning of novel sequences is also compromised in disorders affecting these circuits14,15,16. Furthermore, neuronal activity in prefrontal cortex, which projects to striatum, can change during the signaled initiation and termination of a sequence of saccades17. Although previous studies have reported changes in neural activity in the striatum and the substantia nigra pars reticulata (SNr) during the initiation of natural movement sequences18,19, the role of the striatum and nigrostriatal dopamine in the initiation and termination of newly acquired, self-generated action sequences has not been explored. Here, we show that as mice learn to perform a particular behavioral sequence, neural activity specifically signaling the self-paced initiation or termination of the newly acquired sequence emerges in nigrostriatal circuits. Consistently, genetically manipulating the function of these circuits disrupts the development of neural activity signaling sequence initiation or termination, and affects sequence learning.

Mice learn a specific sequence of actions

A group of mice (n = 14) were trained in a self-paced operant task in which a fixed number of lever presses (fixed-ratio eight, FR8) would earn a sucrose solution reward with no explicit stimuli signaling the correct sequence length or reward availability (see Methods). The average lever press rate increased with training (Supplementary Fig. 1; P < 0.01), while the behavior of the mice became more robustly organized as discrete sequences of about eight presses (Fig. 1a, b). The average number of presses per sequence increased significantly from 5.3 ± 0.3 presses on day 1 to 7.6 ± 0.3 and 8.1 ± 0.3 on days 6 and 12, respectively (Fig. 1c; P < 0.01); and after training it was no longer different from eight (day 1: P < 0.01; days 6,12: P > 0.05). Concomitantly, the average temporal duration of a sequence decreased during the first six days of training (P < 0.01), and subsequently reached a steady level (Fig. 1d). The inter-sequence-interval (ISI) also decreased with training (Fig. 1e; P < 0.01), while the within-sequence press rate increased (Fig. 1f; P < 0.05). Importantly, the variability (measured by coefficient of variation) of sequence length, sequence duration, ISI, and within-sequence press rate all decreased after six days of training (Fig. 1g–j; P < 0.01 for all). It is unlikely that mice used explicit sensory cues to determine when a reward would be available because often the reward was delivered in the middle of a lever-press sequence and mice continued to press before checking the magazine and consuming the reward (Fig. 1b inset), indicating that the initiation and termination of a sequence of presses were largely self-determined. These results reveal that a robust action sequence structure emerged with training.
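The variability measure used above, the coefficient of variation, is the standard deviation divided by the mean. A minimal sketch follows; the function name `coefficient_of_variation` is hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return s.d. / mean for a list of measurements.

    Assumption: sample (n - 1) standard deviation; the source does not
    specify the estimator. Requires at least two values and a non-zero mean.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Illustrative (made-up) sequence lengths from one session.
print(coefficient_of_variation([7, 8, 8, 9, 8, 10, 6]))
```

A lower value indicates more stereotyped behavior, which is the sense in which variability is said to decrease with training.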

Figure 1. Mice learn to perform a specific sequence of actions.

a, b, Example of the microstructure of the behavior of a mouse during a session on (a) the first day of FR8 training and (b) the last day of FR8 training. Each dot indicates a lever press, with red and black color indicating the first and final press of each individual sequence. The black and red solid lines on the X axis represent magazine entries and licks, respectively. The black dashed lines indicate the reward timing and corresponding lever press. c, Average number of lever presses per sequence, d, sequence duration, e, inter-sequence interval, and f, within-sequence press rate, during the first, sixth and twelfth day of FR8 training. g–j, Variability, measured as coefficient of variation (CV), of (g) sequence length, (h) sequence duration, (i) inter-sequence interval, and (j) within-sequence press rate during the first, sixth and twelfth day of FR8 training. Error bars denote s.e.m., same for all figures below.

Start/Stop activity in nigrostriatal circuits

We recorded the neural activity in dorsal striatum and substantia nigra (dopaminergic neurons in compacta - SN DA, and GABAergic neurons in reticulata, SN GABA) during the emergence of action sequences (Methods, Supplementary Fig. 2, Supplementary Table 1). Putative neuronal cell types were classified based on spike waveform, firing properties and pharmacology; classification was further confirmed using genetic20,21 and optogenetic tools22,23 (Supplementary Figs. 3–6, 16, 17). We found that striatal MSNs (medium spiny neurons) and SN GABA neurons could either exhibit a phasic increase in firing frequency before lever pressing (Fig. 2a, d) or a decrease of firing during lever pressing (Fig. 2b, c). Interestingly, SN DA neurons usually showed a phasic increase but never a decrease in firing activity before lever pressing (Fig. 2e, Supplementary Fig. 7). More than one third of all the recorded MSNs and more than half of SN GABA and SN DA neurons showed lever press-related activity (Fig. 2f, see statistics in Supplementary Information). However, there was no significant change in the proportion of neurons showing press-related activity as training progressed (P > 0.05 for all), indicating that the changes observed in sequence behavior (Fig. 1) were not caused by overall changes in press-related activity of these cell types.

Figure 2. Lever press-related activity in nigrostriatal circuits.

a, b, Examples of MSNs showing (a) positive and (b) negative modulation of firing rate in relation to lever pressing. c, d, Examples of (c) positive and (d) negative modulation of firing rate of SN GABA neurons relative to lever pressing. e, Example of positive modulation of firing rate in a SN DA neuron before lever pressing. For each panel, top: raster plot where each dot indicates a spike; bottom: PETH for the neuron; time zero is the time of lever pressing. f, Percentage of MSNs (left panel), SN GABA (middle panel) and SN DA (right panel) neurons displaying press-related activity throughout learning, respectively.

Interestingly, when we investigated activity changes relative to each of the different lever presses within a sequence, we found that many MSNs displayed a phasic increase in firing rate preceding the first press, which was significantly higher than the rate increase preceding the other presses (Fig. 3a; P < 0.01, same neuron as Fig. 2a). Similar sequence initiation specific activity was found in SN GABA neurons (Fig. 3b; P < 0.01, same neuron as Fig. 2d) and SN DA neurons (Fig. 3c; P < 0.01, same neuron as Fig. 2e). Moreover, we also found neurons in all three areas that significantly modulated their activity selectively before the final press of a sequence (Fig. 3d; P < 0.001). Few neurons signaled both sequence initiation and termination (Fig. 3e; P < 0.01), compared to those signaling just initiation or termination (Fig. 3i, j, k, P < 0.01 for all cell types). Importantly, the proportion of neurons showing start/stop related neural activity was much higher than that of neurons with activity selectively related to the middle presses within a sequence (day 12: P < 0.01 for all three areas), which was very low (Supplementary Fig. 8).

Figure 3. Neural activity signaling the initiation and termination of action sequences emerges in nigrostriatal circuits during learning.

a–c, Example of (a) a striatal MSN, (b) a SN GABA neuron and (c) a SN DA neuron showing significantly higher phasic firing selectively before the first press of a sequence. d, Example of a striatal MSN showing a phasic increase in activity preferentially before the last press of a sequence. e, Example of a SN GABA neuron showing both sequence initiation and termination related activity. For each panel, top: raster plot where each black dot indicates a spike and each orange triangle represents a lever press; bottom: PETH for the neuron. From left to right, each column shows the PETH of the same neuron related to the 1st, 2nd, 3rd and 3rd to final, 2nd to final and final press of each sequence. f–h, Proportion of (f) striatal MSNs, (g) SN GABA neurons and (h) SN DA neurons exhibiting sequence start/stop related activity (including just initiation, just termination or both) throughout training. i–k, Proportion of striatal MSNs (i), SN GABA neurons (j), and SN DA neurons (k) displaying activity signaling sequence initiation, termination, or both initiation and termination (day 12).

Start/Stop activity emerges with learning

Surprisingly, although there was no significant change in the proportion of press-related neurons during training (Fig. 2), the percentage of sequence start/stop related activity in striatal MSNs increased from 16.0 ± 2.5 % on day 1 to 28.4 ± 3.2 % and 27.8 ± 2.9 % on days 6 and 12, respectively (Fig. 3f, P < 0.01, detailed characterization in Supplementary Fig. 9). This increase was mainly observed from day 1 to day 6 (P < 0.01), consistent with the time course of sequence learning (Fig. 1c–f). Furthermore, this increase in start/stop activity was more prominent in sensorimotor striatum than in associative striatum (Supplementary Fig. 10). A similar learning-related increase in sequence start/stop related activity was observed in SN GABA neurons (Fig. 3g; 20.6 ± 2.7 %, 40.6 ± 2.5 % and 39.4 ± 3.3 % for days 1, 6 and 12 respectively, P < 0.01) and SN DA neurons (Fig. 3h; 27.1 %, 36.0 % and 44.4 % for days 1, 6 and 12 respectively, χ² test, P < 0.05).

Start/Stop activity is action specific

We excluded the possibility that the emergence of start/stop activity reflected interval timing 24 (Supplementary Fig. 11) or stemmed directly from the shortening of inter-press intervals and sequence duration during training (Supplementary Fig. 12).

We also investigated if this start/stop activity would be related to different value expectation during the first and last presses of a sequence compared to the middle presses25,26. We designed an experiment with two levers leading to different reward magnitudes under FR8 (Methods). Mice perceived the different reward amounts as indicated by the number of licks during the small and large reward sessions (Fig. 4a; P < 0.01; Supplementary Fig. 13), and learned the relation between a particular action sequence and the corresponding reward magnitude as indicated by their choice in two-lever extinction tests (Fig. 4b; P < 0.01 for days 7 and 13).

Figure 4. Sequence start/stop related activity does not reflect differences in expected value and can be action-specific.

a, Total number of licks for left and right single-lever FR8 forced choice sessions, where the left lever sequences led to a small reward and the right lever to a larger reward; after six days the lever-reward magnitude contingency was switched. b, Mice prefer the lever leading to larger reward during two-lever choice extinction tests on day 7 and day 13. c, Percentage of striatal MSNs, SN GABA neurons, and SN DA neurons exhibiting action-specific start/stop related activity. d, g, j, Firing rate modulation of (d) striatal MSNs, (g) SN GABA neurons and (j) SN DA neurons in relation to lever pressing during small and large reward sessions, and e, h, k for corresponding population average. f, i, l, The rate modulation for each press within the action sequence for (f) striatal MSNs, (i) SN GABA neurons, and (l) SN DA neurons.

We compared the firing rate modulation of each neuron in the small versus large reward magnitude sessions (days 6 and 12, Fig. 4d, g, j, Supplementary Fig. 14). Consistent with previous studies 27,28, dopaminergic neurons showed significantly higher firing rate modulation during lever pressing for a large versus a small reward (Fig. 4k, 8.0 ± 1.7 Hz vs. 5.2 ± 1.0 Hz, P < 0.05). Striatal MSNs (Fig. 4e, P > 0.05) and SN GABA neurons (Fig. 4h, P > 0.05) as a population did not show different firing rate modulation during pressing for large versus small rewards.

Importantly, the higher firing rate modulation of dopaminergic neurons during large reward magnitude sessions was observed for every press within a sequence (Fig. 4l; P < 0.05), but not particularly during the first or last presses of a sequence (no differences in rate modulation across different presses, P > 0.05). Similar results were observed for striatal MSNs and SN GABA neurons (Fig. 4f, i; P > 0.05 for both), even when considering only the neurons showing preferential modulation by large or small reward magnitude (Supplementary Fig. 15). These data suggest that sequence start/stop related activity cannot be easily attributed to differences in reward expectation during the first or last presses of a sequence.

Since in this two-lever task mice are performing a similar sequence of presses on each lever, we asked whether the start/stop related activity of each neuron would be specific for a particular action (e.g. for initiating a sequence in the left but not the right lever), or generalized for similar actions. Most neurons with press-related activity for both levers showed differential start/stop activity between left and right lever sequences (67.8 ± 8.5 %, 74.2 ± 3.4 %, and 70.8 % for MSNs, SN GABA and SN DA neurons respectively, Fig. 4c; similar across cell types P > 0.05, Kruskal-Wallis), suggesting that sequence start/stop activity is action specific.

Disrupting start/stop activity impairs sequence learning

The learning of novel sequences is compromised in disorders affecting striatal circuits14,15,16. In the striatum, NMDA receptors are important for dopamine-dependent plasticity at glutamatergic synapses29,30, and for MSN burst firing31 (Supplementary Figs. 16, 17), as seen during action sequence initiation and termination (Fig. 3a, d). We investigated whether disrupting NMDA receptor function specifically in the striatum would affect the development of sequence start/stop activity in this structure. We used a Cre-loxP strategy to generate mutant mice with a striatal-specific deletion of NMDAR1 (striatal NR1-KO, Methods)20, and recorded striatal activity in striatal NR1-KO and littermate controls during 6 days of FR8 training (n = 16 for KO and 10 for CT, Methods). Striatal NR1-KO mice learned to lever press to obtain sucrose (Fig. 5a), although there were differences between genotypes (P < 0.01, mainly due to larger ISI but not inter-press interval in KO, see below).

Figure 5. Striatal-specific deletion of NMDA receptors disrupts the development of start/stop activity and impairs sequence learning.

a, Average lever pressing rate per session during three days of CRF followed by six days of FR8 training for striatal NR1-KO mutants and their littermate controls. b, Proportion of MSNs in striatal NR1-KO mutants and littermate controls displaying press-related activity during early and late stages of training. c, Proportion of MSNs in striatal NR1-KO mutants and littermate controls exhibiting sequence start/stop related activity during the early and late stages of training. d, e, Example of the behavior microstructure of the same striatal NR1-KO mouse during (d) the first day of FR8 training, and (e) the sixth day of FR8 training. All markers and insets are the same as used in Fig. 1a, b. f–i, Average (f) sequence length, (g) duration, (h) ISI and (i) within-sequence press rate for mutants and littermate controls. j–m, CV of sequence (j) length, (k) duration, (l) ISI and (m) within-sequence press rate during the first and sixth day of FR8 training for striatal NR1-KO mice and littermate controls.

Similarly to what we observed in C57BL/6J mice (Fig. 2f), about 40% of striatal MSNs displayed lever press-related activity in both striatal NR1-KO mice and littermate controls, and this proportion did not change with training (Fig. 5b; P > 0.05; Supplementary Table 2). However, the percentage of neurons with start/stop activity was significantly lower in striatal NR1-KO mice than in controls (Fig. 5c; P < 0.01), and did not increase with training in the mutants (P > 0.05) as it did in littermate controls (P < 0.05, similar to C57BL/6J mice, Fig. 3f). Interestingly, this deficit was more apparent in sensorimotor striatum (Supplementary Fig. 18).

Given the deficits in the development of start/stop activity in striatal NR1-KO, we examined sequence learning in mutants and control littermates (n = 14 for KO and 19 for CT, Methods). We found that striatal NR1-KO exhibited little evidence of sequence learning compared with controls (Fig. 5d, e, see also Fig. 1a, b), as indicated by similar sequence length between days 1 and 6 in KO (Fig. 5f; P > 0.05). Also, unlike in controls, the sequence length after training was significantly different from eight presses in mutants (Fig. 5f; P < 0.01). Furthermore, sequence duration also did not change from day 1 to day 6 in striatal NR1-KO as it did in controls (Fig. 5g; P > 0.05 for KO, P < 0.01 for CT). This impairment in sequence learning did not stem from any obvious motor impairments per se in striatal NR1-KO mice, because within-sequence press rate was similar between KO mice and littermate controls (Fig. 5i; P > 0.05), and the ISI decreased with training in KO mice (Fig. 5h; P < 0.01). Importantly, the variability of sequence behavior for each animal was generally higher in KO, and did not diminish as much with training as in controls (Fig. 5j; sequence length: KO P < 0.01; Fig. 5k; sequence duration: KO P > 0.05; Fig. 5l; ISI: KO P = 0.05; Fig. 5m; within-sequence press rate: KO P > 0.05; CT P < 0.01 in all cases).

Discussion

In this study, by investigating the behavioral microstructure and correlated neural activity in a self-paced operant task, we found that neurons in nigrostriatal circuits can signal the initiation and termination of self-paced action sequences. This sequence start/stop neural activity emerged as animals learned a specific action sequence, and was specific for particular actions. Importantly, this activity did not reflect interval timing, changes in inter-press interval during learning, or differences in expected value during the first or last actions of a sequence. Furthermore, a striatal-specific manipulation that affects plasticity and phasic firing in MSNs20 (Supplementary Fig. 17) impaired the development of start/stop activity in the striatum and selectively disrupted the learning of action sequences without affecting movement speed (Fig. 5i), or the ability to discriminate action value (Supplementary Fig. 17). These data underscore the importance of basal ganglia in learning and crystallizing action sequences7,8,32, and expand on previous studies showing changes in striatal activity related to initiation of cue-guided movements33–36, and sequence learning impairments in patients with focal basal ganglia lesions16, Parkinson's15 and Huntington's disease14.

Phasic firing of dopaminergic neurons has been widely studied as a reward prediction error signal37,38, and our data confirmed that phasic firing of DA neurons before lever pressing can encode the expected value of the outcome27,28. However, the possible function of phasic dopamine signals in self-paced behavior, namely in the initiation and termination of specific action sequences, or in the transition between different actions, has been somewhat neglected39. Our data show that, in addition to striatal MSNs and SN GABA neurons, the phasic activity of SN DA neurons can signal the initiation and termination of specific action sequences. Hence, although the role of dopamine in motor performance and in Parkinson's has been mostly associated with tonic dopamine38, our findings may have implications for the deficits in initiation and termination of voluntary movement sequences observed in Parkinson's disease10,11,12.

These findings are therefore relevant for understanding the organization of actions, and also a variety of sequence learning and execution deficits resulting from basal ganglia dysfunction11–16.

METHODS

Animals

All experiments were approved by the NIAAA ACUC and the Portuguese DGV, and done in accordance with NIH and European guidelines. C57BL/6J male mice between 3 and 6 months old, purchased from the Jackson Laboratory at 8 weeks of age, were used in the experiments using WT mice. Striatal-specific NMDAR1-knockout (KO) mice and control littermates were generated by crossing RGS9-cre mice with NMDAR1-loxP mice, as formerly described in Dang et al.1. The behavioral experiments using striatal NR1-KO mice were performed on 8- to 12-week-old male and female RGS9-cre + / NMDAR1-loxP homozygous mice and all the controls were their littermates, including RGS9-cre +, RGS9-cre + / NMDAR1-loxP heterozygous and NMDAR1-loxP homozygous mice. There was no difference between the three control groups so the data were combined. TH-cre/NR1 flox mice were generated by crossing TH-cre mice (in which Cre expression in the midbrain is localized specifically in dopaminergic neurons)2 with NMDAR1-loxP mice, as described3. A cre-inducible adeno-associated virus (AAV) vector carrying the gene encoding the light-activated cation channel channelrhodopsin-2 (ChR2) and a fluorescent reporter4 was stereotactically delivered into the SNc of TH-cre mice2, enabling specific expression of ChR2 in DA-containing neurons (TH-ChR2)4.

Behavior training

Behavior training and testing took place in operant chambers as described previously5. Briefly, each chamber (21.6 cm L × 17.8 cm W × 12.7 cm H) was housed within a sound attenuating box (Med-Associates, St. Albans, VT) and equipped with two retractable levers on either side of the food magazine and a house light (3 W, 24 V) mounted on the opposite side of the chamber. Sucrose solution (10 %) was delivered into a metal cup in the magazine through a syringe pump. Magazine entries were recorded using an infrared beam and licks using a contact lickometer. Mice were placed on food restriction throughout training, and fed daily after the training sessions with ~2.5 g of regular food to allow them to maintain a body weight of around 85 % of their baseline weight.

Training started with a 30 minute magazine training session in which the reinforcer was delivered on a random time schedule, on average every 60 seconds (30 reinforcers). The following day lever-pressing training started with continuous reinforcement (CRF), in which animals obtained a reinforcer after each lever press. The session began with the illumination of the house light and insertion of the lever, and ended with the retraction of the lever and the offset of the house light. On the first day of CRF the session lasted 45 minutes or until mice received five reinforcers, the second day of CRF lasted 45 minutes or until mice received 10 reinforcers, and the last day of CRF lasted 45 minutes or until mice received 15 reinforcers. After three days of CRF, animals started to be trained (day 1) on a fixed ratio schedule in which eight presses earn a reinforcer (FR8), without any stimulus signaling when eight presses were completed or when the reinforcer was delivered; this training continued for twelve days. To train animals in a two-lever, two-reward magnitude task, every day animals had two single-lever training sessions, one for the left lever and another for the right lever. Throughout training, one of the levers delivered a small reward (15 μl of solution) after eight presses, while the other delivered a large reward (50 μl of solution) after the same number of presses, and the order of the daily sessions was counterbalanced. After six days of training, animals were given a choice test in extinction with both levers presented for 5 minutes without reward (day 7). Starting the subsequent day, the contingency between lever and reward magnitude was reversed for the following six days, and another extinction test was conducted at the end of training (day 13). The animals were trained daily without interruption and every day the training started at approximately the same time. All timestamps of lever presses, magazine entries and licks for each animal were recorded with a 10 ms resolution. The training chambers and procedures for training in striatal NR1-KO and littermate controls were exactly the same as used for C57BL/6J mice.
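The fixed-ratio contingency described above can be illustrated with a short sketch. This is only an illustration of the FR8 rule (every eighth press earns a reinforcer), not the code that controlled the operant chambers; the function name `fixed_ratio_rewards` and the example timestamps are hypothetical.

```python
def fixed_ratio_rewards(press_times, ratio=8):
    """Return the press times at which a reinforcer would be delivered.

    Under a fixed-ratio schedule the counter resets after each reinforcer,
    so with ratio=8 the 8th, 16th, 24th, ... presses are rewarded.
    No stimulus is modelled, matching the unsignaled task in the text.
    """
    rewards = []
    count = 0
    for t in press_times:
        count += 1
        if count == ratio:
            rewards.append(t)
            count = 0
    return rewards


# Hypothetical session: 17 presses, one per second starting at t = 1 s.
presses = list(range(1, 18))
print(fixed_ratio_rewards(presses))
```

Setting `ratio=1` reproduces continuous reinforcement (CRF), in which every press is rewarded.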

Behavior sequence

The beginning and end of each sequence of lever presses were determined either by the statistics of lever pressing for each animal (either bimodal or Poisson distribution, on average a 20 s pause between sequences) or by a bout of licks interrupting lever pressing. The sequence length and duration were thus calculated for each individual sequence, and the within-sequence press rate was computed as the ratio of sequence length (≥ 2 presses) to the corresponding sequence duration. The inter-sequence-interval was defined as the time between two successive sequences. The mean within-sequence inter-press interval was calculated from inter-press intervals within each individual sequence and averaged for all sequences in each animal per session.
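A simplified sketch of this segmentation is given below. It makes several assumptions that go beyond the text: a single fixed pause threshold (20 s, the reported average) replaces the per-animal criterion derived from each animal's press statistics, lick bouts are ignored, and the inter-sequence interval is taken from the last press of one sequence to the first press of the next. The function names are hypothetical.

```python
def split_sequences(press_times, pause=20.0):
    """Group sorted press timestamps (s) into sequences.

    Assumption: a new sequence starts whenever the gap since the previous
    press exceeds `pause`; the per-animal criterion and lick bouts used in
    the study are not modelled here.
    """
    sequences = []
    current = []
    for t in press_times:
        if current and t - current[-1] > pause:
            sequences.append(current)
            current = []
        current.append(t)
    if current:
        sequences.append(current)
    return sequences


def sequence_metrics(sequences):
    """Return (lengths, durations, press rates, inter-sequence intervals)."""
    lengths = [len(s) for s in sequences]
    durations = [s[-1] - s[0] for s in sequences]
    # Press rate is only defined for sequences of >= 2 presses.
    rates = [len(s) / (s[-1] - s[0])
             for s in sequences if len(s) >= 2 and s[-1] > s[0]]
    # Assumed ISI definition: last press of one sequence to first of the next.
    isis = [b[0] - a[-1] for a, b in zip(sequences, sequences[1:])]
    return lengths, durations, rates, isis


presses = [0.0, 1.0, 2.0, 3.0, 30.0, 31.0, 32.0]
print(sequence_metrics(split_sequences(presses)))
```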

Surgery

Electrophysiological data in the C57BL/6J experiment were collected from eleven mice. Each of them was implanted ipsilaterally with two electrode arrays, one targeting the dorsal striatum and the other the substantia nigra. The main electrode design used in this study consists of an array of 2 × 8 platinum-coated tungsten microwire electrodes (35 or 50 μm diameter)6. For dorsal striatum, tungsten microwire electrodes of 50 μm diameter with 150 μm spacing between microwires, and 250 μm spacing between rows were used. The 8 more medial electrodes targeted the more medial area of dorsal striatum (associative subregion), while the 8 more lateral electrodes targeted the more lateral region of dorsal striatum (sensorimotor subregion)6. For substantia nigra, tungsten microwire electrodes of 35 μm diameter with 150 μm spacing between electrodes and 150 μm spacing between rows were used. In some experiments the array used for substantia nigra was cut at a 30 to 45 degree angle to better fit the medial-lateral anatomy of the substantia nigra.

The craniotomies were made at the following coordinates: 0.5 mm rostral to bregma and 1.8 mm laterally for dorsal striatum; 3.4 mm caudal to bregma and 1.0 mm laterally for substantia nigra. During surgeries, the microwire arrays were gently lowered ~ 2.2 mm from the surface of the brain for dorsal striatum and ~ 4.2 mm for substantia nigra, while simultaneously monitoring neural activity. Final placement of the electrodes was monitored online during the surgery based on the neural activity, and then confirmed histologically at the end of the experiment after perfusion with 10 % formalin, brain fixation in a solution of 30 % sucrose and 10 % formalin, followed by cryostat sectioning (coronal slices of 40 – 60 μm), and cresyl violet staining (Supplementary Fig. 2).

In the TH-ChR2 experiment, the virus was injected into the SNc through a glass pipette (using Nanoject II, Warner Instruments) into two sites: 3.4 mm caudal to bregma, 1.0 mm laterally and ~ 4.1 mm and ~ 4.3 mm below the dura, respectively. We injected 0.3 μl of purified virus per site. A guide cannula terminating 300 μm above the injection/recording site was implanted attached to the electrode array, allowing simultaneous electrophysiological recordings and light stimulation, which was delivered through an optical fiber (200 μm core diameter, 0.37 N.A., Thorlabs Inc., NJ) with a diode-pumped solid-state laser (473 nm, LaserGlow Tech Inc., Canada) controlled by TTL pulses (10 ms). The measured output at the tip of the 200 μm fiber was approximately 60 mW.

Neural recordings during operant learning

The animals with implanted electrodes were allowed to recover for 2 to 3 weeks after surgery before training started. The training procedure was exactly the same as described above for the animals only undergoing behavioral testing. Some animals took longer to acquire the task due to the mechanics of the recording wires. In those cases, the data used as day 1 of FR8 training was defined as the first day in which the animal obtained ten or more reinforcers, and days 6 and 12 are the sixth and twelfth day after that. Surgery and electrode array implantation for striatal NR1-KO and littermate controls was the same as for C57BL/6J, but with only one array instead of two per KO animal (for easier training), targeted to striatum. Since striatal NR1-KO were severely impaired in learning and executing sequences of lever presses, and this was more severe with the headstage and recording cables connected, the neural data of striatal NR1-KO mutants and littermate controls at different training stages was acquired using a between-animal design, i.e. one group was trained without cables and recorded during the first day they earned more than 10 reinforcers during FR8, and another group was trained without cables and was recorded after 6 consecutive days earning more than 10 reinforcers. There were 6 NR1-KO recorded during the early training stage and 10 NR1-KO recorded during late training stage, and 10 littermate mice during early and the same number during late training.

Neural activity was recorded using the MAP system (Plexon Inc., TX). The spike activity was initially sorted using an online sorting algorithm (Plexon Inc.). Only cells with a clearly identified waveform and relatively high signal-to-noise ratio were used6,7. At the end of the recording, cells were resorted using an offline sorting algorithm (Plexon Inc.) to isolate single units6,7. Single units displayed a clear refractory period in the inter-spike interval histogram, with no spikes during the refractory period (larger than 1.3 ms). TTL pulses were sent from a Med-Associates interface board to the MAP recording system through an A/D board (Texas Instrument Inc., TX) so that the animal's behavioral timestamps during operant conditioning were synchronized and recorded together with the neural activity. In order to characterize if SN neurons were dopaminergic neurons, a D2 receptor agonist (quinpirole, 1–2 mg/kg, Sigma Inc., MO) was injected intra-peritoneally at the end of the sessions. Neural activity was recorded for 1 to 2 hours before and after injection for comparison.
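The refractory-period criterion for single units can be checked with a few lines of code. The sketch below is not the Plexon sorting procedure; it only counts inter-spike intervals shorter than the 1.3 ms stated above, and the function name `refractory_violations` is hypothetical.

```python
def refractory_violations(spike_times, refractory=0.0013):
    """Count inter-spike intervals (s) shorter than the refractory period.

    A well-isolated single unit, by the criterion in the text, should
    yield zero violations.
    """
    spikes = sorted(spike_times)
    return sum(1 for a, b in zip(spikes, spikes[1:]) if b - a < refractory)


# Hypothetical spike train with one interval of 1 ms (a violation).
print(refractory_violations([0.0, 0.001, 0.010, 0.020]))
```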

Neural recordings in anesthetized mice

Striatal NR1-KO homozygous mice and control littermates were used. Recordings were performed using the Plexon MAP system (Plexon Inc., TX) with the animals under isoflurane anesthesia (1.0–1.2 %). The electrode arrays used were the same as those used for in vivo freely moving recordings6,7, and a skull screw was used as a ground. The coordinates were the same as those used for striatal recording in behaving mice: 0.5 mm rostral to bregma and 1.8 mm lateral to midline. All the units recorded within a depth of 2.0–2.7 mm below the cortical surface were then classified as dorsal striatum MSNs or interneurons based on waveform, firing rate, and activity pattern (Supplementary Fig. 3, also Supplementary Fig. 16)7. Only stable recordings lasting more than ten minutes were used for further analysis. The burst-like phasic spontaneous activity in striatal MSNs was defined as two or more spikes occurring with an inter-spike interval of less than 125 ms and terminated by an inter-spike interval of more than 280 ms. The spike-triggered average was calculated by averaging the LFP in a time window from 1 s preceding to 1 s following each spike.
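The spike-triggered average described above can be sketched as follows. This is an illustrative Python version, not the Matlab code used in the study; the function name, the uniformly sampled LFP, the sampling rate argument and the decision to skip spikes whose window runs past the recording edges are all assumptions.

```python
def spike_triggered_average(lfp, fs_hz, spike_times_s, window_s=1.0):
    """Average the LFP from window_s before to window_s after each spike.

    lfp: sequence of LFP samples, assumed uniformly sampled at fs_hz.
    spike_times_s: spike times in seconds from the start of the LFP trace.
    Returns a list of 2 * half + 1 averaged samples (empty if no spike fits).
    """
    half = int(round(window_s * fs_hz))
    segments = []
    for t in spike_times_s:
        center = int(round(t * fs_hz))
        # Assumption: drop spikes whose window would extend beyond the trace
        if center - half < 0 or center + half >= len(lfp):
            continue
        segments.append(lfp[center - half:center + half + 1])
    if not segments:
        return []
    n = len(segments)
    # Average sample by sample across all spike-aligned segments
    return [sum(samples) / n for samples in zip(*segments)]
```

The middle element of the returned list corresponds to the time of the spike.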

Cell type classification

In the dorsal striatum, putative fast-spiking interneurons (FSIs) were identified as having a waveform trough half-width of less than 100 μs and a baseline firing rate of more than 10 Hz, and putative cholinergic interneurons (TANs) were identified as those with a waveform trough half-width of more than 250 μs (Supplementary Fig. 3, also Supplementary Fig. 16). All other units were classified as putative projection neurons (MSNs)7.
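The striatal criteria above amount to a simple decision rule, sketched here in Python for illustration. The function name and return labels are hypothetical, and the sketch uses only the two quantitative thresholds stated in the text, whereas the Methods also mention activity pattern as a classification criterion.

```python
def classify_striatal_unit(trough_half_width_us, baseline_rate_hz):
    """Label a dorsal striatum unit from its waveform and baseline firing rate.

    trough_half_width_us: waveform trough half-width in microseconds.
    baseline_rate_hz: baseline firing rate in Hz.
    """
    # Putative fast-spiking interneuron: narrow waveform and high rate
    if trough_half_width_us < 100 and baseline_rate_hz > 10:
        return "FSI"
    # Putative cholinergic interneuron: wide waveform
    if trough_half_width_us > 250:
        return "TAN"
    # All other units are treated as putative projection neurons
    return "MSN"
```

Note that under this rule a narrow-waveform unit with a low firing rate falls into the MSN category, because the text assigns all remaining units to that class.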

In substantia nigra, putative dopaminergic neurons were classified based on the following criteria8,9: low baseline firing rate (less than 10 Hz), specific waveform with wide action potential (half-width more than 350 μs), low negative-to-positive peak amplitude ratio (less than 0.4), and substantial (≥ 50 %) inhibition by the D2-selective agonist quinpirole; further validation of the classification criteria was performed using genetic and optogenetic tools (Supplementary Figs. 4–6). The rest were classified as putative SN GABA neurons, which are most likely the SNr projection neurons, because the percentage of GABAergic interneurons in the SN is rather small10,11. The burst firing in SN DA neurons was defined as two or more spikes occurring with an inter-spike interval of less than 80 ms and terminating with an inter-spike interval larger than 160 ms12. The burst set rate measured how many bursts occurred per second, and the percentage of spikes fired in bursts was also calculated for each SN DA neuron. Optical stimulation in TH-ChR2 mice was performed every several minutes with either a train of 60 pulses of 10 ms duration delivered at 11 Hz, or a train of 30 pulses of 10 ms duration delivered at 14 Hz.
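The dopaminergic criteria and the burst definition above can be sketched as follows. This Python version is an illustration under stated assumptions, not the Matlab code used in the study: all function and parameter names are hypothetical, quinpirole inhibition is assumed to be expressed as a fraction of the pre-injection rate, and an interval exactly equal to the 160 ms limit is assumed to continue a burst.

```python
def is_putative_da(baseline_rate_hz, half_width_us, peak_ratio, quinpirole_inhibition):
    """Apply the four stated criteria; quinpirole_inhibition is a fraction (0.5 = 50 %)."""
    return (baseline_rate_hz < 10
            and half_width_us > 350
            and peak_ratio < 0.4
            and quinpirole_inhibition >= 0.5)


def detect_bursts(spike_times_s, onset_isi_ms=80.0, offset_isi_ms=160.0):
    """Group spikes into bursts: start on an ISI below onset, end on an ISI above offset."""
    times = sorted(spike_times_s)
    bursts, current = [], []
    for prev, nxt in zip(times, times[1:]):
        isi_ms = (nxt - prev) * 1000.0
        if not current:
            if isi_ms < onset_isi_ms:
                current = [prev, nxt]      # a burst needs at least two spikes
        elif isi_ms > offset_isi_ms:
            bursts.append(current)         # long interval terminates the burst
            current = []
        else:
            current.append(nxt)            # burst continues
    if current:
        bursts.append(current)             # recording ended inside a burst
    return bursts


def burst_summary(spike_times_s, duration_s):
    """Return (bursts per second, percentage of spikes fired in bursts)."""
    bursts = detect_bursts(spike_times_s)
    n_spikes = len(spike_times_s)
    spikes_in_bursts = sum(len(b) for b in bursts)
    rate = len(bursts) / duration_s
    pct = 100.0 * spikes_in_bursts / n_spikes if n_spikes else 0.0
    return rate, pct
```

Calling `detect_bursts` with `onset_isi_ms=125.0` and `offset_isi_ms=280.0` would correspond to the burst-like activity definition given for striatal MSNs in anesthetized recordings.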

Lever press-related neurons throughout a session

Neural activity referenced to lever press onset was averaged in 20-ms bins, shifted by 1 ms, and averaged across trials to construct the peri-event time histogram (PETH), which was the basis for analyzing the amplitude and latency of press-related firing activity. The distribution of PETH values from 5000 to 2000 ms before lever press was considered baseline activity. We then determined which 20-ms bins, slid in 1-ms steps during an epoch spanning from 1000 ms before to 1000 ms after the event, met the criteria for task-related activity. A significant increase in firing rate was defined as at least 20 consecutive overlapping bins with a firing rate larger than a threshold of 99 % above baseline activity, and a significant decrease in firing rate was defined as at least 20 consecutive bins with a firing rate smaller than a threshold of 95 % below baseline activity13. The onset of press-related firing rate modulation was defined as the beginning of the first of 20 consecutive significant bins. The modulation period was defined as the time window from the beginning of the first of 20 consecutive significant bins to the final one of the consecutive significant bins13. For across-session comparisons, the modulation rate for each press was normalized to the maximal firing rate in the sequence, so the values range from 0 to 1, with larger numbers indicating stronger modulation.
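The detection of press-related modulation can be sketched in Python as below. This is a minimal illustration rather than the Matlab implementation used in the study. In particular, reading "99 % above" and "95 % below" baseline as the 99th and 5th percentiles of the baseline PETH distribution is an assumption, as are the nearest-rank percentile, the labeling of each bin by its start time, and all function names. The loops are written for clarity and are not optimized.

```python
import math


def sliding_peth(trials_ms, t_start_ms, t_stop_ms, bin_ms=20, step_ms=1):
    """Average firing rate (Hz) across trials in overlapping bins.

    trials_ms: one list of spike times per lever press, in ms relative to press onset.
    Returns (bin_start_times_ms, rates_hz).
    """
    starts = list(range(t_start_ms, t_stop_ms - bin_ms + 1, step_ms))
    rates = []
    for start in starts:
        count = sum(1 for trial in trials_ms for t in trial
                    if start <= t < start + bin_ms)
        rates.append(count / len(trials_ms) / (bin_ms / 1000.0))
    return starts, rates


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def first_run_start(flags, min_run=20):
    """Index where the first run of min_run consecutive True flags begins, else None."""
    run = 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == min_run:
            return i - min_run + 1
    return None


def modulation_onset(trials_ms, min_run=20):
    """Start time (ms from press) of the first significant bin run, or None."""
    _, baseline = sliding_peth(trials_ms, -5000, -2000)
    starts, rates = sliding_peth(trials_ms, -1000, 1000)
    upper = percentile(baseline, 99)  # assumed reading of "99 % above baseline"
    lower = percentile(baseline, 5)   # assumed reading of "95 % below baseline"
    increase = first_run_start([r > upper for r in rates], min_run)
    decrease = first_run_start([r < lower for r in rates], min_run)
    onsets = [starts[i] for i in (increase, decrease) if i is not None]
    return min(onsets) if onsets else None
```

Because each bin is labeled by its start time in this sketch, a response beginning at press onset is reported up to one bin width (20 ms) earlier; a bin-center convention would shift the reported onset accordingly.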

Sequence initiation/termination related neurons

To determine whether a task-related neuron was sequence start/stop related or not, we generated six firing-rate distributions, each one based on the PETH of the rate modulation period for a specific press within the sequence: namely the first, second and third, or the third to final, second to final and final press of a sequence. Sequence start/stop related neurons were defined as those where the mean peak (or trough) firing-rate modulation of the first press (start), final press (stop), or both was significantly different from the peak/trough of the within-sequence presses. Sequence middle-press selective neurons were determined in the same way by looking for those neurons that showed significantly different rate modulation for any of the middle presses within the sequence. All data analyses were conducted in Matlab with custom-written programs (The MathWorks Inc., MA).

Statistical procedures

The statistics were performed (and averaged) on the values for each animal per session, except for SN DA neurons because of the low number of neurons recorded per session/animal (in this case the average represents the neurons recorded from all animals for each session). One-way ANOVA and repeated-measures ANOVA were used to investigate general main effects, and paired or unpaired t-tests were used in all planned and post-hoc comparisons, except for SN DA neurons, where a chi-square test was used. Statistical analyses were conducted in Matlab using the statistics toolbox (The MathWorks Inc., MA) and GraphPad Prism 4 (GraphPad Software Inc., CA).

Acknowledgements

We thank Y. Li for the RGS9-Cre mice, C. Gerfen for the TH-Cre mice, K. Nakazawa for the NMDAR1-loxP mice, F. Tecuapetla and S. Lima for help with the optogenetics experiment, G. Luo for genotyping, and D. Lovinger, G. Cui, C. French, C. Gremel and E. Dias-Ferreira for comments on the manuscript. This research was supported by the NIAAA DICBR, the CNP at IGC and ERC Grant 243393 to R.M.C.

Footnotes

Author Contributions: X.J. performed the experiments and analyzed the data. R.M.C. conducted the optogenetics experiment. X.J. and R.M.C. designed the experiments and wrote the paper.

Author Information: The authors declare no conflicts of interest.

References

  • 1. Lashley KS. The problem of serial order in behavior. In: Jeffress LA, editor. Cerebral Mechanisms in Behavior. John Wiley Press; New York: 1951.
  • 2. Gallistel CR. The organization of action: A new synthesis. Lawrence Erlbaum Associates, Inc.; Hillsdale, N. J.: 1980.
  • 3. Grillner S, Wallén P. Central pattern generators for locomotion, with special reference to vertebrates. Annu. Rev. Neurosci. 1985;8:233–261. doi: 10.1146/annurev.ne.08.030185.001313.
  • 4. Marder E, Bucher D. Central pattern generators and the control of rhythmic movements. Curr. Biol. 2001;11:R986–R996. doi: 10.1016/s0960-9822(01)00581-4.
  • 5. Hikosaka O, et al. Parallel neural networks for learning sequential procedures. Trends Neurosci. 1999;22:464–471. doi: 10.1016/s0166-2236(99)01439-3.
  • 6. Yin HH, et al. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat. Neurosci. 2009;12:333–341. doi: 10.1038/nn.2261.
  • 7. Graybiel AM. The basal ganglia and chunking of action repertoires. Neurobiol. Learn Mem. 1998;70:119–136. doi: 10.1006/nlme.1998.3843.
  • 8. Hikosaka O, Miyashita K, Miyachi S, Sakai K, Lu X. Differential roles of the frontal cortex, basal ganglia, and cerebellum in visuomotor sequence learning. Neurobiol. Learn Mem. 1998;70:137–149. doi: 10.1006/nlme.1998.3844.
  • 9. Bailey KR, Mair RG. The role of striatum in initiation and execution of learned action sequences in rats. J. Neurosci. 2006;26:1016–1025. doi: 10.1523/JNEUROSCI.3883-05.2006.
  • 10. Benecke R, Rothwell JC, Dick JP, Day BL, Marsden CD. Disturbance of sequential movements in patients with Parkinson's disease. Brain. 1987;110:361–379. doi: 10.1093/brain/110.2.361.
  • 11. Agostino R, Berardelli A, Formica A, Accornero N, Manfredi M. Sequential arm movements in patients with Parkinson's disease, Huntington's disease and dystonia. Brain. 1992;115:1481–1495. doi: 10.1093/brain/115.5.1481.
  • 12. Castiello U, Stelmach GE, Lieberman AN. Temporal dissociation of the prehension pattern in Parkinson's disease. Neuropsychologia. 1993;31:395–402. doi: 10.1016/0028-3932(93)90162-s.
  • 13. Phillips JG, Chiu E, Bradshaw JL, Iansek R. Impaired movement sequencing in patients with Huntington's disease: a kinematic analysis. Neuropsychologia. 1995;33:365–369. doi: 10.1016/0028-3932(94)00114-5.
  • 14. Willingham DB, Koroshetz WJ. Evidence for dissociable motor skills in Huntington's disease patients. Psychobiology. 1993;21:173–182.
  • 15. Stefanova ED, Kostic VS, Ziropadja L, Markovic M, Ocic GG. Visuomotor skill learning on serial reaction time task in patients with early Parkinson's disease. Mov. Disord. 2000;15:1095–1103. doi: 10.1002/1531-8257(200011)15:6<1095::aid-mds1006>3.0.co;2-r.
  • 16. Boyd LA, et al. Motor sequence chunking is impaired by basal ganglia stroke. Neurobiol. Learn Mem. 2009;92:35–44. doi: 10.1016/j.nlm.2009.02.009.
  • 17. Fujii N, Graybiel AM. Representation of action sequence boundaries by macaque prefrontal cortical neurons. Science. 2003;301:1246–1249. doi: 10.1126/science.1086872.
  • 18. Aldridge JW, Berridge KC. Coding of serial order by neostriatal neurons: a “natural action” approach to movement sequence. J. Neurosci. 1998;18:2777–2787. doi: 10.1523/JNEUROSCI.18-07-02777.1998.
  • 19. Meyer-Luehmann M, Thompson JF, Berridge KC, Aldridge JW. Substantia nigra pars reticulata neurons code initiation of a serial pattern: implications for natural action sequences and sequential disorders. Eur. J. Neurosci. 2002;16:1599–1608. doi: 10.1046/j.1460-9568.2002.02210.x.
  • 20. Dang MT, et al. Disrupted motor learning and long-term synaptic plasticity in mice lacking NMDAR1 in the striatum. Proc. Natl. Acad. Sci. U. S. A. 2006;103:15254–15259. doi: 10.1073/pnas.0601758103.
  • 21. Gong S, et al. Targeting Cre recombinase to specific neuron populations with bacterial artificial chromosome constructs. J. Neurosci. 2007;27:9817–9823. doi: 10.1523/JNEUROSCI.2707-07.2007.
  • 22. Boyden ES, Zhang F, Bamberg E, Nagel G, Deisseroth K. Millisecond-timescale, genetically targeted optical control of neural activity. Nat. Neurosci. 2005;8:1263–1268. doi: 10.1038/nn1525.
  • 23. Lima SQ, Hromadka T, Znamenskiy P, Zador AM. PINP: a new method of tagging neuronal populations for identification during in vivo electrophysiological recording. PLoS One. 2009;4:e6099. doi: 10.1371/journal.pone.0006099.
  • 24. Meck WH, Penney TB, Pouthas V. Cortico-striatal representation of time in animals and humans. Curr. Opin. Neurobiol. 2008;18:145–152. doi: 10.1016/j.conb.2008.08.002.
  • 25. Samejima K, Ueda Y, Doya K, Kimura M. Representation of action-specific reward values in the striatum. Science. 2005;310:1337–1340. doi: 10.1126/science.1115270.
  • 26. Lau B, Glimcher PW. Value representations in the primate striatum during matching behavior. Neuron. 2008;58:451–463. doi: 10.1016/j.neuron.2008.02.021.
  • 27. Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H. Midbrain dopamine neurons encode decisions for future action. Nat. Neurosci. 2006;9:1057–1063. doi: 10.1038/nn1743.
  • 28. Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 2007;10:1615–1624. doi: 10.1038/nn2013.
  • 29. Calabresi P, Pisani A, Mercuri NB, Bernardi G. Long-term Potentiation in the Striatum is Unmasked by Removing the Voltage-dependent Magnesium Block of NMDA Receptor Channels. Eur. J. Neurosci. 1992;4:929–935. doi: 10.1111/j.1460-9568.1992.tb00119.x.
  • 30. Shen W, Flajolet M, Greengard P, Surmeier DJ. Dichotomous dopaminergic control of striatal synaptic plasticity. Science. 2008;321:848–851. doi: 10.1126/science.1160575.
  • 31. Pomata PE, Belluscio MA, Riquelme LA, Murer MG. NMDA receptor gating of information flow through the striatum in vivo. J. Neurosci. 2008;28:13384–13389. doi: 10.1523/JNEUROSCI.4343-08.2008.
  • 32. Brainard MS, Doupe AJ. What songbirds teach us about learning. Nature. 2002;417:351–358. doi: 10.1038/417351a.
  • 33. Kimura M. Behaviorally contingent property of movement-related activity of the primate putamen. J. Neurophysiol. 1990;63:1277–1296. doi: 10.1152/jn.1990.63.6.1277.
  • 34. Kermadi I, Joseph JP. Activity in the caudate nucleus of monkey during spatial sequencing. J. Neurophysiol. 1995;74:911–933. doi: 10.1152/jn.1995.74.3.911.
  • 35. Jog MS, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM. Building neural representations of habits. Science. 1999;286:1745–1749. doi: 10.1126/science.286.5445.1745.
  • 36. Miyachi S, Hikosaka O, Lu X. Differential activation of monkey striatal neurons in the early and late stages of procedural learning. Exp. Brain Res. 2002;146:122–126. doi: 10.1007/s00221-002-1213-7.
  • 37. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
  • 38. Schultz W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 2007;30:259–288. doi: 10.1146/annurev.neuro.28.061604.135722.
  • 39. Redgrave P, Gurney K. The short-latency dopamine signal: a role in discovering novel actions? Nat. Rev. Neurosci. 2006;7:967–975. doi: 10.1038/nrn2022.
  • 40. Hilario MRF, Clouse E, Yin HH, Costa RM. Endocannabinoid signaling is critical for habit formation. Front. Integr. Neurosci. 2007;1:6. doi: 10.3389/neuro.07.006.2007.
  • 1. Dang MT, et al. Disrupted motor learning and long-term synaptic plasticity in mice lacking NMDAR1 in the striatum. Proc. Natl. Acad. Sci. U. S. A. 2006;103:15254–15259. doi: 10.1073/pnas.0601758103.
  • 2. Gong S, et al. Targeting Cre recombinase to specific neuron populations with bacterial artificial chromosome constructs. J. Neurosci. 2007;27:9817–9823. doi: 10.1523/JNEUROSCI.2707-07.2007.
  • 3. Cui G, Santos SS, Gerfen CR, Costa RM. Genetic disruption of NMDA receptor-mediated excitatory inputs onto medial and lateral midbrain dopamine neurons causes dichotomous behavioral consequences in reinforcement learning. Soc. Neurosci. Abstr. 2009;21:781.
  • 4. Tsai HC, et al. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science. 2009;324:1080–1084. doi: 10.1126/science.1168878.
  • 5. Hilario MRF, Clouse E, Yin HH, Costa RM. Endocannabinoid signaling is critical for habit formation. Front. Integr. Neurosci. 2007;1:6. doi: 10.3389/neuro.07.006.2007.
  • 6. Yin HH, et al. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat. Neurosci. 2009;12:333–341. doi: 10.1038/nn.2261.
  • 7. Burkhardt JM, Jin X, Costa RM. Dissociable effects of dopamine on neuronal firing rate and synchrony in the dorsal striatum. Front. Integr. Neurosci. 2009;3:28. doi: 10.3389/neuro.07.028.2009.
  • 8. Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 2007;10:1615–1624. doi: 10.1038/nn2013.
  • 9. Pan WX, Schmidt R, Wickens JR, Hyland BI. Tripartite mechanism of extinction suggested by dopamine neuron activity and temporal difference model. J. Neurosci. 2008;28:9619–9631. doi: 10.1523/JNEUROSCI.0255-08.2008.
  • 10. Gulley RL, Wood RL. The fine structure of the neurons in the rat substantia nigra. Tissue Cell. 1971;3:675–690. doi: 10.1016/s0040-8166(71)80013-7.
  • 11. Juraska JM, Wilson CJ, Groves PM. The substantia nigra of the rat: a Golgi study. J. Comp. Neurol. 1977;172:585–600. doi: 10.1002/cne.901720403.
  • 12. Grace AA, Bunney BS. The control of firing pattern in nigral dopamine neurons: burst firing. J. Neurosci. 1984;4:2877–2890. doi: 10.1523/JNEUROSCI.04-11-02877.1984.
  • 13. Belova MA, Paton JJ, Morrison SE, Salzman CD. Expectation modulates neural responses to pleasant and aversive stimuli in primate amygdala. Neuron. 2007;55:970–984. doi: 10.1016/j.neuron.2007.08.004.
  • 14. Paxinos G, Franklin KBJ. The mouse brain in stereotaxic coordinates. 2nd edn. Academic Press; 2001.
