Abstract
One of the most characteristic features of habitual behaviors is that they can be evoked by a single cue. In the experiments reported here, we tested for the effects of such advance cueing on the firing patterns of striatal neurons in the sensorimotor striatum. Rats ran in a T-maze with instruction cues about the location of reward given at the start of the runs. This advance cueing about reward produced a highly augmented task-bracketing pattern of activity at the beginning and end of procedural task performance relative to the patterns found previously with midtask cueing. Remarkably, the largest increase in activity early during the T-maze runs was not associated with the instruction cues themselves, the earliest predictors of reward; instead, the highest peak of early activity was associated with the beginning of the motor period of the task. We suggest that the advance cueing, reducing midrun demands for decision making but adding a working-memory load, facilitated chunking of the maze runs as executable scripts anchored to sensorimotor aspects of the task and unencumbered by midtask decision-making demands. Our findings suggest that the acquisition of stronger task-bracketing patterns of striatal activity in the sensorimotor striatum could reflect this enhancement of behavioral chunking. Deficits in such representations of learned sequential behaviors could contribute to motor and cognitive problems in a range of neurological disorders affecting the basal ganglia, including Parkinson's disease.
Keywords: procedural learning, dorsolateral striatum, habit formation, basal ganglia, Parkinson's disease
The striatum is recognized as an essential component of procedural learning circuits that interconnect the basal ganglia with the neocortex and brainstem (Balleine et al. 2009; Graybiel 2008; Yin and Knowlton 2006). Correspondingly, microelectrode recordings in the striatum have demonstrated remarkable plasticity of neuronal firing and local field potential (LFP) oscillatory activity as rodents learn new actions and environmental conditions (Barnes et al. 2005; Costa et al. 2004; DeCoteau et al. 2007b; Gage et al. 2010; Jin and Costa 2010; Jog et al. 1999; Kimchi and Laubach 2009; Kubota et al. 2009; Stalnaker et al. 2010; Tang et al. 2007; Thorn et al. 2010; van der Meer et al. 2010). In a series of experiments, we earlier demonstrated robust reorganization of the ensemble spike activity of neurons in the sensorimotor part of the striatum as rats and mice learned to perform T-maze tasks in which they turned right or left in response to instruction tones (Barnes et al. 2005; Jog et al. 1999; Kubota et al. 2009; Thorn et al. 2010). Ensemble activity tended to occur throughout the maze runs early in training, but, as the animals became proficient in performing the task, the activity tended to be most prominent at the start and end of the maze runs. Activity patterns emphasizing the beginning and end of entire behavioral sequences have also been found in macaques performing sequential eye movement tasks and in rodents performing lever press tasks (Fujii and Graybiel 2003; Fujii and Graybiel 2005; Jin and Costa 2010). These findings raised the hypothesis that such acquired task-bracketing patterns might reflect behavioral chunking of the procedure as successful learning occurred, a formation of supraordinate units for motor control (Barnes et al. 2005; Boyd et al. 2009; Cohen et al. 1990; Graybiel 1998; Jog et al. 1999; Koch and Hoffmann 2000; Kubota et al. 2009; Miller 1956; Rosenbaum et al. 1983; Sakai et al. 2003; Thorn et al. 2010; Tremblay et al. 2010; Tremblay et al. 2009).
To test this suggestion directly, we again trained rats to run a conditional T-maze task, but we changed the task so that the instruction cues were delivered before the rats began the maze runs rather than just before the turns. This advance cueing removed demands for decision making on the basis of midrun cues and thus allowed preplanning of the runs. We predicted that this decrease in demands for cue detection and response selection at midrun would accentuate the task-bracketing pattern by promoting the opportunity for the entire run sequences to be represented as behavioral units, a form of behavioral chunking (Graybiel 1998). On the other hand, the advance-cueing protocol produced a working-memory load during the maze runs, possibly requiring retrieval of the early instruction cues at midrun, which we predicted might alter, and even obliterate, the repatterning of activity by engaging working-memory processes.
Our findings were clear cut in differentiating between these possibilities. With advance cueing, an even greater task-bracketing pattern of ensemble activity occurred than found during the original midrun-cueing versions of the maze task. Moreover, we found a strong decrease, not increase, in ensemble activity during the memory-requiring period, suggesting that, if activity changes related to remembering the instructions occurred during the runs, they were mainly decreases, the possible exception being just as the turns were to be initiated. These findings are significant in suggesting that advance cueing, by altering cognitive demands to promote pretask shifts in attentional demand and preplanning of entire sequential behaviors, can augment action-boundary representations in the sensorimotor striatum. These representations may be a neural signature of learning-related behavioral chunking.
METHODS
Subjects, habituation, and surgical procedures.
All procedures met the approval of the Massachusetts Institute of Technology Committee on Animal Care and were in accordance with the National Research Council Guide for the Care and Use of Laboratory Animals. Male Sprague-Dawley rats (n = 7; 300–350 g) were first handled for 3–5 days and then acclimated to a T-maze. Acclimation entailed sparsely scattering chocolate-flavored sprinkles uniformly in the maze and allowing the rats to freely explore the environment. Once acclimated, rats were given up to 10 practice trials before surgery in which the rat was placed in the start location, the gate was lowered, and the rat was allowed to traverse the maze. No auditory tones were played. During these practice trials, the chocolate was initially scattered throughout the maze, then placed only in the food wells.
For surgery, rats were pretreated with atropine (0.06 mg/kg) and anesthetized with ketamine (75–100 mg/kg) and xylazine (10–20 mg/kg). Small burr holes were made in the skull above unilateral dorsolateral striatum [anterior-posterior (AP) = +0.5 mm; medial-lateral (ML) = +3.6 mm], and the underlying dura was carefully removed. A headstage containing seven independently moveable tetrodes (200–250 kΩ) made of twisted 10 μm Ni/Cr wire (Kanthal Palm Coast, Palm Coast, FL) was lowered into the holes to the level of dura and affixed to the skull with dental cement and several bone screws. A small metal plate with a hole for a screw soldered to a wire served as the ground. Tetrodes were lowered in small increments after surgery and for 5–7 subsequent days until they reached the target site in the dorsolateral striatum [dorsal-ventral (DV) = 3.6–5.0 mm, Fig. 1F]. Once behavioral training began, the tetrodes were moved as little as possible. However, before each training session, tetrodes lacking single-unit activity were incrementally adjusted as necessary to maximize the number of units recorded by each tetrode.
Fig. 1.
Recordings in sensorimotor striatum of rats learning to perform a procedural T-maze task with advance cueing. A: cartoon of maze with timeline for advance-cueing task procedure. Each trial began with a presentation of a tone (1 or 8 kHz, 500 ms) signaling which end arm of the maze was baited with a chocolate reward. Following tone offset, the start gate was opened, and the rat was free to run down the maze toward the goal. The timing of these task events was detected by photobeam breaks and an overhead video tracker. B and C: average run times (B) and average percent-correct scores (C) across learning stages defined on the basis of behavioral performance for the 7 rats. Stages 1 and 2 are the first 2 days of training; stages 3 and 4 are the first sessions with performance above 60% and 65% correct, respectively; and stages 5-9 are pairs of successive sessions with performance at or above 72.5% correct (P < 0.01 by χ2 test). The rats ran the maze faster as training progressed (P < 0.001) and reached the acquisition criterion of 72.5% correct (P < 0.0001) in stage 5. Error bars indicate standard error of the mean. D: probability of a correct response (solid line), with 95% confidence intervals (shading), for 1 rat (B08), estimated by the state-space model. Dashed horizontal line indicates chance performance. Red arrow indicates the “day of learning” (Smith et al. 2004) on day 6. E: average numbers of sessions required for task acquisition calculated by the state-space model (Smith et al. 2004) (left) and by the 72.5% χ2-based correct-performance criterion (right). F: schematic diagrams illustrating estimated recording sites in the 7 rats, shown in 3 coronal sections redrawn from the atlas of Paxinos and Watson (1997).
Behavioral training.
All rats were trained in a T-maze that consisted of a long starting arm (127 × 7.5 cm) and two shorter goal arms (33 × 7.5 cm) made of black polycarbonate boards (Fig. 1A). All arms of the maze were elevated off the floor 22 cm and were surrounded at a distance of 14.5–16.5 cm by black walls 41 cm in height. Fresh chocolate was periodically sprinkled on the floor of the chamber to mask any possible olfaction cues. Rats began each trial in a 20-cm starting platform behind a gate. The gate could be opened by the experimenter to allow rats to traverse the maze. Plexiglas plates (2.8 × 6 cm) with a circular well (diameter: 2.5 cm) were placed by magnets at the end of each goal arm for reward delivery. All training occurred in dim red light.
For each trial, rats began on the start platform with the start gate closed. Two seconds of baseline activity was recorded before the onset of instruction cues. While still in the start block, rats were presented with one of two tone cues (1 or 8 kHz, ∼80 dB; 500-ms duration), which indicated whether the right or left goal was baited with chocolate sprinkles. Tones played from a speaker located behind the choice point of the maze, and the start gate was lowered by the experimenter 200–300 ms after the tone ended. The rats could then proceed down the long arm of the maze and choose to turn down either the left or right arm. If the correct arm was chosen, rats were rewarded with chocolate sprinkles placed in the delivery food well. Neuronal recordings were terminated 0.5–1.0 s after goal reaching. Rats were allowed to finish eating the reward before being guided back to the start gate by the experimenter. If the incorrect arm was chosen, rats were not allowed to visit the correct arm before being guided back to the start gate. The few trials in which the rat failed to exit the starting block were not analyzed.
Tone-goal associations were counterbalanced between animals. Tones were presented in a pseudorandom order in which the same tone could be repeated up to three times in a row. For each session, rats received ∼20 trials of each tone type. Sessions generally lasted 1–2 h. Training consisted of one session per day, ∼40 trials per session, for up to 57 days. Rats were trained 5–6 days a week until implants or recordings failed. Rats were then deeply anesthetized (Nembutal, 50–100 mg/kg) and then perfused with 4% paraformaldehyde in 0.1 M phosphate buffer, and 30-μm thick brain sections were stained for Nissl substance to identify recording tracks.
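The constraint on tone order described above (equal numbers of each tone, with no tone repeated more than three times in a row) can be illustrated with a short sketch. This is not the authors' procedure, which the text does not describe; it assumes exactly 20 trials per tone and uses simple rejection sampling. The function names `tone_sequence` and `longest_run` are hypothetical.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive items in a non-empty list."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def tone_sequence(n_per_tone=20, max_run=3, seed=0):
    """Shuffle a balanced tone list until no tone repeats more than max_run times in a row.

    Assumption: rejection sampling; the source only states the constraint, not the method.
    """
    rng = random.Random(seed)
    tones = ["1kHz"] * n_per_tone + ["8kHz"] * n_per_tone
    while True:
        rng.shuffle(tones)
        if longest_run(tones) <= max_run:
            return list(tones)

session = tone_sequence()
print(len(session), longest_run(session))
```

Rejection sampling keeps every admissible order equally likely, at the cost of discarding most shuffles; for 40 trials this remains fast.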
Data acquisition.
Neuralynx data acquisition hardware and Cheetah acquisition software (Neuralynx, Bozeman, MT) along with a Med-PC behavioral system (Med Associates, St. Albans, VT) were used for behavioral and neuronal data acquisition. Photobeams supplied timestamps for the following task events: gate opening, out of start, midmaze run, turn start, turn end, and goal reaching. The position of the rat was recorded continuously by a Neuralynx video tracker system that was supplied video images from an overhead CCD camera (60-Hz frame rate). The tracker detected an LED light source mounted to the back of the headstage. Recorded single-unit activity was amplified (gain: 200–10,000) and sampled at 32 kHz once a user-determined threshold was reached. A quiet channel of one of the tetrodes was used as reference.
Behavioral data analysis.
Run times were calculated as the time it took the animal to go from the out-of-start photobeam marker located immediately outside the start area to the goal photobeam marker. Reaction time was defined as the time it took the animal to reach the out-of-start photobeam after gate opening. Training-related changes in behavioral accuracy, running times, and reaction times were analyzed using ANOVAs. Video tracker data and VHS tapes were reviewed to look for changes in behavior as training progressed. No obvious changes (in sniffing, lateral movement, etc.) were found.
Learning on this task was defined in two ways. First, we defined a learning criterion of at least 72.5% correct, as performance at this level is statistically greater than chance performance (P < 0.01, χ2 test). Second, the state-space learning algorithm was used to determine during which session each rat's probability of correct performance reached and remained above chance (Smith et al. 2004).
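As a worked check of the χ2 criterion, the sketch below tests a session score against chance (50% correct). It assumes a session of exactly 40 trials and a one-degree-of-freedom goodness-of-fit test without continuity correction; the text states neither explicitly. The function name `chi2_vs_chance` is hypothetical.

```python
import math

def chi2_vs_chance(n_correct, n_trials):
    """Chi-square goodness-of-fit statistic and P value against 50% correct.

    Assumption: two categories (correct, incorrect), 1 degree of freedom,
    no continuity correction.
    """
    expected = n_trials / 2.0
    n_wrong = n_trials - n_correct
    stat = ((n_correct - expected) ** 2 + (n_wrong - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

for n_correct in (28, 29):
    stat, p = chi2_vs_chance(n_correct, 40)
    print(n_correct / 40 * 100, round(stat, 2), p < 0.01)
```

Under these assumptions, 29 of 40 correct (72.5%) is the lowest score that passes P < 0.01, whereas 28 of 40 (70%) does not.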
A staging procedure was used following prior work (Barnes et al. 2005) to compare data across animals despite differences in behavioral performance and duration of training. Stage 1 corresponded to the first day of training, stage 2 to the second day of training. Stage 3 was the first day in which the animals performed above 60% correct; stage 4 was the first day animals performed correctly in greater than 65% of trials; and stage 5 was the first two training days in which rats earned reward in 72.5% or more of trials. Stages 6-9 each included subsequent pairs of training sessions in which the rats again earned reward in 72.5% or more of trials. Following this procedure, the numbers of units across learning stages were relatively constant (see Figs. 2A and 3A). Because stages 5-9 each comprised two sessions, an identical unit recorded on different days could have been included in a single stage. However, unless the animals maintained a performance at or above 72.5% correct, stages 5-9 were not consecutive, making it less likely that the same unit appeared repeatedly. Patterns of neuronal ensemble activity found in analyses with this staging procedure were compared with those seen in day-by-day analyses and were found to be comparable. The same day was not used in multiple stages; for example, if an animal performed above 60% on day 1 or 2, the next day in which the animal scored 60% or higher was used for stage 3.
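One possible reading of the staging rules is sketched below. It is an interpretation, not the authors' code. It assumes that stages are assigned in chronological order, that each session is used at most once, and that the criterion sessions forming a pair for stages 5-9 need not be adjacent in time. The function name `assign_stages` is hypothetical, and the input is a list of per-session percent-correct scores with at least two entries.

```python
def assign_stages(pct_correct):
    """Return {stage: [0-based session indices]} for as many stages as the data allow."""
    stages = {1: [0], 2: [1]}
    next_day = 2  # sessions 0 and 1 are always stages 1 and 2

    def first_day(start, passes):
        # First session at or after `start` whose score satisfies `passes`.
        for day in range(start, len(pct_correct)):
            if passes(pct_correct[day]):
                return day
        return None

    # Stages 3 and 4: first later session above 60%, then above 65% correct.
    for stage, cutoff in ((3, 60.0), (4, 65.0)):
        day = first_day(next_day, lambda score, c=cutoff: score > c)
        if day is None:
            return stages
        stages[stage] = [day]
        next_day = day + 1

    # Stages 5-9: successive pairs of sessions at or above 72.5% correct.
    for stage in range(5, 10):
        pair = []
        for _ in range(2):
            day = first_day(next_day, lambda score: score >= 72.5)
            if day is None:
                return stages  # an incomplete pair is not assigned to a stage
            pair.append(day)
            next_day = day + 1
        stages[stage] = pair
    return stages

scores = [50, 55, 62, 58, 67, 70, 75, 72.5, 60, 80, 77.5]
print(assign_stages(scores))
```

In this example the sessions scoring 58%, 70%, and 60% fall between stages and are left unassigned, which is how non-consecutive stages arise under this reading.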
Fig. 2.
The ensemble activity of task-responsive putative medium spiny neurons (TRNs) in the sensorimotor striatum is restructured during acquisition of the advance-cued T-maze task. A: average firing rate of TRNs across learning stages defined by χ2 criterion, as in Fig. 1, B and C. Ensemble activity during 800-ms-long periods around consecutive task events, as labeled. The firing rate in each 10-ms bin was normalized by subtracting the firing rate during the baseline period (1,900 to 1,400 ms before tone onset) and is plotted according to the pseudocolor scale shown at right. The numbers of units for each stage are shown on the right. Note that TRNs develop, as training progresses, accentuated activity at the beginning and end of the movement period, but not during the midrun period, of the maze runs. B: results of regression analysis, showing the slope of the regression for each 10-ms bin, with 95% confidence bounds, across task time. Graph shows the bins that increased (above the red line) or decreased (below the red line) significantly as training progressed. Data are illustrated for 800-ms periods around each event. C: line plots of data shown in A to illustrate the progressive appearance of the task-bracketing firing pattern throughout learning. TO, tone onset; GO, gate opening; ST, start; MR, midrun; TS, turn start; TE, turn end; GR, goal reaching. D: average normalized activity plotted with learning stages on the basis of the state-space model (see methods). Stages 1 to 4 are as in A, except that 1 rat (B05) that failed to learn according to the state-space model is excluded. Stages 5 to 9 represent sessions with a 95% probability that the rat will maintain performance, for the duration of the experiment, above 50%, 52.5%, 55%, 57.5% and 60% levels, respectively.
Fig. 3.
Baseline activity of striatal neurons without phasic task-related responses (NTRNs), but not the baseline activity of TRNs, changes during training. A: normalized average firing rate of NTRNs plotted as in Fig. 2A. B: average raw firing rate of NTRNs. Note increases in activity during the baseline period with training. C and D: baseline firing rate of NTRNs (C) and TRNs (D) throughout learning. Shaded area indicates 95% confidence bounds. E and F: slope of the regression for changes in each 10-ms bin across learning stages indicates that activity of NTRNs significantly increased (above the red line) in all 10-ms bins with training (E), but corresponding measure of the activity of TRNs did not change (F).
Neuronal data analysis.
Single units were sorted manually using the program Autocut (DataWave Technologies, Berthoud, CO). The quality of sorted units was determined by analyzing autocorrelograms and overlays of spike waveforms. Units were analyzed only if they had at least 100 spikes in a session. Units were then classified as putative medium spiny-output neurons (MSNs) on the basis of waveform properties, autocorrelograms, interspike intervals, and firing rates (Barnes et al. 2005; Gage et al. 2010; Kubota et al. 2009; Schmitzer-Torbert and Redish 2008; Tepper et al. 2004; Thorn et al. 2010). We recorded 3,207 units in seven rats across training. Of these, 2,456 were identified as MSNs according to the criteria [low baseline firing rate (< 10 Hz, average = 1.48 Hz), presence of long interspike intervals (> 2 s), and narrow gap in autocorrelograms (< 5 ms) indicating bursty activity patterns] used previously (Barnes et al. 2005). The numbers of putative MSNs recorded from the individual animals were as follows: 271 (11.0% of total putative MSNs recorded), 220 (9.0%), 192 (7.8%), 124 (5.0%), 865 (35.2%), 396 (16.1%), and 388 (15.8%). It is likely that units were recorded on more than one day throughout training (Kubota et al. 2009), but here we considered our samples as neuronal ensembles recorded in daily sessions and looked for changes in their response properties across training.
To determine whether individual units were task responsive, a perievent histogram was generated for ±400-ms windows (with consecutive 20-ms bins) centered around each task event (tone onset, 1-kHz tone onset, 8-kHz tone onset, gate opening, locomotion onset, midrun, turn start, turn end, right-turn end, left-turn end, goal reaching, right goal reaching, and left goal reaching). If the spike count in four consecutive 20-ms bins was at least two standard deviations above the mean spike count during a pretrial baseline period (1,900 to 1,400 ms before tone onset) and each of those bins contained at least two spikes, the unit was categorized as responsive to that event. Of units recorded on the staged days of learning, on average, 56% were task responsive according to this criterion (57% of all putative MSNs). Units that did not meet this criterion were considered non-task responsive. There was not a significant change in proportions of task-responsive units across learning stages (53% of units recorded in stage 1 were task responsive, whereas 56% of units recorded in stage 9 were task responsive).
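The responsiveness criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions: spike counts are summed across trials before thresholding, the baseline is binned at the same 20-ms width, and the sample standard deviation is used. None of these details is confirmed by the text, and the function name `is_responsive` is hypothetical.

```python
import statistics

def is_responsive(event_counts, baseline_counts, n_sd=2.0, run_len=4, min_spikes=2):
    """True if run_len consecutive perievent bins exceed the baseline threshold.

    event_counts: spike counts per 20-ms bin in the +/-400-ms perievent window.
    baseline_counts: spike counts per 20-ms bin in the pretrial baseline period.
    Assumption: counts are summed over trials; sample SD is used.
    """
    threshold = statistics.mean(baseline_counts) + n_sd * statistics.stdev(baseline_counts)
    run = 0
    for count in event_counts:
        if count >= threshold and count >= min_spikes:
            run += 1
            if run >= run_len:
                return True
        else:
            run = 0  # the qualifying bins must be consecutive
    return False

baseline = [1, 0, 1, 2, 1] * 5            # 25 bins of 20 ms = 500 ms
phasic = [0] * 18 + [3, 3, 3, 3] + [0] * 18
print(is_responsive(phasic, baseline))
```

A unit would be labeled task responsive if this test passed for at least one of the task events listed above.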
To analyze the activity of recorded neurons, the firing of each unit was first normalized with respect to its baseline. The baseline period was defined as 1,900 to 1,400 ms before tone onset, and firing rates during this period were averaged in 10-ms bins over all 40 trials in a session. The baseline firing rate was then defined as the mean firing rate across all bins in this baseline. Next, the firing rate during 800-ms time windows centered on each task event was averaged in 10-ms bins over all 40 trials, and the baseline firing rate was subtracted from each bin to obtain event-related firing patterns for each unit. Binned data across all events were then smoothed using a five-point moving average.
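The two steps in this paragraph, baseline subtraction and five-point smoothing, can be sketched as below. The handling of the first and last bins (a window that shrinks at the edges) is an assumption, since the text does not say how edges were treated. The function names `baseline_subtract` and `moving_average` are hypothetical.

```python
def baseline_subtract(event_rates, baseline_rates):
    """Subtract the mean baseline rate from every perievent bin."""
    baseline = sum(baseline_rates) / len(baseline_rates)
    return [rate - baseline for rate in event_rates]

def moving_average(values, width=5):
    """Centered moving average; the window shrinks near the edges (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half):min(len(values), i + half + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

rates = baseline_subtract([1.0, 1.0, 6.0, 1.0, 1.0], [1.0, 1.0, 1.0])
print(moving_average(rates))
```

Here the inputs are trial-averaged firing rates per 10-ms bin; a positive output value means firing above the pretrial baseline.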
To create population-activity plots, the event-related firing patterns were averaged over all task-responsive putative MSNs (TRNs) recorded for each stage of learning. Because the rat ran each trial at a different speed, 800-ms epochs centered on each task event were analyzed. The size of this epoch was determined by maximizing the size of the time window while minimizing the overlap. All major analyses were repeated using nonnormalized, raw firing rates, as well as firing rates normalized to the mean over all bins rather than mean only in the baseline period. As a control analysis, we also made comparable population-activity plots by inserting seven event markers randomly within trial time and by aligning neuronal activity to these mock events. To determine whether the population activity of TRNs changed significantly as training progressed, we performed a linear regression for each 10-ms bin across the nine stages of learning and computed 95% confidence levels. A significant change was identified for a bin if the confidence bounds of the slope did not overlap with zero.
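The per-bin regression test can be illustrated with ordinary least squares on stage number. The sketch assumes one value per stage per bin (the stage-averaged normalized rate) and nine stages, so that the confidence bounds use Student's t with 7 degrees of freedom; whether the original regression used stage means or individual units is not stated. The function names `slope_with_ci` and `significant_bins` are hypothetical.

```python
import math

T_CRIT_DF7 = 2.365  # two-sided 95% critical value of Student's t, 7 degrees of freedom

def slope_with_ci(y, t_crit=T_CRIT_DF7):
    """Least-squares slope of y against stage number 1..len(y), with confidence bounds.

    The default t_crit is only valid for nine stages (n - 2 = 7).
    """
    n = len(y)
    x = list(range(1, n + 1))
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope, slope - t_crit * se, slope + t_crit * se

def significant_bins(stage_by_bin):
    """stage_by_bin[s][b]: mean rate of bin b in stage s + 1.

    Returns +1 (increase), -1 (decrease), or 0 (bounds include zero) per bin.
    """
    labels = []
    for b in range(len(stage_by_bin[0])):
        _, low, high = slope_with_ci([row[b] for row in stage_by_bin])
        labels.append(1 if low > 0 else -1 if high < 0 else 0)
    return labels

rising = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 9.0]
flat = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1]
data = [[r, -r, f] for r, f in zip(rising, flat)]
print(significant_bins(data))
```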
The activity of single units was also plotted on a relative scale. For each unit, a histogram was first made over the 40 trials of training in 20-ms bins. Data were then smoothed with an eight-point filter and were converted to a scale of 1 to 64, where 1 and 64 were the minimum and maximum firing rates of that unit, respectively. Data were then sorted on the basis of when each unit fired at a rate that was greater than or equal to 35 (55%), 40 (63%), 45 (70%), 50 (78%), 55 (86%), or 60 (94%) on the relative scale or when each unit fired at the maximum rate.
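The conversion to a 1-to-64 relative scale and the search for the first bin reaching a given level can be sketched as follows. Linear rescaling with rounding to the nearest integer is an assumption; the text gives only the endpoints of the scale. The function names `to_relative_scale` and `first_bin_at_or_above` are hypothetical.

```python
def to_relative_scale(rates, levels=64):
    """Map a unit's smoothed rates linearly onto integers 1..levels (min -> 1, max -> levels)."""
    low, high = min(rates), max(rates)
    if high == low:
        return [1] * len(rates)  # flat profile: no meaningful scale
    return [1 + round((r - low) / (high - low) * (levels - 1)) for r in rates]

def first_bin_at_or_above(scaled, threshold):
    """Index of the first bin whose scaled value reaches the threshold, else None."""
    for i, value in enumerate(scaled):
        if value >= threshold:
            return i
    return None

scaled = to_relative_scale([0.0, 1.0, 9.0, 10.0])
print(scaled, first_bin_at_or_above(scaled, 55))
```

The thresholds quoted in the text follow from the 64-level scale; for example, a level of 55 corresponds to 55/64, or about 86%, of the maximum.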
To test whether activity of individual TRNs differentiated between the two cues and between the two turning directions, trial-by-trial firing rates of individual units during the pretrial baseline period and each ±400-ms perievent period were compared between trials with the two cues and between trials in which rats made right vs. left turns by using a t-test. We then tested whether we observed significantly more units with discriminative activity compared with those predicted by chance by using χ2 tests in two ways. First, the observed numbers of discriminative units were compared with the number expected by chance (18, P = 0.05). Second, we calculated the numbers of units that were identified as discriminative units after shuffling data randomly between types, and these numbers were compared with the observed data. In both comparisons, the P value of 0.05 was used for significance.
To find when during the trial each unit fired at its maximum rate, raw, unsmoothed spike counts were summed over all 40 trials for 800-ms time windows around each of seven task events, and we determined which of the 56 100-ms bins contained the maximum firing rate. If two bins had the same maximum firing rate (which happened less than 1% of the time), the unit was counted as having a maximum in both locations. The same procedure was then repeated for 25% of the maximum firing rate. Average firing rates and percentage of units reaching maximums or 25% of maximums were then analyzed for 400-ms windows before and after each event. To examine whether the increasing consistency in the behavior of the animals could account for better temporal alignment of neuronal perievent responses, spike activity during 400-ms time windows before and after each event was analyzed on a per-trial basis. This analysis allowed us to determine when each unit was most active in each trial and could not be affected by temporal consistency across trials. If the unit did not fire during a given trial, or if the unit had more than four identical maximums (e.g., 2 spikes in 4 400-ms epochs) in a trial, the trial was counted as not having a maximum.
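The first step of this analysis, locating the 100-ms bin or bins with the maximum summed spike count and tallying them across units, can be sketched as below. The input is assumed to be a list of 56 summed spike counts per unit (seven 800-ms windows in 100-ms bins); returning no peak for a silent unit is an added assumption. The function names `peak_bins` and `peak_histogram` are hypothetical.

```python
def peak_bins(counts):
    """Indices of all bins holding the maximum count; ties return every tied bin."""
    peak = max(counts)
    if peak == 0:
        return []  # assumption: a unit with no spikes has no peak
    return [i for i, c in enumerate(counts) if c == peak]

def peak_histogram(units, n_bins=56):
    """Number of units whose maximum falls in each bin (tied units count in each tied bin)."""
    histogram = [0] * n_bins
    for counts in units:
        for i in peak_bins(counts):
            histogram[i] += 1
    return histogram

units = [[0, 3, 1, 3], [5, 0, 0, 0], [0, 0, 0, 0]]
print(peak_histogram(units, n_bins=4))
```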
We tested for the ability of a naïve Bayesian decoder (Quian Quiroga and Panzeri 2009) to decode from which epoch of a trial (400-ms baseline period, 400 ms after gate opening, 400 ms after midrun, or 400 ms after goal reaching) the set of observed firing rates originated. For each animal, for each staged session, all TRNs that fired at least once during the subset of trials (see below) were examined for all four trial epochs. We employed the “leave-one-out” method to cross validate our model. In this method, each trial is predicted on the basis of the distribution of all the other trials. All priors were based on the frequency of occurrence of items being compared (e.g., 25% in the case of the 4 trial epochs; actual percentage of trials with right and left turns). We compared the ability of the model to decode baseline vs. beginning-, middle-, or end-trial period separately as well as the four together. Similarly, we compared the ability of the Bayesian decoder to decode the tone onset and gate opening periods from the pretrial baseline period.
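A naïve Bayesian decoder with leave-one-out cross-validation can be sketched as follows. Several choices here are assumptions rather than details from the text: spike counts are modeled as independent Poisson variables, a small smoothing constant keeps class means above zero, and each observation (one epoch of one trial) is left out on its own rather than leaving out a whole trial. The function names `poisson_loglik` and `decode_leave_one_out` are hypothetical.

```python
import math

def poisson_loglik(count, mean):
    """Log probability of an integer spike count under a Poisson model."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)

def decode_leave_one_out(samples, labels, smoothing=0.1):
    """Fraction of observations decoded correctly.

    samples[i]: spike counts, one per neuron, for observation i.
    labels[i]: the epoch (or tone, or turn) that observation i came from.
    """
    classes = sorted(set(labels))
    n_neurons = len(samples[0])
    n_correct = 0
    for test in range(len(samples)):
        train = [i for i in range(len(samples)) if i != test]
        best_label, best_score = None, -math.inf
        for c in classes:
            members = [i for i in train if labels[i] == c]
            if not members:
                continue
            # Prior from class frequency in the training set.
            score = math.log(len(members) / len(train))
            for n in range(n_neurons):
                mean = (sum(samples[i][n] for i in members) + smoothing) / len(members)
                score += poisson_loglik(samples[test][n], mean)
            if score > best_score:
                best_label, best_score = c, score
        n_correct += best_label == labels[test]
    return n_correct / len(samples)

samples = [[8, 1], [9, 0], [7, 1], [10, 2], [1, 1], [0, 0], [1, 2], [2, 1]]
labels = ["gate"] * 4 + ["baseline"] * 4
print(decode_leave_one_out(samples, labels))
```

The "naïve" part is the product over neurons, which treats each neuron's count as independent given the epoch.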
We also tested whether we could decode which tone (1 or 8 kHz) was presented and which direction the animal turned (right or left) by performing the naïve Bayesian decoder analysis on the per-trial average firing rates of individual TRNs during the 400-ms period after tone onset. We again used the leave-one-out method and excluded any unit that did not fire at least once during each set of trials examined during the epoch.
To determine whether the firing patterns of dorsolateral striatal neurons became more consistent across trials with training, we calculated, for each unit, correlations among the spike counts in consecutive 40-ms bins covering 800-ms windows around each task event, comparing counts in each trial and those in every other in the training session. The average correlation for each unit was calculated for each event, and the mean and standard error of the mean of these correlations were computed for individual learning stages. The same procedure was followed for correlations of running speed across individual trials to test for changes in behavioral consistency during learning.
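The trial-to-trial consistency measure can be illustrated as the mean Pearson correlation over all pairs of trials. Skipping pairs in which one trial has no variation across bins (for example, no spikes) is an assumption; the text does not say how such trials were handled. The function names `pearson` and `mean_pairwise_correlation` are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # flat trial: correlation is undefined
    return cov / math.sqrt(var_a * var_b)

def mean_pairwise_correlation(trials):
    """Average correlation between each trial and every other trial of the session.

    trials[t][b]: spike count (or running speed) of trial t in bin b.
    """
    values = []
    for a, b in combinations(trials, 2):
        r = pearson(a, b)
        if r is not None:
            values.append(r)
    return sum(values) / len(values) if values else None

trials = [[0, 1, 2, 1], [0, 2, 4, 2], [1, 2, 3, 2]]
print(mean_pairwise_correlation(trials))
```

The same function applies unchanged to binned running speed, which is how the behavioral comparison described above could be made.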
To analyze further the consistency of the firing patterns from trial to trial across learning, 45 units were randomly selected from each of the nine stages of learning. Spike data were shuffled in 10-ms bins, and then trials were bootstrapped 1,000 times for both the real and shuffled data sets. For each bootstrap, the time during the trial at which each neuron reached one of a series of percentages of its peak firing (e.g., 78%, 86%, and 100%) was ranked in relation to the result of the bootstrap for all other neurons recorded in that stage of learning. Then the standard deviation of the rank orders over 1,000 bootstraps was calculated for each unit to assess the consistency of phasic responses of each unit relative to other recorded units. A standard deviation of rank less than 8 was found in shuffled data less than 1% of the time. The analysis with three thresholds analyzed in detail (78%, 86%, and 100%) produced comparable results, and we report results with 86% threshold in results and Fig. 6.
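The bootstrap rank analysis can be sketched roughly as below. This is a simplified illustration: trials are resampled independently for each unit, ties in crossing time are broken by unit order, a resample with no spikes is ranked last, and the shuffled-data control is omitted. All of these are assumptions, and the function names `first_crossing` and `rank_sd` are hypothetical.

```python
import random
import statistics

def first_crossing(trials, fraction):
    """First bin at which the trial-summed count reaches `fraction` of its peak."""
    n_bins = len(trials[0])
    summed = [sum(trial[b] for trial in trials) for b in range(n_bins)]
    peak = max(summed)
    if peak == 0:
        return n_bins  # assumption: a silent resample is placed after the last bin
    for b, value in enumerate(summed):
        if value >= fraction * peak:
            return b

def rank_sd(units, fraction=0.86, n_boot=1000, seed=0):
    """Standard deviation of each unit's rank (by crossing time) across bootstraps.

    units[u][t][b]: spike count of unit u, trial t, bin b.
    """
    rng = random.Random(seed)
    ranks = [[] for _ in units]
    for _ in range(n_boot):
        times = []
        for trials in units:
            resampled = [rng.choice(trials) for _ in trials]  # sample with replacement
            times.append(first_crossing(resampled, fraction))
        order = sorted(range(len(units)), key=lambda u: times[u])
        for rank, u in enumerate(order, start=1):
            ranks[u].append(rank)
    return [statistics.pstdev(r) for r in ranks]

consistent = [[[5, 0, 0, 0]] * 4, [[0, 0, 5, 0]] * 4, [[0, 5, 0, 0]] * 4]
print(rank_sd(consistent, n_boot=50))
```

Units whose response timing is stable across trials keep the same rank in every bootstrap and therefore have a rank standard deviation near zero, which is the sense in which a low value indicates consistency.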
Fig. 6.
Task-time spiking of single striatal units becomes more consistent with training. A: average correlation of spike counts in 10-ms bins (red) and, for comparison, of running speed (blue) between each trial and every other trial of the session across learning stages. Error bars indicate the standard error of the mean. Note that both the firing rates and the speeds become more consistent with learning (ANOVA, P < 0.001). B: numbers of units, out of 45 in each stage, showing standard deviations in their rank order on the basis of when they reached 86% of their maximum firing rate relative to other units (see methods). Dark blue bars indicate units with standard deviation lower than 8 (chance level based on shuffled data), indicating high consistency in activity patterns. Light blue bars indicate units with standard deviation of 8 or higher. C: numbers of units, out of 45, with a standard deviation of rank order at or below the chance level across learning stages. D: percentage of TRNs across training most often showing largest per-trial response during gate opening, midrun, and goal reaching. Shaded areas indicate 95% confidence intervals.
To determine whether activity related to delay-period-like short-term memory was present in the striatum, we first took the average firing rate for each unit during a baseline period (1.9 to 1.4 s before tone onset). We then looked for units whose firing rate remained elevated or depressed from tone offset to turn onset, relative to the baseline period. For those units with such sustained activity, we tested whether the size and/or direction of modulations discriminated the auditory frequencies of the two tone cues, the turning direction signaled by the cues, or the direction of turns selected. We then tested for changes in the proportions of these units as rats' performance improved with training.
RESULTS
Behavioral performance during acquisition of the T-maze task.
Over the course of training, all seven rats successfully learned to traverse the maze and select the rewarded arm (Fig. 1A). Running times significantly decreased with training (Fig. 1B, P < 0.001). The average percent-correct performance of the rats rose from chance levels to exceed the 72.5% correct criterion for acquisition (Fig. 1C, P < 0.0001) in an average of 15 sessions (Fig. 1E). However, for each individual rat, the percent-correct performance varied significantly from session to session, throughout training. We therefore also estimated the onset of acquisition with a dynamic state-space model that characterizes the learning state as the probability that an animal will maintain a performance level above chance for the duration of the experiment (Smith et al. 2004) (Fig. 1D). Estimated in this way, it took an average of 8.5 sessions (± 0.56 SE) for the rats to acquire the task [excluding one animal (B05), which, according to this model, failed to learn]. Thus the probabilistic estimates suggested that the rats learned on average nearly a week earlier than suggested by the absolute criterion (Fig. 1E), confirming a similar trend in the acquisition estimates of performance on the standard T-maze task (Smith et al. 2004). We used the absolute criterion method to define stages of training (see methods) to have an expanded scale of training but were alert to the state-space estimates in analyzing the data.
Ensemble activity of task-responsive and nontask-responsive striatal projection neurons during training.
We analyzed the activity of 3,207 units recorded in the dorsolateral striatum of the seven rats that learned the T-maze task (recording sites shown in Fig. 1F). Of these, 77% (2,456) were classified as putative MSNs. Of these putative MSNs, 57% (1,391) were responsive to at least one event (e.g., gate opening) during this task and classified as TRNs. Each of the seven rats contributed at least 124 (5%) putative MSNs to the analysis, with one animal contributing 865 MSNs (35%).
The ensemble activity of the TRNs was strikingly restructured as the animals acquired the task (Fig. 2A). The early dispersed ensemble activity that occurred during the first days of training gave way to a pattern in which the average per-neuron firing rates were highest at the beginning of the runs (after gate opening) and at the end of the task (as the animals approached reward), while ensemble firing during the middle of the trial runs was reduced to the level of pretrial baseline (Fig. 2B). The time course of these changes in firing pattern is shown as line plots in Fig. 2C for each stage throughout the course of training. In the first stage, ensemble activity increased nearly monotonically from start to goal reaching. Already by stage 2, however, firing rates rose, then fell slightly, and then rose again during the runs, in a pattern that became progressively stronger as training progressed. For most of middle to late stages, the only other marked modulation in average firing rates occurred during the window bracketing the onset of turning, when the otherwise strong decrease in midrun firing was lessened. This progressive accentuation of firing for the gate and goal trial events was evident also when the learning stages were determined by the state-space model (Fig. 2D). This pattern was seen regardless of whether firing rates were normalized before averaging, or normalized by different methods (Supplemental Fig. S1, A-C; supplemental material for this article is available online at the Journal of Neurophysiology website), and could be observed in session-by-session plots of the ensemble spike patterns (Supplemental Fig. S2B). However, in control analyses, we observed neither such activity patterns nor learning-related changes in ensemble activity when the ensemble activity was plotted by aligning it to seven randomly inserted events (Supplemental Fig. S1D), suggesting that the activity patterns that we observed likely represented neural processing related to behavioral acquisition and performance of the T-maze task.
We also plotted the averaged ensemble firing patterns of MSNs that did not exhibit phasic increases in activity during the task exceeding two standard deviations above their baseline activity (∼44% of MSNs recorded during sessions included in the learning-stage data set). In contrast to the phasically active MSNs, these nontask-responsive neurons (NTRNs), as an ensemble, decreased their firing rates relative to baseline activity as training progressed (Fig. 3A). However, this baseline firing actually increased significantly as training progressed. The NTRNs thus did not have strong phasic event-related activity during the task, but they did exhibit robust learning-related plasticity that was highly specific to context (e.g., time to start the maze run or environment in which the task should be performed). They fired progressively more before, but not during, the maze runs as training progressed (Fig. 3, B, C, and E). As our goal was to analyze in-task activity patterns, we focused most of our analyses on the phasically active TRNs, for which we did not find such changes in baseline firing rates (Fig. 3, D and F). We note, however, that the context-dependent baseline activity of the NTRNs is clearly an important phenomenon for further analysis.
Changes in firing rates and numbers of responsive neurons as potential sources of the increases and decreases in population activity during learning.
To address the issue of what changes in activity could have contributed to bringing about the changes in firing patterns that we observed in the ensembles of TRNs, we first asked whether these changes could be observed in the firing rates of individual TRNs making up the ensemble populations. We plotted the activity profiles of all individual TRNs recorded during early acquisition (stages 1 and 2), late acquisition (stages 3 and 4), early overtraining (stages 5 and 6), and late overtraining (stages 7 and 8), according to when, across task time, each of these neurons first reached 86% (see methods) of its maximum firing rate (Fig. 4). The plots demonstrate that the single neurons making up the ensembles themselves, cell by cell, developed activity accentuating the beginning and end of each trial. Not all individual neurons followed this pattern; as we found in the midtask-cued version of the task (Barnes et al. 2005; Thorn et al. 2010), some units responded during nearly all times of the trial analyzed (Fig. 4).
Fig. 4.
Task-related spike activity profiles of single neurons recorded during early acquisition (stages 1 and 2), late acquisition (stages 3 and 4), early overtraining (stages 5 and 6), and late overtraining (stages 7 and 8). In the panels, each row represents the activity of a single neuron relative to its maximum and minimum firing rates plotted according to the color scale at right. Units are sorted on the basis of when they first reached 86% of their maximum firing rate.
Given this evidence of a progressive increase in firing rates around gate opening and goal reaching with a concomitant decrease of firing rates during the middle of the trial, we next asked whether the emergence of the task-bracketing pattern reflected changes in the proportions of units responding to beginning-, middle-, and end-task events, or changes in the firing rates of individual neurons during these task periods, or both. We examined firing rates and numbers of neurons with phasic responses during 400-ms time windows during the beginning (after gate opening), middle (after midrun photobeam), and end (after goal reaching) of the task (Fig. 5E). We then determined whether the percentage of units reaching at least 25% of their maximum firing rate during each of these trial epochs changed as training progressed. For units reaching their peak firing or 25% of their peak firing during each task epoch, we also asked whether the firing rate changed as training progressed.
Fig. 5.
Maximum firing rate across learning. A: proportions of TRNs that fired at least 25% of their maximum firing rate during 400 ms after gate opening, 400 ms before midrun photobeam breakage, and 400 ms after goal reaching. The percentage significantly decreased during midrun. B: average maximum firing rates during each task period for TRNs shown in A. The firing rates significantly increased for TRNs responding to gate opening (P < 0.01) and goal reaching (P < 0.05). Error bars indicate the standard error of the mean. C: proportions of TRNs reaching their maximum firing rate during the 3 task periods. The percentage of TRNs increased for the period after goal reaching (P < 0.01). D: average maximum firing rate for TRNs shown in C during each period. E: response profiles of subpopulations of TRNs, plotted as in Fig. 4. TRNs from all stages of learning are plotted.
As shown in Fig. 5, the increase in the average ensemble activity around gate opening could be attributed to an increase in the firing rate of TRNs responding to this event. The average firing rate of TRNs that reached at least 25% of their maximum firing rate during the period around gate opening increased from 3.7 Hz to 8.4 Hz (ANOVA, stage 1 vs. 9, P < 0.01, Fig. 5B, left). The firing rate of those that reached their maximum firing rate at gate opening showed a similar trend toward an increase (ANOVA, stage 1 vs. 9, P < 0.1, Fig. 5D, left). The proportions of TRNs reaching at least 25% of the maximum firing rate (Fig. 5A, left) or their maximum firing rate (Fig. 5C, left) around gate opening did not change significantly with learning.
In contrast to the increases in ensemble firing rates at start, the decrease in the average midtask firing rates that occurred progressively over the course of training appeared to be attributable to a decrease in the numbers of units with phasic firing during this time period. Only 2% of task-responsive neurons reached their maximum firing rate during the middle of the task (Fig. 5C, middle), and the percentage of neurons reaching at least 25% of their maximum firing rate during this period decreased from 42% to 4% across training (χ2 test, P < 0.01, Fig. 5A, middle). The decrease in firing rate did not reach significance (ANOVA, P = 0.1612, Fig. 5B, middle), perhaps because only a small sample of units exhibited responses during the midtask period.
Finally, we analyzed the activity at goal reaching and found that both the percentage of units responding and the firing rates of responsive units increased as training progressed. Though the percentage of neurons reaching 25% of their maximum around goal reaching did not increase with learning (Fig. 5A, right), the percentage of neurons reaching their maximum firing rate during the end of the trial runs (the 400 ms after reaching the goal) increased as training progressed from 15 to 44% (χ2 test, P < 0.01, Fig. 5C, right), indicating that more units showed robust goal-related responses late in training. The mean firing rate for neurons reaching 25% of their maximum firing rate during the postgoal period also increased from 5.4 Hz to 7.9 Hz (ANOVA, P < 0.05, Fig. 5B, right).
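The χ2 comparisons of proportions reported above can be illustrated with a minimal sketch for a 2 × 2 table. The unit counts below are hypothetical (only percentages are given in the text), the function name `chi2_2x2` is ours, and we assume a Pearson χ2 test without continuity correction, which may differ from the test variant actually used.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    All row and column totals must be nonzero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts of responsive units in an early and a late stage.
early_responsive, early_total = 21, 50  # 42%
late_responsive, late_total = 2, 50     # 4%
stat, p = chi2_2x2(
    early_responsive, early_total - early_responsive,
    late_responsive, late_total - late_responsive,
)
print(f"chi2 = {stat:.2f}, P = {p:.2g}")
```

With real data, the four cells would be the numbers of responsive and nonresponsive TRNs in the two stages being compared.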
Spike patterns of task-responsive neurons become less variable during training.
We next asked whether, within successive sessions, the firing patterns of individual TRNs across trials became less variable as training progressed. We correlated the firing pattern of each TRN during each trial with its firing pattern during every other trial in that session. The trial-to-trial correlations of the firing patterns increased from an average correlation of 0.10 in stage 1 (first training session) to 0.21 in stage 9 (end of training). Thus the spike patterns of individual TRNs became more consistent as training progressed (ANOVA, P < 0.0001, Fig. 6A). We found no such increase in consistency for spike activity during the pretrial baseline period (1.9 to 1.1 s before tone onset) during which the correlations were unchanged across training (ANOVA, P > 0.05).
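As an illustration of this consistency measure, the sketch below averages the Pearson correlation over all pairs of trials for one unit in one session. The binned spike counts and the function names are hypothetical, and the binning and any smoothing used in the actual analysis are not reproduced here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mean_trial_correlation(trials):
    """Average correlation over all pairs of trials for one unit in one session."""
    pairs = list(combinations(trials, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical binned spike counts (rows = trials, columns = time bins).
variable_unit = [[0, 3, 1, 0], [2, 0, 0, 1], [0, 1, 2, 0]]
consistent_unit = [[4, 1, 0, 3], [5, 1, 0, 4], [3, 0, 0, 3]]
print(mean_trial_correlation(variable_unit))
print(mean_trial_correlation(consistent_unit))
```

A unit whose firing profile repeats from trial to trial yields a mean correlation near 1, whereas a unit with unrelated profiles yields a value near or below 0.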
The variability in the run times of the animals also decreased with learning (Fig. 6A). The increases in firing consistency, like those of the patterning of ensemble activity, could have been related to this behavioral plasticity accompanying learning of the procedural task. To test whether training-related reductions in the behavioral trial-to-trial variability led to a tighter temporal alignment of spike firing to specific task-related actions (e.g., tighter time-locked spiking at run initiation) and thus made the neuronal responses related to those behavioral events appear to become larger simply by virtue of the tighter behavioral timing, we analyzed neuronal activity on a trial-by-trial basis. We first determined which 400-ms time epoch before and after each event contained the largest number of spikes during each trial. Then, for each unit, we determined the 400-ms time epoch (e.g., after gate opening) during which the unit fired at the maximum rate in the largest number of trials. Thus, if tighter time locking, within 400-ms windows, of a behavior can account for the increases in ensemble activity that we found during training, we should not find changes in numbers of units with responses to specific task events determined with this trial-by-trial method. However, we again found that the proportion of units most often responding around goal reaching on a per-trial basis increased with training (χ2 test, P < 0.001) and found a similar trend for those with responses at gate opening (χ2 test, P < 0.1), whereas the proportions of units with responses to midrun events decreased significantly (χ2 test, P < 0.05, Fig. 6D). These analyses indicate that increases in behavioral consistency across trials do not adequately account for the strengthening of the task-bracketing neuronal responses.
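The per-trial assignment described above can be sketched as follows. This is a minimal illustration, assuming spike counts have already been tallied in 400-ms perievent epochs; the epoch labels, the counts, and the function name `preferred_epoch` are hypothetical.

```python
from collections import Counter


def preferred_epoch(spike_counts_by_trial):
    """Epoch that most often holds a unit's per-trial maximum spike count.

    `spike_counts_by_trial` is a list of dicts mapping epoch name -> spike count.
    Within a trial, ties go to the epoch listed first in that dict.
    """
    winners = [max(trial, key=trial.get) for trial in spike_counts_by_trial]
    return Counter(winners).most_common(1)[0][0]


# Hypothetical counts for one unit over three trials.
trials = [
    {"post_gate": 5, "post_midrun": 1, "post_goal": 3},
    {"post_gate": 2, "post_midrun": 0, "post_goal": 4},
    {"post_gate": 6, "post_midrun": 2, "post_goal": 1},
]
print(preferred_epoch(trials))
```

Because each trial is scored separately within its own epochs, tighter trial-to-trial timing alone should not change which epoch a unit is assigned to.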
We analyzed trial-by-trial consistency further by examining the consistency of firing of single TRNs relative to others in recorded TRN ensembles (Fig. 6, B and C). We selected a random sample of 45 units for each training stage (the number required to match the sample sizes to the lowest number of units in any stage). For each unit, we examined firing rates in 800-ms time windows around each task event, including the pretone baseline window, to determine when in the trial the unit reached its maximum firing rate. We then ranked each unit recorded in each stage according to when it first reached 86% of its maximum firing rate, in relation to when the remaining 44 units reached their maximum firing rates. We then bootstrapped 1,000 times the trials used to calculate the maximum firing rate for each unit and repeated this ranking procedure for both the actual and shuffled data sets. Then, as an estimate of how much this order changed from bootstrap to bootstrap, we took the standard deviations of the ranking that each unit received for each bootstrap. Therefore, if a particular neuron responded to gate opening repeatedly across trials, it would receive a very consistent rank regardless of which trials were chosen and would therefore receive a low standard deviation over the bootstraps. In this way, we could use the standard deviations in rank order to estimate the consistency of unit firing; the lower the standard deviation in rank, the more consistently the unit fired across trials relative to other units recorded during the same training stage. We calculated the standard deviations of the shuffled data and found that only 1% of the units in the shuffled data set had standard deviations of rank lower than 8. We used this value to contrast with the actual data. 
We found that, on the first day of training, 44% of the TRNs had standard deviations of ranking less than this control level of 8, whereas on the last stage of training, 78% of the TRNs showed standard deviations of rank lower than 8 (χ2 test, P < 0.001, Fig. 6, B and C). Thus the numbers of TRNs with high consistency from trial to trial increased compared with chance levels across training.
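The rank-consistency procedure can be sketched as below. This is a simplified illustration under stated assumptions: trials are resampled with replacement for each unit, units are ranked by the first bin in which the resampled mean profile reaches 86% of its maximum, ties in rank are broken by unit order, and the population standard deviation is used. The function names and toy data are hypothetical, and the comparison with shuffled data is not included.

```python
import random
import statistics


def first_bin_reaching(profile, fraction=0.86):
    """Index of the first bin at or above `fraction` of the profile's maximum."""
    threshold = fraction * max(profile)
    return next(i for i, rate in enumerate(profile) if rate >= threshold)


def mean_profile(trials):
    """Average binned firing rate across trials."""
    return [sum(col) / len(trials) for col in zip(*trials)]


def rank_sd(units, n_boot=1000, seed=0):
    """SD of each unit's rank (by response onset bin) across bootstrap resamples."""
    rng = random.Random(seed)
    ranks = [[] for _ in units]
    for _ in range(n_boot):
        onsets = []
        for trials in units:
            resampled = [rng.choice(trials) for _ in trials]
            onsets.append(first_bin_reaching(mean_profile(resampled)))
        # Stable sort: units with equal onsets keep their original order.
        order = sorted(range(len(units)), key=lambda u: onsets[u])
        for rank, u in enumerate(order):
            ranks[u].append(rank)
    return [statistics.pstdev(r) for r in ranks]


# Hypothetical units: two fire reliably in one bin, one alternates between bins.
early = [[0, 5, 0, 0, 0]] * 6
late = [[0, 0, 0, 5, 0]] * 6
variable = [[5, 0, 0, 0, 0], [0, 0, 0, 0, 5]] * 3
sds = rank_sd([early, late, variable], n_boot=200, seed=1)
print(sds)
```

In this toy example the alternating unit receives the largest standard deviation of rank. Because ranks are relative, the reliable units also show some variability whenever the alternating unit changes position around them.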
Finally, we examined the consistency across animals in the changes in neuronal firing patterns during training. An example of the session-by-session (unstaged) spike activity recorded is shown in Supplemental Fig. S2B to illustrate the average neuronal firing pattern of TRNs for one rat (B08), whose neuronal sample size during acquisition was largest among the seven rats. As this rat learned and performed the task (Supplemental Fig. S2, C and D), an increasingly pronounced accentuation of the beginning-and-end pattern formed in its ensemble activity (Supplemental Fig. S2B). Later sessions in the chronic recordings tended to lack the necessary five units per day for entry into the activity plot, but even with these gaps, the development and retention of the pattern across training are evident, together with the generally highly consistent patterns of the activity as the task-bracketing pattern appeared.
Aspects of the task-bracketing pattern were clearly visible in the ensemble activity recorded for the majority of other rats, even with small numbers of units averaged for each learning stage (Supplemental Fig. S2A), and excluding the rat B08 did not change the development of training-induced ensemble activity (Supplemental Fig. S2E). These results demonstrate that the task-bracketing ensemble activity develops consistently in nearly all individual rats during the course of training.
The ensemble firing patterns of task-responsive neurons are sensitive to correct and incorrect trial performance.
In correctly performed trials, rats chose the baited arm accurately on the basis of the initial instruction cue and therefore received a chocolate reward at goal reaching. In incorrectly performed trials, the rats reached the noncued end arm and did not receive chocolate. We compared the firing patterns of the TRNs in correct and incorrect trials, including only trials that were incorrect because the rat turned the wrong direction, not trials in which the rat simply did not finish a run. The run times of correct and incorrect trials were not significantly different [2.95 ± 0.042 (mean ± SE) s and 3.09 ± 0.09 s, respectively, ANOVA, P = 0.10]. Despite these differences in behavioral accuracy and reward outcome in the correct and incorrect trials, we found that, after the rats had learned the task, the average normalized firing rates in correct and incorrect trials were similar until the final part of the maze runs, as the rats approached the goal (Fig. 7A). Then, as rats approached the goal and detected the reward, either by seeing it or by smelling it, more firing was observed in correct trials, suggesting that this differential activity may serve as a teaching signal, as proposed by the actor/critic model of learning (Barto 1995). We did not detect differences in the firing patterns for trials immediately following a correct trial compared with trials immediately following an incorrect trial in the well-trained animals (Fig. 7B).
Fig. 7.
Average firing rates of striatal neurons for different types of trials across learning. A: ensemble firing rate of TRNs, relative to pretrial baseline, averaged over stages 5 to 9 for correct (blue) and incorrect (red) trials. Shading indicates 95% confidence bounds. Note difference after goal reaching. B: average ensemble activity, as in A, for trials following a correct (blue) and incorrect (red) performance. Note comparable firing patterns for these sets of trials. C: ensemble activity of TRNs, as in A, for sessions wherein rats performed at or below 60% correct (red, n = 293) and sessions wherein rats performed at or above 70% correct (blue, n = 240). D: confusion matrices illustrating the performance of a naïve Bayesian decoder to discriminate the baseline, beginning, middle, and end trial epochs (see methods) for 5 individual rats (top 5, as labeled) and averages of those rats (bottom) for sessions in stage 5. This analysis was performed on rats with sessions during which more than 2 TRNs were recorded.
The rats did not always maintain above-chance performance for a given session, even after reaching the behavioral performance criterion of 72.5%. We therefore compared the ensemble firing patterns for sessions after criterion had been reached during which the rats performed poorly (at or below 60% correct; mean = 54%, range = 32–60%) and sessions during which the rats performed well (at or above 70% correct; mean = 74%, range = 70–80%). Interestingly, we found that, on days of good performance, there was an increase in the average firing rate not only at goal reaching but also before gate opening, relative to the firing rates at these times during poor-performance days (Fig. 7C). Thus both at the initial part of the maze runs and at the end of the runs differences in performance were accompanied by localized differences in ensemble firing in the dorsolateral striatum. There was also a significant difference in the speed at which rats ran the task (run times) on high-performance days compared with the speed on low-performance days (2.99 ± 0.03 s and 3.36 ± 0.03 s, respectively, ANOVA, P < 0.0001).
Striatal neurons encode behavioral responses to be performed.
The ensemble activity of TRNs brackets the task time by developing, with training, particularly robust activity at the beginning and end of the task and by suppressing activity in the middle. To explore what this pattern might represent, we constructed a naïve Bayesian decoder and tested whether we could decode different temporal segments of a trial (pretrial baseline, beginning, middle, and end) on the basis of trial-by-trial spike averages of individual TRNs (see methods). In five individual rats in which more than two TRNs were recorded simultaneously during the session of acquisition criterion (stage 5), the four trial epochs were decoded with relatively low levels of miscategorization (Fig. 7D, top 5 panels). This pattern is more robust in the confusion matrix averaged over the five rats (Fig. 7D, bottom). These results demonstrate that striatal neurons, as an ensemble, can represent different temporal segments of the T-maze task.
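A decoder of this general kind can be sketched as follows. This is a minimal illustration that assumes independent Gaussian likelihoods for each unit's firing rate and class priors taken from the training counts; the likelihood model, the cross-validation scheme, and all firing rates shown are assumptions for illustration, not the procedure or data described in methods.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-3):
    """Per-class log prior, means, and variances (one feature per unit)."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [
            max(sum((v - m) ** 2 for v in col) / len(rows), var_floor)
            for col, m in zip(cols, means)
        ]
        model[label] = (math.log(len(rows) / len(samples)), means, variances)
    return model


def predict(model, x):
    """Label with the highest log posterior under independent Gaussian likelihoods."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def confusion_matrix(model, samples, labels, classes):
    """Rows are true epochs, columns are decoded epochs (counts)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for x, y in zip(samples, labels):
        matrix[index[y]][index[predict(model, x)]] += 1
    return matrix


# Hypothetical firing rates (Hz) of 3 simultaneously recorded units, 3 trials per epoch.
epochs = ["baseline", "beginning", "middle", "end"]
train = {
    "baseline": [[2.0, 2.1, 1.8], [1.7, 2.4, 2.2], [2.3, 1.9, 2.0]],
    "beginning": [[8.5, 2.0, 3.1], [9.2, 2.6, 2.7], [7.9, 1.8, 3.4]],
    "middle": [[1.1, 0.9, 1.2], [0.8, 1.3, 0.9], [1.4, 1.0, 1.1]],
    "end": [[3.0, 7.8, 6.1], [2.6, 8.4, 5.5], [3.3, 7.1, 6.6]],
}
train_x = [x for e in epochs for x in train[e]]
train_y = [e for e in epochs for _ in train[e]]
test_x = [[2.1, 2.0, 2.1], [8.8, 2.2, 3.0], [1.0, 1.1, 1.0], [2.9, 7.9, 6.0]]
test_y = epochs

model = fit_gaussian_nb(train_x, train_y)
cm = confusion_matrix(model, test_x, test_y, epochs)
for row in cm:
    print(row)
```

Counts on the diagonal of the confusion matrix correspond to correctly decoded epochs; off-diagonal counts are miscategorizations.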
This ability to mark the task time may be a critical function of the striatal neurons, but it is not likely to be their only role for the acquisition and performance of the T-maze task. To perform the task correctly, the appropriate behavioral response (i.e., turning either right or left) had to be selected on the basis of the cue signaling the reward location. We therefore next tested whether we could decode cue-related or response-related representations in the activity of the TRNs that we recorded in the dorsolateral striatum. A subset (16.5%) of TRNs exhibited responses to the cue, but the proportion of these tone-responsive units did not change significantly during the course of training (Supplemental Fig. S3A), suggesting that the responsiveness to instruction cues did not change with learning. Of 367 TRNs included in the analyses for changes across learning stages, 23 (6%) showed cue-related activity that discriminated the 1- and 8-kHz tones (Table 1); this number was not significantly different from chance (χ2 tests, P > 0.05). However, units with tone-differential activity at goal reaching were more frequent in our sample than those expected by chance (n = 36, 10%). We did not observe significant numbers of cue-discriminating units in other perievent periods, and their numbers did not change across learning stages (Supplemental Fig. S3B).
Table 1.
Numbers of units with discriminative activity in each task period (out of 367 TRNs analyzed)
| Discrimination | Baseline | Tone Onset | Gate Opening | Start | Midrun | Turn Onset | Turn Offset | Goal Reaching |
|---|---|---|---|---|---|---|---|---|
| Tone | 8 | 23 | 24 | 5 | 21 | 12 | 22 | 36* |
| Turn Direction | 7 | 9 | 32* | 10 | 55* | 15 | 69* | 67* |
TRN, task-responsive neuron.
*Indicates a number that is significantly higher than chance (P < 0.05).
By contrast, the numbers of units that showed activity discriminating the right vs. left turns were significantly greater than chance around gate opening (n = 32, 9%), during midrun (n = 55, 15%), around turn offset (n = 69, 19%), and at goal reaching (n = 67, 18%, Table 1, χ2 tests, P < 0.05). Interestingly, there were increases in the proportions of units with turn-differential activity during the midrun and goal-reaching periods with training (Supplemental Fig. S3C). These results suggest that the identity of behavioral responses selected and performed in the trial may come to be represented in the sensorimotor region of the striatum during learning, but the identity of cues on which the behavioral selection is based may be sparsely represented by comparison.
We performed additional analyses with the naïve Bayesian decoder to test whether the striatal ensemble activity could code for the turning behavior more than the conditional tones, although the numbers of trials per type were relatively small (e.g., 20 trials of each tone type). With this sample, the analysis suggested that the decoder performed better at determining which direction (right or left) the animal turned in a given trial (Fig. 8B) than which tone (1 or 8 kHz) was presented, particularly during overtraining (Fig. 8A). The decoder discriminated the gate-opening event, which allowed rats to initiate locomotion, from the pretrial baseline period clearly and consistently as rats learned and performed the T-maze task (Fig. 8D). By contrast, the decoder was only marginally able to differentiate the baseline and tone onset periods (Fig. 8C). These results further indicate that neurons in the dorsolateral striatum may preferentially represent behavioral actions rather than the instruction cues that signal appropriate action selection.
Fig. 8.
Representation of behavioral responses in the ensemble activity of striatal neurons. A and B: confusion matrices for 1 rat (B08), showing how the decoder can discriminate trials with 1- vs. 8-kHz tones (A) and those with right vs. left turns (B) using average firing rates of individual TRNs during the 400-ms interval following tone onset. Each panel represents a learning stage, as labeled at left. C and D: confusion matrices showing performance of the decoder to determine whether the neuronal activity was recorded during the baseline or tone onset periods (C) and during the baseline or gate-opening periods (D).
Ensemble firing patterns are not tightly linked to profiles of running speed or acceleration.
The finding that activity was highest near the beginning and end of the maze runs raised the question of whether these patterns were simply reflective of changes in acceleration or running speed as the rats ran down the maze. Average speed was low at the beginning, was maximal at midrun, and decreased to zero at goal reaching (Fig. 9, A and B). Correspondingly, accelerations were large at the beginning and end of the maze runs and at turning (Fig. 9, A and C). We carried out a series of analyses to test these possibilities.
Fig. 9.
Correlations of neuronal activity with running speed and acceleration during T-maze performance. A: average speed (red), travelling distance (blue), and firing rate (green) in stages 1, 5, and 9. The speed was calculated for each rat and then was weighted in proportion to the numbers of TRNs recorded in the rat during the stage. B: average running speed, calculated as in A and plotted in 40-ms bins. C: average acceleration, calculated and plotted as in B. D: average normalized firing rate of TRNs. E: correlation between average speed and average firing rate across learning stages. Shaded areas represent 95% confidence intervals. F: correlation between average acceleration and average firing rate. G: average speed for trials in which midrun running speed was within 1 standard deviation of the mean calculated for stages 5 through 8. H: ensemble activity of TRNs during trials included in G. I and J: average ensemble activity of TRNs that were not correlated enough with speed to account for 5% of variability in firing rate (I) and of TRNs that were correlated enough with speed to account for 5% of variability in firing rate (J) in the entire trial or in the beginning-, middle-, or end-trial period.
We first correlated the average speeds and average firing rates across rats for all 800-ms perievent time windows for all stages of training. The highest speeds occurred roughly at the time period when firing was lowest, but the patterns were not complementary in detail (Fig. 9, B and D). Notably, the correlations between speed and ensemble firing were inconsistent across training. They started at r = 0.3 on stage 1 and fell to r = −0.54 on stage 7 (Fig. 9E). This result suggested that a relationship between speed and firing rate would have to reverse during the course of training.
As a further test of whether changes in speed across the training sessions might underlie the changes in ensemble firing patterns, we examined the changes in firing rates during the overtraining period for a subset of trials having similar running speeds. To do this, we determined the average speed and its standard deviation for sessions included in stages 5 through 8, the stages for which average speed and average firing rate were consistently negatively correlated (Fig. 9E). We then replotted speed across training stages, excluding trials in all stages in which speeds were lower than one standard deviation below this average (Fig. 9G). We matched the number of trials for samples at each stage by randomly selecting 246 trials for each stage (the lowest number of trials that fitted the criteria for any stage). Therefore, in this data set, running speed and numbers of trials were similar for all of the stages plotted. Nonetheless, the corresponding ensemble activities still changed across training, and they still developed the accentuated beginning-and-end activity of the entire data sample (Fig. 9H). This result suggested that the changes in ensemble firing patterns during training were dissociable from changes in the running speed of the rats; when the speed was held constant, the task-bracketing pattern still emerged over the course of learning.
In a third approach, we looked for potential correlations of firing rate with speed for single TRNs with a time lag giving the maximum correlation on all sessions of training, and we compared the size of the correlations with those in a shuffled data set treated the same way. A large number (304/336) of TRNs showed a modest, but significant, correlation with speed compared with shuffled data (Supplemental Fig. S4, A and B). Only 5% of the TRNs, however, had a correlation high enough to account for 5% of the variability in firing rates in the sample. For the units with the highest speed-firing rate correlations, speed accounted for as much as 10% of the variability in the activity; across all units in this data set, speed accounted on average for less than 1% of the variability found in firing rates of single TRNs. We performed comparable analyses for ensemble activity and acceleration, but we failed to find a significant correlation between these measures in any training stage (Fig. 9F).
Together, these results suggested that the emergence of the task-bracketing pattern of activity in the TRN ensembles did not depend principally on training-related changes in running speed or acceleration. However, relationships between TRN activity and behavioral parameters could be dependent on the “state” that changes systematically during maze runs. If the accentuated task-bracketing ensemble-activity patterns that we found were related to task start and end, or to the neural states that accompanied these behavioral segments, then it would be necessary to analyze the data separately for the early, middle, and late parts of the runs. We therefore reanalyzed the entire set of ensemble data, dividing the runs into three time epochs (Supplemental Fig. S5): a beginning period (400 ms before tone onset to 400 ms after gate opening), a middle period (from 400 ms after gate opening to 400 ms after turn offset), and an end period (from 400 ms after turn offset to 400 ms after goal reaching). We computed correlations separately for each of these epochs (Supplemental Fig. S5, A–C). We found that the correlations between firing rate and speed were consistently positive and high for the beginning period, with an average correlation of r = 0.8. For the middle period, the correlations were consistently low with an average correlation of r = −0.53. The end-period correlations were not consistent, starting highly positive and ending strongly negative after training.
Taking into account potential state changes during the runs by analyzing data separately for the beginning, middle, and end periods also increased the numbers of single TRNs with high enough correlations between firing rate and speed to account for 5% of the variability in firing rate. The correlated units increased from 5 to 24% (79/336).
Given this potential differentiation of the beginning, middle, and end periods, we compared the average per-neuron firing rates of the population of TRNs that were correlated with speed during the entire trial or separately for the beginning-, middle-, and end-trial periods (accounting for at least 5% of the variability in firing rate either over the entire trial or during any state) and TRNs that were not correlated with speed by this criterion. As shown in Fig. 9, I and J, the patterns of ensemble activity of the correlated units and noncorrelated units were similar, and both sets of neurons displayed the strong beginning-and-end activity observed in the entire population of recorded TRNs.
In sum, state changes related to the beginning and end of the task performance during individual trials remain potential correlates of the repatterning of activity that we observed during performance of the advance-cued maze task. We were unable, however, to relate closely the ensemble patterns to speed (or acceleration) per se. Given that salient task events, location, and speed are all highly correlated with one another, their interrelationships could prevent clear dissociation of their respective influences (see Fig. 9A and Supplemental Fig. S5D).
Comparison of firing patterns in different tasks.
Because the task-bracketing pattern that we found in this experiment was so pronounced, we compared this activity pattern with the activity recorded in the original version of the task, in which the instruction cues were presented in the middle of the maze runs and remained on until goal reaching (Barnes et al. 2005; Jog et al. 1999; Thorn et al. 2010) (Fig. 10A). The majority of recordings from the midrun-cueing version of the T-maze that we used for this comparison were collected concurrently with recordings reported here in the identical T-maze recording chamber. The behavioral data in the two studies were different in that in the original study the rats on average learned the task more rapidly and the percent correct reached higher asymptotes (Fig. 10B). The rats in the midtask-cueing task also ran the maze faster, particularly early in training (Fig. 10C), but these differences were not significant except in stage 7 (P = 0.034). The two cohorts were otherwise comparable.
Fig. 10.
Comparisons of activity of putative medium-spiny neurons recorded during training on 2 versions of the T-maze task. A: previous task version, in which the instruction cues were presented from immediately before turn until goal reaching. B and C: average percent-correct performance (B) and run times (C) of rats trained on the advance-cueing version (n = 7, purple) and those trained on the midrun-cueing version (n = 7, aqua). Arrows indicate the stage at which rats reached the criterion in both tasks. Error bars indicate standard error of the mean. D: average raw activity of TRNs during overtraining (stages 5-9) on the advance-cueing task (purple) and the midrun-cueing task (aqua). Inset: activity of the advance-cueing version aligned at tone offset (purple) with activity of the midrun-cueing version aligned at the click to indicate the beginning of a trial (aqua). E: activity of NTRNs during overtraining on the two task versions, plotted as in D. F: P values for differences in average raw firing rates of TRNs between the advance-cueing and midrun-cueing tasks shown in D. G: slope of the regression fitted to changes in average firing rates of TRNs for each 10-ms bin over the course of training on the advance-cueing task (purple) and the midrun-cueing task (aqua), plotted with 95% confidence bounds. Significant increases and decreases with training are indicated by positive (above the red dotted line) and negative (below the red dotted line) regression slopes.
We calculated averages of raw spike activity of TRNs and of NTRNs for each task across learning stages 5 through 9 (first 10 sessions with performance at or above 72.5% correct). With the advance cueing used in the experiments described here, there was a greatly exaggerated pattern of task bracketing, and this was produced by significantly lower ensemble activity during the middle of the runs (Fig. 10, D and F). Activity of NTRNs was also lower in the advance-cueing task, relative to the midrun-cueing task, in several bins during the midtask period, but these differences were barely significant at P < 0.05 (Fig. 10E). Thus the early cueing produced a remarkable increase in the task-bracketing pattern in TRNs but did not have such robust effects on the activity of NTRNs.
A major difference in the neural activity recorded in the midrun-cued and advance-cued versions of the T-maze task was in the TRN activity at the beginning of the maze runs; the increases in start activity occurred at different times in the two versions. In the midrun-cueing version, the first peak in activity occurred in response to the warning click (which was the earliest predictor of gate opening). By contrast, in the advance-cued version of the task, there was a small increase in activity at the onset of the instruction cues, but the first peak of activity occurred just after gate opening. The analysis in which we compared the time bins in which activity changed significantly throughout training on the two tasks showed clearly the difference in timing of peak activity at the beginning of task time (Fig. 10G). The peaks that occurred in the two task versions around task start were better matched when we aligned the warning click in the midrun-cueing task and the offset of the instruction cues in the advance-cueing task (Fig. 10D, insets), suggesting that the neural responses in both tasks are related to the beginning of the movement period.
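The bin-by-bin trend analysis summarized in Fig. 10G can be illustrated with a minimal sketch. It assumes an ordinary least-squares line fitted to the average firing rate in one time bin across training stages, with a bin counted as a significant increase or decrease when the 95% confidence bounds on the slope exclude zero. The function names, the example rates, and the default critical value (Student's t for 7 degrees of freedom, i.e., 9 stages) are assumptions made for illustration and are not taken from the analysis code used in the study.

```python
import math


def slope_with_bounds(stages, rates, t_crit=2.365):
    """Least-squares slope of firing rate vs. training stage, with bounds.

    t_crit is the two-sided 95% Student-t critical value. The default
    (2.365) corresponds to 7 degrees of freedom, i.e., 9 training stages;
    this is an assumption of the sketch and must match len(stages) - 2.
    """
    n = len(stages)
    mean_x = sum(stages) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in stages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stages, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares gives the standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(stages, rates))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope - t_crit * se, slope + t_crit * se


def classify_bins(stages, rates_by_bin, t_crit=2.365):
    """Label each time bin by whether its slope bounds exclude zero."""
    labels = []
    for rates in rates_by_bin:
        _, lower, upper = slope_with_bounds(stages, rates, t_crit)
        if lower > 0:
            labels.append("increase")
        elif upper < 0:
            labels.append("decrease")
        else:
            labels.append("no change")
    return labels


# Hypothetical average rates (Hz) in three bins across stages 1-9.
stages = list(range(1, 10))
rising = [2.0, 2.5, 3.1, 3.4, 4.2, 4.4, 5.1, 5.3, 6.0]
falling = list(reversed(rising))
flat = [5, 6, 5, 4, 5, 6, 5, 4, 5]
labels = classify_bins(stages, [rising, falling, flat])
```

In this sketch, each bin is treated independently, so no correction for multiple comparisons across bins is applied; whether and how such a correction was used in the original analysis is not stated in this section.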
Notably, ensemble TRN activity in the advance-cued version declined sharply following the initial peak, until an increase around the point at which the animals began their turns. We tested whether this pattern of activity represents neural processes to hold information about the instruction cues through the midtask period until the appropriate behavioral responses (turning right or left at choice point) are executed. We found a subproportion of striatal units with activity that was significantly higher or lower, relative to that during the pretrial baseline period, throughout the trial time following cue onset. A majority of these units were NTRNs whose firing was suppressed during the entire run time, even after the completion of the turn. Few units showed sustained enhancement bridging the cue presentation and turn execution. Among these units, activity selective for one of the two tones and/or turning directions was rare.
DISCUSSION
Training in simple maze tasks is a classic paradigm for analyzing the behavioral correlates of learning in rodents. Large-scale reconfiguring of ensemble activity occurs in the sensorimotor striatum during such learning, leading to a task-bracketing pattern of accentuated firing at the beginning and end of maze runs as rats are trained on T-maze tasks in which turning directions are cued just before the turn (Barnes et al. 2005; Kubota et al. 2009; Thorn et al. 2010). Here we introduced a new T-maze task in which the cues signaling the rewarded end arm are presented before the rats begin to run. This advance-cueing protocol allowed reallocation of attention and preplanning and at the same time imposed a memory load. Thus the cognitive demands of the maze task were manipulated, but the sensorimotor demands were not changed. With this advance-cueing protocol, even stronger task-bracketing ensemble activity emerged during training than with midtask cueing. A highly significant depression of activity occurred in the sensorimotor striatum during midrun. We suggest that the exaggerated task-bracketing activity and depressed midprocedure activity patterns found with advance cueing could reflect heightened chunking of individual behavioral acts to allow the appropriate maze runs to be executed under the appropriate context as single units of behavior.
Numbers of responsive units and their firing rates change during the development of the task-bracketing activity pattern.
The advance-cueing version of the T-maze task allowed us to address three issues that are critical for understanding the neural processing that occurs in the sensorimotor striatum during procedural learning and habit formation. First, the advance-cueing protocol provided temporal separation of trial start and run start. The early increase in TRN ensemble activity occurred after gate opening (the signal to the rats that they could start to run toward the goal), did not immediately follow the onset of the instruction cues, and did not occur preferentially to one of the two cues, suggesting that the beginning activity did not mark the earliest predictor of reward, but rather, the earliest predictor of the movement period. This finding suggests as a working hypothesis that, with training, the complex sequence of movements needed to traverse the maze may have come to be represented as a composite of the entire movement sequence as a result of transbasal ganglia processing involving the sensorimotor striatum. The volley of activity at task start could allow the tonic inhibition in the striatum to be overcome and the sequence to occur. If this activity were diminished, tonic inhibition of motor output pathways by the basal ganglia might not be relieved. This suggestion would fit well with interpretations of the particular difficulty that patients with Parkinson's disease have in initiating sequences of movements, reflected in their difficulty with sequential but not simple reaction-time tasks (Benecke et al. 1987; Doyon 2008). This possibility further suggests the working hypothesis that advance cueing, if it generates a robust activity volley in patients with Parkinson's disease, could help trigger and execute appropriate motor acts signaled by the cue.
It is unclear whether the end activity that we observed marks the end of the movement period or the end of the trial or is associated directly or indirectly with the reward itself. The activity increased as rats approached the goal during both rewarded and unrewarded trials, suggesting that reaction to available reward itself was not the primary behavioral correlate of the neural activity but leaving open the possibility that anticipation or prediction of reward could nevertheless contribute to the signal. We found little evidence, however, of such anticipatory bias in the responses of the TRNs at earlier times during the maze runs, and we found relatively few responses to the instruction cues themselves, even though they were the only indicators of which way to turn to succeed in obtaining chocolate reward. One further possibility is that, as the task was learned and the runs became more and more consistent, the end activity reflected some interval timing mechanism. The striatum has been directly implicated in such interval timing (Meck et al. 2008). By this view, simply extending the length of the track, changing the length randomly, or inserting delays of reward delivery would be valuable further experimental manipulations to clarify this issue.
The augmented firing at task start appeared to be attributable mainly to increases in the firing rates of individual units in response to gate opening, whereas the increase found after turn completion and near goal reaching included both increases in per-neuron firing rates and increases in the numbers of units recruited. We were unable definitively to follow the firing of identified units through the weeks of training, but these results raise the possibility that different or at least incompletely overlapping neural mechanisms contributed to the neural plasticity at task start and task end, despite their relatively similar time courses of development.
Our analyses presented here focused on neuronal firing patterns in relation to timing and sequences of task events, but it is possible that striatal activity represents spatial, not temporal, information necessary for task performance. In fact, spiking of the majority of TRNs (304 of 336) was significantly correlated with rats' spatial location in the maze (Supplemental Fig. S6). However, our task was not designed to disambiguate spatial and temporal coding. Many of the events were defined spatially, and others tended to occur at constant locations. Redish and his colleagues recently reported that the dorsal striatum can encode the location of the rat in their maze task, but that the robustness of encoding was strongly selective to the portion of the maze where the rat made a series of turns, relative to other parts of the maze where the rat simply ran through (van der Meer et al. 2010). Moreover, although the rat's location changes only minimally during the period from baseline to gate opening, many striatal neurons show robust phasic responses only at particular times within this period. These findings are compatible with the view that neurons in the striatum do not encode place, but rather, the order of events in a task (Berke et al. 2009; Schmitzer-Torbert and Redish 2008) or the association of salient events and spatial location.
Changes in running speed and acceleration are not uniformly dominant factors influencing development of the learning-related activity patterns.
A second advantage of the advance-cueing protocol is that we could analyze running speed and acceleration without the insertion of the cue at midrun. A strong correlation between the firing rates in some neurons in the medial striatum and running speed has been reported (Eschenko and Mizumori 2007; Yeshenko et al. 2004), and we found such neurons also in our recordings in the dorsolateral striatum. About 5% of TRNs had firing patterns that could account for significant variability in speed. Moreover, the ensemble activity of the TRNs across the maze runs did not show a consistent correlation with running speed across training and had no correlation with acceleration. To test further the possible links between neural firing and locomotion, we analyzed neural activity in trials with similar running speeds. We found that the task-bracketing pattern persisted even in this data set with little variability in running speed. We further analyzed data sets in which neurons with highly significant correlations with running speed were omitted. Again, we found that the task-bracketing pattern of ensemble activity still appeared. Only by assuming that the early, middle, and late epochs of the trial runs were accompanied by changes in psychomotor state that could affect relationships between neural activity and behavioral performance did we find speed and firing rates to be well correlated (Supplemental Fig. S5).
Our previous studies of LFP activity in the striatum support such a possibility. We have found previously that θ-band LFP oscillatory activity in the striatum, including in the regions sampled in this study, is modulated strongly during the maze runs (DeCoteau et al. 2007a; DeCoteau et al. 2007b), as are other frequencies (Thorn and Graybiel 2007). Although the LFP power in the θ-band was not as tightly correlated with running speed as was that recorded simultaneously from the dorsal CA1 field of the hippocampus, the striatal θ-activity was highly patterned, peaking at the midrun period on the standard task just as spike activity in the dorsolateral striatum weakened after training (DeCoteau et al. 2007a; DeCoteau et al. 2007b). Moreover, the LFP activity, and its relation to other frequencies of oscillatory activity in the striatum, was modulated during learning (DeCoteau et al. 2007a; DeCoteau et al. 2007b; Thorn and Graybiel 2007; Tort et al. 2008). Together, these results strongly suggest that there could be changes in neural states reflecting continuously varying behavioral and cognitive demands during the task time and across learning. However, our experiments did not disambiguate possible differences in state at the start and end of the runs from the fact that the animals were not running at the beginning and end of the runs.
Our findings also do not settle the question of whether the appropriate level of representation to propose is a low-level one related to getting a motor program started and ended, or a higher-level representation of a sequence to be released so that other brain and spinal cord mechanisms can drive the actuators. It is impressive, however, that early during training correlations with running speed were positive and relatively high, whereas after overtraining they were negative and relatively high. Clearly, other behavioral changes during learning could also have contributed to the emergence of the task-bracketing patterns, a general but important caveat for findings related to procedural learning in which multiple behaviors can undergo alterations favoring improved performance. It is with these qualifications in mind that we favor the task-bracketing pattern as being an experience-dependent indication of behavioral chunking. Such representations could be related to state-level neural changes (which might include oscillatory activities themselves related to task start and task end as well as to speed) but seem unlikely to reflect speed as a dominant factor.
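One of the speed controls described in this section, recomputing the ensemble activity after omitting units whose firing was correlated with running speed, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: it uses a plain Pearson correlation between each unit's trial-by-trial firing rate and running speed, and it omits units by a fixed threshold on |r| (the hypothetical parameter `r_threshold`) rather than by a significance test. All names and example values are invented for the sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def ensemble_without_speed_units(unit_profiles, unit_trial_rates,
                                 trial_speeds, r_threshold=0.5):
    """Average per-bin firing profiles over units not tied to running speed.

    unit_profiles[i]    : per-bin mean firing rate of unit i across the run
    unit_trial_rates[i] : per-trial mean firing rate of unit i
    trial_speeds        : per-trial mean running speed
    r_threshold         : hypothetical cutoff on |r| (assumption of this sketch)
    """
    kept = [
        profile
        for profile, rates in zip(unit_profiles, unit_trial_rates)
        if abs(pearson_r(rates, trial_speeds)) < r_threshold
    ]
    if not kept:
        raise ValueError("all units were omitted by the speed criterion")
    n_bins = len(kept[0])
    return [sum(p[b] for p in kept) / len(kept) for b in range(n_bins)]


# Hypothetical data: 3 units, 4 trials, 3 time bins (start, midrun, end).
trial_speeds = [10, 20, 30, 40]
unit_trial_rates = [
    [20, 40, 60, 80],  # unit 0: tracks speed exactly, so it is omitted
    [5, 1, 1, 5],      # unit 1: uncorrelated with speed
    [3, 4, 4, 3],      # unit 2: uncorrelated with speed
]
unit_profiles = [[1, 9, 1], [8, 2, 6], [6, 2, 8]]
ensemble = ensemble_without_speed_units(unit_profiles, unit_trial_rates, trial_speeds)
```

If the start and end bins of the resulting ensemble profile remain high relative to the midrun bin after the speed-related units are dropped, as they do in this toy example, the bracketing pattern cannot be attributed to those units alone.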
Greater suppression of striatal firing during midrun in the advance-cueing task may represent stronger chunking of behavioral sequence.
The advance-cueing task version also allowed us to study the effect of providing the animals, in advance of their performance of the task, all the necessary information to preplan the entire motor sequence for a given trial. This is a standard procedure in many primate experiments, and the use of such protocols has led to major findings related to the encoding of set, motor planning, and the organization of sequential behavior (Cisek and Kalaska 2004; Matsuzaka et al. 2007; Mushiake et al. 2006; Shima and Tanji 2000; Wise and Kurata 1989). In the maze task, the advance cueing was critical in changing the cognitive demands of the task. Preplanning was favored, and decision making on the basis of midrun instructions was removed. However, the animals did have a working-memory load due to the time between early cueing and turning, so that perhaps retrieval of the decision making was maintained midtask. What we observed by recording in the sensorimotor striatum was a sustained baseline-level ensemble firing during the memory period, bracketed by early and late activity peaks. The midtask suppression of firing appeared to be the main contributor to the strikingly strong beginning-and-end pattern relative to that observed with midtask cueing.
What could account for this exaggerated pattern? In recordings during working memory tasks, it is common to find neocortical neurons that maintain elevated activity throughout a cue-response delay period (Chang et al. 2002; Constantinidis et al. 2001; Fuster and Alexander 1971; Ichihara-Takeda et al. 2010; Kubota and Niki 1971; Warden and Miller 2007). Such activity has also been observed in the primate caudate nucleus and rodent dorsomedial striatum during the performance of working memory tasks (Chang et al. 2002; Histed et al. 2009), consistent with evidence for the role of the striatum, particularly the dorsomedial zone that has known connections with the hippocampal circuits, in working memory processes (Cools et al. 2008; DeCoteau et al. 2004; Kesner and Gilbert 2006; Landau et al. 2009). Here, we found the opposite: the ensemble TRN activity during the working memory period decreased progressively in the sensorimotor striatum as rats acquired the advance-cued T-maze task. We found only a small number of TRNs with activity bridging the cue and response periods. The spike activity of these neurons, with either enhancement or inhibition during the memory-holding period, was not selective to one of the two instruction cues or to the directions of turning to be executed, and the altered activity did not return to the precue level upon completion of the turns. It is not clear, therefore, that their activity represents a form of short-term memory process required to perform this version of the T-maze task.
Instead of holding a specific memory within the striatum, the TRNs that showed diminished activity could be a subset of neurons releasing firing of working memory circuits elsewhere. For example, they could comprise a subset of direct pathway neurons encouraging firing of prefrontal working memory circuits. If so, the decreased striatal firing during the midruns could represent a memory-holding pattern in the sensorimotor striatum. Another possibility is that the diminution in striatal firing might itself reflect the activation of working memory circuits that inhibit striatal firing and that a rebound from such inhibition could contribute to heightened activity at the end of maze runs. This sort of circuit-level activity could permit the chunking functions of sensorimotor corticostriatal circuits to be more strongly engaged. Notably, during the period of midrun suppression, there was a brief, moderate increase in activity at the onset of turns. This activity peak might represent retrieval of working memory about the cue identity necessary for selection and performance of the instructed behavior, or about the turning action already selected.
Alternatively, this midrun decrease in activity may reflect changes in attention with learning. Early during learning, rats may attend to all aspects of the maze, but later in learning, they may focus on what is relevant, such as the gate opening and goal reaching. It is important to note, however, that the pattern of activity emphasizing the start and end of maze runs was found, although to a lesser degree, in earlier versions of our T-maze task in which instruction cues were presented, requiring a high level of attention, in the middle of the maze runs. Thus midrun suppression of MSN activity cannot entirely be explained by the level of attention, even though attention could be a contributing factor.
Despite the midtask depression of ensemble activity that we observed, the striatal neurons were not silent during this period of the runs. We found spike activity that differentiated the behavioral responses of the rat in significantly greater numbers of TRNs than predicted by chance during the full series of perievent periods from gate opening, midrun, and turn offset to goal reaching, and the proportions of these units increased for the midrun and goal-reaching periods with training. This set of findings suggests that working memory for selected (gate opening and midrun) and completed (turn offset and goal reaching) behaviors may have been maintained by these units. Activity selective for the two cues was rare, however, raising the possibility that cue-related working memory may be mediated by other brain regions (e.g., by the prefrontal cortex and the medial striatum). These hypotheses were further supported by our finding, which remains tentative because of the small numbers of trials available for this analysis method, that the Bayesian decoder could discriminate the upcoming turning direction, but not the two tone cues, on the basis of the response of sets of individual TRNs to tone onset.
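As an illustration of the kind of decoding referred to above, the following minimal sketch classifies the upcoming turn direction from the spike counts of a small set of units in a window after tone onset. The details of the decoder used in the study are not given in this section, so the sketch makes its own assumptions: units are treated as independent Poisson processes, the class priors are equal unless supplied, and all function names and spike counts are hypothetical.

```python
import math


def fit_poisson_rates(counts_by_class):
    """Estimate the mean spike count of each unit for each class.

    counts_by_class maps a label (e.g., "left") to a list of trials,
    each trial being a list of spike counts, one per unit.
    """
    rates = {}
    for label, trials in counts_by_class.items():
        n_units = len(trials[0])
        rates[label] = [
            # A small floor keeps log(rate) finite for silent units.
            max(sum(trial[u] for trial in trials) / len(trials), 1e-3)
            for u in range(n_units)
        ]
    return rates


def decode(counts, rates, priors=None):
    """Return the label with the highest posterior under independent Poisson units."""
    labels = list(rates)
    if priors is None:
        priors = {label: 1.0 / len(labels) for label in labels}
    log_posterior = {}
    for label in labels:
        total = math.log(priors[label])
        for k, lam in zip(counts, rates[label]):
            # Poisson log-likelihood: k*log(lam) - lam - log(k!)
            total += k * math.log(lam) - lam - math.lgamma(k + 1)
        log_posterior[label] = total
    return max(log_posterior, key=log_posterior.get)


# Hypothetical training data: spike counts of 2 units on 3 trials per turn direction.
training = {
    "left": [[8, 1], [7, 2], [9, 1]],
    "right": [[2, 6], [1, 7], [2, 8]],
}
rates = fit_poisson_rates(training)
predicted_a = decode([8, 1], rates)
predicted_b = decode([1, 7], rates)
```

In practice, decoding accuracy would be assessed on trials held out from the rate estimation (for example, by leave-one-out cross-validation) and compared with chance; with small numbers of trials, as noted above, such estimates are necessarily uncertain.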
Judging by the behavioral data, the advance-cued task was demanding for the rats, more so than the original task version with midrun cueing in which performance accuracy climbed faster and reached higher asymptotes. However, the restructuring of striatal firing patterns that occurred with advance cueing produced an even stronger task-bracketing pattern than that with the midrun-cueing version. If anything, then, task difficulty brought about a stronger task-bracketing pattern. For example, it is possible that the stronger task-bracketing pattern found in this task version could reflect the animals performing at subasymptotic levels, continuing for longer times to try to acquire this challenging task version. Because the behavior had not been fully acquired, a larger volley of activity might have been necessary to initiate the maze-running sequence. If so, an ensemble activity pattern similar to that found in the midrun-cueing task might have developed with additional training. However, in rats trained on the midrun-cueing version, the task-bracketing pattern was stronger during overtraining than during acquisition, suggesting that the pattern could even be stronger in a situation in which the task was well learned, not when the task was being acquired. It is more likely that differences in cognitive demands between the two versions, ranging from preplanning to memory load to attention, not differences in task difficulty per se, contributed to the differential prominence of the task-bracketing ensemble activity in the advance-cueing task.
If behavioral chunking is advantageous for the performance of procedural tasks, why did we find the beginning-and-end activity pattern in both correct and incorrect trials? The formation of such task-boundary patterns could reduce conscious attention and effort as a sequence of behaviors is performed but not necessarily lead to error-free performance. An error could still occur during transmission of command volleys in circuits leading out from the striatum and other executive regions with which the basal ganglia are interconnected. Alternatively, a chunked behavior could be triggered inappropriately. In either case, we expect to see comparable activity representing task boundary even in trials with incorrect responses.
Exploration and exploitation during development of task representation by single striatal neurons.
As the neurons in the sensorimotor striatum acquired learning-related activity patterns, their spike firing became more consistent from trial to trial. One interpretation of this transformation is that early in training the striatal neurons, as an ensemble, could be activated by a variety of sensory and behavioral events as the rats searched for critical components of the action sequence that led to reward. With training, the activity of each unit might have begun to code, and thus become time-linked to, a specific task-related event or set of events occurring repeatedly and consistently trial after trial. This view is consistent with an explore-exploit model of the neuroplasticity accompanying the behavioral learning of the task (Barnes et al. 2005) and accords with findings for the birdsong field that the basal ganglia may be a source of variability necessary to permit song learning (Doya and Sejnowski 1995; Kao et al. 2005; Olveczky et al. 2005; Tumer and Brainard 2007).
Neurons lacking phasic task-related responses may encode task context.
Remarkably, the ensemble activity of the neurons that did not show phasic responses in task (the NTRNs) increased during the pretask baseline period as training progressed. During this period, the rats were prevented from running but could anticipate the next run. This baseline change over training stood in sharp contrast to the consistent prerun baseline activity of the TRNs. The increase of NTRN baseline activity suggests that these neurons, which apparently did not participate in the neural representation of the task, may nevertheless have encoded critical aspects of the behavioral procedure, such as anticipation of task start acquired with training. We were unable, in the configuration of the task used, to record adequately during the consumption period, when after a successful trial the rats could eat the chocolate. We therefore could have missed neural activity changes related to such postrun behavior in both the TRNs and the NTRNs. The task had clear boundaries between trials. Rats returned to the start region after the end of each trial, either spontaneously or guided by the experimenter, and an intertrial interval was given before the next trial. This break could have interrupted potential rehearsal or playback, phenomena well studied for the hippocampus and noted in a minority of neurons in the nucleus accumbens (Davidson et al. 2009; Pennartz et al. 2009). These interpretations are interesting, as they suggest the possibility that there may be a division in the sensorimotor striatum between the encoding of the sequence of events that occur during the maze runs proper and the encoding of the context for these in-task sequences. With the aid of genetic manipulations, it soon should be possible to determine whether these two forms of representation are divided between the known classes of projection neurons in the striatum.
GRANTS
This work was supported by National Institutes of Health Grant R01 MH060379, Office of Naval Research Grant N000140410208, and the Stanley H. and Sheila G. Sydney Fund to A. Graybiel, and also by National Institutes of Health Grants R01 MH071847 and DP1 OD003646-01 to E. Brown, and T32 NS048005 to C. Stamoulis.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
ACKNOWLEDGMENTS
We are grateful to Daniel Gibson, Patricia Harlan, and Henry Hall for help.
REFERENCES
- Balleine BW, Liljeholm M, Ostlund SB. The integrative function of the basal ganglia in instrumental conditioning. Behav Brain Res 199: 43–52, 2009 [DOI] [PubMed] [Google Scholar]
- Barnes T, Kubota Y, Hu D, Jin DZ, Graybiel AM. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature 437: 1158–1161, 2005 [DOI] [PubMed] [Google Scholar]
- Barto AG. Adaptive critics and the basal ganglia. In: Models of Information Processing in the Basal Ganglia, edited by Houk J, Davis J, Beiser D. Cambridge, MA: MIT Press, 1995, p. 215–232 [Google Scholar]
- Benecke R, Rothwell JC, Dick JP, Day BL, Marsden CD. Disturbance of sequential movements in patients with Parkinson's disease. Brain 110: 361–379, 1987 [DOI] [PubMed] [Google Scholar]
- Berke JD, Breck JT, Eichenbaum H. Striatal versus hippocampal representations during win-stay maze performance. J Neurophysiol 101: 1575–1587, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boyd LA, Edwards JD, Siengsukon CS, Vidoni ED, Wessel BD, Linsdell MA. Motor sequence chunking is impaired by basal ganglia stroke. Neurobiol Learn Mem 92: 35–44, 2009 [DOI] [PubMed] [Google Scholar]
- Chang JY, Chen L, Luo F, Shi LH, Woodward DJ. Neuronal responses in the frontal cortico-basal ganglia system during delayed matching-to-sample task: ensemble recording in freely moving rats. Exp Brain Res 142: 67–80, 2002 [DOI] [PubMed] [Google Scholar]
- Cisek P, Kalaska JF. Neural correlates of mental rehearsal in dorsal premotor cortex. Nature 431: 993–996, 2004 [DOI] [PubMed] [Google Scholar]
- Cohen A, Ivry RI, Keele SW. Attention and structure in sequence learning. J Exp Psychol Learn 16: 17–30, 1990 [Google Scholar]
- Constantinidis C, Franowicz MN, Goldman-Rakic PS. Coding specificity in cortical microcircuits: a multiple-electrode analysis of primate prefrontal cortex. J Neurosci 21: 3646–3655, 2001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cools R, Gibbs SE, Miyakawa A, Jagust W, D'Esposito M. Working memory capacity predicts dopamine synthesis capacity in the human striatum. J Neurosci 28: 1208–1212, 2008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Costa RM, Cohen D, Nicolelis MA. Differential corticostriatal plasticity during fast and slow motor skill learning in mice. Curr Biol 14: 1124–1134, 2004 [DOI] [PubMed] [Google Scholar]
- Davidson TJ, Kloosterman F, Wilson MA. Hippocampal replay of extended experience. Neuron 63: 497–507, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- DeCoteau WE, Hoang L, Huff L, Stone A, Kesner RP. Effects of hippocampus and medial caudate nucleus lesions on memory for direction information in rats. Behav Neurosci 118: 540–545, 2004 [DOI] [PubMed] [Google Scholar]
- DeCoteau WE, Thorn C, Gibson DJ, Courtemanche R, Mitra P, Kubota Y, Graybiel AM. Oscillations of local field potentials in the rat dorsal striatum during spontaneous and instructed behaviors. J Neurophysiol 97: 3800–3805, 2007a [DOI] [PubMed] [Google Scholar]
- DeCoteau WE, Thorn CA, Gibson DJ, Courtemanche R, Mitra P, Kubota Y, Graybiel AM. Learning-related coordination of striatal and hippocampal theta rhythms during acquisition of a procedural maze task. Proc Natl Acad Sci USA 104: 5644–5649, 2007b [DOI] [PMC free article] [PubMed] [Google Scholar]
- Doya K, Sejnowski TJ. A novel reinforcement model of birdsong vocalization learning. In: Advances in Neural Information Processing Systems, Vol 7, edited by Tesauro G, Touretzky DS, Leen TK. Cambridge, MA: MIT Press, 1995, p. 101–108 [Google Scholar]
- Doyon J. Motor sequence learning and movement disorders. Curr Opin Neurol 21: 478–483, 2008 [DOI] [PubMed] [Google Scholar]
- Eschenko O, Mizumori SJ. Memory influences on hippocampal and striatal neural codes: effects of a shift between task rules. Neurobiol Learn Mem 87: 495–509, 2007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fujii N, Graybiel A. Representation of action sequence boundaries by macaque prefrontal cortical neurons. Science 301: 1246–1249, 2003 [DOI] [PubMed] [Google Scholar]
- Fujii N, Graybiel A. Time-varying covariance of neural activities recorded in striatum and frontal cortex as monkeys perform sequential-saccade tasks. Proc Natl Acad Sci USA 102: 9032–9037, 2005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fuster JM, Alexander GE. Neuron activity related to short-term memory. Science 173: 652–654, 1971 [DOI] [PubMed] [Google Scholar]
- Gage GJ, Stoetzner CR, Wiltschko AB, Berke JD. Selective activation of striatal fast-spiking interneurons during choice execution. Neuron 67: 466–479, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graybiel AM. The basal ganglia and chunking of action repertoires. Neurobiol Learn Mem 70: 119–136, 1998 [DOI] [PubMed] [Google Scholar]
- Graybiel AM. Habits, rituals and the evaluative brain. Annu Rev Neurosci 31: 359–387, 2008 [DOI] [PubMed] [Google Scholar]
- Histed MH, Pasupathy A, Miller EK. Learning substrates in the primate prefrontal cortex and striatum: sustained activity related to successful actions. Neuron 63: 244–253, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ichihara-Takeda S, Takeda K, Funahashi S. Reward acts as a signal to control delay-period activity in delayed-response tasks. Neuroreport 21: 367–370, 2010 [DOI] [PubMed] [Google Scholar]
- Jin X, Costa RM. Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature 466: 457–462, 2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jog M, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM. Building neural representations of habits. Science 286: 1745–1749, 1999 [DOI] [PubMed] [Google Scholar]
- Kao MH, Doupe AJ, Brainard MS. Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature 433: 638–643, 2005 [DOI] [PubMed] [Google Scholar]
- Kesner RP, Gilbert PE. The role of the medial caudate nucleus, but not the hippocampus, in a matching-to sample task for a motor response. Eur J Neurosci 23: 1888–1894, 2006 [DOI] [PubMed] [Google Scholar]
- Kimchi EY, Laubach M. The dorsomedial striatum reflects response bias during learning. J Neurosci 29: 14891–14902, 2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Koch I, Hoffmann J. Patterns, chunks, and hierarchies in serial reaction-time tasks. Psychol Res 63: 22–35, 2000 [DOI] [PubMed] [Google Scholar]
- Kubota K, Niki H. Prefrontal cortical unit activity and delayed alternation performance in monkeys. J Neurophysiol 34: 337–347, 1971 [DOI] [PubMed] [Google Scholar]
- Kubota Y, Liu J, Hu D, DeCoteau WE, Eden UT, Smith AC, Graybiel AM. Stable encoding of task structure coexists with flexible coding of task events in sensorimotor striatum. J Neurophysiol 102: 2142–2160, 2009
- Landau SM, Lal R, O'Neil JP, Baker S, Jagust WJ. Striatal dopamine and working memory. Cereb Cortex 19: 445–454, 2009
- Matsuzaka Y, Picard N, Strick PL. Skill representation in the primary motor cortex after long-term practice. J Neurophysiol 97: 1819–1832, 2007
- Meck WH, Penney TB, Pouthas V. Cortico-striatal representation of time in animals and humans. Curr Opin Neurobiol 18: 145–152, 2008
- Miller GA. The magic number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 63: 81–97, 1956
- Mushiake H, Saito N, Sakamoto K, Itoyama Y, Tanji J. Activity in the lateral prefrontal cortex reflects multiple steps of future events in action plans. Neuron 50: 631–641, 2006
- Olveczky BP, Andalman AS, Fee MS. Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biol 3: e153, 2005
- Paxinos G, Watson C. The Rat Brain in Stereotaxic Coordinates, Third Ed. San Diego: Academic Press, 1997
- Pennartz CMA, Berke JD, Graybiel AM, Ito R, Lansink CS, van der Meer M, Redish AD, Smith KS, Voorn P. Corticostriatal interactions during learning, memory processing, and decision making. J Neurosci 29: 12831–12838, 2009
- Quian Quiroga R, Panzeri S. Extracting information from neuronal populations: information theory and decoding approaches. Nat Rev Neurosci 10: 173–185, 2009
- Rosenbaum DA, Kenny SB, Derr MA. Hierarchical control of rapid movement sequences. J Exp Psychol Hum Percept Perform 9: 86–102, 1983
- Sakai K, Kitaguchi K, Hikosaka O. Chunking during human visuomotor sequence learning. Exp Brain Res 152: 229–242, 2003
- Schmitzer-Torbert NC, Redish AD. Task-dependent encoding of space and events by striatal neurons is dependent on neural subtype. Neuroscience 153: 349–360, 2008
- Shima K, Tanji J. Neuronal activity in the supplementary and presupplementary motor areas for temporal organization of multiple movements. J Neurophysiol 84: 2148–2160, 2000
- Smith AC, Frank LM, Wirth S, Yanike M, Hu D, Kubota Y, Graybiel AM, Suzuki WA, Brown EN. Dynamic analysis of learning in behavioral experiments. J Neurosci 24: 447–461, 2004
- Stalnaker TA, Calhoon GG, Ogawa M, Roesch MR, Schoenbaum G. Neural correlates of stimulus-response and response-outcome associations in dorsolateral versus dorsomedial striatum. Front Integr Neurosci 4: 12, 2010
- Tang C, Pawlak AP, Prokopenko V, West MO. Changes in activity of the striatum during formation of a motor habit. Eur J Neurosci 25: 1212–1227, 2007
- Tepper JM, Koos T, Wilson CJ. GABAergic microcircuits in the neostriatum. Trends Neurosci 27: 662–669, 2004
- Thorn CA, Atallah H, Howe M, Graybiel A. Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66: 781–795, 2010
- Thorn CA, Graybiel AM. Medial and lateral striatal LFPs exhibit task-dependent patterns of coherence in multiple frequency bands. Program No 622.14. 2007 Neuroscience Meeting Planner San Diego, CA: Society for Neuroscience, 2007
- Tort AB, Kramer MA, Thorn C, Gibson DJ, Kubota Y, Graybiel AM, Kopell NJ. Dynamic cross-frequency couplings of local field potential oscillations in rat striatum and hippocampus during performance of a T-maze task. Proc Natl Acad Sci USA 105: 20517–20522, 2008
- Tremblay PL, Bedard MA, Langlois D, Blanchet PJ, Lemay M, Parent M. Movement chunking during sequence learning is a dopamine-dependant process: a study conducted in Parkinson's disease. Exp Brain Res 205: 375–385, 2010
- Tremblay PL, Bedard MA, Levesque M, Chebli M, Parent M, Courtemanche R, Blanchet PJ. Motor sequence learning in primate: role of the D2 receptor in movement chunking during consolidation. Behav Brain Res 198: 231–239, 2009
- Tumer EC, Brainard MS. Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature 450: 1240–1244, 2007
- van der Meer MA, Johnson A, Schmitzer-Torbert NC, Redish AD. Triple dissociation of information processing in dorsal striatum, ventral striatum, and hippocampus on a learned spatial decision task. Neuron 67: 25–32, 2010
- Warden MR, Miller EK. The representation of multiple objects in prefrontal neuronal delay activity. Cereb Cortex 17, Suppl 1: i41–i50, 2007
- Wise SP, Kurata K. Set-related activity in the premotor cortex of rhesus monkeys: effect of triggering cues and relatively long delay intervals. Somatosens Mot Res 6: 455–476, 1989
- Yeshenko O, Guazzelli A, Mizumori SJ. Context-dependent reorganization of spatial and movement representations by simultaneously recorded hippocampal and striatal neurons during performance of allocentric and egocentric tasks. Behav Neurosci 118: 751–769, 2004
- Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci 7: 464–476, 2006