Author manuscript; available in PMC: 2011 Jun 10.
Published in final edited form as: Neuron. 2010 Jun 10;66(5):781–795. doi: 10.1016/j.neuron.2010.04.036

Differential Dynamics of Activity Changes in Dorsolateral and Dorsomedial Striatal Loops During Learning

Catherine A Thorn 1,2, Hisham Atallah 1, Mark Howe 1,3, Ann M Graybiel 1,3
PMCID: PMC3108575  NIHMSID: NIHMS294217  PMID: 20547134

SUMMARY

The basal ganglia are implicated in a remarkable range of functions influencing emotion and cognition as well as motor behavior. Current models of basal ganglia function hypothesize that parallel limbic, associative and motor cortico-basal ganglia loops contribute to this diverse set of functions, but little is yet known about how these loops operate and how their activities evolve during learning. To address these issues, we recorded simultaneously in sensorimotor and associative regions of the striatum as rats learned different versions of a conditional T-maze task. We found highly contrasting patterns of activity in these regions during task performance and found that these different patterns of structured activity developed concurrently, but with sharply different dynamics. Based on the region-specific dynamics of these patterns across learning, we suggest a working model whereby dorsomedial associative loops can modulate the access of dorsolateral sensorimotor loops to the control of action.

Keywords: learning and memory, basal ganglia, goal-directed behavior, habit learning, cognitive control, rat, electrophysiology, sensorimotor control, dorsomedial striatum, dorsolateral striatum

INTRODUCTION

The basal ganglia, long known to be critical for normal motor control, are now also recognized as influencing cognitive and motivational aspects of behavior (Balleine et al., 2009; Dagher and Robbins, 2009; Graybiel, 2008). Moreover, the striatum, the largest structure in the basal ganglia, is thought to be critical for learning functions across these domains, especially reinforcement-based learning (Daw et al., 2005; Samejima and Doya, 2007). Reflecting this wide functional scope, basal ganglia dysfunction has been identified in disorders ranging from Parkinson’s disease and Huntington’s disease to neuropsychiatric disorders including obsessive-compulsive disorder, Tourette syndrome, and major psychosis (DeLong and Wichmann, 2007; Graybiel and Mink, 2009).

Candidates for functionally distinct motor and cognitive circuits have been identified in behavioral experiments in humans and non-humans (Graybiel, 2008; Middleton and Strick, 2000; Worbe et al., 2009). In rodents, sensorimotor loops connect somatosensory and motor cortical areas with the dorsolateral striatum, and lesions of these loops, including lesions centered in the dorsolateral striatum, impair the acquisition and performance of motor sequences and stimulus-response (S-R) tasks, as well as the habitual responding in instrumental tasks that follows earlier goal-directed performance (Balleine et al., 2009; White, 2009). Correspondingly, in some sensorimotor tasks, neurons in this dorsolateral region have been shown to fire in relation to motor behaviors, and this activity continues to be modulated late in training (Barnes et al., 2005; Kimchi et al., 2009; Kubota et al., 2009; Schmitzer-Torbert and Redish, 2004; Tang et al., 2007; Yin et al., 2009). It has been suggested that the dorsolateral striatum is important for the chunking of motor patterns as habits are formed and stamped in (Barnes et al., 2005; Graybiel, 2008).

By contrast, associative loops interconnect the medial prefrontal cortex with regions of the dorsomedial striatum. Lesions made within these loops, including lesions of the dorsomedial striatum, impair goal-directed responding in instrumental tasks (Yin and Knowlton, 2006) and impair reversal learning (Ragozzino, 2007). These lesions do not generally affect behavioral performance during learning of simple S-R tasks (Ragozzino, 2007; White, 2009), but may impair the learning and performance of more complicated paradigms (Adams et al., 2001; Corbit and Janak, 2007; Featherstone and McDonald, 2005; Kantak et al., 2001). Neurons in the dorsomedial striatum undergo changes in activity early during motor learning and their firing has been shown to change according to flexible stimulus-value assignments, as well as with response bias (Kimchi and Laubach, 2009a; Kimchi and Laubach, 2009b; Yin et al., 2009). Based on this evidence, it is thought that the associative cortico-basal ganglia loop, including the dorsomedial striatum, is involved in flexible goal-directed behavioral control.

How the parallel dorsolateral and dorsomedial striatum-based loops interact to produce habitual versus goal-directed behaviors is still unclear. Available evidence suggests that behavior often evolves during trial-and-error learning from being flexible and goal-directed to being habitual. As this transition occurs, neural control by dorsal striatal circuits is thought to shift from associative circuits that take account of the outcome contingencies of actions to those that are less flexible and that underpin habit formation and repetitive behaviors and thoughts (Graybiel, 2008; Yin et al., 2008). However, lesions of the dorsomedial striatum can result in the expression of habitual behavior even early in training, and lesions of the dorsolateral striatum can result in goal-directed responding even after extended training (Yin and Knowlton, 2006). These and related results suggest that the two control systems operate independently, and perhaps simultaneously or even competitively (Balleine et al., 2009; Wassum et al., 2009).

To determine the patterns of neural activity that occur in these dorsolateral and dorsomedial striatal districts during procedural learning in freely moving animals, we made simultaneous tetrode recordings of single-unit activity in both the dorsolateral and dorsomedial parts of the striatum as rats acquired a T-maze task. The task was designed to require not only skilled motor performance, but also flexible responding based on sensory cues signaling the baited end-arm, thus taxing both sensorimotor and cognitive circuitry. Moreover, we trained the rats on two different task versions concurrently, with instruction cues of either auditory or tactile modalities, and we varied the difficulty of the tactile version in order to differentiate further the changes in neural activity along sensory, motor, and cognitive domains. Finally, given evidence from a classical lithium chloride devaluation procedure that behavior on a similar T-maze task is initially goal-directed and becomes habitual with over-training (Smith and Graybiel, unpublished data), we tracked neural activity chronically from the naïve state to the extensively over-trained state. In this way, we sought to identify activity that was associated with the early flexible action-outcome phase of behavioral control and activity that was related to repetitive late-stage habitual performance.

We focused on the activity patterns of neurons characterized as striatal projection neurons to ensure that the activities recorded would reflect those of the corresponding cortico-basal ganglia loops. Our findings demonstrate that the sensorimotor and associative cortico-basal ganglia loops are active simultaneously during learning, but that they develop strikingly different task-related patterns that are characterized by different dynamics across training sessions.

RESULTS

We recorded from 6750 well-isolated striatal neurons in eight Long-Evans rats over 196 training sessions. All recordings were made concurrently in the dorsolateral and dorsomedial striatum (Figure 1A). We studied two groups of rats. The 5 rats in Group 1 acquired the auditory version of the task (> 72.5% correct performance for 10 consecutive training sessions) in 10-26 sessions (median = 13; Figures 1B and S1) but failed to acquire the tactile discrimination. Group 2 rats (n = 3) were trained using tactile cues with more readily discriminated textures, so that these animals could reach the performance criterion on both the auditory and tactile task versions. The Group 2 rats acquired the auditory discrimination in 9-22 sessions (median = 16) and the tactile discrimination in 18-28 sessions (median = 23; Figures 1C and S1). The combined values for both groups of rats are shown in Figure 1D. Running times decreased across training (p < 0.001, 2-way ANOVA), and mean running times during the tactile-cued trial blocks were slightly longer than those during the auditory-cued trial blocks (Figure 1E, p < 0.001, 2-way ANOVA).
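The acquisition criterion described above (more than 72.5% correct for 10 consecutive sessions) can be sketched as a simple streak search. This is an illustrative sketch only: the function name and the choice to report the first session of the qualifying run are assumptions, since the text does not state from which session of the run the acquisition count is taken.

```python
def sessions_to_criterion(pct_correct, threshold=72.5, run_length=10):
    """Return the 1-based index of the first session of the first run of
    `run_length` consecutive sessions above `threshold`, or None if the
    criterion is never met. (Reporting the start of the run is an assumption.)"""
    streak = 0
    for session, pct in enumerate(pct_correct, start=1):
        # Extend the streak on a supra-threshold session, otherwise reset it.
        streak = streak + 1 if pct > threshold else 0
        if streak == run_length:
            return session - run_length + 1
    return None
```

For a rat whose scores first stay above 72.5% from session 5 onward for ten sessions, the function returns 5; a rat that never sustains such a run, like the Group 1 animals on the tactile version, yields None.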

Figure 1. Behavioral training and neuronal recording.

(A) Final tetrode locations for dorsolateral (top) and dorsomedial (bottom) recording sites. Different colors indicate sites from different animals. (B and C) Diagrams of T-maze task-versions (top) and percent correct performance across training sessions (bottom) for Group 1 (B, n = 5) and Group 2 (C, n = 3) animals. Dark gray denotes auditory instruction cue presentation, light gray, tactile instruction cue presentation. Only one animal in Group 1 continued training beyond 23 sessions, and session 25 for this animal was excluded from analysis as too few trials were performed. (D and E) Percent correct performance (D) and cue-to-goal running times (E) averaged across all rats, for auditory (dark gray) and tactile (light gray) task-versions. Stages denoted as: stage A1 = first 1 or 2 days of training; stage A2 = second 1 or 2 sessions of training; stages A3-A5 = evenly sampled 1 or 2 sessions of training prior to criterial performance (72.5%) on either task version; stages B1-B5: evenly sampled 1 or 2 sessions of training following criterial performance on the auditory version but prior to criterion on the tactile version; stages C1-C5: 2 consecutive sessions following criterial performance on both auditory and tactile task versions. Error bars indicate SEM. (F) Percent recorded units from dorsolateral (left, red) and dorsomedial (right, blue) striatum, classified as different putative neuronal subtypes. TRN = task-responsive medium spiny neurons; NTRN = non-task-responsive medium spiny neurons; FF = fast firing interneurons; TAN = tonically active neurons. (G) Percent of TRNs across training stages. See also Figure S1.

Ninety percent (n = 6082) of recorded neurons were classified as putative medium spiny projection neurons (Figures 1F and S2A-C), and were accepted for further analysis if they fired more than 150 spikes in a session. Medium spiny neurons were further classified as “task-responsive” neurons (TRNs) if their firing rates during any peri-event window were greater than 2 standard deviations above their pretrial baseline firing rates for at least 3 consecutive 20-ms bins. The TRNs made up approximately two-thirds of the recorded projection neurons, and this proportion did not change with training (Figure 1G, lateral and medial: p > 0.1, Chi-square test). Tetrodes were not moved except as necessary at the beginning of each session to maintain high-quality single unit recordings. Thus, some neurons may have been recorded over multiple days. Employing the method of Emondi et al. (2004), we estimated that up to one third of our sample could be potential repeated units. Repeating the main analyses after removing these neurons did not qualitatively alter the results (Figure S3A and S3B), and we therefore included all units for the analyses reported.
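The task-responsiveness criterion (firing more than 2 standard deviations above pretrial baseline for at least 3 consecutive 20-ms bins) can be sketched as follows. The function and argument names are hypothetical, and estimating the baseline mean and SD from a list of baseline bin rates with the sample standard deviation is an assumption; the text does not specify how the baseline statistics were computed.

```python
from statistics import mean, stdev

def is_task_responsive(window_rates, baseline_rates, n_sd=2.0, min_run=3):
    """True if `window_rates` (one rate per 20-ms bin of a peri-event window)
    exceeds baseline mean + n_sd * SD for at least `min_run` consecutive bins."""
    # Assumption: baseline statistics are taken across baseline bins.
    threshold = mean(baseline_rates) + n_sd * stdev(baseline_rates)
    run = 0
    for rate in window_rates:
        run = run + 1 if rate > threshold else 0
        if run >= min_run:
            return True
    return False
```

Under this sketch a unit would be labeled a TRN if the function returns True for any of its peri-event windows, and an NTRN otherwise.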

Simultaneously Recorded Dorsolateral and Dorsomedial Striatal Ensemble Activities Differ During Training on the T-Maze Tasks

We found that markedly different patterns of task-related ensemble activity in the dorsolateral and dorsomedial striatum emerged after the first stages of training. To gain a global picture of this population activity, we normalized firing rates for each neuron by calculating a z-score for each 20-ms bin of a ±300-ms peri-event time histogram constructed around each of 9 task events. For each stage, z-scores were averaged across all included units to calculate ensemble activity for the entire population (Figure 2).
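The normalization and averaging steps just described can be sketched in a few lines. The reference mean and SD used for the z-score are not stated in this passage, so the sketch assumes each unit is normalized by its own mean and population SD across the bins supplied; the function names are illustrative.

```python
from statistics import mean, pstdev

def zscore(rates):
    """Z-score one unit's binned firing rates against its own mean and SD
    (the choice of reference distribution is an assumption)."""
    mu = mean(rates)
    sd = pstdev(rates)
    if sd == 0:
        # A unit with constant firing carries no structure across bins.
        return [0.0 for _ in rates]
    return [(r - mu) / sd for r in rates]

def ensemble_average(neuron_rates):
    """Average z-scored activity across units, bin by bin.
    `neuron_rates` is a list of equal-length per-unit rate lists."""
    z = [zscore(r) for r in neuron_rates]
    n_bins = len(z[0])
    return [mean(unit[b] for unit in z) for b in range(n_bins)]
```

Because each unit is normalized before averaging, high-rate and low-rate units contribute equally to the ensemble profile.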

Figure 2. Ensemble neural activity differs between dorsolateral and dorsomedial striatal recording sites during T-maze training.

(A) Ensemble z-score plots illustrating population activity across trial time and training stages for dorsolateral (top) and dorsomedial (bottom) TRNs. Scale for both plots shown in center. Numbers to the right of each row indicate the number of units included in that stage. (B and C) Mean z-scores (solid lines) and SEMs (shaded) plotted across task time for dorsolateral (red) and dorsomedial (blue) TRNs separately (B) and overlaid (C) for successive phases of training. Task events abbreviated as: BL = baseline (1 sec prior to warning click); W = warning click; Ga = gate opening; L = locomotion onset; S = out of start; C = cue onset; TS = turn start; TE = turn end; Go = goal reaching. Gray dots in C indicate significant difference between dorsolateral and dorsomedial activity during the corresponding 20-ms bin (p < 0.01, t-test). See also Table 1 and Figure S2.

During training, TRNs in the dorsolateral striatum (Figure 2A, top) developed strong ensemble responses at action boundaries of the task (locomotion onset, turn, and goal). Activity during mid-run was reduced after the first stages of training. In sharp contrast, ensemble TRN activity recorded in the dorsomedial striatum (Figure 2A, bottom) was strongest mid-run, especially around the time of instruction cue onset and turn start, and was weakest at task start and task end, almost opposite to the dorsolateral pattern. The dorsolateral and dorsomedial activities began to diverge early in training, and were strongly different especially during the middle training stages (Figures 2B and 2C and Table 1). We further examined the ensemble activity of subsets of the dorsolateral and dorsomedial TRNs that responded to particular task events (Figure S2D). These results highlight the preferential firing of dorsolateral ensembles around the beginning and end of the trial, in contrast to the strong dorsomedial activity mid-task.

Table 1.

Difference between dorsolateral and dorsomedial patterns

Stages    % Bins    RSS      K-L Div.
A1-2       2.68      2.78    0.04
A3-5      28.35      6.14    0.10
B1-2      37.93     18.12    0.28
B3-5      68.97     25.64    0.43
C1-2      20.69     10.93    0.18
C3-5      34.48     12.98    0.21

For each group of training stages, the difference between the dorsolateral and dorsomedial patterns is expressed as the percentage of 20-ms bins with significantly differing z-score activations (t-test, p < 0.01), the sum of squared dorsolateral-dorsomedial residuals across all bins, and the symmetrized Kullback-Leibler divergence of the firing distributions across task time computed for each region. See also Figure 2.
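Two of the three difference measures in Table 1 can be sketched directly. This is a minimal sketch under stated assumptions: the symmetrized divergence is taken as the sum KL(p||q) + KL(q||p) (other conventions halve this), the firing distributions are assumed to be non-negative per-bin values normalized to sum to 1, and all function names are hypothetical. The per-bin t-test is omitted because it requires the unit-level data.

```python
import math

def rss(a, b):
    """Sum of squared residuals between two equal-length activity profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def normalize(values):
    """Scale non-negative per-bin values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats.
    Assumes q is strictly positive wherever p is positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetrized_kl(p, q):
    """Symmetrized divergence, here defined as KL(p||q) + KL(q||p)."""
    return kl(p, q) + kl(q, p)
```

Identical profiles give zero on both measures, and both grow as the two regional patterns diverge, which matches the direction of the trend in Table 1.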

Despite the fact that only the Group 2 animals successfully learned the tactile as well as the auditory version of the T-maze task, the ensemble activity patterns for the two groups of animals were similar (Figures 3A-B and Table 2) as was their motor performance on the maze (Figure 3C). For both groups, TRN ensemble activity in the two regions did not differ substantially during the first training block (stages A1-A5), when neither group had reached the learning criterion for either task, but medial-lateral differences developed during the second training block (stages B1-B5) as the Group 2 animals, but not the Group 1 animals, acquired the tactile task (Figure 3B and Table 2). Laterally, the Group 2 rats had stronger goal responses, even in early sessions, than did the Group 1 rats, and the start activity of Group 2 rats accentuated the warning click rather than locomotion onset in the second training block. Medially, the Group 1 rats, which did not learn the tactile version, exhibited stronger pattern expression during the second training block than did the learners in Group 2. The ensemble activity patterns were otherwise comparable for the two groups. Ensemble patterns were also generally consistent across individual animals (Figures 3D and S1), despite differences in response selection on the tactile task (Figure S1). Further, these patterns remained even after removing the animals in each group that exhibited the strongest patterned activity (Figure S3C-F). Thus, the data from all rats were combined for subsequent analyses.

Figure 3. Group 1 and Group 2 rats have similar ensemble TRN firing patterns.

(A) Ensemble z-score plots for TRN populations in dorsolateral (left) and dorsomedial (right) striatum recorded from rats in Group 1 (top) and Group 2 (bottom). Conventions as in Figure 2A. (B) Mean z-scores and SEMs across task-time for Group 1 (light color) and Group 2 (dark color) neuronal populations in dorsolateral (left, red) and dorsomedial (right, blue) striatum during stages A1-A5 and stages B1-B5. Gray dots as in Figure 2C for Group 1 versus Group 2 activity. (C) Representative run trajectories during the performance of the two task versions recorded during the final training session for a Group 1 animal (left, D22 session 19) and a Group 2 animal (right, D25 session 33). (D) Mean z-scores for TRN ensembles recorded from each rat, left/red: dorsolateral, right/blue: dorsomedial. See also Table 2 and Figure S3.

Table 2.

Comparison of Group 1 and Group 2 activations

                                Block 1                        Block 2
                                % Bins   RSS    K-L Div.       % Bins   RSS    K-L Div.
Group 1 vs. Group 2
  Dorsolateral                  15.33    6.96   0.122          18.4     8.62   0.13
  Dorsomedial                   12.64    4.95   0.09           15.3     5.64   0.11
Dorsolateral vs. Dorsomedial
  Group 1                       15.71    6.62   0.102          55.2     21.2   0.33
  Group 2                       14.56    3.81   0.058          56.3     20.8   0.34

Difference measures for Group 1 versus Group 2 activities in each region (top), and the dorsolateral-dorsomedial difference measures for each group (bottom), as in Table 1. See also Figure 3.

To quantify the strength of the dorsolateral and dorsomedial ensemble patterns over training, we calculated a spike probability distribution from the ensemble z-scores and estimated the entropy of this distribution as a measure of randomness in the population firing across trial-time for each training stage. In the dorsolateral striatum, ensemble activity became progressively more structured across training, as indicated by the reduced entropy in later training stages compared to that in stage A1 (Figure 4A). By contrast, the entropy of the dorsomedial activity was lowest during the middle training stages (block 2) and then returned to initial levels as training continued (Figure 4D). Figure 4B and 4E show similarly contrasting trends in ensemble pattern development across training, expressed as changes in z-scores relative to the first training stage around each task event. Similar results for the two striatal regions were also obtained for calculations based on spike count distributions as opposed to z-score normalized firing patterns (Figure S4). We found that a single linear regression provided the best fit to the dorsolateral entropy estimates, and that a segmented regression with a breakpoint at stage B1 best fit the dorsomedial entropy estimates. Using these optimal regressions, we next tested each 20-ms bin in each peri-event window for changes in the neural activity across training. Figure 4C shows that dorsolateral TRN activity prior to warning click and at goal reaching increased significantly across training stages, while activity around locomotion onset and out-of-start events declined with training. Dorsomedial TRN activity around cue onset and turn start increased during the first part of training, while activity around goal reaching declined, and during the later stages of training, these trends reversed (Figure 4F-G).
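The entropy measure can be illustrated with a short sketch. How the ensemble z-scores were converted into a spike probability distribution is not specified in this passage, so the sketch starts from a non-negative activity profile across trial-time bins (for example, spike counts per bin) and simply normalizes it; the function name is hypothetical.

```python
import math

def shannon_entropy(activity):
    """Shannon entropy (bits) of a non-negative activity profile across
    trial-time bins, after normalizing the profile to a probability
    distribution. Bins with zero activity contribute nothing."""
    total = sum(activity)
    probs = [a / total for a in activity if a > 0]
    return -sum(p * math.log2(p) for p in probs)
```

A flat profile gives the maximum entropy for its number of bins, whereas a profile concentrated at a few task events gives a lower value. On this reading, lower entropy corresponds to a more strongly structured ensemble pattern.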

Figure 4. Ensemble TRN activity displays different training-related dynamics in dorsolateral and dorsomedial striatum.

(A and D) Mean entropy and 95% confidence interval of the ensemble firing distribution for each stage of training relative to stage A1 for dorsolateral (A) and dorsomedial (D) striatum. (B and E) Mean z-scores and 95% confidence interval around specific task events for dorsolateral (B) and dorsomedial (E) ensembles across training stages, relative to stage A1. Means and confidence intervals were computed using 1000 bootstrap samples over the neuronal population for each stage. (C and F) Z-score regression slopes for each 20 ms bin and 95% confidence intervals for dorsolateral striatum (C), using a single linear regression across all stages, and for dorsomedial striatum (F), using a segmented linear regression with a single breakpoint at training stage B1. (G) Overlaid 95% confidence intervals of dorsomedial regression slopes for stages A1-B1 and the negative of the regression slopes for stages B1-C5. See also Figure S4.
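The bootstrap procedure mentioned in the legend (1000 resamples over the neuronal population) can be sketched as below. The percentile method for the 95% interval, the fixed seed, and the function name are assumptions for illustration; the legend does not say which bootstrap interval was used.

```python
import random
from statistics import mean

def bootstrap_mean_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Bootstrap estimate of the mean with a percentile confidence interval.
    `values` holds one number per unit (e.g. a z-score in a given bin)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample units with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(boot_means), lo, hi
```

Resampling over units rather than trials treats the recorded population as the source of variability, which is consistent with the legend's wording.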

These findings suggested that task-related projection neurons in the dorsolateral and dorsomedial regions of the striatum, parts of different cortico-basal ganglia loops, develop different structured activities concurrently during the course of learning, and that the dynamics of the activity changes are different throughout learning.

Dorsolateral and Dorsomedial Ensembles Preferentially Respond to Different Stimulus Modalities Only Around the Time of Cue Onset

Surprisingly, despite the differences in percent correct performance on the auditory and tactile task-versions, ensemble neural activity during the auditory and tactile trials was similar in both dorsolateral and dorsomedial regions (Figures 5A, 5B and S5A). We observed differences in ensemble activity only around the time of instruction cue onset: dorsolateral ensembles showed higher activity in response to the presentation of the tactile cues, whereas dorsomedial ensembles preferentially responded to the onset of the auditory cues (Figure 5C). At the single unit level, modest numbers of TRNs differentiated between the two modalities: up to ca. 15% around the cue onset and turn start events (Figure 5D). In the dorsomedial striatum, these units tended to exhibit higher firing rates during auditory trials (p < 0.001, Chi-square). These percentages did not change with training in either region (Figure 5E, p > 0.1, Chi-square).

Figure 5. Ensemble TRN activity differs only around cue onset during auditory and tactile trials.

(A) Pseudocolor z-score plots comparing dorsolateral (left) and dorsomedial (right) striatal TRN ensemble activity during auditory and tactile trials (as labeled). (B) Mean z-scores and SEMs across task-time for auditory (dark color) and tactile (light color) trials, plotted for each training block for dorsolateral (left, red) and dorsomedial (right, blue) ensembles. (C) Mean z-scores and SEM across all stages for dorsolateral (left) and dorsomedial (right) TRNs during ±300-ms around cue onset. (D) Percentage of units differentiating between auditory and tactile task-versions for dorsolateral (left, red) and dorsomedial (right, blue) TRNs. Dark and light bars indicate percentage of units with higher firing during auditory or tactile conditions, respectively. Solid and dashed black lines indicate percentage of auditory- and tactile-preferring neurons obtained after shuffling trials. (E) Percentages of modality-discriminative TRNs in dorsolateral (red) and dorsomedial (blue) striatal regions, plotted across training stage. (F) Percentage of TRNs responding with significant increases or decreases in firing to the onset of each of the four discriminative stimuli and the warning click in the dorsolateral (red) and the dorsomedial (blue) striatum. Dashed line indicates percentage expected by chance. See also Figure S5.

Fewer than 5% of the recorded neurons in either region changed their firing rates significantly in response to the instruction cue presentations (Figure 5F), and units discriminative for each stimulus in any peri-event window were also rare (Figure S5F-O). Within this small stimulus-selective population, dorsomedial units favored the more salient 8 kHz tone and dorsolateral units favored the tactile stimuli (lateral and medial: p < 0.001, Chi-square test). Finally, we found only a few neurons with firing correlated with stimulus value that could not be accounted for by other parameters such as stimulus selectivity, modality selectivity, or turn-specific activity (Figure S5B-E).

Dorsolateral and Dorsomedial Striatal Neurons Similarly Encode Turn Response and Trial Outcome Parameters

Given evidence that the dorsolateral striatum is critical for forming S-R associations and the dorsomedial striatum for forming associations related to reinforcement outcome, we tested for corresponding biases in neural activity in these two striatal districts. We compared the proportion of units in each region firing differentially in relation to either the different responses that the rats could select (right and left turns) or to the different reinforcement outcomes that could occur (reward or lack of reward). Unexpectedly, we found no large-scale differences between the dorsolateral and dorsomedial striatal districts in encoding either motor responses or trial outcomes.

Similar percentages of units in the two striatal regions (ca. 15-35%) differentiated between right and left turns during task events following turn onset (Figure 6A), and the mean number of spikes with which these units differentiated right from left turns were also similar across regions (Figure 6B). Importantly, the activities of these neurons were not predictive of turn direction prior to turn onset in either region. Dorsolaterally, but not dorsomedially, neuronal responses favored turns to the side contralateral to the implant (lateral: p < 0.001, medial: p > 0.1, Chi-square test). The percentage of turn-discriminative neurons did not change with training (Figure 6C, lateral and medial: p > 0.1, Chi-square).

Figure 6. Dorsolateral and dorsomedial striatal TRNs similarly discriminate turn responses and trial outcomes.

(A) Percentage of TRNs with higher peri-event firing rates during right or left turn responses for dorsolateral (left, red) and dorsomedial (right, blue) striatum. Solid and dashed black lines indicate proportion of right- and left-preferring neurons obtained after shuffling trials. (B) Mean numbers of spikes and SEM with which turn-discriminative TRNs in dorsolateral (red) and dorsomedial (blue) striatum differentiate turn direction during each event-epoch. (C) Percentage of turn-discriminative TRNs across training. (D) Percentage of dorsolateral and dorsomedial TRNs differentiating correct and incorrect trials during each task epoch. Solid and dashed black lines as in A, for correct- and incorrect-preferring populations, respectively. (E) Percentage of outcome-discriminative TRNs across training. See also Figure S6.

We identified a few reward-sensitive neurons with differential firing restricted to the time around goal reaching, when the rat presumably could detect the presence or absence of reward in the food well (Figure 6D). The proportion of such units, though small in both regions, was larger dorsolaterally (p = 0.003, Chi-square), and did not change with training (Figure 6E, lateral and medial: p > 0.1, Chi-square). Nor did population activity differ between correct and incorrect trials (Figure S6A).

Our peri-event analyses suggest that independent populations of neurons encode stimulus, response and reinforcement outcome parameters (Figure S6B). Based on previous work (Histed et al., 2009; Kim et al., 2007; Kimchi and Laubach, 2009a), we searched for, but did not find (Figure S6C-E), a significant number of units with differential activity dependent on the response executed in the previous trial (right or left turn) or the outcome of the previous trial (correct or incorrect). Additional analyses (Figure S6F-K) also suggested that changes in response values or reward values (Kimchi and Laubach, 2009b; Samejima et al., 2005) were not a dominant factor in neuronal responding in our task (Figure S6).

Reduced In-Task Activity Characterizes Subpopulations of Projection Neurons in Both Dorsolateral and Dorsomedial Striatum

Approximately one third of the medium spiny neurons recorded did not meet our criteria for classification as TRNs (Figure 1F). We called this population of units “non-task-responsive neurons” (NTRNs). The population of NTRNs exhibited markedly lower activity during the task than during the pre-trial baseline period. The reduced in-task firing was similar for the dorsolateral and dorsomedial NTRN ensembles (Figures 7A-C). The entropy of the NTRN ensemble activity declined slightly during the first stage of training and then fell sharply at the start of the last block of training, when, both medially and laterally, the pre-task activity was differentially enhanced compared to in-task activity (Figure 7C). Thus the NTRNs, though lacking phasic in-task activity similar to that of the TRNs, nevertheless had activity that was modulated by task context. We did not detect differences in the percentages of NTRNs medially and laterally, nor changes in these percentages across training (data not shown).

Figure 7. The activity of non-task-responsive striatal ensembles is also modulated during training.

(A and B) Pseudocolor z-score plots showing ensemble neural activity for dorsolateral (A) and dorsomedial (B) NTRNs. Conventions as in Figure 2A. (C) Entropy estimates and 95% confidence limits for dorsolateral (red) and dorsomedial (blue) NTRN ensembles across training, shown relative to stage A1.

Dorsolateral and Dorsomedial Activity Patterns Are Correlated with Different Behavioral Parameters

To identify potential relationships between the activity patterns of the TRNs and the behavioral parameters measured as the animals were trained, we used the entropy of the ensemble activity patterns in the dorsolateral and dorsomedial regions as a measure of the strength of pattern expression during each training stage and then computed the correlation coefficients between this neural measure and the measures of behavioral performance. We found significant correlations between the strength of the dorsolateral striatal ensemble pattern and percent correct performance (calculated separately for auditory, tactile, and all trials) as well as significant correlations with running time (Figure 8A): the task-bracketing pattern of ensemble activity that appeared in the dorsolateral striatum became stronger as percent correct performance and running speeds improved over the course of training.
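The correlational analysis can be sketched as a Pearson correlation between per-stage entropy and a per-stage behavioral measure. The function names and the example numbers in the usage note are illustrative only and are not data from the study; the use of Pearson's r is an assumption based on the R2 values reported in Figure 8.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared(x, y):
    """Squared correlation: the fraction of variance shared by x and y."""
    return pearson_r(x, y) ** 2
```

Because lower entropy indicates a stronger pattern, a pattern that strengthens as performance improves would appear as a negative r between entropy and percent correct, for example `pearson_r([3.0, 2.8, 2.5, 2.1], [55, 65, 75, 90])` with made-up values, while the squared value is unaffected by the sign.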

Figure 8. Ensemble activity patterns of dorsolateral and dorsomedial striatal TRNs are correlated with different performance measures.

(A) R² values for correlations between entropy of ensemble activity and behavioral parameters (as labeled), shown in red for dorsolateral TRN ensembles and in blue for dorsomedial TRN ensembles. *: p < 0.05, **: p < 0.01. (B) R² values for correlations between NTRN entropy and behavioral performance measures (conventions as in A). (C) Schematic model illustrating hypothesized dorsomedial and dorsolateral cortico-basal ganglia loop interactions across different phases of learning. Activity in both striatal regions and their corresponding loops becomes structured simultaneously during Phase 1. In Phase 3, the reduction in structured dorsomedial striatal activity permits sensorimotor circuits to drive execution of habitual behavior. Broken arrows indicate multisynaptic connections from striatum to neocortex through pallidum and thalamus. MC: motor cortex; PFC: prefrontal cortex; DLS: dorsolateral striatum; DMS: dorsomedial striatum. See also Figure S7 and Tables S1 and S2.

Strikingly, for the dorsomedial striatum, we found no significant correlations between pattern strength and any of these behavioral measures on either the auditory or tactile versions of the task (Figure 8A). These negative findings suggested that the strength of the dorsomedial activity pattern was not linearly related to any measured behavioral parameter. The findings did not, however, exclude either a non-linear association between them or a relationship of the neural activity to combinations of behavioral parameters. We tested for two of these.

First, prior studies have shown that spike activity in the associative striatum is highest during the period in training when behavioral performance is improving most rapidly, principally during the times in which feedback about task performance is provided (Williams and Eskandar, 2006). To test whether this effect could contribute to the modulation of spike activity that we found in the dorsomedial striatal data set, we fit a third-order polynomial to the total percent correct performance per learning stage for all rats and calculated the derivative of this polynomial to find the slope of the learning curve for each stage. For the population as a whole, we found a significant correlation between the entropy of the dorsomedial activity and the slope of the total percent correct learning curve (Figure S7A). However, when Group 1 and Group 2 rats were analyzed separately, we found that only the Group 2 rats showed a strong correlation between the slope of the behavioral performance curve and entropy of the dorsomedial striatal activity. Group 1 rats failed to exhibit this correlation: the dorsomedial activity patterns in this group were most strongly expressed toward the end of training, when their behavioral performance had reached asymptote and was no longer changing (Figure S7). These results suggest that neither a close correlation with percent correct or motor performance, nor a close correlation with the rates of change in these parameters, accounted for the patterns of activity that we recorded during training in the dorsomedial striatum in the two groups of animals.
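The slope estimate described above (a third-order polynomial fit to percent correct per stage, followed by differentiation) can be sketched as follows. This is a minimal illustration using NumPy; the function name `learning_curve_slope`, the stage indexing (1, 2, 3, ...), and the example data are assumptions and do not come from the study.

```python
import numpy as np

def learning_curve_slope(percent_correct, degree=3):
    """Fit a polynomial to per-stage percent correct and return the value
    of its derivative (the learning-curve slope) at each stage."""
    y = np.asarray(percent_correct, dtype=float)
    stages = np.arange(1, len(y) + 1, dtype=float)  # assumed stage indexing
    coeffs = np.polyfit(stages, y, degree)          # least-squares polynomial fit
    slope_coeffs = np.polyder(coeffs)               # analytic derivative of the fit
    return np.polyval(slope_coeffs, stages)

# Hypothetical percent correct values across 10 stages, for illustration only.
pc = np.array([50, 52, 55, 60, 66, 72, 77, 80, 82, 83], dtype=float)
slopes = learning_curve_slope(pc)
print(slopes)
```

The resulting per-stage slopes could then be correlated with a per-stage neural measure in the same way as any other behavioral parameter.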

A second possibility was that the development of the patterned ensemble activity in the dorsomedial striatum might be more closely related to the difference in performance levels on the auditory and tactile task versions than to the overall performance improvement. We found that this was so: there was a strong correlation between the disparity in performance levels on the two task-versions and the entropy for the dorsomedial activity pattern, but no such correlation for the dorsolateral striatal activity pattern (Figure 8A). Remarkably, this finding held for both Group 1 and Group 2, considered separately (Figure S7), suggesting that the performance disparity could be key to understanding the dynamics of the TRN ensemble patterns that emerged in the dorsomedial striatum through training. Repeating these correlational analyses for individual rats gave similar results (Tables S1 and S2, Figure S7).

The results for the NTRNs differed from those seen for the TRN ensembles. The changes in entropy of the NTRN ensemble activities were significantly correlated with improvements in both percent correct performance and running time across training (Figure 8B). This was true both dorsolaterally and dorsomedially, indicating that, unlike the TRNs, the activities of NTRNs in dorsomedial and dorsolateral regions of the striatum were similarly correlated with behavioral performance.

DISCUSSION

Our findings demonstrate that highly contrasting patterns of task-related ensemble activity emerge in the sensorimotor and the associative parts of the striatum as rats learn T-maze tasks instructed by auditory and tactile cues. The sensorimotor striatum developed ensemble spike activity that was heightened at the action boundaries of the task. The associative striatum developed heightened ensemble spike activity mainly during the middle of the task, when the animals chose between alternate actions based on instruction cues. These striatal activity patterns developed simultaneously across training. Remarkably, however, the dynamics of the learning-related changes in these two striatal regions were sharply different, and they were differently related to the behavior of the rats. In the sensorimotor striatum, the emerging ensemble activity pattern steadily increased as training progressed, and was clearly correlated with improving performance. In the associative striatum, the activity pattern first waxed and then waned as training progressed, and was not correlated with individual behavioral parameters but instead, with the difference in performance on the two versions of the T-maze task. Based on this conjoint reorganization of activity patterns in the sensorimotor and associative striatum during learning, and the differing dynamics of these activities across learning, we suggest that the simultaneous activity of these two striatal regions may be critical in determining the development and expression of habitual behavior.

Dorsolateral and Dorsomedial Striatal Regions Have Different Task-Related Patterns of Activity

Our findings strongly support previous evidence for functional differences between the sensorimotor and associative striatum. As observed in previous studies with a single-modality version of the T-maze task used here (Barnes et al., 2005), we found that the phasic ensemble activity of dorsolateral striatal neurons was, after training, high at action boundaries, including around trial start and goal reaching; and we also found heightened activity at turn. The developing intensity of the dorsolateral pattern was strongly correlated with behavioral improvements in percent correct and decreases in running times across training. These results are consistent with the idea that the phasic ensemble activity in the dorsolateral striatum strengthens as performance on the task improves and behavior becomes highly stereotyped and, as related evidence suggests (Smith and Graybiel, unpublished data), highly habitual.

It was during the critical decision period of the task that phasic task-related activity increased in the dorsomedial striatum and ramped up until the decision was executed. The expression of this mid-task dorsomedial activity was most strongly correlated with the disparity in the performance accuracy of the rats on the auditory and tactile task-versions. This remarkable difference between the behavioral correlates of the neural activities in the dorsolateral and dorsomedial striatum suggests that the two regions, and their corresponding cortico-basal ganglia circuits, have distinct functions during the course of behavioral learning of the conditional T-maze task.

We examined several alternative possibilities to account for the striking experience-dependent modulation of the dorsomedial striatal activity across the different stages of training. A first possibility, favored here, is that the different plasticity demands that the animals faced in the successive training phases accounted for the heightened modulation of activity in the dorsomedial striatum during training. The dorsomedial mid-run activity gradually strengthened during the first training block, in which the rats were attempting to learn both task-versions, but it became intense during the second block when the auditory task-version had been acquired but the tactile version had not. Then the dorsomedial pattern weakened in the third block as both task-versions were mastered. Thus, the dorsomedial ensemble activity pattern was strongest during the time when the acquisition demands on the animals were in conflict for the two task-versions. Moreover, the heightened activity during this conflict period was greater for the Group 1 animals, which never learned the more difficult tactile version. This changing pattern of activity in the dorsomedial striatum stood in contrast to the relative stability of the structured activity in the dorsolateral striatum: there, the patterned activity was relatively constant after the initial phase of the training.

These findings suggest that, at a population level, the strength of the activity patterns in the dorsomedial striatum rose and fell during the successive training blocks in relation to the training demands imposed by the task. During the second phase of training, when the auditory task had been acquired but the tactile task had not, differing plasticity demands were required for the two task versions. For the auditory version, further neuronal plasticity should only have consolidated the already-mastered S-R associations. By contrast, the animals still needed to acquire the S-R associations necessary to gain reward on the tactile version of the task. Thus, new learning in the tactile task was required for improving performance, but new learning on the auditory task (as opposed to continued consolidation) would have been detrimental to the already acquired auditory version. The heightened dorsomedial ensemble activity during this phase of acquisition suggests that the dorsomedial region may have been sensitive to these conflicting plasticity demands during the successive training blocks.

A second possibility, consistent with reinforcement learning models, is that response uncertainty due to a lack of adequate experience with a task could be related to an animal’s willingness to make exploratory actions, and therefore to the rate at which learning occurs (Rushworth and Behrens, 2008). Our finding that the expression of structured activity in the dorsomedial striatum was correlated with the slope of the behavioral performance curve in some animals warrants further consideration of this idea. Assuming, in accord with the behavioral findings, that the S-R associations to the conditional cues were built up slowly through experience for each task-version, there must have been a period during acquisition when the direction of turn that would lead to reward was uncertain in each of the task-versions, and this time-period would have been different for the two tasks. At first glance, response uncertainty should have been highest early in training, when none of the four conditional cues had been mastered. However, some initial exposure to the task might have been required for mastering the task mechanics and determining that there were rules to be learned, and thus uncertainty-related activity might have developed slightly later in training. Even in this view, however, it is not clear why such activity should be highest during the second training block, when two of the four stimulus-response associations had been mastered. Nor should these activities be identical during auditory and tactile versions, as we found them to be, because again, one version was well learned while only the other version remained uncertain. Thus, we think it unlikely that this type of uncertainty can fully account for the patterns of activity we observed.

Notably, the enhanced dorsomedial striatal mid-run activity was present not only in the animals that failed to learn the difficult version of the tactile task, but was also present, though less strong, in the animals that acquired the easier tactile task. This result is important: it was not a failure to learn the tactile version that accounted for the heightened dorsomedial striatal activity.

We also considered the possibility that the heightened dorsomedial activity reflected differential engagement of this striatal region in switching behavior, needed every 20 trials as the auditory and tactile trial sets were interchanged. This view is in accord with evidence that the dorsal striatum is differentially active in relation to switches in stimulus modality or stimulus value (Kimchi and Laubach, 2009b; Kubota et al., 2009). However, population firing as well as the firing rates of the majority of single units were unaffected by cue modality, and the dorsomedial activity clearly rose during training and then fell as training progressed, despite the fact that the switching demands of the task were similar across all sessions. The heightened dorsomedial activity that we observed mid-task and mid-training thus appears unlikely to reflect the within-session switches in the stimulus modality.

We did not have explicit ways of testing definitively for a relationship between the firing of the striatal units and the decision process itself, nor outcome expectancy from the action taken, as opposed to the right or left turn responses emitted. We did test whether the ensemble activity or the individual unit activities during the presumptive decision period predicted the direction or success of the upcoming turn. They did not. It thus seems likely that this activity, though occurring during the decision period, was not directly responsible for the action that the rats subsequently executed in a given trial, even if it was, as we suspect, related to the decision process. The dorsomedial activity is thus likely to be a global or state-level property not related to moment-to-moment conditions.

The proposal that conflicting behavioral and plasticity demands could have evoked the activity modulation in the dorsomedial striatum raises the possibility that the population activity reflected a global monitoring signal tracking the disparity between auditory and tactile task performance during training. This possibility accords well with what is known about the functions of the medial frontal and cingulate cortical areas that project to this striatal region. These neocortical regions have long been implicated in various types of performance monitoring, especially during tasks with ambiguous stimuli or conflicting response choices (Carter et al., 1998; Rushworth, 2008; Schall et al., 2002), or tasks with delayed and/or uncertain rewards (Cardinal, 2006; Rushworth, 2008). Firing rates of neurons in the dorsomedial striatum have been found to be related to response bias during performance of a go/no-go discrimination, suggesting that these responses might be heightened in conjunction with increased uncertainty (Kimchi and Laubach, 2009a). Combined with our findings, a pattern emerges of similar functional engagement throughout entire cortical-basal ganglia loop circuits interconnecting associative cortical regions and associative districts in the striatum.

Individual Units in the Dorsolateral and Dorsomedial Striatum Similarly Encode Stimulus, Response and Outcome Parameters

Behavioral evidence strongly favors the view that the dorsomedial striatum mediates outcome-sensitive behavior and the dorsolateral striatum mediates outcome-insensitive (habitual, S-R) behavior (Balleine et al., 2009; Graybiel, 2008). The simultaneous recordings that we made allowed us to look for unit activity that might be correlated with aspects of these two postulated control functions for learning, including neural activity discriminating the stimuli (tactile or auditory), the responses (left or right turns) and the reinforcement outcome (reward or no-reward). Surprisingly, despite the striking differences between the ensemble activity patterns in the two regions, we found only modest differences in the proportions of single dorsolateral and dorsomedial neurons that differentiated between cue modalities, turn directions and trial outcomes. In both regions a majority of neurons discriminated between right and left turn responses; a large minority of neurons responded differently to the two modalities; and only a very small proportion of neurons were sensitive to trial outcome.

We did observe preferential responding by dorsolateral ensembles to the onset of the tactile conditional cues, whereas single dorsomedial units and ensembles preferentially responded to the onset of the auditory cues. These results, and the preference for contralateral turns in the dorsolateral but not dorsomedial striatum, are consistent with the differential projections of somatosensory and motor cortex to more lateral regions of dorsal striatum and auditory cortex to more medial regions (McGeorge and Faull, 1989). For the few neurons responding to the presentation or lack of reward at goal-reaching, the outcome-sensitive sample was larger in the dorsolateral striatum than in the dorsomedial striatum.

Together, these results suggest that comparable subsets of neurons in dorsolateral and dorsomedial regions of the striatum encode stimulus, response, reinforcement outcome, context, and/or performance parameters. Consistent with other studies (Barnes et al., 2005; Berke et al., 2009; Kimchi and Laubach, 2009a; Kimchi and Laubach, 2009b), we found that neurons responsive to the instruction cues and trial outcomes were sparse for both task-versions as well as across learning. Moreover, the neurons that did discriminate between instruction cue modalities (stimulus), turn directions (response), and reward at trial end (outcome) were largely independent populations (Lau and Glimcher, 2007; Schmitzer-Torbert and Redish, 2004). The unexpected similarity in single unit selectivities in the dorsolateral and dorsomedial striatal regions, combined with the (at most) sparse encoding of combinations of these parameters, suggests that the currently accepted stimulus-response control functions of the dorsolateral striatum and response-outcome control functions of the dorsomedial striatum are not distinguished by the conjunctive representations of stimulus, response and reinforcement outcome by spike activity in the two striatal regions.

In a series of analyses, we found no clear evidence for the activity of more than a few neurons in either striatal region as being related to stimulus or outcome value. Interestingly, in two rats, we observed stronger discrimination among turn-discriminative populations of neurons as training progressed, providing some evidence that action-value encoding may be an important function of striatal neurons. In these rats, right-turn-related firing increased in the dorsolateral striatum, whereas left-turn-related firing increased in the dorsomedial striatum, hinting that the encoding of action-value contingencies might differ between the two regions. The lack of conclusive evidence for value encoding in our experiment is somewhat surprising given previous studies (Kimchi and Laubach, 2009b; Lau and Glimcher, 2008; Samejima et al., 2005). However, our experiments were not designed to study value, and our estimates of value rely heavily on the assumption that stimulus values and response values are correlated with the percent correct performance of the rats throughout training. They must therefore be interpreted with some caution. We also failed to find single unit activity related to previous trial outcome or to the response executed in the previous trial (Histed et al., 2009; Kim et al., 2007; Kimchi and Laubach, 2009a). From a reinforcement learning perspective, the function of reward-contingent neuronal firing would be to update the value estimates associated with a chosen action or stimulus-action combination. The resulting synaptic plasticity changes may not necessarily result in immediate changes in firing on the subsequent trial.

Modes of Neural Firing in Associative and Sensorimotor Striatum

Prior studies have compared dorsolateral and dorsomedial striatal activity during motor skill learning (Yin et al., 2009) and during performance of instrumental behavior (Kimchi et al., 2009). The specific patterns that we have found to emerge in the associative and sensorimotor zones suggest two main modes of activity in the corresponding cortico-basal ganglia loops. First, we found that the dorsolateral task-bracketing pattern of ensemble activity can emerge early during training, before either motor performance or percent correct performance reach asymptote. Such early plasticity accords with the findings of Kimchi et al., but contrasts with those of Yin et al., during learning of markedly different tasks. For the dorsomedial striatum, we found that early increases and then later decreases of mid-run activity emerged with training. Kimchi et al. observed early changes in dorsomedial striatal activity that were sustained or enhanced with training, whereas Yin et al. observed heightened activity only during the initial stages of learning. In agreement with the former study, our findings demonstrate that dorsomedial striatal activity can develop in conjunction with dorsolateral activity and remain active long after the initial stages of learning. In agreement with the latter study, we observed a decline in dorsomedial striatal activation once our task was well learned. However, the relationship of our findings to these previous reports is complex. In contrast to these other studies, we used a task with a navigational component, our dorsomedial recording sites were anterior to those previously reported, and we trained the animals on two task-versions in single training sessions. Nevertheless, combined, these studies suggest that the acquisition of habitual behavior is characterized by the simultaneous operation of cortico-basal ganglia loops based in the dorsomedial and dorsolateral striatum, and that the modes of activation strongly depend on the demands of the task to be learned.

Interestingly, despite the view that dorsomedial striatal regions can mediate goal-directed or flexible responding early in training, few studies have yielded evidence for deficits in initial learning in rats with dorsomedial striatal lesions (Ragozzino, 2007; White, 2009). These previous results are consistent with the idea that multiple learning and memory systems interact in the expression of behavior, and suggest that performance deficits might not appear unless the task were to tax associative circuitry. Supporting this idea, one of the rare studies that did find learning deficits with dorsomedial striatal lesions suggested that the dorsomedial caudoputamen is essential for learning two responses to two similar arbitrary cues, in a paradigm with substantial similarities to the T-maze task used here (Adams et al., 2001). This result favors our suggestion that dorsomedial striatum – and its corresponding cortico-basal ganglia loops – could be important for performance monitoring, perhaps especially in disambiguating closely related contexts such that the correct action is chosen. This view is compatible both with the dorsomedial activity being related to the conflicting plasticity demands faced by the rats as they learned and the proposal that the conflict in task-version demands in itself produced the markedly heightened activity during the second phase of training in our experiment. In a number of other studies, it may be possible to interpret changes in behavioral performance following lesions of the dorsomedial striatum as being related, at least in part, to the inability to disambiguate closely related contexts (Corbit and Janak, 2007; Featherstone and McDonald, 2005; Kantak et al., 2001).

Both Task-Responsive and Non-Task-Responsive Neuronal Subpopulations Are Modulated During Learning

Both dorsolaterally and dorsomedially, a large population of putative projection neurons fired mainly during the baseline period rather than during the maze-runs themselves. We called these “non-task-responsive” neurons (NTRNs), recognizing nonetheless that the context specificity of these neurons and their modulation over the course of training suggests that they were in fact task-sensitive. We did not record after goal-reaching, during the time of reward consumption, due to noise artifact produced by chewing. It is possible that NTRNs (or the TRNs) responded at this time. Thus, we identified the NTRNs as those neurons lacking detectable phasic, in-task responses during the recording periods. These results confirm previous findings from our laboratory for the non-task-responsive neurons recorded in the dorsolateral striatum of rats and mice (Barnes et al., 2005; Kubota et al., 2009), as well as related findings by West and colleagues (Tang et al., 2007). Our findings further suggest that the distinction between neuronal populations with and without significant phasic activity during the task holds across at least two regions of the striatum.

Approximately half of the recorded neurons were classified as TRNs, and about a quarter of the neurons were medium spiny units classified as NTRNs. These estimates are approximate: neurons silent during the task would not have been counted unless we detected their activity during the baseline period. Using a less strict criterion for classifying task-responsiveness, the same as that used by Barnes et al. (Barnes et al., 2005), we found, as they did, that the phasic and quiet neuronal populations were nearly equal in size. With this classification, we also began to detect weak phasic activity in the population of neurons presumably without responses, and thus chose to report our results using the more conservative classifier.

Nevertheless, these results raise the possibility that the two classes of neurons might correspond, at least in part, to the direct and indirect pathway neurons of the striatum. Yin and colleagues reported evidence suggesting that in rats performing a rotarod motor learning task, the striatal neurons that undergo major changes during learning correspond to D2-class dopamine receptor-bearing indirect pathway neurons (Yin et al., 2009). We found large-scale changes in both the TRNs and the NTRNs, but we did find a greater quieting of the NTRNs in the dorsolateral striatum, which is enriched in D2-class dopamine receptors, than in the dorsomedial striatum, which expresses lower levels of D2-class receptors. Moreover, we found that for both dorsolateral and dorsomedial NTRN ensembles, the in-task decrease in activity was correlated with behavioral performance improvements, including increasing percent correct and decreasing running times. Selective targeting of neuronal subtypes during recording, now becoming feasible, will help to settle the identity of these two populations of striatal neurons.

Simultaneous Activation of Dorsolateral and Dorsomedial Striatum Has Implications for Understanding Cortico-Basal Ganglia Loop Functions

The central issue that we attempted to address in this study is how, during the course of habit learning, the neural activities in two key striatal regions change. Our results suggest that there are fundamental differences in the patterns of activity in associative and sensorimotor cortico-basal ganglia loops in the task-times of maximal ensemble activity during learning, in the dynamics of the activity changes across learning, and in the relation of the activity in each region to the behavioral parameters that we were able to measure. We conclude that cortico-basal ganglia loops can operate simultaneously and with contrasting behavior-related dynamics during procedural learning.

The strikingly different dynamics of the acquired activity patterns in the two striatal regions are of special interest and raise a key question. Why, if the task-bracketing pattern appeared in the sensorimotor striatum early during training, and is a correlate of habitual performance (Barnes et al., 2005), did it not drive habitual behavior from its earliest time of appearance? As a working hypothesis, we propose that the differing dynamics of the activity patterns we observed in the dorsomedial and dorsolateral striatum hold a clue to the answer (Figure 8C). We suggest that even if the dorsolateral activity could have directed behavior from early in training, this dorsolateral activity was able to gain access to such executive capacity only after activity subsided in associative cortico-basal ganglia loops engaging the dorsomedial striatum (Figure 8C).

According to this model, exploration driven by frontostriatal associative circuits would be the default mode for behavior in a new learning environment. During the middle training blocks in the T-maze task, strong dorsolateral task-bracketing activity would have indicated that the neural bases for a habit existed, but equally strong or stronger dorsomedial activation would have prevented its expression. Finally, following mastery of all aspects of the task, the subsiding of dorsomedial activation would have enabled dorsolaterally-based habitual behavior to be expressed (Figure 8C). Though perhaps overly explicit, the core idea of this model is that there is a permissive role of the associative striatum in the evolution of behavior toward habitual performance. Such a permissive function would not require a direct transfer of information from the dorsomedial to the dorsolateral striatum. Rather, through their output connections, they could set up a competition at downstream targets (including regions of the neocortex or brainstem), enabling the disruption of habitual responses that would otherwise be driven by dorsolateral striatum-based loops.

This conceptualization, which considers the dynamics of simultaneously active sensorimotor and associative striatal circuits during training, has implications for many popular models of cortico-basal ganglia loop function (Daw et al., 2005; Graybiel, 2008; Horvitz, 2009; Samejima and Doya, 2007; Yin et al., 2008). By extension, our suggestion that the dorsomedial striatum has a permissive function relative to the dorsolateral striatal circuits that release or inhibit action also has potential clinical implications. Our findings suggest the possibility that in dysfunctions such as seen in addiction, it is the lack of normal associative striatal cortico-basal ganglia circuit activation that contributes more to the pathology than the development of sensorimotor S-R associations per se, though this S-R activity may be most obvious in the addicted state (Graybiel, 2008; Kalivas, 2008; Robbins et al., 2008; Volkow et al., 2009). The classical idea that the prefrontal cortex can act as an inhibitory gate on motor cortex could thus be extended to the entire associative cortico-striatal loop circuitry. The flexibility of activity in the dorsomedial striatum, seen here in the waxing and waning of activity during the course of training, thus could be critical to the emergence of less flexible, habitual patterns of behavior.

EXPERIMENTAL PROCEDURES

All experimental procedures were approved by the Committee on Animal Care of the Massachusetts Institute of Technology. Eight adult (300-350 g) male Long-Evans rats were housed in individual cages in a reverse light-cycle cubicle (lights on: 9 pm-9 am), and were trained during their active cycle. Rats were placed on food restriction such that they maintained at least 90% of their free-feeding weight.

Acclimation.

Prior to surgery, rats were acclimated to the T-maze. For 3-5 sessions, chocolate-flavored sprinkles were placed throughout the maze, and rats were allowed to explore and eat freely. For 1-2 sessions, only the goals were baited and rats could again explore and eat freely. For 1-3 final acclimation sessions, rats received up to 10 trials in which they had to wait at start while both goal arms were baited, and could run in the maze only after the gate was opened.

Surgery.

Each rat was anesthetized with a ketamine/xylazine mixture (100 mg/kg ketamine + 10 mg/kg xylazine). A headstage loaded with 11-12 tetrodes, 5-7 targeting the medial striatum (AP = 1.7 mm, ML = +1.8 mm) and 5-6 targeting the lateral striatum (AP = 0.5 mm, ML = +3.5 mm), was implanted and secured with dental cement and jeweler’s screws. During the week following surgery, tetrodes were lowered to their target depths (3.5-4.5 mm, both sites).

Behavioral Training.

Rats concurrently acquired auditory and tactile versions of a T-maze task. The direction of the baited goal arm was instructed by one of two tones (1 or 8 kHz) or one of two tactile floor textures (rough or smooth runway insert). The turn direction indicated by a given stimulus was varied across subjects, and for each subject, remained consistent throughout training. Trials of the auditory and tactile task versions were interleaved within single daily sessions in sets of 20 trials per modality, with the starting modality alternated daily. Within each set, stimuli instructing each turn direction were presented pseudorandomly. Recordings were made from the session in which the rats first encountered the conditional stimuli through 10 days of overtraining. Rats in Group 1 (n = 6) received training on a more difficult version of tactile discrimination, which they failed to acquire, and overtraining ended for these rats after 10 consecutive sessions in which performance on the auditory version was greater than 72.5% correct. Rats in Group 2 (n = 3) acquired an easier version of the tactile task, and their overtraining ended after 10 consecutive sessions in which performance on both versions was above 72.5% correct. Sessions were divided for analysis into 3 training blocks: block 1, in which performance on both task versions was below 72.5% (stages A1-A5); block 2, in which performance on the auditory trials was above 72.5%, but performance on the tactile trials remained below 72.5% (stages B1-B5); and for Group 2 rats, a block 3 (stages C1-C5), in which performance on both auditory and tactile versions was above 72.5%.
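The block definitions above amount to a simple rule on the two percent correct values. A minimal sketch is given below; the function name `training_block` is hypothetical, treating a score of exactly 72.5% as not above criterion is an assumption, and the actual assignment of sessions to stages within each block is not reproduced here.

```python
CRITERION = 72.5  # percent correct criterion stated in the text

def training_block(auditory_pc, tactile_pc, criterion=CRITERION):
    """Return the training block (1, 2 or 3) implied by performance on the
    auditory and tactile task versions, or None if no block applies.

    Assumption: a score exactly equal to the criterion counts as below it.
    """
    auditory_ok = auditory_pc > criterion
    tactile_ok = tactile_pc > criterion
    if not auditory_ok and not tactile_ok:
        return 1  # neither version acquired
    if auditory_ok and not tactile_ok:
        return 2  # auditory acquired, tactile not yet acquired
    if auditory_ok and tactile_ok:
        return 3  # both versions acquired (reached only by Group 2 rats)
    return None   # tactile above and auditory below: not described in the text

print(training_block(80.0, 60.0))
```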

Recording.

Neural recordings were made with a Cheetah Data Acquisition System (Neuralynx, MT), the position of the rat was acquired by an overhead CCD camera, and behavioral events were identified based on photobeam breaks throughout the maze.

Unit classification.

Recorded spikes were manually sorted into different clusters (units) using Plexon Offline Sorter (Plexon, TX). Units were then graded for quality and classified as putative medium spiny, fast firing, or tonically-active subtypes using methods described elsewhere (Barnes et al., 2005, and see Supplemental Experimental Procedures). A medium spiny neuron was further classified as “task-responsive” if its firing rate in any ±300-ms peri-event window was more than 2 standard deviations above its baseline firing rate for 3 consecutive 20-ms bins. Units not classified as task-responsive were deemed “non-task-responsive.”
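The task-responsiveness criterion can be sketched as a search for a run of supra-threshold bins. In this sketch the function names are hypothetical, and the baseline mean and standard deviation are assumed to be supplied by the caller, since the baseline period is not defined in this section.

```python
def window_is_responsive(bin_rates, baseline_mean, baseline_sd,
                         n_sd=2.0, run_length=3):
    """True if the rate exceeds baseline_mean + n_sd * baseline_sd
    in at least run_length consecutive bins of one peri-event window."""
    threshold = baseline_mean + n_sd * baseline_sd
    run = 0
    for rate in bin_rates:
        run = run + 1 if rate > threshold else 0
        if run >= run_length:
            return True
    return False


def unit_is_task_responsive(windows, baseline_mean, baseline_sd):
    """True if any peri-event window (a list of 20-ms bin rates) meets the criterion."""
    return any(window_is_responsive(w, baseline_mean, baseline_sd)
               for w in windows)
```

A unit for which unit_is_task_responsive returns False would fall into the non-task-responsive group.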

Z-scores.

For each unit, the mean number of spikes across all trials in a session was determined for each 20-ms bin in a ±300 ms peri-event window around each of the 9 task events. The mean, Smean, and standard deviation, Sstd, were then obtained for all 261 bins (29 bins × 9 events). For each 20-ms bin, a z-score was calculated by normalizing the mean spike count in each bin: Zbin = (Sbin – Smean) / Sstd. To obtain population activity for each stage, the mean z-score and standard error of the mean (SEM) were calculated for each bin across all units included in each stage, and smoothed with a 3-point averaging filter.
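The normalization and smoothing steps can be sketched as below. The function names are hypothetical. Two details are assumptions rather than statements from the text: the sample (n - 1) standard deviation is used, and the 3-point filter averages only the available neighbors at the ends of a window.

```python
import math
from statistics import mean, stdev


def zscore_bins(mean_counts):
    """Z-score one unit's per-bin mean spike counts (all bins, all events)."""
    s_mean = mean(mean_counts)
    s_std = stdev(mean_counts)  # assumption: sample standard deviation
    return [(s - s_mean) / s_std for s in mean_counts]


def population_mean_sem(unit_z):
    """Per-bin mean and SEM across units; unit_z is a list of per-unit z-score lists."""
    n_units = len(unit_z)
    means, sems = [], []
    for b in range(len(unit_z[0])):
        column = [u[b] for u in unit_z]
        means.append(mean(column))
        sems.append(stdev(column) / math.sqrt(n_units))
    return means, sems


def smooth3(values):
    """3-point moving average; edge bins use the neighbors that exist (assumption)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out
```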

Difference measures.

Three complementary measures were used. First, for each 20-ms bin a t-test was performed to compare the mean z-scores of different groups of neurons. The difference between two patterns was expressed as the percentage of significantly differing (p < 0.01) bins: 100 * Nsig / Ntotal, where Ntotal = 261. Next, we calculated a residual sum of squares measure: RSS = ∑ (Zbin,1 – Zbin,2)². Finally, we computed the symmetrized Kullback-Leibler divergence: KL = ∑ Pbin,1 * ln(Pbin,1 / Pbin,2) + ∑ Pbin,2 * ln(Pbin,2 / Pbin,1). Pbin is a spiking distribution calculated from the ensemble z-scores: Pbin = (Zbin + α) / ∑(Zbin + α), where α = 1 was added to the ensemble z-scores such that values for all bins were greater than 0.
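The three measures can be written out directly from these definitions. The sketch below uses hypothetical function names and assumes that the per-bin p-values from the t-tests have already been computed elsewhere. It raises an error if any shifted z-score is not positive, since the logarithm would then be undefined.

```python
import math


def percent_significant(p_values, alpha_level=0.01):
    """100 * Nsig / Ntotal for a list of per-bin p-values."""
    n_sig = sum(1 for p in p_values if p < alpha_level)
    return 100.0 * n_sig / len(p_values)


def rss(z1, z2):
    """Residual sum of squares between two ensemble z-score patterns."""
    return sum((a - b) ** 2 for a, b in zip(z1, z2))


def spiking_distribution(z, alpha=1.0):
    """P_bin = (Z_bin + alpha) / sum(Z_bin + alpha)."""
    shifted = [v + alpha for v in z]
    if min(shifted) <= 0:
        raise ValueError("all shifted z-scores must be greater than 0")
    total = sum(shifted)
    return [v / total for v in shifted]


def symmetrized_kl(z1, z2, alpha=1.0):
    """Sum of the two directed Kullback-Leibler divergences."""
    p1 = spiking_distribution(z1, alpha)
    p2 = spiking_distribution(z2, alpha)
    forward = sum(a * math.log(a / b) for a, b in zip(p1, p2))
    backward = sum(b * math.log(b / a) for a, b in zip(p1, p2))
    return forward + backward
```

Both rss and symmetrized_kl return 0 for identical patterns and are unchanged when the two patterns are swapped.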

Entropy.

Entropy was calculated for each stage from the spiking distribution Pbin: H(s) = - ∑ Pbin * ln(Pbin). The mean entropy and 95% confidence intervals were estimated using 1000 bootstrap samples from the neuronal population. Mean entropy was then correlated with behavioral performance measures.
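A minimal sketch of the entropy estimate and its bootstrap follows. The function names are hypothetical. It is assumed here that each bootstrap sample resamples units with replacement, recomputes the ensemble mean z-score per bin, and that the confidence limits are taken as percentiles of the bootstrap distribution. None of these details is stated in the text.

```python
import math
import random


def entropy_from_z(z, alpha=1.0):
    """H = -sum(P_bin * ln(P_bin)) with P_bin = (Z_bin + alpha) / sum(Z_bin + alpha)."""
    shifted = [v + alpha for v in z]
    if min(shifted) <= 0:
        raise ValueError("all shifted z-scores must be greater than 0")
    total = sum(shifted)
    return -sum((v / total) * math.log(v / total) for v in shifted)


def bootstrap_entropy(unit_z, n_boot=1000, alpha=1.0, seed=0):
    """Return (mean entropy, (lower, upper)) from resampling units with replacement.

    unit_z is a list of per-unit z-score lists, all of the same length.
    """
    rng = random.Random(seed)
    n_units, n_bins = len(unit_z), len(unit_z[0])
    samples = []
    for _ in range(n_boot):
        picks = [unit_z[rng.randrange(n_units)] for _ in range(n_units)]
        ensemble = [sum(u[b] for u in picks) / n_units for b in range(n_bins)]
        samples.append(entropy_from_z(ensemble, alpha))
    samples.sort()
    lower = samples[int(0.025 * n_boot)]       # approximate 2.5th percentile
    upper = samples[int(0.975 * n_boot) - 1]   # approximate 97.5th percentile
    return sum(samples) / n_boot, (lower, upper)
```

A flat pattern gives the maximum entropy, ln(N) for N bins, so more sharply structured activity yields lower values.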

Characterizing pattern development across training.

We performed segmented regressions on the z-scores in each bin to obtain the slope of the regression and the 95% confidence limits, after determining the best regressions for each region based on the entropy data. A segmented linear regression was deemed a better fit than a single linear regression if the slopes of both segments were significantly different from 0 at p < 0.05 and the coefficient of determination for the segmented regression was much greater than the R² value for the single regression (CD > 4 * R²). For dorsolateral entropy, no breakpoint was found that provided a piecewise fit that was better than the single regression. For the dorsomedial striatum, only one potential breakpoint met these criteria (stage B1).
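The goodness-of-fit part of this comparison can be sketched as below. This is an illustration under several assumptions, with hypothetical function names: the two segments are fitted independently by ordinary least squares, the segmented coefficient of determination is computed from the pooled residuals of both segments, and the test that each segment's slope differs from 0 is not implemented.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_ss(xs, ys, slope, intercept):
    """Sum of squared residuals of a fitted line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def compare_fits(xs, ys, break_index, ratio=4.0):
    """Return (R2_single, R2_segmented, segmented_preferred).

    Points before break_index form the first segment, the rest the second.
    Only the criterion R2_segmented > ratio * R2_single is checked here.
    """
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2_single = 1.0 - residual_ss(xs, ys, *linear_fit(xs, ys)) / ss_tot
    ss_seg = 0.0
    for sx, sy in ((xs[:break_index], ys[:break_index]),
                   (xs[break_index:], ys[break_index:])):
        ss_seg += residual_ss(sx, sy, *linear_fit(sx, sy))
    r2_seg = 1.0 - ss_seg / ss_tot
    return r2_single, r2_seg, r2_seg > ratio * r2_single
```

Here xs would be the training stage indices and ys the mean entropy per stage; each candidate breakpoint would be evaluated in turn.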

Single unit discriminations.

For each unit, the number of spikes in a ±300-ms window around each of the 9 task events was calculated for each trial. For each task epoch, the mean spike counts for two conditions (e.g. auditory trials vs. tactile trials) were then compared using a standard t-test assuming unequal variances and accepted as significantly different if p < 0.01. At least 10 trials were required in each condition to perform the test; thus, late in training, several units were excluded from the correct/incorrect discrimination. To determine the percentage of units expected to make each discrimination by chance, trials in each session were randomly assigned to each comparison group such that the sizes of the original groups were maintained, and a t-test was performed on the shuffled data.
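The unequal-variance comparison and the shuffle control can be sketched as follows, with hypothetical function names. Only the Welch t statistic and its Welch-Satterthwaite degrees of freedom are computed. Converting them to a p-value requires the t-distribution, which the Python standard library does not provide (scipy.stats.ttest_ind with equal_var=False is one option), so that step is left out.

```python
import math
import random
from statistics import mean, variance


def enough_trials(a, b, min_trials=10):
    """The test is run only if both conditions have at least min_trials trials."""
    return len(a) >= min_trials and len(b) >= min_trials


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two lists of spike counts."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def shuffled_groups(a, b, rng):
    """Randomly reassign trials to two groups, preserving the original group sizes."""
    pooled = list(a) + list(b)
    rng.shuffle(pooled)
    return pooled[:len(a)], pooled[len(a):]
```

Running welch_t on the output of shuffled_groups for every unit, and counting how often the result reaches significance, would give the chance-level percentage.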

Supplementary Material

Figures and Text

ACKNOWLEDGMENTS

This work was funded by NIH MH60379, ONR N000140410208, the Stanley H. and Sheila G. Sydney Fund, European Union grant 201716 and a fellowship (CT) from the McGovern Institute for Brain Research. We thank Patricia Harlan, Christine Keller-McGandy, Henry Hall, Gila Fakterman and Kyle Smith for their help.

REFERENCES

  1. Adams S, Kesner RP, Ragozzino ME. Role of the medial and lateral caudate-putamen in mediating an auditory conditional response association. Neurobiol Learn Mem. 2001;76:106–116. doi: 10.1006/nlme.2000.3989.
  2. Balleine BW, Liljeholm M, Ostlund SB. The integrative function of the basal ganglia in instrumental conditioning. Behav Brain Res. 2009;199:43–52. doi: 10.1016/j.bbr.2008.10.034.
  3. Barnes TD, Kubota Y, Hu D, Jin DZ, Graybiel AM. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature. 2005;437:1158–1161. doi: 10.1038/nature04053.
  4. Berke JD, Breck JT, Eichenbaum H. Striatal versus hippocampal representations during win-stay maze performance. J Neurophysiol. 2009;101:1575–1587. doi: 10.1152/jn.91106.2008.
  5. Cardinal RN. Neural systems implicated in delayed and probabilistic reinforcement. Neural Netw. 2006;19:1277–1301. doi: 10.1016/j.neunet.2006.03.004.
  6. Carter CS, Braver TS, Barch DM, Botvinick MM, Noll D, Cohen JD. Anterior cingulate cortex, error detection, and the online monitoring of performance. Science. 1998;280:747–749. doi: 10.1126/science.280.5364.747.
  7. Corbit LH, Janak PH. Inactivation of the lateral but not medial dorsal striatum eliminates the excitatory impact of Pavlovian stimuli on instrumental responding. J Neurosci. 2007;27:13977–13981. doi: 10.1523/JNEUROSCI.4097-07.2007.
  8. Dagher A, Robbins TW. Personality, addiction, dopamine: insights from Parkinson’s disease. Neuron. 2009;61:502–510. doi: 10.1016/j.neuron.2009.01.031.
  9. Daw ND, Niv Y, Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci. 2005;8:1704–1711. doi: 10.1038/nn1560.
  10. DeLong MR, Wichmann T. Circuits and circuit disorders of the basal ganglia. Arch Neurol. 2007;64:20–24. doi: 10.1001/archneur.64.1.20.
  11. Emondi AA, Rebrik SP, Kurgansky AV, Miller KD. Tracking neurons recorded from tetrodes across time. J Neurosci Methods. 2004;135:95–105. doi: 10.1016/j.jneumeth.2003.12.022.
  12. Featherstone RE, McDonald RJ. Lesions of the dorsolateral or dorsomedial striatum impair performance of a previously acquired simple discrimination task. Neurobiol Learn Mem. 2005;84:159–167. doi: 10.1016/j.nlm.2005.08.003.
  13. Graybiel AM. Habits, rituals, and the evaluative brain. Annu Rev Neurosci. 2008;31:359–387. doi: 10.1146/annurev.neuro.29.051605.112851.
  14. Graybiel AM, Mink JW. The basal ganglia and cognition. In: Gazzaniga M, editor. The Cognitive Neurosciences IV. MIT Press; Cambridge, MA: 2009.
  15. Histed MH, Pasupathy A, Miller EK. Learning substrates in the primate prefrontal cortex and striatum: sustained activity related to successful actions. Neuron. 2009;63:244–253. doi: 10.1016/j.neuron.2009.06.019.
  16. Horvitz JC. Stimulus-response and response-outcome learning mechanisms in the striatum. Behav Brain Res. 2009;199:129–140. doi: 10.1016/j.bbr.2008.12.014.
  17. Kalivas PW. Addiction as a pathology in prefrontal cortical regulation of corticostriatal habit circuitry. Neurotox Res. 2008;14:185–189. doi: 10.1007/BF03033809.
  18. Kantak KM, Green-Jordan K, Valencia E, Kremin T, Eichenbaum HB. Cognitive task performance after lidocaine-induced inactivation of different sites within the basolateral amygdala and dorsal striatum. Behav Neurosci. 2001;115:589–601. doi: 10.1037//0735-7044.115.3.589.
  19. Kim YB, Huh N, Lee H, Baeg EH, Lee D, Jung MW. Encoding of action history in the rat ventral striatum. J Neurophysiol. 2007;98:3548–3556. doi: 10.1152/jn.00310.2007.
  20. Kimchi EY, Laubach M. The dorsomedial striatum reflects response bias during learning. J Neurosci. 2009a;29:14891–14902. doi: 10.1523/JNEUROSCI.4060-09.2009.
  21. Kimchi EY, Laubach M. Dynamic encoding of action selection by the medial striatum. J Neurosci. 2009b;29:3148–3159. doi: 10.1523/JNEUROSCI.5206-08.2009.
  22. Kimchi EY, Torregrossa MM, Taylor JR, Laubach M. Neuronal correlates of instrumental learning in the dorsal striatum. J Neurophysiol. 2009;102:475–489. doi: 10.1152/jn.00262.2009.
  23. Kubota Y, Liu J, Hu D, DeCoteau WE, Eden UT, Smith AC, Graybiel AM. Stable encoding of task structure coexists with flexible coding of task events in sensorimotor striatum. J Neurophysiol. 2009;102:2142–2160. doi: 10.1152/jn.00522.2009.
  24. Lau B, Glimcher PW. Action and outcome encoding in the primate caudate nucleus. J Neurosci. 2007;27:14502–14514. doi: 10.1523/JNEUROSCI.3060-07.2007.
  25. Lau B, Glimcher PW. Value representations in the primate striatum during matching behavior. Neuron. 2008;58:451–463. doi: 10.1016/j.neuron.2008.02.021.
  26. McGeorge AJ, Faull RL. The organization of the projection from the cerebral cortex to the striatum in the rat. Neuroscience. 1989;29:503–537. doi: 10.1016/0306-4522(89)90128-0.
  27. Middleton FA, Strick PL. Basal ganglia and cerebellar loops: motor and cognitive circuits. Brain Res Brain Res Rev. 2000;31:236–250. doi: 10.1016/s0165-0173(99)00040-5.
  28. Ragozzino ME. The contribution of the medial prefrontal cortex, orbitofrontal cortex, and dorsomedial striatum to behavioral flexibility. Ann N Y Acad Sci. 2007;1121:355–375. doi: 10.1196/annals.1401.013.
  29. Robbins TW, Ersche KD, Everitt BJ. Drug addiction and the memory systems of the brain. Ann N Y Acad Sci. 2008;1141:1–21. doi: 10.1196/annals.1441.020.
  30. Rushworth MF. Intention, choice, and the medial frontal cortex. Ann N Y Acad Sci. 2008;1124:181–207. doi: 10.1196/annals.1440.014.
  31. Rushworth MF, Behrens TE. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci. 2008;11:389–397. doi: 10.1038/nn2066.
  32. Samejima K, Doya K. Multiple representations of belief states and action values in corticobasal ganglia loops. Ann N Y Acad Sci. 2007;1104:213–228. doi: 10.1196/annals.1390.024.
  33. Samejima K, Ueda Y, Doya K, Kimura M. Representation of action-specific reward values in the striatum. Science. 2005;310:1337–1340. doi: 10.1126/science.1115270.
  34. Schall JD, Stuphorn V, Brown JW. Monitoring and control of action by the frontal lobes. Neuron. 2002;36:309–322. doi: 10.1016/s0896-6273(02)00964-9.
  35. Schmitzer-Torbert N, Redish AD. Neuronal activity in the rodent dorsal striatum in sequential navigation: separation of spatial and reward responses on the multiple T task. J Neurophysiol. 2004;91:2259–2272. doi: 10.1152/jn.00687.2003.
  36. Tang C, Pawlak AP, Prokopenko V, West MO. Changes in activity of the striatum during formation of a motor habit. Eur J Neurosci. 2007;25:1212–1227. doi: 10.1111/j.1460-9568.2007.05353.x.
  37. Volkow ND, Fowler JS, Wang GJ, Baler R, Telang F. Imaging dopamine’s role in drug abuse and addiction. Neuropharmacology. 2009;56(Suppl 1):3–8. doi: 10.1016/j.neuropharm.2008.05.022.
  38. Wassum KM, Cely IC, Maidment NT, Balleine BW. Disruption of endogenous opioid activity during instrumental learning enhances habit acquisition. Neuroscience. 2009;163:770–780. doi: 10.1016/j.neuroscience.2009.06.071.
  39. White NM. Some highlights of research on the effects of caudate nucleus lesions over the past 200 years. Behav Brain Res. 2009;199:3–23. doi: 10.1016/j.bbr.2008.12.003.
  40. Williams ZM, Eskandar EN. Selective enhancement of associative learning by microstimulation of the anterior caudate. Nat Neurosci. 2006;9:562–568. doi: 10.1038/nn1662.
  41. Worbe Y, Baup N, Grabli D, Chaigneau M, Mounayar S, McCairn K, Feger J, Tremblay L. Behavioral and movement disorders induced by local inhibitory dysfunction in primate striatum. Cereb Cortex. 2009;19:1844–1856. doi: 10.1093/cercor/bhn214.
  42. Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci. 2006;7:464–476. doi: 10.1038/nrn1919.
  43. Yin HH, Mulcare SP, Hilario MR, Clouse E, Holloway T, Davis MI, Hansson AC, Lovinger DM, Costa RM. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat Neurosci. 2009;12:333–341. doi: 10.1038/nn.2261.
  44. Yin HH, Ostlund SB, Balleine BW. Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. Eur J Neurosci. 2008;28:1437–1448. doi: 10.1111/j.1460-9568.2008.06422.x.
