eLife. 2017 Dec 18;6:e32353. doi: 10.7554/eLife.32353

Two-photon imaging in mice shows striosomes and matrix have overlapping but differential reinforcement-related responses

Bernard Bloem 1,2, Rafiq Huda 2,3, Mriganka Sur 2,3, Ann M Graybiel 1,2
Editor: Geoffrey Schoenbaum 4
PMCID: PMC5764569  PMID: 29251596

Abstract

Striosomes were discovered several decades ago as neurochemically identified zones in the striatum, yet technical hurdles have hampered the study of the functions of these striatal compartments. Here we used 2-photon calcium imaging in neuronal birthdate-labeled Mash1-CreER;Ai14 mice to image simultaneously the activity of striosomal and matrix neurons as mice performed an auditory conditioning task. With this method, we identified circumscribed zones of tdTomato-labeled neuropil that correspond to striosomes as verified immunohistochemically. Neurons in both striosomes and matrix responded to reward-predicting cues and were active during or after consummatory licking. However, we found quantitative differences in response strength: striosomal neurons fired more to reward-predicting cues and encoded more information about expected outcome as mice learned the task, whereas matrix neurons were more strongly modulated by recent reward history. These findings open the possibility of harnessing in vivo imaging to determine the contributions of striosomes and matrix to striatal circuit function.

Research organism: Mouse

Introduction

The striatum, despite its relatively homogeneous appearance in simple cell stains, is made up of a mosaic of macroscopic zones, the striosomes and matrix, which differ in their input and output connections and are thought to allow specialized processing by physically modular groupings of striatal neurons (Crittenden et al., 2016; Fujiyama et al., 2011; Gerfen, 1984; Graybiel and Ragsdale, 1978; Jiménez-Castellanos and Graybiel, 1989; Langer and Graybiel, 1989; Lopez-Huerta et al., 2016; Salinas et al., 2016; Smith et al., 2016; Stephenson-Jones et al., 2016; Walker et al., 1993; Watabe-Uchida et al., 2012). Particularly striking among these modules are the striosomes (also called patches), which are distinguished from the surrounding matrix and its constituent modules by differential expression of neurotransmitters, receptors and many other gene expression patterns, including those related to dopaminergic and cholinergic transmission (Banghart et al., 2015; Brimblecombe and Cragg, 2015; Brimblecombe and Cragg, 2017; Crittenden and Graybiel, 2011; Cui et al., 2014; Flaherty and Graybiel, 1994; Gerfen, 1992; Graybiel, 2010; Graybiel and Ragsdale, 1978). Striosomes in the anterior striatum have strong inputs from particular regions related to the limbic system, including parts of the orbitofrontal and medial prefrontal cortex (Eblen and Graybiel, 1995; Friedman et al., 2015; Gerfen, 1984; Ragsdale and Graybiel, 1990) and, at subcortical levels, the bed nucleus of the stria terminalis (Smith et al., 2016) and basolateral amygdala (Ragsdale and Graybiel, 1988). The striosomes are equally specialized in their outputs: they project directly to subsets of dopamine-containing neurons of the substantia nigra (Crittenden et al., 2016; Fujiyama et al., 2011) and, via the pallidum, to the lateral habenula (Rajakumar et al., 1993; Stephenson-Jones et al., 2016).
By contrast, the matrix and its constituent matrisomes receive abundant input from sensorimotor and associative parts of the neocortex (Flaherty and Graybiel, 1994; Gerfen, 1984; Parthasarathy et al., 1992; Ragsdale and Graybiel, 1990), and project via the main direct and indirect pathways to the pallidum and non-dopaminergic pars reticulata of the substantia nigra (Flaherty and Graybiel, 1994; Giménez-Amaya and Graybiel, 1991; Kreitzer and Malenka, 2008), universally thought to modulate movement control (Albin et al., 1989; Alexander and Crutcher, 1990; DeLong, 1990).

This contrast in connectivity between striosomes and the surrounding matrix highlights the possibility that striosomes, which physically form three-dimensional labyrinths within the much larger matrix, could serve as limbic outposts within the large sensorimotor matrix. The question of what the actual functions of striosomes are, however, remains unsolved. Answering this question has importance for clinical work as well as for basic science: striosomes have been found, in post-mortem studies, to be selectively vulnerable in disorders with neurologic and neuropsychiatric features (Crittenden and Graybiel, 2016; Saka et al., 2004; Sato et al., 2008; Tippett et al., 2007). Ideas about the functions of striosomes have ranged from striosomes serving as the critic in actor-critic architecture models (Doya, 1999), to their generating responsibility signals in hierarchical learning models (Amemori et al., 2011), to their being critical to motivationally demanding approach-avoidance decision-making prior to action (Friedman et al., 2017, 2015), and to other functions (Brown et al., 1999; Crittenden et al., 2016). However, reliably identifying and recording the activity of striosomal neurons has been exceedingly challenging technically; striosomes are still too small to be detected by fMRI, and their neurons have remained unrecognizable in in vivo electrophysiological studies with the exception of those identifying putative striosomes by combinations of antidromic and orthodromic stimulation (Friedman et al., 2017, 2015).
With the development of endoscopic calcium imaging (Bocarsly et al., 2015; Carvalho Poyraz et al., 2016; Luo et al., 2011) and 2-photon imaging of deep-lying structures (Dombeck et al., 2010; Howe and Dombeck, 2016; Kaifosh et al., 2013; Lovett-Barron et al., 2014; Mizrahi et al., 2004; Sato et al., 2016), combined with the use of genetic mouse models that allow direct visual identification of selectively labeled neurons, identifying functions of these specialized striatal zones should be within reach.

Here we report that we have developed a 2-photon microscopy protocol for simultaneously examining the activity of striosomal and matrix neurons in the dorsal caudoputamen of behaving head-fixed mice in which we used fate-mapping to label preferentially striosomal neurons by virtue of their early neurogenesis relative to that of matrix neurons (Fishell and van der Kooy, 1987; Graybiel, 1984; Graybiel and Hickey, 1982; Hagimoto et al., 2017; Kelly et al., 2017; Newman et al., 2015; Taniguchi et al., 2011). Key to this work was achieving dense, permanent labeling of not only striosomal cell bodies, but also their striosome-bounded neuropil. We accomplished this differential labeling by pulse-labeling with tamoxifen during the generation time of the spiny projection neurons (SPNs) of striosomes using Mash1(Ascl1)-CreER;Ai14 driver lines with induction at embryonic day (E) 11.5 (Kelly et al., 2017). This method allowed striosomal detection based on the labeling of SPN cell bodies as well as the rich neuropil labeling of the striosomes, capitalizing on the fact that SPN processes of striosome and matrix compartments rarely cross striosomal borders (Bolam et al., 1988; Lopez-Huerta et al., 2016; Walker et al., 1993). Thus even though only a fraction of striosomal neurons were tagged, it was possible, because of the restricted neuropil labeling generated by their local processes, to identify neurons as being inside striosomes and, concomitantly, to identify clearly neurons as lying outside of the zones of neuropil labeling, in the matrix.

With this method, we compared the activity patterns of striosomal and matrix neurons related to multiple elementary aspects of striatal encoding as mice performed a classical conditioning task. By having cues signaling different reward delivery probabilities, we tested whether striosomes and matrix differentially encode changes in expected outcome and received rewards (Amemori et al., 2015; Bayer and Glimcher, 2005; Bromberg-Martin and Hikosaka, 2011; Friedman et al., 2015; Keiflin and Janak, 2015; Matsumoto and Hikosaka, 2007; Oyama et al., 2010, 2015; Schultz, 2016; Schultz et al., 1997; Stalnaker et al., 2012; Watabe-Uchida et al., 2017, 2012). By imaging day by day during the acquisition and overtraining periods of the task, we asked whether these patterns changed in systematic ways with experience. Finally, we tested the effect of reward history on the activity patterns of current trials, given reports that strong reward-history activity has been found in sites considered to be directly or indirectly connected with striosomes (Bromberg-Martin et al., 2010; Hamid et al., 2016; Tai et al., 2012).

We demonstrate that neurons visually identified as being within striosomes or within the extra-striosomal matrix have considerable overlap in their response properties during all phases of task performance. Thus, striosomes and matrix share common features related to simple reward processing and manifest acquisition of responses to different task events as a result of reward-based learning. The activities of neurons in the striosome and matrix compartments differed, however, in their relative emphases on different task epochs. Striosomal neurons more strongly encoded reward prediction, and matrix neurons more strongly encoded reward history. These findings suggest that neurons in striosomes and matrix can be differentially tuned by reinforcement contingencies both during learning and during subsequent performance. This work opens the opportunity for future functional understanding of striosome-matrix architecture by in vivo microscopy combined with selective tagging of neurons with known developmental origins, an opportunity that will be valuable conceptually in linking developmental programs to circuit function, and in the study of both normal animals and those representing models of disease states.

Results

To detect striosomes, we performed experiments in Mash1-CreER;Ai14 mice, following the method of Kelly et al., 2017. This method takes advantage of the finding that Mash1 is a differential driver of the striosomal lineage during the ~E10-E13 window of neurogenesis of striosomes in mouse (Kelly et al., 2017). We injected pregnant Mash1-CreER;Ai14 dams with tamoxifen at E11.5, in the middle of this neurogenic phase of striosomal development. This treatment led to the permanent expression in the resulting offspring of tdTomato in cells being born at the time of induction. We found strong tdTomato labeling of striosomes in the striatal regions of the caudoputamen that we examined (Figure 1, Figure 1—figure supplement 1). Critically, this labeling marked not only the cell bodies of the striosomal neurons, but also their local processes, which were confined to the neuropil as confirmed histologically in initial immunohistochemical experiments (Figure 1). These experiments demonstrated that the clusters of labeled neurons and their neuropil corresponded to striosomes, as evidenced by the close match between the zones of tdTomato neuropil labeling and mu-opioid receptor 1 (MOR1)-rich immunostaining (Table 1) (Kelly et al., 2017; Tajima and Fukuda, 2013). We also observed sparsely distributed tdTomato-labeled neurons outside of MOR1-labeled striosomes, scattered in the extra-striosomal matrix, but they never exhibited patchy neuropil labeling.

Figure 1. Striosomes are labeled with tdTomato in Mash1-CreER;Ai14 mice that received tamoxifen at E11.5.

Images illustrate two examples (rows) of striosomal labeling of cell bodies and neuropil by tdTomato (A,D, red) as verified by MOR1 immunostaining identifying striosomes (B,E, blue). Merged images show overlap of tdTomato and MOR1 labeling (C,F). Scale bars indicate 100 µm.

Figure 1—figure supplement 1. Striosome labeling in Mash1-CreER;Ai14 mice injected with tamoxifen at E11.5.

Low-magnification images show tdTomato labeling in striosomes (A,D, red), striosomes detected in sections immunostained for MOR1 (B,E, blue), and overlap of the tdTomato and MOR1 signals (C,F). Scale bars indicate 500 µm.

Table 1. Overlap of striosomes outlined using tdTomato and MOR1.

| | MOR1 positive | MOR1 negative |
|---|---|---|
| tdTomato positive | 14.2% ± 1.3% | 2.0% ± 0.3% |
| tdTomato negative | 3.7% ± 0.6% | 80.2% ± 1.9% |

MOR1 test-retest error rate = 2.4%.

tdTomato test-retest error rate = 2.3%.
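As a worked illustration of how the four cells of Table 1 can be condensed into single agreement figures, the following minimal Python sketch computes overall agreement, the Jaccard index and the Dice coefficient from the tabulated means. These summary measures and the function name `overlap_summary` are our own illustrative choices and are not the quantification used in the study.

```python
def overlap_summary(both_pos, td_only, mor1_only, both_neg):
    """Summarize a 2x2 overlap table (illustrative helper, not from the paper).

    both_pos:  tdTomato positive and MOR1 positive
    td_only:   tdTomato positive, MOR1 negative
    mor1_only: tdTomato negative, MOR1 positive
    both_neg:  tdTomato negative and MOR1 negative
    """
    total = both_pos + td_only + mor1_only + both_neg
    # Fraction on which the two markers agree (both positive or both negative).
    agreement = (both_pos + both_neg) / total
    # Overlap relative to the union of the two positive regions.
    jaccard = both_pos / (both_pos + td_only + mor1_only)
    # Dice coefficient: overlap relative to the mean size of the two regions.
    dice = 2 * both_pos / (2 * both_pos + td_only + mor1_only)
    return agreement, jaccard, dice


# Mean percentages as listed in Table 1 (they sum to 100.1 because of rounding).
agreement, jaccard, dice = overlap_summary(14.2, 2.0, 3.7, 80.2)
print(f"agreement={agreement:.3f}, Jaccard={jaccard:.3f}, Dice={dice:.3f}")
```

Because most of the sampled tissue is negative for both markers, overall agreement is dominated by the negative cell; the Jaccard and Dice values describe the match of the positive zones themselves.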

For in vivo experiments, we used 2-photon microscopy to image the striatum of 5 striosome-labeled mice that had received unilateral intrastriatal injections of AAV5-hSyn-GCaMP6s and had been implanted with cannula windows and a headplate (Figure 2A). Each mouse was trained on a classical conditioning task in which two auditory tones (1.5 s duration each) were associated with reward delivery at different probabilities (tone 1, 80% vs tone 2, 20%) (Figure 2B). Inter-trial intervals were 7 ± 1.75 s. With training, mice began to lick in anticipation of the reward, and the amount of this anticipatory licking became greater when cued by the tone indicating a high probability (80%) of reward (Figure 2C). We calculated a learning criterion based on the anticipatory lick rates during the two cues and the subsequent delay period (0.5 s). Mice exhibiting a divergence in anticipatory licking for the two cues for at least two out of three consecutive sessions were considered as trained (Figure 2D). We performed imaging during training (n = 3; task acquisition) and after this criterion had been reached (n = 5; criterion). Two mice were trained for an additional five sessions (overtraining), in which we imaged the same fields of view as in the criterion phase.
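The trial structure described above can be sketched as a small Python generator of trial schedules. Tone duration, delay, reward probabilities and the 5.25 to 8.75 s inter-trial range come from the text; the equal mix of the two tones and the uniform sampling of inter-trial intervals are assumptions made only for this sketch, and `make_trials` is a hypothetical name.

```python
import random


def make_trials(n_trials, seed=0):
    """Generate a hypothetical trial schedule for the two-tone conditioning task."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Assumption: the two tones are presented equally often, in random order.
        tone = rng.choice(["high", "low"])
        p_reward = 0.8 if tone == "high" else 0.2
        trials.append({
            "tone": tone,
            "tone_s": 1.5,    # tone duration (s)
            "delay_s": 0.5,   # delay between tone offset and possible reward (s)
            "rewarded": rng.random() < p_reward,
            # Assumption: inter-trial intervals are uniform on 7 +/- 1.75 s.
            "iti_s": rng.uniform(7 - 1.75, 7 + 1.75),
        })
    return trials


trials = make_trials(200)
high = [t for t in trials if t["tone"] == "high"]
print(sum(t["rewarded"] for t in high) / len(high))  # close to 0.8 for large samples
```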

Figure 2. Behavioral task and performance.

(A) The striatum was imaged during conditioning sessions in which tones predicted reward delivery. (B) Two tones (4 and 11 kHz) were played (1.5 s duration) and were associated with distinct reward probabilities (80% or 20%). After a 0.5 s delay, reward could be delivered. Inter-trial interval durations varied from 5.25 to 8.75 s. (C) Frequency of licking after training, averaged over five mice (±SEM). Anticipatory licking was significantly higher during the presentation of the high-probability tone (blue) than during the presentation of the low-probability tone (green). After reward delivery, licking rates were elevated for several seconds (solid lines: rewarded trials; dotted lines: unrewarded trials). (D) Licking during the tone and reward delay, shown as z-scores calculated relative to the 2 s baseline period preceding the tone, during training sessions (average of 3 mice). Mice began to exhibit differences in levels of anticipatory licking between the two cues after 11–12 sessions. Animals were considered to be trained when they exhibited significantly higher anticipatory licking during the high-probability tone (blue) than during the low-probability tone (green) in 2 out of 3 consecutive sessions. Shading represents SEM.
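The training criterion (significantly higher anticipatory licking to the high-probability tone in 2 out of 3 consecutive sessions) can be expressed as a sliding-window check. The sketch below assumes that a per-session significance flag has already been computed by whatever statistical test is used; the function name and the choice to evaluate only complete three-session windows are illustrative assumptions.

```python
def first_trained_session(significant, window=3, needed=2):
    """Return the 0-based index of the session at which the criterion is first met.

    significant: list of booleans, one per session, True when anticipatory
                 licking was significantly higher for the high-probability tone.
    Only complete windows of `window` consecutive sessions are evaluated
    (an assumption of this sketch). Returns None if the criterion is never met.
    """
    for end in range(window, len(significant) + 1):
        if sum(significant[end - window:end]) >= needed:
            return end - 1  # last session of the first qualifying window
    return None


# Hypothetical session history: divergence appears in sessions 3 and 5 (1-based).
print(first_trained_session([False, False, True, False, True]))
```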

Imaging of striosomes

Clusters of tdTomato-positive neurons were clearly visible in vivo in the 2-photon microscope at 40x magnification, and the neuropil of these neurons delimited zones in which many dendritic processes could be identified (Figure 3). We simultaneously recorded transients in striosomal and matrix neurons from fields of view with clear striosomes. In all animals, we could see at least two different striosomes, from which we imaged at least five different non-overlapping fields of view. In some instances, we could see two different striosomes in one field of view. In the entire data set, we imaged 1867 neurons in striosomes and 4453 in the matrix. Because striosomes form parts of extended branched labyrinths, it was possible to follow some striosomes through ±100 µm in depth, and across ±800 µm in the field of view. During training, we rotated through the fields of view, but after the training criterion had been reached, we recorded activity in unique non-overlapping fields of view (2704 neurons, of which 727 were in striosomes; between 252 and 782 neurons per mouse; Table 2).

Figure 3. In vivo 2-photon calcium imaging of identified striosomes and matrix.

(A) Mash1-CreER;Ai14 mice were injected with AAV5-hSyn-GCaMP6s and 4 weeks later were implanted with a cannula. (B) Image of a striosome acquired with the 2-photon microscope, illustrating tdTomato labeling in red and GCaMP in green (scale bar: 100 µm) in the striatum of a trained mouse. (C–E) Higher magnification images of the region indicated in B (scale bar: 10 µm), shown for individual green (C), red (D) and merged (E) channels. Arrowheads indicate double-labeled cells. (F–H) Representative examples of striosomes imaged in three other trained mice (scale bars: 100 µm).

Table 2. Numbers of recorded neurons per mouse.

| Mouse | 1 | 2 | 3 | 4 | 5 | Total |
|---|---|---|---|---|---|---|
| Number of neurons | 587 | 782 | 252 | 426 | 657 | 2704 |
| Striosomal neurons | 218 (37.1%) | 214 (27.4%) | 41 (16.3%) | 77 (18.1%) | 177 (26.9%) | 727 (26.9%) |
| Matrix neurons | 369 (62.9%) | 568 (72.6%) | 211 (83.7%) | 349 (81.9%) | 480 (73.1%) | 1977 (73.1%) |
| tdTomato-positive neurons in striosomes | 33 (5.6%) | 21 (2.7%) | 11 (4.4%) | 13 (3.1%) | 33 (5.0%) | 111 (4.1%) |
| tdTomato-negative neurons in striosomes | 182 (31.0%) | 191 (24.4%) | 30 (11.9%) | 60 (14.1%) | 134 (20.4%) | 597 (22.1%) |
| tdTomato-positive neurons outside of striosomes | 3 (0.5%) | 2 (0.3%) | 0 (0.0%) | 4 (0.9%) | 10 (1.5%) | 19 (0.7%) |

To control for small but significant differences in GCaMP6s expression (Table 3) between striosomes and matrix, we calculated ΔF/F as: ΔF/F = (Ft – F0) / F0 (Ft: fluorescence at time t; F0: baseline fluorescence). We quantified the mean, standard deviation and maximum values of the ΔF/F signal during the baseline periods to test for potential differences in the signal-to-noise ratio of our recordings, but did not observe differences between striosomal and matrix neurons (Table 3).
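The normalization can be illustrated with a minimal Python sketch. It shows that two cells with different baseline brightness but the same relative fluorescence change yield the same ΔF/F. The baseline F0 is passed in directly because its estimation is not described in this section, and the example traces are hypothetical values chosen around the baseline means in Table 3.

```python
def delta_f_over_f(trace, f0):
    """Return (Ft - F0) / F0 for every sample Ft of a fluorescence trace."""
    if f0 <= 0:
        raise ValueError("baseline fluorescence F0 must be positive")
    return [(ft - f0) / f0 for ft in trace]


# Hypothetical traces: both cells brighten by 20% relative to their own baseline.
dimmer_cell = delta_f_over_f([290.0, 348.0], f0=290.0)
brighter_cell = delta_f_over_f([364.5, 437.4], f0=364.5)
print(dimmer_cell, brighter_cell)  # second sample is approximately 0.2 in both
```

Dividing by F0 is what removes the dependence on expression level: the raw increase differs between the two cells (58 vs 72.9 fluorescence units), whereas the normalized change does not.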

Table 3. Baseline fluorescence and ΔF/F values for striosomal and matrix neurons.


| Cell type | Striosomal | In striosomal neuropil | tdTomato labeled | Matrix |
|---|---|---|---|---|
| Baseline fluorescence | 290.0 (8.5) *** | 274.9 (8.1) *** | 337.2 (27.9) | 364.5 (6.8) |
| ΔF/F baseline mean | 11.3 (0.7) | 11.9 (0.7) | 9.3 (1.9) | 11.9 (0.4) |
| ΔF/F baseline standard deviation | 37.2 (1.3) | 38.2 (1.4) | 33.5 (3.5) | 38.5 (0.7) |
| ΔF/F baseline maximum | 250.6 (9.9) | 259.3 (11.2) | 216.2 (22.8) | 255.2 (6.0) |

***p<0.001.

Striatal neurons exhibit heightened activity during different task epochs

As an initial approach to our data, we analyzed the overall fluorescence for every session in trained animals by averaging the frame-wide fluorescence (Figure 4A). Both cues evoked large responses in the neuropil signal, which were calculated as z-scores based on the mean signal and its standard deviation during a 1 s period before cue onset. These signals were larger for the high-probability cue. After reward delivery, there was a prolonged, strong activation that peaked at around 3 s after reward delivery (Figure 4B). To determine more precisely the nature of this activation, we aligned neuronal responses in the rewarded trials to the tone onset, to the first lick after reward delivery and to the end of the licking bout (Figure 4B). This analysis demonstrated that, in addition to the tone response, there was an additional increase in the signal during the post-reward licking period, and that this signal increased over time, peaked at the time of the last lick, and then subsided.
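The baseline z-scoring used for these signals can be sketched as follows. The sketch assumes a sampling rate of 5 frames per second, so that the 1 s pre-cue baseline spans five frames, and it uses the population standard deviation; both are assumptions of this illustration, as is the function name.

```python
import statistics


def zscore_to_baseline(trace, cue_index, baseline_frames=5):
    """Z-score a trace using the frames immediately preceding cue onset.

    trace:           fluorescence (or dF/F) samples for one trial
    cue_index:       index of the first frame at cue onset
    baseline_frames: number of pre-cue frames (5 frames = 1 s at an assumed 5 Hz)
    """
    if cue_index < baseline_frames:
        raise ValueError("not enough pre-cue frames for the baseline")
    baseline = trace[cue_index - baseline_frames:cue_index]
    mu = statistics.fmean(baseline)
    sd = statistics.pstdev(baseline)  # assumption: population SD of the baseline
    if sd == 0:
        raise ValueError("baseline has zero variance")
    return [(v - mu) / sd for v in trace]


# Hypothetical trial: five baseline frames followed by a cue-evoked increase.
z = zscore_to_baseline([1, 2, 3, 2, 2, 6, 8], cue_index=5)
print([round(v, 2) for v in z])
```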

Figure 4. Striatal activity during reward-predicting cues and during post-reward period.

(A) Aggregate neuropil calcium signal in all four trial types (blue: high-probability cue; green: low-probability cue; solid line: rewarded trials; dotted line: unrewarded trials). Shading represents SEM. (B) Neuropil activation aligned to tone onset (left), first lick after reward delivery (middle) and last lick (right). Only rewarded trials with high-probability cues are included. (C, D) Responses of the neurons (D) color-coded in C during five sample trials (rows) for four different cue-outcome conditions (columns). Dotted lines indicate the tone and reward onsets. Scale bar in C represents 100 µm. Lines above each plot show when licks occurred. (E) Percentage of task-modulated neurons that were selectively active during cue, post-reward licking, or post-licking epochs of the task. Error bars represent 95% confidence intervals. (F) Population-averaged responses of task-modulated neurons selectively active during the three epochs. Data for neurons active during the post-reward licking period are separately shown aligned to the first and the last lick. (G) Session-averaged activity of all task-modulated neurons (left) and those that were significantly active during only one of three task epochs (right). Neurons were sorted by the timing of their peak activity.

Figure 4—figure supplement 1. Temporal specificity of post-reward licking responses.

(A) Session-averaged post-reward licking responses for observed (left) and shuffled (right) data. Data were shuffled for each neuron by substituting responses in a given trial with response in the same trial from a randomly picked task-modulated neuron recorded simultaneously. Each row is a single neuron. Responses were sorted based on the timing of the peak response for observed and shuffled data separately. (B) Quantification of trial-to-trial response reliability, calculated as the average correlation for all pairwise combinations of trials. (C) The observed decrease in response reliability with shuffling is unlikely due to changes in response amplitude because the shuffling procedure does not affect the mean amplitude of the peak responses. (D) Standard deviation of peak times for observed and shuffled data. (E) Ridge-to-background ratio for the color maps shown in A. The ridge was defined as five data points (1 s) surrounding the peak response, and background as all other data points. ***p<0.001.

Next, we analyzed single-cell activity to investigate the neural dynamics of task encoding by the striatal neurons. In particular, we asked whether the prolonged activation seen in the frame-wide fluorescence signal was also visible in single neurons, or whether individual neurons were active during specific task events. Neuronal firing as indicated by the calcium transients was sparse during the task, but we found that individual neurons were active for particular events during the task (Figure 4D,G). For instance, the red color-coded neuron illustrated in Figure 4C and D became active soon after tone onset, whereas the neuron color-coded in gray fired during the post-reward licking period. The timing of their activities with respect to specific trial events seemed relatively stable, resembling what has been reported before for neurons in the striatum of behaving rodents by recording and analyzing spike activity (Bakhurin et al., 2017; Barnes et al., 2011; Gage et al., 2010; Jog et al., 1999; Rueda-Orozco and Robbe, 2015). To determine task encoding by single neurons at a population level, we defined task-modulated neurons as those that were significantly active, according to Wilcoxon signed-rank tests, during the cue, reward licking and post-licking epochs of the task (see Materials and methods). Altogether, 38.2% of the striatal neurons imaged in our samples were task-modulated. Of these, most (85%) were active during only one of the three task epochs. Among task-modulated neurons, most were selectively active during the post-reward licking period (57%), but substantial numbers of neurons were also active during the tone presentation (17%) or after the licking had stopped (11%, Figure 4E,F).
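To make the classification step concrete, the following Python sketch implements a paired Wilcoxon signed-rank test with the normal approximation and applies it to per-trial epoch and baseline means. The pairing of epoch against baseline activity, the significance threshold of 0.05 and the function names are assumptions for illustration; the exact procedure is given in Materials and methods. In practice a tested library routine such as `scipy.stats.wilcoxon` would be preferable to this hand-written version.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation,
    no continuity or tie-variance correction). Returns (W_plus, p_value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, math.erfc(abs(z) / math.sqrt(2))


def is_task_modulated(epoch_means, baseline_means, alpha=0.05):
    """Hypothetical criterion: activity in the epoch significantly above baseline."""
    w_plus, p = wilcoxon_signed_rank(epoch_means, baseline_means)
    n = sum(1 for a, b in zip(epoch_means, baseline_means) if a != b)
    return p < alpha and w_plus > n * (n + 1) / 4
```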

For population analyses, we calculated z-scores for the neuronal responses using the mean and the standard deviation of the 1 s baseline period preceding tone onsets. Analysis of session-averaged population responses of neurons selectively active during these three epochs demonstrated a similar sequence of neuronal events as the sequence that we found with analysis of the frame-wide fluorescence signals. The activation of a small group of neurons after cue onset was followed by a prolonged increase in the responses of neurons active during the post-reward licking period (Figure 4F,G). This population activity ramped up until mice stopped licking, then quickly subsided (Figure 4F). The analysis of single-cell responses also identified a group of neurons that became maximally active just after the end of licking. Grouping neurons based on the epoch during which they were active and sorting responses within each group by the timing of their peak session-averaged activity exposed a tiling of task time by neurons active in each of the three epochs (Figure 4G).

To determine the temporal specificity of responses during the post-reward licking period for individual neurons, we compared them to the same responses shuffled for each neuron by substituting responses in a given trial with the response in the same trial from a randomly selected task-modulated neuron recorded simultaneously during the same session (Figure 4—figure supplement 1A). To quantify the trial-to-trial variability in responses, we computed a reliability index as the mean correlation of responses in all pairwise combinations of trials (Rikhye and Sur, 2015). Shuffling the data decreased response reliability, without affecting the mean peak responses, and increased the standard deviation of peak times (Figure 4—figure supplement 1B–D). In addition, we measured the ridge-to-background ratio, which quantifies the mean response magnitude surrounding response peaks relative to other time points (Harvey et al., 2012). We found that the ratio was higher for observed data as compared to the shuffled data (Figure 4—figure supplement 1E). Together, these analyses indicate temporal specificity in the responses of individual neurons and suggest that the prolonged ramping of population activity observed during the post-reward licking period was produced by individual neurons being active within different specific time intervals during licking, and not by them being active throughout the licking period.
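The two measures used here can be sketched in Python as follows. The reliability index is computed as the mean Pearson correlation over all pairs of trials, and the ridge-to-background ratio as the mean response in a five-point window around the peak divided by the mean of all remaining points. The handling of flat trials and of peaks near the edges of the trace are assumptions of this sketch, not details taken from the cited methods.

```python
import itertools
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat trial is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def reliability_index(trials):
    """Mean correlation over all pairwise combinations of single-trial responses."""
    pairs = list(itertools.combinations(trials, 2))
    if not pairs:
        raise ValueError("at least two trials are required")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def ridge_to_background(mean_response, ridge_width=5):
    """Mean of the points around the peak divided by the mean of all other points."""
    peak = max(range(len(mean_response)), key=lambda i: mean_response[i])
    half = ridge_width // 2
    lo = max(0, peak - half)                        # assumption: clip at the edges
    hi = min(len(mean_response), peak + half + 1)
    ridge = mean_response[lo:hi]
    background = mean_response[:lo] + mean_response[hi:]
    return (sum(ridge) / len(ridge)) / (sum(background) / len(background))
```

A neuron that responds at the same time on every trial gives a reliability index near 1 and a high ridge-to-background ratio; replacing its trials with those of other neurons, as in the shuffle control, is expected to lower both.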

Encoding of reward-predicting tones is stronger in striosomes than in the matrix

To dissociate the specific contributions of striosomes and matrix to task encoding, we again first compared aggregate GCaMP6s neuronal responses in both striatal compartments. We drew regions of interest (ROIs) around striosomes defined by tdTomato neuropil labeling and around nearby regions of the matrix in the same field of view with similar overall intensity of fluorescence and size, and compared the total amount of fluorescence from these regions. Both striosomes and matrix exhibited qualitatively similar responses, but there was a significantly stronger tone-evoked activation in striosomes than in the nearby matrix regions sampled (Figure 5A) (ANOVA main effect p<0.001). Moreover, the high-probability tone cue evoked a larger response than the low-probability tone cue (p<0.001), and there was a trend for an interaction between compartment and tone (p=0.055).

Figure 5. Striosomal neurons respond more strongly to reward-predicting cues than matrix neurons.

(A) Average striosomal (S, red) and matrix (M, black) neuropil activation during rewarded trials with high-probability cue (left), and quantification of the magnitude of the response to high- and low-probability cues (right), calculated for the time period indicated by blue box (left). **p<0.01, ***p<0.001 (ANOVA and post hoc t-test). Shading and error bars represent SEM. (B) Neuropil selectivity for the rewarded vs. unrewarded trials for every time point in the trials in high-probability trials (left) and the average selectivity during the time indicated in the blue box (left) for both trial types (right). (C) Average neuropil post-reward activity aligned to the first (left) or last lick (middle), and average response during the ±1 s period (right). *p<0.05, **p<0.01 (ANOVA and post hoc t-test). (D) Trial-by-trial response of three striosomal (left block) and three matrix (right block) neurons that were selectively active during the cue (left), post-reward licking (middle), or end of licking (right) task-epochs. Green and red dots show, respectively, the first lick after reward delivery and the last lick. Average responses for the same neurons are shown underneath the color plots. (E) Proportion of all task-modulated striosomal and matrix neurons (left) and those that were modulated selectively during cue, post-reward licking, or post-licking epochs of the task (right). **p<0.01, ***p<0.001 (Fisher’s exact test). Error bars represent 95% confidence intervals. (F) Session-averaged responses of all task-modulated striosomal (left) and matrix (right) neurons, plotted on the color scale shown in D. Neurons are grouped and sorted as were those shown in Figure 4G. (G) Population-averaged responses of all task-modulated striosomal and matrix neurons to the high-probability cue (left), and the population responses separately averaged for high- and low-probability cues (right). **p<0.01, ***p<0.001 (ANOVA and post hoc t-test). Shading and error bars represent SEM. 
(H) Discriminability between rewarded and unrewarded trials for striosomal and matrix neurons. Left plot shows selectivity during trials with high-probability cue, and right plot shows average discriminability for all trials (quantified over 1–2 s time window after reward delivery). *p<0.05, **p<0.01 (ANOVA and post hoc t-test). (I,J) Population-averaged response during post-reward licking (I) or post-licking (J) periods, with data aligned, respectively, to first and last lick after reward delivery. *p<0.05 (ANOVA and post hoc t-test).

Figure 5—figure supplement 1. Response reliability of task-related responses of striosomal (red) and matrix (black) neurons.

Reliability of responses during the cue (left), post-reward licking (middle), or post-licking (right) task epochs for striosomal and matrix neurons was quantified as the average correlation for all pairwise combinations of trials. Only neurons that were task-modulated during these epochs were included in the analysis.

We tested for the selectivity of the responses to the high- and low-probability tone cues by quantifying the area under the receiver operating characteristic curve (AUROC) for these responses. Both striosomal and matrix responses displayed significant selectivity for the high-probability tone (p<0.05), but there was no difference in selectivity between the striosomes and matrix at this stage of learning. We also performed an AUROC analysis to estimate the selectivity for rewarded trials (Figure 5B). Both striosomal and matrix neuropil had elevated activity in rewarded trials, compared to non-rewarded trials, with both high- and low-probability tones. Repeated measures ANOVA showed that striosomes had a higher selectivity for rewarded trials than did the matrix (ANOVA main effect p<0.001). The selectivity for reward was larger in low-probability tone trials than in high-probability tone trials for both compartments (ANOVA main effect p<0.001), but there was no interaction between cell-type and selectivity for reward (Figure 5B). Thus, both striosomes and matrix were more activated when the reward was less expected. We also tested how the beginning and end of licking were reflected by activity in the two compartments (Figure 5C). The striosomal activation was higher than matrix activation (ANOVA main effect p<0.001), and the activation was larger at the end of licking than at the beginning (Figure 5C) (±1 s around event, ANOVA main effect p<0.001). Thus, the overall fluorescence shows that both striosomes and matrix are active during the trials, but that striosomes are more strongly activated, particularly during the cue period, and also at the end of the post-reward licking period. However, we note that summing activity over populations of neurons makes it impossible to dissociate the reward-related activation from the carry-over effects of the tone-related activation.
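For readers unfamiliar with the AUROC measure, the following minimal Python sketch computes it directly from its probabilistic definition: the probability that a response drawn from one condition exceeds a response drawn from the other, with ties counted as one half. A value of 0.5 indicates no selectivity and values toward 1 indicate larger responses in the first condition. The response values are hypothetical.

```python
def auroc(responses_a, responses_b):
    """Area under the ROC curve for discriminating condition A from condition B.

    Equals the probability that a randomly chosen response from A is larger
    than a randomly chosen response from B (ties count 0.5).
    """
    wins = 0.0
    for a in responses_a:
        for b in responses_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(responses_a) * len(responses_b))


# Hypothetical per-trial responses to the high- and low-probability tones.
high_tone = [3, 4, 5]
low_tone = [1, 2, 3]
print(auroc(high_tone, low_tone))  # 8.5 of 9 comparisons favor the high tone
```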

Next, we analyzed single-cell calcium responses of striosomal and matrix neurons during the task. Individual neurons in both compartments were active during the task (Figure 5D). We found a higher proportion of striosomal neurons (42.9%; 312 out of 712 neurons) than matrix neurons (36.5%; 721 out of 1977 neurons) that were task-modulated (p<0.005, Fisher’s exact test; Figure 5E). Among the task-modulated neurons, a higher percentage of striosomal neurons was active during the cue epoch of the task (23.7% of striosomal, 13.7% of matrix, p<0.001, Fisher’s exact test). By contrast, we found no differences in the percentages of striosomal and matrix neurons that were active during the post-reward licking or post-licking period (p>0.05). In plots of session-averaged activity sorted by the timing of peak responses, we observed that striosomal and matrix neuron activities similarly spanned each of the three task epochs, as though they tiled the temporal space of the task (Figure 5F). We found no differences in the trial-to-trial reliability of striosomal and matrix responses during the task epochs (Figure 5—figure supplement 1). To compare responses of all task-modulated striosomal and matrix neurons during these epochs, we analyzed population-averaged activity aligned to different task events (Figure 5G–J). As in our neuropil analyses, we found that individual striosomal neurons were more robustly active than individual matrix neurons during the cue epoch of the task (Figure 5G, ANOVA main effect p<0.001). Moreover, the high-probability tone elicited a higher response than the low-probability tone (p<0.001).

We used an AUROC analysis to compare activity in trials that were rewarded (responses aligned to first lick after reward delivery) or unrewarded (responses aligned to 2 s after cue onset, a time period matching that for the rewarded trial analysis). We found that striosomal neurons were more selective for rewarded trials (Figure 5H, p<0.001). The selectivity for reward was greater for low-probability than for high-probability tone trials (p<0.01). Although neurons in the two compartments responded similarly during post-reward licking (Figure 5I, p>0.05), striosomes had a higher response during the post-licking period (Figure 5J, p<0.01). Together, these findings demonstrate that neurons in both the striosome and matrix compartments are task-modulated in relatively similar patterns in this appetitive classical conditioning task, that neurons in striosomes are more strongly task-modulated than neurons in the nearby matrix, and that they are particularly more active during reward-predicting cues.

Striosomal tone-evoked responses are acquired during learning

To determine how these responses were shaped by training, we analyzed striatal activity during the acquisition period of the task. To quantify levels of learning, we tested for significance in the difference between anticipatory licking for the high- and low-probability cues during the tone presentation and the reward delay. If mice exhibited a significant difference on 2 out of 3 consecutive days, we considered them as being trained. Sessions performed before this criterion was met were categorized as acquisition sessions. This categorization allowed us to ask whether the strong striosomal cue-related response was a sensory feature, or whether it was an acquired response related to the meaning of the stimulus. Of the five mice studied, two were initially trained on a three-tone version of this task and were therefore excluded from the analysis of the initial training period (Table 4). The three mice included and the two mice excluded from the training data set had similar baseline ΔF/F values and percentages of task-modulated and tone-modulated neurons. Activity measures for the neuropil signals during training for all sessions before the mice reached the learning criterion (n = 33) were compared with the signals in the sessions after criterial performance had been met (n = 20). The striosomal responses to the tones were much stronger after animals learned the task (Figure 6A). The neuropil signal in striosomes was significantly higher after the task performance reached the training criterion than before this point (p<0.05). Such a difference was not observed for the matrix (p>0.05; ANOVA interaction p<0.001). In order to perform a similar analysis for single cells, we grouped responses in consecutive three-day bins. 
The results indicated that during training, the percentage of task-modulated neurons increased steadily (Figure 6B), but that when mice reached the learning criterion (sessions 11 and 12 for the mice shown in Figure 6B,C), there was a rapid increase in the proportion of cue-modulated neurons (Figure 6C), most notably among striosomal neurons.
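The learning criterion described above (a significant licking difference on 2 out of 3 consecutive days) can be expressed as a short procedure. The sketch below is illustrative only. It assumes one session per day, takes the per-session significance flags as given, and assumes that the criterion is met at the session that completes the first qualifying 3-day window; the function name and the flags are hypothetical.

```python
def criterion_session(significant, needed=2, window=3):
    """Return the 0-based index of the first session at which at least
    `needed` of the most recent `window` sessions were significant,
    or None if the criterion is never met."""
    for i in range(len(significant)):
        recent = significant[max(0, i - window + 1): i + 1]
        if sum(recent) >= needed:
            return i
    return None


# Hypothetical flags: True if anticipatory licking differed significantly
# between the high- and low-probability cues in that session.
flags = [False, False, True, False, False, True, False, True, True]
idx = criterion_session(flags)      # criterion met at session index 7
acquisition = list(range(idx))      # sessions before the criterion was met
```

Under these assumptions, sessions 0 to 6 would be categorized as acquisition sessions and session 7 onward as criterion sessions.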

Table 4. Data details for individual mice.

Measure | Mouse 1 | Mouse 2 | Mouse 3 | Mouse 4 | Mouse 5
--- | --- | --- | --- | --- | ---
Number of acquisition sessions | 12 | 11 | 10 | 19 + 2 * | 12 + 4 *
Number of criterion sessions | 7 | 9 | 4 | 5 | 8
Number of overtraining sessions | 5 | 5 | 0 | 0 | 0
Mean baseline ΔF/F | 11.2 (0.7) | 10.8 (0.7) | 12.4 (1.4) | 12.1 (1.0) | 12.7 (0.6)
Standard deviation baseline ΔF/F | 35.5 (1.3) | 41.3 (1.3) | 30.6 (1.6) | 37.9 (1.7) | 39.8 (1.2)
Maximum baseline ΔF/F | 237.2 (9.6) | 296.3 (11.2) | 148.8 (9.3) | 236.1 (13.2) | 270.3 (9.7)
Task-modulated neurons in striosomes | 54.1% | 22.4% | 31.7% | 66.2% | 46.3%
Task-modulated neurons in matrix | 46.9% | 20.8% | 19.4% | 56.4% | 40.6%
Tone-modulated neurons in striosomes | 21.6% | 14.0% | 14.6% | 16.9% | 7.3%
Tone-modulated neurons in matrix | 13.0% | 9.2% | 3.3% | 9.5% | 4.4%

*Two mice were initially trained on a more complex version of the task with three tones instead of two (numbers of sessions trained on the two versions indicated by the first and second number, respectively). Data of these mice were excluded from acquisition analyses.

Figure 6. Cue-related signals in striosomes develop during training.

(A) Average total striosomal (red) and matrix (black) neuropil signal during the 1.5 s tone period in all sessions before and after reaching learning criterion. *p<0.05, ***p<0.001 (ANOVA and post hoc t-test). Error bars represent SEM. (B, C) Percentage of task-modulated (B) and cue-modulated (C) neurons in striosomes and matrix during the course of training. Shading represents SEM. (D) Mean normalized (z-score) licking (left) and ΔF/F activity in task-modulated striosomal and matrix neurons (right) during ±5 sessions around the session in which the learning criterion was reached (session 0). (E) Activity of task-modulated striosomal and matrix neurons averaged and normalized for blocks of 5 sessions before (dotted lines) and after (solid lines) the learning criterion was reached. (F) Quantification of the mean response of all task-modulated neurons during the period from tone onset to reward onset. **p<0.01 (ANOVA and post hoc t-test). (G) Percentage of neurons modulated in the licking period during training. (H) Mean normalized (z-score) licking (left) and activity of task-modulated neurons (right) during training around the time of reward delivery (R).

We further tested whether there was a sudden step-like increase in striosomal tone signaling during training. We averaged the z-scores of the activity of all task-modulated neurons for each of the last five sessions before criterial performance, for the session in which the learning criterion was met, and for each of the first five sessions after the criterial session (Figure 6D). Comparing these values indicated a clear increase in striosomal signaling during the tone (Figure 6E) when the mice began to exhibit differential licking responses to the two tones. This increase in striosomal activity was significant (Figure 6F, ANOVA, training main effect p<0.001; interaction p<0.05). In addition to this development of tone responsiveness, there was a tone-related activation in striosomes in the sessions in which the animals were first exposed to the task (Figure 6C), perhaps reflecting a surprise or novelty signal effect. This tone-related activation disappeared after 1–3 sessions and then reemerged later as mice learned the task. There was also an increase in the percentage of neurons that responded during the post-reward licking period (Figure 6G), and the average activity of all task-modulated neurons during training increased in the period after reward delivery (Figure 6H). In contrast to the increases in tone response, this reward-period increase occurred several sessions before mice learned the task.

During overtraining, tone-related responses of striosomal neurons intensify and become increasingly selective for high-probability tones

To investigate further the relationship between neuronal responses and learning, two mice were trained for an additional five sessions. In these overtraining sessions, we imaged again the same fields of view from which we had collected movies during the criterion phase (Figure 7). The tone-evoked aggregate response became notably higher and sharper during this phase (Figure 7A). The increase in responses related to the tone during overtraining was particularly strong in striosomes. By contrast, the reward period activation immediately following the peak of the cue-evoked response was reduced. The signal initially dropped compared to the earlier sessions but subsequently reached the same magnitude, thus resembling previously reported task-bracketing patterns (Barnes et al., 2005; Jin and Costa, 2010; Jog et al., 1999; Smith and Graybiel, 2013; Thorn et al., 2010). This pattern contrasted with the cue-evoked licking response (Figure 7B), which remained high after the high-probability cue, when the bracketing-like effect in the ΔF/F signal was greatest. We compared the peak responses of the striosomal and matrix samples during the tone presentation period for the acquisition, post-criterion and overtraining sessions (Figure 7C), and found a highly significant interaction (ANOVA interaction p<0.005). In the trained and overtrained mice, striosomes had significantly higher tone-evoked responses than did the matrix (paired t-test, trained mice p<0.01 and overtrained mice p<0.05). The striosomal neuropil responses also became more selective for the high-probability cue during overtraining (Figure 7D), so that during overtraining the striosomal selectivity was significantly larger than the matrix response (paired t-test p<0.05).

Figure 7. Striosomal cue-related responses strengthen during overtraining and become more selective.

(A) Mean neuropil signals during acquisition (light blue), after learning criterion (medium blue) and during overtraining (dark blue) in striosomes (top) and matrix (bottom). Shading represents SEM. (B) Average licking (top) and neuronal activity in striosomes (middle) and matrix (bottom) in rewarded trials with high- (blue) and low- (green) probability cues. (C) Mean neuropil responses in striosomes (red) and matrix (black) during acquisition, after criterial performance and during overtraining, and the mean of the response sizes (right). *p<0.05, **p<0.01 (ANOVA). Shading and error bars represent SEM. (D) Selectivity for the high-probability cue, shown as in B. *p<0.05, **p<0.01 (ANOVA). (E) Percentages, shown in panels from left to right, of task-modulated neurons (left), tone-modulated neurons (second), neurons modulated in post-reward licking period (third) and neurons modulated in post-licking period (right) during acquisition (ACQ), after criterion (CR) and during overtraining (OT). *p<0.05, **p<0.01, ***p<0.001 (Fisher’s exact test).

Figure 7—figure supplement 1. The size of tone-evoked ΔF/F activation increases as behavioral performance improves, particularly as seen in the calcium activity in striosomes.

High-probability cues induced increases in licking and ΔF/F signal for striosomes (red circles) and matrix (black circles) for every session. The thin red and black lines indicate the outcome of the linear regression analyses, respectively, for striosomes and matrix. The bold blue line indicates the combined model.

The percentage of task-modulated neurons grew with training, then slightly dropped during overtraining (Figure 7E, left; striosomes: 12.2% during acquisition, 42.9% after criterion and 37.7% during overtraining; matrix: 8.6% during acquisition, 36.6% after criterion and 27.5% during overtraining). At all stages, a higher percentage of striosomal neurons than matrix neurons was task-modulated (Fisher’s exact test, p<0.01). By contrast, the percentage of cue-modulated neurons (Figure 7E, second panel) grew further during overtraining (striosomes: 4.1% during acquisition, 15.0% after criterion and 21.1% during overtraining; matrix: 2.3% during acquisition, 8.1% after criterion and 13.9% during overtraining). There were more tone-modulated neurons in striosomes than in matrix during acquisition, after criterion and during overtraining (Fisher’s exact test, p<0.05). The percentage of cells that were active during the post-reward licking period (Figure 7E, third panel) increased during training and then declined during overtraining (striosomes: 7.0% during acquisition, 28.9% after criterion and 14.9% during overtraining; matrix: 5.6% during acquisition, 27.0% after criterion and 13.3% during overtraining), with no differences between striosomes and matrix (Fisher’s exact test, p>0.05). The proportion of neurons activated after the end of licking remained stable for both striosomal and matrix neurons during overtraining (Figure 7E, fourth panel; striosomes: 1.9% during acquisition, 5.6% after criterion and 6.6% during overtraining; matrix: 1.1% during acquisition, 7.3% after criterion and 6.8% during overtraining). We found no differences between striosomes and matrix at any training stage (Fisher’s exact test, p>0.05).

The limited number of significantly modulated neurons in these two mice was too small to make further statistical comparisons between the neuronal responses. Nevertheless, the findings for the entire performance period of the mice collectively demonstrate that the activity patterns observed after training were largely acquired during training, that the strengthening of the tone response was greater for striosomes than for matrix, that this response emerged at the time the animals began differentially responding to the tones, and that this response developed further during overtraining, becoming larger and more selective for the high-probability cue. Because the definition of the training phases was by necessity somewhat arbitrary, and the behavioral performance of the mice could fluctuate across days, we used linear regression to test how well the behavioral performance could predict the ΔF/F activation in striosomes and matrix. For every session, we calculated the mean and standard deviation of the baseline period (1 s preceding the cue onset) and then calculated tone-evoked licking and ΔF/F responses of the neuropil signal in z-scores. We found that in sessions in which the tone-evoked licking was greater, the neuropil response was also greater (Figure 7—figure supplement 1). To test this relationship, we first made two separate models for striosomes and matrix. The regression coefficients for licking were significant for both compartments for the high-probability cue responses but not for the low-probability cue responses (Table 5). When we tested how well the difference in licking during both cue periods could predict the difference in ΔF/F activation during the two cue types, we found a significant regression coefficient for striosomes, but only a trend for the matrix. Next, we made a combined model accounting for ΔF/F activation as a function of licking and quantified the residuals for striosomes and matrix. 
For both cues and for the difference between them, we found that the striosomal residuals were significantly bigger than those for the matrix. Together, these linear regression analyses demonstrate that in sessions in which the behavioral performance was better, the neuronal response was larger, especially in striosomes. Thus, the behavior was predictive of the neural response, particularly for striosomes.
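The regression approach described above can be sketched in a few lines: responses are z-scored against the pre-cue baseline, a single line is fit to the pooled striosomal and matrix data, and the residuals are then averaged separately for each compartment. This is a minimal illustration under stated assumptions and not the authors' code: ordinary least squares with an intercept is assumed, and all function names and numerical values are hypothetical.

```python
from statistics import mean, stdev


def baseline_zscore(response, baseline):
    """Express a response in units of the baseline standard deviation."""
    return (response - mean(baseline)) / stdev(baseline)


def ols_fit(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical per-session values (z-scores): tone-evoked licking and
# tone-evoked neuropil ΔF/F in striosomes and matrix.
lick_z = [0.0, 1.0, 2.0, 3.0]
strio_z = [0.2, 0.3, 0.4, 0.5]
matrix_z = [0.0, 0.1, 0.2, 0.3]

# Combined model: one regression line fit to both compartments pooled.
slope, intercept, r2 = ols_fit(lick_z + lick_z, strio_z + matrix_z)

# Mean residual per compartment relative to the combined model.
strio_resid = mean(s - (intercept + slope * l) for l, s in zip(lick_z, strio_z))
matrix_resid = mean(m - (intercept + slope * l) for l, m in zip(lick_z, matrix_z))
```

Because the residuals of a least-squares fit with an intercept sum to zero, the mean residuals for two compartments with equal numbers of sessions are equal in magnitude and opposite in sign, which is the pattern seen in the residual rows of Table 5.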

Table 5. Outcome of the regression analyses.

Model | Measure | High-probability cue | Low-probability cue | Difference (high − low)
--- | --- | --- | --- | ---
Striosome model | Regression coefficient | 0.069 *** | 0.057 | 0.074 **
Striosome model | R-squared | 0.180 | 0.029 | 0.095
Matrix model | Regression coefficient | 0.042 * | 0.016 | 0.056
Matrix model | R-squared | 0.081 | 0.003 | 0.053
Combined model | Regression coefficient | 0.056 *** | 0.037 | 0.065 **
Combined model | R-squared | 0.118 | 0.014 | 0.072
Combined model | Residual for striosomes (mean ± SEM) | 0.053 ± 0.021 *** | 0.031 ± 0.021 *** | 0.022 ± 0.023 ***
Combined model | Residual for matrix (mean ± SEM) | −0.053 ± 0.021 *** | −0.031 ± 0.018 *** | −0.022 ± 0.024 ***

*p<0.05, **p<0.01, ***p<0.001.

Matrix responses are more sensitive to recent outcome history than are striosomal responses

In the classical conditioning task employed in this study, mice used the auditory tone presented during the cue epoch to guide their expectation for receiving a reward in the current trial. We examined their licking responses as a proxy for such expectation in order to ask whether, in addition to the information provided by the cue, the mice used the outcome of the previous trial to tailor their reward expectation in the current trial. In trials following rewarded trials, mice showed increased anticipatory licking during the cue and reward delay (Figure 8A, left and right; n = 33 sessions from five mice; p<0.001, Wilcoxon signed-rank test), but licking during the post-reward period was unaffected by outcome in the previous trial (Figure 8A, middle and right; p>0.05, Wilcoxon signed-rank test).

Figure 8. Reward history modulates anticipatory licking behavior and licking-period responses in striatal neurons.

(A) Session-averaged licking activity during anticipatory and post-reward periods for trials in which the previous trial was rewarded (black solid lines) or unrewarded (purple dotted lines). Bar plot (right) shows modulation of anticipatory or post-reward licking activity by reward history. ***p<0.001 (Wilcoxon signed-rank test). Shading and error bars represent SEM. (B) Single-trial (top two rows) and averaged (bottom row) post-reward licking responses of four sample neurons for previously rewarded (black solid) or unrewarded (purple dotted) trials. (C) Population-averaged post-reward licking activity of all task-modulated neurons for reward histories extending one or two trials back. ***p<0.001 (Wilcoxon signed-rank test).

To determine whether the task-related activity of the striatal neurons in our sample was also modulated by outcome history, we compared the activities in trials preceded by a rewarded or unrewarded trial, regardless of the cue type (high- or low-probability) presented in the current trial. We first analyzed the effect of reward history on the cue-period responses of single task-modulated neurons and found that activity was slightly greater when the previous trial was rewarded (mean z-scores: 0.21 ± 0.01 vs. 0.17 ± 0.01 for previously rewarded and unrewarded; p<0.01). However, when we analyzed the effect of outcome history on neural responses observed during post-reward licking in currently rewarded trials, we found that the activity of a subset of striatal neurons was highly sensitive to outcome in previous trials (Figure 8B). Activity during post-reward licking was enhanced when the previous trial was unrewarded, compared to when the previous trial was rewarded. Similarly, population-averaged responses of task-modulated neurons were significantly higher when the previous trial was unrewarded, as compared to when it was rewarded (p<0.001, Wilcoxon rank-sum test). Importantly, post-reward licking behavior was invariant to previous trial outcome (Figure 8A), making it unlikely that the observed changes in neural activity were related to changes in the motor output during reward consumption.

To determine how far back in time we could detect an outcome history effect, we computed a history modulation index (see Materials and methods) for currently rewarded trials with two types of reward history. In the first group, we separated rewarded trials based on whether the previous trial was rewarded or unrewarded (one trial back). For the second group, we disregarded the outcome status in the immediately preceding trial and separated trials depending on the outcome status of two trials in the past (two trials back). This analysis showed that recent reward history has a stronger influence on post-reward licking responses of task-modulated neurons than trials farther back in the past (Figure 8C, p<0.001, Wilcoxon signed-rank test).

We asked whether this history modulation effect was detectable for both striosomal and matrix neurons (Figure 9). Examination of both population-averaged responses (Figure 9A) and single-cell responses (Figure 9B) suggested that both striosomal and matrix neurons were modulated by previous reward history, but that matrix neurons were more sensitive to this modulation. Quantification of this comparison by calculating the history modulation indices for striosomal and matrix neurons confirmed that the matrix responses were more influenced by previous reward history than the responses of striosomal neurons (Figure 9C,D; p<0.01, Wilcoxon rank-sum test).

Figure 9. Reward-history modulation of striosomal and matrix neurons.

(A) Population-averaged responses of all task-modulated striosomal (red, left) and matrix (black, right) neurons during trials following previously rewarded (solid) or unrewarded (dotted) trials. Shading represents SEM. (B) Normalized lick-period responses (averaged over 1–3 s after first lick) of individual striosomal (left) and matrix (right) neurons. Responses with previously rewarded trials (x-axis) are plotted against responses from previously unrewarded trials (y-axis). The unity line is shown as a blue dotted line. (C) Histogram showing reward-history modulation index for all task-modulated striosomal and matrix neurons. (D) Mean reward-history modulation index for striosomal and matrix neurons. ***p<0.001 (Wilcoxon rank-sum test). Error bars represent SEM.

Discussion

Our findings demonstrate that 2-photon calcium imaging can be used to identify the activity patterns of subpopulations of neurons distinguished as being in either the striosome or matrix compartments of the striatum. Even with the use of a simple classical conditioning task involving cues predicting high or low probabilities of receiving reward, we could detect in all mice many task-related striatal neurons, altogether 38% of the 2704 neurons successfully imaged in the post-training phase. We found a remarkable parallel in many of the responses of neurons in striosomes and neurons in the matrix. Yet we also found clear differences in the responses of the striosomal and matrix neurons during cue presentation, found contrasts in the timing and selectivity of striosomal and matrix responses during learning and overtraining, and found that the responses of the two compartments were differentially affected by reward history. These findings, based on direct visual detection of striosomes by their birthdate-labeled neuropil and cell bodies, demonstrate that neurons of the two main compartments of the striatum, even though sharing many basic features of neuronal responses during reward-based conditioning, have distinguishable response properties that hint at distinct encoding functions of striosomal and matrix neurons related to reinforcement learning and performance. These findings suggest that in vivo imaging of striosomes and matrix could succeed in solving long-standing questions about the functions of these two major compartments of the striatum.

Tone-period activity

By the time the animals had reached the learning criterion, neurons in both compartments had developed task-related responses, and striosomes, examined both by averaged neuropil measures and by single-cell activities, were more responsive to the task than were neurons in the surrounding matrix. The differential activation of striosomes was particularly striking for responses to the reward-predictive cues. More neurons of the striosomal population were active in relation to the cues, and this effect grew stronger as animals acquired the task. The striosomal neurons also were more selective for the high-probability cue than were the matrix neurons. Greater responsivity to predictors of reward in the response profiles of striosomal cells is in line with striosomes acting as critic in an actor-critic architecture (Doya, 1999). However, our results are also in accord with other ideas based on limbic associations of the striosomes (Amemori et al., 2011). The enhanced striosomal responses to cues did not reflect an overall greater response of striosomes to all conditions; for example, their responses were less sensitive than those of the matrix neurons to immediate reward history. Thus, striosomal neurons stood out as more sensitive to the cues indicating reward.

Outcome period activity

Over the task-related population, the highest activity levels for many of the neurons as the learning criterion was reached occurred during the outcome period, whether the neurons were in striosomes or in the surrounding matrix. Thus, the compartments seemed equivalently engaged: the highest percentages of neurons of both types were active during this period. Neuronal activity built up and peaked at the end of the licking, leading to the obvious possibility that this activity was primarily related to licking itself. However, several factors pointed to this response as being different from a pure motor response related to the licking movements. Most strikingly, even among the neurons strongly active during the prolonged licking period, the majority rose to their peak activity at specific times within this period rather than during the entire licking period. These post-reward peak responses, collectively, appeared to cover the entire time after reward. A subgroup of these neurons even peaked in activity after the end of the last licks, resembling neuronal activity in electrophysiological recordings (Barnes et al., 2005; Jin and Costa, 2010; Jog et al., 1999; Smith and Graybiel, 2013; Thorn et al., 2010). Second, we found dissociations between licking behavior and neuronal responses. For instance, activations during licking were larger when the previous trial was unrewarded than when it was rewarded, whereas the licking behavior itself was not different. The anticipatory licking during the cue period and the neuronal responses during the cues also appeared dissociable during overtraining. During high-probability cues, when animals licked throughout the reward delay period, the neuronal signal decayed, whereas the opposite occurred during low-probability cues. 
Thus, although the signals observed during periods of licking were likely to be related to licking, their patterns of occurrence suggest an interesting multiplexing of information about licking, reward prediction, timing with respect to task events, and reward history. Finally, the differences in activity in striosomes and matrix that we observed cannot be accounted for by differences in licking behavior during the imaging of these compartments, because the effects were also visible when analyzing neuropil activity, in which case matched, simultaneously registered striosomal and matrix data points were acquired from every session during the same behavioral performance.

Sensitivity to reward history

In contrast to these accentuated responses of striosomes, striosomal neurons as a population were relatively less sensitive than those in the matrix to immediate reward history, although again, both populations were modulated in parallel so that the differences were quantitative, not qualitative. When the learning criterion had been reached, the neuronal responses for a given trial were elevated when the previous trial was not rewarded. By contrast, anticipatory licking was decreased in trials following unrewarded trials. These effects were significantly larger for the matrix. This reward history effect was much smaller for two-back reward history, suggesting that it reflected immediate reward history. Given the limits of our data set, we could not determine the mechanism underlying this difference in sensitivity to reward history.

Learning-related differences in the responsiveness of striosomal and matrix neurons

Our recordings during the course of training demonstrated that both the cue-related responses and the post-reward responses were built up in striosomes and nearby matrix regions during behavioral acquisition of the task, with tone-related responses abruptly appearing when the mice reached the learning criterion. These learning-related dynamics suggest that the observed tone responses do not simply reflect responses to auditory stimulus presentations.

During overtraining, the striosomal cue response strengthened: more striosomal neurons were significantly modulated by tone presentation, this striosomal response became stronger and more temporally precise, and it became more selective for the high-probability cue. By contrast, the activity in the period after reward delivery until the end of licking did not change notably and was even reduced slightly but non-significantly. Finally, the overall activity patterns in the neuropil began to resemble the classical task-bracketing pattern with peaks of activity at the beginning and the end of the trial (Barnes et al., 2005; Jin and Costa, 2010; Jog et al., 1999; Smith and Graybiel, 2013; Thorn et al., 2010).

In the matrix, the effects of overtraining were less pronounced. The responses to the tone and reward consumption remained similar, but, as in the striosomes, a pattern resembling task-bracketing formed in the matrix. All of these effects could be detected not only at the single-cell level but also by assessing total fluorescence in defined striosomes and regions of the nearby matrix with equivalent areas. These findings suggest that although both compartments have cue-related responses, in striosomes the responses to reward-predicting cues are accentuated relative to responses detected in the matrix and are particularly increased with extended training.

Reward signaling in the dorsal striatum

It has previously been found that a minority of dorsal striatal neurons encode reward prediction errors (Oyama et al., 2010, 2015; Stalnaker et al., 2012). Two of the major targets of striosomes, the dopamine-containing substantia nigra pars compacta and, via the pallidum, the lateral nucleus of the habenula, are well known to signal reward prediction errors (Bayer and Glimcher, 2005; Bromberg-Martin and Hikosaka, 2011; Keiflin and Janak, 2015; Matsumoto and Hikosaka, 2007; Schultz, 2016; Schultz et al., 1997). Therefore, we asked whether striosomes and matrix differentially encode reward prediction error signals. One particular possibility is that striosomes through their GABAergic innervation of dopamine-containing neurons could transmit a negative reward prediction signal. We found that striosomes preferentially encoded reward-predictive cues. We did not find significant differences between striosomes and matrix in outcome-related activity. We also did not find prominent signals related to reward omissions in either striosomes or matrix. Some models of striosome function posit that striosomes would have such signals. Our task, however, was a simple one and likely did not draw out such activity, and we did not have a full data set for the overtraining period, when such responses might be predicted to become apparent. We also note that we were unable to test hypotheses suggesting that tasks with multiple contexts and decision-making modes could be important for striosomal activation. Finally, we did not address motivational conflict, stress or anxiety states as potentially being critical to striosomal activation (Amemori and Graybiel, 2012; Friedman et al., 2017, 2015).

We are also aware that the dorsal striatum is heavily implicated in motor behavior, through learning, action selection and perhaps the invigoration of action (Amemori et al., 2011; Apicella et al., 1992; Balleine et al., 2007; Cui et al., 2013; Hikosaka et al., 2014; Howe et al., 2013; Klaus et al., 2017; Kreitzer and Malenka, 2008; Mink, 1996; Nelson and Kreitzer, 2014; Niv et al., 2007; Packard and Knowlton, 2002; Redgrave et al., 1999; Salamone and Correa, 2012; Samejima et al., 2005; Yin and Knowlton, 2006). Nevertheless, we chose to start in these experiments by determining how fundamental features of the striatum, signaling of outcome and prediction of outcome, are represented in the responses of neurons in the striosome and matrix compartments. Future work will address the involvement of striosomes and matrix in action and decision-making among alternative options.

Striosome labeling

Visual identification of striosomes by their dense neuropil labeling was achieved by pulse-labeling of striosomal neurons and their processes at the mid-point of striosome neurogenesis. Even though minorities of the striosomal neurons were pulse-labeled by the single tamoxifen injections, and despite the fact that there were scattered birthdate-labeled neurons in the extra-striosomal matrix at the striatal levels examined (ca. 15% of tdTomato-positive neurons), we could readily identify striosomes visually in vivo using 2-photon microscopy and could confirm this identification in post-mortem MOR1-counterstained sections prepared to assess the selectivity of labeling. We are aware that, with the use of pulse-labeling at neurogenic time points, we have incomplete labeling of compartments in any one animal, but the time of induction that we used was at the middle of the striosomal neurogenic window and was before the onset of major levels of matrix neuron neurogenesis in the striatal regions imaged (Fishell and van der Kooy, 1987; Graybiel, 1984; Graybiel and Hickey, 1982; Hagimoto et al., 2017; Kelly et al., 2017; Newman et al., 2015). We are also aware that the matrix compartment itself is heterogeneous, as it is composed of many input-output matrisome modules (e.g., Eblen and Graybiel, 1995; Flaherty and Graybiel, 1994), but such heterogeneity could not be taken into account in our experiments. We did choose for analysis zones in the matrix that were close to the striosomes studied. Our method did not rely on a single molecular or genetic marker to distinguish compartmental identity, but this feature also had a possible advantage in that it avoided potential unidentified biases that could arise from molecular-identity labeling.

It is currently unknown to what extent there are different subtypes of striosomal neurons and what the exact neuronal subtype composition of striosomes is. Kelly et al. (2017) have found that at E11.5, the time chosen for our tamoxifen induction, neurons expressing D1 dopamine receptors (D1Rs) and those expressing D2 dopamine receptors (D2Rs) are both being born, with a bias toward D1 neurons. Other evidence suggests a predominance of D1R-containing neurons in striosomal mouse models (Banghart et al., 2015; Cui et al., 2014; Smith et al., 2016) or, contrarily, a larger number of D2R-containing neurons (Salinas et al., 2016). It is likely that differential labeling of subtypes of striosomal and matrix neurons occurs in different mouse lines, as has been seen by ourselves (Crittenden and Graybiel, in prep.), and in different regions of the striatum. It is clearly of great interest to determine the neuronal response properties of specific subgroups of striosomal neurons as defined by genetic markers, but we here have chosen to have secure visual identification of striosomal and matrix populations based on the identification of restricted neuropil labeling of striosomes achieved by their birth-dating and confirmed by their correspondence to the classic identification of striosomes in rodents as MOR1-dense zones (Tajima and Fukuda, 2013).

Prospects for future work

Our findings are confined to the analysis of a very simple task, and they clearly are unlikely to have uncovered the range of functions of the striosome and matrix compartments. Yet the experiments do demonstrate the feasibility of definitively identifying striosomes by 2-photon imaging as mice perform tasks, and of examining the activity of striosomal neurons relative to the activity of simultaneously imaged neurons in the nearby matrix. Our findings demonstrate commonality of striosomal and matrix activities during performance of a cued classical conditioning task. The different emphases on reward prediction and reward history that we detected, however, already suggest that striosomal neurons could be more responsive to the immediate contingencies of events than nearby matrix neurons, that they could gain this enhanced sensitivity by virtue of learning-related plasticity, but that they could be less sensitive to immediately prior reward history. These attributes of the striosomes could be related to real-time direction of action plans based on real-time estimates of value. To the best of our knowledge, this is the first report of simultaneous recording of visually identified striosome and matrix compartments in the striatum, here made possible by the neuropil labeling in pulse-labeled Mash1-CreER mice. Future refinements of such imaging should help to define the functional correlates of the striosome-matrix organization of the striatum.

Materials and methods

Key resources table.

| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| strain, strain background (mouse, both sexes) | Mash1(Ascl1)-CreER | Jackson Laboratory | Ascl1tm1.1(Cre/ERT2)Jejo/J; Stock no: 12882 | |
| strain, strain background (mouse, both sexes) | Ai14 | Jackson Laboratory | B6.Cg-Gt(ROSA)26Sortm14(CAG-tdTomato)Hze/J; Stock no: 007914 | |
| strain, strain background (mouse, both sexes) | C57Bl6/J | Jackson Laboratory | C57BL/6J; Stock no: 000664 | |
| genetic reagent | AAV5-hSyn-GCaMP6s-wpre-sv40 | University of Pennsylvania Vector Core | | |
| antibody | anti-MOR1 | Santa-Cruz | sc-7488 | Polyclonal goat (1:500) |
| antibody | anti-GFP | Abcam | ab13970 | Polyclonal chicken (1:2000) |
| software, algorithm | Matlab | Mathworks | | |
| software, algorithm | ImageJ | National Institutes of Health | | |

All experiments were conducted in accordance with the National Institutes of Health guidelines and with the approval of the Committee on Animal Care at the Massachusetts Institute of Technology (MIT).

Mice

Mash1(Ascl1)-CreER mice (Kim et al., 2011) (Ascl1tm1.1(Cre/ERT2)Jejo/J, Jackson Laboratory) were crossed with Ai14-tdTomato Cre-dependent mice (Madisen et al., 2010) (B6;129S6-Gt(ROSA)26Sor, Jackson Laboratory) to achieve tdTomato labeling driven by Mash1, and were crossed with FVB mice in the MIT colony to improve breeding results. Female Mash1-CreER;Ai14 mice were then crossed with C57BL/6J males to breed the mice that we used for the experiments. Tamoxifen was administered to pregnant dams by oral gavage (100 mg/kg, dissolved in corn oil) to induce Mash1-CreER at embryonic day (E) 11.5, a time point at which predominantly striosomal but almost no matrix neurons are born, in order to label predominantly striosomal neurons in anterior to mid-anteroposterior levels of the caudoputamen. Five mice (four male and one female) were used for the imaging experiments.

Surgery

Virus injections

Adult Mash1(Ascl1)-CreER;Ai14 mice received virus injections during aseptic stereotaxic surgery at 7–10 weeks of age. They were deeply anesthetized with 3% isoflurane, were then head-fixed in a stereotaxic frame, and were maintained on anesthesia with 1–2% isoflurane. Meloxicam (1 mg/kg) was subcutaneously administered, the surgical field was prepared and cleaned with betadine and 70% ethanol, and based on pre-determined coordinates, the skin was incised, the head was leveled to align bregma and lambda, and two holes (ca. 0.5 mm diameter) were drilled in the skull. Two injections of AAV5-hSyn-GCaMP6s-wpre-sv40 (0.5 µl each, University of Pennsylvania Vector Core) were made, one per skull opening, to favor widespread transfections of striatal neurons at the following coordinates relative to bregma: 1) 0.1 mm anterior, 1.9 mm lateral, 2.7 mm ventral and 2) 0.9 mm anterior, 1.7 mm lateral and 2.5 mm ventral. Injections were made over 10 min, and after a ~10 min delay, the injection needles were slowly retracted. The incision was sutured shut, the mice were kept warm during post-surgical recovery, and they were given wet food and meloxicam (1 mg/kg, subcutaneous) for 3 days to provide analgesia.

Cannula implantation

We assembled chronic cannula windows by adhering a 2.7 mm glass coverslip to the end of a stainless steel metal tubing (1.6–1.8 mm long, 2.7 mm diameter; Small Parts) using UV curable glue (Norland). Cannula windows were kept in 70% ethanol until used for surgery. At 20–40 days after virus injection, mice were water restricted, and a second surgery was performed under deep isoflurane anesthesia as before to allow insertion of a cannula for imaging (Dombeck et al., 2010; Howe and Dombeck, 2016; Lovett-Barron et al., 2014) and mounting of a headplate to the skull for later head fixation. Bregma and lambda were aligned in the horizontal plane, and the anterior and lateral coordinates for the craniotomy were marked (0.6 mm anterior and 2.1 mm lateral to bregma). The skull was then tilted and rolled by 5° to make the skull surface horizontal at the location of cannula implantation. A 2.7 mm diameter craniotomy was made with a trephine dental drill. The exposed cortical tissue overlying the striatum was aspirated using gentle suction and constant perfusion with cooled, autoclaved 0.01 M phosphate buffered saline (PBS), and part of the underlying white matter was removed. A thin layer of Kwiksil (WPI) was applied, and the chronic cannula was inserted into the cavity. Finally, metabond (Parkell) was used to secure the implant in place and to attach a headplate to the skull. The mice received the same post-surgical care as described above.

Behavioral training

When mice had recovered from surgery and the optical window had cleared, they were put under water restriction (1–1.5 ml per day) and were habituated to head-fixation for on average 5 days. During head fixation, the mice were held in a polyethylene tube that was suspended by springs. When they showed no clear signs of stress and readily drank water while being head-fixed, behavioral training was begun. Training and imaging were performed 5 days a week. Water was delivered through a tube controlled by a solenoid valve located outside of the imaging setup, and licking at the spout was detected by a conductance-based method (Slotnick, 2009). In the behavioral training protocol, two tones (4 or 11 kHz, 1.5 s duration) were played in a random order. The tones predicted reward delivery (5 µl) with, respectively, an 80% or 20% probability. In each trial, there was a 500 ms delay after tone offset before reward delivery. Inter-trial intervals were randomly drawn from a flat distribution between 5.25 and 8.75 s. Training (acquisition phase) was considered to be complete when there was a significant difference in anticipatory licking during the cue period between the two cues (two-sided t-test, α = 0.05). Two of the five mice were initially trained on a three-tone version of the task. The training data of these mice have therefore not been included in our analysis. After reaching the acquisition criterion, mice were tested during 4–9 daily sessions (criterion phase). After completing the criterion phase of the experiment, two mice were given five overtraining sessions (overtraining phase).
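The trial structure described above can be sketched in a few lines of Python. This is only an illustration of the stated parameters (tone duration, delay, reward probabilities and inter-trial interval range), not the software used to run the experiments. The function and field names are hypothetical, and the sketch assumes that the two cues were equally likely on every trial, which the text does not state.

```python
import random

# Parameters taken from the task description; all names are illustrative.
TONE_DURATION_S = 1.5
REWARD_DELAY_S = 0.5                       # delay between tone offset and reward
REWARD_PROB = {"4kHz": 0.8, "11kHz": 0.2}  # high- and low-probability cues
ITI_RANGE_S = (5.25, 8.75)                 # flat (uniform) distribution


def make_trial(rng):
    """Draw one trial. Equal cue probability is an assumption of this sketch."""
    cue = rng.choice(["4kHz", "11kHz"])
    return {
        "cue": cue,
        "rewarded": rng.random() < REWARD_PROB[cue],
        # Time of (possible) reward delivery relative to tone onset.
        "reward_time_s": TONE_DURATION_S + REWARD_DELAY_S,
        "iti_s": rng.uniform(*ITI_RANGE_S),
    }


def make_session(n_trials, seed=0):
    """Generate a reproducible list of trials."""
    rng = random.Random(seed)
    return [make_trial(rng) for _ in range(n_trials)]
```

With these parameters, reward delivery falls 2 s after tone onset on every rewarded trial.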

Imaging

Imaging of GCaMP6s and tdTomato fluorescence was performed with a commercial Prairie Ultima IV 2-photon microscopy system equipped with a resonant galvo scanning module and a LUMPlanFL, 40x, 0.8 NA immersion objective lens (Olympus). For fluorescent excitation, we used a titanium-sapphire laser (Mai-Tai eHP, Newport) with dispersion compensation (Deep See, Newport). Emitted green and red fluorescence was split using a dichroic mirror (Semrock) and directed to GaAsP photomultiplier tubes (Hamamatsu). Individual fields of view were imaged using either galvo-resonant or galvo-galvo scanning, with acquisition framerates between 5 and 20 Hz. Laser power at the sample ranged from 11 to 42 mW, depending on GCaMP6s expression levels. For final analysis of the data set, all imaging sessions were resampled at a framerate of 5 Hz.

Fields of view were chosen on the basis of clear labeling of putative striosomes defined by dense tdTomato signal in the neuropil. Within these zones, both tdTomato-positive as well as unlabeled cells were present and were defined as putative striosomal neurons. Because of the 2.4 mm inner diameter of the cannula, we could typically find several striosomes that we could image at different depths. Our sampling strategy was to image as many different neurons as possible. During training, we rotated through the fields of view, but after training and during overtraining, we imaged unique, non-overlapping fields of view.

Image processing and cell-type identification

Calcium imaging data were acquired using PrairieView acquisition software and were saved into multipage TIF files. Data were analyzed by using custom scripts written in ImageJ (National Institutes of Health) or Matlab (Mathworks). Analysis scripts are available at Github (https://github.com/bloemb/eLife_2017_scripts) (Bloem, 2017). Images were first corrected for motion in the X-Y axis by registering all images to a reference frame. We used the pixel-wise mean of all frames in the red channel containing the structural tdTomato signal to make a reference image. All red channel frames were re-aligned to the reference image by the use of 2-dimensional normalized cross-correlation (template matching and slice alignment plugin) (Tseng et al., 2011). The green channel frames containing the GCaMP6s signal were then realigned using the same translation coordinates with the ‘Translate’ function in ImageJ. To verify that calculating translation coordinates on the basis of the tdTomato signal did not provide better registration for striosomal than for matrix neurons, we compared the results obtained by this method with those obtained using a registration method that only uses the GCaMP6s signal. We found that, for both striosomes and matrix, the results for these registration methods were highly correlated (mean correlation coefficient: 0.9971 for striosomes and 0.9978 for matrix). After realignment, ROIs were manually drawn over neuronal cell bodies using standard deviation and mean projections of the movies. With custom Matlab scripts, we drew rings around the cell body ROIs (excluding other ROIs) to estimate the contribution of the background neuropil signal to the observed cellular signal. Fluorescence signal for each neuron was computed by taking the pixel-wise mean of the somatic ROIs and subtracting 0.7x the fluorescence of the surrounding neuropil, as previously described (Chen et al., 2013). 
After this step, the baseline fluorescence for each neuron (F0) was calculated using K-means (KS)-density clustering to find the mode of the fluorescence distribution. The ratio between the change in fluorescence and the baseline was calculated as ΔF/F = (Ft – F0) / F0. For population analysis of single cell data, we calculated z-scores of the neuronal responses using the mean and the standard deviation of the 1 s baseline period preceding the tone onset.
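The signal-extraction steps in this section (neuropil subtraction with a 0.7 weight, baseline estimation, ΔF/F, and z-scoring against a pre-tone baseline) can be illustrated with the following minimal Python sketch. It is not the analysis code from the repository cited above. In particular, `histogram_mode` is a crude, hypothetical stand-in for the density-based estimate of the mode of the fluorescence distribution, and all function names are assumptions made for illustration.

```python
from statistics import mean, stdev

NEUROPIL_WEIGHT = 0.7  # weight stated in the text


def neuropil_corrected(soma, neuropil, weight=NEUROPIL_WEIGHT):
    """Subtract the weighted surrounding-neuropil signal from the somatic signal."""
    return [s - weight * n for s, n in zip(soma, neuropil)]


def histogram_mode(values, n_bins=50):
    """Approximate the mode of a distribution as the center of the fullest bin.

    Assumption: a simple histogram replaces the density estimate used
    in the original analysis.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return lo + (counts.index(max(counts)) + 0.5) * width


def delta_f_over_f(trace, f0):
    """Compute dF/F = (Ft - F0) / F0 for every time point."""
    return [(f - f0) / f0 for f in trace]


def zscore(trace, baseline):
    """Normalize a trace by the mean and standard deviation of a baseline window."""
    mu, sd = mean(baseline), stdev(baseline)
    return [(x - mu) / sd for x in trace]
```

At the 5 Hz resampled frame rate, the 1 s baseline passed to `zscore` corresponds to the five frames preceding tone onset.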

Individual neurons were identified as striosomal if their cell bodies lay in a region that was densely labeled by tdTomato, or if the cells themselves were tdTomato-positive. Hence, the small minority of tdTomato-positive neurons that appeared in the matrix (Kelly et al., 2017) was included in the striosomal population. Altogether 6320 neurons were recorded (2871 during acquisition, 2704 after criterion, and 745 during overtraining). Of these, 1867 were considered striosomal (912 during training, 727 after criterion, and 228 during overtraining). Of these, 294 were labeled with tdTomato, 1828 were located in densely tdTomato-labeled striosomes, and 255 met both criteria. There were 39 tdTomato-labeled cells that were not located in a zone of dense tdTomato neuropil labeling. We excluded these neurons in the multiple analyses reported, but their exclusion never resulted in a different outcome in our analyses.
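The reported counts are mutually consistent, as a short inclusion-exclusion check shows. The snippet below only re-derives totals from the numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text.
recorded_by_phase = {"acquisition": 2871, "criterion": 2704, "overtraining": 745}
striosomal_by_phase = {"acquisition": 912, "criterion": 727, "overtraining": 228}
n_tdtomato_positive = 294      # cells labeled with tdTomato
n_in_dense_neuropil = 1828     # cells inside densely labeled striosomes
n_both_criteria = 255          # cells meeting both criteria

# Union of the two criteria (inclusion-exclusion).
n_striosomal = n_tdtomato_positive + n_in_dense_neuropil - n_both_criteria

# tdTomato-positive cells lying outside dense neuropil labeling.
n_tdtomato_outside = n_tdtomato_positive - n_both_criteria

n_recorded = sum(recorded_by_phase.values())
print(n_recorded, n_striosomal, n_tdtomato_outside)
```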

Analysis of neuropil activity

To provide a first insight into striosomal and matrix signaling, we integrated the fluorescence signal from within an identified striosome and from a part of the matrix in the same field of view that had a similar size, background fluorescence and number of neurons. ΔF/F, calculated as ΔF/F = (Ft – F0) / F0, was normalized by calculating z-scores relative to the signal during the last 1 s of inter-trial intervals to correct for relative differences between sessions. To determine the selectivity of responses to different task events, the area under the Receiver Operating Characteristic curves (AUROC) was calculated. For cue selectivity, we calculated the AUROC by comparing the response during high- and low-probability cues. For the selectivity to rewarded trials, we calculated the AUROC by comparing separately rewarded and unrewarded trials for the two cues.
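For readers unfamiliar with the AUROC as a selectivity measure, the following sketch computes it directly from its probabilistic definition: the probability that a response drawn from one condition exceeds a response drawn from the other, with ties counted as one half. This is a generic illustration with a hypothetical function name, not the code used for the analyses reported here.

```python
def auroc(responses_a, responses_b):
    """Area under the ROC curve for discriminating condition A from condition B.

    Equivalent to the fraction of (a, b) pairs with a > b, counting ties as 0.5.
    A value of 0.5 indicates no selectivity; 1.0 or 0.0 indicates perfect
    selectivity for A or B, respectively.
    """
    wins = 0.0
    for a in responses_a:
        for b in responses_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(responses_a) * len(responses_b))
```

For cue selectivity, `responses_a` and `responses_b` would hold the per-trial responses to the high- and low-probability cues, respectively.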

Analysis of single-neuron activity

The conditioning task had three epochs — cue, post-reward licking, and post-licking. To identify task-modulated neurons active during these epochs, we aligned the data either to tone onset, to the first lick after reward delivery, or to the end of licking. We compared the fluorescence values over the following time windows to a 1 s baseline preceding each event. For the tone-aligned data, mean fluorescence was calculated over a 2 s time window after tone onset separately for trials with either the high- or low-probability cues. Neurons that were significantly active in either of the cue conditions were considered to be task-modulated. To find neurons modulated during the post-reward licking period, GCaMP6 fluorescence was averaged between the time when the animal first licked to receive the reward and the time that it stopped licking. We also used a 1 s time window after end of licking for identifying task-modulated neurons during this period. In some trials, animals did not stop licking until the start of the next trial. These trials were excluded from the analysis due to the difficulty in assigning licking end-time. For a neuron to be considered as task-modulated, we required that its activity exhibit a significant increase from baseline for any of the three alignments (two-sided Wilcoxon rank-sum test; α = 0.01, corrected for multiple comparisons). Neurons exclusively active during only one epoch of the task were considered to be selectively responsive during that period. Most neurons (>80%) were significantly active only during one of the epochs. To compare signals across neurons, we used z-score normalization of the ΔF/F signals with a 1 s period before the cue as a baseline. For analysis of the peak activity of task-modulated neurons, ΔF/F signals were normalized to the maximum of the session-averaged activity for any particular alignment in order to compare peak activity times during the time interval of interest. 
For determining the temporal specificity of responses during the post-reward licking period (rewarded trials with high-probability cue), we generated shuffled data for each neuron by substituting the response in a given trial with response in the same trial from a randomly selected task-modulated neuron recorded simultaneously. Only sessions in which at least ten task-modulated neurons were simultaneously recorded were included in this analysis. We computed a reliability index defined as the average response correlation of all pairwise combinations of trials (Rikhye and Sur, 2015). In addition, we quantified the standard deviation of peak response times across trials. For these measurements, we repeated the shuffle 20 times for each neuron and calculated the mean value of the outcome of the 20 shuffled analyses as the representative metric. Significance was then computed by comparing the observed and shuffled distribution of values using a Wilcoxon rank-sum test. We also computed a ridge-to-background ratio (Harvey et al., 2012), which quantifies the relative magnitude of response close to the peak time relative to all other time points during the post-reward period. The ridge was defined as the mean ΔF/F value (normalized to the max response) taken over five time points (i.e., 1 s due to the 5 Hz frame acquisition rate of our recordings) surrounding the peak time for each neuron’s session-averaged response, and the background value was the mean ΔF/F over all other time points.
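The two temporal-specificity measures described above can be sketched as follows. The reliability index is taken as the mean Pearson correlation over all pairs of trials, and the ridge-to-background ratio uses a five-point ridge centered on the peak of the session-averaged response. The function names are hypothetical, the trial-shuffling step is omitted, and the handling of peaks near the edge of the window (truncating the ridge) is an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def reliability_index(trials):
    """Average response correlation over all pairwise combinations of trials."""
    pairs = list(combinations(trials, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def ridge_to_background(avg_response, half_width=2):
    """Ratio of mean normalized activity near the peak to that at all other times.

    half_width=2 gives a five-point ridge, i.e. 1 s at a 5 Hz frame rate.
    """
    peak_value = max(avg_response)
    normalized = [v / peak_value for v in avg_response]
    peak = avg_response.index(peak_value)
    lo = max(0, peak - half_width)
    hi = min(len(normalized), peak + half_width + 1)
    ridge = normalized[lo:hi]
    background = normalized[:lo] + normalized[hi:]
    return (sum(ridge) / len(ridge)) / (sum(background) / len(background))
```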

To determine whether reward outcome in the previous trial modulated licking behavior during the current trial, we first compared anticipatory licking in trials that were followed by either rewarded or unrewarded trials. We included all current trials, regardless of the cue or the outcome status. To examine the effect of outcome history on licking after reward delivery, we analyzed only currently rewarded trials, again ignoring the identity of the cue presented. To determine whether neural responses were modulated by previous outcome history, we computed a history modulation index (HMI) using the following formula:

$$\mathrm{HMI} = \frac{\text{Previous trial rewarded} - \text{Previous trial unrewarded}}{\text{Previous trial rewarded} + \text{Previous trial unrewarded}}$$

The HMI was computed from z-score values normalized by the following method. First, we took all currently rewarded trials and averaged the z-scores of ΔF/F values over a 2 s window starting 1 s after reward delivery. We chose this time window because we found that most of the task-modulated neurons were active during this period. These values were then scaled by the range of the observed responses, so that normalized values ranged from 0 to 1. Trials were then separated based on different outcome histories.
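A minimal sketch of the HMI computation is given below. It takes one response value per currently rewarded trial (the mean z-scored ΔF/F in the analysis window), scales the values to the range 0 to 1, splits trials by the outcome of the previous trial, and forms the index as the difference of the two group means divided by their sum. Averaging within each history group before forming the ratio is an assumption of this sketch, and the function names are hypothetical.

```python
from statistics import mean


def scale_to_unit_range(values):
    """Scale values by their observed range so that they span 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def history_modulation_index(responses, previous_rewarded):
    """HMI = (R - U) / (R + U).

    R and U are the mean scaled responses on trials preceded by a rewarded
    or an unrewarded trial, respectively. Positive values indicate stronger
    responses after rewarded trials.
    """
    scaled = scale_to_unit_range(responses)
    after_reward = [s for s, p in zip(scaled, previous_rewarded) if p]
    after_omission = [s for s, p in zip(scaled, previous_rewarded) if not p]
    r, u = mean(after_reward), mean(after_omission)
    return (r - u) / (r + u)
```

Because the scaled responses are non-negative, the index is bounded between -1 and 1.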

Linear regression analysis

To quantify the relationship between behavioral performance and neuronal activation, we used linear regression. For every session, we calculated the baseline licking and ΔF/F activation in the 1 s period preceding the cue onset and calculated the mean and standard deviation of the baseline across trials, which we then used to calculate z-scores of the tone-evoked licking and ΔF/F activation for every trial. We then averaged the normalized tone-evoked licking and ΔF/F response across trials for both cue types for every session. Next, we performed linear regression analyses to identify a possible relationship between tone-evoked ΔF/F activation and tone-evoked licking. We performed this regression for both high- and low-probability tones and for the difference in the licking and ΔF/F responses between them. As a first step, we created separate models for striosomes and matrix in order to calculate the regression coefficients and significance for these populations separately. In order to compare striosomes and matrix more directly, we made a combined model and then quantified the residuals for striosomes and matrix. The differences in residuals were compared using a paired t-test.
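The combined-model step can be illustrated with a simple least-squares sketch: one line is fitted to the pooled session data, and the residuals are then summarized separately for each compartment. This is a generic illustration with hypothetical function names. Which variable served as predictor and which as response is an assumption here, and the paired t-test on the residuals is not shown.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def residuals(x, y, slope, intercept):
    """Observed minus predicted values."""
    return [b - (slope * a + intercept) for a, b in zip(x, y)]


def mean_residual_by_group(x, y, groups):
    """Fit one model to the pooled data, then average residuals per group label."""
    slope, intercept = fit_line(x, y)
    res = residuals(x, y, slope, intercept)
    summary = {}
    for label in set(groups):
        values = [r for r, g in zip(res, groups) if g == label]
        summary[label] = sum(values) / len(values)
    return summary
```

A compartment whose sessions lie systematically above the pooled regression line has a positive mean residual, indicating stronger activation than the combined model predicts for a given level of licking.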

Statistical analysis

We used Wilcoxon sign-rank tests to detect significant modulation of single neurons in different task epochs. ANOVA was used to evaluate interactions between multiple factors. For percentages, Fisher’s exact test was used to compare groups, and confidence intervals were calculated using binomial tests.

Histology

After the experiments, mice were transcardially perfused with 0.9% saline solution followed by 4% paraformaldehyde in 0.1 M NaKPO4 buffer (PFA). The brains were removed, stored overnight in PFA solution at 4°C and transferred to glycerol solution (25% glycerol in tris buffered saline) until being frozen in dry ice and cut in transverse sections at 30 µm on a sliding microtome (American Optical Corporation). For staining, sections were first rinsed 3 × 5 min in PBS-Tx (0.01 M PBS + 0.2% Triton X-100), then were incubated in blocking buffer (Perkin Elmer TSA Kit) for 20 min followed by incubation with primary antibodies for GFP (Polyclonal, chicken, Abcam ab13970, 1:2000) and MOR1 (Polyclonal, goat, Santa Cruz sc-7488, 1:500). After two nights of incubation at 4°C, the sections were rinsed in PBS-Tx (3 × 5 min), incubated in secondary antibodies Alexa Fluor 488 (donkey anti-chicken, Invitrogen, 1:300) and Alexa Fluor 647 (donkey anti-goat, Invitrogen, 1:300) for 2 hr at room temperature, rinsed in 0.1 M PB (3 × 5 min), mounted and covered with a coverslip with ProLong Gold mounting medium with DAPI (Thermo Fisher Scientific).

To quantify the overlap between striosomes as detected by tdTomato and MOR1 staining, we stained sections from five mice and recorded images of 2 brain sections per mouse. We manually outlined striosomes for every marker twice and calculated the percentage of pixels that were marked as striosomes and matrix. In addition, we compared the repeated outlines of the striosomes that were made using the same marker, allowing us to get a measure of test-retest error rates when outlining striosomes on the basis of tdTomato or MOR1.
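One simple way to express such an overlap is the percentage of pixels that receive the same label (striosome or matrix) in two outlines, whether the outlines come from different markers or from repeated tracing with the same marker. The sketch below illustrates this with binary masks stored as nested lists. The function name and the choice of pixel-wise agreement as the metric are assumptions made for illustration, not a description of the exact measure used.

```python
def percent_agreement(mask_a, mask_b):
    """Percentage of pixels given the same label in two equally sized binary masks.

    Each mask is a list of rows; 1 marks a striosome pixel and 0 a matrix pixel.
    """
    total = 0
    same = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if a == b:
                same += 1
    return 100.0 * same / total
```

Applying the same function to two outlines drawn from the same marker gives a test-retest agreement against which the between-marker agreement can be judged.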

Acknowledgements

We thank Dr. Mark Howe and Dr. Dan Dombeck for invaluable advice on 2-photon imaging of the striatum, Dr. Leif Gibb and Jannifer Lee for initiating the breeding program, Dr. Josh Huang and Dr. Sean Kelly for their advice in this process, Cody Carter for critical help with the breeding of the mice, Dr. Yasuo Kubota for help preparing the manuscript, and Erik Nelson for his work on the histology.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Ann M Graybiel, Email: graybiel@MIT.EDU.

Geoffrey Schoenbaum, NIDA, United States.

Funding Information

This paper was supported by the following grants:

  • Simons Foundation 306140 to Ann M Graybiel.

  • National Institute of Mental Health R01 MH060379 to Ann M Graybiel.

  • Saks Kavanaugh Foundation to Ann M Graybiel.

  • Bachmann-Strauss Dystonia and Parkinson Foundation to Ann M Graybiel.

  • Netherlands Organization for Scientific Research - Rubicon to Bernard Bloem.

  • National Institute of Neurological Disorders and Stroke U01 NS090473 to Mriganka Sur.

  • National Eye Institute R01 EY007023 to Mriganka Sur.

  • National Science Foundation EF1451125 to Mriganka Sur.

  • Simons Foundation Autism Research Initiative to Mriganka Sur.

  • National Eye Institute F32 EY024857 to Rafiq Huda.

  • National Institute of Mental Health K99 MH112855 to Rafiq Huda.

  • Nancy Lurie Marks Family Foundation to Ann M Graybiel.

  • William N. & Bernice E. Bumpus foundation RRDA Pilot: 2013.1 to Ann M Graybiel.

  • William N. & Bernice E. Bumpus foundation to Bernard Bloem.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Software, Formal analysis, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

Conceptualization, Software, Formal analysis, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

Resources, Supervision, Funding acquisition, Writing—review and editing.

Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Writing—original draft, Writing—review and editing.

Ethics

Animal experimentation: All experiments were conducted in accordance with the National Institutes of Health guidelines and with the approval of the Committee on Animal Care at the Massachusetts Institute of Technology (protocol #: 1114-122-17).

Additional files

Transparent reporting form
DOI: 10.7554/eLife.32353.020

References

  1. Albin RL, Young AB, Penney JB. The functional anatomy of basal ganglia disorders. Trends in Neurosciences. 1989;12:366–375. doi: 10.1016/0166-2236(89)90074-X.
  2. Alexander GE, Crutcher MD. Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends in Neurosciences. 1990;13:266–271. doi: 10.1016/0166-2236(90)90107-L.
  3. Amemori K, Gibb LG, Graybiel AM. Shifting responsibly: the importance of striatal modularity to reinforcement learning in uncertain environments. Frontiers in Human Neuroscience. 2011;5:47. doi: 10.3389/fnhum.2011.00047.
  4. Amemori K, Graybiel AM. Localized microstimulation of primate pregenual cingulate cortex induces negative decision-making. Nature Neuroscience. 2012;15:776–785. doi: 10.1038/nn.3088.
  5. Amemori K, Amemori S, Graybiel AM. Motivation and affective judgments differentially recruit neurons in the primate dorsolateral prefrontal and anterior cingulate cortex. Journal of Neuroscience. 2015;35:1939–1953. doi: 10.1523/JNEUROSCI.1731-14.2015.
  6. Apicella P, Scarnati E, Ljungberg T, Schultz W. Neuronal activity in monkey striatum related to the expectation of predictable environmental events. Journal of Neurophysiology. 1992;68:945–960. doi: 10.1152/jn.1992.68.3.945.
  7. Bakhurin KI, Goudar V, Shobe JL, Claar LD, Buonomano DV, Masmanidis SC. Differential encoding of time by prefrontal and striatal network dynamics. The Journal of Neuroscience. 2017;37:854–870. doi: 10.1523/JNEUROSCI.1789-16.2016.
  8. Balleine BW, Delgado MR, Hikosaka O. The role of the dorsal striatum in reward and decision-making. Journal of Neuroscience. 2007;27:8161–8165. doi: 10.1523/JNEUROSCI.1554-07.2007.
  9. Banghart MR, Neufeld SQ, Wong NC, Sabatini BL. Enkephalin disinhibits mu opioid receptor-rich striatal patches via delta opioid receptors. Neuron. 2015;88:1227–1239. doi: 10.1016/j.neuron.2015.11.010.
  10. Barnes TD, Kubota Y, Hu D, Jin DZ, Graybiel AM. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature. 2005;437:1158–1161. doi: 10.1038/nature04053.
  11. Barnes TD, Mao JB, Hu D, Kubota Y, Dreyer AA, Stamoulis C, Brown EN, Graybiel AM. Advance cueing produces enhanced action-boundary patterns of spike activity in the sensorimotor striatum. Journal of Neurophysiology. 2011;105:1861–1878. doi: 10.1152/jn.00871.2010.
  12. Bayer HM, Glimcher PW. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005;47:129–141. doi: 10.1016/j.neuron.2005.05.020.
  13. Bloem B. Elife_2017_scripts. Github. dc18a30. 2017. https://github.com/bloemb/eLife_2017_scripts
  14. Bocarsly ME, Jiang WC, Wang C, Dudman JT, Ji N, Aponte Y. Minimally invasive microendoscopy system for in vivo functional imaging of deep nuclei in the mouse brain. Biomedical Optics Express. 2015;6:4546–4556. doi: 10.1364/BOE.6.004546.
  15. Bolam JP, Izzo PN, Graybiel AM. Cellular substrate of the histochemically defined striosome/matrix system of the caudate nucleus: a combined Golgi and immunocytochemical study in cat and ferret. Neuroscience. 1988;24:853–875. doi: 10.1016/0306-4522(88)90073-5.
  16. Brimblecombe KR, Cragg SJ. Substance P weights striatal dopamine transmission differently within the striosome-matrix axis. Journal of Neuroscience. 2015;35:9017–9023. doi: 10.1523/JNEUROSCI.0870-15.2015.
  17. Brimblecombe KR, Cragg SJ. The striosome and matrix compartments of the striatum: a path through the labyrinth from neurochemistry toward function. ACS Chemical Neuroscience. 2017;8:235–242. doi: 10.1021/acschemneuro.6b00333.
  18. Bromberg-Martin ES, Matsumoto M, Nakahara H, Hikosaka O. Multiple timescales of memory in lateral habenula and dopamine neurons. Neuron. 2010;67:499–510. doi: 10.1016/j.neuron.2010.06.031.
  19. Bromberg-Martin ES, Hikosaka O. Lateral habenula neurons signal errors in the prediction of reward information. Nature Neuroscience. 2011;14:1209–1216. doi: 10.1038/nn.2902.
  20. Brown J, Bullock D, Grossberg S. How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues. Journal of Neuroscience. 1999;19:10502–10511. doi: 10.1523/JNEUROSCI.19-23-10502.1999.
  21. Carvalho Poyraz F, Holzner E, Bailey MR, Meszaros J, Kenney L, Kheirbek MA, Balsam PD, Kellendonk C. Decreasing striatopallidal pathway function enhances motivation by energizing the initiation of goal-directed action. The Journal of Neuroscience. 2016;36:5988–6001. doi: 10.1523/JNEUROSCI.0444-16.2016.
  22. Chen TW, Wardill TJ, Sun Y, Pulver SR, Renninger SL, Baohan A, Schreiter ER, Kerr RA, Orger MB, Jayaraman V, Looger LL, Svoboda K, Kim DS. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature. 2013;499:295–300. doi: 10.1038/nature12354.
  23. Crittenden JR, Graybiel AM. Basal Ganglia disorders associated with imbalances in the striatal striosome and matrix compartments. Frontiers in Neuroanatomy. 2011;5:59. doi: 10.3389/fnana.2011.00059.
  24. Crittenden JR, Graybiel AM. Disease-associated changes in the striosome and matrix compartments of the dorsal striatum. In: Steiner H, Tseng KY, editors. Handbook of Basal Ganglia Structure and Function. Amsterdam: Elsevier; 2016. pp. 801–821.
  25. Crittenden JR, Tillberg PW, Riad MH, Shima Y, Gerfen CR, Curry J, Housman DE, Nelson SB, Boyden ES, Graybiel AM. Striosome-dendron bouquets highlight a unique striatonigral circuit targeting dopamine-containing neurons. PNAS. 2016;113:11318–11323. doi: 10.1073/pnas.1613337113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, Costa RM. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013;494:238–242. doi: 10.1038/nature11846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Cui Y, Ostlund SB, James AS, Park CS, Ge W, Roberts KW, Mittal N, Murphy NP, Cepeda C, Kieffer BL, Levine MS, Jentsch JD, Walwyn WM, Sun YE, Evans CJ, Maidment NT, Yang XW. Targeted expression of μ-opioid receptors in a subset of striatal direct-pathway neurons restores opiate reward. Nature Neuroscience. 2014;17:254–261. doi: 10.1038/nn.3622. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. DeLong MR. Primate models of movement disorders of basal ganglia origin. Trends in Neurosciences. 1990;13:281–285. doi: 10.1016/0166-2236(90)90110-V. [DOI] [PubMed] [Google Scholar]
  29. Dombeck DA, Harvey CD, Tian L, Looger LL, Tank DW. Functional imaging of hippocampal place cells at cellular resolution during virtual navigation. Nature Neuroscience. 2010;13:1433–1440. doi: 10.1038/nn.2648. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Doya K. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Networks. 1999;12:961–974. doi: 10.1016/S0893-6080(99)00046-5. [DOI] [PubMed] [Google Scholar]
  31. Eblen F, Graybiel AM. Highly restricted origin of prefrontal cortical inputs to striosomes in the macaque monkey. Journal of Neuroscience. 1995;15:5999–6013. doi: 10.1523/JNEUROSCI.15-09-05999.1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Fishell G, van der Kooy D. Pattern formation in the striatum: developmental changes in the distribution of striatonigral neurons. Journal of Neuroscience. 1987;7:1969–1978. doi: 10.1523/JNEUROSCI.07-07-01969.1987. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Flaherty AW, Graybiel AM. Input-output organization of the sensorimotor striatum in the squirrel monkey. Journal of Neuroscience. 1994;14:599–610. doi: 10.1523/JNEUROSCI.14-02-00599.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Friedman A, Homma D, Gibb LG, Amemori K, Rubin SJ, Hood AS, Riad MH, Graybiel AM. A corticostriatal path targeting striosomes controls decision-making under conflict. Cell. 2015;161:1320–1333. doi: 10.1016/j.cell.2015.04.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Friedman A, Homma D, Bloem B, Gibb LG, Amemori K-ichi, Hu D, Delcasso S, Truong TF, Yang J, Hood AS, Mikofalvy KA, Beck DW, Nguyen N, Nelson ED, Toro Arana SE, Vorder Bruegge RH, Goosens KA, Graybiel AM. Chronic stress alters striosome-circuit dynamics, leading to aberrant decision-making. Cell. 2017;171:1191–1205. doi: 10.1016/j.cell.2017.10.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Fujiyama F, Sohn J, Nakano T, Furuta T, Nakamura KC, Matsuda W, Kaneko T. Exclusive and common targets of neostriatofugal projections of rat striosome neurons: a single neuron-tracing study using a viral vector. European Journal of Neuroscience. 2011;33:668–677. doi: 10.1111/j.1460-9568.2010.07564.x. [DOI] [PubMed] [Google Scholar]
  37. Gage GJ, Stoetzner CR, Wiltschko AB, Berke JD. Selective activation of striatal fast-spiking interneurons during choice execution. Neuron. 2010;67:466–479. doi: 10.1016/j.neuron.2010.06.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Gerfen CR. The neostriatal mosaic: compartmentalization of corticostriatal input and striatonigral output systems. Nature. 1984;311:461–464. doi: 10.1038/311461a0. [DOI] [PubMed] [Google Scholar]
  39. Gerfen CR. The neostriatal mosaic: multiple levels of compartmental organization in the basal ganglia. Annual Review of Neuroscience. 1992;15:285–320. doi: 10.1146/annurev.ne.15.030192.001441. [DOI] [PubMed] [Google Scholar]
  40. Giménez-Amaya JM, Graybiel AM. Modular organization of projection neurons in the matrix compartment of the primate striatum. Journal of Neuroscience. 1991;11:779–791. doi: 10.1523/JNEUROSCI.11-03-00779.1991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Graybiel AM, Ragsdale CW. Histochemically distinct compartments in the striatum of human, monkeys, and cat demonstrated by acetylthiocholinesterase staining. PNAS. 1978;75:5723–5726. doi: 10.1073/pnas.75.11.5723. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Graybiel AM, Hickey TL. Chemospecificity of ontogenetic units in the striatum: demonstration by combining [3H]thymidine neuronography and histochemical staining. PNAS. 1982;79:198–202. doi: 10.1073/pnas.79.1.198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Graybiel AM. Correspondence between the dopamine islands and striosomes of the mammalian striatum. Neuroscience. 1984;13:1157–1187. doi: 10.1016/0306-4522(84)90293-8. [DOI] [PubMed] [Google Scholar]
  44. Graybiel AM. Templates for neural dynamics in the striatum: Striosomes and matrisomes. In: Shepherd G, Grillner S, editors. Handbook of Brain Microcircuits. New York: Oxford University Press; 2010. pp. 120–126. [DOI] [Google Scholar]
  45. Hagimoto K, Takami S, Murakami F, Tanabe Y. Distinct migratory behaviors of striosome and matrix cells underlying the mosaic formation in the developing striatum. Journal of Comparative Neurology. 2017;525:794–817. doi: 10.1002/cne.24096. [DOI] [PubMed] [Google Scholar]
  46. Hamid AA, Pettibone JR, Mabrouk OS, Hetrick VL, Schmidt R, Vander Weele CM, Kennedy RT, Aragona BJ, Berke JD. Mesolimbic dopamine signals the value of work. Nature Neuroscience. 2016;19:117–126. doi: 10.1038/nn.4173. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Harvey CD, Coen P, Tank DW. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature. 2012;484:62–68. doi: 10.1038/nature10918. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Hikosaka O, Kim HF, Yasuda M, Yamamoto S. Basal ganglia circuits for reward value-guided behavior. Annual Review of Neuroscience. 2014;37:289–306. doi: 10.1146/annurev-neuro-071013-013924. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature. 2013;500:575–579. doi: 10.1038/nature12475. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Howe MW, Dombeck DA. Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature. 2016;535:505–510. doi: 10.1038/nature18942. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Jiménez-Castellanos J, Graybiel AM. Compartmental origins of striatal efferent projections in the cat. Neuroscience. 1989;32:297–321. doi: 10.1016/0306-4522(89)90080-8. [DOI] [PubMed] [Google Scholar]
  52. Jin X, Costa RM. Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature. 2010;466:457–462. doi: 10.1038/nature09263. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Jog MS, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM. Building neural representations of habits. Science. 1999;286:1745–1749. doi: 10.1126/science.286.5445.1745. [DOI] [PubMed] [Google Scholar]
  54. Kaifosh P, Lovett-Barron M, Turi GF, Reardon TR, Losonczy A. Septo-hippocampal GABAergic signaling across multiple modalities in awake mice. Nature Neuroscience. 2013;16:1182–1184. doi: 10.1038/nn.3482. [DOI] [PubMed] [Google Scholar]
  55. Keiflin R, Janak PH. Dopamine prediction errors in reward learning and addiction: from theory to neural circuitry. Neuron. 2015;88:247–263. doi: 10.1016/j.neuron.2015.08.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Kelly SM, Raudelas R, He M, Lee J, Kim Y, Gibb LG, Wu P, Matho K, Osten P, Graybiel AM, Huang ZJ. Radial glial lineage progression and differential intermediate progenitor amplification underlie striatal compartments and circuit organization. Neuron in Revision. 2017 doi: 10.1016/j.neuron.2018.06.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Kim EJ, Ables JL, Dickel LK, Eisch AJ, Johnson JE. Ascl1 (Mash1) defines cells with long-term neurogenic potential in subgranular and subventricular zones in adult mouse brain. PLoS One. 2011;6:e18472. doi: 10.1371/journal.pone.0018472. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Klaus A, Martins GJ, Paixao VB, Zhou P, Paninski L, Costa RM. The spatiotemporal organization of the striatum encodes action space. Neuron. 2017;95:1171–1180. doi: 10.1016/j.neuron.2017.08.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Kreitzer AC, Malenka RC. Striatal plasticity and basal ganglia circuit function. Neuron. 2008;60:543–554. doi: 10.1016/j.neuron.2008.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Langer LF, Graybiel AM. Distinct nigrostriatal projection systems innervate striosomes and matrix in the primate striatum. Brain Research. 1989;498:344–350. doi: 10.1016/0006-8993(89)91114-1. [DOI] [PubMed] [Google Scholar]
  61. Lopez-Huerta VG, Nakano Y, Bausenwein J, Jaidar O, Lazarus M, Cherassse Y, Garcia-Munoz M, Arbuthnott G. The neostriatum: two entities, one structure? Brain Structure and Function. 2016;221:1737–1749. doi: 10.1007/s00429-015-1000-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Lovett-Barron M, Kaifosh P, Kheirbek MA, Danielson N, Zaremba JD, Reardon TR, Turi GF, Hen R, Zemelman BV, Losonczy A. Dendritic inhibition in the hippocampus supports fear learning. Science. 2014;343:857–863. doi: 10.1126/science.1247485. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Luo Z, Volkow ND, Heintz N, Pan Y, Du C. Acute cocaine induces fast activation of D1 receptor and progressive deactivation of D2 receptor striatal neurons: in vivo optical microprobe [Ca2+]i imaging. Journal of Neuroscience. 2011;31:13180–13190. doi: 10.1523/JNEUROSCI.2369-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Madisen L, Zwingman TA, Sunkin SM, Oh SW, Zariwala HA, Gu H, Ng LL, Palmiter RD, Hawrylycz MJ, Jones AR, Lein ES, Zeng H. A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nature Neuroscience. 2010;13:133–140. doi: 10.1038/nn.2467. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Matsumoto M, Hikosaka O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature. 2007;447:1111–1115. doi: 10.1038/nature05860. [DOI] [PubMed] [Google Scholar]
  66. Mink JW. The basal ganglia: focused selection and inhibition of competing motor programs. Progress in Neurobiology. 1996;50:381–425. doi: 10.1016/S0301-0082(96)00042-1. [DOI] [PubMed] [Google Scholar]
  67. Mizrahi A, Crowley JC, Shtoyerman E, Katz LC. High-resolution in vivo imaging of hippocampal dendrites and spines. Journal of Neuroscience. 2004;24:3147–3151. doi: 10.1523/JNEUROSCI.5218-03.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Nelson AB, Kreitzer AC. Reassessing models of basal ganglia function and dysfunction. Annual Review of Neuroscience. 2014;37:117–135. doi: 10.1146/annurev-neuro-071013-013916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Newman H, Liu FC, Graybiel AM. Dynamic ordering of early generated striatal cells destined to form the striosomal compartment of the striatum. Journal of Comparative Neurology. 2015;523:943–962. doi: 10.1002/cne.23725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Niv Y, Daw ND, Joel D, Dayan P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology. 2007;191:507–520. doi: 10.1007/s00213-006-0502-4. [DOI] [PubMed] [Google Scholar]
  71. Oyama K, Hernádi I, Iijima T, Tsutsui K. Reward prediction error coding in dorsal striatal neurons. Journal of Neuroscience. 2010;30:11447–11457. doi: 10.1523/JNEUROSCI.1719-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Oyama K, Tateyama Y, Hernádi I, Tobler PN, Iijima T, Tsutsui K. Discrete coding of stimulus value, reward expectation, and reward prediction error in the dorsal striatum. Journal of Neurophysiology. 2015;114:2600–2615. doi: 10.1152/jn.00097.2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Packard MG, Knowlton BJ. Learning and memory functions of the Basal Ganglia. Annual Review of Neuroscience. 2002;25:563–593. doi: 10.1146/annurev.neuro.25.112701.142937. [DOI] [PubMed] [Google Scholar]
  74. Parthasarathy HB, Schall JD, Graybiel AM. Distributed but convergent ordering of corticostriatal projections: analysis of the frontal eye field and the supplementary eye field in the macaque monkey. Journal of Neuroscience. 1992;12:4468–4488. doi: 10.1523/JNEUROSCI.12-11-04468.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Ragsdale CW, Graybiel AM. Fibers from the basolateral nucleus of the amygdala selectively innervate striosomes in the caudate nucleus of the cat. The Journal of Comparative Neurology. 1988;269:506–522. doi: 10.1002/cne.902690404. [DOI] [PubMed] [Google Scholar]
  76. Ragsdale CW, Graybiel AM. A simple ordering of neocortical areas established by the compartmental organization of their striatal projections. PNAS. 1990;87:6196–6199. doi: 10.1073/pnas.87.16.6196. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Rajakumar N, Elisevich K, Flumerfelt BA. Compartmental origin of the striato-entopeduncular projection in the rat. The Journal of Comparative Neurology. 1993;331:286–296. doi: 10.1002/cne.903310210. [DOI] [PubMed] [Google Scholar]
  78. Redgrave P, Prescott TJ, Gurney K. The basal ganglia: a vertebrate solution to the selection problem? Neuroscience. 1999;89:1009–1023. doi: 10.1016/S0306-4522(98)00319-4. [DOI] [PubMed] [Google Scholar]
  79. Rikhye RV, Sur M. Spatial correlations in natural scenes modulate response reliability in mouse visual cortex. Journal of Neuroscience. 2015;35:14661–14680. doi: 10.1523/JNEUROSCI.1660-15.2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Rueda-Orozco PE, Robbe D. The striatum multiplexes contextual and kinematic information to constrain motor habits execution. Nature Neuroscience. 2015;18:453–460. doi: 10.1038/nn.3924. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Saka E, Goodrich C, Harlan P, Madras BK, Graybiel AM. Repetitive behaviors in monkeys are linked to specific striatal activation patterns. Journal of Neuroscience. 2004;24:7557–7565. doi: 10.1523/JNEUROSCI.1072-04.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Salamone JD, Correa M. The mysterious motivational functions of mesolimbic dopamine. Neuron. 2012;76:470–485. doi: 10.1016/j.neuron.2012.10.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Salinas AG, Davis MI, Lovinger DM, Mateo Y. Dopamine dynamics and cocaine sensitivity differ between striosome and matrix compartments of the striatum. Neuropharmacology. 2016;108:275–283. doi: 10.1016/j.neuropharm.2016.03.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Samejima K, Ueda Y, Doya K, Kimura M. Representation of action-specific reward values in the striatum. Science. 2005;310:1337–1340. doi: 10.1126/science.1115270. [DOI] [PubMed] [Google Scholar]
  85. Sato K, Sumi-Ichinose C, Kaji R, Ikemoto K, Nomura T, Nagatsu I, Ichinose H, Ito M, Sako W, Nagahiro S, Graybiel AM, Goto S. Differential involvement of striosome and matrix dopamine systems in a transgenic model of dopa-responsive dystonia. PNAS. 2008;105:12551–12556. doi: 10.1073/pnas.0806065105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Sato M, Kawano M, Yanagawa Y, Hayashi Y. In vivo two-photon imaging of striatal neuronal circuits in mice. Neurobiology of Learning and Memory. 2016;135:146–151. doi: 10.1016/j.nlm.2016.07.006. [DOI] [PubMed] [Google Scholar]
  87. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593. [DOI] [PubMed] [Google Scholar]
  88. Schultz W. Dopamine reward prediction-error signalling: a two-component response. Nature Reviews Neuroscience. 2016;17:183–195. doi: 10.1038/nrn.2015.26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Slotnick B. A simple 2-transistor touch or lick detector circuit. Journal of the Experimental Analysis of Behavior. 2009;91:253–255. doi: 10.1901/jeab.2009.91-253. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Smith KS, Graybiel AM. A dual operator view of habitual behavior reflecting cortical and striatal dynamics. Neuron. 2013;79:361–374. doi: 10.1016/j.neuron.2013.05.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Smith JB, Klug JR, Ross DL, Howard CD, Hollon NG, Ko VI, Hoffman H, Callaway EM, Gerfen CR, Jin X. Genetic-based dissection unveils the inputs and outputs of striatal patch and matrix compartments. Neuron. 2016;91:1069–1084. doi: 10.1016/j.neuron.2016.07.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Stalnaker TA, Calhoon GG, Ogawa M, Roesch MR, Schoenbaum G. Reward prediction error signaling in posterior dorsomedial striatum is action specific. Journal of Neuroscience. 2012;32:10296–10305. doi: 10.1523/JNEUROSCI.0832-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Stephenson-Jones M, Yu K, Ahrens S, Tucciarone JM, van Huijstee AN, Mejia LA, Penzo MA, Tai LH, Wilbrecht L, Li B. A basal ganglia circuit for evaluating action outcomes. Nature. 2016;539:289–293. doi: 10.1038/nature19845. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Tai LH, Lee AM, Benavidez N, Bonci A, Wilbrecht L. Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nature Neuroscience. 2012;15:1281–1289. doi: 10.1038/nn.3188. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Tajima K, Fukuda T. Region-specific diversity of striosomes in the mouse striatum revealed by the differential immunoreactivities for mu-opioid receptor, substance P, and enkephalin. Neuroscience. 2013;241:215–228. doi: 10.1016/j.neuroscience.2013.03.012. [DOI] [PubMed] [Google Scholar]
  96. Taniguchi H, He M, Wu P, Kim S, Paik R, Sugino K, Kvitsiani D, Kvitsani D, Fu Y, Lu J, Lin Y, Miyoshi G, Shima Y, Fishell G, Nelson SB, Huang ZJ. A resource of Cre driver lines for genetic targeting of GABAergic neurons in cerebral cortex. Neuron. 2011;71:995–1013. doi: 10.1016/j.neuron.2011.07.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Thorn CA, Atallah H, Howe M, Graybiel AM. Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron. 2010;66:781–795. doi: 10.1016/j.neuron.2010.04.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Tippett LJ, Waldvogel HJ, Thomas SJ, Hogg VM, van Roon-Mom W, Synek BJ, Graybiel AM, Faull RL. Striosomes and mood dysfunction in Huntington's disease. Brain. 2007;130:206–221. doi: 10.1093/brain/awl243. [DOI] [PubMed] [Google Scholar]
  99. Tseng Q, Wang I, Duchemin-Pelletier E, Azioune A, Carpi N, Gao J, Filhol O, Piel M, Théry M, Balland M. A new micropatterning method of soft substrates reveals that different tumorigenic signals can promote or reduce cell contraction levels. Lab on a Chip. 2011;11:2231–2240. doi: 10.1039/c0lc00641f. [DOI] [PubMed] [Google Scholar]
  100. Walker RH, Arbuthnott GW, Baughman RW, Graybiel AM. Dendritic domains of medium spiny neurons in the primate striatum: relationships to striosomal borders. The Journal of Comparative Neurology. 1993;337:614–628. doi: 10.1002/cne.903370407. [DOI] [PubMed] [Google Scholar]
  101. Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, Uchida N. Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron. 2012;74:858–873. doi: 10.1016/j.neuron.2012.03.017. [DOI] [PubMed] [Google Scholar]
  102. Watabe-Uchida M, Eshel N, Uchida N. Neural circuitry of reward prediction error. Annual Review of Neuroscience. 2017;40:373–394. doi: 10.1146/annurev-neuro-072116-031109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nature Reviews Neuroscience. 2006;7:464–476. doi: 10.1038/nrn1919. [DOI] [PubMed] [Google Scholar]

Decision letter

Editor: Geoffrey Schoenbaum1

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Two-Photon Imaging of Striatum Demonstrates Distinct Functions for Striosomes and Matrix in Reinforcement Learning" for consideration by eLife. Your article has been favorably evaluated by Eve Marder (Senior Editor) and three reviewers, one of whom, Geoffrey Schoenbaum (Reviewer #1), is a member of our Board of Reviewing Editors.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

In this paper, the authors deploy a new approach for separating neural signals from patch and matrix neurons recorded in awake, behaving, head-fixed mice. Using 2-photon Ca++ imaging to separately image these populations in mice performing an auditory discrimination for water reward, they identify neural correlates of cue, response and reward in both populations. Striosomal activity is somewhat stronger and discriminates reward better, but overall the simple task identifies very similar neural correlates of the behaviors.

Essential revisions:

All three reviewers agreed that the paper is a technical marvel and presents a potentially transformative tool for the study of the different circuits that run through these two compartments of striatum. This has long been a question that has interested researchers, and the authors seem to have a strong approach for dissociating them in awake behaving animals. This is extraordinary. However, all three reviewers felt that additional detail regarding the specificity and sensitivity of the dissociation was needed, as well as some hand-holding perhaps to explain it better, since it is such a novel and important part of the study. Specific requests are given by reviewers 1 and 2 in their initial comments.

The reviewers also generally agreed that the responses of patch and matrix cells were remarkably similar. The authors understandably emphasized the differences; however, they seemed relatively slight against a background of very similar activity. Obviously this is somewhat due to the very simple structure of the task used, but it was felt the treatment of this should be more balanced. Indeed, isn't it perhaps more interesting that there are so few differences, given all the strong proposals about these two compartments? Further, there was some concern that the differences observed could be related to signal differences or normalization approaches (reviewers 1 and 2, second comment, and reviewer 3, first comment). This should be addressed and the data presented in a more balanced fashion (Abstract, title, Discussion).

Related but separate from this, it was also felt that the authors should be clearer as to how they think the differences or lack thereof in their data affect any theoretical accounts. To quote one reviewer from the discussion: "It would be a great opportunity for the Graybiel lab to add some insight to the literature on their current thoughts on the function of these compartments" given their data – whether there turn out to be differences – or maybe even more interestingly if the differences are deemed not the main story.

Finally, all three reviewers had some difficulty following what was done. This is a general problem throughout, but is particularly acute for understanding how many mice contributed to the recordings of the different cell populations in the different periods of training. This is indicated strongly by reviewer 2, third point, and reviewer 3, fourth point. This should be clarified.

Reviewer #1:

In this study the authors used 2-photon calcium imaging combined with transgenically-restricted birthdate-labeling of striosomal neurons to gather data on the differential activity of striosome and matrix neurons in the DS of head-fixed mice performing a simple auditory discrimination task for water. This is a pressing question given the differential connectivity of these two compartments and the speculative theories regarding their differential functions. As the authors note, there is little or no data from neural correlates substantiating any of these exciting ideas. Calcium imaging combined with genetic techniques to discriminate the two regions offers the possibility of addressing this potentially important question. Using this approach, the authors seem to have the ability to distinguish patch and matrix neurons in mice learning the discrimination task. They find both populations show task-related firing. Activity is related to the cues, rewards and post-reward period. It is somewhat stronger and discriminates reward better in the striosomes, but neurons in both compartments show generally similar patterns of activity.

Overall the paper is a technical tour de force. Combining calcium imaging with the fate-mapping to segregate these neural compartments is brilliant and offers a potential tool that can be used to test the various theories that have been advanced for how they interact to support striatal function. I think it could be improved if the authors would provide a clearer explanation for how this works for the uninitiated and a more detailed accounting of the specificity of this method (cell counts of numbers of labeled neurons in the patches, out of the patches, versus unlabeled). However, it looks very promising.

In this context, the authors chose to apply a very basic task. As they note, this is just a first step, but perhaps as a result, the differences identified are relatively slight. In fact, to me, the patterns are remarkably similar in the two cell types, especially when one considers that the striosomes seem overall more active. A higher overall level of activity would give rise to many of the other statistical differences it seems to me, such as somewhat higher percentages of statistically engaged neurons in a particular epoch or better discrimination of reward vs. non-reward. These differences may be important to a downstream observer, but they are clearly modest and entirely quantitative, rather than qualitative. It seems to me that the truly transformative theoretical accounts mentioned would predict serious qualitative differences in the right setting.

Of course the authors note that they applied a simple task intentionally. The behavioral approach and analysis (at least as described) was not intended to directly target the predictions of any of these accounts, but merely to test in a simple setting whether there were any differences. But as a result, it seems to me that the data do not really challenge or force a modification of any of these proposals. Or at least it is unclear to me whether the authors believe they do – in other words, are the authors prepared to say that their findings cast doubt on or favor any of these proposals that the patches and matrix neurons do fundamentally different jobs? I did not get the impression that the data do this, either from reading them, or from the authors' Discussion.

So overall my impression is that this is a remarkable paper in terms of the tools it applies and technical approaches, but the heavy lifting (as the authors note) is left to future studies. My feeling is that this is more than sufficient because of the importance of distinguishing these populations and the novelty of this approach.

Reviewer #2:

This paper addresses an interesting and understudied aspect of striatal complexity, namely how the striosome and matrix components of the striatum function in reinforcement learning. I found the results interesting but largely descriptive. There were no attempts to manipulate function of these populations and thereby test their necessity in these behaviors. As such, I was left unclear on whether these populations have a distinct function in reinforcement learning beyond the correlative differences the authors observed in their recordings. I was also concerned about the number of animals used, particularly in the over-training dataset where n=2. Additional methodological issues may also impact their results and could be clarified. Specifically:

1) There was little quantitative information on how well their manipulation targeted striosomes. I would like to see a zoomed-out image of the striatum, as well as quantification of the MOR overlap that they mentioned. I was also concerned that <20% of the neurons labeled as striosomes with their strategy actually expressed tdTmt. What does this say about their expression strategy? Were quantitative methods employed to split non-labeled neurons into striosome/matrix?

2) Did the recording quality (ΔF/F, mean fluorescence, overall variance) differ between the striosome and matrix identified neurons? Did the presence of tdTmt in labeled cells impact the quality of the GCaMP recordings in those cells?

3) Some description of consistency across mice is warranted throughout the paper. Were similar proportions of striosome/matrix neurons recorded in all mice? Were the size and quality of neuronal responses consistent across animals? If not, I worry that results may reflect differences between animals, rather than differences between cell types. This is especially worrisome in the overtraining data, where n=2 mice.

4) Although licking behavior itself changes with training, this was not discussed in the context of their results. Particularly in the over-training experiments, are the neuronal responses to the tone and licking co-varying with increases in licking? Or are they independent?

Reviewer #3:

The authors address a long-standing question about the different functions of striosomal vs. matrix neurons in striatum. Using a new mouse line the authors have recorded striosomal neurons for the first time, a major achievement! Based on extensive anatomical evidence, largely by the senior author, striosomes have been proposed to serve an evaluation function during reinforcement learning, therefore I would have expected rather distinct responses. The authors did find some differences between striosome vs. matrix neurons, but the major conclusion for me was that they appeared rather similar. I think this should be communicated in the manuscript. There are also some technical issues that need to be addressed to support the conclusions about differences.

1) One of their central findings is that striosomal SPNs provide more selective reward responses compared to their matrix counterparts. They show that normalized striosomal responses to reward-predictive cues and to reward are stronger. One potential issue lies in how this normalization is conducted. The authors use z-score normalization to compare across neurons and conditions. One possibility is the enhanced responses are due to increases in activity in response to task events; alternatively they are due to decreased baseline variability. Given that the authors use the striosomal-specific red fluorescence channel to align their image, it's possible that measured neuronal (not neuropil) activity within the matrix is subject to increased noise from imperfect alignment. This issue with baseline variability is visible in the example neurons shown in 5D. The proportion of striosomal vs. matrix task-modulated neurons might also be affected by differences in the noise floor.

2) It would be informative to quantify the variability in timing of responses across trials for both striosomal and matrix populations.

3) The finding that striatal SPNs "tiled the temporal space of the task" needs to be backed up by appropriate controls. Is the apparent tiling of temporal space in the 2d plots in 4G and 5F due to variability inherent in measuring this timing given limited numbers of trials? The authors could address this by quantifying the peak time and spread (standard deviation or other metric) of each neuron. Then they can compare these values to a synthetic population of neurons by shuffling neuron labels for each trial.

4) The division and analysis of neuronal responses across acquisition, criterion and overtraining phases is useful and informative, but also somewhat arbitrary. It would be more informative to quantify how responses evolve on a more continuous basis relative to conditioned behavior. Relatedly, they could explicitly examine stimulus-evoked activity by performing a linear regression to model how activity is explained by licking behavior. The expectation is that the residuals from this fit would be enhanced for striosomal SPNs for cue and reward relative to matrix SPNs.

The demonstration that matrix neurons are modulated more strongly by reward history is nice and begins to address the issue with increased response magnitude vs. baseline variability I have raised above.

eLife. 2017 Dec 18;6:e32353. doi: 10.7554/eLife.32353.023

Author response


Essential revisions:

All three reviewers agreed that the paper is a technical marvel and presents a potentially transformative tool for the study of the different circuits that run through these two compartments of striatum. This has long been a question that has interested researchers, and the authors seem to have a strong approach for dissociating them in awake behaving animals. This is extraordinary. However all three reviewers felt that additional detail regarding the specificity and sensitivity of the dissociation was needed, as well as some hand-holding perhaps to explain it better, since it is such a novel and important part of the study. Specific requests are given by reviewer 1 and 2 in their initial comments.

The reviewers also generally agreed that the responses of patch and matrix cells were remarkably similar. The authors understandably emphasized the differences, however they seemed relatively slight against a background of very similar activity. Obviously this is somewhat due to the very simple structure of the task used, but it was felt the treatment of this should be more balanced. Indeed isn't it perhaps more interesting that there are so few differences, given all the strong proposals about these two compartments? Further there was some concern that the differences observed could be related to signal differences or normalization approaches (reviewers 1 and 2, second comment and reviewer 3, first comment). This should be addressed and the data presented in a more balanced fashion (Abstract, title, Discussion).

We have greatly benefitted from these comments. We agree that we should in our original submission have commented more fully on how similar the responses of the sampled striosome and matrix regions were found to be. In fact, we now have featured this similarity alongside the quantitative differences that we observed. We now note in the manuscript that the very similarity adds to the need for further work to identify what key functional properties do distinguish them.

Related but separate from this, it was also felt that the authors should also be more clear as to how they think the differences or lack thereof in their data affect any theoretical accounts. To quote one reviewer from the discussion: "It would be a great opportunity for the Graybiel lab to add some insight to the literature on their current thoughts on the function of these compartments" given their data – whether there turn out to be differences – or maybe even more interestingly if the differences are deemed not the main story.

We appreciate this request. We have added to the discussion of what these compartments might do, but we are limited in how much such discussion we can add because we realize – as the reviewers did, but we inadvertently failed to write adequately – that we have not yet delineated the functions posited for these compartments. We are adding that greater responsivity to predictors of reward in the response profiles of the sampled striosomal neurons would be in line with striosomes acting as critic in an actor-critic architecture, but this finding does not nail this as their exclusive function. This heightened sensitivity would also, for example, be in accord with the responsibility function idea that our lab put forward, along with other ideas based on limbic associations of the striosomes. We introduce this manuscript to determine very basic features of the two compartments with what we believe to be the first imaging study, combining 2-photon imaging with birthdate labeling enabling neuropil plus cell body labeling of striosomal cells so that striosomal and matrix neurons can be simultaneously imaged. We definitely hope to go on with a deep exploration of these issues.

Finally all three reviewers had some difficulty following what was done. This is a general problem throughout, but is particularly acute for understanding how many mice contributed to the recordings of the different cell populations in the different periods of training. This is indicated strongly by reviewer 2, third point, and reviewer 3, fourth point. This should be clarified.

We deeply apologize for having been unclear. In the revision that we have prepared, we have made every effort to incorporate missing experimental details not only in the text but also in a new set of tables with information about the recorded neurons (Table 2) and sessions per mouse (Table 4). We address this issue further in response to relevant reviewers’ questions below.

We have also performed a series of additional analyses described in the revised manuscript. We have added new panels in Figures 6 and 7 that address question 4 of reviewer 2, and we have added supplementary material addressing question 1 of reviewer 2 (Figure 1—figure supplement 1; Table 1), question 3 of reviewer 1 and question 2 of reviewer 2 (Table 2), and questions 2, 3 and 4 of reviewer 3 (Figures 4, 5 and Figure 7—figure supplement 1; Table 5).

Reviewer #1:

[…] Overall the paper is a technical tour de force. Combining calcium imaging with the fate-mapping to segregate these neural compartments is brilliant and offers a potential tool that can be used to test the various theories that have been advanced for how they interact to support striatal function. I think it could be improved if the authors would provide a clearer explanation for how this works for the uninitiated and a more detailed accounting of the specificity of this method (cell counts of numbers of labeled neurons in the patches, out of the patches, versus unlabeled). However it looks very promising.

We thank the reviewer very much for these positive comments. We have now added more details about the birth-dating method and apologize for not having been clearer in our original submission. The key to this protocol is that by pulse-tagging the neurons in the mouse line that we used (by administering tamoxifen to pregnant dams), we could not only achieve strong tdTomato labeling of a group of neurons born within the time-window of neurogenesis of striosomal cells, but also achieve labeling of the local processes of these cells, which are known to remain mainly within the bounds of striosomes. This meant that even though the tdTomato marker labeled only a small proportion of the cells that would eventually differentiate into striosomal cells at maturity, they had sufficient local processes to define the borders of striosomes. With this method, we found relatively few tdTomato-labeled neurons outside of striosomes, namely, 19/1996 total matrix cells detected as compared to 111/708 total cells detected in the striosomal sample. We carefully confirmed the match to striosomes by immunohistochemistry (for which we have added further illustrations in Figure 1—figure supplement 1; Table 1). We have added a table summarizing the number of labeled neurons inside and outside of striosomes for the total population and for the individual mice (Table 2). We have also added description of the birth-dating method, both in the Results and in the Materials and methods sections.
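As a quick check on the counts quoted above, the labeled fractions can be worked out directly. The snippet below is only an illustrative calculation; the variable names are chosen here for convenience and do not come from the manuscript.

```python
# Counts reported in the response: tdTomato-labeled cells / total cells detected.
matrix_labeled, matrix_total = 19, 1996
striosome_labeled, striosome_total = 111, 708

matrix_fraction = matrix_labeled / matrix_total
striosome_fraction = striosome_labeled / striosome_total

print(f"matrix: {matrix_fraction:.2%}")         # roughly 1% of matrix cells
print(f"striosomes: {striosome_fraction:.2%}")  # roughly 16% of striosomal cells
```

The striosomal fraction is in line with the "<20%" figure raised by reviewer 2, while labeled cells in the matrix sample are about an order of magnitude rarer.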

In this context, the authors chose to apply a very basic task. As they note, this is just a first step, but perhaps as a result, the differences identified are relatively slight. In fact, to me, the patterns are remarkably similar in the two cell types, especially when one considers that the striosomes seem overall more active. A higher overall level of activity would give rise to many of the other statistical differences it seems to me, such as somewhat higher percentages of statistically engaged neurons in a particular epoch or better discrimination of reward vs. non-reward. These differences may be important to a downstream observer, but they are clearly modest and entirely quantitative, rather than qualitative. It seems to me that the truly transformative theoretical accounts mentioned would predict serious qualitative differences in the right setting.

The reviewer has helped us greatly by these comments. We did exactly as he/she said: we purposely used a simple classical conditioning task in this first report on the detection of striosomal and matrix neurons by direct imaging, as we intended to determine whether the method could reliably detect basic features of striatal function as they have been reported in many previous papers relating the activity of striatal SPNs to aspects of reinforcement-directed behavior. We now, in response to the reviewer’s comment, have addressed the question of how differences in baseline activity in striosomal and matrix neurons affect our results. We think that there are two main arguments why general differences in activity do not account for the observed differences in both compartments. First, we find that in ΔF/F normalized data, there are no differences between striosomal and matrix neurons in the mean, the standard deviation, and the peak signal during baseline periods, suggesting similar signal-to-noise ratios in both populations. We have added a table with these measurements (Table 3). Secondly, we find that striosomes and matrix have similar percentages of reward-modulated neurons and that the average responses in both populations are similar. If striosomal neurons were to be simply more active in general, then we would expect striosomes to have more task-modulated neurons during all epochs of the task, and we would expect larger overall responses. But we do not find these patterns. For these reasons, we do not think that differences between the two compartments in general activity levels can account for our findings that striosomes show preferential cue encoding.

As to the theoretical implications of this work, we have tried to write more about these in the Discussion section. We again hope to emphasize that this manuscript is aimed at reporting what we believe is the first direct simultaneous detection of the activity of striosomes and matrix.

Of course the authors note that they applied a simple task intentionally. The behavioral approach and analysis (at least as described) was not intended to directly target the predictions of any of these accounts, but merely to test in a simple setting whether there were any differences. But as a result, it seems to me that the data do not really challenge or force a modification of any of these proposals. Or at least it is unclear to me whether the authors believe they do – in other words, are the authors prepared to say that their findings cast doubt on or favor any of these proposals that the patches and matrix neurons do fundamentally different jobs? I did not get the impression that the data do this, either from reading them, or from the authors' Discussion.

Once again, we agree with the reviewer, and we very much appreciate this comment. We have chosen to do a simple reinforcement task so that we could observe striosomal and matrix activity in a basic task that captures important aspects of striatal function but that is easy to interpret. We believe that a stronger encoding of reward-predicting cues is an important finding that is relevant to many theories of striatal function, but we also acknowledge that we have not yet tested one of the proposals that striosomes and matrix do fundamentally different jobs. We are therefore trying to be careful about drawing conclusions too strongly about the functions of the compartments and trying to stay as close as possible to the basic observations. At the same time, we have found clear quantitative differences in the activities of the two compartments and have tried to report these carefully. We have tried to connect our findings more to the existing literature and hypotheses in the Discussion. In future work, we aim to address the existing hypotheses about striosomal function more directly.

So overall my impression is that this is a remarkable paper in terms of the tools it applies and technical approaches, but the heavy lifting (as the authors note) is left to future studies. My feeling is that this is more than sufficient because of the importance of distinguishing these populations and the novelty of this approach.

This reviewer has been extremely helpful to us in leading us to state directly that this is not a manuscript to “solve” the riddle of striosome-matrix functions; in fact, this manuscript augments this riddle by indicating that there are many commonalities between the striosomes and matrix in basic classical reward-based learning.

Reviewer #2:

This paper addresses an interesting and understudied aspect of striatal complexity, namely how the striosome and matrix components of the striatum function in reinforcement learning. I found the results interesting but largely descriptive. There were no attempts to manipulate function of these populations and thereby test their necessity in these behaviors. As such, I was left unclear on whether these populations have a distinct function in reinforcement learning beyond the correlative differences the authors observed in their recordings. I was also concerned about the number of animals used, particularly in the over-training dataset where n=2. Additional methodological issues may also impact their results and could be clarified. Specifically:

1) There was little quantitative information on how well their manipulation targeted striosomes. I would like to see a zoomed out image of the striatum, as well as quantification of MOR overlap that they mentioned. I was also concerned that <20% of the neurons labeled as striosomes with their strategy actually expressed tdTmt. What does this say about their expression strategy? Were quantitative methods employed to split non-labeled neurons into striosome/matrix?

The reviewer raises a critical issue for which we thank him/her. In response to this comment, we have now added a figure supplement containing zoomed out images of the striatum (Figure 1—figure supplement 1), as well as a quantification of the overlap of striosomes as marked by tdTomato and MOR1 staining (Table 1). We find that 2% of the pixels are scored as striosomes on the basis of tdTomato but not MOR1 and that 3.7% of the pixels are scored as striosomes on the basis of MOR1 but not tdTomato. To control for inevitable errors in outlining these structures, we also quantified the test-retest error for outlining MOR1 and tdTomato structures, which we found to be 2.3 and 2.4%, respectively. This shows that using tdTomato will mainly cause us to have false negatives, or missed striosomes, in our analysis.
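To make the overlap measure concrete, the following is a minimal sketch of how the two mismatch percentages could be computed from a pair of binary outlines. It is not the code used in the study: the function name `mismatch_fractions`, the use of boolean pixel masks, and the choice of all image pixels as the denominator are assumptions made for illustration.

```python
import numpy as np

def mismatch_fractions(tdtomato_mask, mor1_mask):
    """Return (tdTomato-only, MOR1-only) pixel fractions.

    Assumption: both fractions are expressed relative to all pixels
    in the image; the response does not state the denominator used.
    """
    td = np.asarray(tdtomato_mask, dtype=bool)
    mor = np.asarray(mor1_mask, dtype=bool)
    if td.shape != mor.shape:
        raise ValueError("masks must have the same shape")
    td_only = np.logical_and(td, ~mor).mean()   # striosome by tdTomato, not MOR1
    mor_only = np.logical_and(mor, ~td).mean()  # striosome by MOR1, not tdTomato
    return float(td_only), float(mor_only)
```

Under these assumptions, the same function applied to two independent outlines of the same marker would give a test-retest error of the kind quoted above.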

To answer the reviewer’s question about the small percentage of labeled neurons, the pulse labeling was applied to the mid-interval of neurogenesis of striosomal cells. They are born over a period of ~3 days during embryonic development. We could have given multiple tamoxifen injections on several days to label more neurons, but we chose not to do so because of the adverse effects that this protocol has on the animals, and because this could have resulted in higher false-positive rates, as toward the end of the striosomal neurogenic window, matrix neurons are also beginning to be born. We have experience with the type of anatomical labeling that comes from this sort of pulse tagging, and here again found that the neuropil of the pulse-labeled striosomal neurons largely is confined to striosomes as confirmed with immunostaining. We could obtain sufficient labeling of these local processes of the striosomal neurons to permit detection of striosomal borders, as the illustrations show. We again would like to thank the reviewer for asking for further illustrations and detailed explanation. Our identification of neurons as striosomal critically depended on this neuropil labeling. The delineation of the striosomes was done manually but blind to the neuronal responses.

2) Did the recording quality (ΔF/F, mean fluorescence, overall variance) differ between the striosome and matrix identified neurons? Did the presence of tdTmt in labeled cells impact the quality of the GCaMP recordings in those cells?

We thank the reviewer for addressing this issue. Our original manuscript should have had more information about this, and we apologize. We have now added to the manuscript a paragraph (subsection “Imaging of striosomes”, last paragraph) describing our efforts to control for these issues, as well as a table summarizing parameters that are useful for answering this question (Table 3). We find a slightly lower level of expression of GCaMP6 in striosomes. It is not clear what the reason for this is, but we know from the literature as well as from our own experience that many AAVs tend to have lower fluorescent marker signals in striosomes. To overcome this limitation, we normalized the fluorescence signals to their baseline using ΔF/F normalization, as is common in calcium imaging. We compared striosomal and matrix signals by quantifying the mean, the standard deviation and the peak of the signals during the baseline period. We have not found any significant differences with respect to these parameters when we compared matrix neurons to tdTomato-labeled striosomal neurons or neurons within the striosomal neuropil.
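The normalization and baseline measurements described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for Table 3: the function names are hypothetical, and taking F0 as the mean raw fluorescence in a baseline window is one common choice that the response does not specify.

```python
import numpy as np

def delta_f_over_f(trace, baseline_idx):
    """dF/F = (F - F0) / F0, with F0 assumed to be the baseline-window mean."""
    f = np.asarray(trace, dtype=float)
    f0 = f[baseline_idx].mean()
    return (f - f0) / f0

def baseline_stats(dff, baseline_idx):
    """Mean, standard deviation and peak of a dF/F trace during baseline."""
    b = np.asarray(dff, dtype=float)[baseline_idx]
    return {"mean": float(b.mean()),
            "sd": float(b.std(ddof=1)),
            "peak": float(b.max())}
```

Because ΔF/F divides by F0, multiplying a raw trace by a constant (for example, a lower GCaMP6 expression level) leaves the normalized trace unchanged, which is why this normalization helps when expression differs between compartments.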

3) Some description of consistency across mice is warranted throughout the paper. Were similar proportions of striosome/matrix neurons recorded in all mice? Was the size and quality of neuronal responses consistent across animals? If not, I worry that results may reflect differences between animals, rather than differences between cell types. This is especially worrisome in the overtraining data, where n=2 mice.

We would like to apologize for not including these data in the manuscript. We have now added a table (Table 2) with the number of neurons recorded in every mouse. We tried to get roughly the same numbers of neurons for the different mice, but this was not always possible, because some mice yielded more fields of view in which we could clearly define striosomes than did others.

To address the issue of consistency of recording quality among mice and the low number of mice that we have in the training and overtraining analysis, we have added another table (Table 4). We have included the number of sessions to which the different mice contribute as well as measurements of baseline signals. We find that the size and the quality of the recordings are not the same for all mice, which is likely related to the imaging quality, imaging depth and other differences among mice. However, the mice that were included in Figures 6 and 7 are representative of all mice that were studied.

We thought that the fairest comparison across mice would be to compare the proportions of task-modulated neurons that we found in the different mice. We found that in all mice, the percentages of striosomal and matrix neurons that were task-responsive were roughly similar. In addition, we found that in all mice, there were more cue-modulated neurons in striosomes than in the matrix. Finally, to respond to the question of whether differences among mice, rather than differences between cell types, can account for our results, we would like to note that we found similar results when we analyzed the overall ΔF/F from striosomal/matrix regions. For this analysis, we had matched striosomal and matrix recordings made at the same time, so that differences among mice cannot account for our findings.

4) Although licking behavior itself changes with training, this was not discussed in the context of their results. Particularly in the over-training experiments, are the neuronal responses to the tone and licking co-varying with increases in licking? Or are they independent?

We thank the reviewer for addressing this interesting issue. We have added plots with the licking data to Figure 6D and 6H and to Figure 7 (new panel B) so that readers can directly compare licking behavior with the neuronal data.

Concerning the second part of the comment, interestingly, we find that the neuronal responses and licking during overtraining are dissociable. After prolonged training, the mice initially respond to both cues by licking, but for the low-probability cue, this lick rate goes down during the tone and delay period, whereas with the high-probability cue the mice keep on licking at the same rate. The neuronal data, on the other hand, seem to show the opposite pattern. There is a higher activation after the high-probability cue, but the ΔF/F signal goes down rapidly. For the low-probability cue, even though the response is smaller, the ΔF/F signal stays more or less the same during the cue and the reward-delay period.

Reviewer #3:

The authors address a long-standing question about the different functions of striosomal vs. matrix neurons in striatum. Using a new mouse line the authors have recorded striosomal neurons for the first time, a major achievement! Based on extensive anatomical evidence, largely by the senior author, striosomes have been proposed to serve an evaluation function during reinforcement learning, therefore I would have expected rather distinct responses. The authors did find some differences between striosome vs. matrix neurons, but the major conclusion for me was that they appeared rather similar. I think this should be communicated in the manuscript. There are also some technical issues that need to be addressed to support the conclusions about differences.

We are grateful indeed for the remarks of the reviewer. As we have noted in our responses above, we realize that we did not emphasize enough the similarities in responses of the striosomal and matrix neurons that we sampled in these experiments. We have now remedied this problem by many comments throughout the manuscript, and we have also edited the Abstract and title to this end.

1) One of their central findings is that striosomal SPNs provide more selective reward responses compared to their matrix counterparts. They show that normalized striosomal responses to reward-predictive cues and to reward are stronger. One potential issue lies in how this normalization is conducted. The authors use z-score normalization to compare across neurons and conditions. One possibility is the enhanced responses are due to increases in activity in response to task events; alternatively they are due to decreased baseline variability. Given that the authors use the striosomal-specific red fluorescence channel to align their image, it's possible that measured neuronal (not neuropil) activity within the matrix is subject to increased noise from imperfect alignment. This issue with baseline variability is visible in the example neurons shown in 5D. The proportion of striosomal vs. matrix task-modulated neurons might also be affected by differences in the noise floor.

We would like to thank the reviewer for addressing these important issues. We agree that the method of normalization is critical for this study. We have carefully considered this issue and have added a table with information about baseline fluorescence in striosomal and matrix neurons (Table 3). We find that there are no differences between striosomes and matrix regarding the means, standard deviations or maxima of the baseline ΔF/F values, indicating that the observed differences between striosomes and matrix cannot be accounted for by differences in the mean and standard deviation that were used to calculate the z-scores.
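The reasoning here rests on how a baseline z-score is formed: each sample is shifted by the baseline mean and divided by the baseline standard deviation, so if those two quantities match between compartments, a larger z-score must come from a larger response. A minimal sketch, with a hypothetical function name and an assumed baseline window and sample standard deviation (ddof=1):

```python
import numpy as np

def baseline_zscore(dff, baseline_idx):
    """z = (x - mean_baseline) / sd_baseline for one dF/F trace.

    Assumption: the baseline window and ddof=1 are illustrative choices,
    not details taken from the manuscript.
    """
    x = np.asarray(dff, dtype=float)
    base = x[baseline_idx]
    return (x - base.mean()) / base.std(ddof=1)
```

For instance, `baseline_zscore([0, 1, 2, 5], slice(0, 3))` uses a baseline mean of 1 and a standard deviation of 1; halving the baseline variability while keeping the response fixed would double the z-score, which is the confound the reviewer raised.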

In response to the reviewer’s question about alignment, we are also grateful to have the chance to add to our revision. To test how the realignment using the two different channels affects the signals in the striosomes and matrix, we have realigned the GCaMP recordings on the basis of the translation coordinates that were calculated using each channel and have then correlated the fluorescent traces from individual neurons using both alignment methods. We have found very high correlations between the measurements, and these were similar for striosomes and matrix (r = 0.9971 for striosomes and 0.9978 for matrix). These results are now included in the Materials and methods section of the manuscript.

2) It would be informative to quantify the variability in timing of responses across trials for both striosomal and matrix populations.

We thank the reviewer for this suggestion. We have now performed additional analysis to address this issue. We computed a reliability index, which captures the trial-to-trial variability in responses during the three epochs of the task for striosomal and matrix neurons. Reliability was measured by taking the average correlation of all pairwise combinations of rewarded high-probability cue trials. This analysis demonstrated no differences in the mean response variability between striosomal and matrix neurons.
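A reliability index of this kind, the average correlation over all pairs of trials, can be sketched as below. The function name and the input layout (one row per trial, one column per time point within an epoch) are assumptions for illustration; the authors' exact implementation is not given in the response.

```python
import numpy as np

def reliability_index(trials):
    """Mean Pearson correlation over all pairwise combinations of trials.

    trials: array of shape (n_trials, n_timepoints) for one neuron and epoch.
    A trial with zero variance yields NaN correlations, which propagate.
    """
    x = np.asarray(trials, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two trials")
    r = np.corrcoef(x)                      # trial-by-trial correlation matrix
    upper = np.triu_indices(x.shape[0], k=1)  # each pair counted once
    return float(np.mean(r[upper]))
```

Values near 1 indicate that the time course of the response repeats from trial to trial, and values near 0 indicate little shared structure.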

3) The finding that striatal SPNs "tiled the temporal space of the task" needs to be backed up by appropriate controls. Is the apparent tiling of temporal space in the 2d plots in 4G and 5F due to variability inherent in measuring this timing given limited numbers of trials? The authors could address this by quantifying the peak time and spread (standard deviation or other metric) of each neuron. Then they can compare these values to a synthetic population of neurons by shuffling neuron labels for each trial.

We thank the reviewer for raising this important issue. We followed the reviewer’s advice and computed the standard deviation of peak response times for neurons that were active during the post-reward epoch. We only analyzed sessions in which at least ten post-reward active neurons were recorded simultaneously. For each active neuron, we constructed a ‘shuffled’ data set by shuffling the neuron labels (other active neurons within the same population) for each trial 20 times and determined the standard deviation of peak response time (across trials) for each of these shuffles. These 20 values were averaged to estimate the standard deviation for each shuffled data set. Comparing standard deviation values from observed and shuffled data showed more variability in the shuffled data.
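The shuffle control described above can be sketched roughly as follows. This is one reading of the procedure, not the authors' code: the function names, the array layout `(n_neurons, n_trials, n_time)`, the use of the frame of maximum response as the peak time, and the fixed random seed are all assumptions made for illustration.

```python
import numpy as np

def peak_time_sd(responses):
    """SD across trials of each neuron's peak frame.

    responses: array of shape (n_neurons, n_trials, n_time).
    """
    x = np.asarray(responses, dtype=float)
    peaks = np.argmax(x, axis=2)           # peak frame per neuron and trial
    return peaks.std(axis=1, ddof=1)

def shuffled_peak_time_sd(responses, n_shuffles=20, seed=0):
    """Average peak-time SD after permuting neuron labels within each trial."""
    rng = np.random.default_rng(seed)
    x = np.asarray(responses, dtype=float)
    n_neurons, n_trials, _ = x.shape
    sds = np.empty((n_shuffles, n_neurons))
    for s in range(n_shuffles):
        shuffled = np.empty_like(x)
        for t in range(n_trials):
            # Independently reassign neuron labels on every trial.
            shuffled[:, t, :] = x[rng.permutation(n_neurons), t, :]
        sds[s] = peak_time_sd(shuffled)
    return sds.mean(axis=0)                # mean over the shuffles
```

If neurons have consistent, neuron-specific peak times, `peak_time_sd` on the observed data should be smaller than the shuffled estimate, which matches the direction of the result reported above.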

As a related analysis, we used the same shuffling procedure and calculated the reliability of responses for observed and shuffled data. This analysis showed that observed data had more reliable responses than the shuffled data. Sorting data by peak time is expected to artificially create a tiling effect. Hence, we compared sorted observed and shuffled responses by quantifying the ridge-to-background ratio, which measures the relative magnitude of responses close to the peak relative to other time points during the trial. Observed data had a higher ratio than the shuffled data. Together, these control analyses suggest that there is structure in the timing of single-neuron responses during the task.
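The response does not give a formula for the ridge-to-background ratio, so the sketch below shows only one plausible definition: the mean response within a few frames of each neuron's peak, divided by the mean response at all other time points. The function name, the `half_width` parameter, and the assumption of non-negative responses (so that the ratio is meaningful) are all hypothetical.

```python
import numpy as np

def ridge_to_background(mean_responses, half_width=1):
    """Ratio of near-peak to off-peak response, pooled over neurons.

    mean_responses: array (n_neurons, n_time) of trial-averaged responses,
    assumed non-negative. half_width is an illustrative window size.
    """
    x = np.asarray(mean_responses, dtype=float)
    n_neurons, n_time = x.shape
    peaks = np.argmax(x, axis=1)
    frames = np.arange(n_time)
    # True where a frame lies within half_width of that neuron's peak.
    near_peak = np.abs(frames[None, :] - peaks[:, None]) <= half_width
    ridge = x[near_peak].mean()
    background = x[~near_peak].mean()
    return float(ridge / background)
```

Under this definition a flat population gives a ratio of 1, and a population with sharp, consistent peaks gives a ratio well above 1, so a higher value for observed than for shuffled data would indicate tiling beyond what sorting alone produces.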

4) The division and analysis of neuronal responses across acquisition, criterion and overtraining phases is useful and informative, but also somewhat arbitrary. It would be more informative to quantify how responses evolve on a more continuous basis relative to conditioned behavior. Relatedly, they could explicitly examine stimulus-evoked activity by performing a linear regression to model how activity is explained by licking behavior. The expectation is that the residuals from this fit would be enhanced for striosomal SPNs for cue and reward relative to matrix SPNs.

We thank the reviewer for suggesting this wonderful idea. We have followed this recommendation and have added the results of this analysis to Figure 7 in the form of a supplement (Figure 7—figure supplement 1) and a table (Table 5), and to the text (subsection “During overtraining, tone-related responses of striosomal neurons intensify and become increasingly selective for high-probability tones”, last paragraph). To summarize, we have tried to predict neuropil ΔF/F responses on the basis of tone-evoked licking. We did this separately for both tones and also for the difference in response between the two tones. First, we made separate models for striosomal and matrix responses and found that in both compartments, tone-evoked licking is a significant predictor of the neuropil response in the case of the high-probability tone but not the low-probability tone. In addition, the difference in licking between the two tones was a significant predictor of the difference in the striosomal response between the cues, but not in the matrix. Secondly, we made one model for combined striosomal and matrix responses and then quantified the residuals of both compartments. Here we found that the residuals of the striosomes are significantly larger than for the matrix for both cues and for the difference between them. Together these results demonstrate that for the high-probability cue, sessions in which the tone-evoked licking is greater, the neuropil response also is greater, and that this relationship holds for both compartments, but more so for striosomes. For the low-probability cue, there is no relationship between tone-evoked licking and the tone-evoked neuropil responses. Finally, the differential licking response between the two cues predicts the difference in the striosomal neuropil response but not the matrix neuropil response.
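The second, combined model can be illustrated with an ordinary least-squares line fit followed by a split of the residuals by compartment. This is a sketch under assumptions, not the analysis behind Table 5: the function name, the one-predictor linear form, and the per-session input vectors are chosen here for illustration, and no significance test is included.

```python
import numpy as np

def residuals_by_compartment(licking, response, is_striosome):
    """Fit one line to all sessions, then return residuals per compartment.

    licking, response: per-session tone-evoked licking and neuropil dF/F.
    is_striosome: boolean flag per entry (True = striosome, False = matrix).
    """
    licking = np.asarray(licking, dtype=float)
    response = np.asarray(response, dtype=float)
    is_striosome = np.asarray(is_striosome, dtype=bool)
    slope, intercept = np.polyfit(licking, response, deg=1)  # combined model
    resid = response - (slope * licking + intercept)
    return resid[is_striosome], resid[~is_striosome]
```

Positive residuals for one compartment mean that its responses exceed what licking alone predicts from the pooled fit, which is the sense in which the striosomal residuals are described above as larger than the matrix residuals.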

Associated Data

    Supplementary Materials

    Transparent reporting form
    DOI: 10.7554/eLife.32353.020

    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
