Abstract
The striatum plays critical roles in visually-guided decision-making and receives dense axonal projections from midbrain dopamine neurons. However, the roles of striatal dopamine in visual decision-making are poorly understood. We trained male and female mice to perform a visual decision task with asymmetric reward payoff, and we recorded the activity of dopamine axons innervating striatum. Dopamine axons in the dorsomedial striatum (DMS) responded to contralateral visual stimuli and contralateral rewarded actions. Neural responses to contralateral stimuli could not be explained by orienting behavior such as eye movements. Moreover, these contralateral stimulus responses persisted in sessions where the animals were instructed to not move to obtain reward, further indicating that these signals are stimulus-related. Lastly, we show that DMS dopamine signals were qualitatively different from dopamine signals in the ventral striatum (VS), which responded to both ipsilateral and contralateral stimuli, conforming to canonical prediction error signaling under sensory uncertainty. Thus, during visual decisions, DMS dopamine encodes visual stimuli and rewarded actions in a lateralized fashion, and could facilitate associations between specific visual stimuli and actions.
Significance Statement
While the striatum is central to goal-directed behavior, the precise roles of its rich dopaminergic innervation in perceptual decision-making are poorly understood. We found that in a visual decision task, dopamine axons in the dorsomedial striatum (DMS) signaled stimuli presented contralaterally to the recorded hemisphere, as well as the onset of rewarded actions. Stimulus-evoked signals persisted in a no-movement task variant. We distinguish the patterns of these signals from those in the ventral striatum (VS). Our results contribute to the characterization of region-specific dopaminergic signaling in the striatum and highlight a role in stimulus-action association learning.
Keywords: dopamine, dorsal striatum, mice, sensory uncertainty, ventral striatum, visual decision
Introduction
Central to survival is the ability to execute appropriate actions based on incoming visual information to obtain rewards. Dorsal striatum plays critical roles in visually-guided decision-making (Hikosaka et al., 2006; Ding and Gold, 2013). Previous studies have identified prominent projections from visual cortical areas to the dorsal striatum (Khibnik et al., 2014; Hintiryan et al., 2016; Hunnicutt et al., 2016) and have shown that neurons in the dorsal striatum are active during visually-guided behavior, responding to contralateral visual stimuli (Hikosaka et al., 1989; Kawagoe et al., 2004; Peters et al., 2021), reflecting visual evidence accumulation during decision-making (Ding and Gold, 2010), and contributing causally to visual decisions (Doi et al., 2020). In addition to cortical inputs, striatum receives dense axonal projections from midbrain dopamine neurons (Björklund and Dunnett, 2007; Haber, 2014). However, the roles of striatal dopamine in visual decision-making have remained relatively unknown.
Several lines of evidence suggest that dopamine signals in the dorsal striatum play crucial roles in visual decision-making. First, the activity of midbrain dopamine neurons correlates with statistical decision confidence during visual decision-making (Lak et al., 2017, 2020). Second, dopamine depletion in dorsal striatum alters striatal sensory responses (Ketzef et al., 2017). Third, manipulation of cortico-striatal neurons, terminating in the dorsal striatum, biases choices in two-alternative sensory decision tasks (Znamenskiy and Zador, 2013). Fourth, the strength of cortico-striatal synapses increases in a stimulus-selective manner as animals learn to perform a sensory decision task (Xiong et al., 2015), and these synapses are strongly modulated by dopamine signals innervating the dorsal striatum (Reynolds and Wickens, 2002; Calabresi et al., 2007). Therefore, striatal dopamine signals are well placed to entrain associations between stimuli and actions during visual decisions.
We recorded the activity of dopamine axons in the striatum in mice trained to perform a visual decision task with asymmetric reward payoff. We found that dopamine axon activity in the dorsomedial striatum (DMS) encoded the contrast of contralateral visual stimuli, regardless of subsequent movement direction. In fact, the contralateral stimulus responses persisted in a task in which the stimulus instructed animals specifically not to move to receive the reward, indicating that these responses are driven by the contralateral stimulus rather than by the action that follows stimulus presentation. Additionally, we observed contralateral action-aligned signals in these DMS dopamine axons, but only in rewarded trials. For comparison, we also recorded the activity of dopamine axons in the ventral striatum (VS), which responded to both ipsilateral and contralateral stimuli and trial outcomes, and conformed to canonical prediction error signaling under sensory uncertainty. These results reveal distinct roles for dopamine signals in different regions of striatum during visual decisions, and suggest that DMS dopamine signals could facilitate associations between contralateral visual stimuli and contralateral actions.
Materials and Methods
Mice and surgeries
The presented data were collected from six male and three female mice (DAT-Cre backcrossed with C57BL/6J; B6.SJL-Slc6a3tm1.1(cre)Bkmn/J; https://www.jax.org/strain/006302) aged between 10 and 24 weeks. Mice underwent surgery during which a metal headplate was implanted, as well as either one or two optic fibers following viral injection. Mice were anaesthetized with isoflurane [induction: 3% in 100% oxygen (0.5 l/min), and maintenance: 1.5% in 100% oxygen (0.5 l/min)] on a heating pad (ATC2000, World Precision Instruments). Hair and skin were removed from the dorsal surface of the skull, which was subsequently washed with saline and sterile cortex buffer. The headplate was then attached with dental cement (Super-Bond C&B; Sun Medical) to the bone posterior to bregma. Next, we made a craniotomy over VTA/SNc and injected 0.5 µl diluted viral construct (0.25 µl of AAV1.Syn.Flex.GCaMP6m.WPRE.SV40 diluted in 0.25 µl of PBS) at ML: 0.5 mm from midline, AP: −3 mm from bregma, DV: 4.4 mm from dura. An optic fiber (400 µm, NA: 0.48, Doric Lenses Inc.) was implanted over nucleus accumbens (NAc) (ML: 1 mm, AP: 1.25 mm, DV: −3.8 mm) in four mice (one mouse was implanted in both left and right NAc, thus the data were collected from five brain hemispheres in total), and in the DMS (ML: 1 mm, AP: 1.25 mm, DV: −2.5 mm) in five mice (two mice were implanted in both left and right DMS, thus DMS data were collected from seven brain hemispheres in total). The fiber was also set in place with dental cement covering the rest of the exposed skull. For pain relief, Carprofen was provided in the cage water for 3 d after surgery (0.1 ml of 5% Carprofen mixed with 150 ml of filtered tap water in the cage bottle). The implanted fibers did not substantially influence the decision-making behavior of mice compared with animals without fiber implants performing the same task (p = 0.43, Wilcoxon rank-sum test).
All experiments were conducted according to the United Kingdom Animals Scientific Procedures Act (1986) under appropriate project and personal licenses.
Behavioral tasks
After 7 d of recovery from surgery, mice were placed on water control and, following 3 d of handling and acclimatization, began training in the two-alternative forced-choice visual detection task (Burgess et al., 2017; Lak et al., 2020). Mice were trained using water as a reward. After the task, they received top-up fluids to achieve a minimum daily amount of 40 ml/kg/d. Body weight and potential signs of dehydration were monitored daily.
In each daily session, mice were head-fixed with their forepaws resting on a steering wheel (diameter: 62 mm). Trials began with an auditory tone (0.1 s, 12 kHz, ∼40–50 dB) after the wheel was held still for at least 0.6 s (quiescence period); 0.7 s after the tone, a sinusoidal grating of varying contrast appeared on either the left or right side of the screen (19", Iiyama, intensity measured in full black and full white: 1.3 and 201 Lux), positioned in front of the mouse (Fig. 1A,B). This was followed by a 0.6–1.8 s open loop period, during which mice could move the wheel but with no effect on the position of the grating. At the end of the open loop period, a distinct auditory tone marked the beginning of the closed loop period, during which mice were able to use the wheel to move the stimulus to the center of the screen to obtain a water reward. Water reward volume was either 1.4 or 2.4 µl, depending on block and stimulus side (Fig. 1C). During training, task parameters such as quiescence period, stimulus contrast, and open loop duration were gradually adjusted to make the task more difficult. Within two weeks, mice had usually mastered the task, frequently performing above 85% correct (across all stimulus contrasts). In this task, the correct action to a stimulus on the left of the screen is to turn the wheel clockwise, which moves the stimulus from the left to center. We refer to this action as a “contralateral” action when recording from the right striatum (and vice versa for recordings in the left striatum).
Some mice (n = 3) were additionally trained to perform a task variant that required refraining from wheel movements. In this task, mice were trained to keep the wheel still before and after the stimulus onset, thus there was no wheel movement during correct trials. Following a 1 s quiescence period (i.e., no wheel movement), trials began with a grating stimulus appearing on the left or the right side of the screen. Mice were rewarded (2 µl of water) for holding the wheel still for an additional 1.5 s. Wheel movement after the stimulus resulted in abortion of the trial and an auditory white noise.
The behavioral experiments were run using custom-made software written in MATLAB (MathWorks), which is freely available (Bhagat et al., 2020). Instructions for both the software and the hardware assembly are freely accessible at www.ucl.ac.uk/cortexlab/tools/wheel.
Eye tracking
In 31 sessions, we recorded 30-Hz video footage of the left eye. We used a camera (DMK 21BU04.H or DMK 23U618, The Imaging Source) with a zoom lens (ThorLabs MVL7000) focused on the left eye. To avoid contamination of the image by reflected monitor light relating to visual stimuli, the eye was illuminated with a focused infrared LED (SLS-0208A, Mightex; driven with LEDD1B, ThorLabs) and an infrared filter was used on the camera (FEL0750, ThorLabs; with adapters SM2A53, SM2A6, and SM1L03, ThorLabs). We acquired videos with MATLAB's Image Acquisition Toolbox (MathWorks).
Fiber photometry
Dopamine axon activity was measured using fiber photometry (Gunaydin et al., 2014; Lerner et al., 2015). We used two excitation wavelengths (465 and 405 nm) modulated at distinct carrier frequencies (214 and 530 Hz) to allow ratiometric measurements of calcium-dependent and calcium-independent (i.e., motion-related) changes in fluorescence. Light collection, filtering, and demodulation were performed as previously described (Lak et al., 2020) using a Doric photometry setup and Doric Neuroscience Studio Software (Doric Lenses Inc.). For each behavioral session, a least-squares linear fit was applied to the 405-nm isosbestic control signal, and the ΔF/F time series was then calculated as [(465-nm signal – fitted 405-nm signal)/fitted 405-nm signal].
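The ΔF/F computation described above can be illustrated with a minimal Python sketch. It assumes that the linear fit scales the 405-nm trace onto the 465-nm trace, which is a common convention but is not spelled out in the text. The function name `delta_f_over_f` and all synthetic values are hypothetical and are not taken from the original analysis code.

```python
import numpy as np

def delta_f_over_f(signal_465, control_405):
    """dF/F from a calcium-dependent (465 nm) and an isosbestic (405 nm) trace."""
    signal_465 = np.asarray(signal_465, dtype=float)
    control_405 = np.asarray(control_405, dtype=float)
    # Assumption: the 405-nm trace is scaled onto the 465-nm trace with a
    # degree-1 least-squares fit (signal_465 ~ slope * control_405 + intercept).
    slope, intercept = np.polyfit(control_405, signal_465, 1)
    fitted_405 = slope * control_405 + intercept
    return (signal_465 - fitted_405) / fitted_405

# Synthetic session (all values hypothetical): shared slow drift plus one transient.
rng = np.random.default_rng(0)
t = np.arange(0, 60, 0.01)                       # 60 s sampled at 100 Hz
drift = 1.0 + 0.2 * np.exp(-t / 30)              # bleaching-like artifact in both channels
transient = 0.05 * np.exp(-((t - 30) ** 2) / 0.5)
control_405 = 0.5 * drift + 0.001 * rng.standard_normal(t.size)
signal_465 = 2.0 * drift * (1 + transient) + 0.001 * rng.standard_normal(t.size)

dff = delta_f_over_f(signal_465, control_405)
print(f"peak dF/F = {dff.max():.3f} at t = {t[dff.argmax()]:.2f} s")
```

In this toy example the shared drift is removed by the fitted control, so the only remaining structure in `dff` is the simulated calcium transient.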
Histology and anatomic verifications
To verify the expression of viral constructs, we performed histologic examination. Mice were anesthetized and perfused, brains were fixed, and 60 µm coronal sections were collected. Confocal images of the sections were obtained using a Zeiss 880 Airyscan microscope. We confirmed viral expression and fiber placement in all mice. The anatomic locations of implanted optical fibers were determined from the tip of the longest fiber track found, and matched with the corresponding Paxinos atlas slide (Fig. 1E–G).
Statistical analyses
The presented analyses include 24,495 behavioral and neural trials (after the initial task learning was completed) recorded over a total of 87 sessions in nine mice. The minimum and maximum number of trials per session were 103 and 640.
Normalization of neural activity
The neural responses collected in each session were first normalized by calculating z-scored ΔF/F. The data were further normalized by dividing the z-scored responses by the peak of the averaged neural response to stimuli with the highest contrast in each session. This ensured that the results, when averaged across sessions or animals, were not dominated by a small number of sessions or animals with stronger signals. We then averaged across all sessions of each animal before averaging the data across mice. These data were used for visualizing neural responses across time. For calculating neural responses in a specific time bin with respect to task events, we used the normalized data as described above, and we subtracted the activity during a window before each event in each trial (−0.25 to 0 s) from the activity during a window (0.4–0.8 s) after the event in the same trial (using a 0.1–0.4 s postevent analysis window yielded comparable results in all our analyses). For animals with bilateral recordings, we first averaged the data across the two hemispheres (by grouping the data into ipsilateral and contralateral with respect to each recorded hemisphere), before averaging the data across mice.
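The per-session normalization and the baseline-subtracted event response could be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the helper names, the 0–1 s search window for the peak, and the synthetic trace are all hypothetical.

```python
import numpy as np

def event_aligned(trace, event_idx, fs, t_start, t_stop):
    """Stack trace snippets spanning t_start..t_stop (s) around each event sample."""
    offsets = np.arange(int(round(t_start * fs)), int(round(t_stop * fs)))
    return trace[np.asarray(event_idx)[:, None] + offsets[None, :]]

def normalize_session(dff, stim_idx, is_high_contrast, fs):
    """Z-score dF/F, then divide by the peak mean response to the highest contrast."""
    z = (dff - dff.mean()) / dff.std()
    # Assumption: the peak is searched within 0-1 s after stimulus onset.
    mean_high = event_aligned(z, stim_idx[is_high_contrast], fs, 0.0, 1.0).mean(axis=0)
    return z / mean_high.max()

def event_response(trace, event_idx, fs, base=(-0.25, 0.0), post=(0.4, 0.8)):
    """Per-trial response: mean in the post window minus mean in the baseline window."""
    baseline = event_aligned(trace, event_idx, fs, *base).mean(axis=1)
    evoked = event_aligned(trace, event_idx, fs, *post).mean(axis=1)
    return evoked - baseline

# Synthetic session (hypothetical values): 20 trials alternating 0 and 0.5 contrast.
rng = np.random.default_rng(1)
fs = 100
stim_idx = 500 + 300 * np.arange(20)
contrast = np.tile([0.0, 0.5], 10)
dff = 0.01 * rng.standard_normal(stim_idx[-1] + 300)
for idx, c in zip(stim_idx, contrast):
    dff[idx + 40: idx + 80] += 0.05 * c          # stimulus-evoked bump 0.4-0.8 s after onset

norm = normalize_session(dff, stim_idx, contrast == contrast.max(), fs)
resp = event_response(norm, stim_idx, fs)
print(resp[contrast == 0.5].mean(), resp[contrast == 0.0].mean())
```

After this normalization, the mean response to the highest-contrast stimulus peaks at 1 in every session, which is what prevents sessions with stronger raw signals from dominating the average.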
Pairwise comparisons and ANOVAs
We used neural responses measured in a specific time window after each task event (see above for the normalization and analysis time windows used). To test for statistical significance in the behavioral and neural data, we used standard statistical tests (Wilcoxon rank-sum test or ANOVA across trials) as specified in each instance in Results.
Cross-validated regression analysis of neural data
To quantify the extent to which different trial features determined the magnitude of neural responses to stimuli on a trial-by-trial basis, we modeled the change in z-scored ΔF/F from before to after stimulus onset (using the temporal windows specified above) in a given trial j, which we denote as Rj, as:
$$R_j = \beta_0 + \beta_1 c_j + \beta_2 i_j + \beta_3 v_j$$
where cj reflects the contrast of the contralateral stimulus, ij reflects the contrast of the ipsilateral stimulus, and vj reflects the value of the pending reward (0, 1.4, 2.4 for no reward, small reward, and large reward). Z-scored stimulus contrast and reward sizes were used in the regression. β1, β2, and β3 are the coefficient weights for these variables, and β0 is an offset capturing mean fluorescence over all conditions. We tested reduced versions of the model omitting one or two terms out of [β1*cj], [β2*ij], and [β3*vj] to assess their performance compared with the full model. We used fivefold cross-validation (i.e., using 80% of trials to estimate regression coefficients and the remaining 20% of trials to compute explained variance) to estimate the explained variance of the model variants (averaged over sessions), and to select the best regression model for the neural data (Figs. 2J, 3J). Comparing the nested models using other model comparison methods, such as the Akaike Information Criterion (AIC), yielded comparable results.
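As an illustration of the nested, fivefold cross-validated comparison, the following Python sketch fits every combination of the three regressors to one synthetic session. The data are simulated to depend on contralateral contrast only. The intermediate contrast levels, the noise level, and all function and variable names are assumptions made for the example.

```python
import numpy as np
from itertools import combinations

def zscore(x):
    return (x - x.mean()) / x.std()

def cv_explained_variance(X, y, n_folds=5, seed=0):
    """Mean held-out explained variance of an ordinary least-squares fit with offset."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        resid = y[test] - A_test @ beta
        scores.append(1.0 - resid.var() / y[test].var())
    return float(np.mean(scores))

# Synthetic session (hypothetical): responses driven by contralateral contrast only.
rng = np.random.default_rng(0)
n = 2000
side = rng.integers(0, 2, n)                         # 1 = contralateral stimulus
contrast = rng.choice([0.0, 0.125, 0.25, 0.5], n)    # assumed contrast levels
c, i = contrast * side, contrast * (1 - side)
v = rng.choice([0.0, 1.4, 2.4], n)                   # pending reward value
R = 0.5 + 1.0 * zscore(c) + rng.standard_normal(n)

names = ["contra", "ipsi", "value"]
X = np.column_stack([zscore(c), zscore(i), zscore(v)])
scores = {}
for k in (1, 2, 3):
    for cols in combinations(range(3), k):
        label = "+".join(names[j] for j in cols)
        scores[label] = cv_explained_variance(X[:, list(cols)], R)

for label, ev in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{label:20s} explained variance = {ev:.3f}")
```

In this simulation the contralateral-only model matches the full model in held-out explained variance, which is the pattern that would favor the simpler model.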
Eye movement analysis
Pupil location in 31 sessions was extracted from a 30-Hz video recording of the left eye using facemap (https://github.com/MouseLand/facemap; Fig. 4A). Pupil location was defined as the center of a 2D ellipse fitted to the pupil in each frame, and the trace was smoothed using a median filter (1-s window). 2D pupil location was projected along the single dimension of maximum variance (PCA), and then z-scored (Fig. 4B).
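The smoothing and projection steps applied to the fitted pupil centers can be sketched in Python. The sketch starts from an array of per-frame (x, y) centers and does not reproduce the facemap ellipse fitting itself. The helper names and the synthetic trajectory are hypothetical.

```python
import numpy as np

def median_filter(x, window):
    """Centered running median with edge padding; window must be odd."""
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=-1)

def pupil_position_1d(xy, fs=30, smooth_s=1.0):
    """Smooth 2D pupil centers, project onto the first principal axis, and z-score."""
    window = int(round(fs * smooth_s)) | 1           # force an odd number of frames
    smoothed = np.column_stack([median_filter(xy[:, k], window) for k in range(2)])
    centered = smoothed - smoothed.mean(axis=0)
    # The first right-singular vector is the direction of maximum variance (PCA).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    return (proj - proj.mean()) / proj.std()

# Synthetic trajectory (hypothetical): mostly horizontal drift with small vertical jitter.
rng = np.random.default_rng(2)
n_frames = 3000
x = 0.1 * np.cumsum(rng.standard_normal(n_frames))
y = 0.05 * rng.standard_normal(n_frames)
pos = pupil_position_1d(np.column_stack([x, y]))
print(pos[:5])
```

The sign of a principal axis is arbitrary, so the projected trace may need to be flipped so that positive values correspond to a known direction on the screen.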
To assess the relationship between trial-by-trial DMS GCaMP fluorescence, stimulus contrast, and pupil position, we used the following regression model:
$$R_j = \beta_0 + \beta_1 c_j + \beta_2 i_j + \beta_3 p_j$$
where Rj is the z-scored ΔF/F fluorescence averaged over a poststimulus window (0.4–0.8 s) in trial j, pj is the pupil position averaged over the same poststimulus window in trial j, cj denotes the contrast of contralateral stimulus and ij denotes the contrast of ipsilateral stimulus. Parameters (β0,β1,β2,β3) were fit by least-squares for each session separately. To illustrate the relationship between eye position and DMS dopamine signals (β3) after controlling for the confounding stimulus contrast (Fig. 4E), pupil position p was plotted against residual fluorescence R – (β0 + β1*c + β2*i). Using an analysis window of 0.1–0.4 s poststimulus produced similar results.
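A minimal Python sketch of this per-session fit and of the residual used for plotting is given below. The synthetic data are constructed so that pupil position follows stimulus side but has no direct effect on fluorescence. All names and values are hypothetical.

```python
import numpy as np

def fit_pupil_model(R, c, i, p):
    """Least-squares fit of R = b0 + b1*c + b2*i + b3*p; also return the contrast residual."""
    A = np.column_stack([np.ones_like(R), c, i, p])
    beta, *_ = np.linalg.lstsq(A, R, rcond=None)
    # Residual fluorescence after removing the fitted offset and contrast terms.
    residual = R - (beta[0] + beta[1] * c + beta[2] * i)
    return beta, residual

# Synthetic session (hypothetical): fluorescence depends on contralateral contrast only,
# while pupil position is correlated with stimulus side.
rng = np.random.default_rng(3)
n = 2000
side = rng.integers(0, 2, n)                         # 1 = contralateral stimulus
contrast = rng.choice([0.0, 0.125, 0.25, 0.5], n)    # assumed contrast levels
c, i = contrast * side, contrast * (1 - side)
p = (c - i) + 0.3 * rng.standard_normal(n)           # eyes drift toward the stimulus
R = 0.1 + 2.0 * c + 0.5 * rng.standard_normal(n)

beta, residual = fit_pupil_model(R, c, i, p)
print("b0, b1 (contra), b2 (ipsi), b3 (pupil):", np.round(beta, 3))
```

Plotting `p` against `residual` then shows whatever relationship between pupil position and fluorescence remains once the contrast terms have been removed, which in this simulation is close to zero.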
Results
A decision task requiring integration of sensory evidence and reward value
We trained mice (n = 9) in a two-alternative forced choice decision task that requires trial-by-trial evaluation of visual stimuli and reward values (Lak et al., 2020). Mice were head-fixed in front of a computer screen with their forepaws resting on a steering wheel. On each trial, a visual grating was displayed on either the left or right side of the screen at a variable contrast level, followed by an auditory go cue presented after a 0.6- to 1.8-s delay (Fig. 1A,B). Mice were rewarded for turning the wheel after this cue, thereby bringing the grating into the center of the screen (Burgess et al., 2017). In trials with no stimulus on the screen (zero contrast), mice received rewards in 50% of trials. The volume of reward delivered for correct left and right choices was asymmetric, and the side giving larger reward was switched (without any warning) between blocks of 100–500 trials (Fig. 1C; Lak et al., 2020). Mice learned to perform this task in two to three weeks of daily training. After the initial learning was completed, we collected 20,695 trials in 79 test sessions in nine mice. Mice could detect high-contrast (easy) stimuli with an accuracy >90%, and low-contrast (difficult) stimuli near chance levels. Moreover, mice adjusted their choices to reward contingencies: the psychometric curves were shifted toward the side paired with larger reward (Fig. 1D; Lak et al., 2020). The decisions were thus informed by both the strength of sensory evidence and the value of upcoming reward (contrast: F = 256.5, p < 0.000001, reward size: F = 112.6, p < 0.000001, ANOVA).
Dopamine axons in VS respond to both contralateral and ipsilateral visual stimuli and encode confidence-dependent prediction errors
While mice performed the task, we measured the activity of striatal dopamine axons using fiber photometry. We injected AAV containing Flex-GCaMP6m in the midbrain of DAT-Cre mice and implanted an optic fiber above the VS or DMS in different cohorts of mice (Fig. 1E–G).
The responses of VS dopamine axons to the visual stimuli scaled with expected reward size and with stimulus contrast but showed no difference between ipsilateral and contralateral stimuli (Fig. 2A–F). Following stimulus onset (i.e., before outcome onset, since a reward could only be received after the go cue), VS dopamine responses were graded to the contrast of the stimulus, regardless of whether the visual stimulus appeared contralateral or ipsilateral to the recorded hemisphere (contrast: F = 11.96, p < 0.00001, ipsi/contra: F = 0.39, p = 0.53, ANOVA; Fig. 2B). The responses were also scaled to the size of upcoming reward (F = 8.94, p = 0.0053, ANOVA; Fig. 2C,E) and were larger in correct trials than in error trials (F = 4.78, p = 0.007, ANOVA; Fig. 2D,F). In order to statistically quantify the effects of contrast of ipsilateral and contralateral stimuli and the value of pending outcomes on trial-by-trial responses of VS dopamine axons, we used regression models (see Materials and Methods). Specifically, we regressed time-binned neural responses against the contrast of ipsilateral stimulus, contrast of contralateral stimulus, and the value of upcoming reward. This regression indicated that neural responses significantly encoded the contrast of both ipsilateral and contralateral stimuli as well as upcoming reward value (p = 0.0003, p = 0.00001, and p = 0.007 for ipsilateral stimulus, contralateral stimulus, and upcoming reward, F = 45.9, p < 0.000001; Fig. 2I,J, left). We further confirmed these results using nested regressions that included one, two, or all regressors and used cross-validation to assess the predictive performance of each regressor (see Materials and Methods). These regressions confirmed that the full model, i.e., the model that included contrast of both ipsilateral and contralateral stimuli as well as upcoming reward value, accounts for the VS neural data better than models that include only one or two regressors (Fig. 2J, right).
Dopamine axons in VS appeared to encode neither the onset nor the direction of actions, i.e., the wheel movements for reporting choice. Action-locked signals in VS axons were present on average but absent in the subset of trials where the action was executed before the go cue and therefore did not lead to reward (p = 0.47, Wilcoxon rank-sum test), suggesting that this activity is not actually related to movement. In these trials with early movement, stimulus-related responses were also attenuated, consistent with previous observations that VS dopamine release following a reward-predicting cue is attenuated unless a movement is correctly initiated (Syed et al., 2016).
The VS dopamine signals at the time of outcome strongly encoded the reward size (Fig. 2G) and the confidence in obtaining the reward, being largest when the reward was received in a difficult trial (p < 0.05, Wilcoxon rank-sum test between 0 and 0.5 contrast for both small and large reward conditions; Fig. 2G).
These findings indicate that VS dopamine axons integrate reward value and sensory confidence. The VS dopamine signals at the times of both stimuli and outcomes resemble those we previously observed in VTA dopamine cell bodies during the same decision task (Lak et al., 2020). These responses resemble the prediction error term of a belief-state temporal difference (TD) reinforcement learning model that incorporates statistical decision confidence (i.e., subjective probability that the choice will turn out to be correct) into prediction error computation (compare Fig. 2E–G and H, adapted from Lak et al., 2020). In such models, the difference between correct and error trials can arise before choice execution, and can be explained by the difference in statistical choice confidence (see Discussion).
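The qualitative predictions of a confidence-dependent prediction error can be illustrated with a small signal-detection simulation in Python. This is a deliberately simplified sketch and not the belief-state TD model of Lak et al. (2020). It assumes Gaussian sensory noise, a flat prior over signed contrast (so that confidence equals the normal CDF of the absolute percept divided by the noise SD), a single reward size, and only two time points (stimulus and outcome). All names and parameter values are hypothetical.

```python
import numpy as np
from math import erf, sqrt

# Standard normal CDF applied elementwise (NumPy itself has no erf).
_norm_cdf = np.vectorize(lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0))))

def simulate_confidence_rpe(contrasts, reward=1.0, sigma=0.25, n_trials=20000, seed=0):
    """Mean predicted value and outcome prediction error per contrast."""
    rng = np.random.default_rng(seed)
    summary = {}
    for c in contrasts:
        side = rng.choice([-1.0, 1.0], n_trials)              # true stimulus side
        percept = side * c + sigma * rng.standard_normal(n_trials)
        choice = np.where(percept >= 0, 1.0, -1.0)
        correct = choice == side
        # Confidence: probability that the choice is correct given the percept.
        confidence = _norm_cdf(np.abs(percept) / sigma)
        predicted_value = confidence * reward                 # available before the outcome
        outcome_rpe = np.where(correct, reward, 0.0) - predicted_value
        summary[c] = {
            "value_correct": float(predicted_value[correct].mean()),
            "value_error": float(predicted_value[~correct].mean()),
            "outcome_rpe_correct": float(outcome_rpe[correct].mean()),
        }
    return summary

summary = simulate_confidence_rpe((0.0, 0.25, 0.5))
for c, stats in summary.items():
    print(c, {k: round(val, 3) for k, val in stats.items()})
```

Under these assumptions the outcome prediction error on rewarded trials is largest for the hardest stimuli, and the predicted value at stimulus time is higher in correct than in error trials at the same contrast, which mirrors the two qualitative features described above.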
Dopamine axons in DMS respond to contralateral but not ipsilateral visual stimuli
The stimulus-related activity of dopamine axons in the DMS differed from that in the VS in several ways (Fig. 3A–F; compare with Fig. 2A–F). First, dopamine axons in DMS responded to contralateral, but not ipsilateral, visual stimuli (Fig. 3B), and their responses scaled with the contrast of visual stimuli presented contralaterally (contralateral: F = 243.3, p < 0.00001, ipsilateral: F = 0.12, p = 0.94, ANOVA; Fig. 3B). Second, dopamine responses in DMS were largely insensitive to the value of upcoming reward (F = 0.93, p = 0.18, ANOVA; Fig. 3C,E) and to choice accuracy (F = 3.4, p = 0.09, ANOVA; Fig. 3D,F).
Lateralized responses to stimuli were evident in DMS dopamine signals from individual animals and in single trials (Fig. 3G,H). DMS dopamine axons recorded simultaneously bilaterally in individual animals responded strongly and rather exclusively to stimuli presented contralaterally: axons in the left and right hemispheres only responded to stimuli presented on the right and left side of the monitor respectively (Fig. 3G). Moreover, DMS dopamine axons showed robust responses to contralateral stimuli in individual trials of the task (Fig. 3H). In order to statistically quantify the effects of stimuli and outcomes on trial-by-trial responses of DMS dopamine axons, we used regression models identical to those used for analyzing VS dopamine signals (see Materials and Methods). The regression showed that neural responses encode the contrast of contralateral stimuli but not contrast of ipsilateral stimuli, nor the value of pending reward (p < 0.000001, p = 0.83 and p = 0.59 for contralateral stimulus, ipsilateral stimulus, upcoming reward; Fig. 3I,J, left). Nested cross-validated regressions further confirmed these results, showing that the contralateral stimulus regressor is sufficient to match the explained variance of the full model (Fig. 3J, right).
DMS dopamine responses to contralateral stimuli cannot be explained by eye movements
The responses of DMS dopamine axons to contralateral stimuli could not be attributed to orienting movements such as eye movements (Fig. 4). While head-fixed mice cannot orient their heads toward the presented stimulus, we reasoned that they might rapidly move their eyes toward the stimulus presented on one side of the monitor and that this could contribute to lateralized DMS dopamine responses. To assess this, we extracted trial-by-trial pupil position from the recorded videos in sessions in which we recorded DMS dopamine (Fig. 4A,B), and regressed dopamine signals against eye position and contra/ipsi stimulus contrast (see Materials and Methods). After controlling for the stimulus contrast, the regression indicated that DMS dopamine signals were not significantly correlated with pupil movement (p = 0.96). Rather, consistent with our previous analyses, these neural signals significantly reflected the contrast of contralateral visual stimuli (p < 0.00001; Fig. 4C–F). Thus, the responses of DMS dopamine axons reflect the contrast of contralateral stimuli, rather than orienting movements in response to those stimuli.
DMS dopamine responses to contralateral stimuli are not because of task motor requirements
Might the lateralized stimulus responses of DMS dopamine axons reflect some aspect of the upcoming planned movement, i.e., the directional wheel movements to report the choice? To test this, we measured DMS dopamine axon responses in a new “no movement” task. Mice were retrained to hold the wheel still for the whole trial: from 1 s before visual stimulus onset until 1.5 s after the visual stimulus, when they received reward (Fig. 5A,B). Wheel movement before the stimulus onset delayed the stimulus onset, and any wheel movement after stimulus onset aborted the trial (after an auditory white noise burst). After the initial training, we collected 3800 trials in eight test sessions in three mice. Mice learned to hold the wheel still in 40–60% of trials. We again observed strong responses of DMS dopamine axons in trials with contralateral visual stimuli and no wheel motion (ipsilateral vs contralateral: F = 110.7, p = 0.000001, contrast: F = 16, p = 0.00004, ANOVA; Fig. 5C). These results indicate that the contralateral visual responses of DMS dopamine axons are independent of the task's motor requirements: they appear regardless of whether the stimulus instructs the animal to move or to refrain from moving.
DMS dopamine axons encode a specific combination of stimuli and actions in a lateralized manner
During the decision task (Fig. 1), dopamine activity in DMS was modulated not only at the onset of contralateral stimuli but also at the onset of actions, i.e., the onset of wheel movements leading to choice (Fig. 6). In this task, the correct action to a stimulus on the left of the screen is to turn the wheel clockwise, which moves the stimulus from the left to center. We refer to this action as a contralateral action when recording from the right striatum (and vice versa for recordings in the left striatum). DMS dopamine axons in the hemisphere contralateral to the stimulus showed robust responses to contralateral action onset (F = 7.99, p = 0.0007, ANOVA; Fig. 6A) but not to ipsilateral action onset (F = 0.12, p = 0.94, ANOVA). These signals occurred only when the visual stimulus was present (non-zero contrast trials) on the contralateral side but did not otherwise correlate with stimulus contrast (F = 0.44, p = 0.64, ANOVA; Fig. 6B), or with the size of upcoming reward (F = 1.08, p = 0.35, ANOVA; Fig. 6C). These contralateral action responses of DMS dopamine axons could not be explained by the movement of the visual stimulus on the screen, because they persisted in trials where mice responded before the auditory go cue, when the visual stimulus had not yet moved (p = 0.021, Wilcoxon rank-sum test). Nevertheless, the magnitude of DMS dopamine activity during contralateral actions was larger for correct than incorrect trials (F = 12.41, p = 0.0011, ANOVA; Fig. 6D). Thus, in addition to encoding contralateral visual stimuli, DMS dopamine axons encode correct (rewarded) contralateral actions, consistent with previous reports in freely moving mice (Parker et al., 2016). We did not observe prominent responses to rewards in the DMS dopamine axons in the decision task, consistent with past studies (Howe and Dombeck, 2016).
Taken together, our results indicate that DMS dopamine axons encode a specific combination of stimuli and actions in a lateralized manner (Fig. 6E,F summarize these). First, the DMS axons responded following contralateral stimuli but not ipsilateral stimuli (Fig. 6E, left). Second, these contralateral stimulus responses were followed by responses at the time of contralateral actions (Fig. 6E, right). Third, these contralateral action responses depended on choice accuracy, i.e., whether the ongoing choice is correct (Fig. 6E, right).
Discussion
Our experiments reveal qualitatively distinct roles of dopamine circuitry across the striatum during visual decisions. Dopamine axons in DMS responded to stimuli and actions in a strongly lateralized manner, signaling only contralateral stimuli (largely regardless of the value of pending outcome) and rewarded, but not unrewarded (i.e., incorrect), contralateral actions. The contralateral DMS dopamine responses to stimuli could not be accounted for by eye movements toward stimuli, and persisted in a task variant with no movement, revealing the stimulus-related nature of these signals. For comparison, we also recorded dopamine axons in the VS, which responded to stimuli and outcomes, encoding the confidence in receiving reward and the value of pending and received reward. These responses were largely independent of stimulus position on the screen and action direction.
Our results demonstrate that DMS dopamine axon activity encodes contralateral visual stimuli in behavioral tasks both with and without movement. Contralateral action responses of DMS axons have been reported previously (Parker et al., 2016; Tsutsui-Kimura et al., 2020), but our experiments using visual decision tasks extend these results in two ways. First, lateralized DMS dopamine action signals depend on choice accuracy (i.e., for the same action they differ between error and correct trials). Second, DMS dopamine responses to visual stimuli are strongly lateralized. DMS dopamine responses to stimuli depended on the position and contrast of the stimulus and were evident regardless of whether the task required directional actions. Unlike in VS, the DMS dopamine responses before the outcome did not properly encode expected reward because they reflected stimulus contrast only unilaterally and had minimal encoding of reward size and choice accuracy.
The lateralized DMS dopamine signals we observed might shape various known features of dorsal striatal neuronal responses. Previous studies have identified prominent projections from visual cortical areas to the dorsal striatum (Khibnik et al., 2014; Hintiryan et al., 2016; Hunnicutt et al., 2016) and have shown that neurons in the dorsal striatum are particularly responsive to contralateral visual stimuli (Hikosaka et al., 1989; Peters et al., 2021). Given the role of dopamine signals in potentiating cortico-striatal synapses (Reynolds et al., 2001), their roles in rapid regulation of neuronal excitability in the striatum (Lahiri and Bevan, 2020), and evidence that striatal dopamine depletion alters striatal sensory responses (Ketzef et al., 2017), our results suggest that the lateralized dorsal striatal responses may be entrained by lateralized dopamine signals innervating this striatal region. Moreover, the graded response to stimulus contrast (which in our task determines the level of reward uncertainty) but limited encoding of pending reward value in the DMS dopamine axons might shape encoding of reward uncertainty observed in dorsal striatal neuronal responses (White and Monosov, 2016).
Our results help clarify the sensory versus action roles of dorsal striatal dopamine in visually-guided behavior. An early set of studies lesioned dorsal striatum dopamine unilaterally in a task in which freely-moving rats had to make a left or right movement to report the position of a flash of light. These studies concluded that the lesion-induced behavioral deficits (slow and impaired responses to contralateral stimuli) resulted from an impairment in the initiation of contralateral actions rather than from a deficit in localizing the contralateral stimulus (Carli et al., 1985; Brown and Robbins, 1989). Later studies using single-unit recording in primates or calcium imaging in mice showed that some dopamine neurons respond more strongly to contralateral than to ipsilateral visual stimuli (Kawagoe et al., 2004; Kim et al., 2015; Engelhard et al., 2019). Among these, Kim et al. (2015) recorded single putative dopamine neurons in primates performing simple visually-guided saccade tasks and demonstrated that a subgroup of dopamine neurons, located in the lateral substantia nigra and projecting to the caudate, respond more strongly to contralateral visual stimuli and respond to visual stimuli with little dependence on the reward value of the stimulus. These more recent studies therefore identify a strong sensory component in dopamine responses, akin to the DMS dopamine axon responses we observed in our visual decision task in mice. Further studies will be required to establish the precise causal impact of these signals in visual decisions.
Our results also reveal the encoding of confidence-dependent reward prediction errors in the mesolimbic dopamine pathway. The responses of dopamine axons in VS at the time of stimuli and trial outcome scale with the sensory evidence, choice accuracy, and reward value, resembling the prediction error term of a belief-state reinforcement learning model that incorporates statistical decision confidence (estimated, for instance, using signal detection theory) into prediction error estimation (Lak et al., 2020). These VS dopamine signals are similar to the responses of dopamine cell bodies in the VTA imaged in the same task in mice (Lak et al., 2020), and to the spiking activity of putative individual dopamine neurons recorded in a similar task in primates (Lak et al., 2017), which also encoded prediction errors scaled to the statistical confidence in obtaining the reward as well as to reward value. In the VS dopamine axon signals, as in our previous recordings from dopamine cell bodies (Lak et al., 2017, 2020), the difference between correct and error trials emerged before the trial outcome. These early differences could be accounted for by the belief-state reinforcement learning model because, in such models, the choice confidence can be estimated before the choice is executed, and it is lower in error trials than in correct trials. Thus, the VTA confidence-dependent dopamine signals appear to be carried forward to ventral regions of striatum. On the other hand, the lateralized DMS dopamine signals to stimuli and actions cannot be explained by the canonical prediction error framework, as has been shown previously in the case of the action signals (Howard et al., 2017; Lee et al., 2019; Tsutsui-Kimura et al., 2020).
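To make the notion of a confidence-dependent prediction error concrete, the following minimal Python sketch illustrates one way such a computation could be set up. It is an illustration under stated assumptions, not the fitted model of Lak et al. (2020). The function names, the flat prior over stimulus position, the Gaussian sensory noise with standard deviation `sigma`, the contrast values, and the equal reward sizes used in the summary are all assumptions introduced here. The pre-outcome signal is taken to be the confidence-weighted expected value of the chosen action, and the outcome prediction error is the reward minus that expectation.

```python
import math
import random


def belief_right(percept, sigma):
    """P(stimulus on the right | percept), assuming a flat prior and Gaussian noise."""
    return 0.5 * (1.0 + math.erf(percept / (sigma * math.sqrt(2.0))))


def simulate_trial(contrast, q_left, q_right, sigma, rng):
    """Simulate one trial; contrast is signed and nonzero (positive = right)."""
    percept = contrast + rng.gauss(0.0, sigma)  # noisy internal estimate
    p_right = belief_right(percept, sigma)      # belief state (decision confidence)
    ev_left = (1.0 - p_right) * q_left          # confidence-weighted action values
    ev_right = p_right * q_right
    choose_right = ev_right >= ev_left
    expected = ev_right if choose_right else ev_left
    correct = choose_right == (contrast > 0)
    reward = (q_right if choose_right else q_left) if correct else 0.0
    # Outcome prediction error: obtained reward minus confidence-weighted expectation.
    return correct, expected, reward - expected


def mean_expected_value(contrast, n_trials=20000, sigma=0.25, seed=0):
    """Mean pre-outcome expected value, split by correct and error trials."""
    rng = random.Random(seed)
    sums = {"correct": 0.0, "error": 0.0}
    counts = {"correct": 0, "error": 0}
    for _ in range(n_trials):
        correct, expected, _delta = simulate_trial(contrast, 1.0, 1.0, sigma, rng)
        key = "correct" if correct else "error"
        sums[key] += expected
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums if counts[k] > 0}
```

Under these assumptions, `mean_expected_value(0.25)` returns a lower pre-outcome expectation for error trials than for correct trials, because percepts that lead to errors tend to lie closer to the decision boundary. This is the sense in which a belief-state model can produce a correct versus error difference before the outcome is known.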
Our findings are consistent with the idea that dopamine projections to dorsal striatum promote the association between contralateral stimuli and contralateral actions, whereas projections to VS promote the association between stimuli and outcomes. Dorsal striatum is necessary for executing lateralized goal-directed actions and for maintaining stimulus-action associations (Miklyaeva et al., 1994; Brasted et al., 1997; Jog et al., 1999; Featherstone and McDonald, 2005; Yin et al., 2005; Balleine et al., 2007; Tai et al., 2012). During sensory decision-making, manipulation of cortico-striatal neurons that terminate in the dorsal striatum biases choices in a two-alternative sensory decision task (Znamenskiy and Zador, 2013). Moreover, the strength of cortico-striatal synapses increases in a stimulus-selective manner as animals learn to perform a sensory decision task (Xiong et al., 2015). These synapses are strongly influenced by dopamine. Accordingly, the DMS dopamine responses to contralateral stimuli and contralateral rewarded actions we observed here might contribute to forming associations between specific stimuli and actions. Our results on dopamine axons in the VS are consistent with the role of this striatal region, and of dopamine in this region, in forming stimulus-outcome associations (Robbins and Everitt, 1992; Rothenhoefer et al., 2017). Thus, anatomically-organized dopamine modulation of striatum can support distinct associations between stimuli, actions and outcomes, thereby refining goal-directed decisions.
Footnotes
This work was supported by Wellcome Trust Grants 213465 (to A.L.) and 205093 (to M.C. and K.D.H.). M.C. holds the GlaxoSmithKline/Fight for Sight Chair in Visual Neuroscience. We thank Rakesh K. Raghupathy and Laura Funnell for histology and Michael Krumin for technical assistance.
The authors declare no competing financial interests.
References
- Balleine BW, Delgado MR, Hikosaka O (2007) The role of the dorsal striatum in reward and decision-making. J Neurosci 27:8161–8165. 10.1523/JNEUROSCI.1554-07.2007
- Bhagat J, Wells MJ, Harris KD, Carandini M, Burgess CP (2020) Rigbox: an open-source toolbox for probing neurons and behavior. eNeuro 7:ENEURO.0406-19.2020. 10.1523/ENEURO.0406-19.2020
- Björklund A, Dunnett SB (2007) Dopamine neuron systems in the brain: an update. Trends Neurosci 30:194–202. 10.1016/j.tins.2007.03.006
- Brasted PJ, Humby T, Dunnett SB, Robbins TW (1997) Unilateral lesions of the dorsal striatum in rats disrupt responding in egocentric space. J Neurosci 17:8919–8926. 10.1523/JNEUROSCI.17-22-08919.1997
- Brown VJ, Robbins TW (1989) Deficits in response space following unilateral striatal dopamine depletion in the rat. J Neurosci 9:983–989. 10.1523/JNEUROSCI.09-03-00983.1989
- Burgess CP, Lak A, Steinmetz NA, Zatka-Haas P, Bai Reddy C, Jacobs EAK, Linden JF, Paton JJ, Ranson A, Schröder S, Soares S, Wells MJ, Wool LE, Harris KD, Carandini M (2017) High-yield methods for accurate two-alternative visual psychophysics in head-fixed mice. Cell Rep 20:2513–2524. 10.1016/j.celrep.2017.08.047
- Calabresi P, Picconi B, Tozzi A, Di Filippo M (2007) Dopamine-mediated regulation of corticostriatal synaptic plasticity. Trends Neurosci 30:211–219. 10.1016/j.tins.2007.03.001
- Carli M, Evenden JL, Robbins TW (1985) Depletion of unilateral striatal dopamine impairs initiation of contralateral actions and not sensory attention. Nature 313:679–682. 10.1038/313679a0
- Ding L, Gold JI (2010) Caudate encodes multiple computations for perceptual decisions. J Neurosci 30:15747–15759. 10.1523/JNEUROSCI.2894-10.2010
- Ding L, Gold JI (2013) The basal ganglia's contributions to perceptual decision making. Neuron 79:640–649. 10.1016/j.neuron.2013.07.042
- Doi T, Fan Y, Gold JI, Ding L (2020) The caudate nucleus contributes causally to decisions that balance reward and uncertain visual information. Elife 9:e56694. 10.7554/eLife.56694
- Engelhard B, Finkelstein J, Cox J, Fleming W, Jang HJ, Ornelas S, Koay SA, Thiberge SY, Daw ND, Tank DW, Witten IB (2019) Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons. Nature 570:509–513. 10.1038/s41586-019-1261-9
- Featherstone RE, McDonald RJ (2005) Lesions of the dorsolateral striatum impair the acquisition of a simplified stimulus-response dependent conditional discrimination task. Neuroscience 136:387–395. 10.1016/j.neuroscience.2005.08.021
- Gunaydin LA, Grosenick L, Finkelstein JC, Kauvar IV, Fenno LE, Adhikari A, Lammel S, Mirzabekov JJ, Airan RD, Zalocusky KA, Tye KM, Anikeeva P, Malenka RC, Deisseroth K (2014) Natural neural projection dynamics underlying social behavior. Cell 157:1535–1551. 10.1016/j.cell.2014.05.017
- Haber SN (2014) The place of dopamine in the cortico-basal ganglia circuit. Neuroscience 282:248–257. 10.1016/j.neuroscience.2014.10.008
- Hikosaka O, Sakamoto M, Usui S (1989) Functional properties of monkey caudate neurons. II. Visual and auditory responses. J Neurophysiol 61:799–813. 10.1152/jn.1989.61.4.799
- Hikosaka O, Nakamura K, Nakahara H (2006) Basal ganglia orient eyes to reward. J Neurophysiol 95:567–584. 10.1152/jn.00458.2005
- Hintiryan H, Foster NN, Bowman I, Bay M, Song MY, Gou L, Yamashita S, Bienkowski MS, Zingg B, Zhu M, Yang XW, Shih JC, Toga AW, Dong H-W (2016) The mouse cortico-striatal projectome. Nat Neurosci 19:1100–1114. 10.1038/nn.4332
- Howard CD, Li H, Geddes CE, Jin X (2017) Dynamic nigrostriatal dopamine biases action selection. Neuron 93:1436–1450.e8. 10.1016/j.neuron.2017.02.029
- Howe MW, Dombeck DA (2016) Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature 535:505–510. 10.1038/nature18942
- Hunnicutt BJ, Jongbloets BC, Birdsong WT, Gertz KJ, Zhong H, Mao T (2016) A comprehensive excitatory input map of the striatum reveals novel functional organization. Elife 5:e19103. 10.7554/eLife.19103
- Jog MS, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM (1999) Building neural representations of habits. Science 286:1745–1749. 10.1126/science.286.5445.1745
- Kawagoe R, Takikawa Y, Hikosaka O (2004) Reward-predicting activity of dopamine and caudate neurons–a possible mechanism of motivational control of saccadic eye movement. J Neurophysiol 91:1013–1024. 10.1152/jn.00721.2003
- Ketzef M, Spigolon G, Johansson Y, Bonito-Oliva A, Fisone G, Silberberg G (2017) Dopamine depletion impairs bilateral sensory processing in the striatum in a pathway-dependent manner. Neuron 94:855–865.e5. 10.1016/j.neuron.2017.05.004
- Khibnik LA, Tritsch NX, Sabatini BL (2014) A direct projection from mouse primary visual cortex to dorsomedial striatum. PLoS One 9:e104501. 10.1371/journal.pone.0104501
- Kim HF, Ghazizadeh A, Hikosaka O (2015) Dopamine neurons encoding long-term memory of object value for habitual behavior. Cell 163:1165–1175. 10.1016/j.cell.2015.10.063
- Lahiri AK, Bevan MD (2020) Dopaminergic transmission rapidly and persistently enhances excitability of D1 receptor-expressing striatal projection neurons. Neuron 106:277–290.e6. 10.1016/j.neuron.2020.01.028
- Lak A, Nomoto K, Keramati M, Sakagami M, Kepecs A (2017) Midbrain dopamine neurons signal belief in choice accuracy during a perceptual decision. Curr Biol 27:821–832. 10.1016/j.cub.2017.02.026
- Lak A, Okun M, Moss MM, Gurnani H, Farrell K, Wells MJ, Reddy CB, Kepecs A, Harris KD, Carandini M (2020) Dopaminergic and prefrontal basis of learning from sensory confidence and reward value. Neuron 105:700–711.e6. 10.1016/j.neuron.2019.11.018
- Lee RS, Mattar MG, Parker NF, Witten IB, Daw ND (2019) Reward prediction error does not explain movement selectivity in DMS-projecting dopamine neurons. Elife 8:e42992. 10.7554/eLife.42992
- Lerner TN, Shilyansky C, Davidson TJ, Evans KE, Beier KT, Zalocusky KA, Crow AK, Malenka RC, Luo L, Tomer R, Deisseroth K (2015) Intact-brain analyses reveal distinct information carried by SNc dopamine subcircuits. Cell 162:635–647. 10.1016/j.cell.2015.07.014
- Miklyaeva EI, Castaneda E, Whishaw IQ (1994) Skilled reaching deficits in unilateral dopamine-depleted rats: impairments in movement and posture and compensatory adjustments. J Neurosci 14:7148–7158. 10.1523/JNEUROSCI.14-11-07148.1994
- Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, Davidson TJ, Daw ND, Witten IB (2016) Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat Neurosci 19:845–854. 10.1038/nn.4287
- Peters AJ, Fabre JMJ, Steinmetz NA, Harris KD, Carandini M (2021) Striatal activity topographically reflects cortical activity. Nature 591:420–425. 10.1038/s41586-020-03166-8
- Reynolds JN, Wickens JR (2002) Dopamine-dependent plasticity of corticostriatal synapses. Neural Netw 15:507–521. 10.1016/S0893-6080(02)00045-X
- Reynolds JN, Hyland BI, Wickens JR (2001) A cellular mechanism of reward-related learning. Nature 413:67–70. 10.1038/35092560
- Robbins TW, Everitt BJ (1992) Functions of dopamine in the dorsal and ventral striatum. Semin Neurosci 4:119–127. 10.1016/1044-5765(92)90010-Y
- Rothenhoefer KM, Costa VD, Bartolo R, Vicario-Feliciano R, Murray EA, Averbeck BB (2017) Effects of ventral striatum lesions on stimulus-based versus action-based reinforcement learning. J Neurosci 37:6902–6914. 10.1523/JNEUROSCI.0631-17.2017
- Syed EC, Grima LL, Magill PJ, Bogacz R, Brown P, Walton ME (2016) Action initiation shapes mesolimbic dopamine encoding of future rewards. Nat Neurosci 19:34–36. 10.1038/nn.4187
- Tai LH, Lee AM, Benavidez N, Bonci A, Wilbrecht L (2012) Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nat Neurosci 15:1281–1289. 10.1038/nn.3188
- Tsutsui-Kimura I, Matsumoto H, Akiti K, Yamada MM, Uchida N, Watabe-Uchida M (2020) Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task. Elife 9:e62390. 10.7554/eLife.62390
- White JK, Monosov IE (2016) Neurons in the primate dorsal striatum signal the uncertainty of object-reward associations. Nat Commun 7:12735. 10.1038/ncomms12735
- Xiong Q, Znamenskiy P, Zador AM (2015) Selective corticostriatal plasticity during acquisition of an auditory discrimination task. Nature 521:348–351. 10.1038/nature14225
- Yin HH, Ostlund SB, Knowlton BJ, Balleine BW (2005) The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci 22:513–523. 10.1111/j.1460-9568.2005.04218.x
- Znamenskiy P, Zador AM (2013) Corticostriatal neurons in auditory cortex drive decisions during auditory discrimination. Nature 497:482–485. 10.1038/nature12077