Author manuscript; available in PMC: 2025 Mar 20.
Published in final edited form as: Neuron. 2024 Jan 25;112(6):1001–1019.e6. doi: 10.1016/j.neuron.2023.12.019

Glutamate inputs send prediction error of reward but not negative value of aversive stimuli to dopamine neurons

Ryunosuke Amo 1, Naoshige Uchida 1, Mitsuko Watabe-Uchida 1,*
PMCID: PMC10957320  NIHMSID: NIHMS1956530  PMID: 38278147

Summary

Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs) but the mechanisms underlying RPE computation, particularly contributions of different neurotransmitters, remain poorly understood. Here we used a genetically-encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons in mice. We found that glutamate inputs exhibit virtually all of the characteristics of RPE, rather than conveying a specific component of RPE computation such as reward or expectation. Notably, while glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics altered dopamine negative responses to aversive stimuli toward more positive responses, while excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computations; dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valences, with competitive interactions playing a role in responses to aversive stimuli.

eTOC Blurb

Dopamine neurons receive glutamate and GABA and send reward and punishment information to downstream areas. Amo et al. recorded glutamate inputs to dopamine neurons with a glutamate sensor. Glutamate inputs convey reward prediction error to dopamine neurons but not punishment information, suggesting division of labor between glutamate and GABA inputs.

Introduction

A fundamental computation that the brain performs is to compare its expectations and reality. It has been proposed that the difference between expectations and reality (called prediction errors) is the driving force behind perception and learning1–5. Although prediction errors can be defined with simple mathematical formulas, how intricate networks of neurons compute prediction errors remains largely elusive.

The activity pattern of dopamine neurons in the lateral ventral tegmental area (VTA) is relatively uniform6,7, and dopamine responses have long been quantitatively formalized in reinforcement learning theory as reward prediction error (RPE), the discrepancy between actual and expected reward4,8. According to this theory, RPE can be used to learn to take actions in a specific “state”, a distinct combination of environmental and internal information, to maximize future rewards. Because of the simplicity of the learning rule—learning from surprising outcomes, which is consistent with animal behaviors9,10—RPE is widely used to model reinforcement learning in diverse situations11.

While the RPE characteristics of dopamine neuron activity have been intensively studied, the mechanism of how RPE is computed based on inputs to dopamine neurons is not well understood. Building on the simple mathematical formula of RPE, i.e. subtraction of expected reward value from actual reward value, many models assume that a specific brain area sends information about actual reward while a different brain area sends information about expected reward to dopamine neurons4,12–16. VTA dopamine neurons receive input from multiple types of neurons, including from glutamate and GABA neurons17–21. Given the excitatory and inhibitory nature of these forms of neurotransmissions, respectively, many models proposed that dopamine neurons might implement this subtraction by combining glutamate reward signals and GABA expectation signals12. However, recordings from monosynaptic inputs to dopamine neurons revealed that information for both reward and expectation is distributed and mixed in single presynaptic neurons across various brain regions22. The previous study showed that dopamine neurons uniquely signal relatively more “complete” RPE that is consistent across different task events, whereas activity patterns of presynaptic neurons are more diverse and correspond only to the “partial” RPE22. These results suggested that RPE may be partially computed in multiple nodes in the neural circuits, not only at the level of dopamine neurons.

On the other hand, disinhibition of GABA inputs has often been proposed to generate dopamine phasic activation because of powerful and stringent gating roles of disinhibition as seen in the basal ganglia23. A series of studies identified the lateral habenula (lHb) and its projection target, the rostromedial tegmental area (RMTg), as an important pathway which conveys computed RPE signals to dopamine neurons21,24–28. Neurons in both lHb and RMTg signal RPE with the opposite sign to dopamine neurons, which suggests that the direct projection from GABA neurons in RMTg to dopamine neurons can theoretically produce RPE signals in dopamine neurons25–28. However, it was reported that excitation of dopamine neurons precedes inhibition of lHb neurons28, and that ablation of RMTg neurons caused specific deficits in responses to aversive events in dopamine neurons, rather than completely removing RPE signals26. Thus, while RMTg GABA inputs send inverse RPE to dopamine neurons, other inputs, likely distributed, play an important role in shaping RPE signals in dopamine neurons.

In this study, we examined the population activity of glutamate inputs to dopamine neurons, focusing on characteristics of temporal difference (TD) errors. Classic studies have proposed that dopamine activity patterns reflect TD errors, a specific form of RPE that has proven especially useful in machine learning4,29. TD errors are computed locally between adjacent states at each time step to signal changes in reward expectation. This process supports incremental learning to predict reward value at gradually earlier time points. A recent study confirmed a fundamental prediction of this theory, namely, a gradual temporal shift of dopamine responses to earlier and earlier time points between cue and reward30. While information about reward and expectation is distributed in presynaptic neurons, it is not known how glutamate and GABA inputs are used to generate TD errors in dopamine neurons. The common idea, which directly applies the theoretical TD error computation to the circuit model, is that both glutamate and GABA neurons carry expectation signals with slight time lags. The inhibitory expectation signals are then subtracted from excitatory expectation signals in dopamine neurons to compute derivatives12,14, although this does not match the observation that RMTg GABA neurons already signal RPE25–27. In the present study, we found that the activity pattern of glutamate inputs to dopamine neurons shows characteristics of TD errors but differs in important ways from the signals conveyed by dopamine neurons. Specifically, glutamate inputs were excited by aversive stimuli, while dopamine neurons in the same areas were inhibited. This activation, coupled with GABA inputs, which are reported to be excited by aversive stimuli25,26,31, determines whether dopamine neurons are excited or inhibited by an aversive stimulus in a given situation. Opioids altered this excitation-inhibition balance, probably through the μ-opioid receptor-enriched GABA input pathway21,32–34. Our results demonstrate both redundancy and division of labor by glutamate and GABA inputs, suggesting a strategy to overcome neural constraints to form bi-directional RPE signals in dopamine neurons.

Results

Detection of glutamate release at dopamine neurons

Dopamine neurons receive input, both excitatory and inhibitory, from various areas in the brain17–19,21,35–38. To examine information conveyed specifically by excitatory input to dopamine neurons, we used a genetically encoded glutamate sensor. We injected adeno-associated virus (AAV) to express a recently improved glutamate sensor (SF-iGluSnFR)39 in the ventral tegmental area (VTA) of dopamine transporter (DAT)-Cre mice (Figures 1A–1C). The glutamate sensor was specifically expressed in dopamine neurons and distributed throughout cell bodies and dendritic fibers with spine-like protrusions (Figures 1B and 1C; Figures S1A and S1B). To monitor the glutamate release at dopamine neurons, an optical fiber was implanted into the VTA for fiber fluorometry. To monitor dopamine release simultaneously, dopamine sensor (GrabDA2m)40 was expressed in the ventral striatum (VS) and its fluorescent signals were monitored through an optical fiber implanted in the same location. We targeted an area of VS for recording where dopamine release typically signals reward prediction error (RPE)30,41. Overall, simultaneous recording of dopamine release at VS and glutamate release onto dopamine neurons showed similar activity patterns including strong activation at water reward, which was absent in control fluorescence signals (Figure 1D). A significant increase of activity at the water reward delivery was observed in both dopamine release and glutamate input activity (Figure 1E). The water-evoked responses were faster in glutamate sensor signals (rise: 250±17.1 ms, decay: 488±46.4 ms) compared to dopamine sensor signals (rise: 387±41.0 ms, decay: 1222±81.1 ms) (Figure 1F). The glutamate sensor signal is weakly and positively correlated with the dopamine sensor signal at the inter-trial-period and the correlation significantly increased at water (Figure S1C). These results indicate that the natural fluctuation of dopamine sensor signals and glutamate sensor signals are correlated at water reward, and that the temporal resolution of the glutamate sensor signals is sufficient to compare with dopamine sensor signals.

Figure 1. Characterization of glutamate sensor signals in the VTA.

(A–C) Expression of glutamate sensor (SF-iGluSnFR) in the VTA (A), its close-up view at soma (B), and at the dendrite (C) with filopodia- and spine-like protrusions (arrowheads). Green, glutamate sensor; red, immunohistochemistry against tyrosine-hydroxylase (TH); blue, counter staining with DAPI. Scale bars, 500 μm (A), 10 μm (B) and 1 μm (C).

(D) A schematic of the experimental design, and example raw signals for simultaneous recording of glutamate sensor signals in the VTA, control tdTomato in the VTA, and dopamine sensor signals in the VS.

(E) Signals of glutamate sensor (left upper and middle; t = 7.4, p = 0.72×10⁻⁴, two-sided t-test) and dopamine sensor (left bottom and right; t = 15, p = 0.31×10⁻⁶, two-sided t-test) in response to unexpected water from an example session (left, mean ± s.e.m.) and each animal (n = 9 animals, middle and right, 1–500 ms from reward onset, mean ± s.e.m.).

(F) Rise and decay time for glutamate sensor and dopamine sensor signals evoked by water reward (n = 9 animals). An example of rise and decay detection (left). Comparison of rise (middle; t = −4.6, p = 0.15×10⁻², two-sided paired t-test) and decay (right; t = −6.5, p = 0.18×10⁻³, two-sided paired t-test) between sensors. Mean ± s.e.m.

(G and H) Glutamate sensor signals at the VTA (G; t = 4.0, p = 0.010, two-sided unpaired t-test; n = 4 animals for ChrimsonR, n = 3 animals for tdTomato) and dopamine sensor signals at the VS (H; t = 8.2, p = 0.41×10⁻³, two-sided unpaired t-test; n = 4 animals for ChrimsonR, n = 3 animals for tdTomato) at optogenetic activation of glutamate axons (VP, PPTg, and STh) in the VTA from an example session (left, mean ± s.e.m.) and for each animal (right, 1–500 ms from light onset, mean ± s.e.m.).

(I) Comparison of rise (upper; t = −1.2, p = 0.25, two-sided unpaired t-test) and decay times (bottom; t = −7.4, p = 0.31×10⁻³, two-sided unpaired t-test) for glutamate and dopamine sensor signals evoked by the optogenetic glutamate axon stimulation. Mean ± s.e.m.

Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

To test causality between glutamate inputs and dopamine release, we optogenetically activated glutamate inputs to VTA. To do this, ChrimsonR42, a light-gated cation channel, was expressed in glutamate neurons in multiple pre-synaptic areas to dopamine neurons (pedunculopontine tegmental nucleus (PPTg), subthalamic nucleus (STh), and ventral pallidum (VP))17,35 in vesicular glutamate transporter 2 (vGluT2)-Cre mice, and their axons were activated at the VTA (Figures 1G and 1H). We first examined whether glutamate sensor signals reliably capture activation of glutamate inputs. We observed a significant increase of glutamate sensor signals when glutamate axons in VTA were stimulated in ChrimsonR-expressing animals, more than control mice expressing tdTomato (Figure 1G). The same optogenetic stimulation also strongly evoked dopamine release in the VS in ChrimsonR-expressing animals, more than in control mice (Figure 1H). We observed that glutamate sensor signals showed similar or faster responses to optogenetic stimulation (rise: 277±47.1 ms, decay: 375±66.9 ms) compared to dopamine release (rise: 352±36.7 ms, decay: 1083±68.1 ms) (Figure 1I). These observations confirmed that glutamate sensor recording with fiber-fluorometry allows detection of glutamate release in the VTA that causally evokes dopamine release in VS.

Glutamate input sends RPE to dopamine neurons during classical conditioning

Multiple studies have proposed potential neural mechanisms for the computation of TD error signals in dopamine neurons4,12–16,27. Most models hypothesized that inhibitory inputs play key roles in TD error computation by providing reward expectation or prediction error that is already computed. According to these models, inhibitory input should be inhibited at reward cue to generate excitation in dopamine neurons. However, a previous study recorded neural activity from monosynaptic inputs to dopamine neurons and found that the vast majority of input neurons were activated by a reward cue22. This suggests that excitatory inputs to dopamine neurons, likely conveyed via glutamate, play critical roles in driving dopamine responses. Previous studies also found that glutamate inputs to the VTA are distributed throughout the brain17 and that presynaptic neurons signaling reward value are also distributed22. To probe the role of glutamate inputs in TD error calculations, we recorded population activity of glutamate input while mice performed a classical conditioning paradigm (Figure 2; Figure S2). We first examined whether glutamate input responses to reward-predicting cues are modulated by associated reward value, which is one of the characteristics of TD error (Figures 2B and 2D). We found that in response to odor cues, glutamate signals were monotonically modulated by the associated probabilities of reward outcome. To analyze the relationship between cue responses and associated reward, we performed linear regression of cue responses with associated reward probability for each animal. We observed a positive linear relationship between glutamate signals and reward probabilities (Figure 2E). While the observed activity patterns in glutamate input are consistent with TD error, this observation does not distinguish TD error from reward value.
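The per-animal regression described above can be sketched as follows. This is a minimal illustration only: the function name `fit_cue_response` and the response values are hypothetical (made-up numbers, not data from this study), and the reward probabilities 0%, 40% and 80% are taken from the figure legends.

```python
import numpy as np

def fit_cue_response(reward_prob, response):
    """Least-squares line: response ~ intercept + slope * reward_prob.

    Returns (slope, intercept, Pearson correlation coefficient).
    """
    p = np.asarray(reward_prob, dtype=float)
    r = np.asarray(response, dtype=float)
    slope, intercept = np.polyfit(p, r, deg=1)
    corr = np.corrcoef(p, r)[0, 1]
    return slope, intercept, corr

# Hypothetical normalized cue responses for one animal (illustrative values).
probs = [0.0, 0.4, 0.8]
resp = [0.05, 0.55, 1.00]

slope, intercept, corr = fit_cue_response(probs, resp)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={corr:.3f}")
```

A positive slope in every animal corresponds to the monotonic modulation by reward probability reported in the text; the slope alone cannot separate TD error from value coding, which is why the reward responses are examined next.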

Figure 2. Recording of glutamate inputs to the VTA dopamine neurons during classical conditioning.

(A) Schematics of recording of glutamate sensor signals from dopamine neurons and classical conditioning task.

(B and C) Glutamate sensor responses to odor cues (B) and water reward (C) in each trial (left) and averaged activity of all trials (right) in a single session in an example animal. Mean ± s.e.m.

(D) Average glutamate sensor responses to odor cues in each animal, normalized by 80% reward odor response (left) and averaged activity for all animals (right, n = 15 animals). Mean ± s.e.m.

(E) Linear regression of responses to odors (1–1000 ms from odor onset) with associated reward probability. Left, light gray, each animal; dark gray, average of all animals, correlation coefficient 0.94, p = 0.11×10⁻²¹, F-test. Right, regression coefficients for each animal (t = 28, p = 0.85×10⁻¹³, two-sided t-test, n = 15 animals).

(F) Average responses to water reward in each animal, normalized by unexpected reward response (left), and averaged activity for all animals (right). Mean ± s.e.m.

(G) Linear regression of responses to water (201–1200 ms from reward onset) with expected reward. Left, light gray, each animal; dark gray, average of all mice, correlation coefficient −0.71, p = 0.38×10⁻⁷, F-test. Right, regression coefficients for each animal (t = −3.9, p = 0.13×10⁻², two-sided t-test).

In box plots, grey lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01.

We next examined responses to water reward with different expectations (Figures 2C and 2F). The reward responses in glutamate inputs were negatively modulated by expectation; the higher the predicted reward probability was, the smaller the glutamate signal was (Figure 2G). The response patterns to water reward do not match with reward expectation or actual reward value. Rather, these results are more consistent with TD error coding where reward responses are suppressed when the reward is expected.
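To make the distinction between value coding and TD error coding concrete, the following worked example may help. It is a simplified sketch under stated assumptions, not the model fitted in this study: reward magnitude r, reward probability p, a fully learned cue value V(cue) = p r, no temporal discounting, and a pre-cue value of zero.

```latex
\begin{align}
\delta_{\text{cue}}      &= V(\text{cue}) - 0 = p\,r \\
\delta_{\text{reward}}   &= r - V(\text{cue}) = (1 - p)\,r \\
\delta_{\text{omission}} &= 0 - V(\text{cue}) = -p\,r
\end{align}
```

For example, with p = 0.8 and r = 1 these give 0.8 at the cue, 0.2 at reward delivery and −0.8 at omission. A pure value signal would instead grow with p at the cue and carry no expectation-dependent reduction at reward, so a reward response that shrinks as p increases is the signature of an error signal.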

Glutamate input activity follows TD error rules in sequential conditioning

A hallmark of TD error signal is that, in addition to rewarding outcomes, moment-by-moment changes in reward expectation (value) drive TD errors11. According to this model, a reduction of response occurs not only for a reward when that reward was predicted by a cue, but also for a reward-predicting cue when the cue was preceded by a different cue that itself predicts the upcoming reward (‘sequential conditioning’, Figure 3A). We next sought to use this property to test for TD error coding in the glutamate input to dopamine neurons (Figure 3A right; Figure S2)43,44. Mice were first trained in simple classical conditioning to associate an odor (‘proximal odor’) with either reward or no outcome (Figure 3A Step1). After learning the association, new odors (‘distal odors’) were presented at an earlier time point (Figure 3A Step2). One distal odor was associated with reward-predicting cue at 100%, one with reward-predicting cue and no outcome cue at 50%, and one with no outcome cue at 100%. After completing the training, we simultaneously monitored glutamate input to dopamine neurons and dopamine release in the VS. As expected, dopamine release at the distal odor cue monotonically increased with the associated reward cue probability (Figures 3B and 3C). Critically, the magnitude of dopamine responses to the proximal reward-predicting cue (Odor A) was negatively scaled by the probability that this cue was predicted by a distal cue; strongest when there was no predicting cue, intermediate when this cue was predicted with 50% probability, and the smallest when the cue was predicted with 100% (Figures 3D and 3E). The dopamine activity pattern in this paradigm was consistent with TD errors in the TD learning model11.
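The TD predictions for this task can be illustrated with a small tabular TD(0) simulation. This is a sketch under several assumptions that are not taken from the paper: unit reward, no discounting (gamma = 1), a pre-cue value of zero, an arbitrary learning rate, and the state names `odor1`, `odor2`, `odor3` for the distal odors paired with odor A at 100%, 50% and 0%.

```python
import random

# Assumed task structure: P(odor A | distal odor); odor A is followed by
# reward and odor B by no outcome.
P_ODOR_A = {"odor1": 1.0, "odor2": 0.5, "odor3": 0.0}

def td_error(reward, v_next, v_current, gamma=1.0):
    """TD error: delta = r + gamma * V(next) - V(current)."""
    return reward + gamma * v_next - v_current

def learn_values(n_trials=30000, alpha=0.005, seed=0):
    """Tabular TD(0) over the two-step trial: distal -> proximal -> outcome."""
    rng = random.Random(seed)
    V = {s: 0.0 for s in ("odor1", "odor2", "odor3", "odorA", "odorB")}
    distal_names = list(P_ODOR_A)
    for _ in range(n_trials):
        distal = rng.choice(distal_names)
        proximal = "odorA" if rng.random() < P_ODOR_A[distal] else "odorB"
        reward = 1.0 if proximal == "odorA" else 0.0
        # Distal -> proximal transition carries no reward.
        V[distal] += alpha * td_error(0.0, V[proximal], V[distal])
        # Proximal -> outcome; the terminal state has value 0.
        V[proximal] += alpha * td_error(reward, 0.0, V[proximal])
    return V

V = learn_values()
for distal in P_ODOR_A:
    at_distal = td_error(0.0, V[distal], 0.0)         # pre-cue value assumed 0
    at_odor_a = td_error(0.0, V["odorA"], V[distal])  # hypothetical after odor3
    at_odor_b = td_error(0.0, V["odorB"], V[distal])  # hypothetical after odor1
    print(f"{distal}: distal={at_distal:+.2f}, "
          f"odorA={at_odor_a:+.2f}, odorB={at_odor_b:+.2f}")
```

In this sketch the TD error at the distal odor grows with the probability of odor A, the TD error at odor A shrinks as the distal odor predicts it more reliably, and odor B after the 50% distal odor yields a negative error, matching the qualitative pattern described in the text.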

Figure 3. Simultaneously recorded glutamate sensor and dopamine sensor signals during sequential conditioning.

(A) Schematics of sequential conditioning. The mice were trained to associate proximal cues and outcomes (Step1) and then distal odors were introduced to associate with proximal cues and outcomes (Step2). After completing sequential conditioning, glutamate sensor signals from the VTA dopamine neurons and dopamine sensor signal in the VS were recorded simultaneously (right top). Bottom right, expected value and TD error signals based on a TD learning model.

(B) Dopamine responses to distal odor cues in each animal, normalized by odor 1 response (left), and averaged activity for all animals (right, n = 5 animals). Mean ± s.e.m.

(C) Linear regression of responses to distal odors (1–1000 ms from distal odor onset) with associated reward probability. Left, light color, each animal; dark color, average of all mice, correlation coefficient 0.92, p = 0.11×10⁻⁵, F-test; mean ± s.e.m. Right, regression coefficients in each animal (t = 12, p = 0.20×10⁻³, two-sided t-test, n = 5 animals).

(D) Dopamine responses to the reward-predicted proximal odor (odor A) in each animal, normalized with responses to unexpected odor A (left), and averaged activity for all animals (right, n = 5 animals). Mean ± s.e.m.

(E) Linear regression of responses to odor A (201–1200 ms from odor A onset) with expectation of odor A. Left, light color, each animal; dark color, average of all mice, correlation coefficient −0.94, p = 0.93×10⁻⁷, F-test; mean ± s.e.m. Right, regression coefficients in each animal (t = −9.3, p = 0.72×10⁻³, two-sided t-test, n = 5 animals).

(F) Responses of glutamate inputs to dopamine neurons to distal odor cue (n = 5 animals). Mean ± s.e.m.

(G) Linear regression of responses at distal odors (1–1000 ms from distal odor onset) with associated reward probability. Left, light color, each animal; dark color, average of all mice, correlation coefficient 0.95, p = 0.44×10⁻⁷, F-test; mean ± s.e.m. Right, regression coefficients in each animal (t = 23, p = 0.18×10⁻⁴, two-sided t-test).

(H) Responses of glutamate inputs to dopamine neurons to odor A (n = 5 animals). Mean ± s.e.m.

(I) Linear regression of responses to odor A (201–1200 ms from odor A onset) with expectation of odor A. Left, light color, each animal; dark color, average of all mice, correlation coefficient −0.88, p = 0.12×10⁻⁴, F-test; mean ± s.e.m. Right, regression coefficients in each animal (t = −10, p = 0.43×10⁻³, two-sided t-test, n = 5 animals).

In box plots, grey lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001.

Like dopamine activity, glutamate release at the distal odors showed a monotonic increase with the probability of the reward cue (Figures 3F and 3G). Importantly, similar to dopamine release, the magnitude of glutamate signals at the proximal reward-predicting cue (Odor A) was negatively scaled by the probability that this cue was predicted by a distal cue (Figures 3H and 3I). Thus, glutamate input to dopamine neurons shows characteristics of TD error, resembling the observed patterns in dopamine release.

Notably, results from the sequential conditioning strengthen our observation in the simple classical conditioning in two ways. First, because cue events are temporally separated from actual reward acquisition, potential movement-related neural activity or recording noise caused by anticipatory licking were greatly reduced (Figure S3). We observed similar activity patterns both in unprocessed glutamate sensor signals and in normalized signals with control fluorescent signals (Figure S3; see Methods). Second, expectation-dependent reduction of cue responses in addition to water responses generalizes the idea of RPE to TD error, in which the value of the proximal cue takes the place of reward when computing the difference in value expectations across time. Our observations thus indicate that both dopamine release and glutamate inputs comply with TD error rules (i.e. computing changes in values) in sequential conditioning tasks.

Glutamate input shows inhibition at omission of expected outcome

Omission of expected reward produces dopamine inhibition below baseline4,31. The inhibitory “dip” at reward omission is another characteristic of TD error, because it can be explained by a prediction of reward that fails to materialize. It is commonly assumed that the inhibitory responses to reward omission are driven solely by GABA inputs such as GABA neurons in the RMTg26. However, we found that glutamate input as well as dopamine activity and release showed significant inhibition when an expected reward was omitted during classical conditioning (Figures 4A and 4B; Figure S2). Similarly, TD learning predicts a negative prediction error at the reward omission cue (a cue signaling no outcome when reward cue has been expected) in sequential conditioning. Consistently, dopamine activity dipped below baseline when the reward omission cue appeared in sequential conditioning (Figures 4C–4E). Notably, glutamate input activity also showed an inhibitory dip at the reward omission cue (Figures 4D and 4E).

Figure 4. Responses to reward omission in glutamate inputs to dopamine neurons, dopamine somatodendritic activity and dopamine release.

(A) Responses to omission of 40% expected reward in glutamate sensor in dopamine neurons in the VTA (left, n = 12 animals), somatodendritic GCaMP signal in the VTA (middle, n = 11 animals), and dopamine sensor signal in the VS (right, n = 11 animals) normalized with responses to unexpected water reward. Mean ± s.e.m.

(B) Quantification of omission responses (1501–2500 ms from expected reward timing, Glu, t = −2.6, p = 0.022; GCaMP, t = −4.1, p = 0.18×10⁻²; DA, t = −18, p = 0.53×10⁻⁸; two-sided t-test) and comparison of different sensor signals (F = 0.70, p = 0.50, one-way ANOVA; Glu vs GCaMP, p = 0.69, Glu vs DA, p = 0.92, GCaMP vs DA, p = 0.48, Tukey’s test).

(C) Schematics of a proximal odor (odor B) that signals reward omission after a distal odor (odor 2) that signals 50% reward in sequential conditioning.

(D) Responses to a reward omission odor in simultaneously recorded glutamate sensor signals (left) and dopamine sensor signals (right). n = 5 animals. Mean ± s.e.m.

(E) Quantification of responses to a reward omission odor in glutamate sensor (201–1200 ms from odor B onset, t = −3.2, p = 0.032, two-sided t-test) and dopamine sensor (t = −3.0, p = 0.039, two-sided t-test), and comparison of these signals (t = −0.77, p = 0.48, two-sided paired t-test).

In box plots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

We observed that glutamate input activity shows multiple characteristics of TD error in classical conditioning; cue responses were increased by associated reward value, reward responses were decreased by reward expectation, and reward omission induced inhibitory responses. As TD learning predicts, expectation-dependent suppression of responses was not restricted to reward responses. In sequential conditioning, we observed that reward cue responses were suppressed by expectation, and that the reward omission cue induced inhibitory responses. Together, these observations indicate that glutamate inputs to dopamine neurons convey significantly more complete TD error than popular theories predict12–14,25,45.

TD errors in glutamate inputs are positively biased compared to dopamine somatodendritic activity and dopamine release

Because we observed striking similarities between glutamate inputs to dopamine neurons in the VTA and dopamine release in VS, we directly compared their activity patterns. We first focused on signals in sequential conditioning recorded simultaneously. While both glutamate inputs and dopamine release showed characteristics of TD error, the temporal patterns were slightly different. Because glutamate sensor signals in the VTA and dopamine sensor signals in the VS show different temporal patterns of responses to the same optogenetic stimulation (Figures 1F–1H), we first estimated dopamine sensor signals in sequential conditioning solely from glutamate sensor signals. To do that, we deconvolved glutamate signals using “glutamate kernels” estimated from the optogenetic responses in glutamate sensor signals, and then convolved the resulting trace with “dopamine kernels” estimated from the dopamine sensor optogenetic responses (Figures 5A–5C; see Methods46). With this method, glutamate input signals explained 54±19 % of variance in dopamine signals in sequential conditioning (Figures 5A–5E). While the estimated glutamate input contribution to dopamine distal odor responses increased monotonically with associated reward odor probability (glutamate: t = 3.0, p = 0.038, t-test; n = 5 animals; Figure 5F left and right), the residual signal did not show significant modulation (residual: t = 0.47, p = 0.66, t-test; n = 5 animals, glutamate vs residual: t = 3.6, p = 0.022, paired t-test; n = 5 animals; Figure 5F middle and right). Similarly, the glutamate contribution, but not residual, at the proximal reward-predicting cue (Odor A) was negatively scaled by the probability that this cue was predicted by a distal cue (glutamate: t = −5.59, p = 0.0050, t-test; n = 5 animals; Figure 5G left and right; residual: t = −1.3, p = 0.25, t-test; n = 5 animals, glutamate vs residual: t = −4.3, p = 0.012, paired t-test; n = 5 animals; Figure 5G middle and right). The lack of TD error characteristics in residual signals suggests that glutamate input explains a significant portion of TD error coding in dopamine activity. While we observed striking similarities in estimated and actual dopamine signals, we also noticed differences. The cue responses predicted from the observed glutamate signals tended to be more positive, while actual dopamine cue responses were more negative, with the no-outcome cue generating inhibitory dopamine responses (Figures 5B, 5D, and 5E).
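The deconvolve-then-convolve transformation can be sketched as below. This is an illustration only and not the procedure from the Methods: it assumes a regularized (Wiener-style) Fourier deconvolution, and the function names, the regularization constant `eps`, the exponential kernel shapes and the synthetic input are all hypothetical.

```python
import numpy as np

def deconvolve(signal, kernel, eps=1e-3):
    """Regularized Fourier deconvolution of `signal` by `kernel` (circular)."""
    n = len(signal)
    S = np.fft.rfft(signal, n)
    K = np.fft.rfft(kernel, n)  # kernel is zero-padded to length n
    return np.fft.irfft(S * np.conj(K) / (np.abs(K) ** 2 + eps), n)

def predict_dopamine(glu_signal, glu_kernel, da_kernel, eps=1e-3):
    """Remove the glutamate kernel, then apply the dopamine kernel."""
    drive = deconvolve(glu_signal, glu_kernel, eps)
    return np.convolve(drive, da_kernel)[: len(glu_signal)]

def variance_explained(actual, predicted):
    """Fraction of variance in `actual` accounted for by `predicted`."""
    return 1.0 - np.var(actual - predicted) / np.var(actual)

# Synthetic demonstration with made-up exponential kernels (not fitted to data).
dt = 0.02                               # seconds per sample (assumed)
t = np.arange(0, 3, dt)
glu_kernel = np.exp(-t / 0.4)           # faster sensor response
da_kernel = np.exp(-t / 1.1)            # slower sensor response
drive_true = np.zeros(1000)
drive_true[[100, 400]] = 1.0            # two brief input events
glu = np.convolve(drive_true, glu_kernel)[:1000]
da_true = np.convolve(drive_true, da_kernel)[:1000]

da_pred = predict_dopamine(glu, glu_kernel, da_kernel)
ve = variance_explained(da_true, da_pred)
print(f"variance explained on synthetic data: {ve:.4f}")
```

On this noise-free synthetic input the prediction recovers nearly all of the variance by construction; with real recordings the residual (actual minus predicted) is the quantity of interest, since it captures what the glutamate signal does not account for.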

Figure 5. Estimation of dopamine signals from glutamate inputs in sequential conditioning.

(A) Glutamate sensor signals at distal odors associated with different reward probability, a reward-predicted proximal odor with different expectation of the odor, water reward with different expectation, and a proximal reward omission odor (ordered left to right). Mean ± s.e.m.

(B) Simultaneously recorded dopamine sensor signals. Mean ± s.e.m.

(C) Dopamine signals were estimated by transforming glutamate sensor signals using differences in response patterns of glutamate sensor signals and dopamine sensor signals by optogenetic activation of glutamate inputs.

(D) Estimated glutamate input contribution to dopamine sensor signals. Mean ± s.e.m.

(E) Residual dopamine sensor signals that are not explained by glutamate sensor signals. Mean ± s.e.m.

(F and G) Linear regression of responses to distal odors (F, 1–1000 ms from distal odor onset) or a proximal “odor A” (G, 201–1200 ms from odor A onset) with associated reward probability. Estimated glutamate contribution (distal odors: correlation coefficient 0.76, p = 0.89×10⁻³, F-test; proximal odors: correlation coefficient −0.84, p = 0.69×10⁻⁴, F-test; left), residual signal (distal odors: correlation coefficient 0.14, p = 0.61, F-test; proximal odors: correlation coefficient −0.25, p = 0.35, F-test; middle), and comparison of coefficient beta (right).

n = 5 animals. In boxplots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

We next compared the activity pattern at each step of neural transmission: glutamate input to dopamine neurons in VTA (glutamate sensor in dopamine neurons), somatodendritic calcium signals in VTA dopamine neurons (GCaMP in dopamine neurons), and dopamine release in the VS (dopamine sensor in striatal neurons) during classical conditioning (Figures 6A and 6B; Figure S2). While cue responses were monotonically modulated by associated reward value in all steps, we noticed that glutamate input responses were biased toward excitation. Glutamate input responses to 40% reward odor and 0% reward odor were significantly higher compared to dopamine somatodendritic calcium signals and dopamine release (Figure 6C). We next estimated the minimum associated reward probability for the cue to produce positive responses (“zero-crossing point”; Figure 6D left) by linearly fitting neuronal responses with reward probability. The zero-crossing points were significantly lower in glutamate input activity (4.3±3.5 %) compared to dopamine somatodendritic activity (32±3.0 %) and dopamine release (28±2.2 %) (Figure 6D right). This indicates that glutamate input responses to cues are positively biased compared to dopamine neuron activity. To test for a possible contribution of recording sites to the observed difference, we recorded glutamate and calcium sensor signals from varied locations along the mediolateral axis of the VTA (0.325–0.75 mm from the midline). We found that the difference between glutamate input activity and dopamine neuron activity in 40% reward odor response, 0% reward odor response, or zero-crossing point was not explained by the recording location (Figure 6E).
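The zero-crossing point follows directly from the linear fit: if the response is modeled as intercept + slope × probability, the response is zero at probability −intercept/slope. A minimal sketch, with a hypothetical function name and made-up response values (not data from this study):

```python
import numpy as np

def zero_crossing_probability(reward_prob, response):
    """Reward probability at which the fitted line crosses zero response."""
    slope, intercept = np.polyfit(
        np.asarray(reward_prob, dtype=float),
        np.asarray(response, dtype=float),
        deg=1,
    )
    if slope == 0:
        raise ValueError("flat fit: zero-crossing point is undefined")
    return -intercept / slope

probs = [0.0, 0.4, 0.8]

# Illustrative responses: one set shifted upward, one centered on the 40% cue.
positively_biased = [0.1, 0.6, 1.1]
centered = [-0.5, 0.0, 0.5]

zc_biased = zero_crossing_probability(probs, positively_biased)
zc_centered = zero_crossing_probability(probs, centered)
print(f"zero crossing (biased):   {zc_biased:.2f}")
print(f"zero crossing (centered): {zc_centered:.2f}")
```

A lower zero-crossing point means that cues with a smaller reward probability already evoke a positive response, which is how the positive bias of glutamate inputs is expressed in the comparison above.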

Figure 6. Comparison of glutamate input activity to dopamine neurons with dopamine somatodendritic activity and dopamine release.

(A) Responses to odors associated with different reward probability in classical conditioning. Left, glutamate sensor signal from the VTA dopamine neurons (n = 12 animals). Middle, somatodendritic GCaMP signals from dopamine neurons in VTA (n = 11 animals). Right, dopamine release in the VS (n = 11 animals). Mean ± s.e.m. Bottom, 1–1000 ms from odor onset.

(B) Schematics of three different events during dopamine neural transmission.

(C) Comparison of responses to 40% reward-predicted odor (1–1000 ms from odor onset, left; F = 9.2, p = 0.69×10−3, one-way ANOVA; Glu vs GCaMP, p = 0.37×10−2, Glu vs DA, p = 0.15×10−2, GCaMP vs DA, p = 0.94, Tukey’s test; Glu, t = 6.9, p = 0.23×10−4, GCaMP, t = −0.48, p = 0.63, DA, t = 1.0, p = 0.32, two-sided t-test), and an odor that is associated with no outcome (right; F = 21, p = 0.12×10−5, one-way ANOVA; Glu vs GCaMP, p = 0.68×10−6, Glu vs DA, p = 0.012, GCaMP vs DA, p = 0.45×10−2, Tukey’s test; Glu, t = −0.5, p = 0.62, GCaMP, t = −6.2, p = 0.96×10−4, DA, t = −6.5, p = 0.62×10−4, two-sided t-test).

(D) Comparison of zero-crossing point (i.e. estimated reward probability when responses to a cue become zero; F = 29, p = 0.58×10−7, one-way ANOVA; Glu vs GCaMP, p = 0.89×10−7, Glu vs DA, p = 0.92×10−5, GCaMP vs DA, p = 0.25, Tukey’s test).

(E) Effect of recording sites and sensors in responses to 40% reward-predicted odor (left; location, F = 0.14, p = 0.71; sensor, F = 12, p = 0.24×10−2, two-way ANOVA), responses to an odor predicting no outcome (middle; location, F = 0.15, p = 0.69; sensor, F = 33, p = 0.11×10−4, two-way ANOVA), and zero-crossing points (right; location, F = 0.17, p = 0.68; sensor, F = 42, p = 0.25×10−5, two-way ANOVA). Inset is a superimposed schematic for recording sites.

(F) Comparison of responses to distal odors associated with different reward probability (1–1000 ms from distal odor onset). Mean ± s.e.m. n = 5 animals.

(G) Comparison between glutamate and dopamine sensor signals for responses to a distal odor that predicts a proximal odor associated with 50% reward (left; 1–1000 ms from distal odor onset, t = 3.0, p = 0.038, two-sided paired t-test), responses to a distal odor that predicts an odor with no outcome (middle; 1–1000 ms from distal odor onset, t = 2.7, p = 0.049, two-sided paired t-test), and zero-crossing points of distal odor response (right; t = −3.5, p = 0.038, two-sided paired t-test).

In boxplots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

We next confirmed that the observed difference was not due to variability across animals. We compared simultaneously recorded glutamate input activity and dopamine release during sequential conditioning. As seen during the odor response in classical conditioning, responses to distal odors were positively biased in glutamate inputs compared to dopamine release in single animals (50% reward distal odor: t = −3.0, p = 0.038, paired t-test; 0% reward distal odor: t = −2.7, p = 0.049, paired t-test; zero-crossing point: Glu, 2.7±5.1 %, DA, 26±6.5 %, t = 3.5, p = 0.023, paired t-test, Figures 6F and 6G). Thus, the population activity of glutamate inputs does not fully explain dopamine activity, suggesting that dopamine neurons do not purely relay information from the glutamate input population but require specific, probably additional, inputs to produce more negative responses.
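For readers unfamiliar with the within-animal comparison used here, the paired t statistic can be sketched as below. The helper name `paired_t` and the example values are hypothetical and are not data from this study; the sketch only shows the standard formula, in which the mean of the per-animal differences is divided by its standard error.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical responses from five animals (arbitrary units).
signal_a = [1.0, 2.0, 3.0, 4.0, 5.0]
signal_b = [0.0, 0.0, 3.0, 3.0, 4.0]

t_ab, df = paired_t(signal_a, signal_b)
t_ba, _ = paired_t(signal_b, signal_a)
```

Swapping the order of the two samples flips the sign of t without changing its magnitude, so the sign of a reported paired t value reflects only the direction of subtraction.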

Lack of inhibitory responses to aversive stimuli in glutamate inputs to dopamine neurons

Many dopamine neurons are inhibited by aversive stimuli such as air puff to the eye, consistent with the negative value of such stimuli7,24,31. We confirmed that both dopamine neuron activity in the VTA and dopamine release at the VS showed activation to water reward and inhibition to air puff (Figures 7A–7C; Figure S4). Surprisingly, however, while glutamate input showed activation to water reward, it was also phasically activated by the air puff (Figures 7A–7C; Figure S4A). Thus, the glutamate input response to the air puff is clearly different from dopamine activity or release in this task condition (Figure 7C). The difference was not explained by recording location (Figure 7D). The opposite directions in air puff response between dopamine activity and glutamate inputs strongly suggest the requirement of an additional input, such as RMTg GABA neurons25,27, to cancel out excitation and produce inhibitory responses to aversive air puffs in dopamine neurons. In contrast to common models in which glutamate and GABA inputs provide components of TD errors (Figure 7E) or dopamine neurons inherit already calculated RPE from GABA neurons (Figure 7F), our results indicate that glutamate inputs send TD errors, including inhibitory responses to reward omission, but diverge from dopamine neurons in their response to aversive stimuli, showing excitation rather than inhibition. This excitation may compete with GABA inputs, which are also excited by aversive stimuli (Figure 7G).

Figure 7. Responses to air puff in glutamate input activity to dopamine neurons, dopamine somatodendritic activity and dopamine release.

(A) Responses to unexpected water reward.

(B) Responses to unexpected air puff.

(C) Average responses to air puff (201–1200 ms from air puff onset, Glu; t = 3.4, p = 0.010, n = 8 animals, GCaMP; t = −7.6, p = 0.58×10−4, n = 9 animals, DA; t = −4.0, p = 0.69×10−2, n = 7 animals, two-sided t-test) and comparison of these responses (F = 24, p = 0.33×10−5, one-way ANOVA; Glu vs GCaMP, p = 0.22×10−3, Glu vs DA, p = 0.31×10−5, GCaMP vs DA, p = 0.30, Tukey’s test). Mean ± s.e.m. In boxplots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers.

(D) Effects of recording sites and sensors in air puff responses (location, F = 1.1, p = 0.30; sensor, F = 32, p = 0.51×10−6, two-way ANOVA).

(E) Classical models for TD error computation in dopamine neurons. Different components for TD error computation are conveyed by input to dopamine neuron; GABA input conveys expectation (V(t)), and glutamate input conveys expectation (V(t+1)) and reward (r(t))12,14 (alternatively expectation (V(t+1)) is provided by GABA input as inverse form12,15). γ; temporal discounting factor (0 ≤ γ ≤1).

(F) Disinhibition model. RMTg GABA inputs provide TD error through disinhibition of dopamine neuron25,27. Glutamate input simply provides baseline activity63.

(G) Schematics of dopamine signal computation. Glutamate input from various areas conveys positively biased TD error as a population. GABA input from RMTg (and potentially other areas) also sends TD error. The combination of these redundant inputs forms balanced TD error in dopamine neurons. Negative value of aversive events is conveyed by GABA input, whereas glutamate input is excited by aversive events, opposing GABA inputs.

In boxplots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

Recent studies established multiple versions of improved glutamate sensors47,48. Among them, iGluSnFR3 v857-GPI was reported to achieve both higher signal-to-noise ratio and increased postsynaptic localization in vivo47. Using this sensor, we confirmed that responses in glutamate inputs to cues and rewards are consistent with RPE, although responses to reward omission were variable in this recording (Figure S5). Moreover, we confirmed that glutamate inputs are activated by an aversive air puff.

Opioids may flip dopamine responses to aversive stimuli from inhibition to excitation

If both glutamate inputs and GABA inputs are excited by aversive stimuli, and thus compete with each other to shape dopamine responses, the dopamine responses to aversive stimuli may be flexibly altered by slight modulations in the relative strength of excitatory and inhibitory inputs, depending on an animal’s state or context (Figure 7G). We focus here on one state alteration: the exogenous administration of opioids, common and effective analgesics which lead to an increase in dopamine excitability and inhibit some direct and indirect inputs to dopamine neurons49–56. We tested the effect of systemic administration of buprenorphine, an opioid commonly used in medical practice, on dopamine release and glutamate inputs with simultaneous recording (Figure 8; Figure S4). While dopamine release in VS was inhibited by an aversive air puff in the control sessions, buprenorphine treatment drastically diminished inhibition of the dopamine release and even flipped the response to activation in some animals (Figures 8A–8B). The reduction of dopamine inhibition was specifically observed in buprenorphine treatment while control saline injection did not alter dopamine response to air puff (Figure 8B right). In contrast, glutamate input responses were not clearly changed by the buprenorphine treatment (Figures 8C–8D). Neither buprenorphine nor saline administration altered excitatory responses in glutamate input (Figure 8D right). Overall, buprenorphine administration consistently decreased the magnitude of dopamine inhibitory response to an aversive stimulus, a change which was not seen in glutamate inputs (Figure 8E), suggesting competition between glutamate inputs and other inputs. In this way, excitation by aversive stimuli in glutamate and other inputs appears to be differentially modulated to flexibly shape dopamine responses to aversive stimuli.

Figure 8. Effect of buprenorphine on dopamine activity and glutamate input in response to aversive air puff.

(A) Dopamine release in the VS at unexpected air puff after saline or buprenorphine injection. Mean ± s.e.m.

(B) Left, comparison of dopamine release following air puff after saline and buprenorphine injection (1–1000 ms from air puff onset, saline vs buprenorphine: t = 5.9, p = 0.55×10−3, paired t-test, saline: t = −6.9, p = 0.23×10−3, two-sided t-test, buprenorphine: t = 0.66, p = 0.52, two-sided t-test, n = 8 animals). Right, difference in response to air puff between pre-injection and post-injection (1–1000 ms from air puff onset, saline vs buprenorphine: t = 7.2, p = 0.16×10−3, two-sided paired t-test, saline: t = −0.78, p = 0.46, t-test, buprenorphine: t = 10, p = 0.14×10−4, two-sided t-test, n = 8 animals). A circle indicates each animal. Blue circle: significantly below 0 (p ≤ 0.05, t < 0, two-sided t-test). Red circle: significantly above 0 (p ≤ 0.05, t > 0, two-sided t-test).

(C) Glutamate input to dopamine at air puff after saline or buprenorphine injection. Mean ± s.e.m.

(D) Left, comparison of glutamate input response to air puff after saline and buprenorphine injection (1–1000 ms from air puff onset, saline vs buprenorphine: t = −0.79, p = 0.45, two-sided paired t-test, saline: t = 3.2, p = 0.014, two-sided t-test, buprenorphine: t = 4.8, p = 0.17×10−2, two-sided t-test, n = 8 animals). Right, difference in response to air puff between pre-injection and post-injection (1–1000 ms from air puff onset, saline vs buprenorphine: t = −1.0, p = 0.33, two-sided paired t-test, saline: t = −0.35, p = 0.73, two-sided t-test, buprenorphine: t = −1.5, p = 0.16, t-test, n = 8 animals). Circles indicate individual animals. Blue circle: significantly below 0 (p ≤ 0.05, t < 0, two-sided t-test). Red circle: significantly above 0 (p ≤ 0.05, t > 0, two-sided t-test).

(E) Distribution of changes in dopamine and glutamate input response to air puff with buprenorphine and saline injection (ks = 0.62, p = 0.049, Kolmogorov-Smirnov test, n = 8 animals).

(F) Potential competition between glutamate and GABA inputs to dopamine neurons. At an aversive event, a large increase of inhibitory inputs cancels out the increase of excitatory inputs and generates an inhibitory dip in dopamine activity. With opioid administration, dopamine shows more activation and less inhibition in response to aversive events, potentially because of preferential inhibition of GABA inputs by opioids.

In boxplots, lighter color lines are the median; edges are 25th and 75th percentiles; and whiskers are the most extreme data points not considered as outliers. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05.

Discussion

Excitatory-inhibitory balance has been intensively studied in the context of neural dynamics, sensory physiology and psychiatric conditions57–61, and it is a common idea that inhibitory tones should be suitably titrated for specific roles such as stabilizing feedback loops or sharpening the stimulus selectivity of sensory responses62. Our study points out a critical similarity and difference between dopamine activity and the activity of its excitatory inputs. Together with reported GABA input activity25–27,31, our finding suggests a different aspect of excitatory-inhibitory balance: both glutamate and GABA inputs provide overlapping TD error characteristics to dopamine neurons, while they compete with each other for control over dopamine responses to aversive stimuli. These two contrasting findings suggest that slight modifications of excitatory-inhibitory weights of inputs would dramatically flip responses to aversive stimuli but preserve core information of TD errors in dopamine neurons.

We examined information conveyed by glutamate inputs to dopamine neurons for RPE computation. The prevailing view of RPE computation in dopamine neurons is that inhibitory neurons are the major contributor of prediction error signals in dopamine neurons. For example, because of the heavy projection from inhibitory inputs to dopamine neurons (>70 % of all inputs)20, inhibitory neurons are often hypothesized to provide prediction error signals to dopamine neurons via disinhibition (Figure 7F)12,13,15,16,25,27,63. Recent findings showed that inhibitory projections from the lateral hypothalamus (LH), VS, and superior colliculus (SC) to VTA GABA neurons are capable of disinhibiting dopamine neurons18,37,38, although optogenetically identified GABA neurons in VTA do not show inhibitory responses to reward or reward-predicting cues31. In simple RPE models, it has been proposed that inhibitory inputs provide expectation and excitatory inputs provide actual reward information12. Further, in models of TD learning, excitatory and inhibitory inputs both provide reward expectation (or value) but at slightly shifted timing so that dopamine neurons perform subtraction (or differentiation) to compute TD errors, a specific form of RPE (Figure 7E)12,14. In stark contrast to these views, we observed striking similarity between glutamate input activity and TD error itself. Notably, glutamate signals are negatively modulated by prior expectation, opposite to the idea that they encode expectation signals (Figure 7G).
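The TD error referred to in these models combines the terms named in Figure 7E: reward r(t), discounted next-step expectation γV(t+1), and current expectation V(t). A minimal sketch of that standard formula follows; the function name `td_error` and the example values are illustrative assumptions, not quantities measured in this study.

```python
def td_error(reward, value_next, value_now, gamma=1.0):
    """Temporal-difference error: r(t) + gamma * V(t+1) - V(t)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return reward + gamma * value_next - value_now


# Illustrative cases with a reward of size 1 and no further expectation afterwards.
unexpected_reward = td_error(reward=1.0, value_next=0.0, value_now=0.0)
expected_reward = td_error(reward=1.0, value_next=0.0, value_now=1.0)
reward_omission = td_error(reward=0.0, value_next=0.0, value_now=1.0)
# A cue that raises expectation from 0 to 0.5 also produces a positive error.
cue_onset = td_error(reward=0.0, value_next=0.5, value_now=0.0)
```

In the component-based models, the subtraction of V(t) would be supplied by one input type and the remaining terms by another; the finding discussed here is that the glutamate population already resembles the full difference rather than a single term.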

A previous study found that information for reward expectation (value) and reward is distributed and mixed in single presynaptic neurons to dopamine neurons across various brain areas22. Interestingly, they found that the simple summation of those input activities shows characteristics of RPE. These results indicated that inputs with mixed information are sufficient to produce RPE. However, because the previous study could not distinguish the cell types of recorded neurons (i.e. which neurotransmitters they release), the summation did not reflect actual circuit operations. Here, we extended this investigation to show that the population activity of glutamate inputs to dopamine neurons, which are likely distributed across brain areas17, encodes TD error.

Although the activity patterns of glutamate inputs strikingly resembled those of dopamine neurons, we also found differences between their activities. Overall, glutamate input responses were positively biased compared to dopamine responses. Another important difference is that, although glutamate input showed an inhibitory dip by reward omission, similar to dopamine, glutamate input was activated by aversive air puff, in contrast to dopamine, which shows inhibitory responses (Figure 7). These results suggest that an additional input is required to generate inhibitory responses to aversive air puffs in dopamine activity. Consistently, GABA inputs to dopamine neurons such as the lHb-RMTg pathway21,25–27,64 and GABA neurons in the VTA31 are activated by aversive stimuli, and thus may send negative value information to dopamine neurons.

In addition to providing information about aversive events, the lHb-RMTg pathway is thought to provide inhibitory responses at reward omission in dopamine neurons (Figure 7F). Ablation of either Hb or RMTg specifically impairs omission responses of dopamine neurons but largely preserves other responses such as reward cue responses24,26. Although these results indicate an important role of this pathway in reward omission responses in dopamine neurons, we found that glutamate input also responds to reward omission with an inhibitory dip. Thus, the brain treats two different types of negative events in distinct ways: glutamate inputs and GABA inputs may work together to produce inhibitory responses to reward omission, while GABA neurons may need to actively cancel out excitation by glutamate inputs to produce inhibitory responses to aversive stimuli in dopamine neurons (Figure 7G). These results are consistent with the idea that the inhibitory response to reward omission is a part of TD errors4, and thus already incorporated at the early stage of TD error computation, whereas the negative value of aversive events seems to be compiled into TD errors in later stages.

Competition between glutamate and GABA inputs in response to aversive stimuli raises the possibility that dopamine responses to aversive stimuli can be flexibly adjusted by modulating the balance of opposite information from excitatory and inhibitory inputs. One such example is opioid analgesia. While opioids affect multiple sites in pain pathways, as well as reward pathways, in the brain65–67, we found that buprenorphine, a partial μ-opioid agonist/κ-opioid antagonist, dramatically decreased inhibitory responses to an aversive stimulus in dopamine neurons but it did not affect their glutamate inputs. μ-opioid receptor expression is enriched in the RMTg and its direct and indirect upstream inputs such as the parabrachial nucleus (PBN), lHb, and the lateral preoptic area of the hypothalamus (LPO)21,32–34,68, and μ-opioid receptor activation inhibits signaling of these neurons53–56. Thus, it is highly probable that opioids preferentially decrease inhibitory input to dopamine neurons and tip the excitatory-inhibitory balance of inputs to dopamine neurons towards excitation (Figure 8F). We found that inhibitory responses of dopamine neurons to an aversive stimulus were greatly reduced or even flipped to excitation in some cases, consistent with the previous finding that ablation of RMTg flips dopamine responses to aversive electric shock from inhibition to excitation26. Importantly, dopamine inhibitory responses are thought to facilitate negative learning. Hence, opioids suppress dopamine signaling for negative learning with an aversive stimulus, which could contribute to analgesia as well as addiction by undermining the transmission of negative value information or negative RPEs of aversive events.
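The proposed competition can be illustrated with a deliberately simple toy calculation. Everything in it is an assumption made for illustration: the linear subtraction of inhibition from excitation, the function name `dopamine_response`, the `gaba_gain` parameter standing in for opioid suppression of GABA input, and all numbers. It is not a model fitted to the recordings.

```python
def dopamine_response(glu_drive, gaba_drive, gaba_gain=1.0):
    """Toy net response: excitatory drive minus scaled inhibitory drive."""
    if gaba_gain < 0:
        raise ValueError("gaba_gain must be non-negative")
    return glu_drive - gaba_gain * gaba_drive


# Hypothetical drives evoked by an aversive stimulus (arbitrary units).
glu_air_puff = 1.0   # glutamate input is excited by the air puff
gaba_air_puff = 2.0  # GABA input is excited more strongly

control = dopamine_response(glu_air_puff, gaba_air_puff)
# Assumed opioid effect: GABA input is reduced, glutamate input is unchanged.
with_opioid = dopamine_response(glu_air_puff, gaba_air_puff, gaba_gain=0.4)
```

With these made-up numbers the net response changes sign from inhibition to weak excitation although the glutamate drive is identical in both conditions, which mirrors the qualitative pattern described for buprenorphine.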

Our results demonstrate that glutamate and GABA neurons provide overlapping yet complementary information for dopamine neurons to compute TD error. Our quantification indicates that glutamate input contributes to a significant portion of dopamine TD errors, yet does not send negative value of aversive stimuli. These findings lend insight into the brain’s strategy for precise transmission of information of positive and negative valence: positive responses are generated by an increase of excitatory inputs, while negative responses are generated by an increase of inhibitory inputs26,27,31. A similar idea was proposed for pain responses where activation of two different populations of neurons, ON cells (promote nociception) and OFF cells (suppress nociception), rather than activation and inhibition of ON cells, conveys pain information to inform behavioral decisions69. Following this framework, dopamine neurons may combine glutamate and GABA TD error information to overcome a neural constraint: floor effects with inhibition. To convey accurate positive and negative values to dopamine neurons, the brain may preferentially use excitation of glutamate input for dopamine activation and excitation of GABA input for dopamine inhibition, especially by aversive stimuli, to produce intact TD errors in dopamine neurons70.

Excitatory-inhibitory parallel input systems might be functionally advantageous to generate dopamine activity. First, the combination of biased RPE inputs may provide flexibility in dopamine representation. For example, by changing the weights of glutamate and GABA inputs separately, dopamine neurons may provide positively or negatively biased teaching signals to other brain areas so that learning from positive and negative events is separately adjusted according to factors such as internal state and environmental context with other neuromodulators, hormone levels and neuropeptides such as endogenous opioids. Because both glutamate and GABA inputs send (biased) TD errors, these modulations may be able to shift dopamine responses positively or negatively, potentially even flipping responses to an aversive stimulus when appropriate in a given state without distorting core information (i.e. monotonic representation) of TD errors.

Second, different combinations of inputs may contribute to dopamine neuron diversity. For example, dopamine neurons that project to ventromedial VS are activated by aversive stimuli71. It has also been reported that dopamine axons in the dorsolateral striatum exhibit positively shifted RPE compared to dopamine axons in VS and dorsomedial striatum46. Because glutamate inputs convey positively shifted RPE and show excitation to aversive stimuli, differential weighting of inputs from glutamate neurons may partially explain reported diversity in dopamine activity. At a finer scale, a recent study proposed that the learning rate for negative and positive events is adjusted in each dopamine recipient neuron to represent the complete distribution of rewards, rather than simply expected value (i.e. distributional reinforcement learning72,73). Diverse weights of positively and negatively biased input onto dopamine neurons may provide raw material for learning such distributional representations.

Where do RPE signals come from? Our study does not distinguish sources of information. While local inputs within VTA are good candidates, they may not fully explain our results. For example, the inhibitory responses to reward omission have not been observed in glutamate neurons in VTA with fiber fluorometry and electrophysiology recordings74–76, suggesting a strong effect of a minority of glutamate neurons in VTA or a contribution of glutamate inputs from outside of VTA.

We note some technical limitations of the current study. The genetically encoded glutamate sensor is expressed not only at postsynaptic sites but spreads widely, and we recorded only the summation of those activities. While spillover signals were detected from the dendritic shaft near spines with SF-iGluSnFR, those signals at the dendritic shaft were small and largely similar to signals at nearby spines39. Because the spillover signals are slow, they might cause complex effects on neural activity through metabotropic receptors77,78, although the estimated concentration of released glutamate at the synaptic cleft is much higher than at the extra/peri-synapse79–81. More recent glutamate sensors improved in the signal-to-noise ratio, kinetics, and/or synaptic localization47,48. We observed major characteristics of RPE in recording with an alternative sensor, iGluSnFR3 v857-GPI, which bolsters our conclusions. Notably, however, even the newer sensor that we utilized does not exclusively localize to synapses, and our recording only measures the summation of all signals. Understanding where critical information comes from and how information at each synapse is integrated to compute TD errors in dopamine neurons needs a deeper examination of each synapse and internal events within dopamine neurons82,83.

Taken together, we found that TD error signals are already conveyed in the population activity of glutamate inputs, in contrast to existing models for TD error computation. Because fiber-fluorometry signals are the sum of signals across all inputs, the current observations cannot distinguish whether TD error computations took place at dopamine neurons or upstream. However, previous single neuron recording found that information for RPE in presynaptic neurons is mixed but not as complete as that conveyed by dopamine neurons, suggesting partial computation at multiple nodes, but completion at the level of dopamine neurons22. In the present study, we demonstrate that TD error signals carried by glutamate inputs are positively biased, and lack inhibitory responses to an aversive stimulus, strongly suggesting that mixing with GABA inputs is critical to form complete TD errors. While RPE is observed in various brain areas22,25,28,45,84,85, dopamine neurons may uniquely represent “complete” TD errors by mixing glutamate TD errors and GABA TD errors from distributed sources.

STAR Methods

RESOURCE AVAILABILITY

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Mitsuko Watabe-Uchida (mitsuko@mcb.harvard.edu).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • Glutamate input and dopamine fluorometry data are deposited at Dryad (https://doi.org/10.5061/dryad.ncjsxkt25) and are publicly available as of the date of publication. DOI is listed in the key resources table.

  • MATLAB codes used to obtain the results are available at GitHub (https://github.com/VTA-SNc/Amo-2023) and are publicly available as of the date of publication. DOI is listed in the key resources table.

  • Any additional information required to reanalyze the data reported in this paper is available from the Lead Contact upon request.

Key resources table.
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Anti-tyrosine hydroxylase (TH) | EMD Millipore | RRID: AB_390204
Anti-Green Fluorescent Protein Antibody | Aves Labs | RRID: AB_2307313
F(ab’)2-Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor 647 | Invitrogen | RRID: AB_2535814
Goat anti-Chicken IgY (H+L) Secondary Antibody, Alexa Fluor 488 | Invitrogen | RRID: AB_2534096
Bacterial and virus strains
AAV5-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; UNC Vector Core | Custom preparation
AAV8-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; UNC Vector Core | Custom preparation
AAV1-hSyn-SF-iGluSnFR(A184S) | Marvin et al.39; Addgene | #106174-AAV1
AAV1-hSyn-FLEX-iGluSnFR3 v857-GPI | Aggarwal et al.47; Addgene | #175181-AAV1
AAV8-hSyn-FLEX-jGCaMP7f | Dana et al.90; Addgene | #104492-AAV8
AAV9-hSyn-DA2m | Sun et al.40; ViGene bioscience | N/A
AAV5-CAG-FLEX-tdTomato | Gift from Ed Boyden (unpublished); UNC Vector Core | AAV In Stock Vectors: Ed Boyden
AAV8-hSyn-FLEX-ChrimsonR-tdTomato | Klapoetke et al.42; UNC Vector Core | AAV In Stock Vectors: Ed Boyden
Chemicals, peptides, and recombinant proteins
Buprenorphine | Patterson Veterinary | Cat #07-892-5235
Deposited data
Fiber-fluorometry recording data | This paper | Dryad (doi:10.5061/dryad.ncjsxkt25)
Experimental models: Organisms/strains
Mouse: B6.SJL-Slc6a3tm1.1(cre)Bkmn/J | The Jackson Laboratory | RRID: IMSR_JAX:006660
Mouse: B6.Cg-Gt(ROSA)26Sortm14(CAG-tdTomato)Hze/J | The Jackson Laboratory | RRID: IMSR_JAX:007914
Mouse: B6;129S-Slc17a8tm1.1(cre)Hze/J | The Jackson Laboratory | RRID: IMSR_JAX:028534
Mouse: Slc17a6tm2(cre)Lowl/J | The Jackson Laboratory | RRID: IMSR_JAX:016963
Recombinant DNA
pGP-AAV-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; Addgene | RRID: Addgene_106186
Software and algorithms
MATLAB | MathWorks | RRID: SCR_001622
LabVIEW | National Instruments | RRID: SCR_014325
MATLAB code | This paper | GitHub (doi: 10.5281/zenodo.10407856)
Other
Mono Fiber-optic Cannulas | Doric Lenses | Cat # MFC_400/430-0.66_5mm_MF1.25_FLT
Mono Fiber-optic Cannulas | Doric Lenses | Cat # MFC_400/430-0.48_5mm_MF1.25_FLT

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS

Animals

68 female and male mice, 2–19 months old, were used in this study. We used mice heterozygous for the DAT-Cre (Slc6a3tm1.1(cre)Bkmn; The Jackson Laboratory, 006660)86, LSL-tdTomato (Gt(ROSA)26Sortm14(CAG-tdTomato)Hze; The Jackson Laboratory, 007914)87, and vGluT3-Cre (Slc17a8tm1.1(cre)Hze; The Jackson Laboratory, 028534)88 transgenic lines, and mice homozygous for the vGluT2-Cre (Slc17a6tm2(cre)Lowl/J; The Jackson Laboratory, 016963)89 line. DAT-Cre lines crossed with LSL-tdTomato were used for some sensor recordings. vGluT3-Cre lines crossed with LSL-tdTomato were used for some of the experiments with DA sensor, without use of Cre recombinase. Mice were housed on a 12 hr dark (7:00–19:00)/12 hr light (19:00–7:00) cycle. Experiments were performed in the dark period. Ambient temperature was kept at 23±2.7 °C and humidity was kept below 50%. All procedures were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and approved by the Harvard Animal Care and Use Committee.

METHOD DETAILS

Virus

pGP-AAV-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE (gift from Loren Looger; Addgene, #106186)39 was packaged into AAV at UNC Vector Core (AAV5-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE, 2.4 × 10^13 vg/ml; AAV8-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE, 0.88 × 10^13 and 1.7 × 10^13 vg/ml). In addition to the above, the following AAVs were used in this study: AAV8-hSyn-FLEX-jGCaMP7f (gift from Douglas Kim & GENIE Project; 1.8 × 10^13 vg/ml; Addgene, #104492-AAV8)90, AAV1-hSyn-SF-iGluSnFR(A184S) (gift from Loren Looger; 2.1 × 10^13 vg/ml; Addgene, #106174-AAV1)39, AAV9-hSyn-DA2m (gift from Yulong Li; 1.01 × 10^13 vg/ml; ViGene bioscience)40, AAV5-CAG-FLEX-tdTomato (gift from Edward Boyden; 7.8 × 10^12 vg/ml; UNC Vector Core), AAV8-hSyn-FLEX-ChrimsonR-tdTomato (gift from Edward Boyden; 3.7 × 10^12 vg/ml; UNC Vector Core)42, and AAV1-hSyn-FLEX-iGluSnFR3 v857-GPI (gift from Kaspar Podgorski; 2.4 × 10^13 vg/ml; Addgene, #175181-AAV1)47.

Surgical procedures

The surgery was performed under aseptic conditions as previously described30. Mice were anesthetized with isoflurane (1–2% at 0.5–1 L/min) and local anesthetic (lidocaine (2%)/bupivacaine (0.5%) 1:1 mixture, S.C.) was applied at the incision site. Analgesia (ketoprofen for post-operative treatment, 5 mg/kg, I.P.; buprenorphine for pre-operative treatment, 0.1 mg/kg, I.P.) was administered for 3 days following surgery. A custom-made head-plate was placed on the well-cleaned and dried skull with adhesive cement (C&B Metabond, Parkell) containing a small amount of charcoal powder. For expression of glutamate sensor in the dopamine neurons, AAV8 (or AAV5)-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE or mixture with AAV5-CAG-FLEX-tdTomato (2–8:1) was injected unilaterally in the VTA (total 600 nl, Bregma −3.05–3.15 mm AP, 0.375–0.75 mm ML, 4.35 mm DV from dura) in DAT-Cre/LSL-tdTomato mice or DAT-Cre mice, respectively. AAV1-hSyn-FLEX-iGluSnFR3 v857-GPI (1–10 times dilution; 500 nl) was injected in the VTA in DAT-Cre/LSL-tdTomato mice using the same coordinates as for the SF-iGluSnFR(A184S) virus. For expression of GCaMP in dopamine neurons, AAV8-hSyn-FLEX-jGCaMP7f or mixed solution with AAV5-CAG-FLEX-tdTomato (2:1) was injected unilaterally in the VTA (total 500 nl, Bregma −3.05 mm AP, 0.325–0.75 mm ML, 4.35 mm DV from dura) in DAT-Cre/LSL-tdTomato mice or DAT-Cre mice, respectively. For expression of dopamine sensor in the VS, AAV9-hSyn-DA2m was injected unilaterally in the VS (300 nl, Bregma +1.45 AP, 1.4 ML, 4.35 DV from dura) in DAT-Cre/LSL-tdTomato mice or vGluT3-Cre/LSL-tdTomato mice.
For the optogenetic stimulation experiment, AAV8-hSyn-FLEX-ChrimsonR-tdTomato was injected unilaterally in the VP, STh, and PPTg (VP: 300 nl, 20° angled (tip is directed to medial), Bregma +1.0 AP, 2.5 ML, 4.2 DV from dura; STh: 300 nl, Bregma −2.2 AP, 1.5 ML, 4.5 DV from dura; PPTg: 300 nl, 25° angled (tip is directed to medial), Bregma −4.75 AP, 2.35 ML, 4.5 DV from dura) in vGluT2-Cre mice to express ChrimsonR in glutamate input to the VTA, and AAV1-hSyn-SF-iGluSnFR(A184S)/AAV5-CAG-FLEX-tdTomato (8:3) or AAV9-hSyn-DA2m was injected in the VTA (250 nl, Bregma −3.1 mm AP, 0.5 mm ML, 4.35 mm DV from dura) or VS (300 nl, Bregma +1.45 AP, 1.4 ML, 4.2 DV from dura), respectively. A glass pipette containing AAV was slowly lowered to the target over the course of a few minutes and held in place for 2 minutes to stabilize. AAV solution was slowly injected (~15 min) and the pipette was left in place for 10–20 min. Then the pipette was slowly removed over the course of several minutes to prevent leakage of virus and damage to the tissue. An optical fiber (400 μm core diameter, 0.66 or 0.48 NA; Doric) was implanted in the VS (Bregma +1.45 mm AP, 1.4 mm ML, 4.1–4.0 mm DV from dura) or the VTA (Bregma −3.05 mm AP, 0.325–0.75 mm ML, 4.15–4.3 mm DV from dura). For the optogenetic stimulation, a stimulation fiber was implanted in the VTA (15° angled (tip is directed to caudal), Bregma −1.85 mm AP, 0.5 mm ML, 4.1 mm DV from dura), and a recording fiber was implanted in the VTA (Bregma −3.1 mm AP, 0.5 mm ML, 4.2 mm DV from dura) or VS (15° angled (tip is directed to caudal), Bregma +2.7 AP, 1.4 ML, 4.35 DV from dura), respectively. The fiber was slowly lowered to the target and fixed with adhesive cement (C&B Metabond, Parkell) containing charcoal powder to prevent contamination by ambient light and leakage of laser light. A small amount of rapid-curing epoxy (Devcon, A00254) was applied on the cement to secure the fiber.

Fiber-fluorometry (photometry)

To effectively collect the fluorescence signal from the deep brain structure, we used a custom-made fiber-fluorometry system as previously described30,41. Blue light from a 473 nm DPSS laser (Opto Engine LLC) and green light from a 561 nm DPSS laser (Opto Engine LLC) were attenuated through a neutral density filter (4.0 optical density, Thorlabs) and coupled into an optical fiber patch cord (400 μm, Doric) using a 0.65 NA 20x objective lens (Olympus). This patch cord was connected to the implanted fiber to deliver excitation light to the brain and collect the fluorescence emission signals from the brain simultaneously. The green and red fluorescence signals from the brain were spectrally separated from the excitation lights using a dichroic mirror (FF01–493/574-Di01, Semrock). The fluorescence signals were separated into green (for GCaMP, DA2m and SF-iGluSnFR) and red signals (for tdTomato) using another dichroic mirror (T556lpxr, Chroma), passed through a band pass filter (ET500/40x for green, Chroma; FF01–661/20–25 for red, Semrock), focused onto a photodetector (FDS10X10, Thorlabs), and connected to a current amplifier (SR570, Stanford Research Systems). The preamplifier outputs (voltage signals) were digitized through a NIDAQ board (PCI-e6321, National Instruments) and stored in a computer using custom software written in LabVIEW (National Instruments). The sampling rate was 1000 Hz. Light intensity at the tip of the patch cord was adjusted to 100–200 μW (glutamate sensor and GCaMP) and 50 μW (dopamine sensor). Because iGluSnFR3 v857-GPI is derived from Venus, the fluorescence is shifted to yellow. For iGluSnFR3 v857-GPI detection, the yellow (for iGluSnFR3 v857-GPI) and red (for tdTomato) fluorescence signals from the brain were separated from the excitation lights using a dichroic mirror (Di01-R488/561–25×36, Semrock). 
The fluorescence signals were separated into yellow and red signals using another dichroic mirror (T556lpxr, Chroma), passed through a band pass filter (ET525/50m for yellow, Chroma; FF01–661/20–25 for red, Semrock).

Optogenetic stimulation

Red light from a 625 nm LED (M625F2, Thorlabs) was applied through an optical patch cord (400 μm, 0.39 NA, Thorlabs). A single 12 mW square light pulse of 250 ms duration was triggered through custom software written in LabVIEW (National Instruments) via a NIDAQ board (PCI-e6321, National Instruments). A variable inter-trial interval (ITI) with a flat hazard function (minimum 10 s, mean 13 s, truncated at 20 s) was placed between trials.

Histology

All mice used in the experiments were examined histologically to confirm the fiber position as previously described30. The mice were deeply anesthetized by an overdose of ketamine/medetomidine, exsanguinated with phosphate buffered saline (PBS), and perfused with 4% paraformaldehyde (PFA) in PBS. The brain was dissected from the skull and immersed in 4% PFA for 12–24 hours at 4 °C. The brain was rinsed with PBS and sectioned (100 μm) with a vibrating microtome (VT1000S, Leica). Immunohistochemistry with TH antibody (AB152, Millipore Sigma; 1/750) was performed to identify dopamine neurons, and with GFP antibody (GFP-1010, Aves Labs; 1/3000) to localize sensor-expressing areas when GCaMP and glutamate sensor raw signals were not strong enough. The sections were mounted on a glass slide with a mounting medium containing 4’,6-diamidino-2-phenylindole (VECTASHIELD, Vector Laboratories) and imaged with Axio Scan.Z1 (Zeiss) or LSM880 with FLIM (Zeiss). For quantification of glutamate sensor expression, a 300 μm × 300 μm square at the putative center of infection was selected from a confocal optical section image of the VTA.

Behavior

After 1 week of recovery from surgery, mice were water-restricted in their cages. All conditioning tasks were controlled by a NIDAQ board and LabVIEW. Mice were handled for 2 days, acclimated to the experimental setup for 1–2 days including consumption of water from the tube, and head-fixed with random interval water for 1–3 days until they showed reliable water consumption. For odor-based classical conditioning, all mice were head-fixed, and the volume of water reward was constant for all reward trials (predicted or unpredicted) in all conditions (6 μl). Some sessions included mild air puff trials, in which the air puff was directed at one of the eyes; the intensity of the air puff was constant for all air puff trials (predicted or unpredicted; 2.5 psi). In classical conditioning, each association trial began with an odor cue (lasting 1 s) followed by a 2 s delay, and then an outcome (either water, nothing, or air puff) was delivered. In sequential conditioning, each association trial began with a distal odor cue (lasting 1 s) followed by a 2 s delay, and then a proximal odor cue (lasting 1 s) followed by a 1 s delay, and then an outcome (either water or nothing) was delivered. Some trial types began with a proximal odor cue (lasting 1 s) followed by a 1 s delay, and then an outcome (either water or nothing) was delivered. Odors were delivered using a custom olfactometer91. Each odor was dissolved in mineral oil at 1:10 dilution and 30 μl of diluted odor solution was applied to the syringe filter (2.7 μm pore, 13 mm; Whatman, 6823–1327). Odorized air was further diluted with filtered air by 1:8 to produce a 900 ml/min total flow rate. Different sets of odors (Ethyl butyrate, p-Cymene, Isoamyl acetate, Isobutyl propionate, 1-Butanol, 4-Methylanisole, Caproic acid, Eugenol, and 1-Hexanol) were selected for each animal. Some of the animals shared the same odor set (4 animals and 3 animals for DA sensor classical conditioning; 3 animals and 2 animals for sequential conditioning).

A variable inter-trial interval (ITI) with a flat hazard function (minimum 10 s, mean 13 s, truncated at 20 s) was placed between trials. Each session was composed of multiple blocks (12–24 trials/block) and all trial types were pseudorandomized in each block. Each day, the mice performed about 70–350 trials over the course of 20–75 min; in recording sessions, laser excitation was constant and recording was continuous.
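A flat hazard function corresponds to an exponential distribution of waiting times. The following is a minimal sketch of how such ITIs could be drawn, not the authors' LabVIEW implementation. It assumes that the stated 13 s mean refers to the distribution before truncation (so the realized mean is slightly lower) and that draws beyond 20 s are rejected and redrawn; the function name `sample_iti` and its parameters are hypothetical.

```python
import random

def sample_iti(rng, minimum=10.0, mean=13.0, maximum=20.0):
    """Draw one ITI (s) from a shifted exponential truncated at `maximum`.

    Assumption: `mean` is the mean before truncation, so the exponential
    part has scale (mean - minimum).
    """
    scale = mean - minimum
    while True:
        iti = minimum + rng.expovariate(1.0 / scale)
        if iti <= maximum:  # reject draws beyond the truncation point
            return iti

rng = random.Random(0)  # fixed seed only to make the example reproducible
itis = [sample_iti(rng) for _ in range(10000)]
print(min(itis), max(itis), sum(itis) / len(itis))
```

Rejection sampling keeps the hazard flat up to the truncation point; clipping long draws to 20 s instead would pile probability mass at the maximum.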

Training for classical conditioning (Figures 2, 4, and 6; Figure S5) used 4 types of trials: odor cue predicting 100% water, odor cue predicting 40% water/60% no outcome (nothing), odor cue predicting nothing (29.4% of all trials for each odor), and water without cue (free water) (11.8%) for 7–10 days, and then odor cue predicting 80% water/20% nothing, odor cue predicting 40% water/60% nothing, odor cue predicting nothing (29.4% each), and free water (11.8%) for more than 2 days of training followed by recording sessions. The first 170 trials, or the trials before the animal stopped licking, in each session were used for analysis. Three animals used for glutamate sensor recording (Figure 2) were trained in classical conditioning with air puff trials: odor cue predicting 100% water, odor cue predicting 100% air puff, odor cue predicting 40% water/60% no outcome (nothing), odor cue predicting nothing (20.8% of all trials for each odor), and water without cue (free water) and air puff without cue (free air puff) (8.3% of all trials for each stimulus) for 8–9 days, and then odor cue predicting 80% water/20% nothing, odor cue predicting 80% air puff/20% nothing, odor cue predicting 40% water/60% nothing, odor cue predicting nothing (20.8% each), and water without cue (free water) and air puff without cue (free air puff) (8.3% each) for more than 7 days of training followed by recording sessions. The first 192 trials, or the trials before the animal stopped licking, in each session were used for analysis. Of note, these 3 animals were used only in the analysis of the glutamate sensor activity pattern (Figure 2), and were not included in the comparison between different sensor recordings (Figures 4 and 6).
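The per-block trial counts are not given in the text. As an illustration only, a hypothetical 17-trial block with 5 trials for each of the three odors and 2 free-water trials reproduces the stated 29.4% and 11.8%. The sketch below builds and shuffles such a block; the names `BLOCK_COUNTS`, `make_block`, and `proportions` are invented for this example.

```python
import random

# Hypothetical per-block counts, chosen only because they reproduce
# the percentages stated in the text (5/17 = 29.4%, 2/17 = 11.8%).
BLOCK_COUNTS = {
    "odor_80pct_water": 5,
    "odor_40pct_water": 5,
    "odor_nothing": 5,
    "free_water": 2,
}

def make_block(counts, rng):
    """Return one block with every trial type in pseudorandom order."""
    trials = [name for name, n in counts.items() for _ in range(n)]
    rng.shuffle(trials)
    return trials

def proportions(counts):
    """Percentage of each trial type within a block, rounded to 0.1%."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}

rng = random.Random(1)  # fixed seed only for reproducibility
block = make_block(BLOCK_COUNTS, rng)
print(proportions(BLOCK_COUNTS))
```

Shuffling within each block keeps the trial-type frequencies exact over any whole number of blocks, which is one common reason for block-wise pseudorandomization.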

For sequential conditioning step 1 (Figure 3), mice were trained with proximal odor for 3–7 days, using 3 types of trials: proximal odor cue predicting 100% water, proximal odor cue predicting nothing (45.8% of all trials for each odor), and water without cue (free water) (8.3%). Then, for sequential conditioning step 2, mice were further trained with distal odor and proximal odor using 6 types of trials: distal odor cue predicting proximal odor cue predicting 100% water (25%), distal odor cue predicting 50% proximal odor cue predicting water/50% proximal odor cue predicting nothing (25%), distal odor cue predicting proximal odor cue predicting nothing (25%), proximal odor cue predicting 100% water (8.3%), proximal odor cue predicting nothing (8.3%), and water without cue (free water) (8.3%). Sensor signals were recorded after 8–12 days of step 2 training. The first 192 trials of each session were used for analysis.

For buprenorphine treatment, mice received unexpected air puff and water with a variable ITI with a flat hazard function (minimum 10 s, mean 13 s, truncated at 20 s). Animals were first acclimated to head-fixation with water reward for 2 days. Then the animals were habituated for two days to a test procedure, which was composed of 40–70 trials of air puff and water presented in pseudorandomized order (the order of trials was pseudorandomized within each block of 10 trials composed of the same number of air puff trials and reward trials), with subcutaneous injection of saline (10 μl/g body weight) at the end of sessions. After habituation sessions, buprenorphine sessions and control saline sessions were performed in pseudorandom order. Each session was separated into a pre-injection sub-session and a post-injection sub-session. Each sub-session was composed of 48–75 trials with the air puff and water trials in pseudorandom order. After the pre-injection sub-session, either buprenorphine (Buprenorphine hydrochloride, Par Pharmaceutical; diluted in saline to 0.03 mg/ml) or control saline was subcutaneously injected (10 μl/g; 0.3 μg/g buprenorphine or corresponding volume of saline). The post-injection sub-session started 5 min after injection. To prevent acute adverse effects, we used a smaller dose of buprenorphine (0.15 μg/g) in the first session, and this session was not included in the analysis.
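As a unit check on the stated dose, the concentration and injection volume given above multiply out as follows (using 1 mg/ml = 1 μg/μl):

```latex
\[
0.03\ \mathrm{mg/ml} \times 10\ \mu\mathrm{l/g}
= 0.03\ \mu\mathrm{g}/\mu\mathrm{l} \times 10\ \mu\mathrm{l/g}
= 0.3\ \mu\mathrm{g/g}
\]
```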

Data analysis

Fiber-fluorometry

The noise from the power line in the voltage signal was removed by filtering out 58–62 Hz signals with a band stop filter. The global change within a session was normalized using a moving median of 100 s. Then, the correlation between green and red signals during ITI was examined by linear regression for the glutamate sensor signal, except for optogenetic stimulation experiments. If the correlation was significant (p ≤ 0.05), fitted tdTomato signals were subtracted from green signals. Responses were calculated by subtracting the average baseline activity from the average activity of the target window. Z-scores of the signals were obtained using the mean and standard deviation of signals in all ITIs (from 5 s before odor onset to odor onset for classical conditioning, from 5 s before distal odor onset to odor onset for sequential conditioning) in each animal. Stimulus responses were measured as the average activity in the analysis window (1–1000 ms for odor response in classical conditioning, 1–1000 ms for distal odor response in sequential conditioning, 201–1200 ms for proximal odor response in sequential conditioning, 201–1200 ms for water and air puff response, 1501–2500 ms for omission response, 1–500 ms for optogenetic stimulation response).
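The z-scoring and window-response steps can be sketched as below. This is an illustration in Python, not the authors' Matlab code, and it omits the band stop filter, the moving-median normalization, and the tdTomato regression. It relies on the 1000 Hz sampling rate stated earlier, so one sample equals 1 ms. The function names, the `baseline_ms` parameter (the baseline window for response subtraction is not specified in the text), and the use of the population standard deviation are assumptions.

```python
from statistics import fmean, pstdev

def zscore(signal, baseline_values):
    """Z-score `signal` using the mean and SD of pooled ITI samples.

    Assumption: population SD; the text does not say which estimator was used.
    """
    mu = fmean(baseline_values)
    sd = pstdev(baseline_values)
    return [(x - mu) / sd for x in signal]

def window_response(trace, onset, start_ms, end_ms, baseline_ms=1000):
    """Mean activity in [start_ms, end_ms] after `onset` minus the mean of
    the `baseline_ms` samples before `onset` (hypothetical baseline length).

    Indices are in samples, which equal milliseconds at 1000 Hz.
    """
    target = trace[onset + start_ms : onset + end_ms + 1]
    baseline = trace[onset - baseline_ms : onset]
    return fmean(target) - fmean(baseline)

# Toy trace: flat baseline, then a step of 2 units at sample 2000.
trace = [0.0] * 2000 + [2.0] * 2000
water_response = window_response(trace, onset=2000, start_ms=201, end_ms=1200)
print(water_response)
```

The 201–1200 ms window in the usage line matches the window given for water and air puff responses; other windows from the text can be passed in the same way.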

Quantification of rise and decay of sensor signals

Rise time of the glutamate sensor and dopamine sensor signals for free water and optogenetic stimulation was defined as the latency from the time of 10% of the peak signal to the time of 90% of the peak signal. Decay time was defined as the latency from the peak time to the time of 50% of the peak activity. The activity peak during the response period (3 s from the stimulus onset) was detected by finding the maximum response in moving windows of 20 ms that exceeded 2 × the standard deviation of baseline activity (moving windows of 20 ms during −1 to 0 s from odor onset).
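The rise and decay definitions above can be illustrated with a simplified sketch. It assumes a baseline-subtracted average trace with a positive peak, and it skips the 20 ms moving-window smoothing and the 2 × SD threshold used for peak detection in the text. The function name `rise_and_decay_ms` is hypothetical.

```python
def rise_and_decay_ms(trace, dt_ms=1.0):
    """Return (rise, decay) in ms for a baseline-subtracted trace.

    rise  = time from first reaching 10% of peak to first reaching 90% of peak
    decay = time from the peak to first falling to 50% of peak (None if never)
    Assumes the peak is positive.
    """
    peak_idx = max(range(len(trace)), key=lambda i: trace[i])
    peak = trace[peak_idx]
    # First samples up to the peak that reach 10% and 90% of the peak value.
    t10 = next(i for i in range(peak_idx + 1) if trace[i] >= 0.1 * peak)
    t90 = next(i for i in range(peak_idx + 1) if trace[i] >= 0.9 * peak)
    # First sample from the peak onward that has fallen to 50% of the peak.
    t50 = next(
        (i for i in range(peak_idx, len(trace)) if trace[i] <= 0.5 * peak),
        None,
    )
    rise = (t90 - t10) * dt_ms
    decay = None if t50 is None else (t50 - peak_idx) * dt_ms
    return rise, decay

# Toy triangular response: rises 0..10 in unit steps, then falls back to 0.
toy = list(range(0, 11)) + list(range(9, -1, -1))
rise, decay = rise_and_decay_ms(toy)
print(rise, decay)
```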

Test for monotonic relationship between neural responses and reward outcome

The responses in the sensor signals were normalized by the average responses to chosen events in one session as follows: 80% reward odor for odor responses in classical conditioning (1–1000 ms), odor 1 for distal odor in sequential conditioning (1–1000 ms), and unexpected odor A for proximal odor in sequential conditioning (201–1200 ms). Responses in each trial were fitted against reward outcome by linear regression for each animal. Zero-crossing points were determined as the x-intercepts of the fitted lines to estimate the reward probability that produces zero response in sensor signals to the odor in classical conditioning and to the distal odor in sequential conditioning for each animal.
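The zero-crossing estimate is the x-intercept of an ordinary least-squares line. The sketch below is illustrative: it treats the predictor as the reward probability associated with each odor, which is one reading of "reward outcome", and the response values are invented for the example rather than taken from the study. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def zero_crossing(x, y):
    """x-intercept of the fitted line: the x at which the fit predicts 0."""
    slope, intercept = fit_line(x, y)
    return -intercept / slope

# Invented example: normalized cue responses for 0%, 40% and 80% reward odors.
reward_probability = [0.0, 0.4, 0.8]
response = [-0.2, 0.2, 0.6]
crossing = zero_crossing(reward_probability, response)
print(crossing)
```

In this invented example the fitted line crosses zero at a reward probability of about 0.2; a crossing above zero means that cues predicting a low reward probability evoke responses below baseline.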

Estimation of dopamine sensor signal based on the glutamate sensor signal

To estimate dopamine sensor signals from the glutamate sensor signals, we deconvolved glutamate sensor signals in sequential conditioning in each animal (average in 3 sessions) with “glutamate kernel” (interval of 200 ms) using optogenetic stimulation responses in glutamate sensor signals (0–3.5 s)92, and then convolved the resulting trace with “dopamine kernel” using optogenetic responses in dopamine sensor signals (0–3.5 s) (Figure 5C). We used down-sampled (every 20 ms) responses in all trials for the model fitting. We used different kernels for each trial type (Figure 3A). Odor kernels consist of 6 types of kernels: ‘Odor1-OdorA-water’, ‘Odor2-OdorA-water’, ‘Odor2-OdorB-nothing’, and ‘Odor3-OdorB-nothing’ kernels to span 0 to 5 s from distal odor onset, and ‘No odor-OdorA-water’ and ‘No odor-OdorB-nothing’ kernels to span 0 to 2 s from proximal odor onset. Water kernels consist of 2 types of kernels: ‘expected water’ kernels in trials with water after OdorA, and ‘free water’ kernels in trials with unexpected water to span 0–4 s from water onset. Glutamate sensor responses were fitted with all the kernels using linear regression with elastic net regularization (α = 0.75) with 10-fold cross validation. The regularization coefficient lambda was chosen so that the cross-validation error was the minimum plus one standard deviation. Glutamate kernels were swapped with “dopamine kernel” using optogenetic responses in dopamine sensor signals. The amplitude of the obtained trace was adjusted by linear regression with DA sensor responses in free water trials. The percentage explained by an estimated DA sensor signal was expressed as the reduction in variance of the residual responses compared to the recorded DA sensor responses.
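The final "% explained" measure can be written compactly. The sketch below reads it as 100 × (1 − Var(recorded − estimated) / Var(recorded)), which is one natural interpretation of the description above and not a confirmed transcription of the authors' Matlab code. The function name `percent_explained` and the toy values are hypothetical, and the deconvolution and elastic net steps are not reproduced here.

```python
from statistics import pvariance

def percent_explained(recorded, estimated):
    """Percent reduction in variance after subtracting the estimate.

    Assumption: variance of the residual (recorded - estimated) is compared
    with the variance of the recorded signal itself.
    """
    residual = [r - e for r, e in zip(recorded, estimated)]
    return 100.0 * (1.0 - pvariance(residual) / pvariance(recorded))

# Toy traces: a perfect estimate explains 100%, a flat estimate explains 0%.
recorded = [0.0, 1.0, 2.0, 3.0]
perfect = percent_explained(recorded, [0.0, 1.0, 2.0, 3.0])
flat = percent_explained(recorded, [0.0, 0.0, 0.0, 0.0])
print(perfect, flat)
```

Under this definition the value can be negative when the estimate adds variance rather than removing it.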

Quantification of the effect of buprenorphine administration

Air puff response (1–1000 ms) was quantified as the average of 4 sessions for each animal for each condition in 6 animals. We used fewer sessions for two animals because of weak signals (peak of water response below a z-score of 2) or system problems (1 buprenorphine session and 2 saline sessions for one animal and 3 buprenorphine sessions and 4 saline sessions for one animal). We used all trials of all sessions to test the significance of the response for each animal (Figures 8B and 8D).

Licking

Licking from a water spout was detected by a photoelectric sensor that produces a change in voltage when the light path is broken. The timing of each lick was detected at the peak of the voltage signal above a threshold. To plot the time course of licking patterns, the lick rate was calculated by a moving average of a 200 ms window.
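The lick-rate calculation can be sketched as a moving average over a binary lick train. The sketch assumes lick times given as integer milliseconds and a window centered on each time point; whether the original window was centered or causal is not stated. The function name `lick_rate` is hypothetical.

```python
def lick_rate(lick_times_ms, duration_ms, window_ms=200):
    """Lick rate (licks/s) at each millisecond from a moving average.

    Assumption: the window is centered on each time point. Near the edges
    the window is truncated but still divided by the full window length,
    so rates there are underestimated.
    """
    counts = [0] * duration_ms
    for t in lick_times_ms:
        if 0 <= t < duration_ms:
            counts[t] += 1
    half = window_ms // 2
    window_s = window_ms / 1000.0
    rate = []
    for i in range(duration_ms):
        lo, hi = max(0, i - half), min(duration_ms, i + half)
        rate.append(sum(counts[lo:hi]) / window_s)
    return rate

# A single lick at 500 ms in a 1 s trial.
rate = lick_rate([500], duration_ms=1000)
print(rate[500], rate[100])
```

A single lick contributes 1 / 0.2 s = 5 licks/s to every time point whose window contains it, which is why the trace for an isolated lick is a 200 ms plateau.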

QUANTIFICATION AND STATISTICAL ANALYSIS

All analyses were performed using custom software written in Matlab (MathWorks). All statistical tests were two-sided. A boxplot indicates 25th and 75th percentiles as the bottom and top edges, respectively. The center line indicates the median. The whiskers extend to the most extreme data points that are not considered outliers. In other graphs, error bars show standard errors. Asterisks in the figures stand for *** p ≤ 0.001, ** p ≤ 0.01, and * p ≤ 0.05. “n.s.” stands for p > 0.05. Linear fitting was performed to test the relation of responses between different trial types (all trials in a session were used to test each session and the average of sessions was used to test all animals), and to test for significance of the model fitting, the p-value for the F-test on the model was calculated (Figures 2E and 2G; n = 15 animals, Figures 3C, 3E, 3G, 3I, 5F, and 5G; n = 5 animals, Figure 6A bottom; n = 12 animals for glutamate sensor, n = 11 animals for GCaMP, n = 11 animals for dopamine sensor, Figure S3A middle and S3C middle; n = 5 animals, Figure S5A middle and S5C middle; n = 3 animals). 
One-sample t-test was performed to test if the mean of a data set is not equal to zero (Figure 1E; n = 9 animals for both glutamate input and dopamine sensor, Figures 2E and 2G; n = 15 animals, Figures 3C and 3E; n = 5 animals, Figures 3G and 3I; n = 5 animals, Figure 4B; n = 12 animals for glutamate sensor, n = 11 animals for GCaMP, n = 11 animals for dopamine sensor, Figure 4E; n = 5 animals for both glutamate sensor and dopamine sensor, Figures 5F right and 5G right; n = 5 animals for both estimated dopamine and residual signal, Figures 6C and 6D; n = 12 animals for glutamate sensor, n = 11 animals for GCaMP, n = 11 animals for dopamine sensor, Figure 6G left and middle; n = 5 animals for both glutamate sensor and dopamine sensor, Figure 7C; n = 8 animals for glutamate sensor, n = 9 animals for GCaMP, n = 7 animals for dopamine sensor, Figures 8B and 8D; n = 8 animals, Figures S3A right and C right, n = 5 animals, Figure S5B right and S5D right; n = 3 sessions for each mouse, S5F and S5H; n = 3 sessions for each mouse). To compare the difference of the mean between two groups, two-sample t-test was performed (Figures 1G and 1H; n = 4 animals for ChrimsonR, n = 3 animals for tdTomato, Figure 1I; n = 4 animals for glutamate sensor and dopamine sensor). To compare the difference of the mean between pairs of measurement, paired t-test was performed (Figure 1F; n = 9 animals, Figure 4E; n = 5 animals, Figure 5F right and 5G right; n = 5 animals, Figure 6G; n = 5 animals, Figures 8B and 8D; n = 8 animals, Figure S1C; n = 9 animals). 
For comparison of three or more measurements, one-way ANOVA followed by Tukey’s test was performed (Figures 4B, 6C, and 6D right; n = 12 animals for glutamate sensor, n = 11 animals for GCaMP, n = 11 animals for dopamine sensor, Figure 7C; n = 8 animals for glutamate sensor, n = 9 animals for GCaMP, n = 7 animals for dopamine sensor), and when the measurement contained two independent variables, two-way ANOVA was performed (Figure 6E; n = 12 animals for glutamate sensor and n = 11 animals for GCaMP, Figure 7D; n = 12 animals for glutamate sensor and n = 11 animals for GCaMP). The two-sample Kolmogorov-Smirnov test was used to compare the distributions of response changes (Figure 8E; n = 8 animals). A p-value less than or equal to 0.05 was regarded as significant for all tests. Data distribution was assumed to be normal, but this was not formally tested. No formal statistical analysis was carried out to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications41,46,93. No animals were excluded from the study: all analyses included data from all animals.
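The significance notation used in the figures maps p-values to labels with inclusive thresholds. As a small illustration in Python (the analyses themselves were run in Matlab), with a hypothetical function name:

```python
def significance_label(p):
    """Map a p-value to the asterisk notation described in the text."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "n.s."

labels = [significance_label(p) for p in (0.0005, 0.01, 0.05, 0.2)]
print(labels)
```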

Supplementary Material

Highlights

  • Activity of glutamate inputs to dopamine neurons is measured with a glutamate sensor

  • Glutamate inputs to dopamine neurons convey temporal difference (TD) error

  • Aversive stimuli activate glutamate inputs but inhibit dopamine neurons

  • Opioids diminish dopamine inhibition at aversion but spare glutamate input excitation

Acknowledgments

We thank Shuhan Huang, Iku Tsutsui-Kimura, and Benedicte Babayan for technical assistance, Adam S. Lowet, Sara Matias, Isobel W. Green, Malcolm G. Campbell, and all lab members for discussion. We thank Catherine Dulac for sharing reagents and equipment. We thank Loren Looger, University of California, San Diego for pGP-AAV-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE plasmid and AAV1-hSyn-SF-iGluSnFR(A184S), Douglas Kim and GENIE Project, Janelia Farm Research Campus, Howard Hughes Medical Institute for AAV8-hSyn-FLEX-jGCaMP7f, Edward Boyden, Media Lab, Massachusetts Institute of Technology for AAV5-CAG-FLEX-tdTomato and AAV8-hSyn-FLEX-ChrimsonR-tdTomato, Yulong Li, State Key Laboratory of Membrane Biology, Peking University for AAV9-hSyn-DA2m, and Kaspar Podgorski, Allen Institute, for AAV1-hSyn-FLEX-iGluSnFR3 v857-GPI. We thank the Harvard Center for Biological Imaging for infrastructure and support. This work was supported by grants from National Institute of Mental Health (R01MH125162, MW-U), National Institutes of Health (U19 NS113201, NS 108740, NU), the Simons Collaboration on Global Brain (NU), Japan Society for the Promotion of Science, Japan Science and Technology Agency (RA), and Harvard Brain Initiative Postdoc Pioneers Grant (RA).

Footnotes

Declaration of Interests

The authors declare no competing interests.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1.Shadmehr R, Smith MA, and Krakauer JW (2010). Error Correction, Sensory Prediction, and Adaptation in Motor Control. Annu. Rev. Neurosci. 33, 89–108. 10.1146/annurev-neuro-060909-153135. [DOI] [PubMed] [Google Scholar]
  • 2.Helmholtz H. von (1867). Handbuch der physiologischen Optik (L. Voss). [Google Scholar]
  • 3.Rao RPN, and Ballard DH (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2, 79–87. 10.1038/4580. [DOI] [PubMed] [Google Scholar]
  • 4.Schultz W, Dayan P, and Montague PR (1997). A Neural Substrate of Prediction and Reward. Science 275, 1593–1599. 10.1126/science.275.5306.1593. [DOI] [PubMed] [Google Scholar]
  • 5.Rescorla RA, and Wagner AR (1972). A theory of Pavlovian conditioning: Variations on the effectiveness of reinforcement and non-reinforcement. In Classical conditioning II: Current research and theory, Black AH and Prokasy WF, eds. (Appleton-Century-Crofts; ), pp. 64–99. [Google Scholar]
  • 6.Eshel N, Tian J, Bukwich M, and Uchida N (2016). Dopamine neurons share common response function for reward prediction error. Nat. Neurosci. 19, 479–486. 10.1038/nn.4239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Matsumoto M, and Hikosaka O (2009). Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841. 10.1038/nature08028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Montague P, Dayan P, and Sejnowski T (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947. 10.1523/JNEUROSCI.16-05-01936.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Thorndike EL (1911). Animal intelligence; experimental studies (The Macmillan Company; ) 10.5962/bhl.title.55072. [DOI] [Google Scholar]
  • 10.Pavlov IP (1927). Conditioned reflexes: an investigation of the physiological activity of the cerebral cortex. (Oxford Univ. Press; ). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Sutton RS, and Barto AG (2018). Reinforcement Learning: An Introduction Second. (The MIT Press; ). [Google Scholar]
  • 12.Kawato M, and Samejima K (2007). Efficient reinforcement learning: computational theories, neuroscience and robotics. Curr. Opin. Neurobiol. 17, 205–212. 10.1016/j.conb.2007.03.004. [DOI] [PubMed] [Google Scholar]
  • 13.Morita K, and Kato A (2018). A Neural Circuit Mechanism for the Involvements of Dopamine in Effort-Related Choices: Decay of Learned Values, Secondary Effects of Depletion, and Calculation of Temporal Difference Error. eNeuro 5. 10.1523/ENEURO.0021-18.2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Keiflin R, and Janak PH (2015). Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry. Neuron 88, 247–263. 10.1016/j.neuron.2015.08.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Doya K (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr. Opin. Neurobiol. 10, 732–739. 10.1016/S0959-4388(00)00153-7. [DOI] [PubMed] [Google Scholar]
  • 16.Aggarwal M, Hyland BI, and Wickens JR (2012). Neural control of dopamine neurotransmission: implications for reinforcement learning: Neural control of dopamine release. Eur. J. Neurosci. 35, 1115–1123. 10.1111/j.1460-9568.2012.08055.x. [DOI] [PubMed] [Google Scholar]
  • 17.Geisler S, Derst C, Veh RW, and Zahm DS (2007). Glutamatergic Afferents of the Ventral Tegmental Area in the Rat. J. Neurosci. 27, 5730–5743. 10.1523/JNEUROSCI.0012-07.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Nieh EH, Vander Weele CM, Matthews GA, Presbrey KN, Wichmann R, Leppla CA, Izadmehr EM, and Tye KM (2016). Inhibitory Input from the Lateral Hypothalamus to the Ventral Tegmental Area Disinhibits Dopamine Neurons and Promotes Behavioral Activation. Neuron 90, 1286–1298. 10.1016/j.neuron.2016.04.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Qi J, Zhang S, Wang H-L, Wang H, de Jesus Aceves Buendia J, Hoffman AF, Lupica CR, Seal RP, and Morales M (2014). A glutamatergic reward input from the dorsal raphe to ventral tegmental area dopamine neurons. Nat. Commun. 5, 5390. 10.1038/ncomms6390. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Bolam JP, and Smith Y (1990). The GABA and substance P input to dopaminergic neurones in the substantia nigra of the rat. Brain Res. 529, 57–78. 10.1016/0006-8993(90)90811-o. [DOI] [PubMed] [Google Scholar]
  • 21.Jhou TC, Geisler S, Marinelli M, Degarmo BA, and Zahm DS (2009). The mesopontine rostromedial tegmental nucleus: A structure targeted by the lateral habenula that projects to the ventral tegmental area of Tsai and substantia nigra compacta. J. Comp. Neurol. 513, 566–596. 10.1002/cne.21891. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Tian J, Huang R, Cohen JY, Osakada F, Kobak D, Machens CK, Callaway EM, Uchida N, and Watabe-Uchida M (2016). Distributed and Mixed Information in Monosynaptic Inputs to Dopamine Neurons. Neuron 91, 1374–1389. 10.1016/j.neuron.2016.08.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Chevalier G, and Deniau JM (1990). Disinhibition as a basic process in the expression of striatal functions. Trends Neurosci. 13, 277–280. 10.1016/0166-2236(90)90109-n. [DOI] [PubMed] [Google Scholar]
  • 24.Tian J, and Uchida N (2015). Habenula Lesions Reveal that Multiple Mechanisms Underlie Dopamine Prediction Errors. Neuron 87, 1304–1316. 10.1016/j.neuron.2015.08.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Jhou TC, Fields HL, Baxter MG, Saper CB, and Holland PC (2009). The rostromedial tegmental nucleus (RMTg), a GABAergic afferent to midbrain dopamine neurons, encodes aversive stimuli and inhibits motor responses. Neuron 61, 786–800. 10.1016/j.neuron.2009.02.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Li H, Vento PJ, Parrilla-Carrero J, Pullmann D, Chao YS, Eid M, and Jhou TC (2019). Three Rostromedial Tegmental Afferents Drive Triply Dissociable Aspects of Punishment Learning and Aversive Valence Encoding. Neuron 104, 987–999.e4. 10.1016/j.neuron.2019.08.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Hong S, Jhou TC, Smith M, Saleem KS, and Hikosaka O (2011). Negative reward signals from the lateral habenula to dopamine neurons are mediated by rostromedial tegmental nucleus in primates. J. Neurosci. 31, 11457–11471. 10.1523/JNEUROSCI.1384-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Matsumoto M, and Hikosaka O (2007). Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 447, 1111–1115. 10.1038/nature05860.
  • 29. Schultz W, Apicella P, and Ljungberg T (1993). Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci. 13, 900–913. 10.1523/JNEUROSCI.13-03-00900.1993.
  • 30. Amo R, Matias S, Yamanaka A, Tanaka KF, Uchida N, and Watabe-Uchida M (2022). A gradual temporal shift of dopamine responses mirrors the progression of temporal difference error in machine learning. Nat. Neurosci. 25, 1082–1092. 10.1038/s41593-022-01109-2.
  • 31. Cohen JY, Haesler S, Vong L, Lowell BB, and Uchida N (2012). Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88. 10.1038/nature10754.
  • 32. Mansour A, Fox CA, Burke S, Meng F, Thompson RC, Akil H, and Watson SJ (1994). Mu, delta, and kappa opioid receptor mRNA expression in the rat CNS: An in situ hybridization study. J. Comp. Neurol. 350, 412–438. 10.1002/cne.903500307.
  • 33. Minami M, Onogi T, Toya T, Katao Y, Hosoi Y, Maekawa K, Katsumata S, Yabuuchi K, and Satoh M (1994). Molecular cloning and in situ hybridization histochemistry for rat μ-opioid receptor. Neurosci. Res. 18, 315–322. 10.1016/0168-0102(94)90167-8.
  • 34. Ding Y-Q, Kaneko T, Nomura S, and Mizuno N (1996). Immunohistochemical localization of μ-opioid receptors in the central nervous system of the rat. J. Comp. Neurol. 367, 375–402.
  • 35. Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, and Uchida N (2012). Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron 74, 858–873. 10.1016/j.neuron.2012.03.017.
  • 36. Morales M, and Margolis EB (2017). Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat. Rev. Neurosci. 18, 73–85. 10.1038/nrn.2016.165.
  • 37. Yang H, de Jong JW, Tak Y, Peck J, Bateup HS, and Lammel S (2018). Nucleus Accumbens Subnuclei Regulate Motivated Behavior via Direct Inhibition and Disinhibition of VTA Dopamine Subpopulations. Neuron 97, 434–449.e4. 10.1016/j.neuron.2017.12.022.
  • 38. Zhou Z, Liu X, Chen S, Zhang Z, Liu Y, Montardy Q, Tang Y, Wei P, Liu N, Li L, et al. (2019). A VTA GABAergic Neural Circuit Mediates Visually Evoked Innate Defensive Responses. Neuron 103, 473–488.e6. 10.1016/j.neuron.2019.05.027.
  • 39. Marvin JS, Scholl B, Wilson DE, Podgorski K, Kazemipour A, Müller JA, Schoch S, Quiroz FJU, Rebola N, Bao H, et al. (2018). Stability, affinity, and chromatic variants of the glutamate sensor iGluSnFR. Nat. Methods 15, 936–939. 10.1038/s41592-018-0171-3.
  • 40. Sun F, Zhou J, Dai B, Qian T, Zeng J, Li X, Zhuo Y, Zhang Y, Wang Y, Qian C, et al. (2020). Next-generation GRAB sensors for monitoring dopaminergic activity in vivo. Nat. Methods 17, 1156–1166. 10.1038/s41592-020-00981-9.
  • 41. Menegas W, Babayan BM, Uchida N, and Watabe-Uchida M (2017). Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice. eLife 6. 10.7554/eLife.21886.
  • 42. Klapoetke NC, Murata Y, Kim SS, Pulver SR, Birdsey-Benson A, Cho YK, Morimoto TK, Chuong AS, Carpenter EJ, Tian Z, et al. (2014). Independent optical excitation of distinct neural populations. Nat. Methods 11, 338–346. 10.1038/nmeth.2836.
  • 43. Pearce JM, Nicholas DJ, and Dickinson A (1981). The potentiation effect during serial conditioning. Q. J. Exp. Psychol. B 33, 159–179. 10.1080/14640748108400820.
  • 44. Wasserman EA, Carr DL, and Deich JD (1978). Association of conditioned stimuli during serial conditioning by pigeons. Anim. Learn. Behav. 6, 52–56. 10.3758/BF03212002.
  • 45. Hong S, and Hikosaka O (2008). The globus pallidus sends reward-related signals to the lateral habenula. Neuron 60, 720–729. 10.1016/j.neuron.2008.09.035.
  • 46. Tsutsui-Kimura I, Matsumoto H, Akiti K, Yamada MM, Uchida N, and Watabe-Uchida M (2020). Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task. eLife 9. 10.7554/eLife.62390.
  • 47. Aggarwal A, Liu R, Chen Y, Ralowicz AJ, Bergerson SJ, Tomaska F, Mohar B, Hanson TL, Hasseman JP, Reep D, et al. (2023). Glutamate indicators with improved activation kinetics and localization for imaging synaptic transmission. Nat. Methods 20, 925–934. 10.1038/s41592-023-01863-6.
  • 48. Hao Y, Toulmé E, König B, Rosenmund C, and Plested AJR (2023). Targeted sensors for glutamatergic neurotransmission. eLife 12. 10.7554/eLife.84029.
  • 49. Isaacs DP, Leman RP, Everett TJ, Lopez-Beltran H, Hamilton LR, and Oleson EB (2020). Buprenorphine is a weak dopamine releaser relative to heroin, but its pretreatment attenuates heroin-evoked dopamine release in rats. Neuropsychopharmacol. Rep. 40, 355–364. 10.1002/npr2.12139.
  • 50. Leone P, Pocock D, and Wise RA (1991). Morphine-dopamine interaction: Ventral tegmental morphine increases nucleus accumbens dopamine release. Pharmacol. Biochem. Behav. 39, 469–472. 10.1016/0091-3057(91)90210-S.
  • 51. Matthews RT, and German DC (1984). Electrophysiological evidence for excitation of rat ventral tegmental area dopamine neurons by morphine. Neuroscience 11, 617–625. 10.1016/0306-4522(84)90048-4.
  • 52. Corre J, van Zessen R, Loureiro M, Patriarchi T, Tian L, Pascoli V, and Lüscher C (2018). Dopamine neurons projecting to medial shell of the nucleus accumbens drive heroin reinforcement. eLife 7, e39945. 10.7554/eLife.39945.
  • 53. Matsui A, and Williams JT (2011). Opioid-sensitive GABA inputs from rostromedial tegmental nucleus synapse onto midbrain dopamine neurons. J. Neurosci. 31, 17729–17735. 10.1523/JNEUROSCI.4570-11.2011.
  • 54. Margolis EB, and Fields HL (2016). Mu Opioid Receptor Actions in the Lateral Habenula. PLOS ONE 11, 1–11. 10.1371/journal.pone.0159097.
  • 55. Christie MJ, and North RA (1988). Agonists at μ-opioid, M2-muscarinic and GABAB-receptors increase the same potassium conductance in rat lateral parabrachial neurones. Br. J. Pharmacol. 95, 896–902. 10.1111/j.1476-5381.1988.tb11719.x.
  • 56. Waung MW, Maanum KA, Cirino TJ, Driscoll JR, O’Brien C, Bryant S, Mansourian KA, Morales M, Barker DJ, and Margolis EB (2022). A diencephalic circuit in rats for opioid analgesia but not positive reinforcement. Nat. Commun. 13, 764. 10.1038/s41467-022-28332-6.
  • 57. Hensch TK, and Fagiolini M (2005). Excitatory-inhibitory balance and critical period plasticity in developing visual cortex. Prog. Brain Res. 147, 115–124. 10.1016/S0079-6123(04)47009-5.
  • 58. Nelson SB, and Valakh V (2015). Excitatory/Inhibitory Balance and Circuit Homeostasis in Autism Spectrum Disorders. Neuron 87, 684–698. 10.1016/j.neuron.2015.07.033.
  • 59. Wehr M, and Zador AM (2003). Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature 426, 442–446. 10.1038/nature02116.
  • 60. van Vreeswijk C, and Sompolinsky H (1996). Chaos in Neuronal Networks with Balanced Excitatory and Inhibitory Activity. Science 274, 1724–1726. 10.1126/science.274.5293.1724.
  • 61. Isaacson JS, and Scanziani M (2011). How Inhibition Shapes Cortical Activity. Neuron 72, 231–243. 10.1016/j.neuron.2011.09.027.
  • 62. Sprekeler H (2017). Functional consequences of inhibitory plasticity: homeostasis, the excitation-inhibition balance and beyond. Curr. Opin. Neurobiol. 43, 198–203. 10.1016/j.conb.2017.03.014.
  • 63. Lobb CJ, Wilson CJ, and Paladini CA (2010). A dynamic role for GABA receptors on the firing pattern of midbrain dopaminergic neurons. J. Neurophysiol. 104, 403–413. 10.1152/jn.00204.2010.
  • 64. Matsumoto M, and Hikosaka O (2009). Representation of negative motivational value in the primate lateral habenula. Nat. Neurosci. 12, 77–84. 10.1038/nn.2233.
  • 65. Fields H (2004). State-dependent opioid control of pain. Nat. Rev. Neurosci. 5, 565–575. 10.1038/nrn1431.
  • 66. Fields HL, and Margolis EB (2015). Understanding opioid reward. Trends Neurosci. 38, 217–225. 10.1016/j.tins.2015.01.002.
  • 67. Tejeda HA, and Bonci A (2019). Dynorphin/kappa-opioid receptor control of dopamine dynamics: Implications for negative affective states and psychiatric disorders. Brain Res. 1713, 91–101. 10.1016/j.brainres.2018.09.023.
  • 68. Thompson RC, Mansour A, Akil H, and Watson SJ (1993). Cloning and pharmacological characterization of a rat μ opioid receptor. Neuron 11, 903–913. 10.1016/0896-6273(93)90120-G.
  • 69. Fields HL (2007). Understanding How Opioids Contribute to Reward and Analgesia. Reg. Anesth. Pain Med. 32, 242–246. 10.1016/j.rapm.2007.01.001.
  • 70. Keller GB, and Mrsic-Flogel TD (2018). Predictive Processing: A Canonical Cortical Computation. Neuron 100, 424–435. 10.1016/j.neuron.2018.10.003.
  • 71. de Jong JW, Afjei SA, Pollak Dorocic I, Peck JR, Liu C, Kim CK, Tian L, Deisseroth K, and Lammel S (2019). A Neural Circuit Mechanism for Encoding Aversive Stimuli in the Mesolimbic Dopamine System. Neuron 101, 133–151.e7. 10.1016/j.neuron.2018.11.005.
  • 72. Dabney W, Kurth-Nelson Z, Uchida N, Starkweather CK, Hassabis D, Munos R, and Botvinick M (2020). A distributional code for value in dopamine-based reinforcement learning. Nature 577, 671–675. 10.1038/s41586-019-1924-6.
  • 73. Lowet AS, Zheng Q, Matias S, Drugowitsch J, and Uchida N (2020). Distributional Reinforcement Learning in the Brain. Trends Neurosci. 43, 980–997. 10.1016/j.tins.2020.09.004.
  • 74. Root DH, Estrin DJ, and Morales M (2018). Aversion or Salience Signaling by Ventral Tegmental Area Glutamate Neurons. iScience 2, 51–62. 10.1016/j.isci.2018.03.008.
  • 75. Root DH, Barker DJ, Estrin DJ, Miranda-Barrientos JA, Liu B, Zhang S, Wang H-L, Vautier F, Ramakrishnan C, Kim YS, et al. (2020). Distinct Signaling by Ventral Tegmental Area Glutamate, GABA, and Combinatorial Glutamate-GABA Neurons in Motivated Behavior. Cell Rep. 32, 108094. 10.1016/j.celrep.2020.108094.
  • 76. Montardy Q, Zhou Z, Lei Z, Liu X, Zeng P, Chen C, Liu Y, Sanz-Leon P, Huang K, and Wang L (2019). Characterization of glutamatergic VTA neural population responses to aversive and rewarding conditioning in freely-moving mice. Sci. Bull. 64, 1167–1178. 10.1016/j.scib.2019.05.005.
  • 77. Fiorillo CD, and Williams JT (1998). Glutamate mediates an inhibitory postsynaptic potential in dopamine neurons. Nature 394, 78–82. 10.1038/27919.
  • 78. Bellone C, and Lüscher C (2006). Cocaine triggered AMPA receptor redistribution is reversed in vivo by mGluR-dependent long-term depression. Nat. Neurosci. 9, 636–641. 10.1038/nn1682.
  • 79. Dzubay JA, and Jahr CE (1999). The concentration of synaptically released glutamate outside of the climbing fiber-Purkinje cell synaptic cleft. J. Neurosci. 19, 5265–5274. 10.1523/JNEUROSCI.19-13-05265.1999.
  • 80. Hires SA, Zhu Y, and Tsien RY (2008). Optical measurement of synaptic glutamate spillover and reuptake by linker optimized glutamate-sensitive fluorescent reporters. Proc. Natl. Acad. Sci. U. S. A. 105, 4411–4416. 10.1073/pnas.0712008105.
  • 81. Clements JD, Lester RA, Tong G, Jahr CE, and Westbrook GL (1992). The time course of glutamate in the synaptic cleft. Science 258, 1498–1501. 10.1126/science.1359647.
  • 82. Otomo K, Perkins J, Kulkarni A, Stojanovic S, Roeper J, and Paladini CA (2020). In vivo patch-clamp recordings reveal distinct subthreshold signatures and threshold dynamics of midbrain dopamine neurons. Nat. Commun. 11, 6286. 10.1038/s41467-020-20041-2.
  • 83. Montero T, Gatica RI, Farassat N, Meza R, González-Cabrera C, Roeper J, and Henny P (2021). Dendritic Architecture Predicts in vivo Firing Pattern in Mouse Ventral Tegmental Area and Substantia Nigra Dopaminergic Neurons. Front. Neural Circuits 15, 769342. 10.3389/fncir.2021.769342.
  • 84. Kennerley SW, Behrens TEJ, and Wallis JD (2011). Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat. Neurosci. 14, 1581–1589. 10.1038/nn.2961.
  • 85. Oyama K, Hernadi I, Iijima T, and Tsutsui K-I (2010). Reward Prediction Error Coding in Dorsal Striatal Neurons. J. Neurosci. 30, 11447–11457. 10.1523/JNEUROSCI.1719-10.2010.
  • 86. Bäckman CM, Malik N, Zhang Y, Shan L, Grinberg A, Hoffer BJ, Westphal H, and Tomac AC (2006). Characterization of a mouse strain expressing Cre recombinase from the 3’ untranslated region of the dopamine transporter locus. Genesis 44, 383–390. 10.1002/dvg.20228.
  • 87. Madisen L, Zwingman TA, Sunkin SM, Oh SW, Zariwala HA, Gu H, Ng LL, Palmiter RD, Hawrylycz MJ, Jones AR, et al. (2010). A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat. Neurosci. 13, 133–140. 10.1038/nn.2467.
  • 88. Tong Q, Ye C, McCrimmon RJ, Dhillon H, Choi B, Kramer MD, Yu J, Yang Z, Christiansen LM, Lee CE, et al. (2007). Synaptic glutamate release by ventromedial hypothalamic neurons is part of the neurocircuitry that prevents hypoglycemia. Cell Metab. 5, 383–393. 10.1016/j.cmet.2007.04.001.
  • 89. Vong L, Ye C, Yang Z, Choi B, Chua SJ, and Lowell BB (2011). Leptin action on GABAergic neurons prevents obesity and reduces inhibitory tone to POMC neurons. Neuron 71, 142–154. 10.1016/j.neuron.2011.05.028.
  • 90. Dana H, Sun Y, Mohar B, Hulse BK, Kerlin AM, Hasseman JP, Tsegaye G, Tsang A, Wong A, Patel R, et al. (2019). High-performance calcium sensors for imaging activity in neuronal populations and microcompartments. Nat. Methods 16, 649–657. 10.1038/s41592-019-0435-6.
  • 91. Uchida N, and Mainen ZF (2003). Speed and accuracy of olfactory discrimination in the rat. Nat. Neurosci. 6, 1224–1229. 10.1038/nn1142.
  • 92. Park IM, Meister MLR, Huk AC, and Pillow JW (2014). Encoding and decoding in parietal cortex during sensorimotor decision-making. Nat. Neurosci. 17, 1395–1403. 10.1038/nn.3800.
  • 93. Kim HR, Malik AN, Mikhael JG, Bech P, Tsutsui-Kimura I, Sun F, Zhang Y, Li Y, Watabe-Uchida M, Gershman SJ, et al. (2020). A Unified Framework for Dopamine Signals across Timescales. Cell 183, 1600–1616.e25. 10.1016/j.cell.2020.11.013.

Associated Data

Data Availability Statement

  • Glutamate input and dopamine fluorometry data are deposited at Dryad (https://doi.org/10.5061/dryad.ncjsxkt25) and are publicly available as of the date of publication. The DOI is listed in the key resources table.

  • The MATLAB code used to obtain the results is available on GitHub (https://github.com/VTA-SNc/Amo-2023) and is publicly available as of the date of publication. The DOI is listed in the key resources table.

  • Any additional information required to reanalyze the data reported in this paper is available from the Lead Contact upon request.

Key resources table.

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Anti-tyrosine hydroxylase (TH) | EMD Millipore | RRID: AB_390204
Anti-Green Fluorescent Protein Antibody | Aves Labs | RRID: AB_2307313
F(ab’)2-Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor 647 | Invitrogen | RRID: AB_2535814
Goat anti-Chicken IgY (H+L) Secondary Antibody, Alexa Fluor 488 | Invitrogen | RRID: AB_2534096
Bacterial and virus strains
AAV5-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; UNC Vector Core | Custom preparation
AAV8-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; UNC Vector Core | Custom preparation
AAV1-hSyn-SF-iGluSnFR(A184S) | Marvin et al.39; Addgene | #106174-AAV1
AAV1-hSyn-FLEX-iGluSnFR3 v857-GPI | Aggarwal et al.47; Addgene | #175181-AAV1
AAV8-hSyn-FLEX-jGCaMP7f | Dana et al.90; Addgene | #104492-AAV8
AAV9-hSyn-DA2m | Sun et al.40; ViGene bioscience | N/A
AAV5-CAG-FLEX-tdTomato | Gift from Ed Boyden (unpublished); UNC Vector Core | AAV In Stock Vectors: Ed Boyden
AAV8-hSyn-FLEX-ChrimsonR-tdTomato | Klapoetke et al.42; UNC Vector Core | AAV In Stock Vectors: Ed Boyden
Chemicals, peptides, and recombinant proteins
Buprenorphine | Patterson Veterinary | Cat #07-892-5235
Deposited data
Fiber-fluorometry recording data | This paper | Dryad (doi:10.5061/dryad.ncjsxkt25)
Experimental models: Organisms/strains
Mouse: B6.SJL-Slc6a3tm1.1(cre)Bkmn/J | The Jackson Laboratory | RRID: IMSR_JAX:006660
Mouse: B6.Cg-Gt(ROSA)26Sortm14(CAG-tdTomato)Hze/J | The Jackson Laboratory | RRID: IMSR_JAX:007914
Mouse: B6;129S-Slc17a8tm1.1(cre)Hze/J | The Jackson Laboratory | RRID: IMSR_JAX:028534
Mouse: Slc17a6tm2(cre)Lowl/J | The Jackson Laboratory | RRID: IMSR_JAX:016963
Recombinant DNA
pGP-AAV-CAG-FLEX-SF-iGluSnFR(A184S)-WPRE | Marvin et al.39; Addgene | RRID: Addgene_106186
Software and algorithms
MATLAB | MathWorks | RRID: SCR_001622
LabView | National Instruments | RRID: SCR_014325
MATLAB code | This paper | GitHub (doi: 10.5281/zenodo.10407856)
Other
Mono Fiber-optic Cannulas | Doric Lenses | Cat # MFC_400/430-0.66_5mm_MF1.25_FLT
Mono Fiber-optic Cannulas | Doric Lenses | Cat # MFC_400/430-0.48_5mm_MF1.25_FLT
