Abstract
Social interactions are motivated behaviors that in many species facilitate learning. However, how the brain encodes the reinforcing properties of social interactions remains elusive. Here, using in vivo recording in freely moving mice, we show that dopamine (DA) neurons of the ventral tegmental area (VTA) increase their activity during interactions with an unfamiliar conspecific and display heterogeneous responses. Using a social instrumental task (SIT), we then show that VTA DA neuron activity encodes social prediction error and drives social reinforcement learning. Thus, our findings suggest that VTA DA neurons are a neural substrate for a social learning signal that drives motivated behavior.
Introduction
A broad range of species display social interactions, which include various modalities of communication between two or more conspecifics1. By increasing an animal’s fitness, social behaviors are motivated through their adaptive benefits, and many social interactions are considered to be rewarding experiences for both animals and humans. Although several studies have investigated the brain mechanisms underlying reward processes associated with food and drug consumption2,3, the mechanisms of social reward remain largely unknown. Clinical and pre-clinical evidence accumulated over the past decade suggests that social interactions are rewarding experiences reinforced by social cues4–8.
The reward system involves dopamine (DA) neurons of the ventral tegmental area (VTA) that mainly project to the nucleus accumbens (NAc, part of the ventral striatum) and the prefrontal cortex and are involved in motivated non-social and social behaviors. Functional magnetic resonance data showed that social visual stimuli activate the ventral striatum in humans9, and in rodents social interaction increases population activity of VTA DA neurons projecting to the NAc 6. Furthermore, we have previously shown that chemogenetic inhibition of VTA DA neurons decreases interactions with an unfamiliar conspecific10. Although these experiments implicate the VTA in social behavior, which aspects of social interactions are rewarding and how DA neurons encode specific features of social behaviors remains elusive. Here, we used in vivo recordings in freely moving mice to investigate how VTA DA neurons encode active and passive social interactions and found that they are largely activated by active (reciprocal and unilateral) interactions with unfamiliar conspecifics. Despite an overall adaptation of DA neuron activity after repeated exposure to the same conspecific, we observed a high neuronal heterogeneity, with subpopulations of neurons tuned to specific types of active or passive social interactions.
Historically, it has been shown that the activity of VTA DA neurons encodes the reward prediction error (RPE), the difference between the received and the expected reward, which updates reward expectations and facilitates learning11–14. Positive RPE arises when the reward is better than expected and leads to approach and consummatory behaviors, while negative RPE occurs when the reward is less than predicted and leads to avoidance or to renouncing consumption15. Leveraging complex behavioral tasks, several recent studies have demonstrated heterogeneity in DA neuron responses during learning and suggested that DA transients are also sufficient and necessary for the formation of associative learning independently of value16,17. Furthermore, in addition to encoding reward, subpopulations of DA neurons also transmit information about various behavioral variables18,19. However, whether social interaction promotes reinforcement via prediction error remains unknown.
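As a purely illustrative sketch of the prediction-error computation described above (received minus expected reward, used to update the expectation), a Rescorla-Wagner-style update can be written as follows. This is not the analysis used in this study: the function name, the learning rate `alpha`, and the reward values are assumptions chosen only for the example.

```python
def rescorla_wagner(rewards, alpha=0.2, v0=0.0):
    """Return (expectations, errors) for a sequence of received rewards.

    expectations[t] is the expected reward before trial t;
    errors[t] = rewards[t] - expectations[t] is the prediction error.
    alpha is a hypothetical learning rate.
    """
    v = v0
    expectations, errors = [], []
    for r in rewards:
        delta = r - v            # received minus expected reward
        expectations.append(v)
        errors.append(delta)
        v += alpha * delta       # update the reward expectation
    return expectations, errors


# Hypothetical schedule: 30 rewarded trials followed by one omitted reward.
rewards = [1.0] * 30 + [0.0]
expectations, errors = rescorla_wagner(rewards)

print(f"error on first rewarded trial: {errors[0]:+.3f}")
print(f"error on last rewarded trial:  {errors[29]:+.3f}")
print(f"error on omission trial:       {errors[30]:+.3f}")
```

In this toy example the error is large and positive while the reward is unexpected, shrinks toward zero as the expectation builds up, and becomes negative when an expected reward is omitted.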
One of the most commonly used experimental approaches to study social reward in rodents is social conditioned place preference (sCPP), which uses a range of social stimuli to induce place preference4,5,20. Although sCPP protocols have been useful tasks, they pose some limitations to the investigation of whether and how DA neurons encode a social prediction error. Among these is the inability to record time-locked neuronal activity during sCPP. To overcome these limitations, and considering that social interactions are highly complex and dynamic, we implemented an instrumental lever-pressing task that enabled us to perform in vivo electrophysiology recordings while animals press a lever to obtain social interaction. Using this behavioral paradigm, we demonstrate that mice learn to seek and interact with a conspecific, and show that VTA DA neurons signal social prediction error. Together, these findings provide novel insights into the neuronal dynamics underlying social interaction and motivation and further demonstrate that VTA DA neurons might be the neural substrate for social learning.
Results
VTA DA neuron activity increases during free social interaction
Using in vivo recordings in mice (Suppl. Fig. 1a-d), we recorded the activity of VTA DA neurons during free and direct interaction with a sex-matched unfamiliar conspecific (here defined as social stimulus; Fig. 1a). To record and optogenetically identify VTA DA neurons, AAV-DIO-ChR2 virus was first injected in experimental DAT-Cre mice, then optic fiber and recording electrodes were implanted in the VTA (Fig. 1b and Extended Data Fig. 1e-k)21. Based on waveform and firing pattern (Extended Data Fig. 1l-n), we performed an unsupervised cluster analysis and we identified two spatially distinct groups of neurons: non-putative DA and putative DA/photolabeled DA. To strengthen our clustering analysis we added 97 VTA DA neurons optogenetically identified by the Uchida laboratory (raw data associated with 22; Extended Data Fig. 1o) and 23 from the present study for a total of 120 photolabeled DA neurons. While it is possible to observe two distinct clusters, an EMGM was used to define a confidence interval of 95% based on the photolabeled DA neurons (Extended Data Fig. 1o). It should be noted that we cannot evaluate precisely the extent to which non-DA neurons are included in the putative DA cluster. Some, but not all, results were confirmed by photolabeled DA neurons. However, while such a method does not prevent the possibility to include false positive neurons in the cluster, it helps to characterize VTA DA neurons. Only putative DA neurons inside the confidence interval of the photolabeled DA neurons were used in the subsequent analyses (See Supplementary Tables 1-2).
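The selection of putative DA neurons inside a 95% confidence region defined by the photolabeled neurons can be illustrated with a simplified sketch. It fits a single two-dimensional Gaussian to the labeled points rather than reproducing the mixture-model procedure used here, and the two features (firing rate and spike half-width), the data values and all function names are assumptions made for the example.

```python
import math


def fit_gaussian_2d(points):
    """Return the mean and the covariance terms (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def inside_region(point, mean, cov, level=0.95):
    """True if the point lies inside the confidence ellipse of the Gaussian."""
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - level).
    threshold = -2.0 * math.log(1.0 - level)
    return mahalanobis_sq(point, mean, cov) <= threshold


# Hypothetical photolabeled units: (firing rate in Hz, half-width in ms).
labeled = [(4.0, 2.0), (5.0, 2.2), (6.0, 1.8), (5.5, 2.4), (4.5, 1.6)]
mean, cov = fit_gaussian_2d(labeled)

candidate_a = (5.2, 2.1)    # hypothetical unit close to the labeled cluster
candidate_b = (25.0, 0.5)   # hypothetical fast-spiking, narrow-waveform unit
print(inside_region(candidate_a, mean, cov), inside_region(candidate_b, mean, cov))
```

As noted in the text, a criterion of this kind cannot exclude false positives; it only restricts the analysis to units that resemble the photolabeled population.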
During the 5-min free/direct social interaction test, we observed an overall increase in VTA DA firing rate compared to baseline (Fig. 1c), corroborating previous evidence6. However, a thorough understanding of the neuronal correlates of social interaction requires an accurate dissection of the animal’s behavior. We used DeepLabCut23 and analyzed the position of the social stimulus relative to the experimental mouse (Fig. 1d). We observed that the stimulus mouse spent more time in the visual field and in the proximity of the experimental mouse (average proximity of 8.3 cm; Fig. 1e, g). Furthermore, both the time spent by the stimulus mouse in the proximity (< 5 cm) and in the field of view of the experimental mouse were positively correlated with normalized VTA DA activity (Fig. 1f, h), suggesting that VTA DA firing is associated with orientation toward and contact with the social stimulus.
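A minimal sketch of how proximity and field of view could be derived from tracked body parts is given below. It assumes that coordinates are already expressed in cm and that the field of view is a cone centered on the body-to-nose heading; the half-angle `half_fov_deg` and the function names are assumptions, and only the 5 cm proximity threshold is taken from the text.

```python
import math


def distance_cm(a, b):
    """Euclidean distance between two (x, y) points given in cm."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def bearing_deg(body, nose, target):
    """Unsigned angle between the body->nose heading and the nose->target direction."""
    hx, hy = nose[0] - body[0], nose[1] - body[1]
    tx, ty = target[0] - nose[0], target[1] - nose[1]
    angle = math.atan2(hx * ty - hy * tx, hx * tx + hy * ty)
    return abs(math.degrees(angle))


def classify_frame(body, nose, stim_nose, proximity_cm=5.0, half_fov_deg=90.0):
    """Label one video frame; half_fov_deg is a hypothetical field-of-view half-angle."""
    d = distance_cm(nose, stim_nose)
    theta = bearing_deg(body, nose, stim_nose)
    return {
        "distance_cm": d,
        "in_proximity": d < proximity_cm,
        "in_view": theta <= half_fov_deg,
    }


# Hypothetical frames: stimulus in front of, then behind, the experimental mouse.
front = classify_frame(body=(0.0, 0.0), nose=(2.0, 0.0), stim_nose=(5.0, 0.0))
behind = classify_frame(body=(0.0, 0.0), nose=(2.0, 0.0), stim_nose=(-4.0, 0.0))
print(front)
print(behind)
```

Summing the per-frame labels over a session gives the time spent in proximity and in view, which can then be related to the normalized firing rate.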
VTA DA neurons encode different modes of social interaction
To analyze VTA DA neuron activity during specific modes of conspecific interaction, we first quantified social contacts and subdivided them into either active (comprising both reciprocal and unilateral) or passive contacts (Fig. 2a; see Materials and Methods). Experimental mice engaged for a longer time in, and had a higher number of, reciprocal and unilateral contacts than passive contacts (Fig. 2b). At the population level, all types of social contacts induced a fast and transient increase of VTA DA neuron firing rate (Fig. 2c-k). At the single-neuron level, we observed heterogeneity in the responses to conspecific interaction. Indeed, while 53% and 42% of neurons were activated by reciprocal and unilateral active interactions, respectively, only 7% of neurons were activated by passive contacts, as represented in the pie charts (Fig. 2d, g, j). Responses were also heterogeneous across modes of interaction: while some neurons increased or decreased their activity in response to reciprocal, unilateral or passive interactions only, others were similarly modulated by multiple modes of interaction (Fig. 2l, m; Extended Data Fig. 2a). As a control, we considered behavior not related to conspecific interactions and observed that rearing did not increase VTA DA neuron activity (Extended Data Fig. 2h, i). Altogether, these data suggest that VTA DA neuron subpopulations could encode specific modes of social contact.
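Time-locking firing to the onset of social contacts can be sketched as a peri-event rate followed by normalization to a pre-event baseline. The window, bin size, z-score normalization and spike/event times below are assumptions for illustration; the exact normalization used for the figures is not specified in this section.

```python
from statistics import mean, pstdev


def peri_event_rate(spike_times, event_times, window=(-2.0, 4.0), bin_s=0.5):
    """Mean firing rate (Hz) per time bin, aligned to each event onset."""
    n_bins = round((window[1] - window[0]) / bin_s)
    counts = [0] * n_bins
    for ev in event_times:
        for t in spike_times:
            idx = int((t - ev - window[0]) // bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return [c / (len(event_times) * bin_s) for c in counts]


def zscore_to_baseline(rates, n_baseline_bins):
    """Z-score every bin against the mean and SD of the pre-event bins."""
    base = rates[:n_baseline_bins]
    mu, sd = mean(base), pstdev(base)
    if sd == 0:
        raise ValueError("baseline has zero variance")
    return [(r - mu) / sd for r in rates]


# Hypothetical data: two contact onsets, sparse baseline spikes, a burst after each onset.
events = [10.0, 20.0]
spikes = [8.2, 9.1, 10.1, 10.2, 10.3, 18.7, 20.1, 20.2, 20.3, 20.4]

rates = peri_event_rate(spikes, events)             # 12 bins of 0.5 s
z = zscore_to_baseline(rates, n_baseline_bins=4)    # first 4 bins cover -2 s to 0 s
peak_bin = z.index(max(z))
print(rates, peak_bin)
```

A neuron could then be called activated when its post-onset z-score exceeds a chosen threshold, which would be a further assumption not taken from the text.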
We next asked whether VTA DA neuron activation was transient or sustained during the entirety of social interaction bouts. The analysis revealed that higher VTA DA activity was associated with a shorter time of interaction (Extended Data Fig. 2b, d, f). Furthermore, analysis of VTA DA neuron activity throughout the total duration of interaction bouts indicated that DA activation was not sustained and that the activation peak always occurred during the first half of each bout for all modes of social contacts (Extended Data Fig. 2c, e, g). These experiments suggest that VTA DA neurons are tuned to the initiation of social interaction rather than to ongoing social interaction.
VTA DA neuron subpopulations respond differently through exposure to a familiar conspecific
Within the VTA, different subsets of DA neurons are biased towards saliency or novelty rather than reward value24,25,26. Therefore, we asked whether VTA DA neurons show different responses depending on the type of social contacts across multiple exposures to the same conspecific. In the free and direct interaction task, we repeatedly exposed the experimental mouse to the same social stimulus during 3 trials of 5 min separated by 5 min without social interaction (Fig. 3a, b). As previously shown10, time sniffing the social stimulus habituated within and between the 3 trials (Fig. 3c, d). We observed the same pattern of habituation for VTA DA activity (Fig. 3e, f), suggesting that neuronal activity also adapted to the social context. However, when we time-locked DA neuronal activity to the interaction bouts of either active or passive interactions, we observed that, as in the 1st trial, the firing rate still significantly increased during interactions in the 3rd trial (Fig. 4a-i). We found a high degree of heterogeneity between different modes of interaction (Fig. 4j, k), not only during the 3rd trial but also when we compared the responses between the 1st and the 3rd trials for individual neurons (Extended Data Fig. 3a-f). The same analyses were performed only in VTA DA photolabeled neurons, with similar results (Extended Data Fig. 4). Altogether, these data suggest that different VTA DA subpopulations may respond differently to social contacts during repeated exposures to a familiar stimulus.
Reinforcing properties of social interaction through the social instrumental task (SIT)
The increase in VTA DA neuron activity during active contacts indicates that social interaction recruits the reward system and suggests that this system might play a role in social reinforcement learning. To investigate the reinforcing properties of social interaction, we implemented a social instrumental task (SIT) using two-chambered shuttle boxes divided by a gridded auto-guillotine door. The lever in one chamber controlled the opening of the door, allowing interaction between two conspecifics (Fig. 5a). We trained the experimental mice to associate the lever press with the door opening to gain access to a social stimulus, in a daily 20-minute session. After 10 days of training (shaping phase), experimental mice were tested for 15 days (instrumental phase; Fig. 5b, c). Between the shaping and the instrumental phases, the increased number of lever presses (Fig. 5d-f) and transitions of the mouse from the lever to the interaction zone (Fig. 5g, h) indicated that the majority of the mice learned the task (Extended Data Fig. 5a). During the instrumental phase, we observed an increase in the locomotion peak velocity at the lever press (Fig. 5i-k). This occurred together with an increase in fast transitions (< 2sec) and a decrease of missed trials compared to the shaping phase, while the number of slow and delayed transitions did not change (Fig. 5l). To further characterize the task, we confirmed the ability of a different cohort of mice to extinguish and subsequently reinstate operant responses (Extended Data Fig. 5b). The number of lever presses (Extended Data Fig. 5c, d) and transitions (Extended Data Fig. 5e, f) decreased during the extinction, and increased again during the reinstatement phase. While the peak velocity during the transitions between lever and interaction zones increased in the instrumental and reinstatement phases, we did not notice a significant decrease during the extinction phase (Extended Data Fig. 5g, h). 
However, the proportion of fast transitions (< 2 s) between the lever and the interaction zones increased during the instrumental and reinstatement phases compared to the shaping and extinction phases (Extended Data Fig. 5i, j). These data suggest that interaction with a social stimulus sustains social reinforcement learning.
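The classification of lever-to-interaction-zone transitions can be sketched as below. Only the 2 s limit for fast transitions comes from the text; the boundary between slow and delayed transitions (here the 7 s door-open period), the definition of a missed trial (no zone entry before the next press), and the function names are assumptions for illustration.

```python
from collections import Counter


def transition_latencies(press_times, entry_times):
    """Latency from each lever press to the first zone entry before the next press.

    Returns None for a press that is not followed by an entry (assumed 'missed').
    """
    latencies = []
    for i, press in enumerate(press_times):
        next_press = press_times[i + 1] if i + 1 < len(press_times) else float("inf")
        following = [e for e in entry_times if press <= e < next_press]
        latencies.append(min(following) - press if following else None)
    return latencies


def classify_transition(latency_s, fast_max_s=2.0, slow_max_s=7.0):
    """Label a transition; slow_max_s is a hypothetical boundary, not from the text."""
    if latency_s is None:
        return "missed"
    if latency_s < fast_max_s:
        return "fast"
    if latency_s <= slow_max_s:
        return "slow"
    return "delayed"


# Hypothetical session: four lever presses and three interaction-zone entries (s).
presses = [10.0, 40.0, 80.0, 120.0]
entries = [11.2, 45.0, 95.0]

labels = [classify_transition(lat) for lat in transition_latencies(presses, entries)]
summary = Counter(labels)
print(labels, dict(summary))
```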
Prediction encoded by VTA DA neurons emerges through the SIT
We then recorded VTA DA neuron activity across the SIT. During the shaping phase, by averaging the trials across the recorded VTA DA neurons, we observed an overall increase in the firing rate and in the normalized firing frequency when the social stimulus was accessible (i.e. during the interaction window). This increase was observed when neuronal activity was aligned either to the entry into the interaction zone (Extended Data Fig. 6a, b) or to the lever press event (Fig. 6a-d). During the shaping phase, we also noted that more neurons increased their activity during the interaction window (38%) than at the lever press (18%; Fig. 6d, e). However, during the instrumental phase, the increased activity shifted to the onset of the lever press (Fig. 6f-j) and to right before the entry into the interaction zone (Extended Data Fig. 6c, d). This was associated with a higher proportion of neurons responding to the lever press (32.9%) compared to the interaction (17.1%; Fig. 6i). Of note, during an intermediate phase between the shaping and instrumental phases (day 6 to day 10; Extended Data Fig. 6e), we observed the emergence of VTA DA activity at the onset of the lever press (Extended Data Fig. 6f, g) together with a high proportion of VTA DA neurons that increased their activity during the interaction window (Extended Data Fig. 6h, i). These data suggest that a progressive shift in VTA DA activity occurs during the SIT.
Further analysis showed the emergence of a prediction error signal through the SIT (Fig. 6k) and a significant correlation between normalized VTA DA activity at the moment of lever press and the time spent in the interaction zone (Fig. 6l). These results suggest that the VTA DA activity during the lever press progressively emerges and predicts the time spent interacting. Overall, we did not find that VTA DA firing rate correlated with locomotor activity. Indeed, the peak of VTA DA activity occurred at the action initiation, before the peak of velocity (Fig. 6m, n). Moreover, similar peaks of velocity outside trials (i.e. when the mice are not lever-pressing) are not preceded by a significant peak in neuronal activity (Fig. 6o). Although we also observed that some neurons respond to changes in the mouse’s velocity (Fig. 6p), these neurons are not sufficient to change the overall VTA DA activity. Together, these results indicate that the activity of VTA DA neurons encodes mainly the time spent interacting with the conspecific, through a phasic signal that emerges progressively during learning.
VTA DA neurons encode a social prediction error during the SIT
Of note, in some trials the experimental mice moved from the lever zone to the interaction zone in less than 2 seconds without pressing the lever. We considered those trials as error trials (Extended Data Fig. 7a). Here, the increased neuronal activity recorded when the mice were leaving the lever zone was followed by a decrease in normalized VTA DA firing when the mice expected to interact with the social stimulus (Extended Data Fig. 7b-e). As a control, we looked at the transitions made in less than 2 seconds after a lever press (correct trials; Extended Data Fig. 7f). On these trials, when the activity was aligned to the exit from the lever zone, we still observed the increased activity of VTA DA neurons, without a decrease during the interaction window (Extended Data Fig. 7g-j). After the instrumental phase, some mice underwent an omission phase, consisting of trials characterized by unpredictable access to the social stimulus after the lever press (Fig. 7a, b). When the social stimulus was unexpectedly not accessible during some trials, the phasic increase was still observed at the lever press, but there was a strong decrease in the normalized VTA DA activity during the expected social interaction window (Fig. 7c-f). This dip in the normalized VTA DA activity appeared from 2 seconds to 6 seconds after the lever press (Fig. 7g). Notably, during the trials with access to the social stimulus, we still observed an increase in the phasic activity of VTA DA neurons at the lever press but no decrease in neuronal activity during the interaction window (Fig. 7h-l).
Altogether, these data suggest that social interaction has motivational value and that VTA DA neurons encode social prediction error (SPE) to support social reinforcement learning.
Optogenetic inhibition of VTA DA neurons affects social reinforcement learning
Finally, to test whether VTA DA activity is causally involved in social reinforcement learning, we injected the red-shifted opsin AAV-FLEX-Jaws or the control AAV-DIO-eYFP in the VTA of DAT-Cre mice to optogenetically inhibit VTA DA neurons during the SIT (Fig. 8a-d). Once mice reached the learning criterion (see Materials and Methods), we started the task. For 5 days the mice underwent the classical SIT without optogenetic manipulation. From day 6 to day 20 we optogenetically inhibited VTA DA neurons at the time the mouse entered the interaction zone after pressing the lever (Fig. 8a, b). While the number of lever presses and the number of transitions increased in control eYFP mice between the off and light conditions, on average we did not observe significant changes in Jaws-injected mice (Fig. 8e-g). When we considered individual mice, we observed that 70% of Jaws-injected mice reduced the number of lever presses as a consequence of the inhibition (Fig. 8f). Finally, to exclude the possibility that VTA DA neuron inhibition induced a conditioned place aversion during the task, we calculated the time spent in the interaction zone between D15 and D20 when the door was closed and the interaction was therefore not accessible. We found no difference between eYFP control and Jaws-injected mice (Fig. 8h, i), indicating that the interaction zone per se was not aversive (Extended Data Fig. 8a, b). The velocity of the mice was not affected by the optogenetic inhibition, regardless of whether the door was open or closed (Extended Data Fig. 8c, d). Together, these results indicate that inhibition of VTA DA neuron activity is sufficient to impair social reinforcement learning.
Discussion
Here, we used in vivo recordings in freely moving mice to directly measure VTA DA neuron activity during social interaction and to test the hypothesis that VTA DA neurons signal prediction error in social context.
Humans and animals are greatly motivated to interact with conspecifics. Indeed, social interactions provide adaptive benefits such as safety from predators, access to mates and cooperation. However, what are the basic neuronal building blocks that form social interaction and drive an individual to interact with a conspecific? While social behaviors have mainly been studied through global social context or manual scoring of chosen social events, recent advances in markerless pose estimation of animals enable researchers to investigate specific types of social interactions accurately. Here, we initially parsed complex social behavior into simple direct interactions between conspecifics. This allowed us to identify the initiation of active contacts as one of the basic social elements that activates the reward system, possibly to drive social interaction. Although previous studies have shown that DA neurons in the reward system are recruited in social contexts and are essential for expressing social behaviors adapted to different contexts6,10, the data presented here reveal that these neurons may encode specific modes of interaction. The reward system has been described as a circuit that reinforces behaviors associated with natural reward. In particular, the VTA is a structure that functionally encodes motivation for palatable food27,28. Whether social interaction and palatable food share overlapping circuits and neuronal mechanisms within the reward system is still an open question. Although our results suggest the recruitment of different DA neuron subpopulations depending on exposure to the same conspecific and on contact mode (active or passive interactions), future experiments are needed to show whether a subset of DA neurons is selectively and uniquely activated by social interaction.
Within the VTA, a subset of DA neurons responds to novel stimuli29. Indeed, an increase in DA neuron firing in response to novel stimuli and a rapid habituation when a stimulus became familiar has been previously shown30. In the context of social behavior, fiber photometry experiments showed that the strongest activation of DA neurons occurs during the earlier bouts of interaction with the novel conspecific and then rapidly habituates through repeated interaction6. Furthermore, we have previously shown that chemogenetic inhibition of DA neuron activity decreases the behavioural response to social novelty10. In the current study, using in vivo recordings in freely moving mice, we observed an overall habituation of VTA DA neuron activity during repeated exposure to the same conspecific. We observed a high heterogeneity in the individual neuronal responses. Indeed, while some neurons only responded to the novelty, others were activated even when the stimulus became familiar. Our data show that different VTA DA neurons are recruited to encode the novelty and the value of different social contacts. Overall, the large heterogeneity of VTA DA neuron responses depending on whether the contacts are passive, unilateral or reciprocal, through exposure to a conspecific, reflect the high complexity of social behavior. It suggests that individual neuronal populations within the VTA may contribute to the different building blocks that shape social interaction.
The canonical function of VTA DA neurons is to signal RPE defined as the discrepancy between the received and predicted reward15. RPE provides a learning and motivational signal that is needed to estimate future reward and to ultimately make decisions31. Here, we observed that while VTA DA neurons are activated during the social-interaction window of the shaping phase of the SIT, the increase in firing rate shifted to the lever press once the learning had stabilized. We also demonstrated that if the mice unexpectedly did not have access to the social stimulus, the activity of VTA DA neurons decreased. Furthermore, devaluation of the social stimulus using optogenetic inhibition leads to a decrease in lever presses. Together, our findings provide strong support for the hypothesis that VTA DA neurons drive social learning through social prediction error (SPE). Moreover, our data suggest that the initiation of the interaction toward the conspecific works as a learning signal. In future experiments, replicating the results of the SIT using photolabeled DA neurons would definitively confirm all these findings.
We presented an unfamiliar conspecific every day during the learning and instrumental phases. The mice still extinguished the task once it was learned, although it took a long time through the extinction phase, while the reinstatement shows a fast learning of the task again. This finding supports the hypothesis that the learning was supported by the rewarding properties of the conspecific more than the saliency of the task per se. Since DA neurons, in certain situations, support associative rather than model-free learning32, we cannot exclude that within the heterogeneous DA neuron population, some neurons support associative learning independent of the value in the social context.
Since deficits in reinforcement learning may be associated with psychiatric disorders33,34, deficits in prediction error during social context might affect social aspects of psychiatric disorders35–37. It has been suggested that impaired RPE relates to clinical phenotypes of psychiatric disorders and in particular of Autism Spectrum Disorders (ASD)35. Indeed, individuals with ASD display abnormal brain activation during RPE in social contexts, suggesting that the processing of social reward could be more challenging than processing other natural rewards38. These findings support the ‘social motivation’ hypothesis elaborated by Chevallier, which proposes that deficits in social-reward processing may underlie some of the social deficits in ASD37. Understanding the neuronal mechanisms by which social reward modulates learning and motivation will help to highlight how deficits in prediction error in social contexts may lead to social deficits related to ASD35.
Methods
Animals
The study was conducted using wild-type (WT) and transgenic mice on a C57BL/6J background. WT mice (C57BL/6J; N = 169) were obtained from Charles River and used as stimulus mice (4-6 weeks of age). For DA neuron-specific manipulations and recordings, DAT-iresCre mice (Slc6a3tm1.1(cre)Bkmn/J, called DAT-Cre in the rest of the manuscript; N = 53), bred at Charles River, were employed (8-20 weeks of age). Only male animals were used for all experiments. Mice were housed in groups (weaning at P21 – P23) and isolated prior to the experiments, under a 12-hour light – dark cycle (7:00 a.m.–7:00 p.m.) at 22.5°C and controlled humidity (around 55%). All physiology and behavior experiments were performed during the light cycle. For WT and DAT-Cre mice, multiple behavioral tests were performed with the same group of animals, which were assigned randomly to the different behavioral assays and groups. For optogenetic manipulation experiments, the experimenters were blind to the group to which the animals belonged. All the procedures performed at UNIGE complied with the Swiss National Institutional Guidelines on Animal Experimentation and were approved by the respective Swiss Cantonal Veterinary Office Committees for Animal Experimentation.
Multi-unit recording system – Microdrive
VTA DA neuron recordings were performed with 2 octrodes, each composed of 8 nickel-chrome (NiCr) coated wires of 15 μm diameter, and one reference electrode per octrode, made of stainless steel wire (110 μm). The octrodes were inserted in a homemade microdrive composed of a central piece containing a guide cannula and a connector (electrode interface board, EIB, Neuralynx) into which the recording and amplifier cables were plugged. After implantation, depth was adjusted by moving the central piece with a micro-screw. The 2 octrodes and an optic fiber were glued together through the cannula and implanted at the same time, with a difference of 200 – 500 μm between the tips of the octrodes and the optic fiber. Once the assembly was mounted, the impedance at the octrode tips was equalized (≈ 300 kOhm) with a diluted gold plating solution.
Neuronal activity was recorded using a Digital Lynx 4SX acquisition system (Neuralynx) at a 32 kHz sampling rate. A band-pass filter (600 – 6000 Hz) was applied during the recording to extract the fast electrical impulses (the spikes).
Surgery
Injection of rAAV5-Ef1α-DIO-hChR2(H134R)-eYFP (titer ≥ 4.2 × 10^12 vg.mL^-1, UNC Vector Core) was performed in DAT-Cre mice at 4 – 7 weeks. Mice were anesthetized with a mixture of oxygen (1 L/min) and isoflurane 3% (Baxter AG, Vienna, Austria) and placed in a stereotactic frame (Angle One; Leica, Germany). The skin was shaved, locally anesthetized with 40 – 50 μL lidocaine 0.5% and disinfected. A unilateral craniotomy (1 mm in diameter) was then performed over the VTA at the following stereotactic coordinates: ML ± 0.5 mm, AP – 3.2 mm, DV – 4.20 ± 0.05 mm from Bregma. The virus was injected via a glass micropipette (Drummond Scientific Company, Broomall, PA) into the VTA at a rate of 100 nL/min for a total volume of 500 nL. The implantation of the homemade microdrive (see above) was then performed 2 weeks later using the same coordinates. A unilateral craniotomy was made above the VTA and a bilateral craniotomy above the cerebellum to implant the reference wires. The microdrive was then fixed on the skull using dental acrylic.
For optogenetic inhibition, rAAV5-hSyn-FLEX-Jaws-KGC-GFP-ER2 (titer ≥ 3.8 × 10^12 vg.mL^-1, UNC Vector Core) or rAAV5-EF1α-DIO-eYFP (titer ≥ 4.2 × 10^12 vg.mL^-1, UNC Vector Core) was injected in DAT-Cre mice at 4 – 7 weeks. Mice were anesthetized and disinfected as previously described. The animals were placed in a stereotactic frame and bilateral injections were performed in the VTA (ML ± 0.5 mm, AP – 3.2 mm, DV – 4.20 ± 0.05 mm from Bregma, 500 nL per side) using a glass micropipette. The virus was allowed to express for at least 3 weeks. An optic fiber was then implanted above the VTA, unilaterally with a 10° angle, at the following coordinates: ML ± 0.9 mm, AP – 3.2 mm, DV – 3.95 mm from Bregma, and fixed to the skull with dental acrylic.
Injections and implantations sites were confirmed post hoc.
Optogenetic photolabeling of VTA DA neurons
DAT-Cre mice injected with rAAV5-Ef1α-DIO-hChR2(H134R)-eYFP and implanted with the microdrive underwent the optogenetic protocol to validate the dopaminergic nature of the recorded neurons. When a recorded neuron was suspected to be a DA neuron based on the electrophysiological criteria (firing rate < 12 Hz, spike half-width > 1.5 ms, regular spiking activity with typical bursting activity), mice were placed alone in a cage with bedding (20 × 30 × 15 cm). An optic fiber (homemade, materials from ThorLabs) was plugged in and baseline neuronal activity was recorded for 90 s without stimulation. Then a 5 Hz optical stimulation protocol was applied with a blue laser (BioRay 488 nm 20 mW Elliptical Dot Laser, Coherent), with a light-pulse duration of 5 ms and an expected power at the optic fiber tip of 8 – 12 mW (Master-8). After 1000 light pulses the protocol was stopped, and neuronal activity was recorded for a further 1 min under baseline conditions. Using the same procedure, a protocol of 20 Hz light stimulation was then applied. The protocols were always applied after the different social experiments, at the end of the day, to avoid any influence of the light stimulation on the tasks.
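The numerical parts of this protocol can be sketched as follows: a screen on the two quantitative electrophysiological criteria and the timing of the light-pulse trains. The function names are assumptions, the qualitative criterion (regular spiking with bursting) is not captured, and applying 1000 pulses at 20 Hz as well is an assumption based on "the same procedure".

```python
def is_putative_da(firing_rate_hz, half_width_ms,
                   max_rate_hz=12.0, min_half_width_ms=1.5):
    """Screen a unit on firing rate and spike half-width only."""
    return firing_rate_hz < max_rate_hz and half_width_ms > min_half_width_ms


def pulse_train(freq_hz, n_pulses=1000, pulse_ms=5.0, start_s=0.0):
    """Return (onset, offset) times in seconds for each light pulse."""
    period = 1.0 / freq_hz
    width = pulse_ms / 1000.0
    return [(start_s + i * period, start_s + i * period + width)
            for i in range(n_pulses)]


train_5hz = pulse_train(5.0)     # 1000 pulses, one every 200 ms
train_20hz = pulse_train(20.0)   # assumed to use the same number of pulses

# Fraction of time the light is on: pulse width divided by the period.
duty_5hz = 0.005 * 5.0
duty_20hz = 0.005 * 20.0

print(is_putative_da(4.0, 2.0), is_putative_da(20.0, 0.8))
print(f"last 5 Hz onset: {train_5hz[-1][0]:.1f} s, duty cycle {duty_5hz:.1%}")
print(f"last 20 Hz onset: {train_20hz[-1][0]:.2f} s, duty cycle {duty_20hz:.1%}")
```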
Free social interaction task
Male DAT-Cre mice, implanted with a microdrive to record VTA DA neurons, performed a free interaction task. All the animals were isolated 1 week prior to the task. The mice were first placed in a homecage-like cage (20 × 40 × 10 cm) with bedding and VTA DA neuron activity was recorded. After 5 min of neuronal activity recording, constituting the baseline activity, a social stimulus (unfamiliar sex-matched juvenile conspecific) was introduced in the cage for 5 min of free social interaction with the experimental mouse. For the repeated exposure in free social interaction, the baseline and social conditions were repeated 3 times to obtain 3 different trials, for a total duration of 30 min. The same conspecific was used during the 3 trials to study how VTA DA neurons adapt to a repeated social stimulus. At the end of each session of the task, the cage was cleaned using 70% ethanol.
The animals were tracked post-hoc using DeepLabCut23 in python. Two models were built (experimental mice with implanted microdrive and stimuli mice) to detect nose, body and tail of each tracked subject. Distance error between train and test dataset were 0.0034cm for the two models. The rearing behavior was manually scored.
Social Instrumental Task
The operant chamber (MedAssociates) is composed of 2 compartments divided by a gridded auto-guillotine door: 1 chamber of 28 × 16 × 21 cm housing the experimental DAT-Cre mouse, with ad libitum access to a lever on the wall opposite the door, and 1 chamber of 14 × 16 × 21 cm containing the social stimulus (unfamiliar sex-matched juvenile conspecific C57BL/6J mouse, 3 – 6 weeks old). Pressing the lever opens the gridded auto-guillotine door immediately, without delay, for 7 seconds, allowing interaction between the experimental mouse and the social stimulus. The grid prevents passage between chambers. After 7 seconds the door closes. During the whole session, the experimental animals can press the lever without limit to interact. The apparatus was cleaned using 70% ethanol after each session.
All the mice were isolated for 1 week prior to the first session of the experiment to promote motivation to interact. The experimental mouse was placed in the corresponding chamber while neuronal recording was performed. To keep the experimental conditions constant, the animals were always plugged in, even when no neurons were detected during the recording.
The task was performed with 1 daily session of 20 min and was divided into several phases across days:
- Shaping phase (from Day 1 to Day 10): The animals were trained to associate the lever press with the opening of the door and, consequently, with the social interaction. Every time the animals were in proximity to the lever, the experimenter opened the door through the MedAssociates software. Every day the area around the lever was reduced and, on the last day of the shaping phase, the door was opened only when the mice were touching the lever.
- Instrumental Phase (from Day 11 to Day 25): The animals had to perform the task by themselves. A single lever press opened the door and allowed interaction with the conspecific.
Exclusively for behavior:
- Extinction Phase (from Day 26 to Day 75): During this phase, the experimental mice were still able to press the lever and open the door, but no conspecific was present in the other chamber.
- Reinstatement Phase (from Day 76 to Day 80): An unfamiliar conspecific was reintroduced in the other chamber of the apparatus, and the experimental mice were able to interact with the stimulus by pressing the lever.
Exclusively for electrophysiology:
- Intermediate Phase (from Day 6 to Day 10): Subdivision of the Shaping Phase used to better characterize VTA DA activity during the SIT.
- Omission Phase (from Day 26 to Day 35): After the instrumental phase, if VTA DA neurons were still present, some animals underwent a paradigm in which a lever press opened the door randomly, with a 50% probability. The mice were therefore unable to accurately predict the upcoming social interaction with the unfamiliar conspecific.
The conspecific was changed every day so that experimental animals never interacted twice with the same social stimulus across the sessions of the task. During the instrumental phase, a mouse was considered a learner if it pressed the lever at least 10 times on 3 consecutive days. Otherwise the animal was considered a non-learner. The videos and neuronal recordings were monitored and acquired using the Neuralynx system. The animals were tracked and zones delimited using Ethovision software (Noldus).
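The learner criterion stated above can be sketched as a simple streak test. This is an illustration only: the function name, its parameters and the example counts are hypothetical, and the daily lever-press counts are assumed to be available as one integer per session.

```python
def is_learner(daily_presses, min_presses=10, run_length=3):
    """Return True if `run_length` consecutive days each reach `min_presses`."""
    streak = 0
    for presses in daily_presses:
        # Extend the streak on a qualifying day, reset it otherwise.
        streak = streak + 1 if presses >= min_presses else 0
        if streak >= run_length:
            return True
    return False


# Hypothetical lever-press counts for the 15 sessions of the instrumental phase.
example_counts = [2, 5, 11, 12, 9, 10, 13, 15, 8, 7, 12, 14, 16, 11, 10]
print(is_learner(example_counts))
```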
At the end of the task, the animals were sacrificed. The viral infection and the recording electrode placement were verified. If the placement of the recording electrode was outside the VTA, the mice were excluded from the analyses.
Optogenetic inhibition of VTA DA neurons in the Social Instrumental Task
DAT-Cre mice were injected with either AAV5-FLEX-Jaws-GFP or AAV5-DIO-eYFP as control (see surgery part for details). After 1 week of recovery, the animals started to learn the SIT as described above for the shaping phase. To reinforce and accelerate the learning process, at the end of the shaping phase we performed 2 sessions in which the mice stayed in the operant chamber for the whole night (7 p.m.–10 a.m.). During these overnight sessions the experimental mice were able to press the lever to access the social stimulus. After reaching a criterion of an average of 15 lever presses for at least 3 days following the overnight sessions, the mice underwent optic fiber implantation (see surgery part for details). After 5 days of recovery, the animals started the test in the SIT. During the first 5 days (from Day 1 to Day 5) the animals performed the task as usual without any optogenetic light stimulation. From Day 6 to Day 20, the mice performed the task with conditional optogenetic inhibition: constant red laser light (wavelength 640 nm, Coherent Bioray) was emitted for 7 seconds when the mice were in the interaction zone (close to the grid), and only after a lever press (door open). If the mice did not enter the interaction zone within the 7 seconds following the lever press, no light was delivered. The apparatus was cleaned using 70% ethanol after each session.
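The conditional rule for light delivery can be illustrated with a short sketch. It assumes that lever-press and zone-entry timestamps are available in seconds and that the light starts at the first entry into the interaction zone within the 7 s window; the function name and this onset convention are assumptions, not a description of the actual closed-loop setup.

```python
def light_onset(press_time, zone_entry_times, window_s=7.0):
    """Return the first zone-entry time within `window_s` of the press, else None."""
    for t in sorted(zone_entry_times):
        if press_time <= t <= press_time + window_s:
            return t  # light would be triggered at this entry
    return None  # no entry while the door was open: no light


# Hypothetical timestamps (s): a press at 100 s and three zone entries.
print(light_onset(100.0, [98.0, 103.5, 106.0]))
print(light_onset(100.0, [108.0]))
```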
The animals were tracked and zones delimited using Ethovision software (Noldus). At the end of the task, the animals were sacrificed. The viral infection and the optic fiber placement were verified. If the placement of the optic fiber was outside the VTA, the mice were excluded from the analyses.
Immunohistochemistry
DAT-Cre mice injected with rAAV5-Ef1α-DIO-hChR2(H134R)-eYFP were anesthetized with pentobarbital (Streuli Pharma) and sacrificed by intra-cardiac perfusion of 0.9% saline followed by 4% PFA (Biochemica). Brains were post-fixed overnight in 4% PFA at 4 °C. 24 hours later, they were washed with phosphate buffered saline (PBS) and then cut into 50 μm thick slices with a vibratome (Leica VT1200S).
The prepared slices were washed three times with 0.1 M PBS. Brain slices were pre-incubated with PBS-BSA-TX buffer (10% BSA, 0.3% Triton X-100, 0.1% NaN3) for 60 minutes at room temperature in the dark. Subsequently, slices were incubated with primary antibodies diluted in PBS-BSA-TX (3% BSA, 0.3% Triton X-100, 0.1% NaN3) overnight at 4 °C in the dark. The following day slices were washed three times with 0.1 M PBS and incubated for 60 minutes at room temperature in the dark with the secondary antibodies diluted in PBS-Tween buffer (0.25% Tween-20). Finally, slices were mounted using Fluoroshield mounting medium with DAPI (Abcam). In this study, the following primary antibody was used: rabbit polyclonal anti-Tyrosine Hydroxylase (1/500 dilution, Abcam, ab6211). The following secondary antibody was used at 1/500 dilution: donkey anti-rabbit 555 (Alexa Fluor). Immunostained slices were imaged using the confocal laser scanning microscopes Zeiss LSM700 and LSM800. Larger scale images were taken with the widefield Axioscan.Z1 scanner.
Analyses for in vivo recording
Recording
All the in vivo data acquired using the Neuralynx system were extracted and analyzed offline using MATLAB (The MathWorks). Spike sorting was done using custom MATLAB code based on principal component analysis (PCA) and expectation-maximization of Gaussian mixtures (EMGM). After the spike-sorting procedure, all spike timestamps as well as the voltage values of each spike waveform were saved. Putative dopaminergic neurons (pDA) were first identified visually by a wider waveform, a slow firing pattern between 1 and 12 Hz and typical triphasic bursting. Non-dopaminergic neurons (non-pDA) were identified by a narrower waveform and either a high firing rate or a low firing rate with burst events at high frequency (> 15 Hz).
VTA neurons classification
Multiple electrophysiological features were analyzed based on the firing pattern of VTA neurons recorded in our lab and from a public dataset (CRCNS.org vta-1)22. To test whether VTA neurons could be classified based on their electrophysiological properties, a cluster analysis was performed on 55 extracted features. To obtain these features, the probability distributions of the log of the instantaneous frequency of defined events (tonic, bursting, pause, spikes within tonic, spikes within burst, spikes within pause) were computed and the following properties of each distribution were extracted: mean, median, coefficient of variation, skewness and kurtosis. Burst and pause interspike interval (ISI) thresholds were determined and burst and pause strings were identified based on the Robust Gaussian Surprise (RGS) method39. Finally, the frequency peak in the power spectrum was also extracted, using the Welch method to calculate the power spectrum by averaging Fast Fourier Transforms of overlapping window divisions.
The features were first normalized by z-scoring and rescaling the values, and the dimensionality was then reduced using the UMAP technique. Although two distinct clusters were evident, an EMGM was used to define a 95% confidence interval based on the photolabeled DA neurons. Only neurons inside this confidence interval were considered pDA neurons.
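As an illustration of the distribution features listed above, the following sketch computes the five summary statistics from the log instantaneous frequency of a spike train. It is not the authors' code: the function names are hypothetical, the base-10 logarithm and the use of population moments for skewness and kurtosis are assumptions, and the segmentation into tonic, burst and pause events (RGS method) is not reproduced.

```python
import math
import statistics


def log_inst_freq(spike_times):
    """Log10 of the instantaneous frequency (1/ISI) between consecutive spikes."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    return [math.log10(1.0 / isi) for isi in isis if isi > 0]


def distribution_features(values):
    """Mean, median, coefficient of variation, skewness and kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return {
        "mean": mean,
        "median": statistics.median(values),
        "cv": sd / mean if mean != 0 else float("nan"),
        "skewness": m3 / sd ** 3 if sd > 0 else float("nan"),
        "kurtosis": m4 / sd ** 4 if sd > 0 else float("nan"),
    }


# Hypothetical spike times (s) of a slowly firing neuron.
spikes = [0.00, 0.21, 0.45, 0.52, 0.58, 0.95, 1.30]
print(distribution_features(log_inst_freq(spikes)))
```

Applying `distribution_features` to each event type would give 5 features per distribution; how the 55 features of the study are assembled is not specified beyond the description above.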
VTA DA activity and behavior analysis
Peri-event time histograms (PETHs) were constructed by aligning and centering neuronal activity on specific events. These events were obtained by coupling the Neuralynx digital acquisition system with other data acquisition systems (such as the Master-8 or the MedAssociates operant chamber) that sent transistor-transistor logic (TTL) pulses at specific times, linking neuronal activity with events or stimuli, or from synchronized video analysis of specific behaviors.
The neuronal recordings were binned according to the time window of the analysis. For long time scales (time window > 600 s), the spiking frequency was averaged in 1 s bins. For short time scales (time window < 30 s, such as PETHs), the spiking frequency was averaged in 100 ms bins. Subsequent analyses were performed on non-normalized and normalized activity. The non-normalized firing rate corresponds to the number of spikes per second. To obtain normalized spiking activity, the averaged frequency of a given bin (mFreq) was normalized relative to the mean frequency of all the sessions (μFreq). After normalization, the data were convolved with a Gaussian kernel sliding window of 16 bins (gausswin MATLAB function).
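The binning and smoothing steps can be sketched in Python as follows. This is a minimal illustration rather than the MATLAB code used in the study: the function names are hypothetical, the window follows the documented gausswin definition with its default width factor (α = 2.5), and scaling the window to unit sum and renormalizing at the edges of the recording are assumptions. The normalization step itself is left out.

```python
import math


def bin_firing_rate(spike_times, t_start, t_stop, bin_s):
    """Firing rate (spikes/s) in consecutive bins of width `bin_s`."""
    n_bins = int(round((t_stop - t_start) / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        idx = math.floor((t - t_start) / bin_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [c / bin_s for c in counts]


def gaussian_window(n_points, alpha=2.5):
    """Gaussian window w(k) = exp(-0.5 * (alpha * k / half)^2), scaled to unit sum."""
    half = (n_points - 1) / 2.0
    w = [math.exp(-0.5 * (alpha * (k - half) / half) ** 2) for k in range(n_points)]
    total = sum(w)
    return [x / total for x in w]


def smooth(values, window):
    """Same-length convolution; edge bins are renormalized by the weights used."""
    offset = len(window) // 2
    out = []
    for i in range(len(values)):
        acc, weight = 0.0, 0.0
        for k, w in enumerate(window):
            j = i + k - offset
            if 0 <= j < len(values):
                acc += w * values[j]
                weight += w
        out.append(acc / weight)
    return out


# Hypothetical spike times (s) binned at 100 ms and smoothed over 16 bins.
rates = bin_firing_rate([0.05, 0.15, 0.16, 0.95, 1.42, 1.47], 0.0, 3.0, 0.1)
print(smooth(rates, gaussian_window(16))[:5])
```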
Free Interaction
All coordinates tracked by DeepLabCut were transformed into a reference frame with the body of the experimental mouse as the origin and the nose of the experimental mouse aligned to the y-axis. All tracked positions of the social stimulus were then plotted relative to the position of the experimental mouse. Coordinates with a likelihood lower than 95%, as defined by DeepLabCut, were replaced by the last coordinates with a likelihood higher than 95%. The distance and the angle between the experimental and the stimulus positions were computed for each frame (40 ms) and the time distribution probability was computed. The neuronal activity was normalized as previously described when stimuli were present. Then, for each frame in which the stimulus was closer than 20 cm, the corresponding neuronal activity was analyzed as a function of proximity and angle, in the same way as the time distribution of the stimulus position.
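A minimal sketch of this coordinate correction is given below. The function names are hypothetical, treating a likelihood of exactly 0.95 as valid is an assumption, and the sign convention for the angle (0° straight ahead, positive toward +x in the rotated frame) is chosen for illustration only.

```python
import math


def fill_low_likelihood(coords, likelihoods, threshold=0.95):
    """Replace low-likelihood coordinates by the last reliable ones."""
    filled, last_good = [], None
    for xy, p in zip(coords, likelihoods):
        if p >= threshold:
            last_good = xy
        filled.append(last_good)  # stays None until a first reliable frame
    return filled


def to_egocentric(body, nose, point):
    """Express `point` with `body` at the origin and `nose` on the +y axis."""
    heading = math.atan2(nose[1] - body[1], nose[0] - body[0])
    rot = math.pi / 2 - heading  # rotation bringing the heading onto +y
    px, py = point[0] - body[0], point[1] - body[1]
    c, s = math.cos(rot), math.sin(rot)
    return (px * c - py * s, px * s + py * c)


def distance_and_angle(ego_point):
    """Distance to the origin and angle (degrees) from the +y axis."""
    x, y = ego_point
    return math.hypot(x, y), math.degrees(math.atan2(x, y))


# Hypothetical frame: experimental mouse at (10, 10) facing +x, stimulus nose at (13, 14).
ego = to_egocentric((10.0, 10.0), (11.0, 10.0), (13.0, 14.0))
print(distance_and_angle(ego))
```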
To extract interaction events, a proximity threshold of 5 cm, an angle threshold from –110° to +110° (corresponding to the visual field of the experimental mice40) and a duration threshold from 0.2 to 2 seconds were applied to the position of the stimulus mouse's nose (active reciprocal interaction), the base of the stimulus mouse's tail (active unilateral interaction) or the base of the experimental mouse's tail (passive interaction). PETH analyses were then performed: neuronal activity was aligned to the center of the defined interaction events and normalized as previously described using 10 seconds around each interaction event. To quantify the neuronal response, baseline activity was measured as the mean activity between 2 and 5 seconds before and after the event. The activity during the interaction was measured as the mean activity from 1 second before to 1 second after the interaction.
To identify responder and non-responder neurons, we computed, for each neuron, the p-value of a t-test comparing the baseline and interaction activity distributions across trials. A neuron was classified as a responder for a given interaction type when the t-test was significant. The average activity of the neuron during the interaction (below or above the baseline) determined whether the response was negative or positive. Neurons without a response in any interaction type were considered non-responders, and neurons responding to only a subset of interaction types were considered neutral for the interaction types without a response.
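The thresholding described above can be sketched as a run-length detection on per-frame distance and angle. The function name and the returned tuple layout are hypothetical, and the thresholds are treated as inclusive, which is an assumption; durations are compared in whole frames to avoid floating-point edge effects.

```python
import math


def extract_events(distances_cm, angles_deg, frame_s=0.04,
                   max_dist_cm=5.0, max_abs_angle_deg=110.0,
                   min_dur_s=0.2, max_dur_s=2.0):
    """Return (start_s, duration_s, center_s) for each run meeting all thresholds."""
    min_frames = math.ceil(min_dur_s / frame_s - 1e-9)
    max_frames = math.floor(max_dur_s / frame_s + 1e-9)
    events, run_start = [], None
    n = len(distances_cm)
    for i in range(n + 1):  # one extra step closes a run ending at the last frame
        inside = (i < n
                  and distances_cm[i] <= max_dist_cm
                  and abs(angles_deg[i]) <= max_abs_angle_deg)
        if inside and run_start is None:
            run_start = i
        elif not inside and run_start is not None:
            n_frames = i - run_start
            if min_frames <= n_frames <= max_frames:
                start_s = run_start * frame_s
                duration_s = n_frames * frame_s
                events.append((start_s, duration_s, start_s + duration_s / 2))
            run_start = None
    return events


# Hypothetical track: one valid 10-frame approach and one 3-frame contact that is too short.
dist = [10.0] * 10 + [3.0] * 10 + [10.0] * 5 + [3.0] * 3 + [10.0] * 2
angle = [0.0] * len(dist)
print(extract_events(dist, angle))
```

The center time of each event is the value on which a PETH would be aligned.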
Social Instrumental Task (SIT)
Position coordinates, velocity and transition events from the lever to the interaction zone were obtained for the experimental mice directly from Ethovision software after tracking. Lever-press timestamps were obtained from TTL pulses sent by the MedAssociates apparatus. To analyze behavior and neuronal activity, further analyses were performed to extract different events. Transition performances during the interaction window were defined as follows: fast (transitions < 2 s), slow (2 s < transitions < 7 s), delayed (7 s < transitions < 12 s) and missed (transitions > 12 s). Error trials were defined as fast transitions not preceded by a lever press and correct trials as fast transitions preceded by a lever press.
Velocity control events were defined as peaks of velocity occurring outside the trials (when the door was closed) and higher than 20 cm·s⁻¹ (corresponding to the mean velocity of fast transitions when the door was open).
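The transition categories can be expressed as a small lookup on the latency between the lever press and the entry into the interaction zone. The function name is hypothetical, and because the definitions above use strict inequalities, assigning latencies of exactly 2, 7 or 12 s to the slower category is an assumption.

```python
def classify_transition(latency_s):
    """Map a lever-to-interaction-zone latency (s) to a transition category."""
    if latency_s < 2.0:
        return "fast"
    if latency_s < 7.0:
        return "slow"
    if latency_s < 12.0:
        return "delayed"
    return "missed"


# Hypothetical latencies (s) from one session.
latencies = [0.8, 1.9, 3.5, 6.9, 9.0, 15.2]
print([classify_transition(t) for t in latencies])
```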
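Detection of these velocity control events can be sketched as a search for local velocity maxima above threshold while the door is closed. The function name, the per-frame `door_open` flags and the simple three-point peak definition are assumptions made for illustration.

```python
def velocity_control_events(times_s, velocity_cm_s, door_open, threshold=20.0):
    """Times of local velocity peaks above `threshold` while the door is closed."""
    events = []
    for i in range(1, len(velocity_cm_s) - 1):
        v = velocity_cm_s[i]
        # Local maximum: strictly above the previous sample, not below the next.
        is_peak = velocity_cm_s[i - 1] < v >= velocity_cm_s[i + 1]
        if is_peak and v > threshold and not door_open[i]:
            events.append(times_s[i])
    return events


# Hypothetical samples: the 30 cm/s peak occurs while the door is open and is ignored.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v = [5.0, 25.0, 10.0, 30.0, 12.0, 22.0, 8.0]
door = [False, False, False, True, True, False, False]
print(velocity_control_events(t, v, door))
```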
The PETH analysis described above was then performed. Velocity and/or neuronal activity were aligned:
- On lever-press events, or on entries into the interaction zone, during the shaping phase (day 1 to day 5), the intermediate phase (day 6 to day 10) and the instrumental phase (day 11 to day 25).
- For the omission phase, on the lever presses of trials in which the door opened or did not open.
- For the error and correct trials, on the exit from the lever zone during fast transitions (< 2 s) of the instrumental phase between day 15 and day 25. Because the exit from the lever zone does not have the same temporal precision as a lever press, the neuronal activity was then realigned to the peak of VTA DA activity found within 500 ms before or after the exit from the lever zone. As a control, this realignment was applied in the same way to error and correct trials.
- For the velocity control, on the initiation of the action followed by a peak of velocity outside trials during the instrumental phase between day 11 and day 25.
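The realignment to the activity peak used for error and correct trials can be sketched as a bounded search around the lever-zone exit. The function name is hypothetical, 100 ms bins are assumed as for the PETHs, and returning the earliest bin in case of ties is an arbitrary choice.

```python
def realign_to_peak(rate, event_bin, bin_s=0.1, search_s=0.5):
    """Index of the highest-rate bin within +/- `search_s` of `event_bin`."""
    half = int(round(search_s / bin_s))
    lo = max(0, event_bin - half)
    hi = min(len(rate), event_bin + half + 1)
    # max() returns the first maximum, so ties resolve to the earliest bin.
    return max(range(lo, hi), key=lambda i: rate[i])


# Hypothetical binned firing rate (Hz) with a transient three bins before the exit.
rate = [1, 1, 1, 2, 9, 3, 1, 1, 1, 1, 1, 1]
exit_bin = 7
print(realign_to_peak(rate, exit_bin))
```

Each trial would then be re-centered on the returned bin before averaging, with the same search window for error and correct trials.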
To analyze the prediction value, the mean neuronal activity around the lever press (from 1 s before to 1 s after) during the instrumental phase, between day 11 and day 25, was correlated with the time spent in the interaction zone.
Using the same approach as in the free interaction task to identify responder and non-responder neurons, the activity of each neuron in each trial was compared between its own baseline (from 10 s to 1 s before the lever press) and either the interaction window (from 1 s to 7 s after the lever press) or the lever-press event (from 1 s before to 1 s after). We calculated the p-values from paired t-tests for each neuron by comparing the baseline and interaction window distributions. A neuron was classified as a responder when the t-test was significant.
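This analysis can be illustrated with the sketch below, which computes the mean firing rate in the window around each lever press and correlates it with the time spent in the interaction zone. The type of correlation is not specified above, so the Pearson coefficient is used here as an assumption; the function names and example values are hypothetical.

```python
import math


def mean_rate_around(spike_times, event_s, pre_s=1.0, post_s=1.0):
    """Mean firing rate (spikes/s) from `pre_s` before to `post_s` after the event."""
    n = sum(1 for t in spike_times if event_s - pre_s <= t <= event_s + post_s)
    return n / (pre_s + post_s)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-trial values: activity at the press (Hz) and time in the zone (s).
activity = [4.5, 6.0, 5.0, 8.5, 7.0]
zone_time = [2.1, 3.4, 2.5, 5.0, 4.2]
print(pearson_r(activity, zone_time))
```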
Viruses
rAAV5-Ef1α-DIO-hChR2(H134R)-eYFP (titer ≥ 4.2 × 10¹² vg·mL⁻¹, UNC Vector Core), rAAV5-Ef1α-DIO-eYFP (titer ≥ 4.2 × 10¹² vg·mL⁻¹, UNC Vector Core), rAAV5-hSyn-FLEX-JAWS-KGC-GFP-ER2 (titer ≥ 4.2 × 10¹² vg·mL⁻¹, UNC Vector Core).
Statistical analysis
No statistical methods were used to predetermine the number of animals and cells, but suitable sample sizes were estimated based on previous experience and are similar to those generally employed in the field41,42. The animals were randomly assigned to each group at the time of viral injections or behavioral tests. Statistical analysis was conducted with MATLAB (The MathWorks) and GraphPad Prism 7 (San Diego, CA, USA). Statistical outliers were identified using the criterion mean ± 3 × s.e.m. and excluded from the analysis. The normality of sample distributions was assessed with the Shapiro–Wilk criterion and, when violated, non-parametric tests were used. When normally distributed, the data were analyzed with an independent t-test or a paired t-test, while for multiple comparisons one-way ANOVA and repeated measures (RM) ANOVA were used. When normality was violated, the data were analyzed with the Mann–Whitney test or the Wilcoxon matched-pairs signed rank test, while for multiple comparisons, Kruskal–Wallis or Friedman tests were applied, followed by Dunn's test or Bonferroni–Holm correction. For the analysis of variance with two factors (two-way ANOVA or RM two-way ANOVA), normality of the sample distribution was assumed, and the analysis was followed by a Bonferroni–Holm correction or a Bonferroni post hoc test. To compare two variances, the two-sample F-test was used and to compare ratios, the Chi-square test was applied. All statistical tests were two-sided. Data are represented as the mean ± s.e.m. and significance was set at P < 0.05.
Acknowledgements
We would like to thank Christian Lüscher, Manuel Mameli, Philippe Faure, Jérémie Naudé and Sebastiano Bariselli for their comments on the manuscript. We would also like to thank Sebastien Pellat and Lorena Jourdain for technical support.
Funding
This work is supported by the Swiss National Science Foundation (31003A_182326) and the NCCR Synapsy from the Swiss National Science Foundation. Camilla Bellone is also supported by the ERC Consolidator Grant (864552).
Footnotes
Author contributions: C.S. and C.B. conceived the project. C.B., C.S. and B.G. wrote the manuscript. C.S. and B.G. performed the electrophysiological recordings and the behavioral experiments with the help of B.R. and M.T. C.S. and B.G. performed all the analyses and statistics.
Competing interests: The authors declare no competing interests.
Data availability
Original data used in the present study are available at the following link: https://doi.org/10.5281/zenodo.5564893. The dataset contains the spiking activity of VTA DA neurons in mice and event timings during the free social interaction task and the social instrumental task, corresponding to Figures 1, 2, 3, 4, 6 and 7 and Extended Data Figures 1, 2, 3, 4, 6 and 7. Further data supporting the findings are available upon request.
Code availability
Custom code used in the present study is available at the following link: https://doi.org/10.5281/zenodo.5564893. Further code supporting the findings is available upon request.
References
- 1. Chen P, Hong W. Neural Circuit Mechanisms of Social Behavior. Neuron. 2018;98:16–30. doi: 10.1016/j.neuron.2018.02.026.
- 2. Berridge KC, Kringelbach ML. Affective neuroscience of pleasure: reward in humans and animals. Psychopharmacology. 2008;199:457–480. doi: 10.1007/s00213-008-1099-6.
- 3. Alhadeff AL, et al. Natural and Drug Rewards Engage Distinct Pathways that Converge on Coordinated Hypothalamic and Reward Circuits. Neuron. 2019;103:891–908.e6. doi: 10.1016/j.neuron.2019.05.050.
- 4. Panksepp JB, Lahvis GP. Social reward among juvenile mice. Genes Brain Behav. 2007;6:661–671. doi: 10.1111/j.1601-183X.2006.00295.x.
- 5. Dölen G, Darvishzadeh A, Huang KW, Malenka RC. Social reward requires coordinated activity of nucleus accumbens oxytocin and serotonin. Nature. 2013;501:179–184. doi: 10.1038/nature12518.
- 6. Gunaydin LA, et al. Natural Neural Projection Dynamics Underlying Social Behavior. Cell. 2014;157:1535–1551. doi: 10.1016/j.cell.2014.05.017.
- 7. Tamir DI, Hughes BL. Social Rewards: From Basic Social Building Blocks to Complex Social Behavior. Perspect Psychol Sci. 2018;13:700–717. doi: 10.1177/1745691618776263.
- 8. Hu RK, et al. An amygdala-to-hypothalamus circuit for social reward. Nat Neurosci. 2021:1–12. doi: 10.1038/s41593-021-00828-2.
- 9. Izuma K, Saito DN, Sadato N. Processing of Social and Monetary Rewards in the Human Striatum. Neuron. 2008;58:284–294. doi: 10.1016/j.neuron.2008.03.020.
- 10. Bariselli S, et al. Role of VTA dopamine neurons and neuroligin 3 in sociability traits related to nonfamiliar conspecific interaction. Nat Commun. 2018;9:3173. doi: 10.1038/s41467-018-05382-3.
- 11. Schultz W, Dayan P, Montague PR. A Neural Substrate of Prediction and Reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
- 12. Eshel N, et al. Arithmetic and local circuitry underlying dopamine prediction errors. Nature. 2015;525:243–246. doi: 10.1038/nature14855.
- 13. Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci. 2007;10:1615–1624. doi: 10.1038/nn2013.
- 14. Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48. doi: 10.1038/35083500.
- 15. Schultz W. Reward prediction error. Curr Biol. 2017;27:R369–R371. doi: 10.1016/j.cub.2017.02.064.
- 16. Sharpe MJ, et al. Lateral Hypothalamic GABAergic Neurons Encode Reward Predictions that Are Relayed to the Ventral Tegmental Area to Regulate Learning. Curr Biol. 2017;27:2089–2100.e5. doi: 10.1016/j.cub.2017.06.024.
- 17. Takahashi YK, et al. Dopamine Neurons Respond to Errors in the Prediction of Sensory Features of Expected Rewards. Neuron. 2017;95:1395–1405.e3. doi: 10.1016/j.neuron.2017.08.025.
- 18. Engelhard B, et al. Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons. Nature. 2019;570:509–513. doi: 10.1038/s41586-019-1261-9.
- 19. Kremer Y, Flakowski J, Rohner C, Lüscher C. Context-dependent multiplexing by individual VTA dopamine neurons. J Neurosci. 2020;40:JN-RM-0502-20. doi: 10.1523/JNEUROSCI.0502-20.2020.
- 20. Bariselli S, Contestabile A, Tzanoulinou S, Musardo S, Bellone C. SHANK3 Downregulation in the Ventral Tegmental Area Accelerates the Extinction of Contextual Associations Induced by Juvenile Non-familiar Conspecific Interaction. Front Mol Neurosci. 2018;11:360. doi: 10.3389/fnmol.2018.00360.
- 21. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 2012;482:85–88. doi: 10.1038/nature10754.
- 22. Starkweather CK, Gershman SJ, Uchida N. The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty. Neuron. 2018;98:616–629.e6. doi: 10.1016/j.neuron.2018.03.036.
- 23. Mathis A, et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat Neurosci. 2018;21:1281–1289. doi: 10.1038/s41593-018-0209-y.
- 24. Lisman JE, Grace AA. The Hippocampal-VTA Loop: Controlling the Entry of Information into Long-Term Memory. Neuron. 2005;46:703–713. doi: 10.1016/j.neuron.2005.05.002.
- 25. Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in Motivational Control: Rewarding, Aversive, and Alerting. Neuron. 2010;68:815–834. doi: 10.1016/j.neuron.2010.11.022.
- 26. Tapper AR, Molas S. Midbrain circuits of novelty processing. Neurobiol Learn Mem. 2020;176:107323. doi: 10.1016/j.nlm.2020.107323.
- 27. Berridge KC. ‘Liking’ and ‘wanting’ food rewards: Brain substrates and roles in eating disorders. Physiol Behav. 2009;97:537–550. doi: 10.1016/j.physbeh.2009.02.044.
- 28. Meye FJ, Adan RAH. Feelings about food: the ventral tegmental area in food reward and emotional eating. Trends Pharmacol Sci. 2014;35:31–40. doi: 10.1016/j.tips.2013.11.003.
- 29. Menegas W, Akiti K, Amo R, Uchida N, Watabe-Uchida M. Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli. Nat Neurosci. 2018;21:1421–1430. doi: 10.1038/s41593-018-0222-1.
- 30. Ljungberg T, Apicella P, Schultz W. Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol. 1992;67:145–163. doi: 10.1152/jn.1992.67.1.145.
- 31. Berke JD. What does dopamine mean? Nat Neurosci. 2018;21:787–793. doi: 10.1038/s41593-018-0152-y.
- 32. Sharpe MJ, et al. Dopamine transients do not act as model-free prediction errors during associative learning. Nat Commun. 2020;11:106. doi: 10.1038/s41467-019-13953-1.
- 33. Geugies H, et al. Impaired reward-related learning signals in remitted unmedicated patients with recurrent depression. Brain. 2019;142:2510–2522. doi: 10.1093/brain/awz167.
- 34. Chevrier A, et al. Disrupted reinforcement learning during post-error slowing in ADHD. Biorxiv. 2018:449975. doi: 10.1101/449975.
- 35. Sinha P, et al. Autism as a disorder of prediction. Proc National Acad Sci. 2014;111:15220–15225. doi: 10.1073/pnas.1416797111.
- 36. Mosner MG, et al. Neural Mechanisms of Reward Prediction Error in Autism Spectrum Disorder. Autism Res Treat. 2019;2019:5469191. doi: 10.1155/2019/5469191.
- 37. Chevallier C, Kohls G, Troiani V, Brodkin ES, Schultz RT. The social motivation theory of autism. Trends Cogn Sci. 2012;16:231–239. doi: 10.1016/j.tics.2012.02.007.
- 38. Kinard JL, et al. Neural Mechanisms of Social and Nonsocial Reward Prediction Errors in Adolescents with Autism Spectrum Disorder. Autism Res. 2020;13:715–728. doi: 10.1002/aur.2273.
- 39. Storey GP, et al. Nicotine Modifies Corticostriatal Plasticity and Amphetamine Rewarding Behaviors in Mice. Eneuro. 2016;3:ENEURO.0095-15.2015. doi: 10.1523/ENEURO.0095-15.2015.
- 40. Prusky GT, Alam NM, Douglas RM. Enhancement of Vision by Monocular Deprivation in Adult Mice. J Neurosci. 2006;26:11554–11561. doi: 10.1523/JNEUROSCI.3396-06.2006.
- 41. Matsumoto H, Tian J, Uchida N, Watabe-Uchida M. Midbrain dopamine neurons signal aversion in a reward-context-dependent manner. ELife. 2016;5:e17328. doi: 10.7554/eLife.17328.
- 42. Tian J, Uchida N. Habenula lesions reveal that multiple mechanisms underlie dopamine prediction errors. Neuron. 2015;87:1304–1316. doi: 10.1016/j.neuron.2015.08.028.