Author manuscript; available in PMC: 2016 Jan 4.
Published in final edited form as: Curr Biol. 2014 May 29;24(12):1347–1353. doi: 10.1016/j.cub.2014.04.044

Role of the primate ventral tegmental area in reinforcement and motivation

John T Arsenault 1,2, Samy Rima 1, Heiko Stemmann 1, Wim Vanduffel 1,2,3,*
PMCID: PMC4698409  NIHMSID: NIHMS591629  PMID: 24881876

Summary

Monkey electrophysiology [1, 2] suggests that the activity of the ventral tegmental area (VTA) helps regulate reinforcement learning and motivated behavior, in part by broadcasting prediction-error signals throughout the reward system. However, electrophysiological studies do not allow causal inferences about the role of VTA neurons in these processes, as such inferences require artificial manipulation of neuronal firing. Rodent studies fulfilled this requirement by demonstrating that electrical and optogenetic VTA stimulation can induce learning and modulate downstream structures [3–7]. Still, the primate dopamine system has diverged significantly from that of rodents, exhibiting greatly expanded and uniquely distributed cortical and subcortical innervation patterns [8]. Here we bridge the gap between rodent perturbation studies and monkey electrophysiology using chronic electrical microstimulation of the macaque VTA (VTA-EM). VTA-EM reinforced cue selection in an operant task and motivated future cue selection in a Pavlovian paradigm. Moreover, by combining VTA-EM with concurrent functional magnetic resonance imaging (fMRI), we demonstrated that VTA-EM increased fMRI activity throughout most of the dopaminergic reward system. These results establish a causal role for the primate VTA in regulating stimulus-specific reinforcement and motivation and in modulating activity throughout the reward system.

Results

VTA-EM reinforces operant behavior (Experiment 1)

The firing pattern of VTA neurons is consistent with their putative function in reinforcement learning and motivated behavior [1, 2, 9]. Establishing a causal role for the primate VTA in such processes, however, has been hampered by a lack of targeted focal perturbation studies. We therefore developed an MRI-guided method to perform chronic VTA-EM in nonhuman primates (see Supplemental Experimental Procedures). Peri-operative high-resolution imaging (Figure 1A, Movie S1) was used to direct the insertion of a guide tube and a microwire electrode array [10] and to confirm the final positioning of the electrodes (Figure 1B). After electrode implantation, we tested whether VTA-EM plays a causal role in positive reinforcement using an operant conditioning paradigm.

Figure 1. MRI-guided guidetube/electrode implantation.


A) Tri-planar cross-section of a T1-weighted anatomical image acquired during guide tube insertion. The hypointensity induced by the guide tube (see Movie S1) was used to estimate its trajectory/position during surgery (blue cylinder); the estimated VTA target was projected from the guide tube trajectory (red sphere). B) Post-operative T1-weighted anatomical image used to confirm the final electrode position. This transverse slice was the most ventral to exhibit hypointensity from the electrode. The inset displays an expanded view of the midbrain and electrode with the SN outlined. See also Movie S1.

Monkeys first performed a baseline cue preference test measuring their preferences between two simultaneously presented visual cues in a free choice task. In each session, a new set of cues was used. Individual trials began with a randomized wait period (1000–1500 ms) during which the monkey was required to fixate on a centrally positioned white square. After this, the white square was removed and two visual cues appeared simultaneously on the left and the right side of the screen (Figure 2A). Monkeys were allowed to freely select one of the two cues by saccading to their choice. To motivate cue selection, 50% of all saccades were rewarded with juice (0.07 ml). Critically, juice reward probabilities were equalized across cue positions (left or right) and cue identity (cue A or cue B), and hence, were completely independent of the monkey’s choice (see Supplemental Experimental Procedures).

Figure 2. VTA-EM reinforces cue selection (Experiment 1).


A) Four pseudo-randomized, equiprobable trial types used in the free-choice visual cue preference test. New pairs of cues were used in each session. Juice reward probability was equalized across cue position and cue identity. B) Timing schematic of cue presentation, eye movements, juice reward (100 ms, 50% of trials) and VTA-EM (200 ms, 50% of selections of the VTA-associated cue during cue-VTA-EM blocks). Juice and VTA-EM occurred 32–48 ms after cue selection. Cue preference index [(cue B selections − cue A selections)/(cue B selections + cue A selections)] during a single example session of the operant task for subjects M1 (C) and M3 (D). The cue preference index was calculated in bins of 100 and 200 trials for M1 and M3, respectively. The color of the data points denotes the cue selection followed by VTA-EM on 50% of trials (gray – no VTA-EM; red – cue B-VTA-EM; green – cue A-VTA-EM). VTA-EM consisted of a 200 ms train of bipolar stimulation pulses (200 Hz; 650 μA (M1), 1 mA (M3); 2 VTA electrodes stimulated simultaneously). Mean cue preference indices during the 2nd half of each block type for each full session performed by M1 (E) and M3 (F). Green lines denote sessions with a consistent trend toward increased preference for the cue reinforced with VTA-EM, while red lines represent the opposite trend. See also Figure S1 and Table S1.

For consistency across sessions, the preferred and non-preferred cues during the baseline test were designated cue A and cue B, respectively. After the baseline preference test was completed, a cue B-VTA-EM block began, in which 50% of all cue B selections were followed by VTA-EM. VTA-EM consisted of a 200 ms train of bipolar stimulation pulses (200 Hz; 650 μA–1 mA; 2 VTA electrodes; EM parameters, except the current, were identical across experiments 1–3). Importantly, to determine whether VTA-EM reinforced preceding actions, VTA-EM occurred 32–48 ms after cue selection (Figure 2B). Juice rewards were given in 50% of the trials but were entirely independent of VTA-EM, cue identity and cue position. After the cue B-VTA-EM block, we began pairing VTA-EM with cue A selections (using the paradigm explained above) and stopped pairing cue B selections with VTA-EM (cue A-VTA-EM block).

To quantify the monkey’s cue selection behavior, a cue preference index was calculated: [(cue B selections − cue A selections)/(cue B selections + cue A selections)]. This index ranges from 1 to −1, indicating a total preference for cue B or A, respectively. Cue preference indices taken from example sessions of M1 and M3 (Figure 2C, D) provide clear evidence that the subject’s preference for the cue associated with VTA-EM increased during the cue-VTA-EM blocks. Furthermore, these data indicate that the shift in cue preference was largest during the later stages of an EM block, as expected after repeated reinforcement. To quantify these effects, we split the data into the first and second half of each block (baseline, cue A- and cue B-VTA-EM) and calculated the mean cue preference during each half block. Because the effect of VTA-EM on cue preference was most evident during the second half of EM blocks (i.e., after the value of both cues could be sampled repeatedly), the mean cue preference during the second half of each block was compared across sessions. The mean cue preference of both M1 and M3 (Figure S1A, C) showed a main effect of block (Friedman test, p < 0.05). Comparison of the mean cue preference during the 2nd half of blocks from M1 and M3 demonstrates the consistency of the VTA-EM effect across sessions (Figure 2E, F). Next, we hypothesized that the preference for the VTA-EM-associated cue should increase as a function of time within an EM block, since positive feedback should occur between VTA-EM reinforcement and increased cue selection. Therefore, we calculated the correlation between elapsed time within a cue-VTA-EM block and preference for the cue associated with VTA-EM. Both M1 (mean r = 0.55, sem r = 0.15, p = 0.03) and M3 (mean r = 0.47, sem r = 0.06, p = 6.59 × 10⁻⁶) exhibited a significant, positive correlation (sign-rank test, p < 0.05) across blocks, confirming the hypothesis that cue preference increased as a function of time (Figure S1B, D).
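The cue preference index above is straightforward to compute from a trial-by-trial record of selections. The sketch below is illustrative only; the function names and the "A"/"B" selection encoding are our own, not from the paper:

```python
from typing import Sequence

def cue_preference_index(selections: Sequence[str]) -> float:
    """(cue B selections - cue A selections) / (cue B + cue A selections).

    Ranges from +1 (exclusive preference for cue B) to -1 (exclusive
    preference for cue A).
    """
    b = sum(1 for s in selections if s == "B")
    a = sum(1 for s in selections if s == "A")
    return (b - a) / (b + a)

def binned_preference(selections: Sequence[str], bin_size: int) -> list:
    """Index per non-overlapping bin, e.g. bins of 100 (M1) or 200 (M3) trials."""
    return [cue_preference_index(selections[i:i + bin_size])
            for i in range(0, len(selections) - bin_size + 1, bin_size)]
```

For example, 75 cue B selections out of 100 trials give an index of (75 − 25)/100 = 0.5.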

In an effort to better understand the effect of juice and VTA-EM reinforcement on trial-by-trial cue selection behavior, we utilized Kalman filter learning models (http://www.cs.bris.ac.uk/home/rafal/rltoolbox/, see Supplemental Experimental Procedures) [11, 12]. For each monkey, two separate learning models were generated: one used juice administration as the reward input, the other VTA-EM. All other model parameters were left free and were fit to each subject’s trial-by-trial cue selection behavior to maximize the likelihood of the animal’s observed behavior (Table S1). We found that the models using VTA-EM as the reward input provided a better fit for the cue selection behavior (see Akaike information criterion (AIC) calculation in Supplemental Experimental Procedures) of both M1 (ΔAIC = 9.6, AICweight = 121.5) and M3 (ΔAIC = 152.6, AICweight = 1.37 × 10³³). Thus, VTA-EM reinforcement better explains trial-by-trial cue selection behavior, despite the same frequency of juice and VTA-EM events following the VTA-EM-associated cue. The better fit of the VTA-EM models likely results from transition periods, when the relative frequency of reinforcement events is exactly balanced between cues for juice but not for VTA-EM, predicting the subsequent shift in cue preference. We next examined the standard deviation of the diffusion (σd) parameter to infer learning rates, as larger σd values lead to higher learning rates. The juice models for M1 and M3 displayed higher σd values, indicating that cue selection was more sensitive to juice administration in recent trials, whereas reinforcement from VTA-EM events was integrated over longer periods of time. Next, the exploration, or inverse temperature, parameter was compared between the two reinforcers; higher values indicate less noisy selections (i.e., more frequent selection of the high-value cue as calculated by the model). The larger exploration parameter of the VTA-EM models for both M1 and M3 therefore indicates that cue selection behavior was less noisy when VTA-EM reinforcement was modeled. Thus, the Kalman filter learning models confirmed in both animals that VTA-EM reinforcement better accounted for trial-by-trial cue selection behavior, while indicating that VTA-EM reinforcement was integrated over longer time periods and used to exploit the high-value cue more often relative to equiprobable juice reinforcement.
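As a rough illustration of this modeling approach, a minimal Kalman-filter learner for a two-cue choice task can be sketched as follows. All parameter values and function names here are hypothetical stand-ins, not the fitted values of Table S1; note how σd sets the effective learning rate and the inverse temperature β sets the choice noise. The final helper reflects our reading (an assumption, not stated in the text) that the reported "AICweight" values behave like the evidence ratio exp(ΔAIC/2), since exp(9.6/2) ≈ 121.5 and exp(152.6/2) ≈ 1.37 × 10³³:

```python
import math
import random

def kalman_bandit(reward_fn, n_trials, sigma_d=0.1, sigma_o=1.0, beta=3.0, seed=0):
    """Minimal two-cue Kalman-filter learner (hypothetical parameter values).

    Each cue's value estimate diffuses between trials (variance sigma_d**2),
    so a larger sigma_d discounts old evidence faster (a higher learning rate);
    choices are softmax with inverse temperature beta (higher = less noisy).
    """
    rng = random.Random(seed)
    mean, var = [0.0, 0.0], [1.0, 1.0]  # posterior mean/variance per cue (A=0, B=1)
    choices = []
    for _ in range(n_trials):
        p_b = 1.0 / (1.0 + math.exp(-beta * (mean[1] - mean[0])))  # softmax choice
        c = 1 if rng.random() < p_b else 0
        choices.append(c)
        var = [v + sigma_d ** 2 for v in var]    # diffusion: uncertainty grows
        k = var[c] / (var[c] + sigma_o ** 2)     # Kalman gain for the chosen cue
        mean[c] += k * (reward_fn(c) - mean[c])  # update toward observed outcome
        var[c] *= 1.0 - k
    return choices, mean

def evidence_ratio(delta_aic):
    """exp(dAIC/2): relative likelihood of the better-fitting model."""
    return math.exp(delta_aic / 2.0)
```

Reinforcing only cue B (a `reward_fn` returning 1 for cue B and 0 for cue A) drives the learner toward near-exclusive cue B selection, qualitatively mirroring the preference shifts in Figure 2.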

Pavlovian cue-VTA-EM selectively motivates future cue selection (Experiment 2)

We developed a paradigm to assess whether Pavlovian cue-VTA-EM associations would motivate cue selection during a subsequent instrumental task. Importantly, within this paradigm there was no direct relationship between actions and VTA-EM during the association block, or between the cue and VTA-EM during the instrumental block. This paradigm therefore offers insight into the motivational function of VTA-EM because it assesses whether a cue acquires incentive motivation during a Pavlovian association [13, 14]. The paradigm began with a 400-trial baseline cue preference test identical to the baseline test of experiment 1, i.e., in the absence of VTA-EM (Figure 3A). After this test, the subject was exposed to a 20-minute Pavlovian cue-VTA-EM association block, during which it performed a passive fixation task to obtain juice rewards (0.03 ml) every 800–1200 ms, while every 3500–6000 ms one of the two visual cues was randomly presented. VTA-EM occurred 400 ms into every 500 ms presentation of the initially non-preferred cue B. During this block, cue A was presented as often as cue B but was never coupled with VTA-EM. After this Pavlovian association block, an identical 400-trial cue preference test was performed. If animal performance permitted, another 20-minute Pavlovian association block was performed in which the VTA-EM-coupled cue was switched to cue A, followed by another 400-trial cue preference test block.

Figure 3. Pavlovian cue-VTA-EM association motivates future cue selection (Experiment 2).


A) Paradigm consisted of a 20 min. Pavlovian cue-VTA-EM association block surrounded by two cue preference test blocks (no VTA-EM). During the Pavlovian association block, the monkey performed a passive fixation task (0.03 ml juice every 800 – 1200 ms) while only one of the two visual cues (500 ms presentation) shown every 3500 – 6000 ms was temporally associated with VTA-EM (400 ms into cue presentation, bipolar, 200 ms, 200Hz, 1mA, 2 VTA electrodes stimulated simultaneously). The cue preference index from cue preference tests was calculated in bins of 100 trials from single example sessions performed by M2 (B) and M3 (C). Color of data points denotes the preceding Pavlovian association block (gray – no-VTA-EM; red – cue B-VTA-EM; green – cue A-VTA-EM). Mean cue preference index values for each pair of blocks performed by M2 (D) and M3 (E). Green lines denote pairs of cue preference test blocks with a trend for an increased preference of the cue associated with VTA-EM during the intervening Pavlovian association block while red lines represent the opposite trend. See also Figure S2.

Cue preference indices from example sessions in M2 (Figure 3B) and M3 (Figure 3C) demonstrate an increased preference for cue B following the cue B-VTA-EM Pavlovian association block. After the subsequent cue A-VTA-EM association block, cue preference shifted again, this time towards cue A. To quantify the effect of cue-VTA-EM associations on cue preference, the mean cue preference before an association block was compared to the mean cue preference afterwards. For consistency across these pairs of cue preference test blocks, the cue associated with VTA-EM during the intervening association block was designated cue B. Both M2 and M3 (Figure S2A, B) exhibited a significantly increased preference for cue B after cue B-VTA-EM association (sign-rank test, p < 0.05). Examination of the mean cue preference between the two preference tests demonstrates the consistency of the VTA-EM effect in experiment 2 for M2 and M3 (Figure 3D, E). This cue-specific effect is comparable to specific Pavlovian-instrumental transfer (PIT) [15–17], as both paradigms demonstrate that incentive motivation acquired through Pavlovian association can be selectively transferred to an instrumental task. Importantly, to encourage responses from which cue-selective effects could be monitored, the post-association instrumental task was performed with a 50% juice reward probability rather than under extinction, as is characteristic of traditional PIT paradigms. Thus, our paradigm was not well suited to examine the general form of PIT, during which general increases in vigor are displayed [15–17]. Nonetheless, because subjects could respond immediately after visual cue presentation, we examined changes in reaction times (RTs) as an indicator of vigor. No significant change in RT (sign-rank test, p > 0.25) between preference tests performed before and after Pavlovian association blocks was found for either M2 (n = 6 pairs of blocks, p = 1.00) or M3 (n = 28 pairs of blocks, p = 0.265). Overall, experiment 2 demonstrated that cue-VTA-EM associations allowed a cue to gain incentive motivation, leading to its increased selection during a subsequent operant task.

VTA-EM increases fMRI activity in the dopaminergic reward network

The dopaminergic reward network consists of structures that receive dense dopaminergic innervation and respond during reward-related tasks. Comparison of a meta-analysis of 142 human fMRI studies of reward processing [18] with primate dopamine receptor innervation [19, 20] identifies the nucleus accumbens (NA), caudate, putamen, thalamus, orbitofrontal cortex (OFC), anterior insula, anterior cingulate cortex (ACC) and prefrontal cortex (PFC) as major nodes in this network. The hippocampus and the amygdala also exhibit reward responses and strong dopaminergic connections [20]. Human fMRI studies have found that nodes within this network, such as the ventral striatum and OFC, exhibit modulations in functional activity that correlate with reward prediction-error signals similar to those coded by the phasic activity of dopamine neurons [21, 22]. This correlative evidence suggests that reward activity within these regions may be driven, in part, by phasic VTA responses. Nonetheless, the distributed network of structures modulated by phasic changes in VTA activity has not been causally investigated in primates. To do so, we utilized combined VTA-EM and fMRI [23].

During the scanning procedure, juice rewards (0.03 ml) were administered every 800–1200 ms to maintain fixation behavior, while VTA-EM and control trials (no VTA-EM), used to assess the brain-wide functional effects of VTA-EM, occurred every 3900–6400 ms. To generate a representative map of the activations elicited by VTA-EM, fMRI datasets from each subject were first coregistered to the 112 RM-SL space [24]. A whole-brain, voxel-by-voxel GLM analysis was then performed (see Supplemental Experimental Procedures). The resultant statistical image (VTA-EM – no VTA-EM, FDR corrected, P = 0.001, cluster size 10 voxels) reveals a broad network of activated structures (Figure 4). To quantify the regions activated by VTA-EM, we determined the volume of overlap between a large group of anatomical ROIs and activated voxels (Table S2A). Activations were found within many of the nodes of the dopaminergic reward network discussed above and were predominantly ipsilateral to the site of VTA-EM. Interestingly, the current levels used to robustly activate the dopaminergic reward network (≤392 μA) were much weaker than those needed to reinforce behavior (≥650 μA).
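The voxel-wise thresholding step can be illustrated with a generic Benjamini-Hochberg FDR procedure over a vector of voxel p-values. This is a stand-in sketch: the paper's exact correction pipeline is specified only in its Supplemental Experimental Procedures, and the function name is our own:

```python
import numpy as np

def fdr_mask(p_values, q=0.001):
    """Benjamini-Hochberg FDR: boolean mask of tests surviving at rate q."""
    p = np.asarray(p_values, dtype=float).ravel()
    order = np.argsort(p)
    m = p.size
    # Largest rank k with p_(k) <= q * k / m; all smaller p-values survive too.
    below = p[order] <= q * np.arange(1, m + 1) / m
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()
        mask[order[:k + 1]] = True
    return mask.reshape(np.shape(p_values))
```

Cluster-size thresholding (the paper's 10-voxel criterion) would then be applied to the surviving voxels, e.g. with a connected-components pass over the resulting mask.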

Figure 4. fMRI activations induced by VTA-EM (Experiment 3).


Group analysis T-score maps overlaid on coronal slices of the 112 RM-SL T1/T2* anatomical volume (n = 35 runs, M1 = 12 runs, M2 = 5 runs, M3 = 18 runs, fixed effect analysis, VTA-EM – No VTA-EM, FDR corrected, P = 0.001, cluster size 10 voxels). VTA-EM consisted of a 200 ms train of bipolar stimulation pulses (200 Hz, 200 ms, 100 μA – 392 μA, 2 VTA electrodes stimulated simultaneously). Abbreviations: AIP - anterior intraparietal; cnMD - centromedian nucleus; Cd - caudate; DO - dorsal opercular; G - gustatory; GrF - granular frontal; Hc - hippocampus; NA - nucleus accumbens; PAG - periaqueductal gray; Pu - putamen; PrCo - precentral opercular; RN - red nucleus; TPO - temporal parietal occipital; VL - ventral lateral nucleus. See also Figure S3 and Table S2.

To more directly assess the correspondence between the regions activated by VTA-EM and those activated by a natural reinforcer, we compared VTA-EM-driven activity to fMRI activity generated by unexpected juice rewards (see Supplemental Experimental Procedures). Because M1 and M2 were not available for further experiments, comparisons were made with a separate group of animals [24, their experiment 2]. Juice-driven activity was thresholded at the same level as VTA-EM in Figure 4 (juice – fixation, FDR corrected, p = 0.001, cluster size 10 voxels) and a conjunction analysis was performed (Figure S3, see Supplemental Experimental Procedures). All anatomical ROIs displaying direct voxel-to-voxel colocalization of VTA-EM- and juice-driven activity are reported in Table S2B (highlighted in red). Several ROIs displayed activations in response to both unexpected juice and VTA-EM but within non-overlapping voxels (Table S2A, S2C, highlighted in green). Despite this lack of exact spatial correspondence at the voxel level, this analysis demonstrates that activation of these anatomical structures was common to both natural (unexpected juice) and artificial (VTA-EM) reinforcement. The majority of regions activated by both VTA-EM and juice [45B (PFC), areas 12 and 13 (OFC), area 24 (ACC), area AIP, caudate, gustatory cortex, insula, putamen, PrCo and ventral lateral nucleus (VL, thalamus)] were also identified in a meta-analysis of human reward studies [18] or have been found to respond to primary reinforcers [25, 26]. This correspondence confirms that VTA-EM activates most of the structures typically activated by natural reinforcers. The other regions displaying juice and VTA-EM activations were mainly somatosensory and (pre)motor regions (areas 1–2, 3a/b, F1, F3, F5a, F5c, PF and SII). While activation of these structures in the juice experiment could result from the motor component of juice consumption, there was no difference in the juice administered temporally surrounding VTA-EM and no-VTA-EM events (see Supplemental Experimental Procedures). Interestingly, these regions receive dense dopaminergic innervation and are affected by dopaminergic modulation [27, 28]. Regions driven by juice but not VTA-EM were predominantly found within higher-order visual areas (V4D, LIPi, FST, LST, MSTd, MT, TEO, TE, V6, V6A, STPm) (Table S2C). Activation of these regions in response to rewards has been observed in previous monkey and human fMRI studies [29, 30], suggesting that, with the stimulation parameters utilized in this study, VTA-EM has little effect on higher-order visual regions. Lastly, the voxel-by-voxel analyses revealed that NA, a key node in the dopaminergic reinforcement network, was activated by VTA-EM but not juice. In contrast, an ROI analysis of the anatomically defined NA revealed stronger activations by juice (n = 40 runs) than VTA-EM (n = 35 runs) in both left (VTA-EM mean PSC = 0.09, juice mean PSC = 0.19, rank-sum test, p = 0.03) and right NA (VTA-EM mean PSC = 0.14, juice mean PSC = 0.24, rank-sum test, p = 0.007). This suggests that juice rewards increase fMRI activity more broadly throughout NA, while VTA-EM induced stronger yet more focal activations within NA. Despite the differences between the activation maps generated by VTA-EM and unexpected juice rewards, both reinforcers recruit a largely overlapping set of structures, many of them reward-processing structures with dense dopaminergic innervation.
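The NA ROI comparison above can be sketched with SciPy's two-sided Wilcoxon rank-sum test on per-run percent-signal-change (PSC) values. The data below are simulated around the reported left-NA means (0.09 for VTA-EM, 0.19 for juice) purely for illustration; the spread of the values is our assumption, not the paper's data:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
# Simulated per-run PSC values, centred on the reported left-NA means.
psc_vta_em = rng.normal(loc=0.09, scale=0.05, size=35)  # n = 35 VTA-EM runs
psc_juice = rng.normal(loc=0.19, scale=0.05, size=40)   # n = 40 juice runs

stat, p = ranksums(psc_vta_em, psc_juice)  # two-sided rank-sum test
significant = p < 0.05
```

With a group difference of this size, the test comfortably rejects the null, echoing the reported left-NA result (p = 0.03).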

Discussion

We have demonstrated that the monkey VTA can be accurately and precisely targeted for chronic EM using peri-operative MRI guidance. VTA-EM selectively reinforced and motivated behavior during operant and Pavlovian conditioning paradigms, as demonstrated by the increased selection of a particular visual cue following either the reinforcement of its selection with VTA-EM (exp. 1) or its previous Pavlovian coupling with VTA-EM (exp. 2). This work therefore establishes a causal role for primate VTA activity in the selective assignment of motivational value to visual cues, confirming fundamental aspects of the hypothesized functional role of phasic neuronal VTA activity in primate behavior [1, 2, 13, 31]. Finally, by combining fMRI with simultaneous VTA-EM, we demonstrated that artificially increased VTA activity increases fMRI activity throughout most nodes of the dopaminergic reward network.

Comparison with VTA-EM and optogenetic stimulation in rodents

Many rodent studies have demonstrated that VTA-EM [4, 7, 32], unaccompanied by other reinforcers, can reinforce operant behavior. In contrast, we monitored the behavioral effects of VTA-EM during tasks that were also reinforced with equiprobable juice rewards. Juice rewards were employed because pilot experiments (M1, n = 10 sessions; M2 & M3, n = 1 session) demonstrated that VTA-EM alone, at least with the parameters utilized here, was not sufficient to maintain operant behavior. Consequently, balanced juice rewards were needed to maintain task performance while unbalanced VTA-EM was used to affect cue preference. Therefore, while the VTA-EM-dependent effects on cue selection confirm VTA-EM’s reinforcing properties, comparison with rodent results suggests that VTA-EM is a stronger reinforcer in rodents than in primates. Interestingly, operant reinforcement through optogenetic stimulation (OS) of dopamine neurons in rodents can also require the concurrent use of a primary reinforcer [3] (but see [7, 32]). Because the majority of dopamine neurons phasically respond to unexpected rewards [33], the typical reinforcement/motivational signal conveyed by the VTA to downstream structures likely involves VTA-wide activity. Therefore, a plausible interpretation of these results is that some OS paradigms in rodents (due to the smaller volume of tissue affected by OS [34]) and our EM paradigm in monkeys (due to the larger volume of the VTA in primates) excite a smaller proportion of the total population of VTA neurons than rodent VTA-EM does. Excitation of this smaller population may result in a weaker reinforcement/motivational signal and reduced behavioral effects. This is corroborated by our comparison of VTA-EM- and juice-induced fMRI activity within NA, which suggests that, in primates, natural reinforcers (juice) more broadly increase activity throughout reward structures. This may also explain why lower currents were sufficient to drive the dopaminergic reward network in the fMRI experiment while higher currents were needed to reinforce behavior. In addition, an important caveat is that, unlike the cell-type-specific OS now becoming common in rodent studies, VTA-EM likely stimulates dopaminergic (~65% of the population), GABAergic (~30%) and glutamatergic (~5%) cell types with little to no selectivity. Interestingly, OS of GABAergic VTA-to-NA projections has been shown to enhance stimulus-outcome associations [35]. Therefore, the behavioral effect of VTA-EM’s indiscriminate activation of the VTA is likely a complex interaction of these different subpopulations and their targets. Despite this limitation, VTA-EM is an important first step toward a causal understanding of the effects of VTA activity on motivation, reinforcement and plasticity in the primate.

Comparison with reinforcing stimulation in monkeys

The primate brain exhibits greater dopaminergic innervation than the rodent brain. This expansion is evident in both densely (e.g., motor and premotor areas) and sparsely innervated regions (e.g., parietal and temporal areas) [8]. Furthermore, the laminar distribution of dopaminergic receptors differs between species: layer 1 receives the highest receptor density in primates, while rodents display a more varied laminar distribution that is sparse in the upper layers [8, 36]. These structural differences likely affect function and thus underscore the importance of the primate model. Nonetheless, the few studies assessing the reinforcing properties of EM in primate VTA, or in close proximity to it, were large-scale mapping studies [37–39]. In these experiments, numerous locations were stimulated to determine reinforcing sites, and the exact location of the EM sites was then determined using post-hoc, ex-vivo histology. Consequently, the behavioral experiments in these studies were conducted without precise knowledge of electrode positioning, precluding a precisely targeted study of VTA function.

In addition, previous studies attempting to investigate the reinforcing effects of EM in the VTA and neighboring structures used a simple lever-pressing task. While these studies demonstrated that EM reinforces operant behavior, they do not demonstrate the specificity of this reinforcement. For example, an important factor governing Pavlovian and operant reinforcement is the temporal contiguity between the cue or behavior and the reinforcement [40]. Because only one response was used in these earlier mapping studies, they cannot distinguish whether VTA-EM causes a nonspecific increase in motivated behavior or whether it elicits effects that are specific to a particular cue or action temporally associated with VTA-EM. In contrast, the changes in cue selection we observed were dependent on temporal contiguity, as increased cue selection was shown only for the cue whose presentation (Pavlovian) or selection (operant) was temporally coupled with VTA-EM. The specificity of this effect confirms that VTA-EM reinforcement selectively attributes motivational value to the cue temporally associated with VTA-EM. Moreover, the Pavlovian experiment demonstrates that VTA-EM can assign incentive motivation in a cue-selective way in the absence of any direct association with operant behavior.

In addition to the VTA, EM has been shown to reinforce behavior at several other sites in the primate brain, including NA, striatum, amygdala, OFC, lateral hypothalamus, mediodorsal nucleus, and locus coeruleus [38, 39, 41–43]. Interestingly, the majority of these sites contain a high density of dopamine receptors, and many of these same regions showed increased fMRI activity in response to VTA-EM and unexpected juice rewards (see Figure 4, Table S2). Moreover, systemic dopamine receptor blockade has been shown to significantly reduce EM reinforcement within OFC, hypothalamus and locus coeruleus [44]. Taken together, these findings suggest that these interconnected nodes of the dopaminergic reward network play important roles in reinforcement.

The absence over the last 40 years of any studies causally linking primate VTA activity with behavior, and the complete absence of previous work focusing specifically on VTA-EM, highlight the difficulties of chronically targeting this small but profoundly important structure in primates. Using peri-operative MRI-guided electrode implantation, we have circumvented these obstacles, allowing precise, chronic targeting of the VTA. Furthermore, we have demonstrated the critical role of the VTA in the selective reinforcement and motivation of visual cue selection. Finally, we combined fMRI with VTA-EM to demonstrate that VTA activity drives many of the nodes of the dopaminergic reward system. This work paves the way for future investigations of the relationship of increased VTA activity to reinforcement, motivation, learning, and plasticity throughout the primate brain.

Experimental Procedures

Please see Supplemental Experimental Procedures for a full description of the experimental design, methodology, and analysis.

Supplementary Material


Highlights.

  • Stimulation of VTA alters free-choice behavior in primates.

  • VTA-EM in primates reinforces cue selection in an operant conditioning paradigm.

  • Pavlovian cue-VTA-EM associations motivate future cue selection.

  • VTA-EM increases fMRI activity throughout the dopaminergic reward system.

Acknowledgments

We thank C. Klink, C. Fransen, C. Van Eupen and A. Coeman for animal training and care; W. Depuydt, G. Meulemans, P. Kayenbergh, M. De Paep, S. Verstraeten, and I. Puttemans for technical assistance; and P. Balan and S. Raiguel for their valuable insights and comments on the manuscript. This work received support from Inter-University Attraction Pole 7/21, Odysseus G.0007.12, Programme Financing PFV/10/008, Geconcerteerde Onderzoeks Actie 10/19, Impulsfinanciering Zware Apparatuur and Hercules funding of the Katholieke Universiteit Leuven, Fonds Wetenschappelijk Onderzoek–Vlaanderen G090714N, G0888.13, G062208.10, G083111.10, G0719.12 and K7148.11. JA is a post-doctoral fellow of FWO-Vlaanderen. The Martinos Center for Biomedical Imaging is supported by National Center for Research Resources grant P41RR14075.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 2010;68:815–834. doi:10.1016/j.neuron.2010.11.022.
  • 2. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi:10.1126/science.275.5306.1593.
  • 3. Adamantidis AR, Tsai HC, Boutrel B, Zhang F, Stuber GD, Budygin EA, Tourino C, Bonci A, Deisseroth K, de Lecea L. Optogenetic interrogation of dopaminergic modulation of the multiple phases of reward-seeking behavior. J Neurosci. 2011;31:10829–10835. doi:10.1523/JNEUROSCI.2246-11.2011.
  • 4. Fibiger HC, LePiane FG, Jakubovic A, Phillips AG. The role of dopamine in intracranial self-stimulation of the ventral tegmental area. J Neurosci. 1987;7:3888–3896. doi:10.1523/JNEUROSCI.07-12-03888.1987.
  • 5. Tsai HC, Zhang F, Adamantidis A, Stuber GD, Bonci A, de Lecea L, Deisseroth K. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science. 2009;324:1080–1084. doi:10.1126/science.1168878.
  • 6. Esposito RU, Porrino LJ, Seeger TF, Crane AM, Everist HD, Pert A. Changes in local cerebral glucose utilization during rewarding brain stimulation. Proc Natl Acad Sci U S A. 1984;81:635–639. doi:10.1073/pnas.81.2.635.
  • 7. Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci. 2013;16:966–973. doi:10.1038/nn.3413.
  • 8. Berger B, Gaspar P, Verney C. Dopaminergic innervation of the cerebral cortex: unexpected differences between rodents and primates. Trends Neurosci. 1991;14:21–27. doi:10.1016/0166-2236(91)90179-x.
  • 9. Roelfsema PR, van Ooyen A, Watanabe T. Perceptual learning rules based on reinforcers and attention. Trends Cogn Sci. 2010;14:64–71. doi:10.1016/j.tics.2009.11.005.
  • 10. Bondar IV, Leopold DA, Richmond BJ, Victor JD, Logothetis NK. Long-term stability of visual pattern selective responses of monkey temporal lobe neurons. PLoS ONE. 2009;4:e8222. doi:10.1371/journal.pone.0008222.
  • 11. Dayan P, Abbott LF. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. Cambridge, MA: MIT Press; 2001.
  • 12. Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006;441:876–879. doi:10.1038/nature04766.
  • 13. Wassum KM, Ostlund SB, Loewinger GC, Maidment NT. Phasic mesolimbic dopamine release tracks reward seeking during expression of Pavlovian-to-instrumental transfer. Biol Psychiatry. 2013;73:747–755. doi:10.1016/j.biopsych.2012.12.005.
  • 14. Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, Akers CA, Clinton SM, Phillips PE, Akil H. A selective role for dopamine in stimulus-reward learning. Nature. 2011;469:53–57. doi:10.1038/nature09588.
  • 15. Corbit LH, Janak PH, Balleine BW. General and outcome-specific forms of Pavlovian-instrumental transfer: the effect of shifts in motivational state and inactivation of the ventral tegmental area. Eur J Neurosci. 2007;26:3141–3149. doi:10.1111/j.1460-9568.2007.05934.x.
  • 16. Corbit LH, Balleine BW. The general and outcome-specific forms of Pavlovian-instrumental transfer are differentially mediated by the nucleus accumbens core and shell. J Neurosci. 2011;31:11786–11794. doi:10.1523/JNEUROSCI.2711-11.2011.
  • 17. Talmi D, Seymour B, Dayan P, Dolan RJ. Human Pavlovian-instrumental transfer. J Neurosci. 2008;28:360–368. doi:10.1523/JNEUROSCI.4028-07.2008.
  • 18. Liu X, Hairston J, Schrier M, Fan J. Common and distinct networks underlying reward valence and processing stages: a meta-analysis of functional neuroimaging studies. Neurosci Biobehav Rev. 2011;35:1219–1236. doi:10.1016/j.neubiorev.2010.12.012.
  • 19. Lidow MS. D1- and D2 dopaminergic receptors in the developing cerebral cortex of macaque monkey: a film autoradiographic study. Neuroscience. 1995;65:439–452. doi:10.1016/0306-4522(94)00475-k.
  • 20. Oades RD, Halliday GM. Ventral tegmental (A10) system: neurobiology. 1. Anatomy and connectivity. Brain Res. 1987;434:117–165. doi:10.1016/0165-0173(87)90011-7.
  • 21. Schonberg T, O'Doherty JP, Joel D, Inzelberg R, Segev Y, Daw ND. Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: evidence from a model-based fMRI study. NeuroImage. 2010;49:772–781. doi:10.1016/j.neuroimage.2009.08.011.
  • 22. O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. Temporal difference models and reward-related learning in the human brain. Neuron. 2003;38:329–337. doi:10.1016/s0896-6273(03)00169-7.
  • 23. Ekstrom LB, Roelfsema PR, Arsenault JT, Bonmassar G, Vanduffel W. Bottom-up dependent gating of frontal signals in early visual cortex. Science. 2008;321:414–417. doi:10.1126/science.1153276.
  • 24. McLaren DG, Kosmatka KJ, Oakes TR, Kroenke CD, Kohama SG, Matochik JA, Ingram DK, Johnson SC. A population-average MRI-based atlas collection of the rhesus macaque. NeuroImage. 2009;45:52–59. doi:10.1016/j.neuroimage.2008.10.058.
  • 25. Scott TR, Yaxley S, Sienkiewicz ZJ, Rolls ET. Gustatory responses in the frontal opercular cortex of the alert cynomolgus monkey. J Neurophysiol. 1986;56:876–890. doi:10.1152/jn.1986.56.3.876.
  • 26. Rolls ET, Scott TR, Sienkiewicz ZJ, Yaxley S. The responsiveness of neurones in the frontal opercular gustatory cortex of the macaque monkey is independent of hunger. J Physiol. 1988;397:1–12. doi:10.1113/jphysiol.1988.sp016984.
  • 27. Hosp JA, Pekanovic A, Rioult-Pedotti MS, Luft AR. Dopaminergic projections from midbrain to primary motor cortex mediate motor skill learning. J Neurosci. 2011;31:2481–2487. doi:10.1523/JNEUROSCI.5411-10.2011.
  • 28. Pleger B, Ruff CC, Blankenburg F, Kloppel S, Driver J, Dolan RJ. Influence of dopaminergically mediated reward on somatosensory decision-making. PLoS Biol. 2009;7:e1000164. doi:10.1371/journal.pbio.1000164.
  • 29. Arsenault J, Nelissen K, Jarraya B, Vanduffel W. Dopaminergic reward signals selectively decrease fMRI activity in primate visual cortex. Neuron. 2013. In press. doi:10.1016/j.neuron.2013.01.008.
  • 30. Weil RS, Furl N, Ruff CC, Symmonds M, Flandin G, Dolan RJ, Driver J, Rees G. Rewarding feedback after correct visual discriminations has both general and specific influences on visual cortex. J Neurophysiol. 2010;104:1746–1757. doi:10.1152/jn.00870.2009.
  • 31. Salamone JD, Correa M. The mysterious motivational functions of mesolimbic dopamine. Neuron. 2012;76:470–485. doi:10.1016/j.neuron.2012.10.021.
  • 32. Witten IB, Steinberg EE, Lee SY, Davidson TJ, Zalocusky KA, Brodsky M, Yizhar O, Cho SL, Gong S, Ramakrishnan C, et al. Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron. 2011;72:721–733. doi:10.1016/j.neuron.2011.10.028.
  • 33. Mirenowicz J, Schultz W. Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature. 1996;379:449–451. doi:10.1038/379449a0.
  • 34. Diester I, Kaufman MT, Mogri M, Pashaie R, Goo W, Yizhar O, Ramakrishnan C, Deisseroth K, Shenoy KV. An optogenetic toolbox designed for primates. Nat Neurosci. 2011;14:387–397. doi:10.1038/nn.2749.
  • 35. Brown MT, Tan KR, O'Connor EC, Nikonenko I, Muller D, Luscher C. Ventral tegmental area GABA projections pause accumbal cholinergic interneurons to enhance associative learning. Nature. 2012;492:452–456. doi:10.1038/nature11657.
  • 36. Lidow MS, Goldman-Rakic PS, Gallager DW, Rakic P. Distribution of dopaminergic receptors in the primate cerebral cortex: quantitative autoradiographic analysis using [3H]raclopride, [3H]spiperone and [3H]SCH23390. Neuroscience. 1991;40:657–671. doi:10.1016/0306-4522(91)90003-7.
  • 37. Plotnik R, Mir D, Delgado JMR. Map of reinforcing sites in the rhesus monkey brain. Int J Psychobiol. 1972;2:1–21.
  • 38. Briese E, Olds J. Reinforcing brain stimulation and memory in monkeys. Exp Neurol. 1964;10:493–508. doi:10.1016/0014-4886(64)90047-0.
  • 39. Routtenberg A, Gardner EL, Huang YH. Self-stimulation pathways in the monkey, Macaca mulatta. Exp Neurol. 1971;33:213–224. doi:10.1016/0014-4886(71)90115-4.
  • 40. Schultz W. Behavioral theories and the neurophysiology of reward. Annu Rev Psychol. 2006;57:87–115. doi:10.1146/annurev.psych.56.091103.070229.
  • 41. Rolls ET, Burton MJ, Mora F. Neurophysiological analysis of brain-stimulation reward in the monkey. Brain Res. 1980;194:339–357. doi:10.1016/0006-8993(80)91216-0.
  • 42. Mora F, Avrith DB, Rolls ET. An electrophysiological and behavioural study of self-stimulation in the orbitofrontal cortex of the rhesus monkey. Brain Res Bull. 1980;5:111–115. doi:10.1016/0361-9230(80)90181-1.
  • 43. Bichot NP, Heard MT, Desimone R. Stimulation of the nucleus accumbens as behavioral reward in awake behaving monkeys. J Neurosci Methods. 2011;199:265–272. doi:10.1016/j.jneumeth.2011.05.025.
  • 44. Mora F, Rolls ET, Burton MJ, Shaw GS. Effects of dopamine-receptor blockade on self-stimulation in the monkey. Pharmacol Biochem Behav. 1976;4:211–216. doi:10.1016/0091-3057(76)90018-6.
