Author manuscript; available in PMC: 2018 Jun 15.
Published in final edited form as: Biol Psychiatry. 2016 Oct 21;81(12):1041–1049. doi: 10.1016/j.biopsych.2016.10.018

Learning from one’s mistakes: A dual role for the rostromedial tegmental nucleus in the encoding and expression of punished reward seeking

Peter J Vento 1, Nathan W Burnham 2, Courtney S Rowley, Thomas C Jhou 1
PMCID: PMC5400739  NIHMSID: NIHMS834546  PMID: 27931744

Abstract

Background

Psychiatric disorders such as addiction and mania are marked by persistent reward-seeking despite highly negative or aversive outcomes, but the neural mechanisms underlying this aberrant decision-making are unknown. The recently identified rostromedial tegmental nucleus (RMTg) encodes a wide variety of aversive stimuli and sends robust inhibitory projections to midbrain dopamine neurons, leading to the hypothesis that the RMTg provides a brake to reward signaling in response to aversive costs (1, 2).

Methods

To test the role of the RMTg in punished reward seeking, adult male Sprague Dawley rats were tested in several cost-benefit decision tasks after excitotoxic lesions of the RMTg, or temporally specific optogenetic inhibition of RMTg efferents in the ventral tegmental area (VTA).

Results

RMTg lesions drastically impaired the ability of footshock to suppress operant responding for food. Optogenetic inhibition showed that this resistance to punishment was due in part to RMTg activity at the precise moment of shock delivery and mediated by projections to the VTA, consistent with an aversive “teaching signal” role for the RMTg during encoding of the aversive event. We observed a similar resistance to punishment when the RMTg was selectively inhibited immediately prior to the operant lever press, consistent with a second distinct role for the RMTg during action selection. These effects were not attributable to RMTg effects on learning rate, locomotion, shock sensitivity, or perseveration.

Conclusions

The RMTg has two strong and dissociable roles during both encoding and recall of aversive consequences of behavior.

Keywords: Aversion, decision-making, punishment, reward, rostromedial tegmental nucleus

Introduction

Mania and addiction are two major neuropsychiatric disorders characterized by resistance to punishment; i.e. persistent reward-seeking despite potentially severe negative outcomes (3). The neural circuits underlying such aberrant cost-benefit decisions have been proposed to involve cortical, limbic, and striatal networks (4–8), as well as the precise timing of dopamine release in response to rewarding or aversive experiences (9–11). In addition to these mechanisms, we posit major roles for the rostromedial tegmental nucleus (RMTg), a GABAergic midbrain structure that encodes negative reward prediction errors and sends dense inhibitory projections to midbrain dopamine neurons (2, 12–16), making it uniquely positioned to modulate reward seeking in the context of negative events. Specifically, RMTg firing rates are increased by aversive stimuli, their predictors, and reward omission, a pattern opposite to most dopamine neurons (17, 18). Because RMTg neurons are activated both by aversive stimuli and cues that predict them, we hypothesized that they play multiple distinct roles at these distinct times during learning from aversive outcomes.

Using a paradigm in rats in which lever pressing for food is punished by footshock, we inhibited the RMTg or its projections to the ventral tegmental area (VTA) to determine the RMTg role in punishment-induced suppression of reward seeking. Through use of optogenetics, we went on to inhibit the RMTg-VTA circuit at several distinct periods of the punishment task to determine the spatial and temporal pattern of RMTg involvement in maladaptive reward seeking. Because increased operant responding under punishment may be confounded by several non-specific factors including sensorimotor changes, motor disinhibition, and perseveration, we tested effects of RMTg inhibition on several additional tasks including shock escape, a two-lever discrimination task, and extinction.

Methods and Materials

Animals

Adult male Sprague Dawley rats weighing 250–300g upon delivery from vendor (Charles River Laboratories, Raleigh, NC, USA) were individually housed in standard shoebox cages with food and water provided ad libitum, unless otherwise stated. Procedures conformed to the National Institutes of Health Guide for the Care and Use of Laboratory Animals, and all protocols were approved by Medical University of South Carolina Institutional Animal Care and Use Committee.

Surgeries

Rats were anesthetized by isoflurane, their heads fixed in a stereotaxic frame, and an incision made in the scalp. Burr holes were drilled and glass pipettes or optical fibers were lowered to the desired location. Intracranial injections were made using Nanoject Auto-nanoliter Injectors (100nL/min; Drummond Scientific Company, Broomall, PA, USA). Pipettes were left in place for ≥5min to allow diffusion of injectant. Ketoprofen (5mg/kg) was given immediately after surgery to control pain and swelling. For optogenetic experiments, optical fibers were affixed to the skull using bone screws and dental cement. RMTg-lesioned (250nL of 400mM quinolinic acid; 10° angle from bregma: −7.7mm posterior, +1.5mm lateral, −7.4mm from dura) and sham-lesioned rats (250nL saline) received pentobarbital (55mg/kg) for up to three hours post-surgery to reduce discomfort associated with RMTg excitotoxicity. Subjects were given ≥5 days to recover from surgery before undergoing additional manipulations.

Optogenetic Control of Neural Activity

Adeno-associated virus (AAV) serotype 2/2 containing the gene encoding the inhibitory light-sensitive proton pump archaerhodopsin (AAV2-hSyn-eArch3.0-EYFP; Arch), or control vector (AAV-hSyn-EYFP) was obtained from the University of North Carolina Vector Core and injected bilaterally into RMTg (400nL/side; 10° angle from bregma: −7.1mm posterior, +1.9mm lateral, −7.4mm ventral to dura). Green light (532nm, Dragon Lasers; Changchun, China) was delivered using optical splitters (Precision Fiber Products, Milpitas, CA, USA) mated to 2mm diameter stainless steel ferrules (Precision Fiber Products, Milpitas, CA, USA) containing optical fibers (200µm core, Thorlabs; Newton, NJ, USA) terminating bilaterally in the VTA (10° angle from bregma: −5.7mm posterior, +2.3mm lateral, −6.7mm from dura). We allowed at least three weeks before testing commenced to permit expression of viral vectors. While we occasionally observed some lateral spread of virus to regions outside the boundaries of the RMTg, specificity of the manipulation was preserved both by delivering light to axon terminals in the VTA and by the fact that projections to the VTA are far sparser from these more lateral regions where virus expression was sometimes observed (19).

Behavioral Testing

Rats were food restricted to 85% (+/−3%) of their ad libitum body weight and trained to lever press for food (FR5; 45mg Standard Chow pellets, LabDiet, St. Louis, MO, USA). Operant behavior was conducted in standard Med Associates chambers (St. Albans, VT, USA) enclosed in sound-attenuating cabinets. Boxes possessed two retractable levers located on either side of a central food tray with cue lights above each lever and house light located on the opposite wall. A detailed explanation of specific behavioral paradigms is provided in Supplemental Information.

Histology

Rats were transcardially perfused with saline and 10% formalin. Brains were removed and placed in 10% formalin overnight before storage in 20% sucrose with PBS azide. Lesions were assessed immunohistochemically for staining of mouse anti-NeuN (1:5K, EMD Millipore, MA, USA), cocaine (10mg/kg; i.p.)-induced rabbit anti-cFos (1:20K, EMD Millipore, MA, USA), or 0.25% cresyl violet. Blinded cell counts were performed to assess RMTg lesions and extent of cell loss was calculated using baselines derived from average number of RMTg cells in sham-lesioned controls. Only subjects with ≥70% RMTg cell loss were included in behavioral analyses. A blinded experimenter assessed damage to the interpeduncular nucleus (IPN) by assigning a rank score (0–5) to approximate degree of cell loss in three consecutive brain sections. While we observed very little damage to ventrolateral structures in the vicinity of the RMTg, cell loss in the IPN ranged from 10–90% (12/16 subjects had <50% cell loss). Virus expression was verified immunohistochemically for rabbit anti-GFP (1:50K; Abcam, MA, USA).

Statistics

A two-way ANOVA was used to analyze group differences in extinction and responses to footshock, and a MANOVA was used to analyze lever presses. Student’s t-tests and one-way ANOVA were used to assess differences in group means. Post-hoc testing was performed using one-way ANOVA or Student’s t-tests, and Bonferroni correction was used to adjust for multiple comparisons. For optogenetic experiments, p-values were derived using Student’s t-distributions on z-transformed ratios of the various light conditions. Groups were considered significantly different at p<0.05.

Results

RMTg-lesioned rats are resistant to the suppressive effect of punishment on food seeking

To test the RMTg role in punished reward seeking, we assessed RMTg- and sham-lesioned rats for the suppressive effect of footshock using a novel “progressive shock” task in which lever pressing for food reward was immediately followed by brief footshock of an intensity that increased after every 3 reinforcers obtained (Figure 1B). Relative to controls, RMTg-lesioned rats required nearly 2.5X more shock to suppress responding (“shock breakpoint”) (1.91 ± 0.42 versus 0.78 ± 0.05 mA, t(18)=2.968, p=0.008; n=9, 11; Figure 1C). While lever pressing during (unpunished) training sessions was nearly identical between sham and lesioned groups (Figure S1), a two-way ANOVA on lever presses during testing under punishment showed lesioned rats emitted more lever presses, reflective of their reaching higher shock intensities before suppressing responding (F1,18=11.735, p=0.003; Figure 1D). Both groups pressed the active lever more than the inactive lever (F1,18=302.676, p<0.001), suggesting that rats consistently differentiated between rewarded and non-rewarded responses. Notably, there was a significant Condition × Lever interaction (F1,18=9.284, p=0.007), and post-hoc testing confirmed lesioned rats pressed the active lever more than shams (t(18)=3.264, p=0.03), consistent with the greater shock breakpoint observed in the lesioned group. We found no difference in mean inactive lever presses (t(18)=1.54, p=0.079), indicating that persistent responding by RMTg-lesioned rats is unlikely to have arisen from non-specific motor disinhibition.
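As a rough consistency check on the shock breakpoint comparison above, the reported t statistic can be approximately recovered from the summary values alone. The sketch below is illustrative only: it assumes that the ± values are standard errors of the mean, that n=9 refers to the lesioned group and n=11 to the sham group, and that an equal-variance (pooled) Student's t-test was used. None of these is stated explicitly in this passage, and the function name is hypothetical.

```python
import math

def pooled_t_from_sem(mean1, sem1, n1, mean2, sem2, n2):
    """Equal-variance Student's t computed from group means, SEMs and sizes."""
    var1 = (sem1 ** 2) * n1  # convert SEM back to sample variance
    var2 = (sem2 ** 2) * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Assumed mapping: lesioned = 1.91 +/- 0.42 mA (n=9), sham = 0.78 +/- 0.05 mA (n=11)
t_stat, df = pooled_t_from_sem(1.91, 0.42, 9, 0.78, 0.05, 11)
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions the result is close to the reported t(18)=2.968; the small residual difference would be expected from rounding of the published means and errors.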

Figure 1.

RMTg lesions lead to impairments in the suppressive effect of footshock on food seeking. (A) Representative images of RMTg and sham lesions. (B) A timeline of the progressive shock task. (C) Rats with excitotoxic lesions of the RMTg endure significantly greater shock intensities to receive food reward (n=9, 11) and (D) emit significantly more responses on the active lever than sham-lesioned controls. No difference in inactive lever presses was found between groups. *p<0.05

The results of the progressive shock task suggest that the RMTg is required to inhibit reward seeking punished by an aversive outcome, but greater shock endured by RMTg-lesioned rats could also be explained by a slower learning rate. In such a case, we would expect that allowing greater exposure to each shock intensity would permit lesioned rats more time to adjust to increasing shock intensities, leading to more effective response suppression. Hence, we repeated the experiment described above, but with each shock intensity used for all trials throughout an entire 10-min session, rather than intensity changing within session (Figure 2A). Not only did slowing of the task not improve suppression in lesioned animals, but shock breakpoint was significantly higher in the extended task versus the more condensed progressive shock task described above (t(8)=6.053, p<0.001). In the extended task, RMTg-lesioned rats required more than 3-fold greater shock to suppress responding relative to controls (5.11 ± 0.79 versus 1.64 ± 0.87 mA, t(18)=4.530, p<0.001; n=9, 11; Figure 2B). Furthermore, shock breakpoint was negatively correlated with the number of cells remaining in the RMTg (p=0.014; n=15; Figure 2C). Because the RMTg is adjacent to the IPN which has also been implicated in motivated behavior (20, 21), we examined IPN cell loss and found that in rats with RMTg lesions (>70% cell loss), there was no correlation of shock breakpoint with cell loss in the IPN (r=0.039, p=0.313; n=9).

Figure 2.

RMTg-lesioned rats remain resistant to the inhibitory effect of footshock on food seeking even when given greater exposure to each shock intensity. (A) A timeline of the extended progressive shock task. (B) RMTg-lesioned rats show a significant increase in the minimum shock intensity required to suppress food seeking by 85% of baseline responding (n=9, 11). (C) Shock breakpoints were negatively correlated with counts of RMTg cells remaining after excitotoxin or saline injections. (D) Analysis of binned data showed that significantly greater shock was required to suppress food seeking in RMTg-lesioned rats. *p<0.05

A repeated measures two-way ANOVA on Lesion Condition versus Shock Intensity showed RMTg-lesioned rats obtained substantially more reinforcers than shams in the extended progressive shock task (F1,18=31.759, p<0.001, n=9, 11, Figure 2D). In general, shock reduced the number of reinforcers obtained (F15, 4=175.027, p<0.001), although effects of RMTg lesion on any particular shock intensity did not reach significance (Shock × Lesion interaction, F15, 4=4.892, p=0.068).

RMTg lesions do not impair sensorimotor responses to footshock

While the previous experiment indicated that RMTg lesion effects are not explained by general delays in learning, several other alternative explanations remained. For example, increased shock breakpoint in RMTg-lesioned rats could arise from: (1) impaired shock sensation, (2) generalized locomotor disinhibition, or (3) perseveration. To test whether the increased shock breakpoint could be explained by impaired shock perception, RMTg (n=6)- and sham (n=6)-lesioned rats were evaluated for latency to escape a chamber with an electrified floor (0mA, 0.5mA, or 0.7mA) to an adjacent unshocked chamber. RMTg-lesioned rats were slightly faster to escape at baseline (0mA; Figure S2); hence, each rat’s data was normalized to its own baseline. A two-way ANOVA showed normalized escape latencies declined as shock intensity increased (F2,18=5.555, p=0.013), but there was no effect of RMTg lesion (F1,9=0.329, p=0.58), nor was there a significant Shock × Lesion interaction (F2,18=0.499, p=0.616; Figure 3A).
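The per-animal baseline normalization described above might be sketched as follows. This is a minimal illustration under the assumption that normalization means dividing each latency by the same rat's 0mA latency; the text does not specify whether a ratio or a difference was used. The function name and the latency values are hypothetical and are not data from the study.

```python
def normalize_to_baseline(latency_by_shock_ma):
    """Express each escape latency as a fraction of the rat's own 0 mA latency."""
    baseline = latency_by_shock_ma[0.0]
    if baseline <= 0:
        raise ValueError("baseline latency must be positive")
    return {ma: latency / baseline for ma, latency in latency_by_shock_ma.items()}

# Made-up latencies in seconds for a single rat (illustration only)
example_rat = {0.0: 20.0, 0.5: 10.0, 0.7: 5.0}
normalized = normalize_to_baseline(example_rat)
print(normalized)
```

Normalizing in this way removes between-animal differences in baseline activity, so that group comparisons reflect the change produced by shock rather than how quickly a rat crosses chambers in general.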

Figure 3.

RMTg lesions produce hyperactivity without impairing sensorimotor responses to footshock. (A) RMTg- and sham-lesioned rats (n=6) were placed in a two-chamber shuttle box and tested for latency to escape a shock-paired chamber to an adjacent unshocked chamber. To account for differences in baseline activity (Figure S2), each rat’s data was normalized to its own baseline escape latency (0mA). Increasing shock intensities caused all rats to escape to the unshocked chamber faster; notably, RMTg lesioned rats were not impaired in shock escape, and indeed showed a non-significant trend toward faster escapes. (B) Over a 20-min test session in a novel operant chamber (n=4, 8), RMTg-lesioned rats were significantly more active, and (C) this lesion-induced hyperactivity persisted over the course of the entire test session. AU: arbitrary units; *p<0.05

Increased responding under punishment by RMTg-lesioned rats is not due to non-specific motor disinhibition

Given that the progressive shock task involves a choice between emitting or not emitting an action, one could argue that increased breakpoints result from locomotor hyperactivity. Indeed, RMTg-lesioned rats were more active during a 20-min locomotor test, an effect that was significant both over the entire 20-min session (p<0.001; n=4, 8; Figure 3B), and when analyzed in 5-minute bins (main effect of Lesion (F1,10=46.18; p<0.001); Time (F3,30=7.711, p=0.001); Lesion × Bin interaction (F3,30=2.674, p=0.065); Figure 3C). Despite this hyperactivity, however, lesioned rats never showed increased inactive lever pressing (Figure 1D), although generally low inactive presses reduced statistical power to detect such a change. To more clearly dissociate locomotor effects from impaired decision-making, we developed a two-lever task in which a “high-cost” lever yielded a large 3-pellet reward (and footshock), while a “low-cost” lever yielded a small 1-pellet reward that was never accompanied by shock. In this way, motor disinhibition, which would increase overall pressing, can be dissociated from alterations in choice behavior, which would influence the ratio of high- to low-cost choices.
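The logic of this dissociation can be illustrated with a short sketch. It assumes that motor output is summarized by total presses and choice by the fraction of presses on the high-cost lever; the function name and press counts are hypothetical and are not taken from the study.

```python
def summarize_choices(high_cost_presses, low_cost_presses):
    """Return (total presses, fraction of presses on the high-cost lever)."""
    total = high_cost_presses + low_cost_presses
    if total == 0:
        raise ValueError("no presses recorded")
    return total, high_cost_presses / total

# Made-up counts illustrating the two possible outcomes
baseline = summarize_choices(30, 10)      # reference animal
disinhibited = summarize_choices(60, 20)  # more pressing, same preference
shifted = summarize_choices(10, 30)       # same pressing, changed preference

for label, (total, pref) in [("baseline", baseline),
                             ("disinhibited", disinhibited),
                             ("shifted", shifted)]:
    print(f"{label}: total={total}, high-cost preference={pref:.2f}")
```

In this toy example, pure motor disinhibition doubles total presses while leaving preference unchanged, whereas a change in decision-making alters preference while leaving total presses unchanged.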

Under conditions of little-to-no cost, all rats rapidly acquired equally strong preference for the large reward (t(11)=−1.951, p=0.077; n=6, 7; Figure 4A); however, when delivery of the large reward was followed by footshock, sham-lesioned rats rapidly shifted preference to the low-cost alternative while RMTg-lesioned rats continued to prefer the large punished reward. Specifically, lesions caused a significant and dramatic increase in shock “switchpoint”, or maximum shock intensity required to reduce large-reward preference by 85% (t(11)=−2.685, p=0.021; Figure 4B). The 85% reduction threshold was used to approximate the rate of suppression in the one-lever progressive shock tasks described above. Postmortem assessment of lesion-induced cell loss showed that shock switchpoint was negatively correlated with number of cells remaining in RMTg (p=0.029; n=8; Figure 4C), while damage to the adjacent IPN was not correlated with shock switchpoint (p=0.462; n=6; Figure 4D).
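One way to compute a switchpoint of this kind is sketched below. It assumes the switchpoint is the lowest tested shock intensity at which large-reward preference has fallen by at least 85% relative to the unpunished baseline; the exact procedure is given in the Supplemental Information and may differ. The function name and all values are hypothetical.

```python
def shock_switchpoint(baseline_pref, pref_by_intensity_ma, reduction=0.85):
    """Lowest intensity (mA) at which preference is reduced by `reduction`.

    Returns None if the criterion is never met at the tested intensities.
    """
    threshold = baseline_pref * (1.0 - reduction)
    for intensity in sorted(pref_by_intensity_ma):
        if pref_by_intensity_ma[intensity] <= threshold:
            return intensity
    return None

# Made-up preference curves (fraction of large-reward choices)
sham_like = {0.2: 0.85, 0.4: 0.50, 0.6: 0.10, 0.8: 0.05}
lesion_like = {0.2: 0.90, 0.4: 0.88, 0.6: 0.80, 0.8: 0.70}

sham_switch = shock_switchpoint(0.9, sham_like)
lesion_switch = shock_switchpoint(0.9, lesion_like)
print(sham_switch, lesion_switch)
```

In this toy example the sham-like curve reaches criterion at 0.6mA, while the lesion-like curve never does within the tested range, which is the qualitative pattern the switchpoint measure is meant to capture.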

Figure 4.

In a two-lever choice task, RMTg-lesioned rats persist in responding for a large punished reward while sham-lesioned controls switch preference to a smaller unpunished alternative. (A) Under conditions of little-to-no cost, lesion and sham groups (n=6, 7) displayed an equally strong preference for the large reward. (B) RMTg-lesioned rats required much greater shock intensities to reduce preference for the large (punished) reward by 85% of their baseline preference (shock switchpoint). (C) Counts of cells remaining after RMTg lesions were negatively correlated with shock-induced switchpoint (n=8). (D) In RMTg-lesioned rats, shock switchpoint was not correlated with extent of cell loss in the adjacent interpeduncular nucleus (IPN; n=6). *p<0.05

RMTg-lesioned rats retain the ability to respond flexibly to receive the larger of two rewards

Because RMTg-lesioned rats persist in choosing a large, punished reward rather than shifting response preference to a safer, unpunished alternative, we tested rats on a random alternation task to determine whether their choice behavior resulted from general inflexibility in decision-making. The same rats (n=6, 7) underwent additional training to discriminate between two levers yielding either a large or small reward (both unpunished). Importantly, the location of the large and small rewards was randomly assigned to the two levers at the beginning of each 20-trial block (with 4 blocks per session).

A repeated measures two-way ANOVA found no effect of RMTg lesions on ability to track the large reward location over 6 consecutive testing sessions (Condition: F1,11=2.0444, p=0.181; Session: F1,11=3.627, p=0.083, Session × Condition: F1,11=0.89, p=0.366; Figure 5A). Similarly, when percent preference was averaged across all 6 test sessions, we did not detect a difference in average large reward preference (t(11)=−1.43, p=0.181; Figure 5B).

Figure 5.

The ability to respond flexibly to changing reward contingencies was not affected by RMTg lesions. RMTg- and sham-lesioned rats (n=6, 7) were trained to discriminate between two levers that yielded either a large or a small reward in the absence of punishment, but the location of the two reward options was randomly selected across 4 blocks of 20 trials each (each block consisting of 12 forced and 8 free choice trials). (A) There was no difference in mean preference for the large reward across any of the 6 test sessions, (B) nor was there any difference in large reward preference when averaged across all test sessions.

RMTg-lesioned rats extinguish normally

Finally, we tested whether higher shock breakpoints could result from increased perseverative behavior by assessing lesioned rats’ ability to extinguish operant responding when reinforcers were omitted. We again found no effect of lesions on the ability to acquire operant pressing for food (Figure S3), nor did we uncover an effect of RMTg lesions on extinction of lever pressing, indicating that lesioned rats can indeed suppress responding when rewards are absent, and they do so to a strikingly similar degree as intact rats (n=8, 10). Specifically, analysis of total presses during the first extinction session failed to detect group differences in lever presses (Lever: F1,16=77.48, p<0.001; Condition: F1,16=0.383, p=0.545; Lever × Condition: F1,16=0.867, p=0.366; Figure 6A). There also were no group differences in binned lever presses (Lesion: F1,16=0.383, p=0.545; Figure 6B). As expected, there was, however, a significant effect of Lever (F1,16=77.48, p<0.001), Time (F11,6=8.404, p<0.01), and a Lever × Time interaction (F11,6=12.471, p<0.01), indicating rats pressed the active lever more and lever pressing decreased over the course of the session. Notably, we saw no Time × Condition (F11,6=1.367, p= 0.365), Lever × Condition (F1,16=0.867, p=0.366), nor Time × Condition × Lever interaction (F11,6=2.165, p=0.177), indicating both RMTg- and Sham-lesioned rats extinguished lever pressing equally.

Figure 6.

No differences in extinction learning were observed between RMTg- and sham-lesioned rats. During the first extinction test in which the reward was omitted, we found no difference in (A) total lever presses, or (B) binned presses. (C) Analysis of total presses across 3 consecutive extinction sessions again found no differences between lesioned and sham rats, although all subjects consistently discriminated between active and inactive levers, and both groups reduced active lever pressing upon repeated testing (n=8, 10).

It is widely noted that extinguished behavior may spontaneously recover. RMTg-lesioned rats, however, did not show enhanced spontaneous recovery over 2 additional extinction sessions. Analysis of all 3 sessions versus baseline responding showed a main effect of Session (F3,12=120.046, p<0.001), with pressing decreasing over repeated testing, and a significant effect of Lever (F1,14=275.573, p<0.001; Figure 6C), indicating more active than inactive presses across test sessions. Notably, we did not detect a main effect of Lesion (F1,14=0.117, p=0.737), nor any significant interactions between lesion and any other factor (p>0.05), indicating that even though RMTg-lesioned rats are hyperactive (Figure 3B & C), they were able to effectively inhibit pressing in the absence of the reward despite the fact that they failed to inhibit responding under punishment.

RMTg lesions do not significantly affect motivation to obtain food

It could be argued that RMTg-lesioned rats find the food more rewarding, rather than the shock less punishing. The reward alternation experiment described above did not find a significant increase in preference for the large reward, although lesioned rats displayed slightly greater preference for the large reward in later testing sessions (Figure 5A). To more directly test whether RMTg lesions change reward evaluation, we tested rats in a standard progressive ratio task (22). We found no difference, however, between RMTg- and sham-lesioned rats (n=10, 15) in maximum ratio completed averaged across 7 consecutive testing sessions (t(23)=1.629, p=0.117; Figure 7A), nor in the number of ratios, or “steps,” completed (t(23)=1.638, p=0.115; Figure 7B), nor mean active or inactive lever presses (Lesion: F1,23=2.811, p=0.107; Lever: F1,23=57.423, p<0.001; Condition × Lever: F1,23=2.606, p=0.12; Figure 7C).

Figure 7.

RMTg lesions did not significantly alter progressive ratio breakpoint averaged across 7 consecutive test sessions. We found no effect of RMTg lesion on (A) max ratio completed, (B) number of ratio steps completed, or (C) mean lever presses emitted (n=10, 15).

Temporally- and spatially-restricted inhibition of RMTg efferents to the VTA causes punishment resistance

Encoding of punishment by the RMTg is likely mediated by changes in DA (14), and a long history of research indicates a diverse role for DA neurons (23, 24). Specifically, DA neurons encode reward signals coinciding with reinforcer delivery, but they also regulate action-selection and impulse control (11). Our RMTg-lesioned rats exhibit profound punishment resistance, but lesion studies do not indicate when RMTg activity is required for avoiding and learning from negative outcomes, nor which of the RMTg’s numerous targets (VTA, substantia nigra, raphe nucleus, etc.) (25) are involved. Accordingly, rats expressing either the inhibitory opsin Arch or control vector were tested in the progressive shock task described above while simultaneously undergoing optical inhibition of RMTg axon terminals in the VTA during either presentation of footshock, immediately before/after the footshock, or during the moment of action selection (for a timeline see Figure 8).

Figure 8.

A schematic illustrating the 3 distinct time points during which optical inhibition of RMTg axons in the VTA occurred.

To test RMTg-VTA interaction in encoding of punishment, optical inhibition of RMTg terminals was restricted to overlap either with shock presentation (“synchronized” condition) or immediately before/after footshock (“desynchronized” condition). Notably, we found no difference in mean shock breakpoint between Arch-expressing or control vector-expressing rats when light was presented in the desynchronized condition (Viral Vector: F1,16= 1.28; p= 0.273; Light Condition: F1,16= 3.246; p= 0.09; Viral Vector × Light Condition interaction: F1,16= 3.69; p= 0.073; n= 9; Figure 9C). Therefore, we used each animal’s performance in the desynchronized condition as its own baseline, allowing us to perform a within-subjects analysis of the ratio of shock breakpoint in the synchronized vs desynchronized conditions. Inhibition of RMTg terminals caused a 48% increase in shock breakpoint ratios when light specifically overlapped with the brief delivery of footshock (p=0.007; n=9), while there was no effect of light in rats expressing control vector (p=0.735, n=9; Figure 9C). The RMTg role in punishment resistance, therefore, appears highly localized in both time and space; i.e. RMTg is particularly necessary at the instant of shock delivery (and not immediately before/after), and depends on projections to the VTA.
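The within-subject ratio described above can be sketched as follows. This is only an illustration of the ratio step: each animal's synchronized breakpoint is divided by its own desynchronized breakpoint, and the mean ratio is expressed as a percent change. It does not reproduce the z-transformation or the t-distribution test mentioned in the Statistics section, whose details are not given here. The function names and breakpoint values are hypothetical.

```python
from statistics import mean

def breakpoint_ratios(synchronized_ma, desynchronized_ma):
    """Per-animal ratio of synchronized to desynchronized shock breakpoint."""
    if len(synchronized_ma) != len(desynchronized_ma):
        raise ValueError("each animal needs a value in both conditions")
    return [s / d for s, d in zip(synchronized_ma, desynchronized_ma)]

def mean_percent_change(ratios):
    """Mean ratio expressed as percent change from the baseline condition."""
    return (mean(ratios) - 1.0) * 100.0

# Made-up breakpoints (mA) for three animals, listed in the same order
sync = [3.0, 2.5, 1.5]
desync = [2.0, 2.0, 1.0]

ratios = breakpoint_ratios(sync, desync)
change = mean_percent_change(ratios)
print(ratios, round(change, 1))
```

Using each animal as its own baseline in this way removes between-animal variability in absolute shock tolerance from the comparison of light conditions.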

Figure 9.

RMTg signaling to the ventral tegmental area (VTA) is essential for both the encoding of punishment and the decision to engage in the punished response. Rats expressing the inhibitory opsin Arch (AAV2-hSyn-eArch3.0-EYFP) or control vector (AAV2-hSyn-EYFP) were tested in the progressive shock task while simultaneously undergoing inhibition of RMTg axons in the VTA at discrete time points. (A) Representative images of RMTg virus expression and optical fiber placement in the VTA. (B) A timeline of the experiment. (C) Actual shock breakpoints between vector conditions during desynchronized/synchronized light presentation. (D) Optical inhibition of RMTg terminals in the VTA coinciding with footshock (synchronized) caused a significant increase in shock breakpoint relative to sessions in which light was delivered before/after footshock (desynchronized)(n=9). Similarly, (E) optical inhibition at the time leading up to the food-seeking response (decision phase) significantly increased shock breakpoint relative to sessions in which no light was delivered. Notably, we found no effect of light administration in rats expressing control vectors (n=5, 6). *p<0.05

A separate group of rats (n=5, 6) was tested while RMTg terminals in the VTA were inhibited specifically during the “decision” phase of the task, or the window at the start of each trial when rats were confronted with the choice to continue lever pressing despite impending punishment or instead withhold responding. As was found when RMTg inhibition was restricted to the “shock” phase, inhibition of RMTg terminals significantly increased the ratio of shock breakpoints in sessions when light overlapped the “decision” phase versus sessions when no light was administered (p=0.005; Figure 9D). Additional testing in rats expressing control vector (lacking opsins) showed no effect of light administration alone (p=0.898).

Discussion

Persistent seeking of rewards despite severe costs is a defining feature of both addiction and mania (3, 26), and the present study identifies the RMTg and its projections to the VTA as critical regulators of this behavior. We further identified two distinct roles for the RMTg: one role at the time of shock delivery, when the RMTg provides an aversive “teaching signal,” and a second role during the decision to engage in a punished response, when the RMTg appears to be involved in impulse-inhibition.

These dual roles for the RMTg are inversely analogous to the multiple roles known to be played by dopamine neurons in both reward signaling and action selection. Numerous studies have found that phasic firing of midbrain dopamine neurons encodes reward prediction errors, i.e. the difference between actual and expected rewards (17, 18), and drives reinforcement learning (27), consistent with a role for dopamine neurons in providing an appetitive teaching signal to downstream structures. This role is inverse to the aversive teaching signal we posit for the RMTg in the current study, as well as in our earlier work on conditioned avoidance responses to cocaine (28). It is likely, however, that dopamine performs different functions at different times, with dopamine also playing important roles in driving the motivated behavior that occurs before reinforcer delivery (11, 23, 29–32). For example, phasic dopamine release is not only able to drive reinforcement, as noted above, but also drives action selection (33) in a manner that is again inversely analogous to our proposed role for the RMTg in impulse inhibition.

Although several of our hypotheses were confirmed, a number of results were initially surprising. Because RMTg neurons display phasic increases in firing in response to reward omission (12, 34) we anticipated a possible role for the RMTg in withholding responding during extinction, but we saw no such impairment. This finding suggests that either the RMTg is not involved in this particular kind of learning from omission, or that its role can be replaced by compensatory recruitment of additional brain structures after lesions have occurred. Further investigation of the RMTg role in extinction using reversible inactivation methods will be needed to distinguish between these possibilities.

We also noted several subthreshold effects of RMTg manipulation that may provide clues to additional mechanisms of RMTg action. For example, RMTg-lesioned rats display a trend towards increased progressive ratio breakpoint, perhaps indicating slightly increased reward value for food even without punishment (or conversely, slightly reduced cost of effort). In the two-lever alternation task, rats also showed a trend toward more effectively tracking the larger magnitude reward when its location shifts. Although these findings did not reach statistical significance, when paired with the locomotor-activating effects of RMTg lesions they raise the possibility that RMTg manipulations may modulate additional aspects of reward and motivation, albeit to much smaller degrees than effects on punishment. Indeed, prolonged inhibition of the RMTg via optogenetics, GABA, or mu opioid receptor agonist does appear to be reinforcing (35; Vento and Jhou, unpublished findings), although the present study found no evidence for reinforcing effects of brief optical inhibition of RMTg terminals occurring before/after footshock (Figure 9C).

Differentiating the functional roles of distinct VTA inputs will be critical to clarifying the diverse signals relayed by dopamine neurons, which in turn influence multiple downstream structures, including the prefrontal cortex, amygdala, habenula, and striatum, all of which play roles in mediating aversive responses (36–39) and are in turn connected monosynaptically with the RMTg (13, 15). While the present investigation focused on evaluating connections between the RMTg and VTA, future experiments will be critical to delineate the broader neurocircuitry that mediates punishment and how this relates to selective activation of cortico-striatal networks involved in decision-making and motivated behavior.

Although the current study focused on food seeking, drug seeking in both humans and rats can also become resistant to punishment (26, 40–42). While future studies are needed to elucidate whether the RMTg is also involved in drug seeking under punishment, this hypothesis is supported by several lines of evidence: RMTg activity is directly modulated by several drugs of abuse (14, 19, 43, 44), and RMTg activation is essential for driving conditioned aversive responses (avoidance) to cocaine (28). Specifically, optogenetic inhibition of the RMTg at the exact period when cocaine exhibits aversive effects (during acute withdrawal from its rewarding phase) causes persistent cocaine seeking in rats (28), a phenomenon strikingly similar to our current finding of punishment resistance when the RMTg is inactivated at the precise time of shock delivery. The extent to which neuroadaptations in the RMTg contribute to cocaine-induced punishment resistance, however, remains an open question.

The ability to learn from the negative consequences of one’s behavior and properly shift decision-making away from aversive outcomes is essential to survival, and impairment in this decision-making process is a shared characteristic of several neuropsychiatric diseases. Uncovering the neural regulation of aversive decision-making may therefore hold the key to more effective treatment options. The current study shows that the RMTg serves an essential role in encoding multiple components of the decision-making process that regulates reward seeking under punishment, and that blocking RMTg activity causes maladaptive choices that lead to negative outcomes.

Supplementary Material

Acknowledgments

The authors would like to thank Dr. Garret Stuber for methodological insights related to our optogenetic experiments, as well as Jennifer Hergatt, Dominicka Pullmann, and Sara Dunn for technical support associated with these studies. Support for this work was provided by T32 DA007288 awarded to P.V. and R21DA032898 & 1R01DA037327 awarded to T.J.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Financial Disclosures

The authors report no biomedical financial interests or potential conflicts of interest.

References

  • 1. Barrot M, Sesack SR, Georges F, Pistis M, Hong S, Jhou TC. Braking dopamine systems: a new GABA master structure for mesolimbic and nigrostriatal functions. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2012;32:14094–14101. doi: 10.1523/JNEUROSCI.3370-12.2012.
  • 2. Bourdy R, Sanchez-Catalan MJ, Kaufling J, Balcita-Pedicino JJ, Freund-Mercier MJ, Veinante P, et al. Control of the Nigrostriatal Dopamine Neuron Activity and Motor Function by the Tail of the Ventral Tegmental Area. Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology. 2014 doi: 10.1038/npp.2014.129.
  • 3. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Arlington, VA: American Psychiatric Publishing; 2013.
  • 4. Stopper CM, Green EB, Floresco SB. Selective involvement by the medial orbitofrontal cortex in biasing risky, but not impulsive, choice. Cerebral cortex. 2014;24:154–162. doi: 10.1093/cercor/bhs297.
  • 5. St Onge JR, Stopper CM, Zahm DS, Floresco SB. Separate prefrontal-subcortical circuits mediate different components of risk-based decision making. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2012;32:2886–2899. doi: 10.1523/JNEUROSCI.5625-11.2012.
  • 6. St Onge JR, Floresco SB. Prefrontal cortical contribution to risk-based decision making. Cerebral cortex. 2010;20:1816–1828. doi: 10.1093/cercor/bhp250.
  • 7. Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MF. Separate neural pathways process different decision costs. Nature neuroscience. 2006;9:1161–1168. doi: 10.1038/nn1756.
  • 8. Churchwell JC, Morris AM, Heurtelou NM, Kesner RP. Interactions between the prefrontal cortex and amygdala during delay discounting and reversal. Behavioral neuroscience. 2009;123:1185–1196. doi: 10.1037/a0017734.
  • 9. Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 2010;68:815–834. doi: 10.1016/j.neuron.2010.11.022.
  • 10. Stopper CM, Floresco SB. Dopaminergic circuitry and risk/reward decision making: implications for schizophrenia. Schizophrenia bulletin. 2015;41:9–14. doi: 10.1093/schbul/sbu165.
  • 11. Schultz W. Multiple dopamine functions at different time courses. Annual review of neuroscience. 2007;30:259–288. doi: 10.1146/annurev.neuro.28.061604.135722.
  • 12. Jhou TC, Fields HL, Baxter MG, Saper CB, Holland PC. The rostromedial tegmental nucleus (RMTg), a GABAergic afferent to midbrain dopamine neurons, encodes aversive stimuli and inhibits motor responses. Neuron. 2009;61:786–800. doi: 10.1016/j.neuron.2009.02.001.
  • 13. Jhou TC, Geisler S, Marinelli M, Degarmo BA, Zahm DS. The mesopontine rostromedial tegmental nucleus: A structure targeted by the lateral habenula that projects to the ventral tegmental area of Tsai and substantia nigra compacta. The Journal of comparative neurology. 2009;513:566–596. doi: 10.1002/cne.21891.
  • 14. Lecca S, Melis M, Luchicchi A, Muntoni AL, Pistis M. Inhibitory inputs from rostromedial tegmental neurons regulate spontaneous activity of midbrain dopamine cells and their responses to drugs of abuse. Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology. 2012;37:1164–1176. doi: 10.1038/npp.2011.302.
  • 15. Kaufling J, Veinante P, Pawlowski SA, Freund-Mercier MJ, Barrot M. Afferents to the GABAergic tail of the ventral tegmental area in the rat. The Journal of comparative neurology. 2009;513:597–621. doi: 10.1002/cne.21983.
  • 16. Balcita-Pedicino JJ, Omelchenko N, Bell R, Sesack SR. The inhibitory influence of the lateral habenula on midbrain dopamine cells: ultrastructural evidence for indirect mediation via the rostromedial mesopontine tegmental nucleus. The Journal of comparative neurology. 2011;519:1143–1164. doi: 10.1002/cne.22561.
  • 17. Ungless MA, Magill PJ, Bolam JP. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science. 2004;303:2040–2042. doi: 10.1126/science.1093360.
  • 18. Schultz W. Predictive reward signal of dopamine neurons. Journal of neurophysiology. 1998;80:1–27. doi: 10.1152/jn.1998.80.1.1.
  • 19. Geisler S, Marinelli M, Degarmo B, Becker ML, Freiman AJ, Beales M, et al. Prominent activation of brainstem and pallidal afferents of the ventral tegmental area by cocaine. Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology. 2008;33:2688–2700. doi: 10.1038/sj.npp.1301650.
  • 20. Fowler CD, Lu Q, Johnson PM, Marks MJ, Kenny PJ. Habenular alpha5 nicotinic receptor subunit signalling controls nicotine intake. Nature. 2011;471:597–601. doi: 10.1038/nature09797.
  • 21. Hsu YW, Wang SD, Wang S, Morton G, Zariwala HA, de la Iglesia HO, et al. Role of the dorsal medial habenula in the regulation of voluntary activity, motor function, hedonic state, and primary reinforcement. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2014;34:11366–11384. doi: 10.1523/JNEUROSCI.1861-14.2014.
  • 22. Richardson NR, Roberts DC. Progressive ratio schedules in drug self-administration studies in rats: a method to evaluate reinforcing efficacy. Journal of neuroscience methods. 1996;66:1–11. doi: 10.1016/0165-0270(95)00153-0.
  • 23. McClure SM, Daw ND, Montague PR. A computational substrate for incentive salience. Trends in neurosciences. 2003;26:423–428. doi: 10.1016/s0166-2236(03)00177-2.
  • 24. Robinson TE, Berridge KC. The neural basis of drug craving: an incentive-sensitization theory of addiction. Brain research Brain research reviews. 1993;18:247–291. doi: 10.1016/0165-0173(93)90013-p.
  • 25. Lavezzi HN, Zahm DS. The mesopontine rostromedial tegmental nucleus: an integrative modulator of the reward system. Basal ganglia. 2011;1:191–200. doi: 10.1016/j.baga.2011.08.003.
  • 26. Goldstein RZ, Volkow ND. Dysfunction of the prefrontal cortex in addiction: neuroimaging findings and clinical implications. Nature reviews Neuroscience. 2011;12:652–669. doi: 10.1038/nrn3119.
  • 27. Tsai HC, Zhang F, Adamantidis A, Stuber GD, Bonci A, de Lecea L, et al. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science. 2009;324:1080–1084. doi: 10.1126/science.1168878.
  • 28. Jhou TC, Good CH, Rowley CS, Xu SP, Wang H, Burnham NW, et al. Cocaine drives aversive conditioning via delayed activation of dopamine-responsive habenular and midbrain pathways. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2013;33:7501–7512. doi: 10.1523/JNEUROSCI.3634-12.2013.
  • 29. Cohen MX, Frank MJ. Neurocomputational models of basal ganglia function in learning, memory and choice. Behavioural brain research. 2009;199:141–156. doi: 10.1016/j.bbr.2008.09.029.
  • 30. Wickens JR, Reynolds JN, Hyland BI. Neural mechanisms of reward-related motor learning. Current opinion in neurobiology. 2003;13:685–690. doi: 10.1016/j.conb.2003.10.013.
  • 31. Adamantidis AR, Tsai HC, Boutrel B, Zhang F, Stuber GD, Budygin EA, et al. Optogenetic interrogation of dopaminergic modulation of the multiple phases of reward-seeking behavior. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2011;31:10829–10835. doi: 10.1523/JNEUROSCI.2246-11.2011.
  • 32. Ettenberg A. The runway model of drug self-administration. Pharmacology, biochemistry, and behavior. 2009;91:271–277. doi: 10.1016/j.pbb.2008.11.003.
  • 33. Phillips PE, Stuber GD, Heien ML, Wightman RM, Carelli RM. Subsecond dopamine release promotes cocaine seeking. Nature. 2003;422:614–618. doi: 10.1038/nature01476.
  • 34. Hong S, Jhou TC, Smith M, Saleem KS, Hikosaka O. Negative reward signals from the lateral habenula to dopamine neurons are mediated by rostromedial tegmental nucleus in primates. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2011;31:11457–11471. doi: 10.1523/JNEUROSCI.1384-11.2011.
  • 35. Jhou TC, Xu SP, Lee MR, Gallen CL, Ikemoto S. Mapping of reinforcing and analgesic effects of the mu opioid agonist endomorphin-1 in the ventral midbrain of the rat. Psychopharmacology. 2012;224:303–312. doi: 10.1007/s00213-012-2753-6.
  • 36. Chen BT, Yau HJ, Hatch C, Kusumoto-Yoshida I, Cho SL, Hopf FW, et al. Rescuing cocaine-induced prefrontal cortex hypoactivity prevents compulsive cocaine seeking. Nature. 2013;496:359–362. doi: 10.1038/nature12024.
  • 37. Jennings JH, Sparta DR, Stamatakis AM, Ung RL, Pleil KE, Kash TL, et al. Distinct extended amygdala circuits for divergent motivational states. Nature. 2013;496:224–228. doi: 10.1038/nature12041.
  • 38. Matsumoto M, Hikosaka O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature. 2007;447:1111–1115. doi: 10.1038/nature05860.
  • 39. Stamatakis AM, Stuber GD. Activation of lateral habenula inputs to the ventral midbrain promotes behavioral avoidance. Nature neuroscience. 2012;15:1105–1107. doi: 10.1038/nn.3145.
  • 40. Vanderschuren LJ, Everitt BJ. Drug seeking becomes compulsive after prolonged cocaine self-administration. Science. 2004;305:1017–1019. doi: 10.1126/science.1098975.
  • 41. Pelloux Y, Everitt BJ, Dickinson A. Compulsive drug seeking by rats under punishment: effects of drug taking history. Psychopharmacology. 2007;194:127–137. doi: 10.1007/s00213-007-0805-0.
  • 42. Deroche-Gamonet V, Belin D, Piazza PV. Evidence for addiction-like behavior in the rat. Science. 2004;305:1014–1017. doi: 10.1126/science.1099020.
  • 43. Lecca S, Melis M, Luchicchi A, Ennas MG, Castelli MP, Muntoni AL, et al. Effects of drugs of abuse on putative rostromedial tegmental neurons, inhibitory afferents to midbrain dopamine cells. Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology. 2011;36:589–602. doi: 10.1038/npp.2010.190.
  • 44. Perrotti LI, Bolanos CA, Choi KH, Russo SJ, Edwards S, Ulery PG, et al. DeltaFosB accumulates in a GABAergic cell population in the posterior tail of the ventral tegmental area after psychostimulant treatment. The European journal of neuroscience. 2005;21:2817–2824. doi: 10.1111/j.1460-9568.2005.04110.x.
