Author manuscript; available in PMC: 2013 Jul 1.
Published in final edited form as: Nat Neurosci. 2012 Dec 9;16(1):71–78. doi: 10.1038/nn.3283

Neural signals of extinction in the inhibitory microcircuit of the ventral midbrain

Wei-Xing Pan 1, Jennifer Brown 1,2, Joshua Tate Dudman 1,*
PMCID: PMC3563090  NIHMSID: NIHMS422102  PMID: 23222913

Abstract

Midbrain dopaminergic (DA) neurons are thought to guide learning via phasic elevations of firing in response to reward-predicting stimuli. The circuit mechanism for these signals remains unclear. Using extracellular recording during associative learning, we show that inhibitory neurons in the ventral midbrain of mice respond to salient auditory stimuli with a burst of activity that occurs prior to the onset of the phasic response of DA neurons. This population of inhibitory neurons exhibited enhanced responses during extinction and was anticorrelated with the phasic response of simultaneously recorded DA neurons. Optogenetic stimulation suggested that this population was in part derived from inhibitory projection neurons of the substantia nigra that provide a robust monosynaptic inhibition of DA neurons. Our results thus elaborate upon the dynamic upstream circuits that shape the phasic activity of DA neurons and suggest that the inhibitory microcircuit of the midbrain is critical for new learning in extinction.


The activity of midbrain dopaminergic (DA) neurons is thought to be critical for associative learning. DA neurons respond to unpredicted rewards (or unconditioned stimuli; US) with a phasic increase in spike rate1. Repeated presentation of a neutral sensory stimulus that predicts reward (US) results in an emergence of phasic firing of DA neurons at the onset of the conditioned stimulus (CS) and a reduction in firing at the time of the US2, 3. The phasic response of DA neurons to a CS correlates with behavioral features of associative learning including generalization4, blocking phenomena5, discrimination6, 7, conditioned inhibition8, extinction9, and spontaneous recovery10. The circuit mechanisms by which these diverse signals emerge during associative learning remain poorly understood.

The phasic response of DA neurons to both CS and US occurs with a substantial delay to auditory, olfactory and visual stimuli11. This long latency raises the question whether the phasic activity of DA neurons reflects plastic changes in DA neurons that emerge de novo or whether changes in the processing of sensory stimuli by upstream neurons could be critical. Previous data has suggested that synaptic plasticity may account for changes in excitatory drive of DA neurons12; however, relatively little is known about changes in inhibition onto DA neurons.

If the phasic activity of DA neurons indeed reflects a balance of excitation and inhibition, then one would predict that suppression of DA activity below baseline should be observed under some conditions. Phasic suppression of DA activity has been found to occur under three conditions in associative learning paradigms: 1) in response to omitted rewards at the time of a predicted reward2, 5; 2) in response to a conditioned inhibitor8 or negative association (CS-)13; and 3) in response to an extinguished CS10. Recently, studies in non-human primates have identified an inhibitory circuit that may influence DA neuron activity in a reversal-learning paradigm7, 14. Despite these important results, it remains unclear whether additional sources of inhibition may interact to control the phasic activity of DA neurons and/or how distinct populations of upstream neurons may be recruited in distinct learning conditions.

DA neurons receive inhibitory input from sources both extrinsic and intrinsic to the basal ganglia (BG). In both the ventral tegmental area (VTA) and the substantia nigra (SN) pars compacta (SNc), the BG nuclei in the ventral midbrain that contain DA neurons, the majority of non-dopaminergic neurons synthesize and release the inhibitory neurotransmitter γ-aminobutyric acid (GABA). Recordings from anaesthetized animals suggest that intrinsic GABAergic neurons of the VTA/SNc and GABAergic projection neurons of the SN pars reticulata (SNr) can inhibit DA neurons15–18. A crucial step is to identify inhibitory circuits that are both upstream of DA neurons in the pathway carrying sensory information and have responses to salient stimuli that are modulated by learning. As a first step we used extinction as a behavioral model to explore changes in inhibition upstream of DA neurons. We hypothesized that the phasic suppression of activity in DA neurons known to occur during extinction may reflect changes in the intrinsic inhibitory circuit of the ventral midbrain.

Here, using single-unit recordings from freely behaving mice we have explored neural activity of both GABA and DA neurons in the ventral midbrain during an auditory trace conditioning paradigm. We then used an optogenetic strategy to identify GABAergic neurons in vivo from extracellular recordings and demonstrate the presence of a robust monosynaptic projection to DA neurons in vitro from upstream GABAergic neurons of the SNr. Our results suggest the existence of a dynamic inhibitory microcircuit in the ventral midbrain that is upstream of DA neurons and is critical for attenuating the phasic response of DA neurons during extinction.

Results

To study the role of inhibitory neurons of the ventral midbrain we compared the activity of single units recorded in the ventral midbrain (Supplementary Fig. 1) of mice during sessions of acquisition and extinction in an auditory trace-conditioning task (Figure 1a). Repeated presentation of a tone as a CS followed by a delayed reward resulted in a steady increase in the average rate at which mice sampled the reward port. Licking significantly diverged from the baseline around the time of the tone offset (540 ms) and steadily increased during the trace period (Figure 1b). The anticipatory licking response emerged rapidly during acquisition of the CS-US contingency and was almost completely eliminated within 20 trials of extinction (Figure 1c; N=86 sessions).

Figure 1. Discrete timing of sensory responses in the ventral midbrain during auditory trace conditioning.

(a) Schematic of the auditory trace conditioning task. Freely behaving mice (N=5) were trained to obtain a water reward from a small port on the wall of the behavior box following a conditioned stimulus (CS; 500 ms tone). Licks at the reward port were measured using an infrared beam break. Water rewards were delivered by gating a solenoid valve. (b) Time course of the speaker control voltage, valve control, and average rate of licks (black, s.e.m. shaded area) aligned to the CS for all acquisition blocks (N=86). A baseline period of 2 s prior to CS onset was used to estimate the background rate of licks (solid cyan line; dashed lines are 0.95 confidence intervals). (c) Average anticipatory licking (licks in trace interval – baseline licks) in 5 trial blocks plotted as a function of position within acquisition sessions (left) and extinction sessions (right). ‘Stable’ response is the first block above 30 trials. (d-f) Units with significant phasic responses to tones were divided into 3 groups: dopamine (‘DA’, d); GABAergic, medium latency (‘GABA2’, e); GABAergic, short latency (‘GABA1’, f) using criteria defined in the main text including response pattern (d-f) and latency (g). Left two columns show example raster plots and Z-scored peristimulus time histograms (PSTH) from each group. Right two columns show population means for units with significant phasic responses during acquisition blocks. (g) Histogram of the latency to the peak of the phasic response for all units (grey bars). Peaks corresponding to the GABA1 and GABA2 groups are indicated. Superimposed are cumulative histograms of peak latencies for GABA1+2 (cyan) and DA (red) units (right axis).

Sensory information arrives in the midbrain with multiple, distinct latencies

We first identified a subset of single units from our dataset with the baseline firing properties, waveforms, and pharmacological sensitivity characteristic of DA neurons (Figure 1d, n = 22, see Methods). Consistent with previous studies in a number of species5, 7, 10, 13, 19–21, the response of DA neurons to the CS was relatively delayed, with an onset 50.9 ± 16.1 ms (s.d.) and a peak response 83.2 ± 18.4 ms after CS onset (Figure 1d, g). DA neurons had clear phasic responses to both the CS and US and little to no sustained changes in baseline activity during the delay period.

We next analyzed our dataset of 641 non-DA single units. Our electrode arrays were localized largely to the lateral VTA/SN border and the SN, both of which are thought to contain nearly exclusively GABAergic and DA populations in rodents22 (Supplementary Fig. 1). We thus identified the non-DA population as putative GABAergic (hereafter referred to as ‘GABA’) neurons. The responses of non-DA units often exhibited sustained shifts in activity during the delay period (both increases and decreases; Supplementary Fig. 2) similar to those observed previously in the midbrain23,24,25,26. Consistent with these studies, subsequent optogenetics experiments (Figures 5–7) and pharmacology (Supplementary Fig. 3) were used to confirm that units with short latency phasic responses were indeed GABA neurons.

Figure 5. Light evoked activity in ChR2-expressing GABAergic neurons, but not dopaminergic neurons in vitro.

(a,b) In vitro wide field illumination of a midbrain slice (1ms light pulse) reliably evoked action potentials in SNr GABA neurons (a) but not SNc DA neurons (b). Cells were recorded in current clamp and held to –70mV to detect evoked spiking. Inserts show characteristic voltage response of SNr GABA neuron (a, upper) and SNc DA neuron (b, upper) to a family of injected current steps (from –160pA, step size 40pA), and zoom in of a single action potential waveform. (c,d) Wide field illumination (1ms light pulse, blue) evoked inward current in SNr GABA neurons (c) but not SNc DA neurons (d), recorded in voltage clamp configuration with a holding voltage of – 70mV. Insert shows characteristic response of SNr GABA neuron (c, upper) and SNc DA neuron (d, upper) to hyperpolarizing current steps (from –0mV to –110mV, step size –10mV). (e,f) Reconstructed GABA (cyan) and DA (red) neurons were plotted within a common SN reference frame and independently arrayed horizontally to reveal morphological details (f). Dotted line indicates the perimeter of the SN reference frame. Orientation of the SN indicated by arrows. Scale bars: a,c; inset: 40 mV, 100 ms; 40 mV, 1 ms. b,d; inset: 80 pA, 1 s.

Figure 7. The microcircuit of the SN is spatially extended and sufficient to suppress firing of dopamine neurons.

(a) Schematic of the ChR2-assisted circuit mapping used to obtain data in b–d. (b) Example mapping experiment from an identified DA neuron in the substantia nigra (red dot). 60 ms windows containing evoked IPSCs (black traces) were aligned to the position of the laser (473 nm) stimulus. Approximate perimeter of the substantia nigra is indicated by dashed line. Slice orientation is indicated by D (dorsal) and M (medial). (c) IPSCs aligned to stimulus (cyan line) onset for all stimulation sites under control (black) conditions and in the presence of gabazine (red, ‘+Gbz’). (d) Onset latencies of individual IPSCs for all experiments and stimulus positions. (e) Schematic of the experiment to measure response of DA neurons to transient inhibition. Intracellular recordings from a GABA (grey) and DA (red) neuron following transient light stimulation (cyan). (f) Overlay of individual traces (n=80; 40 shown) for a single recorded DA neuron. Photostimulation occurred during the time window indicated by the cyan shading. (g) Average intracellular response (black) to photostimulation used to estimate the FWHM of inhibition. One standard deviation shown in gray. (h) Raster plot of spike times for 80 repetitions of the light stimulus (middle panel) and corresponding peristimulus time histogram (PSTH; lower panel). (i) Observed pause in firing as a function of FWHM of evoked inhibition for each DA neuron recorded in vitro (N=12 cells/slices). Shaded region indicates predicted pauses in firing from the theta neuron model (Supp. Fig. 9). Pauses were well matched to pauses observed in vivo (Supp. Fig. 4).

While many non-DA units had substantial sustained changes in activity correlated with approach and consumption of the water reward (Supplementary Fig. 2), most (n = 430/641) showed no significant phasic response to the CS onset (<3 s.d. above baseline within the initial 150 ms after stimulus onset) during acquisition. However, we found a subset of GABA units that did exhibit short-latency phasic responses. These units could be separated into 2 distinct functional groups. One group had short latency response (14.9 ± 1.4 ms [Gaussian fit] hereafter referred to as “GABA1”; Figure 1g) and a phasic US response (Figure 1f). A second group had response latencies intermediate between the GABA1 and DA populations (37.9 ± 8.5 ms [Gaussian fit]; hereafter referred to as “GABA2”; Figure 1g) to the CS and little to no phasic response to the US (Figure 1e). Thus, auditory information arrives at the midbrain with multiple, distinct latencies. Importantly, phasic responses to sensory stimuli in GABA neurons of the ventral midbrain could occur prior to the onset of the phasic DA response to both the CS and US.
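The response criterion used above (a Z-scored response exceeding 3 s.d. above baseline within the initial 150 ms after stimulus onset) can be illustrated with a minimal sketch. The 10 ms bin width, the use of the pre-stimulus bins as the baseline, and the function names are assumptions made for illustration; they are not taken from the Methods.

```python
import statistics


def zscore_psth(counts, n_baseline_bins):
    """Z-score a binned PSTH against the mean and s.d. of its baseline bins.

    counts: spike counts (or rates) per bin, with the first
    n_baseline_bins bins preceding stimulus onset (an assumed layout).
    """
    baseline = counts[:n_baseline_bins]
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance; cannot Z-score")
    return [(c - mu) / sd for c in counts]


def has_phasic_response(counts, n_baseline_bins, bin_ms=10.0,
                        window_ms=150.0, threshold_sd=3.0):
    """Return True if any bin in the first window_ms after stimulus onset
    exceeds threshold_sd baseline standard deviations."""
    z = zscore_psth(counts, n_baseline_bins)
    n_window = int(window_ms / bin_ms)
    post = z[n_baseline_bins:n_baseline_bins + n_window]
    return any(v > threshold_sd for v in post)


# Hypothetical PSTHs: 10 baseline bins followed by post-stimulus bins.
baseline = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]
responsive = baseline + [5, 6, 20, 8, 5, 5, 4, 6, 5, 5, 4, 6, 5, 5, 4]
unresponsive = baseline + [5, 6, 5, 4, 5, 5, 4, 6, 5, 5, 4, 6, 5, 5, 4]
print(has_phasic_response(responsive, 10))
print(has_phasic_response(unresponsive, 10))
```

A latency estimate could be obtained from the same Z-scored trace by taking the time of the peak bin, although the Gaussian fits reported in the text operate on the population histogram of latencies rather than on single units.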

Plastic changes in DA and GABA responses to the extinguished CS

Although DA units had diminished US responses (Figure 1d, 2a), consistent with previous studies in rats10, 19 and mice27, 24, we failed to observe a complete attenuation of the response. Thus our task may be analogous to a “delayed response” task previously shown in non-human primates to have phasic responses of DA neurons at the time of the reward trigger28. To confirm that responses to the CS were indeed sensitive to reward contingency we compared activity during blocks of extinction with that recorded during acquisition of the CS-US pairing. The response of DA units to the extinguished CS (CSext) was attenuated and could result in a slowly emerging and relatively sustained suppression of firing below baseline (Figure 2a; red arrow, Supplementary Fig. 4) as reported previously in rats9. By contrast to DA units, we found that an enhanced phasic response to the CSext emerged specifically amongst the short latency responses (Figure 2b-d, Figure 3a) corresponding to the GABA1 and GABA2 populations. Indeed, in some units a CSext response emerged despite a weak or non-existent response to the CS during the acquisition phase of the task (Figure 2b,c, Supplementary Fig. 5). We termed units with significantly larger responses to the CSext than the CS as ‘extinction cells’ (n=63). In a subset of recordings, extinction cells were identified online and subjected to re-acquisition and re-extinction blocks (n = 10 units). We found the response of extinction cells to be highly labile and could be rapidly reversed upon re-acquisition and then restored during a subsequent block of extinction (Figure 2e; Supplementary Fig. 6).
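The reversal shown in Figure 2e is summarized with a response index, (Acq – Ext)/(Acq + Ext). A minimal sketch of that arithmetic follows; the function name and example values are hypothetical, and the sketch assumes that both response magnitudes are non-negative so that the index is bounded between –1 and 1.

```python
def response_index(acq, ext):
    """Response index (Acq - Ext) / (Acq + Ext).

    acq, ext: phasic response magnitudes in acquisition and extinction.
    Assumes non-negative magnitudes; negative values of the index mean
    the response was larger in extinction.
    """
    total = acq + ext
    if total == 0:
        raise ValueError("index undefined when both responses are zero")
    return (acq - ext) / total


# Hypothetical unit whose response triples in extinction.
print(response_index(2.0, 6.0))
# Same unit after a hypothetical re-acquisition block restores the CS bias.
print(response_index(6.0, 2.0))
```

Under this convention an extinction cell yields a negative index during extinction, and a shift back toward zero or positive values during re-acquisition.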

Figure 2. Short latency responses to the CS are enhanced during extinction.

Upper panels show raster plots for 60 trials of representative DA units (a) and ‘extinction cells’ derived from the GABA 2 (b) and GABA 1 (c) groups of units aligned to CS onset (left), US onset (middle), and CSext onset (right). Red arrow indicates the presence of suppression of DA response below baseline in extinction. Shown with increased resolution in Supplementary Figure 4. Lower panels show Z-scored PSTHs. (d) Z-scored PSTHs for the entire population of units with a significant phasic response aligned to CS (left), US (middle), and CSext (right) (<−15 to >15 scaling). Cell index is sorted by the latency of the phasic response and CSext – CS response contrast. (e) Response index ((Acq – Ext)/(Acq + Ext)) for a subset of extinction cells (N=10 in 2 mice) that went through sessions of re-acquisition and re-extinction on the same day. (**, p<0.01; *, p<0.05; t-test, two-tailed).

Figure 3. The population response during extinction.

(a) Difference in the phasic response magnitude normalized by response width (’ΔZ’) between acquisition and extinction plotted as a function of the latency to the peak of the response. Significant differences indicated by filled circles for GABA (cyan) and DA (red) units. (b) The CSext – CS contrast plotted as a function of the difference between the CS and US response for GABA 1 (cyan) and DA (red) units. GABA 1 units showed a significant correlation such that units with a strong US response in acquisition developed a strong CSext response. GABA 2 units had no population US response and thus, were not included. P value reflects significance of hypothesis of a nonzero Pearson’s correlation.

Thus, both in individual units and in the GABA1 and GABA2 population response we found evidence for increases in activity selectively in the response to the CSext. These changes could be reversed by subsequent re-acquisition and restored by re-extinction suggesting a labile and plastic processing of sensory information by midbrain GABA neurons. An enhanced response to the CSext could be characterized as a negative reward prediction, i.e. a phasic response to a stimulus that no longer predicts reward. Alternatively, it is possible that such a signal could reflect the fact that midbrain GABA neurons signal extinction independent of any history of reward association. To address this possibility, we next asked whether the phasic activity of units from the GABA1 population, which show responses to both CS and US, were coordinated across acquisition and extinction. We found that the relative magnitude of the CS and US response in acquisition was significantly correlated (p<0.01) with the change in CS response upon extinction (Figure 3b) in the GABA1 population. In other words, units with a large response at the time of the US, but little response to the CS, developed a strong response to the CS in extinction. These results argue that extinction cells provide a signal of negative reward prediction that is suppressed by positive reward prediction and robustly expressed during extinction.

Simultaneous recordings of midbrain DA and GABA neurons during learning

The short latency of the phasic response of midbrain GABA neurons suggests that these neurons could be upstream of DA neurons. If extinction cells are upstream of DA neurons and contribute to the inhibition of phasic DA activity in extinction, then the extinction cell response should emerge concomitant with the loss of the DA response. To study the time course of changes during extinction we analyzed recordings in which DA units and extinction cells were recorded simultaneously. This yielded 74 pairwise comparisons (Figure 4a-d). Using responses binned across trials we found that the emergence of the extinction cell response was concomitant with the loss of the DA response (Figure 4a). However, the phasic response of both DA units and extinction cells fluctuated trial by trial. Thus, we calculated the response correlation between DA and extinction cells for each trial in extinction. We found 19 significant pairwise correlations in extinction blocks (Figure 4d). Out of the 19 significant correlations (Pearson correlation coefficient), 18 were negative and yielded a significant negative mean correlation during extinction (−0.21 ± 0.03; t-test). In acquisition, extinction cells and DA units showed no significant correlation in their CS response (0.00 ± 0.03). This observation suggests that extinction cells are recruited during extinction to suppress the phasic DA response. The rapid loss of a phasic response to the CS during re-acquisition (Figure 2e) suggests that the inhibition from extinction cells may be relieved during acquisition and thereby contribute to the phasic CS response of DA units. However, the relatively modest variation in the amplitude of the CS response in DA units and the greatly reduced population response of extinction cells revealed little to no trialwise correlation in acquisition. Thus, a specific role for disinhibition in the CS response remains less clear.
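The pairwise analysis above rests on a Pearson correlation between the trial-by-trial response amplitudes of a DA unit and a simultaneously recorded extinction cell. The following sketch shows one way such a coefficient and its t statistic could be computed; the function names and data are hypothetical, and the t-transformation is a standard textbook test that is assumed here rather than taken from the Methods.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r != 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical per-trial spike counts above baseline for one pair.
da_response = [5, 4, 4, 3, 2, 2, 1, 1, 0, 0]
gaba_response = [0, 1, 1, 2, 2, 3, 3, 4, 5, 5]
r = pearson_r(da_response, gaba_response)
print(round(r, 3), round(t_statistic(r, len(da_response)), 2))
```

A negative coefficient in this sketch corresponds to the case reported for extinction blocks, where trials with a larger extinction cell response tend to have a smaller DA response.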

Figure 4. Opposing extinction signals emerge with similar timecourses.

(a) The mean PSTH for each block of trials during acquisition and extinction are shown for an example simultaneously recorded putative GABA (cyan) and dopamine (DA; red) unit. PSTHs were smoothed using a 10 ms gaussian kernel for each trial and blockwise averages were calculated in 10 trial blocks. (b-c) Population mean PSTHs for simultaneously recorded DA (b; red) and GABA (c; cyan) units during acquisition (‘Acq’, light) and extinction (‘Ext’, dark). (d) For simultaneous recording sessions the pairwise trial by trial correlations (n=74) in the phasic response amplitude between DA and GABA units were calculated. All significant (p≤0.05) correlations from the Ext session were used to calculate the population mean shown in the right bar (n=19; paired t-test for significance of mean correlation). The mean response correlation for the same comparisons during Acq is shown at left. (e-h) The number of spikes above baseline in the phasic response window on a single trial are plotted for the population of DA (e-f) and Extinction cells (g-h). (i-j) Normalized linear fits to the initial 30 trials of Acq (i) and Ext (j) are shown in solid lines with confidence intervals in dashed lines for DA units (red) and extinction units (cyan). Normalized, trial by trial behavioral responses (anticipatory licks per trial; black circles) are also shown for first 30 trials. Error bars are standard error of the mean. Significance: *** = p<0.001 t-test, two-tailed.

To estimate the behavior of the entire population of DA units (Figure 4e-f) and extinction cells (Figure 4g-h) across all sessions we examined the trialwise change in the phasic response to the CS and CSext. During acquisition both extinction cells and DA units had relatively stable CS responses (Figure 4e,g); however, in extinction blocks the phasic response of the extinction cells gradually increased (Figure 4h) whilst the response of DA units decreased (Figure 4f) over the first 30 trials. In addition we found that DA units showed a delayed latency to peak responses early in extinction and a suppression of activity below baseline in later trials (Supplementary Fig. 4). Together these data are consistent with a model in which recruitment of the extinction cell population shapes the phasic responses of DA units to the CSext in extinction.

We next compared the phasic response to the CS of DA units and extinction cells to the behavioral response in blocks of acquisition (Figure 4i) and extinction (Figure 4j). The phasic responses of extinction cells and DA units were closely matched, whereas the extinction of licking followed a similar, but distinct time course. This suggests that the change in the phasic CS response of extinction cells is not simply explained by a correlation between reward expectation and licking, but is, rather, part of the circuit controlling the phasic response of DA neurons to reward predicting stimuli. Approach behavior is, nonetheless, a complex behavioral response that is only partially captured by anticipatory licking. Future analysis is required to determine the precise relationship between activity in the midbrain and the extinction of approach behavior.
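The time courses compared in Figure 4i-j are described as linear fits to the initial 30 trials. An ordinary least-squares fit of response against trial number is sketched below as one plausible reading; the normalization applied in the figure is not specified in the text and is therefore omitted, and the function name and data are hypothetical.

```python
def linear_fit(values):
    """Least-squares slope and intercept of values against trial index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two trials to fit a line")
    trials = range(n)
    mean_t = sum(trials) / n
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in trials)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(trials, values))
    slope = stv / stt
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical per-trial responses over the first trials of extinction.
da_ext = [4.0, 3.5, 3.0, 2.5, 2.0, 1.5]      # declining DA response
gaba_ext = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]    # growing extinction cell response
print(linear_fit(da_ext))
print(linear_fit(gaba_ext))
```

Slopes of opposite sign for the two populations would correspond to the opposing trends described for extinction blocks.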

Short latency responses can be derived from GABAergic neurons of the SNr

Previous studies have suggested that midbrain GABA neurons can inhibit the firing of DA neurons15–18. We observed that units with increased CSext responses were distributed across electrode positions, many at sites in the SNr that were quite distant from the SNc (Supplementary Fig. 1) and often at recording sites without detectable DA units. The SNr is thought to be composed nearly exclusively of GABAergic projection neurons and a much smaller population of DA neurons22, 29. Thus, anatomical evidence strongly suggested that extinction cells could be derived in part from projection neurons within the SNr. The prominent motor correlates of many units in our population (Supplementary Fig. 2) and the high baseline firing rates in the population (mean: 17.2 Hz) and in extinction cells (Supplementary Fig. 5) were consistent with previous recordings of SNr projection neurons in rats25, 26 and mice30. Moreover, in a subset of recordings extinction cells were found on electrodes clearly localized to the SNr (Supplementary Fig. 7). Finally, the negative pairwise correlation between extinction cells and DA units during extinction suggested the presence of an inhibitory projection (Figure 4). If extinction cells were derived in part from SNr projection neurons and contribute to inhibition of DA neurons, then DA neurons should receive inhibitory input from neurons throughout the SNr.

To study the properties of the inhibitory microcircuit of the midbrain we identified a transgenic mouse line in which channelrhodopsin-2 (ChR2) was expressed in GABA neurons of the SN, but not DA neurons (Figure 5, Supplementary Fig. 8). We used these mice to address two questions: (1) were the population of units with short-latency responses to the CS indeed derived, at least in part, from SNr GABA neurons; and (2) could phasic activation of GABA neurons in the SNr inhibit DA neurons?

We implanted head-fixed mice (N=4) with an electrode array and associated fiber optic to provide diffuse optical stimulation to the SN during recording (Figure 6a). A series of brief light stimuli (1–5 ms) and auditory stimuli (500 ms tones) were then presented to alert mice. We ‘tagged’ single units as GABA neurons if light stimuli elicited spikes with properties similar to those found in in vitro experiments (Figure 5a): short latency (<6 ms) onset of spiking with low jitter (<0.5 ms) and reliable (>50%) across trials (example neuron shown in Figure 6b, left). Within the population of tagged units (N=90), a subset of units (‘tagged-tone’) also had phasic responses to the auditory stimulus (Figure 6c, left, N=34). All tagged-tone units had response latencies shorter than the onset latency of DA neurons (Figure 6d-e) and consistent with the latencies at which the phasic response of extinction cells was observed (Figure 3a). In these recordings we also isolated a small number of putative DA units (n=4; identified blind to response properties). The spontaneous activity of the DA units was suppressed by light stimulation (Figure 6b, right) and showed no response or mild suppression in response to a neutral auditory stimulus (Figure 6c, right).
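The tagging criteria stated above (onset latency <6 ms, jitter <0.5 ms, reliability >50% across trials) can be expressed as a simple classifier. In this sketch, latency is taken as the mean first-spike latency and jitter as its standard deviation across trials; both definitions, along with the function name and example data, are assumptions, since the text does not define these quantities.

```python
import statistics


def is_tagged(first_spike_latencies_ms, n_trials,
              max_latency_ms=6.0, max_jitter_ms=0.5, min_reliability=0.5):
    """Classify a unit as light-tagged from its light-evoked first spikes.

    first_spike_latencies_ms: latency of the first evoked spike on each
    trial that contained one; n_trials: total number of light trials.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if len(first_spike_latencies_ms) < 2:
        return False  # too few evoked spikes to estimate jitter
    reliability = len(first_spike_latencies_ms) / n_trials
    latency = statistics.mean(first_spike_latencies_ms)
    jitter = statistics.stdev(first_spike_latencies_ms)
    return (latency < max_latency_ms
            and jitter < max_jitter_ms
            and reliability > min_reliability)


# Hypothetical units, each tested on 10 light trials.
precise = [3.0, 3.2, 2.9, 3.1, 3.0, 3.1, 2.8, 3.0]
jittery = [2.0, 5.5, 3.0, 4.8, 2.2, 5.9, 3.5, 4.1]
unreliable = [3.0, 3.1, 2.9]
print(is_tagged(precise, 10), is_tagged(jittery, 10), is_tagged(unreliable, 10))
```

Requiring all three criteria together is what separates directly photoactivated units from units driven indirectly, which would be expected to respond later and with more variable timing.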

Figure 6. Optogenetic tagging demonstrates that short-latency auditory responses are from GABAergic neurons.

(a) Schematic of the in vivo stimulation and recording experiment. An electrode array with an integrated optical fiber was slowly lowered into the SN of awake, head-fixed mice. Single units were isolated on the recording electrodes. Trials with either light stimulation (1–5 ms pulse, b) or auditory stimulation (500 ms tone, c) were presented to the quietly resting mouse. (b-c) Response of an example GABA (left) unit that was directly activated by light (indicated by square pulse, b) and responded to a tone (indicated by pure tone voltage command, c) with short latency. Response of an example DA unit (right) to light (b) and tone (c). (d) Population data from the head-fixed experiment for all directly photoactivated units (black, ‘GABA’) and the subpopulation of DA units (red). (e) The latency and amplitude of the tone response for all units that exhibited direct photostimulation responses and tone responses (N=34).

Activation of SN GABA neurons is sufficient to suppress firing of DA neurons

To confirm that the short latency of the inhibition of DA units recorded in vivo was mediated by a monosynaptic connection we next used whole-cell recordings in vitro to characterize the source and circuitry mediating the inhibition of DA neurons. All recorded GABA neurons from the SNr expressed ChR2 (Figure 5a,b,e) and had morphology consistent with SNr projection neurons (Figure 5e-f). The homogeneous expression of ChR2 in GABA neurons of the SNr allowed us to use multisite photostimulation of the SN to characterize the source of inhibition onto DA neurons (Figure 7a). Using whole-cell voltage clamp recordings of identified DA neurons we observed fast (decay τ: 15.07 ± 0.73 ms; N = 9 cells/slices), inhibitory postsynaptic currents (IPSCs) elicited at short latency (3.45 ± 0.20 ms; N = 9) after photostimulation consistent with monosynaptic inhibition (Figure 7b-d). Importantly, ChR2-assisted circuit mapping showed that inhibition was derived from sites located within the SNr (Figure 7b). Application of gabazine, a GABAA receptor antagonist, completely eliminated evoked IPSCs (N = 6 cells/slices; p<0.001 two-tailed t-test; Figure 7c). Thus, in vitro experiments confirmed that photostimulation of SNr GABA neurons elicited a robust, monosynaptic inhibition of DA neurons in the SN. To further confirm the source of inhibition as the SNr we also used targeted viral infection of SNr GABA neurons with ChR2. In both the transgenic and viral-mediated approach we observed robust inhibition of DA neurons with indistinguishable biophysical properties (Supplementary Fig. 8).

We next sought to confirm that the inhibition of DA neurons by SNr GABA neurons was sufficient to suppress firing as suggested by the in vivo recordings. We used brief phasic activation of SNr GABA neurons in vitro, with a pattern modeled after that observed in vivo during extinction, to suppress the firing of DA neurons (Figure 7e). Alignment of repeated trials revealed that the transient burst of inhibition resulted in a pause (472.7 ± 102.4 ms) of firing (Figure 7e-h) that substantially exceeded the duration of inhibition (89.7 ± 29.4 ms) in all DA cells tested (n=12; p<0.001; Figure 7i). This suppression could be accounted for by a simple computational model (Figure 7i; Supplementary Fig. 9) and closely matched the duration of suppression observed in DA units during extinction (Supplementary Fig. 4). These data demonstrate that SNr GABA neurons can respond to conditioned and extinguished stimuli with latencies shorter than DA neurons and provide a monosynaptic inhibition that is sufficient to suppress DA neuron firing in vivo and in vitro. Thus, our data strongly suggest that extinction cells are derived at least in part from SNr GABA neurons.
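The observation that a brief burst of inhibition produces a pause much longer than the inhibition itself can be illustrated with a theta neuron. The sketch below uses the canonical form τ dθ/dt = (1 − cos θ) + (1 + cos θ) I(t), integrated with a forward Euler step; this form and every parameter value (time constant, baseline drive, inhibition strength and timing) are assumptions chosen for illustration and are not the model or parameters of Supplementary Fig. 9.

```python
import math


def simulate_theta_neuron(i_base=0.0625, g_inh=1.0, t_on=400.0, t_off=490.0,
                          tau_ms=10.0, dt_ms=0.1, t_end_ms=1000.0):
    """Forward-Euler simulation of a canonical theta neuron.

    A constant drive i_base makes the neuron fire tonically; the drive is
    reduced by g_inh between t_on and t_off (ms) to mimic a transient
    burst of inhibition. Returns the list of spike times in ms.
    """
    theta = -math.pi  # start just after a spike
    spikes = []
    n_steps = int(round(t_end_ms / dt_ms))
    for k in range(n_steps):
        t = k * dt_ms
        drive = i_base - (g_inh if t_on <= t < t_off else 0.0)
        dtheta = (1.0 - math.cos(theta)) + (1.0 + math.cos(theta)) * drive
        theta += (dt_ms / tau_ms) * dtheta
        if theta >= math.pi:  # phase crossing pi counts as a spike
            spikes.append(t + dt_ms)
            theta -= 2.0 * math.pi
    return spikes


def pause_duration(spikes, t_on):
    """Interval between the last spike before t_on and the next spike."""
    before = [s for s in spikes if s < t_on]
    after = [s for s in spikes if s >= t_on]
    if not before or not after:
        raise ValueError("need spikes on both sides of inhibition onset")
    return after[0] - before[-1]


spikes = simulate_theta_neuron()
baseline_isi = spikes[1] - spikes[0]
pause = pause_duration(spikes, 400.0)
print(f"baseline ISI {baseline_isi:.1f} ms, pause {pause:.1f} ms")
```

In this sketch the inhibition drives the phase back toward a stable rest point, so that after release the neuron must traverse the slow part of its cycle again before firing; the resulting pause outlasts both the 90 ms inhibition and the baseline interspike interval.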

Discussion

Using an auditory trace-conditioning paradigm we have shown that a subset of GABAergic neurons of the ventral midbrain respond to a tone with phasic elevations of activity shortly before the onset of the phasic response of DA neurons. We identified two functional groups of putative inhibitory neurons (GABA1 and GABA2) distinguished by the response patterns and latencies of their phasic response to reward predictive stimuli. Signals of a negative reward prediction emerged during extinction in both the GABA1 and GABA2 population. Subsequent optogenetics experiments were used to confirm that short latency responses to auditory stimuli can arise in GABAergic projection neurons of the SNr, but may also arise in other populations of midbrain neurons. Cell type specific stimulation of midbrain GABA neurons was sufficient to suppress DA activity below baseline in vivo and in vitro. Based upon the position of our extracellular electrodes, firing properties of the recorded units and optogenetic tagging experiments we concluded that both the GABA1 and GABA2 populations derive, at least in part, from GABAergic projection neurons of the SNr. It is important to note that other populations of neurons in the ventral midbrain, most likely either in the lateral VTA/PBP or SNc, could be an additional source of the GABA1 and GABA2 populations. Phasic activity matching the properties observed here was not observed in recent work on GABAergic neurons of the VTA24; however, neither auditory stimuli nor extinction of the CS were explored.

Synaptic plasticity of excitatory inputs to DA neurons is thought to produce changes in excitatory drive during conditioning12. We have provided several lines of evidence to argue that GABAergic neurons in the ventral midbrain provide a source of inhibition that dynamically shapes the phasic activity of DA neurons in extinction: (1) the response to the CSext of extinction cells was maximal just prior to the onset of the response in DA neurons; (2) the increased responding of the extinction cell population emerges as the phasic DA response attenuates; (3) simultaneous recordings of extinction cells and DA neurons reveal a negative correlation in the trial by trial modulation of the CS response; (4) the GABA1 population had clear phasic responses to the US relative to the CS in acquisition, whereas the response of DA neurons was attenuated; (5) optogenetic activation of SNr GABA neurons in vitro and in vivo resulted in a robust inhibition of DA neuron firing; (6) ChR2-assisted circuit mapping revealed a fast monosynaptic inhibition of DA neurons following stimulation at sites distributed across a large extent of the SNr, consistent with the distribution of in vivo recording sites; (7) in vitro and in vivo data showed that transient activation of SNr GABA neurons is sufficient to generate sustained pauses in DA activity (85 ms – 1.3 s) covering the range of pause durations observed in vivo in both our recordings and previous recordings31; (8) finally, we used a computational model to argue that the properties of the sustained suppression of activity in DA neurons were consistent with the observed properties of SNr-mediated inhibition.

The short latency responses of the GABA1 and GABA2 populations are relatively surprising given the canonical circuitry of the basal ganglia in which sensory information enters via the cortex and thalamus32. This would suggest the presence of multiple, potentially independent sensory inputs to the midbrain. Stimulation of auditory cortex elicits responses in the subthalamic nucleus (STN) with a latency of ~12 ms in anaesthetized rats33. Given the short latency of auditory responses in the cortex34, the latency of the GABA2 population may correspond to sensory information arriving via the ‘hyperdirect’ pathway from cortex to STN to SNr. However, it would appear that this pathway is not sufficient to account for the short latencies observed in the GABA1 population. Both the dorsal midbrain35 and the tegmentum36 give rise to ascending inputs into the basal ganglia that reach both the SN and STN. In the pedunculopontine tegmental nucleus, auditory responses have been recorded at latencies ranging from 4–35 ms37. In the superior colliculus, auditory responses have latencies as short as 8 ms (e.g. 38). Thus, either or both of these ascending pathways may provide the short latency sensory information to the ventral midbrain observed in our recordings.

DA neurons integrate excitatory and inhibitory inputs to generate signals that depend upon reward probability, motivation, context, and salience39, 40. Here, we have identified multiple populations of GABAergic neurons in the ventral midbrain that appear to contribute to the suppression of DA neuron firing during extinction. In the literature, there are two other conditions, both during acquisition, known to reveal a suppression of DA neuron firing: the omission of a predicted reward and the presentation of a conditioned inhibitor or negative association. Recently, the rostromedial tegmental (RMTg) nucleus has been identified as a source of extrinsic inhibition to DA neurons. The lateral habenula (LHb) -> RMTg -> SNc circuit was shown to be active during reversal learning (a negative association) and is also sufficient to inhibit DA neuron firing7, 14, 41. An intriguing possibility is that common mechanisms for the inhibition of DA neurons are present both during acquisition and extinction; however, this remains to be directly demonstrated.

These data suggest the presence of two or more functionally distinct populations of midbrain GABA neurons. We provide evidence that distinct functional populations may be present within the SNr and may be selectively recruited to inhibit DA neurons during acquisition and extinction (schematized in Supplementary Fig. 10). Only one class, the GABA1 population, responds to predicted rewards (US). The phasic response of DA neurons to a predicted US can be attenuated but is never suppressed below baseline firing rates. This could reflect the fact that the GABA1 population, which shows phasic responses to predicted rewards, may provide inhibition that is coincident in time with excitation or perhaps feeds back onto upstream excitatory inputs. Both possibilities would be consistent with an early response latency which may be effective at reducing or delaying excitation, but insufficient to suppress activity below baseline. Reconstruction of projection patterns of individual SNr neurons has been used to propose a small number (4) of neuron classes42. Subsets of SNr projection neurons send axons to the superior colliculus43 and tegmentum44, both of which contain excitatory neurons that project to midbrain DA neurons. This suggests that specific anatomical or molecular classes may accord with the functional classes we observed.

Less clear is why extinction leads to the recruitment of multiple populations of inhibitory inputs. Our modeling data suggested that a more sustained inhibition that results from populations with 2 distinct latencies could be especially effective at suppressing the activity of DA neurons. The requirement for multiple inhibitory inputs may reflect a stringent regulation of phasic suppression of DA neuron activity. This suggests that phasic suppression of DA activity (as opposed to the cessation of phasic elevation of firing) is itself an important signal for downstream targets. A transient loss of DA activity may be important for modifying synaptic plasticity rules45 and could contribute, together with the changes in inhibition observed here, to a new inhibitory learning in extinction as first proposed by Pavlov nearly a century ago46.

Methods

Subjects and surgery

For behavior and in vivo electrophysiology experiments we used 5 adult (30 g; 3–6 months old) male mice derived from the C57/Black6 strain and bred from the in house breeding colony. For the optogenetics experiments we used adult mice (10–30 weeks old) expressing ChR2 under the thymus cell antigen 1 (Thy1) promoter (Line 18; Jackson Labs, Bowdoin, ME). All animals were handled in accordance with institutional guidelines. Mice were trained to learn a classical trace-conditioning task with a tone (7–10 kHz, 500 ms duration) as the CS and sweetened water as a reward (2000–2500 ms delay).

Animal care

Mice were initially housed in a temperature- and humidity-controlled room maintained on a reversed 12 h light/dark cycle. For behavioral and in vivo physiology experiments mice were housed individually; for in vitro experiments mice were group housed. Following one week of recovery from surgery, the water consumption of the mice was limited to 1 mL per day for a week. Mice underwent daily health checks, and water restriction was eased if mice fell below 75% of their body weight at the beginning of deprivation. Mice were then familiarized with the training and recording box, which was located inside a sound attenuating enclosure. Mice were trained to obtain fluid from a recessed spout in the wall of the behavioral box. Small volumes (≈0.01 ml) of water sweetened with saccharin (0.005 M solution) were delivered to the spout under computer control using custom software and electronics with a nominal time resolution of 1 kHz (to be published elsewhere). Entries to the reward port, primarily as licks, were detected using an infrared beam break positioned just in front of the spout opening.

Behavioral training

Mice were trained to learn a classical trace conditioning task with a tone (7–10 kHz, 500 ms duration) as the conditioned stimulus (CS), followed by the water reward after a delay of 2000–2500 ms. The intertrial intervals were chosen pseudo-randomly from a uniform distribution over the interval 20 to 40 seconds. Extinction training then involved exposing the animal to equivalent pseudo-randomly delivered repetitions of the previously conditioned CS, without solenoid activation. Each conditioning session was carried out for 40–100 trials, while the following extinction session was carried out for 80–120 trials. In a subset of behavioral sessions a distinct tone, unpaired with reward, was also delivered (e.g. Supplementary Fig. 9).
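The trial timing described above can be illustrated with a short sketch. This is not the custom control software used in the study; the function name `make_schedule`, the choice to measure both the intertrial interval and the reward delay from CS onset, and the uniform draw of the delay are all assumptions made for illustration.

```python
import random

def make_schedule(n_trials, rewarded=True, seed=0):
    """Return a list of (cs_onset_s, reward_onset_s) tuples for one session.

    Assumptions (not stated in the text): intervals are measured between
    successive CS onsets, and the CS-to-reward delay is drawn uniformly.
    reward_onset_s is None for extinction sessions (no solenoid activation).
    """
    rng = random.Random(seed)
    schedule = []
    t = 0.0
    for _ in range(n_trials):
        t += rng.uniform(20.0, 40.0)      # intertrial interval, 20 to 40 s
        if rewarded:
            delay = rng.uniform(2.0, 2.5)  # 2000 to 2500 ms after the CS
            schedule.append((t, t + delay))
        else:
            schedule.append((t, None))     # extinction: CS alone
    return schedule

acquisition = make_schedule(60, rewarded=True)
extinction = make_schedule(100, rewarded=False, seed=1)
```

Using a seeded generator keeps a session reproducible, which is convenient when aligning recorded spikes to the delivered stimuli afterwards.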

In vivo electrophysiology

Recordings were performed using 16 or 32-microwire arrays (CD Neural Technologies, Durham, NC). Electrode arrays were stereotaxically implanted under anaesthesia (isoflurane; 1.5%–2.5% in O2). The electrodes were targeted to the substantia nigra and the ventral tegmental area of the ventral midbrain (3.0–4.5 mm posterior to bregma, 0.5–2.0 mm lateral to midline and 3.5 mm below the surface of skull). Electrode arrays were mounted directly to a custom-designed microdrive and connected to the recording systems via a flexible wire coupling and connector. This configuration allowed us to advance the electrode arrays between training and recording sessions. Animals were allowed at least 1 week for recovery from surgery and initial advancement of electrodes. Approximate electrode tracks are shown in Supplementary Figure 1. Recordings were generally initiated at multiple electrode advancement steps. We noted an enhanced probability of detecting extinction cells late in our recordings either simultaneously with dopamine units (Figure 4) or after dopamine units had been recorded (i.e. at greater depth relative to the surface). However, the uncertainty present in the exact electrode position due to approximations in the drive displacement and tissue compaction made it unreasonable to report precise locations for individual recording sessions in Supplementary Figure 1.

The movable electrodes were advanced in 30–60 µm increments daily to search for independent units. The voltage signals from electrodes were amplified and filtered with a sequential analog (0.1 Hz–7.6 kHz bandpass) and digital filter (750 Hz–7.6 kHz bandpass). Channels with detectable activity were digitized at 30 kilosamples/second, thresholded on-line, and voltage segments (30–50 samples) recorded to disk using the Cerebus Data Acquisition System (Blackrock Microsystems, Salt Lake City, UT). Spikes were re-isolated offline on the basis of wave-shape, using Plexon Offline Sorter (Plexon Inc, USA). Putative DA cells were classified according to the following criteria: 1) low firing rate (< 10 Hz), 2) relatively broad action potential (> 1.2 ms), 3) phasic CS responses with onset latencies of 40–60 ms, and 4) profound (>50%) inhibition by the D2-receptor agonist quinpirole (400 µg/kg, s.c.; most putative DA cells were tested, but not all).
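The four classification criteria can be written as a simple decision rule. The sketch below is illustrative only: the `UnitSummary` container and its field names are hypothetical, and treating the quinpirole criterion as optional for untested units is an assumption based on the statement that not every putative DA cell was tested.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UnitSummary:
    """Hypothetical per-unit summary; field names are illustrative."""
    firing_rate_hz: float                   # mean baseline firing rate
    spike_width_ms: float                   # action potential width
    cs_onset_latency_ms: Optional[float]    # None if no phasic CS response
    quinpirole_inhibition: Optional[float]  # fractional rate drop, None if untested

def is_putative_da(unit: UnitSummary) -> bool:
    """Apply the four criteria listed in the text to one sorted unit."""
    if unit.firing_rate_hz >= 10.0:          # 1) low firing rate (< 10 Hz)
        return False
    if unit.spike_width_ms <= 1.2:           # 2) broad action potential (> 1.2 ms)
        return False
    latency = unit.cs_onset_latency_ms
    if latency is None or not (40.0 <= latency <= 60.0):  # 3) onset 40 to 60 ms
        return False
    inhibition = unit.quinpirole_inhibition
    if inhibition is not None and inhibition <= 0.5:      # 4) > 50% inhibition
        return False
    return True

example = UnitSummary(4.2, 1.5, 52.0, 0.8)
print(is_putative_da(example))
```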

For head-fixed recording experiments mice were implanted with a custom designed head-restraint several days prior to recording (to be described elsewhere). Mice were then habituated to the head-restraint system. Following recovery and habituation a craniotomy was made under isoflurane anesthesia as described above and electrode arrays were maintained in position by a micromanipulator (Sutter Instruments). A 200 micron core multimode fiber (ThorLabs) was affixed near the central recording wires of a 32 channel array. The entire array was slowly lowered into the midbrain. After >1 hour of recovery, recording data were obtained from alert, but quietly resting mice.

Analysis of physiology and behavioral data

Analysis was performed using custom written routines in Matlab R2011a (Mathworks, Natick, MA) and Igor Pro (Wavemetrics, Eugene, OR). Briefly, z scores were calculated as the mean subtracted PSTH divided by the standard deviation of the baseline period (2 seconds prior to the stimulus). Responses were calculated from more than 40 trials of acquisition and extinction. A phasic response was specified to occur in the first 150 ms after stimulus onset with a width defined as the first point 2 s.d. above baseline prior to and after the peak response. Mean responses were quantified as the integral of the response (z score, or rate) within the peak window, trialwise responses were integrals or spike counts within the same window. Significant differences between extinction and acquisition were defined by comparing equivalent numbers of trials during stable behavior. Units in which both the ranksum and Kruskal-Wallis test were significant (p<0.05) were labeled significant. P values reported for all pairwise comparisons of means are taken from two-tailed t-tests. Significant correlations were assessed using a t transformation of the data and evaluating the Pearson correlation.
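As an illustration of the z-score calculation, the following sketch bins spike times relative to stimulus onset and normalizes by the 2 s pre-stimulus baseline. It is written in Python rather than the Matlab and Igor Pro routines actually used, and the function name, the 10 ms bin width, the use of the baseline mean for subtraction, and the sample standard deviation are assumptions.

```python
import math
from statistics import mean, stdev

def zscore_psth(trials, t_start=-2.0, t_stop=1.0, bin_s=0.01):
    """Return a z-scored PSTH from spike times (s) aligned to stimulus onset.

    trials: list of per-trial lists of spike times, with stimulus onset at 0.
    The baseline is every bin before 0 (here the 2 s preceding the stimulus).
    """
    n_bins = round((t_stop - t_start) / bin_s)
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = math.floor((t - t_start) / bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    # Convert summed counts to a mean firing rate in spikes per second.
    rate = [c / (len(trials) * bin_s) for c in counts]
    n_base = round(-t_start / bin_s)
    base_mean = mean(rate[:n_base])
    base_sd = stdev(rate[:n_base])
    if base_sd == 0:
        raise ValueError("baseline has zero variance; z score is undefined")
    return [(r - base_mean) / base_sd for r in rate]

# Two toy trials: sparse baseline spikes and a burst about 50 ms after onset.
toy = [[-1.995, -1.505, -1.005, -0.505, 0.051, 0.052, 0.053]] * 2
z = zscore_psth(toy)
```

A response window could then be located by scanning outward from the peak bin within the first 150 ms until the z score falls below 2, following the definition given in the text.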

Histology

To confirm the position of recording sites, mice were killed by anesthetic overdose (isoflurane, >3%), perfused with phosphate-buffered saline (PBS) then paraformaldehyde (4% w/vol. in PBS). Brains were post-fixed for 24 hours and then rinsed in saline. Whole brains were then sectioned (50–100 µm thickness) using a vibrating microtome (VT–1200; Leica Microsystems, Germany). Electrode tracks were mapped onto standard atlas sections by visual inspection using counter-staining or autofluorescence for registration.

In vitro electrophysiology

Briefly, mice were deeply anaesthetized under isoflurane, decapitated and the brains were dissected out into ice-cold modified artificial cerebrospinal fluid (aCSF) (in mM: 52.5 NaCl, 100 Sucrose, 26 NaHCO3, 25 Glucose, 2.5 KCl, 1.25 NaH2PO4, 1 CaCl2, 5 MgCl2 and in µM: 100 Kynurenic Acid) that had been saturated with 95%O2/5%CO2. 300 µm thick coronal slices were cut (Leica VT1200S; Leica Microsystems, Germany), transferred to a holding chamber and incubated at 35°C for 30 minutes in modified aCSF (in mM: 119 NaCl, 25 NaHCO3, 28 Glucose, 2.5 KCl, 1.25 NaH2PO4, 1.4 CaCl2, 1 MgCl2, 3 Na Pyruvate and in µM: 400 Ascorbate and 100 Kynurenic Acid, saturated with 95%O2/5%CO2) and then stored at room temperature.

For recordings, slices were transferred to a recording chamber perfused with modified aCSF (in mM: 119 NaCl, 25 NaHCO3, 18 Glucose, 2.5 KCl, 1.25 NaH2PO4, 1.4 CaCl2, 1 MgCl2, 3 Na Pyruvate and in µM: 400 Ascorbate and saturated with 95%O2/5%CO2) maintained at 32–34°C, at a flow rate of 2–3 mL per minute. Patch pipettes (resistance 5–8 MΩ) were pulled on a laser micropipette puller (Model P-2000, Sutter Instrument Co., Sunnyvale, CA) and filled with one of the following intracellular solutions. Current-clamp recordings of spike activity used a KGluconate based intracellular solution (in mM: 137.5 KGluconate, 2.5 KCl, 10 HEPES, 4 NaCl, 0.3 GTP, 4 ATP, 10 phosphocreatine, pH 7.5). Voltage-clamp recordings for IPSC measurements used a CsMeSO4 based intracellular solution (in mM: 114 CsMeSO4, 4 NaCl, 10 HEPES, 5 QX314, 0.3 GTP, 4 ATP, 10 phosphocreatine, pH 7.5). Alexa Fluor 488 or Alexa Fluor 568 was commonly added to the intracellular solution to aid cell visualization and post-hoc reconstruction. In some experiments the following were added as indicated in the text: 10 µM CNQX or 5 µM NBQX, 50 µM D-AP5, and 10 µM GABAzine, each diluted from stock in the aCSF. All drugs were obtained from Tocris Biosciences, Inc. Intracellular recordings were made using a MultiClamp700B amplifier (Molecular Devices, Sunnyvale, CA) interfaced to a computer using an analog to digital converter (PCI-6259; National Instruments, Austin, TX) controlled by custom written scripts (available at dudmanlab.org) in Igor Pro (Wavemetrics, Eugene, OR). Photostimulation was carried out using a dual scan head raster scanning confocal microscope and control software developed by Prairie Systems (Middleton, WI) and incorporated into a BX51 upright microscope (Olympus America, Inc., Center Valley, PA).

Viral overexpression of ChR2

An adeno-associated virus (kindly provided by the Sternson laboratory at Janelia Farm Research Campus) with a cre-dependent ChR2 transgene was injected into the SN of mice in which cre-recombinase was expressed under the control of the glutamic acid decarboxylase 2 gene in a fashion similar to that previously described47. Briefly, under deep anaesthesia a small craniotomy was made over the SN (−3 mm AP, 1 mm ML, −4.2 mm DV). A glass pipette was used to pressure inject small volumes of virus (20–50 nL per injection site). Animals were allowed to recover for at least 2 weeks following infection and before in vitro brain slices were prepared as described above.

Optical stimulation and imaging

The optics were designed to minimize the spread of the laser in the x,y dimensions of the focal plane while accentuating the focus in z by underfilling the back aperture of the objective. Stimulation intensity was controlled by pulse duration (0.2–1 ms). Stimulation typically consisted of 9×9 maps of stimulation sites with independent stimuli being delivered in a pseudo-random (non-neighbor) sequence at an interstimulus interval of ≥150 ms. Stimulation strength was modulated by gating the laser at maximal power (473 nm; AixiZ or 488 nm; BlueSky Research) with varying durations using timing signals from an external pulse controller, PrairieView software, and the internal power modulation circuitry of the laser or an external Pockels cell (Conoptics, Danbury, CT) with indistinguishable results.

Wide-field activation of ChR2 was accomplished using blue LED (470 nm; ThorLabs, Newton, NJ) transmitted through the fluorescence light path of the BX51 microscope. LED intensity and timing were controlled through a TTL-triggered variable current source (ThorLabs, Newton, NJ).

Computational modeling

The simple model described here was inspired by the canonical theta neuron model from Gutkin and Ermentrout48. The DAm was implemented in Matlab R2011a (Mathworks, Natick, MA) with minor modification from previous models. We modified the model to simulate a neuron with an intrinsic bias towards tonic activity that could be perturbed by input stimuli. The phase of the oscillator was solved using numerical integration of a differential equation for phase:

dθ/dt = b(1 − cos θ) + K(1 + cos θ),

where K = Wnoise + Istim.

A ‘spike’ was determined as the phase reset at θ = π. The intrinsic bias b was introduced to drive a tonically active oscillator independent of stimuli. A large parameter space of the model was examined (Supplementary Fig. 8) by altering the magnitude of b (minimum: 0.0005 a.u., maximum: 0.02 a.u.), and the amplitude (±0.005 a.u., ±0.5 a.u.) and decay time constant (10 ms, 200 ms) of the exponentially decaying Istim. For each parameter combination 100 iterations were run. PSTHs were calculated with 1 ms resolution and smoothed with a Gaussian kernel (σ = 10 ms). The full-width half maximum of inhibition and pause duration were calculated as in the analysis of in vitro recording data.
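A minimal sketch of how such a phase model can be integrated is given below. It is written in Python rather than the Matlab used for the published model, and several details are assumptions: forward Euler integration with a 1 ms step, Gaussian noise for Wnoise, and the helper names `make_input` and `simulate_theta`. It should be read as an illustration of the equation above, not as the authors' implementation.

```python
import math
import random

def make_input(n_steps, stim_onset, stim_amp, stim_tau, noise_sd, seed=0):
    """Build K = Wnoise + Istim for each 1 ms step (assumed time step).

    Istim decays exponentially from stim_amp after stim_onset (steps);
    Wnoise is drawn as zero-mean Gaussian noise (an assumption).
    """
    rng = random.Random(seed)
    k_values = []
    for step in range(n_steps):
        stim = 0.0
        if step >= stim_onset:
            stim = stim_amp * math.exp(-(step - stim_onset) / stim_tau)
        k_values.append(rng.gauss(0.0, noise_sd) + stim)
    return k_values

def simulate_theta(b, k_values, theta0=0.0):
    """Euler-integrate dθ/dt = b(1 - cos θ) + K(1 + cos θ); return spike steps."""
    theta = theta0
    spikes = []
    for step, k in enumerate(k_values):
        theta += b * (1.0 - math.cos(theta)) + k * (1.0 + math.cos(theta))
        if theta >= math.pi:          # a 'spike' is the phase reset at θ = π
            spikes.append(step)
            theta -= 2.0 * math.pi    # wrap the phase around the circle
    return spikes

# One illustrative trial: a brief inhibitory input delivered at 1000 ms.
k = make_input(3000, stim_onset=1000, stim_amp=-0.5, stim_tau=10.0, noise_sd=0.01)
spike_steps = simulate_theta(b=0.01, k_values=k)
```

For constant positive b and K the substitution x = tan(θ/2) gives dx/dt = b x² + K, so the noise-free period is π/√(bK); this provides a convenient check on the integration before noise and stimuli are added.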

Supplementary Material

1

Acknowledgements

Joe Paton, Winfried Denk, Alla Karpova, Albert Lee, Jeff Magee, and Gabe Murphy provided critical feedback at various stages of preparation of the manuscript and progression of the project. We are indebted to the extensive feedback from our colleagues following presentation of this work at internal seminars on the Janelia Farm Research Campus and the helpful comments of 3 anonymous reviewers.

WXP and JTD designed the project. JTD, WXP, JB analyzed the data and wrote the manuscript. WXP performed the in vivo recording and behavioral experiments. JB performed the in vitro experiments. JTD implemented the computational model and performed a minority of the experiments. We thank members of the lab for critical reading and feedback on the manuscript. JB is a graduate scholar in the Cambridge-Janelia Farm Graduate Program. JTD is a JFRC Fellow of the Howard Hughes Medical Institute. This work was supported by funding from the Howard Hughes Medical Institute.

Footnotes

The authors declare no competing financial interests.

References
