SUMMARY
It has been proposed that sound information is separately streamed into onset and offset pathways for parallel processing. However, how offset responses contribute to auditory perception remains unclear. Here, loose-patch and whole-cell recordings in awake mouse primary auditory cortex (A1) reveal that a subset of pyramidal neurons exhibit a transient "Off" response, with its onset tightly time-locked to the sound termination and its frequency tuning similar to that of the transient "On" response. Both responses are characterized by excitation briefly followed by inhibition, with the latter mediated by parvalbumin (PV) inhibitory neurons. Optogenetically manipulating sound-evoked A1 responses at different temporal phases or artificially creating phantom sounds in A1 further reveals that the A1 phasic On and Off responses are critical for perceptual discrimination of sound duration. Our results suggest that perception of sound duration is dependent on precisely encoding its onset and offset timings by phasic On and Off responses.
In brief
The mechanism for coding sound duration is not clear. In awake behaving mice, Li et al. show that the transient, temporally precise offset responses of primary auditory cortical neurons to the termination of sounds are critical for the encoding and perceptual recognition of sound duration.
INTRODUCTION
To generate a faithful auditory representation, both sound appearance and disappearance (i.e., onset and offset, respectively) must be encoded by the auditory system. While the auditory nerve and many neurons in the brain increase and decrease firing activity following sound onsets and offsets, respectively, other auditory brain neurons exhibit increases of activity following sound onsets and/or offsets. That is, similar to the visual system, the stimulus onset and offset information is processed in parallel, and dual "On" and "Off" pathways are likely utilized to represent sound onsets and offsets, respectively (Kopp-Scheinpflug et al., 2018; Liu et al., 2019; Scholl et al., 2010). Since On responses are more prevalent than Off responses in the auditory system, especially in anaesthetized animals (Phillips et al., 2002; Young and Brownell, 1976), most previous studies have been focused on the On response to understand how auditory attributes are coded by different firing patterns (Ehret and Merzenich, 1988; Godfrey et al., 1975; Hancock and Voigt, 2002; Recanzone, 2000; Rhode and Kettner, 1987; Rhode and Smith, 1986; Semple and Kitzes, 1985) and the underlying neuronal and synaptic mechanisms (Tan et al., 2004; Wu et al., 2006, 2011; Zhang et al., 2003, 2011; Zhou et al., 2012, 2015). As such, our understanding of functional contributions of Off responses to auditory representation and perception has remained limited (Kopp-Scheinpflug et al., 2018).
Neurons exhibiting Off responses have been found throughout the ascending auditory system, including the dorsal cochlear nucleus (Ding et al., 1999; Suga, 1964), auditory brainstem (Dehmel et al., 2002; Henry, 1985a, 1985b; Kopp-Scheinpflug et al., 2018; Kulesza et al., 2003), inferior colliculus (Akimov et al., 2017; Kasai et al., 2012), medial geniculate body (MGB) of the thalamus (Anderson and Linden, 2016; He, 2001, 2002), and the primary auditory cortex (A1) (Baba et al., 2016; Chong et al., 2020; Deneux et al., 2016; Fishman and Steinschneider, 2009; Joachimsthaler et al., 2014; Keller et al., 2018; Liu et al., 2019; Qin et al., 2007; Recanzone, 2000; Scholl et al., 2010; Sollini et al., 2018; Sołyga and Barkat, 2019; Tian et al., 2013). These neurons may be connected through or contribute to a dedicated Off-response relay pathway (Kopp-Scheinpflug et al., 2011, 2018; Liu et al., 2019; Scholl et al., 2010). Although Off responses have been proposed to play a role in the perception of sound duration or detection of gaps in a continuous sound (Casseday et al., 1994; He, 2001; Kopp-Scheinpflug et al., 2011; Malone et al., 2015; Qin et al., 2009), this idea has not been tested directly in behavioral paradigms for auditory perception, largely due to technical difficulties in manipulating Off responses in a reversible and temporally precise manner (Weible et al., 2014). As such, how sound duration, a most fundamental attribute of acoustic stimuli important for communication and echolocation, is faithfully encoded in the central auditory system has remained largely unclear (Alluri et al., 2016; Casseday et al., 1994; Fuzessery and Hall, 1999; Sayegh et al., 2011).
In this study, using in vivo loose-patch and whole-cell recordings in A1 of awake mice, we characterized properties of transient Off responses in pyramidal as well as two types of inhibitory (I) neurons and then investigated synaptic mechanisms underlying the response in pyramidal neurons. By optogenetically manipulating A1 activity during specific temporal phases of auditory responses in mice performing auditory tasks, we found that the phasic Off response in conjunction with the phasic On response played an essential role in perceptual recognition of sound duration. Finally, we artificially created transient On- and Off-like activity in A1 and found this was sufficient to allow duration-specific sound perception. These results demonstrate that A1 phasic Off responses are essential for encoding and perception of sound duration.
RESULTS
Noise-induced On and Off responses in awake mouse A1
By in vivo cell-attached loose-patch recording, we examined neuronal responses to auditory stimuli of different durations in A1 of awake head-fixed mice (Figure 1A). The recording strongly biased sampling toward pyramidal neurons (see STAR Methods). White noise stimuli of 15 durations (20–300 ms, spaced at 20 ms) were applied in a randomized sequence. As shown by an example neuron, transient spiking responses were evoked following both the onset and offset of noise stimulation (Figure 1B). To display a response-duration function, we measured amplitudes of responses to durations ≥40 ms, for which the offset (Off) response could be clearly separated from the transient onset (On) response (Figures S1A-S1F). The amplitude of the On response was more-or-less constant across durations in the example neuron, whereas that of the Off response was decreased with increasing durations (Figure 1C). At the population level, the Off response showed an overall monotonic decrease in amplitude with increasing durations (Figure 1D), although at the individual-cell level, the response-duration function exhibited some degree of diversity (Figures S1H–S1P). We measured onset latencies of On and Off responses (see Figure S1G and STAR Methods). Their difference (Δlatency, Off minus On) precisely matched with the stimulus duration (Figure 1E as an example cell), as shown by the ∼1 slope (Figure 1F) and the small y intercept (Figure 1G) of the linear fitting of the Δlatency-duration function. This indicates that Off responses are tightly time-locked to the sound termination and therefore are a bona fide response to the cessation of ongoing sound.
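The Δlatency-duration analysis described above amounts to an ordinary least-squares line fit. The following is a minimal sketch, not the authors' analysis code. It assumes that both latencies are measured from sound onset when Δlatency is computed. The function name `fit_line` and all numerical values are hypothetical and chosen only to show that a constant lag between Off and On onset latencies yields a slope of 1 and a small y intercept.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values for illustration only (not data from the study).
durations_ms = [40, 100, 160, 220, 300]
on_latency_ms = [12.0] * len(durations_ms)          # On onset, from sound onset
off_latency_ms = [d + 20.0 for d in durations_ms]   # Off onset, from sound onset

# Delta latency (Off minus On) for each duration.
delta_latency_ms = [off - on for off, on in zip(off_latency_ms, on_latency_ms)]

slope, intercept = fit_line(durations_ms, delta_latency_ms)
print(f"slope = {slope:.2f}, y intercept = {intercept:.1f} ms")
```

Under these assumptions the y intercept equals the difference between the Off latency (relative to sound termination) and the On latency, which is why a small intercept accompanies a slope near 1.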
Figure 1. Off responses in awake mouse A1 under noise stimulation.
(A) Schematic illustration of experimental setup. The animal was head fixed via a headpost (P) but could run freely on a running plate. Sound (S) was applied to one ear, and patch recording (R) was performed in the contralateral A1.
(B) Top: raster plot of spikes in response to noise of 5 out of 15 testing durations (marked by gray bars) for an example A1 neuron. Bottom: corresponding post-stimulus spike time histogram (PSTH). FR, firing rate.
(C) Evoked spike numbers of On and Off responses at different sound durations for the same neuron shown in (B). Bar represents SD in all figure panels.
(D) Population average of normalized FRs of Off responses at different durations (n = 29 cells).
(E) Plot of Δlatency (Off minus On) versus duration for the neuron shown in (B). Red line is the best-fit linear regression line.
(F) Distribution of slopes of the best-fit linear regression line for recorded A1 neurons showing both On and Off responses (n = 29 cells). Red arrow points to the mean value.
(G) Distribution of y intercepts of the linear regression line (n = 29 cells).
(H) Response amplitudes of On and Off responses for three types of neurons: those exhibiting On-responsiveness only, those exhibiting both On and Off responses, and those exhibiting Off-responsiveness only. Light gray represents individual cells. Column represents mean ± SD.
(I) Pie chart showing the proportion of nonresponsive (NR) and each type of responsive neurons in (H) (45.9%, 39.7%, 9.3%, and 5.1%, respectively).
(J) Distribution of Off-responsiveness indices (ORIs, calculated at 100-ms duration) in recorded A1 neurons (n = 191 cells in total).
(K) Onset latencies of On and Off spiking responses. Light gray represents individual neurons. Solid dark gray represents mean ± SD. Top inset: dotted lines illustrate measurement of the onset latency of the Off response. ***p = 2.2 × 10⁻¹¹, Wilcoxon rank-sum test, z = 6.68.
(L) Jitter (i.e., standard deviation) of first-spike timing for On and Off responses. ***p = 1.03 × 10⁻²³, Wilcoxon rank-sum test, z = 10.04.
(M) Distribution of half-peak durations of PSTH profiles of On (red) and Off (black) responses. Top inset: red line illustrates measurement of half-peak duration.
(N) Distribution of latencies to the peak Off response. Top inset: red dashed lines illustrate measurement of latency to the peak response.
Consistent with our previous observation (Liang et al., 2019), a large fraction of the recorded cells was not responsive to noise stimuli (see STAR Methods). In neurons that did exhibit evoked spike responses, 73.8% (141 out of 191) exhibited an On response only, 9.4% (18 out of 191) exhibited an Off response only, and 16.8% (32 out of 191) exhibited both On and Off responses (Figures 1H and 1I). We used an Off-responsiveness index (ORI) to quantify the relative strength of Off/On responses in an individual cell, with ORI = 1 indicating Off-responsiveness only and ORI = −1 indicating On-responsiveness only. The distribution of ORIs of recorded neurons (calculated at 100-ms duration) is shown in Figure 1J. Layer (L)5 contained proportionally more neurons exhibiting Off responses than other cortical layers (Figures S2A and S2B). Neurons exhibiting On-responsiveness only, Off-responsiveness only, and On-Off responses did not differ in the level of spontaneous firing activity (Figure S2C). ORI calculated at a longer duration (300 ms) was not significantly different from that calculated at 100-ms duration (Figure S2D).
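The exact ORI formula is not given in this section. A minimal sketch of one common contrast-index form that satisfies the stated endpoints (ORI = 1 for Off-responsiveness only, ORI = −1 for On-responsiveness only) is shown below. The formula (Off − On)/(Off + On), the function name, and the example spike counts are assumptions for illustration and may differ from the definition in STAR Methods.

```python
def off_responsiveness_index(on_response, off_response):
    """Assumed contrast index: (Off - On) / (Off + On).

    Returns +1 for an Off-only cell, -1 for an On-only cell,
    and 0 when On and Off responses are equally strong.
    """
    total = on_response + off_response
    if total <= 0:
        raise ValueError("cell has no evoked response; ORI is undefined")
    return (off_response - on_response) / total


# Hypothetical evoked spike counts (On, Off) for three cells.
for label, on, off in [("On only", 3.0, 0.0),
                       ("On-Off", 2.0, 2.0),
                       ("Off only", 0.0, 1.5)]:
    print(label, off_responsiveness_index(on, off))
```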
Compared with On responses, Off responses had a longer onset latency (relative to the timing of sound termination) (Figure 1K) and a larger jitter (i.e., variation) of first-spike timings (Figure 1L). They were, in general, as transient as On responses, as reflected by the short half-peak duration of the post-stimulus spike time histogram (PSTH) profile (Figure 1M) and the short latency to the peak evoked firing rate (FR) (Figure 1N).
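Jitter is defined in the Figure 1L legend as the standard deviation of first-spike timing. The sketch below shows that computation on hypothetical first-spike times. Whether the sample or the population standard deviation was used is not stated, so the use of the sample form (`statistics.stdev`) here is an assumption, as are the function name and the values.

```python
from statistics import stdev


def first_spike_jitter(first_spike_times_ms):
    """Jitter as the sample standard deviation of first-spike times (ms)."""
    if len(first_spike_times_ms) < 2:
        raise ValueError("need first-spike times from at least two trials")
    return stdev(first_spike_times_ms)


# Hypothetical first-spike times across trials (not data from the study).
on_first_spikes_ms = [11.8, 12.1, 12.0, 11.9, 12.2]    # relative to sound onset
off_first_spikes_ms = [18.0, 21.5, 19.2, 23.1, 20.2]   # relative to sound offset

print("On jitter (ms):", first_spike_jitter(on_first_spikes_ms))
print("Off jitter (ms):", first_spike_jitter(off_first_spikes_ms))
```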
Synaptic inputs underlying Off responses
We next examined synaptic inputs underlying the Off and On responses by whole-cell voltage clamp recording in awake mice, following our previous studies (Li et al., 2019; Liang et al., 2019; Zhou et al., 2014). The membrane potential of recorded cells was clamped at −70 mV and 0 mV to isolate excitatory (E) and inhibitory (I) synaptic currents, respectively (see STAR Methods). We recorded from A1 L5 neurons. As shown by an example cell, both E and I responses were evoked following the onset and offset of noise stimulation across different testing durations (Figure 2A). We measured the peak amplitude (Figure 2B) as well as the onset latency (Figure 2C) of E and I responses. While 100% of recorded neurons (62 out of 62) exhibited synaptic responses to the stimulus onset, only 41.9% of them (26 out of 62) exhibited synaptic responses to the stimulus offset, and none of them had a synaptic offset response only (Figure 2D), again demonstrating that Off responses are less dominant. In cells exhibiting both On and Off synaptic responses, the Off response was weaker than the On response for both E and I (Figure 2E, measured at 100-ms duration). Nevertheless, the E/I ratio was similar for On and Off responses (Figure 2F), and the normalized Off response amplitude (Off/On ratio) was not significantly different between E and I (Figure 2G). In addition, the onset latency of synaptic responses was longer for Off than for On responses for both E and I (Figure 2H), and the jitter of onset latencies was larger for Off than On synaptic responses (Figure 2I), in line with the spiking response results. Furthermore, the Δlatency (I minus E) was small for both On and Off responses (1.3 ± 0.8 ms for On; 2.4 ± 1.6 ms for Off, calculated at 100-ms duration), although that of the Off response was slightly longer (Figure 2J). 
In neurons for which different stimulus durations were tested, we found that neither the normalized amplitude of Off synaptic responses (for both E and I) (Figure 2K) nor the Δlatency (I minus E) of Off synaptic responses (Figure 2L) was modulated by the sound duration.
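The E/I ratio and the normalized Off amplitude (Off/On ratio) are arithmetically linked: if E and I are scaled down by the same factor from On to Off, the E/I ratio is unchanged. The sketch below illustrates this with hypothetical peak amplitudes in arbitrary units. The function names and values are assumptions for illustration, not the authors' analysis.

```python
def ei_ratio(exc_peak, inh_peak):
    """Ratio of excitatory to inhibitory peak amplitude."""
    return exc_peak / inh_peak


def off_on_ratio(on_peak, off_peak):
    """Off response amplitude normalized to the On response of the same cell."""
    return off_peak / on_peak


# Hypothetical peak amplitudes (arbitrary units), not data from the study.
e_on, e_off = 2.0, 1.0
i_on, i_off = 4.0, 2.0

# Equal Off/On scaling for E and I ...
print(off_on_ratio(e_on, e_off), off_on_ratio(i_on, i_off))
# ... implies an equal E/I ratio for On and Off responses.
print(ei_ratio(e_on, i_on), ei_ratio(e_off, i_off))
```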
Figure 2. Synaptic mechanisms for Off responses in L5 E neurons.
(A) Average traces of evoked excitatory (E) and inhibitory (I) synaptic currents by noise of three different durations in an example L5 A1 neuron. Scale: 200 pA, 50 ms.
(B) Peak amplitudes of synaptic conductance in response to noise of different durations for the same neuron shown in (A).
(C) Latencies of synaptic responses for the same neuron shown in (A).
(D) Proportion of recorded neurons exhibiting On synaptic responses only (58.1%) and those exhibiting both On and Off synaptic responses (41.9%).
(E) Peak amplitudes of E On (E_on), E Off (E_off), I On (I_on), and I Off (I_off) responses. Paired t test (E_on versus E_off, t = 4.27, ***p = 2.5 × 10⁻⁴; I_on versus I_off, t = 5.33, ***p = 1.61 × 10⁻⁵), n = 26 cells. Data points for the same cells are connected with a line.
(F) E/I ratios for On and Off responses.
(G) Off/On ratios for E and I synaptic responses.
(H) Onset latencies for synaptic On and Off responses. Wilcoxon signed-rank test (E_on versus E_off, z = 4.42, ***p = 5.96 × 10⁻⁸; I_on versus I_off, z = 4.42, ***p = 5.96 × 10⁻⁸), n = 26 cells.
(I) Jitters of onset latencies for synaptic On and Off responses. Wilcoxon signed-rank test (E_on versus E_off, z = 4.44, ***p = 2.98 × 10⁻⁸; I_on versus I_off, z = 4.44, ***p = 2.98 × 10⁻⁸), n = 26 cells.
(J) ΔLatency of synaptic responses (I - E) for On and Off responses. **p = 0.00613, Wilcoxon signed-rank test, z = 2.67, n = 26 cells.
(K) Average peak amplitudes of Off synaptic responses and Off/On ratios (red) at different durations (n = 26 cells).
(L) ΔLatency (I - E) for On and Off synaptic responses at different durations (n = 26 cells).
Off responses of inhibitory neurons
Since inhibitory Off responses were found, we reasoned that at least some A1 inhibitory neurons should exhibit spiking Off responses. To test this, we focused on parvalbumin (PV) and somatostatin (SOM) neurons, which are thought to provide feedforward and feedback I, respectively (Fishell and Rudy, 2011; Li et al., 2015; Ma et al., 2010; Zhang et al., 2011). We injected adeno-associated virus (AAV) encoding Cre-dependent channelrhodopsin 2 (ChR2) in A1 of PV-Cre/SOM-Cre crossed with Ai14 (Cre-dependent tdTomato) reporter mice (Figures 3A and 3E, top). Using loose-patch recording, we optogenetically identified PV or SOM neurons based on their spiking responses time-locked to blue LED light pulses applied (Figures 3A and 3E, bottom). Phasic Off responses were observed in PV and SOM neurons (Figures 3B and 3F), and their amplitudes were not modulated much by the sound duration (Figure 3C, 3D, 3G, and 3H). The distribution of ORIs for PV or SOM neurons was not dramatically different from that of E neurons (Figure 3J), except that the proportion of On-Off neurons was relatively larger, and that of Off-only neurons was smaller in the inhibitory neuron populations (Figures 3I–3K). In the noise-responsive populations, 62% (87 out of 141) of PV neurons exhibited On responses only, 4% (6 out of 141) exhibited Off responses only, and 34% (48 out of 141) exhibited both On and Off responses; meanwhile, 67% (18 out of 27) of SOM neurons exhibited On responses only, 0% exhibited Off responses only, and 33% (9 out of 27) exhibited both On and Off responses (Figure 3I). Overall, SOM neurons showed weaker evoked responses (for both On and Off) than PV neurons (Figures 3L–3N), consistent with our previous studies (Li et al., 2015; Ma et al., 2010; Mesik et al., 2015). 
The amplitude of Off responses was similar between PV and pyramidal neurons (Figure 3L), which is slightly different from a previous study showing that Off responses measured by a gap-in-noise are slightly stronger in PV than presumed pyramidal neurons (Keller et al., 2018). Similar to E neurons, the Δlatency between Off and On responses of the inhibitory neurons was well matched with sound duration (Figures S2E–S2J). Furthermore, we compared the onset latency of spiking Off responses of inhibitory neurons and that of Off synaptic I in E neurons. We found that the onset of evoked spiking in PV neurons was slightly earlier than that of the synaptic I, whereas that of evoked spiking of SOM neurons was significantly delayed (Figure 3O). Finally, SOM neuron responses had a larger jitter of onset latencies than PV neurons (Figure 3P). Together, these data indicate that PV neurons are capable of providing the temporally precise feedforward I underlying the Off spiking response of E neurons. On the other hand, SOM neurons are only able to provide feedback I, as concluded previously (Li et al., 2015; Ma et al., 2010).
Figure 3. Off responses of I neurons.
(A) Top: confocal image of a brain section showing ChR2-EYFP expression in tdTomato-labeled PV I neurons in the A1 region. Scale bar: 500 μm. Bottom: raster plot of spike responses of an example PV neuron to pulses of blue LED light stimulation (marked by blue bars, 50 ms for each pulse).
(B) Top: raster plot of spikes in response to noise of different testing durations (marked by gray bars) for an example PV I neuron. Bottom: corresponding PSTH.
(C) Evoked spike numbers of On and Off responses by different stimulus durations for the same neuron shown in (B).
(D) Population average of normalized FRs of Off responses at different durations in recorded A1 PV inhibitory neurons (n = 19 cells).
(E–H) As in (A)–(D), but for SOM neurons (n = 9 cells).
(I) Proportions of NR and each type of responsive PV (left) and SOM (right) neuron (3.4%, 59.6%, 32.9%, and 4.1% for PV neurons, respectively; 41.3%, 39.1%, 19.6%, and 0% for SOM neurons, respectively).
(J) Distributions of ORIs for pyramidal (black), PV (red), and SOM (blue) populations (n = 191, 141, and 27 neurons, respectively).
(K) Percentages of pyramidal (black), PV (red), and SOM (blue) neurons showing On-responsiveness only, both On and Off responses, and Off-responsiveness only, respectively. p < 0.01, Cochran-Armitage test.
(L) Comparison of evoked spike numbers for On and Off responses among pyramidal, PV, and SOM neurons. Column represents mean ± SD. One-way ANOVA (F = 9.01 for On group and F = 5.21 for Off group) and post hoc test (***p < 0.001, **p < 0.01, *p < 0.05).
(M) Evoked spike numbers of On and Off responses for three types of neurons in the PV population (n = 87, 48, and 6 cells, respectively).
(N) As in (M), but for SOM neurons (n = 18, 9, and 0 cells, respectively).
(O) Onset latencies of PV neuron spiking responses, I input to pyramidal neurons, and SOM neuron spiking responses. Wilcoxon rank-sum test (PV_spike versus Inh, z = 4.56, ***p = 5.01 × 10⁻⁶; Inh versus SOM_spike, z = 3.22, **p = 0.0012), n = 48, 26, and 9, respectively.
(P) Jitters of first-spike timing of Off response in PV and SOM neurons. **p = 0.0018, Wilcoxon rank-sum test, z = 3.12; n = 48 and 9, respectively.
Off responses under tone stimulation
Comparing responses to noise and best-frequency (BF) tones of the same intensity and duration in the same A1 pyramidal neurons, we found that Off responses to tones were weaker than those to noise, whereas On responses did not show a difference (Figure 4A). Consistently, neurons exhibiting Off responses were relatively sparser under tone than noise stimulation (Figure 4B) across different frequency bands within A1, whereas Off responses to noise were more frequently observed in high-frequency bands (Figure S3A). Whole-cell recording also revealed that Off synaptic responses were weaker under tone than noise stimulation for both E and I (Figures 4C and 4D) and that they had longer latencies under tone stimulation (Figure S3B). Despite this difference, under tone stimulation, a similar E/I ratio was found for On and Off synaptic responses (Figure S3C), as were a similar normalized Off response amplitude for E and I (Figure S3D) and a longer synaptic Δlatency (I minus E) for the Off than On response (Figure S3E), similar to noise stimulation.
Figure 4. Off responses under tone stimulation.
(A) Comparison of On and Off response amplitudes between best-frequency (BF) tone (T) and noise (N) stimulation. Data points for the same cells are connected with a line. ***p = 3.05 × 10⁻⁵, Wilcoxon signed-rank test, z = 3.49, n = 16 cells.
(B) Top: distribution of ORIs under tone (red) or noise (black) stimulation. Bottom: cumulative distribution of ORIs. p < 0.001, K-S test, n = 343 and 191 cells for tone and noise stimulation, respectively.
(C) Tone- and noise-evoked I and E synaptic responses in two example neurons. Scale: 100 pA, 50 ms.
(D) Relative amplitudes of Off synaptic responses (normalized to the counterpart On response in the same cell) under tone and noise stimulation. Wilcoxon signed-rank test (Exc_T versus Exc_N, z = 4.44, ***p = 2.98 × 10⁻⁸; Inh_T versus Inh_N, z = 4.44, ***p = 2.98 × 10⁻⁸), n = 26 cells.
(E) Left: heatmap for evoked responses by different tone frequencies in an example pyramidal neuron. Dashed lines indicate the onset and offset of tone stimulation. Right: six other example neurons.
(F) BF of Off responses versus that of On responses in the same neurons (n = 25 pyramidal cells). Blue line is the identity line.
(G) Comparison of frequency-tuning bandwidths (measured at 60 dB SPL) between On and Off responses. **p = 0.00225, Wilcoxon signed-rank test, z = 2.45, n = 25 cells.
(H–J) As in (E)–(G), but for PV neurons. ***p = 3.43 × 10⁻⁵, Wilcoxon signed-rank test, z = 3.64, n = 20 cells.
(K) Top: evoked I and E On responses at different tone frequencies in an example cell. Bottom: evoked Off synaptic responses in the same neuron. Scale: 100 pA, 50 ms.
(L) BF of Off responses versus that of On responses for E (black) and I (red). Best-fit linear regression lines are shown. n = 11 cells.
(M) Comparison of tuning bandwidths of synaptic On and Off responses. Wilcoxon signed-rank test (E_on versus E_off, z = 2.75, **p = 0.00195; I_on versus I_off, z = 2.90, ***p = 9.77 × 10⁻⁴), n = 11 cells.
Comparing frequency tuning of phasic responses (Figure 4E), we found that Off and On responses exhibited a similar preferred frequency (Figure 4F). Analysis of frequency-intensity tonal receptive fields (TRFs) further revealed that they had a similar characteristic frequency (CF) (Figures S3F and S3G); that is, Off and On responses were largely co-tuned. This observation differs from previous reports (Anderson and Linden, 2016; Scholl et al., 2010), likely due to the anesthesia conditions in those studies. Nevertheless, the Off response exhibited narrower frequency tuning (Figures 4G and S3H) and a higher intensity threshold (Figure S3I) than the On response. Similar On-Off relationships were found for PV neuron responses as well (Figures 4H–4J and S3J–S3M).
We further examined frequency tuning of Off and On synaptic responses in presumed pyramidal neurons (Figure 4K). The Off and On synaptic responses exhibited a similar preferred frequency for both E and I (Figure 4L), and frequency tuning of the Off response was narrower than the On response for both E and I (Figure 4M). These synaptic response properties thus could account for the spiking response properties.
Off response is critical for duration detection
To understand whether the Off responses in A1 contribute to the encoding of sound duration, we employed a classical conditioning behavioral paradigm (Guo et al., 2014, 2019) for duration detection (Figure S4A) and then temporarily manipulated spiking activity of A1 pyramidal neurons during specific temporal phases. Mice were trained to lick for water reward (lick) upon detection of a 300-ms noise (60 dB SPL) and to refrain from licking (no-lick) upon detection of the same sound but of 50-ms duration (i.e., 50-/300-ms combination) (Figure S4B; see Figure S5 and STAR Methods for training details). The behavioral performance was quantified as the fraction of total trials with a correct response (hit + correct rejection) (Figure S4C), which increased over training sessions (Figure S4D). Training with other duration combinations (50/150, 50/600, 150/300, 300/50) was similarly efficient (Figure S4D), as shown by a similar accuracy rate at the last training session (Figure S4E). However, training with a 50-/80-ms combination was slower and less effective (Figures S4D and S4E), possibly because the difference between these durations is small. Well-trained animals exhibited "anticipatory" licks before onsets of reward (Guo et al., 2014) (Figures S5I–S5M). A series of control experiments further demonstrated that mice could learn to use duration information to differentiate between conditioned stimulus (CS+) and control (CS−) stimulus to predict reward (Figures S4–S6; see STAR Methods for details).
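The performance measure described above, the fraction of total trials with a correct response (hit + correct rejection), can be sketched as follows. The function name and the trial counts are hypothetical and serve only to illustrate the arithmetic.

```python
def fraction_correct(hits, misses, correct_rejections, false_alarms):
    """Fraction of all trials with a correct response.

    hits: lick on CS+ trials; misses: no lick on CS+ trials;
    correct_rejections: no lick on CS- trials; false_alarms: lick on CS- trials.
    """
    total = hits + misses + correct_rejections + false_alarms
    if total == 0:
        raise ValueError("no trials recorded")
    return (hits + correct_rejections) / total


# Hypothetical session: 50 CS+ trials and 50 CS- trials.
print(fraction_correct(hits=45, misses=5, correct_rejections=40, false_alarms=10))
```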
To silence A1 pyramidal neuron spiking activity in a temporally specific manner, we injected AAV encoding Cre-dependent ChR2 in PV-Cre mice crossed with Ai14 mice to activate PV inhibitory neurons in A1 (Li et al., 2013; Xiong et al., 2015) (Figure 5A). Based on the recorded activity patterns, the response to a sound can be divided into four temporal phases (Figure 5B): phase 1 contains the phasic On response, phase 2 may contain delayed On or sustained responses, phase 3 contains the phasic Off response, and phase 4 may contain delayed Off responses. We attempted to silence A1 activity during each of these temporal phases or a combination of them. To this end, we applied blue light illumination with a fading intensity, that is, one that decreased over time (Inagaki et al., 2019; see STAR Methods). In A1 recordings, we confirmed that the fading light completely blocked spiking responses during the entire course of illumination (Figures 5C–5E) without inducing significant rebound spiking activity following its termination (Figures S7A–S7E for 300-ms and 600-ms illumination; for 60-ms illumination: FR of 3.6 ± 3.2 Hz just before and 3.5 ± 2.9 Hz just after illumination, p > 0.05, paired t test, n = 26 cells), in contrast to illumination with a constant intensity (Figures S7D and S7E). Therefore, by temporally controlling the LED illumination, we could specifically block On/Off responses during different phases.
Figure 5. The Off response is essential for detecting sound duration.
(A) Left: schematic for viral injection and optic stimulation. Right: image showing ChR2 expression in A1 (left and right hemispheres) of a PV-Cre::Ai14 mouse. Scale bar, 500 μm.
(B) Schematic four temporal phases of a sound-evoked response.
(C) Left: PSTHs of noise-induced responses (On-Off type) in an example A1 neuron without (upper) and with (lower) coupling optogenetic activation (blue bar) of PV neurons in A1. Right: evoked spike numbers of On and Off responses in sound alone (S) and sound plus light stimulation (S + L) conditions for On-Off neurons. Paired t test (Onset S versus S + L, t = 2.98, *p = 0.03098; Offset S versus S + L, t = 5.22, **p = 0.00342), n = 6 cells.
(D) As in (C), but for On-only neurons. ***p = 5.1 × 10⁻⁷, paired t test, t = 12.59, n = 10 cells.
(E) As in (C), but for Off-only neurons. ***p = 4.8 × 10⁻⁴, paired t test, t = 6.12, n = 8 cells.
(F) Left: optogenetic silencing throughout phases 1–4. Middle and right: behavioral data from an example animal without (middle) and with (right) silencing phases 1–4 activity.
(G) Fraction of correct trials without (S) and with (S + L) optogenetic silencing in ChR2- and GFP-expressing animals. One-way ANOVA (p = 1.43 × 10⁻¹³, F = 76.15) and post hoc test (t = 12.11, ***p = 7.1 × 10⁻¹²), n = 8 animals for each group.
(H and I) As in (F) and (G), but for silencing phase 1 activity. One-way ANOVA (p = 4.9 × 10⁻⁹, F = 20.90) and post hoc test (t = 7.5, ***p = 1.9 × 10⁻⁷), n = 8 animals for each group.
(J and K) Silencing phases 2–4. One-way ANOVA (p = 5.5 × 10⁻⁵, F = 11.12) and post hoc test (t = 4.6, ***p = 4.6 × 10⁻⁴), n = 8 animals for each group.
(L and M) Silencing phase 2. One-way ANOVA (p = 0.085, F = 2.44), n = 8 animals for each group.
(N and O) Silencing phases 3 and 4. One-way ANOVA (p = 4.8 × 10⁻⁵, F = 11.12) and post hoc test (t = 4.54, ***p = 5.7 × 10⁻⁴), n = 8 animals for each group.
(P and Q) Silencing phase 3. One-way ANOVA (p = 9.46 × 10⁻⁶, F = 13.95) and post hoc test (t = 4.5, ***p = 6.30 × 10⁻⁴), n = 8 animals for each group.
(R and S) Silencing phase 4. One-way ANOVA (p = 0.313, F = 1.24), n = 8 animals for each group.
In animals well trained with the 50-/300-ms combination, we first blocked phases 1–4 of the responses to 300-ms noise (i.e., CS+ only). This greatly reduced performance (Figure 5F): the percentage of CS+ trials with licking decreased from 97% ± 2% to 27% ± 11% (Figure S7F), indicating that blocking both On and Off responses severely impaired the detection of the CS+ signal. No-lick trials were not significantly affected since the CS− signal (50-ms duration) was not coupled with LED illumination (Figure S7F); that is, the manipulation could reduce the overall accuracy by 50% at most (Figure 5G). The LED illumination did not affect performance in GFP control animals (Figure 5G). Next, we blocked only the phase 1 (i.e., phasic On) response. Surprisingly, blocking the phasic On response alone reduced the percentage of correct licking trials by about half (Figures 5H, 5I, and S7G), suggesting that the phasic On response is critical for the detection of duration. To further determine which other response component plays an important role, we specifically blocked phase 2 (potentially containing delayed On/sustained responses), phase 3 (containing only the phasic Off response), phase 4 (containing only delayed Off responses), as well as combinations of 2 + 3 + 4 and 3 + 4 (Figures 5J–5S). We found that whenever phase 3 was interfered with, there was a significant reduction of correct licking trials, whereas blocking only phase 2 or only phase 4 had minimal or no significant effect (Figures S7H–S7L). Together, our behavioral data suggest that the presence of both phasic On and phasic Off responses is critical for the detection of duration.
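The 50% ceiling on the accuracy reduction can be made explicit with a short worked calculation. It assumes, which is not stated in this section, that CS+ and CS− trials were presented in equal numbers. The symbols H (fraction of CS+ trials with licking) and C (fraction of CS− trials without licking) are introduced here for illustration only.

```latex
\begin{align*}
\text{Accuracy} &= \tfrac{1}{2}\,H + \tfrac{1}{2}\,C \\
\Delta\text{Accuracy} &= \tfrac{1}{2}\,\Delta H
  \qquad \text{(light is paired with CS+ only, so } C \text{ is unchanged)} \\
\lvert \Delta\text{Accuracy} \rvert &\le \tfrac{1}{2}
  \qquad \text{(since } 0 \le H \le 1\text{)} \\
H:\; 0.97 \to 0.27 \;&\Rightarrow\; \Delta\text{Accuracy} = \tfrac{1}{2}\,(0.27 - 0.97) = -0.35
\end{align*}
```

Under this assumption, the reported drop in CS+ licking would correspond to a decrease of about 35 percentage points in overall accuracy.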
Lower sensitivity to tone duration
We next wondered whether the utilization of Off responses for duration detection could be generalized. To address this issue, we trained mice to distinguish 50-ms versus 300-ms tones (4 or 32 kHz) presented at the same intensity as the noise used in the previous training. Although training with tones was equally successful in the beginning steps (i.e., training the animal to lick upon detection of a sound regardless of its duration), the subsequent steps of training to differentiate between durations progressed much more slowly with tones than with noise (Figure S5B). It took about 4 training sessions to reach plateau performance with noise but about 11 sessions with tones (Figure 6A). The slower progression of training with tones was also evident in the same group of animals first trained with tones and then with noise (Figure S8A), or vice versa (Figure S8B). Thus, mice had a lower sensitivity to the duration of tones than to that of noise. This is consistent with our observation that Off responses to tones were weaker than those to noise (Figure 4A).
Figure 6. Optogenetically enhancing Off responses facilitates detection of tone durations.
(A) Fraction of correct performance trials over training sessions by using different types of sound stimuli. N = 6 mice for each group. Bar = SD. Dashed line marks a saturation level for tone cues.
(B) Left: viral injection strategy. Right: image showing the expression of ArchT in A1 of a PV-Cre::Ai14 mouse. Scale bar, 1 mm.
(C) Left and middle: spontaneous FR of an example PV neuron without and with LED illumination. Right: FRs of PV neurons in LED-off and LED-on conditions. *p = 0.03321, paired t test, t = 3.19, n = 7 neurons.
(D) Schematic enhancement of the tone Off response by optogenetic silencing of PV neurons in phase 3.
(E) Left: raster plot of spikes of an example neuron in response to characteristic frequency (CF) tones of three different durations in sound-alone condition (S) or when sound was coupled with LED stimulation of a lower intensity (S + L1, 2 mW) or a higher intensity (S + L2, 5 mW). Right: corresponding PSTHs.
(F) Evoked spike numbers in S, S + L1, and S + L2 conditions. Dark triangles: cells showed an increase of Off spikes in either LED condition. Light gray triangles: cells that did not show Off spikes even in LED conditions. Wilcoxon signed-rank test (S versus S + L1, z = 2.61, **p = 0.00391; S + L1 versus S + L2, z = 2.75, **p = 0.00195; S versus S + L2, z = 2.75, **p = 0.00195), n = 10 neurons.
(G) Behavioral data for the fifth session in an ArchT and GFP mouse with LED stimulation.
(H) Comparison of performance in the fifth session with (ArchT) and without (GFP) optogenetic enhancement of Off responses. **p = 0.0043, Wilcoxon rank-sum test, n = 6 animals.
(I) Fraction of correct trials over training sessions in animals with (ArchT, green) and without (GFP, black) optogenetic enhancement of Off responses. n = 6 mice for each group. Bar = SD. Dashed line marks 80% level.
(J) Training sessions required to reach 80% correctness in ArchT and GFP groups. **p = 0.0022, Wilcoxon rank-sum test, n = 6 animals.
Enhancing Off-responsiveness improves detection of tone duration
Next, we wondered whether training with tones could be facilitated by artificially enhancing the Off responses to tones. To address this question, we expressed an inhibitory opsin (ArchT) in PV neurons of A1 (Figure 6B). In in vivo recordings, PV neurons were identified by their narrow spikes and by the suppression of their activity during green LED illumination with a fading pattern (Figure 6C). We then applied this optogenetic stimulation within a short window (60 ms) after the termination of tones to specifically enhance the phasic Off responses of pyramidal neurons (Figure 6D). As shown by an example A1 neuron, in the control condition, the cell displayed a robust phasic On response but no Off response across different testing durations (Figure 6E, "S"). Applying the LED illumination after the tone termination resulted in the emergence of a weak Off response (Figure 6E, "S + L1"), the amplitude of which was further enhanced by increasing the intensity of LED light (Figure 6E, "S + L2"). In 10 out of 33 recorded pyramidal neurons, we observed an emergence or enhancement of the Off response (Figure 6F). Accordingly, the percentage of cells displaying Off responses was increased by the LED illumination (Figure S8C). Similar enhancements of Off responses were also observed in noise-evoked activity (Figures S8D–S8F). Thus, by suppressing PV neurons, we could enhance A1 Off responses at the population level. We applied the optogenetic enhancement of Off responses during the entire training process and found that this greatly accelerated the learning process as compared with GFP control animals (Figures 6G and 6H). By the fifth training session, animals in which PV cells were suppressed already exhibited significantly better performance than control animals (Figures 6I and 6J). These behavioral data strongly suggest that A1 Off responses contribute positively to the perceptual recognition of sound duration.
Optogenetically creating phantom sounds
Finally, we reasoned that if A1 responses could determine duration detection, then artificially created activity in A1 simulating sound-evoked On and Off responses (i.e., phantom sounds) might directly allow duration detection. To test this idea, we attempted to optogenetically generate activity in A1 simulating a 300-ms sound, use this as the CS+ signal to train the animal, and then examine the performance with a real 300-ms sound cue (Figure 7A). We expressed ChR2 in A1 bilaterally using AAV vectors (Figure 7B). Before training, we first recorded local field potential (LFP) responses in A1 to 300-ms noise. LFP responses to LED activation of A1 were also recorded and compared with the sound-evoked responses. Through a negative feedback system, the pattern and duration of LED stimulation were adjusted (Figure 7C) so that the LED-generated LFPs with a transient On and transient Off response (Figure 7D) reached 90% similarity to the sound-induced LFPs, in terms of shape and amplitude (Figure 7E). We then used this LED stimulation pattern as the CS+ for training.
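The comparison between light-evoked and sound-evoked LFPs can be illustrated with the two measures plotted in Figure 7E, the ratio of peak amplitudes and the ratio of half-peak durations. The following is a minimal sketch only: the function names, the baseline-subtracted input traces, and the way half-peak duration is measured (contiguous samples around the peak at or above half of the peak amplitude) are assumptions made for illustration and are not taken from the feedback system used in the study.

```python
def peak_and_half_width(trace, dt_ms):
    """Return (peak amplitude, half-peak duration in ms) of one response.

    Assumes `trace` is baseline-subtracted and contains a single response
    (e.g., only the On or only the Off component). Hypothetical helper.
    """
    abs_trace = [abs(v) for v in trace]
    peak = max(abs_trace)
    peak_idx = abs_trace.index(peak)
    half = peak / 2
    # Walk outward from the peak while samples stay at or above half-peak.
    left = peak_idx
    while left > 0 and abs_trace[left - 1] >= half:
        left -= 1
    right = peak_idx
    while right < len(abs_trace) - 1 and abs_trace[right + 1] >= half:
        right += 1
    return peak, (right - left + 1) * dt_ms


def similarity_ratios(sound_trace, light_trace, dt_ms):
    """Ratios (light / sound) of peak amplitude and half-peak duration."""
    s_peak, s_width = peak_and_half_width(sound_trace, dt_ms)
    l_peak, l_width = peak_and_half_width(light_trace, dt_ms)
    return l_peak / s_peak, l_width / s_width


# Made-up traces sampled every 1 ms, for illustration only.
sound = [0, 1, 2, 4, 2, 1, 0]
light = [0, 1, 2, 3, 2, 1, 0]
amp_ratio, width_ratio = similarity_ratios(sound, light, dt_ms=1.0)
print(amp_ratio, width_ratio)
```

Ratios close to 1 on both measures would indicate that the light-evoked response resembles the sound-evoked response in amplitude and shape. A flat sound trace (zero peak) would need to be handled separately before taking ratios.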
Figure 7. Substituting sound stimulation with light stimulation.
(A) Experimental design. The mouse was trained to discriminate between a 50-ms sound and light-generated activity simulating On-Off responses to a 300-ms sound, and was then tested with real sounds.
(B) Expressing ChR2 in A1. Right: confocal images of right and left A1. Scale, 500 μm.
(C) Strategy for adjusting LED illumination patterns to match light-evoked and sound-evoked local field potentials (LFPs).
(D) Sound-evoked (left) and light-evoked (right) LFPs from an example recording site.
(E) Ratio of peak amplitude versus ratio of half-peak duration between sound- and light-evoked LFPs for On and Off responses. n = 16 recording sites.
(F) Behavioral data for the last training session before testing (left) and during a testing session (right) with 300-ms light stimulation and 300-ms noise as the CS+ cue, respectively.
(G) Comparison of performance using different stimulus combinations. "S50/L300": last training session using 300-ms light stimulation as the CS+ cue and 50-ms sound as the CS− cue. "S50/S300" and "S50/S150": testing session using 300-/150-ms sound as the CS+ cue and 50-ms sound as the CS− cue. "S50/L150": testing session with 150-ms light stimulation as the CS+ cue and 50-ms sound as the CS− cue. Experiments were performed in the same group of animals. One-way ANOVA (p = 0, F = 169) and post hoc test (S50/L300 versus S50/S300, t = 6.84, ***p = 0.00024; S50/S300 versus S50/S150, t = 9.32, ***p = 3.4 × 10^−5; S50/L300 versus S50/L150, t = 20.88, ***p = 1.4 × 10^−7), n = 8 animals.
(H) As in (G), but for substitution of 150-ms sound with 150-ms light stimulation. One-way ANOVA (p = 0, F = 115) and post hoc test (S50/L150 versus S50/S150, t = 8.5, ***p = 4.5 × 10^−9; S50/S150 versus S50/S300, t = 7.5, ***p = 8.4 × 10^−8; S50/L150 versus S50/L300, t = 14.2, ***p = 3.7 × 10^−15), n = 8 animals.
We first trained animals with 50-ms sound as the CS− cue and LED activation simulating a 300-ms sound as the CS+ cue (i.e., S50/L300 combination) (see STAR Methods). In well-trained animals whose performance reached >90% accuracy (Figures 7F, left panel, and 7G, first column), we then applied a real-sound combination (S50/S300 or S50/S150, with sequence randomized) to test the performance (Figure 7F, right panel). The S50/S300 combination resulted in high accuracy (>75%) (Figure 7G, second column), although lower than that obtained with the S50/L300 combination. Comparison of the early half versus the late half of trials in the test session did not reveal any significant difference (79% ± 5% and 80% ± 4% accuracy for the early and late 50 trials, respectively; p > 0.05, paired t test, n = 6 mice), indicating that the performance under S300 cues was unlikely to involve a relearning process. Therefore, our data suggest a degree of interchangeability between S300 and L300. The S50/S150 combination, however, resulted in a much lower accuracy (Figure 7G, third column). In the same group of animals, we also tested performance using a S50/L150 combination, which produced a much lower accuracy than the S50/L300 combination (Figure 7G, fourth and fifth columns). This indicates that the high performance with L300 is not a nonspecific effect of light application. Similar results were obtained in animals trained with a S50/L150 combination, with high performance achieved under testing with S50/S150 (Figure 7H). Together, these results strongly suggest that artificially generating activity in A1 alone to simulate sound-evoked activity can allow duration-specific perceptual detection.
DISCUSSION
In this study, we have characterized auditory responses of A1 neurons to noise and tones of different durations and found that a subpopulation of these neurons exhibited a phasic Off response with its onset timing tightly locked to the termination of sound. Temporally blocking Off responses in A1 significantly impaired perceptual recognition of sound duration. Conversely, temporally enhancing these responses improved detection of sound duration. Finally, artificially creating phasic On- and Off-response-like activity in A1 alone allowed duration-specific perceptual detection. This study thus elucidates the critical functional role of cortical phasic Off responses in the coding and perception of sound duration.
Off responses in auditory cortex
Although widely observed along the auditory axis, properties of the Off response and its relationship with the On response have not been fully examined in awake brains. Several recent electrophysiological studies of Off responses in rodents have been carried out under anesthesia (Anderson and Linden, 2016; Scholl et al., 2010; Sollini et al., 2018), which could strongly affect these responses (e.g., Qin et al., 2007). Recent imaging studies in awake auditory cortices examining the spatial organization of Off responses lack high temporal resolution in signal measurements, and only stimuli of long durations (seconds) have been used (Baba et al., 2016; Deneux et al., 2016; Liu et al., 2019). More recently, Off responses to short-duration frequency-modulated sounds, as in ultrasonic vocalizations, have been characterized in mice using single-unit recording (Chong et al., 2020). In the present study, the loose-patch recording approach enabled faithful acquisition of all spikes only from the targeted neuron, allowing a more reliable characterization of cell types in terms of On-/Off-responsiveness. By coupling this approach with optogenetics in transgenic mice, we were also able to characterize cell types of inhibitory (PV and SOM) neurons. We found that subpopulations of both E and I neurons exhibit transient responses with onset timings tightly locked (i.e., with small variabilities) to the onset and/or offset of sounds. The interval between the On and Off responses, thus, matches the sound duration well, which allows duration information to be precisely encoded by the timings of these phasic responses. It is interesting to note that in neurons with both On and Off responses elicited by tone stimuli, the On and Off responses exhibit a similar frequency preference. This suggests that for both simple and complex sounds, their durations can be coded at the individual-cell level in the cortex.
These largely co-tuned On and Off responses can be attributed to similarly tuned onset and offset synaptic (including both E and I) inputs to A1 neurons resulting from a relay from On-Off thalamic neurons (Anderson and Linden, 2016) or, alternatively, a convergence of separate On and Off thalamic inputs (Liu et al., 2019). Our observation of co-tuning of On and Off responses is different from a previous study (Scholl et al., 2010). As discussed above, this discrepancy could be due to different brain states examined (awake versus anesthetized).
A neural mechanism for encoding sound duration
Three possible neural mechanisms can be exploited by the auditory system to encode or detect sound duration: (1) neurons with sustained On responses can code the full course of sound stimuli (Malone et al., 2015; Qin et al., 2009); (2) neurons with transient On and Off responses can separately code the appearance and disappearance of sound, respectively (He, 2001; Kopp-Scheinpflug et al., 2018; Malone et al., 2015; Qin et al., 2009); and (3) neurons with Off response amplitudes tuned for sound duration can directly code specific durations (Alluri et al., 2016; Casseday et al., 1994; Fuzessery and Hall, 1999; He et al., 1997). Despite some correlational evidence gained from the properties of neuronal responses, these mechanisms have not been directly tested in the context of duration perception. In this study, our optogenetic perturbations of different temporal phases of sound-evoked responses show that perception of duration does not rely much on the sustained response. In addition, different from observations of duration-tuned neurons in the midbrain of frogs as well as echolocating and non-echolocating mammals (Alluri et al., 2016; Casseday et al., 1994; Fuzessery and Hall, 1999; Ma and Suga, 2007; Sayegh et al., 2011), in awake mouse A1, Off responses are generally not tuned to duration, as the recorded neurons predominantly exhibit either a flat or a monotonically increasing/decreasing response-duration function (Figures S1H–S1N). These results indicate that it is unlikely that mechanisms (1) and (3) underlie coding of sound duration in the cortex.
On the other hand, our study provides strong evidence to link the second mechanism to duration coding, based on both electrophysiological and behavioral experiments. Electrophysiologically, a significant proportion of cortical neurons exhibits both On and Off responses. The onset timings of spiking On and Off responses have only small jitters, which ensures that they can faithfully represent the onset and termination of sounds, respectively. Behaviorally, both the phasic On and Off responses in the auditory cortex are critical for perceptual recognition of sound duration, as shown by our loss-of-function and gain-of-function experiments. Blocking either the On or Off response markedly reduces the perceptual performance, while specifically enhancing the Off response improves the perceptual performance and accelerates associative conditioning. More importantly, artificially creating transient On- and Off-response-like activity in A1 alone can substitute for bona fide sounds to a degree in behavioral training and perceptual performance, suggesting that the two transients are likely sufficient for the perceptual recognition of a sound duration. Together, these pieces of evidence suggest that the duration of a sound is coded by the presence of phasic On and Off responses that define the onset and offset of the sound, respectively. It is likely that a similar mechanism is exploited for coding the duration of a gap in sound, where the start and end of a gap can be reversely represented by timings of offset and onset responses, respectively (Anderson and Linden, 2016; Keller et al., 2018).
It should be noted that the optogenetic approach has its limitations. For example, optogenetic inhibition will also affect the baseline activity level. In addition, the shaping of the LED light to match the LFP to sound-evoked activity is achieved only at the population level, rather than at the individual-cell level. Further technical development will be required to achieve more precise spatiotemporal control of brain circuits.
Synaptic circuit mechanisms underlying phasic Off responses
We applied whole-cell voltage clamp recording to understand how two salient properties of A1 responses—(1) temporal precisions of onset timings and (2) diversity of On/Off patterns—are determined at the synaptic level. First, the temporal precision of On/Off responses is reflected not only by the small jitters of first-spike latencies at the individual-cell level (Figure 1L), but also by the small variations among cells (Figure 1K). This can be directly attributed to the temporal precision of E inputs received, which have even smaller jitters in their onset timings than spiking responses (Figure 2I). In addition, similar to previous findings for onset responses that synaptic I always follows the onset of synaptic E by a few milliseconds (Sun et al., 2010; Wu et al., 2006, 2008; Zhou et al., 2014), here, in awake mice, we find that underlying the Off response is also a sequence of phasic synaptic E briefly followed by phasic synaptic I. The temporal delay of I relative to E is comparable to the On response (Figure 2J), suggesting that a similar feedforward inhibitory circuit motif contributes to both Off and On responses. Such temporally precise feedforward I is most likely provided by local PV inhibitory neurons. PV neurons have response properties that fit them into this role: some of these neurons do show a transient Off response with its onset timing slightly earlier than that of the synaptic I observed in presumed pyramidal neurons, whereas the Off response of SOM neurons is much more delayed (Figure 3O). Additionally, silencing PV neurons enhances the amplitude of Off responses (Figures 6E and 6F). One advantage of PV-mediated feedforward I is to ensure transientness of the spiking response (Gabernet et al., 2005; Moore and Wehr, 2013; Zhou et al., 2012), thus further enhancing the temporal precision of the Off response and precision of duration coding. 
A similar feedforward inhibitory circuit mediated by PV neurons has been suggested for onset responses (Hamilton et al., 2013; Li et al., 2015; Ma et al., 2010).
Second, since offset synaptic E is always concurrent with offset synaptic I, the spiking Off response is strongly shaped by the E and I interplay. In other words, the synaptic I plays a role in the gain control of offset responses. Notably, the population of neurons with offset synaptic inputs is much larger than that of Off-responding neurons. This is because the presence of offset synaptic E does not necessarily lead to a spiking Off response, due to a strong suppressive control by the closely following offset I. Depending on the absolute as well as the relative strengths of E and I, in many cells, the Off-responsiveness is completely suppressed, in a manner similar to how synaptic I controls the sparseness of cortical responses (Liang et al., 2019). The same mechanism applies to the On response as well, leading to the diversity of spike response patterns (NR, On only, On-Off, Off only). Since changing the E/I balance of Off responses can affect perceptual detection of duration, contextual information conveyed through long-range projections to auditory cortex may play an active role in modulating the perception of sound duration by modulating the E/I balance in pyramidal neurons (Chou et al., 2020; Letzkus et al., 2011; Liang et al., 2019).
In summary, our study has revealed a functional role of phasic Off responses in A1 through comprehensive characterizations of synaptic mechanisms and behavioral impacts. Our results support the notion that A1 Off responses contribute critically to encoding and perception of sound duration. As cortical Off responses are inherited from previous stages of processing (e.g., it is known that phasic Off responses first appear in the cochlear nucleus along the ascending auditory pathway) (Ding et al., 1999; Suga, 1964), it will be of great interest to understand how spiking Off responses with millisecond precisions emerge in the first place and what the underlying synaptic and cellular mechanisms are. These questions remain to be tackled in the future.
STAR★METHODS
RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Feixue Liang (Liangfx@smu.edu.cn).
Materials availability
This study did not generate new unique reagents.
Data and code availability
The datasets/code supporting the current study are available from the corresponding author on request.
EXPERIMENTAL MODELS AND SUBJECT DETAILS
All experimental procedures used in this study were approved by the Animal Care and Use Committee at the Southern Medical University. Male and female wild-type C57BL/6J and transgenic PV-Cre::Ai14 and SOM-Cre::Ai14 mice (The Jackson Laboratory), 7–12 weeks old, were used in this study. The animals were housed in the Southern Medical University vivarium with 12-h light/dark cycles.
METHOD DETAILS
Animal preparation for recordings and behaviors
Animals were prepared in the same way as we described previously (Li et al., 2019; Liang et al., 2019; Xiong et al., 2015; Zhou et al., 2014). The animal was anesthetized with isoflurane (1.5%, vol/vol) and a screw for head fixation was mounted on top of the skull with dental cement. The skull over A1 was cleaned and protected from being covered by dental cement. Afterward, 0.1 mg/kg buprenorphine was injected subcutaneously before the animal was returned to its home cage. During the recovery period, the mouse was trained to get accustomed to the head fixation on the recording setup. To fix the head, the screw was tightly clamped by a custom-made post holder. The following recording and behavioral experiments were all performed in a sound-attenuation booth.
For recording, the mouse was briefly anesthetized with isoflurane and a craniotomy was performed over A1. The animal was then left to recover from isoflurane for at least 2 h. Recording experiments were started after the animal exhibited normal running behavior. Each recording session lasted for about 4 h. The animal was given drops of 5% sucrose (wt/vol) through a pipette every hour. Between sessions, the animal was returned to its home cage for a break of at least 2 h.
For behavioral experiments, mice were water deprived for ∼5 days before behavioral training. During the pre-training deprivation period, each mouse obtained ∼1 mL water per day. During the training period, the mouse obtained water from a lickport as the task reward. If it drank less than 0.5 mL of water, additional water was supplemented. The body weight was monitored daily to ensure that the mouse maintained good health during the behavioral period. The weight change was ∼20% of the original body weight at the end of water restriction and was kept below 20% (∼15%) during the training and testing period.
Sound generation
Software for sound stimulation was custom made in LabVIEW (National Instruments). For determining the best frequency (BF) and frequency range of On and Off responses of each cell, pure tones (2–64 kHz spaced at 0.1 octave, 100-ms duration, 3-ms ramp, 60 dB sound pressure level [SPL], 3 repetitions) were delivered in a pseudo-random sequence at a 0.5-s inter-stimulus interval. For recording Off responses of the cell, 15 noise bursts or BF tone bursts (20–300 ms duration spaced at 20 ms, 60 dB SPL) were first generated and repeated 10–20 times in LabVIEW. They were then delivered in a random sequence at a 3-s inter-stimulus interval. For behavioral experiments, noise or tone bursts (with 3-ms ramp) with different durations were pre-calibrated to be 60 dB SPL, and the calibration indices for all bursts were saved as a LabVIEW array for further sound level calculation. In a small set of experiments, we also used pre-calibrated 40–70 dB SPL noise bursts as the sound cues.
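The stimulus sets described above can be enumerated with a short sketch. This is an illustration in Python rather than the LabVIEW software used in the study, and the function names are hypothetical. It shows that 2–64 kHz at 0.1-octave spacing spans 5 octaves (51 frequencies) and that 20–300 ms at 20-ms spacing gives the 15 burst durations.

```python
import math


def tone_frequencies_khz(f_min=2.0, f_max=64.0, step_octaves=0.1):
    """Frequencies from f_min to f_max (inclusive) at a fixed octave spacing."""
    n_steps = round(math.log2(f_max / f_min) / step_octaves)
    return [f_min * 2 ** (i * step_octaves) for i in range(n_steps + 1)]


def burst_durations_ms(start=20, stop=300, step=20):
    """Burst durations from start to stop (inclusive) in fixed steps."""
    return list(range(start, stop + 1, step))


freqs = tone_frequencies_khz()
durations = burst_durations_ms()
print(len(freqs), freqs[0], round(freqs[-1], 3))
print(len(durations), durations[0], durations[-1])
```

Each list would then be repeated and shuffled to form the pseudo-random presentation sequence; the shuffling procedure is not specified in the text and is omitted here.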
Sound duration-cued conditioning task
The duration-cued conditioning task was performed with a custom-made behavioral control system, which included four modules: sound wave generation, water valve control, licking signal detection and LED control (Figure S4A). The software for our behavioral control was written in LabVIEW (National Instruments), and the hardware consisted of I/O ports of data acquisition cards (PCI-6731 and PCI-4052e, National Instruments) and peripheral circuits. Noise or tone bursts of different durations as duration cues were generated by the sound wave generation module and delivered through a calibrated open acoustic delivery system using a TDT ES1 speaker (Tucker-Davis Technologies). The amount of water and delivery time were controlled by the water valve control module using a micro water valve (INK, the Lee Company), which was connected to a lickport placed in front of the mouse’s mouth. The micro water valve was placed outside the sound-attenuation room to minimize any effects of its opening sound. The licking signal detection module was designed to detect the mouse licking using an infrared sensor. The LED control module was designed to generate adjustable blue LED light to activate ChR2.
The timeline of the behavioral task was designed as shown in Figure S4B. For the most frequently used S50/S300 combination (other combinations S50/S150, S50/S600, S150/S300, S300/S50 and S50/S80 were used in other subsets of experiments), 300-ms sound was set as the CS+ cue and 50-ms sound was set as the CS− cue. In each CS+ (Go) trial, ∼5 μL water was delivered to the lickport 200 ms (or 1 s) after the cue offset. Alternatively, a closed-loop system was used to deliver water only upon the first lick occurring during the response window (Figure S6A). Reward was also provided for all CS+ trials in testing sessions. A 3-s time window after the sound cue was set as the response window. Correct discrimination was defined as licking the lickport (Lick) for water reward in response to 300-ms noise (hit) or no licking (No-Lick) in response to 50-ms noise (correct rejection) within the response window. If no licking occurred during the response window when 300-ms sound was given, or licking occurred during the response window when 50-ms sound was given, the trial was scored as a miss or false alarm, respectively. Miss or false alarm trials did not lead to punishment in our study. The behavioral performance was quantified as the fraction of trials with a correct response (hit + correct rejection). In some groups, we also quantified the fraction of lick (Go) trials for CS+ and CS− trials separately. A typical behavioral task session included fifty CS+ trials and fifty CS− trials, which were randomly sequenced and delivered at a 10-s inter-stimulus interval (ISI) or an irregular ISI randomized between 8 and 12 s.
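The scoring rules above can be summarized in a minimal sketch. The function names and the example session are hypothetical and serve only to illustrate how each trial is labeled and how the fraction of correct trials (hit + correct rejection) is obtained.

```python
from collections import Counter


def classify_trial(is_cs_plus, licked):
    """Label one trial from the cue type and whether a lick occurred
    within the response window."""
    if is_cs_plus:
        return "hit" if licked else "miss"
    return "false_alarm" if licked else "correct_rejection"


def fraction_correct(trials):
    """Fraction of trials scored as hit or correct rejection.

    `trials` is a list of (is_cs_plus, licked) pairs.
    """
    counts = Counter(classify_trial(cs, lick) for cs, lick in trials)
    return (counts["hit"] + counts["correct_rejection"]) / len(trials)


# Hypothetical session: 50 CS+ trials (45 with licks) and
# 50 CS- trials (10 with licks).
session = (
    [(True, True)] * 45 + [(True, False)] * 5
    + [(False, False)] * 40 + [(False, True)] * 10
)
score = fraction_correct(session)
print(score)
```

Because a session contains equal numbers of CS+ and CS− trials, a manipulation that affects only CS+ trials can lower this fraction by at most 0.5.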
Animal behavioral training and testing
After ∼4 days of recovery from surgery and ∼5 days of water restriction, the mice were trained to perform the duration-cued behavioral task. The behavioral training and testing were carried out in the dark cycle. Behavioral training was divided into three steps. In the first step, mice were acclimatized to the head fixation and learned to lick the lickport for water reward. Once the mice found they could receive water from the lickport, they would keep licking, regardless of sound stimulus (Figure S5A, step 1). In the following step, mice were trained to associate a sound stimulus with water reward (Figure S5A, step 2). After two training sessions, most mice learned to lick the lickport for water reward only after hearing a sound. The behavioral performance of each mouse was recorded after this step. In the last step, mice were trained to discriminate between 50-ms and 300-ms durations for water reward (Figure S5A, step 3). To accelerate training, each day we applied two training sessions with an ordered sequence (five 300-ms trials and then fifteen 50-ms trials) before a regular training session with a randomized sequence (Figure S5A, middle row). Using this training strategy, for most mice, the fraction of correct trials could reach > 90% after 4–6 regular training sessions (days) for noise-duration detection (Figures S4D and S5B). After the fraction of correct trials reached > 90%, mice were advanced to a behavioral testing session.
In mice well trained with the 50-/300-ms combination, we randomly varied the duration of CS+ signals between 100 and 500 ms and found that the performance was gradually reduced as the CS+ signal became less similar to the 300-ms sound that was used as the conditioned stimulus (CS+) during training (Figure S4F). In another set of experiments, we varied the intensity of sound cues randomly between 40 and 70 dB SPL during training. Testing with these different intensities (Figure S4G) resulted in similar performance (Figure S4H), suggesting that animals could perform duration detection within a relatively wide range of sound intensities. Together, these results argue that the animals could differentiate sound cues based on the information of duration rather than potentially different perceived loudnesses due to different durations.
We examined the licking pattern for CS+ trials in different training sessions (Figure S4I) and found that in early training sessions for distinguishing CS+ and CS− signals, many licks occurred after the onset of sound and before the reward delivery. These so-called "anticipatory" licks (Guo et al., 2014) gradually reduced in number with the progression of training. The plots of time-dependent lick rates (Figure S4J) demonstrate that the initiation of anticipatory licks became gradually delayed (Figure S4K) and the probability of licking during sound presentation was greatly reduced (Figure S4L) with the progression of training. These data suggest that after learning, the animal tends to withhold licking until the sound stops and a judgment of sound duration has been made. The nature of anticipatory licking became even clearer when we prolonged the delay of reward delivery from 200 ms to 1 s (Figure S4M).
To further confirm that animals had used sound duration cues, in well-trained animals we performed control testing experiments in which water reward was omitted from or provided for all trials regardless of CS+ or CS− cues (Figures S5D and S5E). We found that in the majority of animals (33 out of 40) the performance was similarly good (> 75% accuracy) in both the no-water and all-water conditions (Figure S5F). Therefore, for the 33 animals that were included in analysis, their licking was due to a successful detection of the reward-predictive sound cue rather than any reward-associated cues (e.g., odor of water or sound from the water valve). Additionally, using a closed-loop system to deliver water only upon the animal’s first lick during the response window (as in a Go/No-Go operant task) achieved outcomes similar to those of our regular training scheme (Figures S6A–S6D), although the efficiency was lower (Figures S6E and S6F). This further indicates that the licking was due to a detection of the reward-predictive cue. Training or testing with constant or irregular inter-trial intervals resulted in similar outcomes (Figures S6G and S6H). Together, these control experiments gave us confidence that mice could be trained to distinguish different sound durations.
For behavioral experiments with optogenetic temporal silencing of cortical responses, mice were divided into 6 groups based on the temporal phases (or their combinations) to be silenced: phases 1–4, phase 1, phases 2–4, phase 2, phases 3 + 4, and phase 3. The temporal windows for the 4 phases were 0–60 ms, 60–300 ms, 300–360 ms, and 360–600 ms, respectively. Each group consisted of 5–8 well-trained mice. As a control, mice first performed a typical sound duration-cued task session, which was the same as that used in the training sessions. They took a rest for at least one hour before taking another behavioral task session with optogenetic temporal silencing of A1. The behavioral performance of each mouse was the average of three behavioral sessions obtained on three different days.
For behavioral experiments with substitution of sound with light stimulation, mice completed the first 2 steps of the training process. They were then trained to discriminate the S50/L300 combination, which included a 50-ms sound as the CS− cue and LED activation simulating a 300-ms sound as the CS+ cue. For all the mice in our study, the fraction of correct trials could reach > 90% after 2–3 training sessions with LED activation simulating sound-evoked activity. Well-trained mice were advanced to perform behavioral testing with real-sound combinations, which included a 50-ms sound as the CS− cue and a 300-ms sound (S50/S300, as a substitution combination) or a 150-ms sound (S50/S150, as a control combination) as the CS+ cue. Each mouse was tested with only one real-sound combination per day (in a random order). In order to avoid possible extinction of learning, on each testing day the mouse was first subjected to an additional training session with the S50/L300 combination before testing with a real-sound combination. Only after > 90% accuracy was confirmed was the mouse allowed to perform the next behavioral testing. Additionally, we tested a S50/L150 combination (including thirty 50-ms sound trials and thirty 150-ms LED simulation trials) at the end of behavioral experiments for each mouse to verify that it did discriminate LED-simulated durations rather than detecting LED stimulation per se. Mice were not subjected to further behavioral testing after this control experiment.
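The mapping from phase combinations to silencing windows can be written as a small lookup. The sketch below is illustrative: the names are hypothetical, and the interpretation of the window times as relative to the onset of the 300-ms CS+ sound is an assumption, since the reference point is not stated explicitly in the text.

```python
# Temporal windows (ms) for the four response phases, as listed above.
# Assumed to be measured from the onset of the 300-ms CS+ sound.
PHASE_WINDOWS_MS = {1: (0, 60), 2: (60, 300), 3: (300, 360), 4: (360, 600)}


def silencing_intervals(phases):
    """Return merged (start, end) LED intervals for a set of phases.

    Adjacent windows (e.g., phases 3 and 4) are merged into one interval.
    """
    merged = []
    for start, end in sorted(PHASE_WINDOWS_MS[p] for p in phases):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# The six groups named in the text.
groups = {
    "phases 1-4": [1, 2, 3, 4],
    "phase 1": [1],
    "phases 2-4": [2, 3, 4],
    "phase 2": [2],
    "phases 3+4": [3, 4],
    "phase 3": [3],
}
for name, phases in groups.items():
    print(name, silencing_intervals(phases))
```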
In vivo cell-attached and whole-cell recordings
For recording auditory responses, sound stimuli were delivered through a calibrated open-field speaker (ES1, TDT) positioned 10 cm from the mouse's head and facing the left ear. Multiunit recordings were first made with a tungsten electrode (2MU, FHC, Inc.) to determine the best frequency for an array of recording sites. A1 was identified based on response properties and the characteristic tonotopic gradient of best frequencies (high to low from anterior to posterior), as described in previous studies (Li et al., 2014, 2015, 2019; Liang et al., 2019; Sun et al., 2010; Zhang et al., 2002; Zhou et al., 2014). The animal's head was tilted so that the electrode could penetrate the A1 surface at an angle of 80°.
Loose-patch and whole-cell recordings were made with an Axopatch 200B amplifier (Molecular Devices) as previously described (Li et al., 2013, 2019; Liang et al., 2019; Sun et al., 2010; Zhou et al., 2015). The patch pipette, controlled by a micromanipulator (Siskiyou), was lowered into the A1 at the same angle as in multiunit recordings. The cortical surface was covered with 3.5% agar prepared in warm artificial cerebrospinal fluid (ACSF; 124 mM NaCl, 1.2 mM NaH2PO4, 2.5 mM KCl, 25 mM NaHCO3, 20 mM glucose, 2 mM CaCl2, 1 mM MgCl2). Loose-patch recording (with 100–500 MΩ seal) was performed with a patch pipette (impedance of 5–7 MΩ) filled with ACSF. Pipette capacitance was fully compensated. Signals were recorded in voltage clamp mode at 20 kHz, with a command voltage applied to adjust the baseline current to be zero. If a cell did not exhibit spontaneous spikes within 10 min, it was not further recorded.
For whole-cell voltage-clamp recordings, the patch pipette (impedance of 4–5 MΩ) contained a cesium-based solution: 125 mM cesium gluconate, 5 mM TEA-Cl, 4 mM MgATP, 0.3 mM GTP, 10 mM phosphocreatine, 10 mM HEPES, 10 mM EGTA, 2 mM CsCl, 1.5 mM QX-314, 1% biocytin (wt/vol) or 0.25 mM fluorescent dextrans, pH = 7.3. Signals were low-pass filtered at 2 kHz and sampled at 10 kHz. After establishing the whole-cell configuration, whole-cell capacitance was fully compensated and the initial series resistance (Rs, 15–50 MΩ) was compensated by 40%–50% to achieve an effective Rs of 10–30 MΩ. A −10 mV junction potential was corrected. Excitatory and inhibitory synaptic currents were recorded by clamping the cell at −70 mV and 0 mV, respectively. As demonstrated previously (Li et al., 2013; Liang et al., 2019; Sun et al., 2010), the blind whole-cell recording method with relatively large pipette openings resulted in almost exclusive sampling from excitatory cortical neurons.
The laminar locations of the recorded neurons were determined based on the micromanipulator reading, and in some cases confirmed by histology of the track of pipette penetration and/or fluorescence-dextran or biocytin labeled cell bodies. We found a relatively good correspondence between the traveling depth of the recording pipette from the pia and the reconstructed laminar location of the recorded neuron (Zhou et al., 2014). The L2/3 neurons were sampled at a cortical depth of 175–325 μm from the pial surface, L4 neurons at a depth of 350–500 μm, L5 at 525–800 μm following previous studies (Liang et al., 2019; Zhou et al., 2014).
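As a minimal sketch (not the analysis software used in this study), the depth criteria above could be applied as follows. The depth ranges are those given in the text; the function name is hypothetical, and depths falling in the gaps between the stated ranges are left unassigned here, which is an assumption.

```python
from typing import Optional

# Depth ranges (micrometers below the pial surface) as stated in the text.
LAYER_DEPTH_RANGES_UM = {
    "L2/3": (175.0, 325.0),
    "L4": (350.0, 500.0),
    "L5": (525.0, 800.0),
}


def assign_layer(depth_um: float) -> Optional[str]:
    """Return the layer label for a recording depth.

    Returns None when the depth lies outside, or in a gap between,
    the stated ranges (an assumption of this sketch).
    """
    for layer, (lower, upper) in LAYER_DEPTH_RANGES_UM.items():
        if lower <= depth_um <= upper:
            return layer
    return None
```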
Optogenetic stimulation of A1 and optogenetically guided loose-patch recordings from inhibitory neurons
For photoinhibition or photoactivation of A1 and recording from inhibitory neurons, we expressed ChR2, ArchT, or GFP in A1 by injection of viral vectors, depending on the purpose of the experiment and the mouse strain (Li et al., 2013; Liang et al., 2019; Xiong et al., 2015). Adult PV-Cre or SOM-Cre (Jackson Laboratory) mice were anesthetized with 1.5% isoflurane. A small cut was made on the skin covering the right A1 (for PV/SOM neuron recording) or both A1 (for optogenetic stimulation) and the muscles were removed. Two ∼0.2-mm craniotomies were made in each A1 region (temporal lobe, 2.7 and 3.2 mm caudal to Bregma). Adeno-associated virus (AAV) encoding Cre-dependent ChR2, ArchT or GFP (rAAV2/9-EF1α-DIO-hChR2(H134R)-EYFP-pA, titer: 5.17 × 10¹³ genomes ml⁻¹; rAAV2/9-CAG-FLEX-ArchT-EGFP-WPRE-SV40-pA, titer: 2.04 × 10¹² genomes ml⁻¹; rAAV2/9-hSyn-EGFP-WPRE-HGH-pA, titer: 5.14 × 10¹² genomes ml⁻¹; purchased from BrainVTA Technology Co. Ltd., Wuhan, China) was delivered by using a beveled glass micropipette (tip diameter, ∼40 μm) attached to a microsyringe pump (World Precision Instruments). Injections were performed at two locations and two depths (300 and 600 μm), at a volume of 100 nL per injection and at a rate of 20 nL min⁻¹. Right after each injection, the pipette was allowed to rest for 4 min before withdrawal. We then sutured the scalp, injected buprenorphine at 0.1 mg kg⁻¹ and returned the mouse to its home cage. Mice were allowed to recover for at least 4 weeks. After experiments, the brain was sectioned and imaged with a fluorescence microscope to confirm viral expression.
For recording from PV/SOM neurons, loose-patch recordings using pipettes of smaller tip openings (pipette impedance, ∼10 MΩ) (Li et al., 2013; Zhou et al., 2014) were performed. An optic fiber connected to a blue LED source (470 nm, Thorlabs) was positioned close to the cortical surface of the recording site. We actively searched for neurons exhibiting LED-evoked spikes with the loose-patch recording paradigm, which were identified as PV/SOM neurons.
For optogenetic stimulation of A1, blue or green LED light was applied to the cortical surface to activate or silence PV neurons, which in turn silenced or activated cortical excitatory neurons, respectively (Li et al., 2013; Xiong et al., 2015). An optic fiber patch cord (200 mm, Thorlabs) connected to an LED source (470 nm, Thorlabs) was implanted on the surface of each A1 the day after the surgery for screw mounting. The implantations were made in the mouse anesthetized with isoflurane (1.5%) and mounted to the head-fixation apparatus. A craniotomy over each A1 was made. The cannula was lowered with a motion controller (Siskiyou) to the surface of each A1. The cannula was then secured on the skull by dental cement. To reduce rebound spikes, a pulse of fading blue light illumination was applied to activate PV neurons, for which the LED light intensity was linearly reduced to half of its value within the illumination duration. For optogenetic activation, blue LED light was applied to the cortical surface of A1 to activate ChR2-expressing neurons, with duration, intensity and decaying speed adjusted.
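The fading illumination described above (intensity reduced linearly to half of its initial value over the illumination duration) can be illustrated with the following sketch. The function name, the sampling step and the normalized intensity units are assumptions made for illustration and do not describe the actual LED driver.

```python
def fading_pulse(peak_intensity, duration_ms, dt_ms=1.0):
    """Return LED intensity samples for a linearly fading light pulse.

    The intensity starts at peak_intensity and falls linearly to half of
    that value at the end of the pulse. dt_ms is a hypothetical sampling step.
    """
    n_samples = int(round(duration_ms / dt_ms)) + 1
    if n_samples < 2:
        raise ValueError("duration_ms must span at least one sampling step")
    return [
        peak_intensity * (1.0 - 0.5 * i / (n_samples - 1))
        for i in range(n_samples)
    ]
```

For a 60-ms pulse sampled every 1 ms, `fading_pulse(1.0, 60.0)` returns 61 samples running from 1.0 down to 0.5.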
QUANTIFICATION AND STATISTICAL ANALYSIS
Data analysis
We performed data analysis with custom-developed software (MATLAB, MathWorks). Data from all the recorded neurons were first pooled together for a randomized batch processing without categorizing the neurons according to their specific identity (e.g., age, condition, laminar location, etc.).
Spike responses
In cell-attached recordings, spikes could be detected without ambiguity because their amplitudes were normally higher than 50 pA, whereas the baseline fluctuation was < 5 pA. Spikes for the Off response were counted within a 100-ms window after the offset of sound. For durations ≥ 100 ms, spikes for the On response were counted within a 0–100 ms time window after the onset of stimuli, while for durations < 100 ms, On spikes were counted within the time window of sound presentation. For On-Off neurons, to determine whether On and Off responses overlapped temporally, we fit the post-stimulus spike time histogram (PSTH) profile with a mixture of Gaussian distributions using an EM algorithm. As shown in Figures S1A–S1E, we obtained the amplitudes of Peak 1 (P1, On response), Peak 2 (P2, Off response), and the trough between the two peaks. We further used an index, Trough/((P1+P2)/2), to measure the degree of temporal overlap between On and Off responses, with zero indicating no temporal overlap. As shown in Figure S1F, temporal overlap was observed in many cells when the sound duration was 20 ms, but was nearly absent from all cells for durations ≥ 40 ms. Therefore, to quantify duration tuning properties, we only considered responses to sound durations ≥ 40 ms. Most cells in our recorded population had only transient responses. Fewer than 20% of responsive cells showed sustained responses; these were included in the population summaries.
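The counting windows and the overlap index described above can be sketched as follows. This is an illustration under stated assumptions rather than the custom MATLAB code used in this study: spike times are taken to be in ms relative to sound onset, the window edges are treated as half-open intervals, the peak and trough amplitudes are assumed to come from a separately performed Gaussian mixture fit (not shown), and the index is read as the trough divided by the mean of the two peaks. All function names are hypothetical.

```python
def count_on_off_spikes(spike_times_ms, duration_ms, window_ms=100.0):
    """Count On and Off spikes for one trial.

    spike_times_ms: spike times in ms relative to sound onset.
    On window: 0 to 100 ms for durations >= 100 ms, otherwise the period
    of sound presentation. Off window: 100 ms starting at sound offset.
    """
    on_end = min(duration_ms, window_ms)
    off_end = duration_ms + window_ms
    n_on = sum(1 for t in spike_times_ms if 0.0 <= t < on_end)
    n_off = sum(1 for t in spike_times_ms if duration_ms <= t < off_end)
    return n_on, n_off


def overlap_index(p1, p2, trough):
    """Trough amplitude divided by the mean of the two peak amplitudes.

    Zero means the On and Off peaks are fully separated in time.
    """
    return trough / ((p1 + p2) / 2.0)
```

With this reading, a trough that returns to zero between the two peaks gives an index of 0, and a trough as high as the average peak gives an index of 1.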
Evoked firing rate was calculated after subtracting the average baseline firing rate. Evoked responses were defined as firing rates higher than the average baseline firing rate by 3 standard deviations. Neurons that did not exhibit evoked spiking responses were excluded from the analysis. The onset timing of sound-evoked spike responses was determined from the PSTH as the lag between the stimulus onset and the time point at which spike rate exceeded the average spontaneous firing rate by 3 standard deviations of baseline fluctuations. To quantify jitter of On or Off responses, the first evoked spike was identified in each trial as the spike closest to and after the determined response onset timing, occurring within a 20-ms window after the response onset. The standard deviation of onset timings of the first spikes was then determined as jitter (Zhou et al., 2015). Note that we could most accurately identify the first spike in each trial for a neuron with a low spontaneous firing rate, but judgment of the first spike was sometimes difficult for a neuron with a high spontaneous firing rate. Using a lower threshold (e.g., 2 SD) did not affect the conclusion from Figure 1L. Onset latencies of synaptic responses were determined in a similar manner. Off responsiveness index (ORI) was calculated as (Roff − Ron)/(Roff + Ron), with Ron and Roff representing evoked firing rates of On and Off responses, respectively.
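The three measures defined above (response onset, first-spike jitter and ORI) can be illustrated with the sketch below. It is not the analysis code used in this study. The function names, the PSTH representation (a list of rates with a fixed bin width), the use of the sample standard deviation, and the handling of trials without a qualifying spike are all assumptions.

```python
from statistics import mean, stdev


def response_onset_ms(psth_rates, bin_ms, baseline_rates, n_sd=3.0):
    """Time (left bin edge, ms) of the first PSTH bin whose rate exceeds
    the baseline mean by n_sd baseline standard deviations, else None."""
    threshold = mean(baseline_rates) + n_sd * stdev(baseline_rates)
    for i, rate in enumerate(psth_rates):
        if rate > threshold:
            return i * bin_ms
    return None


def first_spike_jitter_ms(trials, onset_ms, window_ms=20.0):
    """Standard deviation of first-spike times across trials.

    In each trial the first spike at or after onset_ms and within
    window_ms of it is taken; trials without such a spike are skipped.
    """
    first_spikes = []
    for spike_times in trials:
        candidates = [t for t in spike_times
                      if onset_ms <= t <= onset_ms + window_ms]
        if candidates:
            first_spikes.append(min(candidates))
    return stdev(first_spikes) if len(first_spikes) >= 2 else None


def off_responsiveness_index(r_on, r_off):
    """ORI = (Roff - Ron) / (Roff + Ron); None if both rates are zero."""
    total = r_on + r_off
    return None if total == 0 else (r_off - r_on) / total
```

Under this definition, ORI ranges from −1 (On response only) to 1 (Off response only), with 0 indicating equal On and Off evoked rates.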
Synaptic responses
Synaptic response traces evoked by the same stimulus were averaged. Synaptic onset latency was determined at the time point where the evoked current exceeded the average baseline by 2 standard deviations. Peak amplitude was determined by averaging within a 5-ms window centered at the response peak after subtracting the baseline current. Excitatory and inhibitory synaptic conductances were derived according to ΔI = Ge (V − Ee) + Gi (V − Ei) (Li et al., 2013; Sun et al., 2010; Xiong et al., 2013; Zhou et al., 2014). Here, ΔI is the amplitude of the synaptic current at any time point after subtracting the average baseline current; Ge and Gi are the excitatory and inhibitory synaptic conductances, respectively; V is the holding voltage; and Ee (0 mV) and Ei (−70 mV) are the reversal potentials. The clamping voltage V was corrected from the applied holding voltage (Vh): V = Vh − Rs*I, where Rs is the effective series resistance. By holding the recorded cell at two different voltages (the reversal potentials for inhibitory and excitatory currents, −70 mV and 0 mV, respectively), Ge and Gi could be resolved from the equation. Resting conductance was calculated based on the average baseline currents within a 50-ms window before the onset of evoked currents recorded under the two voltages (−70 mV and 0 mV).
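Because the current is measured at two holding voltages, the equation above yields two linear equations in the two unknowns Ge and Gi at each time point. The following sketch illustrates one way to solve them. It is not the code used in this study. The function names and units (pA, mV, MΩ, giving conductances in nS) are assumptions, as is the choice of which current enters the series-resistance correction.

```python
def corrected_voltage_mV(vh_mV, rs_MOhm, i_pA):
    """V = Vh - Rs*I, with Rs in megaohms and I in pA (MOhm * pA = 1e-3 mV)."""
    return vh_mV - rs_MOhm * i_pA * 1e-3


def solve_conductances(dI1_pA, v1_mV, dI2_pA, v2_mV, ee_mV=0.0, ei_mV=-70.0):
    """Solve dI = Ge*(V - Ee) + Gi*(V - Ei) at two voltages for (Ge, Gi) in nS.

    dI1_pA and dI2_pA are baseline-subtracted currents at one time point,
    recorded at the corrected clamping voltages v1_mV and v2_mV.
    """
    a11, a12 = v1_mV - ee_mV, v1_mV - ei_mV
    a21, a22 = v2_mV - ee_mV, v2_mV - ei_mV
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two holding voltages must differ")
    ge = (dI1_pA * a22 - a12 * dI2_pA) / det
    gi = (a11 * dI2_pA - a21 * dI1_pA) / det
    return ge, gi
```

In the ideal case where the cell is clamped exactly at −70 mV and 0 mV, the system decouples: the current at −70 mV depends only on Ge and the current at 0 mV only on Gi. The general solution is needed once the series-resistance correction shifts V away from the reversal potentials.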
Statistical test
The Shapiro-Wilk test was first applied to examine whether samples had a normal distribution. In the case of a normal distribution, a t test or ANOVA with Bonferroni correction was applied. Otherwise, a nonparametric test (Wilcoxon signed-rank test or Wilcoxon rank-sum test) was applied. Data are presented as mean ± SD unless otherwise specified.
Supplementary Material
KEY RESOURCES TABLE.
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
Mounting Medium, antifading (with DAPI) | Solarbio | Cat# H-1500; RRID:AB_2336788 |
Bacterial and virus strains | ||
rAAV2/9-EF1a-DIO-hChR2(H134R)-EYFP | BrainVTA, Wuhan | PT-0001 |
rAAV2/9-CAG-FLEX-ArchT-EGFP | BrainVTA, Wuhan | PT-0633 |
rAAV2/9-hSyn-EGFP | BrainVTA, Wuhan | PT-1990 |
rAAV-hSyn-hChR2(H134R)-EYFP | BrainVTA, Wuhan | PT-1317 |
Chemicals, peptides, and recombinant proteins | ||
NaCl | OmniPur | UI27FZEMS |
KCl | Mallinckrodt | 7447-40-7 |
NaHCO3 | EMD Chemicals | 48204847 |
MgCl2 | J.T. Baker | 7791-18-6 |
CaCl2 | EMD Chemicals | 41046444 |
NaH2PO4 | EMD Chemicals | SX0320-1 |
Cs-Gluconate | Sigma | G4625 |
TEA-Cl | Sigma | T2265 |
CsCl | Sigma | 289329 |
HEPES | Sigma | SLBX2493 |
EGTA | Sigma | 324626 |
GTP | Sigma | G8877 |
Phosphocreatine | Sigma | P7936 |
QX-314 | Sigma | L5783 |
MgATP | Sigma | A9187 |
Glucose | Sigma | SLBC6575V |
Sucrose | Sigma | D00168514 |
Biocytin | Sigma | B4261 |
Agar | BioFroxx | 1182GR500 |
Paraformaldehyde | Alfa Aesar | 10194340 |
Experimental models: Organisms/strains | ||
Mouse: C57BL/6J | The Jackson Laboratory | RRID: IMSR_JAX:000664 |
Mouse: Ai14 | The Jackson Laboratory | RRID:IMSR_JAX:007914 |
Mouse: PV-Cre | The Jackson Laboratory | RRID:IMSR_JAX:017320 |
Mouse: SOM-ires-Cre | The Jackson Laboratory | RRID:IMSR_JAX:013044 |
Software and algorithms | ||
LabVIEW | LabVIEW | https://www.ni.com; RRID: SCR_014325 |
MATLAB | MathWorks | https://www.mathworks.com/; RRID: SCR_001622 |
Prism | GraphPad | https://www.graphpad.com/scientific-software/prism/; RRID: SCR_00279 |
Fiji | NIH | https://fiji.sc/; RRID: SCR_002285 |
Other | ||
NI board for data acquisition | National Instruments | N/A |
Highlights.
A1 neurons exhibit transient Off responses time-locked to the termination of sound
Precisely timed excitatory followed by inhibitory input underlies the Off response
Weakening A1 Off responses reduces perception of sound duration, and vice versa
Creating artificial phasic On- and Off-like activity in A1 allows duration perception
ACKNOWLEDGMENTS
This work was supported by grants from the National Natural Science Foundation of China (31671084), the Natural Science Funds for Distinguished Young Scholar of Guangdong province (2019B151502033), and the Science and Technology Planning Project of Guangzhou (201804010443) to F.L. H.L. was supported by grants from the National Natural Science Foundation of China (32000699), the China Postdoctoral Science Foundation (2019M662970), and the Guangdong Basic and Applied Basic Research Foundation (2019A1515110804). L.I.Z. was supported by grants from the US National Institutes of Health (R01DC008983). H.W.T. was supported by a grant from the US National Institutes of Health (R01EY019049).
Footnotes
SUPPLEMENTAL INFORMATION
Supplemental information can be found online at https://doi.org/10.1016/j.celrep.2021.109003.
DECLARATION OF INTERESTS
The authors declare no competing interests.
REFERENCES
- Akimov AG, Egorova MA, and Ehret G (2017). Spectral summation and facilitation in on- and off-responses for optimized representation of communication calls in mouse inferior colliculus. Eur. J. Neurosci 45, 440–459.
- Alluri RK, Rose GJ, Hanson JL, Leary CJ, Vasquez-Opazo GA, Graham JA, and Wilkerson J (2016). Phasic, suprathreshold excitation and sustained inhibition underlie neuronal selectivity for short-duration sounds. Proc. Natl. Acad. Sci. USA 113, E1927–E1935.
- Anderson LA, and Linden JF (2016). Mind the Gap: Two Dissociable Mechanisms of Temporal Processing in the Auditory System. J. Neurosci 36, 1977–1995.
- Baba H, Tsukano H, Hishida R, Takahashi K, Horii A, Takahashi S, and Shibuki K (2016). Auditory cortical field coding long-lasting tonal offsets in mice. Sci. Rep 6, 34421.
- Casseday JH, Ehrlich D, and Covey E (1994). Neural tuning for sound duration: role of inhibitory mechanisms in the inferior colliculus. Science 264, 847–850.
- Chong KK, Anandakumar DB, Dunlap AG, Kacsoh DB, and Liu RC (2020). Experience-Dependent Coding of Time-Dependent Frequency Trajectories by Off Responses in Secondary Auditory Cortex. J. Neurosci 40, 4469–4482.
- Chou XL, Fang Q, Yan L, Zhong W, Peng B, Li H, Wei J, Tao HW, and Zhang LI (2020). Contextual and cross-modality modulation of auditory cortical processing through pulvinar mediated suppression. eLife 9, e54157.
- Dehmel S, Kopp-Scheinpflug C, Dörrscheidt GJ, and Rübsamen R (2002). Electrophysiological characterization of the superior paraolivary nucleus in the Mongolian gerbil. Hear. Res 172, 18–36.
- Deneux T, Kempf A, Daret A, Ponsot E, and Bathellier B (2016). Temporal asymmetries in auditory coding and perception reflect multi-layered nonlinearities. Nat. Commun 7, 12682.
- Ding J, Benson TE, and Voigt HF (1999). Acoustic and current-pulse responses of identified neurons in the dorsal cochlear nucleus of unanesthetized, decerebrate gerbils. J. Neurophysiol 82, 3434–3457.
- Ehret G, and Merzenich MM (1988). Complex sound analysis (frequency resolution, filtering and spectral integration) by single units of the inferior colliculus of the cat. Brain Res 472, 139–163.
- Fishell G, and Rudy B (2011). Mechanisms of inhibition within the telencephalon: "where the wild things are". Annu. Rev. Neurosci 34, 535–567.
- Fishman YI, and Steinschneider M (2009). Temporally dynamic frequency tuning of population responses in monkey primary auditory cortex. Hear. Res 254, 64–76.
- Fuzessery ZM, and Hall JC (1999). Sound duration selectivity in the pallid bat inferior colliculus. Hear. Res 137, 137–154.
- Gabernet L, Jadhav SP, Feldman DE, Carandini M, and Scanziani M (2005). Somatosensory integration controlled by dynamic thalamocortical feed-forward inhibition. Neuron 48, 315–327.
- Godfrey DA, Kiang NY, and Norris BE (1975). Single unit activity in the dorsal cochlear nucleus of the cat. J. Comp. Neurol 162, 269–284.
- Guo ZV, Li N, Huber D, Ophir E, Gutnisky D, Ting JT, Feng G, and Svoboda K (2014). Flow of cortical activity underlying a tactile decision in mice. Neuron 81, 179–194.
- Guo W, Robert B, and Polley DB (2019). The Cholinergic Basal Forebrain Links Auditory Stimuli with Delayed Reinforcement to Support Learning. Neuron 103, 1164–1177.e6.
- Hamilton LS, Sohl-Dickstein J, Huth AG, Carels VM, Deisseroth K, and Bao S (2013). Optogenetic activation of an inhibitory network enhances feedforward functional connectivity in auditory cortex. Neuron 80, 1066–1076.
- Hancock KE, and Voigt HF (2002). Intracellularly labeled fusiform cells in dorsal cochlear nucleus of the gerbil. I. Physiological response properties. J. Neurophysiol 87, 2505–2519.
- He J (2001). On and off pathways segregated at the auditory thalamus of the guinea pig. J. Neurosci 21, 8672–8679.
- He J (2002). OFF responses in the auditory thalamus of the guinea pig. J. Neurophysiol 88, 2377–2386.
- He J, Hashikawa T, Ojima H, and Kinouchi Y (1997). Temporal integration and duration tuning in the dorsal zone of cat auditory cortex. J. Neurosci 17, 2615–2625.
- Henry KR (1985a). ON and OFF components of the auditory brainstem response have different frequency- and intensity-specific properties. Hear. Res 18, 245–251.
- Henry KR (1985b). Tuning of the auditory brainstem OFF responses is complementary to tuning of the auditory brainstem ON response. Hear. Res 19, 115–125.
- Inagaki HK, Fontolan L, Romani S, and Svoboda K (2019). Discrete attractor dynamics underlies persistent activity in the frontal cortex. Nature 566, 212–217.
- Joachimsthaler B, Uhlmann M, Miller F, Ehret G, and Kurt S (2014). Quantitative analysis of neuronal response properties in primary and higher-order auditory cortical fields of awake house mice (Mus musculus). Eur. J. Neurosci 39, 904–918.
- Kasai M, Ono M, and Ohmori H (2012). Distinct neural firing mechanisms to tonal stimuli offset in the inferior colliculus of mice in vivo. Neurosci. Res 73, 224–237.
- Keller CH, Kaylegian K, and Wehr M (2018). Gap encoding by parvalbumin-expressing interneurons in auditory cortex. J. Neurophysiol 120, 105–114.
- Kopp-Scheinpflug C, Tozer AJ, Robinson SW, Tempel BL, Hennig MH, and Forsythe ID (2011). The sound of silence: ionic mechanisms encoding sound termination. Neuron 71, 911–925.
- Kopp-Scheinpflug C, Sinclair JL, and Linden JF (2018). When Sound Stops: Offset Responses in the Auditory System. Trends Neurosci 41, 712–728.
- Kulesza RJ Jr., Spirou GA, and Berrebi AS (2003). Physiological response properties of neurons in the superior paraolivary nucleus of the rat. J. Neurophysiol 89, 2299–2312.
- Letzkus JJ, Wolff SBE, Meyer EMM, Tovote P, Courtin J, Herry C, and Lüthi A (2011). A disinhibitory microcircuit for associative fear learning in the auditory cortex. Nature 480, 331–335.
- Li LY, Li YT, Zhou M, Tao HW, and Zhang LI (2013). Intracortical multiplication of thalamocortical signals in mouse auditory cortex. Nat. Neurosci 16, 1179–1181.
- Li LY, Ji XY, Liang F, Li YT, Xiao Z, Tao HW, and Zhang LI (2014). A feedforward inhibitory circuit mediates lateral refinement of sensory representation in upper layer 2/3 of mouse primary auditory cortex. J. Neurosci 34, 13670–13683.
- Li LY, Xiong XR, Ibrahim LA, Yuan W, Tao HW, and Zhang LI (2015). Differential Receptive Field Properties of Parvalbumin and Somatostatin Inhibitory Neurons in Mouse Auditory Cortex. Cereb. Cortex 25, 1782–1791.
- Li H, Liang F, Zhong W, Yan L, Mesik L, Xiao Z, Tao HW, and Zhang LI (2019). Synaptic Mechanisms for Bandwidth Tuning in Awake Mouse Primary Auditory Cortex. Cereb. Cortex 29, 2998–3009.
- Liang F, Li H, Chou XL, Zhou M, Zhang NK, Xiao Z, Zhang KK, Tao HW, and Zhang LI (2019). Sparse Representation in Awake Auditory Cortex: Cell-type Dependence, Synaptic Mechanisms, Developmental Emergence, and Modulation. Cereb. Cortex 29, 3796–3812.
- Liu J, Whiteway MR, Sheikhattar A, Butts DA, Babadi B, and Kanold PO (2019). Parallel Processing of Sound Dynamics across Mouse Auditory Cortex via Spatially Patterned Thalamic Inputs and Distinct Areal Intracortical Circuits. Cell Rep 27, 872–885.e7.
- Ma X, and Suga N (2007). Multiparametric corticofugal modulation of collicular duration-tuned neurons: modulation in the amplitude domain. J. Neurophysiol 97, 3722–3730.
- Ma WP, Liu BH, Li YT, Huang ZJ, Zhang LI, and Tao HW (2010). Visual representations by cortical somatostatin inhibitory neurons–selective but with weak and delayed responses. J. Neurosci 30, 14371–14379.
- Malone BJ, Scott BH, and Semple MN (2015). Diverse cortical codes for scene segmentation in primate auditory cortex. J. Neurophysiol 113, 2934–2952.
- Mesik L, Ma WP, Li LY, Ibrahim LA, Huang ZJ, Zhang LI, and Tao HW (2015). Functional response properties of VIP-expressing inhibitory neurons in mouse visual and auditory cortex. Front. Neural Circuits 9, 22.
- Moore AK, and Wehr M (2013). Parvalbumin-expressing inhibitory interneurons in auditory cortex are well-tuned for frequency. J. Neurosci 33, 13713–13723.
- Phillips DP, Hall SE, and Boehnke SE (2002). Central auditory onset responses, and temporal asymmetries in auditory perception. Hear. Res 167, 192–205.
- Qin L, Chimoto S, Sakai M, Wang J, and Sato Y (2007). Comparison between offset and onset responses of primary auditory cortex ON-OFF neurons in awake cats. J. Neurophysiol 97, 3421–3431.
- Qin L, Liu Y, Wang J, Li S, and Sato Y (2009). Neural and behavioral discrimination of sound duration by cats. J. Neurosci 29, 15650–15659.
- Recanzone GH (2000). Response profiles of auditory cortical neurons to tones and noise in behaving macaque monkeys. Hear. Res 150, 104–118.
- Rhode WS, and Kettner RE (1987). Physiological study of neurons in the dorsal and posteroventral cochlear nucleus of the unanesthetized cat. J. Neurophysiol 57, 414–442.
- Rhode WS, and Smith PH (1986). Physiological studies on neurons in the dorsal cochlear nucleus of cat. J. Neurophysiol 56, 287–307.
- Sayegh R, Aubie B, and Faure PA (2011). Duration tuning in the auditory midbrain of echolocating and non-echolocating vertebrates. J. Comp. Physiol. A Neuroethol. Sens. Neural Behav. Physiol 197, 571–583.
- Scholl B, Gao X, and Wehr M (2010). Nonoverlapping sets of synapses drive on responses and off responses in auditory cortex. Neuron 65, 412–421.
- Semple MN, and Kitzes LM (1985). Single-unit responses in the inferior colliculus: different consequences of contralateral and ipsilateral auditory stimulation. J. Neurophysiol 53, 1467–1482.
- Sollini J, Chapuis GA, Clopath C, and Chadderton P (2018). ON-OFF receptive fields in auditory cortex diverge during development and contribute to directional sweep selectivity. Nat. Commun 9, 2084.
- Sołyga M, and Barkat TR (2019). Distinct processing of tone offset in two primary auditory cortices. Sci. Rep 9, 9581.
- Suga N (1964). Single Unit Activity in Cochlear Nucleus and Inferior Colliculus of Echo-Locating Bats. J. Physiol 172, 449–474.
- Sun YJ, Wu GK, Liu BH, Li P, Zhou M, Xiao Z, Tao HW, and Zhang LI (2010). Fine-tuning of pre-balanced excitation and inhibition during auditory cortical development. Nature 465, 927–931.
- Tan AY, Zhang LI, Merzenich MM, and Schreiner CE (2004). Tone-evoked excitatory and inhibitory synaptic conductances of primary auditory cortex neurons. J. Neurophysiol 92, 630–643.
- Tian B, Kuśmierek P, and Rauschecker JP (2013). Analogues of simple and complex cells in rhesus monkey auditory cortex. Proc. Natl. Acad. Sci. USA 110, 7892–7897.
- Weible AP, Moore AK, Liu C, DeBlander L, Wu H, Kentros C, and Wehr M (2014). Perceptual gap detection is mediated by gap termination responses in auditory cortex. Curr. Biol 24, 1447–1455.
- Wu GK, Li P, Tao HW, and Zhang LI (2006). Nonmonotonic synaptic excitation and imbalanced inhibition underlying cortical intensity tuning. Neuron 52, 705–715.
- Wu GK, Arbuckle R, Liu BH, Tao HW, and Zhang LI (2008). Lateral sharpening of cortical frequency tuning by approximately balanced inhibition. Neuron 58, 132–143.
- Wu GK, Tao HW, and Zhang LI (2011). From elementary synaptic circuits to information processing in primary auditory cortex. Neurosci. Biobehav. Rev 35, 2094–2104.
- Xiong XR, Liang F, Li H, Mesik L, Zhang KK, Polley DB, Tao HW, Xiao Z, and Zhang LI (2013). Interaural level difference-dependent gain control and synaptic scaling underlying binaural computation. Neuron 79, 738–753.
- Xiong XR, Liang F, Zingg B, Ji XY, Ibrahim LA, Tao HW, and Zhang LI (2015). Auditory cortex controls sound-driven innate defense behaviour through corticofugal projections to inferior colliculus. Nat. Commun 6, 7224.
- Young ED, and Brownell WE (1976). Responses to tones and noise of single cells in dorsal cochlear nucleus of unanesthetized cats. J. Neurophysiol 39, 282–300.
- Zhang LI, Bao S, and Merzenich MM (2002). Disruption of primary auditory cortex by synchronous auditory inputs during a critical period. Proc. Natl. Acad. Sci. USA 99, 2309–2314.
- Zhang LI, Tan AY, Schreiner CE, and Merzenich MM (2003). Topography and synaptic shaping of direction selectivity in primary auditory cortex. Nature 424, 201–205.
- Zhang LI, Zhou Y, and Tao HW (2011). Perspectives on: information and coding in mammalian sensory physiology: inhibitory synaptic mechanisms underlying functional diversity in auditory cortex. J. Gen. Physiol 138, 311–320.
- Zhou Y, Mesik L, Sun YJ, Liang F, Xiao Z, Tao HW, and Zhang LI (2012). Generation of spike latency tuning by thalamocortical circuits in auditory cortex. J. Neurosci 32, 9969–9980.
- Zhou M, Liang F, Xiong XR, Li L, Li H, Xiao Z, Tao HW, and Zhang LI (2014). Scaling down of balanced excitation and inhibition by active behavioral states in auditory cortex. Nat. Neurosci 17, 841–850.
- Zhou M, Li YT, Yuan W, Tao HW, and Zhang LI (2015). Synaptic mechanisms for generating temporal diversity of auditory representation in the dorsal cochlear nucleus. J. Neurophysiol 113, 1358–1368.
Data Availability Statement
The datasets/code supporting the current study are available from the corresponding author on request.