Author manuscript; available in PMC: 2020 May 20.
Published in final edited form as: Cell Rep. 2019 Apr 16;27(3):872–885.e7. doi: 10.1016/j.celrep.2019.03.069

Parallel processing of sound dynamics across mouse auditory cortex via spatially patterned thalamic inputs and distinct areal intracortical circuits

Ji Liu 1, Matthew R Whiteway 2, Alireza Sheikhattar 3, Daniel A Butts 1,2,4, Behtash Babadi 3, Patrick O Kanold 1,4,5,*
PMCID: PMC7238664  NIHMSID: NIHMS1060482  PMID: 30995483

Summary

Natural sounds have rich spectro-temporal dynamics. Spectral information is spatially represented in auditory cortices (ACX) via large-scale maps. However, the representation of temporal information, e.g. sound offset, is unclear. We performed multiscale imaging of large-scale and cellular neuronal as well as thalamic activity evoked by sound onset and offset in awake mouse ACX. ACX areas differed in onset (On-R) and offset-responses (Off-R). Most excitatory L2/3 neurons showed either On-Rs or Off-Rs and ACX areas were characterized by differing fractions of On-R/Off-R neurons. Somatostatin and parvalbumin interneurons showed distinct temporal dynamics, potentially amplifying Off-Rs. Functional network analysis showed that ACX areas contained distinct parallel On- and Off-networks. Thalamic (MGB) terminals showed either On-Rs or Off-Rs indicating a thalamic origin of On/Off-R pathways. Thus, ACX areas spatially represent temporal features and this representation is created by spatial convergence and co-activation of distinct MGB inputs and is further refined by specific intracortical connectivity.

Keywords: Auditory cortex, mouse, temporal, pattern, MGB, 2-photon

Introduction

Natural sounds have rich spectral and temporal dynamics, and neuronal populations along the auditory processing stream encode both spectral and temporal information. Sound onset and offset are fundamental dynamic features of sound, to which single neurons at multiple levels of the auditory system respond (He et al., 1997; Henry, 1985; Hillyard and Picton, 1978; Kopp-Scheinpflug et al., 2011), including the auditory cortex (ACX) (Baba et al., 2016; Fishman and Steinschneider, 2009; He, 2001; Qin et al., 2007; Recanzone, 2000; Scholl et al., 2010). While offset-responses (Off-R) have been suggested to be responsible for duration coding (He, 2001), they, together with onset-response (On-R), encode the basic cues (onset/offset) for auditory scene analysis (Bregman, 1994). Thus, besides elucidating the encoding of both sound onset and offset, revealing the underlying cellular networks is essential for understanding auditory processing.

The ACX contains multiple functional areas and the spatial organization of ACX with respect to On-Rs has been extensively studied. On a large scale (hundreds of microns), there are clear tonotopic maps, which are due to topographic thalamocortical projections (Guo et al., 2012; Issa et al., 2014; Kanold et al., 2014; Merzenich et al., 1975; Stiebler et al., 1997; Tsukano et al., 2015), while on a finer scale 2-photon imaging studies in mouse primary ACX (A1) revealed a diverse tonotopic organization of On-Rs in superficial layers (Bandyopadhyay et al., 2010; Kanold et al., 2014; Rothschild et al., 2010; Winkowski and Kanold, 2013). In contrast, the spatial organization of Off-Rs in ACX is less well understood. Widefield flavoprotein imaging revealed the existence of an area adjacent to A1 that is specialized in processing tone offset regardless of tone frequency in anesthetized mice (Baba et al., 2016). On a finer scale, neurons in mouse ACX show distinct On/Off-R patterns (Deneux et al., 2016), and inputs carrying On-Rs and Off-Rs are proposed to originate in non-overlapping synaptic circuits (Scholl et al., 2010). These findings at different scales raise the possibility that On- and Off-Rs reflect distinct parallel pathways not only within A1 but also across ACX, and that On- and Off-Rs might be differentially represented in the cortical space. Here, we tested these hypotheses by investigating the spatial representation and functional microcircuits contributing to On-Rs and Off-Rs on multiple spatial scales in ACX.

Since multiple ACX areas contribute to auditory processing, we first performed widefield imaging of GCaMP6s in awake mice. For unbiased identification of ACX areas we developed an automated image segmentation algorithm based upon temporal responses. We detected known and other ACX areas. ACX areas differed in their response properties to tone onset and offset indicating that temporal selectivity might underlie the functional streams of analysis in ACX. Both On-Rs and Off-Rs showed tonotopic organization. 2-photon calcium imaging of ACX neurons revealed that most excitatory layer 2/3 neurons showed either On- or Off-Rs. ACX areas were characterized by differing fractions of On- or Off-responsive neurons. Parvalbumin (PV) and somatostatin (SOM) interneurons showed differential On-R and Off-R dynamics suggesting suppression of PV neurons by SOM neurons during prolonged tone presentation, potentially exerting a disinhibiting effect on local excitatory neurons to selectively amplify cortical Off-R. Functional connectivity analysis showed that ACX areas varied in their intrinsic network structure. Imaging of medial geniculate body (MGB) axons showed a thalamic origin of the parallel On/Off-R circuits and that spatial convergence and co-activation of MGB inputs determine cellular On/Off-preference. Together our results demonstrate that ACX fields differentially process sound onset/offsets via parallel and spatially patterned projections from the MGB and that this processing is further refined by specific intracortical connectivity.

Results

We set out to investigate the spatial organization of temporal sensitivity in mouse ACX on multiple spatial scales. Since the temporal sensitivity of ACX responses, especially Off-Rs, is sensitive to anesthesia (Fishman and Steinschneider, 2009; Joachimsthaler et al., 2014; Qin et al., 2007; Recanzone, 2000), we performed our studies in ACX of awake animals. We used F1’s of CBA/CaJ and Thy1-GCaMP6s (C57BL/6 background) crosses (Dana et al., 2014), which show normal adult hearing (Frisina et al., 2011) and widespread cortical expression of GCaMP6s.

We first investigated the spatial distribution of On-R’s and Off-R’s on the mesoscale using widefield (WF) imaging. We imaged the left ACX of awake adult mice (n=13) while presenting 2-second pure tones (Fig. 1A). Tone onset resulted in spatially restricted fluorescence increases at several locations in ACX (Fig. 1B, see 0.4s following tone onset, S1A). Fluorescence increases were widespread in ACX with largest increases present in discrete locations corresponding to activations of primary as well as higher order ACX areas, putatively A1, AAF and A2 respectively. Following tone offset, we observed additional widespread increases of fluorescence (at 2.4s, or 0.4s after tone offset), which corresponded to an offset-response (Off-R) (Fig. 1B). Off-Rs are not due to changes in animal state after tone cessation (Fig. S2). On-R and Off-R were also present in response to ultrasonic frequencies such as 83.0 kHz (Fig. 1B). In both examples, the spatial pattern of On-R qualitatively matches previous results (Baba et al., 2016; Issa et al., 2014; Tsukano et al., 2015).

Fig. 1. Both On-R and Off-R show global tonotopy.

(A) Experimental paradigm: a head-fixed mouse passively listened to tones while ACX was imaged. On- and Off-R are defined as increases in fluorescence following tone onset and offset, respectively. (B) Sequence of widefield images showing response to 7.3kHz tone at 35dB SPL and to 83.0kHz tone at 65dB SPL. The red bar indicates the images collected during tone presentation (0–2sec). (C) On-tonotopy showing the contour of the 95th percentile of the response following tone onset. Systematic shift of maximum activation location can be seen in A1, AAF and A2. (D) Same as in (C) but for Off-tonotopy. The center of ACX shows weaker tone evoked responses and thus is not marked by contours.

Varying sound frequency and level showed that both On-R and Off-R changed their response location with respect to tone frequency (Fig. S1). We overlaid contours of the strongest activations across ACX for each frequency at the respective threshold of On-R (Fig. 1C) and Off-R (Fig. 1D) and clear systematic changes of activated areas can be seen in multiple locations. Based on the relative positions of these gradients in the On-R we labeled areas as primary ACX (A1), Anterior Auditory Field (AAF) and A2. The gradients were consistent across animals (Fig. S3). A1 shows dual tonotopic axes: one from the caudal area towards the dorsomedial area (Ultrasonic Field or UF) and the other one reaching towards ventrolateral side (Fig. 1C), largely consistent with prior reports (Issa et al., 2014; Polley et al., 2007; Tsukano et al., 2015) with the subtle difference that two On-tonotopic gradients in primary ACX share the low to mid frequency axis before splitting dorsally and ventrally. In addition, we observed that a tonotopic gradient is present for Off-Rs in A1, AAF and A2 in all animals (Fig. 1D, Fig. S3B). The Off-tonotopy gradient from A1 to UF overlapped with the On-tonotopy gradient. However, the Off-tonotopy gradient also extends dorsoposteriorly and thus covers more area dorsally than the On-tonotopic gradient. Between these dominant gradients of strong tone responses there was a weakly responding central region, consistent with previous widefield studies (Issa et al., 2017). Thus, Off-Rs are present in multiple ACX areas and Off-Rs are tonotopically organized. The differences in the tonotopic gradients between On-Rs and Off-Rs suggest that different microcircuits might underlie onset/offset processing.

Distinct ACX areas show selectivity to temporal features

So far, we identified functional ACX areas based on separate On/Off-Rs at threshold. Since these areas showed overlap, we sought to determine if ACX contained distinct functional areas based on the combination of selectivity for On/Off-R throughout frequency/sound level combinations and if such ACX segmentations could identify unique ACX areas. We developed an unsupervised and unbiased image segmentation technique taking the entire temporal response of each pixel into account. We expressed the temporal activities of pixels as a linear combination of spatially distinct regions of interest (ROIs) weighted by temporal modulations (Fig. 2A) using an autoencoder neural network with non-negativity constraints on the spatial weights (Whiteway and Butts, 2017). An autoencoder is a neural network with one or more hidden layers (Fig. 2B). While the input and output layers have the same number of nodes, the autoencoder reduces the dimension in the image sequence by expressing the intensity of each pixel as the weighted sum of the activity of the hidden layer, which has a smaller dimensionality. These weights are interpreted as distinct spatial patterns of activity (or ROIs) and the activity of the hidden layer reflects the temporal modulation (Fig. 2C).
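As a rough illustration of this kind of decomposition (not the implementation used in the study, which follows Whiteway and Butts, 2017), the sketch below fits a linear autoencoder whose spatial weights are kept non-negative by projected gradient descent. The function name `fit_nonneg_autoencoder`, the tied encoder/decoder weights, the learning rate and the iteration count are all assumptions made for this example.

```python
import numpy as np

def fit_nonneg_autoencoder(X, n_rois, n_iter=2000, lr=0.01, seed=0):
    """Illustrative linear autoencoder with non-negative spatial weights.

    X      : array (n_frames, n_pixels), one flattened image per row.
    n_rois : number of hidden units (ROIs).
    Returns W (n_pixels, n_rois), the non-negative spatial profiles, and
    H (n_frames, n_rois), the temporal activity of each ROI.
    Assumption: encoder and decoder share the same weights (tied weights).
    """
    rng = np.random.default_rng(seed)
    n_frames, n_pixels = X.shape
    W = rng.uniform(0.0, 0.1, size=(n_pixels, n_rois))
    for _ in range(n_iter):
        H = X @ W                      # hidden-layer (temporal) activity
        R = H @ W.T - X                # reconstruction residual
        # Gradient of 0.5 * ||X W W^T - X||^2 / n_frames with respect to W
        grad = (X.T @ R @ W + R.T @ H) / n_frames
        # Gradient step followed by projection onto W >= 0
        W = np.maximum(W - lr * grad, 0.0)
    return W, X @ W
```

On a toy movie built from two non-overlapping patches with independent time courses, each column of `W` should end up concentrated on one patch, and the matching column of `H` gives that patch's activity trace. Real widefield data would likely need preprocessing and hyperparameter tuning that this sketch does not cover.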

Fig. 2. Widefield image segmentation using an Autoencoder reveals ACX areas with distinct On/Off selectivity.

(A) Cartoon showing image segmentation. The example image sequence at any time point can be expressed as the weighted summation of ROI 1 and ROI 2 by respective activity level. Our goal of image segmentation is to retrieve activated areas as well as their temporal activation traces. (B) An autoencoder is a neural network with one or more hidden layers between input and output layers, which have the same number of nodes. The weights between input/output layer and hidden layer are adjusted such that the output matches the input as closely as possible. The hidden layer typically has far fewer nodes than the input/output layers to achieve dimension reduction. (C) Principle of fitting autoencoder ROIs. Original pixels (left) are linearly combined to produce ROIs (middle) such that each pixel in turn can be approximated (right) by the linear combination of these ROIs, while the weights are interpreted as spatial profile of the ROIs. (D-G) On- and Off-R spatial profiles overlaid with selected autoencoder ROIs to validate ROI placement. (D-G) share the same color scale. (H) Parcellation of ROIs into ACX fields. ROIs outlined in solid lines have the On/Off frequency response areas (FRAs) shown in (I). (I) On/Off-R amplitude is plotted as a function of frequency and sound level for ACX fields. Adjacent blue and red bars represent On/Off-R to the same frequency/sound level combination.

Typically, an autoencoder with ~50 ROIs well approximated the acquired image sequence (Fig. S4A). The resulting ROIs densely tiled the imaged area (Fig. S4B, D) with minimum spatial overlap (Fig. S4C), which reflects the distinct selectivity of On/Off-Rs of different ACX fields while making parsing ACX fields unambiguous. Additionally, the minimal overlap is likely due to our choice of the minimum number of ROIs to the desired degree of goodness of fit (Fig. S4A). Adding ROIs increases overlap but does not increase goodness of fit (Fig. S4A). To verify our method, we compared the locations of the ROIs with evoked fluorescence increases. We found that ROI placements agreed with locations of activation for both On-R (Fig. 2D, E) and Off-R (Fig. 2F, G), and their shapes reflected the contours of fluorescence increases. Thus, our method reliably identifies regions of common activations and extracts their temporal activations without prior knowledge of the spatial distribution of activity. This approach provides advantages over the common square/hexagonal grid segmentation as the choice of grid size could be arbitrary and might obscure the temporal selectivity of ROIs by grouping functionally separate fields together. While we here segmented ACX into functional fields, our method can be applied to arbitrary WF datasets for spatiotemporal analysis and image segmentation.

Identified ACX fields show distinct On/Off-R frequency response areas (FRAs) (Fig. 2H, I) indicating that differences in the sensitivity to temporal features are a major determinant of ACX organization. The low-frequency selective A1 ROI (Fig. 2I, A1(L)) shows predominant On/Off-R for tones of 4.0 to 7.3kHz while the mid-frequency selective A1 ROI (Fig. 2I, A1(M)) responded mostly to frequencies around 18.2kHz. The high-frequency selective A1 ROI (ventrolateral gradient of A1, Fig. 2I, A1(H)) typically has On/Off-Rs very similar to those of the mid-frequency A1 ROI due to their spatial proximity and the diffuse nature of WF signals. However, the average On-R of high-frequency A1 ROI to 61.3kHz is larger than that of mid-frequency A1 ROI at threshold. In contrast, UF ROI shows much higher selectivity to high frequencies (Fig. 2I, UF), consistent with the proposed role in processing conspecific ultrasonic vocalizations (Stiebler et al., 1997). Dorsoposterior (DP) ROI showed stronger Off-R (Fig. 2I, DP). AAF ROIs (Fig. 2I, AAF) showed comparable On/Off-R while A2 ROIs (Fig. 2I, A2) show weaker Off-R. Lastly, the center region (Fig. 2I, CTR) showed weaker responses and is likely less sensitive to simple stimuli such as pure tones (Fig. 2H, Fig. S4F). The spatial layout of these ROIs was consistent across mice (Fig. S3). Thus, ACX contains functional areas with distinct sensitivity to temporal features. Our image segmentation approach can better subdivide ACX as it captures the different temporal dynamics of ACX fields.

ACX fields differ in thresholds and sound level dependence of On- and Off-Rs

We next characterized threshold and sound level dependence of parsed ACX fields. Off-Rs in all areas showed a higher threshold than On-Rs and Off-Rs can have higher amplitudes than On-Rs (e.g. at 50 and 65dB SPL) (Fig. 3A–E). UF and DP showed the highest Off-R preference at 65dB SPL (Fig. 3F). Thus, while core ACX fields (e.g. A1, AAF) robustly respond to both tone onset/offset, areas away from core fields can show dominant Off-Rs, especially for loud tones.

Fig. 3. On/Off-R show areal differences in amplitude and spatial distribution.

(A-E) Differential On-R and Off-R profile as a function of both sound level and ACX field. On-R and Off-R profiles with respect to sound level for different ACX fields were obtained by summing over frequency in On/Off-FRAs. ‘***’ indicates p<0.001; ‘****’ indicates p< 0.0001. Dashed lines show 95% confidence interval. (F) Off- and on-response ratio at 65dB SPL as a function of ACX fields. Error bars show SEM. (G) On/Off-SC as function of distance along the dorsal-ventral axis, calculated among ROIs dorsal to A1 ROIs. Off-R show higher SCs than On-Rs. (H) On/Off-SC calculated among ROIs dorsal to UF ROIs. (I) On/Off-SC among all ROIs.

Off-responsive areas are more spatially extensive than On-responsive areas

The different selectivity for On/Off-R in ACX fields suggests a different underlying circuit topology. To quantify the large-scale spatial topology, we computed signal correlations (SC) between individual ROIs among a dorsal-ventral slice in each ACX area. In A1 and UF, Off-R SCs were significantly higher than On-R SCs (Fig. 3G, H). This relationship was maintained over distance suggesting that Off-Rs are more spatially extensive in the dorsal direction (Fig. 3G, H). This pattern was also true across ACX (Fig. 3I), suggesting that Off-Rs are more diffusely represented in all ACX areas. These results are consistent with the finding that dorsal ACX areas, especially UF and DP, have dominant Off-R (Fig. 3F). Together, the areal differences in the tonotopic gradients (Fig. 1) and the differences in SC between On/Off-Rs suggest that different intrinsic and ascending microcircuits within each area underlie the regional differences in onset/offset processing.

Neural populations in ACX areas differ in their selectivity to sound onset or offset

To investigate areal differences in processing tone onset/offset, we sought to analyze local microcircuits and assessed the temporal stimulus preferences of single neurons in four ACX areas using in vivo 2-photon imaging (n=32 mice; A1: 67 fields of view (FOVs), 19366 cells; AAF: 24 FOVs, 5425 cells; A2: 20 FOVs, 5918 cells; DP: 8 FOVs, 2573 cells). Cells in all ACX areas could show time-locked responses to tone onset and/or offset (Fig. 4A, B, S5). Cells showing On-R were sparse (A1, 5.05% ± 2.89%; AAF, 5.36% ± 2.58%; A2, 5.83% ± 4.53%; DP, 2.23% ± 1.29%, among all neurons imaged), and the same was true of Off-R (A1, 6.62% ± 4.34%; AAF, 2.14% ± 1.83%; A2, 2.28% ± 2.24%; DP, 4.64% ± 2.42%, among all neurons imaged), consistent with a sparse representation of sound in ACX in electrophysiological studies (Hromádka et al., 2008). Few neurons showed both On-Rs and Off-Rs (A1, 0.98% ± 0.90%; AAF, 0.54% ± 0.54%; A2, 0.95% ± 1.31%; DP 0.43% ± 0.57%, Fig. S6A, among all neurons imaged), suggesting that most L2/3 neurons are either only On-responsive or Off-responsive. We quantified the selectivity of On/Off-Rs by computing the On/Off-R Bias Index (OBI=(Off-On)/(Off+On)) (Fig. 4C). As expected, most OBI values were −1 (On-only) or 1 (Off-only). In A1 and DP, Off-only neurons (53% and 65% of neurons) outnumber On-only neurons (38% and 28% of neurons), while in A2 and AAF the reverse is true (67% and 70% vs. 23% and 19% of neurons). Neurons showing both On-R and Off-R constituted ~10% of responding neurons and were more Off-biased in A1 and DP than in AAF and A2 (Fig. 4D). We confirmed these results in a separate analysis (Fig. S6). Together these results show that ACX areas differ in both number of On/Off-only cells as well as in On/Off-selectivity of individual cells. Thus, ACX areas are defined by the underlying population representation of tone onset/offset and cellular response amplitudes.
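The bias index defined above can be written as a short function. This is a minimal sketch: the function name is hypothetical, and it assumes that On and Off response amplitudes are non-negative, with 0 standing for "no significant response".

```python
def on_off_bias_index(on_resp, off_resp):
    """OBI = (Off - On) / (Off + On).

    Returns -1 for an On-only cell, +1 for an Off-only cell, and an
    intermediate value for cells responding to both onset and offset.
    Assumption: amplitudes are non-negative, 0 meaning no response.
    """
    total = on_resp + off_resp
    if total <= 0:
        raise ValueError("cell has neither an On nor an Off response")
    return (off_resp - on_resp) / total
```

For example, a cell whose Off-R amplitude is three times its On-R amplitude has OBI = (3 − 1)/(3 + 1) = 0.5, that is, Off-biased but not Off-only.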

Fig. 4. L2/3 neurons show distinct On-R and Off-R and are differentially distributed across ACX areas.

(A) An example on-responsive neuron (arrow). Vertical dotted lines indicate tone onset and offset respectively. Light blue areas indicate tone duration. (B) An example off-responsive neuron. Scalebar: 10μm. (C) Histogram of cellular OBI values as a function of ACX fields. OBI = (Off-R – On-R) / (Off-R + On-R). (D) Cumulative distribution function (CDF) of OBI values other than −1 and 1. Wilcoxon rank sum test, A1 vs AAF, z=2.77, p=0.0056; A1 vs A2, z=4.41, p=1.02×10−5; DP vs AAF, z=1.93, p=0.053; DP vs A2, z=2.49, p=0.013. (E) Left: cartoon showing a linear model to predict the BF of On-R and Off-R with respect to the cells’ spatial locations. A direction is searched onto which the projection of the cell’s coordinates best explains the cell’s BF. Right: Goodness of fit of On-R and Off-R in cells of different ACX fields. (F) Relationship between On- and Off-SCs and pairwise distance on the neuronal level. Solid lines show median while the shading indicates the 95% confidence interval. The flanking panel shows CDF of on-SC and off-SC not regarding distance. ‘***’ indicates p<0.001. ‘**’ indicates p<0.01. A1, rank sum test, z=−13.6, p=4.33×10−42; AAF, z=−3.52, p=4.30×10−4; A2, z=−8.73, p=2.07×10−18; DP, z=−2.93, p=3.4×10−3.

To further confirm our results and to sample across layers, we implanted 16-channel linear multielectrode arrays into A1, spanning a cortical depth of 800μm. We first analyzed the local field potential (LFP) which reflects the combination of local neuronal activity and afferent input into A1 (Herreras, 2016; Katzner et al., 2009; Liu et al., 2015). We found that more tone frequencies evoked Off-R compared to On-R (Fig. S7A–C), consistent with the widespread nature of Off-R (Fig. 3). Moreover, distribution of OBI of all electrode contacts shifted towards Off-R (Fig. S7D). These results confirm that Off-R evokes a wide activation in A1 and that A1 responses are biased towards Off-R.

Prior electrophysiology studies reported a higher proportion of neurons showing both On- and Off-R than our imaging results (Joachimsthaler et al., 2014; Qin et al., 2007; Tian et al., 2013). To identify potential sources for this discrepancy we recorded single units (n=220) from A1 of awake mice and analyzed their On/Off-R (Fig. S7F–H). 200/220 units (91%) were responsive to either tone onset or offset. Among these units, 26% had only On-R, 57% had both On- and Off-R, and 7% had only Off-R. We classified neurons based on their spike shape (wide vs. narrow) reflecting putative excitatory and inhibitory units and analyzed their OBI. Both classes showed similar OBI distributions (Fig. S7G). Analyzing OBI across depth showed that OBI was depth-dependent, with deeper layer units more biased to On-R (Fig. S7H). Together these results suggest that A1 contains both On/Off-only neurons and that there is a depth-dependent distribution of these neurons consistent with sublaminar circuit differences in L2/3 (Meng et al., 2017).

Local tonotopy is heterogeneous for both On-R and Off-R in all areas

Both On-R and Off-R show large-scale tonotopy (Figs. 1, 2), while cellular frequency selectivity is heterogeneous in anesthetized A1 (Bandyopadhyay et al., 2010; Kanold et al., 2014; Rothschild et al., 2010). We tested if Off-R exhibited local tonotopy and if On-R and Off-R cells differed in local heterogeneity of frequency preference. We compared the degree to which On-R and Off-R are locally tonotopically organized by analyzing separate linear models between best frequency and spatial locations of cells (Fig. 4E). We found a low local tonotopy of frequency selectivity as the goodness of fit (R²) was low, consistent with prior studies (Bandyopadhyay et al., 2010; Maor et al., 2016; Rothschild et al., 2010). Moreover, the models showed a similar R² for On-R or Off-R across ACX areas, suggesting that the local heterogeneity of frequency selectivity between On-R and Off-R is similar within and across mouse ACX fields.
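A linear tonotopy model of this kind can be sketched as an ordinary least-squares fit of best frequency (BF) on cell position. The example below is illustrative only: the function name, the use of BF expressed in octaves, and the plain least-squares fit are assumptions, and the exact model used in the study may differ.

```python
import numpy as np

def fit_local_tonotopy(xy, bf_octaves):
    """Fit BF ~ a*x + b*y + c and report the gradient direction and R^2.

    xy         : array (n_cells, 2), cell coordinates (e.g. in microns).
    bf_octaves : array (n_cells,), best frequency on a log2 (octave) scale.
    Returns (direction, r2): the unit vector along which projected cell
    position best explains BF, and the goodness of fit of the linear model.
    Assumes BF varies across cells and the fitted gradient is non-zero.
    """
    xy = np.asarray(xy, dtype=float)
    bf = np.asarray(bf_octaves, dtype=float)
    design = np.column_stack([xy, np.ones(len(xy))])
    coef, *_ = np.linalg.lstsq(design, bf, rcond=None)
    pred = design @ coef
    ss_res = np.sum((bf - pred) ** 2)
    ss_tot = np.sum((bf - bf.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    gradient = coef[:2]
    direction = gradient / np.linalg.norm(gradient)
    return direction, r2
```

A field with a perfectly smooth frequency gradient would give R² close to 1, whereas a spatially intermingled arrangement of BFs gives a low R², which is the situation reported above.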

ACX areas differ in the spatial pattern of neuronal correlated On-R and Off-R activity

Our results indicate regional differences in cellular selectivity. To gain insight into the spatial distribution of ACX circuits we calculated pairwise SCs which are reflective of shared inputs (Shadlen and Newsome, 1998). In A1 On-R SCs are highest for nearby neurons and decrease with distance, consistent with results in anesthetized mice (Fig. 4F) (Winkowski and Kanold, 2013). Such a decrease is also present in A2 while DP shows a patchy distribution of On-R SCs and AAF shows a weak SC gradient. Off-SCs were larger than On-SCs in most areas except for DP. In A1 these differences between On-SC and Off-SC were widespread, while such differences were present in patchy areas in AAF (~150–175μm) and A2 (~50–275μm). We validated this result by computing the SC among chronically implanted linear electrode contacts and a similar correlation structure was seen where Off-SC was higher than On-SC over distance (Fig. S7E). These results show that Off-R neurons are more widespread among different cortical columns and along cortical depth, which could be due to a difference in the underlying intrinsic circuits or due to the spatial distribution of ascending input.
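For readers who want to reproduce this type of analysis, the sketch below computes pairwise SCs and summarizes them by distance. It assumes the conventional definition of signal correlation (the Pearson correlation between two cells' trial-averaged responses across stimuli); the function names and the binning scheme are hypothetical and not taken from the study.

```python
import numpy as np

def signal_correlations(tuning, xy):
    """Pairwise signal correlation and distance for every cell pair.

    tuning : array (n_cells, n_stimuli), trial-averaged response of each
             cell to each frequency/level combination (On or Off window).
    xy     : array (n_cells, 2), cell coordinates.
    Returns (sc, dist), one entry per unordered pair of cells.
    """
    tuning = np.asarray(tuning, dtype=float)
    xy = np.asarray(xy, dtype=float)
    corr = np.corrcoef(tuning)                 # cells are rows
    i, j = np.triu_indices(len(tuning), k=1)   # each pair once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)
    return corr[i, j], dist

def median_sc_by_distance(sc, dist, bin_edges):
    """Median SC within each distance bin [edge_k, edge_k+1)."""
    which = np.digitize(dist, bin_edges) - 1
    return np.array([
        np.median(sc[which == b]) if np.any(which == b) else np.nan
        for b in range(len(bin_edges) - 1)
    ])
```

Running the two functions separately on On-window and Off-window tuning matrices gives the On-SC and Off-SC curves that can then be compared bin by bin.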

Granger Causality analysis reveals areal differences in functional On/Off networks

The areal differences in SCs suggest different underlying neuronal networks. We sought to identify the functional networks in the different ACX areas by performing Granger Causality (GC) analysis separately among On-R and Off-R neurons (Francis et al., 2018; Friston et al., 2013; Granger, 1969; Oya et al., 2007; Sheikhattar and Babadi, 2016; Sheikhattar et al., 2018). GC analysis provides a data-driven framework for inferring causal interactions between neurons by statistically testing if a neuron’s activity can be predicted by the recent activity history of other neurons, and thus uncovering functional networks (Francis et al., 2018; Granger, 1969; Sheikhattar and Babadi, 2016). The causal interactions (GC-links) can take positive or negative signs reflecting correlated or anticorrelated neuronal activities, respectively (Francis et al., 2018). Our calcium indicator is expressed in excitatory neurons and thus we focused on positive GC links. An example of two GC linked neurons is shown in Fig. 5A. Note that the source trace preceded the target trace. Fig. 5B shows one example field of view with the most significant GC links labeled. We quantified the number, strength, length, and directionality of the GC links. In A1 and DP, Off-GC links outnumbered On-GC links, while the opposite was true in AAF and A2 (Fig. 5C). These differences indicate higher respective interconnectivity and are consistent with the differences in the relative numbers of On-R and Off-R neurons. In contrast, GC link strength (J-statistics) largely showed no difference except for AAF (Fig. 5D), suggesting both On-R and Off-R networks are strongly functionally connected. Since the majority of cells had either On-R or Off-R these results indicate that ACX areas contain separate interdigitated On-R and Off-R networks.
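The core idea of GC, testing whether the history of one trace improves prediction of another, can be illustrated with the classic bivariate least-squares formulation (Granger, 1969). This is a simplified stand-in, not the estimator used in the study, which reports J-statistics with false discovery rate control; the function names and the model order are assumptions made for this example.

```python
import numpy as np

def lagged(x, order):
    """Columns x[t-1], ..., x[t-order] for t = order .. len(x)-1."""
    return np.column_stack(
        [x[order - k: len(x) - k] for k in range(1, order + 1)]
    )

def granger_strength(source, target, order=2):
    """Log ratio of residual sums of squares: reduced vs. full model.

    Reduced model: target predicted from its own past only.
    Full model   : target predicted from its own past plus the source's past.
    Larger values mean the source's history helps predict the target.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    y = target[order:]
    ones = np.ones((len(y), 1))
    reduced = np.hstack([ones, lagged(target, order)])
    full = np.hstack([reduced, lagged(source, order)])

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return resid @ resid

    return np.log(rss(reduced) / rss(full))
```

If trace B follows trace A with a one-frame delay, `granger_strength(A, B)` is large while `granger_strength(B, A)` stays near zero, mirroring the observation that the source trace precedes the target trace. Turning such a statistic into significant links requires an explicit statistical test and multiple-comparison correction, which this sketch omits.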

Fig. 5. GC analysis reveals distinct On/Off sub-networks.

(A) Fluorescence time course of GC-linked cells. (B) Example field: On (blue) and Off (red) GC-links. Only GC links with J-statistics>0.95 shown for clarity. (C) Proportion of GC-links (false discovery rate: 0.001). More Off GC-links in A1 and DP (Wilcoxon rank sum test, A1 on vs off: p=2.53×10−7, z=−5.16; DP, p=1.55×10−4). More On GC-links in AAF and A2 (AAF p=5.44×10−7, z=4.55; A2 p=3.32 ×10−6, z=4.65). (D) J-statistics, a measure of GC-link strength. Only AAF shows a slightly higher On GC-link strength (Wilcoxon rank sum test, p=0.0175, z=2.38). (E) GC-link length. A1 contains shorter Off GC-links (Wilcoxon rank sum test, p=0.0022, z=3.06). (F) Distribution of direction of GC-links. The non-uniformity of the distributions was tested using Chi-square goodness-of-fit test. A1, on: p=0.043, off: p=1.08×10−22; AAF, on: p=1.48×10−7, off: p=7.77×10−4; A2, on: p=8.15×10−4, off: p=0.42; DP, on: p=0.89, off: p=0.17. On/Off distribution difference: Two-sample Kolmogorov-Smirnov test, A1: p=0.065; AAF: p=0.82; A2: p=0.68; DP: p=0.85.

We next extracted the spatial properties of GC links for On and Off networks. First, Off-GC-links tended to be shorter in A1 (Fig. 5E), suggesting that Off GC networks more densely cover the neural populations in A1 and are more spatially clustered. Other ACX fields showed no length differences (Fig. 5E). Since ACX areas have large-scale tonotopic maps, we next investigated if GC links also show a direction preference. Except for DP and A2 Off-R, the distributions of the GC-link directions significantly deviate from uniform distributions (Fig. 5F). In A1, AAF and A2, the ellipse-like distributions have their long axis, reflecting a spatial bias of cell pair interactions, roughly in parallel to the tonotopic axes. Thus, although local cellular populations lack precise tonotopy, there are regularities in their functional connectivity whose spatial patterns are closely related to the tonotopic axis. Moreover, we found no difference in the On/Off GC-link direction distribution (Fig. 5F). Lastly, the distribution of GC-link directions in AAF appeared to be narrower than in A1 or A2. We thus combined both On and Off GC-links and compared the spread in the direction of the short axis of the ellipse-like distributions. Indeed, AAF GC links were more narrowly distributed than in A1 (p=0.033) and the difference between AAF and A2 was close to significance (p=0.058). Thus, the spatial topology of the intrinsic functional architecture of L2/3 in different ACX fields differs. Together these results indicate that although On/Off-R populations are largely non-overlapping, they are spatially intermingled and parallel, consistent with the ‘salt-and-pepper’ structure in L2/3 of mouse ACX (Bandyopadhyay et al., 2010; Rothschild et al., 2010).

The On/Off responsivity of MGB terminals determines areal responses

So far, our results indicate that ACX contains distinct functional areas defined by differing cellular selectivity and intrinsic connectivity. Since ascending inputs to ACX neurons determine the initial cellular selectivity to sound dynamics we examined how the cellular On/Off selectivity emerged from ACX inputs. The main ascending inputs to ACX are provided by medial geniculate body (MGB) axons which terminate on excitatory neurons ranging from L2/3 to L6 with the strongest input in L4 (Ji et al., 2015). Since different ACX areas receive dominant input from different subdivisions of the MGB we speculate that these sets of synapses reflect separate pathways from the MGB. To test this hypothesis, we injected AAV expressing GCaMP6s into the MGB (n=7 mice) and imaged axon terminals in A1 (20 FOVs) (Fig. S8). We focused on A1 because of its distinct difference in On/Off-Rs and because prior in vivo patch clamp recordings showed that in A1 On- and Off-R are driven by non-overlapping sets of synapses (Scholl et al., 2010). MGB terminals showed prominent On-R or Off-R (Fig. 6A, B). Few MGB terminals showed both On- and Off-R (0.88% ± 1.06%). The proportion of MGB terminals showing either On-R or Off-R was similar (Fig. 6C) and most terminals were either On-only or Off-only (Fig. 6D). Thus, the majority of MGB terminals either relay On-R or Off-R suggesting the existence of distinct parallel MGB to A1 pathways. Terminals showing both On-R and Off-R had a more negative OBI compared to the distribution of OBI of the cellular response (Fig. 6D inset), suggesting that there exists a transformation of On- and Off-R selectivity from MGB terminals to A1 cellular responses, which are more Off-R biased. Moreover, given the prevalence of Off-R A1 neurons, this suggests a differential amplification of Off-Rs from MGB inputs to yield a larger fraction of Off-R neurons.

Fig. 6. MGB terminals in A1 largely show either On-R or Off-R and Off-R terminals show higher local signal correlations.

(A) On-responsive terminal. The image shows the contour of the terminal in red. Scalebar: 5μm. Light blue areas indicate tone duration. (B) Same as in (A) but showing an Off-responsive terminal. (C) The proportions of On- and Off-responding terminals are similar. On: 5.99% ± 6.72%; Off: 5.62% ± 6.00%; paired t-test, t(20)=0.34, p=0.74. (D) Histogram of OBI values of MGB terminals in A1. Inset shows CDFs of OBI values other than −1 and 1 from MGB terminals and A1 L2/3 neurons (Wilcoxon rank sum test, z=3.64, p=2.71×10^−4). (E) Individual MGB terminals in A1 show significantly larger On-Rs (Wilcoxon rank sum test, z=2.91, p=0.0036). (F) Overall On/Off-R amplitude (see Methods; Wilcoxon rank sum test, z=0.85, p=0.39). (G) Off-Rs show higher SC than On-Rs over distance (0–70μm). Dashed lines show the 95% confidence interval around the median. Right panel shows the cumulative distribution function of all On- and Off-SC. ‘***’ indicates p<0.001. (H) Goodness of fit of the linear tonotopy model was similar for On-R and Off-R in MGB terminals in A1.

To gain insight into the transformation, we compared the average strength of On/Off-R pooled across terminals. Terminal On-Rs were larger than Off-Rs (Fig. 6E), similar to the cellular responses (Fig. S6B). However, unlike cellular responses in A1, the On-R and Off-R MGB terminals had similar overall response amplitudes (Fig. 6F). This suggests that the Off-R dominance in A1 cells was not generated by stronger or more numerous Off-R MGB afferents.

Convergence and temporal synchrony of thalamic inputs can strongly influence cortical neurons (Bruno and Sakmann, 2006), which could lead to stronger cellular responses. We observed a distinct spatial SC structure in mesoscale (Fig. 3GI) as well as in cellular responses (Fig. 4F). These properties could result from spatially structured MGB input, and we found that MGB terminals had higher Off-SCs (Fig. 6G), consistent with the cellular data. We also investigated the distance dependence of On/Off-SC of MGB terminals and found that Off-SC was higher than On-SC over a distance of 0–70μm, indicating a larger spatial spread of terminal Off-R. These results suggest that although individual MGB terminals do not respond to tone offset more strongly than to tone onset, the spatial correlation structure of MGB inputs is transformed into cellular tuning in A1 and leads to a more spatially extensive representation of tone offset.

Lastly, we investigated whether there are tonotopic structures in MGB terminal responses. A linear model did not reveal a tonotopic structure in either On-R or Off-R in MGB terminals (Fig. 6H), consistent with reports that local On-R MGB projections to A1 show spatially heterogeneous tuning (Vasquez-Lopez et al., 2017). Together, our results suggest that the spatial mesoscale distribution of On/Off-R A1 neurons is largely inherited from the spatial distribution of On/Off-R MGB terminals.

Cortical inhibitory networks can amplify Off-R through disinhibition

The activity of cortical neurons is influenced by inhibition, and we hypothesized that the On/Off-selectivity of ACX L2/3 excitatory neurons is actively shaped by the local inhibitory network. To investigate this question, we focused on PV- and SOM-positive interneurons, which are thought to control the activity of L2/3 neurons via a disinhibitory circuit (Pfeffer et al., 2013). We crossed Thy1-GCaMP6s mice with either PV-cre or SOM-cre mice and injected AAV expressing mRuby and GCaMP6s under control of a FLEX switch sequence into the ACX of F1 animals. Thus, PV or SOM interneurons could be identified based on the nuclear red fluorescence signal while allowing simultaneous imaging of Thy1+ excitatory neurons and PV/SOM neural populations (Fig. 7A, C) (Thy1xPV-cre: n=8, 427 PV neurons; Thy1xSOM: n=6, 288 neurons). We presented 2-second tones and found that although some PV and SOM interneurons displayed typical On/Off-R similar to those seen in excitatory neurons, most interneurons displayed much slower temporal dynamics in their responses (Fig. 7A, B). The majority of PV neurons showed a slow decrease in fluorescence following tone onset (Fig. 7B). Although a subset of suppression responses showed a brief positive deflection immediately after tone onset, these were rarer than pure suppression responses (Fig. S9). SOM neurons showed similarly slow temporal responses, albeit positive in sign (Fig. 7D). To classify different response types, we performed k-means clustering on significant responses averaged across trials, pooling responses from both Thy1 and PV/SOM cells. We could identify 5 clusters with distinct temporal dynamics. Cluster 1 shows a sharp increase in fluorescence following tone onset and decays afterwards (Fig. 7E, ‘On’). Cluster 2 shows a more graded fluorescence increase, which is sustained during tone presentation (Fig. 7E, ‘On-sustained’). Cluster 3 shows an even slower rise with little plateau and decays following tone offset (Fig. 7E, ‘On-ramping’). Cluster 4 shows a sharp increase after tone offset and is categorized as typical Off-R (Fig. 7E, ‘Off’). Cluster 5 has dynamics similar to ‘On-ramping’ but opposite in sign (Fig. 7E, ‘Suppressed’). The proportion of responses assigned to each cluster differed between cell types (Fig. 7F). Thy1 responses are mostly ‘Off’, ‘On’ and ‘On-sustained’. PV interneurons mostly show ‘Suppressed’ responses, while SOM interneurons mostly show ‘On-ramping’ responses. These two response clusters show no difference in latency to half-peak amplitude (0.95±0.36s vs. 0.85±0.26s, p=0.21, Wilcoxon rank sum test). The opposite responses suggest that SOM neurons suppress PV neurons during prolonged tone activation, consistent with a disinhibition circuit scheme (Pfeffer et al., 2013). The inhibitory postsynaptic current from SOM to PV interneurons could last until after tone offset despite the cessation of firing of SOM interneurons. This prolonged suppression of PV neurons by SOM neurons potentially allows a decrease of PV inhibitory input onto local excitatory populations, which in turn could amplify Off-R.
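The clustering step can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the farthest-point seeding, the choice to normalize each trace to its maximum absolute amplitude before clustering, and all function names are assumptions made for this illustration, and the toy traces are synthetic.

```python
import numpy as np

def normalize_traces(traces):
    """Scale each trial-averaged trace by its maximum absolute amplitude."""
    traces = np.asarray(traces, dtype=float)
    peak = np.max(np.abs(traces), axis=1, keepdims=True)
    peak[peak == 0] = 1.0  # avoid dividing all-zero traces by zero
    return traces / peak

def farthest_point_init(data, k):
    """Deterministic seeding (an assumption made here for reproducibility)."""
    centroids = [data[0]]
    for _ in range(1, k):
        # distance of every trace to its nearest already-chosen centroid
        nearest = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[int(np.argmax(nearest))])
    return np.array(centroids, dtype=float)

def kmeans(data, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(data, k)
    labels = None
    for _ in range(n_iter):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Synthetic toy traces: three 'On'-like and two 'Suppressed'-like responses
on_shape = np.array([0, 0, 1, 1, 0, 0], dtype=float)
sup_shape = np.array([0, 0, -1, -2, -1, 0], dtype=float)
traces = np.array([1 * on_shape, 2 * on_shape, 3 * on_shape,
                   1 * sup_shape, 2 * sup_shape])
labels, centroids = kmeans(normalize_traces(traces), k=2)
```

In this toy case the amplitude normalization makes the three positive traces identical, so they fall into one cluster and the two negative traces into the other; with real data, 5 clusters would be requested, as in the text.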

Fig. 7. PV and SOM neurons show distinct temporal dynamics in response to prolonged tones.

(A) Example field of view showing both Thy1-GCaMP6s cells and PV-positive interneurons expressing GCaMP6s and mRuby. Scalebar: 10μm. Light blue areas indicate tone duration. (B) Example PV interneurons showing suppressed response (top), On-R (middle) and Off-R (bottom). (C) Example field of view showing both Thy1-GCaMP6s cells and SOM-positive interneurons expressing GCaMP6s and mRuby. (D) Example SOM interneurons showing slow ramping responses following tone onset (top, middle) and Off-R (bottom). (E) K-means clustering on responses of Thy1, PV and SOM cells. All traces were normalized to maximum absolute amplitude before being averaged within each cluster. Shaded regions show standard deviation. (F) Thy1, PV and SOM cells show distinct proportions of response types. (G) We propose that cortical On/Off-R result from largely segregated On/Off thalamic input and that the spatial pattern of these inputs determines the spatial layout of On/Off-R selective neurons. Further, Off-R cortical neurons have more recurrent connections that amplify the thalamic input compared to On-R circuitry. Black triangles represent On/Off-R neurons while gray triangles represent unresponsive neurons.

In summary, our results suggest that the spatial distribution of On/Off-R MGB terminals determines the spatial distribution of On- or Off-responsive A1 neurons and that Off-Rs are amplified compared to On-Rs due to disinhibition through suppression of PV interneurons by SOM interneurons (Fig. 7) as well as to increased local spatial clustering of Off-R MGB afferents (Fig. 7G).

Discussion

We show that the ACX encodes tone offset in a parallel, spatially extensive and yet globally tonotopically organized manner. We find distinct functional ACX areas characterized by distinct On/Off-selectivity on the population level. Thus, the cortical representation of spectral information is influenced by the temporal dynamics of spectrally static tones. GC analysis revealed that ACX areas contain intermingled On/Off networks within L2/3. Therefore, areal selectivity is due to both different numbers of On/Off-R neurons and distinct intracortical circuits. Distinct temporal dynamics in the responses of PV and SOM interneurons point to disinhibition as one mechanism that amplifies Off-R. Moreover, areal and cellular On/Off-R selectivity may arise from differences in MGB input, which could be further enhanced by spatially correlated activity of MGB terminals. Together, our results suggest that the differential dynamic responses originate from differential feedforward input from MGB and are further elaborated by different intrinsic excitatory and inhibitory circuits in different ACX regions. Thus, ACX areas operate in parallel to extract temporal information. Our results also demonstrate that Off-Rs are tonotopically organized on the mesoscale. The lack of Off-R tonotopy in prior studies (Baba et al., 2016) is likely due to Off-R being most prominent in awake animals (Fishman and Steinschneider, 2009; Joachimsthaler et al., 2014; Qin et al., 2007; Recanzone, 2000).

We here developed a method to define functional ACX areas based on temporal coactivation of pixels in WF datasets (Whiteway and Butts, 2017). This method is unbiased and unsupervised, requires no prior knowledge of the locations of cortical fields, and can be applied to arbitrary WF datasets.

Besides tone onset and offset, ACX neurons can also be sensitive to other dynamic aspects of sound such as amplitude/frequency modulation, sound duration, and frequency sweep rate (Baumann et al., 2015; He et al., 1997; Heil et al., 1992; Issa et al., 2017; Schreiner and Urbas, 1986). While frequency sweep rate is topographically organized in mouse ACX (Issa et al., 2017), our results show that Off-R are also topographically represented.

We found an extensive representation of tone offset in A1 and DP neurons. A1 neurons receive On/Off synaptic inputs shown to be mediated by non-overlapping sets of synapses (Scholl et al., 2010). We find that MGB terminals mostly have only On- or Off-R, suggesting that A1 neurons receive convergent input from such On- or Off-responsive MGB terminals. Further, Off-R MGB terminals do not outnumber On-R MGB terminals and MGB terminals have weaker Off-R, suggesting that the cellular Off-R dominance in A1 results from different On/Off-R input topology, i.e., the spatial distribution of connections. No evidence so far suggests that On- and Off-circuits have different quantal synaptic strength, and thus the cellular On/Off-R bias is more likely to result from differential convergence of connections. Together, these observations suggest the presence of local A1 circuits that amplify Off-R. Our results suggest that a disinhibitory circuit formed by SOM and PV cells could play this role. A multilayer nonlinear neural network has been proposed to underlie the wide variety of On/Off-Rs observed in A1 (Deneux et al., 2016). Our work suggests that the MGB-A1 circuit could underlie this transformation. Ideally, our conclusion would be strengthened by simultaneously imaging MGB terminals and ACX postsynaptic neurons. However, such an approach would still be limited, as corresponding terminals and postsynaptic neurons would not necessarily be located in the same imaging plane, making it difficult to identify presynaptic terminal and postsynaptic cell pairs unequivocally.

On- and Off-MGB terminals likely originate from different MGB subdivisions. ACX receives MGB inputs via lemniscal and non-lemniscal pathways. The lemniscal pathway arises from the ventral MGB (MGBv), which shows On-Rs (Aitkin and Webster, 1972; Hackett et al., 2011; Imig and Morel, 1983; Redies and Brandner, 1991). Multiple lines of evidence suggest that Off-Rs originate in non-lemniscal pathways. Off-Rs are predominantly observed in a sheet partially surrounding MGBv (He, 2001). Off-Rs can also originate from the dorsal and medial MGB (MGBd and MGBm). Indeed, we found that A2 and DP, which receive MGBd input (Lee and Sherman, 2008; Llano and Sherman, 2008), show Off-Rs. Moreover, the spatial extensiveness of Off-Rs is consistent with the broad projection from MGBm to ACX through L1 (Huang and Winer, 2000; Lee and Winer, 2008). Thus, non-lemniscal pathways likely provide tone offset information to ACX. We imaged terminals at roughly the same depth as neurons (~150μm), and thus these terminals might reflect a mixture of lemniscal and non-lemniscal pathways, as terminals from both MGBv and MGBd are present in L2 in A1 (Saldeitis et al., 2014). Our results show overlapping tonotopy of On-R and Off-R despite areal differences, suggesting that lemniscal and non-lemniscal pathways are coarsely aligned but show distinct spatial patterning.

The majority of responding A1 L2/3 neurons have either On-R or Off-R. Thus, the spatial heterogeneity of tonal responses in A1 L2/3 might be due to intermingled cells receiving differing thalamic input. In S1, functionally different thalamic inputs from the ventral posterior medial nucleus and the posterior medial nucleus are relayed to barrels and septa (Koralek et al., 1988; Lu and Lin, 1993), which are spatially separated and carry whisking-touch information (Yu et al., 2006) and temporal information on whisker movement, respectively (Ahissar et al., 2000). Our results suggest that, in contrast to S1, functionally different thalamic inputs to A1 are spatially interspersed. A1 L2/3 contains cells with distinct functional intracortical circuits and shows a sublaminar organization (Meng et al., 2017). It is possible that the distinct On/Off subnetworks we identified relate to these distinct subpopulations. Because recurrent inputs from subgranular layers are thought to be able to amplify thalamic inputs (Li et al., 2013; Miller et al., 2001; Somers et al., 1995), we speculate that Off-R cells receive stronger or more extensive inputs from subgranular layers. Prior electrophysiology studies identified a larger proportion of neurons responding to both tone onset and offset (Joachimsthaler et al., 2014; Qin et al., 2007). The discrepancy most likely results from differences in recording depth and the inclusion of multiunit activity, given the intermingled spatial distribution of On- and Off-R (Fig. 5B), which could bias electrophysiological studies. Although our single unit recordings (Fig. S7) showed a significant proportion of neurons responding to both tone onset and offset, there was a differential distribution of Off-R bias across the depth of A1, with superficial A1 cells being more Off-R biased. Thus, Off-Rs are more prevalent in superficial layers where we imaged (~150μm depth). This is also consistent with the laminar targets of lemniscal and non-lemniscal MGB afferents, with the latter being present in L1 (Llano and Sherman, 2008; Saldeitis et al., 2014), and with the finding that L5/6 neurons were less likely to generate Off-R (Volkov and Galazjuk, 1991). In addition, L2/3 shows a functional suborganization (Meng et al., 2017), with deep L3b receiving L4 inputs while superficial L2a receives little L4 input. Thus, it is likely that the recurrent connections in L2a further amplify the segregation of On/Off-R as well as the Off-R strength. Future studies linking intracortical connectivity with functional responses are needed to further explore these issues. Together, given that 2-photon imaging has much higher spatial resolution and lacks electrode bias, our imaging results most likely revealed a highly specific On/Off-R selectivity in upper L2/3.

We found that ~5% of neurons in A1 respond to tone onset/offset, consistent with a sparse representation of sound in rat A1 (Hromádka et al., 2008) and mouse A1 (Liang et al., 2018). However, previous imaging studies of A1 have reported 20–30% response rates (Issa et al., 2014; Kato et al., 2015). This discrepancy likely arises from sampling different neuronal populations. Issa et al. (2014) used cre-dependent GCaMP3 driven by Syn1-cre or Emx-cre. In V1, such labeled populations had fewer visual responses compared with OGB-1 labeled neurons (Zariwala et al., 2012), suggesting non-uniform population labeling. Kato et al. (2015) used viral expression of GCaMP6s under the Syn1 promoter, which densely labeled neurons close to the injection site. We used the GP4.3 line, which relatively uniformly labels about 70% of L2/3 pyramidal cells (Dana et al., 2014). The difference in response rate between our and prior imaging studies is likely due to labeling of different but potentially overlapping populations, the difference in calcium indicator (GCaMP3 vs GCaMP6s), expression profile (transgenic vs viral expression), and cell selection/inclusion criteria.

We find that L2/3 PV and SOM interneurons show temporal dynamics distinct from each other as well as from excitatory neurons. PV and SOM interneurons show mostly opposite signs of responses. Since the suppression of PV responses likely indicates a reduction in the firing rate (Forli et al., 2018), our data suggest suppression of PV activity by SOM interneurons, consistent with mostly suppressed responses of L2/3 PV neurons to prolonged tones (Kato et al., 2015) and the proposed cortical processing scheme of SOM->PV inhibition (Pfeffer et al., 2013). Finally, SOM neurons more readily inhibit PV neurons than local excitatory neurons (Cottam et al., 2013). We speculate that such inhibition could facilitate detection of changes in auditory streams, such as tone offset. The duration of inhibitory postsynaptic currents in PV cells could outlast firing of SOM cells, creating a window for elevated excitability in local pyramidal cells before PV activity returns to baseline. Furthermore, we find that SOM cells are active throughout tone presentation, in contrast to previous findings that SOM cells fire transiently, although this difference could be due to the animal’s state (Chen et al., 2015; Li et al., 2014). Thus, SOM cells are potentially important for auditory stream analysis and their interactions with PV neurons could facilitate change detection.

In conclusion, we have demonstrated a distinctly extensive parallel spatial representation of sound dynamics in ACX at multiple levels and we propose that this spatial pattern is determined by the meso- and micro-scale spatial layout of thalamic input and by distinct intracortical circuits.

STAR Methods

Contact for Reagent and Resource Sharing

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Patrick Kanold (pkanold@umd.edu).

Experimental Model and Subject Details

All procedures were approved by the University of Maryland’s Animal Care and Use Committee. We crossed CBA/CaJ (JAX stock #000654) mice with Thy1-GCaMP6s mice (JAX stock #024275, GP4.3; Dana et al., 2014) to obtain F1s, since C57BL/6 mice are homozygous for the Cdh23 allele ahl, which causes age-related hearing loss, while CBA/CaJ mice are homozygous for Ahl+, which spares them from the phenotype (Kane et al., 2012). F1s thus have no hearing loss and yet have uniform expression of GCaMP6s under the Thy1 promoter in excitatory neurons. We used adult mice of both sexes, 2 to 4 months old. For imaging PV or SOM neurons, we crossed Thy1-GCaMP6s mice with PV-cre (JAX #008069) or SOM-cre (JAX #013044) mice and injected ~30nl of AAV1.Syn.Flex.mRuby2.GSG.P2A.GCaMP6s.WPRE.SV40 (Addgene viral prep # 68720-AAV1; Rose et al., 2016) into the left ACX of the F1 animals. The resulting animals express innate GCaMP6s in Thy1 pyramidal cells while expressing GCaMP6s and mRuby in either PV or SOM interneurons.

Method Details

Chronic window implant

2–3 hours before surgery, 0.1cc dexamethasone (2mg/ml, VetOne) was injected subcutaneously to reduce brain swelling during craniotomy. Anesthesia was induced with 4% isoflurane (Fluriso, VetOne) using a calibrated vaporizer (Matrx VIP 3000). During surgery, the isoflurane level was reduced to and maintained at 1.5%−2%. Body temperature of the animal was maintained at 36.0 degrees Celsius during surgery. Hair on top of the animal's head was removed using Hair Remover Face Cream (Nair), after which Betadine (Purdue Products) and 70% ethanol were applied sequentially 3 times to the surface of the skin before the central part was removed. Soft tissues and muscles were scraped to expose the skull. Then a custom-designed 3D-printed stainless steel headplate was mounted over the left auditory cortex and secured with C&B-bond (Parkell). A craniotomy with a diameter of about 3.5mm was then performed over the left auditory cortex. A three-layered coverslip was used as the cranial window, made by stacking 2 pieces of 3mm coverslips (64–0720 (CS-3R), Warner Instruments) at the center of a 5mm coverslip (64–0700 (CS-5R), Warner Instruments) using optical glue (NOA71, Norland Products). The cranial window was quickly dabbed in kwik-sil (World Precision Instruments) before being mounted onto the brain with the 3mm coverslips facing down. After the kwik-sil cured (2–5min), C&B-bond was applied to secure the cranial window. Synthetic black iron oxide (Alpha Chemicals) was then applied to the hardened surface. 0.05cc Cefazolin (1 gram/vial, West Ward Pharmaceuticals) was injected subcutaneously when the entire procedure was finished. After the surgery, the animal was kept warm under a heat light for 30 minutes for recovery before being returned to its home cage. Medicated water (Sulfamethoxazole and Trimethoprim Oral Suspension, USP 200mg/40mg per 5ml, Aurobindo Pharms USA; 6ml solution diluted in 100ml water) was substituted for normal drinking water for 7 days before any imaging was performed.

Widefield imaging

Mice were affixed to a custom-designed head-post and restrained within a plastic tube. The head of the animal was held upright. Imaging was performed using an Ultima-IV two-photon microscope (Bruker Technologies) with an orbital nosepiece such that the illumination light was roughly perpendicular to the cranial window (rotation angle was ~60 degrees). As a result, the anterior-posterior axis was not parallel to the edge of the images. 470nm LED light (M470L3, Thorlabs Inc.) was used to excite green fluorescence. Images were acquired with StreamPix 6.5 software (Norpix) at 10Hz with a 100ms exposure time. In StreamPix, we specified the image size to be 400 by 400 with a spatial binning of 3.

Acoustic stimulus

Pure tones were generated with a custom MATLAB script. Each tone lasted 2 seconds with linear ramps of 5ms at the beginning and at the end of the tone. The amplitudes of the tones were calibrated to 75dB SPL with a Brüel & Kjær 4944-A microphone. During sound presentation, the sound waveform was loaded into an RX6 multi-function processor (Tucker-Davis Technologies (TDT)) and attenuated to the desired sound level by a PA5 attenuator (TDT). The signal was then fed into an ED1 speaker driver (TDT), which drove an ES1 electrostatic speaker (TDT). The speaker was placed on the right-hand side of the animal, 10cm away from the head, at an angle of 45 degrees relative to the midline. The presentation of tones with various combinations of frequencies and sound levels was randomized and controlled by a custom MATLAB program. The silent period between the 2-second tones was randomly chosen from a uniform distribution between 3 and 3.5 seconds. Frequencies of the tones varied from 4kHz to 83.0kHz with logarithmic spacing and a density of 2.28 tones per octave. Sound levels varied from 5dB SPL to 65dB SPL in steps of 15dB. Each stimulus was repeated 10 times. In total, the widefield imaging session for each animal lasted ~45min. For 2-photon imaging, 9 tones with equal logarithmic spacing between 4 and 64kHz were used at a single level of 60dB SPL. The tone duration was 2 seconds and each tone was repeated 10 times.
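The 2-photon stimulus set and the ramped tone waveform can be sketched in Python as follows. This is an illustration only (the stimuli in the study were generated in MATLAB); the function names and the sampling rate are assumptions that do not come from the text.

```python
import math

def log_spaced_frequencies(f_min_hz, f_max_hz, n_tones):
    """Return n_tones frequencies equally spaced on a logarithmic axis."""
    step = math.log2(f_max_hz / f_min_hz) / (n_tones - 1)  # octaves per step
    return [f_min_hz * 2 ** (i * step) for i in range(n_tones)]

def ramped_tone(freq_hz, duration_s=2.0, ramp_s=0.005, sample_rate_hz=200_000):
    """Pure tone with linear onset and offset ramps.

    sample_rate_hz is a hypothetical value chosen for this sketch.
    """
    n = int(round(duration_s * sample_rate_hz))
    n_ramp = int(round(ramp_s * sample_rate_hz))
    samples = []
    for i in range(n):
        # gain rises linearly over the first ramp and falls over the last
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz))
    return samples

# 9 tones between 4 and 64 kHz: four octaves in half-octave steps
freqs = log_spaced_frequencies(4000.0, 64000.0, 9)

# A short tone keeps this example fast; the study used 2-second tones
tone = ramped_tone(freqs[0], duration_s=0.02, ramp_s=0.005, sample_rate_hz=100_000)
```

Nine equally log-spaced tones across the four octaves from 4 to 64kHz correspond to half-octave steps, so every second tone is an octave above the one two positions earlier (4, 8, 16, 32 and 64kHz).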

2-Photon imaging of mouse ACX

A week after the cranial window implant, the animals were head-fixed in a custom-designed holder while 2-second tones were presented in a similar fashion as in the WF experiment. Fields of view were placed in the A1, AAF, A2 and DP regions at a depth of around 150μm and with a size of 369μm x 369μm. Imaging was performed with a B-SCOPE (Thorlabs Inc.) with the microscope body tilted around 45 to 50 degrees while the mouse head was held upright. The excitation wavelength was 920nm and images were collected with ThorImage software (Thorlabs Inc.) at a frame rate of 30Hz. A 16x Nikon objective was used (NA 0.80). For terminal imaging, the average imaging depth was around 140μm, comparable to the cellular data.

Injection of GCaMP6s virus in MGB

AAV1.hSyn1.mRuby2.GSG.P2A.GCaMP6s.WPRE.SV40 (Addgene 50942) virus was injected into MGB for axon terminal imaging in ACX. Micropipettes pulled with a long tapering tip (>3mm) were used for injection with Nanoject II (Drummond Inc.). The location of the left MGB was determined using mouse brain atlas (AP: 3.2mm; ML 2.1mm; DV: 3.0mm). Anesthesia was induced with 4% isoflurane and maintained at 1.5%. The skin over the skull was cut open and a small craniotomy was made to allow penetration from the dorsal side and the micropipette was lowered vertically into MGB. 150–200nl of undiluted virus was injected over 5min. After the injection, the skin was sutured back. 3–4 weeks after the injection, the cranial window was implanted over the left ACX as previously described.

Pupillometry

During 2P imaging, the arousal state of the animal was monitored through pupillometry (McGinley et al., 2015). In short, a camera was positioned around 20cm away from and towards the right eye of the head-fixed mouse. An ultraviolet LED was placed near the camera to restrict the pupil dilation to around 1/2 of the maximum dilation. The exposure time of the camera was set to 26ms and each frame was triggered by 2P “Frame Out” triggers and thus synchronized to 2P images.

Extracellular electrophysiology

We performed extracellular electrophysiology in CBA/CaJ and Thy1-GCaMP6s F1 crosses by either acutely recording from A1 neurons or chronically implanting electrodes. We used 16-channel linear arrays with 50μm spacing between adjacent contacts (A1×16–3mm-100–177-CM16, NeuroNexus) and a Neuralynx Cheetah system (32 channels). The acute surgery or implant surgery was similar to the cranial window implantation. In both cases, we first identified the location of A1 through widefield imaging of GCaMP6s and advanced the electrode to a depth of around 900μm, as read out from the manipulator. Fig. S7AE used data from chronic implantation while Fig. S7FG used single unit data pooled from both acute and chronic recordings. LFP signals and single units were acquired as previously described (Petrus et al., 2014). Briefly, LFPs were acquired at 30kHz (filtered between 1 and 6000Hz) and down-sampled by a factor of 100 (using the MATLAB built-in function ‘decimate’) before analysis. To calculate local field potential (LFP) responses, we took the difference of the mean LFP amplitude within a 50ms time window before and after tone onset/offset. To determine significance, we used a paired t-test separately for each frequency and for onset/offset, and a significant change above baseline was considered a significant response. For spike extraction, the raw headstage signal was filtered from 300Hz to 6000Hz and spikes were detected online with a threshold of 30μV.
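The LFP response measure can be sketched as follows, assuming the down-sampled rate of 300Hz (30kHz divided by 100), so that a 50ms window spans 15 samples. The function names are hypothetical, the signed mean is used as the "amplitude" (an assumption), and only the paired t statistic is computed; converting it to a p-value requires a t distribution with n−1 degrees of freedom.

```python
import math

def lfp_response(trial, event_idx, fs_hz=300.0, window_s=0.05):
    """Mean LFP in the window after an event minus the mean in the window before it."""
    n = int(round(window_s * fs_hz))  # 15 samples at 300 Hz
    before = trial[event_idx - n:event_idx]
    after = trial[event_idx:event_idx + n]
    return sum(after) / len(after) - sum(before) / len(before)

def paired_t_statistic(before, after):
    """t statistic of a paired t-test on per-trial window means."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Synthetic trial: flat baseline followed by a step at the event (sample 30)
trial = [0.0] * 30 + [1.0] * 30
response = lfp_response(trial, event_idx=30)

# Synthetic per-trial window means for four trials
t_stat = paired_t_statistic(before=[0.0, 0.0, 0.0, 0.0], after=[1.0, 2.0, 1.0, 2.0])
```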

Quantification and Statistical Analysis

Widefield image preprocessing

We performed three preprocessing steps before using the autoencoder for image segmentation. First, we downsampled the original image (400 by 400) by a factor of 4 using the MATLAB (2015b) built-in function ‘imresize’. The resultant image size was 100 by 100. Next, we whitened the image sequence. We first reshaped each image into a column vector and then stacked the vectors horizontally. Let $I_t$ denote the column vector corresponding to the image at time $t$, $M$ be the stacked matrix, and $N$ be the total number of images:

$$M = [I_1, I_2, \ldots, I_N]$$

We then subtracted the time-averaged image ($\langle I \rangle_t$) from all images:

$$\hat{M} = M - \langle I \rangle_t \times \underbrace{[1, 1, \ldots, 1]}_{N}$$

We then performed singular value decomposition on the sample covariance matrix of $\hat{M}$:

$$[U, S, V] = \mathrm{SVD}(\hat{M} \times \hat{M}' / N)$$

Then we obtained the whitened images using the following equation:

$$\tilde{M} = U \times (S + \lambda)^{-1/2} \times U' \times \hat{M}$$

where $\lambda$ is the regularization term. We picked $\lambda$ by first plotting the sorted eigenvalues in $S$ on a logarithmic scale; usually a fast initial drop-off followed by a relatively flat region can be observed. We picked $\lambda$ close to the turning point to preserve relevant variance and to avoid amplifying noise. We then fed $\tilde{M}$ into the autoencoder algorithm.
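The whitening steps can be sketched in NumPy as below. This is a minimal illustration, not the MATLAB code used in the study. It assumes the standard regularized form in which each eigen-direction is scaled by 1/sqrt(eigenvalue + λ); the function name and the random test images are hypothetical, and λ is passed in by hand rather than read off an eigenvalue plot.

```python
import numpy as np

def whiten_images(images, lam):
    """Whiten an image sequence.

    images: array of shape (n_frames, height, width)
    lam:    regularization term added to the eigenvalues (chosen by hand here)
    Returns the whitened matrix of shape (n_pixels, n_frames).
    """
    n_frames = images.shape[0]
    M = images.reshape(n_frames, -1).T.astype(float)  # each column is one image
    M_hat = M - M.mean(axis=1, keepdims=True)         # subtract the time-averaged image
    cov = M_hat @ M_hat.T / n_frames                  # sample covariance across pixels
    U, S, _ = np.linalg.svd(cov)                      # S holds the eigenvalues of cov
    scale = 1.0 / np.sqrt(S + lam)                    # assumed regularized scaling
    return (U * scale) @ U.T @ M_hat                  # U diag(scale) U' M_hat

# Small synthetic example: 200 frames of 3x3 noise images
rng = np.random.default_rng(0)
images = rng.normal(size=(200, 3, 3))
M_tilde = whiten_images(images, lam=1e-9)
cov_white = M_tilde @ M_tilde.T / images.shape[0]
```

With a negligible λ and more frames than pixels, the covariance of the whitened data is close to the identity matrix. A larger λ leaves low-variance directions under-scaled, which is the sense in which it avoids amplifying noise.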

Image Segmentation with constrained autoencoder

We used a dimensionality reduction technique to perform automatic image segmentation such that pixels with strong temporal correlations across the set of images were grouped together into single components (ROIs), following the formulation of Whiteway and Butts (2017). To perform this dimensionality reduction, we used an autoencoder neural network. The goal of this constrained autoencoder is to adjust the weights between the input layer and the hidden layer and those between the hidden layer and the output layer such that the output matches the input as closely as possible. For each time point $t$, the autoencoder takes the vector of pixel values $y_t \in \mathbb{R}^N$ and projects it down onto a lower dimensional space $\mathbb{R}^M$ using an encoding matrix $W_1 \in \mathbb{R}^{M \times N}$. A bias term $b_1 \in \mathbb{R}^M$ is added to this projected vector, so that the resulting vector $z_t \in \mathbb{R}^M$ is given by

$$z_t = W_1 y_t + b_1$$

The autoencoder then reconstructs the original activity $y_t$ by applying a decoding matrix $W_2 \in \mathbb{R}^{N \times M}$ to $z_t$ and adding a bias term $b_2 \in \mathbb{R}^N$, so that the reconstructed activity $\hat{y}_t \in \mathbb{R}^N$ is given by

$$\hat{y}_t = W_2 z_t + b_2$$

Since the dimensionality of $z_t$ is typically much smaller than that of $y_t$, $z_t$ should capture variations in $y_t$ that are shared across many pixels. The entries of $W_2$ then describe how each pixel is related to each dimension of $z_t$ (see Fig. 2C).

The weight matrices and bias terms, grouped as $\Theta = [W_1, W_2, b_1, b_2]$, are simultaneously fit by minimizing the mean square error between the observed activity $y_t$ and the predicted activity $\hat{y}_t$:

$$\hat{\Theta} = \arg\min_{\Theta} \frac{1}{2} \sum_t \| y_t - \hat{y}_t \|_2^2$$

To further enable interpretability of the results, we constrained the weights $W_2$ to be non-negative, as one could flip the signs of both spatial and temporal components arbitrarily. This also ensured that all pixels in a given ROI always increase or decrease in intensity together, depending on the sign of $z_t$. We also tied the weights such that $W_2 = W_1^T$. Thus, there was essentially only one spatial weight matrix.

This version of the autoencoder is closely related to principal components analysis (PCA) (Bengio et al., 2013). However, PCA is an inadequate technique for automatic image segmentation since it does not in general result in spatially localized ROIs, due to the orthogonality constraints imposed by the PCA model. An approach similar to our non-negatively constrained autoencoder is to use non-negative matrix factorization (NNMF) on the preprocessed image sequence. NNMF constrains both the spatial maps and the temporal activations to be non-negative, whereas the constrained autoencoder (RLVM) constrains only the spatial maps to be non-negative. The NNMF ROIs also failed to be spatially localized. Finally, to solve the constrained minimization problem above, we used the spectral projected gradient method, a constrained variant of gradient descent (Schmidt et al., 2009).
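The tied-weight, non-negative autoencoder can be illustrated with the NumPy sketch below. It is not the authors' implementation: it substitutes plain projected gradient descent with a backtracking step size for the spectral projected gradient method, and the function names, initialization and synthetic data are assumptions. Rows of `W` play the role of the ROI spatial maps (so `W` is the encoding matrix and its transpose is the decoding matrix).

```python
import numpy as np

def loss_and_grads(W, b1, b2, Y):
    """Tied-weight linear autoencoder: Z = W Y + b1, Y_hat = W.T Z + b2."""
    Z = W @ Y + b1[:, None]
    R = W.T @ Z + b2[:, None] - Y        # reconstruction residual
    loss = 0.5 * np.sum(R ** 2)
    grad_W = Z @ R.T + (W @ R) @ Y.T     # decoder use plus encoder use of W
    grad_b1 = (W @ R).sum(axis=1)
    grad_b2 = R.sum(axis=1)
    return loss, grad_W, grad_b1, grad_b2

def fit_constrained_autoencoder(Y, n_rois, n_steps=200, seed=0):
    """Projected gradient descent with backtracking (a stand-in optimizer)."""
    rng = np.random.default_rng(seed)
    n_pixels = Y.shape[0]
    W = np.abs(rng.normal(scale=0.1, size=(n_rois, n_pixels)))  # non-negative start
    b1 = np.zeros(n_rois)
    b2 = np.zeros(n_pixels)
    loss, gW, gb1, gb2 = loss_and_grads(W, b1, b2, Y)
    history = [loss]
    for _step in range(n_steps):
        step = 1.0
        for _try in range(40):
            W_new = np.maximum(W - step * gW, 0.0)  # project onto W >= 0
            b1_new = b1 - step * gb1
            b2_new = b2 - step * gb2
            new = loss_and_grads(W_new, b1_new, b2_new, Y)
            if new[0] < loss:                       # accept only if the loss drops
                W, b1, b2 = W_new, b1_new, b2_new
                loss, gW, gb1, gb2 = new
                break
            step *= 0.5                             # otherwise shrink the step
        history.append(loss)
    return W, b1, b2, history

# Synthetic data: 6 pixels by 50 frames
rng = np.random.default_rng(1)
Y = np.abs(rng.normal(size=(6, 50)))
W, b1, b2, history = fit_constrained_autoencoder(Y, n_rois=2, n_steps=20)
```

Because a step is accepted only when it lowers the loss, the recorded loss never increases, and the clipping at zero keeps every spatial weight non-negative throughout the fit.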

To perform image segmentation with this method, we must first specify the number of ROIs (the dimensionality of $z_t$). We determined the appropriate number of ROIs using cross-validation by first fitting the parameters of the autoencoder on 75% of the frames from the image sequence (training data), and then reconstructing the remaining 25% of the images (testing data) using the autoencoder. We then calculated the correlation between the true and reconstructed images on the testing data as a measure of goodness of fit. In Fig. S3A, we show that with an increasing number of ROIs, the correlation from the testing data increases monotonically and roughly plateaus after ~50 ROIs. We also performed fitting on the entire image sequence and plotted the correlation (Fig. S3A, blue curve). A similar monotonic increase is observed, and with 50 or more ROIs the correlation value is above 0.8, which is acceptable considering that the full image sequence consisted of more than 28,000 images. Another criterion we used to choose the number of ROIs was the total spatial area covered by the ROIs. An increasing portion of the total area is covered with an increasing number of ROIs (Fig. S3B), and the total area covered by 50 ROIs is close to maximum coverage. Given these results, we typically used 50 ROIs in the autoencoder.
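The cross-validation procedure can be sketched as follows. For brevity, an orthonormal linear basis obtained by SVD (i.e., PCA) stands in for the constrained autoencoder, so only the 75%/25% split and the test-set correlation mirror the text. The random (rather than contiguous) split, the function names and the synthetic data are assumptions.

```python
import numpy as np

def split_frames(n_frames, train_fraction=0.75, seed=0):
    """Randomly split frame indices into training and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_frames)
    n_train = int(round(train_fraction * n_frames))
    return order[:n_train], order[n_train:]

def fit_linear_basis(Y_train, n_components):
    """Stand-in model: leading left singular vectors of the centered training data."""
    mean = Y_train.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(Y_train - mean, full_matrices=False)
    return U[:, :n_components], mean

def reconstruction_correlation(Y, basis, mean):
    """Correlation between true and reconstructed pixel values."""
    Y_hat = basis @ (basis.T @ (Y - mean)) + mean
    return np.corrcoef(Y.ravel(), Y_hat.ravel())[0, 1]

# Synthetic data: 20 pixels, 400 frames, 3 underlying components plus weak noise
rng = np.random.default_rng(0)
Y = rng.normal(size=(20, 3)) @ rng.normal(size=(3, 400)) + 0.01 * rng.normal(size=(20, 400))

train_idx, test_idx = split_frames(Y.shape[1])
scores = {}
for k in (1, 3):
    basis, mean = fit_linear_basis(Y[:, train_idx], k)
    scores[k] = reconstruction_correlation(Y[:, test_idx], basis, mean)
```

Repeating the loop over a range of component counts and plotting `scores` against that count gives the kind of curve used to look for a plateau.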

Widefield On-R and Off-R amplitude

To determine response amplitude, first the temporal trace from each trial was normalized to percentage change with respect to baseline fluorescence:

$$\text{normalized trace at time } t = \frac{F_t - F_0}{F_0}$$

where $F_0$ is the baseline, taken as the most frequent value in the histogram of the trace under the assumption that the baseline is stable. For On-R amplitude, we averaged the normalized trace from 200–500ms after tone onset with the baseline from the normalized trace subtracted. For Off-R, we averaged the normalized trace from 200–500ms after tone offset and subtracted the average of the same trace over the 0–200ms immediately before tone offset. The 200–500ms window was sufficient to capture the rising phase as well as the peak of the increase in fluorescence in typical On/Off-Rs.
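The amplitude windows described above can be sketched for a single trial as follows. This is an illustration under stated assumptions rather than the authors' implementation: fluorescence values are rounded to integers before taking the mode as the baseline, the windows are treated as half-open intervals, and because the normalized trace is zero at the baseline by construction, no further subtraction is applied to the On-R. All function and parameter names are hypothetical.

```python
import statistics


def window_mean(values, times_ms, start_ms, stop_ms):
    """Mean of the samples whose time stamps fall in [start_ms, stop_ms)."""
    selected = [v for v, t in zip(values, times_ms) if start_ms <= t < stop_ms]
    return sum(selected) / len(selected)


def on_off_amplitudes(trace, times_ms, onset_ms, offset_ms):
    """Return (On-R, Off-R) amplitudes of one trial as fractional change."""
    # Baseline F0: most frequent (rounded) fluorescence value in the trace.
    f0 = statistics.mode([round(v) for v in trace])
    normalized = [(f - f0) / f0 for f in trace]
    # On-R: mean over 200-500 ms after tone onset.
    on_r = window_mean(normalized, times_ms, onset_ms + 200, onset_ms + 500)
    # Off-R: mean over 200-500 ms after offset minus mean over the
    # 200 ms immediately before offset.
    off_r = (window_mean(normalized, times_ms, offset_ms + 200, offset_ms + 500)
             - window_mean(normalized, times_ms, offset_ms - 200, offset_ms))
    return on_r, off_r
```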

Field Parcellation

We assigned ROIs to different ACX fields based on the known tonotopic structure revealed with optical approaches (Issa et al., 2014; Tsukano et al., 2015). The ACX of mice contains several fields, including A1, AAF and the Ultrasonic Field (UF), which are characterized by the presence of tonotopic gradients in the On-R (Stiebler et al., 1997). Tonotopy also exists in the secondary area A2, albeit on a compressed scale (Issa et al., 2014). First, we identified A1 and UF ROIs based on their two tonotopic axes, one running from the caudal side to the dorsomedial side (low to high) and the other, sharing the same low-frequency area, running from the caudal to the ventrolateral side (Issa et al., 2014). The example A1 and UF ROIs (Fig. 2I-O) show the progression of frequency selectivity along the two tonotopic axes. We use ‘UF’ and ‘high A1’ to distinguish between the two spatially distinct areas that are high-frequency selective, although both are considered primary auditory cortex. We also found a subset of ROIs located dorsoposterior to A1, which we assigned as DP. They showed relatively weak On-Rs but prominent Off-Rs (Fig. 2M). We performed parcellation of ROIs in all animals studied, and a similar spatial layout of A1, UF, AAF, A2 and DP was robustly observed.

Signal correlation among ROIs

We used the corrected signal correlation (SC) for all our calculations because of the limited number of repeats and the strong tendency of nearby pixels to covary in time (Rothschild et al., 2010; Winkowski and Kanold, 2013). The basic idea is that the uncorrected SC equation contains products of the responses of the two ROIs in question on the same trial, and these terms also appear in the noise correlation equation. These products therefore reflect, to some extent, the covariation of ROIs regardless of stimulus presentation and should be excluded from the SC calculation. The denominator in the equation was adjusted accordingly to account for the reduced number of terms summed in the numerator.

In Fig. 3G, H, we calculated SC among selected ROIs that were located dorsal to A1 and UF, respectively. These ROIs have centers within ~450 μm of the A1 and UF ROIs in the rostrocaudal direction but are located dorsally. We then calculated pairwise SCs among all these ROI pairs and plotted them as a function of distance (Fig. 3I).

On- and Off-tonotopy

To establish On- and Off-tonotopy, the thresholds of WF On-R and Off-R were first determined manually (Fig. S1, white solid lines). Baseline-subtracted WF images following tone onset or offset were then obtained at the identified threshold. Next, a homomorphic filter was applied to the images to correct for uneven illumination. Finally, 95th-percentile contour lines of the responses were extracted and overlaid to demonstrate the systematic movement of the activated area as a function of tone frequency (Fig. 1C, D, Fig. S3).

2-Photon imaging data analysis

First, motion correction was performed with the TurboReg plugin (Thevenaz et al., 1998). In a subset of experiments, motion correction was performed using the Suite2P package (Pachitariu et al., 2016). ROIs were drawn manually using a custom-written GUI. A ring was placed on each cell soma to extract the raw fluorescence trace, while a circular region of 20μm radius was used to extract the nearby neuropil signal (excluding the soma). We used the following equation to correct for neuropil contamination of each cell:

$$F_{\text{corrected}}(t) = F_{\text{cell}}(t) - 0.8 \times F_{\text{neuropil}}(t)$$

The correction coefficient (0.8) was measured from the collected 2P dataset by taking the ratio of the intensity of a non-radial blood vessel to the intensity of adjacent neuropil containing no neurons. To calculate ΔF/F, the baseline of each cell was determined by constructing a histogram of all fluorescence intensities over time and finding the peak of the histogram; the corresponding fluorescence intensity value was used as the estimate of the fluorescence baseline. This procedure rests on two assumptions. First, we assume the baseline is constant over time, which we generally found to be true given our relatively short imaging sessions (~9 min). Second, we assume that responses in ACX are sparse (Hromádka et al., 2008), so the baseline value should be observed most often and therefore appear as the peak of the histogram. This procedure was generally robust and generated ΔF/F changes over a reasonable range. If the procedure returned a negative baseline value, suggesting that the soma fluorescence was lower in intensity than the surrounding neuropil, the cell was excluded from further analysis. We then calculated ΔF/F using the following equation:

$$\frac{\Delta F}{F}(t) = \frac{F_{\text{corrected}}(t) - \text{baseline}}{\text{baseline}}$$
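The neuropil correction and histogram-peak baseline can be sketched as below. This is a minimal sketch under assumptions, not the authors' code: the histogram bin width is a hypothetical parameter, the baseline is taken as the value of the most populated bin, and cells with a zero baseline are excluded together with negative ones to avoid division by zero.

```python
from collections import Counter

NEUROPIL_COEFF = 0.8  # correction coefficient reported in the text


def correct_neuropil(f_cell, f_neuropil, coeff=NEUROPIL_COEFF):
    """Subtract the scaled neuropil trace from the somatic trace."""
    return [c - coeff * n for c, n in zip(f_cell, f_neuropil)]


def histogram_peak(values, bin_width=1.0):
    """Return the value of the most populated histogram bin (assumed binning)."""
    counts = Counter(round(v / bin_width) for v in values)
    # On ties, prefer the lower bin so the result is deterministic.
    peak_bin = max(counts, key=lambda b: (counts[b], -b))
    return peak_bin * bin_width


def delta_f_over_f(f_cell, f_neuropil, bin_width=1.0):
    """Return the dF/F trace, or None if the cell should be excluded."""
    corrected = correct_neuropil(f_cell, f_neuropil)
    baseline = histogram_peak(corrected, bin_width)
    if baseline <= 0:
        return None  # soma dimmer than neuropil: exclude the cell
    return [(f - baseline) / baseline for f in corrected]
```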

To determine whether a cell responded significantly to sound onset or offset, we first determined the response amplitude in the ΔF/F trace by finding the maximum change within 1 second after sound onset or offset and averaging over a small window (±2 frames) around the time of the maximum to account for noisy fluctuations in the trace. Then the 95% confidence interval (CI) of the median response amplitude was constructed through a bootstrapping procedure (resampling 1000 times); if the lower CI bound exceeded 1.5 times the standard deviation of the baseline fluctuation (5 frames or ~150ms before sound onset/offset), the cell was considered significantly on/off-responsive. Response significance was determined separately for each frequency and sound level combination and separately for On-R and Off-R. Neuropil and MGB terminal signals were processed with the same procedure. Unlike cellular ROIs, MGB terminal ROIs were obtained with Suite2P in an automated fashion.
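The bootstrap criterion above can be sketched as follows. This is an illustration under assumptions: resampling is taken to be over trial repetitions, a percentile interval is used, and the function names and the fixed random seed are hypothetical conveniences rather than details from the source.

```python
import random
import statistics


def bootstrap_median_ci(samples, n_boot=1000, ci=0.95, seed=0):
    """Percentile bootstrap confidence interval of the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(samples, k=len(samples)))
        for _ in range(n_boot)
    )
    lo_idx = max(0, round(n_boot * (1 - ci) / 2))
    hi_idx = min(n_boot - 1, round(n_boot * (1 + ci) / 2) - 1)
    return medians[lo_idx], medians[hi_idx]


def is_responsive(trial_amplitudes, baseline_sd, factor=1.5, n_boot=1000):
    """True if the lower CI bound exceeds factor x the baseline SD."""
    lower, _ = bootstrap_median_ci(trial_amplitudes, n_boot=n_boot)
    return lower > factor * baseline_sd
```

Here `trial_amplitudes` would hold one response amplitude per repetition for a given frequency and sound level, and `baseline_sd` the standard deviation of the pre-stimulus frames.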

To classify the different response types (Fig. 7E), we performed k-means clustering on the responses to each frequency averaged across repetitions, pooling traces from Thy1 (including traces from F1s of CBA/CaJ and Thy1-GCaMP6s crosses), PV and SOM neurons. Clustering was confined to statistically significant responses. We used correlation as the distance measure, so the clustering disregarded the absolute amplitude of the traces. We chose 5 clusters to sufficiently encompass the different response types encountered.

Off-R Bias Index (OBI)

OBIs were calculated by first averaging On-R and Off-R for responding neurons over frequency and repeats, and then applying the following equation:

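$$\mathrm{OBI} = \frac{\langle R_{\text{off}} \rangle - \langle R_{\text{on}} \rangle}{\langle R_{\text{off}} \rangle + \langle R_{\text{on}} \rangle}$$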

where the angle brackets denote average over tone frequency and repeats.
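As a small worked illustration of the index, the sketch below computes the OBI from lists of On-R and Off-R amplitudes pooled over tone frequency and repeats. The function name and the flat-list input layout are assumptions; with non-negative amplitudes the index lies between -1 (pure On-R) and +1 (pure Off-R).

```python
def off_bias_index(on_responses, off_responses):
    """OBI from On-R and Off-R amplitudes pooled over frequency and repeats."""
    mean_on = sum(on_responses) / len(on_responses)
    mean_off = sum(off_responses) / len(off_responses)
    return (mean_off - mean_on) / (mean_off + mean_on)
```

For example, a neuron with a mean On-R of 1 and a mean Off-R of 3 (arbitrary units) has an OBI of (3 - 1) / (3 + 1) = 0.5.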

Granger Causality analysis

The notion of causality proposed by Granger (Granger, 1969) aims at capturing the two fundamental principles of temporal predictability and the precedence of cause over effect. In order to capture the functional dependencies within a neuronal ensemble and the sparsity of interactions, we employ sparse multivariate autoregressive models. We introduce a measure of GC which accounts for sparse interactions, estimate the model parameters using fast optimization methods, and perform statistical tests to assess the significance of possible GC interactions, while controlling the false discovery rate (FDR) to avoid spurious detection of GC links.

We used the same framework as in (Sheikhattar and Babadi, 2016) for our Granger Causality (GC) measurement. In order to infer GC patterns for the two On/Off conditions, we separate the responses to the onset and offset inputs and pool across all the tone frequencies, thereby treating them as implicit repetitions of the same stimulus condition. In what follows, we present our modeling, parameter estimation and GC inference procedure.

Modeling:

Consider a sequence of calcium indicator fluorescence measurements from a set of $C$ neurons indexed by $c = 1, 2, \ldots, C$ within a slice, denoted by $\{y_{r,n}^{(c)}\}_{r=1:R,\; n=1:N}^{c=1:C}$, over time bins $n = 1, \ldots, N$ and across $R$ trial repetitions indexed by $r = 1, \ldots, R$. We adopt a sparse vector autoregressive (VAR) framework (Valdés-Sosa et al., 2005) for modeling the slow-decaying and transient dynamics of the calcium fluorescence signals as well as the cross-dependencies among the neurons.

Suppose that the fluorescence observation vector of neuron $(c)$ at the $r$-th repetition is represented by $y_r^{(c)} := [y_{r,1}^{(c)}, \ldots, y_{r,N}^{(c)}]'$, and let $\bar{y}^{(c)} := [y_1^{(c)'}, y_2^{(c)'}, \ldots, y_R^{(c)'}]'$ denote the zero-mean total observation vector, containing the set of all observation vectors $y_r^{(c)}$ from all trials $r = 1, \ldots, R$.

The effective neural covariates taken into account in our models are each neuron’s self-history of activity and the history of activities of other neurons in the ensemble. We consider a lag of $L$ samples within which the possible neuronal interactions may occur. Then, we segment $L$ into $M$ windows of lengths $W_1, W_2, \ldots, W_M$ such that $\sum_{i=1}^{M} W_i = L$. Let $b_m := \sum_{l=1}^{m} W_l$ for $m = 1, \ldots, M$, and $b_0 = 0$. Let

$$h_{r,n,m}^{(c)} := \frac{1}{W_m} \sum_{k=n-b_m}^{n-1-b_{m-1}} y_{r,k}^{(c)} \qquad (1)$$

represent the average activity of neuron $c$ within the $m$-th window lag of length $W_m$ with respect to time $n$ and at trial $r$. We can then define the vector of history covariates from neuron $(c)$, effective at time $n$ and trial $r$, as $h_{r,n}^{(c)} := [h_{r,n,1}^{(c)}, h_{r,n,2}^{(c)}, \ldots, h_{r,n,M}^{(c)}]'$. Next, let $x_{r,n} := [h_{r,n}^{(1)'}, h_{r,n}^{(2)'}, \ldots, h_{r,n}^{(C)'}]'$ denote the vector of covariates from all neurons at time $n$ and trial $r$.
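The construction of the windowed history covariates can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes that the $m$-th window covers exactly $W_m$ samples, from $n - b_m$ through $n - 1 - b_{m-1}$, uses 0-based sample indices, and introduces hypothetical function names.

```python
def history_covariates(y, n, window_lengths):
    """Window-averaged history of one neuron's trace y, relative to sample n.

    window_lengths holds W_1..W_M; the m-th covariate is the mean of the
    samples y[n - b_m] .. y[n - 1 - b_(m-1)], where b_m = W_1 + ... + W_m.
    """
    if n < sum(window_lengths):
        raise ValueError("not enough history before sample n")
    covariates = []
    b_prev = 0
    for w in window_lengths:
        b = b_prev + w
        window = y[n - b:n - b_prev]  # exactly w samples
        covariates.append(sum(window) / w)
        b_prev = b
    return covariates


def covariate_vector(traces, n, window_lengths):
    """Stack the history covariates of all neurons into one vector x_n."""
    x = []
    for y in traces:
        x.extend(history_covariates(y, n, window_lengths))
    return x
```

With $M$ windows and $C$ neurons, `covariate_vector` returns $MC$ entries, matching the row length of the covariate matrix used in the VAR model.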

In order to represent the covariates in a more compact form, we consider the $N \times MC$ matrix $X_r := [x_{r,1}, x_{r,2}, \ldots, x_{r,N}]'$, which contains in its rows the covariate vectors at all times $n = 1, \ldots, N$ within trial $r$. Finally, let $\bar{X} := [X_1', X_2', \ldots, X_R']'$ represent the matrix of all covariates with standardized columns (i.e., zero-mean columns with unit norm), capturing the covariates $X_r$ for all the trials $r = 1, \ldots, R$. The VAR model can then be expressed as:

$$\bar{y}^{(c)} = \bar{X}\omega^{(c)} + \bar{\varepsilon}^{(c)} \qquad (2)$$

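where $\bar{\varepsilon}^{(c)} := [\varepsilon_1^{(c)'}, \varepsilon_2^{(c)'}, \ldots, \varepsilon_R^{(c)'}]' \sim \mathcal{N}(0, {\sigma^{(c)}}^{2} I)$ is a zero-mean Gaussian noise vector of size $RN$ with variance ${\sigma^{(c)}}^{2}$, and $\omega^{(c)}$ is a parameter vector accounting for the interactions in the network, for $c = 1, 2, \ldots, C$.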

In agreement with the parsing of the covariates in the matrix $\bar{X}$, the parameter vector $\omega^{(c)} := [\omega^{(c,1)'}, \omega^{(c,2)'}, \ldots, \omega^{(c,C)'}]'$ in Eq. (2) is composed of a collection of cross-history dependence vectors $\{\omega^{(c,\tilde{c})}\}_{\tilde{c}=1:C}$, where $\omega^{(c,\tilde{c})}$ represents the contribution of the history of neuron $(\tilde{c})$ to the activity of neuron $(c)$ via the corresponding covariate vector $h_{r,n}^{(\tilde{c})}$ encoded in the matrix $\bar{X}$. In particular, the component $\omega^{(c,c)}$ is important for capturing the slow calcium fluorescence decay in an autoregressive fashion, thereby excluding the transient effects of fluorescence decay from the GC analysis.

Next, we invoke the hypothesis of sparsity in the interactions among the neurons in the ensemble. In our model, the sparsity of the interactions can be captured through the sparsity of the parameter vector $\omega^{(c)}$: when only very few components of $\omega^{(c)}$ are non-zero, neuron $(c)$ is only affected by the activity history of a few neurons in the ensemble. In addition, as the dimension of the parameter vector, given by $MC$, scales with the network size $C$, the hypothesis of sparsity enables the detection of salient interactions within a large network and thereby mitigates overfitting, especially when the observations are noisy and trials are limited in number.

Parameter Estimation:

In order to define a framework for inferring a possible GC link $(\tilde{c} \mapsto c)$, two nested models are taken into account: 1) the VAR model in Eq. (2), where the contributing covariates from all the neurons are taken into account, referred to as the full model, and 2) the same model in which the covariates and parameters of a single neuron $(\tilde{c})$ on neuron $(c)$, $\tilde{c} \neq c$, are excluded, to which we refer as the reduced model. The parameters and covariates associated with the reduced model are denoted by $\omega^{(c \setminus \tilde{c})}$ and $\bar{X}^{\setminus \tilde{c}}$, respectively.

The sparse parameter vector associated with either of the two models can be estimated by solving an l1-regularized maximum likelihood (ML) problem for each neuron as follows:

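$$\hat{\omega} = \arg\min_{\omega} \; \frac{1}{2} \left\| \bar{y}^{(c)} - X\omega \right\|_2^2 + \gamma \left\| \omega \right\|_1 \qquad (3)$$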

where $X$ takes the two values $\bar{X}$ and $\bar{X}^{\setminus \tilde{c}}$ for the full and reduced models, respectively, the $\ell_1$-norm is defined as $\|\omega\|_1 := \sum_{m=1}^{M} |\omega_m|$, and $\gamma \geq 0$ is a regularization parameter tuning the sparsity level, which can be selected based on analytical results on $\ell_1$-regularized ML problems or via cross-validation. Given the parameter estimate $\hat{\omega}$, the corresponding variance associated with the model can be computed as $\hat{\sigma}^2 = \frac{1}{NR} \|\bar{y} - \bar{X}\hat{\omega}\|_2^2$.
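One way to solve an $\ell_1$-regularized least-squares problem of the form in Eq. (3) is cyclic coordinate descent with soft-thresholding, sketched below. The source does not specify which optimizer was used, so this choice is an assumption made purely for illustration; the function names, the fixed iteration count, and the list-of-rows matrix layout are likewise hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (proximal operator of the l1 norm)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, gamma, n_iter=200):
    """Minimize 0.5 * ||y - X w||^2 + gamma * ||w||_1 over w."""
    n_rows, n_cols = len(X), len(X[0])
    w = [0.0] * n_cols
    residual = list(y)  # equals y - X w for the initial w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(n_iter):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that ignores w[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n_rows))
            new_w = soft_threshold(rho, gamma) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def residual_variance(X, y, w):
    """Mean squared residual, i.e. ||y - X w||^2 divided by len(y)."""
    res = [yi - sum(xij * wj for xij, wj in zip(row, w))
           for row, yi in zip(X, y)]
    return sum(r * r for r in res) / len(y)
```

In the setting of the text, `len(y)` plays the role of $NR$, and the same routine would be run once with the full covariate matrix and once with the columns of neuron $\tilde{c}$ removed.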

Inference:

The conventional measures of GC are based on ML estimates of the VAR parameters, and not the regularized ML estimates used in our case. Hence, we need to modify the GC measure and the corresponding deviance statistics to account for the estimation bias incurred by $\ell_1$-regularization. This new measure is the static VAR-based counterpart of a similar measure presented in our earlier studies in the context of dynamic sparse point process models (Sheikhattar and Babadi, 2016). To this end, we modify the deviance difference statistic corresponding to the full and reduced models to compensate for the bias incurred by sparse regularization. The bias can be computed for the full model as $B^{(c)} := g^{(c)'} \left(H^{(c)}\right)^{-1} g^{(c)}$, where $g^{(c)} := \bar{X}'(\bar{y}^{(c)} - \bar{X}\hat{\omega}^{(c)}) / {\hat{\sigma}^{(c)}}^{2}$ and $H^{(c)} := \bar{X}'\bar{X} / {\hat{\sigma}^{(c)}}^{2}$ are the gradient and Hessian of the log-likelihood function for the Gaussian VAR model of Eq. (2), respectively. Similarly, the bias $B^{(c \setminus \tilde{c})}$ for the reduced model can be computed by replacing the matrix of covariates and the parameter estimate by $\bar{X}^{\setminus \tilde{c}}$ and $\hat{\omega}^{(c \setminus \tilde{c})}$, respectively.

The deviance difference statistic associated with the two nested full and reduced models can be expressed as:

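$$D^{(\tilde{c} \mapsto c)} = NR \, \log \frac{{\hat{\sigma}^{(c \setminus \tilde{c})}}^{2}}{{\hat{\sigma}^{(c)}}^{2}} - B^{(\tilde{c} \mapsto c)} \qquad (4)$$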

where $B^{(\tilde{c} \mapsto c)} := B^{(c)} - B^{(c \setminus \tilde{c})}$ denotes the difference of bias terms corresponding to the full and reduced models.
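For readers who want the intermediate step, the logarithmic term in Eq. (4) is the standard log-likelihood ratio of two Gaussian models evaluated at their plug-in variance estimates. The short derivation below is a sketch added for clarity; the subscripts "full" and "reduced" are shorthand introduced here for the two nested models.

```latex
\ell\big(\hat{\omega}, \hat{\sigma}^2\big)
  = -\frac{NR}{2}\log\big(2\pi\hat{\sigma}^2\big)
    - \frac{1}{2\hat{\sigma}^2}\,\big\|\bar{y}^{(c)} - X\hat{\omega}\big\|_2^2
  = -\frac{NR}{2}\Big[\log\big(2\pi\hat{\sigma}^2\big) + 1\Big],
\qquad \text{since } \hat{\sigma}^2 = \tfrac{1}{NR}\big\|\bar{y}^{(c)} - X\hat{\omega}\big\|_2^2 .

2\Big[\ell_{\text{full}} - \ell_{\text{reduced}}\Big]
  = NR \,\log\frac{\hat{\sigma}^{2}_{\text{reduced}}}{\hat{\sigma}^{2}_{\text{full}}} .
```

The bias difference is then subtracted from this quantity to compensate for the shrinkage introduced by the regularization.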

We finally employ the inference framework presented in (Kim et al., 2011; Sheikhattar and Babadi, 2016) to simultaneously test the statistical significance of all possible GC interactions and to control the FDR at a given significance level α. This inference framework integrates an extension of classical results on analysis of deviance and a multiple hypothesis testing procedure based on the Benjamini-Yekutieli FDR control (Benjamini and Yekutieli, 2001). The weights of the detected links are further characterized using Youden’s J-statistic, which is a summary statistic for quantifying the strength of hypothesis tests. The excitatory or suppressive nature of each GC link is determined by the effective sign of the estimated cross-history parameters associated with shorter latencies.
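The Benjamini-Yekutieli step-up rule referenced above can be sketched generically as follows. This sketch assumes that a p-value is already available for each candidate link; how those p-values are obtained from the deviance statistics is part of the cited framework and is not reproduced here. The function name is hypothetical.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction factor that makes the rule valid under dependency.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank  # largest rank satisfying the step-up condition
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Each rejected hypothesis would correspond to one detected GC link among the tested neuron pairs.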

To quantify the spread of the distribution of GC-link directions (Fig. 5F), we first constructed a circular histogram of the GC-link angles, which were computed with the MATLAB built-in function atan2. Based on this histogram, we used PCA to extract the long and short axes of the ellipse-like distributions. All the original angles were then projected onto the short axis, and the resulting dot products (taking absolute values) were compared between ACX fields. The more the values are shifted towards 1, the larger the spread along the short axis, indicating a less ‘pointy’ distribution.

Pupillometry data analysis

To extract pupil size, each image was first cropped around the eye and the MATLAB built-in function “imfindcircles” was used to determine pupil location and diameter. The pupil size over time was further smoothed with a time window of ~150ms. The onset of micro-dilation was determined by first inverting the trace (flipping its sign) and then using the MATLAB built-in function “findpeaks” with a minimum peak prominence of 10 μm. Next, we quantified the occurrence of micro-dilation before, during and after tone onset using 1-second windows, to investigate whether micro-dilation is more likely to occur following tone offset. We established a confidence interval by shuffling tone onset times and counting the micro-dilation occurrences relative to the shuffled stimulus onsets. We performed this analysis for 10 sets of experiments (n=9 mice). If micro-dilation is more likely to occur during any specific time window, then the actual counts should exceed the upper bound of the confidence interval. If the counts are within the confidence interval, then micro-dilation is equally likely to occur before, during or after tone presentation.

Electrophysiological data analysis

Single units were sorted offline using the MClust-3.5 package (A. D. Redish et al., http://redishlab.neuroscience.umn.edu/MClust/MClust.html) and the KlustaKwik algorithm (K. Harris, http://klustakwik.sourceforge.net). For single unit analysis, we calculated responses as the change in spike count within a 500ms window before or after tone onset/offset and used a paired t-test to determine the response significance for each frequency.

Supplementary Material

Supplemental

Key Resource Table.

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Bacterial and Virus Strains | |
AAV1.Syn.Flex.mRuby2.GSG.P2A.GCaMP6s.WPRE.SV40 | (Rose et al., 2016) | Addgene 68720-AAV1
Experimental Models: Organisms/Strains | |
Mouse: C57BL/6J-Tg(Thy1-GCaMP6s)GP4.3Dkim/J | The Jackson Laboratory | JAX 024275
Mouse: CBA/CaJ | The Jackson Laboratory | JAX 000654
Mouse: B6;129P2-Pvalbtm1(cre)Arbr/J | The Jackson Laboratory | JAX 008069
Mouse: Ssttm2.1(cre)Zjh/J | The Jackson Laboratory | JAX 013044
Software and Algorithms | |
Autoencoder | (Whiteway and Butts, 2017) | https://github.com/themattinthehatt/rlvm
Suite2P | (Pachitariu et al., 2016) | https://github.com/cortex-lab/Suite2P
TurboReg | (Thevenaz et al., 1998) | http://bigwww.epfl.ch/thevenaz/turboreg/
MClust3.5 | A. David Redish | http://redishlab.neuroscience.umn.edu/MClust/MClust.html
Klustakwik | Ken Harris | http://klustakwik.sourceforge.net/

Acknowledgements

Supported by NIH RO1DC009607 (POK), NSF 1552946 (BB), NIH T32DC00046 (MRW) and NSF IIS-1350990 (DAB).

Footnotes

Declaration of Interests

None

References

  1. Ahissar E, Sosnik R, and Haidarliu S (2000). Transformation from temporal to rate coding in a somatosensory thalamocortical pathway. Nature 406, 302.
  2. Aitkin LM, and Webster WR (1972). Medial geniculate body of the cat: organization and responses to tonal stimuli of neurons in ventral division. J Neurophysiol 35, 365–380.
  3. Baba H, Tsukano H, Hishida R, Takahashi K, Horii A, Takahashi S, and Shibuki K (2016). Auditory cortical field coding long-lasting tonal offsets in mice. Sci Rep 6, 34421.
  4. Bandyopadhyay S, Shamma SA, and Kanold PO (2010). Dichotomy of functional organization in the mouse auditory cortex. Nat Neurosci 13, 361–368.
  5. Baumann S, Joly O, Rees A, Petkov CI, Sun L, Thiele A, and Griffiths TD (2015). The topography of frequency and time representation in primate auditory cortices. Elife 4, e03256.
  6. Bengio Y, Courville A, and Vincent P (2013). Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence 35, 1798–1828.
  7. Benjamini Y, and Yekutieli D (2001). The control of the false discovery rate in multiple testing under dependency. Annals of statistics, 1165–1188.
  8. Bregman AS (1994). Auditory scene analysis: The perceptual organization of sound (MIT Press).
  9. Bruno RM, and Sakmann B (2006). Cortex is driven by weak but synchronously active thalamocortical synapses. Science 312, 1622–1627.
  10. Chen I-W, Helmchen F, and Lütcke H (2015). Specific early and late oddball-evoked responses in excitatory and inhibitory neurons of mouse auditory cortex. Journal of Neuroscience 35, 12560–12573.
  11. Cottam JC, Smith SL, and Häusser M (2013). Target-specific effects of somatostatin-expressing interneurons on neocortical visual processing. Journal of Neuroscience 33, 19567–19578.
  12. Dana H, Chen TW, Hu A, Shields BC, Guo C, Looger LL, Kim DS, and Svoboda K (2014). Thy1-GCaMP6 transgenic mice for neuronal population imaging in vivo. PLoS One 9, e108697.
  13. Deneux T, Kempf A, Daret A, Ponsot E, and Bathellier B (2016). Temporal asymmetries in auditory coding and perception reflect multi-layered nonlinearities. Nature communications 7.
  14. Fishman YI, and Steinschneider M (2009). Temporally dynamic frequency tuning of population responses in monkey primary auditory cortex. Hear Res 254, 64–76.
  15. Forli A, Vecchia D, Binini N, Succol F, Bovetti S, Moretti C, Nespoli F, Mahn M, Baker CA, and Bolton MM (2018). Two-Photon Bidirectional Control and Imaging of Neuronal Excitability with High Spatial Resolution In Vivo. Cell reports 22, 3087–3098.
  16. Francis NA, Winkowski DE, Sheikhattar A, Armengol K, Babadi B, and Kanold PO (2018). Small Networks Encode Decision-Making in Primary Auditory Cortex. Neuron 97, 885–897.e886.
  17. Frisina RD, Singh A, Bak M, Bozorg S, Seth R, and Zhu X (2011). F1 (CBA× C57) mice show superior hearing in old age relative to their parental strains: Hybrid vigor or a new animal model for “Golden Ears”? Neurobiology of aging 32, 1716–1724.
  18. Friston K, Moran R, and Seth AK (2013). Analysing connectivity with Granger causality and dynamic causal modelling. Current opinion in neurobiology 23, 172–178.
  19. Granger CW (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, 424–438.
  20. Guo W, Chambers AR, Darrow KN, Hancock KE, Shinn-Cunningham BG, and Polley DB (2012). Robustness of cortical topography across fields, laminae, anesthetic states, and neurophysiological signal types. J Neurosci 32, 9159–9172.
  21. Hackett TA, Barkat TR, O’Brien BM, Hensch TK, and Polley DB (2011). Linking topography to tonotopy in the mouse auditory thalamocortical circuit. J Neurosci 31, 2983–2995.
  22. He J (2001). On and off pathways segregated at the auditory thalamus of the guinea pig. J Neurosci 21, 8672–8679.
  23. He J, Hashikawa T, Ojima H, and Kinouchi Y (1997). Temporal integration and duration tuning in the dorsal zone of cat auditory cortex. J Neurosci 17, 2615–2625.
  24. Heil P, Rajan R, and Irvine DR (1992). Sensitivity of neurons in cat primary auditory cortex to tones and frequency-modulated stimuli. I: Effects of variation of stimulus parameters. Hear Res 63, 108–134.
  25. Henry KR (1985). Tuning of the auditory brainstem OFF responses is complementary to tuning of the auditory brainstem ON response. Hearing research 19, 115–125.
  26. Herreras O (2016). Local field potentials: myths and misunderstandings. Frontiers in neural circuits 10, 101.
  27. Hillyard SA, and Picton TW (1978). On and off components in the auditory evoked potential. Perception & Psychophysics 24, 391–398.
  28. Hromádka T, Deweese MR, and Zador AM (2008). Sparse representation of sounds in the unanesthetized auditory cortex. PLoS Biol 6, e16.
  29. Huang CL, and Winer JA (2000). Auditory thalamocortical projections in the cat: laminar and areal patterns of input. J Comp Neurol 427, 302–331.
  30. Imig TJ, and Morel A (1983). Organization of the thalamocortical auditory system in the cat. Annu Rev Neurosci 6, 95–120.
  31. Issa JB, Haeffele BD, Agarwal A, Bergles DE, Young ED, and Yue DT (2014). Multiscale optical Ca2+ imaging of tonal organization in mouse auditory cortex. Neuron 83, 944–959.
  32. Issa JB, Haeffele BD, Young ED, and Yue DT (2017). Multiscale mapping of frequency sweep rate in mouse auditory cortex. Hear Res 344, 207–222.
  33. Ji X. y., Zingg B, Mesik L, Xiao Z, Zhang LI, and Tao HW (2015). Thalamocortical innervation pattern in mouse auditory and visual cortex: laminar and cell-type specificity. Cerebral Cortex 26, 2612–2625.
  34. Joachimsthaler B, Uhlmann M, Miller F, Ehret G, and Kurt S (2014). Quantitative analysis of neuronal response properties in primary and higher-order auditory cortical fields of awake house mice (Mus musculus). Eur J Neurosci 39, 904–918.
  35. Kane KL, Longo-Guess CM, Gagnon LH, Ding D, Salvi RJ, and Johnson KR (2012). Genetic background effects on age-related hearing loss associated with Cdh23 variants in mice. Hear Res 283, 80–88.
  36. Kanold PO, Nelken I, and Polley DB (2014). Local versus global scales of organization in auditory cortex. Trends Neurosci 37, 502–510.
  37. Kato HK, Gillet SN, and Isaacson JS (2015). Flexible sensory representations in auditory cortex driven by behavioral relevance. Neuron 88, 1027–1039.
  38. Katzner S, Nauhaus I, Benucci A, Bonin V, Ringach DL, and Carandini M (2009). Local origin of field potentials in visual cortex. Neuron 61, 35–41.
  39. Kim S, Putrino D, Ghosh S, and Brown EN (2011). A Granger causality measure for point process models of ensemble neural spiking activity. PLoS computational biology 7, e1001110.
  40. Kopp-Scheinpflug C, Tozer AJ, Robinson SW, Tempel BL, Hennig MH, and Forsythe ID (2011). The sound of silence: ionic mechanisms encoding sound termination. Neuron 71, 911–925.
  41. Koralek K-A, Jensen KF, and Killackey HP (1988). Evidence for two complementary patterns of thalamic input to the rat somatosensory cortex. Brain research 463, 346–351.
  42. Lee CC, and Sherman SM (2008). Synaptic properties of thalamic and intracortical inputs to layer 4 of the first- and higher-order cortical areas in the auditory and somatosensory systems. J Neurophysiol 100, 317–326.
  43. Lee CC, and Winer JA (2008). Connections of cat auditory cortex: I. Thalamocortical system. Journal of Comparative Neurology 507, 1879–1900.
  44. Li L. y., Li Y. t., Zhou M, Tao HW, and Zhang LI (2013). Intracortical multiplication of thalamocortical signals in mouse auditory cortex. Nature neuroscience 16, 1179.
  45. Li L. y., Xiong XR, Ibrahim LA, Yuan W, Tao HW, and Zhang LI (2014). Differential receptive field properties of parvalbumin and somatostatin inhibitory neurons in mouse auditory cortex. Cerebral cortex 25, 1782–1791.
  46. Liang F, Li H, Chou XL, Zhou M, Zhang NK, Xiao Z, Zhang KK, Tao HW, and Zhang LI (2018). Sparse Representation in Awake Auditory Cortex: Cell-type Dependence, Synaptic Mechanisms, Developmental Emergence, and Modulation. Cereb Cortex.
  47. Liu X, Zhou L, Ding F, Wang Y, and Yan J (2015). Local field potentials are local events in the mouse auditory cortex. European Journal of Neuroscience 42, 2289–2297.
  48. Llano DA, and Sherman SM (2008). Evidence for nonreciprocal organization of the mouse auditory thalamocortical-corticothalamic projection systems. J Comp Neurol 507, 1209–1227.
  49. Lu S-M, and Lin RC-S (1993). Thalamic afferents of the rat barrel cortex: a light-and electron-microscopic study using Phaseolus vulgaris leucoagglutinin as an anterograde tracer. Somatosensory & motor research 10, 1–16.
  50. McGinley MJ, David SV, and McCormick DA (2015). Cortical membrane potential signature of optimal states for sensory signal detection. Neuron 87, 179–192.
  51. Meng X, Winkowski DE, Kao JP, and Kanold PO (2017). Sublaminar subdivision of mouse auditory cortex layer 2/3 based on functional translaminar connections. Journal of Neuroscience 37, 10200–10214.
  52. Merzenich MM, Knight PL, and Roth GL (1975). Representation of cochlea within primary auditory cortex in the cat. J Neurophysiol 38, 231–249.
  53. Miller KD, Pinto DJ, and Simons DJ (2001). Processing in layer 4 of the neocortical circuit: new insights from visual and somatosensory cortex. Current opinion in neurobiology 11, 488–497.
  54. Oya H, Poon PW, Brugge JF, Reale RA, Kawasaki H, Volkov IO, and Howard III MA (2007). Functional connections between auditory cortical fields in humans revealed by Granger causality analysis of intra-cranial evoked potentials to sounds: comparison of two methods. Biosystems 89, 198–207.
  55. Pachitariu M, Stringer C, Schröder S, Dipoppa M, Rossi LF, Carandini M, and Harris KD (2016). Suite2p: beyond 10,000 neurons with standard two-photon microscopy. BioRxiv, 061507.
  56. Petrus E, Isaiah A, Jones AP, Li D, Wang H, Lee H-K, and Kanold PO (2014). Crossmodal induction of thalamocortical potentiation leads to enhanced information processing in the auditory cortex. Neuron 81, 664–673.
  57. Pfeffer CK, Xue M, He M, Huang ZJ, and Scanziani M (2013). Inhibition of inhibition in visual cortex: the logic of connections between molecularly distinct interneurons. Nature neuroscience 16, 1068.
  58. Polley DB, Read HL, Storace DA, and Merzenich MM (2007). Multiparametric auditory receptive field organization across five cortical fields in the albino rat. J Neurophysiol 97, 3621–3638.
  59. Qin L, Chimoto S, Sakai M, Wang J, and Sato Y (2007). Comparison between offset and onset responses of primary auditory cortex ON–OFF neurons in awake cats. Journal of neurophysiology 97, 3421–3431.
  60. Recanzone GH (2000). Response profiles of auditory cortical neurons to tones and noise in behaving macaque monkeys. Hear Res 150, 104–118.
  61. Redies H, and Brandner S (1991). Functional organization of the auditory thalamus in the guinea pig. Exp Brain Res 86, 384–392.
  62. Rose T, Jaepel J, Hübener M, and Bonhoeffer T (2016). Cell-specific restoration of stimulus preference after monocular deprivation in the visual cortex. Science 352, 1319–1322.
  63. Rothschild G, Nelken I, and Mizrahi A (2010). Functional organization and population dynamics in the mouse primary auditory cortex. Nature neuroscience 13, 353–360.
  64. Saldeitis K, Happel MF, Ohl FW, Scheich H, and Budinger E (2014). Anatomy of the auditory thalamocortical system in the mongolian gerbil: Nuclear origins and cortical field‐, layer‐, and frequency‐specificities. Journal of Comparative Neurology 522, 2397–2430.
  65. Schmidt M, Berg E, Friedlander M, and Murphy K (2009). Optimizing costly functions with simple constraints: A limited-memory projected quasi-newton algorithm. In Artificial Intelligence and Statistics, pp. 456–463.
  66. Scholl B, Gao X, and Wehr M (2010). Nonoverlapping sets of synapses drive on responses and off responses in auditory cortex. Neuron 65, 412–421.
  67. Schreiner CE, and Urbas JV (1986). Representation of amplitude modulation in the auditory cortex of the cat. I. The anterior auditory field (AAF). Hear Res 21, 227–241.
  68. Shadlen MN, and Newsome WT (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. Journal of neuroscience 18, 3870–3896.
  69. Sheikhattar A, and Babadi B (2016). Dynamic estimation of causal influences in sparsely-interacting neuronal ensembles. In Information Science and Systems (CISS), 2016 Annual Conference on (IEEE), pp. 551–556.
  70. Sheikhattar A, Miran S, Liu J, Fritz JB, Shamma SA, Kanold PO, and Babadi B (2018). Extracting neuronal functional network dynamics via adaptive Granger causality analysis. Proceedings of the National Academy of Sciences, 201718154.
  71. Somers DC, Nelson SB, and Sur M (1995). An emergent model of orientation selectivity in cat visual cortical simple cells. Journal of Neuroscience 15, 5448–5465. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Stiebler I, Neulist R, Fichtel I, and Ehret G (1997). The auditory cortex of the house mouse: left-right differences, tonotopic organization and quantitative analysis of frequency representation. J Comp Physiol A 181, 559–571. [DOI] [PubMed] [Google Scholar]
  73. Thevenaz P, Ruttimann UE, and Unser M (1998). A pyramid approach to subpixel registration based on intensity. IEEE transactions on image processing 7, 27–41. [DOI] [PubMed] [Google Scholar]
  74. Tian B, Kuśmierek P, and Rauschecker JP (2013). Analogues of simple and complex cells in rhesus monkey auditory cortex. Proceedings of the National Academy of Sciences, 201221062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Tsukano H, Horie M, Bo T, Uchimura A, Hishida R, Kudoh M, Takahashi K, Takebayashi H, and Shibuki K (2015). Delineation of a frequency-organized region isolated from the mouse primary auditory cortex. J Neurophysiol 113, 2900–2920. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Valdés-Sosa PA, Sánchez-Bornot JM, Lage-Castellanos A, Vega-Hernández M, Bosch-Bayard J, Melie-García L, and Canales-Rodríguez E (2005). Estimating brain functional connectivity with sparse multivariate autoregression. Philosophical Transactions of the Royal Society of London B: Biological Sciences 360, 969–981. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Vasquez-Lopez SA, Weissenberger Y, Lohse M, Keating P, King AJ, and Dahmen JC (2017). Thalamic input to auditory cortex is locally heterogeneous but globally tonotopic. eLife 6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Volkov I, and Galazjuk A (1991). Formation of spike response to sound tones in cat auditory cortex neurons: interaction of excitatory and inhibitory effects. Neuroscience 43, 307–321. [DOI] [PubMed] [Google Scholar]
  79. Whiteway MR, and Butts DA (2017). Revealing unobserved factors underlying cortical activity with a rectified latent variable model applied to neural population recordings. J Neurophysiol 117, 919–936. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Winkowski DE, and Kanold PO (2013). Laminar transformation of frequency organization in auditory cortex. Journal of Neuroscience 33, 1498–1508. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Yu C, Derdikman D, Haidarliu S, and Ahissar E (2006). Parallel thalamic pathways for whisking and touch signals in the rat. PLoS biology 4, e124. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Zariwala HA, Borghuis BG, Hoogland TM, Madisen L, Tian L, De Zeeuw CI, Zeng H, Looger LL, Svoboda K, and Chen T-W (2012). A Cre-dependent GCaMP3 reporter mouse for neuronal imaging in vivo. Journal of Neuroscience 32, 3131–3141. [DOI] [PMC free article] [PubMed] [Google Scholar]
