Journal of Neurophysiology. 2018 Nov 21;121(1):218–237. doi: 10.1152/jn.00751.2018

Response properties of single neurons in higher level auditory cortex of adult songbirds

Sarah W Bottjer, Andrew A Ronald, Tiara Kaye
PMCID: PMC6383665  PMID: 30461366

Abstract

The caudomedial nidopallium (NCM) is a higher level region of auditory cortex in songbirds that has been implicated in encoding learned vocalizations and mediating perception of complex sounds. We made cell-attached recordings in awake adult male zebra finches (Taeniopygia guttata) to characterize responses of single NCM neurons to playback of tones and songs. Neurons fell into two broad classes: narrow fast-spiking cells and broad sparsely firing cells. Virtually all narrow-spiking cells responded to playback of pure tones, compared with approximately half of broad-spiking cells. In addition, narrow-spiking cells tended to have lower thresholds and faster, less variable spike onset latencies than did broad-spiking cells, as well as higher firing rates. Tonal responses of narrow-spiking cells also showed broader ranges for both frequency and amplitude compared with broad-spiking neurons and were more apt to have V-shaped tuning curves compared with broad-spiking neurons, which tended to have complex (discontinuous), columnar, or O-shaped frequency response areas. In response to playback of conspecific songs, narrow-spiking neurons showed high firing rates and low levels of selectivity whereas broad-spiking neurons responded sparsely and selectively. Broad-spiking neurons in which tones failed to evoke a response showed greater song selectivity compared with those with a clear tuning curve. These results are consistent with the idea that narrow-spiking neurons represent putative fast-spiking interneurons, which may provide a source of intrinsic inhibition that contributes to the more selective tuning in broad-spiking cells.

NEW & NOTEWORTHY The response properties of neurons in higher level regions of auditory cortex in songbirds are of fundamental interest because processing in such regions is essential for vocal learning and plasticity and for auditory perception of complex sounds. Within a region of secondary auditory cortex, neurons with narrow spikes exhibited high firing rates to playback of both tones and multiple conspecific songs, whereas broad-spiking neurons responded sparsely and selectively to both tones and songs.

Keywords: auditory cortex, frequency coding, intensity coding, vocal learning

INTRODUCTION

Learned vocal communication is used by songbirds for social interactions, including courtship and territory defense (Marler and Slabbekoorn 2004). Like humans, songbirds memorize sounds used for acoustic communication during a sensitive period of development by listening to adult tutors (Bottjer and Arnold 1997; Doupe and Kuhl 1999; Marler 1970). This essential step of auditory memorization comprises the acquisition of a neural representation of tutor sounds: a template memory to which feedback of self-produced sounds is compared. Recent evidence suggests that the template memory of tutor sounds is localized to higher level regions of auditory cortex including caudomedial nidopallium (NCM; Fig. 1) (Bolhuis and Gahr 2006; Bolhuis and Moorman 2015; Gobes and Bolhuis 2007; Hahnloser and Kotowicz 2010; London and Clayton 2008; Phan et al. 2006; Terpstra et al. 2004; Yanagihara and Yazaki-Sugiyama 2016). NCM receives strong projections from lower levels of auditory cortex and provides indirect input to sensorimotor vocal-control nuclei (Calabrese and Woolley 2015; Mello et al. 1998; Vates et al. 1996; Wang et al. 2010); it is therefore well positioned to participate in encoding memories of complex sounds, as well as in perception and discrimination of vocal and other sounds (Schneider and Woolley 2013; Thompson and Gentner 2010; Thompson et al. 2013).

Fig. 1.

Schematic diagram of auditory cortical regions in songbird brain and their connection to the high vocal center (HVC). Primary auditory cortex (known as the Field L complex) comprises a thalamo-recipient subregion (L2) as well as secondary subregions (L1, L3) that are interconnected with L2. L2a and L3 project directly to caudomedial nidopallium (NCM), which is reciprocally connected with the caudal mesopallium (CM). CM is reciprocally connected with HVC and projects to its underlying shelf region; CM also projects indirectly to HVC via nucleus interface of the nidopallium (NIf), which like HVC is a sensorimotor nucleus. CM is reciprocally connected with all subregions of primary auditory cortex (depicted here as a single arrow). Some connections omitted for clarity (for further details see Akutagawa and Konishi 2010; Calabrese and Woolley 2015; Mello et al. 1998; Vates et al. 1996; Wang et al. 2010). After Gentner and Margoliash (2003).

NCM receives feed-forward inputs from the Field L complex, which is analogous to primary auditory cortex (A1) of mammals (Fig. 1) (Calabrese and Woolley 2015; Karten 1968; Kelley and Nottebohm 1979; Mello et al. 1998; Vates et al. 1996; Wang et al. 2010; Wild et al. 1993). Calabrese and Woolley (2015) reported that neurons throughout auditory cortical regions of adult male zebra finches segregate into two major classes, as is true for mammalian neocortex. One class had narrow action potentials and high firing rates (referred to as putative interneurons), and another class had broad action potentials and low firing rates (referred to as putative principal cells). Broad-spiking neurons responded more sparsely and selectively to modulated-noise stimuli compared with narrow-spiking neurons, exhibiting more complex and nonlinear receptive fields (Calabrese and Woolley 2015). In addition, both classes of neurons within NCM responded more sparsely and selectively compared with the thalamo-recipient layer of Field L (L2), consistent with the idea of hierarchical information processing. In a separate study, sparse selective responses of broad-spiking NCM neurons to different conspecific songs were maintained in background noise levels that permitted behavioral perception. However, as background noise levels increased, song-selective responses in NCM were abolished as perception was disrupted (Schneider and Woolley 2013). This pattern of results suggests that complex tuning properties in NCM emerge from a hierarchical organization and contribute to mediating behaviorally relevant responses to specific spectrotemporal stimuli.

Playback of complex auditory stimuli induces expression of the immediate early gene egr-1 in NCM but not in L2, and some studies have reported stronger responses to conspecific than to heterospecific songs in terms of both gene expression and multiunit activity in NCM (Chew et al. 1995, 1996; Mello and Clayton 1994; Mello et al. 1992; Stripling et al. 1997, 2001). In addition, stimulus-specific adaptation to repeated presentation of song stimuli is seen in NCM but not in L2, and this habituation persists over a longer time interval for conspecific than for heterospecific songs (Chew et al. 1995, 1996; Mello et al. 1995), suggesting that NCM retains a memory based on auditory experience.

In the aggregate, these results indicate that encoding species-specific features of vocal sounds may emerge as a prominent feature in NCM neurons. The result of information processing in NCM is forwarded to another area of higher level auditory cortex, the caudal mesopallium (CM), and thence to the sensorimotor region high vocal center (HVC; Fig. 1), which controls vocal motor output in adult songbirds. However, little is known concerning basic auditory processing in NCM, particularly at the level of single neurons, which retards understanding of the neural basis of responsivity to complex stimuli (Knudsen and Gentner 2010). Characterization of tuning properties of auditory cortical neurons to simple sounds is a necessary first step in providing a better understanding of mechanisms underlying processing of complex behaviorally relevant sounds. We studied basic tuning properties of NCM neurons in awake male zebra finches, using pure tones to examine frequency and amplitude tuning, rate-level functions, spike latencies, and temporal response patterns; we also compared tone responses to response selectivity to conspecific songs.

METHODS

Subjects

All procedures used in our experiments were approved by the University of Southern California Animal Care and Use Committee and followed the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. Nineteen adult male zebra finches were raised in our breeding aviaries by their parents. Birds remained in these group aviaries until at least 35 days posthatch, remaining with their natural parents and thereby receiving normal auditory and social experience during the period of tutor memorization. Tutor exposure up until ~35 days of age is sufficient for birds to subsequently make accurate copies of their tutor’s song (Böhner 1990, 1983; Catchpole and Slater 1995; Clayton 1987; Eales 1985; Immelmann 1969; Mann and Slater 1995; Roper and Zann 2006).

Stimuli

Pure tones were generated in MATLAB (44.1 kHz) and were 100 ms in duration with 2-ms linear ramp rise-fall times. For most experiments, 39 frequencies were presented ranging from 200 to 8,000 Hz in 0.1-octave steps; in a few experiments fixed steps of 200 Hz were used. Each frequency was presented from 10 to 70 dB SPL in 10-dB steps. The order of frequencies was randomized across trials, while amplitude always ascended; i.e., all frequencies were played one time at 10 dB (random order) and then all frequencies were played one time at 20 dB, etc. This sequence was repeated five times (most cells were tested with 5 repetitions of all frequency-amplitude combinations; a small number of cells received 3 repetitions and were used if they showed a clear response pattern). The interval between stimulus onsets was 600 ms, including a 300-ms prestimulus period and a 200-ms poststimulus period.
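The tone parameters and presentation order described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code; the function names `make_tone` and `presentation_order`, the unit peak amplitude, and the use of a seeded random generator are assumptions made for the example. The frequency list is left as an input because the exact set of test frequencies is not reproduced here.

```python
import math
import random


def make_tone(freq_hz, dur_s=0.100, ramp_s=0.002, fs=44100, peak=1.0):
    """Return one pure tone as a list of floats with linear rise-fall ramps.

    Defaults follow the text (100 ms, 2-ms linear ramps, 44.1 kHz);
    the peak amplitude of 1.0 is an arbitrary assumption.
    """
    n = round(dur_s * fs)        # number of samples in the tone
    n_ramp = round(ramp_s * fs)  # number of samples in each ramp
    samples = []
    for i in range(n):
        # Gain rises linearly from 0 to 1 at onset and falls back to 0 at offset.
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(peak * gain * math.sin(2 * math.pi * freq_hz * i / fs))
    return samples


def presentation_order(freqs_hz, levels_db=range(10, 71, 10), n_reps=5, seed=None):
    """Return (frequency, level) pairs: levels ascend, frequencies are shuffled per level."""
    rng = random.Random(seed)  # seeding is a hypothetical convenience for reproducibility
    order = []
    for _ in range(n_reps):
        for level in levels_db:
            shuffled = list(freqs_hz)
            rng.shuffle(shuffled)
            order.extend((f, level) for f in shuffled)
    return order
```

With the 600-ms onset-to-onset interval given in the text, one pass through F frequencies at 7 levels would last F × 7 × 0.6 s.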

Neurons were also tested with playback of six different unfamiliar conspecific songs. Song recordings were made by placing an adult bird in a cage within an acoustic isolation box (44-kHz sampling rate, Sound Analysis Pro). A representative song was selected for each bird, high-pass filtered at 400 Hz, and matched for amplitude (Goldwave). Song stimuli were presented 10 times at an amplitude of 60–65 dB maximum SPL, measured with a sound level meter (Larson Davis, model LxT). Each song stimulus consisted of two repetitions of the basic song motif; the durations ranged from 1.40 to 1.75 s (average 1.59 s).

Data Acquisition

Before recording sessions, each bird was anesthetized with isoflurane (1.5%, inhalation) and placed in a stereotaxic apparatus; a stainless steel post was glued to the rostral skull with dental cement (Lang Dental) and cyanoacrylate. Craniotomies overlying left and right NCM were made, and the surface of the brain was protected by applying a silicone elastomer (Kwik-Cast; World Precision Instruments). Stereotaxic coordinates for NCM were as follows: AP: 0.6–0.9 mm, ML: 0.7–0.9 mm, and depth: 0.8–2.5 mm, which represents the caudodorsal region of NCM. During subsequent recording sessions awake birds were comfortably restrained in a cloth jacket and plastic tube; they were placed in the stereotaxic apparatus in a double-walled sound isolation chamber, and the head was immobilized via the steel post. Each bird was tested on 6–8 different days; 78% of all cells recorded were from the left hemisphere.

Recordings were made using glass electrodes filled with sterile avian saline (0.7% NaCl, 5–20 MΩ). Spike data were amplified using an Axopatch 200B amplifier (Axon Instruments), low-pass filtered (2,000 Hz), and digitized at 32 kHz using a data acquisition interface from Cambridge Electronic Design (Power 1401). The electrode was advanced by the experimenter in steps of 1–5 μm using an electronic microdrive (Siskiyou, MC1000). Most recordings (>90%) were made in cell-attached configuration, but we also recorded from a small number of well-isolated single units (high signal-to-noise ratio, stable amplitude and waveform) that failed to achieve a loose-patch configuration. Unit isolation for the latter cells was checked using Spike2 template software (Cambridge Electronic Design). Before each experiment we calibrated pure-tone sound pressure levels (dB SPL re 20 μPa) from 100 to 10,000 Hz using a custom MATLAB script to generate a calibration table by programming an attenuator (PA5; Tucker Davis Technologies). Tone presentations were controlled by custom MATLAB scripts that performed digital-to-analog conversion (Power 1401; Cambridge Electronic Design) and attenuation of the signal (PA5) before being amplified (Parasound 275) and routed to the speaker (Beyma) located 20 cm in front of the bird’s head. Song stimuli were presented using Spike2 software.

Data Analysis

We measured the action potential width of each neuron as the duration in milliseconds from trough to peak of the average waveform for each cell (Niell and Stryker 2008). The distribution of spike widths was used to divide cells into narrow-spiking (NS) and broad-spiking (BS) groups (Fig. 2A). A k-means clustering analysis (k = 2) using both spike width and trough-to-peak ratios was in good agreement with this division into NS and BS groups, differing only in the assignment of 4 out of 87 neurons: four cells that we classified as BS were assigned to the NS cluster by k-means. Because the break in the distribution of spike widths between 0.36 and 0.39 ms is similar to that seen in previous studies of NCM (Calabrese and Woolley 2015; Schneider and Woolley 2013; Yanagihara and Yazaki-Sugiyama 2016), and because all four of these neurons had spike widths ≥0.40 ms, we kept them in the BS cell group. Further inspection of waveforms and spontaneous firing rates was used to guide operationally defined subgroups (NS neurons into NS1 and NS2 subgroups and BS neurons into BS and DT subgroups) as described in the first section of the results. These subgroups are based on clear-cut differences in firing rates and the shape of spike waveforms; our data do not resolve whether the four groups of cells we identified correspond to different cell types, as discussed below.

Fig. 2.

Cell groups in caudomedial nidopallium (NCM). A: frequency distribution plot of spike widths across all NCM neurons with waveforms in which trough-to-peak duration could be measured. Cells with spike widths ≤0.36 ms were classified as narrow spiking (NS; n = 36) while cells with spike widths >0.36 ms were classified as broad spiking (BS; n = 51). An additional group of cells had a complex spike waveform that precluded measurement of spike widths and was referred to as double trough (DT; n = 46); see results. B: plotting trough-to-peak ratios against spike width revealed a subpopulation of NS cells with trough-to-peak ratios <0.30 and low spontaneous firing rates; these cells were placed in a separate subgroup called NS1 cells (n = 13). The remaining narrow-spiking cells, NS2 (n = 23), are putative fast-spiking interneurons. C, top left: average spike waveforms (with SE) for all 4 cell groups. C, bottom left: all individual spike waveforms for each cell from all groups. C, right: raw traces of single spikes from NS2, BS, and DT cells; all waveforms are aligned at peaks.

Tone responses.

Auditory activity of each unit was determined based on a change in firing rate (spikes/sec) during tone presentation compared with a prestimulus baseline period (paired t-test). The analysis window during tones was 120 ms to capture any poststimulus activity; this window was shortened to 50 ms for cells that showed a response only at tone onset or extended to 160 ms for cells that showed a strong offset or delayed response (see below). The prestimulus baseline period was always equivalent in duration to the analysis window used for tone presentations and was also used to calculate the spontaneous firing rate for each cell. Threshold was defined as the lowest intensity (dB SPL) to elicit a significant response; an additional criterion was that contiguous higher intensity stimuli also had to elicit a significant response (Caras et al. 2012). Characteristic frequency (CF) was defined as the tone frequency with the lowest threshold. First spike latency (FSL) represented the average time to first spike following tone onset at the CF 20 dB above threshold, and FSL jitter was measured as the standard deviation of first spike times across trials for that frequency-amplitude combination.
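The threshold and first-spike-latency definitions above can be illustrated with a short Python sketch. Several details are assumptions rather than statements from the text: "contiguous higher intensity" is read here as "the next higher tested level", a response at the highest tested level alone is accepted, trials without a spike after tone onset are skipped, and jitter uses the sample standard deviation. The function names are hypothetical.

```python
import statistics


def threshold_db(significant_by_level):
    """Lowest level (dB SPL) with a significant response that is also
    followed by a significant response at the next higher tested level.

    significant_by_level maps level in dB SPL -> bool.
    Returns None if no level qualifies.
    """
    levels = sorted(significant_by_level)
    for i, level in enumerate(levels):
        if not significant_by_level[level]:
            continue
        is_top = i == len(levels) - 1  # assumption: top level needs no neighbor
        if is_top or significant_by_level[levels[i + 1]]:
            return level
    return None


def first_spike_latency(trials_s, onset_s=0.0):
    """Return (mean FSL, FSL jitter) in seconds across trials.

    trials_s is a list of per-trial spike-time lists. Trials with no spike
    at or after tone onset are skipped (an assumption).
    """
    firsts = []
    for spikes in trials_s:
        after = [t - onset_s for t in spikes if t >= onset_s]
        if after:
            firsts.append(min(after))
    if len(firsts) < 2:
        return None  # the standard deviation needs at least two trials
    return statistics.mean(firsts), statistics.stdev(firsts)
```

Under these definitions, characteristic frequency would be the frequency whose `threshold_db` value is lowest, and FSL would be computed from trials at that frequency 20 dB above threshold.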

We examined the frequency and amplitude tuning of each neuron by plotting peristimulus time histograms (PSTHs) for all frequency-amplitude combinations for each cell (10-ms bin size). In addition, we plotted a tuning curve as response strength values; response strength is defined here as the baseline-corrected firing rate (average firing rate during stimulus presentations minus average firing rate during baseline periods). Response strength was used in plots of frequency-amplitude tuning to adjust for differences in spontaneous firing rates between cells. To assess frequency selectivity we measured the frequency bandwidth 20 dB above threshold and at 70 dB (this value was set to zero if a cell did not respond at 70 dB). Amplitude tuning was measured as the range of amplitudes responded to at CF. We also calculated the maximum average evoked response strength at CF and the stimulus intensity that elicited the maximum response at CF. Peak firing rate was measured as the PSTH bin (10 ms) with the highest firing rate at CF, and the time of peak firing rate corresponded to the time of the peak PSTH bin poststimulus onset.
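As a rough illustration of the measures just defined, the sketch below computes response strength (baseline-corrected firing rate) and the peak 10-ms PSTH bin. It is a Python sketch under stated assumptions, not the authors' analysis code: spike times are taken in milliseconds relative to tone onset, the baseline window is the same length as the stimulus window (as the text specifies), and the time of the peak is reported as the start edge of the peak bin, which the text does not specify.

```python
def response_strength(stim_counts, base_counts, window_s):
    """Mean firing rate during the stimulus window minus mean baseline rate (spikes/s).

    stim_counts and base_counts are per-trial spike counts taken in
    windows of equal duration window_s (seconds).
    """
    stim_rate = sum(stim_counts) / (len(stim_counts) * window_s)
    base_rate = sum(base_counts) / (len(base_counts) * window_s)
    return stim_rate - base_rate


def psth(trials_ms, t_start_ms, t_stop_ms, bin_ms=10.0):
    """Trial-averaged firing rate (spikes/s) in consecutive bins of bin_ms."""
    n_bins = round((t_stop_ms - t_start_ms) / bin_ms)
    counts = [0] * n_bins
    for spikes in trials_ms:
        for t in spikes:
            if t < t_start_ms:
                continue
            k = int((t - t_start_ms) / bin_ms)
            if k < n_bins:
                counts[k] += 1
    return [c / (len(trials_ms) * bin_ms / 1000.0) for c in counts]


def peak_bin(rates, t_start_ms=0.0, bin_ms=10.0):
    """Return (peak firing rate, start time in ms of the bin holding the peak)."""
    k = max(range(len(rates)), key=rates.__getitem__)
    return rates[k], t_start_ms + k * bin_ms
```

Subtracting the baseline rate is what lets tuning plots be compared across cells whose spontaneous rates differ widely.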

We classified neurons qualitatively according to their temporal response patterns at CF (Woolley and Casseday 2004). Cells with consistent temporal responses fell into five categories: 1) onset, firing rate increased only at tone onset; 2) sustained, increased firing rate throughout tone presentations; 3) primary-like, strong firing at tone onset followed by significant but lower spiking; 4) delayed, spiking that began during tone presentation and continued at least 20 ms beyond end of tone stimulus; and 5) offset, no significant response during tone followed by strong response after tone offset. The last category suggests that tones could evoke suppression, but we did not observe significant inhibition (see below), possibly due to low baseline firing rates. Firing rates of cells with onset and primary-like responses will be artificially depressed by calculating the number of spikes across the entire tone duration; as indicated above we therefore varied the duration of the analysis window and measured the firing rate of the single PSTH bin with the maximum firing rate at CF using a bin size of 10 ms.

We classified neurons qualitatively according to the shape of their tuning curves (frequency response areas defined by frequency-amplitude combinations with firing rates significantly different from baseline): 1) V-shape, neurons responded to a progressively wider range of frequencies as amplitude increased; 2) complex, neurons responded to multiple nonadjacent frequency-amplitude combinations, including dual tuning to frequencies that were integer multiples; 3) columnar, neurons responded to a very narrow range of frequencies and spanned an amplitude range of 30–70 dB or more; and 4) O-shape, neurons responded to a narrow range of frequencies and spanned an amplitude range of ~20 dB. Due to the complexity of response patterns, these categories were not necessarily mutually exclusive, such that neurons could receive a primary and a secondary designation (see below and Fig. 7).

Fig. 7.

Examples of different types of tuning curves. Characteristic frequencies (CFs) are marked by arrows. The cell with V-shaped tuning curve at top is the same cell depicted in Fig. 6, top; CF was 2.0 kHz, and suppression was seen at lower and higher frequencies (blue triangle marks suppression at 1.4 kHz between 10 and 30 dB; cf. Fig. 6). The cell with a complex columnar tuning curve (bottom) showed significant responses at an integer multiple (2.4–2.6 and 4.8–5.2 kHz). The cell with a complex tuning curve (2nd from top) showed a primary response at 5.2 kHz (CF) and a weaker response at 6.8 kHz. NS, narrow spiking; BS, broad spiking. See results.

We classified the rate-level functions at CF into four categories based on procedures of Caras et al. (2012): monotonic increasing, monotonic decreasing, nonmonotonic, and not tuned to intensity. A boundary was set halfway between a noise floor (2 SD above the baseline firing rate for each cell) and the maximum average evoked firing rate. Cells for which the maximum firing rate was evoked at lower intensity levels followed by a decrease in firing rate at higher intensity levels to below the boundary were classified as monotonic decreasing. Conversely, neurons with lower firing rates at low intensities that crossed this boundary and stayed above it by maintaining a high firing rate at higher stimulus intensities were classified as monotonic increasing. If the evoked firing rate of a neuron increased and then dropped below the boundary at intensities beyond those that evoked the maximum firing rate, it was classified as nonmonotonic. A few cells in this latter category increased their firing rate a second time, crossing back over the boundary (NS2, n = 1 out of 8; BS, n = 2 out of 15; and DT, n = 1 out of 10); these cells therefore had a bimodal nonmonotonic function and were included in the nonmonotonic category. In addition, three NS2 cells had nonmonotonic inverse functions; their firing rates dropped below the boundary and then increased (these functions were inverted to plot with other cells in Fig. 8, see Rate-level functions below). Cells for which the firing rate stayed above the boundary at all levels were classified as “not tuned to intensity,” unless their firing rate increased or decreased by ≥50%. A total of five cells (2 NS and 3 BS) were classified as not tuned to intensity, and their rate-level functions were not considered further. Cells that increased or decreased their firing rate by ≥50% were classified as either monotonic increasing or monotonic decreasing, respectively.
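The boundary-based scheme above can be sketched in simplified form. This Python example is only an approximation of the described procedure: it covers the three main categories plus "not tuned to intensity", and it deliberately omits the ≥50% change rule, the bimodal case, and the inverse nonmonotonic functions. Reading "increased" as "started below the boundary" is an assumption, and the function names are hypothetical.

```python
def halfway_boundary(baseline_mean, baseline_sd, rates):
    """Boundary halfway between the noise floor (baseline + 2 SD) and the maximum evoked rate."""
    noise_floor = baseline_mean + 2 * baseline_sd
    return noise_floor + 0.5 * (max(rates) - noise_floor)


def classify_rate_level(rates, boundary):
    """Classify a rate-level function given rates ordered from lowest to highest intensity.

    Simplified reading of the text:
      below boundary on both sides of the peak  -> nonmonotonic
      below boundary only above the peak level  -> monotonic decreasing
      below boundary only below the peak level  -> monotonic increasing
      never below the boundary                  -> not tuned to intensity
    """
    peak = rates.index(max(rates))
    below_before = any(r < boundary for r in rates[:peak])
    below_after = any(r < boundary for r in rates[peak + 1:])
    if below_before and below_after:
        return "nonmonotonic"
    if below_after:
        return "monotonic decreasing"
    if below_before:
        return "monotonic increasing"
    return "not tuned to intensity"
```

For a cell with a baseline of 2 ± 1 spikes/s and a maximum evoked rate of 40 spikes/s, the noise floor is 4 spikes/s and the boundary falls at 22 spikes/s.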

Fig. 8.

Left: rate-level functions in narrow spiking 2 (NS2), broad spiking (BS), and double trough (DT) neurons. Graphs from top to bottom show average response strength for cells classified as NonMonotonic, Monotonic Increasing, and Monotonic Decreasing; scale of y-axis is fixed from top to bottom to facilitate comparison, while insets show the same functions at an expanded scale for NonMonotonic and Monotonic Decreasing cells. Right: example functions in single NS2, BS, and DT cells for each rate-level category. The nonmonotonic rate-level function for the DT neuron appears to be bimodal but is not, since the response strength at 40 dB did not fall below the halfway boundary between the baseline firing rate and the maximum evoked firing rate (see methods).

Song responses.

A unit was classified as showing song-evoked auditory responsivity if it showed a significant difference in firing rate to any one of the six song stimuli presented (paired t-test between spikes/s in baseline vs. song presentations). For each neuron, we calculated the number of songs that elicited a significant response as one measure of song selectivity, with significant responses to an increased number of songs indicating less selectivity (i.e., 1 song response = most selective, 6 song responses = least selective). We measured a burst fraction by calculating the fraction of spikes with an interspike interval (ISI) of <5 ms during both baseline and song periods and made population frequency histograms of ISIs from 1 to 100 ms using a bin size of 2 ms.
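The burst fraction can be illustrated with a few lines of Python. The sketch computes the fraction of interspike intervals shorter than 5 ms; treating "fraction of spikes with an ISI of <5 ms" as a fraction of intervals, and returning 0 when fewer than two spikes are available, are both assumptions. The function name is hypothetical.

```python
def burst_fraction(spike_times_s, max_isi_s=0.005):
    """Fraction of interspike intervals shorter than max_isi_s (default 5 ms)."""
    times = sorted(spike_times_s)
    isis = [b - a for a, b in zip(times, times[1:])]
    if not isis:
        return 0.0  # assumption: no intervals means no bursting
    return sum(isi < max_isi_s for isi in isis) / len(isis)
```

Applying the same function separately to baseline and song periods gives the paired values that are compared within each cell group.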

The precision of spike timing during song playback was evaluated in several ways. PSTHs with a bin size of 5 ms were constructed for each song; a threshold was set such that if the number of spikes in a bin was >2 SD of the baseline response (for excitation) or <1 SD (for suppression), then that bin was classified as containing a song-evoked response relative to baseline. We calculated the fraction of bins meeting these arbitrary criteria out of the total number of bins across all songs. In addition, “events” were defined as two or more contiguous bins with a response, separated by at least three bins with no response. An additional criterion was that events had to include evoked spikes on at least 40% (4/10) of song trials. Thus events corresponded to stimulus features within songs that reliably evoked spikes across trials (Billimoria et al. 2006). We compared the number of events during baseline periods vs. song playback to quantify the incidence of song-elicited events. We did not identify any instances of suppressed events, likely due to low baseline firing rates, and so all analyses were restricted to excitatory events. Reliability of events was measured as the fraction of trials in which at least one spike was elicited within individual events; we took the average of this number across events for each song. Spike times were extracted and referenced to stimulus onset for each song. Jitter was measured as the standard deviation of all individual spike times for all events across songs.
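The event definition above can be sketched as a small run-finding routine. In this Python sketch, runs of at least two contiguous responsive bins are candidate events, and candidates separated by fewer than three bins are merged into one event; the merging rule is one possible reading of "separated by at least three bins with no response" and is an assumption, as are the function names. Spike times are taken in milliseconds from song onset with 5-ms bins.

```python
def find_events(responsive, min_bins=2, min_gap=3):
    """Return events as (start_bin, stop_bin) pairs, stop exclusive.

    responsive is a list of bools, one per PSTH bin, marking bins that
    met the response criterion.
    """
    runs, start = [], None
    for i, flag in enumerate(list(responsive) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_bins:  # keep only runs of >= min_bins bins
                runs.append((start, i))
            start = None
    events = []
    for run in runs:
        if events and run[0] - events[-1][1] < min_gap:
            events[-1] = (events[-1][0], run[1])  # too close: merge into previous event
        else:
            events.append(run)
    return events


def event_reliability(trials_ms, event, bin_ms=5.0):
    """Fraction of trials with at least one spike inside the event window."""
    t0, t1 = event[0] * bin_ms, event[1] * bin_ms
    hits = sum(any(t0 <= t < t1 for t in spikes) for spikes in trials_ms)
    return hits / len(trials_ms)


def reliable_events(responsive, trials_ms, min_reliability=0.4):
    """Keep events with spikes on at least min_reliability of trials (40% in the text)."""
    return [e for e in find_events(responsive)
            if event_reliability(trials_ms, e) >= min_reliability]
```

With 10 song presentations, the 40% criterion corresponds to spikes on at least 4 trials within an event.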

To further quantify the precision of spike timing in single neurons across repeated trials of song playback, we measured the number of instances in which spikes occurred during the same window of time across all pair-wise combinations of spike trains for playback of each song. This “correlation index” is based on measuring coincident spikes across spike trains followed by a normalization to compare across cells (Joris et al. 2006; Louage et al. 2004). We used a window of 1 ms to define coincidence of spikes. The correlation index (CI) for each song in each cell was calculated as

$$\mathrm{CI} = \frac{N_c}{N(N-1)\,r^{2}\,\omega\,D}$$

where Nc is the number of coincident spikes, N is the number of trials, r is the average firing rate, ω is the bin width (1 ms), and D is the duration of each song stimulus. A value of 1.0 indicates chance (random spiking across trials that lacks temporal structure); larger values indicate greater degrees of correlation in spike times across trials and lower values indicate anticorrelation. We measured the CI value for each song for each cell, and compared the maximum CI value across all songs for each cell.
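A minimal Python sketch of this calculation is given below. It counts coincidences over all ordered pairs of distinct trials and treats two spikes as coincident when their times differ by no more than half the window; both conventions are assumptions (chosen so that independent random spiking gives an expected value near 1), and the function name is hypothetical. Spike times and durations are in seconds, and the firing rate is averaged across all trials.

```python
def correlation_index(trials_s, duration_s, window_s=0.001):
    """Normalized count of coincident spikes across trials of one song.

    trials_s: list of per-trial spike-time lists (seconds from song onset).
    Returns None if fewer than two trials or no spikes are available.
    """
    n = len(trials_s)
    total_spikes = sum(len(spikes) for spikes in trials_s)
    if n < 2 or total_spikes == 0:
        return None
    rate = total_spikes / (n * duration_s)  # average firing rate, spikes/s
    half = window_s / 2.0
    n_coincident = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # only compare different trials
            for a in trials_s[i]:
                for b in trials_s[j]:
                    if abs(a - b) <= half:
                        n_coincident += 1
    return n_coincident / (n * (n - 1) * rate ** 2 * window_s * duration_s)
```

As a check on the normalization, N identical trains of k well-separated spikes over a duration D give a value of D/(kω); for k = 10, D = 1 s, and ω = 1 ms this is 100, far above the chance level of 1.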

Statistics.

Paired t-tests were used to evaluate significant responses as a difference in firing rate (spikes/sec) during tone or song presentations compared with prestimulus baseline periods for each neuron. Most dependent measures were not normally distributed (Jarque-Bera tests), and we therefore chose to use nonparametric statistics, an approach that has the advantage of minimizing the influence of outliers. Kruskal-Wallis tests were used to assess main effects of differences between cell groups, followed by Mann-Whitney tests to assess significance of individual comparisons. Holm corrections were used to adjust for multiple comparisons. Wilcoxon paired-rank tests were used to assess the incidence of bursting and of events within each group based on comparisons between baseline and song periods. Kruskal-Wallis tests were also used to assess main effects of differences between NS2, BS, and DT cell groups within two categories of rate-level functions: monotonic increasing plus monotonic decreasing and nonmonotonic; cells with either increasing or decreasing monotonic functions were combined due to the low number of cells in these two categories (see Rate-level functions below, Fig. 8).
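For readers unfamiliar with the Holm procedure, the step-down adjustment can be written in a few lines. The sketch below is a generic Python implementation of Holm-adjusted P values, not the authors' code; the function name is hypothetical and the input is assumed to be one family of raw P values from the individual Mann-Whitney comparisons.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P values in the same order as the input.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and values are
    capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

For raw values of 0.01, 0.04, and 0.03, the adjusted values are 0.03, 0.06, and 0.06: a comparison that is nominally significant at 0.05 can lose significance after correction.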

RESULTS

We recorded 133 single neurons in dorsocaudal NCM, a region of higher level auditory cortex. Examination of average spike waveforms revealed that many cells with broad spikes exhibited a small-amplitude trough superimposed on the repolarization phase of the waveform (Fig. 2C). This discontinuity seemed to represent an additional conductance giving rise to a small inward current followed by resumption of the repolarization phase. Because it was impossible to measure spike widths in average waveforms that displayed this discontinuity, we placed these cells in a separate group called DT (n = 46). The distribution of spike widths among the remaining neurons formed a heterogeneous distribution with a break between 0.36 and 0.39 ms, which is similar to the separation found between NS and BS neurons in previous investigations of NCM neurons (Calabrese and Woolley 2015; Schneider and Woolley 2013; Yanagihara and Yazaki-Sugiyama 2016) (Fig. 2A). We therefore placed cells with a spike duration ≤0.36 ms in an NS group (n = 36) and those with a spike width >0.36 ms in a BS group (n = 51), in accord with prior work.

NS neurons also tended to have a larger trough compared with BS neurons; however, plotting spike widths against trough-to-peak ratios for all NS and BS cells revealed a small subpopulation of NS cells with smaller troughs (n = 13, trough-to-peak ratios <0.30) that overlapped the distribution of trough-to-peak ratios seen in BS cells (Fig. 2B). The spontaneous firing rates of this subgroup of NS cells were substantially lower than those of the remaining NS cells but higher than those of BS cells (see Figs. 3 and 4), suggesting that they may represent a separate subgroup of NS cells. We therefore placed these NS cells in a category referred to as NS1; the remaining cells, referred to as NS2 (n = 23, trough-to-peak ratios >0.30), may represent fast-spiking interneurons.
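The operational grouping described so far reduces to a few threshold rules, sketched here in Python. The cutoffs (0.36 ms spike width, 0.30 trough-to-peak ratio) come from the text; the function name, the boolean flag for a double-trough waveform, and the handling of a ratio of exactly 0.30 are assumptions. The sketch uses the ratio alone to separate NS1 from NS2, whereas the authors also inspected spontaneous firing rates.

```python
def classify_cell(spike_width_ms, trough_to_peak_ratio, double_trough=False):
    """Assign a cell to DT, BS, NS1, or NS2 from waveform measurements.

    double_trough marks cells whose average waveform shows a second trough,
    which precludes a spike-width measurement.
    """
    if double_trough:
        return "DT"
    if spike_width_ms > 0.36:
        return "BS"
    # Narrow-spiking cells: small troughs -> NS1, otherwise NS2.
    # A ratio of exactly 0.30 is assigned to NS2 here (assumption).
    return "NS1" if trough_to_peak_ratio < 0.30 else "NS2"
```

For example, a cell with a 0.25-ms spike and a ratio of 0.5 would be labeled NS2, and one with a 0.8-ms spike would be labeled BS regardless of its ratio.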

Fig. 3.

Differences in tone-evoked responses between cell groups for selected measures. Each box-and-whisker plot shows median and 1st/3rd quartiles for selected dependent measures from Fig. 4, which gives all statistical comparisons; red plus signs depict means. *P < 0.01, **P < 0.005, ***P < 0.0001. Outliers removed in some plots to avoid compression of y-axis: 2 from first spike latency (FSL), 1 from FSL jitter, 1 from width of frequency range at 70 dB, 2 from maximum evoked response strength (RS) at characteristic frequency (CF). NS, narrow spiking; BS, broad spiking; DT, double trough.

Fig. 4.

Tone-evoked excitatory responses (medians, interquartile range). Response measures for excitatory tone-evoked responses; P values for individual comparisons are Mann-Whitney tests, Holm corrected; cells with significant comparisons are highlighted in green with green text, while P values that trended toward significance have green text only (nonsignificant due to Holm correction for multiple comparisons). Footnotes: (1) NS, narrow spiking; BS, broad spiking; DT, double trough; see results. (2) CF, characteristic frequency. (3) First spike latency (FSL), average latency of first spike in each trial at CF, 20 dB above threshold. (4) Maximum average firing rate (FR) across trials within a 10-ms bin; see methods and results.

The DT cells presented something of a dilemma, since it was impossible to obtain an uncontaminated measure of spike width in these cells. We examined individual spike waveforms in single DT neurons, searching for action potentials that lacked a second trough. We were able to measure ca. 40–50 spikes without the presence of a second trough in 13 of these 46 cells; 12 of these cells had longer spike widths that placed them in the range of BS values (0.6–1.4 ms), whereas 1 cell had a spike width of 0.25 ms. This pattern suggested that most of the DT cells were drawn from the BS category (see Tone Responses below). A few cells in the NS and BS categories exhibited double troughs in a low proportion of their total spike number; however, if the average waveform did not indicate the presence of a second trough, then such cells were not classified as DT. It should be noted that the four groups of cells defined here do not necessarily correspond to different cell types but simply reflect differences based on measurements of spike waveforms and firing rates.

Tone Responses

Spontaneous firing rates differed across cell groups (Fig. 3; statistical tests are given in Fig. 4). Individual comparisons showed that NS2 neurons had higher spontaneous firing rates compared with all other cell groups and that both NS cell groups had higher firing rates than those in BS and DT cells. In addition, spontaneous firing rates of BS cells were higher than those of DT cells; interestingly, this was the only significant difference in spiking responses between BS and DT neurons during tone playback tests. The higher spontaneous firing rate of NS2 cells relative to all other cell groups is consistent with the idea that these cells form a distinct subpopulation, possibly representing fast-spiking interneurons.

Strength of tone-evoked responses was highest and least variable in NS2 neurons.

The incidence of evoked responses to tones varied among cell groups (χ² = 22.46, P < 0.0001): 100% (23/23) of fast-spiking NS2 cells showed an excitatory response to tones, compared with 77% (10/13) of NS1 cells; a majority of BS neurons responded to tones (65%, 33/51), compared with only 43% (20/46) of DT cells (Table 1, bottom). Subsequent analyses were restricted to these tone-evoked excitatory responses. Median values of CF did not vary between cell groups, but thresholds, defined as the lowest intensity to elicit a response at CF, varied significantly due to NS2 cells having a lower threshold compared with DT cells (Figs. 3 and 4). Latencies to first spike following tone onset at CF in both NS1 and NS2 cells were shorter than those in BS and DT cells, which did not differ from each other (FSL; Figs. 3 and 4). NS2 latencies were tightly clustered whereas latencies of NS1 cells included two outliers. The jitter of first-spike timing was also lower in NS neurons compared with BS and DT cell groups (significant for NS2 cells). Thus NS2 cells showed low thresholds plus uniformly fast latencies to respond to CF tones as well as low trial-to-trial variability in time of first spike.
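As a check on the incidence comparison above, the statistic can be recomputed from the counts given in the text. The sketch below assumes a Pearson chi-square test on the 4 × 2 table of tone-responsive vs. nonresponsive cells (the test variant is not named here); the function names are illustrative, and the closed-form tail probability applies only to 3 degrees of freedom.

```python
import math

# (tone-responsive cells, total cells) per group, taken from the text above.
counts = {
    "NS1": (10, 13),
    "NS2": (23, 23),
    "BS": (33, 51),
    "DT": (20, 46),
}

def pearson_chi_square(table):
    """Pearson chi-square statistic for a groups x 2 (responsive / not) table."""
    rows = [(resp, total - resp) for resp, total in table.values()]
    grand_total = sum(sum(row) for row in rows)
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi2 = pearson_chi_square(counts)
p_value = chi2_sf_df3(chi2)  # df = (4 groups - 1) * (2 outcomes - 1) = 3
print(f"chi2 = {chi2:.2f}, P = {p_value:.1e}")
```

Under these assumptions the statistic rounds to 22.46, matching the reported value, and the tail probability falls below 0.0001.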

Table 1.

Qualitative categorizations of tone-evoked responses in NCM neurons

                                        Narrow Spiking 1   Narrow Spiking 2   Broad Spiking   Double Trough
                                        (NS1)              (NS2)              (BS)            (DT)
Temporal pattern
    Onset                               0.50               0.30               0.12            0.05
    Primary-like                        0.40               0.43               0.39            0.45
    Sustained                           0.00               0.26               0.33            0.20
    Delayed                             0.00               0.00               0.03            0.25
    Offset                              0.10               0.00               0.12            0.05
Tuning curve shape
    V-shape                             0.40               0.35               0.09            0.10
    Complex                             0.40               0.30               0.45            0.45
    Columnar                            0.20               0.26               0.24            0.20
    O-shape                             0.00               0.09               0.21            0.25
Rate-level function
    Monotonic increasing                0.40               0.17               0.27            0.20
    Monotonic decreasing                0.40               0.39               0.18            0.30
    Nonmonotonic                        0.20               0.35               0.45            0.50
    Not tuned to intensity              0.00               0.09               0.09            0.00
Proportion of tone-evoked responses     0.77 (10/13)       1.00 (23/23)       0.65 (33/51)    0.43 (20/46)

For each cell group: within-group proportions based on all tone-responsive cells within each group are shown for each qualitative category. NCM, caudomedial nidopallium.

We measured the frequency bandwidth around the CF for each cell at 20 dB above threshold to assess selectivity of frequency tuning. The average bandwidth for NS1 and NS2 cells was somewhat wider than that for BS and DT cells (~0.64 kHz for NS vs. 0.32 kHz for BS/DT), leading to a significant main effect; however, no individual comparisons between cell groups were significant, although NS2 cells tended to be more broadly tuned than were BS and DT neurons (P = 0.02–0.04, nonsignificant due to correction for multiple comparisons, Fig. 4). However, the range of frequencies around the CF to which neurons responded at 70 dB was significantly broader in NS1 and NS2 cells compared with BS and DT neurons (Figs. 3 and 4). In addition, the range of amplitudes to which cells responded at the CF was broader in NS1 and NS2 cells compared with BS and DT cells (particularly in NS2 cells, Fig. 4, and see below). Thus BS and DT neurons were more narrowly tuned to both frequency and amplitude than were NS neurons.
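The remark that P values of 0.02–0.04 were nonsignificant after correction can be illustrated with the Holm step-down procedure named in the Fig. 4 legend. The sketch below is illustrative only: the six raw P values are hypothetical (six is the number of pairwise comparisons among four cell groups), and the size of the comparison family actually used is an assumption.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1); rank is zero-based.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotonic
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw P values for 6 pairwise comparisons (not data from the study).
raw = [0.02, 0.04, 0.03, 0.50, 0.60, 0.90]
adjusted = holm_adjust(raw)
for p, q in zip(raw, adjusted):
    print(f"raw P = {p:.2f} -> Holm-adjusted P = {q:.2f}")
```

In this made-up family, a raw P of 0.02 is multiplied by 6 and no longer falls below 0.05, which is the sense in which an uncorrected trend can lose significance.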

The maximum average response strength at each cell’s CF was substantially larger in NS2 cells compared with both BS and DT cells, which did not differ from each other (Figs. 3 and 4). The intensity level that elicited the maximum response strength did not differ between cell groups, ranging from ~30 to 50 dB. We also measured the peak firing rate, defined as the maximum rate in a single 10-ms PSTH bin at the CF of each cell; this measure was highest in NS1 and NS2 cells, which had higher peak firing rates than did BS and DT cells. The latency to reach the bin of peak firing rate was shorter in NS1 and NS2 cells compared with BS and DT cells (time of peak firing rate, Fig. 4). Overall, NS2 cells had the highest spontaneous and evoked firing rates, in addition to faster and less variable spike latencies, relative to BS and DT neurons.
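The peak firing rate and time-of-peak measures described above can be sketched as follows. This is a minimal illustration assuming spike times are stored per trial in seconds from tone onset; the function name, data layout, and example spike times are hypothetical rather than taken from the study.

```python
def peak_firing_rate(trials, window_s, bin_s=0.010):
    """Return (peak rate in Hz, start time in s of the peak bin).

    trials: list of per-trial spike-time lists (seconds from tone onset).
    """
    n_bins = int(round(window_s / bin_s))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            b = int(t / bin_s)
            if t >= 0 and b < n_bins:
                counts[b] += 1
    # Average across trials, then convert counts per bin to spikes per second.
    rates = [c / (len(trials) * bin_s) for c in counts]
    peak_bin = max(range(n_bins), key=lambda b: rates[b])
    return rates[peak_bin], peak_bin * bin_s

# Hypothetical spike times for three trials of a 100-ms tone.
trials = [
    [0.012, 0.015, 0.047],
    [0.013, 0.018],
    [0.011, 0.062],
]
rate_hz, t_peak = peak_firing_rate(trials, window_s=0.1)
print(f"peak rate = {rate_hz:.1f} Hz in the bin starting at {t_peak * 1000:.0f} ms")
```

Because the rate is averaged across trials before the maximum is taken, a single trial with a brief burst contributes less to the peak than a response that is consistent across trials.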

Temporal response patterns and frequency response areas were simpler in NS neurons.

Table 1, Temporal pattern, shows the proportions of cells within each group assigned to categories based on qualitative judgments of temporal spiking patterns of tone responses (see Methods; these values are plotted for NS2, BS, and DT cells in Fig. 5, top, for ease of comparison). Both classes of NS neurons tended to show mostly onset and primary-like responses, although many NS2 cells (26%) showed a sustained response throughout tone stimuli. No NS neurons showed delayed or offset responses to tones. The distribution of response patterns was shifted to primary-like and sustained for BS and DT neurons. DT neurons were the only cell class with a tendency to exhibit delayed responses that continued beyond tone offset (25%), and BS neurons showed some tendency toward offset responses (12%). Figure 6, top, shows an example of an NS2 neuron that responded in a primary-like fashion to tone playback; PSTHs for all intensity levels (10–70 dB) at the CF (2.0 kHz) and surrounding frequencies show peak response rates at tone onset followed by a sustained response at a lower firing rate. Figure 6, bottom, shows individual frequency-amplitude combinations demonstrating examples of other temporal response patterns.

Fig. 5.

Proportions of different qualitative categories of response patterns for narrow spiking 2 (NS2), broad spiking (BS), and double trough (DT) cell groups; narrow spiking 1 (NS1) neurons are not included due to small numbers within each subgroup. See Table 1 and results for further details.

Fig. 6.

Examples of different tone-evoked temporal response patterns. Top: peristimulus time histograms at all intensity levels (10–70 dB) for several frequencies surrounding the characteristic frequency (CF) (2.0 kHz) in a narrow spiking 2 (NS2) neuron that responded in a primary-like fashion; red lines mark tone presentation periods (100 ms); frequency-amplitude combinations inside gray outline evoked significant responses. Bottom: examples of single frequency-amplitude combinations for cells that showed different types of temporal response patterns; red lines mark tone presentations, and blue lines mark duration of analysis window (from tone onset; see methods). NS, narrow spiking; BS, broad spiking; DT, double trough.

A large proportion of both NS1 and NS2 cells exhibited V-shape tuning curves (Table 1, Tuning curve shape, and Fig. 5), whereas BS and DT neurons rarely had V-shape tuning curves. All four types of neurons showed complex tuning curves in which nonadjacent frequency-amplitude combinations evoked responses, although BS and DT neurons tended to show a higher proportion of complex tuning compared with NS2 neurons. Figure 7 shows a complex tuning curve from an NS2 neuron that had a primary response at ~5.2 kHz (the CF) and a secondary response at ~6.8 kHz. All four cell groups also showed a fairly large proportion of columnar frequency response areas, whereas only BS and DT neurons exhibited a strong tendency to have O-shape frequency-response areas that were tuned for both frequency and amplitude (Fig. 7). The narrower range of frequencies and amplitudes to which BS and DT neurons responded compared with NS neurons (Fig. 4) is consistent with the finding that BS and DT neurons had a higher proportion of O-shape and columnar receptive fields relative to V-shape frequency-response areas (Table 1). Some neurons were tuned to multiple frequencies, and tuning to integer multiples of two or three different frequencies was prominent in some cases. These excitatory responses to harmonically related frequencies are likely to contribute to processing complex sounds rich in harmonics (see Wang, 2013, for review). A relatively small number of NS2 neurons displayed multiple-frequency tuning (4/23, 17%) compared with BS (15/33, 45%) and DT (10/20, 50%) neurons, although when multiple tuning occurred in NS2 neurons it was quite clear (Fig. 7, bottom).

Suppressed responses to tone playback were uncommon. However, many NS neurons tended to show suppressed responses for frequency-amplitude combinations outside areas of excitation. For example, Fig. 6 shows clear suppression at 1.4–1.5 kHz, 10–30 dB for an NS2 cell with a CF of 2.0 kHz; the full tuning curve for this cell is shown in Fig. 7, top (blue triangle marks 1.4 kHz) and indicates some suppression both below and above the excitatory frequency response area. For the most part, decreased firing rates for frequency-amplitude combinations adjacent to excitatory CFs were nonsignificant. In addition, it was difficult or impossible to identify suppressed responses in BS and DT neurons due to their low spontaneous firing rate, so we did not attempt to examine tone-evoked suppression in these cells.

Rate-level functions.

The responses of NS2, BS, and DT neurons at CF were classified as nonmonotonic, monotonic increasing, or monotonic decreasing based on their rate-level functions (see methods); we did not include NS1 neurons due to their small number. Monotonic neurons either increase or decrease their firing rate with increasing stimulus intensities, whereas nonmonotonic neurons increase their firing rate at intermediate stimulus intensities and then decrease at higher intensities. Figure 8, left, shows average values of response strength as a function of intensity within each category; three NS2 cells had inverse nonmonotonic functions (i.e., lower response strengths at intermediate intensities) and these functions were inverted in Fig. 8, top left. Figure 8, right, shows examples of each rate-level category for single neurons from each cell group. Although the average functions for nonmonotonic cells appear to have fairly broad amplitude ranges within each cell group, this reflected the fact that individual nonmonotonic neurons were tuned to different amplitudes, ranging from 20 to 60 dB (data not shown). In addition, either one or two neurons within each cell group had bimodal nonmonotonic functions (see methods), which increased and decreased their firing rate twice as intensity increased.
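The verbal definitions of the rate-level categories can be turned into a simple classifier. The sketch below is a strict reading of those definitions and is not the procedure from the Methods, which is not reproduced here: the tolerance parameter, the "flat" and "other" labels, and the example values are all assumptions made for illustration.

```python
def classify_rate_level(rates, tol=0.0):
    """Classify response strengths (Hz) measured at increasing intensity levels.

    tol is a hypothetical tolerance (Hz) for treating small changes as no change.
    """
    diffs = [b - a for a, b in zip(rates, rates[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "flat"  # no systematic change with intensity
    if all(d >= -tol for d in diffs):
        return "monotonic increasing"
    if all(d <= tol for d in diffs):
        return "monotonic decreasing"
    peak = max(range(len(rates)), key=lambda i: rates[i])
    rising = all(d >= -tol for d in diffs[:peak])
    falling = all(d <= tol for d in diffs[peak:])
    if 0 < peak < len(rates) - 1 and rising and falling:
        return "nonmonotonic"  # rises to an intermediate intensity, then falls
    return "other"  # e.g., inverse or bimodal functions

# Hypothetical response strengths at 10, 20, ..., 70 dB.
examples = {
    "a": [1, 5, 9, 20, 33, 41, 50],
    "b": [50, 41, 33, 20, 9, 5, 1],
    "c": [2, 10, 25, 40, 30, 12, 5],
    "d": [30, 12, 4, 2, 6, 15, 28],
}
labels = {name: classify_rate_level(r) for name, r in examples.items()}
print(labels)
```

Real rate-level functions are noisy, so a strict test like this would need a nonzero tolerance or a statistical criterion before being applied to recorded data.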

Response strengths of BS and DT cells were highly similar across levels within all three rate-level categories, whereas response strengths of NS2 cells were higher than those of BS and DT cells in all categories. The increased response strength of NS2 neurons relative to BS and DT neurons was greatest for monotonic increasing cells and least for nonmonotonic cells. Interestingly, the range of average response strengths across levels did not differ substantially between rate-level categories for BS and DT cells, falling between ~0 and 35 Hz in all three categories; thus firing rates of BS/DT cells in each of the three categories were similar to those seen across the general populations of BS/DT cells. In contrast, response strengths of NS2 cells were highest in the monotonic-increasing category, substantially lower in the monotonic-decreasing category, and lowest in the nonmonotonic category.

We examined the tone-evoked dependent measures described above in Fig. 4 separately within nonmonotonic vs. monotonic categories (monotonic increasing and decreasing were combined due to low numbers; see methods). The differences reported in Fig. 4 did not change as a function of the presence or absence of monotonicity, with two exceptions. First, the range of amplitudes that evoked a response at CF was narrower in BS and DT cells compared with NS2 cells within monotonic increasing/decreasing categories (P < 0.0002), as for the general populations within each cell group (Fig. 4), but there was no difference between cell groups in the nonmonotonic category (P = 0.16). The absence of a difference within the nonmonotonic category reflects the fact that nonmonotonic cells in all three groups were tuned to amplitude (by definition). This outcome indicates that nonmonotonic functions of NS2 cells were not more sharply tuned than those of BS and DT neurons. Second, the frequency range that evoked a response at threshold plus 20 dB was broader in NS2 cells compared with BS and DT neurons within monotonic increasing/decreasing categories (P = 0.02 in both cases), as for the trend in the general cell populations (Fig. 4), but all three cell groups responded to a similar frequency range (~0.3 kHz bandwidth) in the nonmonotonic category (P = 0.99). Thus the tendency of NS2 neurons to respond to a broader range of amplitudes and frequencies relative to BS/DT neurons was preserved in the monotonic increasing/decreasing categories but not in the nonmonotonic category.

In addition, although NS2 cells continued to show higher tone-evoked firing rates compared with BS and DT cells in both monotonic and nonmonotonic categories, this difference was greatly diminished in the nonmonotonic category due to a substantial decrease in the firing rates of NS2 cells while those of BS and DT cells were largely unchanged (Fig. 8). For example, the average maximum response strength at CF decreased from 113.0 Hz across all NS2 cells (Fig. 3, bottom right, red plus sign) to 65.4 Hz in nonmonotonic NS2 cells, but maximum response strength of nonmonotonic BS/DT neurons remained at the same level as that seen across all BS/DT cells (~38–40 Hz); this difference in firing rate between NS2 and BS/DT nonmonotonic neurons was still significant (P = 0.03). The disparity in maximum response strength between NS2 and BS/DT cells seen in the general populations for each cell group (Fig. 4) was also preserved in both monotonic categories but increased in monotonic-increasing cells due to an increased firing rate in NS2 neurons (from 113.0 to 171.1 Hz), and decreased in monotonic-decreasing cells due to a decreased firing rate in NS2 neurons (from 113.0 to 74.0 Hz) (P < 0.01 in all cases). Thus relative differences in firing rates between cell groups were preserved in different rate-level categories, but the increased response strength seen in NS2 cells was greatest in monotonic increasing and smallest in nonmonotonic cells.

Song Responses

Song selectivity.

Table 2, columns 2–5, shows the proportions of cells within each group that showed excitatory responses only, suppressed responses only, both excitation and suppression, or no response to any song. All NS2 cells showed only song-excited responses, and 9/11 NS1 cells showed excited responses only. In contrast, only half of BS and DT cells showed excited responses only. A fairly high proportion of both BS and DT cells did not respond to any songs, and 20% of BS neurons showed only suppressed responses to song playback, which was substantially more than other groups. However, it was difficult to judge suppressed responses to songs in BS and DT cells due to their low spontaneous firing rates, such that suppressed responses are likely to be underestimated. The patterns in Table 2, columns 2–5, suggest that song-evoked responses of BS and DT neurons are more selective than are those of NS neurons, in the sense that BS and DT cells may respond to fewer songs and/or fewer elements within songs. We tested this idea by quantifying song selectivity among excitatory responses; we did not analyze suppressed responses across cell groups due to their low incidence (see below).

Table 2.

Types of song responses in NCM

                             Responses Across All Cell Groups                BS and DT Neurons With or Without Tonal Responses
Song Responses               NS1         NS2         BS          DT          BS, No Tonal   BS, Clear      DT, No Tonal   DT, Clear
                                                                             Response       Tuning Curve   Response       Tuning Curve
Excitation only              0.82 (9)    1.00 (22)   0.48 (21)   0.54 (25)   0.31 (5)       0.57 (16)      0.42 (11)      0.70 (14)
Suppression only             0.09 (1)    0.00 (0)    0.20 (9)    0.07 (3)    0.31 (5)       0.14 (4)       0.08 (2)       0.05 (1)
Excitation and suppression   0.00 (0)    0.00 (0)    0.07 (3)    0.00 (0)    0.00 (0)       0.11 (3)       0.00 (0)       0.00 (0)
No song response             0.09 (1)    0.00 (0)    0.25 (11)   0.39 (18)   0.38 (6)       0.18 (5)       0.50 (13)      0.25 (5)
Total numbers                11          22          44          46          16             28             26             20

Columns 2–5: within-group proportions for different categories of song responses based on all cells within each group. Columns 6–9: within-category proportions for different categories of song responses in BS and DT neurons that either lacked tone-selective responses (did not evince a clear response to any pure tone) or exhibited a clear frequency response area. NS1, narrow spiking 1; NS2, narrow spiking 2; BS, broad spiking; DT, double trough.

Our initial measure of selectivity was the number of songs that evoked a significant excitatory response (by paired t-test against baseline); NS1 and NS2 neurons responded to more songs than did BS and DT neurons (Fig. 9). Both NS1 and NS2 neurons tended to show significant responses to all or most of the six songs played, whereas many BS and DT neurons responded to zero to two songs (Fig. 10). Thus BS and DT neurons were more selective in terms of song responsivity than were NS neurons. However, it should be noted that some cells in both BS and DT groups responded to all six songs (15 and 7%, respectively, compared with 82% for NS2 cells). In addition, the response strength of NS2 neurons to songs that evoked a response was substantially higher compared with all other cell groups, which did not differ from each other (Figs. 9 and 10); this pattern was also true when the response strength for the song with the highest firing rate was considered (data not shown). Thus cell groups differed strongly in terms of their selectivity based on excitatory responses to playback of different conspecific songs, and less selective neurons (in particular NS2) had higher song-evoked firing rates.

Fig. 9.

Song-evoked excitatory responses (median, interquartile range). Response measures for excitatory song-evoked responses (for each cell, the average value across all songs that evoked an excitatory response); narrow spiking 1 (NS1): n = 9; narrow spiking 2 (NS2): n = 22; broad spiking (BS): n = 24; and double trough (DT): n = 25 (cf. Table 2). P values for individual comparisons are for Mann-Whitney tests, Holm corrected; cells with significant comparisons are highlighted in green with green text, while P values that trended toward significance have green text only (nonsignificant due to Holm correction for multiple comparisons). (a) Number of songs that evoked an excitatory response (significant response above baseline by paired t-test). (b) Response strength for songs that evoked a significant excitatory response (average across all songs that elicited a significant response). (c) Number of songs that evoked at least 1 excitatory event. (d) Number of excitatory events per song. (e) Coefficient of variation (CV) of number of events across songs. (f) Percentage of 5-ms peristimulus time histogram (PSTH) bins with spike number exceeding baseline by at least 2 SD. (g) Duration of events across songs. (h) Fraction of trials in which at least one spike was elicited within each event across songs. (i) Jitter equals standard deviation of spike times within excitatory events across songs for each cell. (j) Median interspike interval (ISI) during song-excited responses. (k) Percentage of spikes with ISI <5 ms during baseline. (l) Percentage of spikes with ISI <5 ms during songs with maximum excitatory response strength. (m) Maximum correlation index (CI) value out of all songs; NS1 cells not tested statistically due to low numbers.

Fig. 10.

Differences in song-evoked excitatory responses between cell groups. Each box-and-whisker plot shows median and 1st/3rd quartiles for selected dependent measures from Fig. 9, which gives all statistical comparisons; red plus signs depict means. NS, narrow spiking; BS, broad spiking; DT, double trough. **P < 0.008, ***P < 0.0001.

Because BS and DT neurons responded sparsely during songs, they might not show a significant response when comparing average firing rate between baseline and entire song durations by paired t-test. We therefore identified spiking “events” during each song as contiguous 5-ms PSTH bins that exceeded the baseline firing rate by 2 SD (see Methods). Our rationale was to identify cells that showed a significant excitatory response only during restricted time points in a song. In addition to identifying sparsely distributed spiking activity, events by definition are time-aligned to specific stimulus features within songs and thus provide some indication of spike timing. All cell groups showed a significant increase in events during song playback versus baseline (Wilcoxon signed-rank tests, P values ranged from 0.001 for NS1 cells to 2.7 × 10⁻⁷ for BS cells). Figure 11 shows examples of events (green highlighting) in both raster and PSTH displays, with example raw traces above, for individual neurons from NS2, BS, and DT cell classes. To our surprise, we did not identify suppressed events in any cell, including NS2 neurons (even with a criterion of 1 SD below baseline; see Methods). Thus, even when targeting discrete periods of low spiking within entire song durations, we were not able to measure significant suppression compared with baseline. The low baseline firing rates in BS and DT neurons likely preclude detection of inhibition, but the lack of suppressed events in NS2 neurons raises the possibility that these cells receive mostly excitatory inputs.
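The event definition above (contiguous 5-ms PSTH bins exceeding baseline by 2 SD) can be sketched as a run-finding routine. This is an illustration under stated assumptions, not the analysis code: the baseline mean and SD are computed here across baseline bins with a population SD, single-bin runs are counted as events, and all names and spike counts are hypothetical.

```python
import statistics

def find_events(psth_counts, baseline_mean, baseline_sd, n_sd=2.0):
    """Return (first_bin, last_bin) index pairs for runs of supra-threshold bins."""
    threshold = baseline_mean + n_sd * baseline_sd
    events, start = [], None
    for i, count in enumerate(psth_counts):
        if count > threshold:
            if start is None:
                start = i  # a new run begins
        elif start is not None:
            events.append((start, i - 1))  # the run ended at the previous bin
            start = None
    if start is not None:
        events.append((start, len(psth_counts) - 1))
    return events

# Hypothetical spike counts per 5-ms bin (not data from the study).
baseline = [0, 1, 0, 0, 1, 0, 1, 0]
song = [0, 1, 3, 4, 0, 0, 2, 0, 1, 5]

events = find_events(song, statistics.mean(baseline), statistics.pstdev(baseline))
for first, last in events:
    print(f"event from {first * 5} to {(last + 1) * 5} ms")
```

A minimum run length or a different SD estimate would change which bins qualify, so those choices should be taken from the Methods when reproducing the analysis.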

Fig. 11.

Examples of excitatory events during song playback in different cell groups. Top: example raw traces; middle: rasters; bottom: peristimulus time histograms (PSTHs); song waveforms are shown below each group; green shading marks each event in rasters and PSTHs. NS, narrow spiking; BS, broad spiking; DT, double trough. See results.

The number of songs containing at least one event provides a second measure of song selectivity. As expected, the number of cells deemed song responsive was higher when assessed by the number of songs that evoked at least one event, compared with responses measured by paired t-tests, particularly for BS and DT cells (Fig. 10). However, song selectivity was still greater in BS and DT neurons (Fig. 9). All NS cells contained at least 1 event in all 6 songs (with the exception of 1 outlier in the NS1 group), compared with an average of 3.5–3.8 songs with at least one event for BS and DT neurons (red plus signs in Fig. 10). The number of events per song (across all 6 songs) was highest in NS2 cells, followed by NS1 cells, which in turn had more events per song than did BS and DT cells (Figs. 9 and 10). The decreased incidence of events in sparsely firing cells was reflected in the fact that 16% of BS cells (7/44) and 17% of DT cells (8/46) had zero events across all six songs whereas no NS cells had zero events across songs. These results show that BS and DT neurons were more selective when assessed by responses to entire songs as well as by the incidence of time-aligned spiking events.

Variability in the number of events per song was higher in BS and DT neurons compared with both types of NS neurons as measured by coefficient of variation values, indicating more song-to-song variation in number of events in BS and DT neurons (Figs. 9 and 10). This increased variability may reflect the sparse firing rates of BS and DT neurons and/or a more probabilistic pattern of spiking across song playbacks. In addition, the percentage of PSTH bins in which the number of spikes was >2 SD above baseline was lowest in BS/DT cells and highest in NS2 cells, indicating that spikes were more likely to occur throughout songs in NS2 neurons (Figs. 9 and 10). The median duration of events was shorter in NS1 neurons (~16 ms) compared with NS2, BS, and DT neurons (~26–29 ms; Fig. 9), and the average reliability of events, measured as the fraction of trials in which at least one spike occurred during each event across songs, was higher in NS2 cells compared with all other cell groups (Fig. 9). Jitter, measured as the standard deviation of spike times within all events across songs, was lower in NS1 neurons relative to other cell groups (Fig. 9).
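Two of the measures above, the coefficient of variation of events per song and the reliability of an event, follow directly from their definitions. The sketch below assumes a population SD for the CV and a half-open time window for each event; the function names and example numbers are hypothetical.

```python
import statistics

def coefficient_of_variation(events_per_song):
    """SD of event counts across songs divided by their mean."""
    mean = statistics.mean(events_per_song)
    if mean == 0:
        return float("nan")  # undefined for a cell with no events
    return statistics.pstdev(events_per_song) / mean

def event_reliability(trials, event_start, event_end):
    """Fraction of trials with at least one spike in [event_start, event_end)."""
    hits = sum(any(event_start <= t < event_end for t in spikes) for spikes in trials)
    return hits / len(trials)

# Hypothetical event counts across six songs for a densely and a sparsely firing cell.
cv_dense = coefficient_of_variation([8, 9, 7, 8, 9, 7])
cv_sparse = coefficient_of_variation([0, 3, 0, 1, 0, 2])

# Hypothetical spike times (s) on four trials; the event spans 100-125 ms.
trials = [[0.101, 0.130], [0.300], [0.115], []]
reliability = event_reliability(trials, 0.100, 0.125)

print(f"CV dense = {cv_dense:.2f}, CV sparse = {cv_sparse:.2f}, reliability = {reliability:.2f}")
```

In this toy case the sparse cell has the larger CV simply because its event count swings between zero and a few events per song, which mirrors the interpretation offered for BS and DT neurons.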

Temporal firing patterns: bursting activity and spike timing.

The median ISI across songs that evoked an excitatory response was shorter in NS2 neurons compared with BS and DT cell groups (Fig. 9). Figure 12 shows frequency distributions of ISIs across all excited responses for intervals up to 100 ms in each cell group. Interestingly, these plots indicate that NS1 neurons had the shortest ISIs in this range (≤100 ms) despite the fact that the average value of median ISI across the entire range of ISIs was lowest in NS2 neurons. Inspection of individual cells indicated that three out of nine NS1 neurons that showed song-excited responses had a highly heterogeneous distribution of ISIs, with an extended tail of longer ISI values (not shown). Despite this tendency of NS1 neurons to have extremely long ISI values, their population activity showed a clear peak at ~1–2 ms (Fig. 12). ISI distributions for all other groups were uniformly skewed to the left (i.e., did not include heterogeneous ISI values). Although BS neurons had an extended range of relatively high ISI values (particularly at intervals >30 ms), the difference between BS and DT neurons showed only a trend toward significance (Fig. 9). In summary, ISIs in fast-spiking NS2 neurons were shorter than in BS and DT neurons, whereas ISI values in NS1 neurons were heterogeneous but showed a peak at very short ISIs.

Fig. 12.

Distribution of ISIs from 1 to 100 ms in each cell group. Frequency distributions of ISIs across all excited responses for intervals up to 100 ms in each group. NS, narrow spiking; BS, broad spiking; DT, double trough. See results.

Both NS1 and NS2 neurons had higher rates of bursting than did BS/DT cells, in accord with the shorter ISIs seen in NS1 and NS2 neurons. Figure 13 shows burst fractions (percentage of spikes with ISI <5 ms; see Methods) during baseline versus the song that evoked the greatest response strength for each cell within each group. The burst fraction during spontaneous baseline firing was substantially higher in both groups of NS neurons compared with BS and DT neurons (Fig. 9). Wilcoxon signed-rank tests showed a significant increase in bursting during song playback vs. baseline for all groups (P values ranged from 0.004 for NS1 to 5.3 × 10⁻⁵ for NS2 cells). However, the relative incidence of bursting during song playback increased more in NS neurons than in BS and DT neurons (Fig. 13).
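The burst fraction can be computed from a list of spike times as the percentage of interspike intervals shorter than 5 ms. The sketch below assumes that each spike is scored by the interval preceding it, so the first spike in a train is not counted; that convention, the function name, and the example spike times are assumptions.

```python
def burst_fraction(spike_times_s, max_isi_s=0.005):
    """Percentage of interspike intervals shorter than max_isi_s."""
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
    if not isis:
        return 0.0  # fewer than two spikes: no intervals to evaluate
    return 100.0 * sum(isi < max_isi_s for isi in isis) / len(isis)

# Hypothetical spike times (s) during baseline and during song playback.
baseline_spikes = [0.100, 0.400, 0.403, 0.900, 1.500]
song_spikes = [0.100, 0.102, 0.105, 0.300, 0.303, 0.500]

baseline_bf = burst_fraction(baseline_spikes)
song_bf = burst_fraction(song_spikes)
print(f"baseline: {baseline_bf:.0f}%, song: {song_bf:.0f}%")
```

Comparing the two values per cell, as in Fig. 13, shows whether playback shifts the spike train toward short intervals rather than simply adding spikes.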

Fig. 13.

Burst fractions during baseline vs. song periods for each cell group. Burst fractions [percentage of spikes with interspike interval (ISI) <5 ms; see methods] for the song that evoked the greatest response strength for each cell within each group; averages for each group are depicted by gray line. NS, narrow spiking; BS, broad spiking; DT, double trough.

We assessed the trial-to-trial precision of spike timing in relation to song playback by making pairwise comparisons of all spike trains for each cell and calculating the number of spikes that occurred at the same time points across trials (coincidence window = 1 ms). This value was normalized to a CI, in which a value of 1 indicates a complete lack of temporal structure while larger values indicate a greater degree of coincident spike timing (Joris et al. 2006; Louage et al. 2004). We compared the maximum CI value across all song responses for each cell between NS2, BS, and DT neurons (NS1 neurons not included due to low numbers). This analysis yielded a significant effect of groups, and individual comparisons showed that NS2 cells had lower CI values compared with BS and DT cells (Fig. 9). In addition, CI values for BS cells were lower than those for DT cells, indicating that DT neurons exhibited the most precise alignment to specific time points during song playback.
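The coincidence count and its normalization can be sketched as below. This follows the commonly cited normalization for the correlation index, M(M − 1)r²ωD, where M is the number of trials, r the mean firing rate, ω the coincidence window, and D the stimulus duration; treating a coincidence as two spikes within ω/2 of each other is an assumption, as are the function name and the example spike trains.

```python
def correlation_index(trials, duration_s, window_s=0.001):
    """Normalized count of coincident spikes across all ordered pairs of trials."""
    m = len(trials)
    n_spikes = sum(len(spikes) for spikes in trials)
    rate = n_spikes / (m * duration_s)  # mean firing rate in Hz
    expected = m * (m - 1) * rate ** 2 * window_s * duration_s
    if expected == 0:
        return float("nan")  # undefined with fewer than 2 trials or no spikes
    coincidences = 0
    for i, train_a in enumerate(trials):
        for j, train_b in enumerate(trials):
            if i == j:
                continue  # only compare spikes from different trials
            for ta in train_a:
                coincidences += sum(abs(ta - tb) <= window_s / 2 for tb in train_b)
    return coincidences / expected

# Hypothetical 1-s trials: perfectly time-locked vs. shifted spike trains.
locked = [[0.100, 0.500], [0.100, 0.500], [0.100, 0.500]]
shifted = [[0.100, 0.500], [0.120, 0.530], [0.140, 0.560]]

ci_locked = correlation_index(locked, duration_s=1.0)
ci_shifted = correlation_index(shifted, duration_s=1.0)
print(f"CI locked = {ci_locked:.1f}, CI shifted = {ci_shifted:.1f}")
```

Sparse, precisely timed trains give large values because the expected number of chance coincidences scales with the square of the firing rate, which is worth keeping in mind when comparing cell groups with very different rates.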

Higher song selectivity in BS and DT neurons that lacked tone-selective responses.

Neurons that did not respond to pure tones may require a more complex stimulus to evoke a response. If so, that might indicate that NCM neurons that did not evince a clear response to any pure tone would also display greater selectivity for song-evoked responses. We explored this idea by comparing the song selectivity of BS and DT neurons as a function of whether or not they displayed a clear response to playback of any tone. Among cells that were tested with playback of both tones and songs, 64% of BS neurons (28/44) and 43% of DT neurons (20/46) displayed clear tuning curves. Table 2, columns 6–9, shows that BS and DT neurons that lacked responses to pure tones were twice as likely not to respond to playback of any song compared with those with a clear frequency response area and were also less likely to show only excited responses to songs. In addition, BS neurons with no tonal response were more likely to show only suppressed responses compared with those with a clear tuning curve (although three BS neurons with a tuning curve showed both suppression and excitation).

Response strength for songs that evoked either significant excitation or suppression did not differ for BS and DT neurons regardless of whether cells displayed a response to pure tones or not (Table 3). However, BS and DT neurons that lacked clear tuning curves showed excitatory responses to fewer songs, indicating higher song selectivity (Table 3; this trend was not significant in DT neurons, P = 0.10). In contrast, the incidence of song-suppressed responses did not differ between cells with and without a clear response to pure tones. Neurons that lacked a response to tones also displayed increased song selectivity when judged by the presence of an excitatory event and had fewer events per song compared with cells with tuning curves. In the aggregate, these results show that cells that lack tuning to pure tones show higher song selectivity in terms of excitation, suggesting more complex response properties. BS neurons were the only cell group that tended to show suppressed song-evoked responses only; Table 2, columns 6–9, shows that BS neurons without tonal responses were evenly split between cells that showed only excitation, only suppression, or no song response, whereas DT neurons showed little tendency to show suppression regardless of tonal responsivity. However, as indicated above, the low spontaneous firing rates of BS and DT neurons raise the possibility that suppressive responses were underestimated.

Table 3.

Average song selectivity in BS and DT neurons that either did or did not exhibit a clear response to pure tones

| | Song-Evoked Response Strength^a (Excitation) | Song-Evoked Response Strength^a (Suppression) | Song Selectivity (by Paired t-Test)^b (Excitation) | Song Selectivity (by Paired t-Test)^b (Suppression) | Song Selectivity^c (by Presence of Excitation Event) | Number of Events per Song^d (Excitation) |
|---|---|---|---|---|---|---|
| BS neurons | | | | | | |
| Cells with no tone response | 7.42 ± 2.24 | −3.31 ± 1.00 | 1.06 ± 0.52 | 0.88 ± 0.39 | 2.50 ± 0.64 | 2.10 ± 0.88 |
| Cells with a clear tuning curve | 4.93 ± 0.61 | −2.27 ± 0.28 | 2.21 ± 0.44 | 0.93 ± 0.34 | 4.61 ± 0.36 | 3.71 ± 0.55 |
| Number with no tone response | 5 | 5 | 16 | 16 | 16 | 16 |
| Number with tuning curve | 19 | 7 | 28 | 28 | 28 | 28 |
| Significant P values^e | | | P = 0.04 | | P = 0.004 | P = 0.01 |
| DT neurons | | | | | | |
| Cells with no tone response | 4.03 ± 0.85 | −1.12 | 1.23 ± 0.35 | 0.27 ± 0.19 | 2.92 ± 0.46 | 1.46 ± 0.41 |
| Cells with a clear tuning curve | 5.81 ± 0.97 | −4.86 | 2.05 ± 0.47 | 0.10 ± 0.10 | 4.30 ± 0.38 | 2.95 ± 0.58 |
| Number with no tone response | 11 | 2 | 26 | 26 | 26 | 26 |
| Number with tuning curve | 14 | 1 | 20 | 20 | 20 | 20 |
| Significant P values^e | | | | | P = 0.04 | P = 0.01 |

All values are means ± SE; significant P values represent Mann-Whitney tests comparing cells with vs. without tonal responses. BS, broad spiking; DT, double trough.

^a Response strength for songs that evoked significant excitatory or suppressed response.

^b Number of songs that evoked a significant excitatory or suppressed response by paired t-test (song-evoked response vs. baseline).

^c Number of songs that evoked at least one excitatory event.

^d Number of excitatory events per song.

^e All other P values are nonsignificant.

DISCUSSION

Single neurons in zebra finch NCM could be subdivided into different groups based on operational criteria; two main categories consisted of broad-spiking (BS and DT) and narrow-spiking (NS1 and NS2) cells. In the absence of additional criteria (e.g., immunohistochemical profiles and juxtacellular labeling), it is impossible to say whether each of the four groups we classified represents a different cell type. It is unclear why previous studies did not identify the same classes of NCM neurons as those reported here. This is the first paper to use cell-attached recordings, which, among other factors, could influence sampling; in addition, analyses of spike shapes have differed between papers. Given the functional diversity both between and within different types of interneurons (Hasenstaub and Callaway 2010; Markram et al. 2004; Mesik et al. 2015), further study is necessary to elucidate the various subtypes of interneurons in NCM. Across the population of all NCM neurons, we observed numerous response patterns during tone presentations, including complex patterns of frequency and amplitude tuning. In general, BS and DT neurons displayed sparser firing rates and more complex patterns of responsivity compared with NS neurons. BS and DT neurons also showed substantially greater song selectivity than did NS neurons, and song playback evoked suppression of firing rates in one-fourth of the responses of BS cells (20% of BS cells showed only suppressed responses). BS and DT neurons that displayed well-defined tuning curves tended to show excited responses to songs, and conversely BS and DT neurons that lacked tuning curves were more apt to show no response to any of the six songs presented.

Tone Responses

A large proportion of BS and DT neurons did not respond to tones, and the subset of neurons in each of these groups that did respond to tones was more narrowly tuned to both frequency and amplitude compared with NS2 neurons. In addition, approximately half of BS and DT neurons were tuned to multiple frequencies compared with only 17% for NS2 neurons. This pattern suggests that both BS and DT cells require more complex stimuli to elicit a response and is consistent with the idea of hierarchical refinements in auditory representations at higher levels of processing (Sen et al. 2001; Woolley et al. 2005).

Terleph et al. (2007) reported that tonal frequency tuning curves based on multiunit activity were narrower in the thalamo-recipient layer of auditory cortex (L2) than in NCM in both zebra finches and canaries. On the face of it, this finding appears to conflict with the increasing complexity of spectrotemporal receptive fields, which show that linear response models provide poorer predictions of song-evoked activity at the level of auditory cortex and beyond (Calabrese and Woolley 2015; Schneider and Woolley 2013; Woolley et al. 2005). Perhaps the broader multiunit responses reported in NCM reflect subsets of neurons tuned to multiple frequencies. The average bandwidths we measured across the population of NCM neurons at 20 dB above threshold (0.30–0.65 kHz) were substantially narrower than those reported by Caras et al. (2012) at 10 dB above threshold in Field L neurons of white-crowned sparrows (1.2–1.6 kHz), suggesting that tuning of single neurons to individual frequencies is considerably narrower in NCM. Of course, the narrow bandwidths we measured at the CF of individual neurons do not take into account the tendency of approximately half of all BS and DT neurons to respond to multiple frequencies.

NS2 cells had higher spontaneous and tone-evoked firing rates compared with BS and DT cells, as well as lower thresholds, faster latencies, less variable response onsets, and broader frequency and amplitude tuning. This response profile is consistent with that of inhibitory interneurons in mammalian auditory cortex (Wu et al. 2008) and suggests that NS2 neurons may constitute a source of intrinsic inhibition that contributes to the narrower frequency tuning in BS and DT neurons.

Temporal response patterns varied across cell groups. NS cells tended to show phasic onset responses. Combined with the fast latency of NS cells, such phasic responses may help to encode specific temporal patterns in zebra finch song, which includes rapidly changing frequency elements. All cell groups comprised a large proportion of neurons that showed primary-like responses, but only BS and DT neurons showed sustained responses, and DT neurons showed delayed responses that continued for at least 20 ms following tone offset. Sustained/delayed responses of BS/DT cells could facilitate integration across song syllables, which may contribute to the nonlinear temporal and spectral summation seen in HVC neurons and thereby facilitate selectivity for syllable combinations (Margoliash and Fortune 1992). NCM provides indirect input to HVC via CM (Fig. 1), which is reciprocally connected with HVC and NIf; in addition, inactivation of CM causes substantial reduction of song-evoked responses in HVC, and lesion of a specific cell population within CM that projects to HVC (field Avalanche) impairs vocal learning (Akutagawa and Konishi 2010; Bauer et al. 2008; Lynch et al. 2013; Roberts et al. 2017).

In addition to displaying narrow frequency tuning curves, a large proportion of BS and DT neurons (~25%) were also narrowly tuned to amplitude, displaying O-shaped tuning curves that spanned an amplitude range of ~20 dB. Units that are narrowly tuned to both frequency and amplitude are common in primary auditory cortex of marmosets and bats (Sadagopan and Wang 2008; Suga 1977). A population of O-shaped units could serve to encode a level-invariant representation of frequency, which could be useful for feature detection in amplitude-modulated vocal sounds (Sun et al. 2017). In addition, the auditory cortex of echo-locating bats contains a disproportionate representation of level-tuned units corresponding to frequencies and amplitudes of echoes from emitted sound pulses (Suga 1990), suggesting that changes in amplitude spectra over time may themselves represent important information in vocal sounds. Thus encoding of absolute sound level is likely to contribute to processing complex vocal sounds.

Neurons within BS and DT cell groups with multiple frequency tuning showed excitatory peaks at integer multiples of their CF, as well as at noninteger multiples such as 1.5 (3:2 ratio), as has been observed in auditory cortex of marmosets and bats (Fitzpatrick et al. 1993; Kadia and Wang 2003; Kanwal et al. 1999; Suga et al. 1983; Wang 2013). Such harmonically related excitatory peaks in the NCM cells we measured fell within the vocal range of zebra finches and may reflect the rich harmonic structure in many zebra finch syllables. Presentation of single tones in marmosets and zebra finches can evoke weak responses in the auditory cortex, whereas harmonically related tone combinations frequently show substantial nonlinear facilitation or suppression (Feng and Wang 2017; Lim and Kim 1997; Sadagopan and Wang 2008), consistent with the idea that such tuning can contribute to processing vocal sounds with complex harmonic patterns.
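To make the notion of harmonically related peaks concrete, the following Python sketch checks whether a secondary excitatory peak lies near a simple multiple of a neuron's CF (integer multiples, or half-integer multiples such as 1.5). It is an illustration only and not the analysis used in this study: the function name `harmonic_ratio`, the restriction to denominators of 1 or 2, and the 5% tolerance are all assumptions chosen for the example.

```python
from fractions import Fraction


def harmonic_ratio(peak_khz, cf_khz, max_denominator=2, tolerance=0.05):
    """Return the simple ratio nearest to peak/CF, or None if none is close.

    max_denominator=2 allows integer ratios (2:1, 3:1) and half-integer
    ratios (3:2, 5:2). tolerance is the allowed relative deviation; both
    values are hypothetical and chosen only for illustration.
    """
    ratio = peak_khz / cf_khz
    # Closest fraction to the measured ratio with a small denominator.
    candidate = Fraction(ratio).limit_denominator(max_denominator)
    if candidate == 0:
        return None
    # Accept the candidate only if the measured ratio is close enough to it.
    if abs(ratio - candidate) / candidate <= tolerance:
        return candidate
    return None


if __name__ == "__main__":
    cf = 2.0  # hypothetical characteristic frequency in kHz
    for peak in (3.0, 4.1, 2.6):
        print(peak, harmonic_ratio(peak, cf))
```

With a hypothetical CF of 2.0 kHz, a peak at 3.0 kHz is classified as a 3:2 ratio and a peak at 4.1 kHz as approximately 2:1, whereas a peak at 2.6 kHz matches no simple ratio within the assumed tolerance.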

Song Responses

BS and DT neurons responded to substantially fewer songs than did NS neurons, consistent with the idea of greater responsivity to complex stimuli and higher selectivity in BS/DT cells. Only half of BS and DT cells showed only excited responses, and many BS/DT neurons did not respond to any songs (25–39%). In addition, 20% of BS neurons displayed only suppressed responses to song playback. In contrast, 100% of NS2 neurons showed only excited responses to songs and no suppressed responses, and 82% (18/22) responded to all six songs. NS2 neurons also had much higher spontaneous and evoked firing rates, suggesting that they may represent fast-spiking inhibitory interneurons.

We measured excitatory events within songs (contiguous PSTH bins >2 SD above baseline) as a more sensitive indicator of song responsivity, which revealed an increase in the number of songs that elicited a response in all cell groups (especially BS and DT neurons) but did not alter the pattern of higher selectivity in BS/DT neurons compared with NS neurons (Figs. 9 and 10). In addition, excitatory song responses included many fewer events per song in BS and DT cells compared with NS cells, indicating greater selectivity within individual songs to specific spectrotemporal features. Consistent with this pattern, song responses in BS and DT cells contained a relatively low proportion of PSTH bins that exceeded the baseline firing rate, supporting the idea that spikes were restricted to fewer features within songs.
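The event criterion described above (contiguous PSTH bins more than 2 SD above baseline) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function name `find_excitatory_events`, the `min_bins` parameter (the minimum run length, set to 2 here as an assumption), and the use of the sample SD of the baseline bins are hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def find_excitatory_events(psth, baseline, n_sd=2.0, min_bins=2):
    """Return (start, stop) bin indices of runs of bins above threshold.

    psth     : firing rate per bin during song playback
    baseline : firing rate per bin during the pre-stimulus period
    n_sd     : number of baseline SDs above the baseline mean (2 in the text)
    min_bins : minimum number of contiguous bins per event (assumed value)
    The stop index is exclusive, as in Python slicing.
    """
    threshold = mean(baseline) + n_sd * stdev(baseline)
    events = []
    start = None
    for i, rate in enumerate(psth):
        if rate > threshold:
            if start is None:
                start = i  # a new run of suprathreshold bins begins
        elif start is not None:
            if i - start >= min_bins:
                events.append((start, i))
            start = None
    # Close a run that extends to the last bin.
    if start is not None and len(psth) - start >= min_bins:
        events.append((start, len(psth)))
    return events


if __name__ == "__main__":
    baseline = [1, 2, 1, 2, 1, 2]
    psth = [1, 5, 6, 1, 7, 1, 4, 4, 4]
    print(find_excitatory_events(psth, baseline))
```

Counting the songs for which this function returns at least one event, and the number of events returned for each song, corresponds in spirit to the event-based selectivity measures reported in Table 3.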

In addition to firing more sparsely and selectively overall, BS and DT neurons also displayed broader distributions of ISIs and hence showed a lower incidence of bursting (as defined by spikes separated by <5-ms intervals). Alignment of spikes to precise time points during song playback was highest in DT neurons and relatively low in NS2 neurons. It is interesting that DT cells showed greater precision of spike timing relative to BS cells despite the fact that the only other dependent measure based on spiking of individual cells was the spontaneous firing rate, which was lower in DT neurons. It is not clear whether BS and DT neurons represent two different subcategories of excitatory principal cells, given that we observed a low proportion of individual spike waveforms in BS neurons that also exhibited double troughs (i.e., discontinuities during the repolarization phase of an action potential). However, fewer DT neurons responded to tones compared with BS neurons (Table 1), and substantially more BS neurons showed suppressed responses (27%, 12/44) than did DT neurons (7%, 3/46; Table 2, columns 2–5).
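Bursting as defined above (spikes separated by <5-ms intervals) can be summarized as the fraction of ISIs shorter than 5 ms. The sketch below is illustrative only; `burst_fraction` and its parameter names are hypothetical, and the exact burst measure used in the study may differ.

```python
def burst_fraction(spike_times_ms, max_isi_ms=5.0):
    """Return the fraction of interspike intervals shorter than max_isi_ms.

    spike_times_ms : spike times in milliseconds (any order)
    max_isi_ms     : ISI cutoff defining a burst (5 ms in the text)
    Returns 0.0 when fewer than two spikes are available.
    """
    times = sorted(spike_times_ms)
    isis = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(isi < max_isi_ms for isi in isis) / len(isis)


if __name__ == "__main__":
    # Hypothetical spike train: three short ISIs out of five.
    print(burst_fraction([0, 2, 4, 50, 100, 103]))
```

A broader ISI distribution shifts intervals above the cutoff and lowers this fraction, which is the sense in which sparsely firing cells show a lower incidence of bursting.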

Because many neurons in both BS and DT cell groups did not respond to pure tones (i.e., lacked tone-selective responses), it seemed likely that these subsets of neurons might also display greater selectivity to song playback compared with BS/DT neurons that did have a clear tonal receptive field. This idea was confirmed by the fact that a higher proportion of both BS and DT neurons that lacked tonal tuning curves did not respond to any song, whereas BS/DT neurons with a tuning curve were much more likely to show an excitatory response to song playback (Table 2, columns 6–9). In addition, among neurons that did display excitatory song responses, cells that lacked a response to tones responded to fewer songs as judged by both evoked firing rate and the presence of an event and had fewer events per song (Table 3). This pattern is consistent with the idea that BS and DT neurons that did not respond to tones are more selectively tuned to the combinations of spectral features contained in conspecific songs. Alternatively, cells that lacked a tuning curve may not receive auditory inputs and hence may simply not be auditory neurons.

General Consideration of Information Processing in Avian Auditory Cortex

Calabrese and Woolley (2015) showed that spiking patterns of Field L neurons in awake zebra finches were sparse and selective in deep (L3) and superficial (L1 and lateral CM) subregions compared with the intermediate thalamo-recipient subregion (L2; Fig. 1), indicating that patterns of information processing in avian primary auditory cortex are very similar to the canonical microcircuit of mammalian six-layered cortex. Unfortunately, few studies of avian auditory cortex have distinguished between subregions of Field L, and many studies have been performed in urethane-anesthetized birds, which alters both spontaneous and evoked firing rates (Capsius and Leppelsack 1996). Single- and multineuron recordings throughout subdivisions of Field L in anesthetized birds show strong but nonselective responses to song stimuli; Field L neurons tend not to distinguish between self-generated song and conspecific songs or between forward and reversed versions of self-generated song (Amin et al. 2004; Janata and Margoliash 1999; Lewicki and Arthur 1996; Margoliash 1986). However, Lim and Kim (1997) reported that single neurons in L2 of adult male zebra finches showed relatively simple and stereotyped responses compared with those in L1 and L3: responses of L1 and L3 neurons showed substantial interactions (either enhancement or suppression when different subsets of a harmonic complex were presented singly vs. together), whereas response patterns of L2 neurons showed no significant response interactions (cf. Sadagopan and Wang 2008). Studies in awake birds that record single neurons and sort individual units by waveform shape and other measures, including morphology and neurochemical expression, are needed to begin to understand sensory coding across different cell types and how it changes within auditory cortex.

Field L is interconnected with higher level regions of auditory cortex along parallel, reciprocal, and hierarchical pathways (Fig. 1). Comparison of responses in Field L vs. higher level regions of auditory cortex based on past studies indicates that responsivity to simple stimuli such as pure tones is relatively low in NCM and CM compared with Field L, whereas selective responses to more complex auditory stimulation, including species-specific sounds, increase beyond Field L (Bauer et al. 2008; Gentner and Margoliash 2003; Leppelsack and Vogt 1976; Meliza and Margoliash 2012; Müller and Leppelsack 1985; Schneider and Woolley 2013). Both NCM and CM show evidence of selectivity for different song types (Bauer et al. 2008; Miller-Sims and Bottjer 2014; Yanagihara and Yazaki-Sugiyama 2016). This pattern agrees with the direct comparisons between tones and songs made in the current study. Schneider and Woolley (2013) found that single BS NCM neurons in awake male zebra finches responded on average to only approximately half of 15 conspecific songs presented, whereas NS NCM neurons as well as those in Field L and auditory midbrain displayed relatively little song selectivity; this direct comparison between different levels of auditory processing indicates that selectivity among complex natural vocal stimuli emerges at the level of NCM in BS neurons. Furthermore, BS NCM neurons maintained their selective responsivity in background noise levels that permitted behavioral discrimination but stopped firing once background noise reached levels that precluded behavioral recognition, suggesting a direct contribution to perceptual behavior.

Miller-Sims and Bottjer (2014) reported that multiunit activity in NCM of awake adult zebra finches responded more strongly to each bird’s tutor song than to conspecific or reversed songs. In addition, adult birds that had learned a better imitation of the tutor song showed a greater degree of neural selectivity for that song, consistent with the idea that NCM neurons contribute to a functional circuit that may represent tutor song (or possibly own song). Interestingly, tutor song was an equally effective stimulus regardless of whether syllables were played in the normal forward order or in a reversed order, suggesting that neural activity in NCM may encode individual song syllables but not their temporal sequence.

NCM and CM are also involved in discriminating complex acoustic stimuli as a function of experience; much of this work has been carried out in starlings, a species with a complex and varied repertoire of vocal behavior (Gentner et al. 2004; Gentner and Margoliash 2003; Jeanne et al. 2011, 2013; Knudsen and Gentner 2013; Meliza et al. 2010; Meliza and Margoliash 2012; Thompson and Gentner 2010). Knudsen and Gentner (2013) recorded CM neurons in starlings that had learned to recognize complex patterns of conspecific song while they were engaged in a discrimination task versus passively listening. They found that fewer song stimuli evoked excitation in CM neurons during task engagement, indicating an increased selectivity, as well as a decrease in firing rate variability, allowing for enhanced discrimination. These results show that both learning and current behavioral state modulate the activity of auditory cortical neurons so as to alter the representation of behaviorally relevant auditory stimuli (Knudsen and Gentner 2010). Results presented herein concur with the idea that BS neurons in NCM are higher level neurons that represent acoustic features important for diverse functions of vocal communication.

GRANTS

This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC-012396.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

S.W.B. conceived and designed research; A.A.R. performed experiments; S.W.B., A.A.R., and T.K. analyzed data; S.W.B. and A.A.R. interpreted results of experiments; S.W.B. prepared figures; S.W.B. drafted manuscript; S.W.B., A.A.R., and T.K. edited and revised manuscript; S.W.B., A.A.R., and T.K. approved final version of manuscript.

ACKNOWLEDGMENTS

We are grateful to Melissa L. Caras for comments on this project.

REFERENCES

  1. Akutagawa E, Konishi M. New brain pathways found in the vocal control system of a songbird. J Comp Neurol 518: 3086–3100, 2010. doi: 10.1002/cne.22383. [DOI] [PubMed] [Google Scholar]
  2. Amin N, Grace JA, Theunissen FE. Neural response to bird’s own song and tutor song in the zebra finch field L and caudal mesopallium. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 190: 469–489, 2004. doi: 10.1007/s00359-004-0511-x. [DOI] [PubMed] [Google Scholar]
  3. Bauer EE, Coleman MJ, Roberts TF, Roy A, Prather JF, Mooney R. A synaptic basis for auditory-vocal integration in the songbird. J Neurosci 28: 1509–1522, 2008. doi: 10.1523/JNEUROSCI.3838-07.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Billimoria CP, DiCaprio RA, Birmingham JT, Abbott LF, Marder E. Neuromodulation of spike-timing precision in sensory neurons. J Neurosci 26: 5910–5919, 2006. doi: 10.1523/JNEUROSCI.4659-05.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Böhner J. Song learning in the zebra finch (taeniopygia guttata): Selectivity in the choice of a tutor and accuracy of song copies. Anim Behav 31: 231–237, 1983. doi: 10.1016/S0003-3472(83)80193-6. [DOI] [Google Scholar]
  6. Böhner J. Early acquisition of song in the zebra finch, Taeniopygia guttata. Anim Behav 39: 369–374, 1990. doi: 10.1016/S0003-3472(05)80883-8. [DOI] [Google Scholar]
  7. Bolhuis JJ, Gahr M. Neural mechanisms of birdsong memory. Nat Rev Neurosci 7: 347–357, 2006. doi: 10.1038/nrn1904. [DOI] [PubMed] [Google Scholar]
  8. Bolhuis JJ, Moorman S. Birdsong memory and the brain: in search of the template. Neurosci Biobehav Rev 50: 41–55, 2015. doi: 10.1016/j.neubiorev.2014.11.019. [DOI] [PubMed] [Google Scholar]
  9. Bottjer SW, Arnold AP. Developmental plasticity in neural circuits for a learned behavior. Annu Rev Neurosci 20: 459–481, 1997. doi: 10.1146/annurev.neuro.20.1.459. [DOI] [PubMed] [Google Scholar]
  10. Calabrese A, Woolley SM. Coding principles of the canonical cortical microcircuit in the avian brain. Proc Natl Acad Sci USA 112: 3517–3522, 2015. doi: 10.1073/pnas.1408545112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Capsius B, Leppelsack HJ. Influence of urethane anesthesia on neural processing in the auditory cortex analogue of a songbird. Hear Res 96: 59–70, 1996. doi: 10.1016/0378-5955(96)00038-X. [DOI] [PubMed] [Google Scholar]
  12. Caras ML, O’Brien M, Brenowitz EA, Rubel EW. Estradiol selectively enhances auditory function in avian forebrain neurons. J Neurosci 32: 17597–17611, 2012. doi: 10.1523/JNEUROSCI.3938-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Catchpole C, Slater PJ. Bird Song: Biological Themes and Variations. Cambridge, UK: Cambridge University Press, 1995. [Google Scholar]
  14. Chew SJ, Mello C, Nottebohm F, Jarvis E, Vicario DS. Decrements in auditory responses to a repeated conspecific song are long-lasting and require two periods of protein synthesis in the songbird forebrain. Proc Natl Acad Sci USA 92: 3406–3410, 1995. doi: 10.1073/pnas.92.8.3406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chew SJ, Vicario DS, Nottebohm F. A large-capacity memory system that recognizes the calls and songs of individual birds. Proc Natl Acad Sci USA 93: 1950–1955, 1996. doi: 10.1073/pnas.93.5.1950. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Clayton NS. Song tutor choice in zebra finches. Anim Behav 35: 714–721, 1987. doi: 10.1016/S0003-3472(87)80107-0. [DOI] [Google Scholar]
  17. Doupe AJ, Kuhl PK. Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci 22: 567–631, 1999. doi: 10.1146/annurev.neuro.22.1.567. [DOI] [PubMed] [Google Scholar]
  18. Eales LA. Song learning in zebra finches: some effects of song model availability on what is learnt and when. Anim Behav 33: 1293–1300, 1985. doi: 10.1016/S0003-3472(85)80189-5. [DOI] [Google Scholar]
  19. Feng L, Wang X. Harmonic template neurons in primate auditory cortex underlying complex sound processing. Proc Natl Acad Sci USA 114: E840–E848, 2017. doi: 10.1073/pnas.1607519114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Fitzpatrick DC, Kanwal JS, Butman JA, Suga N. Combination-sensitive neurons in the primary auditory cortex of the mustached bat. J Neurosci 13: 931–940, 1993. doi: 10.1523/JNEUROSCI.13-03-00931.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Gentner TQ, Hulse SH, Ball GF. Functional differences in forebrain auditory regions during learned vocal recognition in songbirds. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 190: 1001–1010, 2004. doi: 10.1007/s00359-004-0556-x. [DOI] [PubMed] [Google Scholar]
  22. Gentner TQ, Margoliash D. Neuronal populations and single cells representing learned auditory objects. Nature 424: 669–674, 2003. doi: 10.1038/nature01731. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Gobes SM, Bolhuis JJ. Birdsong memory: a neural dissociation between song recognition and production. Curr Biol 17: 789–793, 2007. doi: 10.1016/j.cub.2007.03.059. [DOI] [PubMed] [Google Scholar]
  24. Hahnloser RH, Kotowicz A. Auditory representations and memory in birdsong learning. Curr Opin Neurobiol 20: 332–339, 2010. doi: 10.1016/j.conb.2010.02.011. [DOI] [PubMed] [Google Scholar]
  25. Hasenstaub AR, Callaway EM. Paint it black (or red, or green): optical and genetic tools illuminate inhibitory contributions to cortical circuit function. Neuron 67: 681–684, 2010. doi: 10.1016/j.neuron.2010.08.039. [DOI] [PubMed] [Google Scholar]
  26. Immelmann K. Song development in the zebra finch and other estrildid finches. In: Bird Vocalisations, edited by Hinde RA. Cambridge, UK: Cambridge University Press, 1969, p. 61–74. [Google Scholar]
  27. Janata P, Margoliash D. Gradual emergence of song selectivity in sensorimotor structures of the male zebra finch song system. J Neurosci 19: 5108–5118, 1999. doi: 10.1523/JNEUROSCI.19-12-05108.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Jeanne JM, Sharpee TO, Gentner TQ. Associative learning enhances population coding by inverting interneuronal correlation patterns. Neuron 78: 352–363, 2013. doi: 10.1016/j.neuron.2013.02.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Jeanne JM, Thompson JV, Sharpee TO, Gentner TQ. Emergence of learned categorical representations within an auditory forebrain circuit. J Neurosci 31: 2595–2606, 2011. doi: 10.1523/JNEUROSCI.3930-10.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Joris PX, Louage DH, Cardoen L, van der Heijden M. Correlation index: a new metric to quantify temporal coding. Hear Res 216-217: 19–30, 2006. doi: 10.1016/j.heares.2006.03.010. [DOI] [PubMed] [Google Scholar]
  31. Kadia SC, Wang X. Spectral integration in A1 of awake primates: neurons with single- and multipeaked tuning characteristics. J Neurophysiol 89: 1603–1622, 2003. doi: 10.1152/jn.00271.2001. [DOI] [PubMed] [Google Scholar]
  32. Kanwal JS, Fitzpatrick DC, Suga N. Facilitatory and inhibitory frequency tuning of combination-sensitive neurons in the primary auditory cortex of mustached bats. J Neurophysiol 82: 2327–2345, 1999. doi: 10.1152/jn.1999.82.5.2327. [DOI] [PubMed] [Google Scholar]
  33. Karten HJ. The ascending auditory pathway in the pigeon (Columba livia). II. Telencephalic projections of the nucleus ovoidalis thalami. Brain Res 11: 134–153, 1968. doi: 10.1016/0006-8993(68)90078-4. [DOI] [PubMed] [Google Scholar]
  34. Kelley DB, Nottebohm F. Projections of a telencephalic auditory nucleus-field L-in the canary. J Comp Neurol 183: 455–469, 1979. doi: 10.1002/cne.901830302. [DOI] [PubMed] [Google Scholar]
  35. Knudsen DP, Gentner TQ. Mechanisms of song perception in oscine birds. Brain Lang 115: 59–68, 2010. doi: 10.1016/j.bandl.2009.09.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Knudsen DP, Gentner TQ. Active recognition enhances the representation of behaviorally relevant information in single auditory forebrain neurons. J Neurophysiol 109: 1690–1703, 2013. doi: 10.1152/jn.00461.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Leppelsack HJ, Vogt M. Responses of auditory neurons in forebrain of a songbird to stimulation with species-specific sounds. J Comp Physiol 107: 263–274, 1976. doi: 10.1007/BF00656737. [DOI] [Google Scholar]
  38. Lewicki MS, Arthur BJ. Hierarchical organization of auditory temporal context sensitivity. J Neurosci 16: 6987–6998, 1996. doi: 10.1523/JNEUROSCI.16-21-06987.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Lim D, Kim C. Emerging auditory response interactions to harmonic complexes in field L of the zebra finch. Auris Nasus Larynx 24: 227–232, 1997. doi: 10.1016/S0385-8146(97)00014-X. [DOI] [PubMed] [Google Scholar]
  40. London SE, Clayton DF. Functional identification of sensory mechanisms required for developmental song learning. Nat Neurosci 11: 579–586, 2008. doi: 10.1038/nn.2103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Louage DH, van der Heijden M, Joris PX. Temporal properties of responses to broadband noise in the auditory nerve. J Neurophysiol 91: 2051–2065, 2004. doi: 10.1152/jn.00816.2003. [DOI] [PubMed] [Google Scholar]
  42. Lynch KS, Kleitz-Nelson HK, Ball GF. HVC lesions modify immediate early gene expression in auditory forebrain regions of female songbirds. Dev Neurobiol 73: 315–323, 2013. doi: 10.1002/dneu.22062. [DOI] [PubMed] [Google Scholar]
  43. Mann NI, Slater PJB. Song tutor choice by zebra finches in aviaries. Anim Behav 49: 811–820, 1995. doi: 10.1016/0003-3472(95)80212-6. [DOI] [PubMed] [Google Scholar]
  44. Margoliash D. Preference for autogenous song by auditory neurons in a song system nucleus of the white-crowned sparrow. J Neurosci 6: 1643–1661, 1986. doi: 10.1523/JNEUROSCI.06-06-01643.1986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Margoliash D, Fortune ES. Temporal and harmonic combination-sensitive neurons in the zebra finch’s HVc. J Neurosci 12: 4309–4326, 1992. doi: 10.1523/JNEUROSCI.12-11-04309.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Markram H, Toledo-Rodriguez M, Wang Y, Gupta A, Silberberg G, Wu C. Interneurons of the neocortical inhibitory system. Nat Rev Neurosci 5: 793–807, 2004. doi: 10.1038/nrn1519. [DOI] [PubMed] [Google Scholar]
  47. Marler P. Birdsong and speech development: could there be parallels? Am Sci 58: 669–673, 1970. [PubMed] [Google Scholar]
  48. Marler P, Slabbekoorn H. Nature’s Music: The Science of Birdsong. Amsterdam, The Netherlands: Elsevier, 2004. [Google Scholar]
  49. Meliza CD, Chi Z, Margoliash D. Representations of conspecific song by starling secondary forebrain auditory neurons: toward a hierarchical framework. J Neurophysiol 103: 1195–1208, 2010. doi: 10.1152/jn.00464.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Meliza CD, Margoliash D. Emergence of selectivity and tolerance in the avian auditory cortex. J Neurosci 32: 15158–15168, 2012. doi: 10.1523/JNEUROSCI.0845-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Mello C, Nottebohm F, Clayton D. Repeated exposure to one song leads to a rapid and persistent decline in an immediate early gene’s response to that song in zebra finch telencephalon. J Neurosci 15: 6919–6925, 1995. doi: 10.1523/JNEUROSCI.15-10-06919.1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Mello CV, Clayton DF. Song-induced ZENK gene expression in auditory pathways of songbird brain and its relation to the song control system. J Neurosci 14: 6652–6666, 1994. doi: 10.1523/JNEUROSCI.14-11-06652.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Mello CV, Vates GE, Okuhata S, Nottebohm F. Descending auditory pathways in the adult male zebra finch (Taeniopygia guttata). J Comp Neurol 395: 137–160, 1998. doi:. [DOI] [PubMed] [Google Scholar]
  54. Mello CV, Vicario DS, Clayton DF. Song presentation induces gene expression in the songbird forebrain. Proc Natl Acad Sci USA 89: 6818–6822, 1992. doi: 10.1073/pnas.89.15.6818.
  55. Mesik L, Ma WP, Li LY, Ibrahim LA, Huang ZJ, Zhang LI, Tao HW. Functional response properties of VIP-expressing inhibitory neurons in mouse visual and auditory cortex. Front Neural Circuits 9: 22, 2015. doi: 10.3389/fncir.2015.00022.
  56. Miller-Sims VC, Bottjer SW. Development of neural responsivity to vocal sounds in higher level auditory cortex of songbirds. J Neurophysiol 112: 81–94, 2014. doi: 10.1152/jn.00484.2013.
  57. Müller CM, Leppelsack HJ. Feature extraction and tonotopic organization in the avian auditory forebrain. Exp Brain Res 59: 587–599, 1985. doi: 10.1007/BF00261351.
  58. Niell CM, Stryker MP. Highly selective receptive fields in mouse visual cortex. J Neurosci 28: 7520–7536, 2008. doi: 10.1523/JNEUROSCI.0623-08.2008.
  59. Phan ML, Pytte CL, Vicario DS. Early auditory experience generates long-lasting memories that may subserve vocal learning in songbirds. Proc Natl Acad Sci USA 103: 1088–1093, 2006. doi: 10.1073/pnas.0510136103.
  60. Roberts TF, Hisey E, Tanaka M, Kearney MG, Chattree G, Yang CF, Shah NM, Mooney R. Identification of a motor-to-auditory pathway important for vocal learning. Nat Neurosci 20: 978–986, 2017. doi: 10.1038/nn.4563.
  61. Roper A, Zann R. The onset of song learning and song tutor selection in fledgling zebra finches. Ethology 112: 458–470, 2006. doi: 10.1111/j.1439-0310.2005.01169.x.
  62. Sadagopan S, Wang X. Level invariant representation of sounds by populations of neurons in primary auditory cortex. J Neurosci 28: 3415–3426, 2008. doi: 10.1523/JNEUROSCI.2743-07.2008.
  63. Sadagopan S, Wang X. Nonlinear spectrotemporal interactions underlying selectivity for complex sounds in auditory cortex. J Neurosci 29: 11192–11202, 2009. doi: 10.1523/JNEUROSCI.1286-09.2009.
  64. Schneider DM, Woolley SM. Sparse and background-invariant coding of vocalizations in auditory scenes. Neuron 79: 141–152, 2013. doi: 10.1016/j.neuron.2013.04.038.
  65. Sen K, Theunissen FE, Doupe AJ. Feature analysis of natural sounds in the songbird auditory forebrain. J Neurophysiol 86: 1445–1458, 2001. doi: 10.1152/jn.2001.86.3.1445.
  66. Stripling R, Kruse AA, Clayton DF. Development of song responses in the zebra finch caudomedial neostriatum: role of genomic and electrophysiological activities. J Neurobiol 48: 163–180, 2001. doi: 10.1002/neu.1049.
  67. Stripling R, Volman SF, Clayton DF. Response modulation in the zebra finch neostriatum: relationship to nuclear gene regulation. J Neurosci 17: 3883–3893, 1997. doi: 10.1523/JNEUROSCI.17-10-03883.1997.
  68. Suga N. Amplitude spectrum representation in the Doppler-shifted-CF processing area of the auditory cortex of the mustache bat. Science 196: 64–67, 1977. doi: 10.1126/science.190681.
  69. Suga N. Biosonar and neural computation in bats. Sci Am 262: 60–68, 1990. doi: 10.1038/scientificamerican0690-60.
  70. Suga N, O’Neill WE, Kujirai K, Manabe T. Specificity of combination-sensitive neurons for processing of complex biosonar signals in auditory cortex of the mustached bat. J Neurophysiol 49: 1573–1626, 1983. doi: 10.1152/jn.1983.49.6.1573.
  71. Sun W, Marongelli EN, Watkins PV, Barbour DL. Decoding sound level in the marmoset primary auditory cortex. J Neurophysiol 118: 2024–2033, 2017. doi: 10.1152/jn.00670.2016.
  72. Terleph TA, Mello CV, Vicario DS. Species differences in auditory processing dynamics in songbird auditory telencephalon. Dev Neurobiol 67: 1498–1510, 2007. doi: 10.1002/dneu.20524.
  73. Terpstra NJ, Bolhuis JJ, den Boer-Visser AM. An analysis of the neural representation of birdsong memory. J Neurosci 24: 4971–4977, 2004. doi: 10.1523/JNEUROSCI.0570-04.2004.
  74. Thompson JV, Gentner TQ. Song recognition learning and stimulus-specific weakening of neural responses in the avian auditory forebrain. J Neurophysiol 103: 1785–1797, 2010. doi: 10.1152/jn.00885.2009.
  75. Thompson JV, Jeanne JM, Gentner TQ. Local inhibition modulates learning-dependent song encoding in the songbird auditory cortex. J Neurophysiol 109: 721–733, 2013. doi: 10.1152/jn.00262.2012.
  76. Vates GE, Broome BM, Mello CV, Nottebohm F. Auditory pathways of caudal telencephalon and their relation to the song system of adult male zebra finches. J Comp Neurol 366: 613–642, 1996.
  77. Wang X. The harmonic organization of auditory cortex. Front Syst Neurosci 7: 114, 2013. doi: 10.3389/fnsys.2013.00114.
  78. Wang Y, Brzozowska-Prechtl A, Karten HJ. Laminar and columnar auditory cortex in avian brain. Proc Natl Acad Sci USA 107: 12676–12681, 2010. doi: 10.1073/pnas.1006645107.
  79. Wild JM, Karten HJ, Frost BJ. Connections of the auditory forebrain in the pigeon (Columba livia). J Comp Neurol 337: 32–62, 1993. doi: 10.1002/cne.903370103.
  80. Woolley SM, Casseday JH. Response properties of single neurons in the zebra finch auditory midbrain: response patterns, frequency coding, intensity coding, and spike latencies. J Neurophysiol 91: 136–151, 2004. doi: 10.1152/jn.00633.2003.
  81. Woolley SM, Fremouw TE, Hsu A, Theunissen FE. Tuning for spectro-temporal modulations as a mechanism for auditory discrimination of natural sounds. Nat Neurosci 8: 1371–1379, 2005. doi: 10.1038/nn1536.
  82. Wu GK, Arbuckle R, Liu BH, Tao HW, Zhang LI. Lateral sharpening of cortical frequency tuning by approximately balanced inhibition. Neuron 58: 132–143, 2008. doi: 10.1016/j.neuron.2008.01.035.
  83. Yanagihara S, Yazaki-Sugiyama Y. Auditory experience-dependent cortical circuit shaping for memory formation in bird song learning. Nat Commun 7: 11946, 2016. doi: 10.1038/ncomms11946.

Articles from Journal of Neurophysiology are provided here courtesy of American Physiological Society
