Author manuscript; available in PMC: 2013 Aug 21.
Published in final edited form as: Anim Behav. 2012 Jun;83(6):1411–1420. doi: 10.1016/j.anbehav.2012.03.012

Sensory Constraints on Birdsong Syntax: Neural Responses to Swamp Sparrow Songs with Accelerated Trill Rates

JF Prather 1, S Peters 3, R Mooney 2, S Nowicki 2,3
PMCID: PMC3748630  NIHMSID: NIHMS493111  PMID: 23976787

Abstract

Both sensory and motor mechanisms can constrain behavioral performance. Sensory mechanisms may be especially important for constraining behaviors that depend on experience, such as learned birdsongs. Swamp sparrows learn to sing by imitating the song of a tutor, but sparrows fail to accurately imitate artificial tutor songs with abnormally accelerated trills, instead singing brief and rapid trills interrupted by silent gaps. This “broken syntax” has been proposed to arise from vocal-motor limitations. Here we consider whether sensory limitations exist that could also contribute to broken syntax. We tested this idea by recording auditory-evoked activity of sensorimotor neurons in the swamp sparrow’s brain that are known to be important for the learning, performance and perception of song. In freely behaving adult sparrows that sang songs with normal syntax, neurons were detected that exhibited precisely time-locked activity to each repetition of the syllable in a trill when presented at a natural rate. Those cells failed to faithfully follow syllables presented at an accelerated rate, however, and their failure to respond to consecutive syllables increased as a function of trill rate. This “flickering” auditory representation in animals performing normal syntax reveals a central constraint on the sensory processing of rapid trills. Furthermore, because these neurons are implicated in both song learning and perception, and because auditory flickering began to occur at accelerated trill rates previously associated with the emergence of broken song syntax, these sensory constraints may contribute to the emergence of broken syntax.

Keywords: HVC, auditory, sensorimotor, syntax, development, vocal learning

INTRODUCTION

Behavioral performance can be constrained by a variety of factors. Some behaviors may be shaped by the physical properties of the peripheral motor system, or by physiological limitations of the central motor networks that pattern the behavior (Lauder & Liem 1989). Other behaviors, especially those that are learned, depend heavily on sensory experience and sensorimotor integration. Therefore, the constraints that shape the performance of learned behaviors could also arise from a limited ability of the central nervous system to accurately encode sensory experience relevant to learning. Juvenile songbirds learn to sing by copying the song of an adult tutor in a process that depends heavily on auditory experience and vocal rehearsal (For review, see Mooney, Prather & Roberts 2008). Despite the songbird’s remarkable capacity for vocal learning, even the best singers are constrained in their imitative ability (Nowicki, Westneat & Hoese 1992; Ryan & Brenowitz 1985). When tutored with songs containing features exceeding species-typical norms, birds are often unable to accurately copy those songs (Podos 1996; Podos, Nowicki & Peters 1999). A possible explanation for this inaccurate copying is that the properties of the vocal and respiratory musculature, or of the central circuits important to song patterning, limit the bird’s ability to reproduce an otherwise accurately encoded auditory representation of the abnormal model (Podos 1996). Alternatively, learning could be constrained by the brain’s inability to accurately encode an auditory representation of a song model with abnormal acoustic features (Marler 1976). In this study, we recorded the auditory responses of neurons in freely behaving adult swamp sparrows (Melospiza georgiana) to ask whether sensory limitations constrain the representation of abnormal song models that are difficult for young sparrows to copy.

Swamp sparrow songs consist of a multi-note syllable trilled continuously at a steady rate. Young swamp sparrows tutored with manipulated songs comprising artificially accelerated trills often fail to copy those songs accurately, producing instead versions of those songs that have been modified in ways that make them apparently easier to produce (Podos 1996; Podos et al. 1999). Sparrows tutored with song models containing highly accelerated trill rates subsequently sing songs in which they accurately reproduce both the acoustic features of the song syllable and match the accelerated trill rate, but only in short bursts that are separated by silent gaps, resulting in what is referred to as “broken song syntax.” Broken song syntax is highly atypical of normal swamp sparrow songs, and it remains unclear what constrains the copying of accelerated trills during the song learning and development process in such a way as to produce this outcome. Because birds can accurately copy the accelerated model in brief epochs, Podos and colleagues concluded that the central nervous system must contain an accurate auditory representation of the song, and speculated that broken syntax arises because of peripheral motor constraints (Podos 1996). Here we consider the possibility that the sparrow’s brain is limited in its ability to accurately encode a sensory representation of songs with artificially accelerated trill rates.

If the production of broken song syntax is influenced by the inability of the central nervous system to accurately encode tutor songs with accelerated trills, then this encoding constraint may be most clearly evident at the point where auditory information about song models is integrated with vocal motor commands (Hahnloser, Kozhevnikov & Fee 2002; Nick & Konishi 2005; Nottebohm, Stokes & Leonard 1976; Prather et al. 2008, 2010; Roberts et al. 2010). In that light, we recorded the activity of individual neurons in the swamp sparrow’s HVC (used here as a proper name), a sensorimotor brain structure where individual neurons are active during both singing and listening to song (McCasland & Konishi 1981; Prather et al. 2008), and where auditory-evoked activity has been linked to song learning and perception (Nick & Konishi 2005; Prather et al. 2009; Roberts et al. 2010). Specifically, we sampled the activity of HVC neurons that project to the striatal Area X (i.e., HVCX neurons), which in swamp sparrows exhibit auditory-evoked action potential activity that is precisely time-locked to each syllable in a natural song trill (Prather et al. 2009; Prather et al. 2008). If the properties that constrain the production of broken syntax are exclusively motor-related, then HVCX auditory responses should accurately encode continuous syllables in an accelerated trill. Alternatively, if the emergence of broken syntax also reflects auditory processing constraints, then HVCX cells may be limited in their ability to follow accelerated trills. We compared the responses of HVCX neurons to auditory presentation of natural swamp sparrow songs versus variants of those songs in which the trill rate had been accelerated. Our previous studies have shown that these cells respond in a precisely time-locked fashion to each syllable presented at a normal trill rate (Prather et al. 2009; Prather et al. 2008).
In contrast, HVCX neurons failed to faithfully follow syllables presented at an accelerated rate, and failure to respond to consecutive syllables increased as a function of the trill rate. Notably, gaps in neural responses began to form at accelerated trill rates associated with the emergence of broken song syntax in previous behavioral studies. These data were recorded in birds that were naïve to accelerated trills and that were not themselves performing broken syntax, eliminating motor performance of broken trills as a possible influence on this failure to respond faithfully to songs with accelerated trills. Together, these data reveal a previously unknown limitation on the central representation of accelerated trills that could limit the bird’s ability to learn these songs.

METHODS

All procedures were in compliance with recommendations of Duke University Animal Care and Use Committee and state and federal regulations governing the capture and use of wild birds.

Animal Collection and Song Stimulus Preparation

Swamp sparrows were caught with mist nets as adults (age > 1 year) on either their winter grounds in Orange County, NC or their summer breeding grounds in Crawford County, PA. Birds were housed in cages (46 × 22 × 26 cm) individually throughout their time in the lab, both before and during experimentation, and were provided seed and water ad libitum and given a regular supplement of mealworms. Males (N = 6 birds) were identified either by external morphology (breeding season) or by molecular marker techniques (out of season), and females were released (methods described in Prather et al. 2008). Prior to implantation of the stimulus and recording devices, birds were subjected to a gradually lengthening photoperiod (1 hr/week from 9:15 up to 15:9 L:D cycle) meant to simulate the onset of the spring breeding season, when swamp sparrows sing most robustly. This change in photoperiod, combined with a subcutaneous implant of testosterone (20 mm length of silastic tubing, I.D. 1.47 mm, O.D. 1.96 mm, packed with crystalline T, see Nowicki & Ball 1989), was sufficient to induce the birds to sing. Birds were recorded in a semi-anechoic chamber (Industrial Acoustics Model AC-1, Sony TCM 5000 EV recorder and Shure SM57 microphone), and many examples of song (typically > 100) were recorded from each bird to ensure that the bird's full repertoire was sampled (2 to 5 song types). Although the exact age of the birds we used for these experiments was unknown, all songs were performed in the “crystallized” song syntax that typifies adult swamp sparrow song (Marler & Peters 1982), indicating that the birds were at least 1 year old. Exemplars of each song type were digitized (25 kHz) and saved onto a computer hard drive (SIGNAL by Engineering Design, and LabView by National Instruments) to be used as stimulus songs.

Song stimuli consisted of natural song types and synthetic variants of those song types from the experimental subject and conspecific birds. Natural song types (unaltered from the original recordings) were used to assess the auditory selectivity of each neuron. Following methods used to generate accelerated trills in previous studies, songs were digitally edited to create synthetic variants of the natural song type (Podos et al. 1999). Individual notes were not altered in this process, but the silent intervals between notes and syllables were compressed to 50%, 25% or 0% of their respective durations in the natural song type, resulting in a slight compression of the syllable duration and an acceleration of the overall trill rate (Fig. 1). Individual synthetic syllables were concatenated to generate trilled stimuli with total song duration like that of natural song types (typically ∼ 2 sec).
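The relation between interval compression and trill rate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `accelerated_trill_rate` and the note and gap durations are hypothetical, and the sketch assumes that one syllable period is the sum of its unaltered note durations plus its (scaled) silent intervals. It is not the editing procedure used to build the stimuli.

```python
def accelerated_trill_rate(note_durations, gap_durations, gap_fraction):
    """Return the trill rate (syllables/sec) after scaling every silent gap.

    note_durations: durations (sec) of the notes in one syllable (unaltered).
    gap_durations: durations (sec) of the silent intervals in one syllable
        period, i.e. inter-note gaps plus the inter-syllable gap.
    gap_fraction: fraction of each gap that is retained (1.0 = natural).
    """
    if not 0.0 <= gap_fraction <= 1.0:
        raise ValueError("gap_fraction must be between 0 and 1")
    period = sum(note_durations) + gap_fraction * sum(gap_durations)
    return 1.0 / period


# Hypothetical syllable: three notes and three gaps (values invented
# for illustration, not measured from any song in the study).
notes = [0.030, 0.025, 0.030]
gaps = [0.015, 0.015, 0.022]

natural = accelerated_trill_rate(notes, gaps, 1.0)
for fraction in (1.0, 0.5, 0.25, 0.0):
    rate = accelerated_trill_rate(notes, gaps, fraction)
    print(f"gaps at {fraction:.0%}: {rate:.1f} syll/s "
          f"({100 * rate / natural:.0f}% of natural)")
```

Because the notes themselves are untouched, the maximal acceleration for a given song type is bounded by the ratio of the natural syllable period to the summed note durations, which is why the ceiling differs from one song type to another.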

FIGURE 1.

Auditory responses of individual neurons were tested using song types in the bird’s adult repertoire and variants of those song types in which the trill rate had been accelerated. An example of a natural song type (top row) contains periods of song syllables (white boxes below each spectrogram, 5 syllables marked in each row) separated by periods of silence (gray boxes below each spectrogram). In variants of that song type with accelerated trill rate (lower three rows), the periods of sound production were unaltered, but the silences between individual notes and between each syllable were compressed by a specific percentage, resulting in a corresponding acceleration of the trill rate (columns to the right of each spectrogram). The maximal trill rate acceleration for this song type was 164% of the natural trill rate, corresponding to complete compression of all inter-note and inter-syllable intervals.

Microdrive Implantation Surgery

Neurons were sampled using a miniaturized micromanipulation device in awake and freely behaving birds (Fig. 2A) (Fee & Leonardo 2001; Prather et al. 2009; Prather et al. 2008). Several days prior to implantation, birds were transferred from their housing cage to a cage within the recording chamber, a sound-attenuating box (Industrial Acoustics Model AC-1) where they would reside throughout experimentation. During implantation, adult male swamp sparrows were anesthetized using isoflurane (inhalation, 1–3% in 100% O2) and placed in a stereotaxic device. A small incision was made in the skin overlying the skull, and the outer leaflet of bone was removed over HVC and the sites of its projections into the avian striatum (Area X) and the robust nucleus of the arcopallium (RA) (Reiner et al. 2004) (HVC typically 0.3 mm anterior and 2.3 mm lateral of the bifurcation of the midsagittal sinus, Area X typically 5.7 mm anterior and 1.9 mm lateral, RA typically 0.7 mm posterior and 2.3 mm lateral, head angle 30 degrees below zero in all cases). A small craniotomy (approximately 300 by 300 microns) was made in the inner leaflet over Area X, and a small custom-made bipolar stimulus electrode (JF Prather) was inserted to the proper depth. The implant site was covered with a sterile film and the electrode was secured using dental cement. With the electrode in Area X firmly secured, the procedure was repeated to place a bipolar stimulus electrode in RA. With both stimulus electrodes firmly in place, another small craniotomy was made directly over HVC. HVC was located by passing brief (100 µsec) current pulses through the stimulating electrodes in Area X and RA and using a sterilized extracellular electrode (Carbostar 1, Kation Scientific) to observe the extent of the region expressing the resultant antidromic “hash.” The microdrive recording device was implanted so that the recording electrodes were initially positioned slightly dorsal of HVC. 
Across experiments, there was subtle variation of the location of recording electrodes within HVC. Histological verification of recording sites revealed that our recordings spanned the full anterior-posterior and medial-lateral extent of HVC, and the depth of the recording electrodes spanned the full depth of HVC in each bird. Across these sites, we found no evidence of anatomical specialization within HVC, which would have been evident as a relation between electrode position and auditory response characteristics. The microdrive was secured to the skull using dental cement (microdrive ∼ 1.2 g including dental cement, birds ∼ 16 g), and the incision site was closed using surgical skin adhesive (Vetbond). The bird was monitored closely until it was fully recovered, typically < 15 min. After the recording session was complete (1–5 weeks), the bird was deeply anesthetized with equithesin, perfused transcardially with 0.9% saline followed by 4% paraformaldehyde, and the brain was processed histologically. All electrode positions were verified at the end of each experiment using Nissl-stained sagittal sections (thickness 75 microns).

FIGURE 2.

Extracellular recordings revealed reliable responses of HVCX neurons to individual syllables at natural trill rates, but reliability decreased with accelerated trill rate. a) HVCX neurons were identified using antidromic stimulation in Area X, and HVCX cells could be easily distinguished from HVC cells that project to premotor structures (robust nucleus of the arcopallium, RA) or that project locally within HVC (interneurons, omitted for clarity). All classes of HVC neurons receive auditory input (dotted arrow; parasagittal schematic of the swamp sparrow brain at the level of HVC; D = dorsal, R = rostral) (Prather et al. 2009). b) HVCX auditory responses to natural song types were phasic and typically evoked by only one song type in the bird’s repertoire (“primary song type” for that cell; top: raw data trace; bottom: oscillogram of song stimulus) (Prather et al. 2009; Prather et al. 2008). c) HVCX cells typically responded to each syllable in primary song types with natural trill rate (7.3 syllables/sec, shaded areas highlight individual syllables; top three rows: responses of one HVCX neuron to its primary song type; bottom row: oscillogram of the song stimulus) (Prather et al. 2009; Prather et al. 2008). d) In contrast, the response of the same cell as in panel c became much less reliable when the primary song type trill rate was accelerated (10.3 syllables/sec, data arranged as in panel c).

Experimental Protocol

Birds were allowed to recover for three days following the implantation procedure before electrophysiological recording began. During recording, microdrive electrodes were slowly advanced into HVC while weak electrical stimulation was delivered to the stimulus electrodes in either Area X or RA (100 µsec pulses, ∼ 100 µA). The boundaries of HVC could be reliably identified by observing where antidromic activity was evident. Once an electrode was positioned in HVC, the electrode was advanced very slowly so that antidromically-evoked action potentials of individual neurons could be identified. All neural data were amplified, filtered (band pass 500 Hz to 10 kHz), and digitized (25 kHz) to computer file (LabView, National Instruments). Action potentials of individual units were discriminated using amplitude discrimination of the largest unit in a record (custom software) or discrimination based on waveform characteristics (WaveClus) (Quiroga, Nadasdy & Ben-Shaul 2004). In both cases, single unit isolation was verified using an interspike interval histogram to test for the presence of a refractory period (Fujimoto, Hasegawa & Watanabe 2011; Prather et al. 2008). Our investigation focused on HVCX neurons, and individual HVCX units were identified using antidromic stimulation via the electrodes placed in Area X or RA (5 cells, 3 birds), or by characteristic features of their action potentials and auditory responses (9 cells, 3 birds) (Prather et al. 2009; Prather et al. 2008).
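The refractory-period check on the interspike interval (ISI) distribution can be illustrated with a short Python sketch. The function name `isi_violation_fraction` and the 1 ms refractory period are assumptions made for illustration; the text does not state the threshold or the acceptance criterion that was actually applied.

```python
def isi_violation_fraction(spike_times, refractory_period=0.001):
    """Fraction of interspike intervals shorter than the refractory period.

    spike_times: spike times in seconds (any order).
    refractory_period: assumed absolute refractory period in seconds; the
        1 ms default is an illustrative choice, not a value from the paper.
    """
    times = sorted(spike_times)
    intervals = [b - a for a, b in zip(times, times[1:])]
    if not intervals:
        return 0.0
    violations = sum(1 for isi in intervals if isi < refractory_period)
    return violations / len(intervals)


# Hypothetical spike trains (seconds). A well-isolated unit should show
# no intervals inside the refractory period; a record contaminated by a
# second unit typically does.
clean_unit = [0.010, 0.150, 0.290, 0.430]
mixed_units = [0.010, 0.0104, 0.150, 0.1503, 0.290]

print(isi_violation_fraction(clean_unit))
print(isi_violation_fraction(mixed_units))
```

A fraction near zero is consistent with single-unit isolation, whereas a substantial fraction of very short intervals suggests that spikes from more than one neuron were assigned to the same unit.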

When the action potential of an HVCX single unit had been isolated and identified, song playback of each song type in the bird's repertoire was initiated immediately (10 sec quiet interval between each song presentation, stimuli presented in randomized order). Songs were played to the sparrow at 70 dB (peak RMS, A-weighted) through a speaker placed 20 to 35 cm away in the chamber (distance varied according to the bird’s location in the cage). Playback of the bird’s entire song repertoire was used to assess the auditory response selectivity of individual HVCX neurons, with each cell responding robustly to one song type (“primary song type” for that cell, e.g., Fig. 2B) and responding little or not at all to other songs (Prather et al. 2008). With the primary song type identified for a given neuron, artificially accelerated variants of the primary song type were presented to investigate the responses of HVCX neurons to trill rate acceleration like that associated with the expression of broken song syntax (Fig. 2C–D).

Many of the HVCX neurons described here (14 cells, 6 birds) were characterized using not only their responses to accelerated trills but also their sensorimotor correspondence evident during singing (3 cells, 1 bird) (Prather et al. 2008) and their categorical auditory responses related to song perception (8 cells, 4 birds) (Prather et al. 2009). In some cases (3 cells, 1 bird) all three parameters were characterized in one and the same cell. This study seeks to gain insight into a behavior in which the bird must perceive accelerated trills and translate them into a sensorimotor context in order to achieve imitation. Although studying birds using a head-fixed anesthetized preparation could have facilitated recording from a larger number of cells (Mooney, Hoese & Nowicki 2001; Prather et al. 2010), it was essential to investigate the response to accelerated trills in the awake state in which HVCX activity is also linked to perception (Prather et al. 2009) and in which HVCX cells express a precise sensorimotor correspondence (Prather et al. 2008). Testing these multiple auditory and singing-related properties limited the number of cells in this study; however, establishing such links in one and the same cell is essential to understanding the degree to which the auditory responses described here could potentially influence perception and imitation.

Data Analysis

Action potentials from individual HVCX neurons were discriminated and compared against features of the auditory song stimulus (Prather et al. 2009; Prather et al. 2008). Song features were discerned using spectrograms generated in Matlab (MathWorks), and all analyses were performed in Matlab using custom software (JF Prather). Rasters and histograms of action potential activity were constructed by aligning discriminated action potentials to the song (“whole-song” analysis, 10 ms bin size in whole-song histograms, e.g., Figure 1 in Prather et al. 2008). Because swamp sparrow songs are trills consisting of syllables separated by brief quiet intervals, an additional technique was possible wherein the onset of each syllable was detected separately and used to align action potentials that occurred in association with each syllable (“single-syllable” analysis, 1 ms histogram bin size in single-syllable histograms, e.g., Figure 1 in Prather et al. 2008). The onset of song or of each syllable was defined as the first time when a song note > 10 dB louder than background could be detected. In whole-song analyses, action potential latencies were assigned relative to the onset of the song; in single-syllable analyses, action potential latencies were assigned relative to the onset of each associated song syllable.
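The “single-syllable” alignment described above can be sketched as follows. This is a hedged illustration, not the custom Matlab software used in the study: the function names `align_to_syllables` and `histogram` are hypothetical, times are expressed in milliseconds for convenience, and each action potential is assumed to belong to the most recent syllable onset.

```python
from bisect import bisect_right


def align_to_syllables(spike_times, syllable_onsets):
    """Latency of each spike relative to the most recent syllable onset.

    Both arguments are in the same time unit (milliseconds here).
    Spikes that occur before the first onset are dropped.
    """
    onsets = sorted(syllable_onsets)
    latencies = []
    for t in spike_times:
        i = bisect_right(onsets, t) - 1  # index of the last onset <= t
        if i >= 0:
            latencies.append(t - onsets[i])
    return latencies


def histogram(latencies, bin_size, window):
    """Count latencies in consecutive bins of width bin_size over [0, window)."""
    n_bins = int(round(window / bin_size))
    counts = [0] * n_bins
    for lat in latencies:
        b = int(lat / bin_size)
        if 0 <= b < n_bins:
            counts[b] += 1
    return counts


# Hypothetical trill: syllable onsets every 137 ms, one spike per syllable
# at a latency of roughly 40 ms (values invented for illustration).
onsets_ms = [0.0, 137.0, 274.0]
spikes_ms = [40.5, 177.4, 314.6]

latencies_ms = align_to_syllables(spikes_ms, onsets_ms)
counts = histogram(latencies_ms, bin_size=1.0, window=137.0)  # 1 ms bins
```

With 1 ms bins, spikes that are time-locked to the same syllable feature pile up in a single bin, whereas temporally dispersed responses would spread across many bins.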

In both the whole-song and single-syllable analyses, action potential activity during song presentation was compared against the background firing rate when no stimulus was present, and the mean background rate plus 5 SD was taken as the threshold for significance (Prather et al. 2008). If the value in any bin in the peri-stimulus time histogram exceeded that threshold (accounting for bin size), the auditory response was deemed significant. Results obtained in this manner were in good agreement with visual assessment of the efficacy of an auditory stimulus. In quantification of HVCX responses to artificially accelerated versions of the primary song type for that cell, measurements were normalized by the response of the same neuron to the natural song type used to create the accelerated stimulus, which facilitated comparison of responses across different cells and birds. Specifically, the degree of trill rate acceleration was expressed as a percentage of the trill rate of the natural song type, responses to accelerated stimuli were expressed as a percentage of the cell’s response to the corresponding natural song type, and single-syllable response latencies for accelerated stimuli were computed as the difference between the latency of the peak response to the accelerated stimulus versus the latency of the peak response to the natural song type (expressed as either ms or a percentage of syllable duration). The response strengths evoked by reverse playback of the primary song type in each HVCX neuron were averaged, and that mean was taken as a standard above which response strengths were considered valid.
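The significance criterion and the normalization described above can be expressed as a minimal Python sketch. The function names are hypothetical, the example firing rates are invented, and two assumptions are made that the text does not specify: bin counts are first converted to firing rates (one way of accounting for bin size), and the sample standard deviation is used.

```python
from statistics import mean, stdev


def response_is_significant(psth_rates, background_rates, n_sd=5.0):
    """True if any PSTH bin exceeds mean(background) + n_sd * SD(background).

    psth_rates: firing rate (spikes/sec) in each peri-stimulus bin.
    background_rates: firing rates (spikes/sec) measured in bins of the same
        width while no stimulus was present.
    """
    threshold = mean(background_rates) + n_sd * stdev(background_rates)
    return max(psth_rates) > threshold


def percent_of_natural(accelerated_response, natural_response):
    """Express a response to an accelerated trill as % of the natural response."""
    return 100.0 * accelerated_response / natural_response


# Hypothetical rates in spikes/sec (invented for illustration).
background = [1.0, 2.0, 0.0, 1.0, 1.0]
strong_psth = [0.0, 2.0, 30.0, 1.0]
weak_psth = [0.0, 2.0, 4.0, 1.0]

print(response_is_significant(strong_psth, background))
print(response_is_significant(weak_psth, background))
print(percent_of_natural(6.0, 24.0))
```

Normalizing each cell by its own response to the natural song type puts cells with very different absolute firing rates on a common scale, which is what makes the across-cell comparison possible.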

WAV files of songs in Fig. 1 are available online as Supplementary Material.

RESULTS

HVCX Response Strength Decreased but Response Latency was Unchanged with Accelerated Trill Rates

Previously we recorded from HVCX sensorimotor neurons and found that they respond regularly and reliably to each syllable in the primary song type when it is presented at a natural trill rate (Prather et al. 2009; Prather et al. 2008). Here we asked whether the activity of those cells was altered when trill rates were artificially accelerated (N = 14 cells, 6 birds, including 3 cells (1 bird) that expressed a sensorimotor correspondence (Prather et al. 2008), 8 cells (4 birds) with activity related to song perception (Prather et al. 2009) and 3 cells (1 bird) in which sensorimotor correspondence and activity related to perception were both characterized in the same cell). Using stimuli comprising accelerated trills of each neuron’s primary song type in which internote and intersyllable silences were shortened but the notes themselves were unchanged, we observed that the responses of individual HVCX neurons diminished with progressively more accelerated trills. In the representative cell shown in Fig. 3A–B (same cell as in Figure 2B–D), responses to the natural song type were phasic and quite robust, evident as large peaks in the histogram of whole-song responses (Fig. 3A, left). With progressively accelerated trills, however, those histogram peaks became smaller (Fig. 3A, right). This decrement in histogram peaks could indicate that the number of action potentials generated in association with each syllable had decreased, or that the response had remained strong but had become temporally dispersed. To distinguish these possibilities, we examined the latency of those auditory responses in relation to each syllable in the song (Fig. 3B). The latency of the peak response differed by at most 5.5 ms across the natural and accelerated conditions (5.1% of natural syllable duration), revealing that the temporal precision of the auditory response was not disrupted as the trill rate was increased.
Therefore, this HVCX neuron became progressively less responsive to individual syllables of the primary song type when they were presented at accelerated trill rates.

FIGURE 3.

The strength of auditory response in individual HVCX neurons typically declined with progressively more accelerated trill rates, but different cells within the same bird could express different types of response (N = 14 cells, 6 birds). a) The response of this cell, also shown in figure 2b–d, was typical of the HVCX population, with decreasing responses at higher trill rates (top: response raster for each song presentation; middle: histogram of aggregate auditory response; bottom: oscillogram of song stimulus). b) The response of the same cell as in panel a retained its precise relation to specific song features, revealing that reduced peaks in panel a reflect a lower probability of response to each syllable (top: raster of responses to each syllable; middle: histogram of aggregate auditory response; bottom: spectrogram of song syllable). Exceptions to this typical response were also evident. c) A different cell from a different bird maintained its response strength even at maximally accelerated trills. This was the only cell in which such maintenance of response was observed. d) A different cell from the same bird as in panel c expressed strong decrement in response to accelerated trills. The presence of the response in this panel and in panel c among cells in the same bird reveals that the HVCX population response comprises a diversity of responses to accelerated trills.

The cell described in Figs. 3A–B is representative of the typical response of HVCX neurons to accelerated trills. Two other cells illustrate the range of responses detected, responding to individual syllables in accelerated trills with much better (Fig. 3C) or much worse fidelity (Fig. 3D). The HVCX neuron in Fig. 3C maintained a strong response even as the trill rate was maximally accelerated, indicating that the auditory activity of some HVCX neurons could accurately represent both the phonological and the temporal features of accelerated trills. In contrast, the response of the neuron shown in Fig. 3D was strongly decremented even at moderately accelerated trill rates, and no response was evident at maximally accelerated trill rates, indicating that the auditory response of some HVCX neurons failed beyond even moderately accelerated trill rates. Notably, the neurons shown in Figures 3C and 3D were sampled from the same bird (different primary song types), demonstrating that these disparate patterns of HVCX response to accelerated trills coexist within an individual. Furthermore, such differences are also evident even among cells that respond to the same primary song type (e.g., square symbols in Fig. 4). These data reveal that although HVCX neurons maintain the temporal precision of their response to specific features of syllables in accelerated trills, individual neurons differ in their ability to respond to each syllable in an accelerated trill.

FIGURE 4.

Across all HVCX neurons tested for responses to accelerated trills (N = 14 cells, 6 birds), the strength of auditory response became weaker with progressive acceleration of trill rates, but cells retained their relation to specific features within the song syllable. a) The responses of individual neurons became weaker as trill rates were accelerated, with responses becoming less than that evoked by reversed presentation of the primary song type (horizontal dotted line) at even moderately accelerated conditions (triangles indicate antidromically identified HVCX cells; open symbols indicate the cells shown in Figure 3; square symbols illustrate the diversity of responses among 3 cells that responded to the same primary song type in the same bird). At trill rates where broken syntax emerges behaviorally (> 156% of natural trill rate, shaded region), few HVCX neurons expressed a reliable auditory representation of each syllable in those accelerated trills. b) Despite those changes in response strength, activity of HVCX neurons was generally unchanged in its temporally precise association with specific features of the song syllable (symbols as in panel a).

Across the population of all HVCX neurons that we tested (N = 14 cells, 6 birds), responses were strongly decremented with progressively accelerated trill rates, with most cells exhibiting a response roll off between 8 and 10 Hz (Fig. 4A, Supp. Fig. 1). In the response of each neuron, both the peak response strength and the mean response strength were similarly decreased with progressively accelerated trill rates (Spearman corr. coeff.: 0.91, p < 0.001), supporting our previous observation that action potentials generated in response to accelerated trills were fewer in number but not temporally dispersed. The stability of response latency was confirmed by comparing the responses of each neuron at the natural and accelerated trill rates (Fig. 4B), revealing no systematic change in response latency as a function of trill rate acceleration (p = 0.47, slope of linear regression compared to zero). Together, these data reveal that HVCX neurons responded to each syllable in an accelerated trill with action potentials that remained correlated with the same spectrotemporal features as in the natural song, but action potentials were less likely to be evoked by accelerated trills. Thus, auditory-evoked action potentials in HVCX neurons are precisely time-locked to specific acoustic features of the song syllable in both the natural trill and accelerated trills, but the probability of an action potential occurring for any particular syllable declines with accelerated trill rates.

HVCX Neurons Failed to Respond to Consecutive Syllables in Accelerated Trills

The temporal pattern of HVCX responses to accelerated trills resembled the temporal pattern of broken song syntax in birds tutored with accelerated trills. Broken syntax songs consist of abnormally brief trills of several syllables in which both the phonology and the accelerated trill rate are accurately reproduced, but those brief trills are interrupted by gaps of silence that are the equivalent of several syllables in duration (Podos 1996; Podos et al. 1999). When presented with a song with an accelerated trill rate, HVCX cells were less likely to respond to individual syllables, although when they did respond, they fired action potentials at the same time in the syllable as when presented with natural songs (e.g., Figs. 3, 4). This diminished response to accelerated trills generated epochs in which there were no auditory-evoked action potentials, and those silent periods were typically the equivalent of several syllables in duration (e.g., accelerated cases in Fig. 3A). To investigate this “flickering” phenomenon further, we examined the response of each neuron to serial syllables in the natural and accelerated versions of the primary song type. In response to natural songs, HVCX neurons typically generated action potentials in association with many or all syllables in a song (Fig. 5A, left), with rare failures to respond distributed throughout the song duration (Fig. 5B, left). In contrast, responses to accelerated trills were characterized by significantly more failures to respond (Fig. 5A, right), and those failures were also distributed throughout the song (Fig. 5B, right). Responses to accelerated trills also comprised significantly shorter sequences of serial responses and significantly longer sequences of serial failures (Figs. 5C, 6). Together, these findings reveal a “flickering” response of HVCX neurons to accelerated trills reminiscent of the short bursts of accurately imitated syllables separated by gaps of silence in the production of broken syntax.
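The measurement of serial responses and serial failures can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the analysis code used here: the function name `run_lengths` is hypothetical, the example sequences are invented, and a syllable is assumed to count as "responded" when at least one action potential was associated with it.

```python
from itertools import groupby


def run_lengths(responded):
    """Split a per-syllable response sequence into runs.

    responded: iterable of booleans, one per syllable in a single song
        presentation (True = at least one spike associated with the syllable).
    Returns (response_runs, failure_runs), each a list of run lengths.
    """
    response_runs, failure_runs = [], []
    for value, group in groupby(responded):
        length = sum(1 for _ in group)
        (response_runs if value else failure_runs).append(length)
    return response_runs, failure_runs


# Hypothetical presentations (invented for illustration).
natural_trill = [True] * 6 + [False] + [True] * 7
accelerated_trill = [True, True, False, False, False, True,
                     False, False, False, False, True, True]

print(run_lengths(natural_trill))
print(run_lengths(accelerated_trill))
```

Averaging the two lists of run lengths across presentations and cells at each degree of interval compression would give summary quantities of the kind plotted against trill rate.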

FIGURE 5.


Response failures introduced gaps in the neural representation of songs with accelerated trill rates (N = 14 cells, 6 birds). Responses of the same representative cell are shown in each panel. A) The natural song type typically evoked a response to each syllable, evident as generally contiguous rows in each raster plot. The reduced probability of response to individual syllables in accelerated trills was evident as gaps in each row (different numbers of syllables were presented at each trill rate to preserve the total stimulus duration). B) Response failures occurred throughout the song duration (black line: mean; gray area: SE of response at each syllable in the song; horizontal dotted line: expectation if the cell responded to every syllable). C) Response failures tended to occur across consecutive syllables, also evident as gaps in the responses in panel A, and those runs of failures became longer as trill rate became more accelerated.
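The caption's note that the number of syllables varied with trill rate in order to preserve total stimulus duration follows from simple arithmetic, sketched below. The function name and the 2 s duration are hypothetical values chosen for illustration; the actual stimulus durations are not restated here.

```python
import math


def syllables_for_duration(total_duration_s, trill_rate_hz):
    """Number of whole syllable periods that fit in a fixed duration.

    Trill rate (Hz) is syllables per second, so the count is
    duration * rate, rounded down to a whole syllable. The small
    epsilon guards against floating-point values just below an integer.
    """
    return math.floor(total_duration_s * trill_rate_hz + 1e-9)


# Hypothetical 2 s stimulus at progressively faster trill rates.
for rate_hz in (8.0, 10.0, 12.0):
    print(rate_hz, syllables_for_duration(2.0, rate_hz))
```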

FIGURE 6.


Across all HVCX cells tested for responses to accelerated trills (N = 14 cells, 6 birds), the number of consecutive syllables with an associated auditory response decreased (filled symbols, solid line), and the number of consecutive syllables with no response increased (open symbols, dashed line) with progressively accelerated trill rate (means and SE are shown; asterisk indicates p < 0.05, Mann-Whitney U test; the trill rate corresponding to each degree of interval compression varied across cells and is summarized by the mean trill rate across all cells).

DISCUSSION

In awake and freely behaving swamp sparrows, HVCX neurons express temporally precise responses to specific features in their song syllables, and individual neurons typically generate action potentials in response to those features each time the syllable is produced in the trill (Prather et al. 2009; Prather et al. 2008). In response to songs with artificially accelerated trill rates, HVCX neurons maintained the temporally precise relation between auditory-evoked action potentials and specific features of each syllable. Importantly, however, individual cells typically failed to respond to some or many syllables in the accelerated trill, and failures to respond to consecutive syllables became more common with increasingly accelerated trill rates, with responses declining markedly for trill rates above 8–10 Hz. This pattern resulted in a “flickering” auditory representation of accelerated trills, with responses comprising brief strings of syllables for which a response occurred, separated by brief gaps in which the cell failed to respond to individual syllables. This pattern of auditory response is reminiscent of the pattern of broken song syntax produced by sparrows tutored on songs comprising artificially accelerated trills, in which short bursts of several accurately imitated syllables are separated by gaps of silence (Podos 1996; Podos et al. 1999).

A Possible Central Constraint on Song Imitation

Previous work on broken song syntax in swamp sparrows has implicated motor constraints as the primary factor contributing to the emergence of this atypical vocal behavior (Podos 1996). Podos and colleagues noted that swamp sparrows producing broken syntax songs as a result of being tutored with accelerated trill rate models could accurately imitate the acoustic features of individual syllables even when the models from which they had learned had trill rates far beyond the species norm. Furthermore, those birds could reproduce those syllables at the rapid rate of the model, but only in short bursts punctuated by silent gaps (Podos 1996; Podos et al. 1999). They interpreted the accurate imitation of both syllable structure and trill rate as evidence that the birds’ perception and memory of the accelerated-trill song model was not impaired. Consequently, they suggested that the stereotypic gaps in broken syntax copies of accelerated models reflect a motor constraint encountered during song development, most likely as a consequence of peripheral mechanisms not being able to sustain the demands of producing such rapid songs. In support of the idea that motor constraints are important in that behavior, broken syntax emerged primarily in the cases of the most accelerated song models, with song models accelerated to a lesser extent often reproduced with accurate phonology and in a continuous trill, but performed with slightly decreased trill rate relative to the model or with the loss of one note in the repeated syllable (Podos 1996; Podos et al. 1999; Podos, Peters & Nowicki 2004). These observations led Podos and colleagues to speculate that broken syntax emerges as a result of young birds’ difficulty reproducing the accelerated models they have memorized, calibrating those goals against the motor limits of their song performance and yielding an adult song with broken song syntax (Podos et al. 2004).

Consistent with this interpretation, northern mockingbirds (Mimus polyglottos) are apparently incapable of fully imitating the long trains of rapid trills performed by canaries and also express broken song syntax in their attempts to imitate those trills, with some evidence that learning is limited by underlying motor constraints (Zollinger & Suthers 2004). Measures of respiration during broken syntax reveal that mockingbirds do not perform the “mini-breaths” that canaries routinely perform during such prolonged and rapid trills (Hartley & Suthers 1989), and instead fully inspire periodically during the trill to respire and maintain the pressure head required for song production (Zollinger & Suthers 2004). These deeper inspirations result in the pauses within the trill that characterize broken song syntax in mockingbirds, raising the possibility that similar respiratory demands underlie the sporadic pauses evident in broken song syntax in swamp sparrows.

Our results suggest that a central neural constraint may equally well contribute to the expression of broken syntax in swamp sparrows, by revealing a limitation of the auditory processing properties of neurons known to be key in the sensory-motor mechanisms underlying song learning and development. HVCX neurons were capable of responding to individual syllables in an accelerated trill, and the relation between neural activity and specific features within each syllable was preserved. Therefore, the phonology of individual syllables remains accurately represented by HVCX neurons, even when detected as part of an accelerated trill. However, the neural representation of song syntax became less accurate with progressively accelerated trills, with neural responses containing gaps of several syllables in which no action potentials occurred. Together, these data indicate that even though acoustic features of individual syllables in abnormally rapid trills can be encoded accurately in HVC, the syntax of rapid trills may fail to be represented faithfully at this level of the brain. In other words, at least some cases of broken song syntax could constitute accurate imitation of an inaccurately encoded song model. From weekly recordings of song performances, Podos et al. (2004) gave a broad brush stroke of the variation of trill patterns produced during song development. An investigation of a more detailed timeline of song ontogeny to compare the emergence and sequence of trill development for tutor songs comprising a range of accelerated rates (normal to very rapid) could give insight into the relative contributions of inaccurate auditory coding as well as constraints on production in this form of imperfect learning.

How Limitations in Auditory Processing in HVC Could Lead to the Development of Broken Song Syntax

Prior studies suggest several ways in which the auditory activity of HVC neurons, and especially HVCX cells, could be linked to song learning. First, synapses in HVC are a potential site where auditory memories of tutor songs are stored, because auditory experience of a tutor song rapidly stabilizes and strengthens synapses onto HVC neurons, including HVCX cells, in juvenile birds lacking prior auditory experience of a tutor song (Roberts et al. 2010). Consistent with the idea that HVC encodes tutor song memories, tutor songs can evoke auditory responses from HVC neurons in juvenile songbirds (Nick & Konishi 2005; Volman 1993), and robust auditory responses can be evoked in the adult sparrow HVC by tutor songs that were either copied and rehearsed only transiently during juvenile life or, in some cases, apparently never copied at all (Prather et al. 2010). Second, the auditory activity of HVCX neurons in swamp sparrows, including 8 of the 14 HVCX neurons studied here, is closely linked to song perception. Specifically, these neurons display a response boundary to changes in note duration that closely parallels categorical perceptual boundaries displayed by sparrows for this same acoustic feature (Prather et al. 2009). Furthermore, studies in other songbird species have shown that song perception is impaired by lesions placed in HVC or in the downstream target of HVCX neurons (Area X) (Gentner et al. 2000; Scharff, Nottebohm & Cynx 1998). In the light of these two sets of findings, the flickering responses of HVCX neurons to songs with accelerated trill rate raise an intriguing alternative explanation for the emergence of broken song syntax. 
Specifically, flickering representations of accelerated trills may also be present in young sparrows as they experience accelerated-trill song models, and broken syntax may emerge at least partially because swamp sparrows encode memories of those songs as containing strings of individual syllables separated by gaps lasting several syllables in duration. These inaccurate auditory memories could in turn compromise subsequent attempts to imitate the vocal model, even in the absence of any motor constraints. Third, because auditory-responsive HVCX neurons also display motor-related activity during singing (Fujimoto et al. 2011; Kozhevnikov & Fee 2007; Prather et al. 2008), and exhibit a precise correspondence in their auditory and singing-related activity (Fujimoto et al. 2011; Prather et al. 2008), the impaired ability of HVCX neurons to faithfully follow accelerated trills may point to a central constraint in the sensorimotor activity of these neurons that could further interfere with motor aspects of imitation. Notably, the wild-caught birds that we studied were adults that sang songs with normal syntax, indicating that the flickering representation of accelerated trills is evident in birds that were previously naïve to accelerated trills and does not depend on the production of broken syntax. Our previous demonstration of a link between juvenile auditory experience and the activity of neurons in the adult HVC (Prather et al. 2010) leads us to posit that this pattern of response may also characterize the activity of HVCX neurons in naïve birds exposed to accelerated trills.

To further resolve whether and how limitations in auditory response properties of HVCX neurons lead to the emergence of broken song syntax, an important future direction will be to record the activity of HVCX neurons in juvenile swamp sparrows engaged in memorizing and then beginning to imitate accelerated-trill song models. Characterizing the activity of HVCX cells during the earliest exposure to accelerated-trill song models and tracking that activity throughout the emergence and crystallization of broken syntax will require recordings from very young sparrows collected from the wild as hatchlings and hand-reared in the laboratory. Our experience in attempting to record from very young birds indicates that the skull bones of those birds do not have sufficient density to enable the microdrive recording device to be securely implanted (pers. obs.). Therefore, future tests of the ideas proposed here will benefit from more advanced recording techniques. In addition, a large population of hand-reared birds will be required, as only a small percentage of birds tutored with accelerated-trill song models display broken syntax (Podos 1996; Podos et al. 1999). Together, these factors will make for a very challenging set of future experiments, yet the insights gained from those outcomes will be essential to test the ideas posited here and understand the basis of new forms of behavioral syntax.

Possible Influence on Song Evolution

Our results suggest a mechanism by which constraints inherent in the central nervous system can contribute to the evolution of song syntax. Gradual changes in song structure have been documented in longitudinal studies of individual populations, mostly as differences in the acoustic structure of song elements (“phonology”) attributed to errors in song copying, cultural drift, or changes due to selection (Derryberry 2009; Martens 1996; Slater 1986; Thielke 1972). It is less clear how relatively large-scale changes in song syntax might arise. Differences across clades of birds (such as the genus Melospiza) are often characterized by syntactical as well as phonological differences, and it is unclear how syntactical differences could emerge as the outcome of gradual changes in phonological structure. Broken song syntax represents a disjunct change in syntax and thus may provide a model for further investigation of how saltatory innovation in song syntax arises (Nowicki et al. 2001).

If the broken syntax phenomenon in swamp sparrow song is to serve as a model for saltatory change in the evolution of song syntax, then it is important to show that the trait can be transmitted to subsequent generations, and to show that songs having this trait remain sufficiently functional as signals. The former criterion was demonstrated by Podos et al. (1999), who showed that young swamp sparrows copy model songs presented with broken syntax as readily as they copy songs with normal syntax, and that birds that learn from broken syntax models subsequently reproduce them with a syntax matching the model. The functionality of broken syntax songs in male-male territorial interactions was tested in the field by Nowicki et al. (2001), who found them to be as effective as continuously trilled songs in eliciting male aggressive responses in field playback experiments. Broken syntax songs were also effective in eliciting female sexual responses as measured in laboratory copulation solicitation assays. In those assays, the response to broken syntax songs was lower than that to continuous trills but significantly higher than that to heterospecific songs (Nowicki et al. 2001). Thus, once broken syntax arose in a population, the trait could be transmitted culturally to the next generation, and those birds having songs with broken syntax could use them effectively as dual function signals. In support of a broadly relevant role for such processes, broken song syntax has been reported occasionally in the wild for many species that normally sing continuous trills (including Spizella passerina, Pooecetes gramineus, Geospiza fortis, and Certhidea olivacea) (described in Podos et al. 1999), revealing that broken syntax is not simply a quirk of swamp sparrows or an artifact of hand-raising birds in captivity or from tape-recorded songs. Thus, the auditory properties described here may constitute a sensory constraint affecting imitation.
Together with peripheral motor constraints on song performance, such properties and the broken syntax to which they contribute may provide a broadly relevant mechanism influencing the evolution of song syntax.

Supplementary Material

SUPPLEMENTAL FIGURE 1.

The decrement in response strength observed as a function of the degree to which song stimuli were accelerated above their natural trill rate (Fig. 4A) was also evident as a function of the absolute trill rate (symbols as in Fig. 4; N = 14 cells, 6 birds).

Highlights


  • We consider that sensory constraints may shape the emergence of novel song syntax

  • Auditory activity of sensorimotor neurons follows trills presented at normal rates

  • Gaps in neural responses emerge when trills are presented at accelerated rates

  • The “flickering” suggests a mechanism for the development of “broken song syntax”

ACKNOWLEDGEMENTS

The authors are grateful to Rindy C. Anderson for help with bird collection and to Duke University for support.

REFERENCES

  1. Derryberry E. Ecology shapes birdsong evolution: variation in habitat and morphology explains variation in white-crowned sparrow song. American Naturalist. 2009;174:24–33. doi: 10.1086/599298.
  2. Fee MS, Leonardo A. Miniature motorized microdrive and commutator system for chronic neural recording in small animals. J Neurosci Methods. 2001;112:83–94. doi: 10.1016/s0165-0270(01)00426-5.
  3. Fujimoto H, Hasegawa T, Watanabe D. Neural Coding of Syntactic Structure in Learned Vocalizations in the Songbird. The Journal of Neuroscience. 2011;31:10023–10033. doi: 10.1523/JNEUROSCI.1606-11.2011.
  4. Gentner TQ, Hulse SH, Bentley GE, Ball GF. Individual vocal recognition and the effect of partial lesions to HVc on discrimination, learning, and categorization of conspecific song in adult songbirds. Journal of Neurobiology. 2000;42:117–133. doi: 10.1002/(sici)1097-4695(200001)42:1<117::aid-neu11>3.0.co;2-m.
  5. Hahnloser RH, Kozhevnikov AA, Fee MS. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature. 2002;419:65–70. doi: 10.1038/nature00974.
  6. Hartley RS, Suthers RA. Air-Flow and Pressure during Canary Song - Direct Evidence for Mini-Breaths. Journal of Comparative Physiology A. 1989;165:15–26.
  7. Kozhevnikov AA, Fee MS. Singing-related activity of identified HVC neurons in the zebra finch. Journal of Neurophysiology. 2007;97:4271–4283. doi: 10.1152/jn.00952.2006.
  8. Lauder GV, Liem K. The role of historical factors in the evolution of complex organismal functions. In: Wake DB, Roth G, editors. Complex Organismal Functions: Integration and Evolution in Vertebrates. New York: Wiley; 1989. pp. 63–78.
  9. Marler P. Sensory templates in species-specific behavior. In: Fentress J, editor. Simpler Networks and Behavior. Sunderland, Mass.: Sinauer; 1976. pp. 314–329.
  10. Marler P, Peters S. Structural changes in song ontogeny in the swamp sparrow Melospiza georgiana. Auk. 1982;99:446–458.
  11. Martens J. Vocalizations and speciation of palearctic birds. In: Kroodsma DE, Miller EH, editors. Ecology and Evolution of Acoustic Communication in Birds. New York: Comstock; 1996. pp. 221–240.
  12. McCasland JS, Konishi M. Interaction between auditory and motor activities in an avian song control nucleus. Proc Natl Acad Sci U S A. 1981;78:7815–7819. doi: 10.1073/pnas.78.12.7815.
  13. Mooney R, Hoese W, Nowicki S. Auditory representation of the vocal repertoire in a songbird with multiple song types. Proc Natl Acad Sci U S A. 2001;98:12778–12783. doi: 10.1073/pnas.221453298.
  14. Mooney R, Prather JF, Roberts T. Neurophysiology of Birdsong Learning. In: Eichenbaum H, editor. Learning and Memory: A Comprehensive Reference. Oxford: Elsevier; 2008. pp. 441–474.
  15. Nick TA, Konishi M. Neural song preference during vocal learning in the zebra finch depends on age and state. Journal of Neurobiology. 2005;62:231–242. doi: 10.1002/neu.20087.
  16. Nottebohm F, Stokes TM, Leonard CM. Central control of song in the canary, Serinus canarius. J Comp Neurol. 1976;165:457–486. doi: 10.1002/cne.901650405.
  17. Nowicki S, Ball GF. Testosterone Induction of Song in Photosensitive and Photorefractory Male Sparrows. Hormones and Behavior. 1989;23:514–525. doi: 10.1016/0018-506x(89)90039-1.
  18. Nowicki S, Searcy WA, Hughes M, Podos J. The evolution of bird song: male and female response to song innovation in swamp sparrows. Animal Behaviour. 2001;62:1189–1195.
  19. Nowicki S, Westneat MW, Hoese W. Birdsong: motor function and the evolution of communication. Seminars in Neuroscience. 1992;4:385–390.
  20. Podos J. Motor constraints on vocal development in a songbird. Animal Behaviour. 1996;51:1061–1070.
  21. Podos J, Nowicki S, Peters S. Permissiveness in the learning and development of song syntax in swamp sparrows. Animal Behaviour. 1999;58:93–103. doi: 10.1006/anbe.1999.1140.
  22. Podos J, Peters S, Nowicki S. Calibration of song learning targets during vocal ontogeny in swamp sparrows, Melospiza georgiana. Animal Behaviour. 2004;68:929–940.
  23. Prather JF, Nowicki S, Anderson RC, Peters S, Mooney R. Neural correlates of categorical perception in learned vocal communication. Nat Neurosci. 2009;12:221–228. doi: 10.1038/nn.2246.
  24. Prather JF, Peters S, Nowicki S, Mooney R. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature. 2008;451:305–310. doi: 10.1038/nature06492.
  25. Prather JF, Peters S, Nowicki S, Mooney R. Persistent representation of juvenile experience in the adult songbird brain. Journal of Neuroscience. 2010;30:10586–10598. doi: 10.1523/JNEUROSCI.6042-09.2010.
  26. Quiroga RQ, Nadasdy Z, Ben-Shaul Y. Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering. Neural Comput. 2004;16:1661–1687. doi: 10.1162/089976604774201631.
  27. Reiner A, Perkel DJ, Mello CV, Jarvis ED. Songbirds and the revised avian brain nomenclature. Behavioral Neurobiology of Birdsong. 2004;1016:77–108. doi: 10.1196/annals.1298.013.
  28. Roberts TF, Tschida KA, Klein ME, Mooney R. Rapid spine stabilization and synaptic enhancement at the onset of behavioural learning. Nature. 2010;463:948–952. doi: 10.1038/nature08759.
  29. Ryan MJ, Brenowitz EA. The Role of Body Size, Phylogeny, and Ambient Noise in the Evolution of Bird Song. American Naturalist. 1985;126:87–100.
  30. Scharff C, Nottebohm F, Cynx J. Conspecific and heterospecific song discrimination in male zebra finches with lesions in the anterior forebrain pathway. Journal of Neurobiology. 1998;36:81–90.
  31. Slater PJB. The cultural transmission of bird song. Trends in Ecology & Evolution. 1986;1:94–97. doi: 10.1016/0169-5347(86)90032-7.
  32. Thielke G. On the origin of divergence of learned signals (songs) in isolated populations. Ibis. 1972;115:511–516.
  33. Volman SF. Development of neural selectivity for birdsong during vocal learning. Journal of Neuroscience. 1993;13:4737–4747. doi: 10.1523/JNEUROSCI.13-11-04737.1993.
  34. Zollinger SA, Suthers RA. Motor mechanisms of a vocal mimic: implications for birdsong production. Proc Biol Sci. 2004;271:483–491. doi: 10.1098/rspb.2003.2598.
