Proceedings of the National Academy of Sciences of the United States of America. 2018 Aug 1;115(33):8436–8441. doi: 10.1073/pnas.1801251115

Syringeal EMGs and synthetic stimuli reveal a switch-like activation of the songbird’s vocal motor program

Alan Bush a,b,1, Juan F Döppler a,b, Franz Goller c, Gabriel B Mindlin a,b
PMCID: PMC6099895  PMID: 30068604

Significance

The study of the integration between sensory inputs and motor commands has greatly benefited from the finding that in sleeping oscine birds, playback of their own song evokes highly specific firing patterns in neurons also involved in the production of that song. Nevertheless, the sparse spiking patterns that can be recorded from a few single neurons give limited information on the overall activity of the song system. Here, we show that this response is not restricted to the central nervous system but reaches vocal muscles. Combining this integrated measure of the activity of the system with surrogate synthetic songs, we found an all-or-nothing or switch-like activation of the song system.

Keywords: zebra finch, song system, sensory–motor integration, syrinx, electromyogram

Abstract

The coordination of complex vocal behaviors like human speech and oscine birdsong requires fine interactions between sensory and motor programs, the details of which are not completely understood. Here, we show that in sleeping male zebra finches (Taeniopygia guttata), the activity of the song system selectively evoked by playbacks of their own song can be detected in the syrinx. Electromyograms (EMGs) of a syringeal muscle show playback-evoked patterns strikingly similar to those recorded during song execution, with preferred activation instants within the song. Using this global and continuous readout, we studied the activation dynamics of the song system elicited by different auditory stimuli. We found that synthetic versions of the bird’s song, rendered by a physical model of the avian phonation apparatus, evoked very similar responses, albeit with lower efficiency. Modifications of autogenous or synthetic songs reduce the response probability, but when present, the elicited activity patterns match execution patterns in shape and timing, indicating an all-or-nothing activation of the vocal motor program.


The cooperation between different individuals, or the ability to perform coordinated and even synchronic actions, is at the foundation of many species’ success. This requires a delicate interaction between sensory and motor programs, at specific regions of the nervous system. In this work, we explore some aspects of this link, in the framework of oscine song production.

Our model will be the elicitation of motor-like patterns in the song system of oscine birds exposed, during sleep, to their own song. The avian auditory pathway indirectly connects to the song system, a set of neural nuclei necessary to generate the motor gestures required for phonation (Fig. 1A; reviewed in ref. 1). Neurons of HVC and RA (two of these nuclei) show a stereotyped sparse spiking pattern elicited by playback of autogenous song in sleeping or anesthetized animals, firing at a few specific instants of the song (2–6). In zebra finches and white-crowned sparrows these responses are highly specific to the bird’s own song (BOS) because they are not evoked by a reversed version of BOS, and similar conspecific songs, tone bursts, or combinations of simple stimuli only evoke weak responses, if any (3, 7, 8). Furthermore, the spiking times are almost identical to those of the same neurons measured during song execution (4, 6). Selective responses have also been observed in motor neurons of the hypoglossal motor nucleus (nXIIts) that innervate the syrinx, the avian vocal organ (9, 10) (Fig. 1A). The highly selective response of the song system to playbacks provides an excellent experimental opportunity to study its activation by auditory stimuli and will be our tool to explore the integration between the auditory and motor programs.

Fig. 1.

vS EMGs during song execution. (A, Top Left) Schematic representation of the nuclei of the auditory pathway (violet), song motor pathway (blue), and anterior forebrain pathways (red). (Bottom Right) Schematic representation of the syrinx and associated muscles with approximate implantation point of the bipolar electrodes in vS (red). (B, Top) Spectrogram of a typical song of a male zebra finch. Solid lines above the plot indicate song motifs. (Bottom) vS activity measured during song execution (bird ab09, first surgery). (C, Top) Spectrogram of a single motif. (Bottom) Overlay of all vS activity traces measured during motif execution. (D) Distribution of execution vS activity at the indicated times of the motif (blue histograms) and baseline vS activity (gray histogram).

One way to understand how the song system is activated by this stimulus is to study the response to slight variations of BOS that maintain the capacity to activate the system but in a slightly deteriorated way. In this regard, synthetic songs generated with a mathematical model of the avian vocal organ (SYN) have been shown to evoke firing patterns in HVC similar to those evoked by BOS (11). Nevertheless, the probability of any given neuron of responding to SYN was at most 60% of that of responding to BOS (11). This suggests that these synthetic songs are ideally suited to study the activation dynamics of the song system by auditory stimuli: good enough to evoke the highly specific response, yet sufficiently different as to produce measurable changes in that response. Moreover, the parameters in the model used to generate the synthetic songs can be continuously varied to explore changes in the activation dynamics under a smooth degradation of the stimulus.

Measuring the sparse firing pattern of a few individual neurons gives limited information on the overall activity of the system. We can overcome this difficulty by measuring efferent motor commands that provide a global, continuous, and graded readout of the song system’s activity. In this regard, the electrical activity of syringealis ventralis (vS), the largest muscle in the syrinx (Fig. 1A), has been measured during song execution (12, 13). Remarkably, this muscle spontaneously activates during sleep, showing variable patterns which sometimes resemble the execution pattern (14).

In this work, we measure electromyograms (EMGs) of the vS muscle during song execution and nocturnal auditory playback experiments. This enables the study of the overall activation dynamics of the song system elicited by autogenous and synthetic stimuli with high temporal resolution.

Results

We implanted electrodes in the syrinx of male zebra finches obtaining EMG signals with amplitudes of [0.4–1.7] mV before amplification and dynamic ranges around 30 dB compared with baseline fluctuations. Implants lasted from a few days up to 3 wk, showing a mild reduction of EMG amplitude over time in some cases. We confirmed electrode placement in vS by postmortem examinations. In some animals we observed an electrocardiogram-like (ECG) component in the signal (SI Appendix, Fig. S1 A and B), consistent with the heart rate of small birds (15). The syrinx is a deep structure located close to the heart; therefore, an ECG component is not surprising. To extract the EMG component of the signal we applied the Teager energy operator (16), a measure of local energy of the signal, because this operator greatly attenuates the ECG component (SI Appendix, Fig. S1A). We then estimated the vS activity as the log-ratio of the Teager energy to its baseline value (Methods).

Traces of vS activity show a repetitive pattern during song execution, consistent with the reiterations of the song’s motif (Fig. 1B). Overlaying these traces for all executions of the motif shows a highly stereotyped pattern, with relatively small variations across renditions (Fig. 1C and SI Appendix, Fig. S1 C and D). Furthermore, this measure of vS activity has an approximately normal distribution at each time in the motif (Fig. 1D). In three birds, we performed a second surgery in which we implanted the electrodes in a different position of the vS muscle. The EMG patterns observed in both surgeries were similar (Pearson’s correlation in [0.53–0.82], P < 10⁻⁵ in all cases; SI Appendix, Fig. S1E), suggesting a homogeneous activation of the muscle.

Nocturnal Playbacks Selectively Evoke Activation of the Syrinx.

Auditory stimuli included a recording of the BOS and its reversed version (REV), a conspecific song (CON), and a SYN, presented in random order (Methods). Fig. 2 shows all of the vS activity traces evoked by nocturnal playbacks for a bird, time-locked to each motif (additional birds in SI Appendix, Fig. S2). For REV and CON motifs, the traces fluctuate around baseline, with occasional and inconsistent increases in activity. In contrast, a significant fraction of traces in response to BOS show a consistent activation, resulting in an overall pattern reminiscent of the execution activity (Fig. 2A). Therefore, playback of the bird’s own song selectively elicits a stereotyped activation in the syrinx, consistent with the response observed in the central nervous system (3). Similarly, SYN also stimulated consistent activation of vS. Furthermore, this activation pattern is strikingly similar to that of BOS, although it was evoked in a smaller fraction of the trials (Fig. 2B). Hence, our song synthesis is also capable of eliciting selective vS activity, consistent with the response observed in HVC (11).

Fig. 2.

Syringeal activity is selectively evoked by auditory playbacks. Nocturnal playback responses to (A) the BOS, (B) SYN, (C) REV, and (D) a CON. In all cases, Top shows the spectrogram of the stimulus motif, and Bottom shows an overlay of all recorded vS responses to those motifs (bird ab09, second surgery).

Correlation Analysis Reveals Short Delays of the Evoked Syringeal Response.

For each stimulus, we approximated the response patterns as the 95th percentile of vS activity at each time point of the motif (referred to as BOS95, SYN95, CON95, and REV95; Fig. 3A). We then cross-correlated these patterns to the median vS activity during motif execution (EXE50; Fig. 3B). BOS95 and SYN95 had Pearson’s correlation coefficients ranging from 0.63 to 0.83 (Fig. 3C and SI Appendix, Table S1), with no significant difference between them (P = 0.20), but both were significantly higher than those of CON95 and REV95 (P < 10⁻⁵, ANOVA with orthogonal contrasts). The time delays of BOS95 and SYN95, with respect to the execution pattern EXE50, were less than 30 ms in all instances, with some cases showing no detectable delay (Fig. 3C). There was no significant difference in delay times between SYN95 and BOS95 (P = 0.43, paired t test).
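The delay estimate described above can be illustrated with a minimal sketch. It assumes both patterns are equal-length lists sampled at 200 Hz (5-ms steps, as in Methods); the function names `pearson` and `best_delay` and the lag search range are hypothetical and are not taken from the authors' scripts.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def best_delay(response, reference, max_lag, dt_ms=5.0):
    """Return (delay_ms, r) for the lag that maximizes Pearson's r.

    A positive delay means the response lags the reference pattern.
    Assumption: both sequences have the same length and sampling step.
    """
    n = len(reference)
    best = (0.0, -1.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = response[lag:], reference[:n - lag]
        else:
            a, b = response[:n + lag], reference[-lag:]
        r = pearson(a, b)
        if r > best[1]:
            best = (lag * dt_ms, r)
    return best
```

For example, calling `best_delay(bos95, exe50, max_lag=10)` on two hypothetical pattern vectors would search delays between −50 and +50 ms.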

Fig. 3.

Evoked vS activity correlates with the execution pattern. (A) Median vS activity pattern during motif execution (EXE50) and 95th percentiles of responses to different auditory stimuli (BOS95, SYN95, REV95, and CON95 for bird ab09). (B) Cross-correlation analyses of responses to auditory stimuli with respect to median EXE50. Arrows indicate times of maximal correlation for BOS95 (red) and SYN95 (blue). (C) Maximal correlation values and correspondent delays for BOS95 (circles) and SYN95 (triangles) of all animals (SI Appendix, Table S1). Error bars represent SEs as calculated by bootstrapping. (D, Top) Spectrogram of an entire song of bird ab09. (Middle) vS activity profile during execution of that song (gray) and the evoked profile during a nocturnal playback (red). (Bottom) Correlation spectrogram showing in gray scale the local (250-ms) correlation between the above traces for every time and delay (Methods). (E) vS activity for execution (gray) and evoked response to BOS (red) for the indicated time windows.

Upon closer examination, we noticed that for some animals the time delay of individual responses was not constant (Fig. 3 D and E). To visualize how the delay varies along the motif we constructed correlation spectrograms, in which local correlation coefficients to EXE50 are shown for all times and delays (Fig. 3D, Bottom). In this particular example, the delay of maximal correlation increases linearly with time, reaching values close to 40 ms, after which the correlation is briefly lost and later reappears with a near-zero delay. To analyze the variability in the playback-elicited vS activity, we time warped these patterns to a reference execution trace (SI Appendix, Methods and Fig. S3C). In this way, we could assess the similarity among responses independently of small phase differences between them due to variable time delays.
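The idea of a correlation spectrogram can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: traces are lists sampled at 200 Hz, the local window is 50 samples (250 ms), only non-negative delays are scanned, and the names `correlation_spectrogram` and `best_delay_per_time` are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def correlation_spectrogram(response, reference, window=50, max_lag=10):
    """Local correlation for every window start time and delay.

    spec[d][t] is the correlation between reference[t:t+window] and
    response[t+d:t+d+window], with d the delay in samples (0..max_lag).
    Assumption: response is at least as long as reference.
    """
    n_t = len(reference) - window - max_lag + 1
    return [
        [pearson(reference[t:t + window], response[t + d:t + d + window])
         for t in range(n_t)]
        for d in range(max_lag + 1)
    ]

def best_delay_per_time(spec):
    """Delay (in samples) of maximal local correlation at each window start."""
    n_t = len(spec[0])
    return [max(range(len(spec)), key=lambda d: spec[d][t]) for t in range(n_t)]
```

Plotting `spec` as a gray-scale image, with time on one axis and delay on the other, gives a picture comparable in spirit to Fig. 3D, Bottom.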

Syringeal Responses Evoked by Autogenous or Synthetic Songs Have Similar Distributions.

We performed multidimensional scaling (MDS) on all of the vS activity traces measured for the different stimuli and song executions (Fig. 4A and SI Appendix, Methods). MDS is a nonlinear visualization technique that represents the multidimensional distance between traces in a 2D plot; that is, proximate points in the MDS plot correspond to similar vS activity traces. Points representing execution traces (EXE) form a compact cluster toward the right of the plot, whereas REV and CON points form a dispersed cloud toward the left (Fig. 4A). Some of the BOS points are in the region of the REV–CON cloud and correspond to traces with baseline vS activity (Fig. 4 A, i; EXE50 shown in gray for reference). Other BOS points form a cluster toward the top right of the plot, which marginally overlaps with the execution cluster (Fig. 4 A, iv). Remarkably, SYN points have a similar distribution to that of BOS, although with a smaller fraction in the cluster adjacent to the execution points. Points in the BOS–SYN cluster correspond to traces that closely follow the execution profile but for only a portion of the motif; as the fraction of time the trace follows the execution pattern increases, the points move toward the execution cluster (Fig. 4 A, ii and iii). We show this transition in more detail in a principal component analysis (PCA) plot of the same data points (SI Appendix, Fig. S4A; note how the nonlinear MDS spreads the EXE cluster compared with PCA). MDS plots for some animals show an overlap of the playback-evoked and execution clusters, whereas others do not (SI Appendix, Fig. S4 B–E). In the latter cases, subtle differences during brief time intervals of the motif allow the separation of both clusters, even though those differences are hard to identify visually (SI Appendix, Fig. S4A; compare panels iv and v).

Fig. 4.

BOS and SYN have similar elicited response distributions. (A) MDS plot for all execution (EXE, violet) and stimuli-evoked (BOS, red; SYN, blue; REV, brown; and CON, green) vS activity traces (bird ab09). i–v show example vS activity traces of the indicated regions of the MDS plot. Gray curve in all panels corresponds to the median execution activity (EXE50). (B) Normalized projection score (Methods) of all vS activity traces of all animals (ab05, ab08, etc.), surgeries (Sx1 or Sx2), and stimuli (exe, bos, syn, con, and rev). Shared letters (a–d) indicate nonsignificant differences between conditions (pairwise K–S tests corrected by FDR; Methods). Horizontal dashed line indicates the high response threshold. (For ab08, BOS95 was used as reference pattern for score calculation instead of EXE50 because no execution was recorded.)

To perform statistical analyses, we defined the normalized projection score (ρ), a measure of similarity between vS activity traces and median execution profile (Methods). A value of zero indicates baseline vS activity, whereas unity corresponds to a trace exactly matching EXE50 in shape and amplitude. The distribution of this score is shown in Fig. 4B for all birds and stimuli. As expected, EXE traces have scores close to 1, whereas REV and CON traces have lower and more disperse values. We performed pairwise Kolmogorov–Smirnov (KS) tests correcting for the familywise false discovery rate (17) (SI Appendix, Table S1), showing that BOS scores are significantly higher than REV and CON in all cases, responses to SYN are significantly higher than those to REV and CON (except in ab10), and score values for SYN are significantly different from BOS (except in ch12).

To gain further insight into the manner in which responses to SYN differ from those to BOS, we defined high responses as those with score greater than 0.5 (ρ>0.5). Note that close to 95% of the responses to CON and REV are below this threshold. BOS elicited a significantly larger fraction of high responses than SYN in eight out of nine experiments (Fisher’s exact test). The distribution of score values for high responses was not significantly different between SYN and BOS (KS test; Fig. 4B and SI Appendix, Table S1).
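The comparison of high-response fractions can be sketched as below. The authors ran their tests in R; this standard-library Python version is only an illustration. It assumes the two-sided p-value is the sum of the probabilities of all tables no more likely than the observed one, and the names `high_response_counts` and `fisher_exact_two_sided` are hypothetical.

```python
from math import comb

def high_response_counts(scores, threshold=0.5):
    """Return (n_high, n_low), where a high response has score > threshold."""
    high = sum(1 for s in scores if s > threshold)
    return high, len(scores) - high

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-7))
```

With hypothetical score lists `bos_scores` and `syn_scores`, the table rows would be `high_response_counts(bos_scores)` and `high_response_counts(syn_scores)`.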

Onsets of Syringeal Activity Are Clustered at Hot Spots.

The previous analysis considered the vS activity elicited by a playback motif as a whole. However, in many cases the elicited response followed the median execution pattern during only a portion of the motif (Fig. 4 A, ii and iii). To explore the temporal pattern of activation, we identified activity segments as intervals of time where the measured vS activity trace closely followed the expected execution pattern of the motif (Methods and SI Appendix, Fig. S5A). For BOS and SYN these activity segments show a structure consistent with the repetitive motif structure of the song (Fig. 5A). The probability of evoking a response depends on the position of the motif within the song, with a reduced response in the first and fifth motifs (SI Appendix, Fig. S5B).

Fig. 5.

Evoked activity onsets cluster at hot spots. (A, Top) Spectrogram of full song used as auditory stimulus (bird ab05). Activity segments detected for (Middle) BOS (red) and (Bottom) SYN (blue) for all playbacks, arranged chronologically per line from top to bottom. (B, i) Spectrogram of BOS motif. (B, ii) Activity segment coverage (i.e., percentage of presented motifs that evoked vS activity) for every time of the motif, for BOS (red) and SYN (blue). Arrows indicate hot spots. Shaded regions correspond to the SE as calculated by bootstrapping. (B, iii and iv) Histograms for the onset times of activity segments for BOS and SYN (15-ms bins). The dashed horizontal line indicates the threshold for significant difference from a uniform distribution (using Bonferroni’s correction for familywise error rate at α = 0.05). (C) In-phase evoked activity can be detected after the end of the stimulus. Spectrogram of the final part of an auditory stimulus (Top; ab09) and a selected example of evoked vS activity pattern (Bottom; black). The expected vS activity based on the execution pattern and typical intermotif intervals is shown in gray. (D) Same as in C for bird ab11.

To assess if the delay of the evoked vS activity varied along the response, we performed a cross-correlation analysis against the execution pattern for each half of the activity segments. We found larger delays for the latter part of the responses in two animals (e.g., Fig. 3D, Wilcoxon signed-rank test) and no significant differences in others. (In some animals, the analysis could not be performed: ab08 had no execution pattern, and the activity segments of ab10 were too short to give informative correlations.) In no case did we find a decreasing delay along the response.

We then calculated the coverage at each time of the motif, that is, the percentage of playback motifs with detected vS activity (Fig. 5 B, ii). Remarkably, the onset times of the activity segments tend to cluster at defined instants of the motif (Fig. 5 B, iii and iv), producing sudden increases in the coverage percentage (Fig. 5 B, ii). We refer to these instants as “hot spots.” The coverage for the synthetic versions of the song is roughly similar to that of BOS, although significantly lower in some portions of the motif. Notably, responses to SYN show a subset of the hot spots observed for BOS (Fig. 5B and SI Appendix, Fig. S5 C–F).
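Coverage and onset histograms can be computed from the detected activity segments along the following lines. This is a sketch under assumptions: segments are given per presented motif as half-open `(onset_bin, offset_bin)` index pairs on a 200-Hz time base, so a 15-ms histogram bin spans 3 samples. The function names and the data layout are hypothetical.

```python
def coverage(segments_per_motif, n_bins):
    """Percentage of presented motifs with detected activity at each time bin.

    segments_per_motif: one list of (onset_bin, offset_bin) pairs per motif;
    an empty list means the motif evoked no activity.
    """
    counts = [0] * n_bins
    for segments in segments_per_motif:
        active = set()
        for onset, offset in segments:
            active.update(range(max(onset, 0), min(offset, n_bins)))
        for b in active:
            counts[b] += 1
    n_motifs = len(segments_per_motif)
    return [100.0 * c / n_motifs for c in counts]

def onset_histogram(segments_per_motif, n_bins, bin_width=3):
    """Count segment onsets in consecutive bins of bin_width samples."""
    hist = [0] * ((n_bins + bin_width - 1) // bin_width)
    for segments in segments_per_motif:
        for onset, _ in segments:
            if 0 <= onset < n_bins:
                hist[onset // bin_width] += 1
    return hist
```

Hot spots would then appear as histogram bins with counts well above what a uniform distribution of onsets predicts, and as steps in the coverage curve.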

On some occasions, we observed vS activity consistent with the execution pattern after the last motif of the playback. Furthermore, the timing of these virtual motifs was consistent with what would be expected based on the intermotif intervals (Fig. 5 C and D).

Modified Stimuli Evoke the Same Motor Gestures with Lower Probability.

Next, we measured the manner in which the response elicited by the synthetic song changed as we continuously varied a parameter of the synthesis. We chose the level of noise in the tension of the labia (σβ; SI Appendix, Methods) because previous work has shown that the response at the level of HVC in sleeping animals depends on this parameter (11). As can be observed in Fig. 6 A and B, the probability of evoking the specific response decreases at high levels of noise. Nevertheless, when a response is evoked, the same complete motor gestures are observed as in the normal responses (individual traces are highlighted in SI Appendix, Fig. S6).
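The response probability used for this comparison (defined in the Fig. 6B legend as the fraction of playbacks whose score exceeds the 90th percentile of the pooled REV and CON scores) can be sketched as follows. The linear-interpolation percentile and the function names are assumptions made for illustration.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def response_probability(scores, control_scores, q=90):
    """Fraction of scores above the q-th percentile of the control scores.

    control_scores: pooled normalized projection scores for REV and CON.
    """
    threshold = percentile(control_scores, q)
    return sum(1 for s in scores if s > threshold) / len(scores)
```

Evaluating `response_probability` once per noise level σβ, on hypothetical per-level score lists, would give one point per level of a curve like that in Fig. 6B.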

Fig. 6.

Degraded stimuli evoke normal motor gestures with lower probability. (A) Overlays of vS activity traces elicited by motifs synthesized with the indicated amount of noise (σβ) added to the parameter representing the tension of the labia, β(t). (B) Response probability, calculated as the fraction of playbacks with normalized projection score (ρ) higher than the 90th percentile of REV and CON (ρ90REV-CON). (C) Overlays of vS activity traces evoked by BOS motifs in which the indicated elements were muted. (D, Top) Labels assigned to playback-evoked motor gestures. (Bottom) Quantification of the change in motor gesture coverage for muted stimuli compared with BOS. SEs are calculated by bootstrapping. Asterisks indicate significant differences from BOS (controlling for familywise error rate at α = 0.05 using Bonferroni’s method). Bird ab18.

Finally, we studied the changes in the response obtained by muting individual elements of the motif, as has been previously done while measuring single unit firing patterns in RA (4). Muting an element of the motif strongly reduces the response following that element. Significant reductions in the elicitation probability of playback-evoked motor gestures can be observed from 50 ms after the start of the muted element to 200 ms after the end of the muted element (Fig. 6 C and D and SI Appendix, Fig. S7). Conversely, the elicitation of each playback-evoked motor gesture depends on one or more of the preceding elements. When a motor gesture is evoked, it appears in its complete form. Stated differently, eliminating motif elements reduces the probability of evoking posterior motor gestures but does not strongly affect the timing or shape of the evoked gestures.

Discussion

In this work, we found that nocturnal playbacks of the bird’s own song, or synthetic versions of that song, evoke vS activity patterns strikingly similar to those recorded during song execution. This result is consistent with the activity of telencephalic nuclei elicited by similar stimuli (2–4, 6, 11), yet it is surprising that the motor command should reach the syrinx in sleeping animals. In line with our finding, nXIIts motor neurons that project to the syrinx are activated by BOS playbacks in anesthetized animals (9, 10), and spontaneous song-like activity can be detected in vS of sleeping birds (14). Despite the activation of the syrinx, birds do not phonate while sleeping due to the lack of the corresponding respiratory gesture (14). Sturdy et al. (10) reported an entrainment of respiration by auditory playbacks, which suggests a residual coupling between the song and respiratory systems in anesthetized birds.

The spontaneous nocturnal activity of the song system has been suggested to have a functional role in memory consolidation (4, 18). On the other hand, the high variability observed during nocturnal activity suggests it could be involved in the generation of internal error signals that contribute to the stability of the motor program (14). In any case, the functional implications of this spontaneous activity reaching syringeal muscles during nocturnal motor replays remain unclear. It could be involved in the maintenance of superfast syringeal muscles (19, 20) and/or the elasticity of the syringeal membranes. On the other hand, the auditory elicitation of song-like syringeal activity in sleeping animals is not something likely to happen in nature due to high selectivity to the bird’s own song. Therefore, this behavior may be a byproduct of the evolution of the song system with no adaptive value per se (21). Be that as it may, it provides a global readout of the activation of the system well suited to explore the link between sensory and motor systems.

Notably, when activated by auditory stimuli the vS signal closely follows the execution pattern in shape and amplitude; that is, it is an all-or-nothing or switch-like response (Fig. 2). Note that in contrast to measurements of the activity of individual neurons, EMGs reflect the summed activity of tens of thousands of central neurons and, in consequence, could potentially show partial activation in individual trials. Therefore, the observed switch-like activation is a property of the song system and not of the chosen readout. The difference observed between autogenous and synthetic songs is the probability of eliciting a response, not the profile of the evoked response (Fig. 4). Fragments of the response elicited by BOS could be missing from the EMG elicited by manipulated stimuli. This was observed both using continuous manipulations of the stimuli (through continuous changes in the model used to generate synthetic songs) and using discrete manipulations (deletion of complete elements of BOS). However, whenever elicited, the fragments preserved the shape of the execution pattern (Fig. 6). This effect is reminiscent of categorical perception in humans (22, 23). In swamp sparrows, similar categorical responses to continuous variation of auditory stimuli have been reported (23, 24).

A further observation is that the moments at which the responses start are not uniformly distributed along the motif. Instead, they cluster at a few well-defined hot spots (Fig. 5B). We could not correlate these hot spots to any consistent acoustic features. Our song syntheses usually produced onsets of activity in a subset of the hot spots produced by BOS, which may partially explain the lower probability of evoking a response with SYN.

The overall delay of the vS responses (relative to the execution pattern) was less than 30 ms in all cases. This is consistent with what has been previously reported in the telencephalic nuclei RA and HVC (4, 6, 11). Furthermore, our measurements bound the delay below 10 ms for several birds (not ruling out zero delay), whereas for others we found a near-zero delay for some segments of the motif (Fig. 3 C and D). These delays are shorter than the 50 ms latency observed from auditory stimulus to activation of the hypoglossal motor neurons that innervate the syrinx (9). Therefore, our data confirm that a specific auditory event can only evoke activity patterns that come later in the motif, not the gesture causally related to the production of that particular event, as already suggested by Dave and Margoliash (4). Consistent with this notion, we found a 50-ms delay from the beginning of a muted element to the first significant reduction in the response (Fig. 6 C and D). Furthermore, in some infrequent cases, vS activity continued after the end of the last motif, in a pattern consistent with what would be expected for an additional virtual motif (Fig. 5 C and D).

In swamp sparrows, some area X-projecting HVC neurons (HVCX), which fire at precise times during a song execution, also fire at the corresponding times during playback of that same song, even in awake and freely behaving animals (25). Furthermore, these HVCX units also fire during playback of similar conspecific songs, suggesting that these auditory–vocal mirror neurons could be involved in song recognition and sender–receiver interactions (25). However, RA-projecting HVC neurons (involved in the song motor pathway; Fig. 1A) are not driven by auditory stimuli in awake swamp sparrows (23, 25) or zebra finches (26, 27). Therefore, syringeal activity evoked by auditory stimuli is unlikely to be triggered during wakefulness in either species.

If the song system has some sort of preprogrammed internal dynamics, the correct sequence of auditory stimuli could then entrain it, activating the system and evoking the response. This conceptual model is consistent with the differences observed in the responses; a suboptimal sequence of auditory stimuli, such as the ones produced by different versions of our synthetic songs, is less efficient at inducing the response, but once induced, the response follows its preprogrammed dynamic, independently of the stimulus that evoked it (Figs. 2 and 6). Notably, continuous auditory stimulation is required to maintain the response in sleeping zebra finches because muting an element of the motif greatly reduces the subsequent response (Fig. 6 C and D and ref. 4). This conceptual model is also consistent with the variable time delays observed in some birds (Fig. 3D) because coupling of nonlinear dynamical systems ordinarily results in time-varying phase differences (28).

Methods

Subjects.

Adult male zebra finches (Taeniopygia guttata) were acquired from local breeders. After a 10-d quarantine protocol, birds were housed in large cages (40 × 40 × 120 cm) with conspecific males under a 14-h/10-h light/dark cycle, had ad libitum access to food and water, and could interact visually and vocally with a conspecific female. Experimental protocols were approved by the Animal Care and Use Committee of the University of Buenos Aires.

Song Recording.

Birds’ songs were recorded 3–5 d before the surgery in custom-built acoustic isolation chambers with a microphone (SGC568; Takstar) connected to an audio amplifier and a data acquisition device (DAQ, USB-6212; National Instruments) connected to a PC. Custom MATLAB (The Mathworks) scripts were used to detect and record sounds, including a 1-s pretrigger window.

Surgery.

Custom-made bipolar electrodes were implanted in the vS muscle as previously described (12) and connected to an analog differential amplifier (225×) mounted on a backpack previously fitted to the bird (SI Appendix, Methods).

EMG Recording.

After a 4-h recovery time, birds were placed in the acoustic isolation chamber, and the amplifier mounted to the backpack was connected to the DAQ through a rotator to allow free movement. Two external 12-V batteries powered the amplifier to avoid line noise. EMG and audio signals were recorded by a PC running MATLAB, at a sampling rate of 44,150 Hz. During day/night, recordings were triggered by sound/EMG with a 1-s pretrigger window. We adjusted trigger thresholds for each bird to levels slightly higher than the basal range, minimizing the probability of missing events, even when this led to a high percentage of spurious records.

Playback Stimuli.

The biomechanics of birdsong production constrains the plausible relationships between different acoustic features of birdsong sounds, such as fundamental frequency and spectral content in the case of zebra finches (11, 29). Therefore, generating stimuli through computational implementations of these physical models reduces the number of manipulations that can be performed independently while remaining consistent with the biomechanics. SYNs of the BOS were produced as previously reported (30) (SI Appendix). Playback protocols were played at night (11–12 PM to 5–6 AM) from a speaker inside the acoustic isolation chamber. Each protocol consisted of several stimuli of similar duration including BOS, SYN, REV, and CON, presented in random order. Fifteen seconds were recorded per playback, starting 3 s before the stimulus onset. Protocols were separated by a random 10.0 ± 0.4-min interval of silence. Playback volume was set at 55 ± 5 dB (calibrated with 4-kHz beeps, YF-20 Sound Level Meter, Yu Fung). For Fig. 6 A and B we produced syntheses with different values for the labial tension noise parameter σβ. For the element muting experiment (Fig. 6 C and D) we replaced selected motif elements with background noise. In both experiments we included BOS, CON, and REV as controls.

Data Analysis.

Recordings were analyzed in Python using custom scripts (available at https://github.com/Dynamical-Systems-Lab). The envelope of the EMG signal was calculated as follows. First, the Teager energy operator (16) was applied, defined as $\phi(x_i) = x_i^2 - x_{i-1}x_{i+1}$. Then the 95th percentile of each 5-ms bin (220 samples) was determined, resulting in a 200-Hz vector that gives a robust estimate of the EMG’s Teager energy envelope ($r_\phi$). We found that this method greatly attenuated the ECG component of the signal observed in some birds (SI Appendix, Fig. S1A). We then defined $\mathrm{vS\ activity} = 10 \times \log_{10}(r_\phi / r_\phi^{\mathrm{ref}})$, where $r_\phi^{\mathrm{ref}}$ is the mode of $r_\phi$ during periods of no activity. Note that vS activity is measured in dB.
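The envelope computation described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' scripts: the function names are hypothetical, the percentile uses linear interpolation between order statistics (the interpolation rule is not stated in the text), and the reference value is passed in by the caller because the text does not say how the mode of the resting envelope was estimated.

```python
import math


def teager_energy(x):
    """Teager energy operator: phi(x_i) = x_i**2 - x_{i-1} * x_{i+1}.

    Only interior samples are returned, so the output is two samples shorter.
    """
    return [x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, len(x) - 1)]


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation (an assumption)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def teager_envelope(x, bin_samples=220, q=95.0):
    """95th percentile of the Teager energy in consecutive, non-overlapping bins.

    bin_samples=220 corresponds to the 5-ms bins mentioned in the text.
    Trailing samples that do not fill a whole bin are dropped.
    """
    energy = teager_energy(x)
    n_bins = len(energy) // bin_samples
    return [
        percentile(energy[b * bin_samples:(b + 1) * bin_samples], q)
        for b in range(n_bins)
    ]


def vs_activity_db(envelope, reference):
    """vS activity = 10 * log10(r / r_ref), in dB.

    `reference` stands in for the mode of the envelope during rest.
    The tiny floor guarding log10 against non-positive values is an assumption.
    """
    return [10.0 * math.log10(max(r, 1e-300) / reference) for r in envelope]
```

For a pure sinusoid of amplitude A and angular frequency w (radians per sample) the Teager energy is constant and equal to A² sin²(w), which gives a convenient sanity check: doubling the amplitude should raise every bin by 10 log10(4) ≈ 6 dB.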

To compare the elicited activation patterns independently of the variable delays observed (Fig. 3D), we implemented a smooth time warping (STW) algorithm that allows small nonlinearities in the time stretching (SI Appendix, Methods and Fig. S3C).

To quantify the response elicited by the different stimuli, we defined the normalized projection score $\rho = (y \cdot x)/(x \cdot x)$, where $x$ is the median of the execution patterns (EXE50) and $y$ is the vS activity trace for a particular playback trial or execution pattern. Before calculating the score, we time-aligned all of the vS activities to the average execution pattern of a selected song using our STW algorithm.
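As a minimal sketch of the projection score, assuming the traces have already been time-aligned and resampled to a common length (the STW step is not reproduced here), the score is a ratio of two dot products. The function names below are hypothetical.

```python
from statistics import median


def median_template(traces):
    """Point-by-point median across aligned execution traces (EXE50-like)."""
    return [median(column) for column in zip(*traces)]


def projection_score(y, x):
    """Normalized projection of trace y onto template x: (y . x) / (x . x)."""
    if len(x) != len(y):
        raise ValueError("trace and template must have the same length")
    xx = sum(v * v for v in x)
    if xx == 0:
        raise ValueError("template has zero energy")
    return sum(a * b for a, b in zip(y, x)) / xx
```

With this definition a trace identical to the template scores 1, a flat trace at 0 dB scores 0, and a trace with the template's shape at half amplitude scores 0.5, which is why a threshold such as ρ > 0.5 separates high from low responses.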

Statistical Analysis.

Using ρ, we performed all pairwise comparisons between stimuli for each bird using a KS test and controlled for false discovery rate at α=0.05 (17). Results are reported with a letter code in Fig. 4B. We compared the proportion of high responses (ρ>0.5) between BOS and SYN using Fisher’s exact test for count data. Finally, we compared the distribution of the high responses between BOS and SYN using a KS test. P values for these tests are reported in SI Appendix, Table S1. All statistical tests were done in R.
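The false discovery rate control cited above (17) is the Benjamini-Hochberg step-up procedure. The published analysis was run in R; the following Python sketch only illustrates the procedure on a list of P values that are assumed to come from the pairwise KS tests, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the null hypothesis is rejected
    while controlling the false discovery rate at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose P value ranks at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a P value that misses its own threshold is still rejected when a larger P value further down the ranking meets its threshold.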

Periods of evoked vS activity consistent with the execution pattern were detected in the following manner (SI Appendix, Fig. S5A). The baseline distribution of vS activity $P_0(\mathrm{vS})$ was estimated from −3 to −1 s relative to playback onset. The time-dependent distribution of vS activity during motif executions $P_x(\mathrm{vS}, t)$ was estimated from data such as those of Fig. 1C, time-locked to each playback motif. These were then used to calculate the log-probability ratio defined as $\Lambda(t) = \log_{10}\left[P_x(\mathrm{vS}(t), t+\tau)/P_0(\mathrm{vS}(t))\right]$, where $\mathrm{vS}(t)$ is the measured vS activity at time $t$ and $\tau$ is the average time delay between the evoked response and the execution pattern as calculated in Fig. 3C. Segments of evoked activity between times $t_i$ and $t_f$ were identified according to the following conditions: (i) $\Lambda(t) \geq 0 \;\forall\, t \in [t_i, t_f]$, (ii) $\Lambda(t_0) \geq 1.3$ for some $t_0 \in [t_i, t_f]$, and (iii) $\left(\int_{t_i}^{t_f} \Lambda(t)\,dt\right)/(t_f - t_i) > 0.5$. Segments shorter than 15 ms were removed, and consecutive segments separated by small gaps were consolidated if the gap accounted for less than 10% of the resulting segment. This algorithm was benchmarked against manual classification of activity segments, giving qualitatively similar results.
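The three segment conditions can be sketched as follows, assuming Λ(t) has already been computed and sampled at the 200-Hz envelope rate, so that 15 ms corresponds to 3 samples. Several choices here are assumptions rather than the published algorithm: candidate segments are taken to be maximal runs with Λ ≥ 0, the time integral in condition (iii) is approximated by the sample mean, and the consolidation of segments across small gaps is left out. The function name is hypothetical.

```python
def evoked_segments(lam, peak=1.3, mean_min=0.5, min_samples=3):
    """Find segments of evoked activity in a sampled log-probability ratio.

    Returns (start, stop) index pairs with stop exclusive. A segment is a
    maximal run of samples with lam >= 0 (condition i) whose maximum reaches
    `peak` (condition ii) and whose mean exceeds `mean_min` (condition iii,
    with the integral approximated by the sample mean). Runs shorter than
    `min_samples` samples are discarded.
    """
    values = list(lam)
    segments = []
    start = None
    # A trailing sentinel closes a run that extends to the end of the trace.
    for i, v in enumerate(values + [float("-inf")]):
        if v >= 0 and start is None:
            start = i
        elif v < 0 and start is not None:
            run = values[start:i]
            if (
                len(run) >= min_samples
                and max(run) >= peak
                and sum(run) / len(run) > mean_min
            ):
                segments.append((start, i))
            start = None
    return segments
```

Multiplying the returned indices by the 5-ms bin width converts them to times relative to the start of the trace.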

Supplementary Material

Supplementary File
pnas.1801251115.sapp.pdf (15.2MB, pdf)

Acknowledgments

We thank M. A. Suarez for veterinarian support; S. Boari, A. Sanchez, R. G. Alonso, J. Lassa Ortiz, and C. T. Herbert for help with animal care; and A. Amador and D. Margoliash for useful suggestions and discussion. This work was supported by the National Council of Scientific and Technical Research (Argentina), the National Agency of Science and Technology (Argentina), the University of Buenos Aires, and the National Institutes of Health through Grants R01-DC-012859 and R01-DC-006876.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1801251115/-/DCSupplemental.

References

  • 1. Mooney R. Neural mechanisms for learned birdsong. Learn Mem. 2009;16:655–669. doi: 10.1101/lm.1065209.
  • 2. McCasland JS, Konishi M. Interaction between auditory and motor activities in an avian song control nucleus. Proc Natl Acad Sci USA. 1981;78:7815–7819. doi: 10.1073/pnas.78.12.7815.
  • 3. Margoliash D. Preference for autogenous song by auditory neurons in a song system nucleus of the white-crowned sparrow. J Neurosci. 1986;6:1643–1661. doi: 10.1523/JNEUROSCI.06-06-01643.1986.
  • 4. Dave AS, Margoliash D. Song replay during sleep and computational rules for sensorimotor vocal learning. Science. 2000;290:812–816. doi: 10.1126/science.290.5492.812.
  • 5. Hahnloser RH, Kozhevnikov AA, Fee MS. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature. 2002;419:65–70. doi: 10.1038/nature00974.
  • 6. Hamaguchi K, Tschida KA, Yoon I, Donald BR, Mooney R. Auditory synapses to song premotor neurons are gated off during vocalization in zebra finches. eLife. 2014;3:e01833. doi: 10.7554/eLife.01833.
  • 7. Margoliash D, Fortune ES. Temporal and harmonic combination-sensitive neurons in the zebra finch’s HVc. J Neurosci. 1992;12:4309–4326. doi: 10.1523/JNEUROSCI.12-11-04309.1992.
  • 8. Margoliash D. Acoustic parameters underlying the responses of song-specific neurons in the white-crowned sparrow. J Neurosci. 1983;3:1039–1057. doi: 10.1523/JNEUROSCI.03-05-01039.1983.
  • 9. Williams H, Nottebohm F. Auditory responses in avian vocal motor neurons: A motor theory for song perception in birds. Science. 1985;229:279–282. doi: 10.1126/science.4012321.
  • 10. Sturdy CB, Wild JM, Mooney R. Respiratory and telencephalic modulation of vocal motor neurons in the zebra finch. J Neurosci. 2003;23:1072–1086. doi: 10.1523/JNEUROSCI.23-03-01072.2003.
  • 11. Amador A, Perl YS, Mindlin GB, Margoliash D. Elemental gesture dynamics are encoded by song premotor cortical neurons. Nature. 2013;495:59–64. doi: 10.1038/nature11967.
  • 12. Goller F, Suthers RA. Role of syringeal muscles in gating airflow and sound production in singing brown thrashers. J Neurophysiol. 1996;75:867–876. doi: 10.1152/jn.1996.75.2.867.
  • 13. Vicario DS. Contributions of syringeal muscles to respiration and vocalization in the zebra finch. J Neurobiol. 1991;22:63–73. doi: 10.1002/neu.480220107.
  • 14. Young BK, Mindlin GB, Arneodo E, Goller F. Adult zebra finches rehearse highly variable song patterns during sleep. PeerJ. 2017;5:e4052. doi: 10.7717/peerj.4052.
  • 15. Odum EP. The heart rate of small birds. Science. 1945;101:153–154. doi: 10.1126/science.101.2615.153.
  • 16. Kvedalen E. 2003. Signal processing using the Teager energy operator and other nonlinear operators. PhD dissertation (University of Oslo, Oslo, Norway).
  • 17. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57:289–300.
  • 18. Margoliash D, Schmidt MF. Sleep, off-line processing, and vocal learning. Brain Lang. 2010;115:45–58. doi: 10.1016/j.bandl.2009.09.005.
  • 19. Elemans CP, Mead AF, Rome LC, Goller F. Superfast vocal muscles control song production in songbirds. PLoS One. 2008;3:e2581. doi: 10.1371/journal.pone.0002581.
  • 20. Uchida AM, Meyers RA, Cooper BG, Goller F. Fibre architecture and song activation rates of syringeal muscles are not lateralized in the European starling. J Exp Biol. 2010;213:1069–1078. doi: 10.1242/jeb.038885.
  • 21. Gould SJ, Lewontin RC. The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proc R Soc Lond B Biol Sci. 1979;205:581–598. doi: 10.1098/rspb.1979.0086.
  • 22. Liberman AM, Harris KS, Hoffman HS, Griffith BC. The discrimination of speech sounds within and across phoneme boundaries. J Exp Psychol. 1957;54:358–368. doi: 10.1037/h0044417.
  • 23. Prather JF, Nowicki S, Anderson RC, Peters S, Mooney R. Neural correlates of categorical perception in learned vocal communication. Nat Neurosci. 2009;12:221–228. doi: 10.1038/nn.2246.
  • 24. Nelson DA, Marler P. Categorical perception of a natural stimulus continuum: Birdsong. Science. 1989;244:976–978. doi: 10.1126/science.2727689.
  • 25. Prather JF, Peters S, Nowicki S, Mooney R. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature. 2008;451:305–310. doi: 10.1038/nature06492.
  • 26. Rauske PL, Shea SD, Margoliash D. State and neuronal class-dependent reconfiguration in the avian song system. J Neurophysiol. 2003;89:1688–1701. doi: 10.1152/jn.00655.2002.
  • 27. Cardin JA, Schmidt MF. Song system auditory responses are stable and highly tuned during sedation, rapidly modulated and unselective during wakefulness, and suppressed by arousal. J Neurophysiol. 2003;90:2884–2899. doi: 10.1152/jn.00391.2003.
  • 28. Winfree AT. The Geometry of Biological Time. Springer; New York: 2000.
  • 29. Sitt JD, Amador A, Goller F, Mindlin GB. Dynamical origin of spectrally rich vocalizations in birdsong. Phys Rev E Stat Nonlin Soft Matter Phys. 2008;78:011905. doi: 10.1103/PhysRevE.78.011905.
  • 30. Perl YS, Arneodo EM, Amador A, Mindlin GB. Nonlinear dynamics and the synthesis of Zebra finch song. Int J Bifurcat Chaos. 2012;22:1250235.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
