The Journal of Neuroscience. 2011 Nov 9;31(45):16353–16368. doi: 10.1523/JNEUROSCI.3009-11.2011

Two Distinct Modes of Forebrain Circuit Dynamics Underlie Temporal Patterning in the Vocalizations of Young Songbirds

Dmitriy Aronov, Lena Veit, Jesse H Goldberg, Michale S Fee
PMCID: PMC3241969  NIHMSID: NIHMS340593  PMID: 22072687

Abstract

Accurate timing is a critical aspect of motor control, yet the temporal structure of many mature behaviors emerges during learning from highly variable exploratory actions. How does a developing brain acquire the precise control of timing in behavioral sequences? To investigate the development of timing, we analyzed the songs of young juvenile zebra finches. These highly variable vocalizations, akin to human babbling, gradually develop into temporally stereotyped adult songs. We find that the durations of syllables and silences in juvenile singing are formed by a mixture of two distinct modes of timing: a random mode producing broadly distributed durations early in development, and a stereotyped mode underlying the gradual emergence of stereotyped durations. Using lesions, inactivations, and localized brain cooling, we investigated the roles of neural dynamics within two premotor cortical areas in the production of these temporal modes. We find that LMAN (lateral magnocellular nucleus of the nidopallium) is required specifically for the generation of the random mode of timing and that mild cooling of LMAN causes an increase in the durations produced by this mode. In contrast, HVC (used as a proper name) is required specifically for producing the stereotyped mode of timing, and its cooling causes a slowing of all stereotyped components. These results show that two neural pathways contribute to the timing of juvenile songs and suggest an interesting organization in the forebrain, whereby different brain areas are specialized for the production of distinct forms of neural dynamics.

Introduction

Many vertebrate behaviors, such as object manipulation, social displays, and vocalizations, are characterized by precisely timed, stereotyped motor sequences (Lashley, 1951; Levelt, 1993). Often, these sequences are not innate but are acquired by juvenile animals via exploratory learning, during which the timing of behaviors can be highly variable. Little is known about the neural circuit dynamics that underlie variably timed juvenile behaviors or those that underlie the developmental emergence of stereotyped timing in adult behaviors.

Birdsong is an excellent model of motor timing and its development. Zebra finches, like most songbirds, learn their songs by imitating an adult (Immelmann, 1969; Marler, 1970). At an early learning stage, they produce subsong: highly variable vocalizations similar to human babbling (Marler, 1970; Oller et al., 1976; Doupe and Kuhl, 1999). Gradually, during the “plastic-song” stage, vocalizations become more stereotyped until, in adulthood, singing consists of a highly precise sequence of syllables and silent gaps (Tchernichovski et al., 2001; Glaze and Troyer, 2006). We use the term “early song” to refer to the earliest developmental phase, encompassing subsong and early plastic song.

Previous studies have elucidated the neural mechanisms of timing in adult songs. A key “cortical” area for mature singing is HVC (used as a proper name) (Nottebohm et al., 1976; Simpson and Vicario, 1990; Yu and Margoliash, 1996; Hahnloser et al., 2002). HVC projects to RA (robust nucleus of the arcopallium), which in turn projects to brainstem vocal and respiratory nuclei (Wild, 2004). Recent work has demonstrated that mild cooling of HVC results in uniform slowing of all song components, whereas cooling of RA has no effect on song timing (Long and Fee, 2008). Because temperature affects the speed of all cellular processes, this result suggests that the neural dynamics generated by intrinsic circuitry within HVC control song timing in adulthood (Long and Fee, 2008; Fee and Long, 2011).

In contrast to adult song, subsong does not require HVC activity; in fact, after bilateral HVC elimination, birds of all ages produce subsong-like vocalizations (Aronov et al., 2008). Rather, a premotor area for subsong production is LMAN (lateral magnocellular nucleus of the nidopallium) (Aronov et al., 2008). LMAN is the “cortical” output stage of an avian basal ganglia–thalamocortical loop (Perkel et al., 2002; Farries et al., 2005) which, like HVC, also projects to RA (Bottjer et al., 1984). During later developmental stages, LMAN continues to contribute to song production by introducing variability to singing (Kao et al., 2005, 2008; Ölveczky et al., 2005; Stepanek and Doupe, 2010), a necessary feature for exploration in trial-and-error learning (Doya and Sejnowski, 1995; Tumer and Brainard, 2007; Andalman and Fee, 2009).

Where are the neural dynamics that generate the timing of early vocalizations? Although progress has been made in understanding how song timing matures at later developmental stages (Tchernichovski et al., 2001, 2004), little is known about the earliest song timing. Thus, we first characterize the temporal features of early singing on a detailed quantitative level. We then use various circuit manipulations of premotor brain areas to understand their contributions to early song timing.

Materials and Methods

Subjects were juvenile male zebra finches between 33 and 59 days post-hatch (dph). Birds were obtained from the Massachusetts Institute of Technology breeding facility. Animal care and experiments were performed in accordance with the National Institutes of Health guidelines and approved by the local Institutional Animal Care and Use Committee.

Sound recordings.

Juvenile birds were raised in the Massachusetts Institute of Technology breeding facility, in single cages with their parents and siblings. At age 33–42 dph, they were removed and placed in custom-made sound isolation chambers. Vocalizations were recorded with custom-written Matlab software or with Sound Analysis Pro (Tchernichovski et al., 2000), which was configured to ensure that recordings were triggered on all quiet vocalizations of young birds (Aronov et al., 2008).

Song analysis.

All data analyses were performed with custom Matlab software. Sound recordings were inspected manually to remove calls and cage noise. Song syllables were segmented and characterized based on sound between 1 and 4 kHz as follows. In each recorded sound file, the sound level was determined by bandpass filtering the microphone signal between 1 and 4 kHz (order-80 linear-phase FIR filter, Matlab functions fir1 and filtfilt), squaring, and smoothing with a 2.5 ms sliding square window. Relative sound level was converted to units of decibels by computing the logarithm (base 10) and multiplying by 10. The result is referred to in the text and figures as song or syllable “amplitude.” Sound amplitude produced during singing was bimodally distributed, corresponding to syllables and gaps. Means and SDs of these two modes were estimated by fitting a Gaussian mixture (two Gaussians) to the sound level distribution using expectation maximization.
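The amplitude computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' Matlab code: it assumes the 1–4 kHz band-pass filter has already been applied, the function name `sound_level_db` is hypothetical, and the small floor used to avoid taking the logarithm of zero is an added assumption.

```python
import math

def sound_level_db(filtered, sample_rate_hz, window_ms=2.5):
    """Square, smooth with a sliding square window, and convert to decibels.

    `filtered` is assumed to be the microphone signal after band-pass
    filtering (that step is not reproduced here).
    """
    n = len(filtered)
    width = max(1, round(sample_rate_hz * window_ms / 1000.0))
    half = width // 2

    # Prefix sums of the squared signal give each window mean in O(1).
    prefix = [0.0]
    for x in filtered:
        prefix.append(prefix[-1] + x * x)

    levels = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + width)  # window is clipped at the file edges
        mean_power = (prefix[hi] - prefix[lo]) / (hi - lo)
        # Floor (assumption) keeps log10 defined during digital silence.
        levels.append(10.0 * math.log10(max(mean_power, 1e-12)))
    return levels
```

At a 40 kHz sampling rate, the 2.5 ms window spans 100 samples. Because the signal is squared before smoothing, a tenfold increase in signal amplitude raises the level by 20 dB.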

To segment sound into syllables and silent gaps, we defined an “upper threshold” as the Fisher discriminant of the two identified Gaussian modes and a “lower threshold” as 2 SDs above the mean of all values below the upper threshold. Candidate syllable onsets were defined as upward crossings of the lower threshold followed by upward crossings of the upper threshold. Similarly, candidate syllable offsets were defined as consecutive downward crossings of the upper and then lower threshold. Candidate syllables shorter than 7 ms were eliminated, and the surrounding gaps were merged. Similarly, candidate gaps shorter than 7 ms were eliminated, and the surrounding syllables were merged.
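The two-threshold rule above amounts to hysteresis thresholding followed by two clean-up passes. The sketch below is one possible reading of that description, assuming the thresholds have already been computed; the function name `segment_syllables` and the convention that offsets are exclusive sample indices are illustrative choices, not taken from the original software.

```python
def segment_syllables(levels_db, sample_rate_hz, lower_db, upper_db, min_ms=7.0):
    """Return a list of (onset, offset) sample indices; offset is exclusive."""
    candidates = []
    start = None
    reached_upper = False
    for i, level in enumerate(levels_db):
        if level > lower_db:
            if start is None:
                # Upward crossing of the lower threshold: possible onset.
                start, reached_upper = i, False
            if level > upper_db:
                reached_upper = True
        elif start is not None:
            # Downward crossing of the lower threshold: keep the segment
            # only if the upper threshold was crossed in between.
            if reached_upper:
                candidates.append((start, i))
            start = None
    if start is not None and reached_upper:
        candidates.append((start, len(levels_db)))

    min_samples = min_ms * sample_rate_hz / 1000.0

    # Pass 1: drop short candidate syllables (the surrounding gaps merge).
    kept = [(a, b) for a, b in candidates if (b - a) >= min_samples]

    # Pass 2: drop short candidate gaps by merging the flanking syllables.
    merged = []
    for a, b in kept:
        if merged and (a - merged[-1][1]) < min_samples:
            merged[-1] = (merged[-1][0], b)
        else:
            merged.append((a, b))
    return merged
```

Requiring both crossings means that a faint sound exceeding only the lower threshold is never counted as a syllable, while the onset and offset times are still taken at the more sensitive lower threshold.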

Syllables were initially analyzed by fitting an exponential function to their duration distribution using maximum-likelihood estimation (MLE). On a finite interval (a, b), maximum-likelihood analysis yields the equation

$$\frac{1}{N}\sum_{i=1}^{N} s_i = \tau + \frac{a\,e^{-a/\tau} - b\,e^{-b/\tau}}{e^{-a/\tau} - e^{-b/\tau}}$$

Here, the left side of the equation indicates the mean duration of all syllables with durations si between a and b, and τ is a time constant of the exponential distribution that has the maximal probability of producing the observed data. We used the Matlab zero-finding algorithm (fzero function) to solve the above equation for τ. This MLE fitting algorithm was also used to fit exponential functions to various components of gap duration distributions, described in Results.
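As an illustration of this fitting step, the sketch below solves the maximum-likelihood condition for an exponential truncated to (a, b): the sample mean must equal τ + (a·e^(−a/τ) − b·e^(−b/τ)) / (e^(−a/τ) − e^(−b/τ)). It uses plain bisection as a stand-in for Matlab's fzero. The function names and the search bracket are assumptions made for this example.

```python
import math

def truncated_exp_mean(tau, a, b):
    """Mean of an exponential with time constant tau restricted to (a, b)."""
    d = b - a
    q = math.exp(-d / tau)
    # Algebraically equal to tau + (a*e^(-a/tau) - b*e^(-b/tau)) /
    # (e^(-a/tau) - e^(-b/tau)), rearranged for numerical stability.
    return tau + a - d * q / (-math.expm1(-d / tau))

def fit_truncated_exponential(durations, a, b, iterations=200):
    """Maximum-likelihood time constant for durations lying in (a, b)."""
    inside = [s for s in durations if a < s < b]
    if not inside:
        raise ValueError("no durations inside the fitting interval")
    mean = sum(inside) / len(inside)
    # The model mean rises from a (tau -> 0) to (a + b)/2 (tau -> infinity).
    if not (a < mean < 0.5 * (a + b)):
        raise ValueError("sample mean is outside the range of the model")
    lo, hi = 1e-6 * (b - a), 1e4 * (b - a)  # assumed search bracket
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_exp_mean(mid, a, b) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is sufficient here because the truncated mean increases monotonically with τ, so the equation has at most one root inside the bracket.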

Syllable duration distributions were generally not exponential below 25 ms, possibly because of peripheral mechanical constraints on the production of brief sounds. Thus, we used the MLE procedure to fit exponential distributions to syllable durations between a = 25 ms and b = 400 ms. This analysis was performed on song data collected during one day of singing (∼1000–10,000 syllables). The goodness-of-fit of the exponential was estimated using the Lilliefors statistic (Lilliefors, 1969), a procedure similar to the Kolmogorov–Smirnov test but suitable for evaluating fits whose parameters are estimated from the data. This involves calculating the maximum difference between the cumulative distribution functions of the data and of the best exponential fit and then scaling this difference by (i.e., multiplying it by) the square root of the number of syllables. Distributions that were well fit by exponentials typically had a goodness-of-fit metric <2, whereas distributions that were just beginning to exhibit a protosyllable peak typically had values >2. We therefore used 2 as the threshold for distinguishing subsong from plastic song.
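The goodness-of-fit metric can be sketched as follows. The example assumes that the maximum difference between the empirical and fitted cumulative distributions is multiplied by the square root of the number of syllables, the only scaling consistent with a threshold near 2, and that the fitted distribution is an exponential truncated to (a, b). The function name `exp_fit_statistic` is hypothetical.

```python
import math

def exp_fit_statistic(durations, tau, a, b):
    """Maximum CDF difference between data and fit, scaled by sqrt(N)."""
    xs = sorted(s for s in durations if a < s < b)
    n = len(xs)
    if n == 0:
        raise ValueError("no durations inside the fitting interval")
    norm = -math.expm1(-(b - a) / tau)  # 1 - exp(-(b - a)/tau)
    worst = 0.0
    for i, s in enumerate(xs):
        fitted = -math.expm1(-(s - a) / tau) / norm  # truncated-exponential CDF
        # The empirical CDF steps from i/n to (i + 1)/n at s, so both
        # sides of the step are compared with the fitted value.
        worst = max(worst, (i + 1) / n - fitted, fitted - i / n)
    return worst * math.sqrt(n)
```

Under this scaling, a distribution with a pronounced peak yields values well above 2, whereas data that follow the fitted exponential yield values well below it.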

In plastic song, syllable duration distributions exhibited peaks at durations in the range of 48–174 ms. To quantify these peaks, we first fit an exponential to the long tail of the syllable duration distribution (200–400 ms). This exponential was then subtracted from the distribution, calculated in 1 ms bins. The peak in the residual was typically well fit by a Gaussian; thus, the best-fit Gaussian was used to estimate the amplitude and location of the protosyllable peak. In 8 of 40 plastic-song birds, the single-Gaussian fit was relatively poor (r2 < 0.9) because two peaks were present in the distributions. For these birds, we fit a mixture of two Gaussians to the data instead. The total peak size was then determined as the sum of the two individual amplitudes, and both center values were included in the analysis of syllable durations.

Air sac pressure recordings.

Air sac pressure measurements during subsong production were performed using methods similar to those described previously for adult birds (Franz and Goller, 2002; Goller and Cooper, 2004). A device was constructed using a piezoresistive pressure transducer chip (Fujikura FPM-02PG) together with an instrumentation amplifier (Burr Brown INA122).

Before surgery, birds were anesthetized with 1–2% isoflurane in oxygen. A small hole was made in the posterior thoracic air sac, and a Silastic tube (2 mm outer diameter, 5 mm length) was inserted and sutured to the rib cage. The other end of the tube was attached to the inlet tube of the pressure transducer. Another Silastic tube (0.9 mm outer diameter, 16 mm length) containing a thin wire was secured subcutaneously on the bird's back, and the device was tied to this tube. Birds began to sing within a few days after surgery, at which point the pressure sensor was connected by a thin cable to a custom mercury commutator. Sound and air sac pressure were digitized at 40 kHz and recorded simultaneously using custom software in Matlab.

We asked whether singing was influenced by the implantation of air sac pressure recording devices. Between the population of subsong birds that received device implants (n = 12) and birds that were not implanted (n = 25), there were no significant differences in any of the parameters that we used to quantify duration distributions of silent gaps (p1, τ1, p2, τ3, μ, σ, and γ; see Results); there was also no difference in the time constants of exponential fits to syllable duration distributions (p > 0.06 for all eight parameters). These measurements suggest that device implants for pressure recordings do not significantly affect the timing of syllable and gap durations.

Lesions and inactivations.

Details of the lesion and inactivation methodology, as well as the histological and electrophysiological confirmations of these techniques, have been described previously (Aronov et al., 2008). Briefly, birds were anesthetized with 1–2% isoflurane and placed into a stereotaxic apparatus. Electrolytic HVC lesions were made by passing current (100 μA for 60 s) at 10–13 locations, spaced 500 μm apart. Inactivations of HVC or LMAN were made with muscimol or tetrodotoxin, delivered using custom-built untethered reverse microdialysis probes.

Cooling.

We used two custom-built cooling devices: one for cooling HVC (a superficial brain area) and another for cooling LMAN (a nucleus ∼2 mm deep in the brain). These devices were modified from those described by Long and Fee (2008) to eliminate water cooling, making them less restrictive and thus more suitable for use in young juvenile birds. Design, construction, and testing of these devices have been described previously (Aronov and Fee, 2011). Briefly, both devices consisted of two bismuth telluride thermoelectric elements (Custom Thermoelectric) connected in series between a hot plate and a cold plate. Running electric current across these elements in one direction resulted in cooling of the cold plate; current in the opposite direction resulted in warming of the cold plate.

For HVC cooling, two gold-plated annealed silver pads (1 × 2 mm) were soldered to the cold plate. During surgery, an area of the cranium above HVC was thinned using a dental drill, and the pads were placed in contact with the thinned areas. For LMAN cooling, thermally conductive “probes” (gold-plated annealed silver, 250- or 330-μm-diameter wire) were soldered to the cold plate. These probes were thermally insulated with polyimide tubing (25 μm wall thickness) and a ∼35 μm air gap between the wire and the tubing. The bottom 1 mm of each probe was left uninsulated. During surgery, small craniotomies and duratomies were made above LMAN, and the probes were slowly inserted into LMAN, such that the uninsulated segments spanned the dorsoventral extent of the nucleus.

To remove waste heat generated by current flow through the device (Long and Fee, 2008; Aronov and Fee, 2011), a convective copper-mesh heat sink was attached to the hot plate of the device. Additionally, a “body-coupled” heat sink (silver sheets) was attached to the hot plate; the other end of this heat sink was brought in close proximity to the surface of the cranium to further stabilize the temperature of the device. Details of heat sink construction and heat removal have been described previously (Aronov and Fee, 2011).

We directly measured temperature changes produced by these devices in three birds with the HVC cooling device and in four birds with the LMAN cooling device (for details, see Aronov and Fee, 2011). Briefly, small thermocouples (40 gauge) were either attached to the cold plate of the device or implanted into the brain at various distances from the cooling surface. After surgery, birds were placed into a foam restraint and allowed to recover from anesthesia until fully alert (∼10 min). Electric current was then applied, alternating between 0.5 and −1.5 A using a programmable current source (Lambda ZUP14-60). After a switch in currents, temperature typically reached a steady-state value within ∼10 s. At 0.5 A, the device was maintained close to normal body temperature (41°C). At −1.5 A, the center of HVC (500 μm deep) was cooled by ∼7°C below normal body temperature. The amount of cooling in LMAN strongly depended on distance from the probe; for experimental birds, the average temperature change achieved was therefore estimated from a histological analysis of the implanted probe location (see below).

For cooling experiments, devices were implanted and birds were allowed to recover after surgery until the onset of singing (typically within 1 week). Birds were then tethered with light, flexible cables to custom mercury commutators. Electric current (0.5 and −1.5 A, alternating every 100 s) was delivered to the devices continuously throughout the day using the programmable current supply. Only songs produced during the last 80 s of each current pulse were used for analysis, such that only data at steady-state temperatures were included.
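The selection of steady-state data can be expressed compactly. The sketch below assumes that time is measured in seconds from the start of the alternating protocol and that the first pulse is the 0.5 A (control) pulse; both are assumptions made for illustration, as are the function names.

```python
def in_steady_state(time_s, pulse_s=100.0, settle_s=20.0):
    """True if a song onset falls in the last 80 s of a 100 s current pulse."""
    return (time_s % pulse_s) >= settle_s

def pulse_condition(time_s, pulse_s=100.0):
    """Label the pulse containing time_s (assumes the protocol starts at 0.5 A)."""
    return "control" if int(time_s // pulse_s) % 2 == 0 else "cooled"
```

Discarding the first 20 s of each pulse leaves a margin beyond the ∼10 s that the temperature needed to settle after a switch in current.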

Histological analysis.

After all experiments, animals were killed and perfused with 4% paraformaldehyde. Brains were extracted and sliced parasagittally for histological examination of the tracks created by probes (for inactivation and cooling experiments) or the extents of lesions. HVC lesions were confirmed by the absence of retrogradely labeled RA-projecting cells (for details, see Aronov et al., 2008).

Temperature in the brain is strongly dependent on the distance from the cooling or heating probe (Jafari and Higgins, 1989; Long and Fee, 2008). Consequently, the size of LMAN or positioning of probes in it can have a substantial effect on the average amount of cooling achieved in this nucleus and may vary across birds. To approximate the amount of cooling in each bird, we estimated the location of the cooling probes relative to LMAN boundaries.

To estimate the average amount of cooling in LMAN, we simulated the three-dimensional spatial profile of temperature around the probes in each bird using the standard model of thermal diffusion in cylindrical coordinates (Jafari and Higgins, 1989; for details, also see Aronov and Fee, 2011):

$$\Delta T(r) = \Delta T_p \, \frac{K_0(r/\lambda)}{K_0(r_p/\lambda)}$$

where r is the radius (distance from the central axis of the cooling probe), r_p is the radius of the probe itself, ΔT(r) is the steady-state temperature change (with respect to normal brain temperature) at radius r, ΔTp is the steady-state temperature change at the surface of the cooling probe (also with respect to normal brain temperature), K0 is the modified zero-order Bessel function of the second kind, and λ is the space constant of thermal diffusion in the brain [estimated from the data in the study by Long and Fee (2008) to be ∼1.59 mm]. We estimated ΔTp using thermal modeling and calibrations (Aronov and Fee, 2011) to be approximately −12°C. For calculating the amount of temporal stretch attributable to cooling in percent per degree Celsius, we obtained a single-value estimate of cooling by averaging the values across the two hemispheres.
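To make the radial temperature profile concrete, the sketch below evaluates the steady-state cylindrical solution in the form ΔT(r) = ΔTp·K0(r/λ)/K0(r_p/λ). Both this normalization and the probe-radius symbol r_p are assumptions of the example. K0 is computed from its integral representation K0(x) = ∫₀^∞ exp(−x·cosh t) dt using the trapezoid rule, so no external library is needed; the step size and integration limit are illustrative choices suited to arguments of roughly 0.01 and larger.

```python
import math

def bessel_k0(x, step=0.001, t_max=20.0):
    """Modified Bessel function K0(x), x > 0, by trapezoid integration."""
    if x <= 0:
        raise ValueError("x must be positive")
    n = int(t_max / step)
    total = 0.5 * math.exp(-x)  # half-weighted endpoint at t = 0
    for k in range(1, n + 1):
        # The integrand underflows to 0.0 long before t_max is reached.
        total += math.exp(-x * math.cosh(k * step))
    return total * step

def delta_t(r_mm, probe_radius_mm, delta_t_probe_c=-12.0, space_constant_mm=1.59):
    """Steady-state temperature change (deg C) at distance r from the probe axis."""
    r = max(r_mm, probe_radius_mm)  # points inside the probe take the surface value
    ratio = bessel_k0(r / space_constant_mm) / bessel_k0(probe_radius_mm / space_constant_mm)
    return delta_t_probe_c * ratio
```

For a 250-μm-diameter wire one would pass `probe_radius_mm=0.125`. The profile falls off steeply with distance, which is why probe position within LMAN matters for the average amount of cooling achieved.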

Results

Behavioral timescales of subsong

Adult zebra finch songs contain distinct syllable types of highly stereotyped durations (Fig. 1A,B). Whereas the timing of syllables and silent gaps in adult song is well described (Glaze and Troyer, 2006), less is known about the temporal structure of early song. We therefore sought to characterize the temporal features of subsong and to identify the underlying timescales of this behavior. We recorded the songs of 68 birds between the ages of 33 and 59 dph (see Materials and Methods). Figure 1C illustrates a typical song bout of a 40 dph juvenile bird. As described in detail previously (Veit et al., 2011), in birds of all ages, even the youngest recorded (<40 dph), the distribution of song amplitudes exhibited a bimodal shape, with two peaks corresponding to vocalization and silence (Fig. 1D). Thus, even early song appeared to consist of two distinct acoustic states, which we refer to as syllables (continuous segments of sound) and gaps (continuous silent intervals; Fig. 1C). We separately analyzed the distributions of syllable and gap durations.

Figure 1.

Subsong exhibits exponential distributions of syllable durations. A, Spectrogram of a typical adult zebra finch song. Three distinct syllable types are labeled A–C. B, Syllable duration distribution of all recorded renditions of this song illustrating the highly stereotyped syllable durations. C, Bottom, Spectrogram of a typical song produced by a 40 dph juvenile zebra finch. Bars above indicate detected syllables. Top trace, Song amplitude (log units, decibels) in the 1–4 kHz band. D, Distribution of song amplitudes for the song sample shown. E, Syllable duration distribution (black traces) for the same bird on (1) a linear and (2) a semi-log scale. The distribution is well fit by an exponential function (red traces). Labels i–iii indicate durations of syllables shown in C. (3) The residual (black trace) of the distribution after the subtraction of the exponential fit. Also shown is a Gaussian fit to this residual (red trace). (4) Overall syllable duration distribution (black trace) and a sum of the exponential and Gaussian fits (red trace). F, Same fits for syllable durations of a bird in the early plastic-song phase. The distribution deviates from an exponential because of the appearance of a peak (Gaussian fit). G, Fraction of birds whose syllables are well fit by exponential distributions (Lilliefors statistic; see Materials and Methods) for different age groups. Error bars indicate 95% confidence intervals for a binomial distribution. H, Syllable duration distributions for all subsong birds (gray lines) and the population average (red line). For clarity, only birds with >2000 recorded syllables are shown. I, Distribution of exponential time constants for all distributions shown in H.

In the youngest of the recorded birds, syllable durations were broadly distributed (Fig. 1E). Short syllables (<50 ms) occurred with the highest probability density, and long syllables—some as long as 400–800 ms—occurred with the lowest probability density. Above ∼25 ms, the distributions were well fit by an exponential function, as illustrated by their linearity on a semi-logarithmic plot (Fig. 1E). Most birds younger than 40 dph exhibited exponentially distributed syllable durations (16 of 24 birds; see Materials and Methods for the goodness-of-fit statistic). This fraction of birds decreased with age over a brief developmental period, such that none of the 15 birds older than 50 dph exhibited exponentially distributed syllable durations (Fig. 1F,G). For the remainder of this paper, we define “subsong” as the production of syllables whose durations are well fit by an exponential distribution. In older birds, syllable durations deviated from an exponential distribution because of the appearance of consistently timed syllables during the early plastic-song phase (Liu et al., 2004; Tchernichovski et al., 2004; Aronov et al., 2008); these syllables formed peaks in duration distributions (Fig. 1F) and will be addressed later.

An exponential distribution of durations is remarkable because it is indicative of a highly random process (Poisson process), in which the probability of terminating a syllable is uniform in time and is thus independent of when the syllable was initiated. The time constant of this exponential is a single timescale that characterizes the syllable durations of an individual bird. Across the population of subsong birds, the time constant was 89.5 ± 4.5 ms (±SEM unless otherwise specified; n = 37 birds; Fig. 1H,I).
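The link between a time-independent termination probability and an exponential duration distribution can be written out briefly. In the derivation below, the hazard rate h and the survival function S(t) are notation introduced for this sketch and do not appear in the original text.

```latex
% Assumption: h is a constant termination rate (hazard); S(t) is the
% probability that a syllable lasts longer than t.
\begin{align*}
P\bigl(\text{syllable ends in } [t, t + dt) \mid \text{duration} \ge t\bigr) &= h\,dt \\
\frac{dS}{dt} = -h\,S(t), \quad S(0) = 1 \;&\Rightarrow\; S(t) = e^{-ht} \\
p(t) = -\frac{dS}{dt} &= \frac{1}{\tau}\,e^{-t/\tau}, \qquad \tau = \frac{1}{h} \\
P\bigl(\text{duration} > t + u \mid \text{duration} > t\bigr) &= e^{-u/\tau}
\end{align*}
```

The last line is the memoryless property: the chance that a syllable continues for a further interval u does not depend on how long it has already lasted. On a semi-logarithmic plot, p(t) is a straight line with slope −1/τ.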

We next sought to characterize the timing of silent gaps in subsong. Because subsong appears to be a series of randomly timed transitions between vocal and silent periods, we expected gap durations to be distributed similarly to those of syllables. Surprisingly, however, gap duration distributions in subsong birds did not show a monotonic decrease at greater durations but typically exhibited a more complex structure, often including a secondary peak near 60 ms (Fig. 2A–D). Thus, even in young subsong birds with no stereotypy in syllable durations, gaps often appeared to exhibit some consistency in timing.

Figure 2.

Silent gaps in subsong are shaped by a diversity of respiratory patterns. Except D, all examples shown are from a single typical subsong bird (47 dph). A, Syllable duration distribution, exhibiting an exponential shape. B, Gap duration distribution, exhibiting a complex mixture of timescales including a peak at ∼70 ms. C, Same gap duration distribution shown with a logarithmic vertical axis. Red line, Exponential fit to short gaps (<30 ms). Blue line, Exponential fit to gaps at 100–200 ms. The long tail deviates from an exponential fit at ∼300 ms. D, Gap duration distribution on a log–log scale. Dashed line, Power-law fit to the data between 300 ms and 30 s. Because the tail contains very few data points, data are shown for a bird from which an exceptionally large amount of singing was recorded. E, Spectrogram showing several bouts of subsong with simultaneous air sac pressure recording. Periods of positive and negative pressure are indicated with magenta and green, respectively. Gaps labeled i–iv correspond to those labeled in B–D. Bottom, Detail of the recording, showing four EPs, during which 1, 1, 3, and 0 syllables (black bars) are produced. F, Examples of mode-1 gaps, which contain continuous expiratory pressures. G, Examples of mode-2 gaps, which are filled by a single IP. H, Examples of mode-3 gaps, which contain an IP and extended intervals of nonvocalized positive pressure. Note that mode-2 and mode-3 gaps form a continuum; although we illustrate them as two extremes here, the analysis performed does not classify individual gaps as belonging to mode-2 or mode-3. I, Examples of mode-4 gaps, which contain eupneic breathing. The top examples in F–I are the gaps highlighted in B–E (labeled i–iv).

Even outside the peak at ∼60 ms, the distributions of gap durations were more complex than the single-exponential shape of syllable duration distributions. Specifically, distributions of short gaps (less than ∼30 ms) appeared to be approximately exponential (time constant of 27.9 ± 2.4 ms, n = 34 subsong birds), as did the distributions of gaps between 100 and 200 ms (time constant of 54.9 ± 1.2 ms). However, these time constants were different (p < 0.001, paired t test; Fig. 2C), indicating the possible presence of two additional mechanistically distinct modes in gap durations. Furthermore, although gap distributions were well described by an exponential function at long durations (100–200 ms), they strongly deviated from this exponential fit for gaps longer than ∼300 ms (Fig. 2C). These long gaps approximately followed a power-law distribution (Fig. 2D), indicating the possible presence of yet another mode in gap timing.

Respiratory patterns underlying behavioral timescales in early song

What accounts for this complexity in gap timing? In particular, what causes gaps to be relatively precisely timed at ∼60 ms durations, even in young birds with no equivalent precision in syllable timing? Because inspirations occur primarily during silent intervals (Franz and Goller, 2002), we hypothesized that the distinct modes of gap timing are related to different respiratory patterns during the gaps. To characterize these patterns, we recorded air sac pressure in 12 young juvenile birds (ages 39–53 dph) during subsong production (Fig. 2E; see Materials and Methods). Silent periods between bouts typically contained a periodic pressure fluctuation at 2–3 Hz, corresponding to eupneic breathing. Subsong, like adult song, was accompanied by an increase in the amplitude of both expirations and inspirations. Syllables were almost always produced during periods of positive pressure (in 98.4 ± 0.7% of syllables, the majority of time points were expiratory; n = 12 birds). Gaps, however, were less homogeneous in their respiratory patterns. On the basis of these patterns, we were able to identify approximately four distinct “modes” of gap production, described below.

Many of the gaps were expiratory (“mode-1” gaps, 33.2 ± 2.3% of all gaps, n = 12 birds; Fig. 2F)—that is, air sac pressure stayed positive at all times between the two flanking syllables during these gaps. Although some of these gaps were up to ∼100 ms long (e.g., bottom example in Fig. 2F), most had a short duration (mean duration of 31.0 ± 1.9 ms). Conversely, most short gaps were expiratory (91.9 ± 2.2% of all gaps <25 ms). Expiratory gaps can be thought of as nonvocalized periods that break a single expiratory gesture into multiple syllables. In fact, individual expiratory pressure pulses (EPs) in subsong often contained more than one syllable, with some pulses containing as many as eight syllables (mean of 1.50 ± 0.06 syllables; Fig. 2E). The presence of so many expiratory gaps is unique to juvenile song; in adult zebra finches, nearly every gap contains an inspiration (Franz and Goller, 2002).

The remaining gaps contained at least one inspiratory pressure pulse (IP), although for gaps that were longer than ∼300 ms, this typically involved a return to eupneic breathing; these long gaps will be discussed below. Gaps that were shorter than 300 ms and contained at least one IP comprised 52.6 ± 2.1% of all gaps (n = 12 birds; Fig. 2G,H). In these gaps, periods of negative air sac pressure were brief (average IP duration of 49.4 ± 2.3 ms) and appeared to be distinct events with rapid onsets and offsets. Many gaps containing an IP also included positive pressure regions. These silent expiratory periods were on average 40.7 ± 2.9 ms long and could occur before or after the IP (examples in Fig. 2H). To describe these events, we loosely refer to those gaps that contain a single IP immediately preceded and followed by syllables as “mode-2” gaps (Fig. 2G); we refer to gaps that contain prolonged positive pressure regions in addition to at least one IP as “mode-3” gaps (Fig. 2H). Mode-2 and mode-3 gaps form a continuum, depending on the duration of the positive-pressure region. Aside from illustrating them in Figure 2, we did not classify individual gaps as belonging to one of these modes. The reasons for using two modes to describe these gaps will be further clarified when we develop a quantitative description of gap duration distributions.

Finally, 14.2 ± 1.3% of all gaps were longer than 300 ms and generally contained eupneic breathing (“mode-4” gaps, n = 12 birds; Fig. 2I). The shortest of these consisted of what appeared to be a single period of the breathing cycle. However, most mode-4 gaps were long periods of silence between subsong bouts (median ± bootstrap SE duration of 2.06 ± 0.28 s, but often tens of seconds long). Interestingly, long gaps followed a power-law duration distribution with a long tail extending to ∼30 s (power-law exponent, γ = −1.37 ± 0.03 for gaps 300 ms to 30 s long; Fig. 2D).
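The respiratory criteria described above can be summarized as a simple decision rule. The sketch below is illustrative only: it assumes pressure samples are expressed relative to ambient pressure (so that zero separates expiration from inspiration), it groups mode-2 and mode-3 gaps together because they form a continuum and were not classified individually, and the function name `classify_gap` is hypothetical.

```python
def classify_gap(pressure, duration_ms, long_gap_ms=300.0):
    """Assign a gap to a respiratory mode from its air sac pressure samples.

    `pressure` holds the samples between the two flanking syllables,
    relative to ambient pressure (an assumption of this sketch).
    """
    if all(p > 0 for p in pressure):
        return "mode-1"    # expiratory: pressure stays positive throughout
    if duration_ms > long_gap_ms:
        return "mode-4"    # long gap, generally containing eupneic breathing
    return "mode-2/3"      # short gap containing at least one inspiratory pulse
```

Checking the expiratory criterion first mirrors the order of the text, in which the remaining gaps are those that contain at least one inspiratory pressure pulse.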

Quantitative description of gap duration distributions

Our recordings suggest a model for the structure of the complex gap duration distribution in early singing (Fig. 3A). (1) The initial sharply decreasing segment of the overall gap distribution at short durations is produced by mode-1 gaps. These are produced during EPs, as are syllables, and exhibit an approximately exponential duration distribution (Fig. 3B,C). In support of this hypothesis, we found that the time constant of the gap duration distribution at short durations across all subsong birds (27.9 ± 2.4 ms, n = 34 birds) was consistent with the average time constant of expiratory gaps measured from birds in which air sac pressure was recorded (31.0 ± 1.9 ms, n = 12 birds, p = 0.48; Fig. 3D). (2) The peak in the gap duration distribution is produced by mode-2 gaps containing a consistently timed IP tightly coupled to the preceding syllable offset and the following syllable onset. This peak was approximately Gaussian in shape (Fig. 3E,F) and was centered at 55.5 ± 1.4 ms (see Materials and Methods), similar to the average IP duration (49.4 ± 2.3 ms; Fig. 3F,G). The small but significant difference between the means of these durations (p < 0.05, t test) is likely attributable to the fact that mode-2 gaps also contain a brief transition period between syllable offset and IP onset, as well as between IP offset and syllable onset. (3) The long tail in the overall gap duration distribution is produced by gaps that contain an IP surrounded by an extended period of nonvocalized positive pressure (mode-3 gaps). The duration of these gaps is formed by the sum of an IP duration and the durations of the positive pressure periods before and after the IP, the sum of which exhibited an approximately exponential distribution with a time constant of 48.3 ± 3.3 ms (measured in the tail between 50 and 150 ms; Fig. 3H). This time constant was, in fact, not significantly different from the time constant of an exponential fit to the gap distribution (54.9 ± 2.0 ms for gaps 100–200 ms long, p = 0.10; Fig. 3I,J). (4) Finally, gaps >300 ms are primarily generated by transitions to eupneic breathing (mode-4 gaps). Because of the presence of these gaps, the overall gap distribution deviated from an exponential at long durations.

Figure 3.

Gap durations are formed by the underlying respiratory patterns. A, Model of the respiratory timescales that generate different modes of gap durations. Schematics illustrate all parameters of the gap duration distributions described in Results. B, An exponential distribution (time constant τ1) is used to model the durations of expiratory (mode-1) gaps. C, Gap duration distribution of a subsong bird overlaid with the distribution of mode-1 gaps (red trace). The distributions overlap well at short durations. D, Histogram of mode-1 gap timescales (τ1), determined from exponential fits to the overall gap duration distribution at short durations (less than ∼30 ms; see Materials and Methods). Red symbols show mode-1 gap timescales directly determined from birds with air sac pressure measurements. E, A Gaussian distribution (mean μ, SD σ) is used to model the distribution of gaps whose durations are tightly coupled to IP durations (mode-2 gaps). F, Gap duration distribution and IP duration distribution (green trace), measured in the same bird and rescaled to match peak height. G, Histogram of peak centers identified in gap duration distributions, μ. Green symbols show average IP durations directly determined from air sac pressure measurements. H, Mode-3 gap durations are modeled by the sum of an exponentially distributed expiratory period duration (time constant τ3, blue trace) and a Gaussian-distributed IP duration (green trace), forming an ex-Gaussian duration distribution (orange trace). I, Gap duration distribution plotted with the distribution of positive-pressure periods in mode-3 gaps, shifted by the mean IP duration (blue trace). J, Histogram of long (100–200 ms) gap timescales (τ3). Blue symbols show timescales of the positive-pressure periods in mode-3 gaps, directly determined from air sac pressure measurements.

Understanding how gap durations are formed by the underlying respiratory patterns allowed us to quantify several aspects of respiratory timing from songs alone (Fig. 4). Because eupneic-breathing gaps tended to be long (>300 ms), we first concentrated on gaps shorter than 300 ms. We assumed that some fraction p1 of these were mode-1 gaps. Because the durations of mode-1 gaps were approximately exponentially distributed, we wanted to estimate both p1 and the time constant τ1 of this exponential. Because almost all gaps shorter than 30–40 ms were expiratory mode-1 gaps (see above), the overall gap distribution at these short durations was also well fit by an exponential and provided a good estimate of the exponential time constant of mode-1 gaps. At longer durations, the gap duration distribution tended to increase, forming a peak at ∼60 ms. We identified the duration tmin at which the distribution began to increase by finding the minimum of the smoothed distribution (1 ms bins, 20 ms square-window smoothing). (Because on some recording days in younger birds a minimum was not present, we excluded those days when determining tmin.) An exponential function was then fit to all gaps shorter than 0.75tmin using an MLE procedure (see Materials and Methods; Fig. 4, step 1). This fit provided a value of τ1 (time constant of the exponential) and p1 (area under the exponential curve).
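As a rough illustration of step 1, the sketch below estimates τ1 by maximum likelihood from gaps shorter than a cutoff and then extrapolates p1. It is not the procedure from Materials and Methods, which is not reproduced in this section. The function names, the bisection solver, the 40 ms cutoff (standing in for 0.75tmin), and the assumption that every gap below the cutoff is a mode-1 gap are all illustrative choices, and the data are synthetic.

```python
import math
import random


def truncated_exp_mean(tau, cutoff):
    """Mean of an exponential (time constant tau) truncated to [0, cutoff]."""
    return tau - cutoff / math.expm1(cutoff / tau)


def fit_tau1(short_gaps, cutoff):
    """MLE of tau for an exponential observed only below `cutoff`.

    The likelihood equation reduces to matching the truncated mean to the
    sample mean, which is solved here by bisection (an assumed solver).
    """
    sample_mean = sum(short_gaps) / len(short_gaps)
    lo, hi = cutoff / 500.0, cutoff * 1000.0
    if not truncated_exp_mean(lo, cutoff) < sample_mean < truncated_exp_mean(hi, cutoff):
        raise ValueError("sample mean is incompatible with a truncated exponential")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if truncated_exp_mean(mid, cutoff) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_p1(n_short, n_total, tau1, cutoff):
    """Fraction of all gaps assigned to mode 1 (area under the full exponential).

    Assumes every gap below `cutoff` is a mode-1 gap, then extrapolates the
    part of the exponential that lies above the cutoff.
    """
    fraction_below_cutoff = -math.expm1(-cutoff / tau1)  # 1 - exp(-cutoff / tau1)
    return (n_short / n_total) / fraction_below_cutoff


# Synthetic gaps (ms) drawn from a single exponential, so p1 should be near 1.
rng = random.Random(0)
gaps = [rng.expovariate(1.0 / 28.0) for _ in range(50000)]
cutoff = 40.0  # placeholder for 0.75 * tmin
short = [g for g in gaps if g < cutoff]
tau1 = fit_tau1(short, cutoff)
p1 = estimate_p1(len(short), len(gaps), tau1, cutoff)
```

Fitting only below the cutoff keeps the estimate of τ1 from being pulled upward by the mode-2 peak, which is why the likelihood is written for a truncated rather than a full exponential.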

Figure 4.

Analysis of gap durations in subsong (A) and early plastic song (B). (1) Gap duration distribution with exponential fit to short gap durations (red trace; see Materials and Methods). The fit was used to estimate the fraction (p1) and time constant (τ1) of mode-1 gaps. (2) Residual after the subtraction of the mode-1 exponential fit from the gap duration distribution (gray trace). Overlaid is another exponential fit to this residual at long durations (100–200 ms, blue trace). The fit was used to estimate the exponential time constant of the ex-Gaussian model of mode-3 gaps (τ3). (3) Fit of a weighted sum of a Gaussian and an ex-Gaussian distribution to the same residual, used to estimate the parameters of mode-2 and mode-3 gaps. (4) Gray traces, Residual shown in (3), after subtractions of either of the individual components of the fit (Gaussian or ex-Gaussian). Green and orange traces show the Gaussian and ex-Gaussian components of the fit separately. Amplitude of the Gaussian component (p2, green trace), describing the contribution of mode-2 gaps, is substantially greater in plastic song than in subsong. Note that individual gaps in this step have not been classified as mode-2 or mode-3; the distributions only show how much of the fit in (3) was assigned to the Gaussian and the ex-Gaussian modes. (5) Overall gap duration distribution from the same bird (gray trace) and the fit of the full model (black trace; weighted sum of an exponential, a Gaussian, and an ex-Gaussian).

The remaining fraction (1 − p1) of all gaps shorter than 300 ms were assumed to be mode-2 or mode-3 gaps. To estimate their distribution, we subtracted the mode-1 exponential fit described above from the overall gap distribution (calculated in 1 ms bins). The residual distributions tended to exhibit a peak near 60 ms and were strongly asymmetric, with a long exponential tail >100 ms (Fig. 4, step 2). Because mode-2 and mode-3 gaps form a continuum, we did not attempt to classify individual gaps as belonging to one of these modes. Rather, we assumed that some fraction p2 of the residual distribution was formed by mode-2, whereas the remaining 1 − p2 was formed by mode-3. Mode-2 gaps closely matched the durations of IPs, the durations of which exhibited approximately Gaussian distributions. We therefore described the duration distribution of mode-2 gaps as a Gaussian of mean μ and SD σ (Fig. 3E,F). Mode-3 gaps contained an IP plus additional expiratory components, the durations of which were approximately exponentially distributed. Thus, we modeled the duration distribution of mode-3 gaps as a sum of a Gaussian-distributed variable and an exponential-distributed variable, forming an ex-Gaussian distribution (Hohle, 1965). We made the further assumption that the Gaussian component of mode-3 gaps has the same center and width as mode-2 gaps. To describe the combined distribution of mode-2 and mode-3 gaps, we therefore needed four parameters: p2, mean μ and SD σ of the Gaussian, and time constant τ3 of the exponential.
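The combined mode-2 and mode-3 density described above can be written down directly. The following is a minimal sketch using the standard closed form of the ex-Gaussian density; the function names are hypothetical, and the parameter values in the example are placeholders chosen to be roughly in the range reported in the text, not fitted values from the study.

```python
import math


def gaussian_pdf(t, mu, sigma):
    """Gaussian density with mean mu and SD sigma (mode-2 gaps)."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def exgaussian_pdf(t, mu, sigma, tau):
    """Density of a Gaussian(mu, sigma) variable plus an independent
    exponential variable with time constant tau (mode-3 gaps)."""
    a = sigma * sigma / (2.0 * tau * tau) - (t - mu) / tau
    b = (mu + sigma * sigma / tau - t) / (sigma * math.sqrt(2.0))
    return math.exp(a) * math.erfc(b) / (2.0 * tau)


def mode23_pdf(t, p2, mu, sigma, tau3):
    """Weighted sum of mode-2 (Gaussian) and mode-3 (ex-Gaussian) densities.

    Both components share mu and sigma, as assumed in the text.
    """
    return (p2 * gaussian_pdf(t, mu, sigma)
            + (1.0 - p2) * exgaussian_pdf(t, mu, sigma, tau3))


# Placeholder parameters (ms); not values fitted in the study.
p2, mu, sigma, tau3 = 0.4, 55.0, 8.0, 50.0

# Numerical check on a fine grid: the density should integrate to about 1
# and have mean mu + (1 - p2) * tau3.
dt = 0.1
grid = [-100.0 + i * dt for i in range(16001)]
total = sum(mode23_pdf(t, p2, mu, sigma, tau3) for t in grid) * dt
mean = sum(t * mode23_pdf(t, p2, mu, sigma, tau3) for t in grid) * dt
```

Because the Gaussian parameters are shared, the mixture has only the four free parameters listed in the text, and p2 alone controls how much of the peak is attributed to mode-2 gaps.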

The time constant τ3 could be estimated from the long tail of the overall gap distribution (Fig. 4, step 3). The distribution was typically uncontaminated by the Gaussian at durations >100 ms. However, it deviated from the exponential at durations longer than ∼300 ms due to the presence of mode-4 gaps. We thus estimated τ3 by fitting an exponential to the distribution between 100 and 200 ms. The remaining parameters (p2, μ, and σ) were estimated using nonlinear optimization to minimize the mean squared difference of the fit from the distribution.

It is not obvious, a priori, that mode-2 and mode-3 gaps should be modeled as two distinct modes of the distribution. Can the distribution of gap durations be adequately explained with only the ex-Gaussian model used for mode-3 gaps? By fitting a general Gaussian/ex-Gaussian mixture to the data, we found that the extracted mode-2 (Gaussian) component is often a sizable fraction of the mixture and that this component is necessary to accurately describe the peak of the distributions (Fig. 4, step 4). Furthermore, as will be shown later, the amplitude of this mode-2 component increases with respect to the mode-3 component during development, and this relative increase is abolished by HVC lesions.

In summary, a combination of four temporal modes accurately describes the complex distributions of gap durations in early song (Fig. 4, step 5). Because these temporal modes arise from distinct respiratory patterns, this model allowed us to extract various features of respiratory timing from singing alone. The goal of such extraction is to allow one to quantify the effects of various brain manipulations (described later) on respiratory timing, even in the absence of simultaneous air sac pressure measurements.

Early development of stereotyped timing

We next sought to understand how the temporal structure of singing changes during the period of early development. Figure 5A shows the songs of a typical bird recorded during the period of transition from subsong to early plastic song. Although the spectral changes of songs during this period were relatively subtle, the temporal changes were pronounced. At ∼45 dph, birds began to produce syllables with consistently timed durations, which resulted in the appearance of peaks in syllable duration distributions (Figs. 1F, 5B,C). Adult and late plastic songs in zebra finches usually exhibit multiple peaks in syllable duration distributions, corresponding to distinct syllable types (Tchernichovski et al., 2001, 2004). In the early plastic-song phase we analyzed, however, distributions commonly exhibited only one peak (32 of 40 birds had no detectable second peak; see Materials and Methods). Such single peaks diversify into multiple distinct syllable types later in development (Liu et al., 2004; Tchernichovski et al., 2004). We therefore refer to the syllables of this earliest stereotyped duration as “protosyllables.” Protosyllables became more common over time, resulting in the growth of the peak in syllable duration distributions. The duration of protosyllables varied greatly across birds (48–174 ms; mean ± SD, 93.8 ± 32.0 ms; n = 48 syllable duration peaks in 40 birds; Fig. 5D).

Figure 5.

Early song development is characterized by appearance of consistent timing. A, Spectrograms of songs produced by one bird at three developmental stages. Red circles, Syllables 140–240 ms long. Blue circles, Gaps 30–90 ms long. B, Developmental progression of syllable and gap duration distributions from the same bird. Brackets indicate duration ranges marked above (red and blue circles). C, Heights of peaks in syllable and gap duration distributions (corresponding to protosyllables and protogaps), quantified across all birds. Bars indicate median values across birds in a given age group; error bars are bootstrap SEMs. D, Distributions of protosyllable and protogap durations across all birds. E, Probability distribution of all consecutive syllable and gap durations in an early plastic-song bird. Each row of the color-coded matrix is individually normalized to a sum of 1, such that values indicate syllable durations conditional on gap duration. Also shown are the overall gap duration distribution (blue trace, left) and syllable duration distribution (red trace, top). F, Duration distributions of syllables that precede and follow gaps of three different duration ranges. Protogaps and protosyllables tend to follow each other.

As discussed above, the peak in gap duration distributions corresponded primarily to mode-2 gaps (Gaussian component of the distribution); by analogy, we refer to these consistently timed gaps as “protogaps.” Just like protosyllables, protogaps also became more common during development (Figs. 4B, 5B,C). Specifically, this was observed as an increase in the contribution of mode-2 gaps (Gaussian component of the duration distribution, parameter p2 described above) relative to the contribution of mode-3 gaps (ex-Gaussian component, 1 − p2) in the quantitative model described above. In contrast to the peaks in syllable duration distributions, the protogap peaks (mode-2) in gap duration distributions were at shorter durations and more narrowly distributed across birds (mean ± SD, 53.3 ± 8.0 ms; n = 59 birds; Fig. 5D).

Protosyllables and protogaps appeared to be strung together into long, rhythmic sequences. For example, the bird in Figure 5E exhibited a peak in gap durations at ∼60 ms and a peak in syllable durations at ∼150 ms. Syllables with durations close to this peak occurred almost exclusively before and after the ∼60 ms protogaps; they were rarely present before and after gaps <25 ms or >100 ms (Fig. 5F). To analyze this sequencing, we identified protosyllables and protogaps as those syllables and gaps whose durations were within 20 ms of the corresponding peaks in the distributions. Across 23 plastic-song birds, 35.3 ± 1.8% of syllables were identified as protosyllables. Incidences of protosyllables before and after protogaps were higher than this value (43.5 ± 2.2 and 40.6 ± 2.0%, respectively; p < 0.05 and p < 0.01). Similarly, the incidences of protogaps before and after protosyllables (54.2 ± 2.4 and 51.3 ± 2.9%, respectively) were also higher than expected from the overall fraction of protogaps (44.0 ± 2.0%; p < 0.05 for both comparisons). These numbers indicate that song transitions in which a protosyllable follows a protogap are 16% more likely than expected from random ordering of syllables and gaps; transitions from a protosyllable to a protogap are 24% more likely. Consistent with these transition probabilities, the incidence of protosyllables following other protosyllables was slightly higher than expected by chance (40.0 ± 2.1 vs 35.3 ± 1.8%, p < 0.001).

Involvement of HVC and LMAN in early song timing

Our results indicate that subsong involves an interplay of processes that produce randomly timed durations of syllables and gaps with those that produce consistently timed syllables and gaps. Which brain areas are involved in the timing and development of these processes? It has been reported previously that subsong persists in the absence of HVC and that subsong syllables are essentially unchanged by bilateral HVC lesions. HVC lesions also abolish the production of distinct stereotyped syllables in plastic song and adult song (Aronov et al., 2008). In contrast, LMAN lesions or inactivation produce an increase in song stereotypy in plastic song and adult birds (Kao et al., 2005; Ölveczky et al., 2005). Here we examine the different roles of HVC and LMAN in the control of the temporal structure of early song vocalizations (see Figs. 6–8).

Figure 6.

Effects of HVC elimination on early singing. A, Typical spectrograms of a subsong bird before and after complete bilateral HVC lesions. Blue circles, Gaps 30–90 ms long. Syllable duration (B) and gap duration distributions (C) for the same bird before and after the lesions. Bracket indicates the duration range marked above. D–F, Same plots for an early plastic-song bird before and after HVC lesion. Red circles, Syllables 80–120 ms long. Blue circles, Gaps 30–90 ms long. G, Analysis of the temporal modes in normal plastic-song gap distribution. Data are plotted the same way as in Figure 4. H, Analysis of gaps produced by the same bird after bilateral HVC lesions. Note that the major effect of the lesion was the elimination of mode-2 gaps [(4), Gaussian fit, green trace].

Figure 7.

Effects of LMAN elimination on early singing. A, Typical spectrograms of an early plastic-song bird before and after bilateral LMAN inactivation. Red circles, Syllables 60–110 ms long. Blue circles, Gaps 30–60 ms long. Syllable duration (B) and gap duration (C) distributions for the same bird before inactivation (black trace) and after inactivation (green trace). Brackets indicate duration ranges marked above.

Figure 8.

Summary of bilateral HVC and LMAN elimination experiments across birds. Note that only HVC eliminations are shown for subsong birds because these do not sing after LMAN inactivations (Aronov et al., 2008). Left column, Schematic illustrations of the effects of HVC elimination (orange arrows) and LMAN elimination (green arrows) on syllable and gap distributions. A, Scatter plot of the size of the peak in syllable durations (protosyllables) before and after HVC elimination. B, Size of the peak in gap durations (protogaps) before and after HVC elimination, calculated as the magnitude of the components corresponding to mode-2 gaps. C, Fraction of mode-1 (expiratory) gaps before and after HVC elimination. D–F, Effects of LMAN elimination on the same quantities as those shown in A–C.

To characterize the role of HVC in early song, we quantified the distributions of syllable and gap durations in young birds before and after bilateral elimination of HVC (see Materials and Methods). As reported previously (Aronov et al., 2008), HVC elimination did not change syllable duration distributions in subsong birds (n = 8 birds; Fig. 6A,B). However, it abolished the peaks corresponding to protosyllables in plastic song (n = 11 birds, peak size of 10.9 ± 2.6 s−1 across birds, 1.3 ± 0.2 s−1 in the same birds after HVC elimination, p < 0.001; Fig. 6D,E; population results are summarized in Fig. 8A).

Similarly, the peak in gap duration distributions corresponding to protogaps was mostly abolished in both subsong and plastic-song birds by HVC elimination (Fig. 6C,F). Analysis of these distributions (Fig. 6G,H) showed that HVC elimination specifically abolished the production of mode-2 gaps, as determined from the fitting parameter p2 in the quantitative model described above (peak size of 2.8 ± 0.7 s−1 across birds, 0.35 ± 0.05 s−1 after HVC elimination, p < 0.01; Fig. 8B). This result indicates that HVC is necessary for the production of consistently timed protogaps whose duration is tightly coupled to the onsets and offsets of IPs. [Respiratory mechanisms of the involvement of HVC in protosyllable and protogap production have been detailed by Veit et al. (2011)].

In contrast to protogaps, short expiratory (mode-1) gaps in subsong birds were essentially unaffected by HVC elimination (Fig. 6C,F). The peak probability density for short gaps was 13.6 ± 1.0 s−1 across all birds and 13.4 ± 1.0 s−1 in the same birds after HVC elimination (n = 8 birds, p = 0.84; Fig. 8C), indicating that expiratory gaps may be produced by an HVC-independent mechanism distinct from that which produces inspiratory gaps.

Our results so far indicate that HVC is necessary for the production of consistently timed protosyllables and protogaps but not for the production of the exponentially distributed subsong syllables and mode-1 gaps. Because LMAN plays a major role in early song production (Ölveczky et al., 2005; Aronov et al., 2008), we hypothesized that LMAN activity is necessary for the generation of these song components. To test this possibility, we inactivated LMAN in birds producing early plastic song (n = 9 birds, ages 45–56 dph; see Materials and Methods). As shown previously (Ölveczky et al., 2005), plastic songs during LMAN inactivation were strikingly less variable than those before inactivation (Fig. 7A). LMAN-inactivated birds produced distinct syllable types, which formed one or more peaks in the distributions of syllable durations (Fig. 7B). These peaks were the dominant feature of syllable timing during LMAN inactivation (peak size of 8.8 ± 2.2 s−1 across birds, 25.9 ± 5.3 s−1 in the same birds after LMAN inactivation, p < 0.01; Fig. 8D).

The mode-2 peak in gap duration distributions also increased during LMAN inactivation (peak size of 9.3 ± 3.2 s−1 across birds, 28.0 ± 6.8 s−1 in the same birds after LMAN inactivation; Figs. 7C, 8E), whereas mode-1 gaps were essentially eliminated (peak probability density of 15.0 ± 2.6 s−1 across birds, 1.48 ± 0.49 s−1 after LMAN inactivation; Fig. 8F). Thus, the effect of LMAN inactivation is essentially opposite to that of HVC elimination, which left mode-1 gaps intact but abolished the peaks in gap duration distributions corresponding to protogaps.

In summary, Figure 8 quantifies across birds the opposing effects of HVC and LMAN elimination on syllable and gap production. These results suggest that different temporal components of syllables and gaps may be produced by distinct neural mechanisms, one of which is HVC dependent and the other is LMAN dependent.

Role of HVC and LMAN dynamics in the control of song timing

The HVC and LMAN elimination experiments show that these brain areas are necessary for the production of stereotyped and randomly timed durations in early song, respectively. However, they do not tell us how the timing of these song components is controlled. One possibility is that the biophysical dynamics within HVC control the durations of protosyllables and protogaps, just as the timing of syllables and gaps in adult birds is controlled by HVC (Long and Fee, 2008). An alternative hypothesis is that the timing of these early stereotyped song features is controlled elsewhere in the song system, perhaps by brain areas upstream of HVC, such as nucleus uvaeformis (Nottebohm et al., 1982; Ashmore et al., 2008). To distinguish between these alternatives, we used localized cooling of HVC to slow its intrinsic biophysical dynamics (Long and Fee, 2008; Fee and Long, 2011). We developed a new lightweight Peltier device applicable for temperature manipulations in young juvenile birds (see Materials and Methods; Fig. 9). HVC was cooled bilaterally by an average of ∼7°C (Fig. 9A), and the songs produced during cooling were compared with those produced when HVC was maintained at normal body temperature. We isolated various components of gap and syllable duration distributions and quantified their fractional change divided by the average cooling in HVC (in units of percent change per degree Celsius). A negative stretch indicates a lengthening of song features at lower temperatures.
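The stretch measure described above amounts to a short calculation, sketched here. The exact formula is an assumption inferred from the description (fractional change divided by the temperature change, with the sign convention that lengthening under cooling is negative); the function name and the example numbers are illustrative and do not come from the study.

```python
def stretch_percent_per_degree(duration_control, duration_cooled, delta_temp_c):
    """Fractional duration change, in percent per degree Celsius.

    delta_temp_c is the cooled temperature minus the control temperature,
    so it is negative for cooling. A feature that lengthens when the brain
    area is cooled therefore yields a negative stretch.
    """
    if delta_temp_c == 0:
        raise ValueError("temperature change must be nonzero")
    fractional_change = (duration_cooled - duration_control) / duration_control
    return 100.0 * fractional_change / delta_temp_c


# Illustrative numbers only: a 100 ms feature lengthening to 110 ms
# under 7 degrees of cooling gives about -1.43 % per degree Celsius.
example = stretch_percent_per_degree(100.0, 110.0, -7.0)
```

Normalizing by the temperature change makes effects comparable across birds and across brain areas that were cooled by different amounts.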

Figure 9.

Calibration of devices for cooling HVC and LMAN in juvenile zebra finches. A, Left, Schematic of the device for cooling HVC using cooling pads placed against thinned cranium above HVC. Right, Temperature change at the cooling plate of the device and in the center of HVC measured in an awake bird. Current is alternated between −0.5 and 1.5 A every 100 s. The same protocol is used for all cooling experiments reported here. B, Left, Schematic of the device for cooling LMAN using thermally conductive probes. Right, Temperature change in LMAN at various distances from the cooling probe of the device using the same electric current protocol. C, Approximate locations of cooling probes in LMAN for all analyzed birds. The probes in the left and right hemisphere of each bird are indicated by circles of matching color. Note that probe diameter was 250 μm in some birds and 330 μm in others (indicated by smaller and larger circles). A, Anterior; L, lateral. D, Example of simulated temperature around the probe in a horizontal section of LMAN (see Materials and Methods).

In subsong birds, HVC cooling had no effect on the durations of syllables (Fig. 10A; 0.06 ± 0.08% °C−1, n = 5 birds, p = 0.45, paired t test for all cooling comparisons), consistent with the finding that HVC is not necessary for the production of normal subsong syllables. There was also no effect on the long exponential tail of syllable durations in early plastic-song birds (n = 8 birds, p = 0.84). In contrast, HVC cooling in plastic-song birds caused the peak in syllable duration distributions (protosyllables) to be shifted toward longer durations (Fig. 10B; −1.36 ± 0.16% °C−1, p < 0.001). Thus, HVC dynamics are involved in controlling the durations of protosyllables in plastic song, whereas they have no influence on the broadly distributed durations of syllables at any developmental stage.

Figure 10.

Biophysical dynamics intrinsic to HVC and LMAN are involved in timing different components of early singing. A, Distributions of subsong syllable durations produced by a single bird at normal body temperature and during HVC cooling, plotted on a semi-logarithmic scale. HVC cooling had no effect on subsong syllable durations. B, Syllable duration distributions for a plastic song bird on a linear scale. HVC cooling increased protosyllable durations. C, Effect of HVC cooling on gap duration distributions. (1) Detail of the gap duration distribution at short durations, plotted on a semi-logarithmic scale. HVC cooling had no effect on mode-1 (expiratory) gaps. (2) Detail of the gap duration distribution on a linear scale, showing the protogap peak. HVC cooling prolonged protogaps. D–F, Effects of LMAN cooling on syllable and gap durations, plotted as in A–C. D, LMAN cooling prolonged subsong syllables. E, LMAN cooling had no effect on the durations of protosyllables in plastic song. F, LMAN cooling increased the duration of mode-1 gaps (1) but had no effect on the durations of protogaps (2). G, Population summary of the effect of HVC cooling (orange) and LMAN cooling (green) on subsong syllable durations. Each dot indicates the effect for an individual bird. Open dots indicate the examples shown in A and D. Error bars indicate SEs across all birds. H–J, Population summary of the effects of HVC and LMAN cooling on protosyllable durations, mode-1 gap timescales, and protogap durations. Asterisks indicate features that showed a significant change with cooling across the population of birds (p < 0.05).

Like protosyllables, protogaps also became longer with HVC cooling (Fig. 10C); the peak in gap durations changed by −1.44 ± 0.11% °C−1 (n = 8 birds, p < 0.001). However, the durations of mode-1 (expiratory) gaps did not change with HVC cooling (0.18 ± 0.30% °C−1, p = 0.58). There was also no effect of HVC cooling on the time constant of long (100–200 ms) gaps, which quantifies the expiratory periods in mode-3 gaps (p = 0.90). These data suggest that biophysical dynamics within HVC are involved in controlling the duration of protogaps early in development but not the randomly timed durations of mode-1 or mode-3 gaps.

Our results so far indicate that HVC dynamics are involved in the timing of both protogaps and protosyllables—the stereotyped components of early singing—but appear not to be involved in the control of randomly timed subsong syllable and gap durations. What controls the duration of these song components? We hypothesized that biophysical dynamics within LMAN may be involved in the timing of these latter song components. To test this possibility, we cooled LMAN bilaterally in subsong and plastic-song birds by ∼6–8°C (see Materials and Methods; Fig. 9B–D) and analyzed the effects of cooling on gap and syllable durations.

LMAN cooling indeed prolonged the durations of subsong syllables (Fig. 10D). Although distributions of syllable durations remained exponential (five of five birds, Lilliefors statistic; see Materials and Methods), the time constant of this exponential became longer during LMAN cooling (−1.86 ± 0.52% °C−1, n = 5 birds, p < 0.05). This suggests that dynamics in LMAN are actively involved in the timing of subsong syllables, i.e., determining when a syllable is terminated relative to its onset. In plastic-song birds, LMAN cooling had no effect on the duration of protosyllables (Fig. 10E; −0.60 ± 0.32% °C−1, n = 6 birds, p = 0.12). However, the long tails of the syllable distributions (longer than the protosyllable) became longer when LMAN was cooled (p < 0.05; data not shown).
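For readers unfamiliar with the Lilliefors statistic mentioned above, the sketch below computes it for the exponential case: the largest distance between the empirical cumulative distribution and an exponential whose mean is estimated from the same data. This is a generic illustration, not the procedure from Materials and Methods; the function name is hypothetical, the samples are synthetic, and the critical values needed for a formal test (which differ from the standard Kolmogorov–Smirnov values because the mean is estimated) are not included.

```python
import math
import random


def lilliefors_exponential_statistic(durations):
    """Maximum distance between the empirical CDF and an exponential CDF
    whose mean is estimated from the data themselves."""
    xs = sorted(durations)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - math.exp(-x / mean)
        # The empirical CDF jumps from i / n to (i + 1) / n at x.
        d = max(d, (i + 1) / n - model, model - i / n)
    return d


# Synthetic samples: one exponential, one uniform (clearly non-exponential).
rng = random.Random(1)
d_exponential = lilliefors_exponential_statistic(
    [rng.expovariate(1.0 / 80.0) for _ in range(5000)])
d_uniform = lilliefors_exponential_statistic(
    [rng.uniform(0.0, 1.0) for _ in range(5000)])
```

A small statistic, as for the exponential sample, is consistent with an exponential distribution, whereas the uniform sample yields a much larger value.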

The effects of LMAN cooling on gaps were similar: cooling did not change the durations of protogaps (Fig. 10F; −0.005 ± 0.178% °C−1, n = 6 birds, p = 0.98). However, mode-1 gaps became longer with LMAN cooling (change in slope of exponential distribution of −4.13 ± 0.60% °C−1, p < 0.002). Similarly, LMAN cooling stretched long (100–200 ms) gaps, indicating that LMAN is involved in timing of positive-pressure periods of mode-3 gaps as well (significant decrease in slope of exponential distribution, p < 0.02; data not shown).

In summary, the biophysical dynamics intrinsic to HVC and LMAN both play a role in the timing of subsong and early plastic song, but these roles are distinct (Fig. 10G–J). LMAN dynamics are involved in controlling the durations of subsong syllables and the positive-pressure components of gaps (mode-1 and mode-3). The timing of these events appears to be independent of HVC. Conversely, HVC dynamics play a role in the timing of protogaps and protosyllables in early song. Durations of these HVC-controlled events appear to be independent of LMAN.

Discussion

Early singing is characterized by two distinct modes of timing

Characterizing the temporal structure of a behavior is necessary to understand how the brain learns and generates the timing of behavioral sequences (Lipkind and Tchernichovski, 2011). We found that syllable durations in young birds are exponentially distributed, indicating an extreme level of randomness in the syllable generation process. Over a brief developmental period, peaks formed by protosyllables emerge and gradually become the dominant components of syllable duration distributions (see also Veit et al., 2011), a phenomenon that represents the earliest consistent syllable timing. Importantly, this process is not a simple developmental reduction in the variability of syllable durations. Rather, it suggests the presence of two mechanistically distinct processes: one that produces random, exponentially distributed durations and another that produces repeatable durations from one rendition of a syllable to another.

Silent gaps in early singing also exhibited durations formed by a mixture of distinct temporal modes, including exponentially distributed expiratory gaps and consistently timed (∼60 ms) protogaps that contain IPs. Such features are consistent with previous reports of respiratory patterns in juvenile cardinals (Suthers, 2004). Protogaps became increasingly common during development, perhaps arising from an increasing coordination between vocalizations and respiration (Veit et al., 2011) that eventually leads to the precise coordination of these processes in adult song (Hartley and Suthers, 1989; Franz and Goller, 2002).

LMAN and HVC contribute differentially to the modes of song timing

By manipulating two premotor nuclei, LMAN and HVC, we characterized their roles in producing random and stereotyped modes of early song timing. Eliminating LMAN activity specifically disrupted the production of randomly timed components of syllable and gap duration distributions but left intact the consistently timed protosyllables and protogaps. Furthermore, mild cooling of LMAN prolonged randomly timed song components (subsong syllables and expiratory components of gaps) but had no effect on protosyllable and protogap durations. These results suggest that LMAN is involved in producing randomly timed syllables and gaps and that the biophysical dynamics intrinsic to LMAN circuitry are actively involved in controlling the durations of these song elements.

In contrast to LMAN manipulation experiments, we found that neither lesions nor cooling of HVC affect exponentially distributed syllable durations in subsong. Instead, HVC cooling increases the durations of protosyllables and protogaps, whereas HVC elimination entirely abolishes their production. These results are consistent with experiments in adult birds showing that adult motifs, including syllables and gaps, are uniformly slowed by HVC cooling (Long and Fee, 2008). Because sequences of protosyllables and protogaps developmentally transform into adult song motifs (Tchernichovski et al., 2001; Tchernichovski and Mitra, 2002), these results suggest that the biophysical processes within HVC controlling the timing of early song elements may be related to those controlling the timing of adult songs.

Our results do not eliminate the possibility that other brain areas are also involved in generating the dynamics that control early song timing. LMAN is reciprocally interconnected with other brain areas as part of a basal ganglia–forebrain loop (Farries and Perkel, 2002; Person et al., 2008; Gale and Perkel, 2010). Specifically, LMAN receives input from the medial part of the dorsolateral thalamus (DLM), which in turn receives inputs from a basal ganglia nucleus Area X (Okuhata and Saito, 1987; Bottjer et al., 1989; Luo and Perkel, 1999). Recent work has shown that lesions of DLM mostly eliminate vocal variability in young birds, whereas lesions of Area X have little effect on variability (Goldberg and Fee, 2011), suggesting that DLM plays a key role in generating subsong vocalizations. However, the slowing of subsong timescales by LMAN cooling suggests that the circuit dynamics underlying subsong timing are not entirely upstream of LMAN (e.g., in DLM). One possibility is that DLM is not involved in timing syllables and gaps but is necessary to tonically “activate” LMAN during singing. Alternatively, it remains possible that dynamics are produced by neural activity that quickly cycles between multiple brain areas. For example, LMAN forms a loop with RA and DLM (Vates et al., 1997). Thus, cooling LMAN could slow down subsong syllables by increasing the time for signals to travel around this loop.

Similar arguments can be made regarding the role of HVC in controlling the timing of protosyllables and protogaps. HVC receives feedback connections from RA via a pathway through the midbrain and thalamus (Schmidt et al., 2004; Ashmore et al., 2008). Thus, although it seems likely that biophysical dynamics within HVC are involved in the control of protosyllable and protogap timing, we cannot rule out the possible involvement of other brain regions afferent to HVC. More detailed lesion and cooling experiments within other areas of the song system are necessary to address these possibilities.

A separate issue is that our analysis specifically concerns the temporal patterning of syllables and gaps in early songs. Some finer aspects of timing in the subsyllabic acoustic or respiratory structure may be influenced by areas downstream of HVC and LMAN (Andalman et al., 2011) or even by peripheral mechanisms of song production (Fee et al., 1998; Goller and Cooper, 2004; Riede and Goller, 2010).

How might the LMAN- and HVC-dependent modes we identified interact during early song development? We found that protosyllables and protogaps tend to follow one another, perhaps accounting for the previously described early appearance of rhythmicity in singing (Saar and Mitra, 2008). However, this tendency is only modestly stronger than expected by chance, suggesting that components with stereotyped and random durations are primarily intermingled during singing. Consistent with this idea, single-neuron recordings in LMAN suggest that this area is active throughout subsong (Aronov et al., 2008), plastic song (Ölveczky et al., 2005), and even adult song (Kao and Brainard, 2006). A more detailed understanding of the specific roles of HVC and LMAN in generating stereotyped and randomly timed components will require an analysis of neuronal activity in these areas during subsong and the transition to plastic song.

Brain areas are specialized for producing distinct modes of neural dynamics

Our results suggest that two forms of circuit dynamics are produced by distinct regions of the avian forebrain, allowing the production of both stereotyped and randomly timed song components. How might the circuits within HVC and LMAN produce these distinct forms of neural dynamics? In adult birds, neurons in HVC produce a brief burst of spikes at one moment in the song, and, as a population, these neurons generate a sparse sequence of bursts (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Long et al., 2010) that is thought to arise from a synaptically connected chain of neuronal groups (Abeles, 1991; Vogels et al., 2005; Li and Greenside, 2006; Jin et al., 2007; Long et al., 2010). Such a process within HVC might produce sequential dynamics during protosyllables and possibly protogaps early in development. Theoretical work suggests that a synaptic-chain organization could arise early in development from simple spike timing-based synaptic learning rules, even before sensorimotor learning (Jun and Jin, 2007; Fiete et al., 2010).
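As a minimal sketch of why a chain-like process would produce stereotyped durations that stretch uniformly under cooling, consider an idealized feedforward chain in which each group bursts a fixed interval after the previous one. The chain length, the step interval, the slowdown factor, and the chain positions marking element boundaries are all hypothetical; this is a toy illustration, not a model fitted to HVC data.

```python
def burst_times(n_groups, step_ms, slowdown=1.0):
    """Burst time of each group in an idealized chain; each link takes step_ms * slowdown."""
    return [i * step_ms * slowdown for i in range(n_groups)]


def element_durations(times, boundaries):
    """Durations of song elements delimited by consecutive chain positions."""
    return [times[b] - times[a] for a, b in zip(boundaries, boundaries[1:])]


# Hypothetical chain positions marking protosyllable, protogap, protosyllable.
boundaries = [0, 10, 15, 30]

normal = burst_times(n_groups=31, step_ms=10.0)
cooled = burst_times(n_groups=31, step_ms=10.0, slowdown=1.2)  # hypothetical 20% slowing

dur_normal = element_durations(normal, boundaries)
dur_cooled = element_durations(cooled, boundaries)

# Every element is stretched by the same factor, whatever its duration.
stretch = [c / n for c, n in zip(dur_cooled, dur_normal)]
print(dur_normal, dur_cooled, stretch)
```

Because every link in the chain is slowed by the same factor in this sketch, long and short elements dilate proportionally, which is the signature of uniform slowing described for HVC cooling.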

It is less clear how LMAN-dependent dynamics might generate exponentially distributed syllable and gap durations. One possibility is that LMAN (possibly in conjunction with other brain areas) exhibits two types of population activity states: some that generate premotor activity for vocal production and others that generate silences. By continuously evolving in time, population activity could “wander” between vocalizing and nonvocalizing states, producing the observed transitions between syllables and gaps. Another possibility is that distinct populations of neurons within LMAN actively trigger switching between vocal and silent states. This idea is consistent with the finding that many LMAN neurons exhibit a burst of activity before the onsets or offsets of subsong syllables (Aronov et al., 2008).
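One way to see how state switching could yield exponentially distributed durations is a toy two-state process in which the probability of leaving the current state is constant at every time step. Dwell times are then geometrically distributed, the discrete-time analog of an exponential distribution. The following sketch assumes hypothetical switching probabilities and an arbitrary time step; it is not a model of LMAN activity.

```python
import random


def simulate_two_state(p_end_syllable, p_end_gap, n_steps, seed=0):
    """Alternate between 'syllable' and 'gap' with a constant per-step exit probability."""
    rng = random.Random(seed)
    durations = {"syllable": [], "gap": []}
    state, elapsed = "gap", 0
    for _ in range(n_steps):
        elapsed += 1
        p_exit = p_end_syllable if state == "syllable" else p_end_gap
        if rng.random() < p_exit:
            durations[state].append(elapsed)  # record the completed dwell time
            state = "gap" if state == "syllable" else "syllable"
            elapsed = 0
    return durations


# Hypothetical exit probabilities per time step.
durations = simulate_two_state(p_end_syllable=0.02, p_end_gap=0.01, n_steps=600_000)

mean_syllable = sum(durations["syllable"]) / len(durations["syllable"])
mean_gap = sum(durations["gap"]) / len(durations["gap"])

# For an exponential distribution, the fraction of durations exceeding the
# mean is exp(-1), roughly 0.37, regardless of the mean itself.
frac_long_syllables = sum(d > mean_syllable for d in durations["syllable"]) / len(durations["syllable"])
print(mean_syllable, mean_gap, frac_long_syllables)
```

In this sketch the mean dwell time is set by the reciprocal of the exit probability, so any manipulation that lowers the switching rate lengthens the mean duration without changing the shape of the distribution.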

An additional question is what type of functional network architecture could generate random timing within LMAN. Some theoretical studies suggest that stochastic activity in neural circuits can be produced by chaotic dynamics (van Vreeswijk and Sompolinsky, 1998; Wilson, 1999; Vogels et al., 2005). Such dynamics may be generated by recurrent networks with sparse, random connectivity or even by biophysical processes in single neurons. Whether these models are applicable to the LMAN circuit or possibly other systems underlying random exploratory behavior in the brain is an open question.

Our analysis of HVC- and LMAN-dependent dynamics suggests an interesting organizational scheme in the song system and possibly in other motor or sensory systems. The anatomical segregation of HVC-dependent stereotyped dynamics and LMAN-dependent random dynamics raises the possibility that distinct brain regions are specialized not only for the receptive field or tuning properties they express but also for the types of dynamics they generate in a given behavior. Furthermore, just as cortical properties such as maps and receptive fields are thought to form under the influence of local learning rules (Miller, 1996), it is possible that distinct forms of circuit dynamics, such as stereotyped neuronal sequences or stochastic patterns, similarly result from the application of simple learning rules (Jun and Jin, 2007; Fiete et al., 2010). What processes govern the formation of distinct neural dynamics in the song system, and whether similarly segregated processes exist in other systems, remain to be determined.

Footnotes

Funding for this work was provided by National Institutes of Health Grant R01 DC009183 and the Hertz Foundation Silvio Micali Fellowship (D.A.). We thank Michael Long for helpful suggestions and all other members of the Fee laboratory for help with the cooling experiments.

The authors declare no competing financial interests.

References

  1. Abeles M. Corticonics. Cambridge, UK: Cambridge UP; 1991.
  2. Andalman AS, Fee MS. A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc Natl Acad Sci U S A. 2009;106:12518–12523. doi: 10.1073/pnas.0903214106.
  3. Andalman AS, Foerster JN, Fee MS. Control of vocal and respiratory patterns in birdsong: dissection of forebrain and brainstem mechanisms using temperature. PLoS One. 2011;6:e25461. doi: 10.1371/journal.pone.0025461.
  4. Aronov D, Fee MS. Analyzing the dynamics of brain circuits with temperature: design and implementation of a miniature thermoelectric device. J Neurosci Methods. 2011;197:32–47. doi: 10.1016/j.jneumeth.2011.01.024.
  5. Aronov D, Andalman AS, Fee MS. A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science. 2008;320:630–634. doi: 10.1126/science.1155140.
  6. Ashmore RC, Renk JA, Schmidt MF. Bottom-up activation of the vocal motor forebrain by the respiratory brainstem. J Neurosci. 2008;28:2613–2623. doi: 10.1523/JNEUROSCI.4547-07.2008.
  7. Bottjer SW, Miesner EA, Arnold AP. Forebrain lesions disrupt development but not maintenance of song in passerine birds. Science. 1984;224:901–903. doi: 10.1126/science.6719123.
  8. Bottjer SW, Halsema KA, Brown SA, Miesner EA. Axonal connections of a forebrain nucleus involved with vocal learning in zebra finches. J Comp Neurol. 1989;279:312–326. doi: 10.1002/cne.902790211.
  9. Doupe AJ, Kuhl PK. Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci. 1999;22:567–631. doi: 10.1146/annurev.neuro.22.1.567.
  10. Doya K, Sejnowski T. A novel reinforcement model of birdsong vocalization learning. In: Tesauro G, Touretzky DS, Leen TK, editors. Advances in neural information processing systems. Cambridge, MA: Massachusetts Institute of Technology; 1995. pp. 101–108.
  11. Farries MA, Perkel DJ. A telencephalic nucleus essential for song learning contains neurons with physiological characteristics of both striatum and globus pallidus. J Neurosci. 2002;22:3776–3787. doi: 10.1523/JNEUROSCI.22-09-03776.2002.
  12. Farries MA, Ding L, Perkel DJ. Evidence for “direct” and “indirect” pathways through the song system basal ganglia. J Comp Neurol. 2005;484:93–104. doi: 10.1002/cne.20464.
  13. Fee MS, Long MA. New methods for localizing and manipulating neuronal dynamics in behaving animals. Curr Opin Neurobiol. 2011. doi: 10.1016/j.conb.2011.06.010.
  14. Fee MS, Shraiman B, Pesaran B, Mitra PP. The role of nonlinear dynamics of the syrinx in the vocalizations of a songbird. Nature. 1998;395:67–71. doi: 10.1038/25725.
  15. Fiete IR, Senn W, Wang CZ, Hahnloser RH. Spike-time-dependent plasticity and heterosynaptic competition organize networks to produce long scale-free sequences of neural activity. Neuron. 2010;65:563–576. doi: 10.1016/j.neuron.2010.02.003.
  16. Franz M, Goller F. Respiratory units of motor production and song imitation in the zebra finch. J Neurobiol. 2002;51:129–141. doi: 10.1002/neu.10043.
  17. Gale SD, Perkel DJ. Anatomy of a songbird basal ganglia circuit essential for vocal learning and plasticity. J Chem Neuroanat. 2010;39:124–131. doi: 10.1016/j.jchemneu.2009.07.003.
  18. Glaze CM, Troyer TW. Temporal structure in zebra finch song: implications for motor coding. J Neurosci. 2006;26:991–1005. doi: 10.1523/JNEUROSCI.3387-05.2006.
  19. Goldberg JH, Fee MS. Vocal babbling in songbirds requires the basal ganglia-recipient motor thalamus but not the basal ganglia. J Neurophysiol. 2011;105:2729–2739. doi: 10.1152/jn.00823.2010.
  20. Goller F, Cooper BG. Peripheral motor dynamics of song production in the zebra finch. Ann NY Acad Sci. 2004;1016:130–152. doi: 10.1196/annals.1298.009.
  21. Hahnloser RH, Kozhevnikov AA, Fee MS. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature. 2002;419:65–70. doi: 10.1038/nature00974.
  22. Hartley RS, Suthers RA. Airflow and pressure during canary song: direct evidence for mini-breaths. J Comp Physiol A. 1989;165:15–26.
  23. Hohle RH. Inferred components of reaction times as functions of foreperiod duration. J Exp Psychol. 1965;69:382–386. doi: 10.1037/h0021740.
  24. Immelmann K. Song development in the zebra finch and other estrildid finches. In: Hinde RA, editor. Bird vocalizations. Cambridge, UK: Cambridge UP; 1969. pp. 61–74.
  25. Jafari F, Higgins PD. Thermal modeling in cylindrical coordinates using effective conductivity. IEEE Trans Ultrason Ferroelectr Freq Control. 1989;36:191–196. doi: 10.1109/58.19150.
  26. Jin DZ, Ramazanoğlu FM, Seung HS. Intrinsic bursting enhances the robustness of a neural network model of sequence generation by avian brain area HVC. J Comput Neurosci. 2007;23:283–299. doi: 10.1007/s10827-007-0032-z.
  27. Jun JK, Jin DZ. Development of neural circuitry for precise temporal sequences through spontaneous activity, axon remodeling, and synaptic plasticity. PLoS One. 2007;2:e723. doi: 10.1371/journal.pone.0000723.
  28. Kao MH, Brainard MS. Lesions of an avian basal ganglia circuit prevent context-dependent changes to song variability. J Neurophysiol. 2006;96:1441–1455. doi: 10.1152/jn.01138.2005.
  29. Kao MH, Doupe AJ, Brainard MS. Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature. 2005;433:638–643. doi: 10.1038/nature03127.
  30. Kao MH, Wright BD, Doupe AJ. Neurons in a forebrain nucleus required for vocal plasticity rapidly switch between precise firing and variable bursting depending on social context. J Neurosci. 2008;28:13232–13247. doi: 10.1523/JNEUROSCI.2250-08.2008.
  31. Kozhevnikov AA, Fee MS. Singing-related activity of identified HVC neurons in the zebra finch. J Neurophysiol. 2007;97:4271–4283. doi: 10.1152/jn.00952.2006.
  32. Lashley KS. The problem of serial order in behavior. In: Jeffress LA, editor. Cerebral mechanisms in behavior. New York: Wiley; 1951. pp. 112–136.
  33. Levelt WJM. Speaking: from intention to articulation. Cambridge, MA: Massachusetts Institute of Technology; 1993.
  34. Li M, Greenside H. Stable propagation of a burst through a one-dimensional homogeneous excitatory chain model of songbird nucleus HVC. Phys Rev E Stat Nonlin Soft Matter Phys. 2006;74:011918. doi: 10.1103/PhysRevE.74.011918.
  35. Lilliefors HW. On the Kolmogorov–Smirnov test for the exponential distribution with mean unknown. J Am Stat Assoc. 1969;64:387–389.
  36. Lipkind D, Tchernichovski O. Quantification of behavior Sackler colloquium: quantification of developmental birdsong learning from the subsyllabic scale to cultural evolution. Proc Natl Acad Sci U S A. 2011;108(Suppl 3):15572–15579. doi: 10.1073/pnas.1012941108.
  37. Liu WC, Gardner TJ, Nottebohm F. Juvenile zebra finches can use multiple strategies to learn the same song. Proc Natl Acad Sci U S A. 2004;101:18177–18182. doi: 10.1073/pnas.0408065101.
  38. Long MA, Fee MS. Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature. 2008;456:189–194. doi: 10.1038/nature07448.
  39. Long MA, Jin DZ, Fee MS. Support for a synaptic chain model of neuronal sequence generation. Nature. 2010;468:394–399. doi: 10.1038/nature09514.
  40. Luo M, Perkel DJ. Long-range GABAergic projection in a circuit essential for vocal learning. J Comp Neurol. 1999;403:68–84.
  41. Marler P. Birdsong and speech development: could there be parallels? Am Sci. 1970;58:669–673.
  42. Miller K. Receptive fields and maps in the visual cortex: models of ocular dominance and orientation columns. In: Domany E, van Hemmen J, Schulten K, editors. Models of neural networks III. New York: Springer; 1996. pp. 55–78.
  43. Nottebohm F, Stokes TM, Leonard CM. Central control of song in the canary, Serinus canarius. J Comp Neurol. 1976;165:457–486. doi: 10.1002/cne.901650405.
  44. Nottebohm F, Kelley DB, Paton JA. Connections of vocal control nuclei in the canary telencephalon. J Comp Neurol. 1982;207:344–357. doi: 10.1002/cne.902070406.
  45. Okuhata S, Saito N. Synaptic connections of thalamo-cerebral vocal nuclei of the canary. Brain Res Bull. 1987;18:35–44. doi: 10.1016/0361-9230(87)90031-1.
  46. Oller DK, Wieman LA, Doyle WJ, Ross C. Infant babbling and speech. J Child Lang. 1976;3:1–11.
  47. Ölveczky BP, Andalman AS, Fee MS. Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biol. 2005;3:e153. doi: 10.1371/journal.pbio.0030153.
  48. Perkel DJ, Farries MA, Luo M, Ding L. Electrophysiological analysis of a songbird basal ganglia circuit essential for vocal plasticity. Brain Res Bull. 2002;57:529–532. doi: 10.1016/s0361-9230(01)00690-6.
  49. Person AL, Gale SD, Farries MA, Perkel DJ. Organization of the songbird basal ganglia, including area X. J Comp Neurol. 2008;508:840–866. doi: 10.1002/cne.21699.
  50. Riede T, Goller F. Peripheral mechanisms for vocal production in birds: differences and similarities to human speech and singing. Brain Lang. 2010;115:69–80. doi: 10.1016/j.bandl.2009.11.003.
  51. Saar S, Mitra PP. A technique for characterizing the development of rhythms in bird song. PLoS One. 2008;3:e1461. doi: 10.1371/journal.pone.0001461.
  52. Schmidt MF, Ashmore RC, Vu ET. Bilateral control and interhemispheric coordination in the avian song motor system. Ann NY Acad Sci. 2004;1016:171–186. doi: 10.1196/annals.1298.014.
  53. Simpson HB, Vicario DS. Brain pathways for learned and unlearned vocalizations differ in zebra finches. J Neurosci. 1990;10:1541–1556. doi: 10.1523/JNEUROSCI.10-05-01541.1990.
  54. Stepanek L, Doupe AJ. Activity in a cortical-basal ganglia circuit for song is required for social context-dependent vocal variability. J Neurophysiol. 2010;104:2474–2486. doi: 10.1152/jn.00977.2009.
  55. Suthers RA. How birds sing and why it matters. In: Marler PR, Slabbekoorn H, editors. Nature's music: the science of birdsong. San Diego: Academic; 2004. pp. 272–295.
  56. Tchernichovski O, Mitra PP. Towards quantification of vocal imitation in the zebra finch. J Comp Physiol A Neuroethol Sens Neural Behav Physiol. 2002;188:867–878. doi: 10.1007/s00359-002-0352-4.
  57. Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra PP. A procedure for an automated measurement of song similarity. Anim Behav. 2000;59:1167–1176. doi: 10.1006/anbe.1999.1416.
  58. Tchernichovski O, Mitra PP, Lints T, Nottebohm F. Dynamics of the vocal imitation process: how a zebra finch learns its song. Science. 2001;291:2564–2569. doi: 10.1126/science.1058522.
  59. Tchernichovski O, Lints TJ, Deregnaucourt S, Cimenser A, Mitra PP. Studying the song development process: rationale and methods. Ann NY Acad Sci. 2004;1016:348–363. doi: 10.1196/annals.1298.031.
  60. Tumer EC, Brainard MS. Performance variability enables adaptive plasticity of “crystallized” adult birdsong. Nature. 2007;450:1240–1244. doi: 10.1038/nature06390.
  61. van Vreeswijk C, Sompolinsky H. Chaotic balanced state in a model of cortical circuits. Neural Comput. 1998;10:1321–1371. doi: 10.1162/089976698300017214.
  62. Vates GE, Vicario DS, Nottebohm F. Reafferent thalamo-“cortical” loops in the song system of oscine songbirds. J Comp Neurol. 1997;380:275–290.
  63. Veit L, Aronov D, Fee MS. Learning to breathe and sing: development of respiratory-vocal coordination in young songbirds. J Neurophysiol. 2011;106:1747–1765. doi: 10.1152/jn.00247.2011.
  64. Vogels TP, Rajan K, Abbott LF. Neural network dynamics. Annu Rev Neurosci. 2005;28:357–376. doi: 10.1146/annurev.neuro.28.061604.135637.
  65. Wild JM. Functional neuroanatomy of the sensorimotor control of singing. Ann NY Acad Sci. 2004;1016:438–462. doi: 10.1196/annals.1298.016.
  66. Wilson HR. Spikes, decisions, and actions. Oxford: Oxford UP; 1999.
  67. Yu AC, Margoliash D. Temporal hierarchical control of singing in birds. Science. 1996;273:1871–1875. doi: 10.1126/science.273.5283.1871.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
