Proceedings of the Royal Society B: Biological Sciences. 2014 Mar 22;281(1779):20132306. doi: 10.1098/rspb.2013.2306

Morphological basis for the evolution of acoustic diversity in oscine songbirds

Tobias Riede 1, Franz Goller 1
PMCID: PMC3924064  PMID: 24500163

Abstract

Acoustic properties of vocalizations arise through the interplay of neural control with the morphology and biomechanics of the sound generating organ, but in songbirds it is assumed that the main driver of acoustic diversity is variation in telencephalic motor control. Here we show, however, that variation in the composition of the vibrating tissues, the labia, underlies diversity in one acoustic parameter, fundamental frequency (F0) range. Lateral asymmetry and arrangement of fibrous proteins in the labia into distinct layers is correlated with expanded F0 range of species. The composition of the vibrating tissues thus represents an important morphological foundation for the generation of a broad F0 range, indicating that morphological specialization lays the foundation for the evolution of complex acoustic repertoires.

Keywords: vocal behaviour, extracellular matrix, anisotropy

1. Introduction

The relationship between structure and function plays an important role in the evolution of behaviour. For vocal behaviour, the structure of the vibrating tissue is critical for determining a range of acoustic features [1–3], but this relationship has only been investigated in detail in humans [4,5]. One of the most important acoustic features of animal vocalizations is fundamental frequency (F0). The large range of F0 across species or within the vocal repertoire of individual species may carry important information about the sender and therefore play a major signalling role in the context of natural and sexual selection [6]. However, little specific information is available about the various factors involved in variation of frequency range both within and between species.

In avian interspecific comparisons, F0 is correlated with body size, but a substantial portion of the variation is not explained by this relationship [7]. However, explaining the F0 range is important because it defines the boundaries for sound frequency of a vocal repertoire and thus poses limits on spectral diversity. Because vocal diversification has played an important role in speciation and the ecological success of birds, it is important to gain a thorough understanding of how frequency range is determined in birds and thus contributes to these processes.

Airflow-driven vibration of tissue is the main mechanism of sound production in birds and mammals, including humans. The vibrating tissues of the mammalian larynx and the avian syrinx are composed of extracellular matrix [8,9]. Its composition of collagen and elastin fibres and hyaluronan ensures that the tissue can engage in self-sustained vibrations and, at the same time, can withstand the physical forces of vibration [10]. The mechanical properties of these vibrating tissues influence the oscillation frequency and, thus, F0 of the generated sound [11]. The tension of the vibrating tissue can be adjusted by muscle action and is therefore controlled by the vocal motor areas in the brain [12,13]. However, neural control can only adjust F0 within the physical limits, which are determined by the morphology and composition of the vibrating tissue. Previous work in songbirds [14] and mammals [15] showed that mechanical differences of labia are associated with vocal differences between species. In vivo experiments demonstrated the role of labia in F0 determination and the role of muscles in labia adjustment [16–19], and computational modelling illustrated how muscle control and labial dynamics contribute to generate diverse features of sound [20]. Despite this progress, we still lack a thorough understanding of how morphology and neural control interact during the evolution of acoustic signals [21,22].

To address this gap in our knowledge, a more detailed understanding of the extracellular matrix composition of the syringeal labia is required [7]. The goal of this study was therefore to investigate the relationship between microstructure of the vocal organ and F0 in oscine songbirds. The oscine syrinx contains two independently controlled sound sources whose F0 ranges do not overlap fully [16–20]. Oscines acquire song, the most complex vocalizations of their acoustic repertoire, through vocal learning. It is thought that the forebrain control circuits associated with learned vocal behaviour enabled the evolution of increased acoustic diversity [23]. Neural control of the dual sound source undoubtedly plays a major role in acoustic versatility. However, the morphology of these two sound sources must present a mechanical basis for generating diverse sound features. Morphological complexity was investigated in two complementary ways to address the question of how design of the vibrating tissue (labia) is related to the vocal repertoire of a species. First, we expect that an increased morphological asymmetry between the two sound sources is associated with a large F0 range. In a second approach, we investigate the morphology of the labia, because data from human vocal folds indicate that the evolution of a layered structure increases versatility of vocal folds as a sound source [8]. Acoustic energy conversion is tightly linked to how labia oscillate inside the syrinx [19]. We therefore expect that species with a wider frequency range demonstrate more complex labial morphology.

2. Material and methods

(a). Study animals

Eight songbird species were selected to maximize variation in body size, and to maximize variation in song F0. All sound recordings were made in the same populations from which specimens for morphological investigations were collected. Zebra finches (Taeniopygia guttata) were bred at the University of Utah. Other species (see electronic supplementary material, table S1) were captured in the Salt Lake Area, Utah, USA between March and July 2008–2011. Data from zebra finches, white-crowned sparrows (Zonotrichia leucophrys) and European starlings (Sturnus vulgaris) were in part collected for previous studies [9,24,25]. The birds were deeply anaesthetized with a Ketamine/Xylazine mixture (Sigma-Aldrich K-113; 2 μl g−1 body mass) and perfused intracardially with PBS. Only males were used in this study.

(b). Tissue preparation

The syrinx was excised and fixed for 3 days in 10% neutral-buffered formalin, decalcified for 8 h, processed and embedded for coronal sections in paraffin [9,23]. Adjacent 5 μm sections were exposed to one of the following stains: haematoxylin and eosin (H&E); elastica van Gieson stain (EVG); Masson's trichrome stain (TRI). Micrographs were taken with a digital camera (AxioCam HRc, Carl Zeiss, Germany) combined with an Axioplan Zeiss microscope (Axioplan, Carl Zeiss, Germany) and computer software (Axiovision v. 40, v. 4.6.3.0., Carl Zeiss, Germany).

(c). Labial morphology

Labial composition and morphology of eight songbird species (see electronic supplementary material, table S1) were examined with established histological techniques. Individual fibrous protein components were visualized and quantified. Based on stained tissue, we constructed a schematic cross-section through the medial labium, showing the abundance of collagen and elastin fibres as well as the most prominent orientation of the two fibre types (figure 1a,b). Five basic extracellular matrix designs were found and described as layers (layer numbers are arbitrary; electronic supplementary material, table S2). Percentages for fibrillar proteins refer to positively stained areas in EVG and TRI stains, respectively, analysed with ImageJ (1.41o; NIH, USA):

  • layer 1: less than 10% collagen, more than 60% elastic fibres, elastic fibres in cranio-caudal orientation;

  • layer 2: less than 5% elastic fibres, more than 60% collagen fibres, collagen fibres randomly oriented;

  • layer 3: less than 5% elastic fibres, more than 60% collagen fibres, collagen fibres dorsoventrally oriented;

  • layer 4: less than 5% elastic fibres, more than 60% collagen fibres, collagen fibres cranio-caudally oriented; and

  • layer 5: less than 5% elastic fibres, 30–60% collagen fibres, collagen fibres randomly oriented.
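The five layer criteria above can be written as a small decision rule. The following is only an illustrative sketch: the function name, argument names and orientation labels are hypothetical and not part of the study's workflow, and the percentages are assumed to be the stained-area fractions described in the text. Combinations that match none of the five listed designs return None.

```python
from typing import Optional


def classify_layer(collagen_pct: float, elastin_pct: float,
                   orientation: str) -> Optional[int]:
    """Map fibre content and orientation to layer numbers 1-5.

    orientation is one of 'cranio-caudal', 'dorsoventral' or 'random'
    (hypothetical labels). It refers to the elastic fibres for layer 1
    and to the collagen fibres for layers 2-5.
    """
    # Layer 1: elastin-dominated, elastic fibres cranio-caudally oriented.
    if collagen_pct < 10 and elastin_pct > 60 and orientation == "cranio-caudal":
        return 1
    # Layers 2-5 all contain less than 5% elastic fibres.
    if elastin_pct < 5:
        if collagen_pct > 60:
            # Dense collagen: the layer number depends on fibre orientation.
            by_orientation = {"random": 2, "dorsoventral": 3, "cranio-caudal": 4}
            return by_orientation.get(orientation)
        if 30 <= collagen_pct <= 60 and orientation == "random":
            # Layer 5: intermediate collagen density, random orientation.
            return 5
    # No listed design matches this combination.
    return None
```

For instance, classify_layer(70, 2, "dorsoventral") corresponds to layer 3 under these criteria.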

Figure 1.


(a) The syrinx consists of a cartilaginous framework, oscillatory soft tissues (labia) and muscles for controlling airflow and labial tension. The oscine syrinx contains two sound sources, each consisting of a pair of labia (lateral, LL and medial labium, ML) that are located near the tracheo-bronchial junction. T, trachea; A1–A3, syringeal cartilages; P, pessulus; B, bronchus; MTM, medial tympaniform membrane; ML, medial labium; LL, lateral labium. The grey box indicates the plane for sectioning. (b) Section of a starling medial labium illustrating its cellular and extracellular composition through trichrome stain. Thin epithelia (E) on each side encompass extracellular matrix, composed of collagen (blue) and elastin fibres (black). In this section, a part of the cartilage is also included (A1). The layer structure of the extracellular matrix is shown in a schematic below, identifying layers according to content and orientation of collagen fibres. Collagen fibres are oriented mostly randomly in layers 2 and 5, dorsoventrally in layer 3 and cranio-caudally in layer 4. Layer numbers (2–4) are used to identify similar organization in different species. Starlings do not have a layer dominated by elastin (layer 1 in (c)). (c) Schematics of composition of extracellular matrix and size differences of the medial labia in the eight songbird species. Layers are identified by numbers corresponding to different fibre composition and orientation. These schemata reflect sections that did not contain cartilage. The layer thickness was drawn to scale and corresponds to the mean from two to eight specimens. Species GK, golden-crowned kinglet (Regulus satrapa); RK, ruby-crowned kinglet (Regulus calendula); ZF, zebra finch; WS, white-crowned sparrow (Zonotrichia leucophrys); ES, European starling (Sturnus vulgaris); AR, American robin (Turdus migratorius); YB, yellow-headed blackbird (Xanthocephalus xanthocephalus); BM, black-billed magpie (Pica hudsonia).

Labial cross-sectional areas were measured at five different levels of equal distance along the ventral to dorsal axis. The volume of labia was estimated from labial cross-sectional area and total length of labia using equation (2.1):

\[ \mathrm{Vol} = \sum_{i=1}^{n} A_i \, l_i \tag{2.1} \]

where Vol is the volume of a labium, estimated from the sum of products of the cross-sectional areas A_i and the distances (lengths) l_i between two sections of the syrinx.
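The volume estimate described above can be sketched in a few lines. This is a minimal illustration, assuming one distance value per measured cross-section; the function and argument names are hypothetical, the numbers in the usage note are made-up placeholders rather than measured data, and how the first and last sections are treated at the ends of the labium is not specified in the text.

```python
def labial_volume(areas_mm2, distances_mm):
    """Estimate labial volume (mm^3) as the sum of area x distance products.

    areas_mm2:    cross-sectional areas of the labium at each section level.
    distances_mm: distance assigned to each section (assumed here to be the
                  spacing between neighbouring sections).
    """
    if len(areas_mm2) != len(distances_mm):
        raise ValueError("need one distance per cross-sectional area")
    # Sum of products of cross-sectional area and inter-section distance.
    return sum(a * d for a, d in zip(areas_mm2, distances_mm))
```

With placeholder areas of 0.1, 0.2 and 0.3 mm² and a spacing of 0.5 mm each, the estimate is 0.05 + 0.10 + 0.15 = 0.3 mm³.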

(d). Analysis of the vocal repertoire

F0 was quantified every 20 ms for songs and calls using a pitch-tracking module (PRAAT software, v. 5.2.12). Results were visually confirmed (figure 2a,b). The F0 data were represented as histograms (0–9 kHz, 100 Hz resolution). Mean F0 (F0-mean) was calculated by averaging all 100-Hz frequency bins in the histogram, thus weighting different frequency bins according to the rate of their occurrence. The F0 limits within the song of a species were calculated in two ways. First, F0-range-A was calculated as the sum of the 100-Hz bins. Second, F0-range-B was estimated from the F0 limits by calculating the difference between the minimum and maximum F0. While F0-range-B considers the upper and lower boundaries within which sound is produced, F0-range-A captures the spectral range actually used by a vocal organ.
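The three descriptors can be illustrated with a short sketch operating on a list of tracked F0 values. Two points are assumptions rather than statements from the methods: F0-range-A is interpreted here as the number of occupied 100-Hz bins multiplied by the bin width, and F0-mean is computed from bin centres weighted by their counts. The function name and the example values are hypothetical.

```python
from collections import Counter

BIN_HZ = 100    # histogram resolution used in the text
MAX_HZ = 9000   # upper edge of the histogram (0-9 kHz)


def f0_metrics(f0_hz):
    """Return (F0-mean, F0-range-A, F0-range-B) in Hz for tracked F0 values."""
    # Keep only frames with a tracked F0 inside the histogram range.
    voiced = [f for f in f0_hz if 0 < f < MAX_HZ]
    # Histogram: bin index -> number of F0 frames falling in that 100-Hz bin.
    counts = Counter(int(f // BIN_HZ) for f in voiced)
    n_frames = sum(counts.values())
    # F0-mean: bin centres weighted by how often each bin occurs.
    f0_mean = sum((b + 0.5) * BIN_HZ * c for b, c in counts.items()) / n_frames
    # F0-range-A (assumed reading): total width of all occupied bins.
    range_a = len(counts) * BIN_HZ
    # F0-range-B: distance between the lowest and highest tracked F0.
    range_b = max(voiced) - min(voiced)
    return f0_mean, range_a, range_b
```

For a made-up track with values clustered near 3 and 6 kHz, such as [3010, 3020, 3090, 6050, 6060], only two bins are occupied, so F0-range-A is 200 Hz while F0-range-B is 3050 Hz. This is the kind of difference between used range and boundary range that the two measures are meant to separate.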

Figure 2.


Fundamental frequency (F0) tracking procedure and quantification exemplified for (a) black-billed magpie and (b) ruby-crowned kinglet songs. Songs are shown as oscillograms (top panel) and spectrograms (middle panel) and the tracked F0 (bottom panel). F0 data points for these songs are displayed as histograms (c,d) showing the occurrence of each frequency between 0 and 9 kHz (100 Hz bins). Whereas magpie song is composed of all frequencies between 0.5 and 3 kHz, song in the ruby-crowned kinglet demonstrates two distinct distributions with maxima around 3 and 6 kHz. The two frequency ranges are not connected, and probably represent the independent contributions from the two sides of the syrinx. (e) Distributions for F0 in song (bars, upward) and calls (bars, downward; stars indicate data obtained from published spectrographic images) for all eight species. For each species, one song from each of three to six individuals was submitted to F0 tracking and data were pooled for histograms. Species abbreviations as in figure 1c.

For each species, one song from each of three to six individuals was submitted to F0 tracking. The sample size of audio recordings was limited to these relatively small numbers for two reasons. First, the relationship between syrinx morphology and song F0 must be explored with samples from the same population to account for potential differences in the learned aspects of song. Second, averaging across many individuals can obscure certain characteristics of a vocal repertoire. For example, in a species whose song is composed of two non-overlapping distinct frequencies within the total range, the gap between these frequencies will disappear if individuals vary in the respective distinct frequencies.

3. Results

Marked differences between species were found in the distribution and orientation of individual fibre components within the labia (figure 1). For each species, we categorized and quantified the number of different layers according to fibre orientation and density (figure 1b) and established a schematic representation (figure 1c). For example, the labia of golden-crowned kinglets and white-crowned sparrows display a deep layer with prominent presence of elastic fibres (layer 1), followed by a layer of loose connective tissue with lower elastin content (layer 2). The zebra finch labia consist of a homogeneous single layer (layer 2), while the labia in ruby-crowned kinglets, European starlings and yellow-headed blackbirds exhibit multiple layers (figure 1c). Although the left medial labia tend to be larger than the right medial labia, the layer structure was identical on the two sides with one exception. In European starlings, we found four layers on the left side (layers 2, 3, 4, 5) and two layers (layers 4, 5) in the much smaller labia of the right syrinx.

We estimated labial size by measuring labial cross-sectional area in serial sections of the syrinx from anterior to posterior to assess asymmetry between the left and right sound sources for each species. In all investigated species, the area of the left labia was at least somewhat larger than that of the right labia, but this lateral asymmetry varied strongly between species, especially for the medial labia. The difference in medial labia ranged from 0% in the zebra finch to a more than fourfold larger left medial labium in the European starling (electronic supplementary material, table S3).

F0 for song and calls are plotted as histograms in figure 2. The F0-range-B of different species varies substantially. For example, zebra finch song spans the range from 0.5 to 6 kHz with the vast majority of sounds below 3 kHz, whereas European starlings span a range from 0.25 to 9 kHz. An interesting difference in F0 limits occurs in the two kinglet species. Song in the ruby-crowned kinglet consists of many syllables with much lower F0 than is found in the song of the golden-crowned kinglet. Furthermore, ruby-crowned kinglets generate a call with F0 well below 1 kHz, which is highly surprising for such a small bird.

We then explored these two datasets for correlations between morphology, F0 and F0 limits. We tested whether body mass can explain the F0-range-A, F0-range-B or F0-mean of songs in these eight species. Body size should be a good predictor for F0 since labial size scaled with body size (figure 3a), and absolute size of the labia explained F0-mean (figure 3b). However, body mass differences explain only 35% of the variance in F0-mean (figure 3c), which falls within the range of earlier analyses of larger datasets [26,27]. More importantly, body mass variations appear to be related neither to the F0-range-A, calculated by weighting all frequencies in the song repertoire of a species, nor to the F0-range-B (figure 3d,e).

Figure 3.


Labial volume increases with increasing body mass (a), and can explain, in part, F0-mean of song (b). Regression models are presented in electronic supplementary material, table S4. Body mass does not explain more than 35% of the variation in the F0-mean of song (c; dots are mean F0 and bars indicate the range between the minimum and maximum F0) and does not show a strong relationship with the F0-ranges of song in the eight species (d,e). Species abbreviations in (c) as in figure 1c. The relationships between F0-range-A and F0-range-B and labial volumes were not significant (see electronic supplementary material, table S4). (f,g) The degree of labial asymmetry (expressed as x-fold difference in area at the mid-syrinx level) is positively correlated with the F0-range-A in the songs of the eight species. (h) The number of distinct layers in the extracellular matrix of the medial labia is positively correlated with the F0-range-A but not F0-range-B of songs (see electronic supplementary material, table S4). The relationships between F0-mean and asymmetry or number of layers were not significant (see electronic supplementary material, table S4).

Two features of the labia show a significant correlation with F0-range-A. First, the degree of asymmetry of labial cross-sectional area at mid-organ level is positively correlated with the F0-range-A (figure 3f,g). This relationship is particularly strong for the medial labia, which show more variation in lateral asymmetry than the lateral labia in these eight species (see electronic supplementary material, table S4). For F0-range-B only the relationship for the medial labia was significant (see electronic supplementary material, table S4). Asymmetry should lead to different F0 limits of the two sound generators with minimal overlap and, thus, expand the combined F0 range that can be produced by the dual sound source. This interpretation is supported by physiological data, which show that the two sound generators contribute different frequency ranges in a number of species [16–18]. The European starling, with the largest asymmetry in this dataset, also generates the largest F0-range.

Second, the number of different layers within the labia is positively correlated with the F0-range-A but not F0-range-B (figure 3h and electronic supplementary material, table S4). F0-mean is not significantly correlated with labial asymmetry or the number of layers (see electronic supplementary material, table S4). A layer structure provides a morphological basis for anisotropic behaviour. Wave propagation in the soft tissue of the labia is one essential component of self-sustained vibratory behaviour during sound generation [2,20], and these waves propagate differently in the soft material depending on the direction, magnitude and rate of deformation. Effectively, the inhomogeneous structure can enlarge the frequency range or support specific frequency ranges by the potential for engaging different portions of the tissue into vibration or beneficially influence the vibration characteristics [4].

4. Discussion

Two morphological adaptations in the labia of songbirds together lay the foundation for extending the F0-range, for producing all frequencies within this range and for shifting F0-mean, presumably with the potential for greater specialization. First, the asymmetry in labial size defines the lower and upper frequency boundaries as well as the wider range of actually produced frequencies within these boundaries. Second, the inhomogeneous composition of the labia (layer structure) permits production of all frequencies throughout the range, and thus also the potential for continuous frequency modulation across the entire range. These specializations of the extracellular matrix of the vibrating tissue are therefore essential for the production of acoustic diversity and lay the foundation for neural control of sophisticated vocal repertoires.

The relationship between labial composition and used frequency range (F0-range-A) is stronger than that found for the simple lower and upper boundaries of the frequency range (F0-range-B). These results indicate that the mechanism for continuous frequency modulation and generation of frequencies within a broad range rests on the degree of inhomogeneity within the labial extracellular matrix. The data also show that quantifying used frequencies within the lower and upper frequency boundaries is a meaningful and functionally relevant acoustic descriptor of a vocal repertoire.

It remains to be tested how variability in labia design contributes to acoustic variability within species, particularly between singing males. Studies in songbirds demonstrate mixed results [28–30] for the relationship between body size and song frequency (‘dominant’ or ‘peak’ frequency was measured as frequency with the highest amplitude in spectrum). A lack of a close relationship could be explained by an uncoupling between labial size and F0, which might be more dependent on species-specific labia design. Available data do not permit predictions on the relationship between F0 and body size within species. Although the labia show species-specific characteristics, individual-specific factors (e.g. hormones, stress, nutrition and hydration) are likely to influence their morphology and mechanical properties and may do so more readily than changes in body mass or skeletal body size.

The nonlinear relationship between stress and strain (figure 4) presents a complex target for frequency control for two main reasons. First, the active movements required for regulating tissue stress are different for different frequencies, and these movements may involve other biomechanical nonlinearities [7,31,32]. Second, relaxation of the tissue requires adjustment in motor control of sustained constant frequency vibrations, and relaxation is not uniform across the range of experienced strains (more detailed explanation in figure 4). Neural control of sound frequency therefore has to navigate these changing relationships across the frequency range. The inhomogeneous composition provides a basis for anisotropic behaviour of labia, i.e. the labial layers cause a direction-dependent stress response to deformation, which is likely to facilitate the production of frequencies within the range in stereotyped fashion [33]. Stereotypy of song may constitute an important sexually selected feature [34,35], and the make-up of the labia influences the potential for maintaining a given frequency and for generating precise continuous sweeps of a wide frequency range.

Figure 4.


(a) Hypothetical stress–strain relationship for connective tissue as found in the labia of the songbird syrinx. The relationship between stress, strain and fundamental frequency of sound is established via the string model, which explains that oscillation frequency is determined by tissue stress, string length and density. However, the impact of stress depends on the range of operation (b). In the low-strain region, relatively large tissue deformation is necessary to achieve a given change in stress (box 1) and thereby to cover a large frequency range. The same magnitude of change in stress can be achieved in the high-strain region with much less change in tissue deformation (box 2). This nonlinear relationship permits greater stability in the low-strain region. For example, a syllable with constant F0 requires constant re-adjustment if produced in the high-strain region because tissue relaxation occurs more rapidly (i.e. energy loss due to heat) than in the low-strain region. The ‘disadvantage’ of the low-strain region is (i) less fine-control, (ii) more change in strain is required to cover a large F0 range, and (iii) less frequency flexibility (due to drag and resistance associated with larger movement amplitudes). A potential mechanism that counters the requirement for larger amplitude active movements in the low-strain region could be a multi-layer tissue design, which allows differential recruitment of tissue. The ‘differential tissue recruitment’ could cause higher stresses in the oscillating tissue aspects facilitating high oscillation frequencies, even at lower strains. Stress is the ratio of force over cross-sectional area. By reducing the volume of oscillating tissue while the same aerodynamic forces apply, greater local stresses inside the oscillating tissue should occur.
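The argument in the figure 4 caption can be made concrete with a small numerical sketch. The ideal string relation F0 = (1/(2L))·sqrt(stress/density) is the standard textbook form of the string model named in the caption. The exponential stress–strain curve and all parameter values below (a_pa, b, the example stresses and strains) are hypothetical choices made only to reproduce the qualitative shape of such a curve; they are not measurements from this study.

```python
import math


def string_f0(stress_pa, length_m, density_kg_m3):
    """Ideal string model: F0 = (1 / (2 L)) * sqrt(stress / density)."""
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)


def stress(strain, a_pa=1000.0, b=10.0):
    """Hypothetical nonlinear (exponential) stress-strain curve, in Pa."""
    return a_pa * (math.exp(b * strain) - 1.0)


def strain_for_stress(stress_pa, a_pa=1000.0, b=10.0):
    """Inverse of stress(): strain needed to reach a given stress."""
    return math.log(stress_pa / a_pa + 1.0) / b


def strain_step(start_strain, delta_stress_pa):
    """Extra strain needed to raise stress by delta_stress_pa from start_strain."""
    target = stress(start_strain) + delta_stress_pa
    return strain_for_stress(target) - start_strain
```

Under these assumed parameters, raising the stress by 1000 Pa requires roughly ten times more additional strain when starting at a strain of 0.05 (box 1, low-strain region) than when starting at 0.3 (box 2, high-strain region), which is the asymmetry the caption describes.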

In the light of the extensive research on variation in the cartilage structure and muscular apparatus of the oscine syrinx [23,36–45], it is notable that the strongest correlation with acoustic behaviour is found in the variation in the composition of the labia. The pronounced behavioural effect of small changes in the make-up of the extracellular matrix arises from the remarkable biomechanical properties of extracellular matrix and joins its many other important functions [46,47]. The detailed morphological composition of vibrating tissues can therefore serve as a predictor of the possible range of acoustic features in a vocal repertoire. Importantly, the suggested role of how labial make-up can affect acoustic features agrees well with the limited observations on labial shape and position during oscillatory behaviour associated with acoustic changes of the sound output [14,19].

The composition of extracellular matrix is not static but can be modified in response to systemic, hormonal and environmental influences [48]. Vibrating tissues are therefore also subjected to dynamic changes arising from specific use, such as vibration frequencies, and hormonal and developmental changes that all may have effects on spectral features of vocalizations. Although extracellular matrix proteins are universally found in viscoelastic tissues of animals, their easily quantifiable expression in spectral features of sound, arising from changes in the fibre composition and orientation within vibrating tissues, provides a unique window for studying the evolution of its biomechanical properties and dynamic remodelling.

In conclusion, the comparative data of this study show that the evolution of complex vocal behaviour does not only require increased sophistication in neural control. Adaptations in the biomechanical properties of the sound generating organ lay the foundation for the diversity of acoustic features. The remarkable similarity between the vibrating tissues of the avian syrinx and the larynx of other tetrapods, including humans, suggests that the same mechanism can drive diversity in vocal repertoires in all these groups.

Funding statement

This work was supported by NIH grant DC06876.

References


