Abstract
Budgerigars learn their vocalizations by reference to auditory information and they retain the ability to learn new vocalizations throughout life. Auditory feedback of these vocalizations was manipulated in three experiments by training birds to produce vocalizations while wearing small earphones. Experiments 1 and 2 examined the effect of background noise level (Lombard effect) and the effect of manipulating feedback level from self-produced vocalizations (Fletcher effect), respectively. Results show that birds exhibit both a Lombard effect and a Fletcher effect. Further analysis showed that changes in vocal intensity were accompanied by changes in call fundamental frequency and duration. Experiment 3 tested the effect of delaying or altering auditory feedback during vocal production. Results showed subsequent production of incomplete and distorted calls in both feedback conditions. These distortions included changes in the peak fundamental frequency, amplitude, duration, and spectrotemporal structure of calls. Delayed auditory feedback was most disruptive to subsequent calls when the delay was 25 ms. Longer delays resulted in fewer errors.
INTRODUCTION
Animals that learn their vocalizations rely on auditory feedback (AF) for the development and maintenance of a normal vocal repertoire (for reviews, see Farabaugh and Dooling, 1996; Janik and Slater, 1997; Doupe and Kuhl, 1999; Brainard and Doupe, 2000; Janik and Slater, 2000; and Boughman and Moss, 2003). AF during vocal production has been most extensively studied in humans, where feedback mechanisms help regulate, among other things, vocal amplitude. The Lombard effect, for instance, describes an increase in vocal amplitude in response to an increase in ambient noise level (Lombard, 1911; for a review, see Lane and Tranel, 1971). The Fletcher effect describes a decrease in vocal amplitude in response to an increase in perceived vocal loudness (Fletcher et al., 1918; Lane and Tranel, 1971; Siegel and Pick, 1974). In human speakers, these changes also include increases in syllable duration and vocal pitch and decreases in speaking rate (Hanley and Steer, 1949; Draegert, 1951; Dreher and O’Neill, 1958). Presumably, these responses function to preserve speech intelligibility in varied listening conditions.
Signal degradation due to environmental noise is a problem for all acoustic communication systems. Mechanisms of noise-dependent amplitude changes in vocal behavior similar to the Lombard effect have been described in monkeys (Sinnott et al., 1975; Brumm et al., 2004; Egnor and Hauser, 2006), quails (Potash, 1972), hummingbirds (Pytte et al., 2003), songbirds (Cynx et al., 1998; Brumm and Todt, 2002; Kobayashi and Okanoya, 2003), and budgerigars (Manabe et al., 1998). Generally, in animal studies, technical challenges in delivering noise or altered AF make precise comparisons with human work difficult. The present studies use small earphones to overcome this limitation.
Changes in vocal output caused by altered feedback extend beyond level effects. Precise timing of feedback also has important consequences for the normal development and maintenance of vocal production. In humans, delayed auditory feedback (DAF) of the speech signal results in a number of disruptive effects, including slower speech rate, higher fundamental frequency, longer syllable durations, and a range of production errors (including stuttering and short consonant-like bursts of sound) while some subjects report a complete inability to continue speaking (Lee, 1950; Fairbanks, 1955; Yates, 1963; Howell and Archer, 1984). The most severe disruptions in speech occur at a feedback delay of about 200 ms, with less disruption at shorter and longer delays. Interestingly, DAF has fluency-enhancing effects on stutterers (e.g., Bloodstein, 1995). Taken together, this suite of effects has been interpreted as evidence for timing malfunctions in a closed-loop feedback circuit, which controls ongoing vocal production via AF (e.g., Fairbanks, 1954; Chase, 1965).
There is very little DAF work in animals. Recent work in songbirds has examined the effects of manipulating AF by playing either altered or delayed song during vocal production in zebra finches (Leonardo and Konishi, 1999; Cynx and Von Rad, 2001; Sakata and Brainard, 2006) resulting in dramatic effects on song. Birds show song syllable repetition, syllable deletion, and loss of syllable sequencing and structure under these conditions. The most severe disruptions caused by DAF occurred at delays of 100 ms in zebra finches (Cynx and Von Rad, 2001) and about 65 ms in Bengalese finches (Sakata and Brainard, 2006).
Here the authors apply new methods to a nonsongbird species to further examine how altered AF affects vocal production in birds and also to provide insights into the operation of auditory-vocal circuits in budgerigars. First, vocal behavior was rigorously controlled by training birds, through operant conditioning with food reward, to produce specific vocalizations (i.e., contact calls) to a visual cue. Second, birds were tested while wearing small earphones allowing more precise delivery of noise or altered AF. In experiment 1 (Lombard effect), the authors played various levels of white noise to subjects through the earphones while they were vocalizing. In experiment 2 (Fletcher effect), the authors played amplitude-adjusted exemplars of the birds’ contact calls through the earphones as the birds were producing the same vocalizations. In experiment 3, the authors examined the effect of spectrotemporal alterations of AF (e.g., DAF, reversing the bird’s call) on vocal production.
GENERAL METHODS
Subjects
The subjects in all three experiments were three adult male budgerigars from a colony maintained in an aviary at the University of Maryland. Each bird was separately caged and had ad libitum access to water. Since food was used to reinforce vocal behavior, the birds were maintained at 90% of their free-feeding body weight. The University of Maryland Animal Care and Use Committee approved all experimental procedures.
Apparatus
Birds were trained in an operant testing apparatus consisting of a small wire-mesh cage (14×12×17 cm) mounted in an acoustic isolation chamber (Industrial Acoustic Co. model AC-1). Three light-emitting diodes (LEDs) (left, center, and right) were attached to a piece of anechoic foam on the front panel of the cage at approximately the level of the birds’ heads. Three small speakers (SONY model MDR-Q22LP) were mounted on the exterior of the cage: one at the center above the front LED panel and one on each of the left and right sides. A small directional microphone (SONY model ECM-77B) located just below the LED panel detected vocalizations. A food hopper containing hulled millet was located on the floor of the cage under the front LED panel. A small video camera was used to monitor the bird’s behavior while in the chamber.
Training/testing procedure and analysis
Contact call detection and analysis
Training, testing, and analysis programs were written in MATLAB software (version 6.5, Natick, MA) for Tucker Davis Technologies (TDT) System III hardware (Gainesville, FL). The output of the microphone was amplified, low-pass filtered at 10 kHz, and sent to a circular memory buffer in a TDT real-time digital signal processor (RP2.1) at a sampling rate of 25 kHz. A typical budgerigar contact call duration is 100–150 ms with spectral energy concentrated between 2 and 4 kHz (Farabaugh et al., 1994; Farabaugh and Dooling, 1996; Farabaugh et al., 1998). Thus, incoming signals were classified as contact calls if signal intensity exceeded a user-defined value for a minimum of 70 ms and signal power in the frequency band between 2 and 4 kHz exceeded that between 4 and 10 kHz.
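The two-part classification rule above (a minimum time above an intensity threshold, plus more power at 2–4 kHz than at 4–10 kHz) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB/TDT implementation. The threshold value, the 5 ms framing, the 250 Hz probe spacing, the use of summed rather than mean band power, and all function names are assumptions made for the example.

```python
import math

FS = 25000  # nominal sampling rate stated in the text (Hz)

def band_power(x, fs, f_lo, f_hi, step=250.0):
    """Summed power at probe frequencies in [f_lo, f_hi), by direct DFT sums.
    Probe spacing and the use of a sum (not a mean) are assumptions."""
    n = len(x)
    total = 0.0
    f = f_lo
    while f < f_hi:
        w = 2.0 * math.pi * f / fs
        re = sum(s * math.cos(w * i) for i, s in enumerate(x))
        im = sum(s * math.sin(w * i) for i, s in enumerate(x))
        total += (re * re + im * im) / (n * n)
        f += step
    return total

def longest_run_ms(x, fs, threshold, frame_ms=5.0):
    """Longest stretch (ms) of consecutive frames whose RMS exceeds threshold."""
    frame = int(fs * frame_ms / 1000.0)
    best = run = 0
    for start in range(0, len(x) - frame + 1, frame):
        chunk = x[start:start + frame]
        rms = math.sqrt(sum(s * s for s in chunk) / frame)
        run = run + 1 if rms > threshold else 0
        best = max(best, run)
    return best * frame_ms

def is_contact_call(x, fs=FS, threshold=0.05, min_ms=70.0):
    """Apply both criteria: duration above threshold, then the band comparison."""
    if longest_run_ms(x, fs, threshold) < min_ms:
        return False
    return band_power(x, fs, 2000.0, 4000.0) > band_power(x, fs, 4000.0, 10000.0)

def tone(freq, dur_ms, fs=FS, amp=0.5):
    """Synthetic test tone (hypothetical stand-in for a recorded signal)."""
    return [amp * math.sin(2.0 * math.pi * freq * i / fs)
            for i in range(int(fs * dur_ms / 1000.0))]
```

With these assumed settings, a 100 ms tone at 3 kHz passes both criteria, whereas a 100 ms tone at 6 kHz fails the band comparison and a 40 ms tone at 3 kHz fails the duration requirement.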
All signals classified as contact calls were saved for later analysis. Analysis first involved the generation of serial power spectra across each call in 5 ms (i.e., 122 pt) windows (with 50% window overlap) using a chirp-z transform spectral estimation method (MATLAB function CZT), which allowed 1 Hz frequency resolution. Within each window, the authors measured the frequency and amplitude of the spectral peak and also the spectral bandwidth 3 dB down from the peak. These measures were then averaged to derive the average peak frequency, peak amplitude, and 3 dB bandwidth across the call. Finally, the authors measured the call duration and calculated the similarity to the stored standard or template call (see Secs. 3 and 4 for a description of the template and correlation algorithm). These measures were later analyzed using SPSS software (version 12.0, Chicago, IL).
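The windowed measurement described above can be illustrated with a short sketch. It uses direct DFT sums on a coarse frequency grid as a stand-in for the chirp-z transform, so it does not reproduce the 1 Hz resolution of the original analysis. The sampling rate of about 24.4 kHz is an assumption, chosen because it makes a 5 ms window equal 122 points as stated in the text. The grid limits, grid step, and function names are likewise hypothetical.

```python
import math

FS_ASSUMED = 24414.0625  # assumption: rate at which 5 ms corresponds to 122 points

def window_spectrum(frame, fs, freqs):
    """Linear power of one frame at each probe frequency (direct DFT sums)."""
    out = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        re = sum(s * math.cos(w * i) for i, s in enumerate(frame))
        im = sum(s * math.sin(w * i) for i, s in enumerate(frame))
        out.append(re * re + im * im)
    return out

def analyze_call(x, fs=FS_ASSUMED, win_ms=5.0, f_lo=1500.0, f_hi=4500.0, step=20.0):
    """Return (average peak frequency, average 3 dB bandwidth) across windows."""
    win = int(fs * win_ms / 1000.0)   # 122 points at the assumed rate
    hop = win // 2                    # 50% window overlap
    freqs = [f_lo + k * step for k in range(int((f_hi - f_lo) / step) + 1)]
    peaks, bandwidths = [], []
    for start in range(0, len(x) - win + 1, hop):
        p = window_spectrum(x[start:start + win], fs, freqs)
        k = max(range(len(p)), key=p.__getitem__)   # index of the spectral peak
        half = p[k] / 2.0                           # 3 dB down is half power
        lo = k
        while lo > 0 and p[lo - 1] >= half:
            lo -= 1
        hi = k
        while hi < len(p) - 1 and p[hi + 1] >= half:
            hi += 1
        peaks.append(freqs[k])
        bandwidths.append(freqs[hi] - freqs[lo])
    n = len(peaks)
    return sum(peaks) / n, sum(bandwidths) / n
```

For a steady 3 kHz tone the average peak frequency returned is close to 3 kHz. The bandwidth in that case reflects mainly the 5 ms analysis window rather than the signal, which is worth keeping in mind when interpreting bandwidths of short windows.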
Initial training (shaping)
Birds were habituated to the experimental chamber and trained to eat from the food hopper when it was activated. Once the birds consistently ate from the raised hopper, shaping of vocal production began. A tape recording of a flock of vocalizing budgerigars was played in the operant chamber to induce the birds to vocalize. Whenever the birds responded to this flock tape with a contact call, the experimenter activated the hopper. Birds quickly came to associate vocalizing in the test chamber with delivery of food. The flock tape playback became unnecessary after several training sessions as the bird reliably produced calls to obtain food and food reinforcement was delivered automatically under computer control.
Birds were next trained to vocalize only when the center LED was illuminated. The LED was turned off once a vocalization was acquired and then turned on again after a random time interval (approximately 5–15 s). Only vocalizations produced when the light was illuminated were reinforced. Vocalizations produced when the LED was turned off caused the random interval timer to reset and thus increased the wait time before another trial could be started (i.e., the LED turned back on). Birds quickly learned to vocalize in the chamber only in response to the illumination of the center LED.
Selecting the contact call template
Once the birds were reliably responding, they were run in several additional training sessions to establish their call repertoire. Budgerigars produce several different call-types that are easily distinguished based on spectrographic characteristics, with one call-type typically being produced much more often than others (Farabaugh et al., 1994; Farabaugh and Dooling, 1996). An exemplar of the bird’s most frequent contact call-type was selected as that bird’s standard or template call (see Manabe and Dooling, 1997).
The template call was chosen by first computing pair-wise spectral cross-correlations among all calls in the test sessions. A custom MATLAB program created a spectrogram for each call using a 256-point (i.e., 10.5 ms) Hanning window with 50% window overlap. Spectrograms were compared using two-dimensional cross-correlation (MATLAB function XCORR2), resulting in a series of correlation values representing all possible temporal offsets between the two spectrograms. The maximum correlation value was taken as the similarity index between the two calls and was normalized to r=0.0 if the two calls were perfectly dissimilar and r=1.0 if the calls were identical. The resultant similarity matrix was analyzed using a multidimensional scaling algorithm (MATLAB function MDSCALE) and the call in the center of the largest cluster in this two-dimensional space was selected as the template call for the next phase of training.
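The similarity index can be sketched as the peak of a normalized cross-correlation taken over temporal offsets. The sketch below is a simplified stand-in for the XCORR2-based procedure: it slides one spectrogram against the other in time only, and it normalizes by the geometric mean of the two spectrogram energies, which is an assumed normalization that yields 1.0 for identical calls and 0.0 for calls with no overlapping energy. The function name and the list-of-frames data layout are hypothetical.

```python
import math

def similarity_index(spec_a, spec_b):
    """Peak normalized cross-correlation over time offsets.

    Each spectrogram is a list of time frames; each frame is a list of
    non-negative magnitudes with the same number of frequency bins.
    """
    energy_a = sum(v * v for frame in spec_a for v in frame)
    energy_b = sum(v * v for frame in spec_b for v in frame)
    if energy_a == 0.0 or energy_b == 0.0:
        return 0.0
    norm = math.sqrt(energy_a * energy_b)
    best = 0.0
    # Try every temporal offset at which the two spectrograms overlap.
    for lag in range(-(len(spec_b) - 1), len(spec_a)):
        acc = 0.0
        for j, frame_b in enumerate(spec_b):
            i = j + lag
            if 0 <= i < len(spec_a):
                acc += sum(a * b for a, b in zip(spec_a[i], frame_b))
        best = max(best, acc / norm)
    return best
```

Applying this function to every pair of calls would give a similarity matrix of the kind that was then passed to multidimensional scaling; that step is not shown here.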
Training vocal precision
Subsequent training sessions used the template call described above to differentially reinforce the bird’s vocal behavior by only reinforcing calls that were similar to the template call. Every vocalization produced by a bird was compared to the stored digital template in real-time and the bird was reinforced if the correlation between the two calls exceeded an experimenter-defined criterion. At first, the criterion correlation value was set very low (e.g., r=0.01) so that all calls were reinforced. The criterion was gradually increased over several sessions to a maximum value of r=0.70. All training sessions were terminated after 50 reinforcements or 25 min, whichever came first. Birds almost always completed 50 reinforcements within 25 min (>95% of sessions). Subjects were tested in two daily sessions, 5 days/week. All test sessions were separated by at least 3 h.
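The reinforcement rule and session limits described above can be written out as a small sketch. The termination values (50 reinforcements, 25 min) and the criterion ceiling (r=0.70) come from the text. The per-session criterion step of 0.05 is a hypothetical value, since the text says only that the criterion was raised gradually, and the function names are illustrative.

```python
def run_session(calls, criterion, max_reinforcements=50, max_minutes=25.0):
    """Simulate one training session.

    calls: iterable of (time_in_minutes, correlation_with_template) pairs.
    Returns (number of reinforced calls, reason the session ended).
    """
    reinforced = 0
    for time_min, r in calls:
        if time_min >= max_minutes:
            return reinforced, "time limit"
        if r > criterion:  # reinforce only calls exceeding the criterion
            reinforced += 1
            if reinforced == max_reinforcements:
                return reinforced, "reinforcement limit"
    return reinforced, "calls exhausted"

def next_criterion(current, step=0.05, ceiling=0.70):
    """Raise the criterion for the next session (step size is an assumption)."""
    return min(ceiling, round(current + step, 2))
```

A bird producing calls that correlate at r=0.8 with the template every 12 s would end such a session on the reinforcement limit, whereas a bird whose calls stay at r=0.5 under a criterion of 0.70 would run to the time limit without reward.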
Earphones
Surgical procedure and earphone construction
Once the birds were trained, a small, stainless steel head post (jewelry pin with clutch back, Hirschberg Schutz & Co., Inc., model no. JC8425–01) was affixed to each bird’s skull. First, the animal was weighed and given an intramuscular injection of ketamine (40 mg/kg)/xylazine (20 mg/kg). The toe pinch response was used to determine whether the bird was properly anesthetized for surgery. Next, the superior aspect of the skull was exposed using a No. 11 scalpel blade and Vannas scissors (Fine Science Tools, Foster City, CA). The skull surface was abraded using the scalpel to create better adhesion before the head post was attached using dental cement (A-M Systems Inc.). Nexaband (Closure Medical Corporation, Raleigh, NC) was used to seal the incision, and the bird was placed in a heated therapy unit for monitoring until the anesthetic effects had worn off. Birds were monitored for 24–48 h following surgery and a non-narcotic, non-steroidal analgesic (Flunixin meglumine, 10 mg/kg) was administered daily during this recovery period.
Following recovery, birds were fitted with an earphone assembly. The earphone frame was constructed using thin steel wire (1 mm diameter) with small rubber grommets (10 mm diameter) as earphone cushions. A transducer (Knowles Acoustics, model no. EH-3062) was glued to the interior of each grommet using commercially available silicone sealant. These transducers have a frequency response from 0.2 to 8 kHz with peak sensitivities between 2 and 5 kHz. When affixed to the head post, the transducers were aligned directly at the opening of the bird’s ear canal. The grommets pressed lightly against the sides of the bird’s head, providing some attenuation of external sounds. During testing, wires from the earphones were fed through the ceiling of the operant chamber to the output amplifiers of the TDT hardware system. The birds were able to move around freely while wearing the earphones during a test session. The earphones were attached to the head post prior to testing and removed once the test session was complete, before the bird was returned to the aviary.
Earphone calibration and testing
The sound pressure level of the feedback was measured both before and after the experiment with a Larson-Davis model 824 sound level meter and a 1/4 in. microphone on a 3-m extension cable. The microphone was placed inside a custom-made open adaptor, which approximated the diameter of the bird’s auditory meatus and the distance to the bird’s tympanic membrane from the transducer.
After being fitted with earphones, each bird was tested in several training sessions to ensure that performance was not affected by the surgery, wearing earphones, or the presence of wiring above the bird’s head. No sounds were delivered through the earphones during these sessions and all birds achieved and maintained a reinforcement rate greater than 90% within five sessions after being reintroduced into the testing environment following surgery. The possibility of occlusion effects was considered unlikely because the earphones were not tightly pressed against the head, the energy in the contact calls was at relatively high frequencies between 2 and 4 kHz, and budgerigars have an interaural pathway that connects their middle ears.
EXPERIMENT 1: LOMBARD EFFECT
The Lombard effect is well studied in humans and has been shown in a number of nonhuman animals, including budgerigars. Here the authors examined the Lombard effect in budgerigars with noise delivered through earphones over a broad range of noise levels. In keeping with free field work (Manabe et al., 1998), the authors hypothesized that vocal amplitude would increase as the level of the noise delivered through the earphones increased. The authors also hypothesized, based on work with human speech, that increasing vocal amplitude would be accompanied by parallel increases in fundamental frequency and duration.
Methods
Subjects
Three adult male budgerigars were used in this experiment.
Procedure
Once the birds were trained to asymptotic levels of performance on the template-training task described above and were fitted with earphones, Gaussian white noise from the TDT System III RP2.1 hardware was delivered during testing. The feedback noise level was measured using a Larson-Davis model 824 sound level meter for 11 different noise levels [40–90 dB sound pressure level (SPL) in 5 dB steps, A-weighting, fast rms] at the bird’s ear.
Birds were tested in four sessions of 60 trials each. A trial was defined by a bird producing a single vocalization in response to an illuminated LED. All 11 noise levels (and a quiet condition) were presented across a test session in five-trial blocks. Noise level changes occurred after the completion of a given trial block and before the first trial of the next block. With the exception of quiet trials, noise was played constantly throughout a session (i.e., there were no silent intervals between noise level changes) regardless of whether the animal was vocalizing or not. The exact order of levels presented was randomly assigned prior to the start of each session.
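The session structure described above (11 noise levels plus a quiet condition, each presented as one five-trial block in random order, for 60 trials) can be sketched as follows. The function name and the use of `None` to mark the quiet condition are illustrative choices, not part of the original software.

```python
import random

def session_schedule(rng, trials_per_block=5):
    """Return the per-trial noise level for one session.

    Levels are 40-90 dB SPL in 5 dB steps plus a quiet condition (None).
    Each condition forms one contiguous block; block order is shuffled.
    """
    levels = [None] + list(range(40, 95, 5))  # 12 conditions in total
    rng.shuffle(levels)
    return [level for level in levels for _ in range(trials_per_block)]

# Example: a reproducible schedule for one session.
schedule = session_schedule(random.Random(1))
```

Passing a seeded `random.Random` instance keeps each session's order reproducible while still differing across sessions when the seed changes.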
All vocalizations produced during the experiment were stored digitally and analyzed off-line using a custom MATLAB signal analysis program. Analysis involved a two-step process in which calls were first sorted by noise level across sessions followed by an acoustic analysis of calls within each noise level. Acoustic measures included average peak frequency, average amplitude, duration, similarity to the template, and 3 dB bandwidth.
Results and discussion of experiment 1
Figure 1 shows the increase in call amplitude and call frequency as a function of noise level. The mean level of all calls produced by the birds significantly increased by 7.8 dB SPL as the level of the noise feedback increased from 40 to 90 dB SPL (one-way repeated measures analysis of variance [henceforth RM ANOVA]; F[11,649]=64.2, p<0.01). Average peak frequency increased by about 84 Hz across the same range of noise levels (one-way RM ANOVA; F[11,649]=9.75, p<0.01). Call duration also significantly increased with increasing noise level (one-way RM ANOVA; F[11,649]=4.19, p<0.01). These results show that budgerigars increase fundamental frequency and call length in response to increases in ambient noise levels. Humans and other primates also show an increase in vocal frequency and syllabic length when producing Lombard speech (Lane and Tranel, 1971; Summers et al., 1988; Brumm et al., 2004; Egnor and Hauser, 2006).
Noise level was significantly and inversely related to both the similarity of the vocalization to the call template (one-way RM ANOVA; F[11,649]=2.42, p<0.01) and 3 dB bandwidth (one-way RM ANOVA; F[11,649]=4.18, p<0.01). The decrease in correlation values indicates that vocalizations produced in noise also contained structural changes in the call. These changes may be related to the decrease in call bandwidth, which itself may reflect a strategy to increase call detectability in noise. Prior work on the perception of vocalizations in noise by budgerigars and zebra finches has shown that for the same overall level, narrow band vocalizations are more easily detected in noise than wide band vocalizations (Lohr et al., 2003).
EXPERIMENT 2: FLETCHER EFFECT
Experiment 2 examined the effect of a level increase in AF on production amplitude. The Fletcher effect, which as far as the authors know has only been reported for humans, is a decrease in vocal amplitude in response to an increase in perceived vocal loudness. As in the previous experiment, birds were trained to produce a specific call in the operant environment. Based on results from human studies and also from experiment 1, the authors hypothesized that (1) vocal amplitude would decrease as the level of the AF increased and (2) there would be concomitant decreases in both average vocal frequency and call duration.
Method
Subjects
The three birds from experiment 1 were used in this experiment.
Procedure
Each bird’s stored call template was used as the feedback stimulus and delivered through the earphones while the bird vocalized. The template call was used instead of the bird’s actual vocalizations to ensure that the feedback was delivered at specific levels and was not dependent on scaling a bird’s own vocal production (which itself was expected to vary in amplitude during a session as the feedback level was changed). Observations from training sessions show that these calls tend to be highly stereotyped under operant control, with a standard deviation under 5 ms in duration and under 3 dB in amplitude within a test session. Thus, the template call functioned satisfactorily as a surrogate for the bird’s actual production.
The template was stored in a memory buffer in the TDT RP2.1 and delivered through the earphones when a vocalization was detected at the microphone. The authors used four different feedback levels (i.e., quiet, 70, 80, and 90 dB SPL, A-weighting, fast rms) that were measured using a Larson-Davis model 824 sound level meter. These values were chosen based on the level at which the birds typically vocalized in the operant environment, which was about 70 dB SPL. Birds likely also heard their own vocalizations in each of these conditions through both air [due to the lack of complete occlusion by the earphones (see Sec. 2D above)] and bone conduction.
Birds were run in two sessions of 40 trials each. A trial was defined by a single vocalization produced in response to an illuminated LED. Feedback levels were presented in five-trial blocks and each trial block was tested twice per session. The order of the trial blocks was randomly assigned prior to each session. All vocalizations were stored digitally and analyzed off-line using a MATLAB signal analysis program. Analysis involved a two-step process in which calls were first sorted by feedback level followed by an acoustic analysis of calls within each feedback level. Acoustic measures included average peak frequency, average amplitude, duration, similarity to the template, and 3 dB bandwidth.
Results and discussion of experiment 2
Figure 2 shows the decrease in call amplitude and call frequency as a function of feedback level. The mean level of all calls produced by the birds significantly decreased by 3.74 dB between the quiet condition and the 90 dB SPL feedback condition (one-way RM ANOVA; F[3,177]=38.5, p<0.01). Mean call frequency for the three birds significantly decreased by 59.1 Hz across feedback levels (one-way RM ANOVA; F[3,177]=13.4, p<0.01). Call duration also significantly decreased as feedback level increased (one-way RM ANOVA; F[3,177]=2.74, p<0.05). There were no significant differences in either the similarity to the template call or 3 dB bandwidth.
The authors also looked at the time course of these changes by comparing the amplitude and frequency of the first two calls produced after the feedback level changed. There were no differences in amplitude (F[3,6]=1.72, p=0.26) or frequency (F[3,6]=1.08, p=0.43) across feedback levels on the first trial, but calls were significantly different across feedback levels in both amplitude (F[3,6]=5.19, p<0.05) and frequency (F[3,6]=5.12, p<0.05) on the second feedback trial. Thus, budgerigars do not make online adjustments to either the amplitude or frequency of these short contact calls but rather adjust both frequency and amplitude on the subsequent vocalization.
EXPERIMENT 3: DELAYED AND ALTERED AUDITORY FEEDBACK
Here the authors tested the effect of altering AF of the bird’s own vocalization on vocal production. There were three altered feedback conditions: (1) DAF of the bird’s own vocalizations (“DAF” condition), (2) a temporally-reversed version of the birds’ template call (“reversed” condition), and (3) another bird’s call as the altered feedback stimulus (“other” condition). Based on prior work in humans and songbirds, the authors hypothesized that altered feedback would disrupt normal call production by inducing changes in pitch, duration, and other spectrotemporal aspects of the call.
Method
Subjects
The three birds from experiments 1 and 2 were used in this experiment.
Procedure
Birds were trained with the same methods as in the previous two experiments to produce contact calls that met a spectral cross-correlation criterion to the template of at least r=0.70. Then, in three conditions, altered feedback was presented through the earphones whenever a vocalization was detected at the microphone. In the DAF condition, the altered feedback stimulus was the bird’s own vocalization from the incoming microphone signal delayed in time by 0, 25, 50, 75, or 100 ms. In the reversed and other conditions, the feedback stimulus was a reversed version of the bird’s template call or the contact call of another bird, respectively. These last two stimulus types were stored in a memory buffer and presented with a 0 ms delay. The level of the altered feedback stimuli was calibrated to 70 dB SPL (typical of birds vocalizing in the operant chamber) using a Larson-Davis model 824 sound level meter.
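Delaying a live microphone signal by a fixed interval is commonly done with a circular buffer whose length equals the delay in samples. The sketch below illustrates that idea under the assumption of the nominal 25 kHz sampling rate given in the general methods; it is not the TDT signal-processing circuit used in the experiment, and the class name is hypothetical.

```python
class DelayLine:
    """Fixed delay implemented with a circular buffer (illustrative only)."""

    def __init__(self, delay_ms, fs=25000):
        # Convert the delay to a whole number of samples,
        # e.g. 25 ms at 25 kHz is 625 samples.
        self.n = round(delay_ms * fs / 1000)
        self.buf = [0.0] * self.n
        self.pos = 0

    def process(self, sample):
        """Store the incoming sample and return the one from n samples ago."""
        if self.n == 0:
            return sample  # 0 ms condition: feedback is passed straight through
        out = self.buf[self.pos]
        self.buf[self.pos] = sample
        self.pos = (self.pos + 1) % self.n
        return out
```

Under this assumed rate the five DAF conditions (0, 25, 50, 75, and 100 ms) correspond to buffer lengths of 0, 625, 1250, 1875, and 2500 samples.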
Birds were run in two sessions for each feedback type (i.e., five DAF delay values, reversed, and other) for a total of 14 sessions. Each session comprised 70 trials: 10 altered feedback trials and 60 non-altered feedback trials. The ten altered feedback trials were randomized within each session so that one altered feedback trial was presented for every three to eight non-altered feedback trials. Also, the order of the feedback types was randomized across testing sessions. All birds completed the full 70 trials for all sessions.
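One way to place 10 altered feedback trials among 70 so that each is preceded by three to eight non-altered trials is to draw the gaps at random and redraw whenever they do not fit. The sketch below shows this approach. The rejection-sampling strategy, the treatment of any leftover non-altered trials as an unconstrained run at the end of the session, and the function name are all assumptions, since the text does not describe how the randomization was implemented.

```python
import random

def altered_trial_positions(rng, n_altered=10, n_total=70, gap_min=3, gap_max=8):
    """Return 1-based trial numbers of the altered feedback trials."""
    n_plain = n_total - n_altered
    # Redraw until the gaps fit within the available non-altered trials.
    while True:
        gaps = [rng.randint(gap_min, gap_max) for _ in range(n_altered)]
        if sum(gaps) <= n_plain:
            break
    positions, trial = [], 0
    for gap in gaps:
        trial += gap + 1  # 'gap' non-altered trials, then one altered trial
        positions.append(trial)
    return positions

# Example: positions for one session with a fixed seed.
positions = altered_trial_positions(random.Random(0))
```

Any non-altered trials not used by the gaps simply follow the last altered trial, so the session still totals 70 trials.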
Vocalizations were stored digitally and analyzed off-line using a MATLAB signal analysis program. Calls were first sorted by trial type (i.e., pre-altered feedback trial, altered feedback trial, first trial post-altered feedback, etc.) and then acoustic measures—including peak fundamental frequency, amplitude, duration, and similarity to the template—were calculated to compare calls across different trial and stimulus types.
Results and discussion of experiment 3
The three birds showed disruptions in vocal behavior in all three altered feedback conditions. The most obvious finding was that birds often produced calls that fell below a spectral cross-correlation to the template of r=0.70, which was the minimum value required for reward during training sessions. These calls were labeled errors. Most of these errors (83.0%) occurred within the first two calls after an altered feedback trial and most of those (77.3%) occurred on the first call following altered feedback presentation. No errors occurred during the altered feedback trials.
Figure 3 shows the error rate on the first trial following the altered AF trial across the three conditions. Under the DAF condition, error rates differed as a function of delay length (one-way RM ANOVA; F[4,8]=24.01, p<0.001). As in humans, there was a maximally disruptive delay that resulted in the most errors, which for budgerigars was 25 ms. Both the reversed and other conditions produced statistically indistinguishable error rates of 16.7% and 15.0%, respectively (paired samples t-test; t[2]=−2.00, p=0.18), and the average error rate between these two conditions was significantly lower than in the 25 ms DAF condition (paired samples t-test; t[2]=5.20, p<0.05).
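The two paired comparisons above have 2 degrees of freedom (three birds), a case in which the two-tailed p-value of Student's t has the closed form p = 1 − |t|/√(2 + t²). The short check below applies that formula to the reported t values. It is offered as a sanity check on the reported p-values, not as a reproduction of the original SPSS analysis, and the function name is illustrative.

```python
import math

def two_tailed_p_df2(t):
    """Two-tailed p-value for Student's t with exactly 2 degrees of freedom."""
    return 1.0 - abs(t) / math.sqrt(2.0 + t * t)

p_reversed_vs_other = two_tailed_p_df2(-2.00)  # about 0.18
p_versus_daf_25ms = two_tailed_p_df2(5.20)     # about 0.035
```

Both values agree with the reported p=0.18 and p<0.05.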
Calls classified as errors fell into two general categories based on visual inspection of spectrographic characteristics. The first type of error occurred when the bird produced a call-type different from the template call (error I). This type of error represented 62.3% of all errors. The second kind of error (error II) was less common (37.7%) and occurred when the bird produced the same call-type as the template call but with new or missing acoustic elements. An example of each of these types of errors is shown in Fig. 4. Overall, there were 180 feedback trials resulting in a total of 88 errors of all types.
GENERAL DISCUSSION
In three experiments the authors show that budgerigar vocalizations are affected by real-time AF. Experiments 1 and 2 showed that these birds exhibit the Lombard and Fletcher effects, increasing vocal amplitude in the presence of background noise and decreasing vocal amplitude when the AF level of their calls is increased. Vocal amplitude increased linearly 1 dB for every 5 dB change in AF loudness. This is a shallower slope than found in humans, where a 1 dB change in vocal amplitude occurs for each 2–3 dB change in AF loudness (see Lane and Tranel, 1971). The difference in slope between humans and budgerigars may be a function of the natural vocal level range of each species. That is, humans can typically speak across a range of more than 50 dB (from whisper to shout), whereas budgerigars probably have a much smaller range.
The authors also showed that these amplitude changes are accompanied by changes in contact call frequency, duration, bandwidth, and other acoustic characteristics that are also correlated with vocal effort in speech (e.g., Traunmüller and Eriksson, 2000). In humans, an increase in vocal effort results in increased amplitude, duration, and pitch of speech, while a decrease in vocal effort has the opposite effects. Like humans, birds also produce sound driven by respiratory airflow through a set of vibrating structures in the sound producing organ (e.g., the syrinx in birds and the larynx in humans) (Fletcher and Tarnopolsky, 1999; Larsen and Goller, 1999; 2002). From the pattern of vocal changes occurring during altered feedback trials, budgerigars appear to alter their vocal effort, in a manner analogous to humans, by altering the velocity of the air passing through the syringeal membranes without changing membrane tension (e.g., Heaton et al., 1995; Brittan-Powell et al., 1997). Changes such as increased amplitude and duration and decreased bandwidth all have the practical effect of increasing audibility in noise (Lane and Tranel, 1971; Summers et al., 1988; Lohr et al., 2003).
Experiment 3 also showed that budgerigars are affected by other alterations of AF. Delaying or reversing the vocalizations, or presenting another bird’s contact call as AF, resulted in a range of production errors. These errors were similar to those reported in humans and songbirds (e.g., Yates, 1963; Leonardo and Konishi, 1999; Cynx and Von Rad, 2001; Sakata and Brainard, 2006) and included changes in peak frequency, amplitude, and duration. Errors were generally of two types: either production of a different call-type or production of a call with additions or omissions of elements. Interestingly, these errors occurred in subsequent vocalizations but never during the altered feedback trial. This result is consistent with those from experiment 2 (Fletcher effect) showing that amplitude adjustments do not occur in real-time but instead occur on the subsequent call. Taken together, these results on budgerigars show both similarities and differences with analogous behavioral results in humans.
In both humans and birds, the physiological mechanisms underlying altered feedback effects remain obscure. In humans, recent work shows evidence for vocalization-induced suppression of auditory cortex neural activity during ongoing speech (e.g., Houde et al., 2002). Several functional magnetic resonance imaging (fMRI) studies have shown that altered AF, including DAF, activates areas in and around auditory cortex, superior temporal lobe, and planum temporale within 100–130 ms during speech (Hashimoto and Sakai, 2003; Guenther, 2006). Similar patterns of excitation and suppression have been described more recently in nonhuman primates and may result from common mechanisms (e.g., Eliades and Wang, 2008).
In songbirds, vocal errors appear to be processed in forebrain premotor areas. For example, Sakata and Brainard (2006) showed that DAF of single song syllables provided to Bengalese finches at delays ranging from 40 to 65 ms was most effective in generating vocal errors, although errors occurred with delays as short as 20 ms. Errors did not occur within a syllable, but in subsequent syllables. The authors therefore speculated that the feedback signal is processed in the forebrain premotor nucleus HVC, which is involved in syllable sequencing. Consistent with this hypothesis, a more recent study has shown that HVC contains a population of neurons that are activated by both hearing a song and producing the same song, which could serve an important role in error-correction processes (Prather et al., 2008).
The neural underpinnings of budgerigar vocal feedback control are much less well understood than those in either humans or songbirds. The budgerigar nucleus NLc, a telencephalic vocal motor region possibly analogous to songbird HVC, responds to auditory input within about 100 ms (Plummer and Striedter, 2000) and projects to striatal structures responsible for learning new contact calls (Striedter, 1994; Brauth et al., 1997). If this nucleus is functionally similar to songbird HVC, it might also contain a population of neurons responsible for comparing actual and expected feedback, and could be responsible for selecting the correct contact call prior to production. The fact that NLc does not receive AF information until 100 ms after vocal onset might explain the bird’s inability to use altered feedback to correct vocal errors online for short sounds, since budgerigar contact calls typically last 100–150 ms (Farabaugh and Dooling, 1996). This information could be used to adjust subsequent vocalizations, however, which is consistent with the behavioral data described here. In addition, if NLc guides selection of the correct circuitry underlying call production, this may also explain why DAF results in erroneous call selection or the production of alternate call-types. Results of the present behavioral experiments are at least consistent with such a function for budgerigar NLc.
The fact that altered AF in budgerigars affects subsequent calls rather than the ongoing vocalization is different from what has been found in humans and songbirds (e.g., Yates, 1963; Leonardo and Konishi, 1999; Cynx and Von Rad, 2001; Sakata and Brainard, 2006). In part, this may be because budgerigar contact calls are so short that feedback mechanisms do not have time to engage. Vocal amplitude adjustments following altered feedback during sustained vowel production in humans do not occur until approximately 150–175 ms after feedback onset (Heinks-Maldonado and Houde, 2005; Bauer et al., 2006). Similarly, DAF effects in humans and songbirds occur at delays of about 200 ms (e.g., Yates, 1963; Howell and Archer, 1984) and 50–100 ms (Cynx and Von Rad, 2001; Sakata and Brainard, 2006), respectively. These values are about the duration of a typical budgerigar contact call (about 150 ms) and suggest that the physiological response to altered AF may require a minimum latency greater than the length of a call.
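The timing argument above can be made concrete with a small arithmetic sketch. The latency and call-duration values below are the approximate figures quoted in the text; the function names and the simple criterion (that some part of the call must remain after the response latency) are illustrative assumptions, not part of the experimental analysis, and the sketch ignores any additional time needed for motor output.

```python
# Illustrative timing comparison; all values are approximate figures quoted in the text.
CALL_DURATION_MS = (100, 150)  # typical budgerigar contact call, shortest to longest

# Approximate feedback-response latencies (ms) mentioned in the discussion.
LATENCY_MS = {
    "budgerigar NLc auditory response": 100,
    "human amplitude compensation (lower bound)": 150,
}


def remaining_window_ms(latency_ms: float, call_ms: float) -> float:
    """Time left in a call once feedback could first act; 0 if the call is already over."""
    return max(0.0, call_ms - latency_ms)


def online_correction_possible(latency_ms: float, call_ms: float) -> bool:
    """Hypothetical criterion: some part of the call must remain after the latency."""
    return remaining_window_ms(latency_ms, call_ms) > 0


if __name__ == "__main__":
    shortest, longest = CALL_DURATION_MS
    for label, latency in LATENCY_MS.items():
        print(
            f"{label}: {remaining_window_ms(latency, shortest):.0f} ms left in a "
            f"{shortest} ms call, {remaining_window_ms(latency, longest):.0f} ms left "
            f"in a {longest} ms call"
        )
```

Under these assumptions, a 100 ms latency leaves at most 50 ms of even the longest typical call, and a 150 ms latency leaves none, which is in line with the suggestion that corrections would mainly appear in the subsequent call.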
Another possibility is that contact calls are produced ballistically and cannot be modified once initiated. In contrast, zebra finches and tamarins will interrupt their vocal production in structured ways in response to a strobe light (e.g., Cynx, 1990; Miller et al., 2003). The types of errors budgerigars made in experiment 3 did not appear to be truncated contact calls, and all errors occurred after the trial with altered feedback. Instead, budgerigar vocal errors are better described as different, intact call-types. This suggests that incorrect AF might be disrupting selection of the correct motor program sequence that gives rise to the next call-type. In songbirds, onset delays from altered AF are about the average length of song syllables, which are themselves single vocal motor gestures and probably produced ballistically (Cynx, 1990; Riebel and Todt, 1997; Franz and Goller, 2002). While there is evidence that budgerigars learn new calls through a process of recombination and modification of smaller call elements (Farabaugh et al., 1994; Manabe and Dooling, 1997), the present results suggest instead that the entire call, rather than just the individual elements, is produced ballistically.
In sum, these results show that AF in budgerigars, as in humans and songbirds, is used to guide future vocal production. The authors measured changes in call amplitude, frequency, and duration that are consistent with the idea that budgerigar vocal production contains mechanisms for overcoming the masking effects of environmental noise. The authors also showed that temporally and spectrally misaligned feedback disrupts call production. The present results, in which altered AF affects subsequent calls, differ from both the human and songbird cases in which AF is used to make online adjustments to vocalizations as they are produced. These results could argue for an error-correction mechanism similar to that reported in humans and songbirds but which operates on a time scale greater than the length of a contact call. Alternatively, these results may indicate that different mechanisms are involved in AF-guided vocal production in these two different groups of vocal learners.
ACKNOWLEDGMENTS
The authors thank Elizabeth Brittan-Powell and Peter Marvit for their many helpful comments on the manuscript. They also thank Leah Dickstein for help in performing the experiments. This work was supported by NIH Grant Nos. DC-00198 and DC-04664 to R.J.D. and NIH Grant No. DC-006766 to M.S.O.
References
- Bauer, J. J., Mittal, J., Larson, C. R., and Hain, T. C. (2006). “Vocal responses to unanticipated perturbations in voice loudness feedback: An automatic mechanism for stabilizing voice amplitude,” J. Acoust. Soc. Am. 119, 2363–2371. doi:10.1121/1.2173513
- Bloodstein, O. (1995). A Handbook on Stuttering (National Easter Seal Society, Chicago, IL).
- Boughman, J. W., and Moss, C. F. (2003). “Social sounds: Vocal learning and development of mammal and bird calls,” in Acoustic Communication, Springer Handbook of Auditory Research, edited by Simmons A. M., Popper A. N., and Fay R. R. (Springer-Verlag, New York), pp. 138–224.
- Brainard, M. S., and Doupe, A. J. (2000). “Auditory feedback in learning and maintenance of vocal behaviour,” Nat. Rev. Neurosci. 1, 31–40. doi:10.1038/35036205
- Brauth, S. E., Heaton, J. T., Shea, S. D., Durand, S. E., and Hall, W. S. (1997). “Functional anatomy of forebrain vocal control pathways in the budgerigar (Melopsittacus undulatus),” Ann. N.Y. Acad. Sci. 807, 368–385. doi:10.1111/j.1749-6632.1997.tb51933.x
- Brittan-Powell, E. F., Dooling, R. J., Larsen, O. N., and Heaton, J. T. (1997). “Mechanisms of vocal production in budgerigars (Melopsittacus undulatus),” J. Acoust. Soc. Am. 101, 578–589. doi:10.1121/1.418121
- Brumm, H., and Todt, D. (2002). “Noise-dependent song amplitude regulation in a territorial songbird,” Anim. Behav. 63, 891–897. doi:10.1006/anbe.2001.1968
- Brumm, H., Voss, K., Kollmer, I., and Todt, D. (2004). “Acoustic communication in noise: Regulation of call characteristics in a New World monkey,” J. Exp. Biol. 207, 443–448. doi:10.1242/jeb.00768
- Chase, R. (1965). “An information-flow model of the organization of motor activity I: Transduction, transmission, and central control of sensory information,” J. Nerv. Ment. Dis. 140, 239–251.
- Cynx, J. (1990). “Experimental-determination of a unit of song production in the zebra finch (Taeniopygia guttata),” J. Comp. Psychol. 104, 3–10. doi:10.1037/0735-7036.104.1.3
- Cynx, J., Lewis, R., Tavel, B., and Tse, H. (1998). “Amplitude regulation of vocalizations in noise by a songbird, Taeniopygia guttata,” Anim. Behav. 56, 107–113. doi:10.1006/anbe.1998.0746
- Cynx, J., and Von Rad, U. (2001). “Immediate and transitory effects of delayed auditory feedback on bird song production,” Anim. Behav. 62, 305–312. doi:10.1006/anbe.2001.1744
- Doupe, A. J., and Kuhl, P. K. (1999). “Birdsong and human speech: Common themes and mechanisms,” Annu. Rev. Neurosci. 22, 567–631. doi:10.1146/annurev.neuro.22.1.567
- Draegert, G. L. (1951). “Relationships between voice variables and speech intelligibility in high level noise,” Speech Monogr. 18, 272–278.
- Dreher, J. J., and O’Neill, J. J. (1958). “Effects of ambient noise on speaker intelligibility of words and phrases,” Laryngoscope 68, 539–548. doi:10.1288/00005537-195803000-00032
- Egnor, S. E. R., and Hauser, M. D. (2006). “Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus),” Am. J. Primatol. 68, 1183–1190. doi:10.1002/ajp.20317
- Eliades, S. J., and Wang, X. (2008). “Neural substrates of vocalization feedback monitoring in primate auditory cortex,” Nature (London) 453, 1102–1106. doi:10.1038/nature06910
- Fairbanks, G. (1954). “Systematic research in experimental phonetics I: A theory of the speech mechanism as a servosystem,” J. Speech Hear Disord. 19, 133–139.
- Fairbanks, G. (1955). “Selective vocal effects of delayed auditory feedback,” J. Speech Hear Disord. 20, 333–346.
- Farabaugh, S., and Dooling, R. J. (1996). “Acoustic communication in parrots: Laboratory and field studies of budgerigars, Melopsittacus undulatus,” in Ecology and Evolution of Acoustic Communication in Birds, edited by Kroodsma D. E. and Miller E. H. (Cornell University Press, Ithaca, NY), pp. 97–118.
- Farabaugh, S. M., Dent, M. L., and Dooling, R. J. (1998). “Hearing and vocalizations in native Australian budgerigars (Melopsittacus undulatus),” J. Comp. Psychol. 112, 74–81. doi:10.1037/0735-7036.112.1.74
- Farabaugh, S. M., Linzenbold, A., and Dooling, R. J. (1994). “Vocal plasticity in budgerigars (Melopsittacus undulatus): Evidence for social factors in the learning of contact calls,” J. Comp. Psychol. 108, 81–92. doi:10.1037/0735-7036.108.1.81
- Fletcher, H., Raff, G. M., and Parmley, F. (1918). “Study of the effects of different sidetones in the telephone set,” Report No. 19412, Western Electric Company, New York, N.Y.
- Fletcher, N. H., and Tarnopolsky, A. (1999). “Acoustics of the avian vocal tract,” J. Acoust. Soc. Am. 105, 35–49. doi:10.1121/1.424620
- Franz, M., and Goller, F. (2002). “Respiratory units of motor production and song imitation in the zebra finch,” J. Neurobiol. 51, 129–141. doi:10.1002/neu.10043
- Guenther, F. H. (2006). “Cortical interactions underlying the production of speech sounds,” J. Commun. Disord. 39, 350–365. doi:10.1016/j.jcomdis.2006.06.013
- Hanley, T. D., and Steer, M. D. (1949). “Effect of level of distracting noise upon speaking rate, duration and intensity,” J. Speech Hear Disord. 14, 363–368.
- Hashimoto, Y., and Sakai, K. (2003). “Brain activations during conscious self-monitoring of speech production with delayed auditory feedback: An fMRI study,” Hum. Brain Mapp 20, 22–28. doi:10.1002/hbm.10119
- Heaton, J. T., Farabaugh, S. M., and Brauth, S. E. (1995). “Effect of syringeal denervation in the budgerigar (Melopsittacus undulatus): The role of the syrinx in call production,” Neurobiol. Learn Mem. 64, 68–82. doi:10.1006/nlme.1995.1045
- Heinks-Maldonado, T. H., and Houde, J. F. (2005). “Compensatory responses to brief perturbations of speech amplitude,” ARLO 6, 131–137. doi:10.1121/1.1931747
- Houde, J. F., Nagarajan, S. S., Sekihara, K., and Merzenich, M. M. (2002). “Modulation of the auditory cortex during speech: An MEG study,” J. Cogn Neurosci. 14, 1125–1138. doi:10.1162/089892902760807140
- Howell, P., and Archer, A. (1984). “Susceptibility to the effects of delayed auditory feedback,” Percept. Psychophys. 36, 296–302.
- Janik, V. M., and Slater, P. J. B. (1997). “Vocal learning in mammals,” in Advances in the Study of Behaviour, edited by Slater P. J. B., Rosenblatt J. S., Snowdon C. T., and Milinski H. (Academic, New York), pp. 59–99.
- Janik, V. M., and Slater, P. J. B. (2000). “The different roles of social learning in vocal communication,” Anim. Behav. 60, 1–11. doi:10.1006/anbe.2000.1410
- Kobayasi, K. I., and Okanoya, K. (2003). “Context-dependent song amplitude control in Bengalese finches,” NeuroReport 14, 521–524. doi:10.1097/00001756-200303030-00045
- Lane, H., and Tranel, B. (1971). “Lombard sign and role of hearing in speech,” J. Speech Hear. Res. 14, 677–709.
- Larsen, O. N., and Goller, F. (1999). “Role of syringeal vibrations in bird vocalizations,” Proc. R. Soc. London, Ser. B 266, 1609–1615. doi:10.1098/rspb.1999.0822
- Larsen, O. N., and Goller, F. (2002). “Direct observation of syringeal muscle functions in songbirds and a parrot,” J. Exp. Biol. 205, 25–35.
- Lee, B. S. (1950). “Effects of delayed speech feedback,” J. Acoust. Soc. Am. 22, 824–826. doi:10.1121/1.1906696
- Leonardo, A., and Konishi, M. (1999). “Decrystallization of adult birdsong by perturbation of auditory feedback,” Nature (London) 399, 466–470. doi:10.1038/20933
- Lohr, B., Wright, T. F., and Dooling, R. J. (2003). “Detection and discrimination of natural calls in masking noise by birds: Estimating the active space signal,” Anim. Behav. 65, 763–777. doi:10.1006/anbe.2003.2093
- Lombard, E. (1911). “Le signe de l’élévation de la voix (The characteristics of the elevation of the voice),” Ann. Maladies Oreille, Larynx, Nez, Pharynx (Annals of the Diseases of the Ear, Larynx, Nose, and Pharynx) 37, 101–119.
- Manabe, K., and Dooling, R. J. (1997). “Control of vocal production in budgerigars (Melopsittacus undulatus): Selective reinforcement, call differentiation, and stimulus control,” Behav. Processes 41, 117–132. doi:10.1016/S0376-6357(97)00041-7
- Manabe, K., Sadr, E. I., and Dooling, R. J. (1998). “Control of vocal intensity in budgerigars (Melopsittacus undulatus): Differential reinforcement of vocal intensity and the Lombard effect,” J. Acoust. Soc. Am. 103, 1190–1198. doi:10.1121/1.421227
- Miller, C. T., Flusberg, S., and Hauser, M. D. (2003). “Interruptibility of long call production in tamarins: Implications for vocal control,” J. Exp. Biol. 206, 2629–2639. doi:10.1242/jeb.00458
- Plummer, T. K., and Striedter, G. F. (2000). “Auditory responses in the vocal motor system of budgerigars,” J. Neurobiol. 42, 79–94.
- Potash, L. M. (1972). “Noise-induced changes in calls of the Japanese quail,” Psychonomic Sci. 26, 252–254.
- Prather, J. F., Peters, S., Nowicki, S., and Mooney, R. (2008). “Precise auditory-vocal mirroring in neurons for learned vocal communication,” Nature (London) 451, 305–310. doi:10.1038/nature06492
- Pytte, C. L., Rusch, K. M., and Ficken, M. S. (2003). “Regulation of vocal amplitude by the blue-throated hummingbird, Lampornis clemenciae,” Anim. Behav. 66, 703–710. doi:10.1006/anbe.2003.2257
- Riebel, K., and Todt, D. (1997). “Light flash stimulation alters the nightingale’s singing style: Implications for song control mechanisms,” Behaviour 134, 789–808. doi:10.1163/156853997X00070
- Sakata, J. T., and Brainard, M. S. (2006). “Real-time contributions of auditory feedback to avian vocal motor control,” J. Neurosci. 26, 9619–9628. doi:10.1523/JNEUROSCI.2027-06.2006
- Siegel, G. M., and Pick, H. L. (1974). “Auditory feedback in the regulation of voice,” J. Acoust. Soc. Am. 56, 1618–1624. doi:10.1121/1.1903486
- Sinnott, J. M., Stebbins, W. C., and Moody, D. B. (1975). “Regulation of voice amplitude by the monkey,” J. Acoust. Soc. Am. 58, 412–414. doi:10.1121/1.380685
- Striedter, G. F. (1994). “The vocal control pathways in budgerigars differ from those in songbirds,” J. Comp. Neurol. 343, 35–56. doi:10.1002/cne.903430104
- Summers, W. V., Pisoni, D. B., Bernacki, R. H., Pedlow, R. I., and Stokes, M. A. (1988). “Effects of noise on speech production: Acoustic and perceptual analyses,” J. Acoust. Soc. Am. 84, 917–928. doi:10.1121/1.396660
- Traunmüller, H., and Eriksson, E. (2000). “Acoustic effects of variation in vocal effort by men, women, and children,” J. Acoust. Soc. Am. 107, 3438–3451. doi:10.1121/1.429414
- Yates, A. J. (1963). “Delayed auditory feedback,” Psychol. Bull. 60, 213–232. doi:10.1037/h0044155