Journal of Neurophysiology. 2016 Feb 3;115(4):1905–1916. doi: 10.1152/jn.01003.2015

Neural correlates of behavioral amplitude modulation sensitivity in the budgerigar midbrain

Kenneth S Henry 1, Erikson G Neilans 2, Kristina S Abrams 3, Fabio Idrobo 4,5, Laurel H Carney 1,3
PMCID: PMC4869485  PMID: 26843608

Abstract

Amplitude modulation (AM) is a crucial feature of many communication signals, including speech. Whereas average discharge rates in the auditory midbrain correlate with behavioral AM sensitivity in rabbits, the neural bases of AM sensitivity in species with human-like behavioral acuity are unexplored. Here, we used parallel behavioral and neurophysiological experiments to explore the neural (midbrain) bases of AM perception in an avian speech mimic, the budgerigar (Melopsittacus undulatus). Behavioral AM sensitivity was quantified using operant conditioning procedures. Neural AM sensitivity was studied using chronically implanted microelectrodes in awake, unrestrained birds. Average discharge rates of multiunit recording sites in the budgerigar midbrain were insufficient to explain behavioral sensitivity to modulation frequencies <100 Hz for both tone- and noise-carrier stimuli, even with optimal pooling of information across recording sites. Neural envelope synchrony, in contrast, could explain behavioral performance for both carrier types across the full range of modulation frequencies studied (16–512 Hz). The results suggest that envelope synchrony in the budgerigar midbrain may underlie behavioral sensitivity to AM. Behavioral AM sensitivity based on synchrony in the budgerigar, which contrasts with rate-correlated behavioral performance in rabbits, raises the possibility that envelope synchrony, rather than average discharge rate, might also underlie AM perception in other species with sensitive AM detection abilities, including humans. These results highlight the importance of synchrony coding of envelope structure in the inferior colliculus. Furthermore, they underscore potential benefits of devices (e.g., midbrain implants) that evoke robust neural synchrony.

Keywords: amplitude modulation, auditory midbrain, budgerigar, envelope synchrony, inferior colliculus


Amplitude modulation (AM) is a critically important acoustic feature of speech and many nonhuman animal vocalizations (Beckers and ten Cate 2001; Rosen 1992; Shannon et al. 1995). Humans exhibit remarkable sensitivity to AM, with the ability to detect modulation depths as low as 3–6% (i.e., 25–30 dB below full modulation) at modulation frequencies less than a few hundred hertz (Carney et al. 2014; Kohlrausch et al. 2000; Viemeister 1979). Currently, the neural mechanisms underlying behavioral sensitivity to this crucial acoustic feature of communication signals are poorly understood.

The inferior colliculus (IC) is a large auditory nucleus in the vertebrate midbrain that is an almost-mandatory processing station in the ascending auditory pathway (Aitkin and Phillips 1984). The IC has emerged as an important location for studying AM sensitivity, because it is the first stage of auditory processing to show neural rate coding of AM (Joris et al. 2004). Whereas more peripheral nuclei encode AM through envelope synchrony (i.e., temporal variation in discharge rate at the modulation frequency of the stimulus) in both mammals (Joris and Yin 1992; Rhode and Greenberg 1994; Sayles et al. 2013) and birds (Gleich and Klump 1995), the IC and other more central nuclei encode AM through both envelope synchrony and substantial changes in average discharge rate compared with responses from unmodulated stimuli [for IC, see Langner and Schreiner (1988) and Woolley and Casseday (2005); for a summary of conserved physiological response properties in the midbrain of mammals and birds, see Woolley and Portfors (2013); for thalamus, see Bartlett and Wang (2007); for cortex, see Johnson et al. (2012) and Rosen et al. (2010)]. Effects of AM on IC response rate can be excitatory or suppressive (Krishna and Semple 2000; Nelson and Carney 2007) and are most pronounced at a neuron's best modulation frequency (BMF). The emergence of two neural codes for AM at the level of the IC raises the question of how synchrony and average rate thresholds compare with behavioral AM detection thresholds.

The relationship between behavioral and IC sensitivity to AM in the same animal model has been examined previously in a single species—the rabbit—which has limited behavioral sensitivity to AM (Carney et al. 2014). The rabbit study used receiver-operating characteristic (ROC) analyses (Egan 1975) to calculate thresholds for AM detection (i.e., minimum detectable modulation depths) across a population of IC neurons, based on both average discharge rate and envelope synchrony, for comparison with behavioral thresholds obtained with operant conditioning procedures. The most sensitive rate thresholds in the rabbit IC were approximately as sensitive as behavioral thresholds. Synchrony thresholds, in contrast, were considerably more sensitive than the behaving animal, falling instead within the range of human AM performance. These results suggest that rate coding in the IC, rather than envelope synchrony, may underlie behavioral detection of AM in rabbits. However, the mechanisms underlying AM perception in species with more sensitive auditory perceptual abilities are unexplored. Human listeners may use envelope synchrony of IC responses to achieve greater AM sensitivity than rabbits (Carney et al. 2014; Nelson and Carney 2007). Alternatively, rate-based AM thresholds in the human IC may be more sensitive than in the rabbit and therefore able to support superior behavioral performance. The relationship between average rate or envelope synchrony and human-like behavioral AM acuity is unknown.

Whereas rate- and synchrony-based neural AM thresholds cannot be measured in the human IC for comparison with behavioral performance, they can be quantified in other species with sensitive AM detection abilities. The budgerigar (Melopsittacus undulatus) is an avian, lifelong vocal learner with complex, temporally modulated vocalizations (Farabaugh et al. 1994; Tu et al. 2011) and the capacity to mimic human speech (Dooling et al. 2000). Previous research in this small Australian parrot suggests that AM detection abilities are similar between the budgerigar and human listeners for noise-carrier stimuli (Dooling and Searcy 1981) and tones with low modulation frequencies (Carney et al. 2013). Compared with mammals of similar size, the sensory epithelium of the avian cochlea is shorter (∼2.5 mm in budgerigar) (Manley et al. 1993) and restricted in sensitivity to lower acoustic frequencies. Behavioral absolute thresholds of the budgerigar are <30 dB sound pressure level (SPL) from 0.35 to 6 kHz and range from 0 to 10 dB SPL at frequencies from 2 to 3 kHz (Brittan-Powell et al. 2002; Dooling and Saunders 1975) (Fig. 1A). Comparable ranges of hearing sensitivity in the domestic mouse and rabbit are 4–64 kHz (Koay et al. 2002) and 0.5–40 kHz (Heffner 1980), respectively. The frequency selectivity of auditory nerve-fiber responses to tones has not been studied in the budgerigar, but in other avian species with similar cochlear anatomy to budgerigar, frequency tuning is similar to or slightly sharper than in typical mammalian auditory nerve fibers (Manley et al. 1985; Sachs et al. 1974).

Fig. 1.

Pure-tone tuning in the budgerigar inferior colliculus (IC). A: representative pure-tone tuning curves (solid black and gray lines) plotting the threshold stimulus sound pressure level (SPL) necessary to evoke a reliable increase in firing rate (3 SD above the spontaneous discharge rate) as a function of stimulus frequency. Recording depth, expressed in millimeters relative to the dorsal electrode insertion site, is indicated at the tip of each curve. Tuning curve characteristic frequency (CF) increased with increasing recording depth, i.e., along a dorso-ventral anatomic gradient. The dotted line shows the behavioral audiogram of the budgerigar [from Brittan-Powell et al. (2002)]. B: the 10-dB tuning bandwidth of individual recording sites plotted as a function of CF (n = 68 sites). The trend line is based on a local regression model of log-transformed variables using weighted linear least squares, a first-degree polynomial, and a smoothing parameter of 0.5.

The present study quantified neural and behavioral sensitivity to AM in the budgerigar to gain insight into the relationships between neural rate and synchrony thresholds and AM behavioral thresholds. Neural recordings were made from multiunit neural clusters in the central nucleus of the IC (also known as nucleus mesencephalicus lateralis, pars dorsalis) (Covey and Carr 2005) using chronically implanted electrodes in awake, unrestrained birds. Behavioral AM sensitivity was assessed with operant conditioning procedures. The results suggest that envelope synchrony rather than average rate information in the budgerigar IC underlies human-like behavioral sensitivity to AM in this species.

MATERIALS AND METHODS

All physiological and behavioral experiments included in this study were conducted in English-variety budgerigars and performed under a protocol approved by the University Committee on Animal Resources at the University of Rochester. English budgerigars are bred for larger size (40–65 g) and calmer deportment compared with examples of the species commonly found in pet stores (parakeets). Neurophysiological data were collected from the central nucleus of the IC during 68 multiunit neural recording sessions conducted in three awake, unrestrained birds (1 female, implanted twice; 2 males). Behavioral data were obtained using operant conditioning procedures in four birds (all male). This new behavioral dataset was not part of a previously published study focused on detection of low modulation frequencies (Carney et al. 2013). Behavioral and physiological data were collected from different birds.

Microelectrode implantation procedure.

Tungsten microelectrodes (3–5 MΩ; MicroProbes, Gaithersburg, MD) were implanted into the IC of anesthetized birds using a stereotaxic, surgical procedure to allow recording of sound-evoked neural activity during subsequent daily recording sessions. Electrodes were epoxied to a miniature, head-mounted microdrive (“nDrive”; NeuroNexus, Ann Arbor, MI), which allowed postimplantation adjustments of recording depth. Implant assemblies consisted of the nDrive, electrode, protective cap, and miniature connector (#A79000-001; Omnetics, Minneapolis, MN). Electrodes extended 8 mm below the base of the nDrive in its retracted state, with the capacity to extend to a maximum depth of 11 mm through adjustment of a manual control screw.

Birds were anesthetized with ketamine (3–5 mg/kg sc) and dexmedetomidine (0.08–0.1 mg/kg sc) for the implantation surgery and wrapped loosely in a towel. They were then placed into a stereotaxic frame that held the head in an upright position, with the nares ∼5 mm above the horizontal plane of the interaural axis and the beak tip ∼5 mm below. Supplemental warmth was provided by a disposable heating pack (∼50°C; ComfortTec International, Carlsbad, CA) placed under the body. Body temperature was not measured during implantation surgeries. The top of the head was trimmed of feathers and disinfected with Betadine and alcohol. The scalp was numbed with Lidocaine (0.05 ml sc), and a 15 × 15-mm area of the skull was exposed with an incision to accommodate the dimensions of the implant assembly. An ∼1-mm diameter craniotomy was made using a surgical drill centered 4 mm lateral of the midline and 1 mm posterior to the vertical plane of the interaural axis. Finally, an electrical ground was established on the animal by advancing a self-tapping, stainless-steel screw (#000; ×3/32”) through the skull near the midline and just posterior to the base of the implant assembly.

The implant assembly, with electrodes oriented vertically, was placed into a micromanipulator mounted on the stereotaxic frame. The manipulator was adjusted to center the electrode tip over the craniotomy. A wire was connected to the ground screw with electrically conductive epoxy (#8331-14G; MG Chemicals, Burlington, Ontario, Canada). The dura was then pierced with an ophthalmic scalpel and the implant assembly slowly lowered into the brain using a hydraulic microdrive (David Kopf Instruments, Tujunga, CA) until the base of the nDrive contacted the skull (electrode depth = 8 mm). The nDrive control screw was then used to advance the electrodes toward the IC during stimulation with 75 dB SPL Gaussian noise or tone bursts. Following the emergence of robust, sound-evoked neural activity, which occasionally required retraction of the implant assembly and minor adjustment of the electrode trajectory, a bead of Kwik-Sil adhesive (World Precision Instruments, Sarasota, FL) was placed over the craniotomy, and the nDrive was cemented to the skull using light-cured dental composite material (Vertise Flow; Kerr, Orange, CA). Finally, light-cured composite material was used to cement the protective cap over the implant assembly and the electrical connector to the cap.

Anesthesia was maintained throughout the 2- to 3-h implantation procedure by slow syringe-pump infusion of a solution combining ketamine (6–10 mg·kg⁻¹·h⁻¹), dexmedetomidine (0.16–0.27 mg·kg⁻¹·h⁻¹), and lactated Ringer's solution (30–50 ml·kg⁻¹·h⁻¹) through a subcutaneous catheter. Following completion of the surgery, animals were removed from the stereotaxic frame and placed in a heated recovery chamber (#912-000; Lyon Technologies, Chula Vista, CA). Antisedan (0.5 mg/kg sc) was given to speed recovery from anesthesia. The analgesic carprofen (1 mg/kg sc) was given on the day of surgery and once daily for 1–2 days thereafter to minimize pain and inflammation. Metoclopramide (1.75 mg/kg) and supplemental fluids (lactated Ringer's solution 1 ml sc) were given once or twice daily following the procedure until normal appetite and droppings were observed (typically 1–2 days).

Neurophysiological recording sessions.

Sound-evoked neural activity was recorded in awake, unrestrained birds for 14–26 daily test sessions beginning 1 wk after the microelectrode implantation procedure. Daily test sessions were 2 h in duration. The position of the recording site was held constant throughout each session until its end, at which point, the electrode was advanced by 35 μm. This distance reliably produced an increase in the characteristic frequency (CF; the frequency of maximum sensitivity to tones) of the recording site. All of the recording tracks included in this study showed robust spiking activity (described further below) and a clear increase in CF with increasing recording depth (e.g., Fig. 1A). This tonotopic gradient is consistent with localization of recording sites in the central nucleus of the IC rather than other nearby auditory nuclei (e.g., thalamus) (Bigalke-Kunz et al. 1987). CFs increase dorso-ventrally in the central nucleus of the IC in mammals (Baumann et al. 2011; Langner et al. 2002; Langner and Schreiner 1988) and all other bird species studied to date [Calford et al. (1985); Knudsen and Konishi (1978); Woolley and Casseday (2004); reviewed for both taxa in Woolley and Portfors (2013)].

Recordings were made in a double-walled, sound-isolation booth (inside dimensions: 2.13 m long, 1.98 m wide, 1.96 m tall; Industrial Acoustics, Bronx, NY), lined with 7.6 cm sound-absorbing foam (Pinta Acoustic, Minneapolis, MN). During recording sessions, birds perched in a wire-mesh cage (0.2 m length, width, and height; 6.4 mm wire spacing) that was centered in the chamber and separated by 45 cm from a single, free-field loudspeaker (#PS180-8; Dayton Audio, Springboro, OH) mounted in the same horizontal plane as the bird. A video-monitoring system was used to ensure that birds remained perched, which was typical behavior in all birds, and facing the loudspeaker throughout the session.

Stimulus generation and response acquisition were coordinated using a data acquisition board (#PCI-6251; National Instruments, Austin, TX) controlled by custom MATLAB programs (MathWorks, Natick, MA). Stimulus waveforms were generated in MATLAB (sampling frequency = 50 kHz) and converted to analog signals on the National Instruments board before passing to a power amplifier (Tascam PA-20 MK II) that drove the loudspeaker. Stimuli were calibrated in MATLAB before analog conversion using a 4,000-point finite impulse response (FIR) filter that compensated for the frequency response of the system, which was determined from the output of a calibrated microphone (Type 4134; Brüel and Kjaer, Marlborough, MA; sampled at 50 kHz by the National Instruments board) placed inside the cage at the location of the animal's head. Tones were presented during calibration at 249 log-spaced frequencies from 0.050 to 15.1 kHz.

Electrode signals were buffered using a miniature headstage that clipped to the implant connector before passing out of the sound booth through thin, flexible wires to a custom-built amplifier designed for extracellular signals. The amplifier filtered signals from 0.3 to 8 kHz and amplified signals by a factor of 1,000–10,000. Amplified signals were digitized on the National Instruments board (16-bit resolution) at a sampling frequency of 31,250 Hz and written to the hard drive of the personal computer.

Responses to 200 ms pure tones of variable frequency (0.2–8 kHz, 7 steps/octave) and level (15–65 dB SPL, 10 dB steps) were recorded at the beginning of each session to generate a frequency-response map for determination of CF and 10 dB bandwidth of frequency tuning at each recording site. Each frequency-level combination was presented three times. Tones were presented in random sequence (within each presentation group) with 10 ms cos² onset and offset ramps and 500 ms of silence between tones. Responses were used to calculate pure-tone tuning curves, plotting the threshold stimulus level necessary to evoke a criterion discharge rate as a function of stimulus frequency (Fig. 1A). The criterion discharge rate was set at either 3 SD above the mean spontaneous rate or, for recording sites with an excitatory rate response at 15 dB SPL, the maximum discharge rate evoked across all 15 dB SPL stimuli.
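The threshold criterion described above can be illustrated with a short Python sketch. It covers only the first criterion (3 SD above the mean spontaneous rate) and assumes that each frequency's threshold is the lowest tested level whose mean rate exceeds the criterion, with no interpolation between levels. The function names, the data layout (`rate_map`), and the tie-breaking rule for CF are hypothetical choices made for illustration and are not taken from the study's MATLAB code.

```python
import statistics

def criterion_rate(spont_rates, n_sd=3.0):
    """Criterion discharge rate: mean spontaneous rate plus n_sd sample SDs."""
    return statistics.mean(spont_rates) + n_sd * statistics.stdev(spont_rates)

def tuning_curve(rate_map, levels_db, crit):
    """Map each frequency to the lowest level whose mean rate exceeds crit.

    rate_map[freq] is a list of mean rates, one per entry of levels_db
    (ascending). Frequencies that never reach the criterion map to None.
    """
    curve = {}
    for freq, rates in rate_map.items():
        curve[freq] = next(
            (lvl for lvl, r in zip(levels_db, rates) if r > crit), None
        )
    return curve

def characteristic_frequency(curve):
    """Frequency with the lowest threshold (assumed tie-break: lowest frequency)."""
    valid = {f: t for f, t in curve.items() if t is not None}
    if not valid:
        return None
    return min(valid, key=lambda f: (valid[f], f))
```

The second criterion in the text (the maximum rate evoked across all 15 dB SPL stimuli, for sites that already respond at 15 dB SPL) would replace `crit` for those sites and is not implemented here.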

Responses to AM tone- and noise-carrier stimuli with varying modulation frequency and depth were then recorded. AM tones were generated with the carrier frequency equal to best frequency (BF), and the same “frozen” Gaussian noise waveform was used as a carrier for all noise signals (bandwidth: 0.1–10 kHz). First, a modulation transfer function (MTF) was obtained by presenting stimuli with 100% modulation depth at 25 log-spaced modulation frequencies ranging from 4 to 1,024 Hz. An unmodulated stimulus was also included in the set. MTF stimuli had a duration of 1 s and were presented in random order for 10 repetitions. Second, for determination of neural AM thresholds, modulation depth functions were obtained at 2–5 modulation frequencies (typically 3; modulation frequency range: 16–512 Hz) by presenting stimuli with modulation depths ranging from −30 to 0 dB in 5 dB steps. Modulation depth in decibels is calculated as 20 log₁₀(m), where m is the modulation index. A value of m = 1 corresponds to 100% or 0 dB depth, whereas m = 0.1 corresponds to 10% or −20 dB depth. An unmodulated carrier signal was also included in each stimulus set. Stimuli had a duration of 500 ms and were presented in random order for 20 repetitions. For both MTFs and modulation depth functions, stimuli were presented at 65 dB SPL with 50 ms cos² onset and offset ramps and 500 ms of silence between stimuli.
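As an illustration of the depth convention, the following Python sketch converts a modulation depth in decibels to the modulation index m and builds a tone-carrier stimulus with cos² ramps. It assumes a sinusoidal modulator of the form [1 + m·sin(2π·fm·t)]·sin(2π·fc·t), which the text does not state explicitly. Level scaling to 65 dB SPL and the calibration filter are omitted, and all function and parameter names are hypothetical.

```python
import math

def depth_db_to_index(depth_db):
    """Invert depth_db = 20*log10(m) to recover the modulation index m."""
    return 10.0 ** (depth_db / 20.0)

def sam_tone(fc, fm, depth_db, dur=0.5, fs=50000, ramp=0.05):
    """Amplitude-modulated tone (assumed sinusoidal modulator), cos^2-gated."""
    m = depth_db_to_index(depth_db)
    n = int(round(dur * fs))
    n_ramp = int(round(ramp * fs))
    out = []
    for k in range(n):
        t = k / fs
        env = 1.0 + m * math.sin(2 * math.pi * fm * t)
        carrier = math.sin(2 * math.pi * fc * t)
        # cos^2 gate: rises from 0 to 1 over the ramp, mirrored at offset
        if k < n_ramp:
            gate = math.sin(0.5 * math.pi * k / n_ramp) ** 2
        elif k >= n - n_ramp:
            gate = math.sin(0.5 * math.pi * (n - 1 - k) / n_ramp) ** 2
        else:
            gate = 1.0
        out.append(gate * env * carrier)
    return out
```

With these definitions, `depth_db_to_index(-20)` gives m = 0.1 (10% depth) and `depth_db_to_index(-30)` gives m ≈ 0.032, the shallowest depth in the stimulus set.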

Spikes were detected in multiunit neurophysiological recordings (Fig. 2) after first high-pass filtering to minimize the local field potential (500 point FIR; 1 kHz cutoff frequency) and application of a nonlinear energy operator (Kim and Kim 2000). The nonlinear energy operator gives an output proportional to the product of the instantaneous frequency and amplitude of the input signal and hence, accentuates spike peaks for detection based on an amplitude threshold. For the discrete time sequence x(n), the output of the nonlinear energy operator is first calculated as $x^2(n) - x(n+1)\,x(n-1)$. This output is then smoothed with a six-point Bartlett window to eliminate spurious peaks in the transformed signal due to cross-terms and background noise (Kim and Kim 2000). The amplitude threshold for spike detection was set once per recording session at approximately one-half of the peak amplitude of the largest spikes in the transformed neurophysiological recording (i.e., well above the level of the noise). Spike times were calculated as the time of the peak deflection expressed relative to stimulus onset.
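A minimal Python sketch of the detection steps described above (nonlinear energy operator, Bartlett smoothing, amplitude threshold, peak picking) is given below. It is illustrative only: the high-pass FIR stage is omitted, the recording is assumed to begin at stimulus onset, and the peak-picking rule (a local maximum above threshold) is an assumption, as are all function names.

```python
def neo(x):
    """Nonlinear energy operator: x(n)^2 - x(n+1)*x(n-1); edge samples set to 0."""
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n + 1] * x[n - 1]
    return psi

def bartlett(npts):
    """Triangular (Bartlett) window with zero end points, scaled to unit sum."""
    w = [1.0 - abs(2.0 * k / (npts - 1) - 1.0) for k in range(npts)]
    total = sum(w)
    return [v / total for v in w]

def smooth(x, w):
    """Same-length, zero-padded filtering of x with window w.

    For an even-length window the output is offset by half a sample.
    """
    half = len(w) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            j = n + k - half
            if 0 <= j < len(x):
                acc += wk * x[j]
        out.append(acc)
    return out

def detect_spikes(x, fs, threshold, npts=6):
    """Return times (s) of local maxima of the smoothed NEO output above threshold."""
    y = smooth(neo(x), bartlett(npts))
    times = []
    for n in range(1, len(y) - 1):
        # '>=' on the left and '>' on the right count a flat-topped peak once
        if y[n] > threshold and y[n] >= y[n - 1] and y[n] > y[n + 1]:
            times.append(n / fs)
    return times
```

In practice the threshold would be chosen once per session, for example as roughly half the largest peak in the transformed recording, as described in the text.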

Fig. 2.

Representative multiunit neurophysiological recordings obtained from 3 sites in the budgerigar IC in response to tone bursts presented at CF. Recording site number (bird and session number) and CF are indicated above each pair of recordings. Stimuli were presented at 60 dB SPL with 10 ms onset and offset ramps. Recordings plot transformed extracellular voltage (arbitrary scale) as a function of time relative to (re.) stimulus onset. Recordings were transformed with a nonlinear energy operator to accentuate spikes for detection with an amplitude threshold (Kim and Kim 2000), which was set once per recording session at ∼½ of the amplitude of the largest peaks in the filtered recording. Circles indicate detected action potentials.

Isolation of single-unit responses using a template-matching procedure was not successful, apparently because the action potentials of neurons near the recording site were too similar in shape for discrimination. Nonetheless, multiunit recordings from groups of spatially clustered neurons are expected to provide valuable insight into the functional response properties of the budgerigar IC based on previous reports of robust anatomical gradients in spectral and temporal response properties in the midbrain of mammals [Baumann et al. (2011); Chen et al. (2012); Langner et al. (2002); Langner and Schreiner (1988); but see Seshagiri and Delgutte (2007)] and other bird species (Calford et al. 1985; Woolley and Casseday 2004).

Rate- and synchrony-based AM thresholds of individual neural recording sites.

Neural thresholds for AM detection were calculated from modulation depth functions based on both the average discharge rate and envelope synchrony of neural responses. Rate-based AM thresholds were estimated using ROC analysis (Egan 1975), which computes classification performance (i.e., percent correct, modulated vs. unmodulated) from the distributions of discharge rates observed in response to each modulation depth compared with the unmodulated stimulus condition. Linear interpolation was applied to the function plotting classification performance vs. stimulus modulation depth. The rate threshold was calculated as the minimum modulation depth above which classification performance consistently exceeded 70.7% correct. This value corresponds to average behavioral performance at threshold for the operant task and tracking procedure used in behavioral experiments (see below).
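The threshold calculation can be sketched in Python as follows. This is one possible reading of the procedure, not the study's implementation. Percent correct is taken as the best performance achievable with a single criterion placed between the two rate distributions (equal numbers of modulated and unmodulated trials assumed), and a decrease in rate with modulation is allowed to count as well as an increase. The function names are hypothetical.

```python
def percent_correct(mod, unmod):
    """Best percent correct for one criterion separating two samples."""
    best = 50.0
    for c in sorted(set(mod) | set(unmod)):
        hit = sum(v >= c for v in mod) / len(mod)
        correct_rejection = sum(v < c for v in unmod) / len(unmod)
        pc = 50.0 * (hit + correct_rejection)
        # 100 - pc is the score of the reversed rule (modulation lowers the rate)
        best = max(best, pc, 100.0 - pc)
    return best

def threshold_db(depths_db, pcs, criterion=70.7):
    """Lowest depth above which performance stays above criterion.

    The crossing is located by linear interpolation between tested depths.
    Returns None if performance at the largest depth is not above criterion.
    """
    pairs = sorted(zip(depths_db, pcs))
    if pairs[-1][1] <= criterion:
        return None
    i = len(pairs) - 1
    while i > 0 and pairs[i - 1][1] > criterion:
        i -= 1
    if i == 0:
        return pairs[0][0]
    (d0, p0), (d1, p1) = pairs[i - 1], pairs[i]
    return d0 + (criterion - p0) * (d1 - d0) / (p1 - p0)
```

For example, if performance is 60% at −15 dB and 80% at −10 dB, and remains above 70.7% at all larger depths, the interpolated threshold is −15 + 5 × (70.7 − 60)/20 ≈ −12.3 dB.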

AM thresholds based on envelope synchrony were estimated using ROC analysis and a variation of the classical vector strength (VS) metric of neural synchrony. Classical VS (Goldberg and Brown 1969) is calculated from neural spike times pooled across stimulus repetitions as $\mathrm{VS} = \frac{1}{n}\sqrt{\left[\sum_{i=1}^{n}\cos(2\pi f_m t_i)\right]^2 + \left[\sum_{i=1}^{n}\sin(2\pi f_m t_i)\right]^2}$, where $n$ is the total number of spikes, $f_m$ is the modulation frequency of the stimulus, and $t_i$ is the time of the $i$th spike. In the analysis used here, response synchrony was calculated on a repetition-by-repetition basis using phase-projected VS (VSPP), which is calculated as $\mathrm{VS}_R \cos(\varphi_R - \varphi_P)$, where $\mathrm{VS}_R$ is the VS of spikes associated with an individual stimulus repetition, $\varphi_R$ is the mean phase of spikes associated with that repetition, and $\varphi_P$ is the mean phase of spikes pooled across repetitions (Johnson et al. 2012; Sayles et al. 2013; Yin et al. 2011). VSPP reduces single-repetition estimates of envelope synchrony, which can be highly variable given low spike counts, when they are out of phase with the pooled response. VSPP thresholds were estimated using the same ROC analysis approach used for rate thresholds. The distribution of VSPP estimates observed at each modulation depth was compared with the distribution associated with the unmodulated stimulus to determine classification performance given optimal threshold placement. Linear interpolation was applied to the function plotting classification performance by stimulus modulation depth, and the VSPP threshold was calculated as the minimum modulation depth at which classification performance surpassed 70.7%.
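The two synchrony metrics defined above can be written compactly in Python. The sketch below is a direct transcription of the formulas, with spike times in seconds. The function names and the convention of returning zero synchrony for a repetition without spikes are assumptions made for illustration.

```python
import math

def vector_strength(spike_times, fm):
    """Return (VS, mean phase in radians) for spike times given in seconds."""
    n = len(spike_times)
    if n == 0:
        return 0.0, 0.0  # assumed convention for an empty repetition
    c = sum(math.cos(2 * math.pi * fm * t) for t in spike_times)
    s = sum(math.sin(2 * math.pi * fm * t) for t in spike_times)
    return math.hypot(c, s) / n, math.atan2(s, c)

def vs_pp(trials, fm):
    """Phase-projected VS for each repetition: VS_R * cos(phi_R - phi_P)."""
    pooled = [t for trial in trials for t in trial]
    _, phi_p = vector_strength(pooled, fm)
    values = []
    for trial in trials:
        vs_r, phi_r = vector_strength(trial, fm)
        values.append(vs_r * math.cos(phi_r - phi_p))
    return values
```

A repetition whose spikes are tightly locked but in antiphase with the pooled response receives a VSPP near −1 rather than a VS near 1, which is how the projection penalizes spurious single-repetition synchrony.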

The analysis time window used for calculation of response rate extended from 50 ms after stimulus onset to stimulus offset. The analysis window used for calculation of VSPP began 50 ms after stimulus onset and extended for the maximum integer number of stimulus modulation periods possible before stimulus offset. This analysis window effectively excluded the neural response to the first cycle of AM for stimuli with modulation frequencies >25 Hz [the majority of neural AM thresholds was estimated with modulation frequencies ≥32 Hz (see Fig. 7)].
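The two analysis windows can be expressed as a short Python sketch. The helper names are hypothetical, and the small tolerance added before rounding down is an implementation detail introduced here to guard against floating-point error when the window holds a whole number of periods.

```python
import math

def rate_window(dur, skip=0.05):
    """Rate analysis window: from `skip` s after onset to stimulus offset."""
    return skip, dur

def synchrony_window(dur, fm, skip=0.05):
    """Synchrony window: starts at `skip` s and spans the largest whole
    number of modulation periods that fits before stimulus offset."""
    n_cycles = math.floor((dur - skip) * fm + 1e-9)
    return skip, skip + n_cycles / fm
```

For a 500-ms stimulus modulated at 16 Hz, 7 whole periods fit in the remaining 450 ms, so the synchrony window runs from 50 to 487.5 ms.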

Fig. 7.

Neural thresholds for AM detection of tone (left)- and noise (right)-carrier stimuli compared with behavioral performance in the budgerigar. Thresholds based on (A and B) average discharge rate and (C) envelope synchrony are shown compared with behavioral thresholds obtained using the same carrier signal [blue lines; tone carrier: this study; noise carrier: Dooling and Searcy (1981)]. A and C: AM detection thresholds of individual recording sites. Red triangles indicate thresholds measured within ±0.75 octave of rate MTF BMF. Black circles indicate thresholds measured either further from BMF or at sites without a BMF. Black lines show the median of the population of neural thresholds, whereas thicker gray bands span from the 10th to the 90th percentile. Percentiles, including the median, were calculated on a locally weighted basis using the tri-cube weight function (as in local regression) and α of 0.75 (αn weighted observations contribute to each estimate, where n is the total number of observations). Numbers below the dashed horizontal lines indicate the percentage of modulation depth functions with no significant threshold (%NS). B: AM thresholds of the neural population based on optimal pooling of average rate information across recording sites with a pattern decoder analysis (Jazayeri and Movshon 2006). Thresholds are shown for data from individual implanted birds (symbols) and for data pooled across all implanted birds (orange line). Rate thresholds are not sensitive enough to explain behavioral AM thresholds at modulation frequencies below 128 Hz, even with optimal pooling of information across sites. Synchrony thresholds can explain behavioral AM sensitivity across the full range of modulation frequencies studied.

Neural AM thresholds obtained by pooling rate information across recording sites.

AM detection thresholds of the pooled neural population were estimated for both carrier types using a maximum likelihood-based pattern decoder (Day and Delgutte 2013; Jazayeri and Movshon 2006). This analysis calculated, separately for each modulation frequency and at each modulation depth, the percentage of stimuli that could be classified correctly as modulated or unmodulated based on single, optimally weighted population responses (i.e., spike counts sampled across the population) and an assumption of independent, Poisson-distributed spike counts. Percent correct was calculated by first randomly drawing 1,000 population responses from each of the two stimulus conditions (i.e., modulated and unmodulated). For each population draw, the logarithm of the likelihood of the two conditions was calculated as in Jazayeri and Movshon (2006), where $n_i$ is the randomly drawn spike count of the $i$th site, $N$ is the total number of sites, and $f_i(\theta)$ is the average spike count of site $i$ for stimulus condition $\theta$:

$$\log L(\theta) = \sum_{i=1}^{N} n_i \log f_i(\theta) - \sum_{i=1}^{N} f_i(\theta) - \sum_{i=1}^{N} \log(n_i!)$$

The first term is an optimally weighted sum of spike counts across the population, whereas the second term is the sum of average counts. The last term can be ignored, because it is independent of $\theta$. Note that for each random population draw, selected spike counts $n_i$ were removed from the dataset before calculation of $f_i(\theta)$ to avoid overfitting of the model. Percent correct was calculated as the percentage of population draws for which the log-likelihood of the correct stimulus condition was greater than that of the incorrect condition. The population threshold was calculated as the modulation depth at which the performance of the decoder model exceeded 70.7% correct.
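The decoder can be sketched in Python as shown below. This is a simplified illustration under stated assumptions, not the study's code. Each site contributes one randomly chosen repetition per draw, the drawn count is left out when computing that site's mean for the true condition, and mean counts are floored at a small positive value to avoid taking the logarithm of zero. The floor, the data layout, and all names are hypothetical.

```python
import math
import random

def log_likelihood(counts, mean_counts):
    """Poisson log-likelihood without the theta-independent log(n_i!) term."""
    return sum(n * math.log(f) - f for n, f in zip(counts, mean_counts))

def decoder_percent_correct(mod, unmod, n_draws=1000, rng=None):
    """Percent of population draws assigned to the correct condition.

    mod and unmod are lists with one entry per site; each entry lists the
    spike counts of that site across repetitions of the condition.
    """
    rng = rng or random.Random(0)
    conds = {"mod": mod, "unmod": unmod}
    correct = 0
    total = 0
    for true_label in ("mod", "unmod"):
        other = "unmod" if true_label == "mod" else "mod"
        for _ in range(n_draws):
            draw = []
            means = {"mod": [], "unmod": []}
            for site in range(len(mod)):
                j = rng.randrange(len(conds[true_label][site]))
                draw.append(conds[true_label][site][j])
                for label in ("mod", "unmod"):
                    reps = list(conds[label][site])
                    if label == true_label:
                        del reps[j]  # leave the drawn count out of its own mean
                    # assumed floor keeps log() defined for silent sites
                    means[label].append(max(sum(reps) / len(reps), 1e-6))
            if log_likelihood(draw, means[true_label]) > log_likelihood(draw, means[other]):
                correct += 1
            total += 1
    return 100.0 * correct / total
```

Repeating this calculation at each modulation depth and interpolating to 70.7% correct would yield the population threshold for one modulation frequency.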

Recording sites with no significant variation in discharge rate with modulation depth theoretically make no contribution to decoder performance. In practice, however, spike counts drawn from these sites carry small weights, due to sampling errors in the estimation of $f_i(\theta)$, and hence, can decrease performance. We therefore excluded these sites from pooling analyses, which typically improved performance by 1–2 dB compared with analyses conducted with the full sample of sites. A unity relationship was observed across all recording sites between log-transformed mean spike count, calculated for each stimulus, and log-transformed variance, consistent with Poisson distribution of spike counts.

Behavioral AM thresholds.

Thresholds for behavioral AM detection of a 4-kHz tone-carrier signal were estimated at modulation frequencies of 16, 32, 64, 128, 256, and 512 Hz using operant conditioning. Thresholds were estimated four to six times/day in each bird during two, ∼20 min test periods. Behavioral sessions were conducted in a sound-isolation chamber (inside dimensions: 61 cm long, 81 cm wide, 61 cm tall; Industrial Acoustics), lined with 6.7 cm sound-attenuating foam (Pinta Acoustic). Birds perched in a wire-mesh, stainless-steel cage (0.2 m length, width, and height; 6.4 mm wire spacing) located centrally on the floor of the chamber. The wire cage contained three horizontally placed response switches and the delivery tube of a customized seed-dispensing system (ENV-203 Mini; Med Associates, St. Albans, VT). A single overhead loudspeaker (MC60; Polk Audio, Baltimore, MD) was mounted above the cage for presentation of acoustic stimuli. Stimulus generation and behavioral response acquisition were coordinated using a National Instruments data acquisition board (#PCI-6251) controlled by custom MATLAB programs. Stimulus waveforms (sampling frequency = 50 kHz) were converted to analog signals on the National Instruments board before passing to a power amplifier (D-75A; Crown Audio, Elkhart, IN) that drove the loudspeaker in the booth. Stimuli were calibrated using the same filtering procedure applied in the neurophysiology setup.

Birds were trained to perform a single-interval, two-alternative, nonforced-choice task during behavioral test sessions. Birds started each trial by pecking the center switch, which initiated presentation of a single stimulus. The stimulus was either a standard, unmodulated tone or target AM tone, presented for a maximum duration of 500 ms at 65 dB SPL with 50 ms cos² onset and offset ramps. Birds were trained to peck the left switch in response to unmodulated stimuli (i.e., correct rejections) and the right switch in response to modulated stimuli (i.e., hits). Responses resulted in immediate termination of the stimulus. Correct responses (i.e., hits and correct rejections) were reinforced by delivery of a millet seed, whereas incorrect responses (misses and false alarms) were followed by a 5-s timeout, during which all lights in the chamber were turned off. The pecking of any of the switches during a timeout resulted in extension of the timeout by 5 s. In rare instances in which the bird did not respond left or right within 3 s of stimulus onset, a short, 2-s timeout was imposed before the next trial could begin. Every block of 10 trials contained a random sequence of 5 target and 5 standard stimuli.

For each modulation frequency tested, birds spent the first few sessions discriminating fully modulated test stimuli from the standard stimulus. Following mastery of this task, AM thresholds were estimated repeatedly at the same modulation frequency using a two-down, one-up adaptive-tracking procedure (Levitt 1970). With this procedure, the modulation depth of the target stimulus was systematically varied within a track until AM was just barely detectable. Each pair of consecutive hits at the same modulation depth was followed by a reduction in depth, whereas each miss was followed by an increase in depth (up to the maximum depth of 0 dB). Steps in modulation depth were equal to 6/n dB, where n is the number of steps accumulated since the beginning of the track (Levitt 1970; Robbins and Monro 1951). The value of n was reset to 1 in rare cases when the track returned to 0 dB depth. Behavioral tracks were allowed to continue for a minimum of 15 depth reversals and until 2 stability criteria were met: 1) the absolute difference in mean modulation depth between the last 4 reversals of the track and the 4 preceding reversals was required to be <2 dB, and 2) the SD of the modulation depth of the last 8 reversals was required to be <2 dB. The threshold of the track was calculated as the mean modulation depth of the last eight reversals. Response bias was monitored and controlled by varying the percentage of two-seed reinforcements for correct responses on the side opposite the bias. Thresholds from tracks in which bias exceeded 0.3 were excluded from further analysis.
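The tracking rule can be sketched as follows. This is an illustrative reading of the procedure, not the authors' MATLAB code: the class and function names are hypothetical, the step counter n is assumed to start at 1 and to increase by 1 after every change in depth, only target trials (hits and misses) are assumed to move the track, and the SD criterion is assumed to use the sample SD.

```python
from statistics import mean, stdev

class TwoDownOneUpTrack:
    """Two-down, one-up track on modulation depth (dB), step = base_step_db / n."""

    def __init__(self, start_db=0.0, base_step_db=6.0, max_db=0.0):
        self.depth_db = start_db
        self.base_step_db = base_step_db
        self.max_db = max_db
        self.n = 1              # assumed: step counter starts at 1
        self.hits_in_row = 0
        self.last_dir = 0       # -1 = last step decreased depth, +1 = increased
        self.reversals = []     # depths at which the track changed direction

    def _step(self, direction):
        if self.last_dir and direction != self.last_dir:
            self.reversals.append(self.depth_db)
        self.last_dir = direction
        new_depth = self.depth_db + direction * self.base_step_db / self.n
        self.depth_db = min(self.max_db, new_depth)
        self.n += 1
        if self.depth_db >= self.max_db:
            self.n = 1          # reset when the track returns to 0 dB

    def update(self, hit):
        """Register the outcome of one target trial (True = hit, False = miss)."""
        if hit:
            self.hits_in_row += 1
            if self.hits_in_row == 2:   # two consecutive hits: decrease depth
                self.hits_in_row = 0
                self._step(-1)
        else:                           # any miss: increase depth
            self.hits_in_row = 0
            self._step(+1)

def track_threshold(reversals, min_reversals=15, tol_db=2.0):
    """Mean of the last 8 reversals if both stability criteria hold, else None."""
    if len(reversals) < min_reversals:
        return None
    last8 = reversals[-8:]
    if abs(mean(last8[4:]) - mean(last8[:4])) >= tol_db:
        return None
    if stdev(last8) >= tol_db:
        return None
    return mean(last8)
```

A two-down, one-up rule converges on the depth at which the probability of a hit, p, satisfies p² = 0.5, i.e., p ≈ 0.707 (Levitt 1970).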

Thresholds were estimated repeatedly at the same modulation frequency a minimum of 13 times and until 2 stability criteria were met: 1) the absolute difference in mean threshold between the last 3 thresholds and the preceding 3 thresholds was required to be <2 dB, and 2) the SD of the last 6 thresholds was required to be <2 dB. The overall threshold was calculated as the mean of the last six thresholds. Other modulation frequencies were then tested in different orders across birds. Following estimation of thresholds at all six modulation frequencies, AM thresholds were estimated again, at least two more times at each modulation frequency, until the overall threshold changed by <2 dB. Final thresholds were computed as the average of the last two overall thresholds (where each overall threshold was the average of 6 session thresholds).

RESULTS

Behavioral AM thresholds in the budgerigar.

Behavioral thresholds for AM detection of a 4-kHz tone-carrier stimulus were estimated in four budgerigars over ∼400 test sessions in each bird. Sessions were conducted four to six times per day and consisted of 100–200 trials each. Behavioral AM thresholds in the budgerigar improved by ∼5 dB with increasing modulation frequency from 16 to 64 Hz (Fig. 3; n = 4 birds). Thresholds remained constant at −20 to −25 dB for modulation frequencies from 64 Hz up to the highest modulation frequency tested, 512 Hz.

Fig. 3.

Behavioral thresholds for amplitude modulation (AM) detection of a 4-kHz carrier signal in the budgerigar, plotted as a function of stimulus modulation frequency. Modulation depth is given in decibels [20 log10(m), where m is modulation index] and percent modulation (gray text; m × 100); more-sensitive thresholds appear toward the top of the plot. AM thresholds were obtained in 4 birds (top) with operant conditioning and 65 dB SPL, 500 ms stimuli. Error bars for each symbol show the SD of threshold estimates in individual birds. The solid black line shows AM thresholds averaged across birds. Also shown are previously published (Carney et al. 2014) average behavioral thresholds of humans (dotted black line) and rabbits (solid gray line) obtained with the same single-interval, 2-alternative behavioral task and similar stimulus parameters (carrier frequency: 5 kHz; level: 50 dB SPL; duration: 500 ms). Shaded gray regions show ±1 SD from the mean.
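The two depth scales used in Fig. 3 are related by the 20 log10(m) definition given in the caption. A minimal sketch of the conversion (function names are hypothetical) is:

```python
import math

def depth_db_to_index(depth_db):
    """Modulation index m from depth in dB, inverting depth_db = 20*log10(m)."""
    return 10.0 ** (depth_db / 20.0)

def index_to_depth_db(m):
    """Modulation depth in dB from modulation index m (m > 0)."""
    return 20.0 * math.log10(m)

def depth_db_to_percent(depth_db):
    """Percent modulation (m x 100) from depth in dB."""
    return 100.0 * depth_db_to_index(depth_db)
```

On this scale, 0 dB is full (100%) modulation, −20 dB corresponds to m = 0.1 (10% modulation), and −25 dB corresponds to m ≈ 0.056 (about 5.6% modulation).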

Behavioral sensitivity to AM tones has been studied previously in humans and rabbits (Carney et al. 2014) using similar stimuli (5 kHz carrier, 500 ms duration, 50 dB SPL) and the same single-interval behavioral discrimination task and threshold-tracking procedure used here in the budgerigar (Fig. 3). Compared with humans, behavioral AM thresholds of budgerigars were similar at modulation frequencies from 16 to 128 Hz and slightly more sensitive at 256 Hz. Compared with rabbit, budgerigar thresholds were ∼10 dB more sensitive across the full range of modulation frequencies studied.

Neural pure-tone tuning curves.

Neurophysiological data were collected from the IC of three awake, unrestrained budgerigars over a total of 68, 2-h recording sessions. Individual neural recording sites in the budgerigar IC typically showed excitatory rate responses to tone stimuli with sharp frequency tuning (Fig. 1, A and B). CF increased from ∼400 Hz at dorsal recording sites to 5 kHz at the most ventral sites. The 10-dB bandwidth of pure-tone tuning increased with increasing CF (Fig. 1B), whereas mean Q10 (CF divided by 10 dB bandwidth) increased from 2.0 at CF of 1 kHz to 3.5 at CF of 4 kHz. These values are similar to Q10 values reported over the same range of CFs in the auditory periphery of the starling (Manley et al. 1985) and both the periphery and IC of cats (Miller et al. 1997; Ramachandran et al. 1999). Excitatory rate responses were frequently observed at sound levels as low as 15 dB SPL (the lowest stimulus level tested), particularly at recording sites with CFs from 1.5 to 3.5 kHz.
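The two trends reported above are compatible because Q10 grows more slowly than CF. As a rough worked example, treating the mean Q10 values as if they applied to a single site at each CF (an approximation, because the mean of a ratio is not the ratio of the means):

```latex
Q_{10} = \frac{\mathrm{CF}}{\mathrm{BW}_{10}}
\quad\Longrightarrow\quad
\mathrm{BW}_{10} = \frac{\mathrm{CF}}{Q_{10}}

\mathrm{CF} = 1\ \mathrm{kHz}:\quad
\mathrm{BW}_{10} \approx \frac{1000\ \mathrm{Hz}}{2.0} = 500\ \mathrm{Hz}

\mathrm{CF} = 4\ \mathrm{kHz}:\quad
\mathrm{BW}_{10} \approx \frac{4000\ \mathrm{Hz}}{3.5} \approx 1143\ \mathrm{Hz}
```

A fourfold increase in CF is thus accompanied by a roughly 2.3-fold increase in 10-dB bandwidth, so absolute bandwidth widens even though tuning becomes relatively sharper.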

Neural MTFs.

Individual neural recording sites in the budgerigar IC exhibited robust responses to fully modulated tone (carrier frequency equal to BF, the frequency of the maximal rate response to pure tones)- and noise-carrier stimuli. Representative neural responses are shown in Fig. 4A. For tone-carrier stimuli, MTFs plotting average discharge rate by stimulus modulation frequency usually showed band-enhanced AM tuning, i.e., an excitatory rate response to AM relative to the unmodulated stimulus response that spanned a limited range of modulation frequencies (58/68 recording sites; e.g., Fig. 4B). BMFs varied from 50.8 to 512 Hz within recording sites with band-enhanced modulation tuning [median = 203.2 Hz, interquartile range (IQR) = 128–322.5 Hz; Fig. 5A] and showed no apparent association with BF (Pearson correlation between log-transformed variables, r = 0.081, P = 0.55). Remaining rate MTFs obtained with AM tones were high pass (9/68 sites) or all pass in shape (i.e., flat; 1/68 sites).

Fig. 4.

Representative neural responses from 1 recording site (B043-029) in the budgerigar midbrain to fully modulated tone (left)- and noise (right)-carrier stimuli. The carrier frequency of AM tones was set at the best frequency (BF) of the recording site, 2.4 kHz. A: poststimulus time histograms showing temporal variation in instantaneous firing rate. y-Axis limits range from 0 to 1,000 spikes/s. Modulation frequency is given above each plot. B: modulation transfer functions (MTFs) plotting variation in mean firing rate (top) and envelope synchrony (bottom) as a function of stimulus modulation frequency. Best modulation frequencies (BMFs) of rate-based MTFs are marked with an asterisk. Synchrony MTFs show both vector strength (VS; gray) calculated from pooled data and phase-projected VS (VSPP; black) calculated on a repetition-by-repetition basis. Error bars for firing rate and VSPP indicate ±1 SD.

Fig. 5.

MTF characteristics of individual recording sites in the budgerigar IC. A: percent enhancement of firing rate at the MTF peak, calculated relative to the unmodulated (0 Hz) condition, plotted as a function of BMF. Summary measurements from tone- and noise-carrier MTFs are plotted with different symbols. Tone-carrier MTFs had greater peak-rate enhancement and higher BMFs than noise-carrier MTFs. B: peak VS of synchrony-based MTFs plotted as a function of BMF (i.e., the frequency of peak synchrony). Maximum synchrony was greater for tone- than noise-carrier MTFs.

Rate MTFs obtained with AM noise were often band enhanced in shape as well (51/68 sites; Fig. 4B) but were limited to lower BMFs (median = 101.6 Hz, IQR = 80.6–153 Hz; Wilcoxon sign rank Z = 5.177, P < 0.0001; Fig. 5A) and showed less enhancement of discharge rate at the MTF peak than MTFs obtained with AM tones (noise-carrier median = 25.0%; tone-carrier median = 200.9%; Wilcoxon sign rank Z = 7.161, P < 0.0001). Noise-carrier MTFs not showing band-enhanced rate responses to AM were invariably all pass in shape (17/68 sites).
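The two summary measures used above, BMF and percent enhancement of rate at the MTF peak relative to the unmodulated condition, can be computed from a rate MTF as in the following sketch. The function name and the example rates are hypothetical, and taking the BMF as the tested modulation frequency with the highest rate is an assumption; the reported BMFs (e.g., 203.2 Hz) suggest that a finer frequency grid or interpolation was used.

```python
def rate_mtf_summary(mod_freqs_hz, rates, unmod_rate):
    """Return (BMF, percent rate enhancement at the MTF peak).

    mod_freqs_hz: tested modulation frequencies
    rates:        mean discharge rate at each modulation frequency (spikes/s)
    unmod_rate:   mean rate for the unmodulated (0 Hz) stimulus, must be > 0
    """
    peak_idx = max(range(len(rates)), key=lambda i: rates[i])
    bmf_hz = mod_freqs_hz[peak_idx]
    enhancement_pct = 100.0 * (rates[peak_idx] - unmod_rate) / unmod_rate
    return bmf_hz, enhancement_pct

# Hypothetical band-enhanced MTF: the rate peaks at 128 Hz at 3x the unmodulated rate.
bmf, enhancement = rate_mtf_summary(
    [16, 32, 64, 128, 256, 512], [40, 45, 60, 120, 90, 50], unmod_rate=40
)
```

With these example values the peak rate is three times the unmodulated rate, i.e., a 200% enhancement, comparable to the tone-carrier median reported above.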

Neural responses to AM tones and noise were generally well synchronized to the AM envelope of the stimulus. Envelope synchrony is evident in poststimulus time histograms (Fig. 4A) and period histograms (Fig. 6A) as temporal fluctuations in instantaneous discharge rate at the modulation frequency of the stimulus. Envelope synchrony peaked at modulation frequencies less than a few hundred hertz at most recording sites (tone-carrier median = 80.6 Hz; noise-carrier median = 101.6 Hz; Wilcoxon sign rank Z = −1.421, P = 0.155; Figs. 4B and 5B) and declined with both increasing and decreasing modulation frequency. Maximum synchrony was greater for tone-carrier (median = 0.75, IQR = 0.68–0.79) than for noise-carrier (median = 0.62, IQR = 0.55–0.67; Wilcoxon sign rank Z = 7.106, P < 0.0001; Fig. 5B) stimuli.
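Envelope synchrony of this kind is commonly quantified with vector strength (VS). The sketch below computes VS from pooled spike times and a phase-projected VS (VSPP) on a repetition-by-repetition basis, as distinguished in the caption of Fig. 4. The VSPP formula used here, the VS of each repetition multiplied by the cosine of the difference between that repetition's mean phase and the pooled mean phase, is the commonly used definition and is an assumption about the exact analysis; function names are hypothetical.

```python
import cmath
import math

def vector_strength(spike_times_s, fm_hz):
    """Return (VS, mean phase in radians) of spikes relative to the modulation cycle."""
    if not spike_times_s:
        return 0.0, 0.0
    resultant = sum(
        cmath.exp(2j * math.pi * fm_hz * t) for t in spike_times_s
    ) / len(spike_times_s)
    return abs(resultant), cmath.phase(resultant)

def phase_projected_vs(reps, fm_hz):
    """Per-repetition VSPP values for a list of spike-time lists (one per repetition)."""
    pooled = [t for rep in reps for t in rep]
    _, pooled_phase = vector_strength(pooled, fm_hz)
    values = []
    for rep in reps:
        vs, phase = vector_strength(rep, fm_hz)
        # Project each repetition's vector onto the pooled mean-phase direction.
        values.append(vs * math.cos(phase - pooled_phase))
    return values
```

Because VSPP penalizes repetitions whose phase disagrees with the pooled response, it can take negative values and is less inflated than VS when a repetition contains few spikes.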

Fig. 6.

Representative neural responses at 1 recording site (B044-032) to tones of varying modulation depth. Carrier frequency was set at BF (3.4 kHz). Modulation frequency is indicated at the top of each column. A: period histograms showing changes in instantaneous firing rate over 2 periods of the stimulus modulation frequency. y-Axis limits of period histograms range from 0 to 1,000 spikes/s. Modulation depth is indicated to the left of each row. B: modulation depth functions plotting firing rate (top) and envelope synchrony (VSPP; bottom) by stimulus modulation depth. Error bars indicate ±1 SD. Asterisks indicate AM detection thresholds calculated using receiver-operating characteristic analysis (70.7% correct classification).

Rate-based neural thresholds fail to explain behavioral sensitivity to low modulation frequencies.

Neural thresholds for AM detection of tone-carrier stimuli could be calculated based on ROC analysis of average rate responses (e.g., Fig. 6B) from 90% of modulation depth functions (220/244 functions), with every recording site contributing at least one measurable threshold (68/68 sites). Across individual neural recording sites, rate-based AM thresholds improved with increasing modulation frequency (Fig. 7A; n = 220 thresholds, 68 sites). The median rate threshold improved by ∼15 dB with increasing modulation frequency from 32 to 512 Hz. This pattern contrasts with behavioral sensitivity to AM tones in this species, which varied by <5 dB over the same range of modulation frequencies. Rate-based AM thresholds were not sensitive enough to explain behavioral AM sensitivity at modulation frequencies <128 Hz. At higher modulation frequencies, rate thresholds were frequently as sensitive as or more sensitive than behavioral thresholds.
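An ROC-based threshold of this kind can be sketched as follows, using the 70.7% criterion stated in the caption of Fig. 6. The details are assumptions rather than the authors' procedure: the area under the ROC curve is computed by pairwise comparison of responses to modulated and unmodulated stimuli, only response increases are scored as detection, and the threshold is taken as the lowest tested depth at which the criterion is met at that depth and at all greater depths, without interpolation. Function names are hypothetical.

```python
def roc_area(signal, noise):
    """Area under the ROC curve: P(signal > noise) + 0.5 * P(signal == noise)."""
    wins = 0.0
    for s in signal:
        for x in noise:
            if s > x:
                wins += 1.0
            elif s == x:
                wins += 0.5
    return wins / (len(signal) * len(noise))

def neural_threshold_db(depths_db, responses_by_depth, unmod_responses,
                        criterion=0.707):
    """Lowest depth (dB) meeting the criterion at that depth and all greater depths.

    Returns None when the criterion is not met even at the greatest depth,
    i.e., when no threshold is measurable.
    """
    order = sorted(range(len(depths_db)), key=lambda i: depths_db[i], reverse=True)
    threshold = None
    for i in order:  # walk from the greatest depth downward
        if roc_area(responses_by_depth[i], unmod_responses) >= criterion:
            threshold = depths_db[i]
        else:
            break
    return threshold
```

The same function applies to either response metric: passing per-repetition firing rates gives a rate-based threshold, and passing per-repetition synchrony values gives a synchrony-based threshold.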

Rate-based thresholds for AM detection of noise-carrier stimuli could be calculated in many cases as well (i.e., 109/216 depth functions, 56/68 recording sites). Compared with rate thresholds for tone-carrier stimuli, rate thresholds for AM noise were generally less sensitive and exhibited peak sensitivity at lower modulation frequencies (64–256 Hz; Fig. 7A; n = 109 thresholds, 56 sites). Compared with previously published behavioral AM thresholds obtained in this species with noise-carrier stimuli (Dooling and Searcy 1981), IC rate thresholds were not sensitive enough to explain behavioral AM thresholds at modulation frequencies <128 Hz. At higher modulation frequencies, the best rate thresholds observed within the neural population approached behavioral performance.

Perhaps not surprisingly, individual recording sites with more strongly peaked rate MTFs (i.e., stronger rate-based modulation tuning) tended to have more sensitive rate-based AM thresholds than recording sites with flatter-rate MTFs. This conclusion is supported by the results of correlation analyses conducted in a subgroup of rate-based AM thresholds measured near the BMF of the recording site (i.e., within ±0.75 octaves; Fig. 7A; tone carrier: n = 82 thresholds, 57 sites; noise carrier: n = 49 thresholds, 44 sites). Inverse relationships were observed between log-transformed, percent-rate enhancement of the MTF and normalized rate threshold (observed threshold minus the mean threshold of the population at the test modulation frequency) for both tone (r = −0.406, P = 0.0002)- and noise (r = −0.710, P < 0.0001; Pearson correlations)-carrier stimuli.

Rate-based AM thresholds of the pooled neural population were approximately as sensitive as the best thresholds of individual recording sites (Fig. 7B) and hence, were also insufficient to explain behavioral sensitivity to low AM frequencies. Pooling was conducted with a maximum likelihood-based decoder analysis that optimally combines information across neural recording sites based on their individual discharge statistics (Day and Delgutte 2013; Jazayeri and Movshon 2006), thus estimating the best performance of the system.
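The idea behind likelihood-based pooling can be illustrated with a simplified decoder. The sketch below assumes independent Poisson spike counts across recording sites, which is a simplification introduced here for illustration and may differ from the discharge statistics used in the actual analysis; function and variable names are hypothetical.

```python
import math

def poisson_loglik(counts, mean_counts):
    """Log-likelihood of observed spike counts given per-site mean counts.

    Assumes sites are independent and counts are Poisson distributed.
    """
    total = 0.0
    for n, mu in zip(counts, mean_counts):
        mu = max(mu, 1e-9)  # guard against log(0)
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total

def ml_classify(counts, mean_counts_by_stimulus):
    """Return the stimulus label whose mean response makes the counts most likely."""
    return max(
        mean_counts_by_stimulus,
        key=lambda label: poisson_loglik(counts, mean_counts_by_stimulus[label]),
    )

# Hypothetical two-site population: site 1 changes its rate strongly with AM, site 2 weakly.
means = {"unmodulated": [10.0, 10.0], "modulated": [20.0, 12.0]}
label = ml_classify([19, 13], means)
```

Under these assumptions, a site whose mean response is nearly the same for modulated and unmodulated stimuli contributes almost equally to both log-likelihoods and therefore carries little weight in the decision. Population thresholds would then follow from the proportion of single trials classified correctly at each modulation depth.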

Neural thresholds based on envelope synchrony explain behavioral AM sensitivity across stimulus conditions.

Neural response synchrony to the stimulus AM envelope typically increased with increasing modulation depth for both tone- and noise-carrier signals (Fig. 6B). Neural thresholds for AM detection based on ROC analysis of envelope synchrony could be calculated from nearly every modulation depth function recorded (tone carrier: 240/244 functions, 68/68 sites; noise carrier: 211/216 functions, 68/68 sites). Synchrony-based thresholds for AM detection of tone- and noise-carrier signals were generally most sensitive at modulation frequencies from 64 to 256 Hz (Fig. 7C; tone carrier: n = 240 thresholds, 68 sites; noise carrier: n = 211 thresholds, 68 sites). Notably, envelope synchrony thresholds were sufficiently sensitive to explain behavioral AM thresholds for both carrier types across the full range of modulation frequencies studied. Synchrony thresholds for tone-carrier stimuli were more sensitive than for noise at modulation frequencies >256 Hz, in agreement with behavioral data obtained with these two different carrier signals. Envelope synchrony thresholds were substantially more sensitive than rate thresholds at modulation frequencies up to 128 Hz for both carrier types. Envelope synchrony thresholds at higher modulation frequencies were similar to rate thresholds for noise-carrier stimuli and less sensitive than rate thresholds for tone-carrier stimuli.

Midbrain thresholds for AM detection are similar between budgerigar and rabbit.

The finding that behavioral thresholds for AM detection are ∼10 dB more sensitive in the budgerigar than in the rabbit raises the question of how thresholds of the IC compare between these two species. Neural AM thresholds in the rabbit were recalculated from an existing dataset of IC single-unit and multiunit neural recordings (Carney et al. 2014) using the same rate- and synchrony-based (VSPP) analyses used here. In general, rate and synchrony thresholds were similar between budgerigar and rabbits for both carrier types, up to the maximum modulation frequency studied in the rabbit, 256 Hz (Fig. 8). Whereas subtle differences exist between species (e.g., slightly more sensitive rate-based thresholds of rabbit single-unit responses to noise-carrier stimuli; Table 1), they are too minor to explain the differences in behavioral sensitivity to AM between budgerigar and rabbit.

Fig. 8.

Neural thresholds for AM detection of tone (left)- and noise (right)-carrier stimuli compared between budgerigar and rabbit. Rabbit thresholds based on (A) average firing rate and (B) envelope synchrony are shown for single-unit (diamonds; n = 16 for tone and 33 for noise carriers) and multiunit (circles; n = 113 for tone and 119 for noise carriers) neural recordings compared with budgerigar multiunit thresholds (black lines, median; thick, gray bands, 10th–90th percentile; from Fig. 7). Rabbit thresholds were recalculated from existing data (Carney et al. 2014) using the same analyses used here for the budgerigar recordings. Rate and synchrony thresholds were generally similar between rabbits and budgerigars, up to the highest modulation frequency studied, 256 Hz, for both carrier types, and showed minor differences for some comparisons (see Table 1).

Table 1.

Mean differences in neural AM thresholds between rabbit and budgerigar

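| Response Metric | Carrier Type | Rabbit Unit Type | Mean Diff. ± SE, dB | t | df | P |
|---|---|---|---|---|---|---|
| Average rate | Tone | Multi | 1.33 ± 0.58 | 2.28 | 177.7 | 0.024* |
| Average rate | Tone | Single | 1.91 ± 1.02 | 1.88 | 17.3 | 0.077 |
| Average rate | Noise | Multi | −0.75 ± 0.58 | −1.30 | 153.6 | 0.19 |
| Average rate | Noise | Single | −3.03 ± 0.86 | −3.52 | 50.3 | 0.001* |
| Envelope synchrony | Tone | Multi | 2.11 ± 0.78 | 2.69 | 161.1 | 0.008* |
| Envelope synchrony | Tone | Single | 4.58 ± 2.30 | 1.99 | 14.7 | 0.066 |
| Envelope synchrony | Noise | Multi | 0.75 ± 0.71 | 1.06 | 207.1 | 0.29 |
| Envelope synchrony | Noise | Single | 4.24 ± 1.24 | 3.40 | 32.2 | 0.002* |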

Negative mean differences (Diff.) indicate more sensitive (lower) amplitude modulation (AM) thresholds in the rabbit. AM thresholds were compared using 2-sample t-tests of data that were normalized for the effect of modulation frequency. Normalization was accomplished by subtracting predicted values of a local regression model (independent variable: log-transformed modulation frequency; smoothing parameter = 0.5; weighted linear least squares with 1st-degree polynomial) from observed values. Degrees of freedom (df) were calculated using Satterthwaite's approximation.

*P ≤ 0.05, statistically significant differences in neural AM sensitivity.
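The test statistics in Table 1 are consistent with an unequal-variance (Welch) two-sample t-test, for which Satterthwaite's approximation gives noninteger degrees of freedom. A minimal sketch is shown below; the function name is hypothetical, the inputs are assumed to be thresholds already normalized for modulation frequency, and the local regression step is not reproduced.

```python
import math
from statistics import mean, variance

def welch_t(rabbit, budgerigar):
    """Return (mean difference, SE, t, Satterthwaite df) for two samples.

    The difference is rabbit minus budgerigar, so negative values indicate
    more sensitive (lower) thresholds in the rabbit.
    """
    va = variance(rabbit) / len(rabbit)          # squared SE of the rabbit mean
    vb = variance(budgerigar) / len(budgerigar)  # squared SE of the budgerigar mean
    diff = mean(rabbit) - mean(budgerigar)
    se = math.sqrt(va + vb)
    t = diff / se
    df = (va + vb) ** 2 / (
        va ** 2 / (len(rabbit) - 1) + vb ** 2 / (len(budgerigar) - 1)
    )
    return diff, se, t, df
```

As a consistency check on the table, t is the mean difference divided by its SE; for the first row, 1.33/0.58 ≈ 2.29, which matches the tabulated value of 2.28 to within rounding.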

DISCUSSION

The present study compared neural and behavioral sensitivity to AM stimuli in the budgerigar to gain insight into the midbrain processing mechanisms underlying behavioral detection abilities. Behavioral thresholds for AM detection of tone-carrier stimuli in the budgerigar were similar to human AM thresholds, as shown previously for AM noise (Dooling and Searcy 1981) and tone-carrier stimuli with lower modulation frequencies (Carney et al. 2013). Budgerigar IC thresholds based on average discharge rate were not sensitive enough to explain behavioral AM thresholds at modulation frequencies <128 Hz for both tone and noise stimuli. In contrast, thresholds based on envelope synchrony could explain behavioral performance for both carrier types across the full range of modulation frequencies studied (16–512 Hz).

The results show that behavioral thresholds for AM detection in the budgerigar are best explained by envelope synchrony in the IC rather than average discharge rate. This conclusion is based on the observation that synchrony thresholds in the budgerigar IC frequently exceeded the sensitivity of behavioral AM thresholds. The best rate-based thresholds in the neural population, in contrast, were considerably less sensitive than behavioral thresholds at low modulation frequencies (e.g., by 10–15 dB at 32 Hz). Whereas rate thresholds in the budgerigar IC were theoretically sensitive enough to support behavioral AM sensitivity at modulation frequencies >128 Hz, reliance on envelope synchrony may still be advantageous at these frequencies, given that envelope synchrony remains a more reliable indicator of AM structure than average rate in the presence of competing acoustic features. For example, average discharge rates in the IC vary with SPL (Ramachandran et al. 1999) and sound-source location (Calford et al. 1985; Day and Delgutte 2013; Kuwada et al. 2006), in addition to modulation frequency and depth.

Average rate responses in the budgerigar IC remained insufficient to explain behavioral AM sensitivity at low modulation frequencies even after optimal pooling of information across recording sites. Indeed, AM thresholds of the pooled population were approximately as sensitive as the best thresholds of individual recording sites, consistent with previous pooling analyses of AM responses in the macaque auditory cortex (Johnson et al. 2012). Information pooling was conducted with a maximum likelihood-based pattern decoder that scales the contribution of individual population elements based on the reliability of their discharge statistics and hence estimates the upper limit of the performance of the system (Jazayeri and Movshon 2006). This pooling model has been shown to decode sound-source location from population rate responses in the rabbit IC (Day and Delgutte 2013, 2016) and the caudolateral region of the macaque cortex (Miller and Recanzone 2009) with performance similar to behaving animals.

The conclusion that envelope synchrony in the budgerigar IC provides a stronger neural correlate of behavioral AM thresholds than average discharge rate contrasts with previous findings in the rabbit, a relatively nonvocal species with limited behavioral AM detection abilities. The results of the rabbit study point to rate coding in the IC rather than envelope synchrony as the primary determinant of behavioral AM sensitivity (Carney et al. 2014). The best neural rate thresholds in the rabbit IC are approximately as sensitive as behavioral thresholds. Best synchrony thresholds, in contrast, are substantially more sensitive than the behaving animal and appear sufficient even to explain AM detection abilities in budgerigars and humans (Carney et al. 2014; Nelson and Carney 2007). A similar discrepancy between behavioral and synchrony-based estimates of AM sensitivity has been observed in macaque monkeys at the level of the primary auditory cortex (Johnson et al. 2012; Niwa et al. 2012). Macaques, which, like rabbits, struggle to detect low modulation frequencies (O'Connor et al. 2011), fail to match the best synchrony thresholds of cortical neurons during behavioral detection of AM, performing more closely to the best rate-based thresholds of the cortical population. Rate- and synchrony-based thresholds for AM detection have not been studied in the IC of nonhuman primates.

Greater behavioral sensitivity to AM in the budgerigar compared with the rabbit appears to reflect an improvement in the ability of more central auditory processing stages to decode envelope synchrony in IC responses rather than heightened sensitivity of IC responses per se. Indeed, both rate- and synchrony-based estimates of neural AM sensitivity were similar between the budgerigar and rabbit IC at modulation frequencies up to at least 256 Hz. The important distinction between species is that the budgerigar appears to make effective use of envelope synchrony in IC responses during behavioral AM detection, whereas the rabbit does not. An alternative explanation is that rabbits were simply not sufficiently trained at the AM detection task, but this possibility seems unlikely considering that individual rabbits maintained stable behavioral performance for ∼48,000 trials during testing at near-threshold modulation depths with the Bayesian procedure (Carney et al. 2014).

Human-like behavioral sensitivity to AM in the budgerigar, observed here for tone-carrier stimuli with modulation frequencies from 16 to 256 Hz, has been demonstrated previously at lower modulation frequencies, from 4 to 8 Hz (Carney et al. 2013). Whereas budgerigar AM thresholds were relatively stable from 64 to 512 Hz, human AM thresholds typically show a decline in sensitivity above 100–130 Hz, followed by improvement beyond 300–500 Hz (dependent on carrier frequency), as subjects begin to rely on spectral resolution of the stimulus sidebands for AM detection (Kohlrausch et al. 2000). It is unclear whether budgerigars lack this dip in AM sensitivity or whether a similar pattern might emerge with finer sampling of higher modulation frequencies, for example, closer to the minimum bandwidth of frequency tuning in IC neurons with CFs near 4 kHz (800–900 Hz; see Fig. 1B). Budgerigar AM thresholds for noise-carrier stimuli have been shown to decline steadily in sensitivity with increasing modulation frequency, from 5 to 1,280 Hz (Dooling and Searcy 1981). Budgerigar thresholds for AM noise are similar to thresholds of humans (Viemeister 1979), as well as European starlings and barn owls (Dent et al. 2002; Klump and Okanoya 1991), across this entire range of modulation frequencies.

Neural coding of AM in the budgerigar IC was broadly similar to coding in the midbrain of other bird species (Keller and Takahashi 2000; Woolley and Casseday 2005) and mammal species studied to date (Carney et al. 2014; Krishna and Semple 2000; Langner and Schreiner 1988; Nelson and Carney 2007; Rees and Møller 1987; Rees and Palmer 1989). Band-enhanced AM tuning, as observed at most budgerigar IC recording sites, is a common response property of IC neurons across taxa, although some mammalian species may possess a greater proportion of neurons with other rate-based MTF types (e.g., band suppressed and low pass) (Joris et al. 2004). The present results add to an emerging pattern of conserved IC physiological response properties in birds and mammals that in addition to similar AM response properties, includes shared tonotopic organization, diversity of frequency tuning curve shapes, and nonlinear spectral interactions [reviewed in Woolley and Portfors (2013)]. The combination of conserved physiological response properties and human-like perceptual capabilities, including temporal gap detection, modulation detection of linear-rippled noise, and detection of inharmonicity [reviewed in Dooling et al. (2000)], makes the budgerigar an interesting animal model for studying the neural bases of complex signal perception.

BMFs of rate-based MTFs in the budgerigar IC typically ranged from 130 to 320 Hz for tone-carrier stimuli presented at BF. Although these BMFs are similar to rate BMFs in some mammalian species [chinchilla: Langner et al. (2002); squirrel monkey: Müller-Preuss et al. (1994)] and higher than in others [gerbil: Krishna and Semple (2000); cat: Langner and Schreiner (1988); rabbit: Nelson and Carney (2007)], the extent to which these patterns reflect true species differences vs. differences in recording methodology is unclear. Multiunit neural recordings, as used here in the budgerigar IC, may shift the distribution of observed rate BMFs to higher modulation frequencies, either because they are less influenced by recording biases (i.e., neurons with higher BMFs may be more difficult to isolate as single units) or because they contain responses from lemniscal input fibers as well as IC neurons (Langner and Schreiner 1988). Whereas multiunit recordings could potentially yield different AM detection thresholds than single-unit recordings, similarity between multi- and single-unit AM thresholds in the rabbit IC (e.g., Fig. 8) suggests that this possibility is unlikely. Furthermore, we estimate that only a small number of neurons contributed to budgerigar IC recordings, based on both the moderate maximum discharge rates (<150–250 spikes/s) observed in response to Gaussian noise stimuli, which evoked robust neural activity at all recording sites, and our use of relatively high-impedance electrodes (3–5 MΩ) with small recording areas.

Between carrier types, budgerigar IC thresholds, based on both envelope synchrony and average rate, were more sensitive for tone-carrier stimuli than for AM noise. These differences, which correlate with differences in behavioral AM sensitivity, can ultimately be attributed to the stochastic nature of noise carrier signals, which in addition to any imposed sinusoidal AM, contain inherent temporal fluctuations in envelope amplitude related to narrow-band cochlear filtering of a wideband acoustic signal. Inherent envelope fluctuations, which are absent for AM tone stimuli, can interfere with neural representation of the target AM signal and ultimately mask behavioral detection. Because inherent envelope fluctuations can drive neurons, even when the stimulus modulation frequency is remote from the BMF, they can also explain the flatter-rate MTFs observed with noise-carrier stimuli compared with AM tones at many recording sites.

Sensitive AM processing abilities in the budgerigar may play an important role in the vocal communication behavior of this gregarious species. Budgerigars of both sexes produce a repertoire of temporally modulated contact calls used for coordination of group activity. Budgerigar vocalizations contain modulation frequencies ranging from 100 to 740 Hz (Lavenex 1999). Contact calls are learned through auditory feedback throughout life and, when new social bonds form, undergo a transformation during which the calls of individual birds converge on a shared call structure (Farabaugh et al. 1994). Male budgerigars also produce longer, multisyllabic warble songs that incorporate novel elements (e.g., mimicked environmental and interspecific sounds) and are directed at females (Tu et al. 2011). Warbles can go on for several minutes and play an important role in mating, courtship, and pair-bond maintenance. Similar behavioral AM sensitivity among the budgerigar, starling, and barn owl (Dent et al. 2002; Dooling and Searcy 1981; Klump and Okanoya 1991) suggests that this trait might be broadly shared across birds, rather than a specialization in the budgerigar.

In conclusion, the present study in the budgerigar demonstrates a clear correlation between thresholds for envelope synchrony in IC neurons and human-like behavioral AM sensitivity. Average rate responses, in contrast, which can explain behavioral performance in a species with more-limited abilities (rabbit), are unable to account for behavioral AM sensitivity in this vocal specialist. These new results highlight both the continued significance of envelope synchrony in the IC and the possible contribution of envelope synchrony to behavioral performance in other species with sensitive AM detection abilities, including humans (Nelson and Carney 2007). The importance of robust envelope synchrony in the IC should be a key design consideration in the development of stimulation strategies for central auditory prostheses (i.e., midbrain implants), which currently provide inadequate AM envelope cues for robust speech perception (Lim and Lenarz 2015).

GRANTS

Support for this work was provided by the National Institute on Deafness and Other Communication Disorders (Grants R01-DC001641 to L. H. Carney and K99-DC013792 to K. S. Henry).

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

Author contributions: K.S.H., F.I., and L.H.C. conception and design of research; K.S.H., E.G.N., and K.S.A. performed experiments; K.S.H. analyzed data; K.S.H., F.I., and L.H.C. interpreted results of experiments; K.S.H. prepared figures; K.S.H. drafted manuscript; K.S.H., E.G.N., K.S.A., F.I., and L.H.C. edited and revised manuscript; K.S.H., E.G.N., K.S.A., F.I., and L.H.C. approved final version of manuscript.

ACKNOWLEDGMENTS

Mitchell L. Day provided the analysis code for calculation of rate-based population thresholds.

REFERENCES

  1. Aitkin LM, Phillips SC. Is the inferior colliculus an obligatory relay in the cat auditory system? Neurosci Lett 44: 259–264, 1984.
  2. Bartlett EL, Wang X. Neural representations of temporally modulated signals in the auditory thalamus of awake primates. J Neurophysiol 97: 1005–1017, 2007.
  3. Baumann S, Griffiths TD, Sun L, Petkov CI, Thiele A, Rees A. Orthogonal representation of sound dimensions in the primate midbrain. Nat Neurosci 14: 423–425, 2011.
  4. Beckers GJ, ten Cate C. Perceptual relevance of species-specific differences in acoustic signal structure in Streptopelia doves. Anim Behav 62: 511–518, 2001.
  5. Bigalke-Kunz B, Rübsamen R, Dörrscheidt GJ. Tonotopic organization and functional characterization of the auditory thalamus in a songbird, the European starling. J Comp Physiol A 161: 255–265, 1987.
  6. Brittan-Powell EF, Dooling RJ, Gleich O. Auditory brainstem responses in adult budgerigars (Melopsittacus undulatus). J Acoust Soc Am 112: 999–1008, 2002.
  7. Calford MB, Wise LZ, Pettigrew JD. Coding of sound location and frequency in the auditory midbrain of diurnal birds of prey, families accipitridae and falconidae. J Comp Physiol A 157: 149–160, 1985.
  8. Carney LH, Ketterer AD, Abrams KS, Schwarz DM, Idrobo F. Detection thresholds for amplitude modulations of tones in budgerigar, rabbit, and human. Adv Exp Med Biol 787: 391–398, 2013.
  9. Carney LH, Zilany MS, Huang NJ, Abrams KS, Idrobo F. Suboptimal use of neural information in a mammalian auditory system. J Neurosci 34: 1306–1313, 2014.
  10. Chen C, Rodriguez FC, Read HL, Escabí MA. Spectrotemporal sound preferences of neighboring inferior colliculus neurons: implications for local circuitry and processing. Front Neural Circuits 6: 62, 2012.
  11. Covey E, Carr CE. The auditory midbrain in bats and birds. In: The Inferior Colliculus, edited by Winer JA and Schreiner CE. New York: Springer, 2005, p. 493–536.
  12. Day ML, Delgutte B. Decoding sound source location and separation using neural population activity patterns. J Neurosci 33: 15837–15847, 2013.
  13. Day ML, Delgutte B. Neural population encoding and decoding of sound source location across sound level in the rabbit inferior colliculus. J Neurophysiol 115: 193–207, 2016.
  14. Dent ML, Klump GM, Schwenzfeier C. Temporal modulation transfer functions in the barn owl (Tyto alba). J Comp Physiol A Neuroethol Sens Neural Behav Physiol 187: 937–943, 2002.
  15. Dooling RJ, Lohr B, Dent ML. Hearing in birds and reptiles. In: Comparative Hearing: Birds and Reptiles, edited by Dooling RJ, Fay RR, and Popper AN. New York: Springer, 2000, p. 308–359.
  16. Dooling RJ, Saunders JC. Hearing in the parakeet (Melopsittacus undulatus): absolute thresholds, critical ratios, frequency difference limens, and vocalizations. J Comp Physiol Psychol 88: 1–20, 1975.
  17. Dooling RJ, Searcy MH. Amplitude modulation thresholds for the parakeet (Melopsittacus undulatus). J Comp Physiol A 143: 383–388, 1981.
  18. Egan JP. Signal Detection Theory and ROC Analysis. New York: Academic, 1975.
  19. Farabaugh SM, Linzenbold A, Dooling RJ. Vocal plasticity in budgerigars (Melopsittacus undulatus): evidence for social factors in the learning of contact calls. J Comp Psychol 108: 81–92, 1994.
  20. Gleich O, Klump GM. Temporal modulation transfer functions in the European starling (Sturnus vulgaris): II. Responses of auditory-nerve fibers. Hear Res 82: 81–92, 1995.
  21. Goldberg JM, Brown PB. Response of binaural neurons of dog superior olivary complex to dichotic tonal stimuli: some physiological mechanisms of sound localization. J Neurophysiol 32: 613–636, 1969.
  22. Heffner H. Hearing in glires: domestic rabbit, cotton rat, feral house mouse, and kangaroo rat. J Acoust Soc Am 68: 1584, 1980.
  23. Jazayeri M, Movshon JA. Optimal representation of sensory information by neural populations. Nat Neurosci 9: 690–696, 2006.
  24. Johnson JS, Yin P, O'Connor KN, Sutter ML. Ability of primary auditory cortical neurons to detect amplitude modulation with rate and temporal codes: neurometric analysis. J Neurophysiol 107: 3325–3341, 2012.
  25. Joris PX, Schreiner CE, Rees A. Neural processing of amplitude-modulated sounds. Physiol Rev 84: 541–577, 2004.
  26. Joris PX, Yin TC. Responses to amplitude-modulated tones in the auditory nerve of the cat. J Acoust Soc Am 91: 215–232, 1992.
  27. Keller CH, Takahashi TT. Representation of temporal features of complex sounds by the discharge patterns of neurons in the owl's inferior colliculus. J Neurophysiol 84: 2638–2650, 2000.
  28. Kim KH, Kim SJ. Neural spike sorting under nearly 0-dB signal-to-noise ratio using nonlinear energy operator and artificial neural-network classifier. IEEE Trans Biomed Eng 47: 1406–1411, 2000.
  29. Klump GM, Okanoya K. Temporal modulation transfer functions in the European starling (Sturnus vulgaris): I. Psychophysical modulation detection thresholds. Hear Res 52: 1–11, 1991.
  30. Knudsen EI, Konishi M. Space and frequency are represented separately in auditory midbrain of the owl. J Neurophysiol 41: 870–884, 1978.
  31. Koay G, Heffner RS, Heffner HE. Behavioral audiograms of homozygous med(J) mutant mice with sodium channel deficiency and unaffected controls. Hear Res 171: 111–118, 2002.
  32. Kohlrausch A, Fassel R, Dau T. The influence of carrier level and frequency on modulation and beat-detection thresholds for sinusoidal carriers. J Acoust Soc Am 108: 723, 2000.
  33. Krishna BS, Semple MN. Auditory temporal processing: responses to sinusoidally amplitude-modulated tones in the inferior colliculus. J Neurophysiol 84: 255–273, 2000.
  34. Kuwada S, Fitzpatrick DC, Batra R, Ostapoff EM. Sensitivity to interaural time differences in the dorsal nucleus of the lateral lemniscus of the unanesthetized rabbit: comparison with other structures. J Neurophysiol 95: 1309–1322, 2006.
  35. Langner G, Albert M, Briede T. Temporal and spatial coding of periodicity information in the inferior colliculus of awake chinchilla (Chinchilla laniger). Hear Res 168: 110–130, 2002.
  36. Langner G, Schreiner CE. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. J Neurophysiol 60: 1799–1822, 1988.
  37. Lavenex PB. Vocal production mechanisms in the budgerigar (Melopsittacus undulatus): the presence and implications of amplitude modulation. J Acoust Soc Am 106: 491–505, 1999.
  38. Levitt H. Transformed up-down methods in psychoacoustics. J Acoust Soc Am 49: 467–477, 1970.
  39. Lim HH, Lenarz T. Auditory midbrain implant: research and development towards a second clinical trial. Hear Res 322: 212–223, 2015.
  40. Manley GA, Gleich O, Leppelsack HJ, Oeckinghaus H. Activity patterns of cochlear ganglion neurones in the starling. J Comp Physiol A 157: 161–181, 1985.
  41. Manley GA, Schwabedissen G, Gleich O. Morphology of the basilar papilla of the budgerigar, Melopsittacus undulatus. J Morphol 218: 153–165, 1993.
  42. Miller LM, Recanzone GH. Populations of auditory cortical neurons can accurately encode acoustic space across stimulus intensity. Proc Natl Acad Sci USA 106: 5931–5935, 2009.
  43. Miller RL, Schilling JR, Franck KR, Young ED. Effects of acoustic trauma on the representation of the vowel “eh” in cat auditory nerve fibers. J Acoust Soc Am 101: 3602–3616, 1997.
  44. Müller-Preuss P, Flachskamm C, Bieser A. Neural encoding of amplitude modulation within the auditory midbrain of squirrel monkeys. Hear Res 80: 197–208, 1994.
  45. Nelson PC, Carney LH. Neural rate and timing cues for detection and discrimination of amplitude-modulated tones in the awake rabbit inferior colliculus. J Neurophysiol 97: 522–539, 2007.
  46. Niwa M, Johnson JS, O'Connor KN, Sutter ML. Activity related to perceptual judgment and action in primary auditory cortex. J Neurosci 32: 3193–3210, 2012.
  47. O'Connor KN, Johnson JS, Niwa M, Noriega NC, Marshall EA, Sutter ML. Amplitude modulation detection as a function of modulation frequency and stimulus duration: comparisons between macaques and humans. Hear Res 277: 37–43, 2011.
  48. Ramachandran R, Davis KA, May BJ. Single-unit responses in the inferior colliculus of decerebrate cats. I. Classification based on frequency response maps. J Neurophysiol 82: 152–163, 1999.
  49. Rees A, Møller AR. Stimulus properties influencing the responses of inferior colliculus neurons to amplitude-modulated sounds. Hear Res 27: 129–143, 1987.
  50. Rees A, Palmer AR. Neuronal responses to amplitude-modulated and pure-tone stimuli in the guinea pig inferior colliculus, and their modification by broadband noise. J Acoust Soc Am 85: 1978–1994, 1989.
  51. Rhode WS, Greenberg S. Encoding of amplitude modulation in the cochlear nucleus of the cat. J Neurophysiol 71: 1797–1825, 1994.
  52. Robbins H, Monro S. A stochastic approximation method. Ann Math Stat 22: 400–407, 1951.
  53. Rosen MJ, Semple MN, Sanes DH. Exploiting development to evaluate auditory encoding of amplitude modulation. J Neurosci 30: 15509–15520, 2010.
  54. Rosen S. Temporal information in speech: acoustic, auditory and linguistic aspects. Philos Trans R Soc Lond B Biol Sci 336: 367–373, 1992.
  55. Sachs MB, Young ED, Lewis RH. Discharge patterns of single fibers in the pigeon auditory nerve. Brain Res 70: 431–447, 1974.
  56. Sayles M, Füllgrabe C, Winter IM. Neurometric amplitude-modulation detection threshold in the guinea-pig ventral cochlear nucleus. J Physiol 591: 3401–3419, 2013.
  57. Seshagiri CV, Delgutte B. Response properties of neighboring neurons in the auditory midbrain for pure-tone stimulation: a tetrode study. J Neurophysiol 98: 2058–2073, 2007.
  58. Shannon RV, Zeng F, Kamath V, Wygonski J, Ekelid M. Speech recognition with primarily temporal cues. Science 270: 303–304, 1995.
  59. Tu HW, Osmanski MS, Dooling RJ. Learned vocalizations in budgerigars (Melopsittacus undulatus): the relationship between contact calls and warble song. J Acoust Soc Am 129: 2289–2297, 2011.
  60. Viemeister NF. Temporal modulation transfer functions based upon modulation thresholds. J Acoust Soc Am 66: 1364–1380, 1979.
  61. Woolley SM, Casseday JH. Processing of modulated sounds in the zebra finch auditory midbrain: responses to noise, frequency sweeps, and sinusoidal amplitude modulations. J Neurophysiol 94: 1143–1157, 2005.
  62. Woolley SM, Casseday JH. Response properties of single neurons in the zebra finch auditory midbrain: response patterns, frequency coding, intensity coding, and spike latencies. J Neurophysiol 91: 136–151, 2004.
  63. Woolley SM, Portfors CV. Conserved mechanisms of vocalization coding in mammalian and songbird auditory midbrain. Hear Res 305: 45–56, 2013.
  64. Yin P, Johnson JS, O'Connor KN, Sutter ML. Coding of amplitude modulation in primary auditory cortex. J Neurophysiol 105: 582–600, 2011.

Articles from Journal of Neurophysiology are provided here courtesy of American Physiological Society
