Journal of Neurophysiology. 2015 Mar 18;113(10):3683–3691. doi: 10.1152/jn.00548.2014

The influence of cochlear spectral processing on the timing and amplitude of the speech-evoked auditory brain stem response

Helen E Nuttall 1,2, David R Moore 1,3, Johanna G Barry 1, Katrin Krumbholz 1, Jessica de Boer 1
PMCID: PMC4468972  PMID: 25787954

Abstract

The speech-evoked auditory brain stem response (speech ABR) is widely considered to provide an index of the quality of neural temporal encoding in the central auditory pathway. The aim of the present study was to evaluate the extent to which the speech ABR is shaped by spectral processing in the cochlea. High-pass noise masking was used to record speech ABRs from delimited octave-wide frequency bands between 0.5 and 8 kHz in normal-hearing young adults. The latency of the frequency-delimited responses decreased from the lowest to the highest frequency band by up to 3.6 ms. The observed frequency-latency function was compatible with model predictions based on wave V of the click ABR. The frequency-delimited speech ABR amplitude was largest in the 2- to 4-kHz frequency band and decreased toward both higher and lower frequency bands despite the predominance of low-frequency energy in the speech stimulus. We argue that the frequency dependence of speech ABR latency and amplitude results from the decrease in cochlear filter width with decreasing frequency. The results suggest that the amplitude and latency of the speech ABR may reflect interindividual differences in cochlear, as well as central, processing. The high-pass noise-masking technique provides a useful tool for differentiating between peripheral and central effects on the speech ABR. It can be used for further elucidating the neural basis of the perceptual speech deficits that have been associated with individual differences in speech ABR characteristics.

Keywords: speech-evoked auditory brain stem response, auditory temporal processing, cochlear response time, speech-in-noise, auditory filter


Deficits in temporal processing in the central auditory pathway are thought to contribute to difficulties in speech perception, particularly in noise (Boets et al. 2007; Pichora-Fuller and Souza 2003). Recent studies have proposed a neurophysiological correlate of temporal processing deficits in the scalp-recorded auditory brain stem response to speech (referred to hereafter as “speech ABR”), typically evoked by a consonant-vowel (CV) stimulus (Anderson et al. 2010a; Hornickel et al. 2011; Song et al. 2011).

It is generally assumed that the speech ABR is generated by the summed synchronous firing of neurons in the upper auditory brain stem (Chandrasekaran and Kraus 2010). The response consists of an onset peak, evoked by the high-frequency onset burst of the CV syllable, followed by a series of peaks that synchronize to the fundamental frequency (F0) of the periodic portion of the syllable, which comprises a formant transition period followed by a steady-state vowel. The onset peak of the speech ABR is generally thought to share common neural generators with the click ABR wave V. These latter generators are presumed to comprise onset- or primary-like units located in the lateral lemniscus as it enters the inferior colliculus (Melcher and Kiang 1996; Møller and Jannetta 1983). The neural generators of the periodic component of the speech ABR are less clear; these may similarly include onset- and primary-like units that respond to the sharp periodic peaks in the stimulus envelope, but they may alternatively or additionally comprise chopper-type neural units that phase-lock to the periodicity of the envelope (Pfeiffer 1966).

Abnormalities in the speech ABR have been repeatedly correlated with deficits in speech-in-noise perception, particularly in aging populations (Anderson et al. 2011, 2012; Ruggles et al. 2012) and in children, usually with language-related learning problems (Anderson et al. 2010b; Hornickel et al. 2011; Hornickel and Kraus 2013). A peripheral basis for these abnormalities in terms of hearing sensitivity has generally been ruled out, because the participant groups tested presented clinically normal audiograms. Instead, it has been suggested that, in these populations, the speech ABR abnormalities reflect reduced precision of phase locking in central neurons.

It can be questioned, however, whether normal audiometric thresholds guarantee normal suprathreshold cochlear function. There is evidence that even in listeners with normal audiometric thresholds, there is considerable variability in suprathreshold measures of cochlear amplification (Dubno et al. 2007; Sommers and Gehr 2010), which is presumed to be driven by outer hair cells and determines not only the sensitivity but also the frequency resolution and dynamic range of human hearing. Similarly, a large degree of interindividual variability has been reported for the medial olivocochlear reflex (MOCR), a neural feedback pathway that projects back into the cochlea and modulates the cochlear amplifier gain (Backus and Guinan 2007; Cooper and Guinan 2006).

The effect of cochlear processing on the ABRs evoked by simple stimuli has been well-documented (Dau 2003). Of particular importance for the click ABR wave V is the increase in cochlear response time from high-frequency (basal) to low-frequency (apical) regions. This increase results in part from the travelling wave delay, which is determined by the passive mechanical properties of the cochlear partition, but to a larger extent results from the increase in filter build-up time due to the narrowing of cochlear filters, the tuning of which is determined by the cochlear amplification process (Don et al. 1998). The frequency-dependent cochlear delays are preserved in the latency of the click ABR wave V, which increases by up to 3 ms when the cochlear place of origin is restricted and moved from base to apex (Burkard and Hecox 1983). In fact, the delays increase so rapidly toward the apex that responses from neurons tuned to lower frequencies contribute little to the ABR as a result of their extensive desynchronization (Dau 2003). Furthermore, upward spread of excitation causes higher frequency neurons to respond to lower frequency stimuli via the basal tails of their tuning curves. This means that at moderately high intensities, the ABR wave V mainly reflects activity from neurons tuned to higher frequencies, even when the stimulus has mainly low-frequency content (Dau 2003). A further consequence of the frequency-dependent variation in cochlear response time on the click ABR is that responses from neurons tuned to different frequencies can add together either constructively or destructively in the overall response, depending on their relative delays, which introduces a further cochlear source of variability (Don et al. 1994).

In this study, we investigated whether the speech ABR shows a similar dependence on cochlear frequency analysis as the click ABR. The investigation specifically focused on the periodic portion of the speech ABR, which has been suggested to reflect distinct neural processes compared with the click ABR wave V and is widely used to index the quality of speech encoding in the brain stem. The aim was to compare the relative amplitude and latency of this portion of the speech ABR as a function of frequency. To this purpose, speech ABRs were recorded from four octave-wide frequency regions between 0.5 and 8 kHz using a subtractive masking technique (Don and Eggermont 1978). Three main hypotheses were tested, based on previous findings on the click ABR wave V: first, that the latency of the frequency-delimited speech ABR increases with decreasing frequency, according to a power-law function that reflects the increase in cochlear filter width; second, that interindividual variability in the latency of the frequency-delimited speech ABR correlates with variability in cochlear filter bandwidth measured psychophysically; and third, that the amplitude of the frequency-delimited speech ABR as a function of frequency is not proportional to the stimulus spectrum, but instead is biased toward mid to high frequencies.

METHODS

Participants.

Twenty-six native English speakers (age range, 18–39 yr; mean age, 22.4 yr; 17 women) took part in this study. All participants had pure-tone hearing thresholds at or below 20 dB hearing level (HL) at octave frequencies between 250 and 8,000 Hz and presented a normal wave V response to 100-μs clicks presented monaurally at a 31.1-Hz repetition rate and a peak-equivalent (pe) level of 70-dB sound pressure level (SPL). Written informed consent was obtained from all participants. The experimental procedures were approved by the Ethics Committee of the University of Nottingham Medical School and were in accordance with the guidelines of the Declaration of Helsinki. Participants were paid at an hourly rate.

Design.

Speech ABRs were recorded in quiet and in five different high-pass noise-masking conditions. For all participants, speech ABRs were recorded across two blocks. In each block, responses were recorded for each of the six conditions so that two replicate responses were collected from each participant in each condition. The order of conditions was counterbalanced across participants and was the same for the first and second block of recordings. Fourteen of the participants attended one experiment in which only speech ABRs were recorded in one experimental session. The remaining 12 participants attended a second experiment in which speech ABRs were recorded in one session and additional tests, including psychophysical measurements of cochlear filter bandwidths (see below), were performed in a second session. Filter widths were measured at four different frequencies using notched noises with four notch widths. The measurements were performed in order from lowest to highest frequency and from narrowest to widest notch width.

Derived-band subtraction.

Derived-band speech ABRs were obtained by subtracting recordings acquired under different high-pass noise-masking conditions, as illustrated in Fig. 1. With the use of this method, derived-band responses were obtained from four adjacent frequency regions (0.5–1, 1–2, 2–4, and 4–8 kHz) that constituted octave-wide bands centered around 0.7, 1.4, 2.8, and 5.7 kHz, respectively. The resulting derived-band responses still reflect neural activity from the rostral brain stem, but now comprise responses only from those neural units that receive input from the cochlear frequency channels that fall within the respective octave-wide bands.
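The subtraction can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' MATLAB code: the function names and the dictionary layout are hypothetical, and the band center is assumed to be the geometric mean of the band edges, which reproduces the quoted values of 0.7, 1.4, 2.8, and 5.7 kHz after rounding.

```python
import math

# High-pass masker cutoffs (kHz) of the five masked conditions.
CUTOFFS_KHZ = [0.5, 1.0, 2.0, 4.0, 8.0]


def derived_bands(recordings):
    """Subtract adjacent high-pass-masked recordings to isolate octave bands.

    `recordings` (hypothetical layout) maps a masker cutoff in kHz to an
    averaged waveform given as a list of floats. The cochlear region below
    the cutoff is left free to respond, so subtracting the recording with
    the lower cutoff from the one with the higher cutoff leaves the
    response that originates between the two cutoffs.
    """
    bands = {}
    for low, high in zip(CUTOFFS_KHZ[:-1], CUTOFFS_KHZ[1:]):
        upper, lower = recordings[high], recordings[low]
        bands[(low, high)] = [u - l for u, l in zip(upper, lower)]
    return bands


def center_frequency(low, high):
    """Geometric mean of the band edges (assumed definition of the center)."""
    return math.sqrt(low * high)
```

For example, `derived_bands(recordings)[(4.0, 8.0)]` would hold the 4- to 8-kHz derived-band response, and `center_frequency(4.0, 8.0)` evaluates to about 5.66 kHz.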

Fig. 1.


Schematic of subtractive masking technique. The x-axis represents the center frequency along the cochlear partition. The gray-shaded area (Masker) represents the part of the cochlea masked by high-pass noise, and the open area (Response) represents the region that is left free to respond to the speech stimulus. In this example, the noise maskers used in recordings A and B were cut off above 8 and 4 kHz, respectively. When recording B is subtracted from recording A, the response to the speech stimulus that is common to both (below 4 kHz in this example) is canceled out, leaving a derived-band response from 4 to 8 kHz.

Stimuli.

The speech stimulus was a 170-ms CV syllable ([da]) with five formants and a constant 100-Hz fundamental frequency. The syllable was developed using a KLATT synthesizer and provided by Nina Kraus's group (Northwestern University, Chicago, IL). It comprised a 5-ms stop burst followed by a 50-ms formant transition region with a linearly rising F1 (400–720 Hz), a linearly falling F2 (1,700–1,240 Hz) and F3 (2,580–2,500 Hz), and a flat F4 (3,300 Hz) and F5 (3,900 Hz). The stop burst contained frequencies around F4 and F5. The high-pass noise maskers were composed of “equally exciting” noise, a uniform noise filtered to contain approximately equal energy within each cochlear filter [equivalent rectangular bandwidth (ERB); Glasberg and Moore 1990]. To generate the five high-pass maskers, this noise was high-pass filtered with cutoff frequencies of 0.5, 1, 2, 4, or 8 kHz, with the low-pass cutoff always at 12.2 kHz.

Sound generation and presentation were controlled by a TDT System 3 (Tucker Davis Technologies, Alachua, FL) and MATLAB (The MathWorks, Natick, MA). Stimuli were generated digitally at a 24.4-kHz sampling rate, digital-to-analog converted with a 24-bit amplitude resolution (TDT RP2.1) and amplified (TDT HB7). The [da] stimulus and the high-pass noise maskers were set to levels of 70 dB peSPL and 80 dB SPL per ERB, respectively, and were mixed digitally. Initial pilot experiments indicated that this combination of sound levels resulted in full masking of the speech ABR without high-pass filtering. Speech and noise were presented monaurally to the left ear via a magnetically shielded insert earphone (ER-1; Etymotic Research, Elk Grove Village, IL). In each recording, the noise was turned on 5 s before the speech stimuli and turned off after 2,000 responses had been accepted by the data acquisition system (see below).

Electrophysiology.

Speech ABRs were recorded using the Intelligent Hearing Systems SmartEP evoked potentials system (Miami, FL) in electric ABR mode, which allows the use of an external trigger. Electroencephalographic (EEG) signals were differentially measured between Ag-AgCl scalp electrodes placed at Cz (+) and the right earlobe (−). An electrode placed on the mid forehead (Fpz) served as the common ground. All electrode impedances were maintained below 5 kΩ. The raw EEG signal was amplified by a factor of $10^5$ and band-pass filtered online between 30 and 3,000 Hz. The external trigger was generated in MATLAB during stimulus presentation and initiated acquisition of a 200-ms poststimulus epoch. Epochs containing activity exceeding 35 μV were rejected as artifacts. Alternating polarity responses were averaged together online until 2,000 artifact-free epochs (1,000 for each stimulus polarity) were accepted for each condition. Data sets from two experiments were combined (see Design). Each experiment used the same equipment and recording procedures but different analog-to-digital sampling rates and interstimulus intervals (ISI). In the first experiment, stimuli were presented at an ISI of 130 ms and responses were sampled at 10 kHz. In the second experiment, stimuli were presented at an ISI of 85 ms and responses were sampled at 20 kHz. The two data sets were pooled after the latter were downsampled to 10 kHz. No significant differences in response amplitude or latency were found between the two experiments.
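The artifact rejection and polarity-balanced averaging can be illustrated with the following Python sketch. It does not reproduce the SmartEP acquisition logic, which is not described in detail here. The function name is hypothetical, the 35-μV criterion is assumed to apply to the absolute voltage, and the per-polarity quota is an assumption about how the 1,000 + 1,000 split was enforced.

```python
def average_accepted(epochs, polarities, reject_uv=35.0, n_per_polarity=1000):
    """Average artifact-free epochs, balanced across the two stimulus polarities.

    epochs: list of equal-length lists of voltages in microvolts.
    polarities: list of +1 / -1 labels, one per epoch.
    Returns (average waveform, number of accepted epochs).
    """
    accepted = {+1: 0, -1: 0}
    total = [0.0] * len(epochs[0])
    for epoch, pol in zip(epochs, polarities):
        if accepted[pol] >= n_per_polarity:
            continue  # quota for this polarity already filled (assumption)
        if max(abs(v) for v in epoch) > reject_uv:
            continue  # reject the epoch as an artifact
        total = [t + v for t, v in zip(total, epoch)]
        accepted[pol] += 1
        if all(n >= n_per_polarity for n in accepted.values()):
            break  # both polarities have reached their quota
    n = accepted[+1] + accepted[-1]
    if n == 0:
        raise ValueError("no artifact-free epochs")
    return [t / n for t in total], n
```

Averaging the two polarities together emphasizes the envelope-following part of the response, because components that invert with stimulus polarity cancel in the sum.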

Data analysis.

Offline data preprocessing and analysis were performed in MATLAB. All recorded responses were digitally band-pass filtered between 70 and 2,000 Hz using a 12 dB/octave zero-phase shift Butterworth filter. The onset peak latencies of the grand-average responses were estimated by manual peak picking. No estimates of onset latencies were performed for the individual derived-band responses, because this peak was not reliably present in all participants and frequency bands. The periodic portion of the response was analyzed in a time window between 22.7 and 170 ms. The frequency-dependent timing of this portion of the response was evaluated by measuring the relative delay between the derived-band responses and the broadband response. This was accomplished by cross-correlating the derived-band waveforms to the broadband response over a range of relative delays, or “lags,” that were imposed by shifting the derived-band waveform forward and backward in time (−4 to +4 ms). The lag at which the maximal cross-correlation occurred was taken to correspond to the latency difference between the derived-band and grand-average broadband response. For individual derived-band responses, this cross-correlation procedure was performed with respect to the grand-average broadband response, rather than the individual's broadband response, to maximize the signal-to-noise ratio in the cross-correlation. First, the delay between the individual's broadband response and the grand-average broadband response was estimated, using the same cross-correlation procedure. The grand-average broadband response was then aligned temporally with the individual's broadband response before being used in the cross-correlation procedure to estimate the delay of the individual derived-band response. 
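The lag search described above can be sketched as follows, in Python rather than the MATLAB used in the study. The function names are hypothetical, and the use of a normalized (Pearson) correlation at each lag is an assumption, since the text does not state how the cross-correlation was normalized. A positive result means that the derived-band waveform lags the reference.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def relative_latency_ms(derived, reference, fs_hz=10000.0,
                        max_lag_ms=4.0, window_ms=(22.7, 170.0)):
    """Lag (ms) of `derived` relative to `reference` with maximal correlation.

    The reference is held fixed within the analysis window while the
    derived-band waveform is read from a window shifted by each candidate
    lag between -max_lag_ms and +max_lag_ms.
    """
    start = int(round(window_ms[0] * fs_hz / 1000.0))
    stop = int(round(window_ms[1] * fs_hz / 1000.0))
    max_lag = int(round(max_lag_ms * fs_hz / 1000.0))
    ref = reference[start:stop]
    best_r, best_lag = -2.0, 0
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = start + lag, stop + lag
        if lo < 0 or hi > len(derived):
            continue  # shifted window would run off the epoch
        r = pearson(ref, derived[lo:hi])
        if r > best_r:
            best_r, best_lag = r, lag
    return best_lag * 1000.0 / fs_hz
```

At the 10-kHz sampling rate the lag resolution of this sketch is 0.1 ms. For an individual participant, `reference` would be the grand-average broadband response after it has been shifted by that participant's own broadband delay.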
The delay between the derived-band response and the broadband response is henceforth referred to as “relative latency.” The relative latency could take a positive or negative value, indicating that the derived-band response started later (lag) or earlier (lead) than the broadband response, respectively. Note that the relative latency reflects only the frequency-dependent portion of the response latency and does not include the frequency-independent neural conduction delay between the cochlea and the neural generators of the speech ABR, presumed to be located in the rostral auditory brain stem. The amplitude of the periodic portion of the derived-band responses was estimated by calculating the complex cross-spectrum between two replicate waveforms within the 22.7- to 170-ms time window and summing its real part across frequency. Use of the cross-spectrum reduces the bias introduced by random noise, because it includes only those signal components that have the same phase (timing) in both replicates. To obtain the grand-average amplitudes, individual cross-spectra were averaged and the real part of the resulting grand-average complex spectrum was summed.
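A minimal Python sketch of the cross-spectrum measure is given below. Several details are assumptions, because the text does not specify them: the function names are hypothetical, the sum is restricted to the 70- to 2,000-Hz passband of the offline filter, and the result is scaled by 2/N² so that a sinusoid of amplitude A present in both replicates contributes A²/2. The conversion from this power-like quantity to an amplitude in microvolts is likewise not specified and is omitted.

```python
import cmath
import math


def dft_bins(x, bins):
    """Discrete Fourier transform of x evaluated only at the requested bins."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(x)) for k in bins]


def cross_spectrum_power(rep1, rep2, fs_hz, band_hz=(70.0, 2000.0)):
    """Sum of the real part of the cross-spectrum of two replicates.

    Only components with the same phase in both replicates add to the real
    part; independent noise contributes terms that average toward zero.
    """
    n = len(rep1)
    bins = [k for k in range(1, n // 2)
            if band_hz[0] <= k * fs_hz / n <= band_hz[1]]
    x1 = dft_bins(rep1, bins)
    x2 = dft_bins(rep2, bins)
    real_sum = sum((a * b.conjugate()).real for a, b in zip(x1, x2))
    return 2.0 * real_sum / n ** 2  # assumed scaling: mean-square units
```

The direct transform used here is slow for long records but keeps the sketch free of external dependencies; a fast Fourier transform would give the same values.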

Fitting procedures.

The function relating the relative latency of the derived-band responses to the band center frequency was fitted to the model developed by Strelcyk et al. (2009) for the click ABR wave V. The model predicts that the click ABR wave V latency varies as a function of frequency and level according to the following equation:

$$t(f,i) = a + b\,c^{0.93 - i}\,f^{-d}, \quad (1)$$

where a, b, c, and d are free parameters and f and i are the stimulus frequency and level, respectively. The parameter a represents the asymptotic delay reached as frequency approaches infinity, which is independent of frequency and level; it can be interpreted as the summed postcochlear neural and synaptic delays; b represents the cochlear response time at a frequency of 1 kHz and a pe click level of 93 dB SPL; c describes the level dependence and d the frequency dependence of the wave V latency. Strelcyk et al. (2009) found the population means of these parameters to be a = 4.7 ms, b = 3.4 ms, c = 5.2, and d = 0.5. In the present study, the aim was to evaluate whether the latency of the periodic portion of the speech ABR follows a power-law dependence on frequency similar to that of the click ABR wave V. To this purpose, the relative latencies of individual derived-band speech ABR latencies were fitted to Eq. 1 as a function of derived-band center frequency. The free parameter of interest was d. Because the model was fitted to relative rather than absolute latencies, parameter a in this fit does not represent the summed neural and synaptic delays, but instead reflects the asymptotic delay of the derived-band response relative to the broadband response as frequency approaches infinity. The value of this parameter could not be established a priori or inferred from the click ABR wave V results, and it was therefore retained as a free parameter in the model fit. The parameters b and c were kept fixed at the population mean values estimated by Strelcyk et al. (2009) to reduce the number of free parameters in the fitting procedure and avoid overfitting. The stimulus level i was set to the 70-dB SPL stimulus level used in the present study. The fit function was thus simplified to the following equation:

$$t(f) = a + B\,f^{-d}, \quad (2)$$

where $B = 3.4 \times 5.2^{0.93 - 0.70} = 4.96$. To evaluate the model, fits were first obtained for individual participants by using a nonlinear least-squares procedure, which minimized the sum of squared errors between the predicted (Eq. 2) and the observed latencies. Subsequently, population mean estimates for the fit parameters were obtained by submitting all individual data combined to a statistical model (see Statistical analyses for further details). The starting point for d in the fitting procedures was set at the population mean found by Strelcyk et al. (estimated using a mixed-effects model approach) for the click ABR wave V, and the starting point for parameter a was set to zero.
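The simplified model and a least-squares fit of a and d can be sketched in Python as follows. The sketch assumes that Eq. 2 has the form t(f) = a + B f^(−d), with f in kHz, and that the level enters as 70 dB SPL divided by 100, as implied by the exponent 0.93 − 0.70. The fitting routine is a stand-in and not the procedure used in the study: for a fixed d the best a has a closed form, so a golden-section search over d is used, which assumes a single minimum within the search bracket. All function names are hypothetical.

```python
import math

# Cochlear term at 70 dB SPL; evaluates to about 4.97 (quoted as 4.96 in the text).
B = 3.4 * 5.2 ** (0.93 - 0.70)


def model(f_khz, a, d):
    """Relative latency (ms) predicted for a band centered at f_khz."""
    return a + B * f_khz ** (-d)


def fit_a_d(freqs_khz, latencies_ms, d_lo=-1.0, d_hi=3.0, tol=1e-6):
    """Least-squares estimates of a and d (stand-in for a generic NLS routine)."""
    def a_hat(d):
        # For fixed d, the optimal offset is the mean of the remaining residual.
        return sum(y - B * f ** (-d)
                   for f, y in zip(freqs_khz, latencies_ms)) / len(latencies_ms)

    def sse(d):
        a = a_hat(d)
        return sum((y - model(f, a, d)) ** 2
                   for f, y in zip(freqs_khz, latencies_ms))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = d_lo, d_hi
    while hi - lo > tol:
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    d = (lo + hi) / 2.0
    return a_hat(d), d
```

Calling `fit_a_d([0.7, 1.4, 2.8, 5.7], latencies)` with one participant's four relative latencies would return that participant's estimates of a and d.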

Behavior.

To assess psychophysical frequency selectivity, cochlear filter bandwidths were measured using the simultaneous notched-noise masking method (Glasberg and Moore 1990; Patterson 1974, 1976). In the present study, an abbreviated, audiometer-based version of this method was used, which was developed and provided by Glasberg and Moore (University of Cambridge, Cambridge, UK). This test involves measuring the detection threshold of a pulsed tone signal (20-ms raised-cosine ramps, 160-ms steady duration, 200-ms interval between pulses) in a simultaneously presented noise masker with a spectral notch centered on the signal frequency. The masker notch width is specified as the notch width, Δf, divided by the signal frequency, f, i.e., Δf/f.

The pulsed tone and notched noise were presented using a two-channel audiometer. During each measurement, the noise was presented continuously and the pulsed tone was presented for about 1 s once every 2–4 s. Participants were asked to press a button when they heard the pulsed tone. The time of signal presentation and the noise level were controlled by the experimenter.

The threshold procedure was similar to that used in pure-tone audiometry (British Society of Audiology 2004). However, in the present study the tone signal level was held constant and the masker level was varied. Thresholds were measured for four signal frequencies (0.5, 1, 2, and 4 kHz) and four masker notch widths (0.0, 0.1, 0.2, and 0.3). For each signal frequency, detection thresholds were first measured for the pulsed tone in quiet, using a final step size of 2 dB. The level of the pulsed tone was then fixed at 10 dB above threshold, and the level of the noise was varied, again using a step size of 2 dB, to find the noise level at which the tone was just audible for each notch width.

The method assumes that a signal at a frequency f, in a notched noise, is detected within the cochlear filter centered on f. The bandwidth of this filter is expressed as the ERB and was estimated from the change in threshold with increasing notch width using the fitting procedure developed by Glasberg and Moore (1990). Linear regression lines were fitted to the auditory filter bandwidths as a function of frequency for each participant. The resulting fit coefficients were then used to predict auditory filter bandwidth values at 0.7, 1.4, 2.8, and 5.7 kHz for each participant.
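The per-participant regression and prediction step can be sketched in Python as below. The function names are hypothetical. The usage example feeds in normative values from the Glasberg and Moore (1990) formula, ERB = 24.7(4.37F + 1) Hz with F in kHz, purely as stand-in data; the study used each participant's measured bandwidths.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_bandwidths(freqs_khz, erbs_hz, targets_khz=(0.7, 1.4, 2.8, 5.7)):
    """Predict filter bandwidths at the derived-band center frequencies."""
    slope, intercept = fit_line(freqs_khz, erbs_hz)
    return [intercept + slope * f for f in targets_khz]


def normative_erb_hz(f_khz):
    """Glasberg and Moore (1990) ERB for normal hearing (stand-in data only)."""
    return 24.7 * (4.37 * f_khz + 1.0)


measured_freqs = [0.5, 1.0, 2.0, 4.0]
stand_in_erbs = [normative_erb_hz(f) for f in measured_freqs]
predicted = predict_bandwidths(measured_freqs, stand_in_erbs)
```

Note that the value at 5.7 kHz lies above the highest measured signal frequency (4 kHz), so in this scheme it is an extrapolation of the fitted line.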

Statistical analyses.

Statistical analyses were conducted in the statistical software package R (R Core Team 2013). To evaluate the frequency dependence of the three outcome measures (relative latencies and amplitudes of derived-band speech ABRs, psychophysical cochlear filter bandwidths), a repeated-measures design with frequency as the independent within-participants variable was used. Mixed-effects models were used to account for interindividual variability (“nlme” package, Pinheiro et al. 2013). These models incorporate both fixed effects, which describe the population behavior, and random effects, which describe the variation between experimental units, in this case the individual participants. Model residuals were inspected for violations of the assumption of homogeneity of variance and normality using Levene's test (“car” package, Fox and Weisberg 2011) and inspection of quantile-quantile plots, respectively.

Linear mixed-effects models were used for the derived-band amplitude and psychophysical cochlear filter bandwidths, with frequency entered as a categorical factor, and a participant-related random effect included for the intercept. The general formula for these statistical models was thus

$$Y_{i,j} = \beta_0 + \beta_i + b_{0,j} + \varepsilon_{i,j}, \quad (3)$$

where $Y_{i,j}$ is the observed value of the dependent variable for participant j at frequency i; $\beta_0$ represents the fixed intercept, $\beta_i$ the fixed effect at frequency i, $b_{0,j}$ the random intercept for participant j, and $\varepsilon_{i,j}$ the residual error for participant j at frequency i.

Data points that exerted disproportionate influence on model parameters were identified using the Cook's distance measure (“lme4” package, Bates et al. 2013; “influence.ME” package, Nieuwenhuis et al. 2012). The cutoff level was defined as 4/N, where N is the total number of observations. These points were removed if their inclusion was found to have a significant effect on the fixed effects. On the basis of these criteria, one influential data point was removed from the amplitude data. No data points were removed from the psychophysical cochlear filter bandwidths, but a log transformation was applied to the data to remedy the violation of homogeneity of variance observed in the residuals of the model. For post hoc comparisons between different levels of frequency as a fixed factor, Tukey's honestly significant difference test was applied (“lsmeans” package, Lenth and Herve 2014).

For the relative latency data, a nonlinear mixed-effects model was used initially, in which the frequency dependence was described by Eq. 2. In this model, parameters a and d were entered as fixed effects, for which random effects were also included. The full statistical model was described by

$$\mathrm{lat}_{i,j} = \beta_a + b_{a,j} + 4.96 \times \mathrm{freq}_i^{-(\beta_d + b_{d,j})} + \varepsilon_{i,j}, \quad (4)$$

where $\mathrm{lat}_{i,j}$ is the relative latency observed in participant j at frequency i; $\beta_a$ and $\beta_d$ represent the fixed effects, and $b_{a,j}$ and $b_{d,j}$ the random effects, associated with parameters a and d, respectively; and $\varepsilon_{i,j}$ is the residual error for participant j at frequency i. Neither of the random effects was found to contribute significantly to the model fit according to a log-likelihood ratio test. Therefore, a nonlinear least-squares regression model (“nls” in the core R package) was used instead, in which only fixed-effect terms for a and d were evaluated. The homogeneity of variance assumption was found to be violated for the residuals of the nonlinear regression model fit, mainly due to a greater variance in the highest frequency band compared with the lower frequency bands. To evaluate the effect of the inhomogeneity on the estimated values for a and d, a frequency-dependent weighting was applied to the data in the nonlinear regression model, with the weighting factors set to the inverse of the variance observed in each frequency band. The difference in the estimates for a and d between the weighted and nonweighted nonlinear regression was found to be less than the respective standard errors, indicating that the inhomogeneity did not have a substantial effect on the parameter estimates. Both nonweighted and weighted estimates are reported in results. To evaluate the goodness of fit of the (nonweighted) nonlinear regression, a parametric bootstrap procedure was performed on the R-squared value (Stute et al. 1993). This procedure tests the hypothesis that the observed data belong to the distribution of expected outcomes for the fitted model. One thousand samples of “repeat experiment” data were generated for the same number of participants and observations included in the original model. 
The simulated data points were calculated based on the model estimates for the fixed effects, and individual variability was simulated by adding to each data point a random sample of noise taken from a normal distribution with zero mean and a standard deviation equal to the model estimate of the residual error. Each simulated data set was submitted to the original model and the R-squared value calculated, thus generating a distribution of simulated R-squared values. The hypothesis was rejected if the proportion of simulated R-squared values that fell below that of the actual data was greater than (1 − α), where α is the significance level. If the hypothesis was not rejected, this implied that the observed data were a typical outcome of the model.
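The bootstrap logic can be sketched in Python as follows. This is an illustration under stated assumptions rather than the R code used in the study: the model is taken to be lat = a + 4.96 f^(−d), the fit is a simple golden-section search over d standing in for "nls", the residual standard deviation is estimated with n − 2 degrees of freedom, and all function names are hypothetical.

```python
import math
import random

B = 4.96  # fixed cochlear term of the simplified latency model


def fit_a_d(freqs, lats, lo=-1.0, hi=3.0, tol=1e-6):
    """Least-squares a and d for lat = a + B * f**(-d); returns (a, d, SSE)."""
    def a_hat(d):
        return sum(y - B * f ** (-d) for f, y in zip(freqs, lats)) / len(lats)

    def sse(d):
        a = a_hat(d)
        return sum((y - a - B * f ** (-d)) ** 2 for f, y in zip(freqs, lats))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        m1, m2 = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    d = (lo + hi) / 2.0
    return a_hat(d), d, sse(d)


def r_squared(lats, sse):
    """Coefficient of determination from the residual sum of squares."""
    mean = sum(lats) / len(lats)
    sst = sum((y - mean) ** 2 for y in lats)
    return 1.0 - sse / sst


def bootstrap_r2(freqs, lats, n_sim=1000, seed=0):
    """Observed R-squared and the proportion of simulated values below it."""
    rng = random.Random(seed)
    a, d, sse = fit_a_d(freqs, lats)
    r2_obs = r_squared(lats, sse)
    sigma = math.sqrt(sse / (len(lats) - 2))  # residual SD (assumed estimator)
    below = 0
    for _ in range(n_sim):
        # Simulate a "repeat experiment": model prediction plus Gaussian noise.
        sim = [a + B * f ** (-d) + rng.gauss(0.0, sigma) for f in freqs]
        _, _, sse_sim = fit_a_d(freqs, sim)
        if r_squared(sim, sse_sim) < r2_obs:
            below += 1
    return r2_obs, below / n_sim
```

Under the decision rule stated above, the hypothesis would be rejected when the returned proportion exceeds 1 − α, that is, 0.95 at α = 0.05.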

Fitting of individual data to linear (auditory filter bandwidths) and nonlinear (latencies) frequency functions was performed using least-squares regression. Correlations between variables were evaluated using Pearson's r, and mean comparisons were performed using two-tailed Student's t-tests, or the Wilcoxon signed-rank test when data were nonnormally distributed. In all analyses, the α level for significance was set at 0.05, Bonferroni-corrected for multiple comparisons where appropriate.

RESULTS

Overview.

Figure 2A shows the grand-average broadband speech ABR (i.e., recorded in the absence of frequency-delimiting high-pass noise) overlaid with the stimulus waveform. The stimulus has been moved forward in time in the plot to visually align the stimulus envelope with the periodic response peaks, to show the time-locking of the response to the periodicity of the stimulus envelope. The grand-average onset response peak latency was 10.2 ms. Figure 2, B–E, shows the grand-average derived-band speech ABRs (gray) overlaid with the broadband response (black). The onset peak was observable only in the two highest frequency bands, as would be expected given the high-frequency content of the onset burst. The grand-average onset latency in the 5.7-kHz band was 9.8 ms, which is 0.9 ms earlier than the onset latency in the 2.8-kHz band, which was 10.7 ms. This difference is roughly in line with predictions based on the click ABR wave V model (Eq. 1; the model prediction for the delay at 70 dB SPL is 0.88 ms). No further analysis of the frequency dependence of the onset response was possible because of the low amplitude of the response in the lower frequency bands and in the individual derived-band responses overall.
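The quoted model prediction can be checked with a short calculation. The Python sketch below assumes that the frequency-dependent term of the click ABR model at 70 dB SPL is B f^(−d), with B = 4.96 ms, d = 0.5 (the population mean reported by Strelcyk et al. 2009), and the nominal band centers of 2.8 and 5.7 kHz; the function name is hypothetical.

```python
B_MS = 4.96  # cochlear term at 70 dB SPL, from the Fitting procedures section
D = 0.5      # population-mean frequency exponent (Strelcyk et al. 2009)


def cochlear_delay_ms(f_khz):
    """Frequency-dependent part of the modeled wave V latency."""
    return B_MS * f_khz ** (-D)


# The frequency-independent term a cancels in the difference between bands.
predicted_ms = cochlear_delay_ms(2.8) - cochlear_delay_ms(5.7)
observed_ms = 10.7 - 9.8
```

With these inputs `predicted_ms` comes out at roughly 0.89 ms, which agrees with the quoted 0.88 ms to within rounding of the inputs and is close to the observed difference of 0.9 ms.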

Fig. 2.


Grand-average speech auditory brain stem response (ABR) waveforms. A: broadband response (black) overlaid with the stimulus (gray), which has been shifted forward in time to align the periodic peaks in stimulus and response. Brackets indicate the different regions of the speech ABR. B–E: derived-band speech ABRs (gray) overlaid with broadband response (black) at 0.7 (B), 1.4 (C), 2.8 (D), and 5.7 kHz (E).

In contrast, the periodic portion of the speech ABR showed identifiable responses in each octave band, based on both the reproducibility of the waveform and its resemblance to the broadband response (Table 1). This implies that this part of the response includes contributions from an extensive region of the cochlea, spanning several octaves. The relative latency of the derived-band speech ABRs decreased systematically from low- to high-frequency regions. This is evident from the relative timing between the waveforms of the derived-band and broadband responses, which changed from a just discernible lead of the derived-band response in the highest frequency band (center frequency = 5.7 kHz) to a notable lag in the lowest band (0.7 kHz) (Fig. 2, B–E, Table 1). The amplitude of the speech ABR was greatest in the 2.8-kHz band (Fig. 2D, Table 1) and substantially decreased toward lower and higher frequencies. These observations are in qualitative agreement with the frequency dependence of the click ABR wave V. In the following sections, these observations are tested statistically on the basis of the individual data.

Table 1.

Relative latency, amplitude, and waveform cross-correlations

Center Frequency, kHz    Response Latency, ms    Response Amplitude, μV    rRep    rBB
0.7                      2.2                     0.53                      0.55    0.64
1.4                      0.4                     0.70                      0.62    0.74
2.8                      −0.5                    0.95                      0.76    0.71
5.7                      −1.1                    0.46                      0.43    0.50

Values are relative latency, amplitude, and waveform cross-correlations of grand-average speech auditory brain stem responses recorded from different frequency regions. Cross-correlations were calculated between replicate waveforms (rRep) and between derived-band and broadband waveforms (rBB).

Speech ABR latency decreases with increasing derived-band center frequency.

For the individual derived-band speech ABRs, the median relative latency of the periodic portion of the response decreased with increasing center frequency (Fig. 3). As a first step, the power-law model adapted from Strelcyk et al. (2009) (Eq. 2) was fitted to the latency-frequency function of each participant separately to evaluate the variation in model fits and the range of parameter values across participants (Fig. 3B). The median values of the individually fitted parameters were d = 0.46 (range −0.14 to 1.01) and a = −3.93 ms (range −5.17 to −2.96 ms).

Fig. 3.

Relative latency of the periodic portion of the speech ABR as a function of derived-band center frequency. A: medians (central marks) and interquartile ranges (box edges) of individual latencies grouped by center frequency. Crosses show outliers (see methods); whiskers show the highest and lowest values not considered outliers. B: individual latency-frequency functions. Open circles and gray lines show individual data and regression lines for model fits. The population model fit is shown by the dashed black line.

Next, the set of individual relative latencies was submitted to a nonlinear least-squares regression model (see methods) to obtain population estimates for the fit parameters. The resulting estimates for d and a were 0.46 (0.06) and −3.83 (0.18) ms, respectively [means (SE); see Fig. 3B, black dashed line]. These estimates were not substantially altered when the nonlinear least-squares regression was repeated with a weighting factor applied to each frequency level, equal to the inverse of the variance at that frequency [a: −3.81 (0.14) ms; d: 0.5 (0.05)]; this suggests that the estimates were not affected by the inhomogeneity of variance observed in the fit residuals (see methods).
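A weighted least-squares fit of this kind can be sketched in Python as follows. Because Eq. 2 is not reproduced in this section, the sketch assumes a generic power-law form, latency = a + b·f^(−d) with f in kHz, in which the scale factor b is held fixed at a hypothetical value. For each candidate d the best-fitting a has a closed form, so only d is searched on a grid; this is a simplification for illustration and not the regression routine used in the study. The data and variances are synthetic.

```python
def fit_power_law(freqs_khz, latencies_ms, b_ms, weights=None):
    """Weighted least-squares fit of latency = a + b_ms * f**(-d).

    Returns (a, d, weighted_sse). For each candidate d the optimal a is the
    weighted mean of (latency - b_ms * f**(-d)), so only d is grid-searched.
    """
    if weights is None:
        weights = [1.0] * len(freqs_khz)
    wsum = sum(weights)
    best = None
    for step in range(3001):  # d from -1.000 to 2.000 in steps of 0.001
        d = -1.0 + step * 0.001
        g = [b_ms * f ** (-d) for f in freqs_khz]
        a = sum(w * (y - gi) for w, y, gi in zip(weights, latencies_ms, g)) / wsum
        sse = sum(w * (y - a - gi) ** 2 for w, y, gi in zip(weights, latencies_ms, g))
        if best is None or sse < best[2]:
            best = (a, d, sse)
    return best

# Synthetic, noise-free data generated from the assumed model.
freqs = [0.7, 1.4, 2.8, 5.7]
true_a, true_d, b_ms = -3.83, 0.46, 5.0  # a and d as in the text; b_ms is hypothetical
lats = [true_a + b_ms * f ** (-true_d) for f in freqs]
variances = [1.0, 0.5, 0.3, 0.4]  # hypothetical per-frequency variances
a_hat, d_hat, _ = fit_power_law(freqs, lats, b_ms, weights=[1.0 / v for v in variances])
print(f"a = {a_hat:.2f} ms, d = {d_hat:.2f}")
```

Passing the inverse of the per-frequency variance as the weight corresponds to the weighted refit described above; omitting `weights` gives the unweighted fit.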

To assess the goodness of fit of the nonlinear regression model, a parametric bootstrap procedure was performed (see methods). This showed that the R-squared value of the regression model fitted to the actual data was greater than that observed in 37% of 1,000 simulated repeat experiments, implying that the observed data are a representative outcome of the model. The results indicate that the latency-frequency functions are well fitted by the power-law function described by Eq. 2. The estimate for a represents a frequency-independent delay (see methods); its value is specific to the procedure used in the present study to measure the relative latencies of the derived-band responses and so cannot be meaningfully compared with previous findings for the click ABR wave V. More importantly, however, the estimate for the frequency-dependent parameter d was highly comparable to that of Strelcyk et al. (2009; d = 0.5). This supports the hypothesis that the latency of the periodic portion of the speech ABR follows the same dependence on cochlear response time as the click ABR wave V.
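The logic of a parametric bootstrap goodness-of-fit check can be sketched in Python as follows. The sketch fits the model, simulates repeat data sets from the fitted curve with Gaussian noise at the residual standard deviation, refits each one, and reports the fraction of simulated R-squared values that fall below the observed value. The model form (latency = a + b·f^(−d) with a fixed, hypothetical b), the noise model, the grid search, and all data are assumptions for illustration; the procedure actually used is described in methods.

```python
import random

def fit(freqs, lats, b_ms=5.0):
    """Grid-search least-squares fit of lat = a + b_ms * f**(-d); returns (a, d)."""
    best = None
    for k in range(76):  # d from 0.00 to 1.50 in steps of 0.02
        d = 0.02 * k
        g = [b_ms * f ** (-d) for f in freqs]
        a = sum(y - gi for y, gi in zip(lats, g)) / len(lats)
        sse = sum((y - a - gi) ** 2 for y, gi in zip(lats, g))
        if best is None or sse < best[0]:
            best = (sse, a, d)
    return best[1], best[2]

def predict(freqs, a, d, b_ms=5.0):
    return [a + b_ms * f ** (-d) for f in freqs]

def r_squared(lats, fitted):
    mean = sum(lats) / len(lats)
    ss_res = sum((y - p) ** 2 for y, p in zip(lats, fitted))
    ss_tot = sum((y - mean) ** 2 for y in lats)
    return 1.0 - ss_res / ss_tot

def bootstrap_r2_fraction(freqs, lats, n_sim=1000, seed=0):
    """Fraction of simulated data sets whose refitted R^2 is below the observed R^2."""
    rng = random.Random(seed)
    a, d = fit(freqs, lats)
    fitted = predict(freqs, a, d)
    r2_obs = r_squared(lats, fitted)
    # Residual SD, allowing for the two fitted parameters (a and d).
    sd = (sum((y - p) ** 2 for y, p in zip(lats, fitted)) / (len(lats) - 2)) ** 0.5
    below = 0
    for _ in range(n_sim):
        sim = [p + rng.gauss(0.0, sd) for p in fitted]
        a_s, d_s = fit(freqs, sim)
        if r_squared(sim, predict(freqs, a_s, d_s)) < r2_obs:
            below += 1
    return below / n_sim

# Synthetic example data (hypothetical): 6 simulated listeners at 4 center frequencies.
data_rng = random.Random(42)
freqs = [0.7, 1.4, 2.8, 5.7] * 6
lats = [-3.8 + 5.0 * f ** (-0.5) + data_rng.gauss(0.0, 0.5) for f in freqs]
fraction = bootstrap_r2_fraction(freqs, lats, n_sim=200)
print(f"observed R^2 exceeds {100 * fraction:.0f}% of simulated fits")
```

A fraction close to 0 would suggest that the observed fit is poorer than expected if the model were true, whereas intermediate values indicate that the observed data are a typical outcome of the model.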

Psychophysical estimates of cochlear filter bandwidths do not predict interindividual variation in derived-band speech ABR latency.

In line with previous studies (Glasberg and Moore 1990), psychophysically measured cochlear filter bandwidths broadened with increasing center frequency in all 12 participants. Linear regression fits to the individual data (Fig. 4A) yielded a mean slope of 136.50 (9.94) Hz/kHz and a mean intercept of 19.29 (12.01) Hz. These values are in close agreement with the reported estimates of Glasberg and Moore (1990). To assess the statistical significance of this frequency dependence, the data were submitted to a linear mixed-effects model (see methods). To adjust for the increasing variance with increasing center frequency (Fig. 4B), the bandwidths were log-transformed. The test for fixed effects confirmed a significant effect of frequency [F(3,33) = 282.3, P < 0.0001].
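For reference, the linear relation between bandwidth and center frequency proposed by Glasberg and Moore (1990) for the equivalent rectangular bandwidth (ERB) can be evaluated at the derived-band center frequencies. The Python sketch below assumes that the comparison above refers to this ERB formula, ERB = 24.7(4.37F + 1) with F in kHz; the function name is illustrative.

```python
def erb_hz(center_khz):
    """Equivalent rectangular bandwidth (Hz) after Glasberg and Moore (1990)."""
    return 24.7 * (4.37 * center_khz + 1.0)

for fc in (0.7, 1.4, 2.8, 5.7):
    print(f"{fc} kHz: ERB = {erb_hz(fc):.1f} Hz")

# Written as a straight line, the formula has these coefficients:
slope_hz_per_khz = 24.7 * 4.37  # about 107.9 Hz per kHz
intercept_hz = 24.7             # Hz
```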

Fig. 4.

Relationship between auditory filter bandwidth and relative latency of the periodic portion of the derived-band speech ABR. A: individual filter bandwidths as a function of frequency (open circles). Dotted lines show the associated linear regression fits. B: medians (central marks) and interquartile ranges (box edges) of individual filter bandwidths grouped by center frequency. Crosses show outliers; whiskers show the highest and lowest values not considered outliers. C: relationship between derived-band latencies and filter bandwidths at 0.7 (triangles), 1.4 (circles), 2.8 (squares), and 5.7 kHz (diamonds). Note that bandwidths were interpolated from the fits in A (see methods).

Previous findings for the click ABR wave V have shown that the variance in frequency-specific latencies is at least partly explained by the perceptual auditory filter width measured at the corresponding frequency (Strelcyk et al. 2009). To evaluate whether this was also the case for the speech ABR, correlations between the relative latencies and the perceptual filter widths were calculated separately at each derived-band frequency. The corresponding plots are shown together in Fig. 4C. No significant correlation was found between latency and filter width at any of the separately tested frequencies (indicated by the different symbols in Fig. 4C; 0.7 kHz: r = 0.17, P = 0.61; 1.4 kHz: r = 0.30, P = 0.34; 2.8 kHz: r = −0.34, P = 0.28; 5.7 kHz: r = −0.05, P = 0.75; α level for significance after Bonferroni correction = 0.0125). These data do not support the hypothesis that interindividual variation in derived-band speech ABR latency is explained by auditory filter bandwidth.
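The Bonferroni-corrected threshold quoted above follows from dividing the family-wise α level by the number of tests. A minimal Python sketch, assuming a family-wise α of 0.05 (which is what a corrected level of 0.0125 across four tests implies), applied to the P values reported in the text:

```python
alpha_familywise = 0.05  # assumed family-wise alpha level
p_values = {0.7: 0.61, 1.4: 0.34, 2.8: 0.28, 5.7: 0.75}  # kHz -> P, as reported above

# Bonferroni: divide alpha by the number of comparisons.
alpha_corrected = alpha_familywise / len(p_values)
significant = {fc: p < alpha_corrected for fc, p in p_values.items()}

print(f"corrected alpha = {alpha_corrected}")
for fc, is_sig in significant.items():
    print(f"{fc} kHz: {'significant' if is_sig else 'not significant'}")
```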

Speech ABR amplitude is attenuated at lower derived-band center frequencies.

As shown in Fig. 5A, the predominant contribution to the speech ABR originated from the 2.8-kHz band. Frequency region was confirmed to have a significant effect on response amplitude [F(3, 74) = 20.0, P < 0.0001; one influential outlier removed]. Post hoc analysis revealed that the differences in amplitude between derived-bands were all significant (0.7–1.4 kHz: P = 0.0044; 0.7–2.8 kHz: P < 0.001; 1.4–2.8 kHz: P = 0.007; 1.4–5.7 kHz: P = 0.002; 2.8–5.7 kHz: P < 0.001; Fig. 5A), apart from the 0.7- and 5.7-kHz comparison (P = 0.99). The very low amplitude in the highest frequency band (5.7 kHz) likely resulted from the steep drop-off in stimulus energy above 4 kHz (Fig. 5B, light gray line), which would be expected to cause very little excitation in the 4- to 8-kHz region of the cochlea (Fig. 5B, black line). However, the stimulus contained considerable energy in both lower frequency bands, which might thus be expected to produce a comparable, or greater, cochlear response than the 2.8-kHz band. The smaller derived-band speech ABR amplitudes from the lower frequency bands suggest that, like the click ABR, the speech ABR is biased toward the higher frequency regions of the cochlea.

Fig. 5.

Amplitude of the periodic portion of the derived-band speech ABR as a function of center frequency. A: medians (central marks) and interquartile ranges (box edges) of individual derived-band amplitudes grouped by center frequency. Crosses show outliers (see methods); whiskers show the highest and lowest values not considered outliers. B: cochlear excitation (black line) evoked by the [da] spectrum (light gray line) as a function of frequency. Filled circles show the expected summed excitation in each of the derived bands, delimited by vertical dashed lines.

DISCUSSION

The findings of the present study show that both the amplitude and the latency of the periodic portion of the speech ABR are strongly dependent on the frequency region of origin in the cochlea. The latency of the response was found to increase by 3.6 ms as the cochlear place of origin moved from regions of high (5.7 kHz) to low (0.7 kHz) frequency. The amplitude of the response was maximal in the 2.8-kHz frequency region, whereas responses from lower frequency regions were attenuated relative to their representation in the stimulus. These results are in line with previous findings on the click ABR wave V (Burkard and Hecox 1983; Don et al. 1977; Strelcyk et al. 2009). In particular, the variation of latency with frequency was well-fitted by the power-law function derived from click ABR wave V data (Strelcyk et al. 2009). The population mean estimate for the parameter d, which determines the shape of the frequency dependence, was highly similar to that obtained for the click ABR wave V (d = 0.46 in this study vs. 0.5 in Strelcyk et al. 2009). The amplitude attenuation toward lower frequency regions was also in line with findings from previous studies focusing on the click ABR wave V, where high-pass noise masking was also used to obtain octave-wide derived-band responses (Burkard and Hecox 1983; Don et al. 1994, 1998).

The frequency dependence of the click ABR wave V is thought to result mainly from the narrowing of the auditory filters from higher to lower frequency regions in the cochlea, which causes an increase in the cochlear response time (Don et al. 1998). The resulting frequency-dependent response delay in the cochlea is preserved at the level of the neural generators of the click ABR wave V, where it is reflected in the peak latency of the response. The increasingly rapid variations in cochlear response time toward lower frequencies are assumed to give rise to phase cancellations (Dau 2003), resulting in a relative attenuation of contributions from these regions. The frequency dependence observed in the present study for the periodic portion of the speech ABR can reasonably be assumed to arise similarly from the narrowing of cochlear filter widths from base to apex. In addition to increasing cochlear response time, narrowing auditory filters also results in a reduced ability to follow amplitude modulation, which declines steeply when the modulation frequency exceeds the auditory filter width (Joris and Yin 1992). For the periodic portion of the speech ABR, this would have reduced the modulation depth of the cochlear response to the envelope of the speech stimulus in the 0.7-kHz band, and to a lesser degree in the 1.4-kHz band. It is likely that both reduced modulation depth and phase cancellations contributed to the attenuation of the amplitude of the response in these frequency bands. One way to evaluate the relative contributions of phase cancellation and reduced modulation depth to the low-frequency attenuation could be to reduce the width of the derived bands to half-octaves. When half-octave-wide bands were used to study click ABR wave V, derived-band amplitudes at lower center frequencies were not found to be attenuated (Don and Eggermont 1978). This difference from other studies on the derived-band click ABR wave V (Burkard and Hecox 1983; Don et al. 1994, 1998) may be explained by the more restricted range of frequencies in the half-octave derived-bands, which would have limited the degree of phase cancellation in the responses. If the derived-band speech ABRs showed a similar dependence on derived-band width, this would indicate a contribution of phase cancellations; no such dependence would be expected to arise if the attenuation of the lower frequency derived bands resulted purely from decreased modulation depth.
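The phase-cancellation argument can be illustrated numerically. In the Python sketch below, the contributions within a band are idealized as unit-amplitude sinusoids at a single modulation frequency that differ only in their delay; the amplitude of their average falls as the spread of delays grows. The modulation frequency (100 Hz), the uniform distribution of delays, and the delay spreads are hypothetical values chosen for illustration, not estimates from the data.

```python
import cmath
import math

def summed_amplitude(mod_freq_hz, delays_ms):
    """Amplitude of the average of unit sinusoids at mod_freq_hz with the given delays."""
    phasors = [cmath.exp(-2j * math.pi * mod_freq_hz * tau / 1000.0) for tau in delays_ms]
    return abs(sum(phasors) / len(phasors))

def uniform_delays(spread_ms, n=101):
    """n delays spaced evenly from 0 to spread_ms."""
    return [spread_ms * k / (n - 1) for k in range(n)]

for spread in (0.0, 1.0, 5.0):
    amp = summed_amplitude(100.0, uniform_delays(spread))
    print(f"delay spread {spread:.0f} ms: relative amplitude = {amp:.2f}")
```

Under these assumptions, narrowing the band (for example, from an octave to a half-octave) reduces the delay spread and hence the cancellation, which is the reasoning behind the half-octave comparison suggested above.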

One aspect of the present findings that did not agree with expectations based on click ABR wave V studies was the nonsignificant relationship between derived-band speech ABR latency and perceptual auditory filter widths. Strelcyk et al. (2009) reported a significant correlation between these measures for the click ABR wave V at a center frequency of 2 kHz when both normally hearing and hearing-impaired listeners were included. No correlation was reported in the normally hearing group alone, but the sample size was low (n = 5). The present study tested this relationship in a larger sample of normally hearing participants and at multiple frequencies but still found no correlation at any frequency tested. This may be due to the limited interindividual variability in auditory filter bandwidth in the absence of a hearing loss when measured using the notched-noise method (Sommers and Humes 1993). It has been proposed that this method may not provide the most accurate estimate of auditory filter bandwidth (Moore and Glasberg 1981; Oxenham and Shera 2003). It also may be the case that the relative latency estimates for the individual speech ABR derived bands were limited by an inadequate signal-to-noise ratio. Future investigations may need to include hearing-impaired participants and use an alternative method to measure auditory filter bandwidth (Shera et al. 2002). Additional presentations of the stimulus in the acquisition of the response also may be required to improve the signal-to-noise ratio of the derived-band speech ABR.

Scalp-recorded brain stem potentials evoked by simple stimuli, such as clicks or tone bursts, are assumed to represent a linear summation of neural activity across frequency channels (Dau 2003; Goldstein and Kiang 1958). This assumption is supported by computational models that have successfully simulated key properties of these responses (Dau 2003; Rønne et al. 2013). These models incorporate physiological models of cochlear processing and have demonstrated the importance of cochlear frequency dispersion in the formation of the summated scalp-recorded responses. This has been further corroborated by experimental manipulations that have elicited enhanced ABR wave V amplitudes by compensating for cochlear response delays (Dau et al. 2000; Don et al. 1994; Elberling and Don 2008). It is reasonable to expect that the formation of the scalp-recorded speech ABR involves a similar summation across frequencies. A consequence of this summation would be that the amplitude and latency of the overall response could be altered by any cochlear changes that affect the relative weighting and/or delay between contributions from different frequency regions in the cochlea. This could arise, for example, from a loss of higher frequency sensitivity in the cochlea, known to be particularly vulnerable to noise and age-related hearing loss, or from broadening of cochlear filters. In addition, selective loss of high-threshold auditory nerve fibers, which can lead to “hidden hearing loss” (Furman et al. 2013; Kujawa and Liberman 2009), could, if frequency specific, lead to changes in both amplitude and latency of the response. Such peripheral factors thus could constitute a potential confound for the use of speech ABRs in diagnosing central temporal processing deficits. This might be expected to be the case when there is any degree of hearing loss but could even arise in the presence of clinically normal hearing thresholds.

The high-pass masking paradigm used in the present study can provide a useful tool for detecting and/or controlling for these potentially confounding peripheral influences on the speech ABR. The paradigm also could be useful for examining the frequency-specificity of variations in speech ABR characteristics that have been observed in certain clinical populations (Anderson et al. 2010b, 2012; Hornickel et al. 2009, 2013; Song et al. 2011). Such information could help to elucidate the underlying mechanisms that link the speech ABR to indexes of speech-in-noise perception and speech- and language-related deficits in these populations. The present study demonstrates that this technique can be successfully applied to obtain frequency-specific speech ABRs. The main drawback of this paradigm is the long recording times required to obtain responses with adequate signal-to-noise ratio. An alternative method for obtaining frequency-specific responses is to use notched-noise masking (Picton et al. 1979; Terkildsen et al. 1975). This method also uses intense noise masking but utilizes a broadband masker with a spectral gap, or notch, covering the frequency range of interest. This paradigm in principle involves a shorter recording time, because the frequency-specific response is obtained from one recording, rather than from the subtraction of two separate recordings, as in the high-pass masking paradigm. However, a problem that arises with notched-noise masking is the phenomenon of upward spread of excitation, whereby at moderate-to-high sound levels, low-frequency sounds will produce substantial excitation in more basal (high frequency) regions of the cochlea. Noise frequencies below the lower edge of the notch would be expected to spread into the spectral gap and might partially mask the frequency region of interest; this could lead to reduced amplitudes in the frequency-specific responses obtained with the use of this method (Wegner and Dau 2002). Nevertheless, it may be useful for future studies to investigate the relative advantages of the notched-noise versus high-pass noise-masking paradigm for recording frequency-specific speech ABRs.
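The subtraction step of the high-pass masking paradigm, and one reason it is costly in recording time, can be sketched in Python. A derived-band response is obtained by subtracting the response recorded with the lower masker cutoff from the response recorded with the higher cutoff. If the residual noise in the two recordings is independent and of equal variance, the noise standard deviation in the difference is larger by a factor of about √2. The noise model and all values below are assumptions for illustration.

```python
import random
import statistics

def derived_band(resp_higher_cutoff, resp_lower_cutoff):
    """Sample-by-sample difference between two high-pass-masked recordings."""
    return [hi - lo for hi, lo in zip(resp_higher_cutoff, resp_lower_cutoff)]

# Simulated residual noise in two independent recordings (unit SD, hypothetical).
rng = random.Random(0)
n_samples = 20000
noise_a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
noise_b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

noise_sd_after_subtraction = statistics.pstdev(derived_band(noise_a, noise_b))
print(f"noise SD after subtraction = {noise_sd_after_subtraction:.2f} (single recording: 1.00)")
```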

In conclusion, ABRs evoked by complex sounds, including speech, provide an important tool to investigate the mechanisms of speech encoding in humans, and thus to identify the neural basis of the widespread problems in understanding speech in challenging conditions. The results of the present study highlight the importance of considering the effect of cochlear processing on the amplitude and latency of the speech ABR and of using methods, such as the high-pass masking technique used in this work, that provide more rigorous means to test mechanistic hypotheses that relate the speech ABR to central temporal encoding.

GRANTS

This work was supported by the Medical Research Council.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

H.E.N. and J.d.B. conception and design of research; H.E.N. performed experiments; H.E.N. and J.d.B. analyzed data; H.E.N. and J.d.B. interpreted results of experiments; H.E.N. and J.d.B. prepared figures; H.E.N. and J.d.B. drafted manuscript; H.E.N., D.R.M., J.G.B., K.K., and J.d.B. edited and revised manuscript; H.E.N., D.R.M., J.G.B., K.K., and J.d.B. approved final version of manuscript.

ACKNOWLEDGMENTS

We thank Oliver Zobay for help with data analysis, Antje Heinrich for help with the behavioral task, and Nina Kraus for providing the speech ABR stimulus. We especially thank the individuals who participated in the study.

REFERENCES

  1. Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N. Aging affects neural precision of speech encoding. J Neurosci 32: 14156–14164, 2012.
  2. Anderson S, Parbery-Clark A, Yi HG, Kraus N. A neural basis of speech-in-noise perception in older adults. Ear Hear 32: 750–757, 2011.
  3. Anderson S, Skoe E, Chandrasekaran B, Kraus N. Neural timing is linked to speech perception in noise. J Neurosci 30: 4922–4926, 2010a.
  4. Anderson S, Skoe E, Chandrasekaran B, Zecker S, Kraus N. Brainstem correlates of speech-in-noise perception in children. Hear Res 270: 151–157, 2010b.
  5. Backus BC, Guinan JJ. Measurement of the distribution of medial olivocochlear acoustic reflex strengths across normal-hearing individuals via otoacoustic emissions. J Assoc Res Otolaryngol 8: 484–496, 2007.
  6. Bates D, Maechler M, Bolker B, Walker S. lme4: Linear mixed-effects models using Eigen and S4 (Online) The R Foundation for Statistical Computing, Vienna, Austria: http://lme4.r-forge.r-project.org [2013].
  7. Boets B, Wouters J, van Wieringen A, Ghesquière P. Auditory processing, speech perception and phonological ability in pre-school children at high-risk for dyslexia: a longitudinal study of the auditory temporal processing theory. Neuropsychologia 45: 1608–1620, 2007.
  8. British Society of Audiology. Recommended Procedure: Pure tone air and bone conduction threshold audiometry with and without masking and determination of uncomfortable loudness levels. Reading, UK: British Society of Audiology, 2004.
  9. Burkard R, Hecox K. The effect of broadband noise on the human brainstem auditory evoked response. II. Frequency specificity. J Acoust Soc Am 74: 1214–1223, 1983.
  10. Cooper NP, Guinan JJ. Efferent-mediated control of basilar membrane motion. J Physiol 576: 49–54, 2006.
  11. Chandrasekaran B, Kraus N. The scalp-recorded brainstem response to speech: neural origins and plasticity. Psychophysiology 47: 236–246, 2010.
  12. Dau T. The importance of cochlear processing for the formation of auditory brainstem and frequency following responses. J Acoust Soc Am 113: 936–950, 2003.
  13. Dau T, Wegner O, Mellert V, Kollmeier B. Auditory brainstem responses with optimized chirp signals compensating basilar-membrane dispersion. J Acoust Soc Am 107: 1530–1540, 2000.
  14. Don M, Allen AR, Starr A. Effect of click rate on the latency of auditory brain stem responses in humans. Ann Otol Rhinol Laryngol 86: 186–195, 1977.
  15. Don M, Eggermont JJ. Analysis of the click-evoked brainstem potentials in man using high-pass noise masking. J Acoust Soc Am 63: 1084–1092, 1978.
  16. Don M, Ponton CW, Eggermont JJ, Kwong B. The effects of sensory hearing loss on cochlear filter times estimated from auditory brainstem response latencies. J Acoust Soc Am 104: 2280–2289, 1998.
  17. Don M, Ponton CW, Eggermont JJ, Masuda A. Auditory brainstem response (ABR) peak amplitude variability reflects individual differences in cochlear response times. J Acoust Soc Am 96: 3476–3491, 1994.
  18. Dubno JR, Horwitz AR, Ahlstrom JB. Estimates of basilar-membrane nonlinearity effects on masking of tones and speech. Ear Hear 28: 2–17, 2007.
  19. Elberling C, Don M. Auditory brainstem responses to a chirp stimulus designed from derived-band latencies in normal-hearing subjects. J Acoust Soc Am 124: 3022–3037, 2008.
  20. Fox J, Weisberg HS. An R Companion to Applied Regression (2nd ed). Thousand Oaks, CA: Sage, 2011.
  21. Furman AC, Kujawa SG, Liberman MC. Noise-induced cochlear neuropathy is selective for fibers with low spontaneous rates. J Neurophysiol 110: 577–586, 2013.
  22. Glasberg BR, Moore BC. Derivation of auditory filter shapes from notched-noise data. Hear Res 47: 103–138, 1990.
  23. Goldstein MH, Kiang NY. Synchrony of neural activity in electric responses evoked by transient acoustic stimuli. J Acoust Soc Am 30: 107–114, 1958.
  24. Hornickel J, Chandrasekaran B, Zecker S, Kraus N. Auditory brainstem measures predict reading and speech-in-noise perception in school-aged children. Behav Brain Res 216: 597–605, 2011.
  25. Hornickel J, Kraus N. Unstable representation of sound: a biological marker of dyslexia. J Neurosci 33: 3500–3504, 2013.
  26. Hornickel J, Skoe E, Nicol T, Zecker S, Kraus N. Subcortical differentiation of stop consonants relates to reading and speech-in-noise perception. Proc Natl Acad Sci USA 106: 13022–13027, 2009.
  27. Joris PX, Yin TC. Responses to amplitude-modulated tones in the auditory nerve of the cat. J Acoust Soc Am 91: 215–232, 1992.
  28. Kujawa SG, Liberman MC. Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J Neurosci 29: 14077–14085, 2009.
  29. Lenth RV, Herve M. lsmeans: Least-Squares Means (Online) The R Foundation for Statistical Computing, Vienna, Austria: http://CRAN.R-project.org/package=lsmeans [R package version 2.13; 2014].
  30. Melcher JR, Kiang NY. Generators of the brainstem auditory evoked potential in cat. III: Identified cell populations. Hear Res 93: 52–71, 1996.
  31. Møller AR, Jannetta PJ. Interpretation of brainstem auditory evoked potentials: results from intracranial recordings in humans. Scand Audiol 12: 125–133, 1983.
  32. Moore BC, Glasberg BR. Auditory filter shapes derived in simultaneous and forward masking. J Acoust Soc Am 70: 1003–1014, 1981.
  33. Nieuwenhuis R, te Grotenhuis M, Pelzer B. Influence ME: tools for detecting influential data in mixed effects models. R Journal 4: 38–47, 2012.
  34. Oxenham AJ, Shera CA. Estimates of human cochlear tuning at low levels using forward and simultaneous masking. J Assoc Res Otolaryngol 4: 541–554, 2003.
  35. Patterson RD. Auditory filter shape. J Acoust Soc Am 55: 802–809, 1974.
  36. Patterson RD. Auditory filter shapes derived with noise stimuli. J Acoust Soc Am 59: 640–654, 1976.
  37. Pfeiffer RR. Classification of response patterns of spike discharges for units in the cochlear neurons: tone-burst stimulation. Exp Brain Res 1: 220–235, 1966.
  38. Pichora-Fuller MK, Souza PE. Effects of aging on auditory processing of speech. Int J Audiol 42: 2S11–2S16, 2003.
  39. Picton TW, Ouellette J, Hamel G, Durieux-Smith AD. Brainstem evoked potentials to tonepips in notched noise. J Otolaryngol 8: 289–314, 1979.
  40. Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team. nlme: Linear and Nonlinear Mixed Effects Models (Online) The R Foundation for Statistical Computing, Vienna, Austria: http://CRAN.R-project.org/package=nlme [R package version 3.1–115; 2013].
  41. R Core Team. R: A Language and Environment for Statistical Computing (Online) The R Foundation for Statistical Computing, Vienna, Austria: http://www.R-project.org [2013].
  42. Rønne FM, Elberling C, Harte J, Dau T. Modeling Auditory Evoked Potentials to Complex Stimuli (PhD thesis) Kongens Lyngby, Denmark: Technical University of Denmark, 2013.
  43. Ruggles D, Bharadwaj H, Shinn-Cunningham BD. Why middle aged listeners have trouble hearing in everyday settings. Curr Biol 22: 1417–1422, 2012.
  44. Shera CA, Guinan JJ Jr, Oxenham AJ. Revised estimates of human cochlear tuning from otoacoustic and behavioral measurements. Proc Natl Acad Sci USA 99: 3318–3323, 2002.
  45. Sommers MS, Gehr SE. Two-tone auditory suppression in younger and older normal-hearing adults and its relationship to speech perception in noise. Hear Res 264: 56–62, 2010.
  46. Sommers MS, Humes LE. Auditory filter shapes in normal-hearing, noise-masked normal, and elderly listeners. J Acoust Soc Am 93: 2903–2914, 1993.
  47. Song JH, Skoe E, Banai K, Kraus N. Perception of speech in noise: neural correlates. J Cogn Neurosci 23: 2268–2279, 2011.
  48. Strelcyk O, Christoforidis D, Dau T. Relation between derived-band auditory brainstem response latencies and behavioral frequency selectivity. J Acoust Soc Am 126: 1878–1888, 2009.
  49. Stute W, Gonzales Manteiga W, Presedo Quindimil M. Bootstrap based goodness-of-fit tests. Metrika 40: 243–256, 1993.
  50. Terkildsen K, Osterhammel P, Huis in't Velt F. Farfield electrocochleography. Frequency selectivity of the response. Scand Audiol 4: 167–172, 1975.
  51. Wegner O, Dau T. Frequency specificity of chirp-evoked auditory brainstem responses. J Acoust Soc Am 111: 1318–1329, 2002.

Articles from Journal of Neurophysiology are provided here courtesy of American Physiological Society
