Abstract
Objective
The speech-evoked auditory brainstem response (ABR) provides an objective measure of subcortical encoding of complex acoustic features. However, the intrasubject reliability of this response in both optimal and challenging listening conditions has not yet been systematically documented. This study aimed to evaluate test-retest reliability of the speech-evoked ABR in young adults.
Methods
In each of two sessions, ABRs were obtained with: 1) a 170 ms /da/ syllable presented in quiet and in 2-talker and 6-talker babble background noise, and 2) a 40 ms /da/ syllable presented in quiet. Test-retest reliability of the responses was analyzed in the frequency and time domains.
Results
The speech-evoked ABR does not vary significantly across sessions within individuals on measures of temporal encoding (i.e., peak latencies, stimulus-to-response and response-to-response measures), frequency representation and response magnitude.
Conclusions
The subcortical auditory pathway produces a response to a complex sound that is stable and replicable from session to session.
Significance
Demonstrating the high degree of replicability in optimal and challenging listening conditions may broaden the applicability of the speech-evoked ABR to examining a range of auditory processing abilities in clinical and research settings.
Keywords: ABR test-retest reliability, speech-evoked ABR, speech-in-noise, brainstem encoding, BioMARK
Introduction
For decades, the highly predictable nature of the click-evoked auditory brainstem response (ABR), demonstrated by its astounding degree of both temporal precision and test-retest reliability, has made it a widely used clinical measure to assess the integrity of the subcortical pathway (Jacobson, 1985; Cone-Wesson et al., 1987; Hall, 1992; Sininger, 1993; Sininger et al., 1997; Hood, 1998; Cone-Wesson et al., 2000; Norton et al., 2000). It is a far-field response that reflects stimulus-locked, synchronous neural firing from nuclei along the brainstem (Møller and Jannetta, 1985; Chandrasekaran and Kraus, 2010). The intrasubject variability of absolute latencies of the click-evoked ABR from one test to another is small (Edwards et al., 1982; Oyler et al., 1991). Thus, temporal precision of the response is such that even subtle divergences in peak latencies on the order of fractions of milliseconds are clinically significant.
More recently, the ABR has been used to evaluate underlying central processes involved in encoding more complex signals presented in diverse listening contexts. Among these are syllables that have been temporally and spectrally enhanced (Cunningham et al., 2002), synthesized with Mandarin tones (Krishnan et al., 2004; Song et al., 2008; Swaminathan et al., 2008; Krishnan et al., 2009) or presented in background noise (Russo et al., 2005; Parbery-Clark et al., 2009b; Anderson et al., 2010; Song et al., submitted) as well as words (Galbraith et al., 1995; Galbraith et al., 1997; Galbraith et al., 2004), musical notes and chords (Musacchia et al., 2007; Lee et al., 2009) and emotionally-valent vocal sounds (Strait et al., 2009). The brainstem response to a consonant-vowel syllable, such as /da/, is composed of two regions: the onset response that signals the beginning of the sound and the frequency following response (FFR) that corresponds to the periodicity of the vowel (Boston and Møller, 1985; Akhoun et al., 2008; Johnson et al., 2008a; Hornickel et al., 2009b). Although the morphology of the onset response of the speech-evoked ABR is similar to that elicited by clicks (Song et al., 2006), the FFR does not appear in the transient click-evoked response. The FFR faithfully reflects the encoding of the fundamental frequency and harmonic structure of the speech stimulus (Moushegian et al., 1973). Thus, an advantage of the speech-evoked ABR is that it reflects both transient and sustained portions of the stimulus which can be objectively assessed at the level of the brainstem.
Investigating the brainstem encoding of more complex sounds is important as responses to these stimuli can uncover auditory processing deficits (Song et al., 2006; Banai et al., 2009) and experience-dependent enhancement (Musacchia et al., 2007; Song et al., 2008; Krishnan et al., 2009; Lee et al., 2009; Parbery-Clark et al., 2009a; Strait et al., 2009) of subcortical encoding that clicks alone cannot. Click and speech stimuli entail different subcortical processing demands such that speech stimuli impose larger desynchronizing influences on neural phase-locking. Specifically, unlike the click (a nonperiodic, relatively simple sound that is very brief in duration and contains a broad range of frequencies), consonant-vowel speech syllables, such as the /da/ used in this study, begin with rapid, relatively low amplitude transient onset features that may be particularly vulnerable to disruption by background noise (Brandt and Rosen, 1980). The vowel that follows, a sustained periodic acoustic signal that is more intense than the consonant, may mask the brief consonant onset crucial for eliciting the onset portion of the speech-evoked ABR. Susceptibility to neural phase-locking desynchronization due to the acoustics of the signal may in turn impart greater challenges in replicating the response with good reliability from one session to another. This may be especially apparent in suboptimal listening conditions (e.g., background noise) where degradation and variability of the response occur naturally. Establishing the degree of intrasubject response reliability of speech-evoked brainstem responses recorded in quiet and noisy background conditions will enhance clinical and research utility.
While studies that investigated the impact of auditory training programs on subcortical responses have corroborated the reliability of the speech-evoked ABR recorded over multiple sessions, a thorough examination of the stability of the multiple components of these responses was not performed. Russo and colleagues (2005) found training-related improvements in the precision of brainstem encoding in noise in children, whereas control subjects showed stable test-retest reliability on most measures of brainstem activity. Also in the context of a training paradigm, Song and colleagues (2008) recorded, over multiple sessions in young adults, brainstem responses elicited by a speech sound that was synthesized with different Mandarin tones. They examined how accurately the response encoded the fundamental frequency (F0) trajectory of the stimuli. Responses elicited by the control stimuli did not change in F0 encoding from pre- to post-training test sessions. That study did not include test-retest analysis of harmonic encoding or measures of timing. By performing a wider scope of analysis in the temporal and frequency domains of the speech-evoked ABR, the current study aims to provide a broader understanding of the stability and highly predictable nature of the brainstem response in encoding complex sounds.
To determine whether test-retest reliability is a factor when assessing response timing and morphology, we first need to describe the nature and range of the response among normal subjects. This delineation also benefits intrasubject clinical applications of the speech-evoked ABR (e.g., neural index of change following auditory training). The purpose of the present study was to evaluate test-retest reliability of various speech-evoked ABR measures observed in normal hearing young adults. Specifically, we hypothesized that the neural system of a young adult produces a response to a complex sound that is stable and replicable from session to session.
Methods
Participants
Study 1
Thirty-one young adults (20 females, ages 19 to 31 years, mean age = 23.5, SD = 3 years) were tested with the 170 ms /da/ in quiet and background noise. The mean time between Test 1 and Test 2 was 58 (±33) days.
Study 2
Forty-five young adults (29 females, ages 19 to 36 years, mean age = 24.5, SD = 4 years) were tested with the 40 ms /da/. The mean time between Test 1 and Test 2 was 41 (±34) days.
All participants had normal IQ (≥90) as measured by the Wechsler Abbreviated Scale of Intelligence (WASI) (Wechsler, 1999) or the Test of Nonverbal Intelligence-3 (TONI-3) (Brown et al., 1997), normal hearing (≤20 dB HL pure-tone thresholds from 125 to 8000 Hz), normal click-evoked auditory response wave V latencies to 100 μs clicks presented at 31.1 Hz at 80.3 dB SPL, and no history of neurological disease. All participants gave their informed consent in accordance with the Northwestern University Institutional Review Board regulations.
Neurophysiologic stimuli and recording parameters
170 ms /da/, quiet and background noise
Brainstem responses were elicited in response to the syllable /da/ in quiet and two noise conditions in two test sessions separated by a ~2 month interval. /Da/ is a six-formant syllable synthesized at a 20 kHz sampling rate using a Klatt synthesizer (Klatt, 1980). The duration was 170 ms with voicing (100 Hz fundamental frequency) onset at 10 ms. Formant transition duration was 50 ms and comprised a linearly rising F1 (400 to 720 Hz), linearly falling F2 and F3 (1700 to 1240 Hz and 2580 to 2500 Hz, respectively) and flat F4 (3300 Hz), F5 (3750 Hz) and F6 (4900 Hz). After the transition period, these formant frequencies remained constant at 720, 1240, 2500, 3300, 3750, and 4900 Hz for the remainder of the syllable. The stop burst consisted of ten milliseconds of initial frication centered at frequencies around F4 and F5. The syllable /da/ was presented in alternating polarities via a magnetically shielded insert earphone placed in the right ear (ER-3, Etymotic Research, Elk Grove Village, IL) at 80.3 dB SPL at a rate of 4.35 Hz.
The noise conditions consisted of multi-speaker babble spoken in English. Two-talker (1 female and 1 male, 20 second track) and 6-talker (3 females and 3 males, 4 second track) babble were selected because they provide different levels of energetic masking. To create the babble, the speakers were instructed to speak in a natural, conversational style. Recordings of nonsense syllables were made in a sound-attenuated booth in the phonetics laboratory of the Department of Linguistics at Northwestern University for unrelated research (Smiljanic and Bradlow, 2005) and were digitized at a sampling rate of 16 kHz with 24 bit accuracy (for further detail, see Smiljanic and Bradlow, 2005; Van Engen and Bradlow, 2007). The tracks were RMS amplitude normalized using Level 16 software (Tice and Carrell, 1998). The 2-talker babble and 6-talker babble tracks were looped for the duration of data collection (approximately 25 minutes per condition) with no silent intervals. This presentation paradigm allowed the noise to occur at a randomized phase with respect to the target speech sound. Thus, responses that were time-locked to the target sound could be averaged without the confound of phase-coherent responses to the background noise.
In each of the quiet and two noise conditions, 6300 sweeps of /da/ were collected using Scan 4.3 Acquire (Compumedics, Charlotte, NC) in continuous mode at a sampling rate of 20 kHz. The continuous recordings were filtered, artifact rejected (±35 μV), and averaged off-line using Scan 4.3. Responses were band-pass filtered from 70 to 1000 Hz, 12 dB/octave. Waveforms were averaged with a time window spanning 40 ms prior to the onset and 16.5 ms after the offset of the stimulus and baseline corrected over the pre-stimulus interval (−40 to 0 ms, with 0 corresponding to the stimulus onset). Responses of alternating polarity were added to isolate the neural response by minimizing stimulus artifact and cochlear microphonic (Gorga et al., 1985). The final average response consisted of the first 6000 artifact free responses.
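The off-line steps described above (artifact rejection, averaging, combining alternating polarities, and prestimulus baseline correction) can be sketched as follows. This is a minimal Python illustration, not the Scan 4.3 procedure itself. The function name `average_alternating`, the input layout (one list of microvolt samples per sweep plus a +1/-1 polarity label), and the choice to take the mean of the two polarity sub-averages rather than their sum are assumptions made for the example. Filtering is omitted.

```python
from statistics import fmean


def average_alternating(sweeps, polarities, n_pre, reject_uv=35.0, max_sweeps=6000):
    """Average epochs recorded with alternating stimulus polarity.

    sweeps:     list of epochs, each a list of samples in microvolts
    polarities: list of +1 / -1 labels, one per epoch (assumed input layout)
    n_pre:      number of prestimulus samples at the start of each epoch
    """
    kept = {1: [], -1: []}
    n_kept = 0
    for epoch, pol in zip(sweeps, polarities):
        if n_kept >= max_sweeps:
            break  # use only the first max_sweeps artifact-free sweeps
        if max(abs(v) for v in epoch) > reject_uv:
            continue  # artifact rejection
        kept[pol].append(epoch)
        n_kept += 1

    sub_averages = []
    for pol in (1, -1):
        if not kept[pol]:
            raise ValueError("no artifact-free sweeps for polarity %d" % pol)
        # zip(*epochs) walks the epochs sample by sample
        sub_averages.append([fmean(column) for column in zip(*kept[pol])])

    # Combining the two polarities cancels components that invert with the
    # stimulus (stimulus artifact, cochlear microphonic). Taking the mean
    # instead of the sum is an assumption about scaling.
    combined = [(a + b) / 2.0 for a, b in zip(*sub_averages)]

    # Baseline correction over the prestimulus interval.
    baseline = fmean(combined[:n_pre])
    return [v - baseline for v in combined]
```

In this sketch a component that flips sign with stimulus polarity averages out, while a component that keeps its sign in both polarities survives.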
40 ms/da/
The speech syllable is a 40 ms /da/ syllable synthesized at a sampling rate of 10 kHz using a Klatt synthesizer (Klatt, 1980). It contains a release burst and voiced formant transition with a fundamental frequency (F0) that linearly rises from 103 to 125 Hz with voicing beginning at 5 ms and an onset release burst during the first 10 ms. Although the stimulus does not contain a steady-state vowel, it is psychophysically perceived as a consonant-vowel speech syllable. By eliciting brainstem responses using a relatively short plosive-initial syllable /da/, BioMARK reflects the encoding of rapid acoustic offsets and onsets important for consonant identification.
Stimulus and recording parameters followed BioMARK protocol (Bio-logic, 2005), consistent with other studies (Johnson et al., 2005; Banai et al., 2009; Dhar et al., 2009; Hornickel et al., 2009a; Krizman et al., 2010). As with the longer stimulus, responses were obtained to alternating polarity stimuli and then added together to extract the neural response from the cochlear microphonic and to eliminate stimulus artifact (Gorga et al., 1985). The /da/ was presented at a rate of 10.9 Hz, and two blocks of 3000 responses to each polarity were collected and averaged using a 74.67 ms time window (−15.8 ms pre-stimulus). Responses were sampled at 6857 Hz and bandpass filtered on-line from 100 to 2000 Hz, using a 12 dB/octave filter roll-off. Trials with artifact exceeding ±23 μV were excluded from the average. The 40 ms /da/ was presented via a magnetically shielded Bio-logic Systems insert earphone placed in the right ear at 80.3 dB SPL.
All responses were differentially recorded from Cz (active) to right earlobe (reference), with forehead as ground. During testing, the participants watched a captioned video of their choice with the sound level set at <40 dB SPL to facilitate a passive yet wakeful state.
Neurophysiologic analysis procedures
Speech-evoked brainstem responses were examined on a variety of measures in both time and frequency domains. Test-retest stability of the physiological measures was defined as no change from Test 1 to Test 2 using a repeated measures analysis of variance with test session as the within-subject factor. Because the time between Test 1 and Test 2 varied across participants, the number of days between the recording sessions was used as a covariate in the analysis. Results of post-hoc pairwise t-tests were adjusted for multiple comparisons using a Bonferroni correction.
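As a minimal sketch of the Bonferroni adjustment mentioned above, each p-value in a family of m comparisons can be multiplied by m and capped at 1. The function names and the example p-values are hypothetical. The text does not state which comparisons formed each family, so this shows the arithmetic only.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which comparisons remain significant after adjustment."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]


# Hypothetical family of three pairwise comparisons.
example_p = [0.01, 0.04, 0.5]
adjusted = bonferroni_adjust(example_p)
flags = significant_after_bonferroni(example_p)
```

With three comparisons, an uncorrected p of 0.04 no longer falls below 0.05 after adjustment, whereas an uncorrected p of 0.01 still does.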
Time domain analyses
Overall response magnitude
Overall response magnitude was assessed with the root-mean-square (RMS) amplitude and the signal-to-noise ratio (SNR) of the response.
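One common way to compute these two quantities is sketched below. The definition of SNR as the RMS of the response window divided by the RMS of the prestimulus window is an assumption, since the text does not give the formula. The function names and the convention that the waveform begins `pre_ms` before stimulus onset are also assumptions.

```python
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def response_snr(waveform, fs_hz, pre_ms, start_ms, end_ms):
    """Ratio of response RMS to prestimulus RMS (assumed SNR definition).

    waveform starts pre_ms before stimulus onset; start_ms and end_ms give
    the response window relative to stimulus onset (end is exclusive).
    """
    def index(t_ms):
        return int(round((t_ms + pre_ms) * fs_hz / 1000.0))

    prestimulus = waveform[:index(0.0)]
    response = waveform[index(start_ms):index(end_ms)]
    return rms_amplitude(response) / rms_amplitude(prestimulus)


# Toy waveform: 4 ms of baseline at +/-1, then 6 ms of response at +/-2.
toy = [1, -1, 1, -1, 2, -2, 2, -2, 2, -2]
toy_snr = response_snr(toy, fs_hz=1000, pre_ms=4, start_ms=0, end_ms=6)
```

Under this definition a value of 1 means the response window carries no more energy than the prestimulus baseline.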
Inter-response and stimulus-to-response correlations
The effects of background noise on the timing of the response were quantified by performing inter-response (i.e., quiet vs. noise) correlations. This analysis was performed over the region of the response that included both the onset peaks and the FFR (5–180 ms). To assess how well the FFR represented the periodicity of the vowel, in which the frequency of the F0 was constant, stimulus-to-response correlations were performed on the FFR (170 ms /da/: 50–180 ms, 40 ms /da/: 11–40 ms). Because Pearson’s r-values were not normally distributed, stimulus-to-response and inter-response correlation measures were converted to Fisher’s z’-scores prior to subsequent parametric statistical analyses. For more detail on auditory brainstem responses to complex sounds, see Skoe and Kraus (2010).
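The correlation measures described above (the lag that maximizes Pearson’s r, and the Fisher transform of that r) can be illustrated with a short Python sketch. The function names, the restriction to non-negative lags, and the clamping of r before the transform are assumptions for the example. The windowing used in the study (e.g., 50–180 ms) is left to the caller.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat segment has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, eps=1e-12):
    """Fisher transform; r is clamped because atanh is undefined at +/-1."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)


def best_lag(reference, response, fs_hz, max_lag_ms):
    """Delay the response by 0..max_lag_ms and keep the lag with highest r.

    Returns (lag in ms, r at that lag, Fisher z of that r).
    """
    max_lag = int(round(max_lag_ms * fs_hz / 1000.0))
    best_samples, best_r = None, -2.0
    for lag in range(max_lag + 1):
        n = min(len(reference), len(response) - lag)
        if n < 3:
            break  # too few overlapping samples
        r = pearson_r(reference[:n], response[lag:lag + n])
        if r > best_r:
            best_samples, best_r = lag, r
    if best_samples is None:
        raise ValueError("signals too short for the requested lags")
    return best_samples * 1000.0 / fs_hz, best_r, fisher_z(best_r)


# Toy check: the "response" is the reference delayed by 3 samples (3 ms at 1 kHz).
ref = [math.sin(0.3 * t) * math.exp(-0.02 * t) for t in range(50)]
resp = [0.0] * 3 + ref + [0.0] * 10
lag_ms, r_max, z_max = best_lag(ref, resp, fs_hz=1000, max_lag_ms=8)
```

The transformed z values, rather than the raw r values, would then enter the parametric tests.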
Discrete peak measures
Measures of both timing and magnitude were utilized to assess the discrete peaks. Using the software with which the responses were collected (Neuroscan for the 170 ms /da/ and Bio-Logic for the 40 ms /da/), two experienced peak pickers manually marked the peaks of waves at the onset and transition portion in the averaged response in order to measure their latencies and amplitudes. Each peak was marked by selecting the point with the largest positive or negative amplitude within the estimated time window in which it was expected to occur (e.g., peak V: 9.00–10.00 ms and trough A: 10.00–11.00 ms for speech-evoked ABRs collected with Neuroscan), provided it was visibly above the noise floor given by the prestimulus period. In cases where the two peak pickers were not in agreement regarding the latency of a particular peak, a third experienced peak picker was consulted. When the background noise was introduced with the syllable, response peaks were often obscured or absent in the waveform, similar to the speech-evoked ABR collected in background noise in children (Russo et al., 2005). These peaks were omitted from statistical analyses.
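The peaks in this study were marked manually by experienced raters. The selection criterion itself (largest positive or negative deflection inside an expected window, accepted only if it exceeds the prestimulus noise floor) can nonetheless be sketched in Python. The function name `pick_peak` and the use of the maximum absolute prestimulus amplitude as the noise floor are assumptions, standing in for the raters' visual judgment.

```python
def pick_peak(waveform, fs_hz, pre_ms, win_start_ms, win_end_ms, polarity=1):
    """Find the largest peak (polarity=1) or trough (polarity=-1) in a window.

    waveform starts pre_ms before stimulus onset. Returns (latency_ms,
    amplitude), or None when the candidate does not exceed the noise floor.
    """
    def index(t_ms):
        return int(round((t_ms + pre_ms) * fs_hz / 1000.0))

    # Assumed noise floor: largest absolute prestimulus amplitude.
    noise_floor = max(abs(v) for v in waveform[:index(0.0)])

    lo, hi = index(win_start_ms), index(win_end_ms)
    segment = waveform[lo:hi + 1]
    if not segment:
        return None

    k = max(range(len(segment)), key=lambda i: polarity * segment[i])
    amplitude = segment[k]
    if polarity * amplitude <= noise_floor:
        return None  # treated as absent, as for peaks obscured by noise
    latency_ms = (lo + k) * 1000.0 / fs_hz - pre_ms
    return latency_ms, amplitude


# Toy waveform at 1 kHz: 5 ms of baseline, a peak at 9 ms, a trough at 10 ms.
toy = [0.01, -0.02, 0.01, 0.0, -0.01] + [0.0] * 9 + [0.3, -0.4] + [0.0] * 4
wave_v = pick_peak(toy, 1000, 5, 9.0, 10.0, polarity=1)
wave_a = pick_peak(toy, 1000, 5, 10.0, 11.0, polarity=-1)
absent = pick_peak(toy, 1000, 5, 12.0, 13.0, polarity=1)
```

Returning `None` mirrors the handling of obscured or absent peaks, which were omitted from statistical analyses.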
Frequency domain analyses
Fast Fourier transform was used to evaluate the content and strength of frequency encoding of the frequency following portion of the response (i.e., F0 and its harmonics). In the analysis of the 170 ms /da/ response, average spectral amplitudes within 40 Hz wide bins centered on the F0 and its harmonics were obtained separately from the transition (20–60 ms) and steady-state (60–180 ms) regions of the response. Separate spectral analysis was performed because the encoding of the F0 in the region corresponding to the formant transition is weakened by rapid shifts in higher formants (Johnson et al., 2008a; Hornickel et al., 2009b) compared to the sustained region, in which the strength of F0 encoding is reinforced by unwavering formants that are integer multiples of the F0. Moreover, because the frequency of the F0 was constant over a relatively long periodic portion compared to the 40 ms /da/, the spectral encoding of the frequency components was more focused and was encapsulated by the 40 Hz bins. For the 40 ms /da/, strength of frequency encoding was defined as the average spectral amplitude within a 100 Hz wide bin centered on the F0 and the harmonics. The larger bin size captured the broader spread of frequency representation, primarily due to the absence of a steady-state vowel and the linearly rising F0 (103–125 Hz) in the voiced formant transition.
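The bin-averaging step described above can be illustrated with a small Python sketch. It uses a direct discrete Fourier transform evaluated only at the bins of interest, which gives the same coefficients an FFT would. The function names, the amplitude scaling (2|X|/N), and the absence of windowing or zero-padding are assumptions, since the text does not specify them.

```python
import cmath
import math


def band_amplitude(segment, fs_hz, center_hz, width_hz):
    """Mean spectral amplitude over DFT bins within center_hz +/- width_hz/2."""
    n = len(segment)
    lo, hi = center_hz - width_hz / 2.0, center_hz + width_hz / 2.0
    amplitudes = []
    for k in range(n // 2 + 1):
        freq = k * fs_hz / n
        if lo <= freq <= hi:
            coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                        for i, v in enumerate(segment))
            amplitudes.append(2.0 * abs(coeff) / n)  # assumed scaling
    if not amplitudes:
        raise ValueError("no DFT bins fall inside the requested band")
    return sum(amplitudes) / len(amplitudes)


def harmonic_amplitudes(segment, fs_hz, f0_hz, n_components, width_hz):
    """Band amplitude at F0 and its integer multiples (F0 counts as the first)."""
    return [band_amplitude(segment, fs_hz, f0_hz * h, width_hz)
            for h in range(1, n_components + 1)]


# Toy signal: 200 ms at 1 kHz with a 100 Hz component (amplitude 1.0)
# and a 200 Hz component (amplitude 0.5).
toy = [math.sin(2 * math.pi * 100 * i / 1000)
       + 0.5 * math.sin(2 * math.pi * 200 * i / 1000) for i in range(200)]
amps = harmonic_amplitudes(toy, fs_hz=1000, f0_hz=100, n_components=3, width_hz=40)
```

Because the value is an average over every bin in the band, a wider band (such as the 100 Hz bins used for the 40 ms /da/) spreads the same spectral peak over more bins.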
Results
Study 1: Test-retest reliability of 170 ms/da/ in quiet and background noise
Time domain analyses
Overall response magnitude (SNR and RMS amplitude)
Grand average speech-evoked ABRs for both testing sessions in the three listening conditions (quiet, 2-talker babble and 6-talker babble) are shown in Figure 1A. Stability of the magnitude of the speech-evoked brainstem response was evaluated over the entire range of the response (0–180 ms). The global magnitude of neural activation was replicable from test to retest sessions for each subject and was consistently vulnerable to significant degradation in background noise (see Table 1 for means and standard deviations). A 2 session (Test 1 vs. Test 2) × 3 condition (quiet, 2-talker and 6-talker babble) two-way repeated measures ANOVA was performed separately on each of the 2 dependent variables reflecting overall response magnitude (i.e., SNR and RMS amplitude). There was no significant main effect of session for SNR (F=1.726, p=0.199) or RMS amplitude (F=0.180, p=0.674), nor a significant interaction between test session and condition (SNR: F=0.095, p=0.910, RMS: F=0.131, p=0.877). However, there was a significant main effect of condition (SNR: F=18.636, p<0.0001; RMS: F=10.441, p<0.0001). Post-hoc pairwise t-tests showed that 2- and 6-talker background noise significantly degraded both measures of overall magnitude compared to quiet during Test 1 (RMS: t=3.145, 4.437, t, <0.0001; SNR: t=3.660, 3.981, p=0.001, <0.0001, respectively) and Test 2 (RMS: t=2.518, 3.735, p=0.017, 0.001; SNR: t=2.993, 4.041, p=0.005, <0.0001).
Table 1.
Overall response magnitude (170 ms /da/) | Condition | Test 1 (Mean ± SD) | Test 2 (Mean ± SD) |
---|---|---|---|
RMS | Quiet | 0.14 ± 0.04 | 0.14 ± 0.03 |
RMS | 2-talker | 0.13 ± 0.04 | 0.13 ± 0.03 |
RMS | 6-talker | 0.13 ± 0.03 | 0.13 ± 0.03 |
SNR | Quiet | 1.86 ± 0.33 | 1.90 ± 0.45 |
SNR | 2-talker | 1.65 ± 0.40 | 1.68 ± 0.40 |
SNR | 6-talker | 1.563 ± 0.41 | 1.63 ± 0.34 |
Quiet-to-noise inter-response analysis (5–180 ms)
The relationships among brainstem responses recorded in quiet and background noise were explored using cross-correlation analysis. The impact of noise on the speech-evoked brainstem response for each subject was similar across test sessions. The stability of the degree of similarity between the responses across listening conditions was analyzed by performing a two-way repeated measures ANOVA separately for each of the two dependent variables arising from quiet-to-noise correlation analysis (i.e., lag time and z’-score of Pearson’s r-value). A 2 session (Test 1 vs. Test 2) × 2 condition (quiet vs. 2-talker and quiet vs. 6-talker babble) repeated measures ANOVA for lag time showed no significant main effects of session or condition (F=0.300, 0.385, p=0.588, 0.540, respectively) nor a significant interaction between session and condition (F=0.385, p=0.540). Moreover, a two-way repeated measures ANOVA of the z’-score of the Pearson’s r-value showed no significant main effect of session (F=1.734, p=0.198) nor a significant interaction between session and condition (F=0.243, p=0.626). However, there was a significant main effect of condition (F=19.627, p<0.0001), with post-hoc pairwise t-tests showing that the 6-talker babble consistently imposed a significantly greater challenge than the 2-talker babble on the neural phase locking to periodic information in the stimulus, as shown by a lower Pearson’s r-value within a given test session (Test 1: t=3.532, p=0.001, Test 2: t=3.472, p=0.002). See Table 2 and Figure 2 for means and standard deviations as well as individual test-retest data.
Table 2.
Inter-response measure | Comparison | Test 1 (Mean ± SD) | Test 2 (Mean ± SD) |
---|---|---|---|
Lag time (ms) | Quiet vs. 2-talker | 0.26 ± 0.16 | 0.25 ± 0.16 |
Lag time (ms) | Quiet vs. 6-talker | 0.28 ± 0.27 | 0.27 ± 0.20 |
Pearson’s r-value | Quiet vs. 2-talker | 0.66 ± 0.15 | 0.67 ± 0.14 |
Pearson’s r-value | Quiet vs. 6-talker | 0.62 ± 0.15 | 0.63 ± 0.14 |
Stimulus-to-response correlation (50–180 ms)
The stability of the degree of similarity between the FFR and the stimulus, and of the time displacement that produced the highest Pearson’s r-value, was assessed using stimulus-to-response correlations. The fidelity of the FFR in encoding the timing features of the stimulus was replicable and demonstrated good intrasubject test-retest reliability. Moreover, these properties were consistently affected by noise over multiple test sessions; see Table 3 and Figure 3 for means and standard deviations as well as individual test-retest data. To determine the reproducibility of response fidelity to the stimulus from one session to another, a 2 session (Test 1 vs. Test 2) × 3 condition (quiet, 2- and 6-talker babble) two-way repeated measures ANOVA was performed separately for each of the 2 dependent variables of the stimulus-to-response analysis (lag time and z’-score of Pearson’s r-value). There was no significant main effect of session for either the lag time or z’-score measures (F=0.011, 1.529, p=0.915, 0.226, respectively) nor a significant interaction between session and condition (lag time: F=1.208, p=0.313, z’-score: F=0.127, p=0.881). However, there was a main effect of condition for both measures (lag time: F=7.045, p<0.003, z’-score: F=17.082, p<0.0001). Post-hoc pairwise t-tests showed that while the stimulus-to-response measures of lag time and z’-score did not differ from test to retest in any of the three listening conditions (lag time: t=−0.949, −0.309, 1.062, p=0.350, 0.759, 0.297; z’-score: t=−1.102, −0.490, −1.022, p=0.279, 0.628, 0.315, for quiet, 2-talker, and 6-talker, respectively), the presence of 2- and 6-talker background noise significantly degraded both measures compared to quiet during both Test 1 (lag time: t=−3.499, 3.597, p=0.001, 0.001; z’-score: t=−5.682, −5.306, p<0.0001, <0.0001) and Test 2 (lag time: t=−3.752, −3.584, p=0.001, 0.001; z’-score: t=−5.273, −4.881, p<0.0001, <0.0001). However, there were no significant differences between 2- and 6-talker babble for either lag time or z’-score within each test session (Test 1: lag time: t=−1.472, p=0.151, z’-score: t=0.916, p=0.367; Test 2: lag time: t=−0.450, p=0.656, z’-score: t=0.095, p=0.925).
Table 3.
Stimulus-to-response measure | Condition | Test 1 (Mean ± SD) | Test 2 (Mean ± SD) |
---|---|---|---|
Lag time (ms) | Quiet | 8.91 ± 0.58 | 8.93 ± 0.62 |
Lag time (ms) | 2-talker | 9.19 ± 0.75 | 9.19 ± 0.74 |
Lag time (ms) | 6-talker | 9.25 ± 0.83 | 9.21 ± 0.78 |
Pearson’s r-value | Quiet | 0.26 ± 0.07 | 0.26 ± 0.07 |
Pearson’s r-value | 2-talker | 0.21 ± 0.07 | 0.21 ± 0.07 |
Pearson’s r-value | 6-talker | 0.22 ± 0.07 | 0.21 ± 0.06 |
Discrete peak measures
Onset peaks (Waves V and A)
Detectability
In the quiet condition, the onset response peaks (Waves V and A) were detected in all 31 participants’ responses in Tests 1 and 2. In the 2-talker condition, Waves V and A were present in 17 participants’ responses during Tests 1 and 2. For one participant, the onset response present in Test 1 was absent in Test 2 and conversely, for another participant, it was present in Test 2 but absent in Test 1. In the 6-talker condition, the onset response was present in 15 participants’ responses during Test 1 and 14 during Test 2. The responses of two participants who exhibited an onset response during Test 1 did not show it during Test 2. One participant showed an onset response during Test 2 which was absent during Test 1. Pearson’s χ2 showed that detectability of the onset waves was not statistically different between Tests 1 and 2 in either the 2-talker (p=0.108) or 6-talker (p=0.252) conditions.
Response latency
A 2 session (Test 1 vs. Test 2) × 3 condition (quiet, 2-talker, 6-talker) two-way repeated measures MANOVA was performed with Waves V and A latencies as the dependent variables. There was no significant main effect of session (F=2.109, p=0.184) nor a significant interaction between session and condition (F=3.257, p=0.096). As expected, there was a main effect of condition (F=19.229, p=0.001), suggesting that the desynchronizing influence of noise on neural firing resulted in later latencies; see Table 4 for means and standard deviations of Wave V and A latencies. In summary, neither the intrasubject detectability nor the latencies of V or A differed significantly from test to retest.
Table 4.
Latency (ms), Mean ± SD | Peak | Quiet, Test 1 | Quiet, Test 2 | 2-talker babble, Test 1 | 2-talker babble, Test 2 | 6-talker babble, Test 1 | 6-talker babble, Test 2 |
---|---|---|---|---|---|---|---|
Positive peak | V | 9.57 ± 0.58 | 9.56 ± 0.57 | 9.99 ± 0.43 | 9.99 ± 0.45 | 10.23 ± 0.48 | 10.23 ± 0.50 |
Positive peak | 33 | 33.21 ± 0.46 | 33.19 ± 0.51 | 33.81 ± 0.59 | 33.82 ± 0.62 | 34.34 ± 0.64 | 34.34 ± 0.67 |
Positive peak | 43 | 43.31 ± 0.46 | 43.30 ± 0.47 | 43.33 ± 0.56 | 43.33 ± 0.60 | 43.45 ± 0.75 | 43.49 ± 0.74 |
Positive peak | 53 | 53.48 ± 0.68 | 53.47 ± 0.71 | 53.46 ± 0.79 | 53.47 ± 0.76 | 53.56 ± 0.83 | 53.60 ± 0.77 |
Negative peak | A | 10.92 ± 0.88 | 10.93 ± 0.89 | 11.29 ± 0.60 | 11.29 ± 0.59 | 11.31 ± 0.64 | 11.32 ± 0.68 |
Negative peak | 35 | 35.28 ± 0.64 | 35.29 ± 0.58 | 35.58 ± 0.58 | 35.60 ± 0.62 | 35.81 ± 0.63 | 35.83 ± 0.64 |
Negative peak | 45 | 45.55 ± 0.63 | 45.54 ± 0.64 | 45.53 ± 0.59 | 45.54 ± 0.63 | 45.51 ± 0.75 | 45.29 ± 0.78 |
Negative peak | 56 | 56.09 ± 0.59 | 56.01 ± 0.61 | 56.20 ± 0.55 | 56.23 ± 0.73 | 56.22 ± 0.77 | 56.22 ± 0.76 |
Formant transition period (20–60 ms)
Peaks in the transition region of the brainstem response, collected in either quiet or background noise, were replicable over a test-retest period. This result is important because the neural timing of these peaks, particularly in background noise, has been found to be associated with speech perceptual ability (Anderson et al., 2010). Table 4 lists the means and standard deviations of peak latencies from the formant transition period. Analysis was conducted on the latencies of peaks in the formant transition period (3 positive- and negative-going peak pairs at mean latencies of approximately 33, 35, 43, 45, 53, 56 ms). The voicing onset peaks, occurring approximately at 23 and 24 ms after the response onset, were excluded from analysis as their amplitudes did not exceed the noise floor, especially in the babble conditions. A 2 session (Test 1 vs. Test 2) × 3 condition (quiet, 2-talker, 6-talker babble) two-way repeated measures MANOVA was performed with the latencies of these peaks as dependent variables. Results showed no significant main effects of session (F=0.740, p=0.642) or condition (F=1.817, p=0.117) nor a significant interaction between session and condition (F=1.635, p=0.161).
Frequency domain analysis
Representation of fundamental and formant frequencies
The strength of frequency encoding in the transition and steady-state regions of the response elicited by the different listening conditions was examined in the frequency domain using the fast Fourier transform (FFT). Individual responses were segmented into two time ranges: 1) 20–60 ms, which includes the response to the formant transition of the stimulus and 2) 60–180 ms, which includes the response to the steady-state vowel. To examine the strength of frequency encoding, average response magnitudes were calculated for 40 Hz wide bins surrounding the F0 (100 Hz) and subsequent nine harmonics (200 Hz, 300 Hz… 1000 Hz). These test-retest results indicate that the strength of frequency encoding was stable and was not susceptible to change for each subject; refer to Figure 4A for the average and standard deviation of FFT amplitudes from Tests 1 and 2 for each listening condition and response time range. A 2 session (Test 1 vs. Test 2) × 3 condition (quiet, 2-talker, 6-talker babble) two-way repeated measures ANOVA was separately performed on the formant transition (20–60 ms) and steady-state (60–180 ms) regions of the response. There was no significant main effect of test session (F=1.168, 1.585, p=0.366, 0.179, respectively) nor a significant interaction between test session and condition (F=0.538, 0.725, p=0.886, 0.744, respectively). As expected, there was a significant main effect of condition (F=3.564, 3.909, p=0.022, 0.012), indicating that the presence of multispeaker babble degraded the subcortical encoding of frequencies compared to responses recorded in quiet.
Study 2: Test-retest reliability of the 40 ms/da/
Time domain analysis
Overall response magnitude (SNR and RMS amplitude)
Grand average responses are shown in Figure 1B. Stability of the overall magnitude of the response was evaluated over the entire range of the response from test to retest. The global magnitude of neural activation is replicable and is resistant to gain or reduction over time for each subject; see Table 5 for means and standard deviations. A 2 session (Test 1 vs. Test 2) one-way repeated measures ANOVA was performed on each of the 2 dependent variables measuring overall response magnitude (i.e., SNR and RMS amplitude). There was no significant main effect of session for SNR (F=0.107, p=0.745) or RMS amplitude (F=0.489, p=0.488).
Table 5.
Overall response magnitude (40 ms /da/) | Test 1 (Mean ± SD) | Test 2 (Mean ± SD) |
---|---|---|
RMS | 0.09 ± 0.02 | 0.09 ± 0.02 |
SNR | 3.01 ± 1.11 | 2.89 ± 1.22 |
Stimulus-to-response analysis
The FFR (11–40 ms) was evident in all subjects; see Figure 4B for an overlay of grand average responses in this time region from both testing sessions. The stability of the degree of similarity between the FFR and the stimulus, as well as the time displacement that produced the highest r-value, was assessed using stimulus-to-response correlations. Intrasubject response fidelity to the stimulus was replicable and demonstrated good test-retest reliability; see Figure 4C for means and standard deviations. A 2 session (Test 1 vs. Test 2) one-way repeated measures ANOVA was performed on each of the 2 dependent variables that measured the precision with which the FFR mimics the stimulus (i.e., lag time and z’-score of the Pearson’s r-value). There was no significant main effect of session for lag time (F=1.615, p=0.211) or the z’-score of the Pearson’s r-value (F=0.006, p=0.941).
Peak analysis
Latencies and amplitudes of peaks V, A, C, D, E, F, and O, as well as VA measures (i.e., duration, amplitude, slope and area), obtained during Test 1 were comparable with those obtained during Test 2; within-subject latency values are shown in Figure 1C. Peak responses fall into two categories: those that encode transient events in the stimulus (V, A, C and O), and those that encode the periodicity of the vowel (D, E and F) (Russo et al., 2004; Kraus and Nicol, 2005; Dhar et al., 2009; Hornickel et al., 2009a; Krizman et al., 2010). The peaks were divided into these two categories for analysis. The timing and robustness of these response peaks were highly stable and were reliably replicated over time for each subject. First, a one-way 2 session (Test 1 vs. Test 2) repeated-measures ANOVA was performed with V, A, C, and O latency and amplitude and VA duration, amplitude, slope and area serving as dependent variables. There were no significant main effects of session (F=0.989, p=0.500). A second one-way 2 session (Test 1 vs. Test 2) repeated-measures ANOVA was performed with D, E and F latency and amplitude serving as dependent variables. There were no significant main effects of session (F=1.436, p=0.231); refer to Tables 6 and 7 for means and standard deviations.
Table 6.
Peak | Latency, Test 1 (Mean ± SD) | Latency, Test 2 (Mean ± SD) | Amplitude, Test 1 (Mean ± SD) | Amplitude, Test 2 (Mean ± SD) |
---|---|---|---|---|
V | 6.65 ± 0.27 | 6.68 ± 0.27 | 0.13 ± 0.05 | 0.13 ± 0.04 |
A | 7.62 ± 0.35 | 7.62 ± 0.37 | −0.20 ± 0.06 | −0.21 ± 0.06 |
C | 18.60 ± 0.68 | 18.47 ± 0.68 | −0.03 ± 0.06 | −0.03 ± 0.05 |
D | 22.67 ± 0.59 | 22.72 ± 0.58 | −0.13 ± 0.07 | −0.14 ± 0.07 |
E | 31.12 ± 0.53 | 31.20 ± 0.57 | −0.22 ± 0.06 | −0.21 ± 0.07 |
F | 39.70 ± 0.57 | 39.71 ± 0.50 | −0.14 ± 0.09 | −0.13 ± 0.08 |
O | 48.26 ± 0.43 | 48.34 ± 0.39 | −0.15 ± 0.06 | −0.16 ± 0.06 |
Table 7.
VA measure | Test 1 (Mean ± SD) | Test 2 (Mean ± SD) |
---|---|---|
Duration | 0.98 ± 0.23 | 0.94 ± 0.21 |
Amplitude | 0.33 ± 0.09 | 0.34 ± 0.08 |
Slope | −0.35 ± 0.11 | −0.37 ± 0.12 |
Area | 0.16 ± 0.05 | 0.15 ± 0.05 |
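The relationship between the V and A peak values and the VA measures can be sketched with simple arithmetic. The definitions below are assumptions, since this section does not spell them out: duration as the A latency minus the V latency, amplitude as the V amplitude minus the A amplitude, slope as the amplitude change divided by the duration, and area as the triangle spanned by the two. Applied to the Test 1 group means in Table 6, they give values close to the Test 1 means in Table 7. An exact match is not expected, because a mean of per-subject values generally differs from the same quantity computed from group means.

```python
def va_measures(v_latency, v_amplitude, a_latency, a_amplitude):
    """VA complex measures under the assumed definitions described above."""
    duration = a_latency - v_latency
    amplitude = v_amplitude - a_amplitude
    slope = (a_amplitude - v_amplitude) / duration  # negative: V down to A
    area = 0.5 * duration * amplitude               # triangle approximation
    return duration, amplitude, slope, area


# Test 1 group means for peaks V and A, taken from Table 6.
duration, amplitude, slope, area = va_measures(6.65, 0.13, 7.62, -0.20)
print(round(duration, 2), round(amplitude, 2), round(slope, 2), round(area, 2))
# prints: 0.97 0.33 -0.34 0.16
```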
Frequency domain analysis
Representation of fundamental and formant frequencies
The strength of frequency encoding in the FFR (11 to 40 ms) elicited by the 40 ms/da/ obtained during Test 1 was comparable to that obtained during Test 2; see Figure 4D. A 2 session (Test 1 vs. Test 2) one-way repeated-measures ANOVA was performed to examine the reliability of frequency encoding over the two recording sessions, with the average response magnitudes from each of the 10 frequency bins, reflecting the F0 (100 Hz) and the subsequent harmonics (200, 300 … 1 kHz, with a 100 Hz bin size), serving as dependent variables. There was no significant main effect of session (F=0.560, p=0.834). Thus, the magnitude of frequency encoding of the periodic elements of this stimulus was stable for each subject over time.
Discussion
Overall, we found high test-retest reliability of the brainstem response to speech syllables recorded in quiet and noisy listening conditions in young adults. There were no significant test-retest differences in measures that quantify the response in either the time or the frequency domain, regardless of the type of stimulus used (40 ms vs. 170 ms/da/) or the background listening condition (quiet vs. 2- and 6-talker babble). Demonstrating stability of the response is critical, especially as the speech-evoked brainstem response plays an increasingly important role in the objective assessment of auditory processing.
Like the click-evoked brainstem response, the speech-evoked brainstem response is highly predictable, as shown by its remarkable temporal precision and test-retest reliability in young adults. For instance, the stop consonant of both the 40 ms and 170 ms/da/, which is brief and spectrally stochastic, consistently elicited an onset response in the quiet condition at the expected latencies for all subjects. This finding points to the reliability with which these speech stimuli elicit an onset response, as well as to the robustness with which the auditory system of young adults encodes this feature from test to retest. Also, for both stimuli, the region of the response reflecting the encoding of the formant transition (analogous regions between the short and long /da/ stimuli) showed faithful representation of stimulus timing corresponding to F0 and formants during both test sessions. This stability allows for accurate and valid evaluation of speech-evoked brainstem activity elicited in quiet or noisy backgrounds over multiple test sessions. Thus, because the response is stable, changes observed between the initial test and retest in a study involving auditory training can be interpreted as training-related outcomes. Additionally, tracking changes in a physiologic response such as the speech-evoked ABR can be used to objectively evaluate the effectiveness of a particular training program, thereby enhancing the clinical utility of aural rehabilitation programs.
The high test-retest reliability of the speech-evoked ABR obtained in the present study is consistent with a previous study conducted in children. In a study of eight normal-hearing children whose speech-evoked brainstem responses were obtained in two separate sessions spaced 2 to 10 months apart, Russo and colleagues (2004) found that most brainstem measures did not change significantly over the test-retest interval. Exceptions included the VA interpeak amplitude and slope in quiet. Although these measures did not differ at retest in the present study, the exceptions may be partly explained by known factors, such as the variability of the onset response amplitude (Starr and Don, 1988) and the developmental changes observed in children for the auditory brainstem response to sounds composed of acoustic elements relevant to speech, often manifested as delayed and less synchronous onset responses (Johnson et al., 2008b).
We have shown that the test-retest reliability of the speech-evoked ABR was excellent across listening conditions and stimulus durations, with no significant differences at the group level and high levels of agreement at the individual level. Given this high test-retest reliability, the speech-evoked ABR holds promise as an investigative instrument for quantifying with confidence how consistently the neural system of young adults encodes a complex sound at the preconscious level. Establishing this level of stability has broad positive implications for research and clinical assessment whenever auditory processing is of interest. This includes investigations of ABRs to complex sounds in challenging listening conditions in populations with auditory specialization (e.g., musicians, native language speakers) and the management of auditory deficits (e.g., auditory processing disorders, language-based learning impairments, hearing loss and age-related hearing decline).
Acknowledgments
The authors wish to thank Professor Ann Bradlow for the use of her background noise stimuli and Professor Steven Zecker for his advice on the statistical treatment of these data. We would also like to thank the people who participated in this study. This work was supported by R01 DC01510, T32 NS047987, and F32 DC008052.
References
- Akhoun I, Gallégo S, Moulin A, Menard M, Veuillet E, Berger-Vachon C, Collet L, Thai-Van H. The temporal relationship between speech auditory brainstem responses and the acoustic pattern of the phoneme /ba/ in normal-hearing adults. Clin Neurophysiol. 2008;119:922–933. doi: 10.1016/j.clinph.2007.12.010.
- Anderson S, Chandrasekaran B, Skoe E, Kraus N. Neural timing is linked to speech perception in noise. J Neurosci. 2010;30:4922–4926. doi: 10.1523/JNEUROSCI.0107-10.2010.
- Banai K, Hornickel JM, Skoe E, Nicol T, Zecker S, Kraus N. Reading and subcortical auditory function. Cerebral Cortex. 2009;19:2699–2707. doi: 10.1093/cercor/bhp024.
- Bio-logic SC. Auditory Evoked Potential (AEP) System User’s and Service Manual (590-AEPUM1; Rev C). 2005.
- Boston JR, Møller AR. Brainstem auditory-evoked potentials. Crit Rev Biomed Eng. 1985;13:97–123.
- Brandt J, Rosen JJ. Auditory Phonemic Perception in Dyslexia - Categorical Identification and Discrimination of Stop Consonants. Brain Lang. 1980;9:324–337. doi: 10.1016/0093-934x(80)90152-2.
- Brown L, Sherbenou RJ, Johnsen SK. Test of Nonverbal Intelligence: A language-free measure of cognitive ability. Austin, TX: PRO-ED Inc; 1997.
- Chandrasekaran B, Kraus N. The scalp-recorded brainstem response to speech: Neural origins and plasticity. Psychophysiol. 2010;47:236–246. doi: 10.1111/j.1469-8986.2009.00928.x.
- Cone-Wesson B, Kurtzberg D, Vaughan HG. Electrophysiologic Assessment of Auditory Pathways in High-Risk Infants. Int J Pediatr Otorhinolaryngol. 1987;14:203–214. doi: 10.1016/0165-5876(87)90032-2.
- Cone-Wesson B, Vohr BR, Sininger YS, Widen JE, Folsom RC, Gorga MP, Norton SJ. Identification of neonatal hearing impairment: Infants with hearing loss. Ear Hearing. 2000;21:488–507. doi: 10.1097/00003446-200010000-00012.
- Cunningham J, Nicol T, King C, Zecker SG, Kraus N. Effects of noise and cue enhancement on neural responses to speech in auditory midbrain, thalamus and cortex. Hearing Res. 2002;169:97–111. doi: 10.1016/s0378-5955(02)00344-1.
- Dhar S, Abel R, Hornickel J, Nicol T, Skoe E, Zhao W, Kraus N. Exploring the relationship between physiological measures of cochlear and brainstem function. Clin Neurophysiol. 2009;120:959–966. doi: 10.1016/j.clinph.2009.02.172.
- Edwards RM, Buchwald JS, Tanguay PE, Schwafel JA. Sources of Variability in Auditory Brain-Stem Evoked-Potential Measures over Time. Electroencephalogr Clin Neurophysiol. 1982;53:125–132. doi: 10.1016/0013-4694(82)90018-9.
- Galbraith GC, Amaya EM, Diaz de Rivera JM, Donan NM, Duong MT, Hsu JN, Tran K, Tsang LP. Brain stem evoked response to forward and reversed speech in humans. NeuroReport. 2004;15:2057–2060. doi: 10.1097/00001756-200409150-00012.
- Galbraith GC, Arbagey PW, Branski R, Comerci N, Rector PM. Intelligible speech encoded in the human brain stem frequency following response. NeuroReport. 1995;6:2363–2367. doi: 10.1097/00001756-199511270-00021.
- Galbraith GC, Jhaveri SP, Kuo J. Speech-evoked brainstem frequency-following responses during verbal transformations due to word repetition. Electroencephalogr Clin Neurophysiol. 1997;102:46–53. doi: 10.1016/s0013-4694(96)96006-x.
- Gorga M, Abbas P, Worthington D. Stimulus calibration in ABR measurements. In: Jacobson J, editor. The Auditory Brainstem Response. San Diego: College-Hill Press; 1985. pp. 49–62.
- Hall JW. Handbook of Auditory Evoked Responses. Needham Heights, MA: Allyn and Bacon; 1992.
- Hood LJ. Clinical applications of the auditory brainstem response. San Diego, CA: Singular Publishing Group, Inc; 1998.
- Hornickel J, Skoe E, Kraus N. Subcortical Laterality of Speech Encoding. Audiol & Neurotol. 2009a;14:198–207. doi: 10.1159/000188533.
- Hornickel J, Skoe E, Nicol T, Zecker S, Kraus N. Subcortical differentiation of voiced stop consonants: Relationships to reading and speech in noise perception. PNAS. 2009b;106:13022–13027. doi: 10.1073/pnas.0901123106.
- Jacobson J. The Auditory Brainstem Response. San Diego: College-Hill Press; 1985.
- Johnson KL, Nicol T, Kraus N. The brainstem response to speech: a biological marker of auditory processing. Ear Hearing. 2005;26:424–434. doi: 10.1097/01.aud.0000179687.71662.6e.
- Johnson KL, Nicol T, Zecker SG, Bradlow AR, Skoe E, Kraus N. Brainstem encoding of voiced consonant-vowel stop syllables. Clin Neurophysiol. 2008a;119:2623–2635. doi: 10.1016/j.clinph.2008.07.277.
- Johnson KL, Nicol T, Zecker SG, Kraus N. Developmental plasticity in the human auditory brainstem. J Neurosci. 2008b;28:4000–4007. doi: 10.1523/JNEUROSCI.0012-08.2008.
- Klatt D. Software for cascade/parallel formant synthesizer. J Acoust Soc Am. 1980;67:971–975.
- Kraus N, Nicol T. Brainstem origins for cortical ‘what’ and ‘where’ pathways in the auditory system. Trends Neurosci. 2005;28:176–181. doi: 10.1016/j.tins.2005.02.003.
- Krishnan A, Swaminathan J, Gandour JT. Experience-dependent enhancement of linguistic pitch representation in the brainstem is not specific to a speech context. J Cogn Neurosci. 2009;21:1092–1105. doi: 10.1162/jocn.2009.21077.
- Krishnan A, Xu YS, Gandour JT, Cariani PA. Human frequency-following response: representation of pitch contours in Chinese tones. Hearing Res. 2004;189:1–12. doi: 10.1016/S0378-5955(03)00402-7.
- Krizman J, Skoe E, Kraus N. Stimulus rate and subcortical auditory processing of speech. Audiol & Neurotol. 2010;15:332–342. doi: 10.1159/000289572.
- Lee K, Skoe E, Kraus N, Ashley R. Selective subcortical enhancement of musical intervals in musicians. J Neurosci. 2009;29:5832–5840. doi: 10.1523/JNEUROSCI.6133-08.2009.
- Møller AR, Jannetta P. Neural generators of the auditory brainstem response. In: Jacobson J, editor. The Auditory Brainstem Response. San Diego, CA: College-Hill Press; 1985. pp. 13–32.
- Moushegian G, Rupert AL, Stillman RD. Scalp-recorded early responses in man to frequencies in speech range. Electroencephalogr Clin Neurophysiol. 1973;35:665–667. doi: 10.1016/0013-4694(73)90223-x.
- Musacchia G, Sams M, Skoe E, Kraus N. Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. PNAS. 2007;104:15894–15898. doi: 10.1073/pnas.0701498104.
- Norton SJ, Gorga MP, Widen JE, Folsom RC, Sininger Y, Cone-Wesson B, Vohr BR, Fletcher KA. Identification of neonatal hearing impairment: Summary and recommendations. Ear Hearing. 2000;21:529–535. doi: 10.1097/00003446-200010000-00014.
- Oyler RF, Lauter JL, Matkin ND. Intrasubject variability in the absolute latency of the auditory brainstem response. JAAA. 1991;2:206–213.
- Parbery-Clark A, Skoe E, Kraus N. Biological bases for the musician advantage for speech-in-noise. Society for Neuroscience 2009. Chicago: 2009a.
- Parbery-Clark A, Skoe E, Kraus N. Musical experience limits the degradative effects of background noise on the neural processing of sound. J Neurosci. 2009b;29:14100–14107. doi: 10.1523/JNEUROSCI.3256-09.2009.
- Russo N, Nicol T, Musacchia G, Kraus N. Brainstem responses to speech syllables. Clin Neurophysiol. 2004;115:2021–2030. doi: 10.1016/j.clinph.2004.04.003.
- Russo NM, Nicol TG, Zecker SG, Hayes EA, Kraus N. Auditory training improves neural timing in the human brainstem. Behav Brain Res. 2005;156:95–103. doi: 10.1016/j.bbr.2004.05.012.
- Sininger YS. Auditory Brain-Stem Response for Objective Measures of Hearing. Ear Hearing. 1993;14:23–30. doi: 10.1097/00003446-199302000-00004.
- Sininger YS, Abdala C, Cone-Wesson B. Auditory threshold sensitivity of the human neonate as measured by the auditory brainstem response. Hearing Res. 1997;104:27–38. doi: 10.1016/s0378-5955(96)00178-5.
- Skoe E, Kraus N. Auditory brainstem response to complex sounds: a tutorial. Ear Hearing. 2010:31. doi: 10.1097/AUD.0b013e3181cdb272.
- Smiljanic R, Bradlow AR. Production and perception of clear speech in Croatian and English. JASA. 2005;118:1677–1688. doi: 10.1121/1.2000788.
- Song JH, Banai K, Russo NM, Kraus N. On the relationship between speech- and nonspeech-evoked auditory brainstem responses. Audiol & Neurotol. 2006;11:233–241. doi: 10.1159/000093058.
- Song JH, Skoe E, Banai K, Kraus N. Perception of speech in noise: Neural correlates. J Cogn Neurosci. doi: 10.1162/jocn.2010.21556. In Press.
- Song JH, Skoe E, Wong PCM, Kraus N. Plasticity in the adult human auditory brainstem following short-term linguistic training. J Cogn Neurosci. 2008;20:1892–1902. doi: 10.1162/jocn.2008.20131.
- Starr A, Don M. Brain potentials evoked by acoustic stimuli. In: Picton T, editor. Human event-related potentials. Handbook of electroencephalography and clinical neurophysiology. New York: Elsevier; 1988. pp. 97–158.
- Strait DL, Kraus N, Skoe E, Ashley R. Musical experience and neural efficiency - effects of training on subcortical processing of vocal expressions of emotion. Eur J Neurosci. 2009;29:661–668. doi: 10.1111/j.1460-9568.2009.06617.x.
- Swaminathan J, Krishnan A, Gandour JT. Pitch encoding in speech and nonspeech contexts in the human auditory brainstem. NeuroReport. 2008;19:1163–1167. doi: 10.1097/WNR.0b013e3283088d31.
- Tice R, Carrell T. Level. 1998;16
- Van Engen KJ, Bradlow AR. Sentence recognition in native- and foreign-language multi-talker background noise. JASA. 2007;121:519–526. doi: 10.1121/1.2400666.
- Wechsler D. Wechsler Abbreviated Scale of Intelligence. San Antonio, TX: The Psychological Corporation; 1999.