Author manuscript; available in PMC: 2011 May 3.
Published in final edited form as: Trends Amplif. 2010;14(2):73–83. doi: 10.1177/1084713810380227

Objective Neural Indices of Speech-in-Noise Perception

Samira Anderson 1, Nina Kraus 1
PMCID: PMC3086460  NIHMSID: NIHMS284964  PMID: 20724355

Abstract

Numerous factors contribute to understanding speech in noisy listening environments. There is a clinical need for objective biological assessment of auditory factors that contribute to the ability to hear speech in noise, factors that are free from the demands of attention and memory. Subcortical processing of complex sounds such as speech (auditory brainstem responses to speech and other complex stimuli [cABRs]) reflects the integrity of auditory function. Because cABRs physically resemble the evoking acoustic stimulus, they can provide objective indices of the neural transcription of specific acoustic elements (e.g., temporal, spectral) important for hearing speech. As with brainstem responses to clicks and tones, cABRs are clinically viable in individual subjects. Subcortical transcription of complex sounds is also clinically viable because of its known experience-dependence and role in auditory learning. Together with other clinical measures, cABRs can inform the underlying biological nature of listening and language disorders, inform treatment strategies, and provide an objective index of therapeutic outcomes. In this article, the authors review recent studies demonstrating the role of subcortical speech encoding in successful speech-in-noise perception.

Keywords: auditory brainstem, speech-in-noise perception, plasticity

Introduction

In today’s society, we are constantly bombarded by various types of background noise, and the ability to communicate in the presence of noise is an important task for successful participation in educational, social, and vocational environments. Speech-in-noise (SIN) perception is a complex task, and it poses particular demands on older adults and children with noise exclusion deficits (e.g., dyslexia, auditory processing disorders, specific language impairment, autism spectrum disorder, and attention deficit hyperactivity disorder). Auditory brainstem responses to speech and other complex stimuli (cABRs) afford the opportunity to objectively evaluate biological factors associated with successful SIN perception. In this review, we summarize studies that have examined subcortical temporal and spectral speech encoding and its relationship to SIN perception. Taken as a whole, these studies demonstrate the clinical viability of cABRs using a variety of different analysis techniques.

Auditory Brainstem Responses to Complex Stimuli

The cABR is ideal for evaluating auditory processing because of its high degree of transparency between the stimulus waveform and the brainstem response waveform. This similarity is apparent visually (Skoe & Kraus, 2010; Figure 1) as well as through aural demonstrations (Galbraith, Arbagey, Branski, Comerci, & Rector, 1995; see Footnote 1). cABRs represent the features of frequency and timing in speech, music, and other stimuli, thus providing an opportunity to examine auditory processing of these behaviorally relevant sounds.

Figure 1.

Time domain of a 40-ms stimulus /da/ (gray) and response (black)

Note. The stimulus evokes characteristic peaks in the response, labeled V, A, C, D, E, F, and O. The stimulus waveform has been shifted to account for neural lag and to allow visual alignment between peaks in the response and the stimulus. The arrows indicate where peaks in the stimulus correspond to peaks in the response. Two response waveforms of an individual participant are included to demonstrate replicability. Modified from Skoe and Kraus (2010).
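The alignment described in the note above (shifting the stimulus to account for neural lag) can be illustrated with a cross-correlation search. The following is a minimal sketch, not the procedure used by Skoe and Kraus (2010): the function name `best_lag`, the 2000-Hz sampling rate, the 7-ms delay, and the synthetic waveforms are all assumptions chosen for illustration.

```python
import math


def best_lag(stimulus, response, max_lag):
    """Return the lag, in samples, that maximizes the stimulus-response cross-correlation.

    Only non-negative lags are searched, because a response cannot precede
    the stimulus that evokes it.
    """
    best, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        overlap = min(len(stimulus), len(response) - lag)
        if overlap <= 0:
            break
        # Unnormalized cross-correlation at this lag.
        score = sum(stimulus[i] * response[i + lag] for i in range(overlap))
        if score > best_score:
            best, best_score = lag, score
    return best


# Synthetic demonstration (all values are illustrative, not measured data).
FS_HZ = 2000            # assumed sampling rate
TRUE_LAG_SAMPLES = 14   # assumed delay: 14 samples = 7 ms at 2000 Hz

# 40-ms damped 100-Hz sinusoid standing in for a stimulus waveform.
stimulus = [math.exp(-i / 40.0) * math.sin(2 * math.pi * 100 * i / FS_HZ)
            for i in range(80)]
# "Response": a delayed, attenuated copy of the stimulus, zero-padded at the end.
response = [0.0] * TRUE_LAG_SAMPLES + [0.5 * x for x in stimulus] + [0.0] * 40

lag = best_lag(stimulus, response, max_lag=40)
print(lag, 1000.0 * lag / FS_HZ)  # 14 7.0
```

A damped rather than steady sinusoid is used here so that the correlation has a single clear maximum; a purely periodic signal would match equally well at lags one period apart.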

The cABR has high test–retest reliability (Russo, Nicol, Musacchia, & Kraus, 2004; Song, Nicol, & Kraus, IN PRESS), comparable with that of clicks and tone bursts (Gorga, Kaminski, Beauchaine, & Jesteadt, 1988; Hall & Mueller, 1997), and similarly, deviations on the order of fractions of milliseconds can be considered clinically significant (Banai et al., 2009; Basu, Krishnan, & Weber-Fox, 2009; Billiet & Bellis, IN PRESS; Cunningham, Nicol, Zecker, & Kraus, 2000; Wible, Nicol, & Kraus, 2004). The cABR is more effective than the click-evoked auditory brainstem response (ABR) for differentiating auditory function in typically developing children from children with auditory-based learning impairments (Cunningham, Nicol, Zecker, Bradlow, & Kraus, 2001; Song, Banai, Russo, & Kraus, 2006; Wible et al., 2004), poor SIN abilities (Anderson, Skoe, Chandrasekaran, & Kraus, 2010; Chandrasekaran, Hornickel, Skoe, Nicol, & Kraus, 2009; Hornickel, Skoe, Nicol, Zecker, & Kraus, 2009), poor reading skills (Banai et al., 2005; 2009; Billiet & Bellis, IN PRESS; McAnally & Stein, 1996), poor language skills (Basu et al., 2009), and poor temporal processing abilities (Johnson, Nicol, Zecker, & Kraus, 2007).

Considerable work in our lab has examined the brainstem’s response to the speech syllable /da/ because time-varying signals, in particular stop consonants, are known to be perceptually vulnerable in clinical populations (de Gelder & Vroomen, 1998; Tallal, 1980; Tallal & Stark, 1981; Tobey, Cullen, Rampp, & Fleischer-Gallagher, 1979; Townsend & Schwartz, 1981; Van Tasell, Hagen, Koblas, & Penner, 1982). The syllable consists of three time-domain components: the onset, the transition, and the steady state. The onset response corresponds to the onset of the consonant burst and is analogous to Wave V of the click-evoked ABR (Akhoun et al., 2008; Chandrasekaran & Kraus, 2010; Song et al., 2006). The transition region of the response, corresponding to the consonant–vowel formant transition, and the region corresponding to the steady state vowel are characterized by large periodic peaks occurring every 10 ms, paralleling the period of the 100-Hz fundamental frequency of the syllable, and smaller peaks corresponding to the harmonics.
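The 10-ms peak spacing follows directly from the 100-Hz fundamental, because the period is the reciprocal of the frequency. A small sketch of that arithmetic, with hypothetical helper names `period_ms` and `harmonic_frequencies`:

```python
def period_ms(f0_hz: float) -> float:
    """Return the period, in milliseconds, of a signal with fundamental f0_hz."""
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    return 1000.0 / f0_hz


def harmonic_frequencies(f0_hz: float, count: int) -> list:
    """Return the first `count` harmonics, where H1 is F0 and H2 is 2 * F0."""
    return [f0_hz * k for k in range(1, count + 1)]


print(period_ms(100.0))                # 10.0
print(harmonic_frequencies(100.0, 4))  # [100.0, 200.0, 300.0, 400.0]
```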

The system of afferent fibers carrying sensory information to the midbrain (inferior colliculus) and auditory cortex, and the extensive system of efferent fibers that synapse all along the auditory pathway extending to the outer hair cells in the cochlea (Gao & Suga, 2000), support the notion that the auditory brainstem is far more than a passive conduit of information to the cortex. Indeed, the efferent fiber count may actually exceed the number of afferent fibers. The importance of the efferent pathway (from the cortex to the inferior colliculus in the brainstem) to auditory learning was demonstrated in a sound localization experiment with ferrets (Bajo, Nodal, Moore, & King, 2010). Sound localization relies on binaural cues; however, adaptation to altered cues is possible after a period of training (A. J. King et al., 2007). Bajo et al. examined the effect of pharmacological inactivation of the corticocollicular pathway on the ferret’s ability to relearn localization following occlusion of one ear. They found that auditory learning in the experimental animals was significantly impaired when comparing their performance with that of controls, thus reinforcing the role of the corticocollicular pathway for auditory learning. (See Tzounopoulos & Kraus, 2009, for review of subcortical experience-dependence.)

Speech-in-Noise Perception

Speech-in-noise perception depends on many factors involving interaction between sensory and cognitive processes. To focus on a particular speaker, the listener must form a perceptual object that enables the listener to distinguish the voice of a target speaker from other sounds (Shinn-Cunningham & Best, 2008). Object formation is determined by three primary aspects of the stimulus: location (Bronkhorst, 2000; Cherry, 1953), timing (Shinn-Cunningham & Best, 2008), and pitch (Bregman & McAdams, 1994; Brokx & Nooteboom, 1982; Darwin & Hukin, 2000; Moore, Peters, & Glasberg, 1985; Parikh & Loizou, 2005; Sayles & Winter, 2008). Pitch, derived primarily from the fundamental frequency (F0) and its second harmonic (H2) (Meddis & O’Mard, 1997), aids in voice tagging, enabling the listener to focus on the target speaker (Chandrasekaran et al., 2009).

Object formation is an important component of auditory stream segregation, or the ability to extract meaning from one particular sound source amid a background of competing sounds (Bee & Klump, 2004; Bregman, 1990; Micheyl et al., 2007; Micheyl, Tian, Carlyon, & Rauschecker, 2005; Snyder & Alain, 2007). Auditory stream segregation is mediated by attention and short-term memory (Cusack, Deeks, Aikman, & Carlyon, 2004; McLachlan & Wilson, 2010; Shinn-Cunningham & Best, 2008; Snyder, Carter, Lee, Hannon, & Alain, 2008; Sussman, Ritter, & Vaughan, 1998). For example, attention was found to modulate the streaming of alternating low- and high-frequency tones, such that the mismatch negativity response (a neural measure of auditory discrimination) was only obtained when participants were instructed to attend to high-pitched tone sequences and note deviants (low-pitched tone sequences) within this stream (Sussman et al., 1998). McLachlan and Wilson (2010) have proposed a model that engages the mechanisms of attention and short-term memory to excite an identification sequence hierarchy, leading to modulation of thalamus and inferior colliculus spectrotemporal receptive fields to control auditory streaming.

Neurophysiologic evidence of stream segregation has been demonstrated at the level of the brainstem in the cochlear nucleus (Pressnitzer, Sayles, Micheyl, & Winter, 2008). Through single-unit recordings in the cochlear nucleus of the guinea pig, Pressnitzer et al. found evidence of stream segregation of pure tones of different frequencies presented in a repeating sequence of ABA triplets. Moreover, the neurometric functions derived from these recordings accurately predicted behavioral streaming in humans.

The neural mechanisms of SIN perception can also be examined in cABRs in humans. Brainstem encoding of stimulus elements (pitch, timing, and timbre) can be considered on a continuum of performance with impaired representation in poor readers (Banai et al., 2009), in children with autism spectrum disorder (Russo, Nicol, Trommer, Zecker, & Kraus, 2009), and in children with specific language impairment (Basu et al., 2009) or auditory processing disorders (Billiet & Bellis, IN PRESS) at one end of the spectrum, typically developing children in the middle of the spectrum, and auditory experts such as musicians (Bidelman, Gandour, & Krishnan, IN PRESS; Kraus & Chandrasekaran, 2010; Musacchia, Sams, Skoe, & Kraus, 2007; Parbery-Clark, Skoe, & Kraus, 2009; Wong, Skoe, Russo, Dees, & Kraus, 2007) at the other end. Here we review subcortical encoding in children and adults who have been grouped on the basis of their SIN perception ability to improve our understanding of the biological factors contributing to SIN perception and to reach clinically viable strategies for the objective assessment of this key communication function.

Subcortical Temporal Representation and SIN Perception

Consonant Differentiation

Temporal precision is a great strength of the auditory brainstem. Stop consonants (e.g., in the syllables /ba, da, ga/) are especially vulnerable to misperception in noise (Miller & Nicely, 1955; Tallal & Stark, 1981). These syllables differ in the time-varying trajectory of the second and third formant frequencies (F2 and F3) during the formant transition period from the consonant to the vowel. Timing differences evoked by acoustic differences in these syllables are reflected in the cABR, with /ga/ having the shortest response latencies, /ba/ having the longest latencies, and /da/ having latencies in between the two (Johnson et al., 2008), as expected given the tonotopicity of the brainstem nuclei (Gorga et al., 1988). See Figure 2.

Figure 2.

Stimulus timing and responses are presented for /ba/ (blue), /da/ (red), and /ga/ (green) syllables

Note. Time-domain grand average responses for 20 typically developing children (bottom panels) demonstrate timing differences in the response that reflect acoustic differences in the stimuli (top panels). The 52- to 57-ms region of the response is magnified to highlight latency differences that are present in the responses. The scatterplot on the right demonstrates a relationship between subcortical differentiation scores and speech-in-noise performance on the Hearing in Noise Test (HINT; r = .492, p = .001). Modified from Hornickel et al. (2009).

Comparison of response spectra to a /da/ syllable presented in background noise in top (red) and bottom (black) SIN perceivers. Group differences were found at the F0 of the stimulus (100 Hz), with the top SIN group having higher amplitudes than the bottom SIN group (p = .0351).

Hornickel et al. (2009) examined the relationship between stop consonant differentiation scores and SIN perception in children with a wide range of SIN abilities. Differentiation scores were calculated for the peaks in the brainstem response, taking into account the presence of the expected timing patterns for /ba/, /da/, and /ga/, as well as the magnitude of the latency differences. A relationship was found between stop consonant differentiation scores and scores on the Hearing in Noise Test (HINT; Biologic Systems Corp., Mundelein, IL), a behavioral measure of SIN perception (Figure 2), and children who had better SIN perception also had the expected latency patterns for the three consonant–vowel syllables (r = .492; p = .001). In addition, when the children were divided into thirds based on HINT performance, the top third of HINT performers had significantly better stop consonant differentiation scores than the bottom third, t(26) = 2.287, p = .031.
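The two ingredients named above (whether the expected /ga/ < /da/ < /ba/ latency order is present, and how large the latency differences are) can be sketched in code. This is only an illustration under stated assumptions: it is not the differentiation score published by Hornickel et al. (2009), and the function name `differentiation_summary` and the latency values are invented for the example.

```python
def differentiation_summary(peaks):
    """Summarize stop consonant differentiation across response peaks.

    `peaks` is a list of dicts holding the latency (ms) of one response peak
    for each syllable, e.g. {"ga": 52.0, "da": 52.25, "ba": 52.5}.
    Returns (fraction of peaks with the expected ga < da < ba order,
             mean ba-minus-ga latency span in ms over those peaks).
    """
    if not peaks:
        raise ValueError("peaks must not be empty")
    ordered = [p for p in peaks if p["ga"] < p["da"] < p["ba"]]
    fraction = len(ordered) / len(peaks)
    mean_span = (sum(p["ba"] - p["ga"] for p in ordered) / len(ordered)
                 if ordered else 0.0)
    return fraction, mean_span


# Made-up latencies: the first peak shows the expected order, the second does not.
example_peaks = [
    {"ga": 52.0, "da": 52.25, "ba": 52.5},
    {"ga": 62.0, "da": 62.5, "ba": 62.25},
]
print(differentiation_summary(example_peaks))  # (0.5, 0.5)
```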

Noise-Induced Timing Delays

Background noise has fairly predictable effects on the ABR, including latency increases and amplitude decreases (Burkard & Sims, 2002; Russo et al., 2004). The effect of background babble on the cABR was examined in children with good and poor SIN perception (Anderson et al., 2010). The noise effect (delay in timing) was greatest shortly after stimulus onset, delaying the response to the onset by as much as a full millisecond. In children with good SIN perception (as measured by the HINT), the latency shifts leveled off by 40 ms into the response, but the shifts did not level off in children with poor SIN perception until approximately 60 ms (Figure 4). This region, which corresponds to the formant transition region of the stimulus, is the most perceptually vulnerable segment of the response (Hedrick & Younger, 2007; Nábĕlek, Czyzewski, & Crowley, 1994; Tallal & Stark, 1981). Overall, noise caused inordinate timing delays in the transition region of the response in children with poor SIN perception, indicating that a temporal precision deficit in the auditory brainstem is a factor in difficulties with listening in background noise.
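A noise-induced latency shift of the kind described above is simply the latency of a peak in noise minus its latency in quiet. The sketch below assumes peak latencies have already been picked; the function name `latency_shifts`, the peak labels, and the values are hypothetical and are not data from Anderson et al. (2010).

```python
def latency_shifts(quiet_ms, noise_ms):
    """Return noise-minus-quiet latency (ms) for every peak present in both conditions."""
    return {peak: noise_ms[peak] - quiet_ms[peak]
            for peak in quiet_ms if peak in noise_ms}


# Made-up peak latencies in milliseconds.
quiet = {"V": 6.5, "A": 7.5}
noise = {"V": 7.5, "A": 8.25}
print(latency_shifts(quiet, noise))  # {'V': 1.0, 'A': 0.75}
```

Positive values indicate that the peak occurred later in noise than in quiet.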

Figure 4.

Effects of noise on brainstem responses in children with good and poor speech-in-noise (SIN) perception

Note. The effects are most evident in the formant transition region (A, boxed) of the response from 30 to 60 ms in the grand average waveforms (N = 66 children, B and C). Greater noise-induced latency shifts were noted in the children with poor speech-in-noise perception relative to children with good speech-in-noise perception (p < .01; D). Modified from Anderson et al. (2010).

Subcortical Spectral Representation and SIN Perception

The role of pitch in stream segregation is well documented (Hedrick & Younger, 2007; Nábĕlek et al., 1994; Oxenham, 2008), and performance on segregation tasks has been shown to improve as F0 differences increase (Assmann & Summerfield, 1987; de Cheveigne, 1997; Scheffers, 1983). The strength of the F0 and lower harmonics, particularly H2, important object-grouping cues, underlies successful perception of speech in noise (Bregman, 1990; Brokx & Nooteboom, 1982; Darwin & Hukin, 2000; Gaudrain, Grimault, Healy, & Bera, 2008; Moore et al., 1985; Parikh & Loizou, 2005; Summers & Leek, 1998; Vongpaisal, Trehub, & Schellenberg, 2006). The subcortical representation of pitch turns out to be an important factor in SIN perception, with greater subcortical representation of the F0 patterning with better SIN perception in young adults (Song, Skoe, Banai, & Kraus, IN PRESS). Speech-in-noise measures derived from the Quick Speech-in-Noise Test (QuickSIN; Etymotic Research, Elk Grove, IL; Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) correlated with the magnitude of F0 representation in response to /da/ presented in background babble (rs = .523; p = .031), with larger F0 representation patterning with better SIN perception (Figure 3). This relationship was particularly pronounced in the response to the formant (time-varying) region of the syllable, the region that is considered perceptually vulnerable to the effects of background noise (Hedrick & Younger, 2007; Nábĕlek et al., 1994; Tallal & Stark, 1981). Notably, greater strength in the representation of pitch cues (F0 and H2) is linked to better speech perception ability in children (Anderson, Skoe, Chandrasekaran, Zecker, & Kraus, IN PRESS).
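The magnitude of F0 and H2 representation is read from the response spectrum. As a minimal sketch of that idea, the code below estimates the amplitude at one frequency with a single-bin discrete Fourier transform applied to a synthetic waveform. The function name `spectral_amplitude`, the sampling rate, and the toy signal are assumptions; the cited studies may have used different windowing and analysis settings.

```python
import cmath
import math


def spectral_amplitude(signal, fs_hz, freq_hz):
    """Estimate the amplitude of the freq_hz component using a single-bin DFT."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
              for i, x in enumerate(signal))
    # Factor of 2 converts the one-sided bin magnitude to a sinusoid amplitude.
    return 2.0 * abs(acc) / n


# Toy waveform: 100 ms containing F0 = 100 Hz (amplitude 1.0) and H2 = 200 Hz (amplitude 0.5).
FS_HZ = 2000
toy = [1.0 * math.sin(2 * math.pi * 100 * i / FS_HZ)
       + 0.5 * math.sin(2 * math.pi * 200 * i / FS_HZ)
       for i in range(200)]

f0_amp = spectral_amplitude(toy, FS_HZ, 100.0)
h2_amp = spectral_amplitude(toy, FS_HZ, 200.0)
print(round(f0_amp, 6), round(h2_amp, 6))  # 1.0 0.5
```

The toy signal spans a whole number of cycles of both components, so the estimate is exact up to rounding; real recordings would generally need windowing to limit spectral leakage.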

Figure 3.

The strength of F0 encoding is related to Quick Speech-in-Noise test scores (QuickSIN: rs = .523, p = .031). Similar group differences for the F0 have been found in children with good and poor SIN perception (Anderson et al., IN PRESS). Adapted from Song et al. (IN PRESS).

Subcortical, On-Line Statistical Learning and SIN Perception

The importance of taking advantage of statistical regularities in signals is demonstrated when an infant learns to segment words from running speech. Word segmentation, an important aspect of language acquisition, is based on the probability of adjacent speech sounds occurring within the same word or between words, and this statistical learning occurs remarkably quickly, within 2 minutes (Aslin, Saffran, & Newport, 1998; Kirkham, Slemmer, & Johnson, 2002; Saffran, Aslin, & Newport, 1996). Adaptive (on-line) sensory processing is a hallmark of an efficient sensory system. Several studies have documented on-line plasticity in the spectrotemporal receptive fields of auditory pathway neurons (Atiani, Elhilali, David, Fritz, & Shamma, 2009; Fritz, Elhilali, & Shamma, 2005; Fritz, Elhilali, David, & Shamma, 2007; McAlpine, Martin, Mossop, & Moore, 1997). Gain and shape changes in receptive fields in the ferret primary auditory cortex vary systematically with task difficulty in a tone-in-noise task, resulting in an enhanced representation of the target tone (Atiani et al., 2009). Furthermore, on-line adaptation to time-compressed speech has been observed in humans in a functional magnetic resonance imaging study, with an association between rapid task learning and increased activation of the right and left auditory association cortices and left motor cortex (Adank & Devlin, 2010).

On-line plasticity occurs in the auditory brainstem responses of children with a wide range of SIN perception abilities (Chandrasekaran et al., 2009). In typically developing children, a sharpening of brainstem responses was seen when responses to speech syllables were recorded in a predictable condition versus a variable condition (Figure 5). Importantly, the magnitude of benefit found in the regularly repeating condition correlated positively with SIN perception, as measured by the HINT. The enhancement of H2 in the predictable condition, associated with SIN perceptual ability, was interpreted as providing a mechanism for more effective “tagging” of a speaker’s voice.
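The relationship described above pairs a per-child difference score (response in the predictable condition minus response in the variable condition) with a behavioral SIN measure. The following sketch shows one way to compute such difference scores and a Pearson correlation. It is not the analysis pipeline of Chandrasekaran et al. (2009): the function names are hypothetical, and the numbers are invented, deliberately perfectly linear values used only as a sanity check.

```python
import math


def difference_scores(repetitive, variable):
    """Return the repetitive-minus-variable value for each participant."""
    if len(repetitive) != len(variable):
        raise ValueError("both conditions need one value per participant")
    return [r - v for r, v in zip(repetitive, variable)]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Invented H2 amplitudes (arbitrary units) for four hypothetical participants.
h2_repetitive = [0.5, 0.75, 1.0, 1.5]
h2_variable = [0.5, 0.5, 0.5, 0.75]
# Invented behavioral scores, coded so that higher means better SIN perception.
sin_scores = [1.0, 2.0, 3.0, 4.0]

enhancement = difference_scores(h2_repetitive, h2_variable)
r = pearson_r(enhancement, sin_scores)
```

Because the invented values lie on a straight line, `r` is 1 here; real data would give a weaker correlation, such as the r = .486 reported in Figure 5.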

Figure 5.

Grand average response waveforms of typically developing children (N = 21) in response to repetitive (red) versus variable (black) presentation of speech syllable /da/ (top panel)

Note. The black rectangle outlines the formant transition region from 20 to 60 ms. Grand average spectra of repetitive versus variable presentations demonstrate enhanced magnitude of second and fourth harmonics in the repetitive condition (bottom left). The difference scores between repetitive versus variable conditions relate to Hearing in Noise Test scores (r = .486; p = .025; bottom right). Modified from Chandrasekaran et al. (2009).

Effects of Auditory Training on Brainstem Encoding and SIN Perception

Evidence of brainstem plasticity has been demonstrated in humans. Children with learning disabilities had improved brainstem responses to speech presented in a noise background after participating in an auditory training program (Earobics; Scientific Learning, Oakland, CA); moreover, the cABR prior to training was predictive of the benefit obtained from training (C. King, Warrier, Hayes, & Kraus, 2002; Russo, Nicol, Zecker, Hayes, & Kraus, 2005). Young adult native speakers of English showed greater accuracy in brainstem pitch tracking to a pitch contour (a nonnative speech cue) following training on a word identification task incorporating lexical pitch (Song, Skoe, Wong, & Kraus, 2008). Learning-associated brainstem changes were found in subcortical pathways revealed by otoacoustic emissions (de Boer & Thornton, 2008). These studies support the malleability of brainstem responses to complex sounds.

Musicians are auditory experts, and this expertise develops from years of experience selectively attending to relevant cues in a complex soundscape (e.g., melodies from background harmonics, the sound of one’s own instrument). This experience can be considered a form of auditory stream segregation and object formation. Musicians have a behavioral advantage for SIN perception, having better scores on the HINT and the QuickSIN, and this benefit extends to working memory (Parbery-Clark, Skoe, Lam, & Kraus, 2009). In addition, years of consistent musical practice correlate positively with both QuickSIN and working memory scores (Figure 6). This study was followed by a comparison of cABRs with speech stimuli in quiet and in noise between musicians and nonmusicians. Musicians had less degradation of responses in background noise than nonmusicians (Figure 7), indicating that musical experience results in more robust subcortical speech representation in background noise. Specifically, musicians showed smaller noise-induced timing delays in response to the onset and formant transition portion of the syllable /da/, consistent with parallel findings in school-age children discussed above (Anderson et al., 2010). The extent of timing disruption was correlated with SIN ability (less delay, better SIN perception on HINT measures). This enhancement of responses may be the result of top-down, corticofugal sharpening of relevant acoustic features.

Figure 6.

Years of musical practice relate to Quick Speech-in-Noise Test (QuickSIN) scores (r = −.580; p = .001) and working memory scores

Note. Composite score based on the Digits Reversed and Auditory Working Memory (WM) subtests of the Woodcock–Johnson III Test of Cognitive Abilities (r = .614; p < .001). Adapted from Parbery-Clark et al. (2009).

Figure 7.

Comparison of brainstem responses to the speech syllable /da/ in quiet and babble noise conditions in musicians versus nonmusicians

Note. The selected peaks (onset and transition) are circled (A). Noise delays peak latencies (B), particularly in the onset and transition portions of the response. The musicians (red) show significantly shorter timing delays in noise than nonmusicians (black) for the onset (C) and transition peaks (D). The latency of the onset (E) and transition peaks (F) is correlated with SIN perception. Modified from Parbery-Clark et al. (2009).

Short-term auditory training is effective in improving SIN perception. The Listening and Communication Enhancement (Neurotone, Redwood City, CA) program uses adaptive computerized auditory training to improve SIN performance in people who have hearing loss or have difficulty hearing in background noise. After using this program 30 minutes a day, 5 times a week, for 4 weeks, participants’ scores on the HINT and QuickSIN improved. Improvement was also documented on self-assessment measures, including the Hearing Handicap Inventory for the Elderly/Adults (Ventry & Weinstein, 1982) and the Communication Scale for Older Adults (Kaplan, Bally, Brandt, Busacco, & Pray, 1997; see also Sweetow & Sabes, 2006).

Training with an auditory-based cognitive training program (Brain Fitness Program, Posit Science, San Francisco, CA) has revealed improvements in auditory memory and attention in adults aged 65 years and older with no history of cognitive impairment. Given the previously established relationships between auditory working memory and SIN perception (Parbery-Clark, Skoe, Lam, & Kraus, 2009) and between memory, attention, and auditory stream segregation (Shinn-Cunningham & Best, 2008), this kind of training program may benefit SIN perception as well as auditory cognitive skills. We argue that training that engages both sensory and cognitive aspects of SIN perception is likely to be the most effective in driving SIN improvements (Kraus & Chandrasekaran, 2010). Germane to the topic of this review, brainstem responses can reveal this cognitive-sensory interplay. Efforts are currently under way in our lab to examine the impact of auditory training on neural and behavioral measures of SIN perception.

Case Study

Figure 8 showcases how the cABR can provide an objective clinical measure of SIN perception and a metric for predicting treatment benefit and for documenting training-related changes. The figure illustrates preliminary data from an ongoing study of the relationship between subcortical processing and SIN perception. We feature two older adults who self-report good SIN perception (age 61 years) and poor SIN perception (age 62 years) and have essentially identical normal audiograms. They differ on a behavioral measure of SIN (HINT scores were −3.6 vs. −1.0) and on a subset of four background-noise-related questions on the Speech, Spatial, and Qualities of Hearing Scale (Gatehouse & Noble, 2004; mean score of 8.25/10 vs. 4/10). We obtained brainstem responses to a 40-ms /da/ syllable using BioMARK (Biological Marker of Auditory Processing), a clinical technology available as an addition to the Navigator Pro-Auditory Evoked Potential hardware (Natus Medical Incorporated, San Carlos, CA). The individual with good SIN perception has earlier peak latencies and greater subcortical F0 representation than the individual with poor SIN perception. The differences in the brainstem responses provide quite a marked contrast and are consistent with the participants’ own perception of abilities.

Figure 8.

Case study demonstrating the cABRs of 2 participants with nearly identical audiograms (top left) but differing ability to hear in background noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; 8.5/10 vs. 4/10 on four background-noise-related questions; bottom left)

Note. The participant with better SIN (Hearing in Noise Test [HINT]: −3.6, age 61 years) has faster brainstem timing (top right) and greater representation of pitch (F0 and H2; bottom right) than the participant with poorer SIN (HINT: −1.0, age 62 years). The solid gray line in the fast Fourier transform plot represents the grand average response of 20 normal-hearing participants ages 60 years and older, and the dotted gray line represents the standard error.

Conclusions

Speech-in-noise perception is influenced by many factors, and the brainstem encoding of spectral (F0 and H2) and temporal features of speech appears to be a significant factor in the ability to successfully communicate in background noise. The cABR is objective, highly reliable, and can be used to examine the brainstem encoding of temporal and spectral features of speech and other complex sounds. Brainstem processing is experience-dependent, and a sharpening of responses occurs with short- and long-term training as well as on-line adaptation to stimulus regularities. The cABR continues to inform us of the biology underlying SIN perception and promises to be a useful clinical tool in the assessment and management of SIN perception difficulties.

Acknowledgments

We would especially like to thank the members of the Auditory Neuroscience Laboratory and our participants for their contributions to this work.

Funding

The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article:

This work was funded by the National Institutes of Health (RO1 DC01510) and the National Science Foundation SGER (0842376).

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interests with respect to the authorship and/or publication of this article.

1. See demonstration on home page: http://www.brainvolts.northwestern.edu.

References

  1. Adank P, Devlin JT. On-line plasticity in spoken sentence comprehension: Adapting to time-compressed speech. NeuroImage. 2010;49:1124–1132. doi: 10.1016/j.neuroimage.2009.07.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Akhoun I, Gallégo S, Moulin A, Ménard M, Veuillet E, Berger-Vachon C, Collet L, Thai-Van H. The temporal relationship between speech auditory brainstem responses and the acoustic pattern of the phoneme /ba/ in normal-hearing adults. Clinical Neurophysiology. 2008;119:922–933. doi: 10.1016/j.clinph.2007.12.010. [DOI] [PubMed] [Google Scholar]
  3. Anderson S, Skoe E, Chandrasekaran B, Kraus N. Neural timing is linked to speech perception in noise. Journal of Neuroscience. 2010;30:4922–4926. doi: 10.1523/JNEUROSCI.0107-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Anderson S, Skoe E, Chandrasekaran B, Kraus N. Brainstem correlates of speech-in-noise perception in children. Hearing Research. doi: 10.1016/j.heares.2010.08.001. IN PRESS. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Aslin RN, Saffran JR, Newport EL. Computation of conditional probability statistics by 8-month-old infants. Psychological Science. 1998;9:321–324.
  6. Assmann PF, Summerfield Q. Perceptual segregation of concurrent vowels. Journal of the Acoustical Society of America. 1987;82:S120.
  7. Atiani S, Elhilali M, David SV, Fritz JB, Shamma SA. Task difficulty and performance induce diverse adaptive patterns in gain and shape of primary auditory cortical receptive fields. Neuron. 2009;61:467–480. doi: 10.1016/j.neuron.2008.12.027.
  8. Bajo VM, Nodal FR, Moore DR, King AJ. The descending corticocollicular pathway mediates learning-induced auditory plasticity. Nature Neuroscience. 2010;13:253–260. doi: 10.1038/nn.2466.
  9. Banai K, Nicol T, Zecker S, Kraus N. Brainstem timing: Implications for cortical processing and literacy. Journal of Neuroscience. 2005;25:9850–9857. doi: 10.1523/JNEUROSCI.2373-05.2005.
  10. Banai K, Hornickel J, Skoe E, Nicol T, Zecker S, Kraus N. Reading and subcortical auditory function. Cerebral Cortex. 2009;19:2699–2707. doi: 10.1093/cercor/bhp024.
  11. Basu M, Krishnan A, Weber-Fox C. Brainstem correlates of temporal auditory processing in children with specific language impairment. Developmental Science. 2009;13:77–91. doi: 10.1111/j.1467-7687.2009.00849.x.
  12. Bee MA, Klump GM. Primitive auditory stream segregation: A neurophysiological study in the songbird forebrain. Journal of Neurophysiology. 2004;92:1088–1104. doi: 10.1152/jn.00884.2003.
  13. Bidelman G, Gandour J, Krishnan A. Cross-domain effects of music and language experience on the representation of pitch in the human auditory brainstem. Journal of Cognitive Neuroscience. doi: 10.1162/jocn.2009.21362. IN PRESS.
  14. Billiet C, Bellis TJ. The relationship between brainstem temporal processing and performance on tests of central auditory function in children with reading disorders. Journal of Speech, Language, and Hearing Research. doi: 10.1044/1092-4388(2010/09-0239). IN PRESS.
  15. Bregman AS. Auditory scene analysis. Cambridge: MIT Press; 1990.
  16. Bregman AS, McAdams S. Auditory scene analysis: The perceptual organization of sound. Journal of the Acoustical Society of America. 1994;95:1177–1178.
  17. Brokx JP, Nooteboom S. Intonation and the perceptual separation of simultaneous voices. Journal of Phonetics. 1982;10:23–26.
  18. Bronkhorst A. The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions. Acta Acustica. 2000;86:117–128.
  19. Burkard RF, Sims D. A comparison of the effects of broadband masking noise on the auditory brainstem response in young and older adults. American Journal of Audiology. 2002;11:13–22. doi: 10.1044/1059-0889(2002/004).
  20. Chandrasekaran B, Hornickel J, Skoe E, Nicol TG, Kraus N. Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: Implications for developmental dyslexia. Neuron. 2009;64:311–319. doi: 10.1016/j.neuron.2009.10.006.
  21. Chandrasekaran B, Kraus N. The scalp-recorded brainstem response to speech: Neural origins and plasticity. Psychophysiology. 2010;47:236–246. doi: 10.1111/j.1469-8986.2009.00928.x.
  22. Cherry EC. Some experiments on the recognition of speech, with one and with two ears. Journal of the Acoustical Society of America. 1953;25:975–979.
  23. Cunningham J, Nicol TG, Zecker SG, Bradlow AR, Kraus N. Neurobiologic responses to speech in noise in children with learning problems: Deficits and strategies for improvement. Clinical Neurophysiology. 2001;112:758–767. doi: 10.1016/s1388-2457(01)00465-5.
  24. Cunningham J, Nicol T, Zecker S, Kraus N. Speech-evoked neurophysiologic responses in children with learning problems: Development and behavioral correlates of perception. Ear and Hearing. 2000;21:554–568. doi: 10.1097/00003446-200012000-00003.
  25. Cusack R, Deeks J, Aikman G, Carlyon R. Effects of location, frequency region, and time course of selective attention on auditory scene analysis. Journal of Experimental Psychology: Human Perception and Performance. 2004;30:643–656. doi: 10.1037/0096-1523.30.4.643.
  26. Darwin CJ, Hukin RW. Effectiveness of spatial cues, prosody, and talker characteristics in selective attention. Journal of the Acoustical Society of America. 2000;107:970–977. doi: 10.1121/1.428278.
  27. de Boer J, Thornton ARD. Neural correlates of perceptual learning in the auditory brainstem: Efferent activity predicts and reflects improvement at a speech-in-noise discrimination task. Journal of Neuroscience. 2008;28:4929–4937. doi: 10.1523/JNEUROSCI.0902-08.2008.
  28. de Cheveigne A. Concurrent vowel identification. III. A neural model of harmonic interference cancellation. Journal of the Acoustical Society of America. 1997;101:2857–2865.
  29. de Gelder B, Vroomen J. Impaired speech perception in poor readers: Evidence from hearing and speech reading. Brain & Language. 1998;64:269–281. doi: 10.1006/brln.1998.1973.
  30. Fritz JB, Elhilali M, David SV, Shamma SA. Does attention play a role in dynamic receptive field adaptation to changing acoustic salience in A1? Hearing Research. 2007;229:186–203. doi: 10.1016/j.heares.2007.01.009.
  31. Fritz J, Elhilali M, Shamma S. Active listening: Task-dependent plasticity of spectrotemporal receptive fields in primary auditory cortex. Hearing Research. 2005;206:159–176. doi: 10.1016/j.heares.2005.01.015.
  32. Galbraith GC, Arbagey PW, Branski R, Comerci N, Rector PM. Intelligible speech encoded in the human brain stem frequency-following response. Neuroreport. 1995;6:2363–2367. doi: 10.1097/00001756-199511270-00021.
  33. Gao E, Suga N. Experience-dependent plasticity in the auditory cortex and the inferior colliculus of bats: Role of the corticofugal system. Proceedings of the National Academy of Sciences of the United States of America. 2000;97:8081–8086. doi: 10.1073/pnas.97.14.8081.
  34. Gatehouse S, Noble W. The speech, spatial and qualities of hearing scale (SSQ). International Journal of Audiology. 2004;43:85–99. doi: 10.1080/14992020400050014.
  35. Gaudrain E, Grimault N, Healy EW, Bera JC. Streaming of vowel sequences based on fundamental frequency in a cochlear-implant simulation. Journal of the Acoustical Society of America. 2008;124:3076–3087. doi: 10.1121/1.2988289.
  36. Gorga MP, Kaminski JR, Beauchaine KA, Jesteadt W. Auditory brainstem responses to tone bursts in normally hearing subjects. Journal of Speech, Language, and Hearing Research. 1988;31:87–97. doi: 10.1044/jshr.3101.87.
  37. Hall JW, Mueller HG. Audiologists’ desk reference. Vol. 1. San Diego, CA: Singular; 1997.
  38. Hedrick MS, Younger MS. Perceptual weighting of stop consonant cues by normal and impaired listeners in reverberation versus noise. Journal of Speech, Language and Hearing Research. 2007;50:254–269. doi: 10.1044/1092-4388(2007/019).
  39. Hornickel J, Skoe E, Nicol T, Zecker S, Kraus N. Subcortical differentiation of stop consonants relates to reading and speech-in-noise perception. Proceedings of the National Academy of Sciences of the United States of America. 2009;106:13022–13027. doi: 10.1073/pnas.0901123106.
  40. Johnson KL, Nicol T, Zecker SG, Bradlow AR, Skoe E, Kraus N. Brainstem encoding of voiced consonant-vowel stop syllables. Clinical Neurophysiology. 2008;119:2623–2635. doi: 10.1016/j.clinph.2008.07.277.
  41. Johnson K, Nicol TG, Zecker SG, Kraus N. Auditory brainstem correlates of perceptual timing deficits. Journal of Cognitive Neuroscience. 2007;19:376–385. doi: 10.1162/jocn.2007.19.3.376.
  42. Kaplan H, Bally S, Brandt F, Busacco D, Pray J. Communication scale for older adults (CSOA). Journal of the American Academy of Audiology. 1997;8:203–217.
  43. Killion M, Niquette P, Gudmundsen G, Revit L, Banerjee S. Development of a quick speech-in-noise test for measuring signal-to-noise ratio loss in normal-hearing and hearing-impaired listeners. Journal of the Acoustical Society of America. 2004;116:2395–2405. doi: 10.1121/1.1784440.
  44. King AJ, Bajo VM, Bizley JK, Campbell RAA, Nodal FR, Schulz AL, Schnupp JWH. Physiological and behavioral studies of spatial coding in the auditory cortex. Hearing Research. 2007;229:106–115. doi: 10.1016/j.heares.2007.01.001.
  45. King C, Warrier CM, Hayes E, Kraus N. Deficits in auditory brainstem pathway encoding of speech sounds in children with learning problems. Neuroscience Letters. 2002;319:111–115. doi: 10.1016/s0304-3940(01)02556-3.
  46. Kirkham NZ, Slemmer JA, Johnson SP. Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition. 2002;83:B35–B42. doi: 10.1016/s0010-0277(02)00004-5.
  47. Kraus N, Chandrasekaran B. Music training for the development of auditory skills. Nature Reviews Neuroscience. 2010;11:599–605. doi: 10.1038/nrn2882.
  48. McAlpine D, Martin R, Mossop J, Moore D. Response properties of neurons in the inferior colliculus of the monaurally deafened ferret to acoustic stimulation of the intact ear. Journal of Neurophysiology. 1997;78:767–779. doi: 10.1152/jn.1997.78.2.767.
  49. McAnally K, Stein J. Auditory temporal coding in dyslexia. Proceedings: Biological Sciences. 1996;263:961–965. doi: 10.1098/rspb.1996.0142.
  50. McLachlan N, Wilson S. The central role of recognition in auditory perception: A neurobiological model. Psychological Review. 2010;117:175–196. doi: 10.1037/a0018063.
  51. Meddis R, O’Mard L. A unitary model of pitch perception. Journal of the Acoustical Society of America. 1997;102:1811–1820. doi: 10.1121/1.420088.
  52. Micheyl C, Carlyon RP, Gutschalk A, Melcher JR, Oxenham AJ, Rauschecker JP, Wilson EC. The role of auditory cortex in the formation of auditory streams. Hearing Research. 2007;229:116–131. doi: 10.1016/j.heares.2007.01.007.
  53. Micheyl C, Tian B, Carlyon RP, Rauschecker JP. Perceptual organization of tone sequences in the auditory cortex of awake macaques. Neuron. 2005;48:139–148. doi: 10.1016/j.neuron.2005.08.039.
  54. Miller GA, Nicely PE. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America. 1955;27:338–352.
  55. Moore BC, Peters RW, Glasberg BR. Thresholds for the detection of inharmonicity in complex tones. Journal of the Acoustical Society of America. 1985;77:1861–1867. doi: 10.1121/1.391937.
  56. Musacchia G, Sams M, Skoe E, Kraus N. Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proceedings of the National Academy of Sciences. 2007;104:15894–15898. doi: 10.1073/pnas.0701498104.
  57. Nábělek AK, Czyzewski Z, Crowley H. Cues for perception of the diphthong /i/ in either noise or reverberation. Part I. Duration of the transition. Journal of the Acoustical Society of America. 1994;95:2681–2693. doi: 10.1121/1.409837.
  58. Oxenham AJ. Pitch perception and auditory stream segregation: Implications for hearing loss and cochlear implants. Trends in Amplification. 2008;12:316–331. doi: 10.1177/1084713808325881.
  59. Parbery-Clark A, Skoe E, Kraus N. Musical experience limits the degradative effects of background noise on the neural processing of sound. Journal of Neuroscience. 2009;29:14100–14107. doi: 10.1523/JNEUROSCI.3256-09.2009.
  60. Parbery-Clark A, Skoe E, Lam C, Kraus N. Musician enhancement for speech-in-noise. Ear and Hearing. 2009;30:653–661. doi: 10.1097/AUD.0b013e3181b412e9.
  61. Parikh G, Loizou PC. The influence of noise on vowel and consonant cues. Journal of the Acoustical Society of America. 2005;118:3874–3888. doi: 10.1121/1.2118407.
  62. Pressnitzer D, Sayles M, Micheyl C, Winter IM. Perceptual organization of sound begins in the auditory periphery. Current Biology. 2008;18:1124–1128. doi: 10.1016/j.cub.2008.06.053.
  63. Russo N, Nicol T, Musacchia G, Kraus N. Brainstem responses to speech syllables. Clinical Neurophysiology. 2004;115:2021–2030. doi: 10.1016/j.clinph.2004.04.003.
  64. Russo N, Nicol T, Trommer B, Zecker S, Kraus N. Brainstem transcription of speech is disrupted in children with autism spectrum disorders. Developmental Science. 2009;12:557–567. doi: 10.1111/j.1467-7687.2008.00790.x.
  65. Russo N, Nicol T, Zecker S, Hayes E, Kraus N. Auditory training improves neural timing in the human brainstem. Behavioural Brain Research. 2005;156:95–103. doi: 10.1016/j.bbr.2004.05.012.
  66. Saffran JR, Aslin RN, Newport EL. Statistical learning by 8-month-old infants. Science. 1996;274:1926–1928. doi: 10.1126/science.274.5294.1926.
  67. Sayles M, Winter IM. Ambiguous pitch and the temporal representation of inharmonic iterated rippled noise in the ventral cochlear nucleus. Journal of Neuroscience. 2008;28:11925–11938. doi: 10.1523/JNEUROSCI.3137-08.2008.
  68. Scheffers MTM. Simulation of auditory analysis of pitch: An elaboration on the DWS pitch meter. Journal of the Acoustical Society of America. 1983;74:1716–1725. doi: 10.1121/1.390280.
  69. Shinn-Cunningham BG, Best V. Selective attention in normal and impaired hearing. Trends in Amplification. 2008;12:283–299. doi: 10.1177/1084713808325306.
  70. Skoe E, Kraus N. Auditory brain stem response to complex sounds: A tutorial. Ear and Hearing. 2010;31:302–324. doi: 10.1097/AUD.0b013e3181cdb272.
  71. Snyder JS, Alain C. Toward a neurophysiological theory of auditory stream segregation. Psychological Bulletin. 2007;133:780–799. doi: 10.1037/0033-2909.133.5.780.
  72. Snyder JS, Carter OL, Lee SK, Hannon EE, Alain C. Effects of context on auditory stream segregation. Journal of Experimental Psychology: Human Perception and Performance. 2008;34:1007–1016. doi: 10.1037/0096-1523.34.4.1007.
  73. Song JH, Banai K, Russo NM, Kraus N. On the relationship between speech- and nonspeech-evoked auditory brainstem responses. Audiology & Neurotology. 2006;11:233–241. doi: 10.1159/000093058.
  74. Song JH, Nicol T, Kraus N. Test-retest reliability of the speech-evoked auditory brainstem response in young adults. Clinical Neurophysiology. doi: 10.1016/j.clinph.2010.07.009. IN PRESS.
  75. Song JH, Skoe E, Banai K, Kraus N. Enhancement of brainstem encoding of the fundamental frequency in listeners with good speech perception in noise. Poster presented at the Society for Neuroscience Annual Meeting; Chicago, IL. 2009 Oct.
  76. Song JH, Skoe E, Wong PCM, Kraus N. Plasticity in the adult human auditory brainstem following short-term linguistic training. Journal of Cognitive Neuroscience. 2008;20:1892–1902. doi: 10.1162/jocn.2008.20131.
  77. Summers V, Leek MR. F0 processing and the separation of competing speech signals by listeners with normal hearing and with hearing loss. Journal of Speech, Language and Hearing Research. 1998;41:1294–1306. doi: 10.1044/jslhr.4106.1294.
  78. Sussman E, Ritter W, Vaughan HG. Attention affects the organization of auditory input associated with the mismatch negativity system. Brain Research. 1998;789:130–138. doi: 10.1016/s0006-8993(97)01443-1.
  79. Sweetow RW, Sabes JH. The need for and development of an adaptive listening and communication enhancement (LACE) program. Journal of the American Academy of Audiology. 2006;17:538–558. doi: 10.3766/jaaa.17.8.2.
  80. Tallal P. Auditory temporal perception, phonics, and reading disabilities in children. Brain & Language. 1980;9:182–198. doi: 10.1016/0093-934x(80)90139-x.
  81. Tallal P, Stark RE. Speech acoustic-cue discrimination abilities of normally developing and language-impaired children. Journal of the Acoustical Society of America. 1981;69:568–574. doi: 10.1121/1.385431.
  82. Tobey EA, Cullen JK, Jr, Rampp DL, Fleischer-Gallagher AM. Effects of stimulus-onset asynchrony on the dichotic performance of children with auditory-processing disorders. Journal of Speech, Language, and Hearing Research. 1979;22:197–211. doi: 10.1044/jshr.2202.197.
  83. Townsend TH, Schwartz DM. Error analysis on the California consonant test by manner of articulation. Ear and Hearing. 1981;2:108–111. doi: 10.1097/00003446-198105000-00004.
  84. Tzounopoulos T, Kraus N. Learning to encode timing: Mechanisms of plasticity in the auditory brainstem. Neuron. 2009;62:463–469. doi: 10.1016/j.neuron.2009.05.002.
  85. Van Tasell D, Hagen LT, Koblas LL, Penner SG. Perception of short-term spectral cues for stop consonant place by normal and hearing-impaired subjects. Journal of the Acoustical Society of America. 1982;72:1771–1780. doi: 10.1121/1.388650.
  86. Ventry I, Weinstein B. The hearing handicap inventory for the elderly: A new tool. Ear and Hearing. 1982;3:128–134. doi: 10.1097/00003446-198205000-00006.
  87. Vongpaisal T, Trehub SE, Schellenberg EG. Song recognition by children and adolescents with cochlear implants. Journal of Speech, Language, and Hearing Research. 2006;49:1091–1103. doi: 10.1044/1092-4388(2006/078).
  88. Wible B, Nicol T, Kraus N. Atypical brainstem representation of onset and formant structure of speech sounds in children with language-based learning problems. Biological Psychology. 2004;67:299–317. doi: 10.1016/j.biopsycho.2004.02.002.
  89. Wong PCM, Skoe E, Russo NM, Dees T, Kraus N. Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neuroscience. 2007;10:420–422. doi: 10.1038/nn1872.
