Journal of Speech, Language, and Hearing Research: JSLHR. 2017 Oct 17;60(10):3001–3008. doi: 10.1044/2017_JSLHR-H-17-0070

Speech Perception in Complex Acoustic Environments: Developmental Effects

Lori J. Leibold
PMCID: PMC5945069  PMID: 29049600

Abstract

Purpose

The ability to hear and understand speech in complex acoustic environments follows a prolonged time course of development. The purpose of this article is to provide a general overview of the literature describing age effects in susceptibility to auditory masking in the context of speech recognition, including a summary of findings related to the maturation of processes thought to facilitate segregation of target from competing speech.

Method

Data from published and ongoing studies are discussed, with a focus on synthesizing results from studies that address age-related changes in the ability to perceive speech in the presence of a small number of competing talkers.

Conclusions

This review provides a summary of the current state of knowledge that is valuable for researchers and clinicians. It highlights the importance of considering listener factors, such as age and hearing status, as well as stimulus factors, such as masker type, when interpreting masked speech recognition data.

Presentation Video

http://cred.pubs.asha.org/article.aspx?articleid=2601620


This research forum contains papers from the 2016 Research Symposium at the ASHA Convention held in Philadelphia, PA.

Children live and learn in complex acoustic environments, which contain multiple sources of competing sounds. Acoustic waveforms generated by these sources may be relatively steady in frequency and intensity over time, or they may be more dynamic. For example, a child in a science classroom might be exposed to speech produced by his or her teacher, speech produced by other children, and noise produced by an aquarium. Given the high prevalence of competing sounds in children's natural listening environments (e.g., Ambrose, VanDam, & Moeller, 2014) and the mounting evidence linking exposure to competing sounds to delays in language development and learning (e.g., Shield & Dockrell, 2008), it is essential that we understand how and when the ability to hear and understand speech in complex acoustic environments develops. This is not a trivial problem; the ability to recognize speech in the presence of competing sounds relies on accurate and efficient processing across multiple stages within the auditory and cognitive systems.

The goal of this review is to (a) provide a simple model describing stages of auditory processing; (b) differentiate between energetic and informational masking; (c) review the literature describing developmental effects in susceptibility to speech-in-speech masking; (d) introduce the hypothesis that the ability to take advantage of acoustic voice characteristics that facilitate segregation of target from masker speech requires extensive experience with sound; and (e) consider how congenital hearing loss may impact experience with sound, thus altering the maturation of speech-in-speech perception skills.

Stages of Auditory Processing

Figure 1 depicts several stages of auditory processing required to recognize speech in multisource environments. The child in the science classroom must listen to his teacher's lecture while disregarding speech produced by his classmates and noise generated by the aquarium's filter and pump. What reaches the child's ears is a mixture of acoustic waveforms produced by all three sources. In order to “hear out” the teacher's instructions, the basic spectral, temporal, and intensity properties of her speech must first be encoded by the child's peripheral auditory system. The fidelity of this peripheral encoding is compromised by the presence of the competing sounds. The representation of waveforms associated with the competing speech and noise may overlap on the basilar membrane with those of the target speech, thus degrading the neural representation of the teacher's spoken message transmitted to the child's central auditory system. This phenomenon is often referred to in the literature as energetic masking.

Figure 1.

This illustration highlights three stages of auditory processing. In the first stage, a combination of acoustic waveforms produced by three sources (a science teacher, students working on a project, the pump and filter of an aquarium) reaches the child's ear. In the second stage, represented as a spectrogram, the peripheral auditory system encodes the temporal, spectral, and intensity characteristics of these waveforms into a pattern of neural activity across auditory nerve fibers that is transmitted to higher levels within the auditory system. In the third stage, top-down auditory-perceptual, cognitive, and linguistic processing facilitates reconstruction of the auditory scene.
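As a rough computational analogue of the second stage, a spectrogram captures the temporal, spectral, and intensity structure of the waveform mixture reaching the ear. The short Python sketch below is illustrative only: the file name is hypothetical, the recording is assumed to be mono, and a spectrogram is a crude stand-in for cochlear encoding rather than a model of it.

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import spectrogram

    # Read a recording of the classroom mixture (hypothetical file name).
    fs, mixture = wavfile.read("classroom_mixture.wav")

    # Short-time spectral analysis with ~25-ms windows, long enough to
    # resolve formant structure while tracking changes over time.
    freqs, times, power = spectrogram(mixture.astype(float), fs=fs,
                                      nperseg=int(0.025 * fs))

    # Express power in dB so intensity differences between the teacher,
    # the other children, and the aquarium noise are visible.
    power_db = 10 * np.log10(power + 1e-12)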

The ability to hear speech in the presence of competing sounds also relies on central auditory and cognitive processes that allow listeners to group sounds into separate auditory objects and allocate attention to a particular object while discounting other objects (e.g., Best, Ozmeral, & Shinn-Cunningham, 2007; Bregman, 1990; Bronkhorst, 2000). In addition to degrading the peripheral representation of target speech, competing sounds may also impact speech perception by disrupting this higher-level processing. This disruption often reduces the extent to which listeners can disentangle target speech from competing sounds, even when the fidelity with which the peripheral auditory system encodes the target speech is sufficient. These difficulties are most pronounced when the target speech and the competing masker are perceptually similar, such as when speech must be recognized in a masker composed of a small number of speech streams (e.g., Brungart, 2001; Carhart, Tillman, & Greetis, 1969; Freyman, Balakrishnan, & Helfer, 2004). This phenomenon is often referred to in the literature as informational masking.

Competing Noise Versus Competing Speech

The majority of studies investigating masked speech perception have examined speech recognition in the presence of relatively steady-state sounds, such as babble (≥ 4 talkers), Gaussian noise, or speech-shaped noise (e.g., Dubno, Dirks, & Morgan, 1984; Frisina & Frisina, 1997; Gravel, Fausel, Liskow, & Chobot, 1999; Lunner & Sundewall-Thorén, 2007). Not surprisingly, these relatively steady-state sounds have commonly been included as maskers in clinical speech-in-noise tests, for example, the Quick Speech-in-Noise Test (Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) and the Hearing in Noise Test (Nilsson, Soli, & Sullivan, 1994; Niquette et al., 2003). At least for young adults, steady noise is expected to produce primarily energetic masking by physically interfering with encoding of all or parts of the target speech representation at the periphery (reviewed by Brungart, 2005). In their seminal study, for example, Miller and Nicely (1955) measured adults' identification of consonant–vowel syllables in broadband noise using a closed-set task. Their findings showed that adults have more difficulty identifying some consonants than others in noise, with confusions tending to preserve certain phonetic features (e.g., manner information) more reliably than others, and that error patterns are generally uniform across listeners. Subsequent work demonstrated that consonant error patterns in noise are influenced by spectral characteristics of the masking noise (e.g., Phatak, Lovitt, & Allen, 2008) and signal-to-noise ratio (SNR; e.g., Miller & Nicely, 1955; Phatak & Allen, 2007; Woods, Yund, Herron, & Cruadhlaoich, 2010). These findings provide compelling evidence that the factors responsible for consonant identification in noise are related to features of the stimuli and how those features are encoded by the peripheral auditory system.
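Because the studies above are compared in terms of SNR, it may help to see how a target and masker are combined at a nominal SNR. The following minimal Python sketch assumes equal-length mono signals stored as NumPy arrays; the function name and scaling convention are illustrative, not drawn from any cited study.

    import numpy as np

    def mix_at_snr(target, masker, snr_db):
        """Scale the masker so the target-to-masker level difference
        equals snr_db, then sum. Assumes equal-length mono arrays."""
        target_rms = np.sqrt(np.mean(target ** 2))
        masker_rms = np.sqrt(np.mean(masker ** 2))
        # RMS the masker must have for the requested SNR, since
        # SNR (dB) = 20 * log10(target_rms / masker_rms).
        scale = (target_rms / 10 ** (snr_db / 20)) / masker_rms
        return target + masker * scale

    # At snr_db = 0, target and masker have equal RMS; negative values
    # make the masker more intense than the target.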

There has been a more recent emphasis in the literature on understanding how maskers composed of a small number of speech streams impact speech recognition (e.g., Freyman et al., 2004). It is well documented that speech recognition in a single stream of competing speech is typically easier for young adults than listening in steady noise, in part because fluctuations within the competing speech stream provide listeners with an opportunity to “glimpse” portions of the target speech (e.g., Cooke, 2006; Howard-Jones & Rosen, 1993). However, speech recognition in a masker composed of two to three streams of speech is often more difficult for listeners than when the masker is steady noise (e.g., Carhart, Johnson, & Goodman, 1975; Freyman et al., 2004). For example, Carhart et al. (1975) estimated young adults' spondee recognition thresholds in the presence of white noise, speech-shaped noise, and combinations of speech produced by one, two, three, 16, 32, 64, or 128 talkers. Overall masker level was held constant across all masker conditions. Considering the speech maskers, an increase in masking was observed as the number of talkers increased from one to three. Interestingly, no further increases in masking were observed as additional talkers were added beyond three. It has been suggested that this pattern of results reflects an increase in informational masking as the number of streams increases from one to two or three and opportunities for glimpsing decrease; as additional talkers are added beyond that point, the masker becomes less similar to the target and more noise-like, so informational masking declines while energetic masking increases (e.g., Freyman et al., 2004).
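The glimpsing account can be made concrete by counting the time–frequency regions in which the local SNR favors the target, in the spirit of Cooke's (2006) model. The Python sketch below is a simplification: it uses a 3-dB local SNR criterion similar to that work but substitutes a plain spectrogram for the auditory filterbank analysis Cooke used, so the exact values it returns are illustrative.

    import numpy as np
    from scipy.signal import spectrogram

    def glimpse_proportion(target, masker, fs, criterion_db=3.0):
        """Proportion of time-frequency bins in which the target exceeds
        the masker by criterion_db; a crude index of glimpse availability."""
        nperseg = int(0.02 * fs)  # ~20-ms analysis windows
        _, _, target_power = spectrogram(target, fs=fs, nperseg=nperseg)
        _, _, masker_power = spectrogram(masker, fs=fs, nperseg=nperseg)
        local_snr_db = 10 * np.log10((target_power + 1e-12) /
                                     (masker_power + 1e-12))
        return float(np.mean(local_snr_db > criterion_db))

    # A one-talker masker, with pauses and level dips, leaves many bins
    # above criterion; steady speech-shaped noise at the same overall
    # level leaves few, consistent with the pattern described above.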

Masked Speech Recognition in Children

Findings from multiple laboratories provide converging evidence that children require higher SNRs than young adults to achieve similar performance on a wide range of speech-in-noise measures (e.g., Corbin, Bonino, Buss, & Leibold, 2016; Elliott, Connors, Kille, Levin, Ball, & Katz, 1979). In many studies, mature performance has been observed by about 9–10 years of age (e.g., Corbin et al., 2016; Nishi, Lewis, Hoover, Choi, & Stelmachowicz, 2010), providing evidence that the ability to perceptually segregate target speech from a noise masker may be immature early during the school-age years but is adultlike by adolescence. Leibold and Buss (2013) examined children's and young adults' consonant identification performance in the presence of speech-shaped noise at a fixed SNR of 0 dB. Percent correct scores are shown in the left panel of Figure 2, with data for children split into three age groupings (5–7, 8–10, and 11–13 years). The youngest group of children (5–7 years) performed significantly worse than the three older groups of listeners in the noise masker, scoring an average of 11 percentage points more poorly than young adults. In contrast, the 8- to 10-year-olds and 11- to 13-year-olds performed as well as the adults.

Figure 2.

Group average percent correct scores for consonant identification are presented for 5- to 7-year-olds (circles), 8- to 10-year-olds (squares), 11- to 13-year-olds (triangles), and young adults (hexagons), as adapted from Leibold and Buss (2013). Error bars are ± 1 SEM. Data on the left show performance in a speech-shaped noise masker, and data on the right show performance in a two-talker speech masker. Note the magnitude of child–adult differences in the two-talker masker relative to the speech-shaped noise masker.

Although children have more difficulty in recognizing speech in noise than adults, substantially larger and longer-lasting developmental effects have been observed in the presence of one or two streams of competing speech (e.g., Corbin et al., 2016; Hall, Grose, Buss, & Dev, 2002; Leibold & Buss, 2013; Wightman & Kistler, 2005). The right panel of Figure 2 shows percent correct scores for consonant recognition in the presence of a continuous, two-talker speech masker. Striking child–adult differences in performance are evident in these data, including a 36 percentage point decrement in performance for 5- to 7-year-old children relative to 19- to 34-year-old adults. Similar age effects have been reported for word (e.g., Hall, Grose, Buss, & Dev, 2002) and sentence (e.g., Calandruccio, Leibold, & Buss, 2016) recognition in a two-talker masker.

In addition to the substantially larger child–adult differences observed for speech-in-speech compared with speech-in-noise recognition, masked speech recognition appears to mature at different rates in competing speech versus noise. This trend is evident in the masked consonant identification data shown in Figure 2; although 11- to 13-year-olds performed as well as young adults in the speech-shaped noise masker, they performed 10 percentage points more poorly than adults in the two-talker speech masker. Recently, Corbin et al. (2016) assessed word recognition in the presence of speech-shaped noise or two-talker speech in over 50 school-age children ranging in age from 5 to 16 years. Young adults (19–40 years) were also tested to provide an estimate of mature performance. Findings indicated a more prolonged time course of development for speech recognition in two-talker speech than in speech-shaped noise. Speech recognition thresholds in speech-shaped noise improved steadily until about 10 years of age, but thresholds in two-talker speech did not reach adultlike levels until 13–14 years of age. Two additional findings reported by Corbin et al. (2016) are worth highlighting. First, an abrupt improvement in speech recognition thresholds was observed in the two-talker masker around 13–14 years of age. Whereas few children younger than 13 years of age had thresholds in the range observed for young adults, almost all children over 14 years of age demonstrated mature performance. The mechanisms responsible for this complex pattern of development are unclear. In Corbin et al. (2016), we posited that maturation of cognitive processing related to executive functioning may underlie the rapid improvement in speech-in-speech recognition observed between 13 and 14 years of age, and we highlighted the need for future experiments targeting development across adolescence. The second notable observation from Corbin et al. (2016) is that speech recognition thresholds obtained from the same children in the presence of speech-shaped noise and two-talker speech were uncorrelated. The lack of an association between thresholds in the two masker conditions provides further evidence that speech-in-noise and speech-in-speech perception abilities mature at divergent rates, reflecting contributions from different underlying factors.

Although the data are somewhat mixed, mounting evidence supports the idea that child–adult differences in speech-in-speech recognition partly reflect immature glimpsing abilities. Initial studies investigating speech recognition in temporally modulated noise found no child–adult differences in the amount of benefit derived from masker modulation (Stuart, 2008; Stuart, Givens, Walker, & Elangovan, 2006). However, results from more recent work involving complex noise or speech maskers with both spectral and temporal modulations (e.g., Buss, Leibold, Porter, & Grose, 2017; Hall, Buss, Grose, & Roush, 2012) and/or reverberation (e.g., Wróblewski, Lewis, Valente, & Stelmachowicz, 2012) indicate that school-age children are less able than adults to benefit from the glimpses available in a fluctuating masker.

Age Effects in the Ability to Utilize Acoustic Differences Between Talkers to Separate Speech Streams

There is growing interest among researchers in the field in characterizing the specific factors responsible for children's increased susceptibility to speech-in-speech masking relative to young adults (e.g., Calandruccio et al., 2016; Newman, Morini, Ahsan, & Kidd, 2015; Wightman, Kistler, & Brungart, 2006). In our laboratory, for example, we have evaluated the extent to which children benefit from acoustic differences in vocal characteristics between talkers when segregating target from masker speech (e.g., Calandruccio, Buss, & Leibold, 2013; Flaherty, Leibold, & Buss, 2017; Leibold, Taylor, Hillock-Dunn, & Buss, 2013). This approach is based on results from experiments involving young adults showing that target/masker segregation is aided by the presence of robust acoustic differences between speech produced by different talkers (e.g., Bronkhorst, 2000; Brungart, Simpson, Ericson, & Scott, 2001; Darwin, Brungart, & Simpson, 2003). These vocal characteristics, primarily fundamental frequency (F0) and formant frequencies, are associated with the length of the vocal folds and the size and length of the vocal tract, respectively (e.g., Fitch & Giedd, 1999).

F0 and formant frequencies vary across talkers, with male voices tending to be lower in frequency than female voices. Consistent with the hypothesis that between-talker differences in these vocal characteristics facilitate target/masker segregation, young adults typically show substantially better speech-in-speech recognition performance for conditions in which target and masker speech are mismatched in sex than when they are matched in sex (e.g., Brungart, 2001; Festen & Plomp, 1990; Freyman et al., 2004). For example, Brungart (2001) compared 21- to 55-year-old adults' speech-in-speech recognition abilities at a fixed SNR using the Coordinate Response Measure Test (Bolia, Nelson, Ericson, & Simpson, 2000) across conditions in which target and masker speech was produced by the same talker, by different talkers matched in sex, or by different talkers mismatched in sex. For sex-matched conditions, performance was 15–20 percentage points higher when target and masker phrases were produced by different talkers. Performance improved by an additional 15–20 percentage points when target and masker phrases were spoken by talkers mismatched in sex. It has been suggested that these findings are the consequence of a decrease in both energetic and informational masking driven by the relatively large acoustic differences between male and female speech productions, making it easier for listeners to segregate target and masker speech streams relative to when target and masker speech is produced by talkers of the same sex (e.g., Freyman et al., 2004).

Results from ongoing experiments in our laboratory suggest that the ability to exploit even large acoustic differences in vocal characteristics between talkers takes many years to fully develop. Calandruccio et al. (2013) compared the speech recognition thresholds of children (5–10 years old) and young adults in two-talker speech across sex-matched and sex-mismatched target/masker conditions. A similar sex mismatch benefit was observed for children and adults. In a related study, however, Leibold, Taylor, et al. (2013) observed no sex mismatch benefit for 7- to 13-month-old infants in the context of speech-in-speech detection. Although the methods used to test infants and school-age children differ, thresholds for young adults tested by Leibold, Taylor, et al. (2013) using the infant paradigm were considerably lower for sex-mismatched than sex-matched target/masker conditions, a finding consistent with Calandruccio et al. (2013). In sharp contrast, infant thresholds for sex-matched and sex-mismatched conditions were similar. The pattern of results observed across these two studies is consistent with the hypothesis that the ability to take advantage of acoustic differences between male and female speech productions is not established at birth and develops between infancy and the school-age years.

Although school-age children tested by Calandruccio et al. (2013) showed a robust sex mismatch benefit, they continued to be more susceptible to speech-in-speech masking than young adults even when target and masker speech was mismatched in sex. A possible explanation for this finding is that the ability to utilize less redundant and/or more subtle differences in voice characteristics between talkers of the same sex follows a prolonged time course of development. Flaherty et al. (2017) recently examined this possibility by testing a wide age range of school-age children (5–15 years old) and young adults on speech-in-speech conditions in which only F0 was manipulated. An adaptive procedure was used to estimate the SNR required for 70.7% word recognition in a two-talker speech masker. The target and masker speech was produced by the same talker. The rationale for using target and masker speech produced by the same talker was to accentuate informational masking effects (e.g., Brungart et al., 2001) and to isolate the influence of target/masker differences in F0 on speech-in-speech recognition. In separate conditions, the F0 of the target speech was either matched to the masker's F0 (i.e., unaltered) or shifted higher in frequency by three, six, or nine semitones. The F0 of the masker speech remained constant across experimental conditions. Preliminary data are presented in Figure 3, which shows thresholds estimated using unaltered target words (open circles) and target words shifted up by six semitones (shaded triangles). The vertical lines represent the benefit of introducing the relatively large target/masker F0 difference. Consistent with previous findings (e.g., Darwin et al., 2003; Mackersie, Dewey, & Guthrie, 2011), thresholds for all of the young adult listeners were considerably lower when the target F0 was shifted higher in frequency than the masker F0 relative to when the target and masker F0s were matched. The same general pattern of results was observed for children 10 years of age and older. Surprisingly, however, children younger than about 10 years of age did not take advantage of target/masker F0 differences. This age effect suggests that learning the skills required to utilize target/masker F0 differences requires about a decade of auditory experience and/or neural maturation.

Figure 3.

Estimates of the signal-to-noise ratio (SNR) required to obtain 70.7% word recognition in a two-talker masker are plotted as a function of age for individual children and young adults tested by Flaherty et al. (2017). Thresholds estimated using target and masker speech with the same fundamental frequency (unaltered F0) are shown by the open circles, and thresholds estimated using target speech shifted up by six semitones (shifted F0) are shown by the shaded triangles. Whereas older children and adults benefitted from a target/masker F0 difference, children younger than about 9 years did not.
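Two computational details of this paradigm can be sketched concisely: a shift of n semitones multiplies F0 by 2^(n/12), and the 70.7% correct point targeted by the adaptive procedure is the convergence point of a two-down, one-up staircase. The Python sketch below illustrates that logic; the starting SNR, step size, number of reversals, and trial interface are illustrative assumptions, not the parameters Flaherty et al. (2017) actually used.

    import numpy as np

    def semitone_ratio(n_semitones):
        # Shifting F0 by n semitones multiplies it by 2**(n/12); a
        # 6-semitone shift raises a 200-Hz F0 to about 283 Hz.
        return 2.0 ** (n_semitones / 12.0)

    def two_down_one_up(run_trial, start_snr_db=4.0, step_db=2.0,
                        n_reversals=8):
        """Adaptive track converging on 70.7% correct. run_trial(snr_db)
        must return True for a correct response; it stands in for
        presenting a masked word at the given SNR."""
        snr = start_snr_db
        n_correct = 0
        last_step = 0  # -1 after a downward step, +1 after an upward step
        reversals = []
        while len(reversals) < n_reversals:
            if run_trial(snr):
                n_correct += 1
                if n_correct == 2:       # two consecutive correct -> harder
                    n_correct = 0
                    if last_step == +1:
                        reversals.append(snr)
                    last_step = -1
                    snr -= step_db
            else:                        # any error -> easier
                n_correct = 0
                if last_step == -1:
                    reversals.append(snr)
                last_step = +1
                snr += step_db
        # Averaging reversal SNRs gives the threshold estimate; in
        # practice the first few reversals are often discarded.
        return float(np.mean(reversals))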

Influence of Hearing Loss on Speech-in-Speech Recognition

It has been known for many years that sensory/neural hearing loss often reduces the fidelity with which the peripheral auditory system encodes sound (e.g., Buss, Hall, & Grose, 2004; Glasberg & Moore, 1986; Moore & Carlyon, 2005). Moreover, multiple studies involving young adults (e.g., Fu, Shannon, & Wang, 1998; Peters, Moore, & Baer, 1998) and children (e.g., Hall, Buss, Grose, & Roush, 2012) have shown that peripheral encoding deficits negatively impact speech recognition in the presence of nominally steady noise or babble. Damage to sensory hair cells and other structures located within the auditory periphery is likely to also interfere with speech recognition in the presence of competing speech because of reduced access to acoustic cues that facilitate the segregation of target from masker speech, as well as poorer representations of temporal and spectral changes over time (e.g., Qin & Oxenham, 2003).

Although less studied, an additional factor that appears to influence speech-in-speech recognition outcomes for children who are hard of hearing is auditory experience. Specifically, results from a growing number of studies indicate that children with sensory/neural hearing loss often have reduced and/or less consistent auditory experience than peers with normal hearing (reviewed by Moeller & Tomblin, 2015). For example, many young children who are hard of hearing do not wear their hearing aids for more than 6 hr per day (e.g., Muñoz, Preston, & Hicken, 2014; Walker et al., 2014). In addition, it has been estimated that approximately a third of pediatric hearing aids may not provide optimal audibility (e.g., McCreery et al., 2014). These two issues are critical because both hearing aid use and aided audibility moderate language outcomes for children who are hard of hearing (e.g., Tomblin et al., 2015; Tomblin, Oleson, Ambrose, Walker, & Moeller, 2014).

Based on the emerging data indicating that language outcomes for children who are hard of hearing are influenced by experience with sound, Leibold, Hillock-Dunn, Duncan, Roush, and Buss (2013) tested the hypothesis that reduced auditory experience associated with congenital sensory/neural hearing loss negatively impacts the development of perceptual abilities related to the segregation and selection of target from background speech. Children with bilateral sensory/neural hearing loss (9–17 years old) and age-matched peers with normal hearing completed an adaptive spondee recognition task in two-talker speech and in speech-shaped noise. The children who were hard of hearing wore their hearing aids during testing. In the speech-shaped noise masker condition, which was expected to produce energetic masking, children who were hard of hearing required an additional 3.5 dB SNR relative to their peers with normal hearing to achieve comparable performance. This disadvantage increased to 8.1 dB SNR in the two-talker speech masker, which was expected to produce both energetic and informational masking. In a follow-up study, Hillock-Dunn, Taylor, Buss, and Leibold (2015) observed that performance in the two-talker masker, but not in the speech-shaped noise masker, was correlated with parental reports of their children's everyday communication and speech understanding abilities. Interestingly, Corbin et al. (2016) failed to observe a correlation between speech reception thresholds measured in two-talker speech and in speech-shaped noise maskers in a related study involving over 50 school-age children with normal hearing. Considered together, these findings have important clinical implications because they suggest that measures of speech-in-speech recognition may be more predictive of children's functional hearing skills than conventional clinical assessments made in quiet or in steady noise.

Provisional Conclusions and Future Directions

Children are more susceptible to auditory masking than young adults, requiring a more advantageous SNR to achieve comparable performance. Although child–adult differences in masked speech recognition are evident in the presence of relatively steady noise, this performance gap is larger, and the time course of development is prolonged, in the presence of a small number of competing speech streams. In the practical example of the child in the science classroom, speech produced by the student's classmates likely has a more detrimental effect on hearing and understanding the teacher's instruction than the noise produced by the aquarium. Emerging data indicate that, although child–adult differences are substantial for children with normal hearing, children who are hard of hearing are particularly vulnerable to speech-in-speech masking. Considering all of the currently available data, we propose the working hypothesis that maturation of the perceptual skills required to segregate target speech in real-world, multisource environments requires years of exposure to high-quality auditory input.

Several avenues of future research have emerged as important steps toward understanding the development of hearing in complex acoustic environments. First, isolating the factors responsible for children's increased susceptibility to speech-in-speech masking is paramount. Second, it is critical that we design rigorous, theoretically motivated experiments to evaluate the influence of early auditory experience on the development of masked speech perception skills. Finally, we are now in a position to create a new generation of clinical speech recognition tools that more closely approximate the types of complex acoustic environments children encounter in their everyday lives.

Acknowledgments

Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders Awards R13DC003383 and R01DC011038 (awarded to Lori J. Leibold). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Work conducted in my laboratory is a collaborative effort, and I had the great fortune to work with Emily Buss and Lauren Calandruccio on many of the studies reported in this review. I am also grateful to the outstanding members of the Human Auditory Development Laboratory.

Funding Statement

Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders Awards R13DC003383 and R01DC011038 (awarded to Lori J. Leibold).

References

1. Ambrose, S. E., VanDam, M., & Moeller, M. P. (2014). Linguistic input, electronic media, and communication outcomes of toddlers with hearing loss. Ear and Hearing, 35, 139–147.
2. Best, V., Ozmeral, E. J., & Shinn-Cunningham, B. G. (2007). Visually-guided attention enhances target identification in a complex auditory scene. Journal of the Association for Research in Otolaryngology, 8, 294–304.
3. Bolia, R. S., Nelson, W. T., Ericson, M. A., & Simpson, B. D. (2000). A speech corpus for multitalker communications research. The Journal of the Acoustical Society of America, 107, 1065–1066.
4. Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press.
5. Bronkhorst, A. W. (2000). The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions. Acta Acustica United With Acustica, 86, 117–128.
6. Brungart, D. S. (2001). Informational and energetic masking effects in the perception of two simultaneous talkers. The Journal of the Acoustical Society of America, 109, 1101–1109.
7. Brungart, D. S. (2005). Informational and energetic masking effects in multitalker speech perception. In P. Divenyi (Ed.), Speech separation by humans and machines (pp. 261–267). New York, NY: Springer.
8. Brungart, D. S., Simpson, B. D., Ericson, M. A., & Scott, K. R. (2001). Informational and energetic masking effects in the perception of multiple simultaneous talkers. The Journal of the Acoustical Society of America, 110, 2527–2538.
9. Buss, E., Hall, J. W., III, & Grose, J. H. (2004). Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss. Ear and Hearing, 25, 242–250.
10. Buss, E., Leibold, L. J., Porter, H. L., & Grose, J. H. (2017). Speech recognition in one- and two-talker maskers in school-age children and adults: Development of perceptual masking and glimpsing. The Journal of the Acoustical Society of America, 141, 2650–2660.
11. Calandruccio, L., Buss, E., & Leibold, L. J. (2013, March). Speech on speech masking for children: Male versus female talkers. Poster presented at the annual meeting of the American Auditory Society, Scottsdale, AZ.
12. Calandruccio, L., Leibold, L. J., & Buss, E. (2016). Linguistic masking release in school-age children and adults. American Journal of Audiology, 25, 34–40.
13. Carhart, R., Johnson, C., & Goodman, J. (1975). Perceptual masking of spondees by combinations of talkers. The Journal of the Acoustical Society of America, 58, S35.
14. Carhart, R., Tillman, T. W., & Greetis, E. S. (1969). Perceptual masking in multiple sound backgrounds. The Journal of the Acoustical Society of America, 45, 694–703.
15. Cooke, M. (2006). A glimpsing model of speech perception in noise. The Journal of the Acoustical Society of America, 119, 1562–1573.
16. Corbin, N. E., Bonino, A. Y., Buss, E., & Leibold, L. J. (2016). Development of open-set word recognition in children: Speech-shaped noise and two-talker speech maskers. Ear and Hearing, 37, 55–63.
17. Darwin, C. J., Brungart, D. S., & Simpson, B. D. (2003). Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers. The Journal of the Acoustical Society of America, 114, 2913–2922.
18. Dubno, J. R., Dirks, D. D., & Morgan, D. E. (1984). Effects of age and mild hearing loss on speech recognition in noise. The Journal of the Acoustical Society of America, 76, 87–96.
19. Elliott, L. L., Connors, S., Kille, E., Levin, S., Ball, K., & Katz, D. (1979). Children's understanding of monosyllabic nouns in quiet and in noise. The Journal of the Acoustical Society of America, 66, 12–21.
20. Festen, J. M., & Plomp, R. (1990). Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. The Journal of the Acoustical Society of America, 88, 1725–1736.
21. Fitch, W. T., & Giedd, J. (1999). Morphology and development of the human vocal tract: A study using magnetic resonance imaging. The Journal of the Acoustical Society of America, 106, 1511–1522.
22. Flaherty, M. M., Leibold, L. J., & Buss, E. (2017, February). Developmental effects in the ability to benefit from F0 differences between target and masker speech. Poster presented at the annual meeting of the Association for Research in Otolaryngology, Baltimore, MD.
23. Freyman, R. L., Balakrishnan, U., & Helfer, K. S. (2004). Effect of number of masking talkers and auditory priming on informational masking in speech recognition. The Journal of the Acoustical Society of America, 115, 2246–2256.
24. Frisina, D. R., & Frisina, R. D. (1997). Speech recognition in noise and presbycusis: Relations to possible neural mechanisms. Hearing Research, 106, 95–104.
25. Fu, Q. J., Shannon, R. V., & Wang, X. (1998). Effects of noise and spectral resolution on vowel and consonant recognition: Acoustic and electric hearing. The Journal of the Acoustical Society of America, 104, 3586–3596.
26. Glasberg, B. R., & Moore, B. C. J. (1986). Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments. The Journal of the Acoustical Society of America, 79, 1020–1033.
27. Gravel, J. S., Fausel, N., Liskow, C., & Chobot, J. (1999). Children's speech recognition in noise using omni-directional and dual-microphone hearing aid technology. Ear and Hearing, 20, 1–11.
28. Hall, J. W., Buss, E., Grose, J. H., & Roush, P. A. (2012). Effects of age and hearing impairment on the ability to benefit from temporal and spectral modulation. Ear and Hearing, 33, 340.
29. Hall, J. W., III, Grose, J. H., Buss, E., & Dev, M. B. (2002). Spondee recognition in a two-talker masker and a speech-shaped noise masker in adults and children. Ear and Hearing, 23, 159–165.
30. Hillock-Dunn, A., Taylor, C., Buss, E., & Leibold, L. J. (2015). Assessing speech perception in children with hearing loss: What conventional clinical tools may miss. Ear and Hearing, 36, e57.
31. Howard-Jones, P. A., & Rosen, S. (1993). Uncomodulated glimpsing in "checkerboard" noise. The Journal of the Acoustical Society of America, 93, 2915–2922.
32. Killion, M. C., Niquette, P. A., Gudmundsen, G. I., Revit, L. J., & Banerjee, S. (2004). Development of a quick speech-in-noise test for measuring signal-to-noise ratio loss in normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 116, 2395–2405.
33. Leibold, L. J., & Buss, E. (2013). Children's identification of consonants in a speech-shaped noise or a two-talker masker. Journal of Speech, Language, and Hearing Research, 56, 1144–1155.
34. Leibold, L. J., Hillock-Dunn, A., Duncan, N., Roush, P. A., & Buss, E. (2013). Influence of hearing loss on children's identification of spondee words in a speech-shaped noise or a two-talker masker. Ear and Hearing, 34, 575–584.
35. Leibold, L. J., Taylor, C. N., Hillock-Dunn, A., & Buss, E. (2013, June). Effect of talker sex on infants' detection of spondee words in a two-talker or a speech-shaped noise masker. In Proceedings of Meetings on Acoustics ICA2013 (Vol. 19, No. 1, p. 060074). Melville, NY: Acoustical Society of America.
36. Lunner, T., & Sundewall-Thorén, E. (2007). Interactions between cognition, compression, and listening conditions: Effects on speech-in-noise performance in a two-channel hearing aid. Journal of the American Academy of Audiology, 18, 604–617.
37. Mackersie, C. L., Dewey, J., & Guthrie, L. A. (2011). Effects of fundamental frequency and vocal-tract length cues on sentence segregation by listeners with hearing loss. The Journal of the Acoustical Society of America, 130, 1006–1019.
38. McCreery, R. W., Walker, E. A., Spratford, M., Bentler, R., Holte, L., Roush, P., … Moeller, M. P. (2014). Longitudinal predictors of aided speech audibility in infants and children. Ear and Hearing, 36, 24S–37S.
39. Miller, G. A., & Nicely, P. E. (1955). An analysis of perceptual confusions among some English consonants. The Journal of the Acoustical Society of America, 27, 338–352.
40. Moeller, M. P., & Tomblin, J. B. (2015). Epilogue: Conclusions and implications for research and practice. Ear and Hearing, 36, 92S–98S.
41. Moore, B. C. J., & Carlyon, R. P. (2005). Perception of pitch by people with cochlear hearing loss and by cochlear implant users. In C. J. Plack, A. J. Oxenham, R. R. Fay, & A. N. Popper (Eds.), Pitch: Neural coding and perception (pp. 234–277). New York, NY: Springer.
42. Muñoz, K., Preston, E., & Hicken, S. (2014). Pediatric hearing aid use: How can audiologists support parents to increase consistency? Journal of the American Academy of Audiology, 25, 380–387.
43. Newman, R. S., Morini, G., Ahsan, F., & Kidd, G., Jr. (2015). Linguistically-based informational masking in preschool children. The Journal of the Acoustical Society of America, 138, EL93–EL98.
44. Nilsson, M., Soli, S. D., & Sullivan, J. A. (1994). Development of the Hearing in Noise Test for the measurement of speech reception thresholds in quiet and in noise. The Journal of the Acoustical Society of America, 95, 1085–1099.
45. Niquette, P., Arcaroli, J., Revit, L., Parkinson, A., Staller, S., Skinner, M., & Killion, M. (2003, March). Development of the BKB-SIN Test. Poster presented at the annual meeting of the American Auditory Society, Scottsdale, AZ.
46. Nishi, K., Lewis, D. E., Hoover, B. M., Choi, S., & Stelmachowicz, P. G. (2010). Children's recognition of American English consonants in noise. The Journal of the Acoustical Society of America, 127, 3177–3188.
47. Peters, R. W., Moore, B. C. J., & Baer, T. (1998). Speech reception thresholds in noise with and without spectral and temporal dips for hearing-impaired and normally hearing people. The Journal of the Acoustical Society of America, 103, 577–587.
48. Phatak, S. A., & Allen, J. B. (2007). Consonant and vowel confusions in speech-weighted noise. The Journal of the Acoustical Society of America, 121, 2312–2326.
49. Phatak, S. A., Lovitt, A., & Allen, J. B. (2008). Consonant confusions in white noise. The Journal of the Acoustical Society of America, 124, 1220–1233.
50. Qin, M. K., & Oxenham, A. J. (2003). Effects of simulated cochlear-implant processing on speech reception in fluctuating maskers. The Journal of the Acoustical Society of America, 114, 446–454.
51. Shield, B. M., & Dockrell, J. E. (2008). The effects of environmental and classroom noise on the academic attainments of primary school children. The Journal of the Acoustical Society of America, 123, 133–144.
52. Stuart, A. (2008). Reception thresholds for sentences in quiet, continuous noise, and interrupted noise in school-age children. Journal of the American Academy of Audiology, 19, 135–146.
53. Stuart, A., Givens, G. D., Walker, L. J., & Elangovan, S. (2006). Auditory temporal resolution in normal-hearing preschool children revealed by word recognition in continuous and interrupted noise. The Journal of the Acoustical Society of America, 119, 1946–1949.
54. Tomblin, J. B., Harrison, M., Ambrose, S. E., Walker, E. A., Oleson, J. J., & Moeller, M. P. (2015). Language outcomes in young children with mild to severe hearing loss. Ear and Hearing, 36, 76S–91S.
55. Tomblin, J. B., Oleson, J. J., Ambrose, S. E., Walker, E., & Moeller, M. P. (2014). The influence of hearing aids on the speech and language development of children with hearing loss. JAMA Otolaryngology–Head & Neck Surgery, 140, 403–409.
56. Walker, E. A., McCreery, R. W., Spratford, M., Oleson, J. J., Van Buren, J., Bentler, R., … Moeller, M. P. (2014). Trends and predictors of longitudinal hearing aid use for children who are hard of hearing. Ear and Hearing, 36, 38S–47S.
57. Wightman, F. L., & Kistler, D. J. (2005). Informational masking of speech in children: Effects of ipsilateral and contralateral distractors. The Journal of the Acoustical Society of America, 118, 3164–3176.
58. Wightman, F. L., Kistler, D. J., & Brungart, D. (2006). Informational masking of speech in children: Auditory–visual integration. The Journal of the Acoustical Society of America, 119, 3940–3949.
59. Woods, D. L., Yund, E. W., Herron, T. J., & Cruadhlaoich, M. A. U. (2010). Consonant identification in consonant–vowel–consonant syllables in speech-spectrum noise. The Journal of the Acoustical Society of America, 127, 1609–1623.
60. Wróblewski, M., Lewis, D. E., Valente, D. L., & Stelmachowicz, P. G. (2012). Effects of reverberation on speech recognition in stationary and modulated noise by school-aged children and young adults. Ear and Hearing, 33, 731–744.
