Abstract
Objectives
At poor signal-to-noise ratios, speech understanding may depend upon the ability to combine speech fragments that are distributed across time and frequency. The goal of this study was to determine the effects of development and hearing impairment on this ability.
Design
Listeners in the present study included adults and children with normal hearing and with hearing impairment. The children with normal hearing included a younger group (4.6 to 6.9 years of age, n=10) and an older group (7.3 to 11.1 years of age, n=11). The adults with normal hearing were 19–27 years of age (n=10). Adults (19–54 years of age, n=9) and children (7.2 to 10.7 years of age, n=8) with hearing impairment were also tested. The two groups with hearing impairment had comparable mild/moderate bilateral sensorineural hearing impairment. Masked speech reception thresholds for sentences were determined in a baseline condition of steady speech-shaped noise and in noise that was either temporally modulated, spectrally modulated, or both temporally and spectrally modulated.
Results
The results of normal-hearing listeners indicated higher masked speech reception thresholds for children than adults in steady noise. Adults and children showed the same magnitude of masking release for spectral modulation. Adults showed more masking release than the younger children for temporal modulation, and showed more masking release than both the younger and older children for combined temporal/spectral modulation. Comparing normal-hearing and hearing-impaired listeners, the hearing-impaired listeners had higher masked speech reception thresholds (SRTs) in the steady noise condition and reduced masking release in the modulated noise conditions. Neither the two-way interaction between age and hearing impairment nor the three-way interaction between age, hearing impairment, and masking configuration was significant.
Conclusions
Although the reduced masking release for temporal modulation shown by the younger children with normal hearing could be due to poor temporal resolution, it more likely reflects inefficient use of speech cues in temporal gaps or factors stemming from higher signal-to-noise ratios required by children in the baseline condition. The reduced masking release for combined temporal/spectral modulation demonstrated by both the younger and older children with normal hearing may indicate that children in the age range tested here have some difficulty in combining speech information that is distributed across temporal and spectral gaps. Hearing impairment was associated with higher thresholds and reduced masking release in all modulation conditions. Children with hearing impairment showed the poorest performance of any group, consistent with additive effects of hearing loss and development.
Keywords: Speech perception, children, cochlear hearing loss, modulation
INTRODUCTION
Good speech recognition in complex masking backgrounds, where spectral characteristics change dynamically over time, may depend upon the ability of the listener to piece together a target sound from fragments that occur in spectral regions and temporal epochs where the signal-to-noise ratio is relatively high (e.g., Assmann & Summerfield 2004; Buss et al. 2004; Cooke 2006; Hall et al. 2008b; Howard-Jones & Rosen 1993; Miller & Licklider 1950). The present investigation examined this type of ability in adults and children with normal hearing and in adults and children with sensorineural hearing impairment. Maskers were used that were steady (not modulated) or contained either temporal modulation, spectral modulation, or both temporal and spectral modulation (Peters et al. 1998). The Peters et al. (1998) study indicated that adults obtained more masking release with combined temporal/spectral modulation than for either type of modulation alone. Part of the rationale for the present study is related to the theoretical complexity of this kind of task. Although developmental psychoacoustical results indicate that children as young as 5–6 years have relatively adult-like abilities on some hearing tasks, paradigms involving complex masking backgrounds can be associated with continued development until age 10 years or later (e.g., Hall et al. 2005; Hall et al. 2008a; Wightman et al. 2003). It is therefore of interest to determine whether there are developmental effects in the ability to combine speech fragments that are distributed across time and frequency.
It is also of interest to determine whether the ability to combine speech fragments that are distributed across time and frequency might be particularly poor in children with hearing impairment. It is possible that development of this ability could be hampered if the underlying cues are encoded with reduced quality. Reductions in cue quality could occur, for example, because of poor frequency selectivity (e.g., Tyler et al. 1984; Zwicker & Schorn 1978), a deficit that would not be offset by amplification. Such an effect in children with hearing impairment would be demonstrated by a significant interaction between age and hearing impairment, reflecting a disadvantage that is greater than predicted based upon combined age and hearing impairment effects.
Although we are not aware of any previous studies that have specifically examined normal developmental effects for speech in noise having spectral modulation or combined spectral and temporal modulations, the effects of temporally modulated maskers have been investigated. Studies by Stuart et al. (2006) and Stuart (2008) for words and sentences, respectively, concluded that the speech recognition benefit associated with temporal modulation of masking noise did not differ between adults and children. However, both of these studies indicated that children had poorer performance than adults in both steady and modulated noise.
With regard to hearing impairment, previous studies on adult listeners have shown that masking release resulting from temporal modulation of a masking noise is reduced in listeners with hearing impairment relative to listeners with normal hearing (e.g., Eisenberg et al. 1995; Festen & Plomp 1990; George et al. 2006; Peters et al. 1998; Takahashi & Bacon 1992; Wilson & Carhart 1969). A speech perception study by Peters et al. (1998) indicated that masking release resulting from spectral modulation of the masking noise was also associated with reduced benefit in adults with hearing impairment, a finding that was compatible with previous psychoacoustical data indicating that listeners with sensorineural hearing impairment have reduced frequency resolution (e.g., Moore & Glasberg 1986; Pick et al. 1977; Tyler et al. 1984). The study by Peters et al. furthermore indicated reduced benefit in adult hearing-impaired listeners for temporal modulation and for combined temporal/spectral modulation.
In summary, the present study examined effects of hearing loss and age on the ability to combine speech fragments that were distributed across time and frequency. The major goals were to determine normal development of these abilities and to establish whether this development was affected by hearing impairment. To accomplish these goals, we measured speech understanding in noise that was steady, temporally modulated, spectrally modulated, or both temporally and spectrally modulated. Following the collection of data on these conditions, we became aware of recent arguments suggesting that masking release in modulated noise may be closely linked with the signal-to-noise ratio associated with the steady noise baseline condition (Bernstein & Brungart 2011; Bernstein & Grant 2009). We therefore ran supplementary conditions to investigate that question. Results from the supplementary conditions are presented following the results of the main experiment.
MATERIALS AND METHODS
Stimuli and Procedures
Following the methods of Peters et al. (1998), the present speech experiment used maskers that were either steady, temporally modulated, spectrally modulated, or both temporally and spectrally modulated. Specifically, the masking conditions used here included 1) steady speech-shaped noise; 2) speech-shaped noise that was temporally modulated (100% sinusoidal modulation at a rate of 10 Hz); 3) speech-shaped noise that was spectrally modulated with multiple spectral notches, each spanning three equivalent rectangular bandwidths (ERBs) (Glasberg & Moore 1990), separated by masking noise bands that were also three ERBs wide; and 4) speech-shaped noise that was modulated both temporally (10-Hz sinusoidal) and spectrally (three ERBs). In this last condition, the noise was first amplitude modulated and then filtered. The frequencies of masker bands associated with spectral modulation are summarized in Table 1.
Table 1.
Band # | Lower edge (Hz) | Upper edge (Hz) |
---|---|---|
1 | 115 | 246 |
2 | 427 | 676 |
3 | 1021 | 1497 |
4 | 2155 | 3063 |
5 | 4317 | 6049 |
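As a rough check on the band edges in Table 1, the following sketch converts each edge to an ERB number using the Glasberg and Moore (1990) expression, ERB number = 21.4 log10(4.37 f / 1000 + 1) with f in Hz. The function and variable names are illustrative and are not taken from the original study. Under this formula, each masker band and each notch between adjacent bands comes out close to three ERBs wide.

```python
import math

# Masker band edges in Hz, copied from Table 1.
MASKER_BANDS_HZ = [(115, 246), (427, 676), (1021, 1497), (2155, 3063), (4317, 6049)]


def erb_number(freq_hz):
    """ERB-number scale of Glasberg & Moore (1990); freq_hz is in Hz."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def band_widths_erb(bands=MASKER_BANDS_HZ):
    """Width of each masker band, in ERBs."""
    return [erb_number(hi) - erb_number(lo) for lo, hi in bands]


def notch_widths_erb(bands=MASKER_BANDS_HZ):
    """Width of each spectral notch between adjacent masker bands, in ERBs."""
    return [erb_number(bands[i + 1][0]) - erb_number(bands[i][1])
            for i in range(len(bands) - 1)]


if __name__ == "__main__":
    for (lo, hi), width in zip(MASKER_BANDS_HZ, band_widths_erb()):
        print(f"band {lo}-{hi} Hz: {width:.2f} ERBs")
    for width in notch_widths_erb():
        print(f"notch: {width:.2f} ERBs")
```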
The signal used in this experiment was a male voice speaking BKB sentences (Bench et al. 1979). These sentences have been shown to yield an appropriate measure of speech perception in young children (Uchanski et al. 2002). Sentences contained from three to five keywords (average of 3.7 keywords). The BKB corpus consists of 21 lists of 16 sentences, allowing presentation of a novel sentence on each listening interval for every condition. All testing was completed using a speech-shaped noise masker, generated to match the long-term average spectrum of the BKB sentences. The speech-shaped noise was presented at an overall level of 86 dBA and was played continuously throughout all conditions. This level was reduced by approximately 3 dB in the conditions associated with spectral modulation due to the fact that the spectrum level in the bandpass regions of the masker was the same as in the baseline condition. The stimuli were delivered to the listeners through Sennheiser headphones (HD 265).
Each listener sat with an experimenter in a double-walled sound booth. The experimenter was positioned in front of a visual display that showed the current sentence. The listener was seated such that the display was not visible. The listener was presented with a clearly audible sample of the word “ready” spoken by the same adult male who produced the target sentences. After a 500-ms delay, the target sentence was presented. The listener was instructed to repeat as many words from the target sentence as possible after each presentation, and to guess when unsure of any words. No feedback was provided. The experimenter recorded errors following each listener response. Testing used an adaptive staircase procedure that was broadly based on Levitt (1971). The speech presentation level was increased by 2 dB if one or more words were missed, and the level was reduced by 2 dB if all keywords were correctly identified. The run was stopped following 8 reversals, and the masked SRT was taken as the average signal level at the final 6 reversals.
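The tracking rule described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name, the starting level, and the deterministic "listener" in the usage example are all hypothetical. The `n_down` parameter is an added generalization; `n_down=1` corresponds to the rule described in this section, in which the level drops by 2 dB after any sentence with all keywords correct.

```python
from statistics import mean


def run_staircase(all_correct, start_level_db, step_db=2.0,
                  n_reversals=8, n_average=6, n_down=1):
    """Adaptive track: level goes up after any missed keyword and down after
    n_down consecutive sentences with all keywords correct.

    all_correct: callable taking the speech level (dB) and returning True
                 if every keyword was repeated correctly (hypothetical model).
    Returns the mean level at the final n_average reversals.
    """
    level = start_level_db
    direction = 0        # +1 last step was up, -1 last step was down, 0 none yet
    correct_run = 0      # consecutive all-correct sentences at the current level
    reversals = []       # levels at which the track changed direction

    while len(reversals) < n_reversals:
        if all_correct(level):
            correct_run += 1
            if correct_run < n_down:
                continue  # stay at this level until the run is long enough
            correct_run = 0
            new_direction = -1
        else:
            correct_run = 0
            new_direction = +1

        if direction != 0 and new_direction != direction:
            reversals.append(level)
        direction = new_direction
        level += new_direction * step_db

    return mean(reversals[-n_average:])


if __name__ == "__main__":
    # Hypothetical listener who gets every keyword at or above -4 dB SNR.
    srt = run_staircase(lambda level_db: level_db >= -4.0, start_level_db=0.0)
    print(srt)
```

With a real listener the responses are probabilistic, so the track wanders around the threshold rather than alternating between two levels as it does for this idealized listener.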
Listeners
Listeners included adults and children with normal hearing and adults and children with mild/moderate, bilateral sensorineural hearing impairment. Listeners with mixed hearing losses (combined sensorineural and conductive) were excluded from participation. There were 10 adults with normal hearing. These listeners had a mean age of 24.0 years (SD=2.9), ranging from 18.9 to 27.3 years. There were nine adults with hearing impairment. These listeners had a mean age of 35.0 years (SD=16.1), ranging from 18.6 to 53.7 years. There were eight children who had hearing impairment since infancy. These listeners had a mean age of 8.9 years (SD=1.2), ranging from 7.2 to 10.7 years. Hearing intervention data for the children with hearing impairment are shown in Table 2. Children with normal hearing were initially recruited to match in age the children with hearing impairment. This subgroup was composed of 11 listeners. The mean age was 9.0 years (SD=1.2), ranging from 7.3 to 11.1 years. Because one of the goals of the study was to examine normal development of the ability to combine spectro-temporally-distributed speech cues, we recruited an additional cohort of younger children with normal hearing. This subgroup was composed of 10 listeners. The mean age was 5.9 years (SD=0.8), ranging from 4.6 to 6.9 years. These two groups will be referred to as younger and older children. For listeners with normal hearing, the left ear was tested and for listeners with hearing impairment, the better ear was tested except when the better ear had only a mild hearing loss. An independent speech-shaped noise, 40 dB down from the speech-shaped masker, was presented continuously to the contralateral ear of all listeners to prevent the availability of speech cues via crossover. Mean audiometric data for the adult and child listeners with hearing impairment are shown in Figure 1.
Table 2.
Listener | Status of newborn screen | Age at identification (years) | Age at 1st hearing aid fitting (years) | Duration aided (years) | Age at test (years) |
---|---|---|---|---|---|
HI1 | Failed bilaterally | 0.1 | 0.2 | 7.0 | 7.2 |
HI2 | Failed bilaterally | 0.2 | 0.2 | 8.3 | 8.5 |
HI3 | Failed bilaterally | 0.2 | 0.5 | 7.5 | 8.0 |
HI4 | Failed bilaterally | 0.2 | 0.2 | 8.3 | 8.5 |
HI5 | Not screened | 1.9 | 2.0 | 6.1 | 8.1 |
HI6 | Not screened | 3.1 | 3.3 | 6.7 | 10.0 |
HI7 | Not screened | 4.4 | 4.5 | 5.5 | 10.0 |
HI8 | Not screened | 7.1 | 7.3 | 3.4 | 10.7 |
All listeners with hearing impairment were users of hearing aids. All of the children with hearing impairment had been fitted with hearing aids for at least three years prior to data collection, and had worn hearing aids consistently since their initial hearing aid fitting.
RESULTS
For the sake of clarity, the normal developmental effects will be considered before examining effects of hearing impairment. A criterion of p<0.05 was used in all statistical tests performed in this study. As in Peters et al. (1998), masked SRTs are reported in terms of signal-to-noise ratio.
Normal Developmental Effects
The mean masked SRTs are shown in Figure 2. The mean SRT in the steady masker and the mean masking release for the various modulation conditions are summarized in Table 3. Adults and children with normal hearing showed masking release for temporal and for spectral modulation, and the most masking release was seen for combined temporal/spectral modulation (see Figure 2, left panel, and Table 3). This pattern of results is similar to that shown by Peters et al. (1998) for adults with normal hearing. As shown in the figure, children with normal hearing demonstrated higher masked thresholds than adults.
Table 3.
Hearing status | Group | Baseline SRT (dB) | Spectral masking release (dB) | Temporal masking release (dB) | Spectral & Temporal masking release (dB) |
---|---|---|---|---|---|
Normal-hearing | Younger Child | −2.3 (1.8) | 6.3 (0.0) | 2.5 (1.6) | 8.5 (1.6) |
Normal-hearing | Child | −3.0 (1.8) | 5.9 (1.9) | 3.7 (1.5) | 8.7 (1.7) |
Normal-hearing | Adult | −5.3 (1.9) | 6.6 (1.7) | 4.9 (1.3) | 11.2 (2.1) |
Normal-hearing | Adult Supplemental | +0.1 (1.4) | 6.15 (2.7) | 3.96 (2.0) | 9.03 (1.0) |
Hearing-impaired | Child | −1.1 (1.9) | 2.6 (1.5) | 0.1 (1.1) | 2.2 (1.2) |
Hearing-impaired | Adult | −2.7 (2.1) | 2.8 (2.3) | 1.2 (2.2) | 4.1 (2.8) |
In the first analysis comparing adults and children with normal hearing, an analysis of variance (ANOVA) was performed to examine age effects for the baseline condition, in which the masker was steady speech-shaped noise. This analysis included two groups of children (younger and older) and adults. The analysis indicated a significant effect of group (F2,28=7.3; p<0.003). Post-hoc (Tukey HSD) testing indicated that the adults had significantly lower masked SRTs than both the younger (p=0.003) and older (p=0.026) children, but that the difference between the younger and older children was not significant (p=0.586). Figure 3 shows a plot of the masked SRTs for the baseline condition as a function of child age. The figure shows a trend for lower thresholds with increasing age, but the correlation did not reach statistical significance (p=0.067, one tailed).
A repeated-measures ANOVA was performed to examine possible group differences in masking release. Masking release was defined as the baseline (steady SSN) threshold minus the threshold for one of the three modulation conditions. Again, the analysis included the two groups of children and the adults. This analysis showed a significant effect of masking-release condition (F2,56=210.4; p<0.001), a significant effect of age group (F2,28=5.0; p=0.014), and a significant interaction between masking-release condition and age group (F4,56=4.6; p=0.003).
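The definition of masking release is simple arithmetic, sketched below with illustrative function names. The usage example takes the rounded group means for the normal-hearing adults in Table 3 (baseline SRT of −5.3 dB, combined temporal/spectral masking release of 11.2 dB) and recovers the implied mean SRT in the combined condition. Because the inputs are rounded means, the result is approximate.

```python
def masking_release(baseline_srt_db, modulated_srt_db):
    """Masking release (dB): baseline SRT minus SRT in a modulated masker."""
    return baseline_srt_db - modulated_srt_db


def modulated_srt(baseline_srt_db, release_db):
    """Invert the definition: SRT in the modulated masker implied by a release."""
    return baseline_srt_db - release_db


if __name__ == "__main__":
    # Rounded means for normal-hearing adults, from Table 3.
    combined = modulated_srt(-5.3, 11.2)
    print(f"Implied SRT, combined modulation: {combined:.1f} dB SNR")
    print(f"Round trip release: {masking_release(-5.3, combined):.1f} dB")
```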
Tests of simple effects (Kirk, 1968) were performed to determine the source of the significant interaction. Simple effects tests examine the effect of one independent variable (e.g., age group) within one level of a second independent variable (e.g., the configuration of masker modulation). This showed the following:
For temporal modulation, adults had significantly more masking release than the younger children (p<0.001) but did not differ significantly from the older children (p=0.055). The two child groups did not differ significantly from each other (p=0.078).
For spectral modulation there were no significant differences between any groups (p=0.287 or greater).
For combined temporal/spectral modulation, adults had larger masking release than both the younger (p=0.003) and older (p=0.004) children, but the two child groups did not differ significantly from each other (p=0.897).
Effects Related to Hearing Impairment and Development
Compared to the adults with normal hearing, the listeners with hearing impairment had higher thresholds and reduced masking release (see Figure 2 and Table 3). Statistical analyses were performed to evaluate the effects of hearing loss and age in the present experiment. The data of the normal-hearing children who were matched in age to the children with hearing impairment (the older group of children with normal hearing) were used for the statistical tests reported below. The first analysis was an ANOVA comparing the steady SSN baseline condition across age (adult versus child) and hearing impairment (normal hearing versus impaired hearing). The adults had significantly lower thresholds than children (F1,34=9.2; p=0.005), and listeners with normal hearing had significantly lower thresholds than listeners with hearing impairment (F1,34=13.4; p=0.001). The interaction between age and hearing impairment was not significant (F1,34=0.26; p=0.610).
A repeated-measures ANOVA was also performed to examine the effects of age and hearing impairment on masking release. This analysis had a within-subjects factor of masking-release condition, and between-subjects factors of age group and hearing impairment. This analysis showed a significant effect of masking-release condition (F2,68=124.1; p<0.001), a significant interaction between masking-release condition and hearing loss (F2,68=24.3; p<0.001), and a significant interaction between masking-release condition and age (F2,68=6.4; p=0.003). The three-way interaction among masking condition, hearing impairment and age was not significant (F2,68=0.07; p=0.931). For between-subjects effects, there was a significant effect of hearing loss (F1,34=79.5; p<0.001) and a significant effect of age (F1,34=6.1; p=0.019). The interaction between age and hearing impairment was not significant (F1,34=0.22; p=0.645).
As noted above, the interactions between masking-release condition and hearing impairment and between masking-release condition and age were significant. Tests of simple effects (Kirk, 1968) were therefore performed to determine the sources of the interactions. We first consider the significant interaction between masking-release condition and hearing impairment. This indicated the following:
For the listeners with normal hearing, all masking-release conditions differed significantly from each other (p<0.001), with combined temporal/spectral modulation associated with the greatest masking release and temporal modulation associated with the smallest masking release.
For the listeners with hearing loss, masking release for spectral modulation did not differ from the masking release for combined temporal/spectral modulation (p=0.17). The other masking-release conditions differed significantly from each other: masking release for spectral modulation was greater than for temporal modulation (p<0.001); and masking release for combined temporal/spectral modulation was greater than for temporal modulation (p<0.001).
The listeners with hearing impairment had smaller amounts of masking release than listeners with normal hearing for all three masking-release conditions (p<0.001).
We now consider simple effects for the significant interaction between masking-release condition and age. This indicated that the children did not differ significantly from adults for spectral modulation (p=0.439), but had significantly smaller masking release than adults for temporal modulation (p=0.030) and for combined temporal/spectral modulation (p=0.002).
Finally, analyses were performed that included only the listeners with hearing impairment. Examination of the right panel of Figure 2 shows that the threshold for the temporal modulation condition was only slightly better than for the baseline steady SSN condition. Therefore one question of interest was whether the listeners with hearing impairment attained a significant masking release for temporal modulation. A repeated-measures ANOVA comparing the steady SSN baseline threshold to the threshold in temporally modulated noise indicated no significant difference (F1,15=2.3; p=0.149) and no interaction with age (F1,15=1.5; p=0.239). An additional question of interest was whether, despite the lack of a significant effect for temporal modulation, there was nevertheless evidence of an interaction such that the masking release for combined temporal/spectral modulation was greater than for spectral modulation alone. This was examined with a repeated-measures ANOVA comparing masking release for spectral modulation to masking release for combined temporal/spectral modulation. This showed no significant effect of condition (F1,15=1.4; p=0.248) but a significant interaction between condition and age (F1,15=5.39; p=0.035). Simple effects testing indicated that the interaction was due to the fact that the children with hearing impairment showed no difference between spectral masking release and the combined temporal/spectral masking release (p=0.453), but the adults with hearing impairment showed more masking release for combined temporal/spectral modulation than for spectral modulation (p=0.021). The overall effect of group was not significant (F1,15=1.28; p=0.275).
INTERIM CONSIDERATION OF FACTORS RELATED TO SIGNAL-TO-NOISE RATIO
Bernstein and Grant (2009) recently speculated that the reduced masking release for speech in temporally modulated noise shown by hearing-impaired listeners may be related to the signal-to-noise ratio associated with the baseline (steady noise) condition. This was based in part on a previous filtered speech study by Oxenham and Simonson (2009), indicating that normal-hearing listeners showed relatively small benefit from masker fluctuation for filtered speech conditions where the signal-to-noise ratio at threshold was relatively high in the baseline condition. It has been suggested that the effect of signal-to-noise ratio on masking release may be due to the non-uniform distribution of speech cues as a function of intensity (Bernstein & Brungart 2011; Bernstein & Grant 2009; Freyman et al. 2008).
As noted previously, the normal-hearing children and both of the hearing-impaired groups in the present study achieved baseline condition thresholds at signal-to-noise ratios that were higher than those of the normal-hearing adults. We investigated supplementary conditions on normal-hearing adults in order to examine what effect a higher signal-to-noise ratio in the baseline condition would have on the masking release obtained in the modulation conditions. This was done by repeating the conditions of the main experiment, but using a threshold estimation procedure that converged on a higher percent correct point on the psychometric function. Recall that in the main experiment the speech presentation level was increased by 2 dB if one or more words were missed on a sentence, and the level was reduced by 2 dB if all keywords were correctly identified. In the supplementary conditions, the speech presentation level was increased by 2 dB if one or more words were missed and the level was reduced by 2 dB if all keywords were correctly identified on two consecutive trials. Ten new adults with normal hearing participated.
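The difference between the two tracking rules can be sketched with the standard equilibrium argument from Levitt (1971). The sketch assumes independent trials and writes p for the probability that all keywords in a sentence are repeated correctly at the tracked level; the symbol p is introduced here for illustration only. A track settles at the level where a downward step is as likely as an upward step:

```latex
\begin{align*}
\text{Main experiment (down after 1 correct):}\quad
  & p = 1 - p \;\Rightarrow\; p = 0.5 \\
\text{Supplementary conditions (down after 2 consecutive correct):}\quad
  & p^{2} = 1 - p^{2} \;\Rightarrow\; p = \sqrt{0.5} \approx 0.707
\end{align*}
```

Under these assumptions, the supplementary rule tracks roughly the 71% point for whole-sentence correct rather than the 50% point, which is why it yields a higher signal-to-noise ratio at threshold.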
Figure 4 shows the results of the supplementary conditions along with the results of the normal-hearing and hearing-impaired adults in the main experiment for comparison (see also Table 3). It can be seen that the modified threshold procedure was successful in raising the signal-to-noise ratio at threshold for the baseline condition. In fact, the average threshold obtained was higher than that obtained for the normal-hearing children and for either group with hearing impairment. Statistical tests were performed comparing the data of the adults with normal hearing from the main and supplementary conditions. A t-test indicated that the adults in the supplementary conditions had higher baseline thresholds than the adults tested in the main experiment (t18=7.1; p<0.001). A repeated-measures ANOVA examining masking release for temporal, spectral, and combined temporal/spectral modulation indicated a significant effect of condition (F2,36=85.0; p<0.001). However, the effects of group (F1,18=3.1; p=0.094) and interaction between masking-release condition and group (F2,36=1.9; p=0.168) did not reach statistical significance. Thus even though the procedure used in the supplementary experiment resulted in baseline signal-to-noise ratios that were higher than those of the normal-hearing children and the hearing-impaired groups in the main experiment, the supplementary conditions did not show the significant reduction in masking release that occurred in these groups. The fact that masking release was not significantly reduced in the supplementary conditions would tend to undermine an interpretation that effects related to signal-to-noise ratio account for the relatively large reduction in masking release shown by the listeners with hearing impairment. 
That is, the adults tested in the supplementary conditions did not show a significant decrease in masking release, while the listeners with hearing impairment did not even show a significant masking release for temporal modulation. We also performed a repeated-measures ANOVA comparing masking release for listeners with normal hearing in the supplementary conditions to the adult listeners with impaired hearing in the main experiment. This showed significant effects of condition (F2,34=35.2; p<0.001) and group (F1,17=18.8; p<0.001), and no significant interaction (F2,34=2.7; p=0.08). This result supports the idea that the reduced masking release found in the listeners with impaired hearing cannot be accounted for entirely in terms of a signal-to-noise ratio factor. Nevertheless, it should be pointed out that the supplementary results showed a trend for reduced masking release (see Table 3) and we cannot rule out some influence of signal-to-noise ratio on the reduced masking release demonstrated by the listeners with hearing impairment.
It is also possible that signal-to-noise ratio influenced the reduced masking release demonstrated by children with normal hearing in temporally modulated noise. Compared to the adults with normal hearing, the younger children with normal hearing showed more modest reductions in masking release than the groups with hearing impairment. As indicated above, the adults in the supplementary conditions did not show significantly reduced masking release compared to the adults in the main conditions. This tends to weigh against the idea that the younger children with normal hearing had reduced temporal masking release solely because of effects related to signal-to-noise ratio. Evidence for this would be even stronger if the children showed a reduction in masking release when compared to the adults tested in the supplementary conditions. Analyses were conducted to examine this issue, comparing the adults in the supplementary conditions to the younger children in the main experiment. A t-test indicated that the adults in the supplementary conditions had higher baseline thresholds than the young children tested in the main experiment (t18=3.3; p=0.005). A repeated-measures ANOVA examining masking release for temporal, spectral, or combined temporal/spectral modulation indicated a significant effect of condition (F2,36=81.5; p<0.001). However, the effects of group (F1,18=1.0; p=0.333) and interaction between masking-release condition and group (F2,36=1.9; p=0.167) did not reach statistical significance. The fact that the masking release of the adults in the supplementary condition did not differ from that of the children in this analysis could be seen as consistent with the idea that effects related to signal-to-noise ratio could have contributed to the smaller masking release shown by the normal-hearing children for temporal modulation.
In summary, the supplementary data indicate that the reduced masking release found for listeners with hearing impairment cannot be accounted for entirely in terms of signal-to-noise ratio. However, we cannot rule out the possibility that considerations related to signal-to-noise ratio contributed to the reduced masking release found in the present study for some groups and conditions. It should also be noted that the present study examined the signal-to-noise ratio issue with only the single approach of using a threshold procedure that resulted in a relatively high signal-to-noise ratio in the baseline condition. It may be informative to investigate other approaches.
DISCUSSION
Adults and Children with Normal Hearing
With regard to the baseline condition of steady speech-shaped noise, both the younger and older children had higher masked SRTs than the adults, but the younger and older child groups did not differ significantly from each other. Recall that the correlation between the masked SRTs and age among children with normal hearing was not significant using a one-tailed test (p=0.07). It is somewhat difficult to compare this result to previous studies due to methodological differences, but some previous developmental studies indicate that masked speech recognition thresholds improve significantly over the age range of 6–10 years. For example, this has been reported for spondees in speech-shaped noise (Hall et al. 2002) and for HINT sentences (Nilsson et al. 1994) in a broadband noise having a flat spectral shape (Stuart 2008). The norms for the BKB Speech in Noise test (Etymotic Research 2005), which uses a multi-talker masker, indicate that masked SRTs improve by about 2.6 dB from age 6 to 10 years. In the current study, using a speech-shaped noise masker, the trend was for the masked SRTs to improve by about 1.6 dB over this age range.
Across the conditions tested here, a general finding was that adults and children showed broadly similar patterns, with children having approximately 3–5 dB higher masked SRTs. In terms of the detail of the pattern, both child groups benefited from spectral modulation as much as adults, but they benefited less for combined temporal/spectral modulation. The younger group of children also benefited less than adults for temporal modulation. Both children and adults with normal hearing achieved more masking release in the combined masking-release condition than the better of the spectral-only or temporal-only masking-release conditions. Thus both children and adults benefited from combined masking-release cues, with this benefit being less for children. It is possible that the smaller masking release that children showed for combined temporal/spectral modulation is related to the complexities of the tasks. For example, the ability to piece together speech fragments at the improved signal-to-noise ratios available in either temporal gaps or in spectral gaps could be considered less complex than the ability to piece together speech fragments that are distributed across both spectral and temporal gaps. The present results suggest that the ability to combine speech cues that are distributed across temporal and spectral gaps may not be completely developed in children for the age range examined here.
The present findings of a developmental effect for masking release related to temporal modulation can be compared with previous results of Stuart (2008), who measured word recognition for sentences presented in steady and temporally modulated broadband noise. Stuart (2008) measured performance in five groups of children (6–7, 8–9, 10–11, 12–13, and 14–15 years) and in adults. Although the results of Stuart’s study showed no significant effect of age group for masking release, it may be relevant to note that the p-value approached statistical significance (p=0.057) and that there was a general trend for the masking release for temporal modulation to increase with increasing age among the children. The present results indicated that release from masking for temporal modulation was significantly larger in adults than in children ranging in age from 4.6–6.9 years, but not for the children ranging in age from 7.3–11.1 years. Note that some of the children in the present study were younger than the youngest child tested by Stuart. Because of this and because of the similarity in trends between the two studies, it seems reasonable to conclude that the Stuart (2008) study and the present study are not in conflict with each other in terms of the developmental trends related to temporal modulation masking release.
One question of interest is whether the reduced masking release for temporal modulation found here for the younger children is due to relatively poor temporal resolution, conceptualized in terms of a temporal window with a relatively long time constant. Previous results from Hall and Grose (1994) would tend to argue against such an account. Hall and Grose, using a psychophysical temporal modulation transfer function method (Viemeister 1979), found that although children were less sensitive to modulation than adults, there was no developmental effect for the time constant estimated from the decrease in sensitivity as a function of increasing modulation frequency. It is possible that the reduced masking release in temporally modulated noise shown here by the younger children is related to poor efficiency in taking advantage of the improved signal-to-noise ratio in temporal gaps rather than to a prolonged time constant for temporal processing. One possible interpretation is that younger children are relatively poor at piecing together a speech signal from cues that are distributed across temporal epochs where the signal-to-noise ratio is relatively high. The supplemental conditions of the present study also raise the possibility that the smaller masking release for temporal modulation found for young children may be related to the relatively high signal-to-noise ratio associated with their baseline masked thresholds.
The present results indicated that adults and children with normal hearing had similar abilities to take advantage of the spectral gaps in the masking noise used here. This finding is consistent with an interpretation that frequency selectivity was similar across the ages tested, a conclusion that is in accord with previous psychoacoustical tests of frequency selectivity for listeners in this age range (Allen et al. 1989; Hall & Grose 1991). It should be noted, however, that adult-like frequency selectivity in children need not necessarily translate to adult-like performance for speech recognition in spectrally modulated noise. Whereas a frequency selectivity task often requires the detection of a fixed-frequency tone in a masker background, the speech in spectrally modulated noise task requires the listener to piece together a speech signal from multiple, spectrally-distributed regions where the signal-to-noise ratio is relatively high. Therefore, the present spectral modulation task could be considered to be more complex than typical tonal detection tasks measuring frequency selectivity.
The present results suggest that the ability to combine speech fragments that are separated in frequency is adult-like in the children tested here. This result is consistent with findings reported previously by Mlot et al. (2009). That study examined the perception of filtered speech in the absence of a masker in adults and children aged 6–14 years. In the first stage of testing, sentences were band pass filtered around either 500 or 2500 Hz, and the bandwidth of the filter was varied to determine the bandwidth necessary for approximately 20% correct speech recognition. In a second stage of testing, these bandwidths were presented with either one band or both bands present. The main finding was that both children and adults demonstrated a similar increase in performance when the two bands were presented together. It may be of interest for future research to determine whether it is a general finding that children develop the ability to combine spectrally separated speech fragments earlier than the ability to combine speech fragments that are separated in time.
Adults and Children with Hearing Impairment
As is apparent in Figure 2, adults and children with hearing impairment had similar patterns of results. Both groups clearly showed less masking release in comparison to listeners with normal hearing. This finding is in agreement with past studies on adults with hearing impairment showing poor masking release for temporal modulation (Eisenberg et al. 1995; Festen & Plomp 1990; George et al. 2006; Peters et al. 1998; Takahashi & Bacon 1992; Wilson & Carhart 1969), and with the study of Peters et al. (1998) showing reduced masking release for spectral and temporal/spectral modulation. As noted in the results section, neither the children nor adults with hearing impairment showed a statistically significant release from masking in the temporal modulation condition. An interesting finding, however, was that the adults with hearing impairment showed a larger masking release when temporal/spectral modulation was present compared to the condition where only spectral modulation was present. In contrast, the children with hearing impairment did not show a larger masking release when both temporal and spectral modulation were present compared to the spectral modulation condition. This result is compatible with the findings in the listeners with normal hearing, where children were less able than adults to benefit from combined temporal/spectral modulation.
Comparing across all groups tested, it is clear that adults with normal hearing had the best performance of all. Interestingly, the children with normal hearing and the adults with hearing loss performed very nearly the same in the steady-noise baseline masker (see Figure 2). Here, the disadvantage associated with being a child listener with normal hearing was nearly the same as that associated with being an adult having mild/moderate sensorineural hearing impairment. However, in the modulation conditions, the adults with hearing impairment, on average, performed more poorly than the children with normal hearing.
The present finding of no interaction between age and hearing loss did not provide support for the hypothesis that poor masking release in children with hearing impairment might arise because of effects related to reduced quality of speech cues during auditory development. Nevertheless, children with hearing impairment showed the poorest overall performance of the four groups tested here. An interesting remaining question is whether the failure to obtain a significant interaction may have been influenced by hearing aid history. All of the children with hearing impairment had at least three years' experience with hearing aids that had been fitted by experienced pediatric audiologists. It is possible that this factor could have mitigated effects related to reduced quality of speech cues, and that different results would occur for children who had not been appropriately fitted with hearing aids. It should also be noted that the conditions examined here were by no means exhaustive, and that the results do not rule out the possibility that an interaction between age and hearing loss may be evident in other approaches investigating the perception of speech in noise.
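To illustrate how one group can perform most poorly without any interaction being present, the following sketch computes a simple difference-of-differences contrast for a 2 x 2 table of group means (age by hearing status). It is a descriptive illustration only and is not the statistical analysis used in this study. The function name and all cell means are hypothetical and are not data from this study.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 (age x hearing status) table of means."""
    # Effect of hearing impairment (HI minus NH) within each age group.
    hi_effect_children = cell_means[("child", "HI")] - cell_means[("child", "NH")]
    hi_effect_adults = cell_means[("adult", "HI")] - cell_means[("adult", "NH")]
    # A value near zero means the two effects simply add.
    return hi_effect_children - hi_effect_adults


# Hypothetical mean SRTs in dB SNR (NOT data from this study).
means = {
    ("adult", "NH"): -6.0,
    ("adult", "HI"): -3.0,
    ("child", "NH"): -3.0,
    ("child", "HI"): 0.0,
}

contrast = interaction_contrast(means)
poorest_group = max(means, key=means.get)  # highest SRT = poorest performance

print(contrast)
print(poorest_group)
```

In this hypothetical table the children with hearing impairment have the highest (poorest) SRT, yet the contrast is zero because the hearing-impairment effect is the same size in both age groups. This is the sense in which the poorest performance of one group can be consistent with purely additive effects of age and hearing impairment.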
One practical implication of the present results is related to the idea that children with hearing impairment are at a particular disadvantage in challenging listening environments. It has previously been noted that the relatively poor signal-to-noise ratios associated with many grade school classrooms can present listening problems for both normal-hearing and hearing-impaired children (e.g., Crandell 1993; Finitzo-Hieber & Tillman 1978; Hicks & Tharpe 2002). The present findings support the notion that such conditions are likely to be particularly troublesome for children with hearing impairment. The findings also support the view, already widely held among pediatric audiologists, that it is important to provide hearing devices, including FM systems (Hawkins 1984), designed to enhance signal-to-noise ratios in educational settings for children with hearing impairment.
CONCLUSIONS
Children with normal hearing had higher masked SRTs in speech-shaped noise than adults with normal hearing. Their masking release did not differ significantly from the masking release of adults for spectral modulation, but the younger child group had less masking release than adults for temporal modulation, and both the younger and older groups of children had smaller masking release than adults for combined temporal/spectral modulation. These results are consistent with an interpretation that children in the age range tested here may have some difficulty combining speech information that is distributed across temporal and spectral gaps in the masker.
Hearing impairment in children and adults was associated with increased SRTs in speech-shaped noise. Hearing impairment was also associated with reduced masking release for temporal, spectral, and combined temporal/spectral modulation. Whereas adults with hearing impairment showed more masking release in the combined temporal/spectral masking-release condition than the spectral masking-release condition, the masking release for the children with hearing impairment did not differ between the spectral and combined temporal/spectral masking-release condition.
Although the masked SRTs in this study were poorest for the children with hearing impairment, the interaction between age and hearing loss was not significant. This finding does not support the hypothesis that the ability to piece together speech fragments is particularly poor in children with hearing impairment because of developmental delays caused by poor quality of speech cues. The outcome instead suggests that the results of the hearing-impaired children can be accounted for by the combination of general hearing impairment and age effects, with no need to appeal to additional factors associated with reduced auditory input during development. It is possible that experience with optimally fitted hearing aids could have mitigated any effects related to early hearing loss during auditory development.
The results of the supplementary conditions of the present study failed to support the idea that the reduced masking release found here for listeners with hearing impairment can be fully accounted for in terms of the higher signal-to-noise ratios associated with thresholds in the baseline condition. However, we cannot rule out the possibility that some of the reduced masking release found in the present study for listeners with hearing impairment and for young listeners with normal hearing was related to higher signal-to-noise ratios associated with masked thresholds in the baseline condition.
ACKNOWLEDGMENTS
We thank section editor Marjorie Leek and two anonymous reviewers for helpful comments. Joshua Bernstein provided useful suggestions on a previous version of this manuscript. This work was supported by NIH NIDCD grant R01 00397.
REFERENCES
- Allen P, Wightman F, Kistler D, et al. Frequency resolution in children. J Speech Hear Res. 1989;32:317–322. doi: 10.1044/jshr.3202.317.
- Assmann PF, Summerfield AQ. The perception of speech under adverse conditions. In: Greenberg S, Ainsworth WA, Popper AN, Fay RR, editors. Speech Processing in the Auditory System. Vol. 18. New York: Springer Verlag; 2004.
- Bench J, Kowal A, Bamford J. The BKB (Bamford-Kowal-Bench) sentence lists for partially-hearing children. Br J Audiol. 1979;13:108–112. doi: 10.3109/03005367909078884.
- Bernstein JG, Brungart DS. The effect of spectral and temporal-fine structure distortions on the fluctuating-masker benefit for speech at fixed signal-to-noise ratio. J Acoust Soc Am. 2011 (in press). doi: 10.1121/1.3589440.
- Bernstein JG, Grant KW. Auditory and auditory-visual intelligibility of speech in fluctuating maskers for normal-hearing and hearing-impaired listeners. J Acoust Soc Am. 2009;125:3358–3372. doi: 10.1121/1.3110132.
- Buss E, Hall JW, Grose JH. Spectral integration of synchronous and asynchronous cues to consonant identification. J Acoust Soc Am. 2004;115:2278–2285. doi: 10.1121/1.1691035.
- Cooke M. A glimpsing model of speech perception in noise. J Acoust Soc Am. 2006;119:1562–1573. doi: 10.1121/1.2166600.
- Crandell CC. Speech recognition in noise by children with minimal degrees of sensorineural hearing loss. Ear Hear. 1993;14:210–216. doi: 10.1097/00003446-199306000-00008.
- Eisenberg LS, Dirks DD, Bell TS. Speech recognition in amplitude-modulated noise of listeners with normal and listeners with impaired hearing. J Speech Hear Res. 1995;38:222–233. doi: 10.1044/jshr.3801.222.
- Etymotic Research. Bamford-Kowal-Bench Speech in Noise Test (Version 1.03). Elk Grove Village, IL: Etymotic Research; 2005. Audio CD.
- Festen JM, Plomp R. Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. J Acoust Soc Am. 1990;88:1725–1736. doi: 10.1121/1.400247.
- Finitzo-Hieber T, Tillman TW. Room acoustics effects on monosyllabic word discrimination ability for normal and hearing-impaired children. J Speech Hear Res. 1978;21:440–458. doi: 10.1044/jshr.2103.440.
- Freyman RL, Balakrishnan U, Helfer KS. Spatial release from masking with noise-vocoded speech. J Acoust Soc Am. 2008;124:1627–1637. doi: 10.1121/1.2951964.
- George EL, Festen JM, Houtgast T. Factors affecting masking release for speech in modulated noise for normal-hearing and hearing-impaired listeners. J Acoust Soc Am. 2006;120:2295–2311. doi: 10.1121/1.2266530.
- Glasberg BR, Moore BCJ. Derivation of auditory filter shapes from notched-noise data. Hear Res. 1990;47:103–138. doi: 10.1016/0378-5955(90)90170-t.
- Hall JW, Buss E, Grose JH. Informational masking release in children and adults. J Acoust Soc Am. 2005;118:1605–1613. doi: 10.1121/1.1992675.
- Hall JW, Buss E, Grose JH. Comodulation detection differences in children and adults. J Acoust Soc Am. 2008a;123:2213–2219. doi: 10.1121/1.2839006.
- Hall JW, Buss E, Grose JH. The effect of hearing impairment on the identification of speech that is modulated synchronously or asynchronously across frequency. J Acoust Soc Am. 2008b;123:955–962. doi: 10.1121/1.2821967.
- Hall JW, Grose JH. Notched-noise measures of frequency selectivity in adults and children using fixed-masker-level and fixed-signal-level presentation. J Speech Hear Res. 1991;34:651–660. doi: 10.1044/jshr.3403.651.
- Hall JW, Grose JH. Development of temporal resolution in children as measured by the temporal modulation transfer function. J Acoust Soc Am. 1994;96:150–154. doi: 10.1121/1.410474.
- Hall JW, Grose JH, Buss E, et al. Spondee recognition in a two-talker masker and a speech-shaped noise masker in adults and children. Ear Hear. 2002;23:159–165. doi: 10.1097/00003446-200204000-00008.
- Hawkins DB. Comparisons of speech recognition in noise by mildly-to-moderately hearing-impaired children using hearing aids and FM systems. J Speech Hear Disord. 1984;49:409–418. doi: 10.1044/jshd.4904.409.
- Hicks CB, Tharpe AM. Listening effort and fatigue in school-age children with and without hearing loss. J Speech Lang Hear Res. 2002;45:573–584. doi: 10.1044/1092-4388(2002/046).
- Howard-Jones PA, Rosen S. Uncomodulated glimpsing in "checkerboard" noise. J Acoust Soc Am. 1993;93:2915–2922. doi: 10.1121/1.405811.
- Miller GA, Licklider JCR. The intelligibility of interrupted speech. J Acoust Soc Am. 1950;22:167–173.
- Mlot S, Buss E, Hall JW. Spectral integration and bandwidth effects on speech recognition in school-aged children and adults. Ear Hear. 2009;31:56–62. doi: 10.1097/AUD.0b013e3181ba746b.
- Moore BCJ, Glasberg BR. Comparisons of frequency selectivity in simultaneous and forward masking for subjects with unilateral cochlear impairments. J Acoust Soc Am. 1986;80:93–107. doi: 10.1121/1.394087.
- Nilsson M, Soli SD, Sullivan JA. Development of the Hearing in Noise Test for the measurement of speech reception thresholds in quiet and in noise. J Acoust Soc Am. 1994;95:1085–1099. doi: 10.1121/1.408469.
- Oxenham AJ, Simonson AM. Masking release for low- and high-pass-filtered speech in the presence of noise and single-talker interference. J Acoust Soc Am. 2009;125:457–468. doi: 10.1121/1.3021299.
- Peters RW, Moore BC, Baer T. Speech reception thresholds in noise with and without spectral and temporal dips for hearing-impaired and normally hearing people. J Acoust Soc Am. 1998;103:577–587. doi: 10.1121/1.421128.
- Pick G, Evans EF, Wilson JP. Frequency resolution in patients with hearing loss of cochlear origin. In: Evans EF, Wilson JP, editors. Psychophysics and Physiology of Hearing. London: Academic Press; 1977.
- Stuart A. Reception thresholds for sentences in quiet, continuous noise, and interrupted noise in school-age children. J Am Acad Audiol. 2008;19:135–146. doi: 10.3766/jaaa.19.2.4.
- Stuart A, Givens GD, Walker LJ, et al. Auditory temporal resolution in normal-hearing preschool children revealed by word recognition in continuous and interrupted noise. J Acoust Soc Am. 2006;119:1946–1949. doi: 10.1121/1.2178700.
- Takahashi GA, Bacon SP. Modulation detection, modulation masking, and speech understanding in noise in the elderly. J Speech Hear Res. 1992;35:1410–1421. doi: 10.1044/jshr.3506.1410.
- Tyler RS, Hall JW, Glasberg BR, et al. Auditory filter asymmetry in the hearing impaired. J Acoust Soc Am. 1984;76:1363–1376. doi: 10.1121/1.391452.
- Uchanski RM, Geers AE, Protopapas A. Intelligibility of modified speech for young listeners with normal and impaired hearing. J Speech Lang Hear Res. 2002;45:1027–1038. doi: 10.1044/1092-4388(2002/083).
- Viemeister NF. Temporal modulation transfer functions based upon modulation thresholds. J Acoust Soc Am. 1979;66:1364–1380. doi: 10.1121/1.383531.
- Wightman FL, Callahan MR, Lutfi RA, et al. Children's detection of pure-tone signals: informational masking with contralateral maskers. J Acoust Soc Am. 2003;113:3297–3305. doi: 10.1121/1.1570443.
- Wilson RH, Carhart R. Influence of pulsed masking on the threshold for spondees. J Acoust Soc Am. 1969;46:998–1010. doi: 10.1121/1.1911820.
- Zwicker E, Schorn K. Psychoacoustical tuning curves in audiology. Audiology. 1978;17:120–140. doi: 10.3109/00206097809080039.