The Journal of the Acoustical Society of America. 2016 Jul 12;140(1):EL73–EL78. doi: 10.1121/1.4954386

Release from masking for small spatial separations: Effects of age and hearing loss a)

Nirmal Kumar Srinivasan 1, Kasey M Jakien 1, Frederick J Gallun 1
PMCID: PMC5392088  PMID: 27475216

Abstract

Spatially separating target and masking speech can result in substantial spatial release from masking (SRM) for normal-hearing listeners. In this study, SRM was examined at eight spatial configurations of azimuth angle: maskers co-located with the target (0°) or symmetrically separated by 2°, 4°, 6°, 8°, 10°, 15°, or 30°. Results revealed that different listening groups (young normal-hearing, older normal-hearing, and older hearing-impaired) required different minimum amounts of spatial separation between target and maskers to achieve SRM. The results also indicated that aging was the contributing factor predicting SRM at smaller separations, whereas hearing loss was the contributing factor at larger separations.

1. Introduction

It is well documented that spatially separating target speech from interfering speech results in significant spatial release from masking (SRM) for normal-hearing (NH) listeners, thereby improving speech intelligibility. This study investigated the effects of age and hearing loss on SRM for very small spatial separations between target and masking speech. Our goal was to characterize the separate functions relating threshold to spatial separation for three listener groups: young normal-hearing (YNH), older normal-hearing (ONH), and older hearing-impaired (OHI).

Traditional models that predict SRM based on interaural differences in level (ILDs) and time (ITDs) are generally successful in predicting the intelligibility of target speech when target and maskers are spatially separated in the azimuthal plane (Zurek, 1993; Bronkhorst, 2015). Recent studies of speech-on-speech masking have concluded that SRM is a combination of better-ear listening and binaural analysis (Kidd et al., 1998; Freyman et al., 1999; Arbogast et al., 2002; Best et al., 2006; Marrone et al., 2008b). NH listeners can achieve more SRM than hearing-impaired (HI) listeners (e.g., Glyde et al., 2013). Marrone et al. (2008a) described the relationship between SRM and spatial separation in azimuth for NH listeners using a filter-like function and concluded that most SRM occurs within the first 15° between target and maskers, with the full benefit for most listeners being achieved by 45°. However, the researchers did not examine release at smaller separations than 15°, or with older or HI listeners.

One reason to examine the effects of small separations in older listeners in particular is that Gallun et al. (2013) showed that older individuals often achieve less SRM than younger individuals, regardless of hearing status. Füllgrabe et al. (2015) concluded that reductions in SRM could be attributed to cognitive changes related to aging rather than age-related loss of binaural sensitivity per se. On the other hand, Whitmer et al. (2014) examined auditory source width sensitivity through headphones and concluded that OHI listeners exhibited a decreased sensitivity to interaural coherence, which the authors interpreted as the inability of OHI listeners to accurately process binaural timing information. This disagreement in the literature motivates further examination of SRM in older listeners with and without hearing impairment. Furthermore, the finding that perceived source width varies less with interaural correlation for OHI listeners (Whitmer et al., 2014) suggests that using smaller separations to examine SRM may reveal age effects not observed at larger separations. Specifically, it is hypothesized that OHI listeners would require larger angles of separation between the target and maskers to obtain SRM. It is also possible that the shape of the function that Marrone et al. (2008a) described could be different for older or HI listeners based on the findings of Gallun et al. (2013). Were that the case, it would be inappropriate to use the findings of Marrone et al. (2008a) to draw conclusions about what is a sufficient spatial separation for good communication across all listeners. It is also important to understand how listeners of different ages and hearing capabilities achieve SRM with small separations because the division between target and maskers in everyday listening environments can be quite small.

2. Methods

2.1. Listeners

Three listener groups were recruited based on age and hearing status. The YNH individuals (n = 10; male = 6, female = 4) had audiometric thresholds of ≤20 dB hearing level (HL) (HL re: ANSI, 2004) at all octave frequencies between 250 Hz and 4 kHz, whereas the ONH (n = 14; male = 8, female = 6) individuals had thresholds of 20 dB HL or better up to 2 kHz, but up to 25 dB HL at 4 kHz. OHI (n = 12; male = 9, female = 3) listeners had thresholds between 10 dB HL and 40 dB HL at frequencies up to 2 kHz and thresholds of 20 to 60 dB HL at 4 kHz. At 8 kHz, the average threshold for the YNH group was 10 dB HL, for the ONH group was 30 dB HL, and for the OHI group was 50 dB HL. The average audiometric thresholds at different octave frequencies and corresponding ranges are shown in Table 1. Tympanometry was performed to rule out middle ear abnormalities and no more than one air-bone gap greater than 10 dB was present at octave frequencies from 500 Hz to 4 kHz. All listeners were in good health with no history of otological disorders. Also, all participants had scores of 24 or higher on the Mini-Mental State Examination (Folstein et al., 1975) to rule out dementia or any other cognitive impairments. None of the individuals in the OHI group used hearing aids for everyday listening.

Table 1.

Average audiometric thresholds at different octave frequencies and corresponding ranges for the three listener groups.

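Left ear, audiometric threshold (dB HL)
Group | Statistic | 250 Hz | 500 Hz | 1 kHz | 2 kHz | 4 kHz | 8 kHz
YNH | Mean | 4.5 | 5 | 6.5 | 7 | 7.5 | 10
YNH | Range | 0–15 | 0–15 | −5–15 | −5–10 | 0–20 | 0–20
ONH | Mean | 8.57 | 8.92 | 9.64 | 10.00 | 17.50 | 31.43
ONH | Range | 0–15 | 5–15 | 0–20 | 0–20 | 5–25 | 10–70
OHI | Mean | 22.50 | 25.83 | 27.50 | 29.17 | 40.00 | 52.08
OHI | Range | 10–25 | 15–35 | 15–30 | 10–40 | 10–60 | 25–80

Right ear, audiometric threshold (dB HL)
Group | Statistic | 250 Hz | 500 Hz | 1 kHz | 2 kHz | 4 kHz | 8 kHz
YNH | Mean | 4 | 5 | 5.5 | 6 | 7 | 9.5
YNH | Range | 0–10 | −5–10 | −5–15 | 0–10 | 0–15 | 0–20
ONH | Mean | 7.14 | 8.21 | 9.29 | 9.64 | 18.93 | 32.86
ONH | Range | 0–15 | 0–15 | 5–20 | 0–25 | 5–25 | 10–60
OHI | Mean | 20.83 | 26.67 | 26.67 | 29.17 | 40.82 | 53.75
OHI | Range | 10–25 | 10–40 | 15–30 | 10–40 | 20–60 | 25–80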

2.2. Stimuli

All available sentences for three of the male talkers in the Coordinate Response Measure corpus (CRM; Bolia et al., 2000) were used for the target and maskers. All sentences in the CRM have the form “Ready [CALL SIGN] go to [COLOR] [NUMBER] now.” There are eight possible call signs (Arrow, Baron, Charlie, Eagle, Hopper, Laker, Ringo, and Tiger), four colors (blue, red, white, and green), and eight numbers (1–8). All sentences were bandpass filtered from 80 Hz to 8 kHz. On each trial, the listener was presented with a set of three simultaneous CRM sentences. The goal was to attend to the sentence identified by the call sign “Charlie” and ignore the two masking sentences. Which of the three talkers was the target and which were the maskers varied randomly from trial to trial.
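As a rough illustration of the size of the sentence set described above, the following sketch enumerates every call sign, color, and number combination for a single talker. The function and variable names are hypothetical and are not taken from the study's own software.

```python
from itertools import product

# Keyword sets as listed in the text for the CRM corpus.
CALL_SIGNS = ["Arrow", "Baron", "Charlie", "Eagle",
              "Hopper", "Laker", "Ringo", "Tiger"]
COLORS = ["blue", "red", "white", "green"]
NUMBERS = list(range(1, 9))  # 1 through 8


def crm_sentence(call_sign, color, number):
    """Build one sentence in the fixed CRM carrier-phrase format."""
    return f"Ready {call_sign} go to {color} {number} now."


# Every combination for one talker: 8 call signs x 4 colors x 8 numbers.
ALL_SENTENCES = [crm_sentence(c, col, n)
                 for c, col, n in product(CALL_SIGNS, COLORS, NUMBERS)]

# Sentences that could serve as the target (call sign "Charlie").
TARGET_SENTENCES = [s for s in ALL_SENTENCES
                    if s.startswith("Ready Charlie ")]

print(len(ALL_SENTENCES), len(TARGET_SENTENCES))
```

Under these counts each talker contributes 256 sentences, of which 32 carry the target call sign, so a listener guessing the color and number at random would be correct on about 1 trial in 32.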

Head-related impulse responses (HRIRs) were generated using techniques described by Zahorik (2009). Eight spatial configurations were used: co-located (all three sentences presented from 0° azimuth) and one of seven spatially separated conditions (target at 0°, symmetrical maskers at ±2°, ±4°, ±6°, ±8°, ±10°, ±15°, or ±30°). Target and masking speech were convolved with the HRIRs for their appropriate locations relative to the listener.
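The spatialization step can be sketched as a convolution of each sentence with a left-ear and a right-ear impulse response, followed by summing the three talkers at each ear. This is a minimal pure-Python sketch under assumed inputs; the toy signals and impulse responses are hypothetical placeholders, not HRIRs from Zahorik (2009).

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(signal, hrir_left, hrir_right):
    """Return (left, right) ear signals for one source location."""
    return convolve(signal, hrir_left), convolve(signal, hrir_right)


def mix(channels):
    """Sum several single-ear signals, treating shorter ones as zero-padded."""
    length = max(len(c) for c in channels)
    return [sum(c[k] for c in channels if k < len(c)) for k in range(length)]


# Hypothetical toy data: three short "sentences" and made-up impulse responses.
target = [1.0, 0.5, 0.25]
masker_a = [0.2, 0.2, 0.2]
masker_b = [0.1, 0.3, 0.1]
hrir_center = ([1.0], [1.0])          # 0 degrees: identical at both ears
hrir_left_side = ([1.0, 0.2], [0.6])  # source slightly to the left
hrir_right_side = ([0.6], [1.0, 0.2]) # mirror image for the right

t_l, t_r = spatialize(target, *hrir_center)
a_l, a_r = spatialize(masker_a, *hrir_left_side)
b_l, b_r = spatialize(masker_b, *hrir_right_side)
left_ear = mix([t_l, a_l, b_l])
right_ear = mix([t_r, a_r, b_r])
```

In the co-located configuration all three sentences would use the same 0° impulse-response pair, so only the separated configurations introduce interaural differences between target and maskers.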

2.3. Procedure

Listeners were seated in a sound-attenuating chamber at the National Center for Rehabilitative Auditory Research (NCRAR, Portland, OR, USA) and listened to speech stimuli presented over insert earphones (ER2; Etymotic Research, Elk Grove, IL, USA). The target speech was presented to the listeners at 20 dB sensation level (SL) and was kept constant during the experiment. The masking sentences were presented at levels relative to the target and were appropriately scaled in SL to achieve the required target-to-masker ratios (TMRs). Responses were obtained using a computer monitor located in front of the listener. Feedback was given after each presentation in the form of “correct” or “incorrect.” Data collection was self-paced and listeners were instructed to take breaks whenever they felt the need. All procedures were approved by the VA Portland Health Care System Institutional Review Board and all listeners were monetarily compensated for their time. All stimulus presentation and data collection were implemented in MATLAB; statistical analyses were performed using SPSS Version 22 (IBM Corp, Armonk, NY, USA).
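The level relationship described above can be illustrated with a short sketch. It assumes that the target-to-masker ratio is defined per masker, in dB, relative to the fixed 20 dB SL target; the text does not spell out this detail, and the function names are hypothetical.

```python
TARGET_LEVEL_DB_SL = 20  # fixed target level stated in the text


def masker_level_db(target_level_db, tmr_db):
    """Masker level implied by a target level and a target-to-masker ratio (dB)."""
    return target_level_db - tmr_db


def masker_gain(tmr_db):
    """Linear amplitude gain applied to a masker that starts at the target's level."""
    return 10 ** (-tmr_db / 20)


# At +10 dB TMR the maskers sit 10 dB below the target;
# at -8 dB TMR they sit 8 dB above it.
print(masker_level_db(TARGET_LEVEL_DB_SL, 10))
print(masker_level_db(TARGET_LEVEL_DB_SL, -8))
```

Holding the target fixed and moving only the maskers keeps the target's audibility constant across trials, so a change in performance reflects the change in masker level.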

2.4. Scoring

Identification thresholds were estimated using a progressive tracking procedure (Gallun et al., 2013). The procedure involves presenting 20 trials, two at each of ten TMRs, starting at 10 dB TMR and ending at −8 dB TMR (decreasing in steps of 2 dB). TMR thresholds in dB were estimated by subtracting the number of correct responses from ten. Hence, if the listener reported all of the keywords correctly the TMR threshold would be −10 dB; if the listener reported all of the keywords incorrectly the TMR threshold would be 10 dB. This method provides fairly similar estimates of threshold in the co-located condition to what would be obtained with a longer adaptive tracking procedure and only slightly underestimates thresholds in the spatially separated conditions when threshold is near −10 dB or +10 dB (Gallun et al., 2013). Since none of the listeners had thresholds near that range, the progressive tracker can be relied upon as an efficient method for this task and these participants.
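The scoring rule above amounts to simple arithmetic: with two trials at each 2-dB step, every correct response lowers the threshold estimate by 1 dB. The sketch below reproduces that arithmetic; the function names and the example response pattern are hypothetical.

```python
def tmr_schedule(start_db=10, stop_db=-8, step_db=2, trials_per_level=2):
    """TMR (dB) for each trial: two trials per level, from +10 down to -8 dB."""
    levels = range(start_db, stop_db - 1, -step_db)
    return [level for level in levels for _ in range(trials_per_level)]


def progressive_threshold(responses, start_db=10):
    """Threshold estimate in dB TMR: ten minus the number of correct responses."""
    n_correct = sum(1 for correct in responses if correct)
    return start_db - n_correct


schedule = tmr_schedule()
# Hypothetical listener: correct on the first 12 trials, incorrect afterwards.
responses = [True] * 12 + [False] * 8
print(len(schedule), progressive_threshold(responses))
```

The two extremes match the text: 20 correct responses give 10 − 20 = −10 dB, and no correct responses give 10 − 0 = 10 dB.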

3. Results

The left panel of Fig. 1 displays the mean TMR thresholds (±1 standard error of the mean) at different spatial configurations for the three listener groups. A repeated-measures analysis of variance was conducted with spatial separation as a within-subjects factor and age (younger versus older) and hearing status (NH versus HI) as between-subject factors. Significant main effects were found for all three factors [spatial separation: F (7, 231) = 86.3, p < 0.001; age: F (1, 33) = 34.47, p < 0.001; hearing status: F (1, 33) = 71.14, p < 0.001]. TMR thresholds were significantly lower for the spatially separated conditions compared to the co-located condition. Overall, the groups were ordered: YNH, ONH, and OHI (going from the lowest to the highest TMR threshold).
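Because lower TMR thresholds indicate better performance, SRM is conventionally computed as the co-located threshold minus the spatially separated threshold. The sketch below assumes that conventional definition, which the text does not state as a formula, and uses made-up threshold values purely for illustration.

```python
def spatial_release_from_masking(colocated_tmr_db, separated_tmr_db):
    """SRM in dB: positive when separation lowers (improves) the threshold."""
    return colocated_tmr_db - separated_tmr_db


def srm_by_separation(thresholds_db):
    """Map each non-zero separation (degrees) to its SRM relative to 0 degrees."""
    colocated = thresholds_db[0]
    return {sep: spatial_release_from_masking(colocated, thr)
            for sep, thr in thresholds_db.items() if sep != 0}


# Hypothetical thresholds (dB TMR) keyed by masker separation in degrees.
example_thresholds = {0: 2.0, 4: 0.0, 30: -5.0}
print(srm_by_separation(example_thresholds))
```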

Fig. 1.

(Color online) (Left) Target-to-masker ratios plotted as a function of spatial separation between the target and maskers for the three listener groups. Error bars are ±1 standard error of the mean. Significant differences between spatially separated and co-located thresholds are indicated by *** (p < 0.001) and * (p < 0.05). (Right) SRM as a function of age (top row) and PTA (bottom row) at 4° and 30° separations. The solid lines are the least squares fits to the data. All correlations are significant at p < 0.05.

To further examine the significant interactions, separate analyses were conducted on each listener group. There was a significant main effect of spatial separation [F (7, 63) = 210.3, p < 0.001] for YNH listeners. A post hoc analysis using paired-sample t-tests with Bonferroni correction revealed that the thresholds at all spatial separations were significantly better than co-located thresholds (p < 0.05), indicating that YNH listeners could benefit from a very small separation (2°–4°) between target and maskers. There was a significant main effect of spatial separation [F (7, 91) = 150.5, p < 0.001] for ONH listeners. ONH listeners required a separation of at least 6° between the target and maskers to show a significant decrease in threshold. For OHI listeners, there was a significant main effect of spatial separation [F (7, 77) = 3.88, p = 0.001] and post hoc analyses indicated that the TMR thresholds at the 30° spatially separated condition were the only thresholds that were significantly different from co-located thresholds for that group.
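The post hoc comparisons can be sketched as a paired t statistic for each separated condition against the co-located condition, evaluated against a Bonferroni-adjusted criterion. The sketch assumes seven comparisons per group (one per separated condition) and a family-wise alpha of 0.05; the text does not state exactly how the correction was applied, and the data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(condition_a, condition_b):
    """Paired-sample t statistic for per-listener differences (a minus b)."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that keeps the family-wise rate at family_alpha."""
    return family_alpha / n_comparisons


# Hypothetical thresholds (dB TMR) for three listeners in two conditions.
colocated = [3.0, 5.0, 7.0]
separated = [1.0, 2.0, 3.0]
t_value = paired_t_statistic(colocated, separated)
alpha_per_test = bonferroni_alpha(0.05, 7)  # assumed: 7 separated conditions
print(t_value, alpha_per_test)
```

The resulting t value would be compared with the critical value for n − 1 degrees of freedom at the adjusted alpha. Equivalently, each raw p value can be multiplied by the number of comparisons and compared with 0.05.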

To further analyze the differences between the groups, the effects of age and hearing loss were examined using continuous rather than categorical statistical techniques. Correlations between age, hearing loss (the pure-tone average, PTA, calculated as the average of audiometric thresholds for the octave frequencies 0.5, 1, 2, and 4 kHz), and identification thresholds were statistically significant (p < 0.001) for all spatial separations tested. However, there was a strong correlation between age and hearing loss as measured by PTA [r (34) = 0.62, p < 0.001]. To deal with this potential confounding of age and hearing loss in the sample, two approaches were used. In the first, partial correlations were computed to examine whether age or hearing loss explained the most variance in SRM. In the second approach, multiple regression analyses were performed with SRM as the predicted variable and age and hearing loss as predictors. Because both the partial correlation and multiple regression analyses showed the same trends, only the results of the multiple regression analyses are presented here.
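For two correlated predictors, the quantities named above can be written in closed form from the three pairwise Pearson correlations. The sketch below computes a PTA, the standardized regression coefficients, the model R², and a partial correlation using those textbook formulas. It is not the SPSS procedure used in the study, and the small data set is synthetic and chosen only to make the arithmetic easy to check.

```python
from math import sqrt


def pta(thresholds_db):
    """Pure-tone average: mean threshold over the chosen octave frequencies."""
    return sum(thresholds_db) / len(thresholds_db)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_betas(y, x1, x2):
    """Standardized coefficients and R^2 for a two-predictor regression."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


def partial_correlation(y, x1, x2):
    """Correlation of y with x1 after removing the linear effect of x2."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r_y1 - r_y2 * r_12) / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))


# Synthetic example: y depends only on x1, and x2 is uncorrelated with x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
y = [2.0 * v for v in x1]
beta1, beta2, r_squared = standardized_betas(y, x1, x2)
```

As a usage example drawn from Table 1, `pta([25.83, 27.50, 29.17, 40.00])` gives about 30.6 dB HL for the OHI group's mean left-ear thresholds at 0.5, 1, 2, and 4 kHz. When the two predictors are correlated, as age and PTA are here (r = 0.62), the standardized coefficients differ from the simple correlations, which is why the regression can attribute variance to one predictor and not the other.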

The right panel of Fig. 1 shows the relationship between age, PTA, and SRM at 4° and 30° separations. Table 2 illustrates the amount of variance accounted for, standardized regression coefficients for the predictor variables (age and hearing loss), and corresponding statistics for all of the multiple regression analyses. Age, rather than hearing loss, was a significant predictor at the 4° and 6° spatial separations. Starting at 8° spatial separation, hearing loss was a significant predictor of SRM. To ensure that the correlation between age and high-frequency hearing present in the listener sample was not influencing the model, the analysis was also conducted using high-frequency PTA (average of thresholds at 2, 4, and 8 kHz) and age as predictors. The variance explained by the model at smaller separations was unchanged, but the variance accounted for at the larger separations was reduced by up to 20%.

Table 2.

Multiple regression models predicting SRM at different spatial separations.

Spatial separation | R² (%) | Model statistics | Age: standardized regression coefficient | Age: p value | Hearing loss: standardized regression coefficient | Hearing loss: p value
2° | 7.7 | F(2,33) = 1.38, p = 0.27 | −0.33 | 0.13 | 0.12 | 0.59
4° | 24.5 | F(2,33) = 5.35, p = 0.001 | −0.47 | 0.02 | −0.04 | 0.84
6° | 39.3 | F(2,33) = 10.68, p < 0.001 | −0.61 | <0.001 | −0.03 | 0.85
8° | 28.4 | F(2,33) = 6.55, p = 0.004 | −0.25 | 0.2 | −0.35 | 0.04
10° | 43.2 | F(2,33) = 12.56, p < 0.001 | −0.1 | 0.57 | −0.59 | <0.001
15° | 46.4 | F(2,33) = 14.27, p < 0.001 | 0.07 | 0.68 | −0.72 | <0.001
30° | 68.2 | F(2,33) = 35.38, p < 0.001 | −0.11 | 0.41 | −0.76 | <0.001

To further isolate the effects of age, the present study's data were also reanalyzed without including the OHI listeners. The multiple regression model predicting SRM at 30° was significant and accounted for 68% of the variance in the amount of SRM [F (2, 23) = 35.38, p < 0.001]. Age contributed significantly to the model (β = −0.76, p < 0.001) and PTA did not contribute (β = −0.11, p = 0.405). When the data were reanalyzed without including the YNH listeners, the multiple regression model predicting SRM at 30° was significant and accounted for 57% of the variance in the amount of SRM [F (2, 23) = 15.44, p < 0.001]. PTA contributed significantly to the model (β = −0.76, p < 0.001) and age did not contribute (β = 0.01, p = 0.95).

4. Discussion and conclusion

The present study investigated the individual effects of hearing loss and aging on spatial release from masking. SRM occurred for YNH listeners at a very small spatial separation (∼2°) between target and maskers. ONH listeners required a greater spatial separation (∼6°) and OHI listeners obtained very little advantage even at the largest separation tested (30°). Thus, the functions relating TMR threshold to spatial separation differed across the three listener groups.

Gallun et al. (2013) demonstrated that aging results in a substantial reduction in SRM. Füllgrabe et al. (2015) found that declines in speech perception in older listeners were related to the cognitive changes and audiometric sensitivity changes correlated with aging. Glyde et al. (2013) reported no significant relationship between age and spatial processing ability; however, those researchers concluded that even a mild hearing loss could affect SRM. The present study found that aging was the most prominent predictor of SRM at very small separations and that hearing loss was the most prominent predictor at larger spatial separations.

One possible explanation for these variations in results could be the difference in the hearing status of the listeners in each experiment. For example, in Gallun et al. (2013), the participants were selected to have no more than mild hearing loss in order for age effects to be more easily observed. When the OHI group was removed from the regression analyses of the present study's data, the results indicated that aging resulted in a substantial reduction in SRM, as was found in Gallun et al. (2013). This is an important result because it demonstrates that the small difference in hearing thresholds was not responsible for the difference between the YNH and ONH performance. When the YNH group was removed from the regression analyses of the present study's data, the results indicated that aging did not result in a substantial reduction in SRM, as was found in Glyde et al. (2013). These two analyses indicate that when OHI listeners were included in the analyses, the effect of hearing impairment was so large that it reduced the contribution of aging. This is a likely reason for the divergent results that have been reported in other recently published studies in which age and hearing loss have been used as predictors of SRM.

Acknowledgments

We would like to thank all of the participants who volunteered their time to be involved in this experiment. We are also grateful to Sean Kampel and Meghan Stansell for their assistance with data collection and Samuel Gordon for engineering support. This work was supported by the National Institutes of Health–National Institute on Deafness and Other Communication Disorders grant (R01 DC011828). The contents of this article are the private views of the authors and should not be assumed to represent the views of the Department of Veterans Affairs or the United States Government.

a) Portions of this work were presented at the International Hearing Aid Research Conference, July 2014, Lake Tahoe, CA and the 168th Acoustical Society of America meeting, November 2014, Indianapolis, IN.

All three authors are also at the Department of Otolaryngology/Head and Neck Surgery, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, Portland, OR 97239, USA.

References and links

  • 1. ANSI (2004). ANSI 3.6-2004, American National Standard Specification for Audiometers (American National Standards Institute, New York).
  • 2. Arbogast, T. L., Mason, C. R., and Kidd, G., Jr. (2002). “The effect of spatial separation on informational and energetic masking of speech,” J. Acoust. Soc. Am. 112(5), 2086–2098. doi: 10.1121/1.1510141
  • 3. Best, V., Gallun, F. J., Ihlefeld, A., and Shinn-Cunningham, B. G. (2006). “The influence of spatial separation on divided listening,” J. Acoust. Soc. Am. 120(3), 1506–1516. doi: 10.1121/1.2234849
  • 4. Bolia, R. S., Nelson, W. T., Ericson, M. A., and Simpson, B. D. (2000). “A speech corpus for multitalker communications research,” J. Acoust. Soc. Am. 107(2), 1065–1066. doi: 10.1121/1.428288
  • 5. Bronkhorst, A. (2015). “The cocktail party problem revisited: Early processing and selection of multi-talker speech,” Atten. Percept. Psychophys. 77(5), 1465–1487. doi: 10.3758/s13414-015-0882-9
  • 7. Folstein, M. F., Folstein, S. E., and McHugh, P. R. (1975). “Mini-Mental State: A practical method for grading the cognitive state of outpatients for the clinician,” J. Psychiat. Res. 12, 189–198. doi: 10.1016/0022-3956(75)90026-6
  • 8. Freyman, R. L., Helfer, K. S., McCall, D. D., and Clifton, R. K. (1999). “The role of perceived spatial separation in the unmasking of speech,” J. Acoust. Soc. Am. 106(6), 3578–3588. doi: 10.1121/1.428211
  • 9. Füllgrabe, C., Moore, B. C. J., and Stone, M. A. (2015). “Age-group differences in speech identification despite matched audiometrically normal hearing: Contributions from auditory temporal processing and cognition,” Front. Aging Neurosci. 6(347), 1–25. doi: 10.3389/fnagi.2014.00347
  • 10. Gallun, F. J., Kampel, S. D., Diedesch, A. C., and Jakien, K. M. (2013). “Independent impacts of age and hearing loss on spatial release in a complex auditory environment,” Front. Neurosci. 7(252), 1–11. doi: 10.3389/fnins.2013.00252
  • 11. Glyde, H., Cameron, S., Dillon, H., Hickson, L., and Seeto, M. (2013). “The effects of hearing impairment and aging on spatial processing,” Ear Hear. 34(1), 15–28. doi: 10.1097/AUD.0b013e3182617f94
  • 12. Kidd, G., Jr., Mason, C. R., Rohtla, T. L., and Deliwala, P. S. (1998). “Release from masking due to spatial separation of sources in the identification of nonspeech auditory patterns,” J. Acoust. Soc. Am. 104(1), 422–431. doi: 10.1121/1.423246
  • 13. Marrone, N., Mason, C. R., and Kidd, G., Jr. (2008a). “Tuning in the spatial dimension: Evidence from a masked speech identification task,” J. Acoust. Soc. Am. 124, 1146–1158. doi: 10.1121/1.2945710
  • 14. Marrone, N., Mason, C. R., and Kidd, G., Jr. (2008b). “The effects of hearing loss and age on the benefit of spatial separation between multiple talkers in reverberant rooms,” J. Acoust. Soc. Am. 124(5), 3064–3075. doi: 10.1121/1.2980441
  • 15. Whitmer, W. M., Seeber, B. U., and Akeroyd, M. A. (2014). “The perception of apparent auditory source width in hearing-impaired adults,” J. Acoust. Soc. Am. 135(6), 3548–3559. doi: 10.1121/1.4875575
  • 16. Zahorik, P. (2009). “Perceptually relevant parameters for virtual listening simulation of small room acoustics,” J. Acoust. Soc. Am. 126(2), 776–791. doi: 10.1121/1.3167842
  • 17. Zurek, P. M. (1993). “Binaural advantages and directional effects in speech intelligibility,” in Acoustical Factors Affecting Hearing Aid Performance, edited by G. A. Studebaker and I. Hochberg (Allyn and Bacon, Needham Heights, MA), pp. 255–276.

