Abstract
Objectives.
Traditionally, elevated hearing thresholds have been considered to be the main contributors to difficulty understanding speech in noise; yet, patients will often report difficulties with speech understanding in noise despite having audiometrically normal hearing. The purpose of this cross-sectional study was to critically evaluate the relationship of various metrics of auditory function (behavioral thresholds and otoacoustic emissions) on speech understanding in noise in a large sample of audiometrically normal-hearing individuals.
Design.
Behavioral hearing thresholds, distortion product otoacoustic emission (DPOAE) levels, stimulus-frequency otoacoustic emission (SFOAE) levels, and physiological noise (quantified using OAE noise floors) were measured from 921 individuals between 10 and 68 years of age with normal pure-tone averages. The Quick Speech-in-Noise (QuickSIN) test outcome, quantified as the signal-to-noise ratio (SNR) loss, was used as the metric of speech understanding in noise. Principal component analysis (PCA) and linear regression modeling were used to evaluate the relationship between the measures of auditory function and speech in noise performance.
Results.
Over 25% of participants exhibited a mild or worse degree of SNR loss. PCA revealed DPOAE levels between 12.5 and 16 kHz to be significantly correlated with the variation in QuickSIN scores, although the correlation was weak (R² = 0.017). Of all the metrics evaluated, higher levels of self-generated physiological noise accounted for the most variance in QuickSIN performance (R² = 0.077).
Conclusions.
Higher levels of physiological noise were associated with worse QuickSIN performance in listeners with normal hearing sensitivity. We propose that elevated physiological noise levels in poorer speech in noise performers could diminish the effective SNR, thereby negatively impacting performance as seen by poorer QuickSIN scores.
INTRODUCTION
Speech is rarely perceived against a quiet background. Some degree of competition, often from additional speech, is the ubiquitous reality of communication. As a result, communication breakdowns in noise are common even for individuals without measurable hearing loss. In fact, difficulty with speech understanding in noise is one of the most common complaints expressed by individuals with or without measurable hearing loss (Vermiglio et al. 2012).
Difficulty with speech in noise understanding has not been fully explored in young individuals with audiometrically normal hearing. Additionally, the influence of auditory function above 8 kHz on speech understanding in noise has been especially difficult to measure due to instrumentation and calibration limitations. However, advancements in calibration techniques now allow accurate signal delivery up to 20 kHz (Souza et al. 2014). In an effort to explore factors beyond the clinical audiogram, this study investigated the relationship between (1) speech in noise performance in individuals with audiometrically normal hearing; (2) auditory function at extended high frequencies (> 8 kHz); and (3) otoacoustic emission (OAE) levels as indicators of cochlear function.
We demonstrate that, while auditory function at extended high frequencies had a small role to play, physiological noise (quantified using OAE noise floors) played the largest role among these factors in predicting speech in noise understanding. Specifically, higher noise floor levels measured in the ear canal were associated with worse speech in noise understanding, suggesting that physiological noise influences speech in noise understanding in audiometrically normal-hearing individuals. Physiological noise has been found to interfere with low frequency signal detection (Watson et al. 1972; Moulin 1972; Buss et al. 2016) in adults and children (Shaw 1974). This study is the first to reveal a relationship between physiological noise and speech perception.
MATERIALS AND METHODS
The data presented were originally collected with the intention of evaluating early signs of age-related auditory decline. For the purposes of this retrospective study, participants were included if their pure-tone average (PTA; 0.5, 1, 2, and 4 kHz) was ≤ 20 dB HL, so that factors related to speech in noise performance could be quantified in audiometrically normal-hearing individuals. The better ear of each of 921 participants (546 female) between 10 and 68 years of age was tested. None reported ear surgery, history of hearing loss, ear disease, ototoxic medications, dementia diagnosis, or extensive noise exposure.
OAEs and behavioral thresholds, collected using modified Békésy tracking (Fig. 1A), were measured up to 20 kHz using stimuli calibrated in forward pressure level (FPL; see Souza et al. 2014) delivered using custom-built hardware and software through an ER10B+ probe (Etymotic Research, Inc.), described fully in Lee et al. (2012) and Poling et al. (2014). Measurements were conducted in sound-treated audiometric booths that met ambient noise standards (ANSI S3.1–1999 (R2013)). Additional thresholds were measured between 0.25 and 8 kHz using commercially available calibrated clinical audiometers (ANSI S3.6–1996). Distortion product otoacoustic emissions (DPOAEs) were elicited by pairs of primary tones (f2/f1 = 1.22) between 0.75 and 20 kHz in 1/3 octave steps and are reported here for 72/72 dB FPL (nominally 75/75 dB SPL, Fig. 1B). Stimulus frequency otoacoustic emissions (SFOAEs) are reported for probe tones (fp) between 0.75 and 18 kHz in 1/3 octave steps at 37 dB FPL (nominally 40 dB SPL) with a 57 dB FPL (nominally 60 dB SPL) suppressor tone 47 Hz below fp (Fig. 1C). The Quick Speech-in-Noise (QuickSIN) test (Etymotic Research, Inc.) was used to evaluate monaural speech understanding in noise. The signal-to-noise ratio (SNR) loss, in dB, was calculated as an average over four lists.
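As an illustration of the QuickSIN outcome measure, the following minimal sketch averages SNR loss over four lists and applies the 3 dB cutoff used later in the text. It assumes the conventional QuickSIN scoring rule (SNR loss = 25.5 minus the number of key words repeated correctly out of 30 per list), which the source does not state explicitly. The function names and the example scores are hypothetical.

```python
# Assumption: conventional QuickSIN scoring, not spelled out in the source.
# Each list has 6 sentences with 5 key words (30 words in total), and
# SNR loss (dB) = 25.5 - total key words correct.
QUICKSIN_CONSTANT_DB = 25.5
WORDS_PER_LIST = 30
NORMAL_CUTOFF_DB = 3.0  # from the text: <= 3 dB SNR loss is normal/near normal


def list_snr_loss(words_correct):
    """SNR loss in dB for a single QuickSIN list."""
    if not 0 <= words_correct <= WORDS_PER_LIST:
        raise ValueError("words_correct must be between 0 and 30")
    return QUICKSIN_CONSTANT_DB - words_correct


def average_snr_loss(words_correct_per_list):
    """Mean SNR loss in dB across lists (four lists in the study)."""
    losses = [list_snr_loss(w) for w in words_correct_per_list]
    return sum(losses) / len(losses)


def classify(snr_loss_db):
    """Two-way grouping used in the study's group comparisons."""
    if snr_loss_db <= NORMAL_CUTOFF_DB:
        return "normal/near normal"
    return "mild or greater"


# Hypothetical listener: key words correct on each of four lists.
scores = [24, 23, 25, 22]
loss = average_snr_loss(scores)
print(f"SNR loss = {loss:.1f} dB ({classify(loss)})")
```

Averaging across lists before classifying means a single poor list does not by itself move a listener into the mild or greater group.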
Figure 1.
Average hearing thresholds (A) and OAE levels and noise floor levels (B-C) for different age groups as a function of frequency. Error bars indicate the 95% confidence interval. Note the different frequency ranges in the threshold and OAE panels.
Principal component analysis (PCA) was performed using SAS 9.4 (Cary, NC) after ensuring regression modeling assumptions were not violated. The first PCA included standard audiometric thresholds, tracking thresholds, DPOAE levels from 72/72 dB FPL, and SFOAE levels from 37 dB FPL. While OAEs were measured for several other stimulus levels (see Poling et al. 2014), these results were not included because the initial PCA indicated that OAEs for different conditions co-varied. The second PCA included standard audiometric thresholds, tracking thresholds, and DPOAE noise floor levels. The extracted components were then used as covariates in a multiple regression model to evaluate their influence on QuickSIN variability. To determine the effect of age on QuickSIN performance, both unadjusted and age-adjusted multiple regression models were evaluated.
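The component extraction step can be sketched in a few lines. The analysis in this study was run in SAS 9.4; the pure-Python code below is only an illustrative stand-in. It computes the eigenvalues of a correlation matrix with a Jacobi rotation scheme and keeps components whose eigenvalue exceeds 1.0, the retention rule reported in the Results. All function names and the toy data are hypothetical.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (list of lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def jacobi_eigenvalues(sym, max_sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix, largest first (Jacobi rotations)."""
    a = [row[:] for row in sym]
    p = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (i, j) off-diagonal entry.
                theta = 0.5 * math.atan2(2 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i] = c * aki + s * akj
                    a[k][j] = -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k] = c * aik + s * ajk
                    a[j][k] = -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def kaiser_retained(eigenvalues, threshold=1.0):
    """Components kept under the eigenvalue > 1.0 rule."""
    return [ev for ev in eigenvalues if ev > threshold]


# Toy data: four measures forming two perfectly correlated pairs, so two
# components carry all of the variance.
x = [1, 2, 3, 4, 5, 6]
z = [1, -2, 1, 1, -2, 1]
rows = [[a, 2 * a, b, 3 * b] for a, b in zip(x, z)]
eigs = jacobi_eigenvalues(correlation_matrix(rows))
print(len(kaiser_retained(eigs)), "components retained")
```

Working from the correlation matrix rather than the covariance matrix puts thresholds and OAE levels on a common scale, which is why an eigenvalue of 1.0 (the variance of one standardized measure) is a natural retention threshold.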
Lastly, using analysis of variance, OAEs and behavioral thresholds were compared for participants with normal/near normal QuickSIN scores (≤ 3 dB SNR loss) and mild or greater SNR loss (> 3 dB) as defined by the QuickSIN guidelines. A main effect of QuickSIN group, frequency, and noise floor level on OAE amplitudes and a main effect of QuickSIN group and frequency on noise floor level or behavioral thresholds were evaluated in two independent analyses. Finally, to compare the extremes of speech in noise performance, a similar analysis was conducted using the best (≤ 5th percentile, scored < −0.3 dB SNR loss) and worst (≥ 95th percentile, scored ≥ 6 dB SNR loss) QuickSIN performers.
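The two groupings described above can be sketched as follows. This is a minimal illustration, not the authors' code: the percentile cut points come from Python's `statistics.quantiles`, whose default interpolation method may differ from the one used in the original analysis, and the function names and example scores are hypothetical.

```python
import statistics


def split_by_cutoff(snr_losses, cutoff_db=3.0):
    """Normal/near normal (<= cutoff) versus mild or greater (> cutoff)."""
    normal = [s for s in snr_losses if s <= cutoff_db]
    abnormal = [s for s in snr_losses if s > cutoff_db]
    return normal, abnormal


def split_extremes(snr_losses, lower_pct=5, upper_pct=95):
    """Best (<= lower percentile) and worst (>= upper percentile) performers.

    Lower SNR loss means better performance, so the "best" group sits at the
    low end of the distribution.
    """
    cuts = statistics.quantiles(snr_losses, n=100)  # 99 cut points
    low_cut, high_cut = cuts[lower_pct - 1], cuts[upper_pct - 1]
    best = [s for s in snr_losses if s <= low_cut]
    worst = [s for s in snr_losses if s >= high_cut]
    return best, worst


# Hypothetical SNR loss values in dB, for illustration only.
example = [-1.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5, 6.5]
normal, abnormal = split_by_cutoff(example)
print(len(normal), "normal/near normal;", len(abnormal), "mild or greater")
```

The resulting groups would then be compared on thresholds, OAE levels, and noise floors using analysis of variance, as described in the text.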
RESULTS
Of the 921 audiometrically normal participants, 71.3% had SNR loss that was normal/near-normal (≤ 3 dB SNR loss). The remaining 28.7% of participants had mild or worse SNR loss (> 3 dB SNR loss).
The first PCA model reduced 60 auditory measures (OAE level PCA) into 12 components and the second reduced 44 auditory measures (noise floor PCA) into 8 components (eigenvalue > 1.0). From the OAE level PCA, an unadjusted linear regression revealed component 2 (L-C2: DPOAE levels between 12.5 and 16 kHz) to be significantly correlated with the variation in QuickSIN scores (p = 0.047, R² = 0.017). Component L-C2 was not significantly correlated with QuickSIN variability after adjusting for age (p = 0.081). In the noise floor PCA, linear regression modeling identified component 6 (N-C6: noise floor levels between 0.75 and 4 kHz) to be significantly correlated with QuickSIN variability in both an unadjusted (p < 0.0001, R² = 0.077) and an age-adjusted model (p < 0.0001, R² = 0.085).
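To make the reported R² values concrete: R² is the proportion of variance in QuickSIN scores accounted for by the model, so R² = 0.077 corresponds to about 7.7% of the variance. The sketch below fits an ordinary least squares line with a single predictor and returns R². The study itself used multiple regression on several PCA components in SAS, so this is a simplified illustration; the function name and data are hypothetical.

```python
def simple_regression(x, y):
    """Ordinary least squares fit of y on one predictor x.

    Returns (slope, intercept, r_squared), where r_squared is
    1 - SS_residual / SS_total.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((b - (intercept + slope * a)) ** 2
                      for a, b in zip(x, y))
    return slope, intercept, 1.0 - ss_residual / ss_total


# Hypothetical component scores (x) and SNR loss values in dB (y).
component_scores = [0.0, 1.0, 2.0, 3.0]
snr_loss_db = [1.0, 3.0, 2.0, 4.0]
slope, intercept, r2 = simple_regression(component_scores, snr_loss_db)
print(f"slope = {slope:.2f}, R^2 = {r2:.2f} ({100 * r2:.0f}% of variance)")
```

A statistically significant slope can coexist with a small R², as with L-C2 in a sample of this size, which is why the text reports both the p value and the variance explained.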
When comparing normal (n = 655) and abnormal (n = 261) QuickSIN performers, thresholds, DPOAE, and SFOAE levels were each significantly different between the groups (p < 0.001). The group with the poorer QuickSIN scores paradoxically showed higher OAE levels (Fig. 2). Furthermore, noise floor levels were significantly different between the groups (p < 0.001), with the poorer QuickSIN group demonstrating higher noise floor levels in the mid and low frequencies.
Figure 2.
Average thresholds (A) and DPOAE and SFOAE levels (B-C) are plotted comparing the QuickSIN groups. Individuals with normal QuickSIN scores that are less than or equal to 3 dB SNR loss (gray, n = 655) were grouped and compared to individuals with scores greater than 3 dB SNR loss (black, n = 261). Error bars indicate the 95% confidence interval. Note the different frequency ranges in the threshold and OAE panels.
Similar differences were observed when comparing the best (≤ 5th percentile, n = 46) and worst (≥ 95th percentile, n = 46) QuickSIN performers. Thresholds were worse (p < 0.001) and noise floors were higher (p < 0.001) for the worst performers (Fig. 3).
Figure 3.
Individuals whose QuickSIN scores fell below the 5th (gray, scored < −0.3 dB SNR loss) and above the 95th (black, scored ≥ 6 dB SNR loss) percentile were grouped. Thresholds (A) and DPOAE and SFOAE levels (B-C) are plotted comparing the QuickSIN groups. Error bars indicate the 95% confidence interval. Note the different frequency ranges in the threshold and OAE panels.
DISCUSSION
While DPOAEs and thresholds showed decline at high frequencies with increasing age, only DPOAEs between 12.5 and 16 kHz were significantly related to QuickSIN scores, explaining only 1.7% of the QuickSIN variance (L-C2). The independence of QuickSIN scores from OAE or threshold measures could be due to several reasons. First, our thresholds and OAEs measured at frequencies important for speech perception may not have been sensitive enough to detect anomalies that challenge speech understanding in noise. Second, the intrinsic/extrinsic redundancies inherent in the QuickSIN material or compensation by the central nervous system may have distorted or hidden the dependencies between these measures (Bocca & Calearo 1963; Miller et al. 1951; Boothroyd & Nittrouer 1988). Lastly, our results showed a decline in peripheral auditory function with age at frequencies > 8 kHz but not at speech frequencies. This could indicate (1) minimal relevance of high frequency peripheral auditory function for speech understanding in noise or (2) sufficiency of auditory function at speech frequencies in our cohort to maintain speech understanding in noise.
Why then did 28.7% of our cohort demonstrate sub-normal QuickSIN performance? We could not readily invoke a major role of linguistic, cognitive, or central processing on QuickSIN variability, a limitation of this retrospective study. It seemed reasonable that listener age could have contributed to sub-normal QuickSIN performance, especially since (1) it was obvious from Fig. 1B that DPOAE levels in general were age-dependent and (2) a high-frequency DPOAE principal component (L-C2), significantly associated with QuickSIN variability, was also likely to be age-dependent. However, age was similar between the normal (average = 35.4, SD = 14.8) and abnormal (average = 37.9, SD = 15.8) QuickSIN performers, with a slightly larger age difference observed between the ≤ 5th percentile (average = 29.8, SD = 11.5) and ≥ 95th percentile (average = 35.5, SD = 15.1) QuickSIN performers. It should be noted that when age was controlled for, the high-frequency DPOAE component (L-C2) was no longer significantly associated with variability in QuickSIN scores. Even in the unadjusted model, the L-C2 component accounted for only 1.7% of the variance in QuickSIN scores. An alternate PCA then confirmed that noise floor levels between 0.75 and 4 kHz (N-C6) were most strongly related to QuickSIN scores, accounting for approximately 8% of the variance regardless of age adjustment.
OAE noise floor measurements were an independent estimate of physiological noise. Common sources of noise, as recorded in the ear canal, are related to circulation, respiration, and general muscle activity (Shaw 1974). Those who scored poorly on the QuickSIN test demonstrated higher noise levels, likely from these sources. We hypothesize that individuals generating higher physiological noise levels have a reduced effective SNR during the QuickSIN task leading to poorer performance. It seems improbable that noise levels below −10 dB SPL could influence QuickSIN performance when the speech and babble levels are 1,000 to 10,000 times higher. Our results suggest the effective reduction in SNR for a given person was significantly greater than suggested by the ear canal noise levels. Furthermore, these noise floor differences between normal and abnormal QuickSIN performers could be revealing a secondary contributing factor to a normal hearing individual’s speech understanding in noise. It is important to note, however, that OAE noise floors reported here were computed after extensive averaging in the time domain, resulting in an estimate likely to be much lower than the instantaneous noise level in the ear canal.
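Two of the quantitative points in this paragraph can be made concrete with decibel arithmetic. The sketch below rests on two assumptions that the source does not state: that "1,000 to 10,000 times higher" refers to a sound pressure (amplitude) ratio, and that the noise is uncorrelated across the sweeps being averaged, in which case averaging N sweeps lowers the noise floor by 10·log10(N) dB. The function names and the sweep count are hypothetical.

```python
import math


def pressure_ratio_to_db(ratio):
    """Level difference in dB for a sound pressure (amplitude) ratio."""
    return 20.0 * math.log10(ratio)


def averaging_noise_reduction_db(n_sweeps):
    """Expected noise floor reduction from time-domain averaging.

    Assumes noise that is uncorrelated from sweep to sweep, so noise power
    falls by a factor of n_sweeps while a time-locked signal is preserved.
    """
    return 10.0 * math.log10(n_sweeps)


for ratio in (1_000, 10_000):
    print(f"pressure ratio {ratio}: {pressure_ratio_to_db(ratio):.0f} dB")

# Hypothetical sweep count; the source does not report the number averaged.
n = 100
print(f"{n} sweeps: noise floor lowered by about "
      f"{averaging_noise_reduction_db(n):.0f} dB")
```

Under these assumptions, a noise floor near −10 dB SPL after averaging could correspond to a considerably higher instantaneous noise level in the ear canal, which is the caveat raised at the end of the paragraph.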
Physiological noise has been implicated in interfering with detection tasks (Shaw 1974). Watson et al. (1972) attributed elevated thresholds at low frequencies to increased masking by physiological noise measurable in the ear canal. Furthermore, psychometric functions became steeper as signal frequency increased from 0.125 to 0.5 kHz but plateaued at higher frequencies, arguably from the waning influence of physiological noise with increasing frequency. A similar reduction in the slope of psychometric functions for tone detection was observed only at the lowest frequencies in individuals with otosclerotic hearing loss compared to those without (Moulin 1972). More recently, Buss et al. (2016) confirmed the influence of physiological noise on low-frequency tone detection in quiet. Maturational decreases in physiological noise levels also coincided with improvement in signal detection.
The impact of physiological noise levels on speech in noise understanding, while theoretically interesting, may be difficult to leverage in a clinically meaningful way given current protocols. The reporting of OAE noise floors varies dramatically between different implementations and is heavily dependent on the acoustic environment in which the recordings are made. Therefore, it would be difficult to isolate the influence of physiological noise on an individual’s speech in noise understanding in current clinical settings using OAE noise floors.
Adults engage in behaviors like remaining still, relaxing facial muscles, delaying inspiration, and slowing heart rate to reduce the detrimental effects of physiological noise when detecting a signal (Stekelenburg & van Boxtel 2001). This internal silencing is observable in electromyographic activity in the masticatory and lower facial muscles during detection tasks. Walsh et al. (2014a, b) observed a lowering of noise in OAE measurements and attributed it to activation of the auditory efferent network by selective attention. However, such lowering of ear canal noise has more recently been linked to voluntary internal silencing during task performance (Francis et al. 2018). It is likely that better QuickSIN performers in our cohort were more efficient in internal silencing, leading to lower noise floors. It is enticing to envision the possibility of identifying individuals in whom physiological noise interferes with speech understanding in noise followed by training strategies to mitigate this interference.
Acknowledgments
This research was supported by NIH/NIDCD Grant No. R01DC008420 and Northwestern University. The authors would like to thank Drs. Uzma Wilson, Niall Klyn, Courtney Glavin, Jungmee Lee, Gayla Poling, and Ms. Vickie Hellyer along with many other collaborators for assistance with recruitment, data collection, and analysis.
Conflicts of Interest and Source of Funding: Sumitrajit Dhar and Jonathan Siegel received funding from NIH/NIDCD Grant No. R01DC008420 and Northwestern University for this research. For the remaining authors, none were declared.
REFERENCES
- ANSI S3.1–1999 (R2013). Maximum Permissible Ambient Noise Levels for Audiometric Test Rooms. New York, NY: American National Standards Institute.
- ANSI S3.6–1996. American National Standard Specification for Audiometers. New York, NY: American National Standards Institute.
- Bocca E, Calearo C (1963). Central hearing processes. Modern developments in audiology, 337–370.
- Boothroyd A, Nittrouer S (1988). Mathematical treatment of context effects in phoneme and word recognition. J Acoust Soc Am, 84(1), 101–114.
- Buss E, Porter HL, Leibold LJ, et al. (2016). Effects of self-generated noise on estimates of detection threshold in quiet for school-age children and adults. Ear Hear, 37, 650–659.
- Francis NA, Zhao W, Guinan JJ Jr. (2018). Auditory attention reduced ear-canal noise in humans by reducing subject motion, not by medial olivocochlear efferent inhibition: implications for measuring otoacoustic emissions during a behavioral task. Front Syst Neurosci, 12, 42.
- Lee J, Dhar S, Abel R, et al. (2012). Behavioral hearing thresholds between 0.125 and 20 kHz using depth-compensated ear simulator calibration. Ear Hear, 33, 315–329.
- Miller GA, Heise GA, Lichten W (1951). The intelligibility of speech as a function of the context of the test materials. J Exp Psychol, 41, 329.
- Moulin LK (1972). The effects of physiological noise on the auditory threshold. J Speech Hear Res, 15, 837–844.
- Poling GL, Siegel JH, Lee J, et al. (2014). Characteristics of the 2f(1)-f(2) distortion product otoacoustic emission in a normal hearing population. J Acoust Soc Am, 135, 287–299.
- Shaw EA (1974). The external ear. In Auditory system (pp. 455–490). Springer.
- Souza NN, Dhar S, Neely ST, et al. (2014). Comparison of nine methods to estimate ear-canal stimulus levels. J Acoust Soc Am, 136, 1768–1787.
- Stekelenburg JJ, van Boxtel A (2001). Inhibition of pericranial muscle activity, respiration, and heart rate enhances auditory sensitivity. Psychophysiology, 38, 629–641.
- Vermiglio AJ, Soli SD, Freed DJ, et al. (2012). The relationship between high-frequency pure-tone hearing loss, hearing in noise test (HINT) thresholds, and the articulation index. J Am Acad Audiol, 23, 779–788.
- Walsh KP, Pasanen EG, McFadden D (2014a). Selective attention reduces physiological noise in the external ear canals of humans. I: auditory attention. Hear Res, 312, 143–159.
- Walsh KP, Pasanen EG, McFadden D (2014b). Selective attention reduces physiological noise in the external ear canals of humans. II: visual attention. Hear Res, 312, 160–167.
- Watson CS, Franks JR, Hood DC (1972). Detection of tones in the absence of external masking noise. I. Effects of signal intensity and signal frequency. J Acoust Soc Am, 52, 633–643.



