JARO: Journal of the Association for Research in Otolaryngology. 2023 Dec 11;24(6):619–631. doi: 10.1007/s10162-023-00919-w

Speech Perception in Noise and Medial Olivocochlear Reflex: Effects of Age, Speech Stimulus, and Response-Related Variables

Shezeen Abdul Gafoor, Ajith Kumar Uppunda
PMCID: PMC10752852  PMID: 38079021

Abstract

Purpose

The role of the medial olivocochlear system in speech perception in noise has been debated over the years, with studies showing mixed results. One possible reason for this could be the dependence of this relationship on the parameters used in assessing the speech perception ability (age, stimulus, and response-related variables).

Methods

The current study assessed the influence of the type of speech stimuli (monosyllables, words, and sentences), the signal-to-noise ratio (+5, 0, −5, and −10 dB), the metric used to quantify the speech perception ability (percent-correct, SNR-50, and slope of the psychometric function) and age (young vs old) on the relationship between medial olivocochlear reflex (quantified by contralateral inhibition of transient evoked otoacoustic emissions) and speech perception in noise.

Results

A linear mixed-effects model revealed no significant contributions of the medial olivocochlear reflex to speech perception in noise.

Conclusion

The results suggest that there was no evidence of any modulatory influence of the indirectly measured medial olivocochlear reflex strength on speech perception in noise.

Supplementary Information

The online version contains supplementary material available at 10.1007/s10162-023-00919-w.

Keywords: Medial olivocochlear reflex, Efferent pathway, Speech perception in noise, Contralateral inhibition of OAE, Adverse listening, Linear mixed-effects model

Introduction

The medial olivocochlear efferent system reflexively reduces the cochlear amplifier gain when stimulated, termed the medial olivocochlear reflex (MOCR) [1]. The MOCR is often quantified by comparing the amplitude of otoacoustic emissions (OAEs)—a measure of the outer hair cell motility—obtained in the presence and absence of a contralateral acoustic stimulus (CAS) [2]. Activation of the MOCR by CAS typically reduces the OAE amplitudes—contralateral inhibition of OAEs, the magnitude of which reflects the strength of the MOCR [3]. It is suggested that the MOCR improves the signal-to-noise ratio (SNR) at the level of the cochlea via its antimasking function [4]. The MOCR increases the dynamic range of the auditory neurons for brief stimuli in the presence of noise [1, 5]. Additionally, in adverse listening conditions, the peripheral auditory system may depend on the efferent inhibitory mechanisms for perceiving the target through dips (aka glimpsing) in the masker envelope [6]. Therefore, the auditory efferent system may also aid speech perception in noise (SPiN) via a combination of these MOCR effects (improved SNR, dynamic range, and ability to glimpse through dips).

Although several studies have been carried out with this notion, the nature and extent of involvement of the MOCR in SPiN remain uncertain—positive, negative, and no relationships between the MOCR strength and SPiN are reported (online Supplementary Table has details of the existing studies in this area). The variations could be due to methodological differences (SPiN and MOCR-related) present across studies. In the current study, we investigated the influence of the SPiN-related factors on the relationship between SPiN and the MOCR strength. Some of the critical differences in the SPiN measure that exist in studies exploring the relationship between the MOCR strength and SPiN are:

  • (i)

    Types of speech stimuli: Studies have used monosyllables [7, 8], words [9–11], and sentences [12–14] to investigate the relationship between the MOCR and SPiN. These stimuli vary in terms of redundancy, amount of contextual information, and duration. These factors may modulate the perceptual benefit derived from the antimasking effects of the MOCR. For instance, the MOCR effects are larger for short-duration stimuli than for continuous stimuli [1, 5]. Also, depending on redundancy and context, a specific SNR change can have differential effects on speech perception [15]. Therefore, any perceptual benefit derived from the MOCR may also vary depending on the type of speech stimuli.

  • (ii)

    SNR—Studies have assessed SPiN across a wide range of SNRs—positive [9], negative [7], and/or 0 dB [10] SNRs. It is postulated that the antimasking effects of MOCR depend on the input SNR. Guinan [16] and Kumar and Vanaja [9] reported that MOCR aids SPiN in more challenging SNRs. However, Jennings [5] postulated that MOCR aids SPiN when speech is more intense than noise, i.e., at positive SNRs. Therefore, we speculated that the SNR at which SPiN is assessed is one of the crucial variables that can influence the relationship between SPiN and the MOCR strength.

  • (iii)

    Quantification metric—SPiN can be quantified using several metrics, each of which points towards a particular aspect of perceptual ability. Among the widely used metrics, the percent-correct scores are direct measures that provide the proportion of accurate stimulus perception relative to the total number of stimuli presented. The SNR-50 summarizes the individuals’ overall speech perception ability and represents the SNR required to correctly understand speech 50% of the time. The slope of the psychometric function of SPiN scores reflects how listeners’ performance changes with SNR change. Steeper slopes indicate larger effects of SNR change on speech perception [15]. The metrics described above have been used to quantify the SPiN in studies exploring the relationship between the MOCR and SPiN [9, 14, 17]. These metrics are interdependent and provide complementary information about individuals’ SPiN abilities. Yet, a few existing studies report correlations in one type of SPiN metric and not the other. For example, Mertes et al. [17] reported a significant association between the slope of the psychometric function and the MOCR strength but not with the percent-correct identification scores. Therefore, it appears that the link between SPiN and the MOCR may vary depending on the quantification metric used.

  • (iv)

    Participant age—The relationship between SPiN and the MOCR has been studied in children [9, 11], young adults [7, 18], middle-aged adults [19], older adults [12, 14], or adults over a wide age range [17]. Age-related changes are reported in both SPiN and the MOCR [14, 20]. Therefore, it appears plausible that the participant’s age may have a role in the association between MOCR and SPiN.

The above-mentioned variables may modulate the relationship between SPiN and MOCR in isolation or in a complex interactive manner. To date, this relationship has been assessed using correlational approaches, which may not be suitable for understanding such complex interactions. Such analyses also ignore the random variations within participants. Linear mixed-effects modelling, an extension of linear regression, is an advanced and efficient statistical technique that offers greater flexibility and provides a means to account for individual variability [21, 22]. To this end, we used a linear mixed-effects approach to explore and understand the association between the MOCR and SPiN across:

  • (i)

    Three types of speech stimuli that varied in their complexity, duration, and redundancy—monosyllables, words, and sentences

  • (ii)

    Four SNRs: +5, 0, −5, and −10 dB

  • (iii)

    Three metrics to quantify the SPiN performance—percent-correct scores, SNR-50, and slope of the psychometric function, and

  • (iv)

    Two age groups—young and older adults

Methods

Participants

The participants were volunteers residing in and around the Mysore district of Karnataka. Participants received no financial compensation for their participation in the study. They were divided into two groups based on their age. A power analysis was carried out using G*Power (version 3.1) with target effect sizes (correlation coefficients) ranging between .5 and .6 (taken from existing studies that assessed the relationship between SPiN and the MOCR strength), a significance level (alpha) of 0.05, and the number of participants in the current study (29 young adults and 20 older adults). The resultant power of the current study ranged between 70% (for the older adults) and 90% (for the young adults). Written informed consent was obtained from all participants before the commencement of the experiment. The study adhered to the ethical guidelines for bio-behavioral research involving human subjects at the All India Institute of Speech and Hearing, Mysuru [23].

Inclusion Criteria

We recruited 49 adults who were native speakers of Kannada, a south Indian Dravidian language spoken in Karnataka. The participants predominantly used their right hand while performing daily chores. The first group consisted of young adults (N = 29, 21 females) aged 18–27 (mean age, 22 ± 2.5 years). The second group comprised older adults (N = 20, 15 males) aged 50–69 (mean age, 57 ± 6.3 years).

Audiological testing was conducted prior to the experimental procedure in a sound-treated, double-room set-up with appropriate lighting and ventilation that adhered to ANSI standards [24]. All participants had air and bone conduction thresholds at or below 20 dB HL at octave frequencies from 250 Hz to 8 kHz in both the right and left ears. A calibrated two-channel audiometer, MAICO MA 53 diagnostic audiometer (MAICO Diagnostics GmbH, Berlin, Germany) with supra-aural TDH 39 headphones, was used to estimate the hearing thresholds. Tympanometry was performed using a calibrated GSI Tympstar middle ear analyzer (Grason-Stadler, Minneapolis, USA). All participants had “A” type tympanogram [25] with ipsilateral and contralateral acoustic stapedial reflex thresholds at normal sensation levels for 500 Hz and 1000 Hz pure tones bilaterally. None of the participants had acoustic reflex thresholds at levels less than 70 dB HL. Transient evoked otoacoustic emission (TEOAE) screening was performed using a calibrated ILO-V6 (Otodynamics, UK) OAE analyzer. For this, the TEOAEs were recorded using 260 sweeps of non-linear clicks at 80 dB peSPL. All participants had TEOAEs with a minimum amplitude of 6 dB SPL.

Exclusion Criteria

Participants with speech, language, hearing, cognitive, or other related disorders were excluded from the current study. This information was obtained from a structured interview carried out during participant recruitment. Participants with any history of middle ear pathology, noise exposure, or ototoxic drug usage were also excluded. Initially, a total of 65 participants volunteered for the study. However, 16 individuals were excluded as they did not meet the minimum inclusion requirements. More exclusions occurred in the older adult group. The remaining 49 participants, who met the inclusion criteria, proceeded to the experiment.

Contralateral Inhibition of TEOAE

TEOAEs were recorded and analyzed using Otodynamics ILO V6 software (Otodynamics Ltd., London, UK). The participants were made to sit comfortably on a chair with arm and back support and instructed in their native language. The participants were asked to relax and avoid body movements, swallowing, or coughing during the test procedure. An appropriate-sized probe was inserted in the ear canal of the participants’ right ear. The MOCR was measured only in the right ear of our participants as studies report no laterality effects in the MOCR strength [26]. Moreover, most studies assessing the relationship between the MOCR and SPiN report the MOCR strength in the right ear [7, 18]. An E-A-RTONE 5A insert earphone connected to a MAICO MA 53 diagnostic audiometer was placed in the left ear. The placement of the insert receiver and probe was undisturbed until the end of the TEOAE recordings. The probe fit and stimulus levels were monitored throughout the recordings. We repositioned the probe and repeated the testing if and when the probe position was altered. We recorded the TEOAEs without and with the CAS, hereafter referred to as CAS- and CAS+ conditions, respectively. The CAS was a 60 dB SPL calibrated white noise with maximum energy in the 500–4000 Hz frequency region, presented to the left ear through the insert earphones connected to the audiometer. This level of CAS was chosen to ensure the contributions of the middle ear muscle reflex, if any, are minimal, as it is below the average clinical acoustic reflex threshold for broadband noise in young and older adults [27]. The CAS was delivered 3 s before the OAE measurement and remained for 3 s after the cessation of the OAE measurement. Two hundred sixty sweeps (1040 individual click stimuli in two buffers) of linear clicks were presented at 65 dB peSPL (±0.5 dB) at a rate of 49 clicks/s to record the TEOAEs. The TEOAE stimulus calibration was done as per the procedure recommended by the manufacturer. 
We set a rejection level of 6 mPa over a time window of 2.5–20 ms after the presentation of the click. The rejection rate was at or below 10% for all participants. It was ensured that all accepted TEOAE recordings had response reproducibility of more than 80% and stimulus stability of more than 95%. In instances where this was not achieved—mainly due to movements or physiological noise (less than 10% of the total recordings)—the recordings were obtained once again after probe reinsertion and/or re-instruction. The participants were provided with reading material during the measurement to maintain passive attention. The magnitude of contralateral inhibition (a measure of the MOCR strength) was measured by subtracting the TEOAE global amplitudes in the CAS+ condition from the CAS− condition.
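The subtraction used to quantify the MOCR strength can be sketched as follows. This is only an illustration of the arithmetic, not the authors' analysis script: the function name is hypothetical, and the example values are the young-adult group means reported in the Results rather than any individual recording.

```python
def inhibition_magnitude(cas_minus_db, cas_plus_db):
    """Contralateral inhibition in dB.

    cas_minus_db: global TEOAE amplitude (dB SPL) recorded without the CAS.
    cas_plus_db:  global TEOAE amplitude (dB SPL) recorded with the CAS.
    A positive value means the emission was reduced by the CAS.
    """
    return cas_minus_db - cas_plus_db


# Group means for young adults (dB SPL), used here only as example inputs.
young_example = inhibition_magnitude(10.8, 9.3)
print(round(young_example, 1))  # 1.5
```

A negative result would indicate enhancement rather than inhibition of the emission in the CAS+ condition.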

Speech Perception in Noise

Stimuli

We measured SPiN using three types of speech stimuli that varied in their redundancy. They were,

  • (i)

    Nonsense consonant-vowel monosyllable [28]—This test included 20 consonants in the context of /a/ vowel (/ma, ʤa, ʧa, sa, da, ta, ra, ṇa, pa, va, ṋa, ka, la, ha, ḷa, ga, ʃa, ja, ḓa, ṱa/). This list has been standardized and validated for use in the Indian subcontinent in both normal hearing and clinical groups, in quiet and noise. A fluent Kannada female talker uttered the monosyllables. We recorded these nonsense syllables using a Behringer B-2 Pro dual-diaphragm condenser microphone (Behringer, Germany) kept at a distance of 5 cm from the speaker’s mouth. These recordings were done at a sampling frequency of 44,100 Hz in Adobe Audition 3.0 from a Dell Inspiron i3 core Laptop using a Motu Microbook II external sound card interface. The monosyllables were rms-normalized in Adobe Audition prior to calibration.

  • (ii)

    Phonemically balanced words—Kannada [29]—This test has 21 equivalent Kannada word lists with comparable word identification performance in noise (refer Manjula et al. for further details regarding the generation and validity of the lists). Each list contains 25 phonemically balanced, meaningful, disyllabic Kannada words clearly articulated by a young female who is a native speaker of the language. The original lists, recorded and stored in a computer with a 16-bit resolution at a sampling rate of 44,100 Hz, were obtained via personal communication with the authors. From this, we utilized eight random lists in the current experiment.

  • (iii)

    Kannada sentences [30]—These were obtained from the sentence identification test—Kannada, which has 25 Kannada sentence lists with matched difficulty levels (refer to Geetha et al. for information on the detailed procedures adopted). Each list has ten low-predictability sentences uttered by a female native Kannada speaker. Each sentence contains four keywords. Scoring is based on the number of keywords correctly identified. The original lists, recorded and stored at a sampling rate of 44,100 Hz with 16-bit resolution, were obtained. From this, we used eight lists in the current experiment.

Eight-talker babble taken from Quick SIN-Kannada [31] served as the masker. The babble was chosen as it represents the most commonly encountered communication situation [32]. We hypothesized that this would help us obtain a realistic portrayal of our participants’ SPiN abilities. The speech stimuli (monosyllables, words, and sentences) were mixed with the babble using a custom MATLAB function at four different SNRs: +5, 0, −5, and −10 dB [33]. These SNRs were chosen to obtain a complete psychometric range of the participants’ performance. The code manipulates the SNR by keeping the speech level constant and varying the noise level. While mixing, the target speech was mixed with arbitrary sections of the masker. In all instances, the masker preceded and succeeded each speech stimulus by 500 ms. The mixing of the speech stimuli and the babble was done prior to the experiment, and every participant listened to the same babble-stimulus combination.
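The mixing step can be illustrated with a short Python sketch. It is not the custom MATLAB function used in the study, whose details are not given here. It assumes that the SNR is defined on rms levels, that the noise gain is computed from the whole selected babble segment, and that `pad` (in samples) stands in for the 500 ms of leading and trailing masker. All names are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(speech, babble, snr_db, pad=0, rng=None):
    """Mix speech with a babble segment at the requested SNR.

    The speech is left untouched; only the babble is rescaled.
    Returns (mixture, scaled_babble_segment).
    """
    rng = rng or random.Random(0)
    total = len(speech) + 2 * pad
    if len(babble) < total:
        raise ValueError("babble is shorter than the padded speech")
    # Pick an arbitrary section of the masker.
    start = rng.randrange(len(babble) - total + 1)
    segment = babble[start:start + total]
    # Gain that places the babble snr_db below (or above) the speech rms.
    gain = rms(speech) / (rms(segment) * 10 ** (snr_db / 20))
    scaled = [gain * x for x in segment]
    mixture = list(scaled)
    for i, s in enumerate(speech):
        mixture[pad + i] += s
    return mixture, scaled


# Synthetic stand-ins for a speech token and a babble recording.
noise_rng = random.Random(42)
speech = [math.sin(0.05 * i) for i in range(1000)]
babble = [noise_rng.gauss(0.0, 1.0) for _ in range(5000)]
for target in (5, 0, -5, -10):
    _, scaled = mix_at_snr(speech, babble, target, pad=200)
    achieved = 20 * math.log10(rms(speech) / rms(scaled))
    print(target, round(achieved, 6))
```

Scaling the final mixture to a fixed overall presentation level is a separate step and is not shown.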

Procedure

Using a Dell Inspiron Laptop with an i3 core processor, the SPiN stimuli were presented binaurally (for better representation of the naturally encountered situation) using Sennheiser HD 280 pro headphones. The output of the headphones was calibrated prior to the experiment using a KEMAR with an ear simulator connected to a Bruel and Kjaer 2270 sound level meter. The overall presentation level of the pre-mixed speech and babble (across SNRs) was maintained at 70 dB SPL. The participants were instructed to repeat what they heard verbatim and were allowed to guess. Before the actual testing, we provided the participants with a few practice trials (separately for each type of speech stimulus—monosyllables, words, and sentences). Once the practice session terminated, the participants were presented with the lists of stimuli chosen for the SPiN experiments. The practice lists were different from the test lists. The test lists were presented randomly such that the difficulty of the succeeding list (both in terms of SNR as well as linguistic complexity) could not be predicted. This counterbalancing was done to remove order effects, if any. A score of one was given for every correct response (repetition of the correct monosyllable, whole word, or keywords of sentences) and zero for any incorrect or partially correct response. The total correct scores were calculated, and the following metrics were derived,

  • (i)

    Percent-correct scores, which were calculated as (number of correct responses/number of items presented) × 100.

  • (ii)

    SNR-50 using the Spearman-Karber equation [34, 35] to obtain the statistical 50% point of speech perception. The equation is as follows

    \[ \text{SNR-50} = \text{starting level of SNR} + \tfrac{1}{2}\,d - \frac{d \times N}{W}, \]
    where d is the SNR step size (5 in the current study), N is the number of correct responses, and W is the number of items presented per step.
  • (iii)

    The logistic slope of the psychometric function using the psignifit 4 toolbox [36] implemented in MATLAB. This toolbox provides Bayesian inference of psychometric functions using likelihood maximization. Using this toolbox, we calculated the logistic slope at the 50% threshold point from the number of correct responses and the number of presented items across SNR for each participant.
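The first two metrics can be computed directly from the response counts, as in the sketch below. It assumes that the starting level is the most favourable SNR (+5 dB) and that N is the total number of correct responses summed over all SNR steps. These are assumptions about how the equation was applied, and the listener data are hypothetical.

```python
def percent_correct(n_correct, n_items):
    """Percent-correct score for one SNR condition."""
    return 100.0 * n_correct / n_items


def spearman_karber_snr50(start_snr_db, step_db, total_correct, items_per_step):
    """SNR-50 from the Spearman-Karber equation.

    start_snr_db:   starting (assumed highest) SNR in dB
    step_db:        SNR step size d
    total_correct:  N, correct responses summed over all steps (assumption)
    items_per_step: W, items presented at each step
    """
    return start_snr_db + 0.5 * step_db - (step_db * total_correct) / items_per_step


# Hypothetical listener: 25 items per SNR, correct counts at +5, 0, -5, -10 dB.
correct_by_snr = {5: 25, 0: 20, -5: 10, -10: 0}
items = 25

scores = {snr: percent_correct(c, items) for snr, c in correct_by_snr.items()}
snr50 = spearman_karber_snr50(5, 5, sum(correct_by_snr.values()), items)

print(scores)  # {5: 100.0, 0: 80.0, -5: 40.0, -10: 0.0}
print(snr50)   # -3.5
```

For this listener the estimated 50% point falls between the 0 dB and −5 dB conditions, where the scores are 80% and 40%.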

Results

Overall, the data followed multivariate normality on the Shapiro-Wilk test. As the sample sizes for the two groups were unequal, bootstrapping analysis [37, 38] with 1000 samples was performed wherever applicable. The statistical tests were carried out using the Statistical Package for Social Sciences, version 26 (IBM Corp., Armonk, NY) and R 4.3.0 [39].

Age and Contralateral Inhibition of TEOAE

The TEOAE amplitudes were obtained in two conditions—CAS− and CAS+ . Both the young and older adults had higher TEOAE amplitude in the CAS− (mean ± standard deviation, 10.8 ± 5.3 dB SPL in young and 4.7 ± 4.3 dB SPL in the older adults) compared to the CAS+ (mean ± standard deviation, 9.3 ± 5.0 dB SPL in young and 3.7 ± 4.3 dB SPL in the older adults) condition. We assessed the statistical significance of this difference (between the CAS− and CAS+ conditions) using a paired samples t-test with 1000 bootstrapped samples for both groups. The TEOAE amplitudes were significantly lower in the CAS+ than in the CAS− condition in both young (t (28) = 8.701, P = .001, d = 1.62) and older (t (19) = 5.501, P = .001, d = 1.23) adults. The 95% bootstrapped bias-corrected accelerated confidence intervals (BCa CI) were [1.231 1.914] and [.691 1.297] for the young and older adults, respectively. The BCa CIs are consistent with the rejection of the null hypothesis. The noise floors for the two repeated recordings in the two conditions (CAS− and CAS+) were comparable for both the groups [(t (28) = 1.919, P = .066) and (t (19) = .015, P = .988) respectively for young and older adults]. This suggests that the reduction in amplitude of TEOAE with CAS is likely due to the activation of the MOCR and not due to changes in external or internal factors.
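The reported effect sizes can be related to the t statistics with a short calculation. The sketch assumes that d is the paired-samples effect size (mean difference divided by the standard deviation of the differences), which equals t divided by the square root of the number of pairs. The text does not state which definition of d was used, so this is a consistency check under that assumption, and the function name is hypothetical.

```python
import math


def paired_cohens_d_from_t(t_value, n_pairs):
    """Paired-samples Cohen's d recovered from a paired t statistic."""
    return t_value / math.sqrt(n_pairs)


# t values and group sizes as reported for the CAS- versus CAS+ comparison.
print(round(paired_cohens_d_from_t(8.701, 29), 2))  # 1.62 (young adults)
print(round(paired_cohens_d_from_t(5.501, 20), 2))  # 1.23 (older adults)
```

Under this assumption, both values match the effect sizes given in the text.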

We then calculated the inhibition magnitudes for all our participants. Young adults had higher inhibition of TEOAEs (mean inhibition magnitude of 1.5 ± 0.9 dB) than older adults (mean inhibition magnitude of 0.9 ± 0.8 dB). Figure 1 depicts the baseline TEOAE and inhibition magnitudes in young and older adults. The inhibition magnitudes are known to be affected by the baseline TEOAE amplitude: the greater the TEOAE amplitude, the higher the inhibition [40, 41]. Therefore, we added baseline TEOAE amplitude (TEOAE amplitude in the CAS− condition) as a covariate. Results of one-way analysis of variance (ANOVA) showed no significant difference between the inhibition magnitude in young and older adults [F (1, 46) = 0.469, P = 0.497, ηp² = .01]. The BCa CI for 1000 bootstraps was [−.392 .705], which is consistent with the lack of significant difference between the two groups. The baseline TEOAE amplitudes had a significant influence on the contralateral inhibition of TEOAEs [F (1, 46) = 6.019, P = 0.018, ηp² = .116].

Fig. 1.


Mean (horizontal solid line) and one standard deviation (vertical solid line) and individual data of the baseline TEOAE amplitudes for 65 dB peSPL clicks and the magnitudes of contralateral inhibition of TEOAEs in young (N = 29) and older adults (N = 20)

MOCR Strength, SPiN, and Age

The study followed a repeated measures design to understand the nature of the associations between SPiN scores and the MOCR strength. To achieve this, we fit the data into a linear mixed-effects model using a restricted maximum likelihood estimation method. The modelling was performed using RStudio (RStudio Team, 2020) and the “lme4” package within R [42]. To obtain the significance values (P-values) using Satterthwaite’s method, we used the “lmerTest” package [43]. The linear mixed-effects models were adopted as they are advantageous over the traditionally used repeated measures or correlational designs [21, 22]. Linear mixed-effects modelling systematically incorporates fixed and random effects to cater to the categorical grouping factors, the between-subject differences as well as the correlational structure among the predictor variables. This method would help interpret the associations in a better manner. Including a by-participant random effect is particularly beneficial as it helps in addressing unexplained individual variations among the participants [44, 45]. We also performed likelihood ratio tests to compare the goodness of fit of the models. In the current study, as we had three different measures of the dependent variable—SPiN (the three quantification metrics), we ran three separate mixed-effects models—one with the percent-correct metric, one with SNR-50 and one with the logistic slope of the psychometric function.

Percent-Correct Scores and MOCR Strength

The percent-correct scores obtained from the participants across 5, 0, −5, and −10 dB SNRs are displayed in Fig. 2. The performance varied depending on the SNR and stimulus type, as evident from the figure. Following this, we fit a linear mixed-effects model with the percent-correct scores as the dependent variable and the inhibition magnitude (MOCR strength), groups, stimulus type, and SNR as the independent variables (fixed effects), the interaction effects, and a random intercept of participants (full model). An ANOVA of the model revealed significant main effects of the groups [F(1, 46) = 112.3462, P < .001], stimulus type [F(2, 517) = 546.7552, P < .001], and SNR [F(3, 517) = 3239.1885, P < .001]. However, there was no significant main effect of the inhibition magnitude [F(1, 46) = .5809, P = .4498]. The two-way and three-way interactions involving the SPiN parameters (such as SNR and stimulus type) and the age groups were significant. This is because the percent-correct scores varied differentially depending on the type of stimulus and SNR across the two groups. The young adults outperformed the older adults in SPiN across all SNRs and speech materials. We did not conduct post hoc analyses to understand the nature and extent of the interactions as they were not the objectives of the current study and have been reported in detail in the literature [20].

Fig. 2.


Mean and one standard deviation of the percent-correct scores across SNRs for a monosyllables, b words, and c sentences. Individual points depict the scores of the young (N = 29) and older (N = 20) adults

A summary of the model is provided in Table 1. The table shows that all the fixed effects in the model were significant contributors to SPiN, except the inhibition magnitude. These results show no evidence of the contribution of MOCR strength, as assessed using contralateral inhibition of TEOAEs, to a person’s SPiN ability (when quantified using a percent-correct metric). The full model incorporating the percent-correct measure had a total explanatory power (conditional r², which is calculated considering both fixed and random effects of the model) of .96, which indicates a good model fit. This model had an Akaike Information Criterion (AIC) value of 4094.9. The full model was then compared with a null model (all fixed effects equated to 1). The fit of the full model was significantly better (P < .001) than that of the null model (AIC of 5817.5).

Table 1.

Outcomes of the linear mixed-effects model that assessed the association between the percent-correct SPiN scores, SNR, stimulus type, age, and the MOCR strength

| Fixed effects | Level | Estimate | Standard error | df | t-value | P-value |
|---|---|---|---|---|---|---|
| Intercept | | 48.3409 | 2.1531 | 103.5194 | 22.452 | < .001 |
| Age | Older adults | −32.4660 | 2.6030 | 216.6535 | 12.473 | < .001 |
| | Young adults | Reference | | | | |
| MOCR | Inhibition magnitude | .6853 | 0.8992 | 46.0000 | .762 | .4498 |
| Speech stimulus | Sentences | −23.2500 | 1.8577 | 517.0000 | 12.515 | < .001 |
| | Words | −45.1379 | 1.8577 | 517.0000 | 24.297 | < .001 |
| | Monosyllables | Reference | | | | |
| SNR | −5 dB SNR | 24.8621 | 1.8577 | 517.0000 | 13.383 | < .001 |
| | 0 dB SNR | 46.3793 | 1.8577 | 517.0000 | 24.966 | < .001 |
| | 5 dB SNR | 49.4483 | 1.8577 | 517.0000 | 26.618 | < .001 |
| | −10 dB SNR | Reference | | | | |

| Random effects of participants | Variance | Standard deviation |
|---|---|---|
| By-participant intercept | 26.93 | 5.190 |
| Residual | 50.04 | 7.074 |
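One way to read the random-effects part of Table 1 is to express the by-participant intercept variance as a share of the variance left unexplained by the fixed effects. The sketch below computes this intraclass correlation from the two variance components in the table. The quantity is not reported in the study and is shown only to illustrate what the by-participant random intercept captures. The function name is hypothetical.

```python
def intraclass_correlation(var_participant, var_residual):
    """Share of the non-fixed-effect variance attributable to participants."""
    return var_participant / (var_participant + var_residual)


# Variance components taken from Table 1 (percent-correct model).
icc = intraclass_correlation(26.93, 50.04)
print(round(icc, 2))  # 0.35
```

On this reading, roughly a third of the variance not accounted for by the fixed effects reflects stable differences between participants.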

SNR-50 and MOCR Strength

Figure 3 represents the SNR-50 obtained in our participants for monosyllables, words, and sentences. The young adults had better SNR-50 and less variability than older adults for all three types of speech stimuli. To assess if the MOCR strength contributes to SNR-50, we fit a linear mixed-effects model to the data, with the MOCR strength, stimulus type, and group as the fixed effects. The interaction effects were also incorporated into the model. The model also included a by-participant random intercept. The ANOVA of the model revealed a significant main effect of groups [F(1, 46) = 112.3089, P < .001] and stimulus type [F(2, 94) = 307.2795, P < .001]. However, the MOCR strength was not a significant contributor to the SNR-50 [F(1, 46) = .5827, P = .4491]. The interaction between groups and stimulus types was significant. Post hoc analyses of this interaction were not carried out as they were not the primary focus of the current study. A summary of the model is provided in Table 2. The conditional r² of this model with SNR-50 as the dependent variable was .90. From the table, it is evident that both stimulus type and age group influenced the SNR-50 scores but not the inhibition magnitude.

Fig. 3.


Mean, one SD, and individual data points of SNR-50 for monosyllables, words, and sentences for young (N = 29) and older adults (N = 20)

Table 2.

Outcomes of the linear mixed-effects model that assessed the association between the SNR-50 metric of SPiN, stimulus type, age, and the MOCR strength

| Fixed effects | Level | Estimate | Standard error | df | t-value | P-value |
|---|---|---|---|---|---|---|
| Intercept | | −8.2023 | .3777 | 61.8528 | 21.715 | < .001 |
| Age | Older adults | 3.7400 | .4077 | 86.4509 | 9.714 | < .001 |
| | Young adults | Reference | | | | |
| MOCR | Inhibition magnitude | −0.1373 | .1799 | 46.0000 | .763 | .4491 |
| Speech stimulus | Sentences | .8934 | .2478 | 94.0000 | 3.605 | < .001 |
| | Words | 5.1828 | .2478 | 94.0000 | 20.914 | < .001 |
| | Monosyllables | Reference | | | | |

| Random effects of participants | Variance | Standard deviation |
|---|---|---|
| By-participant intercept | .9477 | .9735 |
| Residual | .8905 | .9437 |

The full model (with an AIC of 481.17) was compared with a null model (with an AIC of 743.09). As evident from the lower AIC values, the fit of the full model was significantly better than that of the null model (P < .001).

Logistic Slope of SPiN Scores and MOCR Strength

We calculated the logistic slope of the SPiN scores for all our participants (Fig. 4) to understand how the proportion of correct scores changes with SNR changes. We first assessed the goodness of fit of the logistic slope using Pearson’s chi-square goodness of fit test, which indicated good fit for all participants in all conditions (P < .05). We then fit a linear mixed-effects model with the logistic psychometric slope as the dependent variable and MOCR strength, groups, and stimulus as the fixed effects. However, the full model resulted in a singular fit with the variance of random effects at 0. Therefore, we removed the random effects from the full model, and the summary of the reduced model is presented in Table 3. The table shows that groups and stimuli significantly predicted the logistic slope of the psychometric function. However, the inhibition magnitude did not contribute significantly to predicting the logistic slope. The adjusted r² was .29, which points to a weak model fit. This indicates that factors other than the stimulus type and age group may be determinants of the logistic slope of the psychometric function.
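To make the slope metric concrete, the sketch below uses a generic two-parameter logistic function of SNR and evaluates its slope at the 50% point. This is an illustration only: it omits the guess and lapse rates that a toolbox fit would typically include, its parameterization may differ from the one psignifit uses, and the threshold and steepness values are hypothetical.

```python
import math


def logistic(snr_db, threshold_db, k):
    """Proportion correct as a logistic function of SNR (no guess or lapse rate)."""
    return 1.0 / (1.0 + math.exp(-k * (snr_db - threshold_db)))


def slope_at_threshold(k):
    """Analytic slope at the 50% point, in proportion correct per dB."""
    return k / 4.0


def numerical_slope(threshold_db, k, h=1e-5):
    """Central-difference check of the slope at the 50% point."""
    upper = logistic(threshold_db + h, threshold_db, k)
    lower = logistic(threshold_db - h, threshold_db, k)
    return (upper - lower) / (2 * h)


threshold_db, k = -3.5, 0.4  # hypothetical values
print(logistic(threshold_db, threshold_db, k))  # 0.5
print(slope_at_threshold(k))                    # 0.1
print(abs(numerical_slope(threshold_db, k) - slope_at_threshold(k)) < 1e-6)  # True
```

A slope of 0.1 here means that, near the 50% point, a 1 dB improvement in SNR raises the proportion correct by about 10 percentage points.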

Fig. 4.


Mean, one SD, and individual data points of the slope of the psychometric functions of SPiN scores and SNR for monosyllables, words, and sentences in young (N = 29) and older (N = 20) adults. The slope values depicted here represent the slope of the psychometric curve at 50% performance

Table 3.

Outcomes of the linear model that assessed the association between the logistic slope of the psychometric function, stimulus type, age, and the MOCR strength

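| Predictor | Level | Estimate | Standard error | t-value | P-value |
|---|---|---|---|---|---|
| Intercept | | .0838 | .0055 | 15.125 | < .001 |
| Age | Older adults | .0213 | .0069 | 3.068 | .003 |
| | Young adults | Reference | | | |
| MOCR | Inhibition magnitude | −0.0011 | .0022 | .507 | .613 |
| Speech stimulus | Sentences | .0479 | .0062 | 7.776 | < .001 |
| | Words | .0210 | .0062 | 3.412 | < .001 |
| | Monosyllables | Reference | | | |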

Discussion

We aimed to systematically evaluate if the MOCR strength assessed as the magnitude of contralateral inhibition of OAEs modulates a person’s SPiN. The principal findings of our study were that (i) both young and older adults had similar magnitudes of contralateral inhibition of TEOAEs, when controlled for the baseline TEOAE amplitude (known to influence the inhibition magnitude), and (ii) there was no evidence of the contribution of the MOCR (assessed as the magnitude of contralateral inhibition of OAEs) to the SPiN performance.

Effect of Age on Contralateral Inhibition of TEOAE and SPiN

The TEOAE amplitudes in quiet (CAS−) were significantly lower in older adults. Reduced OAE amplitudes, even in the presence of normal hearing sensitivity, are not uncommon in older adults [46]. Interestingly, we observed comparable magnitudes of contralateral inhibition of OAEs between young and older adults when the baseline TEOAE amplitudes were controlled statistically. This is in line with the study by Abdala et al. [19], where the authors found no age-related differences in the reflection component of distortion product OAEs, when the MOCR strength was quantified using the normalized inhibition magnitude (which accounts for variations in the baseline OAE activity). In our study, we measured TEOAEs, the reflection aspect of the outer hair cell response [47].

Age was a significant contributor in all three models of SPiN. In general, older adults had lower percent-correct scores, higher SNR-50, and steeper slopes than young adults (Figs. 2, 3, and 4). Age-related decline in SPiN has been studied extensively. Poor SPiN performance in older adults has been attributed to distorted suprathreshold auditory processing and a decline in cognitive skills with age [48, 49]. Another factor known to affect SPiN ability is peripheral hearing sensitivity [48]. In the current study, we ensured that all participants had normal hearing thresholds (at or below 20 dB HL) at conventional audiometric frequencies. However, absolute thresholds may still have differed between the young and the older adults within this normal range, and such a difference may also contribute to poorer SPiN performance in older adults.

SPiN Performance and Contralateral Inhibition of TEOAEs

In the current study, we quantified SPiN using three different metrics and assessed the association of each with the MOCR strength, measured using contralateral inhibition of TEOAEs. There was no evidence of a modulatory influence of the MOCR strength on SPiN. Previous studies that assessed the association between the two used correlational approaches, and their results are contradictory. However, our findings align with the results of a recent meta-analysis [50] that combined studies assessing the relationship between SPiN and the MOCR strength (quantified using OAE-based measures) with a correlational approach. The authors of that meta-analysis report no apparent relationship between SPiN and the MOCR strength and suggest that there is a need to improve the research methodologies adopted in studies assessing this relationship. In the current study, we adopted a more robust statistical approach, linear mixed-effects modelling, to understand the contributions of the MOCR to SPiN and found no evidence of a relationship between the two. Additionally, to the best of our knowledge, this is the first study to systematically investigate the association between SPiN and the MOCR strength across a variety of SPiN-related parameters. The repeated measures design and the inclusion of random by-participant effects in the linear mixed-effects model are also advantageous, as they help account for individual variation.

Several animal studies indicate that the MOCR aids signal detection in the presence of noise [51, 52]. These animal models differ from studies involving human subjects in the way the MOCR is elicited and recorded. It is not ethical to obtain invasive near-site recordings of the magnitude of the MOCR in humans. Contralateral inhibition of OAEs is the easiest and quickest method to assess the MOCR strength non-invasively and is therefore the most commonly used. However, this technique assesses the uncrossed medial olivocochlear system, which produces a weaker MOCR than the ipsilateral crossed pathway [1, 53]. The total MOCR effect may therefore be understated by such indirect measurement [1, 50]. Consequently, measuring the MOCR strength using contralateral inhibition of OAEs (as in the current study and in most existing studies) is not the ideal method to determine how the MOCR and SPiN are related. Other techniques, such as ipsilateral or binaural inhibition of OAEs [53] and psychoacoustic measures like the forward masking paradigm [54], may be more appropriate for estimating the total MOCR effect.

It should also be noted that, although all participants in the current study had middle ear muscle reflex thresholds higher than 70 dB HL, there was still a possibility of sub-clinical activation of this reflex [55], which mimics the MOCR in many ways and can cause OAE inhibition [56]. Although contamination cannot be ruled out completely, the chance of it affecting the MOCR estimate is low, as the CAS level used in the current study was 60 dB SPL. The middle ear muscle reflex is reported to affect the MOCR magnitude measured through contralateral inhibition of OAEs only when the CAS level is 65 dB SPL or higher [57, 58].

We should also remind readers that the statistical power for the older adult group was only about 70%, which is towards the lower end of the traditionally accepted power estimate for behavioral studies like this one. Moreover, this study considers only one set of variables that could influence the relationship between SPiN and the MOCR, specifically those related to SPiN measurement. Getting a thorough picture of the assistance offered by the MOCR in SPiN may also require consideration of several additional factors. Subsequent studies that systematically investigate OAE-related parameters, particularly the type, measurement, and quantification of OAEs, will advance understanding further. A recent study by Otsuka et al. [18] reported differential effects of the OAE quantification metric on the relationship between SPiN and the MOCR strength. However, the effects of other OAE parameters on this relationship are yet to be investigated systematically. Owing to the caveats associated with quantifying the MOCR strength using OAE-based measures, future studies should also use non-OAE-based measures, such as electrophysiological or psychoacoustical approaches, to understand the associations better.

Conclusions

In the current study, we systematically investigated the effect of several SPiN-related variables on the relationship between SPiN and the MOCR in young and older adults. We found no evidence of a modulatory influence of the MOCR on SPiN. These findings imply that although SPiN is known to be influenced by a complicated interplay between cortical and subcortical processes, it may not be possible to predict SPiN from the peripheral antimasking function of the MOCR assessed using contralateral inhibition of TEOAEs.

Acknowledgements

We thank the participants of the study for their cooperation and the Director, All India Institute of Speech and Hearing, for providing the necessary infrastructure to carry out the study.

Data Availability

Data are available in the form of figures and tables within the article. Additional information can be provided by the authors upon request.

Declarations

Conflict of Interest

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Guinan JJ. Olivocochlear efferents: anatomy, physiology, function, and the measurement of efferent effects in humans. Ear Hear. 2006;27:589–607. doi: 10.1097/01.aud.0000240507.83072.e7.
  • 2. Collet L, Kemp DT, Veuillet E, Duclaux R, Alain M, Moulin A. Effect of contralateral auditory stimuli on active cochlear micro-mechanical properties in human subjects. Hear Res. 1990;43:251–261. doi: 10.1016/0378-5955(90)90232-E.
  • 3. Backus BC, Guinan JJ. Measurement of the distribution of medial olivocochlear acoustic reflex strengths across normal-hearing individuals via otoacoustic emissions. J Assoc Res Otolaryngol. 2007;8:484–496. doi: 10.1007/s10162-007-0100-0.
  • 4. Chintanpalli A, Jennings SG, Heinz MG, Strickland EA. Modeling the anti-masking effects of the olivocochlear reflex in auditory nerve responses to tones in sustained noise. J Assoc Res Otolaryngol. 2012;13:219–235. doi: 10.1007/s10162-011-0310-3.
  • 5. Jennings SG. The role of the medial olivocochlear reflex in psychophysical masking and intensity resolution in humans: a review. J Neurophysiol. 2021;125:2279–2308. doi: 10.1152/jn.00672.2020.
  • 6. Brown GJ, Ferry RT, Meddis R. A computer model of auditory efferent suppression: implications for the recognition of speech in noise. J Acoust Soc Am. 2010;127:943–954. doi: 10.1121/1.3273893.
  • 7. Yashaswini L, Maruthy S. The influence of efferent inhibition on speech perception in noise: a revisit through its level-dependent function. Am J Audiol. 2019;28:508–515. doi: 10.1044/2019_AJA-IND50-18-0098.
  • 8. Shaikh MA, Connell K, Zhang D. Controlled (re)evaluation of the relationship between speech perception in noise and contralateral suppression of otoacoustic emissions. Hear Res. 2021;409:108332. doi: 10.1016/j.heares.2021.108332.
  • 9. Kumar UA, Vanaja CS. Functioning of olivocochlear bundle and speech perception in noise. Ear Hear. 2004;25:142–146. doi: 10.1097/01.AUD.0000120363.56591.E6.
  • 10. Harkrider AW, Smith SB. Acceptable noise level, phoneme recognition in noise, and measures of auditory efferent activity. J Am Acad Audiol. 2005;16:530–545. doi: 10.3766/jaaa.16.8.2.
  • 11. Akbari M, Panahi R, Valadbeigi A, Hamadi Nahrani M. Speech-in-noise perception ability can be related to auditory efferent pathway function: a comparative study in reading impaired and normal reading children. Braz J Otorhinolaryngol. 2020;86:209–216. doi: 10.1016/j.bjorl.2018.11.010.
  • 12. Mukari SZMS, Mamat WHW. Medial olivocochlear functioning and speech perception in noise in older adults. Audiol Neurotol. 2008;13:328–334. doi: 10.1159/000128978.
  • 13. Bidelman GM, Bhagat SP. Right-ear advantage drives the link between olivocochlear efferent ‘antimasking’ and speech-in-noise listening benefits. NeuroReport. 2015;26:483–487. doi: 10.1097/WNR.0000000000000376.
  • 14. Maruthy S, Kumar UA, Gnanateja GN. Functional interplay between the putative measures of rostral and caudal efferent regulation of speech perception in noise. J Assoc Res Otolaryngol. 2017;18:635–648. doi: 10.1007/s10162-017-0623-y.
  • 15. MacPherson A, Akeroyd MA. Variations in the slope of the psychometric functions for speech intelligibility: a systematic survey. Trends Hear. 2014;18:233121651453772. doi: 10.1177/2331216514537722.
  • 16. Guinan JJ (2011) Physiology of the medial and lateral olivocochlear systems. In: Ryugo DK, Fay RR (eds) Auditory and vestibular efferents. Springer Handbook of Auditory Research, vol 38. Springer, New York, NY, pp 39–81
  • 17. Mertes IB, Wilbanks EC, Leek MR. Olivocochlear efferent activity is associated with the slope of the psychometric function of speech recognition in noise. Ear Hear. 2018;39:583–593. doi: 10.1097/AUD.0000000000000514.
  • 18. Otsuka S, Nakagawa S, Furukawa S. Relationship between characteristics of medial olivocochlear reflex and speech-in-noise-reception performance. Acoust Sci Technol. 2020;41:404–407. doi: 10.1250/ast.41.404.
  • 19. Abdala C, Dhar S, Ahmadi M, Luo P. Aging of the medial olivocochlear reflex and associations with speech perception. J Acoust Soc Am. 2014;135:754–765. doi: 10.1121/1.4861841.
  • 20. Füllgrabe C, Moore BCJ, Stone MA. Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition. Front Aging Neurosci. 2015;6:1–25. doi: 10.3389/fnagi.2014.00347.
  • 21. Koerner TK, Zhang Y. Application of linear mixed-effects models in human neuroscience research: a comparison with pearson correlation in two auditory electrophysiology studies. Brain Sci. 2017;7:26. doi: 10.3390/brainsci7030026.
  • 22. Gueorguieva R, Krystal JH. Move over ANOVA. Arch Gen Psychiatry. 2004;61:310. doi: 10.1001/archpsyc.61.3.310.
  • 23. Venkatesan S (2009) Ethical guidelines for bio-behavioral research involving human subjects. http://aiishmysore.in/pdf/ethical_guidelines.pdf. Accessed 7 Sep 2023
  • 24. American National Standards Institute (1999) American national standard maximum permissible ambient noise levels for audiometric rooms [ANSI S3.1–1999]. New York
  • 25. Jerger J. Clinical experience with impedance audiometry. Arch Otolaryngol. 1970;92:311–324. doi: 10.1001/archotol.1970.04310040005002.
  • 26. Kaipa R, Kumar UA. Functioning of medial olivocochlear bundle in right- and left-handed individuals. Laterality Asymmetries Body Brain Cogn. 2017;22:445–454. doi: 10.1080/1357650X.2016.1217229.
  • 27. Wilson RH. The effects of aging on the magnitude of the acoustic reflex. J Speech Lang Hear Res. 1981;24:406–414. doi: 10.1044/jshr.2403.406.
  • 28. Mayadevi C (1974) Development and standardization of a common speech discrimination test for Indians [Unpublished Dissertation]. University of Mysore
  • 29. Manjula P, Antony J, Kumar K, Geetha C. Development of phonemically balanced word lists for adults in the Kannada language. J Hear Sci. 2015;5:22–30. doi: 10.17430/893515.
  • 30. Geetha C, Kumar KSS, Manjula P, Pavan M. Development and standardisation of the sentence identification test in the Kannada language. J Hear Sci. 2014;4:18–26. doi: 10.17430/890267.
  • 31. Avinash M, Meti R, Kumar U. Development of sentences for Quick Speech-in-Noise (Quick SIN) test in Kannada. J Indian Speech Hear Assoc. 2009;24:59–65.
  • 32. Stuart A, Butler AK. Contralateral suppression of transient otoacoustic emissions and sentence recognition in noise in young adults. J Am Acad Audiol. 2012;23:686–696. doi: 10.3766/jaaa.23.9.3.
  • 33. Gnanateja GN (2017) Speech in noise mixing, signal to noise ratio [Computer Software]
  • 34. Finney DJ. Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. 2nd ed. New York-London: Cambridge University Press; 1952.
  • 35. Tillman TW, Olsen W (1973) Speech audiometry. In: Jerger J (ed) Modern developments in audiology, 2nd ed. Academic Press, New York, NY, pp 37–74
  • 36. Schütt H, Harmeling S, Macke J, Wichmann F. Psignifit 4: pain-free Bayesian inference for psychometric functions. J Vis. 2015;15:474. doi: 10.1167/15.12.474.
  • 37. LaFlair GT, Egbert J, Plonsky L (2015) A practical guide to bootstrapping, descriptive statistics, correlations, t-tests, and ANOVAs. In: Plonsky L (ed) Advancing quantitative methods in second language research. Taylor & Francis, New York, pp 46–77
  • 38. Field A (2009) Discovering statistics using SPSS, 3rd edn. SAGE Publications Inc., London, UK
  • 39. R Core Team (2021) R: A language and environment for statistical computing [Computer Software]
  • 40. Ryan S, Kemp DT. The influence of evoking stimulus level on the neural suppression of transient evoked otoacoustic emissions. Hear Res. 1996;94:140–147. doi: 10.1016/0378-5955(96)00021-4.
  • 41. Lewis JD. The effect of otoacoustic emission stimulus level on the strength and detectability of the medial olivocochlear reflex. Ear Hear. 2019;40:1391–1403. doi: 10.1097/AUD.0000000000000719.
  • 42. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67:1–48. doi: 10.18637/jss.v067.i01.
  • 43. Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest package: tests in linear mixed effects models. J Stat Softw. 2017;82:1–26. doi: 10.18637/jss.v082.i13.
  • 44. Brown VA. An introduction to linear mixed-effects modeling in R. Adv Methods Pract Psychol Sci. 2021;4:1–19. doi: 10.1177/2515245920960351.
  • 45. Meteyard L, Davies RAI. Best practice guidance for linear mixed-effects models in psychological science. J Mem Lang. 2020;112:104092. doi: 10.1016/j.jml.2020.104092.
  • 46. Keppler H, Dhooge I, Corthals P, Maes L, D’haenens W, Bockstael A, Philips B, Swinnen F, Vinck B. The effects of aging on evoked otoacoustic emissions and efferent suppression of transient evoked otoacoustic emissions. Clin Neurophysiol. 2010;121:359–365. doi: 10.1016/j.clinph.2009.11.003.
  • 47. Shera CA, Guinan JJ. Evoked otoacoustic emissions arise by two fundamentally different mechanisms: a taxonomy for mammalian OAEs. J Acoust Soc Am. 1999;105:782–798. doi: 10.1121/1.426948.
  • 48. Heinrich A, Knight S (2016) The contribution of auditory and cognitive factors to intelligibility of words and sentences in noise. In: van Dijk P, Başkent D, Gaudrain E, de Kleine E, Wagner A, Lanting C (eds) Physiology, psychoacoustics and cognition in normal and impaired hearing. Advances in Experimental Medicine and Biology, vol 894. Springer, Cham, Switzerland, pp 37–45
  • 49. Vander Werff KR, Burns KS. Brain stem responses to speech in younger and older adults. Ear Hear. 2011;32:168–180. doi: 10.1097/AUD.0b013e3181f534b5.
  • 50. Gafoor SA, Uppunda AK. Role of the medial olivocochlear efferent auditory system in speech perception in noise: a systematic review and meta-analyses. Int J Audiol. 2023. doi: 10.1080/14992027.2023.2260951.
  • 51. Kawase T, Liberman MC. Antimasking effects of the olivocochlear reflex. I. Enhancement of compound action potentials to masked tones. J Neurophysiol. 1993;70:2519–2532. doi: 10.1152/jn.1993.70.6.2519.
  • 52. Hienz RD, Stiles P, May BJ. Effects of bilateral olivocochlear lesions on vowel formant discrimination in cats. Hear Res. 1998;116:10–20. doi: 10.1016/S0378-5955(97)00197-4.
  • 53. Berlin CI, Hood LJ, Hurley AE, Wen H, Kemp DT. Binaural noise suppresses linear click-evoked otoacoustic emissions more than ipsilateral or contralateral noise. Hear Res. 1995;87:96–103. doi: 10.1016/0378-5955(95)00082-F.
  • 54. DeRoy MK, Alexander JM, Strickland EA. The relationship between ipsilateral cochlear gain reduction and speech-in-noise recognition at positive and negative signal-to-noise ratios. J Acoust Soc Am. 2021;149:3449–3461. doi: 10.1121/10.0003964.
  • 55. Feeney MP, Keefe DH. Estimating the acoustic reflex threshold from wideband measures of reflectance, admittance, and power. Ear Hear. 2001;22:316–332. doi: 10.1097/00003446-200108000-00006.
  • 56. Liberman MC, Guinan JJ. Feedback control of the auditory periphery: anti-masking effects of middle ear muscles vs. olivocochlear efferents. J Commun Disord. 1998;31:471–483. doi: 10.1016/S0021-9924(98)00019-7.
  • 57. Jennings SG, Aviles ES. Middle ear muscle and medial olivocochlear activity inferred from individual human ears via cochlear potentials. J Acoust Soc Am. 2023;153:1723–1732. doi: 10.1121/10.0017604.
  • 58. Boothalingam S, Goodman SS, MacCrae H, Dhar S. A time-course-based estimation of the human medial olivocochlear reflex function using clicks. Front Neurosci. 2021;15:1–16. doi: 10.3389/fnins.2021.746821.
