Journal of Neurophysiology. 2020 Jul 8;124(2):418–431. doi: 10.1152/jn.00016.2020

Electrophysiological markers of cochlear function correlate with hearing-in-noise performance among audiometrically normal subjects

Kelsie J Grant 1,*, Anita M Mepani 1,*, Peizhe Wu 1,2, Kenneth E Hancock 1,2, Victor de Gruttola 3, M Charles Liberman 1,2,4, Stéphane F Maison 1,2,4
PMCID: PMC7500376  PMID: 32639924

Abstract

Hearing loss caused by noise exposure, ototoxic drugs, or aging results from the loss of sensory cells, as reflected in audiometric threshold elevation. Animal studies show that loss of hair cells can be preceded by loss of auditory-nerve peripheral synapses, which likely degrades auditory processing. While this condition, known as cochlear synaptopathy, can be diagnosed in mice by a reduction of suprathreshold cochlear neural responses, its diagnosis in humans remains challenging. To look for evidence of cochlear nerve damage in normal hearing subjects, we measured their word recognition performance in difficult listening environments and compared it to cochlear function as assessed by otoacoustic emissions and click-evoked electrocochleography. Several electrocochleographic markers were correlated with word scores, whereas distortion product otoacoustic emissions were not. Specifically, the summating potential (SP) was larger and the cochlear nerve action potential (AP) was smaller in those with the worst word scores. Adding a forward masker or increasing stimulus rate reduced SP in the worst performers, suggesting that this potential includes postsynaptic components as well as hair cell receptor potentials. Results suggest that some of the variance in word scores among listeners with normal audiometric thresholds arises from cochlear neural damage.

NEW & NOTEWORTHY Recent animal studies suggest that millions of people may be at risk of permanent impairment from cochlear synaptopathy, the age-related and noise-induced degeneration of neural connections in the inner ear that “hides” behind a normal audiogram. This study examines electrophysiological responses to clicks in a large cohort of subjects with normal hearing sensitivity. The resultant correlations with word recognition performance are consistent with an important contribution of cochlear neural damage to deficits in hearing-in-noise abilities.

Keywords: auditory brain stem responses, auditory nerve, cochlea, cochlear synaptopathy, electrocochleography, hair cell, hearing in noise, hidden hearing loss

INTRODUCTION

Acoustic overexposure can cause hair cell damage and permanent threshold elevation (e.g., Liberman and Dodds 1984; Liberman and Kiang 1984; Schmiedt 1984). Until recently, it was thought that hair cells were the primary targets and that cochlear neurons died only as a result of hair cell degeneration (Bohne and Harding 2000). According to this view, sound exposures leading to a reversible threshold elevation are benign, an assumption that currently underlies the damage-risk criteria set by federal agencies for workplace exposures (Arenas and Suter 2014). The discovery that noise exposure, aging, or ototoxic drugs can destroy synapses between cochlear nerve fibers and their hair cell targets, even when hair cells and thresholds recover, has altered this view (Bourien et al. 2014; Kujawa and Liberman 2009, 2015; Liberman 2017; Sergeyenko et al. 2013).

This type of synaptopathy does not elevate behavioral or electrophysiological thresholds until it becomes extreme (Lobarinas et al. 2013; Woellner and Schuknecht 1955), partly because the most vulnerable cochlear neurons, to both noise and aging, do not contribute to threshold detection in quiet (Furman et al. 2013; Schmiedt et al. 1996). These neurons, with high thresholds and low spontaneous rates (SRs), are key to the coding of transient stimuli in the presence of continuous background noise (Costalupes et al. 1984; Furman et al. 2013; Schmiedt et al. 1996), and their loss could be a major contributor to difficulties in speech discrimination in noisy environments (Alvord 1983; Dubno et al. 1984; Kujawa and Liberman 2015; Rajan and Cainer 2008). Human studies have also shown that the rate of cochlear neural loss greatly outstrips the rate of sensory cell loss in the aging ear (Viana et al. 2015; Wu et al. 2019) and that this type of primary neural degeneration can be associated with poor word recognition scores (Felder and Schrott-Fischer 1995).

In animal studies of noise or aging, the suprathreshold amplitude of auditory brainstem response (ABR) wave I (a.k.a. action potential, AP), the summed activity of cochlear neurons, is correlated with synaptic losses in corresponding cochlear regions (Kujawa and Liberman 2009, 2015; Sergeyenko et al. 2013; Shaheen et al. 2015), so long as cochlear thresholds are normal. Attempts to translate this diagnostic metric to humans with normal audiograms have produced inconsistent results (see Bramhall et al. 2019). While some studies reported a lack of correlation between ABR wave I amplitude and inferred noise exposure (Fulbright et al. 2017; Grinn et al. 2017; Guest et al. 2017, 2018; Prendergast et al. 2017; Spankovich et al. 2017), others have found electrophysiological signs of primary neural damage in subjects with overexposure history (Bramhall et al. 2017, 2018; Grose et al. 2017; Liberman et al. 2016; Ridley et al. 2018; Skoe and Tufts 2018; Valderrama et al. 2018) or aging (Burkard and Sims, 2002; Johannesen et al. 2019). Reasons for this discrepancy likely include 1) difficulties in accurately estimating lifetime noise exposures from questionnaires and interviews, and 2) the intersubject variability in noise vulnerability and/or ABR amplitudes (Bharadwaj et al. 2019; see Nikiforidis et al. 1993).

Differences in ABR analysis protocols may also be important. In subjects with poor speech-in-noise scores (Mepani et al. 2020) or greater acoustic overexposure (Liberman et al. 2016), we observed larger summating potentials (SPs), the component of the ABR response thought to reflect summed hair cell receptor potentials (Durrant et al. 1998). A similar SP enhancement is seen when electrocochleography (ECochG) is compared in subjects immediately before versus after recreational acoustic overexposure (Kim et al. 2005). In animal models, genetic mutations compromising inner hair cell (IHC) synaptic transmission (Santarelli et al. 2009) or drug- or age-induced neural degeneration (Sergeyenko et al. 2013; Yuan et al. 2014) can selectively attenuate the cochlear neural response (wave I or AP), without affecting SP. Such studies suggest that the AP rides on top of an SP plateau when the filter settings do not attenuate this low-frequency wave. Thus an SP enhancement can mask an AP decrement if AP is measured baseline to peak rather than SP to peak.

Finally, another major confound is the possible contribution of cochlear regions tuned to extended high frequencies (EHFs; >8 kHz), which are not assayed by standard audiometry but can respond to stimuli at moderate and high sound pressure levels (SPLs) and thus contribute to measurements of auditory evoked potentials. The resultant confound is that decrements in suprathreshold neural amplitudes in subjects with normal standard-frequency thresholds could be due to neural damage throughout the cochlea or to hair cell damage in the basal end of the cochlea (or both).

To address these issues, we recruited normal-hearing subjects with a wide range of performance on challenging word recognition tasks and extracted markers from auditory evoked potentials obtained under different recording conditions, including the use of a high-pass masker and a change in stimulus presentation rate. Our results suggest that cochlear dysfunction, perhaps including synaptopathy, which is not revealed by the audiogram, nevertheless contributes significantly to the degradation of word recognition in challenging listening environments.

MATERIALS AND METHODS

Subject Pool, Cognitive Assessment, and Inclusion Criteria

A total of 124 native speakers of English in good health, between the ages of 18 and 63 yr, with no history of ear or hearing problems, no history of neurological disorders, and unremarkable otoscopic examinations were recruited. All participants had normal audiometric thresholds from 0.25 to 8 kHz in both ears (≤25 dB hearing level, HL) and normal middle ear function. Tympanometry was performed using the Titan Suite from Interacoustics, with a probe-tone frequency of 226 Hz and an ear canal pressure change ranging from −300 daPa to +200 daPa in each ear to ensure that ear canal volume, tympanic membrane mobility, and middle ear pressure were normal. The Montreal Cognitive Assessment (MoCA version 8.1; https://www.mocatest.org/) was administered to screen for mild cognitive dysfunction (passing score ≥ 26). There were no additional inclusion criteria beyond the ability to give voluntary informed written consent. This study was reviewed and approved by the Institutional Review Board of Massachusetts Eye and Ear. Analysis of middle ear muscle reflexes from most of the same subjects has been presented in a prior report (Mepani et al. 2020).

Audiometric Thresholds and Distortion Product Otoacoustic Emissions

Audiometric thresholds were obtained using Interacoustics Equinox 4.0 with the High Hz option. Pure-tone air conduction (AC) thresholds were measured at standard audiometric frequencies from 0.25 to 8 kHz, plus 3 and 6 kHz, using DD45 headphones. To minimize changes in sound levels due to standing waves and improve intrasubject reliability of threshold estimates above 8 kHz, we measured AC thresholds at extended high frequencies (EHF: 9, 10, 11.2, 12.5, 14, and 16 kHz) using warble tones delivered via a circumaural HDA200 high-frequency headset.

To complement behavioral audiometry and provide an objective, rapid, and independent measure of cochlear amplifier function, we measured distortion product otoacoustic emissions (DPOAEs) as amplitude versus level functions with two primary tones f1 and f2 (f2/f1 = 1.22) with f2 = 0.5, 1, 2, 4, or 6 kHz using the Interacoustics Titan v.3.4.0. For DPOAEs generated at f2 = 8, 11.2, 12.5, 14, and 16 kHz, stimulus generation and data acquisition were handled by a custom rig based on 24-bit digital input-output boards from National Instruments in a PXI chassis, with custom software control via LabVIEW. Response and stimulus waveforms, to and from the input-output boards, were transduced via microphone and dual sound sources in an ER-10X system (Etymotic Research). DPOAEs were measured as amplitude versus level functions in 5-dB steps from 5 dB below threshold to 80 dB SPL. The DPOAE at 2f1−f2 was extracted from the ear canal sound pressure after both time-domain and spectral averaging. Threshold was defined as the lowest level required to elicit a DPOAE >5 dB above the noise floor.
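The primary-tone relationship and the threshold criterion described above can be sketched in a few lines. This is an illustration only, not the acquisition software used in the study: the function names and the input-output values are hypothetical, and the rule is simply "lowest level at which the DPOAE exceeds the noise floor by more than 5 dB".

```python
import numpy as np

def primaries(f2_hz, ratio=1.22):
    """Return (f1, 2f1-f2) in Hz for a given f2 and f2/f1 ratio."""
    f1 = f2_hz / ratio
    return f1, 2.0 * f1 - f2_hz

def dpoae_threshold(levels_db, dp_db, noise_db, criterion_db=5.0):
    """Lowest stimulus level whose DPOAE exceeds the noise floor by more
    than criterion_db; returns None if no level meets the criterion."""
    levels = np.asarray(levels_db, dtype=float)
    snr = np.asarray(dp_db, dtype=float) - np.asarray(noise_db, dtype=float)
    passing = levels[snr > criterion_db]
    return float(passing.min()) if passing.size else None

# Hypothetical input-output function in 5-dB steps (values invented for illustration)
levels = [20, 25, 30, 35, 40, 45]        # stimulus level, dB SPL
dp = [-18, -15, -9, -4, 2, 7]            # DPOAE amplitude, dB SPL
noise = [-17, -16, -15, -16, -15, -16]   # noise floor, dB SPL

f1, f_dp = primaries(4000.0)             # f2 = 4 kHz
thr = dpoae_threshold(levels, dp, noise)
```

With these invented values, the first level at which the emission clears the noise floor by more than 5 dB is 30 dB SPL.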

Word Recognition

The NU-6 corpus (from Auditec, Inc.) and a modified version of the QuickSIN Speech-in-Noise test (from Etymotic Research, Inc.) were used to assess speech recognition performance. We presented monaurally four different NU-6 word lists of 50 consonant-nucleus-consonant (CNC) phonemically balanced words at 55 dB HL (~75 dB SPL) under different conditions: 1) in the absence or presence of an ipsilateral speech-noise masker (weighted random noise with a constant amplitude from 125 to 1,000 Hz and falling 12 dB/octave from 1,000 Hz to 6,000 Hz) at 0-dB signal-to-noise ratio (SNR) or 2) speeded up (“time compression”) at 45% or 65% with added reverberation (0.3-s echo) (Noffsinger et al. 1994). Each participant was presented with the same lists in the same order. Our modified (m)QuickSIN test consisted of four lists of six sentences with five key words per sentence in the presence of a four-talker babble noise at decreasing SNR from 10 to 5, 3, 2, 1, and 0 dB. The first list of six sentences was used as practice. A combined score for the three subsequent lists was obtained by adding the number of correctly repeated key words. Scores from two participants were excluded because they were familiar with the word tests.

Electrocochleography

Stimuli were generated by our custom rig, stimulus waveforms were transduced via ER-3A insert earphones, and data acquisition was handled by the Interacoustics Eclipse hardware and software. Subjects’ ear canals were prepped by scrubbing with a cotton swab coated in Nuprep. Electrode gel was applied on the cleaned portion of the canal and over the gold foil of ER3-26A/B tiptrodes before insertion. A horizontal montage was used, with a ground on the forehead at midline, one tiptrode as the inverting electrode, and the other as the noninverting electrode in the opposite ear (Fig. 1A). Low (<5 kΩ) and balanced impedance readings were obtained with interelectrode impedance values within 2 kΩ of each other. Acoustic stimuli were delivered via silicone tubing connected to the ER-3A earphones. Stimuli were 100-µs clicks delivered at 125 dB peak SPL (pSPL) in alternating polarity at a presentation rate of 9.1 or 40.1 Hz in the presence or absence of a 90-ms forward masking noise (8–16 kHz, 5-ms ramp) presented 6 ms before the click onset (Fig. 1A). The total noise dose for all ECochG measurements was well within Occupational Safety and Health Administration (OSHA) and National Institute for Occupational Safety and Health (NIOSH) standards. Characteristics of the masking noise at the output of the ER-3A were obtained using a Bruel & Kjaer 0.5-in. microphone connected to an insert type coupler (Larsen-Davis AEC202) attached to an ear simulator (Larsen-Davis AEC 201-A) (Fig. 1B). Electrical responses were amplified 100,000 times, and 2,000 sweeps were averaged for each recording. Average traces acquired by the Eclipse software (passband 3.3 Hz – 5,000 Hz) were exported for further analysis, including digital filtering (64-point zero-phase finite impulse response filter) with a 10-Hz to 3,000-Hz passband. To study the filtering effects, unmasked waveforms were also analyzed with a 300-Hz to 3,000-Hz passband.
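The offline filtering step can be approximated as follows. This is a sketch under stated assumptions rather than the analysis code used here: the sampling rate is not given in the text, so `FS_HZ` is a placeholder, and SciPy's `firwin`/`filtfilt` stand in for whatever routine produced the 64-point zero-phase filter. A 64-tap FIR cannot realize a sharp edge at 10 Hz, and `filtfilt` applies the filter twice (forward and backward), so the magnitude response is squared.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

FS_HZ = 20_000.0  # hypothetical sampling rate; not stated in the text

def zero_phase_bandpass(x, fs_hz=FS_HZ, lo_hz=10.0, hi_hz=3000.0, numtaps=64):
    """Band-pass an averaged trace with a linear-phase FIR run forward and backward."""
    taps = firwin(numtaps, [lo_hz, hi_hz], pass_zero=False, fs=fs_hz)
    # Filtering in both directions cancels the phase delay of the FIR filter
    return filtfilt(taps, [1.0], x)

# Synthetic click-like pulse centered on sample 500 (invented data)
n = np.arange(1000)
pulse = np.exp(-0.5 * ((n - 500) / 4.0) ** 2)

wide = zero_phase_bandpass(pulse)                 # 10- to 3,000-Hz passband
narrow = zero_phase_bandpass(pulse, lo_hz=300.0)  # 300- to 3,000-Hz passband
```

Because the filter is zero phase, the peak of the synthetic pulse stays at the same sample in both outputs; only its shape and amplitude change with the high-pass cutoff.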

Fig. 1.

Electrocochleographic (ECochG) assessment of click-evoked auditory responses. A: a horizontal montage, with a ground on the forehead at midline, one tiptrode as the inverting electrode, and the other as the noninverting electrode in the opposite ear, was used to measure auditory brain stem responses with or without a forward masker and at a repetition rate of 9.1 Hz vs. 40.1 Hz. B: output of the ER-3A transducer (red) when driven by the 8- to 16-kHz masker (black) as measured in an ear simulator. Though the ER-3A rolls off at 4 kHz, we can generate significant power in the 8- to 16-kHz masker band, along with significant low-frequency energy where the low-power skirts of the bandpass input coincide with the high-power region of the ER-3A output. C: averaged click-evoked response obtained from all participants (±SE) shows how measures were extracted. Baseline was defined as the first amplitude point exceeding 2 SD above the mean pre-onset amplitude (−2 to 0 ms). Summating potential (SP) was defined as the difference between baseline and the early peak, or inflection point, on the rising phase of wave I. Action potential (AP) width was measured at amplitude point 25% down from the AP peak amplitude to the baseline. N1 (wave I) peak was defined as the maximum amplitude 1–2 ms post stimulus onset, and P1 was defined as the next major trough.

All waveform analysis (Fig. 1C) was done independently by two observers blinded to all other test results. Unlike Mepani et al. (2020), we defined baseline as the first point >2 SD above mean pre-onset amplitude (−2 to 0 ms) to improve objectivity in its identification. The N1 (wave I) peak was defined as the maximum amplitude 1–2 ms post stimulus onset, and P1 was defined as the next major trough. SP was defined as the difference between baseline and the early peak, or inflection point, on the rising phase of wave I. AP amplitude was defined as the difference between the SP peak and wave I peak. AP width was calculated at 25% of wave I amplitude. In a few instances, SP measures were excluded as the inflection point was lower than baseline.
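The marker definitions above translate into a short routine. The sketch below is illustrative and is not the scoring procedure used by the two observers: the SP peak or inflection point is passed in as an index (`sp_idx`, a hypothetical parameter) because it was identified by human judgment, AP width is omitted, and the waveform is synthetic.

```python
import numpy as np

def ecochg_markers(t_ms, v_uv, sp_idx):
    """Extract baseline, SP, and AP from an averaged click response.
    t_ms: time re stimulus onset (ms); v_uv: amplitude (µV);
    sp_idx: sample index of the SP peak/inflection chosen by an observer."""
    t = np.asarray(t_ms, dtype=float)
    v = np.asarray(v_uv, dtype=float)
    pre = v[(t >= -2.0) & (t < 0.0)]                  # pre-onset window
    crit = pre.mean() + 2.0 * pre.std(ddof=1)         # mean + 2 SD
    base_idx = np.flatnonzero((t >= 0.0) & (v > crit))[0]  # first point above criterion
    win = np.flatnonzero((t >= 1.0) & (t <= 2.0))     # search window for wave I
    n1_idx = win[np.argmax(v[win])]
    return {
        "baseline_ms": t[base_idx],
        "n1_ms": t[n1_idx],
        "sp_uv": v[sp_idx] - v[base_idx],             # SP: baseline to SP peak
        "ap_uv": v[n1_idx] - v[sp_idx],               # AP: SP peak to wave I peak
    }

# Synthetic piecewise-linear response sampled every 0.05 ms (invented data)
t = np.arange(-40, 81) / 20.0
post = np.interp(t, [0, 0.5, 0.8, 1.0, 1.5, 2.2, 4.0],
                 [0, 0, 0.2, 0.2, 1.0, -0.3, 0])
v = np.where(t < 0, 0.01 * (-1.0) ** np.arange(t.size), post)
m = ecochg_markers(t, v, sp_idx=56)                   # index 56 is t = 0.8 ms
```

Measuring AP from the SP peak rather than from baseline is the point of the last two dictionary entries: a larger SP pedestal does not inflate the AP estimate.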

To analyze the masking effects, waveforms obtained from each ear of each participant were averaged (as speech was tested only in one ear) and differences between unmasked and masked conditions from each subject were computed for the first 2 ms post stimulus onset. Hierarchical clustering of the difference waveforms obtained from each participant was run in Matlab to obtain a dendrogram in which the distance of each split/merge was recorded. The distance between each pair of observations was computed from their Spearman correlation. Waveforms were grouped (linkage) using a weighted pair group method. The cutting distance was set to separate all data into two clusters of similar size (cluster 1, n = 59; cluster 2, n = 65).
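A rough Python equivalent of this clustering step is sketched below. It is not the Matlab script used in the study. It assumes that the Spearman distance is one minus the rank correlation and that the weighted pair group method corresponds to SciPy's `"weighted"` linkage. For simplicity it requests two clusters directly instead of choosing a cutting distance by inspection, and the waveforms are synthetic.

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_waveforms(diff_waves, n_clusters=2):
    """diff_waves: (n_subjects, n_samples) unmasked-minus-masked waveforms.
    Returns one cluster label per subject."""
    rho, _ = spearmanr(diff_waves, axis=1)   # subject-by-subject rank correlations
    dist = 1.0 - rho                         # Spearman distance
    np.fill_diagonal(dist, 0.0)
    z = linkage(squareform(dist, checks=False), method="weighted")
    return fcluster(z, t=n_clusters, criterion="maxclust")

# Six synthetic difference waveforms: three of one shape, three inverted (invented data)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 40)
shape = np.sin(np.pi * t)
waves = np.vstack([shape + 0.05 * rng.standard_normal(t.size) for _ in range(3)] +
                  [-shape + 0.05 * rng.standard_normal(t.size) for _ in range(3)])
labels = cluster_waveforms(waves)
```

A rank-based distance groups waveforms by shape rather than by absolute amplitude, which suits a comparison of masker effects across subjects whose response amplitudes differ.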

Statistical Analyses

Four speech recognition measures were considered as outcome variables: the word recognition score 1) in noise, 2) with 45% or 3) with 65% time compression plus reverberation, and 4) the number of correct words on the mQuickSIN test. These outcome measures were not ear specific, so there is only one measure per subject.

The following measures were considered as predictor variables. Four were measures of threshold: 1) mean AC threshold at standard frequencies, 2) mean AC threshold at EHFs, 3) mean DPOAE threshold at standard frequencies, and 4) mean DPOAE threshold at EHFs. Additional predictors were derived from ECochG: three amplitude measures (SP, AP, and AP–P1), AP and wave V latencies, and AP width.

All predictor variables were measured in both ears except DPOAEs at standard frequencies. In 9 cases in 124 participants, electrophysiological data were not interpretable in one ear; hence, only data from the opposite ear were used. To avoid the problem of multicollinearity (high correlation among predictors), we transformed the vector of measures from each ear into the mean of the two ears and the difference between one ear and the mean. These values are nearly uncorrelated if the variance in the measures in both ears is similar. This approach has the added advantage that when data on both ears are available, the mean is less variable than the values from either ear.
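The reasoning behind this transformation can be made explicit with a short derivation. The symbols are introduced here for illustration: x_L and x_R denote the same measure in the two ears, m their mean, and d the difference between one ear and the mean.

```latex
\begin{aligned}
m &= \tfrac{1}{2}(x_L + x_R), \qquad d = x_L - m = \tfrac{1}{2}(x_L - x_R),\\[4pt]
\operatorname{Cov}(m, d) &= \tfrac{1}{4}\operatorname{Cov}(x_L + x_R,\; x_L - x_R)
  = \tfrac{1}{4}\bigl[\operatorname{Var}(x_L) - \operatorname{Var}(x_R)\bigr],\\[4pt]
\operatorname{Var}(m) &= \tfrac{1}{4}\bigl[\operatorname{Var}(x_L) + \operatorname{Var}(x_R)
  + 2\operatorname{Cov}(x_L, x_R)\bigr].
\end{aligned}
```

The covariance between m and d vanishes when the two ears have equal variance, whatever the correlation between ears. In that case, with common variance σ² and interaural correlation ρ, Var(m) = σ²(1 + ρ)/2, which cannot exceed σ².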

Pairwise Pearson’s correlations between measures of thresholds (at standard audiometric frequencies or EHFs) and speech recognition tests were estimated before and after adjustment for potential confounding factors (age and sex). To carry out this adjustment, we used the method of partial correlation. All other estimated correlation coefficients obtained between predictors and each outcome measure were adjusted for threshold and sex, two variables that are associated with the variance of auditory evoked potentials (Houston and McClelland 1985; Maurizi et al. 1988; Nikiforidis et al. 1993; Picton et al. 1981; Stockard et al. 1979). Because age and threshold variables are highly correlated, we chose to adjust our Pearson’s correlation coefficients for threshold and sex (as opposed to age and sex), as there is evidence of age-related neural deficits in normal hearing subjects as measured via ABRs (Burkard and Sims 2002; Johannesen et al. 2019) or inferred from histopathological studies of human temporal bones (Wu et al. 2019). To facilitate adjustment for threshold, a cluster analysis was performed to reduce the four measures of threshold (obtained behaviorally or using DPOAEs at standard frequencies or EHFs) to two clusters, each of which had two variables. Pearson’s correlation coefficients were adjusted by the means of the variables that comprise these clusters.
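The partial-correlation adjustment can be illustrated with the residual method: regress the covariates out of both variables and then correlate the residuals. The sketch below uses invented data and hypothetical variable names, and it is not the SAS procedure used for the reported analyses.

```python
import numpy as np

def partial_corr(x, y, covars):
    """Pearson correlation between x and y after removing the linear
    effect of the covariates (plus an intercept) from both."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.column_stack([np.ones(len(x)), np.asarray(covars, dtype=float)])
    rx = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]   # residuals of x
    ry = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]   # residuals of y
    return float(np.corrcoef(rx, ry)[0, 1])

# Invented example: two measures that are related only through age and sex
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(18, 63, n)
sex = rng.integers(0, 2, n).astype(float)
threshold = 0.1 * age + 0.5 * sex + 0.5 * rng.standard_normal(n)
word_score = 0.1 * age + 0.5 * rng.standard_normal(n)

raw_r = float(np.corrcoef(threshold, word_score)[0, 1])
adj_r = partial_corr(threshold, word_score, np.column_stack([age, sex]))
```

In this constructed case the unadjusted correlation is strong while the adjusted one is close to zero. The same pattern is reported here for thresholds and word scores after adjustment for age and sex.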

To identify individual predictors of speech-in-noise recognition, a two-tailed Student’s t test for homoscedastic groups was used to test for a difference in the mean of predictor variables between the best and worst speech-in-noise recognition performers (below 25th and above 75th percentile). A paired Student’s t test was used to assess differences within each group of performers under different conditions (effect of masking and change in rate of stimulus presentation).
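The two comparisons described above map onto standard SciPy calls. The following is a minimal sketch with invented data and hypothetical function names, assuming a pooled-variance (`equal_var=True`) two-sample test for the quartile comparison and a paired test for the within-subject condition effect.

```python
import numpy as np
from scipy import stats

def best_vs_worst(scores, predictor):
    """Compare a predictor between subjects above the 75th and below the
    25th percentile of word scores (two-tailed, equal-variance t test)."""
    scores = np.asarray(scores, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    q25, q75 = np.percentile(scores, [25, 75])
    worst = predictor[scores < q25]
    best = predictor[scores > q75]
    res = stats.ttest_ind(best, worst, equal_var=True)
    return best.mean(), worst.mean(), res.pvalue

def condition_effect(unmasked, masked):
    """Paired t test of a within-subject change (e.g., unmasked vs. masked)."""
    res = stats.ttest_rel(unmasked, masked)
    return float(np.mean(np.asarray(unmasked) - np.asarray(masked))), res.pvalue

# Invented data: a marker that is larger in poorer performers
k = np.arange(40)
scores = k.astype(float)
marker = 1.0 - 0.02 * scores + 0.05 * (-1.0) ** k
best_mean, worst_mean, p_group = best_vs_worst(scores, marker)

unmasked = np.linspace(0.5, 1.5, 10)
masked = unmasked - 0.2 + 0.01 * (-1.0) ** np.arange(10)
mean_drop, p_paired = condition_effect(unmasked, masked)
```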

To investigate the joint effect of predictor variables in a multivariable regression and to adjust for potential confounding factors, stepwise selection methods were then applied using predictor variables obtained from the responses of ECochG waveforms. Using this approach, we identified the best-fitting model separately for each word recognition test. The predictor variables included all the peak waveform amplitudes, widths, and latencies under unmasked conditions and changes of the same variables observed under a high-pass masker or when the presentation rate of the stimulus was speeded up. Two different criteria were considered for the selection of final models: 1) adjusted R² and 2) P value <0.10. When these criteria led to different final models, the more parsimonious was selected. Because of their potential role as confounding factors, age, sex, and thresholds were included as predictors in all models. To assess the degree to which age, sex, and thresholds confounded relationships of interest, we assessed the percent change in the estimated regression coefficients with and without their inclusion in the model. In addition, we investigated the extent to which age, sex, and thresholds are associated with important predictors: SP, AP, and AP–P1 (see Table 2).
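Two quantities in this paragraph have simple closed forms: the adjusted R² used as a selection criterion, and the percent change in a regression coefficient when confounders are added. The sketch below illustrates both with ordinary least squares on invented data. It does not reproduce the stepwise procedure run in SAS, and the function names are hypothetical.

```python
import numpy as np

def ols(x, y):
    """Least-squares fit with an intercept.
    Returns (coefficients including intercept, adjusted R^2)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    n, p = design.shape[0], design.shape[1] - 1       # p = number of predictors
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2

def pct_change(beta_without, beta_with):
    """Percent change in a coefficient after adding confounders to the model."""
    return 100.0 * (beta_with - beta_without) / beta_without

# Invented noiseless example: y = 3 + 2x is recovered exactly
x_demo = np.arange(10, dtype=float)
beta_demo, adj_demo = ols(x_demo, 3.0 + 2.0 * x_demo)
change_demo = pct_change(2.0, 2.5)   # coefficient moving from 2.0 to 2.5
```

A small percent change in the coefficient of interest after adding age, sex, and thresholds indicates that those variables do little to confound the relationship.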

Table 2.

Confounding variables were not associated with SP

Predictors                P Value    Adjusted r²
SP (n = 123)                         −0.003
  Thresholds EHF          0.153
  Age                     0.157
  Sex                     0.440
  Thresholds standard     0.527
AP (n = 123)                         0.167
  Age                     0.046*
  Sex                     0.020*
  Thresholds standard     0.140
  Thresholds EHF          0.872
AP–P1 (n = 124)                      0.144
  Age                     0.012*
  Sex                     0.116
  Thresholds standard     0.485
  Thresholds EHF          0.895

Multivariate regression analysis was run to investigate the extent to which age, sex, and thresholds were associated with the main predictor variables obtained from the electrocochleographic (ECochG) waveforms (SP, AP, and AP–P1). Adjusted r² is given once per model. Statistical significance is P < 0.05 (*). AP, action potential; EHF, extended high frequencies; P1, next major trough following AP peak; SP, summating potential.

Data were analyzed using SAS (SAS Institute, version 9.4) and Matlab R2018a (cluster analysis).

RESULTS

Threshold Differences and Word-Recognition Scores

A total of 124 subjects (70 women, 54 men), from 18 to 63 yr old, were included in this study. Although all participants had normal audiometric thresholds from 0.25 to 8 kHz in both ears, significant variability was observed at extended high frequencies (EHFs: 9–16 kHz: Supplemental Fig. S1A; all Supplemental Figures are available at https://doi.org/10.6084/m9.figshare.11594157.). As expected, mean EHF thresholds were correlated with age (Fig. 2A). Cochlear amplifier function was further investigated by measuring DPOAEs, which were highly correlated with air conduction (AC) thresholds for either standard audiometric frequencies or EHFs (Supplemental Fig. S1B).

Fig. 2.

Thresholds at extended high frequencies (EHFs) are not correlated with word recognition scores. A: air conduction (AC) thresholds at EHFs are highly correlated with age. Each point is for a different participant; y-axis values represent the mean thresholds across all measured EHFs. B: visual representation of the Pearson’s pairwise correlations among 4 measures of threshold sensitivity at standard frequencies (stand.) or EHFs and the 4 word recognition scores [noise, with 45% or 65% time compression (Comp) plus reverberation (Rev), and the number of correct words on the modified QuickSIN (mQuickSIN) test]. In this matrix, the diameter of each gray disk is proportional to the unadjusted correlation coefficient (r value). Significance (before adjustment for age or sex) is indicated by the colored ring around the disk, red for negative correlations and blue for positive correlations. None of these relationships was statistically significant after adjustment for age and sex. DPOAE, distortion product otoacoustic emissions.

Word-recognition performance was assessed using 1) the NU-6 corpus that offers phonemically balanced words with no contextual clues and 2) a modified version of the QuickSIN test, which presents phonemically balanced sentences in increasingly high-level background babble. Word scores in quiet were excellent in all subjects. However, when presented in speech-shaped noise at 0-dB SNR or with time compression (45% or 65%) and reverberation, a large range of word scores was observed, e.g., from 2/10 to 6/10 words correct at 0-dB SNR (Fig. 3A). A comparably large variability was obtained for the modified QuickSIN test (Fig. 3B).

Fig. 3.

Variability in word recognition performance. A and B: box and whisker plots of word scores define a lower quartile (red), median (gray), and upper quartile (green) of test scores for each word recognition test, with A showing scores in noise or with 45% or 65% time-compression (Comp) plus reverberation (Rev) and B showing the number of correct words on the modified QuickSIN (mQuickSIN) test.

To investigate possible sources of variability in word scores, we first examined effects of threshold sensitivity. Although behavioral thresholds at standard and EHFs were significantly correlated with word scores on some of the tests, none of these correlations remained significant after adjustment for age and sex (Fig. 2B). When outer hair cell (OHC) function was assessed using DPOAEs, the only significant correlation was between the 65% time-compression scores and DPOAEs at EHFs (Fig. 2B). Likewise, after adjustment for age and sex, this relationship was no longer statistically significant. These results suggest that 1) OHC function does not explain the variability in speech recognition performance after adjustment for age and sex in this population, and 2) an age-related fatigue factor may be at play (Füllgrabe et al. 2015), since before adjustment for age and sex, our behavioral measures of hearing sensitivity were correlated with performance on several word tests, whereas the DPOAE measures were generally not.

Neural Response Differences and Word-Recognition Scores

Unmasked conditions.

To further probe cochlear contributions to deficits in word recognition, we measured auditory evoked potentials via ECochG. As schematized in Fig. 1C, we measured both the SP, assumed to reflect mostly hair cell receptor potentials (Durrant et al. 1998; Pappa et al. 2019), and the AP, representing the summation of cochlear nerve action potentials. As noted in our prior study (Mepani et al. 2020), it is important to distinguish “wave I”, i.e., the first peak of the evoked potential, from AP, the contribution of auditory nerve fibers. They can differ significantly when recorded with a wide filter (e.g., 10-Hz high pass vs. 300-Hz high pass), because wave I includes an AP riding on top of the SP “pedestal.” To facilitate comparisons with other studies, AP–P1 amplitude was also assessed.

One approach to identify predictors of word scores is to separate the best and worst performers on each word test and compute the mean value for each of the predictor variables. Best and worst groups defined by cutoffs at the 25th and 75th percentiles of word scores (Fig. 3) showed no significant threshold differences, as assessed via DPOAEs at standard frequencies and EHFs (Supplemental Fig. S2). A difference in EHF thresholds on the modified QuickSIN was the only statistically significant difference observed between the best and worst performers (Supplemental Fig. S2).

In contrast, there were a number of significant differences among the peripheral markers of our ECochG measures. In the worst performers, SP was larger, AP and AP–P1 were smaller (Fig. 4C), and AP was wider (Fig. 4D). Although the reason for SP enhancement remains unclear (see below), a decrease in AP or AP–P1 amplitude and an increase in AP width are both consistent with cochlear nerve deficits in the poor-performing group. The data for best and worst performers in Fig. 4 are derived from the scores from the 45% reverberation and time compression test; however, similar trends were seen in the data from the other tests (Supplemental Fig. S3).

Fig. 4.

Worst performers on word recognition tests show electrocochleographic (ECochG) markers of cochlear neural deficits. A–D: mean ECochG measures (±SE) for the best vs. worst performers (top and bottom quartiles, respectively) are compared in the absence or presence of a high-pass masker: in both conditions, the click repetition rate was 9.1 Hz. A and B show mean waveforms; arrowheads point to summating potential (SP) and action potential (AP) peaks. C and D show metrics (means ± SE) extracted from individual waveforms; downward-directed bars show mean differences (ΔµV or Δms) between masked and unmasked values for each subject. Significance of group differences is indicated by brackets: *P < 0.05; **P < 0.01; ***P < 0.001. E–H: analogous to A–D except comparing the effect of increasing repetition rate rather than addition of a masker. Note that the low-rate data in G and H are the same as the unmasked data from C and D. Analyses are based on scores on the 45% time compression plus reverberation condition.

A second analytical approach, which also allows for adjustment of the potential confounding factors, is to assess the pairwise correlations across all participants. We considered the correlations between each ECochG metric and each of the four word scores, with and without adjustment for threshold and sex. It is important to adjust these correlations for thresholds, since there are large differences at EHFs among participants. It is important to adjust for sex, because amplitudes of auditory evoked potentials tend to be larger in females than males (Don et al. 1993). The resultant pairwise correlations, summarized graphically in Fig. 5A, show that SP amplitude was the strongest and most reliable predictor of word scores, and remained so after adjustment for threshold and sex for all word tests (Fig. 5, A and B). As noted above, larger SPs were associated with poorer performance on the word tasks. Similarly, three out of four scores were significantly correlated with AP–P1 amplitude (larger amplitudes = better word score), and two remained significant after adjustment for threshold and sex (Fig. 5A).

Fig. 5.

Correlation matrices between electrocochleographic (ECochG) measures and word recognition scores in unmasked, masked, and high-rate conditions. Data are matrices of Pearson’s bivariate correlations, where the diameter of each gray disk is proportional to the unadjusted r value. Correlation coefficients are shown for unmasked (A), masked (C), and high-rate (D) conditions. As indicated in the key, the color of the circle indicates the direction of significant correlations (blue, r > 0; red, r < 0); the background color of each square indicates the statistical significance after adjustment for either threshold and sex (amplitude, latency, width) or threshold (Δamplitude, Δlatency, Δwidth). B illustrates one of these bivariate correlations with a regression line estimated from simple linear regression with no adjustment for covariates. AP, action potential; Comp, time compression; mQuickSIN, modified QuickSIN test; Rev, reverberation; P1, next major trough following AP peak; SP, summating potential; V, wave V.

Other measures of neural health, including AP amplitude, AP latency, or AP peak width, were significantly correlated with some word scores, and some of these correlations remained significant after adjustment for threshold and sex (Fig. 5A). The signs of these correlations were negative, which implies that smaller amplitudes, longer latencies, and wider peaks were associated with worse word scores, as expected.

Given that cochlear damage can lead to hyperactivity in central auditory pathways (Auerbach et al. 2014; Salvi et al. 2017), we also analyzed the amplitude ratios, latencies, and interpeak latencies of the later waves II, III, and V; however, none of these correlations with word scores remained significant after adjustment for threshold and sex (data not shown).

Effects of forward masking.

The click-evoked responses were measured with or without a high-pass forward masker to probe the contribution of cochlear regions tuned to EHFs. The masker (8–16 kHz; Fig. 1, A and B) was delivered 15 dB above masker threshold, as assessed behaviorally in each individual. Predictably, masker threshold was highly correlated with the mean AC threshold at 8–16 kHz (r = 0.70, P < 0.001; Supplemental Fig. S4).

The use of a forward masker should decrease the neural potentials (e.g., AP) while leaving the hair cells’ responses unaffected. Indeed, AP was significantly reduced by the masker, for both the best and the worst performers (Fig. 4C), while the masker-induced changes in SP were not significant (Fig. 4C), although the small changes in mean values were in opposite directions in best performers versus worst performers. The AP decrement (ΔµV in Fig. 4C) was similar in best and worst performers when extracted from each case and then averaged. This suggests that best and worst performers have similar numbers of responsive neurons in EHF regions. In contrast, AP–P1 amplitude was reduced only in the best performers (Fig. 4C); correspondingly, the difference in masker effect on AP–P1 between the best and worst performers was highly significant (ΔµV in Fig. 4, C and D, respectively). It is not clear why AP and AP–P1 behave differently in this regard.

When considered as pairwise correlations across all subjects, rather than just between best and worst performers, addition of the masker erases all the strong negative correlations between SP amplitude and word scores (Fig. 5A vs. Fig. 5C). According to Fig. 4C, this is because SP is increased in the good performers and decreased in the poor performers. Correspondingly, the masker-induced decrements in SP (Δamplitude in Fig. 5C) are strongly correlated with word scores. The correlations between robust AP and good word scores are as strong, or stronger, in the presence of the forward masker as without (Fig. 5C vs. Fig. 5A), as are those between short AP latencies and good word scores, suggesting that the neural effects are not dominated by EHF regions. The decrement in AP–P1 is particularly well correlated with word scores (ΔAP–P1 in Fig. 5C). According to Fig. 4C, that is because it is greatly affected in the best performers and relatively unaffected in the poor performers. Again, the reason for difference in behavior of AP and AP–P1 is not clear.

Given the report that changes in masked ABR wave V latency not only mirror the changes in wave I amplitude but also predict perceptual temporal sensitivity (Mehraei et al. 2016), we also considered the relationships between wave V latency and masker-induced wave V latency shifts with word scores. While three out of four word tests correlated with wave V latency, only the modified QuickSIN correlation with wave V latency survived the adjustment for threshold and sex (Fig. 5C). Likewise, changes in wave V latencies were significantly correlated with the modified QuickSIN (Fig. 5C).

Effects of increased rate.

Another way to differentiate the contributions of hair cells and cochlear neurons to evoked potentials, and to assay the fatigability of the nerve, is to compare results obtained at low versus high click rates. Hair cell potentials should be unattenuated by high repetition rates, whereas neural potentials will show strong adaptation (Eggermont and Odenthal 1974). When averaged across all the best performers, the ECochG waveforms showed the expected results; i.e., SP amplitude is minimally affected, while AP is attenuated, as well as delayed and widened (Fig. 4E). AP amplitude reductions, as well as latency increases and peak widening, are also seen in worst performers (Fig. 4G). Surprisingly, a highly significant SP amplitude reduction is observed at the high repetition rates, but only in the worst performers (Fig. 4, F and G).

When considered across all subjects as pairwise correlations between ECochG metrics and word score, the effects of rate are generally similar to those of the masker. As shown in Fig. 5D, all the strong correlations between SP amplitude and word scores in the low-rate (unmasked) condition (Fig. 5A) disappear when the click rate is quadrupled from 9.1 to 40.1 Hz (Fig. 5D). Correspondingly, the SP decrement (Δamplitude in Fig. 5D) is correlated to all word scores and remains so after adjustment for two out of four tests. As with the masker condition, the correlations between word scores and either AP amplitudes, latencies, or widths are largely unaffected by rate changes. Finally, three out of four word tests correlated with wave V latency under high-rate stimulus presentation, and these correlations survived adjustment for threshold and sex for both word tests presented with time compression and reverberation (Fig. 5D).

Taken together, all these ECochG results suggest that the poor performers have a postsynaptic component contributing to the SP that is sensitive to both masking and increased rate.

Cluster analysis of waveforms.

We then “reversed” our analysis by taking an unbiased cluster-based approach to the ECochG waveforms and looking for associated differences in word scores. For each subject, we averaged the unmasked waveform from each ear, windowed it to include only the first 2 ms, and subtracted it from the averaged masked waveform, similarly windowed. This set of difference-waveforms was then analyzed via a hierarchical clustering algorithm, with the cutting distance set to define two clusters of similar size on the dendrogram (Fig. 6A).

Fig. 6.

Larger masking decrements in electrocochleographs (ECochGs) were associated with poorer word scores. A: hierarchical clustering of the masker-effect waveform (unmasked minus masked) from each participant defined masker-sensitive and masker-insensitive groups. A Spearman correlation determined the distance between each pair of observations, and clustering was achieved based on a weighted pair group method. A cutting distance of 1 was chosen to group all data into 2 clusters; c indicates the cophenetic correlation coefficient. B–D: mean ECochG waveforms for the 2 clusters, with or without the masker, as indicated in the keys. Arrowheads point to summating potential (SP) and action potential (AP) peaks; double arrows show location of AP width measurements. E: mean distortion product otoacoustic emission (DPOAE) thresholds (±SE) for the 2 clusters at standard frequencies (St.), extended high frequencies (EHFs), or 8 kHz only. F and G: mean word scores (±SE) on the different word tests (F) or the modified QuickSIN (mQuickSIN; G) for each cluster. Key in G applies to all histograms. **P < 0.01; ***P < 0.001.
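The clustering procedure described in this legend (Spearman-based distance, weighted pair group linkage, cut at a distance of 1) can be sketched in a few lines. This is an illustrative reimplementation, not the authors' code. It assumes the distance is 1 minus Spearman's rho, uses hypothetical function names, and runs on toy waveforms rather than recorded data. The cophenetic correlation is not computed here.

```python
from math import sqrt

def ranks(values):
    """Average ranks (tied values share the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_distance(a, b):
    """Assumed distance: 1 - Spearman's rho (0 = same shape, 2 = inverted shape)."""
    return 1.0 - pearson(ranks(a), ranks(b))

def wpgma_clusters(waveforms, cut):
    """Merge the closest clusters until the smallest distance exceeds `cut`."""
    clusters = {i: [i] for i in range(len(waveforms))}
    dist = {(i, j): spearman_distance(waveforms[i], waveforms[j])
            for i in clusters for j in clusters if i < j}
    next_id = len(waveforms)
    while len(clusters) > 1:
        (a, b), d = min(dist.items(), key=lambda kv: kv[1])
        if d > cut:
            break
        merged = clusters.pop(a) + clusters.pop(b)
        # WPGMA rule: the distance from the merged cluster to any other cluster
        # is the simple mean of the two previous distances, regardless of size.
        new_dist = {(k, next_id): (dist[(min(a, k), max(a, k))] +
                                   dist[(min(b, k), max(b, k))]) / 2
                    for k in clusters}
        dist = {pair: v for pair, v in dist.items() if a not in pair and b not in pair}
        dist.update(new_dist)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())

# Toy difference waveforms (hypothetical): two rising and two falling shapes.
toy = [[0, 1, 2, 3, 4], [0, 1, 3, 2, 4], [4, 3, 2, 1, 0], [4, 2, 3, 1, 0]]
print(wpgma_clusters(toy, cut=1.0))
```

Because the distance depends only on rank order within the analysis window, this grouping reflects the shape of the masker-effect waveform rather than its absolute amplitude. Under the 1 minus rho assumption, a cut at 1 separates waveforms that are positively correlated from those that are negatively correlated.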

There were no statistically significant differences in threshold measures between clusters 1 and 2 as assessed via DPOAEs at standard frequencies, 8 kHz, or EHFs (Fig. 6E). However, participants in cluster 2 had significantly poorer word scores than cluster 1 for all tests except the modified QuickSIN (Fig. 6, F and G). Interestingly, cluster 2, with poorer word scores, showed larger SPs (Figs. 6B and 7A) and wider AP (Figs. 6B and 7B), as was also shown in the poor performers segregated based on word scores (Fig. 4). The small intercluster differences in AP and AP–P1 amplitudes were not statistically different between clusters 1 and 2 (Fig. 7A).

Fig. 7.

Larger masking effects on electrocochleographs (ECochGs) were associated with larger rate effects on ECochG. Participants were split into 2 clusters based on a weighted pair group method. A and B: group means (±SE) for clusters 1 and 2 are shown for each ECochG metric, with or without the high-pass masker. C and D: group means (±SE) for clusters 1 and 2 are shown for each ECochG metric, at low vs. high click-repetition rate. The unmasked data from A and B are the same as the low-rate values from C and D. Significance of group differences in mean values is indicated by brackets: *P < 0.05; **P < 0.01; ***P < 0.001. AP, action potential; P1, next major trough following AP peak; SP, summating potential.

Given that the cluster analysis was driven by the masker-difference waveforms, it is not surprising that masker effects differed significantly for cluster 1 versus cluster 2 (Fig. 7A), mimicking group differences between best and worst performers (Fig. 4). Masking increased SP in cluster 1 (the better performers), while SP was decreased in cluster 2 (the worst performers). Similarly, masking decreased AP in both clusters, but slightly more significantly in cluster 2 than in cluster 1 (Fig. 7), as shown for best versus worst performers in Fig. 4. Masker effects on AP–P1 amplitude and AP width were also similar for cluster 1 versus cluster 2 (Fig. 7, A and B), as for best versus worst performers (Fig. 4, A and B). These differences between clusters 1 and 2 held for both male and female subjects in each group (data not shown).

Effects of click rate on the two clusters were similar to masker effects on SP. As with the masker, raising click rate abolished SP differences between the clusters, by slightly raising mean SP in cluster 1 while significantly lowering SP in cluster 2, the poorer performers. Again, similar results were observed in both men and women (data not shown). When data were included from all participants, masker-induced changes in SP, AP, and AP–P1 amplitudes were highly correlated with rate-evoked changes in the same measures (Supplemental Fig. S5).

Multivariable regression analyses.

As described in materials and methods, stepwise selection methods were used in multivariable linear regression analyses to determine 1) the joint effect of ECochG variables on word scores before and after adjustment for age, sex, and thresholds (Table 1) and 2) the degree to which age, sex, and thresholds predict SP, AP, and AP–P1 amplitudes (Table 2). As shown in Table 1, all four models (for each word recognition test) included SP amplitude, measured under unmasked conditions, as a significant predictor of word scores. Three out of four models included an AP-related measure, and for the mQuickSIN, changes in wave V and P1 latencies under masking conditions were included, as well. Interestingly, while thresholds, age, and sex are associated with AP and AP–P1 amplitudes (Table 2), none of these potential confounding variables were significantly associated with SP (Table 2). As assessed by the percent change in the estimated regression coefficients with and without their inclusion in the model (data not shown), there is little evidence of their confounding effects. The lone exception was the AP amplitude obtained in unmasked conditions, whose effect on 65% time-compressed word scores was reduced by ~30% after inclusion of the confounding variables.
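The percent-change check mentioned above compares a predictor's regression coefficient estimated with and without the potential confounders in the model. The sketch below illustrates that comparison with ordinary least squares. It does not reproduce the stepwise selection procedure. The function names and the synthetic vectors are hypothetical.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

def percent_change(predictor, confounders, y):
    """Percent change in the predictor's coefficient once confounders are added."""
    crude = ols([predictor], y)[1]
    adjusted = ols([predictor] + confounders, y)[1]
    return 100.0 * (adjusted - crude) / crude

# Synthetic illustration (hypothetical): the outcome depends on both the
# predictor and a confounder that is itself correlated with the predictor.
w1 = [1, 1, 1, 1, -1, -1, -1, -1]
confounder = [1, 1, -1, -1, 1, 1, -1, -1]
predictor = [a + b for a, b in zip(w1, confounder)]
outcome = [p + c for p, c in zip(predictor, confounder)]

print(round(percent_change(predictor, [confounder], outcome), 1))
```

A change near zero indicates that the covariates do little to alter the estimated effect of the predictor, which is the sense in which the text reports little evidence of confounding.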

Table 1.

Results of the multivariable regression analysis

Predictors                            P Value   Adjusted r2

Noise 0 dB SNR (n = 123)
  SP (unmasked)                       0.001     0.100
  ΔV latency (high − low)             0.094
  ΔAP width (masked − unmasked)       0.104
  Thresholds EHF                      0.461
  Thresholds standard                 0.552
  Age                                 0.648
  Sex                                 0.824

45% Time comp. + Rev. (n = 122)
  ΔAP (masked − unmasked)             <0.001    0.198
  AP (unmasked)                       0.006
  SP (unmasked)                       0.007
  ΔAP–P1 (masked − unmasked)          0.019
  ΔAP width (high − low)              0.029
  Thresholds standard                 0.103
  ΔAP latency (high − low)            0.159
  Sex                                 0.290
  Thresholds EHF                      0.392
  Age                                 0.877

65% Time comp. + Rev. (n = 121)
  SP (unmasked)                       0.002     0.228
  ΔAP (masked − unmasked)             0.042
  Age                                 0.045
  ΔAP–P1 (masked − unmasked)          0.058
  AP (unmasked)                       0.065
  ΔAP latency (high − low)            0.098
  Sex                                 0.601
  Thresholds EHF                      0.763
  Thresholds standard                 0.849

Modified QuickSIN (n = 120)
  ΔV latency (masked − unmasked)      0.004     0.218
  ΔP1 latency (masked − unmasked)     0.011
  SP (unmasked)                       0.017
  AP (unmasked)                       0.023
  ΔAP (masked − unmasked)             0.064
  ΔAP–P1 (masked − unmasked)          0.090
  Thresholds standard                 0.250
  Thresholds EHF                      0.690
  Sex                                 0.765
  Age                                 0.843

To investigate the joint effect of predictor variables and to adjust for potential confounding factors, stepwise selection methods were applied using predictor variables obtained from the responses of electrocochleographic (ECochG) waveforms as well as age, sex, and thresholds for their potential role as confounding factors. The best-fitting models were identified for each word recognition test (see materials and methods). Statistical significance is P < 0.05 (bold). AP, action potential; EHF, extended high frequencies; Rev., reverberation; P1, next major trough following AP peak; SP, summating potential; Time comp., time compression; V, wave V.

DISCUSSION

We presented words in speech-shaped noise or with time compression and reverberation, as well as sentences in a four-talker babble, to 124 participants with normal standard audiograms. Word-recognition performance was highly variable but could not be attributed to either the small differences in thresholds at standard frequencies or the large differences at EHFs after adjustment for age and sex (Fig. 2B). These results, which held for both audiometric thresholds and DPOAEs in this population, suggest that the range of word recognition scores of “normal-hearing” subjects in difficult listening situations is not due to differences in cochlear amplifier function.

Before adjustment for age, behavioral thresholds at both standard frequencies and EHFs were correlated with performance on both time-compressed word tests as well as the modified QuickSIN (Fig. 2B). In contrast, the only correlation between DPOAEs and word scores, before age adjustment, was between EHF DPOAEs and 65% time-compressed words. DPOAEs, unlike behavioral measures, do not require active participation, suggesting that cognitive factors are affecting word recognition performance, as suggested by other studies (Füllgrabe et al. 2015; Kamerer et al. 2019). Although we excluded subjects who did not pass the Montreal Cognitive Assessment (MoCA), this is not a very stringent test of cognitive dysfunction.

Comparison to Other Recent Studies

To probe for other sources of variability in word recognition performance beyond threshold sensitivity, we collected electrocochleographic measures of auditory nerve function. Human histopathological studies have shown evidence of age-related cochlear synaptopathy (Felder and Schrott-Fischer 1995; Viana et al. 2015; Wu et al. 2019) and an association between this type of primary neural degeneration and poor word recognition scores (Felder and Schrott-Fischer 1995). A recent retrospective study showed word recognition deficits in patients with chronic conductive hearing loss (Okada et al. 2020), a condition predicted to cause cochlear synaptopathy based on a mouse study showing IHC synapse loss after prolonged adult-onset conductive hearing loss (Liberman et al. 2015).

The suprathreshold amplitude of ABR wave I is a useful noninvasive assay for cochlear synaptopathy in animal models of noise damage (Kujawa and Liberman 2009, 2015; Shaheen et al. 2015), as long as cochlear thresholds remain normal (Sergeyenko et al. 2013). Attempts to extend these findings to normal-hearing humans include studies of the correlation between 1) noise exposure history or aging and electrophysiological measures or 2) word recognition scores and electrophysiological measures. Studies in normal-hearing participants of noise exposure histories or aging have sometimes suggested a correlation with ABR responses (Bramhall et al. 2017; Burkard and Sims 2002; Grose et al. 2017; Johannesen et al. 2019; Liberman et al. 2016; Mepani et al. 2020; Skoe and Tufts 2018; Valderrama et al. 2018) and sometimes not (Fulbright et al. 2017; Grinn et al. 2017; Prendergast et al. 2017; Ridley et al. 2018; Spankovich et al. 2017). Similarly, studies of word scores and ABR responses have sometimes suggested a correlation (Liberman et al. 2016; Mepani et al. 2020; Ridley et al. 2018), as we concluded here, and other times not (Bramhall et al. 2018; Guest et al. 2018; Johannesen et al. 2019; Prendergast et al. 2017).

Interpretation of this apparent discrepancy is complicated by methodological differences among these studies. ABR wave I is typically measured from baseline to peak or from peak to subsequent trough, rather than from SP to AP peak. Including the trough in a measure meant to reflect the ensemble auditory nerve response is suboptimal: although the unitary contribution of each auditory nerve action potential is a biphasic wave, with a depolarization peak and a repolarization trough, the repolarization phases of short-latency auditory nerve spikes can be cancelled by the depolarization phases of longer latency auditory nerve spikes (from more apical locations) as well as early spikes from the cochlear nucleus. Thus peak-to-trough measures can be enhanced by damage, e.g., by destroying fibers in the apical half of the cochlea. Curiously, in the present study, peak-to-trough measures (i.e., AP–P1) were better correlated with word scores than measures based on AP alone. The reasons for this are unclear.

We have argued that measuring wave I from baseline to peak is also suboptimal, because it conflates the SP, a longer duration declining pedestal that includes presynaptic sources, with the shorter duration AP generated by the auditory nerve (Mepani et al. 2020). Because the AP rides on the SP, an SP enhancement coupled with an AP reduction can leave wave I amplitude unchanged. Indeed, had we measured wave I from baseline to peak, we would also have seen no correlation with word scores (data not shown). Another important difference among studies of hidden hearing loss in humans is the bandpass of the ABR response filters: here we used 10 Hz as the high-pass cutoff, to maximize SP. Changing that high pass from 10 Hz to 300 Hz (Fig. 8), as is more typically done in other studies, would also have eliminated most of the statistically significant correlations between word scores and SP amplitudes.

Fig. 8.

Waveform filtering alters the statistical relationships between predictor variables and word scores. A: effects of postacquisition digital filtering of mean electrocochleographic (ECochG) waveforms for all participants using 2 bandwidths, as indicated. B: correlation matrices for 3 unmasked ECochG measures and 4 word recognition scores, using the 2 filter bandwidths illustrated in A. Data are matrices of Pearson’s bivariate correlations, where the diameter of each gray disk is proportional to the unadjusted r value. As indicated in the key, the color of the circle indicates the direction of significant correlations (blue, r > 0; red, r < 0); the background color of each square indicates the statistical significance after adjustment for threshold and sex. AP, action potential; Comp, time compression; mQuickSIN, modified QuickSIN test; Rev, reverberation; P1, next major trough following AP peak; SP, summating potential.
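The sensitivity of SP to the high-pass cutoff can be illustrated with a toy model. The sketch below passes a hypothetical 1-ms rectangular pedestal, a crude stand-in for a slow SP-like component, through a first-order high-pass filter at 10 Hz and at 300 Hz. The filter type, sampling rate, and pedestal shape are all assumptions made for illustration. They are not the filters or waveforms used in the study.

```python
import math

def highpass(signal, cutoff_hz, fs_hz):
    """First-order (RC-style) high-pass filter applied sample by sample."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs_hz
    alpha = rc / (rc + dt)
    out = [signal[0]]
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out

FS_HZ = 50_000                  # assumed sampling rate
N_MS = FS_HZ // 1000            # samples per millisecond

# 1 ms of baseline, a 1-ms unit-amplitude pedestal, then 1 ms of baseline.
pedestal = [0.0] * N_MS + [1.0] * N_MS + [0.0] * N_MS

def mean_during_pedestal(filtered):
    """Mean filtered amplitude over the 1-ms window occupied by the pedestal."""
    window = filtered[N_MS:2 * N_MS]
    return sum(window) / len(window)

sp_10_hz = mean_during_pedestal(highpass(pedestal, 10.0, FS_HZ))
sp_300_hz = mean_during_pedestal(highpass(pedestal, 300.0, FS_HZ))

print("mean pedestal amplitude, 10-Hz high pass: ", round(sp_10_hz, 3))
print("mean pedestal amplitude, 300-Hz high pass:", round(sp_300_hz, 3))
```

In this toy case the slow component survives the 10-Hz cutoff almost intact but is substantially attenuated at 300 Hz, which is qualitatively consistent with the loss of SP correlations under the narrower bandwidth.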

SP Versus AP as Potential Markers of Synaptopathy

In mutant mice lacking IHC synapses (Buran et al. 2010) or in mice with noise-induced (Kujawa and Liberman 2009), age-related (Sergeyenko et al. 2013), or ouabain-generated cochlear synaptopathy (Yuan et al. 2014), SP amplitude is often stable in the face of a decreased AP. These observations were consistent with the long-standing dogma that SP is dominated by hair cell receptor potentials (Durrant et al. 1998) and thus should be unaffected by problems with synaptic transmission or by postsynaptic damage. That reasoning inspired us to measure the SP/AP ratio in our search for hidden hearing loss in human subjects, as a means of minimizing the variability of ABR amplitudes from differences in head size, tissue conductivity, etc. (Bharadwaj et al. 2019; Nikiforidis et al. 1993).

Ironically, in the current study, the strongest correlations between word scores and ECochG measures are with the SP itself, in the curious direction that SP is enhanced among those with the worst word scores. There are other hints from the human literature that SP is more complex than the dogma suggested. With transtympanic electrocochleography in normal humans, both AP and SP were reduced by increasing click rate (68% AP reduction vs. 27% SP reduction), as we saw here, despite no change in the cochlear microphonic (Santarelli et al. 2008). Another human study comparing SP and AP amplitudes before and immediately after music exposure causing a temporary threshold shift showed SP enhancement coupled with AP reduction (Kim et al. 2005). These results echo our prior study, where SP amplitude was larger in participants who performed more poorly on word recognition tests and were assumed to have higher lifetime noise doses (Liberman et al. 2016).

Recent gerbil studies using kainate and tetrodotoxin to block postsynaptic potentials and action potentials, respectively, suggest that the SP has multiple sources, including presynaptic potentials from hair cells and postsynaptic potentials from auditory nerve dendrites and spike generators (Pappa et al. 2019). In humans, a recent study hinted at the existence of a neural contribution to the SP using a continuous loop averaging deconvolution technique (Kennedy et al. 2017). Here we compared responses to high versus low click rates, since increasing presentation rate should depress postsynaptic potentials without affecting presynaptic receptor potentials (Eggermont and Odenthal 1974). Interestingly, we found that high click rates selectively decreased the SP in the worst performers (Fig. 4G) such that the normally strong correlations between SP and word scores completely vanished (Fig. 5A vs. Fig. 5D). Given its rate sensitivity, the SP decrement likely reflects the postsynaptic contribution, e.g., summed excitatory postsynaptic potentials (EPSPs) in auditory nerve dendrites. Thus we speculate that, at high rate, the remaining SP is dominated by hair cell receptor potentials, which are robust in all our subjects. These observations, along with the reversible SP enhancements noted in a prior noise-damage study (Kim et al. 2005), suggest that mild cochlear dysfunction can enhance the EPSPs and, paradoxically, that EPSP amplitudes can be enhanced despite unchanged or reduced spiking activity in the auditory nerve.

Our use of high-rate clicks was inspired, in part, by recent single-fiber results from noise-exposed, synaptopathic guinea pigs showing abnormal fatigability of the surviving auditory nerve fibers in a paired-click stimulus paradigm (Song et al. 2016). Surprisingly, we found no strong evidence for increased fatigability of neural responses in the worst performers on the word tests: the reduction in AP amplitudes elicited by either the increased click rate or the addition of the masking noise was similar between best and worst performers, as shown in Fig. 4, C and E.

Nevertheless, our electrocochleography suggests that variability in word recognition scores among normal-hearing listeners has a strong cochlear component. Given that differences in cochlear amplifier function do not explain the differences in word scores, the observed differential changes in neural components are consistent with synaptopathy, and the SP changes could be as well.

Contributions from Extended High Frequencies

In the present study, thresholds at EHFs were correlated with performance on several of the word recognition tests, before adjustment for age. This could arise because 1) neurons in EHF regions normally carry information useful for these word tests or 2) EHF threshold shift is correlated with hidden damage in standard audiometric ranges. Scenario 1 is reasonable since even high-frequency auditory nerve fibers have low-frequency tails that respond in the speech frequency range at moderate SPLs (Liberman and Kiang 1978). Scenario 2 is reasonable since, in noise-induced or age-related hearing loss, the earliest signs of permanent threshold shifts are seen at EHFs, whereas cochlear synaptopathy is seen throughout the cochlear spiral in aging (Wu et al. 2019) and throughout much of the basal half of the cochlea in noise damage (Kujawa and Liberman 2015). The fact that these correlations with performance disappeared after adjustment for age suggests that they may arise from noncochlear effects of aging, e.g., cognitive effects (Füllgrabe et al. 2015; Kamerer et al. 2019).

Establishing diagnostic indicators for cochlear synaptopathy in humans is important if we are to understand the prevalence of primary neural degeneration. Based on the present results, it seems unlikely that electrocochleography could diagnose dysfunction or loss of primary afferents on a case-by-case basis. However, stepwise multivariate regression (Table 1) indicates that measures of SP might be useful to measure and track the accumulation of cochlear neural deficits in longitudinal studies, or in tracking the possible reversal of such degeneration, as seen in animal studies of synaptic reconnection after noise exposure, as can be elicited by local delivery of growth factors to the round window (Sly et al. 2016; Suzuki et al. 2016).

GRANTS

This work was supported by National Institute on Deafness and Other Communication Disorders Grant P50 DC015857 and the Lauer Tinnitus Research Center at Massachusetts Eye and Ear. We also gratefully acknowledge a gift from Decibel Therapeutics for the purchase of the commercial audiometric equipment.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

S.F.M. and M.C.L. conceived and designed research; S.F.M., K.J.G., and A.M.M. performed experiments; S.F.M., K.J.G., A.M.M., P.W., K.E.H., and M.C.L. analyzed data; S.F.M., V.D.G., and M.C.L. interpreted results of experiments; S.F.M. prepared figures; S.F.M. drafted the manuscript; M.C.L. edited and revised the manuscript; K.J.G., A.M.M., P.W., K.E.H., V.D.G., M.C.L., and S.F.M. approved final version of manuscript.

ACKNOWLEDGMENTS

We thank Dr. J. J. Guinan, Jr. for feedback.

REFERENCES

  1. Alvord LS. Cochlear dysfunction in “normal-hearing” patients with history of noise exposure. Ear Hear 4: 247–250, 1983. doi: 10.1097/00003446-198309000-00005. [DOI] [PubMed] [Google Scholar]
  2. Arenas JP, Suter AH. Comparison of occupational noise legislation in the Americas: an overview and analysis. Noise Health 16: 306–319, 2014. doi: 10.4103/1463-1741.140511. [DOI] [PubMed] [Google Scholar]
  3. Auerbach BD, Rodrigues PV, Salvi RJ. Central gain control in tinnitus and hyperacusis. Front Neurol 5: 206, 2014. doi: 10.3389/fneur.2014.00206. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bharadwaj HM, Mai AR, Simpson JM, Choi I, Heinz MG, Shinn-Cunningham BG. Non-invasive assays of cochlear synaptopathy – candidates and considerations. Neuroscience 407: 53–66, 2019. doi: 10.1016/j.neuroscience.2019.02.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bohne BA, Harding GW. Degeneration in the cochlea after noise damage: primary versus secondary events. Am J Otol 21: 505–509, 2000. [PubMed] [Google Scholar]
  6. Bourien J, Tang Y, Batrel C, Huet A, Lenoir M, Ladrech S, Desmadryl G, Nouvian R, Puel JL, Wang J. Contribution of auditory nerve fibers to compound action potential of the auditory nerve. J Neurophysiol 112: 1025–1039, 2014. doi: 10.1152/jn.00738.2013. [DOI] [PubMed] [Google Scholar]
  7. Bramhall N, Beach EF, Epp B, Le Prell CG, Lopez-Poveda EA, Plack CJ, Schaette R, Verhulst S, Canlon B. The search for noise-induced cochlear synaptopathy in humans: mission impossible? Hear Res 377: 88–103, 2019. doi: 10.1016/j.heares.2019.02.016. [DOI] [PubMed] [Google Scholar]
  8. Bramhall NF, Konrad-Martin D, McMillan GP. Tinnitus and auditory perception after a history of noise exposure: relationship to auditory brainstem response measures. Ear Hear 39: 881–894, 2018. doi: 10.1097/AUD.0000000000000544. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bramhall NF, Konrad-Martin D, McMillan GP, Griest SE. Auditory brainstem response altered in humans with noise exposure despite normal outer hair cell function. Ear Hear 38: e1–e12, 2017. doi: 10.1097/AUD.0000000000000370. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Buran BN, Strenzke N, Neef A, Gundelfinger ED, Moser T, Liberman MC. Onset coding is degraded in auditory nerve fibers from mutant mice lacking synaptic ribbons. J Neurosci 30: 7587–7597, 2010. doi: 10.1523/JNEUROSCI.0389-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Burkard RF, Sims D. A comparison of the effects of broadband masking noise on the auditory brainstem response in young and older adults. Am J Audiol 11: 13–22, 2002. doi: 10.1044/1059-0889(2002/004). [DOI] [PubMed] [Google Scholar]
  12. Costalupes JA, Young ED, Gibson DJ. Effects of continuous noise backgrounds on rate response of auditory nerve fibers in cat. J Neurophysiol 51: 1326–1344, 1984. doi: 10.1152/jn.1984.51.6.1326. [DOI] [PubMed] [Google Scholar]
  13. Don M, Ponton CW, Eggermont JJ, Masuda A. Gender differences in cochlear response time: an explanation for gender amplitude differences in the unmasked auditory brain-stem response. J Acoust Soc Am 94: 2135–2148, 1993. doi: 10.1121/1.407485. [DOI] [PubMed] [Google Scholar]
  14. Dubno JR, Dirks DD, Morgan DE. Effects of age and mild hearing loss on speech recognition in noise. J Acoust Soc Am 76: 87–96, 1984. doi: 10.1121/1.391011. [DOI] [PubMed] [Google Scholar]
  15. Durrant JD, Wang J, Ding DL, Salvi RJ. Are inner or outer hair cells the source of summating potentials recorded from the round window? J Acoust Soc Am 104: 370–377, 1998. doi: 10.1121/1.423293. [DOI] [PubMed] [Google Scholar]
  16. Eggermont JJ, Odenthal DW. Action potentials and summating potentials in the normal human cochlea. Acta Otolaryngol Suppl 77, Suppl 316: 39–61, 1974. doi: 10.1080/16512251.1974.11675746. [DOI] [PubMed] [Google Scholar]
  17. Felder E, Schrott-Fischer A. Quantitative evaluation of myelinated nerve fibres and hair cells in cochleae of humans with age-related high-tone hearing loss. Hear Res 91: 19–32, 1995. doi: 10.1016/0378-5955(95)00158-1. [DOI] [PubMed] [Google Scholar]
  18. Fulbright AN, Le Prell CG, Griffiths SK, Lobarinas E. Effects of recreational noise on threshold and suprathreshold measures of auditory function. Semin Hear 38: 298–318, 2017. doi: 10.1055/s-0037-1606325. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Füllgrabe C, Moore BC, Stone MA. Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition. Front Aging Neurosci 6: 347, 2015. doi: 10.3389/fnagi.2014.00347. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Furman AC, Kujawa SG, Liberman MC. Noise-induced cochlear neuropathy is selective for fibers with low spontaneous rates. J Neurophysiol 110: 577–586, 2013. doi: 10.1152/jn.00164.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Grinn SK, Wiseman KB, Baker JA, Le Prell CG. Hidden hearing loss? No effect of common recreational noise exposure on cochlear nerve response amplitude in humans. Front Neurosci 11: 465, 2017. doi: 10.3389/fnins.2017.00465. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Grose JH, Buss E, Hall JW 3rd. Loud music exposure and cochlear synaptopathy in young adults: isolated auditory brainstem response effects but no perceptual consequences. Trends Hear 21: 2331216517737417, 2017. doi: 10.1177/2331216517737417. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Guest H, Munro KJ, Plack CJ. Tinnitus with a normal audiogram: role of high-frequency sensitivity and reanalysis of brainstem-response measures to avoid audiometric over-matching. Hear Res 356: 116–117, 2017. doi: 10.1016/j.heares.2017.10.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Guest H, Munro KJ, Prendergast G, Millman RE, Plack CJ. Impaired speech perception in noise with a normal audiogram: No evidence for cochlear synaptopathy and no relation to lifetime noise exposure. Hear Res 364: 142–151, 2018. doi: 10.1016/j.heares.2018.03.008.
  25. Houston HG, McClelland RJ. Age and gender contributions to intersubject variability of the auditory brainstem potentials. Biol Psychiatry 20: 419–430, 1985. doi: 10.1016/0006-3223(85)90044-7.
  26. Johannesen PT, Buzo BC, Lopez-Poveda EA. Evidence for age-related cochlear synaptopathy in humans unconnected to speech-in-noise intelligibility deficits. Hear Res 374: 35–48, 2019. doi: 10.1016/j.heares.2019.01.017.
  27. Kamerer AM, Kopun JG, Fultz SE, Neely ST, Rasetshwane DM. Reliability of measures intended to assess threshold-independent hearing disorders. Ear Hear 40: 1267–1279, 2019. doi: 10.1097/AUD.0000000000000711.
  28. Kennedy AE, Kaf WA, Ferraro JA, Delgado RE, Lichtenhan JT. Human summating potential using continuous loop averaging deconvolution: response amplitudes vary with tone burst repetition rate and duration. Front Neurosci 11: 429, 2017. doi: 10.3389/fnins.2017.00429.
  29. Kim JS, Nam EC, Park SI. Electrocochleography is more sensitive than distortion-product otoacoustic emission test for detecting noise-induced temporary threshold shift. Otolaryngol Head Neck Surg 133: 619–624, 2005. doi: 10.1016/j.otohns.2005.06.012.
  30. Kujawa SG, Liberman MC. Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J Neurosci 29: 14077–14085, 2009. doi: 10.1523/JNEUROSCI.2845-09.2009.
  31. Kujawa SG, Liberman MC. Synaptopathy in the noise-exposed and aging cochlea: Primary neural degeneration in acquired sensorineural hearing loss. Hear Res 330, Pt B: 191–199, 2015. doi: 10.1016/j.heares.2015.02.009.
  32. Liberman MC. Noise-induced and age-related hearing loss: new perspectives and potential therapies. F1000 Res 6: 927, 2017. doi: 10.12688/f1000research.11310.1.
  33. Liberman MC, Dodds LW. Single-neuron labeling and chronic cochlear pathology. III. Stereocilia damage and alterations of threshold tuning curves. Hear Res 16: 55–74, 1984. doi: 10.1016/0378-5955(84)90025-X.
  34. Liberman MC, Epstein MJ, Cleveland SS, Wang H, Maison SF. Toward a differential diagnosis of hidden hearing loss in humans. PLoS One 11: e0162726, 2016. doi: 10.1371/journal.pone.0162726.
  35. Liberman MC, Kiang NY. Acoustic trauma in cats. Cochlear pathology and auditory-nerve activity. Acta Otolaryngol Suppl 358: 1–63, 1978.
  36. Liberman MC, Kiang NY. Single-neuron labeling and chronic cochlear pathology. IV. Stereocilia damage and alterations in rate- and phase-level functions. Hear Res 16: 75–90, 1984. doi: 10.1016/0378-5955(84)90026-1.
  37. Liberman MC, Liberman LD, Maison SF. Chronic conductive hearing loss leads to cochlear degeneration. PLoS One 10: e0142341, 2015. doi: 10.1371/journal.pone.0142341.
  38. Lobarinas E, Salvi R, Ding D. Insensitivity of the audiogram to carboplatin induced inner hair cell loss in chinchillas. Hear Res 302: 113–120, 2013. doi: 10.1016/j.heares.2013.03.012.
  39. Maurizi M, Ottaviani F, Paludetti G, Almadori G, Pierri F, Rosignoli M. Effects of sex on auditory brainstem responses in infancy and early childhood. Scand Audiol 17: 143–146, 1988. doi: 10.3109/01050398809042184.
  40. Mehraei G, Hickox AE, Bharadwaj HM, Goldberg H, Verhulst S, Liberman MC, Shinn-Cunningham BG. Auditory brainstem response latency in noise as a marker of cochlear synaptopathy. J Neurosci 36: 3755–3764, 2016. doi: 10.1523/JNEUROSCI.4460-15.2016.
  41. Mepani AM, Kirk SA, Hancock KE, Bennett K, de Gruttola V, Liberman MC, Maison SF. Middle ear muscle reflex and word recognition in “normal-hearing” adults: evidence for cochlear synaptopathy? Ear Hear 41: 25–38, 2020. doi: 10.1097/AUD.0000000000000804.
  42. Nikiforidis GC, Koutsojannis CM, Varakis JN, Goumas PD. Reduced variance in the latency and amplitude of the fifth wave of auditory brain stem response after normalization for head size. Ear Hear 14: 423–428, 1993. doi: 10.1097/00003446-199312000-00008.
  43. Noffsinger D, Wilson RH, Musiek FE. Department of Veterans Affairs compact disc recording for auditory perceptual assessment: background and introduction. J Am Acad Audiol 5: 231–235, 1994.
  44. Okada M, Welling DB, Liberman MC, Maison SF. Chronic conductive hearing loss is associated with speech intelligibility deficits in patients with normal bone conduction thresholds. Ear Hear 41: 500–507, 2020. doi: 10.1097/AUD.0000000000000787.
  45. Pappa AK, Hutson KA, Scott WC, Wilson JD, Fox KE, Masood MM, Giardina CK, Pulver SH, Grana GD, Askew C, Fitzpatrick DC. Hair cell and neural contributions to the cochlear summating potential. J Neurophysiol 121: 2163–2180, 2019. doi: 10.1152/jn.00006.2019.
  46. Picton TW, Stapells DR, Campbell KB. Auditory evoked potentials from the human cochlea and brainstem. J Otolaryngol Suppl 9: 1–41, 1981.
  47. Prendergast G, Guest H, Munro KJ, Kluk K, Léger A, Hall DA, Heinz MG, Plack CJ. Effects of noise exposure on young adults with normal audiograms I: Electrophysiology. Hear Res 344: 68–81, 2017. doi: 10.1016/j.heares.2016.10.028.
  48. Rajan R, Cainer KE. Ageing without hearing loss or cognitive impairment causes a decrease in speech intelligibility only in informational maskers. Neuroscience 154: 784–795, 2008. doi: 10.1016/j.neuroscience.2008.03.067.
  49. Ridley CL, Kopun JG, Neely ST, Gorga MP, Rasetshwane DM. Using thresholds in noise to identify hidden hearing loss in humans. Ear Hear 39: 829–844, 2018. doi: 10.1097/AUD.0000000000000543.
  50. Salvi R, Sun W, Ding D, Chen GD, Lobarinas E, Wang J, Radziwon K, Auerbach BD. Inner hair cell loss disrupts hearing and cochlear function leading to sensory deprivation and enhanced central auditory gain. Front Neurosci 10: 621, 2017. doi: 10.3389/fnins.2016.00621.
  51. Santarelli R, Del Castillo I, Rodríguez-Ballesteros M, Scimemi P, Cama E, Arslan E, Starr A. Abnormal cochlear potentials from deaf patients with mutations in the otoferlin gene. J Assoc Res Otolaryngol 10: 545–556, 2009. doi: 10.1007/s10162-009-0181-z.
  52. Santarelli R, Starr A, Michalewski HJ, Arslan E. Neural and receptor cochlear potentials obtained by transtympanic electrocochleography in auditory neuropathy. Clin Neurophysiol 119: 1028–1041, 2008. doi: 10.1016/j.clinph.2008.01.018.
  53. Schmiedt RA. Acoustic injury and the physiology of hearing. J Acoust Soc Am 76: 1293–1317, 1984. doi: 10.1121/1.391446.
  54. Schmiedt RA, Mills JH, Boettcher FA. Age-related loss of activity of auditory-nerve fibers. J Neurophysiol 76: 2799–2803, 1996. doi: 10.1152/jn.1996.76.4.2799.
  55. Sergeyenko Y, Lall K, Liberman MC, Kujawa SG. Age-related cochlear synaptopathy: an early-onset contributor to auditory functional decline. J Neurosci 33: 13686–13694, 2013. doi: 10.1523/JNEUROSCI.1783-13.2013.
  56. Shaheen LA, Valero MD, Liberman MC. Towards a diagnosis of cochlear neuropathy with envelope following responses. J Assoc Res Otolaryngol 16: 727–745, 2015. doi: 10.1007/s10162-015-0539-3.
  57. Skoe E, Tufts J. Evidence of noise-induced subclinical hearing loss using auditory brainstem responses and objective measures of noise exposure in humans. Hear Res 361: 80–91, 2018. doi: 10.1016/j.heares.2018.01.005.
  58. Sly DJ, Campbell L, Uschakov A, Saief ST, Lam M, O’Leary SJ. Applying neurotrophins to the round window rescues auditory function and reduces inner hair cell synaptopathy after noise-induced hearing loss. Otol Neurotol 37: 1223–1230, 2016. doi: 10.1097/MAO.0000000000001191.
  59. Song Q, Shen P, Li X, Shi L, Liu L, Wang J, Yu Z, Stephen K, Aiken S, Yin S, Wang J. Coding deficits in hidden hearing loss induced by noise: the nature and impacts. Sci Rep 6: 25200, 2016. doi: 10.1038/srep25200.
  60. Spankovich C, Le Prell CG, Lobarinas E, Hood LJ. Noise history and auditory function in young adults with and without type 1 diabetes mellitus. Ear Hear 38: 724–735, 2017. doi: 10.1097/AUD.0000000000000457.
  61. Stockard JE, Stockard JJ, Westmoreland BF, Corfits JL. Brainstem auditory-evoked responses. Normal variation as a function of stimulus and subject characteristics. Arch Neurol 36: 823–831, 1979. doi: 10.1001/archneur.1979.00500490037006.
  62. Suzuki J, Corfas G, Liberman MC. Round-window delivery of neurotrophin 3 regenerates cochlear synapses after acoustic overexposure. Sci Rep 6: 24907, 2016. doi: 10.1038/srep24907.
  63. Tufts JB, Skoe E. Examining the noisy life of the college musician: weeklong noise dosimetry of music and non-music activities. Int J Audiol 57, Suppl 1: S20–S27, 2018. doi: 10.1080/14992027.2017.1405289.
  64. Valderrama JT, Beach EF, Yeend I, Sharma M, Van Dun B, Dillon H. Effects of lifetime noise exposure on the middle-age human auditory brainstem response, tinnitus and speech-in-noise intelligibility. Hear Res 365: 36–48, 2018. doi: 10.1016/j.heares.2018.06.003.
  65. Viana LM, O’Malley JT, Burgess BJ, Jones DD, Oliveira CA, Santos F, Merchant SN, Liberman LD, Liberman MC. Cochlear neuropathy in human presbycusis: confocal analysis of hidden hearing loss in post-mortem tissue. Hear Res 327: 78–88, 2015. doi: 10.1016/j.heares.2015.04.014.
  66. Woellner RC, Schuknecht HF. Hearing loss from lesions of the cochlear nerve: an experimental and clinical study. Trans Am Acad Ophthalmol Otolaryngol 59: 147–149, 1955.
  67. Wu PZ, Liberman LD, Bennett K, de Gruttola V, O’Malley JT, Liberman MC. Primary neural degeneration in the human cochlea: evidence for hidden hearing loss in the aging ear. Neuroscience 407: 8–20, 2019. doi: 10.1016/j.neuroscience.2018.07.053.
  68. Yuan Y, Shi F, Yin Y, Tong M, Lang H, Polley DB, Liberman MC, Edge AS. Ouabain-induced cochlear nerve degeneration: synaptic loss and plasticity in a mouse model of auditory neuropathy. J Assoc Res Otolaryngol 15: 31–43, 2014. doi: 10.1007/s10162-013-0419-7.

Articles from Journal of Neurophysiology are provided here courtesy of American Physiological Society
