Trends in Hearing. 2023 Oct 9;27:23312165231205719. doi: 10.1177/23312165231205719

Enhanced Place Specificity of the Parallel Auditory Brainstem Response: A Modeling Study

Thomas J Stoll 1,2, Ross K Maddox 1,2,3
PMCID: PMC10563492  PMID: 37807857

Abstract

While each place on the cochlea is most sensitive to a specific frequency, it will generally respond to a sufficiently high-level stimulus over a wide range of frequencies. This spread of excitation can introduce errors in clinical threshold estimation during a diagnostic auditory brainstem response (ABR) exam. Off-frequency cochlear excitation can be mitigated through the addition of masking noise to the test stimuli, but introducing a masker increases the already long test times of the typical ABR exam. Our lab has recently developed the parallel ABR (pABR) paradigm to speed up test times by utilizing randomized stimulus timing to estimate the thresholds for multiple frequencies simultaneously. There is reason to believe parallel presentation of multiple frequencies provides masking effects and improves place specificity while decreasing test times. Here, we use two computational models of the auditory periphery to characterize the predicted effect of parallel presentation on place specificity in the auditory nerve. We additionally examine the effect of stimulus rate and level. Both models show the pABR is at least as place specific as standard methods, with an improvement in place specificity for parallel presentation (vs. serial) at high levels, especially at high stimulus rates. When simulating hearing impairment in one of the models, place specificity was also improved near threshold. Rather than a tradeoff, this improved place specificity would represent a secondary benefit to the pABR's faster test times.

Keywords: auditory brainstem responses, evoked response audiometry, computational modeling, cochlear place specificity

Introduction

Each region of the cochlea responds best to a certain frequency, but in the absence of that frequency it will readily respond to other frequencies for a stimulus presented at a high enough level (Burkard et al., 2007, p. 236; Pickles, 2012, p. 39; Robles & Ruggero, 2001; Russell & Nilsen, 1997). A low-intensity frequency-specific stimulus will elicit a response from only the region of the cochlea which has a characteristic frequency (CF) near that of the stimulus (Robles & Ruggero, 2001; Stapells & Oates, 1997). However, at high levels, a wider region of the cochlea will be excited, leading to poorer place specificity (illustrated in Figure 1A, from Polonenko & Maddox, 2019, and demonstrated by Encina-Llamas et al., 2019 for single SAM tones; Encina-Llamas et al., 2021 for multiple SAM tones; Johannesen et al., 2022 comparing envelope and fine structure encoding). This widened excitation spreads asymmetrically towards the cochlear base, meaning the decreased specificity stems primarily from the contribution of higher-frequency regions—a problem which is most pronounced for low-frequency stimuli. Thus, low-frequency responses from a diagnostic toneburst auditory brainstem response (ABR) exam at high levels contain significant contributions from higher-frequency regions, which may lead to missed or inaccurate diagnoses if stimuli are not constructed to mitigate the issue (Picton et al., 1979; Stapells & Oates, 1997; Stapells et al., 1994).

Figure 1.


Cartoon depiction showing why parallel presentation may improve place specificity. (A) A frequency-specific stimulus (frequency spectrum shown by filled area) is shown to produce asymmetric spread on the cochlea (dotted line). (B) Notched noise is used to eliminate the spread of cochlear excitation. (C) When multiple frequency-specific stimuli are presented in parallel, the stimuli will mutually mask each other. From Polonenko and Maddox (2019), licensed CC BY-NC.

Methods such as high-pass or notched noise masking can improve place specificity. If high-pass filtered noise is presented alongside a test stimulus, the noise will continually drive regions of the cochlea with CFs above the filter cutoff (typically above test frequency) (Burkard et al., 2007, p. 236; Oates & Stapells, 1997a). Because the ABR is computed to reflect only the activity that is time-locked to the test stimulus onset by averaging many repetitions, regions which are instead driven by the masking noise do not contribute to it. Notch-filtered noise (with the notch at the test frequency) can also be used (Figure 1B), but the primarily basal spread means that effects are similar to those of high-passed noise in normal-hearing patients. The addition of masking noise, however, results in smaller waveforms and further increased test times, so it is not recommended for clinical use (BC Early Hearing Program, 2012, p. 24; Hall, 2007, p. 260).

Our lab has recently developed the parallel ABR (pABR) to combat the long test times of a standard diagnostic ABR exam (Polonenko & Maddox, 2019). A standard diagnostic ABR exam uses narrow-band stimuli to determine hearing thresholds at different frequencies (Burkard et al., 2007, Chapter 11), and because thresholds must be estimated one at a time at multiple frequencies in both ears, the exam suffers from long test times. These long test times can lead to incomplete results and inaccurate diagnoses, or require repeat visits, which carry logistical and financial burdens (BC Early Hearing Program, 2012; Bright et al., 2011). The pABR utilizes randomized stimulus timing to test multiple frequencies in both ears all at once, providing canonical ABR waveforms more quickly. The pABR is conceptually similar to the auditory steady-state response (ASSR) in that both use uncorrelated stimuli to test multiple frequencies at the same time (Herdman & Stapells, 2001). The pABR differs from the ASSR in that it provides canonical ABR waveforms, rather than a scalar measure of confidence in the presence of a response. By having access to the response waveforms, clinicians and researchers can analyze the amplitude and latency of each wave of the ABR, as well as the middle latency response when present. A possible secondary benefit of the pABR, and the focus of this modeling study, is an improvement in responses’ cochlear place specificity.

We hypothesized the pABR would show improved place specificity when compared to serial paradigms due to masking effects inherent to the stimulus construction. The pABR involves the simultaneous presentation of stimulus sequences of five frequency bands in octave spacing (in its typical application). These sequences are temporally uncorrelated with each other. Thus, when calculating the response to stimuli in one band, the stimuli in the four other bands act together as notched masking noise (Figure 1C). As stimulus rates increase, off-CF regions will be more continuously driven, which should lead to increased beneficial masking effects. Additionally, place specificity improvements should be present only at higher stimulus levels, since the ABR is already place specific at low levels (Encina-Llamas et al., 2021; Stapells & Oates, 1997). Our previous experimental results are consistent with both of these phenomena (Polonenko & Maddox, 2019, 2022), though place specificity was not the focus of either of those studies.

Recorded scalp potentials comprise the summed response of thousands of neurons across CF. Gross scalp potentials like the ABR and frequency-following response have been modeled in the past by simulating the firing rates of auditory nerve (AN) cells to a stimulus and summing across CF, with later stages of processing (corresponding to response components following wave I) based on the output of the AN stage (Dau, 2003; Verhulst et al., 2018). Since our interest here is in the relative contribution of different places along the tonotopic axis to the summed response, we simulate and analyze AN responses before the summing stage. We used two computational models of the auditory periphery to simulate responses of AN fibers to a wide range of stimulus rates and levels. Responses from both models support the overall hypothesis that increased place specificity is indeed a benefit of the pABR paradigm.

Methods

Stimulus Generation

The pABR stimuli were generated as previously described (Polonenko & Maddox, 2019). See Figure 2 for an overview of the stimulus generation and analysis techniques. Windowed tonebursts were generated for five frequencies (500, 1,000, 2,000, 4,000, and 8,000 Hz) by multiplying five cycles of a cosine at each frequency with a Blackman window of matching length. Sixty impulse trains with 1 s duration were generated at a sampling rate of 48 kHz following independent, random Poisson-like processes at a specified rate (20, 40, 80, 120, 160, or 200 stim/s), and half of the impulses were randomly selected to be inverted to produce both rarefaction and condensation tonebursts in a later step. In this study, we tested a wider range of rates than in our prior work because we hypothesized stronger masking effects as the rate increased and wished to test that hypothesis. Rates of 20 and 40 stim/s will remain the optimal rates for clinical use because they provide the shortest test times (Polonenko & Maddox, 2022), but higher rates, which may have better place specificity, may also be of interest, such as when studying suprathreshold hearing. As in our previous work, these impulse trains differed from standard Poisson processes in that each one-second sequence was forced to have exactly the number of impulses corresponding to the pulse rate. This modification was shown to have essentially no effect on the inter-stimulus interval histogram in Polonenko and Maddox (2019). Each toneburst was then convolved with its impulse train to produce a toneburst train with both rarefaction and condensation tonebursts. In practice, obtaining and averaging responses to rarefaction and condensation tonebursts helps to minimize stimulus artifact. Since this study uses computational models, stimulus artifact is not present in the response, but it is still important to present stimuli to the model as they would be in an EEG experiment, as rarefaction and condensation tonebursts can produce slightly different responses, particularly for lower-frequency stimuli. The five toneburst trains were then either presented in isolation (i.e., one toneburst train of one frequency is presented in a trial) as the serial condition or summed to be presented in parallel (i.e., toneburst trains of all frequencies are summed and presented at the same time). Since the timing of each toneburst train was generated as an independent random process, they are uncorrelated and the response for each frequency can still be determined when presented simultaneously. Stimuli were scaled to be in units of pascals as required by the models. We varied the level from 30 to 90 dB peak equivalent SPL (peSPL) in 5 dB increments, since the spread of excitation was expected to increase with stimulus level, resulting in 13 stimulus levels.
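The stimulus construction described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the code from the study: the function names are hypothetical, the cosine phase is taken to start at zero, the "Poisson-like" timing is approximated by drawing a fixed number of sample positions uniformly at random, and tonebursts that would run past the end of the one-second epoch are simply truncated. Scaling to pascals is omitted.

```python
import numpy as np

FS = 48_000  # stimulus sampling rate in Hz, as stated in the text


def make_toneburst(freq_hz, fs=FS, n_cycles=5):
    """Five cycles of a cosine multiplied by a Blackman window of equal length."""
    n_samples = int(round(n_cycles * fs / freq_hz))
    t = np.arange(n_samples) / fs  # assumption: cosine phase starts at zero
    return np.cos(2 * np.pi * freq_hz * t) * np.blackman(n_samples)


def make_impulse_train(rate, rng, fs=FS, duration_s=1.0):
    """Signed impulse train with exactly `rate * duration_s` impulses."""
    n_samples = int(fs * duration_s)
    n_impulses = int(rate * duration_s)
    # Assumption: uniformly random, non-repeating positions stand in for the
    # Poisson-like process with its count forced to match the nominal rate.
    idx = rng.choice(n_samples, size=n_impulses, replace=False)
    signs = np.ones(n_impulses)
    signs[rng.permutation(n_impulses)[: n_impulses // 2]] = -1.0  # invert half
    train = np.zeros(n_samples)
    train[idx] = signs
    return train


def make_toneburst_trains(rate, freqs=(500, 1000, 2000, 4000, 8000), seed=0):
    """Return one toneburst train and one impulse train per test frequency."""
    rng = np.random.default_rng(seed)
    trains, impulses = {}, {}
    for f in freqs:
        impulses[f] = make_impulse_train(rate, rng)
        # Convolution places one toneburst (either polarity) at each impulse.
        trains[f] = np.convolve(impulses[f], make_toneburst(f))[: impulses[f].size]
    return trains, impulses


trains, impulses = make_toneburst_trains(rate=40)
serial_1k = trains[1000]         # serial condition: one frequency alone
parallel = sum(trains.values())  # parallel condition: all five bands summed
```

Because each band draws its impulse times independently, the five trains in `parallel` are uncorrelated, which is what allows the per-frequency responses to be separated later.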

Figure 2.


The stimulus construction and analysis chain for the parallel condition. Toneburst trains for five different frequencies of tonebursts are generated and summed, then presented to the model. The raw model output is then cross-correlated with the rectified impulse trains used to generate the stimuli to determine the response for each frequency.

Peripheral Models

To investigate the place specificity of the pABR, we used two computational models of the auditory periphery and nerve to compare parallel and serial responses. Both models take an acoustic signal in units of pascals as input and allow us to examine modeled excitation patterns of the AN (as average firing rates).

The Zilany et al. model (Zilany et al., 2014) is a phenomenological model for which the user provides the characteristic frequencies of the cells to be modeled, the species, and the spontaneous firing rates. We modeled human responses from high-spontaneous rate fibers (the main drivers of onset scalp response; Bourien et al., 2014) with 201 characteristic frequencies ranging from 125 to 16,000 Hz spaced according to a Greenwood map (using the calc_cfs function in zilany2014 from the Cochlea python package; Rudnicki et al., 2015). Stimuli were first upsampled to 100 kHz before being passed into this model. We used a python implementation of the model which relies heavily on the original C code (Rudnicki et al., 2015).
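To make the idea of CFs "spaced according to a Greenwood map" concrete, the sketch below spaces 201 CFs evenly in cochlear position between 125 and 16,000 Hz. It is an illustration only: it assumes the commonly cited human constants of the Greenwood function (A = 165.4, a = 2.1, k = 0.88, with position expressed as a proportion of cochlear length), and the helper names are hypothetical. The `calc_cfs` function in the Cochlea package may differ in its constants and details.

```python
import math

# Assumed human Greenwood constants; x is proportional distance from the apex.
A, ALPHA, K = 165.4, 2.1, 0.88


def greenwood_cf(x):
    """Characteristic frequency (Hz) at proportional cochlear position x."""
    return A * (10 ** (ALPHA * x) - K)


def greenwood_place(freq_hz):
    """Inverse map: proportional cochlear position for a given frequency."""
    return math.log10(freq_hz / A + K) / ALPHA


def spaced_cfs(f_min=125.0, f_max=16_000.0, n=201):
    """CFs equally spaced in cochlear position between f_min and f_max."""
    x_lo, x_hi = greenwood_place(f_min), greenwood_place(f_max)
    return [greenwood_cf(x_lo + (x_hi - x_lo) * i / (n - 1)) for i in range(n)]


cfs = spaced_cfs()
```

Spacing in position rather than in hertz gives a denser sampling of low frequencies, roughly logarithmic above a few hundred hertz.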

The Verhulst et al. model (Verhulst et al., 2018) is a transmission line model with the number of channels in the model and their characteristic frequencies predetermined, ranging from ∼125 to 12,000 Hz according to a Greenwood map, resulting in 201 frequency channels. We utilized the convolutional neural network version of this model (CoNNear; Drakopoulos et al., 2021) due to its drastically sped-up simulation time. Prior to response modeling, stimuli were downsampled to 25 kHz and then zero-padded so the number of samples was a multiple of 16, 8, or 16,384 for the basilar membrane displacement, inner hair cell potentials, and AN firing rate, respectively, as required by the convolutional nature of this model.
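The zero-padding step amounts to extending the stimulus to the next multiple of the required block size. A minimal sketch follows, with a hypothetical helper name; padding at the end of the signal is an assumption.

```python
import numpy as np


def pad_to_multiple(x, multiple):
    """Append zeros so that len(x) becomes a multiple of `multiple`."""
    n_extra = (-len(x)) % multiple
    return np.pad(x, (0, n_extra))  # default mode pads with zeros


# A 1.1 s stimulus at 25 kHz has 27,500 samples; the largest block size
# quoted in the text (16,384) pads it to 32,768 samples.
padded = pad_to_multiple(np.ones(27_500), 16_384)
```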

Custom python scripts were developed to test our stimuli and extract the AN response of each model (all code is available at https://github.com/maddoxlab/pabr_modeling_TIH2023). To make the cross-correlation equivalent to averaging and avoid circular artifacts associated with processing digital signals in the frequency domain, stimuli were padded with the end of the previous stimulus at the beginning of the stimulus and the beginning of the next stimulus at the end of the stimulus. When computing correlation (or convolution) through frequency-domain multiplication, the amount of analysis window before and after lag zero unaffected by circular effects is equivalent to the amount of padding used before and after the stimulus epoch, respectively. Since the time window we plot and analyze for the average response ranges from −50 to 50 ms, where a time of 0 ms corresponds to stimulus onset time, we used only 50 ms to pad the stimulus on either side to reduce computation time. Model outputs were downsampled to 10 kHz, the response was calculated, and the response window from −50 to 50 ms was saved (keeping only 100 ms of the total 1,100 ms) for analysis to reduce file sizes and analysis time.
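The padding and windowing logic can be illustrated as follows. This is a sketch with hypothetical helper names, not the repository code. It assumes that a correlation computed by frequency-domain multiplication stores negative lags at the end of the output array, which is the usual FFT convention, so the −50 to 50 ms window is read out with wrapped indices.

```python
import numpy as np


def pad_with_neighbours(prev_epoch, epoch, next_epoch, n_pad):
    """Prepend the end of the previous epoch and append the start of the next."""
    return np.concatenate([prev_epoch[-n_pad:], epoch, next_epoch[:n_pad]])


def extract_window(w, fs, t_min=-0.05, t_max=0.05):
    """Read lags from t_min up to (not including) t_max out of a circular result."""
    lags = np.arange(int(round(t_min * fs)), int(round(t_max * fs)))
    return lags / fs, w[lags % w.size]  # negative lags wrap to the array end


# With 50 ms of padding on each side, a 1 s epoch becomes 1.1 s long.
fs = 48_000
n_pad = int(round(0.05 * fs))
epochs = [np.zeros(fs) for _ in range(3)]
padded_epoch = pad_with_neighbours(epochs[0], epochs[1], epochs[2], n_pad)
```

Only lags within the padded margin are free of wrap-around, which is why 50 ms of padding suffices for a ±50 ms analysis window.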

Response Calculation

The stimuli were input to the models, resulting in a raw response waveform for each modeled CF, paradigm (serial/parallel), stimulus rate, and stimulus level. The raw output of the Verhulst model contained artifactual sinusoidal oscillations at Nyquist and a few other frequencies. These artifacts were easily canceled out by passing in a stimulus consisting of only zeros and then subtracting that response (which contained the same artifacts) from the stimulus responses. This had the additional effect of removing the spontaneous firing rate, but this component of the response was not of interest (as a DC potential cannot contribute to the ABR) and would be removed in later analysis steps anyway. At each CF, for both the serial and parallel stimuli, evoked responses were calculated by taking the cross-correlation of the model output and the rectified impulse train that corresponds to the frequency of interest and then dividing by the number of stimuli (as in Polonenko & Maddox, 2019), a process which is equivalent to determining the average evoked responses. The impulse train was downsampled to match the model outputs’ sampling rate by changing the indices where unit impulses occurred. Specifically, the original indices were multiplied by the new sampling rate and divided by the old sampling rate, and then rounded to the nearest sample index. Cross-correlations were carried out in the frequency domain for efficiency, as shown in Equation 1, for all combinations of CF, paradigm, rate, and intensity, where w is the evoked response waveform, x is the rectified impulse train, y is the model output, n is the number of stimuli, ℱ and ℱ⁻¹ indicate the Fourier and inverse Fourier transforms, respectively, and * indicates the complex conjugate:

$$w = \frac{1}{n}\,\mathcal{F}^{-1}\left\{\mathcal{F}\{x\}^{*}\,\mathcal{F}\{y\}\right\}. \tag{1}$$

The zeroth value of the response in the Fourier space was set to zero to subtract the mean of the time-domain signal.
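Equation 1 and the index-rescaling step can be sketched with NumPy's FFT routines. The function names are hypothetical, and the toy "model output" is simply a known kernel placed at each impulse, chosen so the result can be checked by eye. Impulses are deliberately kept far enough apart that the kernels do not overlap, in which case the cross-correlation recovers the kernel exactly (apart from the removed mean), illustrating the equivalence to averaging.

```python
import numpy as np


def rescale_impulse_indices(indices, fs_old, fs_new):
    """Move impulse sample indices to a new sampling rate (nearest sample)."""
    return np.rint(np.asarray(indices) * fs_new / fs_old).astype(int)


def evoked_response(x, y):
    """Equation 1: w = (1/n) * IFFT(conj(FFT(x)) * FFT(y)).

    x is the rectified impulse train (ones at stimulus onsets), y the model
    output for one CF, and n the number of stimuli.
    """
    n_stim = np.count_nonzero(x)
    spectrum = np.conj(np.fft.fft(x)) * np.fft.fft(y)
    spectrum[0] = 0.0  # zero the DC bin, removing the time-domain mean
    return np.real(np.fft.ifft(spectrum)) / n_stim


# Toy check: 40 well-separated impulses, each evoking the same 50-sample kernel.
rng = np.random.default_rng(0)
n_samples = 11_000
idx = np.arange(40) * 250 + rng.integers(0, 100, size=40)
x = np.zeros(n_samples)
x[idx] = 1.0
kernel = np.hanning(50)
y = np.convolve(x, kernel)[:n_samples]
w = evoked_response(x, y)  # w[:50] equals kernel minus the mean of y
```

With randomized, overlapping stimuli as in the pABR, responses to neighbouring stimuli no longer cancel exactly and leave a small residual that shrinks as more stimuli are averaged.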

Comparison of Conditions

The overall objective of this study was to determine whether parallel stimulus presentation provides better place specificity than serial presentation. To allow conditions to be compared at a glance, we required a summary metric to describe the place specificity of each response. Due to the asymmetric and often irregular shape of the responses (when viewed across CF), common measures of bandwidth were not applicable here. We designed our metric to describe the place specificity of each response (w from Equation 1), penalizing response spread to CFs away from the stimulus frequency as well as off-place peaks.

Each response started as a firing rate which varied over time and CF that we eventually reduced to a single quantity. To collapse across time, the lags corresponding to the maximum response value at each CF in the serial condition were determined (black dotted line, leftmost panel Figure 3). The value of the response at this lag was then selected for both the serial and parallel conditions, resulting in a single magnitude for each CF for each condition. Using the peak time from the serial condition was important because the parallel responses contained some residual noise from the other stimulus sequences. This noise was small and disappeared as stimulus length increased, but at CFs where the response was also small, the noise had the potential to bias the maximum value across time spuriously higher.
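The time-collapsing step can be written compactly. The sketch below assumes that each response is stored as an array of shape (number of CFs, number of lags); the function name is hypothetical.

```python
import numpy as np


def collapse_across_time(serial, parallel):
    """Pick one magnitude per CF using the lag of the serial maximum.

    Both inputs have shape (n_cf, n_lags). The same serial-derived lag is
    applied to the parallel response, so residual noise in the parallel
    condition cannot inflate its value.
    """
    peak_lags = np.argmax(serial, axis=1)
    rows = np.arange(serial.shape[0])
    return serial[rows, peak_lags], parallel[rows, peak_lags], peak_lags


serial = np.array([[0.0, 3.0, 1.0], [2.0, 0.0, 5.0]])
parallel = np.array([[9.0, 1.0, 0.0], [0.0, 7.0, 2.0]])
serial_mag, parallel_mag, lags = collapse_across_time(serial, parallel)
```

In the toy arrays, the parallel response has larger values at other lags (9 and 7), but only the values at the serial peak lags are kept.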

Figure 3.


Computation of summary metric to compare conditions. The example shown is from the Zilany et al. model, with a test frequency of 1 kHz and stimulus rate of 120 stim/s at 75 dB SPL. Time lags are selected for each CF where the maximum value occurs in the serial condition (black dotted line in the serial panel). These time lags are used to select values from both the serial and parallel condition (green dotted line in parallel panel) to avoid bias, which are then normalized such that the value at the test frequency is equal to one, penalizing responses which are stronger in off-frequency regions. The values are then integrated across CFs in octaves to determine PS (rightmost panel), which serves as a single number descriptor of place specificity. PS = place specificity metric; CF = characteristic frequency; SPL = sound pressure level.

We then normalized all magnitude values by dividing them by the magnitude at the CF corresponding to the stimulus frequency (such that the latter became equal to one). This normalization allowed for comparison across conditions (where response magnitude can vary significantly) and penalized cases where the strongest responses were not at the CF matching the stimulus frequency. Next, the response was integrated across CF, with CF expressed in octaves (reflecting the roughly logarithmic spacing of CF on the cochlea), to produce a single number related to the contribution of off-place responses for each rate-intensity pair. Lastly, we inverted the number to make it interpretable in a similar manner as a Quality Factor, with larger numbers corresponding to more place-specific responses. Hereafter we refer to this numerical place specificity metric simply as PS. This process is illustrated in Figure 3. To compare the serial and parallel conditions, the ratio of the PS for the parallel to the serial condition was calculated. A ratio greater than one indicates an improvement in PS for the parallel stimulus relative to the corresponding serial stimulus with the same parameters.
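A minimal sketch of the metric follows, assuming the per-CF magnitudes have already been extracted. The function name is hypothetical, the on-place CF is taken as the modeled CF nearest the test frequency, and trapezoidal integration over log2(CF) is an assumption about the integration rule.

```python
import numpy as np


def place_specificity(magnitudes, cfs_hz, test_freq_hz):
    """Inverse of the normalized excitation area, integrated over octaves."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    cfs_hz = np.asarray(cfs_hz, dtype=float)
    # Normalize so the magnitude at the CF nearest the test frequency is one.
    on_place = np.argmin(np.abs(np.log2(cfs_hz / test_freq_hz)))
    normalized = magnitudes / magnitudes[on_place]
    # Trapezoidal integration with CF expressed in octaves.
    octaves = np.log2(cfs_hz)
    area = np.sum(0.5 * (normalized[1:] + normalized[:-1]) * np.diff(octaves))
    return 1.0 / area


# Toy excitation patterns around a 1 kHz test frequency.
cfs = 1000.0 * 2.0 ** np.linspace(-2, 2, 81)
octs = np.log2(cfs / 1000.0)
ps_serial = place_specificity(np.exp(-(octs / 1.0) ** 2), cfs, 1000.0)     # broad
ps_parallel = place_specificity(np.exp(-(octs / 0.25) ** 2), cfs, 1000.0)  # narrow
ps_ratio = ps_parallel / ps_serial  # > 1 indicates a parallel advantage
```

A flat pattern spanning four octaves has an area of 4 and therefore a PS of 0.25, while narrower patterns give larger values. Dividing by the on-place magnitude also means that an off-place peak larger than the on-place response inflates the area and lowers PS.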

Since this study uses modeled responses, for which arbitrarily small confidence intervals could be obtained by generating more data, no significance testing was performed. Instead, we comment on grossly observable trends. These trends will be tested using data recorded from human subjects in a future study.

Hearing Impaired Model

The Zilany et al. model has parameters that can be changed in order to model hearing impairment. We used the function carney2015_fitaudiogram as implemented in the Auditory Modeling Toolbox (Majdak et al., 2022) to model responses in a moderately impaired ear (audiogram m9 from Parthasarathy et al., 2020—audiometric thresholds: 35, 37, 37, 39, 41, 43, and 43 dB HL at frequencies of 0.5, 1, 2, 3, 4, 6, and 8 kHz, respectively), with two-thirds of the impairment resulting from outer hair cell (OHC) impairment. We initially sought to model more severe impairment, but we were unable to maintain the desired ratio of two-thirds impairment from OHCs because some of the model parameters went to zero. Since we needed to specify model parameters for every CF, we interpolated the audiogram thresholds within the data range and kept the values at the extents constant when out of range. We repeated our stimulus presentation and response analysis using the impaired model. Adding impairment to the model led to some low-level stimuli showing no response, but PS can only be calculated (and is only relevant) when a response is present. We generated a simple model of the ABR wave I by summing across CFs and then, using these modeled waveforms, we determined if a response was present by requiring the maximum value of the serial condition peak between 0 and 10 ms to be greater than or equal to one standard deviation of the pre-stimulus noise in the parallel condition. If the serial response was smaller than the pre-stimulus noise in the parallel condition, the parallel response would also fall below the noise and not be detectable. We did not analyze conditions for which no response was present.
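Two steps from this paragraph lend themselves to a short sketch: extending the audiogram to every modeled CF, and the response-presence criterion. The helper names are hypothetical, and interpolating on a log-frequency axis is an assumption, since the text does not state the interpolation axis. `np.interp` holds the end values constant outside the data range, matching the behaviour described.

```python
import numpy as np

# Audiogram m9 values quoted in the text.
AUDIOGRAM_HZ = np.array([500, 1000, 2000, 3000, 4000, 6000, 8000])
AUDIOGRAM_DB_HL = np.array([35, 37, 37, 39, 41, 43, 43])


def thresholds_at_cfs(cfs_hz):
    """Interpolate thresholds to each CF; end values are held out of range."""
    return np.interp(np.log2(cfs_hz), np.log2(AUDIOGRAM_HZ), AUDIOGRAM_DB_HL)


def response_present(serial, parallel, times_s):
    """Detection rule from the text, applied to (n_cf, n_lags) responses.

    Summing across CFs gives a simple wave I model; the serial peak between
    0 and 10 ms must be at least one standard deviation of the pre-stimulus
    portion of the parallel waveform.
    """
    serial_sum = serial.sum(axis=0)
    parallel_sum = parallel.sum(axis=0)
    peak = serial_sum[(times_s >= 0) & (times_s <= 0.010)].max()
    noise_sd = parallel_sum[times_s < 0].std()
    return peak >= noise_sd


hl = thresholds_at_cfs(np.array([125.0, 1000.0, 16000.0]))
```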

Results

Overall, the modeled responses suggest that parallel stimulus presentation offers better place specificity than serial at high levels, with no difference at low levels in normal hearing. We first examine some example responses that demonstrate the essential traits of responses at different stimulus rates and levels (Figure 4), where the left two columns show the response across time and CF measured with a train of 1,000 Hz tonebursts, presented either in isolation (serial) or alongside the toneburst trains at other frequencies (parallel). At a high stimulus level (75 dB, Figure 4A and B), serial presentation resulted in broad excitation patterns, primarily due to spread toward the high-frequency base. The parallel responses at the same stimulus level, however, showed less spread—the spread to high CFs was almost completely eliminated at a high stimulus rate (120 stim/s, Figure 4A) and present but with reduced magnitude at a low rate (20 stim/s, Figure 4B, see difference plot in third column). At the low stimulus level (30 dB), both serial and parallel presentation produced place-specific responses. The rightmost column of Figure 4 shows the normalized maximum value from the serial condition and the corresponding value from the parallel response for all five test frequencies. In these examples, the benefit of parallel presentation was strongest at lower stimulus frequencies (rightmost column, responses to high stimulus frequencies are narrower than at low stimulus frequencies). The serial responses had a broader excitation pattern (which led to a smaller PS) than the parallel responses at the high stimulus level, with a more pronounced difference at the high stimulus rate.

Figure 4.


Example responses of the auditory nerve for different stimulus parameters (Zilany et al. model shown). (A) Responses from a 1,000 Hz toneburst train at a high stimulus rate and level. The serial response (first column) has a greater spread of cochlear excitation (vertical axis) than the parallel response (second column). This difference is shown more clearly in the third column, which plots the subtraction of the serial response from the parallel response. The rightmost column plots the normalized maximum response value across time, showing the parallel responses (shaded regions) are more place specific than the serial responses (solid lines). (B) and (C) The same as (A) but for low-rate and high-level stimuli, or low-rate and low-level stimuli, respectively. Note that color limits for the firing rate plots are consistent within rows but not across to allow for better visualization and because the focus of this study is the spread of excitation rather than its magnitude.

We next compared the place specificity of serial and parallel presentation across stimulus rate and level for all stimulus frequencies (Figure 5), with darker colors indicating higher PS (i.e., better place specificity) in the serial and parallel plots. The overall trends were consistent with the example discussed above—namely, (1) across essentially all conditions, parallel presentation was at least as place specific as serial, (2) PS decreased (worsened) with increased stimulus level for both parallel and serial presentation (lightening colors from bottom to top in each panel of the left two columns of each triplet), and (3) the difference between the serial and parallel PS at higher stimulus levels, with parallel PS higher (better) than serial PS, became greater at higher stimulus rates and was most prominent for lower stimulus frequencies. For parallel presentation, higher stimulus rates increased PS mainly at higher stimulus levels (i.e., iso-PS curves have an upward slant at high levels), which is consistent with our hypothesis. For serial presentation the opposite was observed: PS decreased with higher stimulus rates (iso-PS lines have a downward slant).

Figure 5.


Summary plots of AN place specificity for serial and parallel presentation from both models (left half of the plot: Zilany et al., right half of the plot: Verhulst et al.). Each row corresponds to a different test frequency which is denoted by color and labeled on the vertical line in the middle of the figure. The left two columns of each triplet show the PS for the serial and parallel responses, while the rightmost column of each triplet shows the PS ratio. The left two columns show that PS decreases with stimulus level for both serial and parallel presentation, while PS decreases with stimulus rate for serial presentation but increases for parallel presentation. These effects combine to produce a strong improvement in place specificity at high stimulus rates, particularly for low-frequency stimuli and at high levels, where there is the most room for improvement. This improvement can be seen in the rightmost column of each triplet, which plots the ratio of the parallel to serial PS (higher numbers, represented with darker color, indicate a parallel advantage). PS = place specificity metric; AN = auditory nerve.

To facilitate a direct comparison of parallel to serial PS, the ratio between the two is shown in the third column of each triplet of Figure 5. Here, higher numbers indicate a parallel advantage and are shown with darkening colors. With the trends noted above, parallel presentation's improved PS can be seen in the upper-right region of each panel (i.e., at high stimulus rates and levels). While the trends are generally consistent, there is a noticeable difference in the size of the effect predicted by the models, where the Zilany et al. model showed a stronger benefit of parallel presentation due to a larger decrease of serial PS with increasing stimulus level than seen in the Verhulst et al. model.

In general, we observe a benefit of parallel presentation at high rates and intensities, but the size of this benefit depends on the stimulus test frequency along with which model produced the responses. These differences are more clearly shown in Figure 6, which focuses on the PS ratio at 40 stim/s and 80 dB across test frequencies and models (at a clinically relevant rate and high stimulus level). The lower stimulus frequencies benefited more from parallel presentation than high frequencies. The Zilany et al. model predicted slightly less improvement for the lowest test frequency than the next lowest test frequency, due to some spread of excitation towards the lower CFs (i.e., the response spread to lower frequencies, which could be masked for the middle test frequencies, but not the lowest). Overall, the Zilany et al. model predicted a greater benefit of parallel presentation, since the PS ratio at each test frequency is greater than that predicted by the Verhulst et al. model.

Figure 6.


The PS ratios at 40 stim/s and 80 dB for all test frequencies and both sites and models show the trend of parallel improvement across test frequency and site. Both models generally show stronger improvements for low test frequencies, with a greater improvement in the AN. The Zilany et al. model showed a greater difference between sites and less improvement for the lowest test frequency than the second lowest test frequency. PS = place specificity metric; AN = auditory nerve.

The hearing impaired (HI) model showed similar trends to the normal hearing (NH) model (Figure 7), with parallel presentation showing an improvement in PS at high levels and stimulus rates. The HI model differs in two ways. (1) Expectedly, the high-level region of improvement is shifted upward by an amount related to the modeled threshold elevation. (2) At low levels, the HI model showed a second region where the parallel PS was higher than the serial. The first difference is attributable to a change in cochlear gain. We consider possible mechanisms for the second observation in the “Discussion” section.

Figure 7.


Summary plot for PS of responses in the impaired Zilany et al. model. The figure layout is as in Figure 5. No response was present below 40 dB for the 8 kHz response, so PS was not calculated.

Discussion

Here we investigated the effect of parallel stimulus presentation on place specificity using two computational models of the auditory periphery. Supporting our hypothesis, both the Zilany et al. and Verhulst et al. models suggest parallel presentation is at least as place specific as serial presentation, and place specificity is improved with parallel presentation at high stimulus rates and levels. This remains true for the HI model, with additional improvements near threshold, suggesting the pABR may be particularly beneficial in the case of impaired hearing.

At low stimulus levels, we find no difference between serial and parallel presentation for normal hearing, with both stimulus paradigms offering place-specific responses. This is unsurprising, as lower-level serial stimuli optimally excite the CFs near the stimulus frequency, yielding a response that is place specific, and previous work has shown standard ABRs are place specific at low and moderate levels (e.g., Stapells & Oates, 1997). Parallel presentation excites the entire cochlea because all stimulus frequencies are presented, yet responses to each stimulus frequency are place specific because the stimuli are uncorrelated across bands and thus mathematically avoid contamination between responses.

At higher stimulus levels, both models suggest parallel presentation offers superior place specificity over serial presentation. In serial presentation, stimuli at high levels excite a broader region of the cochlea. In parallel presentation, however, stimuli in different bands mutually mask each other (see Versteegh & van der Heijden, 2013 for a prior example of multiple tones masking each other), reducing the contribution of off-frequency regions to responses. Thus, the difference in place specificity between serial and parallel presentation is driven by a much more pronounced worsening of serial place specificity in the modeled responses as level increases when compared to parallel responses.

Stimulus presentation rate also played a role in place specificity, with the largest differences between parallel and serial presentation at the highest stimulus rates. For parallel presentation, masking effects increase as stimulus rates increase because the stimuli more continuously excite the cochlea (in the limit, resembling bandpassed noise more than a series of tonebursts). This effect was observable in both models. The increased benefit offered at high rates will likely be most useful in laboratory settings rather than in the clinic, as our previous work (Polonenko & Maddox, 2022) has shown that 20 or 40 stim/s are the optimal testing rates to produce the shortest test times, the primary consideration for clinical application. However, we note that parallel presentation still provides some benefit to place specificity at these lower stimulus rates, particularly at high levels where it is most needed. Additionally, the pABR is most effective in reducing test times for low stimulus levels (Polonenko & Maddox, 2019) where serial and parallel responses are similarly place specific.

We did not expect an effect of rate on serial PS, yet both models suggest that serial responses become less place specific at increasing presentation rates. As the stimulus rate increases, the magnitude of the on-place response is suppressed (i.e., firing rates decrease) due to adaptation associated with higher-rate stimuli (Burkard et al., 2007, p. 239). The off-place responses do not exhibit this adaptation and thus remain constant, since off-place CFs are less strongly driven by the stimulus. This results in an overall “flattening” of the response and a greater relative contribution of the off-place CFs to the overall response, worsening place specificity.
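
The "flattening" argument for serial responses can be made concrete with a few lines of arithmetic. In the sketch below, all numbers and the adaptation rule are hypothetical and chosen only for illustration. They are not outputs of either model, and the off-place share computed here is a simplified stand-in rather than the place-specificity metric used in this study. The 20 and 40 stim/s rates come from the text above; the higher rates are assumed values.

```python
def off_place_share(on_place, off_place):
    """Fraction of the summed response contributed by off-place CFs."""
    return off_place / (on_place + off_place)


def adapted_on_place(unadapted, rate, half_rate=100.0):
    """Hypothetical adaptation rule: on-place response halves at half_rate stim/s."""
    return unadapted / (1.0 + rate / half_rate)


# Illustrative values in arbitrary units, not model outputs.
UNADAPTED_ON_PLACE = 100.0
OFF_PLACE = 20.0  # held constant across rates, as described above

for rate in (20, 40, 80, 160):
    on_place = adapted_on_place(UNADAPTED_ON_PLACE, rate)
    share = off_place_share(on_place, OFF_PLACE)
    print(f"{rate:>3} stim/s: on-place {on_place:5.1f}, off-place share {share:.2f}")
```

With the off-place term fixed, any rule that shrinks the on-place response as rate increases raises the off-place share, which is the sense in which the response flattens.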

The two models differ in their predicted magnitude of the benefit provided by parallel presentation, with the Zilany et al. model showing a greater difference between serial and parallel place specificity. This difference appears to be partly due to a greater spread of excitation towards the lower CFs in the Zilany et al. model, leading to a greater magnitude of excitation in the off-place regions (relative to the response where CF is equal to the test frequency) than in the Verhulst et al. model. Additionally, the Zilany et al. model exhibits strong off-place peaks at higher stimulus levels (e.g., as in the right panel of Figure 4A) which are not seen in the Verhulst et al. model. While it is not possible at this stage to determine which model more closely matches actual human responses, both models show the same overall effects. The place specificity of pABR responses will need to be tested in human subjects. Since such a study would not have access to individual sites along the tonotopic axis as in computational models, other approaches such as masked and derived-band ABRs will need to be employed, as was done by Oates and Stapells (1997b) and Herdman et al. (2002).

A key difference between the NH and HI models is seen at low stimulus levels, where the HI model predicts a benefit of parallel presentation while the NH model does not. This phenomenon occurs due to differences in the serial response between the NH and HI models. In the HI model, the serial response weakly excites a wide range of CFs near threshold due to broadened tuning, a behavior not seen in the NH model, which has sharp tuning at low levels. In the HI model, even at low levels, masking effects from parallel presentation prevent off-place regions from contributing to the response, resulting in more place-specific responses with parallel presentation near threshold. For both serial and parallel presentation, the magnitude of the on-place response grows rapidly when moving from low to moderate levels, while the magnitude of the off-place responses grows more slowly until higher levels. This leads to a dominant on-place response for both serial and parallel presentation at moderate levels and relatively little contribution from the off-place CFs, so the paradigms are both place specific. As the stimulus level increases further, the on-place CF response saturates while the magnitude of the off-place response grows. In the serial condition, this results in a response with poor place specificity. In the parallel condition, masking effects help maintain the place specificity of the response, so parallel presentation is again more place specific than serial. Overall, this behavior makes sense in the context of the model and a simple understanding of hearing impairment. However, because data to validate the HI model are limited (particularly for the low-frequency apex of the cochlea, and for humans in general) and the underlying etiologies of hearing impairment are varied, further conclusions cannot be drawn without overinterpreting these modeled results.

In humans, the largest ABR component is wave V, which is generated from later stages of processing than the AN (which generates wave I; Gu et al., 2012). We focused only on the AN here because of the existence of robust models. While approaches to modeling later processing stages do exist (e.g., Dau, 2003; Nelson & Carney, 2004), they are less well developed than models of the AN, and there are essential aspects of actual responses which are not recapitulated. Still, an improvement in place specificity at the AN should be inherited by later stages, leading to overall evoked waveforms that are more place specific.

While this modeling study shows that higher stimulus rates lead to better place specificity, there are tradeoffs that must be considered for practical use. We have previously found increasing stimulus rate leads to smaller responses (a phenomenon which the models used in this article recapitulate) and thus longer recording times. For applications where time is the primary consideration, the pABR at rates of 20 or 40 stim/s (Polonenko & Maddox, 2022) will therefore yield responses faster than serial testing (Polonenko & Maddox, 2019) while offering slightly better place specificity at high levels (Figure 5). However, if a given experiment or clinical paradigm demands greater place specificity, it can be obtained through high stimulus rates at the cost of longer acquisition times.

Conclusion

The parallel construction of the pABR stimulus produces modeled responses which are more place specific than standard, serial methods at high stimulus levels. In models of a healthy human ear, this benefit is most prominent for low stimulus frequencies and increases with stimulus rate. When hearing impairment was added to one of the models, an additional region of improvement was seen at low levels. Unlike other methods to improve place specificity, the pABR reduces test times rather than increasing them. An increase in place specificity for high-level stimuli should lead to more accurate threshold estimations for individuals with more severe hearing loss.

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institute on Deafness and Other Communication Disorders (grant number R01 DC017962).

References

  1. BC Early Hearing Program. (2012). BC early hearing program. Audiology Assessment Protocol. Version, 4, 18.
  2. Bourien J., Tang Y., Batrel C., Huet A., Lenoir M., Ladrech S., Desmadryl G., Nouvian R., Puel J., Wang J. (2014). Contribution of auditory nerve fibers to compound action potential of the auditory nerve. Journal of Neurophysiology, 112(5), 1025–1039. 10.1152/jn.00738.2013
  3. Bright K., Greeley C. O., Eichwald J., Loveland C. O., Tanner G. (2011). American Academy of Audiology childhood hearing screening guidelines. American Academy of Audiology Task Force.
  4. Burkard R. F., Eggermont J. J., Don M. (2007). Auditory evoked potentials: Basic principles and clinical application. Lippincott Williams & Wilkins.
  5. Dau T. (2003). The importance of cochlear processing for the formation of auditory brainstem and frequency following responses. The Journal of the Acoustical Society of America, 113(2), 936–950. 10.1121/1.1534833
  6. Drakopoulos F., Baby D., Verhulst S. (2021). A convolutional neural-network framework for modelling auditory sensory cells and synapses. Communications Biology, 4(1), 1–17. 10.1038/s42003-021-02341-5
  7. Encina-Llamas G., Dau T., Epp B. (2021). On the use of envelope following responses to estimate peripheral level compression in the auditory system. Scientific Reports, 11(1), 6962. 10.1038/s41598-021-85850-x
  8. Encina-Llamas G., Harte J. M., Dau T., Shinn-Cunningham B., Epp B. (2019). Investigating the effect of cochlear synaptopathy on envelope following responses using a model of the auditory nerve. Journal of the Association for Research in Otolaryngology, 20(4), 363–382. 10.1007/s10162-019-00721-7
  9. Gu J. W., Herrmann B. S., Levine R. A., Melcher J. R. (2012). Brainstem auditory evoked potentials suggest a role for the ventral cochlear nucleus in tinnitus. Journal of the Association for Research in Otolaryngology, 13(6), 819–833. 10.1007/s10162-012-0344-1
  10. Hall J. W. (2007). New handbook of auditory evoked responses. Pearson.
  11. Herdman A. T., Picton T. W., Stapells D. R. (2002). Place specificity of multiple auditory steady-state responses. The Journal of the Acoustical Society of America, 112(4), 1569–1582. 10.1121/1.1506367
  12. Herdman A. T., Stapells D. R. (2001). Thresholds determined using the monotic and dichotic multiple auditory steady-state response technique in normal-hearing subjects. Scandinavian Audiology, 30(1), 41–49. 10.1080/010503901750069563
  13. Johannesen P. T., Leclère T., Wijetillake A., Segovia-Martínez M., Lopez-Poveda E. A. (2022). Modeling temporal information encoding by the population of fibers in the healthy and synaptopathic auditory nerve. Hearing Research, 426, 108621. 10.1016/j.heares.2022.108621
  14. Majdak P., Hollomey C., Baumgartner R. (2022). AMT 1.x: A toolbox for reproducible research in auditory modeling. Acta Acustica, 6, 19. 10.1051/aacus/2022011
  15. Nelson P. C., Carney L. H. (2004). A phenomenological model of peripheral and central neural responses to amplitude-modulated tones. The Journal of the Acoustical Society of America, 116(4), 2173–2186. 10.1121/1.1784442
  16. Oates P., Stapells D. (1997a). Frequency specificity of the human auditory brainstem and middle latency responses to brief tones. I. High-pass noise masking. The Journal of the Acoustical Society of America, 102(6), 3597–3608. 10.1121/1.420148
  17. Oates P., Stapells D. (1997b). Frequency specificity of the human auditory brainstem and middle latency responses to brief tones. II. Derived response analyses. The Journal of the Acoustical Society of America, 102(6), 3609–3619. 10.1121/1.420400
  18. Parthasarathy A., Romero Pinto S., Lewis R. M., Goedicke W., Polley D. B. (2020). Data-driven segmentation of audiometric phenotypes across a large clinical cohort. Scientific Reports, 10(1), 6704. 10.1038/s41598-020-63515-5
  19. Pickles J. O. (2012). Introduction to the physiology of hearing. BRILL. Retrieved from http://ebookcentral.proquest.com/lib/rochester/detail.action?docID=992501
  20. Picton T., Ouellette J., Hamel G., Durieux-Smith A. (1979). Brain stem evoked potentials to tone pips in notched noise. The Journal of Otolaryngology, 8(4), 289–314.
  21. Polonenko M. J., Maddox R. K. (2019). The parallel auditory brainstem response. Trends in Hearing, 23, 2331216519871395. 10.1177/2331216519871395
  22. Polonenko M. J., Maddox R. K. (2022). Optimizing parameters for using the parallel auditory brainstem response to quickly estimate hearing thresholds. Ear and Hearing, 43(2), 646–658. 10.1097/aud.0000000000001128
  23. Robles L., Ruggero M. A. (2001). Mechanics of the mammalian cochlea. Physiological Reviews, 81(3), 1305–1352. 10.1152/physrev.2001.81.3.1305
  24. Rudnicki M., Schoppe O., Isik M., Völk F., Hemmert W. (2015). Modeling auditory coding: From sound to spikes. Cell and Tissue Research, 361(1), 159–175. 10.1007/s00441-015-2202-z
  25. Russell I. J., Nilsen K. E. (1997). The location of the cochlear amplifier: Spatial representation of a single tone on the guinea pig basilar membrane. Proceedings of the National Academy of Sciences, 94(6), 2660–2664. 10.1073/pnas.94.6.2660
  26. Stapells D., Oates P. (1997). Estimation of the pure-tone audiogram by the auditory brainstem response: A review. Audiology & Neuro-Otology, 2(5), 257–280. 10.1159/000259252
  27. Stapells D., Picton T. W., Durieux-Smith A. (1994). Electrophysiologic measures of frequency-specific auditory function. In Jacobson J. T. (Ed.), Principles and applications of auditory evoked potentials (1st ed., pp. 251–283). Prentice Hall.
  28. Verhulst S., Altoè A., Vasilkov V. (2018). Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss. Hearing Research, 360, 55–75. 10.1016/j.heares.2017.12.018
  29. Versteegh C. P., van der Heijden M. (2013). The spatial buildup of compression and suppression in the mammalian cochlea. Journal of the Association for Research in Otolaryngology, 14(4), 523–545. 10.1007/s10162-013-0393-0
  30. Zilany M. S. A., Bruce I. C., Carney L. H. (2014). Updated parameters and expanded simulation options for a model of the auditory periphery. The Journal of the Acoustical Society of America, 135(1), 283–286. 10.1121/1.4837815

Articles from Trends in Hearing are provided here courtesy of SAGE Publications
