Amplitude Compression for Preventing Rollover at Above-Conversational Speech Levels

Michal Fereczkowski; Raul H Sanchez-Lopez; Stine Christiansen; Tobias Neher

doi:10.1177/23312165231224597

2024 Jan 5;28:23312165231224597.

PMCID: PMC10771052 PMID: 38179670

Abstract

Hearing aids provide nonlinear amplification to improve speech audibility and loudness perception. While more audibility typically increases speech intelligibility at low levels, the same is not true for above-conversational levels, where decreases in intelligibility (“rollover”) can occur. In a previous study, we found rollover in speech intelligibility measurements made in quiet for 35 out of 74 test ears with a hearing loss. Furthermore, we found rollover occurrence in quiet to be associated with poorer speech intelligibility in noise as measured with linear amplification. Here, we retested 16 participants with rollover with three amplitude-compression settings. Two were designed to prevent rollover by applying slow- or fast-acting compression with a 5:1 compression ratio around the “sweet spot,” that is, the area in an individual performance-intensity function with high intelligibility and listening comfort. The third, reference setting used gains and compression ratios prescribed by the “National Acoustic Laboratories Non-Linear 1” rule. Speech intelligibility was assessed in quiet and in noise. Pairwise preference judgments were also collected. For speech levels of 70 dB SPL and above, slow-acting sweet-spot compression gave better intelligibility in quiet and noise than the reference setting. Additionally, the participants clearly preferred slow-acting sweet-spot compression over the other settings. At lower levels, the three settings gave comparable speech intelligibility, and the participants preferred the reference setting over both sweet-spot settings. Overall, these results suggest that, for listeners with rollover, slow-acting sweet-spot compression is beneficial at 70 dB SPL and above, while at lower levels clinically established gain targets are more suited.

Keywords: hearing aids, amplification, speech perception, individual differences

Introduction

Hearing aids (HAs) provide non-linear amplification to improve the audibility of speech and other signals. Amplification of speech generally leads to improved speech intelligibility (SI) in quiet (Duquesnoy & Plomp, 1983). To quantify speech audibility, researchers and practitioners often use the Articulation Index (French & Steinberg, 1947), which is a measure of the proportion of all speech sounds that lie above a listener's hearing thresholds (Souza et al., 2000). While the relation between the Articulation Index and SI depends on the speech corpus used, SI is generally modeled as a non-decreasing function of the Articulation Index (Webster, 1979). In other words, it is implicitly assumed that as audibility increases (e.g., due to a higher presentation level or HA amplification) SI increases.

While higher presentation levels typically lead to more audibility, they do not necessarily ensure maximum SI. Rather than plateauing out, SI can decrease with increasing presentation level (e.g., Fereczkowski & Neher, 2023a; Studebaker et al., 1999). In the audiological literature, this phenomenon is termed “rollover” (RO; Jerger & Jerger, 1967). In the clinic, RO is inferred from SI measurements performed with monosyllabic words presented in quiet at above-conversational levels. It has been suggested that the presence of RO is indicative of a retro-cochlear disorder (Jerger & Jerger, 1974), for example acoustic neuroma. However, RO has also been observed for listeners with normal audiograms (French & Steinberg, 1947; Shehorn et al., 2020; Studebaker et al., 1999) and for hearing-impaired (HI) individuals with a broad range of audiograms (Shanks et al., 2002; Studebaker et al., 1999), suggesting that RO is a common phenomenon. To account for this, the Speech Intelligibility Index (ANSI, 1997), which is based on the Articulation Index, includes a so-called level-distortion factor to predict performance decreases at high presentation levels.

Recently, Fereczkowski and Neher (2023a) measured speech recognition in quiet and in noise for 37 older experienced HA users with moderate-to-severe hearing losses. To simulate aided listening, they applied individual frequency-shaping based on the “National Acoustic Laboratories-Revised Profound” (NAL-RP; Dillon, 2012) prescription rule. To assess speech recognition in quiet, they presented monosyllabic word lists at above-conversational levels and found statistically significant RO for 35 out of 74 test ears. The median presentation level above which they found RO was 81 dB SPL. Furthermore, they found RO presence to be associated with poorer speech-in-noise performance measured 10 dB above the individual most comfortable speech level. The median of these presentation levels was 78 dB SPL (i.e., a level close to 81 dB SPL where speech recognition in quiet was best). Overall, these findings confirm that RO is common in HI listeners. Furthermore, they indicate that RO is related to poorer speech recognition in noise at moderate presentation levels.

Other research has shown that many HI listeners exhibit RO in quiet and in noise, particularly when HA amplification is used. In a study by Shanks et al. (2002), three amplification strategies implemented in a commercial HA were tested: linear amplification with peak clipping, linear amplification with compression limiting, and wide-dynamic-range compression (WDRC). RO was observed for most listeners at or near conversational levels (62–74 dB SPL), and none of the tested settings led to reduced RO. These results indicate that clinically available WDRC settings (as used by Shanks et al., 2002) may be inadequate for preventing RO with HAs, since they provide too much gain at high levels. In other words, the rate of gain reduction with increasing level (i.e., the compression ratios; CRs) may be too low. To prevent RO, gain needs to be clearly reduced at high input levels, that is, high CRs are necessary. However, high CRs combined with short attack and release times (ATs and RTs) can cause signal distortion (Verschuure et al., 1996).

In view of the above, we hypothesized that for preventing RO at above-conversational speech levels, amplitude compression with a high CR (i.e., 5:1) and long AT and RT would be appropriate. More specifically, we reasoned that by placing speech information in the region of a given individual's performance-intensity function where both intelligibility and listening comfort are high—the so-called “sweet spot” area—aided outcome would be improved. Due to the use of long time-constants, this type of amplification—henceforth referred to as “Sweet-Spot Slow”—can be considered locally linear, thus minimizing undesirable signal distortions. To evaluate our hypothesis, we designed a proof-of-concept study, in which we compared Sweet-Spot Slow to a clinically established reference setting based on the “National Acoustic Laboratories-Non-Linear 1” (NAL-NL1; Dillon, 2012) fitting rule with a short AT and a long RT. Our motivation for this was that in clinical solutions short ATs and long RTs are typically used (Jenstad & Souza, 2005; Kowalewski et al., 2018). In addition, we included a sweet-spot setting with a short AT and a long RT, henceforth referred to as “Sweet-Spot Fast.” Based on these considerations, we evaluated the effect of a CR of 5:1 coupled with long or short ATs in different test conditions.

Since clinical fitting rules prescribe CRs lower than 5:1 (e.g., Dillon, 2012), we expected the NAL-NL1 setting to be less effective in terms of preventing RO and thus to give poorer results at above-conversational levels than Sweet-Spot Slow. Due to the locally linear characteristics of Sweet-Spot Slow, we also hypothesized that at lower presentation levels this setting would lead to performance on a par with, or better than, the NAL-NL1 setting. Lastly, we expected Sweet-Spot Slow to give better results than Sweet-Spot Fast because of the signal distortions caused by the short AT. We evaluated the three compression settings using 16 listeners who had previously shown clear RO (Fereczkowski & Neher, 2023a). Our evaluation included SI measurements at multiple levels in quiet and in noise as well as pairwise preference judgments.

Methods

The current study was conducted as part of the Danish “Better hEAring Rehabilitation” (BEAR) project. The BEAR project was evaluated by the Regional Committees on Health Research Ethics for Southern Denmark, which decided that full ethical approval was unnecessary (i.e., a waiver was given; case no. S-20162000-64). All participants signed an informed consent form and were paid for their time (120 Danish crowns/hour).

Participants

Sixteen adults with a mean age of 75.1 years (standard deviation, SD = 6.2 years) participated. They were recruited using a large clinical database built up as part of the BEAR project. All participants were bilateral HA users with at least four years of HA-use experience. Their own HAs were non-linear, multi-channel devices dispensed in 2017 and fitted according to standard clinical procedures. The participants were chosen based on results from an earlier study where they had shown statistically significant RO in their aided word recognition scores in quiet (Fereczkowski & Neher, 2023a). In the current study, they were tested on one side only. For each participant, the ear (left or right) with more RO was selected. For the 16 test ears, the average RO magnitude observed previously was 28% (SD = 12%, effect size = 2.40; Fereczkowski & Neher, 2023a). The mean pure-tone average hearing loss (PTA) across 500, 1000, 2000, and 4000 Hz was 54.1 dB HL (SD = 9.5 dB HL). The 5th and 95th percentiles of the PTA distribution were 41.6 and 74.0 dB HL, respectively. The mean interaural asymmetry, measured in terms of the absolute PTA difference across left and right ears, was 4.4 dB (SD = 3.1 dB), and the maximum difference was 10 dB. Figure 1 summarizes the audiometric and age data.

Figure 1. (a) Audiograms of the 16 test ears. The thick line shows the mean audiogram and the error bars show ±1 SD. (b) Boxplot of the PTA data for the 16 test ears and individual datapoints. (c) Boxplot of the differences between the air- and bone-conduction thresholds averaged across 500, 1000, and 2000 Hz and individual datapoints. (d) Age data of the participants. In all boxplots, the edges show the 25th and 75th percentiles and the whiskers show the range of the data. Datapoints located farther than 1.5 times the interquartile range away from the edges are shown by “+” symbols.

Physical Test Setup

Testing was performed in a soundproof booth. The audiometry measurements were performed using an Interacoustics (Middelfart, Denmark) Affinity 2.0 system together with RadioEar (Middelfart, Denmark) DD45 headphones. All other measurements were performed using custom-made MATLAB scripts executed on a Windows PC. Audio playback was via an RME Fireface UC soundcard, an SPL Phonitor Mini Amplifier, and a pair of RadioEar DD45 headphones. All sound files had a sampling frequency of 48,000 Hz and a resolution of 16 bits. For all measurements, signal equalization was applied that resulted in a flat frequency response at eardrum level over the frequency range of the stimuli used for the current study.

HA Simulator

All stimulus processing was carried out with a HA simulator implemented in MATLAB (Hau & Andersen, 2012; Sanchez Lopez et al., 2018). The compressor consisted of a 15-band filterbank (0.1–10 kHz), a percentile level estimator, and a non-linear amplifier. Third-octave spacing was used for the 11 mid-frequency bands and half-octave spacing for the two upper and lower bands. Within each frequency band, the signal envelope was estimated based on the lowpass-filtered squared signal (intensity envelope), transformed to the logarithmic domain, and passed through the percentile level estimator. Compression setting-specific time constants were used for the level estimation process, so the percentile level estimator effectively controlled the time constants of the compressor. Since Sweet-Spot Fast and NAL-NL1 shared the same time constants, the same level estimator was used for these settings, whereas Sweet-Spot Slow required a different level estimator (with a long AT). The estimated levels were used to calculate the required gain in the compressor input–output function, which was set individually for each of the three compression settings (see next sub-section). The amplifier's gain function was a broken-stick nonlinearity with a single kneepoint at 65-dB SPL speech level. The corresponding gain was selected to match the target gain at this level. The upper and lower slopes of the function were calculated to match the target gains for soft (55 dB SPL) and loud (80 dB SPL) input signals. The compression threshold was set to 0-dB input level. The compressed output signal was obtained by adding the 15 sub-band signals (Hau & Andersen, 2012).
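The broken-stick gain function described above can be illustrated with a short sketch. It is not the simulator's MATLAB code: the function names and the example target gains are hypothetical, and only the structure (a single kneepoint at 65 dB SPL, with slopes fixed by the targets at 55 and 80 dB SPL) is taken from the text.

```python
def broken_stick_gain(level_db, g_soft, g_mid, g_loud,
                      l_soft=55.0, l_knee=65.0, l_loud=80.0):
    """Gain (dB) for an estimated band level (dB SPL).

    Two straight segments meet at the kneepoint (l_knee, g_mid). The lower
    segment passes through (l_soft, g_soft) and the upper segment through
    (l_loud, g_loud), so all three target gains are matched.
    """
    if level_db <= l_knee:
        slope = (g_mid - g_soft) / (l_knee - l_soft)
    else:
        slope = (g_loud - g_mid) / (l_loud - l_knee)
    return g_mid + slope * (level_db - l_knee)


def compression_ratio(gain_slope):
    """CR implied by a gain slope (dB gain per dB input).

    The output level changes by (1 + gain_slope) dB per dB of input,
    and the CR is the reciprocal of that output slope.
    """
    return 1.0 / (1.0 + gain_slope)


# Hypothetical target gains (dB) for one band at 55, 65, and 80 dB SPL.
targets = dict(g_soft=30.0, g_mid=25.0, g_loud=13.0)
gain_at_72 = broken_stick_gain(72.0, **targets)
upper_cr = compression_ratio((13.0 - 25.0) / (80.0 - 65.0))  # slope of -0.8 dB/dB
```

With these made-up targets, the upper segment has a gain slope of −0.8 dB/dB, which corresponds to a 5:1 compression ratio above the kneepoint.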

The compressor was calibrated using the International Speech Test Signal (ISTS; Holube et al., 2010) presented at 65 dB SPL. A calibration constant was determined for each compression setting such that the average level obtained from the percentile level estimator was 65 dB SPL. Since the NAL-NL1 and Sweet-Spot Fast settings shared the same (fast) estimator whereas the Sweet-Spot Slow setting had a different (slow) estimator, the calibration constants for the two percentile level estimators differed (without these constants, the slow estimator would have returned lower input-level estimates than the fast estimator). The constants were determined prior to all measurements and were fixed throughout.

While this calibration step ensured that the fast and slow level estimators returned the same average level estimates for the ISTS at 65 dB SPL, it did not guarantee that for other signals the average level estimates remained in agreement with each other. In fact, the average level estimates from the two percentile estimators did not agree for the different speech materials and test conditions used in the current study. Because our goal was to conduct a proof-of-concept study that investigated the effect of a high CR combined with different time constants, we decided to introduce an additional calibration step. The stimuli were scaled linearly to equalize the average level estimates from the slow and fast percentile estimators, for either individual sentences (for the SI measurements) or individual speech excerpts (for the preference measurements). In that way, we ensured that the implementation details of the level estimators did not influence the average levels “seen” by the different compression settings. Because of these calibration steps, the stimuli had to be pre-processed, that is, they were not generated in real-time.
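The linear scaling used in this additional calibration step amounts to converting a level mismatch in dB into an amplitude factor. The sketch below shows that arithmetic only. It uses a plain RMS level as a stand-in for the percentile level estimators, which are not specified in enough detail here to reproduce, and all function names are hypothetical.

```python
import math


def rms_level_db(samples, ref=1.0):
    """Level of a signal in dB re `ref`, computed from its mean square.

    Stand-in for a level estimator; NOT the percentile estimator
    used in the HA simulator.
    """
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square / ref ** 2)


def equalizing_scale(level_est_db, target_db):
    """Linear amplitude factor that shifts a level estimate to the target."""
    return 10.0 ** ((target_db - level_est_db) / 20.0)


def scale_signal(samples, factor):
    """Apply a linear (level-independent) scale factor to a signal."""
    return [factor * x for x in samples]


# Toy signal whose estimated level is about -20 dB; move it to -17 dB.
signal = [0.1, -0.1] * 100
factor = equalizing_scale(rms_level_db(signal), target_db=-17.0)
scaled = scale_signal(signal, factor)
```

Because the scaling is linear, it changes the level "seen" by a compressor without altering the temporal or spectral structure of the stimulus.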

Compression Settings

As mentioned earlier, the NAL-NL1 setting served as reference. The gains and CRs were therefore prescribed in accordance with this fitting rule. Additionally, the output was limited by a hard limiter set such that in each sub-band the maximum output of the compressor did not exceed the sum of the sub-band level of the ISTS presented at 100 dB SPL and the sub-band gain prescribed by NAL-NL1 for that input level. The compression time constants were set to 10 ms (attack) and 800 ms (release). Thus, the compressor was fast-acting when the presentation level increased and slow-acting when the presentation level decreased.
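To illustrate what a 10-ms attack combined with an 800-ms release implies, the following sketch smooths a sequence of instantaneous levels with separate time constants for rising and falling input. It is a generic one-pole attack/release smoother, assumed here for illustration only. The simulator itself used a percentile level estimator, and the definition of "time constant" (time to cover about 63% of a step) is also an assumption.

```python
import math


def smoothing_coeff(time_constant_s, fs_hz):
    """One-pole coefficient for exponential smoothing with the given time constant."""
    return math.exp(-1.0 / (time_constant_s * fs_hz))


def track_level(levels_db, fs_hz, attack_s=0.010, release_s=0.800):
    """Smooth instantaneous levels (dB) with separate attack and release.

    The attack coefficient is used when the input exceeds the current
    estimate (rising level); otherwise the release coefficient is used.
    """
    a_att = smoothing_coeff(attack_s, fs_hz)
    a_rel = smoothing_coeff(release_s, fs_hz)
    est = levels_db[0]
    out = []
    for lvl in levels_db:
        a = a_att if lvl > est else a_rel
        est = a * est + (1.0 - a) * lvl
        out.append(est)
    return out


# A 30-dB upward step, sampled at 1 kHz, observed for 10 ms after the step.
step = [50.0] * 5 + [80.0] * 10
fast = track_level(step, fs_hz=1000)                  # 10-ms attack
slow = track_level(step, fs_hz=1000, attack_s=0.800)  # 800-ms attack
```

Ten milliseconds after the step, the fast tracker has covered roughly 63% of the 30-dB jump, whereas the slow tracker has barely moved. This is the sense in which a compressor with long time constants behaves almost linearly over short time spans.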

The Sweet-Spot Slow setting differed from the reference setting in terms of the prescribed gains (see below), CRs, and time constants. The gain prescription comprised frequency shaping and overall gain adjustment. It was designed to be a compromise between the (linear) NAL-RP prescription rule used previously (Fereczkowski & Neher, 2023a) and the (non-linear) NAL-NL1 prescription. The aim was to use the knowledge from the previous study about optimum performance with linear amplification as a starting point (see below), and to provide good listening comfort using the NAL-NL1 prescription. In other words, the Sweet-Spot gain prescription was meant to provide high listening comfort and high SI. Sweet-Spot Slow was characterized by a CR of 5:1 and long attack and release times (both 800 ms).

The Sweet-Spot Fast setting was a mixture of the two other settings. While the prescribed gain and CR were the same as for the Sweet-Spot Slow setting, the time constants were the same as for the NAL-NL1 setting.

Sweet-Spot Gain Prescription

The aim of the two Sweet-Spot settings was to place speech in that area of a given individual's performance-intensity function where both intelligibility and listening comfort are high and, in this manner, to avoid RO. This was achieved in three steps.

In the first step, an input level, L₁, for the NAL-NL1 prescription was found such that it minimized the mean squared difference between the individually prescribed NAL-RP and the NAL-NL1 gains at 500, 1000, and 2000 Hz. The NAL-NL1 gains prescribed for L₁ were then adjusted (in a broadband sense) such that the gain prescribed at 500 Hz was equal to the NAL-RP gain. The resulting frequency shaping, g(f), was taken as the basis for the new prescription.
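The first step can be sketched as a grid search over input levels. The NAL-NL1 rule itself is not reproduced here: `toy_nl1` is a made-up stand-in with an arbitrary level dependence, the NAL-RP gains are invented numbers, and the 1-dB search grid is an assumption.

```python
def find_l1(nal_rp_gains, nal_nl1_gains_at, levels=range(40, 101)):
    """Input level (dB SPL) whose NAL-NL1 gains are closest, in the
    mean-squared sense, to the NAL-RP gains at 500, 1000, and 2000 Hz."""
    freqs = (500, 1000, 2000)

    def mse(level):
        nl1 = nal_nl1_gains_at(level)
        return sum((nl1[f] - nal_rp_gains[f]) ** 2 for f in freqs) / len(freqs)

    return min(levels, key=mse)


def frequency_shaping(nal_rp_gains, nal_nl1_gains_at, l1):
    """Shift the NAL-NL1 gains at L1 by one broadband offset so that the
    500-Hz gain equals the NAL-RP gain; the result plays the role of g(f)."""
    nl1 = nal_nl1_gains_at(l1)
    offset = nal_rp_gains[500] - nl1[500]
    return {f: g + offset for f, g in nl1.items()}


def toy_nl1(level):
    """Toy stand-in for a NAL-NL1 implementation (NOT the actual rule):
    gains fall by 0.5 dB per dB of input level above 50 dB SPL."""
    base = {250: 10.0, 500: 20.0, 1000: 28.0, 2000: 32.0, 4000: 35.0}
    return {f: g - 0.5 * (level - 50) for f, g in base.items()}


toy_rp = {500: 12.0, 1000: 21.0, 2000: 24.0}  # made-up NAL-RP gains (dB)
l1 = find_l1(toy_rp, toy_nl1)
g_f = frequency_shaping(toy_rp, toy_nl1, l1)
```

With these toy inputs the search settles on 65 dB SPL, and the broadband offset of −0.5 dB brings the 500-Hz gain to the NAL-RP value of 12 dB.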

In the next step, an input level, L₂, was determined at which g(f) was applied, thus defining the overall gain. L₂ was the lower of two options: 80 dB SPL or the most comfortable level (MCL) determined for running speech presented with NAL-RP amplification (see Test Procedures). As a result of this and the previous step, NAL-NL1-based frequency shaping, g(f), was applied at L₂. Therefore, it was expected that at the L₂ input level the Sweet-Spot prescription would provide near-maximum listening comfort, at least in terms of the overall presentation level.

In the last step, the CRs were set. For simplicity, a 5:1 ratio was applied over the range 500–4000 Hz. This ensured that the speech was presented at a level relatively close to the MCL and that its level never exceeded MCL + 10 dB. This value was chosen because MCL + 10 dB was the input level at which 13 out of the 16 participants tested here previously had shown maximum performance, and above which RO had occurred (Fereczkowski & Neher, 2023a). For the frequencies outside the 500–4000 Hz range, the CRs were identical to those from the NAL-NL1 prescription (at 65-dB SPL input).
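The second and third steps reduce to simple arithmetic, sketched below for an idealized static compressor (time-constant effects are ignored, which is an assumption). The function names are hypothetical.

```python
def reference_level(mcl_db, cap_db=80.0):
    """L2: the lower of the most comfortable level and 80 dB SPL."""
    return min(mcl_db, cap_db)


def output_level_change(input_db, l2_db, cr=5.0):
    """Change in output level (dB) relative to the output at L2.

    With a compression ratio of `cr`, every `cr` dB of input change
    produces 1 dB of output change.
    """
    return (input_db - l2_db) / cr


def gain_change(input_db, l2_db, cr=5.0):
    """Change in gain (dB) relative to the gain applied at L2."""
    return output_level_change(input_db, l2_db, cr) - (input_db - l2_db)


l2 = reference_level(mcl_db=75.0)          # MCL below the 80-dB cap
rise_at_90 = output_level_change(90.0, l2)  # input 15 dB above L2
gain_at_90 = gain_change(90.0, l2)
gain_at_55 = gain_change(55.0, l2)
```

In this example, an input 15 dB above L₂ raises the output by only 3 dB (a 12-dB gain reduction), while an input 20 dB below L₂ receives 16 dB more gain than at L₂. This is the static behavior that keeps the output close to the level associated with high comfort.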

Figure 2 shows the distributions of insertion gains prescribed by NAL-NL1 (left panel) and the Sweet-Spot approach (middle panel). In each panel, the gain curves represent mean data for three ISTS input levels (55, 65, and 80 dB SPL) with error bars indicating ±1 standard error. While the prescribed gains for the 65-dB SPL input level were comparable, marked differences occurred at the two other input levels, particularly between 500 and 4000 Hz. The greater vertical spread between the gain curves in the middle panel relative to the left panel was due to the higher CR used for the Sweet-Spot compression.¹ Specifically, the high CR resulted in Sweet-Spot prescribing less gain than NAL-NL1 at 80 dB SPL and more gain at 55 dB SPL. The right panel shows the difference between individually prescribed NAL-NL1 and Sweet-Spot gains.

Test Procedures

All measurements were performed in a single 2-h session. Audiometry was carried out first. Next, MCL measurements were performed in quiet with running speech from the Dantale-I material (Elberling et al., 1989) using the Acceptable Noise Level (Nabelek et al., 2006) procedure implemented in the Interacoustics (Middelfart, Denmark) Affinity 2.0 system. To simulate HA amplification, the stimuli were pre-processed using a finite impulse response filter that provided NAL-RP amplification, as done previously (Fereczkowski & Neher, 2023a). To improve accuracy, the MCL was measured three times for each participant. The median of the three MCL values was taken as the final estimate. In the next step, the compressor was prepared and calibrated (see Sweet-Spot Gain Prescription). To investigate the effect of the three compression settings on RO, SI measurements in quiet were carried out at two levels: conversational (i.e., 65 dB SPL) and above-conversational (i.e., the maximum tolerable level for a given individual, but not higher than 90 dB SPL). SI was also measured in noise, with the target speech presented at an input level of 70 dB SPL (see Speech Intelligibility in Noise). For the sake of completeness and a better understanding of the differences between the three compression settings, the SI measurements were complemented with preference judgments carried out in quiet and in noise.

The SI and preference data were collected using a range of speech materials. While monosyllabic words (and linear amplification) were used by Fereczkowski and Neher (2023a), sentence-based materials were considered more suitable for evaluating the compression settings tested here. Further, since semantic context information can mask intelligibility declines at above-conversational presentation levels (Fereczkowski & Neher, 2023b), context-free sentences from the Danish DAT corpus (Nielsen et al., 2014) were used for assessing SI in quiet. For assessing SI in noise, everyday sentences from the Danish version of the hearing-in-noise test (HINT; Nielsen & Dau, 2011) were used. Finally, running speech was used for the preference judgments. The following sections describe the SI and preference measurements in more detail.

Speech Intelligibility in Quiet

For the measurements in quiet, the test lists from the DAT corpus were used. Each list comprises 20 sentences with a fixed structure and two unique, unrelated keywords (e.g., “Dagmar tænkte på en teske og en næse i går”—“Dagmar thought about a teaspoon and a nose yesterday”). The DAT corpus comprises recordings from three different talkers (labeled D, A, and T). In the current study, talkers D and A were used.

The DAT measurements were performed in quiet at 65 dB SPL (conversational) and at an individually determined above-conversational level. The procedure consisted of several steps. First, to familiarize the participants with the test procedure, a training run was carried out with stimuli pre-processed with individual NAL-RP amplification, as for the aided MCL measurements. Second, the same list was used to determine the above-conversational level that was still acceptable. Pairs of sentences from the training list were presented at 80-dB SPL input level. If the level was not uncomfortable, it was increased in 5-dB steps up to the maximum input level of 90 dB SPL. If the participant did not complain about listening discomfort, 90 dB SPL was used as the above-conversational level. This level was chosen based on the results of Fereczkowski and Neher (2023a), where the median presentation level above which RO was present was found to be 81 dB SPL. If the participant reported listening discomfort at any level, the presentation level was lowered by 5 dB and the resulting level was used.

Once the above-conversational level was determined, the actual measurements started. Each compression setting was tested at the two levels (conversational and above-conversational), resulting in six conditions overall. The order of the conditions was randomized, with the constraint that the three high-level conditions could not occur in direct succession. Three D-lists (D1-3) and three A-lists (A1-3) were used. They were randomly assigned to the six test conditions. No test list was repeated. Since each test list had 40 keywords, a score between 0% and 100% (with a resolution of 2.5%) was obtained for each condition.

Speech Intelligibility in Noise

For the measurements in noise, the test lists from the Danish HINT were used. Each of these lists comprises 20 unique everyday sentences uttered by a single female talker. HINT measurements were performed with the target speech at an input level of 70 dB SPL. That level was chosen to be higher than for the measurements performed in quiet, since speech levels are typically higher in the presence of background noise (Christensen et al., 2021; Sørensen et al., 2021). Stationary speech-shaped noise was used. The initial noise level was 60 dB SPL (i.e., 10 dB signal-to-noise ratio; SNR). For the adaptive level adjustments, a 1-up 1-down rule with 4-dB (first four sentences) and 2-dB (remaining sentences) step sizes was used, as done by Nielsen and Dau (2011). To obtain the final SNR corresponding to 50%-correct sentence recognition, the SNRs from the trials with the 2-dB step size were averaged.
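The adaptive rule can be sketched as follows. The sketch assumes that the step applied after each of the first four sentences is 4 dB, and that "the trials with the 2-dB step size" are trials 5 to 20. Both are interpretations of the description above rather than confirmed implementation details, and the function name is hypothetical.

```python
def run_adaptive_track(responses, speech_db=70.0, start_noise_db=60.0):
    """Simulate a 1-up 1-down SNR track.

    responses: one boolean per sentence, True if the sentence was
    repeated correctly. At least five responses are required.
    Returns (SNRs presented on each trial, SRT estimate in dB SNR).
    """
    snr = speech_db - start_noise_db  # 10 dB SNR on the first trial
    presented = []
    for i, correct in enumerate(responses):
        presented.append(snr)
        step = 4.0 if i < 4 else 2.0
        # Harder (lower SNR) after a correct response,
        # easier (higher SNR) after an incorrect one.
        snr += -step if correct else step
    small_step_snrs = presented[4:]
    srt = sum(small_step_snrs) / len(small_step_snrs)
    return presented, srt


# Made-up response pattern: four correct, then alternating wrong/correct.
responses = [True] * 4 + [False, True] * 8
presented, srt = run_adaptive_track(responses)
```

For this invented response pattern the track descends from 10 to −6 dB SNR and then alternates between −6 and −4 dB SNR, giving an SRT estimate of −5 dB SNR.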

The HINT measurements were performed after the DAT measurements. One training list was administered to accustom the participants to the test procedure. This was followed by six randomly selected test lists, one per test condition.

Preference Judgments

For the preference judgments, two stories uttered by two different female Danish talkers taken from the DANTALE-I corpus (Elberling et al., 1989) and the Archimedes project (Hansen & Munch, 1991) were used. Preference was investigated by means of a paired-comparison task. Two scenarios were tested: in quiet at 55-dB SPL input level, and in noise with a speech input level of 75 dB SPL and a noise input level of 70 dB SPL (i.e., 5 dB SNR). Stationary speech-shaped noise was used. In each case, the participants were asked to indicate their preference in terms of listening comfort and, in a separate round of measurements, perceived speech quality. Thus, four conditions were tested: comfort in quiet, comfort in noise, speech quality in quiet, and speech quality in noise.

Three speech passages were used for each condition and participant. Each passage was a 4-s long excerpt randomly selected from one of the two running-speech materials and processed with each of the three compression settings. The resulting sound samples were presented in a series of six trials. On a given trial, two of the three settings were selected and played back in a loop. Using a graphical user interface and a touchscreen, the participants controlled the playback of the two sound samples. They indicated their preferred setting by pressing one of two buttons labeled “A” and “B.” All three possible stimulus pairs were compared. To avoid order effects, each pair (e.g., A = slow and B = fast) was also tested in reversed order (A = fast and B = slow). Thus, six trials were tested for each speech passage, and three separate passages were tested for each condition, leading to 18 trials for each condition. Since 16 participants were tested, a total of 16 × 18 = 288 comparisons were made per condition (96 per compression-setting pair).
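The trial counts follow directly from the design, as the sketch below shows. Shuffling the six trials within each passage is an assumption; the text does not state how the trial order was chosen.

```python
import itertools
import random

SETTINGS = ("NAL-NL1", "Sweet-Spot Fast", "Sweet-Spot Slow")


def build_trials(n_passages=3, seed=None):
    """List of (passage index, setting on button A, setting on button B)."""
    rng = random.Random(seed)
    trials = []
    for passage in range(n_passages):
        # Ordered pairs: every pair of settings appears in both A/B orders.
        block = [(passage, a, b) for a, b in itertools.permutations(SETTINGS, 2)]
        rng.shuffle(block)  # assumed randomization within a passage
        trials.extend(block)
    return trials


trials = build_trials(seed=0)
n_participants = 16
total_per_condition = n_participants * len(trials)
per_pair = n_participants * sum(
    1 for _, a, b in trials if {a, b} == {"NAL-NL1", "Sweet-Spot Slow"}
)
```

Each participant completes 18 trials per condition, which gives 288 comparisons per condition across the 16 participants and 96 for each pair of settings.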

Data Analysis

All data analyses were performed in MATLAB R2018b (MathWorks, Natick, US). Because some datasets were not normally distributed, non-parametric tests were used throughout. For comparing the scores obtained with the three compression settings, Friedman's test was used together with the Tukey-Kramer test for post-hoc testing. For testing for RO with each compression setting, the Wilcoxon signed-rank test was used. Since many of the DAT scores approached the maximum possible value of 100%, they were rational arcsine unit (RAU) transformed (Studebaker, 1985) prior to the statistical analyses. For correlation analyses, Spearman's correlation coefficient, r_S, was calculated. Bonferroni correction was used to control for the family-wise error rate.
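For reference, the RAU transform can be written in a few lines. The sketch below follows the commonly cited form of Studebaker's (1985) formula for a score of x correct out of n items. It is given as an illustration, not as the analysis code used in the study.

```python
import math


def rau(n_correct, n_items):
    """Rationalized arcsine units for n_correct out of n_items.

    theta = asin(sqrt(x / (n + 1))) + asin(sqrt((x + 1) / (n + 1)))
    RAU   = (146 / pi) * theta - 23
    """
    theta = (math.asin(math.sqrt(n_correct / (n_items + 1)))
             + math.asin(math.sqrt((n_correct + 1) / (n_items + 1))))
    return (146.0 / math.pi) * theta - 23.0


# Scores for lists with 40 keywords.
rau_half = rau(20, 40)   # 50% correct
rau_top = rau(40, 40)    # 100% correct
rau_zero = rau(0, 40)    # 0% correct
```

The transform leaves mid-range scores nearly unchanged (50% correct maps to 50 RAU) and stretches the scale near floor and ceiling, so a perfect score on a 40-keyword list maps to a value above 100 RAU.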

Because the preference judgments for listening comfort and speech quality were very similar, these data were pooled. Thus, for each pair of compression settings and each sound scenario (quiet, noise), 2 conditions × 96 pairwise comparisons = 192 pairwise comparisons were analyzed. The Bradley–Terry–Luce (BTL; Bradley & Terry, 1952) choice model, which is widely used for such data (e.g., Bramsløw, 2010; Lelic et al., 2022), was applied. The BTL scores and the corresponding confidence intervals (CIs) were estimated using the OptiPt function implemented in MATLAB by Wickelmaier and Schmid (2004).
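To make the BTL model concrete, the sketch below estimates BTL scores from a matrix of pairwise win counts using the classic iterative (Zermelo/MM) update. This is a generic fitting routine and not the OptiPt function; it does not compute CIs, and it assumes that every option wins at least once. The win counts are invented for illustration.

```python
def fit_btl(wins, n_iter=1000, tol=1e-12):
    """Maximum-likelihood BTL scores, normalized to sum to 1.

    wins[i][j] is the number of times option i was preferred over option j.
    Under the BTL model, P(i preferred over j) = p[i] / (p[i] + p[j]).
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new_p.append(total_wins / denom)
        norm = sum(new_p)
        new_p = [x / norm for x in new_p]
        if max(abs(a - b) for a, b in zip(p, new_p)) < tol:
            return new_p
        p = new_p
    return p


# Made-up win counts for three options (rows beat columns).
wins = [
    [0, 60, 60],
    [30, 0, 30],
    [10, 10, 0],
]
scores = fit_btl(wins)
```

These counts were constructed to be exactly consistent with BTL scores of 0.6, 0.3, and 0.1, which the routine recovers. The normalized scores sum to 1 and can be read as relative choice strengths.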

Results

Speech Intelligibility in Quiet

Figure 3 shows boxplots of the DAT scores obtained with the three compression settings. For each setting, the left hand-side boxplot shows the results obtained at the conversational level, and the right hand-side boxplot shows the results obtained at the above-conversational level. The median scores obtained at the conversational level were 92.5%-correct for NAL-NL1 and 90.0%-correct for the two Sweet-Spot settings. The median scores obtained at the above-conversational level were 88.8%-correct for the two compression settings with the fast attack time and 91.3%-correct for Sweet-Spot Slow.

A Friedman's test showed no significant differences between the scores for the three compression settings at the lower level (χ²(2) = 0.76, p = .68). At the higher level (i.e., 90 dB SPL for five listeners, 85 dB SPL for nine listeners, and 80 and 75 dB SPL for the other two listeners), the scores were significantly different (χ²(2) = 8.46, p = .015). Tukey-Kramer post-hoc tests showed that the significant difference was due to the NAL-NL1 scores being lower (poorer) than the Sweet-Spot Slow scores (p = .013; difference between median scores: 2.5%). No other pairwise differences were significant.

The fact that, at the lower level, the three compression settings gave comparable results whereas, at the higher level, NAL-NL1 gave poorer results than Sweet-Spot Slow supports the idea that Sweet-Spot Slow could better reduce RO than NAL-NL1. To investigate this possibility, a one-sided Wilcoxon signed-rank test was used to check, for each compression setting, whether the lower-level scores were significantly higher than the higher-level scores. This was the case for the NAL-NL1 scores (p = .006 after Bonferroni correction), but not for the two other sets of scores (both p > .1). Note that while the present analysis is based on medians, mean RO measured with the NAL-NL1 setting was 4.5% (SD = 5.8%, effect size = 0.76). Corresponding values were −2.2% (SD = 5.8%, effect size = −0.38) for Sweet-Spot Slow and 0.2% (SD = 7.7%, effect size = 0.03) for Sweet-Spot Fast.

The left panel of Figure 4 shows a scatterplot of the individual DAT scores obtained at the lower (horizontal axis) and higher (vertical axis) levels. The dashed line is the identity line, and the solid lines show 95% CIs derived from the binomial distribution of the number of successes per 40 trials. The red “x” symbols mark the Sweet-Spot Fast scores, the green crosses mark the NAL-NL1 scores, and the blue circles mark the Sweet-Spot Slow scores. The right panel of Figure 4 shows a scatterplot of the corresponding RAU scores, offering better visualization of the data close to ceiling performance.
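One way to obtain such binomial limits is sketched below: for an assumed true proportion correct, it finds the central range of keyword counts out of 40 that would be observed about 95% of the time. Whether the CIs in Figure 4 were constructed in exactly this way is not stated, so this is an assumption, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def central_interval(p, n=40, coverage=0.95):
    """Central range of counts under Binomial(n, p).

    Returns (lo, hi) such that the probability below lo and the
    probability above hi are each at most (1 - coverage) / 2.
    """
    alpha = (1.0 - coverage) / 2.0
    cdf, lo = 0.0, 0
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        if cdf > alpha:
            lo = k
            break
    upper_tail, hi = 0.0, n
    for k in range(n, -1, -1):
        upper_tail += binom_pmf(k, n, p)
        if upper_tail > alpha:
            hi = k
            break
    return lo, hi


lo_half, hi_half = central_interval(0.5)   # chance-level example
lo_high, hi_high = central_interval(0.9)   # near-ceiling example
```

Near ceiling the interval becomes narrow and asymmetric, which is one reason why small score differences at high performance levels are easier to judge on the RAU scale.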

The higher-level NAL-NL1 scores were mostly lower than the lower-level NAL-NL1 scores, with the corresponding symbols falling below the identity line and differences of up to 15% in magnitude. This is consistent with the group-level effect reported above. At the individual level, the differences were significant or close to significant in several cases (see the 95% CIs), consistent with significant RO. By contrast, the Sweet-Spot Fast and Sweet-Spot Slow scores were more evenly spread around the identity line. The Sweet-Spot Slow scores showed very few RO cases, all with less than 10% performance degradation.

Speech Intelligibility in Noise

Figure 5 shows boxplots of the HINT scores for the three compression settings. The median SRTs were 6.5 dB SNR for NAL-NL1, 6.0 dB SNR for Sweet-Spot Fast, and 5.0 dB SNR for Sweet-Spot Slow. A Friedman's test showed a significant effect of compression setting (χ²(2) = 7.62, p = .022). Tukey-Kramer post-hoc tests showed that the significant difference was due to the NAL-NL1 scores being higher (i.e., poorer) than the Sweet-Spot Slow scores (p = .036). Additionally, there was a trend towards the Sweet-Spot Slow scores being lower (better) than the Sweet-Spot Fast scores (p = .056). A limiting factor is the relatively large variability in the HINT scores, which is particularly prominent in the case of Sweet-Spot Fast and, to a lesser extent, Sweet-Spot Slow. In fact, while the HINT scores obtained with the Sweet-Spot Slow setting were better than those obtained with Sweet-Spot Fast in 13 out of 16 cases, the interquartile range of the differences between the two settings was 2.8 dB, while the median difference was just 1.4 dB.

Since the NAL-NL1 and Sweet-Spot Slow settings gave significantly different HINT scores, differences between these scores were analyzed. Because the speech level was fixed at 70 dB SPL, we focused on the gains prescribed for a 70-dB SPL input level. We computed the differences in gain provided by the two compression settings. By design, the CRs of the two compressors differed most substantially between 500 and 4000 Hz (see Sweet-Spot Gain Prescription). Therefore, the gain differences were summarized by calculating the average across these frequencies (Δ₇₀ = G₇₀(Sweet-Spot Slow) − G₇₀(NAL-NL1)). These gain differences were negative for most participants, meaning that the gain prescribed by Sweet-Spot Slow was lower than the gain prescribed by NAL-NL1. The average of Δ₇₀ across the participants was −3.2 dB (SD = 4.2 dB).² The right panel in Figure 5 shows a scatterplot of Δ₇₀ (horizontal axis) versus the difference in HINT scores (ΔHINT SNR) obtained with the two compression settings (e.g., ΔHINT SNR = −5 dB means that Sweet-Spot Slow gave a 5 dB better score than NAL-NL1). The two datasets were strongly correlated with each other (r_S = 0.81, p < .001). In other words, when compared to NAL-NL1, the lower the gain provided by Sweet-Spot Slow, the better the resultant HINT score. Conversely, when the average gains prescribed by Sweet-Spot Slow were similar to or higher than those prescribed by NAL-NL1 (i.e., Δ₇₀ ≥ 0 dB), the HINT scores obtained with NAL-NL1 were better than those obtained with Sweet-Spot Slow.
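The two steps of this analysis, averaging the gain difference across frequency and rank-correlating it with the HINT difference, can be sketched as follows. The function names are hypothetical, the gain values in the example are made up for illustration and are not study data, and the rank correlation is a plain Spearman coefficient with average ranks for ties.

```python
import math

def mean_gain_difference(gains_a_db, gains_b_db):
    """Average of (gain_a - gain_b) in dB across matching frequencies."""
    diffs = [a - b for a, b in zip(gains_a_db, gains_b_db)]
    return sum(diffs) / len(diffs)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical gains (dB) for one listener at a 70-dB SPL input level.
sweet_spot_slow = [10.0, 12.0, 14.0]
nal_nl1 = [13.0, 15.0, 17.0]
print(mean_gain_difference(sweet_spot_slow, nal_nl1))  # -3.0
```

With one gain difference and one HINT difference per participant, `spearman(delta_70, delta_hint)` would give the kind of coefficient reported above.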

Preference in Quiet and in Noise

The left panel of Figure 6 shows the mean BTL scores for the three compression settings together with 95% CIs. The scores can be interpreted as the probability of choosing a given setting over the other settings, and they sum to 1 for each of the two test scenarios (quiet, noise). The plot shows a clear dependence on the test scenario. When speech was presented at 75 dB SPL in the presence of noise, the participants showed a clear preference for Sweet-Spot Slow (probability of 0.8). The probabilities of choosing NAL-NL1 (0.09) or Sweet-Spot Fast (0.11) were significantly lower, and their 95% CIs were very clearly separated from the one for Sweet-Spot Slow. When the speech was presented at 55 dB SPL in quiet, the listeners favored the NAL-NL1 setting (probability of 0.52) over Sweet-Spot Slow (0.17) and Sweet-Spot Fast (0.31).
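To make the interpretation of the BTL scores more concrete, the sketch below fits a Bradley-Terry-Luce model to a table of pairwise choice counts using a simple iterative update and normalizes the scores to sum to 1. This is a generic illustration and not the Matlab routine of Wickelmaier and Schmid (2004) that is cited for the analysis; the counts are hypothetical, and no confidence intervals are computed.

```python
def fit_btl(wins, iterations=200):
    """Estimate BTL scores from pairwise counts.

    wins[i][j] is the number of times option i was chosen over option j.
    Returns scores that sum to 1. Under the model, the probability of
    choosing i over j is scores[i] / (scores[i] + scores[j]).
    """
    k = len(wins)
    p = [1.0 / k] * k
    for _ in range(iterations):
        new_p = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(k) if j != i)
            new_p.append(total_wins / denom)
        total = sum(new_p)
        p = [v / total for v in new_p]
    return p

# Hypothetical counts for three settings (not study data):
# option 0 is chosen in 9 of 10 comparisons against each alternative.
counts = [[0, 9, 9],
          [1, 0, 5],
          [1, 5, 0]]
scores = fit_btl(counts)
print([round(s, 2) for s in scores])
```

The normalization to a sum of 1 is what allows the scores of the three settings to be compared directly within each test scenario.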

The right panel of Figure 6 shows a scatterplot of the differences in gain prescribed by Sweet-Spot Slow and NAL-NL1 for a 75-dB SPL input level (Δ₇₅; horizontal axis) plotted against the percentage of cases where Sweet-Spot Slow was preferred over NAL-NL1 when speech was presented in noise. The preference for Sweet-Spot Slow increased with more negative values of Δ₇₅. That is, when compared to NAL-NL1, the lower the gain provided by Sweet-Spot Slow, the stronger the preference for it. When the gain prescribed by these two settings was similar (Δ₇₅ ≈ 0 dB), NAL-NL1 was likely to give better results than Sweet-Spot Slow (i.e., the probability of Sweet-Spot Slow being preferred over NAL-NL1 fell below 50%).

Discussion

The main purpose of the current study was to investigate whether a high CR coupled with long time constants (Sweet-Spot Slow) can improve aided outcome for HI listeners with RO at above-conversational speech levels. To that end, we compared Sweet-Spot Slow with NAL-NL1 amplification combined with a short AT and a long RT. Additionally, we tested a third setting (Sweet-Spot Fast), which combined the high CR from Sweet-Spot Slow with the time constants from the NAL-NL1 setting. We performed measurements at multiple presentation levels in quiet and in noise.

In quiet at 65-dB SPL input level, the three compression settings resulted in comparable SI. At higher input levels, Sweet-Spot Slow gave better SI than NAL-NL1. Additionally, RO was observed with the NAL-NL1 setting but not with the two Sweet-Spot settings. The observed RO was 4.5%, which is less than the 28% found in the previous study (the corresponding effect sizes were 0.78 and 2.40, respectively). Less RO in the current study could be due to at least two factors. First, in the previous study performance was assessed at and above the aided MCL + 10 dB. The aided MCL + 10 dB was the input level at which 13 out of the 16 participants tested here had shown maximum performance in that study. In the current study, lower-level performance was assessed at 65 dB SPL, which is significantly lower than the average aided MCL + 10 dB used previously (i.e., 78 dB SPL). This likely led to the measured DAT scores being suboptimal (i.e., lower than the maximally achievable scores) and thus less RO. Additionally, while in both studies linear amplification was used for determining the maximum level, non-linear amplification (with lower gain at higher levels) was used in the current study for the actual measurements, and this likely led to less RO at the high level. In background noise, when speech was presented at 70 dB SPL, Sweet-Spot Slow gave better SI than NAL-NL1. Further, Sweet-Spot Slow was clearly preferred over the two other settings when speech was presented in noise at 75-dB SPL. In contrast, NAL-NL1 was clearly preferred over the two Sweet-Spot settings in quiet at 55-dB SPL.

Sweet-Spot Slow versus NAL-NL1

Our results consistently showed an advantage of Sweet-Spot Slow over NAL-NL1 when the speech level was at least 70 dB SPL. This implies that for the participants tested here, the Sweet-Spot Slow setting was a more effective solution at these levels. The results also partially support our main hypothesis, as the Sweet-Spot Slow setting (with a long AT and a high CR) was seemingly able to prevent RO at above-conversational speech levels, while the NAL-NL1 setting (with a short AT and lower CRs) was not.

As mentioned in the Methods section, the test procedure was designed to ensure that the differences in processing were due to three main factors: CR, gain, and AT. Out of these factors, only gain was listener-specific. Scatterplots and correlation analyses (Figures 5 and 6) showed that the lower the gain prescribed by Sweet-Spot Slow (as compared to NAL-NL1), the better the HINT scores and the stronger the preference in noise. This suggests that reduced gain was beneficial at higher presentation levels. Conversely, for both the HINT scores and the preference measurements, the NAL-NL1 setting gave similar or better scores than Sweet-Spot Slow when the difference between the gains prescribed at input levels of at least 70 dB SPL was around 0 dB or higher (i.e., NAL-NL1 prescribed similar or lower gains). The fact that the Sweet-Spot Slow advantage was related to the gain difference can partially explain the larger variability of the HINT scores obtained with the Sweet-Spot Slow setting, when compared to the variability of the corresponding NAL-NL1 scores.

At the group level, the results obtained at lower presentation levels followed a similar dependence on the gain differences. At 65-dB SPL, the average of the gain differences between the two settings was −1.2 dB (with NAL-NL1 prescribing more gain than Sweet-Spot Slow), and SI in quiet was comparable for the two settings. At 55-dB SPL input, the average difference was 3.2 dB (with Sweet-Spot prescribing more gain), and NAL-NL1 was strongly preferred over Sweet-Spot Slow in quiet. In fact, the preference for NAL-NL1 over Sweet-Spot Slow was comparable in strength to the opposite pattern observed for 75-dB SPL speech in noise. Since two factors were varied (presentation level and noise presence) but only two conditions were tested, it is unclear whether the preference patterns were driven by level or the presence of noise. Taken together, however, the preference and SI data suggest that at lower presentation levels Sweet-Spot Slow is not an optimal fitting strategy.

Sweet-Spot Fast and the Combined Effect of Gain and AT

The results obtained at different presentation levels seem to suggest that whichever setting (Sweet-Spot Slow or NAL-NL1) prescribed less gain led to better outcome in terms of SI and preference. This suggests that prescribed gain is the key factor for aided outcome. However, the NAL-NL1 setting was consistently preferred over Sweet-Spot Slow when their gains were similar. The difference in AT (short for NAL-NL1, long for Sweet-Spot Slow) could be the reason for this. Consequently, one could hypothesize that a short AT combined with lower-than-NAL-NL1 gains at high levels is a good compromise between Sweet-Spot Slow and NAL-NL1. By design, Sweet-Spot Fast was such a compromise, as it used the same time constants as NAL-NL1 and the same gains as Sweet-Spot Slow. However, the results of the SI and preference measurements were mixed. On the one hand, Sweet-Spot Fast gave clearly poorer results than Sweet-Spot Slow in terms of preference for speech presented at 75 dB SPL in noise. On the other hand, Sweet-Spot Fast was preferred over Sweet-Spot Slow for speech presented at 55 dB SPL in quiet. Further, Sweet-Spot Fast resulted in more cases of RO, although no significant differences were found at the group level. In terms of the HINT scores, there was no significant difference between the two Sweet-Spot settings, but there was a trend towards the Sweet-Spot Slow scores being better than the Sweet-Spot Fast scores. The lack of a significant difference was likely related to the large variability of the HINT scores obtained with Sweet-Spot Fast. This variability could have been due to individual differences in sensitivity to the distortions resulting from combining a short AT with a high CR. Overall, Sweet-Spot Fast was not inferior to Sweet-Spot Slow, except for the preference measurements in noise. In fact, Sweet-Spot Fast ended up in-between the two other settings in all measurements.

Even though Sweet-Spot Fast can be considered a compromise in terms of parameter settings and achievable outcome, it was not the optimal solution in any of the scenarios tested here. This could be due to the signal distortions resulting from combining a short AT with a high CR. These considerations suggest that the good results achieved with Sweet-Spot Slow at higher levels were due not only to its lower-than-NAL-NL1 gains but rather to a combination of prescribed gain and long AT. Therefore, a hybrid approach that combines NAL-NL1 gains at lower levels with Sweet-Spot Slow gains at higher levels could be a way of ensuring optimum results across a wide range of levels for listeners with RO. Due to the suboptimal outcomes with Sweet-Spot Fast, a long AT would be the preferred choice. Moore and Sęk (2016) tested a compression setting with an AT of 50 ms and a RT of 3000 ms together with CRs of up to 10:1. That setting was slightly preferred over a setting with an AT of 10 ms, a RT of 100 ms, and a maximum CR of 3:1 when listening to music at 50-, 65-, and 80-dB SPL input levels. In the case of speech, preference for fast or slow compression was very individual. Further research is required to determine optimal ATs for amplitude compression with high CRs.
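The interplay of AT, RT, and CR discussed above can be illustrated with a generic feed-forward compressor. The sketch below is a textbook-style level tracker and not the HA simulator used in the study. The threshold, the 1-kHz rate of the level track, the parameter values, and the use of exponential time constants (rather than standardized attack- and release-time definitions) are all assumptions made for illustration.

```python
import math

def smoothing_coeff(time_ms, fs_hz):
    """One-pole smoothing coefficient for an exponential time constant."""
    return math.exp(-1.0 / (time_ms * 1e-3 * fs_hz))

def compressor_gain_db(level_db, fs_hz, attack_ms, release_ms, threshold_db, ratio):
    """Return the gain track (dB) for a sequence of input levels (dB).

    The level estimate follows rising levels with the attack constant and
    falling levels with the release constant. Above the threshold, the
    output level grows by 1/ratio dB per dB of estimated input level.
    """
    a_att = smoothing_coeff(attack_ms, fs_hz)
    a_rel = smoothing_coeff(release_ms, fs_hz)
    est = level_db[0]
    gains = []
    for x in level_db:
        a = a_att if x > est else a_rel
        est = a * est + (1.0 - a) * x
        over = max(0.0, est - threshold_db)
        gains.append(-over * (1.0 - 1.0 / ratio))
    return gains

# Hypothetical level track sampled at 1 kHz: 60 dB for 100 ms, then a step to 80 dB.
levels = [60.0] * 100 + [80.0] * 50
fast = compressor_gain_db(levels, 1000.0, attack_ms=5.0, release_ms=100.0,
                          threshold_db=60.0, ratio=5.0)
slow = compressor_gain_db(levels, 1000.0, attack_ms=500.0, release_ms=2000.0,
                          threshold_db=60.0, ratio=5.0)
print(f"gain 50 ms after the step: fast {fast[-1]:.1f} dB, slow {slow[-1]:.1f} dB")
```

In this toy example, the fast setting has almost reached its steady-state gain reduction of 16 dB (20 dB above threshold at 5:1) within 50 ms, whereas the slow setting has barely changed its gain. Over short time spans, the slow compressor therefore acts almost like a linear amplifier, which is one way to picture why short time constants combined with a high CR alter the signal envelope more strongly.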

Possible Relation to the NAL-NL2 Prescription Rule

Our findings are in partial agreement with some of the considerations behind the NAL-NL2 fitting rule (Dillon et al., 2011). Specifically, the developers of that rule noted that HA users with a severe or profound hearing loss prefer higher CRs when fitted with slow-acting compression (Keidser et al., 2007). Consequently, NAL-NL2 prescribes higher CRs for individuals with severe or profound hearing losses when fitted with slow-acting compression than when fitted with fast-acting compression. However, compression speed has no effect on the CRs prescribed by NAL-NL2 for HA users with milder hearing losses (Dillon et al., 2011). Our results suggest that, for the types of listeners tested here (who had moderate to severe hearing losses), the combination of a high CR with short time constants may not be optimal, and that longer time constants can be used with a rather high CR (5:1), at least at higher levels.

Potential Relation to Retro-Cochlear Deficits

As mentioned before, RO has been observed for listeners with both normal and elevated hearing thresholds (French & Steinberg, 1947; Studebaker et al., 1999). This implies that RO is not necessarily caused by a hearing dysfunction per se. In support of this, Zilany and Bruce (2007) related RO to the level-dependent properties of cochlear processing, including broadened cochlear tuning at high presentation levels. Nevertheless, retro-cochlear deficits could cause additional RO in the performance-intensity functions of individual listeners. For instance, the loss of connections between inner hair cells and auditory nerve fibers (i.e., cochlear synaptopathy) could lead to poorer speech recognition at higher presentation levels (Shehorn et al., 2020), that is, RO. Cochlear synaptopathy is typically more pronounced for high-threshold, low-spontaneous-rate (LSR) nerve fibers compared to low-threshold, high-spontaneous-rate fibers (Shehorn et al., 2020). Since LSR fibers are responsible for processing high-intensity input signals (Furman et al., 2013), synaptopathy may negatively affect the processing of high-level speech signals and thus cause RO. The exact mechanism is still a matter of debate (e.g., Carney, 2018; Shehorn et al., 2020). For instance, one could hypothesize that synaptopathy results in stochastic undersampling of the stimulus representation in the auditory nerve (Lopez-Poveda, 2014), which could lead to poorer speech processing and thus SI.

However, Carney (2018) argued against a direct role of LSR fibers in stimulus coding. Instead, she proposed that the fibers provide an input to the medial-olivocochlear system, which in turn modulates the cochlear gain to maintain optimal “sharpness” of the signal representation at higher stages of the auditory pathway. If this were the case, the effect of cochlear synaptopathy on speech perception at higher presentation levels would be indirect. Yet another hypothesis of an indirect influence of synaptopathy on speech perception was discussed by Shehorn et al. (2020). These authors argued that synaptopathy in LSR fibers may lead to a weaker middle-ear muscle reflex, which in turn can affect SI by reducing the attenuation of low-frequency input signals and thus increasing upward spread of masking. While Fereczkowski and Neher (2023b) showed that RO can be found in listeners with normal audiograms when tested with highpass-filtered stimuli (i.e., under conditions of little, if any, upward spread of masking), the putative link between cochlear synaptopathy and speech perception (including RO) is likely both direct and indirect. More research is necessary to better understand its nature.

In principle, one could consider Sweet-Spot Slow compression as a solution for listeners with retro-cochlear deficits. This could have several advantages. First, due to its locally linear behavior, Sweet-Spot Slow compression is characterized by low signal distortion. As mentioned above, the effect of non-functioning LSR fibers can be described in terms of stochastic undersampling of the neural representation of an input signal (Lopez-Poveda, 2014), which is equivalent to adding signal distortion. Thus, ensuring little distortion on the input side by means of Sweet-Spot Slow compression could be advantageous in the case of high-level signals. Reducing gain at high levels would also limit noise exposure, which is hypothesized to cause synaptopathy in humans (Shehorn et al., 2020). Reducing high sound pressure levels could compensate for such hearing deficits (Strelcyk, 2021).

Study Limitations

In the current study, all stimuli were generated using a HA simulator and presented over headphones, which limits the degree of realism and therefore also the generalizability of our results. However, given the proof-of-concept nature of our study, the setup used here allowed for better control than with real HAs and sound-field testing.

For the measurements, only stationary speech-shaped noise was used, which also limits the realism of our study. If modulated (e.g., babble) noise was used, the locally linear behavior of Sweet-Spot Slow would likely still provide some benefit over NAL-NL1, due to it introducing little signal distortion. Further, Shanks et al. (2002) used babble noise and observed RO in many listeners with different amplification schemes. We would therefore expect that Sweet-Spot Slow with its high CR and rather low gain at high presentation levels could reduce RO in such a scenario, too.

Additionally, Sweet-Spot gain prescription was based on a fixed CR of 5:1 between 500 and 4000 Hz. This resulted in some discontinuities in the gains prescribed near 500 Hz at 80-dB SPL input, where the gain was lower than at adjacent frequencies (Figure 2). Such gain settings are not representative of clinical solutions but were deemed acceptable here. Despite the discontinuities in the gain curves, Sweet-Spot Slow gave good results at higher presentation levels, which suggests that this issue was of minor importance. Another consequence of the high CR used here was the occurrence of negative gain values for some individuals. For real, as opposed to simulated, HAs, this would require a closed fitting.

Even though the participants tested here formed a relatively homogeneous RO group, the sample size was small. This, in turn, constrained the statistical analyses that were possible. Although multiple conditions were tested, the results indicated that additional ones could be of interest. For instance, no measurements were performed at lower input levels in the presence of background noise. Such measurements would benefit the interpretation of the preference data, where the effects of presentation level and noise presence could not be disentangled. Testing RO in noise, as done by Shanks et al. (2002), could also be of interest. Future research should address these issues.

Finally, the relatively small sample size limited the minimum detectable RO in our study. Since the RO observed here was relatively small (mean of 4.5% for NAL-NL1), we performed a sensitivity analysis using G*Power (Faul et al., 2009) to determine the minimum RO we could detect with our 16 participants. For a one-tailed matched-sample Wilcoxon signed-rank test with α = 0.05 and 1−β = 0.8, we found a minimum detectable effect size of 0.67. Assuming an SD of 5.8% (as found for the NAL-NL1 and Sweet-Spot Slow settings) as our estimate of measurement uncertainty, the corresponding minimum detectable RO was 3.9%, that is, a threshold value below the 4.5% found for the NAL-NL1 condition.

In principle, there could have been less RO with Sweet-Spot Slow, which our study would have been unable to detect. To explore this possibility, we calculated the probability of obtaining a specific sample mean (e.g., −2.2% as for Sweet-Spot Slow) assuming another “true” mean (e.g., 1%). For our analysis, we assumed a normal distribution of RO estimates (which we confirmed for our NAL-NL1 and Sweet-Spot Slow data with two Shapiro-Wilk tests; both p > .3). We also assumed an SD of 5.8% and a sample size of 16. We then calculated the z-statistic and corresponding p-value to estimate the probability of obtaining a mean performance increase of 2.2% or more (i.e., RO ≤ −2.2%) given a “true” mean of 1% (see, e.g., Greenland et al., 2016). Specifically, if the RO estimates are normally distributed with µ = 1% and σ = 5.8%, the sampling distribution of the mean is a normal distribution with µ = 1% and σ = 5.8%/√16 = 1.45%. The z-statistic is the standardized distance between the sample mean and µ, that is, z = (−2.2% − 1%)/1.45% ≈ −2.2, with a corresponding one-sided p-value of 0.014. According to these results, if the true mean was 1%, the chance of the sample mean being −2.2% or less is 1.4% (for mean values of 2% or 3%, the corresponding p-values are .002 and <.001, respectively). Overall, these findings indicate that the probability of missing “true” RO in the Sweet-Spot conditions was very low.
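The arithmetic behind the sensitivity analysis and the z-test above can be retraced with a short script. It uses only the values stated in the text (effect size 0.67, SD 5.8%, n = 16, sample mean −2.2%) and a standard normal distribution; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_at_most(sample_mean, true_mean, sd, n):
    """z-statistic and one-sided probability of a sample mean this low or lower."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - true_mean) / standard_error
    return z, normal_cdf(z)

# Minimum detectable RO: effect size times SD, in percentage points.
min_detectable_ro = 0.67 * 5.8
print(f"minimum detectable RO: {min_detectable_ro:.1f}%")

# Probability of observing a mean of -2.2% or less for several assumed true means.
for true_mean in (1.0, 2.0, 3.0):
    z, p = prob_sample_mean_at_most(-2.2, true_mean, 5.8, 16)
    print(f"true mean {true_mean:.0f}%: z = {z:.2f}, p = {p:.4f}")
```

The resulting probabilities round to the values given in the text (.014, .002, and <.001).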

Acknowledgments

The authors thank Ole Hau (WS Audiology, Denmark) for help with the HA simulator. They also thank the Associate Editor Brian C. J. Moore and two anonymous reviewers for their useful input to an earlier version of this article.

Footnotes

1. The “notch” at 500 Hz in the Sweet-Spot gain curve at 80-dB SPL input level is an effect of the fixed 5:1 CR in the 500–4000 Hz range and the CRs prescribed by NAL-NL1 outside of that range.

2. The corresponding averages of the gain differences at 80, 65, and 55 dB SPL were −7.2 dB (SD = 4.2 dB), −1.2 dB (SD = 4.1 dB), and 3.2 dB (SD = 3.9 dB), respectively.

Data Availability: The data used for the statistical analyses are available from the corresponding author upon reasonable request.

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Innovation Fund Denmark Grand Solutions 5164 00011B (“Better hEAring Rehabilitation”), GN Hearing, Oticon, and WS Audiology. Collaboration with all partners from the BEAR project is sincerely acknowledged. Additional support was provided by the William Demant Foundation (case no. 21-2011).

ORCID iDs: Michal Fereczkowski https://orcid.org/0000-0002-7960-1188

Raul H. Sanchez-Lopez https://orcid.org/0000-0002-5239-2339

Tobias Neher https://orcid.org/0000-0002-1107-9274

References

ANSI (1997). ANSI S3.5 1997. In Methods for Calculation of the Speech Intelligibility Index (pp. 1–35). American National Standard.
Bradley R. A., Terry M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4), 324–345. https://doi.org/10.2307/2334029
Bramsløw L. (2010). Preferred signal path delay and high-pass cut-off in open fittings. International Journal of Audiology, 49(9), 634–644. https://doi.org/10.3109/14992021003753482
Carney L. H. (2018). Supra-threshold hearing and fluctuation profiles: Implications for sensorineural and hidden hearing loss. Journal of the Association for Research in Otolaryngology, 19(4), 331–352. https://doi.org/10.1007/s10162-018-0669-5
Christensen J. H., Saunders G. H., Porsbo M., Pontoppidan N. H. (2021). The everyday acoustic environment and its association with human heart rate: Evidence from real-world data logging with hearing aids and wearables. Royal Society Open Science, 8(2), 201345. https://doi.org/10.1098/rsos.201345
Dillon H. (2012). Hearing aids. Thieme. https://books.google.dk/books?id=e9l_sW3TOz8C
Dillon H., Keidser G., Ching T. Y., Flax M., Brewer S. (2011). The NAL-NL2 prescription procedure. Audiology Research, 1(1), e24, 88–90. https://doi.org/10.4081/audiores.2011.e24
Duquesnoy A. J., Plomp R. (1983). The effect of a hearing aid on the speech-reception threshold of hearing-impaired listeners in quiet and in noise. The Journal of the Acoustical Society of America, 73(6), 2166–2173. https://doi.org/10.1121/1.389540
Elberling C., Ludvigsen C., Lyregaard P. E. (1989). DANTALE: A new Danish speech material. Scandinavian Audiology, 18(3), 169–175. https://doi.org/10.3109/01050398909070742
Faul F., Erdfelder E., Buchner A., Lang A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. https://doi.org/10.3758/BRM.41.4.1149
Fereczkowski M., Neher T. (2023a). Predicting aided outcome with aided word recognition scores measured with linear amplification at above-conversational levels. Ear & Hearing, 44(1), 155–166. https://doi.org/10.1097/AUD.0000000000001263
Fereczkowski M., Neher T. (2023b). Semantic context can mask intelligibility declines at above-conversational speech levels in normal-hearing listeners. Journal of Speech, Language, and Hearing Research, 66(6), 1–7. https://doi.org/10.1044/2023_JSLHR-22-00506
French N. R., Steinberg J. C. (1947). Factors governing the intelligibility of speech sounds. Journal of the Acoustical Society of America, 19(1), 90–119. https://doi.org/10.1121/1.1916407
Furman A. C., Kujawa S. G., Liberman M. C. (2013). Noise-induced cochlear neuropathy is selective for fibers with low spontaneous rates. Journal of Neurophysiology, 110(3), 577–586. https://doi.org/10.1152/jn.00164.2013
Greenland S., Senn S. J., Rothman K. J., Carlin J. B., Poole C., Goodman S. N., Altman D. G. (2016). Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology, 31, 337–350. https://doi.org/10.1007/s10654-016-0149-3
Hansen V., Munch G. (1991). Making recordings for simulation tests in the Archimedes project. Journal of the Audio Engineering Society, 39(10), 768–774. http://www.aes.org/e-lib/browse.cfm?elib=5961
Hau O., Andersen H. (2012). Hearing aid compression: Effects of speed, ratio and channel bandwidth on perceived sound quality. Audiology Online. https://www.audiologyonline.com/articles/hearing-aid-compression-effects-speed-770
Holube I., Fredelake S., Vlaming M., Kollmeier B. (2010). Development and analysis of an international speech test signal (ISTS). International Journal of Audiology, 49(12), 891–903. https://doi.org/10.3109/14992027.2010.506889
Jenstad L. M., Souza P. E. (2005). Quantifying the effect of compression hearing aid release time on speech acoustics and intelligibility. https://doi.org/10.1044/1092-4388(2005/045)
Jerger J., Jerger S. (1967). Psychoacoustic comparison of cochlear and 8th nerve disorders. Journal of Speech and Hearing Research, 10(4), 659–688. https://doi.org/10.1044/jshr.1004.659
Jerger J., Jerger S. (1974). Audiological comparison of cochlear and eighth nerve disorders. Annals of Otology, Rhinology & Laryngology, 83(3), 275–285. https://doi.org/10.1177/000348947408300301
Keidser G., Dillon H., Dyrlund O., Carter L., Hartley D. (2007). Preferred low- and high-frequency compression ratios among hearing aid users with moderately severe to profound hearing loss. Journal of the American Academy of Audiology, 18(01), 017–033. https://doi.org/10.3766/jaaa.18.1.3
Kowalewski B., Zaar J., Fereczkowski M., MacDonald E. N., Strelcyk O., May T., Dau T. (2018). Effects of slow- and fast-acting compression on hearing-impaired listeners’ consonant–vowel identification in interrupted noise. Trends in Hearing, 22, 2331216518800870. https://doi.org/10.1177/2331216518800870
Lelic D., Stiefenhofer G., Lundorff E., Neher T. (2022). Hearing aid delay in open-fit devices: Preferred sound quality in listeners with normal and impaired hearing. JASA Express Letters, 2(10), 104803. https://doi.org/10.1121/10.0014950
Lopez-Poveda E. A. (2014). Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech. Frontiers in Neuroscience, 8, 348. https://doi.org/10.3389/fnins.2014.00348
Moore B. C. J., Sęk A. (2016). Preferred compression speed for speech and music and its relationship to sensitivity to temporal fine structure. Trends in Hearing, 20, 2331216516640486. https://doi.org/10.1177/2331216516640486
Nabelek A. K., Freyaldenhoven M. C., Tampas J. W., Burchfiel S. B., Muenchen R. A. (2006). Acceptable noise level as a predictor of hearing aid use. Journal of the American Academy of Audiology, 17(9), 626–639. https://doi.org/10.3766/jaaa.17.9.2
Nielsen J. B., Dau T. (2011). The Danish hearing in noise test. International Journal of Audiology, 50(3), 202–208. https://doi.org/10.3109/14992027.2010.524254
Nielsen J. B., Dau T., Neher T. (2014). A Danish open-set speech corpus for competing-speech studies. Journal of the Acoustical Society of America, 135(1), 407–420. https://doi.org/10.1121/1.4835935
Sanchez Lopez R., Bianchi F., Fereczkowski M., Santurette S., Dau T. (2018). Data-driven approach for auditory profiling and characterization of individual hearing loss. Trends in Hearing, 22, 2331216518807400. https://doi.org/10.1177/2331216518807400
Shanks J. E., Wilson R. H., Larson V., Williams D. (2002). Speech recognition performance of patients with sensorineural hearing loss under unaided and aided conditions using linear and compression hearing aids. Ear and Hearing, 23(4), 280–290. https://doi.org/10.1097/00003446-200208000-00003
Shehorn J., Strelcyk O., Zahorik P. (2020). Associations between speech recognition at high levels, the middle ear muscle reflex and noise exposure in individuals with normal audiograms. Hearing Research, 392, 107982. https://doi.org/10.1016/j.heares.2020.107982
Sørensen A. J. M., Fereczkowski M., MacDonald E. N. (2021). Effects of noise and second language on conversational dynamics in task dialogue. Trends in Hearing, 25, 23312165211024482. https://doi.org/10.1177/23312165211024482
Souza P. E., Yueh B., Sarubbi M., Loovis C. F. (2000). Fitting hearing aids with the Articulation Index: Impact on hearing aid effectiveness. Journal of Rehabilitation Research and Development, 37(4), 473–482. https://www.rehab.research.va.gov/jour/00/37/4/pdf/Souza.pdf
Strelcyk O. (2021). Compensating hidden hearing losses by attenuating high sound pressure levels (US Patent No. US 11,490,216 B2). https://patentimages.storage.googleapis.com/10/c1/df/15f95f2bbec4b8/US11490216.pdf
Studebaker G. A. (1985). A “rationalized” arcsine transform. Journal of Speech, Language, and Hearing Research, 28(3), 455–462. https://doi.org/10.1044/jshr.2803.455
Studebaker G. A., Sherbecoe R. L., McDaniel D. M., Gwaltney C. A. (1999). Monosyllabic word recognition at higher-than-normal speech and noise levels. Journal of the Acoustical Society of America, 105(4), 2431–2444. https://doi.org/10.1121/1.426848
Verschuure J., Maas A., Stikvoort E., De Jong R., Goedegebure A., Dreschler W. (1996). Compression and its effect on the speech signal. Ear and Hearing, 17(2), 162–175. https://repub.eur.nl/pub/55916/Ovid_-Compression-and-its-Effect-on-the-Speech-Signal.pdf https://doi.org/10.1097/00003446-199604000-00008
Webster J. (1979). Interpretations of speech and noise characteristics of NTID learning centers. The Journal of the Acoustical Society of America, 66(S1), S37–S37. https://doi.org/10.1121/1.2017738
Wickelmaier F., Schmid C. (2004). A Matlab function to estimate choice model parameters from paired-comparison data. Behavior Research Methods, Instruments, & Computers, 36, 29–40. https://doi.org/10.3758/BF03195547
Zilany M. S., Bruce I. C. (2007). Predictions of speech intelligibility with a model of the normal and impaired auditory-periphery [Paper presentation]. 2007 3rd International IEEE/EMBS Conference on Neural Engineering. https://doi.org/10.1109/CNE.2007.369714

