Limitations of the Envelope Difference Index (EDI) as a Metric for Nonlinear Distortion in Hearing Aids

James M Kates

doi:10.1097/AUD.0000000000000768

Author manuscript; available in PMC: 2021 Mar 1.

Published in final edited form as: Ear Hear. 2020 Mar-Apr;41(2):356–361. doi: 10.1097/AUD.0000000000000768

PMCID: PMC6980993 NIHMSID: NIHMS1531280 PMID: 31356388

Abstract

Objectives:

The Envelope Difference Index (EDI) compares the envelopes of two signals. It has been used to measure nonlinear distortion in hearing aids, but it also responds to linear processing. This paper compares linear and nonlinear processing effects on the EDI.

Design:

The EDI for spectral tilt and peak clipping distortion is computed to illustrate the effects of linear and nonlinear signal modifications. The EDI for wide dynamic-range compression (WDRC) is then compared to that obtained for linear amplification for a set of standard audiograms to show the expected range of EDI values for linear and nonlinear hearing-aid processing. The EDI for hearing-aid amplification and compression is also compared to a measure of time-frequency envelope modulation distortion for the same conditions.

Results:

The EDI is shown to be as sensitive to linear amplification as it is to nonlinear processing. The EDI values for spectral tilt can exceed those for peak clipping, and the EDI values for linear amplification exceed those for WDRC for four of the nine audiograms considered. The agreement of the EDI with a nonlinear envelope distortion measure is shown to depend on the long-term spectra of the signals being compared when computing the EDI.

Conclusions:

The accuracy of the EDI as an indicator of nonlinear distortion for sentence materials can be improved by equalizing the long-term spectrum of the processed signal to match that of the unprocessed input. However, the EDI does not have a clear interpretation because of the confound between linear and nonlinear processing effects and the lack of an auditory model in calculating the signal differences.

Keywords: hearing aids, speech intelligibility metrics, speech quality metrics

1. Introduction

The Envelope Difference Index (EDI) was developed by Fortune et al. (1994) for “precisely quantifying the temporal contrasts that exist between two sound samples”. The EDI measures the differences between the broadband signal envelopes. If the sound samples are the input and output from a hearing aid (HA), then the EDI measures the change in the signal envelope introduced by the processing.

Recent papers tend to interpret the EDI as an indicator of nonlinear distortion¹ in hearing aids. For example, Geetha and Manjula (2014) use the EDI “to quantify the temporal changes caused by amplitude compression in hearing aids”, while using a separate metric for the spectral changes. Alexander and Masterson (2016) discuss the “temporal envelope distortion (specifically, EDI)”, and Kowalewski et al. (2018) claim that Fortune et al. (1994) “found that the EDI (and hence the amount of distortion) increased with shorter compression release times and larger CRs”.

But the EDI is a broadband measurement, which renders it sensitive to linear as well as nonlinear modifications to the signal and not just to the presence of nonlinear distortion. Applying a high-frequency boost to a signal to compensate for a hearing loss, for example, will increase the influence of the high frequencies in forming the overall signal envelope compared to the unamplified speech. The greater high-frequency signal content will modify the envelope fluctuations and cause an increase in the EDI, even when no nonlinear distortion is present. Interpreting the EDI as a nonlinear distortion measure is therefore problematical because it cannot separate linear from nonlinear envelope modifications. Furthermore, the EDI does not indicate how nonlinear distortion is processed by the auditory periphery or how different frequency regions are combined in forming a perceptual response.

As shown by the above quotes from recent papers, the EDI is still being used to indicate the presence of nonlinear distortion when comparing two signals. It is therefore important to determine the relative importance of frequency-dependent amplification versus nonlinear distortion when computing the EDI, and to determine how the EDI relates to more recent metrics that have been designed to measure nonlinear distortion in the context of models of the auditory periphery. This paper uses spectral tilt and symmetric peak clipping as examples of linear and nonlinear processing that will impact the EDI. Wide dynamic-range compression (WDRC) is then used as an example of HA processing, and this nonlinear system is compared to frequency-dependent linear amplification.

Examples of newer procedures that have been developed to measure the nonlinear distortion in a system include Goldsworthy and Greenberg (2004), Huber and Kollmeier (2006), Hines and Harte (2010), Jørgensen and Dau (2011), and Kates and Arehart (2014). The cepstral correlation term of this last metric, the Hearing-Aid Speech Quality Index (HASQI; Kates and Arehart, 2014), is particularly attractive because it has been validated against both intelligibility scores and subjective ratings (Arehart et al., 2015; Souza et al., 2015). In this paper, the EDI is compared to the cepstral correlation to illustrate the relationship between the EDI and a perceptually-based procedure for estimating nonlinear distortion.

2. Methods

The EDI was compared to the HASQI cepstral correlation term for several broadband processing and simulated hearing-aid conditions implemented in MATLAB. All processing and filter delays were removed before comparing the signals to ensure exact temporal alignment; it should be noted that the EDI will be increased by any uncompensated group delay in the system. For all comparisons, the stimulus was the sentence “A saw is a tool used for making boards.” spoken by a male talker. The sampling rate was 22.05 kHz. An auditory spectrogram of the test sentence is presented in Fig 1.

Fig 1. Auditory spectrogram for the test sentence “A saw is a tool used for making boards.” spoken by a male talker.

A. EDI

The EDI was calculated using the procedure of Fortune et al. (1994):

\mathrm{EDI} = \frac{1}{2N} \sum_{n=0}^{N-1} \left| \frac{\mathrm{env}_{1}(n)}{m_{1}} - \frac{\mathrm{env}_{2}(n)}{m_{2}} \right|, \qquad (1)

where N is the number of envelope samples, env₁(n) and env₂(n) are the envelopes of the signals x₁(n) and x₂(n) that are being compared, and m₁ and m₂ are the means of the envelopes, respectively. The envelopes were generated by taking the absolute values of each sequence and filtering through a 3-pole Butterworth IIR lowpass filter having a cutoff frequency of 30 Hz; the lowpass filtering approach was chosen to be consistent with previous EDI studies (e.g. Jenstad and Souza, 2005). The EDI ranges from 0 to 1, with 0 indicating a perfect match of the envelopes and 1 indicating no agreement.
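
The calculation in Eq. (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB code used in the study: the source specifies a 3-pole Butterworth lowpass at 30 Hz but not how it was designed, so the bilinear-transform design below, the single-pass (causal) filtering, and the function names `butter3_lowpass`, `envelope`, and `edi` are all choices made here.

```python
import math


def butter3_lowpass(x, fc_hz, fs_hz):
    """3-pole Butterworth lowpass via the bilinear transform (cutoff prewarped)."""
    k = math.tan(math.pi * fc_hz / fs_hz)
    # First-order section: analog prototype factor (s + 1).
    b = k / (1.0 + k)
    a1 = (k - 1.0) / (k + 1.0)
    # Second-order section: analog prototype factor (s^2 + s + 1).
    norm = 1.0 / (1.0 + k + k * k)
    c = k * k * norm
    d1 = 2.0 * (k * k - 1.0) * norm
    d2 = (1.0 - k + k * k) * norm

    x1 = u1 = u2 = v1 = v2 = 0.0  # filter state (both sections start at rest)
    y = []
    for s in x:
        u = b * (s + x1) - a1 * u1                       # first-order output
        v = c * (u + 2.0 * u1 + u2) - d1 * v1 - d2 * v2  # second-order output
        x1, u2, u1 = s, u1, u
        v2, v1 = v1, v
        y.append(v)
    return y


def envelope(x, fs_hz, fc_hz=30.0):
    """Full-wave rectify the signal, then smooth it with the lowpass filter."""
    return butter3_lowpass([abs(s) for s in x], fc_hz, fs_hz)


def edi(env1, env2):
    """Envelope Difference Index of Eq. (1): 0 = identical, 1 = no overlap."""
    n = len(env1)
    m1 = sum(env1) / n
    m2 = sum(env2) / n
    return sum(abs(p / m1 - q / m2) for p, q in zip(env1, env2)) / (2.0 * n)


# Toy usage: an amplitude-modulated tone compared with a scaled copy of itself.
fs = 2000.0
x = [(1.0 + 0.5 * math.sin(2 * math.pi * 4 * n / fs))
     * math.sin(2 * math.pi * 200 * n / fs) for n in range(4000)]
env_a = envelope(x, fs)
env_b = envelope([3.0 * s for s in x], fs)
print(edi(env_a, env_b))  # a pure level change leaves the EDI near zero
```

Because each envelope is divided by its own mean, a frequency-independent gain does not change the EDI; only changes in envelope shape contribute.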

B. Cepstral Correlation

The EDI was compared to the cepstral correlation term of the Hearing-Aid Speech Quality Index (HASQI) version 2 of Kates and Arehart (2014). The cepstral correlation is a measure of envelope fidelity that compares the processed/degraded signal to a clean reference signal. The metric is based on a model of the auditory periphery that incorporates auditory frequency analysis, dynamic-range compression controlled by the outer hair cells, inner hair-cell firing-rate adaptation, and the auditory threshold. Hearing loss is integrated into the model as a broadening of the auditory filters, a reduction in dynamic-range compression, and an increase in the auditory threshold as the hearing loss is increased. Both the reference and degraded signals are passed through the model of the impaired periphery if hearing loss is present and through the model of the normal periphery otherwise.

The output of the peripheral model is the envelope of the speech signal in 32 auditory bands having center frequencies ranging from 80 to 8000 Hz. The envelopes for the degraded signal at the model output are compared to the envelopes from a reference signal; for the HASQI cepstral-correlation term, the reference has NAL-R compensation for the hearing loss (Byrne and Dillon, 1986). At each envelope time sample a smoothed version of the auditory spectrum in dB re: threshold is formed. The fluctuations over time of the smoothed short-time log spectrum for the modified signal are compared to those for the reference signal using the normalized cross-covariance. The cross-covariance minimizes the effects of long-term spectral changes on the signals being compared, so the metric primarily determines the amount of nonlinear distortion introduced by the processing.
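
The reason a normalized cross-covariance is insensitive to long-term spectral changes can be seen in a small sketch. A fixed linear gain in one band appears as a constant offset in dB, and removing the mean before correlating discards that offset. The code below is a generic normalized cross-covariance applied to one hypothetical band-level trajectory; it is not the HASQI implementation, and the level values are invented for illustration.

```python
import math


def normalized_cross_covariance(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical band levels over time, in dB (illustrative values only).
reference = [50.0, 62.0, 55.0, 70.0, 48.0, 66.0]
amplified = [v + 18.0 for v in reference]      # fixed linear gain: constant dB offset
limited = [min(v, 60.0) for v in reference]    # level-dependent change in shape

print(normalized_cross_covariance(reference, amplified))  # offset is ignored
print(normalized_cross_covariance(reference, limited))    # shape change lowers the value
```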

The resultant cepstral correlation metric considers both the accuracy in reproducing the short-time spectral shape over the auditory frequency bands and the accuracy in reproducing the envelope modulation over the duration of the sentence. The metric thus determines the fidelity of the signal-processing system in reproducing the time-frequency modulation pattern of the original speech (Zahorian and Rothenberg, 1981). The metric values range from 1 to 0, with 1 indicating perfect envelope fidelity and 0 indicating a complete lack of envelope fidelity relative to the reference. The cepstral correlation has been shown to be highly correlated with listener judgments of intelligibility and quality for speech processed using wide dynamic-range compression and frequency shifting (Souza et al., 2015) and for additive noise and noise suppression (Arehart et al., 2015).

C. Broadband Signal Modifications

In a classic paper, Licklider and Pollack (1948) showed that spectral tilt and peak clipping have minimal impact on speech intelligibility for normal-hearing (NH) listeners. These processing conditions were duplicated in the present study to illustrate the sensitivity of the EDI to linear and nonlinear broadband processing in comparison with cepstral correlation. Normal hearing was assumed.

Spectral tilt was implemented using a linear-phase finite-impulse response (FIR) filter to provide a constant slope in dB/oct between 250 and 4000 Hz. The filter response was flat below 250 Hz and above 4000 Hz. The amount of tilt ranged from −6 dB/oct to +6 dB/oct in steps of 1.5 dB/oct, where a positive value of spectral tilt indicates that the high frequencies have more gain than the low frequencies.
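
The target magnitude response described above can be written as a short function. The sketch below assumes the 0 dB reference sits at and below 250 Hz, which the source does not state, and it covers only the gain specification, not the linear-phase FIR design itself. The name `tilt_gain_db` is introduced here for illustration.

```python
import math


def tilt_gain_db(freq_hz, tilt_db_per_oct, f_lo=250.0, f_hi=4000.0):
    """Target gain in dB: constant slope between f_lo and f_hi, flat outside."""
    f = min(max(freq_hz, f_lo), f_hi)      # clamp to the sloped region
    return tilt_db_per_oct * math.log2(f / f_lo)


# The nine tilt conditions: -6 to +6 dB/oct in steps of 1.5 dB/oct.
tilts = [-6.0 + 1.5 * i for i in range(9)]

for tilt in tilts:
    # Gain difference between the flat regions above 4000 Hz and below 250 Hz.
    print(tilt, tilt_gain_db(8000.0, tilt) - tilt_gain_db(100.0, tilt))
```

Since 250 to 4000 Hz spans four octaves, the largest tilt of 6 dB/oct corresponds to a 24 dB level difference between the two flat regions.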

Symmetric peak clipping was implemented using a histogram procedure. The silences at the beginning and end of the sentence were discarded. The cumulative histogram of the remaining signal absolute values was then computed. The clipping threshold was set as a percentage of the cumulative magnitude histogram for the sentence. A threshold of 100 percent indicates no clipping, while a threshold of 50 percent indicates that the half of the signal samples having the greatest magnitude were replaced with the positive or negative clipping threshold value.
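
A minimal sketch of this clipping rule follows. It assumes the leading and trailing silences have already been removed, and it takes the threshold directly from the sorted sample magnitudes rather than from a binned histogram, since the source does not give the bin width. The function names are hypothetical.

```python
import math


def clip_threshold(x, percent):
    """Magnitude at or below which `percent` of the samples fall."""
    mags = sorted(abs(v) for v in x)
    k = max(math.ceil(percent * len(mags) / 100.0) - 1, 0)
    return mags[k]


def peak_clip(x, percent):
    """Symmetric peak clipping at the threshold for the given percentage."""
    thr = clip_threshold(x, percent)
    return [max(-thr, min(thr, v)) for v in x]


samples = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, -1.0]
print(clip_threshold(samples, 50))  # 0.5
print(peak_clip(samples, 50))       # the five largest magnitudes become +/-0.5
print(peak_clip(samples, 100) == samples)  # True: 100 percent means no clipping
```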

D. Hearing-Aid Processing

The EDI has been used within the audiological community to evaluate hearing-aid processing, so a second experiment was designed to compare the EDI to cepstral correlation for a simulated hearing aid. The HA processing was configured for nine of the ten IEC standard audiograms (Bisgaard et al., 2010); the standard audiograms are presented in Table 1. Audiogram N7 was not used since it represents a severe loss that lies outside the range for which HASQI has been validated.

Table 1.

IEC standard audiograms used in the HA processing simulations.

IEC Label	Hearing Loss in dB
Freq, Hz	250	500	1000	2000	4000	6000
NH	0	0	0	0	0	0
N1	10	10	10	15	30	40
N2	20	20	25	35	45	50
N3	35	35	40	50	60	65
N4	55	55	55	65	75	80
N5	65	70	75	80	80	80
N6	75	80	85	90	100	100
S1	10	10	10	15	55	70
S2	20	20	25	55	95	95
S3	30	35	60	75	80	85

The simulated hearing-aid used a nine-channel filterbank, with either linear amplification or compression gain implemented independently in each band. The nominal band center frequencies and band edges are indicated in Table 2. The filterbank used linear-phase FIR filters; the filter group delay was removed to ensure exact temporal alignment of the output signals with the input. Linear amplification for the audiograms was implemented using the NAL-RP fitting rule (Byrne et al., 1991). WDRC for each audiogram was provided using the NAL-NL2 fitting procedure (Keidser et al., 2011). For linear amplification, the gain was constant across the frequency range of each band, giving a stepped gain-versus-frequency function typical of multi-channel hearing aids using time-domain frequency analysis. The WDRC compression ratios used for each audiogram in the nine frequency bands are indicated in Table 3.

Table 2.

Frequency bands used in the 9-channel hearing-aid simulation.

Band Number	Band Frequency, Hz	Lower Edge, Hz	Upper Edge, Hz
1	200	0	300
2	400	300	500
3	600	500	800
4	1000	800	1250
5	1500	1250	1750
6	2000	1750	2500
7	3000	2500	3750
8	4500	3750	5250
9	6000	5250	11025
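
The stepped gain-versus-frequency function that results from constant gain within each band can be illustrated with a lookup over the band edges listed in Table 2. The band gains in the example are invented placeholders, not NAL-RP prescriptions, and treating each band as including its lower edge but not its upper edge is an assumption made here.

```python
from bisect import bisect_right

# Upper band edges in Hz, copied from Table 2 (band 1 starts at 0 Hz).
UPPER_EDGES_HZ = [300, 500, 800, 1250, 1750, 2500, 3750, 5250, 11025]


def band_number(freq_hz):
    """Return the 1-based band number for a frequency between 0 and 11025 Hz."""
    idx = bisect_right(UPPER_EDGES_HZ, freq_hz)
    return min(idx, len(UPPER_EDGES_HZ) - 1) + 1


def stepped_gain_db(freq_hz, band_gains_db):
    """Gain is constant within a band, so it changes in steps at the band edges."""
    return band_gains_db[band_number(freq_hz) - 1]


# Placeholder gains in dB for the nine bands (illustrative values only).
example_gains_db = [0.0, 3.0, 6.0, 10.0, 14.0, 17.0, 18.0, 18.0, 15.0]
for f in (250, 1000, 1200, 1300, 6000):
    print(f, band_number(f), stepped_gain_db(f, example_gains_db))
```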

Table 3.

NAL-NL2 compression ratios used in the hearing-aid simulation.

IEC Label	Band Frequency, kHz
	0.2	0.4	0.6	1.0	1.5	2.0	3.0	4.5	6.0
NH	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
N1	1.00	1.00	1.01	1.03	1.12	1.23	1.44	1.69	1.61
N2	1.10	1.08	1.14	1.52	1.76	2.08	2.12	2.09	1.92
N3	1.52	1.55	1.66	2.16	2.31	2.49	2.41	2.28	2.10
N4	2.35	2.74	3.07	3.01	2.86	2.72	2.37	2.05	1.93
N5	1.97	2.25	2.51	2.63	2.37	2.15	1.99	1.84	1.79
N6	2.04	2.15	2.21	2.13	2.00	1.89	1.68	1.51	1.50
S1	1.00	1.00	1.02	1.11	1.15	1.18	1.45	1.84	1.72
S2	1.00	1.07	1.20	1.57	1.72	1.90	1.59	1.35	1.29
S3	1.87	2.00	2.14	2.35	2.16	1.99	1.83	1.67	1.61
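
To give a feel for what the compression ratios in Table 3 imply, recall that above the compression threshold a compressor with ratio CR maps an input level change of ΔL dB to an output change of ΔL/CR dB. The sketch below applies this static rule to the N4 row. It ignores attack and release dynamics and the compression thresholds, none of which are given here, and the 30 dB input swing is a hypothetical value.

```python
def output_level_change_db(input_change_db, compression_ratio):
    """Static compression: output level change for a given input level change."""
    return input_change_db / compression_ratio


# Compression ratios for the N4 audiogram, copied from Table 3 (0.2 to 6.0 kHz).
n4_ratios = [2.35, 2.74, 3.07, 3.01, 2.86, 2.72, 2.37, 2.05, 1.93]
input_swing_db = 30.0  # hypothetical level swing, chosen only for illustration

for cr in n4_ratios:
    print(cr, round(output_level_change_db(input_swing_db, cr), 1))
```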

The EDI and cepstral correlation were computed for different combinations of HA input and output signals. The first signal in the EDI was selected as either 1) the unprocessed HA input, 2) the hearing-aid output for linear NAL-RP amplification, or 3) the sentence equalized to have the same long-term spectrum as the NAL-NL2 compressed sentence. The second signal in the EDI was selected as either 1) NAL-RP linear amplification, 2) the NAL-NL2 WDRC, or 3) the compressed signal after filtering to match its long-term spectrum to that of the unprocessed speech, which removed the long-term spectral changes introduced by the hearing aid. This latter modification is similar to the spectral equalization proposed by Houben et al. (2011) to permit perceptual evaluation of hearing-aid signal processing.
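
The long-term spectral equalization can be stated compactly. The expression below is a sketch of the general idea, assuming that S_x(f) and S_y(f) denote long-term power spectra averaged over the sentence; the source does not specify how the spectra were estimated or how the equalization filter was designed, and the symbols are introduced here.

```latex
% S_x(f): long-term power spectrum of the unprocessed input x
% S_y(f): long-term power spectrum of the hearing-aid output y
% H_eq(f): equalization filter applied to y
|H_{\mathrm{eq}}(f)|^{2} = \frac{S_x(f)}{S_y(f)}
\quad\Longrightarrow\quad
S_{y,\mathrm{eq}}(f) = |H_{\mathrm{eq}}(f)|^{2}\, S_y(f) = S_x(f)
```

The equalized output then has the same long-term spectrum as the input, so any remaining envelope differences reflect the time-varying part of the processing rather than the static frequency response.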

3. Results

A. Broadband Signal Modifications

For the broadband modifications, the EDI was computed using the unmodified sentence as the first signal and the processed signal (tilt or clip) as the second signal. A second set of EDI calculations was also performed by taking the processed output signal and applying equalization so that its long-term spectrum matched that of the input sentence. The cepstral correlation was computed using a model of normal hearing.

The results for spectral tilt are plotted in Fig 2. Note that the cepstral correlation is plotted as (1 - the cepstral correlation value) so that 0 represents perfect envelope agreement as is the case for the EDI. Spectral tilt is completely linear, and the cepstral correlation plotted as a function of the amount of tilt shows only a small deviation from perfect signal envelope agreement. The EDI, on the other hand, increases as the magnitude of the tilt increases. Only linear amplification is used in this example, so the EDI curve shows just how large the value can get even when there is no nonlinear distortion present and no reduction in intelligibility. Applying the equalization filter to the output compensates for the spectral tilt, and for this condition the EDI returns to being nearly zero.

The results for symmetric peak clipping are plotted in Fig 3. All three curves show similar behavior in that the lower the clipping threshold, the higher the computed index. Equalizing the long-term spectrum of the clipped signal to match that of the input reduces the computed EDI values since the equalization removes the long-term spectral changes introduced by the peak clipping while leaving the nonlinear distortion. Thus even in a situation where there is strong nonlinear distortion, the EDI is still biased by the presence of spectral changes introduced by the signal processing. In comparing Fig 3 to Fig 2, it is clear that the EDI values for spectral tilt can be as large as or larger than those computed for symmetric peak clipping, and that the EDI is as sensitive to linear modifications of the signal as it is to nonlinear distortion.

B. Hearing-Aid Processing

Auditory spectrograms for the test sentence processed using a) NAL-RP and b) NAL-NL2 are presented in Fig 4 for the N4 audiogram. Compared to the unprocessed sentence plotted in Fig 1, both the linear amplification and WDRC increase the high frequencies relative to the lower-frequency vowel formants. The NAL-RP linear amplification provides an additional 18 dB gain above 1 kHz compared to the gain at 250 Hz, and NAL-NL2 provides even more average high-frequency gain.

An example of how linear processing can affect the broadband speech envelope is plotted in Fig 5, which shows NAL-RP amplification computed for the N4 audiogram applied to the sentence (red line) compared to the original (black line). The large high-frequency boost seen in the spectrogram of Fig 4(a) greatly increases the importance of high-frequency (consonant) sounds compared to low-frequency (vowel first formant) sounds in forming the envelope of the broadband signal. The result is substantial differences between the envelopes even though there is no nonlinear distortion present.
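
A deliberately idealized example shows how large this effect can be. Suppose the envelope is piecewise constant, with a vowel-like segment at amplitude 1.0 followed by a consonant-like segment at amplitude 0.1, and suppose a purely linear high-frequency boost of 20 dB acts only on the second segment. These amplitudes, the 20 dB figure, and the clean separation of the two segments are all assumptions chosen for illustration; real speech envelopes do not divide this neatly.

```python
def edi(env1, env2):
    """Envelope Difference Index of Eq. (1)."""
    n = len(env1)
    m1 = sum(env1) / n
    m2 = sum(env2) / n
    return sum(abs(p / m1 - q / m2) for p, q in zip(env1, env2)) / (2.0 * n)


boost = 10.0 ** (20.0 / 20.0)             # 20 dB expressed as an amplitude ratio

vowel = [1.0] * 50                        # low-frequency segment, left unchanged
consonant = [0.1] * 50                    # high-frequency segment

env_in = vowel + consonant
env_out = vowel + [boost * v for v in consonant]   # linear gain only

print(round(edi(env_in, env_out), 3))     # 0.409
```

In this toy case the EDI is about 0.41 even though the processing contains no nonlinearity at all.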

Table 4 presents the EDI for the standard audiograms as the signals compared in the calculation are varied, along with the cepstral correlation values for each loss. The row headers labeled x₁ and x₂ indicate the input and output signals compared in each column. For the unprocessed speech as input and the NAL-RP amplification as the output, the EDI can be as large as 0.524 (S2 audiogram) even though there is no nonlinear distortion present. For four of the nine audiograms (N5, N6, S1, and S2), the EDI for the linear processing is larger than the EDI for WDRC.

Table 4.

Values for the EDI and HASQI cepstral correlation computed for IEC standard audiograms. NH stands for normal hearing. The input (x₁) and output (x₂) signals for the calculations listed below the headers specify the signals compared in the calculations. Input EQ is the speech filtered to have a long-term spectrum identical to that of the NAL-NL2 compressed speech. NL2EQ is the WDRC output filtered to have a long-term spectrum identical to that of the unprocessed speech. Cep Corr is the HASQI cepstral correlation.

Loss	EDI					Cep Corr
x₁	No Proc	No Proc	NAL-RP	Input EQ	No Proc	NAL-RP	NAL-RP
x₂	NAL-RP	NL2	NL2	NL2	NL2EQ	NAL-RP	NL2
NH	0.006	0.000	0.006	0.000	0.000	1.000	1.000
N1	0.137	0.156	0.055	0.022	0.006	1.000	0.991
N2	0.251	0.298	0.133	0.056	0.023	1.000	0.973
N3	0.332	0.345	0.190	0.088	0.064	0.999	0.947
N4	0.310	0.334	0.186	0.105	0.109	0.999	0.874
N5	0.300	0.254	0.219	0.098	0.093	0.999	0.890
N6	0.323	0.285	0.179	0.087	0.089	0.999	0.845
S1	0.332	0.254	0.101	0.036	0.008	0.999	0.995
S2	0.524	0.379	0.160	0.052	0.015	0.999	0.977
S3	0.389	0.426	0.212	0.087	0.084	0.998	0.882

Comparing the WDRC signal to the NAL-RP amplification removes some of the long-term spectral differences since both approaches provide compensation for the hearing loss, and the EDI values for all of the hearing-loss audiograms are reduced. However, both of the signals compared have a high-frequency boost compared to the unprocessed speech, so there is a different relative weighting of the signal spectral content for the signals compensated for the hearing loss as opposed to the original speech. This spectral bias is removed by filtering the compressed HA output to have the same long-term spectrum as the unprocessed input, and the resulting EDI values (column labeled No Proc/NL2EQ) are lower than those computed when comparing the NAL-NL2 signal to the NAL-RP signal. Matching the input signal spectrum to the long-term spectrum produced by the NAL-NL2 processing, shown in the column with Input EQ as the input and NL2 as the output, also reduces the EDI value, but for most audiograms not by as much as matching the WDRC long-term spectrum to that of the input signal; the exceptions in Table 4 are N4 and N6, where the two values differ by less than 0.005.

The cepstral correlation values are presented in the last two columns of the table. The cepstral correlation for NAL-RP applied to both signals is essentially 1 (0.998 or higher) for all audiograms, as expected when the input and output are identical. While the EDI values are biased by the linear amplification provided by the HA, the cepstral correlation minimizes this bias. The cepstral correlation values tend to decrease with increasing hearing loss since the WDRC compression ratios increase as the loss becomes more severe, as shown in Table 3.

In Fig 6, the EDI values are plotted as a function of the HASQI cepstral correlation along with linear regression fits to the data. The Pearson correlation coefficients are −0.487 (p=0.153) for the EDI computed by comparing NAL-NL2 to no processing, −0.769 (p=0.009) when comparing NAL-NL2 to NAL-RP, and −0.953 (p<0.001) when comparing NL2EQ to the unprocessed signal. Removing the spectral shaping introduced by the hearing-aid processing, as done for NL2EQ, thus eliminates much of the bias and results in EDI calculations that agree much more closely with the cepstral correlation values for these examples of WDRC processing.
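
The correlation coefficients can be approximately recomputed from the rounded entries in Table 4. The sketch below assumes that the ten rows of the table, including NH, are the data points in Fig 6, and that the cepstral correlation used is the NAL-RP/NL2 column; because the table values are rounded to three decimals, the results should land near the reported coefficients rather than match them exactly.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Rows NH, N1..N6, S1..S3, copied from Table 4.
cep_corr = [1.000, 0.991, 0.973, 0.947, 0.874, 0.890, 0.845, 0.995, 0.977, 0.882]
edi_columns = {
    "No Proc vs NL2": [0.000, 0.156, 0.298, 0.345, 0.334,
                       0.254, 0.285, 0.254, 0.379, 0.426],
    "NAL-RP vs NL2": [0.006, 0.055, 0.133, 0.190, 0.186,
                      0.219, 0.179, 0.101, 0.160, 0.212],
    "No Proc vs NL2EQ": [0.000, 0.006, 0.023, 0.064, 0.109,
                         0.093, 0.089, 0.008, 0.015, 0.084],
}

for label, column in edi_columns.items():
    print(label, round(pearson_r(column, cep_corr), 3))
```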

4. Discussion and Conclusions

The EDI is not a measure of nonlinear distortion; it merely determines the difference between two envelopes. It does not distinguish between envelope changes caused by linear amplification and those caused by nonlinear distortion, and it does not attempt to model any aspects of auditory perception.

The EDI is a broadband measure, so it is dominated by the most-intense spectral regions of the signals. Changes in the spectral weighting will change the EDI, and the same HA can give very different EDI values depending on the input signal and whether or not the HA output signal is compensated for the spectral tilt introduced by the HA frequency response. Comparing the processed output to a linear approximation of the long-term spectral changes (e.g. NAL-RP as the input and NAL-NL2 as the output) removes some of the linear contribution to the EDI but uses a frequency weighting that corresponds to the spectrum of the amplified speech. If a frequency weighting corresponding to the unprocessed speech is desired, then the HA output should be filtered to match the long-term spectrum of the unprocessed input. Adjusting the WDRC long-term spectrum to match that of the input signal produced the best agreement between the EDI and cepstral correlation, but this agreement is not guaranteed for other forms of nonlinear hearing-aid processing not investigated in this paper.

Other metrics may be more appropriate if an accurate estimate of the nonlinear envelope distortion in a system is desired. Apart from the HASQI cepstral correlation term (Kates and Arehart, 2014) used in this paper, examples include the envelope-based STI (Goldsworthy and Greenberg, 2004), PEMO-Q (Huber and Kollmeier, 2006), the mean structured similarity index (Hines and Harte, 2010), and the signal-to-noise envelope power ratio (Jørgensen and Dau, 2011). These metrics use multi-channel signal analysis with normalized envelope cross-correlations or cross-covariances, which effectively remove the long-term spectral changes from consideration when estimating the signal envelope changes due to nonlinear distortion. Interpreting the EDI as a nonlinear distortion measure is problematical since it does not separate linear from nonlinear envelope modifications, and it does not indicate how the distortion is filtered by the auditory periphery or how different frequency regions are combined in making a perceptual judgment.

Acknowledgments

The research reported in this paper was supported by a grant from GN ReSound to the University of Colorado and by a grant from the National Institutes of Health (R01 DC012289).

Footnotes

¹ In the most general sense, “distortion” refers to any modification of a signal. In signal processing, however, distortion generally refers to nonlinear distortion, where there are frequencies present in the system output that are not present at the input. For clarity, we recommend using “linear processing” or “linear filtering” to refer to linear changes in the signal such as those introduced by a highpass filter, and using “nonlinear distortion” to refer to the effects of nonlinear processing such as peak clipping. We prefer not to use the term “distortion” without a qualifier, or the term “linear distortion”, since either may lead to ambiguity.

References

Alexander JM, and Masterson K (2016), “Effects of WDRC release time and number of channels on output SNR and speech recognition,” Ear Hear. 36, e35–e49.
Arehart KH, Souza PE, Kates JM, Lunner T, and Pedersen MS (2015), “Relationship between distortion, hearing loss, and working memory for digital noise reduction,” Ear Hear. 36, 505–516.
Bisgaard N, Vlaming MSMG, and Dahlquist M (2010), “Standard audiograms for the IEC 60118–15 measurement procedure,” Trends Amplif. 14, 113–120.
Byrne D, and Dillon H (1986), “The national acoustics laboratories’ (NAL) new procedure for selecting gain and frequency response of a hearing aid,” Ear Hear. 7, 257–265.
Byrne D, Parkinson A, and Newall P (1991), “Modified hearing aid selection procedures for severe/profound losses,” in Studebaker G, Bess F, and Beck L (Eds.), The Vanderbilt Hearing Aid Report II, pp 295–300. Parkton, MD: York Press.
Fortune TW, Woodruff BD, and Preves DA (1994), “A new technique for quantifying temporal envelope contrasts,” Ear Hear. 15, 93–99.
Geetha C, and Manjula P (2014), “Effect of compression, digital noise reduction and directionality on envelope difference index, log-likelihood ratio and perceived quality,” Audiol. Res. 4:110, 46–51.
Goldsworthy RL, and Greenberg JE (2004), “Analysis of speech-based speech transmission index methods with implications for nonlinear operations,” J. Acoust. Soc. Am. 116, 3679–3689.
Hines A, and Harte N (2010), “Speech intelligibility from image processing,” Speech Comm. 52, 736–752.
Houben R, Brons I, and Dreschler WA (2011), “A method to remove differences in frequency response between commercial hearing aids to allow direct comparison of the sound quality of hearing-aid features,” Trends Amplif. 15, 77–83.
Huber R, and Kollmeier B (2006), “PEMO-Q – A new method for objective audio quality assessment using a model of auditory perception,” IEEE Trans. Audio Speech and Lang. Proc. 14, 1902–1911.
Jenstad LM, and Souza PE (2005), “Quantifying the effect of compression hearing aid release time on speech acoustics and intelligibility,” J. Speech Lang. Hear. Res. 48, 651–667.
Jørgensen S, and Dau T (2011), “Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing,” J. Acoust. Soc. Am. 130, 1475–1487.
Kates JM, and Arehart KH (2014), “The hearing aid speech quality index (HASQI), version 2,” J. Audio Eng. Soc. 62, 99–117.
Keidser G, Dillon H, Flax M, Ching T, and Brewer S (2011), “The NAL-NL2 prescription procedure,” Audiol. Res. 1:e24, 88–90.
Kowalewski B, Zaar J, Fereczkowski M, MacDonald E, Strelcyk O, May T, and Dau T (2018), “Effects of slow- and fast-acting compression on hearing-impaired listeners’ consonant–vowel identification in interrupted noise,” Trends Hear. 22, 1–12.
Licklider JCR, and Pollack I (1948), “Effects of differentiation, integration, and infinite peak clipping on the intelligibility of speech,” J. Acoust. Soc. Am. 20, 42–51.
Souza PE, Arehart KH, Shen J, Anderson MC, and Kates JM (2015), “Working memory and intelligibility of hearing-aid processed speech,” Frontiers Psych. 6, Article 526.
Zahorian SA, and Rothenberg M (1981), “Principal-components analysis for low-redundancy encoding of speech spectra,” J. Acoust. Soc. Am. 69, 832–845.
