Abstract
Compression is used in hearing aids to compensate for the effects of loudness recruitment. This article describes the distinction between, and relative merits of, slow and fast compression systems. A study of Gatehouse and coworkers leads to the following conclusions: (a) The benefit from compression is greatest among individuals who experience a wide range of sound levels within short periods of time, (b) slow compression generally leads to higher listening comfort than fast compression, (c) the benefit from fast compression varies across individuals, and those with high cognitive ability are able to benefit from fast compression to take advantage of temporal dips in a background sound. It is argued that listening in the dips depends on the ability to process the temporal fine structure of sounds. It is proposed that a test of the ability to process temporal fine structure might be useful for selecting compression speed for an individual.
Keywords: hearing aids, compression, time constants, automatic gain control
People with cochlear hearing loss usually experience loudness recruitment and the associated reduced dynamic range (Fowler, 1936; Moore, 2004, 2007; Steinberg & Gardner, 1937). Most modern hearing aids incorporate some form of compression or automatic gain control (AGC) to deal with this. In principle, AGC can provide high gain for low-level sounds, overcoming the loss of sensitivity of the hearing-impaired individual, and reduced gain for high-level sounds, preventing such sounds from becoming uncomfortably loud. However, controversy continues about the “best” way to implement AGC, and particularly whether it should be fast acting or slow acting.
The basic characteristics of an AGC system can be described in terms of the input–output function, which is a plot of the output level as a function of the input level. An example of such a function is shown in Figure 1. For low input levels, the gain (output level in dB minus input level in dB) is independent of input level, and the input–output function has a slope of one. At higher input levels, the gain decreases with increasing input level, and the input–output function has a slope less than one. The compression threshold is defined as the input level at which the gain is reduced by 2 dB, relative to the gain applied in the region of linear amplification (American National Standards Institute [ANSI], 2003). For example, if the gain were 25 dB for input levels well below the compression threshold, the compression threshold would be the input level at which the gain was reduced to 23 dB. One reason for having a compression threshold, with linear amplification for lower levels, is that it is impractical to continue to increase the gain indefinitely as the input level decreases. A second reason is that the use of high gain for very low-level inputs can make microphone noise or low-level environmental noise sound intrusive. Indeed, for very low-level inputs, some hearing aids reduce the gain to prevent such noises from being audible; this is called expansion as opposed to compression.
The “amount” of compression is specified by the compression ratio, which is the change in input level (in dB) required to achieve a 1-dB change in output level (for an input level exceeding the compression threshold); the compression ratio is equal to the reciprocal of the slope of the input–output function in the range where the compression is applied. For example, a compression ratio of 3 means that the output grows by 1 dB for each 3-dB increase in input level. The compression ratio is usually measured using a steady signal, such as a continuous sine wave. When the level of the sine wave is changed, the gain is allowed to stabilize before a measurement is taken. The compression ratio measured in this way will be referred to as CRstatic.
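The static behavior described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an idealized input–output function with a sharp knee. The function names, the knee level of 45 dB, and the default arguments are hypothetical choices made for the example; the 25-dB linear gain and the compression ratio of 3 follow the examples in the text. With a sharp knee, the compression threshold as defined above (gain reduced by 2 dB) lies slightly above the knee itself.

```python
def output_level_db(input_db, linear_gain_db=25.0, knee_db=45.0, cr_static=3.0):
    """Idealized static input-output function with a sharp knee (assumed shape)."""
    if input_db <= knee_db:
        # Linear region: constant gain, slope of one.
        return input_db + linear_gain_db
    # Compression region: slope of 1 / CRstatic above the knee.
    return knee_db + linear_gain_db + (input_db - knee_db) / cr_static


def gain_db(input_db, linear_gain_db=25.0, knee_db=45.0, cr_static=3.0):
    """Gain is the output level in dB minus the input level in dB."""
    return output_level_db(input_db, linear_gain_db, knee_db, cr_static) - input_db


def compression_threshold_db(knee_db=45.0, cr_static=3.0, reduction_db=2.0):
    """Input level at which the gain has dropped by `reduction_db` below the linear gain.

    Above the knee the gain falls by (input - knee) * (1 - 1 / CRstatic).
    """
    return knee_db + reduction_db / (1.0 - 1.0 / cr_static)


if __name__ == "__main__":
    threshold = compression_threshold_db()
    print("compression threshold (dB):", threshold)
    print("gain at threshold (dB):", gain_db(threshold))
```

In this sketch, a 3-dB increase in input level above the knee produces a 1-dB increase in output level, which is the meaning of a compression ratio of 3.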
Typically, the speed of response of an AGC system is measured by using as an input a sound whose level changes abruptly between two values, normally 55 dB SPL and 90 dB SPL (ANSI, 2003; see Figure 2). When the sound level abruptly increases, the gain decreases, but this takes time to occur. Hence, the output of the system shows an initial “overshoot,” followed by a decline to a steady value. The time taken for the output to get within 3 dB of its steady value is called the attack time, ta. This is illustrated in Figure 2 for a system with short ta (middle) and a system with longer ta (bottom). When the sound level abruptly decreases, the gain increases, but again this takes time to occur. Hence, the output of the system shows an initial dip or “undershoot,” followed by an increase to a steady value. The time taken for the output to increase to within 4 dB of its steady value is called the recovery time or release time, tr. This is illustrated in Figure 2 for a system with short tr (middle) and a system with longer tr (bottom).
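The measurement procedure can be illustrated with a small simulation. The sketch below is not a model of any particular hearing aid: it assumes a level detector that smooths the input level (in dB) with a one-pole filter, using separate, hypothetical time constants for rising and falling levels, and a sharp-knee static gain rule with illustrative parameters. It then applies a 55-to-90-to-55 dB step and reads off the times taken for the output to come within 3 dB (attack) and 4 dB (release) of its steady value. Note that the detector time constants are not the same thing as ta and tr; the measured times also depend on the size of the step and on the compression ratio.

```python
import math


def simulate_step_response(tau_attack_ms, tau_release_ms, dt_ms=0.1,
                           low_db=55.0, high_db=90.0, hold_ms=1000.0,
                           linear_gain_db=25.0, knee_db=45.0, cr_static=3.0):
    """Return output levels (dB) for a high-level segment followed by a low-level one."""
    def static_gain(level_db):
        # Sharp-knee gain rule (assumed for illustration).
        return linear_gain_db - max(0.0, level_db - knee_db) * (1.0 - 1.0 / cr_static)

    n = int(round(hold_ms / dt_ms))
    inputs = [high_db] * n + [low_db] * n
    estimate = low_db  # detector assumed settled at the low level before the step
    outputs = []
    for level in inputs:
        # Use the attack constant when the level rises above the estimate.
        tau = tau_attack_ms if level > estimate else tau_release_ms
        estimate += (1.0 - math.exp(-dt_ms / tau)) * (level - estimate)
        outputs.append(level + static_gain(estimate))
    return outputs, n


def settling_time_ms(outputs, start, stop, tolerance_db, dt_ms=0.1):
    """Time after index `start` at which the output enters, and stays within, the band."""
    steady = outputs[stop - 1]
    first_inside = start
    for i in range(start, stop):
        if abs(outputs[i] - steady) > tolerance_db:
            first_inside = i + 1
    return (first_inside - start) * dt_ms


outputs, n = simulate_step_response(tau_attack_ms=5.0, tau_release_ms=50.0)
attack_ms = settling_time_ms(outputs, 0, n, tolerance_db=3.0)
release_ms = settling_time_ms(outputs, n, 2 * n, tolerance_db=4.0)
print(f"attack time ~ {attack_ms:.1f} ms, release time ~ {release_ms:.1f} ms")
```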
AGC systems can be divided into two broad classes. The first class is intended to adjust the gain automatically for different listening situations. Essentially, such systems are intended to relieve the user of the need to adjust the volume control. The gain is changed slowly with changes in input sound level; this is achieved by making the recovery time, or both the recovery time and the attack time, relatively long (usually tr is between 500 ms and 20 s). These systems are often referred to as automatic volume control (AVC). The compression ratio in such systems can be high (if the fitting philosophy is to present all sounds at a comfortable level) or more moderate (if the fitting philosophy is to give some impression of the overall level of sounds in the environment).
The second class of AGC system is intended to make the hearing-impaired person's perception of loudness more like that of a normal listener. Loudness recruitment behaves like a fast-acting multichannel expansion (Moore & Glasberg, 1993; Villchur, 1974) so restoration of loudness perception to normal would require fast-acting multichannel compression in principle. Systems with this goal have short attack and recovery times (ta is 0.5–20 ms and tr is 5–200 ms). They are often referred to as “fast-acting compressors” or “syllabic compressors,” because the gain changes over times comparable with the durations of individual syllables in speech. Fast-acting AGC systems usually have lower compression ratios and lower compression thresholds than AVC systems. Fast-acting systems with these characteristics are sometimes called “wide dynamic range” compressors, as the compression operates over a wide range of input sound levels. High compression ratios (greater than about 3) are avoided, as these have been shown to have deleterious effects on speech intelligibility (Souza, 2002; Verschuure et al., 1996). These systems usually split the signal into a number of frequency channels (from 2 to about 22), and compression is usually applied independently in each channel. In some aids, the compression may be linked across channels, either as a design feature or because the channels overlap in frequency.
It should be noted that the attack and recovery times of a compression system do not completely specify the temporal characteristics of that system; the overall system design also plays an important role. For example, when the input signal is amplitude modulated, a compressor will reduce the modulation depth at its output by an amount that decreases with increasing modulation rate. However, two compression systems with identical attack and recovery times and identical values of CRstatic may, nevertheless, differ in the extent to which they reduce the depth of amplitude modulation as a function of modulation frequency (Braida, Durlach, De Gennaro, Peterson, & Bustamante, 1982; Stone & Moore, 1992, 2003, 2004).
In this article, I first consider ways in which AVC systems have been modified to improve their effectiveness. Then I discuss the theoretical advantages and disadvantages of each class of compression system and consider experimental data comparing the relative benefits of the two classes. Finally, I consider the possibility, espoused especially by Stuart Gatehouse and his coworkers (Gatehouse, Naylor, & Elberling, 2006a, 2006b), that requirements may differ across individuals and that it may not be valid to assume that one set of time constants fits all.
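The dependence of modulation reduction on modulation rate can be illustrated with a simple level-domain simulation. This is only a sketch under strong assumptions: the input level is taken to vary sinusoidally in dB, the level detector is a one-pole smoother with hypothetical attack and release constants, and the compression ratio applies at all levels. Real systems differ in their detector design, which is one reason why two systems with the same nominal time constants can behave differently.

```python
import math


def output_swing_db(mod_rate_hz, input_swing_db=20.0, mean_db=70.0,
                    tau_attack_ms=5.0, tau_release_ms=50.0, cr_static=3.0,
                    dt_ms=0.1, duration_ms=2000.0):
    """Peak-to-trough output level (dB) for a level modulated at `mod_rate_hz`."""
    estimate = mean_db
    n = int(round(duration_ms / dt_ms))
    outputs = []
    for i in range(n):
        t_s = i * dt_ms / 1000.0
        level = mean_db + 0.5 * input_swing_db * math.sin(2.0 * math.pi * mod_rate_hz * t_s)
        # One-pole level detector with separate attack and release constants.
        tau = tau_attack_ms if level > estimate else tau_release_ms
        estimate += (1.0 - math.exp(-dt_ms / tau)) * (level - estimate)
        # Gain relative to the gain at the mean level (compression everywhere).
        gain = -(estimate - mean_db) * (1.0 - 1.0 / cr_static)
        outputs.append(level + gain)
    settled = outputs[n // 2:]  # discard the start-up portion
    return max(settled) - min(settled)


for rate_hz in (2, 8, 32):
    print(rate_hz, "Hz:", round(output_swing_db(rate_hz), 1), "dB peak-to-trough")
```

In this sketch the output swing at a slow modulation rate is well below the 20-dB input swing, whereas at a fast rate the detector cannot follow the fluctuations and much more of the input swing survives.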
Improving the Effectiveness of AVC Systems
A problem with simple forms of AVC is that ta and tr are chosen as a compromise between conflicting requirements. For speech with a fixed average level, the level may fluctuate markedly from moment to moment and may drop to very low values during brief pauses in the speech. If the gain changes significantly during the speech itself, or during the pauses in the speech, then “breathing” or “pumping” noises may be heard that are objectionable to the user. To reduce this problem, both ta and tr should be long.
On the other hand, it is important to protect the user from intense transient sounds, such as a door slamming, or from sudden increases in sound level, such as an alarm being sounded. This requires the gain to decrease rapidly, which can be dealt with by making ta short (1–10 ms) while keeping tr moderately long (300–2,000 ms). However, such a system has the disadvantage that the gain drops to a low value immediately after an intense transient; the hearing aid effectively goes “dead” for a while. A further problem is that a recovery time of a few hundred milliseconds is not sufficiently long to prevent “breathing” sounds from being heard; a recovery time of more than 1,000 ms may be needed to achieve this.
An AVC system developed in my laboratory provides a better solution to these problems (Moore & Glasberg, 1988; Moore, Glasberg, & Stone, 1991; Stone, Moore, Alcántara, & Glasberg, 1999). This system, referred to as “dual front-end AGC,” involves the generation of two gain-control signals, one with long attack and recovery times and the other with shorter attack and recovery times. Normally, the operation of the system is determined by the slow-acting control signal, and the gain changes slowly enough to avoid breathing sounds during pauses in speech (indeed, for the digital implementation of the system, there is a “hold-off time” of 500–500 ms, during which the gain does not change at all when the input level drops; this prevents changes in gain during brief pauses in speech). However, if there is a sudden increase in sound level, so that the momentary output level increases by more than 8 dB above the “running” level determined by the slow system, then the fast-acting control signal rapidly reduces the gain, thus avoiding uncomfortable loudness. If the increase in sound level is brief, the gain returns to the original value determined by the overall level of the speech. A digital implementation of this system is described in Stone et al. (1999), and it has been used both in hearing aids and in cochlear implants.
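The logic of combining a slow and a fast control signal can be sketched as follows. This is a simplified, hypothetical illustration and not the implementation described by Stone et al. (1999): it works on levels in dB rather than on waveforms, applies the 8-dB takeover criterion to the input level estimates, and uses invented time constants and an illustrative hold-off of 500 ms. The class name and all parameters are assumptions made for the example.

```python
import math


class DualFrontEndSketch:
    """Level-domain sketch of a slow control signal with a fast override."""

    def __init__(self, dt_ms=1.0, start_level_db=65.0,
                 slow_attack_ms=1000.0, slow_release_ms=3000.0,
                 fast_attack_ms=2.0, fast_release_ms=100.0,
                 takeover_db=8.0, hold_off_ms=500.0):
        self.dt_ms = dt_ms
        self.slow = start_level_db
        self.fast = start_level_db
        self.slow_taus = (slow_attack_ms, slow_release_ms)
        self.fast_taus = (fast_attack_ms, fast_release_ms)
        self.takeover_db = takeover_db
        self.hold_off_ms = hold_off_ms
        self.hold_left_ms = hold_off_ms

    def _smooth(self, estimate, level_db, tau_ms):
        return estimate + (1.0 - math.exp(-self.dt_ms / tau_ms)) * (level_db - estimate)

    def step(self, level_db):
        """Advance one time step and return the level that controls the gain."""
        fast_tau = self.fast_taus[0] if level_db > self.fast else self.fast_taus[1]
        self.fast = self._smooth(self.fast, level_db, fast_tau)

        if level_db >= self.slow:
            # Level at or above the running level: re-arm the hold-off timer.
            self.hold_left_ms = self.hold_off_ms
            self.slow = self._smooth(self.slow, level_db, self.slow_taus[0])
        elif self.hold_left_ms > 0.0:
            # Hold-off: the slow estimate (and hence the gain) is frozen.
            self.hold_left_ms -= self.dt_ms
        else:
            self.slow = self._smooth(self.slow, level_db, self.slow_taus[1])

        # The fast signal takes over only when it exceeds the slow one by the margin.
        return max(self.slow, self.fast - self.takeover_db)


agc = DualFrontEndSketch()
# 200 ms of speech at 65 dB, a 20-ms transient at 100 dB, more speech, then a pause.
levels = [65.0] * 200 + [100.0] * 20 + [65.0] * 300 + [30.0] * 200
control = [agc.step(level) for level in levels]
print("before transient:", round(control[199], 1))
print("end of transient:", round(control[219], 1))
print("300 ms after transient:", round(control[519], 1))
print("end of 200-ms pause:", round(control[719], 1))
```

The returned control level would then be passed through a static gain rule. In the sketch, the transient drives the control level up quickly, the control level returns close to its earlier value once the transient has passed, and it does not move during the brief pause.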
An alternative approach is adaptive dynamic range optimization (ADRO; Blamey, 2005). In ADRO processing, the level of sounds is controlled independently in a large number (usually 32 or 64) of narrow frequency bands. The distribution of levels is estimated in each band and expressed as percentiles. For example, the 90th percentile is the level that is exceeded 10% of the time, and the 30th percentile is the level that is exceeded 70% of the time. These percentiles provide measures of the dynamic range of the input signal that are compared with the dynamic range of the listener's hearing. The latter is estimated for each band using measures of threshold (denoted T) and loudness (with interpolation to determine values for center frequencies between the standard audiometric frequencies). There are three target levels for each center frequency: the maximum output level (denoted M), the comfort level (denoted C), and the audibility level (denoted A), which may be defined as C – 20 dB, or T, whichever is the greater. The gain is usually slowly adapted for each center frequency so as to satisfy a series of “rules.” The rules are prioritized, and fuzzy logic is used to decide which rule should “win” when the results of the different rules are incompatible. The rules are as follows:
The output level of each band should always be less than the M level. Fast-output compression limiting is used to achieve this, when necessary.
The 90th percentile should be less than the C level. The gain is reduced slowly if this rule is not satisfied, to keep sounds comfortable.
The 30th percentile should be greater than the A level. The gain is increased slowly if this rule is not satisfied to keep sounds audible.
There is a limit to the gain applied in each frequency band to avoid overamplification of low-level background noise.
The maximum rate of decrease of gain is about 9 dB/s, whereas the maximum rate of increase of gain is about 3 dB/s. This difference allows the gain to be decreased reasonably quickly in response to increases in sound level but avoids increases in gain during brief pauses in speech, thus avoiding the “pumping” noises described earlier. When all four rules are satisfied, the gain remains unchanged and ADRO acts in a similar way to a linear amplifier. Like the dual-front-end AGC system, ADRO has been implemented in both hearing aids and cochlear implants, although the rules described above differ somewhat for the two cases.
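The four rules and the rate limits can be sketched for a single frequency band as shown below. This is a hedged illustration only: a fixed priority order (comfort before audibility) stands in for the fuzzy-logic arbitration, the percentiles are computed from a stored list of recent input levels using a nearest-rank method, and all function and parameter names are hypothetical rather than taken from any ADRO implementation.

```python
def percentile_level(levels_db, percent):
    """Nearest-rank percentile: the level exceeded about (100 - percent)% of the time."""
    ordered = sorted(levels_db)
    # Integer arithmetic avoids floating-point rounding when computing the rank.
    rank = max(1, (int(percent) * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def update_band_gain(gain_db, recent_input_levels_db, comfort_db, audibility_db,
                     max_gain_db, dt_s, max_decrease_db_per_s=9.0,
                     max_increase_db_per_s=3.0):
    """Apply the comfort, audibility, and maximum-gain rules for one time step."""
    out_90 = percentile_level(recent_input_levels_db, 90) + gain_db
    out_30 = percentile_level(recent_input_levels_db, 30) + gain_db
    if out_90 > comfort_db:
        # Comfort rule: reduce the gain, at most about 9 dB/s.
        gain_db -= min(max_decrease_db_per_s * dt_s, out_90 - comfort_db)
    elif out_30 < audibility_db:
        # Audibility rule: increase the gain, at most about 3 dB/s.
        gain_db += min(max_increase_db_per_s * dt_s, audibility_db - out_30)
    # Maximum-gain rule: avoid overamplifying low-level background noise.
    return min(gain_db, max_gain_db)


def limit_output(output_db, max_output_db):
    """Fast output limiting: the band output never exceeds the M level."""
    return min(output_db, max_output_db)


if __name__ == "__main__":
    levels = [float(v) for v in range(1, 101)]  # made-up input levels, 1 to 100 dB
    print(update_band_gain(0.0, levels, comfort_db=80.0, audibility_db=20.0,
                           max_gain_db=40.0, dt_s=0.1))
```

When neither the comfort rule nor the audibility rule is violated and the gain is below its limit, the function returns the gain unchanged, which corresponds to the linear-amplifier behavior noted above.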
It should be emphasized that the dual-front-end AGC system and ADRO are not the only slow-acting systems that are available. There are commercial hearing aids that incorporate simpler slow-acting systems (i.e., with long attack and recovery times) and a variety of systems with dual time constants rather like the dual-front-end AGC but differing in the details of their implementation.
Advantages and Disadvantages of Slow and Fast Compression
In this section, I discuss some of the advantages and disadvantages of slow and fast compression systems. However, it should be noted that these advantages and disadvantages are not influenced solely by the attack and recovery times of the systems but also by other aspects of the compressor design and of the overall system design, such as the method used to divide the input signal into channels.
For slow-acting systems such as the dual front-end AGC system and ADRO, the following are the advantages:
If desired, speech can be delivered at a comfortable level, regardless of the input level, by use of a high compression ratio.
The temporal envelope of the speech is hardly distorted; envelope fluctuations at syllabic rates are preserved. This may be important for maintaining speech intelligibility (Drullman, Festen, & Plomp, 1994).
Short-term changes in the spectral pattern of sounds, which convey information in speech (Kluender, Coady, & Kiefte, 2003), are not distorted because the pattern of gains across frequency changes only slowly with time.
Harmonic and intermodulation distortion are minimal.
Protection is provided from intense brief transients with little effect on the long-term gain.
Short-term level changes are preserved, so cues for sound localization based on interaural level differences are not markedly disrupted.
The following are the disadvantages of slow-acting systems such as the dual front-end AGC system and ADRO:
Loudness perception is not restored to “normal.” Indeed, because the output level typically changes only slightly with input level, it may be difficult for the user to judge the strength of sound sources—for example, the volume setting on a television or radio. This may have adverse effects on the interpretation of environmental sounds (Gatehouse & Noble, 2004).
The systems may not deal very effectively with situations in which two voices alternate with markedly different levels.
When the user moves from a situation with high sound levels to one with lower levels (e.g., when leaving a noisy room), the gain takes a second or two to reach the value appropriate for the new situation. Hence, the aid may appear to become “dead” for a while.
When trying to listen to one (target) voice in the presence of another (background) voice, a normally hearing person can extract information about the target during the temporal dips in the background (Duquesnoy, 1983). This process is called listening in the dips. The information in the dips may be at a relatively low level, especially when the mean target level is lower than the mean background level. Hearing-impaired people have a reduced ability to listen in the dips (Duquesnoy, 1983; Peters, Moore, & Baer, 1998), partly because of reduced audibility of the target speech in the dips (Bacon, Opie, & Montoya, 1998). Slow-acting AGC may be of limited benefit in this situation because the gain does not increase significantly during brief dips in the input signal; the gain applied during the dips is essentially the same as the gain applied during the peaks in the input.
The advantages of fast-acting multichannel compression are as follows:
It can, at least to a reasonable approximation, restore loudness perception to “normal.” However, this is not quite achieved. When a person has recruitment, an amplitude-modulated sound appears to fluctuate more than normal (Moore, Wojtczak, & Vickers, 1996). This is true for modulation rates up to at least 32 Hz. Even at the short end of the range of time constants used, fast compression does not reduce the depth of amplitude modulation for rates greater than about 10 Hz (Moore, Stone, & Alcántara, 2001; Stone & Moore, 1992, 2003, 2004). Thus, dynamic aspects of loudness perception are not fully restored to normal. However, this lack of full restoration probably has only minor consequences for the overall loudness of speech, because the most prominent amplitude modulations in speech occur for rates less than 10 Hz (Plomp, 1983).
If multiple channels are used, fast-acting compression can compensate for frequency-dependent changes in the degree of loudness recruitment more effectively than slow-acting compression. Although slow-acting compression can apply gain that is appropriate for the average level of the speech in each frequency channel, fast-acting compression can also compensate for the short-term changes in speech level.
Fast compression can restore the audibility of weak sounds rapidly following intense sounds. This at least provides the potential for listening in the dips—that is, for getting “glimpses” of a target sound in the presence of a fluctuating background sound (Moore, Peters, & Stone, 1999). It also improves the ability to detect a weak consonant following a relatively intense vowel.
Fast compression can give good results when two voices alternate with markedly different levels.
The disadvantages of fast-acting multichannel compression are as follows:
It can introduce spurious changes in the shape of the temporal envelope of sounds (e.g., overshoot and undershoot effects; Stone & Moore, 2008), although delaying the audio signal by a small amount relative to the gain-control signal can reduce such effects (Robinson & Huntington, 1973; Stone et al., 1999).
It can introduce spurious changes in amplitude of sounds gliding in frequency, such as formants, as those sounds traverse the boundary between two channels. This happens mainly for systems in which the compression channels are formed using sharp, nonoverlapping filters. The size of the effect depends on the speed of the glide relative to the attack and recovery times of the compressor. The effect does not occur for systems in which the filters used to form the compression channels overlap and have rounded tops and sloping edges (Lindemann & Worrall, 2000). Hence, this effect is not necessarily a weakness of fast compression systems.
It reduces intensity contrasts and the modulation depth of speech, which may have an adverse effect on the perception of certain speech cues, especially when high compression ratios are used (Plomp, 1988). However, reduction of modulation depth by up to a factor of two has no marked adverse effects (Noordhoek & Drullman, 1997; van Buuren, Festen, & Houtgast, 1999). Hence, provided the compression ratios are restricted to moderate values, as is usually the case for commercial hearing aids, this effect of compression is not important.
If many channels are used, then the availability of spectral cues, such as formant frequencies, may be reduced. In a multichannel hearing aid with fast-acting compression in many channels, the spectrum is flattened, reducing spectral contrasts. This compounds difficulties produced by the reduced frequency selectivity that is typically associated with cochlear hearing loss (Moore, 2007; Pick, Evans, & Wilson, 1977). In addition, short-term changes in the spectral pattern of sounds may be distorted because the pattern of gains across frequency changes rapidly with time.
When the input signal to the system is a mixture of voices from different talkers, fast-acting compression introduces “cross-modulation” between the voices because the time-varying gain of the compressor is applied to the mixture of voices (Stone & Moore, 2004, 2007, 2008). Voices that are independently amplitude modulated at the input to the compressor acquire a common component of modulation at the output. This decreases the ability to perceptually segregate the voices and leads to reduced speech intelligibility under conditions when envelope cues are of primary importance (Stone & Moore, 2004, 2008).
Under conditions in everyday life when moderate levels of background sound are present (e.g., noise from computer fans or ventilation systems), fast compression makes the world sound noisier, and this can be annoying, especially for a person who is not used to wearing a hearing aid (Laurence, Moore, & Glasberg, 1983). When the number of channels is small, steady background noises may appear to be modulated by “foreground” sounds, such as speech. This can also be annoying. However, this effect is reduced when the number of channels is increased.
Cues for sound localization based on interaural-level differences may be disrupted by the independent action of the AGC at the two ears (Van den Bogaert, Klasen, Moonen, Van Deun, & Wouters, 2006). This can also happen with slow-acting AGC but to a lesser extent. The effect can be avoided altogether by synchronization of the AGC action across the two ears.
When the compression is very fast, it can introduce harmonic and intermodulation distortion, especially if the gain changes significantly over a duration comparable to one period of the center frequency of interest (Tan & Moore, 2004). However, systems can be designed for which the perceptual effects of such distortion are minimal (Stone et al., 1999), and in practice, harmonic and intermodulation distortion are not usually a significant problem in commercial hearing aids (Moore et al., 2001). In some hearing aids incorporating fast-acting compression, the time constants increase with decreasing channel center frequency to avoid distortion.
Unfortunately, the relative importance of these different factors is hard to assess. Also, the hearing aids that are currently available do not all fall clearly into the categories of slow or fast; intermediate time constants are used in some devices.
In some hearing aids, the compression is relatively fast at low frequencies and is slow at high frequencies. The rationale behind this approach is that the fast compression at low frequencies helps to avoid the masking of high-frequency sounds by short-term peaks in energy at low frequencies, whereas the slow compression at high frequencies avoids distortions in the temporal envelope. However, experimental evaluations of this approach (Lunner, Hellgren, Arlinger, & Elberling, 1997, 1998) have not revealed consistent benefits for the intelligibility of speech in noise or in terms of sound quality.
Experimental Studies of the Effects of Compression Speed
There have been several reviews of the effectiveness of compression systems with differing speeds (Dillon, 1996; Gatehouse et al., 2006a; Hickson, 1994; Moore, 1990, 2007; Souza, 2002). The reviews have highlighted the great diversity of results across studies. For example, Gatehouse et al. (2006a) reviewed 13 studies that had specifically investigated the effect of varying time constants. Of these studies, 4 showed no effect, 3 showed that fast compression was superior to slow compression, 3 showed that slow compression was superior to fast compression, and 3 showed that the “best” time constants varied across participants. One reason for the diversity of results across studies is that a great variety of evaluation criteria were used. Some studies focused on speech intelligibility in quiet, some on speech intelligibility in noise, some on subjective ratings of sound quality or speech intelligibility, and some on paired-comparison judgments of preference. A problem with several studies is that the amount of compression was not adjusted to suit the hearing loss of the individual being tested. Often, the amount of compression (as determined by the compression ratio) was greater than would typically be used in practice and did not vary with frequency in an appropriate way. Also, the number of listeners tested was often rather small.
Gatehouse, Naylor, and Elberling (2003) presented evidence supporting the idea that the relative benefit of fast and slow compression depends both on the characteristics of the individual listener and on their auditory environment, which was termed auditory ecology. This drew on earlier work (Gatehouse, Naylor, & Elberling, 1999) showing that preferences for compression amplification over linear amplification were related to the range of sound levels encountered over successive 10-min periods in everyday life; individuals who encountered a wide range of sound levels showed stronger preferences for compression. Gatehouse et al. (2003) presented preliminary analyses from a study (fully described in Gatehouse et al., 2006a, 2006b) that overcame many of the problems of earlier studies. They tested 50 listeners in a within-subjects, randomized, blind, crossover evaluation of five different hearing aid signal-processing schemes, two with linear amplification and three with two-channel AGC differing only in release time constants. The frequency-dependent gain and compression were adjusted appropriately for each listener, and a wide variety of outcome measures were used. These included measures of speech intelligibility in both steady noise and fluctuating noise (ICRA noise; Dreschler, Verschuure, Ludvigsen, & Westermann, 2001). They also measured the cognitive ability of their listeners using a visual digit-monitoring task and a visual letter-monitoring task. Remarkably, they found a significant interaction between cognitive ability, the temporal characteristics of the noise (steady or fluctuating), and the time constants of the compression. They reported, “Listeners with greater cognitive ability derive greater benefit from temporal structure in background noise when listening via fast time constants, one of whose effects is to facilitate ‘listening in the gaps’” (p. S77).
In other words, the ability to listen in the dips could be facilitated by fast compression, as found previously by Moore et al. (1999), but this happened mainly for listeners with greater cognitive ability.
In further analyses based on the same study, Gatehouse et al. (2006a) reported that slow-acting compression was preferred to fast-acting compression in terms of subjective listening comfort, whereas for reported and measured speech intelligibility, the converse was true. However, in terms of overall satisfaction, there was no significant effect of compression speed. This may help in explaining the inconclusive outcome of several of the studies reviewed by Gatehouse et al. (2006a). Gatehouse et al. (2006a) also reported that there were clear individual differences in the patterns of benefit. They concluded that slow compression (AVC) was a “safe” option, as it was often good for listening comfort and typically gave moderate scores for self-reported or measured speech intelligibility. In contrast, fast compression led to a wider range of scores for measured intelligibility, often giving better scores than slow compression but sometimes giving poorer scores.
The Importance of Dip Listening
As described above, Gatehouse et al. (2003) reported marked individual differences in the ability of hearing-impaired listeners to take advantage of dips in a background sound when trying to understand a target talker. Dip listening is important because many situations where communication difficulties are experienced involve fluctuating background sounds. Recent evidence suggests that the benefit obtained from listening in the dips may be related to the ability to process the temporal fine structure (TFS) of sounds, as represented in the patterns of phase locking in the auditory nerve. The concept here is that changes in the TFS during dips in the background help the listener to determine that target speech is present and to determine what the properties of the target speech are (Moore, 2008). There is evidence that moderate cochlear hearing loss reduces or abolishes the ability to process TFS (Hopkins & Moore, 2007; Moore, Glasberg, & Hopkins, 2006). This may largely account for the fact that the benefit for speech intelligibility obtained from dip listening is markedly smaller for people with cochlear hearing loss than for normally hearing people (Duquesnoy, 1983; Peters et al., 1998).
Lorenzi, Gilbert, Carn, Garnier, and Moore (2006) studied the role of TFS cues in speech perception by processing speech so as to remove envelope cues as far as possible while preserving TFS cues. They did this by filtering the signal into 16 contiguous frequency bands, extracting the envelope in each band, and dividing the signal in each band by the envelope (Smith, Delgutte, & Oxenham, 2002). The resulting signal in each band had a constant envelope amplitude but a time-varying TFS. The band signals were then recombined. For comparison, they also measured performance with unprocessed speech and speech processed to preserve envelope cues while removing TFS cues (using a tone vocoder).
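The envelope-removal step can be illustrated for a single band. The sketch below is an assumption-laden toy example, not the processing used in the study: it omits the filtering into 16 bands, uses a synthetic amplitude-modulated tone as the band signal, and obtains the envelope as the magnitude of the analytic signal, computed here with a slow but dependency-free discrete Fourier transform. All names and signal parameters are hypothetical.

```python
import cmath
import math


def analytic_signal(samples):
    """Analytic signal via a direct DFT (O(n^2); adequate for a short example)."""
    n = len(samples)
    spectrum = [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or 2 * k == n:
            weight = 1.0   # keep DC and Nyquist components unchanged
        elif 2 * k < n:
            weight = 2.0   # double the positive frequencies
        else:
            weight = 0.0   # discard the negative frequencies
        spectrum[k] *= weight
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def remove_envelope(band_signal, floor=1e-9):
    """Divide a band signal by its envelope, leaving a constant-envelope TFS signal."""
    envelope = [abs(z) for z in analytic_signal(band_signal)]
    tfs = [x / max(e, floor) for x, e in zip(band_signal, envelope)]
    return tfs, envelope


# Synthetic band signal: a carrier (32 cycles) with 50% amplitude modulation (4 cycles).
n = 256
carrier_cycles, modulator_cycles = 32, 4
band = [(1.0 + 0.5 * math.cos(2.0 * math.pi * modulator_cycles * t / n))
        * math.cos(2.0 * math.pi * carrier_cycles * t / n) for t in range(n)]
tfs, envelope = remove_envelope(band)
print("envelope range:", round(min(envelope), 3), "to", round(max(envelope), 3))
```

For this synthetic signal the division recovers the unmodulated carrier, so the amplitude fluctuations are removed while the fine structure is kept. In a full implementation, the step would be applied to each band before the bands are recombined.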
The ability to identify nonsense syllables presented in quiet was measured for three groups of subjects: a normally hearing group, and a young group and an elderly group with mild-to-moderate flat hearing loss. For the intact speech, the normally hearing group achieved perfect scores, and both groups of hearing-impaired subjects performed nearly as well. After moderate training, all groups achieved high scores (about 90% correct) for the speech with envelope cues. After more extensive training, the normally hearing group also achieved about 90% correct for the speech with mainly TFS cues. However, both hearing-impaired groups performed much more poorly, with most subjects scoring close to the level that would be expected from the use of cues such as overall duration. These results indicate that moderate hearing loss causes a dramatic deterioration in the ability to use TFS cues for speech perception.
In a second experiment, Lorenzi et al. (2006) tested only the subjects from the young hearing-impaired group. The stimuli were intact (unprocessed) nonsense syllables presented in a steady background noise and in a background noise that was sinusoidally amplitude modulated at an 8-Hz rate with a 100% depth. The speech-to-noise ratio was fixed individually at the level yielding about 50% correct identification for the speech in steady noise. The difference in scores for the two types of noise provides a measure of the ability to listen in the dips of the modulated background, called masking release. The amount of masking release was found to be highly correlated (r = .83) with scores obtained in the first experiment using speech processed to preserve mainly TFS cues. In other words, subjects who performed relatively well when listening to speech in quiet processed to contain mainly TFS cues showed a relatively large masking release when listening to intact speech. However, the amount of masking release for these subjects was markedly smaller than is typically found for normally hearing subjects under similar conditions (Füllgrabe, Berthommier, & Lorenzi, 2006; Gustafsson & Arlinger, 1994). Overall, the pattern of the results supports the hypothesis that listening in the dips depends on the use of TFS information and that the greatly reduced ability of hearing-impaired subjects to listen in the dips is partly a result of the loss of ability to use TFS cues.
Choosing Compression Speed for the Individual
The results described above may have important implications for the choice of compression speed in hearing aids. An individual who has little or no ability to process TFS information will rely largely on temporal envelope cues in different frequency channels to understand speech. Fast-acting compression can disrupt envelope cues, as described earlier. This decreases the ability to perceptually segregate two or more voices and leads to reduced speech intelligibility under conditions in which TFS cues are removed by the use of a noise vocoder (Stone & Moore, 2004, 2008). Hence, for an individual with little or no ability to process TFS information, slow-acting compression might be more effective than fast-acting compression.
For a hearing-impaired individual who retains some ability to process TFS, the situation is different. Fast-acting multichannel compression can help restore the audibility of low-level portions of signals (the dips), and information derived from TFS can be used to extract “glimpses” of the target speech during dips in a background sound. Thus, for an individual who can process TFS, fast-acting multichannel compression may lead to improved intelligibility of speech in the presence of sounds with spectral and/or temporal dips (Moore et al., 1999).
The conclusion from all this is that measures of the ability to use TFS information might be useful in determining the most appropriate speed of compression for a hearing-impaired individual. It is possible that the ability to process TFS is related in a more general way to the speed and accuracy of neural processing in the brain. If this were the case, the ability to process TFS could be related to cognitive abilities. This might explain the link found by Gatehouse et al. (2003) between cognitive abilities and the benefit of fast compression for listening in the dips.
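The selection logic argued for in this section could be sketched as a simple rule. This is only an illustration of the argument: the function name, the 0-to-1 score scale, and the cutoff value are all hypothetical, because the article does not specify a particular TFS test or criterion.

```python
def suggest_compression_speed(tfs_score, cutoff=0.5):
    """Suggest 'fast' or 'slow' compression from a hypothetical TFS score in [0, 1].

    Both the score scale and the default cutoff are assumptions for illustration.
    """
    if not 0.0 <= tfs_score <= 1.0:
        raise ValueError("tfs_score must lie between 0 and 1")
    # Some retained TFS ability: fast compression may restore audibility in the
    # dips, where TFS-based glimpses of the target speech can be extracted.
    # Little or no TFS ability: slow compression better preserves envelope cues.
    return "fast" if tfs_score >= cutoff else "slow"


print(suggest_compression_speed(0.8))
print(suggest_compression_speed(0.1))
```

In practice any such cutoff would have to be established empirically for the specific TFS test used.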
References
- American National Standards Institute (ANSI). (2003). ANSI S3.22-2003, Specification of hearing aid characteristics. New York: American National Standards Institute.
- Bacon S. P., Opie J. M., Montoya D. Y. (1998). The effects of hearing loss and noise masking on the masking release for speech in temporally complex backgrounds. Journal of Speech, Language and Hearing Research, 41, 549–563.
- Blamey P. J. (2005). Adaptive dynamic range optimization (ADRO): A digital amplification strategy for hearing aids and cochlear implants. Trends in Amplification, 9, 77–98.
- Braida L. D., Durlach N. I., De Gennaro S. V., Peterson P. M., Bustamante D. K. (1982). Review of recent research on multiband amplitude compression for the hearing impaired. In Studebaker G. A., Bess F. H. (Eds.), The Vanderbilt hearing-aid report (pp. 133–140). Upper Darby, PA: Monographs in Contemporary Audiology.
- Dillon H. (1996). Compression? Yes, but for low or high frequencies, for low or high intensities, and with what response times? Ear & Hearing, 17, 287–307.
- Dreschler W. A., Verschuure H., Ludvigsen C., Westermann S. (2001). ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment. Audiology, 40, 148–157.
- Drullman R., Festen J. M., Plomp R. (1994). Effect of temporal envelope smearing on speech reception. Journal of the Acoustical Society of America, 95, 1053–1064.
- Duquesnoy A. J. (1983). Effect of a single interfering noise or speech source on the binaural sentence intelligibility of aged persons. Journal of the Acoustical Society of America, 74, 739–743.
- Fowler E. P. (1936). A method for the early detection of otosclerosis. Archives of Otolaryngology, 24, 731–741.
- Füllgrabe C., Berthommier F., Lorenzi C. (2006). Masking release for consonant features in temporally fluctuating background noise. Hearing Research, 211, 74–84.
- Gatehouse S., Noble W. (2004). The Speech, Spatial and Qualities of Hearing Scale (SSQ). International Journal of Audiology, 43, 85–99.
- Gatehouse S., Naylor G., Elberling C. (1999). Aspects of auditory ecology and psychoacoustic function as determinants of benefit from and candidature for non-linear processing in hearing aids. In Rasmussen A. N., Osterhammel P. A., Anderson T., Poulsen T. (Eds.), Auditory models and non-linear hearing instruments (pp. 221–233). Copenhagen, Denmark: Holmens Trykkeri.
- Gatehouse S., Naylor G., Elberling C. (2003). Benefits from hearing aids in relation to the interaction between the user and the environment. International Journal of Audiology, 42(Suppl. 1), S77–S85.
- Gatehouse S., Naylor G., Elberling C. (2006a). Linear and nonlinear hearing aid fittings—1. Patterns of benefit. International Journal of Audiology, 45, 130–152.
- Gatehouse S., Naylor G., Elberling C. (2006b). Linear and nonlinear hearing aid fittings—2. Patterns of candidature. International Journal of Audiology, 45, 153–171.
- Gustafsson H. Å., Arlinger S. D. (1994). Masking of speech by amplitude-modulated noise. Journal of the Acoustical Society of America, 95, 518–529.
- Hickson L. M. H. (1994). Compression amplification in hearing aids. American Journal of Audiology, 3, 51–65.
- Hopkins K., Moore B. C. J. (2007). Moderate cochlear hearing loss leads to a reduced ability to use temporal fine structure information. Journal of the Acoustical Society of America, 122, 1055–1068.
- Kluender K. R., Coady J. A., Kiefte M. (2003). Sensitivity to change in perception of speech. Speech Communication, 41, 59–69.
- Laurence R. F., Moore B. C. J., Glasberg B. R. (1983). A comparison of behind-the-ear high-fidelity linear aids and two-channel compression hearing aids in the laboratory and in everyday life. British Journal of Audiology, 17, 31–48.
- Lindemann E., Worrall T. L. (2000). Continuous frequency dynamic range audio compressor (U.S. Patent No. 6,097,824). Washington, DC: U.S. Patent and Trademark Office.
- Lorenzi C., Gilbert G., Carn C., Garnier S., Moore B. C. J. (2006). Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proceedings of the National Academy of Sciences of the United States of America, 103, 18866–18869.
- Lunner T., Hellgren J., Arlinger S., Elberling C. (1997). A digital filterbank hearing aid: Three digital signal processing algorithms—user preference and performance. Ear & Hearing, 18, 373–387.
- Lunner T., Hellgren J., Arlinger S., Elberling C. (1998). Non-linear signal processing in digital hearing aids. Scandinavian Audiology, 27(Suppl. 49), 40–49.
- Moore B. C. J. (1990). How much do we gain by gain control in hearing aids? Acta Otolaryngologica, (Suppl. 469), 250–256.
- Moore B. C. J. (2004). Testing the concept of softness imperception: Loudness near threshold for hearing-impaired ears. Journal of the Acoustical Society of America, 115, 3103–3111.
- Moore B. C. J. (2007). Cochlear hearing loss: Physiological, psychological and technical issues (2nd ed.). Chichester, UK: Wiley.
- Moore B. C. J. (2008). The role of temporal fine structure in normal and impaired hearing. In Dau T. (Ed.), International Symposium on Auditory and Audiological Research. Copenhagen, Denmark: Holmens Trykkeri.
- Moore B. C. J., Glasberg B. R. (1988). A comparison of four methods of implementing automatic gain control (AGC) in hearing aids. British Journal of Audiology, 22, 93–104.
- Moore B. C. J., Glasberg B. R. (1993). Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech. Journal of the Acoustical Society of America, 94, 2050–2062.
- Moore B. C. J., Glasberg B. R., Hopkins K. (2006). Frequency discrimination of complex tones by hearing-impaired subjects: Evidence for loss of ability to use temporal fine structure information. Hearing Research, 222, 16–27.
- Moore B. C. J., Glasberg B. R., Stone M. A. (1991). Optimization of a slow-acting automatic gain control system for use in hearing aids. British Journal of Audiology, 25, 171–182.
- Moore B. C. J., Peters R. W., Stone M. A. (1999). Benefits of linear amplification and multi-channel compression for speech comprehension in backgrounds with spectral and temporal dips. Journal of the Acoustical Society of America, 105, 400–411.
- Moore B. C. J., Stone M. A., Alcántara J. I. (2001). Comparison of the electroacoustic characteristics of five hearing aids. British Journal of Audiology, 35, 307–325.
- Moore B. C. J., Wojtczak M., Vickers D. A. (1996). Effect of loudness recruitment on the perception of amplitude modulation. Journal of the Acoustical Society of America, 100, 481–489.
- Noordhoek I. M., Drullman R. (1997). Effect of reducing temporal intensity modulations on sentence intelligibility. Journal of the Acoustical Society of America, 101, 498–502.
- Peters R. W., Moore B. C. J., Baer T. (1998). Speech reception thresholds in noise with and without spectral and temporal dips for hearing-impaired and normally hearing people. Journal of the Acoustical Society of America, 103, 577–587.
- Pick G., Evans E. F., Wilson J. P. (1977). Frequency resolution in patients with hearing loss of cochlear origin. In Evans E. F., Wilson J. P. (Eds.), Psychophysics and physiology of hearing (pp. 273–281). London: Academic Press.
- Plomp R. (1983). The role of modulation in hearing. In Klinke R., Hartmann R. (Eds.), Hearing—physiological bases and psychophysics (pp. 270–276). Berlin, Germany: Springer.
- Plomp R. (1988). The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation-transfer function. Journal of the Acoustical Society of America, 83, 2322–2327.
- Robinson C. E., Huntington D. A. (1973). The intelligibility of speech processed by delayed long-term averaged compression amplification. Journal of the Acoustical Society of America, 54, 314.
- Smith Z. M., Delgutte B., Oxenham A. J. (2002). Chimaeric sounds reveal dichotomies in auditory perception. Nature, 416, 87–90.
- Souza P. E. (2002). Effects of compression on speech acoustics, intelligibility, and sound quality. Trends in Amplification, 6, 131–165.
- Steinberg J. C., Gardner M. B. (1937). The dependency of hearing impairment on sound intensity. Journal of the Acoustical Society of America, 9, 11–23.
- Stone M. A., Moore B. C. J. (1992). Syllabic compression: Effective compression ratios for signals modulated at different rates. British Journal of Audiology, 26, 351–361.
- Stone M. A., Moore B. C. J. (2003). Effect of the speed of a single-channel dynamic range compressor on intelligibility in a competing speech task. Journal of the Acoustical Society of America, 114, 1023–1034.
- Stone M. A., Moore B. C. J. (2004). Side effects of fast-acting dynamic range compression that affect intelligibility in a competing speech task. Journal of the Acoustical Society of America, 116, 2311–2323.
- Stone M. A., Moore B. C. J. (2007). Quantifying the effects of fast-acting compression on the envelope of speech. Journal of the Acoustical Society of America, 121, 1654–1664.
- Stone M. A., Moore B. C. J. (2008). Effects of spectro-temporal modulation changes produced by multi-channel compression on intelligibility in a competing-speech task. Journal of the Acoustical Society of America, 123, 1063–1076.
- Stone M. A., Moore B. C. J., Alcántara J. I., Glasberg B. R. (1999). Comparison of different forms of compression using wearable digital hearing aids. Journal of the Acoustical Society of America, 106, 3603–3619.
- Tan C. T., Moore B. C. J. (2004, April). Comparison of two forms of fast-acting compression using physical and subjective measures. Paper presented at the Proceedings of the 18th International Congress on Acoustics, Kyoto, Japan.
- van Buuren R. A., Festen J., Houtgast T. (1999). Compression and expansion of the temporal envelope: Evaluation of speech intelligibility and sound quality. Journal of the Acoustical Society of America, 105, 2903–2913.
- Van den Bogaert T., Klasen T. J., Moonen M., Van Deun L., Wouters J. (2006). Horizontal localization with bilateral hearing aids: Without is better than with. Journal of the Acoustical Society of America, 119, 515–526.
- Verschuure J., Maas A. J. J., Stikvoort E., de Jong R. M., Goedegebure A., Dreschler W. A. (1996). Compression and its effect on the speech signal. Ear & Hearing, 17, 162–175.
- Villchur E. (1974). Simulation of the effect of recruitment on loudness relationships in speech. Journal of the Acoustical Society of America, 56, 1601–1611.