Abstract
Auditory compression was estimated at 250 and 4000 Hz by using the additivity of forward masking technique, which measures the effects on signal threshold of combining two temporally nonoverlapping forward maskers. The increase in threshold in the combined-masker condition compared to the individual-masker conditions can be used to estimate compression. The signal was a 250 or 4000 Hz tone burst and the maskers (M1 and M2) were bands of noise. Signal thresholds were measured in the presence of M1 and M2 alone and combined for a range of masker levels. The results were used to derive response functions at each frequency. The procedure was conducted with normal-hearing and hearing-impaired listeners. The results suggest that the response function in normal ears is similar at 250 and 4000 Hz with a midlevel compression exponent of about 0.2. However, compression extends over a smaller range of levels at 250 Hz. The results confirm previous estimates of compression using temporal masking curves (TMCs) without assuming a linear off-frequency reference as in the TMC procedure. The impaired ears generally showed less compression. Importantly, some impaired ears showed a linear response at 250 Hz, providing a further indication that low-frequency compression originates in the cochlea.
INTRODUCTION
It is now well established that the response of the basilar membrane (BM) in the base of the cochlea [the region tuned to high characteristic frequencies (CFs)] is highly compressive. This has been confirmed by direct measurements of BM displacement or velocity in nonhuman mammals (Murugasu and Russell, 1995; Rhode, 1971; Rhode and Recio, 2000; Robles et al., 1986; Ruggero et al., 1997; Russell and Nilsen, 1997) and by indirect behavioral measurements in humans (Nelson et al., 2001; Oxenham and Plack, 1997; Plack and O’Hanlon, 2003). The estimated compression exponent at mid to high levels is typically 0.2 or less, meaning that a 1 dB change in input sound level leads to a 0.2 dB (or less) change in BM response. However, there is some uncertainty regarding the response in the apex of the cochlea (low CFs). Direct measurements in nonhuman mammals suggest that compression may be much reduced in this region with a compression exponent of 0.5 or greater for CFs in the region of 400–800 Hz (Rhode and Cooper, 1996; Zinn et al., 2000).
Most of the recent behavioral techniques have compared the effects of forward maskers at the signal frequency with those well below the signal frequency. Because the response to tones well below the CF of a given place in the base of the cochlea is linear (Ruggero et al., 1997; Russell and Nilsen, 1997), it is assumed that the response to a forward masker well below the frequency of the signal can be used as a linear reference. This is the basis of the growth of masking technique, in which masker level at threshold is measured as a function of signal level or vice versa (Hicks and Bacon, 1999; Moore et al., 1999; Oxenham and Plack, 1997; Rosengard et al., 2005), and the temporal masking curve (TMC) technique, in which the masker level needed to just mask a low-level signal is measured as a function of the temporal gap between the masker and the signal (Lopez-Poveda et al., 2003; Nelson et al., 2001; Plack and Drga, 2003; Rosengard et al., 2005). In both cases, the BM response to the on-frequency masker can be derived by a comparison of the on- and off-frequency masking functions. The comparison leads to an estimate of on-frequency compression by assuming that the BM response to the off-frequency masker is linear and that all other effects (such as neural adaptation and other mechanisms involved in forward masking) are the same for both the on- and off-frequency maskers.
The first assumption, involving linear BM processing of the off-frequency masker, is problematic at low signal frequencies, which are represented near the apex of the cochlea. This is because compression appears to be less frequency selective at the apical end of the cochlea, with compression apparently being applied to a wider range of frequencies above and below the CF of a particular BM location (Plack and Drga, 2003; Rhode and Cooper, 1996). It follows that the off-frequency masker cannot be used as a linear reference at low CFs because the masker may also be compressed to some extent. To circumvent this problem, researchers have used the off-frequency TMC for a high signal frequency (e.g., 4000 Hz) as a linear reference for estimating compression from the TMC for a low on-frequency masker and signal (Lopez-Poveda et al., 2003; Lopez-Poveda et al., 2005; Nelson and Schroder, 2004; Plack and Drga, 2003; Williams and Bacon, 2005). When this is done, estimates of on-frequency compression are similar across a wide range of CFs. Thus, there appears to be a discrepancy between physiological measures in animals, suggesting more linear processing in the apical regions (Rhode and Cooper, 1996; Zinn et al., 2000), and behavioral estimates in humans, suggesting that compression in the apex is similar to that in the base.
Possible reasons for the discrepancy include between-species differences, errors in the physiological measurements, and errors in the behavioral estimates or their underlying assumptions. In the case of the physiology, accessing the apex of the cochlea is technically very challenging and may have led to some damage before measurements were taken. In the case of the psychoacoustic estimates, a number of the assumptions underlying the estimates of compression are subject to challenge, although it is noteworthy that the compression estimates for frequencies in the base of the cochlea are relatively close to those found from direct physiological measurements.
One assumption necessary for using the TMC technique to derive estimates of compression in the apical portion of the cochlea is that the postcochlear decay of forward masking is independent of CF, so that the only process producing differences in the slopes of the TMCs across frequency is the frequency-specific cochlear response to the maskers. Stainsby and Moore (2006) have cast doubt on this assumption recently in a study on listeners with sensorineural hearing loss. Sensorineural hearing loss is often a consequence of dysfunction of the outer hair cells (OHCs) in the organ of Corti. The OHCs are involved in an “active” mechanism that effectively applies gain to stimulation at frequencies close to the CF of each place on the BM (see Yates, 1995). The gain is greatest at low levels and diminishes as the level is increased, resulting in a compressive response function (Murugasu and Russell, 1995; Robles et al., 1986; Ruggero et al., 1997; Yates et al., 1990). OHC dysfunction leads to an increase in absolute threshold and a more linear (less compressive) response function (Ruggero and Rich, 1991; Ruggero et al., 1997). Linearization has been observed in hearing-impaired human listeners by using behavioral measures (Lopez-Poveda et al., 2005; Moore et al., 1999; Nelson et al., 2001; Oxenham and Plack, 1997; Plack et al., 2004). Stainsby and Moore found that the slopes of TMCs for their hearing-impaired listeners were greater at 500 and 1000 Hz than they were at higher frequencies. However, the degree of hearing loss (40–75 dB) suggested that these listeners had lost most, if not all, of their OHC function. Stainsby and Moore argued that the steep TMCs could not be explained by greater cochlear compression at low CFs compared to high CFs. Instead, they suggested that the results could be explained if the rate of decay of forward masking were greater at low CFs. 
If this were true, it would imply that previous studies using TMCs have overestimated the degree of compression at low CFs since the steeper TMCs at low CFs could be caused in part by the greater rate of decay of forward masking, rather than just by cochlear compression.
A behavioral technique for measuring compression that does not depend on a comparison of the effects of on- and off-frequency maskers is the additivity of nonsimultaneous masking technique, which can involve one forward and one backward masker (Oxenham and Moore, 1995) or two forward maskers (Plack et al., 2007; Plack and O’Hanlon, 2003; Plack et al., 2006). In the additivity of forward masking (AFM) technique, signal threshold is measured in the presence of two temporally nonoverlapping forward maskers and compared to the threshold for each masker presented individually. Compression, as applied to the signal, will influence the amount by which the signal level at threshold increases when the effects of the two maskers are combined: The greater the compression, the more the physical signal level has to increase to produce the same change in internal excitation. There is evidence that the effects of the two maskers add linearly (Plack et al., 2007; Plack et al., 2006); hence, the increase in threshold in the combined case can be used to estimate auditory compression. Plack and O’Hanlon (2003) used the AFM technique to estimate compression at 250, 500, and 4000 Hz at two overall levels, although their results were slightly equivocal. The mean data showed midlevel compression exponents of 0.29 at 250 Hz and 0.34 at 500 Hz, both greater than the exponent of 0.17 at 4000 Hz. However, there was considerable variability between the listeners, such that the effect of signal frequency on compression was not significant. The first aim of the present study was to use the AFM technique to estimate the response function at 250 and 4000 Hz over a wider range of levels to provide a more rigorous test of the hypothesis that compression at low CFs is similar to compression at high CFs.
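The logic of the AFM technique can be illustrated with a minimal sketch. It assumes a pure power-law response, f(x) = x^exponent, and linear summation of the masking effects after compression; the function name is hypothetical, and the power law is a simplification rather than the response function fitted in this study.

```python
import math

def combined_threshold_increase_db(exponent, n_maskers=2):
    """Threshold increase (dB) when n equally effective maskers are combined.

    Assumption: power-law response f(x) = x**exponent, with masking effects
    adding linearly after compression.
    """
    # f(S_combined) = n * f(S_single)  =>  S_combined = n**(1/exponent) * S_single
    return 10.0 * math.log10(n_maskers) / exponent

for c in (1.0, 0.5, 0.2):
    print(f"exponent {c:.1f}: {combined_threshold_increase_db(c):.1f} dB")
```

Under these assumptions, a linear system (exponent of 1) gives an increase of about 3 dB, whereas an exponent of 0.2 gives an increase of about 15 dB, which is why a large increase in the combined-masker threshold points to strong compression.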
The second aim of the study was to determine if the compression observed at low CFs originates in the cochlea. For high signal frequencies, the growth of forward masking with masker level is greater for off-frequency maskers than for on-frequency maskers, implying that the on- and off-frequency maskers are compressed differently in the neural frequency channel tuned to the signal. As described above, the BM response to a forward masker well below the frequency of the signal is usually assumed to be linear. Since each auditory nerve fiber responds to the activity at a single CF in the cochlea, this would seem to imply that the on-frequency compression occurs before neural transduction. It is difficult to see how the two maskers (or the off-frequency masker and the signal) could be differentially compressed by subsequent processing in the same neural frequency channel. However, there is no such differential masking growth at low CFs, so it is possible that the site of the compression observed psychophysically is postcochlear. In fact, postcochlear compression provides an alternative explanation for the results of Stainsby and Moore (2006). If a component of the compression at low CFs is postcochlear, then it should not be affected by cochlear hearing loss. Hence, the compression would still be reflected in steep TMCs. In the present study, the hypothesis was tested by using the AFM technique to estimate compression in listeners with normal hearing and listeners with sensorineural hearing loss. If listeners with low-frequency hearing loss show a linearization of the response at low CFs, then this makes a cochlear origin for the compression more likely.
METHOD
Stimuli and equipment
The sinusoidal signal had a frequency (fs) of either 250 or 4000 Hz. The maskers were bands of noise, low pass filtered at 1 kHz (3 dB cutoff, 90 dB∕octave) for the 250 Hz conditions and bandpass filtered between 2800 and 5600 Hz (3 dB cutoffs, 90 dB∕octave) for the 4000 Hz conditions. For the 250 Hz conditions, the signal had a total duration of 10 ms, which consisted of 5 ms raised-cosine onset and offset ramps (no steady state). Masker 1 (M1) had a total duration of 200 ms, including 5 ms onset and offset ramps and 190 ms steady state. Masker 2 (M2) had a total duration of 10 ms, which consisted of 5 ms onset and offset ramps (no steady state). For the 4000 Hz conditions, the signal had a total duration of 4 ms, which consisted of 2 ms onset and offset ramps (no steady state). M1 had a total duration of 200 ms, including 2 ms onset and offset ramps and 196 ms steady state. M2 had a total duration of 6 ms, including 2 ms onset and offset ramps and 2 ms steady state. At both frequencies, the offset of M1 coincided with the onset of M2, and the silent interval between the end of M2 and the start of the signal (0 V points) was 4 ms. When one or the other masker was not present, it was replaced by silence of the same duration, so that the temporal relationships between the remaining stimuli remained the same. The temporal and spectral parameters were chosen based on pilot data, so that at each frequency, the two maskers (M1 and M2) would be roughly equally effective when presented at the same spectrum level. Since the masker bandwidth was much greater than that of the signal at both frequencies, it is unlikely that the signal spectral splatter provided a useful cue.
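As an illustration of the signal timing described above, the following sketch generates a tone burst with raised-cosine onset and offset ramps. It is not the software used in the study: the function name, the sine starting phase, the unit peak amplitude, and the application of the ramp to amplitude (rather than intensity) are all assumptions.

```python
import math

def tone_burst(freq_hz, ramp_ms, steady_ms, fs_hz=48000):
    """Return samples of a tone with raised-cosine onset and offset ramps."""
    n_ramp = round(ramp_ms * fs_hz / 1000)
    n_steady = round(steady_ms * fs_hz / 1000)
    n_total = 2 * n_ramp + n_steady
    samples = []
    for n in range(n_total):
        if n < n_ramp:
            # Onset ramp: gain rises from 0 toward 1
            gain = 0.5 * (1.0 - math.cos(math.pi * n / n_ramp))
        elif n >= n_ramp + n_steady:
            # Offset ramp: gain falls from 1 toward 0
            remaining = n_total - n
            gain = 0.5 * (1.0 - math.cos(math.pi * remaining / n_ramp))
        else:
            gain = 1.0
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * n / fs_hz))
    return samples

# 250 Hz signal: 5 ms ramps, no steady state (10 ms total at 48 kHz)
signal_250 = tone_burst(250, ramp_ms=5, steady_ms=0)
# 4000 Hz signal: 2 ms ramps, no steady state (4 ms total at 48 kHz)
signal_4000 = tone_burst(4000, ramp_ms=2, steady_ms=0)
```

The same function could be called with `fs_hz=32000` to match the clocking rate reported for the US setup.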
The data were collected in two different laboratories (UK and US). In both locations, the experiment was run by using custom-made software on a personal computer workstation located outside a double-walled sound-attenuating booth. For the normal ears (UK), stimuli were generated digitally and were output by using an RME Digi96∕8 PAD 24 bit sound card set at a clocking rate of 48 kHz. The sound card included an antialiasing filter. The headphone output of the sound card was fed via a patch panel in the sound booth wall, without filtering or amplification, to Sennheiser HD 580 circumaural headphones. All stimuli were presented to the right ear. Listeners viewed a computer monitor through a window in the sound booth. Lights on the monitor display flashed on and off concurrently with each stimulus presentation and provided feedback at the end of each trial. Responses were recorded via a computer keyboard.
For the hearing-impaired ears (US), the stimuli were generated digitally at a clocking rate of 32 kHz and were played out via a LynxStudio LynxOne sound card at 24 bit resolution. The stimuli were passed through a programmable attenuator (TDT PA4) and a headphone buffer (TDT HB6) before being fed to Sennheiser HD 580 circumaural headphones. The stimuli were presented monaurally to either the right or left ear in a double-walled sound-attenuating booth. Lights on a flat-panel monitor located inside the booth flashed on and off concurrently with each stimulus presentation and provided feedback at the end of each trial. Responses were made via the computer keyboard or mouse.
Procedure
The procedure was based on that described by Plack and O’Hanlon (2003). A three-interval, three-alternative, forced-choice adaptive tracking procedure was used with a 300 ms interstimulus interval. In the masking conditions, all three intervals contained either one or both maskers. One of the intervals (chosen at random) contained the signal. Threshold was determined by using a two-up one-down (masker thresholds) or a two-down one-up (signal thresholds) adaptive procedure that tracked the 70.7% correct point on the psychometric function (Levitt, 1971). In the UK setup, the step size was 4 dB up to the fourth turn point and was then reduced to 2 dB for 12 subsequent turn points. The mean level at the last 12 turn points was taken as the threshold estimate for each block of trials. At least four estimates were made for each condition and the results averaged. In the US setup, the step size was 8 dB up to the first turn point, was reduced to 4 dB for the following two turn points, and was reduced to 2 dB for six subsequent turn points. The mean level at the last six turn points was taken as the threshold estimate for each block of trials. At least three estimates were made for each condition and the results averaged.
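The two-down one-up rule for signal thresholds can be sketched as follows, using the UK step sizes and turn-point counts as defaults. This is an illustrative reconstruction, not the software used in either laboratory: the function name, the `respond` callback, the `max_trials` guard, and the idealized listener in the usage example are assumptions.

```python
def two_down_one_up(respond, start_level, big_step=4.0, small_step=2.0,
                    n_big_turns=4, n_small_turns=12, max_trials=1000):
    """Track signal level with a two-down one-up rule.

    respond(level) returns True for a correct response. The threshold
    estimate is the mean level at the last n_small_turns turn points.
    """
    level = start_level
    step = big_step
    correct_run = 0
    last_direction = 0  # -1 = level went down, +1 = level went up
    turn_levels = []
    for _ in range(max_trials):
        if respond(level):
            correct_run += 1
            if correct_run < 2:
                continue  # need two consecutive correct responses to go down
            correct_run = 0
            direction = -1
        else:
            correct_run = 0
            direction = +1  # one incorrect response moves the level up
        if last_direction and direction != last_direction:
            turn_levels.append(level)  # direction reversed: record turn point
            if len(turn_levels) == n_big_turns:
                step = small_step
        if len(turn_levels) == n_big_turns + n_small_turns:
            tail = turn_levels[-n_small_turns:]
            return sum(tail) / len(tail)
        last_direction = direction
        level += direction * step
    raise RuntimeError("track did not finish within max_trials")

# Idealized listener (assumption): always correct at or above 30 dB
estimate = two_down_one_up(lambda level: level >= 30.0, start_level=50.0)
```

With a real or simulated probabilistic listener, the track converges on the level giving 70.7% correct rather than on a hard boundary.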
First, the absolute threshold for the signal in the absence of maskers was determined. The main experiment was then conducted in two phases. In phase 1, the signal was presented at a range of sensation levels, chosen separately for each listener and each frequency (limited by the need to avoid clipping when the masker level approached the maximum output of the apparatus and to avoid discomfort for listeners if levels became uncomfortably loud). At each sensation level, the signal was presented with either M1 or M2, and the masker level was varied adaptively to determine the level required to mask the signal. In this way, phase 1 generated pairs of roughly equally effective maskers for each signal level. For some hearing-impaired listeners (I2 and I4 at 250 Hz; I1, I2, and I6 at 4000 Hz), the M2 thresholds could not be determined at the highest signal sensation level due to discomfort. In these cases, M2 was set to 60 dB spectrum level for the highest sensation level in phase 2, except for listener I6 at 4000 Hz, in which case M2 was set to 68 dB spectrum level for the highest sensation level in phase 2.
In phase 2, for each pair of equally effective maskers, the signal threshold was measured in the presence of M1 alone, M2 alone, and M1 and M2 combined. For these conditions, the masker levels were fixed and the signal level was varied adaptively to determine threshold. Thresholds were measured at the two frequencies in separate sessions. In each phase, the conditions were presented in a random order.
Listeners
For the UK study, three normal-hearing listeners (ages 25–34) were tested at both 250 and 4000 Hz, and an additional listener (age 26) was added to make four listeners at 4000 Hz. For the US study, six listeners with mild-moderate sensorineural hearing loss of unknown etiology were tested at both 250 and 4000 Hz. Audiometric thresholds and ages for the individual hearing-impaired subjects are provided in Table 1. All listeners were given several hours of training on the tasks before data collection.
Table 1.
Listener | Age | Ear | 250 Hz | 500 Hz | 1000 Hz | 2000 Hz | 4000 Hz | 8000 Hz |
---|---|---|---|---|---|---|---|---|
I1 | 57 | L | 35 | 30 | 30 | 45 | 50 | 50 |
I1 | 57 | R* | 30 | 30 | 25 | 45 | 45 | 55 |
I2 | 67 | L* | 35 | 30 | 30 | 40 | 50 | 50 |
I2 | 67 | R | 35 | 30 | 30 | 30 | 45 | 45 |
I3 | 33 | L* | 30 | 35 | 35 | 35 | 55 | 70 |
I3 | 33 | R | 20 | 25 | 25 | 25 | 50 | 60 |
I4 | 40 | L* | 60 | 60 | 60 | 60 | 40 | 5 |
I4 | 40 | R | 60 | 55 | 60 | 50 | 55 | 30 |
I5 | 77 | L | 40 | 25 | 20 | 35 | 55 | 65 |
I5 | 77 | R* | 40 | 20 | 25 | 30 | 55 | 60 |
I6 | 52 | L | 50 | 55 | 75 | 80 | 70 | 65 |
I6 | 52 | R* | 55 | 60 | 70 | 65 | 60 | 55 |
RESULTS AND ANALYSIS
Absolute thresholds
The thresholds for the signal in quiet are shown in Table 2. For the normal-hearing listeners, thresholds are higher at 250 Hz than those at 4000 Hz, despite the longer-duration signal used at 250 Hz. The hearing-impaired listeners show a range of threshold elevations, relative to the normal-hearing listeners, from just 12 dB above the highest normal threshold (I5 at 250 Hz) to 58 dB above the highest normal threshold (I6 at 4000 Hz).
Table 2.
Frequency (Hz) | N1 | N2 | N3 | N4 | I1 | I2 | I3 | I4 | I5 | I6 |
---|---|---|---|---|---|---|---|---|---|---|
250 | 22 | 35 | 28 | | 48 | 52 | 54 | 73 | 47 | 66 |
4000 | 13 | 10 | 13 | 10 | 52 | 53 | 69 | 46 | 58 | 71 |
Results of phase 1
The results of phase 1 of the experiment are shown in Figs. 1 and 2. The masker spectrum level at threshold for each of the two maskers is plotted as a function of the signal sensation level. The results for the normal-hearing listeners (Fig. 1) show that in most cases, the two maskers were roughly equally effective when they had the same spectrum level, although this was not the case for listeners N1 and N2 at 4000 Hz. In these cases, M2 needed to be higher in level than M1 to mask the signal; hence, M2 was relatively less effective. The slopes of the masking functions are sometimes greater than unity at low levels, as might be expected for cases in which the masker level is higher than the signal level, and hence the masker is subject to more compression (Plack and Oxenham, 1998). For some cases at 4000 Hz, and at higher masker levels (above about 30–40 dB spectrum level, equivalent to 63–73 dB SPL overall), the slope is less than 1. This could indicate that the masker is entering the more linear high-level region of the response function that is observed in some listeners (Nelson et al., 2001; Oxenham and Plack, 1997). In this case, the masker may be compressed less than the signal and, hence, the masker level at threshold grows more slowly than the signal level (Plack and Oxenham, 1998).
Some of the hearing-impaired listeners (Fig. 2) also show similar levels for the two maskers, again indicating that they were roughly equally effective at the same spectrum level. For the impaired listeners, however, the slopes of the masking functions were often close to unity at all levels, consistent with the interpretation that compression was similar for both maskers and signal and also consistent with a more linear response function overall.
Results of phase 2
The results of phase 2 of the experiment are shown in Figs. 3 and 4. The signal level at threshold is plotted as a function of the sensation level of the signal used to derive the masker levels in phase 1. The figures show signal thresholds in the presence of M1 alone, M2 alone, and M1 and M2 combined. The dashed lines show predictions of the model for the combined thresholds, which will be described in the next section.
The results for the normal-hearing listeners are shown in Fig. 3. For the single masker conditions (open symbols), thresholds are generally similar in the presence of M1 and M2, as might be expected since the levels of the maskers were chosen in phase 1 so that they were equally effective. At low levels, thresholds for the combined-masker conditions (filled circles) are only slightly above those for the single-masker conditions. Linear intensity summation predicts a 3 dB increase when two equally effective maskers are combined, and the low-level results are not far removed from this prediction. At higher levels, however, the thresholds for the combined-masker conditions are well above those for the single-masker conditions. This is indicative of a compressive system (Oxenham and Moore, 1995; Penner and Shiffrin, 1980; Plack and O’Hanlon, 2003). The amount of “excess” masking is broadly similar at the two frequencies, suggesting similar amounts of compression.
Some of the hearing-impaired listeners (Fig. 4) also show considerable excess masking, notably I1, I2, and I3 at 250 Hz and I1, I2, and I4 at 4000 Hz. However, in most cases, the effect on threshold of combining the two maskers was less for the impaired ears than for the normal ears, indicative of a less compressive, or more linear, system.
Response functions
Response functions were derived from the results of phase 2 by using the procedure described by Plack et al. (2006). The response function was modeled by a third-order polynomial in dB∕dB coordinates with three parameters. In units of intensity, this becomes
$$f(x) = 10^{\left[a(10\log_{10}x)^{3} + b(10\log_{10}x)^{2} + c(10\log_{10}x)\right]/10} \tag{1}$$
where $x$ is the input intensity and $a$, $b$, and $c$ are the coefficients of the polynomial. (The constant or intercept in the equation is not constrained by the data and does not affect the predictions of the model.) A separate polynomial was derived for each listener.
After preprocessing by the response function, the responses to the stimuli (maskers and signal) were assumed to add linearly. Detection of the signal was based on the signal-to-masker ratio after preprocessing and summation, and this ratio was assumed to be constant at threshold for all conditions. This means that a measure of the masking effect can be taken as the signal intensity at masked threshold after preprocessing,
$$E = f(S) \tag{2}$$
where $E$ is the masking effect and $S$ is the signal intensity at threshold. Assuming that the effects of two maskers sum linearly,
$$E_{M1+M2} = E_{M1} + E_{M2} \tag{3}$$
where $E_{M1}$, $E_{M2}$, and $E_{M1+M2}$ are the masking effects produced by M1, M2, and M1 and M2 combined. Substituting from Eq. 2 and solving for $S$ gives
$$S_{M1+M2} = f^{-1}\left[f(S_{M1}) + f(S_{M2})\right] \tag{4}$$
where $S_{M1}$ and $S_{M2}$ are the signal intensities at threshold in the presence of M1 and M2, respectively, and $S_{M1+M2}$ is the signal intensity at threshold in the presence of M1 and M2 combined. Using this equation and Eq. 1 as the function $f$, the thresholds from each masker alone in phase 2 ($S_{M1}$ and $S_{M2}$) were used as the input to the model, and the thresholds in the presence of both maskers ($S_{M1+M2}$) were predicted. For each listener and each frequency independently, the coefficients of Eq. 1 ($a$, $b$, and $c$) were selected to minimize the sum of the squared deviations of the model predictions from the thresholds in the combined-masker conditions, under the constraint that the differential of the function $f$ was not permitted to be less than 0 or greater than 1 over the range of signal thresholds measured in each case. The predictions of the best-fitting models are illustrated by the dashed lines in Figs. 3 and 4. The model generally provides an accurate account of the data, suggesting that a third-order polynomial provides a good approximation to the shape of the response function.
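The prediction step of the model can be sketched as follows for a given set of coefficients. The function names, the bisection inversion, and the search range are assumptions made for illustration; the constrained least-squares search for a, b, and c is not shown, and the coefficient values in the usage lines are arbitrary rather than fitted values from this study.

```python
import math

def response_db(level_db, a, b, c):
    """Third-order polynomial response function in dB/dB coordinates.

    The intercept is omitted because it does not affect the predictions.
    """
    return a * level_db**3 + b * level_db**2 + c * level_db

def predict_combined_db(s1_db, s2_db, a, b, c, span_db=100.0):
    """Predict the combined-masker threshold from the two single-masker thresholds."""
    # Masking effects in intensity units after the response function
    e1 = 10.0 ** (response_db(s1_db, a, b, c) / 10.0)
    e2 = 10.0 ** (response_db(s2_db, a, b, c) / 10.0)
    # Linear summation of the two masking effects, expressed in dB
    target_db = 10.0 * math.log10(e1 + e2)
    # Invert the response function by bisection. Assumes the function is
    # monotonically increasing between lo and hi; the result saturates at
    # hi if the target lies above the response at hi.
    lo = max(s1_db, s2_db)
    hi = lo + span_db
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if response_db(mid, a, b, c) < target_db:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative coefficients only: a linear and a compressive straight line
linear = predict_combined_db(40.0, 40.0, a=0.0, b=0.0, c=1.0)
compressive = predict_combined_db(40.0, 40.0, a=0.0, b=0.0, c=0.2)
```

A fitting routine would wrap `predict_combined_db` in a search over the coefficients that minimizes the squared deviations from the measured combined-masker thresholds while keeping the slope of the function between 0 and 1.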
The derived response functions are shown in Fig. 5. Calibration along the y axis (i.e., the vertical position of the functions) is arbitrary and is not constrained by the data. The response functions are calibrated to give a 100 dB output for a 90 dB SPL input. For the normal-hearing listeners, the functions are quite shallow (i.e., compressive) at both frequencies, although the functions are steeper at low levels and in some cases at high levels than at mid levels. It is interesting to note that the listeners who show a steepening in the response function at high levels (N3 at 250 Hz and N2 and N3 at 4000 Hz) also show a reduction in the slope of the masking function in phase 1 at high levels, consistent with the explanation for the reduction in terms of the reduced compression of the masker compared to the signal (see Sec. 3A). The exception is listener N1 at 4000 Hz, who shows a reduction in the slope of the masking function in Fig. 1 but a response function slope that decreases monotonically with level.
Figure 6 shows the slope of each response function for each listener (i.e., the derivative of each function shown in Fig. 5), which is the compression exponent value for any given input level. Between input levels of 50 and 80 dB SPL, the slopes averaged across levels and then across listeners are similar at the two frequencies for the normal-hearing listeners: 0.17 at 250 Hz and 0.21 at 4000 Hz (across-listener standard errors of 0.021 and 0.014, respectively). The minimum slope averaged across listeners is actually substantially less at 250 Hz (0.09) than that at 4000 Hz (0.18), although a t test showed that this difference is not significant. Hence, there is no evidence for a reduction in midlevel compression (which would correspond to an increase in the minimum slope value) at 250 Hz compared to 4000 Hz. However, the region of high compression extends to lower input levels at 4000 Hz, so the range of levels that are strongly compressed is greater at 4000 Hz than that at 250 Hz. This is consistent with some previous TMC studies that also reported a smaller range of compressed levels at low frequencies (Nelson and Schroder, 2004; Williams and Bacon, 2005).
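The slope calculation can be sketched by differentiating the dB/dB polynomial analytically and averaging over a grid of input levels. The function names and the 1 dB grid spacing are assumptions, and the coefficient values in the usage lines are arbitrary illustrations rather than fitted values.

```python
def slope(level_db, a, b, c):
    """Derivative of a*L**3 + b*L**2 + c*L, i.e., the compression exponent at L."""
    return 3.0 * a * level_db**2 + 2.0 * b * level_db + c

def mean_slope(a, b, c, lo_db=50.0, hi_db=80.0, step_db=1.0):
    """Average slope over input levels from lo_db to hi_db inclusive."""
    n_points = int(round((hi_db - lo_db) / step_db)) + 1
    levels = [lo_db + i * step_db for i in range(n_points)]
    return sum(slope(level, a, b, c) for level in levels) / n_points

# A straight line in dB/dB coordinates has the same slope at every level
constant_exponent = mean_slope(a=0.0, b=0.0, c=0.2)
```

Averaging such per-listener values across listeners would then give group means of the kind reported above.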
The response functions for the hearing-impaired listeners (Fig. 5) are much more variable. Some listeners (I1, I2, and I3 at 250 Hz and I1 and I4 at 4000 Hz) show regions with compression comparable to that for the normal-hearing listeners, although the range of levels that are compressed is typically smaller. The other listeners show a more linear response with some listeners showing an almost complete loss of compression (I4, I5, and I6 at 250 Hz and I3 and I5 at 4000 Hz). The listener-frequency combinations with the highest absolute thresholds tend to show the most linear response functions (see Table 2), although listener I5 at 250 Hz has a relatively low threshold but an almost linear response function. Combining the results from the normal-hearing and hearing-impaired listeners revealed a significant positive correlation between the signal absolute threshold and minimum response function slope at both 250 Hz [r(7)=0.70, p=0.037, two tailed] and 4000 Hz [r(8)=0.64, p=0.048, two tailed]. At both frequencies, high absolute thresholds are associated with more linear response functions.
To provide a summary of the response function results, the coefficients of the third-order polynomials were averaged across each listener group at each frequency. The resulting polynomials are shown in Fig. 7 together with plots of the slopes of the response functions in each case. The mean functions for the impaired listeners are clearly steeper (more linear) than those for the normal-hearing listeners at both frequencies.
DISCUSSION
Compression at low CFs
For the normal-hearing listeners, the response functions are similar at 250 and 4000 Hz. Estimates of average midlevel compression are similar at the two frequencies, although the range of levels that are compressed is smaller at 250 Hz. This implies that the maximum gain of the active mechanism is less at 250 Hz. The estimated exponent of about 0.2 is similar to previous estimates of compression at low and high CFs using TMCs (Lopez-Poveda et al., 2003; Nelson and Schroder, 2004; Plack and Drga, 2003; Williams and Bacon, 2005). As described in the Introduction, Stainsby and Moore (2006) found that TMC slopes were steeper at low frequencies than at high frequencies for listeners whose hearing loss was consistent with a complete loss of OHC function. They suggested that the postcochlear decay of forward masking may be more rapid at low than at high CFs, so that the use of an off-frequency TMC reference from a high signal frequency would produce an overestimate of low-CF compression. However, the present compression estimates, which do not depend on a linear off-frequency reference, are consistent with the previous TMC estimates. This result supports the assumption that the postcochlear decay of forward masking is similar at low and high CFs and suggests that the use of an off-frequency reference from a higher signal frequency does not lead to an overestimate of compression.
Results from a different masking procedure also suggest strong compression at low CFs. Oxenham and Dau (2001, 2004) measured the amount of masking produced by harmonic tone complex maskers as a function of the phase relation between the harmonics. When the phase relation is such that the envelope of the response at the signal place in the cochlea has a high peak factor, compression is assumed to reduce the effectiveness of the masker. This is because compression reduces the overall level of a stimulus with a high peak factor compared to a stimulus with a flat temporal envelope with the same rms level. Oxenham and Dau demonstrated substantial phase effects at 250 Hz, suggesting that compression is strong in this CF region.
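The effect of peak factor can be illustrated with a toy calculation in which two envelopes with the same mean intensity are passed through an instantaneous power-law compressor. The instantaneous compressor, the four-sample envelopes, and the function name are simplifying assumptions for illustration and are not taken from Oxenham and Dau.

```python
import math

def mean_compressed(intensities, exponent):
    """Mean output after instantaneous power-law compression of intensity."""
    return sum(i ** exponent for i in intensities) / len(intensities)

# Two envelopes with the same mean intensity (and hence the same rms level)
flat = [1.0, 1.0, 1.0, 1.0]    # flat temporal envelope
peaky = [4.0, 0.0, 0.0, 0.0]   # high peak factor

for exponent in (1.0, 0.2):
    ratio = mean_compressed(peaky, exponent) / mean_compressed(flat, exponent)
    print(f"exponent {exponent}: peaky re flat = {10 * math.log10(ratio):.1f} dB")
```

With a linear system the two envelopes give the same mean output, whereas with an exponent of 0.2 the peaky envelope gives a lower mean output, consistent with the reduced masker effectiveness described above.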
The present results are also consistent with a recent study of cochlear nonlinearity using distortion-product otoacoustic emissions (DPOAEs). In the DPOAE technique, two pure tones are presented to the ear, with frequencies f1 and f2 (f2>f1) and levels L1 and L2. The level of the 2f1-f2 distortion product generated in the cochlea is measured as a function of L2 with L1 set to maximize the distortion product. This produces an estimate of the BM response function. Gorga et al. (2007) used this technique to estimate response functions at 500 and 4000 Hz. They found similar high-level slopes (approximately 0.25) at the two frequencies but that the compression region extended to lower input levels at 4000 Hz. These indirect physiological findings are similar to the present indirect psychophysical results, suggesting a common (cochlear) origin.
Compression in impaired ears
The results from the hearing-impaired listeners are variable, but it is clear that for most listeners, hearing loss is associated with a partial or complete linearization of the response function. This result is consistent with the findings of Oxenham and Moore (1995) using a forward and a backward masker. In their study, a linear response function was derived from the data of all three hearing-impaired listeners, all of whom had a more severe hearing loss than those tested here. In the present study, linearization was observed at 250 Hz, suggesting that compression at low CFs is susceptible to hearing loss, presumably of a cochlear origin. It seems reasonable to assume that the underlying cause of hearing loss is the same at the two frequencies and that, for the mild-to-moderate impairment of the listeners tested here, the cause is primarily the dysfunction of the OHCs (Plack et al., 2004). Hence, the present results provide evidence that normal compression at low CFs is a consequence primarily of OHC activity, rather than compression in the inner hair cells (IHCs) (Cheatham and Dallos, 2001; Lopez-Poveda et al., 2005; Patuzzi and Sellick, 1983) or postcochlear compression. Supporting this conclusion, Oxenham and Dau (2004) found reduced effects of harmonic phase at 250 Hz for hearing-impaired listeners in their study of masking by harmonic complexes. These results also suggest that cochlear dysfunction at low CFs is associated with a reduction in cochlear compression. Finally, as mentioned above, since DPOAEs are generated by cochlear processes, the results of Gorga et al. (2007) seem to confirm the presence of compression that is cochlear in origin.
For several of the listeners tested in the present study, the hearing loss can be categorized as mild at one or both frequencies (I1, I2, I3, and I5 at 250 Hz and I1, I2, I4, and I5 at 4000 Hz; see Table 1). Plack et al. (2004) showed that for listeners with a mild cochlear loss, the response function shows a reduction in gain at low levels only, such that the compression at high levels is unaffected. The response function appears to be shifted to the right. These characteristics can be observed in some of the listeners tested here. Compared to the normal response functions, the response functions for I1, I2, and I3 at 250 Hz and for I1 and I4 at 4000 Hz show a linearization at low to medium levels but comparable compression at high levels. These listeners had relatively low absolute thresholds at the specified frequencies compared to the others.
As described in the Introduction, an alternative explanation for the results of Stainsby and Moore (2006) is that a contribution to low-CF compression arises from a process that is not affected by OHC dysfunction, perhaps because it arises from some aspect of IHC function or a postcochlear neural mechanism. The effect of frequency on the TMC slope for the three impaired listeners of Stainsby and Moore was not large. The average TMC slope ratio between 250 and 4000 Hz was 1.6. This could imply a low-CF compression component with an exponent of 0.6 that is not sensitive to OHC dysfunction. However, it is also conceivable that despite the high thresholds and low DPOAE levels, the ears tested by Stainsby and Moore had residual OHC activity at low CFs that could account for the difference in slopes.
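The exponent of 0.6 follows from the usual TMC logic, in which the compression exponent is estimated as the ratio of the reference slope to the on-frequency slope. The worked step below assumes that the 4000 Hz TMC in these ears reflects a linear response and that the postcochlear decay is the same at both frequencies; the symbols s_250, s_4000, and c_250 are introduced here for illustration only.

```latex
c_{250} \approx \frac{s_{4000}}{s_{250}} = \frac{1}{1.6} \approx 0.6
```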
CONCLUSIONS
Response functions, estimated using the AFM technique, were similar at 250 and 4000 Hz for the normal-hearing listeners with a midlevel compression exponent of about 0.2. However, compression extended over a smaller range of levels at 250 Hz, implying that the maximum gain of the active mechanism is reduced at low CFs.
Response functions for the hearing-impaired listeners were generally more linear at both frequencies, although some mildly impaired listeners showed residual high-level compression similar to that for the normal-hearing listeners.
The findings suggest that maximum compression is similar at low and high CFs in humans and are consistent with the idea that the compression at both low and high CFs is primarily cochlear in origin.
ACKNOWLEDGMENTS
The authors thank the Associate Editor and two anonymous reviewers for helpful comments on an earlier version of the manuscript. The research was supported by BBSRC (UK) Grant No. BB/D012953/1, by EPSRC (UK) Grant No. GR/N07219, and by NIH Grant No. R01 DC 03909.
References
- Cheatham, M. A., and Dallos, P. (2001). “Inner hair cell response patterns: Implications for low-frequency hearing,” J. Acoust. Soc. Am. 110, 2034–2044. doi:10.1121/1.1397357
- Gorga, M. P., Neely, S. T., Dierking, D. M., Kopun, J., Jolkowski, K., Groenenboom, K., Tan, H., and Stiegemann, B. (2007). “Low-frequency and high-frequency cochlear nonlinearity in humans,” J. Acoust. Soc. Am. 122, 1671–1680. doi:10.1121/1.2751265
- Hicks, M. L., and Bacon, S. P. (1999). “Psychophysical measures of auditory nonlinearities as a function of frequency in individuals with normal hearing,” J. Acoust. Soc. Am. 105, 326–338. doi:10.1121/1.424526
- Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477. doi:10.1121/1.1912375
- Lopez-Poveda, E. A., Plack, C. J., and Meddis, R. (2003). “Cochlear nonlinearity between 500 and 8000 Hz in listeners with normal hearing,” J. Acoust. Soc. Am. 113, 951–960. doi:10.1121/1.1534838
- Lopez-Poveda, E. A., Plack, C. J., Meddis, R., and Blanco, J. L. (2005). “Cochlear compression between 500 and 8000 Hz in listeners with moderate sensorineural hearing loss,” Hear. Res. 205, 172–183. doi:10.1016/j.heares.2005.03.015
- Moore, B. C. J., Vickers, D. A., Plack, C. J., and Oxenham, A. J. (1999). “Inter-relationship between different psychoacoustic measures assumed to be related to the cochlear active mechanism,” J. Acoust. Soc. Am. 106, 2761–2778. doi:10.1121/1.428133
- Murugasu, E., and Russell, I. J. (1995). “Salicylate ototoxicity: The effects on basilar membrane displacement, cochlear microphonics, and neural responses in the basal turn of the guinea pig cochlea,” Aud. Neurosci. 1, 139–150.
- Nelson, D. A., and Schroder, A. C. (2004). “Peripheral compression as a function of stimulus level and frequency region in normal-hearing listeners,” J. Acoust. Soc. Am. 115, 2221–2233. doi:10.1121/1.1689341
- Nelson, D. A., Schroder, A. C., and Wojtczak, M. (2001). “A new procedure for measuring peripheral compression in normal-hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 110, 2045–2064. doi:10.1121/1.1404439
- Oxenham, A. J., and Dau, T. (2001). “Towards a measure of auditory filter phase response,” J. Acoust. Soc. Am. 110, 3169–3178. doi:10.1121/1.1414706
- Oxenham, A. J., and Dau, T. (2004). “Masker phase effects in normal-hearing and hearing-impaired listeners: Evidence for peripheral compression at low signal frequencies,” J. Acoust. Soc. Am. 116, 2248–2257. doi:10.1121/1.1786852
- Oxenham, A. J., and Moore, B. C. J. (1995). “Additivity of masking in normally hearing and hearing-impaired subjects,” J. Acoust. Soc. Am. 98, 1921–1934. doi:10.1121/1.413376
- Oxenham, A. J., and Plack, C. J. (1997). “A behavioral measure of basilar-membrane nonlinearity in listeners with normal and impaired hearing,” J. Acoust. Soc. Am. 101, 3666–3675. doi:10.1121/1.418327
- Patuzzi, R., and Sellick, P. M. (1983). “A comparison between basilar membrane and inner hair cell receptor potential input-output functions in the guinea pig cochlea,” J. Acoust. Soc. Am. 74, 1734–1741. doi:10.1121/1.390282
- Penner, M. J., and Shiffrin, R. M. (1980). “Nonlinearities in the coding of intensity within the context of a temporal summation model,” J. Acoust. Soc. Am. 67, 617–627. doi:10.1121/1.383885
- Plack, C. J., Carcagno, S., and Oxenham, A. J. (2007). “A further test of the linearity of temporal summation in forward masking,” J. Acoust. Soc. Am. 122, 1880–1883. doi:10.1121/1.2775287
- Plack, C. J., and Drga, V. (2003). “Psychophysical evidence for auditory compression at low characteristic frequencies,” J. Acoust. Soc. Am. 113, 1574–1586. doi:10.1121/1.1538247
- Plack, C. J., Drga, V., and Lopez-Poveda, E. A. (2004). “Inferred basilar-membrane response functions for listeners with mild to moderate sensorineural hearing loss,” J. Acoust. Soc. Am. 115, 1684–1695. doi:10.1121/1.1675812
- Plack, C. J., and O’Hanlon, C. G. (2003). “Forward masking additivity and auditory compression at low and high frequencies,” J. Assoc. Res. Otolaryngol. 4, 405–415.
- Plack, C. J., and Oxenham, A. J. (1998). “Basilar-membrane nonlinearity and the growth of forward masking,” J. Acoust. Soc. Am. 103, 1598–1608. doi:10.1121/1.421294
- Plack, C. J., Oxenham, A. J., and Drga, V. (2006). “Masking by inaudible sounds and the linearity of temporal summation,” J. Neurosci. 26, 8767–8773. doi:10.1523/JNEUROSCI.1134-06.2006
- Rhode, W. S. (1971). “Observations of the vibration of the basilar membrane in squirrel monkeys using the Mössbauer technique,” J. Acoust. Soc. Am. 49, 1218–1231. doi:10.1121/1.1912485
- Rhode, W. S., and Cooper, N. P. (1996). “Nonlinear mechanics in the apical turn of the chinchilla cochlea in vivo,” Aud. Neurosci. 3, 101–121.
- Rhode, W. S., and Recio, A. (2000). “Study of mechanical motions in the basal region of the chinchilla cochlea,” J. Acoust. Soc. Am. 107, 3317–3332. doi:10.1121/1.429404
- Robles, L., Ruggero, M. A., and Rich, N. C. (1986). “Basilar membrane mechanics at the base of the chinchilla cochlea. I. Input-output functions, tuning curves, and phase responses,” J. Acoust. Soc. Am. 80, 1364–1374. doi:10.1121/1.394389
- Rosengard, P. S., Oxenham, A. J., and Braida, L. D. (2005). “Comparing different estimates of cochlear compression in listeners with normal and impaired hearing,” J. Acoust. Soc. Am. 117, 3028–3041. doi:10.1121/1.1883367
- Ruggero, M. A., and Rich, N. C. (1991). “Furosemide alters organ of Corti mechanics: Evidence for feedback of outer hair cells upon the basilar membrane,” J. Neurosci. 11, 1057–1067.
- Ruggero, M. A., Rich, N. C., Recio, A., Narayan, S. S., and Robles, L. (1997). “Basilar-membrane responses to tones at the base of the chinchilla cochlea,” J. Acoust. Soc. Am. 101, 2151–2163. doi:10.1121/1.418265
- Russell, I. J., and Nilsen, K. E. (1997). “The location of the cochlear amplifier: Spatial representation of a single tone on the guinea pig basilar membrane,” Proc. Natl. Acad. Sci. U.S.A. 94, 2660–2664. doi:10.1073/pnas.94.6.2660
- Stainsby, T. H., and Moore, B. C. J. (2006). “Temporal masking curves for hearing-impaired listeners,” Hear. Res. 218, 98–111. doi:10.1016/j.heares.2006.05.007
- Williams, E. J., and Bacon, S. P. (2005). “Compression estimates using behavioral and otoacoustic emission measures,” Hear. Res. 201, 44–54. doi:10.1016/j.heares.2004.10.006
- Yates, G. K. (1995). “Cochlear structure and function,” in Hearing, edited by B. C. J. Moore (Academic, San Diego), pp. 41–73.
- Yates, G. K., Winter, I. M., and Robertson, D. (1990). “Basilar membrane nonlinearity determines auditory nerve rate-intensity functions and cochlear dynamic range,” Hear. Res. 45, 203–220. doi:10.1016/0378-5955(90)90121-5
- Zinn, C., Maier, H., Zenner, H.-P., and Gummer, A. W. (2000). “Evidence for active, nonlinear, negative feedback in the vibration response of the apical region of the in-vivo guinea-pig cochlea,” Hear. Res. 142, 159–183. doi:10.1016/S0378-5955(00)00012-5