Apparent auditory source width insensitivity in older hearing-impaired individuals

William M Whitmer; Bernhard U Seeber; Michael A Akeroyd

doi:10.1121/1.4728200

Author manuscript; available in PMC: 2013 Feb 7.

Published in final edited form as: J Acoust Soc Am. 2012 Jul;132(1):369–379. doi: 10.1121/1.4728200

PMCID: PMC3566657 EMSID: EMS51139 PMID: 22779484

Abstract

Previous studies have shown a loss in the precision of horizontal localization responses of older hearing-impaired (HI) individuals, along with potentially poorer neural representations of sound-source location. These deficits could be the result or corollary of greater difficulties in discriminating spatial images, and an insensitivity to punctate sound sources. This hypothesis was tested in three headphone-presentation experiments varying interaural coherence (IC), the cue most associated with apparent auditory source width (ASW). First, thresholds for differences in IC were measured for a broad sampling of participants. Older HI participants were significantly worse at discriminating IC across reference values than younger normal-hearing participants. These results are consistent with senescent increases in temporal jitter. Performance decreased with age, a finding corroborated in a second discrimination experiment using a separate group of participants matched for hearing loss. This group also completed a third, visual experiment, with both a cross-mapping task where they drew the size of the sound they heard and an identification task where they chose the image that best corresponded to what they heard. The results from the visual tasks indicate that older HI individuals do not hear punctate images and are relatively insensitive to changes in width based on IC.

I. INTRODUCTION

Deficits in absolute azimuthal localization for older hearing-impaired (HI) relative to younger normal-hearing (NH) listeners have been repeatedly shown in the literature (see Section I.A below). These errors are often due to increased variability or scatter in a listener’s localization responses, not a systematic bias in those responses away from the acoustic sound-source location. That is, these are errors of imprecision, not bias (Stallings & Gillmore, 1971). This increased imprecision in localization could result from poorer representations of sound-source locations in the aged auditory pathway and lead to a more diffuse percept of sound sources. Given physiological evidence of senescent changes in the neural representations of sound location (May et al., 2006; Ross et al., 2007), it is possible that older HI individuals do not perceive clear, concise – punctate – spatial impressions of sounds. As insensitivity to spatial impression may be a cause of decreased speech-intelligibility benefit from source separation (e.g., Noble et al., 1997), it can impose limits on the usefulness of strategies for restoring localization cues by, for example, bilaterally matched hearing aids.

Determining whether or not HI individuals hear punctate sounds is not simply a matter of asking whether a sound is punctate or broad; this method has previously been shown to be ineffective (Yost et al., 2007). The perception of punctateness can be determined from measurements of the leftmost and rightmost extents of the spatial image, its apparent auditory source width (ASW; Wiggins & Seeber, 2011). A key parameter influencing the perceived width of a sound is the similarity between the sounds arriving at the two ears, the interaural coherence (IC). The IC is measured as the height of the peak in the interaural cross-correlation function. Changes in IC arise from fluctuations in interaural differences caused by early reflections (Rakerd & Hartmann, 2010). The current study examined the sensitivity of older HI individuals to changes in the apparent width of a sound caused by changes in IC that were precisely controlled through headphone presentation, and how a method of visual analogy can assess a potential (in)sensitivity to these changes.
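As an illustration of the IC definition above (the height of the peak in the interaural cross-correlation function), the following minimal sketch computes a normalized cross-correlation over a range of lags and reports its peak. The function name `interaural_coherence`, the ±48-sample lag range (about ±1 ms at a 48-kHz sampling rate), and the use of circular lags are assumptions made for this example; they are not taken from the study.

```python
import math
import random


def interaural_coherence(left, right, max_lag):
    """Return (peak height, lag in samples) of the normalized cross-correlation.

    Lags are circular so that a pure delay leaves the peak height unchanged;
    this is a simplification chosen for the sketch.
    """
    n = len(left)
    norm = math.sqrt(sum(x * x for x in left) * sum(x * x for x in right))
    best_value, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        value = sum(left[i] * right[(i + lag) % n] for i in range(n)) / norm
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, best_lag


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
other = [rng.gauss(0.0, 1.0) for _ in range(2000)]
delayed = noise[-15:] + noise[:-15]  # same noise, circularly delayed by 15 samples

print(interaural_coherence(noise, noise, 48))    # identical channels: peak near 1 at lag 0
print(interaural_coherence(noise, delayed, 48))  # delayed copy: peak near 1 at lag 15
print(interaural_coherence(noise, other, 48))    # independent noises: peak near 0
```

In this sketch a pure interaural delay moves the peak along the lag axis without lowering it, whereas independent noise in the two channels lowers the peak itself. That difference is why IC, rather than ITD, is the quantity tied to source width.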

A. Localization deficits

Several studies have shown increased horizontal localization errors for older HI individuals relative to younger NH listeners in locating broadband supra-threshold sources. Root-mean-square (RMS) errors increased from 5-8° for younger NH listeners to 13-20° for older HI listeners (Noble & Byrne, 1990; Lorenzi et al., 1999a & 1999b; Keidser et al., 2006; van den Bogaert et al., 2006; Best et al., 2010). RMS localization error, however, cannot separate errors in bias from errors in precision. For studies that reported signed and unsigned (RMS) errors (Noble & Byrne, 1990; Keidser et al., 2006), the localization bias was not different across NH and HI groups, but the precision decreased for HI. Because the groups often differ both in age and pure-tone thresholds, it is not clear whether this imprecision is due more to age-related cortical deficits or peripheral hearing loss, nor how this imprecision manifests itself in perception. In a recent examination of the effects of aging on localization, Dobreva et al. (2011) had young (19-41 years), middle-aged (45-66 years) and elderly (70-81 years) participants locate suprathreshold noise-burst trains in the near front hemifield (−40 to +40°) with a visual pointer. Middle-aged and elderly participants had pure-tone thresholds ranging from normal to mild-to-moderate sloping loss. The results for broadband stimuli, when expressed as signed error, showed no significant differences between groups. The precision of responses, however, significantly varied between groups, with an average intrasubject variability (i.e., standard deviations around mean location) of 2.4° for young, 4.1° for middle-aged, and 5.5° for elderly participants.

B. Aging and localization

There are age-related deficits in temporal cues to sound localization that could lead to imprecise localization judgments and are unrelated to peripheral sensorineural hearing loss (Fitzgibbons & Gordon-Salant, 1996). Aging affects the lateralization of click trains, resulting in a doubling of lateralization threshold for older participants with normal pure-tone audiometric thresholds below 4 kHz compared to younger participants (Herman et al., 1977). Babkoff et al. (2002) corroborated this, finding increased insensitivity with age to temporal (interaural time difference; ITD) but not level cues (interaural level difference; ILD) for the lateralization of click trains. They also found decreased ability to discriminate diotic from dichotic click trains with age. Grose and Mamo (2010) found that the ability to discriminate phase-dynamic (dichotic) from phase-static (diotic) stimuli significantly decreased from young adults (age 18-27 years) to middle-aged adults (age 40-55 years) to older adults (age 63-75 years). Physiological data corroborate the effects of aging on supra-threshold localization, ranging from much more broadly tuned and less intense inferior colliculus responses in small mammals (May et al., 2006) to decreased sensitivity to phase information in the human auditory cortex (Ross et al., 2007). Given older individuals’ decreased ability to detect differences between diotic and dichotic stimuli (Babkoff et al., 2002; Grose & Mamo, 2010), it is possible that there is a loss in the ability to perceive punctate sounds.

C. Auditory source width and interaural coherence

In architectural acoustics, the spatial impression of a sound can be described by its apparent auditory source width (ASW). Keet (1968) found an inverse linear relationship between subjective judgments of ASW and the IC of orchestral music recordings; that is, ASW decreases with increasing IC. Later studies corroborated this finding, although the relationship for narrowband sounds has been found to be nonlinear (e.g., Ando & Kurihara, 1986). A broadening percept with decreasing IC was demonstrated by Blauert and Lindemann (1986) in a task where NH listeners drew the size of broad- and narrowband noises presented over headphones.¹ The extent of these intra-cranial images increased from full IC (1) to partial (< 1) ICs, but was not significantly different across partial ICs of 0.25-0.75 for broadband stimuli. Merimaa and Hess (2004) described an updated computerized version of this technique and applied it to recordings in different rooms, showing that it was sensitive to acoustic changes as well as inter-listener differences.

The increased imprecision seen in older HI localization studies could be the result or corollary of broader images of sound source location. That is, if there are poor neural responses to sound-source location in aged – not necessarily hearing-impaired – populations, there should be greater difficulties in discriminating ASW based on IC and broader images for highly coherent sounds. In previous studies of normal-hearing interaural coherence discrimination, thresholds for discriminating IC-varying stimuli with bandwidths greater than 1 kHz against a diotic (IC = 1) reference ranged from 0.019 to 0.045 with an average change-in-coherence (ΔIC) threshold of 0.035 (Pollack & Trittipoe, 1959; Gabriel & Colburn, 1981; Akeroyd & Summerfield, 1999; Boehnke et al., 2002; Lüddemann et al., 2009). In a study of several binaural discrimination tasks for a small number of listeners, Gabriel et al. (1992) examined IC discrimination against a diotic (IC = 1) reference for third-octave bands of noise centered at 250-4000 Hz. One of two high-frequency sensorineural hearing loss (SNHL) participants (aged 48 years) had an IC-difference (ΔIC) threshold of near-NH performance at 250 Hz, but was an order of magnitude worse at 500 and 1000 Hz. The other SNHL participant (aged 65 years) could not perform the task at 250 Hz, and had dramatically higher thresholds at other frequencies. In a later study of binaural tasks with hearing-impaired listeners, Koehnke et al. (1995) also examined coherence discrimination for third-octave bands of noise centered at 500 and 4000 Hz from a diotic reference for young NH (age 18-32 years) and HI (age 19-70 years) individuals. All HI individuals performed worse than all NH individuals.

D. Current Study

The goal of the current study was to determine the sensitivity of hearing-impaired adults to punctate sounds through a combination of psychophysical discrimination and perceptual judgment methods. We examined ASW sensitivity through its underlying psychophysical correlate, IC, in three headphone-presentation experiments. First, a broad sampling of participants performed an IC discrimination task across several reference ICs, so allowing a comparison to the NH results of Pollack and Trittipoe (1959) and the HI results of Gabriel et al. (1992) and Koehnke et al. (1995). Second, the particular role of aging in insensitivity to punctate images was investigated with a sample of participants matched for hearing loss performing the same discrimination task against a diotic reference. Third, the same group of participants from the second experiment also drew a visual representation of the size and position of the image they perceived (cf. Blauert & Lindemann, 1986). To corroborate this open-set visual cross-mapping task, the third experiment also included an identification task, where listeners selected from a closed set of visual images the nearest representation to the sound-source size and location they heard.

II. METHODS

Stimuli in all three experiments were broadband noises constructed using octave-spaced, third-octave-wide narrowband noises whose ICs could be independently controlled, to ensure the same IC in each band. The stimuli were generated using the symmetric method, where the sum of two independent noises forms one channel and their difference forms the other, to reduce potential variability (Hartmann & Cho, 2011). Simon and Aleksandrovsky (1997) found that equal dB SPL presentation of narrowband noises for hearing-impaired listeners with any audiometric asymmetries produced a more stable midline percept than adjusting the signal to equal SL presentation. Therefore, a flat A-weighted 75-dB SPL presentation was used across experiments here. In the discrimination experiments (I and II), the level was modestly roved to control for changes in level caused by correlation differences (Edmonds & Culling, 2009) while not affecting ASW judgments (cf. Sato & Ando, 2002).

A. Experiment I

In the first experiment, participants varying in age and hearing loss discriminated the ASW of broadband noises based on the IC difference between the noises across several reference ICs and at three global interaural time differences (ITDs).

1. Participants

Twenty-three adults (7 female, 16 male) were recruited from the pool of normal-hearing and hearing-impaired patients available to the Institute of Hearing Research, sourced from attendees at clinics of the local hospitals by postal survey, and employees of the Institute. Seven of the participants (2 female, 5 male) were classified as “younger” adults by being below 40 years of age (25-38 years). The remaining 16 listeners (age 46-75 years) were classified as “older.” Pure-tone thresholds were assessed using the modified Hughson-Westlake method (British Society of Audiology, 1981) with a calibrated audiometer (GSI 61). All hearing losses were predominantly sensorineural, with air-bone conduction differences less than 10 dB HL. As shown in Figure 1, the hearing losses varied widely from normal to moderate-to-severe. Four of the older participants had variable pure-tone threshold average (VPTA) asymmetries greater than 20 dB HL.² At the time of testing, nine of the older participants were unilaterally aided and one older participant was bilaterally aided. All testing was done unaided.

Figure 1. Pure-tone audiometric thresholds as a function of frequency for Experiment I. Gray lines show individual participants’ better-ear (based on variable pure-tone threshold average) audiograms. Black lines show median thresholds for left (crosses) and right (circles) ears. Error bars show first and third quartile ranges.

2. Apparatus

Participants were seated in a sound-dampened booth (1.5 × 1.3 × 2 m). The stimuli were presented via a soundcard (RME DIGI-96/8 PAD), audio amplifier (Arcam A80), and circumaural headphones (Sennheiser HD-580). Responses were given via a touch-screen monitor.

3. Stimuli

So that their IC could be varied, stimuli were 500-ms broadband complexes composed of third-octave narrowband noises at octave-spaced center frequencies from 250 to 4000 Hz. To create each component at a desired IC, two uncorrelated narrowband noises were first generated in the Fourier domain using real and imaginary values from a Gaussian distribution at each spectral frequency with a sampling rate of 48 kHz. The two uncorrelated noises were then mixed using the symmetric generator method (Plenge, 1972; Hartmann & Cho, 2011) so that the stimulus component in one channel (L) was the addition of the two noises (N₁ and N₂) and the other (R) was the subtraction:

\begin{matrix} L = α N_{1} + β N_{2} \\ R = α N_{1} - β N_{2} \end{matrix}

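where α = [½(IC + 1)]^½ and β = (1 – α²)^½, so that α² + β² = 1 and, for uncorrelated equal-power N₁ and N₂, the coherence of L and R is α² – β² = IC.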
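The mixing rule can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it uses white Gaussian samples instead of Fourier-domain narrowband noises, takes β² = 1 − α² = ½(1 − IC), and forces the two source noises to be exactly orthogonal with unit energy so that the zero-lag normalized correlation of L and R equals the target IC. All function names are hypothetical, and the orthonormalization step is an assumption of this example.

```python
import math
import random


def orthonormal_pair(n, rng):
    """Two Gaussian noise vectors with unit energy and zero cross-product."""
    n1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    n2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    e1 = math.sqrt(sum(x * x for x in n1))
    n1 = [x / e1 for x in n1]
    # Remove the component of n2 that lies along n1 (Gram-Schmidt step).
    proj = sum(a * b for a, b in zip(n1, n2))
    n2 = [b - proj * a for a, b in zip(n1, n2)]
    e2 = math.sqrt(sum(x * x for x in n2))
    n2 = [x / e2 for x in n2]
    return n1, n2


def symmetric_mix(n1, n2, ic):
    """L = alpha*N1 + beta*N2, R = alpha*N1 - beta*N2 with alpha^2 + beta^2 = 1."""
    alpha = math.sqrt(0.5 * (1.0 + ic))
    beta = math.sqrt(0.5 * (1.0 - ic))  # equal to sqrt(1 - alpha^2)
    left = [alpha * a + beta * b for a, b in zip(n1, n2)]
    right = [alpha * a - beta * b for a, b in zip(n1, n2)]
    return left, right


def zero_lag_coherence(left, right):
    """Normalized correlation of the two channels at zero lag."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den


rng = random.Random(0)
n1, n2 = orthonormal_pair(4800, rng)
for ic in (0.5, 0.75, 0.88, 0.95, 1.0):
    left, right = symmetric_mix(n1, n2, ic)
    print(ic, round(zero_lag_coherence(left, right), 6))
```

With finite noise samples that are merely independent rather than exactly orthogonal, the realized coherence scatters around the target value, so some selection or correction step is needed to hit a target IC within a tight tolerance.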

These narrowband noises were repeatedly generated in advance until there were 100 samples of each narrowband noise within 0.0001 tolerance of the desired IC (0-1.00 in 0.01 increments) at each center frequency. Each narrowband noise was equalized to have the same RMS level regardless of bandwidth and summed. Each broadband signal was composed from a random selection out of the 100 stored samples at each of the center frequencies on each stimulus presentation and for each interval. An ITD of −312, 0 or 312 μs subsequently was applied to the broadband signal. The signal was then adjusted to a calibrated long-term average A-weighted level of 75 dB SPL using an artificial ear (Bruel & Kjaer 4153) coupled to a sound level meter (Bruel & Kjaer 2600). For three participants with moderate-severe sloping hearing loss, the level was adjusted to 85 dB to ensure audibility [i.e., greater than 10 dB SL across test frequencies (250-4000 Hz)].³

4. Procedure

Interaural coherence discrimination thresholds were measured using a two-interval forced-choice adaptive procedure. On each trial, participants were presented with two intervals, one at the reference IC and the other at the reference IC minus the IC difference (ΔIC). Participants were asked to judge which of the two sounds appeared wider to them. The ΔIC was adjusted using a two-down/one-up rule (decreasing after two consecutive correct responses and increasing after each incorrect response), asymptoting on the 71%-correct point on the psychometric function (Levitt, 1971). After being instructed on the task, participants were given a shorter version of the adaptive task with a reference IC of 1, starting ΔIC of 0.3 and step size of 0.1 to familiarize them with the stimuli and the task just prior to testing. For test trials, the ΔIC was adjusted in 0.02 steps for the first two reversals, then 0.01 steps for six more reversals. Thresholds were calculated as the average of the last four reversals. Thresholds were estimated at five reference IC values of 0.5, 0.75, 0.88, 0.95 and 1. The order of reference IC was randomized across runs. Initial ΔIC values were 0.33, 0.22, 0.15, 0.10 and 0.08, respectively, all being approximately 0.05 higher than the discrimination thresholds obtained by Pollack and Trittipoe (1959). The average adaptive-track length was 31 trials. The total session lasted 1-1.5 hours.
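The adaptive track can be sketched as a transformed staircase that lowers ΔIC after two consecutive correct responses and raises it after each incorrect response, the rule that converges near 71% correct (Levitt, 1971). The code below is a hedged illustration only: the function `run_track`, the simulated listeners, the 0.01 floor on ΔIC, and the `max_trials` guard are assumptions of this example, not details reported in the study.

```python
import math
import random


def run_track(p_correct, start_delta, ref_ic=1.0, rng=None, max_trials=1000):
    """Run one track; return (threshold, number of trials, reversal values)."""
    rng = rng or random.Random()
    delta = start_delta
    correct_in_row = 0
    last_direction = 0  # -1 after a downward step, +1 after an upward step
    reversals = []
    trials = 0
    while len(reversals) < 8 and trials < max_trials:
        trials += 1
        # 0.02 steps until two reversals have occurred, 0.01 steps afterwards.
        step = 0.02 if len(reversals) < 2 else 0.01
        if rng.random() < p_correct(delta):
            correct_in_row += 1
            if correct_in_row < 2:
                continue  # wait for a second consecutive correct response
            correct_in_row = 0
            direction = -1
        else:
            correct_in_row = 0
            direction = +1
        if last_direction and direction != last_direction:
            reversals.append(delta)
        last_direction = direction
        # Assumed limits: delta stays between 0.01 and the reference IC.
        delta = min(max(round(delta + direction * step, 2), 0.01), ref_ic)
    if len(reversals) < 8:
        raise RuntimeError("track did not reach eight reversals")
    return sum(reversals[-4:]) / 4, trials, reversals


def simulated_listener(delta, scale=0.05):
    """Hypothetical psychometric function rising from 50% to 100% correct."""
    return 0.5 + 0.5 * (1.0 - math.exp(-((delta / scale) ** 2)))


threshold, n_trials, revs = run_track(simulated_listener, 0.08, rng=random.Random(1))
print(threshold, n_trials, revs)
```

For a deterministic listener that is always correct at ΔIC ≥ 0.05 and always wrong below it, the track started at 0.08 settles into alternating reversals at 0.04 and 0.05.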

B. Experiment II

In the second experiment, the particular role of aging in the results of Experiment I was investigated with a sample of participants matched for hearing loss performing the same discrimination task as Experiment I against a diotic reference. Twenty-one participants (10 female, 11 male), none of whom participated in Experiment I, were recruited based on previously measured audiometric thresholds showing negligible asymmetry between the two ears, and no signs of conductive loss in either ear. Their ages ranged from 47-77 years (median age 65 years). Pure-tone thresholds were re-assessed just prior to the experiment and are shown in Figure 2. The actual range of VPTAs was 33-43 dB HL with asymmetries of 0-10 dB HL. At the time of testing, six of the participants were unilaterally aided and two were bilaterally aided. All testing was done unaided.

Figure 2. Pure-tone audiometric thresholds as a function of frequency for Experiment II. Gray lines show individual participants’ better-ear (based on variable pure-tone threshold average) audiograms. Black lines show median thresholds for left (crosses) and right (circles) ears. Error bars show first and third quartile ranges.

The apparatus was the same as in Experiment I. The stimuli were generated as above except that only a diotic (IC = 1; 0 ITD and ILD) reference was used. Participants performed the same discrimination task procedure as in Experiment I but only for one stimulus condition with three interleaved tracks (cf. one track in Experiment I) with starting ΔIC values randomly chosen from the range 0.14-0.18, based on the results of Experiment I. ΔIC thresholds were calculated as the average of the three interleaved threshold estimations for each participant. The instructions and practice were the same as in the previous experiment.

C. Experiment III

In the first part of the third experiment – Experiment IIIa – the same group of 21 participants from Experiment II also drew a visual representation of the size and position of the image they perceived on a touch screen (i.e., an open-set, cross-modal task). In the second part – Experiment IIIb – the last 15 HI participants from Experiment II selected from a set of 15 arbitrary images of source width and position the closest visual representation to what they heard (i.e., a closed-set, identification task). Experiment III directly followed Experiment II for all HI participants. In addition, four younger NH participants (1 female) from Experiment I completed Experiment IIIa and IIIb for comparison purposes.

The apparatus was the same as in the previous experiments. The stimuli were generated as above except that there were five simulated positions: 0°, ±30° and monaural left (L) and right (R). The simulated 0° position, with 0 ITD and ILD, was the same as the 0 ITD stimuli in Experiment I. The simulated ±30° positions were produced by using the ITD and ILD values derived from average measurements of the KEMAR and AUDIS person-specific impulse-response databases for targets at ±30° azimuth and 0° elevation (Gardner & Martin, 1995; Blauert et al., 1998): 229 μs and 4.8 dB, respectively. The monaural left and right signals were produced by fully attenuating the other channel. Three IC values were tested, 0.6, 0.8 and 1, at the three simulated positions as well as monaurally, totalling 11 stimulus conditions.

For Experiment IIIa, participants were asked to sketch the perceived size of the sound. Unlike Blauert and Lindemann (1986), participants sketched the size of the sound only from the front perspective, not the top. Participants were therefore instructed to project any images heard at the rear of the head into the frontal plane. No specific instructions were given on how to draw the sound sources except that the experiment was concerned with the size of the sound the participants heard. After the presentation of a stimulus, participants were presented with a 450-pixel (15.9 cm display size) square image of a mannequin head, with an ear-to-ear distance of 360 pixels. Participants were instructed to draw the size of the image using a plastic stylus which displayed a red 8-x-8 pixel square centered at the point of contact. To ensure that participants understood the mirror-image aspect of the task, practice stimuli with an IC of 1 at the five positions (L, −30°, 0°, +30° and R) were presented sequentially. None of the participants swapped the lateral position of the practice stimuli, indicating an initial understanding of the method. After this short practice, participants drew the perceived sound location and size for ten presentations of each combination of IC and position for a total of 110 trials. The stimuli were presented in randomized order. If participants did not respond, the same trial was repeated.

Occasionally the participants sketched incomplete shapes, requiring a two-dimensional recursive moving average to create closed shapes for each trial response. The responses from five older HI participants were not included in further analysis: two participants only drew dots to indicate position, and three others occasionally placed the ±30° stimuli contralaterally. Results are based on the responses of the remaining 16 older HI participants. The width was computed as the difference between the x-axis minimum and maximum for each shape drawn by the participant. The center was computed for each shape as the geometric centroid [(1/n) × Σx_n]. To account for possible outliers, the analysis excluded the minimum and maximum width from the ten responses (i.e., the lower and upper tenth percentiles) for each IC and position, resulting in eight responses per condition per participant.
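The width, center and trimming computations described above can be sketched as follows. This is a minimal illustration assuming each drawn shape is available as a list of x coordinates in pixels; the function names and the example data are hypothetical, and the shape-closing step is omitted.

```python
def shape_width(xs):
    """Width of one drawn shape: x-axis maximum minus x-axis minimum."""
    return max(xs) - min(xs)


def shape_center(xs):
    """Center of one drawn shape: mean of its x coordinates, (1/n) * sum(x_n)."""
    return sum(xs) / len(xs)


def trim_extremes(widths):
    """Drop the single smallest and single largest width from a set of responses."""
    ordered = sorted(widths)
    return ordered[1:-1]


def mean_trimmed_width(shapes):
    """Mean width over the responses that remain after trimming."""
    kept = trim_extremes([shape_width(xs) for xs in shapes])
    return sum(kept) / len(kept)


# Ten hypothetical responses for one condition; the last one is an outlier.
example_widths = [40, 42, 44, 46, 48, 50, 52, 54, 56, 200]
example_shapes = [[100, 100 + w] for w in example_widths]
print(len(trim_extremes(example_widths)))  # eight responses remain
print(mean_trimmed_width(example_shapes))
```

Trimming one response from each end keeps a single stray stroke from dominating the mean width for a condition while retaining eight of the ten responses.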

For Experiment IIIb, participants were asked to select which of 15 images most closely represented the width of the sound they heard (see Figure 3). After presentation of a stimulus, participants were presented with 15 100-pixel square images of the same mannequin in a matrix of five columns by three rows, with the forehead of the mannequin covered by a gray bar (made of visual noise) representing the ASW. Bars in the first column were 20 × 20-pixel squares; bars in successive columns were 20-pixel-high rectangles of linearly increasing width from 40 to 100 pixels. These images represented IC values of 1.0 to 0.6 in 0.1 increments based on the pilot responses of NH listeners to the first part of Experiment III (see Figure 7) and the source-width formula of Sato and Ando (2002). The top row showed left (−30°) images, the middle row center (0°) images, and the bottom row right (+30°) images. Five IC values were tested: 0.6-1.0 in increments of 0.1. The 15 combinations of position and IC were presented in randomized order for each of six trial blocks (i.e., a total of 90 trials). The first two blocks (i.e., the initial 30 trials) were considered practice trials and those responses were discarded from the results. Participants were allowed to replay any trial.
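The layout of the response images and the block structure can be sketched as below. The data structures and names are hypothetical; only the positions, IC values, bar widths, number of blocks and number of practice blocks come from the description above.

```python
import random

POSITIONS_DEG = (-30, 0, 30)               # rows: left, center, right
IC_VALUES = (1.0, 0.9, 0.8, 0.7, 0.6)      # columns, narrowest to widest bar
BAR_WIDTHS_PX = (20, 40, 60, 80, 100)      # bar width paired with each IC value


def response_grid():
    """Three rows (positions) by five columns (widths) of response images."""
    return [
        [
            {"position_deg": pos, "ic": ic, "bar_width_px": width}
            for ic, width in zip(IC_VALUES, BAR_WIDTHS_PX)
        ]
        for pos in POSITIONS_DEG
    ]


def trial_blocks(n_blocks=6, rng=None):
    """Each block presents all 15 position/IC combinations in random order."""
    rng = rng or random.Random()
    conditions = [(pos, ic) for pos in POSITIONS_DEG for ic in IC_VALUES]
    blocks = []
    for _ in range(n_blocks):
        block = conditions[:]
        rng.shuffle(block)
        blocks.append(block)
    return blocks


blocks = trial_blocks(rng=random.Random(0))
scored = blocks[2:]  # the first two blocks are treated as practice
print(sum(len(b) for b in blocks), sum(len(b) for b in scored))
```

Randomizing within blocks, rather than across all 90 trials, guarantees that every combination appears exactly once in each block, so discarding the two practice blocks leaves four responses per combination.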

Figure 3. User interface for Experiment IIIb, the closed-set identification task. Participants were asked to select the position (row) and width (column) of the displayed image that best represented the stimulus they heard.