The extent to which a position-based explanation accounts for binaural release from informational masking

Frederick J Gallun; Nathaniel I Durlach; H Steven Colburn; Barbara G Shinn-Cunningham; Virginia Best; Christine R Mason; Gerald Kidd, Jr

2008 Jul;124(1):439–449. doi: 10.1121/1.2924127

PMCID: PMC2587211 NIHMSID: NIHMS50288 PMID: 18646988

Abstract

Detection was measured for a 500 Hz tone masked by noise (an “energetic” masker) or sets of ten randomly drawn tones (an “informational” masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically (“binaural release from masking”). Thresholds observed when time and amplitude differences applied to the target were “reinforcing” (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were “opposing” (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.

INTRODUCTION

Informational and energetic masking

The term masking, as it is used in this study, refers to a decrease in the detectability of a target in the presence of an interferer. In “energetic” masking (EM), the interference can be associated with overlap of the target and the interferer acoustic energy or neural activity at a given place of excitation (i.e., the basilar membrane). In “informational” masking (IM), the overlap of excitation between the target and the masker at the auditory periphery is negligible, and the interference is assumed to take place more centrally in the auditory pathway. Obviously, these are two extreme examples and the reality is that the same masker can cause both EM and IM. Because the experiments described here were designed to examine the degree to which the same mechanisms can explain binaural release from these two quite different types of masking, artificial stimuli were constructed that would allow the two types of masking to be examined largely in isolation. It should be noted at the outset, however, that the results reported here may not generalize to masking in which the reduction in performance concerns the discriminability, intelligibility, or identifiability of the target, and the target is supra-threshold. In those cases, the tasks of the listener are different from the detection task described here. For this reason, the mechanisms underlying binaural release from masking may be different as well. Nonetheless, in the interest of starting with the most fundamental case and moving to more complex situations in a systematic manner, this study is concerned with detection in a two-interval forced-choice detection task.

When the listener’s task is to detect the presence of a tone of a given frequency, the amount of EM can be estimated by the use of a model (such as estimating the energy passed by filters with widths set to the critical bandwidths specified by Moore and Glasberg, 1983), but the degree of IM is harder to determine. In the majority of cases, the presence of IM is indicated by a rise in threshold or a decrease in performance across two situations for which the EM is the same or even reduced. Durlach et al. (2003a) suggested that the two main sources of IM appear to be target-masker similarity and stimulus uncertainty. One way of describing these situations is that a target that should be clearly audible is in some way confused with the masker or the masker distracts the listener from the target, resulting in the perception of no target or the misapprehension that the masker is the target. For this reason, the energetic model is insufficient to predict performance in the IM conditions. Furthermore, the amount of individual variability tends to be much greater for IM. This aspect of IM has been modeled through the addition of a filter width parameter and an internal noise parameter, both of which vary across listeners (e.g., Lutfi, 1993; Oh and Lutfi, 1998; Durlach et al., 2005). Despite the success of such modeling, there is much that is not yet understood about IM. For example, Richards and Tang (2006) reported modeling results that relied upon negative frequency weightings, which are incompatible with a filter-based energetic model.

Binaural hearing

The goal of this study was to examine whether the differences between EM and IM include different patterns of binaural release from masking. This issue has been raised by recent demonstrations of binaural release from IM (e.g., Neff, 1995; Kidd et al., 1994; Arbogast et al., 2002; Durlach et al., 2003b; Gallun et al., 2005; Best et al., 2005) that have at times been characterized as resulting from perceptual separation of the target and the masker, an idea that has been out of favor in the literature on binaural unmasking for energetic maskers for over 40 years (Colburn and Durlach, 1965). One situation that has been particularly well studied involves the introduction of spatial differences between a target and a competing masker colocated in front of the listener due to the presentation of an identical copy of the masker from a second location before or after a slight delay (e.g., Freyman et al., 1999; Brungart et al., 2005; Rakerd et al., 2006). Due to the precedence effect, the situation with a leading copy results in a shift of the perceived location of the masker toward the source position of that leading copy. Interestingly, substantial release from IM (but not EM) was found in nearly all situations in which the masker is presented from two locations with a delay, whether the spatially displaced copy leads or lags. Both Brungart et al. (2005) and Rakerd et al. (2006) even found that the release due to temporal onset discrepancies was not completely abolished until a delay of 64 ms. Based on these results, listeners seem to be able to use position differences based on the precedence effect out to very long delays. It is worth considering, however, that these long delays may result in “diffuse” or “widened” percepts that could also result in a release from IM by producing a quality difference that would counteract the perceptual similarity of the target and the masker. 
It is not clear, however, whether these sorts of “image width” percepts are as relevant in the real or simulated free field as they are with highly constrained stimuli presented under headphones.

Regardless of the presentation method, classic work on binaural hearing (e.g., Rayleigh, 1875; Stevens and Newman, 1936; Sandel et al., 1955) emphasized the relationship between perceived location and differences in the time it takes for stimuli to arrive at the two ears [interaural differences in time, (ITDs)] and in the amplitudes of these stimuli at the two ears [interaural differences in level, (ILDs)]. Consequently, when Hirsh (1948) discovered that interaural differences applied to a tone could improve its detection in noise presented diotically (identically at the two ears) over headphones, the initial explanations focused on differences in perceived location. Specifically, Hirsh (1948) argued that phenomenologically, it seems to the listener that this binaural masking level difference (BMLD) occurs because the tone and the noise are perceived in different locations inside the head (i.e., the decision variable is based on differences in the intracranial positions of the masker and the tonal signal). This has been referred to as the “position variable” explanation. One difficulty is that the greatest BMLD occurs for a tone reversed in phase at the two ears, while the clearest lateralization difference between the target and the masker is for a tone presented monaurally (Webster, 1951). Similarly, when a monaural target and a diotic masker are presented together near threshold, the listener can detect the target but not identify the ear to which it is presented. 
In order to address such concerns, Webster (1951) argued that since the filtering of the critical band allows the masking noise to be treated as a slowly varying sinusoid, it is possible to account for the differences between the monaural and phase-reversed conditions by postulating that at some points in time, there will exist a time difference at the two ears that is based on the phase of the summed components of the target and masker and that will depend on the interaural phase relations of the target. This modifies the position variable explanation so that the decision is based on the perceived (fluctuating) location of the combined target and masker rather than the difference in the perceived locations of each.

Extending this approach, Jeffress et al. (1956) used a vector summation method to analyze the effects of adding a tone with interaural time differences to various noises and argued that the perception underlying the BMLD is not one of a tone in one location and a noise in another, but rather of a change in either the level or the location of the noise due to the addition of the tone. Using this “vector model,” Jeffress et al. (1956) were able to account for much of the available data on the BMLD, with the exception that the model does not account for BMLDs that arise from ILDs introduced into the combined stimulus. In order to address this issue, Hafter and colleagues (Hafter et al., 1969; Hafter and Carrier, 1970; Hafter, 1971; Hafter et al., 1973) introduced the “lateralization” model, which includes the effects of ILDs and which predicts effects of differences in time and level that vary depending on whether the differences are “reinforced” (both result in lateralization to the same side) or “opposed” (each alone would result in lateralization to opposite sides). The decision variable in this model is based on integrating the absolute values of the interaural differences rather than on the mean lateral position. Consequently, a noise masker results in a situation where the cue to the presence of the target is the spread of interaural-difference values, which should be greater for reinforced than for opposed interaural differences. Hafter and Carrier (1970) reported results consistent with this explanation using masking stimuli that were tones rather than noises. This difference in masker may be quite important since a tonal masker results in fixed interaural differences, while a noise masker (with nonzero bandwidth) results in fluctuating interaural differences. Because humans are quite sensitive to fluctuations in interaural differences (Zurek and Durlach, 1987; Goupell and Hartmann, 2006, 2007a, 2007b), it is possible that when the masker is a noise, there is a cue available that is not present when the masker is a tone.
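The vector-summation idea described above can be illustrated with phasors. The following is a minimal sketch and not the formulation published by Jeffress et al. (1956): it assumes that the filtered masker can be treated as a single sinusoid at the target frequency that is identical at the two ears, and the function name `interaural_cues` and its parameters are hypothetical. It shows that adding a target with an interaural phase difference to a diotic masker produces both a phase difference and a level difference in the combined stimulus, and that both change as the instantaneous masker phase changes.

```python
import cmath
import math


def interaural_cues(masker_amp, masker_phase_deg, target_amp, target_ipd_deg):
    """Return (IPD in degrees, ILD in dB) of masker plus target at the two ears.

    Assumption: masker and target are sinusoids at one frequency, so each
    can be written as a phasor. The masker phasor is the same at both ears;
    the target leads at the left ear by target_ipd_deg.
    """
    masker = masker_amp * cmath.exp(1j * math.radians(masker_phase_deg))
    left = masker + target_amp * cmath.exp(1j * math.radians(target_ipd_deg))
    right = masker + target_amp
    # Phase of left relative to right, and level of left relative to right.
    ipd_deg = math.degrees(cmath.phase(left * right.conjugate()))
    ild_db = 20.0 * math.log10(abs(left) / abs(right))
    return ipd_deg, ild_db


# Sweep the masker phase to mimic a slowly varying narrowband noise.
for masker_phase in (0, 90, 180, 270):
    ipd, ild = interaural_cues(1.0, masker_phase, 0.3, 54.0)
    print(f"masker phase {masker_phase:3d} deg: IPD {ipd:6.2f} deg, ILD {ild:5.2f} dB")
```

With a tonal masker the masker phase is fixed, so the interaural differences of the sum are fixed; with a noise masker they fluctuate over time, which is the distinction drawn in the text.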

An additional complication associated with comparing opposing and reinforcing interaural differences is that when opposing time and level differences are presented over headphones, the perceived intracranial image is quite different from that obtained with a diotic version of the same stimulus. As Hafter and Carrier (1972) demonstrated, there is no combination of ILD and ITD that results in a perception that cannot be distinguished from a diotic percept. In addition, the level difference needed to “cancel” a given time difference can vary substantially across listeners. Nonetheless, the data of Hafter and Carrier (1972) give us good reason to believe that the perceived location of a stimulus with opposing ITD and ILD values should be quite different from the perceived location with reinforcing cues. Consequently, any differences in the BMLD that occur for opposing and reinforcing cues would suggest that the perceived location (or some other interaction of time and level cues) may play a role in the formation of the BMLD. Alternatively, those situations in which opposing and reinforcing cues produce similar BMLDs provide evidence that either (1) each cue is processed independently or (2) the task is based on a cue (such as fluctuations in interaural differences) that is independent of the perceived location.

A substantively different approach to the BMLD was proposed by Durlach (1960, 1963, 1972) in his formulation of the equalization and cancellation (EC) model of binaural release from masking. The EC model postulates that the stimuli at the two ears are first processed by independent banks of bandpass filters and then passed to an EC mechanism that equalizes the levels of the maskers at the two ears. Subsequently, the total signal at one ear is subtracted from the total signal at the other ear (in the case where the masker is identical at the two ears, the only operation is the subtraction). It is the output of this EC process that serves as the signal to be detected. In the case where the output is below the level of the internal noise, the system is assumed to make its decision based on an independent monaural pathway. One substantial difference between the predictions of the EC model and the lateralization model lies in the effect of introducing time and level differences that are either reinforcing or opposing. Since the EC decision is based entirely on differences in the signals at the two ears, calculated independently, the model predicts that performance for signals in which the interaural differences are reinforcing should be identical to performance for signals in which the differences are opposing. In a test of this prediction, Colburn and Durlach (1965) reported results that conform precisely to the predictions of the EC model (replotted in Fig. 2). Whether the proper explanation is based on the combined lateralization of the target and masker or the result of an EC operation, this pattern of results has been taken as strong evidence against differences in the perceived locations of the target and the masker as an explanation for binaural release from masking. 
In defense of the lateralization model, however, Hafter (1971) pointed out that while the average perceived locations differ for opposing and reinforcing ITDs and ILDs, the instantaneous values are constantly varying for a noise masker but are fixed for a tonal masker. Because the lateralization model acts not on the average location but on the instantaneous location, Hafter (1971) argued that the data obtained with noise maskers are in agreement both with the lateralization model and with the EC model. Hafter (1971) also argued that this explains why the results of Hafter and Carrier (1970), with a tonal masker, differ from those of Colburn and Durlach (1965), with a noise masker. This point was reinforced by Domnitz and Colburn (1976), who demonstrated that all of the models described above predict dependence on the target parameters. Furthermore, all of the proposed models turn out to rely upon sensitivity to fluctuating interaural differences.
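The EC prediction for reinforcing and opposing differences can be checked with a short phasor calculation. This is a simplified sketch under stated assumptions, not the EC model itself: it omits the peripheral filter bank, the equalization step (unnecessary when the masker is diotic), and the internal noise, and the function name `ec_residual_amplitude` is hypothetical. With a diotic masker, subtracting one ear from the other removes the masker entirely, so the residual is just the difference between the target components at the two ears.

```python
import cmath
import math


def ec_residual_amplitude(level_left_db, level_right_db, left_phase_lead_deg):
    """Amplitude of (left - right) for a tonal target after a diotic masker
    has been cancelled by subtraction. Levels are relative, in dB."""
    a_left = 10.0 ** (level_left_db / 20.0)
    a_right = 10.0 ** (level_right_db / 20.0)
    left = a_left * cmath.exp(1j * math.radians(left_phase_lead_deg))
    right = complex(a_right, 0.0)
    return abs(left - right)


# Left ear higher in level by 6 dB in both cases.
reinforcing = ec_residual_amplitude(0.0, -6.0, +54.0)  # left also leads in phase
opposing = ec_residual_amplitude(0.0, -6.0, -54.0)     # right leads in phase
itd_only = ec_residual_amplitude(0.0, 0.0, 54.0)       # equals 2*sin(27 deg)
phase_reversed = ec_residual_amplitude(0.0, 0.0, 180.0)

print(reinforcing, opposing, itd_only, phase_reversed)
```

Reversing the sign of the phase difference turns the residual phasor into its complex conjugate, so its magnitude is unchanged. In this idealized form, that is why subtraction gives identical outputs for reinforcing and opposing cues, and why the residual is largest for a phase-reversed target.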

Figure 2. Average BMLDs (differences between diotic thresholds and binaural thresholds) for four binaural conditions for smaller (upper panel) and larger (lower panel) binaural differences. Results for the noise maskers used in the current study (black bars) are plotted alongside the results reported by Colburn and Durlach (1965) for similar noise maskers (gray bars) and the results for the multitone maskers used in the current study (white bars). The error bars represent the critical interval (± two standard errors of the mean) based on seven listeners.

Recent results showing binaural release with informational maskers (Kidd et al., 1994; Neff, 1995; Freyman et al., 1999; Arbogast et al., 2002; Durlach et al., 2003b; Gallun et al., 2005; Best et al., 2005; Rakerd et al., 2006) have rekindled the idea that in some situations, listeners may indeed use perceived differences in spatial location (or possibly image “width” or “diffuseness”) to distinguish the target from the masker. While this may seem surprising given the results described above for EM, it is possible that the results of Colburn and Durlach (1965) do not apply to informational maskers since the mechanism of masking is not simply (or primarily) the overlap of energy in a critical band. As a result, the target and the masker may often be heard as distinct auditory objects in distinct locations in conditions where IM is the dominant form of interference. This interpretation is especially likely for situations involving actual or simulated free-field presentation, which result in a much wider range of possible perceived locations. Consequently, the mechanism of release could indeed rely upon an enhancement in the individual distinctiveness of the target and the informational masker through differences in the perceived location or other spatial attributes. This kind of formulation of the perceptual process is very different from simply detecting differences between intervals containing only the masker and those containing an object made up of the target plus the masker. In particular, the cue proposed above for EM (fluctuating interaural differences) was specifically associated with the combined target and masker stimulus. If such fluctuations were heard as a broadening of image width, for example, there is no reason to believe that this would help distinguish the target from the masker. 
Because there are two issues to be examined here, binaural release for informational maskers and the role of target-masker distinctiveness, it seemed appropriate to begin with the simplest case and proceed to the more complex cases. As the simplest IM case is the detection of a tone in multitone maskers, this was chosen as a starting point. If this condition behaves like the EM condition, then it is reasonable to examine more complex conditions, while if it does not, then the two types of masking may indeed differ at a very fundamental level.

To begin to address this issue, listeners were presented with stimuli that strongly resemble those used by Colburn and Durlach (1965) but differ in that a substantial aspect of the masking is due to the target-masker similarity (and/or uncertainty about the masker frequencies) rather than energy falling in the critical band containing the target. Once baseline performance was established for informational and energetic maskers presented diotically over headphones, interaural differences were applied to the targets. Release from masking was measured for ITD alone, ILD alone, reinforcing ITD and ILD, and opposing ITD and ILD for both types of masker. While the results for both stimulus types showed the same pattern found by Colburn and Durlach (1965), the relationships between the amounts of release obtained by individual listeners for the two masker types raise doubts about whether they truly share a common mechanism that is based on interaural differences rather than on perceived location.

METHODS

Listeners

Seven female listeners between the ages of 20 and 25 with audiometrically normal hearing were paid for their participation. L4 had considerable prior experience with psychophysical listening but very little with stimuli of this sort. None of the others had experience listening in psychoacoustical experiments. All were graduate students at Boston University in hearing-related disciplines (primarily speech and language pathology).

Stimuli

The target to be detected was a 250 ms, 500 Hz tone with 10 ms raised-cosine onsets and offsets. Noise maskers were generated digitally by creating a frequency vector with values spaced at 1 Hz intervals between 100 and 1000 Hz and associating each frequency value with a randomly chosen amplitude and phase value, drawn from rectangular distributions (thus resulting in random but not Gaussian noise). Signals were then converted to the time domain and normalized so that, after attenuation, the overall rms level was 60 dB sound pressure level (SPL) (spectrum level of 31.5 dB SPL). Multitone maskers were generated digitally by choosing from a linear distribution of ten frequencies that fell between 100 and 400 Hz and between 600 and 1000 Hz (leaving a 200 Hz wide “protected region” between 400 and 600 Hz). Each masker frequency was then associated with a random phase value drawn from a rectangular distribution and with an amplitude that was randomly varied within ±5 dB of an arbitrary starting amplitude, also from a rectangular distribution (in decibels). Ten new multitone masker frequencies were chosen randomly before each interval of each trial, always maintaining the 200 Hz protected region. Time-domain conversion and amplitude normalization ensured that, after attenuation, the overall rms level of the multitone masker was 70 dB SPL. The maskers, like the targets, were 250 ms in duration with 10 ms raised-cosine onsets and offsets. The difference in the rms levels of the noise and multitone maskers was initially the result of a programming error but fortuitously led to similar diotic target thresholds for both maskers.
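The multitone masker described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names are hypothetical, "linear distribution" is assumed to mean uniform on a linear frequency scale over the two allowed regions, the tones are summed directly in the time domain, and the ramps are assumed to be applied before normalization to unit rms (the absolute level in dB SPL is left to the attenuators and is not modeled here).

```python
import math
import random

FS = 50_000    # sample rate in Hz (the converter rate given under Procedures)
DUR = 0.250    # masker duration in seconds
RAMP = 0.010   # raised-cosine ramp duration in seconds


def draw_frequencies(rng, n_tones=10):
    """Draw tone frequencies uniformly (in Hz) from 100-400 and 600-1000 Hz."""
    freqs = []
    for _ in range(n_tones):
        x = rng.uniform(0.0, 700.0)  # 300 Hz below plus 400 Hz above the gap
        freqs.append(100.0 + x if x < 300.0 else 300.0 + x)
    return freqs


def raised_cosine_envelope(n_samples, n_ramp):
    """Flat envelope with raised-cosine onset and offset ramps."""
    env = [1.0] * n_samples
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = gain
        env[n_samples - 1 - i] = gain
    return env


def multitone_masker(rng):
    """Return (frequencies, samples) for one masker, scaled to unit rms."""
    freqs = draw_frequencies(rng)
    # Amplitudes vary within +/-5 dB, rectangular in decibels.
    amps = [10.0 ** (rng.uniform(-5.0, 5.0) / 20.0) for _ in freqs]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    n = round(FS * DUR)
    env = raised_cosine_envelope(n, round(FS * RAMP))
    samples = []
    for i in range(n):
        t = i / FS
        s = sum(a * math.sin(2.0 * math.pi * f * t + p)
                for f, a, p in zip(freqs, amps, phases))
        samples.append(env[i] * s)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return freqs, [s / rms for s in samples]


freqs, masker = multitone_masker(random.Random(0))
print(sorted(round(f) for f in freqs))
```

Calling `multitone_masker` once per interval gives a fresh draw of ten frequencies each time, matching the description that new frequencies were chosen before each interval of each trial.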

ILD values were introduced into the target by reducing the level at the left or right ear by either 6 dB (“smaller differences”) or 12 dB (“larger differences”). ITD values were introduced by shifting the wave form by either 300 μs, which is equivalent to a phase delay of 54° (smaller differences) or 600 μs, which is equivalent to a phase delay of 108° (larger differences). The target and masker envelopes were applied after the phase shifts, ensuring that the onsets and offsets were synchronized, regardless of interaural differences applied to the wave forms. This removed an onset cue that would be present in a natural situation but ensured that any BMLD arises only because of ongoing binaural information.
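The relation between the time shifts and the quoted phase delays, and the order of operations (phase shift first, envelope second), can be sketched as follows. The function names and sign conventions are assumptions made for illustration: a positive `itd_us` advances the left ear in phase and a positive `ild_db` attenuates the right ear, so the paper's "reinforcing" case corresponds to both being positive.

```python
import math

FS = 50_000  # sample rate in Hz (the converter rate given under Procedures)


def phase_delay_deg(itd_us, freq_hz=500.0):
    """Phase delay in degrees equivalent to a time shift at freq_hz."""
    return 360.0 * freq_hz * itd_us * 1e-6


def binaural_target(itd_us, ild_db, freq_hz=500.0, dur_s=0.250, ramp_s=0.010):
    """Return (left, right) sample lists for the tonal target."""
    n = round(FS * dur_s)
    n_ramp = round(FS * ramp_s)
    phase = math.radians(phase_delay_deg(itd_us, freq_hz))
    # An ILD is created by reducing the level at one ear only.
    gain_left = 10.0 ** (-max(-ild_db, 0.0) / 20.0)
    gain_right = 10.0 ** (-max(ild_db, 0.0) / 20.0)
    left, right = [], []
    for i in range(n):
        # The same envelope is applied to both ears after the phase shift,
        # so onsets and offsets stay synchronized.
        edge = min(i, n - 1 - i)
        env = 1.0 if edge >= n_ramp else 0.5 * (1.0 - math.cos(math.pi * edge / n_ramp))
        w = 2.0 * math.pi * freq_hz * i / FS
        left.append(env * gain_left * math.sin(w + phase))
        right.append(env * gain_right * math.sin(w))
    return left, right


print(phase_delay_deg(300.0), phase_delay_deg(600.0))
reinforcing_left, reinforcing_right = binaural_target(300.0, 6.0)
opposing_left, opposing_right = binaural_target(-300.0, 6.0)
```

Because the envelope is zero at the first sample in both ears regardless of the phase shift, the only binaural information in this sketch is carried by the ongoing waveform.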

Procedures

After time-domain conversion and normalization, stimuli were sent to Tucker-Davis Technologies (TDT System II) 16 bit digital-to-analog converters running at a rate of 50 kHz and then low-pass filtered at 7.5 kHz. The target and masker levels at the two ears were controlled separately with a set of four PA4 computer-controlled attenuators and were appropriately combined before being presented through matched and calibrated TDH-50 earphones. Listeners were seated in individual double-walled Industrial Acoustics Company, Inc. (IAC) booths and made responses on a handheld response pad equipped with a screen providing instructions and feedback.

Trials consisted of three intervals, the first of which was a cue and either the second or the third of which contained the target. The cue consisted of the target to be detected, presented at the level at which it would be added to the masker and in the interaural configuration in which it would appear. New maskers were generated on each trial and different maskers were presented on the second and third intervals. For the multitone maskers, each masked interval contained a new, randomly drawn selection of ten frequencies. Indicators on the response pad marked the timing of the intervals and listeners reported whether the target had occurred in the second or the third interval.

The target level was varied adaptively, starting at 60 dB SPL at the more intense ear (or at both ears when no ILD was present) and then choosing successive levels using 2-down/1-up adaptive tracking (Levitt, 1971), which estimates the target level that results in 70.7% correct detections. The target level was changed by 4 dB for the first four reversals and then 2 dB for an additional ten reversals. Threshold estimates were based on the target levels obtained in the final ten reversals.
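The tracking rule can be sketched as below. The rule lowers the level only after two consecutive correct responses and raises it after any error, so it converges on the level where the probability of two correct responses in a row is 0.5, that is, about 70.7% correct. This is a minimal sketch and not the authors' software: the function names are hypothetical, the threshold is assumed to be the mean of the final ten reversal levels, and the deterministic listener is an artificial stand-in used only to make the behavior easy to follow.

```python
def two_down_one_up(respond, start_level=60.0, big_step=4.0, small_step=2.0,
                    n_big=4, n_small=10):
    """Run one adaptive track and return the threshold estimate in dB.

    respond(level) must return True for a correct response at that level.
    """
    level = start_level
    correct_in_row = 0
    direction = None  # -1 while descending, +1 while ascending
    reversals = []
    while len(reversals) < n_big + n_small:
        if respond(level):
            correct_in_row += 1
            if correct_in_row < 2:
                continue  # need two correct in a row before stepping down
            correct_in_row = 0
            new_direction = -1
        else:
            correct_in_row = 0
            new_direction = +1
        if direction is not None and new_direction != direction:
            reversals.append(level)  # track changed direction at this level
        direction = new_direction
        step = big_step if len(reversals) < n_big else small_step
        level += new_direction * step
    # Assumption: threshold is the mean of the final n_small reversal levels.
    return sum(reversals[-n_small:]) / n_small


def deterministic_listener(true_threshold_db):
    """Hypothetical listener: always correct at or above threshold, else wrong."""
    return lambda level: level >= true_threshold_db


estimate = two_down_one_up(deterministic_listener(41.0))
print(estimate)
```

A more realistic simulation would replace the deterministic listener with one that guesses correctly half the time below threshold, as in a two-interval task; the tracking function itself would not need to change.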

Conditions

Two masker conditions, noise (EM) and multitone (IM), were crossed with five interaural configurations of the target: (1) ITD alone, (2) ILD alone, (3) “reinforcing” ITDs and ILDs (left ear advanced in phase and higher in level), (4) “opposing” ITDs and ILDs (right ear advanced in phase; left ear higher in level), and (5) diotic. The masker was always presented diotically. Pilot testing with a localization procedure confirmed that in the opposing condition, an increase in level of 6 dB at the left ear and an advance of 300 μs in time at the right ear produced a percept roughly localized in the center of the head. Similar effects were obtained with an ILD of 12 dB and an ITD of 600 μs. In order to allow comparisons with the work of Colburn and Durlach (1965), all five interaural configurations (including, for symmetry, the diotic) were presented with both the larger and smaller differences.

All listeners participated in four repetitions of each unique interaural configuration condition (eight repetitions of the diotic) for each masker type, presented in blocks of the five interaural configurations arranged in a random order. Each block consisted of either the noise or the multitone masker and either larger or smaller interaural differences. This resulted in average thresholds based on 40 reversals of the adaptive tracks for each masker type for each of the binaural conditions at each size of differences. Because the diotic conditions for the larger and smaller differences were identical, they were combined for analysis, allowing the baseline measures of EM and IM to be more accurately measured (80 reversals rather than 40).

Calculation of the binaural masking level difference

Colburn and Durlach (1965) defined the BMLD as the ratio of the diotic threshold to the maximum of the levels at the two ears (at threshold) for a given “binaural” condition, expressed in decibels. The calculation therefore simply involves subtracting the higher of the threshold target levels at the two ears in a given binaural condition (for example, 39 dB) from the threshold target level in the diotic condition (for example, 45 dB), which in this example gives a BMLD of 6 dB. For conditions where the BMLD is due entirely to the ITD, this is not problematic. For the ILD conditions, however, this may be a conservative estimate of the BMLD due to the fact that the loudness of a tone presented monaurally is less than that of the same tone presented binaurally (reviewed by Durlach and Colburn, 1978).

Consider the situation where a 12 dB ILD has been introduced by reducing the target level at the right ear by 12 dB but keeping the target level at the left ear the same. If the threshold at the ear with the higher level is unchanged, then the BMLD is 0 dB according to this calculation. If, on the other hand, the threshold is considered to be the level from which one ear is raised by 6 dB and the other lowered by 6 dB, then the BMLD is 6 dB. In addition, if the cue the listener is using is in some way related to the loudness of the target, then the calculation based on the maximum of the levels at the two ears fails to take into account that the listener has now detected a softer target in the ILD condition than in the diotic condition. Presumably, this ability reflects a binaural processing advantage, but the BMLD calculation shows none.
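The two ways of computing the BMLD discussed above can be written out directly. The function names are hypothetical; `bmld_max_level` follows the maximum-level definition attributed to Colburn and Durlach (1965), and `bmld_mean_level` implements the alternative in which the reference is the level midway (in decibels) between the two ears. The numerical values are the examples used in the text.

```python
def bmld_max_level(diotic_db, left_db, right_db):
    """BMLD relative to the ear with the higher target level at threshold."""
    return diotic_db - max(left_db, right_db)


def bmld_mean_level(diotic_db, left_db, right_db):
    """BMLD relative to the level midway between the two ears (in dB)."""
    return diotic_db - (left_db + right_db) / 2.0


# Example from the text: diotic threshold 45 dB, binaural threshold 39 dB
# at the more intense ear.
print(bmld_max_level(45.0, 39.0, 39.0))

# 12 dB ILD with the left-ear threshold unchanged from the diotic threshold.
print(bmld_max_level(45.0, 45.0, 33.0))
print(bmld_mean_level(45.0, 45.0, 33.0))
```

The last two lines reproduce the contrast drawn in the text: the same thresholds yield a release of 0 dB under the maximum-level convention and 6 dB under the midpoint convention.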

On the other hand, the BMLD is intended to reflect the improvement obtained with two ears relative to performance with a single ear, for which it makes sense to examine changes in the level at the ear with the maximum target level. Using the maximum value calculation both allows comparisons with the results of previous studies and ensures that there is no overestimation of binaural release from masking simply due to the method of calculating the differences. If the listener in the 12 dB ILD condition made responses based only on the signal at the left ear (which has the most intense target), then the results would be identical for all of the various binaural conditions. A measure based on differences between the most intense target levels presented would give the correct answer in that case, whereas a measure based on any other level would lead to overestimates of the BMLD.

For the purposes of comparing the reinforcing and opposing conditions and comparing the noise masker with the multitone masker, this issue is irrelevant. In addition, there is no reason, given current models of the BMLD, to question this calculation as it gives the appropriate predictions for the EM cases. It is worth pointing out, however, that if one assumes that the target and masker are heard as distinct auditory objects with independently perceived locations, then the role of binaural loudness may become more important than the relative influence of ITD and ILD on fluctuations of interaural differences calculated on the basis of the combined target and masker. In such a case, the current method of calculating the BMLD would potentially underestimate the amount of release obtained by the introduction of ILDs.

RESULTS

Figure 1 shows the individual target levels at threshold for the noise and multitone maskers. Diotic thresholds are shown by horizontal lines (solid for noise maskers and dashed for multitone maskers) and the listener panels are ordered by the level of the multitone thresholds. The upper panels (A) show the thresholds obtained with the smaller binaural differences (6 dB and 300 μs), while the lower panels (B) show the thresholds obtained with the larger binaural differences (12 dB and 600 μs). As can be seen, the variability across listeners is much greater for the multitone than for the noise maskers. This is true both for the diotic thresholds and for the amount of release caused by the various interaural differences.

Figure 1. Threshold level of the 500 Hz tonal target (at the ear with the more intense signal when a level difference was imposed) for all seven listeners across the 18 conditions. Diotic thresholds for the noise maskers are shown with solid lines and diotic thresholds for the multitone maskers are shown with dashed lines. Listeners are ordered by multitone thresholds. In each panel, the filled symbols correspond to the thresholds for the noise maskers and the open symbols correspond to the thresholds for the multitone maskers. The squares represent thresholds for ITD alone, the circles represent ILD alone, the triangles indicate reinforcing ITD and ILD (+), and the inverted triangles indicate opposing ITD and ILD (−). The upper panel contains the thresholds for the smaller binaural differences (6 dB and 300 μs) while the lower panel contains thresholds for the larger binaural differences (12 dB and 600 μs). Error bars represent the critical interval (± two standard errors of the mean).

Figure 2 shows the average BMLD values obtained in this experiment plotted alongside the results of Colburn and Durlach (1965). It should be noted that although Colburn and Durlach (1965) used phase shifts of 45° and 90° (250 and 500 μs), the ILDs used were the same. In addition, the frequency distribution of the noise stimuli used in this experiment was closely matched to theirs, even though their noise was analog and Gaussian and thus their amplitude and phase values were Rayleigh and uniform, respectively, rather than both uniform in distribution. The thresholds obtained in the current experiment, and on which the BMLDs are based, appear in Table 1. A three-way repeated measures analysis of variance performed on the four BMLD conditions (entered as differences between diotic and binaural thresholds) from the current experiment across the two masker types at both sizes of interaural differences found a significant main effect of interaural configuration (F(3,18) = 6.19, p < 0.01), a significant main effect of the size of the binaural differences (F(1,6) = 21.88, p < 0.01), but no significant effect of masker type (F(1,6) = 0.154, p = 0.708) and no significant interactions (F values all less than 1.1, p values all greater than 0.35). Planned paired-sample t tests found no differences between BMLDs for the opposing and reinforcing interaural differences for either masker type at either size of the differences.

Table 1.

Listener thresholds for the detection of a 500 Hz target tone in the presence of diotic noise or multitone maskers. See text for details on how smaller and larger interaural differences were applied to the target.

Listener	Noise Masker
	Smaller differences (6 dB, 300 μs)					Larger differences (12 dB, 600 μs)
	Diotic	ITD	ILD	Reinforcing	Opposing	Diotic	ITD	ILD	Reinforcing	Opposing
1	42.4	38.3	43.9	38.7	40.4	42.4	34.3	38.4	35.9	36.6
2	44.3	37.9	43.0	39.4	41.9	44.3	35.7	41.4	38.2	39.2
3	45.7	42.2	45.5	43.4	43.6	45.7	40.6	44.7	49.6	42.3
4	42.7	37.4	44.2	37.8	41.8	42.7	34.2	39.0	35.4	35.6
5	44.3	41.8	45.2	42.3	43.6	44.3	37.4	44.6	39.1	38.1
6	45.9	42.4	44.3	43.1	43.3	45.9	33.7	43.4	40.5	40.8
7	48.7	45.2	48.3	49.0	45.1	48.7	39.4	44.8	44.5	42.4
Mean	44.8	40.7	44.9	41.9	42.8	44.8	36.4	42.3	40.4	39.3
	Multitone Masker
	Smaller differences (6 dB, 300 μs)					Larger differences (12 dB, 600 μs)
	Diotic	ITD	ILD	Reinforcing	Opposing	Diotic	ITD	ILD	Reinforcing	Opposing
1	51.7	44.2	51.7	49.8	51.1	51.7	41.4	49.7	43.7	48.5
2	41.6	35.8	42.1	37.5	35.2	41.6	33.2	45.4	35.4	41.1
3	44.1	43.4	42.2	48.8	44.8	44.1	40.8	44.7	43.3	42.1
4	40.6	32.0	44.8	32.9	36.1	40.6	30.2	40.7	32.9	36.7
5	60.6	50.9	51.2	43.8	50.7	60.6	41.3	48.9	44.9	44.4
6	50.6	43.3	47.1	51.2	48.5	50.6	44.5	44.1	47.9	43.4
7	62.8	62.7	63.7	64.9	64.3	62.8	51.2	66.0	56.9	56.5
Mean	50.3	44.6	48.9	47.0	47.2	50.3	40.4	48.5	43.5	44.6
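
As a rough check on the magnitudes involved, the mean BMLDs can be recomputed from the mean thresholds in Table 1 as the diotic threshold minus the binaural threshold. The Python sketch below is illustrative only: the variable and function names are hypothetical, and because it starts from the rounded means printed in the table, its values may differ slightly from those plotted in Fig. 2.

```python
# Mean thresholds (dB) copied from the "Mean" rows of Table 1.
# "diotic" is the reference threshold; the nested dicts hold the
# binaural conditions for the smaller and larger interaural differences.
MEAN_THRESHOLDS = {
    "noise": {
        "diotic": 44.8,
        "smaller": {"ITD": 40.7, "ILD": 44.9, "reinforcing": 41.9, "opposing": 42.8},
        "larger": {"ITD": 36.4, "ILD": 42.3, "reinforcing": 40.4, "opposing": 39.3},
    },
    "multitone": {
        "diotic": 50.3,
        "smaller": {"ITD": 44.6, "ILD": 48.9, "reinforcing": 47.0, "opposing": 47.2},
        "larger": {"ITD": 40.4, "ILD": 48.5, "reinforcing": 43.5, "opposing": 44.6},
    },
}


def mean_bmlds(thresholds):
    """Return BMLD = diotic threshold - binaural threshold, rounded to 0.1 dB."""
    result = {}
    for masker, data in thresholds.items():
        result[masker] = {
            size: {
                condition: round(data["diotic"] - threshold, 1)
                for condition, threshold in data[size].items()
            }
            for size in ("smaller", "larger")
        }
    return result


bmld = mean_bmlds(MEAN_THRESHOLDS)
for masker, sizes in bmld.items():
    for size, conditions in sizes.items():
        print(masker, size, conditions)
```

A positive value indicates release from masking; a value near zero or below indicates that the interaural difference gave no benefit relative to the diotic target.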


The BMLD values were also analyzed by performing two correlational analyses. In each case, each pair of values entered corresponded to the BMLDs for an individual listener in the same interaural condition. The first analysis correlated release from the noise masker with release from the multitone masker and was performed separately for larger and smaller interaural differences. The second analysis correlated release based on larger differences with release based on smaller differences and was performed separately for the noise and multitone maskers. The logic behind the first analysis was that the nonsignificant effect of masker type might have been due to variability across listeners. The second analysis was done to determine whether or not the correlational analysis had sufficient power to show a significant relationship where one was thought to exist.

The correlation between the BMLDs for the noise and multitone maskers for the smaller differences was nonsignificant (r=0.221) as was also the case for the larger differences (r=0.223). On the other hand, significant correlations (p<0.01) were found between the BMLDs obtained with larger and smaller differences for both the noise masker (r=0.559) and for the multitone masker (r=0.767). Adding listener as a covariate had the effect of increasing the correlations slightly, but the level of significance did not change. These patterns of correlation show that while individual listeners were likely to have similar patterns of BMLDs across larger and smaller interaural differences for a given masker type, the pattern for each individual was not necessarily similar across masker types.
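The structure of these two analyses can be sketched in Python using the thresholds in Table 1. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, a plain Pearson correlation is assumed (no listener covariate), and because the table values are rounded the resulting coefficients need not match the reported ones exactly.

```python
import math

# Table 1 rows per listener: [diotic, ITD, ILD, reinforcing, opposing] for the
# smaller differences, followed by the same five columns for the larger ones.
NOISE = {
    1: [42.4, 38.3, 43.9, 38.7, 40.4, 42.4, 34.3, 38.4, 35.9, 36.6],
    2: [44.3, 37.9, 43.0, 39.4, 41.9, 44.3, 35.7, 41.4, 38.2, 39.2],
    3: [45.7, 42.2, 45.5, 43.4, 43.6, 45.7, 40.6, 44.7, 49.6, 42.3],
    4: [42.7, 37.4, 44.2, 37.8, 41.8, 42.7, 34.2, 39.0, 35.4, 35.6],
    5: [44.3, 41.8, 45.2, 42.3, 43.6, 44.3, 37.4, 44.6, 39.1, 38.1],
    6: [45.9, 42.4, 44.3, 43.1, 43.3, 45.9, 33.7, 43.4, 40.5, 40.8],
    7: [48.7, 45.2, 48.3, 49.0, 45.1, 48.7, 39.4, 44.8, 44.5, 42.4],
}
MULTITONE = {
    1: [51.7, 44.2, 51.7, 49.8, 51.1, 51.7, 41.4, 49.7, 43.7, 48.5],
    2: [41.6, 35.8, 42.1, 37.5, 35.2, 41.6, 33.2, 45.4, 35.4, 41.1],
    3: [44.1, 43.4, 42.2, 48.8, 44.8, 44.1, 40.8, 44.7, 43.3, 42.1],
    4: [40.6, 32.0, 44.8, 32.9, 36.1, 40.6, 30.2, 40.7, 32.9, 36.7],
    5: [60.6, 50.9, 51.2, 43.8, 50.7, 60.6, 41.3, 48.9, 44.9, 44.4],
    6: [50.6, 43.3, 47.1, 51.2, 48.5, 50.6, 44.5, 44.1, 47.9, 43.4],
    7: [62.8, 62.7, 63.7, 64.9, 64.3, 62.8, 51.2, 66.0, 56.9, 56.5],
}


def bmlds(table, size):
    """List of BMLDs (diotic - binaural) over all listeners and conditions."""
    start = 0 if size == "smaller" else 5
    values = []
    for listener in sorted(table):
        diotic, *binaural = table[listener][start:start + 5]
        values.extend(diotic - threshold for threshold in binaural)
    return values


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# First analysis: noise vs multitone release, separately for each size.
r_across_maskers = {
    size: pearson_r(bmlds(NOISE, size), bmlds(MULTITONE, size))
    for size in ("smaller", "larger")
}
# Second analysis: smaller vs larger differences, separately for each masker.
r_across_sizes = {
    name: pearson_r(bmlds(table, "smaller"), bmlds(table, "larger"))
    for name, table in (("noise", NOISE), ("multitone", MULTITONE))
}
print(r_across_maskers)
print(r_across_sizes)
```

Each correlation is based on 28 pairs (seven listeners by four interaural conditions), with pairs matched on listener and condition as described above.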

DISCUSSION

These results cause difficulties for a purely position-based account of binaural release from IM because there was not even a trend toward greater BMLDs for reinforcing interaural differences as compared with opposing differences. The similarity between the patterns of masking release obtained with noise maskers and with multitone maskers suggests a common mechanism, but the low correlations for the individual listeners argue against such a conclusion. It might be argued that the variability in the data was simply so great that a true correlation could not be observed, but the high correlation for the smaller and larger differences suggests that this is probably not the case.

Due to the large individual differences, it could be the case that the results for the listeners who showed very little IM are fundamentally different from those who experienced larger amounts of IM. In particular, it might be reasonable to conclude that for L2, L3, and L4, for whom the multitone maskers were less effective than were the noise maskers (see Fig. 1), the majority of the masking was energetic rather than informational. On the basis of whether or not the multitone threshold exceeded the noise threshold, two groups can be created, with L2, L3, and L4 falling in the “LowIM” group and L1, L5, L6, and L7 falling in the “HighIM” group. On this basis, one could hypothesize that the LowIM listeners were actually using a mechanism of release for the multitone maskers that was more similar to that employed for the noise maskers than for the HighIM group. A test of this hypothesis would be to examine the correlation between the release for multitone and noise maskers separately for the two groups of listeners.
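The grouping rule described above can be written out directly from the diotic thresholds in Table 1. The Python sketch below is illustrative; the names `DIOTIC`, `low_im`, and `high_im` are hypothetical and are not taken from the study's own analysis code.

```python
# Diotic thresholds (dB) from Table 1: listener -> (noise, multitone).
DIOTIC = {
    1: (42.4, 51.7),
    2: (44.3, 41.6),
    3: (45.7, 44.1),
    4: (42.7, 40.6),
    5: (44.3, 60.6),
    6: (45.9, 50.6),
    7: (48.7, 62.8),
}

# A listener is assigned to HighIM when the multitone threshold exceeds the
# noise threshold, and to LowIM otherwise.
high_im = [listener for listener, (noise, multi) in sorted(DIOTIC.items())
           if multi > noise]
low_im = [listener for listener, (noise, multi) in sorted(DIOTIC.items())
          if multi <= noise]

print("LowIM:", low_im)
print("HighIM:", high_im)
```

Applied to the tabled values, this rule places L2, L3, and L4 in the LowIM group and L1, L5, L6, and L7 in the HighIM group, matching the assignment given in the text.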

For the HighIM group, the correlation between the release for the noise and multitone maskers [plotted in Fig. 3, panel (A)] was slightly negative and failed to reach significance for either smaller or larger binaural differences [r=−0.031, p=0.9 for smaller differences (triangles); r=−0.207, p=0.44 for larger differences (diamonds)]. For the LowIM group, the correlations between the release for the noise and multitone maskers [plotted in Fig. 3, panel (B)] were significant for both sizes of differences [r=0.659, p<0.05 for smaller differences (triangles); r=0.704, p<0.05 for larger differences (diamonds)]. These results clearly show that the two groups were performing quite differently. Moreover, the significant correlation between noise and multitone maskers for the LowIM group supports the hypothesis that the dominant form of masking for the LowIM group was EM for both noise and multitone maskers. In contrast, the HighIM group had thresholds that were elevated in the presence of the multitone masker due to IM, and these thresholds were uncorrelated with their performance in the EM-dominated task.

BMLD values for HighIM listeners [L1, L5, L6, and L7; panels (A) and (C)] and LowIM listeners [L2, L3, and L4; panels (B) and (D)] plotted as a function of masker type and size of the binaural differences. Note that the same data appear in both the upper and lower panels.

It is still possible to argue that the low correlations for the HighIM group simply reflect higher variability for those listeners. This is not supported, however, by the fact that the correlations between smaller and larger differences [plotted in Fig. 3, panels (C) and (D)] were significant for the multitone maskers for both groups (r=0.802, p<0.01 for HighIM [circles, panel (C)]; r=0.711, p<0.05 for LowIM [circles, panel (D)]), but only the HighIM group reached a significant correlation for the noise maskers (r=0.721, p<0.01 for HighIM [squares, panel (C)]; r=0.538, p=0.07 for LowIM [squares, panel (D)]). In short, these correlations suggest that while the two groups were performing consistently within a masker type, only the LowIM group was performing in the same way for the two masker types. Thus, correlational analysis supports the idea that the LowIM group used a mechanism related to release from EM for both masker types, but that the HighIM group changed strategies or mechanisms in response to the change in masker type. The Appendix describes a potential strategy based on widening and narrowing of the effective auditory filter that could account for this pattern of results.

A central question of this study was whether there was a significant difference in the release obtained with opposing and reinforcing binaural differences. The fact that the amount of release was not significantly different argues against a position-based explanation. However, it should be noted that while the trading ratio chosen (50 μs/dB) resulted in a roughly centered percept for one of the listeners, the results of Hafter and Carrier (1972) suggest caution in concluding that such a percept was present for all listeners, especially given the changes in overall level that were occurring over the course of the adaptive tracks. While it is unlikely that these differences in perceived location would have resulted in percepts identical to those obtained with reinforcing cues, they could certainly have led to greater variability. A similar concern could be raised regarding the fact that when the target is presented near detection threshold, it is possible that the signal at the attenuated ear is rendered inaudible, especially for 12 dB ILDs. Such inaudibility would transform both the opposing and reinforcing cues into a monaural target. The evidence against this argument comes from the fact that a similar inaudibility should have occurred for the ILD-alone condition. Since thresholds were uniformly 2–3 dB higher for the ILD-alone condition both with 6 and 12 dB ILDs, it is unlikely that ITD did not contribute.

On the other hand, variation in perceived location could have led to similar release for opposing and reinforcing cues simply due to the fact that the amount of release from IM seems to be fairly similar across a broad range of perceived locations. In this case, the nonsignificant differences between opposing and reinforcing cues for the EM and IM stimuli may give the appearance of relying upon a common mechanism when they in fact do not. Nonetheless, it is reasonable to conclude from these data that the additional perceived difference in location provided by reinforcing cues was not sufficiently informative to provide additional release.

An alternative explanation for the lack of a significant difference between the opposing and reinforcing cues is that a true difference was obscured due to averaging together the results for the two groups of listeners. In particular, it might be the case that the LowIM group would show no difference for the multitone and noise maskers, given that they performed these tasks in the same way that listeners perform typical BMLD tasks, which show no differences between opposed and reinforced differences. However, the HighIM group might show a difference for the multitone but not the noise maskers. This was not supported by the data. A series of paired t tests showed no significant differences between performance on opposing and reinforcing differences (p>0.05 in all cases) for either group for either masker type at either size of differences.
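The form of these paired comparisons can be sketched in Python from the thresholds in Table 1. This is a hedged illustration rather than the study's own analysis: the function and variable names are hypothetical, the critical values are the standard two-tailed t values for α = 0.05, and the rounded table entries mean the statistics need not equal those underlying the reported p values. Because the diotic threshold is common to both conditions, the difference between opposing and reinforcing thresholds equals the difference between the corresponding BMLDs with the sign reversed.

```python
import math
from statistics import mean, stdev

# Table 1 rows per listener: [diotic, ITD, ILD, reinforcing, opposing] for the
# smaller differences, followed by the same five columns for the larger ones.
TABLES = {
    "noise": {
        1: [42.4, 38.3, 43.9, 38.7, 40.4, 42.4, 34.3, 38.4, 35.9, 36.6],
        2: [44.3, 37.9, 43.0, 39.4, 41.9, 44.3, 35.7, 41.4, 38.2, 39.2],
        3: [45.7, 42.2, 45.5, 43.4, 43.6, 45.7, 40.6, 44.7, 49.6, 42.3],
        4: [42.7, 37.4, 44.2, 37.8, 41.8, 42.7, 34.2, 39.0, 35.4, 35.6],
        5: [44.3, 41.8, 45.2, 42.3, 43.6, 44.3, 37.4, 44.6, 39.1, 38.1],
        6: [45.9, 42.4, 44.3, 43.1, 43.3, 45.9, 33.7, 43.4, 40.5, 40.8],
        7: [48.7, 45.2, 48.3, 49.0, 45.1, 48.7, 39.4, 44.8, 44.5, 42.4],
    },
    "multitone": {
        1: [51.7, 44.2, 51.7, 49.8, 51.1, 51.7, 41.4, 49.7, 43.7, 48.5],
        2: [41.6, 35.8, 42.1, 37.5, 35.2, 41.6, 33.2, 45.4, 35.4, 41.1],
        3: [44.1, 43.4, 42.2, 48.8, 44.8, 44.1, 40.8, 44.7, 43.3, 42.1],
        4: [40.6, 32.0, 44.8, 32.9, 36.1, 40.6, 30.2, 40.7, 32.9, 36.7],
        5: [60.6, 50.9, 51.2, 43.8, 50.7, 60.6, 41.3, 48.9, 44.9, 44.4],
        6: [50.6, 43.3, 47.1, 51.2, 48.5, 50.6, 44.5, 44.1, 47.9, 43.4],
        7: [62.8, 62.7, 63.7, 64.9, 64.3, 62.8, 51.2, 66.0, 56.9, 56.5],
    },
}
GROUPS = {"all": [1, 2, 3, 4, 5, 6, 7], "LowIM": [2, 3, 4], "HighIM": [1, 5, 6, 7]}
# Two-tailed critical t values at alpha = 0.05, keyed by degrees of freedom.
CRITICAL_T = {2: 4.303, 3: 3.182, 6: 2.447}


def paired_t(table, listeners, size):
    """Paired t statistic for opposing minus reinforcing thresholds."""
    offset = 0 if size == "smaller" else 5
    diffs = [table[l][offset + 4] - table[l][offset + 3] for l in listeners]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


results = {}
for group, listeners in GROUPS.items():
    for masker, table in TABLES.items():
        for size in ("smaller", "larger"):
            t = paired_t(table, listeners, size)
            significant = abs(t) >= CRITICAL_T[len(listeners) - 1]
            results[(group, masker, size)] = (t, significant)
            print(f"{group:7s} {masker:9s} {size:7s} t = {t:6.2f} significant: {significant}")
```

With only three or four listeners per group the test has few degrees of freedom, so a nonsignificant result here is weak evidence on its own; it is the consistency across groups, masker types, and sizes of differences that carries the argument.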

It is interesting to note that there was a tendency for HighIM listeners to show larger BMLDs for larger binaural differences compared to smaller differences. As the difference between LowIM and HighIM listeners on this measure [compare panels (C) and (D) of Fig. 3] was on the edge of significance for both the noise maskers (p=0.048) and for the multitone maskers (p=0.052), further testing would be necessary to determine whether or not this tendency is consistent and what it may suggest about the mechanisms contributing to binaural unmasking.

What, then, are the possible strategies that could have produced these data? There is no evidence that those listeners who experienced substantial IM had uniformly greater release from IM (or EM) or that those listeners were more sensitive to reinforcing than opposing interaural differences. Consequently, there is no evidence that any of the listeners in any of the conditions were using differences in the lateral positions of the target and the masker to detect the target. These results parallel those of Colburn and Durlach (1965) and show exactly the pattern of data that would be predicted by the EC model. What is still unclear, however, is why the individual patterns of release, which are not significantly different for the larger and smaller differences, differ for the two types of maskers (at least for the HighIM group). Could it be the case that release from EM was due to an EC operation for all listeners, but that the HighIM listeners were using a different spatial cue, such as diffuseness or perceived width, to obtain release from IM? As mentioned above and described in the Appendix, it is possible to account for the differences in performance between the HighIM and LowIM groups by postulating that the HighIM group listened through effectively wider auditory filters in the diotic multitone condition and that introducing interaural differences allowed that group to narrow their effective filters into the range used by the other listeners. Since the LowIM group did not appreciably widen their filters in response to the multitone maskers, it would be reasonable to hypothesize that they simply used the same mechanism of binaural release for the noise and multitone maskers.

Future work will need to address the potential differences between a waveform-based method (such as an EC mechanism, a correlation mechanism, or an interaural-difference mechanism) and a cue based on spatial percepts such as image width or diffuseness rather than mean lateral position. One direction that might be useful for making such a distinction is to investigate IM and EM for identification and discrimination tasks, where the presence of a spatial difference would not be sufficient to indicate the correct response. One particularly relevant example of this is the case of speech maskers overlapping speech targets, such as was studied by Edmonds and Culling (2005). Unfortunately, the situation studied by Edmonds and Culling (2005) involved two very easily segregated speech tokens, as evidenced by the nearly 20 dB improvement in diotic threshold when the masker was speech rather than noise. This suggests that the majority of the masking occurring was energetic, which may account for the similarity between the masking release for the two masker types and the lack of significant difference between the opposing and reinforcing ITD and ILD cues.

A further complication associated with extrapolating from the results presented here to IM with speech is based on evidence that spatial cues may exert their effects only after the initial grouping has occurred (Darwin and Hukin, 1999). It is possible that the irrelevance of spatial position for release from IM applies only to the simultaneous grouping of target and masker components that happens during the initial presentation of new sound objects. In this scenario, because targets and maskers evolve over time, the auditory system may be able to recruit additional mechanisms of release. By moving in a systematic manner from IM obtained with brief, simultaneously presented stimuli through IM using longer stimuli that evolve over time and finally to speech stimuli, it will be possible to determine more fully the range of mechanisms of release and the stimulus characteristics that allow each to be used.

EM and IM can both be reduced by the presence of interaural differences in the target and/or the masking stimuli. For EM, there is little reason to believe that this reduction is due to an enhancement in the perceived differences in location for the target and the masker, since reinforcing and opposing differences in interaural time and level are equally effective. The results presented here suggest that the same may be true for IM with synchronously presented tonal targets and multitone maskers. On the other hand, it is not clear from the data presented here that similar patterns of binaural release should be taken to imply similar underlying mechanisms of release for EM and IM.

ACKNOWLEDGMENTS

This work was supported by the Department of Veterans Affairs, Veterans Health Administration, Rehabilitation Research and Development Service through Associate Investigator Award No. C4855H to Frederick Gallun at the National Center for Rehabilitative Auditory Research, by NIH-NIDCD Grant Nos. DC00100, DC04545, DC04663, and F32 DC006526, and by AFOSR Award No. FA9550-05-1-2005. The authors are extremely grateful to Antje Ihlefeld for lending her auditory system and insights, to Jackie Stachel and Deborah Corliss for help with data collection, and of course to our listeners.

APPENDIX: POWER SPECTRUM MODEL OF MASKING RELEASE

It has been shown (Lutfi, 1993; Oh and Lutfi, 1998; Durlach et al., 2005) that much of the variability across subjects and across conditions in IM tasks involving detection of tonal signals in multitone maskers can be captured by a simple model in which listeners vary the effective width of their auditory filter. While this approach seems to be lacking sufficient free parameters to effectively model such a complex phenomenon as IM, it stands as essentially the only quantitative approach that has been proposed. Accordingly, it is appropriate to determine the extent to which the data in this study can be similarly captured. This modeling exercise starts by estimating the amount of energy falling in the critical-band filter centered on the target tone and then determining the changes in that filter width that would be necessary to produce the thresholds obtained in the experiment for the various listeners.

Using the same software that generated the experimental stimuli, 100 maskers of each type were generated and filtered with a range of filter widths. Figure 4 shows the effective masker level calculated for a range of filter widths and for both masker types. The mean energy through the critical band centered on the 500 Hz target frequency, which Moore and Glasberg (1983) estimated at 76.8 Hz, is marked by the dashed line. In accordance with the fact that the noise band included energy in the region between 400 and 600 Hz while the multitone maskers did not, the average energy falling in the critical band was greater for the noise (49 dB SPL) than for the multitone masker (39.8 dB).
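The 76.8 Hz figure can be reproduced from the polynomial approximation to the equivalent rectangular bandwidth (ERB) published by Moore and Glasberg (1983), ERB = 6.23f² + 93.39f + 28.52 Hz with f in kHz. The short Python sketch below evaluates that formula at the target frequency; the function name is hypothetical, and the sketch covers only the bandwidth calculation, not the filter shapes or masker simulations described above.

```python
def erb_moore_glasberg_1983(freq_hz):
    """Equivalent rectangular bandwidth in Hz for a center frequency in Hz.

    Uses the Moore and Glasberg (1983) polynomial, which takes the center
    frequency in kHz: ERB = 6.23 f^2 + 93.39 f + 28.52.
    """
    f_khz = freq_hz / 1000.0
    return 6.23 * f_khz ** 2 + 93.39 * f_khz + 28.52


target_erb = erb_moore_glasberg_1983(500.0)
print(f"ERB at 500 Hz: {target_erb:.1f} Hz")
```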

Results of simulations in which 100 randomly generated noise maskers and multitone maskers were passed through filters of the type specified by Moore and Glasberg (1983). The mean effective masker level is plotted for each masker type across a range of equivalent rectangular bandwidths. Error bars indicate ±1 standard deviation across the 100 randomly generated maskers. At the limit, the functions would reach their broadband levels of 60 (noise) and 70 (multitone) dB SPL.

If all of the masking the listeners experienced was due to the masker energy falling within the critical band centered on the target, thresholds should be roughly 9 dB higher for the noise masker than for the multitone case, which was not observed for any of the listeners. As can be seen in Fig. 1 (the values also appear in Table 1), the differences between the diotic thresholds for the two masker types go in the direction predicted by the energy simulation for three of the listeners, with more masking for the noise than the multitone maskers, but the greatest difference is only 2.7 dB. One explanation for this difference is that the listeners who are adversely affected by the multitone masker are simply widening their effective auditory filters in response to the variability in the stimuli, an idea suggested by Lutfi (1993) and by Durlach et al. (2005).

In order to ask how wide the effective filters would have to be to account for the diotic thresholds entirely on the basis of energy falling in the filter, it is useful to assume that the listeners were all using a critical-bandwidth filter in the noise masker condition. This assumption is supported by previous data (Neff et al., 1993; Oxenham et al., 2003), showing that performance in an IM condition is unrelated to threshold or auditory filter shape as measured in an EM condition. The predicted values lie between −6.6 dB (for L1) and −0.3 dB (for L7), which corresponds fairly well to the values McFadden (1966) and Wier et al. (1977) obtained using similar stimuli (although their values were reported using different units).

Assuming that the target-to-masker energy ratio at threshold is the same in the multitone masker condition and in the noise masker condition, an estimate was made of the filter width that matched the effective target-to-masker ratio (TMR) for the two conditions for each listener. For those listeners with the lowest multitone thresholds (L2, L3, and L4), the differences in thresholds were −2.7, −1.6, and −2.1 dB, respectively, which lead to estimated effective filter widths of 95, 100, and 98 Hz in the multitone masker condition. So, even though the threshold was lower than that for the noise, the filter estimate was still roughly 125%–130% of that of the critical band (76.8 Hz). For L6, L1, L7, and L5, the respective differences in threshold were 4.7, 9.4, 14.4, and 16.2 dB. These values lead to equivalent filters of 145, 207, 360, and 475 Hz wide. While this is a very large range, it is similar to that reported by Durlach et al. (2005), who found widths that ranged between 87 and 444 Hz for a similar multitone masking condition.

This single parameter fails to describe the full extent of the release from masking generated by introducing interaural differences, however. The noise masker, being broadband, results in small changes in effective masker energy with changes in filter bandwidth, thus requiring a “subcritical” bandwidth to account for BMLDs. Because reductions in the effective bandwidth to 38 Hz would only account for changes of about 4 dB, capturing the entire range of threshold values requires effective filters 7 Hz wide in order to explain BMLDs of 11.5 dB. Since none of the current models of the BMLD (reviewed in the Introduction) are based on a narrowing of the critical band (indeed, estimates of binaural filters are usually wider than monaural filters), there is little reason to favor a band-narrowing hypothesis over the traditional binaural mechanisms.

For the multitone maskers, the band-narrowing hypothesis is more plausible, especially if it is assumed that the threshold differences between the masker types reflect a widening of the effective auditory filter. If one postulates that interaural differences reduce uncertainty and allow listeners to focus their effective filter more appropriately, then the lower limit on effective filter width is simply the width of the critical band. For L2, L3, and L4, the maximum change in threshold that can be explained by reducing the bandwidth is about 6 dB, but the BMLDs for those listeners include several values as high as 8 dB and one value of 10.4 dB. For the remaining listeners, the maximum BMLDs are greater (up to 19.3 dB for L5), but the band-narrowing model can still account for most of their results since the effective bandwidths for the diotic condition are so wide. Perhaps, then, filter widening and narrowing account for the performance of the HighIM listeners (L1, L5, L6, and L7) but not the LowIM listeners. This is consistent with the correlational analysis reported in Sec. 4, where it appears that the LowIM group was using the same mechanism for both masker types, but the HighIM group was not.
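The maximum BMLDs quoted in this paragraph can be recovered from the multitone thresholds in Table 1 by taking, for each listener, the largest difference between the diotic threshold and any binaural threshold. The Python sketch below is illustrative only; the names are hypothetical, and the values derive from the rounded table entries.

```python
# Multitone-masker thresholds (dB) from Table 1, one entry per listener:
# (diotic threshold, [the eight binaural thresholds: ITD, ILD, reinforcing,
# opposing for the smaller differences, then the same for the larger ones]).
MULTITONE = {
    1: (51.7, [44.2, 51.7, 49.8, 51.1, 41.4, 49.7, 43.7, 48.5]),
    2: (41.6, [35.8, 42.1, 37.5, 35.2, 33.2, 45.4, 35.4, 41.1]),
    3: (44.1, [43.4, 42.2, 48.8, 44.8, 40.8, 44.7, 43.3, 42.1]),
    4: (40.6, [32.0, 44.8, 32.9, 36.1, 30.2, 40.7, 32.9, 36.7]),
    5: (60.6, [50.9, 51.2, 43.8, 50.7, 41.3, 48.9, 44.9, 44.4]),
    6: (50.6, [43.3, 47.1, 51.2, 48.5, 44.5, 44.1, 47.9, 43.4]),
    7: (62.8, [62.7, 63.7, 64.9, 64.3, 51.2, 66.0, 56.9, 56.5]),
}

# Largest BMLD per listener corresponds to the lowest binaural threshold.
max_bmld = {
    listener: round(diotic - min(binaural), 1)
    for listener, (diotic, binaural) in MULTITONE.items()
}

for listener, value in sorted(max_bmld.items()):
    print(f"L{listener}: maximum multitone BMLD = {value} dB")
```

For L4 and L5 this gives 10.4 and 19.3 dB, the two values cited above; comparing each maximum with the release that band narrowing alone could provide is the step that separates the LowIM and HighIM listeners in this argument.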

This analysis provides support for the hypothesis that at least some listeners were widening and narrowing the bandwidths of their effective filters in response to the maskers presented and the interaural differences imposed on the target. Given that Durlach et al. (2005) were able to capture much of their data with a band-widening analysis and that the CoRE model of Lutfi (1993) and Oh and Lutfi (1998) also contains the concept of an effective auditory filter of variable bandwidth, such an approach is certainly worth considering.

Portions of this research were presented at the 2007 Midwinter Meeting of the Association for Research in Otolaryngology.

References

Arbogast, T. L., Mason, C. R., and Kidd, G., Jr. (2002). “The effect of spatial separation on informational and energetic masking of speech,” J. Acoust. Soc. Am. 112, 2086–2098. doi:10.1121/1.1510141
Best, V., Ozmeral, E., Gallun, F. J., Sen, K., and Shinn-Cunningham, B. G. (2005). “Spatial unmasking of birdsong in human listeners: Energetic and informational factors,” J. Acoust. Soc. Am. 118, 3766–3773. doi:10.1121/1.2130949
Brungart, D. S., Simpson, B. D., and Freyman, R. L. (2005). “Precedence-based speech segregation in a virtual auditory environment,” J. Acoust. Soc. Am. 118, 3241–3251. doi:10.1121/1.2082557
Colburn, H. S., and Durlach, N. I. (1965). “Time-intensity relations in binaural unmasking,” J. Acoust. Soc. Am. 38, 93–103. doi:10.1121/1.1909625
Colburn, H. S., and Durlach, N. I. (1978). “Models of binaural interaction,” in Handbook of Perception, edited by E. C. Carterette and M. P. Friedman (Academic, New York).
Darwin, C. J., and Hukin, R. W. (1999). “Auditory objects of attention: The role of interaural time differences,” J. Exp. Psychol. 25, 617–629.
Domnitz, R. H., and Colburn, H. S. (1976). “Analysis of binaural detection models for dependence on interaural target parameters,” J. Acoust. Soc. Am. 59, 598–601. doi:10.1121/1.380904
Durlach, N. I. (1960). “Note on the equalization and cancellation theory of binaural masking level differences,” J. Acoust. Soc. Am. 32, 1075–1076. doi:10.1121/1.1908315
Durlach, N. I. (1963). “Equalization and cancellation theory of binaural masking-level differences,” J. Acoust. Soc. Am. 35, 1206–1218. doi:10.1121/1.1918675
Durlach, N. I. (1972). “Binaural signal detection: Equalization and cancellation theory,” in Foundations of Modern Auditory Theory, edited by J. V. Tobias (Academic, New York).
Durlach, N. I., and Colburn, H. S. (1978). “Binaural phenomena,” in Handbook of Perception, edited by E. C. Carterette and M. P. Friedman (Academic, New York).
Durlach, N. I., Mason, C. R., Gallun, F. J., Shinn-Cunningham, B., Colburn, H. S., and Kidd, G., Jr. (2005). “Informational masking for simultaneous nonspeech stimuli: Psychometric functions for fixed and randomly mixed maskers,” J. Acoust. Soc. Am. 118, 2482–2497. doi:10.1121/1.2032748
Durlach, N. I., Mason, C. R., Kidd, G., Jr., Arbogast, T. L., Colburn, H. S., and Shinn-Cunningham, B. G. (2003a). “Note on informational masking,” J. Acoust. Soc. Am. 113, 2984–2987. doi:10.1121/1.1570435
Durlach, N. I., Mason, C. R., Shinn-Cunningham, B. G., Arbogast, T. L., Colburn, H. S., and Kidd, G., Jr. (2003b). “Informational masking: Counteracting the effects of stimulus uncertainty by decreasing target-masker similarity,” J. Acoust. Soc. Am. 114, 368–379. doi:10.1121/1.1577562
Edmonds, B. A., and Culling, J. F. (2005). “The role of head-related time and level cues in the unmasking of speech in noise and competing speech,” Acta. Acust. Acust. 91, 546–553.
Freyman, R. L., Helfer, K. S., McCall, D. D., and Clifton, R. K. (1999). “The role of perceived spatial separation in the unmasking of speech,” J. Acoust. Soc. Am. 106, 3578–3588. doi:10.1121/1.428211
Gallun, F. J., Mason, C. R., and Kidd, G., Jr. (2005). “Binaural release from informational masking in a speech identification task,” J. Acoust. Soc. Am. 118, 1614–1625. doi:10.1121/1.1984876
Goupell, M. J., and Hartmann, W. M. (2006). “Interaural fluctuations and the detection of interaural incoherence: Bandwidth effects,” J. Acoust. Soc. Am. 119, 3971–3986. doi:10.1121/1.2200147
Goupell, M. J., and Hartmann, W. M. (2007a). “Interaural fluctuations and the detection of interaural incoherence. II. Brief duration noises,” J. Acoust. Soc. Am. 121, 2127–2136. doi:10.1121/1.2436714
Goupell, M. J., and Hartmann, W. M. (2007b). “Interaural fluctuations and the detection of interaural incoherence. III. Narrowband experiments and binaural models,” J. Acoust. Soc. Am. 122, 1029–1045. doi:10.1121/1.2734489
Hafter, E. R. (1971). “Quantitative evaluation of a lateralization model of masking-level differences,” J. Acoust. Soc. Am. 55, 1116–1122.
Hafter, E. R., Bourbon, W. T., Blocker, A. S., and Tucker, A. (1969). “A direct comparison between lateralization and detection under conditions of antiphasic masking,” J. Acoust. Soc. Am. 46, 1452–1457. doi:10.1121/1.1911885
Hafter, E. R., and Carrier, S. C. (1970). “Masking-level differences obtained with a pulsed tonal masker,” J. Acoust. Soc. Am. 47, 1041–1047. doi:10.1121/1.1912003
Hafter, E. R., and Carrier, S. C. (1972). “Binaural interaction in low-frequency stimuli: The inability to trade time and intensity completely,” J. Acoust. Soc. Am. 51, 1852–1862. doi:10.1121/1.1913044
Hafter, E. R., Carrier, S. C., and Stephan, F. K. (1973). “Direct comparison of lateralization and the MLD for monaural signals in gated noise,” J. Acoust. Soc. Am. 53, 1553–1559. doi:10.1121/1.1913501
Hirsh, I. J. (1948). “The influence of interaural phase on interaural summation and inhibition,” J. Acoust. Soc. Am. 20, 536–544. doi:10.1121/1.1906407
Jeffress, L. A., Blodgett, H. C., Sandel, T. T., and Wood, C. L., III (1956). “Masking of tonal signals,” J. Acoust. Soc. Am. 38, 416–426. doi:10.1121/1.1909701
Kidd, G., Jr., Mason, C. R., Deliwala, P. S., Woods, W. S., and Colburn, H. S. (1994). “Reducing informational masking by sound segregation,” J. Acoust. Soc. Am. 95, 3475–3480. doi:10.1121/1.410023
Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477. doi:10.1121/1.1912375
Lutfi, R. (1993). “A model of auditory pattern-analysis based on component-relative-entropy,” J. Acoust. Soc. Am. 94, 748–758. doi:10.1121/1.408204
McFadden, D. (1966). “Masking level differences with continuous and with burst masking noise,” J. Acoust. Soc. Am. 40, 1414–1419. doi:10.1121/1.1910241
Moore, B. C. J., and Glasberg, B. R. (1983). “Suggested formulae for calculating critical bands and excitation patterns,” J. Acoust. Soc. Am. 74, 750–753. doi:10.1121/1.389861
Neff, D. L. (1995). “Signal properties that reduce masking by simultaneous, random-frequency maskers,” J. Acoust. Soc. Am. 98, 1909–1920. doi:10.1121/1.414458
Neff, D. L., Dethlefs, T. M., and Jesteadt, W. (1993). “Informational masking for multicomponent maskers with spectral gaps,” J. Acoust. Soc. Am. 94, 3112–3126. doi:10.1121/1.407217
Oh, E. L., and Lutfi, R. A. (1998). “Nonmonotonicity of informational masking,” J. Acoust. Soc. Am. 104, 3489–3499. doi:10.1121/1.423932
Oxenham, A., Fligor, B. J., Mason, C. R., and Kidd, G., Jr. (2003). “Informational masking and musical training,” J. Acoust. Soc. Am. 114, 1543–1549. doi:10.1121/1.1598197
Rakerd, B., Aaronson, N. L., and Hartmann, W. M. (2006). “Release from speech-on-speech masking by adding a delayed masker at a different location,” J. Acoust. Soc. Am. 119, 1597–1605. doi:10.1121/1.2161438
Rayleigh, J. W. S. (1875). “On our perception of the direction of a source of sound,” Proceedings of the Musical Association, Second Session, pp. 75–84.
Richards, V. M., and Tang, Z. (2006). “Estimates of effective frequency selectivity based on the detection of a tone added to complex maskers,” J. Acoust. Soc. Am. 119, 1574–1584. doi:10.1121/1.2165001
Sandel, T. T., Teas, D. C., Fedderson, W. E., and Jeffress, L. A. (1955). “Localization of a sound from single and paired sources,” J. Acoust. Soc. Am. 27, 842–852. doi:10.1121/1.1908052
Stevens, S. S., and Newman, E. B. (1934). “The localization of pure tones,” Proc. Natl. Acad. Sci. U.S.A. 20, 593–596.
Webster, F. A. (1951). “The Influence of Interaural Phase on Masked Thresholds I. The Role of Interaural Time-Deviation,” J. Acoust. Soc. Am. 23, 452–462. doi:10.1121/1.1906787
Wier, C. C., Green, D. M., Hafter, E. R., and Burkhardt, S. (1977). “Detection of a tone burst in continuous- and gated-noise maskers; defects of signal frequency, duration, and masker level,” J. Acoust. Soc. Am. 61, 1298–1300. doi:10.1121/1.381432
Zurek, P. M., and Durlach, N. I. (1987). “Masker-bandwidth dependence in homophasic and antiphasic tone detection,” J. Acoust. Soc. Am. 81, 459–464. doi:10.1121/1.394911
