Abstract
The relative sensitivity of human listeners to interaural level differences (ILDs) carried by the onsets, offsets, and interior portions of brief sounds was examined. Stimuli consisted of single 4000-Hz Gabor clicks (Gaussian-windowed tone bursts) or trains of 16 such clicks repeating at an interclick interval (ICI) of 2 or 5 ms. In separate conditions, ILDs favored the right ear by a constant amount for all clicks (condition RRRR) or a changing amount that was maximal at sound onset (condition R000), offset (condition 000R), both onset and offset (condition R00R), or at the temporal midpoint of the stimulus (condition 0RR0). ILD increases and decreases were implemented as linear decibel sweeps across four clicks to minimize transient distortion. Threshold ILDs were determined adaptively for each of these conditions and for single clicks. Thresholds were similar for ILDs presented near sound onset or offset (condition R000 vs 000R) but lower when ILDs were carried by both onset and offset clicks (condition R00R) than for ILDs carried by interior clicks alone (condition 0RR0). The results suggest that similar sensitivity to onset and offset ILD does not reflect uniform temporal weighting; instead, ILD sensitivity favors onsets and offsets over the interior portions of sounds.
INTRODUCTION
The robust localization of complex sounds involves the integration of information across time, across frequency, and across multiple acoustic cues, including interaural time differences (ITDs) carried by the temporal fine structure of low-frequency components (“fine-structure ITD”), ITD carried by envelope fluctuations (“envelope ITD”), and interaural level differences (ILDs). In the free field, these various cues tend to agree in corresponding to a single, veridical, sound-source direction, but that relationship may be disrupted by acoustical effects including diffraction around the head (Macauley et al., 2010), echoes (Rakerd and Hartmann, 1985), and reverberation. In such cases, accurate localization requires either selection of the more reliable cues or an appropriately weighted combination of discrepant cues. Because the magnitude and reliability of auditory spatial cues vary across frequency and over time, one would expect listeners’ judgments to weight cues carried in some spectro-temporal regions of a sound more strongly than others. For example, judgments might favor the larger ILD experienced at high versus low frequencies or the more salient envelope ITD carried by a sound’s temporal onset than during weaker fluctuations of its ongoing envelope.
The current study is concerned with the temporal weighting of ILD cues in brief sounds and specifically whether ILDs carried by a sound’s onset and/or offset receive greater weight than ILDs carried by a sound’s interior. A number of previous studies have measured the temporal weighting of free-field cues (Stecker and Hafter, 2002, 2009) and binaural cues (Zurek, 1980; Saberi, 1996; Akeroyd and Bernstein, 2001; Brown and Stecker, 2010), consistently revealing enhanced weighting of cues at sound onsets and, in some cases, offsets. One approach has been to measure listeners’ detection of binaural cues embedded within particular temporal portions of a longer stimulus. For example, Zurek (1980) and Akeroyd and Bernstein (2001) measured ITD and ILD discrimination for brief target noise bursts embedded within a longer diotic noise “fringe.” Both studies reported better ITD and ILD sensitivity when the target appeared near sound onset or offset as compared to poorer sensitivity when targets appeared 1–10 ms post-onset. More recently, Stecker and Brown (2010a) compared sensitivity to binaural cues carried by the onsets or offsets of amplitude-modulated high-frequency sounds. As brief targets cannot be embedded in such stimuli without introducing transient distortion, Stecker and Brown (2010a) instead presented 4000 Hz filtered impulse trains with linear ITD or ILD “sweeps,” such that cue values changed smoothly from 0 (diotic) at onset to peak at offset or vice versa. Discrimination thresholds across these conditions revealed a clear advantage of onset over offset cues for the discrimination of envelope ITD but no such advantage for ILD discrimination, leading Stecker and Brown (2010a) to suggest two explanations: (a) that listeners are equally sensitive to ILD over the complete duration of a brief sound or (b) that ILD sensitivity follows a “U-shaped” time course in which both onsets and offsets play important roles. 
As noted, though, there is some inconsistency in the results for offsets: e.g., Brown and Stecker (2010) did not find an offset emphasis in ILD discrimination (potential reasons for this are outlined in Sec. 4).
The current study tests the hypotheses of Stecker and Brown (2010a) by comparing ILD discrimination across conditions similar to those of that study. In one condition, the peak ILD is presented at onset and offset while the intervening “interior” clicks are diotic. In another condition, the temporal order of clicks is altered so that the onset and offset are diotic and the peak ILD occurs among the interior clicks.
METHODS
All procedures, including the recruitment, consent, and testing of human subjects, followed the guidelines of the University of Washington Human Subjects Division and were reviewed and approved by the cognizant Institutional Review Board.
Subjects
Eight subjects participated in this experiment. One (0502) was the second author, and one (0926) was a part-time research assistant working in the lab. The remaining six (0601, 0915, 0917, 0918, 0919, and 1013) were paid subjects naive to the purpose of the experiment. All subjects reported normal hearing and demonstrated pure-tone detection thresholds <10 dB HL at octave frequencies over the range 250–8000 Hz except subject 0919 who demonstrated a mild unilateral hearing loss (right ear threshold = 35 dB HL) at 2 kHz but normal thresholds (∼0 dB HL) bilaterally at 4 kHz, the center frequency of the narrowband stimuli employed in the study.
Stimuli
Stimuli consisted of single Gabor clicks (Gaussian-windowed tone bursts) or trains of such clicks. Each click consisted of a 4 kHz cosine multiplied by a Gaussian temporal envelope with σ = 221 μs (367 μs duration at 3 dB below peak). The Gaussian window was truncated at a total duration of 2 ms. The resulting spectrum was also Gaussian, with σ = 750 Hz (−3 dB bandwidth = 1250 Hz). Clicks were synthesized at 48.828 kHz (Tucker–Davis Technologies RP2.1, Alachua, FL) and presented using STAX model 4070 closed-back electrostatic headphones (Stax Ltd., Saitama, Japan). Click trains were presented at approximately 70 dB SPL with peak-to-peak interclick intervals (ICIs) of 2 or 5 ms, thus spanning the range of ICI over which strong onset dominance emerges for such stimuli (Brown and Stecker, 2010). Five click-train conditions, labeled RRRR, R00R, 0RR0, R000, and 000R, were tested at each ICI (see Fig. 1). These labels indicate the temporal regions to which the target ILD was applied: each letter corresponds to 4 of the 16 clicks, such that R000 indicates right-favoring ILDs (“R”) applied to clicks 1–4 and 0 ILD (“0”) applied to clicks 5–16. All ILD targets favored the right ear. An additional single-click condition (condition R) was also tested.
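The stimulus description above can be illustrated with a minimal synthesis sketch. It is not the authors' TDT implementation: the function names, the pure-Python sample loop, and the sampling rate passed in by the caller are assumptions made for illustration. Only the carrier frequency, envelope σ, 2 ms truncation, click count, and ICI come from the text.

```python
import math

def gabor_click(fs_hz, fc_hz=4000.0, sigma_s=221e-6, dur_s=2e-3):
    """One Gabor click: a cosine carrier times a Gaussian envelope, truncated to dur_s."""
    n = int(round(dur_s * fs_hz))
    t_center = (n - 1) / (2.0 * fs_hz)  # envelope peak sits at the window midpoint
    click = []
    for i in range(n):
        t = i / fs_hz - t_center
        envelope = math.exp(-t * t / (2.0 * sigma_s ** 2))
        click.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return click

def click_train(fs_hz, n_clicks=16, ici_s=2e-3):
    """Sum n_clicks Gabor clicks whose peaks are separated by ici_s."""
    click = gabor_click(fs_hz)
    step = int(round(ici_s * fs_hz))
    train = [0.0] * (step * (n_clicks - 1) + len(click))
    for k in range(n_clicks):
        for i, v in enumerate(click):
            train[k * step + i] += v
    return train

def width_at_minus_3db_s(sigma_s):
    """Full width of the Gaussian amplitude envelope 3 dB below its peak."""
    return 2.0 * sigma_s * math.sqrt(2.0 * math.log(10.0 ** (3.0 / 20.0)))

fs = 48000.0  # illustrative sampling rate only (an assumption, not the study's value)
train_2ms = click_train(fs, ici_s=2e-3)
train_5ms = click_train(fs, ici_s=5e-3)
```

Under these assumptions, `width_at_minus_3db_s(221e-6)` evaluates to roughly 367 μs, in agreement with the envelope duration quoted in the text.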
Figure 1.
Schematic illustration of target stimuli in each condition (not to scale; see text). Target stimuli in condition RRRR carried a static right-favoring ILD. Condition R00R stimuli carried a linearly decreasing ILD over the first four clicks in the train, eight diotic clicks, and a linearly increasing ILD over the final four clicks in the train. Condition 0RR0 consisted of a diotic segment, an increasing ILD segment, a decreasing ILD segment, and a final diotic segment. Condition R000 consisted of a decreasing ILD segment followed by three diotic segments. Condition 000R consisted of three diotic segments followed by an increasing ILD segment. Condition R, included as a “baseline” condition, consisted of a single right-favoring click.
Target stimuli are illustrated in Fig. 1. In conditions RRRR and R, these carried static ILDs that favored the right ear by ΔL dB. In all other conditions, ILDs were dynamic over four-click segments within each target stimulus. In each such segment, ILDs either decreased linearly from a peak of ΔL dB at the first click of the segment to 0.25ΔL dB at the fourth click of the segment or increased linearly from 0.25ΔL dB at the first click to a peak of ΔL dB at the fourth click. For example, in condition R00R, clicks 1 and 16 carried an ILD of ΔL, clicks 2 and 15 carried an ILD of 0.75ΔL dB, clicks 3 and 14 carried an ILD of 0.5ΔL dB, and clicks 4 and 13 carried an ILD of 0.25ΔL dB (see Fig. 1, 2nd line). “0” segments always carried 0 dB ILD. Note that for the comparisons of greatest interest (R00R vs 0RR0; R000 vs 000R), stimuli differed only in the temporal ordering of clicks but not in the distribution of ILD values across clicks (i.e., the whole-stimulus mean and variance of ILD were equal for a given ΔL). All ILDs were imposed by amplifying the signal intensity in the right ear by half of the overall ILD and attenuating the signal in the left ear by an equal number of decibels.
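The per-click ILD assignment described above can be written out explicitly. The following sketch is illustrative only (the function names and the use of the `statistics` module are assumptions). It builds the 16-click ILD profile for each condition and splits each ILD symmetrically between the ears, which makes it easy to confirm that the paired conditions share the same distribution of ILD values.

```python
from statistics import mean, pvariance

def ild_profile(condition, delta_l_db):
    """Per-click ILD in dB (positive favors the right ear) for a 16-click train."""
    falling = [1.0, 0.75, 0.5, 0.25]  # peak on the first click of the segment
    rising = falling[::-1]            # peak on the fourth click of the segment
    flat = [1.0] * 4                  # static ILD
    zero = [0.0] * 4                  # diotic segment
    segments = {
        "RRRR": [flat, flat, flat, flat],
        "R00R": [falling, zero, zero, rising],
        "0RR0": [zero, rising, falling, zero],
        "R000": [falling, zero, zero, zero],
        "000R": [zero, zero, zero, rising],
    }[condition]
    return [delta_l_db * scale for seg in segments for scale in seg]

def ear_gains_db(ild_db):
    """Symmetric split of an ILD: (left gain, right gain) in dB."""
    return (-ild_db / 2.0, ild_db / 2.0)

# Example with an arbitrary peak ILD of 4 dB.
r00r = ild_profile("R00R", 4.0)
orro = ild_profile("0RR0", 4.0)
same_mean = mean(r00r) == mean(orro)
same_variance = pvariance(r00r) == pvariance(orro)
```

For R00R and 0RR0 the whole-stimulus mean works out to 5ΔL/16 (about 0.31ΔL); for R000 and 000R it is half that value.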
It should be emphasized that the stimuli used in conditions R00R, 0RR0, R000, and 000R, like those in conditions RRRR and R, carried static cues related to the overall ILD as well as dynamic cues related to the changing ILD. At sufficiently long ICI, such dynamic cues give rise to the perception of motion with a velocity that is correlated to both ΔL and to the stimulus duration. Previous research on sensitivity to auditory motion suggests a rather “sluggish” mechanism, requiring integration over fairly long durations of 150–300 ms (Grantham, 1986). As a result of the short durations used in the current study (the duration of each four-click ILD “trajectory” was 6–15 ms), we presume limited sensitivity to motion cues per se; performance should be limited primarily by sensitivity to the overall ILD cue. To confirm this, and to further ascertain that performance was mediated by ILD sensitivity, subjects also completed a set of monaural control conditions (R00Rm and 0RR0m). These were identical to conditions R00R and 0RR0, respectively, except that sound was presented only to the right ear.
Procedure
ILD discrimination was assessed using a four-interval two-alternative forced-choice procedure with a single right-favoring target stimulus presented randomly in either the second or third interval on each trial. Other intervals presented a diotic standard stimulus. Intervals were separated by an inter-stimulus interval of 500 ms. Following the fourth interval, subjects indicated by button press (TDT RBOX) the interval which contained the right-favoring target. Feedback notification of the correct interval was provided by LED immediately thereafter. All testing was completed in a double-walled sound booth (IAC, Bronx, NY).
Threshold values were obtained using a two-down one-up adaptive procedure, tracking ΔL at 71% correct (Levitt, 1971). ΔL was set to 10 dB at the start of each run with the tracker step size set to 2 dB for the initial 4 of 12 adaptive reversals and 0.5 dB thereafter. Two thresholds were estimated per run, using two interleaved tracks presented in random order across trials. The threshold ΔL in each track was taken as the mean of ΔL across the final eight reversals on each run. After an initial training period, subjects completed four test runs (for eight threshold estimates) at each combination of ICI and condition. In cases where an adaptive track failed to behave asymptotically, the data were eliminated from further analysis and the test run was repeated. The various combinations of condition and ICI were tested in random order for each subject to minimize sequential effects.
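The adaptive rule described above can be sketched as follows. This is a simplified single-track illustration, not the software used in the study: the interleaving of two tracks is omitted, the `respond` callback and the `max_trials` guard are hypothetical, and the choice to switch step size immediately after the fourth reversal is one reading of the text. A two-down one-up rule converges where the probability of two consecutive correct responses is 0.5, that is, at about 70.7% correct.

```python
def run_track(respond, start_db=10.0, n_reversals=12,
              big_step_db=2.0, small_step_db=0.5, n_big=4, n_avg=8,
              max_trials=10000):
    """Two-down one-up staircase; returns the mean of the last n_avg reversals.

    respond(delta_db) must return True for a correct response at that ILD.
    """
    delta = start_db
    run_of_correct = 0
    last_direction = 0  # -1 after a decrease, +1 after an increase
    reversals = []
    trials = 0
    while len(reversals) < n_reversals:
        trials += 1
        if trials > max_trials:  # hypothetical guard against a runaway track
            raise RuntimeError("track did not reach the required reversals")
        if respond(delta):
            run_of_correct += 1
            if run_of_correct < 2:
                continue         # wait for a second consecutive correct response
            direction = -1       # two correct in a row: make the task harder
        else:
            direction = +1       # any error: make the task easier
        run_of_correct = 0
        if last_direction != 0 and direction != last_direction:
            reversals.append(delta)
        last_direction = direction
        step = big_step_db if len(reversals) < n_big else small_step_db
        delta = max(delta + direction * step, 0.0)
    return sum(reversals[-n_avg:]) / n_avg

# Deterministic toy observer (an assumption): correct whenever the ILD is >= 3 dB.
estimate = run_track(lambda delta_db: delta_db >= 3.0)
```

With the deterministic toy observer the track oscillates between 2.5 and 3.0 dB once the small step is in use, so the estimate is 2.75 dB. A real listener would be modeled with a probabilistic `respond`.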
Analysis
Both individual-subject and across-subject mean thresholds, along with appropriate statistical confidence intervals at α = 0.05, were computed for display (in Figs. 2 and 3) and analysis as follows. Mean thresholds were first computed for each combination of subject, condition, and ICI. Normalized thresholds were computed for each subject by subtracting that subject’s mean threshold in condition R (i.e., the single-click threshold) from the mean thresholds of each other condition. Figure 2 plots group-level means and 95% confidence intervals of normalized thresholds computed across subjects. Figure 3 plots individual subjects’ mean thresholds, prior to normalization, and 95% confidence intervals computed across repeated runs for each combination of condition and ICI. Statistical comparisons of individual values can be made on the basis of the plotted confidence intervals (Payton et al., 2003; Loftus, 1996). Statistical comparisons were also conducted using explicit null-hypothesis significance tests [repeated-measures analysis of variance (ANOVA) and paired t-tests].
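The normalization and confidence-interval steps can be sketched as below. The data layout, function names, and the numbers in the example are placeholders invented for illustration and are not results from the study. The tabulated critical value assumes a two-tailed t interval with 7 degrees of freedom, as would apply to eight subjects.

```python
import math
from statistics import mean, stdev

# Two-tailed 95% critical value of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {7: 2.365}

def normalize(thresholds_db):
    """Subtract each subject's single-click (R) threshold from the other conditions.

    thresholds_db maps subject -> {condition: mean threshold in dB}.
    """
    return {
        subject: {c: v - conds["R"] for c, v in conds.items() if c != "R"}
        for subject, conds in thresholds_db.items()
    }

def mean_and_ci95(values):
    """Return (mean, half-width of the 95% confidence interval)."""
    n = len(values)
    half_width = T_CRIT_95[n - 1] * stdev(values) / math.sqrt(n)
    return mean(values), half_width

# Placeholder numbers only, chosen to exercise the functions.
example = {"s1": {"R": 2.0, "RRRR": 1.5}, "s2": {"R": 3.0, "RRRR": 2.0}}
normalized = normalize(example)
group_mean, group_half_width = mean_and_ci95([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

Because the subtraction is done within subject, between-subject differences in overall sensitivity are removed before the group-level interval is computed.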
Figure 2.
Across-subject mean threshold ILD across stimulus conditions. Individual-subject thresholds were expressed relative to each subject’s threshold ΔL measured with single clicks (condition R; dashed line at 0 dB) before averaging across subjects. Large symbols plot mean threshold ΔL (on the ordinate) against ICI (on the abscissa) for binaural conditions RRRR (squares), R00R (diamonds), 0RR0 (circles), R000 (right-pointing triangles), and 000R (left-pointing triangles). Data for conditions R000 and 000R are presented in the left panel for clarity. Error bars indicate 95% confidence intervals computed across individual-subject means. Smaller open symbols plot thresholds obtained with monaural presentation of the right-ear stimuli employed in conditions R00Rm (diamonds) and 0RR0m (circles). Confidence intervals for monaural thresholds are omitted for clarity.
Figure 3.
ILD thresholds for individual subjects. Each panel plots threshold ΔL (on the ordinate) against ICI (on the abscissa) and across conditions (symbols) for an individual subject. Threshold estimates were averaged over runs (two threshold estimates per run); error bars indicate 95% confidence intervals across those estimates. The dotted line in each panel indicates each subject’s single-click ΔL threshold, and symbols the thresholds for click trains, in absolute units of ILD (i.e., prior to the normalization employed for across-subject averaging in Fig. 2). Otherwise, the layout and symbol assignment within each panel matches that of Fig. 2.
RESULTS
Best sensitivity to ILD at onset and offset
Figure 2 displays normalized ILD discrimination thresholds averaged across subjects for all click-train conditions. Symbols plot the means of individual subjects’ binaural thresholds for RRRR (squares), R00R (diamonds), 0RR0 (circles), R000 (right arrows), and 000R (left arrows), normalized to each subject’s threshold for condition R (dotted line). Monaural thresholds for R00Rm and 0RR0m are plotted as small diamonds and circles, respectively. Error bars represent the 95% confidence intervals as described in Sec. 2D. The panels of Fig. 3 display individual subjects’ ILD discrimination thresholds and corresponding 95% confidence intervals for each condition, using the same symbols as in Fig. 2. Visual inspection of the data suggests substantial differences in discrimination thresholds across conditions. Specifically, at both 2 and 5 ms ICI, RRRR thresholds were lowest overall. R and R00R thresholds were somewhat higher than RRRR and comparable to one another. R000 and 000R thresholds were somewhat higher still but also comparable to one another. For nearly every subject, binaural thresholds were highest overall in condition 0RR0. These differences were tested statistically as follows.
The factors of condition (RRRR, R00R, 0RR0, R000, 000R) and ICI (2, 5) were entered into a 5 × 2 repeated-measures ANOVA on the normalized threshold values. The main effect of condition was significant [F(4,28) = 36.09, P < 0.05] as was the condition × ICI interaction [F(4,28) = 5.34, P < 0.05]. The main effect of ICI was not significant [F(1,7) = 0.56, P = 0.48]. Several follow-up pair-wise t-tests were conducted to evaluate differences between conditions (see Table 1). Taken together, pair-wise testing demonstrated best sensitivity in condition RRRR, equal sensitivity in conditions R and R00R, comparable sensitivity in conditions R000 and 000R at 5 but not 2 ms ICI (a contributor to the significant condition × ICI interaction; see Sec. 4), and highest thresholds in condition 0RR0. Of critical importance to the major question of the study, R00R thresholds were significantly lower than 0RR0 thresholds at both 2 ms ICI [t(7) = 4.82, P < 0.05] and 5 ms ICI [t(7) = 8.56, P < 0.05].
Table 1.
Summary of pairwise t-tests (two-tailed). Fifteen tests were conducted at each ICI. Observed P values are indicated by symbols (P < 0.05*, P < 0.01**, P < 0.001***); n.s. indicates no significant difference. Shaded cells indicate statistically significant comparisons at α = 0.05.
ILD discrimination versus monaural intensity discrimination
The dynamic ILD that comprised a target was naturally accompanied by a dynamic intensity cue at each ear. To verify that listeners’ performance reflected sensitivity to the ILD and not this monaural intensity cue, two monaural control conditions, R00Rm and 0RR0m, were additionally tested. Stimuli in these conditions were identical to the corresponding binaural R00R and 0RR0 stimuli (see Fig. 1) except that no sound was presented to the left ear (i.e., right ear stimuli were identical to those employed during binaural conditions). Thresholds were determined identically for monaural as for binaural testing, and subjects completed four runs in each condition at both 2 and 5 ms ICI (although two runs were omitted for subject 0917 in monaural condition 0RR0 at 2 ms ICI). The small symbols in Figs. 2 and 3 plot these monaural threshold values, averaged across runs (in Fig. 3) or subjects (in Fig. 2) as for the binaural conditions. In general, monaural thresholds exceeded binaural thresholds to a degree that confirms listeners’ use of binaural information, rather than monaural intensity cues, in the binaural conditions. Statistically optimal combination of monaural intensity and/or envelope information across the two ears would result in binaural thresholds bettering the single-ear thresholds by a factor of √2 (cf. Hartmann and Constan, 2002; Stellmack et al., 2004). Only subject 0918 showed R00R thresholds that were not better than his monaural equivalents by at least this amount.1 Thus it is possible that the relatively poor performance of subject 0918 resulted from the inappropriate use of monaural intensity cues, but the performance of other subjects was clearly limited by their sensitivity to binaural (i.e., ILD) rather than monaural cues.
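The monaural-control logic can be made concrete with a short check. The sketch assumes that optimal combination of two independent, equally reliable single-ear observations improves threshold by a factor of √2; the function names and the example numbers are hypothetical and are not data from the study.

```python
import math

def optimal_two_ear_threshold(monaural_threshold_db):
    """Threshold predicted if independent monaural cues from two ears were combined optimally."""
    return monaural_threshold_db / math.sqrt(2.0)

def implies_binaural_cue_use(binaural_threshold_db, monaural_threshold_db):
    """True when the binaural threshold beats the optimal monaural-combination prediction."""
    return binaural_threshold_db < optimal_two_ear_threshold(monaural_threshold_db)

# Hypothetical numbers: a 6 dB monaural threshold predicts about 4.24 dB binaurally.
predicted = optimal_two_ear_threshold(6.0)
uses_binaural = implies_binaural_cue_use(2.0, 6.0)
```

A binaural threshold below the prediction cannot be explained by monaural intensity cues alone under this assumption, which is the reasoning applied to each subject above.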
DISCUSSION
Evidence for U-shaped, not flat, weighting of ILD
The current study was conducted to determine whether the equivalent sensitivity to ILD carried by sound onsets and offsets described by Stecker and Brown (2010a) reflected uniform (flat) weighting of ILD across the stimulus duration or non-uniform (“U-shaped”) weighting favoring both onset and offset ILD. The significant elevation of ILD thresholds in condition 0RR0 as compared to condition R00R strongly suggests that listeners are not uniformly sensitive to ILDs carried by all clicks in the train, at least for ICI values of 2 or 5 ms. Instead, ILD sensitivity appears to be mediated by a combination of onset and late-arriving cues.2 Both effects have been described previously: Brown and Stecker (2010) described the dominance of onset over post-onset clicks in ILD discrimination (“onset dominance,” Freyman et al., 1997), and Stecker and Hafter (2009) described the importance of late-arriving cues in free-field sound localization (“upweighting”). The U-shaped temporal weighting of ILD exhibits both onset dominance and upweighting, which presumably reflect different aspects of binaural processing (Stecker and Hafter, 2009).
This greater sensitivity to onset- and late-arriving cues than to interior cues is consistent with the temporal profile of ITD and ILD sensitivity described by Zurek (1980) and Akeroyd and Bernstein (2001) for broadband noise and with temporal weighting functions (TWFs) measured by Stecker and Hafter (2002, 2009) for freefield localization of filtered click trains. The latter studies used statistical regression to measure the weighting of each click in a train on listeners’ spatial judgments. Similar approaches have also been used to measure TWFs for ITD (Saberi, 1996; Brown and Stecker, 2010) and ILD (Brown and Stecker, 2010) of click trains presented over headphones. A common finding across both freefield and headphone studies has been of relatively flat weighting for long ICI (>5 ms) but strong onset weighting for short ICI (<5 ms). Less agreement has been found with regard to the weighting of post-onset clicks at short ICI. Whereas TWFs estimated by Saberi (1996) for ITD and Brown and Stecker (2010) for ITD and ILD discrimination demonstrated uniformly low weighting of post-onset clicks, TWFs estimated for freefield localization by Stecker and Hafter (2002) appeared in many cases to be “U-shaped,” favoring the last few clicks in a train in addition to the onset. Stecker and Hafter (2009) further demonstrated this upweighting effect to be a monotonic increase in weights toward the end of the train, consistent with leaky temporal integration (cf. Tobias and Zerlin, 1959) rather than an effect of sound offset per se. On that basis, they argued for a temporally integrative mechanism affecting performance in their task (localization of sounds in the free field) but not in those employed by Saberi (1996) and others (e.g., discrimination of ITD presented over headphones). One possible source of the difference, they suggested, was the availability of ILD cues in the freefield, but not headphone, tasks. 
This interpretation is at odds with Brown and Stecker’s (2010) results, however, which did not demonstrate increased weighting of near-offset ILD presented under headphones. This apparent discrepancy is addressed in more detail in the following text.
The current results suggest that sensitivity to ILD is maintained after the onset of a rapidly fluctuating signal in a manner that sensitivity to envelope ITD is not. Specifically, the current results lend further support to Stecker and Hafter’s (2009) suggestion that while both envelope ITD and ILD are processed by one mechanism that is strongly onset-dominated at high rates, ILD is additionally processed via a separate temporally integrative mechanism that gives increased sensitivity to ILD near sound offsets. Although maintained sensitivity to ILD after sound onset was also demonstrated by Brown and Stecker (2010), in that the average ILD of stimuli in that study was found to be more predictive of subjects’ lateralization responses than the onset ILD alone, TWFs measured by Brown and Stecker (2010) did not indicate greater weighting of ILD carried by late-arriving than interior clicks.
A resolution of this discrepancy is not immediately apparent except perhaps to point out differences in the stimuli employed across studies. Brown and Stecker (2010) measured TWFs using Gabor click trains, the individual clicks of which varied in ILD over a range of ±2 dB. Three features of such stimuli may be relevant to the observed differences. First, each train in the Brown and Stecker (2010) study included clicks with both left- and right-favoring ILD, such that the mean ILD was near 0. In contrast, target ILDs in the current study were all set to favor the right ear, giving an across-signal mean ILD of approximately 0.3ΔL. Second, the maximum ILD across clicks in the Brown and Stecker (2010) study was 2 dB, which is smaller than ΔL at threshold across a majority of conditions tested here (see Fig. 3). Together, these aspects suggest that stimuli used by Brown and Stecker (2010) might have given a weaker impression of lateral position than those of the current study. That difference may have been amplified by the task, in that subjects of Brown and Stecker (2010) made left/right judgments near the interaural midline, whereas those of the current study were required to identify the target interval, presumably the interval with greatest perceived laterality. A third feature of Brown and Stecker’s (2010) stimuli was their stochastic nature. Although subjects in that study reported hearing compact fused images, it is possible that random variation in the ILD resulted in some degree of ambiguity that masked the natural weighting of late clicks. Saberi (1996) raised similar concerns regarding his study’s failure to replicate the “restarting” of binaural adaptation following brief gaps as reported by Hafter and Buell (1990).
It should be noted, however, that Saberi’s study varied ITD cues, which introduced variation in the ICI that would not result from random ILD variation, and that Stecker and Hafter (2002, 2009) employed similarly stochastic stimuli, although in the free field, and reported both restarting with gaps and upweighting of late-arriving sound.
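The contrast between flat and U-shaped weighting discussed in this section can be illustrated with a toy weighted-average model. This is not a model fitted in the study: the quadratic weight function and its parameters are arbitrary assumptions, chosen only to show why the R00R versus 0RR0 comparison separates the two hypotheses. Profiles are expressed per unit ΔL.

```python
def effective_ild(profile, weights):
    """Weighted average of per-click ILD; a larger value means a more detectable target."""
    return sum(w * x for w, x in zip(weights, profile)) / sum(weights)

# Per-click ILD as a fraction of the peak ILD, following the stimulus definitions.
R00R = [1.0, 0.75, 0.5, 0.25] + [0.0] * 8 + [0.25, 0.5, 0.75, 1.0]
ORRO = [0.0] * 4 + [0.25, 0.5, 0.75, 1.0] + [1.0, 0.75, 0.5, 0.25] + [0.0] * 4

flat_weights = [1.0] * 16
# Hypothetical U-shaped weights: largest on the first and last clicks.
u_weights = [1.0 + 3.0 * ((i - 7.5) / 7.5) ** 2 for i in range(16)]

flat_r00r = effective_ild(R00R, flat_weights)
flat_orro = effective_ild(ORRO, flat_weights)
u_r00r = effective_ild(R00R, u_weights)
u_orro = effective_ild(ORRO, u_weights)
```

Under flat weights the two conditions yield identical effective ILDs (5/16 of the peak), so equal thresholds would be predicted. Under any weighting that favors the ends of the train, R00R yields the larger effective ILD, which corresponds to the lower R00R thresholds observed.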
Limited evidence for greater salience of onset than offset ILD
Comparing ILD discrimination thresholds for stimuli containing onset and offset cues in separate conditions, Stecker and Brown (2010a) reported no significant difference between conditions and hence no evidence for different sensitivity to onset ILD versus offset ILD. The current study included a pair of conditions (R000 and 000R) designed to replicate those of the earlier study in the sense that cues in those conditions are primarily restricted to either the onset or offset. Despite the difference that current conditions presented ILD in only 4, rather than 15, of the clicks in each train, the major result was consistent: overall ILD thresholds were similar regardless of whether the cue was presented early (R000) or late (000R) in the stimulus. The effect of condition, however, interacted with ICI such that while R000 and 000R thresholds did not differ significantly at 5 ms ICI, R000 thresholds were slightly but significantly lower at 2 ms ICI [t(7) = 3.57, P < 0.01; see Table 1]. That is, greater sensitivity to onset versus offset ILD was observed at 2 ms ICI but not 5 ms ICI, consistent with the range of ICI over which Brown and Stecker (2010) reported onset dominance in TWFs for ILD. The effect of ICI was further consistent with the observation of Stecker and Hafter (2009) that, in freefield localization, onset weight varied significantly with ICI while offset weight did not. That is, the current results provide additional support to the view expressed in that paper that onset dominance reflects an ICI-dependent mechanism (e.g., processing via a rate-limited pathway; Bernstein and Trahiotis, 2002), whereas upweighting reflects a separate, temporally integrative mechanism that is not affected by ICI. As a consequence, ICI affects the symmetry, but not the basic form, of U-shaped TWFs, as the onset cue achieves greater influence at short ICI.
Evidence for compulsory temporal integration of ILD
One expectation of a temporally integrative mechanism for ILD is that integration might be compulsory, such that the presence of uninformative clicks should tend to degrade performance. Indeed there is evidence for such compulsory temporal integration among the current results. First, thresholds in conditions R000 and 000R were consistently and significantly larger than single-click (R) thresholds, presumably due to the unavoidable influence of the 12 diotic clicks included in each stimulus. For several subjects (see Fig. 3), R000 thresholds were smaller at 2 ms than at 5 ms ICI, while 000R thresholds showed the opposite pattern. Both patterns are consistent with greater temporal integration (i.e., reduced onset dominance) at the longer ICI. A second result suggestive of compulsory temporal integration is the similarity of thresholds in conditions R00R and R. That is, listeners’ ability to discriminate ILD carried by a train of 16 clicks, 8 of which carried non-zero ILDs, was no better than for a single click, despite the fact that both the onset and offset clicks were equivalent in ILD magnitude to the single click presented in condition R and that both onset and offset clicks did contribute to performance (compare R00R to R000 and 000R thresholds). Presumably, again, ILD thresholds may have been elevated by listeners’ inability to avoid integrating the smaller ILD carried by interior clicks. Finally, a beneficial effect of temporal integration can be seen in the lower thresholds of condition RRRR compared to all other conditions; in that case, ideal performance would reflect simple averaging of ILD over the 16 clicks, and thresholds would be expected to improve on the basis of pooling variance across the 16 elements (Houtgast and Plomp, 1968). Consistent with this and with a greater contribution of temporal integration at low rates as a consequence of weaker onset dominance, the lowest thresholds overall were recorded for condition RRRR at 5 ms ICI.
Limited rate-dependence of these effects
The processing of ongoing envelope ITD at high carrier frequencies has been shown to be strongly dependent on the stimulus modulation rate (Bernstein and Trahiotis, 2002). As a consequence, at high rates listeners depend to a greater extent on the preserved onset ITD cues (Hafter and Dye, 1983; Saberi, 1996; Brown and Stecker, 2010) and ILD cues (Stecker, 2010). The range of ICI over which this rate dependence manifests is approximately 2–12 ms (80–500 Hz) with the largest effects of onset dominance observed at ICIs between 2 and 5 ms (Brown and Stecker, 2010). For this reason, the current experiments were conducted using ICI values in this range. With few exceptions, the overall results did not strongly depend on ICI over this range. Those exceptions include (1) the relatively greater benefit of RRRR over R thresholds at 5 than 2 ms ICI, (2) better 0RR0 thresholds at 5 ms than 2 ms ICI for a subset of subjects (0601, 0502, 0926), and (3) relatively poorer R000 thresholds at 5 ms than 2 ms ICI. In each case, the difference is consistent with relatively greater influence of ILD carried by interior clicks at longer ICI; slowing the click rate increases the likelihood that each click will contribute optimally to listeners’ judgments.
Functional consequences of sensitivity to late-arriving ILD
An important issue raised by the current results is the functional consequence of sensitivity to late-arriving ILD. One oft-mentioned utility of onset dominance for ITD is echo suppression, which helps reduce the effectiveness of spurious post-onset ITD associated with reflected rather than direct sound. By that account, onset dominance reflects the greater reliability of spatial cues carried by sound arriving via a direct path. The acoustic consequences of echoes and reverberation differ between ITD and ILD, however, and it may be that greater stability of post-onset ILD than post-onset ITD is exploited by an auditory system sensitive to the statistics of those differences. In support of that view, Rakerd and Hartmann (1985) showed that listeners’ spatial judgments shifted to rely more strongly on ILD than ITD in cases where reflections produce implausibly large ITDs.
Another possibility is that the late-arriving ILD provides information about the acoustic environment itself. One puzzling aspect of the precedence effect is that although spatial information about echoes appears suppressed, the system must somehow monitor that information to know when the acoustic environment has changed, such as when moving from one room to another. Several studies have demonstrated the precedence effect to be temporarily disrupted (“broken down”) by such changes (cf. Keen and Freyman, 2009) for stimuli presented in the free field. Krumbholz and Nobbe (2002) demonstrated such breakdown to also occur during headphone presentation but only when the stimuli were lateralized by ILD and not when lateralized by ITD alone. That result is generally consistent with the maintained sensitivity to post-onset ILD demonstrated in the current study.
On the neural mechanisms underlying these effects
Taken together with the results of other studies, the current results suggest that sensitivity to ILD in these stimuli reflects two types of neural mechanism. One is responsible for onset dominance, the greater sensitivity to binaural cues carried by sound onsets than by post-onset clicks. That onset dominance is apparent regardless of the tested cue (ITD or ILD) but varies significantly with ICI (Hafter and Dye, 1983; Hafter et al., 1988; Saberi, 1996; Stecker and Hafter, 2002; Brown and Stecker, 2010). The second is responsible for upweighting, the greater contribution of late-arriving clicks than of interior clicks. Upweighting is evident in TWFs measured for free-field sound localization and for the discrimination of ILD over headphones but does not appear strongly ICI dependent (Stecker and Hafter, 2009; Stecker and Brown, 2010a; current results). As discussed by Stecker and Hafter (2009), upweighting is consistent with the operation of binaural temporal integration (Tobias and Zerlin, 1959; Akeroyd and Bernstein, 2001).
Although the psychophysical data strongly suggest the functional characteristics of the neural mechanisms responsible, it is also informative to consider whether any physiologically identified mechanisms exhibit those characteristics and thus to consider in turn the neural loci of these effects. Regarding the first (onset) mechanism, Hafter et al. (1988) suggested ICI-dependent adaptation in the pre-binaural cochlear nucleus as a candidate for onset dominance common to both ITD and ILD. The results of subsequent studies remain, we feel, consistent with the arguments presented in that paper. That the second (upweighting) mechanism contributes to ILD but not ITD processing suggests that the temporal integrator lacks the temporal fidelity to extract ITD and thus argues for a downstream mechanism of de novo ILD computation via excitatory-inhibitory interactions at sites beyond the level of initial binaural interaction (Grothe et al., 2010; Stecker et al., 2005). One cannot, however, rule out differences intrinsic to the ascending pathways that originate in the lateral versus medial superior olivary nuclei, for example, different balances of tonic versus phasic inputs (Oertel et al., 2011), different time constants of integration, or different degrees of “synchronized” versus “non-synchronized” representations of ILD- and ITD-bearing stimuli (Bartlett and Wang, 2007). Although discriminating among these possibilities is beyond the scope of the current study, a more informed description of ILD processing, its temporal dynamics, and its relationship to ITD processing will ultimately require evaluating the behavioral contributions of such potential mechanisms.
SUMMARY AND CONCLUSIONS
The current study measured the time course of sensitivity to ILD in trains of 16 high-frequency Gabor clicks by imposing dynamic variation in ILD over the course of each click train. ILD variation was constrained to limit non-zero ILD to clicks near sound onset, near sound offset, near both onset and offset, or to interior clicks. Thresholds were compared to those measured with static ILD click trains or single clicks as a function of ICI. The results suggest the following conclusions:
(1) Consistent with previous results (Stecker and Brown, 2010a), listeners exhibited similar sensitivity to ILD regardless of whether the cue was present near the onset or offset of a brief stimulus.
(2) Lower ILD thresholds were measured when ILD was carried by clicks near the onset and offset of the sound (clicks 1–4 and 13–16) than when ILD was carried only by interior clicks (clicks 5–12). The difference suggests that even when listeners appear equally sensitive to onset and offset ILD (Stecker and Brown, 2010a), they are not equally sensitive to ILD carried by all clicks in the train. Rather, as argued by Stecker and Hafter (2009), ILD sensitivity reflects both onset-specific effects and, perhaps via a separate mechanism, a special importance of late-arriving sound.
(3) Although the greatest sensitivity to ILD occurs near the onset and offset of a sound, the results also suggest some degree of compulsory integration of the ongoing ILD in that the presence of clicks carrying zero ILD elevates thresholds for a train. That integration would be consistent with the mechanism suggested by Stecker and Hafter (2009) to explain the upweighting of late-arriving ILD; thus the threshold elevations may reflect the influence of small ILD values carried by clicks near, but not at, sound offset.
(4) Results were influenced only to a small extent by stimulus rate (ICI), with slower rates (longer ICI) suggesting a greater role for temporal integration than for onset dominance. That is, the rate effects appear consistent with the ICI dependence of onset dominance reported previously (e.g., Brown and Stecker, 2010).
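To make the stimulus conditions compared in these conclusions concrete, the following minimal Python sketch builds a per-click ILD profile for each condition of a 16-click train. It assumes that each four-click sweep steps in equal decibel increments and stays non-zero on all four clicks, which matches the click ranges given in conclusion (2). The function name `ild_profile`, the parameter names, and the exact ramp values are illustrative assumptions, not the authors' implementation.

```python
def ild_profile(condition, peak_db, n_clicks=16, ramp=4):
    """Return the ILD in dB (positive = right ear) for each click in a train.

    Assumed ramp shape: equal dB steps from the peak toward zero, e.g.
    peak, 3/4 peak, 1/2 peak, 1/4 peak for a four-click downward sweep.
    """
    gap = n_clicks - 2 * ramp  # clicks not covered by one onset and one offset ramp
    if gap < 0 or gap % 2:
        raise ValueError("n_clicks must leave an even, non-negative gap")

    down = [peak_db * (ramp - k) / ramp for k in range(ramp)]  # falling sweep
    up = down[::-1]                                            # rising sweep

    profiles = {
        "RRRR": [peak_db] * n_clicks,                    # constant ILD
        "R000": down + [0.0] * (n_clicks - ramp),        # ILD near onset only
        "000R": [0.0] * (n_clicks - ramp) + up,          # ILD near offset only
        "R00R": down + [0.0] * gap + up,                 # onset and offset
        "0RR0": [0.0] * (gap // 2) + up + down + [0.0] * (gap // 2),  # interior
    }
    return profiles[condition]


if __name__ == "__main__":
    for name in ("RRRR", "R000", "000R", "R00R", "0RR0"):
        print(name, ild_profile(name, peak_db=4.0))
```

With the default arguments, condition R00R places non-zero ILD on clicks 1–4 and 13–16, and condition 0RR0 places it on clicks 5–12, with the largest values on the two clicks nearest the temporal midpoint.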
ACKNOWLEDGMENTS
The authors thank Julie Stecker and Anna Mamiya for assistance with data collection and two anonymous reviewers for helpful critiques of previous versions of this manuscript. Portions of this work were previously presented in abstract form (Stecker and Brown, 2010b). This work was supported by NIDCD Grant Nos. T32-DC000033, F31-DC010543, and R03-DC009482.
Footnotes
A stricter comparison might consider the effective “dual monaural” cue in binaural conditions (i.e., the cue available under independent processing of monaural information in each ear) to be twice the value available in the single-ear condition. In that case, one expects monaural thresholds to be no greater than twice those obtained in binaural conditions. That criterion was met in only one additional case for condition R00R (at 2 ms ICI, subject 0915’s monaural thresholds exceed binaural thresholds by a factor of 1.8). In condition 0RR0, however, it was met by seven of the subjects for at least one of the tested ICI values (in that case, the ratio of monaural to binaural thresholds was less than 2 but greater than ). Thus by the stricter 2x criterion, one might argue that listeners’ performance in condition 0RR0 reflected a monaural listening strategy and therefore a complete insensitivity to ILD carried by interior clicks. We prefer the more conservative conclusion given in the main text that listeners adopted a binaural strategy in both conditions but were relatively less sensitive to ILD in condition 0RR0 than R00R.
Our use of the term “late-arriving” serves to indicate the possible role of clicks both near to and at the moment of overall sound offset. Given evidence that late but pre-offset clicks contribute significantly to sound localization (Stecker and Hafter, 2009), it is convenient to differentiate that role from the overall sound offset.
References
- Akeroyd, M. A., and Bernstein, L. R. (2001). “The variation across time of sensitivity to interaural disparities: Behavioral measurements and quantitative analyses,” J. Acoust. Soc. Am. 110, 2516–2526. 10.1121/1.1412442
- Bartlett, E. L., and Wang, X. (2007). “Neural representations of temporally modulated signals in the auditory thalamus of awake primates,” J. Neurophysiol. 97, 1005–1017. 10.1152/jn.00593.2006
- Bernstein, L. R., and Trahiotis, C. (2002). “Enhancing sensitivity to interaural delays at high frequencies by using ‘transposed stimuli,’” J. Acoust. Soc. Am. 112, 1026–1036. 10.1121/1.1497620
- Brown, A. D., and Stecker, G. C. (2010). “Temporal weighting of interaural time and level differences in high-rate click trains,” J. Acoust. Soc. Am. 128, 332–341. 10.1121/1.3436540
- Freyman, R. L., Zurek, P. M., Balakrishnan, U., and Chiang, Y. C. (1997). “Onset dominance in lateralization,” J. Acoust. Soc. Am. 101, 1649–1659. 10.1121/1.418149
- Grantham, W. D. (1986). “Detection and discrimination of simulated motion of auditory targets in the horizontal plane,” J. Acoust. Soc. Am. 79, 1939–1949. 10.1121/1.393201
- Grothe, B., Pecka, M., and McAlpine, D. (2010). “Mechanisms of sound localization in mammals,” Physiol. Rev. 90, 983–1012. 10.1152/physrev.00026.2009
- Hafter, E. R., and Buell, T. N. (1990). “Restarting the adapted binaural system,” J. Acoust. Soc. Am. 88, 806–812. 10.1121/1.399730
- Hafter, E. R., Buell, T. N., and Richards, V. M. (1988). “Onset coding in lateralization: Its form, its site, and its function,” in Auditory Function: Neurobiological Bases of Hearing, edited by Edelman, G. M., Gall, W. E., and Cowan, W. M. (Wiley, New York), Chap. 22, pp. 647–676.
- Hafter, E. R., and Dye, R. H. J. (1983). “Detection of interaural differences of time in trains of high-frequency clicks as a function of interclick interval and number,” J. Acoust. Soc. Am. 73, 644–651. 10.1121/1.388956
- Hartmann, W. M., and Constan, Z. A. (2002). “Interaural level differences and the level-meter model,” J. Acoust. Soc. Am. 112, 1037–1045. 10.1121/1.1500759
- Houtgast, T., and Plomp, R. (1968). “Lateralization threshold of a signal in noise,” J. Acoust. Soc. Am. 44, 807–812. 10.1121/1.1911178
- Keen, R., and Freyman, R. L. (2009). “Release and re-buildup of listeners’ models of auditory space,” J. Acoust. Soc. Am. 125, 3243–3252. 10.1121/1.3097472
- Krumbholz, K., and Nobbe, A. (2002). “Buildup and breakdown of echo suppression for stimuli presented over headphones—the effects of interaural time and level differences,” J. Acoust. Soc. Am. 112, 654–663. 10.1121/1.1490594
- Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477. 10.1121/1.1912375
- Loftus, G. (1996). “Psychology will be a much better science when we change the way we analyze data,” Psychol. Sci. 5, 161–171.
- Macauley, E. J., Hartmann, W. M., and Rakerd, B. (2010). “The acoustical bright spot and mislocalization of tones by human listeners,” J. Acoust. Soc. Am. 127, 1440–1449. 10.1121/1.3294654
- Oertel, D., Wright, S., Cao, X.-J., Ferragamo, M., and Bal, R. (2011). “The multiple functions of T stellate/multipolar/chopper cells in the ventral cochlear nucleus,” Hear. Res. 276, 61–69. 10.1016/j.heares.2010.10.018
- Payton, M. E., Greenstone, M. H., and Schenker, N. (2003). “Overlapping confidence intervals or standard intervals: What do they mean in terms of statistical significance?” J. Insect Sci. 3, 34–39.
- Rakerd, B., and Hartmann, W. M. (1985). “Localization of sound in rooms. II. The effects of a single reflecting surface,” J. Acoust. Soc. Am. 78, 524–533. 10.1121/1.392474
- Saberi, K. (1996). “Observer weighting of interaural delays in filtered impulses,” Percept. Psychophys. 58, 1037–1046. 10.3758/BF03206831
- Stecker, G. C. (2010). “Trading of interaural time and level differences in high-rate narrowband click trains,” Hear. Res. 268, 202–212. 10.1016/j.heares.2010.06.002
- Stecker, G. C., and Brown, A. D. (2010a). “Temporal weighting of binaural cues revealed by detection of dynamic interaural differences in high-rate Gabor click trains,” J. Acoust. Soc. Am. 127, 3092–3103. 10.1121/1.3377088
- Stecker, G. C., and Brown, A. D. (2010b). “Does temporal weighting of interaural level differences include both onset and offset-specific effects?” Assoc. Res. Otolaryngol. Abstr. 33, 284.
- Stecker, G. C., and Hafter, E. R. (2002). “Temporal weighting in sound localization,” J. Acoust. Soc. Am. 112, 1046–1057. 10.1121/1.1497366
- Stecker, G. C., and Hafter, E. R. (2009). “A recency effect in sound localization?” J. Acoust. Soc. Am. 125, 3914–3924. 10.1121/1.3124776
- Stecker, G. C., Harrington, I. A., and Middlebrooks, J. C. (2005). “Location coding by opponent neural populations in the auditory cortex,” PLoS Biol. 3(3), e78. 10.1371/journal.pbio.0030078
- Stellmack, M. A., Viemeister, N. F., and Byrne, A. J. (2004). “Monaural and interaural intensity discrimination: Level effects and the ‘binaural advantage,’” J. Acoust. Soc. Am. 116, 1149–1159. 10.1121/1.1763971
- Tobias, J. V., and Zerlin, S. (1959). “Lateralization threshold as a function of stimulus duration,” J. Acoust. Soc. Am. 31, 1591–1594. 10.1121/1.1907664
- Zurek, P. M. (1980). “The precedence effect and its possible role in the avoidance of interaural ambiguities,” J. Acoust. Soc. Am. 67, 953–964. 10.1121/1.383974