Abstract
The present investigation assessed fusion and localization dominance aspects of the precedence effect under headphones across a variety of stimulus conditions in 10 normal-hearing listeners. Listeners were presented with “lead-lag” pairs of brief (123 μs) impulses or trains of such pairs lateralized by interaural time or level differences (ITD or ILD). Listeners used a touch-sensitive display to indicate for the final lead-lag pair presented on each trial (1) whether one or two locations were perceived and (2) the location perceived. In the event two locations were perceived, subjects were further instructed to indicate the left-most location perceived. Results demonstrated that lead-lag fusion was more robust for stimuli lateralized by ITD than ILD, particularly when cues of the test stimulus differed from cues of the preceding “buildup” stimulus, consistent with Krumbholz and Nobbe [(2002). J. Acoust. Soc. Am. 112, 654–663]. Unexpectedly, results also demonstrated reduced localization dominance with increasing lead-lag delay, suggesting that the fusion aspect of the precedence effect may be dissociated from the localization dominance aspect under buildup. It is thus argued that buildup of fusion might be understood more generally as an example of auditory object formation rather than a special facility for enhanced sound localization.
INTRODUCTION
Normal-hearing listeners localize sound sources accurately in ordinary listening environments (e.g., rooms) by responding to the auditory spatial cues carried by the early arriving sound rather than the spurious cues carried by later-arriving reflections and reverberation. The various phenomena associated with this observation are known collectively as the “precedence effect” (Wallach et al., 1949; for excellent reviews, see Blauert, 1997, and Litovsky et al., 1999). The precedence effect, studied in a variety of psychophysical and physiological paradigms over the past six decades, depends on essentially two phenomena: (1) fusion of the early arriving (“lead”) and late-arriving (“lag”) sound and (2) dominance of the localization cues carried by the lead over those carried by the lag (termed localization dominance; see Litovsky et al., 1999). One of the most surprising findings related to the precedence effect since its first description (Wallach et al., 1949) was reported by Clifton and colleagues (e.g., Clifton, 1987; Clifton and Freyman, 1989), who discovered that the temporal limit of lead-lag fusion, known as the “echo threshold,” is dynamic and dependent on prior stimulation. Using two (left and right) loudspeakers in a free field paradigm, Clifton and Freyman (1989) demonstrated that repetition of a fixed lead-lag stimulus (e.g., right-lead, left-lag) led to elevation of the echo threshold (i.e., enhancement of fusion, termed “buildup”), while subsequent presentation of a binaurally opposite lead-lag stimulus (e.g., left-lead, right-lag), led to an apparent resetting of the echo threshold (termed “breakdown”). 
Multiple studies of this phenomenon over the past two decades (e.g., Clifton et al., 1994; Grantham, 1996; McCall et al., 1998; Djelani and Blauert, 2001) have established that a listener's echo threshold, generally on the order of 5–10 ms at baseline (for lead-lag pairs of clicks or other impulsive stimuli), may be elevated to 15–25 ms following buildup, and reduced back to 5–10 ms following breakdown. Studies focused specifically on the breakdown phenomenon have additionally established that breakdown can be induced not only by a “switching” of the lead and lag speakers, but also by a sudden change in the lead-lag delay (Clifton et al., 1994), or a sudden change in the spectrum of the echo (McCall et al., 1998). Djelani and Blauert (2001) later demonstrated that subsequent presentation of a test pair identical to the original conditioner pairs gave extant buildup of fusion (i.e., that buildup was maintained across presentation of the breakdown stimulus, a condition they termed “re-buildup”).
Based on these observations, several investigators have suggested that dynamic aspects of the precedence effect reflect listeners' “construction of an internal model of auditory space” (Freyman and Keen, 2006; Keen and Freyman, 2009; Sanders et al., 2011) specific to a particular reverberant context: Echoes in agreement with the internal model become more effectively suppressed or fused with the direct sound (buildup), while echoes in violation of the internal model remain treated as separate sources (breakdown) (cf. Clifton et al., 1994; Grantham, 1996; Blauert, 1997; Litovsky et al., 1999; Freyman and Keen, 2006; Keen and Freyman, 2009). While it is tempting to assume that such a model would embody listeners' implicit knowledge of the spatial geometry of the perceived room and its sound source(s), i.e., the angles and distances (or equivalently directions and delays) involved (e.g., Keen and Freyman, 2009), an alternative possibility is that the “internal model of auditory space” actually embodies listeners' implicit knowledge concerning the spatial acoustics of the environment, e.g., the statistics of the interaural time differences (ITD), interaural level differences (ILD), or spectral features of the direct (early) and reflected (later-arriving) sound. According to the first possibility, buildup reflects increasing confidence in knowledge about the geometric arrangement of a listening environment, leading to more accurate localization of the veridical sound source(s) and stronger suppression of misleading echoes. According to the second possibility, buildup reflects enhanced fusion or capture of spatial information across acoustic cues, in much the same way that auditory-visual capture (i.e., ventriloquism) exploits the statistics of auditory and visual spatial cues. 
Importantly, the two possibilities make distinct predictions for breakdown effects: according to the first, breakdown results from geometric violations (such as change in source position or room shape) without regard to the nature of underlying acoustic cues; according to the second, breakdown results from changes in the acoustic cues that violate the expected statistical relationships of those cues. The current study aims to evaluate these different accounts of dynamic precedence by examining buildup and breakdown effects for stimuli lateralized by either ITD or ILD in the same group of listeners.
Different contributions of ITD and ILD to precedence
In a novel study of buildup and breakdown phenomena, Krumbholz and Nobbe (2002) presented listeners with single pairs or trains of pairs of lead-lag clicks over headphones to measure buildup and breakdown effects for stimuli lateralized by either ITD or ILD. This experiment was a departure from most other studies of the precedence effect, which had almost exclusively employed free field, headphone ITD, or virtual acoustic space stimuli, where the contributions of ITD and ILD to listeners' judgments were not independently evaluated (e.g., Wallach et al., 1949; Thurlow and Parks, 1961; Clifton, 1987; Clifton and Freyman, 1989; Shinn-Cunningham et al., 1993; Yang and Grantham, 1997; Djelani and Blauert, 2001; Freyman and Keen, 2006; Keen and Freyman, 2009; see Litovsky et al., 1999). Krumbholz and Nobbe (2002) measured significant buildup of fusion echo thresholds for ITD stimuli, with thresholds increasing from a mean of 7 ms at baseline to 16 ms after stimulus repetition, and lesser but significant buildup for ILD stimuli, with thresholds increasing from a mean of 5 ms at baseline to 11 ms after stimulus repetition. The breakdown effect, in contrast, was evidenced only for binaurally switched ILD stimuli. Echo thresholds for breakdown-ILD stimuli (buildup conditioner + binaurally switched test stimulus) were comparable to thresholds for baseline-ILD stimuli (mean of ∼4 ms), consistent with the breakdown effect observed in the free field (e.g., Clifton and Freyman, 1989). Echo thresholds for breakdown-ITD stimuli, in contrast, remained comparable to thresholds in the buildup-ITD condition (mean of ∼14 ms). That is, presentation of a lead-lag ITD test stimulus carrying cues opposite the conditioner stimulus produced no breakdown of fusion. 
The findings of Krumbholz and Nobbe (2002) are thus striking in the context of previous studies of buildup and breakdown (e.g., Clifton, 1987; Clifton and Freyman, 1989; Clifton et al., 1994; Djelani and Blauert, 2001; Freyman and Keen, 2006; Keen and Freyman, 2009). Based on their data, breakdown of fusion in free-field lead-lag “switch” paradigms would seem necessarily mediated by sensitivity to changes in the ILD, while the concomitant change in the ITD might be inconsequential.
A number of questions about the precedence effect might be asked on the basis of Krumbholz and Nobbe's (2002) report. For example, the authors did not include a “re-buildup” condition (after Djelani and Blauert, 2001) in their experiment to assess whether echo threshold had actually “broken down” for ILD stimuli, or whether thresholds reflected baseline thresholds for novel stimuli. Of greater interest, while some psychophysical evidence suggests that ITD and ILD are combined at some level of binaural processing into a common code (e.g., Maier et al., 2010), the results of Krumbholz and Nobbe (2002; see their Discussion in particular) suggest that the precedence effect, in terms of lead-lag fusion, strongly depends on which binaural cue (ITD or ILD) is manipulated. A geometric account of buildup and breakdown phenomena holds that the effects of stimulus repetition should depend only on the spatial perception induced by, and not on the specific acoustic cues carried by, the stimulus. Following on the report of Krumbholz and Nobbe (2002), the present study reexamines buildup and breakdown phenomena with particular attention to the room acoustics hypothesis of Freyman and colleagues (e.g., Freyman and Keen, 2006; Keen and Freyman, 2009; Sanders et al., 2011). Three experiments designed to measure both fusion echo thresholds and subjective lateralization for stimuli carrying nonzero ITD or ILD cues presented in isolation or following appropriately designed conditioner stimuli are described. Data are presented and briefly discussed for each experiment, followed by summary points and general discussion in the context of the precedence literature at large.
COMMON EXPERIMENTAL METHODS
All procedures, including subject recruitment, obtaining subject consent, and subject testing followed the guidelines of the University of Washington Human Subjects Division and were reviewed and approved by the cognizant Institutional Review Board.
Subjects
Ten subjects aged 20–58 (four female) completed participation in this experiment. All subjects were naive to the purpose of the experiment and were compensated for their participation. Subjects reported normal hearing and demonstrated pure-tone detection thresholds <20 dB HL with <10 dB asymmetry between left and right ears at octave frequencies 250–8000 Hz.
Stimuli and procedure
All testing was completed in a sound-attenuated booth (IAC, Bronx, NY). Subjects were seated in a swivel chair facing a large (80-cm diagonal) touch-sensitive display (elo Touchsystems 3200L, Tyco Electronics, Bermuda). Stimuli across all phases of all experiments consisted of monophasic pulses (“clicks”) 123 μs (6 samples) in duration with an average binaural level of ∼60 dB SPL. Pairs and trains of clicks (to be detailed subsequently for each experiment) were programmed in MATLAB (MathWorks, Natick, NJ), synthesized at 48.828 kHz (Tucker-Davis Technologies RP2.1, Alachua, FL), and presented under closed-back electrostatic headphones (STAX Model 4070, Saitama, Japan). Nonzero ITD values were imposed by delaying the signal to the left channel for right-favoring ITD or delaying the signal to the right channel for left-favoring ITD. Nonzero ILD values were imposed by amplifying the signal to the favored earphone by half the total ILD and attenuating the signal to the opposite earphone by an equal amount. Subjects completed each experiment by interacting with the touch-sensitive display as described in the following sections.
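The cue manipulations described above can be illustrated with a minimal Python sketch. It is not the MATLAB/RP2.1 implementation used in the study: the function name `click_pair`, the buffer length, the unit click amplitude, and the rounding of the ITD to a whole number of samples are all assumptions made for illustration. Only the sample rate, the 6-sample click, and the sign conventions come from the text.

```python
import math

FS_HZ = 48828.0          # synthesis rate given in the text (48.828 kHz)
CLICK_SAMPLES = 6        # 6 samples, about 123 microseconds at FS_HZ


def click_pair(itd_us=0.0, ild_db=0.0, n_samples=64):
    """Return (left, right) sample lists for one dichotic click.

    Positive cue values favor the right ear, negative values the left.
    n_samples and the unit amplitude are illustrative assumptions.
    """
    # ITD: delay the channel opposite the favored ear.
    # Rounding to whole samples is an assumption of this sketch.
    delay = round(abs(itd_us) * 1e-6 * FS_HZ)
    left_delay = delay if itd_us > 0 else 0
    right_delay = delay if itd_us < 0 else 0

    # ILD: boost the favored ear by half the ILD, cut the other by the same amount.
    right_gain = 10 ** (+ild_db / 2 / 20)
    left_gain = 10 ** (-ild_db / 2 / 20)

    def channel(onset, gain):
        samples = [0.0] * n_samples
        for i in range(onset, onset + CLICK_SAMPLES):
            samples[i] = gain
        return samples

    return channel(left_delay, left_gain), channel(right_delay, right_gain)
```

Under these assumptions, `click_pair(itd_us=308.0)` starts the right-ear click at sample 0 and the left-ear click 15 samples later, and `click_pair(ild_db=10.0)` yields a right-ear amplitude 10 dB above the left.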
Training
Prior to participation in experiment I (Sec. 3), subjects were trained in a simple two-alternative forced-choice task. The purpose of this training was (1) to familiarize subjects with the lead-lag experimental stimuli, and (2) to provide a standard for judgments in the inherently subjective experimental task. Specifically, stimuli in the training task were pairs of lead-lag dichotic clicks (separated by 1–50 ms) in which the lag carried an ITD or ILD either identical to or opposite the ITD or ILD carried by the lead. On each trial, the subjects' task was to indicate whether the stimuli had consisted of signals from “one location” (true if lead and lag cues were identical, indicated by touching the panel at the top of the display) or “two locations” (true if lead and lag cues favored opposite ears, indicated by touching the panel at the bottom of the display). Immediate correct/incorrect feedback was displayed on the monitor after each trial. In addition to verifying that subjects were sensitive to the nonzero ITD or ILD values carried by the stimuli, this task served to train subjects to select the “two locations” panel only when two discrete locations were perceived, rather than simply when two sounds were perceived. This method emphasizes a more stringent definition of echo threshold (summarized by Blauert, 1997), which requires that the lead and the lag be perceived at discrete, though not necessarily their veridical, locations. All subjects completed at least 2 h of training runs, evenly divided between ITD stimuli and ILD stimuli. Several subjects required additional training to reach criterion (90% correct or better at lead-lag delays >25 ms) for either ITD or ILD conditions (or both), suggestive of lesser sensitivity to the cues in those subjects.¹
ITD-ILD matching task
The questions addressed by experiments II and III (see Secs. 4, 5) required that ITD and ILD values employed in test stimuli produced equivalent lateralization for each subject (approximate equivalence was assumed for experiment I based on the data of Krumbholz and Nobbe, 2002). Thus, prior to testing in experiments II and III, an explicit ITD-ILD matching task was completed by each subject. On each trial of the matching task, subjects were initially presented with a “standard” click carrying ±308 μs ITD (experiment II matching task) or ±615 μs ITD (experiment III matching task), followed 1 s later by a second “pointer” click carrying a random ILD from the range −15 to +15 dB ILD. By convention, positive cue values favored the right ear, and negative cue values favored the left ear. On each trial, subjects were instructed to adjust the perceived location of the ILD pointer to match the perceived location of the ITD standard by using buttons displayed on the touch screen monitor. Three left-arrow buttons and three right-arrow buttons provided for “coarse” (±6 dB), “medium” (±2 dB), and “fine” (±0.3 dB) adjustment. After each adjustment, the standard-pointer click pair was automatically replayed. Subjects were free to make as many adjustments as necessary to match the location of the pointer to the location of the standard. Once subjects were satisfied with the pointer-standard match, they pressed a “match” button at the bottom of the display, and the next trial began automatically. Each run consisted of 20 trials; subjects completed 4 runs for left- and right-leading ITD standards (8 total) for each matching task (experiment II, experiment III). Matching data are presented in Secs. 4, 5.
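The pointer-adjustment logic of the matching task can be sketched in a few lines of Python. The names `STEP_DB`, `initial_pointer_ild`, and `adjust_pointer` are hypothetical, the uniform draw of the starting ILD is an assumption (the text says only "random"), and no range clamping is applied because none is described.

```python
import random

# Button step sizes (dB) taken from the text.
STEP_DB = {"coarse": 6.0, "medium": 2.0, "fine": 0.3}


def initial_pointer_ild(rng):
    """Draw the starting pointer ILD from the -15 to +15 dB range.

    A uniform distribution is assumed; the text specifies only the range.
    """
    return rng.uniform(-15.0, 15.0)


def adjust_pointer(ild_db, presses):
    """Apply a sequence of (size, direction) button presses to the pointer ILD.

    direction is +1 for a right-arrow press and -1 for a left-arrow press
    (positive ILD favors the right ear). No clamping is applied.
    """
    for size, direction in presses:
        ild_db += direction * STEP_DB[size]
    return ild_db
```

For example, starting at −3 dB and pressing coarse-right, medium-right, then fine-left leaves the pointer at 4.7 dB; the value in effect when the subject presses “match” would be recorded as that trial's matched ILD.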
Main experimental task
Following completion of training (experiment I) or ITD-ILD matching tasks (experiments II and III), subjects began testing. The subject's task on each trial was the same across all conditions and all experiments. Following presentation of either a brief silent period or a conditioner stimulus (see Secs. 3, 4, 5), a “lead-lag” test stimulus was presented. The subject was instructed to indicate for this stimulus (i.e., ignoring the conditioner stimulus, if present) (1) whether one location (upper panel on the touch screen) or two locations (lower panel on the touch screen) had been perceived and (2) to indicate the apparent lateral location of the stimulus within the selected panel (a perceptual scaling task). If two locations were perceived, subjects were further instructed to indicate the left-most location perceived. Each response thus carried two independent components of data. Specifically, which panel the subject touched indicated whether the lead and lag clicks had appeared “fused,” and where the subject touched within the selected panel indicated the extent of localization dominance (i.e., the extent to which the reported spatial percept agreed with cues carried by the lead versus cues carried by the lag). Both aspects (fusion and localization dominance²) were expected to change as a function of lead-lag delay, the parameter varied from trial to trial (see below). Trials within a given run were of a single stimulus condition (e.g., Baseline ITD or Buildup ILD; see Secs. 3, 4, 5); stimulus conditions were presented in random order across subjects. Subjects completed at least one practice run in each condition before testing commenced.
Both adaptive methods (e.g., Krumbholz and Nobbe, 2002) and methods of constant stimuli (e.g., Clifton and Freyman, 1989) have been used for echo threshold estimation in studies of the precedence effect. Each has its advantages (e.g., the efficiency of an adaptive staircase versus the completeness of an empirically measured psychometric function). In efforts to ensure consistency of measurement, three echo thresholds were estimated for each run in the present investigation using three simultaneous and independent procedures (programmed in MATLAB). On each trial within a given run, the lead-lag delay was drawn randomly from one of two adaptive tracks (one ascending, starting value 1 ms; one descending, starting value 50 ms) or a constant set of delays ([0.43,³ 3, 6, 9, 15, 25, 50 ms], 5 trials per stimulus, each equally probable on a given trial). Each adaptive staircase followed a 1-up, 1-down rule to estimate the 50% echo threshold; logarithmic step sizes of 0.2 ($\mathrm{delay}_{\mathrm{new}} = \mathrm{delay}_{\mathrm{old}} \times 10^{\pm 0.2}$) were employed up to the fourth reversal in each track, then decreased to 0.05 ($\mathrm{delay}_{\mathrm{new}} = \mathrm{delay}_{\mathrm{old}} \times 10^{\pm 0.05}$) for the duration of the run. Each track terminated after 8 reversals. The threshold for each track was taken as the geometric mean lead-lag delay of the final 4 reversals. A third threshold was taken as the lead-lag delay at the interpolated 50% point on a psychometric function that was fit to responses for the constant set of values once the run had finished (using the custom MATLAB function psignfit, Wichmann and Hill, 2001). In experiments I and II, subjects completed four runs in each condition, giving 12 total threshold estimates per condition per subject. In experiment III, subjects completed two runs in each condition, giving six threshold estimates per condition.
Since two out of three threshold trackers were adaptive, the lead-lag delay presented on a given trial was occasionally related to the lead-lag delay presented on the previous trial; nonetheless, tracker randomization and the presence of a constant stimulus tracker made it nearly impossible to anticipate the delay on a given trial. Finally, lateralization responses did not affect the trial-to-trial progression of experimental runs; right-lead, left-lag and left-lead, right-lag stimuli were presented in random order over the duration of a run, and lateralization data were analyzed offline after completion of the experiment.
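The adaptive tracks described above can be sketched as follows. This is a hedged Python illustration, not the authors' MATLAB code: the class and function names are hypothetical, the idealized listener in `run_track` is an assumption used only to exercise the track, and switching to the smaller step immediately after the fourth reversal is one reading of "up to the fourth reversal." Interleaving with the second track and the constant-stimulus set is omitted.

```python
import math


class Staircase:
    """One 1-up, 1-down adaptive track on lead-lag delay (ms)."""

    def __init__(self, start_ms):
        self.delay_ms = start_ms
        self.last_direction = 0      # +1 = delay went up, -1 = delay went down
        self.reversals = []          # delays at which the track changed direction

    @property
    def finished(self):
        # Each track terminates after 8 reversals.
        return len(self.reversals) >= 8

    def update(self, heard_two_locations):
        """Record one response and set the delay for this track's next trial."""
        # "Two locations" means the echo was heard, so shorten the delay;
        # "one location" means fusion, so lengthen it.
        direction = -1 if heard_two_locations else +1
        if self.last_direction and direction != self.last_direction:
            self.reversals.append(self.delay_ms)
        self.last_direction = direction
        # Log10 step of 0.2 until the fourth reversal, 0.05 afterwards.
        step = 0.2 if len(self.reversals) < 4 else 0.05
        self.delay_ms *= 10 ** (direction * step)

    def threshold_ms(self):
        """Geometric mean lead-lag delay of the final four reversals."""
        last = self.reversals[-4:]
        return math.exp(sum(math.log(d) for d in last) / len(last))


def run_track(start_ms, true_threshold_ms):
    """Run one track against an idealized listener (an assumption) who
    reports two locations whenever the delay exceeds true_threshold_ms."""
    track = Staircase(start_ms)
    while not track.finished:
        track.update(track.delay_ms > true_threshold_ms)
    return track.threshold_ms()
```

With this deterministic listener, both the ascending track (`run_track(1.0, 8.0)`) and the descending track (`run_track(50.0, 8.0)`) settle within one fine step of the simulated 8-ms threshold, which is the behavior expected of a 1-up, 1-down rule targeting the 50% point.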
Analysis
Echo thresholds were compared across conditions and across subjects by repeated-measures analysis of variance (ANOVA) and paired t-tests. Significant differences are given by p < 0.05, with corrections for multiple comparisons applied as appropriate. Substantial individual variability in echo thresholds was anticipated on the basis of past studies of the precedence effect and binaural sensitivity in general (e.g., Wallach et al., 1949; McFadden et al., 1973; Freyman et al., 1991; Grantham, 1996; Yang and Grantham, 1997; Brown and Stecker, 2010); such differences were observed in some conditions of the present investigation. Individual subject data are thus given for each experiment along with mean data.
Lateralization data consisted of horizontal position values giving the location within either panel (“one location” or “two locations”) that the subject touched on each trial, ranging from −1 (maximum left) to +1 (maximum right). For all experiments, lateralization responses were first grouped according to the sidedness of the lead stimulus (left-lead, right-lag or right-lead, left-lag), and then again according to whether the subject touched the upper panel (“one location” trials) or the lower panel (“two location” trials). Sorted responses were plotted as a function of lead-lag delay, and weighted lines of best fit (least-squares) were generated to summarize trends in lateralization for each case as a function of lead-lag delay (dashed black and red lines in Figs. 3, 6, and 9). In some cases, one-sample t-tests were conducted on the slopes of these best fit lines (described separately for each experiment) to test the null hypothesis that the slope of lateralization across lead-lag delay was zero (i.e., that the magnitude of lateralization did not depend on lead-lag delay).
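The one-sample test on best-fit slopes amounts to the following computation, shown here as a minimal Python sketch using only the standard library. The function name `one_sample_t` is hypothetical, and obtaining a p value from the t distribution is left to a statistics package.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (t, df) for a one-sample t-test of the mean of values against mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1
```

Supplying one slope per stimulus condition gives n − 1 degrees of freedom; a negative t indicates that slopes were, on average, below zero.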
EXPERIMENT I: DYNAMIC PRECEDENCE EFFECTS FOR ITD AND ILD
The goal of this experiment was to replicate and extend the study of Krumbholz and Nobbe (2002), in which echo thresholds were measured for pairs of lead-lag clicks lateralized by ITD or ILD across three different stimulus types: baseline, buildup, and breakdown. Stimuli in the present experiment were of four different types: (1) Baseline stimuli consisted of a single lead-lag click pair, (2) Buildup stimuli consisted of 12 “conditioner” lead-lag click pairs and a final test pair identical to the conditioner pairs, (3) Breakdown stimuli consisted of 12 conditioner pairs and a “switched” test pair in which the interaural cues were swapped between the lead and lag clicks, and (4) Retest stimuli consisted of 11 conditioner pairs, an intervening switched pair, and a final test pair identical to the 11 conditioner pairs [after the re-buildup condition used by Djelani and Blauert (2001) to demonstrate maintenance of buildup following breakdown]. In ITD conditions, lead clicks always carried +/−308 μs ITD (i.e., 308 μs right-favoring or 308 μs left-favoring ITD) and 0 dB ILD, and lag clicks always carried −/+308 μs ITD and 0 dB ILD (i.e., an opposing ITD cue). Correspondingly, in ILD conditions, lead clicks always carried +/−10 dB ILD and 0 μs ITD, and lag clicks always carried −/+10 dB ILD and 0 μs ITD. These and other key stimulus parameters are illustrated in Fig. 1. Stimuli were designed to match those employed by Krumbholz and Nobbe (2002) in their ITD and ILD conditions; cue values were expected to produce approximately equivalent lateralization for ITD and ILD stimuli (see also Fig. 4). The major novelty of the present experiment was its simultaneous assessment of fusion and localization dominance, the latter having never been measured for “buildup” stimuli (cf. Clifton, 1987; Freyman et al., 1991; Clifton et al., 1994; Djelani and Blauert, 2001; Freyman and Keen, 2006; Keen and Freyman, 2009).
Results
Echo thresholds
Figure 2 gives individual subject (symbols) and mean (filled circles, error bars ± SE) echo thresholds for ITD and ILD conditions of experiment I (12 threshold estimates per condition per subject). Mean echo thresholds were greater for ITD than ILD in every condition, with the disparity particularly evident in the Breakdown condition. Of particular interest in the context of the studies of Djelani and Blauert (2001) and Krumbholz and Nobbe (2002), the echo threshold in the Retest ILD condition appeared to be comparable to the echo threshold in the Buildup ILD condition. Thus, lower Breakdown ILD thresholds notwithstanding, buildup was apparently “maintained” for the original lead-lag ILD stimulus. Individual data support the mean trends, with some individual differences evident (e.g., subjects 0601 and 1014 failing to show any breakdown effect in either ITD or ILD conditions). Threshold data were submitted to a 4 × 2 (condition × cue) repeated-measures ANOVA. The main effects of cue [F(1,9) = 27.51, p < 0.05] and condition [F(3,27) = 16.46, p < 0.05], and the cue × condition interaction [F(3,27) = 4.31, p < 0.05] were all significant. Follow-up paired t-tests with set-wise correction for multiple comparisons demonstrated that ITD-based echo thresholds were significantly higher than ILD-based echo thresholds in Baseline [t(9) = 5.31, p < 0.0125], Buildup [t(9) = 3.16, p < 0.0125], and Breakdown [t(9) = 4.09, p < 0.0125] conditions, but not in the Retest condition [t(9) = 2.57, p = 0.030]. An additional set of tests demonstrated that ILD-based echo thresholds were significantly higher in the Buildup condition than in the Baseline [t(9) = 4.78, p < 0.025] and Breakdown [t(9) = 2.76, p < 0.025] conditions. 
A final set of tests demonstrated that echo thresholds within ITD conditions were significantly higher than Baseline in Buildup [t(9) = 4.15, p < 0.0125], Breakdown [t(9) = 3.39, p < 0.0125], and Retest [t(9) = 4.52, p < 0.0125] conditions, while Breakdown and Buildup thresholds were not statistically different [t(9) = 1.86, p = 0.100].
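The per-test criteria quoted above (0.0125 and 0.025) are consistent with dividing a 0.05 criterion evenly among the tests in each set. The Python sketch below illustrates that arithmetic; treating the "set-wise correction" as this Bonferroni-style division is an assumption inferred from the reported criteria, and the function names are hypothetical.

```python
def setwise_alpha(n_tests, family_alpha=0.05):
    """Per-test criterion when family_alpha is split evenly across a set of tests."""
    return family_alpha / n_tests


def is_significant(p_value, n_tests, family_alpha=0.05):
    """True if p_value falls below the per-test criterion for its set."""
    return p_value < setwise_alpha(n_tests, family_alpha)
```

Under this reading, a set of four tests gives a criterion of 0.0125, so the Retest ITD-versus-ILD comparison (p = 0.030) does not reach significance even though it falls below the uncorrected 0.05 level.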
Lateralization responses
The adaptive threshold estimation procedure introduced dozens of unique lead-lag delays across runs for each subject, in addition to the constant set of lead-lag delays tested in all runs for all subjects (see Sec. 2). Thus, to assess lateralization responses at the group level, mean lateralization values were first computed for each subject at each tested lead-lag delay. The cross-subject mean was then computed across lead-lag delay as the running mean of a sliding 3-ms window from 1 to 100 ms (interval 0.1 ms). A weight was determined for each such mean according to the number of subjects for which data existed (minimum 0, maximum 10). For means that comprised two subjects or more, the weighted standard deviation was also computed. Finally, means and their weights were used to compute a weighted line of best fit (least-squares) to lateralization responses as a function of lead-lag delay. This procedure was applied separately to lateralization data for “one location” and “two location” trials (i.e., for trials on which subjects responded in the upper panel on the display versus the lower panel; see Sec. 2).
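One way to read the group-level procedure above is sketched below in Python. The function names, the data layout (one dictionary per subject), and the choice to average each subject's values within a window before averaging across subjects are assumptions of this sketch; the weighted standard deviation is omitted for brevity. The line fit is the standard weighted least-squares solution, linear in lead-lag delay.

```python
def windowed_group_means(subject_means, start=1.0, stop=100.0, step=0.1, width=3.0):
    """subject_means: one dict per subject mapping lead-lag delay (ms) to that
    subject's mean lateralization (-1 to +1) at that delay.

    Returns (center_ms, group_mean, weight) tuples, where weight is the number
    of subjects with data inside the window centered on center_ms.
    """
    points = []
    n_steps = int(round((stop - start) / step))
    for k in range(n_steps + 1):
        center = start + k * step
        per_subject = []
        for means in subject_means:
            inside = [v for d, v in means.items() if abs(d - center) <= width / 2]
            if inside:
                per_subject.append(sum(inside) / len(inside))
        if per_subject:  # windows with no data (weight 0) are skipped
            points.append((center, sum(per_subject) / len(per_subject), len(per_subject)))
    return points


def weighted_line_fit(points):
    """Weighted least-squares line through (x, y, w) points; returns (slope, intercept)."""
    total_w = sum(w for _, _, w in points)
    x_bar = sum(w * x for x, _, w in points) / total_w
    y_bar = sum(w * y for _, y, w in points) / total_w
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in points)
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar
```

Calling `weighted_line_fit(windowed_group_means(data))` separately on the “one location” and “two location” responses would yield one slope per response type and condition, which is the quantity examined in the slope tests.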
Figures 3a, 3b plot cross-subject mean lateralization responses as a function of lead-lag delay for all conditions of experiment I [Fig. 3a, right-lead, left-lag stimuli; Fig. 3b, left-lead, right-lag stimuli]. Within each panel, axes are arranged such that the magnitude of lateralization is given by the leftward (toward −1) or rightward (toward +1) deviation from the midline (dotted vertical line). Lead-lag delay is plotted in the vertical dimension. Black filled circles give weighted means for “one location” trials; red filled circles give weighted means for “two locations” trials. The size of each point gives the weight of each mean, and error bars give the weighted standard deviation. Finally, dashed black and red lines give weighted linear fits to the mean lateralization responses (nonlinear in appearance due to the logarithmic lead-lag delay axes). Fusion echo thresholds (see Fig. 2), included for visual reference only, are given by inward-pointing triangles along the lead-lag delay axes; as described in Sec. 2, echo thresholds and lateralization were measured independently.
Considering first Fig. 3a (responses for right-lead, left-lag stimuli), consistent with expectations, lateralization responses on trials for which subjects reported “one location” (black) generally fell to the right of midline, in agreement with the right-favoring ITD or ILD carried by the lead, while lateralization responses on trials for which the subject reported perceiving “two locations” (red) generally fell to the left of midline, in agreement with the left-favoring ITD or ILD carried by the lag. Wholly unexpected was an apparent reduction in the magnitude of lateralization for “one location” responses with increasing lead-lag delay. This pattern was particularly evident in conditions featuring elevated echo thresholds (e.g., Buildup conditions) where mean lateralization responses for “one location” trials at lead-lag delays beyond ∼20 ms fell close to or at the midline. This pattern, also reflected in individual subject data (not shown), appeared to hold for both ITD and ILD conditions. Submitting the slopes of all “one location” best fit lines in Fig. 3a to a one-sample t-test (against 0) revealed that slopes taken across conditions were significantly negative [t(7) = −4.83, p < 0.05].
Critically, this pattern of reduced lateralization of “fused” lead-lag stimuli with increasing lead-lag delay was also apparent for left-lead, right-lag stimuli [Fig. 3b]. For these stimuli, given the task instructions, subjects should have responded to the left-favoring lead regardless of fusion (i.e., whether one or two locations were perceived). Nonetheless, as for right-lead, left-lag trials, the magnitude of lateralization for “fused” responses clearly decreased with increasing lead-lag delay, giving rise to significantly positive slopes for “one location” best fit lines [one-sample t-test against 0, t(7) = 3.93, p < 0.05]. In contrast, in the absence of fusion (i.e., for “two locations” responses), there was a trend for slightly increased lateralization with increasing lead-lag delay [one-sample t-test of “two locations” best fit slopes against 0, t(7) = −2.70, p < 0.05].
Interim discussion
The results of experiment I suggest that the precedence effect in terms of lead-lag fusion is more robust for ITD than ILD, in agreement with the data of Krumbholz and Nobbe (2002) and consistent with several previous headphone studies (e.g., Saberi et al., 2004; Stecker and Brown, 2010; Brown and Stecker, 2010). Across all tested conditions, echo thresholds for stimuli lateralized by ITD exceeded those for stimuli lateralized by ILD. The difference was most notable in the Breakdown condition, where a sudden perturbation in the ITD of the test stimulus relative to that of the conditioner stimulus (specifically, a switching of the lead and lag ITD values) failed to produce a change in echo threshold. These observations support the notion that the precedence effect as described in the free field depends on specific contributions from ITD and ILD. Most critically, the data suggest, consistent with Krumbholz and Nobbe (2002), that the breakdown effect demonstrated in the free field must depend on the sudden change in lead-lag ILD values; the concomitant sudden change in ITD, at least under headphones, is evidently inconsequential.
The present data additionally demonstrated a surprising reduction in localization dominance for fused (“one location”) lead-lag images with increasing lead-lag delay. For many such trials at lead-lag delays beyond ∼20 ms, lateralization responses fell near the midline, suggesting that the lead and lag cues both contributed substantially to the response. As these trials make the greatest contribution to the elevation of measured echo thresholds in buildup conditions, it follows that the “built-up” precedence effect may feature enhanced lead-lag fusion without similarly enhanced localization dominance. This observation is difficult to reconcile with a standard view of the buildup phenomenon, which construes the elevation of echo thresholds as enhancement of the precedence effect per se (i.e., enhanced fusion with enhanced localization dominance). The possibility of reduced localization dominance with increased fusion is explored further in experiments II and III, while an alternative explanation for near-midline lateralization responses (concerning the presence of the diotic ILD in ITD stimuli and the diotic ITD in ILD stimuli) is considered specifically in experiment III.
EXPERIMENT II: CROSS-CUE TRANSFER OF BUILDUP
The present experiment was designed to evaluate spatial geometric versus spatial acoustic accounts of the dynamic precedence effect (considered in Sec. 1) by presenting subjects with two novel buildup conditions where conditioner and test stimuli were lateralized by different cues of equal subjective magnitude. Prior to testing in the main experimental task, subjects completed an ITD-ILD matching task to obtain values of ILD (carried by single clicks) that matched the subjective lateralization of a ±308 μs ITD standard (see Sec. 2). These individually determined values of ILD (see Fig. 4) were used for all ILD conditioner and test stimuli in experiment II. Stimulus conditions in experiment II consisted of Baseline ITD and ILD conditions identical to those of experiment I (with the exception that the individually determined values of ILD in Fig. 4 were used in place of the ±10 dB used in experiment I), and two novel “buildup” conditions: (1) Buildup ILD, Test ITD, in which the conditioner was lateralized by ILD and the test by ITD, and (2) Buildup ITD, Test ILD, in which the conditioner was lateralized by ITD and the test by ILD.
Results and interim discussion
Figure 5 gives echo thresholds for the conditions of experiment II. Mean Baseline ITD and ILD thresholds were nearly identical to those measured in experiment I. Toward the primary question addressed by experiment II, the mean Buildup ILD, Test ITD threshold appeared to be moderately elevated relative to the Baseline ITD threshold, while the Buildup ITD, Test ILD threshold appeared to be equal to the threshold in the Baseline ILD condition. Individual subject data reveal that the mean Buildup ILD, Test ITD threshold was skewed by two subjects, most especially subject 1012, whose threshold in that condition was 37.5 ms. Although this point might be treated as an outlier, subject 1012's data were not unusual in other conditions, and very high buildup thresholds are occasionally measured in subjective fusion tasks (e.g., Yang and Grantham, 1997; Djelani and Blauert, 2001). Paired t-tests (corrected for two comparisons) indicated that the Buildup ILD, Test ITD threshold was not significantly higher than the Baseline ITD threshold, though the difference approached significance [t(9) = 2.65, p = 0.027], while Buildup ITD, Test ILD and Baseline ILD thresholds were not significantly different [t(9) = 1.16, p = 0.274]. In comparison to “within-cue” buildup thresholds, the mean Buildup ILD, Test ITD threshold of the present experiment was ∼15 ms (∼12.5 ms with removal of subject 1012) versus ∼18 ms for Buildup ITD (experiment I), while the Buildup ITD, Test ILD threshold of the present experiment was ∼5.5 ms versus ∼14 ms for Buildup ILD (experiment I). The data thus suggest that even when the subjective location of a repeating stimulus is fixed, maximal buildup requires static ITD and ILD cues. Although limited cross-cue transfer of buildup was measured with a subjectively equivalent ILD conditioner and ITD test stimulus (particularly for two subjects), none was measured in the opposite case.
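To make the “approached significance” reading concrete, the following minimal sketch compares the two reported p values with a per-test criterion. It assumes a Bonferroni correction and a family-wise alpha of 0.05; the text states only that the tests were “corrected for two comparisons,” so both choices are assumptions, and the function name `is_significant` is illustrative.

```python
def is_significant(p_value, n_comparisons, family_alpha=0.05):
    """Return True if p_value passes a Bonferroni-corrected criterion.

    Assumption: Bonferroni correction with a family-wise alpha of 0.05;
    the source does not name the correction that was actually applied.
    """
    return p_value < family_alpha / n_comparisons


# p values reported in the text for the two cross-cue comparisons
reported = {
    "Buildup ILD, Test ITD vs Baseline ITD": 0.027,
    "Buildup ITD, Test ILD vs Baseline ILD": 0.274,
}

for label, p in reported.items():
    verdict = "significant" if is_significant(p, 2) else "not significant"
    print(f"{label}: p = {p} -> {verdict}")
```

Under these assumptions the per-test criterion is 0.025, so p = 0.027 falls just short of it while p = 0.274 is far from it.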
Lateralization data for experiment II are displayed in Fig. 6. Because trends in lateralization as a function of lead-lag delay closely followed those observed in experiment I, data are given only for right-lead, left-lag trials (see Appendix for left-lead, right-lag data). As in experiment I, when one location was reported at lead-lag delays beyond ∼20 ms, subjects tended to respond near the midline, suggesting weak localization dominance. Consequently, as in experiment I, the slope of lines fit to “one location” responses was significantly negative taken across conditions [t(3) = −3.23, p < 0.05]. Additionally, a subtler trend was observed for “two locations” responses: At lead-lag delays near the echo threshold, “two locations” responses tended to be shifted rightward (i.e., toward the lead). This trend, leading to slightly negative slopes for lines fit to “two locations” data, was also present in the data of experiment I, but was particularly evident in the Baseline ITD and Buildup ILD, Test ITD conditions of the present experiment. This observation and the observation of weak localization dominance for lead-lag buildup test stimuli, as well as differences between ITD- and ILD-based precedence effects discussed hereto, are explored further in experiment III.
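The slope analysis described above can be sketched as follows: fit a least-squares line to lateralization versus lead-lag delay in each condition, then test the resulting slopes against zero with a one-sample t-test. This is an illustrative reconstruction, not the authors' analysis code. All numbers below are hypothetical placeholders (normalized lateralization, with −1 as far left and +1 as far right), and the names `ols_slope` and `one_sample_t` are assumptions.

```python
import math


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def one_sample_t(values, mu=0.0):
    """t statistic and degrees of freedom for a one-sample test against mu."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return (mean - mu) / math.sqrt(var / n), n - 1


# Hypothetical "one location" responses on right-lead, left-lag trials
delays_ms = [2, 8, 16, 24, 32]
one_location = {
    "condition 1": [0.8, 0.7, 0.5, 0.2, 0.1],
    "condition 2": [0.9, 0.8, 0.6, 0.4, 0.1],
    "condition 3": [0.7, 0.7, 0.4, 0.3, 0.2],
    "condition 4": [0.8, 0.6, 0.5, 0.3, 0.0],
}

slopes = [ols_slope(delays_ms, resp) for resp in one_location.values()]
t_stat, df = one_sample_t(slopes)
print(f"t({df}) = {t_stat:.2f}")
```

With four conditions the test has three degrees of freedom, matching the t(3) reported above. A negative t statistic indicates that responses move away from the lead side as delay grows.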
EXPERIMENT III: DYNAMIC PRECEDENCE EFFECTS WITHIN A SINGLE HEMIFIELD
The majority of precedence effect studies have used lead and lag stimuli that were symmetrically opposed across the interaural midline (e.g., Wallach et al., 1949; Thurlow and Parks, 1961; Freyman et al., 1991; Krumbholz and Nobbe, 2002; experiments I and II, and dozens of others). Use of such stimuli offers certain advantages such as avoidance of differences in sensitivity to perturbations in the lead versus lag resulting simply from differences in spatial sensitivity across azimuth. Nonetheless, exclusive use of binaurally symmetric stimuli also presents certain disadvantages: Of greatest concern, information is only obtained about one type of synthetic listening condition: a single source and single echo arranged in a single spatial configuration. The generalizability of psychophysical performance measured under such conditions to real-world listening in rooms is therefore limited.
In the present investigation, another difficulty related to the use of binaurally symmetric stimuli may be identified: We (and many previous investigators) have adopted the terms “ITD stimuli” and “ILD stimuli” to describe stimulus conditions in which the ITD or ILD was manipulated while the ILD or ITD was held constant (usually at 0 dB or 0 μs). For any binaural stimulus, however, both cues are always present. Thus, the ITD test stimuli of experiments I and II carry both ±308 μs ITD and ±0 dB ILD, and the ILD test stimuli carry both ±10 dB ILD and ±0 μs ITD. This is a critical consideration in light of the lateralization data obtained in experiments I and II. When “one location” was reported at lead-lag delays beyond ∼20 ms, localization dominance appeared to be weak: Rather than responding on the side consistent with cues carried by the lead (as at brief lead-lag delays), subjects responded near the midline. We took these responses to evidence a substantial contribution to lateralization by both lead and lag cues, i.e., “averaging” of the lead and lag. An alternative explanation could be that, given a diffuse image comprised of disparate ITD or ILD lead and lag cues, subjects were compelled to respond to the co-occurring and highly stable unmanipulated ILD or ITD cue. To address this concern, experiment III employed lead and lag stimuli confined to a single “hemifield,” such that the average of the manipulated lead and lag cue values was nonzero.
Stimuli and procedure
Lead clicks in ITD conditions of experiment III carried ±615 μs ITD, while lag clicks carried 0 μs ITD (intracranial perception was thus confined to either the right or left half of the head). To determine the ILD producing comparable lateralization (likely to vary among subjects, see Fig. 4), subjects completed a matching task identical to that completed prior to experiment II, with the exception that the ITD standard to be matched by the ILD pointer carried ±615 μs ITD. These individually determined values of ILD (given in Fig. 7) were used in all ILD conditions of experiment III: the lead always carried ± the matched ILD (mean 13.7 dB), while the lag carried 0 dB ILD. Thus, for experiment III stimuli, the unmanipulated cue remained at 0 μs/0 dB, while the “average” of lead and lag ITD values in ITD conditions was ±308 μs, and the average of lead and lag ILD values in ILD conditions was half of the matched ILD (mean ∼7 dB). Conditions of experiment III were otherwise identical to those of experiment I (ITD/ILD Baseline, Buildup, Breakdown, Retest).
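The “average” cue values quoted above follow from simple arithmetic, sketched here. The ILD figure uses the group-mean matched ILD of 13.7 dB; individual matched values (Fig. 7) are not reproduced here, and the helper name `mean_cue` is illustrative.

```python
def mean_cue(lead, lag):
    """Unweighted average of the lead and lag values of the manipulated cue."""
    return (lead + lag) / 2.0


avg_itd_us = mean_cue(615.0, 0.0)  # 307.5, quoted in the text as 308 (rounded)
avg_ild_db = mean_cue(13.7, 0.0)   # about 6.85, quoted in the text as ~7

print(f"average ITD: {avg_itd_us} us, average ILD: {avg_ild_db:.2f} dB")
```

Because the lag carries a zero-valued cue in both cases, the average is nonzero and lies on the lead side, which is what distinguishes a response to the averaged manipulated cue from a response to the diotic unmanipulated cue at 0.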
Results and interim discussion
Echo thresholds
Figure 8 gives echo thresholds for ITD and ILD conditions of experiment III. Compared to experiment I, echo thresholds were more similar between ITD and ILD conditions, another unanticipated finding. Nonetheless, a repeated-measures ANOVA demonstrated that the main effect of cue remained significant, with higher ITD- than ILD-based echo thresholds across conditions [F(1,9) = 19.57, p < 0.05]. The main effect of condition was also significant [F(3,27) = 12.34, p < 0.05], while the cue × condition interaction was not [F(3,27) = 1.09, p = 0.36]. Follow-up paired t-tests demonstrated that the breakdown effect remained significant (although reduced in magnitude) for ILD [Buildup vs Breakdown, t(9) = 2.91, p < 0.025] and non-significant for ITD [t(9) = 0.56, p = 0.59] stimuli. Individual data generally support the mean data, although inter-subject variability was greater than in experiments I and II, with several subjects exhibiting very high echo thresholds, even in Baseline conditions (cf. Grantham, 1996). Finally, two post hoc repeated-measures ANOVAs demonstrated that ILD-based echo thresholds were significantly higher in the “single-hemifield” conditions of the present experiment than in the otherwise identical conditions of experiment I [main effect of experiment, F(1,9) = 6.13, p < 0.05], while ITD-based echo thresholds were not [F(1,9) = 2.12, p = 0.18]. The observation that ILD-based echo thresholds significantly depended on the lead and lag locations while ITD-based echo thresholds did not further suggests that key aspects of the precedence effect are cue-specific.
Lateralization responses
Lateralization data for experiment III are given in Fig. 9. As for experiment II, only right-lead, left-lag data are given (see Appendix for left-lead, right-lag data). Regarding the primary motivation for the present experiment, lateralization responses at longer lead-lag delays tended to fall intermediate to “one location” responses at short delays (strong localization dominance) and “two locations” responses, giving negative slopes to lines fit to lateralization data [in 7 of 8 conditions, a significant cross-condition effect, one-sample t-test against 0, t(7) = −3.25, p < 0.05]. While this pattern was not as obvious as in experiments I and II, suggestive of stronger localization dominance for the single-hemifield stimuli of the present experiment, the pattern of lateralization in the present experiment strongly supports the conclusion that responses intermediate to the lead and lag cues at longer delays reflect weakened localization dominance rather than an artifact of stimulus design. Responses did not fall near 0 (i.e., the intracranial midline) as in experiments I and II, suggesting that subjects were in all experiments responding to an average (or, more likely, a weighted average) of the manipulated lead and lag cues rather than the diotic unmanipulated cue. Finally, in support of earlier observations (Sec. 4A), “two locations” responses at short lead-lag delays (i.e., delays near or below the echo threshold) in conditions of experiment III appeared to be “pulled” toward the lead, indicative of extant localization dominance (dominance of lead cues over lag cues) in the absence of fusion (see also Litovsky and Shinn-Cunningham, 2001). This pattern was somewhat more evident in the present experiment than in experiments I and II, giving rise to significantly negative slopes to lines fit to “two locations” responses as well [t(7) = −6.28, p < 0.001], and further indicative that the precedence effect, in terms of localization dominance and perhaps also fusion, may be somewhat stronger when stimuli are confined to a single hemifield (cf. Litovsky and Shinn-Cunningham, 2001).
GENERAL DISCUSSION
Accurate sound localization in ordinary listening environments (e.g., rooms) requires that listeners respond to the early arriving spatial cues carried by the direct sound rather than the spurious cues carried by reflections and reverberation arriving a few to hundreds of milliseconds later. In general, this facility is believed to depend on (1) perceptual fusion of the direct signal and its reflections (fusion) and (2) dominance of the fused percept by the early arriving spatial cues (localization dominance) (e.g., Wallach et al., 1949; Thurlow and Parks, 1961; Djelani and Blauert, 2001). Clifton and colleagues (e.g., Clifton, 1987; Clifton et al., 1994; Keen and Freyman, 2009) have further demonstrated that stimulus repetition leads to enhancement of the fusion effect, evidenced by a two- or three-fold increase in the “echo threshold.” By extension, it has been suggested that stimulus repetition must lead to similar enhancement of localization dominance, and therefore that “buildup” must facilitate enhanced localization of repeated (or ongoing) sound sources in ordinary listening environments (e.g., Keen and Freyman, 2009). In the remaining sections we consider the theoretical significance of the present data in the context of existing precedence literature and limitations in their interpretation.
Different fusion effects for ITD and ILD stimuli
The fusion data of the present investigation demonstrate, consistent with the report of Krumbholz and Nobbe (2002), that the precedence effect, in terms of lead-lag fusion, is more robust for stimuli lateralized by ITD than stimuli lateralized by ILD. Most notably, the data of experiments I–III demonstrated, in close correspondence to the data of Krumbholz and Nobbe (2002), that “built-up” fusion for repeated lead-lag ITD stimuli persisted even when relatively drastic changes were made to the lead and lag ITD cues (relative to the conditioner ITD cues), while fusion for analogous ILD stimuli “broke down” with such changes. Further suggestive of unique fusion effects for ITD and ILD, the data of experiment II demonstrated minimal transfer of buildup between “geometrically” equivalent conditioner and test stimuli carrying different spatial acoustic cues. Finally, the fusion data of experiment III demonstrated that fusion of ILD stimuli was somewhat more robust when lead and lag cues were confined to a single hemifield (vs interaurally opposed, experiments I and II), while fusion of ITD stimuli was not statistically different across experiments.
Taken together, these data suggest that the dynamic temporal limit of lead-lag fusion reported in free field and virtual auditory space studies (e.g., Clifton and Freyman, 1989; Djelani and Blauert, 2001; Keen and Freyman, 2009) depends on the interplay of ongoing ITD and ILD cues. Built-up fusion is robust to changes in ITD cues, but aberrant ILD cues readily disrupt fusion (Breakdown conditions of experiments I and III, Buildup ITD, Test ILD condition of experiment II). Interestingly, such disruption was apparently reduced when the ILD carried by the “aberrant” stimulus was 0 dB (Buildup ILD, Test ITD, experiment II; Breakdown ILD, experiment III). We suggest that this dependence of the breakdown effect on aberrant (and especially nonzero) post-onset ILD may capitalize on the statistics of sound in natural environments: Assuming that ILD values are computed with a 1–2 ms window of integration (e.g., Tollin, 2003), post-onset ILD values in rooms are often near zero (e.g., Rakerd and Hartmann, 1985; cf. Shinn-Cunningham et al., 2005). Significantly nonzero post-onset ILD values are thus likely to correspond to novel sound sources. A signal lateralized by ITD alone may not be similarly evaluated on the basis of post-onset cues, as post-onset ITD values tend to vary erratically moment-to-moment (Rakerd and Hartmann, 1985). We note that this cue-specific ecological interpretation of the breakdown effect is compatible with an internal model of the expected relationships among spatial acoustic cues within a given reverberant context, but not with a cue-independent internal model of the spatial geometry of the listening environment. Thus, the room acoustics hypothesis of Freyman and colleagues (see Sec. 1; Freyman and Keen, 2006; Keen and Freyman, 2009) should be reframed in terms of listeners' expectations regarding the behavior and interrelationships of acoustic spatial cues (i.e., ITD and ILD, and spectral cues) within the environment.
Dissociation of fusion and localization dominance with buildup
The lateralization data of the present investigation indicate that localization dominance and fusion aspects of the precedence effect may become dissociated with stimulus repetition. This observation is reminiscent of data reported by Yang and Grantham (1997), which demonstrated a divergence of echo thresholds measured in two different buildup paradigms: (1) a subjective fusion paradigm (like that in the present experiment) and (2) an objective “discrimination suppression” paradigm, where subjects were required to discriminate changes in the location of the lag. In that study, thresholds were significantly more elevated by stimulus repetition in the fusion paradigm than in the discrimination suppression paradigm, suggesting that at extended lead-lag delays, listeners remained sensitive to spatial information carried by the lag despite their proclivity to report a single auditory event. In our paradigm, which enabled direct assessment of lead location perception [particularly in left-lead, right-lag trials, where subjects should have always pointed at the lead; Fig. 3b, Appendix], listeners' lateralization of fused lead-lag images (“one location” responses only) at lead-lag delays ≳20 ms appeared to entail an averaging of lead and lag information (i.e., weak localization dominance). These findings together challenge the notion that buildup acts to enhance sound localization per se by augmenting suppression of late-arriving spatial information. Rather, the data suggest that buildup serves primarily to enhance the fusion of early and late-arriving sound into a single auditory object. The perceived location of this object—at least under headphones and in the absence of disambiguating visual cues—apparently depends on spatial cues carried by both lead/direct and lag/reflected sound. This observation might be taken to suggest that location is assigned to objects rather late during auditory scene analysis (cf. Darwin and Hukin, 1999).
Buildup of precedence and buildup of auditory streaming
The term “buildup” carries a very specific connotation in the precedence effect literature. Interestingly, the same term has been applied in the auditory scene analysis literature to describe what at first seems a rather separate phenomenon: When two sequences of tones sufficiently similar in frequency (up to several semitones apart) are played with a periodic temporal mismatch, most listeners initially perceive a single sequence with a “galloping” rhythm. After several repetitions of the stimulus, listeners tend to separate out the two sequences, “building up” a lower-frequency stream and a higher-frequency stream (e.g., Rogers and Bregman, 1998; Cusack et al., 2004). Although this separation of one stream into two seems the opposite effect of precedence buildup—the fusion of two (lead and lag) into one—certain striking similarities between the phenomena exist. Like buildup of precedence, buildup of streaming typically occurs over the course of multiple stimulus presentations, reaching an asymptote after several presentations (e.g., Rogers and Bregman, 1998; Cusack et al., 2004; cf. Freyman et al., 1991). Like buildup of precedence, buildup of streaming is susceptible to “breakdown”: A breakdown of precedence can be induced by switching the locations of the lead and lag test speakers relative to a preceding conditioner stimulus (e.g., Clifton and Freyman, 1989) or by suddenly adjusting the lead-lag delay (Clifton et al., 1994). A breakdown of streaming can be induced by changing the location of the speaker playing the test tone sequence relative to that playing the conditioner stimulus (Rogers and Bregman, 1998) or by inserting a sudden temporal gap in an ongoing (and built-up) tone sequence (Cusack et al., 2004). 
Particularly striking in the context of the present study, Rogers and Bregman (1998) noted that breakdown of streaming could not be induced under headphones by sudden changes to the ITD of the test stimulus (relative to the conditioner), leading the authors to resort to adjustment of the frequency separation of their tone sequences to produce breakdown effects. A similar result was reported by Culling and Summerfield (1995), who found that listeners could not perceptually segregate simultaneously presented synthetic vowels on the basis of common ITD values carried by formants. Most interestingly, Rogers and Bregman (1998) found that breakdown of streaming could be induced under headphones by sudden changes to the ILD of the test stimulus relative to the conditioner, although not as effectively as in the free field by changing the test speaker relative to the conditioner speaker.
Perhaps most importantly, buildup of streaming and buildup of precedence both act to produce ecologically correct perception, such that multiple sound features are correctly attributed to a single sound source. We thus suggest that buildup of precedence might be understood more generally in the context of auditory scene analysis (for review, see Bregman, 1994; cf. Clifton et al., 1994). In the context of auditory scene analysis, the utility of repetition-enhanced lead-lag fusion (i.e., buildup) is clear—independent of sound localization, perceptual fusion of a signal (e.g., speech) and its reflected copies should facilitate improved sound identification (or intelligibility). In support of this interpretation, Brandewie and Zahorik (2010) demonstrated a significant improvement in speech intelligibility, a wholly non-spatial measure, with several seconds of prior exposure to speech sounds in a simulated room (versus a control condition with no prior exposure). Indeed, from an ecological standpoint, the need for significant “enhancement” of our already excellent auditory localization capabilities (i.e., buildup of localization dominance, which we have failed to provide evidence for) may be minimal given the likelihood of concomitant visual cues (cf. Bishop et al., 2011). The utility of lead-lag fusion, on the other hand—even at the expense of accurate localization (e.g., even if the lead/direct and lag/reflected cues are averaged together, cf. Figs. 3, 6, and 9)—should be substantial, providing for enhanced auditory object formation and sound identification in reverberant environments (cf. Bregman, 1994; Brandewie and Zahorik, 2010).
Lateralization versus localization
The present study assessed listeners' fusion and intracranial lateralization of broadband click pairs carrying manipulated ITD or ILD values presented under headphones. While most studies of the precedence effect have employed highly artificial stimuli that correspond only weakly to real-world listening (i.e., pairs of equal-intensity clicks or noise bursts), headphone stimuli, which produce intracranial rather than externalized perception, are further limited in their comparability to real-world listening. Thus, although headphone ITD and free field precedence effect paradigms have been employed and interpreted nearly interchangeably for several decades (since Wallach et al., 1949), results of the present study should be extrapolated to the free field with caution. Specifically, the finding that stimulus repetition enhances fusion to a greater degree than localization dominance remains to be replicated in a free field localization task. Notably, one free field study by Litovsky and Godar (2010) that employed unusual stimuli consisting of a lead-lag pair repeated three times demonstrated higher-than-typical echo thresholds and apparently weak localization dominance at extended lead-lag delays in some subjects. Nonetheless, a comprehensive free field study of localization dominance effects for baseline and buildup stimuli will be required to validate conclusions concerning localization drawn from the present lateralization data. We note that weak localization dominance has not been mentioned by authors of several major buildup studies (e.g., Clifton and Freyman, 1989; Grantham, 1996; Freyman and Keen, 2006). Controlling subjects' knowledge of and expectations concerning possible source locations will be essential in future experiments evaluating localization dominance, as will be the ability to measure localization with sufficient resolution (e.g., by use of a continuous response surface such as a screen placed in front of visible loudspeakers, cf. Stecker and Hafter, 2002).
ITD and ILD in disagreement
Headphone stimuli that carry manipulated values of one cue (e.g., ITD) but leave the other cue (ILD) fixed at zero (such as those used in the present investigation) are inherently artificial in that they impose on a single stimulus substantial and consistent disagreement in the two major cues to its location. While cues in the free field normally “agree,” being consistent in sign and approximately consistent in magnitude across azimuth, headphone stimuli introduce cue-conflict and produce an unnatural perception of source location (i.e., intracranial lateralization; Sec. 6D; Plenge, 1974). This “unnatural” attribute may be particularly marked for the ILD stimuli employed in the present investigation, which carried ±10 dB ILD across the spectrum. Such large ILD values are rarely experienced at low frequencies in real environments, and even for near-field sounds which may produce large low-frequency ILD values, the magnitude of ILD values is unlikely to be constant across the spectrum. Nonetheless, the utility of headphone stimuli in parsing the contributions of ITD versus ILD to spatial hearing phenomena such as the precedence effect is clear and, despite their “unnatural” nature, stimuli of the present investigation were effective in producing buildup and breakdown effects of a magnitude comparable to those measured in the free field. Future studies employing strategic combinations of manipulated ITD and ILD values, which frequently conflict after sound onset in reverberant listening environments (e.g., Rakerd and Hartmann, 1985), will further elucidate the contribution of each cue to the ecological precedence effect.
No evidence for left-right asymmetry in buildup of fusion
Finally, some prior studies of the buildup phenomenon have demonstrated significant left-right asymmetries in its magnitude, such that echo thresholds are higher for either right-leading or left-leading stimuli (e.g., Clifton and Freyman, 1989; Grantham, 1996). While these differences have not been reported (or evaluated) in all free field studies, Grantham (1996), following on Clifton and Freyman's (1989) observation of higher echo thresholds for right-leading than left-leading stimuli, explicitly investigated buildup asymmetries in a relatively large number of subjects (n = 25). Right-handed subjects, at the group level, demonstrated significantly higher right-lead than left-lead fusion echo thresholds, although seven subjects failed to show asymmetry greater than 5 ms and two subjects showed asymmetry greater than 5 ms in the opposite direction (higher echo thresholds for left-lead, right-lag stimuli). Four of five left-handed subjects did not demonstrate asymmetries greater than 5 ms, while one demonstrated a very large asymmetry favoring left-lead, right-lag buildup (opposite the majority of right-handed subjects).
Krumbholz and Nobbe (2002) analyzed left-right asymmetry in lead-lag fusion data from six subjects. A small interaction effect with subject gender was observed, with two of four female subjects experiencing greater fusion for right-leading stimuli and two of two male subjects experiencing greater fusion for left-lead, right-lag stimuli. These differences, however, were not specific to buildup; the asymmetries were comparable across baseline, buildup, and breakdown conditions. Nonetheless, for comparison to the studies of Grantham (1996) and Krumbholz and Nobbe (2002), fusion data from the ten subjects of the present investigation were submitted to a post hoc analysis of left-right asymmetry. Fusion responses for all left-leading and right-leading trials in Baseline and Buildup ITD and ILD conditions of experiments I and III were collected for each subject. For each side-condition-cue combination, responses (“one” or “two”) were plotted against lead-lag delay and fit with a logistic function (using a custom MATLAB script). The 50% point on the logistic function was taken as the echo threshold. This procedure was repeated for each subject for both experiments I and III. ANOVA revealed no main effect of lead side in experiment I [F(1,9) = 0.58, p = 0.46] or experiment III [F(1,9) = 2.04, p = 0.19], nor any significant interactions between lead-side and condition [experiment I, F(1,9) = 0.17, p = 0.69; experiment III, F(1,9) = 0.55, p = 0.48]. Additional t-tests assessing the magnitude of buildup within each gender (six males, four females) revealed no significant effect of lead side (all p > 0.05). 
The lack of asymmetry in the present data versus asymmetries discussed previously may relate to differences in the experimental designs (e.g., the dual-task nature of our paradigm, which forced subjects to assess lead-lag fusion and location simultaneously), differences between free field and headphone studies discussed hereto, or individual differences in the subjects tested [tenable in light of the large inter-subject variability in asymmetry reported by both Grantham (1996) and Krumbholz and Nobbe (2002)].
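The threshold-estimation step described above (fitting a logistic function to “one”/“two” responses and reading off the 50% point) can be sketched as follows. The original analysis used a custom MATLAB script whose details are not given, so this Python version is only an illustrative stand-in: the Newton-Raphson maximum-likelihood fit, the function names, and the trial data are all assumptions. Responses are coded 1 for “two locations” and 0 for “one location”; the 50% point is the same whichever response is modeled.

```python
import math


def fit_logistic(delays_ms, two_locations, n_iter=25):
    """Fit P(two) = 1 / (1 + exp(-(a + b * (x - x_mean)))) by maximum likelihood.

    Newton-Raphson on (a, b). Assumes responses are not perfectly separable
    by delay; with separable data the slope estimate would diverge.
    """
    n = len(delays_ms)
    x_mean = sum(delays_ms) / n
    xc = [x - x_mean for x in delays_ms]  # centered delays
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(a + b * x))) for x in xc]
        # Gradient of the log-likelihood
        g_a = sum(y - pi for y, pi in zip(two_locations, p))
        g_b = sum((y - pi) * x for y, pi, x in zip(two_locations, p, xc))
        # Negative Hessian (Fisher information)
        w = [pi * (1.0 - pi) for pi in p]
        h_aa = sum(w)
        h_ab = sum(wi * x for wi, x in zip(w, xc))
        h_bb = sum(wi * x * x for wi, x in zip(w, xc))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2 x 2 system explicitly
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b, x_mean


def echo_threshold(delays_ms, two_locations):
    """Lead-lag delay at which the fitted function crosses 50%."""
    a, b, x_mean = fit_logistic(delays_ms, two_locations)
    return x_mean - a / b


# Hypothetical trials for one subject, lead side, condition, and cue:
# four trials at each of five delays, constructed to be symmetric about 10 ms.
delays = [2] * 4 + [6] * 4 + [10] * 4 + [14] * 4 + [18] * 4
responses = [0, 0, 0, 0,  0, 0, 0, 1,  0, 1, 0, 1,  1, 1, 0, 1,  1, 1, 1, 1]

print(f"echo threshold: {echo_threshold(delays, responses):.2f} ms")
```

In the asymmetry analysis, one such threshold would be computed per subject for each side-condition-cue combination, and the resulting thresholds submitted to the ANOVA.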
SUMMARY AND CONCLUSIONS
The present investigation evaluated fusion and localization dominance aspects of the precedence effect for headphone stimuli lateralized by ITD or ILD. Data suggested the following conclusions.
(1) The fusion aspect of the precedence effect was more robust for stimuli lateralized by ITD than stimuli lateralized by ILD. This difference was particularly marked when sudden changes were made to the cues of the test stimulus relative to those of a preceding conditioner stimulus, suggesting, consistent with Krumbholz and Nobbe (2002), that the “breakdown” effect reported previously in the free field can be induced by sudden changes in the ILD but not sudden changes in the ITD.
(2) The localization dominance aspect of the precedence effect was similar for ITD and ILD in that lateralization of “fused” stimuli typically fell toward the side consistent with the cues of the lead stimulus. Such lateralization, however, unexpectedly decreased with increasing lead-lag delay, particularly beyond ∼20 ms lead-lag delay, demonstrating a weakening of localization dominance with “built-up” lead-lag fusion. This pattern held for both cues and across multiple stimulus conditions.
(3) Thus, the effect of built-up lead-lag fusion may not be to enhance the precedence effect per se (fusion with localization dominance). Rather, buildup may work to form stable auditory objects, similar (or identical) to buildup of streaming phenomena reported in the auditory scene analysis literature.
(4) The present results were obtained with headphone stimuli lateralized intracranially. The data and the conclusions they suggest are thus tentative in the context of localization, and remain to be replicated in a free field localization paradigm.
ACKNOWLEDGMENTS
The authors thank Julie Stecker for help with data collection and Katrin Krumbholz and one anonymous reviewer for helpful comments on an earlier version of this manuscript. This work was completed as part of the first author's doctoral thesis. Funding was provided by NIDCD (Grant Nos. F31-DC050432 [ADB] and R01-DC011548 [GCS]).
APPENDIX
Lateralization data of the present investigation demonstrated a decrease in the magnitude of lateralization with increasing lead-lag delay for trials on which subjects reported perceiving one location, i.e., reduced localization dominance at “built-up” lead-lag delays. This pattern was plainly evident for right-lead, left-lag trials, but was also evident for left-lead, right-lag trials: Since the task instructions were to point to the location perceived in the event of one location or to the left-most location perceived in the event of two locations, subjects should have always responded to the left-favoring lead for left-lead, right-lag trials. On the contrary, lateralization responses fell consistently to the left when two locations were reported, but trended toward the right-favoring lag when one location was reported. This pattern held for lateralization responses in both experiment II (Fig. 10) and experiment III (Fig. 11). These data provide strong evidence that although stimulus repetition enhances lead-lag fusion, lead and lag spatial cues both contribute substantially to the resulting percept.
Footnotes
Absolute sensitivity to ITD and ILD was not measured in subjects of the present investigation. We assume on the basis of past studies in our lab, accounts of individual differences in the literature (e.g., McFadden et al., 1973), and variation in ITD-ILD matches across subjects in the present investigation (Figs. 47), that the employed ITD and ILD values produced various degrees of lateralization across subjects. Nonetheless, ITD and ILD values were calibrated (particularly in experiments II and III) to produce equivalent lateralization within each subject. As absolute lateralization was not of particular interest, lateralization data were normalized for each subject prior to cross-subject averaging.
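The per-subject normalization mentioned in this footnote can be illustrated with a minimal sketch. The scaling rule used here (dividing each subject's responses by that subject's largest absolute response) is an assumption made for illustration, since the exact normalization is not specified in this note. The function names and the example values are likewise hypothetical.

```python
from statistics import mean


def normalize_subject(responses):
    """Scale one subject's lateralization responses to the range [-1, 1].

    Assumption: normalization divides by the subject's largest absolute
    response; the normalization actually used in the study may differ.
    """
    peak = max(abs(r) for r in responses)
    if peak == 0:
        return [0.0 for _ in responses]
    return [r / peak for r in responses]


def cross_subject_mean(data):
    """Average normalized responses across subjects, condition by condition.

    `data` maps a subject label to a list of responses, one per condition,
    with conditions in the same order for every subject.
    """
    normalized = [normalize_subject(r) for r in data.values()]
    return [mean(values) for values in zip(*normalized)]


# Hypothetical responses (arbitrary units; negative = left, positive = right).
example = {
    "S1": [-40.0, -20.0, 0.0, 20.0, 40.0],
    "S2": [-10.0, -5.0, 0.0, 5.0, 10.0],
}
group_mean = cross_subject_mean(example)
```

In this hypothetical example the two subjects use very different response ranges, yet contribute equally to the group mean after normalization, which is the purpose of normalizing before cross-subject averaging.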
The term “localization dominance” is used throughout the present investigation to describe dominance of localization cues carried by the lead over those carried by the lag, consistent with the nomenclature delineated by Litovsky et al. (1999) and used across a majority of precedence effect studies since. We note, however, that our stimuli, like all “ITD only” and “ILD only” headphone stimuli, do not produce externalized perception of sound that is localized in space, but rather intracranial perception of sound that is lateralized in the direction consistent with the dominant cue. While the observed dominance of leading cues at brief delays in the present investigation might thus be more accurately termed “lateralization dominance,” we have adopted the more common term already existing in the literature.
Five trials on each run were presented with a lead-lag delay of 431 μs. This exceedingly brief delay was included to ascertain that subjects were performing the task as instructed; lead-lag delays <1 ms produce a phenomenon known as “summing localization” and consistently produce the perception of “one location” in all subjects (Litovsky et al., 1999). All subjects consistently reported one location for this delay. Associated lateralization data (which evidence summing localization) are not included in the present report.
References
- Bishop, C. W., London, S., and Miller, L. M. (2011). “Visual influences on echo suppression,” Curr. Biol. 21, 221–225. 10.1016/j.cub.2010.12.051
- Blauert, J. (1997). Spatial Hearing: The Psychophysics of Human Sound Localization, revised edition (The MIT Press, Cambridge, MA), Chap. 5.
- Brandewie, E., and Zahorik, P. (2010). “Prior listening in rooms improves speech intelligibility,” J. Acoust. Soc. Am. 128, 291–299. 10.1121/1.3436565
- Bregman, A. S. (1994). Auditory Scene Analysis: The Perceptual Organization of Sound (The MIT Press, Cambridge, MA), Chap. 1.
- Brown, A. D., and Stecker, G. C. (2010). “Temporal weighting of interaural time and level differences in high-rate click trains,” J. Acoust. Soc. Am. 128, 332–341. 10.1121/1.3436540
- Clifton, R. K. (1987). “Breakdown of echo suppression of the precedence effect,” J. Acoust. Soc. Am. 82, 1834–1835. 10.1121/1.395802
- Clifton, R. K., and Freyman, R. L. (1989). “Effect of click rate and delay on breakdown of the precedence effect,” Percept. Psychophys. 46, 139–145. 10.3758/BF03204973
- Clifton, R. K., Freyman, R. L., Litovsky, R. Y., and McCall, D. (1994). “Listeners' expectations about echoes can raise or lower echo threshold,” J. Acoust. Soc. Am. 95, 1525–1533. 10.1121/1.408540
- Culling, J. F., and Summerfield, Q. (1995). “Perceptual separation of concurrent speech sounds: Absence of across-frequency grouping by common interaural delay,” J. Acoust. Soc. Am. 98, 785–797. 10.1121/1.413571
- Cusack, R., Deeks, J., Aikman, G., and Carlyon, R. P. (2004). “Effects of location, frequency region, and time course of selective attention on auditory scene analysis,” J. Exp. Psychol. 30, 643–656.
- Darwin, C. J., and Hukin, R. W. (1999). “Auditory objects of attention: The role of interaural time differences,” J. Exp. Psychol. 25, 617–629.
- Djelani, T., and Blauert, J. (2001). “Investigations into the build-up and breakdown of the precedence effect,” Acta Acoust. 87, 253–261.
- Freyman, R. L., Clifton, R. K., and Litovsky, R. Y. (1991). “Dynamic processes in the precedence effect,” J. Acoust. Soc. Am. 90, 874–884. 10.1121/1.401955
- Freyman, R. L., and Keen, R. (2006). “Constructing and disrupting listeners' models of auditory space,” J. Acoust. Soc. Am. 120, 3957–3965. 10.1121/1.2354020
- Grantham, D. W. (1996). “Left-right asymmetry in the buildup of echo suppression in normal-hearing adults,” J. Acoust. Soc. Am. 99, 1118–1123. 10.1121/1.414596
- Keen, R., and Freyman, R. L. (2009). “Release and re-buildup of listeners' models of auditory space,” J. Acoust. Soc. Am. 125, 3243–3252. 10.1121/1.3097472
- Krumbholz, K., and Nobbe, A. (2002). “Buildup and breakdown of echo suppression for stimuli presented over headphones—The effects of interaural time and level differences,” J. Acoust. Soc. Am. 112, 654–663. 10.1121/1.1490594
- Litovsky, R. Y., Colburn, H. S., Yost, W. A., and Guzman, S. (1999). “The precedence effect,” J. Acoust. Soc. Am. 106, 1633–1654. 10.1121/1.427914
- Litovsky, R. Y., and Godar, S. P. (2010). “Difference in precedence effect between children and adults signifies development of sound localization abilities in complex listening tasks,” J. Acoust. Soc. Am. 128, 1979–1991. 10.1121/1.3478849
- Litovsky, R. Y., and Shinn-Cunningham, B. G. (2001). “Investigation of the relationship among three common measures of precedence: Fusion, localization dominance, and discrimination suppression,” J. Acoust. Soc. Am. 109, 346–358. 10.1121/1.1328792
- Maier, J. K., McAlpine, D., Klump, G. M., and Pressnitzer, D. (2010). “Context effects in the discriminability of spatial cues,” J. Assoc. Res. Otolaryngol. 11, 319–328. 10.1007/s10162-009-0200-0
- McCall, D. D., Freyman, R. L., and Clifton, R. K. (1998). “Sudden changes in spectrum of an echo cause a breakdown of the precedence effect,” Percept. Psychophys. 60, 593–601. 10.3758/BF03206048
- McFadden, D., Jeffress, L. A., and Russel, W. E. (1973). “Individual differences in sensitivity to interaural differences in time and level,” Percept. Mot. Skills 37, 755–761. 10.2466/pms.1973.37.3.755
- Plenge, G. (1974). “On the differences between localization and lateralization,” J. Acoust. Soc. Am. 56, 944–951. 10.1121/1.1903353
- Rakerd, B., and Hartmann, W. M. (1985). “Localization of sound in rooms II: The effects of a single reflecting surface,” J. Acoust. Soc. Am. 78, 524–533. 10.1121/1.392474
- Rogers, W. L., and Bregman, A. S. (1998). “Cumulation of the tendency to segregate auditory streams: Resetting by changes in location and loudness,” Percept. Psychophys. 60, 1216–1227. 10.3758/BF03206171
- Saberi, K., Antonio, J., and Petrosyan, A. (2004). “A population study of the precedence effect,” Hear. Res. 191, 1–13. 10.1016/j.heares.2004.01.003
- Sanders, L. D., Zobel, B., Keen, R., and Freyman, R. L. (2011). “Manipulations of listeners' echo perception are reflected in event-related potentials,” J. Acoust. Soc. Am. 129, 301–309. 10.1121/1.3514518
- Stecker, G. C., and Brown, A. D. (2010). “Temporal weighting of binaural cues revealed by detection of dynamic interaural differences in high-rate Gabor click trains,” J. Acoust. Soc. Am. 127, 3092–3103. 10.1121/1.3377088
- Stecker, G. C., and Hafter, E. R. (2002). “Temporal weighting in sound localization,” J. Acoust. Soc. Am. 112, 1046–1057. 10.1121/1.1497366
- Shinn-Cunningham, B. G., Kopco, N., and Martin, T. J. (2005). “Localizing nearby sound sources in a classroom: Binaural room impulse responses,” J. Acoust. Soc. Am. 117, 3100–3115. 10.1121/1.1872572
- Shinn-Cunningham, B. G., Zurek, P. M., and Durlach, N. I. (1993). “Adjustment and discrimination measurements of the precedence effect,” J. Acoust. Soc. Am. 98, 164–171. 10.1121/1.413752
- Thurlow, W. R., and Parks, T. E. (1961). “Precedence-suppression effects for two click sources,” Percept. Mot. Skills 13, 7–12. 10.2466/pms.1961.13.1.7
- Tollin, D. J. (2003). “The lateral superior olive: A functional role in sound source localization,” Neuroscientist 9, 127–143. 10.1177/1073858403252228
- Wallach, H., Newman, E. B., and Rosenzweig, R. (1949). “The precedence effect in sound localization,” Am. J. Psych. 62, 315–336. 10.2307/1418275
- Wichmann, F. A., and Hill, N. J. (2001). “The psychometric function: I. Fitting, sampling, and goodness of fit,” Percept. Psychophys. 63, 1293–1313. 10.3758/BF03194544
- Yang, X., and Grantham, D. W. (1997). “Echo suppression and discrimination suppression aspects of the precedence effect,” Percept. Psychophys. 59, 1108–1117. 10.3758/BF03205525