The role of tone duration in dichotic temporal order judgment II: Extending the boundaries of duration and age

Leah Fostick; Harvey Babkoff

doi:10.1371/journal.pone.0264831

PLoS One. 2022 Mar 30;17(3):e0264831. doi: 10.1371/journal.pone.0264831

Editor: Susan Nittrouer

PMCID: PMC8967006 PMID: 35353821

Abstract

Temporal order judgment (TOJ) measures the ability to correctly perceive the order of consecutive stimuli presented rapidly. Our previous research suggested that the major predictor of the threshold in auditory dichotic TOJ, a paradigm that requires the identification of the order of two tones, each of which is presented to a different ear, is the time separating the onset of the first tone from the onset of the second tone (stimulus-onset-asynchrony, SOA). Data supporting this finding, however, were based on a young adult population and a tone duration range of 10–40 msec. The current study aimed to evaluate the generalizability of the earlier finding by manipulating the experimental model in two different ways: a) extending the tone duration range to include shorter stimulus durations (3–8 msec; Experiment 1) and b) repeating the identical testing procedure on a different population with temporal processing deficits, i.e., older adults (Experiment 2). We hypothesized that the SOA would predict the TOJ threshold regardless of tone duration and participant age. Experiment 1 included 226 young adults divided into eight groups (each group receiving a different tone duration) with duration ranging from 3–40 msec. Experiment 2 included 98 participants aged 60–75 years, divided into five groups by tone duration (10–40 msec). The results of both experiments confirmed the hypothesis that the SOA required for performing dichotic TOJ was constant regardless of stimulus duration, for both age groups: about 66.5 msec for the young adults and 33 msec longer (100 msec) for the older adults. This finding suggests that dichotic TOJ threshold is controlled by a general mechanism that changes quantitatively with age. Clinically, this has significance because quantitative changes can be more easily remedied than qualitative changes. Theoretically, our findings show that, with dichotic TOJ, tone duration affects threshold by providing more time between the onsets of the consecutive stimuli to the two ears. The findings also imply that a temporal processing deficit, at least among older adults, does not elicit the use of a different mechanism in order to judge temporal order.

Introduction

Temporal order judgment (TOJ) measures the individual’s ability to correctly perceive the order of consecutive stimuli presented rapidly. Most TOJ tasks involve the presentation of only two stimuli in order to measure basic perceptual abilities without confounding the task by adding a memory component. The two stimuli are usually separated by a silent interval between the offset of the first tone and the onset of the second tone, referred to as the inter-stimulus-interval (ISI), which is manipulated throughout the TOJ task. Short ISIs result in a very rapid presentation of the two stimuli, while longer ISIs result in a slower presentation. When the order of two sounds is judged, the two stimuli must differ by at least one dimension to enable identification. As a result, auditory TOJ paradigms use stimuli that differ either in: a) frequency (pitch) [1–14]; or b) spectrum (pure tone vs. noise) [14]; or c) duration [14, 15]; or d) the ear of presentation, i.e., the ear that receives the first and the ear that receives the second stimulus (referred to as dichotic, spatial, or binaural TOJ) [1–3, 5–12, 14, 16–25].

TOJ has been studied extensively, beginning with the seminal work of Hirsh [26] and Hirsh and Sherrick [27] who measured the amount of time between the onsets of two stimuli (tones, clicks, lights, and their combinations) necessary to correctly report their order. This measure, called the TOJ threshold, reflects the minimum amount of time separating the onsets of the two stimuli at which an individual can correctly identify the order of stimulus presentation 75% of the time. Hirsh and Sherrick [27] originally reported the threshold for TOJ to be 17 msec, regardless of the type of stimulus and presentation modality used [27]. However, more recent studies have, generally, reported longer thresholds [1–3, 5–7, 10–12, 14, 16–17, 20, 23–25].

In a previous study [1], we and others reported the sensitivity of the dichotic TOJ paradigm to methodological and stimulus parameters, specifically to stimulus duration. We considered the possibility that the two manipulations, tone duration and ISI, might affect perception differently, since increasing tone duration increases the amount of sound (and thus the amount of stimulation) at the two ears, while increasing the ISI increases the silent interval between these stimulations (i.e., the lack of stimulation). However, we found that both manipulations reduced the TOJ threshold by the same amount. This means that the parameter predicting accuracy on the dichotic TOJ task is the time separating the onset of the first tone from the onset of the second tone (stimulus-onset-asynchrony, SOA, illustrated at the top of Fig 1), regardless of whether that time interval is filled with stimulation (tone) or silence.
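As a brief illustration of this definition, the SOA can be written as the sum of the tone duration and the silent interval. The symbol D for tone duration and the numeric values below are our own illustrative choices, not notation or data from the original study.

```latex
\mathrm{SOA} = D + \mathrm{ISI}
% Two hypothetical trials with the same SOA but a different sound/silence split:
% D = 10\ \text{msec},\ \mathrm{ISI} = 50\ \text{msec} \;\Rightarrow\; \mathrm{SOA} = 60\ \text{msec}
% D = 40\ \text{msec},\ \mathrm{ISI} = 20\ \text{msec} \;\Rightarrow\; \mathrm{SOA} = 60\ \text{msec}
```

If the SOA is the relevant parameter, these two trials should be equally easy to order, even though one is mostly silence and the other mostly tone.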

Fig 1 — The manipulation of stimulus duration is presented as the duration of Stimulus A and Stimulus B, which varied across groups in the current study as a between-subjects variable. ISI is presented as the silent gap between the offset of Stimulus A and the onset of Stimulus B, which varies within each group as a within-subjects variable. Numbers of participants in each group for Experiments 1 and 2 are shown. a. Data by duration and ISI. b. Data by SOA.

Our finding, however, is limited to the range of stimulus durations we tested, namely tone durations of 10–40 msec. It is unclear whether this finding would extend to other tone durations, since sound duration affects our auditory perception in several ways. First, sound duration affects the loudness of a sound via temporal summation, with sounds being perceived as softer or louder when duration is decreased or increased (respectively), up to 200 msec [28]. Second, sound duration affects our ability to perceive pitch, with lower frequencies requiring longer sound durations than higher frequencies. Third, sound duration also affects our ability to localize a sound source, with longer sounds being localized better by allowing the listeners to move their head towards the sound source [28, 29].

The design of our study directed us toward testing tone durations shorter than those we used in our earlier study. The dichotic TOJ ISI threshold was found to be around 60 msec in several studies [1–3, 11, 12, 14, 17, 20, 23]. Manipulating tone duration, ISI, and SOA therefore necessarily places an upper limit on the tone durations one can test, i.e., they must be shorter than 60 msec. This means that, in order to expand the range of tone durations necessary to test the generalization of our ISI-tone duration TOJ equivalence hypothesis, we focused on durations shorter than those we used in the previous study [1] (i.e., less than 10 msec). Such short durations can create transients (short-duration sounds with high amplitude that can accompany the beginning of short sounds) that spread energy across the frequency range [30–33], possibly resulting in different ISI-duration patterns than those observed with tone durations longer than 10 msec. Therefore, in the present study we aimed to test whether our finding applies to very short tone durations (i.e., 3, 6, and 8 msec) as well as durations in the 10–40 msec range, while using the same dichotic TOJ design as Babkoff & Fostick [1].

Furthermore, data from our previous study [1] were robust in showing that, among young adults, a constant time period between the onsets of the two tones (SOA) elicited accurate identification of their order, regardless of the durations of the sound (tone) and silence (ISI). Therefore, as the sum total of the duration and ISI (i.e., SOA) was constant, young adults appear to extract the same temporal information from both elements. It is not yet clear whether the SOA is also the main predictor for dichotic TOJ performance in other populations, e.g., older adults, who may differ in their response to stimulus duration (stimulation) and/or ISI (lack of stimulation). Indeed, older adults have difficulty processing short and rapid stimuli. This difficulty is often reflected in their speech perception, especially when the speaker talks fast or when speech is accompanied by background noise [3, 20, 21, 34–37]. Studies of temporal processing among older adults, including the studies using speech stimuli, have reported deficiencies in their performance compared to young adults [5, 7, 10, 20, 38–45]. These studies demonstrated that older adults required longer sound durations [38, 39, 43], longer ISIs [5, 7, 10, 20, 41], and longer gaps within sounds [40, 42, 44, 45] than young adults in order to correctly perceive them. Such findings suggest that older adults might be sensitive both to tone duration and ISI.

However, since no study has directly measured the mutual contribution of these two variables to the TOJ threshold in older adults, follow-up questions arise: Do older adults extract the same temporal information from the stimulus duration as from the ISI, so that each of these variables has the same effect on their dichotic TOJ threshold, as is the case for younger adults? Or do older adults extract different temporal information from the stimulus duration than from the ISI, so that each of these variables has different effects on their resulting dichotic TOJ threshold? If the pattern of TOJ performance by older adults is similar to that of younger adults, a “zero” line slope would be expected when thresholds, expressed as SOA, are plotted as a function of tone duration. This “zero” line slope is expected for both populations, although the thresholds for older adults are expected to be longer than those of younger adults due to age-related temporal deficits. However, if older adults have greater difficulty processing shorter duration than longer duration tones, the slope of the line relating the dichotic TOJ threshold to tone duration should differ significantly from “zero”, indicating a contribution of tone duration to dichotic TOJ threshold beyond the increase in SOA alone. To address these questions, in the current study we repeated the dichotic TOJ study using the same design as Babkoff & Fostick [1] but with participants whose ages ranged from 60–75 years.

The aim of the present study was, therefore, to test our previous conclusion that dichotic TOJ is determined by stimulus onset asynchrony (SOA), the time separating the onset of the first tone from the onset of the second one [1], by a) using stimulus parameters that test its boundaries, and b) determining whether it can be generalized to another population. We operationalized this aim by implementing two different manipulations to our previous research model: a) extending the range of tone durations to include very short tone durations (3–8 msec), thus testing TOJ thresholds with tone durations ranging from 3–40 msec (Experiment 1), and b) using the same experimental methodology as in the previous study [1] to test a population of older adults (Experiment 2). As depicted in Fig 1, which illustrates the current study design, stimulus duration was manipulated across eight groups in Experiment 1 and across five groups in Experiment 2, each of which was presented with a different tone duration; within each group, participants were requested to judge the order of the tones while the ISI was manipulated.

Experiment 1: Young adults, stimulus duration 3 to 40 msec

Materials and method

Participants

Participants were 226 undergraduate students (136 females, 90 males), aged 20–35 years (mean = 25.5, SD = 2.8), who volunteered to participate in the study. The current analyses include participant data presented in the earlier paper (n = 65) [1] together with the data from an additional 161 participants (current study). All participants were screened for normal hearing (thresholds of 20 dB HL or less at 500, 1,000, 2,000, and 4,000 Hz), which was an inclusion criterion. A diagnosis of a learning disability or attention deficit hyperactivity disorder was an exclusion criterion. Participants were divided into eight groups, each of which was tested with only one tone duration, as follows: 3 msec (n = 30), 6 msec (n = 28), 8 msec (n = 20), 10 msec (n = 36), 15 msec (n = 52), 20 msec (n = 22), 30 msec (n = 21), and 40 msec (n = 17).

Task and stimuli

We used the experimental design reported in Babkoff and Fostick [1]. In short, participants were presented with two 1 kHz pure tones at a level of 40 dB SL. The tones were presented asynchronously to the right and left ear, and participants were asked to report the order in which they heard them (either right-left or left-right). The tone duration for each participant was 3, 6, 8, 10, 15, 20, 30, or 40 msec, according to their assigned group (between-subjects design). Rise/fall times were 1 msec. Eight different ISIs of 5, 10, 15, 30, 60, 90, 120, and 240 msec were presented in random order. Each ISI value was repeated 16 times, producing 256 trials (8 ISIs × 2 presentation orders × 16 repetitions). After every 32 trials, participants received a short break. Dichotic TOJ thresholds were defined as the ISI necessary for 75% accuracy, estimated using the best linear approximation of a psychometric function.
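The trial structure described above can be sketched in a few lines of Python. This is only an illustration of the 8 × 2 × 16 design under our own assumptions: the function name `build_trials`, the dictionary keys, and the use of a seeded shuffle are hypothetical and do not describe the software actually used in the study.

```python
import random
from collections import Counter

# Values taken from the Methods; everything else is an illustrative assumption.
ISIS_MSEC = [5, 10, 15, 30, 60, 90, 120, 240]
ORDERS = ["left-right", "right-left"]
REPETITIONS = 16


def build_trials(tone_duration_msec, seed=0):
    """Return a shuffled list of trials for one tone-duration group."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    trials = [
        {
            "isi": isi,
            "order": order,
            # SOA = tone duration + ISI (onset-to-onset time)
            "soa": tone_duration_msec + isi,
        }
        for isi in ISIS_MSEC
        for order in ORDERS
        for _ in range(REPETITIONS)
    ]
    rng.shuffle(trials)
    return trials


trials = build_trials(tone_duration_msec=15)
isi_counts = Counter(trial["isi"] for trial in trials)
print(len(trials), isi_counts[5])
```

Each ISI therefore appears 32 times per participant (16 repetitions in each of the two presentation orders), and the SOA of every trial follows directly from the group's tone duration.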

The experiment was preceded by a training session performed with tones of the same duration as in the experiment. This was designed to familiarize participants with the sounds and to ascertain whether they correctly reported the ear that was being presented with the sound (right or left) [see 1]. Participants received feedback for their responses during training sessions, but no feedback was presented during the experiment.

Apparatus

The hearing screening test was performed using a Danplex DA64 audiometer. The experiment was performed on a Dell laptop computer and the sounds were delivered through TDH-49 headphones.

Procedure

The study was approved by the Ariel University Institutional Review Board. Participants provided written informed consent and were then screened for normal hearing before the experiment. In addition, their absolute threshold for 1 kHz was measured using the same computer and headphones that were used in the study. The experiment, including the screening and training procedures, took 30–45 minutes.

Results

Accuracy

The accuracy data were transformed by probit (a transformation for linearizing sigmoid distributions of proportions) [46]. Psychometric functions of the probit-transformed data for the proportion of ’left leading’ responses, as a function of ISI, are presented in Fig 2a, separately for each of the eight stimulus durations. A two-way repeated measures analysis of variance (ANOVA) was performed with the probit-transformed data as the dependent variable, ISI as a within-subjects variable, and Stimulus Duration as a between-subjects variable. The analysis revealed main effects of both ISI [F(7,1358) = 948.142, p < .001, ηp² = .830] and Stimulus Duration [F(7,194) = 6.456, p < .001, ηp² = .189], as well as an ISI × Stimulus Duration interaction [F(49,1358) = 1.670, p = .003, ηp² = .057]. Post-hoc ANOVAs between Stimulus Duration for each ISI revealed significant effects of stimulus duration at the short ISIs [5 msec: F(7,194) = 4.288, p < .001; 10 msec: F(7,194) = 5.038, p < .001; 15 msec: F(7,194) = 9.599, p < .001; 30 msec: F(7,194) = 5.573, p < .001], but not at the longer ISIs (60, 90, 120, and 240 msec; ps > .05).

Fig 2 — Schematic diagrams illustrating the distinction between ISI and SOA appear in each panel. a. Data by duration and ISI. b. Data by SOA.

The point of subjective equivalence (PSE), i.e., the point of 50% left-leading responses, was calculated on the probit-transformed data using a linear equation. The PSE for all eight stimulus durations was 0 msec (-6.25E-15 to 6.38E-15 msec). Fig 2b presents a scattergram of the same probit-transformed accuracy data when tone duration and ISI were incorporated into one measure, namely, the SOA (the total delay between tone onset at the leading ear and tone onset at the lagging ear). For the transformed SOA data, the PSE was 0 msec, similar to the PSE obtained from the ISI data for all tone durations. The linear component for the data, plotted as a function of the SOA, predicted 95.7% of the variance in accuracy. Notwithstanding, the points below and above SOAs of -200 and +200 msec were out of line with the rest of the data. This might be due to asymptotic performance at the longest ISI values. Repeating the analysis without these values yielded a predictive value of 98.9% (y = 0.0103x − 6E-18).
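To make the reported linear fit more concrete, the sketch below inverts the line y = 0.0103x − 6E-18 to find the SOA at which the probit-transformed proportion corresponds to 50% (the PSE) and to 75% 'left leading' responses. This is only arithmetic on the pooled fit reported above, under the assumption that y is the probit (z-score) of the response proportion and x is the SOA in msec; it is not the per-participant threshold procedure used by the authors, and the function name is our own.

```python
from statistics import NormalDist

# Coefficients of the pooled linear fit reported in the text
# (SOAs beyond +/-200 msec excluded).
SLOPE_PER_MSEC = 0.0103
INTERCEPT = -6e-18


def soa_for_proportion(p, slope=SLOPE_PER_MSEC, intercept=INTERCEPT):
    """Invert y = slope * x + intercept, where y is the probit of p."""
    z = NormalDist().inv_cdf(p)  # probit transform of the proportion
    return (z - intercept) / slope


pse_msec = soa_for_proportion(0.50)     # effectively 0 msec
soa_75_msec = soa_for_proportion(0.75)  # roughly 65.5 msec
print(round(pse_msec, 6), round(soa_75_msec, 1))
```

Under these assumptions the 75% point of the pooled line falls at an SOA of roughly 65 msec, which is in the same range as the SOA thresholds reported for the young adults.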

Thresholds

TOJ ISI thresholds, defined as the ISI necessary for 75% accuracy, were estimated using a linear function. ISI thresholds are presented in Fig 3a as a scattergram for all participants plotted against each tone duration. Mean ISI thresholds ranged correspondingly from 58.4 to 25.4 msec, with stable standard errors in the range of 5.3–7 msec. Heterogeneity testing (Levene statistic) was not significant (F(7,316) = 1.842, p = .079). Group mean data are also plotted (Fig 3a) and were tested against a model that predicted a reduction in threshold of the same magnitude as the increase in tone duration. The data were found not to deviate significantly from this model (probit analysis, Z = -3.13; p = .002). The best linear fit to the mean ISI thresholds (R² = 0.69, p < .001) is depicted in Fig 3a (see straight line), as is the predicted line (based on y = a − bx, predicting a reduction in ISI threshold similar to the increase in tone duration; see dotted line). The slope of the predicted line related to the current study’s tone duration range of 3–40 msec was -1.17, which does not differ significantly from the slope of -0.86 reported for the more limited 10–40 msec range of tone durations from our previous study [1] (t(290) = 1.162, p = 0.123). Furthermore, the Bayes factor (BF₀₁ = 3.49) suggested substantial support for H₀. This indicates that the current data are more likely to occur under H₀, namely, that there was no difference between the slopes produced by the data in the current and the previous study [1].
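The logic of the predicted line can be illustrated with a short sketch. If the SOA threshold is constant, the ISI threshold must fall by 1 msec for every 1 msec added to the tone duration, so the predicted line y = a − bx has a equal to the constant SOA and b = 1. That reading of the model, the function names, and the choice of 66.5 msec (the average SOA reported for young adults) as the constant are our assumptions for illustration; the code uses no participant data.

```python
# Tone durations tested in Experiment 1 (msec), from the Methods.
DURATIONS_MSEC = [3, 6, 8, 10, 15, 20, 30, 40]

# Assumed constant SOA threshold (msec); illustrative choice only.
ASSUMED_SOA_THRESHOLD_MSEC = 66.5


def predicted_isi_thresholds(soa_threshold_msec, durations=DURATIONS_MSEC):
    """ISI threshold implied by a constant SOA: ISI = SOA - duration."""
    return [soa_threshold_msec - d for d in durations]


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


predicted = predicted_isi_thresholds(ASSUMED_SOA_THRESHOLD_MSEC)
slope = least_squares_slope(DURATIONS_MSEC, predicted)
print(predicted, slope)
```

A fitted slope close to -1 for ISI thresholds is therefore equivalent to a slope close to zero once the same thresholds are expressed as SOA.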

In Fig 3b, dichotic TOJ thresholds are plotted in terms of the SOA, as a function of tone duration. Note, the scattergram data and the averages fall very close to, or on, the zero-slope dotted line. The point at which the average dichotic TOJ threshold (SOA) crosses the vertical axis in Fig 3b is 69.79 ± 10.45 msec (probit analysis, Z = 5.04, p < .001).

Discussion

Extending the range of tone duration beyond 10–40 msec to include shorter durations of 3–8 msec did not change the general pattern of a “tradeoff” between tone duration and ISI [1]. When the data from all 226 participants were analyzed, there was a decrease of 1.17 msec in ISI for every increase of 1 msec in tone duration over the whole range of 3–40 msec. This was not different from the previous study of 65 participants using a tone duration range of 10–40 msec, in which there was an observed decrease of 0.86 msec in ISI for each increase of 1 msec in tone duration.

The data from Experiment 1 show that, for stimulus durations of 3–40 msec, young adults utilize the same cue for temporal processing from both the stimuli (tone duration) and from the silent gap between them (ISI). The extension of the range of the tone duration from 10–40 msec to 3–40 msec in the present study supports our earlier conclusion that the primary predictor in judging dichotic temporal order is the temporal lag between the onsets of the two stimuli, namely, the stimulus onset asynchrony (SOA) [1]. Therefore, among young adults and within the tone duration range of 3–40 msec, dichotic TOJ thresholds (measured as the SOA) are invariant to tone duration. Moreover, as the average dichotic TOJ threshold crosses the vertical axis at approximately 57 msec (as found in our previous study [1]) to 70 msec (as in the current study, Fig 3a and 3b), this suggests that the time between the onset of two tones that is required for perceiving their order is constant (around 60–70 msec). Future studies should extend the investigation to variations of other parameters of the presented TOJ sound stimuli, such as spectrum and intensity, to further explore the boundaries of this time constant.

Experiment 2: Older adults, stimulus duration 10 to 40 msec

Materials and method

Experiment 2 was conducted using the same methodology as Experiment 1, with the exception of: a) older participants and b) a duration range of 10–40 msec. A group of 98 participants (59 females, 39 males), aged 60–75 years (mean = 66.4, SD = 6.1), volunteered to participate in the study. Participants were divided into five groups, each of which was tested with only one tone duration, as follows: 10 msec (n = 16), 15 msec (n = 27), 20 msec (n = 17), 30 msec (n = 19), and 40 msec (n = 19). After providing signed informed consent, the participants were screened for age-normal hearing (hearing thresholds of 35 dB HL or less at 500, 1,000, 2,000, and 4,000 Hz). This was an inclusion criterion while hearing deficit was an exclusion criterion.

Results

Accuracy

The accuracy data were transformed by probit (a transformation for linearizing sigmoid distributions of proportions) [46]. The psychometric functions of the probit-transformed data for the proportion of ’left leading’ responses, as a function of ISI, are presented in Fig 4a, for each of the five stimulus durations. A two-way repeated measures ANOVA was performed with the probit-transformed data as the dependent variable, ISI as a within-subjects variable, and Stimulus Duration as a between-subjects variable. The analysis revealed a main effect of ISI [F(7,651) = 244.105, p < .001, ηp² = .724] and an ISI × Stimulus Duration interaction [F(28,651) = 3.291, p < .001, ηp² = .124], but no main effect of Stimulus Duration [F(4,93) = 1.165, p = .332, ηp² = .048]. Post-hoc ANOVAs of Stimulus Duration for each ISI revealed significant effects of stimulus duration at some ISIs [5 msec: F(4,93) = 5.647, p < .001; 15 msec: F(4,93) = 6.879, p < .001; 240 msec: F(4,93) = 4.714, p = .002], but not at most of them (10, 30, 60, 90, and 120 msec; ps > .05).

Fig 4 — Schematic diagrams illustrating the distinction between ISI and SOA appear in each panel. a. Data by duration and ISI. b. Data by SOA.

The PSE, i.e., the point of 50% ‘left-leading’ responses, was calculated on the probit-transformed data using a linear equation. The PSE for all five stimulus durations was 0 msec (-3.80E-15 to 1.39E-15 msec). Fig 4b presents a scattergram of the same probit-transformed accuracy data when tone duration and ISI were combined as the SOA. The linear component for the data, plotted as a function of the SOA, predicted 92.9% of the variance in accuracy.

Thresholds

TOJ ISI thresholds (i.e., the ISI required for 75% correct responses) for the older adult cohort are plotted as a scattergram as a function of tone duration in Fig 5a. Mean ISI thresholds ranged from 85.2 to 63.2 msec for stimulus durations of 10–40 msec, with standard errors in the range of 5.5–7.2 msec. Group mean thresholds are also plotted. We tested the group mean data against the predicted model and found that the data do not deviate significantly from this model (probit analysis, Z = -5.989; p < .001). The best linear fit to the means (R² = 0.91, p < .001) is depicted in Fig 5a (see straight line), as is the predicted line (based on y = a − bx; see dotted line). The slope of the predicted line (-0.69) was not significantly different from the slope found previously for the same stimulus duration range among young adults (-0.86; t(162) = -0.264, p = 0.396), and not different from the slope found in Experiment 1 for stimulus durations of 3–40 msec among young adults (-1.17; t(322) = -1.274, p = 0.102). The Bayes factors for the current data compared both to our previous study (BF₀₁ = 5.64) and to the data from Experiment 1 (BF₀₁ = 3.48) suggested substantial support for H₀. This indicates that the current data are more likely to occur under H₀, namely, that there was no difference between the slopes of the current study and those of the previous study [1] and Experiment 1.

In Fig 5b, dichotic TOJ thresholds (SOAs) are plotted against tone duration. Note that the scattergram data and the averages fall very close to, or on, the zero-slope dotted line. The point at which the average dichotic TOJ threshold (SOA) crosses the vertical axis in Fig 5b is 100 msec (probit analysis, Z = -2.74, p = .02), indicating that the TOJ threshold for the older adults is, on average, 33 msec longer than that of the young adults in Experiment 1 (99.6 msec vs. 66.5 msec, respectively).

Discussion

Extending our study to an older adult population did not change the general pattern of a “tradeoff” between tone duration and ISI in predicting dichotic TOJ thresholds, as found among young adults in our earlier study [1] and in Experiment 1. As predicted, the thresholds for older adults were longer than for the younger adults; on average, the older adults’ TOJ threshold was longer by 33 msec across a 10–40 msec range of stimulus durations (i.e., an average SOA of 99.6 msec for the older adults vs. 66.5 msec for the younger adults). Moreover, among the older adults, the results indicated a decrease of 0.69 msec in ISI for every increase of 1 msec in tone duration from 10–40 msec, which does not differ significantly from the 0.86 msec decrease observed among younger adults in the previous study [1] over the same tone duration range.

The purpose of Experiment 2 was to validate, this time among an older adult cohort, the conclusion reached in our earlier study [1] that the SOA is the main parameter predicting accurate judgment of the temporal order of two tones presented dichotically. The current data indicate that the SOA is indeed the major parameter predicting dichotic TOJ performance among older adults, who typically show a general deficit in temporal processing (e.g., longer TOJ thresholds). While older adults do need more time than younger adults from the onset of the first dichotic tone in the leading ear to the onset of the second dichotic tone in the lagging ear in order to judge the order correctly, the general relationship between dichotic TOJ ISI threshold and tone duration is the same as for the younger adults. It is the SOA, rather than the silent interval (ISI) or the tone duration, that is the crucial parameter utilized by both younger and older adults in judging temporal order.

Summary and conclusions

The present study aimed to test the generalizability of our previous finding [1] that dichotic TOJ performance is best predicted by stimulus onset asynchrony (SOA), namely, the time separating the onset of the first tone from the onset of the second one. We did so by implementing two different manipulations to our previous research model: 1) extending the range of tone durations to also include 3–8 msec among a population of young adults; and 2) testing the ISI-tone duration relationship among a cohort of older adults, among whom general deficits in auditory temporal order judgment have been shown in prior research [2, 3, 5, 7, 10, 20, 22]. One might have expected to find a larger contribution of tone duration to TOJ performance because tones of 10 msec, or shorter, spread energy across the frequency range. This may change the quality of the tones, making the impact of the duration itself on TOJ threshold greater through the effect on tone quality, and not only through its impact on the temporal separation between the onsets of the first and second tones. Moreover, as older populations have previously shown deficits in dichotic TOJ (i.e., longer TOJ ISI thresholds), one might also have expected to observe a non-linear relationship between tone duration and TOJ ISI threshold among our older adult cohort, because they may need more time to perceive and process temporal information. However, the results of both experiments seem to support the hypothesis that the dichotic TOJ threshold is determined by a general temporal mechanism, whether the tone duration is as short as 3 msec or as long as 40 msec, and regardless of older adults’ general deficits in temporal processing. This indicates that the temporal mechanism for dichotic TOJ is affected by temporal asynchrony, independent of the nature of that asynchrony (whether filled with silence or sound), as long as the gap between tone onsets sufficiently conveys the information of asynchrony.

The present and the previous study [1] were both conducted utilizing auditory dichotic TOJ (also referred to as spatial or binaural TOJ). This TOJ paradigm uses two identical sounds presented asynchronously to the right and left ears; other auditory TOJ paradigms use two tones that differ in pitch or spectrum and are presented either monaurally or diotically (to both ears at the same time). The advantage of measuring temporal processing using dichotic TOJ is that the stimuli are identical, providing assurance that the temporal judgment is based on the temporal relationship of the two stimuli alone and not on other cues such as pitch [2, 3, 19]. Furthermore, perception of the stimulation of both ears by two asynchronous sounds, by definition, reflects mainly central auditory processing [1, 11, 47]. Consequently, the conclusions drawn from the current and earlier studies are limited to TOJ as tested by this paradigm, which has been shown to mainly involve temporal cues [2, 3, 11, 14, 17]. The extent to which these conclusions can be generalized to other TOJ paradigms has yet to be tested.

Several theoretical and clinical conclusions arise from the experiments carried out in the present study. Our first conclusion is that when judging the order of two tones presented to the two ears, individuals extract the same temporal information from the stimuli as from the silent gap between them, whether they are young and have intact temporal processing abilities or are older and have less than intact temporal processing abilities. Second, the same temporal parameter, the SOA, impacts the absolute value of dichotic TOJ threshold regardless of whether an individual has good or poor temporal processing ability; this dichotic TOJ threshold will then remain constant across different tone durations. Thus, temporal processing abilities affect the dichotic TOJ mechanism only quantitatively, rather than qualitatively. Clinically, this conclusion is important since quantitative changes can be more easily remedied than those that are qualitative. We recommend that future research further refine the conclusions of the current study by conducting dichotic TOJ testing using sounds of different intensities and spectra, employing additional temporal processing tasks, and studying diverse populations with varied temporal processing abilities.

Data Availability

The data were uploaded to Kaggle: https://www.kaggle.com/leahfostick/tone-duration-and-isi-in-toj

Funding Statement

The author(s) received no specific funding for this work.

References

1. Babkoff H, Fostick L. The role of tone duration in dichotic temporal order judgment. Attention, Perception, & Psychophysics. 2013 May;75(4):654–60. doi: 10.3758/s13414-013-0449-6
2. Fostick L, Babkoff H. Different response patterns between auditory spectral and spatial temporal order judgment (TOJ). Experimental Psychology. 2013.
3. Fostick L, Babkoff H. Auditory spectral versus spatial temporal order judgment: Threshold distribution analysis. Journal of Experimental Psychology: Human Perception and Performance. 2017 May;43(5):1002. doi: 10.1037/xhp0000359
4. Dacewicz A, Szymaszek A, Nowak K, Szelag E. Training-induced changes in rapid auditory processing in children with specific language impairment: electrophysiological indicators. Frontiers in human neuroscience. 2018 Aug 7;12:310. doi: 10.3389/fnhum.2018.00310
5. Fink M, Churan J, Wittmann M. Assessment of auditory temporal-order thresholds–A comparison of different measurement procedures and the influences of age and gender. Restorative Neurology and Neuroscience. 2005 Jan 1;23(5, 6):281–96.
6. Fink M, Ulbrich P, Churan J, Wittmann M. Stimulus-dependent processing of temporal order. Behavioural processes. 2006 Feb 28;71(2–3):344–52. doi: 10.1016/j.beproc.2005.12.007
7. Szymaszek A, Sereda M, Pöppel E, Szelag E. Individual differences in the perception of temporal order: the effect of age and cognition. Cognitive neuropsychology. 2009 Mar 1;26(2):135–47. doi: 10.1080/02643290802504742
8. Szymaszek A, Wolak T, Szelag E. The treatment based on temporal information processing reduces speech comprehension deficits in aphasic subjects. Frontiers in aging neuroscience. 2017 Apr 11;9:98. doi: 10.3389/fnagi.2017.00098
9. Szelag E, Jablonska K, Piotrowska M, Szymaszek A, Bednarek H. Spatial and spectral auditory temporal-order judgment (TOJ) tasks in elderly people are performed using different perceptual strategies. Frontiers in psychology. 2018 Dec 11;9:2557. doi: 10.3389/fpsyg.2018.02557
10. Szymaszek A, Szelag E, Sliwowska M. Auditory perception of temporal order in humans: The effect of age, gender, listener practice and stimulus presentation mode. Neuroscience Letters. 2006 Jul 31;403(1–2):190–4. doi: 10.1016/j.neulet.2006.04.062
11. Ben-Artzi E, Fostick L, Babkoff H. Deficits in temporal-order judgments in dyslexia: evidence from diotic stimuli differing spectrally and from dichotic stimuli differing only by perceived location. Neuropsychologia. 2005 Jan 1;43(5):714–23. doi: 10.1016/j.neuropsychologia.2004.08.004
12. Fostick L, Bar-El S, Ram-Tsur R. Auditory Temporal Processing as a Specific Deficit among Dyslexic Readers. Psychology Research. 2012 Feb;2(2):77–88.
13. Tallal P. Auditory temporal perception, phonics, and reading disabilities in children. Brain and language. 1980 Mar 1;9(2):182–98. doi: 10.1016/0093-934x(80)90139-x
14. Fostick L, Lifshitz-Ben-Basat A, Babkoff H. The effect of stimulus frequency, spectrum, duration, and location on temporal order judgment thresholds: Distribution analysis. Psychological research. 2019 Jul;83(5):968–76. doi: 10.1007/s00426-017-0915-1
15. Malenfant N, Grondin S, Boivin M, Forget-Dubois N, Robaey P, Dionne G. Contribution of temporal processing skills to reading comprehension in 8-year-olds: Evidence for a mediation effect of phonological awareness. Child Development. 2012 Jul;83(4):1332–46. doi: 10.1111/j.1467-8624.2012.01777.x
16. Choinski M, Szelag E, Wolak T, Szymaszek A. Working memory in aphasia: the role of temporal information processing. Frontiers in Human Neuroscience. 2020;14. doi: 10.3389/fnhum.2020.589802
17. Fostick L, Eshcoly R, Shtibelman H, Nehemia R, Levi H. Efficacy of temporal processing training to improve phonological awareness among dyslexic and normal reading students. Journal of Experimental Psychology: Human Perception and Performance. 2014 Oct;40(5):1799. doi: 10.1037/a0037527
18. Szymaszek A, Dacewicz A, Urban P, Szelag E. Training in temporal information processing ameliorates phonetic identification. Frontiers in human neuroscience. 2018 Jun 6;12:213. doi: 10.3389/fnhum.2018.00213
19. von Steinbüchel N, Wittmann M, Strasburger H, Szelag E. Auditory temporal-order judgement is impaired in patients with cortical lesions in posterior regions of the left hemisphere. Neuroscience letters. 1999 Apr 2;264(1–3):168–71. doi: 10.1016/s0304-3940(99)00204-9
20. Ronen M, Lifshitz-Ben-Basat A, Taitelbaum-Swead R, Fostick L. Auditory temporal processing, reading, and phonological awareness among aging adults. Acta psychologica. 2018 Oct 1;190:1–0. doi: 10.1016/j.actpsy.2018.06.010
21. Fostick L. Card playing enhances speech perception among aging adults: comparison with aging musicians. European journal of ageing. 2019 Dec;16(4):481–9. doi: 10.1007/s10433-019-00512-2
22. Fostick L, Taitelbaum-Swead R, Kreitler S, Zokraut S, Billig M. Auditory Training to Improve Speech Perception and Self-Efficacy in Aging Adults. Journal of Speech, Language, and Hearing Research. 2020 Apr 27;63(4):1270–81. doi: 10.1044/2019_JSLHR-19-00355
23. Babkoff H, Zukerman GI, Fostick L, Ben-Artzi EL. Effect of the diurnal rhythm and 24 h of sleep deprivation on dichotic temporal order judgment. Journal of sleep research. 2005 Mar;14(1):7–15. doi: 10.1111/j.1365-2869.2004.00423.x
24. Kinsbourne M, Rufo DT, Gamzu E, Palmer RL, Berliner AK. Neuropsychological deficits in adults with dyslexia. Developmental Medicine & Child Neurology. 1991 Sep;33(9):763–75. doi: 10.1111/j.1469-8749.1991.tb14960.x
25. Kolodziejczyk I, Szelag E. Auditory perception of temporal order in centenarians in comparison with young and elderly subjects. Acta Neurobiol Exp. 2008 Jun;68(3):373–81.
26. Hirsh IJ. Auditory perception of temporal order. The Journal of the Acoustical Society of America. 1959 Jun;31(6):759–67.
27. Hirsh IJ, Sherrick CE Jr. Perceived order in different sense modalities. Journal of experimental psychology. 1961 Nov;62(5):423. doi: 10.1037/h0045283
28. Gelfand SA. Hearing: An introduction to psychological and physiological acoustics. CRC Press; 2017.
29. Howard DM, Angus J. Acoustics and psychoacoustics. Taylor & Francis; 2017.
30. Wright HN. An artifact in the measurement of temporal summation at the threshold of audibility. Journal of Speech and Hearing Disorders. 1967 Nov;32(4):354–9. doi: 10.1044/jshd.3204.354
31. Suied C, Agus TR, Thorpe SJ, Mesgarani N, Pressnitzer D. Auditory gist: recognition of very short sounds from timbre cues. The Journal of the Acoustical Society of America. 2014 Mar;135(3):1380–91. doi: 10.1121/1.4863659
32. Beatini JR, Proudfoot GA, Gall MD. Effects of presentation rate and onset time on auditory brainstem responses in Northern saw-whet owls (Aegolius acadicus). The Journal of the Acoustical Society of America. 2019 Apr 17;145(4):2062–71. doi: 10.1121/1.5096532
33. Lee C, Guinan JJ Jr, Rutherford MA, Kaf WA, Kennedy KM, Buchman CA, et al. Cochlear compound action potentials from high-level tone bursts originate from wide cochlear regions that are offset toward the most sensitive cochlear region. Journal of neurophysiology. 2019 Mar 1;121(3):1018–33. doi: 10.1152/jn.00677.2018
34. Fostick L, Ben-Artzi E, Babkoff H. Aging and speech perception: Beyond hearing threshold and cognitive ability. Journal of basic and clinical physiology and pharmacology. 2013 Sep 1;24(3):175–83. doi: 10.1515/jbcpp-2013-0048
35. Schneider BA, Pichora-Fuller K, Daneman M. Effects of senescent changes in audition and cognition on spoken language comprehension. In “The aging auditory system” 2010 (pp. 167–210). Springer, New York, NY.
36. Sommers MS, Hale S, Myerson J, Rose N, Tye-Murray N, Spehar B. Listening comprehension across the adult lifespan. Ear and hearing. 2011 Nov;32(6):775. doi: 10.1097/AUD.0b013e3182234cf6
37. Taitelbaum-Swead R, Fostick L. The effect of age and type of noise on speech perception under conditions of changing context and noise levels. Folia Phoniatrica et Logopaedica. 2016;68(1):16–21. doi: 10.1159/000444749
38. Alain C, McDonald K, Van Roon P. Effects of age and background noise on processing a mistuned harmonic in an otherwise periodic complex sound. Hearing research. 2012 Jan 1;283(1–2):126–35. doi: 10.1016/j.heares.2011.10.007
39. Bergeson TR, Schneider BA, Hamstra SJ. Duration discrimination in younger and older adults. Canadian Acoustics. 2001 Dec 1;29(4):3–9.
40. Fitzgibbons PJ, Gordon-Salant S. Age effects in discrimination of intervals within rhythmic tone sequences. The Journal of the Acoustical Society of America. 2015 Jan;137(1):388–96. doi: 10.1121/1.4904554
41. Grose JH, Hall JW III, Buss E. Temporal processing deficits in the pre-senescent auditory system. The Journal of the Acoustical Society of America. 2006 Apr;119(4):2305–15. doi: 10.1121/1.2172169
42. Heinrich A, Schneider B. Age-related changes in within-and between-channel gap detection using sinusoidal stimuli. The Journal of the Acoustical Society of America. 2006 Apr;119(4):2316–26. doi: 10.1121/1.2173524
43. Ostroff JM, McDonald KL, Schneider BA, Alain C. Aging and the processing of sound duration in human auditory cortex. Hearing research. 2003 Jul 1;181(1–2):1–7. doi: 10.1016/s0378-5955(03)00113-8
44. Pichora-Fuller MK, Schneider BA, Benson NJ, Hamstra SJ, Storzer E. Effect of age on detection of gaps in speech and nonspeech markers varying in duration and spectral symmetry. The Journal of the Acoustical Society of America. 2006 Feb;119(2):1143–55. doi: 10.1121/1.2149837
45. Schneider BA, Pichora-Fuller MK. Age-related changes in temporal processing: implications for speech perception. In “Seminars in hearing” 2001 (Vol. 22, No. 03, pp. 227–240). Copyright© 2001 by Thieme Medical Publishers, Inc., 333 Seventh Avenue, New York, NY 10001, USA.
46. Armitage P, Berry G. Statistical methods in medical research (3rd ed.). 1994. Oxford, UK: Blackwell Scientific.
47. Fostick L, Ben-Artzi E, Babkoff H. Stimulus-onset-asynchrony as the main cue in temporal order judgment. Audiology Research. 2011 Mar;1(1):16–8. doi: 10.4081/audiores.2011.e5

PLoS One. doi: 10.1371/journal.pone.0264831.r001

Decision Letter 0

Susan Nittrouer

13 Oct 2021

PONE-D-21-18616

The role of tone duration in dichotic temporal order judgment II: Extending the boundaries of duration and age

PLOS ONE

Dear Dr. Fostick,

Thank you for submitting your manuscript to PLOS ONE. I now have reviews from two individuals who are truly experts in the field. I always feel fortunate when it is possible to get input from such highly qualified authorities. But as you will see, the conclusions reached concerning your manuscript were divided across the reviewers, with the first reviewer decidedly more positive than the second. I believe that all of the observations offered by both reviewers are well justified - and addressable. Many of these concerns have to do with presentation of the experiment and results, rather than with the conduct of the experiment itself. Therefore, I would like to invite you to submit a revision of your manuscript after you address all their concerns.

Please submit your revised manuscript by Nov 27 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Susan Nittrouer, Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In order to improve reporting, in your methods section, please provide additional information about the participant recruitment method and the demographic details of your participants, such as a table of relevant demographic details.

3. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well.

Additional Editor Comments (if provided):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is a well-written and concise manuscript addressing temporal order judgments, following on the authors’ previous work.

Background and motivation for the study: It is an expanded study of tone duration in temporal order judgment (TOJ). The study contains a large group of young adults (n = 226) and older adults (n = 98). It extends previous work by including shorter stimulus durations and by including older listeners.

Stimulus-onset-asynchrony (SOA) continues to explain the TOJ even for shorter durations and abilities. TOJ generally changes with age. (The differences between age groups are quantitative, NOT qualitative.)

The authors conclude that tone duration, then, just provides more information about SOA. It seems to fit the idea that stimulus onsets are very important for perception. A question of significance and motivation for the study arises: is this new information that contributes significantly to what we know?

In asking that, it seems that the motivation for the use of shorter duration stimuli isn’t strong. I’m not sure why we expected that the results from 3-ms stimuli would be different from 10-ms stimuli (and from Fig 1 perhaps they are). If the stimuli are appropriately ramped, then spectral splatter should be minimized and one might expect this result. What were the temporal ramps applied to the stimuli? Also, based on Figure 1a. it appears that the 3-ms stimuli do separate themselves from the other data, with flatter functions.

The reference for spectral splatter of short duration tones is quite outdated (1967) and presumably instrumentation has changed dramatically since that time. Suied et al. (JASA, 2014) have more recent data, and there are probably others as well.

It does seem interesting to address the question of the aging auditory system. Perhaps one might hypothesize that the only differences between younger and older listeners in TOJ would be found with short duration stimuli. This is introduced on page 10. However, then the older adults were tested using 10 to 40 ms stimuli, so it’s challenging to connect experiments 1 and 2. It does not appear that the age-related differences are duration specific, and that seems like an important point that could be emphasized more strongly. I think that question could be addressed more directly in the figures and in the analysis.

I have questions about the fit in Figure 2. I believe readers will need more information about that. Visual inspection shows a great deal of variability and thus it is difficult to understand the proportion of variance that is explained by the linear fit. Was heterogeneity tested? More details are needed for me, and presumably other readers, to understand the data.

Fig 5. Is there a relationship between age and threshold among the older listeners? Age could be considered as a continuous variable.

In the Discussion section the authors state that, as predicted, the older adults’ thresholds were 33 ms longer than the younger adults’. Where was this predicted?

Data from younger and older listeners do appear from these data to be qualitatively similar, but it’s not clear enough yet why this is important. It is also unclear how experiment 1 fits into the overall picture. Additional motivation is needed, and details are needed about the stimulus characteristics (ramps) and the statistical analyses.

Reviewer #2: Title: The role of tone duration in dichotic temporal order judgment II: Extending the boundaries of duration and age

Authors: Leah Fostick and Harvey Babkoff

Submitted to: PLOS ONE

Manuscript number: PONE-D-21-18616

Overview

The idea proposed in this manuscript is that the ability to determine the temporal order of two auditory stimuli is determined by the time between the onsets of the two stimuli (stimulus onset asynchrony, SOA) rather than by the time between the offset of the first stimulus and the onset of the second stimulus (inter stimulus interval, ISI). The authors test this proposal by measuring the proportion of correct determinations of temporal order for tones of different durations at each of multiple ISIs, and then comparing how the TOJ threshold is affected by tone duration when the TOJ threshold is based on the ISI versus when it is based on the SOA. The ISI-derived thresholds decrease with increasing tone duration, while the SOA-derived thresholds are constant across tone duration, suggesting that the SOA is the critical cue for performance. The same general pattern is reported for tone durations ranging from 3 to 40 ms in young adults (an expansion from 10 to 40 ms reported in a previous paper by the same authors), and for tone durations ranging from 10 to 40 ms in older adults.

I provide my general comments, and then my more specific comments on the manuscript, below.

General Comments

Stimulus description: I think it is essential to include a description of how the threshold is determined (see other general comment), to include a figure illustrating how manipulating the tone duration allows the determination of whether the dichotic TOJ threshold is determined by the ISI or the SOA, and to include stimulus schematics in each figure to illustrate how the data are being analyzed (based on ISI or SOA). It took me quite a while to understand the stimuli. For example, here is a version of a note to myself as I was reading: Text [Pg. 3, bottom]: “We asked whether changes in the duration of the tones and inter-stimulus interval…affect dichotic temporal order judgment accuracy in the same or different ways” My note: “To me, it is confusing to introduce ISI here, because, I suspect, the TOJ threshold is determined by manipulations of the ISI…Oh!! Is the idea that the ISI is a silent period between the stimuli, but that in the manipulation of stimulus duration, the two tones are always contiguous??” Now I understand, or think I understand, that there are ISIs of various lengths between the two tones for all of the stimuli, that the stimulus duration was varied across conditions (with multiple ISIs for each stimulus duration), and that the question was whether the data are better explained as a whole by evaluating performance based on the ISI or on the stimulus onset asynchrony (SOA). This is a case where a picture really would be worth a thousand words.

Threshold estimate: Why is the estimate of the time required between the onset of the two tones to determine their order—around 67 ms for younger adults--so much longer than the 15-20 ms value reported by Hirsh (1959)…a classic paper on temporal order judgment that is not cited in the current manuscript? Hirsh (and subsequently many others) used two ~500-ms stimuli whose onsets differed but whose offsets were coincident, but if temporal order is determined by stimulus onset asynchrony then it seems that the results of the present experiment and Hirsh’s data should align. Is the difference due to monotic vs. dichotic presentation?

Writing: I found most of the manuscript to be quite difficult to read. The exception was the Summary and Conclusion section. The previous sister paper by Babkoff and Fostick (2013) is much clearer overall, indicating that the authors are capable of producing clear prose and therefore could greatly improve the clarity of the current manuscript.

Introductions to individual experiments: I recommend combining all of the introductions into a single introduction. As it is now, some portions of the introductions to the individual experiments repeat information in the main introduction and other portions provide information that would be quite helpful to include in the main introduction.

Results: I found the results section of the sister paper by Babkoff and Fostick (2013) to be much clearer and more informative than the current results sections. I recommend modeling the current results sections after the earlier paper while still incorporating new additions like the Bayes factor.

Stimulus duration: It would be quite helpful to spell out why the extension of the investigation to stimulus durations beyond 10-40 ms only focused on stimuli in the 3-8 ms range, rather than on a much wider range of stimulus durations. I think the reason is that the dichotic TOJ threshold is around 60 ms, so to compare ISI and SOA requires durations shorter than 60 ms. There is some mention of the possibility that the spectrum of the shortest stimuli would affect the outcome, but I did not find that argument to be compelling, especially without placing the 3-8 ms restriction in the larger context.

TOJ threshold: The TOJ threshold is not defined.

Terminology: The terms temporal order judgment (TOJ) and dichotic TOJ are used interchangeably throughout the manuscript. It would be helpful to select just one term and then stick with it. Is the idea that the dichotic TOJ is just one more example of TOJ or that the dichotic aspect is an important factor? If the focus is on dichotic TOJ, the current claims could be tested with monaural TOJ tests, as well.

Specific Comments

Abstract

Pg. 2, top: “the major predictor of auditory TOJ threshold, and performance on spatial/dichotic TOJ tasks” Is there a difference between the TOJ threshold and performance on TOJ tasks? Is the intent that the predictor is for TOJ tasks in general, and the present results are for dichotic TOJ tasks in particular?

Pg. 2: To me, the abstract as a whole does not capture the major message of the manuscript—that dichotic TOJ thresholds appear to be determined by the SOA, rather than the ISI. I think the abstract would be much stronger if it were introduced using the argument at the beginning of the Summary and Conclusions about how the manuscript provides two tests of the idea that it is the SOA rather than ISI that determines the dichotic TOJ threshold.

Introduction

Pg. 3, top: “Temporal order judgment (TOJ) reflects the individual’s ability to correctly perceive the order of consecutive stimuli presented rapidly.” This sentence is not clear to me. It seems to conflate the task (temporal order judgment) with performance on the task (correctly perceive…and presented rapidly).

Pg. 3, top: “TOJ thresholds” What is a TOJ threshold? I think it would help to introduce the ideas a bit more slowly. For example, I suspect that the TOJ threshold is determined by varying the ISI.

Pg. 3, top: “TOJ thresholds were found to be related to phonological skills [1-10] and to speech perception [2, 8,11-16].” What was the direction of the relationship? I assume that higher thresholds were associated with poorer phonological skills and poorer speech perception, but it would be helpful to state that directly.

Pg. 3, middle: “use stimuli that differ in spectrum or in duration [17], in frequency (pitch)” I do not understand the distinction between spectrum and frequency. The spectra differ for sounds of two different frequencies. I suspect that intent is that ‘spectrum’ means the spectrum of a complex sound, but that is not clear from the text.

Pg. 3, middle: “or in synchronicity, meaning which ear receives the first/second stimulus (2,5,9,17-19; 22-23,24-27]” I do not understand the term ‘synchronicity’ in this context. Would ‘ear of presentation’ work? Does synchronicity mean the same thing as dichotic TOJ?

Pg. 3, bottom: “of the tones” what tones?

Experiment 1: Young Participants, Stimulus Duration 3 to 40 ms

Materials and Method

Pg. 5, bottom: “226 participants” It would be helpful to include the number of males and females. Were there any sex differences in performance?

Pg. 6, top: “The age of the participants ranged from 20 – 35 years.” It would be helpful to add the mean and standard deviation of the ages.

Pg. 6, top: Were the participants compensated for their participation?

Pg. 6, bottom: “ranging between 5-240 msec” It would be helpful to list the ISI values.

Pg. 6, bottom: Were the participants given trial-by-trial feedback?

Pg. 6, bottom: “to ascertain whether they perceived the order of the tones and correctly reported the ear stimulated (right or left)” The distinction between perceiving the order of the tones and correctly reporting the ear stimulated is not clear to me. How can one be done without the other? Is the idea that participants could tell that the two tones were presented in opposite ears, sequentially, but could not indicate which ear was first?

Pg. 6: What was the level of the tones?

Pg. 7, top: How were the tones generated?

Pg. 7 middle: “All participants were screened for hearing difficulties and their absolute threshold for 1 kHz was measured using the same computer and headphones that were used in the study.” Was the screening at 1 kHz separate from the screening for normal hearing mentioned in the ‘Participants’ section?

Results

Pg. 7, bottom: “Mean ISI thresholds” How is the ISI threshold defined?

Pg. 7, bottom: “Group mean data was tested against the predicted model and was found not to deviate significantly from this model (Probit analysis, Z = -3.13; p = .002).” What was the predicted model?

Pg. 7, bottom: “Mean ISI thresholds for the data collected in the present study (3 – 8 ms) and the previous study (10 – 40 ms) ranged between 58.4 – 25.4 ms” The impression I get from this sentence (and elsewhere) is that all of the new data (n=161) are for 3-8 ms, and all of the previous data (n=65) are for 10-40 ms, but that does not fit with the n provided for each duration separately.

Pg. 7, bottom: “ranged between 58.4—25.4 ms” In which direction?

Pg. 7, bottom: Fig. 1B, the points above the ~-250 ms and +250 ms SOA are out of line with the rest of the data. I think this pattern deserves mention in the results section…presumably arises because of a flattening of performance, asymptotic performance at the longest ISI values. What is the outcome if these values are removed from the line fitting? My sense is that these values lead to underestimation of the ‘true’ slope. Is there a reason for using linear vs. log values?

Pg. 8, top: “TOJ thresholds” How is the TOJ threshold defined? Is it the same as the dichotic TOJ threshold mentioned later in the same paragraph? Is it the same as the ISI threshold mentioned in the preceding paragraph?

Pg. 8, top: “The best linear fit to the means is drawn as a straight line (R² = 0.69, p<.001) and the predicted line based on y = a - bx is drawn as a dotted line.” What is the predicted line?

Discussion

Pg. 8, bottom: “for a wide range of stimulus durations (3 – 40 msec)” I do not consider 3-40 ms to be a wide range of stimulus durations. I recommend deleting “for a wide range...” to the end of the sentence, and just stating the outcome.

Pg. 9, top: “suggest that the time between the onset of two tones that is required for perceiving their order is constant (around 60 – 70 ms)” I think this estimate is quite interesting, and deserves to be emphasized in the paper, discussion, and possibly abstract. However, I think that it would be better to point it out first in the results section (I had to go back to the figure to work out how the value was determined).

Experiment 2: Elderly Participants, Stimulus Duration 10 to 40 ms

Materials and Method

Pg. 9, middle: “elderly” (mentioned several times): I recommend using a different term, possibly “older”.

Pg. 9, middle: “Studies of temporal processing among the elderly, including those researching it in the context of speech, have reported deficiencies in performance compared to young adults [8,19-20,22,24,26,38-45]. These studies have demonstrated that aging adults require longer sound durations, ISIs, and gaps, than do young adults, in order to correctly perceive them.” It would be helpful to indicate which of the references in the first sentence are associated with each of the measures listed in the second sentence.

Pg. 9, middle: “Such findings support the notion of aging adults’ sensitivity both to tone duration and ISI” I do not understand this phrase. Is the intent that such findings indicate that aging adults are sensitive to both…??

Pg. 10, top: “If so, we expected the pattern of older adults’ performance to be similar to those of young adults, indicating that although the thresholds will be longer, on average, there will be a “zero” line slope when thresholds are plotted as a function of SOA.” → “If so, the pattern of older adults’ performance should be similar to that of young adults, such that there should be a “zero” line slope when thresholds are plotted as a function of SOA.” I recommend stating in a separate sentence why the thresholds are expected to be longer.

Pg. 10, middle: “than longer ones, we would expect the line’s slope” → “than longer ones, the line’s slope (relating dichotic TOJ threshold to tone duration) should be significantly greater than “zero””

Pg. 10 bottom to pg. 11 top: I recommend taking out the subheads under Materials and Methods and just stating that Experiment 2 was the same as Experiment 1, except…(fill in the blanks).

Pg. 11 top: What does ‘resembled’ mean in this context? Were there a number of differences in the procedure between Experiment 1 and Experiment 2, but a vague similarity between the two experiments? I suspect not.

Discussion

Pg. 12, middle: “the thresholds for aging adults were, on average, 33 ms longer than for the younger adults (an average SOA threshold of 99.6 ms for aging adults vs. 66.5 ms for the younger).” See comment above (pg. 9, top) about the estimate of the threshold. I recommend including that information first in the results section.

Summary and conclusions

Pg. 13, middle: “The present study aimed to evaluate the temporal mechanism of TOJ by testing the generalizability of the conclusion that dichotic TOJ is determined by stimulus onset asynchrony (SOA), the time separating the onset of the first tone to the onset of the second one. We tested this hypothesis by two different manipulations: 1) by extending the range of tone durations to include 3-10 msec on a population of young adults; 2) by performing identical testing on a population of older adults, about whom there is existing evidence of a general deficit in auditory temporal order judgment.” I think this set-up is much clearer than the one in the current introduction.

Figure Captions

Figures

After taking the time to write down the suggestions listed below for improving the figures, I finally looked at the previous sister paper by Babkoff and Fostick (2013) and saw the figures in that paper are much clearer, and actually follow most of my suggestions.

Figure 1

As I understand it, the figure shows the number of ‘left leading’ responses, but those responses increase on the right side of the figure, which is counterintuitive. At a minimum I recommend indicating left and right on the x axis. Replotting the data more intuitively would be better.

Top panel

1) move the lower half of y axis values to the opposite side of the axis so the values are not obscured by the lines (or possibly move the entire y axis to the left of the figure)

2) order the colors of the lines systematically, so the relationship between tone duration and performance is easier to unpack

3) add the n per group to the figure, possibly under the tone duration…or at least to the caption

4) give an estimate of the error…one possibility would be to plot one point and a mean error bar for each duration in the upper left quadrant

5) add schematic diagrams illustrating the distinction between panel a and panel b

Bottom panel

1) use the same (revised) colors from the top panel for the points in this panel, for continuity, and to help the reader see the connection between the two panels

Figure 4

Same comments as for Figure 1.

Top panel

1) use the same (revised) colors for the different durations as in Fig. 1, for continuity

Figures 2,3 and 5,6

Same basic comments as for Figure 1

I recommend combining Figures 2 and 3 into one two-panel figure, and combining Figures 5 and 6 into another two-panel figure.

Grammar and word choice

Pg. 3, top: “reflects the individual’s ability” → “reflects an individual’s ability”

Pg. 3, middle: “and not by other cues” → “and not on other cues”

Pg. 3, bottom: “offset of the tone” → “offset of the first tone”

Pg. 6, top: I recommend: “(hearing thresholds of 20 dB HL or less in frequencies 500, 1,000, 2,000, and 4,000 Hz)” → “(hearing thresholds of 20 dB HL or less at 500, 1,000, 2,000, and 4,000 Hz)”.

Pg. 6, bottom: “2 order of presentation to each ear”…there is something odd about the grammar …possibly “2 presentation orders”

Pg. 6, bottom: [see18] → [see 18]

Pg. 7, top: “using the Danplex DA64 audiometer” → “using a Danplex DA64 audiometer”

Pg. 7, bottom, and throughout manuscript: Put a period after “Fig”

Pg. 7, bottom: “for the data plotted, as a function of SOA” → “for the data, plotted as a function of SOA”

Pg. 7, bottom: “mean data was tested” → “mean data were tested”

Pg. 8, middle: “When data from all 226 participants was analyzed” → “When the data from all 226 participants were analyzed”

Pg. 8, bottom: “The data from Experiment 1 shows that” → “The data from Experiment 1 show that” (data is a plural word)

Pg. 8, bottom: “The extension of the tone duration from 10-40 msec” → “The extension of the range of tone durations from 10-40 msec”

Pg. 9, middle: “to the TOJ threshold, we do not know” → “to the TOJ threshold in aging adults, we do not know”

Pg. 10, bottom: I recommend: “(hearing thresholds of 35 dB HL or less in frequencies 500, 1,000, 2,000, and 4,000 Hz)” → “(hearing thresholds of 35 dB HL or less at 500, 1,000, 2,000, and 4,000 Hz)”.

Pg. 12, bottom: “parameter in determining” → “parameter predicting”

Pg. 13, middle: “from 10-to-3 msec” → “from 10 to 3 msec”

Pg. 13, middle: “than just serve to separate the onset of the first from the second tone” → “than simply serving to separate the onsets of the first and second tones.”

Pg. 14, bottom: “Theoretically, our finding shows that” → “Theoretically, our findings show that”

Pgs 22 and 25: The caption styles differ slightly.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS One. 2022 Mar 30;17(3):e0264831. doi: 10.1371/journal.pone.0264831.r002

Author response to Decision Letter 0

4 Jan 2022

Dear Prof. Nittrouer,

I would like to thank you for allowing a resubmission of a revised version of the manuscript “The role of tone duration in dichotic temporal order judgment II: Extending the boundaries of duration and age.” I would also like to thank the reviewers for their useful comments, which gave us the opportunity to make the manuscript clearer. Detailed here, as an accompaniment to the revised manuscript, is a summary of the changes made in response to the reviewers’ suggestions. The page and line numbers refer to the “manuscript” version (clean version, with no track changes).

Reviewer #1:

1. The authors conclude that tone duration, then, just provides more information about SOA. It seems to fit the idea that stimulus onsets are very important for perception. A question of significance and motivation for the study arises: is this new information that contributes significantly to what we know? In asking that, it seems that the motivation for the use of shorter duration stimuli isn’t strong. I’m not sure why we expected that the results from 3-ms stimuli would be different from 10-ms stimuli (and from Fig 1 perhaps they are). If the stimuli are appropriately ramped, then spectral splatter should be minimized and one might expect this result. What were the temporal ramps applied to the stimuli?

Response. As requested by the reviewer, we have now clarified in the Task and stimuli section that rise/fall times were 1 msec. Also, in answer to the reviewer’s question as to why we expected that shorter stimuli might produce different results than longer ones, we added our rationale to the Introduction section, namely that the duration of short sounds affects their perception.

On page 5, line 14, we note:

“Our finding, however, is limited to the range of stimulus durations we tested, namely tone durations of 10 – 40 ms. It is unclear whether this finding would extend to other tone durations, since sound duration affects our auditory perception in several ways. First, sound duration affects the loudness of a sound via temporal summation, with sounds being perceived as softer or louder when duration is decreased or increased (respectively), up to 200 msec [33]. Second, sound duration affects our ability to perceive pitch, with lower frequencies requiring longer sound durations than higher frequencies. Third, sound duration also affects our ability to localize a sound source, with longer sounds being localized better by allowing the listeners to move their head towards the sound source [33,34].”

2. Also, based on Figure 1a. it appears that the 3 msec stimuli do separate themselves from the other data, with flatter functions.

Response. There was a difference among the stimulus durations, but only for short ISIs (5 – 30 msec), not for longer ones (60 – 240 msec). Following the reviewer’s comment, we added to both experiments repeated-measures ANOVAs on the accuracy data, with Stimulus Duration as a between-subjects variable and ISI as a within-subjects variable. The results for Experiment 1 showed a main effect of Stimulus Duration and a Stimulus Duration × ISI interaction. The source of this interaction was an effect of Stimulus Duration only at short ISIs (5 – 30 msec), not at longer ones (60 – 240 msec), as was also the case in the Babkoff and Fostick (2013) study.

In the Results section of Experiment 1 (page 10, line 16), we note:

“The accuracy data were transformed by probit (a transformation for linearizing sigmoid distributions of proportions) [50]. Psychometric functions of the probit-transformed data for the proportion of 'left leading' responses, as a function of ISI, are presented in Figure 2a, separately for each of the eight stimulus durations. A two-way repeated measures analysis of variance (ANOVA) was performed with the probit-transformed data as the dependent variable, ISI as a within-subjects variable, and Stimulus Duration as a between-subjects variable. The analysis revealed main effects of both ISI [F(7,1358) = 948.142, p < .001, ηp² = .830] and Stimulus Duration [F(7,194) = 6.456, p < .001, ηp² = .189], as well as an ISI × Stimulus Duration interaction [F(49,1358) = 1.670, p = .003, ηp² = .057]. Post-hoc ANOVAs between Stimulus Duration for each ISI revealed significant effects of stimulus duration at the short ISIs [5 msec: F(7,194) = 4.288, p < .001; 10 msec: F(7,194) = 5.038, p < .001; 15 msec: F(7,194) = 9.599, p < .001; 30 msec: F(7,194) = 5.573, p < .001], but not at the longer ISIs (60, 90, 120, and 240 msec; ps > .05).”

In the Results section of Experiment 2 (page 14, line 9) we note:

“The accuracy data were transformed by probit (a transformation for linearizing sigmoid distributions of proportions) [50]. The psychometric functions of the probit-transformed data for the proportion of 'left leading' responses, as a function of ISI, are presented in Figure 4a, for each of the four stimulus durations. A two-way repeated measures ANOVA was performed with the probit-transformed data as the dependent variable, ISI as a within-subjects variable, and Stimulus Duration as a between-subjects variable. The analysis revealed a main effect of ISI [F(7,651) = 244.105, p < .001, ηp² = .724] and an ISI × Stimulus Duration interaction [F(28,651) = 3.291, p < .001, ηp² = .124], but no main effect of Stimulus Duration [F(4,93) = 1.165, p = .332, ηp² = .048]. Post-hoc ANOVAs of Stimulus Duration for each ISI revealed significant effects of stimulus duration at some ISIs [5 msec: F(4,93) = 5.647, p < .001; 15 msec: F(4,93) = 6.879, p < .001; 240 msec: F(4,93) = 4.714, p = .002], but not at most of them (10, 30, 60, 90, and 120 msec; ps > .05).”

3. The reference for spectral splatter of short duration tones is quite out-dated (1967) and presumably instrumentation has changed dramatically since that time. Suied et al from JASA in 2014 have more recent data, and there are probably others as well.

Response. We agree with the reviewer and, as suggested, updated our citations on this topic to include Suied et al. (2014) and the following additional two newer references:

Beatini JR, Proudfoot GA, Gall MD. Effects of presentation rate and onset time on auditory brainstem responses in Northern saw-whet owls (Aegolius acadicus). The Journal of the Acoustical Society of America. 2019 Apr 17;145(4):2062-71.

Lee C, Guinan Jr JJ, Rutherford MA, Kaf WA, Kennedy KM, Buchman CA, Salt AN, Lichtenhan JT. Cochlear compound action potentials from high-level tone bursts originate from wide cochlear regions that are offset toward the most sensitive cochlear region. Journal of neurophysiology. 2019 Mar 1;121(3):1018-33.

4. It does seem interesting to address the question of the aging auditory system. Perhaps one might hypothesize that the only differences between younger and older listeners in TOJ would be found with short duration stimuli. This is introduced on page 10. However, then the older adults were tested using 10 to 40 ms stimuli, so it’s challenging to connect experiments 1 and 2. It does not appear that the age-related differences are duration specific, and that seems like an important point that could be emphasized more strongly. I think that question could be addressed more directly in the figures and in the analysis.

Response. As suggested by the reviewer, we now put more emphasis on the hypothesis of age-related differences in processing short stimuli, and on the conclusion that the age-related differences are not duration specific. Accordingly, we made the following revisions:

a) We added specific mention of the aging adults’ results in the abstract.

On page 2, line 16, we note:

“The results of both experiments confirmed the hypothesis, that the SOA required for performing dichotic TOJ was constant regardless of stimulus duration, for both age groups: about 66.5 msec for the young adults and 33 msec longer (100 msec) for the older adults.”

b) We put the literature review on age-related differences in auditory temporal processing earlier in the Introduction section.

On page 6, line 21, we note:

“Indeed, older adults have difficulty processing short and rapid stimuli. This difficulty is often reflected in the difficulty of older adults in perceiving speech, especially when the speaker talks fast or when speech is accompanied by background noise [3,17,21,24,39-41]. Studies of temporal processing among older adults, including the studies using speech stimuli, have reported deficiencies in their performance compared to young adults [5,7,14,21,42-49]. These studies demonstrated that older adults required longer sound durations [42,43,47], ISIs [5,7,14,21,45], and longer gaps within sounds [44,46,48,49], than young adults, in order to correctly perceive them. Such findings suggest that older adults might be sensitive both to tone duration and ISI.”

c) We have now elaborated the conclusions regarding older adults in the Summary and Conclusions section.

On page 18, line 16, we note:

“Several theoretical and clinical conclusions arise from the experiments carried out in the present study. Our first conclusion is that when judging the order of two tones presented to the two ears, individuals extract the same temporal information from the stimuli as from the silent gap between them, whether they are young and have intact temporal processing abilities or are older and have less than intact temporal processing abilities.”

5. I have questions about the fit in Figure 2. I believe readers will need more information about that. Visual inspection shows a great deal of variability and thus it is difficult to understand the proportion of variance that is explained by the linear fit. Was heterogeneity tested? More details are needed for me, and presumably other readers, to understand the data.

Response. We appreciate the opportunity to clarify this. The standard error of the data presented in this figure (Figure 2 in the original submitted version, Figure 3a in the current revised version) is 5.3 – 7 msec, and is also presented in the Results section of Experiment 1 (page 13, line 13). Heterogeneity testing was found to be non-significant (Levene Statistic, F(7,316) = 1.842, p = .079). In line with the reviewer’s comment, we added the results of the heterogeneity test to the Results section.

On page 13, line 21, we note:

“ISI thresholds are presented in Figure 3a as a scattergram for all participants plotted against each tone duration. These data ranged correspondingly between 58.4 – 25.4 msec, with stable standard errors in the range of 5.3 – 7 msec. Heterogeneity testing (Levene Statistic) was not significant (F(7,316) = 1.842, p = .079).”

6. Fig 5. Is there a relationship between age and threshold among the older listeners? Age could be considered as a continuous variable.

Response. In response to the reviewer’s question, we checked for a correlation between age and threshold, but it was non-significant (r = 0.13, p = 0.21).

7. In the Discussion section the authors state: as predicted the older adults’ thresholds were 33 ms longer than younger? Where was this predicted?

Response. As correctly pointed out by the reviewer, we predicted longer thresholds for older adults, but not by a specific number of msec. Accordingly, we rephrased this sentence.

On page 1, line 7, we note:

“As predicted, the thresholds for older adults were longer than for the younger adults; on average, the older adults’ TOJ threshold was longer by 33 msec across a 10 – 40 msec range of stimulus durations (i.e., an average SOA of 99.6 msec for the older adults vs. 66.5 msec for the younger adults).”

8. Data from younger and older listeners do appear from these data to be qualitatively similar, but it’s not clear enough yet why this is important. It is also unclear how experiment 1 fits into the overall picture. Additional motivation is needed, and details are needed about the stimulus characteristics (ramps) and the statistical analyses.

Response. As suggested by the reviewer here and in the previous comments, we put more emphasis on the study motivation and rationale, as well as elaborated on the conclusions and their potential contribution. We also added further detail such as the ramp information.

Reviewer #2:

1. Stimulus description: I think it is essential to include a description of how the threshold is determined (see other general comment), to include a figure illustrating how manipulating the tone duration allows the determination of whether the dichotic TOJ threshold is determined by the ISI or the SOA, and to include stimulus schematics in each figure to illustrate how the data are being analyzed (based on ISI or SOA). It took me quite a while to understand the stimuli. For example, here is a version of a note to myself as I was reading: Text [Pg. 3, bottom]: “We asked whether changes in the duration of the tones and inter-stimulus interval…affect dichotic temporal order judgment accuracy in the same or different ways” My note: “To me, it is confusing to introduce ISI here, because, I suspect, the TOJ threshold is determined by manipulations of the ISI…Oh!! Is the idea that the ISI is a silent period between the stimuli, but that in the manipulation of stimulus duration, the two tones are always contiguous??” Now I understand, or think I understand, that there are ISIs of various lengths between the two tones for all of the stimuli, that the stimulus duration was varied across conditions (with multiple ISIs for each stimulus duration), and that the question was whether the data are better explained as a whole by evaluating performance based on the ISI or on the stimulus onset asynchrony (SOA). This is a case where a picture really would be worth a thousand words.

Response. We agree with the reviewer that a picture really is worth a thousand words, so we added the following illustration as Figure 1.

Figure 1. Schematic illustration of study design demonstrating the relationship of stimulus duration, inter-stimulus interval (ISI), and stimulus-onset asynchrony (SOA) (see top line). The manipulation of stimulus duration is presented as the duration of Stimulus A and Stimulus B, which varied across groups in the current study as a between-subjects variable. ISI is presented as the silent gap between the offset of Stimulus A and the onset of Stimulus B, which varies within each group as a within-subjects variable. Numbers of participants in each group for Experiments 1 and 2 are shown.
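The relationship summarized in this caption can be sketched numerically. The following is a minimal illustration, not the authors' analysis code: it assumes that SOA equals the duration of the first tone plus the ISI, and that the SOA threshold is constant at the 66.5 msec reported for young adults. The function names and the three example durations are hypothetical choices made for illustration.

```python
# Illustrative sketch only. The 66.5 msec constant and the 3-40 msec range come
# from the text; the function names and chosen durations are hypothetical.

def soa(duration_ms: float, isi_ms: float) -> float:
    """Onset-to-onset interval: duration of the first tone plus the silent gap."""
    return duration_ms + isi_ms


def predicted_isi_threshold(duration_ms: float, soa_threshold_ms: float = 66.5) -> float:
    """ISI threshold implied by a constant SOA threshold (assumed, not fitted)."""
    return soa_threshold_ms - duration_ms


if __name__ == "__main__":
    for duration in (3, 10, 40):
        isi = predicted_isi_threshold(duration)
        print(f"duration={duration} msec, ISI threshold={isi} msec, "
              f"SOA={soa(duration, isi)} msec")
```

Under this assumption the ISI threshold falls by 1 msec for every additional msec of tone duration, while the SOA stays fixed, which is the pattern the constant-SOA hypothesis predicts.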

2. Threshold estimate: Why is the estimate of the time required between the onset of the two tones to determine their order—around 67 ms for younger adults—so much longer than the 15-20 ms value reported by Hirsh (1959)…a classic paper on temporal order judgment that is not cited in the current manuscript? Hirsh (and subsequently many others) used two ~500-ms stimuli whose onsets differed but whose offsets were coincident, but if temporal order is determined by stimulus onset asynchrony then it seems that the results of the present experiment and Hirsh’s data should align. Is the difference due to monotic vs. dichotic presentation?

Response. We very much appreciated this feedback and added references to the seminal work of Hirsh (1959) and Hirsh and Sherrick (1962) that were regretfully omitted. We also now reference later studies that found longer thresholds. On Page 4, line 17, we now note:

“Hirsh [28] and Hirsh and Sherrick [29] who measured the amount of time between the onsets of two stimuli (tones, clicks, lights, and their combinations) necessary to correctly report their order. This measure, called the TOJ threshold, reflects the minimum amount of time separating the onsets of the two stimuli at which an individual can correctly identify the order of stimulus presentation 75% of the time. Hirsh and Sherrick [29] originally reported the threshold for TOJ to be 17 msec, regardless of the type of stimulus and presentation modality used [29]. However, more recent studies have, generally, reported longer thresholds [1-3,5-7,10-13-18,30-32].”

3. Comment. Terminology: The terms temporal order judgment (TOJ) and dichotic TOJ are used interchangeably throughout the manuscript. It would be helpful to select just one term and then stick with it. Is the idea that the dichotic TOJ is just one more example of TOJ or that the dichotic aspect is an important factor? If the focus is on dichotic TOJ, the current claims could be tested with monaural TOJ tests, as well.

Response. The study was carried out on dichotic/spatial TOJ, as was the previous one. Following the reviewer’s comment, we added a paragraph in the Summary and Conclusions explaining why we chose this paradigm, and consequently the limitations of our findings and conclusions (see below). As suggested by the reviewer, we revised the manuscript to be more consistent in its use of the TOJ / dichotic TOJ terminology.

In the Summary and Conclusions, page 18, line 4, we note:

“The present and the previous study [18] were both conducted on auditory dichotic TOJ (also referred to as spatial or binaural TOJ). This TOJ paradigm involves two identical sounds, presented asynchronously to the right and left ears. Other auditory TOJ paradigms involve two tones that differ in pitch or spectrum and are presented monaurally or diotically (to both ears at the same time). The advantage for measuring temporal processing using dichotic TOJ is that the stimuli are identical, and provides assurance that the temporal judgment is based on the temporal relationship of the two stimuli alone and not on other cues, such as pitch [19-20,27]. Furthermore, the perception of the stimulation of two ears by two asynchronous sounds, by definition, reflects mainly central auditory processing [1,18,28]. However, the conclusions drawn from the current and earlier studies are limited to this paradigm that was shown to mainly involve temporal cues [1,5,17,19,20]. The extent to which these conclusions can be generalized to other TOJ paradigms is yet to be tested.”

4. Writing: I found most of the manuscript to be quite difficult to read. The exception was the Summary and Conclusion section. The previous sister paper by Babkoff and Fostick (2013) is much clearer overall, indicating that the authors are capable of producing clear prose and therefore could greatly improve the clarity of the current manuscript.

Response. We are grateful for the reviewer’s positive assessment of our previous publication and took this constructive criticism very seriously. Accordingly, we thoroughly rewrote the manuscript as suggested in this comment and the following comments.

5. Introductions to individual experiments: I recommend combining all of the introductions into a single introduction. As it is now, some portions of the introductions to the individual experiments repeat information in the main introduction and other portions provide information that would be quite helpful to include in the main introduction.

Response. As suggested by the reviewer, the introduction sections of Experiment 1 and Experiment 2 were combined into a general introduction.

6. Results: I found the results section of the sister paper by Babkoff and Fostick (2013) to be much clearer and more informative than the current results sections. I recommend modeling the current results sections after the earlier paper while still incorporating new additions like the Bayes factor.

Response. As suggested, we have now modeled the reporting style of the results in the current paper after our previous publication (Babkoff and Fostick, 2013).

7. Stimulus duration: It would be quite helpful to spell out why the extension of the investigation to stimulus durations beyond 10-40 ms only focused on stimuli in the 3-8 ms range, rather than on a much wider range of stimulus durations. I think the reason is that the dichotic TOJ threshold is around 60 ms, so to compare ISI and SOA requires durations shorter than 60 ms. There is some mention of the possibility that the spectrum of the shortest stimuli would affect the outcome, but I did not find that argument to be compelling, especially without placing the 3-8 ms restriction in the larger context.

Response. To address the reviewer’s helpful feedback, we added further explanations regarding our focus on short durations.

On page 5, line 14, we note:

“The design of our study directed us toward testing tone durations shorter than those we used in our earlier study. The dichotic TOJ ISI threshold was found to be around 60 msec in several studies [1-3,9,11,16,18,21], therefore, manipulating tone duration, ISI and SOA necessarily places an upper limit on the tone durations one can test, i.e., they must be shorter than 60 msec. This means that in order to expand the range of tone durations necessary to test the generalization of our ISI-tone duration TOJ equivalence hypothesis, we focus on shorter durations than those we used in the previous study [1] (i.e., less than 10 msec). Such short durations can create transients (short-duration sounds with high amplitude that can accompany the beginning of short sounds) that spread energy across the frequency range [35-38], possibly resulting in different ISI-duration patterns than those observed with tone durations longer than 10 msec. Therefore, in the present study we aimed to test whether our finding applies to very short tone durations (i.e., 3, 6, and 8 msec) as well as durations in the 10-40 msec range, while using the same dichotic TOJ design as Babkoff & Fostick [1].”

8. TOJ threshold: The TOJ threshold is not defined.

Response. We have now added the definition of TOJ thresholds in the Introduction, as presented by Hirsh (1959) and Hirsh and Sherrick (1962), in the Task and Stimuli section of Materials and Methods, and in both of the Results sections.

On page 4, line 17, we note:

“TOJ has been studied extensively, beginning with the seminal work of Hirsh [28] and Hirsh and Sherrick [29] who measured the amount of time between the onsets of two stimuli (tones, clicks, lights, and their combinations) necessary to correctly report their order. This measure, called the TOJ threshold, reflects the minimum amount of time separating the onsets of the two stimuli at which an individual can correctly identify the order of stimulus presentation 75% of the time.”

In the Task and Stimuli section, page 9, line 14, we note:

“Dichotic TOJ thresholds were defined as the ISI necessary for 75 % accuracy, estimated using the best linear approximation of a psychometric function.”

In the Results section, page 11, line 20, we note:

“TOJ ISI thresholds, defined as the ISI necessary for 75% accuracy, were estimated using a linear function.”

In the Results section, page 1, line 5, we note:

“TOJ ISI thresholds (i.e., ISI required for 75% correct responses) for the older adult cohort…”
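The threshold definition quoted above (the ISI required for 75% accuracy, read off a linear fit) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes the proportions are probit-transformed, that an ordinary least-squares line is fitted to the transformed values as a function of ISI, and that the threshold is where that line crosses the probit of 0.75. The ISI values are those listed in the text; the proportions are synthetic, and all function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def probit(p: float, eps: float = 1e-6) -> float:
    """Inverse standard-normal CDF; proportions are clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return STANDARD_NORMAL.inv_cdf(p)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def isi_threshold(isis, proportions, criterion: float = 0.75) -> float:
    """ISI at which the fitted probit line reaches the accuracy criterion."""
    intercept, slope = fit_line(isis, [probit(p) for p in proportions])
    return (probit(criterion) - intercept) / slope


if __name__ == "__main__":
    isis = [5, 10, 15, 30, 60, 90, 120, 240]  # msec, as listed in the text
    # Synthetic proportions generated from a known probit line (slope 0.015).
    proportions = [STANDARD_NORMAL.cdf(0.015 * isi) for isi in isis]
    print(round(isi_threshold(isis, proportions), 1))
```

Because the synthetic proportions are generated from a known line, the recovered threshold can be checked against the value implied by that line; with real data the fit would only approximate the psychometric function.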

9. Comment. Terminology: The terms temporal order judgment (TOJ) and dichotic TOJ are used interchangeably throughout the manuscript. It would be helpful to select just one term and then stick with it. Is the idea that the dichotic TOJ is just one more example of TOJ or that the dichotic aspect is an important factor? If the focus is on dichotic TOJ, the current claims could be tested with monaural TOJ tests, as well.

Response. We appreciate the opportunity to clarify. Dichotic TOJ is the form of TOJ used in the present study, as in the previous one. To be more precise in this revised version of the manuscript, we now mention more explicitly that we use this paradigm, and, as suggested by the reviewer, we have corrected those places where it wasn’t mentioned specifically. We agree with the reviewer that our findings could be tested with monaural TOJ tests as well, therefore we also added to the Summary and Conclusions limitations regarding the use of this specific paradigm and the ability to generalize findings from this paradigm to other TOJ paradigms.

On page 19, line 15 – page 20, line 3, we note:

“The present and the previous study [1] were both conducted utilizing auditory dichotic TOJ (also referred to as spatial or binaural TOJ). This TOJ paradigm uses two identical sounds presented asynchronously to the right and left ears; other auditory TOJ paradigms use two tones that differ in pitch or spectrum and are presented either monaurally or diotically (to both ears at the same time). The advantage of measuring temporal processing using dichotic TOJ is that the stimuli are identical, providing assurance that the temporal judgment is based on the temporal relationship of the two stimuli alone and not on other cues such as pitch [2,3,15]. Furthermore, perception of the stimulation of both ears by two asynchronous sounds, by definition, reflects mainly central auditory processing [1,16,51]. Consequently, the conclusions drawn from the current and earlier studies are limited to TOJ as tested by this paradigm, which has been shown to mainly involve temporal cues [2,3,9,11,16]. The extent to which these conclusions can be generalized to other TOJ paradigms has yet to be tested.”

10. Specific Comments: Abstract. Pg. 2, top: “the major predictor of auditory TOJ threshold, and performance on spatial/dichotic TOJ tasks” Is there a difference between the TOJ threshold and performance on TOJ tasks? Is the intent that the predictor is for TOJ tasks in general, and the present results are for dichotic TOJ tasks in particular?

Response. The task tested in the current and previous study was spatial/dichotic TOJ. In previous studies, we showed different response patterns for different types of TOJ tasks, so it is important to state the specific TOJ that is tested here. However, as was correctly pointed out by the reviewer, the phrasing of the sentence was confusing, so we changed it accordingly.

In the beginning of the abstract (page 2, line 1), we note:

“Temporal order judgment (TOJ) measures the ability to correctly perceive the order of consecutive stimuli presented rapidly. Our previous research suggested that the major predictor of auditory dichotic TOJ threshold, a paradigm that requires the identification of the order of two tones, each of which is presented to a different ear, is the time separating the onset of the first tone from the onset of the second tone (stimulus-onset-asynchrony, SOA).”

11. Pg. 2: To me, the abstract as a whole does not capture the major message of the manuscript—that dichotic TOJ thresholds appear to be determined by the SOA, rather than the ISI. I think the abstract would be much stronger if it were introduced using the argument at the beginning of the Summary and Conclusions about how the manuscript provides two tests of the idea that it is the SOA rather than ISI that determines the dichotic TOJ threshold.

Response. We accept the reviewer’s feedback and have now implemented the wording used in the beginning of the Summary and Conclusions, in order for the abstract to better capture the major message of the manuscript.

In the abstract (page 2, line 8), we note:

“The current study aimed to evaluate the generalizability of the earlier finding by manipulating the experimental model in two different ways: a) extending the tone duration range to include shorter stimulus durations (3 – 8 msec; Experiment 1) and b) repeating the identical testing procedure on a different population with temporal processing deficits, i.e., older adults (Experiment 2).”

12. Introduction. Pg. 3, top: “Temporal order judgment (TOJ) reflects the individual’s ability to correctly perceive the order of consecutive stimuli presented rapidly.” This sentence is not clear to me. It seems to conflate the task (temporal order judgment) with performance on the task (correctly perceive…and presented rapidly).

Response. In line with this comment, the word “reflects” was changed to “measures”.

The first sentence of the introduction (page 4, line 2) now states:

“Temporal order judgment (TOJ) measures the individual’s ability to correctly perceive the order of consecutive stimuli presented rapidly.”

13. Pg. 3, top: “TOJ thresholds” What is a TOJ threshold? I think it would help to introduce the ideas a bit more slowly. For example, I suspect that the TOJ threshold is determined by varying the ISI.

Response: We appreciate the feedback and have tried to address it in the manuscript accordingly. We start first with the definition of TOJ threshold in the second paragraph of the manuscript, before using this term later. We also repeat the definition of TOJ threshold later in the Methods and Results section.

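On page 4, line 17, we note:

“TOJ has been studied extensively, beginning with the seminal work of Hirsh [28] and Hirsh and Sherrick [29] who measured the amount of time between the onsets of two stimuli (tones, clicks, lights, and their combinations) necessary to correctly report their order. This measure, called the TOJ threshold, reflects the minimum amount of time separating the onsets of the two stimuli at which an individual can correctly identify the order of stimulus presentation 75% of the time.”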

In the Task and Stimuli section, page 9, line 14, we note:

“Dichotic TOJ thresholds were defined as the ISI necessary for 75 % accuracy, estimated using the best linear approximation of a psychometric function.”

In the Results section, page 11, line 20, we note:

“TOJ ISI thresholds, defined as the ISI necessary for 75% accuracy, were estimated using a linear function.”

In the Results section, page 1, line 5, we note:

“TOJ ISI thresholds (i.e., ISI required for 75% correct responses) for the older adult cohort…”

14. Pg. 3, top: “TOJ thresholds were found to be related to phonological skills [1-10] and to speech perception [2, 8,11-16].” What was the direction of the relationship? I assume that higher thresholds were associated with poorer phonological skills and poorer speech perception, but it would be helpful to state that directly.

Response. As suggested, the direction of the relationship between TOJ thresholds and phonological skills and speech perception was added.

This sentence (page 4, line 13) now states:

“Auditory TOJ paradigms have been used in studies of language skills over the last few decades, and these studies have reported that better TOJ performance is related to better phonological skills [10-12,16-22] and better speech perception [8,10,21,23-27].”

15. Pg. 3, middle: “use stimuli that differ in spectrum or in duration [17], in frequency (pitch)” I do not understand the distinction between spectrum and frequency. The spectra differ for sounds of two different frequencies. I suspect that intent is that ‘spectrum’ means the spectrum of a complex sound, but that is not clear from the text.

Response. Spectrum indeed refers to pure tone vs. noise. Frequency refers to the sounds’ specific pitch. As suggested, this information was added to enhance the clarity.

This sentence (page 4, line 10) now states:

“auditory TOJ paradigms use stimuli that differ either in: a) frequency (pitch)[1-8]; or b) spectrum (pure tone vs. noise) [9]; or c) duration [9]; or d) the ear of presentation, i.e., the ear that receives the first and the ear that receives the second stimulus (referred to as dichotic, spatial, or binaural TOJ)[1-2,5-7,9-15].”

16. Pg. 3, middle: “or in synchronicity, meaning which ear receives the first/second stimulus (2,5,9,17-19; 22-23,24-27]” I do not understand the term ‘synchronicity’ in this context. Would ‘ear of presentation’ work? Does synchronicity mean the same thing as dichotic TOJ?

Response. We appreciate the reviewer’s questions. The term ‘synchronicity’ indeed referred to dichotic TOJ, in which the two sounds are delivered asynchronously to the two ears. In order to provide greater clarity, we replaced the term “synchronicity” with “ear of presentation”. This sentence (page 3, lines 13-16) now states:

17. Pg. 3, bottom: “of the tones” what tones?

Response. This refers to the tones that constitute the TOJ task. To remove ambiguity, we have added a clarifying phrase.

On page 5, line 3, we note:

“In a previous study [1], we and others reported the sensitivity of the dichotic TOJ paradigm to methodological and stimulus parameters, specifically to stimulus duration. We considered the possibility that the two manipulations, tone duration and ISI, might affect perception differently, since increasing tone duration increases the amount of sound—thus, the amount of stimulation—at the two ears, while increasing the ISI increases the silent interval between these stimulations—i.e., the lack of stimulation.”

18. Experiment 1: Young Participants, Stimulus Duration 3 to 40 ms. Materials and Method

Pg. 5, bottom: “226 participants” It would be helpful to include the number of males and females. Were there any sex differences in performance? Pg. 6, top: “The age of the participants ranged from 20 – 35 years.” It would be helpful to add the mean and standard deviation of the ages. Pg. 6, top: Were the participants compensated for their participation?

Response. In response to these questions, more detailed information on the participants was added. The Participants section now states (page 8, line 17):

“Participants were 226 undergraduate students (136 females, 90 males), aged 20 – 35 years (mean = 25.5, SD = 2.8) who volunteered to participate in the study. The current analyses include participant data presented in the earlier paper (n = 65) [1] together with the data from an additional 161 participants (current study).”

19. Pg. 6, bottom: “ranging between 5-240 msec” It would be helpful to list the ISI values. Pg. 6, bottom: Were the participants given trial-by-trial feedback? Pg. 6, bottom: “to ascertain whether they perceived the order of the tones and correctly reported the ear stimulated (right or left)” The distinction between perceiving the order of the tones and correctly reporting the ear stimulated is not clear to me. How can one be done without the other? Is the idea that participants could tell that the two tones were presented in opposite ears, sequentially, but could not indicate which ear was first? Pg. 6: What was the level of the tones?

Response. The information requested by these questions was added to the manuscript accordingly.

The Task and stimuli section now states (page 9, line 6):

“We used the experimental design reported in Babkoff and Fostick [1]. In short, participants were presented with two 1 kHz pure tones at a level of 40 dB SL. The tones were presented asynchronously to the right and left ear, and participants were asked to report the order in which they heard them (either right-left or left-right). The tone duration for each participant was 3, 6, 8, 10, 15, 20, 30, or 40 msec, according to their assigned group (between subjects design). Rise/fall times were 1 msec. Eight different ISIs of 5, 10, 15, 30, 60, 90, 120, and 240 msec were randomly used. Each ISI value was repeated 16 times, producing 256 trials (8 ISIs × 2 presentation orders × 16 repetitions). After every 32 trials, participants received a short break. Dichotic TOJ thresholds were defined as the ISI necessary for 75 % accuracy, estimated using the best linear approximation of a psychometric function.”
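The trial structure and threshold definition quoted above can be sketched in code. This is an illustrative sketch only, not the software used in the study: the function names are hypothetical, the shuffling seed is arbitrary, and the threshold routine assumes an ordinary least-squares line through proportion correct as a function of ISI, since the fitting routine itself is not specified in the passage.

```python
import random
from itertools import product

# Values taken from the Task and Stimuli description above.
ISIS_MS = [5, 10, 15, 30, 60, 90, 120, 240]
ORDERS = ["right-left", "left-right"]
REPETITIONS = 16


def build_trials(seed=0):
    """Return a shuffled list of (isi_ms, order) trials (hypothetical helper)."""
    trials = [
        (isi, order)
        for isi, order in product(ISIS_MS, ORDERS)
        for _ in range(REPETITIONS)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def soa_ms(tone_duration_ms, isi_ms):
    """SOA is the duration of the leading tone plus the silent gap (ISI)."""
    return tone_duration_ms + isi_ms


def linear_threshold(isis_ms, proportion_correct, criterion=0.75):
    """ISI at which a least-squares line reaches the criterion accuracy.

    Assumption: a plain linear fit to proportion correct; the study's own
    fitting procedure may differ (e.g., it may operate on transformed data).
    """
    n = len(isis_ms)
    mean_x = sum(isis_ms) / n
    mean_y = sum(proportion_correct) / n
    sxx = sum((x - mean_x) ** 2 for x in isis_ms)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(isis_ms, proportion_correct)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (criterion - intercept) / slope
```

With this layout, `len(build_trials())` gives the 256 trials described in the text (8 ISIs × 2 orders × 16 repetitions), and `soa_ms` converts any ISI threshold into the corresponding SOA threshold for a given tone duration.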

20. Pg. 7 middle: “All participants were screened for hearing difficulties and their absolute threshold for 1 kHz was measured using the same computer and headphones that were used in the study.” Was the screening at 1 kHz separate from the screening for normal hearing mentioned in the ‘Participants’ section?

Response. Indeed, these were two separate procedures. To further clarify it in the manuscript we rephrased this sentence. The Procedure and stimuli section now states (page 10, line 8):

“Participants were screened for normal hearing prior to the experiment, after signed informed consent was obtained. In addition, their absolute threshold for 1 kHz was measured using the same computer and headphones that were used in the study.”

21. Pg. 7, bottom: “Mean ISI thresholds” How is the ISI threshold defined?

Response. Thresholds were defined as the ISI necessary for 75% accuracy, estimated using a linear function. This definition has now been added before describing the results of ISI thresholds. The second paragraph of the Results (page 11, line 20) now states:

“TOJ ISI thresholds, defined as the ISI necessary for 75% accuracy, were estimated using a linear function.”

22. Pg. 7, bottom: “Group mean data was tested against the predicted model and was found not to deviate significantly from this model (Probit analysis, Z = -3.13; p = .002).” What was the predicted model?

Response. The predicted model was that the ISI threshold decreases by the same amount as the tone duration increases. This clarification was added accordingly.

Page 11, line 1, now states:

23. Pg. 7, bottom: “Mean ISI thresholds for the data collected in the present study (3 – 8 ms) and the previous study (10 – 40 ms) ranged between 58.4 – 25.4 msec” The impression I get from this sentence (and elsewhere) is that all of the new data (n=161) are for 3-8 ms, and all of the previous data (n=65) are for 10-40 ms, but that does not fit with the n provided for each duration separately. Pg. 7, bottom: “ranged between 58.4—25.4 ms” In which direction?

Response. For the current study, additional participants were recruited for the 10 – 40 msec groups, as well as new participants for the 3 – 8 msec groups. As the reviewer correctly pointed out, the current wording gave an inaccurate impression, so we have now changed it, along with adding the corresponding direction.

Page 11, line 21, now states:

24. Pg. 7, bottom: Fig. 1B, the points above the ~-250 ms and +250 ms SOA are out of line with the rest of the data. I think this pattern deserves mention in the results section…presumably arises because of a flattening of performance, asymptotic performance at the longest ISI values. What is the outcome if these values are removed from the line fitting? My sense is that these values lead to underestimation of the ‘true’ slope. Is there a reason for using linear vs. log values?

Response. As suggested, we added a description of the data for these extreme SOA values, and repeated the analysis with them omitted. Page 11, line 14, now states:

“Notwithstanding, the points below and above SOAs of -200 and +200 msec were out of line with the rest of the data. This might be due to an asymptotic performance at the longest ISI values. Repeating the analysis without these values included resulted in a predictive value of 98.9% (y = 0.0103x - 6E-18).”

25. pg. 8, top “TOJ thresholds” How is the TOJ threshold defined? Is it the same as the dichotic TOJ threshold mentioned later in the same paragraph? Is it the same as the ISI threshold mentioned in the preceding paragraph?

Response. As suggested, TOJ thresholds (for dichotic TOJ measured in the present study) were defined in the beginning of the paragraph.

Page 11, line 20, now states:

“TOJ ISI thresholds, defined as the ISI necessary for 75% accuracy, were estimated using a linear function.”

26. Pg. 8, top: “The best linear fit to the means is drawn as a straight line (R² = 0.69, p < .001) and the predicted line based on y = a - bx is drawn as a dotted line.” What is the predicted line?

Response. The predicted line is based on y = a – bx, predicting a similar reduction in threshold as the increase in tone duration. This was added at the beginning of the paragraph and also repeated in the sentence referred to above.

On page 1, line 1, we now state:

“Group mean data are also plotted (Figure 3a) and were tested against a model that predicted a reduction in threshold for the same magnitude of increase in tone duration. The data were found not to deviate significantly from this model (probit analysis, Z = -3.13; p = .002). The best linear fit to the mean ISI thresholds (R2= 0.69, p<.001) is depicted in Figure 3a (see straight line) as is the predicted line (based on y = a – bx, predicting a similar reduction in ISI threshold as the increase in tone duration; see dotted line).”
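The predicted line can be read as a consequence of a constant SOA threshold. The short derivation below is a sketch under one assumption, taken from the Figure 3 caption in this letter: the SOA threshold equals the tone duration plus the ISI threshold. The symbols d (tone duration) and c (the constant SOA threshold) are introduced here for illustration only.

```latex
\begin{align}
  \mathrm{SOA}_{\mathrm{thr}} &= d + \mathrm{ISI}_{\mathrm{thr}}
    && \text{(definition of SOA)} \\
  \mathrm{SOA}_{\mathrm{thr}} &= c
    && \text{(hypothesis: constant across durations)} \\
  \Rightarrow\quad \mathrm{ISI}_{\mathrm{thr}} &= c - d
    && \text{(the form } y = a - bx \text{ with } a = c,\ b = 1\text{)}
\end{align}
```

Under this reading, each 1 msec added to the tone duration should lower the ISI threshold by 1 msec, which is the "similar reduction in ISI threshold as the increase in tone duration" described in the quoted passage.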

27. Pg. 8, bottom: “for a wide range of stimulus durations (3 – 40 msec)” I do not consider 3-40 ms to be a wide range of stimulus durations. I recommend deleting “for a wide range...” to the end of the sentence, and just stating the outcome.

Response. Changed as suggested. This sentence (page 13, line 3) now states:

“The data from Experiment 1 show that, for stimulus durations of 3 – 40 msec, young adults utilize the same cue for temporal processing from both the stimuli (tone duration) and from the silent gap between them (ISI).”

28. Pg. 9, top: “suggest that the time between the onset of two tones that is required for perceiving their order is constant (around 60 – 70 ms)” I think this estimate is quite interesting, and deserves to be emphasized in the paper, discussion, and possibly abstract. However, I think that it would be better to point it out first in the results section (I had to go back to the figure to work out how the value was determined).

Response. We adopted the reviewer’s suggestion and now mention this finding in the abstract, the Results, and the Discussion. We also repeated in the Discussion the exact numbers and the way they are calculated.

On page 2, line 14, we note:

Page 13, line 10, now states:

“Moreover, as the average dichotic TOJ threshold crosses the vertical axis at approximately 57 msec (as found in our previous study [1]) to 70 msec (as in the current study, Figure 3a and b), this suggests that the time between the onset of two tones that is required for perceiving their order is constant (around 60 – 70 msec).”

29. Experiment 2: Elderly Participants, Stimulus Duration 10 to 40 msec. Materials and Method. Pg. 9, middle: “elderly” (mentioned several times): I recommend using a different term, possibly “older”

Response. This suggestion was accepted and “older adults” is now used throughout the paper.

30. Pg. 9, middle: “Studies of temporal processing among the elderly, including those researching it in the context of speech, have reported deficiencies in performance compared to young adults [8,19-20,22,24,26,38-45]. These studies have demonstrated that aging adults require longer sound durations, ISIs, and gaps, than do young adults, in order to correctly perceive them.” It would be helpful to indicate which of the references in the first sentence are associated with each of the measures listed in the second sentence.

Response. As suggested, we specified the appropriate studies for each measure. This sentence (page 7, line 1) now states:

31. Pg. 9, middle: “Such findings support the notion of aging adults’ sensitivity both to tone duration and ISI” I do not understand this phrase. Is the intent that such findings indicate that aging adults are sensitive to both…??

Response. This indeed is what we were suggesting, according to the findings of previous studies. In light of the reviewer’s comment, however, we rephrased this sentence to be clearer. This sentence (page 7, line 5) now states:

“Such findings suggest that older adults might be sensitive both to tone duration and ISI.”

32. Pg. 10, top: “If so, we expected the pattern of older adults’ performance to be similar to those of young adults, indicating that although the thresholds will be longer, on average, there will be a “zero” line slope when thresholds are plotted as a function of SOA.” → “If so, the pattern of older adults’ performance should be similar to that of young adults, such that there should be a “zero” line slope when thresholds are plotted as a function of SOA.” I recommend stating in a separate sentence why the thresholds are expected to be longer.

Response. These suggestions were fully accepted. This paragraph, (page 7, line 13) now states:

“If the pattern of TOJ performance by older adults is similar to that of younger adults, a “zero” line slope would be expected when thresholds are plotted as a function of SOA. This “zero” line slope is expected for both populations, although the thresholds for older adults are expected to be longer than those of younger adults due to age-related temporal deficits among older adults. However, if older adults have greater difficulty processing shorter duration than longer duration tones, the slope of the line relating the dichotic TOJ threshold to tone duration should be significantly greater than “zero”, indicating a greater contribution to dichotic TOJ threshold of tone duration than just the increase in SOA.”

33. Pg. 10 bottom to pg. 11 top: I recommend taking out the subheads under Materials and Methods and just stating that Experiment 2 was the same as Experiment 1, except…(fill in the blanks).

Response. This suggestion was fully accepted. The Materials and Method section now includes the following (page 15, lines 3-9):

“Experiment 2 was conducted using the same methodology as Experiment 1, with the exception of: a) older participants and b) a duration range of 10-40 msec. A group of 98 participants (59 females, 39 males), aged 60 – 75 years (mean = 66.4, SD = 6.1), volunteered to participate in the study. Participants were divided into five groups, each of which was tested with only one tone duration, as follows: 10 msec (n = 16), 15 msec (n = 27), 20 msec (n = 17), 30 msec (n = 19), and 40 msec (n = 19). After providing signed informed consent, the participants were screened for age-normal hearing (hearing thresholds of 35 dB HL or less at 500, 1,000, 2,000, and 4,000 Hz). This was an inclusion criterion while hearing deficit was an exclusion criterion.”

34. Pg. 11 top: What does ‘resembled’ mean in this context? Were there a number of differences in the procedure between Experiment 1 and Experiment 2, but a vague similarity between the two experiments? I suspect not.

Response. The reviewer is correct, and this sentence has been omitted.

35. Pg. 12, middle: “the thresholds for aging adults were, on average, 33 ms longer than for the younger adults (an average SOA threshold of 99.6 ms for aging adults vs. 66.5 ms for the younger).” See comment above (pg. 9, top) about the estimate of the threshold. I recommend including that information first in the results section.

Response. As indicated in our previous response, we adopted the reviewer’s suggestion and now mention this finding in the abstract, the Results section, and the Discussion.

The abstract now states (page 2, line 16):

The Results now state (page 1, line 21):

“The point at which the average dichotic TOJ threshold (SOA) crosses the vertical axis in Figure 5b is 100 msec (probit analysis, Z = -2.74, p = .02), indicating that the TOJ threshold for the older adults is, on average, 33 msec longer than the young adults in Experiment 1 (99.6 msec vs. 66.5 msec, respectively).”

The Discussion now states (page 16, line 7):

36. Pg. 13, middle: “The present study aimed to evaluate the temporal mechanism of TOJ by testing the generalizability of the conclusion that dichotic TOJ is determined by stimulus onset asynchrony (SOA), the time separating the onset of the first tone to the onset of the second one. We tested this hypothesis by two different manipulations: 1) by extending the range of tone durations to include 3-10 msec on a population of young adults; 2) by performing identical testing on a population of older adults, about whom there is existing evidence of a general deficit in auditory temporal order judgment.” I think this set-up is much clearer than the one in the current introduction.

Response. We appreciate the reviewer’s positive feedback about this description. As suggested, we now utilize this wording in the Introduction.

The Introduction now states (page 8, line 1):

“The aim of the present study was, therefore, to test our previous conclusion that dichotic TOJ is determined by stimulus onset asynchrony (SOA)—the time separating the onset of the first tone from the onset of the second one [1]—a) by using stimulus parameters that test its boundaries, and b) to determine if it can be generalized. We operationalized this aim by implementing two different manipulations to our previous research model: a) extending the range of tone durations to include also very short tone durations (3 – 8 msec) thus testing TOJ thresholds with tone durations ranging from 3 – 40 msec (Experiment 1), and b) using the same experimental methodology as in the previous study [1] to test a population of older adults (Experiment 2).”

37. Figure 1. As I understand it, the figure shows the number of ‘left leading’ responses, but those responses increase on the right side of the figure, which is counterintuitive. At a minimum I recommend indicating left and right on the x axis. Replotting the data more intuitively would be better.

Response. While we agree with the reviewer’s perspective that the figure design is somewhat counterintuitive, we would like to preserve this design in order to maintain consistency with figures in our previous paper (2013). However, to address the reviewer’s recommendation, we have now added indications of ‘Right leading’ and ‘Left leading’ on the x-axis, as we did in our previous paper, to enhance clarity.

38. Figure 1 top panel

1) move the lower half of y axis values to the opposite side of the axis so the values are not obscured by the lines (or possibly move the entire y axis to the left of the figure)

2) order the colors of the lines systematically, so the relationship between tone duration and performance is easier to unpack

3) add the n per group to the figure, possibly under the tone duration…or at least to the caption

4) give an estimate of the error…one possibility would be to plot one point and a mean error bar for each duration in the upper left quadrant

5) add schematic diagrams illustrating the distinction between panel a and panel b

Bottom panel

1) use the same (revised) colors from the top panel for the points in this panel, for continuity, and to help the reader see the connection between the two panels

Figure 4

Same comments as for Figure 1.

Top panel

1) use the same (revised) colors for the different durations as in Fig. 1, for continuity

Figures 2,3 and 5,6

Same basic comments as for Figure 1

I recommend combining Figures 2 and 3 into one two-panel figure, and combining Figures 5 and 6 into another two-panel figure.

Response. All suggestions were accepted fully. The revised figures are as follows (starting from Figure 2, as a new Figure 1 was added to illustrate the study design):

[Figure 2 panels: a. Data by duration and ISI; b. Data by SOA]

Figure 2. Psychometric function of probit-transformed data for the proportion of 'left leading' responses of young adult participants across eight different stimulus durations for: (a) each stimulus duration by ISI; and (b) all data by SOA. Schematic diagrams illustrating the distinction between ISI and SOA appear in each panel.

[Figure 3 panels: a. ISI thresholds; b. SOA thresholds]

Figure 3. TOJ thresholds of young adult participants across eight different stimulus durations for: (a) ISI thresholds (silent gap between the offset of the stimulus at the leading ear and the onset of the stimulus at the lagging ear); and (b) SOA thresholds (duration of the stimulus at the leading ear added to the ISI threshold).

[Figure 4 panels: a. Data by duration and ISI; b. Data by SOA]

Figure 4. Psychometric function of probit-transformed data for the proportion of 'left leading' responses of older participants across five different stimulus durations for: (a) each stimulus duration by ISI; and (b) all data by SOA. Schematic diagrams illustrating the distinction between ISI and SOA appear in each panel.

[Figure 5 panels: a. ISI thresholds; b. SOA thresholds]

Figure 5. TOJ thresholds of older participants plotted as a function of stimulus duration for (a) ISI thresholds (silent gap between the offset of the stimulus at the leading ear and the onset of the stimulus at the lagging ear); and (b) SOA thresholds (duration of the stimulus at the leading ear added to the ISI threshold).

39. Comment. Grammar and word choice: