PLOS One. 2021 Dec 16;16(12):e0261129. doi: 10.1371/journal.pone.0261129

Visual field differences in temporal synchrony processing for audio-visual stimuli

Yasuhiro Takeshima 1,*
Editor: Deborah Apthorp2
PMCID: PMC8675747  PMID: 34914735

Abstract

Audio-visual integration relies on temporal synchrony between visual and auditory inputs. However, differences in traveling and transmitting speeds between visual and auditory stimuli exist; therefore, audio-visual synchrony perception exhibits flexible functions. The processing speed of visual stimuli affects the perception of audio-visual synchrony. The present study examined the effects of visual fields, in which visual stimuli are presented, for the processing of audio-visual temporal synchrony. The point of subjective simultaneity, the temporal binding window, and the rapid recalibration effect were measured using temporal order judgment, simultaneity judgment, and stream/bounce perception, because different mechanisms of temporal processing have been suggested among these three paradigms. The results indicate that auditory stimuli should be presented earlier for visual stimuli in the central visual field than in the peripheral visual field condition in order to perceive subjective simultaneity in the temporal order judgment task conducted in this study. Meanwhile, the subjective simultaneity bandwidth was broader in the central visual field than in the peripheral visual field during the simultaneity judgment task. In the stream/bounce perception task, neither the point of subjective simultaneity nor the temporal binding window differed between the two types of visual fields. Moreover, rapid recalibration occurred in both visual fields during the simultaneity judgment tasks. However, during the temporal order judgment task and stream/bounce perception, rapid recalibration occurred only in the central visual field. These results suggest that differences in visual processing speed based on the visual field modulate the temporal processing of audio-visual stimuli. Furthermore, these three tasks, temporal order judgment, simultaneity judgment, and stream/bounce perception, each have distinct functional characteristics for audio-visual synchrony perception. 
Future studies are necessary to confirm the effects of compensation regarding differences in the temporal resolution of the visual field in later cortical visual pathways on visual field differences in audio-visual temporal synchrony.

Introduction

In daily life, we perceive external environments using multisensory information. Temporal synchrony is critical for the integration of different sensory stimuli. In audio-visual stimuli, synchronously presented sounds improve visual performance. For example, visual target detectability is facilitated by simultaneous sounds in the backward masking paradigm [1]. Moreover, improvements in visual performance when accompanied by simultaneous sounds have been reported in various experimental tasks [2–4].

Temporal synchrony for audio-visual inputs is based on matching the salient temporal features of each sensory modality [5–7]. Humans perceive temporal synchrony between visual and auditory inputs, despite natural lags in arrival and processing times [8]. Thus, synchrony perception is malleable for audio-visual stimuli. For example, the point of subjective simultaneity (PSS) for light and sound, which is the asynchrony perceived as maximally simultaneous, differs based on viewing distance [9–11]. Moreover, a tolerant temporal binding window (TBW) has been found with regard to audio-visual simultaneity perception [12], extending roughly 100–200 ms for sounds that precede or follow the visual stimulus [13, 14].

Temporal synchrony perception for audio-visual stimuli can be adapted to temporal lags between sensory inputs. Fujisaki, Shimojo, Kashino, and Nishida [15] reported that repeated exposure to audio-visual asynchrony shifts the PSS toward the leading stimuli, which is known as temporal recalibration. Temporal recalibration has been observed in both the temporal order judgment (TOJ) task [16] and Fujisaki et al.’s [15] simultaneity judgment (SJ) task. Moreover, van der Burg, Alais, and Cass [17] showed temporal recalibration without an adaptation period, which they termed “rapid recalibration.” In rapid recalibration, the PSS of the current trial is computed contingent upon the audio-visual asynchrony of the preceding trial. Rapid recalibration exhibits larger transient effects compared with typical recalibration with a cross-modal adaptation procedure [18].

Temporal synchrony perception for audio-visual stimuli is affected by the processing speed of the visual system. Previous studies have reported that the PSS for audio-visual stimuli shifts toward larger auditory leads for low spatial frequency stimuli than for high spatial frequency stimuli [19, 20]. Visual systems are composed of at least two spatio-temporal channels, namely, transient and sustained channels, each having different temporal resolutions (processing speeds) [21]. Transient channels respond to low spatial frequencies and exhibit high temporal resolution, whereas sustained channels respond to high spatial frequencies and exhibit low temporal resolution [22]. Therefore, auditory stimuli should be presented earlier for low spatial frequency stimuli than for high spatial frequency stimuli for subjective simultaneity to be perceived. The two channels also differ in their retinal distribution: central vision has a low temporal resolution, while peripheral vision has a high temporal resolution [23]. Furthermore, there is a difference in visual latency between the central and peripheral visual fields (VFs). The response time for the visual stimulus reflects the difference in visual latency [22], and the response time is shorter at the central VF than at the peripheral VF [24]. Thus, processing speed differs between the central and peripheral VFs. The present study examined the effects of visual temporal resolution on audio-visual synchrony perception by manipulating the VF in which the visual stimulus was presented (i.e., eccentricity). If synchrony perception for audio-visual stimuli follows differences in temporal resolution, then the TBW of the central VF would be wider than that of the peripheral VF for tolerant synchrony perception, because the timing perception of visual presentation is ambiguous in the central VF due to low temporal resolution. However, if it follows the difference in visual latency, then the PSS of the central VF would be lower than that of the peripheral VF, as in previous studies that manipulated the spatial frequency of visual stimuli [19, 20].

The current study examines the difference in audio-visual synchrony perception between central and peripheral VFs in a multifaceted manner. Kopinska and Harris [10] compared the PSS between central (0° eccentricity) and peripheral (20° eccentricity) VFs using the TOJ task by manipulating both the eccentricity of the visual stimulus and participants’ viewing distance and showed that the timing of subjective simultaneity did not differ between VFs. However, their study did not employ other paradigms to measure PSS nor did it measure the TBW. Therefore, in this study, the PSS and TBW for audio-visual stimuli were measured using TOJ, SJ, and stream/bounce (SB) perception [25]. The underlying mechanisms differ between the TOJ and SJ tasks. The TOJ task reflects the temporal discrimination processes, whereas the SJ task reflects the temporal binding processes [26]. Furthermore, the differences between the TOJ and SJ tasks stem from their decisional and response processes [27]. Van Eijk, Kohlrausch, Juola, and van de Par [28] have shown a lack of correlation between SJ and TOJ PSS, and proposed a different type of sensitivity between TOJ and SJ for audio-visual asynchrony. Therefore, it is necessary to use both TOJ and SJ tasks to examine audio-visual temporal synchrony. Additionally, this study used SB perception as an implicit method to measure the PSS and TBW. In the SB display, two identical circles moving across each other can be perceived either to bounce off or to stream through each other. Sekuler et al. [25] reported that a brief sound biases perception toward bouncing. The bouncing perception increases when a brief sound is simultaneously presented at the moment the circles coincide [29, 30]. Apparent causality between visual and auditory events, as in SB perception, affects audio-visual synchrony perception in early multisensory integration processes [31].
The present study also compared PSS and TBW values measured using explicit (TOJ and SJ tasks) and implicit (SB perception) methods.

In addition, the differences in the effects of rapid temporal recalibration between SJ and TOJ tasks were investigated to confirm the effects of VFs on audio-visual temporal processing in this study. Roseboom [32] showed the opposite change in PSS due to rapid temporal recalibration between SJ and TOJ tasks: the PSS changed in the same direction as the stimulus onset asynchrony (SOA) of previous trials in the SJ task and in the opposite direction in the TOJ task. Therefore, this difference in the rapid temporal recalibration between SJ and TOJ tasks was also investigated. Using the SJ task, Takeshima [33] reported that normal rapid temporal recalibration occurred regardless of a difference in visual processing speed based on spatial frequency. The present study predicts that normal rapid temporal recalibration would be observed in both the central and peripheral VFs in the SJ task. In the TOJ task, as in Roseboom [32], temporal recalibration in the opposite direction to that in the SJ task is predicted to occur. This study also explores rapid temporal recalibration using SB perception, since previous studies have not examined this using the paradigm under investigation.

Experiment 1

Experiment 1 measured and compared PSS and TBW values between the central and peripheral VFs in the TOJ task. Moreover, the process of rapid recalibration was investigated for both the VFs.

Materials and methods

Ethics statement

All experiments reported in this paper were approved by the ethics committee of the Department of Psychology, Doshisha University, and were conducted in accordance with the Declaration of Helsinki.

Participants

Twenty volunteers (18 women and 2 men; mean age = 19.95 ± 1.81 years) participated in Experiment 1. To determine the sample size needed for this study, PANGEA (https://jakewestfall.shinyapps.io/pangea/) was used to calculate the power (1-β) to detect a two-way interaction with the following parameters: effect size (d) = 0.45, variance of error = 0.333, variance of two-way interaction = 0.083, and the number of condition repetitions = 24. PANGEA indicated a power of 0.86 for a sample size of 16 participants. Therefore, participants were recruited from a university lecture held at Doshisha University to meet this sample size. All participants orally reported normal or corrected-to-normal vision and normal audition. Participants were given 500 Japanese yen for their participation and provided written informed consent prior to participation.

Apparatus

Stimuli were generated and controlled using a custom-made program written with MATLAB (The MathWorks, Inc.), Psychtoolbox [34–36], and a laptop personal computer (MacBook Pro, Apple). The visual stimuli were displayed on a 21-inch cathode-ray tube monitor (Trinitron CPD-G520, Sony; resolution: 1024 × 768 pixels; refresh rate: 100 Hz). The auditory stimuli were conveyed using an audio interface (Clarett 2Pre, Focusrite) and headphones (MDR-CD900ST, Sony). The simultaneity of the visual and auditory stimuli was confirmed using a digital oscilloscope (DS-5424A, Iwatsu). The experiment was conducted in a darkened room with background noise levels of 39.8 dB (A). Participants viewed the monitor binocularly at a distance of 70 cm with their heads stabilized on a chin rest.

Stimuli

A white (86.64 cd/m²) fixation cross (0.5 × 0.5°) and a circle (visual stimulus: 1.5° diameter) were presented on a black (0 cd/m²) background. The fixation cross was presented at the center of the screen. The circle was presented at one of the following three locations: the center of the screen, or 10° to the left or right of the fixation stimulus. The duration of the target presentation was 10 ms. The auditory stimuli consisted of pure tones with a frequency of 500 Hz, a sound pressure level of 55 dB (A), and a duration of 10 ms (including ramp times of 2 ms at the start and end of the sound wave envelope). The onset times of the visual and auditory stimuli were synchronized. The following 10 SOAs were established between the visual and auditory stimuli: ±510, ±260, ±130, ±50, and ±10 ms (negative SOAs indicate that the auditory stimulus was presented prior to the visual stimulus, and vice versa).
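As a rough illustration of the stimulus geometry, the visual angles above can be converted to physical extents on the screen at the reported 70 cm viewing distance. The sketch below assumes a flat screen viewed head-on; the function names are hypothetical, and no pixel conversion is attempted because the visible screen dimensions are not reported.

```python
import math

VIEWING_DISTANCE_CM = 70.0  # reported in the Apparatus section
REFRESH_RATE_HZ = 100.0     # reported in the Apparatus section


def size_on_screen_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Extent of a stimulus subtending angle_deg, centred on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def eccentricity_offset_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Offset from fixation of a target at angle_deg eccentricity (flat screen assumed)."""
    return distance_cm * math.tan(math.radians(angle_deg))


def frames_for(duration_ms, refresh_hz=REFRESH_RATE_HZ):
    """Number of video frames spanned by a duration at the given refresh rate."""
    return duration_ms * refresh_hz / 1000.0


circle_cm = size_on_screen_cm(1.5)        # 1.5 deg circle
cross_cm = size_on_screen_cm(0.5)         # 0.5 deg fixation cross
offset_cm = eccentricity_offset_cm(10.0)  # 10 deg peripheral location
print(f"circle: {circle_cm:.2f} cm, cross: {cross_cm:.2f} cm, offset: {offset_cm:.2f} cm")
print(f"10 ms target = {frames_for(10):.0f} frame(s) at 100 Hz")
```

Under these assumptions, the 10 ms target corresponds to a single frame at 100 Hz, and every SOA in the set is a whole number of frames.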

Procedure

A schematic of the trial design is presented in Fig 1. The trials were initiated by pressing the 0 key. Each trial consisted of a 500 ms fixation period followed by blank and target displays. The duration of this blank display was a fixed length of 500 ms, plus an additional SOA when the auditory stimulus was presented prior to the visual stimulus (i.e., 500 ms at the shortest and 1010 ms at the longest). During the target display period, a circle was presented at one of the following three locations: center, left, or right. The tone either preceded or followed the onset of the circle using the SOA that had been drawn randomly from the set. After the target display, a blank display was again presented. The duration of this blank display was a fixed length of 200 ms, plus an additional SOA when the visual stimulus was presented prior to the auditory stimulus (i.e., 200 ms at the shortest and 710 ms at the longest). Then, participants were instructed to judge the temporal order by pressing the 1 key for vision leads and the 3 key for audition leads. First, the participants performed 30 practice trials. The participants then completed 480 trials that were divided into six blocks. One block consisted of 80 trials, with eight trials for each SOA condition. Half the trials of one block had visual stimuli presented at the central VF, while the remaining half had visual stimuli presented at the peripheral VF (i.e., 24 repetitions for each experimental condition). Additionally, in the peripheral VF condition, the visual stimulus was presented to the left or right of the fixation stimulus with equal frequency. Participants took short breaks between the blocks.
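The timing rules and trial counts described above can be checked with a short sketch. The constant and function names are hypothetical; only the durations and counts come from the text.

```python
PRE_BLANK_BASE_MS = 500   # fixed part of the blank before the target display
POST_BLANK_BASE_MS = 200  # fixed part of the blank after the target display
SOAS_MS = [-510, -260, -130, -50, -10, 10, 50, 130, 260, 510]


def blank_durations(soa_ms):
    """Return (pre_target_blank_ms, post_target_blank_ms) for one trial.

    A negative SOA (tone first) lengthens the blank before the circle;
    a positive SOA (circle first) lengthens the blank after it.
    """
    pre = PRE_BLANK_BASE_MS + max(0, -soa_ms)
    post = POST_BLANK_BASE_MS + max(0, soa_ms)
    return pre, post


N_BLOCKS = 6
TRIALS_PER_SOA_PER_BLOCK = 8
N_VISUAL_FIELDS = 2  # central vs. peripheral

trials_per_block = len(SOAS_MS) * TRIALS_PER_SOA_PER_BLOCK
total_trials = trials_per_block * N_BLOCKS
repetitions_per_condition = total_trials // (len(SOAS_MS) * N_VISUAL_FIELDS)

for soa in (-510, -10, 10, 510):
    print(soa, blank_durations(soa))
print(trials_per_block, total_trials, repetitions_per_condition)
```

The extreme SOAs reproduce the ranges stated in the text (500 to 1010 ms before the target and 200 to 710 ms after it), and the design yields 24 repetitions per SOA and VF combination.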

Fig 1. The schematic representation of the procedure used in the temporal order judgment task.

Results

The proportion of vision lead responses was calculated for each condition. To compute the PSS and sigma (i.e., TBW), the cumulative Gaussian function was fitted to each participant’s data using the maximum-likelihood method:

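$$P(\mathrm{response} \mid \mathrm{SOA}) = \frac{1}{1 + e^{\left[-\frac{1}{\mathrm{sigma}}\left(\mathrm{SOA} - \mathrm{PSS}\right)\right]}}$$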

The SOA values matched those of the experimental conditions (from -510 to +510 ms). The PSS and sigma parameters were estimated from the fit. The sigma value was restricted to values greater than 0. One participant was excluded from further analysis, because their computed sigma value was large (> 600). The results of the analysis are shown in Fig 2A, which shows the mean percentages of vision lead responses as a function of VF and SOA with fitted psychometric functions (center: Mean RMSE = 0.10 ± 0.03, periphery: Mean RMSE = 0.09 ± 0.03). The PSS and sigma results are shown in Fig 2B and 2C. The PSS was smaller in the center condition than in the periphery condition, t (18) = 2.33, p < .05, d = 0.51, whereas sigma did not differ between the center and periphery conditions, t (18) = 1.09, p = .29, d = 0.13.
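A minimal sketch of such a maximum-likelihood fit is given below. It reads the psychometric function as 1 / (1 + exp(-(SOA - PSS) / sigma)), which is a logistic form, and it uses a coarse-to-fine grid search as a stand-in optimiser. The optimiser, the search bounds, and the synthetic observer are all assumptions made for illustration and are not taken from the study.

```python
import math

SOAS_MS = [-510, -260, -130, -50, -10, 10, 50, 130, 260, 510]


def p_vision_lead(soa, pss, sigma):
    """Probability of a 'vision leads' response at a given SOA."""
    z = max(-700.0, min(700.0, (soa - pss) / sigma))  # keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def neg_log_likelihood(pss, sigma, soas, n_vision, n_total):
    """Binomial negative log-likelihood of the 'vision leads' counts."""
    nll = 0.0
    for soa, k, n in zip(soas, n_vision, n_total):
        p = min(max(p_vision_lead(soa, pss, sigma), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_toj(soas, n_vision, n_total, n_rounds=10, n_grid=21):
    """Coarse-to-fine grid search over PSS and sigma (assumed bounds)."""
    lo, hi = [-510.0, 1.0], [510.0, 600.0]
    best = None
    for _ in range(n_rounds):
        steps = [(h - l) / (n_grid - 1) for l, h in zip(lo, hi)]
        pss_grid = [lo[0] + i * steps[0] for i in range(n_grid)]
        sigma_grid = [lo[1] + i * steps[1] for i in range(n_grid)]
        best = min(
            (neg_log_likelihood(p, s, soas, n_vision, n_total), p, s)
            for p in pss_grid
            for s in sigma_grid
        )
        # Zoom in around the current best point; sigma stays strictly positive.
        lo = [best[1] - 2 * steps[0], max(1e-3, best[2] - 2 * steps[1])]
        hi = [best[1] + 2 * steps[0], best[2] + 2 * steps[1]]
    return best[1], best[2]


# Synthetic observer with hypothetical parameters (not data from the study).
TRUE_PSS, TRUE_SIGMA, N_PER_SOA = -30.0, 80.0, 24
n_total = [N_PER_SOA] * len(SOAS_MS)
n_vision = [round(N_PER_SOA * p_vision_lead(s, TRUE_PSS, TRUE_SIGMA)) for s in SOAS_MS]
pss_hat, sigma_hat = fit_toj(SOAS_MS, n_vision, n_total)
print(f"PSS = {pss_hat:.1f} ms, sigma = {sigma_hat:.1f} ms")
```

In this parameterisation, a negative PSS means that the tone must lead the visual stimulus for the two response categories to be equally likely, and a smaller sigma corresponds to a steeper function.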

Fig 2. Results of synchrony perception analyses in Experiment 1.

(a) Mean percentage of vision lead responses. The average fitting functions are plotted for the central and peripheral visual field data. (b) Mean estimated point of subjective simultaneity. (c) Mean estimated sigma. Error bars represent standard errors of the mean (n = 19).

An inter-trial analysis was conducted to examine whether the modality order in a given previous trial (Trial t-1) affected the distribution of vision lead responses in the current trial (Trial t). The VFs in the previous trial (Trial t-1) were not split because the correspondence of the spatial location between the current (Trial t) and previous (Trial t-1) trials did not affect the rapid recalibration [37]. Further, the distribution of perceived vision leads as a function of SOA was compiled for each participant and each VF separately, given the cases in which trial t-1 exhibited either a negative SOA (i.e., audition leads) or positive SOA (i.e., vision leads). The total distributions were then fitted to a cumulative Gaussian function. The results are shown in Fig 3A and 3B, which demonstrate the mean percentages of vision lead responses as a function of VF and SOA with fitted psychometric functions for both modality orders (center/audition leads: Mean RMSE = 0.12 ± 0.04, center/vision leads: Mean RMSE = 0.10 ± 0.05, periphery/audition leads: Mean RMSE = 0.11 ± 0.03, periphery/vision leads: Mean RMSE = 0.11 ± 0.04). The recalibration shifts are summarized in Fig 3C, which plots the PSS for both modality orders and the two different VFs. For the PSS, a two-way analysis of variance (ANOVA) with modality order (2) and VF (2) was conducted. The results revealed a significant main effect of VF, F (1, 18) = 5.55, p < .05, ηp2 = .24, and a two-way interaction, F (1, 18) = 14.14, p < .01, ηp2 = .44. The simple main effect of the modality order was significant at the central VF, F (1, 18) = 4.94, p < .05, ηp2 = .22, which indicates that the PSS was lower in the audition leads condition than in the vision leads condition. However, this simple main effect was not significant for the peripheral VF, F (1, 18) = 2.74, p = .12, ηp2 = .13. 
Moreover, the simple main effect of VF was significant in the audition leads condition, F (1, 18) = 12.71, p < .01, ηp2 = .41, which indicates that the PSS was lower in the center than in the periphery. However, this simple main effect was not significant in the vision leads condition, F (1, 18) = 0.48, p = .50, ηp2 = .03. The main effect of modality order was not significant, F (1, 18) = 0.13, p = .77, ηp2 = .01. The sigma values for both modality order and the two different VFs are shown in Fig 3D. For sigma, a two-way ANOVA with modality order (2) and VF (2) was conducted. The results revealed a significant main effect of modality order, F (1, 18) = 11.91, p < .01, ηp2 = .40, which indicates that the sigma of the vision leads was lower than that of the audition leads. However, the main effect of VF, F (1, 18) = 1.15, p = .30, ηp2 = .06, and the two-way interaction, F (1, 18) = 2.87, p = .11, ηp2 = .14, were not significant.
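The trial-splitting step of the inter-trial analysis can be sketched as follows. The trial record format (keys `soa`, `vf`, `response`) and the function name are hypothetical. The sketch also treats the trials as one continuous sequence, which is an assumption because the handling of block boundaries is not described in the text.

```python
from collections import defaultdict


def split_by_previous_soa(trials):
    """Tally current-trial responses by current VF and previous-trial modality order.

    trials: list of dicts in presentation order, each with
        'soa' (ms; negative = audition leads), 'vf' ('center' or 'periphery'),
        and 'response' (1 = 'vision leads', 0 = 'audition leads').
    Returns {(vf, previous_order): {soa: [n_vision_lead, n_trials]}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for prev, cur in zip(trials, trials[1:]):
        previous_order = "audition_leads" if prev["soa"] < 0 else "vision_leads"
        cell = counts[(cur["vf"], previous_order)][cur["soa"]]
        cell[0] += cur["response"]
        cell[1] += 1
    return counts


# Tiny hypothetical sequence (not data from the study).
trials = [
    {"soa": -130, "vf": "center", "response": 0},
    {"soa": 50, "vf": "periphery", "response": 1},
    {"soa": 10, "vf": "center", "response": 1},
    {"soa": 50, "vf": "periphery", "response": 0},
]
counts = split_by_previous_soa(trials)
for key, by_soa in sorted(counts.items()):
    print(key, dict(by_soa))
```

Each of the four resulting distributions (two VFs by two preceding modality orders) would then be fitted separately, and the difference in PSS between the two preceding orders gives the recalibration shift. The previous trial is not split by its own VF, in line with the text.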

Fig 3. Results of rapid recalibration analyses in Experiment 1.

Mean percentage of vision lead responses in the (a) central and (b) peripheral visual fields. The average fitting functions are plotted for preceding trials of audition leads and vision leads data. (c) Mean estimated point of subjective simultaneity. (d) Mean estimated sigma. Error bars represent standard errors of the mean (n = 19).

Discussion

Experiment 1 compared the synchrony perception and rapid recalibration processes between VFs using the TOJ task. The PSS was smaller at the center than at the periphery, whereas the TBW did not differ between VFs. Moreover, normal rapid recalibration occurred in the central VF but not in the peripheral VF. In addition, the TBW was narrower when the preceding trial was a vision leads trial than when it was an audition leads trial.

The difference in the PSS observed between the central and peripheral VFs was consistent with the difference in visual latency between the central and peripheral VFs. The response time for the visual stimulus was shorter in the central VF than in the peripheral VF [24]. Previous studies that manipulated spatial frequency also observed differences in the PSS consistent with the differences observed in response time [19, 29]. The difference in visual latency is reflected in the response time of the visual stimulus [22]. Therefore, in a TOJ task, differences in the PSS between the central and peripheral VFs could be attributed to differences in visual latency rather than temporal resolution.

The results of normal rapid recalibration for the central VF were inconsistent with the predictions based on previous studies. Roseboom [32] showed that the PSSs shifted in the direction opposite to normal rapid temporal recalibration in a TOJ task. Moreover, Keane, Bland, Matthews, Carroll, and Wallis [38] found that opposite-directed PSS shifts were induced by choice-repetition bias in a TOJ task. Choice-repetition bias refers to the tendency to repeat judgments of the temporal order of a previous trial in a current trial. Additionally, rapid recalibration was obscured by the opposite-directed PSS shifts caused by the choice-repetition bias [38]. It is possible that the choice-repetition bias was suppressed in the central VF in this study. Low information reliability induces a larger choice-repetition bias [39, 40]. If so, the reliability of judging temporal order for audio-visual stimuli would be high in the central VF. This speculation needs to be further investigated.

Moreover, the difference in the TBW between audition and vision leads during the rapid recalibration process is a novel finding. In Experiment 1, the TBW was narrower following a vision leads trial than following an audition leads trial. A narrow TBW indicates high sensitivity for judging the temporal order between visual and auditory stimuli. Therefore, this finding suggests that the temporal information of visual precedence in a previous trial increases sensitivity in a TOJ task for audio-visual stimuli. In previous studies of rapid recalibration (e.g., [17, 28, 36]), the difference in TBW width between the audition and vision leads conditions has not been investigated. Such a difference in the TBW was not predicted in this study, and this needs to be examined in more detail in the future.

Experiment 2

Experiment 2 measured and compared PSS and TBW between the central and peripheral VFs in the SJ task. Moreover, the process of rapid recalibration was investigated in both VFs to provide robust evidence.

Materials and methods

Participants

Twenty-one volunteers (14 women and 7 men; mean age = 21.57 ± 2.17 years) participated in Experiment 2. The required sample size was determined based on the same criterion as in Experiment 1, and participants were recruited to meet this sample size from a university lecture held at Doshisha University. All participants orally reported normal or corrected-to-normal vision and normal audition. Participants were compensated with 500 Japanese yen for their participation and provided written informed consent prior to participation.

Stimuli

The same visual and auditory stimuli as in Experiment 1 were used. The onset times of the visual and auditory stimuli were synchronized. The following nine SOAs were present between the visual and auditory stimuli: ±510, ±260, ±130, ±50, and 0 ms (negative SOAs indicate that the auditory stimulus was presented prior to the visual stimulus, and vice versa).

Procedure

The trial sequence was nearly the same as that of Experiment 1, except for the response period. During the response period, participants were instructed to press the 1 key for simultaneity and the 3 key for asynchrony. First, the participants performed 30 practice trials. The participants then completed 480 trials that were divided into six blocks. One block comprised 80 trials, with eight trials for each SOA condition (i.e., 16 trials for 0 ms SOA and eight trials for other SOAs). For half of the trials of one block, visual stimuli were presented in the central VF, and for the other half, visual stimuli were presented at the peripheral VF. Additionally, in the peripheral VF condition, visual stimuli were presented to the left or right of the fixation stimulus with equal frequency. Participants took short breaks between the blocks.

Results

The proportion of simultaneity responses was calculated for each condition. To compute the alpha, PSS, and sigma values, a three-parameter Gaussian function was fitted to each participant’s data by minimizing the root-mean-square error (RMSE):

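$$P(\mathrm{response} \mid \mathrm{SOA}) = \mathrm{Alpha} \cdot e^{\left[-0.5\left(\frac{\mathrm{SOA} - \mathrm{PSS}}{\mathrm{Sigma}}\right)^{2}\right]}$$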

The SOA parameter was equal to that of the experimental conditions (from -510 to +510 ms). The alpha, PSS, and sigma parameters were estimated, and these parameters indicated the height, peak position on the SOA axis, and the width of the Gaussian function, respectively. The sigma value was restricted to greater than 0, and the alpha value was restricted between 0 and 1. One participant was excluded from further analysis because their computed sigma value was large (> 600). The results are shown in Fig 4A, which presents the mean percentages of simultaneity responses, as a function of VF and SOA with fitted psychometric functions (center: Mean RMSE = 0.07 ± 0.03, periphery: Mean RMSE = 0.07 ± 0.03). Moreover, PSS and sigma are shown in Fig 4B and 4C. The PSS value did not differ between the center and periphery conditions, t (19) = 0.22, p = .83, d = 0.02, whereas the sigma value was larger in the center condition than in the periphery condition, t (19) = 4.99, p < .001, d = 0.41. A difference in alpha value was also not observed between the center (M = 0.99 ± 0.03) and periphery (M = 0.97 ± 0.06) conditions, t (19) = 1.50, p = .15, d = 0.40.
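The three-parameter fit can be sketched along the same lines. The function is read here as Alpha · exp(-0.5 · ((SOA - PSS) / Sigma)²). The coarse-to-fine grid search, the search bounds for PSS and sigma, and the synthetic proportions are assumptions for illustration only; the constraints 0 ≤ alpha ≤ 1 and sigma > 0 follow the text.

```python
import itertools
import math

SOAS_MS = [-510, -260, -130, -50, 0, 50, 130, 260, 510]


def p_simultaneous(soa, alpha, pss, sigma):
    """Gaussian-shaped probability of a 'simultaneous' response."""
    return alpha * math.exp(-0.5 * ((soa - pss) / sigma) ** 2)


def rmse(params, soas, proportions):
    """Root-mean-square error between predicted and observed proportions."""
    alpha, pss, sigma = params
    squared = [
        (p_simultaneous(s, alpha, pss, sigma) - y) ** 2
        for s, y in zip(soas, proportions)
    ]
    return math.sqrt(sum(squared) / len(squared))


def fit_sj(soas, proportions, n_rounds=10, n_grid=11):
    """Coarse-to-fine grid search over (alpha, PSS, sigma) minimising the RMSE."""
    floor = [0.0, -510.0, 1e-3]    # alpha >= 0, sigma > 0; PSS bound is assumed
    ceiling = [1.0, 510.0, 600.0]  # alpha <= 1; PSS and sigma bounds are assumed
    lo, hi = [0.0, -510.0, 1.0], list(ceiling)
    best = None
    for _ in range(n_rounds):
        steps = [(h - l) / (n_grid - 1) for l, h in zip(lo, hi)]
        grids = [[l + i * st for i in range(n_grid)] for l, st in zip(lo, steps)]
        best = min(itertools.product(*grids), key=lambda p: rmse(p, soas, proportions))
        # Zoom in around the current best point without leaving the allowed range.
        lo = [max(f, b - 2 * st) for f, b, st in zip(floor, best, steps)]
        hi = [min(c, b + 2 * st) for c, b, st in zip(ceiling, best, steps)]
    return best


# Noise-free synthetic proportions with hypothetical parameters.
TRUE_ALPHA, TRUE_PSS, TRUE_SIGMA = 0.95, 20.0, 120.0
proportions = [p_simultaneous(s, TRUE_ALPHA, TRUE_PSS, TRUE_SIGMA) for s in SOAS_MS]
alpha_hat, pss_hat, sigma_hat = fit_sj(SOAS_MS, proportions)
print(f"alpha = {alpha_hat:.2f}, PSS = {pss_hat:.1f} ms, sigma = {sigma_hat:.1f} ms")
```

Here the PSS is the location of the peak rather than a 50% crossing point, and sigma indexes the width of the window, which is why the same parameter names carry different meanings in the TOJ and SJ fits.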

Fig 4. Results of synchrony perception analyses in Experiment 2.

(a) Mean percentage of simultaneous responses. The average fitting functions are plotted for the central and peripheral visual field data. (b) Mean estimated point of subjective simultaneity. (c) Mean estimated sigma. Error bars represent standard errors of the mean (n = 20).

The same inter-trial analysis was conducted as in Experiment 1. Distributions of perceived simultaneity as a function of SOA were compiled for each participant and each VF separately for cases in which trial t-1 exhibited a negative SOA (i.e., audition leads) or positive SOA (i.e., vision leads). These distributions were subsequently fitted using the Gaussian function described above. The results are shown in Fig 5A and 5B, which represent the mean percentages of simultaneity responses, as a function of VF and SOA with fitted psychometric functions for both modality orders (center/audition leads: Mean RMSE = 0.10 ± 0.03, center/vision leads: Mean RMSE = 0.10 ± 0.03, periphery/audition leads: Mean RMSE = 0.09 ± 0.03, periphery/vision leads: Mean RMSE = 0.09 ± 0.04). The recalibration shifts are summarized in Fig 5C, which plots the PSS for both modality orders and the two different VFs. For the PSS, a two-way ANOVA with modality order (2) and VF (2) was conducted. The results revealed a significant main effect of modality order, F (1, 19) = 14.91, p < .01, ηp2 = .44, which indicates that the PSS was larger for vision leads than for audition leads. However, the main effect of VF, F (1, 19) = 0.80, p = .38, ηp2 = .04, and the two-way interaction, F (1, 19) = 0.76, p = .39, ηp2 = .04, were not significant. Furthermore, the sigma values for both modality orders and the two different VFs are shown in Fig 5D. For sigma, a two-way ANOVA with modality order (2) and VF (2) was conducted. The results revealed a significant main effect of VF, F (1, 19) = 27.01, p < .001, ηp2 = .59, which indicates that the sigma values of the center condition were larger than those of the periphery condition. However, the main effect of modality order, F (1, 19) = 0.70, p = .41, ηp2 = .04, and the two-way interaction, F (1, 19) = 0.03, p = .86, ηp2 = .002, were not significant.

Fig 5. Results of rapid recalibration analyses in Experiment 2.

Mean percentage of simultaneous responses in the (a) central and (b) peripheral visual fields. The average fitting functions are plotted for preceding trials of audition leads and vision leads data. (c) Mean estimated point of subjective simultaneity. (d) Mean estimated sigma. Error bars represent standard errors of the mean (n = 20).

Discussion

Experiment 2 compared the synchrony perception and rapid recalibration process between VFs using the SJ task. The TBW was wider at the center than at the periphery, whereas the PSS was nearly identical between the VFs. Rapid recalibration occurred under both VF conditions, and its magnitude did not differ between the center and the periphery.

In Experiment 2, a VF difference in the TBW consistent with the difference in temporal resolution was observed. The central VF projects to the sustained channel, whereas the peripheral VF projects to the transient channel [23]. The temporal resolution of the sustained channel is lower than that of the transient channel [22]. Thus, the range of audio-visual synchrony perception would be more tolerant in the central VF as an adaptation to its low temporal resolution. Stevenson, Kruger Fister, Barnett, Nidiffer, and Wallace [40] found a broader TBW in the peripheral VF than in the central VF. However, in their study, the higher percentage of simultaneity judgments in the peripheral VF was observed at eccentricities beyond the range of the present experiment (60° and 90°). At the peripheral eccentricity closest to that of the present experiment (30°), the percentage of simultaneity judgments was slightly lower than that in the central VF. Therefore, the TBW of the central VF is assumed to be wider than that of the peripheral VF up to a certain eccentricity (at least 30°), which is attributed to the temporal resolution.

However, the difference in temporal resolution due to VFs did not affect the magnitude of rapid recalibration. Furthermore, unlike Experiment 1, the rapid recalibration process occurred in both VFs. The differences in the temporal resolution did not modulate the intensity of the rapid recalibration process in the SJ task.

Experiment 3

In Experiments 1 and 2, the PSS and TBW were measured using explicit methods (i.e., TOJ and SJ tasks). In Experiment 3, the PSS and TBW were measured with an implicit method based on SB perception. Moreover, the process of rapid recalibration was investigated in both VFs using this method.

Materials and methods

Participants

Twenty-one volunteers (14 women and 7 men; mean age = 23.19 ± 2.84 years) participated in Experiment 3. The required sample size was determined based on the same criterion as in Experiment 1, and participants were recruited to meet this sample size from a university lecture held at Doshisha University. All participants orally reported normal or corrected-to-normal vision and normal audition. Participants received 500 Japanese yen for their participation, and their written informed consent was obtained prior to participation.

Stimuli

Two white disks (1.5° in diameter) appeared 3.0° above the fixation stimulus and were initially separated by 4.5°. The two disks in the central VF condition moved laterally toward one another, coincided, and continued moving until they were 2.25° from the point of superposition at a speed of 3.13°/s. In the peripheral VF condition, the point of superposition was 10° to the left or right of that in the center condition. The auditory stimulus was the same as in Experiments 1 and 2. The following nine SOAs were present between the visual and auditory stimuli: ±510, ±260, ±130, ±50, and 0 ms (negative SOAs indicate that the auditory stimulus was presented prior to the visual stimulus, and vice versa). In the 0 ms SOA condition, the auditory stimulus was presented at the instant of superposition.
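The motion parameters above can be cross-checked against the 1440 ms motion display reported in the Procedure. The sketch assumes that the 4.5° initial separation is measured between disk centers and that the 100 Hz refresh rate from Experiment 1 also applies here; the variable names are hypothetical.

```python
INITIAL_SEPARATION_DEG = 4.5  # assumed to be center-to-center
END_OFFSET_DEG = 2.25         # distance past the point of superposition
SPEED_DEG_PER_S = 3.13
REFRESH_RATE_HZ = 100         # assumed to match the Experiment 1 apparatus

# Each disk starts half the separation away from the point of superposition.
start_offset_deg = INITIAL_SEPARATION_DEG / 2
path_length_deg = start_offset_deg + END_OFFSET_DEG

duration_ms = path_length_deg / SPEED_DEG_PER_S * 1000
n_frames = round(duration_ms * REFRESH_RATE_HZ / 1000)
displayed_duration_ms = n_frames * 1000 / REFRESH_RATE_HZ
step_per_frame_deg = path_length_deg / n_frames
superposition_ms = displayed_duration_ms / 2  # symmetric path, constant speed

print(f"path per disk: {path_length_deg} deg, nominal duration: {duration_ms:.1f} ms")
print(f"frames: {n_frames}, displayed duration: {displayed_duration_ms:.0f} ms")
print(f"step: {step_per_frame_deg:.5f} deg/frame, superposition at {superposition_ms:.0f} ms")
```

Under these assumptions, each disk travels 4.5° in 144 frames, which matches the reported 1440 ms display, and the disks coincide halfway through the motion sequence.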

Procedure

Each trial was initiated by pressing the 0 key and consisted of a 500 ms fixation stimulus followed by the motion display (duration of 1440 ms; see Fig 6). The tone either preceded or followed the instant of superposition by an SOA that was drawn randomly from the set. During the motion display period, participants were instructed to keep their gaze on the fixation cross. After the motion display, participants were instructed to judge the motion trajectory by pressing the 1 key for bounce and the 3 key for stream. The participants completed 480 trials divided into six blocks. One block comprised 80 trials, with 16 trials for the 0 ms SOA and eight trials for each of the other SOAs. For half of the trials in each block, visual stimuli were presented in the central VF, and for the other half, visual stimuli were presented in the peripheral VF. Additionally, in the peripheral VF condition, visual stimuli were presented to the left or right of the fixation stimulus with equal frequency. Participants took short breaks between the blocks. Prior to performing this main task, participants performed 30 practice trials and then judged the same motion trajectory without sound in 48 trials (24 trials each for the central and peripheral conditions).
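The block structure described above can be written out as a trial list. The sketch below is an illustration only: it assumes that the central/peripheral and left/right splits were balanced within each SOA, which the text does not state (it reports the balance only at the level of the block), and the function name `build_block` is hypothetical.

```python
import random

SOAS_MS = [-510, -260, -130, -50, 0, 50, 130, 260, 510]

def build_block(rng):
    """Return one shuffled 80-trial block as a list of (soa_ms, location) tuples."""
    trials = []
    for soa in SOAS_MS:
        # 16 trials for the 0 ms SOA and 8 for every other SOA,
        # split evenly between central and peripheral presentation (assumption).
        n_per_field = 8 if soa == 0 else 4
        trials += [(soa, "center")] * n_per_field
        # Peripheral trials are divided equally between left and right.
        trials += [(soa, "left")] * (n_per_field // 2)
        trials += [(soa, "right")] * (n_per_field // 2)
    rng.shuffle(trials)
    return trials

rng = random.Random(0)  # fixed seed so the sketch is reproducible
session = [build_block(rng) for _ in range(6)]

print(len(session), "blocks,", sum(len(block) for block in session), "trials in total")
```

Six such blocks give the 480 trials reported for the main task.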

Fig 6. The schematic representation of the procedure used in the stream/bounce display.


Results

The proportion of bounce responses was calculated for each condition. First, without sound, the proportion of bounce responses was significantly higher at the center (M = 38.54%, SEM = 5.39) than at the periphery (M = 27.08%, SEM = 4.15), t (16) = 2.18, p < .05, d = 0.49. To compute the alpha, PSS, and sigma values, a three-parameter Gaussian function was fitted to each participant’s data to minimize the RMSE, as in Experiment 2. The data of four participants were excluded from further analysis, either because the computed sigma values were large (> 600; two participants) or because the bounce responses were less than 5% in all conditions (two participants). The results are shown in Fig 7A, which presents the mean percentages of bounce responses as a function of the VF and SOA with fitted psychometric functions (center: Mean RMSE = 0.08 ± 0.03, periphery: Mean RMSE = 0.07 ± 0.01). The PSS and sigma values are shown in Fig 7B and 7C. The PSS values did not differ between the center and periphery conditions, t (16) = 0.41, p = .69, d = 0.06, nor did the sigma values, t (16) = 0.83, p = .42, d = 0.16. The alpha value was larger in the center (M = 0.82 ± 0.14) than in the periphery (M = 0.74 ± 0.19) condition, t (16) = 2.57, p < .05, d = 0.47.
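The fitting step can be illustrated with a short sketch. Two assumptions are made here. First, the three-parameter Gaussian is taken to have the form alpha · exp(−(SOA − PSS)² / (2 · sigma²)), with alpha as the amplitude, PSS as the center, and sigma as the width; the exact equation is given with Experiment 2 and is not reproduced in this section. Second, the RMSE is minimized with a coarse grid search written in plain Python, which stands in for whatever optimizer the original R scripts used. The data are synthetic and noise-free, not the data of this study.

```python
import math

SOAS_MS = [-510, -260, -130, -50, 0, 50, 130, 260, 510]

def gaussian(soa, alpha, pss, sigma):
    """Assumed model: amplitude alpha, center PSS (ms), width sigma (ms)."""
    return alpha * math.exp(-((soa - pss) ** 2) / (2.0 * sigma ** 2))

def rmse(params, soas, proportions):
    """Root-mean-square error between the model and the observed proportions."""
    alpha, pss, sigma = params
    squared = [(gaussian(s, alpha, pss, sigma) - p) ** 2
               for s, p in zip(soas, proportions)]
    return math.sqrt(sum(squared) / len(squared))

def fit_gaussian(soas, proportions):
    """Coarse grid search over (alpha, PSS, sigma); a stand-in for a real optimizer."""
    best = None
    for alpha in (i / 20 for i in range(1, 21)):   # 0.05 ... 1.00
        for pss in range(-300, 301, 10):           # ms
            for sigma in range(20, 801, 10):       # ms
                err = rmse((alpha, pss, sigma), soas, proportions)
                if best is None or err < best[0]:
                    best = (err, alpha, pss, sigma)
    return best

# Synthetic bounce proportions generated from known parameters.
synthetic = [gaussian(s, 0.8, 20, 150) for s in SOAS_MS]
error, alpha_hat, pss_hat, sigma_hat = fit_gaussian(SOAS_MS, synthetic)
print(alpha_hat, pss_hat, sigma_hat, error)
```

Because the sigma grid extends beyond 600 ms, a fit of this kind would also flag the very broad functions that served as an exclusion criterion above.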

Fig 7. Results of synchrony perception analyses in Experiment 3.


(a) Mean percentage of bounce responses. The average fitting functions are plotted for the central and peripheral visual field data. (b) Mean estimated point of subjective simultaneity. (c) Mean estimated sigma. Error bars represent standard errors of the mean (n = 17).

The same inter-trial analysis as in Experiments 1 and 2 was conducted. Distributions of perceived bounce responses as a function of SOA were compiled for each participant and each VF, separately for cases in which trial t-1 had a negative SOA (i.e., audition leads) or a positive SOA (i.e., vision leads). These distributions were subsequently fitted with the same Gaussian function as in Experiment 2. The results are shown in Fig 8A and 8B, which present the mean percentages of bounce responses as a function of VF and SOA with fitted psychometric functions for both modality orders (center/audition leads: Mean RMSE = 0.10 ± 0.04, center/vision leads: Mean RMSE = 0.10 ± 0.04, periphery/audition leads: Mean RMSE = 0.09 ± 0.04, periphery/vision leads: Mean RMSE = 0.10 ± 0.03). The recalibration shifts are summarized in Fig 8C, which plots the PSS for both modality orders and the two VFs. For the PSS, a two-way ANOVA with modality order (2) and VF (2) was conducted. The results revealed a marginally significant main effect of modality order, F (1, 16) = 4.16, p = .06, ηp2 = .21. However, the main effect of VF, F (1, 16) = 0.26, p = .62, ηp2 = .02, and the two-way interaction, F (1, 16) = 2.02, p = .17, ηp2 = .11, were not significant. To confirm the occurrence of rapid temporal recalibration, a one-sample t-test (two-tailed) was conducted on the magnitude of rapid recalibration (calculated by subtracting the PSS for audition leads (t-1) from the PSS for vision leads (t-1)) in each VF. The magnitude differed significantly from zero in the center, t (16) = 2.39, p < .05, d = 0.58, but not in the periphery, t (16) = 0.50, p = .62, d = 0.12. Furthermore, the sigma values for both modality orders and the two VFs are shown in Fig 8D. For sigma, a two-way ANOVA with modality order (2) and VF (2) was conducted. The results revealed no significant main effect of VF, F (1, 16) = 0.76, p = .40, ηp2 = .05, or modality order, F (1, 16) = 0.81, p = .38, ηp2 = .05, and no significant two-way interaction, F (1, 16) = 0.06, p = .81, ηp2 = .004.
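The inter-trial analysis amounts to sorting each trial by the sign of the SOA on the preceding trial before computing bounce proportions. The sketch below shows that sorting step for a single participant and a single VF. It assumes that trials preceded by a 0 ms SOA are left out and that the first trial of a sequence has no predecessor; how the original scripts handled these cases, block boundaries, and the interleaving of VFs is not stated in the text. The function names are hypothetical.

```python
from collections import defaultdict

def bounce_proportions_by_previous_soa(trials):
    """Split trials by the sign of the SOA on trial t-1.

    trials: list of (soa_ms, responded_bounce) tuples in presentation order.
    Returns {"audition_leads": {soa: proportion}, "vision_leads": {soa: proportion}}.
    """
    counts = {"audition_leads": defaultdict(lambda: [0, 0]),
              "vision_leads": defaultdict(lambda: [0, 0])}
    for (prev_soa, _), (soa, bounce) in zip(trials, trials[1:]):
        if prev_soa == 0:
            continue  # assumption: a synchronous t-1 trial belongs to neither category
        key = "audition_leads" if prev_soa < 0 else "vision_leads"
        counts[key][soa][0] += int(bounce)  # number of bounce responses
        counts[key][soa][1] += 1            # number of trials
    return {key: {soa: hits / n for soa, (hits, n) in by_soa.items()}
            for key, by_soa in counts.items()}

def recalibration_magnitude(pss_vision_leads, pss_audition_leads):
    """PSS after vision-leading trials minus PSS after audition-leading trials."""
    return pss_vision_leads - pss_audition_leads

# Toy sequence (not real data): (SOA in ms, whether "bounce" was reported).
toy = [(-50, True), (0, True), (50, False), (0, False), (-130, True), (0, True)]
print(bounce_proportions_by_previous_soa(toy))
```

Each of the two resulting distributions would then be fitted with the Gaussian function to obtain a PSS, and the difference between the two PSS values gives the recalibration magnitude tested above.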

Fig 8. Results of rapid recalibration analyses in Experiment 3.


Mean percentage of bounce responses in the (a) central and (b) peripheral visual fields. The average fitting functions are plotted for preceding trials of audition leads and vision leads data. (c) Mean estimated point of subjective simultaneity. (d) Mean estimated sigma. Error bars represent standard errors of the mean (n = 17).
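The effect sizes reported for the one-sample t-tests on recalibration magnitude can be checked against the t values. For a one-sample test, t = M / (SD / √n) and d = M / SD, so d = t / √n. The following sketch applies this relation, assuming that d was computed as the mean divided by the standard deviation of the recalibration magnitudes; the relation does not carry over to the paired comparisons between VFs, for which d may have been computed differently.

```python
import math

N_PARTICIPANTS = 17  # participants retained after exclusions

def cohens_d_from_one_sample_t(t_value, n):
    """Cohen's d for a one-sample t-test, assuming d = mean / SD."""
    return t_value / math.sqrt(n)

d_center = cohens_d_from_one_sample_t(2.39, N_PARTICIPANTS)
d_periphery = cohens_d_from_one_sample_t(0.50, N_PARTICIPANTS)
print(round(d_center, 2), round(d_periphery, 2))
```

Both values agree with the reported d = 0.58 (center) and d = 0.12 (periphery).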

Discussion

Experiment 3 compared synchrony perception and the rapid recalibration process between VFs using SB perception. Both the PSS and TBW were nearly identical to the physical synchrony (i.e., 0 ms) in both VFs. Moreover, rapid recalibration occurred only in the central VF and not in the peripheral VF.

Audio-visual temporal processing was not affected by VF differences in SB perception. SB perception did not differ between the central and peripheral VFs in terms of the PSS and TBW. However, the motion display lasted over 1000 ms, and participants’ eye movements were not monitored in this experiment. Thus, the participants’ gaze may have moved away from the fixation point during the motion display period, which may have affected the current results.

The discrepancy in rapid recalibration between the central and peripheral VFs could be explained by a difference in the type of judgment performed. In this experiment, normal rapid recalibration occurred only in the central VF and not in the peripheral VF. It is possible that participants made a timing judgment of the bounce in the central VF and a causal judgment of the bounce in the peripheral VF. The occurrence of rapid recalibration in SB perception has not yet been examined. If participants made a causal judgment in the peripheral VF, and if the timing information from a previous trial does not change the PSS in a causal judgment, this account is consistent with the results of this experiment. Future studies are needed to test this hypothesis, as there are few experimental data on rapid recalibration using SB perception.

General discussion

The present study examined the different characteristics of temporal synchrony perception for audio-visual stimuli based on the VF. In this study, the following three types of measurements were used: TOJ, SJ, and SB perception tasks. Experiment 1 showed that the PSS value was smaller in the central VF condition than in the peripheral VF condition. Moreover, rapid recalibration did not occur in peripheral VF conditions in Experiment 1. Experiment 2 indicated that the TBW was broader at the central VF than at the peripheral VF. However, the rapid recalibration magnitude did not differ between the central and peripheral VFs in Experiment 2. In Experiment 3, neither the PSS nor the TBW differed between the central and peripheral VFs. Moreover, rapid recalibration was observed only in the central VF (see Table 1).

Table 1. Summary of the three experiments’ main findings.

Exp.         | Synchrony perception: PSS | Synchrony perception: TBW | Rapid recalibration: Cent. | Rapid recalibration: Peri.
Exp. 1 (TOJ) | Cent. < Peri.             | Cent. ≒ Peri.             | A. leads < V. leads        | A. leads ≒ V. leads
Exp. 2 (SJ)  | Cent. ≒ Peri.             | Cent. > Peri.             | A. leads < V. leads        | A. leads < V. leads
Exp. 3 (SB)  | Cent. ≒ Peri.             | Cent. ≒ Peri.             | A. leads < V. leads        | A. leads ≒ V. leads

Note. Cent. = central visual field, Peri. = peripheral visual field, A. leads = audition leads (t-1), V. leads = vision leads (t-1).

VF differences in audio-visual temporal synchrony were observed in the TOJ and SJ tasks in this study. In the TOJ task of Experiment 1, the PSS indicated that, for subjective simultaneity to be perceived, the auditory stimulus had to be presented earlier relative to the visual stimulus in the central than in the peripheral VF condition. In the SJ task of Experiment 2, the TBW of the central VF was broader than that of the peripheral VF. The difference in the PSS of the TOJ task was consistent with the VF difference in visual latency, whereas the difference in the TBW of the SJ task was consistent with the VF difference in temporal resolution. According to the preliminary prediction, the VF differences in the PSS and TBW would both follow the VF differences in visual latency and temporal resolution, respectively: the PSS would be lower and the TBW wider in the central VF than in the peripheral VF. It has been shown that the eccentricity-dependent difference in temporal resolution is observed in the early visual cortex and is compensated for later in the cortical visual pathway [41]. TOJ tasks are assumed to be associated with higher-order processing compared to SJ tasks [28, 42]. Therefore, the finding that the PSS was affected by visual latency in the TOJ task could be attributed to the visual hierarchy of eccentricity-dependent temporal contrast tuning. Moreover, TOJ tasks are proposed to be sensitive to PSS changes [28], whereas SJ tasks are proposed to be sensitive to TBW changes [43]. Thus, the discrepancy between the TOJ and SJ tasks in the indices for which VF differences were observed would reflect the difference in the sensitivity of these indices to audio-visual synchrony perception.

The present study also revealed PSS differences between the TOJ and SJ tasks. The TOJ task showed negative PSS values (i.e., participants perceived the pair as synchronous when the auditory stimulus led the visual stimulus), whereas the SJ task showed positive PSS values (i.e., participants perceived the pair as synchronous when the visual stimulus led the auditory stimulus). Positive PSSs have been observed in many studies [44] and are attributed to a neural delay (visual stimuli need to be presented before sounds to compensate for slower neural processing compared to auditory stimuli) and to a tuning toward the natural situation (light reaches the sense organs before sound does). In contrast, negative PSSs are obtained more often from TOJ data than from SJ data [28]. The additional cognitive operations required to label whether the auditory or the visual stimulus leads [39] might be associated with this PSS bias in a TOJ task. In this study, the PSS values replicated these characteristics of audio-visual synchrony perception. A model-based analysis involving parameters that estimate the timing, decisional, and response processes has been proposed and suggests that common timing processes underlie TOJ and SJ tasks [45]. The VF differences in audio-visual synchrony perception also need to be re-analyzed using this approach.

Unlike in the TOJ and SJ tasks, neither the PSS nor the TBW obtained from the SB perception data revealed VF differences. Because bounce perception is sensitive to the lag between the sound onset and the moment the circles coincide in the SB display [29, 30], SB perception implicitly measures audio-visual temporal processing. Fujisaki et al. [15] also measured the PSS using SB perception. However, participants judge the causality between the motion trajectory and the sound in an SB display, whereas they judge the simultaneity of the auditory and visual stimuli in an SJ task. Discrepancies between SB and SJ have been reported [46, 47].

A VF difference in rapid recalibration was observed in the TOJ task and SB perception in the current study. In these paradigms, rapid recalibration occurred only in the central VF. However, this VF difference was not attributed to the difference in processing speed based on retinal position in either paradigm. In the TOJ task, it is assumed that choice-repetition bias, which cancels a PSS shift by rapid recalibration, did not occur. In SB perception, it is possible that the information judged was different between the central and peripheral VFs. Therefore, rapid recalibration may be induced by intersensory temporal processes that are not modulated by the visual processing speed [33].

A limitation of the present study is that the effects of the cortical magnification factor were not investigated. The cortical areas onto which the retina projects are larger for the central VF than for the peripheral VF [48], and visual acuity is usually higher in the central VF than in the peripheral VF. Therefore, visual stimuli in the peripheral VF must be larger than those in the central VF to activate a matching area of the striate cortex. The differences in the size of the cortical area onto which the retina projects may induce differences in effective visual stimulus intensity. Low-intensity stimuli are more likely to be judged as synchronous [49]. Thus, future studies should control for the cortical magnification factor.

The present study found VF differences in the temporal synchrony perception of audio-visual stimuli. These findings support the notion that differences in visual processing speed modulate temporal processing in audio-visual integration [19]. However, these VF differences could not be explained simply by the difference in the temporal resolution of the VF. The compensation for temporal resolution differences in the late cortical visual pathway [41] may be involved, and future studies are necessary to support this speculation. Furthermore, differences among the TOJ, SJ, and SB paradigms in temporal synchrony perception were also observed in this study. The PSS tends to be negative (i.e., the pair is perceived as synchronous when the auditory stimulus leads the visual stimulus) in a TOJ task compared to an SJ task [28]. SB perception is also likely to show a negative PSS value compared to an SJ task, since a causality judgment is performed in SB perception [47]. In this study, the PSS scores showed a tendency consistent with these functional characteristics. One potential contribution of this study is the demonstration, within a single series of experiments, that each of these three methods measures different functional characteristics of audio-visual synchrony perception. Additionally, the present findings suggest that explicit measurements (i.e., TOJ and SJ tasks) are more desirable for examining the temporal synchrony perception of audio-visual stimuli between VFs, because different judgments may be made when using an implicit measurement (i.e., SB perception).

Conclusion

This study demonstrates that temporal synchrony processing for audio-visual stimuli is modulated by differences in processing speed between the central and peripheral VFs. These effects differed depending on the experimental task: the PSS value was smaller in the central VF in the TOJ task, and the TBW was broader in the central VF in the SJ task. In addition, rapid temporal recalibration occurred only in the central VF in the TOJ task and SB perception, whereas it occurred in both the central and peripheral VFs in the SJ task. These inconsistent results could be associated with the compensation for temporal resolution differences in the late cortical visual pathway and the distinct functional characteristics of each paradigm.

Acknowledgments

I am most grateful to the participants. I am also very grateful to the anonymous reviewers for their valuable and insightful comments and suggestions.

Data Availability

All raw data and R script files are available from the OSF database (https://osf.io/2x73g/?view_only=1dd0968b3b5d4a65a90c4ed78b1dde19).

Funding Statement

YT received grant JP20K14269 from the Japan Society for the Promotion of Science (https://www.jsps.go.jp/english/e-grants/index.html).

References

  • 1. Bolognini N, Frassinetti F, Serino A, Làdavas E. “Acoustical vision” of below threshold stimuli: interaction among spatially converging audiovisual inputs. Exp. Brain Res. 2005;160:273–282. doi: 10.1007/s00221-004-2005-z
  • 2. Chen Y-C, Spence C. The crossmodal facilitation of visual object representations by sound: evidence from the backward masking paradigm. J. Exp. Psychol. Human. 2011;37:1784–1802. doi: 10.1037/a0025638
  • 3. Olivers C, van der Burg E. Bleeping you out of the blink: sound saves vision from oblivion. Brain Res. 2008;1242:191–199. doi: 10.1016/j.brainres.2008.01.070
  • 4. Van der Burg E, Olivers C, Bronkhorst A, Theeuwes J. Pip and pop: nonspatial auditory signals improve spatial visual search. J. Exp. Psychol. Human. 2008;34:1053–1065. doi: 10.1037/0096-1523.34.5.1053
  • 5. Fujisaki W, Nishida S. Temporal frequency characteristics of synchrony-asynchrony discrimination of audio-visual signals. Exp. Brain Res. 2005;166:455–464. doi: 10.1007/s00221-005-2385-8
  • 6. Fujisaki W, Nishida S. Feature-based processing of audio-visual synchrony perception revealed by random pulse trains. Vis. Res. 2007;47:1075–1093. doi: 10.1016/j.visres.2007.01.021
  • 7. Fujisaki W, Nishida S. Top-down feature-based selection of matching features for audio-visual synchrony discrimination. Neurosci. Lett. 2008;433:225–230. doi: 10.1016/j.neulet.2008.01.031
  • 8. King AJ. Multisensory integration: strategy for synchronization. Current Biol. 2005;15:R339–R340. doi: 10.1016/j.cub.2005.04.022
  • 9. Engel GR, Dougherty WG. Visual-auditory distance constancy. Nature. 1971;234:308. doi: 10.1038/234308a0
  • 10. Kopinska A, Harris LR. Simultaneity constancy. Percept. 2004;33:1049–1060. doi: 10.1068/p5169
  • 11. Sugita Y, Suzuki Y. Implicit estimation of sound-arrival time. Nature. 2003;421:911. doi: 10.1038/421911a
  • 12. Lewald J, Guski R. Cross-modal perceptual integration of spatially and temporally disparate auditory and visual stimuli. Cognit. Brain Res. 2003;16:468–478. doi: 10.1016/s0926-6410(03)00074-0
  • 13. Dixon MF, Spitz L. The detection of auditory visual desynchrony. Percept. 1980;9:719–721. doi: 10.1068/p090719
  • 14. Guski R, Troje NF. Audiovisual phenomenal causality. Percept. Psychophys. 2003;65:789–800. doi: 10.3758/bf03194815
  • 15. Fujisaki W, Shimojo S, Kashino M, Nishida S. Recalibration of audiovisual simultaneity. Nature Neurosci. 2004;7:773–778. doi: 10.1038/nn1268
  • 16. Vroomen J, Keetels M, de Gelder B, Bertelson P. Recalibration of temporal order perception by exposure to audio-visual asynchrony. Cognit. Brain Res. 2004;22:32–35. doi: 10.1016/j.cogbrainres.2004.07.003
  • 17. Van der Burg E, Alais D, Cass J. Rapid recalibration to audiovisual asynchrony. J. Neurosci. 2013;33:14633–14637. doi: 10.1523/JNEUROSCI.1182-13.2013
  • 18. Van der Burg E, Alais D, Cass J. Audiovisual temporal recalibration occurs independently at two different time scales. Sci. Rep. 2015;5:14526. doi: 10.1038/srep14526
  • 19. Takeshima Y, Gyoba J. Low-level visual processing speed modulates judgment of audiovisual simultaneity. Interdiscip. Inf. Sci. 2015;21:109–114. doi: 10.4036/iis.2015.A.01
  • 20. Tappe T, Niepel M, Neumann O. A dissociation between reaction time to sinusoidal gratings and temporal-order judgment. Percept. 1994;23:335–347. doi: 10.1068/p230335
  • 21. Hoffmann KP. Conduction velocity in pathways from retina to superior colliculus in the cat: a correlation with receptive-field properties. J. Neurophysiol. 1973;36:409–424. doi: 10.1152/jn.1973.36.3.409
  • 22. Breitmeyer B. Simple reaction time as a measure of the temporal response properties of transient and sustained channels. Vis. Res. 1975;15:1411–1412. doi: 10.1016/0042-6989(75)90200-x
  • 23. Hartmann E, Lachenmayr B, Brettel H. The peripheral critical flicker frequency. Vis. Res. 1979;19:1019–1023. doi: 10.1016/0042-6989(79)90227-x
  • 24. Lichtenstein M, White CT. Relative visual latency as a function of retinal locus. J. Opt. Soc. Am. 1961;51:1033–1034. doi: 10.1364/josa.51.001033
  • 25. Sekuler R, Sekuler AB, Lau R. Sound alters visual motion perception. Nature. 1997;385:308. doi: 10.1038/385308a0
  • 26. Zampini M, Shore DI, Spence C. Audiovisual temporal order judgments. Exp. Brain Res. 2003;152:198–210. doi: 10.1007/s00221-003-1536-z
  • 27. Matthews N, Welch L, Achtman R, Fenton R, FitzGerald B. Simultaneity and temporal order judgments exhibit distinct reaction times and training effects. PLoS ONE. 2016;11(1):e0145926. doi: 10.1371/journal.pone.0145926
  • 28. Van Eijk R, Kohlrausch A, Juola J, van de Par S. Audiovisual synchrony and temporal order judgments: effects of experimental method and stimulus type. Percept. Psychophys. 2008;70:955–968. doi: 10.3758/pp.70.6.955
  • 29. Remijn GB, Ito H, Nakajima Y. Audiovisual integration: an investigation of the ‘streaming-bouncing’ phenomenon. J. Physiol. Anthropol. Appl. Human Sci. 2004;23:243–247. doi: 10.2114/jpa.23.243
  • 30. Watanabe K, Shimojo S. When sound affects vision: effects of auditory grouping on visual motion perception. Psychol. Sci. 2001;12:109–116. doi: 10.1111/1467-9280.00319
  • 31. Kohlrausch A, van Eijk R, Juola JF, Brandt I, van de Par S. Apparent causality affects perceived simultaneity. Atten. Percept. Psychophys. 2013;75:1366–1373. doi: 10.3758/s13414-013-0531-0
  • 32. Roseboom W. Serial dependence in timing perception. J. Exp. Psychol. Human. 2019;45:100–110. doi: 10.1037/xhp0000591
  • 33. Takeshima Y. Rapid temporal recalibration to audiovisual asynchrony occurs across the difference in neural processing speed based on spatial frequency. i-Percept. 2020;11:1–11. doi: 10.1177/2041669520966614
  • 34. Brainard DH. The psychophysics toolbox. Spatial Vis. 1997;10:433–436. doi: 10.1163/156856897X00357
  • 35. Kleiner M, Brainard DH, Pelli DG. What’s new in Psychtoolbox-3. Percept. 2007;36(supplement):14.
  • 36. Pelli DG. The video toolbox software for visual psychophysics: transforming numbers into movies. Spatial Vis. 1997;10:437–442. doi: 10.1163/156856897X00366
  • 37. Ju A, Orchard-Mills E, van der Burg E, Alais D. Rapid audiovisual temporal recalibration generalises across spatial location. Multisensory Res. 2019;32:215–234. doi: 10.1163/22134808-20191176
  • 38. Keane B, Bland NS, Matthews N, Carroll TJ, Wallis G. Rapid recalibration of temporal order judgments: response bias accounts for contradictory results. Eur. J. Neurosci. 2020;51:1697–1710. doi: 10.1111/ejn.14551
  • 39. Cicchini GM, Mikellidou K, Burr D. Serial dependencies act directly on perception. J. Vis. 2017;17(14):6. doi: 10.1167/17.14.6
  • 40. Cicchini GM, Mikellidou K, Burr D. The functional role of serial dependence. Proc. R. Soc. B. 2018;285:20181722. doi: 10.1098/rspb.2018.1722
  • 41. Himmelberg MM, Wade AR. Eccentricity-dependent temporal contrast tuning in human visual cortex measured with fMRI. NeuroImage. 2019;184:462–474. doi: 10.1016/j.neuroimage.2018.09.049
  • 42. Binder M. Neural correlates of audiovisual temporal processing-Comparison of temporal order and simultaneity judgments. Neurosci. 2015;300:432–447. doi: 10.1016/j.neuroscience.2015.05.011
  • 43. Vroomen J, Keetels M. Perception of intersensory synchrony: a tutorial review. Atten. Percept. Psychophys. 2010;72:871–884. doi: 10.3758/APP.72.4.871
  • 44. Stevenson RA, Kruger Fister J, Barnett ZP, Nidiffer AR, Wallace MT. Interactions between the spatial and temporal stimulus factors that influence multisensory integration in human performance. Exp. Brain Res. 2012;219:121–137. doi: 10.1007/s00221-012-3072-1
  • 45. García-Pérez MA, Alcalá-Quintana R. Converging evidence that common timing processes underlie temporal-order and simultaneity judgments: a model-based analysis. Atten. Percept. Psychophys. 2015;77:1750–1766. doi: 10.3758/s13414-015-0869-6
  • 46. Kitamura M, Watanabe K, Kitagawa N. Positive emotion facilitates audiovisual binding. Front. Integr. Neurosci. 2016;9:66. doi: 10.3389/fnint.2015.00066
  • 47. Vroomen J, Keetels M. Perception of causality and synchrony dissociate in the audiovisual bounce-inducing effect (ABE). Cognit. 2020;204:104340. doi: 10.1016/j.cognition.2020.104340
  • 48. Cowey A, Rolls E. Human cortical magnification factor and its relation to visual acuity. Exp. Brain Res. 1974;21:447–454. doi: 10.1007/BF00237163
  • 49. Kruger Fister J, Stevenson R, Nidiffer A, Barnett Z, Wallace M. Stimulus intensity modulates multisensory temporal processing. Neuropsychol. 2016;88:92–100. doi: 10.1016/j.neuropsychologia.2016.02.016

Decision Letter 0

Deborah Apthorp

24 Mar 2021

PONE-D-20-40727

The effects of temporal characteristics depend on visual fields for the temporal synchrony processing of audio-visual stimuli

PLOS ONE

Dear Dr. Takeshima,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 08 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Deborah Apthorp, Ph.D

Academic Editor

PLOS ONE

Additional Editor Comments:

The reviewers have provided careful and detailed reviews. Please pay particular attention to the PLoS One data availability policy and be sure to have all your data (and, preferably, your analysis code) available before submitting your revision. In addition, both reviewers have pointed out that a more careful explanation of the theoretical motivation for the study is required.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In line with PLOS' guidelines on detailed reporting (https://journals.plos.org/plosone/s/criteria-for-publication#loc-3), please ensure that you have provided sufficient detail on participant recruitment in the Methods section, including from where participants were recruited.

3. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript describes the results of three psychophysical experiments that investigate the effects of visual field eccentricity (central vs peripheral) on audio-visual timing performance.

Each experiment uses a different AV timing task to derive an estimate of the point of subjective simultaneity and temporal bandwidth. Each task also uses serial dependency analysis to determine evidence of temporal recalibration of the kind observed previously (e.g. Van der Burg, et al.).

Experiment 1 uses a simultaneity judgement, Experiment 2 uses temporal order judgement and Experiment 3 uses a stream-bounce judgement, all of which have been used extensively to derive PSS and temporal bandwidths.

Each experiment shows a diverse set of visual field effects, notably that:

bandwidths are broader in central than in peripheral visual field locations for SJs, but not for TOJs or SBJs. Regarding serial dependencies, a visual field effect is observed only for TOJs and SBJs, not SJs.

To account for this diverse set of results, the authors appeal to differences in visual timing precision previously reported to occur across the visual field. Unfortunately, this account is far too vague to offer any explanatory value, nor is it able to account for the full range of results.

Whilst the experiments appear to be well-conducted, the analyses appropriate, and the results are interesting, practically no theoretical motivation is provided for these experiments. Moreover, few if any hypotheses are offered. Consequently, what the reader is left with is a set of parametric experiments with a diverse set of results, with no meaningful interpretation.

Specific comments

Page 2, Lines 11-13

"...exhibited more auditory preceding timing in the peripheral....."

Not clear what this means. Please rephrase.

"On the other hand...." Other hand relative to what? Consider deleting this cliche

“temporal binding window was larger…”

Presumably, this refers to a broader subjective simultaneity bandwidth, not amplitude. Please specify.

Line 21-23

I don't follow this.

Which visual features are the authors referring to? Temporal response? If so, what kind of temporal response exactly?

Page 4 Line 18-23

The authors need to explain for the reader why they are comparing simultaneity judgment with TOJ tasks. The paragraph (lines 18-23) doesn't provide a justification.

Line 22

“view distances…”

Presumably, the authors are referring to variations in eccentricity arising from variations in viewing distance of an otherwise identical stimulus. This needs to be clarified.

Page 5 Line 7

It's not clear from the Introduction what the theoretical motivation for the visual eccentricity (visual field) manipulation is.

There are no obvious hypotheses pertaining to this manipulation for any of the three tasks or for the effects of temporal recalibration

Page 10, Lines 19-21

Reference to neural transmission speed is an interpretation, and possibly a prediction/hypothesis. Either way, this doesn't belong in the Results section.

“…a vision leads response is suppressed…” [italics added]

Again, this is an interpretation, not a description of the experimental Results

Page 14, Lines 6-7

Again, this is an interpretation, not a description of the experimental Results

General Discussion

Page 20, Lines 12-13

How do you know they are more suppressed? It is more descriptive to say that TOJ vision leads responses are less frequent in the visual periphery than in central vision. To invoke suppression implies a mechanism for which these experiments offer no evidence

Lines 14-15

Presumably, the authors are referring to the observation that SJ derived TBWs are broader at central than at peripheral visual field locations.

Page 22, Lines 22-23

I don't follow this sentence. Which temporal characteristics exactly are the authors referring to?

Nor is it clear how the temporal characteristics of the visual field ought to explain the diverse pattern of task dependent visual field effects reported here.

An explicit assumption repeated several times throughout the manuscript is that the peripheral visual field affords higher temporal acuity than the central visual field. Whilst there is neurophysiological evidence for this, psychophysical evidence is more equivocal. In fact, it's only been convincingly demonstrated using the critical flicker fusion paradigm, and not other methods - see

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6264386/

for a nice summary.

Reviewer #2: The current study examined audiovisual simultaneity perception in the central and peripheral visual fields (VFs) using three well-established paradigms: temporal order judgments (TOJ), simultaneity judgments (SJ), and stream/bounce (SB) perception. The PSS was more negative (i.e., in the auditory-leading side) in the center than in the periphery when using TOJ, but no difference when using SJ and SB. The temporal binding window (TBW) was wider in the center than in the periphery when using SJ, but no such difference when using TOJ and SB. Rapid recalibration was observed in terms of PSS in the center when using TOJ, SJ, and SB, and in the periphery when using SJ. In addition, in TOJ, the TBW was narrower when the previous trial had a positive SOA than when it had a negative SOA. These results demonstrated that audiovisual simultaneity perception was partly modulated by the eccentricity of the visual stimuli, but discrepancy occurred when measured using different experimental paradigms.

The current study certainly provides critical measures of human audiovisual simultaneity perception in the center and 10-degree periphery using three methods. That said, I have concerns about the results: whether the data are reliable and whether the explanations are convincing, especially for the task-dependent effects. I also consider the current version of the manuscript too simplified. I therefore have the following suggestions, and I hope that they can help the author improve the next version of the manuscript:

Major comments:

1. In order to better predict and explain the results observed in the TOJ, SJ, and SB paradigms, the critical differences of their underlying mechanisms should be introduced and discussed. This is especially critical when different results were observed in each paradigm: should these differences be attributed to the different sensory, perceptual, or decisional processing?

As the author mentioned in the Discussion, SJ and TOJ may share the same perceptual mechanism but different decisional processes (e.g., García-Pérez & Alcalá-Quintana, 2015). On the other hand, the SB perception may involve causal relations and attention in addition to audiovisual integration (Shimojo et al., 2001; van Eijk et al., 2008). Hence, stronger theoretical backgrounds will be necessary to understand the distinct results observed in three experiments, and will provide a clearer rationale for the current study.

2. The author proposed an interesting assumption that visual stimulus presented to sustained channel and transient channel may lead to different audiovisual simultaneity perception, because the transient channel has higher temporal resolution and faster processing speed. However, these two characteristics in the temporal processing may be associated with different aspects of audiovisual simultaneity perception. In my intuitive guessing, higher temporal resolution may be associated with narrower audiovisual TBW, whereas the faster processing time may be associated with the shift of PSS (see the model of García-Pérez & Alcalá-Quintana, 2012, and a recent study by Chien et al., 2020). I also found that García-Pérez and Alcalá-Quintana published a paper last year (2020) on this issue.

Hence, I would expect that the PSS should be at more negative (i.e., the auditory stimulus should be presented earlier) in the periphery than in the center, if visual processing is faster in the periphery than in the center. This prediction was not consistent with the results, and even an opposite direction was observed in Experiment 1. I cannot follow the author’s “suppression” account on p. 10.

On the other hand, the result of TBW in Experiment 2 was consistent with my prediction that the TBW was narrower in the periphery than in the center because of the higher temporal resolution in the periphery. However, this result contrasts with the trend as a function of eccentricity reported by Stevenson et al. (2012). Please discuss this critical inconsistency.

3. The results of rapid recalibration in SJ and TOJ were in the same direction in the current study, which is inconsistent with Roseboom (2019). On p. 21, if I understand correctly, the author claimed that the effect of rapid recalibration may be cancelled out by the choice-repetition bias in TOJ. However, I do not understand how this explanation is only applied in the periphery rather than in the center, given that the choice-repetition bias should occur in both conditions when they are mixed in a block. In addition, given the fact that response type in TOJ, SJ, and SB are very different from each other, I am also wondering how the choice-repetition bias can be applied in SB but not in SJ.

4. To my knowledge, it is unusual that the rapid recalibration effect is observed in TBW as demonstrated in Experiment 1. Because the rapid recalibration effect in PSS was only observed in the center while the rapid recalibration effect in TBW was observed both in the center and periphery, the explanation on p.11 therefore does not work.

5. I am wondering how the number of participants was determined in each experiment. I can see that the expected number was 20, but only 17 remained in Experiment 3. I worry that this number is too small (the main effect of modality order was only marginally significant in the rapid recalibration analysis). SB perception is a subjective experience, and therefore larger individual differences may exist.

6. I have some critical questions regarding the experimental designs:

(1) In Experiment 1, the blank display before the visual target was 500-1000 ms, and the other blank after the visual target was 200-710 ms (this information is missing in the main text). It seems that both blanks were occasionally shorter than the largest SOAs between the visual and auditory target (+/- 510 ms).

(2) Technically, there is no +/-0 ms, only 0 ms. Hence, there should be only 9 SOAs in Experiments 2 and 3. Was the number of trials in the 0 ms SOA doubled as compared to other SOAs?

(3) In Experiment 3, how long is the duration of the motion display? A figure to demonstrate the procedure will be great.

7. More details and clarifications in data analyses are required:

(1) In Experiment 1, please specify how to estimate the PSS and sigma (TBW) based on the cumulative Gaussian function.

(2) In Experiments 2 & 3, please explain the meaning of alpha, PSS and sigma in terms of the Gaussian function, respectively. Were the different alpha scores in the center and periphery critical in Experiment 3?

(3) Did different fitting methods used in Experiment 1 (maximum-likelihood) vs. Experiments 2 & 3 (minimal RMSE) potentially influence the estimation of PSS and sigma?

(4) On p. 19, line 4, were the p value of the two t tests corrected? Were they one- or two-tailed?

8. Based on the results, which paradigm would be recommended for researchers exploring similar issues in future studies?
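
To make the quantities raised in comment 7 concrete, here is a minimal Python sketch of how PSS, sigma, and alpha might enter the two psychometric functions. It is not the author's fitting code: the function names, the SOA values, the grid ranges, and the use of a coarse grid search for the maximum-likelihood fit are all assumptions chosen for illustration, and negative SOAs are taken to mean auditory leads.

```python
import math

def toj_model(soa, pss, sigma):
    """Cumulative Gaussian: probability of a 'vision first' response.
    pss is the 50% point; sigma sets the slope."""
    return 0.5 * (1.0 + math.erf((soa - pss) / (sigma * math.sqrt(2.0))))

def sj_model(soa, alpha, pss, sigma):
    """Scaled Gaussian: probability of a 'simultaneous' response.
    alpha is the peak height, pss the peak location, sigma the width."""
    return alpha * math.exp(-((soa - pss) ** 2) / (2.0 * sigma ** 2))

def neg_log_likelihood(pss, sigma, soas, n_vision_first, n_trials):
    """Binomial negative log-likelihood of TOJ counts under toj_model."""
    nll = 0.0
    for soa, k, n in zip(soas, n_vision_first, n_trials):
        # Clip probabilities so that log() stays finite at extreme SOAs.
        p = min(max(toj_model(soa, pss, sigma), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_toj(soas, n_vision_first, n_trials):
    """Coarse grid search (2 ms steps) for the maximum-likelihood PSS and sigma."""
    best = (float("inf"), None, None)
    for pss in range(-200, 201, 2):
        for sigma in range(10, 401, 2):
            nll = neg_log_likelihood(pss, sigma, soas, n_vision_first, n_trials)
            if nll < best[0]:
                best = (nll, pss, sigma)
    return best[1], best[2]

# Hypothetical SOAs (ms) and response counts generated from known parameters.
soas = [-510, -240, -120, -60, 0, 60, 120, 240, 510]
n_trials = [1000] * len(soas)
n_vision_first = [round(n * toj_model(s, -30, 120)) for s, n in zip(soas, n_trials)]
pss_hat, sigma_hat = fit_toj(soas, n_vision_first, n_trials)
```

Because the SJ function carries the extra amplitude parameter alpha, a difference in alpha between conditions changes the fitted curve even when PSS and sigma are similar, which is the concern behind comment 7(2).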

Minor comments:

1. The current title is wordy—it should be condensed.

2. The sound was presented through headphones, so the perceived location was near the participant’s head rather than in front. Does this influence the results of PSS, TBW, and rapid recalibration, since the visual and auditory stimuli were spatially separate (i.e., violating the unity assumption)?

3. Figures 4 & 6: It would be easier to read if the same conditions have the same color code (such as in Figure 2, the center condition has open dots in (a) and white bars in (b,c)). Same suggestion for Figures 3, 5, & 7.

4. A table summarizing all results of three experiments will be helpful.

References

Chien, S. E., Chen, Y. C., Matsumoto, A., Yamashita, W., Shih, K. T., Tsujimura, S. I., & Yeh, S. L. (2020). The modulation of background color on perceiving audiovisual simultaneity. Vision research, 172, 1-10.

García-Pérez, M. A., & Alcalá-Quintana, R. (2012). On the discrepant results in synchrony judgment and temporal-order judgment tasks: a quantitative model. Psychonomic bulletin & review, 19(5), 820-846.

García-Pérez, M. A., & Alcalá-Quintana, R. (2020). Assessing multisensory integration and estimating speed of processing with the dual-presentation timing task: Model and data. Journal of Mathematical Psychology, 96, 102351.

Shimojo, S., Scheier, C., Nijhawan, R., Shams, L., Kamitani, Y., & Watanabe, K. (2001). Beyond perceptual modality: Auditory effects on visual perception. Acoustical Science and Technology, 22(2), 61-67.

Stevenson, R. A., Fister, J. K., Barnett, Z. P., Nidiffer, A. R., & Wallace, M. T. (2012). Interactions between the spatial and temporal stimulus factors that influence multisensory integration in human performance. Experimental Brain Research, 219(1), 121-137.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Dec 16;16(12):e0261129. doi: 10.1371/journal.pone.0261129.r002

Author response to Decision Letter 0


6 May 2021

Replies to Reviewer 1’s comments

I appreciate your helpful and valuable comments on this manuscript. I have revised the manuscript to clarify the theoretical motivation for this study and the interpretation of its results.

Point 1: Page 2, Lines 11-13

"...exhibited more auditory preceding timing in the peripheral....."

Not clear what this means. Please rephrase.

Reply: I have rephrased this description to clarify the meaning of the text (page 2, lines 22-25).

“The results indicate that auditory stimuli should be presented earlier for visual stimuli in the central visual field than in the peripheral visual field condition for one to perceive subjective simultaneity in the temporal order judgment task conducted in this study.”

Point 2: "On the other hand...." Other hand relative to what? Consider deleting this cliche

Reply: I have deleted the pertinent text.

Point 3: “temporal binding window was larger…”

Presumably, this refers to a broader subjective simultaneity bandwidth, not amplitude. Please specify.

Reply: I revised this statement as per your suggestion (page 2, lines 25-27).

“Meanwhile, the subjective simultaneity bandwidth was broader in the central visual field than in the peripheral visual field during the simultaneity judgment task.”

Point 4: Line 21-23

I don't follow this. Which visual features are the authors referring to? Temporal response? If so, what kind of temporal response exactly?

Reply: The purpose of this study was to investigate the effects of differences in visual processing speed (including temporal resolution) based on visual fields on temporal synchrony perception of audio-visual stimuli. I have revised the passage to clarify the purpose (page 2, lines 31-35).

“These results suggest that differences in visual processing speed based on the visual field modulate the temporal processing of audio-visual stimuli. Future studies are necessary to confirm the effects of compensation regarding differences in the temporal resolution of the visual field in the later cortical visual pathway on visual field differences in audio-visual temporal synchrony.”

Point 5: Page 4 Line 18-23

The authors need to explain for the reader why they are comparing simultaneity judgment with TOJ tasks. The paragraph (lines 18-23) doesn't provide a justification.

Reply: I have added explanations as to why SJ was compared with TOJ tasks (page 5, lines 86-89).

“Van Eijk, Kohlrausch, Juola, and van de Par [26] have shown a lack of correlation between SJ and TOJ PSS, and proposed a different kind of sensitivity between TOJ and SJ for audio-visual asynchrony. Therefore, it is necessary to use both TOJ and SJ tasks to examine audio-visual temporal synchrony.”

Point 6: Line 22

“view distances…”

Presumably, the authors are referring to variations in eccentricity arising from variations in viewing distance of an otherwise identical stimulus. This needs to be clarified.

Reply: In Kopinska and Harris’s study, eccentricity and viewing distance were manipulated individually. I have revised the descriptions to clarify this point (page 4, line 80-page 5, line 83).

“Kopinska and Harris [11] compared the PSS between central (0° eccentricity) and peripheral (20° eccentricity) VFs using the TOJ task by manipulating both the eccentricity of the visual stimulus and participants’ viewing distance and showed that the timing of subjective simultaneity did not differ between VFs.”

Point 7: Page 5 Line 7

It's not clear from the Introduction what the theoretical motivation for the visual eccentricity (visual field) manipulation is.

Reply: My previous study, which used the SJ task, reported that the difference in visual processing speed based on spatial frequency modulates audio-visual temporal synchrony perception. Therefore, this study was conducted to confirm this finding by manipulating the eccentricity of visual stimuli, which is also associated with visual processing speed. I have added a description of this theoretical motivation (page 4, lines 64-78).

“Temporal synchrony perception for audio-visual stimuli is affected by the processing speed of the visual system. Previous studies have reported that the PSS for audio-visual stimuli was directed to more auditory leads in low spatial frequency stimuli than in high spatial frequency stimuli [20, 21]. Visual systems have two channels, namely the transient and sustained channels, with each kind having different temporal resolutions (processing speeds) [22]. Transient channels respond to low spatial frequencies and exhibit high temporal resolution, whereas sustained channels respond to high spatial frequencies and exhibit low temporal resolution [23]. Therefore, auditory stimuli should be presented earlier for low spatial frequency stimuli than for high spatial frequency stimuli for one to perceive subjective simultaneity. These two channels are projected from different retinal positions [24]. The transient channel is projected from the peripheral visual field (VF), while the sustained channel is projected from the central VF; thus, the processing speed of visual stimuli in the central VF is slower than that in the peripheral VF. The present study confirmed the effects of visual temporal resolution on audio-visual synchrony perception by manipulating the VF in which the visual stimulus is presented (i.e., eccentricity).”

Point 8: There are no obvious hypotheses pertaining to this manipulation for any of the three tasks or for the effects of temporal recalibration

Reply: I did not set obvious hypotheses for any of the three tasks since the purpose of this study was to confirm the differences in PSS due to the visual processing speed. For rapid recalibration, I added some predictions in the revised manuscript. (page 5, line 102-page 6, line 108).

“Using the SJ task, Takeshima [30] reported that normal rapid temporal recalibration occurred regardless of a difference in visual processing speed based on spatial frequency. The present study predicts that normal rapid temporal recalibration would be observed in both the central and peripheral VFs in the SJ task. In the TOJ task, as in Roseboom [29], temporal recalibration in the direction opposite to that in the SJ task is predicted to occur. This study also explores rapid temporal recalibration using SB perception since previous studies have not examined this using the paradigm under investigation.”
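
The recalibration predictions above rest on comparing responses according to the modality order of the preceding trial. As a minimal sketch (not the author's analysis code), the split could look like the following in Python. The function name, the trial tuple layout, and the choice to drop trials preceded by a 0 ms SOA are assumptions made for illustration; negative SOAs denote auditory leads, following the sign convention used in the reviews.

```python
def split_by_previous_soa(trials):
    """Split trials by the modality order of the immediately preceding trial.

    trials: list of (soa_ms, response) tuples in presentation order
            (hypothetical layout; negative SOA = auditory lead).
    Returns (after_audio_lead, after_vision_lead).
    """
    after_audio_lead, after_vision_lead = [], []
    # Pair every trial with its predecessor; the first trial has none and is skipped.
    for previous, current in zip(trials, trials[1:]):
        previous_soa = previous[0]
        if previous_soa < 0:
            after_audio_lead.append(current)
        elif previous_soa > 0:
            after_vision_lead.append(current)
        # Assumption: trials preceded by a 0 ms SOA belong to neither subset.
    return after_audio_lead, after_vision_lead

# Hypothetical trial sequence for illustration only.
example_trials = [(-100, "a"), (50, "v"), (0, "v"), (120, "a"), (-60, "v")]
after_audio, after_vision = split_by_previous_soa(example_trials)
```

A PSS (or TBW) estimated separately from each subset can then be compared; a difference between the two estimates is what the reviews refer to as rapid recalibration.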

Point 9: Page 10, Lines 19-21

Reference to neural transmission speed is an interpretation, and possibly a prediction/hypothesis. Either way, this doesn't belong in the Results section.

“…a vision leads response is suppressed…” [italics added]

Again, this is an interpretation, not a description of the experimental Results

Page 14, Lines 6-7

Again, this is an interpretation, not a description of the experimental Results

Reply: In the revised manuscript, I have clearly divided the “Results and discussion” section of the previous manuscript into a “Results” section and a “Discussion” section in each experiment.

Point 10: Page 20, Lines 12-13

How do you know they are more suppressed? It is more descriptive to say that TOJ vision leads responses are less frequent in the visual periphery than in central vision. To invoke suppression implies a mechanism for which these experiments offer no evidence

Reply: Although the result of the TOJ task was inconsistent with the VF difference in temporal resolution, it was consistent with that in visual latency in previous studies. I have discussed this point in the revised manuscript (page 12, line 245-255).

“The difference observed between the central and peripheral VFs was opposite to that predicted by the difference in temporal resolution. The temporal resolution is higher for the peripheral VF than for the central VF [24]; thus, it was predicted that auditory stimuli should be presented earlier for visual stimuli in the peripheral VF than in the central VF condition for one to perceive subjective simultaneity. However, the currently observed difference in the PSS was consistent with the difference in visual latency between the central and peripheral VFs. The response time for the visual stimulus was shorter at the central VF than at the peripheral VF [35]. Previous studies that manipulated spatial frequency also observed differences in the PSS that were consistent with differences observed in response time [20, 30]. Therefore, in a TOJ task, differences in the PSS between central and peripheral VFs could be attributed to differences in the response time, not temporal resolution.”

Point 11: Page 20, Lines 12-13

How do you know they are more suppressed? It is more descriptive to say that TOJ vision leads responses are less frequent in the visual periphery than in central vision. To invoke suppression implies a mechanism for which these experiments offer no evidence

Reply: I have revised the implications for the description of the PSS value (page 23, line 512-513).

“Experiment 1 showed that the PSS value was smaller in the central VF condition than in the peripheral VF condition.”

Point 12: Lines 14-15

Presumably, the authors are referring to the observation that SJ derived TBWs are broader at central than at peripheral visual field locations.

Reply: I have revised the sentence to refer to the VF difference in TBW width (page 23, lines 514-515).

“Experiment 2 indicated that the TBW was broader at the central VF than at the peripheral VF.”

Point 13: Page 22, Lines 22-23

I don't follow this sentence. Which temporal characteristics exactly are the authors referring to?

Reply: I have revised the last paragraph of the General discussion section to clarify the claim of this study (page 26, lines 561-567).

“The present findings also suggest a discrepancy in judgment between the VFs: participants appear to judge simultaneity between visual events and sound in the central VF, but to judge sound-induced bounce in the peripheral VF. The use of SB perception in investigating temporal synchrony perception for audio-visual stimuli needs to be carefully considered. Moreover, bounce responses differ between the VFs (i.e., the results of a no-sound experiment and alpha values). It is necessary to further investigate whether this difference affects the estimation of the PSS and TBW of SB perception.”

Point 14: Nor is it clear how the temporal characteristics of the visual field ought to explain the diverse pattern of task dependent visual field effects reported here.

Reply: Thank you for your comment. Based on this, I have reinterpreted the results regarding the VF difference in this study (page 24, line 523-page 25, line 539).

“VF differences in audio-visual temporal synchrony were observed in the TOJ and SJ tasks. In the TOJ task, the auditory stimulus must be presented earlier relative to the visual stimulus in the central than in the peripheral VF condition for one to perceive subjective simultaneity. In the SJ task, the TBW of the central VF is broader than that of the peripheral VF. The difference in the PSS of the TOJ task was consistent with the VF difference in visual latency, whereas the difference in the TBW of the SJ task was consistent with the VF difference in temporal resolution. The preliminary prediction was that the VF differences in the PSS and TBW would both follow differences in temporal resolution. It has been shown that eccentricity-dependent differences in temporal resolution are observed in the early visual cortex and are compensated later in the cortical visual pathway [39]. A TOJ task is assumed to be associated with higher-order processing compared with an SJ task [26, 40]. Therefore, the discrepancy between the VF difference in the PSS and temporal resolution could be attributed to the visual hierarchy of eccentricity-dependent temporal contrast in a TOJ task. Moreover, a TOJ task has been proposed to be sensitive to PSS change [26], whereas an SJ task has been proposed to be sensitive to TBW change [41]. Thus, the discrepancy in the indices for which VF differences were observed between the TOJ and SJ tasks would reflect the difference in sensitivity of indices to audio-visual synchrony perception.”

Replies to Reviewer 2’s comments

I appreciate your helpful and valuable comments on this manuscript and have revised it based on your comments.

Major comments:

1. In order to better predict and explain the results observed in the TOJ, SJ, and SB paradigms, the critical differences of their underlying mechanisms should be introduced and discussed. This is especially critical when different results were observed in each paradigm: should these differences be attributed to the different sensory, perceptual, or decisional processing?

As the author mentioned in the Discussion, SJ and TOJ may share the same perceptual mechanism but different decisional processes (e.g., García-Pérez & Alcalá-Quintana, 2015). On the other hand, the SB perception may involve causal relations and attention in addition to audiovisual integration (Shimojo et al., 2001; van Eijk et al., 2008). Hence, stronger theoretical backgrounds will be necessary to understand the distinct results observed in three experiments, and will provide a clearer rationale for the current study.

Reply: I have reinterpreted the results of the VF difference between the TOJ and SJ tasks based on the compensation of the difference in temporal resolution of the visual field in the later cortical visual pathway (page 24, line 525-page 25, line 539).

“VF differences in audio-visual temporal synchrony were observed in the TOJ and SJ tasks. In the TOJ task, the auditory stimulus must be presented earlier relative to the visual stimulus in the central than in the peripheral VF condition for one to perceive subjective simultaneity. In the SJ task, the TBW of the central VF is broader than that of the peripheral VF. The difference in the PSS of the TOJ task was consistent with the VF difference in visual latency, whereas the difference in the TBW of the SJ task was consistent with the VF difference in temporal resolution. The preliminary prediction was that the VF differences in the PSS and TBW would both follow differences in temporal resolution. It has been shown that eccentricity-dependent differences in temporal resolution are observed in the early visual cortex and are compensated later in the cortical visual pathway [39]. A TOJ task is assumed to be associated with higher-order processing compared with an SJ task [26, 40]. Therefore, the discrepancy between the VF difference in the PSS and temporal resolution could be attributed to the visual hierarchy of eccentricity-dependent temporal contrast in a TOJ task. Moreover, a TOJ task has been proposed to be sensitive to PSS change [26], whereas an SJ task has been proposed to be sensitive to TBW change [41]. Thus, the discrepancy in the indices for which VF differences were observed between the TOJ and SJ tasks would reflect the difference in sensitivity of indices to audio-visual synchrony perception.”

2. The author proposed an interesting assumption that visual stimulus presented to sustained channel and transient channel may lead to different audiovisual simultaneity perception, because the transient channel has higher temporal resolution and faster processing speed. However, these two characteristics in the temporal processing may be associated with different aspects of audiovisual simultaneity perception. In my intuitive guessing, higher temporal resolution may be associated with narrower audiovisual TBW, whereas the faster processing time may be associated with the shift of PSS (see the model of García-Pérez & Alcalá-Quintana, 2012, and a recent study by Chien et al., 2020). I also found that García-Pérez and Alcalá-Quintana published a paper last year (2020) on this issue.

Hence, I would expect that the PSS should be at more negative (i.e., the auditory stimulus should be presented earlier) in the periphery than in the center, if visual processing is faster in the periphery than in the center. This prediction was not consistent with the results, and even an opposite direction was observed in Experiment 1. I cannot follow the author’s “suppression” account on p. 10.

On the other hand, the result of TBW in Experiment 2 was consistent with my prediction that the TBW was narrower in the periphery than in the center because of the higher temporal resolution in the periphery. However, this result contrasts with the trend as a function of eccentricity reported by Stevenson et al. (2012). Please discuss this critical inconsistency.

Reply: Visual latency is shorter in the central VF than in the peripheral VF, whereas temporal resolution is higher in the peripheral VF than in the central VF. The PSS result in the TOJ task was consistent with the prediction because of this visual latency difference. Thus, I have reinterpreted the PSS results in Experiment 1 (page 12, lines 245-255).

“The difference observed between the central and peripheral VFs was opposite to that predicted by the difference in temporal resolution. The temporal resolution is higher for the peripheral VF than for the central VF [24]; thus, it was predicted that auditory stimuli should be presented earlier for visual stimuli in the peripheral VF than in the central VF condition for one to perceive subjective simultaneity. However, the current observed difference in the PSS was consistent with the difference in visual latency between the central and peripheral VFs. The response time for the visual stimulus was shorter at the central VF than at the peripheral VF [35]. A previous study that manipulae spatial frequency also observed differences in the PSS that were consistent with differences observed in response time [20, 30]. Therefore, in a TOJ task, differences in the PSS between central and peripheral VFs could be attributed to differences in the response time, not temporal resolution.”

Stevenson et al. (2012) reported an opposite TBW difference from the present study. However, their study manipulated a wider range of eccentricities, and a similar PSS difference to the results of Experiment 2 was observed at eccentricities closer to the present study. Therefore, their results are considered to be consistent with the results of this study (page 17, line 376-page 18, line 383).

“Stevenson, Kruger Fister, Barnett, Nidiffer, and Wallace [37] have found a broader TBW in the peripheral VF than in the central VF. However, a higher percentage of simultaneous judgment was observed in the peripheral VF outside of the present experiment (60° and 90° eccentricities) in their study. In the peripheral VF (30° eccentricity), which was closer to the present experiment, the percentage of simultaneous judgment is slightly lower than that in the central VF. Therefore, the TBW of the central VF is assumed to become wider than that of the peripheral VF up to a certain range (at least 30° eccentricity), which is attributed to temporal resolution.”

3. The results of rapid recalibration in SJ and TOJ were in the same direction in the current study, which is inconsistent with Roseboom (2019). On p. 21, if I understand correctly, the author claimed that the effect of rapid recalibration may be cancelled out by the choice-repetition bias in TOJ. However, I do not understand how this explanation is only applied in the periphery rather than in the center, given that the choice-repetition bias should occur in both conditions when they are mixed in a block. In addition, given the fact that response type in TOJ, SJ, and SB are very different from each other, I am also wondering how the choice-repetition bias can be applied in SB but not in SJ.

Reply: Previous studies reporting choice-repetition bias in a TOJ task used blurred visual stimuli. Although the present study used a sharp-edged visual stimulus, this stimulus was presumably perceived as blurred in the peripheral VF. Thus, a consistent interpretation can be made if the choice-repetition bias occurs only for blurred visual stimuli (page 12, line 256-page 13, line 269).

“The results of normal rapid recalibration for the central VF were inconsistent with the results of previous studies. Roseboom [29] showed that the PSSs shifted in opposite directions from normal rapid temporal recalibration in a TOJ task. Moreover, Keane, Bland, Matthews, Carroll, and Wallis [36] found that opposite-directed PSS shifts were induced by choice-repetition bias in a TOJ task. Choice-repetition bias refers to the tendency to repeat judgments of temporal order of a previous trial on a current trial. Additionally, rapid recalibration was obfuscated by opposite-directed PSS shifts due to a choice-repetition bias [36]. In this study, the choice-repetition bias did not occur only in the central VF. Roseboom [29] and Keane et al. [36] also presented the visual stimulus in the central VF; however, Roseboom [29] used a Gaussian probe and Keane et al. [36] used a small light-emitting diode. Unlike the circle used in this experiment, these visual stimuli had blurred edges. In the present experiment, choice-repetition bias also occurred in the peripheral VF, in which the visual stimuli were perceived as blurred. Therefore, blurred visual stimuli would be necessary for occurring choice-repetition bias in the TOJ task for audio-visual stimuli.”

In Experiment 3, I assumed that the difference in the judged information between the central and peripheral VFs caused the difference in rapid recalibration. Vroomen and Keetels (2020) showed that the PSS of SJ becomes positive, whereas that of SB becomes negative. Although a significant difference was not observed, the PSS of central VF was positive and that of the peripheral VF was negative. Thus, a consistent interpretation can be made if SB perception does not induce rapid recalibration (page 22, line 490-page 23, line 506).

“The PSS values did not significantly differ between the central and peripheral VFs in SB perception. Although there was no significant difference, the signs of PSS values were different between the central and peripheral VFs, a positive value in the central VF and a negative value in the peripheral VF. Vroomen and Keetels [38] showed a discrepancy between the optimal time of sound to induce bounce and maximal audio-visual synchrony in SB displays. The former timing was before the instance of superposition (i.e., negative value), whereas the later timing was after the instance of superposition (i.e., positive value). Thus, in this experiment, participants may have judged the simultaneity between visual events (i.e., contact between two moving visual stimuli) and sound in the central VF, whereas they judged the bounce perception of the two visual stimuli induced by sound in the peripheral VF.

The discrepancy in results between the central and peripheral VF in rapid recalibration could be explained by this difference in judgment. In this experiment, normal rapid recalibration occurred only in the central VF and not in the peripheral VF. The occurrence of rapid recalibration due to SB perception has not been examined. If SB judgment does not cause a change in the PSS due to the timing of sound on a previous trial, it is consistent with results of this experiment. Future studies are needed to confirm this postulation, since there are little experimental data on rapid recalibration using SB perception.”

4. To my knowledge, it is unusual that the rapid recalibration effect is observed in TBW as demonstrated in Experiment 1. Because the rapid recalibration effect in PSS was only observed in the center while the rapid recalibration effect in TBW was observed both in the center and periphery, the explanation on p.11 therefore does not work.

Reply: As you have pointed out, the TBW shift during rapid recalibration is unusual. I did not predict this result, and I cannot yet interpret it. Few data on the TBW have been reported in previous studies of rapid recalibration. Therefore, I would like to elucidate the mechanism underlying this effect in the future. I have added a description of this point (page 13, lines 270-278).

“Moreover, the difference in the TBW between audition and vision leads during the rapid recalibration process is a novel finding. In Experiment 1, the TBW width narrowed in the preceding vision leads presentation than in the preceding audition leads presentation. A narrow TBW indicates a high sensitivity to judge the temporal order between visual and auditory stimuli. Therefore, this finding shows that the temporal information of visual precedence in a previous trial increase the sensitivity in a TOJ task for audio-visual stimuli. In previous studies of rapid recalibration (e.g., [18], [29], and [34]), the difference in TBW width between audition and vision leads conditions have not been investigated. Such a difference in the TBW was not predicted in this study, and this needs to be examined in more detail in the future.”

5. I am wondering how the number of participants was determined in each experiment. I can see that the expected number was 20, but only 17 remaining in Experiment 3. I worry that this number is too small (the main effect of modality order was only marginal significant in the rapid recalibration analysis). The SB perception is a subjective experience and therefore a larger individual difference may exist.

Reply: The sample size was determined using PANGEA. The result of the calculation showed that the power (1-β) was 0.86 with a sample size of 16 participants in each experiment. I also felt that there were many outliers and the sample size became smaller in Experiment 3; however, I did not consider this to have such a large impact.

6. I have some critical questions regarding the experimental designs:

(1) In Experiment 1, the blank display before the visual target was 500-1000 ms, and the other blank after the visual target was 200-710 ms (this information is missing in the main text). It seems that both blanks were occasionally shorter than the largest SOAs between the visual and auditory target (+/- 510 ms).

(2) Technically, there is no +/-0 ms, only 0 ms. Hence, there should be only 9 SOAs in Experiments 2 and 3. Was the number of trials in the 0 ms SOA doubled as compared to other SOAs?

(3) In Experiment 3, how long is the duration of the motion display? A figure to demonstrate the procedure will be great.

Reply: (1) I have added a detailed description of the duration of the blank display (page 8, lines 160-167).

“The duration of this blank display was a fixed length of 500 ms, plus an additional SOA when the auditory stimulus was presented prior to the visual stimulus (i.e., 500 ms at the shortest and 1010 ms at the longest). During the target display period, a circle was presented at one of the following three locations: center, left, or right. The tone either preceded or followed the onset of the circle using the SOA that had been drawn randomly from the set. After the target display, a blank display was again presented. The duration of this blank display was a fixed length of 200 ms, plus an additional SOA when the visual stimulus was presented prior to the auditory stimulus (i.e., 200 ms at the shortest and 710 ms at the longest).”

Reply: (2) As you have pointed out, the number of trials in the 0 ms SOA was doubled compared with other SOAs. I have added a description of this point (page 14, lines 305-307).

“One block comprised 80 trials, with 10 trials for each SOA condition (i.e., 16 trials for 0 ms SOA and 8 trials for other SOAs).”

Reply: (3) The duration of the motion display was 1440 ms in the experiment. I have added a description and a figure on this point (page 19, lines 417-418).

“Each trial consisted of a 500 ms fixation stimulus followed by motion displays (a duration of 1440 ms, see Fig 6).”
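The blank-display timing described in reply (1) above can be made concrete with a small sketch. It assumes the sign convention used elsewhere in this correspondence (a negative SOA means the auditory stimulus leads, a positive SOA means the visual stimulus leads); the function and constant names are hypothetical and are not taken from the experiment code.

```python
# Minimal sketch of the blank-display durations described in reply (1).
# Assumption: negative SOA = auditory stimulus first, positive SOA = visual first.
# Names below are illustrative only.

FIXED_PRE_BLANK_MS = 500   # fixed part of the blank before the visual target
FIXED_POST_BLANK_MS = 200  # fixed part of the blank after the visual target


def blank_durations(soa_ms):
    """Return (pre_blank_ms, post_blank_ms) for one trial.

    The pre-target blank is extended by |SOA| when the tone leads, and the
    post-target blank is extended by SOA when the circle leads, so the tone
    always falls inside a blank period.
    """
    pre = FIXED_PRE_BLANK_MS + (-soa_ms if soa_ms < 0 else 0)
    post = FIXED_POST_BLANK_MS + (soa_ms if soa_ms > 0 else 0)
    return pre, post


if __name__ == "__main__":
    for soa in (-510, 0, 510):
        print(soa, blank_durations(soa))
```

With the largest SOA of 510 ms mentioned by the reviewer, this gives the 1010 ms and 710 ms maxima stated in the quoted passage.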

7. More details and clarifications in data analyses are required:

(1) In Experiment 1, please specify how to estimate the PSS and sigma (TBW) based on the cumulative Gaussian function.

(2) In Experiments 2 & 3, please explain the meaning of alpha, PSS and sigma in terms of the Gaussian function, respectively. Were the different alpha scores in the center and periphery critical in Experiment 3?

(3) Did different fitting methods used in Experiment 1 (maximum-likelihood) vs. Experiments 2 & 3 (minimal RMSE) potentially influence the estimation of PSS and sigma?

(4) On p. 19, line 4, were the p value of the two t tests corrected? Were they one- or two-tailed?

Reply: (1) I have added the formula for the cumulative Gaussian function in the revised manuscript (page 9, line 184).

$$P(\text{response} \mid \text{SOA}) = \frac{1}{1 + e^{-\frac{1}{\sigma}(\text{SOA} - \text{PSS})}}$$

(2) I have added a description of these parameters in terms of the Gaussian function in Experiment 2 (page 15, lines 318-320).

“…these parameters indicated the height, peak position on the SOA axis, and the width of the Gaussian function, respectively.”

The difference in the alpha value indicates the difference in the degree of bounce perception in Experiment 3. However, it is difficult to determine whether this difference is critical in connection with the results of Experiment 3. I have added a description of this point in the General discussion section (page 26, lines 565-567).

“Moreover, bounce responses differ between the VFs (i.e., the results of a no-sound experiment and alpha values). It is necessary to further investigate whether this difference affects the estimation of the PSS and TBW of SB perception.”

(3) I also calculated the RMSE of fitting in Experiment 1 and these values did not largely differ from the values of Experiments 2 and 3. Therefore, I assume that such a difference in the fitting method would not influence the estimations.

(4) I used the two-tailed one-sample t-tests, and these tests and p values were corrected. I apologize for inserting the incorrect figure regarding these results (Fig 8) in the previous manuscript. I have revised the figure.
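As an illustration of replies (1) and (2) above, the two fitted functions can be sketched as follows. The first function transcribes the formula given in reply (1); as written, it has the form of a logistic function in SOA. The second is only an assumed parameterization of the Gaussian for Experiments 2 and 3, chosen so that alpha is the height, PSS the peak position, and sigma the width, as the reply describes; the exact form used in the manuscript is not stated here. Function names are hypothetical.

```python
import math


def toj_response_probability(soa, pss, sigma):
    """Formula from reply (1): 1 / (1 + exp(-(SOA - PSS) / sigma)).

    Equals 0.5 when SOA == PSS; sigma controls the slope.
    """
    return 1.0 / (1.0 + math.exp(-(soa - pss) / sigma))


def sj_response_probability(soa, alpha, pss, sigma):
    """Assumed Gaussian: alpha * exp(-(SOA - PSS)^2 / (2 * sigma^2)).

    alpha = height, pss = peak position on the SOA axis, sigma = width.
    The 2 * sigma^2 scaling is an assumption, not taken from the manuscript.
    """
    return alpha * math.exp(-((soa - pss) ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Illustrative parameter values only; not estimates from the study.
    print(toj_response_probability(-20.0, pss=-20.0, sigma=50.0))
    print(sj_response_probability(10.0, alpha=0.9, pss=10.0, sigma=80.0))
```

Under this sketch, the PSS is read off as the SOA at which the first curve crosses 0.5 or the second curve peaks, and a larger sigma corresponds to a wider TBW.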

8. Based on the results, which paradigm will be recommended for researchers to explore similar issue in the future studies?

Reply: I recommend using the explicit paradigm (i.e., TOJ or SJ task), since this study suggested that different judgments occur between VFs in SB perception. I have added a description of this point in the revised manuscript (page 27, lines 590-595).

“Furthermore, a paradigm difference in temporal synchrony perception was also observed in this study, among TOJ, SJ, and SB. The present findings suggest that explicit measurements (i.e., TOJ and SJ tasks) are more desirable to examine the temporal synchrony perception of audio-visual stimuli between VFs, because of the different judgments that occur when using an implicit measurement (i.e., SB perception).”

Minor comments:

1. The current title is wordy—it should be condensed.

Reply: I have revised the manuscript title per your suggestion.

2. The sound was presented from headphones, so the perceived location was near the participant’s head rather than in front. Does this influence the results of PSS, TBW, and rapid recalibration, since the visual and auditory stimuli were spatially separate (i.e., violating the unity assumption)?

Reply: The distance between the participant and the display was approximately 70 cm. Many previous studies investigating the PSS, the TBW, and rapid recalibration have also used headphones to present sound to participants. Therefore, I assume that the use of headphones does not influence these results.

3. Figures 4 & 6: It would be easier to read if the same conditions have the same color code (such as in Figure 2, the center condition has open dots in (a) and white bars in (b,c)). Same suggestion for Figures 3, 5, & 7.

Reply: I have revised these figures per your suggestion.

4. A table summarizing all results of three experiments will be helpful.

Reply: I have added a table summarizing the results of the experiments per your suggestion.

Attachment

Submitted filename: Reply_PO3.docx

Decision Letter 1

Deborah Apthorp

22 Jun 2021

PONE-D-20-40727R1

Visual field differences in temporal synchrony processing for audio-visual stimuli

PLOS ONE

Dear Dr. Takeshima,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 06 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Deborah Apthorp, Ph.D

Academic Editor

PLOS ONE

Additional Editor Comments (if provided):

Thank you for your revisions to your manuscript. While the reviewers consider the manuscript considerably improved, several important concerns remain; in particular, both reviewers find that the English in the paper needs improving for comprehensibility, and both ask for clearer and more concise theoretical justifications for the experiments, as well as clearer explanations of the results.

Please revise the paper according to both reviewers' concerns.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: General comments:

The authors have done a good job addressing many of my previous concerns. The paper is substantially improved. However, some of my earlier concerns remain.

In particular, the Introduction. The authors need to explain why they have chosen to compare these various psychophysical measures of temporal synchrony. At present, it reads like a parametric list. That can be fine, but it needs to be theoretically, or at least empirically contextualised.

I also strongly encourage the authors to seek advice on English grammar before resubmitting.

There are numerous instances where tense confusions and awkward and/or imprecise phrasing leave this reader confused.

As for its scientific contribution, the paper offers some interesting and novel findings, particularly the interaction between eccentricity and modality order in the temporal recalibration experiments. Unfortunately, the absence of a coherent explanation makes it difficult to evaluate the significance of these findings. Perhaps the observation that these different measures of AV synchrony perception exhibit distinct functional characteristics is the paper’s major contribution.

Beyond this, I don’t know what else to suggest.

Specific comments:

Abstract

The authors mention the three methods they use to measure audio-visual timing performance. No explanation or motivation is provided as to why one might use different measures, or what they actually refer to.

The authors correctly describe the main findings associated with each of the various tasks used in this paper. No attempt is offered in the abstract to interpret these task-contingent effects.

Why use these different tasks? I would like to see some attempt to understand the variety of effects observed across the different tasks

Introduction.

39-42

Why do the authors make reference to this asymmetry? It doesn't seem relevant to the point they're making about the potential functional significance of AV synchrony

59-60

Of greatest relevance here is that a prolonged period of adaptation is not necessary to observe temporal recalibration.

67

...are composed of at least two spatio-temporal channels.

73-75

This is an over-generalisation. Please tone this back

Discussion

240

Process should be Processes

252

Spelling - manipulate

268-269

I don't follow what the authors mean here

Are they saying that blurred edges are a necessary constraint?

Table 1 title

Summary of the three experiments' main findings

524

Do the authors mean "In our TOJ task"?

I don't follow this sentence

529-530

Can the authors please clarify this statement?

532

"A TOJ task"

Suggest: TOJ tasks are assumed to be...

535

"a TOJ task"

Suggest:

TOJ tasks are proposed

562-563

Please clarify. I don't see how this relates to the previous sentence

568

Grammar!

574-575

Suggest: "Therefore, rapid recalibration may be induced by ...."

Reviewer #2: I appreciate that the author has tried to reply to my previous comments and to better explain the current results. However, it seems that each effect in each experimental paradigm is explained by a distinct mechanism, and these explanations are sometimes quite speculative and unconvincing. Taken together, even though the current version of the manuscript is improved compared to the previous version, I would expect the author to provide more parsimonious and coherent explanations. In addition, the readability of the manuscript should be improved, and professional English editing is essential.

1. In my previous point 1, I suggested that the author should introduce the critical differences of the mechanisms underlying the three paradigms used here. This is critical to help the author to predict possible results, and to provide readers clearer scope and motivation for the current study.

2. In my previous point 2, I proposed a dissociation between processing time and temporal resolution, which may lead to different predictions for PSS and TBW, respectively. Now the author suggests that the visual latency (which is close to the idea of processing time) is shorter in the center than in the periphery, which can explain the PSS result in the TOJ task. A hypothesis based on visual latency as a function of visual field should be addressed in the Introduction.

3. P. 12, Keane et al. (2020) demonstrated the “choice-repetition bias” account based on reanalyzing Roseboom’s (2019) data; Keane et al., instead, reported a null rapid recalibration effect in Experiment 1, and a typical rapid recalibration effect in Experiment 2. Hence, the author’s “blurry visual information” account cannot work.

Following the argument of “choice-repetition bias”, is it possible that the more pronounced rapid recalibration effect in both visual fields was a result of over-estimation due to this bias?

4. Given that the motion display was longer than 1 sec, was the participant’s eye movement monitored or controlled?

5. In Experiment 3, given the fact that the PSS was not significantly different in the center and periphery (p = .69), I don’t see that any explanation is needed. Specifically, the author’s suggestion that the participants made different judgments in the center and periphery is not convincing, and neither is the following paragraph on the rapid recalibration effect.

6. Table 1: in the summary of rapid recalibration, it is weird to compare the results in the center and periphery either in the A-leading PSS or in the V-leading PSS. The summary should be the comparison between the A-leading PSS and in the V-leading PSS in the center and periphery, respectively.

7. Here are some arguments that I cannot follow:

(1) P. 10, lines 204-205: I do not understand this argument since the results were analyzed and presented separately in the central and peripheral visual field.

(2) P. 25, lines 533-535: If eccentricity effect can be compensated in later visual processing, and the TOJ involved higher-order processing, how could the former explain the PSS difference in the center and periphery in the TOJ?

8. There are still some mistakes in the manuscript:

(1) P. 8, line 170: In Experiment 1, there were 10 SOAs. If there were 80 trials in a block, then there should be only 8 trials for each SOA. Same problem on p. 14, line 306, and p. 19, line 422.

(2) P. 11, line 223, the PSS should be “lower” or “smaller” in the center than in the periphery, since the values were negative.

(3) P. 14, line 298, and p. 19, line 412: it is unclear to me why the author insists on keeping the “±0 ms” SOA, since it does not make sense at all.

(4) In the Procedure in Experiment 3, was the number of trials at the 0 ms SOA also double that at the other SOAs?

(5) P. 28, line 603: “…rapid temporal recalibration occurred only at the periphery VF in the TOJ…” should be at the central VF.
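The trial-count arithmetic behind points 8(1), 8(3), and 8(4) can be checked with a short sketch. It assumes, as the earlier reply states for Experiment 2, that a block of 80 trials is split evenly over 10 nominal SOA entries and that the "+0" and "-0" entries are physically the same 0 ms SOA. The SOA magnitudes below are step indices used only for illustration, not the actual SOA values.

```python
from collections import Counter


def trials_per_physical_soa(nominal_soas, trials_per_block):
    """Split a block evenly over nominal SOA entries, then merge duplicates.

    nominal_soas may list the same physical SOA more than once (e.g. +0 and
    -0); their trials are summed under one key.
    """
    per_entry, remainder = divmod(trials_per_block, len(nominal_soas))
    if remainder:
        raise ValueError("trials_per_block is not divisible by the number of SOA entries")
    counts = Counter()
    for soa in nominal_soas:
        counts[soa] += per_entry
    return counts


# Hypothetical design: 5 magnitudes (step indices 0..4) crossed with two signs.
# -1 * 0 and +1 * 0 both evaluate to 0, so the "±0" entries collapse to one SOA.
nominal = [sign * step for sign in (-1, 1) for step in range(5)]
counts = trials_per_physical_soa(nominal, 80)

if __name__ == "__main__":
    for soa in sorted(counts):
        print(soa, counts[soa])
```

Under these assumptions there are 9 distinct SOAs, with 16 trials at 0 ms and 8 trials at each of the others, so "10 trials for each SOA condition" cannot hold for an 80-trial block.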

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Dec 16;16(12):e0261129. doi: 10.1371/journal.pone.0261129.r004

Author response to Decision Letter 1


31 Jul 2021

Replies to Reviewer 1’s comments

I appreciate your helpful and valuable comments on this manuscript. I have revised the manuscript to clarify the theoretical motivation and rationale of this study.

Point 1: Abstract

The authors mention the three methods they use to measure audio-visual timing performance. No explanation or motivation is provided as to why one might use different measures, or what they actually refer to. The authors correctly describe the main findings associated with each of the various tasks used in this paper. No attempt is offered in the abstract to interpret these task-contingent effects. Why use these different tasks? I would like to see some attempt to understand the variety of effects observed across the different tasks.

Reply: I have added the explanation for the use of different measures in this study (lines 22–23), and suggestions from current results (lines 34–36) to the Abstract.

Point 2: 39-42

Why do the authors make reference to this asymmetry? It doesn't seem relevant to the point they're making about the potential functional significance of AV synchrony

Reply: I have deleted the relevant statements based on your comment.

Point 3: 59-60

Of greatest relevance here is that a prolonged period of adaptation is not necessary to observe temporal recalibration.

Reply: I have revised the description as per your suggestion (lines 60–61).

Point 4: 67

...are composed of at least two spatio-temporal channels.

Reply: I have revised the description as per your suggestion (lines 68–69).

Point 5: 73-75

This is an over-generalisation. Please tone this back

Reply: I have revised the relevant statements to be more precise (lines 74–79).

“The retinal positions of vision have a similar difference in these two channels: The central vision has a low temporal resolution, while the peripheral vision has a high temporal resolution [23]. Furthermore, there is a difference in visual latency between the central and peripheral visual fields (VFs): The response time for the visual stimulus is shorter at the central VF than it is at the peripheral VF [24]. Hence, the processing speed differs between the central and peripheral VFs.”

Point 6: 240

Process should be Processes

Reply: I have revised the description as per your suggestion (line 252).

Point 7: 252

Spelling - manipulate

Reply: I apologize for this typo. This has been corrected (line 264).

Point 8: 268-269

I don't follow what the authors mean here

Are they saying that blurred edges are a necessary constraint?

Reply: Reviewer 2 also could not follow the discussion. I have therefore reinterpreted this result on the basis of information reliability (lines 274–278).

“In this study, choice-repetition bias was suppressed in the central VF. Low information reliability induces a larger choice-repetition bias [39, 40]. Therefore, the reliability of judging temporal order for audio-visual stimuli would be high in the central VF. This speculation needs to be further investigated”

Point 9: Table 1 title

Summary of the three experiments' main findings

Reply: I have changed the title of Table 1 as per your suggestion.

Point 10: 524

Do the authors mean "In our TOJ task"? I don't follow this sentence

Reply: I have revised the wording to make it more understandable (lines 530–533).

“VF differences in audio-visual temporal synchrony were observed in the TOJ and SJ tasks in this study. In the TOJ task of Experiment 1, the PSS score indicated that the auditory stimulus is presented earlier as a visual stimulus for one to perceive subjective simultaneity in the central than in the peripheral VF condition.”

Point 11: 529-530

Can the authors please clarify this statement?

Reply: I have clarified this statement (lines 536–539).

“As a preliminary prediction, the VF differences in the PSS and TBW both follow differences in temporal resolution: The PSS score was larger and the TBW width was wider in the central VF than in the peripheral VF.”

Point 12: 532

"A TOJ task"

Suggest: TOJ tasks are assumed to be...

Reply: I have revised the description as per your suggestion (line 541).

Point 13: 535

"a TOJ task"

Suggest: TOJ tasks are proposed

Reply: I have revised the description as per your suggestion (line 544).

Point 14: 562-563

Please clarify. I don't see how this relates to the previous sentence

Reply: I have deleted this sentence as per Reviewer 2’s comment.

Point 15: 574-575

Suggest: "Therefore, rapid recalibration may be induced by ...."

Reply: I have revised the description as per your suggestion (line 577). 

Replies to Reviewer 2’s comments

I appreciate your helpful and valuable comments on this manuscript and have revised it based on the comments.

1. In my previous point 1, I suggested that the author should introduce the critical differences of the mechanisms underlying the three paradigms used here. This is critical to help the author to predict possible results, and to provide readers clearer scope and motivation for the current study.

Reply: I have added descriptions regarding the differences in the mechanisms underlying the three paradigms (lines 92–95, lines 103–105).

“The underlying mechanisms differ between the TOJ and SJ tasks. The TOJ task reflects temporal discrimination processes, whereas the SJ task reflects temporal binding processes [26]. Furthermore, the differences between the TOJ and SJ tasks stem from their decisional and response processes [27].”

“Apparent causality among visual and auditory events as SB perception affects audio-visual synchrony perception in early multisensory integration processes [31].”

2. In my previous point 2, I have proposed that the dissociation between processing time and temporal resolution, which may lead to different predictions for PSS and TBW, respectively. Now the author suggested that the visual latency (which is close to the idea of processing time) is shorter in the center than in the periphery, which can explain the PSS result in the TOJ task. There should be hypothesis based on visual latency as a function of visual field addressed in Introduction.

Reply: I have added a hypothesis based on visual latency as a function of visual field (lines 74–84).

“The retinal positions of vision have a similar difference in these two channels: The central vision has a low temporal resolution, while the peripheral vision has a high temporal resolution [23]. Furthermore, there is a difference in visual latency between the central and peripheral visual fields (VFs): The response time for the visual stimulus is shorter at the central VF than it is at the peripheral VF [24]. Hence, the processing speed differs between the central and peripheral VFs. The present study confirmed the effects of visual temporal resolution on audio-visual synchrony perception by manipulating the VF in which the visual stimulus is presented (i.e., eccentricity). If synchrony perception for audio-visual stimuli follows differences in temporal resolution, then the TBW of the central VF would be wider than that of the peripheral VF, whereas if it follows the difference in visual latency, then the PSS of the central VF would be lower than that of the peripheral VF.”

3. P. 12, Keane et al. (2020) demonstrated the “choice-repetition bias” account based on reanalyzing Roseboom’s (2019) data; Keane et al., instead, reported a null rapid recalibration effect in Experiment 1, and a typical rapid recalibration effect in Experiment 2. Hence, the author’s “blurry visual information” account cannot work. Following the argument of “choice-repetition bias”, is it possible that the more pronounced rapid recalibration effect in both visual fields was a result of over-estimation due to this bias?

Reply: Keane et al. (2020) have shown in Experiment 2 that suppressing the choice-repetition bias leads to the typical rapid recalibration effect in the TOJ task. Therefore, in the TOJ task, the choice-repetition bias would be suppressed in the central VF. The choice-repetition bias is associated with the information reliability. Therefore, I have revised the interpretation on the basis of information reliability (lines 274–278).

“In this study, choice-repetition bias was suppressed in the central VF. Low information reliability induces a larger choice-repetition bias [39, 40]. Therefore, the reliability of judging temporal order for audio-visual stimuli would be high in the central VF. This speculation needs to be further investigated.”

4. Given that the motion display was longer than 1 sec, was the participant’s eye movement monitored or controlled?

Reply: The participants were instructed to gaze at the fixation cross during the motion display period. However, eye movement was not monitored. I have added a description of this point (line 430, lines 502–506).

“In the motion display period, participants were instructed to gaze the fixation cross.”

“Audio-visual temporal processing was not affected by VF differences in SB perception. SB perception did not differ in terms of the PSS and TBW between the central and peripheral VFs. However, the duration of motion display was over 1000 ms and participants’ eye movements were not monitored in this experiment. Thus, the participants’ gaze was off the fixation point in the motion display period, which may have affected the current results.”

5. In Experiment 3, given the fact that the PSS was not significantly different in the center and periphery (p = .69), I don’t see any explanation is needed. Specifically, the author’s suggestion that the participants made different judgments in the center and periphery is not convincing, and so is the following paragraph for the rapid recalibration effect.

Reply: I have deleted the statements regarding the VF differences in SB perception based on your comment.

6. Table 1: in the summary of rapid recalibration, it is weird to compare the results in the center and periphery either in the A-leading PSS or in the V-leading PSS. The summary should be the comparison between the A-leading PSS and in the V-leading PSS in the center and periphery, respectively.

Reply: I have revised Table 1 as per your suggestion.

7. Here are some arguments that I cannot follow:

(1) P. 10, lines 204-205: I do not understand this argument since the results were analyzed and presented separately in the central and peripheral visual field.

(2) P. 25, lines 533-535: If eccentricity effect can be compensated in later visual processing, and the TOJ involved higher-order processing, how could the former explain the PSS difference in the center and periphery in the TOJ?

Reply: (1) I did not split the visual field of the previous trial (Trial t-1); only the visual field of the current trial (Trial t) was split. I have clarified the description of this point (lines 216–218).

“The VFs in the previous trial (Trial t-1) were not split because the correspondence of the spatial location between the current (Trial t) and previous (Trial t-1) trials did not affect the rapid recalibration [37].”

(2) Later visual processing compensates for the visual field difference in temporal resolution. However, it is not clear whether the visual field difference in visual latency is also compensated for in later visual processing. The visual field difference in the PSS was consistent with that in visual latency in Experiment 1. Therefore, I interpreted this as indicating that visual latency was not compensated for in higher-order visual processing, and that synchrony perception was consistent with the visual field difference in visual latency in the TOJ task.

8. There are still some mistakes in the manuscript:

(1) P. 8, line 170: In Experiment 1, there were 10 SOAs. If there were 80 trials in a block, then there should be only 8 trials for each SOA. Same problem on p. 14, line 306, and p. 19, line 422.

(2) P. 11, line 223, the PSS should be “lower” or “smaller” in the center than in the periphery, since the values were negative.

(3) P. 14, line 298, and p. 19, line 412: it is unclear to me why the author insists to keep the “±0 ms” SOA, since it does not make sense at all.

(4) In the Procedure in Experiment 3, was there also double number of trials at the 0 ms SOA than other SOAs?

(5) P. 28, line 603: “…rapid temporal recalibration occurred only at the periphery VF in the TOJ…” should be at the central VF.

Reply: (1) I have revised the descriptions (lines 182, 316, and 433).

(2) I have revised the term (line 235).

(3) I have revised the statement as per your suggestion (lines 316–317 and 433–434).

(4) You are right. I have revised the statement (lines 308 and 422).

(5) I have revised the description (line 606).
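
The arithmetic behind points (1) and (4) can be checked with a short sketch. It assumes that the trials in a block are divided evenly across SOAs, with the 0 ms SOA optionally weighted twice. The function name, the placeholder SOA labels, and the 80-trial block size for the nine-SOA case are illustrative assumptions, not details taken from the manuscript.

```python
def trials_per_soa(block_trials, soa_weights):
    """Split a block's trials across SOAs in proportion to integer weights."""
    total_weight = sum(soa_weights.values())
    if block_trials % total_weight != 0:
        raise ValueError("block size is not divisible by the total weight")
    unit = block_trials // total_weight
    return {soa: unit * weight for soa, weight in soa_weights.items()}


# Reviewer's example for Experiment 1: 10 SOAs and 80 trials per block.
# The SOA labels 0..9 are placeholders; only their count matters here.
exp1 = trials_per_soa(80, {soa: 1 for soa in range(10)})

# Nine SOAs with the 0 ms SOA presented twice as often as the others
# (assumed block of 80 trials; SOA labels are again placeholders).
nine_soas = {soa: 1 for soa in (-4, -3, -2, -1, 1, 2, 3, 4)}
nine_soas[0] = 2
exp23 = trials_per_soa(80, nine_soas)
```

Under these assumptions, ten equally weighted SOAs give 8 trials each, and nine SOAs with a doubled 0 ms condition give 8 trials per non-zero SOA and 16 trials at 0 ms.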

Attachment

Submitted filename: Reply2_PO3.docx

Decision Letter 2

Deborah Apthorp

22 Oct 2021

PONE-D-20-40727R2

Visual field differences in temporal synchrony processing for audio-visual stimuli

PLOS ONE

Dear Dr. Takeshima,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 06 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Deborah Apthorp, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments (if provided):

The reviewers still have several concerns with the paper, in particular that the arguments are not very coherent. However, I feel the paper could be improved sufficiently to be publishable in PLOS ONE. I am unwilling to send this paper back to the reviewers as it has already gone through two rounds of revision. Thus, if the authors could carefully address each of the reviewers' comments (excluding Reviewer 2's suggestion to turn the paper into 3 separate papers), I will assess the manuscript myself and make a decision regarding publication.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript is improved and most of my comments have been addressed. It still reads as a collection of disparate results and tasks. Whilst this doesn’t make for a satisfying story it is arguably an accurate reflection of the state of the current empirical literature on AV synchrony perception. For this reason, the paper makes a useful contribution. I encourage the author to make this point in the Discussion.

Minor point

Line 81-82

I don't follow the presented logic of why AV temporal bandwidth is predicted to be broader in the central than in the peripheral visual field.

Reviewer #2: This is my third time reading this manuscript. Unfortunately, I still think that the current version of manuscript does not provide a clear rationale and comprehensive view of the study – it remains in a superficial state. I also think that the author simply included my words in previous comments in the current manuscript, rather than providing coherent explanations or elaborating these points.

The main problem is that the author included too many issues in a paper that are: 1) the PSS, width, and rapid recalibration of temporal binding window; 2) their differences in the central and peripheral visual fields; and 3) in three experimental paradigms. Not to mention that the author also had to address the inconsistent results in previous studies, as well as the novel results in the current study. My suggestion is that the author should separate three experiments into different papers, so the author can well discuss the results or conduct further experiments. Here are my concerns of the manuscript:

1. The main purpose of the current study, I assume, is stated in the title “visual field differences in temporal processing for audio-visual stimuli”. Given the fact that the author obtained different results in the three experimental paradigms, it is hard for me to judge which result pattern is genuinely attributed to the influence of eccentricity at the level of perceptual processing. Even if the author accepted my previous suggestions that TOJ involves post-perceptual processing compared to SJ, and SB involves causal relations between the audio-visual stimuli, explanations regarding the relationship between the results in the different visual fields and these mechanisms remain lacking.

For example, in TOJ, the PSS was more negative (i.e., at the auditory-leading side) in the center than in the periphery. On p. 13, the author stated that this difference was consistent with “difference in visual latency between the central and peripheral VFs”. But later the author said “could be attributed to differences in the response time”. Was this a perceptual effect or decisional effect? Would the pattern be reversed if the task is changed into “which stimulus comes second”?

Another example on the top of p. 7: I don’t see the link between “the rapid temporal recalibration in the central and peripheral visual fields” and “the reversed pattern of PSS shift in TOJ and SJ reported by Roseboom”.

One more example is that, on p. 7, if the author’s previous study did not observe different rapid temporal recalibration based on different spatial frequency, then what is the reason to assume that there would be difference in the center vs. peripheral visual fields?

2. The author accepted previous suggestions and argued that temporal resolution is related to TBW whereas visual latency is related to PSS (p. 5). However, on p. 13, the authors still contrasted temporal resolution and visual latency as two possible explanations for PSS. The same confusion appeared again on p. 26.

3. Some sentences are still hard to follow or incorrect, such as:

p. 4: “Moreover, rapid recalibration, in which a prolonged period of adaptation is not necessary to observe temporal recalibration, has been reported.” – I don’t get it.

p. 8: the sentence regarding the power analysis is too long and many “and”.

p. 15 and p. 20: there were only 9 SOAs in SJ and SB tasks.

p. 19: “The binding for visual and auditory stimuli must be more flexible at the central VF, given the low temporal resolution” I don’t see the relations.

p. 22, line 456: should be Figs 7b and 7c

p. 24 “If SB judgment does not cause a change in the PSS due to the timing of sound in a previous trial” How come a participant’s judgment can cause something?

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Dec 16;16(12):e0261129. doi: 10.1371/journal.pone.0261129.r006

Author response to Decision Letter 2


14 Nov 2021

Replies to Reviewer 1’s comments

I appreciate your helpful and valuable comments on this manuscript. I have added a discussion of the inconsistency of the results among the three methods.

Main point

The manuscript is improved and most of my comments have been addressed. It still reads as a collection of disparate results and tasks. Whilst this doesn’t make for a satisfying story it is arguably an accurate reflection of the state of the current empirical literature on AV synchrony perception. For this reason, the paper makes a useful contribution. I encourage the author to make this point in the Discussion.

Reply: I have discussed the disparate results among the three methods in the General discussion section (lines 595–601).

“The PSS score tends to be a negative value (i.e., interpreted as synchrony when auditory stimulus leads to the visual stimulus) in a TOJ task compared to an SJ task [28]. SB perception is also likely to show a negative PSS value compared to an SJ task, since causality judgment is performed in SB perception [47]. In this study, the PSS scores indicated a similar tendency to these functional characteristics. One of the potential contributions is that each of these three methods measured different functional characteristics for audio-visual synchrony perception in a series of experiments.”

Minor point

Line 81-82

I don't follow the presented logic of why AV temporal bandwidth is predicted to be broader in the central than in the peripheral visual field.

Reply: I predicted that temporal bandwidth becomes broader to make the AV synchrony perception tolerant in the central VF, because the timing perception of visual presentation is ambiguous in the central VF due to its low temporal resolution. I have revised the description to clarify this point (lines 80–83).

“If synchrony perception for audio-visual stimuli follows differences in temporal resolution, then the TBW of the central VF would be wider than that of the peripheral VF for tolerant synchrony perception, because the timing perception of visual presentation is ambiguous in the central VF due to low temporal resolution.” 
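
To make this prediction concrete, the following sketch summarises a simultaneity judgment curve by its mean and standard deviation, treating the mean as the PSS and the standard deviation as an index of the TBW. This moment-based summary is a simplifying assumption and is not necessarily the fitting procedure used in the study. The SOA values and response proportions are made up for illustration and are not the study's data.

```python
import math


def sj_moments(soas_ms, p_simultaneous):
    """Return (PSS, TBW index) as the mean and SD of the 'simultaneous' curve."""
    total = sum(p_simultaneous)
    if total <= 0:
        raise ValueError("no 'simultaneous' responses to summarise")
    pss = sum(s * p for s, p in zip(soas_ms, p_simultaneous)) / total
    var = sum(p * (s - pss) ** 2 for s, p in zip(soas_ms, p_simultaneous)) / total
    return pss, math.sqrt(var)


# Hypothetical SOAs in ms (negative = auditory stimulus leads).
soas = [-300, -200, -100, 0, 100, 200, 300]
# Made-up proportions of 'simultaneous' responses for each visual field.
central = [0.10, 0.40, 0.80, 0.95, 0.80, 0.40, 0.10]     # broader curve
peripheral = [0.02, 0.15, 0.70, 0.95, 0.70, 0.15, 0.02]  # narrower curve

pss_c, tbw_c = sj_moments(soas, central)
pss_p, tbw_p = sj_moments(soas, peripheral)
```

With these invented curves, both PSS estimates sit at 0 ms while the TBW index is larger for the broader (central) curve, which is the pattern that the temporal-resolution account predicts. Note that a moment-based width is limited by the tested SOA range, so a fitted function would usually be preferred for real data.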

Replies to Reviewer 2’s comments

I appreciate your helpful and valuable comments on this manuscript and have revised it based on the comments. However, I disagree with separating the three experiments into different papers, because I consider that one of the useful contributions of this paper is to report the inconsistency of the results among the three methods.

1-1. The main purpose of the current study, I assume, is stated in the title “visual field differences in temporal processing for audio-visual stimuli”. Given the fact that the author obtained different results in the three experimental paradigms, it is hard for me to judge which result pattern is genuinely attributed to the influence of eccentricity at the level of perceptual processing. Even if the author accepted my previous suggestions that TOJ involves post-perceptual processing compared to SJ, and SB involves causal relations between the audio-visual stimuli, explanations regarding the relationship between the results in the different visual fields and these mechanisms remain lacking.

Reply: I believe that the explanation regarding the VF differences shown in the present study is sufficient. However, the statements were partly unclear in the previous manuscript, and I have, therefore, revised these points.

1-2. For example, in TOJ, the PSS was more negative (i.e., at the auditory-leading side) in the center than in the periphery. On p. 13, the author stated that this difference was consistent with “difference in visual latency between the central and peripheral VFs”. But later the author said “could be attributed to differences in the response time”. Was this a perceptual effect or decisional effect? Would the pattern be reversed if the task is changed into “which stimulus comes second”?

Reply: I consider the difference in PSS between the central and peripheral VFs to be a perceptual effect. Breitmeyer (1985) reported that response time reflects differences in visual latency. Thus, the difference in response time between the central and peripheral VFs would also be a perceptual effect caused by the difference in visual latency. I have added statements on this point (lines 259–266).

“The difference in the PSS observed between the central and peripheral VFs was consistent with the difference in visual latency between the central and peripheral VFs. The response time for the visual stimulus was shorter in the central VF than in the peripheral VF [24]. Previous studies that manipulated spatial frequency also observed differences in the PSS consistent with the differences observed in response time [19, 29]. The difference in visual latency is reflected in the response time of the visual stimulus [22]. Therefore, in a TOJ task, differences in the PSS between the central and peripheral VFs could be attributed to differences in visual latency rather than temporal resolution.”
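
For readers unfamiliar with how a PSS is read off TOJ data, here is a minimal sketch. It assumes that the proportion of "visual first" responses follows a cumulative Gaussian over SOA (negative SOA meaning the auditory stimulus leads) and estimates the PSS by an unweighted straight-line fit to probit-transformed proportions. The function name, SOA values, and generating parameters are hypothetical, and the fitting procedure actually used in the study may differ.

```python
from statistics import NormalDist


def toj_pss(soas_ms, p_visual_first, eps=0.01):
    """Estimate (PSS, sigma) from a line fitted to probit-transformed proportions."""
    std_normal = NormalDist()
    # Clip proportions away from 0 and 1 so the probit transform stays finite.
    z = [std_normal.inv_cdf(min(max(p, eps), 1 - eps)) for p in p_visual_first]
    n = len(soas_ms)
    mean_x = sum(soas_ms) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in soas_ms)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(soas_ms, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # P("visual first") = 0.5 where the fitted line crosses z = 0.
    return -intercept / slope, 1 / slope


# Noise-free synthetic observer with a hypothetical PSS of -30 ms and SD of 80 ms.
observer = NormalDist(mu=-30.0, sigma=80.0)
soas = [-150, -100, -50, 0, 50, 100, 150]
props = [observer.cdf(s) for s in soas]
pss, sigma = toj_pss(soas, props)
```

On this synthetic observer the fit recovers the generating values. A negative PSS means that the auditory stimulus has to lead for the pair to be perceived as simultaneous, so a lower PSS in the central VF corresponds to the sound needing to be presented earlier there.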

1-3. Another example on the top of p. 7: I don’t see the link between “the rapid temporal recalibration in the central and peripheral visual fields” and “the reversed pattern of PSS shift in TOJ and SJ reported by Roseboom”.

Reply: There was a mistake in the description; I have therefore corrected the wording (lines 110–116).

“In addition, the differences in the effects of rapid temporal recalibration between SJ and TOJ tasks were investigated to confirm the effects of VFs on audio-visual temporal processing in this study. Roseboom [32] showed the opposite change in PSS due to rapid temporal recalibration between SJ and TOJ tasks: the PSS changed in the same direction as the stimulus onset asynchrony (SOA) of previous trials in the SJ task and in the opposite direction in the TOJ task. Therefore, this difference in the rapid temporal recalibration between SJ and TOJ tasks was also investigated.”
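
The direction of a rapid recalibration effect can be sketched as the difference between two conditional PSS estimates in an SJ task: one computed from trials that followed a vision-leading trial and one from trials that followed an audition-leading trial. The sketch assumes that SOAs are sampled equally often, so that the mean SOA of "simultaneous" responses can stand in for the PSS. The function names and the trial sequence are invented for illustration and do not reproduce the study's analysis.

```python
def mean_sync_soa(trials):
    """PSS proxy: mean SOA of the trials judged 'simultaneous'."""
    sync_soas = [soa for soa, judged_sync in trials if judged_sync]
    if not sync_soas:
        raise ValueError("no 'simultaneous' responses")
    return sum(sync_soas) / len(sync_soas)


def rapid_recalibration(trial_sequence):
    """Return PSS(after vision-leading) minus PSS(after audition-leading).

    trial_sequence holds (soa_ms, judged_simultaneous) pairs in presentation
    order; a negative SOA means the auditory stimulus leads.
    """
    after_audio, after_vision = [], []
    for previous, current in zip(trial_sequence, trial_sequence[1:]):
        if previous[0] < 0:
            after_audio.append(current)
        elif previous[0] > 0:
            after_vision.append(current)
        # Trials preceded by a 0 ms SOA are left out of both sets.
    return mean_sync_soa(after_vision) - mean_sync_soa(after_audio)


# Made-up trial sequence, not data from the study.
sequence = [
    (-100, False), (-50, True), (100, False), (50, True),
    (-50, False), (0, True), (100, False), (0, True),
]
effect = rapid_recalibration(sequence)
```

A positive value means that the PSS moved in the same direction as the SOA of the previous trial, the pattern described for the SJ task; a negative value would correspond to the opposite-direction shift described for the TOJ task.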

1-4. One more example is that, on p. 7, if the author’s previous study did not observe different rapid temporal recalibration based on different spatial frequency, then what is the reason to assume that there would be difference in the center vs. peripheral visual fields?

Reply: I considered that the difference in PSS change between the central and peripheral VFs is related to the magnitude of the choice-repetition bias in the TOJ task. In this task, unlike the SJ task, the choice-repetition bias operates in the process of PSS change. Thus, it is possible that the magnitude of the choice-repetition bias, not the rapid recalibration process, caused the VF difference in the PSS change. I have clarified this explanation in the revised manuscript (lines 267–277).

“The results of normal rapid recalibration for the central VF were inconsistent with the predictions based on previous studies. Roseboom [32] showed that the PSSs shifted in opposite directions from normal rapid temporal recalibration in a TOJ task. Moreover, Keane, Bland, Matthews, Carroll, and Wallis [38] found that opposite-directed PSS shifts were induced by choice-repetition bias in a TOJ task. Choice-repetition bias refers to the tendency to repeat judgments of the temporal order of a previous trial in a current trial. Additionally, rapid recalibration was obfuscated by opposite-directed PSS shifts to a choice-repetition bias [38]. It is possible that the choice-repetition bias was suppressed in the central VF in this study. Low information reliability induces a larger choice-repetition bias [39, 40]. Therefore, the reliability of judging temporal order for audio-visual stimuli would be high in the central VF. This speculation needs to be further investigated.”

2. The author accepted previous suggestions and argued that temporal resolution is related to TBW whereas visual latency is related to PSS (p. 5). However, on p. 13, the authors still contrasted temporal resolution and visual latency as two possible explanations for PSS. The same confusion appeared again on p. 26.

Reply: Thank you; I have accepted your useful suggestion. However, it is also necessary to explain why visual latency, rather than temporal resolution, affected the PSS score in the TOJ task. I revised the descriptions to avoid confusion and clarify my arguments (lines 259–266, lines 537–545).

“The difference in the PSS observed between the central and peripheral VFs was consistent with the difference in visual latency between the central and peripheral VFs. The response time for the visual stimulus was shorter in the central VF than in the peripheral VF [24]. Previous studies that manipulated spatial frequency also observed differences in the PSS consistent with the differences observed in response time [19, 29]. The difference in visual latency is reflected in the response time of the visual stimulus [22]. Therefore, in a TOJ task, differences in the PSS between the central and peripheral VFs could be attributed to differences in visual latency rather than temporal resolution.”

“As a preliminary prediction, the VF differences in the PSS and TBW both follow differences in each visual latency and temporal resolution: the PSS score was lower and the TBW width was wider in the central VF than in the peripheral VF. It has been shown that the difference in eccentricity-dependent temporal resolution is observed in the early visual cortex and is compensated later in the cortical visual pathway [42]. TOJ tasks are assumed to be associated with higher-order processing compared to SJ tasks [28, 43]. Therefore, what PSS score was affected by visual latency could be attributed to the visual hierarchy of eccentricity-dependent temporal contrast in a TOJ task.”

3. Some sentences are still hard to follow or incorrect, such as:

(1) p. 4: “Moreover, rapid recalibration, in which a prolonged period of adaptation is not necessary to observe temporal recalibration, has been reported.” – I don’t get it.

(2) p. 8: the sentence regarding the power analysis is too long and many “and”.

(3) p. 15 and p. 20: there were only 9 SOAs in SJ and SB tasks.

(4) p. 19: “The binding for visual and auditory stimuli must be more flexible at the central VF, given the low temporal resolution” I don’t see the relations.

(5) p. 22, line 456: should be Figs 7b and 7c

(6) p. 24 “If SB judgment does not cause a change in the PSS due to the timing of sound in a previous trial” How come a participant’s judgment can cause something?

Reply: (1) I revised the description to clarify my intention (lines 58–59).

“Moreover, van der Burg, Alais, and Cass [17] showed temporal recalibration without an adaptation period, which they termed “rapid recalibration.””

(2) I revised the sentence as per your comment (lines 138–142).

“To determine the sample size needed for this study, PANGEA (https://jakewestfall.shinyapps.io/pangea/) was used to calculate the power (1-β) needed to detect a two-way interaction with the following parameters: effect size (d) = 0.45, variance of error = 0.333, variance of two-way interaction = 0.083, and the number of condition repetitions = 24. PANGEA indicated a power of 0.86 with a sample size of 16 participants.”

(3) I corrected the statement (lines 306 and 420).

(4) I revised the description to clarify my argument (lines 383–385).

“Thus, the range of audio-visual synchrony perception would be more tolerant in the central VF due to adapting to low temporal resolution.”

(5) I corrected the description (line 455).

(6) I revised statements to clarify my argument (lines 508–513).

“It is possible that participants performed a timing judgment of bounce in the central VF and a causal judgment of bounce in the peripheral VF. The occurrence of rapid recalibration due to SB perception has not yet been examined. If participants performed a causal judgment in the peripheral VF and the timing information on a previous trial did not cause a change in the PSS in causal judgment, this would be consistent with the results of this experiment.”

Attachment

Submitted filename: Reply3_PO3.docx

Decision Letter 3

Deborah Apthorp

25 Nov 2021

Visual field differences in temporal synchrony processing for audio-visual stimuli

PONE-D-20-40727R3

Dear Dr. Takeshima,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Deborah Apthorp, Ph.D

Academic Editor

PLOS ONE

Acceptance letter

Deborah Apthorp

6 Dec 2021

PONE-D-20-40727R3

Visual field differences in temporal synchrony processing for audio-visual stimuli

Dear Dr. Takeshima:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Deborah Apthorp

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    Attachment

    Submitted filename: Reply_PO3.docx

    Attachment

    Submitted filename: Reply2_PO3.docx

    Attachment

    Submitted filename: Reply3_PO3.docx

    Data Availability Statement

    All raw data and R script files are available from the OSF database (https://osf.io/2x73g/?view_only=1dd0968b3b5d4a65a90c4ed78b1dde19).


    Articles from PLoS ONE are provided here courtesy of PLOS
