Scientific Reports. 2025 Nov 12;15:39658. doi: 10.1038/s41598-025-23390-4

Dynamic reorganization of functional networks underlying audiovisual interactions

Irem Akdogan 1,2,5, Serap Aydin 3, Hulusi Kafaligonul 1,4,5
PMCID: PMC12612201  PMID: 41224890

Abstract

Crossmodal interactions involve crosstalk between different cortical areas and dynamic recruitment of regions, which is crucial for integrating sensory information into a coherent percept. Despite their significance, the dynamic cortical networks underlying the crossmodal influence of auditory information on visual motion processing—particularly in terms of temporally resolved EEG connectivity—have yet to be comprehensively characterized. In the present study, we investigated frequency-specific networks underlying audiovisual interactions during motion and speed estimation. Functional networks were generated using directed transfer function (DTF) and adaptive DTF (ADTF) to estimate connectivity patterns of electroencephalogram (EEG) data. Network-based statistical analyses revealed frequency-specific networks in the theta and alpha bands, which supported long-range communication between occipital/parieto-occipital, parietal, and frontal regions during audiovisual interactions compared to unisensory visual motion processing. Graph theory analyses demonstrated a transition from localized and segregated processing to global integration, emphasizing cortical network reorganization according to the demands of sensory processing. Moreover, these analyses further revealed frequency-specific shifts in connectivity over time, with low-frequency oscillations exhibiting sustained connectivity increases, while high-frequency bands showed transient patterns, reflecting the temporal flexibility of neural networks. These findings illustrate how local and global network modulations reflect the brain’s dynamic reorganization, balancing integration and segregation during crossmodal influences.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-23390-4.

Keywords: Brain oscillations, Functional networks, Multisensory, Temporal processing, Graph theory, EEG

Subject terms: Cognitive neuroscience, Sensory processing

Introduction

The human brain is adapted to guide behavior in natural environments that provide information to different sensory organs. Therefore, the integration of information from different sensory modalities is of central importance for a coherent perception of the external world. Notably, the integration and binding of sensory information in the spatial and temporal domain, as well as crossmodal influences where one modality modulates perception in another, have been considered essential elements of perception13. Numerous audiovisual paradigms have illustrated that such crossmodal influences can produce substantial changes in various aspects of perception. For instance, previous research has indicated that audiovisual interactions in the temporal domain can alter perception of different motion attributes, such as direction4, speed5,6, sensitivity to coherent motion7, discrimination and categorization of motion8,9. These findings highlight the multisensory nature of motion and provide important implications for perceptual dynamics in daily life situations10,11.

Utilizing a speed judgment paradigm, Kaya and Kafaligonul12 investigated the neural correlates of auditory-driven modulations in visual motion perception. In their study, the duration of the time interval demarcated by brief auditory clicks significantly altered the perceived speed of apparent motion, such that motion accompanied by a shorter auditory time interval is typically perceived as faster than the same motion accompanied by a longer one5,6. Consistent with the dominance of audition in the temporal domain, these audiovisual interactions have been attributed to “temporal ventriloquism”13,14. That is, static clicks are thought to alter the perceived time interval between motion frames (e.g., by capturing the timing of the visual motion frames), thereby changing the perceived speed of visual motion. Studying such perceptual illusions provides a window into how the brain binds asynchronous sensory signals. Using this approach, Kaya and Kafaligonul12 revealed modulations in both early and late ERP components across multiple scalp regions (parieto-occipital, parietal, and prefrontal areas), suggesting that auditory temporal information exerts its influence on visual motion processing at multiple stages of neural processing (see also Kaya et al.15). The timeline of neural modulations further pointed to a dynamic interplay between the identified cortical areas, emphasizing that interactions between different areas play a key role in audiovisual processing and final motion perception. Despite these advances, our understanding of how large-scale neural networks and oscillatory dynamics support such crossmodal interactions remains limited.

In the present study, we investigated functional networks associated with auditory-driven modulation of motion perception by using the EEG (electroencephalogram) dataset of Kaya and Kafaligonul12. While previous findings have provided valuable insights into the cortical dynamics involved in audiovisual processing, particularly from fMRI-based approaches, the temporally resolved reorganization of large-scale functional networks remains less understood, especially in the context of perceptual motion tasks that require integration of temporally asynchronous inputs. Recent work by Singh et al.16, for example, demonstrated that perceived audiovisual simultaneity is associated with increased local segregation and reduced global integration of brain networks, as measured by BOLD correlations. While such studies offer important spatial perspectives, they are limited in capturing the fast and frequency-specific interactions that EEG connectivity methods afford. Accordingly, we employed two leading multichannel estimators, directed transfer function17 (DTF) and adaptive DTF18 (ADTF), to reveal the flow of neural processing and the organization of brain networks across different frequency bands. Numerous studies have validated the application of DTF to sensor-level EEG data, highlighting its robustness to noise and insensitivity to volume conduction, as well as its capacity to provide reliable estimates of functional connectivity without the need for source reconstruction19–21. We further applied network-based statistics22 and graph theory23 to systematically examine key network features, such as clustering, efficiency, integration, and segregation, which are crucial for understanding the balance between local and global functional connectivity. This approach also allowed us to evaluate cortical self-organization and interaction among various cortical regions under different sensory stimulation profiles.

The rationale of the current study and our motivation to structure a data analysis pipeline based on oscillations, multichannel estimators, and graph theory were threefold. First, we aimed to understand frequency-resolved functional connectivity patterns underlying audiovisual interactions in the temporal domain. We specifically tested whether the brain exhibits different connectivity patterns for the processing of visual apparent motion in isolation and under the conditions in which audiovisual interactions became dominant. By doing so, we aimed to uncover distinct patterns and shared mechanisms within the neural pathways underlying crossmodal influences and multisensory integration. Second, we aimed to identify frequency-specific topological features of these networks associated with temporal dynamics of (auditory-driven changes in) motion perception using graph theory. While studies on audiovisual paradigms have started to reveal how frequency-specific network characteristics adapt dynamically for efficient crossmodal influence24,25, the graph-theoretical representation of functional connections underlying perceptual shifts driven by fine-grained auditory timing—specifically, how subtle modulations in auditory intervals alter the perceived speed of visual motion—has not been systematically explored. Lastly, we aimed to understand the limitations of multichannel estimators, graph theory, and time-varying connectivity analysis in transient, event-related EEG recordings. Although some prior studies have applied graph-theoretical and connectivity-based approaches to time-sensitive paradigms2629, and a few recent studies have specifically focused on audiovisual processing24,25,30, further investigation is warranted to determine how effectively these analyses capture event-related fluctuations in network organization of audiovisual integration, particularly at the sensor level. 
In this study, we specifically examined whether these approaches could resolve changes in functional connectivity linked to subtle differences in the auditory timing and its effects on perceived visual speed.

Materials and methods

Participants

We used the EEG dataset by Kaya and Kafaligonul12. A total of 21 participants took part in the study, and the final dataset included recordings from 18 healthy volunteers (right-handed, age range 19–32 years) who completed all steps of the study according to the instructions. All participants had normal or corrected-to-normal visual acuity and normal hearing, and none reported a history of neurological disorders. The entire protocol was carried out in accordance with international guidelines31 and approved by the local ethics committee at the Faculty of Medicine, Ankara University (Approval ID: 10.412.13). Written informed consent was obtained from each participant prior to enrollment in the study.

Design and procedure

The experimental design and procedures were covered by Kaya and Kafaligonul12 in detail. Here, we briefly highlight the key features of stimulation and outline the critical steps of the design. A two-frame apparent motion was used to generate visual motion. In each motion frame, a bar was flashed 3.4° above the fixation point for 50 ms (Fig. 1a). The spatial displacement (i.e., horizontal center-to-center distance) was 1.3°. The inter-stimulus interval (visual ISI) between the two motion frames (i.e., the flashed bars) was 100 ms. The auditory stimuli were a pair of static clicks presented binaurally through headphones. Each click was a rectangular-windowed 480 Hz sine-wave carrier (sampled at 44.1 kHz with 8-bit quantization) and had a duration of 20 ms. The sound pressure level (SPL) was 75 dB.

Fig. 1.


(a) Schematic representation of the experimental design and the timeline of each trial. Each trial consisted of two consecutive presentations of a two-frame apparent motion with a 900 ms delay. These consecutive visual apparent motions were precisely the same, while the time interval between the concurrent auditory clicks (ISIA) differed. Participants fixated centrally and judged which motion appeared faster. The auditory timings for low and high disparity levels are shown separately. Visual and auditory stimuli are represented by white bars and speaker icons, respectively. (b) Overview of the analysis pipeline. Top row: EEG signals were preprocessed, filtered into canonical frequency bands, and connectivity was computed using directed transfer function (DTF) and adaptive DTF. Bottom row: After thresholding and binarization, graph theory metrics (CC, LE, Q, GE) were extracted. Statistical comparisons were performed using network-based statistics (NBS) and graph-based analyses to evaluate condition-specific changes.

The design was based on comparing the speed of two consecutive apparent motions moving in the same direction (rightward or leftward). On each trial, the two-frame apparent motion was presented twice, and the temporal gap between the two presentations (i.e., from the offset of the first presentation to the onset of the second) was 900 ms (Fig. 1a). In the audiovisual (AV) conditions, a pair of clicks temporally centered on the presentation of each apparent motion was introduced. While each apparent motion was identical in terms of visual stimulation, the time interval defined by the pair of clicks (auditory ISI) was varied. For one of the apparent motion presentations, the time interval between static clicks was shorter than or equal to the visual time interval between the two motion frames (short condition). For the other, the auditory time interval was longer than the visual time interval (long condition). In addition, the disparity level between short and long auditory intervals was varied, leading to low and high disparity levels. Specifically, the low disparity level involved a moderate difference between auditory intervals (100 ms vs. 160 ms), while the high disparity level involved a large difference (40 ms vs. 220 ms). Figure 1a illustrates the auditory timing conditions across disparity levels (short vs. long, low vs. high disparity level). Hence, the design included four audiovisual conditions with different auditory time intervals (low disparity level: 100 ms vs. 160 ms auditory ISI; high disparity level: 40 ms vs. 220 ms auditory ISI). At the end of each trial, the participants indicated which apparent motion (first or second) moved faster via keypress. The design also included unimodal conditions (visual-only, four corresponding auditory-only conditions). In these conditions, the timeline of events was exactly the same, but only the apparent motions (visual-only) or only the pairs of clicks (auditory-only) were presented.
As in the audiovisual conditions, participants performed the speed comparison at the end of visual-only trials; during auditory-only trials, they only fixated and passively listened to the clicks. Each participant completed four experimental sessions, with pseudorandomly intermixed conditions within each session, resulting in a total of 48 presentations per condition.

Data acquisition and preprocessing

The EEG data were acquired using a 64-channel MR-compatible system (Brain Products, GmbH, Gilching, Germany). The caps (BrainCap MR, Brain Products, GmbH) included 63 scalp electrodes and an additional electrocardiogram (ECG) electrode attached to the back of participants to control for cardioballistic artifacts. Two of the scalp electrodes (FCz and AFz) served as the reference and ground, respectively. The electrode impedances were consistently monitored and typically below 10 kΩ throughout an experimental session. The EEG signals were sampled at a rate of 1 kHz and underwent online band-pass filtering between 0.016 and 250 Hz. The neural signals, stimulus markers, and behavioral responses were recorded via Brain Vision Recorder Software (Brain Products, GmbH).

The preprocessing of EEG signals was carried out offline using Brain Vision Analyzer 2.0 software (Brain Products, GmbH). First, the data were down-sampled to 500 Hz, and the ECG signal was used to remove cardioballistic artifacts32. The data were then segmented into epochs ranging from −400 ms (before the onset of each apparent motion) to 1000 ms (after the onset of each apparent motion). A semi-automatic inspection was performed to identify and mark epochs with artifacts. Specifically, epochs showing voltage changes of less than 0.5 μV or more than 200 μV within a 100 ms window, as well as oscillations over 50 μV/ms, were marked as bad and subsequently rejected after manual screening. Independent component analysis (ICA) using the Infomax algorithm was applied to remove common EEG artifacts such as eye blinks. Topographic spline interpolation was used to correct bad channels33. Following these standard preprocessing procedures, an average of 2.37% of trials per condition were rejected, and a total of 18 participants were retained for further analysis.

Frequency band isolation

The further processing steps are outlined in Fig. 1b and were carried out with established toolboxes and custom scripts written in MATLAB (The MathWorks, Natick, MA). In the speed judgment paradigm employed by Kaya and Kafaligonul12, the participants performed a speed comparison on visual motion (primary task-relevant stimulus) and passively listened to concurrent clicks (secondary task-irrelevant stimulus). The design therefore emphasizes how auditory signals from the secondary modality interact with, and interfere in, the processing of visual motion. We used violations of the additive model to evaluate these audiovisual interactions34 and identified nonlinear modulations by comparing (AV-A) waveforms with the activities elicited by visual apparent motion [(AV-A) vs. V]. This approach is equivalent to comparing the audiovisual waveforms with the synthetic summation of unimodal signals [AV vs. (A + V)]. Accordingly, the neural responses to the auditory clicks (A) were subtracted from the corresponding AV conditions, yielding four distinct synthetic difference waveforms (AV-A) for each auditory time interval condition (i.e., AV-A40, AV-A100, AV-A160, and AV-A220). These bimodal differences (AV-A) were further used to reveal modulations varying with auditory timing. To study the effect of auditory timing on network-based measures, we specifically compared conditions within each disparity level: AV-A100 (short) vs. AV-A160 (long) for the low disparity level, and AV-A40 (short) vs. AV-A220 (long) for the high disparity level, mirroring the speed comparisons that observers performed within a trial.
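The logic of the additive-model comparison can be sketched numerically: subtracting the auditory-only (A) response from the audiovisual (AV) response and contrasting the result with the visual-only (V) response is algebraically identical to contrasting AV with the synthetic sum (A + V). A minimal numpy sketch with synthetic waveforms (the array shapes are illustrative assumptions, not the actual dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 63, 700  # illustrative: -400..1000 ms at 500 Hz

# Synthetic grand-average waveforms (illustrative only)
V = rng.standard_normal((n_channels, n_samples))   # visual-only
A = rng.standard_normal((n_channels, n_samples))   # auditory-only
AV = V + A + 0.1 * rng.standard_normal((n_channels, n_samples))  # audiovisual

# Bimodal difference waveform used for the nonlinear-interaction test
diff = AV - A

# (AV - A) vs. V is the same contrast as AV vs. (A + V)
contrast_1 = diff - V
contrast_2 = AV - (A + V)
assert np.allclose(contrast_1, contrast_2)
```

The residual (nonzero only through the 0.1-scaled term here) plays the role of the nonlinear interaction that the network analyses then characterize.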

Parameters for further analysis of waveforms were set according to the specifications provided by Aydın35 and Aydın et al.36. The (AV-A) waveform for each auditory time interval condition, along with the visual-only (V) data, was filtered using a 35th-order Infinite-Impulse-Response (IIR) filter and a 50 Hz notch filter to remove line noise. To isolate specific EEG frequency bands, five separate Finite-Impulse-Response (FIR) filters were applied, targeting the delta (1–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz), and gamma (30–60 Hz) bands. The filter parameters featured a 0 Hz transition band at both cutoff frequency edges, a pass-band ripple of 0.0575, a stop-band ripple of 0.001, and a density factor of 20. The optimal filter order for each frequency band was automatically determined using the Parks-McClellan equiripple FIR order estimator via MATLAB’s firpmord function in the Signal Processing Toolbox37.
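For illustration, band isolation of this kind can be sketched with a windowed-sinc FIR design. This is a simplified stand-in for the Parks-McClellan equiripple design (MATLAB firpmord/firpm) used in the actual pipeline, and the tap count is an arbitrary choice:

```python
import numpy as np

def bandpass_fir(low_hz, high_hz, fs, numtaps=1001):
    """Windowed-sinc band-pass FIR taps (Hamming window).

    Illustrative alternative to the equiripple design used in the
    original pipeline; numtaps is a free (assumed) parameter."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    def ideal_lowpass(fc):
        # Impulse response of an ideal low-pass filter with cutoff fc
        return 2 * fc / fs * np.sinc(2 * fc / fs * n)
    return (ideal_lowpass(high_hz) - ideal_lowpass(low_hz)) * np.hamming(numtaps)

fs = 500  # Hz, sampling rate after down-sampling
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 60)}
filters = {name: bandpass_fir(lo, hi, fs) for name, (lo, hi) in bands.items()}

# Apply by convolution to one channel of synthetic data
x = np.random.default_rng(1).standard_normal(2000)
theta_component = np.convolve(x, filters["theta"], mode="same")
```

Each of the five filters would be applied to every channel of the (AV-A) and V waveforms before connectivity estimation.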

Measures of whole-brain functional networks

For each participant, we obtained five distinct frequency bands (delta, theta, alpha, beta, gamma) for the visual-only (V) and four audiovisual difference waveforms (AV-A40, AV-A100, AV-A160, and AV-A220), resulting in a total of 25 subsets of data (5 experimental conditions × 5 frequency bands). Connectivity matrices for each subset were constructed using all electrodes based on estimates from the DTF17 and adaptive DTF (ADTF)18. We based DTF parameters on previous studies35,36,38. Similarly, for the ADTF analysis, we followed the methodology described in Xi et al.24 and Xi et al.25. These parameters were first optimized and validated using a separate dataset of audiovisual interactions (Kaya & Kafaligonul39) before being applied to the current dataset.

Directed transfer function

The directed transfer function (DTF) was introduced by Kaminski and Blinowska17 to characterize information flow between neural populations. It is widely used to construct frequency-specific networks and to capture neural information transfer while minimizing the impact of spurious interactions caused by volume conduction19. Recent developments have reinforced the validity of applying multivariate autoregressive (MVAR) models at the sensor level, enabling reliable estimation of directional and causal interactions directly from multichannel EEG recordings40. Techniques derived from MVAR modeling, such as DTF, offer a reliable alternative to bivariate autoregressive methods by effectively characterizing spectral relationships in complex neural systems41. It has been extensively applied to sensor-level EEG data, producing stable estimates of functional connectivity without requiring source reconstruction19,20. In line with established methodological guidance, we did not apply preprocessing steps that could distort the correlation structure of the multichannel data (e.g., common average referencing or surface Laplacian). Prior work has emphasized that “DTF is insensitive to volume conduction and very robust in respect to noise”19 and that “projections on the cortex surface are not needed and may disturb the phase relations between channels”20. While some recent studies have raised theoretical concerns about volume conduction effects, subsequent analyses concluded that “in practice this influence does not distort substantially the estimates”21. Following this literature, we applied preprocessing steps consistent with recommended practices for applying DTF to sensor-level EEG data, and source reconstruction was not included in our analysis pipeline. DTF was calculated based on the following formula42:

$$\gamma_{ij}^{2}(f) \;=\; \frac{\left|H_{ij}(f)\right|^{2}}{\sum_{m=1}^{k}\left|H_{im}(f)\right|^{2}} \tag{1}$$

This formula describes the influence of channel j on channel i at a particular frequency f, where k is the number of channels and H is the transfer matrix of the system obtained from the MVAR model. DTF estimates a single MVAR model from all time points within the epochs42,43. Therefore, while DTF can be computed for any frequency range of interest, it cannot provide time-varying connectivity estimates. The values of this equation lie in the range [0, 1], where a value closer to 0 indicates weak or no influence and a value closer to 1 indicates a strong influence between the channels. We calculated 63 × 63 DTF connectivity matrices for each frequency band of an experimental condition using the electrophysiological Connectome (e-Connectome-2-full) software toolbox44. The integrated ARfit package45 was used to estimate the multivariate autoregressive (MVAR) models required for the DTF calculation.
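To make Eq. (1) concrete, the following sketch computes DTF from a set of MVAR coefficient matrices: the transfer matrix is H(f) = [I − Σr Ar e^(−i2πfr/fs)]⁻¹ and each row of |H|² is normalized. The toy coefficients are hypothetical, not fitted from the EEG data (in the actual analysis the model was estimated with ARfit):

```python
import numpy as np

def dtf(A, f, fs):
    """Directed transfer function at frequency f (Hz), Eq. (1).

    A: (p, k, k) array of MVAR coefficient matrices A_1..A_p
       (hypothetical here; normally fitted, e.g., via ARfit).
    Returns a (k, k) matrix whose entry (i, j) is the normalized
    influence of channel j on channel i; each row sums to 1."""
    p, k, _ = A.shape
    z = np.exp(-2j * np.pi * f * np.arange(1, p + 1) / fs)
    Af = np.eye(k, dtype=complex) - np.tensordot(z, A, axes=(0, 0))
    H = np.linalg.inv(Af)                      # transfer matrix H(f)
    P = np.abs(H) ** 2
    return P / P.sum(axis=1, keepdims=True)    # gamma^2_ij

# Toy 3-channel MVAR(2) model with a lagged 0 -> 1 influence
A = np.zeros((2, 3, 3))
A[0] = 0.5 * np.eye(3)
A[0, 1, 0] = 0.4       # channel 0 drives channel 1
G = dtf(A, f=10.0, fs=500.0)
```

In this toy model the estimated influence of channel 0 on channel 1 (G[1, 0]) is positive, while the reverse direction (G[0, 1]) is zero, illustrating the directionality of the measure.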

Adaptive directed transfer function

Similar to DTF, ADTF is a frequency-domain estimator of interactions between channels based on multivariate autoregressive (MVAR) modeling. However, its adaptive nature enables it to capture time-varying connectivity relationships between cortical areas. We included ADTF in our analysis to systematically investigate network dynamics over time and to compare two widely used connectivity estimates of DTF and ADTF on the same dataset. Similar to DTF, ADTF has been shown to be robust to volume conduction; therefore, we applied the same preprocessing strategy as described above. The time-varying coefficient matrices of the adaptive MVAR model were calculated by using the Kalman filter algorithm46, resulting in a time-varying transfer matrix H(f,t). Accordingly, ADTF was calculated as a function of both frequency and time based on the following formula44:

$$\gamma_{ij}^{2}(f,t) \;=\; \frac{\left|H_{ij}(f,t)\right|^{2}}{\sum_{m=1}^{k}\left|H_{im}(f,t)\right|^{2}} \tag{2}$$

This formula describes the influence of channel j on channel i at a particular frequency f and time t. Here, k represents the number of channels in the network. High values of ADTF indicate a strong influence of channel j on channel i, implying that high information flow occurs between these channels at the specified frequency and time. Conversely, low ADTF values suggest weak or no influence between the channels [see also Wilke et al.18 and He et al.44]. In our analysis, we calculated 63 × 63 ADTF connectivity matrices for each frequency band sampled at 2 ms time steps (i.e., 500 Hz) using the electrophysiological Connectome (e-Connectome-2-full) software toolbox44. This approach allowed us to capture the dynamic changes in connectivity patterns over time, providing a more detailed understanding of the temporal evolution of brain network interactions during an experimental condition.
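Equation (2) can be sketched the same way given a sequence of coefficient matrices. The coefficients below are hypothetical and simply switch a coupling on halfway through the epoch; in the actual analysis the time-varying coefficients were estimated with a Kalman-filter adaptive MVAR model:

```python
import numpy as np

def adtf(At, f, fs):
    """Adaptive DTF at frequency f for time-varying MVAR coefficients, Eq. (2).

    At: (T, p, k, k) coefficient matrices A_1..A_p at each time step
        (hypothetical here; normally estimated via a Kalman-filter
        adaptive MVAR, as in the paper).
    Returns (T, k, k); entry (t, i, j) is the influence of channel j
    on channel i at time t, each row normalized to sum to 1."""
    T, p, k, _ = At.shape
    z = np.exp(-2j * np.pi * f * np.arange(1, p + 1) / fs)
    out = np.empty((T, k, k))
    for t in range(T):
        Af = np.eye(k, dtype=complex) - np.tensordot(z, At[t], axes=(0, 0))
        P = np.abs(np.linalg.inv(Af)) ** 2         # |H(f, t)|^2
        out[t] = P / P.sum(axis=1, keepdims=True)
    return out

# Toy example: a 0 -> 1 coupling that switches on at t = 50
T, k = 100, 3
At = np.zeros((T, 1, k, k))
At[:, 0] = 0.5 * np.eye(k)
At[50:, 0, 1, 0] = 0.4
G = adtf(At, f=10.0, fs=500.0)
```

The estimated influence G[t, 1, 0] is zero before the coupling appears and positive afterwards, which is the kind of transient change the time-varying NBS analysis below is designed to detect.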

Graph theoretical analysis

Functional connections between brain regions can be analyzed by modeling them as complex networks through graph theory, which allows for the identification and understanding of subnetworks that may differ across experimental conditions. As mentioned above, we first used DTF and ADTF to generate connectivity strength values for each channel pair within the constructed matrices for each frequency band. These connectivity matrices were then converted into undirected binary matrices (i.e., adjacency matrices) by applying thresholding based on network sparsity (Fig. 1b). The sparsity threshold was defined as the ratio of existing edges to the maximum possible number of edges in the network, preserving a proportion (0 < t < 1) of the strongest weights from the connectivity matrix47. For subsequent analyses, sparsity thresholds ranged from 0.1 to 0.5 in intervals of 0.148. In the adjacency matrices, rows and columns correspond to the nodes in the network, representing the EEG electrodes, while the entries indicate the links or edges filled with cortical dependency estimates. Various characteristic network parameters were then calculated using these adjacency matrices to investigate the topological performance of neural interactions in unisensory and multisensory conditions. These parameters included global efficiency, local efficiency, clustering coefficient, and modularity, encompassing both segregation and integration characteristics of the brain. Although segregation and integration are opposing demands in networks, they both serve as major organizational principles and are essential for the dynamic functional organization of brain networks. Segregation allows for specialized processing within local clusters of neurons, enabling efficient and focused responses to specific tasks.
Integration, on the other hand, facilitates the coordination and communication between these specialized regions, ensuring that the brain can generate coherent and adaptive responses to complex stimuli. Together, these principles underpin the brain’s ability to balance localized, specialized functions with broader, coordinated activities, which is crucial for maintaining optimal cognitive and perceptual performance49. The modeling and extraction of topological features using graph theory were performed using the Brain Connectivity Toolbox, as introduced by Rubinov and Sporns23. The following subsections provide a brief overview of the specific network measures used in this study.
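As a sketch of the thresholding step, the following keeps the strongest proportion of weights at a given sparsity level and returns an undirected binary adjacency matrix (symmetrizing by the maximum of the two directions is an illustrative assumption; other conventions exist):

```python
import numpy as np

def binarize_by_sparsity(C, sparsity):
    """Keep the strongest `sparsity` proportion of off-diagonal weights
    and return an undirected binary adjacency matrix.

    C: (k, k) connectivity matrix (e.g., DTF estimates).
    Directionality is discarded by symmetrizing with the maximum."""
    k = C.shape[0]
    W = np.maximum(C, C.T)            # undirected strength
    np.fill_diagonal(W, 0)
    iu = np.triu_indices(k, 1)
    n_keep = int(round(sparsity * len(iu[0])))
    # Cutoff = n_keep-th largest upper-triangular weight
    cutoff = np.sort(W[iu])[::-1][n_keep - 1] if n_keep > 0 else np.inf
    A = (W >= cutoff).astype(int)
    np.fill_diagonal(A, 0)
    return A

rng = np.random.default_rng(2)
C = rng.random((63, 63))              # stand-in for a DTF matrix
A = binarize_by_sparsity(C, 0.1)      # sparsity threshold t = 0.1
```

In the actual pipeline this would be repeated for sparsity levels 0.1 to 0.5 in steps of 0.1 before the graph metrics are extracted.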

Network segregation

Network segregation refers to the formation of densely interconnected clusters or neighbors, in terms of topological distance, that are specialized in certain processing tasks. These local communities are segregated from each other, with most edges linking nodes within the same modules and only a few connections between different modules49. The clustering coefficient (CC) and local efficiency (LE) are essential measures of network segregation, depending on the number of triangles present in the network. The clustering coefficient quantifies the level of local segregation by measuring the density of connections between a node and its neighbors. It is determined by calculating the proportion of triangles around a node that are formed by connected neighbors. A node surrounded by a highly interconnected cluster thus contributes to the operational functionality of local clusters within the graph35. Local efficiency measures the effectiveness of information transfer among the remaining nodes after one node is removed from its local cluster, thereby reflecting the network’s robustness and fault tolerance at a local level50. High local efficiency indicates that the network can maintain efficient communication even if some nodes are disrupted, underscoring the resilience of the brain’s local networks. CC and LE are mathematically calculated based on the following formulas51,52:

$$C_i \;=\; \frac{2\,t_i}{k_i\,(k_i - 1)} \tag{3}$$

where Ci is the clustering coefficient of node i (Ci = 0 for ki < 2), ki is the degree of node i, and ti is the number of triangles around node i. A high CC indicates that nodes have a high tendency to form closely connected groups, reflecting local interconnectedness within the network. The CC ranges between 0 and 1, where 0 signifies no clustering (a node’s neighbors share no connections with each other), and 1 indicates maximum clustering (each node’s neighbors are fully interconnected).
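A direct implementation of the per-node clustering coefficient uses the fact that the diagonal of A³ counts twice the number of triangles through each node (a sketch, not the Brain Connectivity Toolbox code):

```python
import numpy as np

def clustering_coefficient(A):
    """Per-node clustering coefficient of a binary undirected graph.

    C_i = 2 t_i / (k_i (k_i - 1)), with C_i = 0 when k_i < 2 (Eq. 3)."""
    k = A.sum(axis=1)
    t = np.diag(A @ A @ A) / 2.0     # triangles through each node
    denom = k * (k - 1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom > 0, 2.0 * t / denom, 0.0)

# Toy graph: a triangle (0, 1, 2) plus a pendant node 3 attached to node 0
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
C = clustering_coefficient(A)
```

Nodes 1 and 2 sit inside a fully connected neighborhood (C = 1), node 0 has one triangle among three possible neighbor pairs (C = 1/3), and the pendant node 3 has degree 1, so C = 0.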

$$E_{loc,i} \;=\; \frac{1}{k_i\,(k_i - 1)} \sum_{j,h \in N_i,\; j \neq h} a_{ij}\, a_{ih} \left[ d_{jh}(N_i) \right]^{-1} \tag{4}$$

where Eloc,i is the local efficiency of node i, and djh(Ni) is the length of the shortest path between j and h that passes only through neighbors of i. aij is the connection status between i and j: aij = 1 if a link exists between i and j (i.e., they are neighbors), and aij = 0 otherwise. A high LE indicates that nodes have efficient communication pathways with their direct neighbors, even if one node is removed. LE ranges between 0 and 1, where 0 signifies no local efficiency (poor communication within local clusters) and 1 indicates maximum local efficiency (optimal communication within local clusters).
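Local efficiency can be sketched by restricting shortest paths to each node's neighborhood, here with Floyd-Warshall distances on the neighbor subgraph (illustrative, not the toolbox implementation):

```python
import numpy as np

def shortest_paths(A):
    """All-pairs shortest path lengths of a binary graph (Floyd-Warshall)."""
    n = A.shape[0]
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for m in range(n):
        D = np.minimum(D, D[:, [m]] + D[[m], :])
    return D

def local_efficiency(A):
    """Per-node local efficiency, Eq. (4): efficiency of the subgraph
    of each node's neighbors, with paths restricted to those neighbors."""
    n = A.shape[0]
    E = np.zeros(n)
    for i in range(n):
        nbrs = np.flatnonzero(A[i])
        k = len(nbrs)
        if k < 2:
            continue                      # E_loc is 0 for degree < 2
        D = shortest_paths(A[np.ix_(nbrs, nbrs)])
        inv = 1.0 / D[~np.eye(k, dtype=bool)]   # disconnected pairs -> 0
        E[i] = inv.sum() / (k * (k - 1))
    return E

# Complete graph on 4 nodes: every neighborhood is fully connected
A = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
E = local_efficiency(A)
```

For the complete graph every node's local efficiency is 1; for a star graph, by contrast, the hub's neighbors share no edges and all local efficiencies are 0.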

Network modularity

The modular structure, or modularity, of a network is a sophisticated measure of segregation that reflects the local connectivity structure of the graph23,49. In complex networks, modules represent the balance between the density of connections within a module and those between different modules53,54. When modules display high levels of internal clustering and fewer interconnections with other modules, this indicates a high level of modularity, where each module specializes in specific functional tasks. The modularity measure, Q, quantifies the network’s tendency to form these segregated clusters. It is calculated using the following formula54:

$$Q \;=\; \frac{1}{2l} \sum_{i,j} \left[ a_{ij} - \frac{k_i\,k_j}{2l} \right] \delta(m_i, m_j) \tag{5}$$

In the formula, l denotes the number of edges in the network. The term mi represents the module to which node-i belongs. If node-i and node-j are in the same module (mi = mj), then δ(mi, mj) equals 1; otherwise, it equals 0. Modularity increases when nodes within the same module have denser connections (more functional edges) with each other compared to sparser connections between modules. This measure is closely related to the functional organization of brain regions, providing flexibility and adaptability under changing conditions50. Clustering and modularity both capture similar aspects of the connectivity structure since highly modular networks often consist of densely clustered communities. However, high clustering alone does not necessarily indicate the presence of modules or communities49. Thus, each measure of local connectivity provides unique information about the locally embedded nodes and the community structure.
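Equation (5) can be evaluated directly for a given module assignment. A sketch on a toy graph of two disconnected triangles, where the correct two-module partition yields Q = 0.5:

```python
import numpy as np

def modularity(A, modules):
    """Newman modularity Q of a binary undirected graph, Eq. (5).

    modules: array of module labels, one per node."""
    k = A.sum(axis=1)
    l = A.sum() / 2.0                               # number of edges
    same = modules[:, None] == modules[None, :]     # delta(m_i, m_j)
    return float(((A - np.outer(k, k) / (2 * l)) * same).sum() / (2 * l))

# Two disconnected triangles, labeled as two modules
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1
Q = modularity(A, np.array([0, 0, 0, 1, 1, 1]))
```

Placing all six nodes in a single module instead yields Q = 0, illustrating that Q rewards partitions whose within-module density exceeds the degree-matched expectation.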

Network integration

While clustering, local efficiency, and modularity assess local connectivity and the segregation of nodes into communities, other measures capture the network’s capacity for engaging in global interactions and network-wide integration49. Integration refers to the ability of distributed cortical regions to effectively combine and share information, relying on path lengths and distances between nodes23. In functional networks, the paths connecting pairs of brain regions represent the routes of information flow. These paths are calculated by the number of distinct edges in binary graphs, providing insights into the efficiency of global integration. A short path length indicates that each node can, on average, be reached from any other node via a path with only a few edges, suggesting higher potential for communication and stronger integration among functionally dependent regions. In the current study, global efficiency (GE) is used as a measure of integration in brain networks. GE was calculated using the average inverse distance matrix using the following formula51:

$$GE \;=\; \frac{1}{n(n-1)} \sum_{i \neq j} \frac{1}{d_{ij}} \tag{6}$$

In the formula, dij represents the distance between node-i and node-j. The highest possible GE value is found in a network where every node is directly connected to every other node by a single edge, resulting in all pairwise distances (i.e., all dij) being equal to one. Conversely, the lowest global efficiency is observed in completely disconnected networks, where the distances between nodes are infinite. Networks with shorter path lengths have higher global efficiency, meaning that, on average, nodes can communicate with each other through a minimal number of steps, enhancing the efficiency of information transfer.
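A compact sketch of Eq. (6) computes all-pairs shortest path lengths with Floyd-Warshall and averages their inverses (disconnected pairs contribute zero):

```python
import numpy as np

def global_efficiency(A):
    """Global efficiency of a binary undirected graph, Eq. (6):
    mean inverse shortest path length over all node pairs."""
    n = A.shape[0]
    D = np.where(A > 0, 1.0, np.inf)       # Floyd-Warshall distances
    np.fill_diagonal(D, 0.0)
    for m in range(n):
        D = np.minimum(D, D[:, [m]] + D[[m], :])
    inv = 1.0 / D[~np.eye(n, dtype=bool)]  # infinite distances give 0
    return inv.sum() / (n * (n - 1))

# A 4-node path graph 0-1-2-3
A = np.zeros((4, 4), dtype=int)
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1
ge = global_efficiency(A)
```

For the path graph, GE = (6·1 + 4·1/2 + 2·1/3) / 12 = 13/18, below the maximum of 1 attained by a fully connected graph.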

Network-based statistics of connectivity matrices

To determine significant differences in connectivity patterns and sub-networks between unisensory and multisensory conditions, as well as among various auditory timing conditions, we analyzed the (thresholded and non-binarized) whole-brain connectivity matrices using Network-Based Statistics (NBS)20. NBS is a nonparametric method for mass univariate significance testing of each network connection, designed to control for the multiple comparison problem (MCP). Compared to other statistical procedures that correct p-values for each connection individually, NBS offers greater statistical power in managing the family-wise error (FWE) rate in complex networks. This method has been previously employed to explore the neurological basis of various psychotic disorders48,55,56. Accordingly, we utilized NBS to examine different connected components of brain functional networks under specific unisensory and multisensory conditions in healthy participants.

In the NBS analysis, we compared unisensory (V) with multisensory (AV-A) conditions, as well as different auditory timing conditions (high vs. low disparity). First, we focused on the effects elicited by audiovisual interactions at the network level, regardless of auditory timing. Thus, contrasts were defined as “unisensory > multisensory” and “unisensory < multisensory,” combining all audiovisual conditions (AV-A) across different auditory timing conditions. We used DTF in this stage of our analysis. Then, we examined the time-varying changes in network dynamics through the NBS analysis of ADTF. Using the ADTF measure, we further tested the contrasts between short and long auditory time intervals within each disparity level (i.e., “short < long” and “short > long”), in addition to the comparisons between unisensory and multisensory conditions. Additionally, we assessed the functional connections that either increased or decreased over time for each condition. The FWE rate in the NBS analysis was controlled at the cluster level, with an alpha (α) threshold value of 0.01 applied to the p-values associated with each connection. We performed 5000 permutations, and the primary test-statistic threshold was set to T = 3.0. Significant subnetworks were visualized using BrainNet Viewer57.
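The logic of the NBS procedure — edge-wise statistics, a primary threshold on the test statistic, extraction of connected components, and a permutation null of maximal component sizes — can be sketched as follows. This is a schematic, paired-design illustration using sign-flipping permutations, not the actual NBS toolbox implementation; all names, data shapes, and parameters are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.sparse.csgraph import connected_components

def nbs_paired(cond_a, cond_b, t_thresh=3.0, n_perm=1000, seed=0):
    """Schematic NBS-style test for paired connectivity matrices.

    cond_a, cond_b : (n_subj, n_node, n_node) connectivity arrays.
    Returns the largest observed supra-threshold component (in edges)
    and its FWE-corrected p-value from the permutation null.
    """
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b                       # per-subject paired differences

    def max_component(d):
        t = stats.ttest_1samp(d, 0.0, axis=0).statistic
        supra = (np.abs(t) > t_thresh).astype(int)   # primary threshold
        np.fill_diagonal(supra, 0)
        n_comp, labels = connected_components(supra, directed=False)
        # component "size" = number of supra-threshold edges it contains
        edge_counts = [supra[np.ix_(labels == c, labels == c)].sum() // 2
                       for c in range(n_comp)]
        return max(edge_counts)

    observed = max_component(diff)
    null = np.empty(n_perm)
    for i in range(n_perm):                      # sign-flip condition labels
        signs = rng.choice([-1.0, 1.0], size=diff.shape[0])
        null[i] = max_component(diff * signs[:, None, None])
    p_fwe = (null >= observed).mean()            # cluster-level FWE p-value
    return observed, p_fwe

# Synthetic demo: a genuine effect among the first 4 of 8 nodes
rng = np.random.default_rng(1)
a = rng.normal(size=(12, 8, 8)); a = (a + a.transpose(0, 2, 1)) / 2
b = rng.normal(size=(12, 8, 8)); b = (b + b.transpose(0, 2, 1)) / 2
a[:, :4, :4] += 2.0
obs, p_fwe = nbs_paired(a, b, t_thresh=3.0, n_perm=200, seed=0)
```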

Statistical analysis of network indices

Further statistical tests were carried out to understand how network indices change under different conditions. We performed two-way repeated-measures ANOVA with factors of frequency band and stimulation (V, AV-A40, AV-A100, AV-A160, and AV-A220) on each graph theory index derived from DTF. Follow-up paired comparisons were performed to compare network properties of audiovisual conditions and to examine the effects of audiovisual interactions in time.

We applied a similar approach to the graph theory indices derived from ADTF. We incorporated cluster-based permutation tests, encompassing all time points from −100 to 800 ms. Considering the temporal dimension of the ADTF approach, paired comparisons on GT indices also included a correction in the time domain to control the Type I error rate. This approach tests all neighboring and continuous time points of each GT index (i.e., the network metrics CC, LE, Q, and GE) that differ significantly between specific conditions, while addressing the problem of multiple statistical comparisons and objectively preserving temporal neighbor relationships58. In the initial step, the GT metrics from all participants were combined irrespective of the condition and then randomly split into two groups. This shuffling procedure was used to create surrogate data and was repeated 1000 times to generate a null distribution. Under the null hypothesis, which assumes no difference between specific conditions, the difference in averaged GT values between two particular conditions was computed for both the real data and the corresponding shuffled surrogates. The difference in GT values between specific conditions was transformed into a z-score by subtracting the mean of the null distribution from the actual GT difference and then dividing by the standard deviation of the null distribution59. This z-score was converted to a p-value using a cumulative distribution function, yielding a time series of p-values. To account for multiple comparisons, cluster-based statistics were used to control false alarms at the map-level threshold, employing the supra-threshold cluster size test with an α threshold value of 0.01 applied to the p-values associated with each time point60. After thresholding all statistical maps generated under the null hypothesis, we obtained a distribution of the largest supra-threshold clusters expected under the null hypothesis.
From this distribution of the largest clusters expected under the null hypothesis, clusters smaller than the 99th percentile (two-tailed) were removed. Finally, all neighboring and continuous time points of a significant cluster were highlighted on the GT plots, indicating continuous significance at two-tailed p < 0.01. Since this test can only compare two conditions at a time, we performed the comparisons as outlined above.
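The cluster-based correction described above — z-scoring the observed GT difference against a surrogate null, converting z-scores to p-values via a cumulative distribution function, and retaining supra-threshold clusters that exceed the 99th percentile of the null maximal cluster size — can be sketched as follows. This is a simplified illustration with hypothetical inputs, not the exact analysis code.

```python
import numpy as np
from scipy.stats import norm

def supra_clusters(pvals, alpha=0.01):
    """(start, end) index pairs of runs of consecutive points with p < alpha."""
    sig = pvals < alpha
    edges = np.flatnonzero(np.diff(np.r_[0, sig.astype(int), 0]))
    return list(zip(edges[::2], edges[1::2]))        # end index is exclusive

def cluster_test(diff_obs, diff_null, alpha=0.01):
    """diff_obs : (n_time,) observed GT difference between two conditions.
       diff_null: (n_perm, n_time) differences from shuffled surrogates.
       Returns clusters surviving the max-cluster-size null distribution."""
    mu, sd = diff_null.mean(axis=0), diff_null.std(axis=0)
    z = (diff_obs - mu) / sd                         # z-score vs. the null
    p = 2 * (1 - norm.cdf(np.abs(z)))                # two-tailed p per time point
    obs_clusters = supra_clusters(p, alpha)
    # Null distribution of the largest supra-threshold cluster size
    null_max = np.zeros(diff_null.shape[0])
    for i, surrogate in enumerate(diff_null):
        zs = (surrogate - mu) / sd
        ps = 2 * (1 - norm.cdf(np.abs(zs)))
        sizes = [e - s for s, e in supra_clusters(ps, alpha)]
        null_max[i] = max(sizes, default=0)
    thresh = np.quantile(null_max, 0.99)             # 99th-percentile cut-off
    return [(s, e) for s, e in obs_clusters if (e - s) > thresh]

# Synthetic demo: a sustained effect between samples 30 and 60
rng = np.random.default_rng(0)
diff_null = rng.normal(size=(500, 100))
diff_obs = rng.normal(size=100)
diff_obs[30:60] += 6.0
kept = cluster_test(diff_obs, diff_null, alpha=0.01)
```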

To explore potential links between large-scale brain network dynamics and perceptual performance, we performed exploratory correlation analyses between graph theory metrics and behavioral data. For each participant, behavioral performance was quantified as the percentage of trials in which the short auditory interval condition was perceived as faster than the long auditory interval condition. Corresponding graph metrics (e.g., global efficiency, modularity) were computed as the difference between short and long auditory intervals, separately for each disparity level, using values derived from significant time clusters identified in the graph theory analysis. Simple linear regression models were then applied to assess the association between behavioral scores and graph differences, using subject-level data. Both slope and intercept coefficients were estimated, and the results were interpreted cautiously due to the modest sample size and the absence of statistically significant correlations (p > 0.05 for all tests).
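The exploratory regression step can be illustrated with a minimal sketch on synthetic data. All values and variable names below are hypothetical; the actual analysis used per-participant behavioral scores and GT differences extracted from significant time clusters.

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical per-participant values (n = 18), for illustration only:
# behavior  = % of trials in which the short-interval motion was judged faster
# gt_diff   = difference in a graph metric, e.g. GE(short) - GE(long)
rng = np.random.default_rng(42)
behavior = 50 + 20 * rng.random(18)          # percentages between 50 and 70
gt_diff = rng.normal(0.0, 0.02, size=18)     # small graph-metric differences

# Simple linear regression: slope, intercept, r, p-value, standard error
fit = linregress(gt_diff, behavior)
print(f"slope={fit.slope:.2f}, intercept={fit.intercept:.2f}, p={fit.pvalue:.3f}")
```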

Results

Summary of behavioral findings

We replotted the behavioral data to summarize the findings of Kaya and Kafaligonul12. To quantify the effect of auditory time intervals on perceived speed, we calculated the percentage of trials in which the apparent motion with a short auditory time interval was perceived as faster than the one with a long auditory interval. Figure 2 displays the distribution of these percentage values under low and high disparity conditions. For both conditions, the participants perceived the apparent motion with a short auditory time interval as faster in over 60% of the trials. As reported by Kaya and Kafaligonul12, the percentage values significantly exceeded the 50% chance level (Bonferroni-corrected two-tailed t-tests, all p < 0.001). Along with other studies (e.g., Duyar et al.61; Yilmaz & Kafaligonul62), these results highlight significant audiovisual interactions in the temporal domain. More importantly, the percentage value for the high disparity condition was significantly higher than that for the low disparity condition (t(17) = 7.95, p < 0.001, Cohen’s d = 1.87), indicating a significant effect of temporal disparity on the observed audiovisual interactions. The findings overall demonstrate robust effects of auditory timing on motion and speed estimation.
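The chance-level and disparity comparisons summarized above can be sketched as follows, using synthetic percentages for illustration only (the reported statistics come from the actual data of Kaya and Kafaligonul12).

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Hypothetical per-participant percentages (n = 18), for illustration only
rng = np.random.default_rng(7)
low = 60 + 5 * rng.standard_normal(18)    # % "short judged faster", low disparity
high = 70 + 5 * rng.standard_normal(18)   # % "short judged faster", high disparity

# One-sample tests against the 50% chance level (Bonferroni over 2 tests)
p_low = ttest_1samp(low, 50).pvalue * 2
p_high = ttest_1samp(high, 50).pvalue * 2

# Paired comparison of disparity levels, with Cohen's d for paired samples
t_pair = ttest_rel(high, low)
d = (high - low).mean() / (high - low).std(ddof=1)
```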

Fig. 2.

Fig. 2

Behavioral results (n = 18). The percentage of trials in which the apparent motion with a short auditory interval was perceived as faster than the one with a long interval is shown for low and high temporal disparity levels. Individual participant data are represented by connected points, while group-averaged data are shown as boxplots.

Directed transfer function and derived graph theory indices

We first identified the nonlinear interactions between the neural representations of visual apparent motion and auditory clicks by comparing the activity of visual apparent motion (V) with the difference waveforms (AV-A). That is to say, in the initial stage of our analyses, we aimed to determine changes in functional networks associated with audiovisual interactions. Accordingly, we compared the functional connectivity estimates of DTF across V and combined audiovisual (AV-A) conditions by applying network-based statistics. Importantly, the application of sparsity thresholding and cluster-level permutation testing in NBS minimizes potential influences stemming from condition-related power differences, particularly those that might arise from subtraction-based waveform derivations. Additionally, DTF (and ADTF) offers resilience against amplitude-related distortions, providing conservative yet reliable estimates of functional connectivity, even at the sensor level. To further guard against power-related confounds, we applied binarization prior to computing graph theory indices. Collectively, these steps strengthen the validity of the extracted connectivity patterns by ensuring that the observed effects are rooted in network topology rather than fluctuations in signal amplitude.
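The sparsity thresholding and binarization step mentioned above can be sketched as follows. This is a simplified illustration assuming a directed, weighted connectivity matrix (as produced by DTF); the function name and parameters are our own.

```python
import numpy as np

def threshold_binarize(W, sparsity=0.1):
    """Keep the strongest `sparsity` fraction of off-diagonal connections
    and binarize them, as done before computing graph theory indices."""
    W = W.astype(float).copy()
    np.fill_diagonal(W, 0.0)                 # ignore self-connections
    off = W[~np.eye(W.shape[0], dtype=bool)]
    k = int(round(sparsity * off.size))      # number of edges to retain
    if k == 0:
        return np.zeros_like(W, dtype=int)
    cut = np.sort(off)[-k]                   # weight of the k-th strongest edge
    return (W >= cut).astype(int)            # binary adjacency matrix

# Demo: a random 20-node weighted matrix at the 0.1 sparsity threshold
rng = np.random.default_rng(3)
W = rng.random((20, 20))
A = threshold_binarize(W, sparsity=0.1)      # retains 38 of 380 possible edges
```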

The functional connectivity patterns in theta (p = 0.0034) and alpha (p = 0.0018) bands were significantly increased in the difference waveforms compared to the visual-only condition (Fig. 3a vs. b). The increased connections predominantly involved long-range connectivity patterns between the parietal/parieto-occipital and frontal regions, highlighting the interactions across sensory representations/neural activities over distant cortical areas (Fig. 3b). In the theta and alpha bands, audiovisual interactions exhibited extensive intra-hemispheric and interhemispheric connections, suggesting robust facilitation at the network level. On the other hand, we identified robust increases in functional connections during unisensory processing of motion (visual-only) across all frequency bands but with distinct characteristics (delta p = 0.0086; theta p < 0.001; alpha p = 0.0017; beta p < 0.001; gamma p < 0.001). The strengthening of functional connections occurred through densely connected neighboring subgroups within narrower occipital and parietal regions, suggesting more localized processing during visual-only processing (Fig. 3a). The pattern of connectivity increase was most prominent in the theta, alpha, and beta bands but relatively sparse in the delta and gamma bands. These findings suggest that unisensory processing of visual stimuli relies more on localized networks, while audiovisual interactions engage broader, more integrative networks specific to the alpha and theta frequency bands. We also compared the visual-only waveforms with those of individual auditory timing conditions. These comparisons overall indicated similar changes at the network level (Figures S1-S4).

Fig. 3.

Fig. 3

Functional connectivity differences between unisensory visual (V) and audiovisual difference (AV-A) waveforms (n = 18). Significant connections identified by network-based statistics are displayed across sagittal (left and right), axial, and coronal brain views. Representations were obtained with the BrainNet Viewer Toolbox57. Results are presented for five frequency bands (delta, theta, alpha, beta, and gamma), with corresponding p-values indicated for each frequency band. (a) Functional connectivity was significantly stronger during unisensory processing of visual motion. Blue edges represent the regions with higher DTF connectivity in the unisensory condition, primarily concentrated in occipital and parieto-occipital clusters. (b) Long-range connections became stronger for the audiovisual interactions. Red edges highlight regions with higher DTF connectivity, pronounced between frontal and parietal/parieto-occipital regions, especially in the theta and alpha bands.

The alterations in functional networks were further examined across different conditions by analyzing the topological parameters (CC, LE, Q, and GE) based on graph theory. Given that network parameters from different sparsity thresholds showed similar statistical results, we present the topological parameters at the conservative 0.1 sparsity threshold, consistent with previous literature24,48,63. Figure 4 displays the alterations in topological parameters across distinct frequency bands and sensory conditions (V, AV-A40, AV-A100, AV-A160, and AV-A220). A repeated-measures ANOVA on these measures revealed significant main effects of modality/sensory stimulation and frequency band, as well as a two-way interaction (Table 1). Pairwise comparisons between sensory conditions revealed that network characteristics, including CC, LE, and Q, were significantly higher during visual-only processing than in the difference waveforms across all time-interval and disparity conditions and all frequency bands (FDR-corrected, all p < 0.0001). Interestingly, for the CC and LE measures based on the alpha band, the differences across auditory time intervals (short vs. long) were significant at the high disparity level and marginally significant at the low disparity level. Conversely, GE was significantly higher in all audiovisual conditions compared to the visual-only condition (FDR-corrected, all p < 0.0001; see also Table S1). These comparisons suggest that the significant main effects of modality and frequency, and the observed pairwise differences, were primarily driven by differences in the processing of visual-only (V) and audiovisual interaction (AV-A) waveforms across distinct frequency bands. There were no robust and consistent differences across auditory timing conditions.

Fig. 4.

Fig. 4

The averaged network values for CC, LE, Q, and GE derived from DTF were presented for each frequency band and condition. Error bars represent the standard error (± SEM) across subjects (n = 18).

Table 1.

Repeated-measures ANOVA on graph theory indices derived from DTF (n = 18). Statistical results are presented for the clustering coefficient (CC), local efficiency (LE), modularity (Q), and global efficiency (GE).

        Modality                      Frequency band                Modality × Frequency band
        F(4,17)    p        η²p      F(4,68)    p        η²p       F(16,272)   p        η²p
CC      527.7      < .001   0.969    14.1       < .001   0.454     41.4        < .001   0.709
LE      505.5      < .001   0.967    14.6       < .001   0.462     45.6        < .001   0.729
Q       480.8      < .001   0.966    15.8       < .001   0.481     20.4        < .001   0.546
GE      123.47     < .001   0.879    15.7       < .001   0.48      3.38        0.005    0.166

Adaptive directed transfer function and derived graph theory indices

The DTF analyses contributed to the understanding of functional connectivity changes in distinct frequency bands associated with audiovisual interactions in the temporal domain and the multisensory nature of motion perception. However, although the original paradigm and behavioral data highlighted strong effects of auditory timing, the DTF analysis and derived network measures were limited in detecting changes across different auditory timing conditions. To address this, we further applied an adaptive DTF (ADTF) algorithm that incorporates the temporal dimension of neural activities. The network characteristics CC, LE, Q, and GE were calculated based on the connectivity estimates derived from ADTF across five frequency bands over time.

As indicated by different network measures (Figs. 5 and 6), the ADTF algorithm distinguished some of the auditory timing conditions. In the low-frequency bands (delta and theta, see also Table 2), the ADTF procedure revealed significant differences (all p < 0.01) between visual-only (V) and audiovisual conditions (AV-A40, AV-A100, AV-A160, or AV-A220). In particular, these differences were dominant in the theta band and were mainly indicated by the CC, LE, and Q measures. Overall, the significant modulations in LE and CC occurred in similar time windows and experimental conditions (Fig. 5). This alignment is consistent with existing literature, suggesting that both measures reflect the network’s segregation capabilities. The overall trend of both measures showed a decline over time in the theta and alpha bands, suggesting a weakening of local network connections in these low-frequency bands. The temporal dynamics of modularity (Q) showed a more complex pattern (Fig. 6a). Compared to the visual-only condition, modularity in the audiovisual conditions was higher in the early and lower in the late identified time windows of the theta band. The significant modulations in global efficiency (GE) were dominant in the theta and beta frequency bands (Fig. 6b). The GE measures of the theta band increased over time, suggesting an increase in the capacity for engaging in global interactions and network-wide integration. The alpha frequency band showed a similar increase, but the paired comparisons of GE did not identify a significant time window. On the other hand, short and long auditory time intervals were notably distinguished for both disparity levels (AV-A100 vs. AV-A160; AV-A40 vs. AV-A220) in GE measures based on beta oscillations. Auditory timing was also discriminated by local network measures (CC, Q) of alpha oscillations within some non-overlapping time windows.

Fig. 5.

Fig. 5

Time-resolved dynamics of CC and LE derived from ADTF for each frequency band (n = 18). (a) CC values and (b) LE values are shown for visual-only (orange), short auditory time interval (blue), and long auditory time interval (dark blue) conditions, separated by low and high disparity levels. Significant time clusters (p < .01, cluster-corrected) are highlighted with bars and dashed lines.

Fig. 6.

Fig. 6

Time-resolved dynamics of Q and GE derived from ADTF for each frequency band (n = 18). (a) Q values and (b) GE values are shown for visual-only (orange), short auditory time interval (blue), and long auditory time interval (dark blue) conditions, separated by low and high disparity levels. Significant time clusters (p < .01, cluster-corrected) are highlighted with bars and dashed lines.

Table 2.

Significant time windows (in milliseconds, p < .01) were identified by cluster-based permutation tests for graph theory indices—Clustering Coefficient (CC), Local Efficiency (LE), Modularity (Q), and Global Efficiency (GE)—derived from ADTF across auditory timing conditions (low and high disparity) and frequency bands (n = 18).

                     CC                          LE                          Q                              GE
                     Low Disp.   High Disp.      Low Disp.   High Disp.      Low Disp.   High Disp.         Low Disp.   High Disp.
Delta   V–Short      500–670     520–680         530–670     520–690         NS          580–700            NS          NS
        V–Long       NS          550–630         NS          NS              NS          NS                 NS          NS
        Short–Long   NS          NS              NS          NS              NS          NS                 NS          NS
Theta   V–Short      400–440     520–600         400–440     520–600         230–310     200–310, 550–650   230–330     240–330
        V–Long       400–440     540–630         400–440     540–600         250–300     250–310            230–330     NS
        Short–Long   690–740     NS              690–740     NS              NS          550–600            NS          NS
Alpha   V–Short      NS          NS              NS          NS              NS          NS                 NS          NS
        V–Long       NS          NS              NS          NS              NS          NS                 NS          160–240
        Short–Long   NS          170–220         NS          170–240         600–800     400–460            NS          NS
Beta    V–Short      NS          NS              NS          NS              NS          NS                 NS          NS
        V–Long       NS          NS              NS          NS              NS          NS                 NS          NS
        Short–Long   NS          420–490         NS          450–480         NS          NS                 310–400     310–400
Gamma   V–Short      NS          NS              NS          NS              NS          NS                 NS          NS
        V–Long       NS          NS              NS          NS              NS          NS                 NS          NS
        Short–Long   NS          NS              NS          NS              NS          NS                 NS          NS

Rows compare visual-only (V) with audiovisual short and long auditory timing conditions (V–Short; V–Long) and short versus long auditory conditions (Short–Long). NS corresponds to no significant differences.

Figures 5 and 6 demonstrated major overall modulations of network measures in the temporal domain. These temporal modulations were additionally reflected in the raw connectivity matrices derived from the ADTF (Figs. S5-S9), providing an overview of connectivity pattern changes at each 100 ms time step. Using network-based statistics to compare each matrix with the preceding time step, we observed that functional connections in lower frequency bands (delta, theta, and alpha) consistently increased over time, while those in higher frequency bands (beta and gamma) decreased. This indicates distinct patterns of cortical communication dynamics for different frequency bands (Figs. S10-S14, Table S2). These findings suggest that lower frequency oscillations play a stronger role in long-range cortical communication over time. Specifically, the increase in low-frequency connectivity persisted until ~ 400 ms, while high-frequency connectivity declined sharply within the first 100 ms. As high-frequency activity diminishes over time, the coordination across sensory modalities and audiovisual interactions relies increasingly on the sustained engagement of low-frequency oscillations. To complement the previous analyses, we also compared individual audiovisual conditions with the visual-only condition, as well as auditory timing conditions with each other, at each 100 ms time step. Significant alterations in functional connectivity patterns were observed in association with audiovisual interactions (Fig. 7, Table S3), and these significant changes were predominantly observed in the lower frequency bands (e.g., theta and alpha), consistent with the findings from the DTF analysis. Notably, these connections were particularly prominent in the early time windows, up to 200 ms, and similarly present across audiovisual conditions (Table S3). On the other hand, these additional analyses did not reveal any significant differences across auditory timing conditions.

Fig. 7.

Fig. 7

Significant ADTF connectivity changes between unisensory visual (V) and audiovisual difference (AV-A) waveforms over time (n = 18). The significant connections identified by network-based statistics are displayed across axial and coronal brain views. Representations were obtained with the BrainNet Viewer Toolbox57. Results are presented for significant frequency bands and time windows with corresponding p-values (Table S3). Red edges represent the regions with higher ADTF connectivity in the multisensory condition, primarily concentrated between frontal and parietal/parieto-occipital clusters.

To assess whether graph-theoretical differences were related to perceptual outcomes, we conducted exploratory correlation analyses between GT metrics and behavioral performance. None of these analyses revealed significant correlations between the behavioral performance and the differences (short-long) in any of the GT indices (p > 0.05), regardless of the disparity level (low or high).

Discussion

Multisensory processing involves the coordination and interaction of distributed cortical areas, emphasizing the dynamic and flexible recruitment of different brain regions6466. This aspect of sensory processing is particularly crucial for understanding cortical mechanisms and functional networks underlying perceptual dynamics. Using a dynamic audiovisual paradigm, we investigated cortical networks and oscillations associated with audiovisual interactions and the multisensory nature of motion and speed estimation. Utilizing network-based analyses and graph theory modeling, we identified functional connectivity patterns in specific frequency bands, particularly the theta and alpha bands, that facilitated long-range communication between cortical regions during audiovisual interactions. These findings primarily associate these oscillations with the interaction/binding of auditory and visual inputs and reveal their important roles in global network coordination. We further observed significant modulations in local and global network measures, reflecting the dynamic reorganization of cortical networks to balance integration and segregation during audiovisual processing. The outcome of the network-based approach contributes to an evolving perspective on how the brain dynamically coordinates sensory information for a coherent multisensory percept. In light of our original aims, we discuss the implications of these findings for crossmodal influences, the multisensory integration underlying motion perception, and the limitations of network-based analyses in understanding temporal modulations.

Frequency-specific connectivity changes associated with audiovisual interactions and the multisensory nature of motion perception

Our findings revealed that audiovisual interactions are associated with distinct functional connectivity patterns compared to the processing of visual motion in isolation. For the unisensory processing of visual motion, cortical networks exhibited stronger localized connectivity, primarily in the theta, alpha, and beta bands, with significant increases predominantly confined to occipital and parieto-occipital regions, suggesting enhanced segregation of sensory information in the visual cortex. This aligns with previous research consistently showing activity in these regions closely linked to visual perception6775. Our results further validate the critical role of the occipital/parietal network in motion perception. More importantly, we found that audiovisual interactions engaged connectivity and crosstalk with other parietal and frontal scalp sites. These connections were primarily characterized by long-range pathways linking parieto-occipital regions to frontal areas, highlighting the involvement of distributed cortical networks in combining auditory and visual inputs. Previous studies have suggested that the interactions over frontal regions depend on the temporal properties of audiovisual stimulation12,76. The neural activities over these regions are commonly associated with high-level sensory processing, with certain neurons demonstrating selectivity for the direction and speed of visual motion7779. For example, activations in areas such as the prefrontal cortex have been implicated in the comparison of sensory signals, playing a critical role in perceptual tasks that require the comparison of remembered and current stimuli. It is possible that the speed comparison task used by Kaya and Kafaligonul12 might engage this region, as the task was performed within a two-interval forced-choice paradigm. 
Furthermore, specific subdivisions of the frontal regions, such as Brodmann area 8, are known to receive inputs from both the motion pathway and the auditory cortex80 and have been proposed to contribute to audiovisual motion integration81.

Notably, the identified long-range communications between different regions were frequency-specific and carried out by alpha and theta oscillations. Modulations in the alpha band have been commonly reported by audiovisual studies8284, and alpha oscillations have been widely implicated in facilitating crossmodal influences and inter-areal crosstalk between sensory regions84. There is also strong evidence that phase synchronization in the alpha band predicts whether successive stimuli are integrated or segregated in the temporal domain8690. By promoting long-range communication between frontal and posterior regions, alpha activity may reflect feedforward and/or top-down mechanisms crucial for coordinating crossmodal influences and perceptual binding, as well as maintaining ongoing perceptual states9193. This dynamic interplay combines local synchronization in the visual cortex with broader integration across occipital, parietal, and frontal areas94, enabling flexible coordination of sensory and perceptual functions. While the theta band is often associated with high-level cognitive functions9597, previous research on audiovisual paradigms also revealed that modulations in the theta band reflect changes in multisensory perception85. These studies utilized illusions based on the incongruence/mismatch between visual and auditory stimuli in the spatial or temporal domain98,99. The findings illustrated that theta band oscillations may be an index of incongruity processing and reflect variations in the final illusory percept. Given that theta band oscillations have been implicated in the monitoring of response conflict100 and are well suited for long-range information transfer across cortical regions101, they may serve as a neural mechanism for perceptual adjustment in multisensory profiles, where information might be temporally disparate and has to be integrated across different sensory regions.

While analyses based on DTF provided overall changes in network connectivity, the ADTF (adaptive DTF) analyses offered additional insights into how these connections evolved dynamically over time. Specifically, we found frequency-band-specific temporal shifts in connectivity patterns during the speed comparison task. Our results showed that low-frequency bands such as delta, theta, and alpha exhibited consistent increases in connectivity over time, highlighting their role in facilitating communication across different cortical regions. However, high-frequency bands, including beta and gamma, demonstrated transient connectivity patterns, reflecting localized processing demands that diminished rapidly over time. The processing of multisensory information requires flexible neural mechanisms that can rapidly adapt to sensory changes in the environment66. Such flexibility may be required for the integration of sensory inputs across cortical regions at different timescales. In accordance with this perspective, previous studies argued that neural oscillations provide time windows of varying durations to bind or segregate sensory information71,102105. These intrinsic neural timescales differ across different stages of cortical processing, with early sensory processing having relatively shorter timescales and higher-order areas demonstrating progressively longer timescales106108. Therefore, different timescales, reflected in distinct oscillatory frequency bands, have been proposed to route information through cortical networks, creating functional links between neural populations109111. That is to say, rapid processing demands in early sensory areas may rely on faster oscillations operating within shorter timescales, enabling the quick extraction and segregation of sensory features.
On the other hand, the processing at later stages may depend on slower oscillations, facilitating the coordination and synthesis of information across distributed cortical regions to support relatively complex perceptual processing. Recent findings underscore the critical role of multi-timescale neural dynamics in shaping multisensory processing. Functional coupling across different frequency bands enables parallel processing of complementary information, enhancing system capacity through a process known as multiplexing102,112,113. This dynamic interplay is particularly important for different stages of multisensory processing across different timescales, facilitating both rapid extraction of localized sensory features and the slower synthesis of distributed information required for higher-order processing. Our results align with this framework, demonstrating how fast oscillations reflect early sensory processing, while slower oscillations enable coordination across cortical regions for later stages, which might be important for effective integration and coherent perception. Our findings reflect the brain’s dynamic adaptability, demonstrating its capacity to rapidly transition between localized unisensory processing and broader integrative networks required for multisensory processing. Such flexibility allows neural networks to efficiently reconfigure within a short timeframe, optimizing both specialized and integrative functions to meet the varying demands of audiovisual interactions and binding.

Insights from graph network modeling

The graph theoretical analysis based on DTF indicated a clear dominance of global integration during audiovisual interactions and, hence, multisensory processing, particularly reflected in heightened GE in all frequency bands. The increase in GE highlights the brain’s ability to enhance long-range connectivity to merge auditory and visual inputs into a cohesive percept and may reflect relatively higher-order processing demands due to crosstalk across distributed cortical regions45. As Sporns114 noted, integrative processes can be viewed from the perspective of global communication efficiency and the network’s capacity to combine distributed information. This aligns with the idea that global efficiency reflects a workspace configuration characterized by long-range synchronization and reduced modularity, which emerges during demanding processes to facilitate information transfer in parallel115. Our findings also resonate with observations that greater cognitive effort elicits more globally efficient, less clustered, and less modular network configurations, enabling widespread synchronization across anatomically distant regions94,116. On the other hand, the segregation was higher during unisensory processing of visual motion, as indicated by enhanced CC, LE, and Q values. These metrics reflect localized connectivity patterns confined primarily to occipital and parieto-occipital regions, emphasizing specialized visual processing within the sensory cortex. This pattern suggests that visual-only conditions rely more on specialized local processing hubs in occipital and parieto-occipital regions to handle visual motion perception efficiently. Regions with clustered interconnectivity often exhibit both spatial and topological proximity, contributing to low wiring costs and efficient resource allocation49. 
Recent work by Singh et al.16 using fMRI-based graph analyses found that audiovisual binding was associated with increased local segregation and reduced global integration, a pattern that contrasts with the increased global efficiency and reduced modularity observed in our results. This apparent discrepancy may reflect differences in task demands (simultaneity judgment vs. speed estimation), neuroimaging modality (fMRI vs. EEG), and network construction. Their study16 captured relatively stable, spatially distributed correlations on the timescale of seconds, whereas our DTF/ADTF-based approach tracks rapid, frequency-specific, and directed interactions over hundreds of milliseconds. As emphasized by Fornito et al.50, such methodological and temporal differences can lead to divergent patterns of network organization, even within the same cognitive domain. In addition, segregated and integrated network configurations are not mutually exclusive but may dominate under different cognitive conditions, with segregated networks supporting fast, localized processing and integrated networks enabling deliberative, large-scale coordination. Taken together, our findings illustrate the dynamic balance between integration and segregation in network organization and underscore the brain’s remarkable adaptability to the processing demands of diverse sensory profiles114. This dual mechanism allows the brain to respond flexibly to the distinct demands of motion processing and audiovisual interactions, and provides evidence for the flexible organization of neural networks underlying the multisensory nature of motion and speed estimation.

The time-varying graph theoretical analysis of ADTF offered insight into how cortical networks dynamically adapt to the demands of audiovisual processing. Our findings demonstrated distinct shifts in topological properties across low-frequency bands (delta, theta, and alpha), with CC and LE exhibiting a sharp decline around 200 ms post-stimulus, coinciding with a rise in Q and GE. Over the course of the 800 ms time window, Q and GE stabilized after their initial increase, while CC and LE showed a partial rebound. The overall trend, however, indicated a decrease in CC and LE and a sustained increase in Q and GE, emphasizing a gradual transition toward more integrative network configurations. This transition reflects a shift from localized connectivity to long-range communication and may indicate the brain’s ability to adapt and reconfigure connectivity patterns according to changing sensory processing demands. Similarly, previous studies suggest that functional networks can rapidly reorganize to meet the specific demands of multisensory processing117–119. Notably, in the beta band, all graph network indices, including measures associated with localized and specialized operations, demonstrated a marked increase over time, highlighting their relevance in task-related network reconfiguration115,120. Interestingly, beta-band GE not only increased steadily over time but also showed the most pronounced differentiation between short and long time intervals across both disparity levels, suggesting that beta-band networks might discriminate fine changes in auditory timing. However, we did not find any selectivity across disparity levels, nor any correlation with the behavioral changes.
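The time-varying analysis can be sketched as the same graph metrics computed per window of an ADTF-like sequence of connectivity matrices. The window count, channel count, and random weights below are hypothetical placeholders for the actual ADTF estimates:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)

def binarize(W, density=0.20):
    """Keep only the strongest fraction of off-diagonal weights."""
    thresh = np.quantile(W[W > 0], 1 - density)
    return (W >= thresh).astype(int)

# Hypothetical ADTF-like output: one weighted matrix per time window.
n_windows, n_ch = 8, 32
adtf = rng.random((n_windows, n_ch, n_ch))
for W in adtf:
    np.fill_diagonal(W, 0.0)  # no self-connections

ge_t, cc_t = [], []
for W in adtf:
    A = binarize(W)
    G = nx.from_numpy_array(np.maximum(A, A.T))  # symmetrized for these metrics
    ge_t.append(nx.global_efficiency(G))   # integration over time
    cc_t.append(nx.average_clustering(G))  # segregation over time

print([round(v, 2) for v in ge_t])
```

In the study's data, a decline in CC/LE with a concurrent rise in GE/Q around 200 ms would appear as opposing trends across such per-window trajectories.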

Limitations and future directions

The present study further illustrates the limitations of network-based analyses and graph modeling in capturing sensory events and provides future directions for improving the temporal resolution of connectivity estimates. Our analyses revealed important changes in network dynamics and graph theoretical measures associated with audiovisual interactions within the context of motion and speed estimation. Since Kaya and Kafaligonul12 identified significant effects of auditory timing (time interval and disparity level) in distinct ERP components, we further applied ADTF to examine these effects on connectivity and network measures. As mentioned earlier, these analyses captured overall, coarse-grained changes at the network level. However, in terms of connectivity differences across auditory time intervals (short vs. long) and disparity levels (low vs. high), the analyses did not reveal consistent patterns of modulation, despite the behavioral results showing clear perceptual distinctions across these conditions. This likely reflects limitations in the temporal resolution of the employed connectivity approach, which may not capture short-lived or fine-grained fluctuations in connectivity during rapidly changing stimulation. Moreover, previous work using this dataset12 reported ERP modulations specifically related to auditory timing, suggesting that the effects of timing differences may be expressed more locally in early sensory responses rather than in delayed large-scale network dynamics. These limitations highlight the need for analytical frameworks that are more sensitive to transient dynamics and localized activity differences. Future work combining higher-temporal-resolution connectivity estimates with localized activity measures may better characterize how auditory timing modulates brain responses. It is also worth noting that the original dataset included a modest sample size and a limited number of trials per condition.
While the sample size is within the typical range for EEG studies, it may not be optimal for ensuring the full stability of higher-order graph metrics. To mitigate these concerns, we employed conservative statistical strategies, including sparsity thresholding, binarization of connectivity matrices, and cluster-level permutation testing. However, these choices may also contribute to the absence of large-scale connectivity differences across audiovisual timing conditions. This null finding may reflect two possibilities: finer-grained or transient effects may exist but fall below the sensitivity threshold of our current analysis pipeline, or auditory timing may exert its influence at earlier perceptual stages (e.g., latency shifts or sensory gating) that are not readily captured by global graph metrics derived from DTF/ADTF. Future investigations with larger sample sizes will enable subgroup analyses to examine individual variability in response to crossmodal influences and provide a direct link between motion/speed estimation and functional alterations at the network level.
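The permutation-based strategy mentioned above rests on exchangeability under the null hypothesis. As a minimal sketch, the following runs a paired sign-flip permutation test on a per-subject graph metric; the subject count, effect sizes, and synthetic values are illustrative assumptions, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-subject graph metric (e.g., GE) in two conditions.
n_sub = 16
cond_av = rng.normal(0.55, 0.05, n_sub)  # audiovisual
cond_v = rng.normal(0.50, 0.05, n_sub)   # visual-only

diffs = cond_av - cond_v
observed = diffs.mean()

# Sign-flip permutations: under the null hypothesis of no condition
# difference, each paired difference is equally likely to be +/-.
n_perm = 5000
null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1, 1], size=n_sub)
    null[i] = (signs * diffs).mean()

# Two-sided p-value against the permutation null distribution.
p = np.mean(np.abs(null) >= abs(observed))
print(f"observed difference = {observed:.3f}, p = {p:.4f}")
```

In a full cluster-level procedure, the same label-permutation logic is applied to connection-wise statistics, and cluster mass is compared against the maximum of the permutation distribution to control family-wise error.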

Conclusion

Taken together, our findings illustrate the dynamic interplay between local specialization and global communication in cortical networks during audiovisual interactions. We further identified the critical role of theta and alpha oscillations in supporting long-range connectivity during audiovisual interactions in the temporal domain. By employing time-sensitive adaptive analyses, we captured the brain’s remarkable adaptability in reorganizing functional networks to meet sensory processing demands. These results highlight how the brain leverages distinct frequency bands to balance segregation and integration, enabling efficient coordination of sensory inputs for the multisensory basis of motion and speed estimation. Overall, the findings have important implications for understanding the neural mechanisms underlying multisensory perception in everyday situations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (28.5MB, pdf)

Acknowledgements

We are grateful to Utku Kaya for providing a well-organized version of the original dataset.

Author contributions

I.A., S.A., and H.K. conceived of the study. I.A. and S.A. constructed and implemented the data analysis pipeline. I.A., S.A., and H.K. interpreted the results. I.A. wrote the first draft of the manuscript. I.A., S.A., and H.K. edited the manuscript and approved the final version.

Funding

This work was supported by the Scientific and Technological Research Council of Turkiye (BIDEB 2211-A) and the Turkish Academy of Sciences (TUBA-GEBIP Award).

Data availability

The dataset and analysis codes of the current study are available at https://osf.io/7p8as/?view_only=cce86ab3239842af9a07d1eb1e744b83.

Declarations

Competing interests

The authors declare no competing financial interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Chen, L. & Vroomen, J. Intersensory binding across space and time: A tutorial review. Atten. Percept. Psychophys.75(5), 790–811. 10.3758/s13414-013-0475-4 (2013). [DOI] [PubMed] [Google Scholar]
  • 2.Van der Stoep, N., Van der Stigchel, S., Nijboer, T. C. W. & Spence, C. Visually induced inhibition of return affects the integration of auditory and visual information. Perception46(1), 6–17. 10.1177/0301006616661934 (2017). [DOI] [PubMed] [Google Scholar]
  • 3.Vroomen, J. & Keetels, M. Perception of intersensory synchrony: A tutorial review. Atten. Percept. Psychophys.72(4), 871–884. 10.3758/APP.72.4.871 (2010). [DOI] [PubMed] [Google Scholar]
  • 4.Freeman, E. & Driver, J. Direction of visual apparent motion driven solely by timing of a static sound. Curr. Biol.18(15), 1262–1266. 10.1016/j.cub.2008.07.066 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Kafaligonul, H. & Stoner, G. R. Auditory modulation of visual apparent motion with short spatial and temporal intervals. J. Vis.10(12), 31. 10.1167/10.12.31 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Ogulmus, C., Karacaoglu, M. & Kafaligonul, H. Temporal ventriloquism along the path of apparent motion: Speed perception under different spatial grouping principles. Exp. Brain Res.236(2), 629–643. 10.1007/s00221-017-5159-1 (2018). [DOI] [PubMed] [Google Scholar]
  • 7.Kafaligonul, H. & Stoner, G. R. Static sound timing alters sensitivity to low-level visual motion. J. Vis.12(11), 2. 10.1167/12.11.2 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Getzmann, S. The effect of brief auditory stimuli on visual apparent motion. Perception36(7), 1089–1103. 10.1068/p5741 (2007). [DOI] [PubMed] [Google Scholar]
  • 9.Sekuler, R., Sekuler, A. & Lau, R. Sound alters visual motion perception. Nature385(6614), 308. 10.1038/385308a0 (1997). [DOI] [PubMed] [Google Scholar]
  • 10.Soto-Faraco, S., Kingstone, A. & Spence, C. Multisensory contributions to the perception of motion. Neuropsychologia41(13), 1847–1862. 10.1016/S0028-3932(03)00185-4 (2003). [DOI] [PubMed] [Google Scholar]
  • 11.Soto-Faraco, S. & Väljamäe, A. Multisensory interactions during motion perception: From basic principles to media applications (CRC Press, Hoboken, 2012). [PubMed] [Google Scholar]
  • 12.Kaya, U. & Kafaligonul, H. Cortical processes underlying the effects of static sound timing on perceived visual speed. Neuroimage199, 194–205. 10.1016/j.neuroimage.2019.05.062 (2019). [DOI] [PubMed] [Google Scholar]
  • 13.Fendrich, R. & Corballis, P. M. The temporal cross-capture of audition and vision. Percept. Psychophys.63(4), 719–725. 10.3758/BF03194432 (2001). [DOI] [PubMed] [Google Scholar]
  • 14.Morein-Zamir, S., Soto-Faraco, S. & Kingstone, A. Auditory capture of vision: Examining temporal ventriloquism. Cogn. Brain Res.17(1), 154–163. 10.1016/S0926-6410(03)00089-2 (2003). [DOI] [PubMed] [Google Scholar]
  • 15.Kaya, U., Yildirim, F. Z. & Kafaligonul, H. The involvement of centralized and distributed processes in sub-second time interval adaptation: An ERP investigation of apparent motion. Eur. J. Neurosci.46(8), 2325–2338. 10.1111/ejn.13691 (2017). [DOI] [PubMed] [Google Scholar]
  • 16.Singh, S. S., Mukherjee, A., Raghunathan, P., Ray, D. & Banerjee, A. High segregation and diminished global integration in large-scale brain functional networks enhances the perceptual binding of cross-modal stimuli. Cereb. Cortex34(8), bhae323. 10.1093/cercor/bhae323 (2024). [DOI] [PubMed] [Google Scholar]
  • 17.Kaminski, M. J. & Blinowska, K. J. A new method of the description of the information flow in the brain structures. Biol. Cybern.65(3), 203–210. 10.1007/BF00198091 (1991). [DOI] [PubMed] [Google Scholar]
  • 18.Wilke, C., Ding, L. & He, B. Estimation of time-varying connectivity patterns through the use of an adaptive directed transfer function. IEEE Trans. Biomed. Eng.55(11), 2557–2564. 10.1109/TBME.2008.919885 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Blinowska, K. J. Review of the methods of determination of directed connectivity from multichannel data. Med. Biol. Eng. Compu.49(5), 521–529. 10.1007/s11517-011-0739-x (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Kaminski, M. & Blinowska, K. J. Directed transfer function is not influenced by volume conduction—inexpedient pre-processing should be avoided. Front. Comput. Neurosci.8, 61. 10.3389/fncom.2014.00061 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Kaminski, M. & Blinowska, K. J. The influence of volume conduction on DTF estimate and the problem of its mitigation. Front. Comput. Neurosci.11, 36. 10.3389/fncom.2017.00036 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Zalesky, A., Fornito, A. & Bullmore, E. T. Network-based statistic: Identifying differences in brain networks. Neuroimage53(4), 1197–1207. 10.1016/j.neuroimage.2010.06.041 (2010). [DOI] [PubMed] [Google Scholar]
  • 23.Rubinov, M. & Sporns, O. Complex network measures of brain connectivity: Uses and interpretations. Neuroimage52(3), 1059–1069. 10.1016/j.neuroimage.2009.10.003 (2010). [DOI] [PubMed] [Google Scholar]
  • 24.Xi, Y., Li, Q., Zhang, M., Liu, L. & Wu, J. Characterizing the time-varying brain networks of audiovisual integration across frequency bands. Cogn. Comput.12(6), 1154–1169. 10.1007/s12559-020-09783-9 (2020). [Google Scholar]
  • 25.Xi, Y. et al. Patients with epilepsy without cognitive impairment show altered brain networks in multiple frequency bands in an audiovisual integration task. Neurophysiol. Clin.53(5), 102888. 10.1016/j.neucli.2023.102888 (2023). [DOI] [PubMed] [Google Scholar]
  • 26.Leske, S. et al. Prestimulus network integration of auditory cortex predisposes near-threshold perception independently of local excitability. Cereb. Cortex25(12), 4898–4907. 10.1093/cercor/bhv212 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Keil, J., Pomper, U. & Senkowski, D. Distinct patterns of local oscillatory activity and functional connectivity underlie intersensory attention and temporal prediction. Cortex74, 277–288. 10.1016/j.cortex.2015.10.023 (2016). [DOI] [PubMed] [Google Scholar]
  • 28.Weisz, N. et al. Prestimulus oscillatory power and connectivity patterns predispose conscious somatosensory perception. Proc. Natl. Acad. Sci.111(4), E417–E425. 10.1073/pnas.1317267111 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Lithari, C., Sánchez-García, C., Ruhnau, P. & Weisz, N. Large-scale network-level processes during entrainment. Brain Res.1635, 143–152. 10.1016/j.brainres.2016.01.043 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Liu, H. et al. The scalp time-varying network of auditory spatial attention in “cocktail-party” situations. Hear. Res.442, 108946. 10.1016/j.heares.2023.108946 (2024). [DOI] [PubMed] [Google Scholar]
  • 31.World Medical Association. World medical association declaration of Helsinki: Ethical principles for medical research involving human subjects. JAMA310(20), 2191–2194. 10.1001/jama.2013.281053 (2013). [DOI] [PubMed] [Google Scholar]
  • 32.Allen, P. J., Polizzi, G., Krakow, K., Fish, D. R. & Lemieux, L. Identification of EEG events in the MR scanner: The problem of pulse artifact and a method for its subtraction. Neuroimage8(3), 229–239. 10.1006/nimg.1998.0361 (1998). [DOI] [PubMed] [Google Scholar]
  • 33.Perrin, F., Pernier, J., Bertrand, O. & Echallier, J. F. Spherical splines for scalp potential and current density mapping. Electroencephalogr. Clin. Neurophysiol.72(2), 184–187. 10.1016/0013-4694(89)90180-6 (1989). [DOI] [PubMed] [Google Scholar]
  • 34.Stevenson, R. A. et al. Identifying and quantifying multisensory integration: A tutorial review. Brain Topogr.27(6), 707–730. 10.1007/s10548-014-0365-7 (2014). [DOI] [PubMed] [Google Scholar]
  • 35.Aydın, S. Cross-validated AdaBoost classification of emotion regulation strategies identified by spectral coherence in resting-state. Neuroinformatics20(3), 627–639. 10.1007/s12021-021-09542-7 (2022). [DOI] [PubMed] [Google Scholar]
  • 36.Aydın, S. et al. Comparison of domain-specific connectivity metrics for estimation of brain network indices in boys with ADHD-C. Biomed. Signal Process. Control76, 103626. 10.1016/j.bspc.2022.103626 (2022). [Google Scholar]
  • 37.Rabiner, L. & Herrmann, O. The predictability of certain optimum finite-impulse-response digital filters. IEEE Trans Circuit Theory20(4), 401–408. 10.1109/TCT.1973.1083705 (1973). [Google Scholar]
  • 38.Aydın, S. & Onbaşı, L. Graph theoretical brain connectivity measures to investigate neural correlates of music rhythms associated with fear and anger. Cogn. Neurodyn.18(1), 49–66. 10.1007/s11571-023-09931-5 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Kaya, U. & Kafaligonul, H. Audiovisual interactions in speeded discrimination of a visual event. Psychophysiology58(4), e13777. 10.1111/psyp.13777 (2021). [DOI] [PubMed] [Google Scholar]
  • 40.Bressler, S. L., Kumar, A. & Singer, I. Brain synchronization and multivariate autoregressive (MVAR) modeling in cognitive neurodynamics. Front. Syst. Neurosci.15, 638269. 10.3389/fnsys.2021.638269 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Li, M. & Zhang, N. A dynamic directed transfer function for brain functional network-based feature extraction. Brain Inf9(1), 7. 10.1186/s40708-022-00154-8 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Kaminski, M., Ding, M., Truccolo, W. A. & Bressler, S. L. Evaluating causal relations in neural systems: Granger causality, directed transfer function, and statistical assessment of significance. Biol. Cybern.85(2), 145–157. 10.1007/s004220000235 (2001). [DOI] [PubMed] [Google Scholar]
  • 43.Bastos, A. M. & Schoffelen, J. M. A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Front. Syst. Neurosci.9, 175. 10.3389/fnsys.2015.00175 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.He, B. et al. eConnectome: A MATLAB toolbox for mapping and imaging of brain functional connectivity. J. Neurosci. Methods195(2), 261–269. 10.1016/j.jneumeth.2010.11.015 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Schneider, T. & Neumaier, A. Algorithm 808: ARfit—A MATLAB package for the estimation of parameters and eigenmodes of multivariate autoregressive models. ACM Trans Math Softw (TOMS)27(1), 58–65. 10.1145/382043.382316 (2001). [Google Scholar]
  • 46.Arnold, M., Milner, X. H. R., Witte, H., Bauer, R. & Braun, C. Adaptive AR modeling of nonstationary time series by means of Kalman filtering. IEEE Trans. Biomed. Eng.45(5), 553–562. 10.1109/10.668741 (1998). [DOI] [PubMed] [Google Scholar]
  • 47.Bullmore, E. & Sporns, O. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci.10(3), 186–198. 10.1038/nrn2575 (2009). [DOI] [PubMed] [Google Scholar]
  • 48.Tan, B., Yan, J., Zhang, J., Jin, Z. & Li, L. Aberrant whole-brain resting-state functional connectivity architecture in obsessive-compulsive disorder: An EEG study. IEEE Trans. Neural Syst. Rehabil. Eng.30, 1887–1897. 10.1109/TNSRE.2022.3187966 (2022). [DOI] [PubMed] [Google Scholar]
  • 49.Sporns, O. Networks of the brain (MIT Press, Cambridge, 2011). 10.7551/mitpress/8476.001.0001. [Google Scholar]
  • 50.Fornito, A., Zalesky, A. & Bullmore, E. Fundamentals of brain network analysis (Academic Press, Cambridge, 2016). [Google Scholar]
  • 51.Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature393(6684), 440–442. 10.1038/30918 (1998). [DOI] [PubMed] [Google Scholar]
  • 52.Latora, V. & Marchiori, M. Efficient behavior of small-world networks. Phys. Rev. Lett.87(19), 198701. 10.1103/PhysRevLett.87.198701 (2001). [DOI] [PubMed] [Google Scholar]
  • 53.Girvan, M. & Newman, M. E. J. Community structure in social and biological networks. Proc. Natl. Acad. Sci.99(12), 7821–7826. 10.1073/pnas.122653799 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Newman, M. E. J. Modularity and community structure in networks. Proc. Natl. Acad. Sci.103(23), 8577–8582. 10.1073/pnas.0601602103 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Erdeniz, B., Serin, E., Ibadi, Y. & Taş, C. Decreased functional connectivity in schizophrenia: The relationship between social functioning, social cognition, and graph theoretical network measures. Psychiatry Res: Neuroimaging270, 22–31. 10.1016/j.pscychresns.2017.09.011 (2017). [DOI] [PubMed] [Google Scholar]
  • 56.Lai, C. H., Wu, Y. T. & Hou, Y. M. Functional network-based statistics in depression: Theory of mind subnetwork and importance of parietal region. J. Affect. Disord.217, 132–137. 10.1016/j.jad.2017.03.073 (2017). [DOI] [PubMed] [Google Scholar]
  • 57.Xia, M., Wang, J. & He, Y. BrainNet Viewer: A network visualization tool for human brain connectomics. PLoS ONE8(7), e68910. 10.1371/journal.pone.0068910 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Maris, E. & Oostenveld, R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods164(1), 177–190. 10.1016/j.jneumeth.2007.03.024 (2007). [DOI] [PubMed] [Google Scholar]
  • 59.Merholz, G., Grabot, L., VanRullen, R. & Dugué, L. Periodic attention operates faster during more complex visual search. Sci. Rep.12(1), 6688. 10.1038/s41598-022-10647-5 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Cohen, M. X. Analyzing neural time series data: Theory and practice (The MIT Press, Cambridge, 2014). 10.7551/mitpress/9609.001.0001. [Google Scholar]
  • 61.Duyar, A., Pavan, A. & Kafaligonul, H. Attentional modulations of audiovisual interactions in apparent motion: Temporal ventriloquism effects on perceived visual speed. Atten. Percept. Psychophys.84(7), 2167–2185. 10.3758/s13414-022-02555-7 (2022). [DOI] [PubMed] [Google Scholar]
  • 62.Yilmaz, S. K. & Kafaligonul, H. Attentional demands in the visual field modulate audiovisual interactions in the temporal domain. Hum. Brain Mapp.45(12), e70009. 10.1002/hbm.70009 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Li, F. et al. The dynamic brain networks of motor imagery: Time-varying causality analysis of scalp EEG. Int. J. Neural Syst.29(1), 1850016. 10.1142/S0129065718500168 (2019). [DOI] [PubMed] [Google Scholar]
  • 64.Murray, M. M., Lewkowicz, D. J., Amedi, A. & Wallace, M. T. Multisensory processes: A balancing act across the lifespan. Trends Neurosci.39(8), 567–579. 10.1016/j.tins.2016.05.003 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Talsma, D., Senkowski, D., Soto-Faraco, S. & Woldorff, M. G. The multifaceted interplay between attention and multisensory integration. Trends Cogn. Sci.14(9), 400–410. 10.1016/j.tics.2010.06.008 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.van Atteveldt, N., Murray, M. M., Thut, G. & Schroeder, C. E. Multisensory integration: Flexible use of general operations. Neuron81(6), 1240–1253. 10.1016/j.neuron.2014.02.044 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Romei, V. et al. Spontaneous fluctuations in posterior alpha-band EEG activity reflect variability in excitability of human visual areas. Cereb. Cortex18(9), 2010–2018. 10.1093/cercor/bhm229 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Romei, V., Gross, J. & Thut, G. On the role of prestimulus alpha rhythms over occipito-parietal areas in visual input regulation: Correlation or causation?. J. Neurosci.30(25), 8692–8697. 10.1523/JNEUROSCI.0160-10.2010 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.de Graaf, T. A., Koivisto, M., Jacobs, C. & Sack, A. T. The chronometry of visual perception: Review of occipital TMS masking studies. Neurosci. Biobehav. Rev.45, 295–304. 10.1016/j.neubiorev.2014.06.017 (2014). [DOI] [PubMed] [Google Scholar]
  • 70.Brüers, S. & VanRullen, R. At what latency does the phase of brain oscillations influence perception?. eNeuro10.1523/ENEURO.0078-17.2017 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.VanRullen, R. Perceptual cycles. Trends Cogn. Sci.20(10), 723–735. 10.1016/j.tics.2016.07.006 (2016). [DOI] [PubMed] [Google Scholar]
  • 72.Aydin, A., Ogmen, H. & Kafaligonul, H. Neural correlates of metacontrast masking across different contrast polarities. Brain Struct. Funct.226(9), 3067–3081. 10.1007/s00429-021-02260-5 (2021). [DOI] [PubMed] [Google Scholar]
  • 73.Bauer, A. K. R., Van Ede, F., Quinn, A. J. & Nobre, A. C. Rhythmic modulation of visual perception by continuous rhythmic auditory stimulation. J. Neurosci.41(33), 7065–7075. 10.1523/JNEUROSCI.2980-20.2021 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Akdogan, I., Ogmen, H. & Kafaligonul, H. The phase coherence of cortical oscillations predicts dynamic changes in perceived visibility. Cereb. Cortex34(9), bhae380. 10.1093/cercor/bhae380 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Yang, Y. L., Deng, H. X., Xing, G. Y., Xia, X. L. & Li, H. F. Brain functional network connectivity based on a visual task: Visual information processing-related brain regions are significantly activated in the task state. Neural Regen. Res.10(2), 298–307. 10.4103/1673-5374.152386 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Cecere, R., Gross, J., Willis, A. & Thut, G. Being first matters: Topographical representational similarity analysis of ERP signals reveals separate networks for audiovisual temporal binding depending on the leading sense. J. Neurosci.37(21), 5274–5287. 10.1523/JNEUROSCI.2926-16.2017 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Zaksas, D. & Pasternak, T. Directional signals in the prefrontal cortex and in area MT during a working memory for visual motion task. J. Neurosci.26(45), 11726–11742. 10.1523/JNEUROSCI.3420-06.2006 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Hussar, C. R. & Pasternak, T. Common rules guide the comparison of speed and direction of motion in the dorsolateral prefrontal cortex. J. Neurosci.33(3), 972–986. 10.1523/JNEUROSCI.4075-12.2013 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Wimmer, K., Spinelli, P. & Pasternak, T. Prefrontal neurons represent motion signals from across the visual field but for memory-guided comparisons depend on neurons providing these signals. J. Neurosci.36(36), 9351–9364. 10.1523/JNEUROSCI.0843-16.2016 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Romanski, L. M. Representation and integration of auditory and visual stimuli in the primate ventral lateral prefrontal cortex. Cereb. Cortex17(1), 61–69. 10.1093/cercor/bhm099 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Chaplin, T. A., Rosa, M. G. P. & Lui, L. L. Auditory and visual motion processing and integration in the primate cerebral cortex. Front Neural Circuits12, 93. 10.3389/fncir.2018.00093 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Mercier, M. R. et al. Auditory-driven phase reset in visual cortex: Human electrocorticography reveals mechanisms of early multisensory integration. Neuroimage79, 19–29. 10.1016/j.neuroimage.2013.04.060 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Gleiss, S. & Kayser, C. Oscillatory mechanisms underlying the enhancement of visual motion perception by multisensory congruency. Neuropsychologia53, 84–93. 10.1016/j.neuropsychologia.2013.11.005 (2014). [DOI] [PubMed] [Google Scholar]
  • 84.Kayser, S. J., Philiastides, M. G. & Kayser, C. Sounds facilitate visual motion discrimination via the enhancement of late occipital visual representations. Neuroimage148, 31–41. 10.1016/j.neuroimage.2017.01.010 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Keil, J. & Senkowski, D. Neural oscillations orchestrate multisensory processing. Neuroscientist24(6), 609–626. 10.1177/1073858418755352 (2018). [DOI] [PubMed] [Google Scholar]
  • 86.Wutz, A., Melcher, D. & Samaha, J. Frequency modulation of neural oscillations according to visual task demands. Proc. Natl. Acad. Sci.115(6), 1346–1351. 10.1073/pnas.1713318115 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Wutz, A., Muschter, E., van Koningsbruggen, M. G., Weisz, N. & Melcher, D. Temporal integration windows in neural processing and perception aligned to saccadic eye movements. Curr. Biol.26(13), 1659–1668. 10.1016/j.cub.2016.04.070 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Milton, A. & Pleydell-Pearce, C. W. The phase of pre-stimulus alpha oscillations influences the visual perception of stimulus timing. Neuroimage133, 53–61. 10.1016/j.neuroimage.2016.02.065 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Ronconi, L., Busch, N. A. & Melcher, D. Alpha-band sensory entrainment alters the duration of temporal windows in visual perception. Sci. Rep.8(1), 11810. 10.1038/s41598-018-29671-5 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Ronconi, L., Oosterhof, N. N., Bonmassar, C. & Melcher, D. Multiple oscillatory rhythms determine the temporal organization of perception. Proc. Natl. Acad. Sci.114(51), 13435–13440. 10.1073/pnas.1714522114 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Patten, T. M., Rennie, C. J., Robinson, P. A. & Gong, P. Human cortical traveling waves: Dynamical properties and correlations with responses. PLoS ONE 7(6), e38392. 10.1371/journal.pone.0038392 (2012).
  • 92. Sadaghiani, S. & Kleinschmidt, A. Brain networks and alpha oscillations: Structural and functional foundations of cognitive control. Trends Cogn. Sci. 20(11), 805–817. 10.1016/j.tics.2016.09.004 (2016).
  • 93. Saalmann, Y. B., Pinsk, M. A., Wang, L., Li, X. & Kastner, S. The pulvinar regulates information transmission between cortical areas based on attention demands. Science 337(6095), 753–756. 10.1126/science.1223082 (2012).
  • 94. Doesburg, S. M., Green, J. J., McDonald, J. J. & Ward, L. M. From local inhibition to long-range integration: A functional dissociation of alpha-band synchronization across cortical scales in visuospatial attention. Brain Res. 1303, 97–110. 10.1016/j.brainres.2009.09.069 (2009).
  • 95. Başar, E. & Schürmann, M. Functional aspects of evoked alpha and theta responses in humans and cats: Occipital recordings in “cross-modality” experiments. Biol. Cybern. 72(2), 175–183. 10.1007/BF00205981 (1994).
  • 96. Başar, E., Başar-Eroglu, C., Rahn, E. & Schürmann, M. Sensory and cognitive components of brain resonance responses: An analysis of responsiveness in human and cat brain upon visual and auditory stimulation. Acta Otolaryngol. 111(sup491), 25–35. 10.3109/00016489109136778 (1991).
  • 97. Karakaş, S. A review of theta oscillation and its functional correlates. Int. J. Psychophysiol. 157, 82–99. 10.1016/j.ijpsycho.2020.04.008 (2020).
  • 98. Kaiser, M., Senkowski, D. & Keil, J. Mediofrontal theta-band oscillations reflect top-down influence in the ventriloquist illusion. Hum. Brain Mapp. 42(2), 452–466. 10.1002/hbm.25236 (2021).
  • 99. Simon, D. M. & Wallace, M. T. Integration and temporal processing of asynchronous audiovisual speech. J. Cogn. Neurosci. 30(3), 319–337. 10.1162/jocn_a_01205 (2018).
  • 100. Cohen, M. X. & Cavanagh, J. F. Single-trial regression elucidates the role of prefrontal theta oscillations in response conflict. Front. Psychol. 2, 30. 10.3389/fpsyg.2011.00030 (2011).
  • 101. Von Stein, A. & Sarnthein, J. Different frequencies for different scales of cortical integration: From local gamma to long-range alpha/theta synchronization. Int. J. Psychophysiol. 38(3), 301–313. 10.1016/S0167-8760(00)00172-0 (2000).
  • 102. Senkowski, D. & Engel, A. K. Multi-timescale neural dynamics for multisensory integration. Nat. Rev. Neurosci. 25(9), 625–642. 10.1038/s41583-024-00845-7 (2024).
  • 103. Fries, P. A mechanism for cognitive dynamics: Neuronal communication through neuronal coherence. Trends Cogn. Sci. 9(10), 474–480. 10.1016/j.tics.2005.08.011 (2005).
  • 104. Fries, P. Rhythms for cognition: Communication through coherence. Neuron 88(1), 220–235. 10.1016/j.neuron.2015.09.034 (2015).
  • 105. Jensen, O., Gips, B., Bergmann, T. O. & Bonnefond, M. Temporal coding organized by coupled alpha and gamma oscillations prioritizes visual processing. Trends Neurosci. 37(7), 357–369. 10.1016/j.tins.2014.04.001 (2014).
  • 106. Wang, X. J. & Kennedy, H. Brain structure and dynamics across scales: In search of rules. Curr. Opin. Neurobiol. 37, 92–98. 10.1016/j.conb.2015.12.010 (2016).
  • 107. Soltani, A., Murray, J. D., Seo, H. & Lee, D. Timescales of cognition in the brain. Curr. Opin. Behav. Sci. 41, 30–37. 10.1016/j.cobeha.2021.03.003 (2021).
  • 108. Wolff, A. et al. Intrinsic neural timescales: Temporal integration and segregation. Trends Cogn. Sci. 26(2), 159–173. 10.1016/j.tics.2021.11.007 (2022).
  • 109. Bastos, A. M. et al. Visual areas exert feedforward and feedback influences through distinct frequency channels. Neuron 85(2), 390–401. 10.1016/j.neuron.2014.12.018 (2015).
  • 110. Engel, A. K. & Fries, P. Beta-band oscillations: Signalling the status quo? Curr. Opin. Neurobiol. 20(2), 156–165. 10.1016/j.conb.2010.02.015 (2010).
  • 111. Michalareas, G. et al. Alpha-beta and gamma rhythms subserve feedback and feedforward influences among human visual cortical areas. Neuron 89(2), 384–397. 10.1016/j.neuron.2015.12.018 (2016).
  • 112. Panzeri, S., Brunel, N., Logothetis, N. K. & Kayser, C. Sensory neural codes using multiplexed temporal scales. Trends Neurosci. 33(3), 111–120. 10.1016/j.tins.2009.12.001 (2010).
  • 113. Helfrich, R. F. & Knight, R. T. Oscillatory dynamics of prefrontal cognitive control. Trends Cogn. Sci. 20(12), 916–930. 10.1016/j.tics.2016.09.007 (2016).
  • 114. Sporns, O. Network attributes for segregation and integration in the human brain. Curr. Opin. Neurobiol. 23(2), 162–171. 10.1016/j.conb.2012.11.015 (2013).
  • 115. Kitzbichler, M. G., Henson, R. N., Smith, M. L., Nathan, P. J. & Bullmore, E. T. Cognitive effort drives workspace configuration of human brain functional networks. J. Neurosci. 31(22), 8259–8270. 10.1523/JNEUROSCI.0440-11.2011 (2011).
  • 116. Palva, S., Monto, S. & Palva, J. M. Graph properties of synchronized cortical networks during visual working memory maintenance. Neuroimage 49(4), 3257–3268. 10.1016/j.neuroimage.2009.11.031 (2010).
  • 117. Ren, Y. et al. Perceptual training improves audiovisual integration by enhancing alpha-band oscillations and functional connectivity in older adults. Cereb. Cortex 34(8), bhae216. 10.1093/cercor/bhae216 (2024).
  • 118. Wang, Z., Yu, L., Xu, J., Stein, B. E. & Rowland, B. A. Experience creates the multisensory transform in the superior colliculus. Front. Integr. Neurosci. 14, 18. 10.3389/fnint.2020.00018 (2020).
  • 119. Yu, L., Rowland, B. A., Xu, J. & Stein, B. E. Multisensory plasticity in adulthood: Cross-modal experience enhances neuronal excitability and exposes silent inputs. J. Neurophysiol. 109(2), 464–474. 10.1152/jn.00739.2012 (2013).
  • 120. Bassett, D. S. et al. Cognitive fitness of cost-efficient brain functional networks. Proc. Natl. Acad. Sci. 106(28), 11747–11752. 10.1073/pnas.0903641106 (2009).

Associated Data


Supplementary Materials

Supplementary Material 1 (28.5MB, pdf)

Data Availability Statement

The dataset and analysis codes of the current study are available at https://osf.io/7p8as/?view_only=cce86ab3239842af9a07d1eb1e744b83.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
