Abstract
Attention modulates numerous stages of audiovisual integration, and studies have shown that audiovisual integration is greater in attended conditions than in unattended conditions. However, attentional resources are limited for each person, and it is not yet clear how audiovisual integration changes under different attentional loads. Here, we explored how auditory attentional load affects audiovisual integration by applying an auditory/visual discrimination task to evaluate audiovisual integration and a rapid serial auditory presentation (RSAP) task to manipulate auditory attentional resources. The results for the peak benefit and the positive area under the curve of the probability difference showed that audiovisual integration was highest in the low attentional load condition and lowest in the high attentional load condition (low > no = medium > high). The peak latency and time window revealed that audiovisual integration was delayed as the attentional load increased (no < low < medium < high). Additionally, audiovisual depression was found in the no, medium, and high attentional load conditions but not in the low attentional load condition. These results suggest that mild auditory attentional load increases audiovisual integration, whereas high auditory attentional load decreases it.
Keywords: audiovisual integration, attentional load, auditory attention, discrimination task, race model
Introduction
Individuals are often inundated with stimuli from various sensory modalities (e.g., auditory, visual, tactile, olfactory, and somatosensory stimuli). However, our brains can effectively screen out and integrate the most pertinent information to better understand our surroundings. For example, to safely drive a car, we must comprehensively utilize auditory information (e.g., music, the car engine), visual information (e.g., the road, roadside billboards), and tactile information (e.g., the feel of the steering wheel). The integration of information from various sensory modalities is called multisensory integration (Meredith et al., 1987; Stein & Meredith, 1993). In the real world, the auditory and visual modalities are the most common channels for acquiring sensory information, and the integration of auditory and visual information is a central focus of multisensory research. Studies have confirmed that responses to bimodal audiovisual stimuli are faster than responses to unimodal auditory or visual stimuli (Meredith et al., 1987; Stein & Meredith, 1993). For example, movies with sound are considerably easier to understand than silent films or audio-only programs. Furthermore, attention is a key factor that alters the processing of sensory stimuli by facilitating the detection and perception of stimuli presented at the attended location (Ho et al., 2009; McDonald et al., 2000). Numerous studies have confirmed that attention affects audiovisual integration at multiple stages and that audiovisual integration is greater in attended conditions than in unattended conditions (Talsma et al., 2007; Talsma et al., 2009; Talsma et al., 2010; Talsma & Woldorff, 2005; Xu et al., 2020).
According to the attentional load theory proposed by Lavie (1995) and Lavie and Tsal (1994), attentional resources are limited for each person; when multiple tasks are conducted simultaneously and one task occupies more attentional resources, less attention can be allocated to the other tasks. In everyday life, individuals continuously face more than one event at a time, and the main task is always disturbed by distractors in the surrounding environment (Fan et al., 2002; Stein, 2012). There is little research on whether the brain can integrate relevant auditory and visual information while suppressing other distractors in these situations. To clarify this matter, Alsius et al. (2005, 2014) examined audiovisual integration using the classical McGurk effect under low or high visual attentional load conditions. Under low visual attentional load, the participants performed a McGurk task alone, whereas under high visual attentional load, they performed dual tasks (a McGurk task and a rapid serial visual presentation, RSVP, task). Both the behavioral (Alsius et al., 2005) and event-related potential (Alsius et al., 2014) results showed that audiovisual integration was weaker under the high visual attentional load condition than under the low visual attentional load condition. Given the limitation of attentional resources (Lavie, 1995; Lavie & Tsal, 1994), the most likely explanation is that fewer attentional resources were left to process the McGurk words under the high visual attentional load condition because the distractor stimuli demanded more attentional resources under that condition (Alsius et al., 2005, 2014). Alsius et al. further concluded that attentional load decreases audiovisual integration. Because audiovisual integration is greater when stimuli are presented at an attended location than at an unattended location (Talsma et al., 2007; Talsma et al., 2009; Talsma et al., 2010; Talsma & Woldorff, 2005), audiovisual integration might increase when additional attentional resources are devoted to processing the audiovisual information. However, the studies by Alsius et al. (2005, 2014) included only two attentional load conditions (low and high), and the load in the high-load condition was very high, yielding an accuracy of less than 60% in the dual tasks. It therefore cannot be determined whether audiovisual integration was reduced monotonically by added attentional load or whether it first increased under low attentional load and then decreased under high attentional load. Additionally, the speech materials employed by Alsius et al. involved high-order speech processing, making it difficult to determine whether the attentional load influenced audiovisual integration itself or the high-order speech processing.
To remove the effect of high-order cognitive processing, Ren, Li et al. (2020) and Ren, Zhou et al. (2020) investigated how audiovisual integration changes with increasing visual attentional load using a meaningless auditory/visual discrimination task and an RSVP task. Similar to Alsius et al.'s (2005, 2014) experimental design, single tasks (no and low loads) or dual tasks (high load) were performed. Their results showed that audiovisual integration was higher in the low attentional load condition than in the no or high attentional load conditions. Together with Alsius et al.'s findings, Ren, Li et al. (2020) and Ren, Zhou et al. (2020) proposed that low visual attentional load increases audiovisual integration, whereas high visual attentional load decreases it. Notably, the aforementioned studies employed transient distractors that induced transient attention. In contrast to transient attention, in which the participant is occasionally cued to the stimulated location, sustained attention maintained over time on a specific task affects information perception differently (Eimer, 1996; Eimer & Forster, 2003). To investigate the influence of sustained attention on audiovisual integration, Wahn and König (2015) instructed participants to continuously track moving visual balls while performing an audiovisual redundancy task. The results revealed that audiovisual integration was comparable under the high and low sustained perceptual load conditions, indicating that sustained visual attentional load does not significantly affect audiovisual integration. Wahn and König (2017) reported that whether attentional resources are shared or distinct across sensory modalities is task dependent: auditory and visual attentional resources are distinct during the discrimination of stimulus attributes (Alais et al., 2006; Arrighi et al., 2011) but are shared during stimulus localization (Driver & Spence, 1998a, 1998b; Spence, 2010a, 2010b; Spence & Driver, 2004). In the studies by Ren, Li et al. (2020), Alsius et al. (2005, 2014), and Wahn et al. (2017; Wahn & König, 2015), the auditory/visual discrimination task involved the discrimination of stimulus attributes, and the secondary distractor task was in the visual modality. However, whether auditory attentional load influences audiovisual integration has not been determined.
Furthermore, it has been shown that the integration of auditory and visual information is realigned according to the auditory signal during audiovisual integration (Spence & Squire, 2003). The McGurk effect clearly displays visual dominance, in which what is heard is influenced by what is seen (McGurk & MacDonald, 1976). However, the modulation of visual perception by auditory information has also been extensively reported (Morein-Zamir et al., 2003; Sekuler et al., 1997; Shams et al., 2002; Stein et al., 1996). Shams et al. (2002) instructed participants to judge the number of flashes presented on a monitor screen. The participants reported two or more flashes when one flash was accompanied by two or more beeps, exhibiting sound-induced illusory flashing, which suggests that the auditory beeps altered visual perception. Consistent results were obtained by Sekuler et al. (1997) in a visual motion perception task and by Morein-Zamir et al. (2003) using a temporal order judgment task. These results indicate that perceptual reports tend to be biased toward auditory signals when an audiovisual stimulus pair is presented. It is therefore plausible that auditory attentional load affects audiovisual integration, and a full understanding of the interplay between attention and audiovisual integration requires investigating audiovisual integration under different auditory attentional load conditions. Therefore, the aim of the present study was to clarify how auditory attentional load influences audiovisual integration. Attentional load theory proposes that if all tasks are easy enough, they can be completed successfully; if the tasks are difficult, the main task can be completed by allocating more attentional resources to it, while the other tasks cannot be completed because attentional resources are exhausted (Lavie, 1995; Lavie & Tsal, 1994). In addition, attention facilitates the detection and perception of stimuli presented at attended locations (Petersen & Posner, 2012; Posner & Petersen, 1990), and audiovisual integration is higher at attended locations than at unattended locations (Talsma et al., 2007; Talsma et al., 2009; Talsma et al., 2010; Talsma & Woldorff, 2005). Moreover, adaptive stress can produce optimal performance (Kamijo et al., 2007; Yerkes & Dodson, 1908). Therefore, we hypothesized that audiovisual integration would be higher in a low attentional load condition than in a no attentional load condition. However, according to perceptual load theory (Lavie, 1995; Lavie & Tsal, 1994), as attentional load increases, more attentional resources are delivered to the secondary distractors and fewer are left to process the auditory/visual discrimination task. Therefore, we further hypothesized that audiovisual integration would decrease in the high auditory attentional load condition.
Methods
Subjects
Twenty healthy young students (18–21 years; Mage ± SD = 19.44 ± 1.00) participated in this study; all were volunteers recruited from Guizhou University of Traditional Chinese Medicine. All participants were free of neurological diseases, had normal hearing and normal or corrected-to-normal vision, had no color blindness or color weakness, and were naive to the purpose of the experiment. All participants were paid for their time and provided written informed consent. The study was approved by the ethics committee of the Second Affiliated Hospital of Guizhou University of Traditional Chinese Medicine (No. 2018072).
Stimuli and Task
The participants performed single and dual tasks: an auditory/visual discrimination task to evaluate audiovisual integration and an RSAP task to manipulate attentional load by drawing attentional resources away from the auditory/visual discrimination task. Depending on the attentional load session, the auditory/visual discrimination task and the RSAP task were presented simultaneously or separately.
For the auditory/visual discrimination task, the visual nontarget stimulus was a black-and-white checkerboard image (52 × 52 mm, subtending a visual angle of 5°), and the visual target stimulus was the same checkerboard with two black dots within each white square (He et al., 1996; Busse et al., 2005; Ren et al., 2016; see Figure 1, Panel A). The auditory nontarget stimulus was a 1000-Hz sinusoidal tone, and the auditory target stimulus was white noise (Ren et al., 2018; Ren et al., 2016; Yang et al., 2015). In line with previous studies on the effect of visual perceptual load on audiovisual integration (Ren, Li et al., 2020; Ren, Zhou et al., 2020), the visual stimuli (V) were presented on a computer monitor in front of the participants, in the upper/lower left or right quadrant of the screen, for 200 ms at a visual angle of 12° (see Figure 1, Panel B, gray square). The auditory stimuli (A) were presented through two speakers at approximately 60 dB SPL for a duration of 200 ms (with 10-ms cosine rise/fall gates). The audiovisual target stimulus (AV) was the combination of the visual target and auditory target stimuli, and the audiovisual nontarget stimulus was the combination of the visual nontarget and auditory nontarget stimuli. A visual target accompanied by an auditory nontarget and a visual nontarget accompanied by an auditory target were not included.
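For illustration, the following is a minimal sketch of how such an auditory stimulus could be synthesized (the function name, sampling rate, and use of Python are assumptions for illustration; the original stimuli were prepared with the authors' own tools):

```python
import numpy as np

def make_tone(freq_hz=1000.0, dur_ms=200, ramp_ms=10, fs=44100):
    """1000-Hz sinusoidal tone of 200 ms with 10-ms cosine rise/fall gates."""
    n = int(fs * dur_ms / 1000)
    t = np.arange(n) / fs
    tone = np.sin(2 * np.pi * freq_hz * t)
    # Raised-cosine gate rising from 0 to 1 over the first ramp_ms,
    # mirrored at the offset to avoid onset/offset clicks.
    n_ramp = int(fs * ramp_ms / 1000)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    tone[:n_ramp] *= ramp
    tone[-n_ramp:] *= ramp[::-1]
    return tone

# The white-noise target could be generated analogously, e.g., by gating
# np.random.uniform(-1, 1, n) with the same cosine ramps.
```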
Figure 1.
Schematic depiction of the procedure and the stimuli used in the auditory/visual discrimination task (Panel A). Audiovisual integration was evaluated using the auditory/visual discrimination task, which was presented peripherally (gray square and speakers), and attentional load was manipulated using the RSAP task, which was presented through a speaker located on the right/left side of the monitor (Panel B).
The stimuli in the RSAP task consisted of 10 distractor characters, six letters (B, C, P, R, T, and V) and four digits (6, 7, 8, and 9), presented auditorily through the speakers located on the right/left sides of the computer monitor (see Figure 1, Panel B). The RSAP stream was mixed with the auditory stimuli of the auditory/visual discrimination task using Audacity 2.4.2 (https://www.audacityteam.org/).
Procedure
Participants were instructed to perform the experiment in a dimly lit and sound-attenuated room (neuroscience laboratory room, Guizhou University of Traditional Chinese Medicine, China). Four separate sessions were conducted, including a no attentional load session, a low attentional load session, a medium attentional load session, and a high attentional load session. The order of the four sessions was counterbalanced across participants.
In the no attentional load session, a single task was performed. A fixation cross was presented for 3000 ms, and the stimuli of the auditory/visual discrimination task were subsequently presented in random order with a random interstimulus interval (ISI) of 2000 ms to 2500 ms (see Figure 1, Panel A). Because there was no additional distractor to compete for attentional resources, this was called the "no attentional load session." It comprised 80 trials for each target stimulus type (A, V, and AV) and 20 trials for each nontarget stimulus type (A, V, and AV), with breaks as needed for each participant. Participants were instructed to press the left mouse button in response to the target stimuli as rapidly and as accurately as possible.

In the low attentional load session, the same single task was performed as in the no attentional load session, but the RSAP task was presented simultaneously through the speakers located on the right/left side of the monitor, with an identical probability for each character. Although the participants were instructed to respond only to the target stimuli of the auditory/visual discrimination task, it was impossible for them to completely ignore the distractor; therefore, this was called the "low attentional load session."

According to previous studies, object recognition is more difficult than object classification (Goldhammer et al., 2014; Houlihan et al., 1998). Therefore, a dual task was performed in the medium and high attentional load sessions so that the RSAP task would compete with the auditory/visual discrimination task for attentional resources. In the medium attentional load condition, the presentation of both the auditory/visual discrimination task and the RSAP task was identical to that of the low attentional load session, but the participants were instructed to press the left mouse button in response to the target stimuli of the auditory/visual discrimination task and the right mouse button in response to the digits (6, 7, 8, and 9) of the RSAP task as rapidly and accurately as possible (object classification). In the high attentional load condition, the presentations of both tasks were again identical to those of the low attentional load session, but the participants were instructed to press the left mouse button in response to the target stimuli of the auditory/visual discrimination task and the right mouse button in response to "8" and "B" in the RSAP task as rapidly and accurately as possible (object recognition). During the medium and high attentional load sessions, the participants were instructed to treat the two tasks equally.
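As a concrete illustration of this trial structure, the following is a minimal sketch of how one session's randomized trial list could be built (the function name, dictionary layout, and use of Python are illustrative assumptions; the original experiment presumably used dedicated presentation software):

```python
import random

def build_trial_list(n_targets=80, n_nontargets=20, seed=None):
    """Randomized trial list for one session: 80 target and 20 nontarget
    trials per stimulus type (A, V, AV), each followed by a uniformly
    random 2000-2500 ms ISI."""
    rng = random.Random(seed)
    trials = []
    for modality in ("A", "V", "AV"):
        trials += [{"modality": modality, "target": True}] * n_targets
        trials += [{"modality": modality, "target": False}] * n_nontargets
    rng.shuffle(trials)                       # randomize presentation order
    for trial in trials:
        trial["isi_ms"] = rng.uniform(2000, 2500)
    return trials

# Example: a reproducible 300-trial session (240 targets + 60 nontargets).
session = build_trial_list(seed=1)
```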
Data Analysis
The hit rate was the percentage of correct responses (responses with RTs within the mean ± 2.5 SD) relative to the total number of target stimuli. The hit rates and response times (RTs) were computed separately for each participant under each condition, and the data were then subjected to a 4 (Attentional load: no, low, medium, and high) × 3 (Target modality: A, V, and AV) analysis of variance (ANOVA; Greenhouse-Geisser corrections with corrected degrees of freedom). The statistical significance level was set at p < .05, and effect size estimates (ηp2) are also reported.
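For illustration, a minimal sketch of this per-participant scoring step is given below, assuming the raw RTs (in ms) to target stimuli have already been collected for one condition; the function name and data layout are hypothetical, not taken from the original analysis code:

```python
import numpy as np

def score_condition(rts_to_targets, n_targets, sd_crit=2.5):
    """Keep responses with RTs within mean +/- sd_crit SD and
    return the hit rate and the mean RT of the valid responses."""
    rts = np.asarray(rts_to_targets, dtype=float)
    m, sd = rts.mean(), rts.std(ddof=1)
    valid = rts[np.abs(rts - m) <= sd_crit * sd]   # 2.5-SD trimming
    hit_rate = len(valid) / n_targets              # correct responses / targets
    return hit_rate, valid.mean()
```

The resulting per-participant hit rates and mean RTs would then be entered into the 4 × 3 repeated-measures ANOVA described above.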
Additionally, to assess the effect of attentional load on audiovisual integration, the amount of audiovisual integration was calculated using a race model based on cumulative distribution functions (CDFs) of the response times of individual trials (Miller, 1982, 1986). The independent race model is a statistical prediction model based on the CDFs of the summed probabilities of responses to the independent unimodal visual and auditory stimuli. The model allows a direct comparison of the response probability in the multisensory condition with the probability predicted from the unimodal conditions, [P(V) + P(A) − P(V) × P(A)], by segmenting the participant-specific CDFs for each condition into 10-ms time bins (Miller, 1982, 1986). P(V) is the probability of responding within a given time in a unimodal visual trial, and P(A) is the probability of responding within a given time in a unimodal auditory trial. If the probability of the response to an AV stimulus differs significantly from that predicted by the race model, integration of the auditory and visual inputs is considered to have occurred (Miller, 1982, 1986). Audiovisual integration in the multisensory condition was quantified by subtracting each participant's race model CDF from their AV CDF in each time bin to generate a difference curve for each participant (Miller, 1982, 1986). If the AV CDF is significantly greater than the race model CDF, audiovisual facilitation is assumed to have occurred; otherwise, audiovisual depression is assumed to have occurred (Meredith et al., 1987). The peak of the difference curve is defined as the peak benefit, an important index of the amount of audiovisual integration, and the time from the presentation of the target to the peak benefit is defined as the peak latency, an important index of when audiovisual integration occurred (Ren et al., 2016; Xu et al., 2020). Following previous studies (Gibney et al., 2017; Van der Stoep et al., 2015), the whole probability mass was assessed using the positive area under the curve (pAUC) of the CDF differences; thus, together with the peak benefit, the pAUC was calculated to assess the amount of audiovisual integration. The time interval in which the AV CDF differed significantly from the race model CDF was defined as the time window of audiovisual integration (Laurienti et al., 2006; Ren et al., 2016; Xu et al., 2020). The peak latency and the time window of audiovisual integration were used to evaluate when audiovisual integration occurred (Laurienti et al., 2006; Ren et al., 2016; Xu et al., 2020).
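As a concrete illustration, here is a minimal sketch of this race model comparison under the assumptions above (empirical CDFs in 10-ms bins, facilitation as a positive difference, pAUC as the positive probability mass times the bin width). The function names are hypothetical, and the group-level time window would additionally require per-bin one-sample t tests across participants:

```python
import numpy as np

def empirical_cdf(rts, t_bins):
    """Empirical CDF: proportion of trials with a response by each bin edge."""
    rts = np.asarray(rts, dtype=float)
    return np.array([np.mean(rts <= t) for t in t_bins])

def race_model_analysis(rt_a, rt_v, rt_av, t_max=1500, bin_ms=10):
    """Compare the AV CDF with the independent race model prediction and
    summarize the difference curve (all inputs are single-trial RTs in ms)."""
    t_bins = np.arange(bin_ms, t_max + bin_ms, bin_ms)
    p_a = empirical_cdf(rt_a, t_bins)
    p_v = empirical_cdf(rt_v, t_bins)
    p_av = empirical_cdf(rt_av, t_bins)

    race = p_a + p_v - p_a * p_v      # race model: P(A) + P(V) - P(A)*P(V)
    diff = p_av - race                # > 0: facilitation; < 0: depression

    peak_idx = int(np.argmax(diff))
    peak_benefit = diff[peak_idx]     # amount of integration (probability)
    peak_latency = t_bins[peak_idx]   # when the benefit peaks (ms)

    # pAUC: positive probability mass of the difference curve x bin width (ms).
    pauc = float(np.sum(diff[diff > 0]) * bin_ms)

    return t_bins, diff, peak_benefit, peak_latency, pauc
```

Under these assumptions, each participant's difference curve would be computed per condition, and its positive peak, peak latency, and pAUC would correspond to the indices reported in Table 1.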
Results
Hit Rates
The hit rate under all conditions was greater than 85%, showing that the data were usable for further analysis (see Figure 2, Panel A). The 4 (Attentional load: no, low, medium, and high) × 3 (Target modality: A, V, and AV) ANOVA revealed a significant main effect of attentional load, F(3, 57) = 30.240, p < .001, ηp2 = 0.614, with the highest hit rates in the no and low attentional load conditions and the lowest hit rate in the high attentional load condition; there was no significant difference between the no and low attentional load conditions (no = low > medium > high). There was also a significant main effect of target modality, F(2, 38) = 6.737, p = .004, ηp2 = 0.262, with a higher hit rate for the AV stimulus than for the A or V stimuli (all ps < .006). These results indicate that the attentional load manipulation was effective: the hit rate decreased significantly as attentional load increased.
Figure 2.
Mean hit rates and response times with the standard errors of the mean (SEM) for each condition. Hit rates decreased (Panel A) and responses slowed (Panel B) as attentional load increased. *p ≤ .05, **p ≤ .01, ***p ≤ .001.
Response Time
The RTs under all conditions are displayed in Figure 2, Panel B, and the data were subjected to a 4 (Attentional load: no, low, medium, and high) × 3 (Target modality: A, V, and AV) ANOVA. The results showed a significant main effect of attentional load, F(3, 57) = 28.939, p < .0001, ηp2 = 0.604, with faster responses in the no and low attentional load conditions than in the medium and high attentional load conditions (no = low > medium > high, all ps < .003); there was no significant difference between the no and low attentional load conditions (p = 1.000). There was a significant main effect of target modality, F(2, 38) = 91.017, p < .001, ηp2 = 0.827, with significantly faster responses to the AV stimulus than to the V and A stimuli (AV > V > A, all ps < .001). Additionally, the interaction between attentional load and target modality was significant, F(6, 114) = 10.395, p < .001, ηp2 = 0.604. The post hoc analysis of attentional load showed that, for all target modalities, responses were faster in the no and low attentional load conditions than in the medium and high attentional load conditions (no = low > medium > high, all ps < .004), with no significant difference between the no and low attentional load conditions (all ps > .614). The post hoc analysis of target modality showed that, in all attentional load conditions, responses were faster to the AV stimulus than to the A and V stimuli (all ps < .001). There were no significant differences between the A and V stimuli in the no and low attentional load conditions (all ps > .060), but there were significant differences between them in the medium and high attentional load conditions (all ps < .001).
Race Model Comparisons
To evaluate how audiovisual integration changed as attentional load was added, the race model was applied to the response times of individual trials. Audiovisual integration was assessed using probability difference calculations, subtracting the race model CDFs from the AV CDFs under each condition. The results showed significant audiovisual integration (p < .05, one-sample t test) under all four attentional load conditions (see Figure 3, Panel A for the no attentional load condition, Panel B for the low attentional load condition, Panel C for the medium attentional load condition, and Panel D for the high attentional load condition).
Figure 3.
Significant audiovisual integration effects were found under the no attentional load (Panel A), low attentional load (Panel B), medium attentional load (Panel C), and high attentional load (Panel D) conditions.
Overall, as shown in Table 1, the peak benefit was higher in the low attentional load condition (18.79%) than in the no (11.97%), medium (13.40%), and high (6.99%) attentional load conditions, exhibiting an inverted U-shaped function. Similarly, the pAUC was larger in the low attentional load condition (28.37 ms) than in the no (16.08 ms), medium (17.42 ms), and high (12.95 ms) attentional load conditions. These results indicate that audiovisual integration may increase under a moderate attentional load but decrease when attentional resources are overloaded. The peak latencies were subjected to a one-way ANOVA, which showed a significant main effect of attentional load, F(3, 57) = 20.235, p < .001, ηp2 = 0.516: the peak latency was delayed as attentional load increased, at 290 ms, 340 ms, 360 ms, and 380 ms for the no, low, medium, and high attentional load conditions, respectively. The time window of audiovisual integration was also delayed with increasing attentional load, spanning 210–370 ms, 260–490 ms, 280–450 ms, and 300–500 ms for the no, low, medium, and high attentional load conditions, respectively. The results for the peak latency and the time window indicate that audiovisual integration was delayed as attentional load increased. Furthermore, a significant audiovisual depression effect was found in the no (500–520 ms), medium (610–1020 ms), and high (710–1160 ms) attentional load conditions, but not in the low attentional load condition.
Table 1.
Peak Benefit, pAUC, Peak Latency, and Time Window of Audiovisual Integration under Each Attentional Load Condition
Discussion
The aim of the present study was to investigate the effect of auditory attentional load on audiovisual integration. The audiovisual integration exhibited an inverted U-shaped function, and audiovisual integration was delayed with increasing attentional load. Additionally, an audiovisual depression effect was found in the no, medium, and high attentional load conditions, but not in the low attentional load condition.
These findings are consistent with our hypothesis that audiovisual integration increases with the addition of auditory attentional load in the low attentional load condition but decreases, owing to the competition for attentional resources from the RSAP task, in the high auditory attentional load condition, exhibiting an inverted U-shaped function. Attention comprises three separate networks: alerting, orienting, and executive control, with alerting defined as achieving and maintaining an alert state (Petersen & Posner, 2012; Posner & Petersen, 1990). In the low auditory attentional load condition, although the participants were instructed to respond only to the auditory/visual discrimination task, more alerting resources were activated than in the no attentional load condition because participants had to detect the stimuli of the auditory/visual discrimination task while suppressing the distractors of the RSAP task (Ho et al., 2009; Ren, Zhou et al., 2020). According to the arousal hypothesis proposed by Yerkes and Dodson (1908), moderately difficult tasks elicit the optimal arousal level, as characterized by an inverted U-shaped function: performance is predicted to improve with increasing arousal up to an optimal point and then to decline with further arousal (Kamijo et al., 2007; Yerkes & Dodson, 1908). Therefore, in the low auditory attentional load condition, the amount of activated attentional resources was larger than in the no attentional load condition. Studies have confirmed that audiovisual integration is greater at attended locations than at unattended locations (Talsma et al., 2009; Talsma et al., 2010; Talsma & Woldorff, 2005). Therefore, the higher audiovisual integration in the low auditory attentional load condition than in the no attentional load condition can be attributed to the greater activation of attentional resources. In the medium auditory attentional load condition, the participants were instructed to respond to the auditory/visual discrimination task and the RSAP task equally. Responding to the RSAP task requires extra attentional resources (Ho et al., 2009; Ren, Zhou et al., 2020), so the RSAP distractor demanded more attentional resources in the medium than in the low auditory attentional load condition. Fewer attentional resources were left to process the stimuli of the auditory/visual discrimination task, which led to lower audiovisual integration in the medium load condition than in the low load condition. Similarly, as the complexity of the RSAP task increased, more attentional resources were allocated to the RSAP task and fewer were left to process the auditory/visual discrimination task, as shown by the decreased hit rate in the high attentional load condition. Therefore, audiovisual integration was lower in the high than in the medium auditory attentional load condition.
These results are also consistent with those of Alsius et al. (2005, 2014) and Ren, Li et al. (2020). In Alsius et al.'s experiments, McGurk words and distractor stimuli were presented simultaneously. Under the low attentional load condition, the participants were instructed only to repeat the McGurk words while ignoring the simultaneous distractor stimuli (single task). Under the high attentional load condition, the participants were asked to simultaneously repeat the McGurk words and respond to the target among the distractor stimuli by pressing a mouse button (dual task). By contrast, in Ren, Li et al.'s (2020) study, only the auditory/visual discrimination task was presented in the no attentional load condition, which was used to evaluate audiovisual integration (single task). In the low visual attentional load condition of that study, the auditory/visual discrimination task and the RSVP task were presented simultaneously, but the participants were instructed to respond only to the auditory/visual discrimination task (single task). In the high attentional load condition, both tasks were presented simultaneously, and the participants were instructed to respond to both (dual task). The results of Alsius et al. (2005, 2014) and Ren, Li et al. (2020) agree in that audiovisual integration was higher in the low attentional load condition than in the high attentional load condition; however, only Ren, Li et al. found higher audiovisual integration in the low visual attentional load condition than in the no attentional load condition. Considering these findings and the present results on auditory attention, we propose that there may be a positive correlation between audiovisual integration and the attentional resources occupied by the auditory/visual discrimination task when there is attentional competition from distractors; however, further neuroimaging studies are needed. Notably, a single task was conducted in the no and low attentional load conditions, whereas dual tasks were conducted in the medium and high attentional load conditions. Furthermore, the number of stimuli differed between the no and low attentional load conditions, and the secondary cognitive task differed between the medium and high attentional load conditions. Therefore, task-dependent effects might also have contributed to the changes in audiovisual integration across the attentional load conditions, which should be given full consideration in future studies.
Audiovisual integration was delayed with the addition of attentional load. Previous studies have shown that the response speed to target stimuli is reduced by irrelevant distractors (Fan et al., 2009; Fan et al., 2002); accordingly, responses were slower in the low auditory attentional load condition than in the no attentional load condition. In the present study, the depth of cognitive processing and the task difficulty also increased with attentional load. In the medium auditory attentional load condition, the participants had to make a category discrimination for the RSAP stream, whereas accurate character identification was required in the high auditory attentional load condition, which demanded more sophisticated processing. The more complex and difficult the information processing, the more time is needed to make a response (Goldhammer et al., 2014; Houlihan et al., 1998). Therefore, response speed slowed as attentional load increased. Colonius and Diederich (2004; Diederich et al., 2008) proposed a "time window of integration" (TWIN) model, which presumes that cross-modal integration in saccadic reaction times includes at least two serial stages: an early afferent stage of peripheral processing (first stage) and a compound stage of converging subprocesses (second stage). The first stage consists of very early sensory processing, and its processing time is assumed to be independent for each unimodal sensory stimulus. If the processing in the first stage terminates within a given time window, the integration of auditory and visual information is assumed to occur in the second stage. In the current study, as the attentional load increased, RTs were significantly prolonged, delaying the termination of the first stage and thereby delaying audiovisual integration in the second stage (Ren, Guo et al., 2020; Ren et al., 2016).
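As a rough formalization in the spirit of the TWIN model for redundant targets (a sketch, not the authors' fitted model), let $T_A$ and $T_V$ denote the first-stage peripheral processing times and $\omega$ the width of the integration window. Integration occurs only when the two peripheral processes terminate within $\omega$ of each other, and the expected bimodal reaction time is shortened by an amount $\Delta$ whenever integration occurs:

\[
P(I) = P\bigl(|T_A - T_V| \le \omega\bigr), \qquad
E[\mathrm{RT}_{AV}] = E\bigl[\min(T_A, T_V)\bigr] + \mu_2 - P(I)\,\Delta,
\]

where $\mu_2$ is the mean duration of the second stage. On this reading, anything that prolongs $T_A$ and $T_V$, such as the increased attentional load here, postpones the termination of the first stage and thus delays the observable integration effect.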
An audiovisual depression effect occurred in the no, medium, and high attentional load conditions, but not in the low attentional load condition. According to the arousal hypothesis, an optimal attentional load leads to maximum physical arousal (Yerkes & Dodson, 1908). In the low attentional load condition, neural activation and physical arousal reached a relatively optimal state, exhibiting heightened excitability to external stimuli. This heightened excitability contributed to the super-additivity of responses to the auditory and visual stimuli and activated the facilitation system (Driver & Noesselt, 2008; Meredith et al., 1987; Stein & Stanford, 2008). In the no attentional load condition, by contrast, the simple task induced slack and inattention, which led to suppression as well as facilitation (Kamijo et al., 2007; Yerkes & Dodson, 1908). In the medium and high attentional load conditions, the participants were instructed to respond to the auditory/visual discrimination task and the RSAP task simultaneously: while integrating the auditory and visual stimuli, they had to allocate attentional resources to suppress the intrusion of irrelevant RSAP stimuli (Petersen & Posner, 2012; Posner & Petersen, 1990), so both the facilitation and suppression systems were activated. Therefore, both facilitation and suppression effects were found in the no, medium, and high attentional load conditions.
It is also notable that Wahn et al. (2017) reported significant interference when a secondary task was performed simultaneously, with a higher detection threshold for dual tasks than for a single task. In the current study, therefore, the simultaneous secondary RSAP task may have influenced the perception of the auditory and visual stimuli of the auditory/visual discrimination task by reducing the signal-to-noise ratio. This influence should be controlled for in future studies.
References
- Alais D., Morrone C., Burr D. Separate attentional resources for vision and audition. Proceedings of the Royal Society B: Biological Sciences. 2006;273:1339–1345. doi: 10.1098/rspb.2005.3420. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alsius A., Möttönen R., Sams M.E., Soto-Faraco S., Tiippana K. Effect of attentional load on audiovisual speech perception: Evidence from ERP. Frontiers in Psychology. 2014;5 doi: 10.3389/fpsyg.2014.00727. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alsius A., Navarra J., Campbell R., Soto-Faraco S. Audiovisual integration of speech falters under high attention demands. Current Biology. 2005;15:839–843. doi: 10.1016/j.cub.2005.03.046. [DOI] [PubMed] [Google Scholar]
- Arrighi R., Lunardi R., Burr D. Vision and audition do not share attentional resources in sustained tasks. Frontiers in Psychology. 2011;2:56. doi: 10.3389/fpsyg.2011.00056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Busse L., Roberts K.C., Crist R.E., Weissman D.H., Woldorff M.G. The spread of attention across modalities and space in a multisensory object. Proceedings of the National Academy of Sciences of the United States of America. 2005;102:18751–18756. doi: 10.1073/pnas.0507704102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Colonius H., Diederich A. Multisensory interaction in saccadic reaction time: A time-window-of-integration model. Journal of Cognitive Neuroscience. 2004;16:1000–1009. doi: 10.1162/0898929041502733. [DOI] [PubMed] [Google Scholar]
- Diederich A., Colonius H., Schomburg A. Assessing age-related multisensory enhancement with the time-window-of-integration model. Neuropsychologia. 2008;46:2556–2563. doi: 10.1016/j.neuropsychologia.2008.03.026. [DOI] [PubMed] [Google Scholar]
- Driver J., Noesselt T. Multisensory interplay reveals crossmodal influences on 'sensory-specific' brain regions, neural responses, and judgments. Neuron. 2008;57:11–23. doi: 10.1016/j.neuron.2007.12.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Driver J., Spence C. Attention and the crossmodal construction of space. Trends in Cognitive Sciences. 1998a;2:254–262. doi: 10.1016/S1364-6613(98)01188-7. [DOI] [PubMed] [Google Scholar]
- Driver J., Spence C. Crossmodal attention. Current Opinion in Neurobiology. 1998b;8:245–253. doi: 10.1016/s0959-4388(98)80147-5. [DOI] [PubMed] [Google Scholar]
- Eimer M. ERP modulations indicate the selective processing of visual stimuli as a result of transient and sustained spatial attention. Psychophysiology. 1996;33:13–21. doi: 10.1111/j.1469-8986.1996.tb02104.x. [DOI] [PubMed] [Google Scholar]
- Eimer M., Forster B. Modulations of early somatosensory ERP components by transient and sustained spatial attention. Experimental Brain Research. 2003;151:24–31. doi: 10.1007/s00221-003-1437-1. [DOI] [PubMed] [Google Scholar]
- Fan J., Gu X., Guise K.G., Liu X., Fossella J., Wang H., Posner M.I. Testing the behavioral interaction and integration of attentional networks. Brain and Cognition. 2009;70:209–220. doi: 10.1016/j.bandc.2009.02.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fan J., McCandliss B.D., Sommer T., Raz A., Posner M.I. Testing the efficiency and independence of attentional networks. Journal of Cognitive Neuroscience. 2002;14:340–347. doi: 10.1162/089892902317361886. [DOI] [PubMed] [Google Scholar]
- Gibney K.D., Aligbe E., Eggleston B.A., Nunes S.R., Kerkhof W.G., Dean C.L., Kwakye L.D. Visual distractors disrupt audiovisual integration regardless of stimulus complexity. Frontiers in Integrative Neuroscience. 2017;11 doi: 10.3389/fnint.2017.00001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldhammer F., Naumann J., Stelter A., Tóth K., Rölke H., Klieme E. The time on task effect in reading and problem solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. Journal of Educational Psychology. 2014;106:608–626. [Google Scholar]
- He S., Cavanagh P., Intriligator J. Attentional resolution and the locus of visual awareness. Nature. 1996;383:334–337. doi: 10.1038/383334a0. [DOI] [PubMed] [Google Scholar]
- Ho C., Santangelo V., Spence C. Multisensory warning signals: When spatial correspondence matters. Experimental Brain Research. 2009;195:261–272. doi: 10.1007/s00221-009-1778-5. [DOI] [PubMed] [Google Scholar]
- Houlihan M.E., Stelmack R.M., Campbell K.B. Intelligence and the effects of perceptual processing demands, task difficulty and processing speed on P300, reaction time and movement time. Intelligence. 1998;26:9–25. [Google Scholar]
- Kamijo K., Nishihira Y., Higashiura T., Kuroiwa K. The interactive effect of exercise intensity and task difficulty on human cognitive processing. International Journal of Psychophysiology. 2007;65:114–121. doi: 10.1016/j.ijpsycho.2007.04.001. [DOI] [PubMed] [Google Scholar]
- Laurienti P.J., Burdette J.H., Maldjian J.A., Wallace M.T. Enhanced multisensory integration in older adults. Neurobiology of Aging. 2006;27:1155–1163. doi: 10.1016/j.neurobiolaging.2005.05.024. [DOI] [PubMed] [Google Scholar]
- Lavie N. Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception and Performance. 1995;21:451–468. doi: 10.1037//0096-1523.21.3.451. [DOI] [PubMed] [Google Scholar]
- Lavie N., Tsal Y. Perceptual load as a major determinant of the locus of selection in visual attention. Perception & Psychophysics. 1994;56:183–197. doi: 10.3758/bf03213897. [DOI] [PubMed] [Google Scholar]
- McDonald J.J., Teder-Sälejärvi W.A., Hillyard S.A. Involuntary orienting to sound improves visual perception. Nature. 2000;407(6806):906–908. doi: 10.1038/35038085. [DOI] [PubMed] [Google Scholar]
- McGurk H., MacDonald J. Hearing lips and seeing voices. Nature. 1976;264(5588):746–748. doi: 10.1038/264746a0. [DOI] [PubMed] [Google Scholar]
- Meredith M.A., Nemitz J.W., Stein B.E. Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. The Journal of Neuroscience. 1987;7:3215–3229. doi: 10.1523/JNEUROSCI.07-10-03215.1987. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miller J. Divided attention: Evidence for coactivation with redundant signals. Cognitive Psychology. 1982;14:247–279. doi: 10.1016/0010-0285(82)90010-x. [DOI] [PubMed] [Google Scholar]
- Miller J. Timecourse of coactivation in bimodal divided attention. Perception & Psychophysics. 1986;40:331–343. doi: 10.3758/bf03203025. [DOI] [PubMed] [Google Scholar]
- Morein-Zamir S., Soto-Faraco S., Kingstone A. Auditory capture of vision: Examining temporal ventriloquism. Cognitive Brain Research. 2003;17:154–163. doi: 10.1016/s0926-6410(03)00089-2. [DOI] [PubMed] [Google Scholar]
- Petersen S.E., Posner M.I. The attention system of the human brain: 20 years after. Annual Review of Neuroscience. 2012;35:73–89. doi: 10.1146/annurev-neuro-062111-150525. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Posner M.I., Petersen S.E. The attention system of the human brain. Annual Review of Neuroscience. 1990;13:25–42. doi: 10.1146/annurev.ne.13.030190.000325. [DOI] [PubMed] [Google Scholar]
- Ren Y., Guo A., Xu Z., Wang T., Wu R., Yang W. Age-related functional brain connectivity during audio–visual handheld tool recognition. Brain and Behavior. 2020 doi: 10.1002/brb3.1759. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ren Y., Li S., Wang T., Yang W. Age-related shifts in theta oscillatory activity during audio-visual integration regardless of visual attentional load. Frontiers in Aging Neuroscience. 2020;12 doi: 10.3389/fnagi.2020.571950. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ren Y., Ren Y., Yang W., Tang X., Wu F., Wu Q., Takahashi S., Ejima Y., Wu J. Comparison for younger and older adults: Stimulus temporal asynchrony modulates audiovisual integration. International Journal of Psychophysiology. 2018;124:1–11. doi: 10.1016/j.ijpsycho.2017.12.004. [DOI] [PubMed] [Google Scholar]
- Ren Y., Yang W., Nakahashi K., Takahashi S., Wu J. Audiovisual integration delayed by stimulus onset asynchrony between auditory and visual stimuli in older adults. Perception. 2016;46 doi: 10.1177/0301006616673850. [DOI] [PubMed] [Google Scholar]
- Ren Y., Zhou Z., Bi J., Li J., Wang T., Yang W. Effect of attentional load on audio-visual integration: An ERP study. International Conference on Mechatronics and Automation; Beijing. 2020. [Google Scholar]
- Sekuler R., Sekuler A.B., Lau R. Sound alters visual motion perception. Nature. 1997;385(6614):308. doi: 10.1038/385308a0. [DOI] [PubMed] [Google Scholar]
- Shams L., Kamitani Y., Shimojo S. Visual illusion inducedby sound. Cognitive Brain Research. 2002;14:147–152. doi: 10.1016/s0926-6410(02)00069-1. [DOI] [PubMed] [Google Scholar]
- Spence C. Crossmodal attention. Scholarpedia. 2010a;5:6309. [Google Scholar]
- Spence C. Crossmodal spatial attention. Annals of the NewYork Academy of Sciences. 2010b;1191:182–200. doi: 10.1111/j.1749-6632.2010.05440.x. [DOI] [PubMed] [Google Scholar]
- Spence C., Driver J. Oxford University Press; 2004. Crossmodal space and crossmodal attention. [Google Scholar]
- Spence C., Squire S. Multisensory integration: Maintainingthe perception of synchrony. Current Biology. 2003;13:519–521. doi: 10.1016/s0960-9822(03)00445-7. [DOI] [PubMed] [Google Scholar]
- Stein B.E. MIT Press; 2012. The new handbook of multisensory processing. [Google Scholar]
- Stein B.E., London N., Wilkinson L.K., Price D.D. Enhancement of perceived visual intensity by auditory stimuli: A psychophysical analysis. Journal of Cognitive Neuroscience. 1996;8:497–506. doi: 10.1162/jocn.1996.8.6.497. [DOI] [PubMed] [Google Scholar]
- Stein B.E., Meredith M.A. The MIT Press; 1993. The merging of the senses. [Google Scholar]
- Stein B.E., Stanford T.R. Multisensory integration: Current issues from the perspective of the single neuron. Nature Reviews Neuroscience. 2008;9:256–265. doi: 10.1038/nrn2331. [DOI] [PubMed] [Google Scholar]
- Talsma D., Doty T.J., Woldorff M.G. Selective attention and audiovisual integration: Is attending to both modalities a prerequisite for early integration? Cerebral Cortex. 2007;17:679–690. doi: 10.1093/cercor/bhk016. [DOI] [PubMed] [Google Scholar]
- Talsma D., Senkowski D., Soto-Faraco S., Woldorff M.G. The multifaceted interplay between attention and multisensory integration. Trends in Cognitive Sciences. 2010;14:400–410. doi: 10.1016/j.tics.2010.06.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Talsma D., Senkowski D., Woldorff M.G. Intermodal attention affects the processing of the temporal alignment of audiovisual stimuli. Experimental Brain Research. 2009;198:313–328. doi: 10.1007/s00221-009-1858-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Talsma D., Woldorff M.G. Selective attention and multisensory integration: Multiple phases of effects on the evoked brain activity. Journal of Cognitive Neuroscience. 2005;17:1098–1114. doi: 10.1162/0898929054475172. [DOI] [PubMed] [Google Scholar]
- Van der Stoep N., Van der Stigchel S., Nijboer T.C.W. Exogenous spatial attention decreases audiovisual integration. Attention, Perception, & Psychophysics. 2015;77:464–482. doi: 10.3758/s13414-014-0785-1. [DOI] [PubMed] [Google Scholar]
- Wahn B., König P. Audition and vision share spatial attentional resources, yet attentional load does not disrupt audiovisual integration. Frontiers in Psychology. 2015;6 doi: 10.3389/fpsyg.2015.01084. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wahn B., König P. Is attentional resource allocation across sensory modalities task-dependent? Advances in Cognitive Psychology. 2017;13:83–96. doi: 10.5709/acp-0209-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wahn B., Murali S., Sinnett S., König P. Auditory stimulus detection partially depends on visuospatial attentional resources. i-Perception. 2017 doi: 10.1177/2041669516688026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xu Z., Yang W., Zhou Z., Ren Y. Cue–target onset asynchrony modulates interaction between exogenous attention and audiovisual integration. Cognitive Processing. 2020 doi: 10.1007/s10339-020-00950-2. [DOI] [PubMed] [Google Scholar]
- Yang W., Yang J., Gao Y., Tang X., Ren Y., Takahashi S., Wu J. Effects of sound frequency on audiovisual integration: An event-related potential study. PLoS ONE. 2015;10 doi: 10.1371/journal.pone.0138296. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yerkes R.M., Dodson J.D. The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology. 1908;18:459–482. [Google Scholar]