Abstract
We live in a world rich in sensory information, and consequently the brain is challenged with deciphering which cues from the various sensory modalities belong together. Determinations regarding the relatedness of sensory information appear to be based, at least in part, on the spatial and temporal relationships between the stimuli. Stimuli that are presented in close spatial and temporal correspondence are more likely to be associated with one another and thus ‘bound’ into a single perceptual entity. While there is a robust literature delineating behavioral changes in perception induced by multisensory stimuli, maturational changes in multisensory processing, particularly in the temporal realm, are poorly understood. The current study examines the developmental progression of multisensory temporal function by analyzing responses on an audiovisual simultaneity judgment task in 6- to 23-year-old participants. The overarching hypothesis for the study was that multisensory temporal function will mature with increasing age, with the developmental trajectory for this change being the primary point of inquiry. Results indeed reveal an age-dependent decrease in the size of the ‘multisensory temporal binding window’, the temporal interval within which multisensory stimuli are likely to be perceptually bound, with changes occurring over a surprisingly protracted time course that extends into adolescence.
Introduction
Combining information from multiple sensory modalities can dramatically influence behavior and perception. The manifestations of such ‘multisensory integration’ have been investigated in adults using a vast array of behavioral and perceptual methodologies. For example, studies have shown that the pairing of stimuli from two or more sensory modalities can result in decreased saccadic and manual reaction times (e.g. Cappe, Thut, Romei & Murray, 2010; Colonius & Arndt, 2001; Miller, 1982; Raab, 1962) and improved target detection (Frassinetti, Bolognini & Ladavas, 2002; Lovelace, Stein & Wallace, 2003). Electrophysiological and neuroimaging studies have begun to identify the neural substrates for these multisensory interactions and the complex and dynamic brain network underlying such integration (e.g. Beauchamp, Argall, Bodurka, Duyn & Martin, 2004; Giard & Peronnet, 1999; Gondan, Niederhaus, Rosler & Roder, 2005; Molholm, Ritter, Murray, Javitt, Schroeder & Foxe, 2002; Senkowski, Talsma, Grigutsch, Herrmann & Woldorff, 2007; Stevenson, Geoghegan & James, 2007).
Research has shown that these types of multisensory interactions are critically dependent upon the physical relationships between the paired stimuli. For example, the so-called ‘temporal principle’, first established in physiological studies in animal models (Meredith & Stein, 1986), illustrated that multisensory (e.g. visual-auditory) stimuli presented in close temporal coincidence resulted in the largest response gains (i.e. multisensory interactions) (Meredith, Nemitz & Stein, 1987). Furthermore, this work showed that these interactions can take place even when the stimuli were separated by several hundred milliseconds, suggesting a temporal ‘window’ within which integrative processes take place.
Complementary behavioral studies with humans have reinforced this principle by showing that multisensory-mediated performance gains are largest at short temporal offsets and are generally reduced and ultimately eliminated when stimuli are significantly temporally misaligned (e.g. Corneil & Munoz, 1996; Frassinetti et al., 2002). The period of time over which multisensory interactions are highly likely to be produced has been referred to as the temporal window of multisensory integration (Colonius & Diederich, 2004; Hairston, Burdette, Flowers, Wood & Wallace, 2005; Hillock, Powers & Wallace, 2011; Koppen & Spence, 2007; Navarra, Vatakis, Zampini, Soto-Faraco, Humphreys & Spence, 2005; Noesselt, Rieger, Schoenfeld, Kanowski, Hinrichs, Heinze & Driver, 2007; Powers, Hillock & Wallace, 2009; Spence & Squire, 2003; van Wassenhove, Grant & Poeppel, 2007). It has been speculated that the purpose of such a window is that it enables multisensory interactions to be flexibly specified, accommodating for differences in travel and processing times for signals emanating from a common source (i.e. differences in the speed of propagation of visual and auditory signals both in the environment and within the nervous system).
One interesting feature of this temporal binding window in adults is that it differs depending on whether the visual or auditory stimuli are leading or lagging (e.g. Dixon & Spitz, 1980; McGrath & Summerfield, 1985; Powers et al., 2009). Hence, the psychometric distributions (displaying binding probability as a function of temporal offset) from which window measures are derived are asymmetric. The slope of these distributions is generally steeper on the left side, meaning that there is more tolerance for visual-leading stimulus onset asynchronies. Such a result makes ethological sense given that visual-leading circumstances are those typically represented in the real world.
While the influence of stimulus timing on multisensory integration has been extensively studied in adults, surprisingly little is known about how temporal factors influence multisensory interactions in developing populations, particularly in children and adolescents. Infant work has established differences in the detection of multisensory asynchrony in 2- to 8-month-old babies when compared with adults, with infants requiring delays four to five times larger than those of adults to differentiate asynchronous and synchronous presentations (Lewkowicz, 1996). In addition, prior research from our laboratory contrasting behavioral performance in 10- and 11-year-olds and adults revealed differences in the size of the multisensory temporal window (Hillock et al., 2011). Thus, while prior reports show intriguing differences in multisensory temporal processing at discrete times during development, the developmental chronology for multisensory temporal function remains unknown. Consequently, the goal of the present study was to delineate the maturational trajectory for the temporal processing of basic audiovisual stimuli (i.e. tone pip, ring flash) by testing performance on a simultaneity judgment task in participants over a broad range of ages (i.e. 6 to 23 years). Comparisons of temporal asynchrony detection abilities across studies involving infants, children and adults suggest that there will be a decline in window size in the period leading up to middle childhood. It was hypothesized that this would be followed by a more rapid period of window contraction – leading to the adult state during the adolescent phase of development.
Interestingly, enlarged multisensory temporal binding windows reminiscent of those measured in younger participants have been identified in older children with autism and adults with reading disabilities (Foss-Feig, Kwakye, Cascio, Burnette, Kadivar & Stone, 2010; Hairston et al., 2005; Kwakye, Foss-Feig, Cascio, Stone & Wallace, 2011). The implications of such elongated temporal binding intervals are profound, and could result in great ambiguities in the construction of veridical (multi)sensory representations of the external world. Hence, accurate characterization of the developmental progression of multisensory temporal processing under typical circumstances represents a key foundation upon which to evaluate these deficits.
Methods
Participants and screenings
Participants were recruited via institutionally approved advertising materials. All participants and parents/guardians of minors provided assent and/or consent prior to study participation in accordance with the regulations and an approved protocol of the Vanderbilt Institutional Review Board (IRB).
Sixty typically developing individuals between 6 and 23 years of age participated in the study and 45 were included in the final analyses. Participants were divided into the following three groups (n = 15/group): children (range = 6–11 years, mean age = 9.5 years), adolescents (range = 12–17 years, mean = 14.7 years), and adults (range = 18–23 years, mean = 21.2 years). The number of groups was restricted to three to retain a considerable number of subjects/group. Age-based cut-offs were based in part on preliminary data suggesting that adult-like processing emerged around puberty (~12 years); groups were constructed to potentially separate younger (immature) subjects, adolescents (potentially immature) and adults.
Individuals with hearing loss (pure tone thresholds greater than 20 dB HL at octave frequencies from 250 to 8000 Hz), vision loss (Snellen, 20/20 −3 or worse for each eye), below average intelligence (i.e. Kaufman Brief Intelligence Test, second edition – composite intelligence quotient [IQ]) and those who opted not to complete all parts of the assessment were excluded (n = 3) (Kaufman & Kaufman, 2004). An additional six participants were disqualified based on their responses on the simultaneity judgment task, and a further six (randomly selected) were removed due to over-recruitment of the adolescent age group. An explanation of disqualifying criteria from the simultaneity measure can be found in the ‘Temporal window derivation and data analysis’ section below.
In addition to hearing, vision and IQ screenings, sight word reading ability was assessed with the Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner & Rashotte, 1999) and household socioeconomic status (SES) was calculated using the Hollingshead ‘Four Factor Index of Social Status’ (Hollingshead, 1975). No participants were excluded on the basis of reading ability or SES. Administration of screening measures required approximately 45 minutes.
Stimuli and experimental design
For the audiovisual simultaneity judgment measure (adapted from Fujisaki, Shimojo, Kashino & Nishida, 2004) participants were seated in a quiet, dimly lit room approximately 60 cm from a high refresh rate computer monitor (NEC Multisync F3992 [160 Hz refresh capacity] set to 100 Hz). A white crosshair fixation marker (0.75 cm × 0.75 cm) appeared in the center of a black background on the computer screen for the entirety of the experiment. Auditory (10 ms duration, 1800 Hz tone burst, 89 peak dB SPL [A weighted]) and visual (10 ms duration, white ring flash, outer diameter = 12 cm, inner diameter = 7.0 cm, area = 113.1 cm²) stimulus pairs were presented in a randomly interleaved fashion at the following visual-auditory stimulus onset asynchronies (SOAs): 0, ±50, ±100, ±150, ±200, ±250, ±300, ±400 and ±500 ms (Figure 1a). Positive values represent visual leading events whereas negative numbers indicate that the auditory cue preceded the visual stimulus. The SOAs used were strategically chosen based on prior research and pilot work and represent those that most adequately capture changes in simultaneity report for participants within the age range studied (Hillock et al., 2011; Lewkowicz, 1996).
Figure 1.
(a) Simultaneity judgment task protocol. Visual representation of the temporal structure between visual (ring flash) and auditory (tone pip) stimuli. The duration of each stimulus was 10 ms and the stimulus onset asynchrony for pairs ranged from 0 to 500 ms. The negative sign represents auditory leading visual presentation and the positive sign denotes auditory lagging conditions. (b) Temporal window derivation in a representative adult subject. Two sigmoids were fitted to discrete data points (open circles), and the overall window size was derived by calculating the sum of the width of each side of the distribution at three-quarters maximum simultaneity report (left ≈ 100 ms; right ≈ 200 ms).
Auditory stimuli were presented via Sennheiser HD 265 linear supra-aural earphones and intensity was verified with a sound level meter (Larson Davis LxT2, 375A02 microphone). Signal duration and interstimulus delays were externally verified with an oscilloscope (Hameg Instruments HM507) within an error tolerance of 6 ms, the temporal fidelity enabled by the ASIO low latency driver used with MATLAB. Stimulus presentation and data logging were controlled using MATLAB 7.7.0 R2008b software.
Task instructions for all participants were embedded in a story describing audiovisual communication in lightning bugs. This method facilitated understanding in younger children (see Appendix). Behavioral judgments were recorded by pressing buttons with lightning bug images, blue (male) or red (female), which denoted simultaneous or successive auditory and visual stimulus presentations, respectively (Cedrus RB-530 response pad). The response pad had five buttons; the two lightning bug images were placed on buttons located to the far right and far left sides of the box; the remaining three buttons in the center were inactive. Responses were counterbalanced across participants. Participants were asked to respond as accurately as possible; speed was not emphasized in an effort to reduce errors. Prior to the assessment, a circumscribed set of pre-test questions was administered to verify understanding and a practice session (comprising five trials) was completed. Participants were given the option of repeating the practice up to two times before beginning the assessment. During the assessment, a total of 374 responses were collected (i.e. 22 samples at each SOA condition [17]). Trials were initiated 1 second after participants logged their response to the previous presentation. The assessment took 10–15 minutes to complete in full, but was split into two parts. After completion of the first block of trials, a break (lasting approximately 5 minutes) was provided for all children and the option of a break was extended to all adults. Participants were informed of their progress toward completion of the task via visual puzzles, which were progressively unveiled each time 25% of total trials were completed.
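The trial count described above follows directly from the design: 17 SOA conditions (0 ms plus eight magnitudes in each direction) sampled 22 times each gives 374 trials. The short Python sketch below illustrates one way such a randomly interleaved trial order could be built. It is not the authors' MATLAB code; the function name `build_trial_order` and the seeding scheme are hypothetical.

```python
import random

# SOAs in ms: positive = visual leads, negative = auditory leads (as in the text).
MAGNITUDES = [50, 100, 150, 200, 250, 300, 400, 500]
SOAS = [0] + [sign * m for m in MAGNITUDES for sign in (-1, 1)]
REPEATS = 22  # samples collected at each SOA condition


def build_trial_order(seed=None):
    """Return all SOA x repeat trials in a randomly interleaved order."""
    trials = SOAS * REPEATS
    random.Random(seed).shuffle(trials)  # hypothetical seeding, for reproducibility
    return trials


trials = build_trial_order(seed=1)
print(len(SOAS), "conditions,", len(trials), "trials")
```

Shuffling the full list, rather than drawing SOAs at random on each trial, guarantees that every condition is presented exactly 22 times.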
Temporal window derivation and data analysis
Temporal windows were derived in each individual from curves fitted to the mean probability of simultaneity report at each SOA. Two sigmoids were fitted in MATLAB to the average simultaneity judgment values produced from responses at negative (−500 to 0 ms) and positive (0 to +500 ms) SOAs. The distributions comprised interpolated y values (probability of simultaneity report) at x values (SOA [ms]) ranging from −600 to +600 ms in 0.1 ms increments. The temporal window was established as the width of each distribution (in ms) at three-quarters maximum (probability of simultaneity report) (Figure 1b), thus defining a range over which the perception of synchrony is highly likely. Six participants were eliminated based on a preliminary analysis of the data. Three individuals did not show a systematic decrease in simultaneity report to the three-quarters maximum criterion and an additional three exhibited temporal binding windows that differed by more than two standard deviations from their respective group mean (calculated separately for children, adolescents and adults).
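The width-at-criterion step can be illustrated with a minimal Python sketch. It assumes each side of the distribution has already been fitted with a logistic function; the logistic form, the parameter values and the function names (`logistic`, `side_width`) are illustrative assumptions, not the fitting routine actually used. Only the 0.1 ms grid, the ±600 ms range and the three-quarters maximum criterion come from the text.

```python
import math


def logistic(x, upper, midpoint, slope):
    """Logistic curve; a negative slope makes it fall as x increases."""
    return upper / (1.0 + math.exp(-(x - midpoint) / slope))


def side_width(curve, side, step=0.1, limit=600.0, criterion=0.75):
    """Distance (ms) from SOA 0 to the first grid point where the curve
    falls below criterion * its maximum on that side; None if it never does."""
    sign = -1.0 if side == "left" else 1.0
    n_steps = int(round(limit / step))
    xs = [sign * i * step for i in range(n_steps + 1)]
    ys = [curve(x) for x in xs]
    threshold = criterion * max(ys)
    for x, y in zip(xs, ys):
        if y < threshold:
            return abs(x)
    return None  # no systematic decrease to criterion: window cannot be derived


# Hypothetical fitted parameters for one participant.
left = side_width(lambda x: logistic(x, 1.0, -150.0, 40.0), "left")
right = side_width(lambda x: logistic(x, 1.0, 220.0, -45.0), "right")
print(f"left = {left:.1f} ms, right = {right:.1f} ms, total = {left + right:.1f} ms")
```

Returning `None` when the curve never reaches the criterion mirrors the exclusion of participants whose simultaneity reports did not decrease systematically to three-quarters maximum.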
Statistical tests were performed on both the measured (probability of simultaneity report/SOA) and derived (window size) data. For the former, a multivariate repeated measures analysis of variance (rmANOVA) with a within-subjects factor of SOA condition (17 levels) and a between-subjects factor of age group was performed. Greenhouse-Geisser corrections were applied (where needed) to correct for dependence among the repeat measures within participants. Planned comparisons were used to identify the specific SOAs that varied across age groups. These (uncorrected) independent samples t-tests were exclusively performed on SOAs of 150 ms and greater as prior findings from our lab indicated that differences in simultaneity report between children and adults were restricted to moderate and long SOAs (Hillock et al., 2011).
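To make the planned comparisons concrete, the following Python sketch recomputes a t statistic and the Welch-Satterthwaite degrees of freedom from group means and SEMs. The function name and the use of summary statistics rather than raw data are assumptions made for illustration. The input values are the −500 ms entries for children and adults in Table 1, with n = 15 per group. Because those values are rounded, the output only approximates the tabulated t and df.

```python
import math


def t_from_summary(mean_a, sem_a, mean_b, sem_b, n_a, n_b):
    """Return (t, Welch df) for two independent groups from means and SEMs."""
    var_a, var_b = sem_a ** 2, sem_b ** 2  # squared standard errors
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


# Children vs. adults at the -500 ms SOA (rounded values from Table 1).
t_stat, welch_df = t_from_summary(0.237, 0.044, 0.037, 0.012, 15, 15)
print(f"t = {t_stat:.2f}, df = {welch_df:.1f}")
```

With equal group sizes, the pooled-variance t statistic takes the same value; only the degrees of freedom differ (28 when equal variances are assumed).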
The analyses performed on the derived data provided a more global view of age-related differences in audiovisual temporal processing. Window size was compared across groups using a univariate analysis of variance (ANOVA) and follow-up planned comparisons (between-groups independent samples t-tests). In addition, a Pearson correlation was performed (between participant age and temporal window size) to corroborate the group window analyses, and an exponential regression model was fitted to the data. The exponential function was selected given our initial hypothesis of a rapid decline in window size during late adolescence followed by a leveling off (stabilization of window size) during adulthood.
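The correlation and exponential regression can be sketched in Python as follows. The text does not specify the functional form or fitting routine, so the model window = a·exp(b·age), fitted by least squares on log-transformed window sizes, is an assumption, as are the function names. The ages and window sizes below are synthetic values generated for illustration, not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_exponential(ages, windows):
    """Fit window = a * exp(b * age) by least squares on log(window)."""
    logs = [math.log(w) for w in windows]
    n = len(ages)
    mx, my = sum(ages) / n, sum(logs) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(ages, logs))
         / sum((x - mx) ** 2 for x in ages))
    a = math.exp(my - b * mx)
    return a, b


# Synthetic example: windows generated from a known exponential decline.
ages = [6, 9, 12, 15, 18, 21, 23]
windows = [500.0 * math.exp(-0.025 * age) for age in ages]
a, b = fit_exponential(ages, windows)
r = pearson_r(ages, windows)
print(f"a = {a:.1f} ms, b = {b:.4f} per year, r = {r:.3f}")
```

A negative `b` corresponds to a window that narrows with age; the squared correlation between fitted and observed values would give the proportion of variance accounted for.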
Window asymmetry was evaluated by comparing the absolute value of window size estimates for the right and left sides of the distribution using a repeated-measures ANOVA with a within-subjects factor of stimulus order (2 levels: left [visual lag], right [visual lead]) and a between-subjects factor of group (3 levels: children, adolescents, adults) followed by within-groups paired samples t-tests.
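The within-groups follow-up test for asymmetry compares each participant's right and left window widths. A minimal Python sketch of a paired samples t statistic is given below; the function name is hypothetical and the widths are made-up values for five imaginary participants, used only to show the computation.

```python
import math


def paired_t(right, left):
    """Return (t, df) for paired right-minus-left window widths."""
    diffs = [r - l for r, l in zip(right, left)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical right (visual leading) and left (visual lagging) widths in ms.
right_widths = [220.0, 250.0, 200.0, 240.0, 210.0]
left_widths = [180.0, 190.0, 185.0, 170.0, 175.0]
t_asym, df_asym = paired_t(right_widths, left_widths)
print(f"t({df_asym}) = {t_asym:.2f}")
```

A positive t indicates wider right-sided (visual leading) windows, the direction of asymmetry described for adults in prior work.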
To test for the possibility of rapid, within-session changes in performance (i.e. fatigue effects, learning effects), window sizes were computed in each individual on the first and last half of responses in the assessment. Because windows could not be derived in all participants using the reduced number of trials (due to poor function fits), within-session analyses were performed on groups comprising 13 of the 15 participants included in the tests described above. A repeated-measures ANOVA was performed using a within-subjects factor of recording epoch (2 levels: early versus late) and between-subjects factor of group. Pearson’s correlations were also used to evaluate the stability of window size estimates on the first and last half of trials within each group.
Results
Age-related differences in simultaneity report
The mean probability of simultaneity report was calculated at each stimulus onset asynchrony (SOA) for each individual and responses were compared across groups. Analyses indicated significant main effects of SOA (F(4.78, 200.87) = 153.14, p < .001) and group (F(2, 42) = 6.02, p < .01), as well as a significant (SOA × group) interaction (F(9.57, 200.87) = 2.11, p < .05) (Figure 2). Follow-up planned comparisons indicated equivalent performance between children and adolescents, but significant differences between both children and adults and adolescents and adults (Tables 1 and 2). The differences between children and adults were significant for auditory leading lags from −200 ms to −500 ms and for visual leading lags from +150 ms to +500 ms (excluding 250 ms), whereas adolescents differed from adults at SOAs from −250 ms to −500 ms and from +150 to +500 ms (excluding 200 ms). Together, these results show that children and adolescents are more likely to report audiovisual stimulus pairs presented at intermediate and long asynchronies as simultaneous when compared with adults, illustrating substantial differences in the temporal constraints of multisensory binding at these ages.
Figure 2.
Children and adolescents are less sensitive to audiovisual asynchrony at moderate and long SOAs. Graph shows mean probability of simultaneity report for each group (n = 15 participants/group) at each SOA condition (−500 ms to +500 ms). Planned comparisons revealed significant differences in probability of simultaneity report between children and adults (* = p < .05) and adolescents and adults († = p < .05) at moderate and long SOAs. Error bars = ±1 SEM.
Table 1.
Table displaying differences in the probability of simultaneity report between children and adults at the specific SOAs compared. Planned comparisons were performed at positive and negative SOAs from 150 to 500 ms. Analyses revealed that children are more likely to report stimuli as simultaneous at moderate and long SOAs (excluding −150 and 250 ms).
| SOA (ms) | Group | Mean | SEM | t | df | p |
|---|---|---|---|---|---|---|
| −500 ≠ | Children | 0.237 | 0.044 | 4.405 | 15.986 | *0.000 |
| | Adults | 0.037 | 0.012 | | | |
| −400 ≠ | Children | 0.233 | 0.037 | 3.831 | 23.007 | *0.001 |
| | Adults | 0.067 | 0.022 | | | |
| −300 ≠ | Children | 0.374 | 0.059 | 3.585 | 22.656 | *0.002 |
| | Adults | 0.130 | 0.035 | | | |
| −250 | Children | 0.474 | 0.045 | 3.697 | 28 | *0.001 |
| | Adults | 0.230 | 0.049 | | | |
| −200 ≠ | Children | 0.633 | 0.049 | 2.112 | 25.258 | *0.045 |
| | Adults | 0.456 | 0.069 | | | |
| −150 | Children | 0.733 | 0.038 | 1.034 | 28 | 0.312 |
| | Adults | 0.656 | 0.065 | | | |
| +150 | Children | 0.822 | 0.037 | 2.135 | 28 | *0.042 |
| | Adults | 0.656 | 0.069 | | | |
| +200 | Children | 0.715 | 0.059 | 2.184 | 28 | *0.038 |
| | Adults | 0.515 | 0.070 | | | |
| +250 | Children | 0.578 | 0.069 | 1.982 | 28 | 0.057 |
| | Adults | 0.385 | 0.068 | | | |
| +300 | Children | 0.478 | 0.071 | 2.184 | 28 | *0.038 |
| | Adults | 0.285 | 0.053 | | | |
| +400 | Children | 0.285 | 0.052 | 2.304 | 28 | *0.029 |
| | Adults | 0.141 | 0.036 | | | |
| +500 ≠ | Children | 0.237 | 0.046 | 2.702 | 23.514 | *0.013 |
| | Adults | 0.089 | 0.029 | | | |
≠ Equal variances not assumed. Asterisks denote significance (* = p < .05).
Table 2.
Table displaying differences in the probability of simultaneity report between adolescents and adults at the specific SOAs compared. Planned comparisons at positive and negative SOAs from 150 to 500 ms indicated that the proportion of simultaneous responses is higher in adolescents at positive and negative long SOAs and some moderate delays (−250, 150 ms).
| SOA (ms) | Group | Mean | SEM | t | df | p |
|---|---|---|---|---|---|---|
| −500 ≠ | Adolescents | 0.122 | 0.033 | 2.428 | 17.458 | *0.026 |
| | Adults | 0.037 | 0.012 | | | |
| −400 ≠ | Adolescents | 0.178 | 0.042 | 2.315 | 21.293 | *0.031 |
| | Adults | 0.067 | 0.022 | | | |
| −300 | Adolescents | 0.311 | 0.049 | 3.012 | 28 | *0.005 |
| | Adults | 0.130 | 0.035 | | | |
| −250 | Adolescents | 0.426 | 0.077 | 2.158 | 28 | *0.040 |
| | Adults | 0.230 | 0.049 | | | |
| −200 | Adolescents | 0.585 | 0.079 | 1.236 | 28 | 0.227 |
| | Adults | 0.456 | 0.069 | | | |
| −150 | Adolescents | 0.756 | 0.055 | 1.174 | 28 | 0.250 |
| | Adults | 0.656 | 0.065 | | | |
| +150 | Adolescents | 0.844 | 0.051 | 2.206 | 28 | *0.036 |
| | Adults | 0.656 | 0.069 | | | |
| +200 | Adolescents | 0.707 | 0.072 | 1.910 | 28 | 0.066 |
| | Adults | 0.515 | 0.070 | | | |
| +250 | Adolescents | 0.600 | 0.068 | 2.241 | 28 | *0.033 |
| | Adults | 0.385 | 0.068 | | | |
| +300 | Adolescents | 0.507 | 0.072 | 2.486 | 28 | *0.019 |
| | Adults | 0.285 | 0.053 | | | |
| +400 ≠ | Adolescents | 0.359 | 0.067 | 2.866 | 21.199 | *0.009 |
| | Adults | 0.141 | 0.036 | | | |
| +500 ≠ | Adolescents | 0.256 | 0.049 | 2.936 | 22.855 | *0.007 |
| | Adults | 0.089 | 0.029 | | | |
≠ Equal variances not assumed. Asterisks denote significance (* = p < .05).
Age-related differences in the size of the multisensory temporal binding window
Overall window size was determined for each individual and this singular measure of multisensory temporal processing ability was compared across age groups. A significant main effect of group was observed (F(2, 42) = 5.76, p < .01). Follow-up planned comparisons revealed that the multisensory temporal windows of adults (M = 290.7 ms, SD = 72.4 ms) were significantly narrower than those of children (M = 404.1 ms, SD = 101.8 ms) and adolescents (M = 399.1 ms, SD = 128.5 ms), (p < .05, both tests) (Figure 3). This group-based comparison was reinforced with a correlation analysis, which showed a significant negative relationship between age and window size, r = −0.432, p < .01 (Figure 4). Visual inspection of the data revealed an apparent decline in window size in the young adult period, and regression analysis showed that age accounted for approximately 20% of window size variance. Interestingly, the lower limit of temporal window size appeared consistent across participants of different ages (~200 ms), with several children and adolescents exhibiting rather precocious multisensory temporal processing. As yet, it is unclear what distinguishes these participants, given that no significant relationship was observed between window size and verbal IQ, non-verbal IQ, reading ability or socioeconomic status.
Figure 3.
Mean window size is smaller in adults than in children and adolescents. Bar graph displays mean window size for children (left), adolescents (middle) and adults (right) (n = 15 participants/group). Children and adolescents have significantly larger windows than adults, * = p < .05. Error bars indicate ±1 standard error of the mean (SEM).
Figure 4.

Multisensory temporal binding windows are smaller in older participants. A significant negative correlation was observed indicating that older participants have smaller windows. An exponential line fit to window size data indicates that age accounts for 20.14% of variance in window size.
Asymmetry in the width of the temporal binding window
To determine whether the window asymmetry typically seen in adults (in which the right side of the distribution is often wider than the left) is preserved during the developmental process, we compared the width of the left and right sides of the window within groups. A repeated-measures ANOVA revealed a significant main effect of stimulus order (i.e. left vs. right) (F(1, 42) = 12.83, p = .001), but no significant group by stimulus order interaction (p > .05). Follow-up planned comparisons indicated that children (t(14) = 2.28, p < .05) and adolescents (t(14) = 2.88, p < .05) had significant right/left window size differences, but not adults (p > .05). Descriptive statistics used to quantify the difference showed larger right (children = 224 ms, adolescents = 227 ms, adults = 153 ms) than left (children = 180 ms, adolescents = 172 ms, adults = 137 ms) window sizes for all groups, although considerable intersubject variability was noted (Table 3). While these descriptive statistics indicate a larger right than left window for children, adolescents and adults, this difference was only significant for younger participants.
Table 3.
Table of mean window sizes and standard deviations (ms) for the right (visual leading, positive SOAs) and left (visual lagging, negative SOAs) sides of the distribution in each group as well as group and right/left mean differences. All groups show larger right-sided windows, but the asymmetry is larger in younger groups. The discrepancy in window size between younger groups and adults is larger for the right than left side.
| | Child (C) | Adolescent (A) | Adult (Ad) | C − Ad | A − Ad |
|---|---|---|---|---|---|
| Mean Right (MR) | 224.41 | 227.11 | 153.47 | 70.94 | 73.64 |
| St Dev Right (SDR) | 76.55 | 89.05 | 55.49 | | |
| Mean Left (ML) | 179.73 | 171.99 | 137.21 | 42.52 | 34.78 |
| St Dev Left (SDL) | 46.93 | 55.40 | 42.38 | | |
| Mean Diff (MR − ML) | 44.68 | 55.12 | 16.26 | | |
Lack of within-session changes in the multisensory temporal binding window
In an effort to determine whether there were within-session performance changes that might be due to factors such as learning and fatigue, datasets were split into those acquired during the first and last half of trials during each session. A repeated-measures ANOVA on this split dataset showed no significant effect of recording epoch (early trials vs. late trials), and no epoch by group interaction (p > .05). Correlational analysis showed there to be a strong relationship between window size estimates for these two epochs in children (r = 0.78, p < .01), adolescents (r = 0.74, p < .01) and adults (r = 0.70, p < .01), suggesting that performance was highly consistent throughout the duration of an assessment.
Discussion
The current study represents the first empirical report of changes in audiovisual temporal processing from early childhood through early adulthood, and provides compelling evidence that differences in the perception of multisensory temporal relations persist well into adolescence. The results demonstrate that sensitivity to audiovisual temporal asynchrony increases with age, with adults being less likely to bind more temporally disparate multisensory stimuli than younger participants. The slow developmental progression argues for sensory experience playing an important role in shaping the boundaries of this temporal window, a process likely mediated by changes in the neural circuitry subserving multisensory temporal perception.
While previous studies examining multisensory processing on tasks not based on stimulus timing identified middle childhood and adolescence as important transitional phases in the maturation of multisensory integration (Barutchu, Crewther & Crewther, 2009a; Gori, Del Viva, Sandini & Burr, 2008; Tremblay, Champoux, Voss, Bacon, Lepore & Théoret, 2007), the timeline needed in order to arrive at adult-like functioning appears to differ across sensory modalities and experimental paradigms. For example, adult-like performance on visual-haptic size and orientation discrimination emerges around 8–10 years of age (Gori et al., 2008). Although some audiovisual integrative effects appear mature by 10 years (Tremblay et al., 2007), others continue to develop thereafter (Barutchu, Danaher, Crewther, Innes-Brown, Shivdasani & Paolini, 2009b). Tremblay and colleagues (2007) reported no effect of age in 5–19-year-olds on perception of the sound induced flash illusion (wherein a single flash can be perceived as two flashes when paired with two successively presented sounds). In contrast, Barutchu et al. (2009b) observed reduced audiovisual facilitation of motor response times in 8- and 10-year-old children relative to adults. Differences in the rate of maturation across sensory systems or the degree of task complexity may influence the time frame within which mature multisensory processing is attained.
Task- and stimulus-related differences also appear to influence the breadth of the window in mature participants. While the overall size of the temporal window in our adult group (291 ms) is consistent with that reported in previous studies from our laboratory in different groups of adults on a highly similar task (Hillock et al., 2011 [300 ms]; Powers et al., 2009 [295 ms]), variability in window size is a common feature of the prior literature. Although some variability can be attributed to differences in the criteria used to calculate the window (e.g. half versus three-quarters maximum), window size appears to also be stimulus- and task-dependent. For example, studies using speech-related stimuli such as those used to examine the McGurk effect typically report larger windows compared to those derived from the simple flashes and beeps used in this study (e.g. Soto-Faraco & Alsius, 2009; van Wassenhove et al., 2007). Moreover, a study by Soto-Faraco and Alsius (2009) showed discrepancies in the size of the temporal windows measured on asynchrony detection and speech identification tasks performed using the same stimuli, suggesting that task plays an important role in delimiting the window (but see van Wassenhove et al., 2007). Thus, it appears that the integrative process is dependent on both the nature of the stimuli being combined and task complexity, and that asynchrony detection of arbitrary and simple stimuli like those used in the current study may not be fully reflective of the integration of more ethologically relevant stimuli. In accordance with this view, a recent study has examined the multisensory temporal window across stimulus complexity and task, and has found substantial differences in window size as a function of stimulus and task (Stevenson and Wallace, 2012). Most importantly in this study, window size within individuals was very well correlated, suggesting a common set of neural operations that dictate the binding process.
Another critical factor that has been shown to influence multisensory processing is stimulus presentation order. A rather unexpected finding of the current study was the lack of a significant difference between the size of the window for the right (visual leading) versus left (visual lagging) stimulus presentations in the adult group, a finding at odds with prior work (e.g. Dixon & Spitz, 1980; Stevenson, Zemtsov & Wallace, in press; van Eijk, Kohlrausch, Juola & van de Par, 2008). Although there is no definitive answer for this difference, it is perhaps illustrative of the known individual variability in window size. Comparisons of window size measurements in these adults with those from a previous study using similar methods (Hillock et al., 2011) suggest that this group is somewhat more sensitive to visual leading asynchronies and less sensitive to auditory leading temporal offsets. Hence, the overall (combined left and right) window size is comparable, but the asymmetric effect that has been previously observed by our laboratory and others is diminished in this particular sample. This may also contribute to the robust differences in the relative maturity of the left and right sides of the distribution for younger subjects. The mean difference in window size between the children and adults and adolescents and adults was 40–50% greater for the right side of the distribution than the left, which presumably reflects greater immaturity for the more ethologically relevant visual leading audiovisual stimulus combinations.
Findings from the current study indicate relatively late maturation of the multisensory temporal window to basic stimuli. Perhaps experience with multisensory stimulus relations drives the development of a ‘mature’ temporal window, and the requisite experience with these relations is not fully realized until a later age. Alternatively, one might posit a more deterministic explanation for these results, and argue that the late maturation of the integrative process is a result of the delayed maturation of the brain networks responsible for the appropriate temporal calculations. Several of the associational areas believed to be involved in encoding multisensory stimulus timing, including the insula (Bushara, Grafman & Hallett, 2001), superior temporal cortex (Calvert, Hansen, Iversen & Brammer, 2001; Macaluso, George, Dolan, Spence & Driver, 2004; Noesselt et al., 2007; Powers, Hevey & Wallace, unpublished results; Raij, Uutela & Hari, 2000), and temporo-occipito-parietal junction (Raij et al., 2000), are among the latest to mature in the cortical hierarchy. As an example, structural MRI studies have reported that regions of the superior temporal cortex show gray matter density and total volume changes up to 20 years of age (Gogtay, Giedd, Lusk, Hayashi, Greenstein, Vaituzis, Nugent, Herman, Clasen, Toga, Rapoport & Thompson, 2004). Continued changes in system maturation and organization into early adulthood could provide a basis for age-related differences in multisensory temporal processing extending well into adolescence.
The implications of the current results are far-reaching and suggest that developing humans become increasingly adept at processing external audiovisual events with age and experience. An extended temporal binding window for simple stimuli like those used in this study raises interesting questions about how children perceive rapidly changing stimuli in their environment; the present findings suggest a tendency toward increased binding of ‘inappropriate’ audiovisual pairs. Interestingly, emerging evidence suggests that multisensory temporal binding windows are generally enlarged in individuals with disabilities such as autism and dyslexia relative to control populations (Foss-Feig et al., 2010; Hairston et al., 2005; Kwakye et al., 2011). Such widened windows are reminiscent of the performance of younger children. Recent research from our laboratory has shown that significant plasticity can be engendered in the size of the adult multisensory temporal binding window through classic perceptual learning approaches (Powers et al., 2009), approaches that could be readily adapted for use in developing populations.
Acknowledgments
This work was generously supported by the Vanderbilt Kennedy Center for Research on Human Development and the American Speech Language and Hearing Association. We would like to thank Drs. Linda Hood, Wesley Grantham and Bruce McCandliss for their intellectual contributions and editorial assistance. We also wish to thank Matthew Hevey, Nicky Hackett, Olivia Broaddus and Haley Eberly for their assistance with data collection.
Appendix
Story script for simultaneity judgment task
Title: The Great Bug Escape
Page 1: Earlier today at the Nashville Zoo the boy and girl lightning bugs got out of their cages and got mixed up. They need to be returned to their separate homes by tomorrow morning.
Page 2: It’s getting late and has become dark outside. You can’t see the bugs. The only way you can tell them apart is by their chirps and flashes. The lightning bugs’ tails look like this circle…
Page 3: The boy lightning bugs chirp and flash their tail at exactly the SAME time.
Page 4: The girl lightning bugs chirp BEFORE or AFTER they flash their tails. The girl bugs NEVER chirp and flash at the same time.
Page 5: Your job is to help us sort the lightning bugs. When you hear the bug chirp and see a flash at the SAME time, press the button for the boy bug.
Page 6: When the chirp and flash do NOT happen at exactly the same time, press the button for the girl bug. Do your best! Thank you for your help!
Instructions: During this task look straight ahead at the screen. Keep your eyes focused on the plus sign. A white ring will appear and a tone will be played. Press the button to respond after both the ring and tone are played. Get ready. Press any button to start the game.
References
- Barutchu A, Crewther DP, Crewther SG. The race that precedes coactivation: development of multisensory facilitation in children. Developmental Science. 2009a;12 (3):464–473. doi: 10.1111/j.1467-7687.2008.00782.x.
- Barutchu A, Danaher J, Crewther SG, Innes-Brown H, Shivdasani MN, Paolini AG. Audiovisual integration in noise by children and adults. Journal of Experimental Child Psychology. 2009b;105 (1–2):38–50. doi: 10.1016/j.jecp.2009.08.005.
- Beauchamp MS, Argall BD, Bodurka J, Duyn JH, Martin A. Unraveling multisensory interaction: patchy organization within human STS multisensory cortex. Nature Neuroscience. 2004;7:1190–1192. doi: 10.1038/nn1333.
- Bushara KO, Grafman J, Hallett M. Neural correlates of auditory-visual stimulus onset asynchrony detection. Journal of Neuroscience. 2001;21 (1):300–304. doi: 10.1523/JNEUROSCI.21-01-00300.2001.
- Calvert GA, Hansen PC, Iversen SD, Brammer MJ. Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect. NeuroImage. 2001;14 (2):427–438. doi: 10.1006/nimg.2001.0812.
- Cappe C, Thut G, Romei V, Murray MM. Auditory-visual multisensory interactions in humans: timing, topography, directionality, and sources. Journal of Neuroscience. 2010;30 (38):12572–12580. doi: 10.1523/JNEUROSCI.1099-10.2010.
- Colonius H, Arndt P. A two-stage model for visual-auditory interaction in saccadic latencies. Attention, Perception, & Psychophysics. 2001;63 (1):126–147. doi: 10.3758/bf03200508.
- Colonius H, Diederich A. Multisensory interaction in saccadic reaction time: a time-window-of-integration model. Journal of Cognitive Neuroscience. 2004;16 (6):1000–1009. doi: 10.1162/0898929041502733.
- Corneil BD, Munoz DP. The influence of auditory and visual distractors on human orienting gaze shifts. Journal of Neuroscience. 1996;16 (24):8193–8207. doi: 10.1523/JNEUROSCI.16-24-08193.1996.
- Dixon NF, Spitz L. The detection of auditory visual desynchrony. Perception. 1980;9:719–721. doi: 10.1068/p090719.
- Foss-Feig JH, Kwakye LD, Cascio CJ, Burnette CP, Kadivar H, Stone WL. An extended multisensory temporal binding window in autism spectrum disorders. Experimental Brain Research. 2010;203 (2):381–389. doi: 10.1007/s00221-010-2240-4.
- Frassinetti F, Bolognini N, Ladavas E. Enhancement of visual perception by crossmodal visuo-auditory interaction. Experimental Brain Research. 2002;147 (3):332–343. doi: 10.1007/s00221-002-1262-y.
- Fujisaki W, Shimojo S, Kashino M, Nishida S. Recalibration of audiovisual simultaneity. Nature Neuroscience. 2004;7 (7):773–778. doi: 10.1038/nn1268.
- Giard MH, Peronnet F. Auditory-visual integration during multimodal object recognition in humans: a behavioral and electrophysiological study. Journal of Cognitive Neuroscience. 1999;11 (5):473–490. doi: 10.1162/089892999563544.
- Gogtay N, Giedd JN, Lusk L, Hayashi KM, Greenstein D, Vaituzis AC, Nugent TF, III, Herman DH, Clasen LS, Toga AW, Rapoport JL, Thompson PM. Dynamic mapping of human cortical development during childhood through early adulthood. Proceedings of the National Academy of Sciences, USA. 2004;101 (21):8174–8179. doi: 10.1073/pnas.0402680101.
- Gondan M, Niederhaus B, Rosler F, Roder B. Multisensory processing in the redundant-target effect: a behavioral and event-related potential study. Perception & Psychophysics. 2005;67 (4):713–726. doi: 10.3758/bf03193527.
- Gori M, Del Viva M, Sandini G, Burr DC. Young children do not integrate visual and haptic form information. Current Biology. 2008;18 (9):694–698. doi: 10.1016/j.cub.2008.04.036.
- Hairston WD, Burdette JH, Flowers DL, Wood FB, Wallace MT. Altered temporal profile of visual-auditory multisensory interactions in dyslexia. Experimental Brain Research. 2005;166 (3–4):474–480. doi: 10.1007/s00221-005-2387-6.
- Hillock AR, Powers AR, 3rd, Wallace MT. Binding of sights and sounds: age-related changes in multi-sensory temporal processing. Neuropsychologia. 2011;49 (3):461–467. doi: 10.1016/j.neuropsychologia.2010.11.041.
- Hollingshead AB. Unpublished working paper. Yale University; 1975. Four factor index of social status.
- Kaufman AS, Kaufman NL. Kaufman Brief Intelligence Test. 2nd ed. Minneapolis, MN: Pearson Assessments; 2004.
- Koppen C, Spence C. Audiovisual asynchrony modulates the Colavita visual dominance effect. Brain Research. 2007;1186:224–232. doi: 10.1016/j.brainres.2007.09.076.
- Kwakye LD, Foss-Feig JH, Cascio CJ, Stone WL, Wallace MT. Altered auditory and multisensory temporal processing in autism spectrum disorders. Frontiers in Integrative Neuroscience. 2011;4:129. doi: 10.3389/fnint.2010.00129.
- Lewkowicz DJ. Perception of auditory-visual temporal synchrony in human infants. Journal of Experimental Psychology: Human Perception & Performance. 1996;22 (5):1094–1106. doi: 10.1037//0096-1523.22.5.1094.
- Lovelace CT, Stein BE, Wallace MT. An irrelevant light enhances auditory detection in humans: a psychophysical analysis of multisensory integration in stimulus detection. Cognitive Brain Research. 2003;17 (2):447–453. doi: 10.1016/s0926-6410(03)00160-5.
- Macaluso E, George N, Dolan R, Spence C, Driver J. Spatial and temporal factors during processing of audiovisual speech: a PET study. NeuroImage. 2004;21 (2):725–732. doi: 10.1016/j.neuroimage.2003.09.049.
- McGrath M, Summerfield Q. Intermodal timing relations and audiovisual speech recognition by normal-hearing adults. Journal of the Acoustical Society of America. 1985;77 (2):678–685. doi: 10.1121/1.392336.
- Meredith MA, Nemitz JW, Stein BE. Determinants of multisensory integration in superior colliculus neurons. I. temporal factors. Journal of Neuroscience. 1987;7 (10):3215–3229. doi: 10.1523/JNEUROSCI.07-10-03215.1987.
- Meredith MA, Stein BE. Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. Journal of Neurophysiology. 1986;56 (3):640–662. doi: 10.1152/jn.1986.56.3.640.
- Miller J. Divided attention: evidence for coactivation with redundant signals. Cognitive Psychology. 1982;14 (2):247–279. doi: 10.1016/0010-0285(82)90010-x.
- Molholm S, Ritter W, Murray MM, Javitt DC, Schroeder CE, Foxe JJ. Multisensory auditory-visual interactions during early sensory processing in humans: a high-density electrical mapping study. Cognitive Brain Research. 2002;14 (1):115–128. doi: 10.1016/s0926-6410(02)00066-6.
- Navarra J, Vatakis A, Zampini M, Soto-Faraco S, Humphreys W, Spence C. Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration. Cognitive Brain Research. 2005;25 (2):499–507. doi: 10.1016/j.cogbrainres.2005.07.009.
- Noesselt T, Rieger JW, Schoenfeld MA, Kanowski M, Hinrichs H, Heinze HJ, Driver J. Audiovisual temporal correspondence modulates human multisensory superior temporal sulcus plus primary sensory cortices. Journal of Neuroscience. 2007;27 (42):11431–11441. doi: 10.1523/JNEUROSCI.2252-07.2007.
- Powers AR, 3rd, Hevey M, Wallace MT. Neural correlates of multisensory perceptual training. doi: 10.1523/JNEUROSCI.6138-11.2012. unpublished results.
- Powers AR, 3rd, Hillock AR, Wallace MT. Perceptual training narrows the temporal window of multi-sensory binding. Journal of Neuroscience. 2009;29 (39):12265–12274. doi: 10.1523/JNEUROSCI.3501-09.2009.
- Raab DH. Statistical facilitation of simple reaction times. Transactions of the New York Academy of Sciences. 1962;24 (5):574–590. doi: 10.1111/j.2164-0947.1962.tb01433.x.
- Raij T, Uutela K, Hari R. Audiovisual integration of letters in the human brain. Neuron. 2000;28 (2):617–625. doi: 10.1016/s0896-6273(00)00138-0.
- Senkowski D, Talsma D, Grigutsch M, Herrmann CS, Woldorff MG. Good times for multisensory integration: effects of the precision of temporal synchrony as revealed by gamma-band oscillations. Neuropsychologia. 2007;45 (3):561–571. doi: 10.1016/j.neuropsychologia.2006.01.013.
- Spence C, Squire S. Multisensory integration: maintaining the perception of synchrony. Current Biology. 2003;13 (13):R519–R521. doi: 10.1016/s0960-9822(03)00445-7.
- Soto-Faraco S, Alsius A. Deconstructing the McGurk-MacDonald illusion. Journal of Experimental Psychology: Human Perception & Performance. 2009;35 (2):580–587. doi: 10.1037/a0013483.
- Stevenson RA, Geoghegan ML, James TJ. Superadditive BOLD activation in superior temporal sulcus with threshold non-speech objects. Experimental Brain Research. 2007;179:85–95. doi: 10.1007/s00221-006-0770-6.
- Stevenson RA, Wallace MT. Multisensory temporal integration: task and stimulus dependencies. Journal of Neuroscience. 2012. doi: 10.1007/s00221-013-3507-3.
- Stevenson RA, Zemtsov RK, Wallace MT. Individual differences in the multisensory temporal binding window predict susceptibility to audiovisual illusions. Journal of Experimental Psychology. doi: 10.1037/a0027339. in press.
- Torgesen JK, Wagner RK, Rashotte CA. TOWRE: Test of word reading efficiency. Austin, TX: Pro-Ed Publishing; 1999.
- Tremblay C, Champoux F, Voss P, Bacon BA, Lepore F, Théoret H. Speech and non-speech audiovisual illusions: a developmental study. PLoS ONE. 2007;2 (8):e742. doi: 10.1371/journal.pone.0000742.
- van Eijk R, Kohlrausch A, Juola J, van de Par S. Audiovisual synchrony and temporal order judgments: effects of experimental method and stimulus type. Attention, Perception, & Psychophysics. 2008;70 (6):955–968. doi: 10.3758/pp.70.6.955.
- van Wassenhove V, Grant KW, Poeppel D. Temporal window of integration in auditory-visual speech perception. Neuropsychologia. 2007;45 (3):598–607. doi: 10.1016/j.neuropsychologia.2006.01.001.



