Abstract
Temporal proximity is one of the key factors determining whether events in different modalities are integrated into a unified percept. Sensitivity to audiovisual temporal asynchrony has been studied in adults in great detail. However, how such sensitivity matures during childhood is poorly understood. We examined perception of audiovisual temporal asynchrony in 7-8-year-olds, 10-11-year-olds, and adults by using a simultaneity judgment task (SJT). Additionally, we evaluated whether non-verbal intelligence, verbal ability, attention skills, or age influenced children's performance. On each trial, participants saw an explosion-shaped figure and heard a 2 kHz pure tone. These occurred at the following stimulus onset asynchronies (SOAs) - 0, 100, 200, 300, 400, and 500 ms. In half of all trials, the visual stimulus appeared first (VA condition) while in another half, the auditory stimulus appeared first (AV condition). Both groups of children were significantly more likely than adults to perceive asynchronous events as synchronous at all SOAs exceeding 100 ms, in both VA and AV conditions. Furthermore, only adults exhibited a significant shortening of RT at long SOAs compared to medium SOAs. Sensitivities to the VA and AV temporal asynchronies showed different developmental trajectories, with 10-11-year-olds outperforming 7-8-year-olds at the 300-500 ms SOAs, but only in the AV condition. Lastly, age was the only predictor of children's performance on the SJT. These results provide an important baseline against which children with developmental disorders associated with impaired audiovisual temporal function, such as autism, specific language impairment, and dyslexia may be compared.
Keywords: audiovisual temporal processing in children, development of audiovisual integration skills in children, cognitive development
1. Introduction
Temporal proximity is one of the key factors determining whether or not events in different modalities will be integrated into a unified percept (Stein & Meredith, 1993). Although this principle appears to apply to various combinations between our senses (e.g., Keetels & Vroomen, 2008; Navarra, Soto-Faraco, & Spence, 2007; Zampini et al., 2005), it has been studied in greatest detail within the context of audiovisual processing. This research shows that perception of audiovisual synchrony is not completely determined by precise temporal matching between auditory and visual modalities. Instead, we perceive two events as synchronous when their onsets fall within a certain time window, often called a temporal binding window or a window of temporal integration (for reviews, see Keetels & Vroomen, 2012; Vatakis & Spence, 2010; Vroomen & Keetels, 2010). The size of this window depends, to some degree, on the nature of audiovisual information and one's subjective experience with it. As an example, individuals require greater temporal separations between modalities in order to notice visual-auditory asynchronies compared to auditory-visual ones (e.g., Bushara et al., 2001; Dixon & Spitz, 1980; Grant et al., 2004; Lewkowicz, 1996; McGrath & Summerfield, 1985; van Wassenhove et al., 2007) and in order to notice asynchrony in audiovisual speech and other complex events as compared to simple events (Dixon & Spitz, 1980; Vatakis & Spence, 2006, 2010; Vroomen & Stekelenburg, 2011). We also know that the temporal binding window is malleable and can be reduced either by acquiring expertise in specific audiovisual activities (such as drumming) (Petrini et al., 2009) or through perceptual training (Powers III, Hillock, & Wallace, 2009; Stevenson, Wilson, Powers, & Wallace, 2013). 
Lastly, sensitivity to audiovisual asynchrony appears to depend on the type of task used for its evaluation (Stevenson & Wallace, 2013; van Eijk, Kohlrausch, Juola, & van de Par, 2008) and varies greatly across individuals (e.g., Conrey & Pisoni, 2006; Stone et al., 2001).
The absolute majority of studies on the temporal binding window to date have been conducted with adults. As a result, we still know relatively little about how sensitivity to audiovisual temporal asynchrony develops during childhood. Infants require significantly larger separations than adults between auditory and visual modalities (both for speech and non-speech stimuli) in order to notice the asynchrony (e.g., Lewkowicz, 1996; Lewkowicz, 2010; for a review, see also Lewkowicz, 2012), suggesting that the temporal binding window narrows with development. Such narrowing, however, may not start in earnest until approximately 6 years of age, based on the study by Lewkowicz and Flom, who reported that 6 year olds, but not 4 or 5 year olds, were able to detect the 366 ms stimulus-onset asynchrony (SOA) in speech stimuli when the visual modality lagged the auditory one (Lewkowicz & Flom, 2014).
The temporal binding window continues to narrow during school years, although the nature of such narrowing and its timing are still somewhat uncertain. Hillock and colleagues were among the first to examine sensitivity to audiovisual asynchrony in school-age children. In their 2011 study (Hillock, Powers, & Wallace, 2011), these authors compared the ability to detect audiovisual and visuo-auditory temporal asynchrony in 10-11 year old children and in adults. They used simple auditory and visual stimuli (a pure tone and a ring flash) and presented them at SOAs that ranged from 50 to 450 ms. Children were significantly worse than adults at detecting 150-350 ms SOAs between the stimuli but only when the auditory stimulus preceded the visual one. In a follow-up study (Hillock-Dunn & Wallace, 2012), the authors used essentially the same stimuli (with a slightly different set of SOAs), but instead of testing just one group of children, they tested two – those who were 6-11 years of age and those who were 12-17 years of age. A group of 18-23 year old adults was also included. They reported that in this case both groups of children were more likely to perceive stimuli as being synchronous at moderate to large SOAs regardless of the order of modalities. However, the two groups of children did not differ from each other. In other words, it appeared that there was no statistically significant improvement in children's sensitivity to audiovisual asynchrony between approximately 9.5 and 14.7 years of age (which were the mean ages for each of the children's groups). The two studies by Hillock-Dunn and colleagues were among the first to report on the surprisingly prolonged developmental course of audiovisual temporal processing and, as such, deserve recognition; yet, their results do not paint a coherent picture. According to the 2011 study, perception of VA asynchronies is mature by 10-11 years of age. Yet, according to the 2012 study, perception of VA asynchronies remains immature throughout the teen years.
In an earlier study from our laboratory, we also examined sensitivity to audiovisual temporal asynchrony in 7-11 year old typically developing (TD) children, same age children with a history of specific language impairment (H-SLI), and adults (Kaganovich, Schumaker, Leonard, Gustafson, & Macias, 2014). Similarly to the studies by Hillock and colleagues, we used simple stimuli (an explosion-shaped flash and a pure tone) presented at 100 to 500 ms SOAs in 100 ms intervals. We found not only that H-SLI children were significantly less likely to detect asynchrony than their TD peers at the longest SOAs, but also that TD children were significantly less likely to detect asynchrony than adults at most SOAs exceeding 100 ms, regardless of the order of modalities.
In sum, previous studies on the development of sensitivity to audiovisual temporal asynchrony during pre-school and school years point unequivocally to a protracted developmental course of this ability. However, previous reports disagree to some extent on whether age-related improvement in audiovisual temporal function proceeds on the same timeline for auditory-visual and visual-auditory SOAs and on whether this function improves significantly during school years. In part, the discrepancy may result from significant individual variability in sensitivity to temporal asynchrony, especially in children of various ages. As an example, in the earlier study from our laboratory described above, adults varied greatly in the ability to detect audiovisual temporal asynchrony, but only when SOAs were relatively short (100-300 ms). At longer SOAs (400-500 ms), adults' responses were very consistent, with the absolute majority detecting asynchrony on 94-100 percent of trials. In contrast, children tended to have large individual variability even at long SOAs, with trials on which children reported perceiving asynchrony varying between 42 and 100 percent. The degree of individual variability in children suggests that age-related improvements in temporal function may be obscured when children with a broad range of ages are included in the same group.
In order to examine in greater detail how sensitivity to audiovisual temporal information matures during school years, we tested two groups of children with narrow age ranges - 7-8 year olds and 10-11 year olds - as well as adults on a simultaneity judgment task (SJT). We used a flash of an explosion-shaped figure and a pure tone as stimuli and presented them either synchronously or with SOAs that ranged from 100 to 500 ms in 100 ms steps. The use of simple stimuli allowed us to turn the paradigm into a game and to easily compare our findings with earlier reports, most of which used stimuli similar to ours. We measured the number of synchronous perceptions at each SOA and expected that as the separation between modalities increased, the number of synchronous reports would decrease. We also recorded response time (RT) as the average amount of time from the onset of the first stimulus in each pair (or from the onset of the audiovisually synchronous stimulus) needed to make a simultaneity judgment. Studies of the temporal binding window rarely, if ever, report RT, focusing primarily on percent of synchronous perceptions. However, the RT measure adds important complementary information about individuals' sensitivity to audiovisual temporal SOAs. More specifically, RT data provide insight into the ease or certainty with which a decision is made. We predicted that synchronous trials and trials with the longest SOAs would be associated with relatively short RTs, while trials with intermediate SOAs (where the uncertainty about the synchronicity of audiovisual information remains high) would be associated with longer RTs.
Lastly, the factors that underlie improvement in audiovisual temporal function during childhood are not yet clear. One possibility is that greater sensitivity to audiovisual temporal asynchrony may be related to stronger attention, language, or non-verbal intelligence skills because such skills may help children better cope with the demands of the experimental task. In order to test this hypothesis, we evaluated children's attentional and language aptitude as well as non-verbal intelligence and explored the relationship between these functions and sensitivity to audiovisual temporal asynchrony in a multiple regression analysis.
2. Method
2.1 Participants
Fifteen 7-8 year olds (mean age 8;2, range 7;4-8;9; 8 female), sixteen 10-11 year olds (mean age 11;2, range 10;0-12;0; 10 female), and sixteen adults (mean age 23, range 20-40; 5 female) participated in the study. All gave their written consent or assent to participate in the experiment. Additionally, at least one parent of each child gave written consent to enroll their child in the study. The study was approved by the Institutional Review Board of Purdue University, and all study procedures conformed to The Code of Ethics of the World Medical Association (Declaration of Helsinki) (1964). None of the children showed any signs of autism (Childhood Autism Rating Scale, 2nd edition (CARS-2); Schopler, Van Bourgondien, Wellman, & Love, 2010) or had a diagnosis of attention deficit/hyperactivity disorder (ADHD) (based on parental report). Mothers of 7-8 year old children had on average more years of education than mothers of 10-11 year old children (group: F(1,30)=4.275, p=0.048); however, this difference was only 2.1 years. The difference in fathers' years of education was not significant (F(1,28)=3.593, p=0.069; data for father's years of education were not available for one child in each group). According to the Laterality Index of the Edinburgh Handedness Questionnaire, one child in the 7-8-year-old group was left-handed; all other participants were right-handed (Cohen, 2008; Oldfield, 1971). All participants were free of neurological disorders, passed a hearing screening at a level of 20 dB HL at 500, 1000, 2000, 3000, and 4000 Hz, and reported having normal or corrected-to-normal vision.
2.2 Stimuli and Experimental Design
We used a flash of light (shaped as a cartoon explosion, see Figure 1) and a pure 2 kHz tone, both 200 ms in duration, as stimuli. They were presented either synchronously or at the following SOAs - 100, 200, 300, 400, and 500 ms. In half of all audiovisual trials, the sound preceded the flash of light (AV trials), while in another half, the flash of light preceded the sound (VA trials). Additionally, auditory only (A) and visual only (V) trials were also included in order to examine potential differences in responding to each modality at different ages. The experiment consisted of 10 blocks, with 5 instances of each of 13 types of trial (5 AV SOAs, 5 VA SOAs, synchronous presentation, A, and V) presented in a random order. This arrangement yielded 50 responses for each trial type. In order to avoid visual after-effects, the explosion image appeared at slightly different locations in the center of the screen on each consecutive trial. Auditory stimuli were presented at 60 dB SPL via a sound bar located directly under the computer monitor. All participants sat approximately 4 feet from the monitor inside a dimly-lit sound-attenuating booth. Presentation of trials was controlled by the Presentation software (www.neurobs.com). Responses were recorded within a 2200 ms response window time-locked to the appearance of the first stimulus in each pair. The response window was followed by an inter-trial interval varying randomly among 4 values: 350, 700, 1050, and 1400 ms. Hand to response button mapping was counterbalanced across participants.
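For illustration, the trial structure described above can be sketched in Python. This is not the Presentation script used in the study. The function name `build_session`, the fixed seed, shuffling within each block, and drawing the inter-trial interval with equal probability are all assumptions made for the sketch; only the trial types, block counts, and interval values come from the text.

```python
import random
from collections import Counter

SOAS_MS = [100, 200, 300, 400, 500]
# 13 trial types: 5 AV SOAs, 5 VA SOAs, synchronous, auditory only, visual only.
TRIAL_TYPES = (
    [("AV", soa) for soa in SOAS_MS]      # sound precedes flash
    + [("VA", soa) for soa in SOAS_MS]    # flash precedes sound
    + [("SYNC", 0), ("A", None), ("V", None)]
)
ITI_MS = [350, 700, 1050, 1400]           # inter-trial intervals from the text
N_BLOCKS = 10
REPS_PER_BLOCK = 5

def build_session(seed=0):
    """Return a list of blocks; each block is a shuffled list of
    (condition, soa_ms, iti_ms) tuples. Hypothetical helper."""
    rng = random.Random(seed)
    session = []
    for _ in range(N_BLOCKS):
        trials = TRIAL_TYPES * REPS_PER_BLOCK   # 13 types x 5 = 65 trials
        rng.shuffle(trials)                     # assumed: random order within block
        session.append([(cond, soa, rng.choice(ITI_MS)) for cond, soa in trials])
    return session

session = build_session()
counts = Counter((cond, soa) for block in session for cond, soa, _ in block)
```

Under these assumptions each block holds 65 trials and each of the 13 trial types occurs 50 times across the session, matching the 50 responses per trial type stated above.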
The simultaneity judgment task (SJT) used in this study was identical to that described in an earlier study from our laboratory (Kaganovich et al., 2014). Prior to the start of the session, all participants viewed a video with instructions and practiced the task until it was clear. Instructions were kept identical for children and adults. At the beginning of the experiment, participants saw a dragon in the middle of the screen and a boy and a girl holding futuristic-looking weapons at the top left and top right corners of the screen (see Figure 1). Participants were told that the boy and the girl live on the planet Cabula, where dragons raid apple orchards and eat all the apples. The only way to keep dragons away is to use special weapons – one weapon shoots lights while the other shoots sounds. When the light and the sound “hit” dragons at exactly the same time, dragons run away. However, if either the sound or the light is delayed, even by the tiniest amount, or if one of the children forgets to shoot altogether (and hence there is either only the flash of light or only the sound present), dragons do not get scared away. After each trial, participants were asked to press one button on a response pad (RB-530, Cedrus Corporation) if the dragon was scared away (i.e., if the sound and the flash of light were perceived as synchronous) and another button if the dragon was not scared away (i.e., if the sound and the flash of light were perceived as being asynchronous or if only one of the two stimuli was present). The images of the girl and the boy with futuristic weapons were present during the instructions video and on the first screen of each block prior to the onset of test trials. To avoid a shift of attention away from the dragon, their images were not present while participants performed the task. Participants were encouraged to respond as fast as possible while also maintaining good accuracy.
2.3 Measures of attention, language, and non-verbal intelligence
All children were administered the following battery of tests. Non-verbal intelligence was assessed with the Test of Non-Verbal Intelligence – 4th edition (TONI-4; Brown, Sherbenou, & Johnsen, 2010). Attention was evaluated with the Parent Rating Scale of the Conners' Rating Scales – Revised (Conners, 1997), which provides an estimate of risk for developing ADHD based on a parental evaluation of the current behaviors of the child. Linguistic ability was measured with the help of the Clinical Evaluation of Language Fundamentals – 4th edition (CELF-4; Semel, Wiig, & Secord, 2003). Each child was tested on four separate sub-tests of CELF-4. Taken together, these subtests provide a Core Language Score (CLS), which is a general measure of language skills. The four sub-tests varied depending on children's age. All children were administered the Concepts and Following Directions (CFD), Recalling Sentences (RS), and Formulated Sentences (FS) sub-tests. In addition, 7-8 year olds also completed the Word Structure (WS) subtest while 9-11 year olds completed the Word Classes-2 (WC) subtest.
2.4 Statistical Analyses
We conducted several analyses in order to compare groups' performance on the SJT. First, we compared children's and adults' response times and rates of asynchrony perception at each SOA. These data were analyzed in repeated-measures ANOVAs with group (7-8 year olds, 10-11 year olds, and adults) as a between-group variable and SOA (0, 100, 200, 300, 400, and 500 ms) as a within-group variable. Because previous research showed asymmetry in sensitivity to auditory-visual as compared to visual-auditory SOAs (Bushara, Grafman, & Hallett, 2001; Dixon & Spitz, 1980; Grant, van Wassenhove, & Poeppel, 2004; Lewkowicz, 1996; McGrath & Summerfield, 1985; van Wassenhove, Grant, & Poeppel, 2007), and because this asymmetry was also present in our data (see below), separate analyses were conducted on the VA and AV conditions.
Second, in order to examine whether all groups showed greater sensitivity to AV as compared to VA asynchrony (which was expected based on previous reports), we conducted a repeated-measures ANOVA with group (7-8 year olds, 10-11 year olds, and adults) as a between-group variable and order (AV and VA) and SOA (100, 200, 300, 400, and 500) as within-group variables separately on the rate of synchrony perception and RT.
Third, in parallel with earlier studies of audiovisual temporal function in children and adults, we determined the width of the temporal binding window (TBW) for each individual and then compared this measure across the three groups (e.g., Hillock-Dunn & Wallace, 2012; Hillock et al., 2011; Powers III et al., 2009; Stevenson & Wallace, 2013; Stevenson et al., 2013; Stevenson, Zemtsov, & Wallace, 2012). More specifically, we used the glmfit function in MATLAB (MathWorks, Inc., Natick, MA) in order to fit two sigmoid functions to the number of synchronous perceptions, separately for the VA and AV SOAs, for each individual. The width of the left (based on the VA SOAs) and the right (based on the AV SOAs) side of each function was then measured at two points – at 50% and 75% of its maximum. Function curves could not be fit to the data from several participants (two 7-8 year olds, one 10-11 year old, and one adult for the 75% of the maximum measurement; one 7-8 year old for the 50% of the maximum measurement). Their data were excluded from group comparisons and from regression analyses (see below). One-way ANOVAs were used to examine group differences in the size of the TBW, separately for its VA and AV sides.
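As a rough illustration of this step, the sketch below fits a logistic function to one side of the window and reads off the SOA at which the fitted curve falls to a given criterion. It is a pure-Python stand-in, not the MATLAB glmfit code used in the study. The response counts are invented, the helper names `fit_logistic` and `width_at` are hypothetical, and taking the criterion relative to the logistic's asymptotic maximum of 1 is an assumption about how "percent of maximum" is defined.

```python
import math

def fit_logistic(soas_ms, n_sync, n_trials, max_iter=50, tol=1e-10):
    """Maximum-likelihood fit of P(sync) = 1 / (1 + exp(-(b0 + b1 * SOA))).

    Uses Newton-Raphson on the binomial log-likelihood. SOAs are rescaled
    to units of 100 ms internally to keep the 2x2 system well conditioned.
    Returns (b0, b1) with b1 expressed per millisecond.
    """
    xs = [s / 100.0 for s in soas_ms]
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k in zip(xs, n_sync):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n_trials * p * (1.0 - p)   # binomial variance weight
            r = k - n_trials * p           # observed minus expected count
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step for the intercept
        d1 = (h00 * g1 - h01 * g0) / det   # Newton step for the slope
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1 / 100.0

def width_at(criterion, b0, b1):
    """SOA (ms) at which the fitted curve equals `criterion` (0 < criterion < 1)."""
    return (math.log(criterion / (1.0 - criterion)) - b0) / b1

# Invented counts of "synchronous" responses out of 50 trials per SOA.
soas = [0, 100, 200, 300, 400, 500]
sync_counts = [48, 45, 33, 18, 8, 3]
b0, b1 = fit_logistic(soas, sync_counts, n_trials=50)
w50 = width_at(0.50, b0, b1)   # width of this side at 50% of maximum
w75 = width_at(0.75, b0, b1)   # width of this side at 75% of maximum
```

Because the fitted curve decreases with SOA, the 75% width is always smaller than the 50% width. When a participant's fitted intercept is too low for the curve to reach 75% at any positive SOA, `width_at` returns a non-positive value, which is one way a width can fail to be defined for a given participant.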
In all statistical analyses, significant main effects with more than two levels were evaluated with a Bonferroni post-hoc test. In such cases, the reported p value indicates the significance of the Bonferroni test, rather than the adjusted alpha level. When omnibus analysis produced a significant interaction, it was further analyzed with step-down ANOVAs or t-tests, with factors specific to any given interaction. Mauchly's test of sphericity was used to check for the violation of sphericity assumption in all repeated-measures tests that included factors with more than two levels. When the assumption of sphericity was violated, we used the Greenhouse-Geisser adjusted p-values to determine significance. Accordingly, in all such cases, adjusted degrees of freedom and the epsilon value (ε) are reported. Effect sizes, indexed by the partial eta squared statistic (ηp2), are reported for all significant repeated-measures ANOVA results.
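The relationships among these reported quantities can be made concrete with a short sketch. The helper names are hypothetical; the formulas are the standard ones for Greenhouse-Geisser adjusted degrees of freedom, for partial eta squared recovered from an F ratio, and for Bonferroni-adjusted p values. The first example uses the VA omnibus SOA effect reported in Section 3.1 (F=351.1, unadjusted df of 5 and 220, ε=0.525); the p values passed to `bonferroni` are invented.

```python
def gg_adjusted_df(df_effect, df_error, epsilon):
    """Greenhouse-Geisser correction: scale both degrees of freedom by epsilon."""
    return epsilon * df_effect, epsilon * df_error

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F: SS_effect / (SS_effect + SS_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# SOA has 6 levels (df = 5); 47 participants in 3 groups give 44 error df
# per level contrast, so the unadjusted error df is 5 * 44 = 220.
df1, df2 = gg_adjusted_df(5, 220, 0.525)
eta = partial_eta_squared(351.1, 5, 220)
adjusted = bonferroni([0.01, 0.02, 0.5])   # invented p values, 3 comparisons
```

With these inputs the adjusted degrees of freedom come out at about 2.625 and 115.5, and partial eta squared at about 0.889, in line with the reported F(2.626,115.529) and ηp2=0.889; the small differences in the degrees of freedom reflect rounding of ε. Partial eta squared is unaffected by the correction because ε scales both degrees of freedom equally.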
Lastly, we conducted a multiple regression analysis with children's age, the ADHD index, the TONI-4 score, and the CLS score of CELF-4 as predictors and the width of the TBW as a response variable in order to gauge the degree to which these cognitive, language, and attention skills contributed to children's performance on the SJT. Standardized scores from all of the above tests were used. A stepwise regression was performed to arrive at the final model. Separate regression analyses were conducted on the left (VA) and right (AV) portions of the TBW.
3. Results
The results of standardized tests of attention, non-verbal intelligence, and language skills are summarized in Table 1. As can be seen, both groups of children were quite comparable, except for the CFD subtest of CELF-4, on which, somewhat surprisingly, the younger group performed better than the older one.
Table 1. Group means for measures of age, SES (mother's and father's years of education), non-verbal intelligence (TONI-4), autism symptoms (CARS-2), risk of developing ADHD (ADHD Index from the Conners' Rating Scales), and language skills (CELF-4).
Group | Age (years, months) | Mother's Education (years) | Father's Education (years) | TONI-4 | CARS-2 | ADHD Index | CELF-4 CFD | CELF-4 FS | CELF-4 RS | CELF-4 CLS |
---|---|---|---|---|---|---|---|---|---|---|
7-8 year olds | 8;2 | 17.6 (0.8) | 17.5 (0.9) | 110.3 (1.7) | 15.4 (0.2) | 47.7 | 12.8 (0.4) | 13 (0.4) | 12.2 (0.6) | 114.1 (1.4) |
10-11 year olds | 11;2 | 15.5 (0.6) | 15.3 (0.7) | 105.5 (2.7) | 15 (0) | 46.1 | 10.7 (0.6) | 12.6 (0.4) | 11.2 (0.6) | 110.1 (2) |
p | | 0.048 | 0.069 | 0.15 | 0.082 | 0.417 | 0.001 | 0.439 | 0.247 | 0.124 |
Note. Numbers for TONI-4, CARS-2, the ADHD Index, and CELF-4 represent standardized scores. Numbers in parentheses are standard errors of the mean. P-values reflect a group comparison based on a one-way ANOVA. TONI = Test of Non-Verbal Intelligence; CARS = Childhood Autism Rating Scale; CFD = Concepts and Following Directions; FS = Formulated Sentences; RS = Recalling Sentences; CLS = Core Language Score.
3.1 Perception of Asynchrony at Different SOAs
Percent of trials at each SOA on which each group perceived synchrony is shown in Figure 2. Additionally, individual data for each of the three groups are shown in Figure 3. The omnibus analyses of VA and AV conditions revealed similar results - namely, a significant effect of SOA (VA, F(2.626,115.529)=351.1, p<0.001, ηp2=0.889, ε=0.525; AV, F(2.814,123.8)=459.9, p<0.001, ηp2=0.913, ε=0.563), group (VA, F(2,44)=13.363, p<0.001, ηp2=0.378; AV, F(2,44)=31.315, p<0.001, ηp2=0.587), and a group by SOA interaction (VA, F(10,220)=12.961, p<0.001, ηp2=0.371; AV, F(10,220)=15.133, p<0.001, ηp2=0.408). However, follow-up tests showed that the pattern of group effects at specific SOAs was different for the VA and AV trials. More specifically, for the VA condition, the effect of group was significant at 200, 300, 400, and 500 ms SOAs (for all SOAs, F(2,46)=6.466-22.127, p<0.001 to p=0.003), with a similar trend for the 100 ms SOA (F(2,46)=3.078, p=0.056). The Bonferroni test for multiple comparisons showed that adults were better able to detect temporal asynchrony than either group of children at 200-500 ms SOAs (p<0.001 to p=0.039) and were marginally better than 7-8 year olds at the 100 ms SOA (p=0.057). Importantly, however, the two groups of children did not differ from each other at any VA SOA (p=0.109-1.0). For the AV condition, the effect of group was significant for the same range of SOAs as in the VA analysis - 200, 300, 400, and 500 ms (for all SOAs, F(2,46)=18.583-42.124, p<0.001). The Bonferroni test for multiple comparisons revealed that adults, again, were more sensitive to temporal asynchrony at the 200-500 ms SOAs (all p≤0.001) than either group of children. However, in contrast to the VA analysis, the older group of children was also better able to detect temporal asynchrony than their younger counterparts at 300, 400, and 500 ms SOAs (p=0.001-0.018) (see Figure 2).
Groups also differed significantly in responses to unimodal stimuli (auditory, F(2,46)=7.385, p=0.002; visual, F(2,46)=7.995, p=0.001), with adults being significantly more accurate in response to both V and A stimuli than the 7-8 year old children (both p=0.001) and marginally more accurate in response to the V stimuli than the 10-11 year old children (p=0.064). The two groups of children did not differ from each other (auditory, p=0.218; visual, p=0.34).
3.2 Response Time at Different SOAs
Participants' RT data are shown in Figure 4. The omnibus analysis of RT showed a significant effect of SOA (VA, F(2.401,105.65)=75.666, p<0.001, ηp2=0.632, ε=0.48; AV, F(2.95,129.797)=54.108, p<0.001, ηp2=0.552, ε=.59), group (VA, F(2,44)=24.19, p<0.001, ηp2=0.524; AV, F(2,44)=24.337, p<0.001, ηp2=0.525), and a group by SOA interaction (VA, F(10, 220)=8.084, p<0.001, ηp2=0.269; AV, F(10,220)=9.121, p<0.001, ηp2=0.293). In regard to the effect of group, follow-up tests revealed that adults' responses were significantly faster than those of either group of children (adults vs. 7-8 year olds, p<0.001; adults vs. 10-11 year olds, p<0.001), which held true at all SOAs (all p values <0.001-0.01). At the same time, the two groups of children did not differ from each other at any SOA in either the VA or the AV condition (all p values 0.187-1). The most striking difference between children and adults was the relationship between the SOA and RT - in adults, the RT first increased for short SOAs compared to synchrony and then decreased as the SOAs became longer. In children, on the other hand, there was no noticeable shortening of RT at longest SOAs (see Figure 4).
In order to better describe this age-related difference, we compared the RT to the 200 ms SOA (which, at least in absolute terms, required the longest time to respond to in adults), the RT to true synchrony (SYNC), and the RT to the longest SOA of 500 ms in each group. In adults, the RT to the 200 ms SOA was significantly longer than that to both the synchronous trials and the 500 ms SOA trials (SOA, VA, F(1.831,27.465)=28.752, p<0.001, ηp2=0.657, ε=0.366; VA200 vs. SYNC, p<0.001; VA200 vs. VA500, p=0.042; SOA, AV, F(2.518,37.775)=12.205, p<0.001, ηp2=0.449, ε=0.504; AV200 vs. SYNC, p<0.001, AV200 vs. AV500, p=0.043). In 7-8 year old children, the RT to the 200 ms SOA trials was longer than to synchronous trials but shorter than to the 500 ms SOA trials (SOA, VA, F(2.465,34.513)=33.629, p<0.001, ηp2=0.706, ε=0.493; VA200 vs. SYNC, p=0.003; VA200 vs. VA500, p=0.01; SOA, AV, F(5,70)=31.592, p<0.001, ηp2=0.693; AV200 vs. SYNC, p=0.001; AV200 vs. AV500, p=0.017). In other words, their RT continued to increase for longer SOAs. Finally, 10-11 year old children fell in between the other two groups - their RT to the 200 ms SOA trials was longer than to synchronous trials but did not differ from that to the 500 ms SOA trials (SOA, VA, F(2.196,32.944)=28.549, p<0.001, ηp2=0.656, ε=0.439; VA200 vs. SYNC, p<0.001; VA200 vs. VA500, p=1; SOA, AV, F(2.381,35.711)=32.027, p<0.001, ηp2=0.681, ε=0.476; AV200 vs. SYNC, p<0.001; AV200 vs. AV500, p=1).
3.3 Comparing sensitivity to AV vs. VA SOAs
The order of modalities had a significant effect on the overall number of synchronous perceptions (F(1,44)=10.342, p=0.002, ηp2=0.19). This effect interacted with SOA (F(2.808, 123.545)=5.353, p=0.002, ηp2=0.108, ε=0.702). Follow-up paired-samples t-tests revealed that VA stimuli elicited a significantly higher number of synchronous perceptions than AV stimuli at the 200, 300, and 400 ms SOAs (t(46)=2.305-3.276, p=0.026-0.002, two-tailed). There was no group by order interaction (F(2,44)=1.088, p=0.346, ηp2=0.047).
Analysis of response time as a function of the order of modalities yielded a significant order by SOA by group interaction (F(8,176)=2.169, p=0.042, ηp2=0.09). Further analyses showed that adults took longer to respond to VA compared to AV stimuli at the 200, 300, and 400 ms SOAs (t(15)=2.516-2.813, p=0.013-0.024, two-tailed), with a trend in the same direction at the 500 ms SOA (t(15)=1.862, p=0.082, two-tailed). In contrast, neither group of children showed asymmetry in response time to AV vs. VA stimuli at any SOA (7-8 year olds, t(14)=-0.453-1.547, p=0.144-0.658, two-tailed; 10-11 year olds, t(15)=-0.729-0.028, p=0.477-0.996, two-tailed).
3.4 Size of the TBW
The width of the TBW for each group at 75% and 50% of the function's maximum is shown in Figure 5. Analysis of the TBW at 75% of the maximum yielded a significant effect of group only over the right (AV) side of the TBW (right: F(2,41)=3.576, p=0.037; left: F(2,41)=1.142, p=0.33), with the 7-8 year old group having a significantly larger right TBW than adults (7-8 year olds vs. adults, p=0.039). No other pairwise comparison was significant (7-8 year olds vs. 10-11 year olds, p=0.2; 10-11 year olds vs. adults, p=1).
Analysis of the TBW at 50% of the maximum showed a different pattern of group differences. Namely, groups differed significantly over both the left and the right sides of the TBW (left, F(2,45)=12.883, p<0.001; right, F(2,45)=33.006, p<0.001). Pairwise comparisons revealed that adults' left and right TBW were significantly smaller than those of either group of children (left, 7-8 year olds vs. adults, p<0.001, 10-11 year olds vs. adults, p=0.007; right, 7-8 year olds vs. adults, p<0.001, 10-11 year olds vs. adults, p<0.001); however, the two groups of children differed only over the right side of the TBW (right, 7-8 year olds vs. 10-11 year olds, p=0.009; left, 7-8 year olds vs. 10-11 year olds, p=0.214), with a larger TBW in 7-8 year olds.
3.5 Multiple Regressions
Only one relationship was significant enough to be included in the model – namely, age significantly predicted the size of the right TBW when the latter was measured at 50% of the function's maximum. More specifically, the right TBW became smaller with age (F(1,29)=10.761, p=0.003; B=-17.54, SE=5.347, Beta=-.527). Based on R2 (0.278), age accounted for approximately 28% of variance in the size of the right TBW (see Figure 6).
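The reported regression statistics are linked by two identities that hold when the final model contains a single predictor, as is assumed here for age: the F ratio equals the squared t statistic (B divided by its standard error), and R2 equals the squared standardized coefficient. A minimal sketch using the values reported above:

```python
# Values as reported for the final model (age predicting the right TBW).
B, SE, BETA = -17.54, 5.347, -0.527

t_stat = B / SE               # t statistic for the slope
f_from_t = t_stat ** 2        # with one predictor, F = t^2
r2_from_beta = BETA ** 2      # with one predictor, R^2 = Beta^2
```

Rounded to three decimals, these give F of 10.761 and R2 of 0.278, matching the reported values.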
4. Discussion
We examined audiovisual temporal function in two groups of children with narrow age ranges (7-8 year olds and 10-11 year olds) and in adults by using a simultaneity judgment task. Several major differences between children and adults and between the two groups of children have emerged.
First, both groups of children lagged significantly behind adults in the ability to detect audiovisual temporal asynchrony at all SOAs exceeding 100 ms, in both VA and AV conditions. This finding extends an earlier report from our laboratory, which showed that 7-11 year old children were less sensitive to audiovisual temporal asynchrony compared to adults. Because in our previous study the children's group included a relatively broad range of ages, it was possible that the effect was carried mostly by younger children. The results of the current study show unequivocally that by 10-11 years of age, children are still much more likely than adults to perceive asynchronous events as synchronous, even at SOAs as large as 400-500 ms. This result is significantly more dramatic than that reported by Hillock-Dunn and colleagues (Hillock et al., 2011), who showed that 10-11 year old children differed from adults only over the 150-350 ms SOAs and only in the VA condition. One potentially significant difference between our studies is the length of the stimuli. In Hillock-Dunn et al.'s reports, the length of the ring flash and the sound tone was only 8-10 ms. As a result, the onsets of auditory and visual stimuli never overlapped. Our stimuli were 200 ms in length and, therefore, at the SOAs of 100 and 200 ms there was no separation between the offset of the first stimulus and the onset of the second one in each audiovisual pair. And furthermore, even at longest SOAs, the offset to onset separations were significantly smaller than those in Hillock-Dunn and colleagues' studies. This might have made the simultaneity judgment more difficult. However, since many events in real life continue for more than 10 ms, children's performance with longer stimuli is highly informative and suggests that the maturation of the audiovisual temporal function may be even more prolonged than previously reported.
Second, children's ability to detect audiovisual temporal asynchrony undergoes a significant change during the mid-childhood years. This change is asymmetrical in nature, with 10-11 year old children surpassing their 7-8 year old counterparts at the longest SOAs (300-500 ms), but only when the auditory stimulus precedes the visual one (i.e., in the AV condition). In other words, sensitivities to VA and AV asynchronies have different developmental trajectories, with AV SOAs eliciting more accurate discriminations by 10-11 years of age. However, the age at which sensitivity to VA SOAs begins to improve relative to the early school years remains to be determined. This finding dovetails with a well-documented finding in the adult literature that, overall, adults are more sensitive to AV asynchronies than to VA ones (Bushara et al., 2001; Dixon & Spitz, 1980; Grant et al., 2004; Lewkowicz, 1996; McGrath & Summerfield, 1985; van Wassenhove et al., 2007). Based on our results, we can conclude that such enhanced sensitivity to AV asynchronies emerges during mid-childhood. It is worth noting that we replicated the earlier finding of asymmetry in audiovisual temporal function, with fewer perceptions of synchrony overall reported for AV than for VA pairs at medium (200-400 ms) SOAs. In adults, reduced sensitivity to VA temporal offsets was accompanied by increased RT.
We also compared sensitivity to audiovisual temporal asynchrony in children and adults by using a single metric – the TBW – which helps control for a potential bias in groups' responses to individual modalities. When the width of the TBW was measured at 50% of the function's maximum, this analysis yielded results that were completely parallel to those obtained from the analyses of individual SOAs. More specifically, in adults, both the left and the right sides of the TBW were significantly smaller than in either group of children, and in 10-11 year olds the right (AV) side of the TBW was significantly smaller than in 7-8 year olds.
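To make the half-maximum criterion concrete, the following sketch estimates the width of one side of the TBW as the SOA at which the proportion of "synchronous" responses falls to 50% of the function's maximum. It is an illustration under stated assumptions, not the analysis used in this study: the function name, the linear interpolation between tested SOAs, and the response proportions are all hypothetical.

```python
def half_max_crossing(soas_ms, p_sync, peak):
    """SOA (ms) at which p_sync first falls below 50% of `peak`.

    soas_ms: ascending SOAs for one side of the window (VA or AV), starting at 0.
    p_sync:  proportion of 'synchronous' responses at each SOA.
    peak:    maximum of the whole response function (both sides).
    Linear interpolation between the two bracketing SOAs is an assumption.
    """
    criterion = 0.5 * peak
    for i in range(1, len(soas_ms)):
        prev, cur = p_sync[i - 1], p_sync[i]
        if prev >= criterion > cur:
            frac = (prev - criterion) / (prev - cur)
            return soas_ms[i - 1] + frac * (soas_ms[i] - soas_ms[i - 1])
    return None  # responses never fall below the criterion in the tested range


# Hypothetical proportions of 'synchronous' responses (not data from this study).
soas = [0, 100, 200, 300, 400, 500]
va = [0.95, 0.90, 0.70, 0.40, 0.20, 0.10]  # visual stimulus first
av = [0.95, 0.80, 0.45, 0.20, 0.10, 0.05]  # auditory stimulus first

peak = max(va + av)
left_width = half_max_crossing(soas, va, peak)   # left (VA) side of the TBW
right_width = half_max_crossing(soas, av, peak)  # right (AV) side of the TBW
print(f"left (VA) width: {left_width:.1f} ms, right (AV) width: {right_width:.1f} ms")
```

In this made-up example the right (AV) side is narrower than the left (VA) side, which is the direction of the asymmetry described in the text. A participant whose responses never drop below the criterion yields no estimate within the tested range.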
Third, compared to children, adults were faster overall to make a simultaneity judgment and, most importantly, showed a qualitatively different relationship between RT and SOA (see Figure 4). More specifically, in adults, RT first increased with increasing SOA, peaking at the 200 ms SOA, and then shortened significantly at the 500 ms SOA. In children, on the other hand, RT either continued to increase all the way through the longest SOA (7-8 year old group) or leveled off at SOAs exceeding 200 ms (10-11 year old group). While behavioral measures do not tell us whether the SOAs that require the longest RT present a perceptual challenge, a cognitive (i.e., decision-making) challenge, or both, these findings clearly show that different SOAs are associated with different degrees of certainty about audiovisual synchrony, with the 200-300 ms SOAs requiring the longest time to respond to (at least in adults) and thus perhaps representing the most ambiguous stimuli. Interestingly, the same SOAs also tended to show the largest individual variability in adults (see Figure 3). This pattern of results suggests that children found making a simultaneity judgment effortful even when presented with the 400-500 ms SOAs. We take this finding to indicate that these large separations between modalities did not lead to a clear perception of asynchrony in children, a conclusion that agrees with the higher number of synchronous perceptions observed in these groups. The fact that in 10-11 year olds RTs leveled off at longer SOAs (rather than continuing to increase, as in the younger group of children) may signal that their pattern of RT across SOAs is beginning to approach that of adults, with RT to the longest SOAs in the process of shortening (compared to the 200 ms SOA).
Fourth, somewhat surprisingly, scores on none of the standardized tests were significantly correlated with children's performance on the SJT, with age being the only significant predictor. Furthermore, age predicted only the width of the right (AV) side of the TBW, underlining the fact that sensitivity to AV SOAs undergoes a more profound change during the school years than sensitivity to VA SOAs. This finding is in general agreement with the study by Hillock-Dunn and Wallace (2012), who reported that children's performance on the SJT did not correlate with verbal or non-verbal IQ, reading ability, or socioeconomic status but did correlate with age. Therefore, while we know that sensitivity to multisensory temporal asynchrony gradually increases during mid-childhood and is impaired in a number of developmental disorders (for a comprehensive review of this issue, see Wallace & Stevenson, 2014), such as autism (Foss-Feig et al., 2010; Kwakye, Foss-Feig, Cascio, Stone, & Wallace, 2011; Stevenson et al., 2014), specific language impairment (SLI) (Grondin et al., 2007; Kaganovich et al., 2014), and dyslexia (Hairston, Burdette, Flowers, Wood, & Wallace, 2005), the functional significance of multisensory temporal function for language and, more generally, for cognitive development is poorly understood and requires future study. One possible explanation for the lack of a relationship between children's performance on the SJT and measures of other cognitive abilities is that the selected standardized tests either were not sensitive enough to individual differences or did not focus on the skills that would be most profoundly affected by weak audiovisual temporal function. For example, we know that in adults better sensitivity to audiovisual temporal asynchrony is associated with higher susceptibility to the McGurk illusion (Stevenson et al., 2012) and with better perception of degraded speech (Conrey & Pisoni, 2006). However, whether the same holds for children is unknown.
In sum, examining a broader array of children's cognitive and perceptual abilities and focusing on skills that directly depend on audiovisual processing (such as speech-in-noise perception, for example) may be instrumental in better understanding the role of audiovisual temporal function during typical development.
However, there may also be another reason for the lack of correlation between performance on the SJT and other cognitive skills in our data. Namely, given the amount of individual variability in sensitivity to audiovisual temporal asynchrony, even in adults, it is possible that poor audiovisual temporal function does not interfere with typical development until it reaches a certain threshold – a threshold that we would rarely see crossed in typically developing children but that could be exceeded by clinical populations. Some support for this hypothesis comes from an earlier study from our laboratory, which showed a significant correlation between language skills and the SJT performance in children with a history of Specific Language Impairment (H-SLI) (Kaganovich et al., 2014), with those children who did well on the SJT also having better language scores. This relationship did not hold for age-matched typically developing children, who, as a group, performed markedly better on the SJT compared to their H-SLI peers. In other words, as long as sensitivity to audiovisual temporal information stays within a certain range, however broad, it may be sufficient for proper cognitive development. If so, determining the boundaries of this acceptable variability range will be an important empirical question to address by future work.
In the current study, we used simple non-speech stimuli in part so we could easily compare our findings with earlier developmental reports (which used simple stimuli as well) and in part because they allowed us to convert the SJT paradigm into a game suitable for testing school-age children. While we acknowledge that the use of more ecologically valid stimuli should be considered in future work, we believe that the use of simple stimuli does not diminish the significance of our findings. Recent work shows that measures of sensitivity to audiovisual asynchrony in simple and complex events are correlated (Stevenson & Wallace, 2013), and that cognitive functions that are affected by audiovisual temporal processing (such as perception of degraded speech) are correlated with audiovisual temporal function in both speech and non-speech stimuli (Conrey & Pisoni, 2006).
In conclusion, the results of the current study provide several important insights into the development of audiovisual temporal function during mid-childhood. More specifically, we showed that at 10-11 years of age children are still significantly more likely than adults to perceive asynchronous audiovisual events as synchronous, even at SOAs as large as 400-500 ms. In agreement with this finding, only adults showed a significant shortening of RT at long SOAs compared to medium SOAs. We also found that audiovisual temporal function undergoes a significant change between approximately 8 and 11 years of age. This change is asymmetrical in nature, with sensitivity to AV asynchronies improving and sensitivity to VA asynchronies remaining constant. As mentioned above, reduced sensitivity to audiovisual temporal asynchrony has been reported for a number of developmental disorders, suggesting that timely maturation of audiovisual temporal processing may play a significant role in cognitive and linguistic development. Our results may serve as an important baseline against which children with developmental disorders associated with impaired audiovisual temporal function may be compared.
Acknowledgments
This research was supported in part by grants P30DC010745 and R03DC013151 from the National Institute on Deafness and Other Communicative Disorders, National Institutes of Health. The content is solely the responsibility of the author and does not necessarily represent the official view of the National Institute on Deafness and Other Communicative Disorders or the National Institutes of Health. We are thankful to Jennifer Schumaker, James Hengenius, Kevin Barlow, and Caryn Herring for their assistance with different stages of this project and to the children and their families for their participation.
References
- Brown L, Sherbenou RJ, Johnsen SK. Test of Nonverbal Intelligence. 4th. Austin, Texas: Pro-Ed: An International Publisher; 2010.
- Bushara KO, Grafman J, Hallett M. Neural correlates of audio-visual stimulus onset asynchrony detection. The Journal of Neuroscience. 2001;21(1):300–304. doi: 10.1523/JNEUROSCI.21-01-00300.2001.
- Cohen MS. Handedness Questionnaire. 2008. Retrieved 05/27/2013 from http://www.brainmapping.org/shared/Edinburgh.php#.
- Conners KC. Conners' Rating Scales - Revised. North Tonawanda, NY: MHS; 1997.
- Conrey B, Pisoni DB. Auditory-visual speech perception and synchrony detection for speech and non-speech signals. Journal of the Acoustical Society of America. 2006;119(6):4065–4073. doi: 10.1121/1.2195091.
- Dixon NF, Spitz L. The detection of auditory visual desynchrony. Perception. 1980;9:719–721. doi: 10.1068/p090719.
- Foss-Feig JH, Kwakye LD, Cascio CJ, Burnette CP, Kadivar H, Stone WL, Wallace MT. An extended multisensory temporal binding window in autism spectrum disorders. Experimental Brain Research. 2010;203:381–389. doi: 10.1007/s00221-010-2240-4.
- Grant KW, van Wassenhove V, Poeppel D. Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony. Speech Communication. 2004;44:43–53.
- Grondin S, Dionne G, Malenfant N, Plourde M, Cloutier M, Jean C. Temporal processing skills of children with and without specific language impairment. Canadian Journal of Speech-Language Pathology and Audiology. 2007;31(1):38–46.
- Hairston WD, Burdette JH, Flowers DL, Wood FB, Wallace MT. Altered temporal profile of visual-auditory multisensory interactions in dyslexia. Experimental Brain Research. 2005;166:474–480. doi: 10.1007/s00221-005-2387-6.
- Hillock-Dunn A, Wallace MT. Developmental changes in the multisensory temporal binding window persist into adolescence. Developmental Science. 2012;15(5):688–696. doi: 10.1111/j.1467-7687.2012.01171.x.
- Hillock AR, Powers AR, Wallace MT. Binding of sights and sounds: Age-related changes in multisensory temporal processing. Neuropsychologia. 2011;49:461–467. doi: 10.1016/j.neuropsychologia.2010.11.041.
- Kaganovich N, Schumaker J, Leonard LB, Gustafson D, Macias D. Children with a history of SLI show reduced sensitivity to audiovisual temporal asynchrony: An ERP study. Journal of Speech, Language, and Hearing Research. 2014;57:1480–1502. doi: 10.1044/2014_JSLHR-L-13-0192.
- Keetels M, Vroomen J. Temporal recalibration to tactile-visual asynchronous stimuli. Neuroscience Letters. 2008;430:130–134. doi: 10.1016/j.neulet.2007.10.044.
- Keetels M, Vroomen J. Perception of synchrony between the senses. In: Murray MM, Wallace MT, editors. The Neural Bases of Multisensory Processes. New York: CRC Press; 2012. pp. 147–177.
- Kwakye LD, Foss-Feig JH, Cascio CJ, Stone WL, Wallace MT. Altered auditory and multisensory temporal processing in autism spectrum disorders. Frontiers in Integrative Neuroscience. 2011;4(129) doi: 10.3389/fnint.2010.00129.
- Lewkowicz DJ. Perception of auditory-visual temporal synchrony in human infants. Journal of Experimental Psychology: Human Perception and Performance. 1996;22(5):1094–1106. doi: 10.1037//0096-1523.22.5.1094.
- Lewkowicz DJ. Infant perception of audio-visual speech synchrony. Developmental Psychology. 2010;46(1):66–77. doi: 10.1037/a0015579.
- Lewkowicz DJ. Development of multisensory temporal perception. In: Murray MM, Wallace MT, editors. The Neural Bases of Multisensory Processes. New York: CRC Press; 2012. pp. 325–344.
- Lewkowicz DJ, Flom R. The audiovisual temporal binding window narrows in early childhood. Child Development. 2014;85(2):685–694. doi: 10.1111/cdev.12142.
- McGrath M, Summerfield Q. Intermodal timing relations and audio-visual speech recognition by normal-hearing adults. Journal of the Acoustical Society of America. 1985;77(2):678–685. doi: 10.1121/1.392336.
- Navarra J, Soto-Faraco S, Spence C. Adaptation to audiotactile asynchrony. Neuroscience Letters. 2007;413:72–76. doi: 10.1016/j.neulet.2006.11.027.
- Oldfield RC. The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia. 1971;9:97–113. doi: 10.1016/0028-3932(71)90067-4.
- Petrini K, Dahl S, Rocchesso D, Waadeland CH, Avanzini D, Puce A, Pollick FE. Multisensory integration of drumming actions: musical expertise affects perceived audiovisual asynchrony. Experimental Brain Research. 2009;198:339–352. doi: 10.1007/s00221-009-1817-2.
- Powers AR, III, Hillock AR, Wallace MT. Perceptual training narrows the temporal window of multisensory binding. The Journal of Neuroscience. 2009;29(39):12265–12274. doi: 10.1523/JNEUROSCI.3501-09.2009.
- Schopler E, Van Bourgondien ME, Wellman GJ, Love SR. Childhood Autism Rating Scale. 2nd. Western Psychological Services; 2010.
- Semel E, Wiig EH, Secord WA. CELF4: Clinical Evaluation of Language Fundamentals. 4th. San Antonio, TX: Pearson Clinical Assessment; 2003.
- Stein BE, Meredith MA. The Merging of the Senses. Cambridge, Massachusetts: The MIT Press; 1993.
- Stevenson RA, Siemann JK, Schneider BC, Eberly HE, Woynaroski TG, Camarata SM, Wallace MT. Multisensory temporal integration in autism spectrum disorders. The Journal of Neuroscience. 2014;34(3):691–697. doi: 10.1523/JNEUROSCI.3615-13.2014.
- Stevenson RA, Wallace MT. Multisensory temporal integration: task and stimulus dependencies. Experimental Brain Research. 2013;227:249–261. doi: 10.1007/s00221-013-3507-3.
- Stevenson RA, Wilson MM, Powers AR, Wallace MT. The effects of visual training on multisensory temporal processing. Experimental Brain Research. 2013;225:479–489. doi: 10.1007/s00221-012-3387-y.
- Stevenson RA, Zemtsov RK, Wallace MT. Individual differences in the multisensory temporal binding window predict susceptibility to audiovisual illusions. Journal of Experimental Psychology: Human Perception and Performance. 2012;38(6):1517–1529. doi: 10.1037/a0027339.
- Stone JV, Hunkin NM, Porrill J, Wood R, Keeler V, Meanland M, et al. Porter NR. When is now? Perception of simultaneity. Proceedings of the Royal Society B: Biological Sciences. 2001;268:31–38. doi: 10.1098/rspb.2000.1326.
- van Eijk R, Kohlrausch A, Juola J, van de Par S. Audiovisual synchrony and temporal order judgments: effects of experimental method and stimulus type. Perception and Psychophysics. 2008;70(6):955–968. doi: 10.3758/PP.70.6.955.
- van Wassenhove V, Grant KW, Poeppel D. Temporal window of integration in auditory-visual speech perception. Neuropsychologia. 2007;45:598–607. doi: 10.1016/j.neuropsychologia.2006.01.001.
- Vatakis A, Spence C. Audiovisual synchrony perception for music, speech, and object actions. Brain Research. 2006;1111:134–142. doi: 10.1016/j.brainres.2006.05.078.
- Vatakis A, Spence C. Audiovisual temporal integration for complex speech, object-action, animal call, and musical stimuli. In: Naumer NJ, Kaiser J, editors. Multisensory Object Perception in the Primate Brain. New York: Springer; 2010.
- Vroomen J, Keetels M. Perception of intersensory synchrony: A tutorial review. Attention, Perception, & Psychophysics. 2010;72(4):871–884. doi: 10.3758/APP.72.4.871.
- Vroomen J, Stekelenburg JJ. Perception of intersensory synchrony in audiovisual speech: Not that special. Cognition. 2011;118(1):75–83. doi: 10.1016/j.cognition.2010.10.002.
- Wallace MT, Stevenson RA. The construct of the multisensory temporal binding window and its dysregulation in developmental disabilities. Neuropsychologia. 2014;64:105–123. doi: 10.1016/j.neuropsychologia.2014.08.005.
- Zampini M, Brown T, Shore DI, Maravita A, Röder B, Spence C. Audiotactile temporal order judgments. Acta Psychologica. 2005;118:277–291. doi: 10.1016/j.actpsy.2004.10.017.