Abstract
A weaker McGurk effect is observed in individuals with autism spectrum disorder (ASD); this weaker integration is considered key to understanding how atypical low-order processing leads to their maladaptive social behaviors. However, the mechanism underlying this weaker McGurk effect is not fully understood. Here, we investigated (1) whether the weaker McGurk effect in individuals with high autistic traits is caused by poor lip-reading ability and (2) whether the hearing environment modifies the weaker McGurk effect in individuals with high autistic traits. To address these questions, we conducted two analogue studies among university students, based on the dimensional model of ASD. Results showed that individuals with high autistic traits have intact lip-reading ability as well as intact abilities to listen to and recognize audiovisual congruent speech (Experiment 1). Furthermore, the weaker McGurk effect in individuals with high autistic traits, which appears under the without-noise condition, disappears under the high-noise condition (Experiments 1 and 2). Our findings suggest that high background noise might shift weight toward the visual cue, thereby increasing the strength of the McGurk effect among individuals with high autistic traits.
Keywords: autism spectrum disorder, autism spectrum quotient, McGurk effect, lip-reading, individual differences
Introduction
Autism spectrum disorder (ASD) is a neurodevelopmental disorder defined in terms of difficulties with social interaction and communication, patterns of repetitive behavior, narrow interests, and difficulties with sensory processing (Diagnostic and Statistical Manual of Mental Disorders-5: DSM-5; American Psychiatric Association 2013). Behavioral data have also clarified the kinds of difficulties that individuals with diagnosed ASD have, and how their perceptual processing differs from that of individuals with neurotypical development (Frith 1991, Happé and Frith 2006). For example, previous studies show that individuals with diagnosed ASD tend to have difficulties with social skills during face-to-face communication, which can result from atypical processing of facial information (e.g. Deruelle et al. 2004, Joseph and Tanaka 2002), errors in emotion recognition (e.g. Baron-Cohen et al. 2001a), or misunderstanding another person's thoughts (e.g. Baron-Cohen et al. 1986). In brief, atypical processing in individuals with diagnosed ASD seems to appear when perceiving social information that involves faces.
Research over the past decade has clarified that individuals with diagnosed ASD show atypical processing in audiovisual speech perception, especially in the McGurk effect (McGurk and MacDonald 1976). The McGurk effect is a well-known illusion demonstrating the strength of our automatic use of facial information when recognizing speech. For instance, when an observer is presented with a movie of a person articulating one phoneme dubbed with a voice speaking an incongruent phoneme, they will perceive an intermediate phoneme. The McGurk effect is relatively weaker in individuals with ASD than in individuals with typical development (TD), although no difference is observed between the two populations in the ability to hear voices (e.g. De Gelder et al. 1991, Iarocci et al. 2010, Saalasti et al. 2012, Williams et al. 2004). This means that individuals with diagnosed ASD rely less on facial speech, or may not rely on facial speech at all, during audiovisual speech perception, thereby showing a weaker McGurk effect. Recently, a meta-analysis that pooled nine clinical studies of the McGurk effect revealed that individuals with diagnosed ASD show a weaker McGurk effect than TD controls (Zhang et al. 2019).
However, the mechanism for this weaker McGurk effect is not fully understood. One explanation is that individuals with diagnosed ASD have poor lip-reading ability, which refers to the accuracy of interpreting what an individual says without hearing his or her voice (Irwin et al. 2011, Smith and Bennetto 2007, Williams et al. 2004). Irwin et al. (2011) demonstrated that children with ASD had more difficulty in lip-reading than did TD children. They also displayed a weaker McGurk effect because they received less information from visual speech when observing the McGurk stimuli. In contrast, another study argued that children with ASD exhibit delayed development of lip-reading ability (Taylor et al. 2010), which indicates that adults with ASD eventually reach the same lip-reading ability as adults with TD. Another possible explanation is that the weaker McGurk effect comes from atypical integration of multiple sensory inputs (Donohue et al. 2012, Stevenson et al. 2014). De Gelder et al. (1991) tested the ability of children with ASD to recognize facial identity as well as the ability to perceive facial speech. They found that individuals with diagnosed ASD have intact lip-reading ability, but they show the weaker McGurk effect also reported in adults with ASD (e.g. Saalasti et al. 2012).
Thus, it remains unclear what a weaker McGurk effect suggests. This study examined whether the weaker McGurk effect in ASD is influenced by the hearing environment. It has been observed that the McGurk effect is strongly experienced by individuals in an environment with a high level of background noise (Sekiyama and Tohkura 1991, Sekiyama 1994). Furthermore, a cultural comparison study found that while East Asians tend to perceive a weaker McGurk effect than Americans when there is no background noise, this cultural difference is reduced when background noise is present in the environment (Sekiyama and Tohkura 1991, Sekiyama 1994). Additional research also examined whether a cultural difference exists in the McGurk effect (Magnotti et al. 2015). Magnotti et al. (2015) compared the McGurk effect between a large sample of Mandarin- (n = 162) and English-speaking adults (n = 145) by using nine McGurk stimuli from eight different speakers. In their results, there was no difference in the McGurk effect between the two groups (48% vs. 44%). Until now, most studies on the McGurk effect in ASD have paid little attention to the effect of background noise. If individuals with diagnosed ASD have difficulty in audiovisual speech integration, they will experience a weaker McGurk effect regardless of the hearing environment. Alternatively, if individuals with diagnosed ASD have a tendency not to rely on visual speech, a weaker McGurk effect might arise only when they observe McGurk stimuli in a situation with a lower level of background noise.
Based on the dimensional model of ASD (Baron-Cohen et al. 2001b, Frith 1991, Happé and Frith 2006), our prior studies have supported the assumption that individuals with diagnosed ASD rely less on visual speech in audiovisual speech integration (Ujiie et al. 2015a, 2015b, 2018). In these studies, we tested the McGurk effect in several conditions, then calculated the correlation between the ratio of the McGurk effect and autistic traits among university students. Such an approach is called an analogue study, and it allows researchers to study ASD symptoms in the general population. Our prior results showed that individuals with high autistic traits experienced a weaker McGurk effect in an environment without auditory noise (Ujiie et al. 2015a, 2018) but a stronger one in an environment with auditory noise (Ujiie et al. 2015b). These results suggest a tendency to rely less on visual speech, rather than an inability to integrate audiovisual speech. However, our prior studies did not include a lip-reading test and thus could not exclude the possibility of poor lip-reading ability in individuals with diagnosed ASD (e.g. Irwin et al. 2011, Smith and Bennetto 2007, Williams et al. 2004). Furthermore, our prior studies included few individuals with high autistic traits because we did not set up a screening session to recruit such people. ASD is relatively uncommon, and randomly recruited participants are unlikely to have high autistic traits (Baron-Cohen et al. 2001b, Wakabayashi et al. 2006). Therefore, the current study included a screening session, allowing us to test our hypothesis comprehensively.
An analogue design approach is helpful for controlling some potentially confounding factors (Reed et al. 2011, Stewart and Ota 2008). In general, people with ASD who are recruited in clinical studies vary in age, severity of symptoms, presence of multiple diagnoses (e.g. ASD plus attention-deficit hyperactivity disorder [ADHD]), and IQ (Hofvander et al. 2009). The mixed results on lip-reading ability and the McGurk effect in ASD may stem from the heterogeneity of the clinical groups (e.g. De Gelder et al. 1991, Iarocci et al. 2010, Saalasti et al. 2012, Taylor et al. 2010, Williams et al. 2004). For instance, the ASD samples in previous studies differed in chronological age, with some studies examining children aged between six and sixteen years (e.g. De Gelder et al. 1991, Iarocci et al. 2010, Taylor et al. 2010, Williams et al. 2004) and others examining adults (Saalasti et al. 2012). Some researchers have suggested that people with ASD have intact lip-reading ability (e.g. De Gelder et al. 1991, Saalasti et al. 2012) and that the weaker McGurk effect is due to atypical integration of multiple sensory inputs. In contrast, Taylor et al. (2010) suggested that children with ASD exhibit delayed development of lip-reading ability, which means that adults with ASD have lip-reading abilities comparable to those of TD individuals. In this situation, an analogue design allows researchers to study lip-reading ability and the McGurk effect in ASD after controlling for potentially confounding factors.
In the ASD analogue design, the Autism-Spectrum Quotient (AQ) has been widely used (e.g. Baron-Cohen et al. 2001a, Reed et al. 2011, Stewart and Ota 2008). The AQ, a self-report questionnaire, assesses the degree of autistic traits in any adult who has not been diagnosed with ASD and who has a normal intelligence quotient (Baron-Cohen et al. 2001b). The AQ is also useful as a screening scale, not only to distinguish between clinical and control groups but also to measure the distribution of autistic traits within the general population. Its validity in distinguishing between clinical and control groups, and its test reliability, were confirmed in the original (UK) sample (Baron-Cohen et al. 2001a). These findings were replicated in the Netherlands (Hoekstra et al. 2008), Australia (Lau et al. 2013), and Japan (Wakabayashi et al. 2006). One effective approach in analogue design studies is to compare individuals with average autistic traits to those with higher autistic traits after screening. Such a comparison can control for confounding factors such as age, symptom severity, multiple diagnoses, and IQ. Although the analogue design has been criticized as an indirect investigation of ASD, individuals with higher autistic traits, referred to as the Broader Autism Phenotype (e.g. Austin 2005, Hasegawa et al. 2015), do not have an ASD diagnosis but show tendencies similar to those of individuals diagnosed with ASD, at least at the cognitive level (Reed et al. 2011, Stewart and Ota 2008, Woodbury-Smith et al. 2005).
In summary, this study had two purposes. The first was to investigate whether the weak McGurk effect observed in individuals with high autistic traits is due to poorer lip-reading ability. The second was to test whether the weak McGurk effect in individuals with high autistic traits is influenced by the hearing environment. To address these purposes, we conducted two experiments. In experiment 1, we assessed the McGurk effect under noise conditions and lip-reading ability among individuals with high AQ scores and their counterparts with average scores. In experiment 2, as a control experiment, we assessed the McGurk effect under a without-noise condition in both the high AQ and averaged AQ groups. The results are discussed in terms of the dimensional model of ASD and the atypical processing of multisensory inputs in ASD.
Experiment 1
Methods
Participants
First, we screened potential participants using the Japanese version of the AQ (Wakabayashi et al. 2006) during an introductory psychology class to recruit individuals with high autistic traits. Four hundred seventy-one university students (241 males and 230 females) completed the AQ in a group setting. As shown in Table 1, the AQ scores ranged from 11 to 36 (M = 22.8, SD = 3.97). We defined the high AQ group by using the cut-off score (AQ ≥ 27) from Woodbury-Smith et al. (2005). Finally, 13 volunteers in the high AQ group and 13 counterpart volunteers in the averaged AQ group participated in the following experiment. Participants were naïve to the purpose of these experiments. All participants were native Japanese speakers and reported normal (or corrected) hearing and vision. Informed consent was obtained from all participants, and the study was performed in accordance with the requirements of the local ethics committee.
Table 1.
Means and standard deviations (SD) of AQ among all participants, high AQ group, and averaged AQ group.
| | All participants (N = 471) | High AQ group (N = 13) | Averaged AQ group (N = 13) |
|---|---|---|---|
| Males | 241 | 7 | 2 |
| Females | 230 | 6 | 11 |
| Ages (years) | 18.5 (2.12) | 18.5 (0.18) | 20.8 (1.73) |
| AQ total scores | 22.8 (3.97) | 29.4 (1.98) | 17.2 (2.20) |
| Social skill | 4.0 (1.66) | 6.2 (1.24) | 2.5 (0.97) |
| Attention switching | 4.0 (1.48) | 5.7 (0.85) | 2.6 (0.77) |
| Attention to detail | 5.8 (1.60) | 7.0 (1.22) | 4.7 (0.95) |
| Communication | 5.0 (1.38) | 4.6 (1.26) | 4.5 (1.45) |
| Imagination | 3.9 (1.65) | 5.8 (1.46) | 2.9 (0.95) |
Note. AQ = Autism-spectrum Quotient Japanese version (Wakabayashi et al. 2006), an assessment of the level of autistic traits. This questionnaire contains 50 items rated on a four-point Likert scale (from ‘agree’ to ‘disagree’). Each item was then scored for 0 or 1, with a possible range in total scores of 0 to 50.
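To make the scoring scheme in the note concrete, the following is a minimal Python sketch (illustrative only, not the authors' materials; the item keying passed in is hypothetical) of how a 4-point Likert response set maps to the 0/1-per-item AQ total:

```python
# Illustrative sketch of AQ binary scoring (Baron-Cohen et al. 2001b):
# each of the 50 items is rated 1-4 (1 = 'definitely agree' ... 4 = 'definitely
# disagree'); a response on the item's autistic-trait side scores 1, else 0.
# Which items are agree-keyed is hypothetical here.

def score_aq(responses, agree_keyed):
    """responses: list of 50 ratings in {1, 2, 3, 4}.
    agree_keyed: set of item indices where 'agree' responses score 1."""
    total = 0
    for i, r in enumerate(responses):
        if i in agree_keyed:
            total += 1 if r in (1, 2) else 0  # slight/definite agree scores 1
        else:
            total += 1 if r in (3, 4) else 0  # slight/definite disagree scores 1
    return total

# Example: answering 'definitely agree' to all items, with half agree-keyed,
# yields a total of 25 out of a possible range of 0 to 50.
print(score_aq([1] * 50, set(range(25))))  # 25
```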
Stimuli
We created unimodal (audio and visual) and multimodal (audiovisual congruent and audiovisual incongruent) stimuli from audio and video recordings of utterances from four Japanese speakers (two women). These stimuli were previously used in the study by Ujiie et al. (2015a). The visual stimuli comprised the faces of the speakers recorded using a digital video camera (GZ-EX370, JVC KENWOOD). The audio stimuli comprised the utterances (/pa/, /ta/, or /ka/) recorded using a dynamic microphone (MD42, Sennheiser). The video clip (720 × 480 pixels, 29.97 frames/s) and the speech sound (digitized at 48,000 Hz with 16-bit quantization resolution) were combined and synchronized using Adobe Premiere Pro CS6. The mean duration of all stimuli was 1.2 s.
The unimodal stimuli comprised audio and visual stimuli for each of the three syllables. In the audio condition, each audio stimulus was presented without any visual information. In the visual condition, each visual movie was presented in silence. In this study, we considered accuracy in the visual condition an indicator of the participant's lip-reading ability.
The multimodal stimuli consisted of audiovisual congruent stimuli (auditory /pa/ and visual /pa/, auditory /ta/ and visual /ta/, and auditory /ka/ and visual /ka/) and audiovisual incongruent stimuli (auditory /pa/, visual /ka/), the latter defined as McGurk stimuli. We used the combination of a voice pronouncing the phoneme /pa/ and a movie displaying the pronunciation of the phoneme /ka/ as the McGurk stimuli, as in our prior studies (Ujiie et al. 2015a, 2018). We defined the rate of the /pa/ response, which was the correct response to the auditory component of the incongruent stimuli, as the weakness of the McGurk effect.
To manipulate the level of background noise (Sekiyama and Tohkura 1991), we added two signal-to-noise ratio (SNR) conditions (+5 dB and –5 dB) for multimodal stimuli. In the former, audio stimuli were presented at 60 dB with 55 dB of pink noise, while in the latter, audio stimuli were presented at 60 dB with 65 dB of pink noise.
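The SNR labels above follow directly from the difference between signal and noise levels in decibels; a one-line sketch (illustrative only) makes the arithmetic explicit:

```python
# SNR (dB) = signal level (dB) - noise level (dB).
def snr_db(signal_db, noise_db):
    return signal_db - noise_db

print(snr_db(60, 55))  # +5 dB condition: 60 dB speech over 55 dB pink noise
print(snr_db(60, 65))  # -5 dB condition: 60 dB speech over 65 dB pink noise
```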
Apparatus
The experiment was conducted using Inquisit version 4.0.6 (Millisecond Software 2014). The visual stimuli were displayed on a 19-inch CRT monitor (E193FPp, Dell). The audio stimuli were presented through headphones (MDR-Z500, Sony) at approximately 65 dB SPL (the sound pressure level of normal conversation is 65 to 70 dB SPL), adjusted using a mixing console (MW8CX, YAMAHA).
Procedure
All participants completed two multimodal blocks and two unimodal blocks. The multimodal blocks consisted of one high SNR block and one low SNR block, each of which included 12 congruent stimuli and 4 incongruent stimuli. Each congruent stimulus was presented three times per block and each incongruent stimulus nine times per block; thus, each block consisted of 72 trials. The unimodal blocks comprised audio and visual stimuli, respectively. Each stimulus was presented three times, so each block consisted of 36 trials. In each trial, a fixation point was displayed for 1000 ms at the center of the monitor, followed by the stimulus. Then, a blank display was presented until the participant responded. In the audio and the two multimodal blocks, the participant was asked to report what he or she had heard. In contrast, in the visual block, the participant was asked to indicate what he or she thought was spoken. The participants responded by pressing a key (/pa/, /ta/, or /ka/). The order of trials was randomized for each block.
Data analysis
Statistical analysis was conducted using R version 3.6.1 for Windows (R Foundation for Statistical Computing, Vienna, Austria). In the audio and multimodal blocks, we defined the ratio of the correct response to auditory voices as ‘accuracies’. In the visual block, we defined the ratio of the correct response to lip-reading as ‘accuracies’. To examine the relationship between task performance and AQ score, we analyzed the mean accuracies for each congruent and incongruent condition separately, using a mixed analysis of variance (ANOVA), with SNR as a within-subjects factor and AQ group as a between-subjects factor. For the audio and visual conditions, we separately analyzed the mean accuracies using an independent samples t-test between the high and averaged AQ groups.
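As an illustration of how such per-condition accuracies can be computed (a sketch under an assumed data layout, not the authors' R code), consider:

```python
# Illustrative sketch: 'accuracy' is the proportion of key-press responses
# matching the target syllable for a given condition.
def accuracy(responses, target):
    """responses: list of key presses ('pa', 'ta', 'ka'); target: correct syllable."""
    return sum(r == target for r in responses) / len(responses)

# Hypothetical McGurk (incongruent) trials with auditory /pa/: the /pa/
# response rate indexes the *weakness* of the McGurk effect.
trials = ['ta', 'pa', 'ta', 'ka', 'pa', 'ta']
print(accuracy(trials, 'pa'))  # 2 of 6 responses match /pa/
```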
Results and discussion
The mean accuracies of the congruent and incongruent conditions in both AQ groups are summarized in Figure 1. To examine the relationship between task performance and AQ score, we performed a mixed-model ANOVA with SNR as a within-subjects factor and AQ group as a between-subjects factor for both the congruent and incongruent audiovisual conditions. In the congruent condition, there was neither a significant main effect of group, F (1, 24) = 1.62, p = .22, partial η2 = .06, nor a significant interaction, F (1, 24) = .02, p = .89, partial η2 = .001; there was, however, a marginally significant main effect of SNR, F (1, 24) = 3.47, p = .07, partial η2 = .13. In the incongruent condition, the ANOVA revealed a marginally significant main effect of SNR, F (1, 24) = 3.02, p = .09, partial η2 = .11, and a significant interaction, F (1, 24) = 6.80, p = .02, partial η2 = .22, although the main effect of group was not significant, F (1, 24) = 2.58, p = .12, partial η2 = .09. Analysis of simple main effects showed that the high AQ group had significantly higher accuracy than the averaged AQ group, that is, a weaker McGurk effect, only when the stimuli were presented with a low level of background noise, F (1, 24) = 4.71, p = .04, partial η2 = .20.
Figure 1.
Results for all conditions in both groups. The accuracies were averaged in both Autism-spectrum Quotient (AQ) groups. (a) congruent condition, (b) incongruent condition, and (c) the visual and audio conditions. The error bars are the standard errors. An asterisk indicates a significant difference (p < .05).
For the audio and visual conditions, the mean accuracies for both AQ groups are summarized in Figure 1c. In the audio condition, an independent samples t-test showed that accuracy did not differ significantly between groups, t (24) = .49, p = .63, d = .19. The same result was found in the visual condition, t (24) = .14, p = .89, d = .06, indicating no difference in lip-reading ability between the high and averaged AQ groups.
Drawing on the dimensional model of ASD, we performed experiment 1 to examine (1) whether the weak McGurk effect in individuals with high autistic traits is caused by poorer lip-reading ability and (2) whether it is related to the hearing environment. Our results showed that individuals with high autistic traits have intact lip-reading ability. Furthermore, a weaker McGurk effect was observed in individuals with high autistic traits, but only when the level of background noise was low.
Given these results, we consider that the level of background noise, rather than individual differences in lip-reading ability, modifies individual differences in the McGurk effect. On the other hand, looking at the magnitude of the McGurk effect in experiment 1, individuals with high AQs also showed the McGurk effect under both noise conditions. Therefore, our results failed to provide strong evidence for the weaker McGurk effect in individuals with high AQs demonstrated in our prior studies (e.g. Ujiie et al. 2015a, 2018). To provide further support for our hypothesis, we added a without-noise condition in experiment 2 to confirm whether individuals with high AQs show a weaker McGurk effect when auditory information is salient.
Experiment 2
Methods
Participants
In order to recruit individuals with high autistic traits, we conducted a screening using the Japanese version of the AQ (Wakabayashi et al. 2006) during an introductory psychology class. Five hundred and four university students (282 males and 222 females) completed the AQ in a group setting. All participants provided informed consent and took part in the study voluntarily. Table 2 shows the resulting AQ scores, which ranged from 5 to 43 (M = 21.5, SD = 6.7). As in experiment 1, we selected a high AQ group and an averaged AQ group. Finally, 15 volunteers from the high AQ group and 15 counterpart volunteers from the averaged AQ group participated in the following experiment. All participants were native Japanese speakers, reported normal (or corrected) hearing and vision, and were naïve to the purpose of these experiments. The study was performed in accordance with the requirements of the local ethics committee.
Table 2.
Means and standard deviations (SD) of AQ among all participants, high AQ Group, and averaged AQ group.
| | All participants (N = 504) | High AQ group (N = 15) | Averaged AQ group (N = 15) |
|---|---|---|---|
| Males | 291 | 5 | 3 |
| Females | 213 | 10 | 12 |
| Ages (years) | 18.5 (0.52) | 18.5 (0.52) | 18.7 (1.33) |
| AQ total scores | 21.5 (6.69) | 33.1 (3.74) | 10.6 (2.26) |
| Social skill | 3.8 (2.71) | 7.7 (1.76) | 1.0 (1.36) |
| Attention switching | 5.4 (1.88) | 7.7 (1.71) | 3.3 (1.84) |
| Attention to detail | 4.9 (2.14) | 5.6 (2.82) | 3.9 (2.00) |
| Communication | 3.8 (2.19) | 6.4 (1.40) | 1.3 (1.05) |
| Imagination | 3.4 (1.87) | 5.7 (1.49) | 1.1 (0.88) |
Note. AQ = Autism-spectrum Quotient Japanese version (Wakabayashi et al. 2006), an assessment of the level of autistic traits. This questionnaire contains 50 items rated on a four-point Likert scale (from ‘agree’ to ‘disagree’). Each item was then scored for 0 or 1, with a possible range in total scores from 0 to 50.
Stimuli, apparatus, and procedure
We used the audiovisual congruent stimuli (e.g. auditory /pa/ and visual /pa/) and audiovisual incongruent stimuli (auditory /pa/, visual /ka/) from experiment 1. To enhance the saliency of the auditory information, we removed the auditory noise from these stimuli. The experimental task consisted of audiovisual congruent and audiovisual incongruent stimuli, each of which included 36 trials. The order of trials was randomized within a block. We used the same apparatus and procedure as in experiment 1.
Data analysis
Statistical analysis was conducted using R version 3.6.1 for Windows (R Foundation for Statistical Computing, Vienna, Austria). As in experiment 1, we defined the ratio of correct responses to the auditory voices as 'accuracies'. To examine the effect of autistic traits, we analyzed the mean accuracies using a mixed-model ANOVA with group (high AQ and averaged AQ) as a between-participants factor and stimulus condition (audiovisual congruent and audiovisual incongruent) as a within-participants factor.
Results and discussion
Figure 2 shows the results for the congruent and incongruent conditions in both AQ groups. The mixed-model ANOVA revealed significant main effects of group, F (1, 28) = 4.68, p = .04, partial η2 = .14, and stimulus condition, F (1, 28) = 179.25, p < .001, partial η2 = .86, and a significant interaction between group and stimulus condition, F (1, 28) = 4.54, p = .04, partial η2 = .14. Analysis of this interaction revealed a significant simple main effect of group in the audiovisual incongruent condition, F (1, 28) = 4.62, p = .04, partial η2 = .14, indicating that participants with high AQ scores showed weaker visual responses than those with averaged AQ scores in the incongruent condition. Further, the simple main effects of stimulus condition were significant in both groups: high AQ, F (1, 28) = 63.38, p < .001, partial η2 = .30, and averaged AQ, F (1, 28) = 120.41, p < .001, partial η2 = .57.
Figure 2.
Results for congruent and incongruent conditions in both groups. The error bars are the standard errors. An asterisk indicates a significant difference (p <.05). Note. AQ = Autism-spectrum Quotient Japanese version (Wakabayashi et al. 2006).
In experiment 2, we confirmed whether individuals with high AQs show a weaker McGurk effect in the without-noise condition, where auditory information is salient. Our results revealed that the high AQ group showed less visual influence in the audiovisual incongruent condition than did the averaged AQ group, although no difference was found in the audiovisual congruent condition. These results are consistent with our prior findings that autistic traits predict a weaker McGurk effect in populations not diagnosed with ASD (Ujiie et al. 2015a, 2015b, 2018).
General discussion
In this study, we conducted two experiments to investigate (1) whether the weaker McGurk effect in individuals with high autistic traits is caused by poor lip-reading ability, and (2) whether the hearing environment modifies the weaker McGurk effect in individuals with high autistic traits. Based on the dimensional model of ASD (Baron-Cohen et al. 2001b, Frith 1991, Happé and Frith 2006), we used an analogue design, which allowed us to study ASD in the general population, and thus to control for potentially confounding factors in the clinical group with ASD.
The lip-reading ability in individuals with high autistic traits
Previous studies have shown mixed results regarding the lip-reading ability of individuals with ASD, which can be separated into the three propositions mentioned in the Introduction: individuals with diagnosed ASD have poor lip-reading ability (e.g. Smith and Bennetto 2007, Williams et al. 2004), a delay in the development of this ability (Taylor et al. 2010), or intact lip-reading ability (e.g. De Gelder et al. 1991, Saalasti et al. 2012). The latter two propositions are not in opposition in the case of adults with ASD, because both predict intact lip-reading ability in adulthood. More recent studies have supported these propositions (Bebko et al. 2014, Stevenson et al. 2014). In this study, we defined accuracy in the visual condition as lip-reading ability and found no difference between the high and averaged AQ groups. Although the analogue design has been criticized as an indirect investigation of ASD, the high AQ group in this study showed AQ scores similar to those of a group of individuals diagnosed with ASD (Woodbury-Smith et al. 2005). In addition, a correlation analysis showed no significant relationship between AQ and lip-reading ability in this study, r(26) = .08, n.s. These findings suggest that poor lip-reading ability may be present in only some individuals with diagnosed ASD, supporting the analogue perspective on ASD (e.g. Ujiie et al. 2018).
The effect of noise in the McGurk effect
The difference in response rates in the incongruent condition between the two experiments indicates the impact of noise on the McGurk effect. Table 3 summarizes the mean response ratios in the incongruent conditions with noise (experiment 1) and without noise (experiment 2). In addition, the relationships between AQ and the audio response rate for incongruent stimuli are shown for the two SNR conditions in Figure 3 (experiment 1) and for the no-noise condition in Figure 4 (experiment 2). In the incongruent condition, there were three possible types of responses: an audio response (/pa/ response), a fused response (/ta/ response), and a visual response (/ka/ response). In general, each response rate fluctuates with the signal-to-noise level (Saalasti et al. 2011, 2012, Sekiyama and Tohkura 1991). In experiment 1, where background noise was played, visual and fused responses traded off with each other, while audio responses were rare. In contrast, in experiment 2, without noise, audio and fused responses traded off, while visual responses were rare. These results demonstrate the effect of noise on the McGurk effect, showing that a higher level of background noise forces our perceptual system to put more weight on visual cues even for audiovisual incongruent speech (Saalasti et al. 2011, 2012, Ujiie et al. 2015a, 2015b).
Table 3.
Means and standard deviations (SD) of the response ratio toward the incongruent (McGurk) stimuli in experiment 1 and experiment 2.
| Total | High AQ group | Averaged AQ group | |
|---|---|---|---|
| Experiment 1 | |||
| With noise (SNR = –5 dB) | |||
| /pa/response | .02 (.03) | .02 (.04) | .01 (.02) |
| /ta/response | .41 (.27) | .39 (.20) | .44 (.34) |
| /ka/response | .57 (.28) | .59 (.22) | .55 (.35) |
| With noise (SNR = +5dB) | |||
| /pa/response | .03 (.05) | .05 (.07) | .01 (.01) |
| /ta/response | .65 (.25) | .67 (.18) | .60 (.31) |
| /ka/response | .32 (.26) | .28 (.25) | .39 (.31) |
| Experiment 2 | |||
| Without noise | |||
| /pa/response | .25 (.30) | .36 (.35) | .15 (.18) |
| /ta/response | .72 (.30) | .62 (.40) | .81 (.21) |
| /ka/response | .03 (.08) | .02 (.07) | .04 (.09) |
Figure 3.
A relationship between AQ and audio response rate for audiovisual incongruent stimuli in two SNR conditions (in experiment 1).
Figure 4.
A relationship between AQ and audio response rate for audiovisual incongruent stimuli in no noise condition (in experiment 2).
The McGurk effect in individuals with high autistic traits
Regarding the weaker McGurk effect in individuals with high autistic traits, we had two hypotheses: (1) if individuals with high autistic traits have difficulty in audiovisual speech integration, they would experience a weaker McGurk effect regardless of the level of background noise; (2) if individuals with high autistic traits have a tendency not to rely on visual speech, a weaker McGurk effect might arise when they observe McGurk stimuli in a situation with a lower level of background noise. Our results support the latter hypothesis: a weaker McGurk effect in individuals with high autistic traits was found under the without-noise condition (experiment 2) but disappeared when the level of background noise was high (the –5 dB condition in experiment 1). Indeed, a similar tendency has been found in clinical studies: individuals diagnosed with ASD showed more audio responses and a weaker McGurk effect than TD controls under a without-noise condition (De Gelder et al. 1991), whereas under noisy conditions, individuals with diagnosed ASD showed more fused responses and fewer visual responses (Saalasti et al. 2012). Accordingly, a possible explanation for our results is that individuals with high autistic traits tend to rely less on visual input during audiovisual information processing, even more so when the auditory input is particularly salient.
Another possible explanation: hyper- and hyposensitivity
Even if that were the case, the mechanism behind the weaker McGurk effect in individuals with high autistic traits remains a topic of debate. For instance, the weak reliance on visual input in individuals with high autistic traits could result from either hypersensitivity in hearing or hyposensitivity in vision. Both hyper- and hyposensitivity are symptoms of ASD according to the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5; American Psychiatric Association 2013). Behavioral and neuropsychological data have shown that over 90% of individuals with diagnosed ASD have sensory problems, manifested as either hyper- or hyposensitivity in at least one domain (e.g. Crane et al. 2009, Dunn and Westman 1997, Tomchek and Dunn 2007), such as the visual (Simmons et al. 2009), auditory (Haesen et al. 2011, O’Connor 2012), tactile (Foss-Feig et al. 2012), or vestibular domain (Baker et al. 2008). Furthermore, individuals with high AQ scores have reported experiencing abnormal sensory events more frequently (Ujiie et al. 2015). Future studies should examine the relationship between the McGurk effect and the severity of hyper- or hyposensitivity in vision and audition.
Another possible explanation: weak central coherence
Another possibility is that the cognitive specificity described by the weak central coherence (WCC) account underlies individual differences in the McGurk effect. A main claim of WCC is that individuals with high autistic traits show a local bias, preferring to process local features over global features in cognitive competency tests such as the Embedded Figures Test (EFT) or the Navon-type global-local naming task (Happé and Frith 2006). This local bias leads to difficulty in grouping stimuli, which may cause a weaker reliance on visual input during multisensory processing. Beyond the audiovisual domain (Donohue et al. 2012, Stevenson et al. 2014), weak integration in ASD has been reported in other instances of multisensory integration, such as the rubber hand illusion, which involves the integration of visual, tactile, and proprioceptive input (Paton et al. 2012, Palmer et al. 2013). Whether the WCC model explains the weak binding of multisensory inputs in ASD, including the weaker McGurk effect, should therefore be examined in future studies.
In conclusion, we suggest that the weaker McGurk effect observed in individuals with high autistic traits in past studies might appear only when the auditory input is salient. It does not appear to result from poor lip-reading ability, which is a relatively uncommon problem in individuals with diagnosed ASD. High background noise might shift weight onto the visual cue, thereby increasing the strength of the McGurk effect among individuals with high autistic traits. On the other hand, our results must be interpreted with caution. We compared a small sample of individuals with average autistic traits and individuals with higher autistic traits, and in general, a small sample size weakens the reliability of statistical analysis. Indeed, only a marginal main effect of SNR emerged from the analysis of this small sample, although the effect was relatively robust in previous studies (Sekiyama and Burnham 2008). Additionally, it remains to be examined whether results from individuals with higher autistic traits can be generalized to individuals diagnosed with ASD. Given our findings herein as a first step, the next step is to compare the McGurk effect across three samples: individuals with average autistic traits, individuals with higher autistic traits, and individuals diagnosed with ASD.
Acknowledgement
We thank all the students for voluntarily participating in our experiments.
Funding Statement
This study was supported by a Grant-in-Aid for Japan Society for the Promotion of Science Fellows (Grant No. 19J00722), a Grant-in-Aid for Research Activity (Grant No. 19K20650), and a Grant-in-Aid for Scientific Research on Innovative Areas (Grant No. 16H06526, MEXT).
Disclosure statement
The authors have no competing financial interests to declare.
Ethics
Informed consent was obtained from all participants, and the study was performed in accordance with the guidelines of the local ethics committee.
References
- American Psychiatric Association . 2013. Diagnostic and statistical manual of mental disorders. 5th ed. Washington, DC: APA. [Google Scholar]
- Austin, E. J. 2005. Personality correlates of the broader autism phenotype as assessed by the Autism Spectrum Quotient (AQ). Personality and Individual Differences, 38, 451–460. [Google Scholar]
- Baker, A. E., Lane, A., Angley, M. T. and Young, R. L.. 2008. The relationship between sensory processing patterns and behavioural responsiveness in autistic disorder: A pilot study. Journal of Autism and Developmental Disorders, 38, 867–875. [DOI] [PubMed] [Google Scholar]
- Baron-Cohen, S., Leslie, A. M. and Frith, U.. 1986. Mechanical, behavioural and intentional understanding of picture stories in autistic children. British Journal of Developmental Psychology, 4, 113–125. [Google Scholar]
- Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y. and Plumb, I.. 2001a. The ‘Reading the Mind in the eyes’ test revised version: A study with normal adults, and adults with Asperger syndrome or high-functioning autism. Journal of Child Psychology and Psychiatry, 42, 241–252. [PubMed] [Google Scholar]
- Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J. and Clubley, E.. 2001b. The Autism-Spectrum Quotient (AQ): Evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. Journal of Autism and Developmental Disorders, 31, 5–17. [DOI] [PubMed] [Google Scholar]
- Bebko, J. M., Schroeder, J. H. and Weiss, J. A.. 2014. The McGurk effect in children with autism and Asperger syndrome. Autism Research, 7, 50–59. [DOI] [PubMed] [Google Scholar]
- Crane, L., Goddard, L. and Pring, L.. 2009. Sensory processing in adults with autism spectrum disorders. Autism, 13, 215–228. [DOI] [PubMed] [Google Scholar]
- De Gelder, B. D., Vroomen, J. and Van der Heide, L.. 1991. Face recognition and lip-reading in autism. European Journal of Cognitive Psychology, 3, 69–86. [Google Scholar]
- Deruelle, C., Rondan, C., Gepner, B. and Tardif, C.. 2004. Spatial frequency and face processing in children with autism and Asperger syndrome. Journal of Autism and Developmental Disorders, 34, 199–210. [DOI] [PubMed] [Google Scholar]
- Donohue, S. E., Darling, E. F. and Mitroff, S. R.. 2012. Links between multisensory processing and autism. Experimental Brain Research, 222, 377–387. [DOI] [PubMed] [Google Scholar]
- Dunn, W. and Westman, K.. 1997. The sensory profile: The performance of a national sample of children without disabilities. American Journal of Occupational Therapy, 51, 25–34. [DOI] [PubMed] [Google Scholar]
- Frith, U. 1991. Autism and Asperger’s syndrome. Cambridge: Cambridge University Press. [Google Scholar]
- Foss-Feig, J. H., Heacock, J. L. and Cascio, C. J.. 2012. Tactile responsiveness patterns and their association with core feature in autism spectrum disorders. Research in Autism Spectrum Disorders, 6, 337–344. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Joseph, R. M. and Tanaka, J.. 2002. Holistic and part-based face recognition in children with autism. Journal of Child Psychology and Psychiatry, 43, 1–14. [DOI] [PubMed] [Google Scholar]
- Haesen, B., Boets, B. and Wagemans, J.. 2011. A review of behavioral and electrophysiological studies on auditory processing and speech perception in autism spectrum disorders. Research in Autism Spectrum Disorders, 5, 701–714. [Google Scholar]
- Happé, F. and Frith, U.. 2006. The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders, 36, 5–25. [DOI] [PubMed] [Google Scholar]
- Hasegawa, C., Kikuchi, M., Yoshimura, Y., Hiraishi, H., Munesue, T., Nakatani, H., Higashida, H., Asada, M., Oi, M. and Minabe, Y.. 2015. Broader autism phenotype in mothers predicts social responsiveness in young children with autism spectrum disorders. Psychiatry and Clinical Neurosciences, 69, 136–144. [DOI] [PubMed] [Google Scholar]
- Hoekstra, R. A., Bartels, M., Cath, D. C. and Boomsma, D. I.. 2008. Factor structure, reliability and criterion validity of the Autism-Spectrum Quotient (AQ): A study in Dutch population and patient groups. Journal of Autism and Developmental Disorders, 38, 1555–1566. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hofvander, B., Delorme, R., Chaste, P., Nydén, A., Wentz, E., Ståhlberg, O., Herbrecht, E., Stopin, A., Anckarsäter, H., Gillberg, C., Råstam, M. and Leboyer, M.. 2009. Psychiatric and psychosocial problems in adults with normal-intelligence autism spectrum disorders. BMC Psychiatry, 9, 35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Iarocci, G., Rombough, A., Yager, J., Weeks, D. and Chua, R.. 2010. Visual influences on speech perception in children with autism. Autism, 14, 305–320. [DOI] [PubMed] [Google Scholar]
- Irwin, J. R., Tornatore, L. A., Brancazio, L. and Whalen, D.. 2011. Can children with autism spectrum disorders “hear” a speaking face? Child Development, 82, 1397–1403. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lau, W. Y. P., Kelly, A. B. and Peterson, C.C.. 2013. Further evidence on the factorial structure of the autism spectrum quotient (AQ) for adults with and without a clinical diagnosis of autism. Journal of Autism and Developmental Disorders, 43, 2807–2815. [DOI] [PubMed] [Google Scholar]
- Magnotti, J. F., Basu Mallick, D., Feng, G., Zhou, B., Zhou, W. and Beauchamp, M. S.. 2015. Similar frequency of the McGurk effect in large samples of native Mandarin Chinese and American English speakers. Experimental Brain Research, 233, 2581–2586. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McGurk, H. and MacDonald, J. W.. 1976. Hearing lips and seeing voices. Nature, 264, 746–748. [DOI] [PubMed] [Google Scholar]
- O’Connor, K. 2012. Auditory processing in autism spectrum disorder: A review. Neuroscience & Biobehavioral Reviews, 36, 836–854. [DOI] [PubMed] [Google Scholar]
- Palmer, C. J., Paton, B., Hohwy, J. and Enticott, P. G.. 2013. Movement under uncertainty: The effects of the rubber-hand illusion vary along the nonclinical autism spectrum. Neuropsychologia, 51, 1942–1951. [DOI] [PubMed] [Google Scholar]
- Paton, B., Hohwy, J. and Enticott, P. G.. 2012. The rubber hand illusion reveals proprioceptive and sensorimotor differences in autism spectrum disorders. Journal of Autism and Developmental Disorders, 42, 1870–1883. [DOI] [PubMed] [Google Scholar]
- Reed, P., Lowe, C. and Everett, R.. 2011. Perceptual learning and perceptual search are altered in male university students with higher Autism Quotient scores. Personality and Individual Differences, 51, 732–736. [Google Scholar]
- Saalasti, S., Tiippana, K., Kätsyri, J. and Sams, M.. 2011. The effect of visual spatial attention on audiovisual speech perception in adults with Asperger syndrome. Experimental Brain Research, 213, 283–290. doi: 10.1007/s00221-011-2751-7. [DOI] [PubMed] [Google Scholar]
- Saalasti, S., Kätsyri, J., Tiippana, K., Laine-Hernandez, M., von Wendt, L. and Sams, M.. 2012. Audiovisual speech perception and eye gaze behavior of adults with Asperger syndrome. Journal of Autism and Developmental Disorders, 42, 1606–1615. [DOI] [PubMed] [Google Scholar]
- Sekiyama, K. 1994. Difference in auditory-visual speech perception between Japanese and Americans: McGurk effect as a function of incompatibility. Journal of the Acoustical Society of Japan (E), 15, 143–158. [Google Scholar]
- Sekiyama, K. and Burnham, D.. 2008. Impact of language on development of auditory-visual speech perception. Developmental Science, 11, 306–320. [DOI] [PubMed] [Google Scholar]
- Sekiyama, K. and Tohkura, Y.. 1991. McGurk effect in non-English listeners: Few visual effects for Japanese subjects hearing Japanese syllables of high auditory intelligibility. The Journal of the Acoustical Society of America, 90, 1797–1805. [DOI] [PubMed] [Google Scholar]
- Simmons, D. R., Robertson, A. E., McKay, L. S., Toal, E., McAleer, P. and Pollick, F. E.. 2009. Vision in autism spectrum disorders. Vision Research, 49, 2705–2739. [DOI] [PubMed] [Google Scholar]
- Smith, E. G. and Bennetto, L.. 2007. Audiovisual speech integration and lipreading in autism. Journal of Child Psychology and Psychiatry, 48, 813–821. [DOI] [PubMed] [Google Scholar]
- Stevenson, R. A., Siemann, J. K., Schneider, B. C., Eberly, H. E., Woynaroski, T. G., Camarata, S. M. and Wallace, M. T.. 2014. Multisensory temporal integration in autism spectrum disorders. The Journal of Neuroscience, 34, 691–697. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stewart, M. E. and Ota, M.. 2008. Lexical effects on speech perception in individuals with “autistic” traits. Cognition, 109, 157–162. [DOI] [PubMed] [Google Scholar]
- Taylor, N., Isaac, C. and Milne, E.. 2010. A comparison of the development of audiovisual integration in children with autism spectrum disorders and typically developing children. Journal of Autism and Developmental Disorders, 40, 1403–1411. [DOI] [PubMed] [Google Scholar]
- Tomchek, S. D. and Dunn, W.. 2007. Sensory processing in children with and without autism: A comparative study using the Short Sensory Profile. American Journal of Occupational Therapy, 61, 190–200. [DOI] [PubMed] [Google Scholar]
- Ujiie, Y., Asai, T. and Wakabayashi, A.. 2015a. The relationship between level of autistic traits and local bias in the context of the McGurk effect. Frontiers in Psychology, 6, 891. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ujiie, Y., Asai, T., Tanaka, A. and Wakabayashi, A.. 2015b. The McGurk effect and autistic traits: An analogue perspective. Letters on Evolutionary Behavioral Science, 6, 9–12. [Google Scholar]
- Ujiie, Y., Asai, T. and Wakabayashi, A.. 2018. Individual differences and the effect of face configuration contexts in the McGurk effect. Experimental Brain Research, 236, 973. [DOI] [PubMed] [Google Scholar]
- Wakabayashi, A., Baron-Cohen, S., Wheelwright, S. and Tojo, Y.. 2006. The Autism-Spectrum Quotient (AQ) in Japan: A crosscultural comparison. Journal of Autism and Developmental Disorders, 36, 263–270. [DOI] [PubMed] [Google Scholar]
- Williams, J. H. G., Massaro, D. W., Peel, N. J., Bosseler, A. and Suddendorf, T.. 2004. Visual–auditory integration during speech imitation in autism. Research in Developmental Disabilities, 25, 559–575. [DOI] [PubMed] [Google Scholar]
- Woodbury-Smith, M. R., Robinson, J., Wheelwright, S. and Baron-Cohen, S.. 2005. Screening adults for Asperger syndrome using the AQ: A preliminary study of its diagnostic validity in clinical practice. Journal of Autism and Developmental Disorders, 35, 331–335. [DOI] [PubMed] [Google Scholar]
- Zhang, J., Meng, Y., He, J., Xiang, Y., Wu, C., Wang, S. and Yuan, Z.. 2019. McGurk Effect by individuals with autism spectrum disorder and typically developing controls: A systematic review and meta-analysis. Journal of Autism and Developmental Disorders, 49, 34–43. [DOI] [PubMed] [Google Scholar]