Abstract
Purpose:
According to the speech attunement framework, autistic individuals lack the ability and/or motivation to “tune up” their speech to the same level of precision as their neurotypical peers. However, previous studies exploring the articulatory patterns of autistic individuals have yielded disparate findings. One reason for these contrasting conclusions may be that studies have relied on perceptual measures of articulation. Here, we use an objective acoustic measure of articulatory precision to explore the articulatory patterns of autistic children and adults.
Method:
This was a retrospective analysis of an existing corpus of 900 recorded speech samples taken from 30 adult and 30 child participants across two different population groups: autistic individuals (autism spectrum disorder [ASD] group) and neurotypical individuals (neurotypical [NT] group). Articulatory precision scores were calculated using an automated metric that compares observed acoustics to the expected acoustics for each phoneme production. Linear mixed-effects models were used to compare the articulatory precision scores across population group (i.e., ASD group vs. NT group) and to test whether these differences were moderated by age group (i.e., children vs. adults).
Results:
The speech of autistic individuals was characterized by reduced articulatory precision relative to their neurotypical peers. This pattern was not significantly moderated by age, indicating it occurred in both the children and adult groups.
Conclusions:
Our preliminary findings indicate that imprecise articulation may be a characteristic of the speech of autistic individuals in both childhood and adulthood. These findings are in line with predictions posited by the speech attunement framework. Given the current lack of speech markers for this clinical population and the importance of speech quality in the social integration of autistic individuals, our results advance articulatory precision as a viable and important target for future research.
From its earliest descriptions, atypical speech has been identified as a common characteristic of autism spectrum disorder (ASD). Perceptually, differences in the speech of autistic¹ individuals are readily apparent. For example, blinded listeners have rated autistic individuals as sounding more atypical than their neurotypical peers (Nadig & Shaw, 2012; Redford et al., 2018), even when audio samples were linguistically identical (Filipe et al., 2014). Such disparities are not without their consequences. Several studies found that, after listening to speech samples, research participants perceived autistic individuals as being less likable (Redford et al., 2018) and more awkward (Bone et al., 2015; Grossman, 2015) than neurotypical individuals. Furthermore, these negative perceptions occurred even when samples were identical in content or when extraneous factors such as linguistic ability and IQ were controlled. In a study by Sasson et al. (2017), participants not only indicated more negative first impressions of autistic than neurotypical individuals but also indicated that these impressions would likely affect their behavior toward that person (i.e., they would be less likely to live near or spend time with that person). Furthermore, this pattern was not present when participants were asked to make judgments based solely on written transcripts of the same speech samples, indicating that participants largely based their judgments on speech production rather than linguistic content.
Given the perceptual abnormalities and associated negative consequences, documenting the speech behaviors of autistic individuals is important. Such information can be used to improve our theoretical understanding of the ASD phenotype, identify markers of ASD to supplement existing diagnostic tools, and, where appropriate, develop effective treatment approaches. However, although there is a general consensus that the speech of autistic individuals sounds atypical, research regarding what is actually driving these perceptual differences is much less clear. Subjectively, the speech of autistic individuals has been described as too monotone, too singsongy, too loud, too quiet, too fast, and too slow (e.g., Goldfarb et al., 1972; Nadig & Shaw, 2012; Patel et al., 2020; Shriberg et al., 2011). Objective acoustic measures have also yielded disparate findings with more studies showing nonsignificant than significant results. For example, though some studies indicate higher fundamental frequency in autistic individuals relative to neurotypical peers (e.g., Kissine & Geelhand, 2019; Sharda et al., 2010), many do not (e.g., Depape et al., 2012; Diehl et al., 2009; Grossman et al., 2010). Similar inconsistencies can be observed in studies of intensity (e.g., Diehl & Paul, 2012; Hubbard & Trauner, 2007; Ochi et al., 2019) and speech rate (e.g., Bonneh et al., 2011; Dahlgren et al., 2018; Nadig & Shaw, 2012). In their systematic review of 45 studies examining acoustic speech characteristics of autistic individuals, Fusaroli et al. (2017) concluded that the cause of the abnormal perceptual quality is still unclear and cannot be fully explained by any of these previously studied features.
One area that may contribute to the unusual perceptual quality in the speech of autistic individuals is that of articulation. In the speech attunement framework, Shriberg et al. (2011) posit that enhanced auditory perceptual abilities (e.g., Järvinen-Pasley et al., 2008; Mottron et al., 2006) allow verbal autistic individuals to “tune in” to the speech models around them, enabling relatively intact speech abilities. However, they lack the motivation and/or ability to “tune up” their speech to the level of precision exhibited by neurotypical individuals. Empirical research to support this is, however, disparate. Although some studies suggest that atypical articulatory patterns are present in the speech of autistic individuals (e.g., Cleland et al., 2010; Shriberg et al., 2001), others suggest that their articulatory skills are relatively spared (e.g., Kjelgaard & Tager-Flusberg, 2001; McCann et al., 2007). One reason for these contrasting conclusions may be the heterogeneity of this population (e.g., Masi et al., 2017) in conjunction with differences in inclusion and exclusion criteria between studies. For example, although some studies only consider individuals with high-functioning autism (e.g., Shriberg et al., 2001), others employ less strict inclusion criteria (e.g., Kjelgaard & Tager-Flusberg, 2001). Another reason for these discrepancies may be that the majority of studies have relied on perceptual measures of articulation. For example, a standardized perceptual articulation assessment, such as the Goldman–Fristoe Test of Articulation (Goldman & Fristoe, 1986), may be administered, or the percentage of consonants (or phonemes) identified as perceptually correct may be calculated. Within these types of assessments, speech sound errors are identified using phonetic transcriptions; however, such methods have been shown to yield poor reliability (Kent, 1996).
Even in instances where reliability is strong, if autistic individuals are simply lacking the mechanisms to “tune up” their speech, subtle abnormalities may not be captured by the binary categorization of speech sounds as correct or incorrect. That is, speech sounds of autistic individuals may be close enough to the intended target to be classified as a correct production while still being less precise than productions of neurotypical individuals, leading to subtle but acoustically and perceptually noticeable differences. In their review of articulation in the speech of autistic individuals, McKeever et al. (2019) noted the “inconsistent and reduced specificity of the perceptual measurements used across studies” and suggested the need for finer-grained techniques to more fully understand articulatory deficits within this population.
In order to combat these limitations, the purpose of this initial study is to explore the articulatory patterns of autistic individuals using an objective and holistic measure of articulation focused specifically on articulatory precision. Defined by Stegmann et al. (2020, p. 132) as “the match between expected and observed acoustic features for each phoneme,” articulatory precision (by this definition) represents a gestalt metric of speech that accounts for both overt speech sound errors (i.e., substitutions and deletions) and more subtle deviations from production norms. To quantify articulatory precision, we use an automated measure, which, though relatively novel, is strongly correlated with intelligibility and has been used and validated in studies examining the speech of clinical populations (Borrie et al., 2020, 2022). Importantly, this measure offers several advantages over more commonly used measures of articulation such as standardized articulation assessments or calculation of percent consonants correct. First, this measure is nonbinary. Rather than dichotomize speech sounds as correct or incorrect, articulation is quantified across a continuum, allowing for a more nuanced consideration of articulation. In addition, this measure objectively quantifies acoustic data, increasing the overall reliability of assessment. A final benefit of examining articulatory precision is that it provides an objective method for exploring subjective differences noted in this population. Redford et al. (2018) found that listeners rated the speech of autistic children as less clear and less articulate than neurotypical children. Furthermore, these perceptual ratings were strong predictors of novel listener ratings of likability, as well as judgments regarding whether the speaker was “disordered” or “typical.” Thus, using a measure that specifically examines the “clarity” or “articulateness” of speech may provide objective data to support existing subjective evidence of the role of precision in the unique speech quality in this clinical population.
Because symptoms of ASD often manifest themselves differently in childhood and adulthood (e.g., Matthews et al., 2015; Seltzer et al., 2003), we examine these patterns in two age groups: children (ages 6–14 years) and adults (ages 20–40 years). Specifically, we perform a retrospective analysis of an existing corpus of 900 speech samples from 60 participants (Wynn et al., 2018) to address the following two research questions: (a) Are there differences in an objective measure of articulatory precision between autistic individuals and their neurotypical peers (i.e., ASD group vs. neurotypical group [NT group])? (b) Are these differences moderated by age group (i.e., children vs. adults)? Based on the speech attunement framework (Shriberg et al., 2011), we hypothesize that autistic individuals will display less precise speech than neurotypical individuals in both the children and the adult groups.
Method
Overview
To compare the articulatory precision of neurotypical and autistic children and adults, we carried out a retrospective analysis of an existing corpus of 900 recorded speech samples collected and described in detail in Wynn et al. (2018). The speech samples were elicited from 30 adult and 30 child participants across two different population groups: neurotypical individuals (NT group) and autistic individuals (ASD group). This study was carried out with ethical approval from the institutional review board at Utah State University. Further details are described under appropriate subheadings below.
Participants
The sample comprised 60 participants: 15 neurotypical children, 15 autistic children, 15 neurotypical adults, and 15 autistic adults. All participants were native speakers of American English. Additionally, participants had no hearing impairment by parent or self-report. Children in both the NT and ASD groups were between the ages of 6 and 14 years. An independent t test revealed no significant difference in age between the two groups, t(28) = 0.18, p = .86, d = 0.07. Adults in the two population groups were between the ages of 20 and 40 years. Age (mean, standard deviation, and range) and gender distribution are reported for each group in Table 1.
Table 1.
Group | Age M (years) | Age SD (years) | Age range (years) | Male (n) | Female (n) |
---|---|---|---|---|---|
NT adults | 23.94 | 1.34 | 21–25 | 11 | 4 |
ASD adults | 28.33 | 6.12 | 20–40 | 10 | 5 |
NT children | 9.92 | 2.06 | 6–14 | 11 | 4 |
ASD children | 10.07 | 2.56 | 6–14 | 11 | 4 |
Note. NT = neurotypical; ASD = autism spectrum disorder.
NT Group
Children in the NT group had no developmental delays or learning disabilities according to parent report. Receptive and expressive language skills were confirmed as being within normal limits using the Following Directions and Recalling Sentences subtests of the Clinical Evaluation of Language Fundamentals–Fifth Edition (CELF-5; Wiig et al., 2013). The average standard scores of the neurotypical children were 104.33 on the Following Directions subtest and 103.73 on the Recalling Sentences subtest. Adults in the NT group self-reported no developmental delays or learning disabilities. Language skills were confirmed to be within normal limits via an informal conversation with the experimenter.
ASD Group
Children in the ASD group had a parent-reported medical diagnosis of or educational eligibility for ASD. Parents of each child participant completed the caregiver response form from the Children's Communication Checklist–Second Edition (Bishop, 2006). Raw scores on social communication skills (i.e., inappropriate initiation, scripted language, use of context, and nonverbal communication) and autistic-like features (i.e., social relations and interests) were used to find an overall pragmatic score for each participant. Based on these scores, results of an independent t test confirmed significant differences between the neurotypical and autistic children groups on overall pragmatic functioning, t(28) = 9.81, p < .001, d = 3.58. In order to represent a more heterogeneous group of autistic individuals, children in this group were not required to have normal language skills but were required to demonstrate the ability to follow directions and complete the recording task (i.e., produce descriptions that were free from palilalia, echolalia, or idiosyncratic phrases; produce a description with vocabulary relating to the picture; and speak for the full 15 s without excessive delays or inappropriate pausing) through informal assessment during two practice trials. The average standard scores of the autistic children were 92.33 on the Following Directions subtest and 88.66 on the Recalling Sentences subtest of the CELF-5. Adults in the ASD group had received a diagnosis of ASD by a licensed clinical social worker or a clinical mental health counselor no more than 3 years before the experiment and had an IQ level of 90 or above. As with the child group, adults with ASD were not required to have language skills within normal limits but were required to demonstrate the ability to engage in and complete the speech elicitation task.
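The effect sizes reported alongside the two independent t tests (age and overall pragmatic functioning in the child groups) can be cross-checked with a short calculation. The sketch below assumes that d is Cohen's d based on a pooled standard deviation, which the text does not state explicitly; the function name is illustrative and not part of the original analysis. Under that assumption, d follows directly from t and the two group sizes (15 children per group, so 28 degrees of freedom).

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert an independent-samples t statistic to Cohen's d.

    Assumes d is defined with the pooled standard deviation, in which
    case d = t * sqrt(1/n1 + 1/n2).
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Values taken from the text: 15 NT children and 15 autistic children.
age_d = cohens_d_from_t(0.18, 15, 15)        # age comparison, t(28) = 0.18
pragmatic_d = cohens_d_from_t(9.81, 15, 15)  # pragmatic score, t(28) = 9.81

print(round(age_d, 2), round(pragmatic_d, 2))  # prints 0.07 3.58
```

Both results agree with the reported values of d = 0.07 and d = 3.58.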
Speech Samples
Fifteen individual speech samples were elicited and recorded for each participant in the study. During the elicitation task, participants were seated in front of a computer and fitted with a wireless headset (Astro A50 Wireless System). For each speech sample, participants were asked to watch a short audiovisual clip of a woman introducing a picture from a popular children's book. Due to the nature of the original study (Wynn et al., 2018), audiovisual clips were presented randomly at three speeds: 120% (fast), 100% (normal), and 80% (slow) of the original speed. Participants were then given 15 s to describe what they saw in the picture. Verbal descriptions were audio-recorded via the headset microphone. During the speech elicitation task, all participants were presented with the same 15 pictures, but the order of presentation was randomized. The entire speech elicitation task took approximately 15 min for participants to complete.
Articulatory Precision Analysis
Each of the 60 participants produced 15 speech samples, resulting in a data set of 900 audio files. Analysis of these speech samples to extract articulatory precision scores involved a series of steps, which can be viewed in Figure 1. First, using the Praat textgrid function (Boersma & Weenink, 2018), each speech sample was manually segmented for start/stop boundaries and orthographically transcribed by research assistants. In instances of misarticulation, orthographic transcription represented the intended target words rather than the speaker's actual speech production (i.e., a speaker's production of “tat” would be transcribed as “cat” when referring to a cat in the picture). Transcriptions included filler words, part-word repetitions, and whole-word repetitions. However, nonverbal sounds (e.g., laughing and coughing) were not transcribed. In order to ensure accuracy of the transcription process, all transcripts were checked by a second research assistant. Disagreements in transcriptions were then resolved through discussion with a third assistant. To maintain accuracy of the automated alignment process, each 15-s recording was divided at natural breaking points into segments containing fewer than 15 words. Segments were also divided to remove pauses and nonspeech sounds (e.g., coughing and laughing) greater than 0.5 s. This process resulted in 4,197 segments for analysis. Descriptive data regarding the speech samples, including means and standard deviations for the number of segments analyzed per utterance and number of words per segment, for the four groups of participants are shown in Table 2.
Table 2.
Variable | NT adults M | NT adults SD | ASD adults M | ASD adults SD | NT children M | NT children SD | ASD children M | ASD children SD |
---|---|---|---|---|---|---|---|---|
Segments per utterance | 4.99 | 0.91 | 4.56 | 0.98 | 4.58 | 1.00 | 4.55 | 0.98 |
Words per segment | 9.15 | 3.54 | 7.77 | 3.85 | 6.76 | 3.76 | 7.14 | 3.72 |
Note. NT = neurotypical; ASD = autism spectrum disorder.
Using the segmented recordings and associated orthographic transcripts, an automated measure of articulatory precision was extracted for each speech segment. This automated, objective measure, introduced by Tu et al. (2018) and subsequently used by other researchers (e.g., Borrie et al., 2022; Lubold et al., 2019; Stegmann et al., 2020), represents a gestalt measure of articulatory precision. In this method, phonemic transcripts are aligned with audio recordings at the phoneme level using a deep neural network acoustic model. Each phoneme is then compared to an acoustic counterpart, generated from the LibriSpeech corpus (Panayotov et al., 2015), a large corpus including 1,000 hr of read speech samples. The automated measure distills a high-dimensional representation of the speech spectrum (represented by Mel-frequency cepstral coefficients) into a likelihood ratio, which represents the similarity between the expected and observed phoneme. Articulation is more precise in read speech than in spontaneous speech (Johnson, 2004), and as a result, the more precise a speech segment is, the more similar it will be to the speech model created from read speech. A perfect match between the spontaneous speech phoneme and the target phoneme results in a score of 0. Negative scores indicate imprecise articulation; the further the score is from 0, the less precise that recorded phoneme is as compared to its target acoustic model. Although there is no lower limit, scores generally fall between 0 and −4. The scores for each phoneme are then averaged across each segment, resulting in one articulatory precision score per segment.
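The final aggregation step described above, in which per-phoneme scores are averaged into one articulatory precision score per segment, can be illustrated with a minimal sketch. This is not the automated measure itself: the alignment and likelihood-ratio computation are not reproduced here, and the function name, segment labels, and phoneme scores below are hypothetical values chosen only to show the averaging.

```python
from statistics import mean


def segment_precision(phoneme_scores):
    """Average per-phoneme precision scores into one segment-level score.

    Scores follow the convention in the text: 0 is a perfect match to the
    target model, and more negative values indicate less precise articulation.
    """
    if not phoneme_scores:
        raise ValueError("a segment needs at least one phoneme score")
    return mean(phoneme_scores)


# Hypothetical per-phoneme scores for two segments (illustrative only).
segments = {
    "seg_01": [-0.5, -1.0, -1.5],
    "seg_02": [-2.0, -1.0],
}

segment_scores = {name: segment_precision(scores) for name, scores in segments.items()}
print(segment_scores)  # prints {'seg_01': -1.0, 'seg_02': -1.5}
```

In this toy example, the second segment would be treated as less precise than the first because its mean score lies further below 0.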
Statistical Analysis
Linear mixed-effects models were conducted using the lme4 package in the R statistical environment (lme4 package Version 1.1-19 and R Version 3.5.2; Bates et al., 2015; R Development Core Team, 2019). This type of analysis was used to investigate the effects of population group (i.e., ASD group vs. NT group) on articulatory precision scores while controlling for the lack of independence in the data due to the repeated measures for each participant. For the models, the random effects structure included a random intercept by participant. To address the first research question, our first model (Model 1) was used to examine the effect of population on articulatory precision. In this model, the fixed effects included the between-participants factor of population. Additionally, for each segment, stimuli speed (i.e., slow, normal, and fast), number of words, and age group were included as fixed effects to control for potential differences in speech samples brought about by these factors. Thus, the specific formula for this model was lmer(AP ~ group + age + stimuli speed + number of words + (1 | participant)). To address the second research question, an additional model (Model 2) was used to examine age group as a moderating factor. This model included an interaction between population group and age group while controlling for stimuli speed and number of words. Thus, the specific formula for this model was lmer(AP ~ group * age + stimuli speed + number of words + (1 | participant)).
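The two model formulas can also be written out as equations. The notation below is a sketch rather than the authors' own: the indices (segment i within participant j), coefficient labels, and variable names are illustrative, stimuli speed is shown as two indicator terms with normal speed as the reference level (matching the fast and slow terms in Tables 3 and 4), and the normality of the random intercept and residual is the standard assumption of a linear mixed-effects model.

```latex
\begin{align}
% Model 1: main effect of population group, controlling for age group,
% stimuli speed, and number of words, with a random intercept per participant
\mathrm{AP}_{ij} &= \beta_0 + \beta_1\,\mathrm{Group}_j + \beta_2\,\mathrm{Age}_j
  + \beta_3\,\mathrm{Fast}_{ij} + \beta_4\,\mathrm{Slow}_{ij}
  + \beta_5\,\mathrm{Words}_{ij} + u_j + \varepsilon_{ij} \\
% Model 2: Model 1 plus the population-by-age interaction
\mathrm{AP}_{ij} &= \beta_0 + \beta_1\,\mathrm{Group}_j + \beta_2\,\mathrm{Age}_j
  + \beta_3\,\mathrm{Fast}_{ij} + \beta_4\,\mathrm{Slow}_{ij}
  + \beta_5\,\mathrm{Words}_{ij}
  + \beta_6\,(\mathrm{Group}_j \times \mathrm{Age}_j) + u_j + \varepsilon_{ij} \\
% Random intercept and residual error
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align}
```

In this notation, the first research question concerns the population coefficient in Model 1, and the second concerns the interaction coefficient in Model 2.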
Results
As illustrated in Figure 2, mean articulatory precision scores were highest for adults in the NT group (M = −1.01), followed by adults in the ASD group (M = −1.27), children in the NT group (M = −1.29), and children in the ASD group (M = −1.54). Results from Model 1 are displayed in Table 3. Addressing our first research question, our findings showed a significant effect of population on articulatory precision (b = 0.22, p = .02). Thus, when controlling for stimuli speed, number of words, and age group, autistic individuals spoke less precisely than neurotypical individuals. Results from Model 2 are displayed in Table 4. Addressing our second research question, our findings indicated no significant interaction between population group and age group (b = 0.07, p = .72). Thus, when controlling for stimuli speed and number of words, the differences in articulatory precision between autistic and neurotypical individuals were not moderated by age group.
Table 3.
Term | B | SE | t | p |
---|---|---|---|---|
Intercept | −1.67 | 0.09 | −18.56 | < .001*** |
**Population group** | 0.22 | 0.09 | 2.34 | .02* |
Age group | −0.19 | 0.09 | −2.04 | .05 |
Stimuli speed (fast) | −0.04 | 0.03 | −1.06 | .29 |
Stimuli speed (slow) | −0.04 | 0.03 | −1.10 | .27 |
Number of words | 0.05 | 0.004 | 13.49 | < .001*** |
Note. Bold formatting represents the term of interest. SE = standard error.
*p < .05. ***p < .001.
Table 4.
Term | B | SE | t | p |
---|---|---|---|---|
Intercept | −1.65 | 0.10 | −16.38 | < .001*** |
Population | 0.19 | 0.13 | 1.40 | .17 |
Age group | −0.23 | 0.13 | −1.70 | .09 |
Stimuli speed (fast) | −0.04 | 0.03 | −1.06 | .29 |
Stimuli speed (slow) | −0.04 | 0.03 | −1.10 | .27 |
Number of words | 0.05 | 0.004 | 13.50 | < .001*** |
**Population × Age** | 0.07 | 0.19 | 0.36 | .72 |
Note. Bold formatting represents the term of interest. SE = standard error.
***p < .001.
Discussion
The purpose of this preliminary study was to investigate the articulatory patterns of autistic individuals using an acoustic measure of articulatory precision. Data analysis relied on an existing corpus of recorded speech samples from neurotypical children and adults, and autistic children and adults. We hypothesized that the speech of autistic individuals would be less precise than the speech of their neurotypical peers in both the children and adult groups. Our results confirmed this hypothesis. The speech of autistic individuals was characterized by reduced articulatory precision relative to their neurotypical peers. Here, we note that these reductions are representative of all deviations from expected targets, including possible substitutions, deletions, and distortions. Our findings also indicated that differences between articulatory precision in neurotypical and autistic individuals were not significantly moderated by age group—autistic children spoke with less precision than neurotypical children, and autistic adults spoke with less precision than neurotypical adults. That these findings occurred across a heterogeneous group of verbal autistic individuals (i.e., children and adults with limited exclusion criteria) suggests that reduced articulatory precision may be a salient speech characteristic exhibited by this population.
Although this is the first study, to our knowledge, to objectively quantify articulatory precision in the speech of autistic individuals, our findings are consistent with those of Redford et al. (2018) in which naïve listeners rated the speech of autistic individuals as being less articulate and less clear than neurotypical controls. Our findings are also supported by studies showing evidence of articulatory deficits in many autistic individuals (e.g., Cleland et al., 2010; Shriberg et al., 2001); however, they contrast with other studies that report the articulatory skills of autistic individuals to be largely intact (e.g., Kjelgaard & Tager-Flusberg, 2001; McCann et al., 2007). One possible explanation for this discrepancy is a difference in measurement approaches. Here, rather than rely on perceptual assessments that dichotomize speech sounds as correct or incorrect, we use an acoustic measure that quantifies articulatory production along a continuum from more to less precise. Thus, our findings, taken with previous research, suggest that autistic individuals do exhibit articulatory abnormalities but that these abnormalities may often be subtle and more completely captured through more sensitive, objective measurement tools.
Our findings are in line with the speech attunement framework (Shriberg et al., 2011), which suggests that, although verbal autistic individuals are able to “tune in” to the speech of others, they do not “tune up” their speech to the same level of precision as their neurotypical peers. Several factors may account for this. In their explanation of the framework, Shriberg and colleagues suggested the lack of tuning up to be a result of deficits in social reciprocity. That is, autistic individuals may lack the social motivation to attend to the acoustic details of the speech models in their environment and/or integrate these intricacies into their own speech. Imitation deficits have long been recognized as a key characteristic of ASD. More particularly, within the realm of speech, research has shown that although neurotypical adults adopt acoustic–prosodic patterns of their partner, autistic individuals do not (Wynn et al., 2018). Furthermore, there is much evidence to suggest that imitation deficits may be partially explained by social reciprocity deficits (Van Etten & Carver, 2015). For example, one study found that social reciprocity abilities in autistic children significantly predicted the degree of imitation in an interactive play context (McDuffie et al., 2007). In another study, neurotypical and autistic children completed imitation tasks in a spontaneous social setting and a setting where they were explicitly instructed to imitate the examiner. Results showed that autistic children showed significantly less imitation in the spontaneous social condition than when explicitly instructed, whereas neurotypical children showed equal levels of imitation in both conditions (Ingersoll, 2008). Given that speech learning is an inherently social activity, it is possible that lack of motivation inhibits the level of imitation needed to acquire and master the more precise speech exhibited by their neurotypical peers.
Beyond a lack of social motivation, motoric deficits associated with ASD may also contribute to an inability to fully tune up to typical levels of articulatory precision. Recent research has suggested that motoric deficits may be a core feature of ASD (e.g., Fournier et al., 2010; Mosconi & Sweeney, 2015), and there is ample evidence of fine motor and oral motor deficits within this population (Gernsbacher et al., 2008; Hardan et al., 2003; Sullivan et al., 2013). As speech production, particularly movements of the articulators, requires complex motoric coordination and control, it has been advanced that kinematic deficits directly contribute to speech irregularities in this population (Belmonte et al., 2013; McCleery et al., 2013). Indeed, poor motoric performance has been associated with poor speech production in autistic individuals (Mody et al., 2017; Sullivan et al., 2013), providing evidence that motoric deficits may be at least partially responsible for the imprecise articulation evident in this population.
Limitations and Future Directions
We relied on an existing corpus of speech samples for this preliminary study. There were many ideal aspects of this corpus, key being the large number of comparable speech samples produced in a relatively naturalistic and ecologically valid setting. However, this corpus does present a few limitations. For instance, stimuli videos used for eliciting the speech samples were presented in three different speed conditions. Although we controlled for this statistically and our analysis showed no significant effect of stimuli speed, future studies examining measures of articulatory precision could employ different and varied speech elicitation methods (e.g., narrative or conversational tasks). Beyond speech elicitation procedures, participant factors should also be considered. For instance, larger sample sizes would allow more detailed analysis of factors that may affect articulatory differences within the larger autistic population. Fusaroli (2021) highlighted the importance of systematically considering individual variation when investigating acoustic profiles in autistic populations. Our study broadly investigated the difference between adults and children, but future research could include additional assessments (e.g., linguistic and cognitive assessments) and more specific age ranges to better understand and control for participant characteristics. Furthermore, research could explore the relationship between articulatory precision and gender, linguistic ability, cognitive ability, and clinical features. Additionally, although an ASD diagnosis was confirmed for all adults in the study, children were placed in the ASD group based on parental report of diagnosis or educational eligibility. Future studies should include more rigorous confirmation of ASD diagnosis using standardized assessment tools such as the Autism Diagnostic Observation Schedule (Lord et al., 2012).
Additionally, although this study provides quantitative data regarding articulatory deviations in the speech of autistic individuals, it does not give specific information regarding the types of errors (e.g., phonemic distortions vs. phonological substitutions), nor does it offer information regarding the underlying causes of these errors. While providing this type of information was not the purpose of this study, utilizing an objective measure of articulatory precision within investigations of these areas may be efficacious.
Additional work could also focus on the potential clinical implications of articulatory precision impairments in autistic individuals. Researchers have attempted to find a speech marker for ASD that could be used as an objective and reliable supplement to current diagnostic efforts. However, such attempts have, to this point, been largely unsuccessful (e.g., Dahlgren et al., 2018; Fusaroli et al., 2017). Our preliminary findings suggest that articulatory precision may be a key speech marker for ASD, either individually or (more likely) in combination with other speech features, and further investigation in this area is warranted. Additionally, although we did not collect any additional measures of speech within this study, future studies could compare articulatory precision to current standardized articulation assessments to better understand the relationship between acoustic and perceptual articulatory measures within this population. Beyond diagnostics, investigating the relationship between measures of articulatory precision and listener ratings of diagnostic severity, oddness, and likability could indicate if lower precision contributes to the negative listener impressions of speech in this population (Bone et al., 2015; Grossman, 2015; Redford et al., 2018). Research could next examine contexts in which autistic individuals are able to tune up their articulatory precision. For instance, preliminary research has shown that autistic individuals decrease speech sound errors in response to a three-dimensional virtual tutor (Chen et al., 2019) and ultrasound visual biofeedback (Cleland et al., 2019). Therefore, treatment may employ similar methods to increase overall articulatory precision, which may be efficacious in improving conversational outcomes.
Conclusions
In summary, this preliminary study investigated how an objective, fine-grained metric of articulatory precision is influenced by the presence of ASD in both childhood and adulthood. The results supported our hypothesis that both autistic children and autistic adults would articulate less precisely than their neurotypical peers. Given the current lack of speech markers for this clinical population and the importance of speech quality in the social integration of autistic individuals, our results advance articulatory precision as a viable and important target for future research.
Acknowledgments
This research was supported by National Institute on Deafness and Other Communication Disorders Fellowship F31DC019559, awarded to Camille J. Wynn (PI) and Stephanie A. Borrie (sponsor). Data coding and analysis for this project were led by Elizabeth R. Josephson as part of her master's thesis in the Human Interaction Laboratory at Utah State University. The authors gratefully acknowledge Visar Berisha at Arizona State University for the measure of articulatory precision.
Footnote
In line with preferences expressed by many within the autism community (e.g., Bury et al., 2020; Kapp et al., 2013; Kenny et al., 2016), we have opted to use the term autistic throughout this research note.
References
- Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://www.jstatsoft.org/v067/i01
- Belmonte, M. K., Saxena-Chandhok, T., Cherian, R., Muneer, R., George, L., & Karanth, P. (2013). Oral motor deficits in speech-impaired children with autism. Frontiers in Integrative Neuroscience, 7. https://doi.org/10.3389/fnint.2013.00047
- Bishop, D. (2006). The Children's Communication Checklist (2nd ed., U.S. ed.). Harcourt Assessment.
- Boersma, P., & Weenink, D. (2018). Praat: Doing phonetics by computer (Version 6.0) [Computer software]. http://www.praat.org
- Bone, D., Black, M. P., Ramakrishna, A., Grossman, R. B., & Narayanan, S. S. (2015). Acoustic-prosodic correlates of “awkward” prosody in story retellings from adolescents with autism. In INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association. Dresden, Germany, September 6–10, 2015, 1616–1620. https://www.isca-speech.org/archive/interspeech_2015/bone15_interspeech.html
- Bonneh, Y. S., Levanon, Y., Dean-Pardo, O., Lossos, L., & Adini, Y. (2011). Abnormal speech spectrum and increased pitch variability in young autistic children. Frontiers in Human Neuroscience, 4, 237. https://doi.org/10.3389/fnhum.2010.00237
- Borrie, S. A., Wynn, C. J., Berisha, V., & Barrett, T. S. (2022). From speech acoustics to communicative participation in dysarthria: Toward a causal framework. Journal of Speech, Language, and Hearing Research, 65(2), 405–418. https://doi.org/10.1044/2021_JSLHR-21-00306
- Borrie, S. A., Wynn, C. J., Berisha, V., Lubold, N., Willi, M. M., Coelho, C. A., & Barrett, T. S. (2020). Conversational coordination of articulation responds to context: A clinical test case with traumatic brain injury. Journal of Speech, Language, and Hearing Research, 63(8), 2567–2577. https://doi.org/10.1044/2020_JSLHR-20-00104
- Bury, S. M., Jellett, R., Spoor, J. R., & Hedley, D. (2020). “It defines who I am” or “it's something I have”: What language do [autistic] Australian adults [on the autism spectrum] prefer? Journal of Autism and Developmental Disorders. https://doi.org/10.1007/s10803-020-04425-3
- Chen, F., Wang, L., Peng, G., Yan, N., & Pan, X. (2019). Development and evaluation of a 3-D virtual pronunciation tutor for children with autism spectrum disorders. PLOS ONE, 14(1), e0210858. https://doi.org/10.1371/journal.pone.0210858
- Cleland, J., Gibbon, F. E., Peppé, S. J. E., O'Hare, A., & Rutherford, M. (2010). Phonetic and phonological errors in children with high functioning autism and Asperger syndrome. International Journal of Speech-Language Pathology, 12(1), 69–76. https://doi.org/10.3109/17549500903469980
- Cleland, J., Scobbie, J. M., Roxburgh, Z., Heyde, C., & Wrench, A. (2019). Enabling new articulatory gestures in children with persistent speech sound disorders using ultrasound visual biofeedback. Journal of Speech, Language, and Hearing Research, 62(2), 229–246. https://doi.org/10.1044/2018_JSLHR-S-17-0360
- Dahlgren, S., Sandberg, A. D., Strömbergsson, S., Wenhov, L., Råstam, M., & Nettelbladt, U. (2018). Prosodic traits in speech produced by children with autism spectrum disorders—Perceptual and acoustic measurements. Autism & Developmental Language Impairments, 3. https://doi.org/10.1177/2396941518764527
- DePape, A.-M. R., Chen, A., Hall, G. B. C., & Trainor, L. J. (2012). Use of prosody and information structure in high functioning adults with autism in relation to language ability. Frontiers in Psychology, 3, 72. https://doi.org/10.3389/fpsyg.2012.00072
- Diehl, J. J., & Paul, R. (2012). Acoustic differences in the imitation of prosodic patterns in children with autism spectrum disorders. Research in Autism Spectrum Disorders, 6(1), 123–134. https://doi.org/10.1016/j.rasd.2011.03.012
- Diehl, J. J., Watson, D., Bennetto, L., McDonough, J., & Gunlogson, C. (2009). An acoustic analysis of prosody in high-functioning autism. Applied Psycholinguistics, 30(3), 385–404. https://doi.org/10.1017/S0142716409090201
- Filipe, M. G., Frota, S., Castro, S. L., & Vicente, S. G. (2014). Atypical prosody in Asperger syndrome: Perceptual and acoustic measurements. Journal of Autism and Developmental Disorders, 44(8), 1972–1981. https://doi.org/10.1007/s10803-014-2073-2
- Fournier, K. A., Hass, C. J., Naik, S. K., Lodha, N., & Cauraugh, J. H. (2010). Motor coordination in autism spectrum disorders: A synthesis and meta-analysis. Journal of Autism and Developmental Disorders, 40(10), 1227–1240. https://doi.org/10.1007/s10803-010-0981-3
- Fusaroli, R., Grossman, R., Bilenberg, N., Cantio, C., Jepsen, J., & Weed, E. (2021). Toward a cumulative science of vocal markers of autism: A cross-linguistic meta-analysis-based investigation of acoustic markers in American and Danish autistic children. Autism Research. https://doi.org/10.1002/aur.2661
- Fusaroli, R., Lambrechts, A., Bang, D., Bowler, D. M., & Gaigg, S. B. (2017). Is voice a marker for autism spectrum disorder? A systematic review and meta-analysis. Autism Research, 10(3), 384–407. https://doi.org/10.1002/aur.1678
- Gernsbacher, M. A., Sauer, E. A., Geye, H. M., Schweigert, E. K., & Goldsmith, H. H. (2008). Infant and toddler oral- and manual-motor skills predict later speech fluency in autism. The Journal of Child Psychology and Psychiatry, 49(1), 43–50. https://doi.org/10.1111/j.1469-7610.2007.01820.x
- Goldfarb, W., Goldfarb, N., Braunstein, P., & Scholl, H. (1972). Speech and language faults of schizophrenic children. Journal of Autism and Childhood Schizophrenia, 2(3), 219–233. https://doi.org/10.1007/BF01537616
- Goldman, R., & Fristoe, M. (1986). Goldman–Fristoe Test of Articulation. AGS.
- Grossman, R. B. (2015). Judgments of social awkwardness from brief exposure to children with and without high-functioning autism. Autism, 19(5), 580–587. https://doi.org/10.1177/1362361314536937
- Grossman, R. B., Bemis, R. H., Skwerer, D. P., & Tager-Flusberg, H. (2010). Lexical and affective prosody in children with high-functioning autism. Journal of Speech, Language, and Hearing Research, 53(3), 778–793. https://doi.org/10.1044/1092-4388(2009/08-0127)
- Hardan, A. Y., Kilpatrick, M., Keshavan, M. S., & Minshew, N. J. (2003). Motor performance and anatomic magnetic resonance imaging (MRI) of the basal ganglia in autism. Journal of Child Neurology, 18(5), 317–324. https://doi.org/10.1177/08830738030180050801
- Hubbard, K., & Trauner, D. A. (2007). Intonation and emotion in autistic spectrum disorders. Journal of Psycholinguistic Research, 36(2), 159–173. https://doi.org/10.1007/s10936-006-9037-4
- Ingersoll, B. (2008). The social role of imitation in autism: Implications for the treatment of imitation deficits. Infants and Young Children, 21(2), 107–119. https://doi.org/10.1097/01.IYC.0000314482.24087.14
- Järvinen-Pasley, A., Peppe, S., King-Smith, G., & Heaton, P. (2008). The relationship between form and function level receptive prosodic abilities in autism. Journal of Autism and Developmental Disorders, 38(7), 1328–1340. https://doi.org/10.1007/s10803-007-0520-z
- Johnson, K. (2004). Massive reduction in conversational American English. In Yoneyama K. & Maekawa K. (Eds.), Spontaneous speech: Data and analysis (pp. 29–54). The National International Institute for Japanese Language.
- Kapp, S. K., Gillespie-Lynch, K., Sherman, L. E., & Hutman, T. (2013). Deficit, difference, or both? Autism and Neurodiversity. Developmental Psychology, 49(1), 59–71. https://doi.org/10.1037/a0028353
- Kenny, L., Hattersley, C., Molins, B., Buckley, C., Povey, C., & Pellicano, E. (2016). Which terms should be used to describe autism? Perspectives from the U.K. autism community. Autism, 20(4), 442–462. https://doi.org/10.1177/1362361315588200
- Kent, R. D. (1996). Hearing and believing. American Journal of Speech-Language Pathology, 5(3), 7–23. https://doi.org/10.1044/1058-0360.0503.07
- Kissine, M., & Geelhand, P. (2019). Brief report: Acoustic evidence for increased articulatory stability in the speech of adults with autism spectrum disorder. Journal of Autism and Developmental Disorders, 49(6), 2572–2580. https://doi.org/10.1007/s10803-019-03905-5
- Kjelgaard, M. M., & Tager-Flusberg, H. (2001). An investigation of language impairment in autism: Implications for genetic subgroups. Language and Cognitive Processes, 16(2–3), 287–308. https://doi.org/10.1080/01690960042000058
- Lord, C., Rutter, M., DiLavore, P. C., Risi, S., Gotham, K., & Bishop, S. (2012). Autism Diagnostic Observation Schedule (2nd ed.). Western Psychological Services.
- Lubold, N., Borrie, S. A., Barrett, T. S., Willi, M., & Berisha, V. (2019). Do conversational partners entrain on articulatory precision? In Proceedings of INTERSPEECH Conference 2019 (pp. 1931–1935). Graz, Austria. https://doi.org/10.21437/Interspeech.2019-1786
- Masi, A., DeMayo, M. M., Glozier, N., & Guastella, A. J. (2017). An overview of autism spectrum disorder, heterogeneity and treatment options. Neuroscience Bulletin, 33(2), 183–193. https://doi.org/10.1007/s12264-017-0100-y
- Matthews, N. L., Smith, C. J., Pollard, E., Ober-Reynolds, S., Kirwan, J., & Malligo, A. (2015). Adaptive functioning in autism spectrum disorder during the transition to adulthood. Journal of Autism and Developmental Disorders, 45(8), 2349–2360. https://doi.org/10.1007/s10803-015-2400-2
- McCann, J., Peppé, S., Gibbon, F. E., O'Hare, A., & Rutherford, M. (2007). Prosody and its relationship to language in school-aged children with high-functioning autism. International Journal of Language & Communication Disorders, 42(6), 682–702. https://doi.org/10.1080/13682820601170102
- McCleery, J. P., Elliott, N. A., Sampanis, D. S., & Stefanidou, C. A. (2013). Motor development and motor resonance difficulties in autism: Relevance to early intervention for language and communication skills. Frontiers in Integrative Neuroscience, 7, 30. https://doi.org/10.3389/fnint.2013.00030
- McDuffie, A., Turner, L., Stone, W., Yoder, P., Wolery, M., & Ulman, T. (2007). Developmental correlates of different types of motor imitation in young children with autism spectrum disorders. Journal of Autism and Developmental Disorders, 37(3), 401–412. https://doi.org/10.1007/s10803-006-0175-1
- McKeever, L., Cleland, J., & Delafield-Butt, J. (2019). Aetiology of speech sound errors in autism. In Fuchs S., Cleland J., & Rochet-Cappellan A. (Eds.), Speech production and perception: Learning and memory (pp. 109–138). Peter Lang. https://doi.org/10.3726/b15982
- Mody, M., Shui, A. M., Nowinski, L. A., Golas, S. B., Ferrone, C., O'Rourke, J. A., & McDougle, C. J. (2017). Communication deficits and the motor system: Exploring patterns of associations in autism spectrum disorder (ASD). Journal of Autism and Developmental Disorders, 47(1), 155–162. https://doi.org/10.1007/s10803-016-2934-y
- Mosconi, M. W., & Sweeney, J. A. (2015). Sensorimotor dysfunctions as primary features of autism spectrum disorders. Science China Life Sciences, 58(10), 1016–1023. https://doi.org/10.1007/s11427-015-4894-4
- Mottron, L., Dawson, M., Soulières, I., Hubert, B., & Burack, J. (2006). Enhanced perceptual functioning in autism: An update, and eight principles of autistic perception. Journal of Autism and Developmental Disorders, 36, 27–43. https://doi.org/10.1007/s10803-005-0040-7
- Nadig, A., & Shaw, H. (2012). Acoustic and perceptual measurement of expressive prosody in high-functioning autism: Increased pitch range and what it means to listeners. Journal of Autism and Developmental Disorders, 42(4), 499–511. https://doi.org/10.1007/s10803-011-1264-3
- Ochi, K., Ono, N., Owada, K., Kojima, M., Kuroda, M., Sagayama, S., & Yamasue, H. (2019). Quantification of speech and synchrony in the conversation of adults with autism spectrum disorder. PLOS ONE, 14(12), e0225377. https://doi.org/10.1371/journal.pone.0225377
- Panayotov, V., Chen, G., Povey, D., & Khudanpur, S. (2015). LibriSpeech: An ASR corpus based on public domain audio books. In IEEE 2015 International Conference on Acoustics, Speech and Signal Processing (pp. 5206–5210). https://doi.org/10.1109/ICASSP.2015.7178964
- Patel, S. P., Nayar, K., Martin, G. E., Franich, K., Crawford, S., Diehl, J. J., & Losh, M. (2020). An acoustic characterization of prosodic differences in autism spectrum disorder and first-degree relatives. Journal of Autism and Developmental Disorders, 50(8), 3032–3045. https://doi.org/10.1007/s10803-020-04392-9
- R Development Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
- Redford, M. A., Kapatsinski, V., & Cornell-Fabiano, J. (2018). Lay listener classification and evaluation of typical and atypical children's speech. Language and Speech, 61(2), 277–302. https://doi.org/10.1177/0023830917717758
- Sasson, N. J., Faso, D. J., Nugent, J., Lovell, S., Kennedy, D. P., & Grossman, R. B. (2017). Neurotypical peers are less willing to interact with those with autism based on thin slice judgments. Scientific Reports, 7(1), 40700. https://doi.org/10.1038/srep40700
- Seltzer, M. M., Krauss, M. W., Shattuck, P. T., Orsmond, G., Swe, A., & Lord, C. (2003). The symptoms of autism spectrum disorders in adolescence and adulthood. Journal of Autism and Developmental Disorders, 33(6), 565–581. https://doi.org/10.1023/B:JADD.0000005995.02453.0b
- Sharda, M., Subhadra, T. P., Sahay, S., Nagaraja, C., Singh, L., Mishra, R., Sen, A., Singhal, N., Erickson, D., & Singh, N. C. (2010). Sounds of melody—Pitch patterns of speech in autism. Neuroscience Letters, 478(1), 42–45. https://doi.org/10.1016/j.neulet.2010.04.066
- Shriberg, L. D., Paul, R., Black, L. M., & van Santen, J. P. (2011). The hypothesis of apraxia of speech in children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 41(4), 405–426. https://doi.org/10.1007/s10803-010-1117-5
- Shriberg, L. D., Paul, R., McSweeny, J. L., Klin, A. M., Cohen, D. J., & Volkmar, F. R. (2001). Speech and prosody characteristics of adolescents and adults with high-functioning autism and Asperger syndrome. Journal of Speech, Language, and Hearing Research, 44(5), 1097–1115. https://doi.org/10.1044/1092-4388(2001/087)
- Stegmann, G. M., Hahn, S., Liss, J., Shefner, J., Rutkove, S., Shelton, K., Duncan, C. J., & Berisha, V. (2020). Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis. NPJ Digital Medicine, 3(1), 132. https://doi.org/10.1038/s41746-020-00335-x
- Sullivan, K., Sharda, M., Greenson, J., Dawson, G., & Singh, N. C. (2013). A novel method for assessing the development of speech motor function in toddlers with autism spectrum disorders. Frontiers in Integrative Neuroscience, 7. https://doi.org/10.3389/fnint.2013.00017
- Tu, M., Grabek, A., Liss, J., & Berisha, V. (2018). Investigating the role of L1 in automatic pronunciation evaluation of L2 speech. In Proceedings of INTERSPEECH Conference 2018 (pp. 1636–1640). Hyderabad, India. https://arxiv.org/pdf/1807.01738.pdf
- Van Etten, H. M., & Carver, L. J. (2015). Does impaired social motivation drive imitation deficits in children with autism spectrum disorder? Review Journal of Autism and Developmental Disorders, 2(3), 310–319. https://doi.org/10.1007/s40489-015-0054-9
- Wiig, E. H., Semel, E., & Secord, W. A. (2013). Clinical Evaluation of Language Fundamentals–Fifth Edition (CELF-5). Pearson.
- Wynn, C. J., Borrie, S. A., & Sellers, T. P. (2018). Speech rate entrainment in children and adults with and without autism spectrum disorder. American Journal of Speech-Language Pathology, 27(3), 965–974. https://doi.org/10.1044/2018_AJSLP-17-0134