Abstract
Purpose
Conversational entrainment, a phenomenon whereby people modify their behaviors to match their communication partner, has been evidenced as critical to successful conversation. It is plausible that deficits in entrainment contribute to the conversational breakdowns and social difficulties exhibited by people with autism spectrum disorder (ASD). This study examined speech rate entrainment in children and adult populations with and without ASD.
Method
Sixty participants including typically developing children, children with ASD, typically developed adults, and adults with ASD participated in a quasi-conversational paradigm with a pseudoconfederate. The confederate's speech rate was digitally manipulated to create slow and fast speech rate conditions.
Results
Typically developed adults entrained their speech rate in the quasi-conversational paradigm, using a faster rate during the fast speech rate conditions and a slower rate during the slow speech rate conditions. This entrainment pattern was not evident in adults with ASD or in children populations.
Conclusion
Findings suggest that speech rate entrainment is a developmentally acquired skill and offer preliminary evidence of speech rate entrainment deficits in adults with ASD. Impairments in this area may contribute to the conversational breakdowns and social difficulties experienced by this population. Future work is needed to advance this area of inquiry.
Successful conversation requires communication partners not only to produce and perceive speech but also to coordinate these behaviors. This coordination of behaviors, termed herein as entrainment, describes the tendency for people to modify their behaviors to more closely match those of their conversation partner. Entrainment is manifested in many verbal aspects of communication, including acoustic–prosodic speech properties (e.g., Lee et al., 2010), word choice (e.g., Brennan & Clark, 1996), linguistic style (e.g., Danescu-Niculescu-Mizil, Gamon, & Dumais, 2011), and syntactic structures (e.g., Branigan, Pickering, & Cleland, 2000; Reitter, Moore, & Keller, 2006). For example, using a quasi-conversational paradigm, in which healthy participants read sentences aloud in response to hearing prerecorded sentences, Borrie and Liss (2014) observed that participants modified the speech rate and pitch variation of their utterances to more closely match the properties of the preceding audio sample. Entrainment further extends into nonverbal aspects of communication including facial expression (e.g., Louwerse, Dale, Bard, & Jeuniaux, 2012) and gestures (Furuyama, Hayashi, & Mishima, 2005) as well as patterns of laughter (e.g., Truong & Trouvain, 2012), eye gaze (e.g., Nakano, Kato, & Kitazawa, 2011), and yawning (e.g., Helt, Eigsti, Snyder, & Fein, 2010).
Although largely considered a subconscious phenomenon, entraining to the behaviors of others is neither meaningless nor inconsequential. There is a well-established body of literature linking entrainment of both verbal and nonverbal behaviors to successful communicative interactions. Such benefits are far reaching and impact communication from a cognitive, social, and emotional perspective (e.g., Chartrand & Bargh, 1999; Lee et al., 2010). Borrie and Liss previously summarized some of the key literature in this area, concluding that entrainment “…serves as a powerful coordinating device, uniting individuals in time and space to optimize comprehension, establish social presence, and create positive and satisfying relationships” (Borrie & Liss, 2014, p. 816).
The capacity to entrain requires an individual to perceive the rhythms of others, produce rhythmic signals, and integrate rhythmic information, aligning motor output with sensory input (Phillips-Silver, Aktipis, & Bryant, 2010; Todd, Lee, & O'Boyle, 2002). Disruptions in any of these areas will cause breakdowns in the overall entrainment process. Because rhythm deficits are present in a number of communication disorders, it has been suggested that entrainment impairments may be widespread within these clinical populations (Borrie & Liss, 2014). Given the prevalence of rhythmic difficulties in individuals with autism spectrum disorder (ASD; e.g., Isenhower et al., 2012; Marsh et al., 2013; Tordjman et al., 2015), this population may be particularly vulnerable to entrainment deficits. This idea is supported by a small, but growing, body of research demonstrating entrainment deficits in certain nonverbal elements of communication in people with ASD. Nakano and colleagues (2011) have observed that eyeblink entrainment, found in typically developing (TD) adults engaged in face-to-face conversations, is absent in adults with ASD. Studies investigating entrainment of facial expressions indicate that children and adults with ASD entrain with less frequency and greater delays than their TD peers (Mathersul, McDonald, & Rushby, 2013; McIntosh, Reichmann-Decker, Winkielman, & Wilbarger, 2006; Oberman, Winkielman, & Ramachandran, 2009; Yoshimura, Sato, Uono, & Toichi, 2015).
Should entrainment deficits be present in individuals with ASD, the potential effects are far reaching. Indeed, the very benefits derived from successful entrainment are often lacking in the lives of individuals with ASD. For example, entrainment is pivotal in establishing interpersonal relationships and correlates with increased rapport, empathy, and intimacy between conversational partners (e.g., Bailenson & Yee, 2005; Chartrand & Bargh, 1999; Lee et al., 2010; Manson, Bryant, Gervais, & Kline, 2013; Miles, Nind, & Macrae, 2009; Putman & Street, 1984; Smith, 2008; Street & Giles, 1982). Contrastingly, individuals with ASD show marked difficulty in building and maintaining friendships and experience less intimate relationships and higher rates of social exclusion as compared with individuals without ASD (Bossaert, Colpin, Pijl, & Petry, 2015; Dean et al., 2014; Solish, Perry, & Minnes, 2010; Taheri, Perry, & Minnes, 2016). From a cognitive perspective, entrainment facilitates better understanding between communication partners. That is, as communication partners modify their behaviors to more closely match one another, they find common representation on which they can ground their conversation, thereby reducing ambiguities and misunderstandings (Borrie, Lubold, & Pon-Barry, 2015; Pickering & Garrod, 2004). Conversely, individuals with ASD often show limited language comprehension skills (e.g., Rapin & Dunn, 2003), and even those with average verbal IQs display difficulty interpreting the overall intention of their communication partner (Dennis, Lazenby, & Lockyer, 2001; Ozonoff & Miller, 1996). Entrainment has also been shown to support conversational fluidity, particularly by facilitating turn-taking decisions, decreasing interturn latencies, and reducing conversational interruptions (e.g., Local, 2007; Wilson & Wilson, 2005). 
In comparison, many individuals with ASD have poor reciprocal conversation skills, deficits in turn-taking, and longer interturn latencies than their TD peers (Heeman, Lunsford, Selfridge, Black, & Van Santen, 2010; Klusek, Martin, & Losh, 2014; Loveland, Landry, Hughes, Hall, & McEvoy, 1988). Because of the potential pragmatic and linguistic effects of entrainment deficits in this population, research in this area may be pivotal in improving communicative interactions in individuals with ASD. Furthermore, because acoustic–prosodic characteristics of speech “constitute one of the most significant obstacles to social integration” in populations with ASD (Paul et al., 2005, p. 862), research regarding this area of entrainment may be especially important.
In the current study, we investigated speech entrainment in both children and adults, with and without ASD, to explore the hypothesis that people with ASD present with speech entrainment deficits. We selected speech rate as our entrainment feature of investigation for three key reasons. First, there is a strong literature base showing that acoustic–prosodic features, including speech rate, are readily entrained in healthy populations (e.g., Borrie & Liss, 2014; Local, 2007). Second, speech rate entrainment has been shown to be a vital component of successful conversations (Manson et al., 2013; Putman & Street, 1984; Wilson & Wilson, 2005). Third, the speech rate of individuals with ASD generally falls within normal limits (e.g., Nadig & Shaw, 2012; Shriberg et al., 2001). Here, we addressed the following key research question: Do children and adults, with and without ASD, entrain their speech rate to that of a pseudoconfederate during a quasi-conversational paradigm? On the basis of evidence of rhythmic difficulties in people with ASD, we hypothesized that children and adults with ASD would not modify their speech rate during the conversational paradigm, thus implicating deficits in speech rate entrainment. Conversely, given the literature on speech rate entrainment in TD children and adults, we hypothesized that children and adults without ASD would modify their speech rate in the conversational paradigm.
Method
Overview
A quasi-conversational paradigm was set up via a web-based perception–production application (hosted on a secure university-based web server) to elicit speech samples from children and adults, with and without ASD. In this application, participants were required to watch audiovisual recordings, presented under different speech rate conditions. After each audiovisual recording, participants were required to produce a timed verbal response. These response productions were analyzed for speech rate and, subsequently, entrainment. Specific details are described below.
Participants
Data were collected from 60 participants, comprising four experimental groups: (a) children with ASD (ASD-C; n = 15), (b) TD children (TD-C; n = 15), (c) adults with ASD (ASD-A; n = 15), and (d) typically developed adults (TD-A; n = 15). Participants from each group were native speakers of American English with no parent/self-reported hearing impairment. All groups consisted of four women and 11 men, except the ASD-A group, which consisted of five women and 10 men.
Children
The children groups (TD-C and ASD-C) consisted of participants between the ages of 6 and 14 years. Descriptive age data by group are summarized in Table 1. An independent t test confirmed no significant differences between the two groups for age, t(28) = 0.18, p = .86, d = 0.07. Participants from both groups of children were recruited through a variety of sources including the Autism Support Services at Utah State University and announcements on appropriate social media outlets. Parents of the children participants in the TD-C group reported that their children had no known developmental delays or learning disabilities. Language skills of children in the TD-C group were confirmed within normal limits using two subtests (Following Directions and Recalling Sentences) of the Clinical Evaluation of Language Fundamentals–Fifth Edition (Wiig, Semel, & Secord, 2013). The average standard scores of the TD-C group for receptive and expressive language were 104.33 and 103.73, respectively.
Table 1.
Age | ASD-C | TD-C | ASD-A | TD-A |
---|---|---|---|---|
M (SD) | 10.07 (2.56) | 9.92 (2.06) | 28.33 (6.12) | 23.94 (1.34) |
Range | 6–14 | 6–14 | 20–40 | 21–25 |
Note. ASD-A = adults with autism spectrum disorder; ASD-C = children with autism spectrum disorder; TD-A = typically developed adults; TD-C = typically developing children.
For participants within the ASD-C group, medical diagnosis of ASD or educational eligibility under the ASD category was reported by parents; however, parents were not required to provide documentation of such. Therefore, to validate group differences on characteristics of ASD, the caregiver response form from the Children's Communication Checklist–Second Edition (Bishop, 2006) was completed by parents of participants from both groups. An overall pragmatic score was derived by adding raw scores from scales addressing social communication skills (i.e., inappropriate initiation, scripted language, use of context, and nonverbal communication) and autistic-like features (i.e., social relations and interests). An independent t test confirmed significant differences between the TD-C and ASD-C groups on overall pragmatic functioning, t(28) = 9.81, p < .001, d = 3.58. Language scores for the ASD-C group were also collected. Average scores on the Clinical Evaluation of Language Fundamentals–Fifth Edition for Receptive and Expressive Language subtests were 92.33 and 88.66, respectively.
Adults
The adult groups consisted of participants between the ages of 20 and 40 years (see Table 1). Participants from the TD-A group were recruited from undergraduate classes at Utah State University. Participants in the TD-A group reported no developmental delays or learning disabilities. Language skills of these adults were confirmed within normal limits during an informal conversation with the experimenter. Participants in the ASD-A group were recruited from a transitional school for individuals with autism. These participants had received a verified ASD diagnosis from a licensed clinical social worker or a clinical mental health counselor no more than 3 years before the experiment. In addition, participants in this group had an IQ level of 90 or above. Individual language skills of both groups of participants with ASD were not required to be within normal limits. However, an informal assessment was given to ensure that speech and language skills were of sufficient level to participate in the experimental procedure. This assessment required participants to (a) demonstrate understanding of basic instructions relating to the task (i.e., follow directions relating to attending to stimuli, remain quiet while the video was playing, give descriptions when instructed) and (b) produce adequate descriptions of pictures (i.e., produce descriptions that were free from palilalia, echolalia, or idiosyncratic phrases; produce a description with vocabulary relating to the picture; and speak for the full 15 s without excessive delays or inappropriate pausing).
Stimuli
Audiovisual stimuli used in this study featured one female native speaker (22 years old) of American English. Stimuli were created in a sound-attenuated booth with an industry standard microphone (Shure SM58) and video camera (Canon EOS 70D), positioned to capture a view of the speaker's head and shoulders against a neutral backdrop. The speaker was encouraged to use her “normal speaking” voice while producing the stimuli for the study. Output elicited during the stimuli collection task was recorded digitally to a memory card at a 48-kHz sampling rate (16-bit quantization) and stored as individual recording files.
Fifteen audiovisual recordings, termed speech exposure recordings, were created. In each recording, the speaker is shown holding a novel picture from a popular children's book. The speaker introduces the picture, requests that participants describe what they see, and provides examples of what participants could talk about for each picture (see the Appendix for a sample transcript). Each recording was between 20 and 25 s in length. The original recordings were then digitally manipulated to create two additional versions of each, one with a slower speech rate and one with a faster speech rate. The slow speech rates were calculated as 80% of the original speech rate, and the fast speech rates as 120%. This level of rate modification was chosen to provide sufficient variability in the speech rate of the speaker while still retaining a natural-sounding speech rate. This resulted in slow and fast speech rate conditions for each original recording.
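As a rough illustration of the arithmetic behind the rate manipulation, the sketch below converts a rate factor into the expected duration of a manipulated recording. It assumes the manipulation is a uniform time stretch, which the text does not specify; the function name and the 20-s example duration are hypothetical.

```python
def stretched_duration(original_s: float, rate_factor: float) -> float:
    """Duration of a recording after its speech rate is scaled by rate_factor.

    Assumption: a uniform time stretch, so the same syllables are delivered
    at rate_factor times the original rate and duration scales by
    1 / rate_factor.
    """
    if rate_factor <= 0:
        raise ValueError("rate_factor must be positive")
    return original_s / rate_factor


# Hypothetical 20-s original recording.
slow = stretched_duration(20.0, 0.80)  # slow condition: 80% of original rate
fast = stretched_duration(20.0, 1.20)  # fast condition: 120% of original rate
```

Under this assumption, a slow-condition recording runs longer than the original and a fast-condition recording runs shorter, so exposure time differs across conditions even though the content is identical.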
Procedure
Before beginning this study, experimental procedures were piloted on three instructors at a university-based educational center for children with ASD, ensuring that all aspects of the experiment were suitable for participants. The experiment was conducted in a quiet room. Participants were told that they would be participating in a study examining speech patterns of different groups of individuals. However, no other explanation about the purpose of the study was given before the experiment. The use of participants who were blinded to the guiding hypotheses of the experiment was approved by the Utah State University Institutional Review Board.
Upon obtaining informed consent, participants were seated in front of a computer preloaded with the experimental procedure. Participants were informed that the experiment would require them to watch recordings of a person talking about a picture displayed on the computer monitor and subsequently describe the pictures themselves. They were informed that they should continue talking for 15 s until a visual timer (diminishing bar at the top of the screen) and an auditory timer (sound of a beep) let them know that it was time to stop. Experimenters used simple language and visual examples to explain the nature of the task and provided a demonstration to ensure understanding. Before beginning the study, participants were given two practice trials with speech at the original rate. To qualify for the experiment, participants were required to attend to the video and subsequently continue talking for the time allotted in two trials. All participants qualified for the experiment. The experiment consisted of 15 picture description tasks. Each task consisted of three phases. First, participants watched a randomly selected speech exposure recording, as described above. Audio for the recordings was played through a wireless headset (Astro A50 Wireless System). Second, directly after each recording, the picture shown in the speech exposure recording appeared on the screen, and participants spoke for 15 s about the picture. Verbal responses, termed response productions, were audio-recorded via the headset microphone. If a participant stopped talking before the timer ended, experimenters were instructed to point to the timer as a visual cue to continue talking. However, this was never employed, as participants consistently talked for the full 15 s. Third, a reinforcing picture with the words “great job” appeared on the screen.
This cycle of watching, describing, and receiving reinforcement was repeated for 15 novel speech exposure recordings. Of these 15 recordings, five were presented in the original speech rate condition, five were presented in the slow speech rate condition, and five were presented in the fast speech rate condition. Picture recordings were randomized so that no picture was seen twice by a single participant and so that the speech rate condition of each recording varied across participants (i.e., the same picture recording was presented in the slow condition for one participant, the original condition for another participant, and the fast condition for yet another participant). Total time to complete the experimental procedure was approximately 15 min for all participant groups.
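The assignment of recordings to conditions described above could be sketched as follows. This is only one possible scheme, assuming independent randomization for each participant; the actual procedure used by the web application is not described in the text, and the function and constant names are hypothetical.

```python
import random

CONDITIONS = ("original", "slow", "fast")


def assign_conditions(n_recordings=15, per_condition=5, seed=None):
    """Return a presentation order of (recording_id, condition) pairs.

    Each recording appears exactly once, and each condition is used
    per_condition times. The randomization scheme is an assumption.
    """
    labels = [c for c in CONDITIONS for _ in range(per_condition)]
    if len(labels) != n_recordings:
        raise ValueError("conditions must cover every recording exactly once")
    rng = random.Random(seed)
    rng.shuffle(labels)  # decide which recording gets which condition
    trials = list(zip(range(1, n_recordings + 1), labels))
    rng.shuffle(trials)  # decide the order of presentation
    return trials


# One hypothetical participant's trial list.
trials = assign_conditions(seed=1)
```

Calling the function with a different seed for each participant varies the condition attached to a given picture across participants, while every participant still sees five recordings per condition.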
Data Analysis
The data set included the response productions for each participant in the slow and fast speech rate conditions. Thus, the total data set consisted of 600 response productions (10 per participant). Trained laboratory assistants transcribed each response production and counted the number of syllables for each production. Involuntary sounds (e.g., hiccups, coughing, sneezing) were not included in the calculation. However, because the overarching objective of this study was to examine rhythmic patterns of verbal output rather than communicative intent, all other vocalizations including part-word repetitions, whole-word repetitions, and filler words (e.g., uh, um) were included in the calculation of speech rate. Duration (in seconds) of each response production was calculated using Praat, an acoustic analysis software package (Boersma & Weenink, 2015). Silences lasting over 1 s were removed from the total response production time. Average speech rate for each response production was calculated by dividing the number of syllables by the duration of the response. Twenty-five percent of the total data set (150 response productions) were randomly selected according to a computer-generated random number list and were reanalyzed by another judge to obtain interrater reliability estimates for the dependent variable of speech rate. Comparison of the two sets of measurements revealed high agreement between the two judges, with a Pearson correlation coefficient (r) above .93.
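The speech rate calculation described above can be sketched in a few lines. The sketch assumes that each silence longer than 1 s is removed in full (rather than only the portion exceeding 1 s) and that silence durations have already been measured; the function name and the example values are hypothetical.

```python
def speech_rate(n_syllables, duration_s, silences_s=(), max_silence_s=1.0):
    """Average speech rate in syllables per second.

    Silences longer than max_silence_s are subtracted in full from the
    response duration before dividing (an assumption about the procedure).
    """
    removed = sum(s for s in silences_s if s > max_silence_s)
    speaking_time = duration_s - removed
    if speaking_time <= 0:
        raise ValueError("no speaking time left after removing silences")
    return n_syllables / speaking_time


# Hypothetical response: 48 syllables in 15 s, with pauses of 0.4, 1.5, and 2.0 s.
# Only the 1.5-s and 2.0-s pauses exceed the threshold and are removed.
rate = speech_rate(48, 15.0, silences_s=[0.4, 1.5, 2.0])
```

Because fillers and repetitions count as syllables under the procedure above, they would be included in `n_syllables` before this function is called.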
Intraclass correlations (ICCs) were computed to determine the degree of correlation of speech rate within participants in the children and adult groups. The ICC within the children groups was .53. The ICC within the adult groups was .58. Because of these high ICCs, linear mixed models were used to investigate the effects of condition and group on average speech rate while controlling for the lack of independence in the data due to the repeated measures. Consistent with previous research (Nip & Green, 2013; Chermak & Schneiderman, 1985), our data showed meaningful differences between children and adults in both mean speech rate and variability of speech rate. Therefore, to keep the models more parsimonious and insightful, separate analyses were run for the children and adults. Assumptions of normality and homoscedasticity were verified for both children and adult models and indicated that this approach was appropriate.
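To make the ICC concrete, the sketch below computes a one-way random-effects ICC from repeated speech rate measurements. The text does not state which ICC estimator was used (it may well have been derived from mixed-model variance components), so the classic one-way ANOVA estimator here is a stand-in, and the function name and example data are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for a balanced design.

    groups: one list of speech rates per participant, all of equal length.
    Uses (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between- and within-participant mean squares and k is the number of
    measurements per participant.
    """
    n = len(groups)
    k = len(groups[0]) if groups else 0
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least 2 participants with equal counts (>= 2)")
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    if denom == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (msb - msw) / denom


# Hypothetical data: three participants, two response productions each.
icc = icc_oneway([[3.0, 3.2], [4.0, 4.2], [5.0, 5.2]])
```

A value near 1 indicates that most of the variance in speech rate lies between participants rather than within them, which is the situation in which treating repeated productions as independent observations would be inappropriate.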
Results
Average speech rate (syllables per second) of the response productions in the slow and fast speech rate conditions was calculated for each participant across the four experimental groups and is illustrated in Figure 1. The within-participant factor was Condition (slow speech rate vs. fast speech rate), and the between-participant factor was Group (ASD-C, TD-C, ASD-A, and TD-A). Analysis of the adult groups (ASD-A and TD-A) revealed significant main effects of group and condition and a significant interaction between group and condition (see Table 2). Thus, the TD-A group spoke more quickly in the fast speech rate condition and more slowly in the slow speech rate condition. However, this entrainment of speech rate was not evident in the ASD-A group. Analyses of the children groups (ASD-C and TD-C) revealed no significant main effects of group or condition and no significant interaction between group and condition (see Table 2). Thus, neither the TD children nor the children with ASD modulated their speech rate to match that of the virtual interlocutor.
Table 2.
Effects | Adults t | Adults p | Children t | Children p |
---|---|---|---|---|
Condition | 2.52 | .01* | 0.61 | .55 |
Group | −2.55 | .02* | −0.32 | .75 |
Interaction (Condition × Group) | −2.60 | .01* | −1.60 | .12 |
*p < .05.
Collectively, results did not reveal speech rate entrainment in either the TD-C or ASD-C group. However, it is important to note that there was a relatively large age range of participants in these groups. In addition, there was a large degree of variance within the data of the children's groups. These factors warrant a more fine-grained analysis of the data from the children groups. Accordingly, post hoc linear mixed models were used to investigate associations between speech rate scores in each condition and potentially correlative variables (i.e., age and pragmatic scores) within the children participants, regardless of the presence or absence of ASD. First, examining age, analyses revealed a significant main effect of condition (p = .03). The main effect of age was nonsignificant. The interaction between age and condition was significant (p = .04). Thus, increased age is predictive of a greater degree of speech rate entrainment (see Figure 2). The analysis of pragmatic scores revealed no significant effects.
Discussion
The purpose of this study was to carry out an initial investigation into speech entrainment in children and adults with and without ASD. Our hypothesis, that typically developed populations would entrain their speech rate whereas individuals with ASD would not, was confirmed with the adult populations but not with children populations. Taken together, these findings suggest that speech rate entrainment requires a subset of skills that are generally acquired during childhood and that may be impaired, or essentially not acquired, in individuals with ASD.
This study revealed the occurrence of speech rate entrainment in TD adults. That is, the TD-A group employed a faster speech rate when responding to audiovisual recordings with a fast speech rate and a slower speech rate when responding to audiovisual recordings with a slow speech rate. In contrast, however, speech rate entrainment was not evident in adults with ASD. As seen in Figure 1, there is no significant difference in the speech rate of the ASD-A group in the slow and fast rate conditions. This discrepancy is consistent with research demonstrating a lack of entrainment in nonverbal behaviors in individuals with ASD (Mathersul et al., 2013; Nakano et al., 2011; Yoshimura et al., 2015) and extends the possibility of entrainment deficits to prosodic aspects of communication.
Overall, speech rate entrainment was not revealed in either group of children in the quasi-conversational paradigm employed in the current study. One possible explanation for a lack of entrainment in these populations is that the skills necessary to entrain (rhythmic detection, rhythmic action, and rhythmic integration; Phillips-Silver et al., 2010; Todd et al., 2002) are not yet fully acquired in this age group, particularly among the younger children. Upitis (1987), for example, found that rhythm perception and production skills significantly increased with age in children 7–12 years old. Because of the large amount of variation within the data and the large age range used in recruitment, a post hoc analysis was performed to investigate the relationship between entrainment and the child's age. As illustrated in Figure 2, when data are collapsed across all 30 children participants, speech rate entrainment skills became evident as the age of the children increased. Whereas younger children showed no signs of entrainment, it appears that the older children spoke at a slower rate in the slow speech rate condition and a faster rate in the fast speech rate condition. We acknowledge that combining data from children with ASD and TD children is not ideal (see also Limitations and Future Directions). In addition, the sample size for this analysis was small, and more research is certainly needed to substantiate these specific findings. However, the idea that the skills necessary for entrainment are acquired throughout childhood is not novel. Indeed, previous studies have shown differences between the entrainment capabilities of younger and older children (Garvey & BenDebba, 1974; Helt et al., 2010; Welkowitz, Cariffe, & Feldstein, 1976). For example, Welkowitz and colleagues (1976) found that entrainment of intrapersonal pausing was absent in children between the ages of 5 and 6 years but present in children only 1 year older.
Limitations and Future Directions
As there is little research examining speech entrainment in individuals with ASD, this study was primarily exploratory and was conducted to provide a foundation for future studies in this area. We therefore offer some important areas of investigation below. First, in this study, we did not confirm a diagnosis of ASD for participants within the children's group but relied on parent report of previous diagnosis or educational eligibility. As all adults had a verified diagnosis of ASD, this does not refute our findings that a discrepancy in entrainment exists between TD individuals and those with ASD. However, future investigators in this area should consider requiring formal documentation of diagnosis of ASD using a reliable assessment tool, such as the Autism Diagnostic Observation Schedule (Lord et al., 2012) in all groups of participants.
Next, because of the preliminary nature of this study, our sample size was relatively small, limiting some important avenues of investigation. For example, in this study, our post hoc analysis of the interaction between the child's age and speech entrainment was conducted with both the group with ASD and the TD group combined. Examining these relationships within separate groups would likely reveal important differences. In addition, it is possible that discrepancies in entrainment between children with and without ASD would be illuminated in older groups of children, where the skills necessary for entrainment are emerging. However, limited numbers of participants in our study make these analyses weak and unreliable. Further research, with increased participant numbers and the inclusion of age groups not represented in this study, would increase the power of findings, help us better understand the stages of entrainment development, and allow us to identify exactly when discrepancies between individuals with and without ASD emerge.
This study examined speech entrainment in a quasi-conversational paradigm with a pseudoconfederate in which the speech rate of the speaker was digitally manipulated. This type of paradigm was necessary for preliminary research, as it provided a high level of experimental control. However, future studies, targeting speech rate entrainment in embodied face-to-face conversations with naturally faster and slower speakers, will help to increase the ecological validity of current findings. Indeed, previous research has suggested that highly structured contexts (as in our study) may yield less obvious behavioral differences between children with and without ASD than more open-ended contexts (e.g., Landry & Loveland, 1989).
Next, further investigation of the possible link between entrainment and communicative and social outcomes is warranted. Although our study did not detect a relationship between pragmatic skills and entrainment in children, we did not investigate this relationship in adults. It is quite plausible that, as age increases and differences between groups in the ability to entrain to prosodic aspects of speech emerge, this relationship would be more readily detected. Indeed, researchers have provided some validation for these speculations, showing a relationship between pragmatic functioning and entrainment skills in nonverbal aspects of entrainment (Deschamps, Coppes, Kenemans, Schutter, & Matthys, 2015; Hermans, van Wingen, Bos, Putman, & van Honk, 2009; Yoshimura et al., 2015).
It is important to acknowledge that individual factors such as cognitive, linguistic, and speech impairments; comorbid disorders; and history of communicative treatments were only minimally controlled for in the participants recruited for the current study. Given the high prevalence of coexisting deficits and delays in the population with ASD (e.g., Joshi et al., 2010), this allowed for an ecologically valid sample of participants with ASD to be studied. However, as we do not yet fully understand the underlying factors contributing to entrainment deficits in this population, future studies with tighter controls may help better elucidate the locus of entrainment deficits. In addition, it is possible that the gender and age of participants may have contributed, to some degree, to entrainment differences between individuals. Although speech rate entrainment is a robust phenomenon, considered to occur regardless of gender (Levitan et al., 2012; Xia, Levitan, & Hirschberg, 2014), further investigation may be warranted.
Clinical Implications
Although investigation into this area of research is still in the initial stages, there are some important potential clinical implications that should be noted. First, it is well known that diagnosing ASD in adulthood is a difficult and largely inaccurate process (e.g., Francis, 2012; Trammell, Wilczynski, Dale, & McIntosh, 2013). As outlined above, further research is certainly needed to flesh out a comprehensive understanding of speech entrainment in populations with ASD. However, should speech entrainment deficits be confirmed as a pervasive characteristic of this population, quantifying entrainment in the speech domain may offer an objective assessment to supplement existing tools in supporting an ASD diagnosis.
Second, it is often difficult to capture specific behaviors that lead to conversational breakdowns and lack of connection in individuals with pragmatic difficulties, including populations with ASD. Indeed, these individuals often perform better on pragmatic tasks in concrete testing situations when compared with less structured communicative environments (Bishop & Adams, 1989; Volden & Phillips, 2010). Even in natural interactions (i.e., face-to-face conversation), an individual may perform several appropriate pragmatic behaviors while their conversation still lacks the natural cohesiveness and synchronization necessary for meaningful connection. With more research in entrainment of speech behavior and the associated conversational benefits, measuring the degree of entrainment in audio-recorded conversations may provide an objective, ecologically valid method to characterize conversational competence.
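As a purely illustrative sketch of what such a measure could look like, the snippet below computes a simple rate-shift index: a speaker's mean speech rate in the fast condition minus their mean speech rate in the slow condition. This is not the analysis used in the current study. The unit (syllables per second), the function names `speech_rate` and `rate_shift`, and the example values are all assumptions introduced for illustration.

```python
from statistics import mean


def speech_rate(syllables: int, duration_s: float) -> float:
    """Return syllables per second for one utterance (assumed unit)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return syllables / duration_s


def rate_shift(utterances) -> float:
    """Mean rate in the fast condition minus mean rate in the slow condition.

    `utterances` is an iterable of (condition, syllables, duration_s) tuples,
    where condition is "fast" or "slow". A positive value means the speaker
    talked faster after fast prompts than after slow prompts, which is the
    direction expected under speech rate entrainment.
    """
    rates = {"fast": [], "slow": []}
    for condition, syllables, duration_s in utterances:
        rates[condition].append(speech_rate(syllables, duration_s))
    if not rates["fast"] or not rates["slow"]:
        raise ValueError("need at least one utterance per condition")
    return mean(rates["fast"]) - mean(rates["slow"])


# Hypothetical values for one speaker, not data from the study.
example = [
    ("slow", 30, 10.0),
    ("slow", 32, 10.0),
    ("fast", 40, 10.0),
    ("fast", 42, 10.0),
]
shift = rate_shift(example)
print(f"rate shift: {shift:.2f} syllables/s")
```

A value near zero would indicate no detectable adjustment to the partner's rate. Any clinical use of such an index would first require validation against normative data.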
Furthermore, where deficits exist, targeting speech entrainment skills in individuals with ASD may be efficacious. Although we do not yet know if entrainment deficits in this population are a result of impairment in perception, production, and/or integration of speech rhythm, future research will seek to elucidate this. Such information will be important in determining treatment targets in the management of entrainment deficits. As discussed, the conversational, linguistic, and pragmatic benefits of entrainment are numerous. Therefore, treatment focused on the skills that strengthen entrainment may lead to enhanced outcomes in conversation and social connection for individuals with ASD.
Conclusion
In summary, we found that typically developed adults modified their speech rate in response to the speech rate of a pseudoconfederate; however, this entrainment pattern was not evident in the overall findings of children populations or adults with ASD. Thus, we provide preliminary evidence of speech rate entrainment deficits in adult populations with ASD and surmise that the absence of such entrainment may contribute to the conversational breakdowns experienced in this population. This suggests that entrainment may be an important aspect of communication to consider in the assessment and management of communication difficulties in populations with ASD. A number of important directions for advancing this line of inquiry are discussed.
Acknowledgments
This research was supported by the National Institute on Deafness and Other Communication Disorders, National Institutes of Health Grant R21DC016084 (awarded to Stephanie A. Borrie). The data included in this report were presented at the American Speech-Language-Hearing Association convention in Philadelphia, Pennsylvania, in November 2016. We gratefully acknowledge Paul Vicioso Osoria for development of the web-based application for this study, Tyson Barret for statistical input, and research assistants in the Human Interaction Lab at Utah State University for assistance with data collection and analysis.
Appendix
Example Transcript of Speech Exposure Recording
This is a picture from a book called The Berenstain Bears Go Green. I want you to describe this picture for me. You can tell me about what the houses look like. You can tell me about what the weather is like outside, or you can tell me about what the bears are doing or what they are wearing. Remember to keep talking until the timer runs out.
Footnotes
Other terms that have been used to describe this communication coordination phenomenon include accommodation, alignment, convergence, and synchronization.
As a starting point for investigations into speech entrainment in populations with ASD, we used a speech feature that is not already outside normal limits, allowing us to conclude that any deficits observed are a product of impaired entrainment, rather than existing abnormalities of speech.
Audiovisual recordings of an individual speaking were used to elicit speech from the participant in a turn-taking paradigm.
We selected this age range based on other studies examining entrainment in children with ASD (Helt, Eigsti, Snyder, & Fein, 2010; Oberman, Winkielman, & Ramachandran, 2009).
Although recordings regarding a children's picture book may not be ideal for adult participants, research has evidenced robust speech rate entrainment in adults, even when the task is devoid of meaningful communication (Borrie & Liss, 2014).
Although analysis of both groups in combination is not ideal, this was done for the post hoc analysis because of the limited number of participants at each age.
References
- Bailenson J. N., & Yee N. (2005). Digital chameleons: Automatic assimilation of nonverbal gestures in immersive virtual environments. Psychological Science, 16(10), 814–819.
- Bishop D. (2006). The Children's Communication Checklist (2nd ed., U.S. ed.). San Antonio, TX: Harcourt Assessment.
- Bishop D., & Adams C. (1989). Conversational characteristics of children with semantic–pragmatic disorder. II: What features lead to a judgement of inappropriacy? British Journal of Disorders of Communication, 24, 241–263.
- Boersma P., & Weenink D. (2015). Praat: Doing phonetics by computer (Version 5.4.05) [Computer software]. Retrieved from http://www.praat.org
- Borrie S. A., & Liss J. M. (2014). Rhythm as a coordinating device: Entrainment with disordered speech. Journal of Speech, Language, and Hearing Research, 57(3), 815–824.
- Borrie S. A., Lubold N., & Pon-Barry H. (2015). Disordered speech disrupts conversational entrainment: A study of acoustic-prosodic entrainment and communicative success in populations with communication challenges. Frontiers in Psychology, 6, 1187.
- Bossaert G., Colpin H., Pijl S. J., & Petry K. (2015). Quality of reciprocated friendships of students with special educational needs in mainstream seventh grade. Exceptionality, 23(1), 54–72.
- Branigan H. P., Pickering M. J., & Cleland A. A. (2000). Syntactic co-ordination in dialogue. Cognition, 74, 13–25.
- Brennan S. E., & Clark H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1482–1493.
- Chartrand T. L., & Bargh J. A. (1999). The chameleon effect: The perception-behavior link and social interaction. Journal of Personality and Social Psychology, 76, 893–910.
- Chermak G. D., & Schneiderman C. R. (1985). Speech timing variability of children and adults. Journal of Phonetics, 13(4), 477–480.
- Danescu-Niculescu-Mizil C., Gamon M., & Dumais S. (2011). Mark my words! Linguistic style accommodation in social media. In Proceedings of the 20th International Conference on World Wide Web (WWW '11) (pp. 745–754). New York, NY: ACM.
- Dean M., Kasari C., Shih W., Frankel F., Whitney R., Landa R., … Harwood R. (2014). The peer relationships of girls with ASD at school: Comparison to boys and girls with and without ASD. The Journal of Child Psychology and Psychiatry, 55(11), 1218–1225.
- Dennis M., Lazenby A. L., & Lockyer L. (2001). Inferential language in high-function children with autism. Journal of Autism and Developmental Disorders, 31(1), 47–54.
- Deschamps P. H., Coppes L., Kenemans J. L., Schutter D. G., & Matthys W. (2015). Electromyographic responses to emotional facial expressions in 6–7 year olds with autism spectrum disorders. Journal of Autism and Developmental Disorders, 45(2), 354–362.
- Francis K. G. (2012). The projection of autism spectrum disorders (ASDs) in adult life. Psychiatriki, 23(Suppl. 1), 66–73.
- Furuyama N., Hayashi K., & Mishima H. (2005). Interpersonal coordination among articulations, gesticulations, and breathing movements: A case of articulation of /a/ and flexion of the wrist. In Heft H. & Marsh K. L. (Eds.), Studies in perception and action (pp. 45–48). Mahwah, NJ: Erlbaum.
- Garvey C., & BenDebba M. (1974). Effects of age, sex, and partner on children's dyadic speech. Child Development, 45, 1159–1161.
- Heeman P. A., Lunsford R., Selfridge E., Black L., & Van Santen J. (2010). Autism and interactional aspects of dialogue. In Proceedings of the SIGDIAL 2010 Conference: 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 249–252). Stroudsburg, PA: Association for Computational Linguistics.
- Helt M. S., Eigsti I.-M., Snyder P. J., & Fein D. A. (2010). Contagious yawning in autistic and typical development. Child Development, 81(5), 1620–1631.
- Hermans E. J., van Wingen G., Bos P. A., Putman P., & van Honk J. (2009). Reduced spontaneous facial mimicry in women with autistic traits. Biological Psychology, 80(3), 348–353.
- Isenhower R. W., Marsh K. L., Richardson M. J., Helt M., Schmidt R. C., & Fein D. (2012). Rhythmic bimanual coordination is impaired in young children with autism spectrum disorder. Research in Autism Spectrum Disorders, 6(1), 25–31.
- Joshi G., Petty C., Wozniak J., Henin A., Fried R., Galdo M., … Biederman J. (2010). The heavy burden of psychiatric comorbidity in youth with autism spectrum disorders: A large comparative study of a psychiatrically referred population. Journal of Autism and Developmental Disorders, 40(11), 1361–1370.
- Klusek J., Martin G. E., & Losh M. (2014). A comparison of pragmatic language in boys with autism and fragile X syndrome. Journal of Speech, Language, and Hearing Research, 57(5), 1692–1707.
- Landry S. H., & Loveland K. A. (1989). The effect of social context on the functional communication skills of autistic children. Journal of Autism and Developmental Disorders, 19(2), 283–299.
- Lee C., Black M., Katsamanis A., Lammert A., Baucom B., Christensen A., … Narayanan S. (2010). Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (pp. 793–796). Makuhari, Japan: INTERSPEECH.
- Levitan R., Gravano A., Willson L., Benus S., Hirschberg J., & Nenkova A. (2012). Acoustic-prosodic entrainment and social behavior. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human language technologies (pp. 11–19). Montreal, Canada: North American Chapter of the Association for Computational Linguistics.
- Local J. (2007). Phonetic detail and the organization of talk-in-interaction. In Proceedings of the 16th ICPhS (ID 1785). Saarbrücken, Germany: ICPhS.
- Lord C., Rutter M., DiLavore P. C., Risi S., Gotham K., & Bishop S. (2012). Autism diagnostic observation schedule, second edition. Torrance, CA: Western Psychological Services.
- Louwerse M. M., Dale R., Bard E. G., & Jeuniaux P. (2012). Behavior matching in multimodal communication is synchronized. Cognitive Science, 36(8), 1404–1426.
- Loveland K. A., Landry S. H., Hughes S. O., Hall S. K., & McEvoy R. E. (1988). Speech acts and the pragmatic deficits of autism. Journal of Speech and Hearing Research, 31, 593–604.
- Manson J. H., Bryant G. A., Gervais M. M., & Kline M. A. (2013). Convergence of speech rate in conversation predicts cooperation. Evolution and Human Behavior, 34(6), 419–426.
- Marsh K. L., Isenhower R. W., Richardson M. J., Helt M., Verbalis A. D., Schmidt R. C., & Fein D. (2013). Autism and social disconnection in interpersonal rocking. Frontiers in Integrative Neuroscience, 7, 4.
- Mathersul D., McDonald S., & Rushby J. A. (2013). Automatic facial responses to briefly presented emotional stimuli in autism spectrum disorder. Biological Psychology, 94(2), 397–407.
- McIntosh D. N., Reichmann-Decker A., Winkielman P., & Wilbarger J. L. (2006). When the social mirror breaks: Deficits in automatic, but not voluntary, mimicry of emotional facial expressions in autism. Developmental Science, 9(3), 295–302.
- Miles L. K., Nind L. K., & Macrae C. N. (2009). The rhythm of rapport: Interpersonal synchrony and social perception. Journal of Experimental Social Psychology, 45, 585–589.
- Nadig A., & Shaw H. (2012). Acoustic and perceptual measurement of expressive prosody in high-functioning autism: Increased pitch range and what it means to listeners. Journal of Autism and Developmental Disorders, 42(4), 499–511.
- Nakano T., Kato N., & Kitazawa S. (2011). Lack of eyeblink entrainments in autism spectrum disorders. Neuropsychologia, 49(9), 2784–2790.
- Nip I. S. B., & Green J. R. (2013). Increases in cognitive and linguistic processing primarily account for increases in speaking rate with age. Child Development, 84(4), 1324–1337.
- Oberman L. M., Winkielman P., & Ramachandran V. S. (2009). Slow echo: Facial EMG evidence for the delay of spontaneous, but not voluntary, emotional mimicry in children with autism spectrum disorders. Developmental Science, 12(4), 510–520.
- Ozonoff S., & Miller J. N. (1996). An exploration of right-hemisphere contributions to the pragmatic impairments of autism. Brain and Language, 52(3), 411–434.
- Paul R., Shriberg L. D., McSweeny J., Cicchetti D., Klin A., & Volkmar F. (2005). Brief report: Relations between prosodic performance and communication and socialization ratings in high functioning speakers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 35(6), 861–869.
- Phillips-Silver J., Aktipis C. A., & Bryant G. (2010). The ecology of entrainment: Foundations of coordinated rhythmic movement. Music Perception, 28(1), 3–14.
- Pickering M. J., & Garrod S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27, 169–226.
- Putman W. B., & Street R. L. Jr. (1984). The conception and perception of noncontent speech performance: Implications for speech accommodation theory. International Journal of Sociology of Language, 46, 97–114.
- Rapin I., & Dunn M. (2003). Update on the language disorders of individuals on the autistic spectrum. Brain and Development, 25(3), 166–172.
- Reitter D., Moore J. D., & Keller F. (2006). Priming of syntactic rules in task-oriented dialogue and spontaneous conversation. In Sun R. & Miyake N. (Eds.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (pp. 685–690). Sussex, England: Psychology Press.
- Shriberg L. D., Paul R., McSweeny J. L., Klin A., Cohen D. J., & Volkmar F. R. (2001). Speech and prosody characteristics of adolescents and adults with high-functioning autism and Asperger syndrome. Journal of Speech, Language, and Hearing Research, 44(5), 1097–1115.
- Smith E. R. (2008). An embodied account of self-other “overlap” and its effects. In Semin G. R. & Smith E. R. (Eds.), Embodied grounding: Social, cognitive, affective and neuroscientific approaches (pp. 148–159). New York, NY: Cambridge University Press.
- Solish A., Perry A., & Minnes P. (2010). Participation of children with and without disabilities in social, recreational and leisure activities. Journal of Applied Research in Intellectual Disabilities, 23(3), 226–236.
- Street R. L. Jr., & Giles H. (1982). Speech accommodation theory: A social cognitive approach to language and speech behavior. In Roloff M. E. & Berger C. R. (Eds.), Social cognition and communication (pp. 193–226). Beverly Hills, CA: Sage.
- Taheri A., Perry A., & Minnes P. (2016). Examining the social participation of children and adolescents with intellectual disabilities and autism spectrum disorder in relation to peers: Social participation of children with ID and ASD. Journal of Intellectual Disability Research, 60(5), 435–443.
- Todd N. P. M., Lee C. S., & O'Boyle D. J. (2002). A sensorimotor theory of temporal tracking and beat induction. Psychological Research, 66(1), 26–39.
- Tordjman S., Davlantis K. S., Georgieff N., Geoffray M.-M., Speranza M., Anderson G. M., … Dawson G. (2015). Autism as a disorder of biological and behavioral rhythms: Toward new therapeutic perspectives. Frontiers in Pediatrics, 3, 1.
- Trammell B., Wilczynski S. M., Dale B., & Mcintosh D. E. (2013). Assessment and differential diagnosis of comorbid conditions in adolescents and adults with autism spectrum disorders. Psychology in the Schools, 50(9), 936–946.
- Truong K. P., & Trouvain J. (2012). Laughter annotations in conversational speech corpora: Possibilities and limitations for phonetic analysis. In Proceedings of Workshop on Corpora for Research on Emotion Sentiment and Social Signals (pp. 20–24). Istanbul, Turkey: Corpora for Research on Emotion.
- Upitis R. (1987). Children's understanding of rhythm: The relationship between development and music training. Psychomusicology: A Journal of Research in Music Cognition, 7(1), 41–60.
- Volden J., & Phillips L. (2010). Measuring pragmatic language in speakers with autism spectrum disorders: Comparing the Children's Communication Checklist-2 and the Test of Pragmatic Language. American Journal of Speech-Language Pathology, 19(3), 204–212.
- Welkowitz J., Cariffe G., & Feldstein S. (1976). Conversational congruence as a criterion of socialization in children. Child Development, 47(1), 269–272.
- Wiig E. H., Semel E., & Secord W. A. (2013). Clinical Evaluation of Language Fundamentals–Fifth Edition (CELF-5). Bloomington, MN: NCS Pearson.
- Wilson M., & Wilson T. P. (2005). An oscillator model of the timing of turn-taking. Psychonomic Bulletin and Review, 12, 957–968.
- Xia Z., Levitan R., & Hirschberg J. (2014). Prosodic entrainment in Mandarin and English: A cross-linguistic comparison. Proceedings of the International Conference on Speech Prosody (pp. 65–69).
- Yoshimura S., Sato W., Uono S., & Toichi M. (2015). Impaired overt facial mimicry in response to dynamic facial expressions in high-functioning autism spectrum disorders. Journal of Autism and Developmental Disorders, 45(5), 1318–1328.