Journal of Speech, Language, and Hearing Research (JSLHR). 2020 Aug 5;63(8):2567–2577. doi: 10.1044/2020_JSLHR-20-00104

Conversational Coordination of Articulation Responds to Context: A Clinical Test Case With Traumatic Brain Injury

Stephanie A Borrie a, Camille J Wynn a, Visar Berisha b, Nichola Lubold b, Megan M Willi c, Carl A Coelho d, Tyson S Barrett e
PMCID: PMC7872735  PMID: 32755503

Abstract

Purpose

Coordination of communicative behavior supports shared understanding in conversation. The current study brings together analysis of two speech coordination strategies, entrainment and compensation of articulation, in a preliminary investigation into whether strategy organization is shaped by a challenging communicative context—conversing with a person who has a communication disorder.

Method

As an initial clinical test case, an automated measure of articulatory precision was analyzed in a corpus of spoken dialogue, where a confederate conversed with participants with traumatic brain injury (n = 28) and participants with no brain injury (n = 48).

Results

Overall, the confederate engaged in significant entrainment and high compensation (hyperarticulation) in conversations with participants with traumatic brain injury relative to significant entrainment and low compensation (hypoarticulation) in conversations with participants with no brain injury. Furthermore, the confederate's articulatory precision changed over the course of the conversations.

Conclusions

Findings suggest that the organization of conversational coordination is sensitive to context, supporting synergistic models of spoken dialogue. While corpus limitations are acknowledged, these initial results point to differences in the way in which speech strategies are realized in challenging communicative contexts, highlighting a viable and important target for investigation with clinical populations. A framework for investigating speech coordination strategies in tandem and ideas for advancing this line of inquiry serve as key contributions of this work.


Key for successful conversation is coordinative communicative behavior, supporting a sense of shared understanding between interlocutors (Clark & Brennan, 1991). One coordination strategy considered to facilitate mutual understanding in conversation is “entrainment,” in which interlocutors modify their communicative behaviors to become more like one another. Theoretically rooted in the interactive alignment model (see Pickering & Garrod, 2004), entrainment has been evidenced in many aspects of speech productions, including rate (Manson et al., 2013), pitch (Lee et al., 2014), intensity (Natale, 1975), quality (Borrie & Delfino, 2017), and, importantly for the current study, articulatory precision (Lubold et al., 2019). The coordination strategy of entrainment extends beyond the level of speech production to include linguistic (e.g., lexical choice; Ivanova et al., 2020) and kinesthetic (e.g., body posture; Shockley et al., 2003) behaviors. Through an automatic structural priming mechanism (Pickering & Ferreira, 2008), interactive alignment models theorize that entrained behavior at any level of communication will induce entrainment at other levels, with cognitive processes ultimately aligning and resulting in shared conceptualization of a situation. Empirical support for the role of entrainment in facilitating understanding between interlocutors has been found in studies that report a relationship with communicative efficiency, in which highly entrained dyads achieve better collaborative performance in goal-directed verbal communication tasks (e.g., Nenkova et al., 2008, for lexical entrainment; Borrie et al., 2019, for acoustic–prosodic entrainment). Beyond the cognitive role, entrainment is considered to play an essential social function in conversation. 
The coordination strategy of becoming more similar to one's communication partner has been linked with prosocial behaviors such as turn-taking, rapport, empathy, and cooperation (e.g., Chartrand & Bargh, 1999; Manson et al., 2013; Wilson & Wilson, 2005).

Interlocutors may also engage in “compensation,” modifying their behaviors to counter understanding breakdowns in challenging communicative contexts. One type of compensatory behavior is hyperarticulated speech. According to Lindblom's continuum of hyperarticulation–hypoarticulation theory (Lindblom, 1990), talkers are thought to preferentially use low-effort, imprecise articulation (hypoarticulation) in conversation; however, they will engage in higher effort, more precise articulation (hyperarticulation) if they think it will support message exchange. Thus, speakers will modulate the level of their articulatory precision depending on the context, balancing a sort of trade-off between economy of effort (talker-oriented output) and clarity of speech (listener-oriented output; Smiljanić & Bradlow, 2009). Empirical support for the strategy of increasing articulatory effort to improve understanding in challenging communicative contexts has been documented in many studies of instruction-induced, clear speech (e.g., Bradlow & Bent, 2002; Ferguson & Kewley-Port, 2002; Hazan & Barker, 2011). Recently, studies have begun to explore this compensatory strategy in more naturalistic conversational settings, evidencing that interlocutors spontaneously produce speech indicative of hyperarticulation when interacting with non-native individuals (Lee & Baese-Berk, 2020) and individuals with hearing impairment (Granlund et al., 2018).

While both entrainment and compensation of articulatory behavior are considered types of speech coordination that can support understanding in conversation, these strategies have not been studied in tandem, likely because the theoretical models that have been used to explain these strategies focus on a sole behavior: either entrainment or compensation. Recently, however, Fusaroli et al. (2014) presented a more comprehensive and inclusive conversational framework accounting for multiple coordination strategies (see also Riley et al., 2011). Termed “interpersonal synergy,” this dynamic model of spoken dialogue purports that conversational coordination is an emergent, self-organizing system involving both alignment (i.e., entrainment) and complementary (i.e., compensatory) behaviors. Furthermore, and importantly, this synergistic model of conversation predicts that the organization of coordinative behavior, the overall coordination scheme, will be functionally constrained by the communicative context. That is, coordinative strategies will be selectively engaged as a functional unit, dependent on the communicative context in which the conversation is immersed (Fusaroli et al., 2014).

One communicative context that may shape the organization of speech coordination strategies is conversations with persons with communication disorders. In these conversations, breakdowns in shared understanding are a common occurrence, albeit for many different reasons. While communication breakdowns have traditionally been attributed to the individual with impairment, conversation is a dyadic phenomenon, and thus, the contributions of the communication partner, the neurotypical interlocutor, play an important role. Still a relatively new area of inquiry in speech-language pathology, there is growing evidence that entrainment of speech behaviors may be problematic for conversational dyads when one partner has a communication disorder (e.g., Borrie et al., 2015, 2020; Gordon et al., 2015). In interactions involving an individual with a communication disorder, one could envision that the neurotypical interlocutor would increase the precision of their articulation in an attempt to compensate for communication breakdowns, perhaps even as a direct result of insufficient entrainment; however, studies of hyperarticulation in conversational settings in communication disorders are scarce. To our knowledge, no study has directly investigated coordination in conversation with regard to both entrainment and compensation of articulatory behavior and, more specifically, whether the coordination strategies of the neurotypical interlocutor are shaped by a challenging communicative context, namely, a partner with a communication disorder.

Clinical Test Case: Traumatic Brain Injury

It is well established that conversations involving persons with traumatic brain injury (TBI) are challenging. To briefly summarize, even when speech and language abilities are deemed within normal limits, these conversations have been qualitatively characterized as less interesting, less rewarding, less appropriate, and more effortful than conversations involving individuals with no brain injury (NBI) (Bond & Godfrey, 1997; see also Coelho, 1995, for a review). Deficits in areas of social cognition, including pragmatics, empathy, and theory of mind, have been advanced as key contributors to the conversational breakdowns following TBI (Neumann et al., 2019). A recent and, to our knowledge, only study of entrainment in TBI revealed that interactions between neurotypical interlocutors and persons with TBI (who exhibited no evidence of dysarthria or aphasia) were characterized by significantly less entrainment in the production of words and number of words per turn relative to interactions between two neurotypical interlocutors (Gordon et al., 2015). While more research is needed, the authors connect the finding of disrupted entrainment in TBI conversations to the known deficits that this clinical population has with social communication.

The Current Study

As an initial investigation into whether conversing with a partner with a communication disorder regulates coordination of articulatory behavior, we carry out a test case using a large, existing corpus of spoken dialogue, where a confederate conversed with participants with TBI and NBI. Previous documentation of this corpus noted that, while speech and language abilities of the participants with TBI were within normal limits, social cognitive deficits were evident. Importantly, qualitative characterization of this corpus revealed that the conversations with participants with TBI, relative to those with participants with NBI, were collectively characterized as unsuccessful, tangential, and lacking in interactional flow (Coelho et al., 2002). Thus, the TBI conversations in this corpus represent a challenging communicative context, which could have, in theory, influenced strategy organization of the neurotypical interlocutor. Figure 1 provides a schematic of the framework used to study conversational coordination in the current study. 
Our first research question targets entrainment of articulatory behavior, operationally defined and measured as simple, turn-by-turn alignment between speakers' adjacent speaking turns, 1 asking (1) “Does the confederate's degree of articulatory entrainment change, depending on whether they are conversing with a participant with TBI or NBI?” The second question targets compensation of articulatory behavior, operationally defined and measured as change in an individual speaker's behavior from one communicative context to another, asking (2) “Does the confederate's articulatory precision change, depending on whether they are conversing with a participant with TBI or NBI?” Our third and final question examines articulatory behavior over time, asking (3) “Does the effect of time on the confederate's articulatory behavior change over the course of the interaction, depending on whether they are conversing with a participant with TBI or NBI?” We also include statistics regarding the participants' articulatory precision to elucidate the complete picture of these behaviors in conversation. Our overarching hypothesis is that the confederate's coordination strategies will be influenced by the communicative context. More specifically, we hypothesize that the confederate's behavior in conversations with participants with TBI will be characterized by low entrainment–high compensation relative to high entrainment–low compensation in the conversations with participants with NBI. We also hypothesize that the effect of time on the confederate's articulatory behavior will be different with participants with TBI and NBI, supporting the idea that coordinative strategies are dynamic and adapt to the needs of the conversation, which presumably become increasingly apparent as the conversations unfold. We plainly acknowledge that this study is simply an initial test case, limited by a single confederate.
However, recognizing conversational coordination as interpersonal synergy and examining whether coordinative strategies vary in challenging communicative contexts provides an important foundation for future work, particularly that which seeks to understand and address conversational breakdowns in clinical populations.

Figure 1.


Schematic depicting our framework of conversational coordination, in which entrainment and compensation of articulatory precision are investigated in tandem.

Method

Conversation Corpus

This study utilized an established corpus of conversational discourse with adults with TBI and NBI, described in detail by Coelho et al. (2002). To summarize, the corpus involved 28 participants with TBI and 48 controls with NBI. The participants with TBI, seven females and 21 males between the ages of 16 and 69 years (M = 32.9 years, SD = 14.7), all presented with a high level of functional speech and language—operationally defined as fluent conversational skills, no significant deficits on traditional clinical language tests, and no significant motor speech disorder as determined by an experienced speech-language pathologist. However, as also determined by the speech-language pathologist and manifested in conversational behavior, the participants with TBI presented with social cognitive deficits. Please refer to Coelho et al. (2002) for comprehensive testing details and demographic information of the participants, including race, education, socioeconomic status, duration of coma, and month after onset of injury. The participants with NBI, 16 females and 32 males between the ages of 16 and 63 years (M = 32 years, SD = 13.6), were hospital employees, specifically selected to match the participants with TBI as closely as possible in terms of age, gender, race, socioeconomic status, and level of education.

Each participant, the individuals with TBI and the controls with NBI, engaged in a conversation lasting approximately 15 min (M = 14.4 min, SD = 1.4) with a confederate. 2 The confederate, a 42-year-old man with a clinical background in speech-language pathology, initiated the interactions by asking the question “Why are you here at the hospital/rehabilitation center today?” The confederate and participants did not know one another prior to the conversation elicitation task and, as this study is a post hoc analysis of an existing corpus, were blind to the purpose of the current study (i.e., analysis of speech coordination strategies). Each conversation was audio-recorded using a high-quality external microphone placed midway between interlocutors and transcribed verbatim by research assistants. The corpus and complete orthographic transcripts were made available for future research, such as the study completed here, by TalkBank at https://talkbank.org/.

Feature Extraction

Each conversation was partitioned into individual spoken utterances, operationally defined as a continuous segment of speech beginning and ending with a pause of greater than 0.5 s or a change in speaker. Table 1 highlights the mean number and duration of utterances by dyad type (TBI vs. NBI) and interlocutor (confederate vs. participant). T tests revealed no significant difference between the number of utterances of the confederate, t(74) = 0.75, p = .46, or participants, t(74) = 0.18, p = .86, in TBI and NBI dyads. Additionally, there was no significant difference between utterance duration of the confederate, t(74) = 1.9, p = .06, and participants, t(74) = 1.9, p = .06, in TBI and NBI dyads.
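The utterance definition above (a stretch of speech bounded by a pause longer than 0.5 s or by a change in speaker) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Interval` record, the speaker labels, and the idea that time-stamped speech intervals are already available are hypothetical and are not taken from the corpus documentation.

```python
from dataclasses import dataclass

PAUSE_THRESHOLD_S = 0.5  # from the text: a pause longer than 0.5 s ends an utterance


@dataclass
class Interval:
    speaker: str   # hypothetical labels, e.g. "confederate" or "participant"
    start: float   # onset in seconds
    end: float     # offset in seconds


def segment_utterances(intervals, pause_threshold=PAUSE_THRESHOLD_S):
    """Merge time-ordered speech intervals into utterances.

    A new utterance starts when the speaker changes or when the silent gap
    since the previous interval exceeds the pause threshold.
    """
    utterances = []
    for iv in sorted(intervals, key=lambda x: x.start):
        if utterances:
            last = utterances[-1]
            same_speaker = last.speaker == iv.speaker
            gap = iv.start - last.end
            if same_speaker and gap <= pause_threshold:
                last.end = max(last.end, iv.end)  # extend the current utterance
                continue
        # Copy the interval so the caller's input is never mutated.
        utterances.append(Interval(iv.speaker, iv.start, iv.end))
    return utterances


# Made-up intervals for illustration only.
example = [
    Interval("confederate", 0.0, 1.2),
    Interval("confederate", 1.4, 2.0),   # 0.2 s gap: same utterance
    Interval("confederate", 2.8, 3.5),   # 0.8 s gap: new utterance
    Interval("participant", 3.6, 5.0),   # speaker change: new utterance
]
utterances = segment_utterances(example)
```

With these made-up intervals, the first two confederate intervals merge into one utterance, the third stands alone because of the longer pause, and the participant's interval starts a new utterance because the speaker changes.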

Table 1.

Comparison of utterances by interlocutor and dyad type.

Variable, M (SD)              Confederate                    Participant
                              TBI            NBI             TBI            NBI
Number of utterances          148.0 (57.9)   157.6 (51.6)    236.2 (70.1)   238.9 (58.1)
Duration of utterances (s)    2.0 (1.7)      1.8 (1.6)       2.7 (2.2)      2.5 (1.9)

Note. TBI = traumatic brain injury; NBI = no brain injury.

An automated approach to scoring pronunciation was used to extract measures of articulatory precision (Lubold et al., 2019; Tu et al., 2018; Witt & Young, 2000). This approach results in a precision score for every phoneme in the corpus. Described in full detail by Tu et al. (2018), the automation aligns the transcriptions at the phoneme level to their acoustic counterparts utilizing an acoustic model for English based on the LibriSpeech corpus (Panayotov et al., 2015). With the aligned transcripts and audio, an articulatory precision score (APS) can then be calculated for a phoneme p based on the log-posterior probability of observing phoneme p normalized by phoneme duration. In the following equation for calculating the precision score for phoneme p, O_p is the corresponding acoustic segment, |O_p| is the number of frames in the segment, and Q is the set of all phonemes:

$$\mathrm{APS}_p = \frac{\log P(p \mid O_p)}{|O_p|} \approx \frac{1}{|O_p|}\,\log\frac{P(O_p \mid p)}{\max_{q \in Q} P(O_q \mid q)} \tag{1}$$

The above equation assumes equal priors for all phonemes. If the phoneme returned by the acoustic model is the same as the target phoneme p, then the APS is equal to 0. Otherwise, the score will be negative; the smaller the score (i.e., the farther from zero), the farther the pronunciation is from that defined by the acoustic model built from the LibriSpeech corpus. Because the LibriSpeech corpus is read speech, this measure is an evaluation of articulatory precision as defined by “read” speech. We interpret scores farther from zero as less precise. The precision scores for the phonemes in an utterance were averaged to obtain an APS.
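The scoring logic described above can be illustrated with a small sketch. It assumes that, for each aligned segment, one log-likelihood per candidate phoneme is already available from an acoustic model; the function names and the numeric values are hypothetical and do not come from the authors' pipeline or from Tu et al. (2018).

```python
def phoneme_aps(target, log_likelihoods, n_frames):
    """Duration-normalized log ratio between the target phoneme's likelihood
    and the best-scoring phoneme's likelihood.

    The score is 0 when the target phoneme is the best match and negative
    otherwise, mirroring the behavior described in the text.
    """
    best = max(log_likelihoods.values())
    return (log_likelihoods[target] - best) / n_frames


def utterance_aps(phoneme_scores):
    """Average the phoneme-level scores to obtain one score per utterance."""
    return sum(phoneme_scores) / len(phoneme_scores)


# Made-up log-likelihoods for two segments (illustrative values only).
seg1 = {"ae": -40.0, "eh": -46.0, "ih": -52.0}   # target "ae" is the best match
seg2 = {"t": -33.0, "d": -30.0, "k": -39.0}      # target "t" loses to "d"

scores = [
    phoneme_aps("ae", seg1, n_frames=8),
    phoneme_aps("t", seg2, n_frames=6),
]
aps = utterance_aps(scores)
```

Here the first segment scores 0 because the target phoneme is the most likely candidate, the second scores −0.5, and the utterance-level score is their mean, −0.25.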

We acknowledge that articulatory precision, in the way it is measured here, is likely a gestalt metric in which a number of acoustic features contribute. Indeed, previous studies have reported a relationship between articulatory precision and speech rate, with measures associated with underspecified or reduced articulatory movements correlating with fast speech rates (see Mefferd & Green, 2010, for a review). As such, we also extracted a measure of speech rate (in syllables per second) for each spoken utterance. A Pearson's correlation revealed a small relationship between articulatory precision and speech rate, r = −.27.

Data Analysis

Linear mixed models were used to analyze the study data in the R statistical environment (R Version 3.6.1; R Development Core Team, 2019) using the lme4 (lme4 package Version 1.1-19; Bates et al., 2015), lmerTest (Kuznetsova et al., 2017), and ggplot2 (Wickham, 2016) packages. This type of analysis was used to investigate the effects of the independent variables on articulatory precision while controlling for individual variability across the repeated measures. For all models, the random effects structure included a random intercept by conversation. Additionally, in analyses involving both the confederate and participants, a random intercept by speaker was included to control for the repeated use of a single confederate. Fixed effects varied across analyses and included dyad type (denoting communicative context; TBI vs. NBI), interlocutor (confederate vs. participant), and time (minutes). Additionally, to ensure results were not confounded by potential differences in audio recording quality, we extracted the signal-to-noise ratio (SNR) for every conversation and controlled for SNR in the statistical models. Note that there was no statistically significant effect of SNR in any of the models. Given that reduced speech rate contributes to our gestalt metric of articulatory precision, we selected to not control for it in our models in an attempt to most accurately represent the holistic measure. 3 Finally, all p values reported in conjunction with estimates of effect are based on Satterthwaite approximation to degrees of freedom.

To measure entrainment of articulatory behavior, operationally defined as simple, turn-by-turn alignment where one speaker's behavior in a single spoken utterance predicts the behavior of their communication partner in the adjacent spoken utterance, we used linear mixed models similar to those used in previous studies (Lubold et al., 2019; Seidl et al., 2018). In these models, the articulatory precision of one speaker (firstAP) was fit to predict the articulatory precision of the second speaker's adjacent spoken utterance (secondAP). The degree to which secondAP was predicted by firstAP is indicative of the influence of one speaker's articulatory precision on the precision of the other speaker. As such, only spoken utterances between conversational partners (as opposed to consecutive utterances produced by the same individual) were included in the entrainment analysis.
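The pairing step described above, in which only adjacent utterances from different speakers enter the entrainment analysis, might look as follows. This is a minimal sketch: the tuple layout, the speaker labels, and the precision values are made up for illustration and are not the authors' code.

```python
def adjacent_pairs(utterances):
    """Return (first_speaker, firstAP, second_speaker, secondAP) for every pair
    of consecutive utterances produced by different speakers."""
    pairs = []
    for prev, curr in zip(utterances, utterances[1:]):
        if prev[0] != curr[0]:  # skip consecutive utterances by the same speaker
            pairs.append((prev[0], prev[1], curr[0], curr[1]))
    return pairs


# (speaker, utterance-level articulatory precision); values are made up.
conversation = [
    ("participant", -2.1),
    ("confederate", -1.8),
    ("confederate", -2.0),
    ("participant", -2.4),
]
pairs = adjacent_pairs(conversation)

# Rows in which the confederate speaks second describe how the confederate's
# precision follows the participant's preceding utterance.
confederate_rows = [p for p in pairs if p[2] == "confederate"]
```

Each resulting row supplies one firstAP/secondAP observation for the mixed model; the confederate-to-confederate transition in the example is dropped, as the text specifies.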

To measure compensation of articulatory behavior, operationally defined as change in an individual speaker's behavior from one context to another, linear mixed models were used to compare the confederate's articulatory precision when interacting with the participants with TBI to his precision when interacting with the participants with NBI. All spoken utterances from the conversations were used in this analysis. Although our research questions target the articulatory behavior of the confederate, entrainment and compensation models included participant data to provide a more complete perspective. All codes, model output, and supplementary materials associated with this work are available at the study repository hosted at https://osf.io/c36z8/.

Results

Entrainment of Articulatory Precision

Model Selection

Our first research question addressed entrainment of the confederate's articulatory behavior. Importantly, entrainment here is the statistical correlation between adjacent utterances, while accounting for intraconversation variability and the cross-conversation variability of the confederate. As is considered best practice with linear mixed models, the best-fitting, most parsimonious linear mixed-effects model was selected using a series of likelihood ratio tests (Hox et al., 2018). In each model, the outcome was secondAP. Likelihood ratio tests indicated that Model 3, which included interactions between firstAP and interlocutor and between dyad type and interlocutor, was the best-fitting model (p = .013; see Table 2).

Table 2.

Linear mixed-model fit indices for entrainment and compensation models of interest, where Model 1 is compared to Model 2, Model 2 is compared to Model 3, and so forth.

Model      Fixed effects                                                                    AIC      BIC      Log likelihood   χ2       χ2 difference   p
Entrainment models (dependent variable: secondAP)
Model 1    FirstAP + dyad type + interlocutor + SNR                                         61851    61913    −30918           61835
Model 2    FirstAP + dyad type × interlocutor + SNR                                         61848    61918    −30915           61830    5.470           .019
Model 3†   FirstAP × interlocutor + dyad type × interlocutor + SNR                          61844    61921    −30912           61824    6.216           .013
Model 4    FirstAP × dyad type + firstAP × interlocutor + dyad type × interlocutor + SNR    61843    61928    −30910           61821    2.584           .108
Model 5    FirstAP × dyad type × interlocutor + SNR                                         61844    61937    −30910           61820    0.626           .429
Compensation models (dependent variable: articulatory precision)
Model 1    Dyad type + interlocutor + SNR                                                   105750   105808   −52868           105736
Model 2†   Dyad type × interlocutor + SNR                                                   105744   105811   −52864           105728   7.939           .005

Note. † marks the best-fitting model based on likelihood ratio tests. AIC = Akaike information criterion; BIC = Bayesian information criterion; SNR = signal-to-noise ratio.
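As a quick check on how the likelihood ratio tests in Table 2 work, the p values can be recomputed from the reported χ2 differences. The sketch below assumes each successive model adds a single parameter (1 degree of freedom), which is an inference from the model formulas rather than something the table states; the function name is hypothetical.

```python
import math


def lrt_p_value_df1(chi2_difference):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2_difference / 2.0))


# Chi-square differences as reported in Table 2.
reported = {
    "entrainment: Model 1 vs. 2": 5.470,
    "entrainment: Model 2 vs. 3": 6.216,
    "entrainment: Model 3 vs. 4": 2.584,
    "entrainment: Model 4 vs. 5": 0.626,
    "compensation: Model 1 vs. 2": 7.939,
}
p_values = {name: round(lrt_p_value_df1(x), 3) for name, x in reported.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Under the 1-degree-of-freedom assumption, the recomputed values agree with the p column of Table 2 to three decimals, which is consistent with each comparison testing one added interaction term.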

Model Results

Table 3 highlights the estimated coefficients for Model 3. The model indicates that the confederate entrained significantly more overall than the participants (b = 0.038, p = .012). However, additional analysis indicated that a significant level of entrainment was achieved by both the confederate (b = 0.125, p < .001) and the participants (b = 0.073, p < .001). Thus, the articulatory precision of both the confederate and participants was predictive of one another across adjacent utterances. For a data-driven example of this entrainment result in an NBI and TBI conversation, see Supplemental Material S1. Furthermore, and key for the current study, the degree of articulatory entrainment of both the confederate and participants was not significantly different in the TBI and NBI conversations (p = .107). Thus, the answer to our first research question—“Does the degree of articulatory entrainment of the confederate change, depending on whether they are conversing with a participant with TBI or NBI?”—is “no.” 4

Table 3.

Results of fixed effects for Model 3 of the entrainment analysis.

Term                                     B        SE     t        p
Intercept                                −1.627   .259   −6.279   < .001***
Articulatory precision                   .112     .012   9.559    < .001***
Interlocutor                             .098     .089   1.097    .275
Dyad type                                .404     .149   2.713    .008**
SNR                                      −.051    .031   −1.601   .114
Articulatory precision × interlocutor    .038     .015   2.507    .012*
Dyad type × interlocutor                 .313     .137   2.280    .025*

Note. SNR = signal-to-noise ratio.

*p < .05. **p < .01. ***p < .001.
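Read together with the Data Analysis section, the fixed effects in Table 3 can be written out as one model equation. The version below is a sketch in notation introduced here for illustration, not a formula given by the authors: the subscripts (utterance pair i, conversation c, speaker s), the symbols u_c and v_s for the two random intercepts, and the ordering of the β terms are assumptions chosen to mirror the table rows.

```latex
\[
\begin{aligned}
\text{secondAP}_{ics} ={}& \beta_0 + \beta_1\,\text{firstAP}_{ics} + \beta_2\,\text{Interlocutor}_{ics}
  + \beta_3\,\text{DyadType}_{c} + \beta_4\,\text{SNR}_{c} \\
&+ \beta_5\,(\text{firstAP}_{ics} \times \text{Interlocutor}_{ics})
  + \beta_6\,(\text{DyadType}_{c} \times \text{Interlocutor}_{ics}) \\
&+ u_c + v_s + \varepsilon_{ics}, \\
u_c \sim \mathcal{N}(0, \sigma_u^2), \quad
& v_s \sim \mathcal{N}(0, \sigma_v^2), \quad
\varepsilon_{ics} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
\]
```

In this reading, β5 captures how much the entrainment slope differs between the two interlocutors and β6 captures how much the dyad type difference depends on the interlocutor; which level of each factor serves as the reference is not stated in the table.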

Compensation of Articulatory Precision

Model Selection

The second research question assessed compensation of the confederate's articulatory behavior. The same approach used for selecting the best-fitting, most parsimonious model in entrainment was used here with compensation. Because the analysis only focused on a single interactive effect (dyad type × interlocutor) on articulatory precision, only two models were tested (see Table 2). Results of the likelihood ratio tests indicated that Model 2, which included the interaction between dyad type and interlocutor, was the best-fitting model (p = .005).

Model Results

Table 4 highlights the estimated coefficients for Model 2. Given the significant interaction effect, the coefficients may be best interpreted via Figure 2. As illustrated in Figure 2, although there was no significant difference between the articulatory precision of the participants with TBI (M = −2.2) and NBI (M = −2.2; b = 0.142, p = .427), the confederate engaged in significantly more precise articulation when conversing with the participants with TBI (M = −1.9) than when conversing with the participants with NBI (M = −2.3; b = 0.394, p = .005). Thus, the answer to our second research question (“Does the confederate's articulatory precision change, depending on whether they are conversing with a participant with TBI or NBI?”) is “yes.”

Table 4.

Results of fixed effects for Model 2 of the compensation analysis.

Term                        B        SE     t        p
Intercept                   −1.731   .288   −6.015   < .001***
Interlocutor                .139     .067   2.068    .042*
Dyad type                   .395     .158   2.489    .014*
SNR                         −.069    .036   −1.938   .057
Interlocutor × dyad type    .320     .111   2.886    .006**

Note. SNR = signal-to-noise ratio.

*p < .05. **p < .01. ***p < .001.

Figure 2.


Average articulatory precision by interlocutor (confederate or participant) and dyad type. Error bars delineate ±1 SEM. NBI = no brain injury; TBI = traumatic brain injury.

Articulatory Precision Over Time

Model Selection

Our third research question addressed the effect of time on the confederate's articulatory behavior. To answer this question, an additional variable (i.e., time) was added to two models: one for entrainment and one for compensation. Note that the interlocutor variable was removed to ensure models remained parsimonious and interpretable. The analysis also did not use likelihood ratio tests as single models were used to test for the effects of interest using the p values derived from using Satterthwaite degrees of freedom. Our first time-based model examined the effects of time on articulatory entrainment and included the three-way interaction between firstAP (i.e., articulatory precision of the participant), dyad type, and time on secondAP (i.e., articulatory precision of the confederate). Our second time-based model examined the effects of time on compensation of articulatory precision and included a two-way interaction between dyad type and time.

Model Results

Estimates revealed that the entrainment patterns of the confederate were not significantly different over time when speaking with the participants with TBI and NBI (b < 0.001, p = .908). For compensation, results suggested that the interaction was nearly significant, just shy of the .05 threshold (b = 0.013, p = .055). That is, while the confederate significantly decreased articulatory precision over time with both participants with TBI (b = −0.013, p = .007) and NBI (b = −0.026, p < .001), he tended to decrease his articulatory precision over the conversation more when interacting with the participants with NBI relative to the participants with TBI. This pattern of results is depicted in Figure 3. Thus, the answer to our third research question—“Does the effect of time on the confederate's articulatory behavior change, depending on whether they are conversing with a participant with TBI or NBI?”—is “possibly.” The confederate significantly decreased his precision over the conversation in both contexts, and the difference in the magnitude of reduction between the contexts is approaching significance.
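The link between the interaction estimate and the two simple slopes reported above can be checked with one line of arithmetic. This assumes the interaction coefficient is coded as the TBI slope minus the NBI slope (i.e., NBI as the reference level), which the text does not state explicitly.

```latex
\[
b_{\text{time} \times \text{dyad type}}
  = b_{\text{time} \mid \text{TBI}} - b_{\text{time} \mid \text{NBI}}
  = (-0.013) - (-0.026)
  = 0.013
\]
```

The result matches the reported interaction estimate (b = 0.013): precision declined in both contexts, but the per-minute decline with participants with TBI was about half as steep as with participants with NBI.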

Figure 3.


Articulatory precision of the confederate by dyad type over time. Error ribbons delineate ±1 SEM. NBI = no brain injury; TBI = traumatic brain injury.

Discussion

The aim of this study was to investigate whether conversational coordination of articulation is shaped by communicative context. As an initial clinical test case, we made use of an existing corpus of spoken dialogue, where a confederate conversed with participants with TBI and NBI and examined the confederate's articulatory behavior in regard to two coordinative strategies: entrainment and compensation. Upfront, we acknowledge the limitations of a corpus in which a single confederate interacts with all participants. Confederate characteristics (e.g., age, sex, clinical background) could influence the use of coordination strategies, and as such, one could argue that the specific findings of this study may not generalize to other neurotypical speakers. We do not deny this. However, we argue a critical point: The corpus affords us a rich clinical data set for examining contextually dependent speech behavior. Paradoxically, this limitation is also what makes it a suitable clinical test corpus for an initial investigation into whether coordinative strategies are functionally constrained by the presence of a communication disorder. With a large data set from the same confederate interacting with participants with TBI and NBI, observed differences between dyad types can be considered a function of context and not simply individual differences in baseline behavior. Indeed, substantial individual differences have been observed in how precisely neurotypical individuals articulate their speech (e.g., Perkell, 1990; Westbury et al., 1998) and the degree to which they entrain to their interlocutor (e.g., Babel et al., 2014; Weise et al., 2019), making between-subjects designs somewhat problematic for investigating the influence of communicative context (see also Dideriksen et al., 2019).
Thus, we consider the existing clinical corpus as an appropriate test case for this preliminary investigation, one that (a) establishes a framework for studying speech coordination strategies in tandem, (b) validates the importance of appreciating speech coordination in conversation as a context-sensitive system, and (c) highlights important direction and guidance for future work in this area.

Our first research question targeted local turn-by-turn entrainment of articulatory behavior and found that the confederate entrained his level of articulatory precision to the participants, regardless of whether they presented as TBI or NBI. Furthermore, the degree of articulatory entrainment was not significantly different across the two dyad types. Our second research question targeted compensatory behavior and found that, even though the participants were comparable in terms of their own levels of articulatory precision, the confederate engaged in significantly more precise articulation when interacting with participants with TBI relative to participants with NBI. Thus, when considered as an overall coordination scheme, the confederate engaged in significant entrainment and high compensation (hyperarticulation) in TBI conversations relative to significant entrainment and low compensation (hypoarticulation) in NBI conversations. Accordingly, in line with predictions of the interpersonal synergy model of spoken dialogue (Fusaroli et al., 2014) and our overarching study hypothesis, the organization of speech coordination strategies appears to respond to context. While not clinically situated, similar general conclusions regarding context sensitivity of interpersonal coordination in interaction have been observed in linguistic (Dideriksen et al., 2019) and nonverbal communication (Paxton & Dale, 2017).

Although study findings support the prediction that context regulates the organization of speech strategies, our directional hypothesis regarding how the challenging communicative context would drive strategy organization in this corpus was not entirely supported. Given preliminary evidence of reduced entrainment in conversations involving individuals with communication disorders (e.g., Gordon et al., 2015) and reports of increasing articulatory effort to improve understanding in challenging communicative contexts (e.g., Lee & Baese-Berk, 2020), we hypothesized that the confederate's behavior in conversations with participants with TBI would be characterized by low entrainment–high compensation relative to high entrainment–low compensation in the conversations with participants with NBI. However, we observed that the confederate simultaneously engaged both speech coordination strategies, articulatory entrainment and compensation, when interacting with the participants with TBI, presumably in an attempt to maximally scaffold shared understanding in the presence of a communication disorder. This affords novel insight regarding speech behavior in conversation; along a single speech dimension, interlocutors can and do employ multiple coordinative strategies concurrently—Figure 4 illustrates how a confederate can both entrain and compensate articulatory behavior, in the same conversation.

Figure 4.

Conceptual illustration of how a confederate can both entrain (turn-by-turn articulatory alignment) and compensate (hyperarticulate) in the same conversation (left panel) compared to when they entrain but do not compensate (right panel).
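
The distinction drawn in Figure 4 can be made concrete with a toy calculation. The sketch below is not the analysis reported in this study, which relied on mixed-effects models. It assumes invented articulatory precision scores for adjacent turns, and it treats the least-squares slope of confederate precision on participant precision as a stand-in for entrainment and the confederate's mean precision as a stand-in for compensation. The function name `slope` and all numeric values are hypothetical.

```python
from statistics import mean


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical participant precision scores on five adjacent turns.
participant = [0.40, 0.55, 0.45, 0.60, 0.50]

# Hypothetical confederate turns: identical turn-by-turn tracking in both
# panels, but a higher overall level in the "entrain and compensate" panel.
entrain_and_compensate = [0.5 * p + 0.50 for p in participant]
entrain_only = [0.5 * p + 0.20 for p in participant]

for label, confederate in [
    ("entrain + compensate", entrain_and_compensate),
    ("entrain only", entrain_only),
]:
    # Slope approximates entrainment; mean level approximates compensation.
    print(label, round(slope(participant, confederate), 2), round(mean(confederate), 2))
```

In this toy setup the two slopes are the same while the mean levels differ, which is the pattern the two panels are meant to convey: alignment and overall level can vary independently along a single speech dimension.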

The study findings also point to the dynamic nature of conversational coordination. Examination of the confederate's articulatory behavior over the course of the conversation revealed a significant reduction in precision over time, regardless of context, and a near-significant difference in the magnitude of this reduction between contexts. That is, when interacting with participants with NBI, the confederate decreased the precision of his articulation over the conversation more than when interacting with participants with TBI. While further investigation is required, it would appear that, as the conversations progressed, mutual understanding emerged and the confederate was able to exert less effort on articulatory precision. This interpretation is supported by existing evidence that speakers shorten content words, reducing intelligibility, when information is redundant in subsequent productions (Fowler, 1988; Fowler & Housum, 1987). The pattern of behavioral change over time was less pronounced in the TBI conversations, implying breakdowns in understanding were not as easily resolved. Collectively, these findings suggest that strategies are not fixed from onset (e.g., upon recognition of a partner's communication disorder) but rather speech compensation is dynamic and continues to adapt and change over the course of the interaction, as interlocutors become familiar with one another and as the needs of the conversation become increasingly apparent.
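
The time-course pattern described above amounts to comparing how steeply articulatory precision declines across a conversation in each context. A minimal sketch, assuming invented precision scores indexed by turn number, is given below. The study itself estimated these effects with mixed-effects models over the full corpus, not with the simple per-conversation regression shown here, and the function name `trend` and the data are hypothetical.

```python
def trend(values):
    """Least-squares slope of values against turn index 0..n-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


# Hypothetical confederate precision across six turns in each context.
nbi_conversation = [0.70, 0.66, 0.61, 0.58, 0.53, 0.50]  # steeper decline
tbi_conversation = [0.80, 0.79, 0.77, 0.76, 0.75, 0.73]  # shallower decline

nbi_slope = trend(nbi_conversation)
tbi_slope = trend(tbi_conversation)
print(f"NBI slope: {nbi_slope:.3f}, TBI slope: {tbi_slope:.3f}")
```

Both invented series decline, but the NBI series declines faster, mirroring the direction of the near-significant interaction reported above.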

Follow-Up and Future Directions

As a follow-up case study, we examined the articulatory behavior of a novel confederate, a 24-year-old woman, engaged in conversations with an additional 20 participants with TBI, elicited in an identical manner as the original study corpus (and also available at TalkBank). Results revealed similar patterns of strategy organization; that is, both the original 42-year-old male confederate (reported herein) and a 24-year-old female confederate showed significant entrainment and evidence of precise articulation when conversing with participants with TBI. See Supplemental Material S2 for data analysis, output, and results summary. That the current findings are supported by an additional data set suggests that the current corpus with one confederate may be more generalizable than previously considered. Moving forward, however, we recognize the need for investigations yielding much more generalizable data—particularly if one wants to explicitly investigate questions of coordination deficits in clinical populations. We propose collection of future corpora using a round-robin design, where multiple participants engage in conversations with multiple interlocutors. While admittedly not the most convenient collection procedure, this within-participant setup would satisfy issues related to individual differences while substantially increasing the generalizability of specific results.
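
One simple reading of the proposed round-robin design is that every speaker converses once with every other speaker. The sketch below enumerates such a schedule for a small group. The speaker labels and the helper `round_robin_pairs` are hypothetical, and a real collection protocol would also need to handle session ordering and counterbalancing, which are not addressed here.

```python
from itertools import combinations


def round_robin_pairs(speakers):
    """Return every unordered pairing of speakers, one conversation each."""
    return list(combinations(speakers, 2))


speakers = ["S1", "S2", "S3", "S4"]  # hypothetical speaker IDs
conversations = round_robin_pairs(speakers)

# Count how many distinct partners each speaker meets.
partners = {s: sum(s in pair for pair in conversations) for s in speakers}

print(conversations)
print(partners)
```

Because each speaker appears with several partners, differences in a given speaker's behavior across conversations can be examined within that speaker, which is the property the within-participant setup is meant to provide.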

While we observed differences in the organization of articulatory behavior in conversations with clinical populations, namely, the simultaneous use of both entrainment and compensation, we do not know if this strategy combination is optimal for success (i.e., resolving breakdowns in understanding). In the original study published on this corpus, the conversations with participants with TBI, as is the case for many populations with communication disorders, were characterized as unsuccessful and lacking in interactional flow (Coelho et al., 2002). Given this, one could postulate that the strategies that are naturally deployed when interacting with persons with communication disorders may not always be the most advantageous. Interestingly, while hyperarticulation has been shown to elevate intelligibility in speech perception studies in controlled laboratory settings (e.g., Bradlow et al., 1996), there is little evidence that it is a successful strategy in unstructured interactional settings. Thus, an important next step in this line of inquiry would be to examine what strategies support conversation and then whether strategies can be therapeutically manipulated for improved conversational outcomes. Indeed, the answers to these questions are likely to be population specific, depending on the source of deficit. One can imagine, for example, that hyperarticulation would elevate conversational success in interactions with individuals with auditory deficits (e.g., impaired hearing) but that it may be an unnecessary compensatory strategy in interactions with individuals with TBI, whose primary communication deficits are social cognitive, and not auditory, in nature.

Furthermore, hyperarticulation is not the only compensatory strategy that occurs in conversation, and we acknowledge the constraint of studying a single speech feature. For example, increased loudness has been observed as a compensatory accommodation that interlocutors may deploy in noisy conversational settings (e.g., Cooke & Lu, 2010). Additionally, speech behavior does not occur in isolation. There are a number of linguistic strategies (e.g., back channeling, requests for clarification, reduced sentence length) and kinesthetic strategies (e.g., gestures, head nods, facial expression) that interlocutors can employ when communication is challenging. This raises an important consideration for investigations in conversational rehabilitation with clinical populations. If one communication channel is challenged, is compensatory behavior best served by another communication channel? Indeed, this is what likely happened in the current corpus: multiple articulatory strategies were used to compensate for challenges elsewhere (i.e., deficits in lexical entrainment [Gordon et al., 2015] and/or pragmatic deficits [Neumann et al., 2019]). Alternatively, conversations with individuals with the speech production disorder of dysarthria, characterized by deficits in articulatory, rhythmic, and phonatory entrainment (Borrie et al., 2015, 2020), may be best served by compensation in linguistic and/or kinesthetic behavior. Breakdowns in shared understanding are all too common in conversations with individuals with communication disorders. Thus, the framework and directions outlined here advance an important line of inquiry. Ultimately, a comprehensive model of conversation, one that accounts for entrainment and compensatory strategies in multiple communication channels and is population specific, will be most impactful in understanding and addressing conversational impairments in clinical populations.

Conclusions

In summary, the current study brings together analysis of two speech strategies, entrainment and compensation, in a preliminary investigation into whether communicative context, specifically the presence or absence of a communication disorder, impacts articulatory behavior in conversations. Our results reveal that, overall, the conversations with participants with TBI were characterized by entrainment and high compensation whereas the conversations with participants with NBI were characterized by entrainment and low compensation. Thus, in line with recent synergistic models of spoken dialogue, the organization of conversational coordination appears to respond to context. While corpus limitations are recognized, the current test case indicates differences in the way in which coordination strategies are realized in conversations with clinical populations, suggesting a viable and important target for future investigation.

Acknowledgments

This research was supported by National Institute on Deafness and Other Communication Disorders Grant R21DC016084 awarded to Stephanie Borrie and Visar Berisha.

Footnotes

1

This turn-by-turn alignment measure has been successfully used in many studies to capture entrainment (e.g., Ko et al., 2015; Lubold et al., 2019). However, we acknowledge that it does not capture the full extent of the entrainment phenomenon in which, as noted by Duran and Fusaroli (2017), “coordinated behaviors do not need to be isomorphic and occur close in time…rather, they can be distributed and loosely coupled across various local and global temporal scales” (p. 2; see also Borrie et al., 2019, for a comprehensive, clinically informed methodology to capture global entrainment).

2

Additional speech tasks were also collected from the participants but were not analyzed in the current study in which we target coordinative strategies used in conversation.

3

Models were also run with the inclusion of speech rate as a fixed effect. Results revealed no statistically meaningful differences between these models and the models reported in text in which we did not control for speech rate.

4

In theory, this model's significant interaction between dyad type and interlocutor could be used to answer the second research question regarding compensation. However, as per the measure of turn-by-turn alignment, this analysis only included adjacent spoken utterances. As such, compensation is interpreted from models that include all spoken utterances.

References

  1. Babel, M., McGuire, G., Walters, S., & Nicholls, A. (2014). Novelty and social preference in phonetic accommodation. Laboratory Phonology, 5, 123–150.
  2. Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
  3. Bond, F., & Godfrey, H. P. D. (1997). Conversation with traumatically brain-injured individuals: A controlled study of behavioural changes and their impact. Brain Injury, 11(5), 319–329. https://doi.org/10.1080/026990597123476
  4. Borrie, S. A., Barrett, T. S., Liss, J. M., & Berisha, V. (2020). Sync pending: Characterizing conversational entrainment in dysarthria using a multidimensional, clinically-informed approach. Journal of Speech, Language, and Hearing Research, 63(1), 83–94. https://doi.org/10.1044/2019_JSLHR-19-00194
  5. Borrie, S. A., Barrett, T. S., Willi, M. M., & Berisha, V. (2019). Synching up for a good conversation: A clinically-meaningful methodology for capturing conversational entrainment in the speech domain. Journal of Speech, Language, and Hearing Research, 62(2), 283–296. https://doi.org/10.1044/2018_JSLHR-S-18-0210
  6. Borrie, S. A., & Delfino, C. (2017). Conversational entrainment of vocal fry in young adult female American English speakers. Journal of Voice, 31(4), 513.e25–513.e32. https://doi.org/10.1016/j.jvoice.2016.12.005
  7. Borrie, S. A., Lubold, N., & Pon-Barry, H. (2015). Disordered speech disrupts conversational entrainment: A study of acoustic–prosodic entrainment and communicative success in populations with communication challenges. Frontiers in Psychology, 6, 1187.
  8. Bradlow, A. R., & Bent, T. (2002). The clear speech effect for non-native listeners. The Journal of the Acoustical Society of America, 112(1), 272–284. https://doi.org/10.1121/1.1487837
  9. Bradlow, A. R., Torretta, G. M., & Pisoni, D. B. (1996). Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics. Speech Communication, 20(3), 255–272.
  10. Coelho, C. A. (1995). Discourse production deficits following traumatic brain injury: A critical review of the recent literature. Aphasiology, 9(5), 409–429. https://doi.org/10.1080/102687039508248707
  11. Coelho, C. A., Youse, K. M., & Lê, K. N. (2002). Conversational discourse in closed-head-injured and non–brain-injured adults. Aphasiology, 16(4–6), 659–672. https://doi.org/10.1080/02687030244000275
  12. Cooke, M., & Lu, Y. (2010). Spectral and temporal changes to speech produced in the presence of energetic and informational maskers. The Journal of the Acoustical Society of America, 128(4), 2059–2069. https://doi.org/10.1121/1.3478775
  13. Chartrand, T. L., & Bargh, J. A. (1999). The chameleon effect: The perception–behavior link and social interaction. Journal of Personality and Social Psychology, 76(6), 893–910. https://doi.org/10.1037/0022-3514.76.6.893
  14. Clark, H. H., & Brennan, S. E. (1991). Grounding in communication. In Resnick L. B., Levine J. M., & Teasley S. D. (Eds.), Perspectives on socially shared cognition (Vol. 13, pp. 127–149). American Psychological Association.
  15. Dideriksen, C. R., Fusaroli, R., Tylén, K., Dingemanse, M., & Christiansen, M. H. (2019). Contextualizing conversational strategies: Backchannel, repair and linguistics alignment in spontaneous and task-oriented conversations. In Goel A. K., Seifert C. M., & Freksa C. (Eds.), Proceedings of the 41st Annual Conference of the Cognitive Science Society (pp. 261–267). Cognitive Science Society.
  16. Duran, N. D., & Fusaroli, R. (2017). Conversing with a devil’s advocate: Interpersonal coordination in deception and disagreement. PLOS ONE, 12(6), e0178140.
  17. Ferguson, S. H., & Kewley-Port, D. (2002). Vowel intelligibility in clear and conversational speech for normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 112(1), 259–271.
  18. Fowler, C. A. (1988). Differential shortening of repeated content words produced in various communicative contexts. Language and Speech, 31(4), 307–319. https://doi.org/10.1177/002383098803100401
  19. Fowler, C. A., & Housum, J. (1987). Talkers' signaling of “new” and “old” words in speech and listeners' perception and use of the distinction. Journal of Memory and Language, 26(5), 489–504. https://doi.org/10.1016/0749-596X(87)90136-7
  20. Fusaroli, R., Raczaszek-Leonardi, J., & Tylén, K. (2014). Dialog as interpersonal synergy. New Ideas in Psychology, 32, 147–157. https://doi.org/10.1016/j.newideapsych.2013.03.005
  21. Gordon, R. G., Rigon, A., & Duff, M. C. (2015). Conversational synchrony in the communicative interactions of individuals with traumatic brain injury. Brain Injury, 29(11), 1300–1308. https://doi.org/10.3109/02699052.2015.1042408
  22. Granlund, S., Hazan, V., & Mahon, M. (2018). Children's acoustic and linguistic adaptations to peers with hearing impairment. Journal of Speech, Language, and Hearing Research, 61(5), 1055–1069.
  23. Hazan, V., & Baker, R. (2011). Acoustic–phonetic characteristics of speech produced with communicative intent to counter adverse listening conditions. The Journal of the Acoustical Society of America, 130(4), 2139–2152. https://doi.org/10.1121/1.3623753
  24. Hazan, V., Tuomainen, O., Kim, J., Davis, C., Sheffield, B., & Brungart, D. (2018). Clear speech adaptations in spontaneous speech produced by young and older adults. The Journal of the Acoustical Society of America, 144(3), 1331–1346. https://doi.org/10.1121/1.5053218
  25. Hox, J. J., Moerbeek, M., & van de Schoot, R. (2018). Multilevel analysis. Routledge. https://doi.org/10.4324/9781315650982
  26. Ivanova, I., Horton, W. S., Swets, B., Kleinman, D., & Ferreira, V. S. (2020). Structural alignment in dialogue and monologue (and what attention may have to do with it). Journal of Memory and Language, 110, 104052. https://doi.org/10.1016/j.jml.2019.104052
  27. Ko, E., Seidl, A., Cristia, A., Reimchen, M., & Soderstrom, M. (2015). Entrainment of prosody in the interaction of mothers with their young children. Journal of Child Language, 43(2), 284–309.
  28. Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
  29. Lee, C., Katsamanis, A., Black, M., Baucom, B., Christensen, A., Georgiou, P. G., & Narayanan, S. (2014). Computing vocal entrainment: A signal-derived PCA-based quantification scheme with application to affect analysis in married couple interactions. Computer Speech and Language, 28(2), 518–539. https://doi.org/10.1016/j.csl.2012.06.006
  30. Lee, D., & Baese-Berk, M. M. (2020). The maintenance of clear speech in naturalistic conversations. The Journal of the Acoustical Society of America, 147, 3702–3711. https://doi.org/10.1121/10.0001315
  31. Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In Speech production and speech modelling (Vol. 55, pp. 403–439). Springer. https://doi.org/10.1007/978-94-009-2037-8_16
  32. Lubold, N., Borrie, S. A., Barrett, T. S., Willi, M. M., & Berisha, V. (2019). Do conversational partners entrain on articulatory precision? Paper presented at the Proceedings of INTERSPEECH 2019.
  33. Manson, J. H., Bryant, G. A., Gervais, M. M., & Kline, M. A. (2013). Convergence of speech rate in conversation predicts cooperation. Evolution and Human Behavior, 34(6), 419–426. https://doi.org/10.1016/j.evolhumbehav.2013.08.001
  34. Mefferd, A. S., & Green, J. R. (2010). Articulatory-to-acoustic relations in response to speaking rate and loudness manipulations. Journal of Speech, Language, and Hearing Research, 53(5), 1206–1219. https://doi.org/10.1044/1092-4388(2010/09-0083)
  35. Natale, M. (1975). Convergence of mean vocal intensity in dyadic communication as a function of social desirability. Journal of Personality and Social Psychology, 32(5), 790–804. https://doi.org/10.1037/0022-3514.32.5.790
  36. Nenkova, A., Gravano, A., & Hirschberg, J. (2008). High frequency word entrainment in spoken dialogue. Proceedings of ACL/HLT, 2008, 169–172. https://doi.org/10.3115/1557690.1557737
  37. Neumann, D., Zupan, B., & Eberle, R. D. (2019). Social cognition. In Silver J. M., McAllister T. W., & Arciniegas D. B. (Eds.), Textbook of traumatic brain injury (3rd ed., Chap. 14).
  38. Paxton, A., & Dale, R. (2017). Interpersonal movement synchrony responds to high- and low-level conversational constraints. Frontiers in Psychology, 8, 1135. https://doi.org/10.3389/fpsyg.2017.01135
  39. Panayotov, V., Chen, G., Povey, D., & Khudanpur, S. (2015). Librispeech: An ASR corpus based on public domain audio books. Paper presented at the ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing—Proceedings, 5206–5210.
  40. Perkell, J. S. (1990). Testing theories of speech production: Implications of some detailed analyses of variable articulatory data. In Hardcastle W. & Marchal A. (Eds.), Speech production and speech modeling (Vol. 55, pp. 263–288). Kluwer Academic Publishers. https://doi.org/10.1007/978-94-009-2037-8_11
  41. Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–225. https://doi.org/10.1017/S0140525X04000056
  42. Pickering, M. J., & Ferreira, V. S. (2008). Structural priming: A critical review. Psychological Bulletin, 134(3), 427–459. https://doi.org/10.1037/0033-2909.134.3.427
  43. R Development Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
  44. Riley, M. A., Richardson, M., Shockley, K., & Ramenzoni, V. C. (2011). Interpersonal synergies. Frontiers in Psychology, 2, 38. https://doi.org/10.3389/fpsyg.2011.00038
  45. Seidl, A., Cristia, A., Soderstrom, M., Ko, E. S., Abel, E. A., Kellerman, A., & Schwichtenberg, A. J. (2018). Infant–mother acoustic–prosodic alignment and developmental risk. Journal of Speech, Language, and Hearing Research, 61(6), 1369–1380. https://doi.org/10.1044/2018_JSLHR-S-17-0287
  46. Shockley, K., Santana, M. V., & Fowler, C. A. (2003). Mutual interpersonal postural constraints are involved in cooperative conversation. Journal of Experimental Psychology: Human Perception and Performance, 29(2), 326–332. https://doi.org/10.1037/0096-1523.29.2.326
  47. Smiljanić, R., & Bradlow, A. R. (2009). Speaking and hearing clearly: Talker and listener factors in speaking style changes. Language and Linguistics Compass, 3(1), 236–264. https://doi.org/10.1111/j.1749-818X.2008.00112.x
  48. Tu, M., Grabek, A., Liss, J. M., & Berisha, V. (2018). Investigating the role of L1 in automatic pronunciation evaluation of L2 speech. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2018-September, 1636–1640. https://doi.org/10.21437/Interspeech.2018-1350
  49. Westbury, J. R., Hashi, M., & Lindstrom, M. J. (1998). Differences among speakers in lingual articulation for American English /r/. Speech Communication, 26(3), 203–226. https://doi.org/10.1016/S0167-6393(98)00058-2
  50. Weise, A., Levitan, S. I., Hirschberg, J., & Levitan, R. (2019). Individual differences in acoustic–prosodic entrainment in spoken dialogue. Speech Communication, 115, 78–87. https://doi.org/10.1016/j.specom.2019.10.007
  51. Wickham, H. (2016). ggplot2: Elegant graphics for data analysis (pp. 189–201). Springer. https://doi.org/10.1007/978-3-319-24277-4_9
  52. Wilson, M., & Wilson, T. P. (2005). An oscillator model of the timing of turn-taking. Psychonomic Bulletin and Review, 12, 957–968. https://doi.org/10.3758/BF03206432
  53. Witt, S. M., & Young, S. J. (2000). Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication, 30, 95–108.

Articles from Journal of Speech, Language, and Hearing Research : JSLHR are provided here courtesy of American Speech-Language-Hearing Association