Abstract
It is virtually impossible for a speaker to produce identical articulatory movements across several repetitions of the same utterance. This study examined how kinematic endpoint variability, defined as the positional variability of an articulator at its positional extremum, changes in response to cued speech behavioral modifications. As a second step, this study examined the strength of association between articulator speed and kinematic endpoint variability. Seventeen speakers repeated the sentence “Buy Kaia a kite” ten times under the following conditions: typical, loud, slow, and clear speech. Speech movements were recorded using 3D electromagnetic articulography. Endpoint variability was measured at the maximum jaw opening position during “buy” and at the maximum elevation of the tongue back during /k/ in “Kaia”. Significant speech modification effects were found for the jaw but not for the tongue. Specifically, typical speech yielded significantly lower kinematic endpoint variability than slow and loud speech. Further, jaw peak speed was moderately correlated with kinematic endpoint variability (r = .43, p < .01). Findings for jaw movements suggest that speech modifications that elicit an increase in speed (i.e. loud speech) may negatively impact kinematic endpoint precision; however, other factors, such as motor learning and a lack of emphasis on spatial precision (i.e. slow speech), may also play a role.
Keywords: articulation, kinematics, electromagnetic articulography
INTRODUCTION
Articulation is a skilled motor behavior executed by orofacial structures (i.e. upper and lower lip, tongue, jaw, velum) in a coordinated fashion. Articulators are set into motion to achieve vocal tract configurations that are associated with a desired speech acoustic outcome. Vocal tract configurations for a specific speech acoustic outcome are known to vary with phonetic context, stress, and speech demands (e.g. Perkell et al., 1995; Perkell et al., 2000; Recasens & Espinosa, 2009; Yunusova et al., 2012). However, even under identical speaking conditions, it is virtually impossible for a speaker to produce several repetitions of the same utterance with exactly the same articulator positions.
In this study, we are particularly interested in the spatial variability of an articulator at its positional extremum of a speech segment that is embedded in the same utterance and repeated several times (token-to-token variability). We will call this positional extremum during a speech segment a kinematic endpoint. Kinematic endpoint variability can capture the spatial precision with which an articulator reaches a specific position during speech. So far, kinematic endpoint variability has not received much attention in the speech motor control literature; however, studies on limb motor control frequently investigate endpoint variability. For example, early on the spread of kinematic endpoints across repetition trials was thought to be randomly distributed (e.g. Fitts, 1954; Meyer, Smith, & Wright, 1982); however, findings of more recent studies have shown that kinematic endpoint variability is anisotropic; that is, the kinematic endpoint spread predominantly aligns along the primary axis of movement (e.g. Apker, Darling, & Buneo, 2010; Gordon, Ghilardi, & Ghez, 1994; Milner, 2002; Shiller, Laboissière, & Ostry, 2002; Worringham, 1991). This observation was thought to indicate that kinematic endpoint variability is largely driven by execution-related neuromotor noise in limb movements (e.g. van Beers, Haggard, & Wolpert, 2004).
In the speech motor system, kinematic endpoint variability also does not appear to be random. Perkell and Cohen (1989), for example, showed that the endpoint positions of the tongue back during repeated production of vowels embedded in words were characterized by a relatively small spread in the vertical dimension (constriction degree) and a relatively large spread in the horizontal dimension (constriction location) of the sagittal plane. This shape of the distributions of endpoint positions has been linked to the non-linear associations between vocal tract configurations and their acoustic consequences. That is, the kinematic endpoint spread is thought to be smaller in the vertical dimension than in the horizontal dimension of the sagittal plane because the acoustic output is more sensitive to changes in constriction degree than to changes in constriction location (Beckman et al., 1995; Perkell & Nelson, 1985; Perkell et al., 2000).
Although kinematic endpoint variability has been examined previously to some extent in typical speech (e.g., Perkell & Nelson, 1985; Perkell & Cohen, 1989), little is known about how kinematic endpoint variability changes when speakers modulate their speaking rate, loudness, or speech clarity. Such knowledge about speech behavioral modification effects on kinematic endpoint variability is highly relevant. Clinically, for example, speech behavioral treatment approaches are commonly used in therapeutic interventions to improve the speech of talkers with motor speech impairments (e.g., instructions to speak slower, louder, or as clearly as possible; Yorkston et al., 2007); yet, our understanding of the articulatory control demands that underlie these speech behavioral modifications remains elusive. Many talkers with impaired speech exhibit elevated levels of kinematic variability. For example, talkers with dysarthria due to cerebellar pathologies often exhibit irregular articulatory breakdowns and variable articulatory movement patterns (e.g., Blaney & Wilson, 2000; Brown, Darley, & Aronson, 1970; Cummins, Lowit, & van Brenk, 2014; Kent, Netsell, & Abbs, 1979; van Brenk & Lowit, 2012; Wang et al., 2009). Similarly, talkers with dysarthria due to traumatic brain injuries can face challenges with speech motor control, as indicated by findings of increased articulatory movement pattern variability in these talkers (e.g., McHenry, 2003). Finally, apraxia of speech is a neurogenic motor speech disorder that is commonly associated with highly variable articulatory movements (e.g., Mauszycki, Dromey, & Wambaugh, 2007; van Lieshout, Bose, Square, & Steele, 2007). Thus, more knowledge about the effects of behavioral modifications on kinematic endpoint variability in typical talkers could ultimately inform clinical decisions regarding the selection of speech treatments that specifically promote lowered levels of articulatory variability in impaired speakers.
From a theoretical perspective, such insights can provide empirical support for current models of speech production. According to the directions into velocities of articulators (DIVA) model, for example, speech movements are planned to achieve auditory targets; however, articulators are moved to specific regions in the vocal tract to produce the desired auditory targets (Guenther, 1995). These regions are referred to as convex region targets. The size of the convex region targets is thought to vary with speech demands; for example, they are thought to shrink when movements are slowed, perhaps to accommodate increased demands on speech precision. The proposed variations in convex region target size are based on speed-accuracy tradeoffs that have been frequently observed in studies on limb motor control (e.g. Elliot et al., 2010; Fitts, 1954; Keele, 1981; Messier & Kalaska, 1999; Meyer et al., 1988; Schmidt et al., 1979; Woodworth, 1899; van Beers et al., 2004). For example, aiming movements executed at a high speed commonly yield large kinematic endpoint variability, particularly in the direction of movement. Slow movements, by contrast, are frequently associated with low kinematic endpoint variability. Speed-accuracy tradeoffs have been primarily attributed to two factors: (1) the change in neuromotor noise, which rises with increases in movement speed and amplitude (e.g. Meyer et al., 1988; Schmidt et al., 1979; Wallace & Newell, 1983), and (2) the increased use of sensory feedback for online adjustments of the movement trajectory as speed decreases, which reduces kinematic endpoint variability (Schmidt et al., 1979).
Although references to speed-accuracy tradeoffs can frequently be found in the speech motor control literature (e.g., Bennett, van Lieshout, & Steele, 2007; Goozée et al., 2005; Guenther, 1995; van Brenk et al., 2013), so far only a few studies have directly addressed this assertion. For example, a recent study reported moderate associations between movement time and the estimated difficulty level of an articulator reaching its target (Lammert, Shadle, Narayanan, & Quatieri, 2018), suggesting that talkers adjust their articulatory speed to ensure adequate articulatory precision during speech. Similarly, movement time also varied predictably with the Fitts’ index of difficulty when talkers were asked to repeat syllables as fast as possible (Kuberski & Gafos, 2018). However, empirical evidence is lacking that supports the proposed changes in the size of convex region targets, which may occur in response to varying speech task demands (e.g., modulation of speech clarity, rate, or loudness).
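For reference, the Fitts’ index of difficulty mentioned above is conventionally formulated (following Fitts, 1954, rather than any equation reported in the studies cited here) as

$$MT = a + b \cdot \log_2\!\left(\frac{2D}{W}\right),$$

where $MT$ is movement time, $D$ is the movement amplitude (distance to the target), $W$ is the target width, and $a$ and $b$ are empirically fitted constants; a larger index of difficulty, $\log_2(2D/W)$, predicts a longer movement time.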
As a first exploratory effort to investigate speech modification effects on kinematic endpoint variability, an already existing dataset was used in which ten repetitions of the utterance “Buy Kaia a kite” were available for each of the following speech conditions: typical, slow, clear, and loud speech. Therefore, this dataset offered the opportunity to examine how kinematic endpoint variability may be affected by these speech modifications. Further, because clear and loud speech are known to elicit increased articulatory speed relative to typical speech (e.g., Tasko & Greilick, 2010; Darling & Huber, 2011) and slow speech elicits decreased articulatory speed (e.g., Tasko & McClean, 2004), it was further possible to explore whether changes in articulatory speed are associated with changes in kinematic endpoint variability.
METHODS
Participants
A total of 31 typical talkers completed this study. However, only data of 17 talkers (8 males, 9 females) with a mean age of 22.8 years (range 19–27) were included to test the research hypothesis. Data of 14 talkers were excluded for the following reasons: jaw or tongue sensor malfunctioning (7 talkers), reduced accuracy of kinematic data (3 talkers), and unsuccessful head movement correction (4 talkers). More information about the kinematic data validation procedures is provided below. Participants reported no history of a speech, language, or hearing disorder or any neurological condition. All participants passed a standard pure tone hearing screening at 0.5, 1, 2, and 4 kHz at 25 dB HL and a brief oral motor examination to rule out any structural or functional abnormalities.
Experimental Tasks
All participants were asked to repeat the sentence “Buy Kaia a kite” at their typical rate, at half their typical rate, twice as loud, and as clearly as possible. The instructions for typical speech asked talkers to produce the sentence “like they normally speak in a conversation” and encouraged a casual speech style. Clear speech was elicited by asking speakers to over-enunciate and speak with great effort as clearly as possible. They were told that their speech may become slower and/or louder (e.g. Lam, Tjaden, & Wilding, 2012). Slow speech included instructions to stretch words rather than pausing between words.
Data acquisition
Speech movements were captured using 3D electromagnetic articulography (AG501, Medizinelektronik Carstens, Lenglern, Germany). Three jaw sensors were attached to the gumline of the lower teeth (between the left and right lower canine and first premolar as well as between the two central incisors) with small pieces of putty (Stomahesive®). Two sensor coils were attached to the sagittal midline of the tongue with dental tissue adhesive (i.e. Glustitch®90). Of specific interest in this study were the movements of the posterior tongue sensor (approximately 4 cm posterior to the tongue tip) and the central jaw sensor. The anterior tongue sensor (approximately 1 cm posterior to the tongue tip) was not used for the purpose of this study. Three head reference sensors were attached to plastic goggles that were worn during data collection. Finally, one sensor each was attached to the sagittal midline of the upper lip and of the lower lip. These two sensors were also not used for the purpose of this study.
Data analysis
All positional data were head-corrected and rotated into a head-based coordinate system using Normpos (Medizinelektronik Carstens). Movements were smoothed with a 15 Hz low pass filter in SMASH (Green, Wang, & Wilson, 2013). Tongue movements were not decoupled from the jaw because the purpose of the current study did not demand decoupling.
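For illustration, a zero-phase 15 Hz low-pass filter of the kind described above can be sketched as follows; this is a minimal example in which the sampling rate, function names, and filter design are assumptions rather than the actual SMASH implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_position(signal_mm, fs_hz=250.0, cutoff_hz=15.0, order=4):
    """Zero-phase low-pass filtering of a 1D positional trace (in mm).

    fs_hz is an assumed sampling rate; the actual recording rate and the
    filter design used in SMASH may differ.
    """
    b, a = butter(order, cutoff_hz / (fs_hz / 2.0), btype="low")
    return filtfilt(b, a, np.asarray(signal_mm, dtype=float))

# Example: smooth the vertical jaw trace before further processing.
# jaw_z_smooth = lowpass_position(jaw_z_raw)
```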
Positional endpoints.
After visual inspection of the kinematic data of the tongue and jaw, two positional endpoints were selected for the purpose of this study: the maximum opening of the jaw in “buy” and the maximum elevation of the posterior tongue associated with /k/ in “Kaia”. These two kinematic endpoints were selected because they are fundamentally different from each other. One kinematic endpoint was minimally constrained by anatomical boundaries (the maximum jaw opening position during “buy”), whereas the other endpoint was highly constrained by the palate (the maximum tongue elevation associated with /k/). However, despite the palatal constraint, tongue positional variability was still expected to vary with task demands in the horizontal and vertical dimensions of the sagittal plane.
The left panels of Figure 1 display the movement trajectories of the jaw (top) and tongue (bottom) in the sagittal plane. The positional endpoint for the jaw was defined as the trough in the vertical dimension during “buy” and the corresponding point in the anterior-posterior dimension (Yunusova et al., 2012). Similarly, the positional endpoint of the tongue was defined as the peak in the vertical dimension during /k/ in “Kaia” and the corresponding coordinate in the anterior-posterior dimension of the sagittal plane. Endpoint variability was measured in two ways: 1) endpoint area size, in terms of the area of a two standard deviation (SD) ellipse constructed from nine endpoint positions associated with one speech condition (e.g., Yunusova et al., 2012), and 2) endpoint spread in the vertical and horizontal dimensions of the sagittal plane, in terms of the range of the nine endpoints in each dimension. Although ten repetitions were recorded, the first repetition was not included in the calculation of endpoint variability (e.g. Walsh & Smith, 2011; Yunusova et al., 2012).
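As an illustration of these two variability measures, the sketch below computes the area of a 2 SD ellipse and the per-dimension spread from a set of endpoint coordinates. It uses a generic covariance-based construction of the ellipse; the exact fitting procedure used in the original analyses may differ, and the column order of the input is an assumption.

```python
import numpy as np

def endpoint_variability(endpoints_xy):
    """Endpoint variability measures from an (n, 2) array of endpoint
    coordinates (columns assumed to be horizontal, vertical; in mm),
    e.g. the nine repetitions of one speech condition.

    Returns the 2 SD ellipse area and the range (spread) in each dimension.
    """
    pts = np.asarray(endpoints_xy, dtype=float)
    cov = np.cov(pts, rowvar=False)        # 2 x 2 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)      # variances along the principal axes
    # Semi-axes of the 2 SD ellipse are 2*sqrt(eigenvalue); area = pi * a * b.
    area_2sd = np.pi * (2 * np.sqrt(eigvals[0])) * (2 * np.sqrt(eigvals[1]))
    spread_horizontal, spread_vertical = np.ptp(pts, axis=0)
    return area_2sd, spread_horizontal, spread_vertical
```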
Figure 1.

The left panels display the movement trajectories of the jaw (top) and tongue (bottom) in the sagittal plane. The right panels show the 3D Euclidean distance signals calculated between the central head sensor and central jaw sensor (top) or posterior tongue sensor (bottom). The black sections indicate the transition movements used to calculate peak speed.
Peak speed of movement transitions.
To determine the peak speed towards the kinematic endpoints, the transition movements of the posterior tongue and jaw towards their endpoints were parsed. The right panels of Figure 1 show the 3D Euclidean distance signals calculated between the central head sensor and the central jaw sensor (top) or the posterior tongue sensor (bottom). Hence, a peak indicated maximum jaw opening, whereas a trough indicated a maximal closure. The onset and offset of the jaw movement towards the maximum opening were defined by the trough during /b/ and the peak during /a/ in “buy”, respectively. Further, the onset and offset of the tongue back movement towards the maximum elevation associated with /k/ were defined by the peak during /a/ and the trough during /k/ in “buy Kaia”, respectively. To determine movement speed, the first derivative of the change in displacement over the change in time was calculated. The maximum speed observed during each movement transition was taken as the peak speed.
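A minimal sketch of this peak speed computation is given below; the sampling rate, array layout, and parsing indices are illustrative assumptions rather than details taken from the original processing pipeline.

```python
import numpy as np

def peak_speed(head_xyz, artic_xyz, fs_hz, onset_idx, offset_idx):
    """Peak speed (mm/s) of a parsed transition movement.

    head_xyz and artic_xyz are (n, 3) position arrays (in mm) for the head
    reference sensor and the articulator sensor; fs_hz is an assumed sampling
    rate. onset_idx and offset_idx delimit the parsed transition (e.g., the
    trough during /b/ to the peak during /a/ for the jaw opening movement).
    """
    # 3D Euclidean distance between head and articulator sensors over time.
    dist_mm = np.linalg.norm(np.asarray(artic_xyz) - np.asarray(head_xyz), axis=1)
    # First derivative of displacement with respect to time.
    speed = np.abs(np.gradient(dist_mm, 1.0 / fs_hz))
    return speed[onset_idx:offset_idx + 1].max()
```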
Task performance
To evaluate speech behavioral modifications, mean vocal intensity was measured based on the speech acoustic signal of the target sentence in Wavesurfer (Sjölander & Beskow, 2006). A calibration tone, which was recorded prior to data collection, was used to convert each talker’s measured vocal intensity to dB SPL. Further, durational changes of transition movements were used to verify rate reduction during slow speech. Articulatory movement amplitude was used to verify execution of clear speech (e.g., Mefferd, 2017). Duration and amplitude measures for tongue back and jaw kinematics were based on the respective movement transitions described in Figure 1.
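The calibration-tone conversion to dB SPL can be illustrated with the generic sketch below; it shows the standard relation between a measured RMS amplitude and a reference tone of known level, not the specific routine used alongside Wavesurfer.

```python
import numpy as np

def to_db_spl(rms_signal, rms_calibration, calibration_db_spl):
    """Convert a measured RMS amplitude to dB SPL using a calibration tone
    whose sound pressure level (calibration_db_spl) is known.
    """
    return calibration_db_spl + 20.0 * np.log10(rms_signal / rms_calibration)

# Example: a segment with twice the RMS of a 70 dB SPL calibration tone
# corresponds to roughly 76 dB SPL.
# to_db_spl(2.0, 1.0, 70.0)  # -> ~76.02
```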
Kinematic data validation
Out of the 31 talkers who completed the study, data of seven talkers were not further analyzed due to sensor malfunctioning (e.g. sensor tracking discontinued, sensors created large spikes in random directions). The kinematic data of the remaining 24 participants were inspected for sensor tracking accuracy. Although there are currently no recommendations available for the AG501 model with regard to thresholds for the maximum root mean square (RMS) positional error, participants with data that exceeded a maximum RMS of 10 were excluded based on our observation that the maximum RMS positional error of any sensor in the established dataset was on average below 6. This procedure identified three participants with RMS values of several head reference sensors consistently above 10. In two other cases, however, the RMS values only spiked for a very short interval of one recording; in these two cases, only the data of the affected section of the recording were excluded. Finally, in some participants, visual inspection of the head-corrected kinematic data revealed a residual amount of head movement in the head reference sensors, which should be static after head movement correction. Thus, participants with head reference sensors showing a movement range above 1.0 mm in any of the three dimensions were eliminated. Based on this procedure, four more participants were excluded.
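The two exclusion criteria can be summarized in the short sketch below; the thresholds follow the text, whereas the data structures and names are illustrative.

```python
import numpy as np

RMS_ERROR_THRESHOLD = 10.0      # maximum allowed RMS positional error
HEAD_RANGE_THRESHOLD_MM = 1.0   # maximum allowed residual head-sensor movement range

def passes_validation(max_rms_error_per_sensor, head_sensor_xyz):
    """Check both exclusion criteria for one participant.

    max_rms_error_per_sensor: iterable of each sensor's maximum RMS positional
    error. head_sensor_xyz: (n, 3) head-corrected positions of a head reference
    sensor, which should be near-static after head movement correction.
    """
    rms_ok = max(max_rms_error_per_sensor) <= RMS_ERROR_THRESHOLD
    ranges = np.ptp(np.asarray(head_sensor_xyz, dtype=float), axis=0)
    head_ok = bool(np.all(ranges <= HEAD_RANGE_THRESHOLD_MM))
    return rms_ok and head_ok
```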
Statistical analyses
To evaluate task performance (loud, slow, and clear speech), vocal intensity, movement duration, and movement amplitude were each submitted to a repeated-measures analysis of variance (ANOVA). Although not part of the primary aim of the study, peak speed was also submitted to a repeated-measures ANOVA to gain insights into task-dependent changes in peak speed. To address the primary aim of the study, task effects on kinematic endpoint variability were determined by submitting the area of the 2 SD ellipse of each speech task to a linear mixed model analysis for each target sound, with task as the fixed effect and participant as the random effect. Further, the spread of the kinematic endpoints in each dimension (vertical and horizontal dimensions of the sagittal plane) was submitted to linear mixed models for each speech sound, with dimension and task as the fixed effects and participant as a random effect. The critical alpha-level was set to p < .05. If main effects or the interaction term were significant, post-hoc analyses were conducted and the critical alpha-level was adjusted for multiple comparisons using Bonferroni corrections. To evaluate the strength of the relation between peak speed and endpoint variability, peak speed and the kinematic endpoint area size (area of the 2 SD ellipse) as well as the kinematic endpoint spread in each dimension were submitted to bivariate correlation analyses (Pearson’s product-moment correlations).
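For illustration, the core of this analysis could be expressed as in the sketch below. The statistical software actually used is not named in the text; this version uses statsmodels and scipy, with a random-intercept specification as one plausible realization of the described mixed models, and assumes a data frame with one row per participant and task (column names are illustrative).

```python
import numpy as np
import statsmodels.formula.api as smf
from scipy.stats import pearsonr

def analyze(df):
    """df: pandas DataFrame with columns 'participant', 'task',
    'area_2sd', and 'peak_speed' (one row per participant x task)."""
    d = df.copy()
    d["sqrt_area"] = np.sqrt(d["area_2sd"])    # square root transform
    d["sqrt_speed"] = np.sqrt(d["peak_speed"])

    # Linear mixed model: task as fixed effect, participant as random intercept.
    fit = smf.mixedlm("sqrt_area ~ task", data=d, groups=d["participant"]).fit()

    # Bivariate Pearson correlation between peak speed and endpoint area size.
    r, p = pearsonr(d["sqrt_speed"], d["sqrt_area"])
    return fit, r, p
```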
RESULTS
Prehypothesis analyses: Task effects on vocal intensity, duration and movement amplitude
Table 1 provides the descriptive statistics of task performance measures that are either associated with the entire target sentence (vocal intensity) or with the specific transition movement towards the defined positional endpoints of the tongue and jaw. Peak speed was also included in the interest of providing a complete overview of the kinematic measures that were affected by speech modifications. A significant task effect on vocal intensity was found for sentence repetitions [F(3,48) = 103.51, p < .001]. Pairwise comparisons yielded the following significant findings (all p < .001): loud speech was produced with significantly greater vocal intensity than typical, clear, and slow speech. In addition, clear speech was produced with significantly greater vocal intensity than typical and slow speech.
Table 1.
Means of task performance measures
| Utterance | Measure | Task | Mean | SD |
|---|---|---|---|---|
| Sentence | Vocal intensity (in dB SPL) | typical | 67.50 | 5.24 |
| | | clear | 70.58 | 5.28 |
| | | loud | 81.43 | 5.06 |
| | | slow | 65.89 | 4.93 |
| /a/ | Transition duration (in s) | typical | 0.190 | 0.09 |
| | | clear | 0.227 | 0.06 |
| | | loud | 0.202 | 0.06 |
| | | slow | 0.392 | 0.13 |
| | Amplitude of transition movement (in mm) | typical | 5.60 | 2.63 |
| | | clear | 9.95 | 5.10 |
| | | loud | 10.51 | 3.95 |
| | | slow | 7.91 | 2.61 |
| | Peak speed of transition movement (in mm/s) | typical | 64.67 | 22.94 |
| | | clear | 96.30 | 37.38 |
| | | loud | 114.46 | 34.98 |
| | | slow | 56.54 | 18.98 |
| /k/ | Transition duration (in s) | typical | 0.218 | 0.046 |
| | | clear | 0.364 | 0.118 |
| | | loud | 0.254 | 0.085 |
| | | slow | 0.681 | 0.405 |
| | Amplitude of transition movement (in mm) | typical | 15.97 | 5.10 |
| | | clear | 19.95 | 5.78 |
| | | loud | 20.17 | 5.68 |
| | | slow | 17.64 | 4.76 |
| | Peak speed of transition movement (in mm/s) | typical | 165.55 | 55.93 |
| | | clear | 141.83 | 46.28 |
| | | loud | 187.23 | 65.39 |
| | | slow | 82.14 | 22.96 |
For jaw movement transitions towards the maximum opening in “buy”, a significant task effect was found for transition duration [F(3,48) = 30.91, p < .001], movement amplitude [F(3,48) = 23.67, p < .001], and jaw peak speed [F(3,49) = 42.977, p < .001]. Pairwise comparisons revealed that slow speech was produced with significantly longer transition durations than all other speech conditions (all p < .001). In addition, clear speech tended to be slower than loud speech (p = .05) and typical speech (p = .05). Further, the amplitude of jaw movements was significantly smaller during typical speech than during all other speech conditions (p < .001). Also, loud speech was associated with significantly larger jaw movements than slow speech (p < .001) and clear speech (p = .014). In addition, loud speech yielded significantly higher peak speeds than all other speech conditions (p < .001), and clear speech was produced with significantly greater peak speed than typical speech and slow speech (p < .001). Finally, typical speech tended to be associated with greater peak speeds than slow speech (p = .07).
For posterior tongue movement transitions towards the maximum elevation associated with /k/ in “Kaia”, a significant task effect was found for transition duration [F(3,48) = 19.15, p < .001], movement amplitude [F(3,48) = 29.39, p < .001], and peak speed [F(3,48) = 32.776, p < .001]. Pairwise comparisons of movement durations revealed that slow speech had significantly longer durations than all other speech conditions (p < .001). Further, posterior tongue movements during clear speech were significantly longer than those during loud and typical speech (p < .001), whereas posterior tongue movements during loud speech were not significantly longer than those during typical speech. For posterior tongue movement amplitudes, typical speech was associated with significantly smaller tongue movements than all other speech conditions (p ≤ .002). Further, slow speech was associated with significantly smaller posterior tongue movements than clear and loud speech (p ≤ .002). Posterior tongue movement amplitude did not significantly differ between clear and loud speech. Finally, pairwise comparisons for peak speed revealed significantly higher peak speeds for loud speech than for clear and slow speech (p < .001). Peak speed did not significantly differ between loud and typical speech (p = .09), although loud speech tended to have higher peak speeds than typical speech. Clear speech had significantly higher peak speeds than slow speech (p < .001) but significantly lower peak speeds than typical speech (p = .034). Finally, typical speech was associated with significantly higher peak speeds than slow speech (p < .001).
Speech task effects on kinematic endpoint variability
Figure 2 displays means for measures related to endpoint area size and kinematic endpoint spread for the jaw maximum opening during “buy” and the tongue maximum elevation during /k/ in “Kaia”. The kinematic endpoint area size as well as the spread of the kinematic endpoints in each dimension were square root transformed for both target sounds to reduce observed skewness. A significant task effect on endpoint area size was observed for the jaw [F(3,22.239) = 9.954, p < .001]. As can be seen in Figure 2, loud speech was associated with a significantly greater endpoint area than typical speech (p = .001). Further, slow speech tended to yield a greater kinematic endpoint area than typical speech. For posterior tongue endpoint variability, however, no significant task effects were observed, although, as can be seen in Figure 2, the kinematic endpoint area tended to be larger during slow speech than during typical, clear, and loud speech.
Figure 2.

Speech task effects on the area of the 2SD ellipse (left panels) and spread of positional endpoints in the vertical (middle panels) and horizontal dimensions (right panels) of the jaw (top row) and the tongue (bottom row). Measures are square root transformed.
Regarding the task effect on the shape of the kinematic endpoint distribution, significant main effects were found for dimension [F(1,102.851) = 119.369, p < .001] and task [F(3,45.951) = 7.585, p < .001] for the jaw opening endpoints. The task x dimension interaction was not significant. Thus, the spread of the kinematic endpoints was overall significantly larger in the vertical dimension than in the horizontal dimension. Further, the spread of the kinematic endpoints was overall significantly smaller during typical speech than during loud (p = .001) and slow speech (p = .005). Finally, for tongue positional endpoints, a significant main effect was found for dimension [F(1,81.453) = 277.001, p < .001]; however, the main effect of task and the task x dimension interaction were non-significant. That is, the spread of the kinematic endpoints was significantly larger in the horizontal dimension than in the vertical dimension across all speech tasks.
Associations between peak speed and kinematic endpoint variability
Peak speeds of transition movements and their corresponding kinematic endpoint measures (the area of the 2 SD ellipse, the spread of endpoints in the vertical and horizontal dimensions) were square root transformed and then submitted to bivariate correlation analyses. Figure 3 displays these associations for the jaw (Panel A) and the tongue (Panel B). A significant moderate positive correlation was found between jaw peak speed and jaw endpoint area size [r(68) = .426, p < .001]. Furthermore, bivariate correlations between peak speed and the spread of the endpoints in the vertical dimension as well as between peak speed and the spread in the horizontal dimension were significant [r(68) = .306, p = .011 and r(68) = .291, p = .016, respectively]. For the posterior tongue, however, no significant correlations were found between peak speed and any of the three kinematic endpoint variability measures.
Figure 3.

Changes in endpoint variability as a function of peak speed. Measures were square root transformed.
DISCUSSION
The primary aim of this study was to determine how speech behavioral modifications (slow, loud, clear speech) affect kinematic endpoint variability. Further, this study sought to determine if changes in peak speed were associated with changes in endpoint variability. Jaw movements towards the maximum opening during “buy” and posterior tongue movements towards the maximum elevation associated with /k/ in “Kaia”, embedded in the sentence “Buy Kaia a kite”, were examined.
Outcomes suggest that speech modifications only significantly impacted endpoint variability associated with jaw opening movements, but not kinematic endpoint variability of tongue movements associated with the segment /k/. Specifically, jaw endpoint variability was significantly greater during loud and slow speech than during typical and clear speech. The changes in jaw peak speed yielded a moderate association with jaw kinematic endpoint variability. By contrast, tongue peak speed was not associated with tongue kinematic endpoint variability; likely due to the lack of change in endpoint variability in response to slow, loud, and clear speech. In the following sections these main findings and other related study outcomes will be discussed in more detail.
Speech modification effects on kinematic endpoint variability
Findings of this current study suggest that speech behavioral modifications elicit predictable changes in kinematic endpoint variability for jaw opening movements but not for posterior tongue elevations associated with the intra-oral constriction during the /k/ segment. The differential task effect may be due to the presence of an anatomical boundary that constrained kinematic endpoint variability in the main direction of movement for the tongue. An anisotropic spread of kinematic endpoint variability, as has been observed in studies on limb movements (e.g. Apker et al., 2010; Gordon et al., 1994; van Beers et al., 2004), was found for the jaw opening movement. Specifically, the spread of the endpoints was almost twice as large in the vertical dimension (main direction of movement) as in the horizontal dimension of the sagittal plane. By contrast, for the tongue movements, no anisotropic distribution of kinematic endpoint variability was found, likely because the hard palate constrained variability in the primary direction of movement. Kinematic endpoint variability could have changed in response to speech modifications in the horizontal dimension of the sagittal plane; however, because displacement in the horizontal dimension was only minimal (see Figure 1, bottom left panel), speed changes did not impact the spread of kinematic endpoint variability in this dimension.
Typical speech was the speech condition associated with the lowest endpoint variability for the jaw opening movement. This finding supports the notion that highly practiced motor acts yield lower kinematic variability than novel ones (e.g. Cohen & Sternad, 2009). In this study, all speech behavioral modifications were cued and did not include a practice period. However, only the slowed speech task (stretching the words by prolonging speech sounds) can truly be considered a novel task for talkers in this study. Loud and clear speech are both frequently produced in everyday life; however, clear speech was associated with significantly lower jaw kinematic endpoint variability than loud speech.
The notion that spatial precision of articulatory positions varies with speech demands has been proposed previously. Specifically, it was hypothesized that the size of convex region targets in the vocal tract may shrink when talkers reduce their speaking rate and aim to enhance speech precision (Guenther, 1995). Small submovements that are commonly observed during slowed speech are also thought to support this assertion because they have been interpreted to indicate that talkers use feedback-driven corrections to ensure enhanced spatial precision (Adams, Kent, & Weismer, 1993; Bullock & Grossberg, 1988; Ostry et al., 1987). However, findings of the current study do not support this notion. Rather, small submovements may serve to control movement durations during slow speech.
Associations between movement speed and kinematic endpoint variability
Outcomes indicated only modest associations between speed and kinematic endpoint variability for the jaw. In studies on limb motor control, such associations often yielded strong correlations (r > .9), although a few studies have also documented tradeoffs as low as r = .60 (see MacKenzie, 1992). It is likely that the nature of the limb motor tasks for which such strong speed-accuracy tradeoffs have been observed is fundamentally different from the speech motor tasks used in the current study. For example, in many limb motor control studies participants are explicitly instructed to produce accurate and precise movements towards a target that is defined in size. By contrast, targets for speech tasks potentially exist in parallel in auditory, acoustic, and articulatory space and likely differ in size due to quantal auditory-to-acoustic and acoustic-to-articulatory mappings (Stevens, 1989). Further, precision demands for targets in auditory space likely dictate speech motor control processes at the articulatory level. This lack of clearly defined speech precision requirements for slow speech, for example, may have negatively impacted the strength of associations between speed and kinematic endpoint variability. Indeed, when slow speech is removed from the bivariate correlation, the strength of association increases from r = .43 to r = .53 for the jaw opening movements. In this current study, speech precision demands were only explicitly defined for clear speech. To examine speed-accuracy tradeoffs in the speech motor system, future studies should include tasks that explicitly define speech precision demands while speed demands are being varied (e.g. “speak slow and clear”, “speak fast and clear”, “speak as clearly as possible while maintaining your typical speech rate”, “speak loud and clear”).
Clinical implications
Because speech behavioral modifications are commonly used in therapeutic interventions, these findings offer important clinical implications. Although study outcomes are only based on typical speakers, the current study suggests that impaired speakers may be particularly challenged to maintain articulatory control over jaw opening movements when increasing their loudness. Further, slow speech may not promote segmental articulatory control as often presumed. Finally, clear speech may be a speech behavior that allows speakers to maintain control over jaw opening movements. However, more research is needed to discover how these speech behavioral modifications affect speech motor control in talkers with speech impairments. Previous studies suggest that these speech modifications can yield differential effects in healthy speakers and speakers with speech impairments (e.g., McHenry, 2003; Kuruvilla-Dugdale & Mefferd, 2017). Furthermore, task-related changes in kinematic endpoint variability may not be perceptually detectable in typical speakers; however, they may yield perceptual significance (e.g. irregular articulatory breakdowns) in speakers with speech impairments.
Caveats
In this study, kinematic endpoint variability was only examined in two dimensions. This decision was based on pilot data that included all three dimensions. No changes were observed in the medial-lateral dimension across speech tasks; therefore, it was decided to focus only on the two dimensions of the sagittal plane. Further, kinematic endpoints were defined by the movement extrema in the vertical dimension of the sagittal plane. In the limb literature, kinematic endpoints are often defined as the point of near-zero velocity. Similarly, Tasko and Westbury (2002) suggested determining target endpoints based on the velocity signal. Unfortunately, this approach could not be used because multiple velocity peaks during slow speech did not permit reliable identification of endpoints based on the velocity signal. For loud, typical, and clear speech, however, the zero crossing of the articulatory velocity commonly aligned with the positional extrema of the articulator in the vertical dimension.
Finally, transition movements of the tongue crossed word boundaries (“buy Kaia”). This approach may raise the concern that the tongue transition movements may not be directed towards the kinematic endpoint (maximum elevation during /k/ in “Kaia”). Indeed, studies have shown that coarticulatory effects are weaker across sentence and clause boundaries than across word or syllable boundaries (Hardcastle, 1985). However, Kent and Moll (1972) have demonstrated coarticulatory effects on tongue movements across word boundaries, suggesting that motor control strategies extend past word boundaries.
Future directions
Future research is warranted to determine whether the findings of speed-accuracy tradeoffs are specific to jaw motor control or whether they are observable for the lips and tongue, particularly when the positional extrema are not constrained by an anatomical boundary. Further, future studies could include an older age group as well as a group of speakers with a motor speech impairment to examine how aging and neurologic conditions affect endpoint variability and speed-accuracy tradeoffs. Such findings may offer further insights into the contribution of movement speed to motor control and into the articulatory factors underlying speaking rate decline with aging and disease.
Acknowledgements
This research was supported by start-up funds from Vanderbilt University Medical Center and grant R03DC015075 from the National Institute on Deafness and Other Communication Disorders (NIDCD). The content of this manuscript is solely the responsibility of the author and does not necessarily represent the official views of the NIDCD. The author would like to thank Brett Myers for his help with data collection and Ellen Hart, Sophie Mouros, Jaclyn Fitzsimmons, and Mary Jo Bissmeyer for their help with data analysis.
Footnotes
Statement of Interest
The author reports no conflicts of interest.
REFERENCES
- Adams S, Kent R, & Weismer G (1993). Speaking rate and speech movement velocity profiles. Journal of Speech and Hearing Research, 36, 41–54.
- Apker G, Darling T, & Buneo C (2010). Interacting noise sources shape patterns of arm movement variability in three-dimensional space. Journal of Neurophysiology, 104(5), 2654–2666.
- Beckman M, Jung T-P, Lee S, de Jong K, Krishnamurty A, Ahalt S, Cohen KB, & Collins M (1995). Variability in the production of quantal vowels revisited. Journal of the Acoustical Society of America, 97(1), 471–490.
- Bennett J, van Lieshout P, & Steele C (2007). Tongue control for speech and swallowing in healthy younger and older subjects. International Journal of Orofacial Myology, 33, 5–18.
- Blaney B & Wilson J (2000). Acoustic variability in dysarthria and computer speech recognition. Clinical Linguistics & Phonetics, 14(4), 307–327.
- Bullock D & Grossberg S (1988). Neural dynamics of planned arm movements: emergent invariants and speed-accuracy properties during trajectory formation. Psychological Review, 95(1), 49–90.
- Brown JR, Darley FL, & Aronson AE (1970). Ataxic dysarthria. International Journal of Neurology, 7, 302–309.
- Cohen RG & Sternad D (2009). Variability in motor learning: Relocating, channeling, and reducing noise. Experimental Brain Research, 193(1), 69–83.
- Cummins F, Lowit A, & van Brenk F (2014). The quantitative assessment of interutterance stability: application to dysarthric speech. Journal of Speech, Language, and Hearing Research, 57(1), 81–89.
- Darling M & Huber JE (2011). Articulatory movements in response to loudness and Parkinson’s disease. Journal of Speech, Language, and Hearing Research, 54, 1247–1259.
- Elliot D, Hansen S, Grierson L, Lyons J, Bennet S, & Hayes S (2010). Goal-directed aiming: Two components but multiple processes. Psychological Bulletin, 136(6), 1023–1044.
- Fitts PM (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381–391.
- Goozée JV, Stephenson DK, Murdoch BE, Darnell RE, & Lapointe LL (2005). Lingual kinematic strategies used to increase speech rate: comparison between younger and older adults. Clinical Linguistics and Phonetics, 19, 319–334.
- Gordon J, Ghilardi MF, & Ghez C (1994). Accuracy of planar reaching movements. Independence of direction and extent variability. Experimental Brain Research, 99(1), 97–111.
- Green JR, Wang J, & Wilson DL (2013). SMASH: A tool for articulatory data processing and analysis. Interspeech 2013, 1331–1335.
- Guenther FH (1995). Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review, 102(3), 594–621.
- Hardcastle W (1985). Some phonetic and syntactic constraints on lingual coarticulation during /kl/ sequences. Speech Communication, 4, 247–263.
- Huber JE & Chandrasekaran B (2006). Effects of increasing sound pressure level on lip and jaw movement parameters and consistency in young adults. Journal of Speech, Language, and Hearing Research, 46(9), 1368–1379.
- Keele S (1981). Movement control in skilled motor performance. Psychological Bulletin, 70, 387–403.
- Kent R & Moll K (1972). Tongue body articulation during vowel and diphthong gestures. Folia Phoniatrica et Logopaedica, 24, 278–300.
- Kent R, Netsell R, & Abbs J (1979). Acoustic characteristics of dysarthria associated with cerebellar disease. Journal of Speech and Hearing Research, 22, 627–648.
- Kuberski S & Gafos A (2018). On the speed-accuracy trade-off and speed-curvature power law of tongue movements in repetitive speech. Paper presented at LabPhon 16, Lisbon. Accessed online December 21, 2018: http://labphon16.labphon.org/files/abstracts/025.pdf
- Lam J, Tjaden K, & Wilding G (2012). Acoustics of clear speech: effect of instruction. Journal of Speech, Language, and Hearing Research, 55, 1807–1821.
- Lammert A, Shadle C, Narayanan S, & Quatieri T (2018). Speed-accuracy tradeoffs in human speech production. PLoS ONE, 13(9), e0202180.
- MacKenzie IS (1992). Fitts’ Law as a research and design tool in human-computer interaction. Human-Computer Interaction, 7, 91–139.
- Mauszycki S, Dromey C, & Wambaugh J (2007). Variability in apraxia of speech: a perceptual, acoustic, and kinematic analysis of stop consonants. Journal of Medical Speech-Language Pathology, 15(3), 223–242.
- McHenry MA (2003). The effect of pacing strategies on the variability of speech movement sequences in dysarthria. Journal of Speech, Language, and Hearing Research, 46, 702–710.
- Mefferd AS (2017). Tongue- and jaw-specific contributions to acoustic vowel contrast changes in the diphthong /ai/ in response to slow, loud, and clear speech. Journal of Speech, Language, and Hearing Research.
- Meyer DE, Smith JEK, & Wright CE (1982). Models for the speed and accuracy of aimed movements. Psychological Review, 89, 449–482.
- Meyer DE, Abrams R, Kornblum S, & Wright C (1988). Optimality in human motor performance: Ideal control of rapid arm movements. Psychological Review, 95(3), 340–370.
- Messier J & Kalaska J (1999). Comparison of variability of initial kinematics and endpoints of reaching movements. Experimental Brain Research, 125, 139–152.
- Milner T (2002). Contribution of geometry and joint stiffness to mechanical stability of the human arm. Experimental Brain Research, 143, 515–519.
- Munhall KG, Ostry DJ, & Parush A (1985). Characteristics of velocity profiles of speech movements. Journal of Experimental Psychology: Human Perception and Performance, 11, 457–474.
- Ostry DJ, Cooke JD, & Munhall KG (1987). Velocity curves of human arm and speech movements. Experimental Brain Research, 68, 37–47.
- Perkell JS & Cohen MH (1989). An indirect test of the quantal nature of speech in the production of the vowels /i/, /a/, and /u/. Journal of Phonetics, 17, 123–133.
- Perkell JS, Guenther FH, Lane H, Matthies ML, Perrier P, Vick J, & Wilhelms-Tricarico R (2000). A theory of speech motor control and supporting data from speakers with normal hearing and with profound hearing loss. Journal of Phonetics, 28, 233–372.
- Perkell JS, Matthies ML, Svirsky MA, & Jordan MI (1995). Goal-based speech motor control: a theoretical framework and some preliminary data. Journal of Phonetics, 23, 23–35.
- Perkell JS & Nelson WL (1985). Variability in production of the vowels /i/ and /a/. Journal of the Acoustical Society of America, 77(5), 1889–1895.
- Schmidt R, Zelaznik H, Hawkins B, Frank J, & Quinn J (1979). Motor-output variability: A theory for the accuracy of rapid motor acts. Psychological Review, 86, 415–451.
- Schulman R (1989). Articulatory dynamics of loud and normal speech. Journal of the Acoustical Society of America, 85, 295–312.
- Shiller D, Laboissière T, & Ostry DJ (2002). Relationship between jaw stiffness and kinematic variability in speech. Journal of Neurophysiology, 88, 2329–2340.
- Sjölander K & Beskow J (2006). Wavesurfer (Version 1.8.8p4) [Computer software]. Stockholm, Sweden: KTH Centre for Speech Technology.
- Stevens K (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3–45.
- Tasko S & Greilick K (2010). Acoustic and articulatory features of diphthong productions: A speech clarity study. Journal of Speech, Language, and Hearing Research, 53, 84–99.
- Tasko S & McClean M (2004). Variation in articulatory movements with changes in speech task. Journal of Speech, Language, and Hearing Research, 47, 85–100.
- Tasko S & Westbury J (2002). Defining and measuring speech movement events. Journal of Speech, Language, and Hearing Research, 45, 127–142.
- van Beers RJ, Haggard P, & Wolpert DM (2004). The role of execution noise in movement variability. Journal of Neurophysiology, 91, 1050–1063.
- van Brenk F & Lowit A (2012). The relationship between acoustic indices of speech motor control variability and other measures of speech performance in dysarthria. Journal of Medical Speech-Language Pathology, 20(4), 24–29.
- van Brenk F, Terband H, van Lieshout P, Lowit A, & Maassen B (2013). Rate-related kinematic changes in younger and older adults. Folia Phoniatrica et Logopaedica, 65, 239–247.
- van Lieshout P, Bose A, Square P, & Steele C (2007). Speech motor control in fluent and dysfluent speech production of an individual with apraxia of speech and Broca’s aphasia. Clinical Linguistics and Phonetics, 21(3), 159–188.
- Wallace S & Newell K (1983). Visual control of discrete aiming movements. Quarterly Journal of Experimental Psychology, 35A, 311–321.
- Walsh B & Smith A (2011). Linguistic complexity, speech production, and comprehension in Parkinson’s disease: Behavioral and physiological indices. Journal of Speech, Language, and Hearing Research, 54, 787–802.
- Wang Y-T, Kent R, Duffy JR, & Thomas JE (2009). Analysis of diadochokinesis in ataxic dysarthria using the Motor Speech Profile program. Folia Phoniatrica et Logopaedica, 61, 1–11.
- Woodworth RS (1899). The accuracy of voluntary movements. Psychological Review, 3 (Monograph Suppl.), 1–119.
- Worringham C (1991). Variability effects on the internal structure of rapid aiming movements. Journal of Motor Behavior, 23, 75–85.
- Yorkston KM, Hakel M, Beukelman D, & Fager S (2007). Evidence for effectiveness of treatment of loudness, rate, or prosody in dysarthria: A systematic review. Journal of Medical Speech-Language Pathology, 15(2), 6–13.
- Yunusova Y, Rosenthal J, Rudy K, Baljko M, & Daskalogiannakis J (2012). Positional targets for lingual consonants defined using electromagnetic articulography. Journal of the Acoustical Society of America, 132(2), 1027–1038.
