Author manuscript; available in PMC: 2009 Sep 29.
Published in final edited form as: J Acoust Soc Am. 2003 May;113(5):2820–2833. doi: 10.1121/1.1562646

Tongue-surface movement patterns during speech and swallowing

Jordan R Green 1, Yu-Tsai Wang 1
PMCID: PMC2754124  NIHMSID: NIHMS115969  PMID: 12765399

Abstract

The tongue has been frequently characterized as being composed of several functionally independent articulators. The question of functional regionality within the tongue was examined by quantifying the strength of coupling among four different tongue locations across a large number of consonantal contexts and participants. Tongue behavior during swallowing was also described. Vertical displacements of pellets affixed to the tongue were extracted from the x-ray microbeam database. Forty-six participants recited 20 vowel-consonant-vowel (VCV) combinations and swallowed 10 ccs of water. Tongue-surface movement patterns were quantitatively described by computing the covariance between the vertical time-histories of all possible pellet pairs. Phonemic differentiation in vertical tongue motions was observed as coupling varied predictably across pellet pairs with place of articulation. Moreover, tongue displacements for speech and swallowing clustered into distinct groups based on their coupling profiles. Functional independence of anterior tongue regions was evidenced by a wide range of movement coupling relations between anterior tongue pellets. The strengths and weaknesses of the covariance-based analysis for characterizing tongue movement are considered.

I. INTRODUCTION

The tongue has been frequently characterized as being composed of several functionally independent articulators (Hardcastle, 1976; Hoole, 1999; Mermelstein, 1973; Öhman, 1967; Perkell, 1969; Stone, 1990). The common use of such terms as tip, blade, body, dorsum, and root to refer to the “parts” of the tongue reflects the widespread acceptance of this assertion. The factors that give rise to functional regionality within the tongue are not fully understood but may include task demands, neuromuscular control, biomechanical tissue linkages, and constraints in motion imposed by palatal shape. The conception of the tongue as a segmented structure is particularly interesting given that studies of its internal structure have not identified morphologic features that could account for the extent of functional partitioning alluded to in the literature. For example, a recent study by Takemoto (2001) revealed the body of the human tongue to contain serially arranged replications of a “structural unit” that consists of several layers of highly interdigitating musculature. Presently, there is little agreement about (1) the number and location of functional regions in the human tongue, (2) the degree of functional independence among tongue regions,1 and (3) the extent to which putative functional regions or characteristic movement patterns in the tongue are similar across speakers.

A number of studies have reported that tongue motions are generated by a small number of independent components and that the tongue assumes relatively few shapes during speech. The small number of tongue surface-deformation patterns exhibited during speech has been interpreted to reflect both speaker strategies and constraints imposed by the physical properties of the tongue (Kent and Moll, 1972; Perkell, 1969). As early as 1967, Öhman proposed that the tongue may be regarded as three independently controlled systems: the apical articulator serving the dentals, alveolars, and retroflex; the dorsal serving the palatal and velars; and the tongue-body serving vowels. Since then, several investigators have worked toward estimating both the number of functionally distinct parts of the tongue and the number of unique shapes it assumes during speech.

Using x-ray microbeam and ultrasound to transduce tongue motion, Stone (1990) identified four midsagittal regions that functioned quasi-independently: anterior, dorsal, middle, and posterior. Other investigators have applied factor analysis to mid-sagittal tongue contours to derive the number of distinct shapes exhibited by the tongue during speech (Harshman et al., 1977; Maeda, 1990). Harshman and colleagues (1977) reported that two factors could account for the variations in sagittal tongue shapes associated with ten steady-state vowels. One factor was associated with the forward movement of the tongue-root and upward movement of the blade, and the other accounted for upward and backward movements. Maeda (1990) reported that variations in sagittal tongue shape during ten French sentences could be accounted for by three primary factors related to tongue-dorsum position (front/back), tongue-dorsum shape (arched/flat), and tongue-tip position (raised/lowered). Sanguineti and colleagues’ (Sanguineti et al., 1997) articulatory model corroborates these empirical descriptions of tongue behavior by showing that the repertoire of speech-related tongue behaviors can be generated from a small number of primitive movements that are distinguished by the independent activation of distinct muscle groups.

Although several investigations have quantified speech-related tongue shapes, few have quantified the spatiotemporal relations among adjacent and nonadjacent tongue regions during speech or swallowing. An improved understanding of the extent of functional regionality within the tongue will be important for explaining features of normal and disordered speech and swallowing. For speech, the degree of movement independence across the tongue will delimit the tongue’s capacity to encode phonetic details for linguistic distinction and the time course for coarticulation. For instance, in a CV utterance where the consonant requires alveolar closure, the degree of independence in movement between tongue-tip and tongue-body will determine the time course in which speakers can begin to move the tongue-body for producing the vowel (Kent and Moll, 1972).

There is some empirical evidence that both acquired and developmental disorders of tongue function are associated with a decrease in movement independence among the different tongue regions. Using electropalatography, Gibbon (1999) reported that a majority of children with speech disorders exhibited tongue contact patterns that lacked clear differentiation between the tongue’s apex, body, and lateral margins. In addition, Hardcastle and colleagues (Hardcastle et al., 1991) observed the erroneous coupling of velar and alveolar elevation in a speaker with apraxia resulting in a /t/ for /k/ substitution. In an earlier study of childhood articulatory disorders, Hardcastle et al. (1987) identified one child who exhibited reduced control over different regions of the tongue. During speech, this child’s tongue was reported to move as a “single undifferentiated mass” (p. 180). Similarly, in a cineradiographic study of dysarthric speech, Kent et al. (1975) observed tongue function in speakers with dysarthria to be characterized by “reduced motility” and “limited flexibility in the directions of tongue movement.” Such deficits in lingual coordination might be usefully described in terms of the distributions of coupling relations among adjacent and nonadjacent tongue regions. However, more information regarding the spatiotemporal features of tongue-surface movement patterns in nonimpaired speakers is required before such a measure can be used to gauge the degree of speech-motor impairment.

Swallowing also requires functional independence within the tongue’s supporting musculature. For example, the transport of material through the oral cavity and into the pharynx is executed by the sequential activation of genioglossus muscle fibers from anterior to posterior (Bosma et al., 1990). Thus, the study of the coupling relations among tongue regions has the potential to improve our understanding of tongue control for swallowing, as well as speech, and will provide a quantitative basis for understanding differences in the coordinative requirements for these distinct tasks.

In the present investigation, we examine the question of functional regionality by quantifying the strength of coupling among four different tongue locations across a large number of consonantal contexts and participants. Tongue behavior during swallowing will also be described. Based on this representation of tongue behavior, the following questions will be addressed regarding tongue function during speech and swallowing: (1) How much functional independence in movement is typically exhibited during speech and swallowing across the surface of the tongue? (2) How distinct is the spatiotemporal organization of mid-sagittal tongue deformations for differing consonants? (3) Are lingual deformation patterns similarly affected by phonemic contexts across speakers?

The range of movement coupling relations between two regions of the tongue across a variety of tasks is taken as a gross indicator of their functional independence. For example, the observation of persistently high coupling between two regions across a variety of tasks would suggest limited functional independence. In contrast, the observation of a wide range of movement relations between two regions would suggest a high degree of functional independence.

II. METHODS

A. Participants

These data were obtained from the X-Ray Microbeam Speech Production Database (XRMB-SPD, Westbury, 1994), which includes 57 speakers of American English. The present study examined data from 46 of these participants. The 11 excluded participants either did not perform the selected tasks or produced an insufficient amount of data to be analyzed. The mean chronological age of participants (20 male, 26 female) was 21 years; 5 months (SD: 2;6, range: 19;2–29;4) for males and 22 years; 8 months (SD: 4;5, range: 18;4–37;0) for females. The majority of participants (85%) spoke a Midwest dialect and were students at the University of Wisconsin–Madison. All participants passed a hearing screening with thresholds at or below 25 dB HL for a range of frequencies from 500 to 8000 Hz. No participants reported a history of a speech or language disorder (including oral mechanism anomalies) or evidence of neuromotor or other health concerns.

B. Kinematic data

The x-ray microbeam (XRMB) tracked movements of pellets that were affixed to the tongue (T1,T2,T3,T4), the upper and lower lip (UL,LL), and the mandible (MI,MM). An anatomically based reference frame was used to standardize pellet placement across participants [see Fig. 5.2 and Tables 5.1 and 5.2 in Westbury (1994)]. Pellet MI was affixed to the buccal surface of the mandibular incisor and pellet MM was affixed to the junction between the first and second mandibular molars. T1 and T4 pellets were placed on regions of the tongue that are typically classified as blade and dorsum, respectively, and T2 and T3 were placed intermediately and equidistant to each other and the endpoint pellets. For purpose of discussion, T3 will be considered to be located at the body of the tongue, and T1 and T2 are considered to be located at the anterior blade and posterior blade, respectively. The gold pellets (2–3 mm diameter) were affixed to these sites mid-sagittally using dental adhesive (Ketac-Bond).

The XRMB captures the motion of radiodense pellets via computer guided positioning of a narrowly focused x-ray. The operating principles of x-ray microbeam technology for tracking articulatory movements have previously been described in detail by Westbury (1991). Because the articulators tend to move at different speeds, articulatory movements were initially sampled at various rates per second (UL and MI = 40 Hz; LL, T2, T3, and T4 = 80 Hz; and T1 = 160 Hz). However, for ease of analysis, all signals were subsequently resampled at a uniform rate of 160 samples per second. The database expresses all pellet positions relative to the maxillary occlusal plane (see Westbury, 1994). In this coordinate system, the central maxillary incisor defines the origin with the x axis being defined by the maxillary occlusal plane. The y axis was defined as the line that was normal to x axis in the midsagittal plane. All signals were low-pass filtered (fc = 10 Hz) using a zero phase forward and reverse digital filter. The low-pass cutoff frequency was selected based on spectral analysis of over 50 movement traces, which were selected arbitrarily across participants and pellets, showing prominent spectral energy in a narrow band centered near 2.5 Hz.
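The zero-phase forward and reverse filtering step can be illustrated with a minimal sketch. The source does not state the filter type or order, so the first-order (one-pole) design below is an assumption chosen to keep the example dependency-free; the function names are hypothetical. Running the same filter forward and then backward cancels the phase shift, at the cost of applying the attenuation twice, so the effective cutoff of the two-pass filter is somewhat lower than the nominal single-pass value.

```python
import math

def one_pole_lowpass(samples, alpha):
    """Single forward pass of a first-order (one-pole) low-pass filter."""
    out = []
    state = samples[0]  # start at the first sample to limit the edge transient
    for value in samples:
        state += alpha * (value - state)
        out.append(state)
    return out

def zero_phase_lowpass(samples, fs_hz=160.0, fc_hz=10.0):
    """Filter forward, then backward, so the phase shifts cancel.

    The one-pole design is an assumption; the source reports only the
    10-Hz cutoff and the forward-and-reverse (zero-phase) application.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * fc_hz / fs_hz)
    forward = one_pole_lowpass(samples, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]

# Synthetic test signals sampled at 160 Hz (2 s each).
fs = 160.0
n = 320
slow = [math.sin(2.0 * math.pi * 2.5 * i / fs) for i in range(n)]   # near the 2.5-Hz band
fast = [math.sin(2.0 * math.pi * 40.0 * i / fs) for i in range(n)]  # well above the cutoff

# Peak amplitude away from the edges, after filtering.
slow_peak = max(abs(v) for v in zero_phase_lowpass(slow)[80:240])
fast_peak = max(abs(v) for v in zero_phase_lowpass(fast)[80:240])
flat = zero_phase_lowpass([3.0] * 50)
```

In this sketch the 2.5-Hz component passes nearly unchanged while the 40-Hz component is strongly attenuated, which is the qualitative behavior the 10-Hz cutoff is meant to provide.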

C. Experimental tasks

Speech data existed for 43 of the 46 participants (19 male, 24 female) with a mean chronological age of 21;7 (SD: 3;4, range: 18;4–37;0). The remaining three participants completed only the swallowing task. Each speaker produced 20 consecutive vowel-consonant-vowel (V1CV2) combinations, with the consonant changing and the vowels remaining constant (V1 = /u/, V2 = /a/). The consonants were 20 American English phonemes (/h, m, w, b, p, f, v, t, d, n, s, z, k, g, r, j, ʃ, ʒ, tʃ, dʒ/). For several of the analyses, phonemes were grouped according to the following place of articulation scheme: laryngeal fricative /h/, bilabials /m,w,b,p/, labiodentals /f,v/, alveolars /t,d,n,s,z/, palatoalveolars /ʃ, ʒ, tʃ, dʒ/, retroflex /r/, lateral /l/, palatal /j/, velars /k,g/. All utterances were produced at a self-selected typical rate with stress assigned to the second syllable. Each VCV utterance was produced only once. Thus, the present design is predicated on the assumption that a single token provides a reasonable representation of the articulatory kinematics associated with each task. This assumption is supported by previous research showing high reliability of tongue, lip, and jaw kinematic patterns among replicates of basic speech material (Green et al., 2000; Westbury et al., 1998). Data from a given pellet were not included in the analysis if they contained gaps related to mistracking. Consequently, when the data were pooled across participants, the number of missing data points varied from 0% to 8% across pellet pairs, with the highest incidence of missing data observed for T1×T2 and T2×T3 (range = 5%–8%).

Swallow data existed for 42 participants (19 male, 23 female) with a mean chronological age of 22;1 (SD: 3;11, range: 18;4–37;0). Participants swallowed 10 ccs of water for five trials. Because of pellet mistracking, the number of samples analyzed for each swallow trial differed, with a mean of 39 samples (range: 36–42) per trial. Differences in the percent of missing data for specific pellets were evident across the five swallow trials. The percent of missing data varied across all four tongue pellets (M: T1 = 13%; T2 = 14%; T3 = 19%; T4 = 12%) with a range of 4%–30%. Percent of missing data for the upper and lower jaw pellets also varied (M: MI = 3%; MM = 16%) with a range of 0%–18%.

D. Data conditioning and analysis

1. Signal processing

Prior to analysis, the positional data were transformed to yield tongue and lower lip positions that were independent of jaw position. Translatory and rotary components of mandibular movements were computed and used to reexpress the position of the tongue and lower lip pellets relative to the mandibular incisor and molar pellets. This computation, which is defined in Formula 1 (from Westbury et al., 2002), effectively transposed these data from the maxillary occlusal plane coordinate system to one that is relative to the position of the mandible:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} x - x_{MI} \\ y - y_{MI} \end{bmatrix}, \qquad \text{(Formula 1)}$$

where

$$\alpha = \tan^{-1}\left[ (y_{MM} - y_{MI}) / (x_{MM} - x_{MI}) \right];$$

$x_{MI}$, $y_{MI}$ and $x_{MM}$, $y_{MM}$ are the positions of the mandible pellets (MI = mandibular incisor, MM = mandibular molar); and $x$, $y$ is the position of a flesh-point (either the lower lip or tongue) that is being reexpressed into the mandibular-based coordinate system. This transformation was necessary because analysis of the tongue data in its original reference frame (i.e., the maxillary occlusal plane) would have biased the results toward high coupling among all lingual pellet pairs due to the shared influence of the mandible on the position of each pellet.
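The mandible-based reexpression in Formula 1 can be sketched in a few lines. This is a minimal illustration, assuming the rotation is the one that aligns the line from MI to MM with the new x axis (so that the molar pellet has zero vertical coordinate in the new frame); the function name and the pellet coordinates are hypothetical.

```python
import math

def to_mandible_frame(x, y, x_mi, y_mi, x_mm, y_mm):
    """Reexpress a flesh-point (x, y) relative to the mandible pellets.

    Translation: subtract the mandibular incisor (MI) position.
    Rotation: by the angle of the MI-to-MM line, per Formula 1.
    The sign convention of the rotation is an assumption.
    """
    dx = x_mm - x_mi
    dy = y_mm - y_mi
    if dx == 0:
        raise ValueError("MI and MM share an x position; alpha is undefined")
    alpha = math.atan(dy / dx)
    x_rel = x - x_mi
    y_rel = y - y_mi
    x_new = math.cos(alpha) * x_rel + math.sin(alpha) * y_rel
    y_new = -math.sin(alpha) * x_rel + math.cos(alpha) * y_rel
    return x_new, y_new

# Hypothetical pellet positions in mm (maxillary occlusal plane frame).
mi = (-2.0, -10.0)
mm = (-30.0, -15.0)

at_mi = to_mandible_frame(*mi, *mi, *mm)  # the incisor pellet itself
at_mm = to_mandible_frame(*mm, *mi, *mm)  # the molar pellet itself
```

Under this convention the incisor pellet maps to the origin and the molar pellet lies on the new x axis, so any jaw translation or rotation shared by all tongue pellets is removed before coupling is computed.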

All analyses were restricted to motions in the vertical dimension (y axis) as defined by the mandibular-based coordinate system. The decision to study only a single dimension of tongue motion was motivated by the need to simplify both the analysis and the interpretation of the large number of conditions being examined. This roughly “vertical” component of articulatory motion was specifically selected because (a) elevation of the appropriate region of the tongue toward the palate is an essential kinematic goal for these speech utterances and (b) based on previous findings, the vertical component is expected to provide better mapping to phonetic variation than the horizontal component (Lofqvist and Gracco, 1994). For example, during V /g/ V utterances, the anterior–posterior positioning of the site of palatal contact varies considerably depending on vowel context (Kent and Moll, 1972). Reduction of the two-dimensional data into a single variable that reflects the motion relative to a primary axis of motion (e.g., principal component analysis) was not pursued because the accuracy of this transformation can vary significantly from token to token with changes in the shape of the movement path. This transformation was also avoided because it makes the direction of movement relative to the palate ambiguous.

Prior to analysis, the movement associated primarily with the consonant was identified on each movement trace. For this procedure, the start- and end-points of each VCV gesture were defined algorithmically based on a near-zero crossing (−0.03 mm/s) in a derived velocity signal. The −0.03-mm/s threshold was empirically derived and was adopted to ensure that the selected segments were associated with speech movements as opposed to those associated with small amplitude fluctuations that frequently occur at rest. The near-zero crossing associated with the beginning of consonantal closure defined the onset of each signal, and the near-zero crossing associated with the ending of consonantal release defined the offset of each signal. If more than one threshold crossing was identified for a given phase of movement, the point that was closest to the middle of the movement segment was designated as the event marker.
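The event-marking logic can be sketched as follows. This is only an illustration of the general idea (derive a velocity signal, find where it crosses a near-zero threshold, and keep the candidate nearest a reference sample); the first-difference velocity estimate, the function names, and the toy trace are assumptions, not the procedure actually used to segment the database.

```python
def velocity(position_mm, fs_hz=160.0):
    """First-difference velocity in mm/s; one sample shorter than the input."""
    return [(b - a) * fs_hz for a, b in zip(position_mm, position_mm[1:])]

def threshold_crossings(vel, threshold=-0.03):
    """Indices i where the velocity passes through the threshold between i and i+1."""
    hits = []
    for i in range(len(vel) - 1):
        lo = vel[i] - threshold
        hi = vel[i + 1] - threshold
        if lo == 0.0 or lo * hi < 0.0:
            hits.append(i)
    return hits

def nearest_to(candidates, target_index):
    """Pick the candidate crossing closest to a reference sample index."""
    return min(candidates, key=lambda i: abs(i - target_index))

# Toy vertical trace (mm): rest, rise toward the palate, return, rest.
trace = [0, 0, 0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0, 0, 0]
vel = velocity(trace)
hits = threshold_crossings(vel)
marker = nearest_to(hits, target_index=7)
```

When several crossings fall within one movement phase, `nearest_to` mirrors the rule stated above of keeping the crossing closest to the middle of the movement segment.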

2. Performance measures

For each task, pairwise correlations were computed on the vertical time histories of select pellet pairs: T1×T2, T2 ×T3, T3×T4, T1×T3, T2×T4, T1×T4, UL×LL, LL×MI. The resulting correlation coefficients among lingual pellets quantified the strength of movement coupling of tongue-surface regions as they moved toward and away from the palate. Correlations approaching one represented highly coupled articulatory movements; correlations near zero represented independent articulatory movements; and correlations approaching negative one represented highly coupled articulatory movements that were moving in opposite directions. Although the present study was primarily concerned with movements of the tongue, lip and jaw pairs (i.e., UL×LL and LL×MI) were included to examine differences in lip and jaw coordination between lingual and labial consonants, and between labial and glottal consonants.

3. Quantification of articulatory coupling: Covariance

One interpretive limitation of representing movement coupling solely on the basis of zero-lag correlations is that the relative importance of a given movement on vocal tract acoustics or bolus propulsion cannot be evaluated. That is, because traditional correlation-based analyses are inherently normalized to signal amplitude, small movements cannot be distinguished from larger, potentially more functional movements. To overcome this limitation, we computed the covariance between the vertical time histories associated with each pellet pair (see Formula 2). The covariance formula is reexpressed in Formula 3 to emphasize that it represents spatiotemporal coupling that is weighted by movement amplitude. The SD represents the standard deviation of movement for each vertical time history:

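$$\mathrm{Cov}_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{N - 1}, \qquad \text{(Formula 2)}$$

$$\mathrm{Cov}_{xy} = r_{ij} \times SD_i \times SD_j. \qquad \text{(Formula 3)}$$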

The value of the covariance will decrease in response to both spatial and temporal differences between pellet-position time histories. The maximum value of SDi × SDj is expected to differ across pellet pairs because the maximum vertical position for a given pellet will be determined by the curvature of the palate.
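The relation between Formulas 2 and 3 can be checked numerically. The sketch below, with hypothetical vertical traces in mm and hypothetical function names, computes the covariance directly and then confirms that it equals the correlation weighted by the two standard deviations. It also shows why covariance separates small from large movements: scaling one trace leaves the correlation unchanged but scales the covariance.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance with an N - 1 denominator (Formula 2)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def std_dev(x):
    return math.sqrt(covariance(x, x))

def correlation(x, y):
    """Zero-lag Pearson correlation."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))

# Hypothetical vertical time histories (mm) for two pellets.
t1 = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0]
t2 = [0.0, 0.5, 2.0, 3.5, 3.0, 1.5, 0.5]
t2_small = [0.1 * v for v in t2]  # same shape, one tenth the amplitude

r_large = correlation(t1, t2)
r_small = correlation(t1, t2_small)
cov_large = covariance(t1, t2)
cov_small = covariance(t1, t2_small)
```

Here `r_large` and `r_small` are equal, whereas `cov_small` is one tenth of `cov_large`, which is the amplitude weighting that Formula 3 makes explicit.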

The representation of tongue surface motion in terms of the covariance provided a quantitative means to examine patterns of functional independence for two tongue regions across a variety of phonemes. For example, the observation of consistently high covariance values between two regions and across a variety of phonemes would suggest limited functional independence, whereas the observation of consistently low covariance values would suggest a high degree of functional independence.

To examine phonemic differentiation in tongue coordination, the covariance values associated with each pellet pair (i.e., T1×T2, T2×T3, T3×T4, T1×T3, T2×T4, T1×T4) were grouped to form coupling profiles for each task and speaker. The coupling profiles were used to quantify the degree of coordinative distinctness among different consonants and swallowing, and the degree of variability in lingual movement patterns across tasks and participants. For example, if tongue movements for distinct phonemes were derived from common movement patterns, then coupling profiles would be similar for multiple phonemes. Likewise, if speakers used similar lingual movement patterns, then coupling profiles would be similar across speakers for a given task.
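A coupling profile, as described above, is simply the set of six pairwise covariances for one task and speaker. A minimal sketch, assuming the four vertical traces are already segmented and equal in length (the trace values and names are hypothetical):

```python
# Pair order follows the listing in the text.
PAIR_ORDER = [("T1", "T2"), ("T2", "T3"), ("T3", "T4"),
              ("T1", "T3"), ("T2", "T4"), ("T1", "T4")]

def covariance(x, y):
    """Sample covariance with an N - 1 denominator."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def coupling_profile(traces):
    """Map each pellet pair to the covariance of its vertical time histories."""
    return {f"{a}×{b}": covariance(traces[a], traces[b]) for a, b in PAIR_ORDER}

# Hypothetical traces (mm): anterior pellets rise while T4 lowers.
traces = {
    "T1": [0, 1, 3, 4, 3, 1, 0],
    "T2": [0, 1, 2, 3, 2, 1, 0],
    "T3": [0, 0, 1, 1, 1, 0, 0],
    "T4": [0, -1, -2, -2, -2, -1, 0],
}
profile = coupling_profile(traces)
```

With these made-up traces the anterior pair T1×T2 has the largest positive value and T1×T4 is negative, the kind of pattern that would distinguish one task's profile from another's.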

Figure 1 illustrates the effectiveness of the covariance analysis for capturing across-phoneme differences in lingual movement patterns. The data in this figure were obtained from a single speaker’s productions of “uhda” and “uhga.” The top panel displays the movement path for each pellet in the midsagittal plane as shown in Tf32.exe (Milenkovic, 2000). The middle panel contains the extracted vertical time histories for each pellet, and the bottom panel contains the derived coupling profiles.

FIG. 1.

Movement data and associated coupling profiles for the utterances “uhda” and “uhga” from a single participant. Top panels: the movement path for each pellet in the mid-sagittal plane. Middle panels: the extracted vertical time histories for each pellet. Bottom panels: coupling profiles based on covariance values derived from the traces in the middle panels. The coupling profiles highlight the differences in tongue motion for these two tasks. The alveolar, as displayed in panel (c), exhibits greatest coupling between pellets T1 and T2 with little activity at other adjacent tongue regions. In addition, coupling between T1 and T4 was negative, suggesting oppositional movement. In contrast, the velar was produced with uniformly strong, positive coupling across all tongue regions. Coupling profiles provided a quantitative means to describe differences in tongue surface motion across all tasks.

The vertical time histories, displayed in the middle panel, emphasize the kinematic differences between these tasks. As expected, the location of maximum constriction is more anterior for the alveolar than for the velar. During “uhda” T1, T2, and T3 moved toward the palate relatively synchronously, while T4 moved away from the palate; during consonantal closure for “uhga,” all tongue pellets moved relatively synchronously toward the palate. The derived coupling profiles, which are displayed in the bottom panels, quantify the observed trends in the vertical time histories. During the production of “uhda,” T1×T2 coupling was highly positive; T1×T4 and T2×T4 coupling was negative; and the other three pellet pairs exhibited low positive coupling. In contrast, during the production of “uhga,” the covariance values for all tongue pellet pairs were high and positive.

III. RESULTS

A. Distribution of coefficients and SDi×SDj values as a function of task and pair

Tongue pellets exhibited differing degrees of functional independence across task (i.e., across different phonemes and swallowing). Panel (a) in Fig. 2 shows the distribution of average SDi × SDj values as a function of average correlation coefficient for each place of articulation and swallowing. Each data point represents the average value across participants. Panel (b) in Fig. 2 shows the same data plotted as a function of pellet pair. In these figures, high degrees of movement independence between any two pellets would be represented by the observation of a relatively large SDi × SDj value that was associated with a low coefficient value. Another indicator of coordinative flexibility between pellets is the range of coefficient values across tasks. That is, a high degree of movement independence for a given pellet pair would be suggested by the observation of a large range of coefficient values across different speech contexts and swallowing. Conversely, limited movement independence would be supported by the observation of a small cluster of coefficient values near either 1 or −1 across tasks.

FIG. 2.

Panel (a) shows the distribution of SDi × SDj terms as a function of average coefficient for each place of articulation and swallowing. Each data point represents the average value taken across participants. Panel (b) shows the same data plotted as a function of pellet pair.

The large range of coefficient values and movement amplitudes exhibited for most of the pellet pairs suggests that many of these anatomic regions are capable of functioning quasi-independently. Conversely, limitations in motion independence are indicated by the absence of data points in the upper middle region of these figures. This finding was anticipated on the basis that the maximum degree of movement decoupling across different tongue regions was expected to be limited by tissue linkages and volume displacement effects.

As displayed in panel (b) of Fig. 2, the range of coefficient values differed considerably across pellet pairs. Adjacent pellets tended to be only positively coupled, and therefore exhibited a relatively smaller range of coupling relations than nonadjacent pellets, which for some tasks exhibited negative coupling. T3×T4 exhibited the smallest range (0.46) of average coefficient values and T1×T3 exhibited the largest range (1.35). With the exception of LL×MI, coefficient values for most pellet pairs tended to vary along a continuum.

These data also illustrate the potential limitations of relying solely on the correlation coefficient as a measure of tongue-pellet coupling. As displayed in panel (a) of Fig. 2, the coefficient values do not distinguish the differences in coupling relations among pellets displayed in the lower right-hand corner of the figure (high coupling–small movements) from those displayed in the upper right-hand corner (high coupling–large movements). Covariance values, in contrast, distinguish between movements that reasonably may be assumed to have a greater influence (i.e., large amplitude) on vocal tract acoustics from those that might have only a minimal influence (i.e., small amplitude). We also recognize, however, that in accordance with the quantal theory of speech (Stevens, 1989), articulatory-to-acoustic relations are highly nonlinear, with small articulatory movements producing disproportionately large acoustic changes in some vocal tract regions.

B. Task-related differences in tongue-surface movement patterns

Covariance values were plotted as a function of pair and grouped by task. The resultant coupling profiles for each place of articulation and swallowing are presented in panel (a) of Fig. 3. A comparison of the coupling profiles provides a quantitative means to assess task-related differences in tongue-surface movement patterns. The average covariance values are listed in Table I as a function of pair and task.

FIG. 3.

Panel (a) shows the average across-participant covariance values plotted as a function of pair and grouped by task. Panel (b) shows the standard deviation values associated with the mean values displayed in Panel (a).

TABLE I.

Covariance summary statistics as a function of pellet pair and place of articulation.

| Place | T1×T2 | T2×T3 | T3×T4 | T1×T3 | T2×T4 | T1×T4 | LL×J1 | UL×LL | Total, M (SD): Range |
|---|---|---|---|---|---|---|---|---|---|
| Alveolar | 5.07 (5.10) | 1.34 (2.79) | 2.48 (2.98) | −0.17 (3.70) | −1.01 (3.00) | −3.37 (4.11) | −0.51 (2.01) | 0.06 (0.27) | 0.49 (3.37): 29.69 |
| Bilabial | 1.35 (1.42) | 1.65 (1.84) | 2.35 (2.47) | 0.32 (1.78) | 1.11 (1.99) | 0.01 (2.04) | 4.49 (2.93) | 2.46 (1.70) | 1.72 (2.00): 14.76 |
| Glottal | 0.48 (0.42) | 0.47 (0.58) | 0.86 (1.02) | 0.20 (0.40) | 0.13 (0.82) | −0.03 (0.65) | −0.17 (0.92) | 0.03 (0.13) | 0.24 (0.74): 8.11 |
| Labiodental | 1.01 (1.11) | 1.56 (1.65) | 2.20 (2.27) | 0.64 (1.07) | 1.30 (1.80) | 0.51 (1.60) | 4.76 (3.00) | 0.52 (1.12) | 1.55 (2.04): 17.06 |
| Palatal | 7.94 (4.92) | 16.42 (7.80) | 9.37 (9.83) | 6.49 (4.27) | 6.68 (8.27) | 2.38 (3.60) | −0.66 (1.61) | 0.03 (0.11) | 5.92 (7.81): 42.10 |
| Palatal-alveolar | 9.34 (5.99) | 1.97 (3.89) | 2.53 (3.67) | 0.91 (4.91) | −2.48 (3.95) | −4.66 (5.31) | −1.74 (3.50) | 0.43 (0.81) | 0.77 (5.42): 39.53 |
| Velar | 2.57 (2.95) | 11.09 (6.29) | 17.62 (11.76) | 2.68 (3.53) | 10.11 (6.75) | 2.45 (3.32) | −0.32 (1.50) | 0.10 (0.23) | 5.75 (7.99): 50.61 |
| Retroflex | 9.20 (6.40) | 1.29 (3.41) | 1.97 (2.90) | −1.99 (5.97) | −3.04 (3.01) | −6.82 (5.76) | 3.47 (2.40) | 0.83 (1.39) | 0.61 (6.14): 54.17 |
| Lateral | −0.42 (2.27) | 1.63 (2.46) | 2.18 (2.67) | −3.97 (3.42) | 0.71 (1.59) | −1.57 (4.15) | −0.05 (1.67) | 0.03 (0.19) | −0.21 (3.13): 29.98 |
| Swallow | 3.37 (6.32) | 12.93 (9.18) | 17.38 (9.48) | 0.33 (3.84) | 6.06 (7.41) | −0.23 (3.92) | −0.55 (1.48) | 0.34 (1.04) | 5.30 (8.64): 41.30 |
| Total: M (SD) | 4.45 (5.38) | 5.45 (7.39) | 6.26 (8.61) | 0.63 (4.49) | 2.10 (6.18) | −1.25 (4.71) | 1.04 (3.19) | 0.55 (1.10) | |
| Range | 35.24 | 37.52 | 49.58 | 31.69 | 39.39 | 42.00 | 28.66 | 9.65 | |

Overall, different places of articulation were distinguished by their coupling profiles, with covariance values being greatest for adjacent pellet pairs located near the expected primary place of articulation. Task differences among coupling profiles were tested statistically using multiple repeated measures MANOVAs (task × pair). To reduce the potentially large number of statistical comparisons, the data were grouped by place of articulation and the statistical tests were restricted to lingual pellets. Prior to statistical analysis, normality of the covariance data was examined using histograms and normal probability plots. These plots revealed that the covariance scores were distributed symmetrically about their mean with the exception of a few outliers. The results of the omnibus and main effect analyses are reported in Table II. For these repeated measures comparisons, a Bonferroni correction was applied to each familywise comparison (15 possible tongue-pellet pair comparisons per task) resulting in a corrected α level of 0.003. With the exception of palatoalveolar versus retroflex, all omnibus comparisons achieved statistical significance using this criterion. This finding suggests that mid-sagittal lingual movement patterns, as represented by the coupling profile, were distinct across different places of articulation. In the main effect analysis, each place of articulation exhibited at least one across-pair comparison that achieved statistical significance with the exception of palatoalveolar versus retroflex and velar versus swallow.
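The corrected criterion is consistent with the usual Bonferroni division of a nominal level by the number of comparisons in the family. As a worked check, assuming a nominal α of 0.05 (the text does not state the uncorrected level) and the 15 comparisons per family reported above:

```latex
\alpha_{\text{corrected}} = \frac{\alpha_{\text{nominal}}}{m} = \frac{0.05}{15} \approx 0.0033
```

which rounds to the reported level of 0.003.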

TABLE II.

Results of pairwise comparisons testing for differences in covariance values across pairs and tasks. All p values were Bonferroni corrected. Nonsignificant findings are shown in bold. Pa: Palatoalveolar.

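Columns: pairwise comparison, followed by df, F, and p for the omnibus test and then for each pellet pair, in the order Omnibus, T1×T2, T2×T3, T3×T4, T1×T3, T2×T4, T1×T4.

Pairwise comparison | Omnibus (df F p) | T1×T2 (df F p) | T2×T3 (df F p) | T3×T4 (df F p) | T1×T3 (df F p) | T2×T4 (df F p) | T1×T4 (df F p)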
Alveolar vs Palatal (6,31) 38.4 0.0001 1 13.4 0.0008 1 170.7 0.0001 1 13.9 0.0007 1 86.7 0.0001 1 40.9 0.0001 1 92.1 0.0001
Alveolar vs Pa (6,35) 8.9 0.0001 1 34.3 0.0001 1 1.9 0.1802 1 0.0 0.9450 1 5.0 0.0317 1 11.5 0.0016 1 6.4 0.0151
Alveolar vs Velar (6,34) 48.8 0.0001 1 16.5 0.0002 1 115.7 0.0001 1 61.5 0.0001 1 14.7 0.0004 1 148.3 0.0001 1 74.2 0.0001
Alveolar vs Retroflex (6,33) 7.3 0.0001 1 21.7 0.0001 1 0.0 0.9658 1 1.7 0.1984 1 2.9 0.0983 1 17.3 0.0002 1 11.3 0.0018
Alveolar vs Lateral (6,30) 34.8 0.0001 1 108.3 0.0001 1 0.5 0.4722 1 3.6 0.0673 1 54.2 0.0001 1 16.67 0.0002 1 21.8 0.0001
Alveolar vs Swallow (6,26) 44.2 0.0001 1 0.9 0.3448 1 66.3 0.0001 1 117.3 0.0001 1 0.8 0.3746 1 39.9 0.0001 1 16.3 0.0003
Palatal vs Pa (6,31) 42.1 0.0001 1 2.2 0.1481 1 186.0 0.0001 1 14.4 0.0005 1 57.9 0.0001 1 71.0 0.0001 1 97.9 0.0001
Palatal vs Velar (6,30) 27.5 0.0001 1 53.0 0.0001 1 31.8 0.0001 1 57.0 0.0001 1 32.5 0.0001 1 9.1 0.0047 1 0.2 0.6848
Palatal vs Retroflex (6,29) 20.0 0.0001 1 1.1 0.3053 1 115.2 0.0001 1 19.1 0.0001 1 56.4 0.0001 1 49.7 0.0001 1 65.8 0.0001
Palatal vs Lateral (6,28) 38.9 0.0001 1 78.5 0.0001 1 100.4 0.0001 1 20.0 0.0001 1 115.0 0.0001 1 21.4 0.0001 1 19.4 0.0001
Palatal vs Swallow (6,24) 8.5 0.0001 1 10.1 0.0035 1 1.6 0.2153 1 20.8 0.0001 1 25.2 0.0001 1 0.0 0.9509 1 5.2 0.0306
Pa vs Velar (6,34) 50.9 0.0001 1 55.4 0.0001 1 91.7 0.0001 1 62.9 0.0001 1 3.7 0.0631 1 225.3 0.0001 1 79.5 0.0001
Pa vs Retroflex (6,33) 4.3 0.0025 1 0.0 0.8741 1 0.7 0.4198 1 1.3 0.2567 1 6.5 0.0149 1 0.9 0.3488 1 3.9 0.0545
Pa vs Lateral (6,30) 42.5 0.0001 1 117.5 0.0001 1 0.2 0.6841 1 1.8 0.18461 1 38.2 0.0001 1 22.0 0.0001 1 24.8 0.0001
Pa vs Swallow (6,26) 35.4 0.0001 1 16.82 0.0003 1 51.1 0.0001 1 106.0 0.0001 1 0.12 0.73089 1 50.89 0.0001 1 21.4 0.0001
Velar vs Retroflex (6,32) 28.3 0.0001 1 38.6 0.0001 1 64.7 0.0001 1 76.7 0.0001 1 11.6 0.0016 1 154.7 0.0001 1 61.3 0.0001
Velar vs Lateral (6,29) 18.7 0.0001 1 22.66 0.0001 1 69.5 0.0001 1 58.4 0.0001 1 55.2 0.0001 1 61.69 0.0001 1 17.68 0.0002
Velar vs Swallow (6,25) 10.5 0.0001 1 2.2 0.1485 1 4.2 0.0504 1 0.8 0.39337 1 3.7 0.06287 1 6.6 0.01528 1 8 0.0084
Retroflex vs Lateral (6,28) 16.6 0.0001 1 78.4 0.0001 1 0.3 0.6128 1 0.0 0.94993 1 3.6 0.06572 1 41.1 0.0001 1 35.0 0.0001
Retroflex vs Swallow (6,25) 46.8 0.0001 1 11.1 0.0023 1 53.3 0.0001 1 160.4 0.0001 1 2.7 0.11342 1 64.4 0.0001 1 25.5 0.0001
Lateral vs Swallow (6,22) 32.7 0.0001 1 13.1 0.0012 1 85.3 0.0001 1 115.3 0.0001 1 28.3 0.0001 1 36.9 0.0001 1 1.6 0.2109

The coupling profiles for each phonemic context were subjected to a multidimensional scaling (MDS) procedure to derive an articulatory coordination space based on pellet coupling. This analysis provided a novel means for evaluating task specificity of lingual movement patterns by reducing the multivariate data associated with each task into three factors. Three factors were used because the three-dimension model accounted for a greater proportion of the variance (R² = 70%, stress = 0.25) than did the two-dimension model (R² = 64%, stress = 0.32). Panel (a) of Fig. 4 displays the MDS solution, which is plotted as Euclidean distances from a common centroid. Similarities among coupling profiles across place of articulation and task are represented by spatial proximity. When interpreting the MDS solution it is important to consider that the impression of data clusters varies dramatically depending on figure orientation. The individual participant weights for each dimension are presented in panel (b) of Fig. 4.
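To make the stress comparison concrete, the sketch below computes a Kruskal-style stress-1 value for a configuration of points. It is an illustration under stated assumptions, not the procedure used in the study: the MDS software and the exact stress formula are not given in this section, the profile values are hypothetical, and the raw dissimilarities are used directly as targets (a nonmetric MDS would instead use disparities obtained by monotone regression). The function names `euclidean` and `stress1` are illustrative.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def stress1(config, targets):
    """Kruskal-style stress-1 for a configuration.

    config:  dict mapping item name -> coordinates in the MDS space
    targets: dict mapping frozenset({a, b}) -> target dissimilarity
    """
    num = 0.0
    den = 0.0
    for a, b in combinations(sorted(config), 2):
        d = euclidean(config[a], config[b])
        num += (d - targets[frozenset((a, b))]) ** 2
        den += d ** 2
    return math.sqrt(num / den)


# HYPOTHETICAL three-value profiles for three tasks (not data from the study).
profiles = {
    "alveolar": (1.0, 0.2, -0.5),
    "velar": (0.3, 0.9, 0.6),
    "swallow": (0.4, 1.0, 0.9),
}

# Target dissimilarities: distances between the full profiles.
targets = {
    frozenset((a, b)): euclidean(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

# A three-dimension configuration that reproduces the profiles exactly,
# and a two-dimension configuration that drops the third coordinate.
config_3d = dict(profiles)
config_2d = {name: xyz[:2] for name, xyz in profiles.items()}

stress_3d = stress1(config_3d, targets)
stress_2d = stress1(config_2d, targets)
print(stress_3d, stress_2d)
```

In this toy case the three-dimension configuration fits the targets exactly, so its stress is zero, while dropping a dimension raises the stress. Lower stress with an added dimension is the same kind of trade-off behind the choice of three factors over two.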

FIG. 4.

The coupling profiles for each phonemic context were subjected to a multidimensional scaling (MDS) procedure to derive an articulatory coordination space based on pellet coupling. This analysis provided a means to evaluate task specificity of lingual deformation patterns by reducing the multivariate data associated with each task into three factors or dimensions. Similarities among coupling profiles across place of articulation and task are represented by spatial proximity. Panel (a) shows the MDS solution plotted as Euclidean distances from a common centroid. Panel (b) shows the individual participant weights for each dimension.

Based on visual inspection, the MDS solution identified between five and seven clusters that distinguished the different tongue sounds (e.g., alveolar fricatives from velars and alveolar stops). As expected, all labial sounds occupied a similar location of the MDS space. With the exception of /t/ and /d/, homorganic consonants were in close proximity. The retroflex, lateral, and swallowing each occupied a unique location in the MDS solution. Velars and the palatals appeared to form a unique cluster. Interestingly, /s/ and /z/ did not cluster with other alveolar sounds. The retroflex was primarily distinguished from the other alveolars and the palatoalveolars by its relatively high dimension 3 value, which was more similar to the values associated with labial sounds.

C. Pellet pair by task interactions

Pellet pair effects were tested using multiple post hoc comparisons for swallowing and place of articulation. Due to the large number of comparisons being tested, statistical reporting was abbreviated in the form “p <0.003, for each comparison” when the same alpha level was used for a family of comparisons. In general, pellet pair effects tended to vary predictably with place of articulation. As anticipated, glottals were associated with weak coupling across all pellet pairs. The alveolars, retroflex, and palatoalveolars exhibited significantly stronger coupling for T1×T2 than for all other pellet pairs (p <0.001, for all comparisons) and significantly stronger negative coupling for T1×T4 than for all other pairs (p <0.001, for all comparisons) except T2×T4. In contrast to the more “fronted” sounds, the palatal and velars were associated with positive correlations among all pellet pairs. In the present study, the /l/ was characterized by uniformly low coupling except for T1×T3, which showed significantly greater negative coupling than all other pellet pairs (p < 0.001, for all comparisons). The palatal was produced with significantly greater coupling for T2×T3 than for all other pellet pairs (p <0.001, for all comparisons). In contrast, velars were produced more posteriorly than was the palatal, with significantly stronger coupling for T3×T4 than for all other pairs (p <0.001, for all comparisons). For velars, pellet pair coupling was also significantly stronger for T2×T3 than for all other pairs (p <0.001, for all comparisons) except T3×T4 and T2×T4. Like the palatal and velars, swallowing was characterized by a high degree of positive coupling across all tongue pellets, with significantly stronger coupling for T3×T4 than for all other pellet pairs (p <0.001, for all comparisons) and greater coupling for T2×T3 than for all other pellet pairs (p <0.001, for all comparisons) except T3×T4.

D. Across-speaker variation in lingual movement patterns

The present analyses provide several parameters that could be examined to assess across-speaker differences in lingual movement patterns. Figure 3(b) presents the across-speaker standard deviation for each mean value displayed in panel (a). These values show marked individual variability, most notably in the expected place of primary constriction for each consonant. In general, the standard deviation values appear to scale closely with their associated means. Swallowing was associated with high degrees of variability across all lingual pellet pairs. The results in Fig. 4(b) support the findings in Fig. 3(b) by showing a wide range of weights, most notably for dimension 1. The weights for the MDS solution measure the importance of each dimension to each participant. A participant with weights proportional to the average weights has a weirdness of zero, the minimum value. A participant with one large weight and many low weights has a weirdness value approaching one. A participant with only one positive weight has a weirdness of one, the maximum value for non-negative weights.

Although oppositional movement between T1×T4 was a distinguishing feature of front consonants, not all of the participants exhibited this pattern and some exhibited oppositional movement for back consonants. For T1×T4, negative coupling (oppositional movement) occurred in 90% of the participants for front consonants, in 20% of the participants for /g,k/, and in 28% of the participants for /j/.

IV. DISCUSSION

The coupling profile analysis provided a relatively simple quantitative method for describing tongue-surface movement patterns and for evaluating the behavioral flexibility exhibited by the tongue during speech and swallowing. Across all tasks, speakers exhibited a moderate degree of movement independence among adjacent and nonadjacent tongue regions. However, several constraints in movement independence were suggested by patterns of persistent high coupling across and within tasks. Specifically, adjacent pairs exhibited the least amount of movement independence, and large movements of posterior pellets (i.e., T3 and T4) were strongly associated with movements of anterior pellets (i.e., T1 and T2). Although coupling profiles describe lingual coordination of only four tongue regions, they effectively captured changes in tongue-surface deformation patterns that distinguish one place of articulation from another and speech from swallowing. The basic movement patterns captured by the coupling profiles reflect regional organization of the tongue and underscore the importance of local surface elevations in determining constriction location. The patterns of tongue movement identified in the present study may be useful for forming some expectations for tongue behavior during speech, which may potentially be used to gauge the degree of disordered tongue function.
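As a reminder of what a coupling profile contains, the sketch below computes the zero-lag covariance between the vertical time-histories of every pellet pair. It is a minimal illustration, not the study's analysis code: the traces are hypothetical, the function names are illustrative, and the use of the sample covariance (dividing by n − 1) without further normalization is an assumption.

```python
from itertools import combinations


def covariance(x, y):
    """Zero-lag sample covariance of two equal-length time series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def coupling_profile(traces):
    """Return {(pellet_a, pellet_b): covariance} for all pellet pairs."""
    return {
        (p, q): covariance(traces[p], traces[q])
        for p, q in combinations(traces, 2)
    }


# HYPOTHETICAL vertical traces (arbitrary units) resembling a front consonant:
# T1 and T2 rise together while T4 lowers.
traces = {
    "T1": [0.0, 1.0, 2.0, 1.0, 0.0],
    "T2": [0.0, 0.8, 1.6, 0.8, 0.0],
    "T3": [0.0, 0.1, 0.0, -0.1, 0.0],
    "T4": [0.0, -0.5, -1.0, -0.5, 0.0],
}

profile = coupling_profile(traces)
for pair, value in profile.items():
    print(pair, round(value, 3))
```

With four tongue pellets the profile has six pairs. In this toy example T1×T2 is positive and T1×T4 is negative, the pattern reported above for front consonants.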

A. Task differentiation in lingual coordination

1. Phonemic differentiation

Overall, the coupling profiles captured the expected features of tongue, lip, and jaw behavior of consonants across participants. Specifically, the predominant peak of each coupling profile exhibited in Fig. 3 varied systematically from anterior to posterior and occurred in locations that are roughly consistent with those identified by conventional places of articulation schemes (Ladefoged, 2001; Nicolosi et al., 1996). This finding provides some evidence for the face validity of covariance as a quantitative index of lingual-surface coordinative organization.

In the present study, the number of distinct profiles grossly represented the degree of phonemic specificity encoded by motions of the tongue in the mandibular anatomic reference plane. Based on visual inspection, the MDS solution (Fig. 4) identified between five and seven clusters that distinguished, for instance, alveolar fricatives from velars and alveolar stops. This number of distinct lingual movement patterns is greater than might be expected based on previous estimates (Harshman et al., 1977; Maeda, 1990; Stone, 1990). However, a visual inspection of panel (a) of Fig. 3 suggests that if scaling differences among profiles were accounted for, the number of distinct patterns might decrease to four: (1) blade elevation with dorsum depression, (2) body elevation, (3) dorsum elevation, and (4) anterior-blade elevation with body depression. The blade elevation with dorsum depression pattern was observed for alveolars, palatoalveolars, and the retroflex, which exhibited positive T1×T2 coupling and negative T1×T4 coupling. As revealed by the MANOVA and the MDS solution, the retroflex was primarily distinguished from the alveolars and the palatoalveolars by the relatively greater covariance values for LL and MI. This finding agrees with prior work suggesting that lip rounding is an additional feature of the retroflex (Westbury et al., 1998). The body elevation pattern, which was observed for the palatal /j/, tended to be produced with positive coupling among all pellet pairs, but with the strongest coupling between T2×T3. The third pattern was associated with swallowing and velars, which, like /j/, were characterized by positive coupling for all pellet pairs, but differed in that the greatest coupling occurred between T3×T4. Finally, the fourth pattern was associated with the lateral, which was distinct from the other fronted sounds in that T1×T2 were weakly coupled, and T1×T3 were negatively coupled.

The four basic tongue-surface movement patterns observed in the present study are similar to those described by Stone and Lundberg (1996). Using electropalatographic and three-dimensional ultrasound techniques, these investigators identified four fundamental tongue-surface shapes: front-raising for /n/ and /ʃ/, complete groove for /s/ and /θ/, back-raising for /k/, and two-point displacement for /l/. In the present experiment, front-raising was a prominent movement pattern for alveolars, palatoalveolars, and the retroflex and was indicated by strong coupling for anterior pellets (T1×T2), relatively weak coupling among posterior pellets (T3×T4), and negative coupling for T1×T4. Central grooving may explain why /s/ and /z/ did not cluster with other alveolars in the multidimensional solution, as this type of posturing may restrict motion of the mid-sagittal tongue. Future investigations should explore this within-place category difference. Stone and Lundberg’s “back-raising” gesture for velars was quantitatively supported in the present investigation by the relatively high T3×T4 coupling observed for these consonants. The present analysis also revealed that velars were characterized by the simultaneous elevation of all tongue regions (i.e., positive, moderate to high coupling across all tongue pairs). This “whole tongue” movement pattern is fundamentally different from that observed for alveolars and palatoalveolars, which exhibited a greater diversity of covariance values across tongue regions, and thus more complicated patterns of lingual movement. The coupling profiles for /l/ did not exhibit the anterior–posterior elevation pattern (i.e., “two-point displacement”) described by Stone and Lundberg, although both studies similarly observed tongue behavior for this sound to be distinct from other sounds. In the present study, the /l/ was characterized by uniformly low coupling except for T1×T3, which showed moderate, negative coupling.
The similarities between the lingual patterns described by Stone and Lundberg and those identified in the present study provide additional evidence for the strength of covariance as a method for parametrizing tongue-surface movement patterns across a large number of participants.

2. Scaling of basic movement templates across consonants

The present suggestion of four tongue-surface coupling patterns is consistent with the assertion that a small set of movement patterns or shapes forms the basis for phonemic distinctions and that differences among closely related sounds may result from a scaling of these basic templates (Stone and Lundberg, 1996). The observation of limited variations in tongue configurations across a variety of phonemes is consistent with motor control theories that rely on neuromuscular synergies. Synergies, in theory, simplify the task of movement control from the central nervous system by reducing the number of independent elements that need to be regulated across a variety of motor tasks (see Bernstein, 1967; Turvey et al., 1978). For the present discussion, we adopt the definition of synergy proposed by Saltiel et al. (2001) as “a fixed group of muscles whose activity scales together” (p. 1).

If synergies, as previously defined, were evoked for lingual motion during speech, then we would expect lingual phonemes to be primarily distinguished by the relative level of excitation across a shared set of muscles. Moreover, to the extent that these putative modulations of muscle excitation map to articulatory displacement, we would also expect that some phonemes are primarily distinguished by the amplitude scaling of a common movement pattern. Although synergies are central to many prevailing theories of motor control, including those related to speech production (Browman and Goldstein, 1989; Kelso et al., 1986), empirical verification of their physical manifestation has proven to be challenging and requires further work (Macpherson, 1991; Perkell, 1997).

3. Speech versus swallowing

Despite the fact that the average coupling profile for swallowing was similar in shape to that for velars, swallowing occupied a unique region of the MDS space. This result may be accounted for, in part, by the large variability across participants that was observed for swallowing covariance values [Fig. 3(b)]. The vertical time histories observed for lingual pellets during swallowing were distinct from those observed during speech. During swallowing, lingual pellet motions were initiated sequentially starting at the anterior T1 and ending at the posterior T4. This observation is consistent with reports describing tongue motion during swallowing to propagate in a wavelike manner from apex to dorsum (Bosma et al., 1990; Martin, 1991). In contrast, the pellet motions during speech appeared to be relatively synchronous [for example, see Figs. 1(b) and (e)]. Based on these observations, we suspect that the high, positive covariance values observed during swallowing were not the result of greater movement coupling, but instead were due to the overlapping of periods of stillness that occurred when each pellet assumed a relatively stationary position after achieving palatal closure. This observation suggests that a time-lagged cross-correlational analysis would be a more appropriate method for describing the sequential movement patterns characteristic of swallowing than the zero-lag method used in this investigation.
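The difference between a zero-lag measure and a time-lagged one can be illustrated with a short sketch. The code below is not an analysis from the study: the two traces are hypothetical (the same movement, delayed by three samples in the second pellet), the function names are illustrative, and Pearson correlation is used as the lagged measure by assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series has no variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag]; a positive lag means y follows x."""
    if lag >= 0:
        xs, ys = x[: len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[: len(y) + lag]
    return pearson(xs, ys)


def best_lag(x, y, max_lag):
    """Return the lag in [-max_lag, max_lag] with the highest correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: lagged_correlation(x, y, k))


# HYPOTHETICAL traces: the posterior pellet repeats the anterior pellet's
# movement three samples later, as in a sequential (wavelike) pattern.
anterior = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]
posterior = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]

zero_lag = lagged_correlation(anterior, posterior, 0)
lag = best_lag(anterior, posterior, max_lag=5)
peak = lagged_correlation(anterior, posterior, lag)
print(lag, round(zero_lag, 3), round(peak, 3))
```

Here the zero-lag value does not reflect that the two traces are identical apart from a delay, whereas scanning over lags recovers both the delay and the strength of the relationship. Real swallowing traces also contain the long stationary periods described above, which this toy example does not model.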

B. Functional movement independence in midsagittal tongue

The degree of movement independence, as measured by covariance, varied considerably among pellet pairs. Of all the adjacent pellet pairs, the anterior pair (i.e., T1×T2) appeared to exhibit the greatest across-task variation in coupling. This observation is consistent with the expectation that speakers have the finest control over the tongue’s distal regions. Interestingly, morphologic differences between the anterior and posterior tongue musculature have been reported in primates. DePaul and Abbs (1996) reported that in the Macaca fascicularis, type IIA fibers were predominant in the apex of the tongue, with the number of type I fibers increasing posteriorly. These authors speculated that the different fiber types may be activated separately, with the type IIA fibers associated with rapid tongue tip movements and the type I fibers associated with the relatively slower movements of the posterior tongue.

The distribution for covariance values for some pellet pairs (i.e., T2×T3, T3×T4, T2×T4) formed several primary clusters, which suggest that the relative motions between these regions are, in practice, limited. For example, the covariance values for T2×T4 formed two primary clusters, one representing back sounds (positive coupling) and one representing front consonants (negative coupling). Similarly, covariance values associated with T3×T4 and T2×T3 formed two primary clusters that were restricted in range: one cluster representing weak coupling for more anterior tongue consonants and the other cluster representing strong coupling for more posterior tongue consonants. The observation of strong coupling within a restricted range for more posterior consonants is consistent with the extreme convex posturing of the tongue dorsum during back consonants, which has been previously described by other investigators (Perkell, 1969; Stone and Lundberg, 1996). Collectively, these findings reveal that during back-raising gestures, movements of posterior pellets (e.g., T4) were highly coupled with those of more anterior pellets (e.g., T1, T2, T3), whereas during front-raising gestures, anterior pellets exhibited functional independence from more posterior pellets. These observed tendencies in lingual surface motion might be interpreted to represent a general feature of tongue motion for speech: large amplitude movement of anterior tongue can be independent from movement of posterior regions, but large amplitude movements of posterior regions are not independent from movement of anterior regions.

C. Across-speaker variation

In the present investigation, coupling profiles were examined to assess across-speaker variation in tongue movement patterns for very basic speech utterances. There have been relatively few comprehensive reports of across-speaker differences in tongue kinematics largely because the instrumentation for tracking lingual kinematic data is expensive, as are the work hours required for data reduction (hence the impetus for the XRMB database; see Westbury, 1994). Consequently, most investigations of tongue function have studied seven or fewer participants (e.g., Guenther et al., 1999; Harshman et al., 1977; Hoole, 1999; Kent and Moll, 1972; Lofqvist and Gracco, 1994; Perkell and Nelson, 1985; Stone, 1990). The few existing investigations that have studied tongue kinematics in a large number of participants have reported large differences across speakers (Hashi et al., 1998; Westbury et al., 1998). Based on these findings, and the widely reported kinematic changes with regard to speech rate and context, we anticipated observing considerable across-participant differences in coupling profiles, even for the relatively basic speech utterances studied. The expectation for across-participant differences in tongue movement patterns was further strengthened by factors such as individual differences in vocal tract anatomy and pellet placement. Of course, differences in coupling profiles across phonemes will be directly affected by differences in movement amplitude across participants. Vocal tract size may be one factor that contributes to across-speaker differences in the magnitude of displacement (Kuehn and Moll, 1976). However, a direct relationship between vocal tract size and articulatory displacement is not supported by experiments showing that young children exhibit similar articulatory displacements to adults (Goffman and Malin, 1999; Smith and Gartenberg, 1984). 
Knowledge of how individual differences in vocal tract morphology influence articulatory strategies is surprisingly limited.

Despite the expectation for across-speaker differences, the present findings suggest that covariance is at the appropriate level of analysis for capturing across-speaker similarities in tongue movement patterns. Similarities across participants were most strongly supported by the phoneme effects observed in the repeated measures MANOVA. Because this analysis statistically controlled for systematic subject effects on covariance values, it was able to detect across-participant similarities in the shape of coupling profiles. In contrast, across-participant differences were suggested by the data in Fig. 3(b) and Fig. 4(b), where covariance values appeared to vary considerably across participants for most contexts, as indicated by the high standard deviations and weirdness values, respectively. Some of these differences might be explained by systematic differences in movement amplitude. As a whole, the results of the different levels of analysis suggest that although speakers exhibited a wide degree of variation in their covariance values for a given phoneme, their overall profile shapes were similar.

D. Putative mechanisms for observed tendencies in lingual motion

Some of the present findings may represent biomechanical constraints on tongue movements. For example, mechanical linkages between contiguous tongue regions may have accounted for the relatively high maximum coupling observed between adjacent pellets. This possibility was also suggested by Dembowski and colleagues (1998), who reported that the strength of pairwise correlations of pellet-point positions decreased as the distance between their locations on the tongue increased. Moreover, the consistently high levels of movement coupling observed across the entire tongue during back consonants may be the result of extrinsic muscle activity (e.g., styloglossus), which simultaneously raises the tongue body and dorsum toward the palate (MacNeilage and Sholes, 1964).

The basis for the regular negative coupling observed between anterior and posterior tongue during front consonants is not obvious. One possibility is that speakers produce this lowering gesture to provide clearance for the ensuing air stream posterior to the primary site of constriction. This gesture may also be the result of (a) a motor strategy in which the posterior muscles of the tongue are stiffened to form a stable support for more anterior regions, (b) a coarticulation effect from surrounding vowels (Stone, 1990), or (c) a redistribution of volume within the tongue (Smith and Kier, 1989). The last possibility considers the hydrostatic mechanisms in the tongue by which depression of the dorsum and root could potentially facilitate anterior elevation through shifting the volume within the tongue anteriorly.

Some of the observed across-pellet differences in coupling may also be explained, in part, by pellet placement effects and palatal constraints on lingual mobility. For example, alveolars may have exhibited lower covariance values than did palatoalveolars because T1 (the most anterior pellet) was located posterior to the tongue tip, which is the primary location of constriction for the alveolars. Moreover, maximum coupling as represented by covariance may have been greater for posterior tongue than for anterior tongue because the high-arching, posterior palate affords more space to move than does the downward-sloping, anterior palate.

E. Design limitations and interpretive caveats

Several aspects of our experimental design should be considered when attempting to generalize the present findings to all tongue behavior. Specifically, a greater diversity of lingual movement patterns may have been observed if vowel context was varied or if more natural speech stimuli were used and if observations of tongue motion were not restricted to the vertical dimension of the mid-sagittal plane. For example, Stone (1990) reported that the oppositional movement between anterior and posterior tongue regions (i.e., negative coupling) during alveolars was somewhat vowel context dependent. Moreover, previous research has shown some consonants to be distinguished by tongue maneuvers outside the mid-sagittal plane such as palatal bracing (Stone, 1990) and cross-sectional movements for linguapalatal sounds (Stone et al., 1992).

In addition, several issues should be considered regarding interpretive limitations of tongue and lower lip data that are referenced relative to the mandibular reference plane. Specifically, the interpretation that this transformation (i.e., Formula 1) yields tongue positions that are independent from the motion of the jaw becomes particularly challenging during instances when the tongue is stationary while the jaw is moving. In this case, the kinematic traces of tongue pellets will reflect the movement characteristics of the jaw more than those of the tongue. It is likely that the composition of our utterances minimized this effect because the low vowel context of each VCV utterance encouraged movement of the jaw for both oral opening and closing. Interpreting lingual kinematic traces in the mandibular reference plane will also be challenged if jaw motion does not uniformly influence the motion of different tongue pellets. In this case, the positions of pellets whose motions are not tightly coupled to the jaw’s will be effectively “overcorrected.” At present, the extent of this effect is not known. Finally, this transformation does not account for the inertial forces that jaw motion imposes on the tongue and lower lip. The effects of these forces, however, are not of particular interest to the present study because it is principally concerned with characterizing tongue-surface movement patterns rather than the forces that generate them.

F. Summary and future directions

In summary, the coupling profile analysis effectively captured probable tongue movement patterns for distinguishing different places of articulation and speech from swallowing. In general, pellet-motion coupling patterns varied predictably with place of articulation. This analysis revealed four basic patterns of lingual coordination in the mid-sagittal tongue that could potentially be elaborated on to form further distinctions.

The usefulness of covariance as a quantitative means for describing basic lingual function depends on additional work directed toward evaluating the extent to which the observed trends in tongue-surface coupling apply to less constrained speech tasks. For instance, it is not evident how surrounding vowels, speech rate, and intensity influence coupling profiles. Nonetheless, the present level of success in capturing across-speaker tendencies in tongue-surface movement patterns suggests that with further development, covariance might be a useful metric for gauging the extent of disordered tongue function. For example, the present analysis might be particularly well suited for quantifying the relative increases or decreases in constraints imposed by the neuromotor system that may underlie neurologically impaired tongue function (e.g., the decreased inhibition by the neuromotor system associated with Huntington’s chorea or the decreased excitation by the neuromotor system associated with Parkinson’s disease).

ACKNOWLEDGMENTS

We are very grateful to Gary Weismer, Erin Wilson, Rita Patel, and two anonymous reviewers for helpful feedback on earlier versions of this manuscript. We would also like to thank Steven Pittelko for his assistance with data analysis, Dave Wilson for programming assistance, Doris Kistler for statistical consulting, and Carl Johnson and John Westbury for generating and providing access to the XRMB database. This research was supported in part by Research Grant No. R03 DC4643-01 from the National Institute on Deafness and Other Communication Disorders.

Footnotes

PACS numbers: 43.70.Bk, 43.70.Aj, 43.70.Jt [AL]

1

Throughout this manuscript, the degree of functional movement independence refers to the degree of movement decoupling that is observed for different tongue regions across a variety of tasks (e.g., different speech sounds and swallowing). The identification of a high degree of functional movement independence among different tongue regions cannot be taken as direct evidence of independence of neural control for these regions because tongue-surface movement patterns during swallowing and speech will be determined by the combined influences of task demands, neural innervation patterns, palatal shape, and biomechanic architecture and tissue linkages.

References

  1. Bernstein N. The Coordination and Regulation of Movement. Oxford: Pergamon; 1967.
  2. Bosma JF, Hepburn LG, Josell SD, Baker K. Ultrasound demonstration of tongue motions during suckle feeding. Dev. Med. Child Neurol. 1990;32:223–229. doi: 10.1111/j.1469-8749.1990.tb16928.x.
  3. Browman C, Goldstein L. Articulatory gestures as phonological units. Phonology. 1989;6:201–251.
  4. Dembowski J, Lindstrom MJ, Westbury JR. Articulator point variability in the production of stop consonants. In: Cannito MP, Yorkston KM, Beukelman DR, editors. Neuromotor Speech Disorders: Nature, Assessment, and Management. Baltimore, MD: Brookes; 1998. pp. 27–46.
  5. DePaul R, Abbs JH. Quantitative morphology and histochemistry of intrinsic lingual muscle fibers in Macaca fascicularis. Acta Anat. (Basel) 1996;155:29–40. doi: 10.1159/000147787.
  6. Gibbon FE. Undifferentiated lingual gestures in children with articulation/phonological disorders. J. Speech Lang. Hear. Res. 1999;42:382–397. doi: 10.1044/jslhr.4202.382.
  7. Goffman L, Malin C. Metrical effects on speech movements in children and adults. J. Speech Lang. Hear. Res. 1999;42:1003–1015. doi: 10.1044/jslhr.4204.1003.
  8. Green JR, Moore CA, Higashikawa M, Steeve RW. The physiologic development of speech motor control: Lip and jaw coordination. J. Speech Lang. Hear. Res. 2000;43:239–255. doi: 10.1044/jslhr.4301.239.
  9. Guenther FH, Espy-Wilson CY, Boyce SE, Matthies ML, Zandipour M, Perkell JS. Articulatory tradeoffs reduce acoustic variability during American English /r/ production. J. Acoust. Soc. Am. 1999;105:2854–2865. doi: 10.1121/1.426900.
  10. Hardcastle WJ. Physiology of Speech Production. London: Academic; 1976.
  11. Hardcastle WJ, Gibbon FE, Jones W. Visual display of tongue-palate contact: Electropalatography in the assessment and remediation of speech disorders. Br. J. Disord. Commun. 1991;26:41–74. doi: 10.3109/13682829109011992.
  12. Hardcastle WJ, Morgan-Barry RA, Clark CJ. An instrumental phonetic study of lingual activity in articulation-disordered children. J. Speech Hear. Res. 1987;30:171–184. doi: 10.1044/jshr.3002.171.
  13. Harshman R, Ladefoged P, Goldstein L. Factor analysis of tongue shapes. J. Acoust. Soc. Am. 1977;62:693–707. doi: 10.1121/1.381581.
  14. Hashi M, Westbury JR, Honda K. Vowel posture normalization. J. Acoust. Soc. Am. 1998;104:2426–2437. doi: 10.1121/1.423750.
  15. Hoole P. On the lingual organization of the German vowel system. J. Acoust. Soc. Am. 1999;106:1020–1032. doi: 10.1121/1.428053.
  16. Kelso JAS, Saltzman EL, Tuller B. The dynamical perspective on speech production: Data and theory. J. Phonetics. 1986;14:29–59.
  17. Kent R, Moll K. Cinefluorographic analyses of selected lingual consonants. J. Speech Hear. Res. 1972;15:453–473. doi: 10.1044/jshr.1503.453.
  18. Kent RD, Netsell R, Bauer LL. Cineradiographic assessment of articulatory mobility in the dysarthrias. J. Speech Hear. Disord. 1975;40:467–480. doi: 10.1044/jshd.4004.467.
  19. Kuehn DP, Moll KL. A cineradiographic study of VC and CV articulatory velocities. J. Phonetics. 1976;4:303–320.
  20. Ladefoged P. A Course in Phonetics. 4th ed. Fort Worth: Harcourt Brace; 2001.
  21. Lofqvist A, Gracco VL. Tongue body kinematics in velar stop production: Influences of consonant voicing and vowel context. Phonetica. 1994;51:52–67. doi: 10.1159/000261958.
  22. MacNeilage P, Sholes G. An electromyographic study of the tongue during vowel production. J. Speech Hear. Res. 1964;7:209–232. doi: 10.1044/jshr.0703.209.
  23. Macpherson JM. How flexible are muscle synergies? In: Humphrey DR, Freund H-J, editors. Motor Control: Concepts and Issues. New York: Wiley; 1991. pp. 33–47.
  24. Maeda S. Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal-tract shapes using an articulatory model. In: Hardcastle WJ, Marchal A, editors. Speech Production and Speech Modeling. Dordrecht: Kluwer; 1990. pp. 131–149.
  25. Martin RE. A Comparison of Lingual Movement in Swallowing and Speech Production. Madison: University of Wisconsin; 1991. Ph.D. dissertation.
  26. Mermelstein P. Articulatory model for the study of speech production. J. Acoust. Soc. Am. 1973;53:1070–1082. doi: 10.1121/1.1913427.
  27. Milenkovic P. Time-frequency analysis for 32-bit Windows (computer program). Madison: University of Wisconsin; 2000.
  28. Nicolosi L, Harryman E, Kresheck J. Terminology of Communication Disorders: Speech-Language-Hearing. 4th ed. Baltimore, MD: Lippincott, Williams, & Wilkins; 1996.
  29. Öhman SEG. Numerical model of coarticulation. J. Acoust. Soc. Am. 1967;41:310–320. doi: 10.1121/1.1910340.
  30. Perkell JS. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, MA: MIT; 1969. Research Monograph No. 53.
  31. Perkell JS. Articulatory processes. In: Hardcastle W, Laver J, editors. The Handbook of Phonetic Sciences. Cambridge, MA: Blackwell; 1997. pp. 333–370.
  32. Perkell JS, Nelson WL. Variability in production of the vowels /i/ and /a/. J. Acoust. Soc. Am. 1985;77:1889–1895. doi: 10.1121/1.391940.
  33. Saltiel P, Wyler-Duda K, D’Avella A, Tresch MC, Bizzi E. Muscle synergies encoded within the spinal cord: Evidence from focal intraspinal NMDA iontophoresis in the frog. J. Neurophysiol. 2001;85:605–619. doi: 10.1152/jn.2001.85.2.605.
  34. Sanguineti V, Laboissière R, Payan Y. A control model of human tongue movements in speech. Biol. Cybern. 1997;77:11–22. doi: 10.1007/s004220050362.
  35. Smith BL, Gartenberg TE. Initial observations concerning development characteristics of labio-mandibular kinematics. J. Acoust. Soc. Am. 1984;75:1599–1605. doi: 10.1121/1.390869.
  36. Smith KK, Kier WM. Trunks, tongues, and tentacles: Moving with skeletons of muscles. Am. Sci. 1989;77:29–35.
  37. Stevens K. On the quantal nature of speech. J. Phonetics. 1989;17:3–46.
  38. Stone M. A three-dimensional model of tongue movement based on ultrasound and x-ray microbeam data. J. Acoust. Soc. Am. 1990;87:2207–2217. doi: 10.1121/1.399188.
  39. Stone M, Lundberg AJ. Three-dimensional tongue surface shapes of English consonants and vowels. J. Acoust. Soc. Am. 1996;99:3728–3737. doi: 10.1121/1.414969.
  40. Stone M, Faber A, Raphael L, Shawker T. Cross-sectional tongue shapes and linguopalatal contact patterns in [s], [sh], and [l] syllables. J. Phonetics. 1992;20:253–270.
  41. Takemoto H. Morphologic analysis of the human tongue musculature for three-dimensional modeling. J. Speech Lang. Hear. Res. 2001;44:95–107. doi: 10.1044/1092-4388(2001/009).
  42. Turvey MT, Shaw RE, Mace W. Issues in theory of action: Degrees of freedom, coordinative structures and coalitions. In: Requin J, editor. Attention and Performance. Hillsdale, NJ: Erlbaum; 1978. pp. 557–595.
  43. Westbury JR. The significance and measurement of head position during speech production experiments using the x-ray microbeam system. J. Acoust. Soc. Am. 1991;89:1782–1791. doi: 10.1121/1.401012.
  44. Westbury JR. X-ray Microbeam Speech Production Database User’s Handbook. Madison, WI: Univ. of Wisconsin; 1994.
  45. Westbury JR, Hashi M, Lindstrom MJ. Differences among speakers in lingual articulation for American English /ɹ/. Speech Commun. 1998;26:203–226.
  46. Westbury JR, Lindstrom MJ, McClean MD. Tongues and lips without jaws: A comparison of methods for decoupling speech movements. J. Speech Lang. Hear. Res. 2002;45:651–662. doi: 10.1044/1092-4388(2002/052).
