Author manuscript; available in PMC: 2010 Feb 24.
Published in final edited form as: J Acoust Soc Am. 2007 Jul;122(1):512–518. doi: 10.1121/1.2735102

Tongue movement kinematics in long and short Japanese consonants

Anders Löfqvist
PMCID: PMC2827771  NIHMSID: NIHMS175252  PMID: 17614508

Abstract

This paper examines tongue movements in stop and fricative consonants where the duration of the oral closure/constriction for the consonant is varied for linguistic purposes. Native speakers of Japanese served as subjects. The linguistic material consisted of Japanese word pairs that only differed in the duration of the lingual consonant, which was either long or short. Recordings were made of tongue movements using a magnetometer system. Results show a robust difference in closure duration between the long and short consonants. Overall, the path of the tongue movement during the consonant was longer for the long than for the short consonant. All speakers decreased the speed of the tongue movement during the long consonant. These adjustments in tongue movements were most likely made to maintain the contact between the tongue and the palate for the closure and constriction.

I. INTRODUCTION

This paper examines tongue movements in stop and fricative productions where the duration of the oral closure/constriction for the consonant is varied for linguistic purposes, using speakers of Japanese. In Japanese, the ratio of closure duration for long and short consonants is about 2:1 (Beckman, 1982; Han, 1994; Hirata and Whiton, 2005). There is an extensive body of acoustic studies of the Japanese sound system with particular emphasis on the role of the mora for speech timing. A mora is traditionally regarded as a unit of timing in Japanese, but its existence and acoustic manifestations are debated, and the reader is referred to Warner and Arai (2001) for a review. The long consonants in Japanese are sometimes referred to as “geminates,” and such a consonant contributes one mora; it is also traditionally assumed that a mora boundary occurs in the long consonant (Vance, 1987). However, the primary focus of this paper is not on mora timing but on speech motor control, capitalizing on the length distinction in Japanese sound structure to study how tongue movements are controlled when the duration of a consonant is changed.

If the duration of the oral closure for the consonant is increased, a speaker is still constrained to maintain the contact between the tongue and the palate to make the closure or constriction for the consonant. To do this, the speaker can in principle use two strategies for controlling the tongue movement, both of which involve modulating the speed of the tongue movement. One strategy would be to momentarily stop the tongue from moving. Alternatively, the speaker could slow down the tongue movement for the long consonant. Of these two possibilities, the second one is the most likely, since a large body of research on tongue movements in speech suggests that the tongue hardly ever stops moving. For example, the material presented by Mooshammer et al. (1995) showed that the tongue body is moving during the closure for a velar stop (see also Houde, 1968; Perkell, 1969). The amount of tongue movement during the closure varied with vowel context. Thus, the tongue body movement during the closure was about 1 mm when the preceding vowel was /i/, and between 4 and 10 mm when the preceding vowel was /u/ or /a/.

Studies of tongue movements during velar stop consonant production have also shown that the tongue tends to move in curved paths (Houde, 1968; Perkell, 1969; Kent and Moll, 1972; Schönle, 1988; Munhall et al., 1991; Mooshammer et al., 1995; Löfqvist and Gracco, 1994; 2002). In the production of velar stops, the tongue movement trajectory into the stop closure usually goes forward, and the location of the point of contact between the tongue body and the palate at stop closure is influenced by the phonetic context of the stop (cf. Dembowski et al., 1998). Less information on movement kinematics is available for stops produced with the tongue tip and tongue blade, /t, d/, although the data published by Kent and Moll (1972) show that the location of the point of contact between the tongue tip and the palate, or alveolar ridge, is influenced by phonetic context. In addition, the pattern of contact between the tongue and the alveolar ridge appears to vary with the shape of the palate (Hiki and Itoh, 1986). Löfqvist and Gracco (2002) recorded tongue movements for stop consonants produced with the tongue tip and the tongue body in different vowel contexts. Their results indicated that both the tongue body and the tongue tip moved during the closure for the consonant. The tongue tip movement was similar to that of the tongue body. In both cases, the movement was upward and forward at consonant onset, and downward and backward at the release. However, both the magnitude and direction of the movement were heavily influenced by the vowel context.

Electropalatographic studies of Italian stops have shown that the amount of tongue-palate contact is larger for geminate than for single stops, and also that there is a general increase in the extent of tongue-palate contact with increasing closure duration (Farnetani, 1990); similar results for American English have been presented by Byrd (1995). An x-ray study of French consonants by Vaxelaire (1995) suggested that the area of tongue-palate contact was larger for the long (abutted) stops than for the short ones.

The purpose of the present study was to extend the findings of Löfqvist and Gracco (2002) to the control of tongue movements in consonants with different durations. As noted above, a speaker is constrained in producing lingual consonants to maintain the contact between the tongue and the palate. Thus, it was hypothesized that the speed of the tongue movement would be slower for a long than for a short consonant. A second hypothesis to be evaluated is that the magnitude of the tongue movement will be larger during a long than during a short consonant. To examine these hypotheses, tongue movements were recorded in native Japanese speakers.

II. METHOD

A. Subjects

Five native speakers of Japanese, two male and three female, served as subjects. They reported no speech, language, or hearing problems. They were naive as to the purpose of the study. Before participating in the recording, they read and signed a consent form. (The experimental protocol was approved by the IRB at the Yale University School of Medicine.)

B. Linguistic material

The linguistic material consisted of Japanese words with an alveolar or velar consonant. These words formed minimal pairs, where the only difference between the pairs was the duration of the consonant. The words are listed in Table I together with English glosses. The linguistic material was organized into randomized lists and presented to the subjects in Japanese writing, with the words occurring in the frame sentence “karewa_to itta” (“He said _”). Fifty repetitions of each word were recorded.

TABLE I.

The linguistic material, text used for presentation, and glosses in English

/hata/ kanji “flag”
/hatta/ kanji&hiragana “(someone) pasted (something)”
/muda/ kanji “useless”
/budda/ katakana “Buddha”
/hosa/ kanji “assistant”
/hossa/ kanji “attack (medical condition)”
/ha∫a/ kanji “winner”
/ha∫∫a/ kanji “departure (of a train or a vehicle)”
/saka/ kanji “slope”
/sakka/ kanji “author, writer”
/tagu/ katakana “price tag”
/taggu/ katakana “tug”

C. Movement recording

The movements of the lips, the tongue, and jaw were recorded using a three-transmitter magnetometer system (Perkell et al., 1992); when proper care is taken during the calibration, the spatial resolution of the system is on the order of 0.5 mm. Receivers were placed on the vermilion border of the upper and lower lip, on three points of the tongue, referred to as tip, blade, and body, and on the lower incisors at the gum line. Two additional receivers placed on the nose and the upper incisors were used for the correction of head movements. The lip and jaw receivers were attached using Isodent, a dental adhesive, while the tongue receivers were attached using Ketac-Bond, another dental adhesive. Care was taken during each receiver placement to ensure that it was positioned at the midline with its long axis perpendicular to the sagittal plane. Two receivers attached to a plate were used to record the occlusal plane by having the subject bite on the plate during the recording. All data were subsequently corrected for head movements and rotated to bring the occlusal plane into coincidence with the x axis. This rotation was performed to obtain a uniform coordinate system for all subjects (cf. Westbury, 1994). For one speaker, S4, the signal from the tongue body receiver was degraded during the recording session, so no tongue body data are reported for this subject; the same thing happened for subject S3, so only 30 tokens could be used for the words with a velar consonant.
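The rotation into the occlusal-plane coordinate system can be illustrated with a minimal sketch. It assumes midsagittal (x, y) positions that have already been corrected for head movement, and it places the origin at one of the two bite-plate receivers; the text does not specify the origin, so that choice, like the names `plate_front` and `plate_back`, is hypothetical and is not taken from the original processing software.

```python
import math


def rotate_to_occlusal_plane(points, plate_front, plate_back):
    """Rotate (x, y) points so the line through the two bite-plate
    receivers coincides with the x axis.

    The origin is placed at plate_back; this is an assumption made
    for illustration only.
    """
    # Angle of the occlusal plane relative to the original x axis.
    angle = math.atan2(plate_front[1] - plate_back[1],
                       plate_front[0] - plate_back[0])
    # Rotating by -angle brings the occlusal plane onto the x axis.
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    rotated = []
    for x, y in points:
        dx, dy = x - plate_back[0], y - plate_back[1]
        rotated.append((dx * cos_a - dy * sin_a,
                        dx * sin_a + dy * cos_a))
    return rotated
```

After the rotation, the front plate receiver itself has a vertical coordinate of zero, which gives a quick check that the transform was applied in the intended direction.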

The articulatory movement signals (induced voltages from the receiver coils) were sampled at 500 Hz after low-pass filtering at 200 Hz. The resolution for all signals was 12 bits. After voltage-to-distance conversion, the movement signals were low-pass filtered using a 25-point triangular window with a 3-dB cutoff at 14 Hz; this was done forwards and backwards to maintain phase. To obtain instantaneous velocity of the tongue receivers, the first derivative of the position signals was calculated using a 3-point central difference algorithm. For each tongue receiver, its speed, $v=\sqrt{\dot{x}^{2}+\dot{y}^{2}}$, was also calculated. The velocity signals were smoothed using the same triangular window. All the signal processing was performed using the Haskins Analysis Display and Experiment System (Rubin and Löfqvist, 1996). The acoustic signal was pre-emphasized, low-pass filtered at 4.5 kHz, and sampled at 10 kHz.
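The smoothing, differentiation, and speed steps described above can be sketched as follows. This is an illustration only, not the HADES implementation: the edge handling (holding the first sample, one-sided differences at the two endpoints) and all function names are assumptions, and the sketch makes no claim to reproduce the exact 14 Hz cutoff of the original filter.

```python
import math

SAMPLE_RATE_HZ = 500.0  # movement sampling rate stated in the text


def triangular_weights(n_points=25):
    """Normalized weights of an n-point triangular window."""
    raw = [min(i + 1, n_points - i) for i in range(n_points)]
    total = sum(raw)
    return [w / total for w in raw]


def causal_fir(signal, weights):
    """One causal filtering pass; the first sample is held at the edge."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * signal[max(n - k, 0)]
        out.append(acc)
    return out


def smooth_zero_phase(signal, weights):
    """Filter forwards, then backwards, so the phase shifts cancel."""
    forward = causal_fir(signal, weights)
    backward = causal_fir(forward[::-1], weights)
    return backward[::-1]


def central_difference(signal, sample_rate_hz=SAMPLE_RATE_HZ):
    """3-point central difference; one-sided at the two endpoints."""
    dt = 1.0 / sample_rate_hz
    n = len(signal)
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((signal[hi] - signal[lo]) / ((hi - lo) * dt))
    return deriv


def speed(x, y, sample_rate_hz=SAMPLE_RATE_HZ):
    """Tangential speed from horizontal and vertical position signals."""
    vx = central_difference(x, sample_rate_hz)
    vy = central_difference(y, sample_rate_hz)
    return [math.hypot(a, b) for a, b in zip(vx, vy)]
```

With positions in millimeters, the returned speed is in mm/s. Because the window is normalized, a constant signal passes through `smooth_zero_phase` unchanged.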

The onset and release of the oral closure/constriction for the consonant were identified in waveform and spectrogram displays of the acoustic signal. They were both identified by a change in the amplitude and the spectral properties, cf. Fig. 1.

FIG. 1.

Waveforms and spectrogram of the utterance “karewa hata to itta” produced by Speaker 1. The top panel shows the whole utterance, and the bottom panel only the word “hata” with the vertical lines showing the onset and release of the voiceless consonant /t/ in “hata.”

The magnitude of the tongue movement trajectory during the consonant was obtained by summing the Euclidean distances between successive samples of the tongue tip and tongue body vertical and horizontal receiver positions from the acoustically defined consonant onset to offset. The average speed of the tongue tip and tongue body during the consonant was obtained by adding the speed of all the individual samples between consonant onset and offset and then dividing by the number of samples in the interval. The kinematic signals are expressed in a maxilla-based coordinate system. Thus, the tongue movement includes the contribution of the jaw, which is appropriate when we are interested in the tongue as the end effector.
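The two measures defined above reduce to a few lines of code. The sketch below assumes that the acoustically defined onset and offset have already been converted to sample indices in the movement signal and that both indices are inclusive; the function names and the inclusive convention are illustrative assumptions.

```python
import math


def path_length(x, y, onset, offset):
    """Sum of Euclidean distances between successive samples
    from index onset to index offset (both inclusive)."""
    return sum(math.hypot(x[i + 1] - x[i], y[i + 1] - y[i])
               for i in range(onset, offset))


def mean_speed(speed_samples, onset, offset):
    """Average of the per-sample speed values between onset and offset
    (both inclusive)."""
    window = speed_samples[onset:offset + 1]
    return sum(window) / len(window)
```

Note that the path length accumulates every small displacement, so it can exceed the straight-line distance between the tongue positions at onset and offset whenever the trajectory is curved.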

T-tests were used to assess differences between the long and short consonants for each subject. Given the large number of comparisons, a conservative α-level of 0.001 was adopted. Since the variances usually differed between the long and short consonants, as shown by Levene’s test, the statistical tests assumed unequal variances, so the degrees of freedom were adjusted (e.g., Winer et al., 1991, p. 67).
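A t-test that does not assume equal variances can be sketched as below. The sketch uses the Welch statistic with the Welch-Satterthwaite approximation for the degrees of freedom, which is one common form of the adjustment; the text does not state which formula was applied, so this choice is an assumption. The function returns only the t value and the adjusted degrees of freedom; obtaining a p-value would additionally require the t distribution, which is not included here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Unequal-variance t statistic and Welch-Satterthwaite
    degrees of freedom for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

The adjusted degrees of freedom are never larger than na + nb - 2, so the test becomes more conservative as the two variances diverge.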

III. RESULTS

A. Consonant duration

The duration of the oral closure/constriction for the long and short consonants is summarized in Fig. 2. For all speakers, there is a clear and robust difference between the long and short consonants, with the long ones having twice, or more, the duration of the short ones. The results of the t-tests are summarized in Table II. Comparing the durations of the different consonant categories, the voiceless alveolar stop /t/ was always longer than its voiced cognate /d/. Also, the stops were usually shorter than the fricatives.

FIG. 2.

Closure duration of the consonants (mean and standard deviation). (a) alveolar stops; (b) alveolar fricatives; and (c) velar stops.

TABLE II.

Results of the t-tests for the duration of the oral closure/constriction

S1 S2 S3 S4 S5
/t, tt/ 41.30 29.8 42.92 37.54 28.92
/d, dd/ 40.56 36.20 55.93 43.65 42.69
/s, ss/ 37.55 39.80 29.51 24.20 30.39
/∫, ∫∫/ 22.11 29.85 29.18 29.28 23.94
/k, kk/ 35.65 40.87 43.06 36.21
/g, gg/ 38.91 28.67 38.90 25.48

B. Tongue movements

Overall, the tongue movement patterns were similar to those reported previously (e.g., Houde, 1968; Mooshammer et al., 1995; Löfqvist and Gracco, 2002). That is, both the tongue tip and the tongue body were moving up and forward towards the consonant constriction and down and backward after the consonant. There was some variability between and within speakers, but it is beyond the scope of the present paper to present a detailed analysis of these movement patterns.

The first analysis focused on the magnitude of the tongue movement during the consonant, while the second one examined the average speed of the tongue movement during the consonant.

1. Magnitude of tongue movement path

Figure 3 shows the magnitude of the tongue movement during the consonant. The path was generally longer for the long than for the short consonants, irrespective of place and manner of articulation. The only exceptions were the voiceless fricative /∫/ and the voiced velar stop /g/ for subject S5, where the path was longer for the short consonant. The t-tests, summarized in Table III, showed that there was no significant difference between the length of the paths for the long and short consonants in the following cases: /t, tt/ for subjects S4 and S5; /s, ss/ for subjects S1, S2, and S4; /d, dd/ and /∫, ∫∫/ for subjects S2 and S5. Given that the path was longer for the long than for the short consonants, correlations were made between the closure duration and the tongue movement path during the closure. All the correlations were positive, in particular for the short consonants. The patterns for the different consonant categories mirrored those for closure duration mentioned above. That is, the consonants with a longer duration also had a larger tongue movement.
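The within-category relationship between closure duration and movement path can be examined with a product-moment correlation, sketched below. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

In use, `xs` would hold the closure durations of the tokens of one consonant for one speaker and `ys` the corresponding path lengths, with long and short consonants analyzed separately.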

FIG. 3.

Path of the tongue movement during the consonant (mean and standard deviation). (a) alveolar stops; (b) alveolar fricatives; and (c) velar stops. For the velar stops, the tongue body movement was measured, while for the other consonants, the tongue tip was used.

TABLE III.

Results of the t-tests for the path of the tongue during the oral closure/constriction

S1 S2 S3 S4 S5
/t, tt/ 6.85 6.35 15.16 2.19 2.42
/d, dd/ 6.75 2.54 22.07 9.00 8.46
/s, ss/ 3.03 3.33 10.90 0.57 5.15
/∫, ∫∫/ 12.44 2.72 8.14 3.95 1.09
/k, kk/ 15.0 15.57 17.9 8.25
/g, gg/ 15.71 9.92 12.64 3.53

2. Average speed of tongue movement

Figure 4 shows the average speed of the tongue movement during the consonant. For all subjects and consonants, the average speed was lower for the long than for the short consonant. Table IV provides a summary of the t-tests, showing statistically significant differences for all subjects and consonants. There was no clear pattern for the different consonant categories.

FIG. 4.

Average speed of the tongue movement during the consonant (mean and standard deviation). (a) alveolar stops; (b) alveolar fricatives; and (c) velar stops. For the velar stops, the tongue body movement was measured, while for the other consonants, the tongue tip was used.

TABLE IV.

Results of the t-tests for the average speed of the tongue during the oral closure/constriction

S1 S2 S3 S4 S5
/t, tt/ 22.24 18.00 17.89 15.42 23.24
/d, dd/ 26.68 23.81 17.45 9.53 17.09
/s, ss/ 19.19 11.69 12.28 9.04 13.32
/∫, ∫∫/ 7.19 10.57 11.02 7.51 18.04
/k, kk/ 10.31 4.66 4.72 31.7
/g, gg/ 9.14 5.99 5.02 24.31

IV. DISCUSSION

This study examined tongue movement kinematics in Japanese long and short consonants. The hypothesis under investigation was based on the idea that speakers are constrained in the production of lingual consonants to maintain the contact between the tongue and the palate. Earlier work has shown that the tongue is moving during these consonants, although the influence of consonant duration on tongue movements has not been explored in any detail. Based on these earlier findings it was hypothesized that a speaker could either stop the tongue movement momentarily during a long consonant or slow down the movement speed. The results show that in all cases the speakers modulated the speed of the tongue, so that it was lower in the long than in the short consonants. There was no instance for any speaker where the tongue movement stopped completely.

If there is a difference in the average speed of the tongue movement between the long and short consonants, one might also predict that speakers modify the speed of the tongue movement within a category, long or short, as a function of consonant length. In particular, one would expect a negative relationship between consonant duration and average tongue movement speed. An examination of this relationship showed correlations to be mostly negative for the long consonants, but the correlations were not very strong. One reason that such a negative relationship was not found for the short consonants might simply be that their durations spanned a smaller range than those of the long consonants. Overall, the long consonants were produced with a longer tongue movement trajectory during the closure/constriction. Correlations were made between the closure duration and the tongue movement path during the closure to examine this relationship within the long and short consonants separately. All the correlations were positive, in particular for the short consonants.

Earlier work has shown a strong correlation between movement displacement and peak velocity in both speech and nonspeech movements (e.g., Cooke, 1980; Ostry et al., 1983; Kelso et al., 1985; Vatikiotis-Bateson and Kelso, 1993; Hertrich and Ackermann, 1997; Löfqvist and Gracco, 1997). The present study instead measured the average speed of the tongue movement during the acoustically defined closure/constriction for the consonant. Even so, there was a very strong correlation between the magnitude of the tongue movement path and the average speed during the consonant, with almost all of the correlations higher than 0.7.

The present results show that speakers modify the speed of tongue movements in consonant production when the duration of the consonant is varied for linguistic purposes. They thus provide additional evidence for active, task-dependent control of tongue movement speed in speech, since the same was found for the movement from the first to the second vowel in a vowel-bilabial consonant-vowel sequence in Japanese, when the duration of the labial consonant was either long or short (Löfqvist, 2006). In this case, the speed of the tongue movement was lower during a long than during a short consonant. All speakers consistently showed this pattern, thus maintaining a similar, but not identical, pattern of coordination between the lip and tongue movements during long and short consonants.

These studies of the relationship between consonant duration and movement kinematics have used native adult speakers of Japanese. An interesting question is the emergence of the length contrast and its associated movement kinematics during speech development. Although there are methodological issues in the recording of tongue movements in children, lip and jaw movements can be readily analyzed (e.g., Green et al., 2000; 2002). Another issue is how a learner of Japanese as a second language masters the length contrast and if the kinematic differences observed here in native speakers between long and short consonants are also seen in second-language learners, in particular if there is a gradual change over time towards the kinematic patterns found here.

ACKNOWLEDGMENTS

I am grateful to Mariko Yanagawa for help with the Japanese material and running the experiments. This work was supported by Grant No. DC-00865 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health.

References

  1. Beckman M. Segment duration and the mora in Japanese. Phonetica. 1982;39:113–135.
  2. Byrd D. Articulatory characteristics of single and blended lingual gestures. In: Elenius K, Branderud P, editors. Proceedings of the XIIIth International Congress of Phonetic Sciences; Stockholm, Royal Institute of Technology and Stockholm University. 1995. pp. 438–441.
  3. Cooke J. The organization of simple, skilled movements. In: Stelmach G, Requin J, editors. Tutorials in Motor Behavior. North-Holland; Amsterdam: 1980. pp. 199–212.
  4. Dembowski J, Lindstrom MJ, Westbury JR. Articulator point variability in the production of stop consonants. In: Cannito MP, Yorkston KM, Beukelman DR, editors. Neuromotor Speech Disorders: Nature, Assessment, and Management. Brookes; Baltimore: 1998. pp. 27–46.
  5. Farnetani E. V-C-V lingual coarticulation and its spatiotemporal domain. In: Hardcastle W, Marchal A, editors. Speech Production and Speech Modelling. Kluwer; Dordrecht: 1990. pp. 93–130.
  6. Green J, Moore C, Reilly K. The sequential development of jaw and lip control for speech. J. Speech Lang. Hear. Res. 2002;45:66–79. doi: 10.1044/1092-4388(2002/005).
  7. Green J, Moore C, Higashikawa M, Steeve R. The physiologic development of speech motor control: Lip and jaw coordination. J. Speech Lang. Hear. Res. 2000;43:239–255. doi: 10.1044/jslhr.4301.239.
  8. Han M. Acoustic manifestations of mora timing in Japanese. J. Acoust. Soc. Am. 1994;96:73–82.
  9. Hertrich I, Ackermann H. Articulatory control of phonological vowel length contrasts: Kinematic analysis of labial gestures. J. Acoust. Soc. Am. 1997;102:523–536. doi: 10.1121/1.419725.
  10. Hiki S, Itoh H. Influence of palate shape on lingual articulation. Speech Commun. 1986;5:141–158.
  11. Hirata Y, Whiton J. Effects of speaking rate on the single/geminate stop distinction in Japanese. J. Acoust. Soc. Am. 2005;118:1647–1660. doi: 10.1121/1.2000807.
  12. Houde R. A Study of Tongue Body Motion during Selected Consonant Sounds. Speech Communications Research Laboratory; Santa Barbara: 1968. SCRL Monograph 2.
  13. Kelso JAS, Vatikiotis-Bateson E, Saltzman E, Kay B. A qualitative dynamic analysis of reiterant speech production: Phase portraits, kinematics, and dynamic modeling. J. Acoust. Soc. Am. 1985;77:266–280. doi: 10.1121/1.392268.
  14. Kent R, Moll K. Cinefluorographic analyses of selected lingual consonants. J. Speech Hear. Res. 1972;15:453–473. doi: 10.1044/jshr.1503.453.
  15. Löfqvist A. Interarticulator programming: Effect of closure duration on lip and tongue coordination in Japanese. J. Acoust. Soc. Am. 2006;120:2872–2883. doi: 10.1121/1.2345832.
  16. Löfqvist A, Gracco VL. Tongue body kinematics in velar stop production: Influences of consonant voicing and vowel context. Phonetica. 1994;51:52–67. doi: 10.1159/000261958.
  17. Löfqvist A, Gracco V. Lip and jaw kinematics in bilabial stop consonant production. J. Speech Lang. Hear. Res. 1997;40:877–893. doi: 10.1044/jslhr.4004.877.
  18. Löfqvist A, Gracco V. Control of oral closure in lingual stop consonant production. J. Acoust. Soc. Am. 2002;105:1864–1876. doi: 10.1121/1.1473636.
  19. Mooshammer C, Hoole P, Kühnert B. On loops. J. Phonetics. 1995;23:3–21.
  20. Munhall K, Ostry D, Flanagan R. Coordinate spaces in speech planning. J. Phonetics. 1991;19:293–307.
  21. Ostry D, Keller E, Parush A. Similarities in the control of speech articulators and the limbs: Kinematics of tongue dorsum movements during speech. J. Exp. Psychol. Hum. Percept. Perform. 1983;9:622–636. doi: 10.1037//0096-1523.9.4.622.
  22. Perkell JS. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. MIT Press; Cambridge, MA: 1969.
  23. Perkell J, Cohen M, Svirsky M, Matthies M, Garabieta I, Jackson M. Electromagnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements. J. Acoust. Soc. Am. 1992;92:3078–3096. doi: 10.1121/1.404204.
  24. Rubin PER, Löfqvist A. HADES (Haskins Analysis Display and Experiment System) Haskins Labs. Status Rep. Speech Res. 1996 (available at: http://www.haskins.yale.edu/MISC/DOCS/HADESRLn2.pdf). Viewed 5/11/07.
  25. Schönle P. Elektromagnetische Artikulographie. Electromagnet Articulography, Springer; Berlin: 1988.
  26. Vance T. An Introduction to Japanese Phonology. State University of New York Press; Albany, NY: 1987.
  27. Vatikiotis-Bateson E, Kelso JAS. Rhythm type and articulatory dynamics in English, French, and Japanese. J. Phonetics. 1993;21:231–265.
  28. Vaxelaire B. Single vs. double (abutted) consonants across speech rate: X-ray and acoustic data for French; Proceedings XIII Int. Conf. Phonetic Sci.; Stockholm, Royal Institute of Technology and Stockholm University. 1995. pp. 384–387.
  29. Warner N, Arai A. Japanese mora-timing: A review. Phonetica. 2001;58:1–25. doi: 10.1159/000028486.
  30. Westbury J. On coordinate systems and the representation of articulatory movements. J. Acoust. Soc. Am. 1994;95:2271–2273. doi: 10.1121/1.408638.
  31. Winer B, Brown R, Michels K. Statistical Principles in Experimental Design. McGraw-Hill; Boston: 1991.