Abstract
In the process of phonological development, fricatives are generally assumed to be later acquired than stops. However, most of the observational work on which this claim is based has concerned itself with word-initial onset consonants; little is known about how and when fricatives are mastered in word-final coda position (e.g., nose). This is all the more critical in a language like English, where word-final fricatives often carry important morphological information (e.g., toes, goes). This study examines the development of duration cues to the voicing feature contrast in coda fricatives, using longitudinal spontaneous speech data from CVC words (e.g., noise vs face) produced by three children (1;6–2;6 years) and six mothers. Results show that the children were remarkably adult-like in the use of duration cues to voicing contrasts in fricatives even in this early age range. Furthermore the children, like the mothers, had longer frication noise durations for morphemic compared to non-morphemic fricatives (e.g., toes vs nose) when these segments occurred in utterance-final position. These results suggest that although children's fricatives tend to be overall longer and more voiced compared to those of adults, the voicing and morphological contrasts for fricative codas are acquired early in production.
INTRODUCTION
Early speech production is affected by various factors such as motor, perceptual, and lexical skills. One of the major challenges in the study of speech development has been to identify the factors that underlie when and how children learn to produce cues to individual speech sounds, and use them contrastively. This issue is crucial to understanding the mechanisms underlying the development of speech production and speech planning in children.
The present study focuses on the factors affecting children's production of fricative consonants in coda position. Fricatives are generally reported to be acquired late. For example, on the basis of phonetic transcription data, Smit et al. (1990) showed that alveolar stops were produced correctly by more than 90% of children by age 3, whereas alveolar fricatives were not fully mastered until age 7. Further evidence for the late acquisition of fricatives comes from acoustic studies showing that spectral cues to fricative place of articulation contrasts were not adult-like for children below the age of 5 (Nittrouer et al., 1989; Nissen and Fox, 2005). It has been suggested that fricatives are late-acquired because they require development of particularly fine-grained motor control of the tongue to keep the constriction size just right for generating turbulence noise (Kent, 1992).
A better understanding of the acoustic properties of child fricative productions could help inform the mechanisms underlying fricative development. However, to date there have been only a few studies examining how children produce the individual acoustic cues that characterize fricative consonants in adult speech, and how these distinguish different classes of sounds, such as voiceless /s/ vs voiced /z/. For example, while there is considerable work on the development of spectral cues to place of articulation contrasts in fricative onsets (Nittrouer et al., 1989; Nissen and Fox, 2005; Li et al., 2009), very little is known about the development of durational cues to voicing contrasts in fricatives in coda position. Yet the investigation of coda fricatives is important, particularly for children learning English, where many inflectional morphemes are fricative codas (e.g., books, runs). Thus, examining how coda fricatives are acoustically realized in children's speech can provide much-needed insight into the development of these grammatical morphemes in young children.
The primary goal of the present work was therefore to explore how the contrastive phonological and morphological characteristics of a fricative coda affect its durational realization in child speech, and how this changes over time. To this end, we examined the durational cues to voicing contrasts in the apical fricative coda /s/ vs /z/, and how the morphological status of this coda affected frication duration. Specifically, we compared the realization of fricatives in 2-year-olds' and mothers' spontaneous speech, focusing on similarities and differences between the two populations with respect to several different fricative-associated duration cues. This study therefore expands the set of measures and contexts for which we have detailed acoustic information about children's production of fricatives, contributing to a better understanding of the mechanisms underlying fricative development. Such detailed acoustic measures might also help us understand the non-adult-like properties of child fricative productions that have led adult transcribers to characterize them as late acquired.
Previous literature on adult speech production has demonstrated the robust effect of voicing contrasts on the duration of frication noise for English fricative codas, with a longer period of frication for voiceless than for voiced fricatives. For example, Crystal and House (1988) found that the mean duration of frication noise for voiceless fricatives was 47 ms longer than that of voiced fricatives (97 vs 50 ms) when averaged across all words in their corpus. Similarly, Stevens et al. (1992) reported that the duration of frication noise for the voiceless fricative /s/ was about 30 ms longer than that of the voiced fricative /z/ (108 vs 78 ms) in intervocalic position. The duration of frication in adult speech is also known to be affected by the prosodic context of the word; like other final segments (Klatt 1975, 1976; Turk and Shattuck-Hufnagel, 2007), phrase-final fricative codas typically have longer duration noise than phrase-medial fricative codas by as much as 40–100 ms (Klatt, 1976).
In addition to the effects of coda voicing on the duration of the frication noise itself, coda voicing also affects the duration of the preceding vowel. In adult English, vowels are reported to be as much as 100 ms longer before voiced than before voiceless obstruents (Peterson and Lehiste, 1960; House, 1961). Again, the voicing effect on the duration of the preceding vowel is also known to be affected by prosodic context, with longer vowel durations and greater contrasts between voiced and voiceless coda contexts in phrase-final environments (Klatt, 1976; Crystal and House, 1988).
Some researchers have argued that the ratio of the duration of the vowel to the duration of the entire vowel-plus-consonant sequence (i.e., the V/VC ratio within the syllable rhyme) is a more appropriate acoustic correlate of coda voicing than the absolute duration of the preceding vowel alone (Barry, 1979). The prediction associated with the V/VC ratio is that this ratio would be greater when the coda is voiced than when it is voiceless, because the duration of the vowel before the voiced coda is longer and the voiced coda itself is shorter. Consistent with the prediction, Myers (2012) showed that this ratio was greater for voiced obstruent codas than for voiceless ones. Interestingly, there was no interaction between voicing and utterance position in his study; the effect of coda voicing on the V/VC ratio held constant across both utterance-medial and utterance-final positions. This contrasts with the duration of the preceding vowel alone, which is known to be affected by coda voicing, particularly in utterance-final position (Crystal and House, 1988). Myers also showed that the V/VC ratio was overall lower in utterance-final position compared to utterance-medial position, suggesting a devoicing effect in utterance-final position for both voiced and voiceless fricative codas.
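The V/VC ratio can be made concrete with a short sketch. The function name and the durations in the usage lines are illustrative assumptions chosen only to show the predicted direction of the effect; they are not values from Barry (1979) or Myers (2012).

```python
def v_vc_ratio(vowel_ms: float, coda_ms: float) -> float:
    """Return vowel duration as a proportion of the vowel-plus-coda (rhyme) duration."""
    total = vowel_ms + coda_ms
    if total <= 0:
        raise ValueError("rhyme duration must be positive")
    return vowel_ms / total


# Hypothetical durations in ms (assumed for illustration, not measured data).
voiced_ratio = v_vc_ratio(vowel_ms=300, coda_ms=100)     # longer vowel, shorter coda
voiceless_ratio = v_vc_ratio(vowel_ms=200, coda_ms=200)  # shorter vowel, longer coda
print(voiced_ratio, voiceless_ratio)
```

Because the ratio normalizes by the duration of the whole rhyme, it is less sensitive than absolute vowel duration to overall lengthening that stretches vowel and coda together.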
Partial devoicing of voiced fricative /z/ seems to be virtually universal in utterance-final or pre-pausal position in English (Haggard, 1978; Smith, 1997), and has also been observed cross-linguistically (Smith, 2003). Smith (1997) used acoustic, airflow, and electroglottographic data to examine the devoicing of /z/ in American English. In her study, each token ending with /z/ was assigned to one of three categories according to the percentage of frication duration during which there was voicing: 0%–25% = devoiced, 25%–90% = partially voiced, 90%–100% = voiced. The results showed that devoicing (0%–25%) occurred most often at the end of a higher-level syntactic domain, and occurred less often with increasingly smaller domains (utterance-final > word-final > syllable-final). Some researchers have attributed devoicing in utterance-final position to physiological factors. Vocal folds are typically spread apart when speech is not being produced, allowing air to pass through freely. Thus, devoicing of voiced sounds in pre-pausal position may naturally occur in anticipation of the following pause, which is a “voiceless” state (Ingram, 1989). In addition to this assimilation effect, voicing in utterance-final position may also be made less likely by a decrease in subglottal pressure toward the end of the utterance, where the pressure drop across the vocal folds may be too low to maintain vibration (Westbury and Keating, 1986).
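A three-way categorization of this kind can be sketched as follows. The function name is hypothetical. The default cut points (25% and 90%) follow the upper ends of the first two ranges, and assigning a boundary value to the higher category is an assumption, since the listed ranges do not say which category a boundary value belongs to.

```python
def voicing_category(voiced_ms: float, frication_ms: float,
                     low: float = 25.0, high: float = 90.0) -> str:
    """Classify a /z/ token by the percentage of its frication that is voiced."""
    if frication_ms <= 0 or not 0 <= voiced_ms <= frication_ms:
        raise ValueError("need frication_ms > 0 and 0 <= voiced_ms <= frication_ms")
    percent = 100.0 * voiced_ms / frication_ms
    if percent < low:        # assumption: exactly `low` counts as partially voiced
        return "devoiced"
    if percent < high:       # assumption: exactly `high` counts as voiced
        return "partially voiced"
    return "voiced"


# Hypothetical tokens, each with 100 ms of frication (illustrative values only).
labels = [voicing_category(v, 100) for v in (10, 50, 95)]
print(labels)
```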
In contrast to the large literature on adult speech, very little is known about the duration of vocal fold vibration or other durational characteristics associated with the voicing contrast in fricative codas in children's speech. A notable exception is a study by Naeser (1970) of the duration of the preceding vowel. Naeser showed that English-speaking children as young as 1;9 exhibited appropriate vowel duration differences depending on coda voicing; that is, vowels before voiceless obstruents (stops and fricatives) were approximately 50%–60% of the duration of vowels before voiced obstruents, which corresponded to the ratio reported for adults. Studies of voiced vs voiceless stop codas also show that children produce the vowel duration difference associated with coda voicing early, before 2–3 years of age (Buder and Stoel-Gammon, 2002; Ko, 2007; Krause, 1982; Song et al., 2012).
Children also show the adult pattern of utterance-final phonetic devoicing early. For example, in a transcription study looking at the characteristics of babbling in infants aged 0;4–1;1, Oller et al. (1976) showed that the majority of the utterance-final obstruents were perceived as voiceless by adult transcribers. At the same time, a number of transcription and acoustic studies have revealed that, compared with adults, word-final voiced obstruents (e.g., hose) are more frequently devoiced in children's speech, especially before 4 years of age (Naeser, 1970; Smith, 1979; Velten, 1943).
Research on adult speech also raises the possibility that the production of a coda is affected by its morphological status. For example, Walsh and Parker (1983) showed that the duration of English plural –s is longer in coda clusters where it corresponds to a morpheme (e.g., hearts) compared to its non-morphemic counterpart (e.g., Rex). In their study the average difference in duration between plural –s and monomorphemic –s was small (only 9 ms), and there was no statistical analysis. However, further supporting evidence comes from a study by Losiewicz (1992) showing that the past tense morpheme /d/ or /t/ (e.g., rapped) was longer than non-morphemic /d/ or /t/ (e.g., rapt). Furthermore, Cho (2002) provided articulatory evidence for an effect of morpheme boundaries on intergestural timing in Korean. His electromagnetic midsagittal articulography and electropalatography data showed that gestures were coordinated more stably inside a monomorphemic word than across a morphemic boundary, although they were homophonous on the surface. These findings suggest that the speech planning process for adults is sensitive to morphological content.
Many grammatical morphemes are reported to be acquired late in English speech (Brown, 1973), and this is exacerbated in children with language impairment (cf. Leonard, 1998). Children's variable omission of grammatical morphemes before more stable use has traditionally been attributed to incomplete or still-developing syntactic and semantic representations (e.g., Wexler, 1994). However, more recent studies have shown that children's production of grammatical morphemes is influenced by the phonological shape of syllables and words in which they occur and by the prosodic contexts in which they appear (Gerken, 1996; Marshall and van der Lely, 2007; Song et al., 2009; Theodore et al., 2011). For example, before children can reliably produce inflectional morphemes such as plural –s in a word like cats, they must have the ability to produce the word-final cluster /ts/ in monomorphemic words (Bernhardt and Stemberger, 1998). Furthermore, studies on languages other than English have demonstrated that the acquisition of fricative codas is affected by multiple factors, such as within-word position, stress, morphological status, and segmental composition (e.g., Freitas et al., 2001; Lleó, 2006; Prieto and Bosch-Baliarda, 2006). However, most of these studies of the influence of phonology on the production of grammatical morphemes have relied on phonetic transcription of children's speech, and there is limited information about the acoustic realization of children's early grammatical morphemes (although, cf. Theodore et al., 2011).
In order to examine how the acoustic cues to the fricative codas vary as a function of contrastive voicing (study 1) and morphological status (study 2), we conducted systematic acoustic analyses of longitudinal spontaneous speech productions from three 2-year-olds and six mothers speaking American English. In study 1, in addition to the duration of the fricative noise itself, our coding measures included the duration of the preceding vowel and the percentage of the fricative overlapped with vocal fold vibration. This provided an objective estimate of devoicing for voiced fricatives in children's speech. Our study is unique in examining longitudinal, spontaneous data from children as young as 1;6, a challenging age at which to collect speech production data. In contrast to previous studies that have focused on spectral correlates of place contrasts in onset fricatives, it also explores the durational characteristics associated with cues to voicing and morphology in fricative codas.
Based on the literature on adult speech, it was hypothesized that duration cues to fricative codas in children would vary systematically with the voicing and morphemic status of this segment. Specifically, we predicted longer duration for frication noise in voiceless fricatives than in voiced fricatives, longer vowel duration before voiced fricatives than before voiceless fricatives, and a greater degree of devoicing of voiced fricatives in utterance-final position than in utterance-medial position, as measured by the ratio of voiced frication to total frication duration. In addition, given the previous phonological studies showing that children devoice fricatives more frequently than adults at the ends of words and utterances, we anticipated that we might find delayed acquisition of adult-like durations of the cue to fricative voicing contrasts, as well as larger effects of devoicing in children compared to adults.
We also hypothesized that children would show morphological effects early on, with longer frication duration for morphemic /z/ (e.g., toes) than non-morphemic /z/ (e.g., nose). Alternatively, children might not show any morphological effects, since the inflected lexical items they used in spontaneous speech were high frequency, and might thus represent rote-learned, lexicalized (rather than actively inflected) word forms.
Finally, we expected to find overall longer and more variable segmental durations in 2-year-olds' speech. Studies on the temporal aspects of speech development suggest that young children produce speech segments with overall longer durations than those of adults (while maintaining the proportional relationships) and that the durations are overall more variable than those of adults (Smith, 1978). For example, in a study examining the developmental changes of temporal and spectral parameters in children between the ages of 5 and 17 years, Lee et al. (1999) showed that both the magnitude and variability of vowel and word-initial /s/ durations decrease significantly between the ages of 9 and 12, approximating adult levels at around 12 years. It was therefore expected that in our study of much younger children, vowel and fricative durations would be overall longer and more variable than those of adults. This would be consistent with previous findings for the durational characteristics of stop codas in the same children's speech (Song et al., 2012).
STUDY 1: VOICING EFFECTS ON THE DURATION OF WORD-FINAL /s/ AND /z/ FRICATIVES IN MONOMORPHEMIC WORDS
Method
Subjects and database
The data examined in this study came from the Providence Corpus (Demuth et al., 2006) and included spontaneous, longitudinal speech data collected from six mother-child pairs.1 All six children (three boys, three girls) were typically developing, monolingual speakers of American English. In the present study, data from only three of the children (one boy, two girls) were used because the other three children did not produce enough /s, z/-final target words to analyze. Data from all six mothers were used. Two of the mothers spoke the dialect typical of Southern New England, which is often characterized by the omission of postvocalic /ɹ/.
Digital audio/video recordings were collected in the children's homes for approximately 1 h every 2 weeks for 2 years. Recording started between the ages of 0;11–1;4, depending on when each child started talking. During the recording sessions both mother and child wore a wireless Azden WLT/PRO VHF lavalier microphone pinned to their collar as they engaged in everyday activities. The recordings were made using a Panasonic PV-DV601D-K mini digital video recorder. The audio from the video was later extracted and digitized at a sampling rate of 44.1 kHz. Both the mothers' and children's speech were orthographically transcribed using Codes for the Human Analysis of Transcripts (CHAT) conventions (MacWhinney, 2000). The children's utterances were also transcribed by trained coders using International Phonetic Alphabet (IPA) transcription, to capture the phonetic representations of words. Overall reliability of IPA-transcribed segments ranged from 80%–97% across files in terms of presence/absence of segments and place/manner of articulation. Voicing is difficult to reliably transcribe phonemically in young children's speech (Stoel-Gammon and Buder, 1999), and was therefore not assessed in these reliability measures.
Data
As the first step in determining the target words, we examined the occurrence frequencies of all monomorphemic CVC (consonant-vowel-consonant) words ending in voiced /z/ and voiceless /s/ codas in the three children's speech. To minimize random variability in speech production, we limited our investigation to two types of open class words, verbs and nouns, excluding adverbs (e.g., less), adjectives (e.g., nice), and function words (e.g., this, was). Words starting with glides or liquids (e.g., rose) were also excluded due to the difficulty of identifying the beginning of the vowel in such contexts, a critical issue for our vowel duration measures. This gave us 12 usable CVC words ending in /z/, with word frequencies ranging from 193 (cheese) to 1 (e.g., pose). There were 34 usable CVC words ending in /s/, with word frequencies ranging from 316 (house) to 1 (e.g., peace). However, many of these potential target words (25 out of 46) had very low frequencies (at or below 10).
Then we narrowed down the words to be analyzed on the basis of their overall frequencies as well as their frequencies in the speech of the individual children; specifically, we tried to select words with uniformly high frequencies for all children. The final set of target words contained three words ending in voiced codas (cheese, noise, nose) [mean frequency = 129, standard deviation (SD) = 60] and five words ending in voiceless codas (bus, face, house, juice, piece) (mean frequency = 142, SD = 113).
We examined both child and adult productions of these words when the child was 1;6, 2;0, and 2;6 years of age, plus or minus one month (i.e., 1;5–1;7, 1;11–2;1, and 2;5–2;7, respectively). This sampling of the raw data provided a reasonable number of tokens for each speaker and allowed us to explore developmental trends, both in the children's speech and in that of the mothers during the same time periods.
We then extracted all the audio files of the sentences containing the above target words. For individual mothers and children at each age, our goal was to code the first 10 acoustically clean tokens of each target word in utterance-final position (e.g., He's in the house) and the first 10 acoustically clean tokens in each of four utterance-medial contexts: before words beginning with a stressed vowel (e.g., We turned the house over to him), before words beginning with an unstressed vowel (e.g., The door at the new house is purple), before glide-initial words (e.g., Let's explore the house with Sam), and before non-glide consonant-initial words (e.g., Let's visit the new house today). Ideally, this would give us a maximum of 7200 tokens (6 speakers × 3 ages × 10 tokens × 8 target words × 5 contexts) for mothers and 3600 tokens (3 speakers × 3 ages × 10 tokens × 8 target words × 5 contexts) for children. Although most of the time we did not have as many as 10 tokens in each of the four contexts, having the target number provided some controlled variability for the medial context. Unusable tokens included those with poor acoustic quality, often because of overlap with other speakers' vocalizations or noise from toys.
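The token ceilings quoted above follow directly from the sampling design. The sketch below reproduces that arithmetic; the function and parameter names are assumptions introduced for illustration.

```python
from math import prod


def max_tokens(speakers: int, ages: int = 3, tokens_per_cell: int = 10,
               target_words: int = 8, contexts: int = 5) -> int:
    """Upper bound on coded tokens: one cell per speaker x age x word x context."""
    return prod((speakers, ages, tokens_per_cell, target_words, contexts))


mothers_max = max_tokens(speakers=6)   # six mothers
children_max = max_tokens(speakers=3)  # three children
print(mothers_max, children_max)
```

The five contexts are the single utterance-final context plus the four utterance-medial contexts.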
The numbers of tokens that were initially coded for the mothers and children were 521 and 210, respectively. It turned out that some of the children did not produce the /s, z/ coda in 12 out of the 210 coded tokens (for details, see Sec. 2B1 below). Such tokens were excluded from further analyses in this study. Thus, the final number of tokens examined in study 1 was 521 for six mothers and 198 for three children. Table I shows a breakdown of tokens used in the analyses at each age (1;6, 2;0, 2;6) and utterance context (utterance-medial, utterance-final). Children overall had a smaller number of tokens utterance medially than utterance finally; this may be due to the fact that children's early utterances are often comprised of a single word, which was considered as utterance-final. In utterance-medial context, a majority of the tokens were followed by non-glide consonant-initial words (mothers: 39%, children: 45%) and words beginning with an unstressed vowel (mothers: 44%, children: 42%). Target words followed by words beginning with a stressed vowel (mothers: 7%, children: 11%) and by glide-initial words (mothers: 10%, children: 2%) were less common. In general, the distributions for the utterance-medial words were not strikingly different for mothers and children.
TABLE I.
Number of coda /s, z/ tokens analyzed in study 1.
| Age | Mothers: voiced, medial | Mothers: voiced, final | Mothers: voiceless, medial | Mothers: voiceless, final | Mothers: total | Children: voiced, medial | Children: voiced, final | Children: voiceless, medial | Children: voiceless, final | Children: total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1;6 | 27 | 27 | 45 | 61 | 160 | 7 | 12 | 6 | 41 | 66 |
| 2;0 | 34 | 22 | 65 | 60 | 181 | 7 | 14 | 9 | 38 | 68 |
| 2;6 | 23 | 25 | 71 | 61 | 180 | 0 | 11 | 17 | 36 | 64 |
| Total | 84 | 74 | 181 | 182 | 521 | 14 | 37 | 32 | 115 | 198 |
Acoustic coding and measures
In order to examine the acoustics of each token, we developed a set of coding conventions using visual information from the spectrogram and waveform, as well as auditory information. First, the duration of the frication noise associated with a fricative coda consonant was defined as the interval between the onset and offset of noticeable frication noise [i.e., between (2) and (4) in Fig. 1]. Second, vowel duration was defined as the interval between the onset and offset of clear F2 energy in the spectrogram [i.e., between (1) and (3) in Fig. 1]. Lastly, we identified the end of the periodicity (F0) associated with the vowel. Although the end of F0 often aligned with the end of the vowel (defined as the offset of F2), it was also often the case that F0 ended a few periods later. In cases where the frication noise began before the voicing ended, the interval between them was defined as frication/voicing overlap [i.e., between (2) and (3) in Fig. 1]. Acoustic coding was carried out by several trained coders using Praat (Boersma and Weenink, 2005).
Figure 1.
Representative waveform and spectrogram for the word says [sεz] produced by a mother.
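As a minimal sketch of how the three measures relate to the coded time points, the example below derives frication duration, vowel duration, and frication/voicing overlap from a set of landmark times. The class, field names, and example times are assumptions introduced here for illustration; the fields are named by acoustic event and are not meant to map one-to-one onto the numbered landmarks in Fig. 1.

```python
from dataclasses import dataclass


@dataclass
class Landmarks:
    """Hand-labelled time points for one token, in ms (hypothetical layout)."""
    vowel_onset: float       # onset of clear F2 energy
    vowel_offset: float      # offset of clear F2 energy
    frication_onset: float   # onset of noticeable frication noise
    frication_offset: float  # offset of noticeable frication noise
    voicing_offset: float    # end of periodicity (F0)


def duration_measures(lm: Landmarks) -> dict:
    """Derive the duration measures described in the text from the landmarks."""
    frication = lm.frication_offset - lm.frication_onset
    vowel = lm.vowel_offset - lm.vowel_onset
    # Overlap is zero when voicing ends before frication begins, and it
    # cannot exceed the frication interval itself.
    overlap = max(0.0, min(lm.voicing_offset, lm.frication_offset) - lm.frication_onset)
    return {
        "frication_ms": frication,
        "vowel_ms": vowel,
        "overlap_ms": overlap,
        "percent_voiced": 100.0 * overlap / frication,
    }


# Hypothetical token (illustrative times, not measured data).
token = Landmarks(vowel_onset=100, vowel_offset=300, frication_onset=290,
                  frication_offset=450, voicing_offset=330)
measures = duration_measures(token)
print(measures)
```

Expressing the overlap as a percentage of frication duration gives the devoicing estimate for voiced fricatives that the study uses alongside the raw durations.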
Reliability of acoustic coding
To evaluate inter-coder reliability of acoustic coding, 30% of the tokens at age 2;0 (74/249; 21 child tokens and 53 adult tokens) were recoded by the third author, who was also a trained coder. The average difference in frication duration between the original and recoded data was 12 ms (SD = 12) for mothers and 27 ms (SD = 46) for children (see Table II for the mean durations of fricatives). The average difference in vowel duration between the original and recoded data was 21 ms (SD = 25) for mothers and 23 ms (SD = 29) for children. The average difference in frication/voicing overlap was 16 ms (SD = 15) for mothers and 29 ms (SD = 36) for children. Overall, the difference was bigger and the variability was greater for children than for adults. Pearson r correlations between the measurements of the original and recoded data were significant for all three measures: frication duration: r(72) = 0.93, p < 0.001, vowel duration: r(72) = 0.96, p < 0.001, frication/voicing overlap: r(72) = 0.93, p < 0.001. These results suggested high reliability between coders.
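The two reliability summaries reported here, the average difference between coders and the Pearson correlation, can be sketched as below. The function names and the four-token example are assumptions for illustration only. The last helper converts a correlation into the t statistic on which its significance test is based, using the degrees of freedom given in the text (74 tokens minus 2).

```python
from math import sqrt
from statistics import mean


def mean_abs_difference(original, recoded):
    """Average absolute difference between two coders' measurements."""
    return mean(abs(x - y) for x, y in zip(original, recoded))


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(ss_a * ss_b)


def t_for_r(r, df):
    """t statistic for testing r against zero with df = n - 2."""
    return r * sqrt(df / (1 - r ** 2))


# Hypothetical durations in ms from two coders (illustrative, not study data).
original = [100, 150, 200, 250]
recoded = [110, 140, 210, 260]
diff = mean_abs_difference(original, recoded)
r = pearson_r(original, recoded)

# t statistic implied by the reported r(72) = 0.93 for frication duration.
t_frication = t_for_r(0.93, 72)
print(diff, r, t_frication)
```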
TABLE II.
Mean fricative duration (in ms) by coda voicing, utterance position, and subject group. Numbers in parentheses are standard errors.
| Age | Voiced, medial: mothers | Voiced, medial: children | Voiced, final: mothers | Voiced, final: children | Voiceless, medial: mothers | Voiceless, medial: children | Voiceless, final: mothers | Voiceless, final: children |
|---|---|---|---|---|---|---|---|---|
| 1;6 | 71 (9) | — | 170 (10) | 193 (18) | 109 (11) | 172 (17) | 210 (10) | 199 (17) |
| 2;0 | 80 (18) | — | 155 (12) | 167 (20) | 112 (3) | 112 (5) | 183 (15) | 259 (25) |
| 2;6 | 76 (13) | — | 154 (10) | 169 (17) | 95 (12) | 149 (19) | 225 (23) | 207 (40) |
| Mean | 76 (7) | — | 160 (7) | 176 (12) | 105 (4) | 145 (7) | 206 (11) | 221 (19) |
Results
Developmental patterns
Before turning to the comparison of duration for voiced vs voiceless fricatives, we first examine whether there are any changes in the duration of fricatives as a function of children's age. Out of the 210 tokens that were initially identified as target words in the children's speech, coda fricatives were produced (as determined by the presence of the onset and offset of noticeable frication noise) for 94%, resulting in 198 tokens in the final child dataset. Of the 12 frication-less tokens, 5 occurred at 1;6, 4 occurred at 2;0, and 3 occurred at 2;6. All but one token were from the same child, the one who spoke the most. Seven of the 12 items were house, and most occurred in utterance-medial position. One of the fricative-less tokens had a stop closure and release bursts instead of frication, but the others simply had no acoustically observable coda. The results from a chi-square test indicated that the voicing of the coda was not a factor affecting omission of the coda; that is, both voiced and voiceless coda fricatives were produced with equal likelihood [94% vs 94%, χ2 (1, N = 210) = 0.00, p = 0.96]. However, both segment types were produced more reliably in utterance-final compared to utterance-medial position [96% vs 88%, χ2 (1, N = 210) = 4.35, p = 0.04]. Recall that the coda-less tokens were not included in the study. Mothers were at ceiling, since they always produced the target fricative noise. This was not surprising, especially considering the fact that the mothers in this study produced more careful, child-directed speech as opposed to more casual, adult-directed speech.
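The position comparison above can be checked with a Pearson chi-square test on a 2 × 2 table of produced vs omitted codas. The sketch below assumes no continuity correction, which the text does not specify. The produced counts (152 final, 46 medial) are the children's totals from Table I; the omitted counts (6 and 6) are not reported directly and are back-calculated here from the 96% and 88% rates and the 12 omitted tokens, so they should be treated as an assumption.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Rows: utterance-final, utterance-medial; columns: coda produced, coda omitted.
# Omitted counts are inferred, not reported (see the assumption stated above).
chi2, p = chi_square_2x2(152, 6, 46, 6)
print(round(chi2, 2), round(p, 2))
```

Under these assumed counts the statistic rounds to the reported value, which suggests the reported test was uncorrected, although the text does not say so.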
We also examined the change in the duration of fricatives from 1;6 to 2;6. Studies have shown that children's segment durations are overall longer and more variable than those of adults (Smith, 1978; Lee et al., 1999). Thus, it was possible that the children in our study would show a developmental change in coda fricative duration, with shorter durations over time. Note, however, that past studies reporting decreasing segment duration with age were conducted with older (5–12-year-old) children, where the groups were separated by 2–3 years (e.g., Smith, 1978). Thus, it is possible that the very young age of the children in this study, in conjunction with the narrow age range of 1;6–2;6 years, would prevent our analyses from showing any systematic developmental changes in duration.
A mixed analysis of variance (ANOVA) was conducted with age (1;6 vs 2;0 vs 2;6) as the within-subjects factor, and the subject group (mother vs child) as the between-subjects factor. The dependent variable was the duration of the frication noise. Four separate ANOVAs were performed, for utterance-medial voiced and voiceless tokens and for utterance-final voiced and voiceless tokens, respectively. None of the three children produced medial tokens of the voiced target words at all three of the ages sampled for this study; thus, we were not able to evaluate the effect of age on utterance-medial voiced tokens for the children.2 The effect of age was not significant for the mothers, suggesting that their coda fricative durations did not change significantly as their children grew from 1;6 to 2;6 (Table II). It was not possible to examine the effect of group or age × group on utterance-medial voiced tokens due to the lack of children's data. For utterance-medial voiceless tokens, the effects of age and age × group were not significant, but there was a significant effect of group, suggesting that children overall had a longer frication duration for utterance-medial tokens than mothers, F(1,5) = 24.07, p < 0.01 (see Table II). For utterance-final voiced and voiceless tokens, the results showed no significant effects of age or group, and no significant age × group interaction. We suspect that the lack of difference between children and adults in utterance-final position may arise because both populations produce the final-lengthening effect (e.g., Klatt, 1976), and this was limited by a ceiling effect. In sum, these results suggest that the children did not show a significant change in frication duration between 1;6 and 2;6. Thus, the data from the three age windows were collapsed in the follow-up analyses examining the effect of the coda voicing on fricative duration.
Effects of coda voicing on frication duration
In order to examine the effect of coda voicing on the duration of frication noise, a repeated measures ANOVA was performed on the individual mean duration of this measure for the nine subjects (six mothers, three children). The between-subjects factor was the subject group (mother vs child). The within-subjects factors were the voicing of the coda (voiced vs voiceless) and utterance-position (utterance-medial vs utterance-final). The possible effect of utterance-position was considered because previous studies suggest that the duration of various segments, including fricatives, varies with the position of the word in an utterance even in young children's speech (e.g., Oller, 1973).
The results showed that children overall have longer frication durations [M = 169 ms, standard error (SE) = 8] than the mothers (M = 137 ms, SE = 6), as indicated by a significant main effect of group, F(1,7) = 10.76, p = 0.01. However, as indicated by a significant group × position interaction, F(1,7) = 16.40, p < 0.01, the effect of group was only observed for one position, in this case for the medial tokens [see Fig. 2a]. That is, the children did not shorten frication duration medially as much as the mothers did. As expected, there was a significant main effect of position, F(1,7) = 112.40, p < 0.001, with longer durations utterance finally (M = 186 ms, SE = 6) compared to medially (M = 120 ms, SE = 5). In sum, both children and adults had longer frication duration in utterance-final position than in medial position, and the children did not shorten their frication durations in medial position as much as the adults did.
Figure 2.
Effects of (a) position × group, (b) position × voicing, (c) group × voicing, (d) group × voicing × position interactions on frication durations. Error bars represent standard errors.
As predicted, the main effect of voicing was significant, F(1,7) = 29.64, p = 0.001, with longer frication duration for voiceless codas (M = 171 ms, SE = 6) than for voiced codas (M = 135 ms, SE = 6). Although this difference was significant overall, there was also a positional difference: the effect of voicing was greater utterance finally compared to utterance medially, as indicated by a significant voicing × position interaction, F(1,7) = 47.92, p < 0.001 [see Fig. 2b]. There was no group × voicing interaction, F(1,7) = 0.15, p = 0.71 [see Fig. 2c], showing that the children are similar to mothers in showing a voicing contrast in duration. Finally, the three-way interaction of group × voicing × position was marginally not significant, F(1,7) = 5.16, p = 0.06 [see Fig. 2d].
In sum, as predicted, frication duration was affected by coda voicing, with longer duration for voiceless /s/ than for voiced /z/. The effect of voicing on the duration of the frication was found primarily in utterance-final position; less of this systematic variation was found in utterance-medial position, especially for the children. The most important result here is that the duration of the frication noise varied systematically with its voicing feature in both mothers' and 2-year-olds' speech. In the next section, we examine the effect of coda voicing on the duration of the preceding vowel.
Effect of coda fricative voicing on the duration of the preceding vowel
Before we examine the effect of the coda voicing on the duration of the preceding vowel, we first examine the vowel types that occurred in the target words and how many of each were included in the analyses. Since our data were drawn from spontaneously produced samples, the types of vowels were not controlled. Thus, the results for vowel durations in our data will need to be interpreted with caution, since different vowel types have different intrinsic durations (House, 1961). However, most of the vowels in the target words in the present study were either tense vowels (cheese, juice, nose, piece) or diphthongs (face, house, noise); only one target word had a lax vowel (bus). (See Fig. 3 for the frequency counts and the mean vowel duration of each target word.) In the following vowel duration analyses, we therefore excluded the tokens for the word bus (28 tokens in total) to ensure that vowel laxness is not a confounding factor. As a result, the voiced codas were preceded by either tense vowels (cheese, nose) or diphthongs (noise); likewise, the voiceless codas were preceded by either tense vowels (juice, piece) or diphthongs (face, house), both of which are expected to have relatively long durations compared to lax vowels in comparable positions. Although the frequency counts were comparable between the target words, it was noted that many of the voiceless target words were the word house in both mothers' and children's speech (see Fig. 3).
Figure 3.
Mean vowel duration by target word in utterance-medial position (left) and utterance-final position (right). Error bars represent standard errors. The number on each bar indicates the frequency of each target item across all subjects in each mother and child group.
In order to examine the effect of coda voicing on the duration of the preceding vowel, a repeated measures ANOVA was performed on nine subjects' (six mothers, three children) individual mean durations of preceding vowels. The between-subjects factor was subject group (mothers vs children). The within-subjects factors were voicing of the coda (voiced vs voiceless) and utterance-position (utterance-medial vs utterance-final).
As indicated by a significant main effect of group, children overall had longer vowel duration (M = 288 ms, SE = 15) than mothers (M = 211 ms, SE = 11), F(1,7) = 17.50, p < 0.01. Vowel duration was also affected by position, with longer duration finally (M = 302 ms, SD = 14) than medially (M = 197 ms, SD = 12), F(1,7) = 32.68, p = 0.001. Group did not interact with position [see Fig. 4a], F(1,7) = 0.07, p = 0.80. That is, children's vowel durations were longer than mothers' in both utterance-medial and utterance-final positions, rather than only in medial position, as was the case for the coda fricatives themselves.
Figure 4.
Effects of (a) position × group, (b) position × voicing, (c) group × voicing, (d) group × voicing × position interactions on vowel durations. Error bars represent standard errors.
As expected, the main effect of voicing was significant, with longer vowel durations before voiced (M = 307 ms, SE = 19) compared to voiceless fricatives (M = 191 ms, SE = 5), F(1,7) = 29.76, p = 0.001. In contrast to previous findings for adult speakers (Crystal and House, 1988), the voicing × position interaction was not significant, F(1,7) = 1.57, p = 0.25. Thus, in the present study, the effect of voicing was independent of position [see Fig. 4b], i.e., vowels before voiced fricatives were longer than vowels before voiceless fricatives in both utterance-medial and utterance-final positions. Furthermore, voicing did not interact with group, F(1,7) = 0.04, p = 0.85, suggesting that the effect of voicing held equally well for mothers and children [see Fig. 4c]. The three-way interaction was also not significant, F(1,7) = 1.14, p = 0.32 [see Fig. 4d].
To summarize so far, the results confirmed our general predictions, showing that both mothers and 2-year-olds exhibited duration cues to the voicing contrast in both the coda frication noise and the preceding vowel. Overall, children also had longer durations than the mothers. In the next section, we examine how the voicing of the coda affects an additional potential cue to the voicing contrast: the extent of frication overlap with voicing.
Effect of coda voicing on frication noise overlap with voicing
For each coda fricative, the percent of the frication noise overlapping with voicing (i.e., the percent of the fricative noise during which the vocal folds were vibrating so as to produce observable excitatory pulses) was calculated. On this measure, zero percent indicates that the frication noise did not overlap with observable vocal fold vibration at all; 100% indicates that the overlap was complete, i.e., all of the frication noise overlapped with voicing. A higher percentage was expected for voiced /z/ than for voiceless /s/.
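The overlap measure defined above can be sketched as a simple interval computation. This is an illustrative sketch under stated assumptions, not the coding procedure used in the study: the function name `percent_voicing_overlap` is hypothetical, and the code assumes that the frication interval and the stretches of observable vocal fold vibration have already been hand-labeled as (start, end) times in milliseconds, with the voiced stretches not overlapping one another.

```python
def percent_voicing_overlap(frication, voiced_intervals):
    """Percent of the frication interval during which voicing is present.

    frication: (start_ms, end_ms) of the frication noise.
    voiced_intervals: non-overlapping (start_ms, end_ms) stretches of
        observable vocal fold vibration (assumed to be labeled beforehand).
    """
    f_start, f_end = frication
    if f_end <= f_start:
        raise ValueError("frication interval must have positive duration")
    overlap = 0.0
    for v_start, v_end in voiced_intervals:
        # Length of the intersection of the two intervals, or 0 if disjoint.
        overlap += max(0.0, min(f_end, v_end) - max(f_start, v_start))
    return 100.0 * overlap / (f_end - f_start)


# Voicing from the vowel continues 50 ms into a 200-ms fricative.
partial = percent_voicing_overlap((100, 300), [(0, 150)])
# No voicing during the fricative, and voicing throughout the fricative.
none = percent_voicing_overlap((100, 300), [])
full = percent_voicing_overlap((100, 300), [(50, 400)])
```

On this sketch, a fully devoiced token yields 0 and a fully voiced token yields 100, matching the endpoints of the scale described above.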
We performed a repeated measures ANOVA to examine how the percent of frication noise overlap with voicing varied with coda voicing and utterance position in mothers' and children's speech. As with the previous analyses, the between-subjects factor was subject group (mothers vs children). The within-subjects factors were voicing of the coda (voiced vs voiceless) and utterance-position (utterance-medial vs utterance-final).
The results showed that the main effect of group was not significant, although children's fricatives were numerically more voiced overall (M = 17%, SE = 4) than those of mothers (M = 13%, SE = 3), F(1,7) = 0.77, p = 0.41. As predicted, the main effect of position was significant, with more devoiced fricatives utterance finally (M = 9%, SD = 1) than medially (M = 20%, SD = 4), F(1,7) = 8.04, p < 0.05. Group did not interact with position [see Fig. 5a], F(1,7) = 0.22, p = 0.65. That is, the difference between utterance-medial and utterance-final positions held true for both children and mothers.
Figure 5.
Effects of (a) position × group, (b) position × voicing, (c) group × voicing, (d) group × voicing × position interactions on the percentage of tokens exhibiting frication overlap with voicing. Error bars represent standard errors.
As expected, the main effect of voicing was significant, with a higher percent of fricative overlap with voicing for voiced (M = 21%, SE = 4) compared to voiceless fricatives (M = 8%, SE = 2), F(1,7) = 20.95, p < 0.01. Voicing did not interact with group, F(1,7) = 0.13, p = 0.73, suggesting that the effect of voicing held equally well for mothers and children [see Fig. 5c]. Consistent with previous findings for adult speakers (Myers, 2012), the voicing × position interaction was not significant, F(1,7) = 0.18, p = 0.69. This suggests that, across groups, the effect of voicing was independent of position [see Fig. 5b]. However, the voicing × position interaction differed between mothers and children, as shown by a significant three-way interaction of voicing × position × group, F(1,7) = 9.74, p < 0.05. That is, for mothers, the difference between voiced and voiceless tokens was greater utterance medially than finally, whereas for children, it was the reverse [Fig. 5d]. As shown by the left dotted line in Fig. 5d, for mothers, the degree of voicing for voiced fricatives dropped dramatically from the medial to final position, suggesting utterance-final devoicing of voiced tokens. Interestingly, children's utterance-final voiced tokens were much more voiced (i.e., less devoicing in utterance-final position) than those of mothers, widening the difference between children's voiced and voiceless tokens in this position. Furthermore, children's utterance-medial voiceless fricatives were also much more voiced than those of mothers, narrowing the difference between children's voiced and voiceless tokens in this position.
Thus, children primarily differed from mothers in having more voiced utterance-medial /s/ and utterance-final /z/. This is somewhat surprising given the often-held assumption that children tend to devoice word-final/utterance-final obstruents more frequently than adults (Naeser, 1970; Smith, 1979; Velten, 1943). It also runs counter to the widely held view that maintaining vocal fold vibration in the presence of an oral constriction requires a tricky balancing act, since a pressure drop across both the glottal and supraglottal constrictions must be sufficient to produce two sound sources at the same time, i.e., vibration (at the glottis) and turbulence noise (at the oral constriction). However, our results are consistent with findings from previous studies showing that young children's voice onset time (VOT) values for onset stops often fall within the adult perceptual boundaries for voiced stops (Kewley-Port and Preston, 1974; Zlatin and Koenigsknecht, 1976). In both cases, it seems, children produce acoustic cue values that are shifted toward the adult values for voiced obstruents.
Summary: Study 1
In study 1, we examined the effect of voicing on children's production of word-final fricative codas, finding no significant developmental change in frication duration between 1;6 and 2;6 years. These 2-year-old children systematically exhibited cues to the voicing distinction for apical fricative codas, with shorter frication duration and longer preceding vowel duration for voiced codas. Although the durational cues to fricative voicing contrasts in 2-year-olds' speech resembled those in adults' speech (e.g., proportionally longer fricative noise duration for voiceless codas), some quantitative differences were also found between the two populations: as expected, children's fricative and preceding vowel durations were typically longer and more variable than those of adults in the CVC words examined, especially in utterance-medial position. These results are consistent with earlier research showing the reduction in magnitude and variability of segmental durations with age (e.g., Lee et al., 1999). There was also a positional effect, with even longer frication and vowel durations utterance finally compared to utterance medially. The one unexpected finding was that the children in this study did not exhibit as much utterance-final fricative devoicing as the mothers. This suggests immature aspects of timing organization [as found for VOT contrasts in stop onset position in 2;6–3;3-year-old child speakers (Imbrie, 2005)], although some other aspects of voicing cues, such as frication and vowel durations, were remarkably adult-like. In study 2, we compared the morphological effects on the production of fricative codas between these 2-year-olds and their mothers.
STUDY 2: MORPHEMIC EFFECTS ON THE DURATION OF WORD-FINAL /z/ FRICATIVES
Method
Subjects and database
The subjects and database used in study 2 were the same as those used in study 1. Again, we used data from the six mothers and three children (one boy, two girls) who produced enough target morphemic (plural and third person singular) and non-morphemic coda fricatives for analysis.
Data
For the target words ending in non-morphemic /z/, we used the voiced coda target words in study 1: cheese, noise, nose. To determine the target words with morphemic /z/, we first examined the occurrence frequencies of all monosyllabic CVC words in the three children's speech ending in the plural (e.g., toes) or third person singular (e.g., goes) morpheme. Again, words starting with glides or liquids were excluded. This yielded 22 useable CVC words, 13 ending in the plural, and 9 ending in the third person singular morpheme. As in study 1, many of these potential target words (11 out of 22) had very low frequencies (at or below 10). Again, we selected the final target words considering the words' overall frequencies, as well as their frequencies in individual children. The final set of target words contained six morphemic (guys, goes, says, shoes, toes, toys) (mean frequency = 109, SD = 94) and three non-morphemic (cheese, noise, nose) (mean frequency = 129, SD = 60) /z/ codas. As in study 1, the data were sampled every 6 months at 1;6, 2;0, and 2;6, plus a month before and after.
The number of tokens initially coded was 484 for mothers and 171 for children. As in study 1, explicit noise for fricative codas was sometimes omitted in the children's speech (see Sec. 3B1 below for more detail). These tokens (14 out of 171 tokens) were excluded from analysis. The final number of tokens examined in study 2 was 484 for the six mothers and 157 for the three children. Table III shows a breakdown of the tokens ending in morphemic and non-morphemic /z/ that were used in the analyses at each age and utterance position context. In the utterance-medial context, the most common following context was a non-glide consonant-initial word (mothers: 42%, children: 67%). About a quarter of the utterance-medial tokens were followed by words beginning with an unstressed vowel (mothers: 30%, children: 18%). Target words followed by words beginning with a stressed vowel (mothers: 17%, children: 12%) and by glide-initial words (mothers: 11%, children: 3%) were the least common utterance-medial contexts.
TABLE III.
Number of /z/ codas analyzed as a function of morphemic status in study 2 (M: Utterance-medial, F: Utterance-final).
| Age | Mothers: Plural, M | Mothers: Plural, F | Mothers: 3rd, M | Mothers: 3rd, F | Mothers: Non-Morphemic, M | Mothers: Non-Morphemic, F | Mothers: Total | Children: Plural, M | Children: Plural, F | Children: 3rd, M | Children: 3rd, F | Children: Non-Morphemic, M | Children: Non-Morphemic, F | Children: Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1;6 | 22 | 25 | 70 | 9 | 27 | 27 | 180 | 10 | 12 | 8 | 4 | 7 | 12 | 53 |
| 2;0 | 19 | 11 | 74 | 9 | 34 | 22 | 169 | 8 | 2 | 18 | 6 | 7 | 14 | 55 |
| 2;6 | 10 | 20 | 50 | 7 | 23 | 25 | 135 | 9 | 4 | 23 | 2 | 0 | 11 | 49 |
| Total | 51 | 56 | 194 | 25 | 84 | 74 | 484 | 27 | 18 | 49 | 12 | 14 | 37 | 157 |
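As a quick consistency check on Table III, the row and column totals can be recomputed from the individual cells. The sketch below copies the cell counts from the table; the variable and function names are illustrative only.

```python
# Cell order per age: (plural M, plural F, 3rd M, 3rd F, non-morphemic M, non-morphemic F),
# copied from Table III.
mothers = {
    "1;6": (22, 25, 70, 9, 27, 27),
    "2;0": (19, 11, 74, 9, 34, 22),
    "2;6": (10, 20, 50, 7, 23, 25),
}
children = {
    "1;6": (10, 12, 8, 4, 7, 12),
    "2;0": (8, 2, 18, 6, 7, 14),
    "2;6": (9, 4, 23, 2, 0, 11),
}


def margins(table):
    """Return (row totals by age, column totals by category, grand total)."""
    row_totals = {age: sum(cells) for age, cells in table.items()}
    col_totals = tuple(sum(column) for column in zip(*table.values()))
    return row_totals, col_totals, sum(row_totals.values())


mother_rows, mother_cols, mother_total = margins(mothers)
child_rows, child_cols, child_total = margins(children)
```

The recomputed grand totals agree with the 484 mother tokens and 157 child tokens reported in the text.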
Acoustic coding and measures
Acoustic coding followed the procedures described for study 1.
Reliability of acoustic coding
To evaluate inter-coder reliability of the acoustic coding, approximately 30% of the tokens at age 2;0 (60/224; 18 child tokens and 42 adult tokens) were re-transcribed by the third author. The average difference in frication duration between the original and recoded data was 12 ms (SD = 15) for mothers and 19 ms (SD = 25) for children (see Table IV for the mean frication duration for mothers and children). The Pearson correlation between the original and recoded measurements was significant, r(58) = 0.95, p < 0.001, suggesting high inter-coder reliability.
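The two reliability statistics reported above (the mean difference between coders and the Pearson correlation) can be sketched as follows. The duration values below are hypothetical and serve only to show the computation; taking the difference as an absolute value is also an assumption, since the text does not specify how the difference was computed. The degrees of freedom for r are the number of paired tokens minus 2, which gives 58 for the 60 recoded tokens.

```python
from math import sqrt


def mean_abs_difference(original, recoded):
    """Mean absolute difference between paired measurements (assumed definition)."""
    return sum(abs(a - b) for a, b in zip(original, recoded)) / len(original)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical frication durations (ms) from two coders, not the study's data.
original = [70, 120, 165, 190, 240]
recoded = [82, 110, 170, 205, 228]

difference_ms = mean_abs_difference(original, recoded)
r = pearson_r(original, recoded)
degrees_of_freedom = len(original) - 2
```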
TABLE IV.
Mean frication duration (in ms) by the morphemic status of the coda fricative, subject group, and utterance-position. Numbers in parentheses are standard errors.
| Age | Morphemic, Medial: Mothers | Morphemic, Medial: Children | Morphemic, Final: Mothers | Morphemic, Final: Children | Non-Morphemic, Medial: Mothers | Non-Morphemic, Medial: Children | Non-Morphemic, Final: Mothers | Non-Morphemic, Final: Children |
|---|---|---|---|---|---|---|---|---|
| 1;6 | 70 (4) | 119 (7) | 165 (11) | 164 (13) | 71 (9) | — | 170 (10) | 193 (18) |
| 2;0 | 58 (9) | 132 (16) | 166 (13) | 242 (15) | 80 (18) | — | 155 (12) | 167 (20) |
| 2;6 | 84 (11) | 115 (19) | 185 (14) | 178 (16) | 76 (13) | — | 154 (10) | 169 (17) |
| Mean | 70 (4) | 122 (7) | 172 (8) | 195 (9) | 76 (7) | — | 160 (7) | 176 (12) |
Results
Developmental patterns
Before examining the effect of the morphological status of the coda on frication duration, we investigated whether frication durations changed with age, as the durations were measured at 6-month intervals. Of the 171 tokens initially identified as target words for the children, frication noise for coda fricatives was produced 92% of the time, leaving 157 tokens in the final dataset. Of the 14 frication-less tokens, 6 occurred at 1;6, 2 at 2;0, and 6 at 2;6. One of the frication-less tokens had release bursts instead of frication, but most (13 tokens) simply had no frication. Frication was produced slightly more often for non-morphemic vs morphemic fricatives (94% vs 91%), but about equally utterance-medially vs finally (91% vs 92%). Neither of these differences was significant in chi-squared tests [non-morphemic vs morphemic: χ² (1, N = 171) = 0.73, p = 0.39; medial vs final: χ² (1, N = 171) = 0.05, p = 0.83]. The tokens without frication were not included in the study. Not surprisingly, mothers always produced a fricative coda in all contexts.
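The chi-squared comparison of non-morphemic vs morphemic tokens can be sketched as a 2 × 2 test of produced vs omitted frication. The cell counts below are not reported directly in the text; they are inferred here (an assumption) from the Table III totals of analyzed child tokens (51 non-morphemic, 106 morphemic), the 14 excluded tokens, and the reported production rates of 94% and 91%. Whether a continuity correction was applied is also not stated; the uncorrected Pearson statistic is used in this sketch.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Inferred counts (assumption, see lead-in):
# non-morphemic: 51 produced, 3 omitted; morphemic: 106 produced, 11 omitted.
chi2, p = chi_square_2x2(51, 3, 106, 11)
```

Under these inferred counts, the statistic rounds to the values reported above for the non-morphemic vs morphemic comparison. The medial vs final comparison is not reproduced here because its cell counts cannot be recovered unambiguously from the rounded percentages.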
We examined the change in duration of frication noise across the three ages (see Table IV). A mixed ANOVA was conducted with age (1;6 vs 2;0 vs 2;6) as the within-subjects factor, and the subject group (mothers vs children) as the between-subjects factor. The dependent variable was the duration of frication noise. The ANOVA was performed separately on utterance-medial tokens with morphemic and non-morphemic /z/, and utterance-final tokens with morphemic and non-morphemic /z/. For utterance-medial morphemic /z/, the effects of age and age × group were not significant, but there was a significant effect of group, F(1,6) = 38.67, p = 0.001, suggesting that children overall had longer frication duration (M = 122 ms, SE = 7) than mothers (M = 70 ms, SE = 4). This is not unexpected, given the fact that children speak more slowly than adults. Just as in study 1, we were not able to evaluate the effect of age for children's utterance-medial non-morphemic /z/, because none of the three children had medial target words with non-morphemic /z/ at all three ages. For mothers, the effect of age (of child) was not significant, as for utterance-medial morphemic /z/ (described above). The lack of children's data made it impossible to investigate the effect of group or age × group on utterance-medial non-morphemic /z/. For utterance-final words with morphemic and non-morphemic /z/, the results showed no significant effects of age, group, or the age × group interaction. Most importantly, these results suggest that the three children did not show a significant change in frication duration during the one-year period of the study (1;6–2;6), suggesting that age was not a confound. Thus, the data from the three age windows were collapsed in the follow-up ANOVA analyses examining the effect of the morphemic status of the coda on frication duration.
Effects of the morphological status of the coda on frication duration
In order to examine how the morphological status of the coda affects the duration of fricative coda /z/, a repeated measures ANOVA was performed on the individual mean duration of frication noise for all nine subjects (six mothers, three children). The between-subjects factor was subject group (mother vs child). The within-subjects factors were the morphemic status of the coda (morphemic vs non-morphemic) and position (utterance-medial vs utterance-final).
The results showed a significant main effect of group, with longer frication duration for children (M = 155, SE = 7) than for mothers (M = 121, SE = 5), F(1,7) = 14.45, p < 0.01. However, as shown by the significant group × position interaction, F(1,7) = 22.68, p < 0.01, the difference between children's and mothers' frication durations only occurred utterance-medially [Fig. 6a]; children and mothers produced very similar durations for coda /z/ in final position. Frication duration was overall longer utterance finally (M = 171, SE = 5) than medially (M = 106, SE = 6), as indicated by a significant main effect of position, F(1,7) = 120.82, p < 0.001. The results shown in Fig. 6a suggest that mothers are primarily responsible for the medial-final difference; post hoc paired t-tests using a conservative Bonferroni correction of the alpha level (0.05/2 = 0.025) confirmed that the positional effect on the frication noise was significant for mothers, t(5) = 14.55, p < 0.001, but not for children, t(2) = 3.35, p = 0.08.
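The direction of the group × position pattern can be illustrated with the pooled morphemic means from the "Mean" row of Table IV. This is only a descriptive sketch: the values are the table's pooled means rather than the per-subject means that entered the ANOVA, and the function name is hypothetical.

```python
# Mean frication duration (ms) for morphemic /z/, from the "Mean" row of Table IV.
table_iv_morphemic = {
    "mothers": {"medial": 70, "final": 172},
    "children": {"medial": 122, "final": 195},
}


def final_lengthening(means):
    """Final-minus-medial difference and final/medial ratio for each group."""
    return {
        group: {
            "difference_ms": cell["final"] - cell["medial"],
            "ratio": cell["final"] / cell["medial"],
        }
        for group, cell in means.items()
    }


lengthening = final_lengthening(table_iv_morphemic)
```

On these values, the medial-to-final difference is larger for the mothers than for the children, and the two groups are much closer in final position than in medial position, in line with the interaction described above.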
Figure 6.
Effects of (a) position × group, (b) position × morphemic status, (c) group × morphemic status, (d) group × morphemic status × position interactions on frication duration. Error bars represent standard errors.
The main effect of morphemic status was not significant, F(1,7) = 1.80, p = 0.22, suggesting that there was no overall durational difference between morphemic (M = 142 ms, SE = 4) and non-morphemic (M = 135 ms, SE = 6) /z/. However, there was a significant interaction of morphemic status × position, F(1,7) = 32.73, p = 0.001. In a subsequent post hoc analysis, we examined whether and how morphemic and non-morphemic /z/ differed by utterance position. Paired t-tests using a Bonferroni correction of the alpha level (0.05/2 = 0.025) revealed that morphemic /z/ had significantly longer frication duration than non-morphemic /z/ in utterance-final position, t(8) = 3.17, p = 0.01. In utterance-medial position, the difference did not reach the corrected alpha level, t(8) = −2.58, p = 0.03 [Fig. 6b]. The morphemic status × group interaction was not significant, F(1,7) = 0.84, p = 0.39 [Fig. 6c], suggesting that the morphological effect was independent of group. The three-way interaction of group × morphemic status × position was also not significant, F(1,7) = 1.69, p = 0.24 [Fig. 6d].
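The Bonferroni decision rule used in the post hoc tests can be sketched as follows, using the two p-values reported above. The function name is illustrative; the correction simply divides the family-wise alpha by the number of comparisons.

```python
def bonferroni_decisions(p_values, family_alpha=0.05):
    """Return the corrected alpha and, per comparison, whether p falls below it."""
    corrected_alpha = family_alpha / len(p_values)
    decisions = {label: p < corrected_alpha for label, p in p_values.items()}
    return corrected_alpha, decisions


# p-values reported for morphemic vs non-morphemic /z/ in each utterance position.
corrected_alpha, decisions = bonferroni_decisions(
    {"utterance-final": 0.01, "utterance-medial": 0.03}
)
```

With two comparisons the corrected alpha is 0.025, so the utterance-medial result (p = 0.03) is not significant after correction even though it falls below the conventional 0.05 level.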
To summarize, although the main effect of morphemic status was not significant, there was a significant effect of morphemic status × position, suggesting that the duration of morphemic and non-morphemic /z/ differed mainly in utterance-final position. As our morphemic target words included both plural and third person singular /z/, it is of interest to determine whether there was any difference between the two morphemes, and whether the difference was a confounding factor. In order to examine this possibility, we conducted an additional repeated measures ANOVA analysis with the three levels of the morphemic status of a coda (plural vs third person singular vs non-morphemic) as one of the within-subject factors. Other factors were the same; the other within-subject factor was position (utterance-medial vs utterance-final) and the between-subject factor was subject group (mothers vs children).
These results confirmed the results from the earlier analysis that used the two levels of the morphemic status of a coda (morphemic vs non-morphemic). That is, there was a significant main effect of group with a longer frication duration for children (M = 159, SE = 7) than for mothers (M = 120, SE = 5), F(1,7) = 21.72, p < 0.01. The results further showed that the group × position interaction was significant, F(1,7) = 13.06, p < 0.01. That is, children's and mothers' frication durations differed only utterance-medially [Fig. 7a]. Also, frication duration was longer utterance finally (M = 174, SE = 5) than medially (M = 105, SE = 6), as indicated by a significant main effect of position, F(1,7) = 105.28, p < 0.001.
Figure 7.
Effects of (a) position × group, (b) position × morphemic status, (c) group × morphemic status, (d) group × morphemic status × position interactions on frication duration. Error bars represent standard errors.
Just as for the two-way comparison of morphemic vs non-morphemic status, the main effect of morphemic status in this three-way comparison was not significant, F(1,7) = 0.35, p = 0.57, suggesting that there was no overall durational difference among plural (M = 142 ms, SE = 6), third person singular (M = 141 ms, SE = 8), and non-morphemic /z/ (M = 135 ms, SE = 6). However, there was a significant interaction of morphemic status × position, F(1,7) = 6.69, p < 0.05. As shown in Fig. 7b, morphemic vs non-morphemic /z/ differed in duration primarily in utterance-final position. The values for plural and third person singular /z/ were almost identical, as indicated by the overlapping lines for the two morphemes in Fig. 7b. The morphemic status × group interaction was not significant, F(1,7) = 2.07, p = 0.19 [Fig. 7c], indicating that the two groups of speakers treated the three kinds of /z/ similarly in terms of frication duration. Finally, the three-way interaction of group × morphemic status × position was marginally not significant, F(1,7) = 4.87, p = 0.06 [Fig. 7d].
It is interesting that no difference was found in the average duration between plural /z/ and third person singular /z/ in the present study, especially since Hsieh et al. (1999) showed that the duration of the plural morpheme was overall longer than that of the third person singular morpheme in mothers when interacting with their 2-year-olds. However, the discrepancy between the two studies appears to be due to the difference in calculating the average duration of each morpheme. In Hsieh et al. (1999), the tokens from different utterance positions were collapsed when averaging the duration of each morpheme. Because plural nouns were more likely to appear in utterance-final position, where fricatives (and other constituents) are significantly lengthened, it is not surprising that their plural morpheme –s durations were on average longer than third singular morpheme –s durations. In contrast, when we calculated the average duration of plural and third person singular morphemes separately for the two utterance positions (utterance-medial, utterance-final), no difference was found between the two types of morphemes. However, as shown above, a significant difference was found in the frication duration between morphemic /z/ and non-morphemic /z/ in utterance-final position.
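The averaging argument above can be made concrete with a small worked example. The sketch uses the mothers' token counts from Table III (plural: 51 medial, 56 final; third person singular: 194 medial, 25 final) and, as an explicit assumption for illustration, gives both morphemes identical per-position durations, taken from the mothers' pooled morphemic means in Table IV (70 ms medially, 172 ms finally). It does not use or reproduce any data from Hsieh et al. (1999).

```python
def pooled_mean(n_medial, n_final, medial_ms, final_ms):
    """Mean duration when tokens from both utterance positions are collapsed."""
    total = n_medial * medial_ms + n_final * final_ms
    return total / (n_medial + n_final)


# Assumed for illustration: the same per-position durations for both morphemes.
MEDIAL_MS, FINAL_MS = 70, 172

# Token counts for the mothers, from Table III.
plural_collapsed = pooled_mean(51, 56, MEDIAL_MS, FINAL_MS)
third_singular_collapsed = pooled_mean(194, 25, MEDIAL_MS, FINAL_MS)
```

Even though the two morphemes are assigned identical durations within each position, the collapsed plural mean comes out well above the collapsed third person singular mean, simply because a larger share of the plural tokens is utterance-final.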
Summary of study 2
As in study 1, children tended to have longer frication duration for coda fricative consonants than adults, but only in utterance-medial position. Furthermore, as expected, frication duration was longer utterance finally than medially. Finally, morphemic /z/ was longer than non-morphemic /z/ in utterance-final position. This suggests that children as young as two years old are distinguishing between morphological and non-morphological coda fricatives in their speech processing, as suggested by some aspects of the adult data as well.
DISCUSSION
The goal of this study was to explore the acoustic realization of fricative coda consonants in children's early speech, and compare it to that of the (child-directed) mothers' speech they hear. Given the claims in the literature that fricatives tend to be acquired later than stops, possibly due to articulatory issues, we wanted to know when children might begin to produce the same acoustic cues to fricative voicing contrasts, and the same cue values, as are found in adult speech. We also wanted to know how and to what extent the morphological content of the fricative coda affects 2-year-olds' production of fricatives. This is a particularly interesting issue to study in coda position in English, where the acquisition of fricative contrasts had not been explored much, and where plural nouns and third person singular inflected verbs both end in a fricative. It has often been reported that these morphemes can be late acquired, especially in children with Specific Language Impairment (SLI) and hearing loss, but these studies largely rely on transcriptional methods to describe both adult-like and atypical productions of the fricative morpheme by the children. A better understanding of the acoustics of the typical developmental processes in fricative acquisition, especially in word-final coda position, is therefore critical for evaluating the possible factors underlying imperfect learning.
Unexpectedly, the 2-year-olds in the present study produced frication noise for almost all tokens of the target words (i.e., 94% in study 1 and 92% in study 2). Moreover, the durational cues to feature contrasts produced by the children were remarkably similar to those produced by their adult models. With respect to voicing, the children's productions of frication duration and vowel duration patterned similarly to that of the mothers, signaling the contrast between voiced and voiceless sounds. This suggests that, by the age of two, the voicing contrast for coda fricatives is generally well controlled by many children. Interestingly, however, the children exhibited a greater degree of frication overlap with voicing for utterance-final /z/, suggesting less of a devoicing effect than that found in the mothers' speech. Many children below the age of 2 are known to undergo an initial stage where their VOT values for both voiced and voiceless word-initial stops fall within the short lag region that adults use for voiced stops in English, before their VOT values become separated into a short lag region for voiced stops and a long lag region for voiceless ones (Bond and Wilson, 1980; Kewley-Port and Preston, 1974; Macken and Barton, 1980; Zlatin and Koenigsknecht, 1976). Thus, although our finding might seem to contrast with the transcriptional studies reporting frequent devoicing of obstruents in young children's speech (Oller et al., 1976), it is consistent with these previous acoustic analyses showing the prevalent occurrence of short-lag stops and prevoicing in onset position. In both sets of phenomena, children seem to produce cue values that are shifted toward the adult values for voiced segments. These findings provide evidence that 2–3-year-olds are still developing adult-like timing and glottal adjustments for voicing distinctions (see also Imbrie, 2005). 
The mechanisms underlying these findings are not clear, but it has been speculated that the longer duration of voicing in voiced fricatives may be due to the child's tendency to maintain a high subglottal pressure (Netsell et al., 1994). In addition, as young children's VOT values often fall in the short lag region that is within the adult range for perceiving the voiced stop, some researchers have argued that short lag stops are in some way physiologically easier to produce than long lag stops. Supporting evidence comes from the observation that long lag stops tend to be produced with high variability at least until 4;6 years of age, suggesting greater instability or difficulty in producing long lag stops for children (Kewley-Port and Preston, 1974).
We also found that these children exhibit morphological effects in their production of word-final /z/. That is, frication duration was longer for morphemic /z/, at least in utterance-final position. We do not know yet why the durational difference between morphemic and non-morphemic /z/ occurred primarily in utterance-final position. However, it is well known that the segment duration is lengthened in phrase-final rhymes (Turk and Shattuck-Hufnagel, 2000), and in this context voicing-related duration differences are sometimes more apparent. For example, in a corpus study Crystal and House (1988) reported that the effect of coda voicing on the duration of the preceding vowel was only significant in utterance-final position. Likewise, in study 1, we found a significant voicing and utterance position interaction effect on the duration of frication noise for fricative codas. Thus, we speculate that it is an interaction of utterance-position and the morphemic status of the coda that characterizes the acoustics of morphemic /z/ versus non-morphemic /z/, rather than the morphemic status alone. Nevertheless, this suggests that, even for very young children around 2 years of age, there may be morphological effects in the speech-planning process. This suggests that grammatical morphemes are not rote learned forms, but are being actively compiled on-line in these children's spontaneous speech. This is a potentially interesting area for further research, perhaps also with children with SLI, where productive use of grammatical morphemes is often delayed, and the use of lexicalized forms is often observed.
In sum, this study suggests that, despite the challenges of acquiring fricatives, children learning American English can and do use some of the temporal cues to these segments in a way similar to what they hear in the ambient language. Thus, as also found with a detailed acoustic analysis of stop coda voicing contrasts (Song et al., 2012), children are making fricative voicing contrasts more or less appropriately, shortly after they begin to produce their first words. This suggests that many of the motor, perceptual and lexical skills needed to produce some fricative contrasts are in place quite early in typically-developing children, at least in coda position, even if the means to implement these contrasts phonetically in exactly the same way that adults do are not.
Our results might at first seem to contradict previous findings from transcription studies that child fricatives are not like those of adults until late (around 7 years of age) (e.g., Smit et al., 1990). However, it is important to note that the papers that mention late acquisition of fricatives often used a criterion of when most (90%) children produce these segments in most (or even all) contexts. The results from the present study suggest that there is a difference between such “normative” measures, which are designed to provide a criterion for when parents and therapists should worry that a child has not yet acquired this ability, and “average” behavior, which reflects typical ages of acquisition. Thus, although the children in our study appear to produce frication noise for most of the tokens from the beginning, our results are still well within the range for typically developing children. It is possible that the “late” acquisition of fricatives reported in previous transcription studies describes the long tail of the distribution for typical behavior. Furthermore, our findings suggest the possibility that durational cues to voicing and morphological contrasts in coda fricatives are acquired earlier than spectral cues to place contrasts in onset fricatives, which have been reported to be still developing in children over the age of 5 (Nittrouer et al., 1989). At the same time, it is important to acknowledge that there are several contrasts involved here (place vs voicing vs morphology, onset vs coda, durational vs spectral); in order to fully evaluate when children acquire mastery of fricatives, it would be critical to tease apart the effects of all of these factors. Comparisons of different studies using different methodologies are difficult, particularly in the absence of explicit criteria for judging a child's success at producing a sound.
Studies concerned with the question of when children master the phonemes of their language should take this into account, by specifying what criteria were used to judge the child's success in producing a sound category, in the continuous process of more closely approximating adult cues and cue values.
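The distinction between a “normative” criterion (the age by which 90% of children have acquired a sound) and “average” behavior (the typical age of acquisition) can be illustrated with a small sketch. The ages below are hypothetical and invented purely for illustration, the helper name `age_at_criterion` is our own, and the nearest-rank rule is only one of several possible ways to define such a criterion.

```python
from statistics import median


def age_at_criterion(ages_months, percent):
    """Earliest age by which at least `percent` of children have acquired
    the sound, using the nearest-rank rule on sorted ages of acquisition."""
    ordered = sorted(ages_months)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical ages of acquisition in months for ten children,
# with a long right tail (invented for illustration only).
hypothetical_ages = [24, 26, 27, 28, 30, 30, 32, 36, 48, 84]

typical_age = median(hypothetical_ages)                  # "average" behavior
normative_age = age_at_criterion(hypothetical_ages, 90)  # 90% criterion
print(typical_age, normative_age)
```

With a skewed distribution of this kind, the 90% criterion falls well above the median, which is one way in which a normative measure could report “late” acquisition even when most children acquire the sound early.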
A number of studies of both typical phonological development and that of children with phonological disorders have shown that both sets of children often produce feature contrasts that are not perceived as such by adult listeners/transcribers (e.g., Macken and Barton, 1980; Gierut and Dinnsen, 1986; Scobbie et al., 2000; Richtsmeier, 2010). This phenomenon, in which children produce a statistically reliable distinction between sound categories that is not perceived by adults, is known as covert contrast. Such findings have two implications. First, they suggest that the phonological system of a language may be acquired independently of how that system is implemented phonetically. Second, they show that detailed acoustic phonetic analyses of children's speech can sometimes illuminate systematic distinctions between target sounds which are not revealed by transcriptional studies. Although in the present study we did not directly compare our acoustic results to the transcriptions, the fact that 2-year-olds reliably use durational cues to signal voicing and morphological contrasts opens up a possibility for covert contrast in this domain.
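As a rough illustration of what a “statistically reliable distinction” might look like in acoustic data, the sketch below computes Welch's t statistic for two sets of durations. This is a minimal example under stated assumptions: the duration values are hypothetical, the function `welch_t` is our own helper, and Welch's test is chosen only for simplicity; it is not the analysis used in this study or in the covert contrast studies cited above.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite approximate
    degrees of freedom for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical frication durations in ms for targets that a transcriber
# might label identically (invented for illustration only).
voiceless_target_ms = [210, 225, 198, 240, 215, 230]
voiced_target_ms = [170, 185, 160, 190, 175, 180]

t_stat, dof = welch_t(voiceless_target_ms, voiced_target_ms)
print(round(t_stat, 2), round(dof, 1))
```

A large t statistic for tokens that were transcribed as the same segment would be the kind of evidence that points to a covert contrast, although repeated measures from the same child would call for a model that accounts for that dependence.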
An additional finding from this study is that the children consistently produced longer vowel durations in utterance-final position compared to utterance medially. This is particularly interesting in light of earlier results by Snow (1994), who reported that the children in his study produced a final-medial duration difference in syllable duration at the earliest session (mean age of 16 months), then lost the distinction, and then recovered it by the last session (mean age of 25 months). However, at the earliest age in his study the children produced largely one-word utterances, so that the only means for comparing phrase-medial with phrase-final durations was to compare the first syllable of a two-syllable word (e.g., bottle) with a monosyllable (e.g., sock). Therefore, the medial-final duration difference might arise because the syllables in the two-syllable words undergo polysyllabic shortening (see Lehiste, 1972 and Turk and Shattuck-Hufnagel, 2000 for adult speech). By systematically comparing the vowel and frication noise durations in utterance-final and utterance-medial positions, we provide evidence for phrase-final lengthening from the earliest ages analyzed here, i.e., 1;6. Our result suggests that children acquire phrase-final lengthening quite early, possibly before 2 years of age.
The findings from the present study make several contributions to the literature. Although fricative codas are reported to be one of the late-acquired classes of sounds, information about the development of the acoustic cues to coda fricative contrasts has not always been available. Furthermore, studies that have reported acoustic measures have focused on spectral properties of fricatives in onset position, so that only limited information is available about the factors that affect the durational characteristics of fricatives in word-final coda position. Our detailed acoustic analyses of early speech productions suggest that children as young as 2 years of age may have sophisticated, adult-like knowledge of how durational cues vary systematically with fricative coda voicing and morphological contrasts, and some means of actually implementing these contrasts. These results are consistent with those of previous perception studies (e.g., White and Morgan, 2008) showing that children below 2 years of age have adult-like knowledge of phonological feature contrasts, thus filling a gap in our knowledge of the relationship between early speech perception and production. In contrast to the abundant literature concerning the factors affecting segmental duration in adult speech production [including the classic studies by Klatt (1974, 1976)], much less is known about the durational characteristics of various segments in young children's speech production. Thus, the results from the current study will serve as a reference for the effect of voicing/morphological status of fricative codas on their durational properties in the speech production of 2-year-olds learning American English. Parallel studies of fricative coda development in children learning other languages, similar to crosslinguistic studies of fricative onsets [e.g., Li et al. (2009)], would provide a valuable contribution to the field.
Broad characterizations of fricatives as late acquired are only the first step toward understanding the mechanism of acquisition and the reasons why children's productions differ from those of adults. More information about the acoustic and articulatory details, including the positions and contexts in which fricatives are delayed and the acoustic and articulatory parameters that remain non-adult-like for some time, has a good chance of helping us understand the mechanism responsible for the delayed mastery of adult-like fricatives. This paper is a step in that direction.
ACKNOWLEDGMENTS
This research was funded in part by NIH R01HD057606. We thank members of the Child Language Lab at Brown University (Melanie Cabral, Heidi Jiang, Elana Kreiger-Benson, Jeremy Kuhn, Melissa Lopez, Matt Masapollo, Miranda Sinnott-Armstrong, and Matt Vitorla) for coding assistance.
Footnotes
See the Child Language Data Exchange System [CHILDES; http://childes.psy.cmu.edu/ (date last viewed 6/9/12)].
No averages of utterance-medial voiced tokens are shown in Table II for children, indicating that no ANOVA test was carried out for these tokens.
References
- Barry, W. (1979). “Complex encoding in word-final voiced and voiceless stops,” Phonetica 36, 361–372. 10.1159/000259973
- Bernhardt, B. H., and Stemberger, J. P. (1998). Handbook of Phonological Development: From the Perspective of Constraint-Based Nonlinear Phonology (Academic Press, San Diego), pp. 1–793.
- Boersma, P., and Weenink, D. (2005). “PRAAT: Doing phonetics by computer (Version 4.4.07) [computer program],” http://www.praat.org/ (last viewed 8/1/2011).
- Bond, Z. S., and Wilson, H. F. (1980). “Acquisition of the voicing contrast by language-delayed and normal-speaking children,” J. Speech Hear. Res. 23, 152–161.
- Brown, R. (1973). A First Language: The Early Stages (Harvard University Press, Cambridge), pp. 1–449.
- Buder, E. H., and Stoel-Gammon, C. (2002). “American and Swedish children's acquisition of vowel duration: Effects of vowel identity and final stop voicing,” J. Acoust. Soc. Am. 111, 1854–1864. 10.1121/1.1463448
- Cho, T. (2002). “Effects of morpheme boundaries on intergestural timing: Evidence from Korean,” Phonetica 58, 129–162. 10.1159/000056196
- Crystal, T. H., and House, A. S. (1988). “Segmental durations in connected-speech signals: Current results,” J. Acoust. Soc. Am. 83, 1553–1573. 10.1121/1.395911
- Demuth, K., Culbertson, J., and Alter, J. (2006). “Word-minimality, epenthesis and coda licensing in the early acquisition of English,” Lang. Speech 49, 137–173. 10.1177/00238309060490020201
- Freitas, M. J., Miguel, M., and Faria, I. (2001). “Interaction between Prosody and Morphosyntax: Plurals within codas in the acquisition of European Portuguese,” in Approaches to Bootstrapping: Phonological, Lexical, Syntactic and Neurophysiological Aspects of Early Language Acquisition, edited by J. Weissenborn and B. Höhle (John Benjamins, Amsterdam), pp. 45–58.
- Gerken, L. A. (1996). “Prosodic structure in young children's language production,” Lang. 72, 683–712. 10.2307/416099
- Gierut, J. A., and Dinnsen, D. (1986). “On word-initial voicing: Converging sources of evidence in phonologically disordered speech,” Lang. Speech 29, 97–114.
- Haggard, M. (1978). “The devoicing of voiced fricatives,” J. Phonetics 6, 95–102.
- House, A. S. (1961). “On vowel duration in English,” J. Acoust. Soc. Am. 33, 1174–1182. 10.1121/1.1908941
- Hsieh, L., Leonard, L. B., and Swanson, L. A. (1999). “Some differences between English plural noun inflections and third singular verb inflections in the input: The contribution of frequency, sentence position, and duration,” J. Child Lang. 26, 531–543. 10.1017/S030500099900392X
- Imbrie, A. K. K. (2005). “Acoustical study of the development of stop consonants in children,” Ph.D. thesis, Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA.
- Ingram, D. (1989). Phonological Disability in Children (Cole and Whurr, London), pp. 1–179.
- Kent, R. D. (1992). “The biology of phonological development,” in Phonological Development: Models, Research, Implications, edited by C. A. Ferguson, L. Menn, and C. Stoel-Gammon (York, Timonium, MD), pp. 65–90.
- Kewley-Port, D., and Preston, M. (1974). “Early apical stop production: a voice onset time analysis,” J. Phonetics 2, 195–219.
- Klatt, D. H. (1974). “The duration of [s] in English words,” J. Speech Hear. Res. 17, 51–63.
- Klatt, D. H. (1975). “Vowel lengthening is syntactically determined in a connected discourse,” J. Phonetics 3, 129–140.
- Klatt, D. H. (1976). “Linguistic uses of segmental duration in English: Acoustic and perceptual evidence,” J. Acoust. Soc. Am. 59, 1208–1221. 10.1121/1.380986
- Ko, E.-S. (2007). “Acquisition of vowel duration in children speaking American English,” in Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 1881–1884.
- Krause, S. E. (1982). “Developmental use of vowel duration as a cue to postvocalic stop consonant voicing,” J. Speech Hear. Res. 25, 388–393.
- Lee, S., Potamianos, A., and Narayanan, S. (1999). “Acoustics of children's speech: Developmental changes of temporal and spectral parameters,” J. Acoust. Soc. Am. 105, 1455–1468. 10.1121/1.426686
- Lehiste, I. (1972). “The timing of utterances and linguistic boundaries,” J. Acoust. Soc. Am. 51, 2018–2024. 10.1121/1.1913062
- Leonard, L. (1998). Children with Specific Language Impairment (MIT Press, Cambridge, MA), pp. 1–347.
- Li, F., Edwards, J., and Beckman, M. E. (2009). “Contrast and covert contrast: The phonetic development of voiceless sibilant fricatives in English and Japanese toddlers,” J. Phonetics 37(1), 111–124. 10.1016/j.wocn.2008.10.001
- Lleó, C. (2006). “Early acquisition of nominal plural in Spanish,” Catalan J. Ling. 5, 191–219.
- Losiewicz, B. L. (1992). “The effect of frequency on linguistic morphology,” Ph.D. thesis, University of Texas, Austin.
- Macken, M. A., and Barton, D. (1980). “The acquisition of the voicing contrast in English: A study of voice onset time in word-initial stop consonants,” J. Child Lang. 7, 41–74.
- MacWhinney, B. (2000). The CHILDES Project (Erlbaum, Mahwah, NJ), pp. 1–808.
- Marshall, C., and van der Lely, H. (2007). “The impact of phonological complexity on past tense inflection in children with Grammatical-SLI,” Adv. Speech Lang. Pathol. 9, 191–203. 10.1080/14417040701261509
- Myers, S. (2012). “Final devoicing: Production and perception studies,” in Prosody Matters: Essays in Honor of Elisabeth Selkirk, edited by T. Borowsky, S. Kawahara, T. Shinya, and M. Sugahara (Equinox, London), pp. 148–180.
- Naeser, M. (1970). “The American child's acquisition of differential vowel duration,” Technical Report No. 144, Research and Development Center for Cognitive-Learning, University of Wisconsin–Madison, pp. 1–146.
- Netsell, R., Lotz, W. K., Peters, J. E., and Schulte, L. (1994). “Developmental patterns of laryngeal and respiratory function for speech production,” J. Voice 8(2), 123–131. 10.1016/S0892-1997(05)80304-2
- Nissen, S. L., and Fox, R. A. (2005). “Acoustic and spectral characteristics of young children's fricative productions: A developmental perspective,” J. Acoust. Soc. Am. 118(4), 2570–2578. 10.1121/1.2010407
- Nittrouer, S., Studdert-Kennedy, M., and McGowan, R. S. (1989). “The emergence of phonetic segments: Evidence from the spectral structure of fricative-vowel syllables spoken by children and adults,” J. Speech Hear. Res. 32(1), 120–132.
- Oller, D. K. (1973). “The effect of position in utterance on speech segment duration in English,” J. Acoust. Soc. Am. 54, 1235–1247. 10.1121/1.1914393
- Oller, D., Wieman, L., Doyle, W., and Ross, C. (1976). “Infant babbling and speech,” J. Child Lang. 3, 1–11.
- Peterson, G. E., and Lehiste, I. (1960). “Duration of syllable nuclei in English,” J. Acoust. Soc. Am. 32(6), 693–703. 10.1121/1.1908183
- Prieto, P., and Bosch-Baliarda, M. (2006). “The development of codas in Catalan,” Catalan J. Ling. 5, 237–272.
- Richtsmeier, P. T. (2010). “Child phoneme errors are not substitutions,” Toronto Working Papers in Linguistics No. 33, http://web.ics.purdue.edu/~prichtsm/professional/phonphon.pdf (last viewed 6/9/12).
- Scobbie, J., Gibbon, F., Hardcastle, W., and Fletcher, P. (2000). “Covert contrast as a stage in the acquisition of phonetics and phonology,” in Papers in Laboratory Phonology V: Acquisition and the Lexicon, edited by M. Broe and J. Pierrehumbert (Cambridge University Press, Cambridge), pp. 194–207.
- Smit, A. B., Hand, L., Freilinger, J. J., Bernthal, J. E., and Bird, A. (1990). “The Iowa articulation norms project and its Nebraska replication,” J. Speech Hear. Disord. 55, 779–798.
- Smith, B. L. (1978). “Temporal aspects of English speech production: A developmental perspective,” J. Phonetics 6, 37–67.
- Smith, B. L. (1979). “A phonetic analysis of consonantal devoicing in children's speech,” J. Child Lang. 6, 19–28. 10.1017/S0305000900007595
- Smith, C. (1997). “The devoicing of /z/ in American English: Effects of local and prosodic context,” J. Phonetics 25(4), 471–500. 10.1006/jpho.1997.0053
- Smith, C. (2003). “Vowel devoicing in contemporary French,” French Lang. Studies 13, 177–194. 10.1017/S095926950300111X
- Snow, D. (1994). “Phrase-final syllable lengthening and intonation in early child speech,” J. Speech Lang. Hear. Res. 37(4), 831.
- Song, J. Y., Demuth, K., and Shattuck-Hufnagel, S. (2012). “The development of acoustic cues to coda contrasts in young children learning American English,” J. Acoust. Soc. Am. 131(4), 3036–3050. 10.1121/1.3687467
- Song, J. Y., Sundara, M., and Demuth, K. (2009). “Phonological constraints on children's production of English third person singular –s,” J. Speech, Lang. Hear. Res. 52, 623–642. 10.1044/1092-4388(2008/07-0258)
- Stevens, K. N., Blumstein, S. E., Glicksman, L., Burton, M., and Kurowski, K. (1992). “Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters,” J. Acoust. Soc. Am. 91, 2979–3000. 10.1121/1.402933
- Stoel-Gammon, C., and Buder, E. (1999). “Vowel length, post-vocalic voicing and VOT in the speech of two-year olds,” in Proceedings of the XIIIth International Conference of Phonetic Sciences, edited by K. Elenius and P. Branderud (KTH and Stockholm University, Stockholm), pp. 2485–2488.
- Theodore, R., Demuth, K., and Shattuck-Hufnagel, S. (2011). “Acoustic evidence for position and complexity effects on children's production of plural –s,” J. Speech, Lang. Hear. Res. 54, 539–548. 10.1044/1092-4388(2010/10-0035)
- Turk, A., and Shattuck-Hufnagel, S. (2000). “Word-boundary-related durational patterns in English,” J. Phonetics 28, 397–440. 10.1006/jpho.2000.0123
- Turk, A., and Shattuck-Hufnagel, S. (2007). “Multiple targets of phrase-final lengthening in American English words,” J. Phonetics 35(4), 445–472. 10.1016/j.wocn.2006.12.001
- Velten, H. V. (1943). “The growth of phonemic and lexical patterns in infant language,” Lang. 19, 281–292. 10.2307/409932
- Walsh, T., and Parker, F. (1983). “The duration of morphemic and non-morphemic /s/ in English,” J. Phonetics 11, 201–206.
- Westbury, J., and Keating, P. (1986). “On the naturalness of stop consonant voicing,” J. Linguist. 22, 145–166. 10.1017/S0022226700010598
- Wexler, K. (1994). “Optional infinitives, head movement and the economy of derivations in child grammar,” in Verb Movement, edited by D. Lightfoot and N. Hornstein (Cambridge University Press, Cambridge), pp. 305–350.
- White, K. S., and Morgan, J. L. (2008). “Sub-segmental detail in early lexical representations,” J. Mem. Lang. 59, 114–132. 10.1016/j.jml.2008.03.001
- Zlatin, M. A., and Koenigsknecht, R. A. (1976). “Development of the voicing contrast: A comparison of voice onset time in stop perception and production,” J. Speech Hear Res. 19, 93–111.







