Author manuscript; available in PMC: 2015 Jul 1.
Published in final edited form as: J Speech Lang Hear Res. 2010 Oct 21;54(2):448–470. doi: 10.1044/1092-4388(2010/10-0161)

Regional dialect variation in the vowel systems of typically developing children

Ewa Jacewicz 1, Robert Allen Fox 2, Joseph Salmons 3
PMCID: PMC4487659  NIHMSID: NIHMS702928  PMID: 20966384

Abstract

Purpose

To investigate regional dialect variation in the vowel systems of normally developing 8- to 12-year-old children.

Method

Thirteen vowels in isolated h_d words were produced by 94 children and 93 adults, males and females. All participants spoke American English and were born and raised in one of three distinct dialect regions in the United States: western North Carolina (Southern dialect), central Ohio (Midland) and southeastern Wisconsin (Northern Midwestern dialect). Acoustic analysis included formant frequencies (F1 and F2) measured at five equidistant time points in a vowel and formant movement (trajectory length).

Results

Children’s productions showed many dialect-specific features comparable to those in adult speakers, both in terms of vowel dispersion patterns and formant movement. Different features were also found including systemic vowel changes, significant monophthongization of selected vowels and greater formant movement in diphthongs.

Conclusions

The acoustic results provide evidence for regional distinctiveness in children’s vowel systems. Children acquire not only the systemic relations among vowels but also their dialect-specific patterns of formant dynamics. Directing attention to the regional variation in the production of American English vowels, this work may prove helpful in better understanding and interpretation of the development of vowel categories and vowel systems in children.

Keywords: dialect acquisition, regional vowel systems, formant movement


Acoustic characteristics of American English vowels have been widely studied since the seminal work by Peterson and Barney (1952). However, only relatively recently has sociophonetic research recognized that vowels in American English differ greatly in terms of their spectro-temporal characteristics across dialect regions in the United States. Regional variation in vowels has been identified and documented primarily in the speech of adults and, to date, there is still no comprehensive account of regional influences on the development of vowel systems in children. This paper addresses this issue by investigating the extent to which regional vowel features found in adults are present in productions of normally developing 8- to 12-year-old children. In so doing, this study aims to determine the strength of dialect-specific features in children and to illuminate which vocalic dialect features become permanent parts of a child’s accent or dialect.

By means of acoustic analysis, the study examines whether and to what extent children acquire not only the dialect-specific relative positions of vowels in the two-dimensional F1 by F2 plane but also their dynamic formant frequency patterns. As recent research shows, vowels are never truly static and exhibit at least some amount of vowel inherent spectral change (e.g., Nearey & Assmann, 1986; Watson & Harrington, 1999). This spectral change, also called dynamic formant movement, has been found to differentiate vowel variants across regional dialects of American English (Fox & Jacewicz, 2009). Moreover, cross-dialectal differences have been found both in vowel duration (Jacewicz, Fox & Salmons, 2007) and in speech tempo (Jacewicz, Fox, O’Neill & Salmons, 2009; Jacewicz, Fox, & Wei, 2010), both of which affect vowel dynamics, including the spectral rate of change in a vowel (Fox & Jacewicz, 2009). To what extent do children acquire these dialect-specific dynamic vowel characteristics?

Regional variation in phonetic input to children

When learning vowel systems in their first language, children are usually exposed to highly variable input from speakers in their speech communities. Typically, this input represents a form of a regional variety (or dialect) of the language spoken in a specific geographic area. In the United States, these regional varieties have been documented and characterized in the Atlas of North American English (Labov, Ash & Boberg, 2006) which defines American dialect regions on the basis of a variety of differences, but with an overwhelming focus on vowels. A broad body of sociolinguistic research provides strong evidence that regional variation is always present in spoken utterances, and some researchers argue that the notion of general standard American English should be abandoned. As an example, Wolfram and Schilling-Estes (2006, p. 324) point out that “the definition of spoken standard American English is a flexible one, sensitive to regional variation, stylistic range and other social variables.” This implies that children growing up in a given geographic region are destined to acquire features of the local dialect.

However, even within a regionally-defined variety of English, there is great variability in the input available to children. The children’s acquisition of language typically begins with their mother’s vernacular and her vowel system. The view that female caretakers are the primary source for first language learning has particularly strong support in sociolinguistic research (see Labov, 2001, chpt. 9). After the first few formative years, however, children learn to talk differently from their mothers (and caregivers in general) and adopt the emerging patterns of their speech communities. In fact, as pointed out by Labov (2010), children have the capacity to detect the sound system of their speech community and begin to prefer this system over the one they first acquired from their caretakers. This continuous process of learning and re-organizing was defined by Labov (1994) as transmission (i.e., transfer of features from adults to children) and incrementation (i.e., the mechanism by which changes advance in a step-by-step fashion).

The variability in the input available to children in their speech communities comes in part from the fact that adults, adolescents and children’s peers differ in the degree and type of regional characteristics present in their speech and some individuals speak with “stronger” regional accents than others. Yet, the incrementation of language change comes about by children’s awareness that differences in language use (variants) are associated with the age of the speakers and that, specifically, “the younger the speaker, the more advanced the change” (Labov, 2007, p. 379). This indicates that older children who have already passed their first formative years begin to turn to their peers as the more “proper” target for their language learning.

Two different accounts have been presented in recent research for how patterns of regional variation are changing today and evidence exists for both. First, much research on American English (reviewed in Labov et al., 2006; Labov, forthcoming) finds an increasing divergence of North American dialect regions and this view is now widespread, if not ubiquitous, among sociolinguists and dialectologists. At the same time, variability in the input increases with population mobility, exposing learners to new and broader sets of inputs. The second line of research focuses specifically on the effects of language contact, studying the acquisition of second dialects, i.e. acquisition of dialects or dialect features by individuals moving from one region to another (e.g., Chambers, 1992; 2003; Kerswill, 1996; Payne, 1980; Rys & Bonte, 2006). Because regional variation as a function of language contact is not of immediate relevance to the present study, this research will not be discussed further here. Supplemental material can be found in Foulkes and Docherty (1999), Kerswill (1994, 2003), Salmons and Purnell (2010), Trudgill (1986, 2004), among others.

Acquisition of dialect features

The presence of inter-speaker variation in the input poses a number of questions related to the acquisition of regional dialect features by children. Most importantly, to what extent are children able to adopt and reproduce the dialect features typical of a given speech community in the midst of highly variable input? As posited by the Labovian model of transmission and incrementation, the initial input provided by primary caregivers from the area constitutes a strong source of regional variants (see also Roberts, 2002, for a discussion of the importance of early input to children). Strong dialect-specific features are often present in the speech of older adults who grew up in the area, who did not travel as extensively as younger generations, and who have maintained close ties with other members of the community. However, newcomers who move into the region can constitute an influential source of non-local linguistic forms. This is reported to be particularly important among peers during preadolescence and adolescence (e.g., Eckert, 1999). Labov (2001, p. 502) hypothesizes that “most linguistic influence is exerted in early and middle adolescence,” the end of the opportunity for “vernacular reorganization.” It is therefore of interest whether children continue to participate in the transmission of dialect-specific features or adhere to the new (or converged) forms introduced to the area.

To address this and related issues for American English, Roberts studied the acquisition of selected dialect-specific phonological markers by young preschool children who were growing up in Philadelphia (Roberts 1997a; 1997b; Roberts & Labov, 1995). In Roberts and Labov (1995), the acquisition of the Philadelphia short /ɑ/ was examined in light of complex conditioning factors coming from lexical, phonological and grammatical constraints. The question was whether preschool children could detect and learn this dialect-specific complexity and reproduce the dialect-specific variation in the pronunciation of /ɑ/. The study showed that children between the ages of 3 and 4 were able to acquire the complex rules and were also found to participate actively in the process of /ɑ/-change in specific consonant environments. In another study, Roberts (1997a) examined the pattern of variability in deletion of final /t/ and /d/ in word final consonant clusters in Philadelphia children (3- and 4-year-olds). Although the /t, d/ deletion has been studied extensively in sociolinguistic research providing arguments for developmental, social and dialect-specific influences in English worldwide (e.g., Bayley, 1994; Guy, 1980; Patrick, 1991; Smith, Durham & Fortune, 2009; Tagliamonte & Temple, 2005), Roberts’s data showed that children learned and matched the adult Philadelphia-specific pattern. That is, they demonstrated acquisition of the variable rules of dialect-specific /t, d/ deletion rather than adhering to more global developmental or social effects. Finally, Roberts (1997b) found further support for a dialect-specific pattern of changes in the diphthongs /aɪ/ and /aʊ/ as in fight, right, mice, cow, crown and south produced by preschool children, which also underscores the importance of early input provided by Philadelphia-native parents.

Dialect-related variability in American English vowels and, specifically, variability of formant frequency patterns in vowel production in children has not been studied as systematically as other phonological markers. Children’s data are of particular interest in examining dialect-specific vowel shifts and changes (which have been identified in the production of adults) because of their potential role in language change (Labov, 2001, chpts. 13&14). However, the existing reports are rather sparse and data are presented mostly from individual speakers. For example, Labov (1994) provides illustrative charts of vowel systems of several teenagers and only two children, one 11-year-old female (p. 108) and one 13-year-old male (p. 100). To date, there is no detailed acoustic study of variation in American English vowels produced by children across a wider age range as a function of regional dialect.

Regional dialects and vowel formant patterns in children

Most of the detailed acoustic data examining vowel productions across dialect regions in the United States (Clopper, Pisoni & de Jong, 2005; Hagiwara, 1997) come from young college-age adults as participants. A much wider range of adults, including very old speakers, have participated in Thomas’ regional survey (Thomas, 2001) and in Labov’s lifetime work on regional variation in American English (most notably, Labov et al., 2006). Another acoustic study by Hillenbrand, Getty, Clark and Wheeler (1995) used speakers from the upper Midwest region, mostly from Michigan, including children, ages 10–12 years. However, the goal of the work was to replicate the classic study by Peterson and Barney (1952), which also included children, and not to inquire into regional variation in children’s speech.

On the other hand, acoustic studies that have explored children’s vowel production have been primarily concerned with documenting developmental patterns in children, including sex differences, compared to adults (e.g., Assmann & Katz, 2000; Bennett, 1981; Busby & Plant, 1995; Lee, Potamianos & Narayanan, 1999; Perry, Ohde & Ashmead, 2001; Whiteside & Hodgson, 2000; Whiteside, 2001). The children used in these studies were preadolescent girls and boys ranging in age from 5–12 years, variously divided into narrower age groups, depending on the research focus. Their language background was usually reported as standard American or Australian English although some researchers did provide specific information about geographic regions in which their participants were born and raised (for American English: Assmann & Katz, 2000, Bennett, 1981 and Lee et al., 1999; for British English: Whiteside & Hodgson, 2000). In terms of selected developmental factors, the data in Whiteside (2001) indicate most clearly that sex-related differences in formant frequency values emerge at age 10 (which may reflect the beginning of the developmental prepubertal stage at approximately age 10 proposed by Fitch and Giedd (1999) indicating sex-related anatomical changes in the vocal tract) but these differences vary for specific formants and specific vowels.

Researchers are becoming increasingly aware that dialect background should be considered in the interpretation of the formant frequency data in vowels from both adults and children. While regional variation in adults can be evaluated by including the effects of dialect as a variable in cross-dialect comparisons because minimal developmental factors are involved, understanding children’s productions requires careful examination of both developmental effects (related to age and gender) and possible dialect variation, which may confound the interpretation of developmental changes. Vorperian and Kent (2007, p. 1514) address this very issue in their reanalysis of formant frequency data from 14 different sources which reported children’s formant values. Admitting the confounding effect of dialect, they also state: “One reason why dialectal influence was difficult to control is because the formant-frequency data used in this study were published over an interval of nearly 5 decades, and dialects shift over time.”

Vowel change in English is systemic in nature and has been widely documented in phonetic studies of American, Australian, British and New Zealand English dialects in the speech of adults (e.g., Cox, 1999; Gordon, Campbell, Hay, Maclagan, Sudbury & Trudgill, 2004; Kerswill, Torgersen, & Fox, 2008; Labov et al., 2006; Torgersen & Kerswill, 2004). Ideally, interpretation of children’s formant frequency data should take into consideration regional formant specifications at a particular time in history. Obviously, this most rigorous condition is not easy to meet given the lack of detailed acoustic studies that compare regionally defined children’s and adults’ speech samples collected at a given stage of a systemic vowel shift. Yet, even a basic knowledge of potential effects of regional variation on acoustic characteristics of vowels produced by children may prove to be helpful in interpreting developmental data across English-speaking territories. The extent to which children acquire dialectal vowel characteristics from the input provided to them by adults is a rich area for future research, the foundation for which is laid in the present study.

Method

Participants

Ninety-four children aged 8–12 years (M = 9.9, SD = 1.4) and ninety-three adults aged 51–65 years (M = 57.8, SD = 4.5) participated in the study. All participants were born, raised and have spent most of their lives in one of three geographic regions in the United States: western North Carolina (the Cullowhee area), central Ohio (the Columbus area) and southeastern Wisconsin (Madison and areas east). These locations were selected because American English spoken in each area represents a distinct regional variety. In particular, vowel systems in these three dialects exhibit substantial differences in both vowel dispersion patterns and their dynamic formant movements. The southern dialect in the Appalachian region of western North Carolina is affected by sets of changes called the Southern Shift, the Midland dialect (the central Ohio variety) is not affected by any known vowel shift (although chain-like changes have been reported more recently by Durian, 2010) and the northern variety spoken in southeastern Wisconsin shows features of another shift termed the Northern Cities Shift (NCS). More details about these regional varieties can be found in Labov et al. (2006).

In terms of speaker gender, there were 16 girls and 16 women in each dialect group for a total of 96 females. For males, there were 16 boys and 16 men in North Carolina, 15 boys and 13 men in Ohio and 15 boys and 16 men in Wisconsin for a total of 91. The participants were recruited using flyers, bulletin board postings, email and word of mouth. Each participant spoke the local dialect as verified by the research team and none reported any speech disorders or wore corrective devices. The recordings took place in the years 2006–2008. The participants were paid for their efforts.

Speech material

Thirteen American English vowels were selected: /i, ɪ, e, ε, æ, ɑ, ɔ, o, u, ʊ, oɪ, aɪ, aʊ/. Each vowel was produced in h_d context in citation form using the prompts: heed, hid, heyd, head, had, hod, hawed, hoed, who’d, hood, hoyd, hide, howed. Each participant produced three repetitions of each token for a total of 7293 tokens used in the acoustic analysis (13 vowels × 3 repetitions × 187 speakers). These tokens, produced in isolation, provided not only a common database for all speakers but also a uniform presentation style. The assumption is that if the regional dialect features are present in the most conservative forms of vowels – their citation forms – they may surface even more in everyday (spontaneous) speech.

Procedure

Printed h_d prompts were presented to each participant on a computer monitor, one at a time. A custom MATLAB program was used to control the experiment and the recordings. In preparation for the experimental task, three different word lists were created as inputs to the MATLAB program, in which each h_d token was repeated three times. In each list, the tokens appeared in a pseudorandom order in which there were no consecutive presentations of the same item. Also, care was given not to include adjacent items bearing potential for generation of a false contrast such as in the case of “hod” and “hawed” which, in some dialects, are produced as a low back merger. No fillers were included. The word lists were counterbalanced across the participants within each group. As already mentioned, each item from the list appeared as an individual prompt on the computer monitor. In six cases, however, additional information in the form of a rhyming word was displayed beneath the prompt as it was felt that speakers might encounter difficulties in reading (i.e., understanding the expected phonetic form of certain nonce items). These included: heyd (rhymes with “made”), hoyd (rhymes with “Lloyd” and “void”), hod (rhymes with “odd”), who’d (rhymes with “mood”), howed (rhymes with “cowed”), hawed (rhymes with “sawed”).
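The list constraints described above can be illustrated with a short Python sketch. This is not the authors' MATLAB program. It is a minimal illustration that assumes simple rejection sampling, and the names PROMPTS, AVOID_ADJACENT, is_valid and make_list are hypothetical.

```python
import random

# The 13 h_d prompts named in the Speech material section.
PROMPTS = ["heed", "hid", "heyd", "head", "had", "hod", "hawed",
           "hoed", "who'd", "hood", "hoyd", "hide", "howed"]
REPEATS = 3  # each token appears three times per list

# Pairs that should never be adjacent (assumption: only the pair
# explicitly mentioned in the text, "hod" and "hawed").
AVOID_ADJACENT = {frozenset(("hod", "hawed"))}


def is_valid(order):
    """True if no item repeats consecutively and no avoided pair is adjacent."""
    for a, b in zip(order, order[1:]):
        if a == b or frozenset((a, b)) in AVOID_ADJACENT:
            return False
    return True


def make_list(seed, max_tries=100000):
    """Shuffle until the constraints hold (simple rejection sampling)."""
    rng = random.Random(seed)
    items = PROMPTS * REPEATS
    for _ in range(max_tries):
        rng.shuffle(items)
        if is_valid(items):
            return list(items)
    raise RuntimeError("no valid order found within max_tries")
```

Calling make_list with three different seeds would give three different 39-item lists, which could then be counterbalanced across participants as the text describes.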

The participant was seated facing a computer monitor and spoke each token into a head-mounted Shure SM10A dynamic microphone, positioned at a distance of about 1.5 inches from the speaker’s lips. The experimenter used a separate computer monitor and keyboard/mouse and either accepted and saved the speaker’s production (which occurred most of the time) or asked for a repetition in case of any obvious mispronunciations (such as [haɪd] for “hid”) or poor quality recording. To familiarize the speaker with the interface and the recording procedure, a 10-item practice was presented prior to the actual experiment. During the practice, the experimenter adjusted the sound recording levels to the voice of an individual speaker and established an allotted time for presentation and production of each item. For most participants, the selected time slot was 3 seconds. On occasion, slow speakers required 4 seconds in order to read and produce an item. Additional repetitions were collected from the speaker if the sound level was either too soft or too loud (which was indicated by color-coding of the speaker’s waveform displayed on the screen immediately following the production) and in cases such as “false” or “late” starts, “slips of the tongue,” laughing, coughing, sneezing, etc. The same procedures were followed with both children and adults. The duration of the task ranged from 15 to 20 minutes. Children did not encounter difficulties with the task, partly because they were all fluent readers (including the 8-year-olds). This requirement was included in the advertisement of the study because the same children were also to participate in a second experiment which involved reading fluently a set of 120 sentences with varying main sentence stress.

The tokens were recorded and digitized at a 44.1-kHz sampling rate directly onto a hard disc drive. The testing took place at the university facilities in three locations: Western Carolina University, The Ohio State University and University of Wisconsin-Madison. At the universities in Ohio and Wisconsin, recordings took place in a sound attenuating booth. At Western Carolina University, participants were recorded in a quiet room designated for the experiment.

Acoustic measurements

Prior to acoustic analysis, the tokens were digitally filtered and downsampled to 11.025 kHz. Spectral measurements included the frequencies of F1 and F2. A measure of formant movement termed trajectory length was calculated from the F1 and F2 values sampled at multiple time points over the course of the vowel’s duration.

Formant frequencies

F1 and F2 frequencies were measured over the course of the vowel’s duration at five equidistant temporal locations corresponding to the 20–35–50–65–80% points in the vowel. Vowel onsets and offsets were located by hand, primarily on the basis of a waveform display with segmentation decisions checked against a spectrogram. The 20% and 80% points were used as the first and final spectral measurement locations to eliminate the immediate contextual effects on vowel transitions and to examine the vowel-inherent spectral change relatively unaffected by consonantal context. A 25-ms Hanning window was centered at each temporal location. F1 and F2 values (based on 14-pole LPC analysis) were extracted automatically using a custom MATLAB program which displayed these values along with the FFT and LPC spectra and a wideband spectrogram of the vowel. If needed, the formant values were verified using smoothed FFT spectra and wideband spectrograms with formant tracking (in the program TF32, Milenkovic, 2003). A reliability check was performed on all tokens by two researchers and any errors in formant estimation in LPC analysis were then hand-corrected.
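As an illustration of how the five measurement locations follow from the hand-marked vowel boundaries, the Python sketch below converts an onset and offset into the 20–35–50–65–80% time points and the edges of a 25-ms window centered on each. The function names are hypothetical, and the LPC formant estimation itself is not reproduced here.

```python
def measurement_times(onset_s, offset_s,
                      proportions=(0.20, 0.35, 0.50, 0.65, 0.80)):
    """Return time points (in seconds) at fixed proportions of vowel duration."""
    if offset_s <= onset_s:
        raise ValueError("vowel offset must be later than vowel onset")
    duration = offset_s - onset_s
    return [onset_s + p * duration for p in proportions]


def analysis_window(center_s, window_ms=25.0):
    """Return (start, end) in seconds of a window centered on center_s."""
    half = window_ms / 1000.0 / 2.0
    return (center_s - half, center_s + half)
```

For a vowel marked from 0.100 s to 0.300 s, the measurement points fall at roughly 0.14, 0.17, 0.20, 0.23 and 0.26 s.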

Trajectory length (TL)

TL represents a measure of formant movement which tracks formant frequency change more closely over the course of the vowel’s duration (see Fox & Jacewicz, 2009, for details). This measure utilizes the five equidistant measurement points in a vowel (20–35–50–65–80%) to calculate the total trajectory change. The TL is a sum of the lengths of the four separate vowel sections between the 20% and 80%-point, where the length of one vowel section (VSL) is:

\[ VSL_n = \sqrt{(F1_n - F1_{n+1})^2 + (F2_n - F2_{n+1})^2} \]

The assumption is that a longer TL reflects a greater amount of formant movement in a vowel, indicating greater diphthongization. The “length” of the formant trajectory, measured in Hz, here indicates the amount of frequency change only and has no temporal component, per se. As will become apparent in later sections of the paper, a vowel can still be considered “monophthongal” with TL values up to about 420 Hz, whereas in a true diphthong such as /oɪ/, TL values can well exceed 1000 Hz. In general, TLs of “diphthongal” vowels are expected to be longer than 620 Hz and “diphthongized” vowels will fall into the 420–620 Hz range, based on average values in the present data.
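The TL computation can be sketched in a few lines of Python. Each vowel section length is taken as the Euclidean distance in the F1 by F2 plane between consecutive measurement points, and TL is the sum of the four section lengths. The function names are hypothetical. The handling of values exactly at the approximate 420 Hz and 620 Hz boundaries is an assumption, since the text gives these cutoffs only as rough guides.

```python
import math


def section_length(f1_a, f2_a, f1_b, f2_b):
    """Length (Hz) of one vowel section between two measurement points."""
    return math.hypot(f1_a - f1_b, f2_a - f2_b)


def trajectory_length(f1, f2):
    """Sum of section lengths over consecutive (F1, F2) measurement points."""
    if len(f1) != len(f2) or len(f1) < 2:
        raise ValueError("need equal-length F1 and F2 lists with >= 2 points")
    return sum(section_length(f1[n], f2[n], f1[n + 1], f2[n + 1])
               for n in range(len(f1) - 1))


def describe_movement(tl_hz, mono_max=420.0, diph_min=620.0):
    """Rough label from TL; boundary handling is an assumption."""
    if tl_hz > diph_min:
        return "diphthongal"
    if tl_hz >= mono_max:
        return "diphthongized"
    return "monophthongal"
```

With five points per vowel, trajectory_length sums exactly four sections. A vowel whose F1 stays constant while F2 rises by 100 Hz between each pair of points would have a TL of 400 Hz.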

Vowel duration

Measurements of vowel duration were used as input for the automated measurement of formant frequencies described above. To measure vowel duration, vowel onsets and offsets were located by hand, primarily on the basis of a waveform display with segmentation decisions checked against a spectrogram. The location of the vowel onset was defined as the onset of periodicity in the waveform. Vowel offset was defined as that point when the amplitude of the waveform dropped significantly (to near zero), signaling the closure of /d/. Two additional cues were used to verify the location of the stop closure onset: the lack of high frequency energy (Pickett, 1999) and a relatively sinusoidal waveform which is low in amplitude and shows only slow variations (Olive, Greenwood & Coleman, 1993). All segmentation decisions were later checked and corrected during a reliability check.

Results

Vowel dispersion in the acoustic space

Figures 1, 2 and 3 display relative positions of all 13 vowels in the F1 × F2 plane, along with their dynamic formant patterns, produced by North Carolina (NC), Ohio (OH) and Wisconsin (WI) participants, respectively. The vowels are plotted following the sociolinguistic/sociophonetic tradition, in which the axes show F1 and F2 values in descending order. The vowels produced by children (boys and girls) are shown with reference to adults (males and females), who represent typical dialect-specific features of each regional variety. The displays are scaled to capture the systemic relations among vowels within each dialect. Given this purpose, the formant frequency values (in Hz) are not normalized and each plot is scaled appropriately for each age and gender group to render a comparable display¹. Specifically, the same scaling is used for all children’s vowels, a common scaling is used for vowels produced by women and another common scaling is used for men’s plots. Each vowel symbol is placed next to the 80% temporal measurement point in the vowel, indicating the direction of formant movement.

Figure 1.

Mean relative positions of 13 vowels in h_d context and their formant movement (F1 and F2) measured at five equidistant time points (20–35–50–65–80%) over the course of a vowel’s duration. Vowels are produced by speakers of the Southern dialect variety of American English spoken in western North Carolina: boys (n=16), girls (n=16), adult males (n=16) and adult females (n=16). Each vowel symbol is placed close to the 80%-point in a vowel, indicating the direction of formant movement.

Figure 2.

Mean relative positions of 13 vowels produced by speakers of the Midland dialect variety of American English spoken in central Ohio: boys (n=15), girls (n=16), adult males (n=13) and adult females (n=16). Each vowel symbol is placed close to the 80%-point in a vowel, indicating the direction of formant movement.

Figure 3.

Mean relative positions of 13 vowels produced by speakers of the Northern Midwestern dialect variety of American English spoken in southeastern Wisconsin: boys (n=15), girls (n=16), adult males (n=16) and adult females (n=16). Each vowel symbol is placed close to the 80%-point in a vowel, indicating the direction of formant movement.

The vowels produced by NC adults in Figure 1 show the characteristic positions of the Southern Shift: the proximity and considerable overlap of the /i, e, ɪ, ε/ set, the raised and fronted /æ/, very fronted /u/ and /o/ (relative to the onset of the diphthong /oɪ/), monophthongal version of /aɪ/ and distinct relative positions of /ɑ/ and /ɔ/. These characteristics can be found in both males and females. Compared with adults, children’s vowel systems reflect considerable reorganization, which is not in the expected direction of the Southern Shift. In particular, there is a separation of /i/ from /e, ɪ, ε/ and only a partial overlap of the latter, especially in boys. The vowel /æ/ undergoes lowering and backing, the monophthongal /aɪ/ becomes diphthongized (more in girls than in boys) and the vowel /ɔ/ changes both the pattern and direction of its formant movement. The diphthongs /oɪ/ and /aʊ/ show a comparatively greater amount of frequency change. These changes in the vowel system indicate that, in general, the Southern Shift is gradually receding in this dialect area. However, despite the differences, features of the southern dialect are clearly present in children’s vowel system, including the very fronted /u/ and /o/, the proximity and partial overlap of /e, ɪ, ε/ and the distinct character of the diphthongal /aɪ/ which, as we will see below, does not display as great an amount of formant change as in the OH and WI variants, at least when produced by boys.

The OH vowel system in adults (see Figure 2) presents a different set of dialect-specific features. The vowels /ɪ, ε, æ/ are more centralized, are more clearly separated from one another and do not overlap with /e/ as was the case in NC adults. The vowels /u/ and /o/ are fronted (although to a lesser extent than NC variants) and /ɑ/ and /ɔ/ are close to each other although they still represent two distinct vowel categories. The vowels /aɪ, oɪ/ are full diphthongs and have a greater amount of formant change compared to NC variants. As can be seen in Figure 2, these vowel characteristics are generally maintained in children. The two innovations introduced by children include reduction of spectral change (i.e., greater monophthongization) of /ɪ, ε, æ/ along with the lowering of /æ/ and a corresponding merger of /ɑ/ and /ɔ/ into one vowel category. This “low back merger” is prevalent in the productions of current young adults and children who grew up in central Ohio (Thomas, 2001, p. 94; Labov et al. 2006, pp. 263–271).

Finally, vowels produced by WI adults display yet a third set of dialect-specific features. As Figure 3 shows, the vowels /ε, æ/ are in close proximity. The greater formant movement in /æ/ indicates “Northern breaking” (Labov et al., 2006, chpt. 13), a characteristic mark of the Northern Cities Shift. The WI variant of /ɑ/ is positioned lower compared to both NC and OH. Furthermore, both /u/ and /o/ are far back vowels, which is in sharp contrast with the fronted variants in NC and OH. As can be seen, WI children produce more monophthongal versions of /ɪ, ε/ and a slightly raised /ɑ/. All other dialect features, including the Northern breaking in /æ/ and the back variants of /u, o/ remain as in the adults’ system.

In summary, the current data show considerable presence of dialect-specific features in children’s productions. A visual inspection of the vowel system in each dialect allows us to detect a remarkable correspondence between the systemic relations among vowels in the productions of adults and children. However, acoustic characteristics can also be found in children’s vowels which are absent in adults. These new features may be related to children’s participation in the process of language (or sound) change in their respective regional varieties. We will expand on these points in the General Discussion. Prior to this, the next section will examine formant movement in vowels to determine whether and to what extent a dialect-specific pattern of dynamic vowel characteristics is present in children’s productions compared with adults.

Formant dynamics

As seen in Figures 1–3, there is substantial variation in the formant dynamics of vowels in the regional varieties considered here. To gain a better understanding of the amount of spectral change over the course of a vowel’s duration, we examined changes in TL. Because the main interest of the study was to assess the effects of dialect, one-way ANOVAs were conducted for each individual vowel and this was done separately for each age and gender group. In these ANOVAs, dialect was the only between-subject factor. Significant gender effects were expected because we have chosen to work with unnormalized formant values. For all vowels produced by adult female speakers, we expected TLs to be longer than those of males because of the females’ shorter vocal tracts. The gender-related differences were expected to be smaller in children because anatomical differences between girls and boys at this prepubertal age are not as dramatic as those between adult females and males. By the same reasoning, we predicted generally longer TLs in children than in adults, although dialect-specific reduction of formant movement in children, at least for selected vowels, cannot be ruled out because of their participation in the process of sound change and systemic vowel shifts. We will address this point subsequently.

Initial analyses confirmed our expectations, showing on average longer TLs for adult females compared to adult males (581 Hz and 394 Hz, respectively) and longer TLs for children (666 Hz for girls and 617 Hz for boys). We now turn to the results of the ANOVAs which assessed the effects of dialect. In these analyses, the degrees of freedom for the F-tests were Greenhouse-Geisser adjusted to address significant violations of sphericity. We also report a measure of the effect size, partial eta squared (η²). Following the significance of the main effect, Scheffé’s multiple comparisons were used as post hoc tests. If children acquire dialect-specific patterns of formant dynamics, we expect to find similar cross-dialectal differences in vowel formant patterns in adults and in children.

Table 1 summarizes the statistical results for the effects of dialect in adult males and females. Listed are means (and standard deviations) for TLs in vowels produced by NC, OH and WI speakers, significance (p-value), effect size (η²) and the nature of significant differences between the dialects as revealed by the Scheffé procedure. As can be seen, the amount of spectral change was not significantly different across dialects for the three vowels /i, ɑ, o/, for both males and females, nor for /u/ and /aʊ/ in males. The remaining vowels show significant variation in their TLs as a function of dialect, pointing to several dialect-specific peculiarities.

Table 1.

Summary of statistical results from one-way ANOVAs for formant trajectory length (TL) for the effects of dialect in adults.

| Vowel | Token | Gender | NC | OH | WI | p-value | Partial eta squared | Significant difference (Scheffé) |
|---|---|---|---|---|---|---|---|---|
| /i/ | heed | M | 170 (51) | 167 (70) | 160 (41) | 0.882 | 0.006 | |
| /i/ | heed | F | 229 (77) | 215 (60) | 240 (94) | 0.648 | 0.019 | |
| /ɪ/ | hid | M | 398 (92) | 276 (91) | 359 (80) | 0.002 | 0.256 | OH<NC; OH<WI |
| /ɪ/ | hid | F | 647 (101) | 461 (86) | 577 (141) | <0.001 | 0.336 | OH<NC; OH<WI |
| /e/ | heyd | M | 454 (112) | 280 (60) | 237 (82) | <0.001 | 0.555 | OH<NC; WI<NC |
| /e/ | heyd | F | 661 (162) | 384 (83) | 339 (71) | <0.001 | 0.631 | OH<NC; WI<NC |
| /ε/ | head | M | 341 (102) | 222 (67) | 317 (61) | 0.001 | 0.297 | OH<NC; OH<WI |
| /ε/ | head | F | 590 (155) | 343 (71) | 509 (180) | <0.001 | 0.355 | OH<NC; OH<WI |
| /æ/ | had | M | 304 (107) | 255 (105) | 393 (89) | 0.002 | 0.255 | OH<WI |
| /æ/ | had | F | 563 (208) | 372 (111) | 608 (142) | <0.001 | 0.307 | OH<NC; OH<WI |
| /ɑ/ | hod | M | 242 (95) | 295 (100) | 258 (78) | 0.300 | 0.056 | |
| /ɑ/ | hod | F | 377 (128) | 407 (165) | 394 (52) | 0.794 | 0.010 | |
| /u/ | who’d | M | 223 (102) | 220 (126) | 221 (48) | 0.996 | 0.000 | |
| /u/ | who’d | F | 263 (125) | 254 (83) | 383 (112) | 0.002 | 0.240 | OH<WI; NC<WI |
| /ʊ/ | hood | M | 242 (84) | 342 (94) | 487 (90) | <0.001 | 0.591 | NC<OH; NC<WI; OH<WI |
| /ʊ/ | hood | F | 325 (135) | 451 (111) | 714 (149) | <0.001 | 0.613 | NC<OH; NC<WI; OH<WI |
| /o/ | hoed | M | 307 (94) | 260 (61) | 255 (66) | 0.121 | 0.096 | |
| /o/ | hoed | F | 444 (81) | 425 (105) | 367 (101) | 0.070 | 0.111 | |
| /ɔ/ | hawed | M | 271 (83) | 293 (126) | 391 (78) | 0.002 | 0.250 | NC<WI; OH<WI |
| /ɔ/ | hawed | F | 378 (94) | 423 (132) | 492 (116) | 0.026 | 0.150 | NC<WI |
| /aɪ/ | hide | M | 229 (123) | 777 (168) | 837 (177) | <0.001 | 0.772 | NC<OH; NC<WI |
| /aɪ/ | hide | F | 512 (330) | 1108 (140) | 1124 (178) | <0.001 | 0.619 | NC<OH; NC<WI |
| /ɔɪ/ | hoyd | M | 841 (212) | 1138 (150) | 1134 (183) | <0.001 | 0.383 | NC<OH; NC<WI |
| /ɔɪ/ | hoyd | F | 1434 (245) | 1650 (180) | 1503 (254) | 0.032 | 0.142 | NC<OH |
| /aʊ/ | howed | M | 621 (133) | 628 (110) | 541 (90) | 0.072 | 0.118 | |
| /aʊ/ | howed | F | 929 (219) | 886 (126) | 668 (111) | <0.001 | 0.357 | WI<OH; WI<NC |

Shown are means (SD) in Hz, significance (p-values), effect size (partial eta squared) and significant differences between the dialects obtained from Scheffé’s pairwise comparisons. NC = North Carolina, OH = Ohio, WI = Wisconsin, M = male, F = female.
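As a rough consistency check, the F statistic and effect size of a one-way between-subject ANOVA can be approximated from the group means and standard deviations listed in Table 1. The sketch below is illustrative only: the group sizes are not given in this section, so equal groups of 16 speakers are an assumption, and the helper name `anova_from_summary` is hypothetical. In a design with a single factor, partial eta squared coincides with eta squared.

```python
from math import fsum


def anova_from_summary(means, sds, ns):
    """One-way between-subject ANOVA from group means, SDs and group sizes."""
    k = len(means)
    n_total = sum(ns)
    grand_mean = fsum(n * m for n, m in zip(ns, means)) / n_total
    # Between-group sum of squares: weighted squared deviations of group means.
    ss_between = fsum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group sum of squares recovered from the sample SDs.
    ss_within = fsum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With one factor, partial eta squared equals SS_between / SS_total.
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq


# TL means and SDs (Hz) for /e/ in adult males, copied from Table 1 (NC, OH, WI).
means = [454, 280, 237]
sds = [112, 60, 82]
ns = [16, 16, 16]  # ASSUMPTION: equal group sizes; the true sizes are not stated here.

f_stat, df1, df2, eta_sq = anova_from_summary(means, sds, ns)
print(f"F({df1}, {df2}) = {f_stat:.2f}, partial eta squared = {eta_sq:.3f}")
```

Under these assumptions the effect size comes out at about 0.55, close to the 0.555 reported for /e/ in adult males. The small discrepancy is expected if the actual group sizes were unequal.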

In particular, OH variants of /ɪ, ε, æ/ have the smallest amount of spectral change, NC /e/ and WI /ɔ/ are most “diphthongal” and each dialectal variant of /ʊ/ differs significantly from the others, so that WI /ʊ/ has the greatest amount of change followed by OH and NC, respectively. As for the diphthongs, NC /aɪ/ is produced as a monophthong (which is in sharp contrast with both OH and WI variants) and NC /oɪ/ has a lesser amount of spectral change in comparison with the variants in both OH (in both males and females) and in WI (in males only). WI /aʊ/ differs significantly from both OH and NC variants in females due to the fact that its formant change is greater in F1 than in F2. WI males follow the same pattern of formant change in /aʊ/, although these differences did not reach significance.

How are these cross-dialectal patterns reflected in children’s productions? Table 2 provides a summary of the statistical results for the effects of dialect in boys and girls. The following results are consistent with the adults: (1) no significant differences for /i, ɑ/ in boys and girls and no significant differences for /u, aʊ/ in boys; (2) the same pattern of significant cross-dialectal differences for /e/ in boys and girls, for /ɔ/ in girls, and for /aɪ/ in boys. The results are partially consistent for /ɪ, æ/: two new significant differences were found in children due to the fact that WI /ɪ/ and NC /æ/ became monophthongized in their productions. Also partially consistent are the results for /ʊ/; although children’s means still show the longest TLs for WI followed by OH and NC, respectively, the differences between OH and WI boys and NC and OH girls were too small to reach significance.

Table 2.

Summary of statistical results from one-way ANOVAs for formant trajectory length (TL) for the effects of dialect in children.

| Vowel | Token | Gender | NC | OH | WI | p-value | Partial eta squared | Significant difference (Scheffé) |
|---|---|---|---|---|---|---|---|---|
| /i/ | heed | M | 303 (77) | 308 (85) | 279 (93) | 0.601 | 0.023 | |
| /i/ | heed | F | 316 (61) | 316 (97) | 337 (160) | 0.824 | 0.009 | |
| /ɪ/ | hid | M | 467 (158) | 273 (60) | 287 (84) | <0.001 | 0.415 | OH<NC; WI<NC |
| /ɪ/ | hid | F | 477 (238) | 277 (64) | 263 (77) | <0.001 | 0.314 | OH<NC; WI<NC |
| /e/ | heyd | M | 685 (141) | 465 (156) | 393 (129) | <0.001 | 0.446 | OH<NC; WI<NC |
| /e/ | heyd | F | 736 (164) | 503 (144) | 473 (117) | <0.001 | 0.418 | OH<NC; WI<NC |
| /ε/ | head | M | 365 (144) | 241 (63) | 318 (86) | 0.006 | 0.207 | OH<NC |
| /ε/ | head | F | 323 (140) | 272 (74) | 324 (68) | 0.245 | 0.061 | |
| /æ/ | had | M | 346 (96) | 303 (67) | 565 (215) | <0.001 | 0.415 | OH<WI; NC<WI |
| /æ/ | had | F | 366 (107) | 372 (103) | 594 (144) | <0.001 | 0.456 | OH<WI; NC<WI |
| /ɑ/ | hod | M | 589 (194) | 598 (111) | 507 (111) | 0.167 | 0.078 | |
| /ɑ/ | hod | F | 682 (181) | 568 (154) | 568 (186) | 0.114 | 0.092 | |
| /u/ | who’d | M | 417 (132) | 413 (133) | 352 (120) | 0.303 | 0.053 | |
| /u/ | who’d | F | 405 (130) | 383 (118) | 338 (92) | 0.253 | 0.059 | |
| /ʊ/ | hood | M | 358 (152) | 512 (156) | 587 (150) | <0.001 | 0.294 | NC<OH; NC<WI |
| /ʊ/ | hood | F | 457 (117) | 561 (154) | 711 (130) | <0.001 | 0.392 | OH<WI; NC<WI |
| /o/ | hoed | M | 467 (111) | 367 (116) | 430 (106) | 0.047 | 0.130 | OH<NC |
| /o/ | hoed | F | 515 (107) | 399 (84) | 452 (107) | 0.008 | 0.194 | OH<NC |
| /ɔ/ | hawed | M | 503 (164) | 552 (158) | 605 (152) | 0.216 | 0.067 | |
| /ɔ/ | hawed | F | 518 (153) | 565 (170) | 687 (170) | 0.017 | 0.166 | NC<WI |
| /aɪ/ | hide | M | 829 (308) | 1201 (231) | 1192 (213) | <0.001 | 0.333 | NC<OH; NC<WI |
| /aɪ/ | hide | F | 1127 (319) | 1152 (233) | 1306 (218) | 0.119 | 0.090 | |
| /ɔɪ/ | hoyd | M | 1631 (433) | 1723 (328) | 1737 (227) | 0.643 | 0.020 | |
| /ɔɪ/ | hoyd | F | 1901 (232) | 1803 (241) | 1968 (294) | 0.200 | 0.069 | |
| /aʊ/ | howed | M | 1030 (188) | 939 (205) | 907 (161) | 0.171 | 0.077 | |
| /aʊ/ | howed | F | 1003 (146) | 953 (166) | 1013 (189) | 0.564 | 0.025 | |

Shown are means (SD) in Hz, significance (p-values), effect size (partial eta squared) and significant differences between the dialects obtained from Scheffé’s pairwise comparisons. NC = North Carolina, OH = Ohio, WI = Wisconsin, M = male, F = female.
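To illustrate how the pairwise entries in the last column of Table 2 can arise, the following sketch applies the Scheffé criterion to summary statistics. It is not the analysis actually run for this study. Equal groups of 16 children are an assumption (group sizes are not stated in this section), the critical value of about 3.20 for F(2, 45) at the 0.05 level is an approximate tabled value, and the function name `scheffe_pairs` is hypothetical.

```python
def scheffe_pairs(labels, means, sds, ns, f_crit):
    """Scheffé pairwise comparisons computed from group summary statistics."""
    k = len(means)
    df_within = sum(ns) - k
    # Pooled within-group variance (mean square error).
    ms_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds)) / df_within
    results = {}
    for i in range(k):
        for j in range(i + 1, k):
            diff = means[i] - means[j]
            # Contrast F divided by (k - 1), to be compared with the
            # critical F value for (k - 1, N - k) degrees of freedom.
            f_s = diff ** 2 / (ms_within * (1 / ns[i] + 1 / ns[j]) * (k - 1))
            results[(labels[i], labels[j])] = (f_s, f_s > f_crit)
    return results


# TL means and SDs (Hz) for /ʊ/ in boys, copied from Table 2.
labels = ["NC", "OH", "WI"]
means = [358, 512, 587]
sds = [152, 156, 150]
ns = [16, 16, 16]   # ASSUMPTION: equal group sizes, not stated in this section.
F_CRIT = 3.20       # ASSUMPTION: approximate critical F(2, 45) at alpha = 0.05.

results = scheffe_pairs(labels, means, sds, ns, F_CRIT)
for pair, (f_s, significant) in results.items():
    print(pair, round(f_s, 2), "significant" if significant else "n.s.")
```

Under these assumptions, NC differs from both OH and WI while OH and WI do not differ, which agrees with the pattern listed for /ʊ/ in boys (NC<OH; NC<WI).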

The inconsistencies with the adults include: (1) no significant differences for /ε/ in girls due to a considerable reduction of its formant movement (i.e., monophthongization of /ε/) in children; (2) no significant differences for /u/ in girls as a result of TL reduction in the WI variant; (3) significant differences between OH and NC boys and girls for /o/ which arose due to some reduction in TLs in OH children; (4) no significant differences for /ɔ/ in boys; (5) no significant differences for the diphthongs: /oɪ/ in boys and girls, /aɪ/ and /aʊ/ in girls.

The question arises whether the differences between the adults and children may result, at least partially, from developmental factors rather than from dialect-specific sound change over time. It is the case that most studies on vowel development in typically developing children maintain that vowels are acquired relatively early, by 3 or 5 years of age (e.g., Lieberman, 1980; Pollock & Keiser, 1990; Stoel-Gammon & Beckett Herrington, 1990; Templin, 1958; Winitz & Erwin, 1958). In terms of formant frequencies, it has been shown that children’s values, despite being higher, show spectral patterning comparable with that of adults. The higher formant frequencies in children are primarily a product of their shorter supralaryngeal vocal tracts (Lieberman, 1980). In the present study, we found that standard deviations in children are slightly larger compared to adults which would indicate greater variation in their formant frequency values. This is somewhat expected given that 8-year-olds may still have comparatively shorter vocal tracts than 12-year-olds and this age difference will produce some differences in the working vowel spaces of children in terms of absolute formant values (cf. Figures 2 and 3 in Vorperian & Kent, 2007) but not in terms of relative positions of the vowels in the vowel space. To examine if this assumption holds true for the present study, the general means for the children’s data shown in Figures 1, 2 and 3 were redistributed into the means for each year of age, from eight to twelve. The set of figures displaying these values for each dialect is included in the Appendix.

It is now clear that developmental factors may come into play in the form of elevated F2 and F1 values for the youngest children. For example, the vowels /i/ and /e/ and the offset of /oɪ/ in the upper left corner of the vowel space are fronted (i.e., have higher F2 values) in 8- and 9-year-olds and these high F2 values decrease with age. The back of the vowel space shows more variation although generally higher F2 in /u/ and /o/ can be found primarily in younger children. However, despite the shifts in F1 and F2, there is a robust dialect-specific dispersion pattern of vowels in the vowel space of children at each age and this general pattern corresponds to the basic patterning in the vowel space found in the adult speakers of each dialect.

Based on the analysis of formant movement (TL) presented above, it is clear that dialect-specific patterns of formant dynamics found in the adults are still present in children’s productions. For some vowels, there is a remarkable correspondence in terms of both the amount of spectral change and trajectory shape. In general, two changes in children’s pronunciation can be detected across dialects which diverge from adults’ productions: (1) reduction of formant movement (i.e., increased monophthongization) in selected vowels and (2) a more uniform production of all three diphthongs (i.e., /aɪ, aʊ, oɪ/), in which the dialectal differences seen in the adult speakers begin to disappear.

The current acoustic results provide evidence for regional distinctiveness of American English vowel systems. Figures 4 and 5 illustrate several patterns for selected vowels to substantiate this claim. As will be shown, cross-dialectal positional differences are well maintained in children in spite of new vowel features such as significant reductions in the amount of spectral change or changes in formant trajectory shape. Furthermore, even if the amount of formant movement does not differ significantly across dialects, children’s displays indicate their knowledge of dialect-specific vowel positions in their respective vowel systems.

Figure 4.

Mean relative positions of /e, æ, ɑ, u/ displayed cross-dialectally for children and adults. Data are redrawn from Figures 1, 2 and 3.

Figure 5.

Mean relative positions of /ε, ʊ, o/ displayed cross-dialectally for children and adults. Data are redrawn from Figures 1, 2 and 3.

Considering the cross-dialectal variants of vowel /e/ in Figure 4, we expect significant dialectal differences in their TLs as listed in Tables 1 and 2: NC /e/ had significantly greater formant movement compared to either OH or WI and the latter two did not differ significantly from each other. This pattern can be seen clearly in Figure 4, both in adults and in children. In addition, we also find that boys follow the same general cross-dialect dispersion as adult males (i.e., the NC variant is separated from both OH and WI, which are overlapping) and girls the same dispersion as adult females (i.e., all three variants tend to overlap). This gender-related observation should be interpreted with caution, however, and further detailed studies are needed to explore the role of gender in the acquisition of selected vowels. The second vowel in the plots, /æ/, has three very different dialect variants in adults: the trajectory shape of NC /æ/ signals the Southern drawl or breaking, where the vowel “breaks” into two parts (Sledd, 1966), the WI variant shows Northern breaking (Labov et al., 2006) and the OH /æ/ is most “monophthongal,” particularly in OH females. Children’s plots indicate that their vowels still retain the cross-dialect positional differences despite a drastic reduction of formant movement in NC and OH variants and the apparent disappearance of Southern breaking in NC children.

The two remaining vowels in Figure 4, /ɑ/ and /u/, provide further evidence that children learn dialect-specific vowel positions. As listed in Tables 1 and 2, TLs for /ɑ/ did not differ significantly as a function of dialect, either in adults or in children. However, the pattern of cross-dialectal positional differences remains the same in each age group. The same is generally true for /u/, whose regional variants produced by adults and children also show remarkable consistencies in their trajectory shapes.

Three further examples of cross-dialect variation in children’s vowels relative to adults are shown in Figure 5. First, the respective locations of cross-dialect variants of /ε/ are well maintained in children despite considerable reduction in their formant movement in all three dialects, which is in line with the changes observed in /æ/ in Figure 4. Second, the vowel /ʊ/ exemplifies an inverse scenario in which the cross-dialectal pattern of formant movement found in adults is maintained in children but there is some variation in relative positions of OH and WI variants across age and gender groups. Finally, the cross-dialect renditions of /o/ clearly indicate that children are aware of the dialect-specific pronunciation of the vowel which can be inferred from both the position and formant trajectory shape of each dialect variant.

Vowel duration

Given the nature of the experimental task (i.e., citation-form words produced in isolation), we did not expect substantial effects of dialect, age and gender on vowel duration. However, significant differences may still show up for some vowels which will help us detect whether children also learn dialect-specific temporal vowel characteristics. A repeated-measures ANOVA was used to examine the variation in vowel duration. In this ANOVA, vowel was included as a within-subject factor and the between-subject factors were dialect, age and gender. Not surprisingly, there was a significant main effect of vowel (F(12, 2112)=545.29, p<0.001, η²=0.756), reflecting differences in intrinsic vowel duration, which is a well-known property of vowels (e.g., Peterson & Lehiste, 1960). However, there were only marginal effects for the between-subject factors. The effect of gender was not significant. The effect of age was significant but its effect size was very small (F(1, 176)=4.49, p=0.035, η²=0.025). The means showed that children produced slightly longer vowels than adults (297 and 280 ms, respectively). Dialect was significant but again, its effect size was small (F(2, 176)=5.04, p=0.007, η²=0.054). A subsequent exploration of the effect of dialect using one-way ANOVAs and Scheffé’s comparisons as post hoc tests showed that significant differences were found only for selected vowels as listed in Table 3. Interestingly, the pattern of cross-dialectal differences in vowel duration for the vowels /ɪ, ε, ʊ/ is consistent for the adults and children. This finding suggests that children may also acquire temporal characteristics of vowels in their dialect. However, a more appropriate design is needed to further explore the acquisition of vowel duration, which should be tested in larger passages of speech including sentences and spontaneous speech.
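The effect sizes quoted above can be cross-checked against the F statistics using the standard identity relating partial eta squared to F and its degrees of freedom, η² = (F · df_effect) / (F · df_effect + df_error). The short sketch below applies that identity to the three values reported in the preceding paragraph; the function name is hypothetical.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# (effect, F, df_effect, df_error, reported partial eta squared) from the text.
reported = [
    ("vowel", 545.29, 12, 2112, 0.756),
    ("age", 4.49, 1, 176, 0.025),
    ("dialect", 5.04, 2, 176, 0.054),
]

for name, f_stat, df_effect, df_error, eta_reported in reported:
    eta = partial_eta_squared(f_stat, df_effect, df_error)
    print(f"{name}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

All three computed values agree with the reported ones to three decimal places, which also makes the contrast concrete: the vowel effect accounts for most of the variance, whereas age and dialect account for only a few percent each.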

Table 3.

Summary of statistical results from one-way ANOVAs for vowel duration for the effects of dialect in adults and children.

| Age group | Vowel | Token | NC | OH | WI | p-value | Partial eta squared | Significant difference (Scheffé) |
|---|---|---|---|---|---|---|---|---|
| Adults | /ɪ/ | hid | 254 (45) | 205 (61) | 185 (37) | <0.001 | 0.278 | OH<NC; WI<NC |
| Adults | /ε/ | head | 281 (51) | 213 (59) | 199 (42) | <0.001 | 0.345 | OH<NC; WI<NC |
| Adults | /ʊ/ | hood | 269 (49) | 214 (66) | 192 (40) | <0.001 | 0.291 | OH<NC; WI<NC |
| Children | /ɪ/ | hid | 256 (58) | 224 (44) | 193 (45) | <0.001 | 0.220 | OH<NC; WI<NC |
| Children | /ε/ | head | 273 (49) | 235 (44) | 206 (53) | <0.001 | 0.243 | OH<NC; WI<NC |
| Children | /ʊ/ | hood | 278 (59) | 236 (44) | 214 (59) | <0.001 | 0.194 | OH<NC; WI<NC |
| Children | /o/ | hoed | 339 (66) | 321 (54) | 296 (62) | 0.021 | 0.080 | WI<NC |

Shown are means (SD) in ms, significance (p-values), effect size (partial eta squared) and significant differences between the dialects obtained from Scheffé’s pairwise comparisons. NC = North Carolina, OH = Ohio, WI = Wisconsin.

General discussion

This study examined relative positions and dynamic formant patterns of vowels in three distinct regional varieties of American English as produced by 8- to 12-year-old children and 51- to 65-year-old adults. Of interest was whether the vowel systems of children retain dialect-specific features found in the vowel systems of adults in terms of general dispersion pattern and the amount of vowel-inherent spectral change. The results demonstrate that regional characteristics are clearly present in children’s vowels, even when elicited in their most conservative citation forms (i.e., words produced in isolation). This indicates that children acquire and reproduce the dialect features to a large extent.

As already mentioned, all participants of the study were born, raised and spent most of their lives in their respective speech communities. From background information collected at the time of testing we found that they did not travel extensively out-of-state. Thus, most of the language input to both adults and children was locally based, which no doubt included both strong regional features as well as new forms through interactions with newcomers to the area, mass media or various forms of social networks. It needs to be pointed out that the speech communities in both central Ohio and southeastern Wisconsin are to some extent influenced by new populations of students and academic staff associated with two big research universities, The Ohio State University and University of Wisconsin-Madison. In addition to being in close proximity to Western Carolina University, the western North Carolina community is located close to the Great Smoky Mountains National Park, which for years has been a popular tourist destination during the summer and autumn months and, more recently, experienced an increased in-migration of new residents from other states. It is therefore noteworthy that the adults in all three locations retained their regional vowel characteristics and the children were greatly influenced by the dialect background of adults despite their increased exposure to non-local forms. This clearly underscores the role of local influences on linguistic behavior of children, be it through input provided by adults such as parents and grandparents or by peers who speak the local dialect (cf. Roberts 1997b; 2002; Labov, 1991; Kerswill & Williams, 2000; Stanford, 2008). This point was recently emphasized by Labov (2010), who views the acquisition of cultural values as an important component in mastering linguistic community patterns. He gives an example of high school students from the Midland (Columbus) who do not accept linguistic norms of their peers from the Northern Cities Shift area (Cleveland) because, for the Midland students, their Cleveland peers are not models of linguistic influence.

This study is the first to show that learning a regional vowel category involves learning the dialect-specific patterns of vowel-inherent spectral change. The present results provide ample evidence that children do acquire regionally-defined dynamic vowel characteristics. The measure of spectral change employed in this study, TL, by no means provides an exhaustive characterization of the formant movement in time as it lacks a true temporal component. In addition, it fails to account for the direction of movement. Nonetheless, we found that a 5-point measurement system can still produce a good estimate of the actual trajectory length and a reasonable characterization of the trajectory shape (see Fox & Jacewicz, 2009 for evidence in favor of TL and a discussion of its limitations). Comparing the TL measure with other approaches, it was shown in that study that the onset-to-offset measurement used in the literature to calculate the amount of spectral change, so-called “vector length” (e.g., Ferguson & Kewley-Port, 2002; Hillenbrand et al., 1995; Hillenbrand & Nearey, 1999), provides a reasonable measure of spectral change only when the diphthong is moving in a relatively straight line in the acoustic vowel space. However, the vector length measure severely underestimates the amount of formant change in the case of more complex formant curves as shown here.
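The contrast between the two measures can be made concrete with a small sketch. It assumes that TL is the sum of the Euclidean distances between consecutive (F1, F2) measurement points and that vector length is the Euclidean distance between the first and last points. The formant values below are hypothetical and chosen only to contrast a straight trajectory with a curved one.

```python
from math import hypot


def trajectory_length(points):
    """Sum of Euclidean distances between consecutive (F1, F2) points, in Hz."""
    return sum(
        hypot(f1_b - f1_a, f2_b - f2_a)
        for (f1_a, f2_a), (f1_b, f2_b) in zip(points, points[1:])
    )


def vector_length(points):
    """Euclidean distance between the first and last (F1, F2) points, in Hz."""
    (f1_a, f2_a), (f1_b, f2_b) = points[0], points[-1]
    return hypot(f1_b - f1_a, f2_b - f2_a)


# HYPOTHETICAL five-point (F1, F2) trajectories in Hz, not measured data.
straight = [(700, 1200), (650, 1400), (600, 1600), (550, 1800), (500, 2000)]
curved = [(600, 1800), (650, 1900), (700, 1950), (680, 1850), (620, 1780)]

for name, pts in (("straight", straight), ("curved", curved)):
    print(name, round(trajectory_length(pts)), round(vector_length(pts)))
```

For the straight trajectory the two measures coincide. For the curved one, which turns back toward its starting region, the onset-to-offset distance is a small fraction of the path actually travelled, which is the underestimation described above.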

When measuring vowel movement in the F1 × F2 vowel space, the question naturally arises as to how many measurement points are needed to provide a reasonable estimate of the vowel’s position in the space as well as either the length or the shape of its trajectory. A single point (e.g., the 50% point) will provide information about the vowel’s location in the space, although it will not provide any information about the amount of spectral change. Two measurement points (e.g., the 20% and 80% points) will provide some information about spectral change, but unless the movement is in a straight line, two points will not provide a good estimate of either the length of the trajectory or its shape. Some studies have used three points, ours uses five points, while others have used denser sampling (e.g., Van Son & Pols, 1992).
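The effect of sampling density can be sketched with a synthetic trajectory. The curve below is entirely hypothetical (F1 rises linearly while F2 rises and then falls) and the function names are illustrative. The trajectory is sampled at equidistant points between 20% and 80% of the vowel, and the length of the resulting polyline is computed for 2, 3, 5 and 61 points.

```python
from math import hypot, pi, sin


def formants_at(t):
    """HYPOTHETICAL smooth trajectory; t is the proportion of vowel duration (0-1)."""
    f1 = 500 + 200 * t               # F1 rises linearly
    f2 = 1800 + 300 * sin(pi * t)    # F2 rises, then falls back
    return f1, f2


def polyline_length(n_points, start=0.2, end=0.8):
    """Path length (Hz) through n_points (>= 2) equidistant samples of the curve."""
    times = [start + (end - start) * i / (n_points - 1) for i in range(n_points)]
    pts = [formants_at(t) for t in times]
    return sum(hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(pts, pts[1:]))


lengths = {n: polyline_length(n) for n in (2, 3, 5, 61)}
for n, length in lengths.items():
    print(f"{n:>2} points: {length:.1f} Hz")
```

With two points only the net F1 change of 120 Hz is captured, because F2 returns to its starting value. Three points recover most of the path, and five points come close to the densely sampled estimate. How well this generalizes depends on the shape of the real trajectories, so the example should be read as an illustration of the argument rather than as a validation of any particular number of points.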

While using the TL measure derived from a 5-point measurement, we were able to assess the amount of spectral change in a vowel as a function of speaker dialect. We found that, for some vowels, cross-dialect differences in the amount of formant movement found in adults can be large and similar variation was evident in children (compare the TL results for /e/ and /ʊ/). This indicates that children do acquire dialect-specific patterns of formant dynamics when learning vowel categories in their first language. Another finding was that even if the amount of spectral change did not differ significantly across dialects, children’s productions showed similar dialect-specific formant trajectory shapes as found in adults (e.g., in /u/ and /o/).

However, we also found a significant reduction of formant movement in children’s /ɪ, ε, æ/ (except for NC /ɪ/ and WI /æ/) along with lowering of these vowels in the acoustic space. This new trend in children, which was absent in adults, is an indication of children’s participation in language change as argued in the sociolinguistic literature (see Roberts, 2002, for a review). In particular, the positional change of these three vowels is most likely related to systemic (i.e., system-driven) changes in English known as vowel shifts. The present data support the operation of a key historical principle of vowel chain shifting (Sievers, 1881), reformulated more recently by Labov (1994) as “Principle II,” which states that in chain shifts, lax nuclei fall along a non-peripheral track. Although the variety spoken in central Ohio is not generally reported to be participating in any vowel chain shifts at present, we clearly see in children’s data a positional change in these three vowels in accord with Principle II. A new finding in the present study is that this positional change is accompanied by a reduction of formant movement, with the vowels /ɪ, ε, æ/ becoming more monophthongal in children’s production compared to adults.

Another set of new features in children found in this study includes a systemic change in OH children’s vowel systems called the low back merger, ɑ > ɔ (Labov, 1994; Thomas, 2001). This merger is becoming a standard feature in Canadian English and is spreading rapidly in the United States (Chambers, 1992; Labov et al., 2006). Another systemic change, known as the æ > a Backing, can be found particularly in OH and NC girls and seems to be related to monophthongization and lowering of /æ/ in their productions. This feature is still absent in WI children, most likely due to the extensive formant movement in /æ/ which is common among WI speakers. Finally, we found that children’s diphthongs /oɪ, aɪ, aʊ/ exhibit greater formant movement than those of adults and that dialect differences are gradually disappearing in their productions.

We need to emphasize that none of these new features in children’s vowel systems should be viewed as developmental, which is evident from additional children’s plots included in the Appendix. That is, whether we examine general means of 16 speakers across the ages 8–12 yrs or the means of two speakers (or just a single speaker) broken down by each year of age, we find the same general patterns of dialect-specific positional changes and dialect-specific patterns of formant dynamics. Neither individual variation nor a narrower breakdown by age obscures these general patterns. The differences between 8-year-olds and 12-year-olds are found in absolute frequency differences in F1 and F2, not in the patterning of the vowels within the vowel space. Moreover, we also found dialect-specific differences in vowel duration for selected vowels in children, which provide further evidence of the acquisition of dialect-specific vowel features.

In this study, male and female participants were used in order to compare their dialect-specific vowel systems and detect any potential gender-related differences in vowel dispersion and formant dynamics. Overall, only small differences were found for individual vowels, indicating that vowel systems of male and female speakers are generally similar in each dialect. As listed in Table 1, some differences in the amount of formant change were found for five vowels in adults’ productions. In particular, Scheffé’s comparisons showed that significant cross-dialectal differences in TLs varied slightly between males and females. As for children’s productions, Table 2 shows that such gender-related variation was found only for three vowels. However, we can also detect some positional differences between male and female vowels, particularly in NC vowel system, which is undergoing greater reorganization than OH and WI vowel systems. Tracing the positional vowel changes in male and female vowel spaces in NC, it becomes apparent that both adult females and girls manifest more advanced changes compared to male speakers, thus supporting the sociolinguistic position of females as the leaders in linguistic change, especially unconscious ones or “changes from below” (Gauchat, 1905; Trudgill, 1974; Chambers, 1995; Labov, 2001).

The findings of this work may prove helpful in better understanding and interpretation of the development of vowel spaces in children. Most importantly, the study directs attention to regional variation in vowel systems of American English and the acquisition of regional features by children, including dialect-specific patterns of formant dynamics. This study documented regional variation in children’s vowel systems recorded in the years 2006–2008, at a particular time in the history of American English. It will be of interest to future investigators who will inquire into the development of vowel systems in children yet to be born. Future studies of that type will determine how much regional vowel systems have changed over time and which dialect-specific features will be transmitted to younger generations.

Acknowledgments

This work was supported by research grant No. R01 DC006871 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health.

Appendix 1 figure.

Mean F1 and F2 values for NC boys redistributed for each year of age.

Appendix 2 figure.

Mean F1 and F2 values for NC girls redistributed for each year of age.

Appendix 3 figure.

Mean F1 and F2 values for OH boys redistributed for each year of age.

Appendix 4 figure.

Mean F1 and F2 values for OH girls redistributed for each year of age.

Appendix 5 figure.

Mean F1 and F2 values for WI boys redistributed for each year of age.

Appendix 6 figure.

Mean F1 and F2 values for WI girls redistributed for each year of age.

Footnotes

1

Normalization procedures such as that developed by Lobanov (1971) are generally used to eliminate variation in the acoustic characteristics between speakers related to vocal tract length. However, this procedure eliminates any direct correspondence with the measured formant values in Hz. Given that we need to compare our results with those in the literature, which are overwhelmingly described using non-normalized Hz values, we decided to use the Hz values for further considerations and analyses of the dynamic formant movement.
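The trade-off described in this footnote can be illustrated with a minimal sketch of Lobanov-style normalization, in which each formant value is converted to a z-score relative to that speaker’s own mean and standard deviation for the formant. The speakers and F1 values are hypothetical, and the use of the sample standard deviation is an implementation choice made here for illustration.

```python
from statistics import mean, stdev


def lobanov(values):
    """z-score one speaker's values for a single formant (speaker-intrinsic)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# HYPOTHETICAL F1 values (Hz) for four vowels from an adult, and from a child
# whose formants are uniformly 20% higher because of a shorter vocal tract.
adult_f1 = [300, 450, 600, 750]
child_f1 = [f * 1.2 for f in adult_f1]

adult_z = lobanov(adult_f1)
child_z = lobanov(child_f1)
print([round(z, 3) for z in adult_z])
print([round(z, 3) for z in child_z])
```

The two speakers receive identical normalized values even though their raw frequencies differ, so the uniform scaling attributable to vocal tract length is removed. The normalized values are unitless, however, so they can no longer be compared directly with Hz values reported elsewhere.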

Contributor Information

Ewa Jacewicz, The Ohio State University.

Robert Allen Fox, The Ohio State University.

Joseph Salmons, University of Wisconsin-Madison.

References

  1. Assmann PF, Katz WF. Time-varying spectral change in the vowels of children and adults. Journal of the Acoustical Society of America. 2000;108:1856–1866. doi: 10.1121/1.1289363.
  2. Bayley R. Consonant cluster reduction in Tejano English. Language Variation and Change. 1994;6:303–326.
  3. Bennett S. Vowel formant frequency characteristics of preadolescent males and females. Journal of the Acoustical Society of America. 1981;69:231–238. doi: 10.1121/1.385343.
  4. Busby PA, Plant GL. Formant frequency values of vowels produced by preadolescent boys and girls. Journal of the Acoustical Society of America. 1995;97:2603–2606. doi: 10.1121/1.412975.
  5. Chambers JK. Dialect acquisition. Language. 1992;68:673–705.
  6. Chambers JK. Sociolinguistic theory. Oxford: Blackwell; 1995.
  7. Chambers JK. Sociolinguistic theory: Linguistic variation and its social significance. 2. Oxford: Blackwell; 2003.
  8. Clopper CG, Pisoni D, de Jong K. Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of the Acoustical Society of America. 2005;118:1661–1676. doi: 10.1121/1.2000774.
  9. Cox F. Vowel change in Australian English. Phonetica. 1999;56:1–27. doi: 10.1159/000028438.
  10. Durian D. Doctoral dissertation. The Ohio State University; Columbus, OH: 2010. A new perspective on vowel variation across the 20th century in Columbus, OH.
  11. Eckert P. Linguistic variation as social practice. Oxford: Blackwell; 1999.
  12. Ferguson SH, Kewley-Port D. Vowel intelligibility in clear and conversational speech for normal-hearing and hearing-impaired listeners. Journal of the Acoustical Society of America. 2002;112:259–271. doi: 10.1121/1.1482078.
  13. Fitch WT, Giedd J. Morphology and development of the human vocal tract: A study using magnetic resonance imaging. Journal of the Acoustical Society of America. 1999;106:1511–1522. doi: 10.1121/1.427148.
  14. Foulkes P, Docherty G. Urban Voices: Accent studies in the British Isles. London: Arnold; 1999.
  15. Fox RA, Jacewicz E. Cross-dialectal variation in formant dynamics of American English vowels. Journal of the Acoustical Society of America. 2009;126:2603–2618. doi: 10.1121/1.3212921.
  16. Gauchat L. Aus romanischen Sprachen und Literaturen: Festschrift for Heinrich Morf. Halle: Niemeyer; 1905. L’unité phonétique dans le patois d’une commune; pp. 175–232.
  17. Gordon E, Campbell L, Hay J, Maclagan M, Sudbury A, Trudgill P. New Zealand English: Its origin and evolution. Cambridge: Cambridge University Press; 2004.
  18. Guy G. Variation in the group and the individual: The case of final stop deletion. In: Labov W, editor. Locating language in time and space. New York: Academic Press; 1980. pp. 1–36.
  19. Hillenbrand JM, Nearey TM. Identification of resynthesized /hVd/ utterances: Effects of formant contour. Journal of the Acoustical Society of America. 1999;105:3509–3523. doi: 10.1121/1.424676.
  20. Hillenbrand JM, Getty LA, Clark MJ, Wheeler K. Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America. 1995;97:3099–3111. doi: 10.1121/1.411872.
  21. Hagiwara R. Dialect variation and formant frequency: The American English vowels revisited. Journal of the Acoustical Society of America. 1997;102:655–658.
  22. Jacewicz E, Fox RA, O’Neill C, Salmons J. Articulation rate across dialect, age, and gender. Language Variation and Change. 2009;21:233–256. doi: 10.1017/S0954394509990093.
  23. Jacewicz E, Fox RA, Salmons J. Vowel duration in three American English dialects. American Speech. 2007;82:367–385. doi: 10.1215/00031283-2007-024.
  24. Jacewicz E, Fox RA, Wei L. Between-speaker and within-speaker variation in speech tempo of American English. Journal of the Acoustical Society of America. 2010;128:839–850. doi: 10.1121/1.3459842.
  25. Kerswill P. Dialects converging: Rural speech in urban Norway. Oxford: Clarendon Press; 1994.
  26. Kerswill P. Children, adolescents, and language change. Language Variation and Change. 1996;8:177–202.
  27. Kerswill P. Dialect levelling and geographical diffusion in British English. In: Britain D, Cheshire J, editors. Social dialectology: In honour of Peter Trudgill. Amsterdam: Benjamins; 2003. pp. 223–243.
  28. Kerswill P, Torgersen EN, Fox S. Reversing “drift”: Innovation and diffusion in the London diphthong system. Language Variation and Change. 2008;20:451–491.
  29. Kerswill P, Williams A. Creating a new town koine: Children and language change in Milton Keynes. Language in Society. 2000;29:65–115.
  30. Labov W. The three dialects of English. In: Eckert P, editor. New ways of analyzing sound change. New York: Academic Press; 1991. pp. 1–44.
  31. Labov W. Principles of linguistic change. 1: Internal factors. Oxford: Blackwell; 1994.
  32. Labov W. Principles of linguistic change. 2: Social factors. Oxford: Blackwell; 2001.
  33. Labov W. Transmission and diffusion. Language. 2007;83:344–387.
  34. Labov W. What is to be learned?. Paper presented at the 34th LAUD Symposium on Cognitive Sociolinguistics; March 16; Landau, Germany. 2010.
  35. Labov W. Principles of linguistic change. 3: Cognitive factors. Oxford: Blackwell; Forthcoming.
  36. Labov W, Ash S, Boberg C. Atlas of North American English: Phonetics, phonology, and sound change. Berlin: Mouton de Gruyter; 2006.
  37. Lee S, Potamianos A, Narayanan S. Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America. 1999;105:1455–1468. doi: 10.1121/1.426686.
  38. Lieberman P. On the development of vowel production in young children. In: Yeni-Komshian GH, Kavanagh JS, Ferguson CA, editors. Child Phonology, Volume 1: Production. New York: Academic Press; 1980.
  39. Lobanov B. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America. 1971;49:606–608.
  40. Milenkovic P. TF32 software program. Madison: University of Wisconsin; 2003.
  41. Nearey TM, Assmann PF. Modeling the role of inherent spectral change in vowel identification. Journal of the Acoustical Society of America. 1986;80:1297–1308.
  42. Olive JP, Greenwood A, Coleman J. Acoustics of American English speech. New York, NY: Springer-Verlag; 1993.
  43. Patrick P. Creoles at the intersection of variable processes: -t, d deletion and past-tense marking in the Jamaican mesolect. Language Variation and Change. 1991;3:171–189.
  44. Payne A. Factors controlling the acquisition of the Philadelphia dialect by out-of-state children. In: Labov W, editor. Locating language in time and space. New York: Academic Press; 1980. pp. 143–178.
  45. Perry TL, Ohde RN, Ashmead DH. The acoustic bases for gender identification from children’s voices. Journal of the Acoustical Society of America. 2001;109:2988–2998. doi: 10.1121/1.1370525.
  46. Peterson GE, Barney HL. Control methods used in a study of the vowels. Journal of the Acoustical Society of America. 1952;24:175–184.
  47. Peterson GE, Lehiste I. Duration of syllable nuclei in English. Journal of the Acoustical Society of America. 1960;32:693–703.
  48. Pickett JM. The acoustics of speech communication: Fundamentals, speech perception theory, and technology. Boston: Allyn and Bacon; 1999.
  49. Pollock KE, Keiser NJ. An examination of vowel errors in phonologically disordered children. Clinical Linguistics and Phonetics. 1990;4:161–178.
  50. Roberts J. Acquisition of variable rules: A study of (-t,d) deletion in preschool children. Journal of Child Language. 1997a;24:351–372. doi: 10.1017/s0305000997003073.
  51. Roberts J. Hitting a moving target: Acquisition of sound change in progress by Philadelphia children. Language Variation and Change. 1997b;9:249–266.
  52. Roberts J. Child language variation. In: Chambers JK, Trudgill P, Schilling-Estes N, editors. The handbook of language variation and change. Oxford: Blackwell; 2002. pp. 333–348.
  53. Roberts J, Labov W. Learning to talk Philadelphian. Language Variation and Change. 1995;7:101–122.
  54. Rys K, Bonte D. The role of linguistic factors in the process of second dialect acquisition. In: Hinskens F, editor. Language variation – European perspectives. Amsterdam: Benjamins; 2006. pp. 201–215.
  55. Salmons J, Purnell T. Language contact and the development of American English. In: Hickey R, editor. The handbook of language contact. Oxford: Blackwell; 2010. pp. 454–477.
  56. Sievers E. Grundzüge der Phonetik. 2. Leipzig: Breitkopf & Härtel; 1881.
  57. Sledd J. Breaking, umlaut, and the southern drawl. Language. 1966;42:18–41.
  58. Smith J, Durham M, Fortune L. Universal and dialect-specific pathways of acquisition: Caregivers, children, and t/d deletion. Language Variation and Change. 2009;21:69–95.
  59. Stanford JN. Child dialect acquisition: New perspectives on parent/peer influence. Journal of Sociolinguistics. 2008;12:567–596.
  60. Stoel-Gammon C, Beckett Herrington P. Vowel systems of normally developing and phonologically disordered children. Clinical Linguistics and Phonetics. 1990;4:145–160. doi: 10.3109/02699209008985478.
  61. Tagliamonte S, Temple R. New perspectives on an ol’ variable: (t,d) in British English. Language Variation and Change. 2005;17:281–302.
  62. Templin M. Certain language skills in children: Their development and interrelationships. Minneapolis, MN: University of Minnesota Press; 1957.
  63. Thomas E. An acoustic analysis of vowel variation in New World English. Durham, NC: Duke University Press; 2001.
  64. Torgersen E, Kerswill P. Internal and external motivation in phonetic change: Dialect levelling outcomes for an English vowel shift. Journal of Sociolinguistics. 2004;8:23–53.
  65. Trudgill P. The social differentiation of English in Norwich. Cambridge: Cambridge University Press; 1974.
  66. Trudgill P. Dialects in contact. Oxford: Blackwell; 1986.
  67. Van Son RJJH, Pols LCW. Formant movements of Dutch vowels in a text, read at normal and fast rate. Journal of the Acoustical Society of America. 1992;92:121–127. doi: 10.1121/1.404277.
  68. Vorperian HK, Kent RD. Vowel acoustic space development in children: A synthesis of acoustic and anatomic data. Journal of Speech, Language, and Hearing Research. 2007;50:1510–1545. doi: 10.1044/1092-4388(2007/104).
  69. Watson CI, Harrington J. Acoustic evidence for dynamic formant trajectories in Australian English vowels. Journal of the Acoustical Society of America. 1999;106:458–468. doi: 10.1121/1.427069.
  70. Whiteside SP, Hodgson C. Speech patterns of children and adults elicited via a picture-naming task: An acoustic study. Speech Communication. 2000;32:267–285.
  71. Whiteside SP. Sex-specific fundamental and formant frequency patterns in a cross-sectional study. Journal of the Acoustical Society of America. 2001;110:464–478. doi: 10.1121/1.1379087.
  72. Winitz H, Irwin O. Syllabic and phonetic structure of infants’ early words. Journal of Speech and Hearing Research. 1958;1:250–256. doi: 10.1044/jshr.0103.250.
  73. Wolfram W, Schilling-Estes N. American English: Dialects and variation. 2. Oxford: Blackwell; 2006.
