Author manuscript; available in PMC: 2012 Sep 4.
Published in final edited form as: J Acoust Soc Am. 2005 Sep;118(3 Pt 1):1661–1676. doi: 10.1121/1.2000774

Acoustic characteristics of the vowel systems of six regional varieties of American English

Cynthia G Clopper 1,a), David B Pisoni 2, Kenneth de Jong 3
PMCID: PMC3432912  NIHMSID: NIHMS399880  PMID: 16240825

Abstract

Previous research by speech scientists on the acoustic characteristics of American English vowel systems has typically focused on a single regional variety, despite decades of sociolinguistic research demonstrating the extent of regional phonological variation in the United States. In the present study, acoustic measures of duration and first and second formant frequencies were obtained from five repetitions of 11 different vowels produced by 48 talkers representing both genders and six regional varieties of American English. Results revealed consistent variation due to region of origin, particularly with respect to the production of low vowels and high back vowels. The Northern talkers produced shifted low vowels consistent with the Northern Cities Chain Shift, the Southern talkers produced fronted back vowels consistent with the Southern Vowel Shift, and the New England, Midland, and Western talkers produced the low back vowel merger. These findings indicate that the vowel systems of American English are better characterized in terms of the region of origin of the talkers than in terms of a single set of idealized acoustic-phonetic baselines of “General” American English and provide benchmark data for six regional varieties.

I. INTRODUCTION

In what is now one of the most frequently cited papers in speech science and acoustic-phonetics, Peterson and Barney (1952) reported the results of an acoustic analysis of 10 vowels of American English produced by 33 men, 28 women, and 15 children in hVd utterances. When they plotted the first and second formant frequencies of the vowels in an F1 × F2 space, they found a high degree of both within-speaker and cross-speaker variability in the production of vowel phonemes. While Peterson and Barney’s data have since become important benchmarks in the characterization of American English vowels, the talkers in their study varied substantially in terms of native dialect and even native language. As a result, their data more closely approximate the speech of the American eastern seaboard than General American English. In addition, their data were collected over 50 years ago and cannot be expected to accurately reflect current pronunciation patterns, because American English, like all languages, continues to change over time (Bauer, 1985; Labov, 1994).

More recently, Hillenbrand, Getty, Clark, and Wheeler (1995) replicated the Peterson and Barney (1952) methods with male and female adult and child speakers from the northern Midwest in an attempt to control for some of the dialect-related problems in the original study. Hillenbrand et al. replicated Peterson and Barney’s results with respect to cross-speaker variability, but they also observed a number of differences in the mean formant frequencies between their study and the earlier Peterson and Barney study. In particular, the authors found evidence of the Northern Cities Chain Shift (NCCS) in their speech samples. The NCCS is characterized by the clockwise rotation of the low and low-mid vowels; /æ/ is raised and fronted, /ɛ/ and /ʌ/ are backed, /ɔ/ is lowered, and /ɑ/ is lowered and fronted, as shown in Fig. 1 (Labov, 1998). The comparison of the Peterson and Barney mean formant frequencies with those from the Hillenbrand et al. talkers revealed that all but the most recent component of the NCCS, /ʌ/ backing, were present in both the male and female adult productions. The NCCS is the common speech pattern of residents in urban and suburban areas in upstate western New York and the northern Midwest, including Buffalo, Chicago, and Detroit (Labov, 1998). Because Hillenbrand et al. used native speakers from the northern Midwest, it is not surprising that their talkers would display this particular dialect variant of American English.

FIG. 1. Schematic of the Northern Cities Chain Shift.

In a subsequent study, Hagiwara (1997) noted that the Hillenbrand et al. (1995) data were limited because they reflected only a single dialect of American English and he provided mean vowel formant frequency data for speakers from southern California as a supplement to the Hillenbrand et al. and Peterson and Barney (1952) datasets. The data Hagiwara described diverged from the earlier data reported by Hillenbrand et al., particularly with respect to the high back vowels /u/ and /Ʊ/, which were fronted, as is common in southern California speech (Thomas, 2001). Hagiwara concluded by making a plea for other researchers to collect and publish formant frequency data on vowels from other varieties of American English:

Documentation of every American dialect on such a scale is obviously beyond the scope of any single researcher or research group… However, studies of a dozen or so speakers are well within the scope of most researchers, perhaps most students, and, in the absence of studies of large numbers of speakers, would be better than nothing. If the results of many such studies were combined, they would fill a significant void in objective descriptions of American English (p. 658).

American sociolinguists have in fact collected large corpora of vowel formant frequency data from talkers all over the United States and Canada (e.g., Labov, Ash, & Boberg, forthcoming; Thomas, 2001). However, their research interests have focused on descriptions of vowel variation in the United States (Thomas, 2001) or definitions of the regional varieties of American English (Labov et al., forthcoming). The Labov et al. Atlas of North American English, for example, presents data in the form of maps showing where certain variants do and do not occur. Thomas, on the other hand, presented vowel spaces for individual talkers but made no claims about what we might expect the vowel system of the average speaker from a given location to look like. While answers to questions about the acoustic characteristics of American English vowel systems may be lurking in the work of sociolinguists, the results of their acoustic analyses are typically not presented with the familiar summary ellipses of Peterson and Barney (1952), but instead are presented on maps or in vowel spaces of individual talkers.

Thus, we are still lacking a comprehensive acoustic-phonetic description of the characteristics of the major regional varieties of American English. Hillenbrand et al. (1995) provided some benchmark vowel data for the northern Midwest and Hagiwara (1997) provided the same for southern California. However, the acoustic characteristics of “American English” vary based on where the speakers are from. The present study was designed to provide a summary and description of the acoustic characteristics of the vowel systems of six regional varieties of American English in an effort to fill this important gap in our understanding of how American English vowel systems are structured and how they differ with respect to one another. An underlying assumption of this research program is that synchronic differences between various regional dialects of American English can be defined relative to some standard or baseline variety. We will return to the issue of how to determine the baseline variety of American English in the General Discussion.

II. METHODS

A. Talkers

Forty-eight talkers between the ages of 18 and 25 were selected from the Nationwide Speech Project corpus (Clopper, 2004) for use in the current study. All of the talkers were monolingual native speakers of American English with no history of hearing or speech disorders. Both parents of each talker were also native English speakers. The 48 talkers included four males and four females from each of six dialect regions of the United States: New England, Mid-Atlantic, North, Midland, South, and West. The six regions were based on the Labov et al. (forthcoming) descriptions of phonological variation in the United States. The map in Fig. 2 shows the geographic boundaries of the six dialect regions, as well as the hometown of each of the 48 talkers. Female talkers are indicated by light squares and male talkers are indicated by dark circles. Each talker had lived in a single dialect region for his or her entire life and both of his or her parents were also raised in that same dialect region. While all of the talkers were recorded at Indiana University in Bloomington, each talker had lived in Bloomington, Indiana for less than two years at the time of recording to reduce the effects of dialect leveling. Additional demographic information about the talkers is provided in the Appendix.

FIG. 2. (Color online) Map of the 6 dialect regions and the hometowns of the 48 talkers included in the acoustic analysis.

Based on previous research by sociolinguists (Labov, 1998; Labov et al., forthcoming; Thomas, 2001), we predicted that the talkers in this study would produce dialect-specific variants of a number of the vowels we examined. As noted above and shown in Fig. 1, the vowel system of the Northern dialect of American English is characterized by the Northern Cities Chain Shift (Labov, 1998). The Southern dialect of American English, on the other hand, is characterized by the Southern Vowel Shift (Labov, 1998). The primary attribute of this shift is the fronting of the back vowels /u/ and /o/. In addition, the front lax vowels /ɪ/ and /ɛ/ are moving to the periphery of the vowel space through raising and fronting, and the front tense vowels /i/ and /e/ are being centralized through lowering and backing. A schematic of the Southern Vowel Shift is shown in Fig. 3.

FIG. 3. Schematic of the Southern Vowel Shift.

The common feature of the Midland, Western, and New England dialects is the merger of the low-back vowels /ɑ/ and /ɔ/, creating homophones of such pairs of words as caught and cot or Dawn and Don (Labov, 1998). Other features of Western New England (eastern New York State, Vermont, and western Massachusetts) reflect several components of the Northern Cities Chain Shift with some raising of /æ/, fronting of /ɑ/, and backing of /ɛ/ (Boberg, 2001; Thomas, 2001). Western speech is characterized by the low back merger and by /u/ fronting (Labov et al., forthcoming; Thomas, 2001). Unlike Southern back-vowel fronting, however, the Western pattern is typically limited to fronting of /u/. The Midland dialect is the least marked of the regional American English varieties, exhibiting no distinct features other than the ɑ ~ ɔ merger. The Mid-Atlantic dialect does not exhibit the ɑ ~ ɔ merger and, in fact, the two vowels are found to be more distinct due to /ɔ/ raising (Labov, 1994; Thomas, 2001). /æ/ also exhibits raising in some words, but not others, in the Mid-Atlantic region, due to a maintenance of a historical contrast between long and short /æ/ (Labov, 1994; Thomas, 2001).

B. Stimulus materials

A subset of the materials collected from each talker was used in an acoustic analysis of 11 vowels of American English: /i, ɪ, e, ɛ, æ, ɑ, ɔ, ʌ, o, Ʊ, u/, as shown in Table I. Five repetitions of each of the vowels except /ɔ/ were obtained from each talker in hVd utterances, for a total of 50 tokens per talker. In addition, six tokens of /ɔ/ were obtained in sentence-final position, three in the word frogs and three in the word logs, from each talker.1 The resulting set of stimulus materials included 56 vowel tokens per talker with 5–6 tokens per vowel.

TABLE I.

Vowel tokens for acoustic analysis. Tokens followed by (5) were taken from the hVd utterances, for which five repetitions of each token were available. Tokens followed by (3) were taken from the sentence-length utterances, for which three repetitions of each token were available in sentence-final position.

| Vowel | Tokens |
|---|---|
| i | heed (5) |
| ɪ | hid (5) |
| e | hayed (5) |
| ɛ | head (5) |
| æ | had (5) |
| ɑ | hod (5) |
| ɔ | frogs (3), logs (3) |
| ʌ | hud (5) |
| o | hoed (5) |
| Ʊ | hood (5) |
| u | who’d (5) |

The stimulus materials were digitally recorded on a Macintosh Powerbook G3 laptop using a Shure SM10A head-mounted microphone, a microphone tube preamplifier, and a Roland UA-30 USB audio interface in a sound-attenuated chamber (IAC Audiometric Testing Room). Each utterance was recorded in an individual .aiff 16-bit sound file at a sampling rate of 44.1 kHz. Additional information about the collection of the stimulus materials can be found in Clopper (2004).

C. Procedures

Five acoustic measures were obtained from each of the 56 vowel tokens from each of the 48 talkers: vowel duration, first and second formant frequencies at the one-third temporal point in the vowel, and first and second formant frequencies at the two-thirds temporal point in the vowel, for a total of 13 440 measurements. All of the measurements were made using the Speech Analysis tool in WaveSurfer 1.6.2 (Sjölander and Beskow, 2004). The speech analysis tool included a time-aligned waveform, f0 trace, and wide-band spectrogram with formant tracks for F1, F2, F3, and F4. The automatic formant-tracking procedure was computed using a 12th-order LPC analysis over a 49 ms window with a 10 ms frame interval (Sjölander and Beskow, 2004).

For each token, the duration measurements were made first. The onset of the vowel was marked by the onset of voicing for those vowels preceded by a voiceless consonant and by a sudden change in intensity or formant frequency for those vowels preceded by a voiced consonant. The offset of the vowel was marked by the offset of voicing or a sudden drop in intensity, indicating closure. Particularly for those vowels following a liquid consonant, vowel onsets were determined by visual inspection of the waveform and spectrogram as well as by ear. Vowel duration was calculated as the difference between offset and onset of the vowel in milliseconds.

Formant values were automatically extracted from the formant traces at time stamps determined from the duration measures. The first-third and second-third time stamps were located at one-third and two-thirds of the vowel duration plus the time stamp of the onset. Formant measures were hand corrected by the first author by visual inspection of the wideband spectrogram using the cursor tool as necessary.
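The placement of the two measurement points can be illustrated with a minimal sketch. The function name and the example onset and offset times are hypothetical and are not part of the original measurement procedure, which was carried out in WaveSurfer; the sketch only restates the arithmetic described above.

```python
def formant_time_stamps(onset_ms, offset_ms):
    """Return (duration, one-third point, two-thirds point), all in ms."""
    if offset_ms <= onset_ms:
        raise ValueError("offset must come after onset")
    # Vowel duration is the difference between offset and onset.
    duration = offset_ms - onset_ms
    # Each measurement point is a fraction of the duration plus the onset time.
    first_third = onset_ms + duration / 3.0
    second_third = onset_ms + 2.0 * duration / 3.0
    return duration, first_third, second_third


# Hypothetical token: vowel onset at 120 ms, vowel offset at 360 ms.
duration, t1, t2 = formant_time_stamps(120.0, 360.0)
print(duration, t1, t2)  # 240.0 200.0 280.0
```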

A total of 28 vowel tokens (1%) were excluded because the talker misread the word (25 tokens) or a recording error occurred (3 tokens). All of the excluded tokens were from the hVd materials set in which trials with disfluencies or mispronunciations were not repeated.

The measurements were hand-checked for outliers prior to any further analysis. A total of 37 measurements out of the total 13 440 (0.3%) were rechecked as potential outliers. Of these, 1 was an error in measurement, 5 were typographic errors in data recording, and 14 were formant tracking errors not hand-corrected at the time of the original measurement. These 20 measurements were all corrected prior to the analysis of the data. The remaining 17 potential outliers were found to be due to natural variation in the corpus and these data points were not altered.

A small subset (1%) of the tokens was also remeasured by the first author to assess reliability. The reliability subset included tokens from all six dialects and all 11 vowels. The absolute difference between the first and second duration measures ranged from 0 to 11 ms, with a mean difference of 3 ms. The absolute difference for the formant frequency measures ranged from 0 to 44 Hz, with a mode of 0 Hz and a mean of 0.3 Hz. These differences are within the acceptable range of error for acoustic measures (Hillenbrand et al., 1995). The reliability analysis and the outlier check suggest that the measurements were highly reliable.
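A reliability summary of this kind can be computed as in the following sketch. The function name and the example values are hypothetical; they are not the measurements from the study and serve only to show how the range and mean of the absolute differences are obtained.

```python
from statistics import mean


def reliability_summary(first_pass, second_pass):
    """Summarize absolute differences between two measurement passes."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must cover the same tokens")
    diffs = [abs(a - b) for a, b in zip(first_pass, second_pass)]
    return {"min": min(diffs), "max": max(diffs), "mean": mean(diffs)}


# Hypothetical duration measurements (ms) for five remeasured tokens.
original = [212, 180, 251, 143, 198]
remeasured = [212, 183, 246, 143, 200]
summary = reliability_summary(original, remeasured)
print(summary)  # {'min': 0, 'max': 5, 'mean': 2}
```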

III. RESULTS

A summary of the mean formant frequencies for the 11 American English vowels /i, ɪ, e, ɛ, æ, ɑ, ɔ, ʌ, o, Ʊ, u/ for each dialect is shown in Fig. 4. The filled symbols represent the means for the male talkers and the open symbols represent the means for the female talkers. For the Western talkers, the plain X’s represent the female talkers and the boxed X’s represent the male talkers. The means are based on the formant frequency measures taken at the first-third temporal point and were normalized using Lobanov’s (1971) z-score transformation to reduce formant frequency variation due to anatomical differences between males and females.

FIG. 4. (Color online) Mean z-score normalized formant frequency values for 11 American English vowels for each of the six dialects for the male talkers (filled symbols) and female talkers (open symbols). For the West, plain X’s indicate female talkers and boxed X’s indicate male talkers. The ellipses were hand-drawn to include every token for each vowel: /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/.

Figure 4 clearly shows the Northern Cities Chain Shift for both the male and female Northern talkers (triangles), with the fronting and lowering of /ɑ/ in hod and the raising and fronting of /æ/ in had relative to the other dialects. In addition, the male and female Southern talkers (diamonds) show fronting of /u/ in who’d and /o/ in hoed. The Midland and Western talkers also show some fronting of /u/ in who’d. Finally, the New England males and females, Midland females, and Western males and females appear to have a merger or partial merger of /ɑ/ and /ɔ/ in hod and frogs/logs. In order to quantitatively assess differences in formant frequency due to talker dialect, a series of statistical analyses was conducted on the acoustic measurements.

A. Statistical analysis

Prior to the analysis, all of the formant frequency measures were normalized using Lobanov’s (1971) z-score metric for each talker. The z-score method was selected because a recent comparison of vowel normalization metrics by Adank, Smits, and van Hout (2004) showed that Lobanov’s metric was the best transformation for reducing anatomical variation while still maintaining phonological and sociolinguistic variation. The z-score transform is a vowel-extrinsic, talker-intrinsic normalization procedure that centers each talker’s vowel space on the origin in an F1 × F2 plane. The equation in (1) was used to calculate the z-score normalization for each talker for each vowel token. The normalization was computed separately for F1 and F2. In the equation, z is the normalized frequency value of F1 or F2 for a given vowel token, f is the raw frequency in Hz of that formant value, μ is the overall mean frequency of the relevant formant (F1 or F2), and σ is the standard deviation of the same formant’s frequencies about that overall mean. The means and standard deviations are calculated individually for each talker based on all of the tokens produced by that talker. Given this talker-intrinsic nature of the transform, we expected that vowel-specific within-dialect gender differences would be maintained while overall gender differences in formant frequency would be greatly reduced. The duration measures were not normalized prior to statistical analysis.

$$z = \frac{f - \mu}{\sigma}. \tag{1}$$
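The normalization in (1) can be sketched in a few lines of Python. The function name and the example frequencies are hypothetical. The use of the sample standard deviation is also an assumption, since the text does not say whether a sample or population estimate was used.

```python
from statistics import mean, stdev


def lobanov_normalize(formant_hz):
    """z-score one talker's values for a single formant (F1 or F2).

    formant_hz: raw frequencies in Hz for all tokens from that talker.
    """
    mu = mean(formant_hz)
    # Sample standard deviation; an assumption, not stated in the text.
    sigma = stdev(formant_hz)
    return [(f - mu) / sigma for f in formant_hz]


# Hypothetical F1 values (Hz) for one talker, not data from the study.
f1_hz = [300.0, 400.0, 500.0, 600.0, 700.0]
f1_z = lobanov_normalize(f1_hz)
print([round(z, 2) for z in f1_z])
```

Because the mean and standard deviation come from a single talker's tokens, the function would be applied once per talker and per formant, which is what makes the transform talker-intrinsic.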

A repeated measures ANOVA was calculated using vowel category (i, ɪ, e, ɛ, æ, ɑ, ɔ, ʌ, o, Ʊ, or u) as a within-subjects factor and dialect (New England, Mid-Atlantic, North, Midland, South, or West) and gender (male, female) as between-subjects factors for each of three measures: vowel duration, F1 z score (at the first-third temporal point), and F2 z score (at the first-third temporal point). Because three analyses were computed, the p value was set to 0.01 for each ANOVA.

1. Duration

The repeated measures ANOVA on vowel duration revealed a significant main effect of vowel [F(10,2020) = 363.3, p < 0.001], a significant main effect of dialect [F(5,2020) = 3.8, p = 0.003], a significant vowel X dialect interaction [F(50,2020) = 3.0, p < 0.001], and a significant vowel X gender interaction [F(10,2020) = 3.5, p < 0.001]. The main effect of gender, the dialect X gender interaction, and the vowel X dialect X gender interaction were not significant. The significant main effect of vowel category merely confirms that American English vowels differ in their inherent length and no further analyses on that factor were conducted. The significant main effect of dialect suggests that some dialects have longer or shorter overall vowels than others. Post-hoc Tukey tests on dialect revealed significant differences between the Southern talkers and the New England, Mid-Atlantic, and Western talkers based on vowel duration (all p < 0.01). Overall, the Southerners had significantly longer vowels than the talkers from New England, the Mid-Atlantic, and the West.

The vowel X dialect interaction suggests that the effects of dialect differences on vowel duration are not consistent across all vowels. To explore this interaction, a one-way ANOVA on vowel duration with dialect as the factor was computed for each of the 11 vowels. To correct for the large number of analyses, the p value for the ANOVAs and the post-hoc Tukey tests was set to 0.001. Significant main effects of dialect were found for /ɪ/ [F(5,234) = 7.6, p < 0.001], /ɛ/ [F(5,228) = 8.2, p < 0.001], /ʌ/ [F(5,229) = 9.3, p < 0.001], and /Ʊ/ [F(5,229) = 8.3, p < 0.001]. In all four cases, post-hoc Tukey tests revealed that the vowels produced by the Southern talkers were significantly longer than the vowels produced by the New England, Mid-Atlantic, and Western talkers (all p < 0.001). The vowel /ɛ/ was also longer for Southerners than Northerners and the vowel /ʌ/ was longer for Southern talkers than for Northern and Midland talkers (all p < 0.001). These results suggest that Southerners did not produce generally longer vowels or have an overall slower speaking rate (as indicated by longer vowels), but that the vowel duration differences based on dialect were selective in nature and were due to longer lax vowels for Southern talkers than for the other dialect groups. That is, the durational distinction between lax and tense vowels was reduced for the Southern talkers.
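The structure of these per-vowel one-way ANOVAs can be sketched as follows. This is a plain-Python illustration with hypothetical names and data, not the software used for the reported analyses, and it returns only the F statistic and its degrees of freedom (no p value).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual tokens around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical durations (ms) of one vowel for three dialect groups.
durations = [
    [100.0, 110.0, 120.0],
    [105.0, 115.0, 125.0],
    [140.0, 150.0, 160.0],
]
f_stat, df_b, df_w = one_way_anova(durations)
print(round(f_stat, 2), df_b, df_w)  # 14.25 2 6
```

With six dialect groups and 240 tokens of a vowel (48 talkers × 5 repetitions, no exclusions), the same formulas give 5 and 234 degrees of freedom, matching the values reported for /ɪ/.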

The vowel X gender interaction suggests that while there was no overall effect on vowel duration due to gender, the male and female talkers did produce significant duration differences for some of the vowels. A series of t tests was conducted to compare the male and female talkers, collapsed across dialect, for each of the 11 vowels. To correct for the large number of analyses, the p value for the t tests was set to 0.001. Significant differences due to gender were found for /ɪ/, /ɛ/, /ɑ/, /ʌ/, and /Ʊ/ (all p < 0.001). In all five cases, the females produced longer vowels than the males. These results parallel those for the Southern talkers and reveal that the female talkers produced longer lax vowels than the male talkers, although they did not produce longer vowels overall.

2. Nucleus formant frequencies

The repeated measures ANOVA on the F1 and F2 z scores revealed a significant main effect of vowel [F(10,2020) = 3039.5, p < 0.001 for F1 and F(10,2020) = 4575.9, p < 0.001 for F2]. These results merely confirm the existence of significant differences in formant frequencies due to vowel category, and will not be analyzed further. Both the F1 and the F2 analyses also revealed a significant vowel X dialect interaction [F(50,2020) = 10.1, p < 0.001 for F1 and F(50,2020) = 15.5, p < 0.001 for F2], a significant vowel X gender interaction [F(10,2020) = 12.6, p < 0.001 for F1 and F(10,2020) = 16.6, p < 0.001 for F2], and a significant vowel X dialect X gender interaction [F(50,2020) = 2.5, p < 0.001 for F1 and F(50,2020) = 3.3, p < 0.001 for F2]. The main effects of dialect and gender and the dialect X gender interaction were not significant for either the F1 or F2 z scores. These findings suggest that the vowel spaces of the different dialects were not globally shifted along either F1 or F2, but that individual vowels were affected differentially by the six dialects. In addition, the z-score transform was successful in eliminating overall gender-related differences in formant frequency, but some vowel-specific differences in gender were retained, leading to the significant three-way interaction.

To explore the three-way interaction in more detail, post-hoc repeated measures ANOVAs were computed for the males and the females separately on the F1 and F2 z scores with vowel category (i, ɪ, e, ɛ, æ, ɑ, ɔ, ʌ, o, Ʊ, or u) as a within-subjects factor and dialect (New England, Mid-Atlantic, North, Midland, South, or West) as a between-subjects factor. The p value was again set at 0.001 to correct for the large number of post-hoc analyses. The repeated measures ANOVAs on male and female F1 and F2 z scores revealed significant main effects of vowel [F(10,970) = 2486.4, p < 0.001 for the male F1, F(10,970) = 2416.1, p < 0.001 for the male F2, F(10,1050) = 1143.3, p < 0.001 for the female F1, and F(10,1050) = 2219.3, p < 0.001 for the female F2]. The results again confirm the existence of significant differences in formant frequencies due to vowel category and will not be analyzed further.

Significant vowel X dialect interactions were also observed for both the male and female F1 and F2 z scores [F(50,970) = 8.3, p < 0.001 for male F1, F(50,970) = 11.5, p < 0.001 for male F2, F(50,1050) = 5.6, p < 0.001 for female F1, and F(50,1050) = 7.7, p < 0.001 for female F2]. The main effect of dialect was not significant in any of the post-hoc analyses. These results again suggest that the vowel spaces of the different dialects were not globally shifted along either F1 or F2, but that individual vowels are affected differentially by the six dialects. Post-hoc one-way ANOVAs on F1 and F2 for each gender with dialect as the factor were computed for each of the 11 vowels in order to explore the vowel X dialect interactions. The p value was again set at 0.001 to correct for the large number of post-hoc analyses. A summary of the results of the statistical analysis is provided in Table II.

TABLE II.

Summary of the results of the acoustic analysis. Key: New England (E), Mid-Atlantic (A), North (N), Midland (M), South (S), West (W), Southern Vowel Shift (SVS), Northern Cities Chain Shift (NCCS), z-score transformation artifact (z-score), new Midland results (Midland), new Mid-Atlantic results (Mid-Atlantic), new Southern results (Southern).

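| Vowel | F1 Male | F2 Male | F1 Female | F2 Female | Results |
|---|---|---|---|---|---|
| i | n.s. | n.s. | n.s. | n.s. | |
| ɪ | n.s. | n.s. | n.s. | n.s. | |
| e | S > E, A | S < A | S > A | S < A, W | SVS |
| | | | | M < A | Midland |
| ɛ | S < E, N, W | n.s. | | | SVS |
| | | | N > A, M, S, W | N < M | NCCS |
| | M < E, N | | | | Midland |
| æ | N < ALL | N > ALL | N < ALL | N > ALL | NCCS |
| | | S > A, W | | | Southern |
| ɑ | N > E, A, M, W | N > ALL | N > M | N > M, S, W | NCCS |
| | | A > E, M, S, W | | A > M, S, W | Mid-Atlantic |
| ɔ | n.s. | A > E, M, S, W | n.s. | n.s. | Mid-Atlantic |
| | | S < E, A, M, W | | | z-score |
| ʌ | n.s. | | n.s. | N < E, M, S, W | NCCS |
| | | A < E | | A < M, S | Mid-Atlantic |
| | | S < E, M, W | | | z-score |
| o | n.s. | S > N | n.s. | S > A, N | SVS |
| Ʊ | S < A, N, M | n.s. | n.s. | n.s. | Southern |
| u | n.s. | S > E, A, N | | n.s. | SVS |
| | | W > N | | | West |
| | | | S < A, N, M, W | | Southern |
| | | | E < W | | Spurious |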

For males for F1, significant effects of dialect were found for /e/ [F(5,114) = 8.9, p < 0.001], /ɛ/ [F(5,109) = 14.4, p < 0.001], /æ/ [F(5,113) = 16.3, p < 0.001], /ɑ/ [F(5,108) = 14.7, p < 0.001], and /Ʊ/ [F(5,113) = 12.0, p < 0.001]. Post-hoc Tukey tests revealed that Southern male talkers produced significantly lower /e/s (higher F1) than the New England and Mid-Atlantic male talkers and significantly higher /ɛ/s (lower F1) than the New England, Northern, and Western male talkers (all p < 0.001). These results are consistent with the Southern Vowel Shift. The Midland male talkers also produced significantly higher /ɛ/s than the New England and Northern talkers (both p < 0.001), suggesting that the Southern pattern may be spreading to the Midland dialect region. The post-hoc Tukey tests also revealed significant /æ/ raising for the Northern male talkers compared to all of the other male talkers and significant /ɑ/ lowering for the Northern male talkers compared to the New England, Mid-Atlantic, Midland, and Western male talkers (all p < 0.001). These findings reflect the Northern Cities Chain Shift. Finally, /Ʊ/ was produced significantly higher by the Southern male talkers than the Mid-Atlantic, Northern, and Midland talkers (all p < 0.001). This result was not anticipated based on the previous research on the Southern Vowel Shift, and appears to reflect back-vowel raising among Southern talkers.

For F2 for the males, the one-way ANOVAs revealed significant main effects of dialect for /e/ [F(5,114) = 7.6, p < 0.001], /æ/ [F(5,113) = 16.1, p < 0.001], /ɑ/ [F(5,108) = 38.7, p < 0.001], /ɔ/ [F(5,138) = 24.8, p < 0.001], /ʌ/ [F(5,110) = 10.9, p < 0.001], /o/ [F(5,113) = 5.1, p < 0.001], and /u/ [F(5,114) = 16.7, p < 0.001]. Post-hoc Tukey tests revealed significant differences between the Southern and Mid-Atlantic male talkers in /e/ fronting (p < 0.001), consistent with the centralization of /e/ in the Southern Vowel Shift. The Northern male talkers produced significantly fronted /æ/s and /ɑ/s compared to all of the other male talkers (all p < 0.001), consistent with the Northern Cities Chain Shift. The Mid-Atlantic male talkers also produced significantly fronted /ɑ/s and /ɔ/s compared to the New England, Midland, Southern, and Western male talkers and significantly backed /ʌ/s compared to the New England male talkers (all p < 0.001). These results were not expected based on the previous literature, but seem to suggest an alignment of the low back vowels with respect to F2 in the Mid-Atlantic dialect, as shown in Fig. 4. The post-hoc Tukey tests also revealed significant /o/ and /u/ fronting for the Southern male talkers compared to the Northern male talkers. The Southern male talkers also produced significantly fronted /u/s compared to the New England and Mid-Atlantic male talkers (all p < 0.001). These results are consistent with the Southern Vowel Shift. The Western male talkers also produced significantly more fronted /u/s than the Northern talkers (p < 0.001), as predicted.

Finally, the Southern male talkers produced significantly fronted /æ/s compared to the Mid-Atlantic and Western male talkers and significantly backed /ʌ/s and /ɔ/s compared to the New England, Mid-Atlantic (/ɔ/ only), Midland, and Western male talkers (all p < 0.001). The fronting of /æ/ among Southern talkers was unexpected and may reflect a new shift in Southern American English. However, the apparent backing of /ʌ/ and /ɔ/ for the Southern male talkers is probably an artifact of the z-score transform. Due to the significant back-vowel fronting produced by the Southern talkers, the top part of the vowel space for these talkers is narrower than for talkers with less extreme back-vowel fronting. In terms of the z-score transformation, the overall mean second formant frequency is higher and the second formant frequency standard deviation is smaller for talkers with fronted back vowels than for talkers without. The low back vowels normalized relative to this higher mean and smaller standard deviation appear artificially backed as the result of a larger numerator and smaller denominator in the z-score calculation. The result of the transform is therefore artificially backed low back vowels for the Southern male talkers.
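The direction of this artifact can be checked with a toy calculation. The three F2 values per talker below are hypothetical and far sparser than a real vowel space. They are chosen only so that the low back vowel has the same raw F2 in both cases while the high back vowel is either unfronted or fronted.

```python
from statistics import mean, stdev


def z_score(value, values):
    """z-score of one value relative to a talker's full set of values."""
    return (value - mean(values)) / stdev(values)


LOW_BACK_F2 = 1100.0  # identical raw F2 (Hz) for both hypothetical talkers

# F2 values (Hz): front vowel, low back vowel, high back vowel.
plain = [2200.0, LOW_BACK_F2, 900.0]     # high back vowel not fronted
fronted = [2200.0, LOW_BACK_F2, 1500.0]  # high back vowel fronted

z_plain = z_score(LOW_BACK_F2, plain)
z_fronted = z_score(LOW_BACK_F2, fronted)

# Fronting raises the talker's mean F2 and shrinks the standard deviation,
# so the unchanged low back vowel receives a more negative z score.
print(round(z_plain, 2), round(z_fronted, 2))  # -0.43 -0.9
```

In this sketch the low back vowel appears roughly twice as far back in z-score units for the talker with the fronted high back vowel, even though its raw F2 is unchanged.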

Taken together, the results for the male talkers revealed components of the Northern Cities Chain Shift among the Northern male talkers, components of the Southern Vowel Shift among the Southern male talkers, Western /u/ fronting, some aspects of the Southern Vowel Shift among the Midland male talkers, and a new alignment of the low back vowels among the Mid-Atlantic males.

For the female talkers for F1, significant effects of dialect were revealed in the one-way ANOVAs for /e/ [F(5,114) = 7.1, p < 0.001], /ɛ/ [F(5,113) = 10.0, p < 0.001], /æ/ [F(5,114) = 12.7, p < 0.001], /ɑ/ [F(5,110) = 4.9, p < 0.001], and /u/ [F(5,114) = 10.2, p < 0.001]. Post-hoc Tukey tests revealed overall patterns similar to those found for the male talkers. In particular, the Southern female talkers produced significantly lower /e/s than the Mid-Atlantic females (p < 0.001), consistent with the Southern Vowel Shift. The Northern females also produced lower /ɛ/s than the Mid-Atlantic, Midland, Southern, and Western females, as well as significantly higher /æ/s than all of the other female talkers, and significantly lower /ɑ/s than the Midland female talkers (all p < 0.001). These results are all consistent with the Northern Cities Chain Shift. The post-hoc Tukey tests also revealed significantly higher /u/s for the Southern females than the Mid-Atlantic, Northern, Midland, and Western females (all p < 0.001). While this finding was not predicted based on previous descriptions of Southern American English, it is consistent with the results of the analysis of the male talkers, which also revealed back-vowel raising. Finally, the New England females were found to produce significantly higher /u/s than the Western females (p < 0.001). This result was not anticipated based on earlier work and further research is needed to explore the effects of dialect on back vowel raising.

For F2 for the female talkers, the ANOVAs revealed significant main effects of dialect for /e/ [F(5,114) = 8.5, p < 0.001], /ɛ/ [F(5,113) = 5.2, p < 0.001], /æ/ [F(5,114) = 17.6, p < 0.001], /ɑ/ [F(5,110) = 18.4, p < 0.001], /ʌ/ [F(5,113) = 16.8, p < 0.001], and /o/ [F(5,113) = 7.4, p < 0.001]. Post-hoc Tukey tests revealed significant /e/ backing for the Southern female talkers compared to the Mid-Atlantic and Western female talkers (both p < 0.001), consistent with the Southern Vowel Shift. The Midland females also produced significantly backed /e/s compared to the Mid-Atlantic female talkers, suggesting that this pattern of the Southern Vowel Shift may be spreading to the Midland dialect region. The Northern female talkers produced significantly backed /ɛ/s compared to the Midland talkers, significantly fronted /æ/s compared to all of the other female talkers, significantly fronted /ɑ/s compared to the Midland, Southern, and Western talkers, and significantly backed /ʌ/s compared to the New England, Midland, Southern, and Western talkers (all p < 0.001). These results are all consistent with the Northern Cities Chain Shift. Similar to the Mid-Atlantic male talkers, the Mid-Atlantic female talkers produced significantly fronted /ɑ/s compared to the Midland, Southern, and Western female talkers and significantly backed /ʌ/s compared to the Midland and Southern female talkers (all p < 0.001). These results are consistent with our earlier claim of an increasing alignment of the low back vowels with respect to F2 among Mid-Atlantic talkers. Finally, the Southern females produced significantly greater /o/ fronting than the Mid-Atlantic and Northern females (both p < 0.001).
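As a rough sketch of the one-way ANOVAs reported here, the F ratio and its degrees of freedom can be computed as follows. The function name `one_way_anova` and the token values are hypothetical. The example assumes six dialect groups of 20 tokens each (four talkers by five repetitions), which gives the F(5,114) form seen above; smaller error degrees of freedom such as 113 or 110 presumably reflect missing tokens. The post-hoc Tukey tests are not implemented in this sketch.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (dialects)
    n = len(all_values)    # total number of observations (tokens)

    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical z-scored F1 tokens: 6 dialects x 20 tokens (not data from the study).
dialect_means = [-0.2, 0.0, 0.9, 0.1, -0.4, 0.0]
tokens = [[m + 0.1 * ((7 * i) % 5 - 2) for i in range(20)] for m in dialect_means]

f_stat, df_between, df_within = one_way_anova(tokens)
print(f"F({df_between},{df_within}) = {f_stat:.1f}")
```

A library routine such as `scipy.stats.f_oneway` would also return the associated p value; the hand-written version is used here only to keep the arithmetic visible.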

Taken together, the analysis of the female data revealed components of the Northern Cities Chain Shift among the Northern females, components of the Southern Vowel Shift among the Southern females, some aspects of the Southern Vowel Shift among Midland females, and a new shift in the low back vowels for the Mid-Atlantic female talkers.

3. Merger of /ɑ/ and /ɔ/

To assess the degree of merger of /ɑ/ and /ɔ/, a series of paired-sample t tests was calculated. For each of the six dialect regions, one paired-sample t test was computed for F1 z scores and one was computed for F2 z scores. Significant differences in this analysis suggest distinct vowels, whereas nonsignificant differences in both F1 and F2 suggest a merger or partial merger. Due to the large number of comparisons, the p value was set at 0.005 for this analysis. A partial merger of /ɑ/ and /ɔ/ was found for the New England talkers [t(7) = 5.4, p = 0.001 for F1 and t(7) = 3.5, p = 0.009 for F2], the Mid-Atlantic talkers [t(7) = 3.5, p = 0.01 for F1 and t(7) = 2.1, p = 0.08 for F2], the Midland talkers [t(7) = 2.7, p = 0.03 for F1 and t(7) = 2.0, p = 0.08 for F2], and the Western talkers [t(7) = 3.6, p = 0.01 for F1 and t(7) = 0.03, p = 0.98 for F2]. However, /ɑ/ and /ɔ/ were clearly distinct for the Northern talkers [t(7) = 6.3, p < 0.001 for F1 and t(7) = 6.5, p < 0.001 for F2] and the Southern talkers [t(7) = 6.3, p < 0.001 for F1 and t(7) = 4.4, p = 0.003 for F2].
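The paired-sample comparison can be sketched as follows, assuming one mean z score per talker per vowel, so that the eight talkers in a dialect region yield the seven degrees of freedom reported above. The function name `paired_t` and the z scores are hypothetical; the sketch computes only the t statistic and its degrees of freedom, and the p value would then be compared against the 0.005 criterion.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, df) for a paired-sample t test on two equal-length sequences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the paired differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-talker mean F1 z scores for one dialect region (8 talkers).
f1_low_unrounded = [1.10, 0.95, 1.20, 1.05, 0.90, 1.15, 1.00, 1.08]  # stands in for /ɑ/
f1_low_rounded = [0.80, 0.90, 0.85, 0.70, 0.88, 0.75, 0.95, 0.82]    # stands in for /ɔ/

t_stat, df = paired_t(f1_low_unrounded, f1_low_rounded)
print(f"t({df}) = {t_stat:.1f}")
```

In practice, a routine such as `scipy.stats.ttest_rel` returns both the statistic and the two-tailed p value directly.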

4. Summary

Overall, the results were highly consistent across both the male and female talkers. However, the three-way interaction between dialect, vowel, and gender was significant, suggesting that gender-specific differences are important. The primary differences between the two genders involve the extent of the Northern Cities Chain Shift and the magnitude of back-vowel fronting. In particular, we found evidence of the later stages of the Northern Cities Chain Shift, particularly /ɛ/ and /ʌ/ backing, among the females but not the males. With respect to back-vowel fronting, we found significant effects of /u/ and /o/ fronting for the males, but only /o/ fronting for the females. An inspection of the raw data suggests that /u/ fronting is more advanced for the female talkers across all of the dialect regions, while /o/ fronting remains predominantly a Southern phenomenon. For the males, however, /u/ fronting is still mostly a Southern and Western feature and has not spread to the northern and eastern dialects. Finally, a merger or partial merger of /ɑ/ and /ɔ/ was found for the New England, Mid-Atlantic, Midland, and Western talkers. An inspection of the individual vowel productions, however, provides additional insights into the range of talker variation observed within and across dialect regions.

B. Descriptive summary

Figures 5–10 show all of the tokens for each of the 11 American English vowels for each dialect. In these figures, the tokens have been plotted in raw frequency in Hertz and by gender for clarity. The ellipses in these figures were drawn by hand to include all of the tokens for each vowel.

FIG. 5. All tokens produced by the New England male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

FIG. 10. All tokens produced by the Western male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

1. New England

The top panel of Fig. 5 shows the data for the New England male talkers. The most striking aspect of this figure is the split between the different talkers in their production of /æ/. While three of the talkers showed raised /æ/s similar to those found in the Northern dialect region, one talker (NE1) retained lowered /æ/s (see footnote 2). In addition, the New England males produced a merger or partial merger of /ɑ/ and /ɔ/.

The data for the New England females are shown in the bottom panel of Fig. 5. Like the New England males, the New England females showed two different patterns of productions of /æ/. One talker (NE8) had a raised /æ/, consistent with the Northern Cities Chain Shift, while the other three talkers maintained a distinction between the nuclei of /æ/ and /ɛ/. The New England females also produced a merger or partial merger of /ɑ/ and /ɔ/, like their male counterparts.

2. Mid-Atlantic

The data for the Mid-Atlantic males are shown in the top panel of Fig. 6. The Mid-Atlantic males seem to be the least variable subgroup of talkers in the NSP corpus. In general, the ellipses in Fig. 6 are smaller than the ellipses found in the other figures. We also see evidence of the merger of /ɑ/ and /ɔ/ among the Mid-Atlantic males. An inspection of the individual vowel spaces suggests that although this may be the case for two of the talkers (AT1 and AT3), the other two talkers maintained distinct low back vowels.

FIG. 6. All tokens produced by the Mid-Atlantic male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

The bottom panel of Fig. 6 shows the vowel formant frequency data for the Mid-Atlantic female talkers. Like the Mid-Atlantic males, the Mid-Atlantic females were inconsistent in producing the low-back merger. One talker (AT9) showed a clear merger of /ɑ/ and /ɔ/, while the other three talkers maintained a greater distinction between these two vowels. Only one talker (AT8) produced the expected raised /ɔ/. The Mid-Atlantic female talkers were also variable in their production of /u/. One talker (AT8) produced fronted /u/s, while the other three females produced more backed /u/s. Finally, Fig. 6 shows a great deal of overlap between /Ʊ/ and /ʌ/ for the Mid-Atlantic females. An inspection of the individual vowel spaces suggests that this overlap is not due to mergers at the individual talker level, but is due to variation across talkers in the production of these vowels, particularly in F1.

3. North

The top panel of Fig. 7 shows the data for the Northern males. The Northern Cities Chain Shift appears to be present in all four talkers in this sample. /ɑ/ is clearly distinct from /ɔ/, due to the lowering and fronting of /ɑ/. All four talkers also produced raised and/or fronted /æ/s. In addition, /ɛ/ shows some backing to reduce overlap with raised /æ/. An inspection of the individual vowel spaces indicates that the talkers from upstate New York and Wisconsin produced backed /ɛ/s, while the two talkers from northern Indiana maintained a more fronted production of /ɛ/. /ʌ/ was also backed and shows some overlap with /ɔ/.

FIG. 7. All tokens produced by the Northern male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

The data for the Northern females are shown in the bottom panel of Fig. 7. The vowels produced by the Northern females also reflect the Northern Cities Chain Shift. Like the Northern males, these female talkers produced lowered and fronted /ɑ/s, raised and fronted /æ/s, and backed /ɛ/s and /ʌ/s. One of the Northern female talkers (NO9) also produced fronted /u/s, while the other three retained backed /u/s.

Despite the overlap of /æ/ and /ɛ/ in the vowel nuclei, the trajectories of these vowels are distinct for both the male and female talkers. While /ɛ/ moves up over the middle third of the vowel, /æ/ moves back and down, so that the offsets of the two vowels are well separated in the F1 × F2 space.

4. Midland

The vowel tokens for the Midland male talkers are shown in the top panel of Fig. 8. This figure reveals two distinctive splits between the talkers. First, two talkers (MI3 and MI4) showed fronted /u/s, whereas the other two talkers had more backed /u/ productions. In addition, two of the talkers (MI2 and MI3) had raised /æ/s, whereas the other two talkers retained the lower /æ/ production. These results are particularly interesting because /u/ fronting is associated with the Southern Vowel Shift while /æ/ raising is associated with the Northern Cities Chain Shift, yet both were present in a single Midland talker (MI3). It is also interesting to note in this figure that the vowel /ɪ/ is completely encompassed by the vowel /e/. This result is due to an apparent merger of these two vowels (at least in terms of nucleus formant frequencies) for one of the Midland male talkers (MI4). An inspection of the trajectories of these vowels for MI4, however, suggests that the phonemic distinction is maintained through differences in the spectral change from the onset to the offset of the vowel. While /e/ moves forward and up over the middle third of the vowel, /ɪ/ moves backward, and the offsets of the two vowels do not overlap in the F1 × F2 space.

FIG. 8. All tokens produced by the Midland male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

The vowel tokens for the Midland females are plotted in the bottom panel of Fig. 8. Unlike their male counterparts, who individually showed evidence of Southern and Northern features, the Midland females showed very few shifted vowels. An inspection of the individual vowel spaces suggests that the variation in Fig. 8 is due to overall differences between talkers and not to individual differences in vowel shifts or mergers, with the exception of one talker (MI6) who produced slightly raised /æ/s. The Midland female talkers as a group also exhibited the merger of /ɑ/ and /ɔ/ and this merger is evident in the individual vowel spaces for three (MI8, MI9, and MI0) of the four talkers.

5. South

Vowel production data for the Southern males are shown in the top panel of Fig. 9. /u/ and /o/ fronting are consistently present in all of the tokens. In the case of /o/ fronting, this results in a near-complete overlap of /o/ and /Ʊ/ across talkers. An inspection of the individual vowel spaces, however, suggests that each individual talker maintained a distinction between /o/ and /Ʊ/ in both the nucleus position and in the formant trajectory over the course of the vowel. Like the Midland talkers, the Southern talkers showed highly similar /e/s and /ɪ/s with respect to formant frequencies at the first-third temporal point, but in all four talkers, the trajectories for /e/ and /ɪ/ were clearly distinct. /e/ moved up and front over the course of the vowel, while /ɪ/ moved back and down. Finally, /æ/ was also fronted consistently in all of the tokens.

FIG. 9. All tokens produced by the Southern male (top) and female (bottom) talkers for the 11 vowels /i ɪ e ɛ æ ɑ ɔ ʌ o Ʊ u/. The ellipses were hand-drawn to include every token for each vowel.

The bottom panel of Fig. 9 is a plot of the vowel tokens for the Southern females. The Southern females were by far the most variable subgroup of talkers in the NSP corpus. The overlap across different vowels is quite large for the entire space, except for /i/. As a group, the Southern females showed a general trend for /u/ and /o/ fronting like the Southern males, but individually, the female talkers ranged from virtually no fronting of these vowels by one of the talkers from Kentucky (SO6) to very fronted back vowels by one of the talkers from Texas (SO7).

6. West

The top panel of Fig. 10 shows the data for the last group of male talkers, the Westerners. The large variance in F2 for /u/, /Ʊ/, and /o/ is due to the talker from Montana (WE4), who produced back vowels with very low second formants. The merger of /ɑ/ and /ɔ/ is also visible in this figure, and an inspection of the individual vowel spaces suggests that all four talkers produced merged or nearly merged low back vowels.

Finally, the data for the Western female talkers are shown in the bottom panel of Fig. 10. Like their male counterparts, the Western females showed a merger of /ɑ/ and /ɔ/ as a group and individually. In addition, /u/ fronting was found in two of the Western females (WE7 and WE9), but not in the others.

In general, the female talkers were more variable within a given dialect than the male talkers. A comparison of the individual vowel spaces, however, suggests that individual females were not generally more variable in their productions of specific vowels than individual males, but that across different female talkers within a given dialect, there was simply more variation in the formant frequencies. Impressionistically, the female talkers in the NSP corpus sounded more heterogeneous than the male talkers in terms of speaking style, which may explain the observed differences in the raw acoustic measures.

The raw data presented in Figs. 5–10 reveal substantial variability within each dialect region, despite the relative homogeneity of the talkers with respect to age, ethnicity, and level of education. As noted above, /æ/ raising was variable in New England, /u/ fronting was variable in the Mid-Atlantic, Midland, and West, and the merger of /ɑ/ and /ɔ/ was also variable in the Mid-Atlantic and Midland. This within-dialect variability reflects the different geographic locations and influences that the talkers have been exposed to. Each of the six geographic dialect regions described in the current study is composed of smaller speech communities, which may have their own unique dialect features. In addition, the talkers had different residential and travel histories, which provided them with different opportunities to hear and acquire regional and social variants. The results of the statistical analyses, however, confirm that these talkers also produce consistent between-dialect differences that reflect previous descriptive sociolinguistic research on the primary regional varieties of American English.

IV. DISCUSSION

The statistical analysis of the vowel duration and formant frequency measures confirmed the presence of the Northern Cities Chain Shift in the Northern talkers and the Southern Vowel Shift in the Southern talkers. In particular, Northerners produced lowered and fronted /ɑ/s and fronted and raised /æ/s. The Northern females also produced backed /ɛ/s and /ʌ/s. The Southern talkers exhibited fronting of /o/ and /e/ centralization. The Southern males also produced fronted /u/s and raised /ɛ/s. We also found evidence of several new features of Southern speech: the fronting of /æ/ (male talkers only) and the raising of /Ʊ/ and /u/. Finally, the Southerners produced longer lax vowels than any of the other talkers. This increase in lax vowel duration could lead to greater perceptual confusability between the tense and lax vowel pairs (such as /i/ and /ɪ/ or /u/ and /Ʊ/), but an inspection of the vowel trajectories suggests that Southern talkers maintain the tense/lax distinction through greater spectral change in the lax vowels than the other talker groups. Taken together, the most robust dialect differences in the present dataset were the Northern Cities Chain Shift among the Northern talkers and the Southern Vowel Shift among the Southern talkers.

In addition, the Westerners produced a merger of /ɑ/ and /ɔ/, as predicted, and the Western males exhibited /u/ fronting. The low-back merger was also reliable for the Midland talkers. In addition, both the male and female Midland talkers exhibited some aspects of the Southern Vowel Shift with the males producing raised /ɛ/s and the females producing centralized /e/s. The individual data plotted for each dialect region in Figs. 5–10 also suggest that other dialect-specific features are present in at least some talkers, although they might not be consistent enough across all of the talkers to be significant in the statistical analyses conducted in this study. For example, some Midland talkers showed /u/ fronting, a typically Southern feature, while others showed Northern features, such as /æ/ raising. All three of the talkers who exhibited these Northern and/or Southern features in their speech were from the Indianapolis metropolitan area. These results are consistent with earlier claims in the literature that the Midland dialect region is not a unique dialect, but instead may be a transition area between the North and the South (Davis and Houck, 1992; but also see Frazer, 1994; Johnson, 1994).

Boberg (2001) suggested that Western New England can also be treated as a transition area between Eastern New England and the North. At first glance, the data presented here support this interpretation: some of the New England talkers showed the /æ/ raising found in the North and some did not. However, /æ/ raising among the male talkers was found in both Eastern and Western New England, while the females from Western New England were split on /æ/ raising with one producing the raised variant and one not. In addition, all of the New England talkers, both Easterners and Westerners, showed the merger of /ɑ/ and /ɔ/ reportedly found in Eastern New England. A partial merger of /ɑ/ and /ɔ/ in the New England talkers was confirmed by the statistical analysis. These results suggest that Eastern and Western New England are perhaps more homogeneous than previously suggested (Boberg, 2001; Labov et al., forthcoming).

The distinctive features of the Mid-Atlantic talkers were less clear. We did not find evidence of raised /ɔ/ in most of the speakers, and, instead, found evidence of an unexpected merger of /ɑ/ and /ɔ/. In addition, the acoustic analysis revealed significant differences between the Mid-Atlantic talkers and the other talker groups in the F2 of /ɑ/, /ɔ/, and /ʌ/. The result of these shifts is particularly apparent in Fig. 4, which shows the alignment of these three low vowels with respect to F2 for the Mid-Atlantic males (filled squares). The pattern is less clear for the Mid-Atlantic females, although significant effects were found for F2 for both /ɑ/ and /ʌ/ for these talkers. Given the reported tendency for [ɔ] raising in the Mid-Atlantic (Labov, 1994; Thomas, 2001), these findings were somewhat surprising. However, the lexical items used to obtain the measure of [ɔ] (frogs and logs) were among a small set of words that have been reported to show large variation between regional varieties of American English and do not always pattern with other words containing the same vowel (Trager, 1930; Wells, 1982). Thus, this effect may be lexically specific. An additional exploration of this result is needed to determine the precise nature of the low back vowels in Mid-Atlantic speech.

In addition to these potential lexical effects, several other aspects of the recording conditions and stimulus materials should be considered in interpreting the results. First, while the recordings were made using high quality digital equipment in a sound-attenuated booth, the participants were asked to read the stimulus materials that were displayed on a computer monitor, which may have resulted in somewhat unnatural utterances. Both speech scientists and sociolinguists have shown that more informal speech can be obtained when participants are allowed to talk spontaneously (Labov, 1972; Ladefoged, Kameny, and Brackenridge, 1976). Second, the stimulus materials that were measured in this study were restricted to the hVd context, and we did not explore vowel tokens in pre-/r/, pre-/l/, or pre-nasal environments. However, the effects of the following consonant, particularly nasals and liquids, on vowel production have been well documented by sociolinguists (Labov et al., forthcoming; Thomas, 2001). Third, most of our discussion of the vowel productions was based on acoustic measurements made at a single temporal point in each vowel token, but a preliminary inspection of the trajectories suggests that additional variation may also be present in how talkers from different regions manipulate spectral change to maintain vowel category contrasts. Finally, while the NSP corpus included nearly an hour of speech from each talker, we examined only a small number of tokens from each talker and the statistical analyses relied on differences between four talkers of each gender of each dialect. In addition, our talkers were relatively homogeneous with respect to age, level of education, and ethnicity. Despite these limitations, however, we hope to have laid the foundation for future research that will examine in more detail the production of American English vowels in different speaking styles and different phonetic and lexical contexts, the role of spectral movement in maintaining vowel category contrast, and talkers representing a wider range of ages, ethnicities, and socioeconomic groups.

The z-score transformation that we used to normalize the vowel formant frequency data in this study was generally successful. The statistical analyses revealed no significant effect of gender on F1 or F2, but significant effects of vowel category and significant vowel × dialect and vowel × gender interactions were maintained. The transformation did produce artifacts in the statistical analysis, however, particularly for the Southern talkers. As discussed above, the back-vowel fronting that is found in the speech of Southern talkers led to a higher mean F2 and a smaller F2 standard deviation. The z-score transformation then produced artificially backed low back vowels as a result of the larger numerator and smaller denominator. Similarly, the z-score transformation is not adequate for the normalization of formant frequency measures obtained at multiple temporal points in the vowel. When vowels that exhibit spectral movement are included in an analysis, the overall shape of the vowel space will differ at different temporal slices, and we predict that artifacts of the transformation will be produced. We can conclude that in cases where vowel systems are being compared that differ in their overall shape, the z-score transform should be used with caution.

As mentioned in the Introduction, one important assumption underlies the present discussion of the different vowel systems of regional varieties of American English: The characteristics of each dialect are defined relative to an unspecified baseline. This baseline could be defined historically in terms of earlier vowel systems in the United States. For example, Labov (1994) characterized the Northern Cities Chain Shift in historical terms, by describing the vowel system of the Northern dialect as the result of a series of phonological changes that can be traced through both real-time and apparent-time data (see footnote 3) over the course of the second half of the 20th century. An alternative to this historical perspective would be to identify baseline pronunciations based on the current vowel systems of the different regional varieties. For example, Fig. 4 showed relatively consistent productions of /i/ across the six different dialects, but much greater variation in /æ/, /ɑ/, and /o/. In the case of /æ/, the Northern talkers were the only ones who produced the raised and fronted variant, suggesting that the lowered and backed production should be treated as the baseline. Similarly, the Southern /o/ was fronted, while a backed /o/ was found in the other five dialects, suggesting that the backed variant should be treated as the baseline. By considering both historical developments and synchronic idiosyncrasies, speech scientists can develop implicit baseline productions to which many other possible variants are compared.

The present results demonstrate that the acoustic characteristics of the vowel systems of American English differ based on the region of origin of the talker. In the current study, talkers from the Northern dialect region reliably produced the Northern Cities Chain Shift, whereas talkers from the Southern dialect region reliably produced some features of the Southern Vowel Shift. In addition, a merger or partial merger of /ɑ/ and /ɔ/ was found in the New England, Mid-Atlantic, Midland, and Western talkers. Our analysis also revealed four new results: the spreading of the high front vowel shifts from the Southern to the Midland dialect, the raising of the high back vowels for the Southern talkers, the fronting of /æ/ for the Southern male talkers, and the alignment of the low back vowels /ɑ/, /ɔ/, and /ʌ/ with respect to F2 in the Mid-Atlantic dialect. These data provide new benchmarks for the acoustic characteristics of the vowel systems of six regional varieties of American English as well as implicit baselines for “General” American English.

ACKNOWLEDGMENTS

This work was supported by NIH NIDCD T32 Training Grant No. DC00012 and NIH NIDCD R01 Research Grant No. DC00111 to Indiana University. We would like to acknowledge the contributions of Allyson Carter, Connie Clarke, Caitlin Dillon, Jimmy Harnsberger, Rebecca Herman, and Luis Hernandez in the development of the Nationwide Speech Project corpus, including the compilation of the materials, the selection of the equipment, and pilot testing of both equipment and participants.

APPENDIX

TALKER DEMOGRAPHIC INFORMATION

Key: New England (NE), Mid-Atlantic (AT), North (NO), Midland (MI), South (SO), West (WE).

Talker ID Sex Age Hometown (city, state) Education Occupational field (or major)
NE1 m 24 Chichester, NH BA/BS Psychology
NE2 m 20 Marblehead, MA Undergraduate (see footnote 4) Law
NE3 m 20 Sandown, NH Undergraduate Business
NE4 m 18 Londonderry, NH Undergraduate Finance
NE6 f 24 Winchester, MA BA/BS English
NE7 f 18 Newfane, VT Undergraduate Business
NE8 f 18 Enfield, CT Undergraduate Translation
NE0 f 18 Sharon, MA Undergraduate Psychology

AT1 m 18 Long Island, NY Undergraduate Business
AT2 m 18 Manalapan, NJ Undergraduate Law
AT3 m 18 Marlboro, NJ Undergraduate Accounting
AT5 m 18 Plainview, NY Undergraduate Television
AT6 f 18 Middletown, NJ Undergraduate Musical Theater
AT7 f 20 Great Neck, NY Undergraduate Public Relations
AT8 f 18 Owings Mills, MD Undergraduate Education
AT9 f 18 Oradell, NJ Undergraduate Apparel Merchandising

NO2 m 20 Buffalo, NY Undergraduate Law
NO3 m 18 St. John, IN Undergraduate Undecided
NO4 m 19 New Berlin, WI Undergraduate Business
NO5 m 18 Hobart, IN Undergraduate Education
NO6 f 19 South Bend, IN Undergraduate Education
NO8 f 18 Wilmette, IL Undergraduate Advertising
NO9 f 20 Munster, IN Undergraduate Business
NO0 f 18 South Bend, IN Undergraduate Journalism

MI1 m 19 Centerville, IN Undergraduate Broadcasting
MI2 m 19 Greencastle, IN Undergraduate Film
MI3 m 19 Carmel, IN Undergraduate Film
MI4 m 18 Anderson, IN Undergraduate Investment Banking
MI6 f 19 Grandview, IN Undergraduate Nutrition Science
MI8 f 19 Seymour, IN Undergraduate Dentistry
MI9 f 18 Selvin, IN Undergraduate Psychology
MI0 f 20 Lafayette, IN Undergraduate Law

SO1 m 18 New Albany, IN Undergraduate National Security
SO2 m 22 Statesville, SC BA/BS Music Performance
S22 m 20 Georgetown, IN Undergraduate Education
SO5 m 18 Birmingham, AL Undergraduate Investment Banking
SO6 f 18 Louisville, KY Undergraduate Journalism
SO7 f 18 Dallas, TX Undergraduate Retail
SO8 f 19 League City, TX Undergraduate Interior Design
S10 f 19 Owensboro, KY Undergraduate Physical Therapy

WE2 m 20 Albuquerque, NM Undergraduate Law
WE3 m 19 Oakland, CA Undergraduate Journalism
WE4 m 23 Billings, MT MA/MS Library Science
WE5 m 18 Santa Clarita, CA Undergraduate Music Performance
WE6 f 20 West Covina, CA Undergraduate Broadcast News
WE7 f 19 Los Angeles, CA Undergraduate Music Performance
WE8 f 21 Henderson, NV Undergraduate Biology
WE9 f 20 Orange, CA Undergraduate Fashion

Footnotes

1

Due to an oversight in the design and collection of the NSP corpus, /ɔ/ was not available in hVd context. The use of the frogs and logs tokens, while not ideal, allowed us to examine the vowel systems of our talkers more completely and to discuss the regional extent of the merger of /ɑ/ and /ɔ/. More details about the sentences can be found in Clopper (2004).

2

Plots of the individual vowel spaces, including formant trajectories, for the 48 talkers are presented in Clopper (2004).

3

“Real-time” refers to data collected longitudinally to study language change over time within individuals and within a speech community. “Apparent-time” refers to data collected at a single point in time from participants with a range of ages, from which inferences about language change can be tentatively drawn (Labov, 1994).

4

“Undergraduate” refers to talkers who were enrolled as undergraduate students at Indiana University at the time of recording.

Contributor Information

Cynthia G. Clopper, Department of Psychology, Indiana University, Bloomington, Indiana 47405.

David B. Pisoni, Department of Psychology, Indiana University, Bloomington, Indiana 47405.

Kenneth de Jong, Department of Linguistics, Indiana University, Bloomington, Indiana 47405.

References

  1. Adank P, Smits R, van Hout R. A comparison of vowel normalization procedures for language variation research. J. Acoust. Soc. Am. 2004;116:3099–3107. doi: 10.1121/1.1795335.
  2. Bauer L. Tracing phonetic change in the received pronunciation of British English. J. Phonetics. 1985;13:61–81.
  3. Boberg C. The phonological status of Western New England. Am. Speech. 2001;76:3–29.
  4. Clopper CG. Linguistic experience and the perceptual classification of regional varieties of American English. Indiana University Ph.D. dissertation. 2004.
  5. Davis LM, Houck CL. Is there a Midland dialect?—Again. Am. Speech. 1992;67:61–70.
  6. Frazer TC. On transition areas and the ‘Midland’ dialect: A reply to Davis and Houck. Am. Speech. 1994;69:430–435.
  7. Hagiwara R. Dialect variation and formant frequency: The American English vowels revisited. J. Acoust. Soc. Am. 1997;102:655–658.
  8. Hillenbrand J, Getty LA, Clark MJ, Wheeler K. Acoustic characteristics of American English vowels. J. Acoust. Soc. Am. 1995;97:3099–3111. doi: 10.1121/1.411872.
  9. Johnson E. Yet again: The Midland dialect. Am. Speech. 1994;69:419–430.
  10. Labov W. Some principles of linguistic methodology. Lang. Soc. 1972;1:97–120.
  11. Labov W. Principles of Linguistic Change: Internal Factors. Malden, MA: Blackwell; 1994.
  12. Labov W. The three dialects of English. In: Linn MD, editor. Handbook of Dialects and Language Variation. San Diego: Academic Press; 1998. pp. 39–81.
  13. Labov W, Ash S, Boberg C. Atlas of North American English. New York: Mouton de Gruyter; (to be published).
  14. Ladefoged P, Kameny I, Brackenridge W. Acoustic effects of style of speech. J. Acoust. Soc. Am. 1976;59:228–231. doi: 10.1121/1.380856.
  15. Lobanov BM. Classification of Russian vowels spoken by different speakers. J. Acoust. Soc. Am. 1971;49:606–608.
  16. Peterson GE, Barney HL. Control methods used in a study of the vowels. J. Acoust. Soc. Am. 1952;24:175–184.
  17. Sjölander K, Beskow J. WaveSurfer, Version 1.6.2. Stockholm: Centre for Speech Technology, Royal Institute of Technology; 2004. computer software.
  18. Thomas ER. An Acoustic Analysis of Vowel Variation in New World English. Durham, NC: Duke University Press; 2001.
  19. Trager GL. The pronunciation of ‘short a’ in American Standard English. Am. Speech. 1930;5:396–400.
  20. Wells JC. Accents of English 3: Beyond the British Isles. Cambridge: Cambridge University Press; 1982.
