Abstract
Purpose
The production of speech sound classes in adult language learners is affected by (a) interference between the native language and the target language and (b) speaker variables such as time speaking English. In this article, we demonstrate how phonological process analysis, an approach typically used in child speech, can be used to characterize adult target language phonological learning.
Method
Sentences produced by 2 adult Japanese English language learners were transcribed and coded for phoneme accuracy and analyzed according to the percent occurrence of phonological processes. The results were interpreted relative to a contrastive analysis between Japanese and English phonetic inventories and developmental norms for monolingual English children.
Results
In this pilot study, common consonant processes included vocalization, final consonant devoicing, and cluster reduction. These are processes commonly observed in the speech of children who are typically developing.
Conclusions
The process analysis can inform clinical approaches to pronunciation training in adult English language learners. For example, the cycles approach (Hodson & Paden, 1981) may provide more clinical efficacy than an articulatory approach in which phonemes are targeted individually. In addition, a process analysis can enable clinicians to examine the principles of within-class and across-class generalization in adult pronunciation instruction.
Most adults who learn a new language will speak that language with a noticeable foreign accent (Flege, Munro, & MacKay, 1995; Scovel, 2000). Some nonnative speakers of English may experience negative social and professional consequences as a result of their foreign-accented speech (Deprez-Sims & Morris, 2010). A study by Munro (2003) identified stereotyping, harassment, and job discrimination as some of the negative social consequences experienced by English language learners (ELLs). Gluszek and Dovidio (2010) demonstrated that speakers with strong foreign accents reported more difficulties with communication and a higher perception of stigmatization from others than did individuals with mild accents.
Speaking with a foreign accent does not automatically lead to communication difficulties such as low intelligibility or comprehensibility (Derwing & Munro, 1997, 2009). However, there are times when interference between a speaker's native language (NL) and target language (TL) does result in reduced intelligibility and comprehensibility (Bresnahan, Ohashi, Nebashi, Liu, & Shearman, 2002; Trofimovich & Isaacs, 2012). In such cases, nonnative speakers may elect to receive accent modification from teachers of English to speakers of other languages or speech-language pathologists to improve their communication skills. In fact, depending on the clinical population served, some speech-language pathologists may benefit from receiving accent modification if they plan to provide services outside of their native language community (Levy & Crowley, 2012).
Accent modification is an acknowledged area of clinical practice for speech-language pathologists (American Speech-Language-Hearing Association [ASHA], 1997; Derwing & Munro, 2005; Sikorski, 2005). ASHA clearly states that foreign accents demonstrate speech differences rather than speech disorders (ASHA, 1997). Regardless of this distinction, the same clinical rigor applied to the treatment of speech disorders should also be applied to accent modification services (Franklin & Stoel-Gammon, 2014; Levy & Crowley, 2012; Muller, Ball, Guendouzi, & Muller, 2000; Sikorski, 2005). This clinical rigor is encapsulated in the framework of evidence-based practice. In this framework, clinical experience, published empirical research, and client values act as drivers of best practice (Kamhi, 2006). However, in our well-intentioned efforts to distinguish difference from disorder, speech-language pathologists may be overlooking effective approaches for accent modification simply because those approaches are already associated with the remediation of speech disorders in children. When treating sound errors in children, clinicians attempt to effect the greatest amount of change in the child's phonological system with the least amount of teaching (Gierut, 1998). This type of efficacy should also apply to pronunciation training in adults with speech differences rather than disorders. Yet experts in the field of adult pronunciation instruction report the following two broad challenges to this area of practice: (a) a lack of empirical studies addressing pronunciation instruction and (b) research that addresses general learning influences without informing teaching practices (Munro & Derwing, 2011).
In this article, we address the second challenge by presenting a framework for adult phonological learning that addresses general learning influences and informs pronunciation teaching practices. First, we review Flege's (1981) speech learning model (SLM) and discuss its implications for adult language learning regarding the impact of language experience and the interaction between NL and TL phonetic inventories. Next, we use phonological process analysis to demonstrate parallels between first-language phonological development in children who speak English and phonological learning in adult ELLs. Last, we discuss how these parallels may inform approaches to adult pronunciation instruction that are currently associated with the remediation of speech sound disorders in children.
Implications of Flege's SLM for Adult Language Learners
Flege's SLM (Flege, 1981; Flege et al., 1995) establishes a relationship between age, experience, and phonological learning. As individuals age, they are less likely to establish new categories for TL sounds that do not exist in their NL. This phenomenon results in a foreign accent that is, in part, a function of the phonological disparity between the speaker's NL and the TL. According to Flege, the decrease in pronunciation proficiency with increased age of learning is gradual rather than precipitous because the phonetic systems of adults remain malleable throughout their lifetime. In addition, Flege states that experience can alter an adult's phonological system. This experience can come from time spent speaking the foreign language (Flege, 1981; Mack, 2003; Munro & Derwing, 2008; Oh et al., 2011) or from direct phonological instruction (Franklin & Stoel-Gammon, 2014). These last two points are important to consider because pronunciation instruction would be of no value if adult phonological systems were intractable. Last, Flege posits that the same processes underlying NL acquisition can be applied to TL learning (Mack, 2003).
Flege (1981; Flege et al., 1995) identifies the following three relationships between NL and TL segments: identical, similar, and new. According to the SLM, the perception and production of TL segments are affected by two mechanisms: category assimilation and category dissimilation (Yeni-Komshian, Flege, & Liu, 2000). Category assimilation occurs when a speaker has difficulty forming a new category for a TL sound because a similar sound exists in the speaker's NL. Because this NL phoneme is not identical to the TL phoneme, learners assimilate the TL phoneme into the existing NL category. Category dissimilation occurs when the learner successfully forms a new category for the TL sound. Dissimilation is most likely to occur when the learner perceives a clear difference between the new TL sound and existing NL sounds, resulting in the creation of a new and distinct TL phoneme.
The SLM relies on a comparison of a speaker's two languages to identify which TL sounds a speaker will have difficulty learning. Researchers and clinicians have used contrastive analysis to characterize differences between English and a variety of languages and to inform clinical practice (Anderson, 2004; Chen, Robb, Gilbert, & Lerman, 2001; Hwa-Froelich, Hodson, & Edwards, 2002; Yavas & Goldstein, 1998). Depending on the phonetic inventories of the NL and the TL involved, adults may encounter several new phonemes and phonological patterns when learning a TL. At the same time, there are some features of the TL that adults will not need to learn firsthand because they already have an established phonological system and have learned such features in their NL.
Flege et al.'s (1995) SLM implies that both across-language and within-language factors influence general adult phonological learning. Across-language factors stem from interference between the phonetic inventories of the speaker's two languages. Within-language factors include speaker variables such as time speaking English and age of arrival in the TL country. The SLM also posits that parallels exist between first-language phonological acquisition and second-language phonological learning. Next, we present a framework that demonstrates the effects of across-language and within-language factors on Japanese ELL phonological learning as well as the parallels between NL acquisition and TL learning.
Across-Language Differences Between Japanese and English
Japanese Consonant Inventory
Table 1 contrasts the Japanese and English consonant inventories. Shared phonemes are sounds that are phonetically similar between the two languages. Unshared phonemes are sounds that are phonetically dissimilar and language specific.
Table 1.
Shared and unshared consonant phonemes between English and Japanese.
| Sound classes | Shared phonemes | Unshared phonemes specific to English | Unshared phonemes specific to Japanese |
|---|---|---|---|
| Plosives | /p, b, t, d, k, g/ | | |
| Nasals | /n, m, ŋ/ | | /ɴ/ |
| Fricatives | /s, z, h/ | /f, v, θ, ð, ʃ, ʒ/ | /ɸ, ç/ |
| Affricates | /tʃ, dʒ/ | | /dz, ts/ |
| Approximants: liquid, glide | /j, w/ | /l, ɹ/ | /r/ |
Note. Speech-language pathologists typically classify /ʃ/ and /ʒ/ as palatal fricatives. However, these phonemes are articulatorily postalveolar in English and therefore differ from the Japanese voiceless palatal fricative.
Stops. Japanese stops include /p, b, t, d, k, g/; however, voiceless stops are aspirated in English and unaspirated in Japanese (Tsujimura, 1996). The placement of the tongue also differs slightly in Japanese because stops are produced with the tongue blade rather than the tip (Tsujimura, 1996).
Fricatives. Japanese does not contain labiodental (/f, v/) or interdental (/θ, ð/) fricatives. Japanese contains the voiceless and voiced alveolar fricatives /s/ and /z/. The Japanese /s/ is produced slightly differently from the English /s/: The lips are not rounded in Japanese but are slightly rounded in English (Tsujimura, 1996). Japanese also contains fricatives not found in English, including the voiceless bilabial fricative /ɸ/ and the voiceless palatal fricative /ç/ (Tsujimura, 1996).
Affricates. The affricates of English and Japanese vary considerably. Japanese includes voiced and voiceless alveo-palatal and alveolar affricates (Tsujimura, 1996). The Japanese alveo-palatal affricates /tʃ, dʒ/ are slightly less rounded than the English equivalents. The Japanese alveolar affricates /ts/ and /dz/ are not present in English.
Approximants. Japanese contains three approximants: one liquid, /r/, and two glides, /w/ and /j/ (Tsujimura, 1996). The Japanese alveolar liquid, which is usually transcribed as a flap, sounds much like the English /d/. The velar glide /w/ is unrounded and does not typically involve much lip movement, unlike the English /w/, which requires lip movement. Some productions of the Japanese /w/ have included lip movement, indicating possible dialectal differences (Tsujimura, 1996; Vance, 1987).
Nasals. The Japanese /m/ and /n/ do not differ from English. The velar nasal, /ŋ/, is used by some Japanese speakers but not all. Japanese also has a uvular nasal /ɴ/, which is used before a pause (Vance, 1987).
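The contrastive analysis summarized in Table 1 can be approximated with simple set operations. The following is a minimal sketch, assuming the inventories are entered exactly as the text describes them (with /ɴ/, /ts/, and /dz/ treated as Japanese-specific); the function name `new_for_learner` is hypothetical and introduced only for illustration.

```python
# Consonant inventories transcribed from Table 1 and the surrounding text.
SHARED = {"p", "b", "t", "d", "k", "g", "n", "m", "ŋ",
          "s", "z", "h", "tʃ", "dʒ", "j", "w"}
ENGLISH_ONLY = {"f", "v", "θ", "ð", "ʃ", "ʒ", "l", "ɹ"}
JAPANESE_ONLY = {"ɴ", "ɸ", "ç", "dz", "ts", "r"}

english = SHARED | ENGLISH_ONLY
japanese = SHARED | JAPANESE_ONLY


def new_for_learner(target: set, native: set) -> list:
    """Return target-language phonemes that are absent from the native inventory."""
    return sorted(target - native)


# Phonemes a Japanese learner of English would meet as unshared sounds.
print(new_for_learner(english, japanese))
```

A set difference of this kind only flags candidate problem sounds. It does not capture the phonetic differences among shared phonemes, such as aspiration or lip rounding, that are described above.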
Observed Substitution Patterns in Japanese ELLs
Because some English sounds are not present in Japanese, Japanese ELLs may substitute those sounds with Japanese phonemes. For example, /f/ may be substituted with /ɸ/, /v/ with /b/, /θ/ with /s/, and /ð/ with /z/. Japanese speakers are not likely to make a distinction in the pronunciation of English words beginning with the above pairs (Tsujimura, 1996). For example, the words vase and base would both likely be produced as base.
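The substitution pattern just described can be expressed as a lookup table. The sketch below is illustrative only: the mapping contains just the four substitutions listed in the text, the broad transcriptions of vase and base are simplified, and `predict_production` is a hypothetical helper rather than an established tool.

```python
# Substitutions listed in the text: English phoneme -> likely Japanese replacement.
SUBSTITUTIONS = {"f": "ɸ", "v": "b", "θ": "s", "ð": "z"}


def predict_production(target):
    """Replace each unshared English phoneme with its listed substitute."""
    return [SUBSTITUTIONS.get(phoneme, phoneme) for phoneme in target]


vase = ["v", "e", "s"]  # simplified broad transcription, assumed for illustration
base = ["b", "e", "s"]

# Both words collapse to the same predicted form.
print(predict_production(vase) == predict_production(base))
```

Such a table predicts a possible error, not a certain one. Whether a given speaker actually makes the substitution depends on the speaker variables discussed later in the article.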
The English /ɹ/ presents some challenges for Japanese speakers. In word-final position, some Japanese ELLs omit the /ɹ/ and lengthen the preceding vowel. When the /ɹ/ is in word-initial position, it is often substituted with the Japanese alveolar liquid /ɾ/ (Tsujimura, 1996), which sounds much like the Standard English flapped /d/ or /t/. Although the Japanese /ɾ/ and the English liquids /ɹ/ and /l/ differ in place and manner of articulation, Japanese ELLs perceptually collapse the English liquids into the same category as the Japanese alveolar liquid, leading to sound substitutions (Aoyama, Flege, Guion, Akahane-Yamada, & Yamada, 2004). This is an example of the category assimilation described in Flege's SLM.
Japanese Phonotactic Constraints and Resulting English Phonological Processes
Phonotactic constraints are language-specific restrictions for combining phonemes into words. Japanese words end mostly with open syllables. English words may end with open or closed syllables. Japanese lacks consonant clusters in word-initial or word-final positions, so Japanese ELLs may experience difficulty producing English words with closed syllables or consonant clusters (Avery & Ehrlich, 1992). To simplify English clusters, Japanese ELLs may insert a vowel between consonants (Tajima & Kubo, 1999) or add vowels to the end of English words, a phonological process termed epenthesis. Examples of epenthesis are also found in Japanese words that have been borrowed from English, such as lamp, bus, and hot, which are produced as /rampu/, /basu/, and /hotːo/, respectively (Avery & Ehrlich, 1992; Tsujimura, 1996).
Epenthesis is a type of phonological process. Phonological processes are defined as orderly sound changes that affect entire classes of sounds (Edwards & Shriberg, 1983). Processes simplify a TL phonology to match a learner's existing phonological rules. They are observed in the speech of very young children whose phonological systems do not yet match the complexity of an adult's (Edwards & Shriberg, 1983; Grunwell, 1985; Stoel-Gammon & Dunn, 1985). A number of studies have investigated the occurrence of phonological processes in the speech of typically developing children who are bilingual or ELLs (Anderson, 2004; Goldstein, Fabiano, & Washington, 2005; Holm & Dodd, 1999; Morrow, Goldstein, Gilhool, & Paradis, 2014).
Fewer studies have characterized adult ELL pronunciation using phonological processes. Hwa-Froelich, Hodson, and Edwards (2002) identified a number of possible phonological processes in the speech of adult Vietnamese ELLs. Final consonant devoicing and deletion, epenthesis, and problems completing clusters were among those predicted based on a contrast of Vietnamese and English phonologies. Although Hwa-Froelich, Hodson, and Edwards (2002) predicted several processes that may occur in the speech of Vietnamese ELLs, they did not analyze their speakers' English productions to determine if the predicted processes actually occurred. Their study focused on identifying dialectal differences in Vietnamese phonology.
This study predicts phonological processes that may occur in the speech of adult Japanese ELLs and compares those predictions with the productions of two Japanese ELL speakers. Figure 1 illustrates the predicted interlanguage phonological processes that may be produced in the speech of Japanese adult ELLs. Japanese phonotactic constraints are indicated in the white area outside the circle. Arrows identify phonological processes that may result from the effects of the Japanese phonological constraints on English. The resulting interlanguage processes are indicated in the outer dark gray area. The smaller white circle in the middle of the figure represents production of English phonology with native-like proficiency. This native-like proficiency is more likely to be attained when a speaker learns a TL as a young child than when learning occurs in adulthood. In Figure 1, the area labeled “within-language differences” represents differences in the frequency and occurrence of interlanguage processes that can exist among Japanese ELLs.
Figure 1.
Representation of possible Japanese native language (NL) effects on English target language (TL) phonology.
Japanese phonotactic constraints can result in a variety of interlanguage phonological processes in English. As Figure 1 illustrates, the lack of word-initial or word-final consonant clusters in Japanese may result in cluster reduction (e.g., string /stɹɪŋ/ → /sɹɪŋ/) or epenthesis (e.g., /stɹɪŋ/ → /stəɹɪŋ/). Epenthesis may also occur because Japanese favors open syllables and because nasals are the only allowable codas. Therefore, Japanese ELLs may add a vowel to the end of words with nonnasal codas to create an open syllable (e.g., big /bɪg/ → /bɪgə/). The deletion of final consonants is another interlanguage process that may arise from the Japanese language's specification of nasal codas and simple open syllables. For example, when an English word contains a coda that is more marked than nasals (e.g., /bɪg/), the speaker may delete the /g/ to create an open syllable (e.g., /bɪg/ → /bɪ/). Another option for dealing with marked English codas is to produce a relatively less marked coda, in this case by devoicing the final /g/ to its cognate /k/ (e.g., /bɪg/ → /bɪk/).
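The three coda repairs illustrated with /bɪg/ can be written as small transformations on a phoneme list. This is a minimal sketch under stated assumptions: the input word is assumed to end in a consonant, the voicing table covers only a few obstruent pairs, and the function names are hypothetical.

```python
# Voiced obstruents mapped to their voiceless cognates (partial list, for illustration).
VOICELESS_COGNATE = {"b": "p", "d": "t", "g": "k", "v": "f",
                     "z": "s", "ð": "θ", "ʒ": "ʃ", "dʒ": "tʃ"}


def final_epenthesis(phonemes, vowel="ə"):
    """Add a vowel after the final consonant to create an open syllable."""
    return phonemes + [vowel]


def final_consonant_deletion(phonemes):
    """Drop the final consonant to create an open syllable."""
    return phonemes[:-1]


def final_devoicing(phonemes):
    """Replace a voiced final obstruent with its voiceless cognate."""
    last = phonemes[-1]
    return phonemes[:-1] + [VOICELESS_COGNATE.get(last, last)]


big = ["b", "ɪ", "g"]
for repair in (final_epenthesis, final_consonant_deletion, final_devoicing):
    print(repair.__name__, "".join(repair(big)))
```

Each function returns one of the forms given in the text: /bɪgə/, /bɪ/, and /bɪk/, respectively.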
As noted earlier, Japanese speakers are known to produce the words vase and base as /bes/. The relative lack of fricatives in Japanese when compared with English may result in the stopping of fricatives in the interlanguage. The unaspirated nature of voiceless stops in Japanese may result in what is perceived to be prevocalic voicing by English listeners. For example, the target word pig is produced with an aspirated initial /p/ in Standard American English (e.g., [pʰɪg]). However, if the speaker produces pig without aspiration (e.g., [p˭ɪg]), the initial /p/ will likely be perceived as /b/ by a native English speaker because the unaspirated word-initial [p˭] is phonetically equivalent to an English /b/. Last, the lateral liquid /l/ and the rhotic liquid /ɹ/, both missing from the Japanese inventory, may cause some Japanese speakers to glide or vocalize these liquids in some contexts (e.g., care /kɛɹ/ → /kɛʊ/).
Next, we demonstrate the feasibility and clinical utility of phonological process analysis for adult ELLs. Sentences produced by two adult Japanese ELLs are analyzed. The results are interpreted with respect to the differences between Japanese and English and observed parallels with developmental processes observed in monolingual English-speaking children.
Method
Participants
This study was approved by the institutional review board at the University of Washington. Two female native speakers of Japanese participated in this study. Both were residing in Washington State at the time of the study. Neither participant had any known history of speech, language, or hearing impairments by self-report. English and Japanese were the only languages spoken by the participants. Both participants reported that their exposure to English before moving to the United States involved academic instruction that focused on reading and writing with little opportunity to use English conversationally.
Speaker A was 37 years of age and had been speaking English consistently in the United States for 7 years at the time of the study. Her level of spoken English proficiency was classified as “advanced high” according to the American Council on the Teaching of Foreign Languages (ACTFL; 2012) proficiency guidelines. An advanced high speaker is characterized by the ability to speak confidently at the level of the oral paragraph in past, present, and future contexts. Advanced high speakers may still retain a noticeable accent but can be understood by native speakers of the TL who are not familiar with accented speech.
Speaker B was 39 years of age and had been speaking English in the United States for 3 months at the time of the study. She reported speaking English occasionally when interacting with English speakers at the store and in other public venues. Speaker B's level of spoken English proficiency was classified as “intermediate low” according to the ACTFL guidelines. The intermediate low level of proficiency is characterized by the ability to speak in sentences and strings of sentences concerning familiar topics in uncomplicated communicative tasks (ACTFL, 2012).
Recording Procedures
Recordings took place in a sound-attenuated room in the Speech and Hearing Clinic at the University of Washington. The recordings were made through a mono channel using an Audio-Technica ATM75 (Audio-Technica Corp., Tokyo, Japan) condenser headset microphone that was placed approximately 3 inches in front and to the right of the participant's mouth. The microphone was connected to an Apogee Electronics Mini-Me Analog-to-Digital Converter (Apogee Electronics Corp., Santa Monica, CA) for sound digitization at the following settings: a sampling rate of 44.1 kHz, 16-bit resolution, and a curve setting of 2. The soft-limiting setting was activated to prevent peak clipping during the recording. Recordings were made directly into a Sony (Tokyo, Japan) laptop with Praat 4.1.27 through a mono channel with a buffer size of 50 megabytes and a sampling rate of 44.1 kHz (Boersma & Weenink, 2005; Wood, 2005).
Each participant was recorded as she read 22 sentences. Before recording began, the participants were given time to review each sentence and ask questions regarding unfamiliar words. The sentences were generated using the sentence intelligibility portion of the Speech Intelligibility Test software, which randomly creates lists of semantically unpredictable, phonetically balanced sentences ranging in length from 5 to 15 words (Yorkston, Beukelman, & Hakel, 1996). See the Appendix for the sentences used in this study.
Data Analysis
Identification of Phoneme Accuracy
The sentences were phonetically transcribed according to Standard English connected speech. For example, in Sentence 3, “I don't want to discourage people,” the words want to were transcribed as /wɑntə/ to account for the gemination of the final /t/ in want with the initial /t/ in to. The researcher and three graduate students listened to the sentences produced by the speakers and coded each sentence for phoneme accuracy. Phonemes that were produced differently than the Standard English target were marked as inaccurate productions. Phonemes produced differently due to coarticulation were not coded as inaccurate. For example, most native speakers of English partially devoice phrase final /z/. Therefore, final devoiced consonants were identified as being inaccurate only when the Japanese speaker fully devoiced a consonant to its voiceless cognate. The researcher and the graduate students met to compare results. Any phonemes for which there was disagreement were reviewed by the group to reach consensus.
Identification of Phonological Processes
For each sentence in the set, the number of opportunities for various consonant and vowel processes to occur was identified based on the connected speech transcription. For example, the consonant processes possible in the word patrons /peɪtɹənz/ are prevocalic voicing, cluster reduction, gliding, and final consonant devoicing. Once the opportunities for each process were identified, the number of actual occurrences of processes was determined for each speaker by identifying the phoneme errors in each sentence and characterizing each error according to the aforementioned phonological processes being investigated. For example, if the target word /peɪtɹənz/ was produced as /peɪtəns/, we noted the occurrence of cluster reduction /tɹ/ → /t/ and final consonant devoicing /z/ → /s/. To calculate the percent occurrence (POC) for each process, the total number of occurrences was divided by the total number of opportunities and multiplied by 100.
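The POC calculation described above can be sketched in a few lines. The example tallies only the single word patrons, assuming one opportunity per listed process, which is a simplification made for illustration; in the study, counts were accumulated over the full sentence set. The function name `percent_occurrence` is hypothetical.

```python
from collections import Counter


def percent_occurrence(occurrences, opportunities):
    """POC = (occurrences / opportunities) * 100."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return 100.0 * occurrences / opportunities


# Opportunities in the target /peɪtɹənz/ (one per process, assumed for illustration).
opportunities = Counter({
    "prevocalic voicing": 1,
    "cluster reduction": 1,
    "gliding": 1,
    "final consonant devoicing": 1,
})

# Occurrences when the word is produced as /peɪtəns/.
occurrences = Counter({
    "cluster reduction": 1,
    "final consonant devoicing": 1,
})

# Counter returns 0 for processes that never occurred.
poc = {process: percent_occurrence(occurrences[process], count)
       for process, count in opportunities.items()}
print(poc)
```

Keeping opportunities and occurrences in separate tallies mirrors the two-step procedure in the text: opportunities are fixed by the target transcription, whereas occurrences vary by speaker.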
The processes included in this study were final consonant deletion, cluster reduction, gliding, stopping, vocalization, prevocalic voicing, epenthesis, and final consonant devoicing. These processes were included because they are developmental processes commonly observed in the speech of monolingual English children (Hodson & Paden, 1981; Lowe, 1994; Stoel-Gammon & Dunn, 1985) and because they are also expected to occur as a result of interlanguage pressures between Japanese and English. In this study, only processes that had the opportunity to occur at least 30 times in the sentence set were analyzed.
Results
Phonological Processes POC
Tables 2 and 3 summarize the POC for phonological processes and the consonants affected, respectively. The results reveal that Speaker B, who spoke English with an intermediate low level of proficiency, had a higher overall POC than Speaker A, who was more proficient in English (17.4 POC and 7.88 POC, respectively). For both speakers, the three most common processes were vocalization, final consonant devoicing, and cluster reduction.
Table 2.
Percent occurrence of phonological processes affecting consonants.
| Phonological process | Opportunities for occurrence | Percent occurrence: Speaker A (age = 37 y, TSE = 7 y), % | Percent occurrence: Speaker B (age = 39 y, TSE = 3 mos), % |
|---|---|---|---|
| Vocalization | 34 | 32.4 | 52.9 |
| Cluster reduction | 84 | 15.5 | 15.5 |
| Final consonant devoicing | 54 | 10.7 | 28.6 |
| Final consonant deletion | 111 | 4.5 | 12.6 |
| Stopping | 101 | 4.0 | 11.9 |
| Gliding | 36 | 0.0 | 13.9 |
| Prevocalic voicing | 53 | 0.0 | 3.8 |
| Epenthesis | 138 | 0.0 | 0.0 |
| Average | | 7.88 | 17.4 |
Note. TSE = time speaking English.
Table 3.
Consonant phonemes affected by processes.
| Speaker | FCDel | CR | PVV | Fdev | Gl | VF | St | Voc |
|---|---|---|---|---|---|---|---|---|
| A | /n, r, t/ | /nt, nd, ɹd, mz, dz, ɹs, ɹt, bl/ | | /z, d/ | | /ŋ/ | /ð/ | /l, ɚ/ |
| B | /n, r, t, v, l, d/ | /nt, nd, ɹd, nz, dz, ɹs, ɹt, ns, ɹd, md/ | /t/ | /z, d/ | /ɹ, l/ | | /ð, v/ | /l, ɚ/ |
Note. FCDel = final consonant deletion; CR = cluster reduction; PVV = prevocalic voicing; Fdev = final consonant devoicing; Gl = gliding; VF = velar fronting; St = stopping; Voc = vocalization.
These three processes are also among the latest developmental processes to disappear from the speech of monolingual English children (Stoel-Gammon & Dunn, 1985). It is here that we see a parallel between NL phonological acquisition and TL phonological learning.
Vocalization
Vocalization affected the mid-central rhotic vowel in words such as mother (Sentence 14) and the postvocalic /l/ in words such as special (Sentence 21). Vocalization in Speaker A did not affect the rhotic vowels and the postvocalic /l/ equally. Of the 19 opportunities for postvocalic /l/ production, Speaker A vocalized only twice. By contrast, Speaker A vocalized in nine of 15 opportunities involving the rhotic vowel. Of the 19 opportunities for postvocalic /l/, Speaker B vocalized 11 times. Speaker B vocalized in seven of 15 opportunities involving a rhotic vowel.
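As a worked check, the context-level counts reported above can be pooled to recover each speaker's overall vocalization POC. The sketch below uses only the counts stated in this paragraph; the data layout and the helper `pooled_poc` are assumptions made for illustration.

```python
# (occurrences, opportunities) reported in the text for each context.
vocalization = {
    "A": {"postvocalic /l/": (2, 19), "rhotic vowel": (9, 15)},
    "B": {"postvocalic /l/": (11, 19), "rhotic vowel": (7, 15)},
}


def pooled_poc(contexts):
    """Pool occurrences and opportunities across contexts, then compute POC."""
    occurrences = sum(occ for occ, _ in contexts.values())
    opportunities = sum(opp for _, opp in contexts.values())
    return round(100 * occurrences / opportunities, 1)


for speaker, contexts in vocalization.items():
    print(speaker, pooled_poc(contexts))
# A 32.4
# B 52.9
```

The pooled values (11 of 34 and 18 of 34 opportunities) agree with the vocalization row of Table 2.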
Final Consonant Devoicing
The POC of final consonant devoicing was more than twice as high for Speaker B as for Speaker A. Final /z/ accounted for 28 of the 54 instances of voiced final consonants in the sentence set. Opportunities for the production of final /d/ and /v/ occurred 14 and 10 times, respectively. Word-final /dʒ/ and /g/ each occurred once in the words discourage (Sentence 3) and big (Sentence 2), respectively. Of the aforementioned phonemes, /z/ was devoiced most often. Speaker B devoiced word-final /z/ in 12 of the 28 opportunities, whereas Speaker A did so only five times. Likewise, Speaker B devoiced the final /d/ in five of the 14 opportunities, whereas Speaker A did so only once.
Cluster Reduction
In this study, we have made a distinction between final consonant deletion and cluster reduction affecting the last phoneme in a word. Final consonant deletion must result in a word-final open syllable (e.g., cat → /kæ/), whereas cluster reduction involving the final phoneme would result in a closed syllable (e.g., cast → /kæs/). The POC for cluster reduction was identical for both speakers. The sound sequences in the “cluster reduction” column in Table 3 represent word-medial and word-final clusters in the sentence set. Several word-initial clusters were also present in the sentence set (e.g., /sm, st, sp, fl, fɹ, pɹ, pl, dɹ, kw/). However, none of these word-initial sound sequences was reduced by either of the speakers. When word-final clusters consisted of a nasal followed by a stop or fricative, as in the words and, rooms, and dozens, both speakers eliminated the final obstruent and preserved the nasal 100% of the time.
Within-Language Differences
Figure 1 illustrated the interlanguage phonological processes that may occur in the speech of Japanese adult ELLs as a result of the imposition of Japanese phonotactic constraints on English. Many of these processes were observed in the speech of the two speakers analyzed in this article, as was presented in Table 2. Recall that in Figure 1, there was an area between the interlanguage phonology zone and the target English zone where within-language differences between ELLs from the same NL background may emerge. Such within-language differences are evident in the differing process POCs observed for the two speakers.
Figures 2 and 3 illustrate the within-language differences between Speakers A and B, respectively. The number in each circle represents the POC for each process observed. (Note that the sizes of the nested circles in Figures 2 and 3 are not scaled to the POC of each process.) The lower the POC, the closer the process is to the target English zone (0% occurrence). Processes that demonstrated 0% occurrence are located directly in the target English zone. When viewed in this context, we see that Speaker A's phonological patterns are closer to those of a native English speaker than Speaker B's patterns. This observation holds both for the higher number of processes that have completely disappeared from Speaker A's productions and for the lower POC of the processes that Speaker A still exhibited. Recall that Speaker A had a higher level of English proficiency and had been speaking English for a longer period of time than had Speaker B. The lower number and POC of processes in Speaker A's sentences fit with the SLM's tenet that experience and time spent speaking a language can alter an adult's phonological system.
Figure 2.
Phonological process sorting based on percent occurrence (POC) in Speaker A. The size of the nested circles is not scaled directly to the POC of each process. Stop = stopping, FC Del = final consonant deletion.
Figure 3.
Phonological process sorting based on percent occurrence (POC) in Speaker B. The size of the nested circles is not scaled directly to the POC of each process. PVV = prevocalic voicing.
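The sorting depicted in Figures 2 and 3 can be approximated directly from the Table 2 values. The sketch below is one possible reading of the figures, assuming that processes with 0% occurrence belong to the target English zone and that the remaining processes are ordered from highest to lowest POC; the function `sort_processes` is hypothetical.

```python
# POC values from Table 2: process -> (Speaker A, Speaker B).
TABLE_2 = {
    "vocalization": (32.4, 52.9),
    "cluster reduction": (15.5, 15.5),
    "final consonant devoicing": (10.7, 28.6),
    "final consonant deletion": (4.5, 12.6),
    "stopping": (4.0, 11.9),
    "gliding": (0.0, 13.9),
    "prevocalic voicing": (0.0, 3.8),
    "epenthesis": (0.0, 0.0),
}


def sort_processes(speaker_index):
    """Split processes into active ones (highest POC first) and the target zone (0%)."""
    active = sorted(
        ((poc[speaker_index], name) for name, poc in TABLE_2.items()
         if poc[speaker_index] > 0),
        reverse=True,
    )
    target_zone = sorted(name for name, poc in TABLE_2.items()
                         if poc[speaker_index] == 0)
    return [name for _, name in active], target_zone


for label, index in (("A", 0), ("B", 1)):
    active, target_zone = sort_processes(index)
    print(label, "active:", active)
    print(label, "target zone:", target_zone)
```

Under these assumptions, three processes fall in the target English zone for Speaker A and one for Speaker B, which matches the comparison drawn in the text.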
The preliminary results demonstrated in these figures also suggest that overall, the most frequently occurring processes in both Japanese ELLs' sentences are among the latest developmental processes to disappear from the speech of monolingual English children. Stoel-Gammon and Dunn (1985) demonstrated that vocalization does not begin to decline in English-speaking children until roughly the age of 4 years and may persist well beyond the age of 5 years. Final consonant devoicing begins to decline at approximately 3 years of age and disappears by the age of 4 years, and cluster reduction can persist until the age of 5 years.
Clinical Implications
The goal of this article was to present a framework for adult phonological learning that addresses general learning influences and informs pronunciation teaching practices. The interlanguage and within-language observations were informed by Flege's (1981) SLM. The SLM addresses the impact of language experience and the interaction between NL and TL phonetic inventories on TL pronunciation. It also suggests parallels between NL acquisition and TL learning.
The framework presented in this article provides opportunities for additional research in three key areas. First, this framework has the potential to make predictions regarding the interlanguage phonological processes that may arise from interference between an NL and a TL. Second, it opens the door for clinical research on approaches to pronunciation training and accent modification with adult ELLs. Last, this framework can be used to investigate parallels between NL phonological acquisition and TL phonological learning.
Predicting Interlanguage Phonological Processes
In response to new TL sounds and patterns, adult ELLs commonly produce phonological processes. In this study, a contrastive analysis between English and Japanese phonology predicted the following phonological processes: vocalization, final consonant devoicing, cluster reduction, gliding, final consonant deletion, stopping, prevocalic voicing, and epenthesis. These processes are the result of interlanguage influences between Japanese and English. Of the predicted processes, all but epenthesis were observed in the speech of Speaker B, who spoke English with the lower level of proficiency. The more advanced and experienced speaker, Speaker A, demonstrated fewer processes than Speaker B. This difference may reflect the impact of language experience on TL phonology, as addressed by Flege's SLM.
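The comparison between predicted and observed processes amounts to a set difference. The sketch below is a minimal illustration: the predicted set and the Speaker B result come from the paragraph above, whereas the variable and function names are hypothetical. Speaker A is omitted because the text above does not enumerate that speaker's observed processes.

```python
# Processes predicted by the contrastive analysis, as listed in the text.
predicted = {
    "vocalization",
    "final consonant devoicing",
    "cluster reduction",
    "gliding",
    "final consonant deletion",
    "stopping",
    "prevocalic voicing",
    "epenthesis",
}

# Speaker B produced every predicted process except epenthesis.
observed_speaker_b = predicted - {"epenthesis"}


def unobserved(predicted_set, observed_set):
    """Return predicted processes absent from a speaker's sample, sorted by name."""
    return sorted(predicted_set - observed_set)


missing_b = unobserved(predicted, observed_speaker_b)
```

The same comparison could be run for any NL background once a contrastive analysis has produced a predicted set.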
Knowledge of the phonetic inventory and phonotactic constraints of any NL should allow clinicians and researchers to predict the interlanguage phonological processes that a speaker may exhibit. However, clinicians should take care not to assume that all predicted interlanguage processes will be produced by all speakers from a given NL background. Variables such as time speaking a TL, motivation, and age of learning have been documented to affect pronunciation patterns in language learners (Flege et al., 1995; Munro & Derwing, 2008; Oh et al., 2011).
Clinical Approaches to Accent Modification
A large variety and number of processes in one person's speech will likely have a negative impact on intelligibility and comprehensibility. Based on the framework presented in this article, the goal of phonological training in adult ELLs would be to help the speaker move from the interlanguage processes zone to the target English zone, where processes that are inconsistent with English no longer occur. However, clinicians must investigate how best to accomplish this trajectory. It may not always be efficient for clinicians to address all speech errors by training one affected phoneme at a time. As suggested by Gierut (1998), clinicians should attempt to create the greatest amount of systemwide change in a client's phonological system with the least amount of teaching.
One way to accomplish systemwide change is through the cycles approach (Hodson & Paden, 1981, 1991). The cycles approach is based on an analysis of a speaker's phonological patterns. A variety of phonological processes are targeted in succession for a period of time (a cycle), after which the cycle is repeated. Processes that were not remediated in the first cycle may be reintroduced in the next cycle. This approach allows the client to be exposed to several different sound contrasts in a given language (Gierut, 1998). Although this approach is generally used for the remediation of speech sound disorders in children, future research should examine whether this approach could benefit adult ELLs with speech differences rather than disorders.
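The scheduling logic of a cycle can be sketched as a simple loop. This is a toy model under stated assumptions, not the published protocol: the function `run_cycles`, the `criterion` threshold used to decide that a process has been remediated, and the `toy_measure` reassessment function are all hypothetical.

```python
def run_cycles(targets, measure_poc, max_cycles=3, criterion=0.0):
    """Target each process in succession, then repeat with those not yet remediated.

    targets: ordered list of process names.
    measure_poc: callable(process, cycle_number) -> POC after that cycle (hypothetical).
    Returns a list holding the processes targeted in each cycle.
    """
    schedule = []
    remaining = list(targets)
    for cycle in range(1, max_cycles + 1):
        if not remaining:
            break
        schedule.append(list(remaining))
        # Processes still above the criterion are reintroduced in the next cycle.
        remaining = [p for p in remaining if measure_poc(p, cycle) > criterion]
    return schedule


# Toy stand-in for reassessment: each process resolves after a fixed number of cycles.
cycles_needed = {
    "vocalization": 2,
    "final consonant devoicing": 1,
    "cluster reduction": 3,
}


def toy_measure(process, cycle):
    """Return 0.0 once the process has had enough cycles, otherwise a nonzero POC."""
    return 0.0 if cycle >= cycles_needed[process] else 50.0


schedule = run_cycles(list(cycles_needed), toy_measure)
```

In this toy run, the first cycle targets all three processes, the second targets the two that persist, and the third targets only cluster reduction.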
Of course, the cycles approach may not be appropriate for all adult ELLs. Those clients with few errors or few affected phonemes may benefit more from an articulatory approach, where phonemes are targeted individually. In addition, not all phonological errors are equal with respect to their effect on TL intelligibility. This is where the concept of functional load should factor into target selection (Munro & Derwing, 2006). Errors that have a low impact on intelligibility are said to carry a low functional load. For example, in this study, stopping primarily affected the voiced interdental fricative /ð/. However, the /ð/ → /d/ substitution has a low impact on intelligibility (Munro & Derwing, 2006). Therefore, as is done with speech sound remediation in children, clinicians should base target selection on the impact of errors on intelligibility. In addition, the impact of any particular process will likely vary across TLs. For example, cluster reduction may be less deleterious in English than in a language in which complex clusters occur with higher frequency than in English.
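One way to picture target selection by functional load is as a two-key sort. The sketch below is purely illustrative: the coarse "high"/"low" scale, the tie-break on POC, and every POC value are assumptions. Only the low load of the /ð/ → /d/ substitution is taken from the text; the other two entries are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorPattern:
    process: str          # phonological process label
    example: str          # representative substitution
    functional_load: str  # "high" or "low" (coarse, assumed scale)
    poc: float            # percent occurrence (placeholder values below)


# Lower rank sorts first, so high-load patterns lead the list.
LOAD_RANK = {"high": 0, "low": 1}


def prioritize(patterns):
    """Order candidate targets: high functional load first, then higher POC."""
    return sorted(patterns, key=lambda p: (LOAD_RANK[p.functional_load], -p.poc))


candidates = [
    # Low load for this substitution follows the text; the POC is a placeholder.
    ErrorPattern("stopping", "/ð/ → /d/", "low", 40.0),
    # Entirely hypothetical entries, included only to exercise the sort.
    ErrorPattern("hypothetical process X", "(placeholder)", "high", 15.0),
    ErrorPattern("hypothetical process Y", "(placeholder)", "high", 30.0),
]

ordered = [p.process for p in prioritize(candidates)]
```

Under these assumptions, a frequent but low-load pattern such as stopping of /ð/ falls to the end of the list even though its placeholder POC is the highest.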
Future clinical research should examine treatment effects in association with phonological process training in adult ELLs. In an extensive review article addressing treatment efficacy of phonological disorders in children, Gierut (1998) noted that observable systemwide changes to a speaker's phonological system constitute important efficacy evidence. As a consequence, clinical studies should examine whether effects such as within-class and across-class generalization from treated to untreated sounds occur in adult ELLs who receive training through a processes approach. Clinical efficacy studies should also examine whether training more complex later-developing sounds and syllable shapes generalizes to untrained sounds and shapes that are less complex. Such studies would demonstrate whether clinical approaches currently used for phonological training in children could apply to adult ELLs.
Parallels Between English NL Acquisition and Adult TL English Learning
The data in this article involve just two Japanese ELL speakers. Therefore, generalizations cannot be made to adult Japanese ELLs as a group or to adult ELLs from different NL backgrounds. However, the preliminary findings suggest some parallels between the processes observed in the ELLs' speech samples and the developmental patterns observed in monolingual English children. The most frequently occurring processes in both Japanese ELLs' speech samples are among the latest developmental processes to disappear from the speech of monolingual English children. In addition, the two speakers in this study reduced clusters in the word-medial and word-final positions, but not in the word-initial position. This observation is also consistent with the developmental data; clusters are mastered by English children in the word-initial position before the word-final position. As stated in the introduction, adult ELLs already have an established phonological system. Therefore, it should not be expected that all aspects of TL learning will parallel first-language acquisition.
Flege's SLM states that the adult phonological system is malleable and that adult language learners can and do improve TL pronunciation with increased experience and without direct pronunciation training (Flege, 1981; Mack, 2003; Munro & Derwing, 2008; Oh et al., 2011). Future studies should investigate whether there is a predictable pattern in the disappearance of phonological processes over time in the speech of adult ELLs. It should be noted, however, that the disappearance of developmental processes does not occur abruptly in child speech. Rather, there is a range between the age at which a process begins to decline among children and the age at which it disappears completely. Such ranges leave room for variability among children who are typically developing. Therefore, we must assume that there will be variability in the elimination of processes among adult ELLs as well.
Collaborative Clinical Investigations
Applying a processes approach to adult language learners should not be seen as an attempt to pathologize what is a speech difference, not a disorder. English pronunciation instruction is offered by professionals other than speech-language pathologists. However, speech-language pathologists are uniquely equipped to perform phonological process analyses, understand the developmental trajectory in the disappearance of processes, and apply this knowledge to the clinical context when setting goals and objectives for ELL clients. We encourage researchers and clinicians to explore this approach in new investigations of speech production in adult ELLs. Furthermore, we hope the framework presented in this article will inspire innovative approaches to pronunciation training that capitalize on the unique clinical skills of speech-language pathologists.
Acknowledgments
The preparation of this article was supported in part by a predoctoral fellowship (F31 HD046412-05) from the National Institute of Child Health and Human Development, Bethesda, MD, awarded to Amber D. Franklin. The authors thank the members of the English Language Learning and Pronunciation Lab at Miami University, Ann Dillard, Lisa Floccari, Lauren Polster, Kaitlyn Gilftert, and Anna Lichtenstein, and Barbara Weinrich, Jeanie Ducher, and Dana Miller for their assistance with manuscript preparation.
Appendix
Sentence Set from Sentence Intelligibility Test (Yorkston, Beukelman, & Hakel, 1996).
| 1. They will make many friends. |
| 2. Money wasn't a big problem. |
| 3. I don't want to discourage people. |
| 4. The book is small and lightweight. |
| 5. I feel I can play this weekend. |
| 6. Now I'm living exactly as I choose. |
| 7. A low price will sell a house quickly. |
| 8. It can lead to any number of adventures. |
| 9. We cannot and need not back either side totally. |
| 10. They are important natural sources of vitamins and minerals. |
| 11. Accordingly, when it is gone it is gone for good. |
| 12. They almost had to lift me out of the car. |
| 13. There are many dozens of worthwhile places to break the trip. |
| 14. My mother nursed me in the wings and in dressing rooms. |
| 15. After what seemed like hours of waiting, the taxi finally showed up. |
| 16. He seeks constantly to improve his product and maintain high quality standards. |
| 17. Telephone operators take messages but never give the room number of the patrons. |
| 18. I used to watch it all the time but now I become bored. |
| 19. If you have a complaint, first ask the merchant to take care of it. |
| 20. It is unrealistic to expect any human personality to remain frozen for two decades. |
| 21. Yet, it is so different from other flowers that it needs its own special terms. |
| 22. If he and his wife are having difficulties he will talk them out with her. |
Note. These 22 sentences were randomly drawn from a large set of sentences. Copyright © Yorkston, Beukelman, and Hakel (1996). Reprinted with permission.
References
- American Council on the Teaching of Foreign Languages. (2012). ACTFL proficiency guidelines. Alexandria, VA: Author.
- American Speech-Language-Hearing Association. (1997). Accent modification. Retrieved on May 2, 2014, from http://www.asha.org/public/speech/development/accent-modification/
- Anderson R. T. (2004). Phonological acquisition in preschoolers learning a second language via immersion: A longitudinal study. Clinical Linguistics & Phonetics, 18(3), 183–210.
- Aoyama K., Flege J. E., Guion S. G., Akahane-Yamada R., & Yamada T. (2004). Perceived phonetic dissimilarity and L2 speech learning: The case of Japanese /r/ and English /l/ and /r/. Journal of Phonetics, 32, 233–250.
- Avery P., & Ehrlich S. (1992). Teaching American English pronunciation. Oxford, UK: Oxford University Press.
- Boersma P., & Weenink D. (2005). Praat: Doing phonetics by computer (Version 4.1.27). Retrieved from http://www.praat.org/
- Bresnahan M. J., Ohashi R., Nebashi R., Liu W. Y., & Shearman S. M. (2002). Attitudinal and affective response toward accented English. Language & Communication, 22, 171–185.
- Chen R., Robb M., Gilbert H., & Lerman J. (2001). Vowel production by Mandarin speakers of English. Clinical Linguistics & Phonetics, 15(6), 427–440.
- Deprez-Sims A.-S., & Morris S. B. (2010). Accents in the workplace: Their effects during a job interview. International Journal of Psychology, 45(6), 417–426.
- Derwing T. M., & Munro M. J. (1997). Accent, intelligibility, and comprehensibility: Evidence from four L1s. Studies in Second Language Acquisition, 20, 1–16.
- Derwing T. M., & Munro M. J. (2005). Second language accent and pronunciation teaching: A research-based approach. TESOL Quarterly, 39(3), 379–397.
- Derwing T. M., & Munro M. J. (2009). Putting accent in its place: Rethinking obstacles to communication. Language Teaching, 42(4), 476–490.
- Edwards M. L., & Shriberg L. D. (1983). Phonology: Applications in Communicative Disorders. San Diego, CA: College-Hill Press.
- Flege J. E. (1981). The phonological basis of foreign accent: A hypothesis. TESOL Quarterly, 15(4), 443–455.
- Flege J. E., Munro M. J., & MacKay I. R. A. (1995). Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 3125–3133.
- Franklin A. D., & Stoel-Gammon C. (2014). Using multiple measures to document change in English vowels produced by Japanese, Korean, and Spanish speakers: The case for goodness and intelligibility. American Journal of Speech-Language Pathology, 23, 625–640.
- Gierut J. A. (1998). Treatment efficacy: Functional phonological disorders in children. Journal of Speech, Language, and Hearing Research, 41, S85–S100.
- Gluszek A., & Dovidio J. F. (2010). Speaking with a nonnative accent: Perceptions of bias, communication difficulties, and belonging in the United States. Journal of Language and Social Psychology, 29(2), 224–234.
- Goldstein B. A., Fabiano L., & Washington P. S. (2005). Phonological skills in predominantly English-speaking, predominantly Spanish-speaking, and Spanish-English bilingual children. Language, Speech, and Hearing Services in Schools, 36(3), 201–218.
- Grunwell P. (1985). Phonological assessment of child speech (PACS). Windsor, England: NferNelson.
- Hodson B. W., & Paden E. P. (1981). Phonological processes which characterize unintelligible and intelligible speech in early childhood. Journal of Speech and Hearing Disorders, 46, 369–373.
- Hodson B. W., & Paden E. P. (1991). Targeting intelligible speech: A phonological approach to remediation (2nd ed.). Austin, TX: Pro-Ed.
- Holm A., & Dodd B. (1999). A longitudinal study of the phonological development of two Cantonese–English bilingual children. Applied Psycholinguistics, 20(3), 349–376.
- Hwa-Froelich D., Hodson B. W., & Edwards H. T. (2002). Characteristics of Vietnamese phonology. American Journal of Speech-Language Pathology, 11(3), 264–273.
- Kamhi A. G. (2006). Treatment decisions for children with speech sound disorders. Language, Speech, and Hearing Services in Schools, 37, 271–279.
- Levy E. S., & Crowley C. J. (2012). Policies and practices regarding students with accents in speech-language pathology training programs. Communication Disorders Quarterly, 34(1), 59–68.
- Lowe R. J. (1994). Phonology: Assessment and intervention applications in speech pathology. Baltimore, MD: Williams & Wilkins.
- Mack M. (2003). The phonetic systems of bilinguals. In Banich M. T. & Mack M. (Eds.), Mind, brain, and language: Multidisciplinary perspectives (pp. 309–349). Mahwah, NJ: Erlbaum.
- Morrow A., Goldstein B. A., Gilhool A., & Paradis J. (2014). Phonological skills in English language learners. Language, Speech, and Hearing Services in Schools, 45, 26–39.
- Muller N., Ball M. J., & Guendouzi J. (2000). Accent reduction programmes: Not a role for speech-language pathologists? Advances in Speech-Language Pathology, 2(2), 119–154.
- Munro M. J. (2003). A primer on accent discrimination in the Canadian context. TESL Canada Journal, 20(2), 38–51.
- Munro M. J., & Derwing T. M. (2006). The functional load principle in ESL pronunciation instruction: An exploratory study. System, 34, 520–531.
- Munro M. J., & Derwing T. M. (2008). Segmental acquisition in adult ESL learners: A longitudinal study of vowel production. Language Learning, 58(3), 479–502.
- Munro M. J., & Derwing T. M. (2011). The foundations of accent and intelligibility in pronunciation research. Language Teaching, 44, 316–327.
- Oh G. E., Guion-Anderson S., Aoyama K., Flege J. E., Akahane-Yamada R., & Yamada T. (2011). A one-year longitudinal study of English and Japanese vowel production by Japanese adults and children in an English-speaking setting. Journal of Phonetics, 39, 156–167.
- Scovel T. (2000). A critical review of the critical period research. Annual Review of Applied Linguistics, 20(20), 213–223.
- Sikorski L. D. (2005). Foreign accents: Suggested competencies for improving communicative pronunciation. Seminars in Speech and Language, 26(2), 126–130.
- Stoel-Gammon C., & Dunn C. (1985). Normal and disordered phonology in children. Baltimore, MD: University Park Press.
- Tajima K., & Kubo R. (1999). Vowel epenthesis in productions of English consonant clusters by Japanese. Journal of the Acoustical Society of America, 106(4), 2155–2155.
- Trofimovich P., & Isaacs T. (2012). Disentangling accent from comprehensibility. Bilingualism: Language and Cognition, 15(4), 905–916.
- Tsujimura N. (1996). An introduction to Japanese linguistics. Cambridge, MA: Blackwell.
- Vance T. J. (1987). An introduction to Japanese phonology. New York, NY: SUNY Press.
- Wood S. (2005). Praat for beginners [Manual]. Retrieved from http://www.ling.lu.se/persons/Sidney/praate/
- Yavas M., & Goldstein B. (1998). Phonological assessment and treatment of bilingual speakers. American Journal of Speech-Language Pathology, 7(2), 49–60.
- Yeni-Komshian G. H., Flege J. E., & Liu S. (2000). Pronunciation proficiency in the first and second languages of Korean-English bilinguals. Bilingualism: Language and Cognition, 3(2), 131–149.
- Yorkston K., Beukelman D., & Hakel M. (1996). Speech Intelligibility Test for Windows [Communication disorders software]. Lincoln, NE: Institute for Rehabilitation Science and Engineering at Madonna Rehabilitation Hospital.



