Abstract
Adults typically struggle to perceive non-native sound contrasts, especially those that conflict with their first language. Do the same challenges persist when the sound contrasts overlap but do not conflict? To address this question, we explored the acquisition of lexical tones. While tonal variations are present in many languages, they are only used contrastively in tonal languages. We investigated the perception of Mandarin tones by adults with differing experience with Mandarin, including naïve listeners, classroom learners, and native speakers. Naïve listeners discriminated Mandarin tones at above-chance levels, and performance significantly improved after just one month of classroom exposure. Additional evidence for plasticity came from advanced classroom learners, whose tonemic perception was indistinguishable from that of native speakers. The results suggest that unlike many other non-native contrasts, adults studying a language in the classroom can readily acquire the perceptual skills needed to discriminate Mandarin tones.
Keywords: Second Language Learning, Lexical Tone, Plasticity
1. Introduction
Researchers have long been fascinated by the challenges inherent in adult second language (L2) learning. Phonology appears to be especially difficult for late learners, including both perception (e.g., Goto, 1971; Miyawaki et al., 1975; Werker & Tees, 1984b) and production (Flege, Munro, & MacKay, 1995; Flege, Yeni-Komshian, & Liu, 1999). It has been suggested that there may be a period after which learners can no longer acquire native-like phonological abilities, and that the window may be narrower than for other aspects of language, such as morphosyntax (e.g., Long, 1990; Ruben, 1997; Scovel, 1988). These challenges can have downstream effects; for instance, difficulties differentiating non-native phonemes can impede L2 lexical development (Leonard, 1982).
Previous research exploring non-native speech perception has typically utilized two kinds of L2 phoneme categories: those that conflict with the first language (L1) and those that do not exist in L1. In the case of conflicting non-native speech sounds, there is typically a mismatch in the distributions of phonetic cues in L1 vs. L2. For instance, Japanese speakers have difficulty discriminating the English /r/-/l/ contrast, likely because exemplars of /r/ and /l/ occur along the continuum of a single Japanese phoneme (Goto, 1971; Miyawaki et al., 1975). While training can improve L2 listeners’ performance (e.g., Antoniou & Wong, 2016; Logan, Lively, & Pisoni, 1991; McClelland, Fiez, & McCandliss, 2002), learners may still fail to achieve native-like proficiency (Bradlow, Pisoni, Akahane-Yamada, & Tohkura, 1997; Strange & Dittman, 1984). In contrast, other non-native phonemes appear to present less difficulty and may be perceived as entirely novel. Zulu clicks, for example, are unrelated to English phonemes, and adult English speakers can readily discriminate between them (Best, McRoberts, & Sithole, 1988).
There is another important class of non-native phonetic contrasts: sounds that exist in L1 but are only used contrastively in L2. One such property of speech, with different functions in different languages, is pitch variation. In syllable-tone languages, pitch variations distinguish lexical meanings at the syllabic level (e.g., Burnham & Mattock, 2007; Yip, 2002). For example, the four citation tones in Mandarin can be categorized as high level (T1), high rising (T2), low dipping (T3), or high falling (T4) (Bao, 1990, 2003). The syllable /ma/ can mean “mom”, “hemp”, “horse”, or “scold”, depending on the lexical tone. Thus, in tonal languages, tone recognition is an essential component of word comprehension.
This particular characteristic of tonal languages is distinct from the way pitch is used in Indo-European languages. In English, for example, intonational variation provides pragmatic and emotional information (e.g., Fernald et al., 1989; Moore, Spence, & Katz, 1997), and denotes phrase boundaries (e.g., Gussenhoven, 2004). At the syllabic level, though lexical stress can change a word’s meaning (e.g., REcord vs. reCORD), other cues such as amplitude and duration usually accompany the pitch change. Importantly, pitch changes alone in English do not alter word meanings, and thus, English pitch patterning does not encourage English speakers to treat tone as contrastive for each syllable, or even to track it with reference to lexical representations (Chen & Kager, 2016; Hay, Graf Estes, Wang, & Saffran, 2015; Liu & Kager, 2015; Quam & Swingley, 2010). Furthermore, even though speakers of non-tonal languages may be familiar with some of the acoustic cues involved in lexical tone, tones still represent an unfamiliar suprasegmental feature. Non-tonal speakers may be able to make use of their L1 domain of prosody, for example, and can categorize Mandarin tones using acoustic similarity between the tones and their native intonational, rhythmic, and stress systems, relying on F0 properties (Broselow et al., 1987; So & Best, 2011; White, 1981). However, the different role of F0 in tonal vs. non-tonal languages makes assimilation difficult (Yang & Chan, 2010). Due to the unique combination of overlap and contrast between tonal and non-tonal languages, a better understanding of how L2 learners perceive tones could yield valuable insights regarding plasticity in adult L2 learners’ perception.
Research investigating adults’ learning of tones reveals that lexical tone contrasts present a challenge for L2 learners (Burnham & Francis, 1997; Showalter & Hayes-Harb, 2013; Wang et al., 2003; Wayland & Guion, 2004; Wang, 2013; but see Antoniou & Wong (2016) for discussion of the relative difficulty of tones compared to contrasts such as voice onset time). The difficulty is particularly pronounced for learners whose L1 does not utilize pitch height or contour in a contrastive manner (see Antoniou & Chin, 2018, for a review). Non-tonal L2 speakers use different strategies to process lexical tone than native speakers (e.g., Francis et al., 2008; Gandour et al., 2000; Wang et al., 2003). Unlike native speakers of a tonal language, non-tonal L2 speakers do not perceive tones categorically (Hallé, Chang, & Best, 2004; Peng et al., 2010; Wu & Lin, 2008), and the perception of tone categories differs across conditions and languages (see Burnham & Mattock, 2007, for a review). For example, native Mandarin speakers attend to the fundamental frequency contour when processing tonal contrasts, such that they focus on the direction of change (e.g., rising versus falling) of the pitch height (Liu & Samuel, 2004; Xu, 1997). In comparison, native English speakers utilize just the height of the tones for Mandarin tonal contrasts, leading to errors during tone processing (Wang et al., 2003). Thus, both tonal and non-tonal speakers rely on the dimensions of pitch that are meaningful in their native languages.
Given that tonal contrasts have been consistently shown to be challenging (e.g., Burnham & Francis, 1997; Wang et al., 1999; Wang et al., 2003; Wayland & Guion, 2004), several past studies have explored the effects of tone training for L2 learners using both artificial tonal languages (e.g., Caldwell-Harris et al., 2015; Wong & Perrachione, 2007) and real Mandarin stimuli (e.g., Chandrasekaran, Sampath, & Wong, 2010; Showalter & Hayes-Harb, 2013; Wang et al., 1999). Most adult language-learning occurs in a classroom setting (Snyder & Dillow, 2011), and results from both behavioral and neuroscientific paradigms suggest that naïve listeners and adult L2 learners can improve their perception of lexical tone contrasts (e.g., Antoniou & Wong, 2016; Chandrasekaran, Sampath, & Wong, 2010; Showalter & Hayes-Harb, 2013; Wang et al., 1999; Wong & Perrachione, 2007).
These improvements have been demonstrated in both lab-based and classroom learning contexts (Shih et al., 2010; Song et al., 2008; Wang et al., 1999; Wang & Kuhl, 2003; Wong & Perrachione, 2007; Wong et al., 2007). Successful lab-based manipulations typically take place over multiple sessions (e.g., Wang et al., 1999), and have yielded enhancements in naïve listeners’ ability to identify tones after as little as 90 minutes of training in total (Wong & Perrachione, 2007), although it remains unclear how long these effects last. In addition, tone training paradigms have been shown to produce reliable changes in the learners’ brains (e.g., Asaridou et al., 2015; Wong & Perrachione, 2007). Thus, adult learners are able to change their perception of tonal contrasts, but it is not yet known how much experience, particularly in the classroom, is needed to create these changes.
To date, only a few studies have explored the trajectory for improvement on tone discrimination tasks. Zhang (2011) examined native English speakers studying Mandarin in China. When asked to identify items that differed by a single tone, both “novice” (7–9 months of study in China) and “intermediate” (20–25 months) groups performed significantly worse than the native speaker group. Guo and Tao (2008) recorded American students in a first-year Chinese class over two semesters and showed that they continued to struggle with tone production at the sentence level across all four tones. Even studies with highly advanced learners suggest that L2 learners are unlikely to perform at native levels. For instance, Hao (2012) found that while L2 learners with more than two years of Mandarin-learning experience were relatively successful in mimicking monosyllabic tones, they still made errors in identifying them, suggesting that even advanced learners fail to match native speakers. Similarly, research by Lee, Tao and Bond (2010) showed that while third-year Mandarin students were almost as accurate as native speakers in identifying isolated Mandarin tones, they took less advantage of subtle cues, such as coarticulation, and their error patterns differed from native speakers. Likewise, recent research by Pelzl and colleagues (2019) found that advanced Mandarin L2 learners lagged behind native speakers on their accuracy in identifying syllables in both behavioral and ERP tasks. Together, these studies suggest a persistent and long-lasting difficulty for L2 learners attempting to fully master the lexical tones of Mandarin.
To summarize, past findings have suggested adult L2 learners of Mandarin Chinese can learn tonal contrasts with a relatively short amount of exposure (e.g., Wang et al., 1999; Wong & Perrachione, 2007) and advanced learners may achieve high-level performance on tonal discrimination tasks (e.g., Hao, 2012; Lee et al., 2010). However, no prior study has systematically tested classroom learners at multiple stages of learning, and it is not known how long it takes for learners to achieve native or near-native proficiency in discriminating non-native lexical tones as a result of continued classroom exposure.
The current research addresses these issues by assessing phonological learning in L2 learners acquiring a tonal language through classroom instruction. The ability to learn about L2 sounds typically depends on L1 characteristics (e.g., Best et al., 1988; Flege, 1987, 1989; Finn et al., 2013; Guion, Flege, & Loftin, 2000; Iverson et al., 2003; Pallier, Bosch, & Sebastián-Gallés, 1997; Strange, 1995). Pitch contours are present in English, but they are not used contrastively; changes in pitch serve a primarily non-lexical (e.g., pragmatic) purpose. Therefore, one prediction is that adult English speakers will be slow to acquire lexical tones, requiring a great deal of experience to overcome their bias against using pitch contrastively. Alternatively, since tonemes (pitch variations over every syllable) do not exist in English, Mandarin tone categories will not conflict with extant L1 categories (much like the aforementioned Zulu clicks). By this account, tonal contrasts should be learned quite readily by English speakers. By testing learners with varying amounts of experience with a tonal language, the present study can shed light on how quickly L2 learners are able to acquire a new phonological category and discriminate tonemic contrasts at the syllabic level.
We developed an AXB task to investigate tonemic perception in adults with varying amounts of exposure to Mandarin. Participants heard triads of syllables, and were asked whether the first or the third syllable had the same tone as the second syllable. We used this task to address three questions. First, can English speakers with no prior tonal language experience perceive the differences between Mandarin tonemes? Prior investigations of this issue have rendered contradictory evidence (e.g., Gao et al., 2011; Harrison, 2000; Liu & Kager, 2011; Mattock & Burnham, 2006; Mattock, Molnar, Polka, & Burnham, 2008). Second, how much exposure to Mandarin is necessary to observe improvements in discrimination of tones? We addressed this question by testing groups of participants with varying amounts of college Mandarin coursework, including individuals who had no experience with Mandarin; these participants served as a baseline that allowed us to measure how even minimal exposure might influence perception. By testing learners enrolled in second language classrooms, we were able to take an ecologically valid approach to examine how tone learning progresses in typical L2 settings, which is a vastly under-studied topic, as well as to evaluate the effectiveness of classroom-based instruction. Finally, we asked how much experience it would take for adult L2 learners to achieve native-like tonemic perception. We addressed this question by including a group of advanced classroom Mandarin learners in the study. Taken together, these three research questions address the role of experience in L2 tonal perception, with implications for understanding language plasticity and the trajectory of L2 learning in adulthood.
2. Method
2.1. Participants
Five groups of students participated in this study; all were enrolled at the same large Midwestern university and reported normal hearing. The Naïve Listeners reported no prior tonal language experience (n = 40; 12 male). The Beginner group had just begun studying Mandarin, and had received one month of classroom instruction (n = 50; 22 male). The Intermediate group was nearing the end of their second semester of classroom Mandarin instruction (n = 49; 29 male). The Advanced group had taken at least four semesters of Mandarin (n = 15; 8 male); 12 of these participants had studied abroad in China. Participants in the three L2 learning groups (i.e., Beginner, Intermediate, and Advanced groups) all went through the same course sequence; there were no heritage language learners in any of the groups. All of the Mandarin learners were enrolled in the same university language program and all instructors had received the same training and were regularly evaluated on their implementation of a standard curriculum. Finally, the Native Speaker group consisted of L1 Mandarin speakers (n = 33; 7 male), all of whom grew up in China and described Mandarin as their first and dominant language, though they were currently living in the United States. None of the participants had experience with any tonal language other than Mandarin. An additional 25 participants were tested but excluded for the following reasons: non-native speakers with experience with Mandarin prior to college (n = 7), experience with another tonal language (n = 15), missing information about language experience (n = 1), failure to complete the task (n = 1), or failure to follow directions (n = 1). Participants were recruited through their Mandarin or Psychology classes and received course credit or $10.
2.2. Stimuli
Materials for the discrimination task consisted of nine syllables with both CV and CVC structures. These syllables (ma, gu, dui, chai, nan, xi, lu, feng, que) were produced in isolation in each of the four Mandarin tones by a female native Mandarin speaker (see Appendix for details). The 36 tokens were edited to match in amplitude; duration was left unmatched in order to preserve the naturalness of tonal characteristics (Chao, 1968; Duanmu, 2000; Xu, 1997; Yip, 2001). Three additional syllables (he, lo, ju) were recorded in the first and fourth tones to be used for practice trials. All syllables were real words in Mandarin but nonwords in English.
2.3. Procedure
We used an AXB task to test tonemic discrimination. Participants sat at a computer and listened to triads of syllables over headphones; syllables were separated by 1 sec. There were four practice trials, each of which consisted of three unique syllables in succession, two of which shared a tone (e.g., ju1-he1-lo4). Participants were told that the second syllable would have the same tone as either the first or third syllable, and were asked to indicate which syllable matched via a key press. After responding, they received feedback, and the next trial began after a 0.5 sec pause.
The test phase was identical to the practice phase except that participants no longer received feedback. Each tone was used equally often in each position, such that all possible contrasts between tonemes were tested equally. All tonemes were presented in each of the nine test syllables. There were 72 test trials.
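The counterbalancing described above can be illustrated with a short enumeration. This is a minimal sketch and not the authors' experiment script: it assumes that each trial pairs two different tones in the A and B positions and that X matches one of them, which gives 24 tone patterns. The idea that the 72 test trials correspond to three repetitions of each pattern is an assumption, since the text does not state how the trial list was built, and the function name is hypothetical.

```python
from itertools import permutations

TONES = ("T1", "T2", "T3", "T4")


def axb_trial_types(tones=TONES):
    """Enumerate every (A, X, B) tone pattern in which X matches A or B."""
    trial_types = []
    for a, b in permutations(tones, 2):  # ordered pairs with A != B
        for x in (a, b):                 # X shares its tone with A or with B
            trial_types.append((a, x, b))
    return trial_types


trial_types = axb_trial_types()
n_patterns = len(trial_types)

# Assumption: 72 test trials = 3 repetitions of each of the 24 patterns.
repetitions = 72 // n_patterns

# Count how often each tone occupies each position (A, X, B).
position_counts = {
    tone: [sum(1 for t in trial_types if t[pos] == tone) for pos in range(3)]
    for tone in TONES
}
print(n_patterns, repetitions, position_counts)
```

Under these assumptions every tone occupies every position equally often, which matches the balance described in the text.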
2.4. Questionnaires
Participants provided information about their language experience using the Language Experience and Proficiency Questionnaire (Marian, Blumenfeld, & Kaushanskaya, 2007). Participants were also asked about their musical experience because of the possibility that musicians might have better tone perception than non-musicians (Wong & Perrachione, 2007). Fifteen participants (Natives: n = 11, Intermediate: n = 3, Beginners: n = 1) did not provide information about their musical training. For the remaining 172 participants, those reporting five or more years of experience of musical training on a single instrument or voice (e.g., 5 years of piano, not 2 years of piano and 3 years of violin) were coded as “experts” (n = 75); those who reported some experience (but fewer than five years on any one instrument or voice) were considered “intermediate” (n = 63); participants with no formal training were classified as “novices” (n = 34). The proportion of experts did not differ across the five groups [range: 40–53%, χ²(3, n = 172) = .34, p = .99].
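The coding rule for musical experience can be written out explicitly. The sketch below is illustrative only: the function name and the dictionary input format are hypothetical, and the rule is taken from the description above (five or more years on a single instrument or voice counts as expert, any shorter training as intermediate, and no formal training as novice).

```python
def code_music_experience(years_by_instrument):
    """Classify musical training as 'expert', 'intermediate', or 'novice'.

    years_by_instrument: hypothetical dict mapping an instrument (or voice)
    to the number of years of formal training on it.
    """
    # Years are NOT summed across instruments; only the longest single
    # stretch on one instrument or voice counts toward expert status.
    longest = max(years_by_instrument.values(), default=0)
    if longest >= 5:
        return "expert"
    if longest > 0:
        return "intermediate"
    return "novice"


examples = {
    "five years of piano": code_music_experience({"piano": 5}),
    "two piano + three violin": code_music_experience({"piano": 2, "violin": 3}),
    "no formal training": code_music_experience({}),
}
print(examples)
```

The second example reproduces the case mentioned in the text: two years of piano plus three years of violin does not reach the expert threshold.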
3. Results
To determine whether experience with Mandarin was related to the perception of Mandarin tonemes, we computed participants’ mean accuracy on the tonemic discrimination task and conducted a one-way ANOVA comparing the five language groups. The ANOVA was significant [F(4, 182) = 16.80, p < .00001, partial η² = .270]¹ with better performance with increased Mandarin experience; see Figure 1. We next asked whether the Naïve Listener group could successfully discriminate Mandarin tonemes; their performance was significantly better than chance (50%) [67.4%, t(39) = 7.09, p < .0001, d = 2.271]. These results suggest that even in the absence of experience with a tonal language, non-native speakers have a rudimentary ability to discriminate Mandarin tonemes. The Naïve Listener group also serves as a baseline, providing a point of comparison for the L2-learning groups.
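The degrees of freedom and effect size of the one-way ANOVA can be cross-checked from values reported in the text. The sketch below is not the authors' analysis code; it uses the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error), with group sizes taken from the Participants section. Function and variable names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group sizes as reported in Section 2.1.
group_sizes = {
    "Naïve Listeners": 40,
    "Beginners": 50,
    "Intermediate": 49,
    "Advanced": 15,
    "Native Speakers": 33,
}

n_total = sum(group_sizes.values())       # total participants analyzed
df_effect = len(group_sizes) - 1          # number of groups minus one
df_error = n_total - len(group_sizes)     # participants minus number of groups

eta_p2 = partial_eta_squared(16.80, df_effect, df_error)
print(n_total, df_effect, df_error, round(eta_p2, 3))
```

With these inputs the degrees of freedom come out as 4 and 182, and the effect size rounds to the reported value of .270.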
Figure 1:
Mean performance on the tone discrimination task. Error bars indicate standard errors.
To address the remaining questions, we conducted a series of pairwise comparisons using Tukey’s HSD to control for multiple comparisons (adjusted p-values and group means are reported in Table 1). The Naïve Listeners performed significantly worse than all the other groups (all p < .001), including the Beginners, who had only had one month of experience with Mandarin. This finding suggests that even a very brief amount of classroom exposure to Mandarin facilitates tonemic discrimination. The Beginner and Intermediate groups both performed significantly worse than the Native Speaker group (p < .05), indicating that a few weeks to a year of classroom experience is not sufficient to achieve native-like perceptual abilities. However, the Advanced learners did not differ from the Native Speaker group (p = .997), suggesting that it is possible for adult learners to reach native levels of performance on this task.²
Table 1.
Group means and comparisons between language groups. Adjusted p-values were calculated using Tukey’s HSD, and starred values indicate a significant difference between groups.
| GROUP (mean % correct) | Naïve Listeners | Beginners | Intermediate | Advanced | Native Speakers |
|---|---|---|---|---|---|
| Naïve Listeners (67.4%) | X | | | | |
| Beginners (82.1%) | p < .00001* | X | | | |
| Intermediate (80.2%) | p < .001* | p = .96 | X | | |
| Advanced (90.0%) | p < .00001* | p = .28 | p = .11 | X | |
| Native Speakers (91.5%) | p < .000001* | p = .02* | p < .005* | p = .997 | X |
To rule out the possibility that differences in performance might be attributable to differences in musical training, we performed a second ANOVA that included Music Experience, coded as described above, in addition to Language Group, for the 172 participants for whom we had information about musical training. There was a main effect of Music Experience (Novice vs. Intermediate vs. Expert) [F(2, 157) = 3.05, p < .05, partial η² = .037], suggesting that individuals with more musical training outperformed individuals with less musical training. However, the interaction with language group was not significant [F(8, 157) = .80, p = .61]. Therefore, we can conclude that the group differences we observed are unlikely to be due to differences in musical training.
Finally, past research has shown that not all tones are equally difficult to discriminate (Hallé, Chang, & Best, 2004; Hao, 2012; Kiriloff, 1969; Zue, 1976), and language experience can influence which pairs are more easily confused. Therefore, we wanted to test whether experience with Mandarin changed which contrasts were the most challenging to distinguish. We performed a two-way mixed ANOVA exploring the effects of Tonemic contrast (within-subjects) and Language Group (between-subjects). The model revealed a significant main effect of Tonemic contrast [F(5, 910) = 30.11, p < .00001, partial η² = .142], demonstrating that not all contrasts were equally discriminable. However, the interaction between Tonemic contrast and Language Group was not significant [F(20, 910) = 1.25, p = .21]. All groups, including the Native Speakers, performed worst on the Tone2 vs. Tone3 contrast, and increased experience with Mandarin led to improvement across all tonemic contrasts, which is consistent with other reports (e.g., Hao, 2012; Wang, Spence, Jongman, & Sereno, 1999). Thus, even though some tonemic pairs were easier to discriminate than others, varying knowledge of Mandarin did not lead to differential sensitivity to particular tonemic contrasts (see Table 2 for details).
Table 2.
Group means and standard deviations for each of the pairs of tone contrasts.
| Group | T1 vs. T2 | T1 vs. T3 | T1 vs. T4 | T2 vs. T3 | T2 vs. T4 | T3 vs. T4 |
|---|---|---|---|---|---|---|
| Naïve Listeners | 68.3% (18.2) | 72.1% (18.9) | 67.5% (17.4) | 60.1% (19.0) | 64.6% (18.9) | 71.0% (21.6) |
| Beginners | 82.0% (15.1) | 88.5% (15.8) | 79.3% (15.4) | 75.3% (19.6) | 80.1% (15.7) | 87.0% (14.9) |
| Intermediate | 82.0% (16.9) | 86.4% (16.5) | 78.1% (19.4) | 73.0% (21.4) | 78.6% (17.2) | 83.3% (19.0) |
| Advanced | 85.0% (16.1) | 96.7% (8.8) | 89.4% (8.0) | 81.2% (12.3) | 92.2% (10.7) | 95.0% (7.6) |
| Native Speakers | 92.2% (13.2) | 92.2% (13.3) | 94.7% (10.2) | 84.8% (16.5) | 91.4% (13.6) | 93.7% (10.1) |
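The pattern in Table 2 can be checked directly from the group means. The following sketch is illustrative rather than part of the original analysis: it enumerates the six pairwise contrasts among four tones, finds the lowest-scoring contrast for each group, and shows how the within-subjects degrees of freedom follow from the number of contrasts, assuming the between-subjects error term of 182 from the one-way ANOVA. Variable names are hypothetical.

```python
from itertools import combinations

TONES = ("T1", "T2", "T3", "T4")

# Six unordered tone pairs, in the same column order as Table 2.
contrasts = [f"{a} vs. {b}" for a, b in combinations(TONES, 2)]

# Mean % correct per contrast, copied from Table 2.
table2_means = {
    "Naïve Listeners": (68.3, 72.1, 67.5, 60.1, 64.6, 71.0),
    "Beginners": (82.0, 88.5, 79.3, 75.3, 80.1, 87.0),
    "Intermediate": (82.0, 86.4, 78.1, 73.0, 78.6, 83.3),
    "Advanced": (85.0, 96.7, 89.4, 81.2, 92.2, 95.0),
    "Native Speakers": (92.2, 92.2, 94.7, 84.8, 91.4, 93.7),
}

# Contrast with the lowest mean accuracy in each group.
hardest = {
    group: contrasts[means.index(min(means))]
    for group, means in table2_means.items()
}

df_contrast = len(contrasts) - 1          # within-subjects effect df
df_error_within = df_contrast * 182       # (contrasts - 1) x between-subjects error df
print(hardest, df_contrast, df_error_within)
```

In every group the lowest mean falls on T2 vs. T3, and the degrees of freedom match the F(5, 910) reported for the main effect of Tonemic contrast.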
4. Discussion
Using an AXB task, we investigated how adults with varying amounts of exposure to Mandarin perceive tonemic contrasts and documented the effect of L2 classroom experience on perceptual sensitivities. Naïve adults with no prior tonal language experience performed significantly above chance. Surprisingly, a mere month of classroom experience was enough to significantly improve performance over naïve listeners, but an additional year of classroom exposure did not lead to further improvements. Finally, advanced L2 learners (who had taken at least four semesters of college Mandarin and, in most cases, had also studied abroad) performed as accurately as native Mandarin speakers. The finding that L2 learners of Mandarin Chinese can show marked improvements in their perception of tonal contrasts coheres with prior findings (e.g., Guo & Tao, 2008; Hao, 2012; Lee et al., 2010). By including learners at different stages of proficiency, the current results highlight the incremental and non-linear trajectory by which learners gain proficiency with this L2 feature in a classroom setting.
Despite having no exposure to Mandarin, naïve participants showed some sensitivity to tonemic contrasts. Non-native contrasts are not all equally difficult to perceive (e.g., Best et al., 1988), and these results suggest that Mandarin tonemes can be distinguished by listeners who are unfamiliar with any tonal language, which is consistent with prior studies (e.g., Wang et al., 1999). This sensitivity suggests that even though participants may not be accustomed to attending to the lexical function of suprasegmental information, their L1 experience did not prevent them from being able to perceive tonemic contrasts. In addition, the significant improvement in performance shown by the Beginners, who had only had a month of Mandarin experience, also supports the view that L2 contrasts that do not directly conflict with L1 can be learned relatively easily (e.g., Antoniou & Wong, 2016; Best, 1993, 1994; Best et al., 1988; Best, McRoberts, LaFleur, & Silver-Isenstadt, 1995; Kuhl, 1991; Kuhl et al., 2005; Zhang et al., 2009). Furthermore, the rapid change shows that classroom experience can have immediate, measurable consequences on adults’ perception.
While one month of experience appears to be enough to yield enhanced sensitivity to tonemic contrasts, an additional year of experience did not improve performance (Intermediate group), suggesting that complete mastery of this linguistic feature does not come rapidly. It could be that in the first weeks of a language course, instructors emphasize this feature of Mandarin. Once students have a cursory understanding of tonal structure, it receives less attention; in the early stages of L2 learning, a superficial knowledge of tones may suffice. Another possibility is that until students reach a higher level of proficiency, tonemic contrasts remain challenging. As with other L2 sounds, increased experience and expertise in the L2 could enhance phonological abilities (Guion et al., 2000). In the current study, Advanced learners performed significantly better than the less experienced learners, suggesting that learning may not follow an even trajectory. Even though the measure of change was not taken at equal time intervals (e.g., the time difference between Beginner and Intermediate groups was 6 months, while it was 1 to 3 years for the Intermediate and Advanced groups), these data still suggest that there may not be an even progression in the rate at which these contrasts are learned. It is likely that both increased linguistic input and overall proficiency played a role in the superior performance of the Advanced learners. With sufficient exposure to tones, Advanced learners may have developed a phonemic category for lexical tones that is distinct from any of the existing ones in their L1, resulting in better sound discrimination. It is also possible that enhanced familiarity with the L2 allowed the Advanced students to allocate attention to processing tonal cues in greater detail (Finn et al., 2013).
Though the Advanced learners performed as well as native Mandarin speakers on the AXB task, this result does not necessarily suggest that they are perceiving lexical tones in an identical fashion to native speakers. Prior research indicates that even when L2 learners show sensitivity to non-native cues in simple tasks, they continue to struggle to use suprasegmental information in more demanding contexts (e.g., Cooper, Cutler, & Wales, 2002; Dupoux, Sebastián-Gallés, Navarrete, & Peperkamp, 2008). A limitation of our study was that it only tested the perception of isolated syllables produced by a single speaker and did not include more challenging tasks, such as producing tones or using tones to understand spoken language. Perceptual enhancement on a task assessing tonemic perception does not establish that Advanced learners perceive tones lexically or that they are able to integrate tonal information in speech processing. Indeed, other evidence suggests that classroom Mandarin learners struggle with some language learning tasks that require them to make use of tonal information (Potter, Wang, & Saffran, 2017). Still, it is important to note that the Advanced learners were capable of distinguishing tonemes and were performing as well as native speakers on this AXB task, demonstrating that classroom instruction can allow adult L2 speakers to achieve native-like perceptual skills, at least in a simplified context.
In the present study, we only tested learners for whom tone was an unfamiliar lexical feature, and prior tonal experience may interfere with non-native tone perception (So, 2005; So & Best, 2010). Learners whose L1 uses a different tonal system might face different challenges compared to non-tonal speakers in learning new lexical tone contrasts in the classroom (e.g., Gandour et al., 2000; Hao, 2012). Future studies will explore whether experience with different L1 tonal categories helps learners attend to L2 tones in a classroom learning environment, or instead hinders the acquisition of L2 tones in comparison to naïve learners.
One important caveat is that our participants were not randomly assigned to language exposure groups and there may be differences in their levels of motivation. Individuals in the Naïve group had chosen not to study Mandarin, whereas the three classroom groups chose Mandarin as an L2. Moreover, there is a high attrition rate from introductory Mandarin classes to the more advanced classes. Students who initially found Mandarin tones especially challenging may have been less likely to continue to take Mandarin classes. Different groups of learners could also have varied in their desire to achieve high proficiency. First-Year Chinese was offered as a two-semester sequence that fulfilled the university’s foreign language requirement, and it may be that some less-motivated students remained enrolled through the Intermediate level despite limited interest in gaining proficiency, but only those more motivated students continued to the Advanced level. However, the fact that we saw no difference in performance between students who had just enrolled in Mandarin (Beginners) and those who had completed two semesters (Intermediate) suggests that self-selection is unlikely to be the sole cause of the differences we observed, as the least-motivated students were unlikely to take the second course in the sequence.³ Furthermore, even if individuals who were more sensitive to tones were more likely to continue studying Chinese, it is still unusual to see late L2 learners achieve native-like performance on a non-native sound contrast (e.g., Dupoux, Pallier, Sebastián-Gallés, & Mehler, 1997). To fully understand why some L2 learners and environments are more successful than others, future research will need to examine contributions of student motivation, classroom experience, and individual abilities.
In a second language classroom, students are asked to perceive, comprehend, and produce unfamiliar sounds after limited exposure to the new language. It has been well-documented that learners have trouble learning contrasts that are inconsistent with their L1, but less attention has been paid to how the ability to contend with these contrasts may change over time. The current study provided a systematic examination of how L2 listeners contend with a novel contrast, lexical tone, as they gain experience with Mandarin through classroom instruction in a university setting. Our results suggest not only that perceptual abilities change quickly, but that on a simple discrimination measure, advanced L2 learners become indistinguishable from native speakers. Thus, these findings suggest that adult learners retain sufficient plasticity to successfully learn at least some non-native sound contrasts later in life.
Acknowledgements
This research was funded by grants from the National Institute of Child Health and Human Development (to J. R. Saffran, R37HD037466; and the Waisman Center, P30HD03352), the National Science Foundation Graduate Research Fellowship (to C. E. Potter, DGE-1256259), and the James S. McDonnell Foundation (to J. R. Saffran). We would like to thank the participants at the University of Wisconsin-Madison as well as the teachers and students at the Department of East Asian Languages & Literature. We also thank Yayun Zhang, Rachel Wang, Federica Bulgarelli, Hilary Stein, and Shelby Adler for their assistance in testing participants; Margarita Kaushanskaya for her comments and advice on the project; and the editor and two anonymous reviewers for their thoughtful feedback and suggestions.
Appendix
| Tone | mean F0 max (SD) | mean F0 min (SD) | Range |
|---|---|---|---|
| T1 | 263.93 (12.93) | 245.23 (12.47) | 221.6–281.1 |
| T2 | 251.91 (5.53) | 176.13 (23.48) | 117.9–261.7 |
| T3 | 223.33 (3.02) | 116.32 (37.54) | 76–229.7 |
| T4 | 293.53 (17.52) | 116.18 (38.08) | 76.2–337.6 |
Acoustic characteristics of F0 (in Hz) for all tokens.
Footnotes
Due to unequal Ns in the test groups (range = 15–51) and possible ceiling effects in the Advanced and Native groups, we also performed nonparametric tests to confirm the results. A Kruskal-Wallis test for k independent samples showed a significant effect of group, χ²(4) = 52.184, p < .00001.
| Group (median % correct) | No experience (66.7%) | Beginners (86.1%) | Intermediate (81.9%) | Native Speakers (94.4%) |
|---|---|---|---|---|
| Advanced (90.3%) | p < .000001* | p = .019* | p = .035* | p = .129 |
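For readers unfamiliar with the Kruskal-Wallis test mentioned in these footnotes, the following is a minimal sketch of how the H statistic and its chi-square approximation can be computed. It is not the authors' analysis code. The function names (`average_ranks`, `kruskal_wallis`, `chi2_sf_even_df`) and the scores are hypothetical, invented only for illustration, and the p-value helper uses a closed form that holds only for even degrees of freedom (such as the df = 4 that arises with five groups).

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df) for a list of samples, with the usual tie correction."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum over groups of (rank total squared) / (group size).
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# Hypothetical proportion-correct scores, invented purely for illustration.
scores = {
    "Naive": [0.58, 0.64, 0.67, 0.69, 0.72],
    "Beginner": [0.78, 0.83, 0.86, 0.89],
    "Intermediate": [0.75, 0.81, 0.82, 0.88],
    "Advanced": [0.86, 0.90, 0.92, 0.94],
    "Native": [0.92, 0.94, 0.97, 1.00],
}
h, df = kruskal_wallis(list(scores.values()))
print(f"H({df}) = {h:.3f}, p = {chi2_sf_even_df(h, df):.5f}")
```

Because the test operates on ranks rather than raw scores, it is less sensitive to unequal group sizes and to ceiling effects than a parametric ANOVA, which is the motivation given in the footnote.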
According to Chinese instructors at the university, students who receive a grade below C typically do not continue with the 2nd semester of Chinese.
Contributor Information
Tianlin Wang, University at Albany SUNY.
Christine E. Potter, Princeton University
Jenny R. Saffran, University of Wisconsin-Madison
References
- Antoniou M, & Chin JL (2018). What can lexical tone training studies in adults tell us about tone processing in children? Frontiers in Psychology, 9, 1.
- Antoniou M, & Wong PCM (2016). Varying irrelevant phonetic features hinders learning of the feature being trained. The Journal of the Acoustical Society of America, 139, 271–278. doi: 10.1121/1.4939736
- Asaridou SS, Takashima A, Dediu D, Hagoort P, & McQueen JM (2015). Repetition suppression in the left inferior frontal gyrus predicts tone learning performance. Cerebral Cortex, 26(6), 2728–2742.
- Bao ZM (1990). On the Nature of Contour Tone. Ph.D. dissertation, MIT.
- Bao ZM (2003). Tone, accent, and stress in Chinese. Journal of Linguistics, 39(1), 147–166.
- Best CT (1993). Emergence of language-specific constraints in perception of non-native speech: A window on early phonological development. In Developmental neurocognition: Speech and face processing in the first year of life (pp. 289–304). Springer Netherlands.
- Best CT (1994). The emergence of native-language phonological influences in infants: A perceptual assimilation model. In The Development of Speech Perception: The Transition from Speech Sounds to Spoken Words (pp. 167–224). Cambridge, MA: MIT Press.
- Best CT, McRoberts GW, LaFleur R, & Silver-Isenstadt J. (1995). Divergent developmental patterns for infants’ perception of two nonnative consonant contrasts. Infant Behavior and Development, 18(3), 339–350.
- Best C, McRoberts G, & Sithole N. (1988). Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance, 14(3), 345–360.
- Bradlow AR, Pisoni DB, Akahane-Yamada R, & Tohkura YI (1997). Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. The Journal of the Acoustical Society of America, 101(4), 2299–2310.
- Broselow E, Hurtig RR, & Ringen C. (1987). The perception of second language prosody. Interlanguage phonology: The acquisition of a second language sound system, 350–361.
- Burnham D, & Francis E. (1997). The role of linguistic experience in the perception of Thai tones. Southeast Asian linguistic studies in honour of Vichin Panupong, 29–47.
- Burnham D, & Mattock K. (2007). The perception of tones and phones. Language experience in second language speech learning: In honor of James Emil Flege, 259–280.
- Caldwell-Harris CL, Lancaster A, Ladd DR, Dediu D, & Christiansen MH (2015). Factors influencing sensitivity to lexical tone in an artificial language: Implications for second language learning. Studies in Second Language Acquisition, 37(2), 335–357.
- Chandrasekaran B, Sampath PD, & Wong PC (2010). Individual variability in cue-weighting and lexical tone learning. The Journal of the Acoustical Society of America, 128(1), 456–465.
- Chao YR (1968). A Grammar of Spoken Chinese. University of California Press.
- Chen A, & Kager R. (2016). Discrimination of lexical tones in the first year of life. Infant and Child Development, 25(5), 426–439.
- Dupoux E, Sebastián-Gallés N, Navarrete E, & Peperkamp S. (2008). Persistent stress ‘deafness’: The case of French learners of Spanish. Cognition, 106(2), 682–706.
- Dupoux E, Pallier C, Sebastian N, & Mehler J. (1997). A destressing “deafness” in French? Journal of Memory and Language, 36(3), 406–421.
- Duanmu S. (2000). The Phonology of Standard Chinese. Oxford: Oxford University Press.
- Fernald A, Taeschner T, Dunn J, Papousek M, de Boysson-Bardies B, & Fukui I. (1989). A cross-language study of prosodic modifications in mothers’ and fathers’ speech to preverbal infants. Journal of Child Language, 16(03), 477–501.
- Finn AS, Hudson Kam CL, Ettlinger M, Vytlacil J, & D’Esposito M. (2013). Learning language with the wrong neural scaffolding: the cost of neural commitment to sounds. Frontiers in Systems Neuroscience, 7, 85.
- Flege JE, Yeni-Komshian GH, & Liu S. (1999). Age constraints on second-language acquisition. Journal of Memory and Language, 41, 78–104.
- Flege JE (1987). The production of “new” and “similar” phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15, 47–65.
- Flege JE (1989). Chinese subjects’ perception of the word-final English /t/-/d/ contrast: Performance before and after training. The Journal of the Acoustical Society of America, 86(5), 1684–1697.
- Flege JE, Munro MJ, & MacKay IR (1995). Factors affecting strength of perceived foreign accent in a second language. The Journal of the Acoustical Society of America, 97, 3125–34.
- Francis AL, Ciocca V, Ma L, & Fenn K. (2008). Perceptual learning of Cantonese lexical tones by tone and non-tone language speakers. Journal of Phonetics, 36(2), 268–294.
- Gandour J, Wong D, Hsieh L, Weinzapfel B, Van Lancker D, & Hutchins GD (2000). A crosslinguistic PET study of tone. Journal of Cognitive Neuroscience, 12(1), 207–222.
- Gao J, Shi R, & Li A. (2011). Lexical tone perception in non-tone-learning infants. 12th International Congress for the Study of Child Language, Montreal.
- Goto H. (1971). Auditory perception by normal Japanese adults of the sounds “L” and “R.” Neuropsychologia, 9, 317–323.
- Guion SG, Flege JE, & Loftin JD (2000). The effect of L1 use on pronunciation in Quichua–Spanish bilinguals. Journal of Phonetics, 28(1), 27–42.
- Guo L, & Tao L. (2008, April). Tone production in Mandarin Chinese by American students: A case study. In Proceedings of the 20th North American Conference on Chinese Linguistics (NACCL-20) (Vol. 1, pp. 123–138).
- Gussenhoven C. (2004). The Phonology of Tone and Intonation. Cambridge: Cambridge University Press.
- Hallé PA, Chang Y-C, & Best CT (2004). Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. Journal of Phonetics, 32(3), 395–421.
- Hao Y-C (2012). Second language acquisition of Mandarin Chinese tones by tonal and non-tonal language speakers. Journal of Phonetics, 40(2), 269–279.
- Harrison P. (2000). Acquiring the phonology of lexical tone in infancy. Lingua, 110(8), 581–616.
- Hay JF, Graf Estes K, Wang T, & Saffran JR (2015). From flexibility to constraint: The contrastive use of lexical tone in early word learning. Child Development, 86, 10–22.
- Iverson P, Kuhl PK, Akahane-Yamada R, & Diesch E. (2003). A perceptual interference account of acquisition difficulties for non-native phonemes. Cognition, 87, B47–B57.
- Kiriloff C. (1969). On the auditory perception of tones in Mandarin. Phonetica, 20(2–4), 63–67.
- Kuhl P, Conboy B, Padden D, Nelson T, & Pruitt J. (2005). Early speech perception and later language development: Implications for the “Critical Period.” Language Learning and Development, 1(3&4), 237–264.
- Kuhl PK (1991). Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Perception & Psychophysics, 50(2), 93–107.
- Lee C, Tao L, & Bond ZS (2010). Modified Mandarin tones by non-native listeners. Language and Speech, 53(2), 217–243.
- Leonard LB (1982). Phonological deficits in children with developmental language impairment. Brain and Language, 16(1), 73–86.
- Liu L, & Kager R. (2011). How do statistical learning and perceptual reorganization alter Dutch infants’ perception to lexical tones? In ICPhS (Vol. 17, pp. 1270–1273).
- Liu L, & Kager R. (2015). Bilingual exposure influences infant VOT perception. Infant Behavior and Development, 38, 27–36.
- Liu S, & Samuel AG (2004). Perception of Mandarin lexical tones when F0 information is neutralized. Language and Speech, 47, 109–138.
- Logan J, Lively S, & Pisoni D. (1991). Training Japanese listeners to identify English /r/ and /l/: A first report. The Journal of the Acoustical Society of America, 89(2), 874–886.
- Long M. (1990). Maturational constraints on language development. Studies in Second Language Acquisition, 12(3), 251–285.
- Marian V, Blumenfeld HK, & Kaushanskaya M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research, 50(4), 940–967.
- Mattock K, & Burnham D. (2006). Chinese and English infants’ tone perception: Evidence for perceptual reorganization. Infancy, 10(3), 241–265.
- Mattock K, Molnar M, Polka L, & Burnham D. (2008). The developmental course of lexical tone perception in the first year of life. Cognition, 106(3), 1367–1381.
- McClelland JL, Fiez JA, & McCandliss BD (2002). Teaching the /r/-/l/ discrimination to Japanese adults: behavioral and neural aspects. Physiology & Behavior, 77, 657–662.
- Miyawaki K, Strange W, Verbrugge R, Liberman A, Jenkins J, & Fujimura O. (1975). An effect of linguistic experience: The discrimination of [r] and [l] by native speakers of Japanese and English. Perception & Psychophysics, 18(5), 331–340.
- Moore D, Spence M, & Katz G. (1997). Six-month-olds’ categorization of natural infant-directed utterances. Developmental Psychology, 33, 980–989.
- Pallier C, Bosch L, & Sebastián-Gallés N. (1997). A limit on behavioral plasticity in speech perception. Cognition, 64(3), B9–B17.
- Pelzl E, Lau EF, Guo T, & DeKeyser R. (2019). Advanced second language learners’ perception of lexical tone contrasts. Studies in Second Language Acquisition, 41, 59–86.
- Peng G, Zheng H-Y, Gong T, Yang R-X, Kong J-P, & Wang WS-Y (2010). The influence of language experience on categorical perception of pitch contours. Journal of Phonetics, 38(4), 616–624.
- Potter CE, Wang T, & Saffran JR (2017). Second Language Experience Facilitates Statistical Learning of Novel Linguistic Materials. Cognitive Science, 41(S4), 913–927.
- Quam C, & Swingley D. (2010). Phonological knowledge guides 2-year-olds’ and adults’ interpretation of salient pitch contours in word learning. Journal of Memory and Language, 62(2), 135–150.
- Ruben RJ (1997). A time frame of critical/sensitive periods of language development. Acta Oto-Laryngologica, 117(2), 202–205.
- Scovel T. (1988). A Time to Speak: A Psycholinguistic Inquiry into the Critical Period for Human Speech. New York: Newbury House.
- Showalter CE, & Hayes-Harb R. (2013). Unfamiliar orthographic information and second language word learning: A novel lexicon study. Second Language Research, 29(2), 185–200.
- Shih C, Lu HYD, Sun L, Huang JT, & Packard J. (2010). An adaptive training program for tone acquisition. In Proceedings of Speech Prosody 2010.
- Snyder TD, & Dillow SA (2011). Digest of Education Statistics, 2010. NCES 2011–2015. National Center for Education Statistics.
- So CK (2005). The influence of L1 prosodic background on the learning of Mandarin tones: Patterns of tonal confusion by Cantonese and Japanese naïve listeners. In Proceedings of the 2005 annual conference of the Canadian Linguistic Association.
- So CK, & Best CT (2010). Cross-language perception of non-native tonal contrasts: Effects of native phonological and phonetic influences. Language and Speech, 53(2), 273–293.
- So CK, & Best CT (2011). Categorizing Mandarin tones into listeners’ native prosodic categories: The role of phonetic properties. Poznań Studies in Contemporary Linguistics PSiCL, 47, 133.
- Song JH, Skoe E, Wong PC, & Kraus N. (2008). Plasticity in the adult human auditory brainstem following short-term linguistic training. Journal of Cognitive Neuroscience, 20(10), 1892–1902.
- Strange W, & Dittmann S. (1984). Effects of discrimination training on the perception of /r-l/ by Japanese adults learning English. Perception & Psychophysics, 36(2), 131–45.
- Strange W. (Ed.). (1995). Speech Perception and Linguistic Experience: Issues in Cross-language Research. Baltimore: York Press.
- Wang X. (2013). Perception of Mandarin tones: The effect of L1 background and training. The Modern Language Journal, 97, 144–160.
- Wang Y, & Kuhl PK (2003). Evaluating the “Critical Period” hypothesis: perceptual learning of Mandarin tones in American adults and American children at 6, 10 and 14 years of age. In Proceedings of the 15th International Congress of Phonetic Sciences (pp. 1537–1540).
- Wang Y, Sereno JA, Jongman A, & Hirsch J. (2003). fMRI evidence for cortical modification during learning of Mandarin lexical tone. Journal of Cognitive Neuroscience, 15(7), 1019–1027.
- Wang Y, Spence MM, Jongman A, & Sereno JA (1999). Training American listeners to perceive Mandarin tones. The Journal of the Acoustical Society of America, 106, 3649–3658.
- Wayland RP, & Guion SG (2004). Training English and Chinese listeners to perceive Thai tones: A preliminary report. Language Learning, 54(4), 681–712.
- Werker J, & Tees R. (1984b). Phonemic and phonetic factors in adult cross-language speech perception. The Journal of the Acoustical Society of America, 75, 1866.
- White CM (1981). Tonal perception errors and interference from English intonation. Journal of Chinese Language Teachers Association, 16(2), 27–56.
- Wong PC, & Perrachione TK (2007). Learning pitch patterns in lexical identification by native English-speaking adults. Applied Psycholinguistics, 28(4), 565–585.
- Wong P, Perrachione TK, & Parrish TB (2007). Neural characteristics of successful and less successful speech and word learning in adults. Human Brain Mapping, 28(10), 995–1006.
- Wu X, & Lin H. (2008). Perception of Mandarin tones by Mandarin and English listeners. Journal of Chinese Language and Computing, 18(4), 175–187.
- Xu Y. (1997). Contextual tonal variations in Mandarin. Journal of Phonetics, 25(1), 61–83.
- Yang C, & Chan MK (2010). The perception of Mandarin Chinese tones and intonation by American learners. Journal of Chinese Language Teachers Association, 45(1), 7–36.
- Yip M. (2002). Tone. Cambridge: Cambridge University Press.
- Zhang Y, Kuhl PK, Imada T, Iverson P, Pruitt J, Stevens EB, Kawakatsu M, Tohkura Y, & Nemoto I. (2009). Neural signatures of phonetic learning in adulthood: a magnetoencephalography study. NeuroImage, 46(1), 226–240.
- Zue VW (1976). Some perceptual experiments on the Mandarin tones. The Journal of the Acoustical Society of America, 60(S1), S45–S45.

