2024 May 23;41(2):367–396. doi: 10.1177/02676583241249348

The effects of incidental learning and input frequency on the perception of non-native speech

Andrew H Lee 1, Jackie S Lloyd 1
PMCID: PMC11954681  PMID: 40166383

Abstract

The current study investigated the extent to which naive listeners could incidentally acquire non-native phonemic contrasts and the degree to which the frequency of exposure to the target phonemes affects their learning. A total of 100 English speakers were assigned to the following conditions: (1) 0-occurrence; (2) 2-occurrence; (3) 10-occurrence; (4) 20-occurrence; or (5) 30-occurrence. The participants watched a video that provided instruction on counting numbers in Korean while incidentally exposing them to various repetitions of the target phonemes. All participants completed a pretest, an immediate posttest, and a delayed posttest, each comprising an AX discrimination task. The effects of incidental exposure were found only in the 10-occurrence condition, in both the immediate posttest and the delayed posttest. While the current study demonstrates the overall efficacy of incidental exposure on the perception of non-native speech, it also highlights the important role that selective attention plays in language learning.

Keywords: incidental learning, instructed L2 speech perception, Korean three-way stops

I Introduction

Incidental learning refers to the acquisition of a linguistic target without the conscious intention to learn the target, such as ‘picking up’ a new word or expression from linguistic input (Hulstijn, 2013). Evidence largely suggests that intentional learning is more beneficial for second language (L2) acquisition than incidental learning (e.g. Hamrick and Rebuschat, 2014; Ishikawa, 2019; Sonbul and Schmitt, 2010; Webb et al., 2023). However, due to the finite nature of learner attention and time available for L2 instruction, not only is it unrealistic to expect all linguistic targets to be acquired in an intentional manner, but learning additional targets incidentally as a by-product of intentional learning or of other activities would be in the best interests of learners. Researchers have thus investigated the effects of incidental learning on L2 acquisition to a great extent. While the overwhelming majority of previous research focuses on the acquisition of L2 vocabulary and morphosyntax (e.g. Ishikawa, 2019; Ruiz et al., 2018; Tao and Williams, 2018; Webb, 2007), the available research in the L2 speech domain indicates that non-native listeners can acquire linguistic targets in an incidental manner to some extent, with a few studies (e.g. Lim and Holt, 2011; Vlahou et al., 2012) even suggesting that incidental learning can be equal to or more effective than explicit, intentional learning. In addition, given the notion that frequency of input is an important component of language learning in general, a substantial number of studies in the vocabulary and morphosyntax domains have examined frequency as a variable. However, frequency of input has not been studied in incidental learning of L2 speech. The current study aims to fill this gap by investigating the role of input frequency in the extent to which naive listeners could acquire non-native phonemic contrasts in an incidental manner.

To pursue this research objective, an experimental study was conducted with 100 first language (L1) speakers of English targeting Korean lenis stops (/p/, /t/, and /k/) and fortis stops (/p*/, /t*/, and /k*/), which are difficult for L2 learners of Korean to acquire (e.g. Chang, 2010; Francis and Nusbaum, 2002; Holliday, 2015, 2019). The current study is expected to expand the literature regarding the effects of incidental exposure and its frequency on L2 acquisition, while offering pedagogical implications with respect to the role of incidental learning in L2 speech learning.

II Background

1 Incidental learning and frequency

In the field of L2 acquisition, learning can be categorized into two types: intentional and incidental. Intentional learning involves a conscious effort to learn or memorize new information (DeKeyser, 2003; Hulstijn, 2003, 2013), whereas incidental learning is widely defined as learning without a deliberate intent to learn (Hulstijn, 2003; Williams, 2009). Incidental learning stems from the view that aspects of an L2 can be ‘picked up’ while the learner’s attention is focused on a different target (Hulstijn, 2003). Hence, linguistic knowledge gained while one was focusing on something other than that language form or pattern can be said to have been learned incidentally. With respect to L2 acquisition, there is ample evidence suggesting that intentional learning results in more learning than incidental learning does (e.g. Hamrick and Rebuschat, 2014; Norris and Ortega, 2000; Sonbul and Schmitt, 2010), with higher rates of knowledge retention (Denhovska et al., 2016; Hulstijn, 2003; Ishikawa, 2019; Schmitt, 2008). However, some studies (e.g. Lim and Holt, 2011; Morgan-Short et al., 2010, 2012; Vlahou et al., 2012) have made a case for incidental learning, which has potential implications for optimal L2 learning conditions and warrants further examination. Furthermore, given common constraints, such as limited learner attention and time devoted to learning an L2, it is worthwhile to explore ‘what can be learned as mere result of exposure and without explicit instructional treatments’ (Rebuschat, 2013: 598). As noted by Leow and Zamora (2017), a deeper understanding of incidental learning is ‘of clear theoretical value to the field of SLA [second language acquisition]’ (p. 44), which may yield further insights on learning conditions that fully utilize learners’ cognitive processes.

A major variable often addressed in research on incidental learning, especially of vocabulary, is the frequency of exposure to the target input. In general, it is uncontroversial that ‘learning is sensitive to frequency: the more times a stimulus is encountered, the faster and more accurately it is processed’ (Ellis, 2006a: 5). At this point, it should be noted that frequency is just one of several interconnected components that contribute to L2 learning, such as attention and salience (Robinson et al., 2019). The distinction between input and intake is also important in considering the role of input frequency; frequency alone does not guarantee that the target input will be further processed by the learner, particularly in incidental learning. Nonetheless, frequency is a highly important factor that is relatively easy to control and manipulate in instructed L2 acquisition, making it the primary focus of the current study.

There is a wealth of research on the effects of incidental learning on L2 acquisition, with a notably heavy focus on vocabulary and morphosyntax. Following research that showed gains in vocabulary knowledge from meaning-focused activities, such as extensive reading (e.g. Horst et al., 1998; Waring and Takaki, 2003; Zahar et al., 2001), numerous studies have demonstrated that vocabulary can be learned incidentally while reading (Eckerth and Tavakoli, 2012; Ruiz et al., 2018; Teng, 2020; Webb, 2007), including an eye-tracking study (Mohamed, 2018), which found that even during comprehension-focused reading, L2 learners spent more time looking at novel words. Incidental learning of L2 vocabulary has also been shown to occur during activities other than reading, such as reading while listening (Malone, 2018; Webb and Chang, 2015; Webb et al., 2013), listening (Jin and Webb, 2020; Pavia et al., 2019; van Zeeland and Schmitt, 2013), watching videos (Nguyen and Boers, 2019; Peters and Webb, 2018), and speaking (Newton, 2013). A meta-analysis of 32 studies on incidental vocabulary learning (de Vos et al., 2018) substantiated the benefits of listening during meaning-focused activities. Another meta-analysis by Webb et al. (2023) found that, while likely less effective than learning intentionally, incidental learning of L2 vocabulary occurred during meaning-focused activities, at similar rates during reading, listening, or reading while listening.

Following a seminal study by Saragi et al. (1978), many researchers have reported a positive correlation between frequency and incidental vocabulary learning at varying degrees (Horst et al., 1998; Rott, 1999; Vidal, 2003, 2011; Waring and Takaki, 2003; Webb, 2007). Holding repetition as an important variable, recent studies have provided further support for input frequency as a predictor of incidental vocabulary learning (Hulme et al., 2019; Mohamed, 2018; Pavia et al., 2019; Peters and Webb, 2018; Teng, 2020). However, there is a lack of consensus on the exact number of encounters required (i.e. between two and over 20), and some studies (e.g. Jin and Webb, 2020; Webb and Chang, 2015) found no significant effects of frequency on vocabulary learning. Nonetheless, incidental learning has been reported as more sensitive to frequency than intentional learning is (Hamrick and Rebuschat, 2014), and a meta-analysis by Uchihara et al. (2019) found a medium effect (r = .34) of repetition on incidental vocabulary learning. Taken together, though frequency may not be the only nor most significant predictor of learning, it remains a necessary component of language processing and acquisition.

Morphosyntax is another domain in which incidental L2 learning has been extensively investigated. Studies employing semiartificial languages that combine target vocabulary with morphological or syntactic features of other languages (e.g. Grey et al., 2014; Rebuschat and Williams, 2012; Rogers et al., 2016; Tao and Williams, 2018), as well as natural languages (e.g. Brooks and Kempe, 2013; Denhovska and Serratrice, 2017; Godfroid, 2016; Lee, 2002; Robinson, 2005; Shintani, 2015), have provided evidence that novel grammar structures can be successfully acquired under incidental learning conditions. However, it has been noted that while various aspects of grammar can be acquired incidentally, the amount of learning is usually not robust (Leow and Zamora, 2017). In many cases, morphosyntactic knowledge gained through incidental exposure was limited to receptive, but not productive, knowledge (Denhovska and Serratrice, 2017; Godfroid, 2016; Shintani, 2015) or did not transfer to new items to which learners had not been exposed (Robinson, 2005). Ruiz et al. (2018), who examined the simultaneous incidental learning of vocabulary and syntax, found significantly greater learning gains for vocabulary than for syntax.

In contrast to those on vocabulary, few studies on incidental learning of L2 morphosyntax have specifically examined frequency as a variable. They have provided divergent conclusions, either that higher frequency of exposure to target input increases learning to some extent (Aka, 2020; Lee, 2002; Robinson, 2005), or that ‘less is more’ (Denhovska et al., 2016: 178) for beginner learners, who are cognitively taxed while processing L2 input. While it seems probable that frequently occurring morphosyntactic features would be acquired more successfully than infrequent features, it remains open whether that is empirically the case.

There is relatively little research on incidental learning of L2 speech (see Hulstijn, 2003; Loewen, 2020), but the available evidence collectively indicates that sounds can be learned incidentally. Using a videogame task, Lim and Holt (2011) demonstrated that Japanese speakers could not only incidentally learn the English /r/–/l/ categories, but that their gains after a mere 2.5 hours of incidental exposure were comparable to learning gains found in previous studies that largely involved two to four weeks of explicit categorization training. Saito et al. (2022), a later study based on Lim and Holt (2011), produced limited but similar results, adding that incidental speech learning may be more effective for more learnable targets (i.e. the English /æ/–/ʌ/ contrast rather than /r/–/l/ for Japanese speakers). Targeting non-speech sounds, several studies employing the videogame paradigm also demonstrated that new sound categories or patterns could be reliably learned in an incidental manner (Gabay et al., 2018, 2023; Lim et al., 2019; Wade and Holt, 2005), be generalized to non-native natural speech (Liu and Holt, 2015), and even be consolidated into long-term memory (Gabay et al., 2023). Notably, fMRI scans in Lim et al. (2019) showed that the striatum, an area of the brain thought to be involved in explicit category learning, similarly contributed to the incidental learning of auditory categories.

Employing a simpler technique for incidental exposure to non-native speech, Vlahou et al. (2012) had Greek speakers listen to Hindi consonants but focus on the differences in volume between pairs of sounds, rather than on the consonants themselves. Results showed that the most robust learning occurred in the group that received this incidental exposure without any feedback on their volume discrimination, compared to the group that received feedback and to the group that received explicit training on identifying the target consonants. Luthra et al. (2019), utilizing the same learning paradigm and targets, reported weak findings that were nonetheless consistent with those of Vlahou et al. (2012), in that participants incidentally exposed to the Hindi consonants performed better on an identification task than control participants with no exposure. Hutchinson and Dmitrieva (2022) explored a more ecologically valid method of incidental exposure, in which naive-listener L1 speakers of English watched a French film while completing a vocabulary task. Though their focus was on production (as opposed to perception), Hutchinson and Dmitrieva’s (2022) findings indicate that incidental exposure through film viewing improved the participants’ pronunciation of the French /y/.

As discussed, the extant research provides evidence that novel sounds, both speech and non-speech, can be acquired incidentally, potentially even more effectively than in an intentional manner. Yet, although L2 environments and classrooms are full of novel speech sounds, pronunciation instruction is seldom prioritized in instructional settings (Darcy, 2018; Foote et al., 2016; Huensch, 2019), with time constraints cited as the primary reason. With this reality, neglecting to utilize incidental exposure for pedagogical benefit would be a missed opportunity in instructed L2 speech acquisition. A key factor to consider in the pursuit of leveraging incidental learning with limited time is the amount of incidental exposure necessary for learning. However, to the best of our knowledge, the role of input frequency in incidental L2 learning has not been investigated in the domain of speech perception as it has been in vocabulary and morphosyntax. Therefore, the current study attempts to fill this research gap by conducting an experimental study on the effects of various input frequencies on the incidental learning of non-native speech sounds; more specifically, Korean stop consonants.

2 The three-way laryngeal contrast of Korean stops

Stop consonants are differentiated by various acoustic dimensions, such as voicing, aspiration, and voice onset time (VOT). Korean, in particular, has a typologically unusual three-way stop contrast, comprising lenis (lax), fortis (tense), and aspirated stops from three places of articulation: bilabial (/p/, /p*/, /pʰ/), alveolar (/t/, /t*/, /tʰ/), and velar (/k/, /k*/, /kʰ/). For instance, /tal/, /t*al/, and /tʰal/ denote ‘moon’, ‘daughter’, and ‘mask’, respectively. Phonetically, in the word-initial position, the three categories of Korean stops are all voiceless and, much like two- and three-way stop contrasts in many other languages, are differentiated by VOT. It has historically been the case that fortis stops have the shortest VOT, lenis stops have intermediate VOT, and aspirated stops have the longest VOT. However, unlike many other languages, the Korean three-way stops are also differentiated by the fundamental frequency (F0) of the following vowel, with the highest F0 corresponding to aspirated stops, then fortis stops, and lenis stops with the lowest F0 (Cho et al., 2002; Francis and Nusbaum, 2002; Kim, 2004). Remarkably, researchers have noted a sound change in the contrasts in the past several decades, specifically in the convergence of VOTs of lenis and aspirated stops among speakers of Seoul (‘standard’) Korean (Bang et al., 2018; Kang, 2014; Kang and Guion, 2008; Silva, 2006). With the disappearance of the difference in VOT, which previously differentiated the two stops, the secondary cue F0 is now the primary contrast between lenis and aspirated stops. Lee et al. (2020) report that this ‘phonetic reorganization’ is spreading to all varieties of Korean.

Due to their unique three-way contrast, Korean stops have been a topic of much research not only with respect to their phonetic and phonological properties (see Kang et al., 2022), but also in their perception and acquisition by naive listeners and L2 learners of Korean, as introduced below. According to the Perceptual Assimilation Model (PAM) (Best, 1995), adults perceive novel non-native phones in terms of their articulatory similarities and differences to phonemes and contrasts in their L1s. Within this framework, novel non-native phones are assimilated, or mapped onto existing L1 sound categories based on their perceived similarity to the L1 category. Hence, when Korean stops are perceived by non-native listeners, they would be assimilated to the listeners’ L1 categories based on acoustic dimensions that differentiate stop consonants in their respective L1s. The PAM outlines six types of assimilation that can occur, depending on the perceived similarities among the non-native phonemes and corresponding L1 phonemes:

  • two-category (TC);

  • single-category (SC);

  • category-goodness (CG);

  • uncategorized–categorized (UC);

  • uncategorized–uncategorized (UU); and

  • non-assimilable (NA) (see Best, 1995).

This prediction has been explored in numerous previous studies for a variety of L1s. With naive-listener L1 speakers of Mandarin, which contrasts stops by VOT, Holliday (2014) demonstrated that the Mandarin speakers categorized the Korean stops primarily in terms of VOT, assimilating both lenis and aspirated stops to Mandarin aspirated voiceless stops, which have longer-lag VOTs (SC), and fortis stops to Mandarin unaspirated voiceless stops (with short VOTs). Though less categorical, similar effects were observed with naive-listener L1 speakers of Japanese (Holliday, 2019), a language that also mainly differentiates stops by VOT. The Japanese participants also assimilated the Korean lenis and aspirated stops to Japanese voiceless stops (SC) and the Korean fortis stops to Japanese voiced stops, solely based on VOT cues. Martínez-García and Holliday (2019) compared the perception of Korean three-way stops by Spanish naive listeners and Spanish L2 learners of Korean. With Spanish stops being either ‘unequivocally pre-voiced’ (p. 2585) or voiceless in the word-initial position, all the participants assimilated all three of the Korean stops, which are voiceless in the word-initial position, to a Spanish voiceless stop category (SC). Although a significant level of inter-listener variability was found, on the whole, the naive listeners and the L2 learners showed only minor differences in assimilating word-initial Korean stops. This pattern of perception that aligns with predictions based on the PAM has also been found in L1 speakers of Hindi and Paite (CG) (Ngaihte and Holliday, 2019), as well as Quebec French (Nam et al., 2021). Notably, in addition to an identification task, in which all Korean stops were assimilated to French voiceless stops as expected (SC for most contrasts; UC for a few exceptions), Nam et al. (2021) employed an AX discrimination task to measure the discrimination of Korean stops by naive-listener Quebec French speakers. Based on the participants’ relative difficulty discriminating between the lenis and aspirated stops, Nam et al. (2021) concluded that the stop contrasts with high assimilation overlap hindered the participants’ discrimination ability.

The current study focuses particularly on L1 English speakers’ ability to perceive Korean stops. Previous research indicates a pattern in line with speakers of the other L1s discussed above. English has a two-way stop contrast that is differentiated by voicing (i.e. voiced and voiceless stops) along the VOT continuum (Cho et al., 2019). Even with high degrees of variation among speakers, the variation is still highly structured, with mean VOT values correlated with linear relations between voiced and voiceless stops and between different places of articulation (Chodroff and Wilson, 2017). With VOT being the key cue for discriminating English stops, naive-listener L1 speakers of English also perceive both Korean lenis and aspirated stops as English voiceless stops and Korean fortis stops as English voiced stops (Schmidt, 2007), and they are unable to attend to the F0 cue that distinguishes lenis stops from aspirated and fortis stops. However, Francis and Nusbaum (2002) demonstrated that with explicit training on identification, L1 English speakers with no prior knowledge of Korean learned to use both VOT and F0. In a similar vein, Kong et al. (2022) found that for L1 English learners of Korean, VOT and F0 were the primary and secondary cues, respectively, in distinguishing Korean stops.

III Current study

The current study aims to advance our understanding of incidental learning of non-native speech and the role of the frequency of input. Considering the attention that frequency of exposure has received in other domains (e.g. Aka, 2020; Hamrick and Rebuschat, 2014; Lee, 2002; Robinson, 2005; Webb, 2007), the current study addresses the research gap that exists in the domain of L2 speech. This inquiry is particularly important in speech learning. It is well known that infants can detect all phonetic information in their early stages due to their ability to attend to fine-grained phonetic distinctions and then direct their attention only to information that is significant in their ambient languages (Munro, 2021). Though they weaken with age, there is evidence that the mechanisms underlying L1 speech learning are maintained and that L2 learners could access them when they are given optimal L2 input (Flege and Bohn, 2021). Regarding what constitutes optimal L2 input, Flege and Bohn (2021) further added that it is ‘unknown at present how much L2 input is needed to form phonetic categories in an L2 and optimally adapt them to everyday use’ (p. 15). By testing various quantities of repeated input, the current study aims to offer some insight into the elements of optimal input in L2 speech learning.

The current study focuses particularly on the perceptual discrimination of Korean lenis stops and fortis stops by naive L1 English speakers. Despite the three-way contrast of Korean stops, only the lenis and fortis stops were included for the following reasons: with the aforementioned convergence of VOTs of lenis and aspirated stops in modern Korean speech, the lenis-aspirated contrast may not be perceptible at all to naive L1 English speakers, who are likely to attend only to VOT. The results of Schmidt (2007) suggested that Korean aspirated stops would be the most difficult for L1 English speakers, especially in distinguishing them from lenis stops. The same was noted among L1 speakers of Spanish (Martínez-García and Holliday, 2019) and Dutch (Broersma, 2009; Choi, 2015). Furthermore, as demonstrated in Nam et al. (2021), discrimination by L1 speakers of French, which has two-way stops differentiated by voicing like English, was the least accurate for lenis-aspirated contrasts across all three places of articulation due to their high assimilation overlap. In Guion and Pederson (2007), out of five Hindi phonemic contrasts learned by L1 speakers of English, explicit directing of attention to their phonetic forms had an effect for only the most difficult contrast, with the authors concluding that explicit attention has a greater effect on learning for difficult contrasts that take longer to learn. Bearing that in mind, the lenis-aspirated contrasts are unlikely to be impacted by the incidental learning conditions of the current study. Conversely, evidence suggests that Korean fortis and aspirated stops are easily distinguished by speakers of L1s that have VOT-cued stop contrasts. If so, the participants in the current study may perform well in discriminating the fortis-aspirated contrast regardless of any incidental exposure, making it difficult to attribute any learning to the instructional treatment or creating a potential ceiling effect.

To pursue our research objectives, an experimental study was conducted to answer the following research questions:

  • Research question 1: To what extent do naive L1 speakers of English increase their perception accuracy in discriminating between Korean lenis stops and fortis stops after incidental exposure to them?

  • Research question 2: To what extent does the frequency of exposure to the Korean lenis and fortis stops (i.e. 0, 2, 10, 20, or 30 occurrences) affect the L1 English speakers’ incidental learning?

IV Method

1 Participants

A total of 100 English speakers (22 males; 78 females) participated in the current study. Their average age was 25.6 years (SD = 10.1); 89 participants were university students, and the remaining 11 participants were professionals. All participants learned English as their L1 from birth, from at least one parent. They had never learned Korean nor ever lived in Korea, and no participant spoke an L2 that contains phonemes similar to those targeted in the current study.

In addition, five speakers of Korean (2 males; 3 females) contributed to the current study. One female speaker served as an instructor in a language instruction video, and the remaining four speakers provided audio stimuli for testing sessions. Thirteen additional Korean speakers contributed by providing baseline data on the discrimination task. The average age of all the Korean-speaking contributors was 22.2 years (SD = 1.5). They all learned Korean as their L1 from birth from at least one parent. While all of them were university students from South Korea, they were residing in Canada (average length of residence = 3.1 months; SD = 1.5) for the purpose of learning English at the time of data collection.

2 Procedures

The overall design of the current study comprised a pretest, an incidental learning session, an immediate posttest, and a delayed posttest. The 100 participants were randomly assigned to one of the following five conditions (20 participants per condition): (1) 0-occurrence; (2) 2-occurrence; (3) 10-occurrence; (4) 20-occurrence; or (5) 30-occurrence. On their first day, the participants completed a pretest consisting of an AX discrimination task. All participants attended their second session an average of 4.5 days (SD = 19.4) later. Eighty-nine out of the 100 participants completed their second sessions within one to four days of their pretests, 10 participants attended between five and nine days later, and one outlier participant completed the second session 197 days later. Because the current study was interrupted by institutional closures due to a global pandemic, this participant had to return after data collection resumed approximately six months later. At the second session, all participants received language instruction consisting of two consecutive viewings of a video that explicitly taught the participants how to count numbers in Korean while incidentally exposing them to various repetitions of the target phonemes (i.e. Korean lenis and fortis stops), depending on their treatment condition. Immediately following the instructional treatment, all the participants completed the same AX discrimination task. The participants returned for a third session to complete their delayed posttests (i.e. the AX discrimination task) an average of 61.6 days (SD = 70.4) after their immediate posttests. Again, due to the pandemic interruption, while 76 participants had a delay of between 12 and 33 days (average = 22.2; SD = 3.6), 24 participants had a delay of between 174 and 218 days following their immediate posttests. It is crucial to note that for both the immediate and delayed posttests, the variation in the delays did not affect the data: statistical analyses including and excluding the outlier participants showed no significant differences (p > .05).
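The balanced random assignment described at the start of this section (100 participants, 20 per condition) can be illustrated with a minimal sketch. The article does not report how the randomization was carried out, so the shuffle-and-slice approach, the function name `assign_conditions`, and the `seed` parameter are all assumptions made for illustration.

```python
import random

# Condition labels as given in the text.
CONDITIONS = [
    "0-occurrence",
    "2-occurrence",
    "10-occurrence",
    "20-occurrence",
    "30-occurrence",
]


def assign_conditions(participant_ids, per_condition=20, seed=None):
    """Shuffle participants and split them into equal-sized groups.

    Hypothetical helper: the study's actual randomization procedure
    is not described in the article.
    """
    ids = list(participant_ids)
    if len(ids) != per_condition * len(CONDITIONS):
        raise ValueError("participant count must equal groups x group size")
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    rng.shuffle(ids)
    return {
        cond: ids[i * per_condition:(i + 1) * per_condition]
        for i, cond in enumerate(CONDITIONS)
    }


groups = assign_conditions(range(100), seed=1)
for cond, members in groups.items():
    print(cond, len(members))
```

Shuffling once and slicing guarantees equal group sizes, which independent per-participant draws would not.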

a Instructional treatment

To operationalize incidental exposure as defined in the current study, the instructional treatment consisted of a video teaching participants how to count numbers in Korean (i.e. meaning-focused instruction) and a worksheet that assessed participants’ knowledge of Korean numbers, about which they were forewarned (i.e. directing their attention to the Korean numbers instead of the target phonemes). The goal of the instructional treatment was to direct the participants’ attention to comprehending and memorizing the Korean numbers while measuring their ability to ‘pick up’ the unfocused targets as a by-product of completing the main task. Each of the 100 participants watched a video providing explicit instruction on how to count from one to four in Korean. Before watching, the participants were told that they would learn how to count in Korean and complete a worksheet after watching the video twice. The video for each condition was narrated by a female instructor, one of the aforementioned Korean speakers. Given that none of the participants had learned Korean before, the medium of the instruction was English. In the video, the instructor explained how to count from one to four in Korean. She orally presented each number with an explicit translation (e.g. ‘one is hana in Korean’; ‘two is dul in Korean’). Pronouncing the numbers clearly, the instructor invited the participants to verbally repeat after her. The accompanying visual material in the video consisted of the words ‘How to count numbers in Korean’, the numerals 1, 2, 3, 4, and color drawings of six different fruits. The instructor then used the Korean numbers to count the six fruits, each of which was pseudo-labeled with one of the target phonemes paired with the vowel /a/ as follows: ‘apple’ as /pa/, ‘orange’ as /p*a/, ‘strawberry’ as /ta/, ‘lemon’ as /t*a/, ‘pear’ as /ka/, and ‘grapes’ as /k*a/. The instructor orally presented the pseudo-labels at different frequencies depending on the condition, but they were neither explicitly taught nor emphasized in any way.

Regarding the different frequencies of exposure in the current study, as no previous research on the role of repetition in the incidental learning of L2 speech had been conducted, an exploratory approach was taken to cover a range of values between 0 and a number high enough to provide input flooding in a relatively short video (watched twice). Hence, in addition to a video with no exposures as a control, a video including just one exposure was created, as well as one with five exposures to align with the ‘count to four’ scheme. The block of five exposures was included twice to yield 10 exposures, and three times to yield 15 exposures, per video. To illustrate, for the 2-occurrence condition, the instructor said ‘This is called /pa/’ when a picture of an apple appeared on the screen, then counted hana, dul, set, net as additional apples appeared, exposing the viewer to the target syllable /pa/ just once; she then continued in the same manner with the other objects and target phonemes. For the 10-occurrence condition, the instructor said ‘This is called /pa/’ when an apple appeared, then counted ‘/pa/ hana, /pa/ dul, /pa/ set, /pa/ net’ to expose the viewer to /pa/ a total of five times, and so on. For the 20- and 30-occurrence conditions, the instructor repeated the counting accordingly, and for the 0-occurrence condition, the instructor counted Korean numbers as the objects appeared without using their pseudo-labels, not exposing the viewer to the target phonemes at all (for a script of all the videos by condition, see Appendix A). Each participant watched the instructional video, which ranged in length from approximately 2.5 to 4 minutes depending on their condition, twice consecutively. In this manner, participants in the 2-occurrence, 10-occurrence, 20-occurrence, and 30-occurrence conditions were incidentally exposed to each target phoneme a total of twice, 10 times, 20 times, and 30 times respectively (i.e. once, five times, 10 times, and 15 times in the video, watched twice).
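The exposure arithmetic described above can be restated as a short sketch. The dictionary and function names are illustrative only and are not taken from the study materials; the counts themselves come from the description above.

```python
# Exposures to each target syllable within ONE video, per condition.
# A block of five = one introduction ("This is called /pa/") plus four
# counted repetitions ("/pa/ hana, /pa/ dul, /pa/ set, /pa/ net").
EXPOSURES_PER_VIDEO = {
    "0-occurrence": 0,       # pseudo-labels never spoken
    "2-occurrence": 1,       # label introduced once, counting without it
    "10-occurrence": 5 * 1,  # one block of five
    "20-occurrence": 5 * 2,  # two blocks of five
    "30-occurrence": 5 * 3,  # three blocks of five
}
VIEWINGS = 2  # each participant watched the video twice consecutively


def total_exposures(condition: str) -> int:
    """Total exposures to each target syllable across both viewings."""
    return EXPOSURES_PER_VIDEO[condition] * VIEWINGS


for condition in EXPOSURES_PER_VIDEO:
    print(condition, total_exposures(condition))
```

The condition labels thus refer to the total across both viewings, not to the count within a single video.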

As they had been informed in advance, after the second viewing of the instructional video, all the participants were asked to complete a worksheet comprising three simple multiple-choice questions, as shown in Figure 1. This worksheet activity was included to keep the participants’ focus on comprehending the message of the instructional video (i.e. counting numbers in Korean), rather than on the target phonemes (i.e. the ‘names’ of the fruits). Although the results of the worksheet were not further analysed in the current study, all participants achieved nearly perfect scores, indicating that they successfully learned the numbers one to four in Korean.

Figure 1.

Worksheet on Korean numbers.

b Testing sessions

The participants completed a pretest, an immediate posttest, and a delayed posttest, each of which consisted of an AX discrimination task designed and administered using Praat (Boersma and Weenink, 2021) and Test Invite on individual computers. To measure listeners’ perception of target sounds, two commonly implemented types of tasks are forced-choice identification tasks and categorical AX discrimination tasks (see Strange and Shafer, 2008). Given that the participants are completely naive listeners with neither perceptual nor orthographic knowledge of the Korean stops, an identification task would be unsuitable for the current study. In contrast, an AX discrimination task merely requires listeners to decide whether two audio stimuli are the same or different from one another, making it a more accurate measure of whether the participants learned the phonemic contrasts (i.e. whether they can perceive the categorical differences, which exist in Korean). Hence, an AX discrimination task was employed to measure the extent to which the participants perceptually discriminated between the corresponding Korean lenis stops and fortis stops (i.e. /pa/ vs. /p*a/; /ta/ vs. /t*a/; /ka/ vs. /k*a/).

To record the audio stimuli for the task, two male (M1, M2) and two female (F1, F2) Korean speakers were asked to pronounce each target stimulus (see Table 1) in a carrier sentence, ‘The next word is [X].’ Each stimulus was then extracted from the carrier sentence using Praat (Boersma and Weenink, 2021). To ensure the validity of all stimuli, two separate analyses were conducted. First, the authors of this article, who are L1 speakers of Korean, listened to each stimulus in random order and orthographically transcribed it in Korean. The analysis confirmed that all the stimuli were correct. Second, the stimuli were acoustically analysed in Praat (Boersma and Weenink, 2021), measuring VOT and F0 values in particular. Based on previous studies (Chang, 2010; Francis and Nusbaum, 2002; Kim, 2004), VOT values were measured as the time, in milliseconds, from the beginning of the release burst to the onset of voicing (i.e. the first point of the glottal periods and a clear voicing bar in the spectrogram). F0 values were measured by calculating the average duration of the first three glottal pulses in the vowel /a/ following each target phoneme and then converting it to a frequency value (Chang, 2010).

Table 1.

Mean voice onset time (VOT) and F0 values (with standard deviations in parentheses) by stimulus and by gender.

| Stimulus | Male speakers (n = 2): VOT (ms) | Male speakers (n = 2): F0 (Hz) | Female speakers (n = 2): VOT (ms) | Female speakers (n = 2): F0 (Hz) |
|---|---|---|---|---|
| /pa/ | 59 (6.78) | 102 (11.21) | 62.5 (5.24) | 207 (9.54) |
| /p*a/ | 19.5 (5.21) | 159.5 (8.24) | 17.5 (4.32) | 254 (10.11) |
| /ta/ | 63.5 (7.24) | 106 (7.34) | 68.5 (5.87) | 203 (9.87) |
| /t*a/ | 21.5 (5.21) | 151.5 (8.62) | 15.5 (3.21) | 250 (7.54) |
| /ka/ | 62 (5.69) | 96.5 (6.54) | 73 (6.53) | 196 (6.95) |
| /k*a/ | 21.5 (5.31) | 147 (7.21) | 19 (4.21) | 234 (6.54) |

Table 1 summarizes the acoustic properties of the audio stimuli. The acoustic properties were compatible with those reported in previous studies (Francis and Nusbaum, 2002; Kang and Guion, 2008; Kim, 2004), showing that the lenis sounds (i.e. /pa/, /ta/, and /ka/) had longer VOT values and lower F0 values than the fortis sounds (i.e. /p*a/, /t*a/, and /k*a/).

During the task, the participants listened to a sequence of two sounds ‘A’ and ‘X’ and then were asked to indicate whether the second sound (i.e. ‘X’) was the same as or different from the first sound (i.e. ‘A’) by clicking either the ‘same’ or ‘different’ button on a computer screen. Each sequence of two sounds was played only once. There was no predetermined time interval between trials, and participants moved on to subsequent trials by clicking ‘next’ on the screen. To induce the participants to focus on the categorical differences between two sounds during the task, sound ‘A’ of each sequence was a recording of a male speaker, and sound ‘X’ was a recording of a female speaker in a 1,500-msec inter-stimulus interval condition (Werker and Logan, 1985). The four speakers were paired as follows: M1–F1 and M2–F2. A total of 24 same sound trials were prepared: 6 same pairs (/pa/–/pa/, /ta/–/ta/, /ka/–/ka/, /p*a/–/p*a/, /t*a/–/t*a/, and /k*a/–/k*a/) × 2 speaker pairs × 2 repetitions. Another 24 trials were prepared as different sound trials: 6 different pairs (/pa/–/p*a/, /ta/–/t*a/, /ka/–/k*a/, /p*a/–/pa/, /t*a/–/ta/, and /k*a/–/ka/) × 2 speaker pairs × 2 repetitions. Each participant thus completed a total of 48 randomized trials at each testing session, which took approximately 15–20 minutes. Further, baseline data was collected from 13 L1 speakers of Korean, which indeed confirmed that the acoustic variability among the four speakers did not interfere with the categorical distinctions between each pair of sounds.
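As an illustration of the trial structure just described, the following sketch builds one randomized 48-trial list. It is not the Praat or Test Invite script used in the study; the stimulus identifiers (e.g. `M1_pa`) and the function name are hypothetical.

```python
import random
from itertools import product

LENIS = ["pa", "ta", "ka"]
FORTIS = ["p*a", "t*a", "k*a"]
SPEAKER_PAIRS = [("M1", "F1"), ("M2", "F2")]  # A = male voice, X = female voice
REPETITIONS = 2

# 6 same pairs and 6 different pairs (both presentation orders)
same_pairs = [(s, s) for s in LENIS + FORTIS]
different_pairs = list(zip(LENIS, FORTIS)) + list(zip(FORTIS, LENIS))


def build_trials(seed=None):
    """Return the 48 AX trials of one testing session in random order."""
    trials = []
    for (a, x), (male, female), _ in product(
        same_pairs + different_pairs, SPEAKER_PAIRS, range(REPETITIONS)
    ):
        trials.append({
            "A": f"{male}_{a}",      # hypothetical stimulus identifier
            "X": f"{female}_{x}",    # hypothetical stimulus identifier
            "answer": "same" if a == x else "different",
        })
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials(seed=0)
print(len(trials))  # 12 pairs x 2 speaker pairs x 2 repetitions
```

Because the A and X tokens always come from different speakers, a ‘same’ response requires matching the two sounds by category and not by acoustic identity.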

3 Data preparation and analysis

To analyse data, the percentage accuracy of the same and different sound trials by condition at the time of each testing was prepared. In addition, signal detection theory was employed to effectively control for individual response biases in the AX discrimination task. Specifically, dʹ (sensitivity index) scores were calculated to measure sensitivity to each phonemic contrast (i.e. /pa/–/p*a/; /ta/–/t*a/; /ka/–/k*a/) with z scores of hit rates (H) and false-alarm rates (F) using the following formula (MacMillan and Creelman, 2005):

\[ d' = z(H) - z(F) \]

A hit rate resulted from the ratio of the number of times the ‘different’ button was selected to the total number of the different sound trials. A false-alarm rate was the ratio of the number of times the ‘different’ button was selected to the total number of the same sound trials. Based on Stanislaw and Todorov (1999), the computation of dʹ scores was adjusted when hit and false-alarm rates were 0 or 1. Therefore, a dʹ score for a perfect detection performance was 4.65 (the effective limit, using .99 and .01) in the current study. Following this procedure, a dʹ score per phonemic contrast was prepared for each participant for each testing.
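The calculation can be sketched as follows. This is a minimal illustration and not the authors’ script: the function names are hypothetical, and the example assumes 8 different sound trials and 8 same sound trials per phonemic contrast (i.e. that same sound trials such as /pa/–/pa/ and /p*a/–/p*a/ are grouped with the /pa/–/p*a/ contrast), which the text does not state explicitly.

```python
from statistics import NormalDist


def adjust(rate):
    """Replace rates of 0 and 1 with .01 and .99 so that z is finite."""
    if rate == 0:
        return 0.01
    if rate == 1:
        return 0.99
    return rate


def d_prime(hits, n_different, false_alarms, n_same):
    """d' = z(H) - z(F), where 'different' responses count as detections."""
    z = NormalDist().inv_cdf                 # inverse standard normal CDF
    h = adjust(hits / n_different)           # hit rate
    f = adjust(false_alarms / n_same)        # false-alarm rate
    return z(h) - z(f)


# Perfect performance: every different pair detected, no false alarms
print(round(d_prime(8, 8, 0, 8), 2))  # 4.65
# Responding 'different' on half of both trial types shows no sensitivity
print(round(d_prime(4, 8, 4, 8), 2))  # 0.0
```

Because both rates enter the formula, a listener who simply favours the ‘different’ button raises H and F together and gains no advantage in dʹ.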

The participants’ dʹ scores were statistically analysed using mixed effects models in R (R Core Team, 2023) using the lme4 package (version 1.1-34) and restricted maximum likelihood. For the statistical model, fixed effects included ‘condition’ (0-occurrence, 2-occurrence, 10-occurrence, 20-occurrence, and 30-occurrence), ‘time’ (pretest, immediate posttest, and delayed posttest), ‘phonemic contrast’ (/pa/–/p*a/, /ta/–/t*a/, and /ka/–/k*a/), their two-way interactions, and three-way interactions. Given the research questions, the factors ‘condition’, ‘time’, and ‘phonemic contrast’ were coded using treatment coding with the 0-occurrence condition, the pretest, and /pa/–/p*a/ as reference levels. Individual participants were treated as random effects in the model. All statistical outcomes were interpreted with alpha set at .05. Before conducting each analysis, statistical assumptions were verified (e.g. the explanatory variables were linearly related to the response; the errors had constant variance, which were independent and normally distributed).
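To make the treatment coding concrete, the sketch below builds the fixed-effects design row for a single observation. It is written in Python for illustration only (the analysis itself was run in R with lme4), and the function names are hypothetical. The column order is intercept, main effects, two-way interactions, then three-way interactions.

```python
from itertools import product

# The first level of each factor is the reference level
CONDITIONS = ["0-occurrence", "2-occurrence", "10-occurrence",
              "20-occurrence", "30-occurrence"]
TIMES = ["pretest", "immediate posttest", "delayed posttest"]
CONTRASTS = ["/pa/-/p*a/", "/ta/-/t*a/", "/ka/-/k*a/"]


def dummies(level, levels):
    """Treatment coding: one 0/1 indicator per non-reference level."""
    return [int(level == other) for other in levels[1:]]


def design_row(condition, time, contrast):
    """Fixed-effects design row for one observation."""
    c = dummies(condition, CONDITIONS)   # 4 indicators
    t = dummies(time, TIMES)             # 2 indicators
    k = dummies(contrast, CONTRASTS)     # 2 indicators
    row = [1]                                                  # intercept
    row += c + t + k                                           # main effects
    row += [ci * ti for ci, ti in product(c, t)]               # condition x time
    row += [ci * ki for ci, ki in product(c, k)]               # condition x contrast
    row += [ti * ki for ti, ki in product(t, k)]               # time x contrast
    row += [ci * ti * ki for ci, ti, ki in product(c, t, k)]   # three-way
    return row


row = design_row("10-occurrence", "immediate posttest", "/pa/-/p*a/")
print(len(row), sum(row))  # 45 columns, 4 of them switched on
```

Under this coding, the intercept estimates the mean dʹ of the reference cell (0-occurrence, pretest, /pa/–/p*a/), and every other coefficient is a difference relative to that cell.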

V Results

Table 2 summarizes the descriptive statistics regarding mean percentage accuracy scores and their standard deviations (in parentheses) for the same and different sound trials by condition at the time of each testing. Overall, the participants had lower accuracy for different sound trials than for same sound trials, indicating that they had difficulty discriminating between the lenis stops and the fortis stops.

Table 2.

Mean percentage accuracy scores (with standard deviations in parentheses) for the same and different sound trials by condition at the time of each testing.

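| Trial type | Condition (n = 20 for all) | Pretest | Immediate posttest | Delayed posttest |
|---|---|---|---|---|
| Same sound trials | 0-occurrence | 87.1 (14.8) | 88.4 (14.3) | 83.3 (15.3) |
| Same sound trials | 2-occurrence | 83.3 (13.1) | 85.3 (13.1) | 84.2 (12.2) |
| Same sound trials | 10-occurrence | 82.7 (14.5) | 89.1 (16.0) | 87.8 (14.2) |
| Same sound trials | 20-occurrence | 85.3 (16.1) | 88.1 (15.9) | 82.1 (11.9) |
| Same sound trials | 30-occurrence | 85.6 (15.4) | 84.3 (14.3) | 82.4 (13.4) |
| Different sound trials | 0-occurrence | 67.9 (24.1) | 70.2 (20.8) | 66.9 (22.4) |
| Different sound trials | 2-occurrence | 67.9 (23.1) | 70.5 (21.3) | 65.3 (20.2) |
| Different sound trials | 10-occurrence | 66.9 (21.3) | 84.9 (20.5) | 71.1 (23.5) |
| Different sound trials | 20-occurrence | 65.8 (22.8) | 71.3 (21.5) | 68.4 (19.1) |
| Different sound trials | 30-occurrence | 68.3 (23.1) | 73.1 (19.1) | 71.3 (20.5) |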

Table 3 shows the descriptive statistics including mean dʹ scores and their standard deviations (in parentheses) by phonemic contrast and by condition at the time of each testing. Figure 2 visualizes the mean dʹ scores of each phonemic contrast by condition at the time of each testing. As a pre-analysis, separate univariate analyses were conducted; there were no significant differences among the five conditions for any of the three phonemic contrasts at the time of pretesting (p > .05). As shown in the descriptive statistics, the mean pretest scores ranged from 1.30 to 1.75 out of 4.65.

Table 3.

Mean dʹ scores (and standard deviations in parentheses) by phonemic contrast and by condition at the time of each testing.

| Contrast | Condition (n = 20 for all) | Pretest | Immediate posttest | Delayed posttest |
|---|---|---|---|---|
| /pa/–/p*a/ | 0-occurrence | 1.71 (.57) | 1.64 (.57) | 1.66 (.63) |
| /pa/–/p*a/ | 2-occurrence | 1.75 (1.23) | 1.66 (.84) | 1.45 (.90) |
| /pa/–/p*a/ | 10-occurrence | 1.63 (.60) | 3.33 (.82) | 2.62 (1.43) |
| /pa/–/p*a/ | 20-occurrence | 1.60 (.63) | 1.85 (.52) | 1.95 (.97) |
| /pa/–/p*a/ | 30-occurrence | 1.56 (.49) | 1.72 (.48) | 1.96 (.52) |
| /ta/–/t*a/ | 0-occurrence | 1.51 (1.27) | 1.58 (1.08) | 1.63 (.73) |
| /ta/–/t*a/ | 2-occurrence | 1.52 (.44) | 1.60 (1.52) | 1.58 (1.17) |
| /ta/–/t*a/ | 10-occurrence | 1.63 (.47) | 3.47 (1.13) | 3.19 (1.29) |
| /ta/–/t*a/ | 20-occurrence | 1.56 (.65) | 1.86 (.66) | 1.97 (1.02) |
| /ta/–/t*a/ | 30-occurrence | 1.58 (.66) | 1.77 (.47) | 1.82 (.95) |
| /ka/–/k*a/ | 0-occurrence | 1.30 (1.25) | 1.54 (.58) | 1.71 (.77) |
| /ka/–/k*a/ | 2-occurrence | 1.68 (1.41) | 1.60 (1.08) | 1.53 (.92) |
| /ka/–/k*a/ | 10-occurrence | 1.55 (1.08) | 3.50 (.76) | 2.97 (.62) |
| /ka/–/k*a/ | 20-occurrence | 1.68 (1.51) | 1.83 (.57) | 1.87 (.84) |
| /ka/–/k*a/ | 30-occurrence | 1.66 (1.09) | 1.85 (.58) | 1.81 (.74) |

Figure 2.

Mean dʹ scores by phonemic contrast and by condition at the time of each testing.

Table 4 shows the inferential statistics regarding the fixed effects in the model. The variance of the random effects was .16. Integrating the random effects improved the fit of the model: χ²(1) = 76.65, p < .001. The marginal R² of the model was .26, and the conditional R² was .41, which indicates that the fixed effects accounted for 26% of the variance and that the fixed and random effects altogether accounted for 41% of the variance in the model. Based on Plonsky and Ghanbar’s (2018) effect size benchmarks (R² ⩽ .20: Small; .20 < R² < .50: Medium; .50 ⩽ R²: Large), the overall model was in the medium range.

Table 4.

Fixed-effects model.

| Predictor | Estimate (β) | Standard error | t | p |
|---|---|---|---|---|
| Intercept | 1.71 | .20 | 8.40 | < .001* |
| 2-occurrence | .04 | .29 | .15 | .883 |
| 10-occurrence | −.08 | .29 | −.29 | .769 |
| 20-occurrence | −.12 | .29 | −.40 | .689 |
| 30-occurrence | −.15 | .29 | −.51 | .611 |
| Immediate posttest | −.07 | .26 | −.26 | .791 |
| Delayed posttest | −.05 | .26 | −.20 | .839 |
| /ta/–/t*a/ | −.20 | .26 | −.79 | .431 |
| /ka/–/k*a/ | −.41 | .26 | −1.58 | .113 |
| 2-occurrence × Immediate posttest | −.02 | .37 | −.06 | .950 |
| 10-occurrence × Immediate posttest | 1.78 | .37 | 4.85 | < .001* |
| 20-occurrence × Immediate posttest | .32 | .37 | .89 | .375 |
| 30-occurrence × Immediate posttest | .22 | .37 | .60 | .548 |
| 2-occurrence × Delayed posttest | −.25 | .37 | −.69 | .489 |
| 10-occurrence × Delayed posttest | 1.05 | .37 | 2.87 | .004* |
| 20-occurrence × Delayed posttest | .40 | .37 | 1.11 | .269 |
| 30-occurrence × Delayed posttest | .45 | .37 | 1.22 | .222 |
| 2-occurrence × /ta/–/t*a/ | −.03 | .37 | −.07 | .942 |
| 10-occurrence × /ta/–/t*a/ | .20 | .37 | .56 | .578 |
| 20-occurrence × /ta/–/t*a/ | .17 | .37 | .46 | .643 |
| 30-occurrence × /ta/–/t*a/ | .22 | .37 | .61 | .540 |
| 2-occurrence × /ka/–/k*a/ | .34 | .37 | .93 | .351 |
| 10-occurrence × /ka/–/k*a/ | .33 | .37 | .91 | .363 |
| 20-occurrence × /ka/–/k*a/ | .50 | .37 | 1.36 | .174 |
| 30-occurrence × /ka/–/k*a/ | .51 | .37 | 1.39 | .163 |
| Immediate posttest × /ta/–/t*a/ | .15 | .37 | .40 | .690 |
| Delayed posttest × /ta/–/t*a/ | .17 | .37 | .48 | .634 |
| Immediate posttest × /ka/–/k*a/ | .31 | .37 | .85 | .397 |
| Delayed posttest × /ka/–/k*a/ | .47 | .37 | 1.27 | .204 |
| 2-occurrence × Immediate posttest × /ta/–/t*a/ | .02 | .52 | .04 | .966 |
| 10-occurrence × Immediate posttest × /ta/–/t*a/ | −.01 | .52 | −.01 | .989 |
| 20-occurrence × Immediate posttest × /ta/–/t*a/ | −.09 | .52 | −.19 | .848 |
| 30-occurrence × Immediate posttest × /ta/–/t*a/ | −.11 | .52 | −.22 | .826 |
| 2-occurrence × Delayed posttest × /ta/–/t*a/ | .19 | .52 | .36 | .718 |
| 10-occurrence × Delayed posttest × /ta/–/t*a/ | .39 | .52 | .76 | .466 |
| 20-occurrence × Delayed posttest × /ta/–/t*a/ | −.12 | .52 | −.23 | .815 |
| 30-occurrence × Delayed posttest × /ta/–/t*a/ | −.33 | .52 | −.64 | .519 |
| 2-occurrence × Immediate posttest × /ka/–/k*a/ | −.30 | .52 | −.58 | .560 |
| 10-occurrence × Immediate posttest × /ka/–/k*a/ | −.06 | .52 | −.12 | .903 |
| 20-occurrence × Immediate posttest × /ka/–/k*a/ | −.42 | .52 | −.82 | .414 |
| 30-occurrence × Immediate posttest × /ka/–/k*a/ | −.27 | .52 | −.52 | .600 |
| 2-occurrence × Delayed posttest × /ka/–/k*a/ | −.31 | .52 | −.60 | .548 |
| 10-occurrence × Delayed posttest × /ka/–/k*a/ | −.04 | .52 | −.08 | .937 |
| 20-occurrence × Delayed posttest × /ka/–/k*a/ | −.63 | .52 | −1.21 | .226 |
| 30-occurrence × Delayed posttest × /ka/–/k*a/ | −.72 | .52 | −1.38 | .167 |

Note. * p < .05.

In the model, the only significant effects were the following two interactions: 10-occurrence condition × Immediate posttest (β = 1.78, SE = .37, t = 4.85, p < .001) and 10-occurrence condition × Delayed posttest (β = 1.05, SE = .37, t = 2.87, p = .004). That is, the participants in the 10-occurrence condition significantly outperformed those in the 0-occurrence condition at both posttests. According to the descriptive statistics in Table 3 and the nonsignificant three-way interaction effects, the effects of the 10-occurrence condition were observed across all three phonemic contrasts. No significant effects were found for any of the other conditions.
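One way to read these interaction coefficients is to add them up into model-implied cell means for the reference contrast (/pa/–/p*a/). The sketch below does this with the rounded estimates from Table 4; the dictionary keys and the function name are illustrative, and only the terms needed for the 0- and 10-occurrence conditions are included.

```python
# Rounded estimates copied from Table 4 (reference contrast /pa/-/p*a/)
COEF = {
    "intercept": 1.71,                    # 0-occurrence, pretest
    "10-occurrence": -0.08,
    "immediate": -0.07,
    "delayed": -0.05,
    "10-occurrence x immediate": 1.78,
    "10-occurrence x delayed": 1.05,
}


def predicted_d_prime(condition, time):
    """Model-implied mean d' for /pa/-/p*a/ under treatment coding."""
    value = COEF["intercept"]
    if condition == "10-occurrence":
        value += COEF["10-occurrence"]
    if time != "pretest":
        value += COEF[time]
        if condition == "10-occurrence":
            value += COEF[f"10-occurrence x {time}"]
    return round(value, 2)


print(predicted_d_prime("0-occurrence", "immediate"))   # 1.64
print(predicted_d_prime("10-occurrence", "immediate"))  # 3.34
print(predicted_d_prime("10-occurrence", "delayed"))    # 2.63
```

These sums agree with the corresponding means in Table 3 (1.64, 3.33, and 2.62) to within .01, the difference reflecting rounding of the published coefficients.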

VI Discussion

Based on the results, incidental learning of non-native phonemic contrasts clearly occurred for some of the participants in the current study. The pretest scores indicated that L1 English speakers had difficulty discriminating between Korean lenis stops and fortis stops, with dʹ scores ranging from 1.30 to 1.75 in the AX discrimination task. After watching the instructional video, participants who received 10 incidental exposures to each target phoneme significantly increased in their discrimination accuracy, indicating that an incidental learning condition can enable naive listeners to improve their perception of the Korean stops. While this outcome is consistent with previous research findings that non-native speech can be acquired incidentally, the finding that learning gains were observed only in the 10-occurrence condition, consistently across all the target contrasts in both the immediate posttest and the delayed posttest, was unanticipated and warrants further examination.

While there has been significant variation in the specific number of occurrences required in input for learning, the overwhelming majority of extant research in both vocabulary and morphosyntax learning has suggested a positive correlation between input frequency and learning gains, that is, the more frequently a target occurs in the input, the more likely the target is to be learned. To the best of our knowledge, no other study investigating input frequency in incidental L2 learning has found results similar to those in the current study, in which learning occurred only at a specific input frequency and then returned to baseline levels at higher frequencies. It is thus difficult to interpret or explain the results in terms of frequency as the key factor. However, given how pronounced the results are, it is reasonable to infer that they were driven by some specific determinant.

In the absence of frequency-related explanations, we turned to the characteristics of the learning conditions themselves. Before watching the instructional video, all the participants were told that they would learn to count in Korean and be asked to demonstrate that knowledge afterwards. As far as the participants were concerned, their main task was to learn and memorize novel vocabulary (i.e. Korean numbers). As they watched the video, all participants were presented with information necessary for that task. In addition, depending on their condition, some participants received various amounts of extra input (i.e. the target Korean stops). Based on the responses on the simple worksheet that was administered after the video, nearly all participants were successful in learning the Korean numbers, regardless of their treatment condition. This is unsurprising, given that there were only four simple mono- or bi-syllabic words to process, and the participants heard them a total of 44 times each. The key question at hand is how and why only those in the 10-occurrence condition were able to process the extra input that was irrelevant to their overt main task. The results from the 0-occurrence condition were as expected, with no change in discrimination ability. Similar results from the 2-occurrence condition were also expected, as learning completely novel phonemes after hearing each of them only twice in a span of roughly five and a half minutes is understandably difficult. However, as the exposure increases to 10, 20, and 30 occurrences, an explanation seems to lie not in how often, but in the way in which the target phonemes were presented, along with the cognitive mechanisms involved in processing them.
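The figure of 44 hearings per number word follows from the structure of the script in Appendix A, as the short tally below shows. The variable names are illustrative only.

```python
# Tally for one number word (e.g. hana), based on the script in Appendix A
INTRO_MENTIONS = 4    # "One is hana. Repeat after me, hana." plus "Once again" pass
PICTURES = 6          # each picture is counted hana, dul, set, net
COUNTING_ROUNDS = 3   # the six pictures are shown three times per video
VIEWINGS = 2          # the video was watched twice consecutively

hearings_per_video = INTRO_MENTIONS + PICTURES * COUNTING_ROUNDS
total_hearings = hearings_per_video * VIEWINGS
print(hearings_per_video, total_hearings)  # 22 44
```

The number words were therefore heard the same number of times in every condition; only the target syllables varied in frequency.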

In order to learn the novel Korean words, the participants likely employed their working memory (Baddeley, 1986), or more specifically, the phonological loop. The phonological loop comprises the short-term phonological store, which temporarily holds auditory information, and an articulatory rehearsal process. When novel information is presented auditorily, as in the current study, it has direct obligatory access to the phonological store (see Salamé and Baddeley, 1982). To retain any information held in the phonological store, the articulatory rehearsal process works to repeatedly refresh those new sounds to prevent them from fading away. While the participants in the current study were invited to verbally repeat after the instructor in the video, most of them did not and instead engaged in subvocal rehearsal, the typical method of articulatory rehearsal. The primary purpose of the phonological loop is to facilitate the learning of new words, as seen in both L1 by children and L2 by adults (see Baddeley et al., 1998), and likely in the current study.

When the Korean numbers are presented in a continuous pattern (i.e. hana–dul–set–net), they can be easily learned using the phonological loop. However, the counting pattern is disrupted when the target Korean syllables are embedded in between the numbers (i.e. pa hana, pa dul, pa set, pa net) (see Appendix A). The videos for the 0-, 2-, and 10-occurrence conditions contain ample instances of the hana–dul–set–net pattern, making the task of learning the numbers relatively effortless. In contrast, the Korean numbers are presented in this fragmented manner more often than as a continuous pattern in the 20-occurrence video, and entirely in the 30-occurrence video. Thus, the main task was likely more cognitively demanding for those in the 20- and 30-occurrence conditions, who could not rely on the phonological loop for the hana–dul–set–net pattern. In the case of the 10-occurrence condition, by the third repetition of numbers (in which the target syllables appeared for the first time), the participants likely had learned the numbers and had enough cognitive capacity to process the additional input. But due to the increase in cognitive load in the 20- and 30-occurrence conditions, those participants likely resorted to selective attention (see Ellis, 2006b for a summary of this concept applied to L2 acquisition). As a result, the numbers and target syllables were perceived as two competing types of cues, and because the participants were given one overt task, the numbers were the more salient cues, overshadowing (Kamin, 1969) the less salient cues (the target syllables) and causing the participants to block them (Chapman and Robbins, 1990). In other words, the participants in the 20- and 30-occurrence conditions suppressed the processing of the target syllables (task-irrelevant information) to avoid being distracted in their attempt to focus on the Korean numbers (task-relevant information). This effect is also commonly found in dichotic listening studies (see Murphy et al., 2017). When participants are presented with two auditory messages simultaneously after being instructed to attend to only one of them, they are unable to process the unattended message.
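The contrast between continuous and fragmented counting can be summarized per condition, as in the sketch below. The coding of each counting round follows the scripts in Appendix A and, for the 30-occurrence video, the description given above; the names `COUNTING_PATTERN` and `fragmented_share` are illustrative only.

```python
# Each video contains three counting rounds over the six pictures.
# "continuous" = hana, dul, set, net; "fragmented" = pa hana, pa dul, ...
COUNTING_PATTERN = {
    "0-occurrence": ["continuous", "continuous", "continuous"],
    # 2-occurrence: the label is announced once in round 3, counting stays plain
    "2-occurrence": ["continuous", "continuous", "continuous"],
    "10-occurrence": ["continuous", "continuous", "fragmented"],
    "20-occurrence": ["continuous", "fragmented", "fragmented"],
    # 30-occurrence: fragmented throughout, per the description in the text
    "30-occurrence": ["fragmented", "fragmented", "fragmented"],
}


def fragmented_share(condition):
    """Proportion of counting rounds in which the pattern is interrupted."""
    rounds = COUNTING_PATTERN[condition]
    return rounds.count("fragmented") / len(rounds)


for condition in COUNTING_PATTERN:
    print(condition, round(fragmented_share(condition), 2))
```

On this coding, the 10-occurrence video is the only one that both contains the target syllables within the counting and still presents the uninterrupted pattern in most rounds, which is consistent with the account offered above.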

The explanation above provides a likely account of why the participants in the 20- and 30-occurrence conditions did not improve their discrimination of the target Korean phonemes, despite having received more exposures than those in the 10-occurrence condition. The main finding of the current study indeed suggests that 10 incidental exposures to a non-native sound are sufficient for learning, while 2 exposures are not. Based on the unexpected results and our interpretation of them, any further conclusions regarding the effect of various input frequencies on incidental learning of non-native speech would be speculative at best. However, there are significant pedagogical implications to derive from the current study.

The first pedagogical implication regards leveraging incidental exposures to novel sounds in L2 instruction. Based on our finding that 10 exposures to the target Korean phonemes resulted in durable perceptual learning, practitioners could utilize their regular instruction of other L2 domains as opportunities to incidentally expose learners to L2 speech. Preemptively embedding certain L2 phonemes that are difficult for learners to acquire into primarily meaning-focused activities could be a two-pronged pedagogical technique that maximizes learning when there is limited time and attention.

Second, the lack of learning in the 20- and 30-occurrence conditions underscores the importance of selective attention in L2 learning. Despite the high frequency of the target phonemes in the input, the participants failed (or neglected) to attend to them due to the cognitive demands of the main task. While the 10-occurrence condition demonstrated that it is possible for intentional and incidental learning to occur simultaneously in the domain of L2 speech, it seems possible only if learners are not overtaxed during instruction. Practitioners should thus be mindful of the attentional demands of specific tasks and ensure that simultaneous learning targets are paired appropriately. It is important to ensure that one learning target does not get acquired at the expense of the other by overshadowing and blocking it.

VII Limitations and directions for future research

The present study has several limitations, yielding suggestions for future studies in this line of research. The design of the learning conditions (i.e. instructional videos), though it yielded important unintended findings, did not reveal the effects of various frequencies of incidental exposure. With much more research needed in this area, it would be important to conduct studies that can more clearly attribute results to frequency only. Also, including an intentional learning condition would provide concrete, measurable comparisons between incidental and intentional L2 speech learning. Furthermore, incorporating more ecologically valid learning conditions, such as watching videos (Nguyen and Boers, 2019; Peters and Webb, 2018) and listening to songs (Baills et al., 2021; Pavia et al., 2019), would provide more realistic accounts of incidental learning, particularly in the domain of L2 speech.

The current study used an AX discrimination task to test participants’ ability to discriminate the phonemic contrasts. Though the task itself is an effective measure, given the current study’s focus on incidental learning, there is a possibility of participants becoming aware of the targets during the pretest and intentionally learning them during the instructional treatment. A way to mitigate this concern in future research is to include distractors in the AX discrimination task using non-target filler items.

For our participant sample, naive listeners with no prior knowledge of Korean and the target phonemes were deliberately selected to ensure that we could control their amount of exposure to the target phonemes. However, their level of motivation may be a concern, as their interest in the Korean language is likely lower than that of active L2 learners of Korean. It is possible that as non-learners, the participants did not care to learn or maintain any learning gains that may have occurred. A similar study with L2 learners of the target language may provide different results that are more in line with findings in L2 acquisition research.

Due to the timing of the data collection, we were met with unforeseen methodological issues, particularly the significant variation in the intervals among testing sessions and instructional treatment. While this fortunately did not impact the data in the current study, it goes without saying that better control of the data collection process should be a priority.

The current study specifically examined Korean lenis and fortis stops as linguistic targets. As the Korean three-way stops are a uniquely rich set of phonemes, including the aspirated stops in the set of targets would provide more complete insights into the acquisition of L2 speech. It would also be useful to target different phonemes and various languages to generalize the findings of the current study. In addition, the current study focused on the development of L2 speech perception. Keeping in mind that L2 speech learning pertains not only to perception but also to production, future studies need to be conducted to examine whether L2 learners can improve their production accuracy of target phonemes after incidental exposure to them. Previous studies (Denhovska and Serratrice, 2017; Godfroid, 2016; Shintani, 2015) found that L2 learners showed improvement in receptive tasks after incidental exposure but not in productive tasks. Therefore, taking a close look at L2 speech production would contribute significantly to what is currently known.

Finally, previous studies (e.g. Li and DeKeyser, 2017; Saito et al., 2020) indicated that L2 speech learning is highly influenced by L2 learners’ individual differences such as L1 background, L2 learning experience, age, awareness, motivation, attitudes, and musicality. Accordingly, it would be worthwhile to tease apart how incidental learning of L2 speech and individual differences may be related.

VIII Conclusions

The current study examined the extent to which naive listeners could acquire non-native phonemic contrasts (i.e. Korean lenis vs. fortis stops) in an incidental manner, as well as the degree to which the frequency of exposure to each target phoneme affects their incidental learning. A total of 100 L1 English speakers received one language instruction session that provided explicit instruction on how to count numbers in Korean. During the session, they were incidentally exposed to each of the target phonemes twice, 10 times, 20 times, 30 times, or not at all. According to the results of the AX discrimination tasks, only the participants in the 10-occurrence condition significantly improved their discrimination ability in both posttests for all target contrasts, indicating that 10 incidental exposures to non-native phonemes are indeed enough for perceptual learning. However, given the general premise that high frequency of input yields more learning, along with the lack of previous research in incidental learning that contradicts this notion, the results showing no learning in the 20- and 30-occurrence conditions were unexpected. This outcome is thus interpreted as demonstrating the role of selective attention in incidental learning of non-native speech, rather than a function of input frequency. Due to the way in which the target phonemes were embedded in the instructional content, the participants in the 20- and 30-occurrence conditions were subject to more attentional demands, which likely led them to block the target phonemes, the less salient input, in order to focus on the Korean numbers, the more salient input. It is hoped that the current study contributes valuable insights to this line of research and that future studies will build on its findings to continue exploring the effects of incidental exposures to non-native speech on perceptual learning.

Acknowledgments

We sincerely thank all participants and extend our gratitude to the following research assistants: Kelyn Rae Best and Caroline Duarte Ramos Avila.

Appendix A

Script of all the videos by condition

Today, we are going to learn how to count numbers from one to four in Korean. Are you ready?

One is 하나 (/hana/). Repeat after me, 하나. Two is 둘 (/dul/). Repeat after me, 둘. Three is 셋 (/set/). Repeat after me, 셋. Four is 넷 (/net/). Repeat after me, 넷.

Once again, 하나. Repeat after me, 하나. 둘. Repeat after me, 둘. 셋. Repeat after me, 셋. 넷. Repeat after me, 넷.

Now, let’s practice counting with some objects.

0-occurrence condition (duration: 02:27)

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (peach): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (peach): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (peach): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

2-occurrence condition (duration: 02:49)

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (orange): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (orange): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 하나, 둘, 셋, 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 하나, 둘, 셋, 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 하나, 둘, 셋, 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 하나, 둘, 셋, 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 하나, 둘, 셋, 넷

10-occurrence condition (duration: 02:57)

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (orange): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (orange): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

20-occurrence condition (duration: 03:28)

  • Picture 1 (apple): 하나, 둘, 셋, 넷

  • Picture 2 (orange): 하나, 둘, 셋, 넷

  • Picture 3 (strawberry): 하나, 둘, 셋, 넷

  • Picture 4 (lemon): 하나, 둘, 셋, 넷

  • Picture 5 (pear): 하나, 둘, 셋, 넷

  • Picture 6 (grapes): 하나, 둘, 셋, 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

30-occurrence condition (duration: 03:58)

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

Once again.

  • Picture 1 (apple): This is called ‘바’ (/pa/) in Korean. 바 하나, 바 둘, 바 셋, 바 넷

  • Picture 2 (orange): This is called ‘빠’ (/p*a/) in Korean. 빠 하나, 빠 둘, 빠 셋, 빠 넷

  • Picture 3 (strawberry): This is called ‘다’ (/ta/) in Korean. 다 하나, 다 둘, 다 셋, 다 넷

  • Picture 4 (lemon): This is called ‘따’ (/t*a/) in Korean. 따 하나, 따 둘, 따 셋, 따 넷

  • Picture 5 (pear): This is called ‘가’ (/ka/) in Korean. 가 하나, 가 둘, 가 셋, 가 넷

  • Picture 6 (grapes): This is called ‘까’ (/k*a/) in Korean. 까 하나, 까 둘, 까 셋, 까 넷

Footnotes

Declaration of conflicting interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Faculty of Social Sciences at Brock University and the Social Sciences and Humanities Research Council of Canada (SSHRC) (430-2021-00076).

References

  1. Aka N. (2020) Incidental learning of a grammatical feature from reading by Japanese learners of English as a foreign language. System 91: 1–14.
  2. Baddeley AD. (1986) Working memory. Oxford: Oxford University Press.
  3. Baddeley AD, Gathercole S, Papagno C. (1998) The phonological loop as a language learning device. Psychological Review 105: 158–73.
  4. Baills F, Zhang Y, Cheng Y, Bu Y, Prieto P. (2021) Listening to songs and singing benefitted initial stages of second language pronunciation but not recall of word meaning. Language Learning 71: 369–413.
  5. Bang H-Y, Sonderegger M, Kang Y, Clayards M, Yoon T-J. (2018) The emergence, progress, and impact of sound change in progress in Seoul Korean: Implications for mechanisms of tonogenesis. Journal of Phonetics 66: 120–44.
  6. Best CT. (1995) A direct realist view of cross-language speech perception. In: Strange W. (ed.) Speech perception and linguistic experience: Issues in cross-language speech research. Timonium, MD: York Press, pp. 167–200.
  7. Boersma P, Weenink D. (2021) Praat: Doing phonetics by computer: Version 6.1.49 [computer program]. Available at: http://www.praat.org (accessed April 2024).
  8. Broersma M. (2009) Dutch listeners’ perception of Korean stop triplets. The Journal of the Acoustical Society of America 125: 2775.
  9. Brooks PJ, Kempe V. (2013) Individual differences in adult foreign language learning: The mediating effect of metalinguistic awareness. Memory and Cognition 41: 281–96.
  10. Chang CB. (2010) The implementation of laryngeal contrast in Korean as a second language. Harvard Studies in Korean Linguistics 13: 91–104.
  11. Chapman GB, Robbins SJ. (1990) Cue interaction in human contingency judgment. Memory and Cognition 18: 537–45.
  12. Cho T, Jun S, Ladefoged P. (2002) Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics 30: 193–228.
  13. Cho T, Whalen DH, Docherty G. (2019) Voice onset time and beyond: Exploring laryngeal contrast in 19 languages. Journal of Phonetics 72: 52–65.
  14. Chodroff E, Wilson C. (2017) Structure in talker-specific phonetic realization: Covariation of stop consonant VOT in American English. Journal of Phonetics 61: 30–47.
  15. Choi J. (2015) Dutch listeners’ perception of Korean stop consonants. Journal of the Korean Society of Speech Sciences 7: 89–95.
  16. Darcy I. (2018) Powerful and effective pronunciation instruction: How can we achieve it? The CATESOL Journal 30: 13–45.
  17. de Vos JF, Schriefers H, Nivard MG, Lemhöfer K. (2018) A meta-analysis and meta-regression of incidental second language word learning from spoken input. Language Learning 68: 906–41.
  18. DeKeyser RM. (2003) Implicit and explicit learning. In: Doughty CJ, Long MH. (eds) The handbook of second language acquisition. Oxford: Wiley-Blackwell, pp. 313–48.
  19. Denhovska N, Serratrice L. (2017) Incidental learning of gender agreement in L2. Journal of Psycholinguistic Research 46: 1187–211.
  20. Denhovska N, Serratrice L, Payne J. (2016) Acquisition of second language grammar under incidental learning conditions: The role of frequency and working memory. Language Learning 66: 159–90.
  21. Eckerth J, Tavakoli P. (2012) The effects of word frequency and elaboration of word processing on incidental L2 vocabulary acquisition through reading. Language Teaching Research 16: 227–52.
  22. Ellis NC. (2006a) Language acquisition as rational contingency learning. Applied Linguistics 27: 1–24.
  23. Ellis NC. (2006b) Selective attention and transfer phenomena in L2 acquisition: Contingency, cue competition, salience, interference, overshadowing, blocking, and perceptual learning. Applied Linguistics 27: 164–94.
  24. Flege JE, Bohn O. (2021) The revised Speech Learning Model (SLM-r). In: Wayland R. (ed.) Second language speech learning: Theoretical and empirical progress. Cambridge: Cambridge University Press, pp. 3–83.
  25. Foote JA, Trofimovich P, Collins L, Soler Urzúa F. (2016) Pronunciation teaching practices in communicative second language classes. The Language Learning Journal 44: 181–96.
  26. Francis AL, Nusbaum HC. (2002) Selective attention and the acquisition of new phonetic categories. Journal of Experimental Psychology: Human Perception and Performance 28: 349–66.
  27. Gabay Y, Karni A, Holt LL. (2018) Consolidation and retention of auditory categories acquired incidentally in performing a visuomotor task. In: Rogers TT, Rau M, Zhu X, Kalish CW. (eds) Proceedings of the 40th Annual Conference of the Cognitive Science Society. Cognitive Science Society, pp. 402–407.
  28. Gabay Y, Karni A, Holt LL. (2023) Memory for incidentally learned categories evolves in the post-learning interval. eLife 12: e81855.
  29. Godfroid A. (2016) The effects of implicit instruction on implicit and explicit knowledge development. Studies in Second Language Acquisition 38: 177–215.
  30. Grey S, Williams JN, Rebuschat P. (2014) Incidental exposure and L3 learning of morphosyntax. Studies in Second Language Acquisition 36: 611–45.
  31. Guion SG, Pederson E. (2007) Investigating the role of attention in phonetic learning. In: Bohn O, Munro MJ. (eds) Language experience in second language speech learning: In honor of James Emil Flege. Amsterdam: John Benjamins, pp. 57–77.
  32. Hamrick P, Rebuschat P. (2014) Frequency effects, learning conditions, and the development of implicit and explicit lexical knowledge. In: Connor-Linton J, Amoroso LW. (eds) Measured language: Quantitative studies of acquisition, assessment, and variation. Washington, DC: Georgetown University Press, pp. 125–39.
  33. Holliday JJ. (2014) The perceptual assimilation of Korean obstruents by native Mandarin listeners. The Journal of the Acoustical Society of America 135: 1585–95.
  34. Holliday JJ. (2015) A longitudinal study of the second language acquisition of a three-way stop contrast. Journal of Phonetics 50: 1–14.
  35. Holliday JJ. (2019) The perception and production of word-initial Korean stops by native speakers of Japanese. Language and Speech 62: 494–508.
  36. Horst M, Cobb T, Meara P. (1998) Beyond a clockwork orange: Acquiring second language vocabulary through reading. Reading in a Foreign Language 11: 207–23.
  37. Huensch A. (2019) The pronunciation teaching practices of university-level graduate teaching assistants of French and Spanish introductory language courses. Foreign Language Annals 52: 13–31.
  38. Hulme RC, Barsky D, Rodd JM. (2019) Incidental learning and long-term retention of new word meanings from stories: The effect of number of exposures. Language Learning 69: 18–43.
  39. Hulstijn JH. (2003) Incidental and intentional learning. In: Doughty CJ, Long MH. (eds) The handbook of second language acquisition. Oxford: Wiley-Blackwell, pp. 349–81.
  40. Hulstijn JH. (2013) Incidental learning in second language acquisition. In: Chapelle CA. (ed.) The encyclopedia of applied linguistics. Chichester: Wiley-Blackwell, pp. 2632–40.
  41. Hutchinson AE, Dmitrieva O. (2022) Exposure to speech via foreign film and its effects on non-native vowel production and perception. Journal of Phonetics 95: 101189.
  42. Ishikawa K. (2019) Incidental and explicit learning of L2 derivational morphology and the nature of acquired knowledge. Applied Psycholinguistics 40: 1377–404.
  43. Jin Z, Webb S. (2020) Incidental vocabulary learning through listening to teacher talk. The Modern Language Journal 104: 550–66.
  44. Kamin LJ. (1969) Predictability, surprise, attention, and conditioning. In: Campbell BA, Church RM. (eds) Punishment and aversive behavior. New York: Appleton-Century-Crofts, pp. 279–96.
  45. Kang K, Guion SG. (2008) Clear speech production of Korean stops: Changing phonetic targets and enhancement strategies. The Journal of the Acoustical Society of America 124: 3909–17.
  46. Kang Y. (2014) Voice onset time merger and development of tonal contrast in Seoul Korean stops: A corpus study. Journal of Phonetics 45: 76–90.
  47. Kang Y, Schertz J, Han S. (2022) The phonology and phonetics of Korean stop laryngeal contrasts. In: Cho S, Whitman J. (eds) The Cambridge handbook of Korean linguistics. Cambridge: Cambridge University Press, pp. 215–47.
  48. Kim M. (2004) Correlation between VOT and F0 in the perception of Korean stops and affricates. In: Proceedings of the International Conference on Spoken Language Processing (ICSLP) (Interspeech 2004), pp. 49–52.
  49. Kong EJ, Kang S, Seo M. (2022) The acoustic cue-weighting and the L2 production–perception link: A case of English-speaking adults’ learning of Korean stops. Phonetics and Speech Sciences 14: 1–9.
  50. Lee H, Holliday JJ, Kong EJ. (2020) Diachronic change and synchronic variation in the Korean stop laryngeal contrast. Language and Linguistics Compass 14: 1–12.
  51. Lee JF. (2002) The incidental acquisition of Spanish: Future tense morphology through reading in a second language. Studies in Second Language Acquisition 24: 55–80.
  52. Leow RP, Zamora CC. (2017) Intentional and incidental L2 learning. In: Loewen S, Sato M. (eds) The Routledge handbook of instructed second language acquisition. New York: Routledge, pp. 33–49.
  53. Li M, DeKeyser RM. (2017) Perception practice, production practice, and musical ability in L2 Mandarin tone-word learning. Studies in Second Language Acquisition 39: 593–620.
  54. Lim S-J, Fiez JA, Holt LL. (2019) Role of the striatum in incidental learning of sound categories. Proceedings of the National Academy of Sciences 116: 4671–80.
  55. Lim S-J, Holt LL. (2011) Learning foreign sounds in an alien world: Videogame training improves non-native speech categorization. Cognitive Science 35: 1390–405.
  56. Liu R, Holt LL. (2015) Perceptual scaffolding of non-native speech categories through videogame-based training. The Journal of the Acoustical Society of America 137: 2386.
  57. Loewen S. (2020) Introduction to instructed second language acquisition (2nd ed.). New York: Routledge.
  58. Luthra S, Fuhrmeister P, Molfese PJ, et al. (2019) Brain-behavior relationships in incidental learning of non-native phonetic categories. Brain and Language 198: 104692.
  59. Macmillan NA, Creelman CD. (2005) Detection theory: A user’s guide (2nd ed.). New York: Lawrence Erlbaum.
  60. Malone J. (2018) Incidental vocabulary learning in SLA: Effects of frequency, aural enhancement, and working memory. Studies in Second Language Acquisition 40: 651–75.
  61. Martínez-García MT, Holliday JJ. (2019) The perception of Korean stops by native speakers of Spanish. In: Calhoun S, Escudero P, Tabain M, Warren P. (eds) Proceedings of the 19th International Congress of Phonetic Sciences. Canberra: Australasian Speech Science and Technology Association, pp. 2585–89.
  62. Mohamed AA. (2018) Exposure frequency in L2 reading: An eye-movement perspective of incidental vocabulary learning. Studies in Second Language Acquisition 40: 269–93.
  63. Morgan-Short K, Sanz C, Steinhauer K, Ullman MT. (2010) Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning 60: 154–93.
  64. Morgan-Short K, Steinhauer K, Sanz C, Ullman MT. (2012) Explicit and implicit second language training differentially affect the achievement of native-like brain activation patterns. Journal of Cognitive Neuroscience 24: 933–47.
  65. Munro MJ. (2021) Applying phonetics: Speech science in everyday life. Oxford: Wiley-Blackwell.
  66. Murphy S, Spence C, Dalton P. (2017) Auditory perceptual load: A review. Hearing Research 352: 40–48.
  67. Nam Y, Paul MJ, Safi D. (2021) Examination of Korean stop perception in Quebec French listeners through the lens of assimilation overlap. JASA Express Letters 1: 125201.
  68. Newton J. (2013) Incidental vocabulary learning in classroom communication tasks. Language Teaching Research 17: 164–87.
  69. Ngaihte CN, Holliday JJ. (2019) Asymmetry in the perceptual assimilation of the Korean laryngeal contrast by Indian listeners. Poster presented at the Hanyang International Symposium on Phonetics and Cognitive Sciences of Language 2019 (HISPhonCog 2019), Hanyang University, Seoul, South Korea.
  70. Nguyen C, Boers F. (2019) The effect of content retelling on vocabulary uptake from a TED Talk. TESOL Quarterly 53: 5–29.
  71. Norris JM, Ortega L. (2000) Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning 50: 417–528.
  72. Pavia N, Webb S, Faez F. (2019) Incidental vocabulary learning through listening to songs. Studies in Second Language Acquisition 41: 745–68.
  73. Peters E, Webb S. (2018) Incidental vocabulary acquisition through viewing L2 television and factors that affect learning. Studies in Second Language Acquisition 40: 551–77.
  74. Plonsky L, Ghanbar H. (2018) Multiple regression in L2 research: A methodological synthesis and guide to interpreting R² values. The Modern Language Journal 102: 713–31.
  75. R Core Team (2023) R: A language and environment for statistical computing [software]. Vienna: R Foundation for Statistical Computing. Available at: https://www.R-project.org (accessed April 2024).
  76. Rebuschat P. (2013) Measuring implicit and explicit knowledge in second language research. Language Learning 63: 595–626.
  77. Rebuschat P, Williams JN. (2012) Implicit and explicit knowledge in second language acquisition. Applied Psycholinguistics 33: 829–56.
  78. Robinson P. (2005) Cognitive abilities, chunk-strength, and frequency effects in implicit artificial grammar and incidental L2 learning: Replications of Reber, Walkenfeld, and Hernstadt (1991) and Knowlton and Squire (1996) and their relevance for SLA. Studies in Second Language Acquisition 27: 235–68.
  79. Robinson P, Mackey A, Gass SM, Schmidt R. (2019) Attention and awareness in second language acquisition. In: Gass S, Mackey A. (eds) The Routledge handbook of second language acquisition. New York: Routledge, pp. 247–67.
  80. Rogers J, Révész A, Rebuschat P. (2016) Implicit and explicit knowledge of inflectional morphology. Applied Psycholinguistics 37: 781–812.
  81. Rott S. (1999) The effect of exposure frequency on intermediate language learners’ incidental vocabulary acquisition through reading. Studies in Second Language Acquisition 21: 589–619.
  82. Ruiz S, Tagarelli KM, Rebuschat P. (2018) Simultaneous acquisition of words and syntax: Effects of exposure condition and declarative memory. Frontiers in Psychology 9: 1–11.
  83. Saito K, Hanzawa K, Petrova K, et al. (2022) Incidental and multimodal high variability phonetic training: Potential, limits, and future directions. Language Learning 72: 1049–91.
  84. Saito K, Macmillan K, Mai T, et al. (2020) Developing, analyzing and sharing multivariate datasets: Individual differences in L2 learning revisited. Annual Review of Applied Linguistics 40: 9–25.
  85. Salamé P, Baddeley AD. (1982) Disruption of short-term memory by unattended speech: Implications for the structure of working memory. Journal of Verbal Learning and Verbal Behavior 21: 150–64.
  86. Saragi T, Nation ISP, Meister GF. (1978) Vocabulary learning and reading. System 6: 72–78.
  87. Schmidt AM. (2007) Cross-language consonant identification: English and Korean. In: Bohn O, Munro MJ. (eds) Language experience in second language speech learning: In honor of James Emil Flege. Amsterdam: John Benjamins, pp. 185–200.
  88. Schmitt N. (2008) Review article: Instructed second language vocabulary learning. Language Teaching Research 12: 329–63.
  89. Shintani N. (2015) The incidental grammar acquisition in focus on form and focus on forms instruction for young beginner learners. TESOL Quarterly 49: 115–40.
  90. Silva DJ. (2006) Acoustic evidence for the emergence of tonal contrast in contemporary Korean. Phonology 23: 287–308.
  91. Sonbul S, Schmitt N. (2010) Direct teaching of vocabulary after reading: Is it worth the effort? ELT Journal 64: 253–60.
  92. Stanislaw H, Todorov N. (1999) Calculation of signal detection theory measures. Behavior Research Methods, Instruments, and Computers 31: 137–49.
  93. Strange W, Shafer VL. (2008) Speech perception in second language learners: The re-education of selective perception. In: Hansen Edwards JG, Zampini ML. (eds) Phonology and Second Language Acquisition. Amsterdam: John Benjamins, pp. 153–91.
  94. Tao Y, Williams JN. (2018) Generalization of syntactic knowledge in semiartificial language learning. Language Learning 68: 1001–31.
  95. Teng F. (2020) Retention of new words learned incidentally from reading: Word exposure frequency, L1 marginal glosses, and their combination. Language Teaching Research 24: 785–812.
  96. Uchihara T, Webb S, Yanagisawa A. (2019) The effects of repetition on incidental vocabulary learning: A meta-analysis of correlational studies. Language Learning 69: 559–99.
  97. van Zeeland H, Schmitt N. (2013) Incidental vocabulary acquisition through L2 listening: A dimensions approach. System 41: 609–24.
  98. Vidal K. (2003) Academic listening: A source of vocabulary acquisition? Applied Linguistics 24: 56–89.
  99. Vidal K. (2011) A comparison of the effects of reading and listening on incidental vocabulary acquisition. Language Learning 61: 219–58.
  100. Vlahou EL, Protopapas A, Seitz AR. (2012) Implicit training of nonnative speech stimuli. Journal of Experimental Psychology: General 141: 363–81.
  101. Wade T, Holt LL. (2005) Incidental categorization of spectrally complex non-invariant auditory stimuli in a computer game task. The Journal of the Acoustical Society of America 118: 2618–33.
  102. Waring R, Takaki M. (2003) At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language 15: 130–63.
  103. Webb S. (2007) The effects of repetition on vocabulary knowledge. Applied Linguistics 28: 46–65.
  104. Webb S, Chang AC. (2015) Second language vocabulary learning through extensive reading with audio support: How do frequency and distribution of occurrence affect learning? Language Teaching Research 19: 667–86.
  105. Webb S, Newton J, Chang AC. (2013) Incidental learning of collocation. Language Learning 63: 91–120.
  106. Webb S, Uchihara T, Yanagisawa A. (2023) How effective is second language incidental vocabulary learning? A meta-analysis. Language Teaching 56: 161–80.
  107. Werker JF, Logan JS. (1985) Cross-language evidence for three factors in speech perception. Perception and Psychophysics 37: 35–44.
  108. Williams JN. (2009) Implicit learning in second language acquisition. In: Ritchie WC, Bhatia TK. (eds) The new handbook of second language acquisition. Bingley: Emerald, pp. 319–53.
  109. Zahar R, Cobb T, Spada N. (2001) Acquiring vocabulary through reading: Effects of frequency and contextual richness. Canadian Modern Language Review 57: 541–72.

Articles from Second Language Research are provided here courtesy of SAGE Publications
