Author manuscript; available in PMC: 2020 Jan 1.
Published in final edited form as: Dev Psychol. 2018 Oct 4;55(1):1–8. doi: 10.1037/dev0000610

Infants’ selective use of reliable cues in multidimensional language input

Christine E Potter 1,*, Casey Lew-Williams 1
PMCID: PMC6296852  NIHMSID: NIHMS987140  PMID: 30284882

Abstract

Learning always happens from input that contains multiple structures and multiple sources of variability. Though infants possess learning mechanisms to locate structure in the world, lab-based experiments have rarely probed how infants contend with input that contains many different structures and cues. Two experiments explored infants’ use of two naturally occurring sources of variability – different sounds and different people – to detect regularities in language. Monolingual infants (9–10 months) heard a male and female talker produce two different speech streams, one of which followed a deterministic pattern (e.g., AAB, le-le-di) and one of which did not. For half of the infants, each speaker produced only one of the streams; for the other half of infants, each speaker produced 50% of each stream. In Experiment 1, each stream consisted of distinct sounds, and infants successfully demonstrated learning regardless of the correspondence between speaker and stream. In Experiment 2, each stream consisted of the same sounds, and infants failed to show learning, even when speakers provided a perfect cue for separating each stream. Thus, monolingual infants can learn in the presence of multiple speech streams, but these experiments suggest that infants may rely more on sound-based than on speaker-based distinctions when breaking into the structure of incoming information. This selective use of some cues over others highlights infants’ ability to adaptively focus on distinctions that are most likely to be useful as they sort through their inherently multidimensional surroundings.

Keywords: statistical learning, infant language learning, rule learning, cognitive development


To make sense of complex perceptual environments, infants must prioritize the cues that allow them to efficiently extract relevant information. It has been widely shown that infants are endowed with powerful learning abilities and are highly sensitive to structure in their environment. Infants detect regularities across a variety of domains, modalities, and types of statistical relations (e.g., Fiser & Aslin, 2002; Gómez, 2002; Marcus, Vijayan, Rao, & Vishton, 1999; Saffran, Aslin, & Newport, 1996; see Saffran & Kirkham, 2018 for recent review). However, the vast majority of studies exploring infants’ statistical learning have used deterministic input from a single source, which is unlikely to reflect the true challenge of learning important structures such as those found in language, as natural language environments contain myriad structures and cues that vary in their relevance. In two experiments, we tested infants’ ability to use different cues to discover regularities in input that contained multiple, distinct patterns.

Researchers have begun to test infants’ learning from input that may better reflect infants’ real experience. For example, Pelucchi, Hay, and Saffran (2009) found that infants could segment words from a corpus of real Italian speech, showing the ability to find regularities in natural input. Graf Estes and Lew-Williams (2015) introduced a different form of noise, the presence of multiple talkers, and found that infants could demonstrate learning of an artificial language produced by many different speakers. Still, even these studies contained only one set of learnable regularities, and the only cues to structure came from statistical co-occurrences between syllables, which does not capture the multidimensionality of infants’ experience. Similarly, studies that have included more complex statistical relations (e.g., Gerken & Knight, 2015; Gómez, 2002; Gómez & Lakusta, 2004; Gómez & Maye, 2005) have not typically included other forms of noise, such as speaker-based indexical differences. In the present research, we focused on three separate sources of variability: different statistical regularities, different speakers, and different sets of speech sounds. We introduced these cues simultaneously to probe infants’ ability to contend with some of the challenges found in real language environments, such as the need to track different aspects of linguistic structure, or even different languages. Infants heard two speakers (one female, one male) produce two different streams of speech: a Target stream with recurring repetition-based patterns (e.g., AAB, as in previous research by Marcus et al., 1999) and a Non-Target stream with no overt repetition. Hereafter, we use the term “stream” to refer to a sequence of items constructed according to these constraints. These streams consisted of either non-overlapping (Experiment 1) or fully overlapping sounds (Experiment 2) in order to test infants’ use of sounds and speakers to discover reliable structure in variable input.

Statistics available to young learners are rarely deterministic, and inconsistency shapes infants’ learning strategies (e.g., Tummeltshammer & Kirkham, 2013). Even adults struggle to demonstrate learning when there are multiple sets of regularities within the same materials (e.g., Bulgarelli & Weiss, 2016; Gebhart, Aslin, & Newport, 2009; Karuza et al., 2016; Pacton & Perruchet, 2008; Weiss, Gerfen, & Mitchel, 2009). The few studies that have tested infants’ learning from multiple streams of information have highlighted the challenge presented by less consistent input. Both Antovich and Graf Estes (2017) and Bulgarelli and colleagues (2017) reported that monolingual infants were unable to demonstrate learning of two different artificial languages, presented either sequentially or interleaved. However, monolingual infants learn more effectively when competing speech streams are separated in time (Gonzales, Gerken, & Gómez, 2015) and when regularities are highlighted by additional cues, such as prosody (e.g., Chun & Jiang, 1998; Gervain & Werker, 2013; Richardson & Kirkham, 2004). Therefore, infants may benefit from a range of cues that help them separate concurrently presented streams of information.

In auditory environments in particular, many cues can facilitate learning, and infants use experience to focus on acoustic cues that are most likely to support learning (e.g., Lew-Williams & Saffran, 2012; May & Werker, 2014; Nazzi, Jusczyk, & Johnson, 2000; Nazzi, Mersad, Sundara, Iakimova, & Polka, 2014; Thiessen & Saffran, 2007; Werker & Tees, 1984). For example, different sequences of sounds and different speakers are ubiquitous in infants’ environments, and their use of both phonological and indexical cues is tuned through experience (e.g., Best, Tyler, Gooding, Orlando, & Quann, 2009; Houston & Jusczyk, 2000; Werker & Tees, 1984).

Infants’ ability to differentiate different types of speech sounds has been convincingly demonstrated (e.g., Eimas, Siqueland, Jusczyk, & Vigorito, 1971; Kuhl, 1983; Werker & Lalonde, 1988). Young infants perceive contrasts that older listeners do not (e.g., Mazuka, Hasegawa, & Tsuji, 2014; Werker & Tees, 1984), suggesting that early in development, infants may be particularly focused on fine phonological details. Moreover, infants are acutely sensitive to the distribution of sounds in their input (e.g., Maye, Werker, & Gerken, 2002; Maye, Weiss, & Aslin, 2008) and track the frequency of different sound combinations in their language (e.g., Friederici & Wessels, 1993; Graf Estes, Edwards, & Saffran, 2011; Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993; Mattys, Jusczyk, Luce, & Morgan, 1999). They can also discriminate between two languages on the basis of sound properties (e.g., Bosch & Sebastián-Gallés, 2001) and can make category generalizations on the basis of phonetic information (e.g., Gómez & Lakusta, 2004). However, though past studies demonstrate that infants can separate different sets of sounds, it is not yet known if infants use this dimension to facilitate learning through multiple streams.

Similarly, infants are attuned to the presence of different speakers from an early age and readily distinguish male and female speakers (e.g., Floccia, Nazzi, & Bertoncini, 2000; Jusczyk, Pisoni, & Mullennix, 1992). They often attend to indexical properties of speakers’ voices even when they are not task-relevant (e.g., Graf Estes & Lew-Williams, 2015; Houston & Jusczyk, 2000; Singh, Morgan, & White, 2004). Given that adults better detect multiple speech streams that are separated by speaker (Mitchel & Weiss, 2010; Weiss et al., 2009), we expected that having two streams produced by speakers of different genders might offer a powerful cue for infants.

The current studies were designed to test how infants might exploit the combination of sounds and speakers in order to locate structure in complex input – specifically, input containing multiple streams of information. We defined structure using two patterns that have previously been tested in isolation. Infants heard three-syllable sequences that followed either an AAB pattern (e.g., le-le-di) or an ABA pattern (e.g., le-di-le) and were subsequently tested on their learning of that pattern. These kinds of regularities have been readily learned by infants across many studies (e.g., Gerken, 2006; Marcus et al., 1999; see Rabagliati, Ferguson, & Lew-Williams, in press, for meta-analyses). Furthermore, infants are more likely to discover patterns in stimuli that are connected to their real-life experiences (Ferguson & Lew-Williams, 2016; Rabagliati et al., in press; Saffran, Pollak, Seibel, & Shkolnik, 2007), suggesting that realistic sources of variability may in fact help infants identify patterns.

In two experiments, we asked how the co-occurrence of ecologically valid cues would support infants’ learning of regularities embedded in input containing two streams of speech, produced by two different speakers. In prior studies, monolingual infants have been unable to contend with multiple streams of information (e.g., Antovich & Graf Estes, 2017; Bulgarelli et al., 2017; Kovács & Mehler, 2009). But given that monolingual infants constantly encounter multiple sources of information, they must be able to overcome this complexity.

We tested the hypothesis that co-occurring cues in the speech signal can support infants’ ability to locate regularities in the input. We presented infants with interleaved Target and Non-Target speech streams. Half of infants heard each stream produced by a separate speaker (Consistent Speaker condition), and half of infants heard 50% of each stream produced by each speaker (Variable Speaker condition). Because prior studies suggest that infants struggle to learn two structured streams, we chose regularities for the Target stream that have previously been shown to be robustly learnable (i.e., AAB or ABA patterns). The Non-Target stream contained no repetitions (i.e., CDE), so the structure of the two streams did not conflict. In Experiment 1, the Target and Non-Target streams involved unique sets of sounds, meaning that phonological cues, which can be used to signal the presence of different structures, were available in addition to speaker cues to help infants segregate the two streams. In Experiment 2, we eliminated the phonological distinction and tested whether or not speaker cues alone would support infants’ learning.

Experiment 1

In Experiment 1, infants were provided with multiple cues to the presence of different structures. We predicted that the combination of sound- and speaker-based cues would help infants segregate the Target from the Non-Target stream. We also predicted infants would demonstrate better learning when regularities in the Target stream co-occurred with consistent speaker information.

Method

Participants.

Experiment 1 included 40 full-term monolingual English-learning infants (17 female), ranging in age from 9.3 to 11.0 months (M = 10.1). A power analysis (Faul, Erdfelder, Lang, & Buchner, 2007) indicated that a sample size of n=20 per condition would have .83 power to detect a medium effect for the interaction between condition and test item (f=.25, based on comparable studies, e.g., Lew-Williams & Saffran, 2012) with two groups of participants. All infants were reported to have normal hearing and were exposed to English at least 85% of the time. Half of the infants were assigned to the Consistent Speaker condition; the other half were assigned to the Variable Speaker condition. Four additional infants were tested but excluded due to fussiness (n=3) or performance that was more than 2.5 standard deviations from the mean (n=1). The parents of all infants provided informed consent, and all participants received a small gift in exchange for their participation. All experimental protocols, including procedures for obtaining informed consent, were approved by the Princeton University IRB (Approval number: 0000007117A021, Language Learning: Sounds, Words, and Grammar).

Stimuli.

All stimuli consisted of trisyllabic strings, produced in infant-directed speech. Two native English speakers, one male and one female, recorded the stimuli. Each syllable was recorded in isolation, and all syllables were normed to match in intensity and duration (625ms). Sequences were created by combining syllables with 250ms of silence between syllables, and there was 1s of silence between 3-syllable items.

Familiarization.

During Familiarization, infants listened to two separate streams of speech. The Target stream was closely modeled on materials previously demonstrated to be learnable by infants of this age (Marcus et al., 1999) and consisted of a series of strings that followed either an AAB (e.g., le-le-di) or ABA pattern (e.g., le-di-le). There were 16 unique items in the Target stream (see Table 1), and infants heard each item twice. The Target stream was randomized and divided into four 8-item blocks of approximately 30s, which were intermixed with a second, Non-Target stream. Half of infants were randomly assigned to hear the AAB Target materials; the other half heard the ABA Target materials.
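As a rough illustration of how a Target stream of this kind could be assembled, the sketch below crosses four "A" syllables with four "B" syllables (read off Table 1) to obtain the 16 unique items, then concatenates two shuffled passes and cuts the result into four 8-item blocks. The function names, the A/B labels, and the seeded shuffle are assumptions made for illustration; the source does not describe the authors' actual randomization procedure.

```python
import random
from itertools import product

# Syllable sets read off Table 1; the A/B labels are illustrative naming only.
A_SYLLABLES = ["le", "wi", "ji", "de"]
B_SYLLABLES = ["we", "je", "li", "di"]


def make_items(pattern):
    """Return the 16 three-syllable strings for an 'AAB' or 'ABA' pattern."""
    if pattern == "AAB":
        return ["-".join((a, a, b)) for a, b in product(A_SYLLABLES, B_SYLLABLES)]
    if pattern == "ABA":
        return ["-".join((a, b, a)) for a, b in product(A_SYLLABLES, B_SYLLABLES)]
    raise ValueError("pattern must be 'AAB' or 'ABA'")


def make_blocks(items, block_size=8, seed=0):
    """Present every item twice (two shuffled passes) and split into blocks."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    sequence = []
    for _ in range(2):
        order = list(items)
        rng.shuffle(order)
        sequence.extend(order)
    return [sequence[i:i + block_size] for i in range(0, len(sequence), block_size)]


aab_items = make_items("AAB")
aba_items = make_items("ABA")
target_blocks = make_blocks(aab_items)
```

Crossing the two syllable sets is simply the most compact way to reproduce the 16 items in Table 1; any procedure yielding the same set would serve equally well.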

Table 1.

Familiarization stimuli for AAB condition in Experiment 1. Participants in the ABA condition heard the same syllables, but arranged to form ABA patterns instead. Infants heard all 16 items in each stream presented twice, in two different randomized orders and presented in blocks of 8.

Target Stream                          | Non-Target Stream
le-le-we  le-le-je  le-le-li  de-de-li | fɔɪ-nae-vʌ  ru-vae-tʌ   tae-fu-sʌ   fae-su-tɔɪ
wi-wi-je  wi-wi-di  de-de-je  le-le-di | nu-tae-rɔɪ  tu-sɔɪ-rae  vɔɪ-nu-fʌ   tʌ-vu-nɔɪ
ji-ji-we  ji-ji-li  ji-ji-di  wi-wi-li | sʌ-rae-vɔɪ  fʌ-tɔɪ-nae  sae-nʌ-ru   nɔɪ-sʌ-vu
de-de-di  wi-wi-we  ji-ji-je  de-de-we | vae-rʌ-tu   vʌ-fae-su   rʌ-fɔɪ-sae  nʌ-rɔɪ-fu

The Non-Target stream also comprised 16 different trisyllabic sequences (each repeated twice), but unlike the Target stream, each syllable within a string was unique (CDE), so there were no repetitions within strings (e.g., foi-nae-vuh; see Table 1). In addition, none of the phonemes used in the Target stream appeared in the Non-Target stream. Like the Target stream, the Non-Target stream was divided into four blocks of eight items.

All infants heard identical linguistic materials, with alternating 30s blocks (8 strings) of the Target and Non-Target streams. In the Consistent Speaker condition, the Target stream was produced entirely by the female speaker, while the male speaker produced the full Non-Target stream. In the Variable Speaker condition, each speaker produced half of each stream. The speaker change always occurred midway through an 8-item block in the Variable Speaker condition, such that infants listened to 30s blocks of a single speaker, and 30s blocks of a single stream, as in the Consistent Speaker condition (see Figure 1).
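The blocking logic just described can be sketched as a per-item schedule of (stream, speaker) pairs. This is a minimal sketch under stated assumptions: the text does not say which stream or which speaker came first, so starting with a Target block and with the female speaker is an arbitrary choice, and the function name is hypothetical.

```python
from collections import Counter


def familiarization_schedule(condition, n_blocks=8, block_size=8):
    """Return one (stream, speaker) pair per familiarization item."""
    schedule = []
    half = block_size // 2
    for i in range(n_blocks * block_size):
        block = i // block_size
        # Assumption: the first block is a Target block; streams then alternate.
        stream = "Target" if block % 2 == 0 else "Non-Target"
        if condition == "Consistent":
            # Speaker is perfectly correlated with stream.
            speaker = "female" if stream == "Target" else "male"
        elif condition == "Variable":
            # Speaker blocks are offset by half a block, so the voice
            # changes midway through every 8-item stream block.
            speaker = "female" if ((i + half) // block_size) % 2 == 0 else "male"
        else:
            raise ValueError("condition must be 'Consistent' or 'Variable'")
        schedule.append((stream, speaker))
    return schedule


consistent_counts = Counter(familiarization_schedule("Consistent"))
variable_counts = Counter(familiarization_schedule("Variable"))
```

Under these assumptions, the counts confirm the key design property: in the Consistent schedule each stream is tied to a single voice, whereas in the Variable schedule each voice produces 16 of the 32 items in each stream.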

Figure 1.

Blocking design for the Familiarization streams. Green indicates the Target stream; grey is the Non-Target stream. X’s denote the female speaker, while spaces without X’s denote the male. In the Consistent Speaker condition, there was perfect correspondence between speaker and language (top row), while in the Variable Speaker condition, there was no reliable correspondence between speaker and language (bottom row).

Test.

In the Test phase, infants listened to single items that followed either an AAB (ko-ko-ba, po-po-ga) or ABA (ba-ko-ba, ga-po-ga) pattern. Items that maintained the regularity from the Target stream were considered Familiar (e.g., AAB test items following exposure to an AAB regularity) while items that violated the Target regularity were considered Unfamiliar (e.g., ABA items following exposure to an AAB regularity). None of the phonemes used in the Test items appeared in either Familiarization stream. All Test items were produced by the female speaker.

Procedure.

Infants were tested using the Headturn Preference Procedure. Infants sat on a parent’s lap in a darkened room with monitors on three sides. Parents listened to music through headphones to prevent them from interfering with children’s behavior. During Familiarization, infants listened to the intermixed Target and Non-Target streams for nearly four minutes while visual stimuli appeared on the monitors, contingent on the infants’ looking behavior.
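The stated familiarization length can be checked against the stimulus timings reported under Stimuli (625ms syllables, 250ms between syllables, 1s between items). The arithmetic below is a sketch that assumes a 1s pause follows every item, including the last item of each block; the constant names are illustrative.

```python
SYLLABLE_MS = 625            # normed syllable duration
WITHIN_ITEM_GAP_MS = 250     # silence between syllables of one item
BETWEEN_ITEM_GAP_MS = 1000   # silence between 3-syllable items
ITEMS_PER_BLOCK = 8
N_BLOCKS = 8                 # four Target blocks + four Non-Target blocks

# Three syllables and two internal pauses per item.
item_ms = 3 * SYLLABLE_MS + 2 * WITHIN_ITEM_GAP_MS

# Assumption: every item, including the last in a block, is followed by 1 s of silence.
block_ms = ITEMS_PER_BLOCK * (item_ms + BETWEEN_ITEM_GAP_MS)
total_ms = N_BLOCKS * block_ms

print(f"item:  {item_ms} ms")
print(f"block: {block_ms / 1000:.1f} s")
print(f"total: {total_ms / 60000:.1f} min")
```

Under this assumption each item lasts 2375 ms, each block about 27 s, and the full familiarization about 3.6 minutes, which is in line with the "approximately 30s" blocks and "nearly four minutes" described in the text.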

Infants were then tested for learning of the Target pattern. On each test trial, the infant’s attention was drawn to the center monitor with an interesting visual stimulus (i.e., a pinwheel). The same stimulus then appeared on one of the two side monitors, and when the infant looked to that side, a test item (either Familiar or Unfamiliar) played from a loudspeaker until the infant looked away for 1s or a maximum of 20s had elapsed. A trained experimenter, who wore noise-canceling headphones in order to be unaware of what the infant was hearing, controlled the stimuli using custom software. Each test item was repeated three times, for a total of 12 test trials.

Results & Discussion

Our main analysis in Experiment 1 tested infants’ ability to discriminate Familiar vs. Unfamiliar test items after hearing the patterns produced by a single speaker vs. multiple speakers. For each participant, we calculated mean looking times for Familiar and Unfamiliar trials. Looking times were analyzed using a 2×2 mixed ANOVA, with Test Type (Familiar vs. Unfamiliar) as a within-subjects factor and Condition (Consistent vs. Variable Speaker) as a between-subjects factor. The ANOVA revealed a main effect of Test Type [F(1,38)=13.05, p=.0009, ηp²=.008], demonstrating that infants differentiated between Familiar and Unfamiliar items. However, there was no interaction [F(1,38)=.02, p=.88], suggesting that learning was similar under both Consistent and Variable Speaker conditions. Planned comparisons using two-tailed paired samples t-tests confirmed that infants displayed similar performance across conditions. In both conditions, infants listened significantly longer to Familiar items [Consistent Speaker: 5.2s vs. 4.3s, t(19)=2.19, p=.04, Cohen’s d=.49; Variable Speaker: 5.6s vs. 4.6s, t(19)=3.08, p=.006, Cohen’s d=.69, see Figure 2]. Fourteen of 20 infants in the Consistent Speaker condition and 15 of 20 infants in the Variable Speaker condition demonstrated this preference. These results suggest that infants discovered patterns presented in noisy input, and there was no advantage for hearing each pattern produced deterministically by a unique speaker.
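For readers who want to relate the reported t statistics to the effect sizes, one common convention for paired designs divides t by the square root of the sample size. Whether the authors used this exact formula is an assumption; the sketch only shows that the reported values are consistent with it.

```python
import math


def cohens_d_paired(t_value, n):
    """Paired-samples effect size under the convention d = t / sqrt(n)."""
    return t_value / math.sqrt(n)


# t statistics and group sizes as reported for Experiment 1.
d_consistent = round(cohens_d_paired(2.19, 20), 2)
d_variable = round(cohens_d_paired(3.08, 20), 2)

print(d_consistent, d_variable)  # prints 0.49 0.69
```

Both values match the Cohen’s d figures given for the Consistent Speaker and Variable Speaker conditions.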

Figure 2.

Mean looking times in Experiment 1, where the two familiarization streams consisted of unique phonology. In the Consistent Speaker condition, each speaker produced a single stream, while in the Variable Speaker condition, each speaker produced half of each stream. Error bars represent standard errors of the mean.

Infants’ successful learning, independent of the correspondence between speaker and information stream, reveals that monolingual infants can discover structure in noisy, multidimensional input, in contrast to the findings of previous studies (e.g., Antovich & Graf Estes, 2017; Bulgarelli et al., 2017). Without relying on indexical cues, infants could segregate different sources of information and discovered regularities that required them to generalize across different voices. Reduplication has been suggested to be easy for infants to learn (e.g., Ota & Skarabela, 2016), and it may be that these particular regularities were highly salient, such that even when only 50% of the input conformed to a particular pattern, infants could learn. Alternatively, it could be that the phonological distinction between the two streams facilitated infants’ ability to find the target regularity. Sound differences help infants separate two natural languages (e.g., Bosch & Sebastián-Gallés, 2001; Molnar, Gervain, & Carreiras, 2014), and adults struggle to learn two artificial languages with overlapping phonology (Perruchet, Poulin-Charronnat, Tillmann, & Peereman, 2014), suggesting that salient phonological cues may be helpful in dividing different streams of information. Furthermore, phonological cues can boost adults’ learning of regularities that are otherwise challenging (e.g., Onnis, Monaghan, Richmond, & Chater, 2005; Van den Bos, Christiansen, & Misyak, 2012), and concurrent statistical and phonological regularities can support infants’ learning of artificial language materials (Sahni, Seidenberg, & Saffran, 2010). Indeed, infants did not need speaker cues to separate the Target and Non-Target streams when there was a phonological distinction between the two streams. In our second study, we examined whether or not infants were able to display learning when presented with two streams that were not phonologically marked.

Experiment 2

In Experiment 2, we tested whether infants could discover regularities when faced with two speech streams consisting of the same sounds. As in Experiment 1, infants were assigned to the Consistent Speaker or Variable Speaker condition and presented with interleaved Target and Non-Target streams, but in Experiment 2, the two streams involved the same syllable inventory, so infants could no longer use phonological cues to separate the two streams. We then asked if the addition of correlated speaker information would play a more substantial role in dictating infants’ ability to discover patterns from these less distinct linguistic streams. We predicted that infants would need this additional cue, and only infants in the Consistent Speaker condition, where structured information was paired with speaker, would discover regularities. An alternative possibility was that infants were relying on the phonological cues and would be unable to segregate the two streams without that demarcation.

Method

Participants.

Experiment 2 included 40 additional monolingual infants (9 females, mean age: 10.2 months, range: 9.1–11.0). Half of the infants were assigned to the Consistent Speaker condition; half were assigned to the Variable Speaker condition. Three additional infants were tested, but excluded for fussiness.

Stimuli.

Familiarization materials again consisted of a Target and Non-Target stream. Infants heard one of the same two Target streams (AAB or ABA) as in Experiment 1. The Non-Target stream still consisted of trisyllabic sequences made up of three unique items, but in Experiment 2, the phonemes used in the Non-Target stream were the same as those in the Target stream (see Table 2).

Table 2.

Stimuli used in the Non-Target stream in Experiment 2. All items were presented twice, in a pseudo-randomized order. As in Experiment 1, half of the infants heard these items intermixed with an AAB Target pattern; the other half heard them intermixed with an ABA pattern.

Non-Target Stream
le-we-ji we-li-ji de-je-wi je-wi-li
le-wi-je we-je-di de-li-je je-di-we
li-de-wi wi-le-de di-ji-we ji-de-le
li-ji-de wi-di-le di-we-li ji-le-di

The design of the Consistent and Variable Speaker conditions was identical to Experiment 1. Infants in the Consistent Speaker condition heard the Target stream produced by the female speaker and the Non-Target stream produced by the male speaker. In the Variable Speaker condition, infants heard each speaker produce half of each Familiarization stream.

Test materials were identical to Experiment 1, with all test items involving phonemes not found in the Familiarization phase, produced by the female speaker.

Procedure.

The procedure for Experiment 2 was identical to Experiment 1.

Results & Discussion

In Experiment 2, we performed parallel analyses to those of Experiment 1. Our main (Test Type by Condition) ANOVA revealed that unlike in Experiment 1, there was no main effect of Test Type [F(1,38)=.68, p=.42]. The interaction between Test Type and Condition also was not significant [F(1,38)=.32, p=.58], and infants did not look significantly longer at Familiar vs. Unfamiliar items in either condition (Consistent Speaker: 6.1s vs. 5.5s, t(19)=1.09, p=.29; Variable Speaker: 6.7s vs. 6.6s, t(19)=.17, p=.87, see Figure 3). Only 12 infants in the Consistent Speaker condition and 11 infants in the Variable Speaker condition displayed a familiarity preference. When the Target and Non-Target stream included the same sounds, infants did not demonstrate learning of the Target regularities. Even when speaker information was perfectly correlated with structure in the Consistent Speaker condition, infants failed to exploit this association. Thus, indexical cues alone were not sufficient to highlight the underlying structure found in the two speech streams.

Figure 3.

Mean looking times in Experiment 2, where the two familiarization streams consisted of the same syllables. Error bars represent standard errors of the mean.

General Discussion

In two studies, we presented monolingual infants with multidimensional input and assessed how they used different acoustic cues to uncover regularities. Infants encountered two streams of speech, composed of two sets of sounds, produced by two different speakers. In Experiment 1, when each speech stream consisted of unique sounds, infants detected underlying patterns regardless of whether the patterns were produced by a single speaker or multiple speakers. However, in Experiment 2, when there was no phonological distinction between streams, infants failed to demonstrate learning, even when speakers offered a reliable cue. Thus, sound-based differences appeared to be particularly useful in helping infants to segregate different channels of information, suggesting that infants may prioritize some cues over others in complex environments.

First and foremost, these results highlight infants’ robust ability to detect structure. Monolingual 10-month-olds displayed learning of a target speech signal surrounded by inconsistent information. In complex input, learners must discover which regularities to track (e.g., Gerken & Knight, 2015; Mintz, 2002), and even adults rely on concurrent cues, such as sounds or speakers, to track multiple streams of information (e.g., Bulgarelli & Weiss, 2016; Conway & Christiansen, 2006; Gebhart et al., 2009; Karuza et al., 2016; Weiss et al., 2009). In Experiment 1, the phonological division between streams appeared to support infants’ discovery of repetition-based regularities in the Target stream. Two potential explanations could account for this successful learning. First, the presence of two different sets of sounds may have helped infants separate the streams. Research on infant bilingualism supports this explanation, as sound-based differences may signal the presence of different languages, structures, or information (e.g., Bosch & Sebastián-Gallés, 2001). Moreover, even infants in single-language environments are exposed to separable speech streams, such as speech directed to the infant versus overheard between adults. This division is marked acoustically (e.g., Fernald & Simon, 1984; Newport, Gleitman, & Gleitman, 1977; Piazza, Iordan, & Lew-Williams, 2017), and infant-directed speech both captures infants’ interest and enhances learning, suggesting that acoustic differences between streams may highlight the most relevant information (e.g., Cooper & Aslin, 1990; Weisleder & Fernald, 2013). Another (not mutually exclusive) possibility is that the unique sounds in the Non-Target stream increased the overall variability in the input, and surrounding variability can draw infants’ attention to consistent structure (e.g., Gómez, 2002; Toscano & McMurray, 2010). In either case, infants’ successful learning illustrates their keen sensitivity to patterns in their input and suggests that sound-related cues – a pervasive feature of their natural environments – may support learning.

However, not all cues were equally useful in highlighting relevant structure. Contrary to our predictions, dividing information streams by speaker did not facilitate infants’ learning in either experiment. Though indexical information may initially be salient for infants (e.g., Jusczyk et al., 1992; Singh et al., 2004; Quam, Knight, & Gerken, 2017), 10-month-old infants can generalize across speakers, suggesting that they learn that differences between speakers are not always meaningful (Houston & Jusczyk, 2000). Indeed, when listening to unfamiliar speech, infants are less sensitive to individual voices, presumably because differences between languages are more salient than differences between speakers (e.g., Johnson, Westrek, Nazzi, & Cutler, 2011; Nazzi et al., 2000). In Experiment 2, infants failed to exploit the usefulness of speakers to help them differentiate the two streams, even when there were no other acoustic distinctions between the streams. Given that listeners use experience to adjust their use of particular dimensions in speech (e.g., Idemaru & Holt, 2011; Potter & Saffran, 2015; 2017; Rost & McMurray, 2009), infants in our experiments may have down-weighted the potential value of speakers. Infants do not track all cues equally (e.g., Johnson & Tyler, 2010) and they rely on their everyday experience with language to shape their learning strategies (Thiessen & Saffran, 2007). While speakers may be critical socially, they are unlikely to be a reliable cue to important linguistic variation in natural environments, and learners overlook correlated cues that do not typically co-occur with language regularities, such as changes in background color (e.g., Mitchel & Weiss, 2010). On the other hand, sounds can be used to support infants’ differentiation of languages (e.g., Bosch & Sebastián-Gallés, 2001; Nazzi et al., 2000), potentially making them a better candidate cue to segregate different information streams. Thus, by 10 months, infants may have discovered it is not advantageous to segregate their learning by speaker, but that different sounds can mark relevant and valuable structural differences.

These studies included only monolingual infants, whose prior experience may have led them to expect that all speakers use a single language. Bilingual infants, whose experience includes regular exposure to multiple, independent structures, might not display similar learning. Prior studies have suggested that bilingual infants may be better able to learn two structures simultaneously (e.g., Antovich & Graf Estes, 2017; Kovács & Mehler, 2009), and bilingual infants may attend to distinctions that monolingual infants ignore, such as non-native sound contrasts (e.g., Petitto et al., 2012; Sebastián-Gallés, Albareda-Castellot, Weikum, & Werker, 2012). In bilingual environments, reliable correlations exist between speakers and languages; monolingual speakers use one language exclusively, and bilinguals tend to preferentially use one language (Grosjean, 2001). Bilingual infants may be sensitive to these relations, and in fact, it has often been explicitly recommended that parents employ a “One Parent-One Language” strategy, where each parent speaks a separate language to the child (e.g., Barron-Hauwaert, 2004; Ronjat, 1913). It could be that bilingual infants divide their learning by speaker and might benefit from correlated speaker information to learn regularities. Specifically, one could predict that bilingual infants would show successful learning in the Consistent Speaker condition of Experiment 2. Alternatively, it may be that bilingual infants, like monolinguals, would ignore speaker cues, as all infants must learn to generalize across speakers. Future studies will explore these questions and can provide insight into how infants use experience to inform their use of different cues.

Infants’ performance in these experiments demonstrates their ability to focus on the distinctions that are most likely to be useful (e.g., Kuhl, Stevens, & Hayashi, 2006; Werker & Tees, 1984). For monolingual infants, a lack of attention to individual speakers may reflect an adaptive strategy. In complex auditory and social environments, infants encounter many cues that vary substantially in their reliability, both in aggregate and in individual contexts of processing. Infants’ ability to segregate different speech streams, possibly by recognizing phonological differences, suggests that infants detect and take advantage of reliable cues, a capacity that may also help them distinguish one natural language from another. Thus, these studies demonstrate that infants selectively exploit the cues available in their input to find reliable structure in a noisy environment.

Acknowledgments:

We would like to thank the participating families and members of the Princeton Baby Lab, especially Eva Fourakis and Fernanda Fernandez. This work was supported by grants from the National Institute of Child Health and Human Development (R03HD079779) and the Overdeck Education Research Innovation Fund.

References

  1. Antovich DM, & Graf Estes K (2017). Learning across languages: Bilingual experience supports dual language statistical word segmentation. Developmental Science. doi: 10.1111/desc.12548
  2. Barron-Hauwaert S (2004). Language strategies for bilingual families: The one-parent-one-language approach. Parents’ and Teachers’ Guides, no. 7 (1st ed.). Clevedon: Multilingual Matters Ltd. doi: 10.1017/S0272263105250283
  3. Best CT, Tyler MD, Gooding TN, Orlando CB, & Quann CA (2009). Development of phonological constancy: Toddlers’ perception of native- and Jamaican-accented words. Psychological Science, 20(5), 539–542. doi: 10.1111/j.1467-9280.2009.02327.x
  4. Bosch L, & Sebastián-Gallés N (2001). Evidence of early language discrimination abilities in infants from bilingual environments. Infancy, 2(1), 29–49. doi: 10.1207/S15327078IN0201_3
  5. Bulgarelli F, & Weiss DJ (2016). Anchors aweigh: The impact of overlearning on entrenchment effects in statistical learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(10), 1621–1631. doi: 10.1037/xlm0000263
  6. Bulgarelli F, Benitez V, Saffran J, Byers-Heinlein K, & Weiss D (2017). Statistical learning of multiple structures by 8-month-old infants. Proceedings of the 41st Annual Boston University Conference on Language Development.
  7. Chun MM, & Jiang Y (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36(1), 28–71. doi: 10.1006/cogp.1998.0681
  8. Conway CM, & Christiansen MH (2006). Statistical learning within and between modalities: pitting abstract against stimulus-specific representations. Psychological Science, 17(10), 905–12. doi: 10.1111/j.1467-9280.2006.01801.x
  9. Cooper RP, & Aslin RN (1990). Preference for infant‐directed speech in the first month after birth. Child Development, 61(5), 1584–1595.
  10. Eimas PD, Siqueland ER, Jusczyk P, & Vigorito J (1971). Speech perception in infants. Science, 171(3968), 303–306. doi: 10.1126/science.171.3968.303
  11. Faul F, Erdfelder E, Lang AG, & Buchner A (2007). G* Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191.
  12. Ferguson B, & Lew-Williams C (2016). Communicative signals support abstract rule learning by 7-month-old infants. Scientific Reports, 6, 25434. doi: 10.1038/srep25434
  13. Fernald A, & Simon T (1984). Expanded intonation contours in mothers’ speech to newborns. Developmental Psychology, 20(1), 104–113.
  14. Fiser J, & Aslin RN (2002). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences of the United States of America, 99(24), 15822–15826. doi: 10.1073/pnas.232472899
  15. Floccia C, Nazzi T, & Bertoncini J (2000). Unfamiliar voice discrimination for short stimuli in newborns. Developmental Science, 3(3), 333–343. doi: 10.1111/1467-7687.00128
  16. Friederici AD, & Wessels JMI (1993). Phonotactic knowledge of word boundaries and its use in infant speech perception. Perception & Psychophysics, 54(3), 287–295. doi: 10.3758/BF03205263
  17. Gebhart AL, Aslin RN, & Newport EL (2009). Changing structures in midstream: Learning along the statistical garden path. Cognitive Science, 33(6), 1087–1116. doi: 10.1111/j.1551-6709.2009.01041.x
  18. Gerken L (2006). Decisions, decisions: Infant language learning when multiple generalizations are possible. Cognition, 98(3), 1–8. doi: 10.1016/j.cognition.2005.03.003
  19. Gerken L, & Knight S (2015). Infants generalize from just (the right) four words. Cognition, 143, 187–192. doi: 10.1016/j.cognition.2015.04.018
  20. Gervain J, & Werker JF (2013). Prosody cues word order in 7-month-old bilingual infants. Nature Communications, 4, 1490–1496. doi: 10.1038/ncomms2430
  21. Gómez RL (2002). Variability and detection of invariant structure. Psychological Science, 13(5), 431–436. doi: 10.1111/1467-9280.00476
  22. Gómez RL, & Lakusta L (2004). A first step in form-based category abstraction by 12-month-old infants. Developmental Science, 7(5), 567–580. doi: 10.1111/j.1467-7687.2004.00381.x
  23. Gómez R, & Maye J (2005). The developmental trajectory of nonadjacent dependency learning. Infancy, 7(2), 183–206. doi: 10.1207/s15327078in0702_4
  24. Gonzales K, Gerken L, & Gómez RL (2015). Does hearing two dialects at different times help infants learn dialect-specific rules? Cognition, 140, 60–71. doi: 10.1016/j.cognition.2015.03.015
  25. Graf Estes K, & Lew-Williams C (2015). Listening through voices: Infant statistical word segmentation across multiple speakers. Developmental Psychology, 51(11), 1517–1528. doi: 10.1037/a0039725
  26. Graf Estes K, Edwards J, & Saffran JR (2011). Phonotactic constraints on infant word learning. Infancy, 16(2), 180–197. doi: 10.1111/j.1532-7078.2010.00046.x
  27. Grosjean F (2001). The bilingual’s language modes. In Nicol JL (Ed.), One Mind, Two Languages: Bilingual Language Processing (pp. 1–22). Cambridge, MA: Blackwell.
  28. Houston DM, & Jusczyk PW (2000). The role of talker-specific information in word segmentation by infants. Journal of Experimental Psychology: Human Perception and Performance, 26(5), 1570–1582. doi: 10.1037/0096-1523.26.5.1570
  29. Idemaru K, & Holt LL (2011). Word recognition reflects dimension-based statistical learning. Journal of Experimental Psychology: Human Perception and Performance, 37(6), 1939–1956. doi: 10.1037/a0025641
  30. Johnson EK, & Tyler MD (2010). Testing the limits of statistical learning for word segmentation. Developmental Science, 13(2), 339–345. doi: 10.1111/j.1467-7687.2009.00886.x
  31. Johnson EK, Westrek E, Nazzi T, & Cutler A (2011). Infant ability to tell voices apart rests on language experience. Developmental Science, 14(5), 1002–1011. doi: 10.1111/j.1467-7687.2011.01052.x
  32. Jusczyk PW, Friederici AD, Wessels JMI, Svenkerud VY, & Jusczyk AM (1993). Infants’ sensitivity to sound patterns of native language words. Journal of Memory and Language, 32, 402–420. doi: 10.1006/jmla.1993.1022
  33. Jusczyk PW, Pisoni DB, & Mullennix J (1992). Some consequences of stimulus variability on speech processing by 2-month-old infants. Cognition, 43(3), 253–291. doi: 10.1016/0010-0277(92)90014-9
  34. Karuza EA, Li P, Weiss DJ, Bulgarelli F, Zinszer BD, & Aslin RN (2016). Sampling over nonuniform distributions: A neural efficiency account of the primacy effect in statistical learning. Journal of Cognitive Neuroscience, 28(10), 1484–1500. doi: 10.1162/jocn
  35. Kovács AM, & Mehler J (2009). Flexible learning of multiple speech structures in bilingual infants. Science, 325(5940), 611–2. doi: 10.1126/science.1173947
  36. Kuhl P, Stevens E, & Hayashi A (2006). Infants show a facilitation effect for native language phonetic perception between 6 and 12 months. Developmental Science, 9(2), F13–F21. doi: 10.1111/j.1467-7687.2006.00468.x
  37. Kuhl PK (1983). Perception of auditory equivalence classes for speech in early infancy. Infant Behavior and Development, 6, 263–285. doi: 10.1016/S0163-6383(83)80036-8
  38. Lew-Williams C, & Saffran JR (2012). All words are not created equal: Expectations about word length guide infant statistical learning. Cognition, 122(2), 241–246. doi: 10.1016/j.cognition.2011.10.007
  39. Mattys SL, Jusczyk PW, Luce PA, & Morgan JL (1999). Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology, 38(4), 465–494. doi: 10.1080/15475441.2013.876270
  40. Marcus G, Vijayan S, Rao S, & Vishton P (1999). Rule learning by seven-month-old infants. Science, 283(5398), 77–80. doi: 10.1126/science.283.5398.77
  41. May L, & Werker JF (2014). Can a click be a word?: Infants’ learning of non‐native words. Infancy, 19(3), 281–300. doi: 10.1111/infa.12048
  42. Maye J, Werker JF, & Gerken L (2002). Infant sensitivity to distributional information can affect phonetic discrimination. Cognition, 82(3), B101–11. doi: 10.1016/S0010-0277(01)00157-3
  43. Maye J, Aslin RN, & Tanenhaus MK (2008). The weckud wetch of the wast: Lexical adaptation to a novel accent. Cognitive Science, 32(3), 543–562. doi: 10.1080/03640210802035357
  44. Mazuka R, Hasegawa M, & Tsuji S (2014). Development of non-native vowel discrimination: Improvement without exposure. Developmental Psychobiology, 56(2), 192–209. doi: 10.1002/dev.21193
  45. Mintz TH (2002). Category induction from distributional cues in an artificial language. Memory & Cognition, 30(5), 678–686. doi: 10.3758/BF03196424
  46. Mitchel AD, & Weiss DJ (2010). What’s in a face? Visual contributions to speech segmentation. Language and Cognitive Processes, 25(4), 456–482. doi: 10.1080/01690960903209888
  47. Molnar M, Gervain J, & Carreiras M (2014). Within-rhythm class native language discrimination abilities of Basque-Spanish monolingual and bilingual infants at 3.5 months of age. Infancy, 19(3), 326–337. doi: 10.1111/infa.12041
  48. Nazzi T, Jusczyk PW, & Johnson EK (2000). Language discrimination by English-learning 5-month-olds: Effects of rhythm and familiarity. Journal of Memory and Language, 19, 1–19. doi: 10.1006/jmla.2000.2698
  49. Nazzi T, Mersad K, Sundara M, Iakimova G, & Polka L (2014). Early word segmentation in infants acquiring Parisian French: task-dependent and dialect-specific aspects. Journal of Child Language, 41(3), 600–633. doi: 10.1017/S0305000913000111
  50. Newport E, Gleitman H, & Gleitman LR (1977). Mother, I’d rather do it myself: Some effects and non-effects of maternal speech style. In Snow CE & Ferguson CA (Eds.), Talking to children: Language input and acquisition (pp. 109–149). Cambridge: Cambridge University Press.
  51. Ota M, & Skarabela B (2016). Reduplicated words are easier to learn. Language Learning and Development, 12(4), 380–397. doi: 10.1080/15475441.2016.1165100
  52. Onnis L, Monaghan P, Richmond K, & Chater N (2005). Phonology impacts segmentation in online speech processing. Journal of Memory and Language, 53(2), 225–237. doi: 10.1016/j.jml.2005.02.011
  53. Pacton S, & Perruchet P (2008). An attention-based associative account of adjacent and nonadjacent dependency learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(1), 80–96. doi: 10.1037/0278-7393.34.1.80
  54. Pelucchi B, Hay JF, & Saffran JR (2009). Statistical learning in a natural language by 8-month-old infants. Child Development, 80(3), 674–85. doi: 10.1111/j.1467-8624.2009.01290.x
  55. Perruchet P, Poulin-Charronnat B, Tillmann B, & Peereman R (2014). New evidence for chunk-based models in word segmentation. Acta Psychologica, 149, 1–8. doi: 10.1016/j.actpsy.2014.01.015
  56. Petitto LA, Berens MS, Kovelman I, Dubins MH, Jasinska K, & Shalinsky M (2012). The “Perceptual Wedge” hypothesis as the basis for bilingual babies’ phonetic processing advantage: New insights from fNIRS brain imaging. Brain and Language, 121(2), 130–143. doi: 10.1016/j.bandl.2011.05.003
  57. Piazza EA, Iordan MC, & Lew-Williams C (2017). Mothers consistently alter their unique vocal fingerprints when communicating with infants. Current Biology, 27(20), 3162–3167. doi: 10.1016/j.cub.2017.08.074
  58. Potter CE, & Saffran JR (2015). The role of experience in children’s discrimination of unfamiliar languages. Frontiers in Psychology, 6, 1587. doi: 10.3389/fpsyg.2015.01587
  59. Potter CE, & Saffran JR (2017). Exposure to multiple accents supports infants’ understanding of novel accents. Cognition, 166, 67–72. doi: 10.1016/j.cognition.2017.05.031
  60. Quam C, Knight S, & Gerken L (2017). The distribution of talker variability impacts infants’ word learning. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1), 1–27.
  61. Rabagliati H, Ferguson B, & Lew-Williams C (in press). The profile of abstract rule learning in infancy: Meta-analytic and experimental evidence. Developmental Science.
  62. Richardson DC, & Kirkham NZ (2004). Multimodal events and moving locations: eye movements of adults and 6-month-olds reveal dynamic spatial indexing. Journal of Experimental Psychology: General, 133(1), 46–62. doi: 10.1037/0096-3445.133.1.46
  63. Ronjat J (1913). Le développement du langage observé chez un enfant bilingue. Paris: Champion.
  64. Rost G, & McMurray B (2009). Speaker variability augments phonological processing in early word learning. Developmental Science, 12(2), 339–349. doi: 10.1111/j.1467-7687.2008.00786.x
  65. Saffran JR, & Kirkham NZ (2018). Infant statistical learning. Annual Review of Psychology, 69, 2.1–2.23. doi: 10.1146/annurev-psych-122216-011805
  66. Saffran JR, Aslin RN, & Newport EL (1996). Statistical learning by 8-month-old infants. Science, 274(5294), 1926–1928. doi: 10.1126/science.274.5294.1926
  67. Saffran JR, Pollak SD, Seibel RL, & Shkolnik A (2007). Dog is a dog is a dog: Infant rule learning is not specific to language. Cognition, 105(3), 669–80. doi: 10.1016/j.cognition.2006.11.004
  68. Sahni SD, Seidenberg MS, & Saffran JR (2010). Connecting cues: overlapping regularities support cue discovery in infancy. Child Development, 81(3), 727–36. doi: 10.1111/j.1467-8624.2010.01430.x
  69. Sebastián-Gallés N, Albareda-Castellot B, Weikum WM, & Werker JF (2012). A bilingual advantage in visual language discrimination in infancy. Psychological Science, 23(9), 994–999. doi: 10.1177/0956797612436817
  70. Singh L, Morgan JL, & White KS (2004). Preference and processing: The role of speech affect in early spoken word recognition. Journal of Memory and Language, 51(2), 173–189. doi: 10.1016/j.jml.2004.04.004
  71. Thiessen ED, & Saffran JR (2007). Learning to learn: Infants’ acquisition of stress-based strategies for word segmentation. Language Learning and Development, 3(1), 73–100. doi: 10.1080/15475440709337001
  72. Toscano JC, & McMurray B (2010). Cue integration with categories: Weighting acoustic cues in speech using unsupervised learning and distributional statistics. Cognitive Science, 34(3), 434–464. doi: 10.1111/j.1551-6709.2009.01077.x
  73. Tummeltshammer KS, & Kirkham NZ (2013). Learning to look: Probabilistic variation and noise guide infants’ eye movements. Developmental Science, 16(5), 760–771. doi: 10.1111/desc.12064
  74. Van Den Bos E, Christiansen MH, & Misyak JB (2012). Statistical learning of probabilistic nonadjacent dependencies by multiple-cue integration. Journal of Memory and Language, 67(4), 507–520. doi: 10.1016/j.jml.2012.07.008
  75. Weisleder A, & Fernald A (2013). Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychological Science, 24, 2143–2152. doi: 10.1177/0956797613488145
  76. Weiss D, Gerfen C, & Mitchel A (2009). Speech segmentation in a simulated bilingual environment: A challenge for statistical learning? Language Learning and Development, 5(1), 30–49. doi: 10.1080/15475440802340101
  77. Werker JF, & Lalonde CE (1988). Cross-language speech perception: Initial capabilities and developmental change. Developmental Psychology, 24(5), 672. doi: 10.1037/0012-1649.24.5.672
  78. Werker JF, & Tees R (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7(1), 49–63. doi: 10.1016/S0163-6383(84)80022-3
