Eighteen-month-olds selectively generalize words from accurate speakers to novel contexts

Elena Luchkina; David M Sobel; James L Morgan

doi:10.1111/desc.12663

Author manuscript; available in PMC: 2019 Nov 1.

Published in final edited form as: Dev Sci. 2018 Mar 22;21(6):e12663. doi: 10.1111/desc.12663

PMCID: PMC6151175 NIHMSID: NIHMS945005 PMID: 29569386

Abstract

The present studies examine whether and how 18-month-olds use informants' accuracy to acquire novel labels for novel objects and generalize them to a new context. In Experiment 1, two speakers made statements about the labels of familiar objects. One used accurate labels and the other used inaccurate labels. One of these speakers then introduced novel labels for two novel objects. At test, toddlers saw those two novel objects and heard an unfamiliar voice say one of the labels provided by the speaker. Only toddlers who had heard the novel labels introduced by the accurate speaker looked at the appropriate novel object above chance. Experiment 2 explored possible mechanisms underlying this difference in generalization. Rather than making statements about familiar objects' labels, both speakers asked questions about the objects' labels, with one speaker using accurate labels and the other using inaccurate labels. Toddlers' generalization of novel labels for novel objects was at chance for both speakers, suggesting that toddlers do not simply associate hearing the accurate label with the reliability of the speaker. We discuss these results in terms of potential mechanisms by which children learn and generalize novel labels across contexts from speaker reliability.

Because the relation between how a word sounds and what a word means is arbitrary and conventional (Saussure, 1916/1966), children must learn the meanings of words from other competent language users. Several studies have found that preschoolers use speakers' history of accuracy in labeling familiar objects when they learn the meanings of novel words. For example, Koenig and colleagues (e.g., Clément, Koenig & Harris, 2004; Koenig, Clément, & Harris, 2004; Koenig & Harris, 2005) have shown that 4-year-olds identified an informant who was accurate in labeling familiar objects as knowledgeable about the meaning of words, and predicted that she would be more likely to know the name of a novel object than an informant who was always inaccurate at labeling familiar objects.

Even before children reach preschool age, they exhibit an ability to track interlocutors' competence or accuracy and use it in learning situations. For example, Zmyj, Buttelmann, Carpenter and Daum (2010) showed that 14-month-olds are less likely to imitate agents' novel actions when those agents use familiar objects incompetently (for example, putting a shoe on their hand instead of foot) than when agents use familiar objects competently. Similarly, 12-month-olds track informants' competence and use it in their own exploratory behavior (Stenberg, 2013). Even 8-month-old infants have the capacity to monitor the reliability of a potential informant's gaze and use it in planning their responses about whether interesting events will appear (Tummeltshammer, Wu, Sobel, & Kirkham, 2014).

Despite converging evidence from investigations of infants' accuracy judgments in these non-verbal domains, studies examining younger children's learning of novel labels have reported conflicting evidence. Koenig and Woodward (2010) showed that 2-year-olds differentiate between speakers with different histories of accuracy when those speakers provided contrasting information. Brooker and Poulin-Dubois (2013) tested even younger children; they showed that 18-month-olds were more likely to learn words and actions from an informant who labeled familiar objects accurately than from a person who did so inaccurately. Krogh-Jespersen and Echols (2012), however, found that 2-year-olds accepted speakers' labels for novel objects regardless of whether those informants were previously accurate, inaccurate, or ignorant about labels for familiar objects.

In light of these conflicting findings, the present study had two primary goals. The first was to broaden the finding that 18-month-olds can use speakers' accuracy to make inferences about novel objects' labels. In particular, we were concerned with the generalization of word learning, in this case from a social context to a less social context with a novel, disembodied voice. A characteristic of most of the studies cited above is that a familiar individual asks for the target object using a novel label during test¹. While toddlers may form an association between the utterance, the speaker, and the object, it is not clear whether they acquire generalizable knowledge of object labels. In our work, we aimed to resolve two potential complications that result from using the same speaker during novel label training and test.

First, seeing the speaker who provided the novel labels might serve as contextual support (Hendrickson & Sundara, 2017) and trigger object-label-speaker associations that toddlers may have formed during the training. Henderson, Graham, and Schell (2015) showed that 24-month-olds are open to learning novel labels from unreliable sources if they expect them to have in-context relevance. To avoid the possibility that toddlers in our study demonstrated context-specific knowledge of novel labels, we did not use the image or the voice of the speakers who provided the labels to test toddlers' knowledge and rather had toddlers hear that label spoken by a novel voice.

Second, word learning does not merely entail learning that specific phonetic forms are associated with specific referents. Phonological representations may be manifested by varying phonetic forms with slightly different qualities produced by the same voice or by different voices. Toddlers might be more likely to associate tokens uttered by the same voice with an object, as opposed to making a more general inference that the object has a label that is phonologically invariant (though phonetically variable) across voices. By using an unfamiliar voice during the test phase, we aimed to ensure that toddlers demonstrate generalizable knowledge of novel labels. Experiment 1 was designed to test whether 18-month-olds selectively learn generalizable-across-contexts labels from reliable speakers.

The second primary goal of this study was to investigate the mechanisms behind selective social learning in 18-month-olds. Regarding the mechanisms underlying social learning, one possibility is that toddlers simply recognize the associations between label accuracy and the speaker who provided those labels. They may then extend those associations to the novel labels provided by the same speaker. Recently, Heyes (2015) has suggested that selective social learning is best explained by these kinds of asocial mechanisms. While her arguments were specific to copying behaviors, one could imagine similar arguments extending to word learning. Indeed, Jaswal, Croft, Setia, and Cole (2010, see also Mills, 2013) speculated that children have a “highly robust bias to trust what people—particularly visible speakers—say.” (p. 1541). That initial bias might be based on a strong learned association that verbal statements are accurate. The evidence that 2-year-olds do not learn from individuals selectively (Krogh-Jespersen & Echols, 2012) is consistent with this argument. Similarly, such an approach is consistent with research suggesting that “dumb attentional” associative mechanisms could govern conceptual development and category-based inference in preschoolers (e.g., Sloutsky & Fisher, 2004; Smith, Jones, & Landau, 1996).

While associative accounts can be instrumental in explaining early forms of word learning and social inference, it is not clear whether they can account for complex forms of selective social learning. Golinkoff and Hirsh-Pasek (2006), citing empirical evidence from Hollich, Hirsh-Pasek, and Golinkoff (2000), argued that while children's word learning might be governed by associations earlier in development, by 18 months they rely on a combination of cues, including social information, such as referential intent (see also Bloom, 2000). In this case, toddlers should move beyond simple associations among labels, objects, and speakers and incorporate additional information into their inferences concerning whether a novel label for a novel object should be generalized. We considered this in Experiment 2, by replicating the procedure of Experiment 1 with one critical difference. Rather than making assertions about the labels of familiar objects during the familiarization phase of the experiment, the two speakers asked questions about the objects' names. One asked a question about an object using its actual label (e.g., asking “is this a book?” while holding a book); the other used an inappropriate label (e.g., asking “is this a cup?” while holding a ball). By using questions, speakers could present the same accurate or inaccurate label information, while not indicating whether they possessed different knowledge about object labels.

If toddlers are simply recognizing associations among labels, objects, and speakers to make reliability inferences, one might expect similar distinctions between the speakers in Experiments 1 and 2 – toddlers might believe that when the speaker who used the accurate label in her question makes a statement about the novel label for a novel object, she is more reliable than the speaker who used the inaccurate label in her question. In contrast, if toddlers integrate some understanding of epistemic states into their inferences, they should show no difference between the likelihood to generalize either speaker's novel label. Further, because questions have no truth-values to them, toddlers might be likely to consider both speakers equally reliable or equally unreliable as sources of novel information.

Experiment 1

Experiment 1 presented an extension of Brooker and Poulin-Dubois (2013), aimed at determining whether toddlers can use a speaker's past accuracy to guide their generalization of novel labels for novel objects introduced by the speaker. Toddlers first watched a video in which two speakers sat at a table and named different familiar objects. One speaker labeled a set of four objects accurately; the other speaker labeled four different objects inaccurately. Toddlers then saw either the accurate or the inaccurate speaker (between-subject) label two novel objects with novel labels (one at a time).

At test, toddlers saw trials in which a pair of familiar objects or a pair of novel objects were presented on a TV screen. Toddlers heard an audio-recording of an unfamiliar voice utter a label consistent with one of those objects (i.e., “dog” or “ball” if presented with a dog and a ball on the familiar trials and “lif” or “neem” on the novel trials). We recorded toddlers' looking time to the target referents of the labels. We expected that during the test, toddlers would look equally long to the correct objects corresponding to familiar labels, regardless of which speaker labeled the novel objects. However, if toddlers distinguish between the accurate and inaccurate speakers, they should look longer to the correct referents of the novel labels only if the accurate speaker introduced them.

We wish to note two important points about the methodology. First, presenting toddlers with an unfamiliar voice at test critically tests the extent to which they generalize the novel label to that novel object when spoken by different voices. In most experiments examining toddlers' selective word learning (e.g., Brooker & Poulin-Dubois, 2013; Koenig & Woodward, 2010, Experiment 1; Krogh-Jespersen & Echols, 2012) the test utterance is produced by a familiar speaker. Thus, these experiments do not necessarily show that toddlers generalize the meaning of that label to novel contexts (i.e., across speakers). This methodological change also had the benefit of ensuring that test trials were identical for toddlers in all conditions, a control absent from many previous studies.

Second, instead of a forced-choice procedure, toddlers in both experiments were tested in an intermodal preferential looking procedure. This method presented toddlers with a more continuous way to respond to the test question, in which they could weigh responses to both options (as opposed to simply choosing a referent as in previous studies, see Golinkoff, Ma, Song, & Hirsh-Pasek, 2013). Given that only one study (Brooker & Poulin-Dubois, 2013) showed selective learning of novel words in 18-month-olds, we wanted to extend these findings using a measure that was potentially more sensitive.

Method

Participants

Forty 18-month-olds (M = 18.07 months, SD = 0.43 months, age range: 16.99-19.02 months; 23 female, 17 male) participated in the study. Participants were recruited from public birth records in an urban area in the Northeastern United States. All were born full-term to monolingual English-speaking families and had no known developmental or hearing disabilities. Families were compensated with a toy, book, or t-shirt at each visit. Additional participants were tested but excluded from the analyses because of hardware malfunction (n = 2), crying or fussiness (n = 6), loss of attention to the task (n = 5), and impaired vision (n = 2). We did not collect race, ethnicity, and SES information from participants' parents, but most families were Caucasian and represented middle to upper SES backgrounds.

Apparatus

Toddlers watched visual stimuli presented on a screen (Dell 19″ P190S flat screen monitor) mounted on a pegboard approximately 90 cm in front of the seat. Centered behind the pegboard was a Sentry 110A monitor speaker that played the auditory stimuli at a conversational level (75 dB). A video camera (Panasonic WV-BP330 CCTV camera) was situated below the screen, so that the experimenter in the remote control room could record experimental sessions.

Materials

Familiarization and test stimuli included two sets of four familiar objects and two novel objects. One set of four objects included a porcelain cup, a picture book, a stuffed cat, and a rubber ball. The other set of objects included a stuffed dog, a brown shoe, a feeding bottle with a nipple, and a cardboard star painted gold. The objects were chosen to have labels familiar to 18-month-olds, according to the MacArthur CDI lexical norms (Fenson et al., 1993). The labels are listed among 200 most frequently produced and understood words at 18 months in the UCSD Lexical Developmental Norms Database (Dale & Fenson, 1996). Novel objects were a green drain cover and a yellow tarpaulin clip (Figure 1).

Participants' parents were asked to complete a survey about their child's receptive and productive vocabulary and visual familiarity with object referents of the words used during familiarization. This was included to ensure that participants were familiar with all labels used during the familiarization phase. Parents also were asked to report how many hours their toddlers were exposed to TV and Skype (or similar software) in a typical week. We collected these data to ensure that all participants had had experience with electronic media and that their looking behavior would not be confounded by the novelty of the experience.

Procedure

Toddlers were seated in a testing room on the lap of a caregiver who wore active noise reduction headphones playing music to mask the auditory stimuli. The procedure consisted of three phases: familiarization, novel label training, and test (see Table 1). During 3-second intervals between the phases and between different test trials toddlers saw a dark screen. The familiarization phase began with an attention grabber that appeared several times at the areas of the screen where the two speakers appeared later during familiarization.

Table 1. Phases of Experiment 1.

Phase:	Familiarization (8 labeling events: 4 accurate and 4 inaccurate)	Novel Label Training (4 labeling events: 2 per novel object)	Test (2 novel object trials and 2 familiar object trials)
Description:	The accurate speaker and the inaccurate speaker sit at the same table and take turns to present familiar objects, 4 objects each. Only one object is present on the table during a labeling event. Each object only receives one label. The presenting speaker alternates between looking at the camera and at the object during a labeling event. The other speaker looks at the camera with a neutral expression.	Either the accurate speaker or the inaccurate speaker sits at the table and presents novel objects. The speaker alternates between looking at the camera and at the object she is presenting.	Two objects (both familiar or both novel) are presented on the screen against a black background. An unfamiliar novel voice asks the participant to look at one of the objects.
Example:	Accurate speaker [presents a cup]: Look, a cup! Cup! Inaccurate speaker [presents a toy cat]: Look, a book! Book!	Speaker [presents a novel object 1]: Look, a neem! Neem! Speaker [presents novel object 2]: Look, a lif! Lif!	Familiar object trial [a toy dog and a ball appear on the screen]: Look, a dog! Where is the dog? Novel object trial [the two novel objects presented earlier appear on the screen]: Look, a lif! Where is the lif?
Labels used:	Set 1: Ball, Cat, Shoe, Star Set 2: Cup, Book, Bottle, Dog	Lif, Neem

Toddlers saw an orange circle blinking first on the left and then on the right side of the screen at locations matched with the locations of the faces of the two female informants in the familiarization video. This was done to help coders correctly identify the direction of participants' gaze during the rest of the trials. Following the attention grabber, participants saw a video in which the two informants sat at a table with an opaque black cloth on it, in front of an opaque black curtain. Two female speakers were chosen to be similar in appearance to minimize potential artifacts in the looking time data. The speakers were the same race, similar in age, had similar hairstyles, and wore similar college insignia sweatshirts of distinct colors (to ensure that toddlers could easily distinguish between them). The speakers took turns bringing out one object, looking at it, and labeling it. One speaker (the accurate speaker) always labeled the objects correctly (e.g., while displaying a star, “This is a star. Star.”). The other speaker (the inaccurate speaker) always labeled the objects inaccurately (e.g., while displaying a book, “This is a cat. Cat.”). Each speaker labeled a unique set of four objects (eight labeling events in total; see Table 1). This way, only one object was presented at a time, only one speaker labeled it, and each object received only one label. The locations of the object sets, the identity and location of the accurate speaker, and which speaker began showing the objects first were counterbalanced across participants.

After familiarization, the novel label training began. Half of the participants saw the accurate speaker while the other half saw the inaccurate speaker introduce novel labels for two novel objects. Participants were randomly assigned to the accurate or inaccurate speaker condition. During this phase of the procedure, one of the speakers sat at the table. She showed and labeled two novel objects one at a time. The novel labels were nonsense words – “neem” and “lif” (e.g., “Look, a lif! Lif.”). We chose the same nonsense CVC words used by Werker, Cohen, Lloyd, Casasola, and Stager (1998) to ensure that the two labels were phonetically distinct yet neither label was more acoustically salient than the other one. This choice of labels allowed us to minimize potential confounding effects of label salience. The novel label training was presented twice.

Following the novel label training, toddlers were tested using an Intermodal Preferential Looking Procedure. There were four test trials. In two, participants saw a photograph of a pair of familiar objects (a ball and a dog) on a TV screen. In the other two, they saw a photograph of the pair of novel objects that were labeled during the novel label training. After the photograph was presented for 1 second, participants heard the phrase “Look, a [label]! Where is the [label]?” with a minimal pause between the two utterances. In the familiar object trials, the label was either “ball” or “dog”. One of these labels was uttered by the accurate speaker and the other was uttered by the inaccurate speaker during familiarization. Due to the counterbalancing scheme, 50% of the time “ball” was uttered by the inaccurate speaker and 50% of the time “dog” was uttered by the inaccurate speaker.

In the novel object trials, the label was either “lif” or “neem”. Utterances in the test trials were provided by a novel voice that did not belong to either the accurate or the inaccurate speaker.

While the novel object trials were the critical experimental trials, the familiar object trials were included as controls to ensure that participants understood the nature of the task, as their knowledge of the meaning of these words should be independent of hearing either the accurate or inaccurate speaker label novel objects (i.e., performance on these trials should indicate mapping of the label to the object in both conditions). Trials were presented in the following order: familiar objects, novel objects, familiar objects, novel objects. The location of the objects was counterbalanced between the trials as well as across participants. During familiarization and novel label training the speakers alternated between looking to the object to cue the object referents of the labels provided and looking to the camera to emulate eye contact with the viewer.

Data Coding and Analysis

Toddlers' looking behavior was measured during all three phases. The between-subjects independent variable was speaker accuracy during the training phase – whether the accurate or the inaccurate speaker introduced and labeled the novel objects. The dependent measure was the average proportion of looking time to the target object out of the total combined looking time to the target and the distractor objects during a 3-second window (thirty 100-ms bins) starting 400 ms after the onset of the first mention of the label (to allow toddlers enough time to react to the label heard, see Swingley & Aslin, 2000). The 3-second interval was chosen based on overall looking patterns on familiar test trials. On average, toddlers tended to look at the target object the most during the window of analysis and then lost interest and looked at the distractor object or looked away. Figure 2 depicts the proportion of toddlers' looking time at the target from the onset of the first mention of the target word to the end of the window of analysis. All data were coded offline frame by frame (1 frame = 33.33 ms) using SuperCoder software (Hollich, 2005). Proportions of looking time to the target objects were calculated for 100-ms bins. A subset of videos (10%) was coded by a second coder. The intercoder reliability was 97.80% (Cohen's Kappa = .932) and was calculated based on the exact match of the frames between the coders. For familiarization and test trials, the data from the time intervals during which toddlers did not look at either object were excluded from analyses.
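The frame-by-frame reliability figures reported above (percent agreement and Cohen's Kappa) can be illustrated with a minimal sketch. The gaze codes ("L", "R", "A") and the two short code sequences are hypothetical and chosen only for illustration; they are not the study's data, and the authors' exact computation may have differed.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of frames on which both coders assigned the same gaze code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of frames")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    # Chance agreement: probability that both coders pick the same category
    # if each coded independently at their own base rates.
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # degenerate case: both coders used a single category
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical frame codes: "L" = left object, "R" = right object, "A" = away.
coder_1 = list("LLLLRRRRAA")
coder_2 = list("LLLRRRRRAA")

print(f"agreement = {percent_agreement(coder_1, coder_2):.2%}")
print(f"kappa = {cohens_kappa(coder_1, coder_2):.3f}")
```

Kappa is lower than raw agreement because some matching frames would be expected by chance alone, which is why both values are typically reported together.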

Figure 2. Experiment 1: Proportion of looking time to the target object (smoothed) from the onset of the target word. The window of analysis extended from 400 ms to 3400 ms post onset.
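As a rough illustration of the dependent measure, the sketch below computes the proportion of target looking within the 400–3400 ms analysis window from frame-level gaze codes. The codes "T", "D", and "A" and the function name are hypothetical. The sketch also assumes frames are pooled across the whole window, whereas the study computed proportions in 100-ms bins, so the exact aggregation may have differed.

```python
FRAME_MS = 1000 / 30  # one video frame, about 33.33 ms


def target_proportion(frame_codes, label_onset_ms,
                      start_offset_ms=400, window_ms=3000):
    """Proportion of target looking within the analysis window.

    frame_codes holds one code per video frame:
    "T" (target), "D" (distractor), or "A" (away / neither object).
    Returns None when the child looked at neither object in the window.
    """
    window_start = label_onset_ms + start_offset_ms
    window_end = window_start + window_ms
    target = distractor = 0
    for i, code in enumerate(frame_codes):
        frame_time = i * FRAME_MS
        if not (window_start <= frame_time < window_end):
            continue
        if code == "T":
            target += 1
        elif code == "D":
            distractor += 1
        # "A" frames are excluded from the denominator.
    total = target + distractor
    return target / total if total else None


# Hypothetical 5-second trial (150 frames) with the label starting at 0 ms:
# 60 target frames and 20 distractor frames fall inside the window -> 0.75
codes = ["A"] * 20 + ["T"] * 60 + ["D"] * 20 + ["A"] * 50
print(target_proportion(codes, label_onset_ms=0))
```

Excluding "away" frames from the denominator mirrors the statement that intervals in which toddlers looked at neither object were dropped from the analyses.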

Results

Based on parental report, participants knew on average 6.73 of 8 words used in familiarization. Thirty-seven out of 40 toddlers knew both familiar words – dog and ball – that were used during the test. The remaining three toddlers knew one of the words.

We analyzed the effects of counterbalancing factors, toddlers' verbal competence, media exposure reported by parents, and gender to ensure that individual differences between participants and biases potentially arising from the experimental design did not explain the observed differences in toddlers' looking behavior across the conditions. Mann-Whitney tests showed no significant effects, all Z-scores ≤ 1.58, all p-values ≥ .18. These variables were not considered in any further analyses.

Looking behavior during the familiarization and novel label training phases was analyzed separately to ensure that toddlers were equally attentive when both accurate and inaccurate speakers presented the objects (see Table 2). For familiarization trials, the numbers in Table 2 represent proportions of looking time at the target object and/or the presenting speaker of the total looking time at the screen, excluding looking away. Because there were no stimuli that would distract the child from the target object and the speaker during the novel label training trials, the numbers represent the proportions of looking time at the target object and/or the presenting speaker of the total duration of the trial. Mann-Whitney tests showed no significant differences in looking time during familiarization or novel label training phases, all Z-scores ≤ 1.08, all p-values ≥ .31. Moreover, inspection of Table 2 shows that the majority of the time, toddlers attended to the presenting speaker.

Table 2. Proportion of Looking Time to Speakers During Familiarization and Novel Label Training.

Experiment	Phase	Accurate Speaker Condition, M (SD)	Inaccurate Speaker Condition, M (SD)	p-value
Experiment 1	Familiarization – Inaccurate Speaker	.96 (.04)	.96 (.05)	.55
	Familiarization – Accurate Speaker	.98 (.03)	.96 (.05)	.31
	Label 1 (“Neem”) Training	.99 (.03)	.98 (.05)	.69
	Label 2 (“Lif”) Training	.99 (.05)	.96 (.08)	.95
	Test – Familiar objects	.63 (.11)	.63 (.11)	.90
	Test – Novel objects	.63 (.13)	.47 (.18)	.007
Experiment 2	Familiarization – Inaccurate Speaker	.97 (.04)	.96 (.05)	.17
	Familiarization – Accurate Speaker	.98 (.03)	.96 (.05)	.37
	Label 1 (“Neem”) Training	.99 (.03)	.98 (.05)	.70
	Label 2 (“Lif”) Training	.99 (.05)	.97 (.08)	.95
	Test – Familiar objects	.65 (.13)	.69 (.15)	.12
	Test – Novel objects	.49 (.19)	.39 (.19)	.55

We conducted an omnibus ANOVA with the proportion of looking time to the target object as a dependent measure, condition (accurate vs. inaccurate speaker during the novel label training) as a between-subject factor, and trial type (novel vs. familiar objects) as a within-subject factor. There was a significant effect of trial type, F(1, 38) = 5.68, p = .022, η_p² = .13, with toddlers looking longer on average to familiar (M = 63.53%) than to novel (M = 56.24%) objects. There was also a significant effect of condition, F(1, 38) = 4.66, p = .037, η_p² = .11, with toddlers looking longer to the target objects in the accurate speaker condition (M = 63.35%) than in the inaccurate speaker condition (M = 56.41%), and a significant interaction between condition and trial type, F(1, 38) = 5.87, p = .02, η_p² = .13.

Independent-samples two-tailed t-tests were used to analyze differences in the proportions of looking time to the novel and familiar target objects across conditions. As Figure 3 illustrates, toddlers in the accurate speaker condition looked to the novel target objects significantly more than toddlers in the inaccurate speaker condition, t(38) = -2.85, p = .007, d = 0.90 (accurate speaker M = 63.41%, SD = 13.71; inaccurate speaker M = 47.07%, SD = 17.86). There was no significant difference in looking to the familiar target objects between the accurate and inaccurate conditions, M = 63.29%, SD = 11.67 vs. M = 63.76%, SD = 11.26, t(38) = .127, p = .90.

Figure 3. Proportion of looking time to the target object during 3 s, starting from 400 ms after the first mention of the target label.

We further examined the proportion of looking time to the target objects with respect to chance (50%) using two-tailed one-sample t-tests. On the familiar object trials, toddlers looked significantly above chance in both accurate and inaccurate speaker conditions, t(19) = 4.81, p < .001, d = 1.14 and t(19) = 5.46, p < .001, d = 1.22. In contrast, on the novel object trials, only toddlers in the accurate speaker condition looked significantly above chance, t(19) = 4.37, p < .001, d = 0.97. Toddlers in the inaccurate speaker condition did not look at the target objects significantly above chance, t(19) = -0.23, p = .82.
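To make the comparison against chance concrete, the sketch below recomputes a one-sample t statistic and Cohen's d from summary statistics. It assumes n = 20 per condition (implied by the 19 degrees of freedom) and uses the rounded mean and SD reported for the accurate speaker condition on novel object trials. The function name is hypothetical, and the original analysis was presumably run on the raw data rather than on these summaries.

```python
import math


def one_sample_t_from_summary(mean, sd, n, mu0=50.0):
    """Return (t, df, cohens_d) for a one-sample t-test against mu0,
    computed from a sample mean, sample SD, and sample size."""
    standard_error = sd / math.sqrt(n)
    t = (mean - mu0) / standard_error
    d = (mean - mu0) / sd  # standardized distance from chance
    return t, n - 1, d


# Accurate speaker condition, novel object trials (values from the text;
# n = 20 is inferred from the reported degrees of freedom).
t, df, d = one_sample_t_from_summary(mean=63.41, sd=13.71, n=20)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

With these inputs the t statistic comes out at about 4.37, in line with the reported value; the effect size comes out at about 0.98 rather than the reported 0.97, a gap that could simply reflect rounding of the summary statistics.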

Discussion

Using a continuous measure – toddlers' looking time in an intermodal preferential looking procedure (IPLP) – we showed that 18-month-olds were more likely to map novel labels to novel objects and generalize them to a novel context when they learned those labels from informants who previously labeled familiar objects accurately rather than inaccurately. Our results extend the findings by Brooker and Poulin-Dubois (2013), who showed that toddlers responded differently to a disambiguation paradigm when labels were presented by accurate versus inaccurate speakers. Critically, toddlers made a similar inference here when they heard an unfamiliar voice during the test trial. This ensured that all toddlers were exposed to exactly the same stimuli during test trials and that they were generalizing newly learned labels to novel social conditions.

Importantly, during familiarization the two speakers were presented on the screen side by side. Vanderbilt, Heyman and Liu (2014) examined how such conflicting information affected 3- and 4-year-olds' use of speaker accuracy. They showed that if a conflict was presented at test (the accurate and inaccurate speakers provided different novel labels for the same novel object), preschoolers used speakers' accuracy. Without such a conflict (i.e., if speakers only conflicted during familiarization), preschoolers trusted the accurate and inaccurate speakers' subsequent novel labels equally. In the present study, the speakers never contrasted labels for objects at either stage of the procedure, but there was a contrast between an accurate and inaccurate speaker during familiarization before the child heard novel labels for novel objects from only one speaker. Experiment 1 suggests that this subtler contrast – simply seeing accurate and inaccurate informants together, even if they never provide conflicting information – was sufficient for our younger children (and possibly the older children investigated by Vanderbilt et al., 2014) to trust the informants selectively.

The results of Experiment 1 provide evidence that toddlers possess the ability to track informants' accuracy and use such information for subsequent inferences about the meaning of novel labels. This account is consistent with various findings demonstrating infants' selective trust of informants in domains other than word learning (e.g., Poulin-Dubois, Brooker & Polonia, 2011; Stenberg, 2013; Tummeltshammer et al., 2014; Zmyj et al., 2010). However, these results leave open the question of the mechanisms underlying toddlers' reliability judgments. One possibility is that toddlers generalize the association between a speaker's identity and the accuracy of the labels she provides for familiar objects to all new information provided by the same speaker. This mechanism does not involve judgments about speakers' epistemic states and is based on superficial associative information. Alternatively, toddlers might use a speaker's utterances to make inferences regarding her knowledge of object names and, hence, her competence in naming. Southgate, Chevallier, and Csibra (2010) have demonstrated that at 17 months toddlers are already sensitive to informants' epistemic states and rely on that information in novel object naming contexts. It is possible that toddlers also incorporate inferences about epistemic states into their reliability judgments. Rather than relying on associative information alone, toddlers might integrate it with judgments of epistemic knowledge.

To investigate the mechanisms underlying toddlers' reliability judgments, in Experiment 2, we replaced statements about object names during familiarization with questions mentioning the same labels. Infants as young as 15 months old understand subject questions (Seidl, Hollich, & Jusczyk, 2003) and infants as young as 7 months are sensitive to prosodic cues that indicate the difference between questions and statements (Soderstrom, Ko, & Nevzorova, 2011). Liszkowski, Carpenter and Tomasello (2008) showed that 12-month-olds understood the pragmatics of questions based on the speaker's access to information and responded to her questions appropriately. Shatz (1978) similarly showed that toddlers can comprehend direct requests for information (e.g., “Is this an apple?”) as well as indirect requests for actions (“Can you jump?”); toddlers respond in an appropriate manner to both types of questions. These findings suggest that young children have an ability to appreciate pragmatic contexts of questions, make inferences about askers' intentions, and generate responses accordingly. In Experiment 2, the speakers were instructed to emulate genuine questions, with facial expressions and intonations signaling a request for information rather than rhetorical questions.

Such a method potentially allows us to distinguish between these two mechanisms for selective learning. If toddlers judge speaker accuracy on the basis of whether they hear a speaker provide appropriate label-object pairs, then one would expect similar responses here as in Experiment 1. In contrast, if toddlers understand that questions, unlike statements, lack truth-values, then questions will indicate the same level of accuracy regardless of whether they contain the appropriate or inappropriate label-object pairings.

Experiment 2

In Experiment 2, we consider whether toddlers are attending to speakers' epistemic knowledge as opposed to a simple label-object-speaker association. We replicated the procedure of Experiment 1 with one modification: during familiarization, the two speakers asked questions about labels of objects familiar to toddlers instead of making statements about the objects' labels; all other aspects of the procedure were the same.