Author manuscript; available in PMC: 2013 Sep 1.
Published in final edited form as: Dev Sci. 2012 Jun 18;15(5):659–673. doi: 10.1111/j.1467-7687.2012.01168.x

Language input and acquisition in a Mayan village: how important is directed speech?

Laura A Shneidman 1, Susan Goldin-Meadow 1
PMCID: PMC3538130  NIHMSID: NIHMS426164  PMID: 22925514

Abstract

Theories of language acquisition have highlighted the importance of adult speakers as active participants in children’s language learning. However, in many communities children are reported to be directly engaged by their caregivers only rarely (Lieven, 1994). This observation raises the possibility that these children learn language from observing, rather than participating in, communicative exchanges. In this paper, we quantify naturally occurring language input in one community where directed interaction with children has been reported to be rare (Yucatec Mayan). We compare this input to the input heard by children growing up in large families in the United States, and we consider how directed and overheard input relate to Mayan children’s later vocabulary. In Study 1, we demonstrate that 1-year-old Mayan children do indeed hear a smaller proportion of total input in directed speech than children from the US. In Study 2, we show that for Mayan (but not US) children, there are great increases in the proportion of directed input that children receive between 13 and 35 months. In Study 3, we explore the validity of using videotaped data in a Mayan village. In Study 4, we demonstrate that word types directed to Mayan children from adults at 24 months (but not word types overheard by children or word types directed from other children) predict later vocabulary. These findings suggest that adult talk directed to children is important for early word learning, even in communities where much of children’s early language input comes from overheard speech.

Introduction

Literature examining the relation between early language input and later language acquisition demonstrates that the quality and amount of speech directed from the mother to the child is a strong predictor of the child’s later linguistic competencies (e.g. Barnes, Gutfreund, Satterly & Wells, 1983; Hart & Risley, 1995; Hoff, 2003; Hoff & Naigles, 2002; Huttenlocher, Haight, Bryk, Seltzer & Lyons, 1991; Huttenlocher, Vasilyeva, Cymerman & Levine, 2002; Huttenlocher, Vasilyeva, Waterfall, Vevea & Hedges, 2007). Moreover, mothers’ active involvement in monitoring children’s attentional focus has been found to facilitate early vocabulary development (e.g. Bornstein, Tamis-LeMonda & Haynes, 1999; Carpenter, Nagell & Tomasello, 1998; Dunham, Dunham & Curwin, 1993; Tomasello & Farrar, 1986; Tomasello & Todd, 1983). Based on these findings, theories of language acquisition have highlighted the importance of adult speakers as active participants in children’s language learning (e.g. Akhtar & Tomasello, 1998).

However, researchers have reported that, in many communities, children are directly addressed by caregivers only rarely and instead receive most of their early input from overhearing speech (e.g. Brown, 1998; de Leon, 1998; Pye, 1986a, 1986b; Schieffelin & Ochs, 1986). Children in these communities nevertheless grow up to become competent users of the language spoken around them (review in Lieven, 1994). These descriptions have raised questions about the generalizability of theoretical models that emphasize the relation between child-directed input and later language development (e.g. Akhtar, 2005a; Akhtar & Gernsbacher, 2007; Lieven, 1994).

There is little cross-cultural empirical data that address this issue. Most reports of children’s language input in communities with little direct interaction with adults come in the form of ethnographic descriptions of the community as a whole (e.g. Schieffelin & Ochs, 1986) or in linguistic case studies detailing properties of language input for a small number of children (e.g. Pye, 1986b). These reports provide valuable information about diverse language environments, but they are not (for the most part) quantitative and comparative in nature. They therefore cannot be used to answer questions about relative amounts of directed input across different environments, nor can they be used to consider the relation between directed and overheard input and language acquisition in communities where directed interaction is rare.

The goal of this paper is to provide this kind of quantitative, comparative data. We chose to collect language data in Yucatec Mayan villages in southern Mexico based on descriptions of Mayan populations provided by anthropologists, linguists and psychologists (e.g. de Leon, 1998; Gaskins, 1990, 1999, 2006; Redfield & Villa Rojas, 1934; Rogoff, Mistry, Göncü & Mosier, 1993; Pye, 1986a, 1986b). These descriptions suggest that Mayan children, in general, and Yucatec Mayan children, more specifically, experience vastly different kinds of language environments from children growing up in middle-class Euro-American communities. These differences stem from several factors. First, because of large family size and distribution of household labor, Mayan children are unlikely to spend the majority of their waking hours with a single adult caregiver. Instead, most time is spent with multiple others, and children, not adults, typically serve as a child’s primary caregivers and playmates (e.g. Gaskins, 2006). Second, Mayan caregivers, both adults and children, do not typically consider young children to be valid conversational partners, and thus rarely address children directly (e.g. de Leon, 1998; Gaskins, 1990).

In this paper, we first describe early language input to 1-year-old children in a Yucatec Mayan village and compare this input to input heard by children growing up in large families in the United States (Study 1). We then consider how input changes over development between 1 and 3 years for both Mayan and US children (Study 2). Our final step is to explore the impact that early linguistic input experience has on vocabulary growth in Mayan children. To do so, we first establish the validity of using videotape to assess the Mayan child’s linguistic input (Study 3), and then explore the relation between linguistic input at 24 months and child vocabulary at 35 months (Study 4). Theories espousing the importance of mutual engagement in language learning would lead us to predict that speech directed to children will be key in accounting for lexical development, even in environments where a large portion of early language input comes from overheard speech. We ask whether this prediction is borne out in Mayan children or, alternatively, whether Mayan children (unlike US children; Shneidman, Arroyo, Levine & Goldin-Meadow, in press) make use of overheard speech in acquiring their lexicons.

Study 1

The goal of Study 1 is to investigate the amount of input that 1-year-old children in a Yucatec Mayan village receive, compared to 1-year-old children growing up in multiparty families in the United States. We considered both the source of spoken input (does input come from adults or children?) and the intended recipient of spoken input (are utterances directed to the child or to other individuals?) in our analyses. If Yucatec Maya children do indeed receive very little directed input in early development, we can then entertain the possibility that these children learn language without the benefit of mutual engagement.

Method

Participants

Nine monolingual Yucatec Mayan speaking children from one of two adjacent villages in Mexico (populations of around 600 inhabitants) and nine monolingual English speaking children from a large city in the United States participated (Mayan participants: six females, three males; mean age = 13.1 months, range = 11.0 to 15.1 months; US participants: seven females, two males; mean age = 14.1 months, range = 13.5 to 14.5 months).

Mayan data (for this study and for subsequent studies) were collected in a series of field stays from August 2007 to January 2010 by the first author. Daily life in the villages was much the same as has been described by other researchers working in similar communities (e.g. Gaskins, 1990; Redfield & Villa Rojas, 1934). Families were living in single room structures situated on plots of land that contained one or more nuclear families. Typically, nuclear families were large (5–12 children per family), and extended family members worked together to care for one another’s children. Thus, even first-born children were not ‘only children’ in the US sense as they were typically surrounded by cousins, aunts and uncles. For the nine families observed for this study, there were an average of seven people (other than the target child) present at the home visit (range = 4–12 people). Both adults (over 11 years) and children (2–11 years) were represented in every household at the visit (average number of adults present = 2.1; range = 1–4; average number of children present = 4.7; range = 3–10). Infants (under 2 years) were present in two households (average number present: 0.2; range: 0–1). None of the target children was first-born.

The US children were a subset of participants in a larger longitudinal study examining language development. This subset was selected because these children typically had multiple individuals around them, and thus had opportunities to overhear speech (Shneidman et al., in press). For the nine families observed for this study, there were an average of 2.8 people (other than the target child) present at the home visit (range = 2–4 people). Adults were represented in every household (average number of adults present = 2; range = 1–4). Six households had children other than the target child present during the home visit (average number of children present in these households = 1.3; range = 1–2 children). No infants (other than the target child) were present in any of the households. Four children were Caucasian, three were African American, and one was of mixed racial background; the remaining children’s parents did not list their racial background. Four families had yearly income levels below $35,000, one had an income between $50,000 and $75,000, and four had incomes over $75,000. Two of the nine target children were first-born.

Procedure

All children were videotaped in their homes in natural interaction with their families (for 60 minutes in the Mayan sample, 90 minutes in the US sample). During this time, an experimenter followed the target child wherever he or she went. A wide frame was used during recording in order to capture the target child as well as the people and objects in the surrounding area. Family members were instructed to act as they would have had the experimenter not been present. Only the first hour of the US videos was used.

Transcription and coding

All audible speech was transcribed from the first hour of videotape and broken into utterances. Three occurrences could mark the end of a single utterance: (1) the end of a conversational turn; (2) a change in pitch (marking the end of a question or declarative utterance); or (3) a pause in the flow of speech (based on Huttenlocher et al., 2007). Each utterance in input was classified on the basis of (1) who was speaking, adult (over 11 years) or child, and (2) whether it was directed to the child or overheard by the child. Speech was considered directed if it was addressed to the child alone or if it was addressed to a group of individuals that included the child. All other speech was categorized as overheard. We relied on several cues to categorize utterances as directed or overheard: gaze direction, grammatical marking, utterance content, and proximity to child. None of these cues alone determined whether we categorized the utterance as directed or overheard. Rather, we evaluated the cues as an ensemble in deciding whether the speaker intended the target child to be a recipient of the utterance (either the sole recipient or part of a larger group of recipients).
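The two-way classification described above can be sketched as a small data structure. This is only an illustration: the `Utterance` class, its field names, and the sample values are hypothetical, and the directed/overheard judgment (which the coders made from an ensemble of cues) is simply taken as a given flag.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: "adult (over 11 years)" is read as 12 years or older.
ADULT_MIN_AGE = 12


@dataclass
class Utterance:
    speaker_age: int  # speaker's age in whole years (hypothetical field)
    directed: bool    # True if addressed to the target child, alone or in a group


def category(u):
    """Return the (speaker, addressee) cell an utterance falls into."""
    speaker = "adult" if u.speaker_age >= ADULT_MIN_AGE else "child"
    addressee = "directed" if u.directed else "overheard"
    return (speaker, addressee)


def tally(utterances):
    """Count utterances in each of the four speaker-by-addressee cells."""
    return Counter(category(u) for u in utterances)


def proportion_directed(utterances):
    """Share of all input utterances that were directed to the target child."""
    if not utterances:
        return 0.0
    return sum(u.directed for u in utterances) / len(utterances)


# Made-up sample, not data from the study.
sample = [
    Utterance(30, True),
    Utterance(8, False),
    Utterance(8, True),
    Utterance(40, False),
    Utterance(6, False),
]
counts = tally(sample)
```

With a tally of this kind, the per-child counts and proportions analyzed in the Results follow directly from the coded transcript.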

Reliability for each coding category was established by having a second independent coder (native to the language group) categorize 10% of the utterances from a randomly selected section of each transcript. Agreement between coders was 95% (Mayan) and 100% (US) for deciding who was speaking; and 94% (Mayan) and 98% (US) for deciding whether utterances were directed or overheard.
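The agreement figures above read as simple percent agreement between two coders, although the source does not spell out the formula. Under that assumption, a minimal sketch looks like this; the labels are invented for illustration.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of utterances that two coders labeled identically."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same utterances")
    if not coder_a:
        raise ValueError("no utterances to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical labels for 20 utterances; the coders disagree on one.
first = ["directed"] * 8 + ["overheard"] * 12
second = list(first)
second[3] = "overheard"

agreement = percent_agreement(first, second)
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic would be a different (and here unreported) measure.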

Results and discussion

Our first goal was to consider possible differences in the number of utterances that children in the two cultural communities heard. We conducted a repeated measures ANOVA on number of utterances, with speech type (directed or overheard) as a within-subjects variable, and cultural group (US or Mayan) as a between-subjects variable. We found a main effect of cultural group, F(1, 16) = 14.53, p < .01 (US children heard significantly more total utterances, M = 895 [SD = 277], than Mayan children, M = 428 [SD = 241]); but no main effect of speech type, F(1, 16) = .34, p = .60. We did, however, find a culture by speech type interaction, F(1, 16) = 17.51, p < .01: US children heard more utterances in directed input (M = 616 [SD = 231]) than they heard in overheard input (M = 278 [SD = 247]), t(8) = 2.6, p < .05, whereas Mayan children heard more utterances in overheard input (M = 342 [SD = 201]) than they heard in directed input (M = 86 [SD = 59]), t(8) = 4.5, p < .01 (Figure 1).

Figure 1. Mean number of utterances that 1-year-old children in US and Mayan households heard in an hour, classified according to whether the utterances were directed to the child or overheard by the child.

Note in Figure 1 (gray bars) that US children heard more utterances directed to them than Mayan children, t(16) = 6.67, p < .001. However, the US children overheard about the same number of utterances not directed to them as the Mayan children, t(16) = .60, ns. Thus, the difference in the total number of utterances heard across the two groups was due entirely to US children hearing more directed speech than Mayan children. Looking at the proportion of utterances in directed vs. overheard speech gives the same pattern: Independent samples t-tests on the arcsin-transformed proportional data showed that a smaller proportion of the child’s total input came from directed speech for Mayan children (20%) than for US children (71%), t(16) = 5.2, p < .001.
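For readers unfamiliar with the arcsine transformation applied to the proportions, a minimal sketch follows. It assumes the common arcsine square-root form; the source does not say which variant was used. The transformation would be applied to each child's proportion before testing, and the two group percentages are used here only as convenient inputs.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1], in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


# Group-level proportions of directed input reported in the text.
mayan_transformed = arcsine_sqrt(0.20)
us_transformed = arcsine_sqrt(0.71)
```

The transform stretches values near 0 and 1, which tends to stabilize the variance of proportions before parametric tests are run on them.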

We next considered whether input came from adult or child speakers in each cultural group. In families from the US, 90% of total speech input (97% of directed input, 73% of overheard input) came from adults. However, for three of the US families, no other children were present at the home visit; as a result, children in these families had no opportunity to hear input from other children. But even if we consider only the six homes where other children were present, we still find that most of the US children’s input came from adult speakers. In these six families, 85% of total speech input came from adults (96% of directed input, 66% of overheard input). In contrast, Mayan children heard only 31% of total speech input (29% of directed speech, 32% of overheard speech) from adults. Most of the (directed and overheard) input that Mayan children heard came from other children: 69% of total input, compared to 10% for the US children, t(16) = 5.43, p < .001.

Perhaps the US children heard a greater proportion of input from adults because fewer children were physically present in the US sample than in the Mayan sample. To assess this possibility, we divided the total number of adult utterances by the number of adults that were physically present at the session, and the total number of child utterances by the number of children that were present at the session (for families where both adult and child speakers were present). This procedure yielded a measure of the average number of utterances that each adult and each child spoke during the session. For the US sample, there was a significant difference between the average number of utterances spoken by adults and children: M = 509 (SD = 291) utterances spoken by each US adult during the session, compared to M = 56 (SD = 69) utterances spoken by each US child, t(6) = 5.8, p < .01. In contrast, controlling for the number of speakers, we found that Mayan adults and children produced the same number of utterances during the session: M = 71 utterances spoken by each adult (SD = 39) and by each child (SD = 64), t(8) = .01, ns. The ratio of a single adult’s to a single child’s speech input was significantly different across the two cultural groups, 9.1 US vs. 1.0 Mayan, t(12) = 3.24, p < .01. This difference suggests that our findings are not due, exclusively, to the number of physically present adults and children in the two communities. Controlling for the number of individuals present, US children still hear relatively more of their total input from adult speakers than from child speakers, whereas Mayan children hear approximately equal input from adult and child speakers.
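The per-speaker normalization described above amounts to two divisions and a ratio. The sketch below illustrates it for a single household; the function names and the counts are hypothetical and are not taken from the data.

```python
def utterances_per_speaker(total_utterances, speakers_present):
    """Average number of utterances produced by each speaker of one type."""
    if speakers_present <= 0:
        raise ValueError("at least one speaker of this type must be present")
    return total_utterances / speakers_present


def adult_to_child_ratio(adult_utts, n_adults, child_utts, n_children):
    """Ratio of one adult's average output to one child's average output."""
    per_adult = utterances_per_speaker(adult_utts, n_adults)
    per_child = utterances_per_speaker(child_utts, n_children)
    if per_child == 0:
        raise ValueError("ratio is undefined when children produced no utterances")
    return per_adult / per_child


# Hypothetical household: 2 adults produce 140 utterances and 5 children
# produce 350, so each speaker averages 70 utterances.
ratio = adult_to_child_ratio(140, 2, 350, 5)
```

In this made-up household children supply most of the raw input (350 of 490 utterances), yet the per-speaker ratio is 1.0, which is the pattern the normalization is designed to reveal.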

These findings confirm ethnographic reports suggesting vastly different language environments across cultural communities. Children growing up in a Mayan community received most speech input from overhearing others, who were, for the most part, children. In contrast, US children in large families received most input from speech directed to them by adult speakers.

Study 2

We found in Study 1 that most of the children’s early linguistic input came from directed speech for US children, but from overheard speech for Mayan children. However, Study 1 examined children’s language environments at only a single time point in development. It is possible that the amount of directed input that Mayan children hear increases over development. Prior research with US children has found no such developmental increase in the amount of speech directed to children (Huttenlocher et al., 2007). However, these children were likely treated as conversational partners from birth (Snow, 1972), and thus may have always heard ample directed input. In contrast, it has been reported that Mayan speakers do not consider infants as valid partners for conversational interaction (e.g. Gaskins, 2006). But since Mayan children do grow up to become full-fledged linguistic participants, at some point in development it is likely that the amount of speech directed to children increases. In Study 2, we considered differences in children’s input over time in Mayan and US families, focusing not only on whether input is directed to the child or overheard by the child, but also on whether it is produced by an adult or another child.

Method

Participants

Six of the nine Mayan children described in Study 1 participated in this longitudinal study (three males, three females). These particular children were selected because it was logistically possible to videotape them over development. The Mayan children were an average of 13.0 months at time point 1 (range = 11.1 to 15.1 months), 18.3 months at time point 2 (range = 16.3 to 20.2 months), 23.5 months at time point 3 (range = 22.1 to 24.5 months), and 35.3 months at time point 4 (range = 34.2 to 36.2 months). These children had an average of 7.6 people around them at time point 1 (5.3 children, 2.3 adults, 0 infants), 7 people around them at time point 2 (4.5 children, 2 adults, .5 infants), 8.7 people around them at time point 3 (5.3 children, 2.6 adults, .8 infants) and 7.8 people around them at time point 4 (5 children, 2.4 adults, .4 infants). None of the children were first-born.

Six of the nine US children described in Study 1 (one male, five females) were also observed longitudinally, although at fewer time points. These children were selected because they had multiple speakers around them at each time point. The US children were an average of 14.0 months (range = 13.5 to 14.1) at time point 1, 23.4 months at time point 2 (range = 21.8 to 26.3), and 30 months (range = 29.9 to 31.5) at time point 3. The children had an average of 3.2 people around them at time point 1 (1.2 children, 2 adults, 0 infants), 3.1 people around them at time point 2 (1.3 children, 1.8 adults, 0 infants), and 3.7 people around them at time point 3 (1.2 children, 2.5 adults, 0 infants). None of the children were first-born.

Procedure

All children were videotaped in their homes as described in Study 1. Speech was transcribed and coded as in Study 1. Reliability for each coding category was established by having a second independent coder (native to the language group) categorize 10% of the utterances from a randomly selected section of each transcript for each time point. Agreement between coders ranged between 93% and 97% (Mayan) and between 95% and 100% (US) for identifying the speaker (child or adult) at the different time points; and between 87% and 93% (Mayan) and between 95% and 98% (US) for determining whether an utterance was directed to the child or overheard at the different time points.

Results

Table 1 displays the means and standard deviations for the US and Mayan children across development.

Table 1. Mean (SD) number of utterances heard by US and Mayan children in one hour (longitudinal, videotaped method)

| Group and age | Input directed to the child | Input overheard by the child | Proportion of total input directed to the child |
|---|---|---|---|
| Mayan households (n = 6) | | | |
| 13 months | 55 (39) | 220 (91) | .21 (.17) |
| 18 months | 170 (155) | 215 (115) | .43 (.19) |
| 24 months | 228 (98) | 262 (120) | .47 (.18) |
| 35 months | 209 (91) | 142 (84) | .60 (.18) |
| US households (n = 6) | | | |
| 14 months | 605 (150) | 341 (294) | .69 (.21) |
| 24 months | 652 (256) | 475 (419) | .63 (.23) |
| 30 months | 970 (243) | 631 (310) | .62 (.12) |

Mayan comparisons:

p = .08, Proportion of input directed to child: 18 vs. 35 mos.

p < .05, Number of directed utterances: 13 vs. 24 mos., 13 vs. 35 mos.; Number of overheard utterances: 24 vs. 35 mos.; Proportion of input directed to child: 13 vs. 18 mos., 13 vs. 24 mos., 13 vs. 35 mos.

US comparisons:

p < .05, Number of total utterances: 14 vs. 30 mos., 24 vs. 30 mos.
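When reading Table 1, note that the proportion column need not equal the ratio of the two column means. The sketch below computes that ratio from the tabled means so the two can be compared; the explanation offered in the comments (that proportions were computed per child and then averaged) is an assumption, not something the table states.

```python
# (group, age in months) -> (mean directed, mean overheard, reported proportion)
TABLE_1 = {
    ("Mayan", 13): (55, 220, 0.21),
    ("Mayan", 18): (170, 215, 0.43),
    ("Mayan", 24): (228, 262, 0.47),
    ("Mayan", 35): (209, 142, 0.60),
    ("US", 14): (605, 341, 0.69),
    ("US", 24): (652, 475, 0.63),
    ("US", 30): (970, 631, 0.62),
}


def proportion_of_means(directed, overheard):
    """Directed share computed from the group means rather than per child."""
    return directed / (directed + overheard)


# Small gaps between the two columns are expected if the reported value is a
# mean of per-child proportions (assumed), since that weights every child
# equally regardless of how much speech the child heard.
for (group, age), (directed, overheard, reported) in TABLE_1.items():
    from_means = proportion_of_means(directed, overheard)
    print(f"{group} {age} mo: from means {from_means:.2f}, reported {reported:.2f}")
```

The Mayan rows agree closely under either calculation, while the US rows differ by a few points, as would happen if children with less total input had higher directed proportions.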

Mayan children

Mayan children heard 21% of total input in directed speech at 13 months, 43% at 18 months, 47% at 24 months, and 60% at 35 months (see the right-most column in the top of Table 1 and Figure 2). A repeated measures ANOVA on proportion of input in directed speech (arcsin-transformed), with time point as a within-subjects variable, revealed that increases in the proportion of total input directed to children were reliable, F(3, 15) = 12.4, p < .01. Mayan children received a higher proportion of their total language input in directed speech at 35 months compared to 13 months (w(6) = 0, z = 2.2, p < .05), at 24 months compared to 13 months (w(6) = 0, z = 2.2, p < .05), and at 18 months compared to 13 months (w(6) = 0, z = 2.2, p < .05). There was also a marginally significant increase in the proportion of directed input children received between 18 and 35 months (w(6) = 2, z = 1.8, p < .08).
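The w statistics above are consistent with Wilcoxon signed-rank tests on six paired observations, although the source does not name the test. Assuming that reading, and assuming the usual normal approximation without tie or continuity corrections, the reported z values can be recovered as follows.

```python
import math


def wilcoxon_z(w, n):
    """Normal-approximation z for a signed-rank sum w with n non-zero pairs."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / sd


# w is the smaller of the two signed-rank sums, so z comes out negative;
# the text reports its magnitude.
z_all_same_direction = abs(wilcoxon_z(0, 6))  # about 2.20
z_w_equals_two = abs(wilcoxon_z(2, 6))        # about 1.78
```

A value of w = 0 with six children means that every child changed in the same direction, which is the strongest result a sample of six can produce under this test.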

Figure 2. Percent of total language input in child-directed speech over development in US and Mayan households.

We explored whether there were changes in the source of Mayan children’s input over development by considering the proportion of total input that came from child speakers. The Mayan children received, on average, 50% (SD = 21%) of their total input from children at 13 months (64% [SD = 28%] directed input, 49% [SD = 21%] overheard input), 70% (SD = 22%) of their total input from children at 18 months (73% [SD = 29%] directed input, 66% [SD = 30%] overheard input), 68% (SD = 32%) of their total input from children at 24 months (70% [SD = 35%] directed input, 65% [SD = 28%] overheard input), and 85% of their total input from children at 35 months (92% [SD = 6%] directed input, 74% [SD = 20%] overheard input). A repeated measures ANOVA on the (arcsin-transformed) proportion of total input from children, with time point as a within-subjects variable, revealed significant developmental differences, F(3, 15) = 3.4, p < .05. Children received a significantly higher proportion of input from other children at 35 months than they received at 13 months, w(6) = 0, z = 2.2, p < .05, and a marginally greater proportion at 18 months than at 13 months, w(6) = 2, z = 1.8, p < .08. See Figure 3.

Figure 3. Percent of total language input from child speakers over development in US and Mayan households.

US children

In our next series of analyses, we examined differences in the proportion of input that was directed to US children over development (see the right-most column in the bottom of Table 1 and Figure 2) and found no significant differences, F(2, 10) = .33, ns. US children heard 69% (SD = 21%) of utterances in child-directed speech at 14 months, 64% (SD = 23%) at 24 months, and 62% (SD = 12%) at 30 months.1

We explored whether there were changes in the source of US children’s input over development by considering the proportion of total input that came from child speakers. The US children received, on average, 10% (SD = 11%) of their total input from children at 14 months (3% [SD = 4%] directed speech, 22% [SD = 18%] overheard speech), 10% (SD = 13%) of their total input from children at 24 months (4% [SD = 5%] directed speech, 17% [SD = 17%] overheard speech) and 11% (SD = 16%) of their total input from children at 30 months (9% [SD = 15%] directed speech, 14% [SD = 17%] overheard speech). A repeated measures ANOVA on the (arcsin-transformed) proportion of total input from children, with time point as a within-subjects variable, showed no reliable developmental differences, F(2, 10) = .02, ns. See Figure 3.

Absolute differences in speech input

Mayan children experienced increases in the proportion of directed input they heard across development. However, even at the older ages (see the first two columns of Table 1) they did not hear as much total input as children growing up in the United States. To assess differences in the absolute amount of speech input the two groups heard at the oldest age point (30 months for US children, 35 months for Mayan children), we conducted a repeated measures ANOVA on the total number of utterances children heard, with speech type (directed or overheard) as a within-subjects variable and cultural group (Mayan or US) as a between-subjects variable. Results revealed a main effect of cultural group, F(1, 10) = 42.9, p < .001, indicating that US children did hear more total speech input at 30 months than Mayan children heard at 35 months. However, there was also a main effect of speech type indicating that, at the oldest ages, children in both groups heard more input in directed than in overheard speech, F(1, 10) = 8.0, p < .05, which was not the case for the Mayan children at the earlier ages. Finally, there was a marginal speech type by cultural group interaction, F(1, 10) = 3.6, p = .08. Mann-Whitney tests demonstrated that children in the US heard more total utterances in both directed, U(12) = 36, z = 2.7, p < .005, and in overheard speech, U(12) = 36, z = 2.9, p < .005, than did Mayan children.
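With six children per group, U = 36 is the largest value the Mann-Whitney statistic can take (6 × 6), which means every US child heard more utterances than every Mayan child. The sketch below illustrates this with invented counts and computes the normal-approximation z without tie or continuity corrections; the source does not say which corrections, if any, were applied, so the result need not match the reported z values exactly.

```python
import math


def mann_whitney_u(x, y):
    """U for sample x: number of (x, y) pairs with x > y, ties counted as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def mann_whitney_z(u, n1, n2):
    """Normal-approximation z for U, without tie or continuity corrections."""
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean) / sd


# Hypothetical hourly utterance counts, chosen so the groups do not overlap.
us_counts = [700, 820, 900, 950, 1100, 1350]
mayan_counts = [110, 150, 190, 210, 260, 330]

u = mann_whitney_u(us_counts, mayan_counts)
z = mann_whitney_z(u, len(us_counts), len(mayan_counts))
```

Under these assumptions the approximation gives z of about 2.88 for complete separation, close to the values reported for both speech types.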

Discussion

The results of this study showed distinct patterns of early input for US and Mayan children. For US children, there was no change across development in the proportion of input that was directed to the child (as opposed to being overheard by the child). Most of the speech the US children heard was directed to them by adults throughout development.

In contrast, Mayan children’s input showed remarkable change across development. At 13 months, most of the speech the Mayan children heard was not directed to them. However, speech directed to the Mayan children increased steadily over development so that, by the time they were 35 months old, they were receiving approximately the same proportion of their input in directed speech as were children from large families in the United States. Mayan children also experienced changes in the source of input across development. The proportion of their total input that came from other children increased from 50% at 13 months to 85% at 35 months. Thus, although they resembled the US children in the proportion of directed speech they heard at 35 months, the Mayan children were getting almost all of their input from children under 11.

There are several developmental changes that occur between 13 and 18 months that might instigate changes in speaker-to-child interactions in the Mayan community. First, as children begin to talk more, they may be more likely to elicit speech from those around them, and thus may hear more speech directed to them. Second, children increase their locomotive capacities with age and thus could change their own language environment by seeking out individuals who provide more opportunities for directed interaction.

By 35 months Mayan children received a similar proportion of input in directed speech as compared to 30-month-old US children. However, they still heard far less total speech input than US children. It is unclear how to evaluate the differences we find in the raw amount of speech input heard across the two communities. It is certainly possible that these differences reflect true cross-cultural differences in input environments. If this is the case, our findings could have real implications for how children learn language in the Mayan community. For children in industrialized communities, the raw amount of language input (at least in child-directed speech) has been found to be highly predictive of language outcomes (e.g. Huttenlocher et al., 1991). If Mayan children do hear less total speech input, they could face a disadvantage in early language learning, compared to children growing up in the United States. It is not clear, however, that these findings do reflect true differences between the two communities. Mayan individuals might simply be less comfortable being participants in a study, particularly one that involves videotaping, than are individuals from the United States and may talk less when they are being observed. If so, we may be underestimating the total amount of speech input that children hear. We begin to address this issue in Study 3, but future work might address this concern by using even less obtrusive data collection techniques.

In sum, we have found that there are increases in the directed input that Yucatec Mayan children receive over development. Nevertheless, from 13 to 24 months – an important time period in the acquisition of language – most of the linguistic input Mayan children receive is not directed to them, but overheard in conversations between others. Our final step is to consider the impact of this early input on vocabulary growth. To do so, we first assess the validity of using videotape to assess a Mayan child’s linguistic input.

Study 3

Studies 1 and 2 both used a method that relied on videotaping families. Because video cameras are fairly novel devices in the Mayan community, interpretation of Mayan data from these studies is potentially problematic. Mayan participants might talk less to children when they are being videotaped than when they are not. More importantly for the questions we are addressing, Mayan caregivers might direct more talk to children as caregivers become more comfortable with videotaping procedures. If so, the increases we observed in the proportion of directed input over time might simply be a product of the methods we used. In Study 3, we addressed this concern by observing a group of Mayan children cross-sectionally, without a video camera, at three time points over the first 2 years of life. We then compared these data collected without videotape to data collected with videotape in Study 2 to assess the validity of using videotaped data to measure language patterns in the Mayan community.

Method

Participants

Eighteen Mayan children participated in the study (six children for each age group). None of these children participated in Studies 1 or 2. Children were an average of 13.2 months in group 1 (range = 12.9 to 13.3 months), 18.2 months in group 2 (range = 17.8 to 18.7), and 24.3 months in group 3 (range = 23.5 to 24.7 months).

Procedure

The first author visited participating families two times on different days within the same week. One visit was made for 2 hours in the morning, and one visit was made in the afternoon for an additional 2 hours. During the visits, she recorded (with notebook) every time an utterance was spoken by anyone other than the target child. Utterances were classified online as either directed to the child or directed to others, based on cues like utterance content, grammatical markers, proximity of the speaker to the child, and eye gaze. Because of the attentional constraints associated with live coding, it was not possible to categorize the source of every utterance in input (i.e. whether an utterance came from an adult or child speaker). However, at each 5-minute interval in the study (as timed by a stop-watch), the experimenter made note of how many adults and children were present in the vicinity of the child. As in Study 2, both adults and children were present during both sessions for every observed family. Across the time point intervals, at any given moment, there were on average 3.4 people in the vicinity of the child for the 13-month age group (1.2 adults and 2.2 children), 3.6 people for the 18-month age group (1.4 adults and 2.2 children), and 4.4 people for the 24-month age group (2.4 adults and 2.0 children). Note that these numbers reflect the number of individuals around the child at any given moment and not the total number of individuals present across the session (as was reported in Studies 1 and 2).

Results

Developmental patterns of input

We first examined developmental differences in the proportion of the children’s total input that was directed to them (as opposed to overheard). We conducted a univariate ANOVA on the arcsin-transformed proportion of total input directed to the child, with age group (13, 18, or 24 months) as a between-subjects factor, and found developmental differences across the three age groups, F(2, 15) = 5.5, p < .05. Mann-Whitney tests revealed that children in the 18-month group heard a greater proportion of their total input in directed speech (M = .47, SD = .12) than did children in the 13-month group (M = .27, SD = .14), U(12) = 3, z = 2.4, p < .05. Children in the 24-month group heard a marginally greater proportion of input in directed speech (M = .47, SD = .07) than children in the 13-month group (M = .27, SD = .14), U(12) = 6, z = 1.9, p = .06 (see Figure 2 and Table 2). These findings replicate the findings of Study 2 – children in both studies heard a reliably greater proportion of utterances that were directed to them at 18 months than at 13 months.

Table 2.

Mean (SD) number of utterances heard by Mayan children in one hour (cross-sectional, observation-only method)

| Mayan households | Input directed to the child | Input overheard by the child | Proportion of total input directed to the child |
|---|---|---|---|
| 13 months (n = 6) | 140 (55) | 377 (176) | .27 (.14) |
| 18 months (n = 6) | 211 (70) | 240 (96) | .47 (.12) |
| 24 months (n = 6) | 315 (69) | 360 (73) | .47 (.07) |

Comparisons:

p < .05: Proportion of input directed to child, 13 vs. 18 mos. and 13 vs. 24 mos.; number of overheard utterances, 13 vs. 24 mos.

p < .01: Number of directed utterances, 13 vs. 24 mos.
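
The proportions in Table 2 and the arcsin transformation used in the ANOVA can be illustrated with a short sketch. It assumes the common arcsine square-root form, arcsin(√p), which the text does not spell out, and it works from the group means in Table 2 rather than from per-child data, so it only approximates the reported mean proportions. The function names are illustrative and are not taken from the study.

```python
import math


def proportion_directed(directed: float, overheard: float) -> float:
    """Share of total input (directed + overheard) that was directed to the child."""
    total = directed + overheard
    if total <= 0:
        raise ValueError("total input must be positive")
    return directed / total


def arcsine_transform(p: float) -> float:
    """Arcsine square-root transform (assumed form; the paper says only 'arcsin-transformed')."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


# Group means (utterances per hour) from Table 2: age in months -> (directed, overheard)
table2_means = {13: (140, 377), 18: (211, 240), 24: (315, 360)}

for age, (directed, overheard) in table2_means.items():
    p = proportion_directed(directed, overheard)
    print(f"{age} months: proportion = {p:.2f}, transformed = {arcsine_transform(p):.3f}")
```

Here the ratio of group means rounds to the per-child mean proportions reported in Table 2 (.27, .47, .47), although the two quantities need not coincide in general.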

Validity of videotaped data

We explored the validity of using videotape in the Mayan population by considering differences between the number of utterances (per hour) spoken across the two data collection methods. For each age group (13, 18, and 24 months), we used a repeated-measures ANOVA, with speech type (directed or overheard) as a within-subjects variable, and collection method (videotaped or observed-only) as a between-subjects variable. Results revealed a main effect of collection method at 13 months, F(1, 10) = 7.12, p < .05, and at 24 months, F(1, 10) = 8.36, p < .05, but not at 18 months, F(1, 10) = .38, ns. The children who were observed without videotape at 13 and 24 months heard more total utterances than the children observed with videotape at these time points. Importantly, however, there were no speech-type by collection-method interactions at any time point. In other words, there were no reliable differences in the pattern of directed vs. overheard input as a function of collection method. The results of univariate ANOVAs on the (arcsin-transformed) proportion of directed utterances at each time point, with collection method as a fixed factor, confirmed this point – there were no reliable differences in the proportion of directed input heard by children observed without videotape and children observed with videotape at 13 months, F(1, 10) = .72, ns, 18 months, F(1, 10) = .14, ns, or 24 months, F(1, 10) = .04, ns.

Discussion

The results of Study 3 replicate the findings from Study 2 and show that Mayan children experience increases in speech directed to them over development. The results also demonstrate that the method used for data collection had an impact on the total number of utterances caregivers produced – children received more input when observed without videotape than with videotape. It is important to note that even the observation-only method may underestimate the total amount of talk that Mayan children hear in their everyday lives – the mere presence of an outsider may inhibit speech production. Importantly, however, we found that collection method had no impact on the proportion of total utterances that were directed to the child (as opposed to overheard by the child). Children received the same proportion of their total input in directed speech at each time point (see Figure 2), independent of collection method. These findings have important methodological implications for future studies with Mayan children. Researchers should be aware that videotaping Mayan children could reduce the total amount of input children receive and should therefore treat absolute frequency input data with caution.

Study 4

Studies 1 through 3 demonstrate that Mayan children receive a large proportion of their early input from overheard speech, and most directed and overheard speech from other children. The goal of Study 4 is to explore the impact of directed vs. overheard speech and child vs. adult speech on Mayan children’s later vocabulary outcomes. Previous work with US children has demonstrated that children are able to learn new words from overheard speech in experimental contexts (Akhtar, 2005b; Akhtar, Jipson & Callanan, 2001; Floor & Akhtar, 2006). However, for children growing up in large families in the United States, naturally occurring overheard input does not add predictive value to the relationship between input and children’s later vocabulary; for these children, directed speech is the most robust predictor of children’s later receptive vocabulary (Weisleder & Fernald, 2010; Shneidman et al., in press). Study 4 asks whether this pattern holds for Mayan children.

Overheard input is the prevalent source of input heard by Mayan children. As such, it may be the speech these children naturally attend to and learn from. Moreover, researchers have found that, in communities where children spend the majority of their time around multiple individuals and are rarely directly engaged by caregivers, children develop patterns of attention that are characterized by active observation of other people (Chavajay & Rogoff, 1999; Gaskins, 1999, 2006; Gaskins & Paradise, 2010; Rogoff et al., 1993). In older children, these experiences relate to variation in children’s ability to learn new skills from observation (e.g. Silva, Correa-Chávez & Rogoff, 2010). Children who grow up in communities where they have ample opportunity to observe others may become adept at learning from observation. These children may then be more likely to learn new words from overheard speech than children growing up in cultures where they are frequently directly engaged.

In Study 4, we consider what type of input (directed input, overheard input, or all input) predicts children’s later vocabulary by examining the relation between Mayan children’s language input at 24 months and their vocabulary production and comprehension at 35 months. We chose to examine input at 24 months for two reasons. First, it was logistically possible to observe a relatively large sample of Mayan children at this age because of the demographic properties of the villages. Second, 24 months is relatively close to 30 months, an age at which US children have demonstrated a robust ability to learn from overheard speech in an experimental context (e.g. Akhtar, 2005b).

The results of Studies 1 and 2 showed that, unlike children from the United States who hear the majority of early speech input from adults, Mayan children hear most of their speech input from other children. The implications of receiving linguistic input primarily from other children are unclear. One possibility is that input from children would be particularly helpful for learning language since this type of input has the potential to match more closely the language that children are producing themselves. On the other hand, input from children might be less useful for learning language because it is more likely than input from adults to be incorrect or dysfluent, and could contain less varied vocabulary than adult-directed speech. In Study 4, we consider how speech from children and from adults relates to children’s subsequent vocabulary.

Method

Participants

Fifteen Mayan children participated in this study (six males, nine females). Six of these children had been participants in both Studies 1 and 2.

Procedure

Children were visited in their homes two times for this study (more for the six children who participated in Studies 1 and 2), at 24 months (range = 22.1 to 26 months, mean age = 24 months), and 35 months (range = 33.7 to 38 months, mean age = 35.3 months). At the 24-month visit, children were videotaped in natural interaction with their caregivers, as described in Study 2. At the 35-month visit, children were given receptive and expressive vocabulary tasks by a local research assistant.

Materials and vocabulary task administration

Receptive vocabulary task

The receptive vocabulary task was based on the Peabody Picture Vocabulary Test (Dunn & Dunn, 1981). The material for the task was a 28-page booklet containing an array of four photos on each page (one target item and three distracter items). Target items included photos of common objects (n = 21) and photos of common actions (n = 7). All target items were previously piloted with four adult Mayan speakers who were asked to identify each of the items. Only photos that received a common label across the four speakers were included as a target item. The location of the target photo was counterbalanced across the arrays.

During testing, children were shown the pages of the booklet and were asked to find the target photo. For example, children would be shown a page with a banana, a shoe, a cat, and a bucket, and asked ‘tu’ux yaan miis’, meaning ‘Where is the cat?’ All children were administered two warm-up trials in which they received corrective feedback on their performance. During the warm-up trials, if the child failed to respond to the experimenter, the experimenter would repeat the prompt two times. If the child continued to be unresponsive, the experimenter would take the child’s hand and place it on the correct item. This procedure was repeated until the child correctly identified the target item. If the child responded incorrectly during the warm-up trials by pointing to the wrong item, the experimenter would say ‘ma’’, meaning ‘no’, and then place the child’s hand on the correct item while labeling it. The experimenter repeated this procedure until the child responded correctly. All children successfully completed the warm-up trials before moving on to test trials. During test trials, the child received no feedback on his or her responses. If the child failed to respond to the experimenter, the experimenter would repeat the prompt two times before moving to the next trial. Children received a score based on the percentage of target items that they correctly identified.2

Expressive vocabulary task

The expressive vocabulary task was based on the Expressive One-Word Picture Vocabulary Test (EOWPVT; Brownell, 2000). The materials for the task were 44 cards containing photos of common objects (n = 28) and photos of common actions (n = 16). All target items were previously piloted with four adult Mayan speakers who were asked to identify each of the items. Only photos that received a common label across the four speakers were included as a target item.

During testing, children were shown each photo card and asked, ‘ba’ax lela’, meaning ‘What is this?’ for object photos and, ‘ba’ax tu meetik’, meaning ‘What is he/she/it doing?’ for action photos. All children saw all object photos first, followed by all action photos. Children were given four warm-up trials (two object photos and two action photos). If children did not respond or gave the incorrect response during the warm-up trials, the experimenter corrected and prompted the child to give the correct response. This procedure was repeated until the child produced the object label. All children passed the warm-up phase before moving on to testing.

If a child gave an incorrect response during test trials, the experimenter moved on to the next item and provided no feedback. If the child failed to respond during test trials, the experimenter provided no feedback but repeated the prompt up to two additional times before moving on to the next test item. Children received a score based on the percentage of target items that they correctly labeled.

Measures

Input measures

Videotapes from the 24-month visit were transcribed, and utterances were categorized as described in Studies 1 and 2 (reliability for categorizing utterances for the 24-month age group was established in Study 2). To measure children’s language input, the number of different words (word types) was extracted from the 24-month transcripts for directed and overheard speech from child and adult speakers. Several decisions were made about what constituted a word type. Non-referential noises and interjections were not considered word types and were excluded from these counts. In addition, words that served as inflectional markers of the verb were excluded from the counts. For example, in the sentence, Tu baaxtik, ‘She’s playing’, the Tu, which marks the verb as a third person present participle, was not included in the count. Inflected forms of the same stem were not counted as separate word types; however, derived stem forms were counted as distinct word types. For example, wenel, ‘to sleep’, was counted as a distinct word type from wensik, ‘to put to sleep’. The number of word types was divided into three categories for each child: (1) word types that appeared in speech directed to the child, (2) word types that appeared in speech overheard by the child, (3) all word types heard by the child. The last measure was not merely a sum of the first two measures because some word types were heard in both directed and overheard speech.

Outcome measure

The outcome measure was children’s language skill at 35 months, as assessed by averaging the scores on the receptive and expressive vocabulary tasks (Composite Vocabulary Measure).

Control measures

Because the results of Study 3 demonstrated that some Mayan speakers talk less when being videotaped, all analyses controlled for the total number of utterances spoken by those around the child (length of transcript). We also controlled for the child’s language ability at 24 months (number of utterances produced). Children who talk more might elicit different input from those around them than children who talk less; if so, any relation we find between input and later vocabulary could be due to the child’s language ability rather than to their input.

Results and discussion

Vocabulary task performance

Children’s score on the receptive vocabulary task ranged from 33% to 86% correct (mean proportion of correct responses = .56, SD = .18). Performance on the expressive vocabulary task ranged from 21% to 90% correct (mean proportion of correct responses = .55, SD = .17). We combined these two measures into a composite because the measures showed a similar relation to input, and because the composite provides a richer index of children’s vocabulary knowledge than either measure alone (the target words in the expressive and receptive measures were non-overlapping; by using both measures, we tapped knowledge of a larger set of target words). Performance on the composite measure of the two tasks ranged from 29% to 81% correct (mean proportion of correct responses = .55, SD = .15).

Directed versus overheard input

Children heard, on average, 229 (SD = 73) total word types during the recording session; 121 (SD = 51) were found in directed input, 178 (SD = 71) in overheard input, and 70 (SD = 27) in both directed and overheard speech.
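
The three type counts can be thought of as set sizes, which is why the count for all input is smaller than the sum of the directed and overheard counts. The sketch below is a minimal illustration under that assumption. The token lists are hypothetical (only miis, wenel, and wensik appear in the text, and word_a and word_b are placeholders), and real counts would also require the exclusion and stem decisions described under Input measures.

```python
def type_counts(directed_tokens, overheard_tokens):
    """Count word types (distinct words) in directed, overheard, shared, and all input."""
    directed = set(directed_tokens)
    overheard = set(overheard_tokens)
    return {
        "directed": len(directed),
        "overheard": len(overheard),
        "both": len(directed & overheard),  # types heard in both kinds of speech
        "all": len(directed | overheard),   # union, so shared types count once
    }


# Hypothetical token lists, for illustration only
directed_tokens = ["miis", "wenel", "miis", "wensik"]
overheard_tokens = ["wenel", "word_a", "word_b", "wenel"]

counts = type_counts(directed_tokens, overheard_tokens)

# Inclusion-exclusion: all = directed + overheard - both
assert counts["all"] == counts["directed"] + counts["overheard"] - counts["both"]

# The same identity holds for the reported means: 121 + 178 - 70 = 229
assert 121 + 178 - 70 == 229
```

Because the identity is linear, it applies to the group means as well as to each child's counts, which is consistent with the averages reported above.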

Regressions were used to consider the relation between measures of input at 24 months (directed types, overheard types, all types) and children’s language at 35 months (composite vocabulary score), controlling for length of transcript (total utterances) and for children’s language at 24 months (child utterances). Table 3 presents the models. Model A displays the results of the control model using length of transcript and child language as the predictors of the (arcsin-transformed) proportion of correct responses on the vocabulary measure. Neither of these measures was a significant predictor of vocabulary, but together the measures accounted for 21% of the total variance in vocabulary score.

Table 3.

Regression models using input measures at 24 months (directed and overheard word types) to predict children’s vocabulary at 35 months (arcsin of proportion of correct responses on composite vocabulary measure), controlling for length of transcript at 24 months (total utterances in input) and child language at 24 months (utterances)

Composite vocabulary measure, β (standardized)

| Predictors (24 months) | Model A | Model B | Model C | Model D |
|---|---|---|---|---|
| Length of transcript | −.31 | −.96** | −.14 | −.70 |
| Child utterances | .38 | .27 | .28 | .52 |
| Word types in speech directed to the child | | .89* | | |
| Word types in speech overheard by the child | | | −.25 | |
| Word types in all input | | | | .45 |
| R2 statistic | .21 | .55 | .24 | .26 |

* p < .05; ** p < .01.

We next examined the impact that directed input had on children’s subsequent vocabulary. In model B (Table 3) we included word types that appeared in directed speech as the measure of input (along with the control variables). This measure was a strong, positive predictor of vocabulary score at 35 months (β= .89, p < .05). The standardized parameter estimate indicates that, when controlling for length of transcript and child language at 24 months, each standard deviation change in word types appearing in directed speech input was associated with a .89 standard deviation difference in vocabulary score. This model accounted for 55% of the variance in vocabulary score at 35 months, significantly more variance than the control model (F = 8.30, p < .05).
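
The comparison between Model B and the control model can be reproduced approximately from the R2 values in Table 3 with the standard F test for nested regression models. The sketch assumes that this is the test behind the reported F, that all 15 children contributed data, and that Model B has three predictors (the two controls plus directed word types). The function name is illustrative.

```python
def f_change(r2_full: float, r2_reduced: float, n: int, k_full: int, q: int) -> float:
    """F statistic for the R-squared gain from adding q predictors to a nested model.

    n      -- number of observations
    k_full -- number of predictors in the full model (excluding the intercept)
    q      -- number of predictors added relative to the reduced model
    """
    df_error = n - k_full - 1
    if df_error <= 0 or q <= 0 or not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("invalid inputs for nested-model F test")
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df_error)


# Model B vs. Model A, using the rounded R2 values from Table 3
f_b = f_change(r2_full=0.55, r2_reduced=0.21, n=15, k_full=3, q=1)
print(f"F(1, 11) = {f_b:.2f}")
```

With the rounded R2 values this gives roughly 8.31, close to the reported F = 8.30; a small gap of this kind is expected when R2 is rounded to two decimals.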

Next, we considered whether word types overheard by children at 24 months related to their 35-month vocabulary. In model C (Table 3) we included word types overheard by children as a predictor of vocabulary score (along with the control measures). Overheard types failed to significantly relate to children’s subsequent vocabulary (β = −.25, ns) and this model did not account for more variance in vocabulary score than the control model A (F = 0.39, ns).

Finally, we included all word types that children heard in input (directed and overheard) as a predictor of vocabulary along with the control variables. Note that the number of types in all input is not the sum of types in directed speech plus types in overheard speech simply because some words appeared in both the directed and overheard speech a child heard. The number of types in all input did not reliably relate to vocabulary (β = .45, ns), and this model failed to account for more variance than the control model (F = 0.73, ns).

To summarize, controlling for the length of transcripts and child language at 24 months, directed input, but not overheard input or all input, reliably predicted vocabulary at 35 months. This finding suggests that speech input directed to the child is important for early word learning, even in a community where most of the speech children hear is overheard.3 However, this interpretation comes with one important caveat. Both the productive and the receptive vocabulary measures used in this study relied on eliciting responses from children. As such, it could be that the tests were measuring children’s knowledge of directed interactional structure rather than vocabulary per se. Children who were more often directly addressed might have had more experience with the question-and-answer format than children who were less often directly addressed.

To investigate this possibility, we asked whether the number of questions asked of children in directed interaction at 24 months independently related to their vocabulary score at 35 months, controlling for the total number of word types children heard in directed input. We found that it did not (r = −.04, ns), suggesting that experience with the question-and-answer format alone was not driving our findings. However, future research could address this issue by using a more passive method to measure children’s vocabulary, e.g. preferential looking measures rather than eliciting techniques.4

Child versus adult input

Given that directed input was the most robust predictor of children’s subsequent vocabulary, we next asked whether differences in the source of directed input (input from adults vs. input from children) differentially related to vocabulary outcomes. Recall that, on average, children heard 121 word types per hour in directed input; 52 (SD = 36) were heard uniquely from other children, 39 (SD = 32) uniquely from adults, and 31 (SD = 23) from both adults and children.

Regressions were used to consider the relation between measures of directed input at 24 months (types directed from adults, types directed from children) and children’s language at 35 months (composite vocabulary score), controlling for length of transcript (total utterances) and children’s language at 24 months (child utterances). Table 4 presents the models. Model A displays the results of the control model using length of transcript and child language as the predictors of the (arcsin-transformed) proportion of correct responses on the vocabulary measure. Model B includes word types in input directed from other children; Model C includes word types in directed speech from adults (both models include control variables).

Table 4.

Regression models using directed input measures at 24 months (directed word types from adults and children) to predict children’s vocabulary at 35 months (arcsin of proportion of correct responses on composite vocabulary measure), controlling for length of transcript at 24 months (total utterances in input) and child language at 24 months (utterances)

| Predictors (24 months) | Model A | Model B | Model C |
|---|---|---|---|
| Length of transcript | −.31 | −.57 | −.53* |
| Child utterances | .38 | .34 | .40 |
| Directed types from children | | .27 | |
| Directed types from adults | | | .62* |
| R2 statistic | .21 | .23 | .42 |

* p < .05; ** p < .01.

Word types in directed input from children did not significantly relate to children’s subsequent vocabulary. Model B did not account for significantly more variance in vocabulary score than the control model A (F = .20, ns). In contrast, word types directed from adults at 24 months did significantly predict children’s vocabulary at 35 months. Model C accounted for 42% of variance in vocabulary score, significantly more than the control model (F = 8.11, p < .05).

To summarize, speech directed from adults related to children’s subsequent vocabulary;5 speech from other children on its own did not. Why might speech from adults be primary in predicting children’s performance? One possibility is that speech directed to the child by adults and speech directed to the child by other children differ in ways that make adult speech more useful for success on our vocabulary task. To investigate this possibility, we examined the content of speech input from children vs. adults. Given that our vocabulary task contained words for objects and actions, we divided input into two categories: lexical items that were nouns or verbs, and other lexical items (e.g. adjectives, prepositions, pronouns, conjunctions etc.). We found that directed speech from adults did indeed contain a significantly greater proportion of nouns and verbs (M = .71 [SD = .13]) than directed speech from other children (M = .62 [SD = .02], t[27] = 2.13, p < .04).6 This difference may have made adult input a richer source from which to learn vocabulary words than child input (at least in terms of success on our vocabulary task).

General discussion

Our goal was to compare the early language input that children hear in a Yucatec Mayan village to input heard by children living in large families in the United States, and to explore how language input directed to the child vs. overheard by the child relate to Mayan children’s later vocabulary. Study 1 demonstrated that 1-year-old Mayan children heard fewer total utterances in their input, heard fewer utterances in speech directed to them, and heard a significantly smaller mean proportion of their total language input in speech directed to them than did children growing up in large families in the United States. These findings are the first quantitative, comparative assessments of linguistic input in a Mayan community and confirm ethnographic descriptions of Mayan children’s early language environments. Unlike young children in the United States who receive the majority of their linguistic input in child-directed speech, Mayan children receive most of their input in overheard speech. Further, unlike children from the United States who hear the majority of early speech input from adults, Mayan children hear most speech input from other children.

We found that the proportion of speech that was directed to Mayan children increased over development. Approximately 20% of the input Mayan children heard came from speech directed to them at 13 months. This proportion increased to 50% at 18 and 24 months and, by 35 months, 60% of the total input Mayan children heard came from speech directed to them – a proportion comparable to the input US children in large households hear throughout this time period. Interestingly, the biggest cross-cultural difference in caregiver interaction style occurs early in development (around 1 year of age), a period during which some theorists have proposed that directed, ostensive communication is critical for acquiring social knowledge (Csibra & Gergely, 2009; Herold & Akhtar, 2008; Moll & Tomasello, 2007; Moore, 2007). The situation in the Yucatan allows us to explore the importance of ostensive interaction for early development. Future investigations could, for example, examine children’s early social learning ability inside and outside of ostensive episodes.

We found that although the proportion of speech directed to Mayan children increased across development, even at 3 years of age, Mayan children received far less total input than the children in our US sample. Findings from industrialized communities suggest that the total amount of directed speech input that children hear is important for supporting language development (e.g. Huttenlocher et al., 1991). Given these findings, one interpretation of our results is that Mayan children face a deficit in early acquisition, compared to US children. However, this interpretation comes with two important caveats. First, it is possible that the differences we found in absolute numbers are due to differences in the comfort level that US and Mayan families had in being observed by an outsider. Study 3 demonstrated that Mayan speakers certainly talk less when being videotaped than they do when being observed without videotape, and it is possible that even participating in the observation-only study significantly reduced the amount of talk produced around the child. As a result, it is quite likely that we are under-representing the amount of talk that Mayan children hear. Second, Yucatec Mayan children clearly grow up to be competent speakers of Yucatec Mayan. Thus, it is unclear how differences in early input environments might relate to later linguistic outcomes in this particular community. Future research will need to consider whether there are differences in the rate of acquisition for US and Mayan children and, if so, whether there are long-term consequences of these differences.

Our results showed large developmental changes in the source of input for the Mayan children (but not for the children in our US sample). At 1 year of age, Mayan children heard about half of their input from other children. By age 3, they heard nearly 90% of their total input from children and not from adults. This finding is of note for two reasons: (1) prior research has shown that there are differences in the structural properties of Mayan adult speech vs. Mayan child speech to other children that could conceivably have an impact on acquisition (Shneidman & Goldin-Meadow, in press) and (2) in the current study we found that adult speech, and not child speech, robustly related to children’s later vocabulary, a point we address below.

Previous research has shown that Mayan children are active observers who are more likely than children from the United States to learn from observation (e.g. Silva et al., 2010). Given this difference, we might have hypothesized that Mayan children would be able to make effective use of overheard input in forming their lexicons. However, our findings suggest that this is not the case. The results of Study 4 show that speech directed to Mayan children, not overheard speech, was the most robust predictor of children’s later lexical competencies. Vocabulary at 35 months was related to number of different word types that children heard in speech directed to them at 24 months – not to the number of word types that children overheard in the speech around them. The data from the Mayan children thus replicate the pattern found in children growing up in large households in the United States (e.g. Weisleder & Fernald, 2010; Shneidman et al., in press). Directed input may therefore be best at fostering children’s early lexical competencies – even in environments where children receive a large proportion of input from overhearing others.7

Why does directed input predict later vocabulary so well? One possibility is that it is the directedness of the speech per se that matters. However, there are many other characteristics of speech directed to children that could be important in fostering vocabulary growth; for example, directed speech may be syntactically simple, it may refer to visually present objects, and it may have acoustic clarity. In other words, the directed/overheard distinction may be a proxy measure for some other distinction that could more aptly explain vocabulary outcomes. Indeed, there is evidence that speech directed to Mayan children is syntactically and lexically simpler than speech directed to Mayan adults and speech overheard by children (Shneidman & Goldin-Meadow, in press). A remaining question is whether properties that align with directed input uniquely relate to later vocabulary development.

We found that overheard input did not add value to the model predicting children’s vocabulary outcomes. Note, however, that this is a null finding, so it would be premature to conclude that overheard speech plays no role in fostering children’s language development. The language children overhear is likely to vary in how relevant it is to the child. For example, an overheard conversation about a children’s game might be more accessible, and therefore easier to learn from, than an overheard conversation about an adult’s day at work. Overheard speech that is relevant to the child may, in fact, play a role in fostering child word learning. This hypothesis could be explored by categorizing overheard speech into talk that is relevant to the child’s concerns vs. talk that is not relevant.

In addition, our findings might be limited to vocabulary learning. Speech directed to children may not be as important in fostering syntactic development as it appears to be in fostering vocabulary development. Twelve-month-old infants can successfully learn to distinguish grammatical from ungrammatical constructions after simple exposure to an artificial speech stream containing only predictive dependencies as cues to proper word order (Saffran, Hauser, Seibel, Kapfhamer, Tsao & Cushman, 2008). Findings of this sort suggest that general perceptual and pattern-abstraction abilities, and not the properties of interaction style per se, may be particularly important in facilitating children’s early syntactic growth. If exposure to linguistic patterns is all that is needed to learn syntactic patterns, then overheard speech may be as useful in fostering early syntactic development as child-directed speech. On the other hand, even if interaction style is not important, directed speech does have some special structural properties (at least in English; Soderstrom, 2007) that may make it particularly useful for early syntactic development.

But linguistic input that is overheard by the child could be even more useful than input that is directed to the child in fostering the development of other aspects of children’s language precisely because overheard speech is not tailored to the knowledge level of the child (Blum-Kulka & Snow, 2002). For example, Blum-Kulka and Snow (2002) suggest that multiparty input could be key for children to learn some aspects of humor, appropriate language use, and narrative structure. Future research will need to determine whether there are any properties of language whose development is affected by overheard input and, if so, how much and what type of overheard input is needed for this effect to occur.

We also found that directed speech from adults independently predicted children’s subsequent vocabulary, whereas directed speech from other children did not. This finding is of interest because we found that most of the Mayan children’s input came from other children, and not from adults. However, we need to exercise caution in interpreting these null findings. Although directed input from other children did not significantly relate to vocabulary, neither did it reduce the predictive value of the input model (and speech from other children showed a positive, albeit non-significant, relation to children’s subsequent vocabulary). It is thus possible that, in a larger sample, directed input from other children would relate to subsequent vocabulary.

Why was adult input primary in predicting children’s vocabulary in our study? One possibility is that adult input matched the content of our tasks better than child input. We found some evidence for this explanation; adult input had a higher proportion of nouns and verbs than child input, corresponding to the objects and actions that were the target items on our vocabulary tasks. Further, since an adult research assistant administered the tasks, children with more experience interacting with adults might have performed better on the task by virtue of this experience. Alternatively, adult input might really be better for vocabulary learning than child input (by virtue of its fluency or its content). Future research is needed to determine how effective input from other children is in fostering vocabulary development.

In conclusion, this paper is the first quantitative, comparative study to explore early language input in a community where directed speech to children has been claimed to be rare. The results confirm ethnographic reports that children growing up in a Mayan community do indeed receive little input in directed speech early in life. However, we have also found that the proportion of directed input increased dramatically over early development in these children. By 3 years of age, Mayan children heard approximately the same proportion of input in child-directed speech as children from the United States growing up in large families (although they continued to hear less total language input than US children). We also found that Mayan children received most of their early input from other children, whereas the children in our US sample received most of their input from adults. For Mayan children, the proportion of input from children increased across development. By the time these children were 3 years of age, they heard nearly all of their input from other children, and not from adults. Finally, we found that even though Mayan children received a great deal of linguistic input through overheard speech and from other children, these types of speech did not relate to their later lexical outcomes. Instead, it was primarily speech directed to the child from adults that predicted children’s later vocabulary. Our findings thus suggest that adult talk directed to children is important for early word learning, even in communities where a large proportion of children’s early language input comes from overhearing speech.

Acknowledgements

Portions of this research were presented at the Boston University Conference on Language Development, November 2010, and at the biennial meeting of the Society for Research in Child Development, April 2008 and March 2011. Data collection was supported by a field grant from the Center for Latin American Studies at the University of Chicago and an Agnes and Nathan Janco travel grant from the University of Chicago Social Science division to LS, and by P01HD40605 to SGM. We thank K. Schonwald, B. Trofatter, and J. Voigt for their administrative and technical help, and G. Balam Balam, K. Brasky, C. Chay Cano, L.F. Chay Cano, W. Chay Cano, R. Chay Wikab, M. Chay Can, S. Chay Can, E. Croft, A. Drake, K. Duboc, B. Free, J. Griffin, S. Gripshover, C. Meanwell, E. Mellum, M. Nikolas, J. Oberholtzer, L. Rissman, B. Seibel, K. Uttich, and J. Wallman for help in data collection and transcription. These studies were conducted as part of the first author’s doctoral dissertation, and we thank J. Lucy, S. Levine, S. Gaskins and A. Woodward for comments and feedback during that process.

Footnotes

1

Previous research has found that US families vary as a function of SES in the extent to which caregivers talk to children (e.g. Hoff, 2003). Children in the three US families in our sample that were low SES (incomes under $35,000) did, indeed, hear less total input than children in the three high SES families. However, the proportion of speech that was directed to children across development did not differ in the two groups. Low SES children heard 74% of their input in directed speech at 14 months, 66% at 24 months, and 67% at 30 months. For high SES children, the percentages were 63%, 61%, and 56%, respectively. Both high and low SES families from the US thus differed from the Mayan families in that they directed a relatively high proportion of input to children throughout the child’s development, suggesting that our findings reflect broad differences between the two cultural communities that transcend within-group differences in SES.

2

One subject received prompting from her mother on 14 of the test items. These items were dropped from the analysis and this child received a receptive vocabulary score based on the percentage correct of the remaining test items. All subsequent analyses were performed with and without this child’s data. Since removing the child did not affect any pattern, all reported analyses include all children.
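
The scoring rule described in this footnote (percentage correct over the items that remain after prompted items are dropped) can be sketched as follows. This is a minimal illustration only: the function name `receptive_score`, the item labels, and the responses are hypothetical and are not taken from the study's materials or data.

```python
def receptive_score(responses, dropped=frozenset()):
    """Percentage correct over the items that were not dropped.

    responses: dict mapping an item label to True (correct) or False.
    dropped:   item labels to exclude, e.g. items on which the child
               was prompted by a caregiver.
    """
    kept = {item: ok for item, ok in responses.items() if item not in dropped}
    if not kept:
        raise ValueError("no scorable items remain after dropping")
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(kept.values()) / len(kept)


# Hypothetical responses for one child (invented for illustration).
responses = {"dog": True, "cup": False, "run": True, "hat": True}
score_all = receptive_score(responses)
score_without_prompted = receptive_score(responses, dropped={"hat"})
```

Scoring over the remaining items keeps the child's percentage comparable to those of children who completed every item, which is what allows the analyses to be run both with and without this child.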

3

Because word types could overlap between the directed and overheard contexts, the directed input measure included some word types that also occurred in overheard speech. One possibility is that word types heard across contexts are more likely to be learned by children than the words heard in the directed context alone. However, we find that words heard only in directed speech strongly predict subsequent vocabulary (β = .61, p < .05) and that there is no significant increase in the variance accounted for in vocabulary score when we include word types that overlap between directed and overheard speech (F = 1.78, ns). We therefore cannot conclude that words heard across contexts are any more useful for acquisition than words heard in the directed context alone, although future research might more closely investigate this possibility.
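
The distinction drawn in this footnote, between word types heard only in directed speech and word types that also occur in overheard speech, amounts to a set partition. The sketch below assumes whitespace tokenization and uses invented English utterances; the helper names `word_types` and `partition_types` are hypothetical, and the actual transcripts would require tokenization appropriate to Yucatec Mayan.

```python
def word_types(utterances):
    """Return the set of distinct word forms in a list of utterance strings."""
    types = set()
    for utterance in utterances:
        # Lowercase and split on whitespace. This is a simplification;
        # no morphological analysis or lemmatization is attempted.
        types.update(utterance.lower().split())
    return types


def partition_types(directed, overheard):
    """Split word types into directed-only, overlapping, and overheard-only."""
    directed_types = word_types(directed)
    overheard_types = word_types(overheard)
    return {
        "directed_only": directed_types - overheard_types,
        "overlap": directed_types & overheard_types,
        "overheard_only": overheard_types - directed_types,
    }


# Invented utterances, for illustration only.
directed = ["look at the dog", "the dog runs"]
overheard = ["the market opens early", "look outside"]
parts = partition_types(directed, overheard)
counts = {name: len(types) for name, types in parts.items()}
```

The size of each set can then serve as a separate predictor, so that the contribution of directed-only types can be assessed apart from the types shared across contexts.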

4

Natural production may not be the best measure of children’s vocabulary knowledge in this community, as children of this age rarely talk unless prompted and production is rarely elicited.

5

Importantly, it was adult types heard in directed speech that predicted children’s subsequent vocabulary (β = .62, p < .05). Including other types of adult input in the models (adult types in both directed and overheard speech; adult types uniquely in overheard speech) did not account for additional variance. Moreover, neither overheard adult types nor all adult types (directed and overheard) significantly related to child vocabulary.
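
Whether an added predictor "accounts for additional variance" is commonly assessed with an F test on the change in R² between nested regression models. The sketch below shows that standard formula; it is not a reconstruction of the analysis reported here, the function name `f_change` is hypothetical, and the R² values and sample size in the example are invented.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the R-squared change between two nested OLS models.

    r2_reduced, r2_full: R-squared of the smaller and the larger model.
    n:                   number of observations.
    k_reduced, k_full:   number of predictors (excluding the intercept).
    """
    if k_full <= k_reduced:
        raise ValueError("the full model must have more predictors")
    df_num = k_full - k_reduced   # predictors added to the model
    df_den = n - k_full - 1       # residual degrees of freedom, full model
    if df_den <= 0:
        raise ValueError("not enough observations for the full model")
    return ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)


# Hypothetical values: one added predictor raises R-squared from 0.30 to 0.40
# in a sample of 15 children.
f_stat = f_change(r2_reduced=0.30, r2_full=0.40, n=15, k_reduced=1, k_full=2)
```

The resulting statistic is compared against an F distribution with (k_full − k_reduced, n − k_full − 1) degrees of freedom. With small samples, a non-significant change leaves open the possibility of an effect too small to detect.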

6

One family was excluded from this analysis because there was no directed speech from adults.

7

De Leon (1998) has suggested that children growing up in a Tzotzil Mayan community could learn from overheard interactions by virtue of their physical placement. Unlike the Yucatec Mayan children in this study, Tzotzil children are typically strapped to their caregivers’ backs throughout the day. Children who are often in physical contact with caregivers may be more likely to attend to objects that caregivers label, even when the labels are not directly addressed to them. Future research might consider the relation between overheard input and vocabulary acquisition in a community where physical contact between children and caregivers is common.

References

  1. Akhtar N. Is joint attention necessary for early language learning? In: Homer BD, Tamis-LeMonda CS, editors. The development of social cognition and communication. Mahwah, NJ: Lawrence Erlbaum; 2005a. pp. 165–179.
  2. Akhtar N. The robustness of learning through overhearing. Developmental Science. 2005b;8(2):199–209. doi: 10.1111/j.1467-7687.2005.00406.x.
  3. Akhtar N, Gernsbacher MA. Joint attention and vocabulary development: a critical look. Language and Linguistics Compass. 2007;1(3):195–207. doi: 10.1111/j.1749-818X.2007.00014.x.
  4. Akhtar N, Jipson J, Callanan MA. Learning words through overhearing. Child Development. 2001;72(2):416–430. doi: 10.1111/1467-8624.00287.
  5. Akhtar N, Tomasello M. Intersubjectivity in early language learning and use. In: Braten S, editor. Intersubjective communication and emotion in early ontogeny. New York: Cambridge University Press; 1998. pp. 316–335.
  6. Barnes S, Gutfreund M, Satterly D, Wells G. Characteristics of adult speech which predict children’s language development. Journal of Child Language. 1983;10:65–84. doi: 10.1017/s0305000900005146.
  7. Blum-Kulka S, Snow CE, editors. Talking to adults: The contribution of multiparty discourse to language acquisition. Mahwah, NJ: Lawrence Erlbaum Associates; 2002.
  8. Bornstein MH, Tamis-LeMonda CS, Haynes OM. First words in the second year: continuity, stability, and models of concurrent and predictive correspondence in vocabulary and verbal responsiveness across age and context. Infant Behavior and Development. 1999;22:65–85.
  9. Brown P. Conversational structure and language acquisition: the role of repetition in Tzeltal adult and child speech. Journal of Linguistic Anthropology. 1998;8(2):197–221.
  10. Brownell R. Expressive One-Word Picture Vocabulary Test. Upper Saddle River, NJ: Pearson Education; 2000.
  11. Carpenter M, Nagell K, Tomasello M. Social cognition, joint attention, and communicative competence from 9 to 15 months of age. Monographs of the Society for Research in Child Development. 1998;63(4, Serial No. 255).
  12. Chavajay P, Rogoff B. Cultural variation in management of attention by children and their caregivers. Developmental Psychology. 1999;35:1079–1090. doi: 10.1037//0012-1649.35.4.1079.
  13. Csibra G, Gergely G. Natural pedagogy. Trends in Cognitive Sciences. 2009;13(4):148–153. doi: 10.1016/j.tics.2009.01.005.
  14. de Leon L. The emergent participant: interactive patterns in the socialization of Tzotzil (Mayan) infants. Journal of Linguistic Anthropology. 1998;8:131–161.
  15. Dunham PJ, Dunham FS, Curwin A. Joint-attentional states and lexical acquisition at 18 months. Developmental Psychology. 1993;29:827–831.
  16. Dunn LM, Dunn LM. Peabody Picture Vocabulary Test-Revised. Circle Pines, MN: American Guidance Service; 1981.
  17. Floor P, Akhtar N. Can 18-month-old infants learn words by listening in on conversations? Infancy. 2006;9:327–339. doi: 10.1207/s15327078in0903_4.
  18. Gaskins S. Mayan exploratory play and development. University of Chicago: Unpublished doctoral dissertation; 1990.
  19. Gaskins S. Children’s daily lives in a Mayan village: a case study of culturally constructed roles and activities. In: Göncü A, editor. Children’s engagement in the world: Sociocultural perspectives. New York: Cambridge University Press; 1999. pp. 25–60.
  20. Gaskins S. Cultural perspectives on infant-caregiver interaction. In: Enfield NJ, Levinson SC, editors. Roots of human sociality: Culture, cognition and interaction. Oxford: Berg; 2006. pp. 279–298.
  21. Gaskins S, Paradise R. Learning through observation in daily life. In: Lancy DF, Bock JC, Gaskins S, editors. The anthropology of learning in childhood. Walnut Creek, CA: AltaMira Press; 2010. pp. 85–117.
  22. Hart B, Risley T. Meaningful differences in the everyday experience of young American children. Baltimore, MD: Brookes Publishing; 1995.
  23. Herold KH, Akhtar N. Imitative learning from a third-party interaction: relations with self-recognition and perspective taking. Journal of Experimental Child Psychology. 2008;101(2):114–123. doi: 10.1016/j.jecp.2008.05.004.
  24. Hoff E. The specificity of environmental influence: socioeconomic status affects early vocabulary development via maternal speech. Child Development. 2003;74:1368–1378. doi: 10.1111/1467-8624.00612.
  25. Hoff E, Naigles L. How children use input to acquire a lexicon. Child Development. 2002;73:418–433. doi: 10.1111/1467-8624.00415.
  26. Huttenlocher J, Haight W, Bryk A, Seltzer M, Lyons T. Early vocabulary growth: Relation to language input and gender. Developmental Psychology. 1991;27:236–248.
  27. Huttenlocher J, Vasilyeva M, Cymerman E, Levine S. Language input and child syntax. Cognitive Psychology. 2002;45:337–374. doi: 10.1016/s0010-0285(02)00500-5.
  28. Huttenlocher J, Vasilyeva M, Waterfall HR, Vevea JL, Hedges LV. The varieties of speech to young children. Developmental Psychology. 2007;43:1062–1083. doi: 10.1037/0012-1649.43.5.1062.
  29. Lieven EVM. Crosslinguistic and crosscultural aspects of language addressed to children. In: Gallaway C, Richards BJ, editors. Input and interaction in language acquisition. Cambridge: Cambridge University Press; 1994. pp. 56–73.
  30. Moll H, Tomasello M. How 14- and 18-month-olds know what others have experienced. Developmental Psychology. 2007;43(2):309–317. doi: 10.1037/0012-1649.43.2.309.
  31. Moore C. Understanding self and others in the second year. In: Brownell CA, Kopp CB, editors. Socioemotional development in the toddler years: Transitions and transformations. New York: Guilford; 2007. pp. 43–65.
  32. Pye C. An ethnography of Mayan speech to children. Working Papers in Child Language. 1986a;1:30–58.
  33. Pye C. Quiché Mayan speech to children. Journal of Child Language. 1986b;13(1):85–100. doi: 10.1017/s0305000900000313.
  34. Redfield R, Villa Rojas A. Chan Kom: a Maya village. Washington, DC: Carnegie Institution of Washington; 1934.
  35. Rogoff B, Mistry J, Göncü A, Mosier C. Guided participation in cultural activity by toddlers and caregivers. Monographs of the Society for Research in Child Development. 1993;58(8, Serial No. 236).
  36. Saffran J, Hauser M, Seibel R, Kapfhamer J, Tsao F, Cushman F. Grammatical pattern learning by human infants and cotton-top tamarin monkeys. Cognition. 2008;107(2):479–500. doi: 10.1016/j.cognition.2007.10.010.
  37. Schieffelin BB, Ochs E. Language socialization across cultures. Cambridge and New York: Cambridge University Press; 1986.
  38. Shneidman L, Arroyo M, Levine S, Goldin-Meadow S. What counts as effective input for language learning? Journal of Child Language. (in press). doi: 10.1017/S0305000912000141.
  39. Shneidman L, Goldin-Meadow S. Mayan and US caregivers simplify speech to children. In: Proceedings of the 36th Annual Boston University Conference on Language Development. Somerville, MA: Cascadilla Press; (in press).
  40. Silva KG, Correa-Chávez M, Rogoff B. Mexican-heritage children’s attention and learning from interactions directed to others. Child Development. 2010;81(3):898–912. doi: 10.1111/j.1467-8624.2010.01441.x.
  41. Snow CE. Mothers’ speech to children learning language. Child Development. 1972;43:549–565.
  42. Soderstrom M. Beyond babytalk: re-evaluating the nature and content of speech input to preverbal infants. Developmental Review. 2007;27(4):501–532.
  43. Tomasello M, Farrar MJ. Joint attention and early language. Child Development. 1986;57:1454–1463.
  44. Tomasello M, Todd J. Joint attention and lexical acquisition style. First Language. 1983;4:197–211.
  45. Weisleder A, Fernald A. Streams of talk: child-directed speech, but not overheard speech, predicts infants’ vocabulary and language. Poster presented at the Boston University Conference on Language Development; Boston, MA; November 2010.
