Abstract
Prior work finds that some elements of language input or skills during infancy and toddlerhood predict later language skills. Here we ask whether combining these two sources of information about early language development improves predictions of language outcomes, using a longitudinal dataset that captures early language input and abilities over the first five years in a sample of 44 American English-learning children. While several early language skills significantly predicted later language skills (Spearman’s rho=0.44–0.63, p<.05), most infant input measures did not. Notably, the most robust predictor of preschool language was parent-reported productive vocabulary at 1.5 years. This suggests that early language assessments (e.g., parent-reported vocabulary) can be reliable measures of language skills with high predictive value for longer-term language outcomes.
Keywords: language development, language input, longitudinal, individual differences, early childhood
Children’s language skills at the time they enter formal schooling have been linked to social, behavioral, and academic outcomes (Kaiser et al., 2022). For instance, children with smaller vocabularies in kindergarten are more likely to have behavioral problems (Yew & O’Kearney, 2017) and to experience peer rejection (van der Wilt et al., 2020). Language skills at kindergarten entry also predict children’s later reading ability (Scarborough, 1998) and subsequent academic achievement (Duncan et al., 2007).
Language learning is a complex, multidimensional process, and children must master a variety of different skills on their way to becoming fluent speakers, including phonology, morphology, word learning (both receptive and productive), and syntax. Such skills can be measured individually, or measures of multiple skills can be combined to create more holistic metrics of language learning. While these skills are separable, they are also often heavily correlated in both typical and atypical development (Bates & Goodman, 2001; Bornstein & Haynes, 1998; Condouris et al., 2003). In this paper, we use “language skills” as an umbrella term to broadly capture diverse measures of children’s language development (including both parental reports and direct child assessments of various combinations of language skills), while noting the specific skill (e.g., vocabulary) whenever relevant.
Given the established associations of a variety of early language skills at school entry with a range of outcomes, the present study aims to identify predictors of language skills themselves, based on measures of the child and/or home environment collected in infancy and toddlerhood. Our hope is that by identifying correlates of skills tied to early school success, we may also identify prospective avenues for providing support to families who may benefit from it. That said, we note at the outset that neither “language skills” nor “success” are monolithic constructs: they are complex interlocking sets of variables that interact with the broader sociocultural contexts in which family life unfolds. Thus, our goal is not, in any sense, to specify what caretakers or children should do more or less of to ensure a specific outcome. Rather, we seek to identify relationships between earlier measures of home language and child knowledge and later ones, in an effort to establish links that may be fruitful for subsequent study, both on a theoretical level (linking experience and early knowledge to later knowledge) and an applied one (considerations of what may, in principle, support thriving in social, behavioral, or academic contexts).
Predicting language outcomes with children’s previous language skills
Prior research has found that emerging language skills in infancy and toddlerhood predict children’s school-age language skills (Fish & Pinkerman, 2003; Hohm et al., 2007; Lee, 2011). For example, Fish and Pinkerman (2003) found that children’s vocabulary at just 15 months – an age when most toddlers are saying only a few words – significantly predicted their language scores on a standardized instrument (Preschool Language Scale 3) at kindergarten entry, accounting for 5% of the variance (r = 0.23, p < .05) in scores. Using a more complex analytical approach, Bornstein and colleagues have investigated the stability of language abilities over time by combining various child language measures collected at multiple ages to extract a latent “core language skill.” They found this skill to be stable from infancy through preschool (Bornstein et al., 2004, 2016, 2018) across a variety of measures. For example, Bornstein et al. (2004) reported that individual differences in multiple measures of children’s language (e.g., vocabulary, mean length of utterances) at 1;8 were moderately to strongly stable with multiple measures of children’s language at 4;0 (e.g., standardized oral language instruments).
However, while a large body of research documents stability in children’s language abilities over time, there is also evidence that some early language challenges are unstable over the same age range. For example, most children who were “late talkers” catch up to their peers and have typical language skills (e.g., vocabulary size within the normative range) by the time they reach school (Dale et al., 2003; Fernald & Marchman, 2012; Rescorla, 2011). At the same time, some children who were not labeled “late talkers” as infants go on to have language difficulties (Dale et al., 2003; Duff et al., 2015). Additionally, while Bornstein and colleagues have found evidence for long-term stability in children’s language skills across ages and measures, they have also reported that measures of language skills at age 1 and younger – specifically, vocabulary as measured by parent report via the MacArthur-Bates Communicative Development Inventory (Fenson, 2007) – may not reflect children’s core language ability, which stabilizes only after age 1 (Bornstein et al., 2004, 2018).
In sum, these results suggest that children’s early language skills, even in infancy and toddlerhood, predict language outcomes years later, although there are limits to how reliably we can predict the outcomes of individual children.
Predicting language outcomes with children’s language input
While the work reviewed above shows links between early and later language skills, a complementary literature reports measurable effects of both “qualitative” and “quantitative” aspects of children’s language input on their subsequent skills as well (Anderson et al., 2021; Huttenlocher et al., 1991; Rowe, 2012). While language quality and quantity are inextricably linked in actual interactions, prior work tends to consider them separately, generally with metrics considered more of one kind than the other; we therefore review such metrics using these categories. With regard to input quantity, researchers have found that infants who receive more language input (e.g., those who hear more spoken language in their environment) have more advanced language skills (Bergelson et al., 2023; Huttenlocher et al., 1991, 2010; Rowe, 2012; Weisleder & Fernald, 2013). However, some research suggests that input quantity may be most important at the earliest stages of development, while specific features of that input may become more important over time (e.g., Rowe, 2012).
Language input quality refers to the interactional, conceptual, and linguistic features of children’s language environments that promote language learning (Rowe & Snow, 2020). These can include features of the input that make word learning opportunities particularly informative (Hoff, 2006), or features of supportive parent-child interactions more generally. For example, hearing words in isolation (Brent & Siskind, 2001) may help infants segment words from fluent speech. Children may also benefit from reading books with their caregivers (DeBaryshe, 1993; Dowdall et al., 2020; Karrass & Braungart-Rieker, 2005), which can expose children to a wide variety of words (Montag et al., 2015) and object labeling opportunities (DeLoache & DeMendoza, 1987). Another aspect of input “quality” is referential transparency (Bergelson & Aslin, 2017; Cartmill et al., 2013), or how readily apparent the referent of a spoken word is in a child’s environment. Researchers theorize that such learning instances, when children hear a word as they attend to an object, can help infants form word-object mappings. In contrast, children who hear more commands have been found to have worse language skills (e.g., smaller vocabulary size, slower syntactic development; Creaghe et al., 2021; Newport et al., 1977). By hypothesis, parents’ use of commands may negatively affect language development as it leads them to re-direct, instead of follow, the child’s behavior and interests (Tomasello & Farrar, 1986; Tomasello & Todd, 1983).
This and other aspects of language input “quality” may be related to broader differences in parenting styles which affect language development (e.g., intrusive parenting, Pungello et al., 2009). Moreover, children’s experience with language is embedded in rich social interactions (Tomasello, 2003). Therefore, it is unsurprising that non-linguistic parenting behaviors and other features of caregiver-child interactions are also associated with language learning (Madigan et al., 2019). However, studies have found that while language input may be related to non-linguistic parenting behaviors such as sensitivity, language input specifically predicts children’s vocabulary growth (Hirsh-Pasek et al., 2015; Hoff & Naigles, 2002).
Critically, effects of early language input may persist over time: longitudinal studies find that measures of infants’ language input predict their language skills, even years later in childhood (Anderson et al., 2021; Bornstein et al., 2020; Duff et al., 2015; Rowe, 2012; Vernon-Feagans et al., 2020). Anderson et al. (2021) conducted a meta-analysis to assess the effect of parental language input on children’s language skills. In their analysis, they aggregated across 17 and 15 longitudinal studies of parental input quantity and quality respectively; the time between parental input and child language assessments ranged from 2 months to 2 years. They found that parental language input quantity (r = .29; p < .001) and language input quality (r = .26; p < .001) were both significantly predictive of children’s later language skills, both receptive and expressive. In fact, Anderson et al. (2021) found that the effect sizes were larger for longitudinal studies than cross-sectional studies, indicating that the effects of language input may even grow stronger over time.
Measuring children’s language input
Across the studies reviewed above, researchers employed a number of diverse methods to sample and quantify properties of children’s language input. It is important to consider how these various measurement methods may affect our conclusions (Bergelson et al., 2019; Dailey & Bergelson, 2022). Historically, researchers have spent significant amounts of time manually transcribing recordings of children’s language experiences. This is a valuable endeavor because manual annotations of language samples can provide fine-grained details about the language children hear: the specific words, who says them, what types of sentences they occur in, what else is going on in the environment concurrently, and more. However, the process of collecting and manually transcribing long-form recordings (e.g., full 16-hour days) – which may be more strongly related to children’s language outcomes (Anderson et al., 2021) – is labor-intensive and difficult to scale up. Fortunately, developments in automatic speech processing technology can, in principle, enable researchers to collect data on children’s early language input more quickly and easily. Such automated measures can estimate the quantity of speech that occurs within long-form recordings, but cannot provide fine-grained details about it. For example, from manual transcriptions, researchers can aggregate the lexical diversity or syntactic complexity in parental speech directed to the target child, while current automated speech analysis can provide only general estimates of the number of words in a recording said in the vicinity of the child by different talker categories (e.g., target child, female adult). While automated speech analyses have improved greatly in recent years (as highlighted by the successes of technologies such as Amazon Alexa and Siri), their performance remains relatively modest in truly challenging naturalistic contexts like child-perspective long-form recordings (Cristia et al., 2021; Lavechin et al., 2020). The tradeoff is that manual transcription can typically sample only a small portion of a given long-form recording, while automated methods can leverage the entire day’s content.
Given these tradeoffs, one may wonder how well current automated speech estimates (such as those calculated by Language ENvironment Analysis, the LENA Research Foundation, Boulder, CO) predict language outcomes. A recent meta-analysis found that automated language estimates (adult word, conversational turn, and child vocalization counts), as generated by the LENA system, significantly predict language skills for children up to age 4 (Wang et al., 2020). Combining data across 17 studies, Wang et al. (2020) found that LENA estimates of adult word count were significantly correlated (r = 0.21, p < .001) with children’s later language outcomes (broadly defined). This provides evidence that automated language analysis software can provide useful metrics of children’s language input that are related to their language skills.
However, while automated measures have been validated against manual measurements, and the two measures are typically correlated (Cristia et al., 2020), these two methods do not generally aim to quantify exactly the same aspects of the input and may lead to divergent results. For example, returning to Anderson et al.’s (2021) meta-analysis mentioned above, this work tested the relationship between children’s language skills and language input quantity, operationalized as the number of words and utterances in parental child-directed speech from manual transcriptions. They found that the effect of input quantity was stronger for longitudinal studies (i.e., when input was measured at an earlier age than child language skills) than for concurrent studies (i.e., when input was measured at the same age as child language skills). However, Wang et al.’s (2020) meta-analysis (also mentioned above) found conflicting results: the effect of LENA-estimated adult words and conversational turns on children’s language skills decreased as the gap between input/recording and test increased. This discrepancy highlights again that the methods researchers employ to sample children’s language input may affect their conclusions.
Both manual and automated approaches are useful in helping establish what children’s daily language environments are like, but by design, they target different scales of speech input. Nevertheless, establishing whether one type of approach renders metrics with greater predictive power is a worthwhile goal for both learning theories and potential applications. In what follows, we analyze the effects of children’s language input, measured via both manual annotations and automated language analysis, on children’s language development within a single sample over time; we aim to clarify the ways in which these two types of input metrics relate to language development.
Combining language skills and input from infancy to predict preschool language skills
While prior work has considered the roles of input and earlier language skills, few studies have considered multiple measures of both input and skills within individual children. Here we investigate how well integrating these two data sources predicts language outcomes. We ask: how well do language input and language skills in infancy, combined, predict children’s language skills in preschool? Particularly relevant to this question, Rowe (2012) found that parental language during a 90-minute play session at 18 months accounted for additional variance in child vocabulary 2 years later, above and beyond earlier child vocabulary. On the other hand, Jago (2020) found that language input at 18 and 21 months predicts children’s vocabulary at 24 months, but this effect disappears after controlling for children’s previous vocabulary. Therefore, the balance of home input and prior skill in predicting children’s language skills, particularly for very early language input (<18 months) and longer-term outcomes (>2 years later), remains unclear. The present study provides an opportunity to directly compare the predictive potential of early speech input and early language skills on children’s subsequent language skills, considering how these may contribute to preschool knowledge alone or in tandem, and the size of their effects.
The present study
The present study is a longitudinal investigation of children’s early language input and developing language skills. Specifically, we query how children’s early language input, early language skills, and their combination predict children’s language outcomes three years later. In line with previous research, we expect that children’s language skills in infancy, input in infancy, and input in preschool will each correlate with core language skill scores in preschool (Anderson et al., 2021; Bornstein et al., 2020; Fish & Pinkerman, 2003; Hohm et al., 2007). We predict that children’s language input will improve our prediction of preschoolers’ core language skill, over and above the effect of their prior language skills (Rowe, 2012).
Prior research has not yet established whether previous language skills as opposed to language input are likely to be stronger predictors of children’s language outcomes, or how well they may account for later skills collectively. Thus, if language input emerges as a strong predictor (while controlling for early language skills), this would highlight the importance of language input during the earliest stages of children’s language learning and its role in laying the foundation for the development of their language skills for years to come. In contrast, we may find evidence for a limited role of early input, relative to early skills, in predicting later language outcomes, which would suggest a more modest contribution of early language experience (at least within the typical range of experiences captured here).
Additionally, a comparison of the predictive power of early child measures and early language input measures has methodological implications. The former are far easier and more cost-effective to collect than the latter. Thus, depending on an investigator’s research questions, it is worthwhile to know whether the time, effort, and expense of home recordings is justified.
In line with the literature outlined above, our results will be a function of which measures we employ to examine aspects of early language input and skills. Thus, our secondary question asks: How do different measurements of language input relate to language outcomes? As detailed and justified below, we will use hand-annotated noun counts and automated estimates of word counts to predict children’s language skills and assess how the results differ between these measurement types. Researchers face a trade-off between the feasibility of manually annotating vast quantities of language samples and the fine-grainedness of the language data they can analyze. Therefore, a comparison of how manual noun counts vs. automated estimates of word counts predict children’s language skills over time is informative as well.
Method
This study is part of a broader longitudinal study on children’s early language development (SEEDLingS, Bergelson, 2016a, 2016b). The study protocol was approved by the University of Rochester Research Subjects Review Board (IRB Protocol D0676) and the Duke University Campus Institutional Review Board (IRB Protocols 2017–0123; 2018–0158; and 2019–0476).
Participants (N=44, 21 girls, 23 boys)1 were recruited during an 8-month enrollment period in 2014–2015 from a database of local families in Rochester, NY. Participants were typically developing, monolingual English-learning (>75% spoken English input) children. Participants were primarily non-Hispanic White (93%, 2% Hispanic, 5% multiple races). Mothers were highly educated (2% had a high school diploma, 23% had some college, 25% had a bachelor’s degree, and 50% had a graduate degree). The sample included one set of dizygotic twins.
Data collection for the main phase of the study occurred from 2014–2016 and for two follow-ups in 2018 and 2019. Thirty-five children participated at each follow-up timepoint (80% retention); the families of the remaining 9 could not be reached or declined to participate. Participants were compensated a total of $340 for the original yearlong study, $10 for completion of the age 3 follow-up, and up to $70 for completion of all components of the age 4 follow-up.
Language input measures
Researchers collected a set of recordings in the children’s home environments during infancy and a follow-up recording during preschool. Participants completed a daylong audio recording and hourlong video recording at home each month from 0;6—1;5, plus an additional daylong audio recording at age 4;6, for a total of 13 audio and 12 video recordings per child. Researchers were not present during the recordings. When processing each audio recording from infancy, periods of extended silence (e.g., naps) were screened and excluded from analysis.
In this study, we consider seven measures of children’s language input (leading to eight predictors) aggregated from these recordings (see Table 1).
Table 1.
Overview of measure and timepoint combinations used in current study (8 for input; 4 for child language). N shows the number of participants (out of 44) that completed each measure. Input measures (6 manual; 1 LENA-based across 2 timepoints) are shaded blue; child language measures (4 assessments across 4 timepoints) are shaded orange.
| Age | Measure | Assessment Method | N |
|---|---|---|---|
| 0;6–1;5 | Noun tokens, noun types, object presence, imperatives, short phrases, reading | Manual annotation of at-home audio and video recordings | 44 |
| 0;6–1;5 | Adult words per hour | LENA: Language ENvironment Analysis of at-home audio recording | 44 |
| 1;0 | Productive vocabulary | MCDI: MacArthur-Bates Communicative Development Inventories: Words and Gestures | 41 |
| 1;6 | Productive vocabulary | MCDI: MacArthur-Bates Communicative Development Inventories: Words and Gestures | 36 |
| 3;6 | Language skill | QUILS: Quick Interactive Language Screener | 34 |
| 3;6 | Receptive vocabulary | TPVT: NIH Toolbox Picture Vocabulary Test | 37 |
| 4;6 | Core language skill | CELF-P 2: Clinical Evaluation of Language Fundamentals-Preschool: 2nd Edition | 34 |
| 4;6 | Adult words per hour | LENA: Language ENvironment Analysis of at-home audio recording | 33 |
For each audio recording (0;6–1;5 and 4;6), we have automated speech counts generated by the Language ENvironment Analysis (LENA) system (Gilkerson et al., 2017; Xu et al., 2009). We use the estimate of adult word count (AWC) as a measure of language input. We do not use the estimate of conversational turn count (CTC), as it has been found to be less reliable than other LENA metrics (Cristia et al., 2020; Ferjan Ramírez et al., 2021). A systematic review found that correlations between human counts and CTC averaged r = .36, while correlations between human counts and AWC averaged r = .79 (Cristia et al., 2020). Additionally, conversational turns combine input and output, so CTC conflates speech by and to the child in a single measure, rather than being a pure measure of input (which is what we aim to capture here).
To account for differences in recording length (M = 14.07 hrs, SD = 2.42 hrs, mode: 16 hrs; range: 2.70 to 16 hrs2), we calculated the number of adult words per hour. For our infancy recordings (0;6–1;5), we took the mean number of adult words per hour across all 12 monthly recordings for each child. For our preschool timepoint, we used adult words per hour from the single recording at 4;6.
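The normalization described above can be sketched as follows. This is a minimal illustration, not the study's analysis script: the function names and the example values are hypothetical, and averaging per-recording rates (rather than dividing total words by total hours) is an assumption about how the mean was taken.

```python
from statistics import mean

def adult_words_per_hour(adult_word_count, duration_hours):
    """Normalize one recording's adult word count by its length in hours."""
    if duration_hours <= 0:
        raise ValueError("recording duration must be positive")
    return adult_word_count / duration_hours

def infancy_awc_rate(recordings):
    """Average the per-recording hourly rates across a child's recordings.

    `recordings` is a list of (adult_word_count, duration_hours) pairs,
    one pair per monthly recording (assumed layout).
    """
    return mean(adult_words_per_hour(awc, hrs) for awc, hrs in recordings)

# Hypothetical values for illustration only (not data from the study).
example = [(16000, 16.0), (9000, 12.0), (4000, 4.0)]
rate = infancy_awc_rate(example)  # mean of 1000, 750, and 1000 words/hour
```

Expressing input as a rate keeps a short recording from looking like a low-input day simply because less time was captured.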
In our infancy recordings (audios and videos 0;6–1;5), trained researchers manually annotated instances of concrete nouns (e.g., apple, foot, bottle). The manual annotation focused on nouns because it is well-established that early lexical development is dominated by nouns in English and cross-linguistically (Bates et al., 1994; Frank et al., 2021; Goodman et al., 2008), and previous work has established that noun counts are a viable proxy for infants’ input more generally (Bulgarelli & Bergelson, 2020). Researchers annotated several properties of each word when it occurred: the talker (i.e., who said the word), the type of utterance it occurred in (e.g., imperative, question, short phrase), and whether the referent of the word was present and attended to (determined via video and/or audio context, e.g., “here’s your spoon!”). Ten percent of annotations were double-coded for reliability, and disagreements were resolved via consensus. Inter-rater reliability metrics (utterance types: 87.59% agreement, Cohen’s κ = 0.83; object presence: 81.29% agreement, Cohen’s κ = 0.61) indicated high levels of agreement.
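For readers less familiar with the two reliability metrics reported above, the sketch below shows how percent agreement and Cohen's kappa are conventionally computed from two coders' labels. The labels are hypothetical and the code is illustrative only; it is not the procedure or data used in the study.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of items (0-1) on which the two coders chose the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equally long, non-empty label lists")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability that both coders pick the same label
    # if each labels independently at their own base rates.
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical utterance-type labels for illustration only.
a = ["imperative", "question", "short", "short", "question", "imperative"]
b = ["imperative", "question", "short", "question", "question", "imperative"]
agreement = percent_agreement(a, b)
kappa = cohens_kappa(a, b)
```

Kappa is lower than raw agreement whenever some matches would be expected by chance, which is why both values are typically reported together.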
Varying lengths of the audio recording were manually annotated across months (ranging from 3 to 16 hours). For parity, we sample the annotations from the top 3 hours of each infancy audio recording (i.e., 3 hour-long regions with highest amount of talking, as determined by an algorithm which weighed LENA automated metrics over the course of the day; see https://github.com/SeedlingsBabylab/audiowords). Therefore, the noun counts in the present analyses are aggregated from 48 hours of recordings per child (12 hourlong video recordings and 36 hours from daylong audio recordings from across 12 months). The following metrics of children’s noun input were aggregated and used as predictors in the analyses below: noun types, noun tokens, the proportion of object presence (i.e., whether the referent of the word is present), the proportion of nouns in imperative utterances, the proportion of nouns in short phrases ( 3 word utterances), and the proportion of nouns that occurred in reading3. Each of these input properties has been found to predict children’s language development in prior work (e.g., Huttenlocher et al., 1991; Rowe, 2012; Cartmill et al., 2013; Newport et al., 1977; Brent & Siskind, 2001; Karrass & Braungart-Rieker, 2005; see section “Predicting language outcomes with children’s language input” for more detail).
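As a rough illustration of how the six manual input measures listed above could be aggregated from annotated noun instances, consider the sketch below. The record layout, key names, and label strings are hypothetical, as is the choice to treat reading as an utterance-type label and to count noun types as distinct lowercased word forms; the study's own scripts may differ on each point.

```python
def noun_input_metrics(annotations):
    """Aggregate annotated noun instances into six input measures.

    Each annotation is a dict with hypothetical keys:
      "word" (str), "utterance_type" (str), "object_present" (bool).
    """
    tokens = len(annotations)
    if tokens == 0:
        raise ValueError("no annotations to aggregate")

    def proportion(predicate):
        # Share of noun tokens for which the predicate holds.
        return sum(1 for ann in annotations if predicate(ann)) / tokens

    return {
        "noun_tokens": tokens,
        # Assumption: a "type" is a distinct lowercased word form.
        "noun_types": len({ann["word"].lower() for ann in annotations}),
        "prop_object_present": proportion(lambda ann: ann["object_present"]),
        "prop_imperative": proportion(lambda ann: ann["utterance_type"] == "imperative"),
        "prop_short_phrase": proportion(lambda ann: ann["utterance_type"] == "short_phrase"),
        "prop_reading": proportion(lambda ann: ann["utterance_type"] == "reading"),
    }

# Hypothetical annotations for illustration only.
example = [
    {"word": "spoon", "utterance_type": "short_phrase", "object_present": True},
    {"word": "Spoon", "utterance_type": "imperative", "object_present": True},
    {"word": "dog", "utterance_type": "reading", "object_present": False},
    {"word": "ball", "utterance_type": "question", "object_present": True},
]
metrics = noun_input_metrics(example)
```

Expressing object presence and utterance types as proportions of noun tokens keeps these measures from simply tracking how much a family talks overall.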
Language skill measures
Researchers collected a range of child language assessments in infancy during the main phase of the study, as well as at two follow-up timepoints at ages 3;6 and 4;6. See Table 1 for overview.
During the main phase of the study from ages 0;6—1;6, parents completed the MacArthur-Bates Communicative Development Inventories: Words and Gestures (MCDI, Fenson, 2007) for their children every month. This measure is a parent-reported vocabulary checklist that asks parents if their child “Understands” or “Understands and Says” a list of early-learned words and phrases. We selected the MCDI as it is a highly reliable and valid assessment of children’s early vocabulary developed and validated for infants starting during the first year of life. In this study, we use children’s productive vocabulary4, as measured on the MCDI, at 1;0 and 1;6. We opted for productive rather than receptive vocabulary (i.e., words produced rather than understood) given its higher reliability in the literature (Bergelson, 2020; cf., Frank et al., 2021; Tomasello & Mervis, 1994). We selected these two timepoints because 1;6 was the final timepoint from infancy and because 1;0 is the typical age that children say their first words, and accordingly, children’s vocabularies in our sample were very limited before their first birthdays.
Families were invited to participate in two additional follow-ups when the children were 3;6 and 4;6. Children completed the Quick Interactive Language Screener (QUILS, Golinkoff et al., 2017) and the NIH Toolbox Picture Vocabulary Test (TPVT, Weintraub et al., 2013) at age 3;6 (n = 35, M = 3;7, SD = 20.7 days). The QUILS and TPVT are iPad-based assessments of children’s receptive language skills. We selected the QUILS as it is a quick (approx. 15 minutes) and easy to administer measure that assesses multiple domains of language skills (vocabulary, syntax, and word learning skills) for children aged 3–6 years; we report children’s standardized overall language scores. We additionally selected the TPVT as it is normed for ages 3–99 and is quick to administer (approx. 5 minutes); we use age-corrected scores for our analyses.
Our primary outcome measure for this study is the Clinical Evaluation of Language Fundamentals-Preschool: 2nd Edition (CELF-P2, Wiig et al., 2004), which children (n = 35) completed shortly after age 4;6. Researchers aimed to schedule follow-up visits when children were between 4;6 and 5;0 (M = 4;10, SD = 52.6 days; range 4;6 to 5;1; mode = 4;9). The CELF-P2 is a standardized in-depth assessment of language skills (semantics, morphology, and syntax) for children aged 3–6 years. It was designed to assess the aspects of children’s language skills that are important for their transition into school and is commonly used by speech-language pathologists to assess children’s language abilities (e.g., Denman et al., 2023). The CELF-P2 provides a score for “core language skill” plus multiple subscales, including both expressive and receptive skills and both language content and structure. It has been found to be reliable and valid (Wiig et al., 2004). A recent comparison of language assessments for children aged 4–11 years found the CELF-P2 to have high psychometric quality and concluded it was one of the best assessment options for this age group (Denman et al., 2017). For the present analyses, we use children’s core language score5 from the CELF-P2, which is an age-standardized overall composite score (i.e., standardized to mean = 100 and SD = 15 for each 6-month age bin). We note that this “core language skill” composite score from the CELF-P2 is not the same as the latent variable “core language skill” used by Bornstein and colleagues cited above. It is a broad score that includes a variety of both expressive and receptive language skills across semantics, morphology, and syntax.
In addition to the above, children also completed multiple other assessments of language and non-verbal cognitive skills at the 3;6 and 4;6 follow-ups, detailed in the Supplementary Materials.6 These include two additional standardized language assessments at age 4;6: the TPVT7 and the Renfrew Bus Story (Cowley & Glasgow, 1994). We chose to analyze the CELF-P2 as our primary outcome measure for the current paper over these other measures because the CELF-P2 provides a more comprehensive assessment of children’s language skills. For thoroughness and transparency, we report basic descriptive statistics and correlations between each of these measures and the CELF-P2 in the Supplementary Materials, Table S1, as well as a full correlation matrix in Table S2.
Transparency and Openness
This manuscript follows the APA Style Journal Article Reporting Standards (Appelbaum et al., 2018). All analyses were conducted in R (R version 4.4.3 (2025-02-28), R Core Team, 2021) and the manuscript was generated using papaja (Version 0.1.0.9997, Aust & Barth, 2020). This study’s design and analyses were not preregistered. All data and analysis scripts are available via Open Science Framework (https://osf.io/kyxua/). Recordings with parental authorization to release are available to authorized researchers upon request via Databrary (https://databrary.org/party/61) and Homebank (https://homebank.talkbank.org/access/Password/Bergelson.html).
Results
Analysis plan
First, we report analyses predicting children’s language outcomes in preschool with their previous language skills. We calculated Spearman’s ρ (non-parametric correlations, due to non-normally distributed data; however, all patterns remain the same using Pearson’s correlations) to see which predictors were independently related to our language outcome measure. Next, we selected the predictors that were significantly correlated with preschool language and entered them collectively into a model to assess how children’s early language skills predict their language outcomes in preschool.
Then, we predict children’s preschool language skills using measures of their language input concurrently and from infancy. We conducted the same analyses as described above using language input measures as predictors, to assess how language input in infancy is associated with language outcomes in preschool.
Finally, we combine children’s earlier language skills and their language input to predict their preschool language outcomes. Here we selected all predictors that were significantly correlated with preschool language outcomes from both previous sets of analyses: children’s language skills and language input. We then conducted backward stepwise model comparison by AIC using the MASS package in R (Venables & Ripley, 2002) to determine the overall best model for predicting preschool language skills. In the Supplementary Materials, we provide an additional analysis that includes child gender and maternal education as predictors, with an analogous model selection process to the one reported here. For all reported analyses, missing data points are excluded (rather than imputed); Table 1 lists the n for each measure, and Supplementary Table S1 provides further descriptive statistics of each measure.
While the specifics of our analysis plan were not preregistered, our first two sets of analyses are confirmatory, as our directional a priori hypothesis was that all measures of children’s language input and language skills would positively correlate with preschool language skills, other than proportion of imperatives in the input, which was predicted to negatively correlate with preschool language skills based on the prior literature (Creaghe et al., 2021; Newport et al., 1977; Tomasello & Farrar, 1986; Tomasello & Todd, 1983). Our final analysis combining language skills and language input to predict language outcomes was exploratory.
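As a minimal illustration of the zero-order correlations described in this analysis plan, the sketch below computes Spearman’s ρ as the Pearson correlation of rank-transformed scores, dropping any pair with a missing value (mirroring the exclusion, rather than imputation, of missing data). It is not the authors’ R code: the function names and the example values are hypothetical and are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho on complete pairs only (pairs containing None are dropped)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up illustrative values, NOT data from the study.
vocab_18mo = [12, 40, 40, 95, None, 210]
celf_54mo = [96, 101, 104, 110, 120, 131]
rho = spearman_rho(vocab_18mo, celf_54mo)
print(f"Spearman's rho = {rho:.2f}")
```

Because the statistic depends only on ranks, it is insensitive to the skew that is typical of early vocabulary counts, which is the motivation given above for preferring it over Pearson’s r.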
Predicting language outcomes with children’s previous language skills
First, we assessed how well children’s earlier language skills predict their preschool language outcomes using simple zero-order correlations. We found that three of our four earlier language measures showed a moderate-to-strong correlation with 4;6 core language skill: MCDI vocabulary at 1;6 (Spearman’s ρ = 0.54, p = .002), QUILS language score at 3;6 (ρ = 0.63, p < .001), and TPVT vocabulary at 3;6 (ρ = 0.44, p = .014). MCDI vocabulary at 1;0 was not significantly correlated with 4;6 core language skill (ρ = 0.28, p = .114). Correcting for multiple comparisons using the Holm procedure yields the same pattern of results. See Figure 1.
Figure 1:
Correlations between core language skill (CELF-P2) at age 4;6 and language skill measures from earlier timepoints: MCDI at 1;0 and 1;6 (log-transformed for visualization purposes only), and QUILS and TPVT at 3;6. Non-parametric correlations (Spearman’s ρ and corresponding p-value in each facet) were used due to non-normally distributed data; error bars are for visualization purposes only.
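The Holm procedure referred to above can be sketched as follows, applied to the four p-values reported for the earlier language measures. This is an illustrative re-implementation rather than the authors’ code. One value is an assumption: the QUILS correlation is reported only as p < .001, so .001 is used here as an upper bound.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1),
        # capped at 1, and forced to be non-decreasing via a running maximum.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


reported = {
    "MCDI 1;0": 0.114,
    "MCDI 1;6": 0.002,
    "QUILS 3;6": 0.001,  # reported only as p < .001; .001 is an assumed upper bound
    "TPVT 3;6": 0.014,
}
adjusted = dict(zip(reported, holm_adjust(list(reported.values()))))
significant = sorted(name for name, p in adjusted.items() if p < 0.05)
print(adjusted)
print(significant)
```

Under this assumption the same three measures stay below .05 after adjustment, in line with the statement that the correction does not change the pattern of results.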
We then entered these three measures that individually correlated with preschool language collectively in a single regression model predicting preschool CELF scores (model 1: core language skill ~ MCDI vocabulary (1;6) + QUILS language score (3;6) + TPVT vocabulary (3;6)). (While many of the language skill measures are inter-correlated, this model did not have concerning levels of multicollinearity: all VIFs < 3.) This model accounts for 48.12% of the variance in children’s preschool language outcomes (adjusted R² = 0.40). See Table 2. Within this model, only MCDI vocabulary at 1;6 remained a significant predictor on its own (b = 0.06, SE = 0.03; Table 2). To put this effect in more intuitive units, for every 20 additional words a child was reported to say at 18 months on the MCDI, they scored 1.10 points higher in core language skill on the CELF-P2 (which, as a reminder, is age-normed and standardized to mean = 100 and SD = 15) three years later.
Table 2.
Regression table for four models predicting CELF core language score at 4;6.
Dependent variable: CELF-P2 core language skill at 4;6
| | Skills (1) | Input (2) | Best model (3) | MCDI only (4) |
|---|---|---|---|---|
| Intercept | 65.06** | 120.37*** | 63.13** | 103.15*** |
| | (27.47) | (4.16) | (24.59) | (2.51) |
| 1;6 productive MCDI | 0.06** | | 0.05** | 0.09*** |
| | (0.03) | | (0.02) | (0.02) |
| 3;6 QUILS language | 0.41 | | 0.39* | |
| | (0.26) | | (0.23) | |
| 3;6 TPVT vocab | −0.04 | | | |
| | (0.24) | | | |
| 0;6–1;5 prop. imperatives | | −144.18** | | |
| | | (53.67) | | |
| Observations | 24 | 34 | 24 | 29 |
| R² | 0.48 | 0.18 | 0.48 | 0.40 |
| Adjusted R² | 0.40 | 0.16 | 0.43 | 0.37 |
Note: Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01
Predicting language outcomes with children’s language input
Next, we turned to children’s language input. Using the LENA-generated counts to look for zero-order Spearman’s correlations as we did for earlier language skills, we found that neither adult words per hour from infancy (ρ = 0.11, p = .529) nor concurrently from preschool (ρ = 0.05, p = .804) was correlated with preschool language skills.
Using hand-annotated noun counts, we found that the majority of input metrics were not related to preschool language (ps > .05). The only exception was the proportion of imperatives, which showed a statistically robust, moderate, negative correlation with preschool language (i.e., hearing relatively more imperatives was associated with lower CELF-P2 scores; ρ = −0.47, p = .005). See Figure 2. Although analyses of these six variables were planned based on the literature reviewed above, this result remains statistically significant under a more conservative approach that corrects for multiple comparisons via the Holm procedure (p = .032).
Figure 2:
Correlations between core language skill (CELF-P2) at age 4;6 and language input measures from noun annotations (left 3 columns) and LENA (right column). Non-parametric correlations (Spearman’s ρ and corresponding p-value in each facet) were used due to non-normally distributed data; error bars are for visualization purposes only.
We then ran a model predicting preschool language using the predictors which were significantly correlated in the above analyses. This model, which includes only the proportion of imperatives as a predictor (model 2: core language skill ~ prop. imperatives (0;6–1;5)), accounts for 18.40% of the variance in children’s preschool language outcomes (adjusted R² = 0.16). (While Spearman’s correlations are appropriate for our zero-order correlations, Pearson’s r for proportion of imperatives and 4;6 CELF-P2 is −0.43, consistent with model results.) Thus, for every 1-percentage-point increase in the proportion of imperatives children heard in infancy, we see a decrease of 1.40 points in their core language skill at 4;6. That is, this model predicts that a child whose input contained 5% imperatives would have a core language score of 113, while a child whose input contained 15% imperatives would have a core language score of 99. See Table 2.
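The two predicted scores above follow directly from the model 2 coefficients in Table 2. The short sketch below reproduces that arithmetic; the function and constant names are illustrative, and the coefficients are the rounded values printed in the table rather than the full-precision estimates.

```python
INTERCEPT = 120.37  # model 2 intercept from Table 2
SLOPE = -144.18     # change in CELF-P2 score per unit (0-1) proportion of imperatives


def predicted_celf(prop_imperatives):
    """Predicted CELF-P2 core language score for a given proportion of imperatives."""
    return INTERCEPT + SLOPE * prop_imperatives


low_imperatives = predicted_celf(0.05)   # input containing 5% imperatives
high_imperatives = predicted_celf(0.15)  # input containing 15% imperatives
print(round(low_imperatives), round(high_imperatives))  # 113 99
```

Because the predictor is a proportion between 0 and 1, the tabled slope has to be divided by 100 to be read per percentage point, which gives roughly 1.4 points.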
Combining language skills and input from infancy to predict preschool language skills
Finally, we built a model to predict preschoolers’ language outcomes with their language input and earlier language skills. We included as predictors those variables that were significantly correlated with preschool language (above). That is, we started with the following full model (model 3): core language skill ~ MCDI vocabulary (1;6) + QUILS language score (3;6) + TPVT vocabulary (3;6) + prop. imperatives (0;6–1;5). Next, we ran stepwise model comparison by AIC to determine the best model for predicting CELF at age 4;6. The best model as determined by model comparison is: core language skill ~ MCDI vocabulary (1;6) + QUILS language score (3;6). See Table 2 (model 3). This model accounts for 48.04% of the variance in children’s preschool language outcomes (adjusted R² = 0.43). Similar to our findings above, MCDI vocabulary at 1;6 is the only predictor that remained significant on its own within this model (b = 0.05, 95% CI [0.005, 0.10]).
Due to this strong effect, we ran an additional model using only MCDI vocabulary at 1;6 as a predictor (model 4: core language skill ~ MCDI vocabulary (1;6)). This model accounts for 39.58% of the variance in children’s preschool language outcomes (adjusted R² = 0.37). (While Spearman’s correlations are appropriate for our zero-order correlations, Pearson’s r for 1;6 MCDI and 4;6 CELF-P2 is 0.63, consistent with model results.) Again in more intuitive units, this effect means that for each additional 10 words a child was reported to say at 1;6, they scored 0.89 points higher on the CELF-P2 at 4;6. Therefore, a child with an MCDI productive vocabulary of 50 words at 1;6 would be predicted to score 107 on the CELF-P2 at age 4;6, while a child with an MCDI productive vocabulary of 150 words at 1;6 would be predicted to score 116.
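For readers comparing the variance-explained figures across the four models, the sketch below applies the standard adjusted R² formula to the R² values reported in the text, with the number of observations and predictors for each model taken from Table 2. The function name is illustrative and the inputs are rounded published values, so this is a consistency check rather than a re-analysis.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# (R^2 from the text, observations from Table 2, number of predictors)
models = {
    "Skills (1)": (0.4812, 24, 3),
    "Input (2)": (0.1840, 34, 1),
    "Best model (3)": (0.4804, 24, 2),
    "MCDI only (4)": (0.3958, 29, 1),
}
adjusted = {name: round(adjusted_r2(*spec), 2) for name, spec in models.items()}
print(adjusted)
```

The penalty for extra predictors is why model 3, with two predictors, has a higher adjusted value than model 1, with three, even though their unadjusted R² values are nearly identical.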
Discussion
Our longitudinal analysis of how well measures of children’s language input and skills predicted language outcomes in preschool revealed several notable results. We found that earlier measures of child language were nearly all predictive of children’s language outcomes at 4;6 (i.e., MCDI at 1;6 but not at 1;0, TPVT at 3;6 and QUILS at 3;6); these measures span vocabulary, syntax, and word learning skills. On the other hand, we found that the proportion of nouns in imperative utterances was the only input predictor of children’s language outcomes, such that more imperatives in early input predicted lower overall child language scores (CELF-P2) at 4;6. None of our other automated or manually derived input measures were significant predictors. Overall, the best predictors of core language skill at 4;6 were child vocabulary at 1;6 and language scores at 3;6, which accounted for nearly half of the variance in CELF scores at 4;6. Child vocabulary at 1;6 on its own was a notably strong predictor, accounting for over a third of the variance in CELF scores at 4;6.
It is perhaps not surprising that measures of the child (via parent report) do a better job predicting measures of the child later (via direct assessments, like the CELF-P2) than measures of the environment. That said, a novel contribution of this work is that it permitted us to consider both child- and input-based measures in relation to each other and evaluate their relative contributions.
Effects of imperatives on language development
We found children who heard more imperatives in infancy had lower language scores in preschool. This finding aligns with prior work showing that imperatives are negatively associated with children’s vocabulary and syntactic development (Creaghe et al., 2021; Newport et al., 1977). Researchers have posited that imperatives may have this negative effect because they redirect children’s attention and control their behavior (Newport et al., 1977; Tomasello & Farrar, 1986; Tomasello & Todd, 1983).
In line with this finding, some studies identify subtypes of imperative utterances where the parent’s underlying goal differs (Akhtar et al., 1991; Masur et al., 2005). Akhtar et al. (1991) found that imperatives that followed in on the child’s attention (e.g., “Put the block here”) – as opposed to imperatives that redirected the child’s attention (e.g., “No, look at this cow”) – positively correlated with children’s language. Although we did not differentiate between these two types of imperatives in this study, this prior work suggests that the effect of imperatives may not be due to their syntactic structure per se, but instead to caregivers’ interactional styles. Parents’ use of imperatives may relate to controlling or intrusive parenting behaviors (Taylor et al., 2009). In this way, parents’ imperatives may be a signal of a broader feature of parenting behaviors and parent-child interactions. However, the literatures on linguistic input and parenting styles are largely separate, with few studies attempting to disentangle the effects of linguistic input and more general parenting styles on language development. This is an intriguing avenue for future research.
Additionally, our measure of imperatives in parents’ speech was the proportion of nouns that occurred in imperative utterances, and not the overall proportion of all imperative utterances. For example, our measure doesn’t capture sentences like “Hey, stop that,” but does capture “Let me put on your shoes” and “Don’t steal my glasses!” While children’s noun input is highly correlated with their overall input (Bulgarelli & Bergelson, 2020), it is possible that imperative utterances with nouns represent a different slice of children’s language experience than imperative utterances more generally and may impact children’s language development differently.
Lack of input effects on preschool language skills
Beyond the effect of imperatives, we did not find any other effects of children’s input on their core language skill in preschool, contrary to our hypotheses. Each of the language input measures we considered in this study (types, tokens, object presence, imperatives, short phrases, reading, and LENA-generated adult words) has previously been found to relate to children’s language learning (e.g., Anderson et al., 2021; Wang et al., 2020). However, in this study, no other input measures from manual noun counts or LENA estimates were predictive of language outcomes. Moreover, our single significant input predictor (imperatives) did not explain any additional variance once we accounted for children’s earlier language skills.
These results suggest that children’s language input does not appear to be strongly related to their language outcomes in preschool, at least using the set of measures we use here. It is possible that methodological decisions, including the measures selected to measure both language skills and language input, affected these results.
First, we note that little prior research on input effects has used the CELF-P2 as its outcome measure. We identified one prior study which analyzed the effects of parental language input on children’s CELF-P2 scores. Unlike the core language score we utilize here, Tabulda (2017) used three subtests of the CELF-P2. However, they similarly found null effects of earlier input on children’s CELF-P2 scores. Specifically, Tabulda (2017) found that two measures of parental speech at age 3 (number of different words, mean length of utterance) did not predict children’s scores on three CELF-P2 subtests (vocabulary, syntax, and phonological awareness) at age 4, after controlling for earlier child language skill (measured by number of different words at age 3).
Furthermore, the aspects of children’s language environment most related to language development may change over the course of development. For example, as described in Rowe (2012), input quantity (e.g., word tokens) may be more important during the earliest stages of vocabulary development, while input diversity (e.g., word types) may become increasingly important as infants’ vocabularies become larger. In line with this possibility, we find that the number of noun tokens is the best predictor of children’s vocabulary at age 1;6 in the current sample (see Supplementary Materials), but this effect does not persist into preschool (neither for the CELF core language skill, nor its vocabulary subtest). Additionally, although noun counts are strongly correlated with overall input counts (Bulgarelli & Bergelson, 2020), it is possible that noun-based input measures, such as those used here, are most relevant for children’s early vocabulary growth and their effects could fade out over time as children’s language skills further develop. Future research considering the potentially time-varying effects of language input over multiple timepoints (e.g., Silvey et al., 2021) is important for expanding our understanding of the dynamically changing language skills at play.
Relatedly, we were not able to include measures of syntactic complexity (e.g., mean length of utterance), given the nature of the data annotation and its focus on nouns. We note that prior work suggests such measures have been predictive of children’s later language skills, particularly their syntactic development (Huttenlocher et al., 2010), and welcome future work exploring this further.
Our findings highlight the importance of considering measurement issues when interpreting results. If this study had included only automated speech estimates (from LENA), we would have reported that none of our measures of language input, from infancy or preschool, predicted core language skill at 4;6, despite the fact that these automated measures drew from many more hours of data per child per timepoint. Thus, this study adds to a growing literature documenting how researchers’ decisions about language input measurement can affect their results and conclusions (Bergelson et al., 2019; Dailey & Bergelson, 2022; Sperry et al., 2019). For example, in the current analyses, in addition to the LENA measures from the full audio-recorded day, we sampled language input from 12 hourlong video recordings and the top three hours from each of 12 daylong audio recordings per participant. While this represents a relatively sizeable sample of language input (48 hours over the course of a year), prior research demonstrates that “top” hours are not representative of all hours throughout the day (Bergelson et al., 2019; De Barbaro & Fausey, 2022). Thus, while we utilized a combination of longer samples of children’s language environments (automated measures from LENA) and shorter samples of manual annotation (from 3 hours of audio and 1 hour of video), we remind readers that other sampling procedures may yield different results.
Additionally, we note the importance of considering potential effect sizes across measures. For the present sample sizes, our LENA AWC measures, both aggregated across infancy recordings and from a concurrent preschool recording, were not predictive of CELF core language skills. This is despite a large and growing literature about the association between LENA metrics and language outcomes: In their meta-analysis, Wang et al. (2020) found that LENA AWC and language outcomes were significantly related with an effect size of r = 0.21 (p < .001). This small-to-medium sized effect indicates that larger-scale samples are needed for sufficient statistical power to detect associations between LENA-based input measures and children’s language outcomes. This may be useful for researchers to consider when planning samples in other use cases, particularly if samples of fewer than 50 participants are being analyzed, as is often the case in hard-to-recruit settings in early childhood.
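To make the power consideration above concrete, the following sketch uses the Fisher z approximation to estimate the power to detect a correlation of r = 0.21 at the present outcome sample size (n = 35), and the sample size needed to detect it reliably. This calculation is not part of the reported analyses. The 80% power target and two-sided α = .05 are conventional assumptions, the approximation is for Pearson’s r rather than Spearman’s ρ, and the function names are illustrative.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_for_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n participants."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Fisher's z-transformed r has standard error 1 / sqrt(n - 3).
    shift = atanh(r) * sqrt(n - 3)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_for_power(r, power=0.80, alpha=0.05):
    """Approximate sample size needed to reach the requested power (two-sided test)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(((z_crit + z_power) / atanh(r)) ** 2 + 3)


power_at_35 = power_for_correlation(0.21, 35)
needed_n = n_for_power(0.21)
print(f"power at n = 35: {power_at_35:.2f}; n needed for 80% power: {needed_n}")
```

Under these assumptions, a sample of 35 has only about a one-in-four to one-in-five chance of detecting an effect of the meta-analytic size, and on the order of 175 participants would be needed for 80% power.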
Children’s vocabulary in infancy predicts their language skills in preschool
In this study, parent-reported vocabulary from infancy was the strongest predictor of children’s later language scores. In fact, children’s MCDI productive vocabulary at 1;6 alone explains nearly 40% of the variance in children’s preschool language scores. In comparison, previous studies predicting preschool language skills with predictors from infancy were able to account for approximately 5%–20% of the variance (Duff et al., 2015; Fish & Pinkerman, 2003; Friend et al., 2019; Reilly et al., 2010).
Our results align with the literature that finds long-term stability in individual differences in language abilities (Bornstein et al., 2016, 2018). While we can make no causal claims, this suggests that early productive vocabulary measurements from parental report have high predictive value even in relatively small sample sizes, while language input measures (at least the 8 manual and automated measures we consider here) do not. This result leaves open the question of how we can predict and support children’s early vocabularies. In a further supplemental analysis, we find weak evidence that early input in infancy is predictive of children’s vocabulary at 1;6 (see Supplementary Materials). Related work endeavors to explore this question in more depth (Bulgarelli & Bergelson, 2023).
It is perhaps surprising that parent-reported productive vocabulary at 1;6 is a better predictor than direct assessments of language comprehension at 3;6. In contrast to our findings, Friend et al. (2019) found that a direct assessment of comprehension at 1;10 was a better predictor of preschool language than parent-reported vocabulary on the MCDI at 1;4. These conflicting patterns may indicate that vocabulary at 1;6 is a significantly better predictor than vocabulary just 2 months earlier at age 1;4, perhaps due to the rapid vocabulary growth many infants experience during this timeframe (Bergelson, 2020; Frank et al., 2021). It may also be that our direct tablet-based assessments in preschool (i.e., QUILS and TPVT) were not sufficiently sensitive to capture differences in preschool language skill for the purposes of this study.
More broadly, this study has important implications for researchers and practitioners, as children’s language abilities are considerably easier to measure than their language input in infancy. To collect the language input measures used in this study, parents consented to having researcher-supplied equipment record their interactions with their infants and other goings-on in their homes for many hours. Researchers then scheduled home visits with parents, before traveling to participants’ homes to provide, set up, and retrieve recording equipment each month. After data collection, the four hours of recordings we analyzed for each child each month (a subset of the broader annotation efforts for this project) took trained researchers roughly 16 hours to annotate (per child per month).
In contrast, measuring children’s early vocabulary was far less demanding for both parents and researchers. The productive vocabulary measure used in this study at 1;6 was a parental report checklist, which takes parents approximately 30 minutes to complete (less for younger infants), can be administered online, and is easy to score (DeMayo et al., 2021; Fenson, 2007). It is thus an important finding that this easy-to-administer vocabulary checklist may provide us with a better signal of children’s language development years down the road than their difficult-to-measure language input, even at this young age. Researchers may not need to laboriously collect detailed information about children’s language input, when simple measurements of their vocabulary may suffice, depending of course on research goals. However, we note that vocabulary checklists and standardized vocabulary assessments may not be feasible or appropriate in all socio-cultural contexts. For this reason, researchers should carefully consider how to appropriately measure children’s early language skills in their specific context.
Notably (and deliberately), we leave open the question of if and how we should intervene to support language development in infancy. Here we have shown that early language skills are strong predictors of children’s scores on language assessments in preschool. More work is needed to investigate potential approaches to boost children’s language skills earlier in life (even in infancy), though it is not yet clear what such an approach would entail. At the same time, while the present work cannot speak directly to intervention development, the results presented here suggest that interventions to increase parents’ child-directed speech may not turn out to be the most effective way to bolster children’s language skills at schooling onset in the long term, at least in samples like this one. Though prior work has linked aspects of language input to child outcomes in correlational designs (e.g., Huttenlocher et al., 1991; Rowe, 2012), whether interventions would have lasting measurable efficacy is not well-established (e.g., Huber et al., 2023; McGillion et al., 2017; Suskind et al., 2016). While outside of the scope of the present work, future work could fruitfully investigate the many possibilities for supporting all children’s language development in early infancy, in order to set children up for success before they enter school.
Limitations
There may be effects of input that we do not capture in this study due to limitations of our sample. Firstly, our sample at 4;6 includes 35 participants who completed the preschool language outcome measure. This leaves us underpowered to detect small effects. For instance, we found that the effects of noun token and type counts were marginally significant (ps < .1). Given previous research (e.g., Anderson et al., 2021; Rowe, 2012) we expect that in a larger sample, this would be a small but potentially meaningful effect. Additionally, our sample size (n = 35) limits the complexity of statistical analyses we are able to conduct. For example, a larger sample could utilize structural equation modeling to analyze complex relationships in longitudinal data. We would welcome this kind of analysis in future, larger-scale work, which could take steps towards testing causal possibilities beyond the scope possible here.
Secondly, our sample is fairly homogeneous (93% White, and 75% of mothers have at least a bachelor’s degree) and this has several consequences for the interpretation of our results. It may be that we encounter a threshold effect in this sample: Perhaps all children in our sample receive “enough” input of the relevant kind, and therefore, the variation within our sample does not reflect meaningful differences in input that may be found across broader populations. We cannot rule this out, though we note that the variation in CELF scores we were modeling ranged from 1 SD below to 2 SDs above the norming sample mean for this instrument, suggesting that while our sample’s mean may be shifted, its outcome variation remained wide. That said, generalization of our results to broader more representative samples should be done with caution. In our supplemental analyses, we include child gender and maternal education as predictors and find little evidence that adding sociodemographic variables substantially changes our pattern of results (see details in Supplementary Materials). This is perhaps unsurprising due to the limited range of maternal education in our sample (i.e., nearly half of mothers had advanced degrees). However, more broadly, this is in line with recent evidence suggesting that socioeconomic effects on everyday language use are, at best, weaker than prior reports (Bergelson et al., 2023; Dailey & Bergelson, 2022; Piot et al., 2022).
Additionally, our sample scored highly on our main language outcome measure, the CELF-P2. Our participants’ average score on the CELF-P2 was approximately one standard deviation above the average for the measure’s norming sample (our sample: M = 110.38, SD = 11.82, range: 86 to 137; norming sample: M = 100, SD = 15). Despite this, we did have variability in language scores, as scores ranged from approximately 1 SD below the norming sample mean to 2 SDs above the norming sample mean. Thus, it is possible that language input measures could be more predictive of later language skills for families with children scoring in a lower range. Furthermore, we believe caution is warranted in generalizing our results to a broader population.
That said, as our results highlight, the effects of children’s language input on language development are likely incredibly complex and sensitive to measurement differences. Nevertheless, our findings do suggest that variation in children’s language abilities on the kinds of skills captured by the CELF-P2 are more readily and robustly predicted by measuring children’s early language abilities directly rather than by measuring their early language input.
Conclusion
We found that the best predictor of children’s language skills before school entry was their earlier language skills. Measures of children’s language environment, which have been found to predict language skills (Anderson et al., 2021), did not add any predictive power over children’s own language skills at an earlier age in this sample. Strikingly, we found that parental report of children’s productive vocabulary at 1.5 years predicted over a third of the variability in children’s standardized language scores three years later.
While children’s language input is certainly an important contributor to their language learning, measuring the input, in all its different facets, is challenging. Our results suggest that in contexts like those studied here, clinicians and researchers can have confidence in parental report of children’s productive vocabulary even before age two. Such measures turn out to predict language skills beyond vocabulary alone, and provide an easy-to-collect and accurate predictor of language skills before kindergarten entry.
Supplementary Material
Public significance statement:
This study investigated how children’s early language input and language skills can predict their language outcomes later in childhood. We found that early measures of children’s language skills were consistently strong predictors of later language skills, while measures of language input during infancy were not. This suggests that early language assessments are reliable measures with high predictive value for children’s long-term language outcomes, which themselves have been linked to further metrics of school achievement downstream.
Acknowledgments
We thank Federica Bulgarelli for her contributions to an earlier version of this work. An earlier version of this manuscript was part of the first author’s dissertation. CRediT author contributions: Shannon Egan-Dailey: Conceptualization; Methodology; Formal analysis; Investigation; Writing - Original Draft; Writing - Review & Editing; Visualization. Elika Bergelson: Conceptualization; Methodology; Resources; Writing - Review & Editing; Supervision; Project administration; Funding acquisition.
Footnotes
1. The study aimed to enroll 48 participants. This sample size was determined because planned substudies for the original longitudinal study split the sample into 3 groups, and the standard minimum sample size is 16. Based on this, the study aimed to enroll 3×16 or 48 participants. Two additional participants enrolled but dropped out in the early stages of data collection.
2. There was one file with an unusually short recording length of 2.7 hours due to issues with the recording process. The second shortest recording was 5.3 hours long.
3. In our analyses, we considered nouns in imperative utterances, short phrases, and reading due to our a priori hypotheses based on the prior literature. However, we include a supplemental analysis of the remaining utterance types (i.e., proportions of nouns in questions, singing, and declarative utterances) for thoroughness; see Supplementary Materials.
4. We report analyses using raw vocabulary scores; using raw or percentile-based scores yields the same pattern of results.
5. Using the vocabulary subscore from the CELF-P2 yields the same pattern of results; see Supplementary Materials.
6. In addition to the mentioned measures, participants completed the NIH Toolbox Picture Sequence Memory Test (Zelazo et al., 2013), the Minnesota Executive Function Scale (Carlson & Zelazo, 2014), and the Ages & Stages Questionnaire, Third Edition (Squires & Bricker, 2009) at age 3;6, and the Block Design and Matrix Reasoning subtests of the Wechsler Preschool & Primary Scale of Intelligence, 4th Edition (Wechsler, 2012) and the Ages & Stages Questionnaire at age 4;6.
7. See Supplementary Materials for an additional set of analyses predicting preschool vocabulary using the TPVT.
References
- Akhtar N, Dunham F, & Dunham PJ (1991). Directive interactions and early vocabulary development: The role of joint attentional focus. Journal of Child Language, 18(1), 41–49. 10.1017/S0305000900013283
- Anderson NJ, Graham SA, Prime H, Jenkins JM, & Madigan S. (2021). Linking quality and quantity of parental linguistic input to child language skills: A meta-analysis. Child Development, 92(2), 484–501. 10.1111/cdev.13508
- Appelbaum M, Cooper H, Kline RB, Mayo-Wilson E, Nezu AM, & Rao SM (2018). Journal article reporting standards for quantitative research in psychology: The APA Publications and Communications Board task force report. American Psychologist, 73(1), 3–25. 10.1037/amp0000191
- Aust F, & Barth M. (2020). Papaja: Create APA manuscripts with R Markdown.
- Bates E, & Goodman JC (2001). On the inseparability of grammar and the lexicon: Evidence from acquisition. In Tomasello M. & Bates E. (Eds.), Language development: The essential readings (pp. 134–162). Blackwell Publishing.
- Bates E, Marchman V, Thal D, Fenson L, Dale P, Reznick JS, … Hartung J. (1994). Developmental and stylistic variation in the composition of early vocabulary. Journal of Child Language, 21(1), 85–123. 10.1017/S0305000900008680
- Bergelson E. (2016a). HomeBank Bergelson Corpus. TalkBank. 10.21415/T5PK6D
- Bergelson E. (2016b). SEEDLingS Corpus. Databrary.
- Bergelson E. (2020). The comprehension boost in early word learning: Older infants are better learners. Child Development Perspectives, 14(3), 142–149. 10.1111/cdep.12373
- Bergelson E, Amatuni A, Dailey S, Koorathota S, & Tor S. (2019). Day by day, hour by hour: Naturalistic language input to infants. Developmental Science, 22(1), e12715. 10.1111/desc.12715
- Bergelson E, & Aslin RN (2017). Nature and origins of the lexicon in 6-mo-olds. Proceedings of the National Academy of Sciences, 114(49), 12916–12921. 10.1073/pnas.1712966114
- Bergelson E, Soderstrom M, Schwarz I-C, Rowland CF, Ramírez-Esparza N, Hamrick R, Hamrick L, … Cristia A. (2023). Everyday language input and production in 1,001 children from six continents. Proceedings of the National Academy of Sciences, 120(52), e2300671120. 10.1073/pnas.2300671120
- Bornstein MH, Hahn C-S, & Haynes OM (2004). Specific and general language performance across early childhood: Stability and gender considerations. First Language, 24(3), 267–304. 10.1177/0142723704045681
- Bornstein MH, Hahn C-S, & Putnick DL (2016). Long-term stability of core language skill in children with contrasting language skills. Developmental Psychology, 52(5), 704–716. 10.1037/dev0000111
- Bornstein MH, Hahn C-S, Putnick DL, & Pearson RM (2018). Stability of core language skill from infancy to adolescence in typical and atypical development. Science Advances, 4(11), eaat7422. 10.1126/sciadv.aat7422
- Bornstein MH, & Haynes OM (1998). Vocabulary Competence in Early Childhood: Measurement, Latent Construct, and Predictive Validity. Child Development, 69(3), 654–671. 10.1111/j.1467-8624.1998.tb06235.x
- Bornstein MH, Putnick DL, Bohr Y, Abdelmaseh M, Lee CY, & Esposito G. (2020). Maternal sensitivity and language in infancy each promotes child core language skill in preschool. Early Childhood Research Quarterly, 51, 483–489. 10.1016/j.ecresq.2020.01.002
- Brent MR, & Siskind JM (2001). The role of exposure to isolated words in early vocabulary development. Cognition, 81(2), B33–B44. 10.1016/S0010-0277(01)00122-6
- Bulgarelli F, & Bergelson E. (2020). Look who’s talking: A comparison of automated and human-generated speaker tags in naturalistic day-long recordings. Behavior Research Methods, 52(2), 641–653. 10.3758/s13428-019-01265-7
- Bulgarelli F, & Bergelson E. (2023). Linking acoustic variability in the infants’ input to their early word production [Preprint]. PsyArXiv. 10.31234/osf.io/su768
- Cartmill EA, Armstrong BF, Gleitman LR, Goldin-Meadow S, Medina TN, & Trueswell JC (2013). Quality of early parent input predicts child vocabulary 3 years later. Proceedings of the National Academy of Sciences, 110(28), 11278–11283. 10.1073/pnas.1309518110
- Condouris K, Meyer E, & Tager-Flusberg H. (2003). The Relationship Between Standardized Measures of Language and Measures of Spontaneous Speech in Children With Autism. American Journal of Speech-Language Pathology, 12(3), 349–358. 10.1044/1058-0360(2003/080)
- Cowley J, & Glasgow C. (1994). The Renfrew Bus Story: Language screening by narrative recall—Examiner’s manual. Centreville, DE: The Centreville School.
- Creaghe N, Quinn S, & Kidd E. (2021). Symbolic play provides a fertile context for language development. Infancy, 26(6), 980–1010. 10.1111/infa.12422
- Cristia A, Bulgarelli F, & Bergelson E. (2020). Accuracy of the Language Environment Analysis System Segmentation and Metrics: A Systematic Review. Journal of Speech, Language, and Hearing Research, 63(4), 1093–1105. 10.1044/2020_JSLHR-19-00017
- Cristia A, Lavechin M, Scaff C, Soderstrom M, Rowland C, Räsänen O, … Bergelson E. (2021). A thorough evaluation of the Language Environment Analysis (LENA) system. Behavior Research Methods, 53(2), 467–486. 10.3758/s13428-020-01393-5
- Dailey S, & Bergelson E. (2022). Language input to infants of different socioeconomic statuses: A quantitative meta-analysis. Developmental Science, 25(3). 10.1111/desc.13192
- Dale PS, Price TS, Bishop DVM, & Plomin R. (2003). Outcomes of early language delay: I. Predicting persistent and transient language difficulties at 3 and 4 years. Journal of Speech, Language, and Hearing Research, 46(3), 544–560. 10.1044/1092-4388(2003/044)
- De Barbaro K, & Fausey CM (2022). Ten Lessons About Infants’ Everyday Experiences. Current Directions in Psychological Science, 31(1), 28–33. 10.1177/09637214211059536
- DeBaryshe B. (1993). Joint picture-book reading correlates of early oral language skill. Journal of Child Language, 20(2), 455.
- DeLoache JS, & DeMendoza OAP (1987). Joint picturebook interactions of mothers and 1-year-old children. British Journal of Developmental Psychology, 5(2), 111–123. 10.1111/j.2044-835X.1987.tb01047.x
- DeMayo B, Kellier D, Braginsky M, Bergmann C, Hendriks C, Rowland CF, … Marchman V. (2021). Web-CDI: A system for online administration of the MacArthur-Bates Communicative Development Inventories. Language Development Research, 44.
- Denman D, Cordier R, Munro N, Kim J-H, & Speyer R. (2023). Standardized Measures Used Regularly by Speech-Language Pathologists’ when assessing the Language Abilities of School-Aged Children: A Survey. Folia Phoniatrica Et Logopaedica. 10.1159/000530718
- Denman D, Speyer R, Munro N, Pearce WM, Chen Y-W, & Cordier R. (2017). Psychometric properties of language assessments for children aged 4–12 years: A systematic review. Frontiers in Psychology, 8, 1515. 10.3389/fpsyg.2017.01515
- Dowdall N, Melendez-Torres GJ, Murray L, Gardner F, Hartford L, & Cooper PJ (2020). Shared picture book reading interventions for child language development: A systematic review and meta-analysis. Child Development, 91(2). 10.1111/cdev.13225
- Duff FJ, Nation K, Plunkett K, & Bishop DVM (2015). Early prediction of language and literacy problems: Is 18 months too early? PeerJ, 3, e1098. 10.7717/peerj.1098
- Duff FJ, Reen G, Plunkett K, & Nation K. (2015). Do infant vocabulary skills predict school-age language and literacy outcomes? Journal of Child Psychology and Psychiatry, 56(8), 848–856. 10.1111/jcpp.12378
- Duncan GJ, Dowsett CJ, Claessens A, Magnuson K, Huston AC, Klebanov P, … Japel C. (2007). School readiness and later achievement. Developmental Psychology, 43(6), 1428–1446. 10.1037/0012-1649.43.6.1428
- Fenson L. (Ed.). (2007). MacArthur-Bates Communicative Development Inventories: User’s guide and technical manual (2nd ed). Baltimore, Md: Paul H. Brookes Pub. Co.
- Ferjan Ramírez N, Hippe DS, & Kuhl PK (2021). Comparing Automatic and Manual Measures of Parent–Infant Conversational Turns: A Word of Caution. Child Development, 92(2), 672–681. 10.1111/cdev.13495
- Fernald A, & Marchman VA (2012). Individual Differences in Lexical Processing at 18 Months Predict Vocabulary Growth in Typically Developing and Late-Talking Toddlers: Lexical Processing and Vocabulary Growth. Child Development, 83(1), 203–222. 10.1111/j.1467-8624.2011.01692.x
- Fish M, & Pinkerman B. (2003). Language skills in low-SES rural Appalachian children: Normative development and individual differences, infancy to preschool. Journal of Applied Developmental Psychology, 23(5), 539–565. 10.1016/S0193-3973(02)00141-7
- Frank MC, Braginsky M, Yurovsky D, & Marchman VA (2021). Variability and consistency in early language learning: The Wordbank project.
- Friend M, Smolak E, Patrucco-Nanchen T, Poulin-Dubois D, & Zesiger P. (2019). Language status at age 3: Group and individual prediction from vocabulary comprehension in the second year. Developmental Psychology, 55(1), 9–22. 10.1037/dev0000617
- Gilkerson J, Richards JA, Warren SF, Montgomery JK, Greenwood CR, Kimbrough Oller D, … Paul TD (2017). Mapping the Early Language Environment Using All-Day Recordings and Automated Analysis. American Journal of Speech-Language Pathology, 26(2), 248–265. 10.1044/2016_AJSLP-15-0169
- Golinkoff RM, de Villiers J, Hirsh-Pasek K, Iglesias A, & Wilson MS (2017). User’s manual for the Quick Interactive Language Screener (QUILS): A measure of vocabulary, syntax, and language acquisition skills in young children. Baltimore: Brookes Publishing.
- Goodman JC, Dale PS, & Li P. (2008). Does frequency count? Parental input and the acquisition of vocabulary. Journal of Child Language, 35(3), 515–531. 10.1017/S0305000907008641
- Hirsh-Pasek K, Adamson LB, Bakeman R, Owen MT, Golinkoff RM, Pace A, … Suma K. (2015). The Contribution of Early Communication Quality to Low-Income Children’s Language Success. Psychological Science, 26(7), 1071–1083. 10.1177/0956797615581493
- Hoff E. (2006). How social contexts support and shape language development. Developmental Review, 26(1), 55–88. 10.1016/j.dr.2005.11.002
- Hoff E, & Naigles L. (2002). How Children Use Input to Acquire a Lexicon. Child Development, 73(2), 418–433. 10.1111/1467-8624.00415
- Hohm E, Jennen-Steinmetz C, Schmidt MH, & Laucht M. (2007). Language development at ten months: Predictive of language outcome and school achievement ten years later? European Child & Adolescent Psychiatry, 16(3), 149–156. 10.1007/s00787-006-0567-y
- Huber E, Ferjan Ramírez N, Corrigan NM, & Kuhl PK (2023). Parent coaching from 6 to 18 months improves child language outcomes through 30 months of age. Developmental Science, 26(6), e13391. 10.1111/desc.13391
- Huttenlocher J, Haight W, Bryk A, Seltzer M, et al. (1991). Early vocabulary growth: Relation to language input and gender. Developmental Psychology, 27(2), 236–248. 10.1037/0012-1649.27.2.236
- Huttenlocher J, Waterfall H, Vasilyeva M, Vevea J, & Hedges LV (2010). Sources of variability in children’s language growth. Cognitive Psychology, 61(4), 343–365. 10.1016/j.cogpsych.2010.08.002
- Jago L. (2020). Predictors of individual differences and language delay in children learning English (PhD thesis). University of Liverpool.
- Kaiser AP, Chow JC, & Cunningham JE (2022). A Case for Early Language and Behavior Screening: Implications for Policy and Child Development. Policy Insights from the Behavioral and Brain Sciences, 9(1), 120–128. 10.1177/23727322211068886
- Karrass J, & Braungart-Rieker JM (2005). Effects of shared parent–infant book reading on early language acquisition. Journal of Applied Developmental Psychology, 26(2), 133–148. 10.1016/j.appdev.2004.12.003
- Lavechin M, Bousbib R, Bredin H, Dupoux E, & Cristia A. (2020). An open-source voice type classifier for child-centered daylong recordings. 10.48550/ARXIV.2005.12656
- Lee J. (2011). Size matters: Early vocabulary as a predictor of language and literacy competence. Applied Psycholinguistics, 32(1), 69–92. 10.1017/S0142716410000299
- Madigan S, Prime H, Graham SA, Rodrigues M, Anderson N, Khoury J, & Jenkins JM (2019). Parenting Behavior and Child Language: A Meta-analysis. Pediatrics, 144(4), e20183556. 10.1542/peds.2018-3556
- Masur EF, Flynn V, & Eichorst DL (2005). Maternal responsive and directive behaviours and utterances as predictors of children’s lexical development. Journal of Child Language, 32(1), 63–91. 10.1017/S0305000904006634
- McGillion M, Pine JM, Herbert JS, & Matthews D. (2017). A randomised controlled trial to test the effect of promoting caregiver contingent talk on language development in infants from diverse socioeconomic status backgrounds. Journal of Child Psychology and Psychiatry, 58(10), 1122–1131. 10.1111/jcpp.12725
- Montag JL, Jones MN, & Smith LB (2015). The words children hear: Picture books and the statistics for language learning. Psychological Science, 26(9), 1489–1496. 10.1177/0956797615594361
- Newport EL, Gleitman H, & Gleitman LR (1977). Mother, I’d rather do it myself: Some effects and non-effects of maternal speech style. In Snow CE & Ferguson CA (Eds.), Talking to Children (pp. 109–149). Cambridge University Press.
- Piot L, Havron N, & Cristia A. (2022). Socioeconomic status correlates with measures of Language Environment Analysis (LENA) system: A meta-analysis. Journal of Child Language, 49(5), 1037–1051. 10.1017/S0305000921000441
- Pungello EP, Iruka IU, Dotterer AM, Mills-Koonce R, & Reznick JS (2009). The effects of socioeconomic status, race, and parenting on language development in early childhood. Developmental Psychology, 45(2), 544–557. 10.1037/a0013917
- R Core Team. (2021). R: A language and environment for statistical computing [Manual]. Vienna, Austria: R Foundation for Statistical Computing.
- Reilly S, Wake M, Ukoumunne OC, Bavin E, Prior M, Cini E, … Bretherton L. (2010). Predicting Language Outcomes at 4 Years of Age: Findings From Early Language in Victoria Study. Pediatrics, 126(6), e1530–e1537. 10.1542/peds.2010-0254
- Rescorla L. (2011). Late talkers: Do good predictors of outcome exist? Developmental Disabilities Research Reviews, 17(2), 141–150. 10.1002/ddrr.1108
- Rowe ML (2012). A longitudinal investigation of the role of quantity and quality of child-directed speech in vocabulary development: Child-directed speech and vocabulary. Child Development, 83(5), 1762–1774. 10.1111/j.1467-8624.2012.01805.x
- Rowe ML, & Snow CE (2020). Analyzing input quality along three dimensions: Interactive, linguistic, and conceptual. Journal of Child Language, 47(1), 5–21. 10.1017/S0305000919000655
- Scarborough HS (1998). Early identification of children at risk for reading disabilities: Phonological awareness and some other promising predictors. In Accardo P, Capute A, & Shapiro B. (Eds.), Specific Reading Disability: A View of the Spectrum (pp. 75–119). Timonium, MD: York Press.
- Silvey C, Demir-Lira ÖE, Goldin-Meadow S, & Raudenbush SW (2021). Effects of Time-Varying Parent Input on Children’s Language Outcomes Differ for Vocabulary and Syntax. Psychological Science, 32(4), 536–548. 10.1177/0956797620970559
- Sperry DE, Sperry LL, & Miller PJ (2019). Reexamining the Verbal Environments of Children From Different Socioeconomic Backgrounds. Child Development, 90(4), 1303–1318. 10.1111/cdev.13072
- Suskind DL, Leffel KR, Graf E, Hernandez MW, Gunderson EA, Sapolich SG, … Levine SC (2016). A parent-directed language intervention for children of low socioeconomic status: A randomized controlled pilot study. Journal of Child Language, 43(2), 366–406. 10.1017/S0305000915000033
- Tabulda GA (2017). Role of parent oral language input in the development of child emergent literacy skills (PhD thesis). The Florida State University.
- Taylor N, Donovan W, Miles S, & Leavitt L. (2009). Maternal control strategies, maternal language usage and children’s language usage at two years. Journal of Child Language, 36(2), 381–404. 10.1017/S0305000908008969
- Tomasello M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
- Tomasello M, & Farrar MJ (1986). Joint attention and early language. Child Development, 57(6), 1454–1463.
- Tomasello M, & Mervis CB (1994). The instrument is great, but measuring comprehension is still a problem. Monographs of the Society for Research in Child Development, 59(5), 174–179. 10.1111/j.1540-5834.1994.tb00186.x
- Tomasello M, & Todd J. (1983). Joint attention and lexical acquisition style. First Language, 4(12), 197–211. 10.1177/014272378300401202
- van der Wilt F, van der Veen C, van Kruistum C, & van Oers B. (2020). Language abilities and peer rejection in kindergarten: A mediation analysis. Early Education and Development, 31(2), 269–283. 10.1080/10409289.2019.1624145
- Venables WN, & Ripley BD (2002). Modern Applied Statistics with S (4th ed.). New York: Springer.
- Vernon-Feagans L, Bratsch-Hines M, Reynolds E, & Willoughby M. (2020). How early maternal language input varies by race and education and predicts later child language. Child Development, 91(4), 1098–1115. 10.1111/cdev.13281
- Wang Y, Williams R, Dilley L, & Houston DM (2020). A meta-analysis of the predictability of LENA™ automated measures for child language development. Developmental Review, 57, 100921. 10.1016/j.dr.2020.100921
- Weintraub S, Bauer PJ, Zelazo PD, Wallner-Allen K, Dikmen SS, Heaton RK, … Gershon RC (2013). I. NIH Toolbox Cognition Battery (CB): Introduction and Pediatric Data: NIH Toolbox Cognition Battery (CB). Monographs of the Society for Research in Child Development, 78(4), 1–15. 10.1111/mono.12031
- Weisleder A, & Fernald A. (2013). Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychological Science, 24(11), 2143–2152. 10.1177/0956797613488145
- Wiig EH, Secord WA, & Semel E. (2004). Clinical Evaluation of Language Fundamentals Preschool–Second Edition (CELF Preschool-2). Toronto, Canada: The Psychological Corporation/A Harcourt Assessment Company.
- Xu D, Yapanel U, & Gray S. (2009). Reliability of the LENA Language Environment Analysis System in young children’s natural home environment (Technical Report No. LTR-05–2). Boulder, CO: LENA Foundation.
- Yew SGK, & O’Kearney R. (2017). Language difficulty at school entry and the trajectories of hyperactivity-inattention problems from ages 4 to 11: Evidence from a population-representative cohort study. Journal of Abnormal Child Psychology, 45(6), 1105–1118. 10.1007/s10802-016-0241-x