ABSTRACT
We introduce a novel method to test a classic idea in developmental science: that children's attention to a stimulus is driven by how much they can learn from it. Preschoolers (4–6 years) watched a video in which a distracting animation accompanied static, page‐by‐page illustrations of a storybook. The audio narration for each storybook page was looped so that children could listen to it up to six times in total. However, the narration automatically ended if the child looked at the distractor for an extended period of time, indicating a loss of attention to the story and triggering the next page. The complexity of the narration was manipulated between subjects: The simple narration largely contained words that should be familiar to preschoolers, while the complex narration contained many rare, late‐acquired words. Children's learning was measured via post‐tests of their plot comprehension and ability to generalize the embedded rare words. Consistent with the hypothesis that children's attention was driven at least partly by their ability to learn from the speech, we observed a significant interaction between narration complexity and age in predicting children's probability of continuing listening on each page, and the proportion of their visual attention that they devoted to the story illustration over the animated distractor. That is, while younger children were more likely to continue listening to the simple speech, older children became increasingly likely to sustain attention to the complex speech. Our results provide evidence that young children may actively direct their attention toward linguistic input that is most appropriate for their current level of cognitive and linguistic development, which may provide the best learning opportunities.
Keywords: cognitive development, language processing, lexical development, rational learning, selective attention, self‐directed learning
1. Introduction
If you have ever read a young child a bedtime story, you have likely noticed how children will demand that you read some books over and over again, yet insist that you abandon others just as you begin. Moreover, the same—relatively simple—book that a toddler demands over and over may bore a preschooler, whose favorite—comparatively complex—book the toddler immediately rejects. This everyday example is suggestive of a general principle according to which children's attention is most readily sustained by information that they are best able to learn from: The toddler may have the sense that they are still learning from repeated narrations of the book that they favor, while the preschooler's favorite book is too far beyond the toddler's linguistic and world knowledge to readily support learning, leading to its immediate rejection. Notably, this hypothesis—that children's attention to an information source is driven by the degree to which it supports their learning—has its roots in foundational theory in developmental psychology (e.g., Bruner 1961; Vygotsky et al. 1978), but has been difficult to obtain direct evidence for. Here, we employ a novel method inspired by the above scenario. Our study manipulates the complexity of a naturalistic speech stream and explores how children's attention to and learning from that speech shifts across a 2‐year age range, as children's linguistic competence and world knowledge grow. Support for this hypothesis would suggest a way in which children are active learners—and active language learners in particular—selectively attending to the sources of linguistic information that they are best able to learn from (Foushee et al. 2023; Gureckis and Markant 2012).
Summary
We test the idea that children's attention to a stimulus (here, spoken language) is driven by how much they can learn from it.
Preschoolers heard a story at one of two complexity levels; we infer their attention to the speech from their gaze to a story illustration over a distractor.
Children's attention was predicted by an interaction between age and speech complexity: Younger children attended more to the simpler speech and older children to the complex speech.
Children's online attention to the speech predicted their learning in post‐tests of plot and novel word knowledge.
1.1. Background
Results from infant studies are consistent with the idea that children's attention to a stimulus is driven by their sense of learning. In one body of work, researchers independently define the complexity of different stimuli—irrespective of participants' knowledge or experience—and show that the duration of participants' attention systematically varies in response (e.g., Caron and Caron 1969; Kidd et al. 2012, 2014; Martin 1975; Thomas 1965). Many of these studies manipulate the predictability of highly simplified visual sequences, and use an ideal learner model to quantify the complexity of each event in the sequence via its conditional probability. In an influential 2012 study, for example, Kidd et al. played simple sequences of visual events for 8‐month‐old infants and measured infants' duration of visual attention in response. The authors dubbed the pattern they observed the “Goldilocks effect”: Infants' probability of looking away was lowest for events of intermediate (or “just right”) complexity (see also Kidd et al. 2014). Attending to intermediate levels of complexity is consistent with attending on the basis of learning, because the space between highly familiar and unmanageably novel is where learning is likely to be the most efficient. Importantly, this “U‐shaped” relation between stimulus complexity and infants' probability of looking away was evident not just at the group level, but in individual infants' gaze behavior, at different ranges along the complexity continuum. This is what we would expect if infants' attention was driven by their sense of learning, because different complexity ranges will be relevant and appropriate for different infants. However, while attending to intermediate complexity is understood as a domain‐general learning mechanism, studies showing complexity‐based attentional preferences are typically not designed to directly demonstrate the learning payoff of early selective attention, leaving open the possibility that infants' attention reflects something more like a heuristic (“attend to medium complexity”), rather than a responsive monitoring process (“attend while learning”).
Studies that take a step closer toward linking selective attention and learning show how individuals' attention shifts with experience. For example, Forest et al. (2022) show how adults attend to increasingly complex patterns as they gain more experience with sequential visual stimuli. Poli et al. (2020) link 8‐month‐olds' attention to their learning progress: infants watched individually cued target shapes re‐appear at different locations on a screen. Each shape had a most‐likely target location, making some trials more informative than others for ultimately predicting where a given shape was going to appear. Infants' gaze in this paradigm showed the established relationship between complexity and attention, in that infants were least likely to look away for trials of intermediate predictability. However, learning progress, or how informative a given trial was, proved an even stronger predictor of infant looking times: Infants were least likely to look away when information gain was highest. What's more, infants' actual learning progress was evident in their looking times across trials: Infants became faster and faster at directing their gaze toward predictable targets, consistent with having developed an efficient model of the statistical environment. Together, these studies confirm that learners' attention is informed by relative complexity. Further, they show that relative complexity is a moving target, informed by what learners have already seen (and it is notably almost always “seen,” as most studies have shown these effects in the visual domain). These studies also employ specific notions of complexity and learning, compatible with their stripped‐down visual‐event stimuli: “complexity” means lower conditional probability and “learning” means being able to efficiently predict the next event.
Our study aimed to test this theoretically domain‐general learning mechanism in natural language—a messy, multiplex domain with real‐world consequences. Using meaningful linguistic stimuli enables us to probe relations among stimulus complexity, selective attention, and learning not just for sequential statistical dependencies, but for the higher‐order sense‐making involved in language comprehension at older ages. Two follow‐ups to Kidd et al.'s original study are especially relevant, in light of this focus. The first extends the Goldilocks effect to auditory attention, measuring infants' probability of looking away from a display in response to tone sequences of varying predictability (Kidd et al. 2014). The second tests children 3–6 years of age, finding the same relation between the predictability of visual event sequences and children's probability of looking away (Cubit et al. 2021). To our knowledge, only one other study has used linguistic stimuli to home in on the hypothesis that infants attend more to information that they would be more likely to learn from: In one experiment, Gerken et al. (2011) exposed half of a sample of 17‐month‐old infants to artificial language stimuli that had supported grammar learning and generalization in prior samples of same‐age infants (making it subjectively learnable by infants of this age). The other half of infants heard artificial language stimuli that had failed to support learning and generalization of the corresponding grammar in previous samples (making it subjectively unlearnable). Infants took longer to habituate to the subjectively learnable stimuli, leading the authors to propose that infants implicitly monitor their learning rate from a particular information source, and disattend when it falls below some threshold of efficiency.
Notably, while learning is implicated as the underlying motivation for children's attention, previous studies have not directly tested children's learning from the same stimuli to which attention is measured (Poli et al.'s implicit tracking of learning progress is a notable exception). Existing studies have also been limited in their capacity to say anything about learning by only varying the complexity of the stimulus, and not the relative competence of the learner. “Subjective learnability” is a product of the interaction between stimulus complexity and the relative competence of the learner; that is, the same “objective” level of complexity will be less subjectively learnable for a more novice learner, and more subjectively learnable for a more advanced learner. Thus, any study that only varies stimulus complexity cannot be sure that attentional preferences among children at the same level of development are a result of learnability, rather than alternate dimensions of the stimulus.
1.2. The Present Study
The current study addresses these gaps using a novel paradigm to test the hypothesis that children's attention to a source of linguistic information is driven by the degree to which it supports their learning. In a departure from previous studies that employ highly simplified visual or auditory stimuli (Kidd et al. 2012, 2014; Cubit et al. 2021; Poli et al. 2020; Forest et al. 2022), we use natural language stimuli, which both interest children and carry real information for learning. Children across a 2‐year age range (4–6 years) listened to one of two alternate tellings of the same story, narrated at distinct levels of complexity: While the simple story mostly used words that children were likely to know, the complex story contained many words that were likely to be unfamiliar. During the story narration, we measured children's attention to the speech via their gaze to story‐relevant visual stimuli. After the story narration, we measured children's learning from the speech via explicit tests of their listening comprehension and partial word knowledge.
What do we expect to see? In thinking through our predictions, it is useful to distinguish between a child's sense that they are or could be learning something, and the product(s) of a child's learning—their learning outcomes. We of course expect that the amount a child learns from a stimulus will be related to the amount that the child attended to it. 1 That is, children's learning outcomes in our study and their attention to the story should be correlated. The more nuanced hypothesis that our study allows us to test is whether a child's attention allocation is itself determined by the interaction between stimulus complexity and the child's own competence, as this determines the child's capacity to learn from the stimulus. In our study, we use children's age as a proxy for their relative cognitive development and linguistic competence. 2 We expect that there will be a larger gap between the complex speech and the language that the younger children in our sample know (making it difficult for them to learn from), and we expect this gap to narrow as children age (making the complex speech more learnable for older children). Thus, when listening to the complex narration, we predict greater attention from the older children in our sample than from the younger children. Conversely, when listening to the simple narration, we predict greater attention from the younger children in our sample than from the older children.
To test these predictions, we tracked children's visual attention to a display while they listened to either the simple or complex narration of a textless storybook (Mayer 1969)—the two narrations distinguished by the ages of acquisition of the words they used (Kuperman et al. 2012). We directly tested children's learning outcomes from the story narration, including their comprehension of its plot and their ability to understand and generalize the rare words embedded in the story. On each page of the story, the storybook illustration competed for children's visual attention with the distractor (a continuous animation of three penguins double‐dutching; see Figure 1). Given the presence of this dynamic animation, we reasoned that visual attention to the comparatively dull illustration was likely to be a meaningful index of children's attention to the speech. That is, we expected that children would continue to look at the static illustration only as long as they were actively processing the story narration (even that it would be difficult for them not to, much as a child's gaze can inadvertently reveal the secret location of a queried object; Cooper 1974; Salverda and Altmann 2011). Indeed, when the same split‐screen display was presented sans narration during piloting, children looked almost exclusively at the distractor, rather than at the illustration. Thus, we expected that children would be lured by the distractor as soon as they were no longer listening to the story. This basic design originated from even earlier pilot studies in which children heard the story narration while watching a screen with only the illustration. Pilot eyetracking data revealed a large amount of "lost" gaze data—children's gaze falling off‐screen, in the white space around the image, or otherwise proving untrackable—which seemed to coincide with behavioral indices of children's loss of interest in the speech (fidgeting, humming, etc.). In introducing the distractor, we hoped to capture rather than lose those ambiguous attentional data, and to refine our interpretation of children's non‐illustration gaze as an inverse signal of their attention to the speech.
FIGURE 1.

Schematic of experimental eyetracking procedure manipulating the speech complexity of a narrated storybook and measuring child attention and learning.
Inspired by the trial structure of gaze‐contingent infant paradigms (Colombo and Mitchell 2009), the duration of each storybook page was contingent on children's allocation of visual attention. While children who continually gazed at the illustration—suggesting that they were paying attention to the speech—could hear each page of the story repeated up to five times (inset Figure 1), children who were consistently drawn in by the distractor moved through the story quickly and heard the narration for each page only once. This design met the twin goals of making sure that every child (a) heard the entirety of the story content (so that variation in learning outcomes could not come from variation in having heard the critical information) and (b) provided attentional data for all pages (i.e., even if the child would otherwise have disattended from the entire story—and experiment—at an earlier point). We quantify children's attention to the story by measuring whether children continued listening to further, optional repetitions of the narration for each page, as well as how much they looked to the illustration (which is only made salient by the narration), over the distractor (which is salient otherwise).
To probe links between individual children's attention to the speech and their learning, we measured two learning outcomes after the story: (1) children's listening comprehension (their recollection of the content of the story), and (2) (only for the children hearing the complex narration) their capacity to generalize the rare target words embedded in the speech to novel referents. These variables enabled us to answer the following specific research questions: (a) Is preschool‐aged children's attention to naturalistic speech responsive to its complexity? (b) Does children's age—as a proxy for their level of language and cognitive development—interact with speech complexity to predict their attention (consistent with children's attention to spoken language being sensitive to how much they can learn from it)? and (c) Within each condition (simple/complex), is children's attention to the speech correlated with their learning outcomes?
2. Materials and Methods
Full documentation of our procedures, including study scripts and stimuli, is at https://osf.io/zsjfb/?view_only=024c8e83e56a4fff95e5d5ae840035c2. Session videos are available on databrary.org (linked in the study repository), for viewing by registered users at the access level permitted by each participating family. De‐identified data and analysis files, along with a reproducible version of this manuscript, are at https://github.com/foushee/storybook‐habituation.
2.1. Participants
Our participants were 46 children (4.0–6.0 years; [0.13, 0.14], ) whose parents reported English as their primary language. Based on demographic information provided by 83% () of caregivers, children's families occupied a range of socioeconomic positions (17% had reported annual household incomes below $25K, 25% above $200K), with a skew toward higher‐income households (50% of children came from households reporting $100K or more in annual income; the median household income in Berkeley, California was $91,259 in 2020; U.S. Census Bureau 2020). Caregivers were overwhelmingly educated, with 75% of caregivers holding a graduate degree (only 17% had completed fewer than 4 years of college). Children were generally identified by caregivers as Asian or Pacific Islander (42%) or White (42%), with 9% of children identified as Black, and 17% identified as belonging to multiple racial categories. Children were recruited from local preschools or from a database of interested families maintained by the Institute of Human Development at the University of California, Berkeley. They were tested in a quiet area of their school or in the lab, and received a sticker and/or certificate and a small toy for their participation. The study was approved by the University of California, Berkeley Committee for the Protection of Human Subjects.
The COVID‐19 pandemic forced us to halt data collection before reaching our planned sample of 64 children. However, a sensitivity analysis using simulated data suggests that this still left us with over 80% power to detect a small crossover interaction between condition and age in predicting child attention.
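For concreteness, the sketch below shows the general shape of such a simulation-based sensitivity analysis in R. It is not our actual simulation code: the effect sizes, random-intercept variance, and age distribution are hypothetical placeholders, and only the overall design (46 children, six pages, a binary continued-listening outcome, and a crossover condition-by-age interaction) mirrors the study.

```r
## Minimal sketch of a simulation-based sensitivity analysis for a
## crossover condition-by-age interaction on a binary attention outcome.
## All effect sizes and variances here are hypothetical placeholders.
library(lme4)

power_sim <- function(n_children = 46, n_pages = 6, n_sims = 200,
                      b_int = log(3)) {
  sig <- replicate(n_sims, {
    age  <- runif(n_children, -1, 1)          # mean-centered age, in years
    cond <- rep(0:1, length.out = n_children) # 0 = simple, 1 = complex
    d <- expand.grid(child = seq_len(n_children), page = seq_len(n_pages))
    d$age  <- age[d$child]
    d$cond <- cond[d$child]
    u <- rnorm(n_children, sd = 1)            # child random intercepts
    ## Crossover: the age slope reverses sign in the complex condition
    eta <- 1.5 - 0.5 * d$age - 0.75 * d$cond + b_int * d$cond * d$age +
      u[d$child]
    d$listen <- rbinom(nrow(d), 1, plogis(eta))
    ## May warn about singular fits (no page variance is simulated here)
    m <- glmer(listen ~ cond * age + (1 | child) + (1 | page),
               data = d, family = binomial)
    coef(summary(m))["cond:age", "Pr(>|z|)"] < 0.05
  })
  mean(sig)  # proportion of simulated datasets detecting the interaction
}
```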
Prior to their study session, children were randomly assigned to the simple (, [0.20, 0.24], ) or complex (, [0.16, 0.18], ) condition. There was no significant difference between the ages of the children in the simple and complex conditions (). Two additional children were excluded after another child (1) or a teacher (1) intervened in their study session.
2.1.1. Vocabulary Questionnaire
To validate our assumptions about the words likely to be familiar versus unfamiliar to the children in our sample, we asked caregivers to fill out a vocabulary questionnaire, administered via Qualtrics (https://www.qualtrics.com). For every content word used in either condition of the study, caregivers indicated whether or not their child would “understand the word if [the caregiver] said it out loud.” Caregivers typically completed this measure while their child was participating in the study, followed by a demographic survey and language environment questionnaire. Caregivers of the children tested in preschool received a link to the questionnaires via e‐mail. Importantly, this survey confirmed that the rare target words embedded in the complex condition were indeed novel to children: 0% of caregivers reported that they were familiar to their children (and many verbally reported having themselves learned at least some of the words from the study). On the other hand, caregivers reported that their children understood all of the words used in the simple condition.
2.2. Procedure
2.2.1. Familiarization
Children sat before a laptop connected to an SMI RED‐n eyetracker, wearing child‐sized over‐ear headphones. After a brief five‐point calibration of the eyetracker ("Can you follow the little fairy on the screen?"), the familiarization began. The first screen displayed a black‐and‐white animation of three penguins jumping rope (the distractor) on the left side of the screen (Figure 1). This screen lasted for 10 s, during which a feminine voice drew the child's attention to the ongoing animation and encouraged them to look there "if the story gets boring." Next, the cover of the book, "Frog, Where are You?" (Mayer 1969) appeared alongside the distractor. Both images were displayed for 15 s, during which the voiceover re‐iterated that the child was going to hear a story, and again directed the child's attention to the distractor ("Where are you going to look if the story gets boring?"). The familiarization phase ended with a looming fixation cross on a gray background, used to center children's gaze before the onset of the narration—and critical data collection—phase.
2.2.2. Storybook Narration
The same feminine voice narrated a boy and dog's search for their escaped pet frog across six pages of a textless picture book. On each page, the illustration for the story appeared on the right side of the screen, while the distractor played continuously on the left. To ensure high‐quality eyetracking data, a gaze‐contingent fixation cross gated the onset of each new page (see inset of Figure 1).
2.2.2.1. Speech Complexity Manipulation
Depending on the condition to which they were assigned, children heard the story narrated at either the simple or complex level (center column of Figure 1). The simple and complex narrations were matched on multiple linguistic dimensions, but differed in the estimated age of acquisition (AoA) of the words they used (Kuperman et al. 2012). 3 The simple narration exclusively used words from the MacArthur‐Bates Communicative Development Inventory (Fenson et al. 2007), which is normed for children between 16 and 30 months. In contrast, each page of the complex narration included five words with AoAs estimated between 7 and 13 years (bolded in the sample page narration in Figure 1), as well as a single rare and unfamiliar word with an estimated AoA of over years, which was presented twice (bolded and underlined in Figure 1). The rare words were ogled, absconded, flummoxed, hyaline, aperture, and tor (two verbs, two adjectives, and two nouns). Children's learning of these rare words was assessed in the test phase.
2.2.2.2. Child‐Controlled Listening
Children obligatorily heard the narration for each page at least once (∼15 s), after which the same audio continued to loop for up to five additional repetitions (∼75 s), separated by a brief pause (500 ms). After the first obligatory narration, children could advance to the next page early by looking at the distractor: A fixation of 1.5 s (1500 ms) to the distractor automatically triggered the next page. The child‐controlled portion of the experiment lasted between and s ( s [2.64, 3.31]).
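To make the contingency concrete, the following schematic implements the advance rule in R. This is not the software that ran the experiment; the sampling rate and all names are illustrative, and only the two timing parameters (the roughly 15 s obligatory narration and the 1500 ms distractor-fixation threshold) come from the design.

```r
## Schematic of the gaze-contingent advance rule: after the obligatory
## first narration, 1500 ms of continuous gaze to the distractor AOI
## triggers the next page. The ~60 Hz sampling rate is an assumption.
advance_time <- function(aoi_samples, sample_ms = 1000 / 60,
                         threshold_ms = 1500, obligatory_ms = 15000) {
  run_ms <- 0                       # current continuous distractor dwell
  for (i in seq_along(aoi_samples)) {
    t <- i * sample_ms              # time since page onset
    if (identical(aoi_samples[i], "distractor")) {
      run_ms <- run_ms + sample_ms
    } else {
      run_ms <- 0                   # any look elsewhere resets the dwell
    }
    if (t > obligatory_ms && run_ms >= threshold_ms) return(t)
  }
  NA_real_  # never triggered: page loops until its maximum duration
}

## Example: a child who drifts to the distractor during the second loop
gaze <- c(rep("illustration", 1100), rep("distractor", 120))
advance_time(gaze)  # time (ms) at which the page would advance
```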
2.2.2.3. Happy Ending
Regardless of condition, all children experienced the same (brief: s) end of the story: Instead of the distractor‐illustration split‐screen, the display showed facing storybook pages. The pages turned as the narrator described the boy and the dog's rediscovery of the frog (on a log surrounded by “his whole family!”).
2.2.3. Learning Tests
After the story, we measured children's learning outcomes via two blocks—Listening Comprehension and Unfamiliar Word Generalization—of six test trials each. Within each block, each trial tested knowledge from a different content page of the storybook. Three initial trials familiarized children with the format of the test questions, by asking them to point to the "dog," "boy," and "frog" in successive arrays; all children answered these correctly. The subsequent test questions were always presented in the same order across children, mirroring the order in which the relevant information was introduced within the story.
2.2.3.1. Listening Comprehension
In the first test block, listening comprehension trials tested children's knowledge of story events or characters (equivalently presented across the simple and complex conditions, just using easier or more difficult synonyms). On each trial, the same narrator's voice asked a question (e.g., "Who were the boy and the dog looking for?") over a gray screen with a central fixation cross. When the child fixated on the cross, the screen switched to a 2 × 3 grid of black‐and‐white images (all drawn by the author‐illustrator of "Frog, Where Are You?"; see the rightmost column of Figure 1). Children responded by pointing to one of the images.
2.2.3.2. Unfamiliar Word Generalization
Unfamiliar word generalization trials asked children to identify appropriate illustrations for the unfamiliar target words embedded (exclusively) in the complex narration. For the children in the complex condition, this meant generalizing the unfamiliar target words that they had heard to novel stimuli (e.g., from the boy "ogling" the frog to a person peering through a magnifying glass, or from the frog in the story "absconding" from the jar to a stylized graphic of a person running away). 4 As in the previous block of trials, children heard each test question (e.g., "Can you point to the person who is absconding?") over a gray screen with a central fixation cross. When children's fixation on the cross triggered the next screen, they responded by pointing to one of four candidate black‐and‐white illustrations, arranged in a 2 × 2 grid. Competitor images were selected to be compatible with the syntax of the test question (e.g., depicting other actions with thematic patients as options for "ogling"). The correct response for all questions was normed via a sample of undergraduates exposed to the same story narration ().
2.3. Variable Coding and Predictions
2.3.1. Child Attention Metrics
We captured variability in children's attention to the speech via measurements of (1) children's probability of continuing listening beyond the first obligatory narration of each page and the (2) duration and (3) distribution of children's visual attention to our predefined areas of interest (AOIs; the illustration and distractor).
2.3.1.1. Continued Listening
On each page, we coded whether the child moved on to the next page as soon as they could (that is, as soon as the obligatory first repetition of the narration for that page had ended, plus the 1500 ms threshold for the trigger AOI: ), or continued listening for any amount of time past that (). Coding children's listening time data in this way enabled us to meaningfully analyze children's voluntary exposure to the speech, in spite of the challenges presented by children's raw voluntary listening durations (namely, zero‐inflation—many children moved on to the next page shortly after the first repetition—and a long tail; see Online Supplementary Materials).
We use this recoded variable for analyses at both the page‐ and subject‐ levels. At the page level, continued listening is a binary variable. Across all pages, children moved on immediately of the time, and listened to all five additional repetitions less than of the time (on just five individual storybook pages). At the subject level, we analyze continued listening in terms of the proportion of pages on which a child continued listening (range = 0–1 [0.64, 0.78]). Approximately half of all children (; ) continued listening on at least five out of the six total pages of the story. We take this measure to reflect a child's ongoing attention to the speech, or their sustained appetite for hearing more of it. Conversely, we can think of preschoolers' probability of “moving on” from a storybook page in the present paradigm as analogous to infants' probability of looking away in previous research, interpreted minimally as a loss of interest in the stimulus (Rankin et al. 2009), and more richly as an active decision to re‐allocate cognitive resources away from it (Kidd and Hayden 2015).
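As a concrete illustration, this recode can be expressed in a few lines of R; the data frame `pages` and its columns are hypothetical names, not our actual data structure.

```r
## Sketch of the continued-listening recode. `voluntary_s` is assumed to
## hold a page's listening time beyond the obligatory narration plus the
## 1500 ms trigger window, so any positive value means the child stayed.
pages$continued <- as.integer(pages$voluntary_s > 0)  # 1 = continued listening

## Subject-level metric: proportion of pages with continued listening
prop_continued <- tapply(pages$continued, pages$child, mean)
```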
2.3.1.2. Gaze to the Illustration vs. Distractor
For a more granular view of children's attention while listening to the story, we analyze continuous measures of children's gaze to the two equal‐sized AOIs that we defined on the eyetracking display: the illustration and the distractor.
2.3.1.2.1. Net Gaze Duration
A child's net gaze duration to a given AOI reflects the total time (in milliseconds) 5 during which their gaze was both detectable by the eyetracker and fixated on that AOI. Thus, this measure combines information about the distribution of a child's attention during the story (i.e., between AOIs) and the overall length of their exposure to the story. At the page level, each child contributed 12 net gaze durations: one value for each of the two AOIs, on each of the six storybook pages (illustration: range = 0–62.78 s, [14.38, 16.44]; distractor: range = 0–24.51 s, [6.12, 7.04]). When analyzing net gaze durations at the subject level, we sum gaze durations to the illustration across pages, and take a child's total illustration gaze duration as a global index of their attention to the speech (range = 27.38–148.47 s, [82.30, 97.92]; total distractor gaze duration: range = 14.11–73.11 s, [35.79, 43.47]).
2.3.1.2.2. Proportion Gaze Duration
A child's proportion gaze duration for a given AOI represents their gaze to that AOI as a proportion of their gaze across the entire display. 6 This measure narrows in on the relative share of children's visual attention devoted to each AOI (illustration: range = 0–0.86, [0.45, 0.50]; distractor: range = 0–0.94, [0.33, 0.38]), irrespective of overall duration. For a page‐level index of attention to the story, we analyze children's proportion gaze durations specifically to the illustration. We average this value across pages for a subject‐level metric (mean illustration proportion gaze duration: range = 0.18–0.71, [0.43, 0.51]; mean distractor proportion gaze duration: range = 0.07–0.77, [0.31, 0.40]).
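The two gaze metrics can be made precise with a short sketch. We assume fixation-level data in a data frame `fix` with columns `child`, `page`, `aoi` (coded "illustration", "distractor", or "other" for trackable gaze elsewhere on the display), and `dur_ms`; all names are illustrative.

```r
## Net gaze duration: total fixation time per AOI, per child and page
net <- aggregate(dur_ms ~ child + page + aoi, data = fix, FUN = sum)

## Proportion gaze duration: AOI gaze over all trackable gaze on the display
tot <- aggregate(dur_ms ~ child + page, data = fix, FUN = sum)
net <- merge(net, tot, by = c("child", "page"),
             suffixes = c("_aoi", "_total"))
net$prop <- net$dur_ms_aoi / net$dur_ms_total

## Subject-level indices: total illustration gaze (s) and mean proportion
illus <- subset(net, aoi == "illustration")
total_illus_s   <- tapply(illus$dur_ms_aoi, illus$child, sum) / 1000
mean_illus_prop <- tapply(illus$prop, illus$child, mean)
```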
If children's degree of attention to the speech is related to how appropriate it is for their current level of cognitive‐linguistic competence, we should see an interaction between speech complexity and age in predicting children's attention. To illustrate with our “continued listening” variable: In the simple condition, we might expect older children to typically move on from each page after hearing it once and likely extracting its information. On the other hand, we might expect younger children—who might still be learning from each simple page narration by the end of its first repetition—to be more likely to continue listening. In the complex condition, by contrast, we might expect children in this younger age group to have already disattended by the end of the first page repetition (because the complexity of the speech makes it difficult for them to learn from), and have already had their attention captured by the distractor, causing the story to quickly advance to the next page. At the same time, we might expect older children—who have more hope of “getting something” out of the more complex speech—to be more likely to continue listening past the first repetition of the page.
We note that the attention metrics that we have included are likely to be intercorrelated: When children “continue listening,” they have a greater window in which to look to the illustration, making longer illustration gaze durations more likely. Looking more to the illustration will likely correspond with looking less to the distractor, resulting in greater illustration proportion gaze durations. And devoting more attention to the distractor, relative to the illustration, might mean that the child is already looking at the distractor when the first narration ends, causing them to “move on” as soon as possible, rather than continue listening. Despite their likely intercorrelations, we nevertheless chose to include all three metrics because they offered distinct analytic and interpretive advantages. First, by splitting children's listening times into the binary categories of “moved on” versus “continued listening,” we maximize our ability to detect differences in children's attention patterns on the basis of condition and age, and provide an analogy to the look‐away probability measure employed in infant research. Second, insofar as they pull apart, our two metrics of children's visual attention to the illustration may allow us to infer whether learning is driven more by increased looking toward relevant stimuli (greater illustration gaze durations) versus decreased looking toward irrelevant stimuli (greater illustration gaze proportions).
2.3.2. Learning Outcome Variables
We consider two measures of how well children were able to learn from the speech, one (listening comprehension) analyzable across all children, and the other (unfamiliar word generalization) applicable only to children in the complex condition.
2.3.2.1. Listening Comprehension
Children's responses on the six trials testing their knowledge of the story content were coded as correct (1) or incorrect (0). Children typically answered at least half of the questions correctly (range = 0%–100%, [, ]).
2.3.2.2. Unfamiliar Word Generalization
Children's responses on the trials testing the unfamiliar words in the complex narration were likewise coded as correct (1) or incorrect (0). Children showed predictably variable performance on this measure. However, while we expect the variation in performance among children in the complex condition (range = 0%–83% correct, [31%, 48%]) to be potentially meaningful, we expect the variation in the simple condition—where children never heard the target words—to merely reflect chance (range = 0%–83% correct, [24%, 40%]). Thus, we analyze unfamiliar word generalization accuracy only in the complex condition, where children actually had the opportunity to learn something about the words from within the experiment.
If children's attention is at least partly sustained by their sense that they are or could be learning something, we expect that these two measures of individual children's learning outcomes—evidence that they indeed learned from the speech—will correlate with the measures of their attention to the speech described in the preceding section.
2.4. Analysis
In addition to reporting descriptive statistics regarding our variables of interest, we conduct two primary varieties of analyses: (1) analyses testing the link between speech complexity and child attention, and (2) analyses testing the link between child attention and child learning outcomes.
In the first set of analyses, we fit separate models to the page‐by‐page data for each “positive”—as in, predicted to be associated with learning—attention metric, according to its distribution: We use the lme4 package (Bates et al. 2015) in R (v4.1.2; R Core Team 2021) to fit mixed effects logit models to children's binary continued‐listening codes, 7 and linear mixed effects models to children's continuous illustration net gaze durations. 8 We use the mgcv package (Wood 2011) to fit a generalized additive model to children's illustration proportion gaze durations, assuming a beta distribution. 9 In each of these models predicting page‐by‐page attention, we include condition (simple/complex), child age (mean‐centered, in years), and the interaction of condition and child age as fixed effects, and random intercepts (or random effect smooths, for the generalized additive model) for child and page. In cases where a model of this structure fails to converge, we refit the model after dropping the random effect with the lowest variance (Barr et al. 2013).
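In concrete terms, the three page-level models take roughly the following form. Variable and data frame names are illustrative, and the beta GAM assumes the proportions have been nudged into the open interval (0, 1), which mgcv's betar family requires; `child` and `page` are assumed to be factors.

```r
## Sketch of the page-level attention models described above.
library(lme4)  # Bates et al. 2015
library(mgcv)  # Wood 2011

## (1) Mixed effects logit model of binary continued listening
m_listen <- glmer(continued ~ condition * age_c + (1 | child) + (1 | page),
                  data = pages, family = binomial)

## (2) Linear mixed effects model of illustration net gaze duration
m_net <- lmer(illus_net_s ~ condition * age_c + (1 | child) + (1 | page),
              data = pages)

## (3) Beta generalized additive model of illustration proportion gaze
## duration, with random effect smooths for child and page
m_prop <- gam(illus_prop ~ condition * age_c +
                s(child, bs = "re") + s(page, bs = "re"),
              data = pages, family = betar(link = "logit"))
```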
In the second set of analyses, we use mixed effects logit models to predict children's trial‐by‐trial test question accuracy. We fit models to predict accuracy from each subject‐level attention variable (continued listening proportion, total illustration net gaze duration, mean illustration proportion gaze duration) separately, controlling for condition (listening comprehension models only) and age (listening comprehension and unfamiliar word generalization models). We standardize the attention predictor variables (M = 0, SD = 1) to enable us to compare effect sizes across them. Models include random intercepts for child and test question. 10
We rely on estimates and 95% bootstrapped confidence intervals of model coefficients or odds ratios (ORs) to interpret the impact of different predictors on the dependent variable. We assess the significance of individual predictors by comparing nested models with and without the relevant predictor (using the anova function in R; R Core Team 2021).
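Putting the pieces together, here is a sketch of one accuracy model (the analogue of Model 1 in Table 3) along with both inference steps; data frame and column names are again illustrative.

```r
## Sketch of one accuracy model plus the inference steps described above.
library(lme4)

## Standardize a subject-level attention predictor (M = 0, SD = 1)
subj$continued_z <- as.numeric(scale(subj$prop_continued))
trials <- merge(trials, subj, by = "child")

m1 <- glmer(correct ~ condition + age_c + continued_z +
              (1 | child) + (1 | question),
            data = trials, family = binomial)

## 95% bootstrapped CIs for fixed effects, on the odds-ratio scale
exp(confint(m1, parm = "beta_", method = "boot", nsim = 1000))

## Significance of the attention predictor via nested model comparison
m0 <- update(m1, . ~ . - continued_z)
anova(m0, m1)
```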
3. Results
3.1. Is Children's Attention Responsive to Spoken Language Complexity?
3.1.1. Do Children Differentially Attend to the Simple vs. Complex Speech?
Children continued listening on an average of [4.00, 5.21] (range = 1–6) pages in the simple condition, and [3.27, 4.50] (range = 0–6) pages in the complex condition (see Online Supplementary Materials for further details). While rates of continued listening were numerically greater in the simple condition, this difference was not significant (t() , ). Table 1 reports the median values for children's net gaze durations and proportion net gaze durations to each AOI, by condition. When listening to the simple speech—compared to when listening to the complex speech—children showed greater total illustration net gaze durations (t() = , ), but similar total distractor net gaze durations (t() = , ). 11 Children's mean proportion net gaze durations did not appear to differ significantly between the two conditions for either the illustration (t() = , ) or the distractor (t() = , ).
TABLE 1.
Metrics of attention to each AOI by condition.
| | | | simple | | complex | |
|---|---|---|---|---|---|---|
| | | | Mdn | 95% CI | Mdn | 95% CI |
| By Page | Net gaze duration (s) | illustration | 14.76 | (11.97, 20.83) | 12.66 | (7.84, 17.72) |
| | | distractor | 5.42 | (3.74, 8.28) | 5.64 | (3.19, 8.80) |
| | Proportion gaze duration | illustration | 0.51 | (0.41, 0.64) | 0.44 | (0.27, 0.61) |
| | | distractor | 0.28 | (0.17, 0.51) | 0.37 | (0.21, 0.54) |
| By Participant | Total gaze duration (s) | illustration | 97.99 | (82.59, 117.58) | 80.21 | (56.50, 92.72) |
| | | distractor | 41.40 | (29.88, 44.68) | 39.68 | (27.20, 49.47) |
| | Mean proportion gaze duration | illustration | 0.50 | (0.45, 0.57) | 0.42 | (0.34, 0.57) |
| | | distractor | 0.31 | (0.24, 0.43) | 0.37 | (0.32, 0.47) |
3.1.2. Does Complexity Level Interact With Age in Predicting Children's Attention?
Table 2 shows the results for a model testing the hypothesis that children's attention to the speech will reflect an interaction between our speech complexity manipulation and children's own cognitive/linguistic development, operationalized here via children's age. As predicted, there was a significant interaction between condition and age in predicting children's probability of continuing listening (; Figure 2). Specifically, when listening to the simple speech, older children were less likely to continue listening than younger children (Age OR = 0.56 [0.32, 0.93]), but the opposite was true for children listening to the complex speech: In the complex condition, older children were more likely than younger children to continue listening on each page (complex:Age OR = 3.37 [1.42, 9.02]). This pattern of results is consistent with the idea that the simple speech represented an appropriate level of complexity for the younger children in our study, whose attention it was more likely to elicit and maintain. Older children may have been more likely to have learned all they could from the first repetition of each page, and thus to disattend (gravitate toward the distractor) when the narration began to loop. The fact that the complex speech was more likely to maintain the attention of older children is consistent with the idea that their greater cognitive and linguistic skills may have made them better able to recognize that there was "something to learn."
TABLE 2.
Mixed effects logit model of children's probability of continuing listening.
| | OR (95% CI) |
|---|---|
| (Intercept) | 4.62*** (2.50, 9.80) |
| Condition (complex) | 0.47 (0.19, 1.07) |
| Age | 0.56** (0.32, 0.93) |
| complex:Age | 3.37** (1.42, 9.02) |
| Observations | 276 |
| Subjects | 46 |
| Log likelihood | −151 |
| AIC | 314 |
| BIC | 335 |
*p < 0.05; **p < 0.01; ***p < 0.001.
FIGURE 2.

Model‐estimated probabilities of a child listening to further optional repetitions of each storybook page narration. Child age (x‐axis) was associated with a significant decrease in the probability of continuing listening in the Simple condition, but a significant increase in the probability of continuing listening in the Complex condition. Shaded region indicates SE; hatches represent individual children, colored by condition.
We saw mixed results with our two measures of children's attention to the illustration. For our absolute measure (children's net illustration gaze durations), only condition was a significant predictor, such that children looked less to the illustration in the complex condition (; ). Neither age (; ) nor the interaction between condition and age (; ) significantly predicted gaze durations. Turning to our relative measure of visual attention to the illustration, age, condition, and their interaction were all significant predictors of children's proportion illustration gaze durations. Children directed a smaller proportion of their gaze to the illustration when listening to the complex speech (; ). The interaction between condition and age was in the expected direction: When listening to the simple speech, older children tended to direct a smaller proportion of their gaze to the illustration, relative to younger children (; ). When listening to the complex speech, older children directed a greater proportion of their gaze to the illustration, relative to younger children (; ; see Online Supplementary Materials for full details).
3.2. Are Children's Learning Outcomes and Patterns of Attention Related?
If children's attention to the speech was driven at least in part by their ongoing sense that they were learning from it, we should see a correspondence across conditions between individual children's learning outcomes and measures of their attention. Our final analyses test this prediction for each learning outcome and attention metric.
As anticipated, the positive subject‐level attention metrics on which these analyses rely were significantly intercorrelated (continued‐listening proportion and total illustration gaze duration: Spearman's , ; total illustration gaze duration and mean illustration proportion gaze duration: Spearman's , ; mean illustration proportion gaze duration and continued‐listening proportion: Spearman's , ). In the General Discussion, we interpret differences in how results for each attention metric (reported in the following two sections) conform to our theoretical predictions.
3.2.1. Did Children Who Attended More Understand the Story Better?
Children tended to perform well on trials testing their listening comprehension (simple: range = 0%–100% accuracy, [59%, 80%]; complex: range = 17%–100%, [55%, 74%]). To test the relation between this learning outcome and children's online attention to the speech, we fit separate mixed effects logit models to their listening comprehension test trial accuracy for each index of children's attention to the speech over the course of the story (the overall proportion of storybook pages on which a child continued listening, their total illustration gaze duration across pages, and their mean illustration proportion gaze duration).
In both conditions, children who paid more attention to the narration—according to our three measures of child attention—showed greater understanding and recollection of the story's plot and characters at test (see Table 3). Controlling for condition and age, the proportion of pages on which a child continued listening (Table 3, Model 1) was significantly related to their listening comprehension accuracy (OR = 1.59 [1.17, 2.18], Wald's , ). The same was true of children's total illustration gaze duration (, ; Table 3, Model 2), and children's mean illustration proportion gaze duration (, ; Table 3, Model 3). The odds ratios for these different subject‐level attention metrics were similar, such that a standard deviation increase in each was associated with approximately a one‐and‐a‐half‐fold increase in the odds of a child answering a test question correctly.
TABLE 3.
Mixed effects logit models predicting listening comprehension accuracy from child attention.
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 2.98* (0.96, 9.99) | 3.57* (0.97, 14.70) | 3.00* (0.98, 9.93) |
| Age | 2.09*** (1.51, 3.00) | 2.42*** (1.53, 4.23) | 2.06*** (1.47, 2.97) |
| Condition (complex) | 0.87 (0.48, 1.59) | 0.82 (0.34, 1.95) | 0.85 (0.46, 1.57) |
| Continued listening | 1.59** (1.17, 2.18) | | |
| illustration gaze (s) | | 1.61* (1.04, 2.62) | |
| illustration proportion | | | 1.48** (1.10, 2.02) |
| Observations | 276 | 276 | 276 |
| Log likelihood | −144 | −142 | −146 |
| AIC | 299 | 297 | 301 |
| BIC | 317 | 319 | 319 |
*p < 0.05; **p < 0.01; ***p < 0.001.
3.2.2. Did Children Who Attended More Learn the Hard Words Better?
Finally, we asked whether our three child attention metrics were positively related to children's generalization of the unfamiliar words tested in the final block of test trials. We fit mixed effects logit models to children's unfamiliar word generalization performance in the complex condition, where the unfamiliar words that we tested were actually used in the narration. Holding child age constant, two of three models showed significant relations between child attention and unfamiliar word generalization test accuracy: These were the proportion of pages on which children continued listening (, ; Table 4, Model 1), and children's mean illustration proportion gaze duration (, ; Table 4, Model 3). Similar to our listening comprehension results, the odds ratios for these two attention metrics suggest that a standard deviation increase in either was associated with approximately a one‐and‐a‐half‐fold increase in the odds of children correctly generalizing the unfamiliar target word at test. Children's total gaze duration to the illustration did not emerge as a significant predictor of their word generalization accuracy (, ; Table 4, Model 2). These results are consistent with—though by no means decisive proof of—the idea that children's attention to the speech was reflective of their implicit sense that they were or could be learning something. The differences across variables suggest that the overall time that children spent looking at the illustration may not have been as good a cue to children's online learning from the speech as their tendency to favor the illustration over other regions of the screen.
TABLE 4.
Mixed effects logit models predicting unfamiliar word generalization from child attention.
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 0.67 (0.28, 1.49) | 0.64 (0.27, 1.38) | 0.68 (0.28, 1.54) |
| Age | 0.80 (0.50, 1.28) | 0.95 (0.61, 1.47) | 0.85 (0.54, 1.33) |
| Continued listening | 1.56* (1.03, 2.43) | | |
| illustration gaze (s) | | 1.15 (0.84, 1.57) | |
| illustration proportion | | | 1.59** (1.14, 2.27) |
| Observations | 132 | 132 | 132 |
| Log likelihood | −84 | −86 | −82 |
| AIC | 175 | 179 | 172 |
| BIC | 187 | 191 | 184 |
*p < 0.05; **p < 0.01; ***p < 0.001.
4. General Discussion
Here, we sought evidence for the foundational idea that children's attention to different sources of information reflects the degree to which those sources support their learning. Inspired by the real‐life context of storybook reading, we tested this idea by manipulating both the complexity of the language that children heard (by varying the estimated ages of acquisition of the words used across a simpler vs. a more complex narration of the same story) and the capacities of the learners themselves (by testing children across a 2‐year age range spanning significant growth in vocabulary and language knowledge; Brown 1973). Systematic differences in child attention while listening to the simple versus complex story narrations suggest that our experimental manipulation of speech complexity was effective, and that our novel method left children free to direct their attention between a speech stream offering new opportunities for learning and an alluring distractor.
The strongest support for our hypothesis that children's attention was at least partially driven by their sense of learning comes from the interaction between speech complexity and age in predicting children's (a) probability of continuing listening on each page of the story and (b) illustration gaze proportion. Our results suggest that children's attention depended on the narration's subjective complexity, rather than on an objective level whose effects or attractiveness was preserved across children. That is, children's desire to hear further repetitions of the same page depended on the size of the gap between their current linguistic competence and the difficulty of the words used to tell the story. When the gap was small, children wanted to continue listening. When the gap was greater, children tuned out. That older children were less likely than younger children to continue listening in the simple condition (where there was little for them to learn from a second repetition of the same page) is especially critical evidence that children's attention reflected their sense of learning, because it helps rule out the possibility that children merely paid more attention with age. We saw similar effects when looking at how children distributed their visual attention across the display in the two conditions. Relative to younger children, older children devoted a greater proportion of their overall gaze to the story‐relevant illustration only in the complex condition, consistent with the idea that children attend more to speech as it becomes less subjectively complex.
One finding was unexpected: children overall seemed to attend more to the simple speech. On further consideration, there are a few reasons why we might have anticipated this result. First, the story was novel, even if the language was not. We had initially expected older children to exhibit significantly less attention to the simple speech because it would hold little learning opportunity for them. However, regardless of whether the language itself offered new material for learning—that is, for example, whether the words and their usages were already familiar to children—the speech was conveying a story that children did not already know. Thus, even if learning were the primary way to secure children's attention, we would expect the simple narration to be attractive to both younger and older children. Second, the "U‐shaped" curve associated with the Goldilocks effect is not symmetric: Infants and young children disattend more decisively from too‐complex stimuli than from too‐simple stimuli (Cubit et al. 2021; Kidd et al. 2012, 2014), suggesting that children may have been less likely to disattend from the simple speech because it was "too easy" than from the complex speech because it was "too hard." Third and finally, our sample skewed young. In previous pilot data, the simple speech led to robust listening comprehension performance in same‐age children recruited from the same area (a result we replicate). This means that we knew in advance of our study that children across our sample age range could at least understand and learn from the simple speech. In contrast, only older children in previous samples had reliably learned from the complex speech. Thus, it is likely relevant for explaining the overall simple‐speech advantage that our sample skewed toward the younger end of our age range (see hatches along the x‐axis in Figure 2).
Our study additionally enabled us to directly test the link between children's selective, learning‐driven attention and their learning outcomes. Previous infant research measuring attention based on complexity or learnability has necessarily employed highly simplified stimuli, with limited potential for assessing learning (Gerken et al. 2011; Kidd et al. 2012, 2014). Studies inferring infants' learning from their gaze behavior have been limited to the visual domain, and to an operationalization of learning as event prediction (Poli et al. 2020). Across conditions, we found that individual children's self‐directed attention to the speech—measured in terms of children's probability of continued listening, illustration gaze duration, and illustration proportion gaze duration—was positively related to their plot knowledge. Two of these three attention metrics—children's tendency to continue listening and their mean illustration proportion gaze durations—were also positively related to children's generalization of the unfamiliar target words embedded in the complex speech. Our results thus offer a novel contribution to previous studies, suggesting that individual children's self‐directed attention to speech reflects the fit between children's cognitive‐linguistic knowledge and the objective complexity of the speech.
Despite their significant intercorrelations, we did not observe all the predicted effects for all the positive attention variables. Most notably, we saw distinct results for the two measures of children's attention to the illustration. The proportion of children's overall gaze that was devoted to the illustration was significantly predicted by the interaction of complexity and age, but their absolute gaze duration was not. Likewise, children's mean illustration gaze proportions predicted their unfamiliar word generalization, but their total illustration gaze durations did not. This pattern lends itself to a few mutually compatible interpretations. First, net gaze duration may just be a noisy measure of attention. There are many routes to high values, including even children “zoning out” (Shepherd and Kidd 2024). Another route to high net gaze durations is via many shorter fixations, potentially broken up by looks to the distractor that repeatedly fail to meet the threshold for triggering the next page. Relative and absolute measures of children's gaze to the illustration pull apart in this latter scenario, as a child's proportion gaze duration would be comparatively low (given that the child would be effectively splitting their time between the illustration and the distractor). This points to another, related interpretation of the discrepancy between measures: It may be that in this experimental setup, what is really important for us to track is how much children get distracted, rather than how much they attend, or how much exposure they get to the speech—and proportion gaze duration may be particularly sensitive to a child's degree of distraction. Lastly, differences across attention variables might derive from our analytic approach. For example, our power simulations anticipated a crossover effect in predicting attention from condition and age. For an interaction where the effect of age on a particular attention variable is merely attenuated in one condition versus another (rather than reversed), our study is likely underpowered.
We were particularly interested in children's sensitivity to naturalistic speech complexity as a means of explaining why certain sources of language input have proven to be more useful for children's learning than others (e.g., the quantity of simplified child‐directed language that children receive reliably correlates with early vocabulary growth, while the quantity of—often more complex—language around them does not; Ramírez‐Esparza et al. 2014; Shneidman et al. 2013; Shneidman and Goldin‐Meadow 2012; Weisleder and Fernald 2013). Relative to studies of infant language development, the idea that low‐level processes of attention to spoken language might continue to mediate language development into the preschool years has received little attention (Houston and Bergeson 2014). Yet our study suggests that this is likely the case. As in other domains where, for example, children track the past accuracy of informants and use it to select who they want as a teacher (Pasquini et al. 2007), our results indicate that children may track the relative difficulty of processing and encoding different sources of linguistic information, and preferentially attend to those sources where their learning is most efficient. Future studies could directly test (1) the idea that independent measures of children's level of linguistic knowledge might predict how they allocate attention to language inputs of different levels of complexity in their environment (e.g., overhearable speech or news broadcasts; Foushee et al. 2016), and (2) whether children are able to actively select the best linguistic information sources to enhance their own learning. More generally, language is an interesting test domain for these questions because language‐learning is a lifelong endeavor—not only do adults continue to learn new vocabulary in their first language(s), but studying a second language puts them back in the child's position of being sensitive to the subjective complexity of different language sources. Future work might also investigate how differing levels of proficiency across a given child's or adult's languages predict the complexity of linguistic input that they spontaneously attend to in each language.
4.1. Limitations
Our study has several key limitations. First, while our experimental design was motivated by a real‐world context (bookreading) and child behavior (saying “again!” to demand a repeat read), the mechanics of the experiment—in which individual pages, rather than the entire book, repeat again and again—are of dubious ecological validity. Second, and relatedly, we introduced the distractor to “catch” children's attention when it was no longer captured by the story and matching illustration. However, the animation's presence itself may have interfered with children's attention to and processing of the story. Understanding the role of the distractor in children's learning is important for understanding what our method can tell us about how children attend to speech in the real world. Of course, children are rarely offered a riveting .gif to watch as an alternative to paying attention to a speech stream. However, children are arguably always faced with potential competitors for the attention that they could be devoting to language (e.g., when engaged with a younger sibling, rather than listening in on their caregivers' conversation, or even when just thinking really hard about dinosaurs, rather than hearing what someone has to say). Finally, we used child age as a proxy for children's level of cognitive‐linguistic development. This is defensible based on previous samples, which have shown child age to be highly correlated with vocabulary size, and moderately correlated with unfamiliar word generalization performance. However, future studies could more precisely test the hypotheses advanced here by employing language‐specific measures of development, rather than relying on child age.
4.2. Conclusion
We designed a novel method to directly test the classic idea that children's attention to a stimulus is driven by its support for their learning, and to extend this idea to an important new domain: spoken language. Our results reveal one of many ways in which children can be thought of as active language‐learners (Foushee et al. 2023; Zettersten and Saffran 2021; Saylor and Ganea 2018; Bloom 2000; Foushee et al. 2021), effectively shaping their language input via the deployment of their own attention.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
We thank the children and their families for participating, as well as the teachers at the Harold E. Jones Child Study Center, University Village Albany Child Development Center, Haste Street Child Development Center, Kensington Elementary School, and The Berkeley School; Alison Fong, Jacqueline Nguyen, Jon Wehry, Harmonie Strohl, Gwyneth Heuser, Grace Horton, and Luvy Vanegas Grimaud for help with stimuli and data collection; Mike Frank, Susanne Gahl, Terry Regier, and Azzurra Ruggeri for comments on analysis; the members of the Berkeley Early Learning and Language and Cognitive Development Labs at UC Berkeley—including Monica Ellwood‐Lowe and Ariel Starr—for feedback on drafts; and Chef Miko for editing. Earlier versions of this work were presented at the 45th Annual Boston University Conference on Language Development, the Biennial Meeting of the Society for Research in Child Development, 2021, and the Annual Meeting of the Jean Piaget Society, 2022. This work was supported by the NSF GRFP DGE‐1752814 to R.F. and the NSF SMA‐1640816 to F.X. This article was written using the papaja library in RStudio (Aust and Barth 2024).
Funding: This work was supported by the NSF GRFP DGE‐1752814 to R.F. and the NSF SMA‐1640816 to F.X.
Endnotes
While seemingly obvious, we note that this remains untested in previous work.
In a previous sample of same‐age children recruited from the same venues as our current participants, children's age was highly correlated with their raw scores on the Peabody Picture Vocabulary Test (Dunn and Dunn 1981), a more narrowly relevant measure of language competence. Age was also significantly correlated with children's learning of the same set of rare words embedded in the complex speech condition of the current study (see Materials and Methods). Both of these facts suggest that age is a reasonable index to use in predicting the relative “learnability” or “subjective complexity” of the linguistic stimuli whose objective complexity we have manipulated.
Across pages, narrations were matched for syllable count (range = 50–61 syllables, [53.33, 56.59]; paired by page), speech rate (range = 3.42–3.99 syllables/s, [3.58, 3.76]; paired by page), number of sentences (5/page), and number of questions versus declarative sentences on each page. Sentences 1, 2, and 5 on each page—where the complex narration embedded five later‐acquired content words—were additionally matched on type‐token ratio (range = 0.81–1, [0.87, 0.94]; paired by page). Sentences 3 and 4 in the complex condition used the rare target word for that page one time each, while the simple condition used a more accessible alternative. Comparisons between the language of the two conditions can be found in the online repository for this study.
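For readers unfamiliar with the measure, type‐token ratio is simply the number of unique word types divided by the total number of word tokens; a minimal R sketch (toy sentence and tokenizer, not our stimulus‐preparation code):

```r
# Type-token ratio = unique word types / total word tokens.
ttr <- function(sentence) {
  tokens <- unlist(strsplit(tolower(sentence), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]  # drop empty strings from splitting
  length(unique(tokens)) / length(tokens)
}

ttr("the frog looked in the jar")  # 5 types / 6 tokens ~ 0.83
```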
The test trial for “absconding” also used a different tense (present progressive) from the story, where both uses were in the simple past: “…the frog absconded from the jar. He absconded to find his mom and dad.”
While we report descriptive statistics for gaze durations in seconds for readability (Table 1), we use log‐transformed millisecond values in our page‐level statistical models.
The majority of children's gaze to the display ([81%, 85%] across pages) was typically captured by one of our two AOIs. In other words, children mostly looked at either the illustration or the distractor, and spent only a small minority of their looking time in the white space surrounding the two images, for example, when switching between them.
Model syntax: glmer(continued_listening ~ age + condition + age:condition + (1|subject) + (1|page), family = binomial(link = "logit")).
Model syntax: lmer(gaze_duration ~ age + condition + age:condition + (1|subject) + (1|page)).
Model syntax: gam(gaze_proportion ~ age + condition + age:condition + s(subject, bs = "re") + s(page, bs = "re"), family = betar(link = "logit")).
Model syntax: glmer(question_correct ~ standardized_attention_index + condition + age + (1|subject) + (1|question), family = binomial(link = "logit")).
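For concreteness, the sketch below assembles these model calls into runnable form, assuming hypothetical page‐level (d) and question‐level (q) data frames with the named columns; the actual analysis scripts are available in the online repository.

```r
library(lme4)  # glmer(), lmer(); Bates et al. (2015)
library(mgcv)  # gam() with a beta-regression family; Wood (2011)

# d: one row per child x page, with continued_listening (0/1), age (years),
# condition (simple vs. complex), gaze_duration (log ms), gaze_proportion
# (0-1, exclusive), plus subject and page coded as factors.
m_listening <- glmer(
  continued_listening ~ age + condition + age:condition +
    (1 | subject) + (1 | page),
  data = d, family = binomial(link = "logit")
)

m_duration <- lmer(
  gaze_duration ~ age + condition + age:condition +
    (1 | subject) + (1 | page),
  data = d
)

# Random effects enter the GAM as "re" smooths; betar() handles proportions.
m_proportion <- gam(
  gaze_proportion ~ age + condition + age:condition +
    s(subject, bs = "re") + s(page, bs = "re"),
  data = d, family = betar(link = "logit")
)

# q: one row per child x post-test question, for the learning analyses.
m_learning <- glmer(
  question_correct ~ standardized_attention_index + condition + age +
    (1 | subject) + (1 | question),
  data = q, family = binomial(link = "logit")
)
```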
Recall that variability in children's gaze to the distractor was likely truncated by the mechanics of the experiment, which, after the first repetition of the current page, transitioned to the next page as soon as a child's fixation to the distractor crossed the 1500 ms threshold.
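As an illustration of this mechanic, the sketch below (a simplification under the stated assumptions, not the experiment code) advances the page only once the first narration loop has finished and a single continuous distractor fixation reaches the 1500 ms threshold:

```r
# Simplified sketch of the assumed gaze-contingent rule: the page advances
# early only when (a) the first narration loop has finished and (b) a single
# continuous fixation to the distractor reaches the 1500 ms threshold.
advance_page <- function(fixations, first_loop_done) {
  distractor_ms <- fixations$dur_ms[fixations$aoi == "distractor"]
  first_loop_done && any(distractor_ms >= 1500)
}

fix_stream <- data.frame(
  aoi    = c("illustration", "distractor", "illustration", "distractor"),
  dur_ms = c(4000, 900, 2500, 1600)
)

advance_page(fix_stream, first_loop_done = TRUE)   # TRUE: 1600 ms look crosses threshold
advance_page(fix_stream, first_loop_done = FALSE)  # FALSE: still in first repetition
```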
Data Availability Statement
The data that support the findings of this study are openly available in “Is child attention responsive to linguistic complexity?” at https://osf.io/zsjfb/?view_only=024c8e83e56a4fff95e5d5ae840035c2.
References
- Aust, F., and Barth M. 2024. Papaja: Prepare Reproducible APA Journal Articles with R Markdown. R package version 0.1.3. https://github.com/crsh/papaja.
- Barr, D. J., Levy R., Scheepers C., and Tily H. J. 2013. “Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal.” Journal of Memory and Language 68, no. 3: 255–278.
- Bates, D., Mächler M., Bolker B., and Walker S. 2015. “Fitting Linear Mixed‐Effects Models Using lme4.” Journal of Statistical Software 67, no. 1: 1–48. 10.18637/jss.v067.i01.
- Bloom, L. 2000. “The Intentionality Model of Word Learning: How to Learn a Word, Any Word.” In Becoming a Word Learner: A Debate on Lexical Acquisition, edited by Golinkoff R., Hirsh‐Pasek K., Bloom L., Smith L., Woodward A., Akhtar N., Tomasello M., and Hollich G., 19–50. Oxford University Press. 10.1093/acprof:oso/9780195130324.003.002.
- Brown, R. 1973. A First Language: The Early Stages. Harvard University Press.
- Bruner, J. S. 1961. “The Act of Discovery.” Harvard Educational Review 31: 21–32.
- Caron, R. F., and Caron A. J. 1969. “Degree of Stimulus Complexity and Habituation of Visual Fixation in Infants.” Psychonomic Science 14, no. 2: 78–79.
- Colombo, J., and Mitchell D. W. 2009. “Infant Visual Habituation.” Neurobiology of Learning and Memory 92, no. 2: 225–234.
- Cooper, R. M. 1974. “The Control of Eye Fixation by the Meaning of Spoken Language: A New Methodology for the Real‐Time Investigation of Speech Perception, Memory, and Language Processing.” Cognitive Psychology 6: 84–107.
- Cubit, L. S., Canale R., Handsman R., Kidd C., and Bennetto L. 2021. “Visual Attention Preference for Intermediate Predictability in Young Children.” Child Development 92, no. 2: 691–703.
- Dunn, L. M., and Dunn L. M. 1981. Peabody Picture Vocabulary Test–Revised. American Guidance Service, Inc.
- Fenson, L., Marchman V. A., Thal D. J., Dale P. S., Reznick J. S., and Bates E. 2007. MacArthur‐Bates Communicative Development Inventories: User's Guide and Technical Manual. 2nd ed. Brookes.
- Forest, T. A., Siegelman N., and Finn A. S. 2022. “Attention Shifts to More Complex Structures With Experience.” Psychological Science 33, no. 12: 2059–2072.
- Foushee, R., Griffiths T. L., and Srinivasan M. 2016. “Lexical Complexity of Child‐Directed and Overheard Speech: Implications for Learning.” In Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 1697–1702.
- Foushee, R., Srinivasan M., and Xu F. 2021. “Self‐Directed Learning by Preschoolers in a Naturalistic Overhearing Context.” Cognition 206: 104415. 10.1016/j.cognition.2020.104415.
- Foushee, R., Srinivasan M., and Xu F. 2023. “Active Learning in Language Development.” Current Directions in Psychological Science 32, no. 3: 250–257.
- Gerken, L. A., Balcomb F. K., and Minton J. L. 2011. “Infants Avoid ‘Labouring in Vain’ by Attending More to Learnable Than Unlearnable Linguistic Patterns.” Developmental Science 14, no. 5: 972–979. 10.1111/j.1467-7687.2011.01046.x.
- Gureckis, T. M., and Markant D. B. 2012. “Self‐Directed Learning: A Cognitive and Computational Perspective.” Perspectives on Psychological Science 7, no. 5: 464–481. 10.1177/1745691612454304.
- Houston, D. M., and Bergeson T. R. 2014. “Hearing Versus Listening: Attention to Speech and Its Role in Language Acquisition in Deaf Infants With Cochlear Implants.” Lingua 139: 10–25. 10.1016/j.lingua.2013.08.001.
- Kidd, C., and Hayden B. Y. 2015. “The Psychology and Neuroscience of Curiosity.” Neuron 88, no. 3: 449–460.
- Kidd, C., Piantadosi S. T., and Aslin R. N. 2012. “The Goldilocks Effect: Human Infants Allocate Attention to Visual Sequences That Are Neither Too Simple nor Too Complex.” PLoS ONE 7, no. 5: e36399. 10.1371/journal.pone.0036399.
- Kidd, C., Piantadosi S. T., and Aslin R. N. 2014. “The Goldilocks Effect in Infant Auditory Attention.” Child Development 85, no. 5: 1795–1804. 10.1111/cdev.12263.
- Kuperman, V., Stadthagen‐Gonzalez H., and Brysbaert M. 2012. “Age‐of‐Acquisition Ratings for 30,000 English Words.” Behavior Research Methods 44, no. 4: 978–990. 10.3758/s13428-012-0210-4.
- Martin, R. M. 1975. “Effects of Familiar and Complex Stimuli on Infant Attention.” Developmental Psychology 11, no. 2: 178–185.
- Mayer, M. 1969. Frog, Where Are You? Dial Press.
- Pasquini, E. S., Corriveau K. H., Koenig M., and Harris P. L. 2007. “Preschoolers Monitor the Relative Accuracy of Informants.” Developmental Psychology 43, no. 5: 1216.
- Poli, F., Serino G., Mars R., and Hunnius S. 2020. “Infants Tailor Their Attention to Maximize Learning.” Science Advances 6, no. 39: eabb5053.
- R Core Team. 2021. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/.
- Ramírez‐Esparza, N., García‐Sierra A., and Kuhl P. K. 2014. “Look Who's Talking: Speech Style and Social Context in Language Input to Infants Are Linked to Concurrent and Future Speech Development.” Developmental Science 17, no. 6: 880–891. 10.1111/desc.12172.
- Rankin, C. H., Abrams T., Barry R. J., et al. 2009. “Habituation Revisited: An Updated and Revised Description of the Behavioral Characteristics of Habituation.” Neurobiology of Learning and Memory 92, no. 2: 135–138.
- Salverda, A. P., and Altmann G. T. M. 2011. “Attentional Capture of Objects Referred to by Spoken Language.” Journal of Experimental Psychology: Human Perception and Performance 37, no. 4: 1122–1133. 10.1037/a0023101.
- Saylor, M. M., and Ganea P. A. 2018. Active Learning From Infancy to Childhood: Social Motivation, Cognition, and Linguistic Mechanisms. Springer. 10.1007/978-3-319-77182-3.
- Shepherd, S. S., and Kidd C. 2024. “Visual Engagement Is Not Synonymous With Learning in Young Children.” In Proceedings of the 46th Annual Meeting of the Cognitive Science Society.
- Shneidman, L. A., Arroyo M. E., Levine S. C., and Goldin‐Meadow S. 2013. “What Counts as Effective Input for Word Learning?” Journal of Child Language 40, no. 3: 672–686. 10.1017/S0305000912000141.
- Shneidman, L. A., and Goldin‐Meadow S. 2012. Mayan and U.S. Caregivers Simplify Speech to Children. Vol. 2, 536–544. Cascadilla Press.
- Thomas, H. 1965. “Visual‐Fixation Responses of Infants to Stimuli of Varying Complexity.” Child Development 36: 629–638.
- U.S. Census Bureau. 2020. “Financial Characteristics.” American Community Survey, ACS 5‐Year Estimates Subject Tables, Table S2503. https://data.census.gov/table/ACSST5Y2020.S2503?q=median+household+income+2020&t=Income+and+Poverty&g=160XX00US0606000.
- Vygotsky, L. S., Cole M., John‐Steiner V., Scribner S., and Souberman E. 1978. Mind in Society: Development of Higher Psychological Processes. Harvard University Press. https://books.google.se/books?id=RxjjUefze_oC.
- Weisleder, A., and Fernald A. 2013. “Talking to Children Matters: Early Language Experience Strengthens Processing and Builds Vocabulary.” Psychological Science 24, no. 11: 2143–2152. 10.1177/0956797613488145.
- Wood, S. N. 2011. “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models.” Journal of the Royal Statistical Society (B) 73, no. 1: 3–36.
- Zettersten, M., and Saffran J. R. 2021. “Sampling to Learn Words: Adults and Children Sample Words That Reduce Referential Ambiguity.” Developmental Science 24, no. 3: 1–12. 10.1111/desc.13064.