Abstract
Studies of non-linguistic statistical learning (SL) have often linked performance in SL tasks with differences in language outcomes. Most of these studies have focused on Western and high-income educational contexts, but children worldwide learn in radically different educational systems and communities, and often in a second language. In the West African nation of Côte d’Ivoire, children enter fifth grade (CM-1) with widely varying ages and literacy skills. Across three iteratively developed experiments, 157 children, aged 8–15 years, in rural communities in the greater Adzopé region of Côte d’Ivoire watched sequences of cartoon images with embedded triplet patterns on touchscreen tablets while performing a target-detection task. We assessed these tablet-based adaptations of non-linguistic visual SL and asked whether the children’s individual differences in performance on the SL tasks were related to their first and second language and literacy skills. We found group-level evidence that children used the statistical regularities in the image sequence to gradually decrease their response times, but their responses on post-test discrimination did not reflect this learning. When evaluating the correlation between SL and language skills, individual differences related to other task demands predicted oral language skills shared by first and second languages, while SL better predicted second language print skills. These findings suggest that non-linguistic SL paradigms can measure similar skills in Ivorian children as in previous samples, but they also echo recent calls for further cross-cultural validation, greater internal reliability, and tests for confounding variables (such as processing speed) in studies of individual differences in statistical learning.
Keywords: Statistical learning, emergent literacy, Côte d’Ivoire, bilingualism, education
A major goal in child research is understanding how general cognitive processes later support specific developmental or academic outcomes. Statistical learning is one example of a theoretical framework that has proposed clear and testable pathways from domain-general processes in infancy and early childhood to important abilities from primary school through adulthood, such as language learning and literacy. Yet most of the cognitive development research relating statistical learning to language and literacy has been conducted in W.E.I.R.D. settings (Western, Educated, Industrialized, Rich, Democratic). This narrow focus is especially problematic when generalizing findings about child development from one cultural or educational context to another, where assumptions held in one context may not be applicable to new, under-sampled settings (Henrich, Heine, & Norenzayan, 2010).
In the present study, we set out to re-examine the relationship between non-linguistic visual statistical learning (VSL) and related skills in a population of children that differs considerably from those in most previous VSL studies. The principal challenges to such a cross-cultural study, however, are quite fundamental: The translation of even simple experimental paradigms across cultural settings requires a careful re-assessment along several methodological dimensions (see Burger et al., 2022 for an outline and best practices). In this report, we describe a series of three pilot experiments, which we iteratively updated and conducted in rural agricultural villages in southeastern Côte d’Ivoire. Our goal was to establish a measure of non-linguistic VSL that reliably reflects individual differences in this population of children, is minimally confounded by other task demands, and correlates with expected differences in language and literacy outcomes.
Following Burger et al.’s recommendations for equipping local researchers to lead related work in their own communities, we take a methodological focus. We present our process and rationale for task adaptation, describe both the instruments that succeeded and those that failed, and share complete datasets and analysis code in open, plain-text formats. All resources can be downloaded from Open Science Framework at: https://osf.io/nza7m/
Statistical learning and individual differences
Statistical learning (SL) is a learning mechanism present from infancy through older adulthood (Kirkham, Slemmer, & Johnson, 2002; Palmer, Hutson, & Mattys, 2018; Saffran, Aslin, & Newport, 1996; Saffran, Johnson, Aslin, & Newport, 1999). By 16 months, infants acquire language-specific prosodic patterns by learning to segment syllables and words from continuous speech (Nazzi et al., 2006). SL is thought to support this early language learning by highlighting linguistic units (syllables and words) via co-occurrence probabilities when other cues (e.g., brief pauses) are unreliable (Saffran, 2003; Sawi & Rueckl, 2019).
Literacy, in turn, builds on these oral language skills developed in early childhood (i.e., emergent literacy; Storch & Whitehurst, 2002; Whitehurst & Lonigan, 1998), including vocabulary and phonological awareness (the recognition that speech is decomposable into constituent syllables and phonemic units; Snow, 2006). Thus, SL mechanisms supporting spoken language extend to support literacy (Arciuli, 2018). Several non-linguistic segmentation studies, in which participants learn transitional probabilities for a continuous sequence of images or sounds, report correlations between SL and literacy for school-aged children (Arciuli & Simpson, 2012; Torkildsen et al., 2019; Tong et al., 2019) and adults (Frost et al., 2013; Yu et al., 2019).
Additionally, the connections between SL, language, and educational outcomes are likely to be enhanced in bilingual environments. Oral language skills in a child’s first language (L1) are associated with second language (L2) reading (Bialystok, 2002; Hammer et al., 2009; Jasińska et al., 2019). This cross-language transfer suggests that the same learning mechanisms support emergent literacy in both languages. Phonological awareness is supported by learning co-occurrence probabilities (Goschke, Friederici, Kotz, & Van Kampen, 2001; Warker, 2013), mediating the correlation between individual differences in non-linguistic auditory SL and first-language literacy for an English monolingual population (Qi et al., 2019).
Many SL studies, however, are designed to identify group-level differences and are ill-suited to measure individual differences (Siegelman, Bogaerts, & Frost, 2017). Even within existing W.E.I.R.D. samples, implicit learning tasks have limited internal and test-retest reliability (West et al., 2018), especially for children (Arnon, 2020; but see Torkildsen et al., 2019 and Qi et al., 2019 for exceptions), calling into question their utility as predictors of educational outcomes. However, there are few proposed alternatives for developmental SL research. Test regimes that improve individual difference measures in VSL for adults combine various question formats (preference vs. completion, trigram vs. bigram judgments; Siegelman, Bogaerts, & Frost, 2017), which are likely inaccessible to children who already struggle to meet the simpler demands of typical two-alternative forced choice (2-AFC) tasks.
Challenges to adapting statistical learning across cultures
Understanding a general, non-linguistic SL mechanism could help bridge developmental research across languages, cultures, and educational backgrounds. However, the existing obstacles to measuring individual differences in SL are amplified by these differences. Adapting tasks that were initially developed for high-income or technologically resourced communities requires great care to ensure that both ecological validity (how closely a measure reflects the ability or behavior of interest in day-to-day life; Kihlstrom, 2021; Holleman et al., 2020) and cultural validity (how socio-cultural expectations impact task performance; Kūkea Shultz & Englert, 2021; Solano-Flores, 2011; Solano-Flores & Nelson-Barber, 200) are preserved in the new context.
Simply participating in most SL tasks depends on participants’ sustained attention to unfamiliar and repetitive stimuli and their willingness to make multiple, independent preference judgments between sets of these stimuli. Recent cross-cultural research has highlighted the fragility of this independence assumption for participants’ responses. Hruschka et al. (2018) found that participants provided seemingly inconsistent judgments in a social discounting paradigm (being willing to give away a large or small amount of a resource, but not a medium amount), due to the repetitious, iterative format of the questions. Although Hruschka and colleagues observed this effect in farming communities of rural Bangladesh, they note that truck drivers in the United States (Burks et al., 2009) and children (Sharp et al., 2012) provide similar response patterns. This non-independence of serial forced-choice judgments could underlie the low internal reliability observed in previous SL studies as well.
In rural, agrarian contexts, limited experience with common testing technologies used in psychological research may also present challenges to validity. Associating two-dimensional (2D) images with real-world objects is an insight developed in early childhood (DeLoache, 2000), but depends on a child’s regular exposure to these media representations, such as picture books or cartoons. While one-year-old American infants learn names of novel objects through either seeing pictures of the object or seeing the object itself, infants in rural Tanzania were less successful in picture-mediated learning (Walker et al., 2013). These latent representational differences can affect cognitive measures like pattern acquisition: Six-year-old children in Zambia learned patterns using familiar tactile stimuli that they were not able to learn using the standard 2D image presentation (Zuilkowski et al., 2016).
Children outside high-income communities also have widely varying experiences with computer interfaces. Nearly all VSL research with children and adults is performed on personal computers with responses provided by pressing buttons. However, recent studies have exploited the wide global adoption of mobile phone interfaces and begun validating tablet computer interfaces as testing tools. For example, while local and refugee children in Jordan have differing experiences using personal computers, they reported similar levels of familiarity with cell phones and quickly picked up tablet-based tasks with touchscreens (Chen et al., 2019). Touchscreen adaptations have proven useful for timed cognitive processing tasks with children in rural Bolivia (Pelz, Yung, & Kidd, 2015), Lebanon, and Niger (Ford et al., 2019). Despite these advantages, statistical learning studies have seen little uptake of these interfaces so far.
Along many of these dimensions, communities in rural Côte d’Ivoire differ from the communities in which previous statistical learning studies have been conducted. Interaction with digital and print media is limited: Some Ivorian languages, such as Attié, are rarely presented in print and adult literacy rates have been low (47%, UNESCO Institute for Statistics, 2017), so many Ivorian children have little print exposure prior to starting school. Further, irregular attendance due to illness and economic obstacles leads to highly variable ages of school enrollment and grade repetition. Consequently, children reach their fifth year of primary education (CM-1) at a wide range of ages and developmental stages, and with highly variable skillsets.
The Present Study
To achieve a better understanding of the cognitive mechanisms that Ivorian children apply to language and literacy acquisition, the first steps are establishing task validity and reliability. We report three non-linguistic visual statistical learning (VSL) experiments that we gradually adapted for deployment in settings with limited technological resources and for cultural relevance to children in rural Côte d’Ivoire. In each VSL experiment, cartoon images were presented one-at-a-time in a sequence determined by four groups of three images, and we assessed participants’ ability to passively learn these patterns. The first experiment updated Arciuli and Simpson’s (2012) VSL study to accommodate touchscreen interfaces, deliver stimuli over a low-bandwidth cellular network, and expand opportunities for children to practice making touch responses. In two successive experiments, we incorporated elements of Qi et al.’s (2019) and Schneider et al.’s (2020) VSL studies, switching from a 1-back to a target-detection task, adding more colorful and detailed stimuli, adding a narrative to contextualize the task, and increasing the number of 2-AFC trials. Data were collected in villages outside the southeastern city of Adzopé in January and February 2019 and February 2020.
For clarity, we present these three experiments in parallel, contrasting their similarities and differences. We began by asking (1) whether the tablet interface and cover tasks were generally accessible to the children in our sample, (2) whether the VSL tasks evoked similar evidence of group-level learning as previous studies, (3) whether individual differences in these tasks reliably reflected SL abilities or other cognitive and cultural factors, and (4) whether SL showed the expected relationships with other individual differences in emergent literacy: bilingual oral language and L2 print skills.
Method
This study was conducted during the randomized controlled trial for a phone-based French literacy intervention in cocoa-producing communities of rural southeastern Côte d’Ivoire (Allo Alphabet; Madaio et al., 2019a, 2019b, 2020), approved by the University of Delaware Institutional Review Board. Each community’s local leadership (the village chief), parent representative group (COGES), and school directors provided informed consent after presentations by the research team, according to local norms (see Jasińska & Guei, 2018). Each child verbally assented to participation and was rewarded with a small gift (a child’s book). Testing was performed in school yards at folding tables or in partially enclosed (breeze block) classrooms, with researcher-child dyads maximally spaced apart. Other children were asked to quietly keep a distance of several meters from their classmates during testing.
Participants.
Children in Côte d’Ivoire are mainly educated in their second language (L2), French, while speaking one of over sixty local languages at home (Brou-Diallo, 2011; Ayewa, 2018; Jasińska & Guei, 2022; Jasińska, Ball, & Guei, 2023). A survey we conducted in early 2019 found that 92% of households in this region speak Attié as their first language (L1), and 28% of children reported having at least one French-speaking family member at home. We recruited children enrolled in CM-1 (equivalent of U.S. 5th grade) at primary schools in villages of the greater Adzopé region.
Experiments 1 and 2 were performed during baseline visits for the literacy intervention, so that statistical learning was not influenced by participation in the intervention. Data for Experiment 3 were collected during a midline visit to control-arm schools, so SL was again not influenced by the intervention. Children were recruited on a convenience basis: If a child was not engaged in other assessments and assented to participate, they were matched with an experimenter. Children were not included or excluded on any criteria besides membership in the visited classroom.
In Experiment 1, we recruited 71 children in four villages, aged from 10 to 15 years (mean=11.4y, SD=1.5y). In Experiment 2, we visited three new villages and recruited 40 children, ages 9 to 15 years (mean=11.1y, SD=1.3y). In Experiment 3, we visited two schools in one village and recruited 46 children ages 8 to 13 years old (mean=10.5y, SD=1.4y). See Table 1 for a summary. Due to the distributed recruitment (about a dozen research assistants working in parallel), children completed different subsets of the available tasks. In Experiment 1, SL, demographic, and at least one language and literacy test were available for 58 children. In Experiments 2 and 3, 29 and 33 children met these criteria, respectively. See Table 2 for a further breakdown.
Table 1.
Sample characteristics in each experiment
| Sample | n | Age | % using French at home | # grades repeated |
|---|---|---|---|---|
| Experiment 1 | 71 | 9–15 y (11.3, 1.4) | 33% | 0–5 (0.68, 1.01) |
| Experiment 2 | 40 | 9–15 y (11.1, 1.4) | 48% | 0–2 (0.65, 0.62) |
| Experiment 3 | 46 | 8–13 y (10.6, 1.5) | 29% | 0–2 (0.37, 0.55) |
Note: Range (mean, SD) reported where applicable
Table 2.
Sample sizes in each experiment: number of participants in each dataset (male / female / not surveyed)
| Sample | SL | Demographic | Attié skills | French skills | Complete cases |
|---|---|---|---|---|---|
| Exp. 1 | 71 | 67 (31 / 36 / 0) | 60 (27 / 33 / 0) | 70 (32 / 36 / 1) | 59 (26 / 33 / 0) |
| Exp. 2 | 40 | 35 (19 / 14 / 2) | 30 (16 / 12 / 2) | 34 (18 / 14 / 2) | 29 (15 / 12 / 2) |
| Exp. 3 | 46 | 34 (21 / 13 / 0) | 35 (20 / 14 / 1) | 35 (21 / 14 / 0) | 33 (20 / 13 / 0) |
Note: Sex information is missing when children did not also participate in the demographic questionnaire. Complete cases have age data and at least one measure from each of SL, Attié, and French tasks.
Materials.
The statistical learning tasks, demographic surveys, and language tests were all recorded on iPad Air 2 tablets with a 9.7 inch display. VSL stimuli were presented using the Paradigm software (Perception Research Systems, 2007) with a touchscreen interface, allowing children to directly touch the target stimuli instead of pressing buttons. Survey instruments were administered via REDCap hosted at the University of Delaware (Harris et al., 2009, 2019).
During the VSL task, all stimuli had to be buffered from Dropbox via a mobile hotspot on 2G or 3G cellular network. Consequently, the size of image stimuli was a key constraint. Images were carefully scaled and compressed to be streamed over the network without compromising their visual quality.
Visual statistical learning tasks.
Procedures in all three experiments included practice, familiarization, and test phases. Research assistants (native speakers of Ivorian French with a college degree in linguistics or a related field) provided verbal instructions in French, accompanied by gestures and, if needed, by physically guiding the child’s hand in early practice trials. Stimulus images were always referred to as “animaux” (animals), rather than aliens or monsters, to avoid confusion for the children (who were generally not familiar with science fiction and fantasy genres). Throughout all three phases of the task, researchers monitored the children to make sure they stayed on-task. See Table 3 for a comparison of stimuli, tasks, and other parameters across the three experiments.
Table 3.
Experimental parameters for statistical learning in each of the experiments
| Phase | Parameter | Experiment 1 | Experiment 2 | Experiment 3 |
|---|---|---|---|---|
| Practice | Cover task | 1-back matching | Target detection | Target detection |
| | Stimulus images | Parrot photos | Parrot photos | Monsters¹ |
| | Duration (ms) | 800 | 800 | 800 |
| | ISI (ms) | 200 | 200 | 200 |
| | Feedback type | Oui! / Non! text | Smile / frown icon | Smile / frown icon |
| Familiarization | Cover task | 1-back matching | Target detection | Target detection |
| | Stimulus images | Grayscale aliens² | Monsters¹ | New aliens³ |
| | Duration (ms) | 600 | 800 | 800 |
| | ISI (ms) | 200 | 200 | 200 |
| Test | # 2-AFC trials | 8 | 16 | 32 |
Notes:
¹ Licensed from 123RF (2019)
² Obtained from Arciuli & Simpson (2012)
³ Obtained from Schneider et al. (2020)
Practice stimuli.
In Experiments 1 and 2, we presented four photographs of parrots obtained from a Google Image search. Each parrot was a different color and appeared on a generic out-of-focus background. Images were scaled to approximately 200×250 pixels and jpeg-compressed to 3–8 KB. In Experiment 3, we re-used Monster stimuli from Experiment 2 (see “Familiarization and test stimuli” below) as practice stimuli to be consistent with the cartoon appearance of the familiarization stimuli.
Familiarization and test stimuli.
Each experiment used a different set of stimuli in the SL task. For Experiment 1, we obtained grayscale copies of cartoon alien images from Arciuli and Simpson’s (2011, 2012) supplementary materials. These images are not licensed for redistribution, and original color stimuli are no longer available. Images were scaled to 500×500 pixels and jpeg-compressed to 23–36 KB.
For Experiment 2, we licensed color images of cartoon monsters from a clipart repository (123RF, 2019; Figure 1). The monsters are visually simple, with a face, body, and limbs in different configurations. Each monster had one or two main colors and solid, spotted, or striped texture. The monsters were individually unique (e.g., an orange one, a slug-like one), but they were not distinguishable by any single feature, such as color, limbs, shape, texture, or eyes. Twelve monsters were used to create the four triplets for the VSL task. Images were cropped and scaled to 500 × 500 pixels and jpeg compressed to 18–35 KB.
Figure 1.

Panels A through D: Sample screenshots and the Monster stimulus inventory used in Experiment 2. Children were familiarized with the target image (“an amusing animal”; Panel A) and performed a target detection task by touching the target animal when it appeared in the familiarization stream (Panel B). After familiarization, children selected between triplets of animals depicted sequentially on the left and right sides of the screen by touching the hand symbol corresponding to the group’s position (Panel C). All twelve images used in the tasks are depicted in Panel D. Panels E through H: Sample screenshots and the Alien stimulus inventory used in Experiment 3. Example presentation of a target image (Panel E), familiarization phase (Panel F), and 2-AFC response screen in the test phase (Panel G). All twelve images used in the tasks are depicted in Panel H.
Because 2-AFC accuracy was very low in Experiments 1 and 2, we adapted the familiarization procedures in Experiment 3 for closer parity with a previous VSL experiment reported by Schneider et al. (2020). Schneider and colleagues’ experiment used high-color, detailed cartoon aliens (Figure 1), a narrative about aliens lining up to board a spaceship, and a corresponding image of a rocket, and they found that 6- to 12-year-old children performed significantly above chance in the 2-AFC post-test. The twelve Alien images were cropped and scaled to approximately 240 × 300 pixels and jpeg compressed to 8–13 KB. Where Schneider et al.’s study used a narrative about aliens and a spaceship, we modified screenshots of a common Ivorian minibus (gbaka) and driver, totaling 161 KB, from a commercial for mobile wifi (Voodoo Group, 2016). These images were used to create introduction and practice scripts similar to Qi et al.’s (2019) and Schneider et al.’s (2020) and to maintain the children’s interest while remaining culturally accessible.
Practice procedures.
Practice stimuli appeared for 800 ms with a 200 ms interstimulus interval (ISI) to acclimate the children to interacting with the touchscreen and illustrate the cover task, either 1-back matching (responding to a repeated image) in Experiment 1 or target detection (responding to a pre-specified image) in Experiments 2 and 3. Prior to the practice, the experimenter explained that the child would see some birds or interesting animals, and provided one of the following sets of instructions: For 1-back, children were told that the same bird would sometimes appear twice in a row, and the child should touch that repeated bird on the screen. For target-detection, children were shown a target image (one of four parrots or monsters) and asked to touch that image every time it appeared on the screen. In all cases, the children were advised that the animals were very fast and to respond quickly in order to touch them.
Children completed up to 64 trials (just over a minute long) generated from a uniform random distribution (without replacement) of sixteen copies of four different stimulus images so that no transitional probabilities could be learned. In Experiment 1, print feedback, “Oui!” (yes!) or “Non!” (no!), was provided for all screen touches. In Experiments 2 and 3, feedback was given using an illustrated smile- or frown-face (2 KB each), because not all children immediately recognized the French words “Oui” and “Non.” Children who failed to respond or seemed confused received scaffolded demonstrations by holding their hand and guiding them to touch a target photo. Practice was terminated when the child achieved two hits more than their false alarms or after all 64 trials. All children moved on to the familiarization phase regardless of practice performance.
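The practice sequence and stopping rule described above can be sketched in a few lines. This is a minimal illustration rather than the Paradigm script used in the study: the image names, the function names, and the trial-outcome labels are hypothetical, and the criterion is read as "hits exceed false alarms by at least two."

```python
import random


def make_practice_sequence(images, copies=16, seed=None):
    """Shuffle `copies` of each image so the order carries no learnable pattern."""
    rng = random.Random(seed)
    sequence = [img for img in images for _ in range(copies)]
    rng.shuffle(sequence)  # equivalent to drawing without replacement from the pool
    return sequence


def trials_until_criterion(outcomes, margin=2):
    """Return the 1-based trial on which hits first exceed false alarms by `margin`.

    `outcomes` holds one label per trial: "hit", "false_alarm", or None
    (no scored touch). Returns None if the criterion is never met.
    """
    hits = false_alarms = 0
    for trial, outcome in enumerate(outcomes, start=1):
        if outcome == "hit":
            hits += 1
        elif outcome == "false_alarm":
            false_alarms += 1
        if hits - false_alarms >= margin:
            return trial
    return None


# Hypothetical file labels for the four practice images
parrots = ["parrot_red", "parrot_blue", "parrot_green", "parrot_yellow"]
practice = make_practice_sequence(parrots, seed=1)
print(len(practice))  # 64 trials: 16 copies of each of 4 images
print(trials_until_criterion(["hit", None, "false_alarm", "hit", "hit"]))  # 5
```

Because every image appears equally often in a shuffled order, each image is about equally likely to follow any other, which is the property the practice phase relies on.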
Familiarization phase.
A sequence of images, about five minutes in duration, appeared according to an underlying statistical pattern. Twelve images are grouped into four ordered triplets: ABC, DEF, GHI, and JKL. Because the within-triplet order is fixed, the transitional probabilities from the first to second (A->B) and second to third images (B->C) are 100%. The triplets may occur in any order without repeating (ABC->DEF, GHI, or JKL), so the transitional probability between a third and first image (C->D) is 33%.
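To make the triplet structure concrete, the following sketch generates a stream under the two constraints stated above (fixed within-triplet order, no immediate triplet repetition) and then recovers the transitional probabilities empirically. The number of triplet presentations (96) and the function names are assumptions for illustration; the source specifies only the approximate duration of the stream.

```python
import random
from collections import Counter, defaultdict

TRIPLETS = ["ABC", "DEF", "GHI", "JKL"]


def make_stream(n_triplets, seed=None):
    """Concatenate triplets so that no triplet immediately follows itself."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_triplets):
        # Any triplet may come next except the one just shown
        current = rng.choice([t for t in TRIPLETS if t != previous])
        stream.extend(current)
        previous = current
    return stream


def transitional_probabilities(stream):
    """Estimate P(next image | current image) from adjacent pairs in the stream."""
    pair_counts = defaultdict(Counter)
    for first, second in zip(stream, stream[1:]):
        pair_counts[first][second] += 1
    return {
        first: {second: n / sum(nexts.values()) for second, n in nexts.items()}
        for first, nexts in pair_counts.items()
    }


stream = make_stream(n_triplets=96, seed=0)  # 96 is an assumed length
tp = transitional_probabilities(stream)
print(tp["A"])  # within a triplet, A is always followed by B
print(tp["C"])  # across a boundary, C is followed by D, G, or J (about 1/3 each)
```

The estimated probabilities after a third image only approximate one third in any finite stream, since the triplet order is sampled at random.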
In Experiment 1, we followed Arciuli and Simpson’s (2012) 1-back cover task to keep the children fixated on the stimuli. Each of the twelve images in the task appeared as a duplicate on two occasions at randomized points in the stream, thus providing no cue to the stream’s statistical structure. The children’s instructions were to touch “d’animaux amusant” (the amusing animal) if it re-appeared as a duplicate. In Experiments 2 and 3, children were assigned one of the four stimulus images at the beginning of the task and instructed to touch the amusing animal each time they saw it. In Experiment 2, these target images always appeared in the third position of a triplet (C, F, I, or L), such that the appearance of the target was 100% predictable based on the two preceding images. In Experiment 3, about 60% of children were assigned targets in the third position, while 40% were assigned targets in the less predictable (33%) first position of a triplet. Across all experiments, children were reminded to respond quickly because the animals were very fast. No further feedback was given in this phase.
Further, in Experiment 3, we elaborated the task instructions in the familiarization to more closely match Qi et al.’s (2019) procedures, involving multiple presentations and interactions with the target image prior to the main task. The French and English storyboards for this task are available on Open Science Framework.
Initial piloting among adult research team members suggested that a duration of 600 ms (with a 200 ms interstimulus interval, ISI) was necessary to successfully respond to most of the targets. However, after Experiment 1, we found that this duration was too short for the touchscreen interface, and we extended the stimulus duration in Experiments 2 and 3 to 800 ms (200 ms ISI).
Test phase.
In all three experiments, children were informed that the animals had been appearing in groups with their friends during the familiarization phase. They were instructed to look at two groups of animals, which would appear sequentially, three on the left followed by another three on the right, and select which group they had seen previously arriving together. Presentation speed was matched to the familiarization phase, with an additional 1000 ms ISI between groups. After presenting both groups, touch-response squares (in Experiment 1) or “touch” icons (a hand with the index finger extended; Experiments 2 and 3, see Figures 1 and 2) appeared on the left and right sides of the screen where the animals had appeared, and children were asked to touch the side with the correct group.
Figure 2.

Group-level measures of statistical learning. (A) Distribution of participants’ response accuracies on the two-alternative forced choice tests when considering only the first 8, 16, or 32 trials (as applicable) by experiment. (B) Group-average response times and slopes in each of the three Experiments.
In each trial, one group was composed of a real triplet (e.g., ABC or GHI), while the other was composed of an impossible triplet that preserved the serial positions of the images but violated the transitional probabilities (e.g., AEI or DHL). No feedback was given. Correct responses were counterbalanced between left- and right-hand positions, and trial order was uniquely randomized for each participant. On the recommendation from the local research team, based on their experience with the children’s engagement in repetitive testing, we limited the Experiment 1 test phase to eight 2-AFC trials, requiring 2–3 minutes to complete. In the subsequent experiments, we increased this limit to 16 trials (2–4 minutes) and 32 trials (5–7 minutes), respectively, in an attempt to improve internal reliability of the SL estimates.
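One way to build foils with these properties is to take the first, second, and third images from three different triplets. The sketch below uses a simple rotation to do so and then assembles counterbalanced 2-AFC trials. The rotation scheme, the cycling through triplet-foil pairings, and all function names are assumptions for illustration; the source gives only AEI and DHL as example foils and does not describe how pairings were selected.

```python
import itertools
import random

TRIPLETS = ["ABC", "DEF", "GHI", "JKL"]


def make_foils(triplets):
    """Build foils that keep serial position but mix three different triplets."""
    n = len(triplets)
    return [
        triplets[i][0] + triplets[(i + 1) % n][1] + triplets[(i + 2) % n][2]
        for i in range(n)
    ]


def make_test_trials(n_trials, seed=None):
    """Pair real triplets with foils; the correct side is left on half the trials.

    Assumes an even `n_trials` (8, 16, or 32 in the three experiments).
    """
    rng = random.Random(seed)
    pairings = list(itertools.product(TRIPLETS, make_foils(TRIPLETS)))
    rng.shuffle(pairings)  # trial order differs for each participant seed
    sides = ["left", "right"] * (n_trials // 2)
    rng.shuffle(sides)
    trials = []
    for (triplet, foil), side in zip(itertools.cycle(pairings), sides):
        left, right = (triplet, foil) if side == "left" else (foil, triplet)
        trials.append({"left": left, "right": right, "correct": side})
    return trials


print(make_foils(TRIPLETS))  # ['AEI', 'DHL', 'GKC', 'JBF']
for trial in make_test_trials(8, seed=42):
    print(trial)
```

Every adjacent pair inside a foil (for example, A followed by E) never occurs in the familiarization stream, so each foil has within-item transitional probabilities of zero while each image still occupies its familiar serial position.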
Language and literacy assessments.
Native speakers of Attié and French tested children’s skills in each language using analogous instruments previously administered with children in rural Côte d’Ivoire (Sobers et al., 2023; Jasińska et al., 2022a, 2022b; Ball et al., 2022) with high internal reliability (Zinszer et al., 2023). Phonological awareness assessments were adapted from the Early Grade Reading Assessment (EGRA: Gove & Wetterberg, 2011; RTI International, 2009), Yopp-Singer segmentation task (Yopp, 1995), and Bruce’s (1964) phoneme deletion task using common words in each language. Vocabulary assessments used Attié and French translations of the synonym and antonym generation tasks in the Woodcock-Johnson-III Test of Cognitive Abilities (Woodcock, 2001). French reading was tested using the EGRA grapheme and word reading tasks. If a child did not provide correct answers to the first ten graphemes or five words, their assessment ended, and they received a score of 0. Reading tasks were timed for 60 seconds. The complete French test materials are available in the Open Science Framework. The Attié adaptations of these tests are available in the Ivorian Children’s Language Assessment Toolkit (Akpé et al., 2021; Jasińska et al., 2022a). In the absence of population norms, all tests use raw accuracy scores.
Household questionnaires.
We administered a household inventory questionnaire, part of the Early Grade Reading Assessment (EGRA; Gove & Wetterberg, 2011; RTI International, 2015), which has been used in West Africa (including the Francophone countries of Mali and Senegal). Specific items in the inventory provided information about the availability of print and television media at home. We also asked children whether anyone in their homes spoke French.
Planned analyses
Estimates of statistical learning.
Analyses were performed in RStudio (RStudio Team, 2020; R Core Team, 2018). In the test phase, responses to the 2-AFC trials were averaged for each child and tested against chance performance (50%) at the group level. In the target detection task (used in Experiments 2 and 3), response times (RT) to the target are also reflective of the children’s statistical learning as the target becomes predictable over time based on the transitional probabilities (Qi et al., 2019). We estimated the fixed effect of RT over target trial to represent the group-level change in RT from one appearance of the target to the next, while also estimating a fixed effect for the experiment (since the images in the familiarization had changed) and random RT slopes and intercepts for each child in a linear mixed effects regression model (lmerTest package; Kuznetsova, Brockhoff, & Christensen, 2017) with bobyqa optimization.
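The group-level comparison against chance can be illustrated as follows. The source does not name the specific test, so a one-sample t-test is assumed here, and the response data are invented for illustration. The mixed-effects model for response times is not reproduced, since it requires a dedicated modeling package such as the lmerTest package cited above.

```python
import math
import statistics


def accuracy(responses):
    """Proportion correct for one child; responses are coded 1 (correct) or 0."""
    return sum(responses) / len(responses)


def one_sample_t(values, mu):
    """One-sample t statistic and degrees of freedom for H0: mean == mu."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu) / standard_error, n - 1


# Hypothetical 2-AFC responses for four children (eight trials each)
children = {
    "child_01": [1, 0, 1, 1, 0, 1, 0, 1],
    "child_02": [0, 0, 1, 0, 1, 1, 0, 0],
    "child_03": [1, 1, 0, 1, 0, 0, 1, 1],
    "child_04": [0, 1, 0, 0, 1, 0, 1, 1],
}

accuracies = [accuracy(r) for r in children.values()]
t, df = one_sample_t(accuracies, mu=0.5)
print(f"mean accuracy = {statistics.mean(accuracies):.3f}, t({df}) = {t:.2f}")
```

Averaging within each child first means the child, not the trial, is the unit of analysis, which matches the description of a group-level test.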
We estimated internal (split-halves) reliability for the 2-AFC response accuracy and RT slope estimates to determine whether they were suitable for detecting individual differences in SL. We assessed construct validity of the RT slope by examining whether other potential confounds predicted this SL measure: performance criteria from the practice and familiarization phases, children’s responses to the household questionnaire, French exposure at home (the language in which instructions were provided), and the availability of visual media at home.
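A split-halves estimate can be sketched as follows. The text does not specify how trials were divided or whether a correction was applied, so this sketch assumes an odd/even split and offers the Spearman-Brown correction as an option; the function names and the example responses are hypothetical.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5  # undefined if either half has zero variance

def split_half_reliability(trials_by_child, spearman_brown=True):
    """Correlate each child's mean score on odd vs. even trials (assumed split)."""
    odd = [mean(t[0::2]) for t in trials_by_child]
    even = [mean(t[1::2]) for t in trials_by_child]
    r = pearson_r(odd, even)
    # Spearman-Brown steps the half-test correlation up to full test length
    return 2 * r / (1 + r) if spearman_brown else r

# Hypothetical 0/1 accuracy on eight 2-AFC trials for four children
responses = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(split_half_reliability(responses))
```

The same function applies to any per-child measure that can be computed separately on each half, such as RT slopes estimated from odd and even target trials.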
Analysis of individual differences in SL and language.
Previous studies performed partial correlations controlling for grade between SL measures and literacy scores. We tested the relationship of each measure of interest (SL, language, and literacy variables) with age, which varied widely within each classroom, and the other potential confounding variables described above. We then performed partial correlations adjusting for the confounding variables that related to one or more of the individual difference variables of interest. We report the partial correlations as well as Bayes Factors (BF; Morey, Rouder, & Jamil, 2015) for and against the null hypothesis range [−0.20, 0.20] because most related SL studies reported correlations with magnitude greater than 0.20. Bayes factors are not significance tests and do not need to be corrected for multiple comparisons.
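One common way to compute a partial correlation is to regress both variables of interest on the covariates and correlate the residuals. The Python sketch below illustrates that approach under stated assumptions: it is not the authors' R code, all function names and data are hypothetical, and it does not compute Bayes factors.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def residuals(y, covariates):
    """Residuals of y after OLS regression on an intercept plus the covariates."""
    X = [[1.0] + list(row) for row in zip(*covariates)]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(x, y, covariates):
    """Correlation of x and y after removing linear effects of the covariates."""
    return pearson_r(residuals(x, covariates), residuals(y, covariates))

# Hypothetical values: an SL measure, a literacy score, and one covariate (age)
sl = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
literacy = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
age = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
print(partial_corr(sl, literacy, [age]))
```

Each covariate is passed as a separate column, so adjusting for several confounds only requires adding more lists to `covariates`.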
Results
Accessibility of task demands
Practice phase.
Only 13 children (18%) in Experiment 1 met criterion performance on the practice. These children required an average of 7.2 presentations of the target (SD=4.9) to complete the practice. They had a hit rate of 23% and a false alarm rate of 12%. Mean response time across all children was 494 ms (SD=92 ms) for correct hits.
Changing the children’s goal from 1-back to target-detection in Experiment 2 raised the practice hit rates (33%) and lowered false alarm rates (6%) among the 27 children (68%) who met criterion performance, requiring an average of 8.4 target presentations. The mean response time was 540 ms (SD=38 ms). This improvement was sustained in Experiment 3, where all but one child (98%) achieved criterion performance in the practice, requiring an average of 5 target presentations (significantly fewer than Experiment 2, Welch’s two-sample t(74)=6.26, p<0.001). The Experiment 3 hit rate was 38% and false alarm rate was 11%. Mean RT on hits was 517 ms, which was slightly, but not significantly, faster than Experiment 2 (t(80)=1.03, p=0.305).
Familiarization phase.
Sixteen children (23%) in Experiment 1 failed to respond to the target images at all during the response window. Average hit rate among children who provided at least one correct response was 20%, and false alarm rate was 11%. Only one child in Experiment 2 did not record any responses in the familiarization phase. Hit rates were much higher (85%) with the simpler task, longer stimulus durations, and more colorful stimuli. A 100% response rate and high hit rates in Experiment 3 (89%) indicated that children were still able to complete the target detection task successfully.
Exposure to visual media and French at home.
Household surveys indicated that most children in the sample had exposure to some kind of two-dimensional visual media (see Figure S1 of the supplement). Among the children in Experiment 1, 23% reported having a children’s book in their home and 46% reported having a television (57% with either media source). Households in Experiment 2 reported even higher rates of media exposure: 92% overall (children’s book: 44%, television: 75%), and in Experiment 3, 69% overall (children’s book: 33%, television: 64%). Further, classrooms at the schools were decorated with posters and children’s picture books were generally available. We tested whether exposure to visual media at home or the presence of a French speaker at home related to performance in the practice or familiarization phases. We found no significant differences based on visual media (all p>0.53) or having a French speaker at home (all p>0.16; see Table S1 and Figure S2).
Group-level statistical learning (SL)
Accuracy in 2-AFC test phase.
In Experiment 1, in which participants completed the 1-back task prior to the test phase, the mean group-level accuracy was 54%, which was significantly greater than chance-level of 0.50 (one-tailed t(70)=1.81, p=0.037; Figure 2A) but with only a small effect size (Cohen’s d=0.22). In Experiments 2 and 3, when participants completed the target detection task for targets in the predictable (third) position, the mean 2-AFC accuracy was 50% and 51%, respectively, and did not significantly differ from chance. We investigated whether performance on the first eight trials was better, as observed in the shorter Experiment 1 test phase. In Experiment 2, the mean score did not change (mean=50%, t(39)=0.11, p=0.457), and in Experiment 3, the mean score increased, but still did not significantly exceed chance (mean=55%, t(27)=1.30, p=0.10, Cohen’s d=0.25). Accuracy in Experiment 3 for participants who saw the target in the less predictable (initial) position did not significantly exceed chance, at 52% for all 32 trials (t(17)=1.42, p=0.09) and 49% for the first eight trials (t(17)=−0.29, p=0.61).
Response time slopes in familiarization phase.
In Experiments 2 and 3, the target detection task served as a secondary indicator of statistical learning. We contrasted the children’s response time slopes (RT slope) in Experiments 2 and 3 against Experiment 1, where the 1-back cover task provided no information about target appearance, and against a group of participants (n=16) in Experiment 3 who were assigned targets in a less predictable first position of a triplet using a linear mixed effects model: RT ~ TargetTrial * TargetPosition * Experiment + (1+TargetTrial | subj)
The conditional R² for the mixed effects model was 0.49, and marginal R² was 0.13. We used the Experiment 1 data as the reference level. There was no fixed effect of target trial in this condition (see Table 4; b=1.16, t(334)=1.08, p=0.281). The fixed effect of target trial when targets appeared in the predictable final position was significantly more negative (b=−4.99, t(182)=−3.67, p<0.001) with an estimated 115 ms decrease in RT over the 24 targets. When targets appeared in the less predictable initial position, the fixed effect of target trial did not differ from the Experiment 1 reference (Target Trial x Initial; b=0.05, t(162)=0.03, p=0.975). This finding indicates that, in the target detection task, only the targets that were strongly predicted via transitional probabilities elicited significant decreases in response time. Group-level slopes are illustrated in Figure 2B, with RT at each target trial averaged across participants and random slopes by subject that are retained and addressed in the next section.
Table 4.
Linear mixed effects model for response time across experiments.
| Predictor | Estimate | S.E. | df | t | p |
|---|---|---|---|---|---|
| Intercept | 454.11 | 16.39 | 243.47 | 27.70 | <0.001 |
| Target Trial (for 1-back, Exp. 1) | 1.16 | 1.08 | 334.29 | 1.08 | 0.281 |
| Target Position (Final; Exps. 2 & 3) | 109.24 | 22.41 | 136.58 | 4.87 | <0.001 |
| Target Position (Initial; Exp 3) | 145.31 | 25.32 | 124.31 | 5.74 | <0.001 |
| Experiment 2 (vs. Exps. 1 & 3) | 34.02 | 20.25 | 83.06 | 1.68 | 0.097 |
| Target Trial * Target Position (Final) | −4.99 | 1.36 | 182.30 | −3.67 | <0.001 |
| Target Trial * Target Position (Initial) | 0.05 | 1.51 | 162.17 | 0.03 | 0.975 |
| Target Trial * Experiment 2 (vs. Exps. 1 & 3) | 1.29 | 1.10 | 88.39 | 1.18 | 0.243 |
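The slope coefficients in Table 4 can be turned into an approximate RT change across the familiarization phase with simple arithmetic. The sketch below is only one reading of the reported estimate: it assumes that the change is counted over the 23 steps between the first and the 24th target, and the variable and function names are hypothetical. Under that assumption, the final-position interaction term alone gives a decrease of roughly 115 ms relative to the reference condition.

```python
# Coefficients copied from Table 4 (ms per target trial)
slope_reference = 1.16      # Target Trial, 1-back reference (Exp. 1)
slope_shift_final = -4.99   # Target Trial * Target Position (Final)

n_targets = 24
steps = n_targets - 1       # ASSUMPTION: change counted from first to last target

def rt_change(slope_per_trial, n_steps=steps):
    """Cumulative RT change (ms) implied by a linear per-trial slope."""
    return slope_per_trial * n_steps

# Change attributable to the final-position term, relative to the reference slope
print(round(rt_change(slope_shift_final), 1))                    # -114.8

# Sum of the reference slope and the final-position term only
print(round(rt_change(slope_reference + slope_shift_final), 1))  # -88.1
```

A condition-specific slope for a particular experiment would also add any other Target Trial terms that apply to it, such as the Target Trial by Experiment 2 interaction.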
Estimating individual differences in SL tasks
Internal reliability estimates.
We tested the internal reliability of the statistical learning measures described in the previous section. In Experiment 1, split-halves reliability of the 2-AFC test trials was 0.04. In Experiment 2, split-halves reliability for the full test and the first eight trials was −0.30 and 0.07, respectively. In Experiment 3, split-halves reliability was −0.30 for all 32 trials and −0.11 for the first eight trials. The split-halves estimates of RT slope (the random effect of Target Trial by Subject) were reliable in Experiment 2 (0.78) and Experiment 3 (0.74) when the target appeared in the final position.
Bias and temporal dependency of 2-AFC responses.
We observed that several children responded to the 2-AFC trials by selecting all responses on the same side (all left or all right), alternating sides (left-right-left-right), or following other simple response patterns. These responses were strongly autocorrelated (see Figure S3), so we first modeled each child’s trial-level responses (left or right) on the side of the correct answer and the child’s overall left/right response bias, and then auto-regressed the residuals of the first model on the preceding three trials. Figure 3 illustrates each child’s overall bias for (or against) correct answers, their left/right response bias, and coefficients from the AIC-guided autoregression (AR) models. These estimates identified 59% of children in Experiment 1 who gave exclusively left- or right-sided responses (bias of −1 or +1) or temporally dependent responses (non-zero AR terms). The rates were 50% in Experiment 2 and 46% in Experiment 3. Consistent with the 2-AFC accuracy results, when controlled for left/right bias, the biases for correct answers were distributed around zero (chance performance) and narrowed as the number of test trials increased across experiments.
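Simple response patterns of this kind can be detected with two descriptive statistics: a left/right bias and a lagged autocorrelation. The Python sketch below is a simplification of the analysis described above, which regressed responses on the correct side and the child's bias and then fit AIC-guided autoregressions with up to three lags on the residuals. The function names and the example sequences are hypothetical.

```python
from statistics import mean

def side_bias(responses):
    """Mean of responses coded -1 (left) and +1 (right); -1 or +1 means one side only."""
    return mean(responses)

def lag_autocorr(responses, lag=1):
    """Sample autocorrelation of the response sequence at the given lag."""
    m = mean(responses)
    dev = [r - m for r in responses]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant sequence: no variation left to autocorrelate
    return sum(dev[i] * dev[i - lag] for i in range(lag, len(dev))) / denom

alternating = [-1, 1] * 8   # left-right-left-right across 16 trials
one_sided = [1] * 16        # always the right side

print(side_bias(alternating), lag_autocorr(alternating))  # 0 -0.9375
print(side_bias(one_sided), lag_autocorr(one_sided))      # 1 0.0
```

A strongly negative lag-1 value flags alternation, while a bias at either extreme flags one-sided responding; neither pattern depends on which answer was correct.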
Figure 3.

Histograms of individual subject-level left-vs.-right response bias and temporal autoregression estimates for residuals of all 2-AFC trials. Correct answer biases of +1 or −1 indicate 100% or 0% response accuracy, respectively. A bias of 0 is chance-level performance. Left/right biases are coded −1 for left side responses and +1 for right side responses. Autoregression models included only terms that improved the model fit, and thus all non-zero values indicate temporal dependence in responses.
Validity of the RT slope estimates.
Only the RT slope estimates provided adequate group-level evidence of statistical learning and internal reliability for use as an individual differences measure (Figure 4B). We regressed the subject-level RT slopes over several likely confounding variables that would influence task performance: the child’s age, whether French is spoken by anyone in their home, the availability of visual media (children’s books or television) in the child’s home, the child’s proficiency interacting with the tablet (average response time in the practice phase, intercept for response times in the familiarization phase), the child’s proficiency in the task (number of trials needed to complete the practice phase), the child’s attentiveness to the task (average hit rate for targets in the familiarization phase), and the experiment in which each child participated (which affected the instructions and images the child saw). See Figure S4 for pairwise correlations.
Figure 4.

Upper panels depict histograms for the six language and literacy tests (A). Middle panels depict histograms for RT slope (B), oral language skills component (C), and French print skills component (D). Lower panels depict partial correlations between RT slope and oral language skills (E), RT slope and French print skills (F), and RT slope and French print skills using Spearman rank correlation (G).
This model predicted RT slope using the confounding variables and was estimated across 51 children from Experiments 2 and 3 for whom the target appeared in the final position of a triplet (Table S2). Because many predictors were collinear, we performed an automated AIC-based stepwise search (both directions) to identify a reduced model (F(2,48)=6.136, p=0.004, multiple R²=0.204), which included significant effects of standardized RT intercept in the familiarization phase (p=0.004) and the availability of visual media at home (p=0.033; Table S3). Along with the fixed effect of experiment number, these variables were carried forward to further analyses as potential confounds for individual differences in VSL.
Relationships between SL and language skills
Language and literacy variables.
Between 125 and 134 children completed each of the language and literacy tests (Figure 4A). Mean scores were higher in Attié than French in vocabulary (Attié: 7.7 out of 20 items, French: 6.4/20, paired t(113)=3.11, p=0.002) and phonological awareness (Attié: 21.7/40, French: 18.3/40, paired t(111)=4.63, p<0.001). Average French letter reading (27.3/100) and word reading (11.5/50) scores were significantly lower than a previous cohort of 298 fifth graders we sampled in ten schools across southern and central Côte d’Ivoire (letters: difference=−10.4, Welch’s t(274.8)=−4.40, p<0.001; words: difference=−6.1, Welch’s t(275.2)=−3.88, p<0.001; Jasińska et al., 2022). Descriptive statistics for the language and literacy data appear in Table S4 of the supplement.
Across 102 complete cases, age was negatively correlated with French word reading (r=−0.255, FDR-corrected p=0.014) and French letter reading (r=−0.284, FDR-corrected p=0.006), and the six literacy variables were highly correlated with one another (see Figure S5). We used principal components analysis across language and literacy data from all three experiments (n=102) to estimate six orthogonal components for phonological awareness in L1 and L2, vocabulary in L1 and L2, French letter reading, and French word reading. The first two components contributed 61% and 17% of the variance, with all remaining components contributing less than 10% each (see Table S5 in Supplement). We rotated the first two components (accounting for 78% of variance in the data) using varimax, so the two new rotated components each explained 39% of the variance from the first two principal components while remaining orthogonal. The first rotated component loaded mainly on French letter and word reading (>0.90 each) and moderately on French phonological awareness (0.63), while the second rotated component loaded mainly on Attié vocabulary (0.90) and phonological awareness (0.79) and moderately on the same variables in French (vocabulary: 0.67, phonological awareness: 0.60, see Table S6 in Supplement). For comparison with the emergent literacy literature, we interpret these two components as (1) French print skills and (2) oral language skills. See Figure 4C and 4D for distributions.
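The effect of rotating two retained components can be illustrated with a small Python sketch. An orthogonal rotation redistributes variance between the two components while leaving their combined variance unchanged. The loadings below are hypothetical, not the values in Tables S5 and S6, and the rotation angle is found by a simple grid search over the raw varimax criterion (without Kaiser normalization), which is an assumption made here for brevity rather than the procedure used in the reported analysis.

```python
import math

def rotate(loadings, theta):
    """Apply an orthogonal 2-D rotation by angle theta to a p x 2 loading matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c - b * s, a * s + b * c] for a, b in loadings]

def column_variance(loadings):
    """Proportion of total variance per component (standardized variables)."""
    p = len(loadings)
    return [sum(row[j] ** 2 for row in loadings) / p for j in range(2)]

def varimax_criterion(loadings):
    """Raw varimax criterion: spread of squared loadings within each column."""
    total = 0.0
    for j in range(2):
        sq = [row[j] ** 2 for row in loadings]
        m = sum(sq) / len(sq)
        total += sum((v - m) ** 2 for v in sq)
    return total

# HYPOTHETICAL unrotated loadings for six measures on two components
unrotated = [
    [0.80, 0.45], [0.82, 0.40],    # e.g., two print measures
    [0.75, -0.10], [0.78, -0.05],  # e.g., two L2 oral measures
    [0.70, -0.55], [0.72, -0.58],  # e.g., two L1 oral measures
]

# Grid search over [0, pi/2); the criterion repeats with period pi/2
angles = (k * math.pi / 720 for k in range(360))
best_theta = max(angles, key=lambda t: varimax_criterion(rotate(unrotated, t)))
rotated = rotate(unrotated, best_theta)

print(column_variance(unrotated), sum(column_variance(unrotated)))
print(column_variance(rotated), sum(column_variance(rotated)))
```

The two printed totals agree, which is why the rotated components can jointly account for the same share of variance as the first two principal components while dividing it more evenly.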
Correlations with SL.
Linear regression over the complete cases from Experiments 2 and 3 revealed that among the candidate confounding variables from the previous section, French print skills were significantly related only to age (b=−0.329, p=0.031; all other p>=0.20; Tables S6 and S7), while the AIC-optimized oral language skills model included standardized response time in the practice phase (b=−0.230, p=0.089) and access to visual media (b=−0.473, p=0.091; Tables S8 and S9). Controlling for the confounds identified in the previous section (familiarization RT intercept, availability of children’s media, and experiment) and practice response time, RT slope was not significantly correlated with oral language skills (n=36 cases, Pearson r=0.129, p=0.454, 95% CI: [−0.177, 0.465], Bayes Factor=0.62, Figure 4E). Controlling for the confounds identified in the previous section and age, RT slope was significantly negatively correlated with French print skills (n=37 cases, r=−0.389, p=0.017, 95% CI: [−0.632, −0.073], Bayes Factor=2.70, Figure 4F). We ruled out extreme values in either measure as responsible for this correlation by also testing Spearman’s rank correlation, which was similar in value and also statistically significant (ρ=−0.356, p=0.031, CI: [−0.610, −0.036], Bayes Factor=1.88, Figure 4G).
Discussion
We conducted three visual statistical learning (VSL) experiments with children in CM-1 (US 5th grade equivalent) in rural communities of Côte d’Ivoire. Our goal was to evaluate VSL as a culturally adaptable paradigm to identify individual differences in childhood cognition supporting later outcomes, like emergent literacy. We aimed to overcome technological barriers by introducing children to the touchscreen interface of a tablet, given the wide adoption of cellphones, despite the tablet itself being a novel interface. We outlined several factors that could limit the generalizability of VSL: cultural relevance, familiarity with computer-based tasks, low reliability of repeated forced-choice testing, and contributions of multiple task demands on a key measure (response time). We found evidence that, even after controlling for confounding variables, children’s response times were related to their emergent literacy skills. The precise relationship between non-linguistic statistical learning mechanisms and children’s language development in this context requires further study.
Group-level evidence of statistical learning and reliability
Children demonstrated learning of statistical dependencies in the familiarization streams at the group level. In Experiments 2 and 3, the children’s response times in a target detection task significantly decreased over time when targets were predictable based on the underlying statistical structure. The average change in response speed was roughly on par with previous work (Qi et al., 2019; Schneider et al., 2020). If confounding factors, such as familiarity with the tablet interface or task, were responsible for this speed-up, they should also be evident when targets were not predictable based on SL. We did not see negative response time slopes in these conditions. These contrasting outcomes indicate that children were learning transitional probabilities and anticipating target images.
We found mixed evidence that the VSL paradigm consistently measured individual differences in statistical learning. When targets appeared in a triplet’s final position, internal reliability of response time slope was above 0.70. While this measure identified individual differences, unlike previous studies, the response time slopes in the familiarization phase could not be meaningfully compared with 2-AFC scores (as in Schneider et al., 2020; Qi et al., 2019) because the latter lacked adequate internal reliability for assessing individual differences. The low internal reliability of 2-AFC in all three experiments is consistent with Siegelman et al.’s (2017) critique, far lower than other recent studies (Torkildsen et al., 2019; Qi et al., 2019), and supports Arnon’s (2020) conclusion that 2-AFC tests rarely capture individual differences. Restricting the analysis to the first 8 trials in each 2-AFC test neither markedly increased test performance nor met internal reliability standards, ruling out later-trial fatigue or boredom as decisive factors in 2-AFC performance.
Predictors and confounds for statistical learning
We found some evidence that the individual differences in RT slope were influenced by children’s proficiency or comfort interacting with the tablet interface. Exposure to 2-dimensional media was a significant predictor of greater statistical learning. These resources were relatively common in most households (92% in Experiment 2, 69% in Experiment 3) and most classrooms, generally ruling out the unfamiliarity of 2D images that Zuilkowski et al. (2016) encountered. However, while we took for granted that all children were equally unfamiliar with these specific Monster or Alien images, the children’s overall familiarity with cartoon images is a plausible contributor to their responses in the 1-back and target detection tasks. The cognitive load for encoding complex visual forms is not universal. In paired association tasks, visual complexity can drive between-subject performance differences (Twum & Parente, 1994; Madan, 2014).
In this study, the array of unusual features that characterize the aliens and the monsters was selected to preserve internal validity by minimizing the contribution of mnemonics (“the bird”, “the yellow one”) to performance, but these features may affect response times under pressure. The significance of the RT intercept as a confounding predictor raises the possibility that processing speed may also pose a confound for individual differences in SL. This relationship is particularly problematic because the RT intercept correlates with both variables of interest (RT slope and language).
Improving the offline (not RT-based) test phase measure of SL could mitigate these baseline response speed issues. Research with adults has shown that other test formats offer improved reliability, but at the cost of complexity (Siegelman, Bogaerts, & Frost, 2017). However, changing the statistical structure itself from temporal (transitional probabilities) to some other dimension (e.g., spatial co-occurrence, as in Fiser & Aslin, 2001) could improve the intuitiveness of the task. In a spatial presentation, the participant’s judgment changes from “which sequence of animals did you see arriving together [over time]?” to “which group of animals did you see arriving together [simultaneously]?” This approach clarifies the comparison by allowing the concurrent presentation of all three members of each group in the test phase, and it removes the working memory demand on each trial to compare two sequences of three items retrospectively.
Individual differences in VSL and emergent literacy skills
We present some preliminary evidence that individual differences in VSL are related to first (L1) and second language (L2) emergent literacy skills. Controlling for confounding variables, French print skills (which captured word and letter reading and, to a lesser extent, French phonological awareness) were correlated with individual differences in VSL. Children who sped up more over the course of the target detection task also exhibited higher scores on the French print skills measure. The correlation (r=−0.389) is in line with the effects historically seen between VSL and reading in children (Arciuli & Simpson, 2012; Tong et al., 2019; Torkildsen et al., 2019) and adults (Frost et al., 2013), although these studies all used 2-AFC test accuracy as their measure of statistical learning.
While we did not find that oral language skills (which primarily captured L1 and L2 vocabulary and phonological awareness) were correlated with VSL, Qi et al. (2019) found that monolingual children’s phonological awareness mediated the relationship between auditory SL and reading. Interestingly, the rotated language components computed in the present study loaded L2 (French) phonological awareness equally on combined L1/L2 oral language skills and on L2 print skills. The relationship we report between SL and L2 print skills may be specific to variance shared between L2 phonological awareness and L2 reading, and not the variance L2 phonological awareness shares with L2 vocabulary or with L1. This decomposition of shared and distinct elements of L1 and L2 has, to our knowledge, not been addressed in the SL literature.
Further, the relationship of both the practice and familiarization RTs to oral language skills may also point to an age effect that has not yet been addressed in studies linking statistical learning and literacy: Across the three experiments, we observed tentative evidence for a negative relationship between age and emergent literacy. Because all children are in the same grade, it is not surprising if the older children, who started school later or repeated grades, are also performing worse on emergent literacy measures. The Declarative-Procedural model predicts that older children rely more on declarative memory to read than their younger classmates (Lum et al., 2010, 2013). The present results could reflect such a trade-off: Stronger performance in target detection (relying on explicit recognition processes in declarative memory) and vocabulary are likely positively associated with age, while implicit procedural processes supporting segmentation or grapheme-to-phoneme conversion are backgrounded in these older children. Larger sample sizes are necessary to disentangle these effects, particularly within and across grade levels, which is an issue we are currently pursuing (Hannon et al., under revision).
Translating the visual statistical learning paradigm across cultures
Our iterative approach to adapting the statistical learning tasks revealed that target detection (Experiments 2 and 3), with average hit rates greater than 80%, was significantly more accessible to the children than the original 1-back task (Experiment 1, hit rate 20%). This change is further contrasted by two things that remained consistent across three experiments: First, average hit rates in the practice phase of all three experiments were below 40%, making it clear that acclimation to the device and procedures was required for the children to achieve the high hit rates seen in the familiarization phase. Second, even as performance in the familiarization phase improved, accuracy and reliability of the 2-AFC items in the test phase did not. Although questions still remain about why the test phase task did not translate well, our updates to the paradigm across experiments allow us to rule out the number of test items, the children’s ability to interact with the touch interface, and their engagement in the familiarization phase as driving the 2-AFC performance.
The bilingual context of the experiment also departs from previous SL studies in children’s language or literacy. The children’s weak performance in the more complicated 1-back task and their highly patterned responses in the 2-AFC task (which must have seemed entirely opaque to them) suggest that communicating the instructions in French could have been an obstacle. Nonetheless, French exposure at home did not appear to confer any obvious advantage—it was not a predictor of VSL. Further, the strong convergence of L1 and L2 in the oral language skills component fits the portability of phonological awareness between languages (Bialystok, 2002; Jasińska et al., 2019), and the oral language skills component based on these measures (bilingual vocabulary and phonological awareness) was also not related to statistical learning, but was related to standardized response time in the unstructured practice phase.
On one hand, these experiments describe a relatively circumscribed improvement to cultural adaptation of statistical learning paradigms: adapted for a group of children in a single region, still sensitive to individual differences in access to visual media and technology, and related to some (but not all) of the expected language and literacy outcomes. On the other hand, the need to investigate how common experimental paradigms evoke or assess different abilities across cultural contexts is evident. This non-linguistic statistical learning experiment was designed to be as independent as possible from specific language, culture, and domain knowledge, and adapted to accommodate differences therein. We found significant group-level evidence of statistical learning, but we also found that task demands likely interfered with the statistical learning mechanism that we intended to measure at the individual level.
These findings do not decisively answer the questions of how best to adapt visual SL experiments for cross-cultural portability or whether the links between SL and emergent literacy extend to children learning to read in a second language in low-income or rural regions. However, the effects of design changes discussed in this series of pilot experiments indicate that improving accessibility, internal reliability, and measurement validity of SL are necessary and attainable goals for cross-cultural experimental research (Burger et al., 2022). Future paradigms will need to better acclimate children to the task demands of interacting with the data collection instruments, reliably estimate non-SL-related individual differences in baseline response speed, and grapple with the independence and reliability of 2-AFC trials for measuring SL. Only rigorously adapted measures can enable our theories of both statistical learning and language development to become more inclusive across the world’s language, cultural, and educational contexts.
Supplementary Material
Research Highlights.
We iteratively adapted three visual statistical learning studies for children in rural Côte d’Ivoire.
Group-level analyses indicate that the children learn the underlying statistical regularities.
Individual-differences analyses reveal some evidence that the statistical learning measure is also correlated with task demands that may be driven by cross-cultural differences.
Like previous research, statistical learning is correlated with second language literacy, but we did not find a relationship with oral language skills in first and second languages.
Acknowledgements:
We are grateful to the Ivorian research assistants who helped with data collection: Armand Kouakou, Durcas Latto, Anthelme Yapo, Noeline Kra, and Anicet N’Goran; to the US research assistants who helped to create test materials: Krista Weber, Victoria Bobowska, Jiamian Wang, Elise Shealy, and Erin Curran; and to our Python programmer Betty Zhang. We would also like to thank the children, families, and teachers who took part in this study. This work was funded by a Jacobs Foundation Early Career Award 2015118455 (Jasińska, PI), Jacobs Foundation Transformation Education in Cocoa Communities (TRECC) Program training grant 2015-1184 (Jasińska, PI) and research grants that supported the larger literacy intervention program (Jasińska, co-PI). Stimuli construction for Experiment 3 is part of the project sponsored by NARSAD Young Investigator Award #24836 (Qi, PI). Qi’s time is supported by NIDCD R21DC017576 (Qi, PI). Work on this paper was also supported by the Eugene M. Lang Summer Research Fellowship (Wang), a Registered Reports by Early Career Researchers Grant from the journal Language Learning (Zinszer, PI), and the Swarthmore College Department of Psychology.
Footnotes
Conflict of interest statement: The authors declare no conflict of interest.
CRediT author statement: Benjamin D. Zinszer: conceptualization; data curation; formal analysis; investigation (supporting); methodology; software; supervision (supporting); validation; visualization; writing – original draft preparation; writing – review and editing. Joelle Hannon: data curation; investigation (supporting); supervision (supporting); writing – original draft preparation (supporting); writing – review and editing. Anqi Hu: software (supporting); validation; writing – original draft preparation (supporting); writing – review and editing. Aya Élise Kouadio: data curation; investigation. Hermann Akpé: conceptualization (supporting); data curation; investigation; methodology (supporting); project administration; resources; supervision; visualization (supporting); writing – review and editing (supporting). Fabrice Tanoh: conceptualization (supporting); data curation; investigation; methodology (supporting); project administration; resources; supervision; writing – review and editing (supporting). Madeleine Wang: formal analysis (supporting); validation (supporting); visualization; writing – review and editing; Zhenghan Qi: conceptualization; resources; supervision (supporting); writing – original draft preparation (supporting); writing – review and editing. Kaja Jasińska: conceptualization; data curation; funding acquisition; investigation (supporting); project administration; resources; supervision; writing – original draft preparation (supporting); writing – review and editing.
Data availability Statement:
All data reported in this paper are available on the Open Science Framework: https://osf.io/nza7m/
References
- 123RF. (2019, February). Vector - Cute monster color character funny design element. Retrieved from: https://www.123rf.com/photo_63465211_stock-vector-cute-monster-color-character-funny-design-element-humour-emoticon-fantasy-monsters-unique-expression.html
- Akpé H, Seri A, Tanoh F, Yoffo R, & Jasińska K (2021). De l’introduction d’un kit d’évaluation linguistique à l’evaluation des competences orales chez les apprenants du primaire en langue Ivoirienne pour les langues Attié, Abidji, Baoulé, et Bété. La Revue Universitaire des Sciences de l’Education ASSEMPE, 17. Available at https://www.revues-ufhb-ci.org/?parcours=revues&desc=5&arti=3223.
- Arciuli J (2018). Reading as statistical learning. Language, Speech, and Hearing Services in Schools, 49(3S), 634–643.
- Arciuli J, & Simpson IC (2011). Statistical learning in typically developing children: the role of age and speed of stimulus presentation. Developmental Science, 14(3), 464–473. doi: 10.1111/j.1467-7687.2009.00937.x
- Arciuli J, & Simpson IC (2012). Statistical learning is related to reading ability in children and adults. Cognitive Science, 36(2), 286–304.
- Arnon I (2020). Do current statistical learning tasks capture stable individual differences in children? An investigation of task reliability across modality. Behavior Research Methods, 52(1), 68–81.
- Ayewa PNK (2018). Les reformes pédagogiques Ivoiriennes au fil des années: Le piège n’a pas été évité. Quelle solution aujourd’hui? Revue du Laboratoire des Théories et Modèles Linguistiques, 14. http://ltml.univ-fhb.edu.ci/wp-content/uploads/files/article14/Noe_Kouassi_AYEWA.pdf
- Ball MC, Curran E, Tanoh F, Akpé H, Seri A, Nematova S & Jasińska K (2022). Learning to read in environments with high risk of illiteracy: the role of bilingualism and bilingual education. Journal of Educational Psychology. doi: 10.1037/edu0000723
- Bialystok E (2002). Acquisition of literacy in bilingual children: A framework for research. Language Learning, 52(1), 159–199.
- Bogaerts L, Szmalec A, De Maeyer M, Page MP, & Duyck W (2016). The involvement of long-term serial-order memory in reading development: A longitudinal study. Journal of Experimental Child Psychology, 145, 139–156.
- Brou-Diallo C (2011). Le projet école intégrée (PEI), un embryon de l’enseignement du français langue seconde (FLS) en Côte d’Ivoire. Revue electronique internationale de sciences du langage. http://www.sudlangues.sn/spip.php?article173
- Bruce DJ (1964). The analysis of word sounds by young children. British Journal of Educational Psychology, 34(2), 158–170. doi: 10.1111/j.2044-8279.1964.tb00620.x
- Burger O, Chen L, Erut A, Fong FTK, Rawlings B, & Legare CH (2022). Developing cross-cultural data infrastructures (CCDIs) for research in cognitive and behavioral sciences. Review of Philosophy and Psychology. doi: 10.1007/s13164-022-00635-z
- Burks SV, Carpenter JP, Goette L, & Rustichini A (2009). Cognitive skills affect economic preferences, strategic behavior, and job attachment. Proceedings of the National Academy of Sciences, 106(19), 7745–7750.
- Chen A, Panter-Brick C, Hadfield K, Dajani R, Hamoudi A, & Sheridan M (2019). Minds Under Siege: Cognitive Signatures of Poverty and Trauma in Refugee and Non-Refugee Adolescents. Child Development, 90(6), 1856–1865.
- Fiser J, & Aslin RN (2001). Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological Science, 12(6), 499–504.
- Ford CB, Kim HY, Brown L, Aber JL, & Sheridan MA (2019). A cognitive assessment tool designed for data collection in the field in low-and middle-income countries. Research in Comparative and International Education, 14(1), 141–157.
- Frost R, Armstrong BC, & Christiansen MH (2019). Statistical learning research: A critical review and possible new directions. Psychological Bulletin, 145(12), 1128.
- Frost R, Siegelman N, Narkiss A, & Afek L (2013). What predicts successful literacy acquisition in a second language?. Psychological Science, 24(7), 1243–1252.
- Goschke T, Friederici AD, Kotz SA, & Van Kampen A (2001). Procedural learning in Broca’s aphasia: Dissociation between the implicit acquisition of spatio-motor and phoneme sequences. Journal of Cognitive Neuroscience, 13(3), 370–388.
- Gove A, & Wetterberg A (2011). The Early Grade Reading Assessment: Applications and Interventions to Improve Basic Literacy: ERIC.
- Hammer CS, Davison MD, Lawrence FR, & Miccio AW (2009). The effect of maternal language on bilingual children’s vocabulary and emergent literacy development during Head Start and kindergarten. Scientific Studies of Reading, 13(2), 99–121.
- Hannon J, Zinszer BD, Seri AB, Kouadio E, Tanoh F, Earle S, & Jasińska K (under review). Contributions of procedural memory to emergent reading in older children: Insights from Côte d’Ivoire.
- Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG (2009). Research electronic data capture (REDCap) - A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics, 42(2), 377–81.
- Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, McLeod L, Delacqua G, Delacqua F, Kirby J, Duda SN (2019). The REDCap consortium: Building an international community of software partners. Journal of Biomedical Informatics. doi: 10.1016/j.jbi.2019.103208
- Henrich J, Heine SJ, & Norenzayan A (2010). The weirdest people in the world?. Behavioral and Brain Sciences, 33(2–3), 61–83.
- Holleman GA, Hooge IT, Kemner C, & Hessels RS (2020). The ‘real-world approach’ and its problems: A critique of the term ecological validity. Frontiers in Psychology, 11, 721.
- Hruschka DJ, Munira S, Jesmin K, Hackman J, & Tiokhin L (2018). Learning from failures of protocol in cross-cultural research. Proceedings of the National Academy of Sciences, 115(45), 11428–11434. doi: 10.1073/pnas.1721166115
- Jasińska K, Akpé H, Seri AB, Zinszer B, Yoffo R, Mulford K, Curran E, Ball MC, & Tanoh F (2022a). Evaluating Bilingual Children’s Native Language Abilities in Côte d’Ivoire: Introducing the Ivorian Children’s Language Assessment Toolkit for Attié, Abidji, Baoulé, and Bété. Applied Linguistics. doi: 10.1093/applin/amac025
- Jasińska K, Ball M & Guei S (2023). Literacy in Côte d’Ivoire. In Joshi M (Ed.), Handbook of Literacy in Africa. Literacy Studies Series. Springer.
- Jasińska K & Guei S (2022). Interventions to Support Learning at the Bottom of the Pyramid in Côte d’Ivoire. In Wagner DA, Castillo NM, and Lewis SG (Eds.), Learning, Marginalization, and Improving the Quality of Education in Low-income Countries (Learning at the Bottom of the Pyramid 2). doi: 10.11647/OBP.0256
- Jasińska KK, & Guei S (2018). Neuroimaging field methods using functional near infrared spectroscopy (NIRS) neuroimaging to study global child development: Rural sub-saharan africa. Journal of Visualized Experiments: JoVE, (132).
- Jasińska KK, Wolf S, Jukes MC, & Dubeck MM (2019). Literacy acquisition in multilingual educational contexts: Evidence from Coastal Kenya. Developmental Science, e12828.
- Jasińska K, Zinszer B, Xu Z, Hannon J, Seri A, Tanoh F & Akpé H (2022b). Home Learning Environment and Physical Development Impact Children’s Executive Function Development and Literacy in Rural Côte d’Ivoire. Cognitive Development, 64. doi: 10.1016/j.cogdev.2022.101265
- Kihlstrom JF (2021). Ecological validity and “ecological validity”. Perspectives on Psychological Science, 16(2), 466–471.
- Kirkham NZ, Slemmer JA, & Johnson SP (2002). Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition, 83(2), B35–B42.
- Kokora PD (1979). Une orthographe pratique des langues ivoiriennes, Institut de Linguistique Appliquée, Université d’Abidjan, Abidjan : Ministère de l’Education Nationale. Ministère de la Recherche Scientifique, Ministère des Affaires Culturelles, Ministère de la Jeunesse, de l’Education Populaire, et des Sports.
- Kūkea Shultz P, & Englert K (2021). Cultural validity as foundational to assessment development: An Indigenous example. Frontiers in Education, 6(244).
- Kuznetsova A, Brockhoff PB, Christensen RHB (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. doi: 10.18637/jss.v082.i13.
- Lum J, Kidd E, Davis S, & Conti-Ramsden G (2010). Longitudinal study of declarative and procedural memory in primary school-aged children. Australian Journal of Psychology, 62(3), 139–148.
- Lum JA, Ullman MT, & Conti-Ramsden G (2013). Procedural learning is impaired in dyslexia: Evidence from a meta-analysis of serial reaction time studies. Research in Developmental Disabilities, 34(10), 3460–3476.
- Madaio M, Kamath V, Yarzebinski E, Zasacky S, Tanoh F, Hannon-Cropp J, Cassell J, Jasińska K, & Ogan A (2019a). “You give a little of yourself”: Family support for children’s use of an IVR literacy system. In the 2019 ACM SIGCAS Conference on Computing and Sustainable Societies (ACM COMPASS), July, 2019.
- Madaio M, Tanoh F, Blahoua A, Jasińska K & Ogan A (2019b). “Everyone brings their grain of salt”: Designing for low-literate parental engagement with a mobile literacy technology in Côte d’Ivoire. In the ACM Conference on Human Factors in Computing Systems (CHI), April, 2019.
- Madaio M, Yarzebinski E, Kamath V, Zinszer BD, Hannon J, Tanoh F, Akpe YH, Seri AB, Jasińska K, & Ogan A (2020). Collective support and independent learning with a voice-based literacy technology in rural communities. In the Proceedings of Computer Human Interaction, 2020.
- Madan CR (2014). Manipulability impairs association-memory: Revisiting effects of incidental motor processing on verbal paired-associates. Acta Psychologica, 149, 45–51.
- Morey RD, Rouder JN, & Jamil T (2015). Package ‘BayesFactor’. URL https://cran.r-project.org/web/packages/BayesFactor/BayesFactor.pdf
- Nazzi T, Iakimova G, Bertoncini J, Frédonie S, & Alcantara C (2006). Early segmentation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences. Journal of Memory and Language, 54(3), 283–299.
- Palmer SD, Hutson J, & Mattys SL (2018). Statistical learning for speech segmentation: Age-related changes and underlying mechanisms. Psychology and Aging, 33(7), 1035.
- Pelz M, Yung A, & Kidd C (2015). Quantifying curiosity and exploratory play on touchscreen tablets. In Proceedings of the IDC 2015 Workshop on Digital Assessment and Promotion of Children’s Curiosity, June 21–24, Boston, MA.
- Perception Research Systems. (2007). Paradigm Stimulus Presentation. Retrieved from http://www.paradigmexperiments.com
- Qi Z, Sanchez Araujo Y, Georgan WC, Gabrieli JD, & Arciuli J (2019). Hearing matters more than seeing: A cross-modality study of statistical learning and reading ability. Scientific Studies of Reading, 23(1), 101–115.
- R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
- Revelle W (2019). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. R package version 1.9.12, https://CRAN.R-project.org/package=psych.
- RStudio Team (2020). RStudio: Integrated Development for R. RStudio, PBC, Boston, MA: http://www.rstudio.com/.
- RTI International. (2009). Manuel pour l’evaluation des competences fondamentales en lecture. [Early Grade Reading Assessment toolkit, French adaptation]. Adaptation by L. Sprenger-Charolles. Prepared for the US Agency for International Development under the EdData II project, Task Order 3, Contract No. EHC-E-01-03-00004-00. Research Triangle Park, North Carolina: RTI International. Retrieved April 19, 2011, from https://pdf.usaid.gov/pdf_docs/Pnadq182.pdf
- Saffran JR (2003). Statistical language learning: Mechanisms and constraints. Current Directions in Psychological Science, 12(4), 110–114.
- Saffran JR, Aslin RN, & Newport EL (1996). Statistical learning by 8-month-old infants. Science, 274(5294), 1926–1928.
- Saffran JR, Johnson EK, Aslin RN, & Newport EL (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70(1), 27–52.
- Sawi OM, & Rueckl J (2019). Reading and the neurocognitive bases of statistical learning. Scientific Studies of Reading, 23(1), 8–23.
- Schneider JM, Hu A, Legault J, & Qi Z (2020). Measuring statistical learning across modalities and domains in school-aged children via an online platform and neuroimaging techniques. Journal of Visualized Experiments. E61474.
- Sharp C, Barr G, Ross D, Bhimani R, Ha C, & Vuchinich R (2012). Social discounting and externalizing behavior problems in boys. Journal of Behavioral Decision Making, 25(3), 239–247.
- Siegelman N, Bogaerts L, & Frost R (2017). Measuring individual differences in statistical learning: Current pitfalls and possible solutions. Behavior Research Methods, 49(2), 418–432.
- Solano-Flores G (2011). Assessing the cultural validity of assessment practices: An introduction. In Basterra MR, Trumbull E, & Solano-Flores G (Eds.), Cultural validity in assessment: A guide for educators (pp. 3–21). New York: Routledge.
- Solano-Flores G, & Nelson-Barber S (2001). On the cultural validity of science assessments. Journal of Research in Science Teaching, 38(5), 553–573.
- Snow CE (2006). What Counts as Literacy in Early Childhood?. Blackwell Handbook of Early Childhood Development, 274–294.
- Storch SA, & Whitehurst GJ (2002). Oral language and code-related precursors to reading: Evidence from a longitudinal structural model. Developmental Psychology, 38(6), 934–947.
- Tong X, Leung WWS, & Tong X (2019). Visual statistical learning and orthographic awareness in Chinese children with and without developmental dyslexia. Research in Developmental Disabilities, 92, 103443.
- Torkildsen J, Arciuli J, & Wie OB (2019). Individual differences in statistical learning predict children’s reading ability in a semi-transparent orthography. Learning and individual differences, 69, 60–68.
- Twum M, & Parenté R (1994). Role of imagery and verbal labeling in the performance of paired associates tasks by persons with closed head injury. Journal of Clinical and Experimental Neuropsychology, 16(4), 630–639.
- UNESCO Institute for Statistics. (2018). UNESCO eAtlas of Literacy, 1970–2018. Retrieved from: https://tellmaps.com/uis/literacy/#!/tellmap/-1003531175
- Voodoo Group. (2016). Campagne Orange AIRBOX - 4G - GBAKA [Video]. YouTube. https://www.youtube.com/watch?v=mqTks-EYX5E.
- Walker CM, Walker LB, & Ganea PA (2013). The role of symbol-based experience in early learning and transfer from pictures: Evidence from Tanzania. Developmental Psychology, 49(7), 1315.
- Warker JA (2013). Investigating the retention and time course of phonotactic constraint learning from production experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39(1), 96.
- West G, Vadillo MA, Shanks DR, & Hulme C (2018). The procedural learning deficit hypothesis of language learning disorders: we see some problems. Developmental Science, 21(2), e12552.
- Whitehurst GJ, & Lonigan CJ (1998). Child development and emergent literacy. Child Development, 69(3), 848–872.
- Woodcock RW, McGrew KS, & Mather N (2001). Woodcock-Johnson III tests of achievement.
- Yu A, Chen MS, Cherodath S, Hung DL, Tzeng OJ, & Wu DH (2019). Neuroimaging evidence for sensitivity to orthography-to-phonology conversion in native readers and foreign learners of Chinese. Journal of Neurolinguistics, 50, 53–70.
- Yopp HK (1995). A test for assessing phonemic awareness in young children. The Reading Teacher, 49(1), 20–29.
- Zuilkowski SS, McCoy DC, Serpell R, Matafwali B, & Fink G (2016). Dimensionality and the development of cognitive assessments for children in sub-saharan Africa. Journal of Cross-Cultural Psychology, 47(3), 341–354. doi: 10.1177/0022022115624155
- Zinszer BD, Hannon J, Kouadio AE, Akpé H, Tanoh F, Hu A, Qi Z, & Jasińska K (2023). Does non-linguistic segmentation predict literacy in an L2 education? Statistical learning in Ivorian primary schools. Language Learning, in press. doi: 10.1111/lang.12603
