Author manuscript; available in PMC: 2020 Jan 1.
Published in final edited form as: Sci Stud Read. 2018 Dec 2;23(1):1–7. doi: 10.1080/10888438.2018.1549045

The Role of Statistical Learning in Word Reading and Spelling Development: More Questions than Answers

Amy M Elleman 1, Laura M Steacy 2, Donald L Compton 3
PMCID: PMC6358202  NIHMSID: NIHMS997573  PMID: 30718941

Abstract

This special issue bundles a set of eight empirical studies and one review article that explore the role of SL mechanisms (both domain-specific and domain-general) in supporting word reading and spelling development, and vice-versa. In this introduction to the special issue, we worked to summarize the extent to which studies support our hypotheses relating SL to reading and spelling development while also pointing out inconsistencies across studies that require us to refine and rethink our hypotheses.


While there is general consensus that inductive learning shapes general cognitive and linguistic development, the mechanisms by which individuals come to know the probabilistic constraints in their environment are not well understood (for different perspectives see Griffiths, Chater, Kemp, Perfors, & Tenenbaum, 2010; McClelland et al., 2010). Inductive learning relies on a child’s ability to acquire and eventually exploit statistical regularities in the environment (see Seidenberg, 1997). Human language and writing systems have a rich and complex structure that enables children to induce statistical regularities from relatively noisy inputs, allowing the encoding of a large number of probabilistic constraints derived from experience (McClelland et al., 2010; Perruchet & Pacton, 2006). Statistical learning (SL) has been linked to children’s development of word segmentation (Aslin, Saffran, & Newport, 1998); early literacy-related skills (Spencer, Kaschak, Jones, & Lonigan, 2015); acquisition of orthographic structure (Pacton, Perruchet, Fayol, & Cleeremans, 2001); grapheme-phoneme correspondence (Apfelbaum, Hazeltine, & McMurray, 2013); stress placement on bisyllabic words (Arciuli, Monaghan, & Seva, 2010); as well as development of reading, spelling, and vocabulary (Arciuli & Simpson, 2012; Yurovsky, Fricker, Yu, & Smith, 2014).
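To make the notion of a "statistical regularity" concrete, the following minimal Python sketch computes transitional (conditional) probabilities between adjacent syllables, the kind of statistic referenced in the word segmentation literature cited above. The syllable stream, the three made-up "words," and the function name are illustrative assumptions, not stimuli or procedures from any cited study.

```python
from collections import Counter


def transitional_probabilities(stream):
    """Return P(next | current) for every adjacent pair in a sequence."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each element only where it has a successor.
    first_counts = Counter(stream[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical stream built from three invented "words" in a varied order.
words = {
    "A": ["bi", "da", "ku"],
    "B": ["pa", "do", "ti"],
    "C": ["go", "la", "bu"],
}
order = "ABCACBABCBAC"
stream = [syllable for w in order for syllable in words[w]]

tp = transitional_probabilities(stream)
within_word = tp[("bi", "da")]   # transition inside a "word"
across_words = tp[("ku", "pa")]  # transition spanning a "word" boundary
print(within_word, across_words)
```

In this toy stream, transitions inside a "word" are perfectly predictable, whereas transitions across a boundary are not, which is the contrast a learner could in principle exploit to segment the stream.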

We as editors believe that exploring how children acquire and eventually exploit probabilistic knowledge during literacy development is of crucial importance to the field of reading and hope that this special issue helps to advance our understanding of the relationship between SL and reading and spelling development. For the special issue we advance three working hypotheses regarding the relationship between SL and reading and spelling development, namely: 1) SL of orthographic-phonological (O→P) relationships in quasi-regular orthographies builds through the learning of multiple items within the domain (both reading and spelling) allowing a probabilistic signal to be derived and therefore these mechanisms are reliant on item-specific learning algorithms to provide the raw materials required for induction; 2) individual differences in both domain-general and domain-specific inductive learning mechanisms influence reading and spelling development with a privileged role for domain-specific mechanisms; and 3) differences between typically developing children and children who struggle to learn to read and spell (e.g., children with dyslexia) can be partially explained by differences across groups in SL ability. Thus, we set out to provide a set of papers that would explore some of these important hypotheses.

This special issue bundles a set of eight empirical studies and one review article that explore the role of SL mechanisms (both domain-specific and domain-general) in supporting word reading and spelling development, and vice-versa. In our call for abstracts we explicitly asked for empirical manuscripts examining: 1) the extent to which SL explains individual differences in word reading and spelling development in the broader population of learners, 2) the degree to which SL mechanisms explain differences between typically developing children and those who struggle to learn to read and spell, and 3) the common and unique contributions of domain-general and domain-specific inductive learning mechanisms related to reading and spelling development. We relied on the work of Pacton (Pacton et al., 2001; 2005; Perruchet & Pacton, 2006) to structure our selection of manuscripts, by making a distinction between laboratory experiments and developmental studies examining SL related to sensitivity to orthographic regularities. Laboratory experiments by necessity tend to use brief exposures, both in duration and in the number of stimuli experienced by participants. In contrast, developmental studies allow an extended time frame over which real-world learning takes place and allow SL to progress through natural mechanisms.

As luck would have it, we received a good mix of high-quality studies that collectively ticked a majority of the boxes on our wish list. For instance, the special issue contains four developmental and five laboratory studies; six studies exploring the extent to which SL explains individual differences in word reading and spelling development; two studies exploring whether SL mechanisms explain differences between typically developing individuals and individuals who struggle to learn to read and spell; and one study that explored the contributions of domain-general SL (laboratory measure) in explaining individual differences in domain-specific SL of reading (developmental measure). In general, results across studies clearly support a relationship between SL and reading and spelling development. However, as one might expect, there were differences across studies that lead to important questions, such as whether the various domain-general measures employed are appropriate for modeling individual differences in SL; whether measures of SL draw upon similar cognitive and neurological substrates associated with reading and spelling development; whether visual and auditory SL tasks represent unitary or separate factors that theoretically could be differentially aligned with reading and spelling development; and whether SL is causally related to poor word reading and spelling development, or vice-versa. In this introduction to the special issue, we worked to summarize the extent to which studies support our hypotheses relating SL to reading and spelling development while also pointing out inconsistencies across studies that require us to refine and rethink our hypotheses.

Hypothesis 1: Statistical learning of orthographic-phonological (O→P) relationships in quasi-regular orthographies builds through the learning of multiple items within the domain (both reading and spelling).

We have previously characterized SL as the primary inductive learning mechanism by which children (and by extension adults) derive abstract probabilistic knowledge of O→P relationships (Steacy, Elleman, & Compton, 2017). Knowledge of the probabilistic relationships governing O→P mappings appears particularly important in quasi-regular orthographies (i.e., those in which the relationship between orthography and phonology is systematic but with many exceptions) such as English and French. As Sawi and Rueckl (current issue) put it, “one way to account for how readers cope with challenges imposed by quasi-regularity is to posit that reading is driven by knowledge of the statistical properties of the writing system, and hence of reading acquisition as an exercise in SL.” Thus, our first hypothesis is that SL of written language builds through the learning of items within the domain, allowing a probabilistic signal to build with reading and spelling development and experience.

Results from Gingras and Sénéchal (current issue), Rahmanian and Kuperman (current issue), and Steacy et al. (current issue) certainly support this hypothesis. In each of the studies the outcome measure was designed to be sensitive to probabilistic knowledge developed through experience with reading and spelling. Steacy et al. reported that children’s use of less frequent grapheme-phoneme correspondences when reading a nonword with a variant vowel (e.g., chead) was predicted by word reading ability and rime support for the less frequent “context-dependent” vowel pronunciation. Results support a model of reading development in which child and corpus attributes work together to tune variant vowel pronunciations across individual children and words, with important variance associated with both factors. Similarly, Gingras and Sénéchal found that as children (grades 1–5) were exposed to more frequent silent-letter endings or double consonants in French, they implicitly acquired representations for letters with no phonological value. Results of both the Steacy et al. and Gingras and Sénéchal studies suggest a shift from lower-level, phoneme-based processing to higher-level processing at the word and rime level as children acquire more reading and spelling experience. Finally, Rahmanian and Kuperman used a novel measure, processing of words that have homophonic substandard spelling variants (comit vs. commit) of varying frequency, to explore how spelling errors in the written corpus of English affect eye movements and lexical choice in adult readers. Results suggested that words with greater uncertainty of spelling variants, based on higher frequency of occurrence, elicited longer fixation durations and lexical decision latencies. However, in contrast to findings reported by Steacy et al. and Gingras and Sénéchal, no relation was found between reading and spelling experience and the inhibitory effect of word frequency.
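As an illustration of what "rime support" for a context-dependent vowel pronunciation could mean in practice, the sketch below compares the overall probability of a vowel pronunciation with its probability conditioned on the rime. The twelve-word lexicon, the pronunciation labels, and the type-count definition of support are assumptions made for illustration only; they are not the corpus or the metric used by Steacy et al.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: (word, rime, pronunciation of the grapheme "ea").
# "E" stands for the vowel in "head"; "i" stands for the vowel in "bead".
LEXICON = [
    ("head", "ead", "E"), ("bread", "ead", "E"), ("dead", "ead", "E"),
    ("spread", "ead", "E"), ("bead", "ead", "i"), ("plead", "ead", "i"),
    ("beat", "eat", "i"), ("seat", "eat", "i"), ("heat", "eat", "i"),
    ("team", "eam", "i"), ("dream", "eam", "i"), ("meal", "eal", "i"),
]


def overall_probability(lexicon, vowel):
    """Proportion of all words in which "ea" takes the given pronunciation."""
    counts = Counter(v for _, _, v in lexicon)
    return counts[vowel] / len(lexicon)


def rime_probability(lexicon, rime, vowel):
    """Proportion of words sharing a rime that take the given pronunciation."""
    by_rime = defaultdict(Counter)
    for _, r, v in lexicon:
        by_rime[r][v] += 1
    total = sum(by_rime[rime].values())
    return by_rime[rime][vowel] / total if total else 0.0


p_overall = overall_probability(LEXICON, "E")
p_given_rime = rime_probability(LEXICON, "ead", "E")
print(p_overall, p_given_rime)
```

In this toy lexicon the "head" pronunciation is the minority reading of "ea" overall but the majority reading within the "-ead" rime, so a reader sensitive to rime-level statistics would have grounds to pronounce chead to rhyme with head.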

Overall, findings from these three studies clearly suggest that probabilistic learning of O→P relations, in both developing and skilled readers, results from reading and spelling experience. We are, however, mindful of arguments made by Nation and Mak (in press) that “the closer statistical tasks become to the task in hand – reading – the harder it is to maintain a distinction between SL as a cause of individual differences in reading vs. a finer-level description of them,” supporting the hypothesis that typical reading and spelling development naturally results in probabilistic knowledge about the writing system.

Hypothesis 2: Both domain-general and domain-specific statistical learning mechanisms influence reading and spelling development.

We define domain-general and domain-specific SL slightly differently from how the terms have traditionally been used in the literature. Typically, domain-general SL has been defined as a unitary learning mechanism in which capacity is controlled by a single learning system across all modalities, whereas domain-specific SL is considered componential with capacity varying across modality, suggesting that learning produces representations that are specific to the stimulus properties present in auditory and visual modalities (see Frost et al., 2015). We, on the other hand, use the term domain-specific SL to refer to probabilistic knowledge developed specifically through the experience of learning to read and spell, and we use the term domain-general SL to refer to a broader set of learning mechanisms that relate to general learning, potentially supporting reading and spelling development, and that can vary as a function of modality (for details see Steacy, Elleman, & Compton, 2017). According to our SL taxonomy, domain-specific SL is measured using reading or spelling measures that are sensitive to probabilistic constraints of the orthography, whereas domain-general SL is measured with tasks designed to assess general SL independent of reading and spelling development (e.g., artificial grammar learning, serial reaction time, visual SL, and auditory SL).

Our definition of domain-general and domain-specific SL has interesting consequences when applied to the special issue studies. According to our scheme, domain-specific SL is the by-product of learning to read and spell, and in three of the studies it is the dependent measure of interest (Gingras & Sénéchal; Rahmanian & Kuperman; Steacy et al.), whereas domain-general SL refers to a set of component processes which represent general capacities (i.e., related to the encoding, retention, and abstraction of regularities) that allow statistical regularities to be abstracted from the environment (Arciuli, 2017). Typically, this capacity is measured by assessing an individual’s ability to induce regularities from a novel set of nonalphanumeric stimuli presented over a short period of time. In our set of studies this type of task was included as an independent variable used to predict individual differences in reading skill (Hung et al., current issue; Qi, Araujo, Georgan, Gabrieli, & Arciuli, current issue; Schmalz, Moll, Mulatti, & Schulte-Körne, current issue; Steacy et al., current issue; van der Kleij, Groen, Segers, & Verhoeven, current issue; Vandermosten, Wouters, Ghesquière, & Golestani, current issue).

As outlined above, studies by Gingras and Sénéchal (current issue), Rahmanian and Kuperman (current issue), and Steacy et al. (current issue) clearly support our hypothesis that domain-specific SL develops with increased reading and writing experience. Results of the special issue studies examining the relationship between domain-general SL and reading skill, on the other hand, were mixed, both in terms of measure type and magnitude of the relationship. Schmalz et al. (current issue) reported nonsignificant correlations between SL tasks (i.e., serial reaction time task and artificial grammar task) and reading ability, between SL and bigram sensitivity, and between the SL tasks themselves in a sample of 84 adults. Contrasting visual sequential versus visual spatial serial reaction time tasks, van der Kleij et al. (current issue) reported that sequential, but not spatial, SL predicted growth in reading skills in children with and without dyslexia. In addition, relations between the serial reaction time measure and reading were stronger for nonwords than words, prompting the authors to speculate that computation of phonology for novel words and forming new orthographic representations depend more on implicit learning skills than word recognition does. Finally, Hung et al. (current issue) used a serial reaction time task and reported at the behavioral level that differences in performance across the SL measure presented in random versus ordered stimuli related significantly to word reading skill in adolescent participants. At the neural level, the authors also identified network regions common to both the SL and word reading tasks, suggesting that bimodal mapping, sequential binding, and storage were commonly involved in sequence learning and reading. Hung et al. argue that sequential processing is involved both in motor learning and word retrieval, further speculating that skilled readers engage shared neural systems when retrieving the serial phonological patterns and covertly or overtly reading visual words. While mixed, results point to a possible relationship between serial reaction time performance and reading development at both the behavioral and neural levels, with the van der Kleij et al. and Hung et al. studies speculating that sequential computation of phonology may be the important link. Differences across studies in terms of age of participants, type of reading measures assessed, and transparency of orthography make it difficult to resolve conflicting results.
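For readers less familiar with serial reaction time tasks, the sketch below shows one simple way a learning score could be derived from such data, assuming learning is indexed as the difference between mean reaction time on random trials and mean reaction time on sequenced (ordered) trials. The reaction times and the function name are hypothetical, and the scoring used in any individual special issue study may differ.

```python
from statistics import mean


def srt_learning_score(sequenced_rts, random_rts):
    """Mean RT on random trials minus mean RT on sequenced trials (ms).

    Larger positive values indicate a greater speed advantage for the
    repeating sequence, taken here as an index of sequence learning.
    """
    return mean(random_rts) - mean(sequenced_rts)


# Hypothetical reaction times in milliseconds for one participant.
sequenced = [420, 410, 400, 390]
random_block = [450, 460, 440, 450]

score = srt_learning_score(sequenced, random_block)
print(score)
```

A per-participant score of this kind is the sort of domain-general SL measure that could then be correlated with reading outcomes.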

Qi et al. (current issue) measured children’s and adults’ performance on sequential auditory and visual SL and related performance to several measures of reading. Auditory SL, but not visual SL, was significantly associated with sentence reading fluency in the combined sample of children and adults. In the subsample of children, auditory SL was significantly associated with nonword reading accuracy, with the relationship mediated by phonological processing abilities. Steacy et al. (current issue) used a novel approach to explore the contributions of domain-general visual SL (laboratory measure) in explaining individual differences in domain-specific SL (developmental measure) of reading (i.e., individual differences in children’s assignment of more vs. less frequent GPC to vowel pronunciations as a function of rime coda influence in monosyllabic nonwords). While the expectation was that visual SL performance would be associated with a higher use of the conditionalized vowel pronunciation in monosyllabic nonwords (chead rhyming with head instead of bead), results did not support this expectation. The visual SL task did not account for unique variance in conditionalized nonword vowel pronunciation after controlling for other variables (e.g., phonological awareness skill, set for variability, and reading skill). Instead, it was an interaction between child-level reading skill and item-level rime support for the less frequent vowel pronunciation that predicted variance in domain-specific SL, supporting the role of reading skill and exposure to the broader corpus of words in shaping abstract probabilistic knowledge of O→P relationships.

Our original hypothesis was that individual differences in both domain-general and domain-specific inductive learning mechanisms influence reading and spelling development, with a privileged role for domain-specific mechanisms. Given the results from the seven studies examining relations between SL and reading and spelling skill, some modifications to the hypothesis are warranted. First, it is likely, based on the results of Gingras & Sénéchal (current issue), Rahmanian & Kuperman (current issue), and Steacy et al. (current issue), that individual differences in reading and spelling skill and experience drive the development of domain-specific SL, and less so the alternative (see Nation & Castles, 2017; Nation & Mak, in press). We are certainly cognizant of the potential for a bidirectional relationship between item-level learning skill and inductive learning, such that more powerful item-level learning leads to more sophisticated abstract probabilistic knowledge structures from the inductive learning system, while more sophisticated induction systems make it easier to add new items. Specifically, individual differences in item-specific learning algorithms across children may support or inhibit derivation of probabilistic O→P knowledge and vice-versa.

Across the four studies examining the extent to which individual differences in domain-general SL account for variance in reading development, results were mixed. Some support for a relationship between serial reaction time performance and reading was evident, specifically implicating the role of sequential computation of phonology in linking SL and reading. In terms of the potential relationship between visual and auditory SL and reading development, the picture becomes cloudier, with mixed results across studies. Methodological variations across studies, such as the type of statistical information to be learned, modality of the stimuli, tasks used to measure learning, and the degree to which participants are explicitly directed towards the to-be-learned regularities, likely affect the strength of SL–reading relationships (see Sawi & Rueckl). We believe results across the studies support Sawi and Rueckl’s conclusion that, “the organization of reading processes is shaped by the statistical structure of the writing system and learning to read is thus fundamentally a form of statistical learning (Harm & Seidenberg, 2004; Rueckl, 2016). An open question is whether and how “statistical learning” in this context is related to learning in so-called “statistical learning” tasks such as the canonical SL or serial reaction time tasks.” Results certainly support the importance of further work examining the relation between various types of SL tasks and reading skills.

Hypothesis 3: Typically developing children and children who struggle to learn to read and spell differ systematically on statistical learning ability.

Finally, we looked across special issue studies to examine whether typically developing readers and children who struggle to learn to read and spell (e.g., children with dyslexia) differ on SL ability. Results from previous studies examining SL in children with dyslexia are equivocal, with a number of studies reporting that individuals with dyslexia have significantly lower SL scores relative to typically developing individuals, while others failed to find such group differences (for details see Sawi & Rueckl). Two of the special issue studies looked specifically at whether SL performance varied across typically developing and dyslexic individuals. Vandermosten et al. (current issue) explored whether the core phonemic representation deficits found in children with dyslexia arise from reduced sensitivity to the statistical distribution of sounds. Specifically, the authors investigated the role distributional learning plays in the formation of phoneme categories in school-aged children and, further, the extent to which children with dyslexia differ from typically developing children on this skill. Results suggest that the statistical distribution of the presented sounds implicitly enhanced the formation of phonemic representations and that dyslexic children make less use of the statistical cues embedded in oral language, resulting in less distinct phonemic categories and thus a higher risk of failing to establish robust connections between these and written language. In the second study, van der Kleij et al. examined whether typically developing children and children with dyslexia differed on a serial reaction time task. Results suggest that children with dyslexia had longer reaction times in general on the task, but did not differ from typical readers in how well or how quickly they learned on either implicit learning task or in their overnight consolidation. The mixed findings across these studies align with the inconsistent findings in the wider literature examining differences between children with dyslexia and typical readers in terms of performance on domain-general SL tasks.

It is worth noting that connectionist models (e.g., Harm & Seidenberg, 1999) have modeled the lack of domain-specific SL in children with phonological dyslexia. Specifically, the triangle model provides a computational account of why poor phonological representations lead to poor reading, and in particular poor nonword generalization (a form of domain-specific SL). The crucial insight from these simulations is that a phonological impairment leads to poor learning in the O→P component. Instead of forming representations sensitive to subword units such as onsets and rimes, the hidden units in the impaired simulations learn item-specific representations (i.e., whole words). The formation of these item-specific representations is what directly impairs nonword reading. The poor nonword reading in the model is not due to the phonological system’s impaired ability to assemble phonemes produced by the reading system, but rather the phonological impairment causes poor O→P representations to be formed during learning. This suggests a potential mechanism linking poor phonological representations with decreased domain-specific SL (the ability to generalize O→P representations to read nonwords) in children with dyslexia. We interpret the modeling results as supporting the importance of domain-specific SL in explaining the poor reading and spelling development of children with dyslexia.

We hope you enjoy the special issue as much as we enjoyed editing it. We believe the issue has much to offer the field in terms of clarifying the role that SL plays in promoting word reading and spelling development. Results certainly justify the importance of continued exploration of mechanisms undergirding how children acquire and eventually exploit probabilistic knowledge during literacy development. We hope you agree that the special issue generates more questions than it provides answers.

Acknowledgments

This research was supported in part by Grant P20HD091013 awarded to Florida State University by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). The content is solely the responsibility of the authors and does not necessarily represent the official view of NICHD.

Contributor Information

Amy M. Elleman, Literacy Studies, Middle Tennessee State University

Laura M. Steacy, Florida Center for Reading Research, Florida State University

Donald L. Compton, Florida Center for Reading Research, Florida State University

References

* denotes special issue papers

  1. Apfelbaum KS, Hazeltine E, & McMurray B (2013). Statistical learning in reading: Variability in irrelevant letters helps children learn phonics skills. Developmental Psychology, 49, 1348–1365.
  2. Arciuli J, Monaghan P, & Seva N (2010). Learning to assign lexical stress during reading aloud: Corpus, behavioral, and computational investigations. Journal of Memory and Language, 63, 180–196.
  3. Arciuli J, & Simpson IC (2012). Statistical learning is related to reading ability in children and adults. Cognitive Science, 36, 286–304.
  4. Aslin RN, Saffran JR, & Newport EL (1998). Computation of conditional probability statistics by 8-month-old infants. Psychological Science, 9, 321–324.
  5. Frost R, Armstrong BC, Siegelman N, & Christiansen MH (2015). Domain generality versus modality specificity: The paradox of statistical learning. Trends in Cognitive Sciences, 19(3), 117–125.
  6. * Gingras M, & Sénéchal M (2018). Evidence of statistical learning of orthographic representations in grades 1–5: The case of silent letters and double consonants in French. Scientific Studies of Reading.
  7. Griffiths TL, Chater N, Kemp C, Perfors A, & Tenenbaum JB (2010). Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences, 14, 357–364.
  8. Harm MW, & Seidenberg MS (1999). Phonology, reading acquisition, and dyslexia: Insights from connectionist models. Psychological Review, 106(3), 491–528.
  9. Harm MW, & Seidenberg MS (2004). Computing the meanings of words in reading: Cooperative division of labor between visual and phonological processes. Psychological Review, 111(3), 662–720.
  10. * Hung Y, Frost SJ, Molfese P, Malins JG, Landi N, Mencl WE, Rueckl JG, Bogaerts L, & Pugh KR (2018). Common neural basis of motor sequence learning and word recognition and its relation with individual differences in reading skill. Scientific Studies of Reading.
  11. McClelland JL, Botvinick MM, Noelle DC, Plaut DC, Rogers TT, Seidenberg MS, & Smith LB (2010). Letting structure emerge: Connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14, 348–356.
  12. Nation K, & Castles A (2017). Putting the learning into orthographic learning. In Theories of reading development (pp. 148–168). Amsterdam, The Netherlands: John Benjamins Publishing.
  13. Nation K, & Mak MHC (in press). Orthographic learning and learning to read: Implications for developmental dyslexia. In Compton DL, Washington JA, & McCardle P (Eds.), Dyslexia 101: Revisiting etiology, diagnosis, treatment, and policy. Baltimore, MD: Brookes Publishing.
  14. Pacton S, Fayol M, & Perruchet P (2005). Children’s implicit learning of graphotactic and morphological regularities. Child Development, 76, 324–339.
  15. Pacton S, Perruchet P, Fayol M, & Cleeremans A (2001). Implicit learning out of the lab: The case of orthographic regularities. Journal of Experimental Psychology: General, 130, 401.
  16. Perruchet P, & Pacton S (2006). Implicit learning and statistical learning: One phenomenon, two approaches. Trends in Cognitive Sciences, 10, 233–238.
  17. * Qi Z, Araujo YS, Georgan WC, Gabrieli JDE, & Arciuli J (2018). Hearing matters more than seeing: A cross-modality study of statistical learning and reading ability. Scientific Studies of Reading.
  18. * Rahmanian S, & Kuperman V (2018). Spelling errors impede recognition of correctly spelled word forms. Scientific Studies of Reading.
  19. Rueckl JG (2016). Toward a theory of variation in the organization of the word reading system. Scientific Studies of Reading, 20(1), 86–97.
  20. * Sawi OM, & Rueckl JG (2018). Reading and the neurocognitive bases of statistical learning. Scientific Studies of Reading.
  21. * Schmalz X, Moll K, Mulatti C, & Schulte-Körne G (2018). Is statistical learning ability related to reading ability, and if so, why? Scientific Studies of Reading.
  22. Seidenberg MS (1997). Language acquisition and use: Learning and applying probabilistic constraints. Science, 275, 1599–1603.
  23. Spencer M, Kaschak MP, Jones JL, & Lonigan CJ (2015). Statistical learning is related to early literacy-related skills. Reading and Writing, 28, 467–490.
  24. * Steacy LM, Compton DL, Petscher Y, Elliott JD, Smith K, Rueckl J, Sawi O, Frost S, & Pugh K (2018). Development and prediction of context-dependent vowel pronunciation in elementary readers. Scientific Studies of Reading.
  25. Steacy LM, Elleman AM, & Compton DL (2017). Opening the “black box” of learning to read: Inductive learning mechanisms supporting word-learning development with a focus on interventions for children who struggle to read. In Theories of reading development (pp. 99–121). Amsterdam, The Netherlands: John Benjamins Publishing.
  26. * van der Kleij SW, Groen MA, Segers E, & Verhoeven L (2018). Sequential implicit learning ability predicts growth in reading skills in typical readers and children with dyslexia. Scientific Studies of Reading.
  27. * Vandermosten M, Wouters J, Ghesquière P, & Golestani N (2018). Statistical learning of speech sounds in dyslexic and typical reading children. Scientific Studies of Reading.
  28. Yurovsky D, Fricker DC, Yu C, & Smith LB (2014). The role of partial knowledge in statistical word learning. Psychonomic Bulletin & Review, 21, 1–22.
