Abstract
The statistical-learning view of word reading and spelling is based on the idea that writing systems have a rich statistical structure and that people implicitly pick up this structure as they learn to read and write. Whereas laboratory studies stress the speed and power of statistical learning, the evidence we review shows that adults with years of reading and writing experience do not always mirror the statistics of their writing system in their behavior. We consider possible reasons for these discrepancies, including the complexity of the statistical relationships, ease of production, and satisficing. The findings suggest that literacy instruction should address the probabilistic patterns in writing systems and the role of context in selecting appropriate pronunciations and spellings.
Keywords: statistical learning, spelling, reading, writing systems
Statistical learning in children and adults
In recent years, research on word reading and spelling has adopted a statistical-learning view: that people implicitly learn about the written forms of words and their connections with spoken language as they learn to read and write [1–4]. According to this view, written words and their links with spoken language are characterized by a rich statistical structure, and people rapidly pick up these statistics without explicit teaching. These ideas about statistical learning (see Glossary) underlie connectionist models of skilled reading and spelling (Box 1), and they are supported by studies showing that even infants can extract statistics about the structure of a miniature artificial language with just a few minutes of experience [5–8]. Many studies of statistical learning in spelling and word reading have focused on children, asking whether they learn statistics about written words and spelling–sound relationships without explicit instruction (Box 2). Here we address another important question about the statistical-learning view of spelling and reading: Do people with years of experience with a writing system faithfully follow its statistics? If learning to read is a process of internalizing the statistical patterns of a writing system [3], and if statistical learning is as quick and powerful as laboratory studies suggest, then we should see a close match between the behavior of experienced readers and the statistics of the writing system [9]. In what follows, we discuss several lines of research on spelling and word reading that have made such comparisons. As we will see, adults’ spelling and word reading performance aligns with the vocabulary statistics in some ways but not others. We consider the reasons for the discrepancies and the implications for the teaching of word reading and spelling and for the concept of statistical learning.
Box 1. Statistical-learning and dual-route theories of word reading and spelling.
Reading and spelling words in an alphabetic writing system has traditionally been thought to involve two processes [44]. The first process, the sublexical route, is based on simple rules that link individual letters or digraphs and sounds, such as that the letter ⟨f⟩ corresponds to the phoneme /f/. The second process, the lexical route, is based on whole words, such as the knowledge that the written form ⟨of⟩ corresponds to the word of. The dual-route view has been instantiated in computational models for English [44] and other languages (e.g., German [45]), stimulating research comparing the behavior of human readers to that of the models [e.g., 12] and the development of competing computational models, including those adopting statistical-learning mechanisms [46]. This research revealed that the traditional dual-route model was based on some incorrect assumptions, including that words can be neatly divided into regular and exceptional and that exceptions to the rules are randomly scattered across the vocabulary. Researchers now agree that the relations between written words and their linguistic forms are characterized by a richer statistical structure than envisaged by the traditional dual-route view, one that also includes morphology and graphotactics, and that people take advantage of probabilistic as well as deterministic patterns. However, dual-route theory remains influential, especially in reading instruction.
Box 2. Statistical learning about written words and their links to spoken words in children.
Children in modern societies are exposed to writing from an early age—in books, on cans of food, and so on. Even before they understand that writing symbolizes language, and even before they have begun formal literacy instruction, children begin to learn about the visual characteristics of writing [47, 48]. Thus, the scribbles that Chinese and U.S. 2- to 5-year-olds produce when asked to write are sparser, smaller, and more angular than the scribbles they produce when asked to draw, with less filling in [49, 50]. When children start using sequences of letters to write words, even sequences that do not make phonological sense follow some of the statistical patterns in the written words to which the children have been exposed. For example, 3- to 5-year-olds exposed to Portuguese use more vowel letters than those exposed to English, in line with the fact that Portuguese words contain a higher percentage of vowel letters than English words [51]. As children begin using letters that make phonological sense, their letter choices are influenced by orthographic as well as phonological plausibility. For example, Spanish primary school children pick up the general pattern that /b/ is more likely to be spelled as ⟨v⟩ before ⟨i⟩ and as ⟨b⟩ before ⟨u⟩ [52].
Reading of long and short vowels in English
A writing system is a way to visually record spoken language, and alphabets do this by using letters to represent the sounds of a language. However, the links between letters and sounds are not always simple. For example, in English ⟨a⟩ has one pronunciation in words like ⟨paper⟩ and another pronunciation in words like ⟨rabbit⟩. (This review uses angle brackets to enclose letters.) We refer to these pronunciations as long and short, respectively, because these terms are usually used in phonics teaching [e.g., 10, 11]. The other vowel letters also have both short and long pronunciations. A rule that is sometimes taught in phonics instruction is that a vowel letter typically has its long pronunciation when it is followed by a single consonant (C) letter and then a vowel (V) letter [12]. Words like ⟨paper⟩ and ⟨open⟩ follow this V̄CV rule and are seen as regular, whereas words like ⟨credit⟩ and ⟨comic⟩ are seen as exceptional.
If people follow the V̄CV rule, we would expect them to use primarily long first-syllable vowels for novel words like ⟨placo⟩ and ⟨yufar⟩, and we would expect them to do so regardless of the identity of the vowel. However, as illustrated by the bars marked “Behavior” in Figure 1, university students are more likely to use long pronunciations for ⟨u⟩ and ⟨o⟩ than for ⟨a⟩, ⟨e⟩, and ⟨i⟩ [13]. Differences among vowel letters are not expected on the basis of the V̄CV rule, and we know of no program of reading instruction that directs students to treat different vowels differently. The differences are consistent with a statistical-learning view of reading, however. Investigators who have examined the pronunciation of the first vowels of VCV sequences in large sets of English words have found that the percentage of words with a long first vowel is higher for ⟨u⟩ and ⟨o⟩ than for ⟨a⟩, ⟨e⟩, and ⟨i⟩ [12, 13], as shown by the bars marked “Vocabulary” in Figure 1. People appear to have internalized these differences among vowels through exposure to words and their pronunciations, without explicit teaching.
Figure 1.

Long pronunciations of first vowels of two-syllable items
Bars show mean percentage of two-syllable items in which first vowel is pronounced long before a single consonant: adults’ behavior with novel words vs. English lexicon (data from [13]).
The letters at the end of a nonword—letters that are not even adjacent to the first-syllable vowel—also influence adults’ choice between long and short pronunciations of this vowel. For example, people are more likely to use a long first-syllable vowel for disyllables ending with ⟨us⟩ and ⟨al⟩ than for items ending with ⟨ic⟩, ⟨id⟩, and ⟨it⟩ [13]. For disyllables with the latter three endings, in fact, people prefer short over long first-syllable vowels, in opposition to the V̄CV rule. The differences among endings, like those among vowels, reflect patterns in English itself. For example, long first-syllable vowels are more common in disyllabic words ending with ⟨us⟩ and ⟨al⟩ than in those ending with ⟨ic⟩, ⟨id⟩, and ⟨it⟩ [13, 14]. Words with short first-syllable vowels before single medial consonants would be seen as exceptions to the V̄CV rule, but the experimental results show that people prefer the exceptional pronunciation for specific endings.
Although skilled readers follow the vocabulary statistics in differentiating among vowels and among endings, their behavior does not perfectly mirror the vocabulary statistics. One difference is that, averaging across vowels and endings, readers produce fewer long vowels than would be anticipated given the vocabulary statistics. In an analysis that included data from 13 experiments with disyllabic nonwords, the percent of long first-syllable vowels relative to the total number of long and short vowels averaged 43% for university students, as compared to 69% in a vocabulary analysis for disyllabic words that weighted words by their frequency of occurrence [13]. Another difference between the behavioral and vocabulary data is that, although people are influenced by the identity of the ending, the differences among endings are not as large as the vocabulary statistics would suggest [13].
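The contrast between counting each word once and weighting words by their frequency of occurrence can be made concrete with a minimal sketch. The toy lexicon, its frequency values, and the function name `long_vowel_percentage` are invented for illustration; they are not the materials or the procedure of [13]. The sketch only shows how a frequency-weighted percentage of long first-syllable vowels might be computed.

```python
# Hypothetical toy lexicon of VCV disyllables. Each entry is
# (word, first vowel letter, first vowel is long, frequency).
# The frequency values are invented for illustration only.
TOY_LEXICON = [
    ("paper", "a", True, 100.0),
    ("cabin", "a", False, 50.0),
    ("open", "o", True, 200.0),
    ("comic", "o", False, 50.0),
    ("music", "u", True, 100.0),
]


def long_vowel_percentage(lexicon, weighted=True, vowel=None):
    """Return the percentage of items with a long first vowel.

    With weighted=True each word counts in proportion to its frequency;
    with weighted=False each word counts once. If vowel is given, only
    words whose first vowel letter matches are considered.
    """
    long_total = 0.0
    total = 0.0
    for _word, letter, is_long, freq in lexicon:
        if vowel is not None and letter != vowel:
            continue
        weight = freq if weighted else 1.0
        total += weight
        if is_long:
            long_total += weight
    if total == 0:
        raise ValueError("no matching words in lexicon")
    return 100.0 * long_total / total


if __name__ == "__main__":
    print(long_vowel_percentage(TOY_LEXICON, weighted=False))
    print(long_vowel_percentage(TOY_LEXICON, weighted=True))
    print(long_vowel_percentage(TOY_LEXICON, weighted=True, vowel="a"))
```

In the toy data, the frequent long-vowel words pull the weighted figure above the unweighted one. A figure of this kind, computed over a real vocabulary, is what the percentage of long pronunciations given to nonwords would be compared against.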
Statistical-learning views of reading lead us to expect that behavior of skilled readers would mirror the statistics of the writing system. Why is this not always the case? Learning about the influence of the ending letters on the pronunciation of the first-syllable vowel may be a long and difficult process because nonadjacent dependencies are difficult to learn [15] and because English has many different word endings, each with its own influence on the pronunciation of the vowel. The finding that people use fewer long vowels overall than anticipated based on the statistics of disyllabic words may also reflect people’s tendency to focus more on the vowel itself than on its context. Short pronunciations of vowels are common overall, as in many types of one-syllable words (e.g., ⟨cap⟩) and words of more than two syllables (e.g., ⟨national⟩), and people may be drawn toward short pronunciations for this reason. When people encounter a novel disyllable, therefore, they do not use long first-syllable vowels as often as anticipated given the vocabulary statistics for disyllables.
Reading of ⟨c⟩ and ⟨g⟩ in English and Italian
English is noted for the complexity of its writing system [16], but some complexities are found even in languages that are seen as highly transparent in their spelling-to-sound relationships, such as Italian. One complexity of Italian is that ⟨c⟩ is pronounced as /k/ when it precedes ⟨a⟩, ⟨o⟩, ⟨u⟩, or another consonant, e.g., ⟨caffè⟩. (Slashes enclose International Phonetic Alphabet [59] symbols for sounds; see Table 1 for information about the symbols used in this review.) Before ⟨e⟩ or ⟨i⟩, ⟨c⟩ is pronounced /tʃ/ as in ⟨violoncello⟩, a sound that is further forward in the mouth. The letter ⟨g⟩ behaves analogously, having a more front pronunciation, /dʒ/, when it comes before ⟨e⟩ or ⟨i⟩ (e.g., ⟨gelato⟩) than in other contexts (e.g., ⟨graffiti⟩). See Box 3 for information about the reasons behind this and other complexities. Italian university students almost always follow these rules about ⟨c⟩ and ⟨g⟩ when pronouncing novel items [17].
Table 1.
Symbols of the International Phonetic Alphabet [60] used in this review
| Symbol | Keyword containing sound in General American English |
|---|---|
| aɪ | nice |
| æ | bat |
| ɑ | dot |
| b | box |
| d | dive |
| dʒ | gene |
| e | cake |
| ə | a cover symbol for reduced, unstressed vowels, as in campus |
| ɛ | bed |
| f | fox |
| g | go |
| i | mete |
| ɪ | big |
| ju | unit |
| k | king |
| l | love |
| n | nest |
| o | vote |
| p | pet |
| r | red |
| s | six |
| t | tip |
| tʃ | chip |
| u | rude |
| ʊ | put |
| v | van |
| ˈ | shows that the next syllable is stressed |
Box 3. Why are writing systems complex?
Some of the complexities in reading and spelling reflect the fact that spoken language changes over time, and generally more quickly than written language. For example, Latin ⟨c⟩ always spelled /k/ and ⟨g⟩ always spelled /g/, sounds produced with the back of the tongue. The sounds changed as Latin evolved into the Romance languages, coming to be pronounced with the front of the tongue before front vowels. The original spellings of the sounds were retained, leading to the context-conditioned rule for reading of ⟨c⟩ and ⟨g⟩ that we see today in Romance languages, whereby readers need to consider the following vowel in order to pronounce the consonant. For spelling, the sound changes mean that /s/, for example, has two possible spellings in certain contexts. For example, Latin American Spanish /s/ is spelled as ⟨c⟩ in ⟨cena⟩ (‘dinner’), which derives from a Latin word that started with /k/, and as ⟨s⟩ in ⟨seña⟩ (‘sign’), which derives from a Latin word that started with /s/. Without knowing Latin, it is hard to predict which letter is used to spell /s/ in a particular word.
Languages change not only in the pronunciations of sounds but also by borrowing words from other languages. The original spellings of words are sometimes retained, and this accounts for some of the complexities in writing systems. For example, English borrowed many words from Latin and Romance languages [53], and such words usually have front pronunciations of ⟨g⟩ before ⟨i⟩ and ⟨e⟩, as in ⟨generous⟩. Words in the original Germanic vocabulary, such as ⟨girl⟩, have back pronunciations, especially at the start of the word.
English has a similar context-sensitive rule for ⟨c⟩: It is almost always pronounced as /k/ when before ⟨a⟩, ⟨o⟩, or ⟨u⟩ and as /s/, a sound that is pronounced more toward the front of the mouth, before ⟨e⟩ or ⟨i⟩. Across two experiments in which U.S. university students pronounced nonwords with initial ⟨c⟩ before each of ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, and ⟨u⟩, the percentage of /s/ pronunciations was 87% before ⟨e⟩, 86% before ⟨i⟩, and around 1% before ⟨a⟩, ⟨o⟩, and ⟨u⟩ [18]. Most participants, when later questioned about the pronunciation of ⟨c⟩, were not able to describe the factors that make one pronunciation more likely than another. Despite participants’ lack of conscious awareness, their pronunciations of ⟨c⟩ were clearly influenced by the following vowel. However, it is noteworthy that people used /k/ 13% to 14% of the time before ⟨e⟩ and ⟨i⟩ even though /k/ pronunciations of ⟨c⟩ virtually never occur in this context in English words.
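The context-sensitive rule and the size of the departure from it can be written out as a short sketch. The function names are hypothetical, and the rule is encoded only for the five vowel contexts named above; any other context raises an error rather than guessing. The percentages are those reported in the text for initial ⟨c⟩ [18].

```python
# Vowel letters named in the text as conditioning the pronunciation of <c>.
FRONT_CONTEXT = {"e", "i"}       # <c> read as /s/
BACK_CONTEXT = {"a", "o", "u"}   # <c> read as /k/


def rule_pronunciation_of_c(next_letter):
    """Return the rule-predicted phoneme for English <c> before a vowel letter."""
    letter = next_letter.lower()
    if letter in FRONT_CONTEXT:
        return "s"
    if letter in BACK_CONTEXT:
        return "k"
    raise ValueError(f"context not covered by this sketch: {next_letter!r}")


# Percentages of /s/ responses reported in the text for initial <c> [18].
OBSERVED_S_PERCENT = {"e": 87, "i": 86}


def percent_departing_from_rule(vowel):
    """Percentage of responses that were not /s/ before <e> or <i>."""
    return 100 - OBSERVED_S_PERCENT[vowel]


if __name__ == "__main__":
    for vowel in ("e", "i"):
        print(vowel, rule_pronunciation_of_c(vowel), percent_departing_from_rule(vowel))
```

A reader who applied the rule deterministically would show no departures before ⟨e⟩ and ⟨i⟩; the nonzero values are the gap between behavior and vocabulary that the paragraph describes.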
The discrepancy between the behavioral data and the vocabulary data appears to be smaller when it comes to learning about the influence of a vowel on the pronunciation of a preceding ⟨c⟩ than when it comes to learning about the influence of the final letters of a disyllable on the pronunciation of the first vowel, as discussed in the preceding section. The better learning in the case of ⟨c⟩ may reflect the strong context effects in the vocabulary, the fact that the vowel is immediately adjacent to the consonant, and the fact that there are just five vowels to learn about, as compared to a much larger number of word endings.
Still, it is interesting that readers of English with substantial experience of words in which ⟨c⟩ is pronounced as /s/ before ⟨e⟩ and ⟨i⟩ and virtually no experience of words in which ⟨c⟩ is pronounced as /k/ before ⟨e⟩ and ⟨i⟩ produce /k/ pronunciations over 10% of the time in this context. That is, even after years of experience, readers of English do not always follow a very strong regularity of their language. Italians, with similar experience of ⟨c⟩ words, adhere more strongly to the regularity. One reason why the match between the behavioral data and the vocabulary data is closer for Italian than for English may be that the Italian pattern is more general, in that ⟨g⟩ changes in the same way as ⟨c⟩. In English, the pronunciation of ⟨g⟩ changes with the following vowel in some words (e.g., ⟨gem⟩) but not in others (e.g., ⟨get⟩). What Italians learn about ⟨g⟩ thus reinforces what they learn about ⟨c⟩, whereas in English learning about ⟨g⟩ sometimes works against learning about ⟨c⟩. Another reason why the match between the behavioral data and the vocabulary data is closer for Italian ⟨c⟩ than for English ⟨c⟩ may be that English has many more complications in its spelling–sound relationships than Italian does. This may give Italian students and teachers more time and attention for the few complications that do occur.
Spelling of long and short vowels in English
We turn now to studies of how people spell short and long vowels in novel words. As we will discuss, these studies give us the opportunity to examine not only the learning of links between phonology and spelling but also the learning of statistical patterns that involve sequences of letters themselves: graphotactic patterns.
Consider disyllabic English words with a stressed vowel in the first syllable and a single medial consonant. Many such words are spelled with two identical medial consonant letters, as in ⟨latter⟩, and such spellings are especially common when the first vowel is pronounced as short. However, medial consonant doublets are quite uncommon when the first vowel is spelled as a digraph (a sequence of two vowel letters). Thus, English includes words like ⟨threaten⟩ but very few like ⟨threatten⟩ [19]. This is a graphotactic pattern, one that involves sequences of letters themselves. Adults appear to have internalized this graphotactic pattern, producing few spellings such as ⟨breappo⟩, and their behavior mirrors the vocabulary statistics in this way [19–21].
The behavioral data and the vocabulary data match less closely when it comes to the contrast between long and short vowels. University students are more likely to produce spellings with medial consonant doublets for nonwords with short first-syllable vowels than for nonwords with long first-syllable vowels, in line with the vocabulary statistics, but the difference is much smaller than anticipated given the vocabulary statistics. In one study, the percent of spellings with a single first-syllable vowel letter and a medial consonant doublet was 47% for nonwords with short first-syllable vowels and 2% for nonwords with long first-syllable vowels, as compared to 71% versus 1% in the vocabulary [19]. Children and learners of English as a second language are even less likely than university students to produce spellings with a single first-syllable vowel letter and a medial consonant doublet for nonwords with short first-syllable vowels [19, 22].
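The size of the mismatch can be worked through with the percentages just cited. The sketch below is an illustration only: the helper `context_effect` and the ratio of the two effects are descriptive devices introduced here, not measures reported in [19].

```python
def context_effect(percent_in_context, percent_out_of_context):
    """Difference, in percentage points, between two conditions."""
    return percent_in_context - percent_out_of_context


# Percentages of doublet spellings reported in the text [19]:
# after short vs. long first-syllable vowels.
behavior_effect = context_effect(47, 2)    # university students, nonwords
vocabulary_effect = context_effect(71, 1)  # English vocabulary

# Share of the vocabulary effect that shows up in behavior.
effect_ratio = behavior_effect / vocabulary_effect

if __name__ == "__main__":
    print(behavior_effect, vocabulary_effect, round(effect_ratio, 2))
```

The behavioral effect is 45 percentage points against 70 in the vocabulary, so a little under two thirds of the vocabulary effect appears in the students' spellings.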
Why do even adults with years of reading and spelling experience use many fewer double consonants after phonologically short vowels than expected based on the vocabulary statistics, while following the vocabulary statistics more closely in the case of the graphotactic pattern? The answer does not lie in how students are taught to read and spell. No instructional program for learners of English as a first or a second language, to our knowledge, teaches that a consonant doublet rarely follows a vowel digraph. In contrast, some programs teach an explicit rule about the doubling of consonants after short vowels [23, 24]. Nor does the better performance on the graphotactic pattern reflect better conscious awareness. Few adults have explicit knowledge that consonant doubling varies with the number of letters used to spell the preceding vowel, whereas some are consciously aware that consonant doubling varies with the sound of the preceding vowel [21].
One reason that learning to avoid consonant doublets after vowel digraphs is easier than many other aspects of learning to spell in English is that it requires attention only to what words look like. Children start learning about these matters even before they learn how printed words link to spoken words (Box 2) and, once they have begun to read, they can continue to learn what words look like even when they are unsure of or not attending to their pronunciations. Also contributing to the ease of learning the graphotactic regularity is its breadth. Sequences of a vowel digraph followed by a consonant doublet are rare not only in the middles of morphemes but also at the ends. For example, ⟨deaff⟩ would be highly unusual in English. Moreover, the graphotactic pattern applies not only to sequences of two identical consonant letters but also to several other consonant sequences that represent a single phoneme, including ⟨ck⟩. People’s knowledge about word-final sequences [25, 26] thus reinforces their knowledge of medial sequences.
Learning to double consonants after short vowels is one of the harder aspects of English spelling because it requires learning that a feature of spelling, a consonant doublet, can encode both characteristics of the consonant phoneme and characteristics of the preceding vowel phoneme. For example, the ⟨ll⟩ of ⟨belly⟩ signals the identity of the medial consonant phoneme—that it is /l/ rather than some other consonant—and that the preceding vowel is pronounced short. Further contributing to the difficulty of learning about medial consonant doubling is that the pattern is somewhat restricted in terms of phonemes and positions. Medial ⟨r⟩ behaves differently than other consonants, the first vowels of ⟨marry⟩ and ⟨Mary⟩ being alike in some dialects of English. And consonant doubling functions differently at the ends of words than it does in the middles. Even though ⟨bell⟩ and ⟨bel⟩ differ in whether the final consonant is doubled, both are pronounced with a short vowel.
Use of morphology in spelling
The spelling of a sound can vary with other sounds or letters in a word, as when /b/ is spelled as ⟨b⟩ in ⟨rabies⟩ but as ⟨bb⟩ in ⟨rabbit⟩, or with position in a word, as when /l/ is spelled as ⟨l⟩ in ⟨let⟩ but as ⟨ll⟩ in ⟨tall⟩. In this section, we consider another property of a word that can affect the choice among spelling alternatives: its morphological structure, or the smaller meaningful units within it.
As an example of one of the many ways in which morphology influences spelling [27], consider the English adjectives ⟨generous⟩, ⟨jealous⟩, and ⟨enormous⟩. These all end with the sound sequence /əs/, and it is spelled as ⟨ous⟩. Indeed, among English adjectives that end with /əs/, setting aside those in which /əs/ is part of a larger suffix (as in ⟨hopeless⟩), 98% are spelled with final ⟨ous⟩ [28]. When /əs/ occurs at the ends of words with other parts of speech, it has a variety of spellings, as in ⟨cactus⟩ and ⟨canvas⟩, but virtually never ⟨ous⟩. No program of spelling instruction that we know of explicitly teaches that the spelling of /əs/ depends on whether or not it is a separate morpheme. To find out whether adults have picked this up, researchers have presented participants with sentences in which a novel item is used as an adjective, as in “She is very /ˈdrɪbəs/”, or as another part of speech, as in “She is a /ˈdrɪbəs/”, and asked them to spell either the novel item or the full sentence. Adults produce more ⟨ous⟩ spellings in adjective contexts than in other contexts, but the difference is substantially smaller than the 98% that is anticipated on the basis of the vocabulary statistics: 6% in a community sample of adults [29], 18% in a study of university students [30], and 13% in a study that pooled data from native speakers of English and students of English at a German university [31]. The participants in these studies often used ⟨is⟩ and ⟨us⟩ at the ends of adjectives, preferring these shorter and simpler spellings over the ⟨ous⟩ that marks adjectival status.
The failure of adults to fully follow vocabulary statistics related to morphology is not limited to adjectival ⟨ous⟩. One of the studies mentioned above [29] also included nonwords that ended with /ək/, which is virtually always spelled as ⟨ic⟩ at the ends of adjectives, as in ⟨angelic⟩ and ⟨scenic⟩, and which is spelled as ⟨ic⟩ in 33% of English words that are not adjectives [28]. Participants produced ⟨ic⟩ 50% of the time for nonwords that were used as adjectives and 42% of the time for those that were not, a much smaller difference than anticipated given the size of the effect in the English vocabulary. Nor is the failure of adults to fully follow the vocabulary statistics related to morphology limited to spelling production tasks. Similar results were found when participants were asked to choose the better of two presented spellings for nonwords ending in /əs/ and /ək/ when they occurred in sentence contexts [32].
It is impressive that adults show some knowledge of untaught aspects of spelling, but the large gap between their behavior and the vocabulary statistics is surprising. What explains this gap? One reason may be, as discussed previously, that spellers focus more on sounds themselves than on their contexts. Another reason may be that many English adjectives with final ⟨ous⟩ and ⟨ic⟩ begin with a recognizable morpheme, as in ⟨humor⟩ + ⟨ous⟩ and ⟨angel⟩ + ⟨ic⟩. The spellings ⟨dribous⟩ for /ˈdrɪbəs/ and ⟨blenic⟩ for /ˈblɛnək/ do not and are in this way graphotactically unusual. What people learn about the links between part of speech and spelling, which favor the spelling ⟨dribous⟩ for the sentence “She is very /ˈdrɪbəs/,” is thus tempered by their graphotactic knowledge and their preference for simpler spellings, which favors two-letter endings as in ⟨dribis⟩ or ⟨dribus⟩ over the three-letter ending in ⟨dribous⟩.
Statistical learning in the world and the laboratory
Since the discovery that infants can extract statistics about the structure of a miniature artificial language with just a few minutes of experience [5–8], statistical learning has been seen as automatic and powerful. If infants perform above the level of chance on a test of statistical knowledge after several minutes’ experience, shouldn’t adults with daily and decades-long exposure to written English consistently read ⟨c⟩ as /s/ before ⟨i⟩ and ⟨e⟩ and spell adjectives ending with /əs/ with ⟨ous⟩? There are at least three reasons for the imperfect match between the behavioral and the vocabulary statistics.
Complexity. The statistics involved in reading and writing are more complex than the statistics of the artificial languages used in laboratory studies [33]. Statistics of different kinds exist and interact for reading and writing: statistics about the sequences of letters in written words, statistics about the sequences of sounds in spoken words, statistics about the links between spellings and sounds without regard to context, and statistics about how these links vary across different kinds of contexts. The complexity of the statistics makes learning difficult.
Ease of production. People’s choices are influenced not just by the conformity of the options to the statistics of the world but also by the ease of producing the options. Thus, people may prefer to use a short spelling even if it does not mark a word’s part of speech, as with ⟨us⟩ as opposed to ⟨ous⟩.
Satisficing. People often solve real-world problems in a way that gives satisfactory but not optimal results [34, 35]. If others understand ⟨rythum⟩ as a spelling for ⟨rhythm⟩, for example, a speller may consider this a reasonable attempt and not make the effort to learn the conventional spelling and the reasons for it.
Reading and spelling instruction
Reading and spelling instruction that includes explicit tuition about the links between letters and sounds—phonics—works better than instruction that focuses on the reading and spelling of whole words without analysis, allowing learners to better generalize to untaught items [36–39]. But what kind of phonics instruction works best? The traditional approach of teaching that words should obey certain rules, and that those that deviate from the rules must be memorized as wholes, overlooks the probabilistic patterns in writing systems. What is the best way for children to learn these patterns? Some phonics programs teach only the most common sound for each letter or digraph (e.g., ⟨ch⟩ to /tʃ/, ⟨e⟩ to /ɛ/), assuming that explicit teaching of a minimal set of common correspondences will allow children to begin reading independently and that they can then use their statistical-learning skills to induce the subtler and less common patterns [40]. Other programs teach that some letters and digraphs have more than one possible pronunciation and that some sounds can be spelled in more than one way. Children are encouraged to adopt a set for variability, trying multiple options and choosing the one that yields the most sensible result [41, 42]. Again, the assumption is that children will acquire further knowledge on their own through statistical learning. However, statistical learning can be slow and incomplete, as we have seen, and there is a need to boost it through instruction. For example, teachers covering the digraph ⟨oo⟩ can group words into those that have ⟨oo⟩ before ⟨k⟩ and those that have ⟨oo⟩ before other letters. They can point out that the digraph is usually pronounced as /ʊ/ in the first set of words, including ⟨book⟩ and ⟨cook⟩, but as /u/ in the second set, including ⟨boot⟩ and ⟨moon⟩, or they can help children notice the pattern themselves. Such instruction can help alert children that context can make a difference, something that can be hard for them to learn without such hints.
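The ⟨oo⟩ grouping can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, the sketch returns only the usual pronunciation described above, and it does not model exceptions to the pattern.

```python
def usual_pronunciation_of_oo(word):
    """Return the usual vowel for the first <oo> in word, per the pattern in the text.

    Returns "ʊ" when <oo> is followed by <k>, "u" when it is followed by
    another letter or by nothing, and None when the word has no <oo>.
    """
    lowered = word.lower()
    index = lowered.find("oo")
    if index == -1:
        return None
    following = lowered[index + 2 : index + 3]
    return "ʊ" if following == "k" else "u"


def group_by_context(words):
    """Split words into the two sets a teacher might present."""
    groups = {"oo before k": [], "oo elsewhere": []}
    for word in words:
        vowel = usual_pronunciation_of_oo(word)
        if vowel == "ʊ":
            groups["oo before k"].append(word)
        elif vowel == "u":
            groups["oo elsewhere"].append(word)
    return groups


if __name__ == "__main__":
    print(group_by_context(["book", "cook", "boot", "moon"]))
```

Grouping words by the letter that follows the digraph, rather than by the digraph alone, is what makes the conditioning context visible to the learner.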
Concluding remarks
Research applying the concept of statistical learning to word reading and spelling began by asking whether children perform above the level expected by chance on tests assessing knowledge of untaught patterns (Box 2) and whether children’s and adults’ performance on laboratory tests of statistical learning relates to their reading and spelling performance (Box 4). More recent studies of statistical learning have brought attention to the question addressed here: Do people with years of experience with a writing system faithfully follow its statistics? We would expect them to do so if statistical learning is as rapid and powerful as laboratory studies suggest. However, the results reviewed here suggest that skilled spellers and readers do not always follow the vocabulary statistics. People are drawn to what is common and easy, and they may not fully appreciate how the pronunciation of a letter or the spelling of a sound is affected by the context in which it appears. This tendency to underestimate the effects of context is not unique to reading and spelling: People often do the same when interpreting the behavior of others, drawing conclusions about other people’s enduring dispositions from behaviors that can be explained by the situations in which the behaviors appear [43]. The implication of the present results for literacy instruction is that it is unrealistic to teach children a minimal set of letter-sound correspondences and expect them to induce the more complex statistics of a writing system without guidance. The research further points to the need for models of word reading and spelling and their acquisition to account for the finding that users of writing systems benefit from probabilistic patterns, not only deterministic rules (Box 1). Much of the research we have reviewed has involved English, and more work is needed with users of other languages (see Outstanding Questions). Still, it is clear that the findings of laboratory studies of statistical learning do not generalize straightforwardly to the real world.
Box 4. Individual differences in statistical learning.
One of the most active areas of research in the study of statistical learning as it relates to reading and spelling has been whether there is a general statistical-learning ability on which people can be arrayed and whether people who possess more of this ability are better readers and spellers than those who possess less [2, 54, 55]. One motivation behind these studies is that statistical-learning ability might be tested before children learn to read and spell so that children with poor statistical-learning skills can be given special attention. Another motivation is that teaching children to improve their statistical-learning skills using tasks that do not involve written words might improve their reading and spelling. According to a recent meta-analysis that included studies with children and adults, the correlation between performance in laboratory tasks of statistical learning and reading is weak (0.24), especially in children (0.18) [56]. The weak correlation may, in part, reflect the fact that the systems used in laboratory studies differ substantially from real writing systems. Also, whereas commonly used statistical-learning tasks are sensitive enough to pick up differences between the performance of a group of participants and chance-level performance, they are not very sensitive to variations among participants [57, 58]. Indeed, many participants perform at levels that are not statistically different from chance [58].
In our view, a more fruitful way to apply the concept of statistical learning to word reading and spelling is to study the statistics of actual writing systems and how people learn them, including studies of individual and developmental differences in knowledge of these statistics [e.g., 29, 59]. Because teaching people to perform a task is more effective than teaching them something else and expecting it to transfer, we do not anticipate that teaching of statistical-learning skills outside the context of written language would have much benefit for reading or spelling.
Acknowledgments
Preparation of this manuscript was supported in part by NIH grant R01HD102346.
Glossary
- C
stands for any letter that normally spells a consonant sound.
- digraph
a sequence of two letters that spells a single sound, such as ⟨ea⟩ spelling /ɛ/ in ⟨head⟩.
- dual-route
models of reading positing two parallel routes: (1) recalling words by memorized whole-word spellings and (2) decoding the pronunciation with rules saying what sounds letters and digraphs can have, then recalling the word by that pronunciation. The lexical route would win with words like ⟨of⟩, because there is no rule that says that ⟨f⟩ spells the sound /v/.
- front
formed at the roof of the mouth between the upper teeth and the hard palate. Front sounds include /s/, /tʃ/, /dʒ/, /i/, and /e/, but /k/ and /g/ are back consonants.
- graphotactic
involving patterns of letters, without consideration of their sounds. For example, ⟨ff⟩ cannot appear at the beginning of an English word, even though many English words begin with /f/.
- long vowel
one of two major pronunciations of a vowel letter (see also short). For the vowels ⟨a, e, i, o, u, y⟩ the long pronunciations in General American English are /e, i, aɪ, o, ju, aɪ/.
- morphological
dealing with the composition of words in terms of functional units (morphemes). “Cats” has two morphemes: “cat” bearing meaning and “s” marking grammatical plurality.
- novel words
words that a participant in an experiment is unlikely to know, typically because the researcher invents them (nonwords).
- satisfice
to choose the first acceptable option that one comes across rather than searching for the best.
- set for variability
an approach to reading where the reader first tries the usual letter–sound patterns and, if that does not yield a known word, tries less common patterns or searches for similar-sounding words.
- short vowel
one of two major pronunciations of a vowel letter (see also long). For the vowels ⟨a, e, i, o, u, y⟩ the short pronunciations in General American English are /æ, ɛ, ɪ, ɑ, ə, ɪ/.
- statistical learning
the process of learning patterns by implicitly keeping track of the relative frequencies with which objects or events are encountered.
- V
any letter that normally spells a vowel sound, namely ⟨a, e, i, o, u⟩ and maybe ⟨y⟩; see also C. For example, the pattern CVCV could describe the word ⟨coda⟩ but not ⟨coal⟩.
- V̄
a vowel letter that is assigned a long pronunciation; e.g., CV̄CC fits ⟨pint⟩ but not ⟨mint⟩.
- V̄CV rule
the (violable) rule that when a single consonant comes between two vowels, the first vowel, if stressed, will be pronounced long. For example, ⟨coda⟩ has the VCV pattern in it — ⟨…oda⟩ — and, in accordance with the rule, the ⟨o⟩ is long: /ˈkodə/.
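The C and V notation defined in this glossary can be made concrete with a short Python sketch. It is an illustration only, not a procedure from the studies reviewed: the function names are hypothetical, ⟨y⟩ is treated as a consonant letter for simplicity, and digraphs are not handled, so every letter counts on its own. The code works purely on letters; it finds places where the V̄CV rule could apply but says nothing about how the vowel is actually pronounced.

```python
import re

# Assumption for this sketch: only a, e, i, o, u count as vowel letters;
# the glossary notes that y may sometimes count as well.
VOWEL_LETTERS = set("aeiou")


def cv_skeleton(word):
    """Map each letter to 'V' (vowel letter) or 'C' (any other character)."""
    return "".join("V" if ch in VOWEL_LETTERS else "C" for ch in word.lower())


def fits_pattern(word, pattern):
    """True if the word's letter-by-letter skeleton equals the given C/V pattern."""
    return cv_skeleton(word) == pattern


def vcv_sites(word):
    """Return the letter strings where one consonant letter sits between two vowel letters.

    A lookahead is used so that overlapping sites are all reported.
    """
    skeleton = cv_skeleton(word)
    return [word[m.start():m.start() + 3] for m in re.finditer(r"(?=VCV)", skeleton)]


print(cv_skeleton("coda"))           # skeleton of the glossary example
print(fits_pattern("coal", "CVCV"))  # the glossary's non-matching example
print(vcv_sites("coda"))             # candidate site for the V̄CV rule
```

With the glossary's own examples, ⟨coda⟩ maps to CVCV and contains the site ⟨oda⟩, whereas ⟨coal⟩ maps to CVVC and contains no VCV site. Whether a reader actually assigns the long pronunciation at such a site is the empirical question; the sketch only identifies where the letter pattern occurs.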
References
- 1. Chetail F (2017) What do we do with what we learn? Statistical learning of orthographic regularities impacts written word processing. Cognition 163, 103–120
- 2. Sawi OM & Rueckl J (2019) Reading and the neurocognitive bases of statistical learning. Sci. Stud. Read. 23, 8–23
- 3. Siegelman N et al. (2020) Using information-theoretic measures to characterize the structure of the writing system: The case of orthographic-phonological regularities in English. Behav. Res. Methods 52, 1292–1312
- 4. Treiman R & Kessler B (2022) Statistical learning in word reading and spelling across languages and writing systems. Sci. Stud. Read. 26, 139–149
- 5. Gómez RL & Gerken L (2000) Infant artificial language learning and language acquisition. Trends Cogn. Sci. 4, 178–186
- 6. Isbilen E & Christiansen M (2022) Statistical learning of language: A meta-analysis into 25 years of research. Cogn. Sci. 46, e13198
- 7. Saffran JR et al. (1996) Statistical learning by eight-month-old infants. Science 274, 1926–1928
- 8. Saffran JR (2001) Words in a sea of sounds: The output of infant statistical learning. Cognition 81, 149–169
- 9. Brown GDA (1998) The endpoint of skilled word recognition: The ROAR model. In Word recognition in beginning literacy (Metsala JL & Ehri LC, eds.), pp. 121–138, Erlbaum
- 10. Fisher D et al. (2023) Wonders. McGraw Hill
- 11. Simmons L (2020) Saxon phonics and spelling. Houghton Mifflin Harcourt
- 12. Kearns DM (2020) Does English have useful syllable division patterns? Read. Res. Q. 55, S145–S160
- 13. Treiman R & Kessler B (2023) Spelling-to-sound translation for English disyllables: Use of long and short vowels before single medial consonants. J. Exp. Psychol.: Learn. Mem. Cogn. 49, 2034–2047
- 14. Berg K (2016) Double consonants in English: Graphemic, morphological, prosodic and etymological determinants. Read. Writ. 29, 453–474
- 15. Misyak JB & Christiansen MH (2012) Statistical learning and language: An individual differences study. Lang. Learn. 62, 302–331
- 16. Share DL (2008) On the Anglocentricities of current reading research and practice: The perils of overreliance on an ‘outlier’ orthography. Psychol. Bull. 134, 584–615
- 17. Job R et al. (1999) Lexical effects in naming pseudowords in shallow orthographies: Further empirical data. J. Exp. Psychol.: Hum. Percept. Perform. 24, 622–630
- 18. Treiman R et al. (2007) Anticipatory conditioning of spelling-to-sound translation. J. Mem. Lang. 56, 229–245
- 19. Altmiller R et al. (2023) Double trouble: Using spellings of different lengths to represent vowel length in English. J. Exp. Child Psychol. 231, 105649
- 20. Treiman R & Boland K (2017) Graphotactics and spelling: Evidence from consonant doubling. J. Mem. Lang. 92, 254–264
- 21. Treiman R & Wolter S (2018) Phonological and graphotactic influences on spellers’ decisions about consonant doubling. Mem. Cogn. 46, 614–624
- 22. Yin L et al. (2020) Decisions about consonant doubling among non-native speakers of English: Graphotactic and phonological influences. Read. Writ. 33, 1839–1858
- 23. Calkins L & Báez A (2019) Big words take big resolve: Tackling multisyllabic words. Heinemann
- 24. Gentry JR (2022) Spelling connections (Teacher edition, 2). Zaner-Bloser
- 25. Hayes H et al. (2006) Children use vowels to help them spell consonants. J. Exp. Child Psychol. 94, 27–42
- 26. Treiman R & Kessler B (2016) Choosing between alternative spellings of sounds: The role of context. J. Exp. Psychol.: Learn. Mem. Cogn. 42, 1154–1159
- 27. Hegland SS (2021) Beneath the surface of words: What English spelling reveals and why it matters. Learning about Spelling
- 28. Berg K & Aronoff M (2017) Self-organization in the spelling of English suffixes: The emergence of culture out of anarchy. Language 93, 37–64
- 29. Treiman R et al. (2021) How sensitive are adults to the role of morphology in spelling? Morphology 31, 261–271
- 30. Ulicheva A et al. (2020) Skilled readers’ sensitivity to meaningful regularities in English writing. Cognition 195, 103810
- 31. Heyer V (2021) Below the surface: The application of implicit morpho-graphic regularities to novel word spelling. Morphology 31, 243–260
- 32. Iwao HK et al. (2025) Sensitivity to morphological spelling regularities in Chinese-English bilinguals and English monolinguals. Read. Writ. 38, 503–530
- 33. Frost R, Armstrong BC & Christiansen MH (2019) Statistical learning research: A critical review and possible new directions. Psychol. Bull. 145, 1128–1153
- 34. Simon HA (1956) Rational choice and the structure of the environment. Psychol. Rev. 63, 129–138
- 35. Ferreira F, Engelhardt PE & Jones MW (2009) Good enough language processing: A satisficing approach. In Proc. 31st Annual conf. of the Cogn. Sci. Soc. (Vol. 1, pp. 413–418). Cognitive Science Society
- 36. Castles A et al. (2018) Ending the reading wars: Reading acquisition from novice to expert. Psychol. Sci. Public Interest 19, 5–51
- 37. Ehri LC et al. (2001) Systematic phonics instruction helps students learn to read: Evidence from the National Reading Panel’s meta-analysis. Read. Res. Q. 36, 250–287
- 38. Rastle K et al. (2021) The dramatic impact of explicit instruction on learning to read in a new writing system. Psychol. Sci. 32, 471–484
- 39. Rose J (2006) Independent review of the teaching of early reading final report. U.K. Department for Education and Skills
- 40. Solity JE (2020) Instructional psychology and teaching reading: Ending the reading wars. Ed. and Dev. Psychol. 37, 123–132
- 41. Savage R et al. (2019) Preventative reading interventions teaching direct mapping of graphemes in texts and set-for-variability aid at-risk learners. Sci. Stud. Read. 22, 225–247
- 42. Steacy LM et al. (2023) Set for variability as a critical predictor of word reading: Potential implications for early identification and treatment of dyslexia. Read. Res. Q. 58, 254–267
- 43. Gilbert DT & Malone PS (1995) The correspondence bias. Psychol. Bull. 117, 21–38
- 44. Coltheart M et al. (2001) DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychol. Rev. 108, 204–256
- 45. Ziegler et al. (2000) The DRC model of visual word recognition and reading aloud: An extension to German. Eur. J. Cogn. Psychol. 12, 413–430
- 46. Seidenberg MS et al. (2022) Models of word reading: What have we learned? In Science of Reading: A handbook, 2nd ed. (Snowling M et al., eds.), pp. 36–59, Wiley
- 47. Pacton S & Commissaire E (2024) Statistical learning and spelling: The case of graphotactic regularities. Année Psychol. 124, 317–345
- 48. Treiman R et al. (2022) Prephonological spelling and its connections with later word reading and spelling performance. J. Exp. Child Psychol. 218, 105359
- 49. Treiman R & Yin L (2011) Early differentiation between drawing and writing in Chinese children. J. Exp. Child Psychol. 108, 786–801
- 50. Otake S et al. (2017) Differentiation of writing and drawing by U.S. two- to five-year-olds. Cogn. Dev. 43, 119–128
- 51. Pollo TC et al. (2009) Statistical patterns in children’s early writing. J. Exp. Child Psychol. 104, 410–426
- 52. Carrillo MS & Alegría J (2014) The development of children’s sensitivity to bigram frequencies when spelling in Spanish, a transparent writing system. Read. Writ. 27, 571–590
- 53. Hernandez A et al. (2021) German in childhood and Latin in adolescence: On the bidialectal nature of lexical access in English. Humanit. Soc. Sci. Commun. 8, 162
- 54. Ren J & Wang M (2024) Contribution of statistical learning in learning to read across languages. PLoS ONE, e0298670
- 55. van Witteloostuijn M et al. (2021) The contribution of individual differences in statistical learning to reading and spelling performance in children with and without dyslexia. Dyslexia 27, 168–186
- 56. Ren J et al. (2023) A meta-analysis on the correlations between statistical learning, language, and reading outcomes. Dev. Psychol. 59, 1626–164
- 57. Arnon I (2020) Do current statistical learning tasks capture stable individual differences in children? An investigation of task reliability across modality. Behav. Res. Methods 52, 68–81
- 58. Siegelman N, Bogaerts L & Frost R (2017) Measuring individual differences in statistical learning: Current pitfalls and possible solutions. Behav. Res. Methods 49, 418–432
- 59. Jared D et al. (2013) Discrimination of English and French orthographic patterns by biliterate children. J. Exp. Child Psychol. 114, 469–488
- 60. International Phonetic Association (1999) Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press
- 61. Katz L & Frost R (1992) The reading process is different for different orthographies: The orthographic depth hypothesis. In Advances in psychology, Vol. 94. Orthography, phonology, morphology, and meaning (Katz L & Frost R, eds.), pp. 67–84, North-Holland
