Author manuscript; available in PMC: 2024 Feb 1.
Published in final edited form as: J Mem Lang. 2023 Jan 5;129:104399. doi: 10.1016/j.jml.2022.104399

Number and Syllabification of Following Consonants Influence Use of Long Versus Short Vowels in English Disyllables

Rebecca Treiman 1, Brett Kessler 2, Kayla Hensley 3
PMCID: PMC10100579  NIHMSID: NIHMS1888484  PMID: 37064814

Abstract

Spelling-to-sound translation in English is particularly complex for vowels. For example, the pronunciations of ‹a› include the long vowel of ‹paper› and ‹sacred› and the short vowel of ‹cactus› and ‹happy›. We examined the factors that are associated with use of long versus short vowels by conducting analyses of English disyllabic words with single medial consonants and consonant sequences and three behavioral studies in which a total of 119 university students pronounced nonwords with these structures. The vocabulary analyses show that both the number of medial consonants and their syllabification influence vowel length. Participants were influenced by these aspects of context, some of which are not explicitly taught as a part of reading instruction. Although these results point to implicit statistical learning, participants produced fewer long vowels before single medial consonants than anticipated based on our vocabulary statistics for spelling-to-sound correspondences in disyllabic words. Participants also produced more long vowels before two identical consonant letters than anticipated given these statistics. We consider the reasons for these outcomes, and we also use the behavioral data to test two models of spelling-to-sound translation.

Keywords: spelling-to-sound translation, vowel length, consonantal context, syllabification, statistical learning, models of reading


The spellings of English words are not always related to their pronunciations in a straightforward manner, raising questions about how people choose among different possible pronunciations of a letter. Complexities in spelling–sound translation are especially notable for vowels. For example, ‹a› is pronounced as /e/ in ‹paper› and ‹sacred› but as /æ/ in ‹cactus› and ‹happy›. Educators refer to the former pronunciation as long (usually notated ā) and the latter as short (ă). We use those labels here, and we use the term normative to refer to these two most common pronunciations. The vowels ‹e›, ‹i›, ‹o›, and ‹u› also have long and short pronunciations. Table 1 shows the long and short pronunciations of each vowel, and it provides examples of words with each pronunciation. Each vowel letter also has some less common pronunciations, which we refer to as other pronunciations. Table 1 shows examples of some of the other pronunciations for each vowel.

Table 1.

Long, Short, and Other Pronunciations of Vowels and Sample Disyllabic Word with Each Pronunciation in Stressed First Syllable

| Vowel | Long | Short | Other |
|---|---|---|---|
| a | /e/ (paper) | /æ/ (cactus) | /ɑ/ (wander), /ɛ/ (many) |
| e | /i/ (Peter) | /ɛ/ (pepper) | /e/ (Vegas) |
| i | /aɪ/ (nitrate) | /ɪ/ (litmus) | /i/ (kilo) |
| o | /o/ (tonal) | /ɑ/ (tonic) | /ʌ/ (comfort), /u/ (movie) |
| u | /u/ (tuna), /ju/ (puny) | /ʌ/ (tunnel) | /ʊ/ (sugar) |

When a vowel letter has more than one possible pronunciation, how do readers choose among them? In the three studies reported here, we focus on the choice between the normative pronunciations: the long one and the short one. We test the idea that within-word context can help signal whether a vowel letter is pronounced as long or short and that readers take advantage of the contextual clues. We examine these issues not for the monosyllabic items that have been the focus of most previous studies of reading but for disyllables, specifically for the first vowels of disyllables with medial VCV and VCCV spelling sequences (V = vowel, C = consonant). The large majority of such words have a stressed first vowel (e.g., Treiman et al., 2020), but what determines whether the vowel is pronounced as long or short? In the vocabulary analyses that we report, we analyze English words with medial VCVs and VCCVs to study the contextual clues to vowel pronunciation that exist in the language. In behavioral studies, we examine university students’ use of the clues by asking them to pronounce nonwords. We use nonwords for the behavioral studies because readers do not have stored representations of these items that they can use for pronunciation. Comparing the results of the vocabulary and behavioral studies allows us to test a statistical learning view of reading: that readers pick up patterns relating spellings to sounds, even ones that are not explicitly taught (e.g., Arciuli, 2018; Sawi & Rueckl, 2019). If experienced readers are optimal learners in this sense (Brown, 1998), we would expect their pronunciations to closely mirror the vocabulary statistics.

We also consider two models that were designed to simulate how people translate from spellings to pronunciations for items whose pronunciations have not been stored in memory, comparing the models’ vowel pronunciations to those produced by our participants. Many models of the spelling-to-sound translation process (e.g., Chang et al., 2019; Coltheart et al., 2001; Harm & Seidenberg, 2004) have been implemented only for monosyllables. One of the few models that makes specific predictions for disyllables, that of Rastle and Coltheart (2000), includes a set of rules that determine whether the first vowel of a disyllable receives stress and, if stressed, whether it is short or long. The rules are built into the model, and they are based in part on those of the Coltheart et al. model. We also test the model of Perry et al. (2010). This model includes a network that learns about the connections between letters and phonemes. The model is trained on simple phonics rules (e.g., ‹l› to /l/, ‹a› to /æ/) and then with a large set of mono- and disyllabic English words and their pronunciations. We examined the pronunciations produced by the trained model. Whereas previous evaluations of the models’ performance on disyllables have examined which syllable is stressed (Ktori et al., 2018; Mousikou et al., 2017), we focus on the models’ use of long versus short vowels in the first syllables of disyllables that are assigned first-syllable stress and how this compares to skilled readers’ use of long and short vowels.

Before elaborating on what is known about use of long and short vowels in disyllables, we consider some findings about vowel pronunciation in monosyllables. Analyses of the spelling–sound correspondences in the monosyllabic words of English show that single vowel letters usually have their short pronunciations in words that end with one or more consonant letters. But there are some exceptions, such as that ‹i› often has its long pronunciation when it is followed by ‹ld› or ‹nd›, as in ‹mild› and ‹mind› (Kessler & Treiman, 2001; Venezky, 1970). The university students tested by Treiman et al. (2003) sometimes used the long vowel “ī” when asked to pronounce novel items such as ‹brild›, suggesting a sensitivity to this pattern. This sensitivity to context may reflect implicit statistical learning, for many instructional programs do not explicitly teach that ‹ld› and ‹nd› influence the pronunciation of a preceding ‹i›. Participants’ rate of long pronunciations for ‹i› before ‹ld› and ‹nd› was substantially lower than expected based on the statistics that Treiman et al. calculated for the English vocabulary, however. This difference suggests that even experienced readers do not make full use of contextual cues (the following consonants, in this case) that signal a less common vowel pronunciation. Thus, contrary to views of readers as optimal statistical learners (e.g., Brown, 1998; Sawi & Rueckl, 2019), even experienced readers may not have fully internalized the patterns in the words to which they are exposed.

The study by Treiman et al. (2003) that was described above, like many other studies (e.g., Steacy et al., 2019; Treiman et al., 2006), focused on the influence of context on the pronunciations of letters in monosyllables. Longer words are critical for understanding texts’ meanings, in part because they are less predictable from the surrounding context than shorter words (Mahowold et al., 2021), but they have received less research attention. In the present study, we take a step beyond past work by conducting vocabulary analyses and behavioral studies with disyllables.

A common view, and one that is often promoted in programs that use a phonics approach to reading instruction, is that words with a single ‹a›, ‹e›, ‹i›, ‹o›, or ‹u› in a stressed first syllable are usually pronounced with a long first-syllable vowel if the vowel is followed by a CV and with a short vowel if the vowel is followed by a CCV (e.g., Cox & Hutcheson, 1988; Kearns, 2020). Educators explain the difference in vowel length by saying that a VCV sequence normally has a syllable break (which we notate as |) before the consonant. The first syllable is thus open, ending with a vowel, and vowels have their long pronunciations in open syllables. This V̄|CV rule explains why words like ‹paper› have long vowels in their first syllables. VCCV sequences are thought to be syllabified between the two consonants. According to the V̆C|CV rule, the first syllable of a word like ‹cactus› is closed, ending with a consonant, and is therefore pronounced as short.

Words like ‹paper› and ‹cactus› follow the postulated V̄|CV and V̆C|CV rules, but it is easy to find words that do not. For example, ‹lemon› is an exception to the V̄|CV rule and ‹sacred› is an exception to the V̆C|CV rule. Kearns (2020) examined the reliability of the V̄|CV and V̆C|CV rules by analyzing a large set of English words that appear in reading materials for students in Grades 1 through 8. Among disyllables with a stressed ‹a›, ‹e›, ‹i›, ‹o›, or ‹u› in the first syllable and a single medial consonant, 65% of all normative vowel pronunciations were long. Thus, although the V̄|CV rule holds for the majority of words, there are many exceptions like ‹lemon›. The V̆C|CV rule was more reliable in that only 6% of all normative first-syllable vowel pronunciations in disyllabic words with a stressed ‹a›, ‹e›, ‹i›, ‹o›, or ‹u› in the first syllable and two medial consonants were long.

Kearns (2020) lumped together virtually all disyllabic words with medial VCCVs in his analyses of the reliability of the V̆C|CV rule in such words. However, there are linguistic reasons to expect that the pronunciation of the first vowel may vary with the syllabification of the medial cluster. Phonologists have long argued that, for most languages, words are syllabified in such a way that the onset (initial consonant or cluster) of each syllable contains as many consonants as possible given the phonological constraints of the language (e.g., Kahn, 1976; Pulgram, 1970). That is, there is a preference for consonants to be part of an onset rather than a coda when phonologically possible. According to this maximal onset principle, a V|CCV division is preferred when the consonant sequence is a legal onset cluster, one that may occur at the beginnings of words. The /kr/ sequence spelled ‹cr› in ‹sacred› is such a cluster, as illustrated by words like /kretɚ/ ‹crater›. Thus, ‹sacred› is syllabified before the cluster according to the maximal onset view. Its first syllable is open, potentially explaining its long vowel. In contrast, the /kt/ sequence spelled ‹ct› as in ‹cactus› is not a legal onset cluster in English. The word is thus syllabified between the two medial consonants, /k/ ‹c› forming the coda (the ending consonantal portion) of the first syllable and /t/ ‹t› forming the onset of the second syllable. The first syllable of ‹cactus› is closed, and this may explain its short vowel.
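The syllabification logic just described can be illustrated with a minimal Python sketch. It is not part of the original analyses: the function name `split_medial` and the short set `LEGAL_ONSET_SPELLINGS` are hypothetical, the set is deliberately incomplete, and the sketch works on spellings rather than phonemes, so it only approximates the phonological principle.

```python
# Hypothetical, deliberately incomplete set of two-letter spellings that can
# begin English words; a real analysis would need a fuller, vetted list.
LEGAL_ONSET_SPELLINGS = {
    "bl", "br", "cl", "cr", "dr", "fl", "fr", "gl", "gr", "pl", "pr", "tr",
}
VOWEL_LETTERS = set("aeiou")


def split_medial(word: str) -> str:
    """Insert '|' at the syllable break predicted by onset maximization.

    Assumes the word starts with one consonant letter and one vowel letter,
    followed by one or two medial consonant letters (CVCV... or CVCCV...).
    """
    first_vowel = 1
    # Collect the consonant letters that follow the first vowel.
    end = first_vowel + 1
    while end < len(word) and word[end] not in VOWEL_LETTERS:
        end += 1
    medial = word[first_vowel + 1:end]

    if len(medial) == 1 or medial in LEGAL_ONSET_SPELLINGS:
        cut = first_vowel + 1   # open first syllable: V|CV or V|CCV
    elif len(medial) == 2:
        cut = first_vowel + 2   # closed first syllable: VC|CV
    else:
        raise ValueError(f"unsupported medial sequence: {medial!r}")
    return word[:cut] + "|" + word[cut:]


for example in ("paper", "sacred", "cactus"):
    print(example, "->", split_medial(example))
```

Under these assumptions, ‹paper› and ‹sacred› come out with open first syllables and ‹cactus› with a closed one, matching the examples in the text.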

In the vocabulary analysis that we conducted as a part of Study 1, we distinguished words in which the medial consonant sequence can be an onset, such as ‹sacred› and ‹macro›, from words in which the medial consonant sequence must be analyzed as a coda plus an onset, such as ‹cactus›. We refer to these two types of words as onset+onset and coda+onset, respectively. According to the V̆C|CV rule of phonics, the vowel of the first syllable should be short in both cases. The maximal onset principle, in contrast, predicts that long first-syllable vowels should be common for onset+onset words but not for coda+onset words. We also examined onset single words: those like ‹bacon› in which the single medial consonant is a well-formed onset. According to both the V̄|CV rule of phonics and the maximal onset principle, the first vowels of such words should be long. The maximal onset principle further predicts that the rate of long first-syllable vowels should be equally high for onset single words and onset+onset words because both are syllabified after the first vowel. In the behavioral portion of Study 1, we asked participants to pronounce nonwords of the onset+onset, coda+onset, and onset single types. We examined participants’ use of long and short vowels and the degree to which it corresponded to the statistics that we calculated for disyllabic words with similar structures. In Studies 2 and 3, we also obtained data on onset double items: those with medial CC sequences such as ‹tt› in which the two consonants are identical and are pronounced as a single consonant sound in a syllable onset.

No vocabulary analysis, to our knowledge, has examined long and short pronunciations of the first vowel in English disyllables with different types of medial consonant sequences. In the only directly relevant previous behavioral study, which was reported as a conference presentation, Waese and Jared (2006) asked university students to pronounce isolated nonwords of the onset single, onset+onset, and coda+onset types. The percentage of long vowel pronunciations was 26% for onset single items such as ‹febisk›, 12% for onset+onset items such as ‹feblid›, and 6% for coda+onset items such as ‹febgal›. These results suggest that people do not always follow the V̄|CV and V̆C|CV rules when translating from spelling to sound. Readers appear to make some distinctions between different types of medial consonant sequences, and they use long vowels much less often for onset single items than expected given the V̄|CV rule. A limitation of the Waese and Jared study is that, although items of the three types were matched for their initial CVCs, they were not matched for the vowels and consonants that followed. The results of a vocabulary analysis conducted by Berg (2016) suggest that, for disyllabic words with medial single consonants, the identity of the final sequence influences the pronunciation of the first vowel. If Waese and Jared’s participants were sensitive to these effects, this could have caused uncontrolled variation in vowel pronunciation.

The behavioral portion of our Study 1 included the same categories of items as in the study of Waese and Jared (2006), and its design incorporated several improvements. We used triplets of items such as ‹fupel›, ‹fuprel›, and ‹fupmel› that differed only in whether there was another consonant immediately after the first medial consonant and whether this consonant, if present, allowed for a legal onset cluster (e.g., ‹pr›) or not (e.g., ‹pm›). This matching procedure ensures that any influences of letters other than the medial consonant or cluster on the pronunciation of the first vowel are the same across the items of a triplet. We presented each nonword in isolation, to familiarize participants with it, and then asked participants to read it aloud in a sentence such as “Ann bought a fupmel.” This sentence reading task is more natural than the reading of isolated nonwords. Using a sentence reading task also allows us to place the nonwords in noun contexts, which has been shown to promote pronunciations with first-syllable stress (e.g., Ktori et al., 2018; Smith & Baker, 1976; Treiman et al., 2020). To further promote first-syllable stress, we also instructed participants to emphasize the first vowels of the nonwords.

Study 1 included, in addition to the behavioral study, an analysis of a large set of English disyllabic words that were very similar in structure to the nonwords of the behavioral study. We examined the rate of long and short pronunciations of the words’ first vowels, comparing the results to those expected on the basis of the V̄|CV and V̆C|CV rules and on the basis of the maximal onset principle. We also compared the patterns in the vocabulary analysis to those in the behavioral study in order to test the hypothesis that experienced readers have internalized the spelling-to-sound patterns of their language through statistical learning. Finally, we compared the pronunciations of our participants to those produced by the models of Rastle and Coltheart (2000) and Perry et al. (2010).

Study 1

Data Availability

Data and analysis scripts for this and the other studies are available at https://osf.io/ckxsa/?view_only=274f5fee16ad4354b2f8b6d5567e1293

Method

Behavioral Study

Participants.

We recruited 40 undergraduate students from Washington University in St. Louis through the subject pool of the Department of Psychological and Brain Sciences. Students in this and the other studies provided informed consent and received either pay or course credit in exchange for participation. Twenty-seven of the participants identified as female and 13 as male. Their mean age was 19.7 years (range 18–24). The first 16 participants completed the study in a lab, with an investigator transcribing their responses in the same room. Due to the onset of the COVID-19 epidemic, the rest of the participants completed the study over Zoom. Participants reported that they were native English speakers with no language-related disabilities, including hearing or vision impairments. We excluded data from one additional person who did not meet these guidelines.

Stimuli.

We designed triplets of disyllabic nonwords that began with a single consonant letter followed by ‹a›, ‹e›, ‹i›, ‹o›, or ‹u›. In the onset single item of each triplet, the initial consonant–vowel sequence was followed by a consonant letter that corresponded to a stop consonant or a fricative and then either one or two letters that were expected to be pronounced as a single vowel phoneme (e.g., ‹y›, ‹ee›) or a vowel–consonant sequence (e.g., ‹on›, ‹it›). Examples include ‹nafex› and ‹bipu›. The onset+onset items, such as ‹nafrex› and ‹biplu›, were formed by adding ‹r› or ‹l› after the medial consonant of the single medial consonant item to form a cluster that may appear at the beginnings of well-known English words. The coda+onset items were formed by adding a consonant after the medial consonant of the single medial consonant item to form a sequence that does not appear at the beginnings of well-known English words. Examples are ‹nafpex› and ‹bipmu›. We did not use items with medial ‹s› given reports that medial clusters beginning with ‹s› behave differently than other medial clusters (Eddington et al., 2013; Redford & Randall, 2005; Treiman et al., 1992). Also, we avoided medial ‹x› because it is normally pronounced as a sequence of two consonant phonemes (/ks/) rather than a single consonant. There were 59 triplets of items. These, together with the items for the other studies, are listed in the OSF site. (One triplet that was originally constructed for the study had an error, and the results on this triplet were not included in the analyses.) All participants were presented with all of the items.
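The way the three items of a triplet relate to one another can be sketched in a few lines of Python. This is an illustration only, not the procedure the items were actually generated with; `make_triplet` and its parameter names are hypothetical, and the choice of which letters to insert is left to the caller.

```python
def make_triplet(onset_single: str, onset_letter: str, coda_onset_letter: str) -> dict:
    """Derive onset+onset and coda+onset items from an onset single item.

    Assumes the layout described in the text: letter 0 is the initial
    consonant, letter 1 the first vowel, letter 2 the single medial consonant.
    """
    if onset_letter not in ("r", "l"):
        raise ValueError("onset+onset items are formed by adding 'r' or 'l'")
    head, tail = onset_single[:3], onset_single[3:]
    return {
        "onset_single": onset_single,
        # Extra letter creates a cluster that can begin English words.
        "onset_onset": head + onset_letter + tail,
        # Extra letter creates a sequence that cannot begin English words.
        "coda_onset": head + coda_onset_letter + tail,
    }


print(make_triplet("nafex", "r", "p"))
print(make_triplet("bipu", "l", "m"))
```

Because the three items share everything except the inserted letter, any influence of the other letters on the first vowel is held constant within a triplet.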

The nonwords were presented in frames such as “Ann bought a ___.” Each frame began with a one-syllable proper noun or pronoun. This was followed by a one-syllable transitive verb and then either “the” or “a.” The nonword appeared after the end of the frame, serving as the object of the verb. The frames were paired with the experimental nonwords in a different random arrangement for each participant. For example, one participant read “Ann bought a pifgoc” and another participant read “James holds the pifgoc.”

We constructed seven disyllabic nonwords such as ‹spoiky› for use in the practice phase of the experiment. The first syllable of these items had a two-letter sequence that corresponded to a syllable nucleus rather than a single vowel letter as in the experimental nonwords. The use of different vowels ensured that experience on the practice trials could not skew participants toward particular pronunciations of the first vowels of the experimental nonwords. The sentence frames for the practice trials were similar to those for the experimental items. Each practice nonword was paired with the same frame for all participants, and the order of the practice nonwords was also the same.

Procedure.

Participants completed the experiment in a session that lasted approximately 45 minutes. Those who consented to participate were given a short questionnaire about demographics, language background, and any reading, spelling, hearing, or speech problems they may have had. The investigator told the participant that, on each trial, a nonword would appear on a screen for six seconds and the participant should pronounce it aloud, taking their time to decide on a pronunciation. The investigator explained that the nonword would then appear in a sentence, which participants should read aloud as naturally as possible. The investigator told participants that they should emphasize the first vowels of the nonwords, pronouncing a nonword with first-syllable stress and with second-syllable stress in order to illustrate the desired stress pattern. The investigator took the participants through the seven practice trials, telling them to push a button to proceed to the next trial. The investigator demonstrated the correct response for the first two practice trials. If a participant stressed the second syllable on a later practice nonword, the investigator reminded them to emphasize the first vowel. The first three practice trials were then reprised, but this time they were presented as in the experimental trials, with a new trial beginning after a sentence had appeared on the screen for six seconds.

The experimental trials began after the completion of the practice trials. The experimental trials were presented in a different random order for each participant. Participants were given the opportunity to take a short break after completing half of the experimental trials. The procedure for the experimental trials was the same as that for the last three practice trials, except that the investigator did not immediately correct participants if they stressed the second syllable of a nonword. The investigator provided such a reminder during the break if a participant had produced some responses with second-syllable stress.

Scoring.

The investigator transcribed the participants’ pronunciations as they read the nonwords in the sentence. For trials that yielded a clear response with stress on the first syllable, the first vowel was classified as long if the investigator scored it as having the normative long pronunciation shown in Table 1, short if it had the normative short pronunciation, and other if it had any other pronunciation. To check their reliability, a second researcher provided an independent transcription of the pronunciations of four participants. Comparing the scoring based on the two researchers, 99.4% of the 720 trial scores agreed in whether the nonword was judged clear enough to be transcribed and 99.6% of the remainder agreed on whether the pronunciation was stressed on the first syllable. Of those trials, the pronunciation of the first vowel was transcribed the same in 96.5% of the comparisons. In cases of disagreement, we used the transcriptions of the first scorer.

Vocabulary Analysis

Stimuli.

We selected from the Unisyn lexicon version 1.3 (http://www.cstr.ed.ac.uk/projects/unisyn) words that began with the spelling pattern used in the experimental items. Specifically, we chose words that began with any consonant letter, including ‹y›; then a single ‹a›, ‹e›, ‹i›, ‹o›, or ‹u›; then a consonant letter that spells an obstruent, but not ‹s› or ‹x›, followed optionally by another consonant letter, either one that spells an obstruent, but not ‹s› or ‹x› or a repeat of the preceding letter, or ‹l›, ‹m›, or ‹n›. The rest of the word could be any sequence starting with a vowel letter, including ‹y›, optionally followed by one more letter. We also filtered words by their pronunciation. For a word to be included in the vocabulary analysis, it had to consist of two syllables, with primary stress on the first syllable. It had to begin with a single consonant, then a stressed vowel, then a single obstruent other than /s/ or /ʃ/ that is followed, in the case of items with two medial consonant letters, by another obstruent (other than /s/ or /ʃ/) or a liquid or nasal. The next phoneme had to be a vowel, followed optionally by one consonant. We used the General American pronunciations that were generated by Unisyn, editing them lightly by hand to correct pronunciations that are, in fact, not common in the United States. This editing was largely based on the pronunciations given by Dictionary.com (https://www.dictionary.com). We removed items that the dictionary labeled as chiefly British, such as ‹labour›. For heterographic homophones, we included all spelling–pronunciation combinations as separate entries. The resulting word list contained 900 entries. We divided words into onset single, onset+onset, and coda+onset categories based on whether they contained one or two consonant letters after the first vowel and, for words with two medial consonants, whether the two letters occurred at the beginnings of well-known English words.

Scoring.

The pronunciation of the first vowel was scored as long, short, or other following the same procedures as in the behavioral study.

Models

We tested the Rastle and Coltheart (2000) model by implementing in R the subset of the model required for our research questions. The model considers nonwords to have first-syllable stress unless they have a stress-repelling prefix or a stress-attracting suffix. Because not all such prefixes and suffixes were listed in the Rastle and Coltheart article, we built our lists by studying Fudge (1984), examples of how the algorithm has been applied in the past, and personal files documenting past usage that Kathleen Rastle generously shared with us. When the first vowel is stressed, the model yields the long pronunciation for ‹a›, ‹o›, and ‹u› when the consonant that comes after the vowel is immediately followed by ‹e›, ‹ue›, or final ‹y›. Otherwise, the model yields the short vowel.
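The vowel-length rule summarized in the preceding paragraph can be written out as a short Python sketch. It encodes only that one-sentence summary, not the full Rastle and Coltheart (2000) algorithm or the R implementation used in the study. The function name `rc_first_vowel_length` is hypothetical, and the sketch assumes that the item begins with a single consonant letter and has already been assigned first-syllable stress.

```python
def rc_first_vowel_length(nonword: str) -> str:
    """Return 'long' or 'short' for the first vowel of a CVC... letter string.

    Encodes only the rule as summarized in the text: long for a, o, u when
    the consonant after the vowel is immediately followed by 'e', 'ue',
    or a word-final 'y'; short otherwise. Stress assignment is not modeled.
    """
    vowel = nonword[1]
    after_consonant = nonword[3:]  # letters following the post-vowel consonant
    triggers_long = (
        after_consonant.startswith("e")
        or after_consonant.startswith("ue")
        or after_consonant == "y"
    )
    if vowel in ("a", "o", "u") and triggers_long:
        return "long"
    return "short"


for item in ("nafex", "nafrex", "nafpex", "bipu"):
    print(item, rc_first_vowel_length(item))
```

Under this reading, a second medial consonant always blocks the long vowel, because the letter after the first medial consonant is then a consonant rather than ‹e›, ‹ue›, or final ‹y›.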

We tested the Perry et al. (2010) model by running our nonwords through their CDP++ program for Windows (Perry et al., 2014). We used the program’s batch mode, allowing up to 300 cycles for each word. We turned off the fast-processing option. Nonwords were scored as pronounced with stress on the first syllable if their trochaicity activation (field 5) was higher than their iambicity activation (field 6). For items given first-syllable stress, we scored vowel length by extracting the first vowel symbol from the output and interpreting it in relation to the first vowel letter in the nonword. For ‹a›, ‹e›, ‹i›, ‹o›, ‹u› respectively, the symbols “1, i, 2, 5, u” were scored as long vowels, and “{, E, I, Q, V” were scored as short vowels; all other symbols were considered “other.”
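The scoring scheme for the model output can be sketched as follows. This is an illustrative Python sketch rather than the script used in the study: the function name and its arguments are hypothetical, and it assumes that the first vowel symbol and the two stress activations have already been extracted from the program's batch output, whose exact file layout is not reproduced here.

```python
# Symbol-to-category mappings as listed in the text, keyed by vowel letter.
LONG_SYMBOL = {"a": "1", "e": "i", "i": "2", "o": "5", "u": "u"}
SHORT_SYMBOL = {"a": "{", "e": "E", "i": "I", "o": "Q", "u": "V"}


def score_model_output(vowel_letter, vowel_symbol, trochaic, iambic):
    """Classify the first vowel of one model pronunciation.

    Returns None when the item is not scored as having first-syllable
    stress (trochaic activation not higher than iambic activation);
    otherwise returns 'long', 'short', or 'other'.
    """
    if trochaic <= iambic:
        return None
    if vowel_symbol == LONG_SYMBOL[vowel_letter]:
        return "long"
    if vowel_symbol == SHORT_SYMBOL[vowel_letter]:
        return "short"
    return "other"


# Hypothetical activations, chosen only to exercise each branch.
print(score_model_output("a", "1", 0.8, 0.2))
print(score_model_output("a", "{", 0.8, 0.2))
print(score_model_output("i", "i", 0.8, 0.2))
print(score_model_output("o", "5", 0.1, 0.9))
```

Interpreting each symbol relative to the vowel letter matters because the same symbol can be normative for one letter and not for another: "i" is the long vowel for ‹e› but falls in the other category for ‹i›.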

Results

Behavioral Study

We eliminated data from trials on which the pronunciation was dysfluent, the audio was unclear, or the second syllable received primary stress; we also omitted data from the item triplet that contained an error. Data from 5% of the trials were dropped for one of these reasons. On the 6,862 trials that were included in the analyses, participants produced long pronunciations 25% of the time, short pronunciations 70% of the time, and other pronunciations 5% of the time. The column labelled “Behavioral study” in Table 2 shows, for each type of medial sequence in the study, the mean and standard deviation by subjects of the percentage of normatively long pronunciations relative to all normative pronunciations.

Table 2.

Percentage of Normative Pronunciations with Long Vowel in First Syllable in Behavioral Study, Vocabulary Analysis, and Models for Study 1

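| Item type | Behavioral study: M | Behavioral study: SD | Vocabulary analysis: % | Vocabulary analysis: n (a) | Rastle and Coltheart (2000) model: % | Rastle and Coltheart (2000) model: n (b) | Perry et al. (2010) model: % | Perry et al. (2010) model: n (b) |
|---|---|---|---|---|---|---|---|---|
| Onset single (e.g., ‹f›) | 47 | 21 | 79 | 513 | 5 | 58 | 43 | 56 |
| Onset+onset (e.g., ‹fr›) | 29 | 18 | 44 | 80 | 0 | 58 | 25 | 56 |
| Coda+onset (e.g., ‹fg›) | 5 | 8 | 0 | 200 | 0 | 58 | 17 | 54 |

(a) Number of words with normative pronunciation of first-syllable vowel.

(b) Number of words pronounced by model with normative stressed pronunciation of first-syllable vowel.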

We conducted a mixed-model logistic regression analysis with data from the trials on which the first vowel was pronounced as either long or short. This and the other mixed-model analyses that we report were performed in R using the lme4 package (Bates et al., 2015). The dependent variable was whether the vowel of the first syllable was long, and medial type was a fixed factor with onset+onset as the reference level. The model included random intercepts for participants and item triplets. The contrast between the onset single case and the onset+onset case was statistically significant (b = 1.35, SE = 0.09, p < .001). As Table 2 shows, long vowels occurred 47% of the time before single medial consonants as compared to 29% of the time before onset clusters. The contrast between the coda+onset case and the onset+onset case was also significant (b = −3.12, SE = 0.14, p < .001). Participants used long vowels only 5% of the time for nonwords with coda+onset sequences, significantly less than the 29% observed for onset+onset sequences. The results were very similar for the participants run in person and via Zoom.
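Because the coefficients above are on the log-odds scale, it can help to convert them to odds ratios. The Python sketch below does this for the two reported contrasts and adds approximate 95% Wald intervals. The intervals are an illustration based on the reported standard errors and the usual normal approximation; they are not values reported in the study, and the helper name `odds_ratio_with_ci` is hypothetical.

```python
import math


def odds_ratio_with_ci(b: float, se: float, z: float = 1.96):
    """Convert a log-odds coefficient and its SE to an odds ratio
    with an approximate Wald confidence interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Coefficients and standard errors as reported in the text
# (reference level: onset+onset).
contrasts = {
    "onset single vs. onset+onset": (1.35, 0.09),
    "coda+onset vs. onset+onset": (-3.12, 0.14),
}

for label, (b, se) in contrasts.items():
    ratio, lower, upper = odds_ratio_with_ci(b, se)
    print(f"{label}: OR = {ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Read this way, the odds of a long vowel were roughly 3.9 times as high for onset single items as for onset+onset items, and roughly 0.04 times as high for coda+onset items. These ratios are conditional on the random intercepts for participants and item triplets, so they need not match ratios computed from the raw percentages in Table 2.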

Vocabulary Analysis

In the 900 words with the types of medials examined here, 49% had long pronunciations, 39% had short pronunciations, and 12% had other pronunciations of the first vowel. Many of these other pronunciations occurred in relatively recent loanwords, such as /i/ for the ‹i› of kilo. Others reflected influences of adjacent letters that have been noted in previous studies, such as /ɑ/ for ‹a› after ‹w› (e.g., wander) (Kessler & Treiman, 2001). The column marked “Vocabulary analysis” in Table 2 shows the percentages of long vowel pronunciations relative to all normative pronunciations and the number of words with normative pronunciations for each type of medial. The data show that the percentage of long pronunciations is highest before single consonants and intermediate before onset clusters. The first vowel is never long in words in which the first consonant of the medial cluster must be analyzed as the coda of the first syllable and the second consonant must be analyzed as the onset of the second syllable. The data also show that words with onset+onset clusters are less numerous than words with coda+onset clusters.
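The overall percentages and the per-type figures in Table 2 use different denominators (all 900 words versus words with normative pronunciations only). The Python sketch below shows how the two sets of numbers fit together. Because it works backward from rounded percentages, the recovered counts are approximate; the variable names are illustrative and not taken from the analysis scripts.

```python
# Vocabulary analysis columns of Table 2: (% long among normative, n normative).
table2_vocab = {
    "onset single": (79, 513),
    "onset+onset": (44, 80),
    "coda+onset": (0, 200),
}
TOTAL_WORDS = 900  # all entries, including "other" pronunciations

normative = sum(n for _, n in table2_vocab.values())
long_estimate = sum(pct / 100 * n for pct, n in table2_vocab.values())
short_estimate = normative - long_estimate
other = TOTAL_WORDS - normative

# Express each category relative to all words, as in the overall summary.
for label, count in (("long", long_estimate),
                     ("short", short_estimate),
                     ("other", other)):
    print(f"{label}: about {round(100 * count / TOTAL_WORDS)}% of all words")
```

Under these approximations, the three categories come out at about 49%, 39%, and 12% of all words, in line with the overall figures given in the text.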

Models

Table 2 shows the percentage of long first-syllable vowels relative to the total number of normative vowels for each type of item for each model. The table also shows the number of words given normative stressed vowel pronunciations by each model.

Discussion

According to the V̄|CV and V̆C|CV rules of phonics, the number of medial consonants in a disyllabic word determines whether the first vowel is long or short (e.g., Cox & Hutcheson, 1988; Kearns, 2020). The rules predict long vowels for disyllables with a stressed first vowel and a medial single consonant and short vowels for disyllables with two medial consonants. The results of our vocabulary analysis show that, among disyllables with normative stressed pronunciations of the first vowel, long vowels are indeed more likely when there is one medial consonant than when there are two. However, the number of medial consonants is not the only determinant of whether the first vowel is long or short. The properties of the medial consonant sequence are also influential. We found no long vowels for words such as cactus for which the first consonant of a medial sequence must be analyzed as the coda of the first syllable and the second consonant must be analyzed as the onset of the second syllable. We found a higher rate of long vowels, 44%, when the medial consonants can form a syllable onset, as in sacred. Thus, a long pronunciation of the preceding vowel is more likely when the medial consonant sequence can be an onset. Our results also show that long first-syllable vowels are less common for words with medial sequences that form onsets, where they occur less than half the time, than for words with single medial consonants, where they occur more than three quarters of the time. If onset maximization (e.g., Kahn, 1976; Pulgram, 1970) were the only determinant of long versus short vowel use, we should have seen equally high rates of long vowels for these two types of words. The results of our vocabulary analysis thus suggest that the distribution of long versus short vowels is influenced by the number of following consonants and, if there are two consonants, by the syllabification of the consonant sequence.

The participants in our behavioral study showed, as in the vocabulary analysis, the highest rate of long vowels in the onset single case, an intermediate rate in the onset+onset case, and the lowest rate in the coda+onset case. Waese and Jared (2006) found the same three-way distinction, although their stimuli were not as carefully controlled as ours. In both our study and theirs, participants used many fewer long vowels than expected given the vocabulary statistics that we calculated for items with single medial consonants and items with medial onset clusters. In the subsequent discussions, we will consider possible reasons for the differences between the behavioral results and the vocabulary statistics calculated here.

In Study 2, we examined a case that was not included in Study 1 or in the Waese and Jared (2006) study: items with medial sequences such as ‹tt› in which the two consonants are identical and are pronounced as a single consonant sound, as in “letter.” Visually, double consonants form a VC|CV sequence, which predicts a short vowel, but phonetically they form a V|CV sequence, which is discordant; a graphic V|CV sequence would predict a long vowel. Kearns (2020) did not distinguish in his vocabulary analysis between VCCVs in which the two consonants are identical and those in which they are not, but we made this distinction in the vocabulary analysis of Study 2. In the behavioral portion of Study 2, we used quadruplets such as ‹potil›, ‹potril›, ‹potnil›, and ‹pottil›. This gave us the opportunity to replicate the behavioral results of Study 1 for the onset single, onset+onset, and coda+onset cases and to examine the onset double case. The behavioral portion of Study 2 used the same procedure as that of Study 1, except that we added the reading subtest of the Wide Range Achievement Test (WRAT; Wilkinson & Robertson, 2017) in order to characterize the general word-reading skill of our participants.

Study 2

Method

Behavioral Study

Participants.

We recruited 39 undergraduate students from the same population as Study 1, none of whom had participated in that study. The students received course credit in exchange for participation. The mean age of the participants was 19.9 years (range 18–22). They all reported that they were native English speakers with no language-related disabilities. Thirty-three of the participants identified as female and six as male. The participants’ mean standard score on the reading subtest of the WRAT was 111 (SD = 10).

Stimuli.

We designed quadruplets of disyllabic nonwords that began with a single consonant letter followed by ‹a›, ‹e›, ‹i›, ‹o›, or ‹u›. Each quadruplet included an onset single item, an onset+onset item, and a coda+onset item. These were formed in the same way as in Study 1, and all but two of the nonwords were different from those of Study 1. Each quadruplet also contained an onset double item that was formed by doubling the medial consonant of the onset single item. There were 45 quadruplets of items. The nonwords were presented in the same sentence frames as in Study 1, and all participants received all of the nonwords. The practice trials were the same as in Study 1.
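The four item types can be illustrated with a short sketch that classifies the consonant letters between the first and second vowel letters of a nonword. This is only a rough orthographic approximation: the set `ONSET_CLUSTERS` and the function name `medial_type` are assumptions made for illustration and are not the criteria or materials used in the study.

```python
# Illustrative sketch only. ONSET_CLUSTERS is an assumed, incomplete list of
# two-letter sequences that can begin an English syllable.
VOWELS = set("aeiou")
ONSET_CLUSTERS = {"bl", "br", "cl", "cr", "dr", "fl", "fr",
                  "gl", "gr", "pl", "pr", "tr"}


def medial_type(nonword: str) -> str:
    """Classify the consonant letters between the first two vowel letters."""
    first = next(i for i, ch in enumerate(nonword) if ch in VOWELS)
    end = first + 1
    while end < len(nonword) and nonword[end] not in VOWELS:
        end += 1
    if end == len(nonword):
        raise ValueError("no second vowel letter found")
    medial = nonword[first + 1:end]
    if len(medial) == 1:
        return "onset single"
    if len(medial) == 2:
        if medial[0] == medial[1]:
            return "onset double"
        if medial in ONSET_CLUSTERS:
            return "onset+onset"
        return "coda+onset"
    raise ValueError("expected one or two medial consonant letters")


for item in ["potil", "potril", "potnil", "pottil"]:
    print(item, "->", medial_type(item))
```

The example items are the quadruplet cited in the introduction to Study 2; any other nonword of the same shape can be passed to the function in the same way.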

Procedure.

Due to the COVID-19 epidemic, all participants completed the study over Zoom. The procedure was like that of Study 1 except that, in a small number of cases, a reminder to emphasize the first syllable of the nonwords was given when a participant stressed the second syllable in a stretch of nonwords. After the nonword reading task, the words from the word reading subtest of the WRAT were presented on the computer screen for the participant to read aloud. This test includes 42 words that are presented in order of increasing difficulty. We omitted the first 13 words on the assumption that our university student participants would read these easy words correctly.

Scoring.

The investigator scored each participant’s pronunciation of the first vowel of each nonword in the sentence frame as long and stressed, short and stressed, other and stressed, or unstressed. Cases in which no first-syllable vowel could be identified did not receive a score. A second researcher independently scored all of the responses of six participants. The two researchers agreed on whether the response was audible for 99.9% of these trials; of those, they agreed on whether the pronunciation was stressed on the first syllable for 98.5%. Of those, the first vowel was coded the same in 97.6% of the cases. In cases of disagreement, we used the transcriptions of the primary investigator.

Vocabulary Analysis

We selected words for the vocabulary analysis in the same way as for Study 1, adding words with double medial consonants. We classified each first-syllable vowel as long, short, or other.

Models

We ran the nonwords through the models and classified the pronunciations in the same way as described in Study 1.

Results

Behavioral Study

We eliminated data from the 4% of trials on which no first-syllable vowel could be identified or the second syllable received primary stress. On the remaining 6,758 trials, participants produced long pronunciations of the vowel 21% of the time and short pronunciations 75% of the time; 5% of the pronunciations fell into the “other” category. The second column of data in Table 3 shows, for each type of medial sequence, the mean and standard deviation by subjects of the proportion of long pronunciations relative to all normative pronunciations.
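The by-subjects summary described above can be sketched as follows. The trial codes are hypothetical and serve only to show the calculation: for each participant, the proportion of long pronunciations is taken relative to long plus short (normative) pronunciations, and the mean and standard deviation are then computed across participants. The use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev


def long_proportion(codes):
    """Proportion of long vowels among normative (long or short) responses."""
    long_n = codes.count("long")
    short_n = codes.count("short")
    return long_n / (long_n + short_n)


# Hypothetical scored trials for two participants (not data from the study).
# Unstressed or unscorable trials are assumed to have been removed already.
trials = {
    "P01": ["long", "long", "short", "short"],
    "P02": ["long", "short", "short", "short", "other"],
}

per_subject = [long_proportion(codes) for codes in trials.values()]
summary = (mean(per_subject), stdev(per_subject))
print(summary)
```

Responses coded as “other” stay in the trial list but do not enter the denominator, which matches the definition of the proportion given in the text.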

Table 3.

Mean Percentage of Normative Pronunciations with Long Vowel in First Syllable in Behavioral Study (Standard Deviation in Parentheses), Vocabulary, and Models in Study 2

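| Item type | Behavioral study: M | Behavioral study: SD | Vocabulary analysis: % | Vocabulary analysis: n (a) | Rastle and Coltheart (2000) model: % | Rastle and Coltheart (2000) model: n (b) | Perry et al. (2010) model: % | Perry et al. (2010) model: n (b) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Onset single (e.g., ‹f›) | 40 | 16 | 79 | 513 | 13 | 45 | 36 | 45 |
| Onset+onset (e.g., ‹fr›) | 25 | 13 | 44 | 80 | 0 | 45 | 22 | 45 |
| Coda+onset (e.g., ‹fg›) | 8 | 8 | 0 | 200 | 0 | 45 | 0 | 42 |
| Onset double (e.g., ‹ff›) | 18 | 14 | 0 | 425 | 0 | 45 | 0 | 45 |

(a) Number of words with normative pronunciation of first-syllable vowel.

(b) Number of words pronounced by model with normative stressed pronunciation of first-syllable vowel.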

We conducted a mixed-model logistic regression analysis with data from the 6,446 trials on which the first vowel was pronounced as either long and stressed or short and stressed. The dependent variable was whether the vowel of the first syllable was long, and medial type was a fixed factor with onset double as the reference level (18% long). The model included random intercepts for participants and item quadruplets. The rate of long vowel use was significantly higher in the onset single case (b = 1.88, SE = 0.12, p < .001), 40%, and the onset+onset case (b = 0.74, SE = 0.11, p < .001), 25%, than in the onset double case. The rate of long vowel use was significantly lower for coda+onset items, 8%, than for onset double items (b = −1.30, SE = 0.14, p < .001).
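Because the coefficients above are on the log-odds scale, exponentiating them gives odds ratios relative to the onset double reference level. The sketch below does this for the reported values. The Wald-style 95% intervals are an added assumption; no intervals are reported in the text. These are conditional odds ratios from a model with random intercepts, so they should not be expected to map directly onto the raw percentages.

```python
import math

# Coefficients (log-odds) and standard errors as reported for Study 2,
# with onset double as the reference level.
study2 = {
    "onset single": (1.88, 0.12),
    "onset+onset": (0.74, 0.11),
    "coda+onset": (-1.30, 0.14),
}


def odds_ratio_with_ci(b: float, se: float, z: float = 1.96):
    """Return (odds ratio, lower bound, upper bound) using a Wald interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


for label, (b, se) in study2.items():
    ratio, lower, upper = odds_ratio_with_ci(b, se)
    print(f"{label}: OR = {ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

On this reading, the odds of a long vowel for onset single items are about 6.6 times those for onset double items, and the odds for coda+onset items are well below those for onset double items.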

Vocabulary Analysis

Table 3 shows the percentage of words with a long vowel in the first syllable as a percentage of the total number of words with normative pronunciations of the vowel. Of the 425 words with two identical medial consonants, only “Gucci” and “tutti” had the normative long vowel, a long vowel rate of less than 0.5%.

Models

Table 3 shows the percentage of long first-syllable vowels relative to the total number of normative vowels for each type of item for each model. The table also shows the number of words for which each model produced a stressed and normative pronunciation of the first vowel.

Discussion

The results of the vocabulary analysis show that making distinctions among different types of VCCV sequences helps us understand why English vowels are pronounced as they are. The V̆C|CV rule of phonics treats all medial consonant sequences alike, and Kearns (2020) did not distinguish among different types of medial sequences in his vocabulary analysis. Our results show that the syllabification of the medial sequence matters. Long vowel pronunciations hardly ever occur in words with double medial consonants or words with medial coda+onset sequences, but there are some such pronunciations among words whose medial sequences can form onset clusters.

The behavioral results for items in the onset single, onset+onset, and coda+onset categories were similar to those of Study 1. Also as in Study 1, the model of Rastle and Coltheart (2000) was a poor fit to the behavioral data for these types of items. This model underestimated participants’ use of long vowels for items with single medial consonants, and it did not produce any pronunciations with long vowels for items whose medial sequences can form onset clusters. The Perry et al. (2010) model was a better fit to the behavioral data for items with single medial consonants, medial onset clusters, and medial coda+onset sequences, showing the same three-way distinction that participants did.

The new results of Study 2 involve items with double medial consonants. Our participants used long vowels 18% of the time in this case, significantly more often than in the coda+onset case, 8%. One would not expect a difference based on the vocabulary analysis, which found no or almost no long pronunciations of the first vowel for words with either type of medial sequence. Also, neither model produced a long pronunciation of the first vowel for any of the onset double or coda+onset nonwords. That is, the models fail to account for the elevated rate of long first-syllable vowels that participants showed for items like ‹haddob› as compared to items like ‹hadgob›.

A potential problem with the behavioral study is that, when participants were presented with an onset double item such as ‹haddob›, the corresponding onset single item, ‹hadob›, would have been presented earlier in the experiment half of the time. Participants used long vowels 40% of the time for onset single items, and previous production of /ˈhedəb/ for ‹hadob› may have primed participants to use the same pronunciation for the very similar stimulus ‹haddob›. In Study 3, we guarded against such effects by designing quadruplets of nonwords similar to those of Study 2 and presenting each participant with just one item from each quadruplet. Thus, a given participant in Study 3 did not see both ‹hadob› and ‹haddob›. We asked whether we would find more long vowels for onset double items than coda+onset items in Study 3 when highly similar pairs of items were not presented.

Study 3

Method

Participants.

Forty undergraduate students from the same population as Studies 1 and 2, none of whom had participated in those studies, gave valid consent to participate in Study 3. The students received course credit in exchange for participation. The mean age of the participants was 18.9 years (range 18–22). All of the participants reported that they were native English speakers with no language-related disabilities. Thirty-six identified as female and four as male. The mean standard score on the reading subtest of the WRAT was 111 (SD = 10).

Stimuli.

The stimuli began with a consonant-vowel-consonant sequence, an optional consonant, and a vowel, and they ended in at most one additional letter. We designed 180 quadruplets of nonwords such that each of the nonwords in a quadruplet was identical except for the consonant, if any, that followed the initial CVC sequence. Each quadruplet included an onset single item, an onset+onset item, a coda+onset item, and an onset double item. None of the items had been used in Studies 1 or 2. We prepared randomized lists of stimuli for the participants such that each participant received one item from each quadruplet and each item was presented exactly once within each group of four participants. The sentence frames and the practice trials were the same as in Studies 1 and 2.
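One way to build lists with these properties is a Latin-square rotation, sketched below. This illustrates the constraint rather than the procedure actually used: the function name `build_list`, the rotation scheme, and the seeding are all assumptions.

```python
import random

ITEM_TYPES = ["onset single", "onset+onset", "coda+onset", "onset double"]


def build_list(participant: int, n_quadruplets: int = 180, seed: int = 0):
    """Return a shuffled list of (quadruplet index, item type) pairs."""
    # Rotating the item type by participant number means that any four
    # consecutive participants jointly see every item exactly once.
    items = [
        (q, ITEM_TYPES[(participant + q) % len(ITEM_TYPES)])
        for q in range(n_quadruplets)
    ]
    # Randomize presentation order; the seed keeps each list reproducible.
    random.Random(seed + participant).shuffle(items)
    return items


group = [build_list(p) for p in range(4)]
print(group[0][:3])
```

With 180 quadruplets, each participant in this scheme receives 45 items of each type, and no participant sees two members of the same quadruplet.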

Procedure.

The procedure was like that of Study 2.

Scoring.

The investigator scored each participant’s pronunciation of the first vowel as in Study 2. A second researcher independently scored the responses of six participants. The two researchers agreed on whether the pronunciation was stressed on the first syllable for 99.5% of these trials. Of those, the first vowel was coded the same in 96.8% of the cases. In cases of disagreement, we used the transcriptions of the primary investigator.

Results

Behavioral Study

We eliminated data from the 3% of trials on which no first-syllable vowel could be identified or the second syllable received primary stress. On the remaining trials, participants produced long pronunciations of the first vowel 19% of the time, short pronunciations 77% of the time, and other pronunciations 4% of the time. The first column of data in Table 4 shows, for each type of medial sequence, the mean and standard deviation by subjects of the proportion of long pronunciations relative to all normative pronunciations.

Table 4.

Mean Percentage of Normative Pronunciations with Long Vowel in First Syllable in Behavioral Study (Standard Deviation in Parentheses) and Models in Study 3

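| Item type | Behavioral: M | Behavioral: SD | Rastle and Coltheart (2000) model: % | Rastle and Coltheart (2000) model: n (a) | Perry et al. (2010) model: % | Perry et al. (2010) model: n (a) |
| --- | --- | --- | --- | --- | --- | --- |
| Onset single (e.g., ‹f›) | 41 | 20 | 11 | 175 | 31 | 170 |
| Onset+onset (e.g., ‹fr›) | 24 | 19 | 0 | 175 | 24 | 170 |
| Coda+onset (e.g., ‹fg›) | 5 | 9 | 0 | 179 | 2 | 176 |
| Onset double (e.g., ‹ff›) | 14 | 15 | 0 | 178 | 2 | 172 |

(a) Number of words pronounced by model with normative stressed pronunciation of first-syllable vowel.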

We conducted a mixed-model logistic regression analysis using data from the 6,668 trials on which the first vowel was pronounced as either long and stressed or short and stressed. The dependent variable was whether the first-syllable vowel was long, and medial type was a fixed factor with onset double as the reference level (14%). The model included random intercepts for participants and item quadruplets. Participants produced significantly more long vowels for onset single items (b = 2.65, SE = 0.13, p < .001), 41%, and onset+onset items (b = 1.20, SE = 0.12, p < .001), 24%, than for onset double items. The rate of long vowel use was significantly lower for coda+onset items, 5%, than for onset double items (b = −1.50, SE = 0.16, p < .001).

Models

Table 4 shows the percentage of long first-syllable vowels relative to the total number of normative vowels for each type of item for each model. The table also shows the number of words given normative stressed vowel pronunciations by each model.

Discussion

The main goal of Study 3 was to follow up on one of the findings of Study 2: that participants used significantly more long vowels for nonwords with double medial consonants than for nonwords with medial coda+onset sequences. Such a difference is not expected under the view that experienced readers have fully picked up the statistics for the relevant types of disyllabic words in their language. If they had, we should have seen extremely few responses with long first-syllable vowels for both onset double items such as ‹kettif› and coda+onset items such as ‹ketmif›. A weakness of Study 2, however, was that each participant saw both items with medial double consonants and their corresponding single-consonant items, for example both ‹kettif› and ‹ketif›. We designed Study 3 so that a given participant did not see items that were identical except for medial consonant doubling. Even in this case, we found a significantly higher rate of long vowel responses for nonwords such as ‹kettif› than for nonwords such as ‹ketmif›.

What explains participants’ markedly higher rate of long vowel responses for onset double items than for coda+onset items? One possible explanation is that readers’ selection of vowel length is influenced not just by spelling patterns but also by pronunciation patterns. Although words like “tutti” with a long vowel before a double consonant letter are extremely rare, the double consonant letter is pronounced as a single consonant phoneme (here /t/) and it is easy to think of words with a long vowel before a single consonant phoneme (e.g., “duty,” “beauty,” “cougar”). If a reader determines that there is a single medial consonant phoneme before settling on the length of the first vowel, the pronunciation pattern (long vowels common) could moderate the spelling pattern (long vowels rare). In contrast, for coda+onset sequences with two letters such as ‹tm›, we could expect the pronunciation patterns to align with the spelling patterns. To quantify these impressions about pronunciation frequencies, we extracted all words from the Unisyn lexicon whose pronunciation corresponded to the expected pronunciation pattern of the nonwords in our behavioral studies, without any restrictions on the spelling. To be specific, we searched for words that began with a single consonant, followed by a stressed vowel or diphthong, followed by an oral stop, affricate, or labiodental fricative, optionally followed by another obstruent of the same type or a liquid or nasal, followed by a vowel or syllabic consonant, and then at most one consonant. Crucially, this analysis does not filter words by spelling, so that, for example, words like “cougar” were included despite having vowel digraphs; their vowel sounds were classified as long or short based on whether any single letter has that sound as its long or short pronunciation. The syllabification of the medials was classified based on the maximal onset principle.
We found long first-syllable vowels in 46% of the words with single medial consonants in their phonological forms. The rate was lower, 25%, for words with coda+onset medial sequences. These results are consistent with the idea that people use some long pronunciations for the first vowels of items like ‹kettif› because a number of English words with single medial consonant phonemes have long first-syllable vowels. Readers are less likely to use long pronunciations for the first vowels of items like ‹ketmif› because fewer words with coda+onset medial sequences have long vowels. Thus, readers’ implicit knowledge about the typicality of phonological forms may help explain why they produce some pronunciations that deviate from the spelling-to-sound statistics that we calculated.
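The phonological search described above can be approximated as a pattern over phoneme classes. The sketch below is a simplified illustration, not the procedure applied to the Unisyn lexicon: the phoneme inventory, the `=` suffix for syllabic consonants, the `ˈ` prefix marking the stressed vowel, and the list of legal onsets are all assumptions, and “another obstruent of the same type” is read here as any oral stop, affricate, or labiodental fricative.

```python
import re

# Assumed toy inventory; a real lexicon would use its own symbol set.
TARGET_OBSTRUENTS = {"p", "b", "t", "d", "k", "g", "tʃ", "dʒ", "f", "v"}
LIQUIDS = {"l", "r"}
NASALS = {"m", "n", "ŋ"}
OTHER_CONSONANTS = {"s", "z", "ʃ", "ʒ", "θ", "ð", "h", "w", "j"}
SYLLABIC_CONSONANTS = {"l=", "n=", "m="}
LEGAL_ONSETS = {("p", "l"), ("p", "r"), ("b", "l"), ("b", "r"),
                ("t", "r"), ("d", "r"), ("k", "l"), ("k", "r"),
                ("g", "l"), ("g", "r"), ("f", "l"), ("f", "r")}

# C, stressed V, target obstruent, optional second medial, V or syllabic C,
# then at most one final consonant.
PATTERN = re.compile(r"[OLNC]A(O)([OLN])?[VS][OLNC]?")


def code(token: str) -> str:
    """Map one phoneme token to a single-letter class code."""
    if token.startswith("ˈ"):
        return "A"  # stressed vowel or diphthong
    if token in TARGET_OBSTRUENTS:
        return "O"
    if token in LIQUIDS:
        return "L"
    if token in NASALS:
        return "N"
    if token in OTHER_CONSONANTS:
        return "C"
    if token in SYLLABIC_CONSONANTS:
        return "S"
    return "V"  # any other token is treated as an unstressed vowel


def medial_type(tokens):
    """Return the medial type for a matching word, or None if it does not match."""
    match = PATTERN.fullmatch("".join(code(t) for t in tokens))
    if match is None:
        return None
    if match.group(2) is None:
        return "single"
    # Maximal onset: the pair is an onset cluster if it can begin a syllable.
    pair = (tokens[2], tokens[3])
    return "onset+onset" if pair in LEGAL_ONSETS else "coda+onset"


print(medial_type(["d", "ˈu", "t", "i"]))            # "duty"
print(medial_type(["k", "ˈæ", "k", "t", "ə", "s"]))  # "cactus"
```

Because every token maps to exactly one class letter, positions in the class string line up with positions in the token list, which is what allows the medial pair to be read off directly.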

In other respects, the behavioral results of Study 3 agree with those of Studies 1 and 2. Specifically, participants were more likely to use a long first-syllable vowel when a sequence of two medial consonants could be the onset of a syllable than when it could not. The nature of the medial consonants and not just their number, therefore, influences people’s choices between long and short pronunciations of the preceding vowel.

As in Studies 1 and 2, the model of Rastle and Coltheart (2000) was a poor fit to the behavioral data. It produced fewer long vowel pronunciations than our participants did for items with single medial consonants. It did not produce any such pronunciations for items with medial clusters that could form onsets, whereas participants produced such pronunciations about a quarter of the time. The Perry et al. (2010) model fit the data better. However, it produced equally low rates of long vowels before double medial consonants and coda+onset sequences, whereas our participants produced more long vowels before double medial consonants.

General Discussion

The goal of the research reported here was to improve our understanding of the spelling-to-sound correspondences of English and of how people pronounce novel written items. Rather than examining monosyllables, as in many previous studies (e.g., Kessler & Treiman, 2001; Siegelman et al., 2020; Steacy et al., 2019; Treiman et al., 2003), we examined disyllables, specifically those with a single vowel letter in the first syllable and one or two medial consonants. The first vowels of such disyllables are pronounced as long in some words (e.g., ‹paper›, ‹sacred›) and short in others (e.g., ‹cactus›, ‹pepper›), making them a good case to examine the complexities of English spelling-to-sound relations for items of more than one syllable. In our vocabulary analyses, we asked whether the length of the vowel is associated with the context in which it appears, specifically with the number and type of following consonants. In our behavioral studies, we examined skilled readers’ pronunciations of novel disyllables that varied in the number and type of medial consonants. We also evaluated two models of the processes used by readers (Perry et al., 2010; Rastle & Coltheart, 2000), examining whether the vowel pronunciations generated by the models agree with those produced by our participants.

The only study to have quantitatively examined spelling-to-sound relations in English disyllabic words of the type examined here, that of Kearns (2020), found an association between the number of medial consonants and the pronunciation of the first-syllable vowel. Specifically, Kearns reported, short vowels are more common before two medial consonants than before one medial consonant in the words to which students in Grades 1 to 8 are exposed. We found the same result for the words in adult reading materials, but we also found substantial differences among words with different types of medial consonant sequences. Specifically, the first-syllable vowel is much more likely to be long in words with medial sequences that can be syllable onsets, such as ‹cr›, than in words with medial letter sequences that cannot be syllable onsets, such as ‹ct› and ‹pp›. Kearns did not separate different types of medial consonant sequences, and our results show there are distinctions among them that people pick up on.

To study how readers determine the pronunciation of vowels in disyllabic items, we asked participants to read aloud nonwords. If people were taught and used the V̄|CV and V̆C|CV rules of phonics, or if they inferred such rules based on their experience with words, we would expect them to produce long first-syllable vowels for nonwords with single medial consonants and short vowels for nonwords with two medial consonants. We would not expect differences between nonwords in which the medial consonants are identical and ones in which they are not. Table 5 shows the combined results of our behavioral studies and the results that would be predicted if readers relied on the phonics rules. Clearly, readers’ performance is not well captured by the idea that they use only these rules. Readers seem to have induced patterns based on experience with print that are not covered as a part of reading instruction.

Table 5.

Mean Percentage of Normative Pronunciations with Long Vowel in First Syllable Pooling Across Subjects in Behavioral Studies (Standard Deviations in Parentheses) and Percentages Expected According to V̄|CV and V̆C|CV Phonics Rules, Vocabulary Statistics, and Models

| Item type | Behavioral results, M (SD) | Phonics rules | Vocabulary analysis | Rastle and Coltheart (2000) model | Perry et al. (2010) model |
| --- | --- | --- | --- | --- | --- |
| Onset single (e.g., ‹f›) | 43 (20) | 100 | 79 | 10 | 34 |
| Onset+onset (e.g., ‹fr›) | 26 (17) | 0 | 44 | 0 | 24 |
| Coda+onset (e.g., ‹fg›) | 6 (9) | 0 | 0 | 0 | 5 |
| Onset double (e.g., ‹ff›) | 16 (15) | 0 | 0 | 0 | 2 |

The model of Rastle and Coltheart (2000) is not a good fit to readers’ behavior regarding use of long and short vowels. The model produces long vowels for 10% of items with single medial consonants, a much lower rate than our participants’ 43%, and it does not produce long vowels for any of the items with two medial consonants. People, however, produce some long vowels for items with medial two-consonant sequences, particularly when the medial consonant sequence may form the onset of a syllable. Other work shows that the Rastle and Coltheart model is also not a good fit to readers’ behavior regarding stress assignment (Ktori et al., 2018; Mousikou et al., 2017).

The model of Perry et al. (2010) is a better fit to our behavioral data on vowel length than the model of Rastle and Coltheart (2000). Given that the Perry et al. model is trained on a comprehensive set of two-syllable English words and that it is described as picking up the relationships between orthographic and phonological patterns in the trained words, it is surprising that its pronunciations deviate substantially from the vocabulary statistics that we calculated for two-syllable words. One reason for this difference may be that the model is trained on basic letter-sound correspondences for monosyllables, including the correspondences between vowel letters and short vowel sounds, before it is trained to pronounce words. Although the Perry et al. model fits our behavioral data well in some respects, it shows a slightly lower percentage of long vowels for onset double items than for coda+onset items. Our participants, however, used significantly more long vowels for onset double items than for coda+onset items. This difference, we have suggested, reflects readers’ use of implicit knowledge about the frequencies of different phonological forms. The Perry et al. model does not include a mechanism that learns about these matters.
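As a rough illustration of these comparisons, the sketch below computes the mean absolute difference, in percentage points, between the pooled behavioral results and each set of predicted percentages in Table 5. This summary measure is an assumption introduced here for illustration; it is not an analysis reported in the paper and it ignores variability across participants and items.

```python
# Percentages from Table 5, ordered as onset single, onset+onset,
# coda+onset, onset double.
behavioral = [43, 26, 6, 16]
predictors = {
    "Phonics rules": [100, 0, 0, 0],
    "Vocabulary analysis": [79, 44, 0, 0],
    "Rastle and Coltheart (2000) model": [10, 0, 0, 0],
    "Perry et al. (2010) model": [34, 24, 5, 2],
}


def mean_abs_diff(predicted, observed):
    """Mean absolute difference in percentage points."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


fit = {name: mean_abs_diff(values, behavioral)
       for name, values in predictors.items()}
for name, value in sorted(fit.items(), key=lambda pair: pair[1]):
    print(f"{name}: {value:.2f}")
```

By this measure the Perry et al. (2010) percentages lie closest to the behavioral results, and most of their remaining discrepancy comes from the onset double items.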

Participants’ responses did not line up closely with the vocabulary statistics that we calculated for disyllabic words. For items of the onset single and onset+onset types, participants used fewer long vowels than expected on the basis of these vocabulary statistics. Participants also produced more long pronunciations of the first vowel for items of the onset double type than for items of the coda+onset type even though long vowels are rare in both of these contexts in the English disyllabic words in our analyses. These results are surprising given the idea that people internalize statistics about the correspondences between orthography and phonology to which they are exposed and that skilled readers mirror these statistics in their behavior (e.g., Brown, 1998; Sawi & Rueckl, 2019).

One potential reason for the discrepancies noted above is that we based the vocabulary analyses on words that were very similar to our experimental nonwords. It is possible that people consider a larger set of words in their implicit calculations of vocabulary statistics. To address this issue, we calculated vocabulary statistics that extended the set of words in various ways. One extension was to permit sequences of more than one consonant at the beginnings of the words. The base vocabulary analyses were restricted to words that had exactly one initial consonant, and the column “Initials” in Table 6 shows the vocabulary statistics when we permitted any number of initial consonants, including zero (e.g., ‹bridal›). The base vocabulary analyses were also limited to words with medial ‹b, c, d, f, g, j, k, p, q, t, v, z›, perhaps followed by any of ‹l, m, n, r›. The column “Medials” in Table 6 shows the results when we expanded the set of medials to also include ‹l, m, n, s› and ‹s, w, y›, respectively. The base vocabulary analyses were also restricted to disyllabic words that had one vowel letter at the end followed by, at most, one consonant. The “Tails” column of Table 6 shows the results when we removed the restriction on tails, bringing in disyllables ending in more than one consonant (e.g., ‹agent›) and words of more than two syllables with the pattern of interest at the beginning (e.g., ‹elephant›). Finally, the last column of Table 6 shows the results when the restrictions on initials, medials, and tails were all removed.

Table 6.

Percentage of Normative Pronunciations with Long Vowel in First Syllable Pooling Across Subjects in Behavioral Studies, in Base Vocabulary Analyses, and in Vocabulary Analyses Relaxing Restrictions on Initials, Medials, Tails, and Initials + Medials + Tails

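| Item type | Behavioral results | Base | Initials | Medials | Tails | Initials + medials + tails |
| --- | --- | --- | --- | --- | --- | --- |
| Onset single (e.g., ‹f›) | 43 | 79 | 78 | 78 | 52 | 50 |
| Onset+onset (e.g., ‹fr›) | 26 | 44 | 45 | 22 | 54 | 25 |
| Coda+onset (e.g., ‹fg›) | 6 | 0 | 0 | 7 | 0 | 4 |
| Onset double (e.g., ‹ff›) | 16 | 0 | 1 | 1 | 0 | 1 |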

As Table 6 shows, relaxing the restriction on initials made little difference to the vocabulary statistics. However, relaxing the restriction on medials had a substantial impact for onset+onset items, making the vocabulary statistics more similar to the behavioral statistics. The rate of long vowels is lower before clusters beginning with ‹s› than for other clusters, and participants may have been influenced by this fact. Relaxing the restriction on tails was particularly influential for onset single items, with longer words such as “natural” showing lower rates of long vowels than the two-syllable words included in the base vocabulary analysis. Our participants still produced fewer long vowels for words with single medial consonants than expected on the basis of the vocabulary statistics that removed all three restrictions, but the difference was substantially smaller than it was for the base vocabulary analysis. These results suggest that, when calculating statistics about vowel length, people generalize to some degree over words of different lengths and with different characteristics.

The vocabulary analyses reported in Table 6 do not generally show a higher percentage of long vowels for words with medials of the onset double type than for words with medials of the coda+onset type. Our participants, however, produced significantly more long vowels for nonwords with medials of the onset double type than for nonwords with medials of the coda+onset type. Postulating that people use information based on a broader vocabulary in calculating statistics about spelling-to-sound relationships thus does not explain the significantly higher rate of long vowel pronunciations for nonwords such as ‹kettif› than for nonwords such as ‹ketmif›. We have suggested that this difference reflects how, when reading aloud, people use their knowledge of what words of their language typically sound like in addition to their knowledge of how spellings relate to sounds. Whereas many theories of skilled reading emphasize how readers pick up the spelling-to-sound relationships in the words of their language (e.g., Brown, 1998; Perry et al., 2010; Sawi & Rueckl, 2019), few theories consider readers’ use of phonotactic patterns. Some reading models incorporate all-or-none phonotactic patterns, as when ‹nk› is translated to /nk/ by the spelling-to-sound rules of the dual-route model (Coltheart et al., 2001) but /n/, being illegal in this context, is changed to /ŋ/. Our results suggest that knowledge about probabilistic phonotactic patterns is also important. Specifically, we suggest, speakers of English have learned that long vowels are more common before single consonant phonemes than before sequences that must be analyzed as a coda followed by an onset. This knowledge influences their spelling-to-sound translation, causing them to use long vowels for items like ‹kettif› substantially more often than anticipated given the spelling-to-sound statistics of English.
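The contrast at the start of the paragraph above can be checked directly against Table 6. The sketch below subtracts the coda+onset percentage from the onset double percentage in each column; the dictionary layout and variable names are assumptions made for illustration.

```python
# (coda+onset %, onset double %) for each column of Table 6.
table6 = {
    "Behavioral results": (6, 16),
    "Base": (0, 0),
    "Initials": (0, 1),
    "Medials": (7, 1),
    "Tails": (0, 0),
    "Initials + medials + tails": (4, 1),
}

# Positive values mean more long vowels for onset double than coda+onset.
advantage = {column: double - coda
             for column, (coda, double) in table6.items()}

for column, value in advantage.items():
    print(f"{column}: {value:+d} percentage points")
```

Only the behavioral column shows a sizable onset double advantage (10 percentage points); the vocabulary columns range from a 1-point advantage to a 6-point disadvantage.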

In future work, it will be important to study determinants of vowel pronunciation other than the number and syllabification of the following consonants, the factors examined here. For example, the findings of Kearns’s (2020) vocabulary analysis suggest that the proportion of long vowels in VCV and VCCV contexts may differ across vowels, and the results of Berg’s (2016) vocabulary analysis point to an influence of the word’s ending. Further work is needed to explore such differences and determine whether they affect reading performance when controlling for other factors. Participant-related differences also merit exploration. The participants in the present studies were better than average readers, and future research is needed with more typical adult readers and with developing readers. In such research, it will be important to keep in mind a potential problem with the current studies: that participants were explicitly asked to emphasize the first syllables when pronouncing the items. This feature of the methodology makes the task less representative of real-life situations in which readers encounter novel items, and it will be important to determine whether it influences the results.

Conclusions

Much of the previous research on spelling-to-sound translation has involved single-syllable words, and many models of the process have been implemented only with monosyllables. But we will not have a full understanding of the reading process until we understand how readers map the spellings of longer items to pronunciations and how they learn to do so. Our results show that skilled readers consider a vowel’s context when deciding whether to use its short or its long pronunciation. Readers make distinctions they have not been taught to make in any phonics teaching they may have received, and they appear to do so based on implicit learning about the letter-to-sound links in the words to which they are exposed. It is not always straightforward, however, to link readers’ behavior to statistics about language. Because reading draws on knowledge of spoken language, for example, people seem to use statistics about the frequencies of phonological forms as well as statistics about the frequencies of letter-to-sound mappings.

Acknowledgments

This research was supported in part by NIH grant R01HD102346–01A1.

We thank Maya Dubno, Samra Haseeb, Rebecca Jewell, Anna Daileader Sheriff, and Lily Trossman for their assistance in investigation.

Footnotes

Contributor Information

Rebecca Treiman, Department of Psychological and Brain Sciences, Washington University in St. Louis.

Brett Kessler, Department of Psychological and Brain Sciences, Washington University in St. Louis.

Kayla Hensley, Department of Psychological and Brain Sciences, Washington University in St. Louis.

References

  1. Arciuli J (2018). Reading as statistical learning. Language Speech and Hearing Services in Schools, 49, 634–643. 10.1044/2018_LSHSS-STLT1-17-0135
  2. Bates D, Mächler M, Bolker BM, & Walker SC (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. 10.18637/jss.v067.i01
  3. Berg K (2016). Double consonants in English: Graphemic, morphological, prosodic and etymological determinants. Reading and Writing: An Interdisciplinary Journal, 29(3), 453–474. 10.1007/s11145-015-9610-z
  4. Brown GDA (1998). The endpoint of skilled word recognition: The ROAR model. In Metsala JL & Ehri LC (Eds.), Word recognition in beginning literacy (pp. 121–138). Erlbaum.
  5. Chang Y-N, Monaghan P, & Welbourne S (2019). A computational model of reading across development: Effects of literacy onset on language processing. Journal of Memory and Language, 108, Article 104025. 10.1016/j.jml.2019.05.003
  6. Coltheart M, Rastle K, Perry C, Langdon R, & Ziegler J (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108(1), 204–256. 10.1037/0033-295x.108.1.204
  7. Cox AR, & Hutcheson L (1988). Syllable division: Prerequisite to dyslexics’ literacy. Annals of Dyslexia, 38(1), 226–242. 10.1007/BF02648258
  8. Eddington D, Treiman R, & Elzinga D (2013). Syllabification of American English: Evidence from a large-scale experiment. Part I. Journal of Quantitative Linguistics, 20(1), 37–41. 10.1080/09296174.2012.754601
  9. Fudge EC (1984). English word-stress. Allen and Unwin.
  10. Harm MW, & Seidenberg MS (2004). Computing the meanings of words in reading: Cooperative division of labor between visual and phonological processes. Psychological Review, 111(3), 662–720. 10.1037/0033-295X.111.3.662
  11. Kahn D (1976). Syllable-based generalizations in English phonology. MIT.
  12. Kearns DM (2020). Does English have useful syllable division patterns? Reading Research Quarterly, 55(S1), S145–S160. 10.1002/rrq.342
  13. Kessler B, & Treiman R (2001). Relationships between sounds and letters in English monosyllables. Journal of Memory and Language, 44(4), 592–617. 10.1006/jmla.2000.2745
  14. Ktori M, Mousikou P, & Rastle K (2018). Cues to stress assignment in reading aloud. Journal of Experimental Psychology: General, 147(1), 36–61. 10.1037/xge0000380
  15. Mahowald K, Dautriche I, Braginsky M, & Gibson E (2021). Efficient communication and the organization of the lexicon. In Papafragou A, Trueswell JC, & Gleitman LR (Eds.), Oxford handbook of the mental lexicon (pp. 200–220). Oxford University Press.
  16. Mousikou P, Sadat J, Lucas R, & Rastle K (2017). Moving beyond the monosyllable in models of skilled reading: Mega-study of disyllabic nonword reading. Journal of Memory and Language, 93, 169–192. 10.1016/j.jml.2016.09.003
  17. Perry C, Ziegler JC, & Zorzi M (2010). Beyond single syllables: Large-scale modeling of reading aloud with the Connectionist Dual Process (CDP++) model. Cognitive Psychology, 61(2), 106–151. 10.1016/j.cogpsych.2010.04.001
  18. Perry C, Ziegler JC, & Zorzi M (2014). CDP: The Connectionist Dual Process model of reading aloud. University of Padua, Computational Cognitive Science Lab. http://ccnl.psy.unipd.it/CDP.html, link “CDP++ can be downloaded here”.
  19. Pulgram E (1970). Syllable, word, nexus, cursus. Mouton.
  20. Rastle K, & Coltheart M (2000). Lexical and nonlexical print-to-sound translation of disyllabic words and nonwords. Journal of Memory and Language, 42(3), 342–364. 10.1006/jmla.1999.2687
  21. Redford MA, & Randall P (2005). The role of juncture cues and phonological knowledge in English syllabification judgments. Journal of Phonetics, 33(1), 27–46. 10.1016/j.wocn.2004.05.003
  22. Sawi OM, & Rueckl J (2019). Reading and the neurocognitive bases of statistical learning. Scientific Studies of Reading, 23(1), 8–23. 10.1080/10888438.2018.1457681
  23. Siegelman N, Kearns DM, & Rueckl JG (2020). Using information-theoretic measures to characterize the structure of the writing system: The case of orthographic-phonological regularities in English. Behavior Research Methods, 52(3), 1292–1312. 10.3758/s13428-019-01317-y
  24. Smith PT, & Baker RG (1976). The influence of English spelling patterns on pronunciation. Journal of Verbal Learning and Verbal Behavior, 15(3), 267–285. 10.1016/0022-5371(76)90025-6
  25. Steacy LM, Compton DL, Petscher Y, Elliott JD, Smith K, Rueckl JG, Sawi O, Frost SJ, & Pugh KR (2019). Development and prediction of context-dependent vowel pronunciation in elementary readers. Scientific Studies of Reading, 23(1), 49–63. 10.1080/10888438.2018.1466303
  26. Treiman R, Gross J, & Cwikiel-Glavin A (1992). The syllabification of /s/ clusters in English. Journal of Phonetics, 20(3), 383–402. 10.1016/S0095-4470(19)30640-0
  27. Treiman R, Kessler B, & Bick S (2003). Influence of consonantal context on the pronunciation of vowels: A comparison of human readers and computational models. Cognition, 88(1), 49–78. 10.1016/S0
  28. Treiman R, Kessler B, Zevin JD, Bick S, & Davis M (2006). Influence of consonantal context on the reading of vowels: Evidence from children. Journal of Experimental Child Psychology, 93(1), 1–24. 10.1016/j.jecp.2005.06.008
  29. Treiman R, Rosales N, Cusner L, & Kessler B (2020). Cues to stress in English spelling. Journal of Memory and Language, 112, 104089. 10.1016/j.jml.2020.104089
  30. Venezky RL (1970). The structure of English orthography. Mouton.
  31. Waese M, & Jared D (2006, November 16–19). The role of intervocalic consonants in disyllabic word naming [Poster presentation]. 47th Annual Meeting of the Psychonomic Society, Houston, TX, United States.
  32. Wilkinson GS, & Robertson GJ (2017). Wide Range Achievement Test Fifth Edition. Pearson.

Data Availability Statement

Data and analysis scripts for this and the other studies are available at https://osf.io/ckxsa/?view_only=274f5fee16ad4354b2f8b6d5567e1293
