Author manuscript; available in PMC: 2016 Mar 1.
Published in final edited form as: Child Dev. 2014 Sep 26;86(2):342–361. doi: 10.1111/cdev.12295

Genetic and Environmental Etiologies of the Longitudinal Relations between Pre-reading Skills and Reading

Micaela E Christopher a,b,*, Jacqueline Hulslander b, Brian Byrne c,d, Stefan Samuelsson d, Janice M Keenan e, Bruce Pennington e, John C DeFries a,b, Sally J Wadsworth b, Erik Willcutt a,b, Richard K Olson a,b,d,*
PMCID: PMC4375099  NIHMSID: NIHMS616667  PMID: 25263167

Abstract

The present study explored the environmental and genetic etiologies of the longitudinal relations between pre-reading skills and reading and spelling. Twin pairs (n = 489) were assessed before kindergarten (M = 4.9 years), post-1st grade (M = 7.4 years), and post-4th grade (M = 10.4 years). Genetic influences on five pre-reading skills (print knowledge, rapid naming, phonological awareness, vocabulary, and verbal memory) were primarily responsible for relations with word reading and spelling. However, relations with post-4th-grade reading comprehension were due both to genetic and shared environmental influences. Genetic and shared environmental influences that were common among the pre-reading variables covaried with reading and spelling, as did genetic influences unique to verbal memory (only post-4th-grade comprehension), print knowledge, and rapid naming.

Keywords: Individual differences, Genetic, Twins, Longitudinal, Reading development


Given the importance of reading for academic and career success, predicting which children will struggle in reading and which will excel is a major goal of current research in child development. Reading is a learned skill that builds upon a child's early language and cognitive abilities. Identifying skills in pre-readers that are predictive of future reading levels could facilitate the understanding of why children vary in their later reading abilities, and potentially how to help children at risk for future reading difficulties. The main aim of the present study is to use longitudinal data from identical and fraternal twins to assess the genetic and environmental influences on the covariance between children's pre-reading skills (phonological awareness, rapid naming, print knowledge, vocabulary, verbal memory) and their subsequent development of word reading, reading comprehension, and spelling at the end of first and fourth grades.

The present study expands upon previous research in three important ways. First, the pre-reading skills were measured in the year prior to kindergarten entry, before the children started receiving literacy instruction in school. Reading-related skills in preschool-age children and early kindergarteners are sometimes referred to as emergent literacy to highlight the idea that reading development occurs on a continuum, with skills children have prior to learning to read forming the foundation upon which reading is acquired (see Whitehurst & Lonigan, 1998 for a review). However, as children start to learn to read, emergent literacy skills can quickly become reciprocally related to reading performance (e.g., Perfetti, Beck, Bell, & Hughes, 1987). Therefore, to understand the predictive nature of different pre-reading skills more fully, it is important to measure them as early as possible. Because of the realities of participant recruitment, studies of early predictors of literacy often begin after the children are already in kindergarten, or later (e.g., Hecht, Burgess, Torgesen, Wagner, & Rashotte, 2000; Parrila, Kirby, & McQuarrie, 2004).

Second, we examined the relations between pre-reading skills and reading and spelling abilities at both the end of first grade and the end of fourth grade. As Chall (1983) first described, the focus of reading and spelling instruction and practice gradually shifts from “learning to read” in the early elementary grades to “reading to learn” by the fourth grade. Thus, testing at the end of first and fourth grade allowed us to address the question of whether the pre-reading skills differentially predict individual differences in word reading, spelling, and reading comprehension in these two different phases of reading development. In addition, by including measures of both word reading and reading comprehension, it was possible for us to test for potential differences in pre-reading relations between these two types of reading ability. For example, reading comprehension may have stronger relations with pre-reading skills that index general language ability (such as vocabulary and verbal memory) than does word reading, and these relations may strengthen as children transition to “reading to learn.”

Third, we assess why pre-reading skills predict later reading ability. The data come from the United States (U.S.) component of the International Longitudinal Twin Study (ILTS; Byrne et al., 2009). By comparing the similarity of identical and fraternal twins (described in more detail in the Methods section), we obtain estimates for how much of the variance in children's pre-reading skills, reading and spelling abilities, and their respective longitudinal correlations are due to genetic and environmental factors.

Previous studies from the ILTS have examined the genetic and environmental influences on pre-reading skills and reading ability in the early grades in the U.S., Australia, and Scandinavia (Byrne et al., 2009; Samuelsson et al., 2005). In general, results show a mix of genetic and shared environmental influences on individual differences in the pre-reading skills, but largely genetic influences on individual differences in reading and spelling abilities by the end of first grade. The very modest amount of shared environmental influences on individual differences in reading and spelling abilities after the first year of formal reading instruction in Australia, the U.S., and Scandinavia suggests that the relation between pre-reading skills and future reading may be driven primarily by their shared genetic influence. For example, Byrne et al. (2009) found that word reading, reading comprehension, and spelling at the end of second grade shared significant genetic variance, but not shared environmental variance, with preschool print knowledge, phonological awareness, and rapid naming.

Two previous studies from the U.S. component of the ILTS have assessed the etiology of longitudinal relations between preschool measures and reading measured at the end of fourth grade. Olson et al. (2011) found that the genetic and shared environmental influences on vocabulary measured in preschool-aged children strongly overlapped with those influences on vocabulary at the end of fourth grade, such that no new significant genetic or shared environmental variance was found. Keenan, Olson, Byrne, Samuelsson, Hulslander, and Christopher (2011) found that 88% of the genetic influences and 65% of the shared environment influences on post-4th-grade oral language were accounted for respectively by genetic and shared environment influences on a broad range of preschool skills. The respective percentages of genetic and shared environment influences from preschool skills on post-4th-grade word reading (47%, 50%) and reading comprehension (51%, 100%) were also considerable.

Another relevant longitudinal behavioral genetic study is the Twins Early Development Study in England and Wales (TEDS; Trouton, Spinath, & Plomin, 2002). Harlaar, Hayiou-Thomas, Dale, and Plomin (2008) analyzed parent-report measures of vocabulary and grammar when children were two, three, and four years old. The early language ability latent variable was significantly correlated with a latent variable composed of teacher assessments of reading at seven, nine, and 10 years, and this correlation was due to both genetic and shared environmental factors, although the shared environmental correlation was larger.

The present study expands upon these previous studies by testing the independent contributions of each pre-reading skill, by including both first-grade (when children are primarily learning to read) and fourth-grade (when children are primarily reading to learn) measures of word reading, spelling, and reading comprehension, and by including five pre-reading skills. Therefore, the present study is able to address multiple related issues: which pre-reading skills predict future reading ability, whether predictions vary depending upon the stage of reading development, whether predictions vary depending on the reading skill being assessed, and the role of genetic and environmental influences upon these relations.

The five pre-reading skills included in the present study (phonological awareness, rapid naming, print knowledge, vocabulary, and verbal memory) have all been previously implicated as longitudinal correlates with reading ability. Phonological awareness is broadly defined as the ability to decompose and manipulate speech sounds at the sub-word level. Phonological awareness is one of the best concurrent predictors of reading ability (e.g., Brady & Shankweiler, 1991; Scarborough, 1989). In addition, levels of phonological awareness measured within the first year of reading instruction longitudinally predict reading ability over the first two years of literacy instruction (e.g., Boscardin, Muthén, Francis, & Baker, 2008; Parrila et al., 2004).

Rapid naming (or naming speed) requires the participant to name letters, numbers, colors, or objects in a visual display as quickly as possible. Since the 1970s, researchers have found that children with reading disabilities are slower on rapid naming tasks than their peers (Denckla, 1972; Denckla & Rudel, 1974). In addition to being indicative of reading disabilities, early levels of rapid naming appear to predict individual differences in early reading ability across the normal range of reading skill (e.g., Boscardin et al., 2008), even after controlling for other skills, such as vocabulary and print knowledge (e.g., de Jong & van der Leij, 2002; Georgiou, Parrila, & Papadopoulos, 2008; Schatschneider, Fletcher, Francis, Carlson, & Foorman, 2004). Exactly why rapid naming is related to reading is an open question, but one theory is that the rate at which children are able to identify and name stimuli is tied to how easily they are able to bind together visual stimuli and stored phonological representations when reading (e.g., Sunseth & Bowers, 2002).

Measures of preschool print knowledge are designed to assess what children have learned from their varied exposure to letters, printed words in books, and environmental print prior to learning to read. Because one of the first steps in learning to read is understanding that written letters and words correspond to spoken phonemes, more exposure of pre-readers to print may facilitate later reading performance (Byrne, 1998; Stuart & Coltheart, 1988). For example, numerous studies have shown that early levels of print knowledge longitudinally predict individual differences in later reading ability (e.g., de Jong & van der Leij, 2002; Schatschneider et al., 2004).

Like print knowledge, the number of words a young child knows and is able to use correctly is an important index of overall home literacy environment, with strong ties between size of vocabulary and socioeconomic status (Fernald, Marchman, & Weisleder, 2013; Hart & Risley, 1995). For example, results obtained from previous studies have shown that early levels of vocabulary are important for later reading development in part because of a reciprocal relation between oral language and phonological awareness (e.g., Bowey & Patel, 1988; Lonigan, Burgess, & Anthony, 2000): Young children may not understand that individual phonemes form a word if they do not already know the word. Longitudinal studies of vocabulary support the relations between early vocabulary and future reading ability (e.g., Catts, Fey, Zhang, & Tomblin, 1999; Olson et al., 2011; Scarborough, 1989).

The fifth and final pre-reading skill included in the present study is verbal memory. Verbal memory refers to the short-term encoding and use of verbally presented stimuli. Previous research has shown that children with reading disabilities have lower verbal memory capacities than their peers (e.g., Shankweiler, Liberman, Mark, Fowler, & Fischer, 1979). In addition, there is evidence supporting verbal memory as a concurrent predictor of reading ability (e.g., Muter & Snowling, 1998; Scarborough, 1998) and as a longitudinal predictor of early reading ability (e.g., Georgiou et al., 2008).

It is important to note that the relation between the pre-reading skills and spelling is an open question, as few studies have included spelling measures. In addition, by including both reading and spelling at the end of first grade and at the end of fourth grade, the present study will test whether pre-reading skills predict fourth-grade word reading, reading comprehension, and spelling independent of their influence on performance at the end of first grade. Finally, we know of no previous study that has included all five pre-reading skills in the same study. By including all five, we will test the extent to which the genetic or environmental influences on these skills are shared. The results of the present study have implications for understanding which pre-reading skills help to build a foundation for future reading and spelling acquisition, and why.

Method

Participants

Although the ongoing International Longitudinal Twin Study (ILTS; Byrne et al., 2009) includes twins from Australia, Colorado, and Scandinavia, the present study only includes participants from Colorado because that is the only sample with complete fourth-grade data. The Colorado twin pairs all had English as their first language, were learning to read in English, and were recruited from the Colorado Twin Registry based on birth records. Zygosity was determined from DNA collected via cheek swabs, or in a minority of cases from selected items from the Nichols and Bilbro (1966) questionnaire.

The current study analyzed data from three testing waves: preschool, end of first grade, and end of fourth grade. Because twins were also tested at the end of second grade, all models were also fit to the second-grade data. However, these results were highly similar to those at the end of first grade. Because of space limitations, they are not reported here, but are available from the first author. The preschool sample included 224 monozygotic (MZ; i.e., identical) twin pairs (97 male and 127 female) and 265 same-sex dizygotic (DZ; i.e., fraternal) twin pairs (146 male and 119 female), for a total of 489 twin pairs. Attrition in the sample through the end of fourth grade was minimal; the post-4th-grade wave consisted of 213 MZ twin pairs (91 male and 122 female) and 256 DZ twin pairs (143 male and 113 female), for a total of 469 twin pairs. Mean ages in months (standard deviation, range) were 58.75 (2.31, 54-71), 89.06 (3.81, 79-104), and 125.43 (3.86, 116-140) for the preschool, post-1st-grade, and post-4th-grade waves, respectively.

The first testing session is referred to as “preschool” to mean, “before starting kindergarten.” While most of our children did attend formal preschools for at least one day a week, not all did. In addition, we will refer to the preschool-aged children as “pre-readers” given that 87% measured in the preschool testing could not read any words on a test of word reading and only 2% could read more than 10 words (Word Identification subtest from the Woodcock Reading Mastery; Woodcock, 1987). To check that the children who could read words did not unduly influence our results, we also ran the phenotypic analyses with only the 87% who read no words.

Procedure and Measures

The measures in the present analyses are from larger test batteries administered in the ILTS. The preschool testing took place over five days, about one hour each day, all within a two-week time frame, in the year prior to starting kindergarten (for more details, see Byrne et al., 2002). The reading and spelling measures, at post-1st and post-4th grade, were given in the summer after each school year. Testing at each time point was conducted in a single session lasting about one to two hours in the twins’ homes. Two testers separately assessed each twin at the same time. For all measures, scores were adjusted for age, age squared, and age cubed to control for any linear and nonlinear age effects, standardized within-sex to control for any sex differences, and trimmed to ±3 standard deviations.
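The score adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function name `adjust_scores` is hypothetical, NumPy is assumed to be available, the age adjustment is assumed to be an ordinary least-squares regression with residuals retained, and "trimmed" is assumed to mean clipping values to the ±3 SD bounds rather than dropping them.

```python
import numpy as np

def adjust_scores(raw, age, sex, trim_sd=3.0):
    """Residualize on age, age^2, age^3; z-score within sex; clip to +/- trim_sd.

    Hypothetical helper: the exact steps used in the study are assumptions here.
    """
    raw = np.asarray(raw, dtype=float)
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex)

    # Center age so the polynomial terms are numerically better behaved.
    age_c = age - age.mean()
    design = np.column_stack([np.ones_like(age_c), age_c, age_c**2, age_c**3])

    # Ordinary least squares; the residuals are the age-adjusted scores.
    beta, *_ = np.linalg.lstsq(design, raw, rcond=None)
    resid = raw - design @ beta

    # Standardize separately within each sex.
    z = np.empty_like(resid)
    for group in np.unique(sex):
        mask = sex == group
        z[mask] = (resid[mask] - resid[mask].mean()) / resid[mask].std(ddof=1)

    # Assumption: "trimmed" is implemented as clipping to the bounds.
    return np.clip(z, -trim_sd, trim_sd)
```

If trimming instead meant excluding out-of-range scores, the final line would be replaced by a mask that sets those values to missing.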

Pre-reading measures

To increase reliability in the pre-reading measures, each of the five pre-reading constructs of interest (print knowledge, vocabulary, rapid naming, verbal memory, and phonological awareness) was assessed using multiple measures modeled as latent variables (Bollen, 1989). To ensure that our a priori five-factor structure for the pre-reading variables was appropriate, we first fit the pre-reading variables to a phenotypic confirmatory factor analysis that included one twin from each pair selected at random using the AMOS software (Arbuckle, 2008). The hypothesized five-factor structure for the pre-reading variables fit the data well (χ² = 215.47, df = 80, CFI = .95, RMSEA = .059 [95% confidence interval: .050, .068]). Dropping the three variables (word cards, sound matching, and syllable and phoneme elision) that loaded less than .50 onto the latent factors increased model fit (χ² = 87.25, df = 44, CFI = .98, RMSEA = .045 [95% confidence interval: .031, .059]). However, because these are measures used by other researchers and because the loadings were all greater than .40 and significant, our final model includes the measures. For all subsequent analyses, we will refer to the pre-reading skills as latent factors rather than individual measures.
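As a worked check on the fit statistics above, the RMSEA point estimate can be recomputed from the chi-square value and degrees of freedom using the common formula sqrt(max(χ² − df, 0) / (df × (N − 1))). The sketch below assumes N = 489 (one twin from each of the 489 pairs); the effective sample size in the original analysis may have differed because of missing data, and the function name is hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Assumed N: one randomly selected twin per pair.
N_ASSUMED = 489

full_model = rmsea(215.47, 80, N_ASSUMED)     # five-factor model, all 15 measures
reduced_model = rmsea(87.25, 44, N_ASSUMED)   # after dropping three low-loading measures

print(round(full_model, 3), round(reduced_model, 3))
```

Under that assumption the two values round to the reported .059 and .045.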

The following descriptions group the measures according to the latent variable onto which they loaded. In the interest of space, only brief descriptions of the 15 pre-reading measures are provided; for more detailed descriptions, please see Byrne et al. (2002) and Samuelsson et al. (2005). Reliabilities for the experimental measures are from Samuelsson et al. (2005).

Print knowledge

Four measures were used to assess print knowledge. Concepts about print (Clay, 1975) tested understanding of print conventions, such as left-to-right direction of print and the difference between pictures and print (Cronbach's α = .83). In letter identification, children pointed out one letter out of four on a card that represented the letter spoken by the experimenter (Cronbach's α = .92). Sound identification was similar to letter identification, but the experimenter spoke the sound of the letter instead (Cronbach's α = .87). Word cards tested understanding of six common examples of print in the environment, such as a stop sign and exit sign (Cronbach's α = .46).

Rapid naming

Two measures were used to assess rapid naming, both from the Comprehensive Test of Phonological Processing (CTOPP; Wagner, Torgesen, & Rashotte, 1999). In rapid object naming, children named 72 objects (six objects repeated eight times) as quickly as possible (Cronbach's α = .71). Rapid color naming was identical in format, but used six colors as stimuli (Cronbach's α = .81). Both were reverse-coded in all analyses, such that faster times corresponded to better performance.

Phonological awareness

Five measures were used to assess phonological awareness. The Sound Matching subtest from the CTOPP (Wagner et al., 1999) required children to identify which of three words started or ended with the same sound as a target word (Cronbach's α = .77). The three blending and elision tasks were made available by Lonigan (personal communication; 2000). Syllable and phoneme blending tested a child's ability to combine syllables or phonemes into words (Cronbach's α = .76). In word elision, children deleted a syllable from a compound word to form a new word (Cronbach's α = .77). Syllable and phoneme elision required children to delete a syllable or phoneme from a word to form a new word (Cronbach's α = .49). In rhyme and final sound, children had to recognize that two words either rhymed or ended with the same phoneme (Cronbach's α = .68).

Vocabulary

Two measures were used to assess vocabulary. The Wechsler Preschool and Primary Scale of Intelligence Vocabulary measure (WPPSI; Wechsler, 1989) required children either to name pictures or to provide definitions of words (test-retest reliability for 4.5-year-olds = .83). In the Hundred Picture Naming Test (Fisher & Glenister, 1992), children named pictures (Cronbach's α = .89).

Verbal memory

Two measures were used to assess verbal memory. The Nonword Repetition Task (adapted from Gathercole, Willis, Baddeley, & Emslie, 1994) required children to repeat nonsense words ranging from two to five syllables (Cronbach's α = .84). The WPPSI Sentence Memory subtest (Wechsler, 1989) consisted of sentences ranging in length from two to 18 words that children repeated verbatim (split-half reliability for 5-year-olds = .88).

Reading and spelling measures

At the post-1st and post-4th-grade waves, the twin pairs were assessed on measures of word reading, reading comprehension, and spelling. For the word reading measure, the Sight Word Reading Efficiency subtest from the Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner, & Rashotte, 1999), participants read a list of difficulty-ordered words as quickly as possible, with the score being the number correctly read in 45 seconds (test-retest reliability for children aged 6 to 9 years = .97). The Woodcock Passage Comprehension subtest (Woodcock, 1987) used a cloze procedure in which children read short passages silently and were then asked to provide the missing word that completed the sentence (split-half reliability for first grade = .94). The spelling measure, the Wide Range Achievement Test Spelling Production subtest (WRAT Spelling; Jastak & Wilkinson, 1984), required children to generate written spellings of orally presented words (published alternate form reliability = .90).

Descriptions of Behavioral Genetic Analyses

Identical (MZ) twins share 100% of their genes, while fraternal (DZ) twins share 50% of their segregating genes on average. Shared family influences, however, are assumed to be equally similar regardless of zygosity (Plomin, DeFries, Knopik, & Neiderhiser, 2013). Using this standard twin model, it is possible to decompose the phenotypic variance in a variable into three components: additive genetic influences (a2), shared environmental influences (that make twins in a pair similar regardless of genetic factors; c2), and nonshared environmental influences (that are independent for members of twin pairs, including measurement error; e2).
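The logic of this decomposition can be illustrated with the classic Falconer approximations, which follow directly from the model's expectations that the MZ correlation equals a2 + c2 and the DZ correlation equals 0.5 × a2 + c2. This is only a back-of-the-envelope sketch, not the model-fitting approach used in the present study, and the function names are hypothetical. The example correlations are those implied by the print knowledge estimates in Table 1, not observed twin correlations.

```python
def implied_twin_correlations(a2, c2):
    """Twin correlations expected under the standard ACE model."""
    r_mz = a2 + c2          # MZ twins share all additive genetic and shared-environment effects
    r_dz = 0.5 * a2 + c2    # DZ twins share half of the additive genetic effects on average
    return r_mz, r_dz

def falconer_estimates(r_mz, r_dz):
    """Falconer-style estimates of a2, c2, and e2 from MZ and DZ correlations."""
    a2 = 2.0 * (r_mz - r_dz)   # doubling the MZ-DZ difference isolates genetic variance
    c2 = 2.0 * r_dz - r_mz     # similarity not explained by genes
    e2 = 1.0 - r_mz            # whatever makes MZ twins differ, including measurement error
    return a2, c2, e2

# Round trip using the Table 1 print knowledge estimates (a2 = .27, c2 = .70).
r_mz, r_dz = implied_twin_correlations(0.27, 0.70)
a2, c2, e2 = falconer_estimates(r_mz, r_dz)
print(f"r_MZ={r_mz:.3f} r_DZ={r_dz:.3f} -> a2={a2:.2f} c2={c2:.2f} e2={e2:.2f}")
```

Model fitting is generally preferred over these point formulas because it yields confidence intervals and keeps estimates within admissible bounds.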

The univariate twin model can be extended to multivariate analyses in order to examine the extent to which the genetic and environmental influences covary across variables. The present study uses correlated factors models (Neale & Cardon, 1992) that allow the genetic, shared environmental, and nonshared environmental components of each variable's variance to correlate. For each pair of variables, therefore, three correlations are estimated: genetic, shared environmental, and nonshared environmental. For example, a genetic correlation of 1.00 would imply that the genetic influences on one variable completely overlap with the genetic influences on the second variable.

The genetic and environmental correlations do not take into account the amount of genetic or environmental variance present. For example, it is possible to have 100% genetic overlap between two variables, but have the majority of the variance in each of the variables driven by shared environmental influences. To aid in interpretation and comparison of the correlations across variables, we took into account the magnitudes of the genetic and environmental influences by weighting the covariance by the estimates of genetic and environmental influences on each variable. The resulting “phenotypically standardized covariances” decompose the phenotypic (i.e., observed) correlation between two measures into genetic and environmental components such that $r_{\text{phenotypic}} = \text{cov}_{\text{genetic}} + \text{cov}_{\text{shared environmental}} + \text{cov}_{\text{nonshared environmental}}$ (Plomin & DeFries, 1979). All behavioral genetic models were estimated using the OpenMx package (Boker et al., 2011).
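One common way to express this weighting is to multiply each correlation by the square roots of the corresponding variance components of the two variables, for example cov_genetic = sqrt(a2 of x) × r_genetic × sqrt(a2 of y). The sketch below assumes that form; the function names and all numeric values are hypothetical and chosen only to mirror the scenario described above, in which genetic overlap is complete but most variance is shared environmental.

```python
import math

def standardized_covariance(component_x, component_y, correlation):
    """Weight a genetic or environmental correlation by the variance components."""
    return math.sqrt(component_x) * correlation * math.sqrt(component_y)

def decompose(x, y, r_g, r_c, r_e):
    """Return genetic, shared, and nonshared covariances plus their sum.

    x and y are dicts holding standardized variance components a2, c2, and e2.
    """
    cov_g = standardized_covariance(x["a2"], y["a2"], r_g)
    cov_c = standardized_covariance(x["c2"], y["c2"], r_c)
    cov_e = standardized_covariance(x["e2"], y["e2"], r_e)
    return cov_g, cov_c, cov_e, cov_g + cov_c + cov_e

# Hypothetical variables: complete genetic overlap, but variance mostly shared environmental.
var_x = {"a2": 0.20, "c2": 0.70, "e2": 0.10}
var_y = {"a2": 0.20, "c2": 0.70, "e2": 0.10}
cov_g, cov_c, cov_e, r_phenotypic = decompose(var_x, var_y, r_g=1.0, r_c=0.5, r_e=0.0)
print(f"genetic={cov_g:.2f} shared={cov_c:.2f} nonshared={cov_e:.2f} r={r_phenotypic:.2f}")
```

In this made-up case the genetic correlation is 1.0, yet the genetic component contributes less to the phenotypic correlation than the shared environmental component does.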

Results

Descriptive Statistics and Univariate Behavioral Genetic Analyses

Means and standard deviations for all of the pre-reading, reading, and spelling measures are shown in Table 1. Our sample's standard score (SS) means and standard deviations on the measures of reading comprehension, spelling, and word reading are close to the standardizing population's means and standard deviations of 100 and 15, respectively, as is our sample's scaled score mean and standard deviation on the WPPSI sentence memory test (population M = 10, SD = 3). The descriptive statistics for the 15 pre-reading measures are grouped under their respective latent variables. Three of the measures showed significant skew (rapid color naming, rapid object naming, and sound matching) and were log-transformed prior to all analyses.

Table 1.

Descriptive Statistics and Univariate Estimates with 95% Confidence Intervals.

| Measure | n | Mean | SD | a2 [95% CI] | c2 [95% CI] | e2 [95% CI] |
|---|---|---|---|---|---|---|
| Print Knowledge Latent Variable | | | | .27* [.14, .42] | .70* [.56, .81] | .03 [.00, .08] |
| Concepts about Print (/24) | 973 | 7.15 | 3.80 | | | |
| Letter Identification (/26) | 972 | 17.69 | 6.92 | | | |
| Sound Identification (/26) | 948 | 12.33 | 5.97 | | | |
| Word Cards (/6) | 975 | 2.42 | 1.14 | | | |
| Rapid Naming Latent Variable | | | | .58* [.30, .83] | .18 [.00, .41] | .24* [.16, .33] |
| Objects^a (seconds) | 970 | 126.68 | 39.16 | | | |
| Colors^a (seconds) | 935 | 141.88 | 51.07 | | | |
| Phonological Awareness Latent Variable | | | | .71* [.50, .97] | .29* [.03, .50] | .00 [.00, .03] |
| Syllable & Phoneme Blending (/12) | 976 | 6.47 | 2.54 | | | |
| CTOPP Sound Matching^a (/20) | 976 | 3.67 | 3.11 | | | |
| Word Elision (/12) | 976 | 6.93 | 2.96 | | | |
| Syllable & Phoneme Elision (/12) | 975 | 3.81 | 1.85 | | | |
| Rhyme & Final Sounds (/16) | 973 | 8.59 | 3.14 | | | |
| Vocabulary Latent Variable | | | | .22* [.05, .40] | .76* [.60, .91] | .02 [.00, .12] |
| WPPSI Vocabulary (scaled) | 970 | 10.52 | 3.08 | | | |
| 100 Pictures (/100) | 968 | 75.90 | 9.45 | | | |
| Verbal Memory Latent Variable | | | | .53* [.31, .77] | .40* [.17, .59] | .08 [.00, .18] |
| Nonword Repetition (/28) | 950 | 12.46 | 5.53 | | | |
| WPPSI Sentence Memory (scaled) | 966 | 10.28 | 2.89 | | | |
| Word Reading (TOWRE Sight Word Reading Efficiency) | | | | | | |
| Post-1st grade standard score | 955 | 102.23 | 14.02 | .78* [.61, .88] | .07 [.00, .24] | .14* [.12, .18] |
| Post-4th grade standard score | 936 | 102.30 | 12.19 | .59* [.40, .78] | .14 [.00, .32] | .26* [.21, .33] |
| Reading Comprehension (Woodcock Passage Comprehension) | | | | | | |
| Post-1st grade standard score | 960 | 104.80 | 12.84 | .69* [.51, .83] | .11 [.00, .28] | .20* [.16, .25] |
| Post-4th grade standard score | 938 | 98.49 | 13.82 | .72* [.51, .78] | .00 [.00, .19] | .28* [.23, .34] |
| Spelling (WRAT Spelling) | | | | | | |
| Post-1st grade standard score | 960 | 100.14 | 15.44 | .67* [.48, .80] | .09 [.00, .27] | .24* [.19, .29] |
| Post-4th grade standard score | 938 | 100.66 | 15.01 | .81* [.63, .87] | .03 [.00, .21] | .16* [.13, .20] |

Note: * = significant at p < .05, determined by the 95% confidence intervals [brackets]. a2 = genetic variance, c2 = shared environmental variance, e2 = nonshared environmental variance.

^a Due to significant skew, the variable was transformed prior to all analyses.

Also shown in Table 1 are behavioral genetic univariate analyses that estimate the amount of variance in each measure that is due to genetic influences (a2), shared environmental influences (c2), and non-shared environmental influences (e2). Shared environmental influences were largest for print knowledge and vocabulary (.70 and .76, respectively), with smaller, significant genetic influences (.27 and .22). Rapid naming, phonological awareness, and verbal memory all had large, significant genetic influences (.58, .71, and .53, respectively). Individual differences in phonological awareness and verbal memory also had moderate and significant shared environmental influences (.29 and .40, respectively). Shared environmental influences on rapid naming were small and non-significant (.18). Non-shared environmental influences were generally very small and non-significant, with the exception of rapid naming (.24). The reading and spelling variables all had large genetic influences (between .59 and .81), small and non-significant shared environmental influences (between .00 and .14), and moderate non-shared environmental influences (including measurement error; between .14 and .28).

Testing the Relations Between Pre-reading Skills and Reading and Spelling Abilities

Individual pre-reading factors

The phenotypically standardized covariances for the pre-reading latent factors and the post-1st-grade and post-4th-grade reading and spelling measures are presented in Table 2. As noted previously, phenotypically standardized covariances decompose the phenotypic correlations into their genetic and environmental components. For example, the phenotypic correlation between preschool print knowledge and rapid naming (.42) is the sum, within rounding, of the genetic covariance (.13), shared environmental covariance (.23), and nonshared environmental covariance (.05). The genetic covariance was not significant at p < .05 (determined via 95% confidence intervals calculated in OpenMx), but the shared environmental and nonshared environmental covariances were. In other words, print knowledge and rapid naming share a moderate amount of variance, which is largely composed of shared environmental influences with some nonshared environmental influences as well.
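The arithmetic in this example can be checked directly. The short sketch below sums the three reported components for print knowledge and rapid naming and expresses each as a proportion of that sum; the variable names are hypothetical, and the small gap between the summed components and the reported correlation of .42 presumably reflects rounding of the published values.

```python
# Components reported in the text for print knowledge with rapid naming.
components = {
    "genetic": 0.13,
    "shared environmental": 0.23,
    "nonshared environmental": 0.05,
}
reported_correlation = 0.42

summed = sum(components.values())

# Share of the summed covariance attributable to each source.
proportions = {name: value / summed for name, value in components.items()}

print(f"sum of components = {summed:.2f} (reported r = {reported_correlation:.2f})")
for name, share in proportions.items():
    print(f"{name}: {share:.0%} of the summed covariance")
```

The shared environmental component accounts for a little over half of the summed covariance, which matches the verbal description of this relation as largely shared environmental.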

Table 2.

Phenotypically Standardized Covariances.

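Phenotypic Correlation = Genetic + Shared Environmental + Nonshared Environmental

Phenotypic Correlation

| Variable | 1. Print Knowledge | 2. Rapid Naming | 3. Phonological Awareness | 4. Vocabulary | 5. Verbal Memory |
|---|---|---|---|---|---|
| 1. Print Knowledge LV | - | | | | |
| 2. Rapid Naming LV | .42 | - | | | |
| 3. Phonological Awareness LV | .75 | .36 | - | | |
| 4. Vocabulary LV | .67 | .35 | .76 | - | |
| 5. Verbal Memory LV | .52 | .27 | .70 | .73 | - |
| Word Reading: Post-1st | .52 | .39 | .44 | .34 | .31 |
| Word Reading: Post-4th | .40 | .40 | .39 | .36 | .36 |
| Word Reading: Residual Post-4th | .02 | .16 | .09 | .15 | .19 |
| Reading Comprehension: Post-1st | .57 | .38 | .56 | .48 | .41 |
| Reading Comprehension: Post-4th | .50 | .34 | .61 | .57 | .55 |
| Reading Comprehension: Residual Post-4th | .17 | .12 | .32 | .35 | .38 |
| Spelling: Post-1st | .48 | .29 | .46 | .32 | .30 |
| Spelling: Post-4th | .44 | .32 | .46 | .34 | .34 |
| Spelling: Residual Post-4th | .11 | .15 | .17 | .14 | .17 |

Genetic

| Variable | 1. Print Knowledge | 2. Rapid Naming | 3. Phonological Awareness | 4. Vocabulary | 5. Verbal Memory |
|---|---|---|---|---|---|
| 1. Print Knowledge LV | - | | | | |
| 2. Rapid Naming LV | .13 | - | | | |
| 3. Phonological Awareness LV | .32 | .02 | - | | |
| 4. Vocabulary LV | .17 | .01 | .21 | - | |
| 5. Verbal Memory LV | .13 | .10 | .17 | .29 | - |
| Word Reading: Post-1st | .29 | .33 | .29 | .17 | .25 |
| Word Reading: Post-4th | .20 | .30 | .24 | .19 | .22 |
| Word Reading: Residual Post-4th | −.02 | .07 | .04 | .09 | .05 |
| Reading Comprehension: Post-1st | .36 | .30 | .42 | .24 | .33 |
| Reading Comprehension: Post-4th | .15 | .19 | .28 | .27 | .41 |
| Reading Comprehension: Residual Post-4th | −.10 | −.01 | .00 | .15 | .26 |
| Spelling: Post-1st | .30 | .19 | .35 | .23 | .23 |
| Spelling: Post-4th | .32 | .23 | .31 | .18 | .20 |
| Spelling: Residual Post-4th | .13 | .13 | .06 | .01 | .04 |

Shared Environmental

| Variable | 1. Print Knowledge | 2. Rapid Naming | 3. Phonological Awareness | 4. Vocabulary | 5. Verbal Memory |
|---|---|---|---|---|---|
| 1. Print Knowledge LV | - | | | | |
| 2. Rapid Naming LV | .23 | - | | | |
| 3. Phonological Awareness LV | .42 | .31 | - | | |
| 4. Vocabulary LV | .48 | .29 | .55 | - | |
| 5. Verbal Memory LV | .37 | .14 | .50 | .42 | - |
| Word Reading: Post-1st | .20 | .02 | .15 | .13 | .04 |
| Word Reading: Post-4th | .19 | .06 | .17 | .16 | .12 |
| Word Reading: Residual Post-4th | .06 | .08 | .08 | .09 | .13 |
| Reading Comprehension: Post-1st | .19 | .04 | .14 | .17 | .07 |
| Reading Comprehension: Post-4th | .30 | .10 | .31 | .27 | .10 |
| Reading Comprehension: Residual Post-4th | .23 | .09 | .28 | .21 | .08 |
| Spelling: Post-1st | .15 | .07 | .10 | .07 | .05 |
| Spelling: Post-4th | .11 | .06 | .13 | .12 | .10 |
| Spelling: Residual Post-4th | −.01 | .01 | .09 | .09 | .10 |

Nonshared Environmental

| Variable | 1. Print Knowledge | 2. Rapid Naming | 3. Phonological Awareness | 4. Vocabulary | 5. Verbal Memory |
|---|---|---|---|---|---|
| 1. Print Knowledge LV | - | | | | |
| 2. Rapid Naming LV | .05 | - | | | |
| 3. Phonological Awareness LV | .01 | .03 | - | | |
| 4. Vocabulary LV | .02 | .05 | .01 | - | |
| 5. Verbal Memory LV | .02 | .03 | .02 | .03 | - |
| Word Reading: Post-1st | .03 | .04 | .00 | .04 | .02 |
| Word Reading: Post-4th | .02 | .03 | −.02 | .01 | .02 |
| Word Reading: Residual Post-4th | −.01 | .01 | −.03 | −.03 | .01 |
| Reading Comprehension: Post-1st | .03 | .03 | .00 | .07 | .02 |
| Reading Comprehension: Post-4th | .05 | .05 | .03 | .04 | .04 |
| Reading Comprehension: Residual Post-4th | .04 | .04 | .04 | −.01 | .04 |
| Spelling: Post-1st | .02 | .03 | .01 | .01 | .02 |
| Spelling: Post-4th | .01 | .03 | .02 | .04 | .04 |
| Spelling: Residual Post-4th | −.01 | .02 | .02 | .04 | .03 |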

Note: The phenotypic correlations are decomposed into their genetic, shared environmental, and nonshared environmental components. LV = Latent Variable of Pre-reading measures; Residual Post-4th = Post-4th score after controlling for post-1st-grade score.

Correlations/Covariances in bold are significant at p < .05, determined by 95% confidence intervals.

As shown in Table 2, the five pre-reading latent factors were all significantly phenotypically correlated ($r_{\text{phenotypic}}$ between .27 and .76). Rapid naming had the smallest correlations with the other pre-reading skills ($r_{\text{phenotypic}}$ between .27 and .42), with the bulk of those relations significantly due to shared environmental influences. In contrast, print knowledge, phonological awareness, vocabulary, and verbal memory were more strongly correlated ($r_{\text{phenotypic}}$ between .52 and .76), with significant amounts of both genetic and shared environmental covariance. Nonshared environmental covariances were very small and, in general, not significant, consistent with the idea that using latent variables minimizes measurement error and any remaining measurement error is not correlated among variables.

Turning now to the correlations for the reading and spelling variables shown in Table 2, post-1st-grade word reading, reading comprehension, and spelling were moderately phenotypically correlated with all of the pre-reading factors (rphenotypic between .29 and .57), supporting previous research that longitudinally linked each of the pre-reading abilities to later reading and spelling. When the analyses were rerun for the 87% of the participants who read zero words on the Woodcock word identification subtest (Woodcock, 1987), the phenotypic correlations between the pre-reading and post-1st and post-4th grade reading and spelling measures were smaller, but most of the correlations were similar (i.e., within .10 of the original correlation) and all correlations continued to be significant.

The etiologies of the correlations for the whole sample were largely genetic. While all five of the pre-reading latent factors had significant genetic covariances with post-1st-grade reading and spelling, shared environmental covariances were only significant for print knowledge with word reading, reading comprehension, and spelling, and for vocabulary with reading comprehension. The majority of the longitudinal relations between rapid naming, phonological awareness, vocabulary, and verbal memory and post-1st-grade reading and spelling, therefore, were driven by genetic factors. This shows that while individual differences in these pre-reading skills had moderate to large shared environmental estimates, the smaller magnitude genetic influences on each pre-reading skill were mainly what predicted reading and spelling ability at the end of 1st grade.

The finding that genetic covariance, rather than shared environmental covariance, was largely responsible for the longitudinal relations between the individual pre-reading skills and post-1st-grade reading and spelling may not be too surprising given that there was no significant shared environmental variance for word reading, reading comprehension, or spelling at the end of first grade in the univariate estimates shown in Table 1. However, the non-significant univariate estimates for shared environment seem to be contradicted by the fact that we found significant shared environmental covariance between print knowledge and the three reading and spelling variables. This suggests that the univariate behavioral genetic analyses, which are based on the difference between the MZ twin correlation and the DZ twin correlation within a variable, can underestimate the shared environmental influence on that variable. In a multivariate analysis, such as a correlation between two variables, the cross-trait MZ and DZ correlations are also relevant. By including information regarding the MZ and DZ cross-trait relations, the multivariate analyses can find significant shared environmental covariance even when the univariates show little shared environmental variance. It is important to note that the significant shared environmental covariances with post-1st-grade reading and spelling were found only for print knowledge, which had a large amount of shared environmental variance (.70).
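The contrast between the univariate and multivariate cases can be illustrated with a Falconer-style approximation. This is only a sketch: the estimates reported here come from fitted structural equation models, not from these closed-form formulas, and all twin correlations below are hypothetical values chosen for illustration. The function names are likewise assumptions.

```python
def falconer_univariate(r_mz, r_dz):
    """Falconer-style ACE proportions from within-trait twin correlations."""
    a2 = 2 * (r_mz - r_dz)   # additive genetic variance
    c2 = 2 * r_dz - r_mz     # shared environmental variance
    e2 = 1 - r_mz            # nonshared environment (includes measurement error)
    return a2, c2, e2


def falconer_bivariate(r_phenotypic, r_mz_cross, r_dz_cross):
    """Same logic applied to cross-twin cross-trait correlations.

    The three components are phenotypically standardized covariances,
    so they sum to the phenotypic correlation between the two traits.
    """
    cov_a = 2 * (r_mz_cross - r_dz_cross)
    cov_c = 2 * r_dz_cross - r_mz_cross
    cov_e = r_phenotypic - r_mz_cross
    return cov_a, cov_c, cov_e


# Hypothetical correlations, not values from this study.
a2, c2, e2 = falconer_univariate(r_mz=0.80, r_dz=0.45)
cov_a, cov_c, cov_e = falconer_bivariate(
    r_phenotypic=0.50, r_mz_cross=0.45, r_dz_cross=0.35
)
print(f"univariate: a2={a2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
print(f"bivariate:  covA={cov_a:.2f}, covC={cov_c:.2f}, covE={cov_e:.2f}")
```

With these illustrative inputs, the outcome's univariate shared environmental estimate is small while the shared environmental covariance with the second trait is not, which is the pattern described above for print knowledge.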

Three years later at the post-4th-grade assessment all five of the pre-reading factors continued to be significantly correlated with word reading, reading comprehension, and spelling (rphenotypic between .32 and .61). While most of the post-4th-grade correlations with the pre-reading skills were similar in magnitude to their post-1st-grade correlations (as judged by overlapping 95% confidence intervals), the correlation between verbal memory and post-4th-grade reading comprehension significantly increased from rphenotypic = .41 at post-1st-grade to rphenotypic = .55 at post-4th-grade (significance assessed by non-overlapping 95% confidence intervals). Smaller non-significant increases (i.e., 95% confidence intervals overlapped) were found for vocabulary (.48 to .57) and phonological awareness (.56 to .61).

The genetic and environmental phenotypically standardized covariances indicated that, as with the post-1st-grade covariances, each of the pre-reading factors shared significant genetic covariance with post-4th-grade word reading, reading comprehension, and spelling. Genetic influences affecting individual differences in preschool-aged children, therefore, are partially shared with reading and spelling ability five years later. In addition, significant shared environmental covariances were found for print knowledge, phonological awareness, and vocabulary with word reading and reading comprehension. These covariances were generally similar in magnitude to the genetic covariances, but were somewhat greater for reading comprehension (.30, .31, and .27, respectively) compared to word reading (.19, .17, and .16). As with the post-1st-grade covariances, the shared environmental covariances were significant despite the univariate results of little to no shared environmental variance for post-4th-grade word reading and reading comprehension (shown in Table 1).
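Because the covariances in Table 2 are phenotypically standardized, the genetic, shared environmental, and nonshared environmental components for each pair of variables should sum to the phenotypic correlation, up to rounding. The following sketch checks this for the post-4th-grade reading comprehension row. The column order (print knowledge, rapid naming, phonological awareness, vocabulary, verbal memory) is an assumption inferred from the values quoted in the text, and the helper name is hypothetical.

```python
# Post-4th-grade reading comprehension row of Table 2, values as printed.
# Column order is inferred from the text, not read from the table header.
skills = ["print knowledge", "rapid naming", "phonological awareness",
          "vocabulary", "verbal memory"]
r_phenotypic = [0.50, 0.34, 0.61, 0.57, 0.55]
cov_genetic = [0.15, 0.19, 0.28, 0.27, 0.41]
cov_shared_env = [0.30, 0.10, 0.31, 0.27, 0.10]
cov_nonshared_env = [0.05, 0.05, 0.03, 0.04, 0.04]


def decomposition_gaps(r, a, c, e):
    """Return |r - (a + c + e)| for each pre-reading skill."""
    return [abs(rp - (ai + ci + ei)) for rp, ai, ci, ei in zip(r, a, c, e)]


gaps = decomposition_gaps(r_phenotypic, cov_genetic,
                          cov_shared_env, cov_nonshared_env)
for skill, gap in zip(skills, gaps):
    print(f"{skill}: gap = {gap:.2f}")
```

Gaps of about .01 are expected because each table entry is rounded to two decimals.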

The final set of covariances, those between each pre-reading factor and post-4th-grade ability independent of post-1st-grade ability, addressed the question of whether the genetic and environmental influences shared between the pre-reading factors and post-1st-grade ability are the same as those shared between the pre-reading factors and post-4th-grade ability. For word reading and spelling, the phenotypic correlations were generally small (rphenotypic between .02 and .19) and, with the exception of the genetic covariance between print knowledge and residual fourth-grade spelling, the phenotypically standardized genetic and shared environmental covariances did not reach significance. In general, the majority of the genetic and environmental influences shared between pre-reading factors and post-4th-grade word reading and spelling overlapped with those at post-1st grade.

In contrast to word reading and spelling, residual post-4th-grade reading comprehension showed larger phenotypic correlations with the pre-reading factors (rphenotypic between .12 and .38). Shared environmental influences significantly accounted for large portions of the relations of reading comprehension with print knowledge, phonological awareness, and vocabulary. Interestingly, verbal memory shared significant genetic covariance with residual fourth-grade reading comprehension. These results show that successful reading comprehension as a fourth grader has somewhat different genetic and environmental etiologies than successful reading comprehension as a first grader. This will be explored more in the discussion.

Finally, while the results in Table 2 show that print knowledge, rapid naming, phonological awareness, vocabulary, and verbal memory measured in pre-readers predicted individual differences in reading and spelling both at the end of first grade and the end of fourth grade, the total variance (calculated by squaring the phenotypic correlations) accounted for by each pre-reading factor ranged between 8% (post-1st-grade spelling with rapid naming) and 37% (post-4th-grade reading comprehension with phonological awareness). These percentages imply that most of the variance in later reading and spelling is not due to any one pre-reading skill. In the following analyses, we first directly model the etiology of the common variance shared among all of the pre-reading factors and then test whether the genetic and environmental influences across the five pre-reading skills account for more reading and spelling variance than the genetic and environmental influences on individual pre-reading factors.
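The percentages above follow directly from squaring the phenotypic correlations. As a minimal sketch (the function name is an assumption), the two extremes cited in the text can be reproduced as follows:

```python
def percent_variance_explained(r):
    """Percentage of variance shared by two variables correlated at r."""
    return 100 * r ** 2


# Phenotypic correlations taken from the text and Table 2.
smallest = percent_variance_explained(0.29)  # post-1st spelling, rapid naming
largest = percent_variance_explained(0.61)   # post-4th comprehension, phonological awareness
print(f"{smallest:.0f}% to {largest:.0f}%")
```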

Common pre-reading genetic and environmental variance

We used a hierarchical model, also called a second-order model, to estimate the extent to which the genetic and environmental influences shared among the five pre-reading factors contribute to variance in each pre-reading skill (similar to the approach of Friedman et al., 2008). The results of this model are shown in Figure S1. In the hierarchical model, the genetic and environmental variances in the five pre-reading latent factors are split into two levels: those unique to each pre-reading factor and those shared by all factors (common pre-reading variance). The results of the hierarchical model show that the majority of the variance in pre-reading print knowledge (62%), phonological awareness (83%), vocabulary (74%), and verbal memory (59%) was accounted for by the common pre-reading variance factor, with the majority (67%) of the common variance due to shared environmental influences (with 28% due to genetic influences and 4% due to non-shared environmental influences). In contrast, 82% of rapid naming variance was not shared with the other pre-reading factors and the majority of its variance (58%) was due to unique genetic factors, perhaps reflecting the speeded component of the rapid naming tasks. In general, these results suggest that with the exception of rapid naming, the pre-reading measures share large amounts of variance, and this common variance is due largely to shared environmental influences and lesser amounts of genetic variance.
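One way to read the two levels of the hierarchical model is to multiply the share of each skill's variance carried by the common factor by the proportions of the common variance due to genetic, shared environmental, and non-shared environmental influences. The sketch below does this with the percentages reported above. It assumes that variance transmitted through the common factor carries that factor's proportions unchanged, and the 18% figure for rapid naming is derived from the 82% reported as not shared. The variable names are hypothetical.

```python
# Share of each pre-reading factor's variance due to the common factor.
common_share = {
    "print knowledge": 0.62,
    "phonological awareness": 0.83,
    "vocabulary": 0.74,
    "verbal memory": 0.59,
    "rapid naming": 0.18,  # 1 - 0.82 reported as not shared
}
# Proportions of the common pre-reading variance (sum to .99 due to rounding).
common_ace = {"A": 0.28, "C": 0.67, "E": 0.04}

# Portion of each skill's total variance attributable to common A, C, and E.
via_common = {
    skill: {source: share * prop for source, prop in common_ace.items()}
    for skill, share in common_share.items()
}

for skill, parts in via_common.items():
    summary = ", ".join(f"{k}={100 * v:.0f}%" for k, v in parts.items())
    print(f"{skill}: {summary}")
```

Under these assumptions, roughly half of the variance in phonological awareness and vocabulary would trace to shared environmental influences acting through the common factor, compared with about an eighth for rapid naming.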

Testing the etiologies underlying the relations between common pre-reading variance and reading and spelling development

The next step in our analyses was to assess the etiology of the covariance between common pre-reading variance, unique pre-reading variance, and word reading, reading comprehension, and spelling. To do this, we modified the hierarchical model (Figure S1) to a nested behavioral genetic model (Figure 1; see Chen, West, & Sousa, 2006 for more details regarding nested and hierarchical models). In the nested model, the individual pre-reading variables load directly onto both the common pre-reading latent variable and their independent latent variables. The resulting latent factors are orthogonal to each other and the latent variables specific to each pre-reading skill contribute to variance after the variance common to all of the pre-reading measures is partialled out. Because we use data from both twins in a pair (i.e., four observed variables rather than two) but we constrain the individual factor loadings and residuals equal across the two twins, we do not encounter the same identification issues for the two-indicator latent variables as phenotypic nested models.

Figure 1.

Behavioral Genetic Nested Model of Pre-reading Variables. Ovals represent latent variables while squares represent measured variables. All loadings on single-headed arrows are standardized regression coefficients and are significant at p < .05, with the exception of Phonological Awareness to CTOPP Sound Matching (dashed line). Single-headed arrows and numbers below the observed variables show the amount of residual variance independent of the latent variable. Percentages above the A (genetic), C (shared environmental), and E (non-shared environmental) latent variables show the genetic and environmental proportions of variance. If shaded, A, C, and E estimates are significant. CAP = concepts about print; S & P Blending = syllable and phoneme blending; S & P Elision = syllable and phoneme elision.

Finally, we modified the nested model by adding correlations between the pre-reading genetic and environmental factors and the word reading (Figure 2), reading comprehension (Figure 3), and spelling (Figure 4) genetic and environmental factors. These correlated factor models test the etiology of the covariance between the common and unique pre-reading variance and the reading and spelling variables. Because the unique genetic and environmental variance in vocabulary was not significant in the nested behavioral genetic model, we fit the correlated factors models both with and without the unique vocabulary factor. The results were very similar between the two approaches and we chose to include the unique vocabulary factor in the final models. The results of fitting the final correlated factors models are shown in Figures 2, 3, and 4. Four primary results were obtained: those related to common pre-reading variance, unique genetic print knowledge variance, unique genetic rapid naming variance, and the role of unique verbal memory variance in predicting post-4th-grade reading comprehension.

Figure 2.

Behavioral Genetic Correlated Factors Model of Pre-reading Variables with Word Reading (TOWRE Sight Word Reading Efficiency). Numbers on double-headed arrows are phenotypically standardized covariances, such that the phenotypic correlation (double-headed arrow connecting P components) is the sum of the additive genetic covariance (A), shared environmental covariance (C), and non-shared environmental covariance (E). Solid lines and bold numbers depict significant correlations/covariances. While all phenotypically standardized covariances were estimated, only the ACE decompositions of significant phenotypic correlations are shown.

Figure 3.

Behavioral Genetic Correlated Factors Model of Pre-reading Variables with Reading Comprehension (Woodcock Passage Comprehension). Numbers on double-headed arrows are phenotypically standardized covariances, such that the phenotypic correlation (double-headed arrow connecting P components) is the sum of the additive genetic covariance (A), shared environmental covariance (C), and non-shared environmental covariance (E). Solid lines and bold numbers depict significant correlations/covariances. While all phenotypically standardized covariances were estimated, only the ACE decompositions of significant phenotypic correlations are shown.

Figure 4.

Behavioral Genetic Correlated Factors Model of Pre-reading Variables with Spelling (WRAT Spelling). Numbers on double-headed arrows are phenotypically standardized covariances, such that the phenotypic correlation (double-headed arrow connecting P components) is the sum of the additive genetic covariance (A), shared environmental covariance (C), and non-shared environmental covariance (E). Solid lines and bold numbers depict significant correlations/covariances. While all phenotypically standardized covariances were estimated, only the ACE decompositions of significant phenotypic correlations are shown.

The phenotypic correlations between the common pre-reading factor and post-1st-grade reading and spelling (rphenotypic = .40 for spelling, .44 for word reading, and .57 for reading comprehension) were very similar to the phenotypic correlations between the common pre-reading factor and post-4th-grade reading and spelling (rphenotypic = .39 for spelling, .43 for word reading, and .61 for reading comprehension). By squaring the phenotypic correlations, we calculated that common pre-reading variance accounted for between 15% and 37% of the variance in word reading, reading comprehension, and spelling.

Unlike the correlations for each individual pre-reading factor shown in Table 2, the phenotypic correlations with common pre-reading variance were due largely to overlapping shared environmental covariance, with shared environmental covariances ranging from covshared environmental = .25 (post-4th-grade spelling) to .43 (post-4th-grade reading comprehension). The genetic covariances between common pre-reading variance and post-4th-grade word reading (covgenetic = .12), post-4th-grade spelling (covgenetic = .11), and both post-1st and post-4th-grade reading comprehension were also significant (covgenetic = .10 and .11, respectively).

The unique pre-reading variables, which were composed of skill-specific variance, did not account for large amounts of variance in the outcome measures. The largest phenotypic correlation between a unique pre-reading factor and a literacy outcome measure was between post-1st-grade word reading and print knowledge (rphenotypic = .25), suggesting that common pre-reading variance was a better predictor of post-1st-grade and post-4th-grade word reading, reading comprehension, and spelling than the unique pre-reading factors. Out of the pre-reading factors, only unique rapid naming was significantly correlated phenotypically with all of the outcome measures. Unique print knowledge was significantly correlated with all three post-1st-grade outcomes, as well as post-4th-grade spelling. These phenotypic correlations were largely due to genetic influences: significant genetic covariances were found between unique print knowledge variance and post-1st-grade word reading (covgenetic = .28), reading comprehension (covgenetic = .35), and spelling (covgenetic = .29), as well as with post-4th-grade spelling (covgenetic = .29). In addition, with the exception of post-1st-grade spelling, significant genetic covariances were found between unique rapid naming and the reading and spelling variables (covgenetic = .29 for post-1st-grade word reading; .25 for post-1st-grade reading comprehension; .25 for post-4th-grade word reading; .18 for post-4th-grade reading comprehension; .18 for post-4th-grade spelling). In contrast, the unique phonological awareness, vocabulary, and verbal memory (with the exception of post-4th-grade reading comprehension) factors did not have significant genetic and environmental covariances with the reading and spelling measures.

Finally, post-4th-grade reading comprehension was significantly correlated phenotypically with unique verbal memory (rphenotypic = .16). This correlation was a combination of a significant positive genetic covariance (covgenetic = .37), a significant negative shared environmental covariance (covshared environmental = −.19), and a nonsignificant negative non-shared environmental covariance (covnonshared environmental = −.02; not shown in Figure 3). Although the negative shared environmental covariance is likely due to chance, it is important to note that the magnitude of post-4th-grade reading comprehension variance due to unique verbal memory (2.6%; square of phenotypic correlation) was much smaller than for the common pre-reading latent variable (37%). Regardless, as in Table 2, these results show that, for our measure of reading comprehension at least, reading comprehension relies more on memory in fourth grade than in first grade.

Performance on an additional reading comprehension measure at the end of fourth grade was available for 80% of our sample (the Gates MacGinitie reading comprehension test; MacGinitie, MacGinitie, Maria, & Dreyer, 2000). We checked whether our results varied across comprehension measures for this subset by fitting two correlated factors models (one with the Gates substituted for the Woodcock Passage Comprehension, data not shown, and one with a latent variable composed of both post-4th-grade reading comprehension measures, shown in Figure S2). The results of both models were similar to those with only the Woodcock. However, other comprehension measures that use much longer passages might show a different pattern of relations with the pre-reading factors.

Taken together, results obtained from the correlated factors models indicated that common pre-reading variance was the strongest phenotypic predictor of post-1st and post-4th-grade word reading, reading comprehension, and spelling, largely due to overlapping shared environmental variance, with smaller but statistically significant genetic influences for post-1st-grade reading comprehension and post-4th-grade reading and spelling. In addition, genetic variance unique to print knowledge and genetic variance unique to rapid naming also played important longitudinal roles.

Discussion

The present study included measures of five pre-reading skills, both individually and together, as predictors of individual differences in post-1st-grade and post-4th-grade word reading, reading comprehension, and spelling ability. The five pre-reading skills were measured prior to the start of formal literacy instruction, minimizing the confounding influence of learning to read. Also, because we had multiple measures of each of the pre-reading skills, they were modeled as latent variables, increasing the reliability of the constructs. Importantly, we explored the genetic and environmental etiologies underlying the longitudinal relations between the pre-reading skills and future reading and spelling abilities by using behavioral genetic analyses of our sample of identical and fraternal twins. In general our results indicate, with a few important caveats by grade and reading measure, that genetic and shared environmental influences common to all of the pre-reading measures, as well as genetic influences unique to print knowledge and rapid naming, have significant longitudinal overlap with reading and spelling development.

Evidence for Stability and Instability in Transition from “Learning to Read” to “Reading to Learn”

We included reading and spelling variables measured at both post-1st grade and post-4th grade to test whether the pre-reading skills predicted similar amounts of variance when children were “learning to read” as when they were “reading to learn”. After controlling for post-1st-grade word reading and spelling, the correlations with end-of-fourth-grade word reading and spelling were largely diminished, suggesting that influences on individual differences at the end of fourth grade largely overlap with those at the end of first grade for these two measures. In the absence of interventions or other factors that could affect an individual's growth rate, this is indicative of stability in word reading and spelling such that knowing how well children are performing early, even by the end of first grade, will predict how well they will be reading words and spelling years later (see also Juel, 1988; Scarborough, 1998).

The correlations between post-4th-grade reading comprehension and some of the pre-reading skills are an important exception. The significant increase in reading comprehension's phenotypic correlation with verbal memory between post-1st and post-4th grade (from rphenotypic = .41 to .55), the increases in the phenotypic correlations with vocabulary (from rphenotypic = .48 to .57) and phonological awareness (from rphenotypic = .56 to .61), the significant genetic covariance for verbal memory and post-4th-grade reading comprehension after controlling for post-1st grade (covgenetic = .26), and the significant shared environmental covariances between residual post-4th-grade reading comprehension and print knowledge (covshared environmental = .23), phonological awareness (covshared environmental = .28), and vocabulary (covshared environmental = .21) all indicate that the task demands for reading comprehension changed between first and fourth grade. Previous work by Keenan, Betjemann, and Olson (2008) using the same reading comprehension measure but a different sample revealed that performance on this measure was more dependent on word reading skills in younger children than in older children, with performance in older children tied to both word reading and listening comprehension. Phonological awareness, vocabulary, and verbal memory, therefore, may be more predictive of reading comprehension at the end of fourth grade because of increased reliance on oral comprehension ability.

Exploring the Longitudinal Roles of Genetic and Environmental Influences

To the best of our knowledge, the present study is the first behavioral genetic analysis of the common variance among pre-reading skills, allowing us to estimate both the etiology of this common variance and the extent to which the common pre-reading variance longitudinally predicted reading and spelling ability. We found that the majority of the variance in print knowledge, phonological awareness, vocabulary, and verbal memory was accounted for by a common pre-reading latent variable, with the majority (67%) of the common pre-reading variance due to shared environmental influences. In contrast, rapid naming shared only 18% of its variance with the other pre-reading skills, showing that the speed with which children are able to name pictures and colors is largely separable from the other pre-reading skills.

It was interesting to find that the common pre-reading factor and the unique pre-reading factors did not account for the majority of the phenotypic variance in future reading and spelling ability (between 25% and 38% of the variance in post-1st-grade reading and spelling ability, and between 25% and 44% of the variance at the end of fourth grade; because the latent factors in the nested model are orthogonal, the total percentage of variance can be calculated as the sum of the squared phenotypic correlations). This finding highlights the fact that additional influences on reading emerge as children enter formal schooling. These influences could be either genetic (due to the increasing cognitive complexity of reading which may tap new genetic influences) or environmental (reflecting family and school environment effects).
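The additivity noted in the parenthesis above can be sketched as follows. Only two of the correlations with post-4th-grade reading comprehension are stated in the text (the common factor at .61 and unique verbal memory at .16), so the result is a partial total for illustration and is not a reproduction of the reported range. The function name is an assumption.

```python
def total_variance_explained(correlations):
    """Sum of squared correlations; valid only for mutually orthogonal predictors."""
    return sum(r ** 2 for r in correlations.values())


# Correlations with post-4th-grade reading comprehension reported in the text.
# The remaining unique factors are omitted because their values are not listed.
reported = {"common pre-reading": 0.61, "unique verbal memory": 0.16}
partial_total = total_variance_explained(reported)
print(f"{100 * partial_total:.1f}% of variance from these two factors")
```

If the predictors were correlated, the squared correlations would overlap and could not simply be added, which is why the orthogonality of the nested model matters here.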

The variance that the common pre-reading factor did share with reading and spelling development was due to both genetic and shared environmental influences. This suggests that the pre-reading literacy environment is important in two ways: the extent to which children are exposed to print and vocabulary (shared environmental influences) and the extent to which children are able to learn from that exposure (genetic influences). Previous research concerning the home and school literacy environments of preschoolers has suggested that active engagement in literacy activities is an important concurrent and longitudinal predictor of literacy development (e.g., Connor, Morrison, & Slominski, 2006; Levy, Gong, Hessels, Evans, & Jared, 2006; Sénéchal & LeFevre, 2002). Teachers and parents play an important role in providing an engaging literacy environment for the children. For example, the quality of the vocabulary teachers use around preschoolers is longitudinally tied to future reading ability (Dickinson & Porche, 2011). Thus, our findings suggest possible etiological mechanisms underlying these previous findings.

It is important to note that early “environmental” variables such as number of books in the home, time spent reading with child, IQ, parental education, etc. are likely a mixture of genetic and shared environmental variance. For example, parents who have more books in the home or take their children to the library may be better readers themselves. Regardless, the result that shared environmental influences on the common pre-reading factor covary with future reading and spelling suggests that general shared environmental influences at preschool continue to play a role in reading ability, even five years later. Whether or not this reflects something specific about the pre-reading environment is an open question. It is possible that the quality of the pre-reading literacy environment could act as a proxy for the quality of the subsequent general family and school environment.

The Unique Genetic Influences from Print Knowledge and Rapid Naming

Genetic variance in how quickly children perform speeded naming of visual stimuli is predictive of how well they will be reading up to five years later. After extracting the common pre-reading variance, the genetic covariances between rapid naming and the reading and spelling variables were significant, with the exception of post-1st-grade spelling.

In addition, the unique genetic variance in print knowledge was significantly correlated with post-1st-grade word reading, reading comprehension, and spelling, as well as with post-4th-grade spelling. Letter identification and sound identification loaded most strongly onto the unique print knowledge factor (standardized loadings = .63 and .52 respectively; shown in Figure 1). The significant genetic covariance at post-1st grade could reflect the importance of learning sound-symbol associations, as suggested by findings from a recent ILTS study that explored the genetic variance shared by learning and reading measures (Byrne et al., 2013). The finding that the genetic covariance between the unique print knowledge variance and post-4th-grade word reading and reading comprehension was not significant suggests that readers who are “reading to learn” depend less on decoding ability and instead may use a more lexical route, with increased importance for vocabulary, oral comprehension, and context.

Finally, the finding that unique genetic and environmental variance in pre-reading phonological awareness, verbal memory (with the exception of post-4th-grade reading comprehension), and vocabulary did not significantly correlate with word reading, reading comprehension, and spelling has implications for understanding why pre-reading skills longitudinally predict future reading and spelling ability. For example, phonological awareness may be important for predicting reading because it is related to children's general environmental exposure to text and ability to learn from that exposure.

Implications

Taken together, the results offer suggestions for developing interventions aimed at children at-risk for future reading difficulties. For example, if genetic influences that carry over from preschool to elementary school reflect learning rate, children identified as at-risk for future reading problems may need to devote substantially more time to learning to read than their peers. Rather than training specific pre-reading skills, it may be more effective to increase the amount of reading time and the training of grapheme-phoneme correspondences.

In addition, given that the five pre-reading factors together only accounted for between 25% and 38% of the variance in reading and spelling ability at the end of first grade, early identification of future reading problems could benefit from including additional constructs, such as inattention (Ebejer et al., 2010) or family risk (e.g., Eklund, Torppa, & Lyytinen, 2011). One possibility is to assess children's rate of growth on a pre-reading measure. Children who struggle to learn phonological awareness even with direct and targeted instruction, for example, are likely to continue to struggle learning to read (Byrne, Fielding-Barnsley, & Ashley, 2000; Byrne et al., 2013). Future studies could also develop methods for more controlled dynamic assessments of learning rate for print-sound associations in children who are low in print knowledge. This could help differentiate children who will catch up to their peers with consistent literacy instruction from those who will need to spend additional time-on-task to reach or more closely approach grade level. However, the specific challenges associated with learning print-sound associations may not generalize to other forms of paired-associate learning as predictors of early reading development (Lervåg, Bråten, & Hulme, 2009).

Limitations

Significant shared environmental covariances were found even though the univariate analyses of the reading and spelling variables showed no significant shared environmental variance. As discussed previously, this can be due to the greater information available in the multivariate case; alternatively, if our assumption that all genetic influences are additive has been violated, the univariate genetic estimates may be too high and the shared environment estimates too low (Keller & Coventry, 2005). However, there are three reasons to think that any bias is likely to be minimal. First, any violation of the no-assortative-mating assumption of our twin models would lead to improperly inflated shared-environment estimates and decreased genetic estimates. Second, some researchers argue that there is little evidence that non-additive genetic variance is a major source of genetic variance for complex cognitive traits (e.g., Hill, Goddard, & Visscher, 2008). Finally, adoption studies of reading generally find low correlations between reading scores of adopted siblings, and shared environment estimates are similar to those from twin studies (e.g., Petrill, Deater-Deckard, Thompson, DeThorne, & Schatschneider, 2006; Wadsworth, Corley, Plomin, Hewitt, & DeFries, 2006).

It is also important to highlight conclusions that should not be drawn from our results. Specifically, the present estimates of genetic and environmental influences are specific to our twin sample and may not generalize to samples that differ in their environmental range. For example, samples including children learning to read in their second language or including children receiving very different approaches to or amounts of literacy education could increase the overall environmental variance, resulting in higher estimates of shared environment and lower estimates of genetic influence. Also, genetic influences on reading may be weaker, and shared environmental influences stronger, among children in lower SES families (Friend, DeFries, & Olson, 2008; Hart, Soden, Johnson, Schatschneider, & Taylor, 2013).
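
Because heritability and shared environment estimates are proportions of total variance, widening the environmental range changes them even when the genetic variance itself is unchanged. The following minimal sketch makes that arithmetic explicit. The raw variance values are hypothetical and are not taken from our data, and the function name is an assumption introduced for illustration.

```python
def standardized_components(var_a, var_c, var_e):
    """Convert raw variance components into proportions of total variance."""
    total = var_a + var_c + var_e
    return var_a / total, var_c / total, var_e / total


# Hypothetical sample with a narrow range of literacy environments.
narrow = standardized_components(var_a=0.6, var_c=0.1, var_e=0.3)

# Same genetic and nonshared variance, but a wider range of shared
# environments (for example, very different amounts of instruction).
wide = standardized_components(var_a=0.6, var_c=0.6, var_e=0.3)

print("narrow environmental range (a2, c2, e2):", narrow)
print("wide environmental range (a2, c2, e2):", wide)
```

With these illustrative numbers the raw genetic variance is identical in both samples, yet the proportion attributed to genetic influences drops while the proportion attributed to shared environment rises.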

The results are also specific to the measures used. For example, we only have single measures of word reading, reading comprehension, and spelling common to post-1st-grade and post-4th-grade. If we had additional measures at both time points, it would be possible to fit the correlated factors models with word reading, reading comprehension, and spelling as latent variables, potentially reducing measurement error and increasing the precision of the genetic and environmental estimates.

While we are able to estimate the extent to which genetic and environmental factors matter for the average performance of our sample, our results do not specify the etiological influences that affect reading development for a particular child; it is possible that some children in our sample may have struggled learning to read or excelled in reading for largely environmental reasons rather than genetic reasons, while genetic influences may have been more important for other children. Finally, we are not able to assess the relative importance of genetic and environmental influences prior to the first measurement point.

Conclusion

Individual differences in pre-readers’ print knowledge, rapid naming, phonological awareness, vocabulary, and verbal memory accounted for substantial variance in how well children read and spelled at the end of first grade and at the end of fourth grade. Our findings showed that this stability is due not just to common environmental influences across this span of development but also to common genetic influences. However, we also noted that the results varied depending on the reading measure used. While post-4th-grade word reading and spelling had similar genetic and environmental etiologies to post-1st-grade word reading and spelling, the etiologies of individual differences in reading comprehension at the end of fourth grade appear to be partially distinct (both genetically and environmentally) from post-1st-grade reading comprehension. Distinct genetic influences from speeded access to stored representations (naming speed) and learning sound-symbol associations (print knowledge) were also identified. For children with access to consistent and formalized early literacy education, genetic influences appear to be at least as important for future reading success as environmental influences on pre-reading skills and help explain the variety of ways in which children can succeed or have difficulties learning to read.

Supplementary Material

01

Acknowledgements

Funding was provided by the National Institutes of Health, grant numbers T32 HD007289, P50 HD027802 and R01 HD038526. We thank the twins and their families who participated in our research, as well as Greg Carey for statistical advice.

References

  1. Arbuckle JL. Amos 17.0.0 [computer software]. Amos Development Corporation; Crawfordville, FL: 2008.
  2. Bollen KA. Structural equations with latent variables. John Wiley & Sons; New York: 1989.
  3. Boker SM, Neale MC, Maes HH, Wilde MJ, Spiegel M, Brick TR, Fox J. OpenMx: An Open Source Extended Structural Equation Modeling Framework. Psychometrika. 2011. doi:10.1007/s11336-010-9200-6.
  4. Boscardin CK, Muthén B, Francis DJ, Baker EL. Early identification of reading difficulties using heterogeneous developmental trajectories. Journal of Educational Psychology. 2008;100:192–208. doi:10.1037/0022-0663.100.1.192.
  5. Bowey JA, Patel RK. Metalinguistic ability and early reading achievement. Applied Psycholinguistics. 1988;9:367–383. doi:10.1017/S0142716400008067.
  6. Brady S, Shankweiler D, editors. Phonological processes in literacy. Erlbaum; Hillsdale, NJ: 1991.
  7. Byrne B. The foundation of literacy: The child's acquisition of the alphabetic principle. Psychology Press; Hove: 1998.
  8. Byrne B, Coventry WL, Olson RK, Samuelsson S, Corley R, Willcutt EG, DeFries JC. Genetic and environmental influences on aspects of literacy and language in early childhood: Continuity and change from preschool to grade 2. Journal of Neurolinguistics. 2009;22:219–236. doi:10.1016/j.jneuroling.2008.09.003.
  9. Byrne B, Delaland C, Fielding-Barnsley R, Quain P, Samuelsson S, Høien T, Willcutt E. Longitudinal twin study of early reading development in three countries: Preliminary results. Annals of Dyslexia. 2002;52:47–73. doi:10.1007/s11881-002-0006-9.
  10. Byrne B, Fielding-Barnsley R, Ashley L. Effects of preschool phoneme identity training after six years: Outcome level distinguished from rate of response. Journal of Educational Psychology. 2000;92:659–667. doi:10.1037/0022-0663.92.4.659.
  11. Byrne B, Wadsworth S, Boehme K, Talk AC, Coventry WL, Olson RK, Corley R. Multivariate genetic analysis of learning and early reading development. Scientific Studies of Reading. 2013;17:224–242. doi:10.1080/10888438.2011.654298.
  12. Catts HW, Fey ME, Zhang X, Tomblin JB. Language basis of reading and reading disabilities: Evidence from a longitudinal investigation. Scientific Studies of Reading. 1999;3:331–361. doi:10.1207/s1532799xssr0304_2.
  13. Chall JS. Stages of reading development. McGraw-Hill; New York: 1983.
  14. Chen FF, West SG, Sousa KH. A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research. 2006;41:189–225. doi:10.1207/s15327906mbr4102_5.
  15. Clay M. The early detection of reading difficulties: A diagnostic survey. Heinemann; Auckland, New Zealand: 1975.
  16. Connor CM, Morrison FJ, Slominski L. Preschool instruction and children's emergent literacy growth. Journal of Educational Psychology. 2006;98:665–689. doi:10.1037/0022-0663.98.4.665.
  17. de Jong PF, van der Leij A. Effects of phonological abilities and linguistic comprehension on the development of reading. Scientific Studies of Reading. 2002;6:51–77. doi:10.1207/S1532799XSSR0601_03.
  18. Denckla MB. Color-naming defects in dyslexic boys. Cortex. 1972;8:164–176. doi:10.1016/s0010-9452(72)80016-9.
  19. Denckla MB, Rudel R. Rapid “automatized” naming of pictured objects, colors, letters and numbers by normal children. Cortex. 1974;10:186–202. doi:10.1016/s0010-9452(74)80009-2.
  20. Dickinson DK, Porche MV. Relation between language experiences in preschool classrooms and children's kindergarten and fourth-grade language and reading abilities. Child Development. 2011;82:870–886. doi:10.1111/j.1467-8624.2011.01576.x.
  21. Ebejer JL, Coventry WL, Byrne B, Willcutt EG, Olson RK, Corley R, Samuelsson S. Genetic and environmental influences on inattention, hyperactivity-impulsivity, and reading: Kindergarten to grade 2. Scientific Studies of Reading. 2010;14:293–316. doi:10.1080/10888430903150642.
  22. Eklund KM, Torppa M, Lyytinen H. Predicting reading disability: Early cognitive risk and protective factors. Dyslexia. 2013;19:1–10. doi:10.1002/dys.1447.
  23. Fernald A, Marchman VA, Weisleder A. SES differences in language processing skill and vocabulary are evident at 18 months. Developmental Science. 2013;16:234–248. doi:10.1111/desc.12019.
  24. Fisher JP, Glenister JM. The hundred pictures naming test. Australian Council for Educational Research; Hawthorn, Australia: 1992.
  25. Friedman NP, Miyake A, Young SE, DeFries JC, Corley RP, Hewitt JK. Individual differences in executive functions are almost entirely genetic in origin. Journal of Experimental Psychology: General. 2008;137:201–225. doi:10.1037/0096-3445.137.2.201.
  26. Friend A, DeFries JC, Olson RK. Parental education moderates genetic influences on reading disability. Psychological Science. 2008;19:1124–1130. doi:10.1111/j.1467-9280.2008.02213.x.
  27. Gathercole SE, Willis CS, Baddeley AD, Emslie H. The children's test of nonword repetition: A test of phonological working memory. Memory. 1994;2:103–127. doi:10.1080/09658219408258940.
  28. Georgiou GK, Parrila R, Papadopoulos TC. Predictors of word decoding and reading fluency across languages varying in orthographic consistency. Journal of Educational Psychology. 2008;100:566–580. doi:10.1037/0022-0663.100.3.566.
  29. Harlaar N, Hayiou-Thomas ME, Dale PS, Plomin R. Why do preschool language abilities correlate with later reading? A twin study. Journal of Speech, Language, and Hearing Research. 2008;51:688–705. doi:10.1044/1092-4388(2008/049).
  30. Hart B, Risley TR. Meaningful Differences in the Everyday Experiences of Young American Children. Brookes; Baltimore, MD: 1995.
  31. Hart SA, Soden B, Johnson W, Schatschneider C, Taylor J. Expanding the environment: Gene x school-level SES interaction on reading comprehension. Journal of Child Psychology and Psychiatry. 2013;54:1047–1055. doi:10.1111/jcpp.12083.
  32. Hecht SA, Burgess SR, Torgesen JK, Wagner RK, Rashotte CA. Explaining social class differences in growth of reading skills from beginning kindergarten through fourth-grade: The role of phonological awareness, rate of access, and print knowledge. Reading and Writing. 2000;12:99–128.
  33. Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genetics. 2008;4:1–10. doi:10.1371/journal.pgen.1000008.
  34. Jastak S, Wilkinson GS. The Wide Range Achievement Test–Revised: Administration manual. Jastak Associates, Inc.; Wilmington, DE: 1984.
  35. Juel C. Learning to read and write: A longitudinal study of 54 children from first through fourth grades. Journal of Educational Psychology. 1988;80:437–447. doi:10.1037/0022-0663.80.4.437.
  36. Keenan JM, Betjemann RS, Olson RK. Reading comprehension tests vary in the skills they assess: Differential dependence on decoding and oral comprehension. Scientific Studies of Reading. 2008;12:281–300. doi:10.1080/10888430802132279.
  37. Keenan J, Olson R, Byrne B, Samuelsson S, Hulslander J, Christopher M. Preschool predictors of grade 4 reading comprehension, listening comprehension, and decoding. Paper presented at the 2011 Society for the Scientific Study of Reading conference; St. Pete Beach, Florida. 2011, July.
  38. Keller MC, Coventry WL. Quantifying and addressing parameter indeterminacy in the classical twin design. Twin Research and Human Genetics. 2005;8:201–213. doi:10.1375/1832427054253068; doi:10.1375/twin.8.3.201.
  39. Lervåg A, Bråten I, Hulme C. The cognitive and linguistic foundations of early reading development: A Norwegian latent variable longitudinal study. Developmental Psychology. 2009;45:764–781. doi:10.1037/a0014132.
  40. Levy BA, Gong Z, Hessels S, Evans MA, Jared D. Understanding print: Early reading development and the contribution of home literacy experiences. Journal of Experimental Child Psychology. 2006;93:63–93. doi:10.1016/j.jecp.2005.07.003.
  41. Lonigan CJ, Burgess SR, Anthony JL. Development of emergent literacy and early reading skills in preschool children: Evidence from a latent-variable longitudinal study. Developmental Psychology. 2000;36:596–613. doi:10.1037/0012-1649.36.5.596.
  42. MacGinitie WH, MacGinitie RK, Maria K, Dreyer LG. Gates-MacGinitie Reading Tests. Fourth Edition. Riverside Publishing; Itasca, IL: 2000.
  43. Muter V, Snowling M. Concurrent and longitudinal predictors of reading: The role of metalinguistic and short-term memory skills. Reading Research Quarterly. 1998;33:320–337. doi:10.1598/RRQ.33.3.4.
  44. Neale MC, Cardon LR. Methodology for Genetic studies of Twins and Families. Kluwer; Dordrecht, The Netherlands: 1992.
  45. Nichols RC, Bilbro WC., Jr The diagnosis of twin zygosity. Acta Genetica et Statistica Medica. 1966;16:265–275. doi:10.1159/000151973.
  46. Olson RK, Keenan JM, Byrne B, Samuelsson S, Coventry WL, Corley R, Hulslander J. Genetic and environmental influences on vocabulary and reading development. Scientific Studies of Reading. 2011;15:26–46. doi:10.1080/10888438.2011.536128.
  47. Parrila R, Kirby JR, McQuarrie L. Articulation rate, naming speed, verbal short-term memory, and phonological awareness: Longitudinal predictors of early reading development? Scientific Studies of Reading. 2004;8:3–26. doi:10.1207/s1532799xssr0801_2.
  48. Perfetti CA, Beck I, Bell LC, Hughes C. Phonemic knowledge and learning to read are reciprocal: A longitudinal study of first grade children. Merrill-Palmer Quarterly. 1987;33:283–319.
  49. Petrill SA, Deater-Deckard K, Thompson LA, DeThorne LS, Schatschneider C. Reading skills in early readers: Genetic and shared environmental influences. Journal of Learning Disabilities. 2006;39:48–55. doi:10.1177/00222194060390010501.
  50. Plomin R, DeFries JC. Multivariate behavioral genetic analysis of twin data on scholastic abilities. Behavior Genetics. 1979;9:505–517. doi:10.1007/BF01067347.
  51. Plomin R, DeFries JC, Knopik VS, Neiderhiser JM. Behavioral genetics. 6th ed. Worth Publishers; New York, NY: 2013.
  52. Samuelsson S, Byrne B, Quain P, Wadsworth S, Corley R, DeFries JC, Olson R. Environmental and genetic influences on prereading skills in Australia, Scandinavia, and the United States. Journal of Educational Psychology. 2005;97:705–722. doi:10.1037/0022-0663.97.4.705.
  53. Scarborough HS. Prediction of reading disability from familial and individual differences. Journal of Educational Psychology. 1989;81:101–108. doi:10.1037//0022-0663.81.1.101.
  54. Scarborough HS. Predicting the future achievement of second graders with reading disabilities: Contributions of phonemic awareness, verbal memory, rapid naming, and IQ. Annals of Dyslexia. 1998;48:115–136. doi:10.1007/s11881-998-0006-5.
  55. Schatschneider C, Fletcher JM, Francis DJ, Carlson CD, Foorman BR. Kindergarten prediction of reading skills: A longitudinal comparative analysis. Journal of Educational Psychology. 2004;96:265–282. doi:10.1037/0022-0663.96.2.265.
  56. Sénéchal M, LeFevre JA. Parental involvement in the development of children's reading skill: A five-year longitudinal study. Child Development. 2002;73:445–460. doi:10.1111/1467-8624.00417.
  57. Shankweiler DP, Liberman IY, Mark LS, Fowler CA, Fischer FW. The speech code and learning to read. Journal of Experimental Psychology: Human Learning and Memory. 1979;5:531–545. doi:10.1037/0278-7393.5.6.531.
  58. Stuart M, Coltheart M. Does reading develop in a sequence of stages? Cognition. 1988;30:139–181. doi:10.1016/0010-0277(88)90038-8.
  59. Sunseth K, Bowers PG. Rapid naming and phonemic awareness: Contributions to reading, spelling, and orthographic knowledge. Scientific Studies of Reading. 2002;6:401–429. doi:10.1207/S1532799XSSR0604_05.
  60. Torgesen JK, Wagner RK, Rashotte CA. Test of Word Reading Efficiency (TOWRE). Pro-Ed; Austin, TX: 1999.
  61. Trouton A, Spinath FM, Plomin R. Twins early development study (TEDS): A multivariate, longitudinal genetic investigation of language, cognition and behavior problems in childhood. Twin Research and Human Genetics. 2002;5:444–448. doi:10.1375/136905202320906255.
  62. Wadsworth SJ, Corley RP, Plomin R, Hewitt JK, DeFries JC. Genetic and environmental influences on continuity and change in reading achievement in the Colorado Adoption Project. In: Huston A, Ripke M, editors. Developmental contexts of middle childhood: Bridges to adolescence and adulthood. Cambridge University Press; New York: 2006. pp. 87–106.
  63. Wagner RK, Torgesen JK, Rashotte CA. The Comprehensive Test of Phonological Processing (CTOPP). PRO-ED; Austin, TX: 1999.
  64. Wechsler D. Manual for the Wechsler Preschool and Primary Scale of Intelligence—Revised. Psychological Corporation; New York: 1989.
  65. Whitehurst GJ, Lonigan CJ. Child development and emergent literacy. Child Development. 1998;69:848–872. doi:10.2307/1132208.
  66. Woodcock RW. Woodcock Reading Mastery Test–Revised. American Guidance Services; Circle Pines, MN: 1987.
