Abstract
Across multiple schools in three sites, the impact of grade-at-intervention was evaluated for children at risk or meeting criteria for reading disabilities. A multiple-component reading intervention with demonstrated efficacy was offered to small groups of children in 1st, 2nd, or 3rd grade. In a quasi-experimental design, 172 children received the Triple-Focus Program (PHAST + RAVE-O), and 47 were control participants. Change during intervention and 1–3 years later (6–8 testing points), and the influence of individual differences in predicting outcomes, were assessed using reading and reading-related repeated measures. Intervention children out-performed control children at posttest on all 14 outcomes, with average effect sizes (Cohen’s d) on standardized measures of .80 and on experimental measures of 1.69. On foundational word reading skills (standardized measures), children who received intervention earlier, in 1st and 2nd grade, made gains relative to controls almost twice that of children receiving intervention in 3rd grade. At follow-up, the advantage of 1st grade intervention was even clearer: First graders continued to grow at faster rates over the follow-up years than 2nd graders on six of eight key reading outcomes. For some outcomes with metalinguistic demands beyond the phonological, however, a posttest advantage was revealed for 2nd grade Triple participants and for 3rd grade Triple participants relative to controls. Estimated IQ predicted growth during intervention on seven of eight outcomes. Growth during follow-up was predicted by vocabulary and visual sequential memory. These findings provide evidence on the importance of early intensive evidence-based intervention for reading problems in the primary grades.
Keywords: reading, reading disabilities, early intervention, outcomes, follow-up
Educational Impact And Implications Statement
Does it matter in what grade early reading intervention is provided for young children who are struggling with learning to read—in 1st, or 2nd, or 3rd grade? Children with reading disabilities (RD), or at risk of RD, were taught in small groups an hour a day for 125 hours using a reading intervention developed and found effective in our earlier research; the children received this program either in 1st or 2nd or 3rd grade. All children improved their reading after receiving this program when compared to other children with RD who received whatever their schools offered. Children who received the program in 1st or 2nd grade made greater gains in basic reading skills than those who received it in 3rd grade; and those who received it in 1st grade continued to develop reading at faster rates well after the program ended. These findings provide evidence for the importance of early intensive reading intervention for struggling readers, and support intervention starting in 1st grade.
Several landmark studies reported in the late 1990s compared different approaches to the remediation and/or prevention of reading acquisition problems in the early elementary grades (Foorman, Francis, Fletcher, Schatschneider, & Mehta, 1998; Foorman et al., 1997; Scanlon & Vellutino, 1997; Torgesen, Wagner, & Rashotte, 1997; Torgesen et al., 1999; Vellutino et al., 1996). Research by Foorman and her colleagues (1998) provided important evidence that explicit classroom instruction in letter-sound correspondences can prevent reading failure in 1st and 2nd grade children at risk for reading problems. Another classroom-based intervention, Peer-Assisted Learning Strategies (PALS), developed by Doug and Lynn Fuchs, yielded positive results on measures of word recognition and text reading skill, and has been found to improve reading skills for both struggling and average readers (Fuchs & Fuchs, 2005; Mathes, Howard, Allen, & Fuchs, 1998). These results demonstrate that opportunities exist to provide targeted and differentiated instruction within the classroom setting to reduce the prevalence of reading problems (Fletcher, et al., 2007; Fuchs, Fuchs, Mathes, & Simmons, 1997).
Torgesen, Wagner, Rashotte et al. (1999) also reported seminal early intervention research, but their work involved one-on-one remedial intervention outside of the classroom over a period of 2½ years. Their participants were at-risk children, those lowest in letter naming and phonological awareness, entering kindergarten. At the end of the 2½-year intervention, children who had received explicit phonological awareness and synthetic phonics training were the strongest readers when average group scores were assessed. Both their nonword reading (Word Attack) and word reading (Word Identification) skills fell overall within the average range. Their advantage relative to other groups was not consistently established, however, on all dimensions of reading skill at the end of second grade, suggesting that early intervention efforts may not yield equivalent impact on different reading-related processes (Torgesen et al., 2001; Torgesen, Wagner, Rashotte, Alexander, & Conway, 1997; Torgesen, et al., 1999).
Overall, these landmark studies and those that have followed have provided strong converging evidence for the efficacy and cost-effectiveness of early intervention efforts (Al Otaiba, 2000; Berninger et al., 2000, 2002; Connor, Morrison, Fishman, et al., 2007; Mathes et al., 2005; O’Connor, 2000; O’Connor, Fulmer, Harty, & Bell, 2005; S. Vaughn, Linan-Thompson, & Hickman, 2003). Foorman and Al Otaiba (2009) contend that better classroom instruction can reduce the number of low-achieving children to around 5%, and further supplemental small group or individual tutoring can bring the numbers down even lower to 1%–3%. Evaluations of multi-tiered intervention models have suggested rates of inadequate response could be as low as 2%–5% with effective and well-timed early reading intervention (Berninger et al., 2003; Mathes et al., 2005; McMaster, Fuchs, Fuchs, & Compton, 2005; Torgesen, 2000).
Such efforts require coordinated infrastructure and investment at the school level, including universal early screening for academic risk, access to effective early intervention within the school, teacher preparation and confidence in such procedures, regular progress monitoring for all children, and access to booster interventions when needed (Fletcher & Vaughn, 2009). The growing literature on response to intervention (RTI) provides guidance on how this infrastructure can be put in place and its benefits, while also recognizing the challenges still to be addressed (Denton, Fletcher, Anthony, & Francis, 2006; Fletcher & Vaughn, 2009; Fuchs & Fuchs, 1998; Glover & Vaughn, 2010; Vaughn & Fuchs, 2003; Vaughn et al., 2011).
How important is the timing of early intervention?
Despite evidence of the effectiveness of early intervention for children at risk for reading failure, relatively few empirical studies have been conducted to compare the relative efficacy of reading intervention initiated at different ages. The majority of early intervention research has been conducted with at-risk children in kindergarten or first grade and has reported positive outcomes. A meta-analysis by Wanzek and Vaughn (2007) summarized evidence from early intervention studies offering at least 100 sessions. There were diverse outcomes reported from this work, although the majority described effect sizes in the moderate-to-large range. In this review, effect sizes were found to be larger for intervention studies conducted with Kindergarten and first graders (average e.s. ranging from .31 to .84) than with children in 2nd and 3rd grades (average e.s. .23 – .27). Scammacca, Vaughn, Roberts et al. (2007) in their report suggest that gains from early interventions of longer duration tend to be maintained at least until the 2nd grade. Vadasy and colleagues provided two years of intervention in 1st and 2nd grade, and reported average effect sizes of .64 on reading outcomes, suggesting that longer duration of intervention was also relevant to benefits of early intervention (Vadasy, Sanders, Peyton et al., 2002). It is noted that the meta-analysis cannot address causal evidence examining the effects of duration, intensity, or timing of intervention and that the studies assessed varied greatly in the extent to which interventions were operationalized.
Evidence from classroom intervention studies
The only experimentally controlled study to date of the timing of reading intervention was reported by Connor and her colleagues (Connor, Morrison, Fishman, Crowe, Al Otaiba, & Schatschneider, 2013). These researchers asked whether the timing and duration of individualized reading instruction within the classroom would make a difference to children’s reading achievement by the end of 3rd grade. The reading intervention included teacher professional development and instruction individualized through computer software to match the student’s performance on word reading, vocabulary, and comprehension assessments; this intervention has been found to yield positive effects in smaller efficacy trials conducted within single grade levels (A2i – Connor et al., 2007; Connor, Morrison, Schatschneider et al., 2011; Connor, Morrison, Fishman et al., 2011). The intervention itself varies the amount of instructional time allocated to different instructional components and types of reading activities. While not a supplemental reading intervention, Individualizing Student Instruction (ISI) individualizes instruction by differentially weighting instructional time and components within the classroom.
In Connor’s 2013 study, the influence of ISI was evaluated longitudinally over three grades in a cluster-randomized controlled design; classrooms were randomly assigned to ISI treatment or control conditions, and teachers in the control condition received an equivalent amount of professional development and attention in a mathematics intervention condition. Randomization of classrooms to condition occurred every year, so there were children who received one, two, or three years of ISI, with ISI beginning in their 1st, 2nd, or 3rd grade year. Both the timing and duration of the ISI intervention therefore could be evaluated. Results revealed an overall advantage for those children who received ISI reading instruction throughout Grades 1–3; these children on average were performing above grade level by the end of 3rd grade, demonstrating the advantage of accumulated benefit (e.s. = .90 relative to three years control placement). There was an advantage for 1st grade intervention: Those who received only one year of ISI in 1st grade outperformed those whose year of ISI occurred in 2nd or 3rd grade. Connor and colleagues note, however, that the first grade advantage was “inconsistent” and was not replicated for children who received two years of ISI. For these children, there was greater benefit to receiving ISI in Grades 1 and 3 rather than Grades 1 and 2 or Grades 2 and 3.
Rather than a supplementary or pull-out remedial intervention, Individualizing Student Instruction (ISI) is a weighting of instructional time and components within the general classroom (Connor et al., 2007; Connor et al., 2011; Connor et al., 2013). Many early intervention studies are classroom studies and typically recruit entire classes as samples. The emphasis is one of preventing reading failure through early intense instruction for at-risk children. These studies do not focus on samples of struggling learners and may be expected to produce different patterns of findings than those that recruit samples performing substantially below age level expectations.
Evidence from supplemental intervention studies
In contrast, other early intervention studies have involved well-defined supplemental programs for at-risk learners and have operationalized these interventions. It should be noted that almost all of this research has avoided evaluation of school-based special education programs. In fact, as Fletcher and Vaughn (2009) point out, outcome data from evaluations of at-risk learners in special education placements have not been encouraging—many reports have documented limited growth and poor outcomes, suggesting typical interventions in many special educational settings to be generally ineffective in terms of accelerating academic growth (Hanushek, Kain, & Rivkin, 1998; Morgan, Frisco, Farkas, & Hibel, 2010; Vaughn, Levy, Coleman, & Bos, 2002). As Fletcher and Vaughn suggest, “There is a major disconnection between what is known about efficacy of instruction for students with academic difficulties and how students are taught in schools, especially for students most at risk for academic and behavioral difficulties” (2009, p. 33).
A recent report using a national US dataset (the Early Childhood Longitudinal Study-Kindergarten Cohort) provides quite a different perspective on the potential effectiveness of special education placement. Ehrhardt, Huntington, Molino, and Barbaresi (2013) were interested in determining whether grade at entry to special education was related to reading growth between 1st and 5th grades in a sample of children with reading problems. Lacking standard measurement for reading disorders, these investigators selected children from the cohort who had an IEP targeting reading and for whom the special education teacher listed specific learning disability as the primary category of disability. Early entry to special education proved significantly associated with reading achievement: Children entering before or during 1st grade demonstrated superior reading achievement gains to those who entered in 2nd or in 3rd grade. Of interest, however, these early-entry children were not significantly different from those who entered special education in 4th or 5th grade; it is acknowledged that the latter students may have had less severe reading impairment than those with earlier identification (Leach, Scarborough, & Rescorla, 2003).
Another meta-analysis was undertaken with a developmental perspective, focusing on the interaction of grade and intervention modality to assess moderators of intervention efficacy (effect sizes) for at-risk and struggling readers. Suggate (2010) was interested in whether intervention effect sizes varied with grade at intervention (preschool through Grade 7) and type of intervention offered (phonics, comprehension, or mixed focus). Overall he reported that reading intervention was associated with clear improvement in reading outcomes following intervention both in the immediate short-term (d = 0.49) and over the longer-term (d = 0.36). Overall effect sizes were found to be greater for older children (Gr 5–7: d = 0.68) than for children in preschool and kindergarten (d = 0.36), Grades 1 and 2 (ds = 0.52 and 0.54), and Grades 3 and 4 (d = 0.59). Mixed and comprehension-focused interventions were associated with greater effect sizes for older children, and phonics interventions for children in kindergarten and 1st grade.
Suggate’s finding of greater effects with older readers stands in contrast to that of most previous studies and should be considered in light of how effect sizes are calculated in his report. Mean effect sizes were calculated for each outcome measure from the original 85 studies included in the meta-analysis: intervention group performance minus control group performance divided by the pooled standard deviation of the two groups (Cohen’s d; Hunter & Schmidt, 2004). Lower effect sizes for children in the younger grades may indicate not that they did not gain with intervention but that control participants in those grades also made reading gains without the intervention. Such an interpretation is possible given that Suggate (2010) describes significant negative correlations between grade and control group standard scores, illustrating that older children were more impaired relative to norms.
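The dependence of Cohen's d on control group performance can be illustrated with a short sketch. It assumes the common pooling rule in which each group's sample variance is weighted by its degrees of freedom; the text above specifies only a pooled standard deviation, so that choice, the function name, and all of the scores below are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(intervention, control):
    """Standardized mean difference using a pooled standard deviation."""
    n_i, n_c = len(intervention), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom
    # (an assumed, common pooling rule; the source does not spell one out).
    pooled_var = ((n_i - 1) * variance(intervention)
                  + (n_c - 1) * variance(control)) / (n_i + n_c - 2)
    return (mean(intervention) - mean(control)) / sqrt(pooled_var)


# Hypothetical posttest standard scores, invented purely for illustration.
intervention = [95, 100, 105]
control_flat = [85, 90, 95]      # controls who made little progress
control_gaining = [90, 95, 100]  # controls who also improved

d_flat = cohens_d(intervention, control_flat)
d_gaining = cohens_d(intervention, control_gaining)
print(d_flat, d_gaining)
```

With identical intervention scores in both comparisons, the effect size is halved when the control group mean rises by five points, which is the pattern that could depress effect sizes in the younger grades even when intervention participants gain substantially.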
Measurement issues complicate the interpretation of intervention effects for different reading outcomes
Evaluating timing-of-intervention effects is made much more complex by the interaction between age at intervention and the type of reading outcome being evaluated, as well as differences in the ability to reliably measure different dimensions of reading skill at different ages. There is ample evidence of reading interventions having demonstrated efficacy on some, but not all, dimensions of reading skill. Meta-analyses such as those conducted by Scammacca, Edmonds, and their colleagues have revealed marked variability in reading comprehension effects across studies, and have shown that, at least with older students, average gains in reading comprehension with intervention were typically smaller than those seen on basic reading skills (Edmonds, Vaughn, Wexler, et al., 2009; Scammacca, Roberts, Vaughn, et al., 2007). Connor’s data on the timing of her ISI intervention with younger readers (Grades 1, 2, 3) also demonstrated some variability in effect size estimates for word identification vs. comprehension outcomes in 3rd grade (Connor et al., 2013). For 1st and 2nd graders, effect sizes for ISI instruction were roughly equivalent for word identification and passage comprehension outcomes (1st grade Cohen’s d = .32 and .36; 2nd grade d = .44 and .43 respectively), while for 3rd graders, ISI yielded effect sizes of .26 on word identification, but only .06 on passage comprehension. Similarly, the Reading First Impact Report noted that while the Reading First initiative had no effect on the reading comprehension scores of students in Grades 1, 2, or 3, small positive effects on decoding skills were observed for the subsample of 1st graders studied (Gamse, Jacob, Horst, Boulay, & Unlu, 2008). Another higher-level reading outcome, reading rate or fluency, has also shown variable results following early intense reading intervention. When Torgesen collected follow-up data on his intervention participants at 8 and 10 years of age, he found that they exhibited substantial deficits in reading rate despite otherwise positive reading outcomes (Torgesen, Alexander, Wagner, Rashotte et al., 2001).
Part of the problem concerns the continuing difficulty in developing appropriate measurement for more complex dimensions of reading skill like text comprehension and reading fluency. Outcome measures that have been used across studies have varied enormously in their power and sensitivity to intervention-related change. Many investigators have acknowledged that reading interventions typically yield larger effects on researcher-developed than standardized measures (Edmonds, et al., 2009; Lovett, Barron, & Frijters, 2013; Swanson, Hoskyn, & Lee, 1999). Experimental measures with more trials per level of difficulty result in more visible gains and better opportunities to demonstrate intervention-related change over the short-term. Many questions remain regarding the best measurement models for evaluating an intervention that targets functions as complex as reading comprehension and fluency.
Individual variability in response to early intervention
Another contributor to variability in response to early intervention may stem from individual differences among the children receiving reading intervention. There have been some efforts to determine whether children with different cognitive and academic profiles respond differently to early intervention and some evidence to support that suggestion (Connor, Morrison, Fishman, Schatschneider, & Underwood, 2007; Frijters, Lovett, Steinbach, Wolf, Sevcik, & Morris, 2011; Foorman, et al., 1998; Nelson, Benner, & Gonzalez, 2003). Al Otaiba and Fuchs (2002) reviewed 21 studies examining nonresponders to reading intervention; they found that although seven child characteristics had been related to nonresponse, phonological awareness was the most consistently correlated across studies. These investigators subsequently conducted a longitudinal study with kindergarteners and 1st graders and found that a combination of naming speed, vocabulary, sentence imitation, problem behavior, and amount of reading intervention correctly predicted 82% of nonresponsive students and 84% of always responsive students (Al Otaiba & Fuchs, 2006). Inconsistent responders were predicted far less reliably (30%), focusing attention on the issue of how treatment response is actually defined and operationalized, an issue Frijters and our group recently assessed with older disabled readers (Frijters, Lovett, Sevcik, & Morris, 2013).
Multiple component interventions in the remediation of reading disabilities
In addition to measurement issues, many intervention reports over the past two decades have suggested that progress in remediating phonological decoding deficits has not been matched by gains in fluency and reading comprehension. Compton and his colleagues (Compton, Miller, Elleman, & Steacy, 2014) noted recently that successfully learning how to decode a new word does not ensure that that word will come to be integrated into what has been called “a rich orthographic reading vocabulary” (Torgesen, Wagner, & Rashotte, 1997b). Torgesen suggested that the problem reflects the complexity of the processing impairments seen in more severely disabled readers (Torgesen, et al., 1997b). Such children require intervention that includes systematic and explicit phonological decoding instruction, but also offers focused remedial components to address other areas of deficit.
Researchers have demonstrated that many children with RD experience particular difficulties with strategy learning and the acquisition of self-regulatory strategies, and that these problems appear to exist independent of their phonological difficulties (Swanson & Sáez, 2003; Swanson, Sáez, & Gerber, 2006; Swanson & Siegel, 2001). The strategy deficits of children at risk for reading acquisition failure extend beyond the word identification and word attack foundations of literacy and encompass all aspects of reading for meaning, expository text comprehension, and written expression. There is evidence that low-achieving readers can make gains in reading comprehension with systematic instruction and practice on specific reading comprehension strategies (Mason, 2004; Vaughn et al., 2000). It is reasonable to hypothesize that explicit strategy training and metacognitive instruction could be used to address and prevent generalization failures, and provide an important component of effective remediation for RD in the acquisition of both decoding and reading comprehension skills.
Lovett, Lacerenza, Borden, et al. (2000) reported evidence supporting this speculation: When a phonological reading intervention (PHAB/DI for Phonological Analysis and Blending) was combined with the teaching of specific word identification strategies, and these strategies were implemented, practiced, and evaluated using self-directing dialogue (WIST for Word Identification Strategy Training), severely disabled readers demonstrated superior reading achievement and faster learning than when they received an equal amount of intervention in phonological or strategy training conditions separately. The combined intervention conditions were associated with the greatest generalization of gains for these children with severe RD. These results provided evidence of the importance of strategy instruction to effective remediation and led to integration of these two interventions into the PHAST (Phonological and Strategy Training) Reading Program (Lovett, Lacerenza, & Borden, 2000). In earlier work, the authors had demonstrated in a controlled evaluation the efficacy of both the PHAB and WIST Programs relative to a control program, and some program-specific effects (Lovett et al., 1994).
The need for a multidimensional perspective on the core processing deficits of children with RD is echoed in the work of Wolf and her colleagues, who identify naming speed deficits as a window on the failure of struggling readers to build integrated, rapid, and automatic connections among the component processes necessary to fluent reading acquisition (Wolf, 2007; Wolf & Bowers, 1999; Wolf & Katzir-Cohen, 2001). Citing the support of research on the importance of high quality orthographic, semantic, morpho-syntactic, and phonological lexical representations, and of their interconnections, Wolf and colleagues developed a reading intervention designed to strengthen lexical representational systems and teach explicitly the connections among representations. Called RAVE-O (for Retrieval, Automaticity, Vocabulary, Engagement with Language, and Orthography), the program seeks to teach young readers to enrich and connect all their knowledge about a word as quickly as possible. The idea is to simulate what typically developing brain circuitry does during the early stages of reading development (Wolf et al., 2009). RAVE-O is designed to accompany a systematic program of phonologically-based decoding instruction, and is directed to development of an appreciation of the richness of oral and printed language, and an enjoyment of words and reading for meaning.
Both the PHAST and the RAVE-O Reading Programs have been evaluated in a previous multi-site intervention study conducted by the present authors (Morris et al., 2012). This previous study included 279 2nd and 3rd grade children meeting low achievement or IQ-reading discrepancy definitions of RD (the majority meeting both criteria), and with diverse demographic profiles (IQ, SES, race). Children were randomly assigned to program within a 2×2×2 factorial design defined by the demographic variables of IQ (70–89; 90+), SES (low; average), and race (Black; Caucasian). The effectiveness of two multiple-component intervention programs for children with RD (PHAB/DI + RAVE-O [Wolf et al., 2000] and the PHAST Reading Program [Lovett et al., 2000]) was evaluated against both an alternative treatment control program (Classroom Survival Skills (CSS) + Math), and a phonological treatment program paired with CSS (PHAB/DI + CSS). Interventions were taught an hour daily for 70 days on a 1:4 ratio at 3 different sites (Atlanta, Boston, Toronto).
Results indicated that both the PHAST and the RAVE-O (+ PHAB/DI) Programs were associated with significant improvement on basic reading skills relative to the alternative control group and the phonological treatment group at the end of the program and at one-year follow-up testing (Morris et al., 2012). Equivalent gains were observed for children of different racial, SES, and IQ groups; these factors did not systematically interact with treatment program and did not differentially predict outcomes at either posttest or at one-year follow-up. Both multiple-component programs were confirmed to be effective vehicles of intervention for struggling readers from a wide range of backgrounds and with differing levels of intellectual functioning. Differential treatment outcome effects were found between the multi-dimensional programs at post-testing based on the respective emphases of the programs.
In the present study, the PHAST and RAVE-O Programs were integrated to produce what is called the Triple-Focus Program, designed to capitalize upon the positive effects associated with both multiple component programs. (The PHAST Program is considered a ‘double program’ because it integrates the PHAB and WIST Programs.) The Triple-Focus Program provides tailored and intensive remediation that combines explicit phonological instruction with word identification strategy training, reading comprehension strategy training, and instructional activities that foster enriched lexical representations and increased engagement with word play, reading, and text comprehension.
Questions motivating the present study
The present study was undertaken to evaluate issues related to the timing of reading intervention for children meeting criteria for reading disabilities at the end of 1st or 2nd grade, or meeting risk criteria at the end of kindergarten. A full year of small group intervention using the Triple-Focus Program was provided for a total of 100–125 instructional hours in Grades 1, 2, or 3. The questions addressed in the present study included:
Did grade at intervention influence treatment outcomes and rate of growth in the short-term and/or over follow-up?
Did grade at intervention influence rate of normalization of reading scores following intervention?
Were there individual differences in cognitive and reading-related profiles that influenced response to intervention in the short- and long-term?
Method
Study Design
The present design evaluated the impact of developmental timing of reading intervention (1st, 2nd, or 3rd grade), longitudinal change in a repeated measurement design (testing at 0, 35, 70, 105, 125 hours of instruction, and at 1–3 years follow-up), and the role of individual differences on short- and long-term reading outcomes. The experimental reading intervention used here integrates two research-based remedial reading programs with demonstrated efficacy (PHAST + RAVE-O) into a comprehensive Triple-Focus intervention. The Triple-Focus intervention employed the same format as our previously reported interventions PHAST and RAVE-O (Morris et al., 2012), was taught by trained research teachers hired for the project, and was independently monitored for treatment integrity. The program was a pull-out intervention taught on a 1:4 ratio for an hour a day in the child’s home school. Reading outcomes for the Triple-Focus participants were compared to those of curricular or business-as-usual controls in the same grades and with the same degree of reading and reading-related impairment. Because participants were not consistently assigned randomly to intervention or control condition, this is considered a quasi-experimental design.
Participants.
Participants were recruited from multiple schools in three large metropolitan areas (Atlanta, Boston, and Toronto) on the basis of teacher referral for significant underachievement in reading. General inclusion criteria consisted of: English as their first and primary language, enrolment in 1st, 2nd, or 3rd grade at time of teacher referral, and normal or corrected hearing and vision. A total of 416 children with reading problems were referred for screening across the three study sites to see if they would qualify for participation.
All participants were required to meet specified exclusion and inclusion criteria. Children who had histories of hearing impairment (>25 dB at 500+ Hz bilaterally), of uncorrected visual impairment (>20/40), serious emotional/psychiatric disturbance (i.e., psychotic, pervasive developmental disorder), or chronic medical/neurological conditions (i.e., uncontrolled seizure disorder, congenital heart disease, acquired brain injuries) were excluded based on a brief demographic and history form completed by their parents. In addition, children were excluded if they had repeated a grade or received a K-BIT composite score below 70. The repetition of a grade was an exclusionary criterion because we sought to recruit grade-level groupings of children of the same age; in practice, grade retention was very rarely seen in participating schools. The co-occurrence of ADHD, a disorder common in RD populations, did not exclude a child from participation.
Participants were further selected from this pool based on their performance on a screening battery that included the Kaufman Brief Intelligence Test (K-BIT; Vocabulary and Matrices; Kaufman & Kaufman, 1990), Woodcock Reading Mastery Test-Revised (WRMT-R; Woodcock, 1987), and the Wide Range Achievement Test-3rd Edition–Reading (WRAT-3; Wilkinson, 1993). Subtests from the WRMT-R included Word Identification, Word Attack, and Passage Comprehension; for those children screened at the end of Kindergarten or beginning of Grade 1, WRMT-R Visual-Auditory Learning and Letter Identification were also administered although almost no children qualified solely on the Readiness Cluster score to which these subtests contribute.
All children selected for participation qualified based on meeting a low-achievement criterion for reading disabilities; this criterion required reading performance at or below a composite standard score of 85 on multiple standardized reading measures. Reading performance was measured using one or more of the following indices: (1) a Reading Total score calculated by averaging the standard scores on the WRMT-R Passage Comprehension, Word Identification, Word Attack, and WRAT-3 Reading subtests; (2) the WRMT-R Basic Skills Cluster score; and/or (3) the WRMT-R Total Reading Cluster score (Short Scale). The Basic Skills Cluster score is the composite of Word Identification and Word Attack; the Total Reading Cluster (Short Scale) is the composite of Word Identification and Passage Comprehension. Children qualified by demonstrating low achievement on at least one of these three reading indices; that is, at least one of their reading composite standard scores was 85 or below (at or below the 16th percentile). Of the participants who qualified for inclusion, 63% met all three criteria, 18% met two, and 19% met one.
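The qualification rule described above can be sketched in a few lines of Python. This is an illustration only, not the study's screening procedure: the function names and the example scores are hypothetical, and the two WRMT-R cluster scores are assumed to be supplied from the test's norm tables rather than computed here.

```python
from statistics import mean

CUTOFF = 85  # a standard score at or below this value counts as low achievement


def reading_total(wrmt_pc, wrmt_wid, wrmt_wa, wrat_reading):
    """Average the four subtest standard scores that form the Reading Total index."""
    return mean([wrmt_pc, wrmt_wid, wrmt_wa, wrat_reading])


def indices_met(reading_total_score, basic_skills_cluster, total_reading_short):
    """Count how many of the three reading indices fall at or below the cutoff."""
    scores = (reading_total_score, basic_skills_cluster, total_reading_short)
    return sum(score <= CUTOFF for score in scores)


def qualifies(reading_total_score, basic_skills_cluster, total_reading_short):
    """A child qualifies when at least one of the three indices is at or below the cutoff."""
    return indices_met(reading_total_score, basic_skills_cluster, total_reading_short) >= 1


# Hypothetical child: the cluster scores (81 and 87) are made-up norm-table values.
total = reading_total(wrmt_pc=88, wrmt_wid=84, wrmt_wa=80, wrat_reading=90)
print(total, indices_met(total, 81, 87), qualifies(total, 81, 87))
```

In this hypothetical case the Reading Total (85.5) falls just above the cutoff, but the child still qualifies because one index, the Basic Skills Cluster, is below 85.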
Children of any race or ethnic group, or either sex, were included as long as they met the English as the primary language requirement and the low achievement criterion for RD. We sought to include diverse samples of children, with the goal of including large numbers of minority children, girls, and children from low SES families. Given that our studies were located within public schools in three major cities, obtaining this level of minority children involvement was not difficult, although obtaining samples with 50% girls proved to be more difficult (Morris et al., 2012).
SES was assessed by parental occupation and educational status using an index of the families’ SES. SES data from all sites were derived using two American SES scales (Entwisle & Astone, 1994; Hollingshead, 1975; Nakao & Treas, 1992) and one Canadian SES scale (Blishen, Caroll, & Moore, 1987). Our goal was to develop an index for systematically identifying the children’s families as average or above SES, or below average SES. The particular differences or actual levels of SES provided by the scales were not as critical as an accurate ranking of children. A systematic evaluation of the reliability and concordance of these different scales was undertaken and results were used to classify the children into the Average or Low SES groups based on a systematic combination of the different indices. Details of this work on SES measurement have been published (Cirino et al., 2002).
Of the 237 participants who met all criteria and were selected, 172 children participated in instructional groups in the Triple-Focus Program (79 Grade 1, 43 Grade 2, 51 Grade 3) and 47 served as control participants (18 Grade 1, 13 Grade 2, 16 Grade 3). Attrition was fairly low in the study given the length of the participation period (31 out of 237 enrolled cases, or a rate of 13%). A total of 17 children were lost to attrition between enrollment and the start of intervention, and an additional 14 were lost between pretest and posttest. Attrition generally was due to families relocating, switching schools, or having difficulties transporting children to the classes. Random assignment of children to intervention condition was not possible during the first and last years of data collection, and the design therefore should be considered quasi-experimental. Children meeting criteria within a school were grouped together on the basis of grade and raw reading scores (WRMT-R Word Identification and Word Attack); the group was then proposed as an instructional group to the main site in Atlanta. If accepted, the instructional group was assigned to the Triple-Focus intervention in the present study. The control group included participants who met all criteria for inclusion but failed to match into an instructional group: Any participant meeting criteria who did not match into an instructional group, or who was referred and screened after classes had started, or was from a school where other participants were not available to form an intervention class, was assigned to the control condition. Despite efforts to enroll more control participants, control numbers remained far lower than projected due to the difficulty in enrolling children with reading disabilities and having them wait a full year before they could access our intervention program.
In the final year of data collection, there was a bias towards ‘catching up’ by attempting to add more control participants; this led to inclusion of some control participants from schools that did not have any Triple-Focus intervention classes running. The study ran over five school years in total. A flowchart provides an overview of recruitment, enrollment, assignment, and intervention for 1st, 2nd, and 3rd grade participants in Atlanta, Boston, and Toronto (see Supplementary Table 1 in the Appendix).
Table 1 displays overall reading scores and participant profiles for the sample. Results of a multivariate ANOVA confirmed that intervention and control participants were comparable at pretest on all selection criteria, including age, F (8, 199) = 1.44, p = .18. Descriptive statistics for every outcome measure subdivided by time of test (pretest, posttest), intervention condition (Triple-Focus, Control), and grade (Grades 1, 2, 3) have been provided in Supplementary Table 2 (Appendix). Additional evidence of group comparability can be seen in Table 3, in which γ01 represents the test of intervention and control pretest differences on each outcome. The total sample was confirmed to be significantly impaired on all measures of reading achievement, performing more than one standard deviation below expectations on measures of decoding, word reading, and passage comprehension, but at the lower end of the average range on measures of receptive vocabulary and intellectual functioning. Overall the sample was almost a full SD below expectation on the Freedom from Distractibility factor score from the WISC, suggesting that a high proportion of participants may have had attention difficulties. Approximately 51% of the sample was from low SES families, and 64% were males.
Table 1.
Characteristic | Intervention (n = 172) | Control (n = 47) |
---|---|---|
 | M (SD) | M (SD) |
Age in months | 89.2 (12.2) | 91.5 (11.9) |
WRMT-R Word Attack Scaled Score | 76.6 (9.3) | 74.2 (8.2) |
WRMT-R Word Identification Scaled Score | 81.1 (9.9) | 77.7 (11.1) |
WRMT-R Passage Comprehension Scaled Score | 79.2 (8.9) | 76.9 (10.6) |
WRAT-III Reading Scaled Score | 85.4 (10.0) | 79.3 (10.0) |
WISC Freedom from Distractibility Index | 87.3 (12.6) | 83.2 (12.6) |
WISC Processing Speed Index | 96.6 (15.2) | 95.1 (13.8) |
Kaufman Brief Intelligence Test | 92.6 (10.4) | 93.0 (10.6) |
Peabody Picture Vocabulary Test | 92.1 (13.6) | 98.1 (14.9) |
Proportion Male | .624 | .702 |
Number in Grade 1 | 79 | 18 |
Number in Grade 2 | 43 | 13 |
Number in Grade 3 | 51 | 16 |
Proportion Low Socio-Economic Status | .487 | .574 |
Table 3.
Model phase | Fixed effect | Par. | CHT | WAT | WID | WPC | SRI | TSW | TPD | GRT |
---|---|---|---|---|---|---|---|---|---|---|
Pretest | Intercept | γ00 | 4.25** | 456.00** | 404.05** | 439.57** | 7.55** | 19.31** | 4.08** | 3.94** |
 | Intervention | γ01 | 0.33 | 6.27** | 9.64** | 6.37** | 0.46 | 2.60 | 0.72 | 0.65 |
 | Grade 1/2 vs. 3 | γ02 | −4.08** | −6.49** | −17.46** | −10.66** | −4.57** | −8.01** | −1.70** | −2.89** |
 | Grade 1 versus 2 | γ03 | −0.27 | −4.84** | −17.38** | −9.70** | −3.21** | −6.31** | −0.91* | −0.83** |
Growth to Posttest | Interceptᵃ | γ10 | 21.87** | 25.95** | 47.33** | 29.76** | 12.59** | 18.64** | 6.21** | 7.00** |
 | Intervention | γ11 | 18.15** | 15.80** | 19.37** | 14.80** | 8.61** | 9.56** | 6.21** | 4.23** |
 | Grade 1/2 vs. 3 | γ12 | −0.85 | 5.81** | 11.79** | 5.41** | 0.76 | 2.98** | 0.23 | 0.02 |
 | Grade 1 versus 2 | γ13 | −4.80** | 3.99** | 15.12** | 5.50** | −0.12 | 3.99** | 0.28 | −0.07 |
 | Interv. × Grade 1/2 vs. 3 | γ14 | 0.46 | 1.18 | 5.62* | 1.30 | 0.58 | 2.44* | 0.40 | 0.03 |
 | Interv. × Grade 1 versus 2 | γ15 | −6.88* | 0.21 | 4.98 | 2.74 | −0.92 | 1.92 | 1.88 | 0.66 |
Followup trajectory | Interceptᵃ | γ20 | 2.93* | −0.69 | 7.19** | 4.21** | 3.12** | 5.81** | 2.38** | 2.89** |
 | Grade 1/2 vs. 3 | γ21 | 0.86* | −0.05 | −0.62 | −0.28 | 0.22 | −0.31 | −0.54** | −0.07 |
 | Grade 1 versus 2 | γ22 | 1.37** | 1.32** | 0.94 | 1.56** | 0.52 | 0.68** | 0.83** | 0.95** |
Notes:
* p < .05;
** p < .01.
ᵃ All effects FDR corrected at p < .001. CHT = Challenge Test, WAT = WRMT Word Attack, WID = WRMT Word Identification, WPC = WRMT Passage Comprehension, SRI = Standardized Reading Inventory Comprehension, TSW = TOWRE Sight Word Efficiency, TPD = TOWRE Phonemic Decoding Efficiency, GRT = GORT Rate. All analyses reported in this table were performed on raw scores, except for WRMT-R subtests, which were analyzed using W scores.
Measures
The reading and reading-related measures below were selected because they are standardized, widely used in educational and intervention outcome research, and psychometrically appropriate for growth-curve modeling. Use of these and the experimental measures allows for comparison with our own past research and other major intervention studies in the literature. The robust psychometric properties of the experimental measures of learning and transfer-of-learning have been documented in a separate report by our group (Cirino et al., 2002). As well, the standardized measures were selected because of similarly excellent psychometric characteristics, including reliability, construct validity, and the ability to sensitively measure change in reading and related skills. Measures were administered by Masters-level research assistants or senior graduate students trained and supervised by the Research Coordinator in each site. Examiners were only allowed to test independently after completing training, observing a trained examiner, and being observed by the Research Coordinator during testing. Double scoring was used to ensure the accuracy of scoring and to assess inter-scorer reliability.
Standardized measures of reading and related skills (intervention: pre, mid, post, and follow-up; controls: pre and posttest).
Word reading.
Woodcock Reading Mastery Test - Revised (WRMT-R, Form G, Woodcock, 1987)—Word Identification subtest. The Word Identification subtest presents letters and then words in isolation for students to identify. Wide Range Achievement Test-3 (WRAT-3). The WRAT-3 (Wilkinson, 1993) similarly measures both individual letter identification and word reading (Reading). Test-retest reliability exceeds .95 for the WRMT-R Word Identification subtest; alternate form reliability exceeds .87 for the WRAT-3 Reading subtest.
Speeded word identification.
Test of Word Reading Efficiency (TOWRE), Sight Word Efficiency subtest (Torgesen, Wagner, & Rashotte, 1999). Sight Word Efficiency assesses the number of real words that can be accurately read within 45 seconds. Alternate-form reliability exceeds .88 for Sight Word Efficiency.
Nonword decoding.
WRMT-R Word Attack subtest; TOWRE Phonemic Decoding Efficiency subtest. On these measures, students decode a series of progressively harder pronounceable nonsense words; the TOWRE has a speed component as students are asked to read as many nonwords as possible in 45 seconds. Alternate-form reliability exceeds .91 for Phonemic Decoding Efficiency; test-retest reliability for Word Attack exceeds .73 for this grade-range.
Reading comprehension skills.
Gray Oral Reading Test–Version 4 (GORT-4; Wiederholt & Bryant, 2004); Standardized Reading Inventory-2 (SRI-2; Newcomer, 1999); WRMT-R Passage Comprehension subtest. The GORT-4 and SRI-2 both provide text reading accuracy and comprehension scores. The GORT-4 stories are read aloud once, obtaining a measure of reading rate and comprehension. The SRI-2 stories are read once aloud and once silently, with comprehension measured using lexical, inferential, and factual open-ended questions about the text. Time to read each passage is also recorded to provide an additional indicator of reading rate. The Passage Comprehension task assesses comprehension using a cloze procedure. Test-retest reliability exceeds .85 for the GORT-4, .85 for the SRI-2, and .91 for the WRMT-R.
Spelling.
The PIAT-R Spelling subtest assesses the child’s ability to recognize standard spellings of spoken words, a measure of orthographic awareness. Test-retest reliabilities were .91 in this age range and with this population (Cirino et al., 2002).
Experimental measures of training and transfer (intervention: pre, mid, post, and follow-up; controls: pre and posttest).
Sound Combinations tests the reader’s ability to pronounce a set of 30 letter clusters including vowel digraphs (ee, oa, ai), diphthongs (oo, oi, ou), vowel-controlled consonants (ge, gi, ce, ci), r- and l-controlled vowels, and high frequency bound morphemes (-ing, -tion). This measure has been found to be a reliable index of training success (Lovett et al., 1994; Lovett et al., 2000; Lovett & Steinbach, 1997). Observed internal consistency (Cronbach’s alpha) was .83.
The Challenge Words Test consists of 55 uninstructed, multisyllabic words that embed the instructed spelling patterns and affixes. This test provides students with the opportunity for application of the decoding strategies taught in both the PHAST and Triple-Focus Programs. It is also a sensitive index of transfer of learning for children and adolescents with RD (Lovett et al., 1994; Lovett et al., 2000; Lovett & Steinbach, 1997) and consistently produces 70-hour treatment effect sizes ranging from .65 to .85. Observed internal consistency (Cronbach’s alpha) was .93.
Word Knowledge Tests (pre- and post-testing only). Multiple Definitions: This task assesses the student’s ability to provide two or more definitions for words with multiple meanings. Students receive credit for each unique definition provided. All items on this test are words presented and discussed in the RAVE-O portion of the Triple-Focus Program, and thus this test serves as a measure of instructed vocabulary content. Test-retest reliability for this task was .66, calculated on a pre-intervention repeat assessment using a sample similar to that reported in Morris et al. (2012).
The Word Test 2, Flexible Word Use subtest (Bowers, Huisingh, Johnson, LoGiudice, & Orman, 2004): This task, assessing vocabulary knowledge, asks students to produce two meanings for each stimulus word provided. The standardized scoring is 1 or 0 per item – with 1 point only being awarded if the child provides two definitions. Standardized scoring was used, but we also deviated from test administration guidelines and asked participants to provide as many definitions as they could. We then calculated an alternate raw score for this subtest that summed ALL of the definitions provided, thus yielding similar scoring as used for the Multiple Definitions test but on an uninstructed vocabulary list. The Word Test 2 has average test-retest reliability of .90 and average internal consistency reliability of .81.
Language and cognitive abilities: predictor variables.
Phonological processing.
Comprehensive Test of Phonological Processing (CTOPP; Wagner, Torgesen, & Rashotte, 1999): 1) Blending Words measures the ability to combine orally-presented, individual speech sounds into words; 2) Elision measures the ability to repeat a spoken word omitting one of the phonemes. The average internal consistency reliability is .84 for Blending Words, and .89 for Elision.
Naming speed (at multiple time points).
Rapid Automatized Naming (RAN; Wolf & Denckla, 2005). These tasks assess the ability to rapidly name visual symbols (letters, colors, objects, numbers, or combinations). Test-retest reliability exceeds .84. Rapid naming of letters is reported here.
Cognitive ability (pretesting only).
Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). The WASI is an abbreviated measure of verbal and nonverbal cognitive ability, adapted from the Wechsler Intelligence Scale for Children–III (Wechsler, 1991), and the Wechsler Adult Intelligence Scale–III (Wechsler, 1997). Students were administered all four subtests (Vocabulary, Similarities, Block Design and Matrix Reasoning). Test-retest reliability exceeds .90 for composite scores.
Receptive vocabulary (pretesting).
Peabody Picture Vocabulary Test, Third Edition (Dunn & Dunn, 1997). The PPVT-III assesses receptive vocabulary skills; participants select, from four pictures, the one that best represents the meaning of a word presented orally. Test-retest reliability exceeds .91 for the PPVT-III.
Visual Sequential Memory (pretesting).
The Visual Sequential Memory subtest of The Test of Visual Perceptual Skills-Revised (Gardner, 1996) tested the child’s ability to recall a series of forms just presented from four possible alternatives. Average internal consistency reliability for this age range is .54.
Intervention Conditions
Subjects with similar single word reading levels (WRMT-R Word Identification and Word Attack raw scores) were assigned to an instructional group of four children who received the Triple-Focus Program outlined below. A total of 100–125 intervention sessions were conducted during the school year; children typically were seen in a ‘pull out’ format for 60 minutes a day, five days a week, in their own schools. At the school’s discretion, these intervention sessions were scheduled to occur while their classroom was receiving the day’s reading instruction, and this occurred in approximately 75% of cases. Where this schedule was not possible, schools elected to have their children come to the program during art, science, and social science instruction. Classes were not scheduled to occur during math instruction. Because we were in multiple cities, school districts and schools, we chose to allow previous and current curriculum to vary randomly to better evaluate the generalizability of our specific program results.
The Triple-Focus Program was an experimental reading intervention developed and directly based upon our previous work demonstrating that (i) developmental reading problems are associated with multiple core linguistic and cognitive deficits (phonological awareness, naming speed, and cognitive strategy use) that limit reading acquisition; and that (ii) remedial reading interventions that address more than one of these deficits are most effective (Lovett, Lacerenza, Borden et al., 2000; Morris et al., 2012). The Triple-Focus intervention integrated proven instructional modules from our previous randomized control trial report (Morris et al., 2012).
Samples from the Triple-Focus Scope and Sequence for Lessons 32, 77, and 106 are provided in Supplementary Table 3 (Appendix). Looking at lessons sampled from different points in the program illustrates how the components were integrated, the amount of instructional time allocated to different components, and how the focus and time allocation shift over the course of 125 hours of Triple-Focus intervention.
The Triple-Focus Reading Program is an integration of the PHAST Reading Program: Parts One (Decoding) and Two (Comprehension) with the RAVE-O Program (Retrieval, Automaticity, Vocabulary Elaboration, Orthography; Wolf, Miller, & Donnelly, 2000; Wolf et al., 2009). The PHAST Reading Program: Decoding teaches children five specific metacognitive word identification strategies so they may become competent and independent readers (overview in Lovett, Lacerenza, & Borden, 2000). Part Two of the program teaches children comprehension strategies (predicting, summarizing, clarifying, questioning) using a metacognitive approach to improve text reading and comprehension skills (overviews in Lovett, Lacerenza, Steinbach, & De Palma, 2014; Lovett, Lacerenza, De Palma, & Frijters, 2012). The RAVE-O Program is an experimental, fluent comprehension intervention developed by Wolf and her colleagues (Wolf, Donnelly, & Miller, 2000; Wolf et al., 2009) that is based on theoretical neurocognitive models of reading. Specifically, RAVE-O facilitates the development of accuracy and fluency in underlying phonological, orthographic, semantic, syntactic, and morphological skills, and their rapid amalgamation at the sublexical, lexical, and connected text levels. RAVE-O addresses the need to explicitly teach children each of these components, and to teach explicit inter-connections among these component systems of oral and printed language at the time core words are taught (Wolf et al., 2009). Core words are taught that exemplify the polysemous nature of many words, their varied syntactic functions in different contexts, and how morphemes facilitate meaning. Thus, the RAVE-O Program focuses on the linguistic building blocks of reading fluency, as well as three strategies for comprehension.
All groups started at the first lesson, but more advanced groups could progress through the lessons more rapidly. As the program progressed, the number of strategies increased and time was devoted to acquisition of a metacognitive ‘Game Plan’ so that children learned how to select a strategy, monitor its effectiveness, and evaluate the results. The focus moved from building phonological and orthographic skills and knowledge to increasing attention paid to multiple components of words and connected text at the semantic, syntactic, morpho-syntactic, and discourse levels.
The Triple-Focus Program was designed to teach the children a set of word identification strategies and specific decoding procedures so that they become more competent and independent in their approach to reading unfamiliar words in print—and, at the same time, to develop accuracy and fluency in underlying linguistic retrieval skills so the children could learn to read text fluently and with comprehension. As an extension of two reading interventions that had been used with positive results in our Atlanta, Boston, and Toronto sites, the Triple-Focus program was designed to offer a structured and scaffolded instructional framework of effective decoding and reading strategies. The original five decoding strategies of the PHAST Program were supplemented by and tied to the fluency, orthography, vocabulary, syntax, and morphology activities of the RAVE-O Program. This allowed a richer linguistic framework of component skills and strategies with which to remediate the multifaceted language-based deficits of these struggling readers.
As in our previous implementations, the program began with phonological remediation, acquisition of the letter- and letter-cluster sound mappings, phonological analysis and blending skills, and practice using a ‘Sounding Out’ strategy with precision in how sounds were blended (Engelmann & Bruner, 1988). As strategy-specific preskills and knowledge were acquired, additional word identification strategies were learned and practiced, using a strategy dialogue modeled by the teacher and acquired by the children; these included Rhyming (word identification by analogy; Gaskins et al., 1986), Peeling Off (separating affixes in multisyllabic words), Vowel Alert (learning the multiple pronunciations of vowels and vowel combinations according to their frequency in printed English), and I Spy (useful for compound words: identifying smaller known words). Each lesson contained RAVE-O activities and games, drawing upon words and sublexical patterns from PHAST and the core words of RAVE-O, incorporating words with shared phonemes and orthographic patterns, and semantic richness (multiple meanings) into work on vocabulary and orthographic knowledge, word retrieval, and other linguistic building blocks of reading fluency (Wolf et al., 2000; Wolf & Katzir-Cohen, 2001; Wolf et al., 2009).
The Triple-Focus Program was taught by experienced and certified teachers working for the research teams; some had Master's degrees, and all had additional qualifications in special education and/or reading. There were multiple teachers at each site; several had participated in our earlier studies and had experience teaching PHAST and RAVE-O. All teachers were trained to provide multiple interventions within our research (i.e., in other related studies). All were trained a priori to a level of competence during intensive training conducted in Boston. All teachers had a detailed Scope and Sequence, scripted lessons to follow, and timelines to adhere to in their teaching.
Throughout the study, a senior/lead research teacher at each site (i) continually monitored the progress and pace of each teacher and group through the lessons, (ii) initiated cross-site teleconferences between teachers to answer questions and problem-solve challenges, and (iii) offered reminders and instructional refreshers during regular team meetings. To further support fidelity of implementation in each class and across sites, an email list-serve was established where teachers could ask questions and receive an immediate response. Every teacher prepared a weekly progress report, summarizing all lessons and activities completed by each class. This report was posted weekly on the list-serve. In-person mentor visits by the senior/lead teacher occurred 3–4 times per year with feedback being provided. Finally, videotapes of classes were shared between sites so that trainers/lead teachers could ensure cross-site consistency in program implementation.
The control condition was a curricular control group, including children who met study criteria for RD, and who were not placed in the Triple-Focus intervention; these children constituted a ‘business-as-usual’ control to be followed and evaluated over time. As a classroom-based control, these children received whatever level and type of intervention the schools or their parents would provide for them. Because schools in these years (2001–2006) frequently waited until 3rd grade to identify children as needing help, it is unlikely that many children in Grades 1 and 2 received extra reading assistance in school. This was the case for all three sites. Schools in all sites provided 90 minutes of classroom literacy instruction daily. For ethical reasons, these children were offered access to the intervention program the year following their control participation. Control participants were assessed at pre- and posttest only.
The full 125 hours of instruction were implemented as planned for 68% of the sample (n = 117). The remaining 55 intervention children received an average of 104.5 hours of instruction (SD = 14.5; range = 70 to 124 hours). All control participants were assessed on intervention outcomes after equivalent time in the business-as-usual condition. Post-intervention assessments for all participants occurred after the final lesson was delivered, with those who did not complete the full 125 hours having their last observation carried forward.
Results
First Analysis
The first analysis is preliminary to the main analyses presented in the following section. The goals of the first analysis were to replicate program findings from our previous work, to generate traditional effect sizes, to maximize power for the intervention versus control contrast, and to provide a conservative test of the efficacy of the intervention. As such, the first analysis included all participants who contributed any valid outcome data, regardless of how much instruction was delivered, carrying forward the last outcome measurement for those who dropped out prior to the planned 125 hours. Moderated regression models were formed, regressing each outcome on pre-intervention outcome scores, intervention group (i.e., Triple; Control), grade (i.e., a priori focused contrasts of Grades 1/2 vs. 3; Grade 1 vs. 2), the interaction between grade and intervention assignment, and the interaction between pre-intervention scores and intervention group. This final interaction, representing the homogeneity of regression slopes in ANCOVA, was initially included in each model, but dropped from the final model if nonsignificant. Since each model explicitly included a developmental indicator (i.e., grade), raw scores on each outcome were analyzed, except on the WRMT-R outcomes, which utilized the Rasch-scaled W scores.
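To make the structure of these moderated regression models concrete, the following Python sketch builds the fixed-effect predictors for one child. It is a sketch under stated assumptions: the numeric contrast codes (1, 1, −2 and 1, −1, 0) are a conventional orthogonal coding chosen for illustration, since the report does not state the codes or their sign direction, and the function name `design_row` is hypothetical. The pretest-by-intervention term is omitted because it was dropped from final models when nonsignificant.

```python
# Assumed orthogonal contrast codes for grade (not taken from the report):
#   first value  = Grades 1/2 vs. 3
#   second value = Grade 1 vs. 2
GRADE_CONTRASTS = {
    1: (1, 1),
    2: (1, -1),
    3: (-2, 0),
}


def design_row(pretest, intervention, grade):
    """Return the fixed-effect predictors for one child.

    Order: intercept, pretest score, intervention (1 = Triple, 0 = Control),
    the two grade contrasts, and the two grade-by-intervention interactions.
    """
    c1, c2 = GRADE_CONTRASTS[grade]
    return [1.0, float(pretest), float(intervention),
            float(c1), float(c2),
            float(intervention * c1), float(intervention * c2)]


# Each contrast sums to zero across grades, and the two are orthogonal.
codes = list(GRADE_CONTRASTS.values())
sum_c1 = sum(c[0] for c in codes)
sum_c2 = sum(c[1] for c in codes)
dot = sum(c[0] * c[1] for c in codes)
print(sum_c1, sum_c2, dot)
print(design_row(pretest=20, intervention=1, grade=3))
```

With codes of this kind, the two grade contrasts and their interactions with intervention can be tested separately, which is what allows statements such as "Grades 1/2 versus Grade 3" to be evaluated independently of "Grade 1 versus Grade 2".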
Since children were nested within their instructional groups, the analysis was conducted within a mixed model framework, with instructional group as a random effect. Cross-level interactions and nested group effects were not of substantive interest and are not reported here, but simply incorporated into each model to account for the group-level dependence in the data and to obtain appropriate standard errors for grade and treatment effects. The resulting intra-class correlations (ICC) are included in Table 2 for use in conducting power analyses for future cluster-randomized trials. In addition to the moderated regressions, intervention effect sizes (Cohen’s d) were calculated via the pooled pretest standard deviations of the intervention and control groups, along with the model-adjusted post-intervention mean score on each outcome. Table 2 reports these values for outcome models formed for 14 outcomes. Since this analysis involved 14 correlated outcome measures, the potential for an inflated false-discovery rate existed. As a result, the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995) was implemented to correct for multiple significance tests and control the false-discovery rate.
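The two computations named above, the effect size and the false-discovery-rate correction, can be illustrated with a short Python sketch. It assumes that Cohen's d was formed as the model-adjusted posttest mean difference divided by the pooled pretest standard deviation, which is one reading of the description; the function names, the example values, and the threshold q = .05 are illustrative assumptions rather than details taken from the analysis.

```python
from math import sqrt


def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Pool two group standard deviations, weighting by degrees of freedom."""
    return sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2))


def cohens_d(adjusted_mean_difference, sd_1, n_1, sd_2, n_2):
    """Adjusted posttest mean difference scaled by the pooled pretest SD."""
    return adjusted_mean_difference / pooled_sd(sd_1, n_1, sd_2, n_2)


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the original order of p_values,
    marking which tests are declared significant at FDR level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical values: an 8-point adjusted difference, pretest SDs of 10 in both groups.
print(cohens_d(8.0, 10.0, 172, 10.0, 47))
print(benjamini_hochberg([0.03, 0.5, 0.001, 0.02]))
```

Because the procedure is step-up, a p value is retained whenever any larger p value passes its own rank-based threshold, so the decision for one outcome depends on the full set of tests.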
Table 2.
Outcome | Model ICC | Fᵃ | Adj. mean difference | SE | CI Lower | CI Upper | d | Grade by Treatment | F | p |
---|---|---|---|---|---|---|---|---|---|---|
Experimental measures | ||||||||||
Sound combinations | 0.43 | 58.11 | 9.17 | 0.96 | 7.26 | 11.08 | 1.81 | |||
Challenge Test | 0.23 | 70.17 | 16.65 | 2.08 | 12.52 | 20.79 | 1.82 | 2 > 1ᵇ | 4.44 | .04 |
Multiple Definitions Trained | 0.34 | 77.73 | 0.50 | 0.06 | 0.38 | 0.62 | 1.44 | 2 > 1ᵇ | 6.05 | .02 |
Standardized/Norm-referenced measures | ||||||||||
TOWRE Phon. Decoding | 0.12 | 36.30 | 6.36 | 1.07 | 4.23 | 8.49 | 1.39 | |||
TOWRE Sight Words | 0.10 | 22.69 | 8.81 | 1.74 | 5.36 | 12.26 | 0.57 | 1/2 > 3ᶜ | 3.91 | .05 |
WRAT-3 Reading | 0.11 | 15.66 | 4.40 | 0.59 | 3.23 | 5.58 | 0.91 | |||
WRMT-R Word Attack | 0.17 | 75.32 | 17.48 | 1.98 | 13.56 | 21.41 | 1.08 | 1/2 > 3ᶜ | 3.09 | .08 |
WRMT-R Word Identification | 0.07 | 55.34 | 22.97 | 2.93 | 17.17 | 28.77 | 0.59 | 1/2 > 3ᶜ | 3.93 | .05 |
WRMT-R Passage Comp. | 0.21 | 55.69 | 16.52 | 2.21 | 12.13 | 20.91 | 0.63 | |||
GORT-R Comprehension | 0.27 | 18.23 | 6.22 | 1.46 | 3.33 | 9.11 | 0.90 | |||
SRI Comprehension | 0.27 | 19.34 | 7.66 | 1.74 | 4.20 | 11.11 | 0.64 | |||
GORT-R Rate | 0.21 | 23.20 | 3.86 | 0.80 | 2.27 | 5.46 | 0.78 | |||
WORD-R | 0.05 | 25.06 | 0.22 | 0.04 | 0.14 | 0.31 | 0.61 | |||
PIAT-R Spelling | 0.20 | 19.23 | 8.23 | 1.32 | 5.60 | 10.84 | 0.72 | 3 > 1/2ᵈ | 7.37 | .01 |
Notes: ICC = Intraclass correlation;
ᵃ Reports the F statistic for the Intervention versus Control test of posttest adjusted means, all p < .001 after adjustment for the False Discovery Rate
ᵇ Grade 2 adjusted posttest mean greater than Grade 1
ᶜ Combined Grade 1/2 adjusted posttest mean greater than Grade 3
ᵈ Grade 3 adjusted posttest mean greater than combined Grade 1/2. All analyses reported in this table were performed on raw scores, except for WRMT-R subtests, which were analyzed using W scores.
Globally, across all outcomes, statistically significant and substantial program effects were observed on adjusted posttest scores. In every case, participants in the Triple intervention outperformed those in the Control condition, with effect sizes ranging from a moderate .57 to a large 1.82. Effect sizes were largest for experimental outcomes assessing directly instructed content and lowest, but still moderate-to-large, for standardized measures of single word identification. Very strong effect sizes were observed for measures of nonword decoding, and strong effect sizes for reading comprehension outcomes. The average effect size across the 14 outcomes was 0.99. The average effect size on standardized measures was .80 and on experimental measures was 1.69. Intra-class correlations ranged from 0.05 to 0.43, with the largest ICCs observed for experimental measures of instructed content and comprehension outcomes.
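The summary values in this paragraph can be recomputed directly from the d column of Table 2. The short Python sketch below does so; the grouping of outcomes into experimental and standardized measures follows the table's own section headings, and the variable names are illustrative.

```python
from statistics import mean

# Cohen's d values transcribed from Table 2, in table order.
experimental_d = [1.81, 1.82, 1.44]
standardized_d = [1.39, 0.57, 0.91, 1.08, 0.59, 0.63,
                  0.90, 0.64, 0.78, 0.61, 0.72]
all_d = experimental_d + standardized_d

mean_experimental = round(mean(experimental_d), 2)
mean_standardized = round(mean(standardized_d), 2)
mean_overall = round(mean(all_d), 2)

print(len(all_d), min(all_d), max(all_d))
print(mean_experimental, mean_standardized, mean_overall)
```

The recomputed means (1.69, 0.80, and 0.99) and the range (.57 to 1.82) agree with the values reported in the text.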
The grade by treatment interaction was statistically significant for five outcomes, and marginally significant for one. After accounting for pretest scores, the difference between intervention and control for Grade 1/2 participants at posttest was approximately twice as large as the difference for Grade 3 participants. This pattern was repeated across TOWRE Sight Words (Grade 1/2 intervention-control posttest difference of 11.1 versus Grade 3 difference of 3.7; illustrated in Figure 1), WRMT-R Word Identification (Grade 1/2 intervention-control difference of 26.6 versus Grade 3 difference of 14.6), and WRMT-R Word Attack (Grade 1/2 intervention-control difference of 19.7 versus Grade 3 difference of 12.1; marginally significant). A reverse pattern was observed for three other outcomes, whereby the difference between intervention and control for participants in higher grades was approximately twice as large as the difference for younger participants. The Grade 2 intervention-control difference was greater than the Grade 1 difference on the Challenge test outcome (Grade 2 intervention-control posttest difference of 22.9 versus Grade 1 difference of 12.2). A similar pattern was observed for the WORD-2 outcome (Grade 2 posttest difference of 0.71 versus Grade 1 difference of 0.36). This pattern was repeated for the PIAT Spelling outcome, with three times the intervention-control posttest difference for Grade 3 participants (15.5) compared to Grade 1/2 participants (5.1).
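The "approximately twice as large" comparisons can be made explicit by dividing each pair of intervention-control differences. The sketch below uses only the differences quoted in the paragraph above; the dictionary labels are shorthand introduced here.

```python
# Each entry: (larger intervention-control difference, smaller difference),
# copied from the paragraph above. Labels are shorthand for illustration.
differences = {
    "TOWRE Sight Words (Grade 1/2 vs Grade 3)": (11.1, 3.7),
    "WRMT-R Word Identification (Grade 1/2 vs Grade 3)": (26.6, 14.6),
    "WRMT-R Word Attack (Grade 1/2 vs Grade 3)": (19.7, 12.1),
    "Challenge Test (Grade 2 vs Grade 1)": (22.9, 12.2),
    "WORD-2 (Grade 2 vs Grade 1)": (0.71, 0.36),
    "PIAT Spelling (Grade 3 vs Grade 1/2)": (15.5, 5.1),
}

# Ratio of the larger difference to the smaller one for each outcome.
ratios = {name: larger / smaller for name, (larger, smaller) in differences.items()}

for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.2f}")
```

The ratios are not uniform across outcomes, so "approximately twice" is best read as a summary of the pattern rather than an exact multiplier.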
The interaction between pretest scores and intervention condition was significant for the Sound Combinations and PIAT Spelling outcomes. Within the ANCOVA framework, this would indicate a violation of the homogeneity of regression assumption, that is, differential adjustment of posttest scores by group. Within the moderated regression framework, these two effects can be explicitly modeled and interpreted as substantive effects. In the case of Sound Combinations, a dramatic increase in intervention group score variance from pretest to posttest resulted in a lower pre-post correlation for that group (r = .30) compared to Controls (r = .62). A similar, but less dramatic, pattern was observed for PIAT Spelling, with a few Intervention participants making large gains by posttest, thereby reducing the correlation between pre- and posttest for the Intervention group.
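A minimal sketch of what a pretest-by-condition interaction means in practice follows. With a dummy-coded group variable and a full interaction term, the interaction coefficient equals the difference between the two within-group pretest slopes, so the sketch fits each group separately. All scores are hypothetical and chosen only to mimic the pattern described (greater posttest variability and a weaker pre-post correlation in the intervention group); they are not study data.

```python
def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


# HYPOTHETICAL scores, for illustration only.
control_pre = [2, 4, 5, 7, 9, 10]
control_post = [3, 5, 5, 8, 9, 12]
intervention_pre = [2, 4, 5, 7, 9, 10]
intervention_post = [20, 9, 25, 12, 30, 16]

slope_c, r_c = slope_and_r(control_pre, control_post)
slope_i, r_i = slope_and_r(intervention_pre, intervention_post)

# In posttest ~ pretest + group + pretest:group (group coded 0/1),
# the pretest:group coefficient is this difference in slopes.
interaction = slope_i - slope_c

print(f"control: slope={slope_c:.2f}, r={r_c:.2f}")
print(f"intervention: slope={slope_i:.2f}, r={r_i:.2f}")
print(f"pretest x group interaction (slope difference): {interaction:.2f}")
```

A nonzero slope difference is what ANCOVA treats as an assumption violation and what the moderated regression treats as an effect to be interpreted.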
Second Analysis
The goals of the second analysis were to utilize all available repeated observation data to precisely estimate specific trajectories of change, to incorporate trajectories representing the yearly follow-up outcome measurement, and to explore predictors of these two intra-individual parameters. Growth curves were used to estimate intercepts and trajectories, and to model individual differences in intervention response across five repeated observations (pretest, and after 35, 70, 105, and 125 hours of instruction) and at each follow-up occasion. Follow-up occurred yearly, one to three times after the intervention depending on grade at entry to the study (i.e., up to the end of 4th grade only).
The growth curve analysis was performed on the eight outcomes for which there were outcome measurements at each of the time-points mentioned above. For precision in estimating fixed effects and to utilize all available data, this second analysis included all cases that had at minimum pretest outcome scores. All available outcome data were utilized, and every dropout was represented in the growth curve analysis. Data density across observation points was as follows: pretest, 219 participants (100%); 35 hours, 172 (78.5%); 70 hours, 172 (78.5%); 100 hours, 158 (72.1%); 125 hours, 205 (93.6%); first follow-up, 52 (23.7%); second follow-up, 52 (23.7%); third follow-up, 28 (12.8%). Lower data density at the 35-, 70-, and 100-hour testing points and at follow-up reflects the fact that control participants were only tested at pretest and posttest.
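The data density figures are each observation count divided by the 219 participants with pretest scores. A short sketch of that calculation, using the counts listed above (the dictionary keys are labels introduced here):

```python
# Counts copied from the paragraph above; keys are illustrative labels.
n_pretest = 219
observed = {
    "pretest": 219,
    "35 hours": 172,
    "70 hours": 172,
    "100 hours": 158,
    "125 hours": 205,
    "follow-up 1": 52,
    "follow-up 2": 52,
    "follow-up 3": 28,
}

# Percentage of the pretest sample observed at each testing point.
density = {point: 100 * count / n_pretest for point, count in observed.items()}

for point, pct in density.items():
    print(f"{point}: {observed[point]} of {n_pretest} ({pct:.1f}%)")
```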
Growth models were based on a two-piece parameterization of time that modeled linear growth to posttest, with a separate component representing linear growth from posttest through the follow-up period. Not presented here are the competing piecewise and polynomial models that each provided a less adequate fit (via nested −2LL comparisons) across all outcomes. The most notable practical advantage of the chosen parameterization was that it afforded the ability to segregate and estimate effects that might interact with intervention condition from effects that might predict follow-up trajectories. This segregation was important, since follow-up trajectories could not be estimated for participants in the control condition. Several metrics for time were considered, including time as intervention days, chronological time, and models that incorporated both metrics. The most parsimonious and well-fitting model, used in the analyses reported here, was a hybrid model in which intervention days were utilized as the metric for time, with time to follow-up rescaled to this metric.
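The two-piece parameterization can be sketched as a pair of time codes per observation: one that rises during intervention and then stays fixed, and one that is zero until posttest and rises afterward. The scaling used below (0 to 1 across 125 hours, and years since posttest for follow-up) is an assumption made for illustration; the function names and parameter values are hypothetical and do not reproduce the fitted model.

```python
def piecewise_time(hours_of_instruction, years_since_posttest=0.0, total_hours=125.0):
    """Return (growth_to_posttest, followup) time codes for one observation.

    ASSUMED scaling: the first code runs from 0 at pretest to 1 at posttest,
    and the second counts years elapsed after posttest.
    """
    growth = min(hours_of_instruction, total_hours) / total_hours
    followup = max(years_since_posttest, 0.0)
    return growth, followup


def predicted_score(intercept, growth_slope, followup_slope, growth, followup):
    """Fixed-effects prediction from a two-piece linear growth model."""
    return intercept + growth_slope * growth + followup_slope * followup


# Hypothetical parameter values, for illustration only.
intercept, growth_slope, followup_slope = 10.0, 40.0, 5.0

for hours, years in [(0, 0), (35, 0), (70, 0), (125, 0), (125, 1), (125, 2)]:
    g, f = piecewise_time(hours, years)
    score = predicted_score(intercept, growth_slope, followup_slope, g, f)
    print(f"hours={hours:>3}, follow-up years={years}: codes=({g:.2f}, {f:.1f}), score={score:.1f}")
```

Because the first code stops increasing at posttest, the follow-up slope is estimated separately from the intervention-period slope, which is what allows predictors to be attached to one piece and not the other.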
Model fitting proceeded according to current best practices in multilevel growth modeling (e.g., Hox, Moerbeek, & van de Schoot, 2010; Snijders & Bosker, 2012). Initial model fitting also investigated several models to account for nesting of observations within individuals, within teacher and/or instructional group. Since the control condition was business-as-usual, and thus individual children were not cluster-randomized, clusters of one were formed for analysis purposes. Simulation studies have shown this strategy to be both more efficient and more powerful than either forming pseudo-clusters or treating the entire condition as one cluster (Bauer, Sterba, & Hallfors, 2008; Roberts & Roberts, 2005). Null and growth models involving only random effects were first fit and competitively evaluated within measure via BIC/AIC values. Across the eight outcomes, the best fitting random effects model included variance components for intercept and intervention growth rate, both at the participant and participant nested within instructional group levels. The best fitting model per outcome also included the follow-up trajectory piecewise parameter as a variance component, but in almost every case only for participants nested within instructional group. Finally, individual differences were incorporated as predictors of program-related growth, of follow-up trajectories, or of both. The initial fixed-effect predictor model included intervention group, grade at intervention start, and the interaction between group and grade. Fixed effects results from these models are presented in Table 3.
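Competitive evaluation by AIC and BIC can be sketched from each candidate model's deviance (−2LL) and parameter count, using the standard definitions AIC = −2LL + 2k and BIC = −2LL + k·ln(n). The deviances, parameter counts, model labels, and sample size below are hypothetical, and the choice of n for BIC in multilevel models (observations versus clusters) is itself a modeling decision that is simply assumed here.

```python
import math


def aic(neg2ll, n_params):
    """Akaike information criterion from the deviance (-2 log-likelihood)."""
    return neg2ll + 2 * n_params


def bic(neg2ll, n_params, n_obs):
    """Bayesian information criterion from the deviance."""
    return neg2ll + n_params * math.log(n_obs)


# HYPOTHETICAL candidates: label -> (deviance, number of parameters).
candidates = {
    "null": (5200.0, 3),
    "random intercept + slope": (5050.0, 6),
    "random intercept + slope + follow-up": (5045.0, 9),
}
n_obs = 1000  # assumed number of observations

scores = {
    name: {"AIC": aic(dev, k), "BIC": bic(dev, k, n_obs)}
    for name, (dev, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])

for name, s in scores.items():
    print(f"{name}: AIC={s['AIC']:.1f}, BIC={s['BIC']:.1f}")
print("lowest AIC:", best_by_aic)
print("lowest BIC:", best_by_bic)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes added parameters more heavily than AIC once n exceeds about 7.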
Examination of the fixed effects for Pretest, rows γ01 to γ03 in Table 3, indicates that participants in Grade 3 began intervention with substantially higher scores on all outcomes when compared to those in Grade 1/2 (row γ02); participants in Grade 2 began with higher scores than those in Grade 1 (row γ03), except for scores on the Challenge Test. When considering Growth to Posttest, row γ10 represents the growth rate in the control group, with row γ11 representing the additional growth made by Intervention participants over and above this baseline. Given the scaling of the growth models’ time parameter, the estimates in these rows are a direct representation of estimated growth over the course of 125 hours of instruction. For example, control participants gained an average of 21.87 Challenge Test words over 125 hours, and Intervention participants gained an additional 18.15. Across all outcomes, additional gains by Intervention participants were both substantial and statistically significant.
Rows γ12 and γ13 represent growth rates across the grade contrasts. On four of eight outcomes, Grade 1/2 participants gained skills at a faster rate than Grade 3 participants (row γ12). In the case of two outcomes (WRMT-R Word Identification and TOWRE Sight Words), this effect interacted with intervention group (row γ14). In these cases, the growth rate of Grade 1/2 participants in the intervention group far exceeded the rate of growth for Grade 3 participants (see Figure 1). This replicates similar effects seen in the first analysis. On four of eight outcomes, Grade 1 participants gained skills at a faster rate than those in Grade 2 (row γ13). This pattern was reversed for the Challenge Test and interacted with intervention group, such that the intervention effect was much more pronounced for participants who received the intervention in Grade 2 (row γ15).
A pseudo-R² (Hox, 2010; Raudenbush & Bryk, 2002) was calculated to estimate a) the proportion of variance in growth rates that could be accounted for by assignment to intervention or control condition; and b) the incremental proportion of variance explained by the grade by intervention interaction. Treatment assignment accounted for an average of 32% of the explainable variation in growth rates (range = 21% to 49%); grade by intervention interactions accounted for an average of 52% additional variance in growth rates (range = 29% to 76%).
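A pseudo-R² of this kind is commonly computed as the proportional reduction in a variance component when predictors are added. The sketch below uses hypothetical growth-rate variances chosen so that the two steps come out at 32% and 52%; it computes the second step relative to the variance remaining after the first, which is one possible reading of "incremental" and is an assumption rather than the documented procedure.

```python
def pseudo_r2(var_baseline, var_model):
    """Proportional reduction in a variance component after adding predictors."""
    return (var_baseline - var_model) / var_baseline


# HYPOTHETICAL slope (growth-rate) variances, for illustration only.
var_unconditional = 10.0        # no predictors of growth rate
var_with_treatment = 6.8        # after adding intervention condition
var_with_interaction = 3.264    # after adding grade x intervention terms

r2_treatment = pseudo_r2(var_unconditional, var_with_treatment)
# ASSUMPTION: incremental step is taken relative to the remaining variance.
r2_interaction = pseudo_r2(var_with_treatment, var_with_interaction)

print(f"treatment assignment: {r2_treatment:.0%}")
print(f"grade x intervention (incremental): {r2_interaction:.0%}")
```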
The rows γ20 to γ22 in Table 3 characterize follow-up trajectories. Overall, participants continued to gain reading skills from posttest through the follow-up occasions, on all outcomes except WRMT-R Word Attack (row γ20). The parameters in this row represent skill growth per year of follow-up. On six of eight outcomes, continued growth interacted with grade at intervention. In these cases, participants who began the intervention in earlier grades continued gaining skills at a rate that exceeded later intervention starts through the follow-up years (row γ22). Figure 2 illustrates this effect on the WRMT-R Passage Comprehension outcome. Note that by the second follow-up observation, participants starting the intervention in Grade 1 had caught up to those starting the intervention in Grade 2, despite being one year younger at that observation point.
Examination of variance component residuals indicated that additional intra-individual variability remained after accounting for intervention and grade effects. As a result, a secondary analysis was conducted incorporating additional individual difference factors as follows: receptive vocabulary scores (Peabody Picture Vocabulary Test), phonological awareness (Comprehensive Test of Phonological Processing phonological composite score), rapid naming (Rapid Automatized Naming Letters score), visual sequential memory (Test of Visual Perceptual Skills-Revised), and IQ (Wechsler Abbreviated Scale of Intelligence: 4-subtest IQ score). Each of these factors was introduced to the model initially alone, as predictive of growth to posttest, and as predictive of follow-up trajectories. Interactions of intervention group with these predictors were also included. Models were pruned of higher-order nonsignificant results to reflect a parsimonious model of individual differences. Table 4 reports significant results for these fixed effect individual difference predictors and their interaction with intervention condition.
Table 4.
Fixed Effects | Par. | CHT | WAT | WID | WPC | SRI | TSW | TPD | GRT | |
---|---|---|---|---|---|---|---|---|---|---|
Pretest | Phonological Awareness | γ00 | 1.70** | 4.04** | 4.53** | 2.37** | 0.63 | 2.00** | 1.53** | 0.26 |
Rapid Naming | γ01 | 2.04** | 2.67** | 8.44** | 6.82** | 2.10** | 5.71** | 0.95** | 2.46** | |
Vocabulary | γ02 | −0.54 | −1.07 | 1.49 | −0.34 | 0.56 | −1.25* | −0.33 | −0.01 | |
Visual Sequential Memory | γ03 | −0.90* | −0.54 | 0.52 | 0.83 | −0.03 | −0.59 | −0.56** | −0.01 | |
IQ | γ04 | 1.16 | 1.47 | 1.98 | 1.53 | 0.93 | 1.00 | 0.48 | 0.09 | |
| ||||||||||
Growth to Posttest | Phonological Awareness | γ10 | ||||||||
Rapid Naming | γ11 | −2.62** | ||||||||
Vocabulary | γ12 | 2.55** | 3.43** | |||||||
Visual Sequential Memory | γ13 | |||||||||
IQ | γ14 | 0.51 | −0.09 | 1.83 | 0.01 | −0.31 | 1.04 | 0.71 | −0.04 | |
Int. X Phonological Aware. | γ15 | |||||||||
Int. X Rapid Naming | γ16 | |||||||||
Int. X Vocabulary | γ17 | −3.08* | ||||||||
Int. X Visual Seq. Memory | γ18 | |||||||||
Int. X IQ | γ19 | 10.79** | 16.93** | 16.36** | 11.11** | 7.76* | 7.58* | 6.02* | 1.67~ | |
| ||||||||||
Followup trajectory | Phonological Awareness | γ20 | −0.89** | −0.63** | −0.12* | |||||
Rapid Naming | γ21 | 0.48* | 0.66** | 0.36* | ||||||
Vocabulary | γ22 | 2.14** | 0.60~ | 1.95** | 1.99** | 1.34** | 0.33** | |||
Visual Sequential Memory | γ23 | 1.58** | 0.85** | 0.36* | 1.12** | |||||
IQ | γ24 | −1.02* | −1.46** | −1.92* | −1.24** |
Notes: ~ p < .10; * p < .05; ** p < .01.
CHT = Challenge Test, WAT = WRMT Word Attack, WID = WRMT Word Identification, WPC = WRMT Passage Comprehension, SRI = Standardized Reading Inventory Comprehension, TSW = TOWRE Sight Word Efficiency, TPD = TOWRE Phonemic Decoding Efficiency, GRT = GORT Rate; Int. = Intervention. All analyses reported in this table were performed on raw scores, unless the measure was from a WRMT-R subtest, and these were analyzed using W scores.
In Table 4, rows γ00 to γ04 indicate whether each individual difference predictor was related to pretest scores on each outcome measure. Across multiple outcomes, phonological awareness and rapid automatized naming were related to pretest reading skill. Rows γ10 to γ14 indicate whether initial levels of the individual difference predictors were related to rate of change during the intervention period. Most of these effects are not interpretable, since they were included to ensure that all nested effects within a significant higher-order interaction were included, as reported in rows γ15 to γ19.
Across seven out of eight outcomes, IQ interacted with intervention group growth rates (this relationship was marginally significant for the remaining outcome, GORT Rate). Post-hoc examination of these interactions indicated that intervention growth rates were highest among lower-IQ Triple participants, with the greatest discrepancy in growth rates between intervention and control occurring when WASI IQs were low. In fact, the only group not demonstrating growth during the intervention period was the control subgroup with lower WASI IQ scores at entry. Post-hoc examination indicated parallel slopes across participants with lower vs. higher IQs if they participated in the Triple-Focus intervention. The interaction between WASI IQ and response to intervention on the WRMT-R Word Attack outcome is depicted in Figure 3. A reverse pattern was observed on SRI comprehension outcomes with receptive vocabulary (PPVT) as the predictor (row γ12). Higher vocabulary scores were associated with a greater difference in growth rates between intervention and control participants. The greatest growth rates in comprehension were observed for Triple-Focus participants who began intervention with relatively stronger vocabulary skills.
Rows γ20 to γ24 indicate the relationship between growth trajectories during the follow-up period and the individual difference predictors. Across four of eight outcomes (row γ23), higher visual sequential memory skill was associated with greater gains in the follow-up period. Post-hoc visual inspection of this effect showed that participants with the highest visual sequential memory skills continued to gain reading skills, while those with the lowest did not continue to gain, but rather leveled off one year after intervention. This pattern is depicted in Figure 4 for the measure of multisyllabic challenge word reading. The same pattern was evident on five of eight outcomes for the receptive vocabulary predictor. Relatively higher vocabulary skill was associated with greater continued gains during the follow-up period (row γ22).
Normalization rates
A final examination was made of the proportion of participants in each condition and each grade whose posttest scores fell within the average range following the intervention period. The proportions ‘normalized’ on four standardized outcome measures are displayed in Table 5.
Table 5.
Outcome | Triple Intervention: Grade 1 | Triple Intervention: Grade 2 | Triple Intervention: Grade 3 | Control: Grade 1 | Control: Grade 2 | Control: Grade 3 |
---|---|---|---|---|---|---|
WRMT-R Word Identification | 76.3* | 52.6* | 21.3 | 38.9 | 7.7 | 12.5 |
WRMT-R Word Attack | 77.6* | 50.0* | 38.3* | 27.8 | 0.0 | 6.3 |
WRMT-R Passage Comprehension | 67.1* | 36.8* | 34.0* | 29.4 | 0.0 | 6.3 |
SRI Passage Comprehension | 40.3 | 21.6 | 28.3 | 17.6 | 0.0 | 12.5 |
SRI Accuracy | 61.1* | 62.2* | 57.8 | 17.6 | 7.7 | 37.5 |
SRI Reading Quotient | 40.3* | 24.3* | 15.6 | 11.8 | 0.0 | 6.3 |
Note: * χ² test indicated that the proportion normalized within grade differed across intervention and control conditions, p < .05.
Chi-square tests of independence were calculated to establish whether the proportion of children achieving scores in the average range at posttest differed between the Triple and control groups. On the WRMT-R subtests (Word Attack, Word Identification, Passage Comprehension), significantly greater normalization was achieved by the Triple group in every grade, the only exception being a greater but nonsignificant advantage of the Triple over the control children in Grade 3 on Word Identification. On the SRI-2, lower rates of normalization were observed overall; however, Triple intervention children were normalized at significantly greater rates for SRI Accuracy and Reading Quotient scores in Grades 1 and 2, whereas the difference fell short of significance for Grade 3 children.
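A chi-square test of independence on a 2 × 2 table (condition by normalized or not) can be sketched as follows. The cell counts are hypothetical: Table 5 reports percentages only, so the counts below are invented for illustration and should not be read as the study's group sizes. The sketch uses the uncorrected Pearson statistic and compares it with 3.841, the critical value for one degree of freedom at α = .05.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a contingency table (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# HYPOTHETICAL counts: rows = Triple, Control; columns = normalized, not normalized.
counts = [
    [58, 18],
    [7, 11],
]

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
statistic = chi_square_independence(counts)
significant = statistic > CRITICAL_VALUE_DF1_ALPHA_05

print(f"chi-square = {statistic:.2f}, significant at .05: {significant}")
```

With small cells, as in some control-group rows, expected counts can fall below conventional thresholds, in which case an exact test is often preferred over the chi-square approximation.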
Discussion
The preliminary analysis confirmed that the research intervention, the Triple-Focus Reading Program, was associated with reliable gains in reading achievement that were evident on multiple dimensions of reading skill. Across 14 reading outcomes, ranging from experimental measures of skills targeted for instruction (e.g., Sound Combinations, multisyllabic Challenge Word reading, Multiple Definitions vocabulary knowledge) to standardized measures of word identification, word attack, word reading efficiency, and reading comprehension, children who received the Triple-Focus intervention substantially out-performed those in the control condition. Effect sizes (Cohen’s d) ranged from .57 to 1.82, with an average effect size of 0.99 and a median effect size of .84. These effect sizes, achieved after only 125 hours of instruction (approximately 7 months chronological time), are comparable to those reported by Connor et al. (2013) comparing three years of ISI intervention to three years of control placement. The present effect sizes surpass most of those reported in the meta-analysis conducted by Wanzek and Vaughn (2007), in which effects were generally greater for children receiving intervention in Kindergarten and Grade 1 (average e.s. ranging from .31 to .84) than in Grades 2 or 3 (.23 to .27).
The efficacy of the Triple-Focus Reading intervention was anticipated because its component programs (PHAST and RAVE-O) had been rigorously evaluated against two control groups in a previous multi-site study with Grades 2 and 3 children with reading disability (Morris et al., 2012). Both multiple component programs shared an emphasis on phonology, orthography, and morphology, and both included specific motivational and metacognitive components in their design. The two programs offered the same base of phonological reading intervention (PHAB/DI), but differed in some other areas of cognitive-linguistic focus. The RAVE-O Program provided instruction on several linguistic aspects of word knowledge (e.g., semantic depth and flexibility, lexical retrieval, syntactic and morpho-syntactic structure) and offered many game-like practice opportunities to build engagement with language learning. The PHAST program provided a metacognitive approach to decoding, with attention paid to sub-syllabic orthographic patterns, variable vowel pronunciations, affixes, and the direct teaching of five word identification strategies, along with a plan for their implementation, monitoring, and evaluation. These two research-based intervention programs were associated with superior outcomes relative to controls on multiple standardized reading achievement tests at posttest, and participants continued to demonstrate a significant advantage a full year after intervention ended (Morris et al., 2012). The superiority of the PHAST and PHAB/DI + RAVE-O programs was replicated across multiple measures of reading and spelling achievement at one-year follow-up. In the present design, these multidimensional programs were integrated and extended to form the Triple-Focus program, and given our previous evidence, there was ample reason to believe that the new intervention would have efficacy for struggling readers in the early grades.
In this previous work, interventions offered only 70 hours of small-group instruction. Sampling was conducted according to a 2 × 2 × 2 factorial design such that every treatment group included equal numbers of Caucasian and Black children, children from average or below-average family socio-economic circumstances, and children of average or below-average IQ (IQs 70 – 89). In that study, program benefits generalized to a much broader sample of disabled readers than is typically evaluated. These multidimensional, systematic and intense, linguistically-motivated reading interventions were associated with positive outcomes for young children with RD, of high and low IQ, and from a range of ethnic backgrounds and environmental circumstances (Morris et al., 2012).
In the present research, 125 hours of small-group instruction was offered over the course of 1st, 2nd, or 3rd grade, allowing an integration of the PHAST and RAVE-O components and further development of reading comprehension instruction. Of primary interest was the question of whether grade at intervention would influence intervention outcomes and rate of growth. While robust intervention effects were observed on all 14 outcomes, the interaction between intervention condition and grade-at-intervention was significant for just less than half of these outcomes: For five of the 14 outcomes, outcomes differed according to grade and a 6th outcome was marginal. In five of six cases, these effects concerned acquisition of basic foundational reading skills.
There was powerful evidence of an early intervention advantage on most basic word reading skills assessed: For word attack, word identification (WRMT-R), and sight word reading efficiency (TOWRE), intervention in Grades 1 and 2 was associated with greater gains than in Grade 3. The only standardized word reading measure that did not demonstrate this advantage was WRAT-3 Reading. Phonological decoding training appeared to benefit all grades equally on measures of letter-sound combination knowledge and nonword reading efficiency (TOWRE decoding), although on one central measure of decoding skills (WRMT Word Attack), 1st and 2nd grade Triple children were at a substantial advantage relative to 3rd grade Triple children.
After controlling for pretest, the average posttest difference between intervention and control Grade 1/2 participants on Word Attack was 20.6 W-scores, relative to 12.1 for Grade 3 participants. On Word Identification (28.7 vs. 14.5 W-scores) and TOWRE sight words outcomes (11.9 vs. 3.9 words), the two younger grades demonstrated posttest advantages relative to controls two to three times as great as those in Grade 3. These grade-by-intervention interaction effects were substantial: after accounting for the effects of assignment to condition, grade-by-intervention effects accounted for an average of 54% of the explainable variation in growth rates.
These data provide evidence in support of the efficacy of early intervention within the 1st or 2nd grade of elementary school. This result is of practical significance given the still prevailing stance of some school districts to delay detailed assessment until a child reaches 3rd grade with persisting academic problems. These results are consistent with those of Connor and colleagues (Connor et al., 2013), who found a 1st grade advantage for students receiving only one year of the ISI intervention relative to those whose single year of ISI occurred in 2nd or 3rd grade. Connor et al. noted, however, that their 1st grade advantage was inconsistent and not replicated for students receiving two years of ISI intervention. In this case, students receiving ISI in 1st and 3rd grades outperformed those with ISI in 1st and 2nd or 2nd and 3rd grades.
In the present data, the early intervention effect was not replicated on three other outcomes for which a significant intervention x grade interaction was revealed. As the word literacy outcome became more complex, different grade x intervention patterns emerged. On two measures relevant to specific metacognitive and metalinguistic instruction in the Triple-Focus Program, multisyllabic word identification (Challenge Words) and the ability to provide multiple definitions of multiple-meaning vocabulary (Multiple Definitions), 2nd grade Triple children demonstrated a greater intervention-control posttest advantage than 1st grade Triple children. Finally, on an orthographic awareness or spelling recognition measure (PIAT Spelling), 3rd grade Triple participants achieved a greater posttest advantage relative to controls than 1st and 2nd grade Triple participants. Each of these outcome measures requires awareness of and a capacity to manipulate linguistic components of written language beyond the phonological domain. Challenge Words taps morphological awareness and an ability to work with bound morphemes; Multiple Definitions taps morpho-syntactic and semantic awareness and flexibility; and PIAT Spelling taps orthographic awareness. These findings are among the first to attempt to examine developmental effects in disabled readers’ response to intervention according to the complexity of the component reading skills being assessed.
Perhaps the most complex aspects of reading development involve the comprehension of connected text. In this regard, it is of interest that no intervention condition x grade interactions were found on any of the three standardized reading comprehension tests included in the pre- and posttest battery. Substantial intervention effects were revealed on all three comprehension outcomes, with large effect sizes reported (GORT Comprehension d = .90, SRI Comprehension d = .64, WRMT-R Passage Comprehension d = .63). Similarly, a measure of text reading rate (GORT Rate d = .78) demonstrated a reliable posttest advantage for the Triple intervention participants, but no interaction with grade.
It is difficult to know whether this pattern truly reflects no grade differences in how the Triple intervention affected reading comprehension performance. The failure to observe developmental response differences on these text reading measures may reflect instead the current state of measurement for more complex dimensions of reading skill like text comprehension and reading fluency. It is acknowledged that traditional measures assess somewhat crudely the products of reading comprehension, that is, what is understood after a text is read. Different limitations of these standardized reading comprehension tests have been extensively discussed, including inadequate content validity, concurrent validity, task sensitivity, and an imbalance in the type of comprehension questions included (Cain & Oakhill, 2006; Cutting & Scarborough, 2006; Keenan, Betjemann, & Olson, 2008; Kendeou, Papadopoulos, & Spanoudis, 2011; Morsy, Kieffer, & Snow, 2010).
These concerns are particularly acute when attempting to assess comprehension of text by beginning readers whose skills are undergoing rapid developmental change. Kendeou and colleagues reported a longitudinal study comparing different comprehension measures widely used in the early grades (Kendeou et al., 2011). These investigators demonstrated that these tests vary in the processing demands they make on young readers’ component reading-related skills (e.g., vocabulary, orthographic processing, rapid naming, phonological processing, working memory, fluency), skills that are developing rapidly during the early grades. Of relevance to the present work, one of the tests used here as an outcome measure, Passage Comprehension, was found to exert particular processing demands on orthographic processing and working memory, and less on phonological decoding. It should be noted, however, that the Kendeou study sample included typically developing Greek children, and that phonological decoding is typically mastered early in reading development in languages like Greek with highly consistent letter-sound mapping.
The growth curve analyses undertaken with the present data allowed two types of determination of individual differences effects: a) an examination of predictors of growth during the intervention period, with Triple-specific determinants of growth revealed through interactions between the predictor and intervention condition (Triple vs. control); and b) predictors of growth following intervention for children who had received the Triple. Control children did not contribute to the follow-up data since after posttest they received the Triple reading intervention.
Compatible with evidence establishing phonological awareness and rapid naming speed as predictors of reading achievement in young readers (Kirby, Parrila, & Pfeiffer, 2003; Manis, Doi, & Bhadha, 2000; Parrila, Kirby, & McQuarrie, 2004), these two factors were related to pretest reading skill for the present sample. These individual difference factors were not, however, consistently related to rate of growth during intervention or over the follow-up years.
In contrast, and across seven of the eight outcomes analyzed, IQ interacted with intervention condition on rates of growth during the intervention period. Specifically, better rates of growth in the Triple intervention, relative to controls, were seen among those participants with lower WASI IQ estimates. Another way of expressing this interaction is that the difference in growth rate between Triple and Control children was greatest for children with lower WASI scores at entry. This is put into context by the fact that Control participants with lower WASI IQs did not demonstrate growth across the intervention period, in marked contrast to higher-WASI Control children (see WRMT-R Word Attack growth, Figure 3). Post hoc examination revealed parallel slopes (rates of growth) across Triple participants with higher and lower WASI scores, indicating that equal growth was attained in the intervention for lower- and higher-IQ children. These data suggest that the provision of systematic, linguistically-informed, and intense reading intervention is particularly critical for struggling readers with lower overall cognitive and language functioning, and that children of varying cognitive profiles at entry were able to profit equally from the Triple instruction.
Two other individual differences factors emerged to be of interest. The first was an estimate of vocabulary knowledge (PPVT), and this factor was associated with a different pattern than that seen for WASI IQ. In this case, the difference in growth rates between Triple and control children was greater for those children demonstrating relatively stronger vocabulary skills at entry. Greater growth in intervention on SRI Comprehension was observed for Triple children with relatively better vocabulary skills. Similarly, greater continued growth on four outcomes including single-word reading, comprehension, and fluency during the follow-up period was revealed for high-vocabulary Triple children. It is not surprising that vocabulary knowledge is related to growth in and development of comprehension skills; there is evidence of substantial correlations between estimates of vocabulary knowledge and reading ability (Baumann, Kame’enui, & Ash, 2003; Kamil, 2004; Nagy, 2007).
An unexpected predictor of growth during the follow-up period emerged for four of eight outcomes: visual sequential memory performance predicted rate of growth after intervention ended. Triple children with higher visual sequential memory scores at pretest made greater continued growth on word attack and nonword reading efficiency standardized measures, on multisyllabic challenge word reading, and on GORT Reading Rate over the follow-up year. Verbal working memory is more typically related to differences in reading growth among children and youth, but recent evidence by Pham and Hasson (2014) suggests that visual spatial working memory also significantly predicts reading achievement in children. Swanson (2000, 2010) has speculated that any advantage that visual spatial working memory may give to children with reading disabilities may vary according to the processing demands that reading places on different components of the working memory system.
Grade at intervention continued to exert an influence in predicting differences in rate of growth during the follow-up period. While many of the intervention x grade interactions in the first analysis of posttest change revealed an advantage for Triple participants in Grades 1 and 2, a more specific advantage for 1st graders was found in follow-up growth rates. Children who received the Triple intervention in 1st grade continued to grow during the following three years at faster rates than children who received the intervention in 2nd grade. The grade x follow-up effect was found consistently across outcome measures, with six of the eight demonstrating this robust effect.
The latter observation of superior growth after treatment for our 1st grade sample is reinforced by examination of the normalization rates achieved on different dimensions of reading development by children receiving intervention in Grades 1, 2, or 3. On standardized tests of word attack, word identification, and passage comprehension, significantly greater normalization was attained by Triple participants in every grade, with the exception of a greater but nonsignificant advantage of Triple over control children in Grade 3. On the word attack measure, 78% of Triple 1st graders scored within the average range at posttest, 50% of Triple 2nd graders, and 38% of Triple 3rd graders. In contrast, control participants were normalized at the following rates in these grades: 28% 1st grade, 0% 2nd grade and 6% 3rd grade. Although relatively fewer participants were normalized on the SRI Reading Quotient, 40% of Triple 1st graders and 24% of Triple 2nd graders scored within the average range at posttest compared to 12% and 0% of their control peers.
These data provide further support for the clear benefits of early intervention, particularly in 1st grade. These findings are compatible with those recently reported by Al Otaiba and colleagues (Al Otaiba et al., 2014). These investigators compared two response-to-intervention (RTI) models implemented in 34 first grade classrooms using a randomized controlled design. A typical RTI procedure, which deferred further intervention until Tier 1 response was measured, was compared with a dynamic RTI model that provided Tier 2 or Tier 3 intervention immediately based on children’s screening results. The interventions differed only with respect to when intervention began. Children in the dynamic RTI condition had significantly higher reading achievement at the end of 1st grade than children in the typical RTI condition. As with the present results, these findings suggest that delaying intervention for struggling early readers is not associated with any advantage for the children; to the contrary, the best outcomes are seen with intervention that begins in Kindergarten or Grade 1. As Al Otaiba and colleagues indicate, any effect of false negatives seems negligible. And as our follow-up data demonstrate, superior growth for 1st graders on foundational reading skills continues over three years after the intervention ends.
Limitations
Clear limitations characterize the present study and qualify the findings. The most obvious concerns the inability to randomly assign participants to intervention or control conditions. Quasi-experimental research designs lack the credibility of RCTs with respect to assessing causality. An inability to randomly assign participants to treatment and control conditions is not uncommon in clinical and applied research settings, however (Gliner & Morgan, 2000; Harris, McGregor, Perencevich, Furuno, Zhu, Peterson, & Finkelstein, 2006). The use of both repeated measurement and a comparison group makes it easier to avoid certain threats to validity within a quasi-experimental design. In that regard, the present study provided compelling evidence of the comparability of intervention and control groups on all selection criteria, and on all pretest and demographic measures.
Another limitation concerns the unequal sample sizes for the intervention and control groups, and resulting imbalance across intervention and control conditions within each grade. In our study, the lower control group numbers were associated with difficulty enrolling control children with reading disabilities who would be required to wait a full year before receiving the intervention.
Related and equally important issues qualify interpretation of the follow-up data. First, no follow-up data are available for control participants. The failure to follow control children untreated over a follow-up period was due to the ethical need to offer the Triple intervention to control children immediately following their posttest assessment. Although Triple placement could not be arranged for all control children for logistical reasons (school location, transportation, etc.), no follow-up assessment was conducted for them because of the intent to offer intervention. This necessitated two separate analyses to consider predictors of outcomes immediately following intervention and then in the years after intervention ended. Second, decreased numbers were available for follow-up analyses due to attrition of the intervention sample over the follow-up years. Of the intervention participants, follow-up data were collected on 30.2% at Follow-up Year 1, 30.2% at Follow-up Year 2, and 16.3% at Follow-up Year 3. Follow-up was only conducted until the end of 4th grade, and so follow-up opportunities decreased with intervention at later grades. These limitations qualify the conclusions that may be offered on the basis of these follow-up data.
A third limitation also concerns the control comparison available in this curricular control design. Because early intervention and RTI initiatives were less prevalent during the time period of the present study, it is possible that some control participants received less reading instruction overall than those in the intervention group. Some intervention children received whole class reading instruction in addition to the small group Triple intervention program. In these cases, because small group intervention is considered a good vehicle for intensifying reading instruction for struggling readers, it is difficult to determine whether these intervention participants’ post-intervention superiority can be attributed to the research-based Triple intervention itself, to the additional amount of reading instruction offered, or to the increased individual attention that small group programs can afford. This concern is attenuated somewhat by two previous RCTs demonstrating efficacy of the components of the Triple-Focus intervention relative to other intervention conditions offering additional reading instruction in groups of equal size (Lovett et al., 2000; Morris et al., 2012).
In addition, it should be noted that the majority of intervention children (approximately 75%) attended their Triple-Focus class during the time of whole class literacy instruction. These intervention sessions were scheduled at the school’s discretion and many school boards preferred that classes occur while regular classroom reading instruction was occurring. Where this scheduling was not possible, schools generally elected to have children come to the program during art, science, and social science instruction.
An added limitation relevant to the control participants was the lack of specific information on what type of literacy instruction they received in their schools. The majority of control children (81%) were from Toronto and surrounding area schools, and an eclectic approach to literacy instruction was used, largely at the teacher’s discretion. The Ministry of Education in the Province of Ontario provided general instructional guidelines but did not endorse any particular reading program or instructional approach during those years. Children in the elementary grades received 90 minutes of literacy instruction daily, covering reading and writing activities, and including a range of approaches. The same 90-minute block of literacy instruction also characterized our schools in Atlanta and Boston.
Other limitations relate to the context and time within which these data were collected. The study was undertaken in two American cities (Boston, Atlanta) and one Canadian city (Toronto). The study was conducted during a time when NCLB legislation in the US may have affected the instructional practices of teachers in early reading and math instruction, and this may have disproportionately affected control participants from US sites. As noted above, however, 81% of control children were from Toronto schools. Although it could be speculated that Canadian schools were not as influenced by the instructional emphases encouraged by NCLB and therefore at a disadvantage, the superiority of Canadian students’ reading achievement results over those of their American peers in international comparisons (PISA) might assuage such concern. Students in Canada outrank those in the United States in reading, math, and science on PISA testing. Canada is ranked 7th in the world, while the US is ranked 24th on PISA reading scores (OECD, 2013). Although conclusions from this research are necessarily qualified by all these contextual and design factors, it is unlikely that the preponderance of Canadian controls biased intervention findings in a positive direction.
Finally, because this study was conducted in three sites with quite different school calendars, there was a difference between sites in the ability to complete 125 hours of intervention within the school year. The full 125 hours of instruction were implemented as planned for 68% of the intervention sample (n = 117). The remaining 55 intervention children received an average of 104.5 hours of instruction (SD = 14.5; range = 70 to 124 hours). Data density therefore varied considerably across testing time-points and, as expected, fewer participants were available for follow-up assessment. As in other long-term studies, attrition occurred during the follow-up years.
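The dosage figures reported above can be combined into a single sample-wide estimate. The short Python sketch below is illustrative only: it assumes that the two subgroups (117 children at the full 125 hours, 55 children averaging 104.5 hours) together make up the whole intervention sample, and the variable and function names are hypothetical. The resulting mean is a derived approximation, not a value reported in the study.

```python
# Figures taken from the text: 117 children received the full 125 hours;
# the remaining 55 children averaged 104.5 hours.
FULL_N, FULL_HOURS = 117, 125.0
PARTIAL_N, PARTIAL_MEAN_HOURS = 55, 104.5


def weighted_mean_hours(full_n, full_hours, partial_n, partial_mean):
    """Sample-wide mean dosage as a weighted average of the two subgroups."""
    total_hours = full_n * full_hours + partial_n * partial_mean
    return total_hours / (full_n + partial_n)


total_n = FULL_N + PARTIAL_N        # size of the intervention sample
share_full = FULL_N / total_n       # proportion receiving all 125 hours
mean_hours = weighted_mean_hours(FULL_N, FULL_HOURS, PARTIAL_N, PARTIAL_MEAN_HOURS)

print(f"n = {total_n}, full dosage = {share_full:.0%}, mean hours = {mean_hours:.1f}")
```

Under these assumptions, the intervention sample as a whole received roughly 118 hours of instruction on average, which is consistent with the 100–125 hour range cited in the Conclusion.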
Conclusion
In conclusion, the present study contributes more evidence on the relative importance of the timing of early intervention for reading problems in the primary grades. Although the Triple-Focus intervention was associated with benefits for struggling readers across 1st, 2nd, and 3rd grades, on all reading and reading-related outcomes, there was a marked advantage on some outcomes for early intervention. Children who received intervention earlier, in 1st and 2nd grade, made gains relative to control children almost twice that of children receiving intervention in 3rd grade on foundational word reading skills such as word attack, word identification, and sight word efficiency. On follow-up testing, the advantage of 1st grade intervention was even clearer: First graders in the Triple condition continued to grow at faster rates over the follow-up years than 2nd graders on six of eight reading outcomes (word attack, passage comprehension, sight word and phonemic reading efficiency, multisyllabic challenge word reading, and GORT reading rate). Normalization rates indicated that a majority of first graders in the Triple intervention improved and achieved age-appropriate performance scores at posttest on the WRMT reading achievement subtests. These findings suggest that the cost of investing in first grade intervention, using an instructional vehicle with demonstrated efficacy, is offset by the substantial immediate gains and by benefits still evident years after the intervention ends. The substantial effect sizes attained with provision of 100–125 hours of intervention provide compelling evidence for the early intervention position.
Finally, the present study is one of the first to examine grade effects in intervention response according to different types of reading outcomes. Evidence was provided of developmental differences in intervention response according to the complexity of the component reading skills being evaluated. On two measures relevant to metacognitive and metalinguistic aspects of the Triple instruction (Challenge Words, Multiple Definitions), 2nd grade Triple children demonstrated a greater posttest advantage relative to controls than 1st grade Triple children. On an orthographic awareness measure (PIAT spelling), 3rd grade Triple children achieved a greater posttest advantage over controls than the 1st or 2nd grade participants. On these outcome measures that require an ability to manipulate linguistic components of written language beyond the phonological dimension, 2nd and 3rd graders enjoyed some intervention advantage. On tests of reading comprehension, however, despite robust intervention effects and large effect sizes, no intervention-by-grade interactions were revealed. This may be attributable to difficulties in reading comprehension measurement for this age and level of reading skill.
Acknowledgements
The research reported here was supported by a National Institute of Child Health and Human Development Grant (HD30970) to Georgia State University, Tufts University, and The Hospital for Sick Children/University of Toronto.
In Atlanta, Boston, and Toronto, we are especially grateful to the 237 children who participated in this research for their interest, enthusiasm, and effort. We also gratefully acknowledge their parents, teachers, and schools for their commitment and support during the course of this study. The cooperation and contribution of the principals and staff of all the participating schools, all of whom offered space and opportunities for our programs, are much appreciated.
In Atlanta, we acknowledge the students and their families from the Fulton County School system, administrators, participating schools, their principals and staffs, particularly Susan Grabel; our research teachers: Eileen Cohen, Mary Bucklen, Kim Imbrecht, Victoria Burke, Heather Lubeck, Cashawn Myers, Judith Mahoney, and Nioyonu Olutosin; and members of our research team: Paul Cirino, Marla Shapiro, Cynthia Martin, Nicole Mickley, Becky Doyle, Justin Wise, Hye K. Pae, Jennifer Harrison, and Laina Jones.
In Boston, we acknowledge the students and their families from Somerville, Medford, and Newton School systems, administrators, particularly Alice O’Rourke and Roy Belson, participating schools, their principals and staffs; our research teachers: Katharine Donnelly-Adams, Terry Joffe Benaryeh, Joanna Christodoulou, Fran Lunney, Anne Knight, Jill Ludmar, Jane Hill-Lovins, and Andrea Marquant; and members of our research team: Beth O’Brien, Cathy Moritz, Julie Jeffery, Lynne Miller, Alyssa Goldberg O’Rourke, Chip Gidney, Wendy Galante, Tami Katzir, Alexis Berry, Laura Vanderberg, Ellen Boiselle, Sasha Yampolsky, and Gordon Goodman.
In Toronto, we acknowledge the children and their families from the Toronto District School Board, the Toronto Catholic District School Board, and the Peel District School Board; our Senior Teacher Trainer and Program Developer, Léa Lacerenza; our research teachers: Denis Murphy, Jody Chong, Tammy Cohen, Vicky Grondin, and Steacy O’Connor; our psychology assessment team: Jennifer Janes, Jennifer Goudey, Leslie Daniels, Jennifer McTaggart, and Jennifer Lasenby; and our senior research team members: Maria De Palma, Meredith Temple, and the late Dr. Nancy Benson.
Footnotes
In Year 1, a decision was made to start as many intervention classes as possible in the two sites developing content for programming (Toronto, Boston).
Contributor Information
Maureen W. Lovett, The Hospital for Sick Children and the University of Toronto
Jan C. Frijters, Brock University
Maryanne Wolf, Tufts University
Karen A. Steinbach, The Hospital for Sick Children
Rose A. Sevcik, Georgia State University
Robin D. Morris, Georgia State University
References
- Al Otaiba S. (2000). Children who do not respond to early literacy instruction: A longitudinal study. Unpublished doctoral dissertation. Vanderbilt University. Nashville, TN.
- Al Otaiba S, Connor CM, Folsom JS, Wanzek J, Greulich L, Schatschneider C, & Wagner RK (2014). To wait in Tier 1 or intervene immediately: A randomized experiment examining first grade response to intervention (RTI) in reading. Exceptional Children, 81(1), 11–27. doi: 10.1177/0014402914532234
- Al Otaiba S, & Fuchs D. (2002). Characteristics of children who are unresponsive to early literacy intervention: A review of the literature. Remedial and Special Education, 23, 300–316.
- Al Otaiba S, & Fuchs D. (2006). Who are the young children for whom best practices in reading are ineffective? An experimental and longitudinal study. Journal of Learning Disabilities, 39(5), 414–431.
- Bauer DJ, Sterba SK, & Hallfors DD (2008). Evaluating group-based interventions when control participants are ungrouped. Multivariate Behavioral Research, 43(2), 210–236.
- Baumann JF, Kame’enui EJ, & Ash GE (2003). Research on vocabulary instruction: Voltaire redux. In Flood J, Jensen JM, Lapp D. & Squire JR (Eds.), Handbook of Research in Teaching the English Language Arts (2nd ed., pp. 752–785). New York, NY: MacMillan.
- Benjamini Y, & Hochberg Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289–300.
- Berninger VW, Abbott RD, Brooksher R, Lemos A, Ogier S, Zook D, & Mostafapour E. (2000). A connectionist approach to making the predictability of English orthography explicit to at-risk beginning readers: Evidence for alternative, effective strategies. Developmental Neuropsychology, 17(2), 241–271.
- Berninger VW, Abbott RD, Vermeulen K, Ogier S, Brooksher R, Zook D, & Lemos Z. (2002). Comparison of faster and slower responders to early intervention in reading: Differentiating features of their language profiles. Learning Disability Quarterly, 25(1), 59–76.
- Berninger VW, Nagy WE, Carlisle JF, Thomson JB, Hoffer D, Abbot S, … Aylward EH (2003). Effective treatment for children with dyslexia in Grades 4–6: Behavioral and brain evidence. In Foorman BR (Ed.), Preventing and Remediating Reading Difficulties: Bringing Science to Scale (pp. 381–417). Timonium, MD: York Press, Inc.
- Blishen BR, Carroll WK, & Moore C. (1987). The 1981 socioeconomic index for occupations in Canada. Canadian Review of Sociology and Anthropology, 24, 465–488.
- Bowers L, Huisingh R, Johnson PF, LoGiudice C, & Orman J. (2004). The Word Test 2: Elementary. East Moline, IL: LinguiSystems.
- Cain K, & Oakhill JV (2006). Profiles of children with specific reading comprehension difficulties. British Journal of Educational Psychology, 76(Pt 4), 683–696.
- Cirino PT, Chin CE, Sevcik RA, Wolf MA, Lovett MW, & Morris RD (2002). Measuring socioeconomic status: Reliability and preliminary validity for different approaches. Assessment, 9(2), 145–155.
- Compton DL, Miller AC, Elleman AM, & Steacy LM (2014). Have we forsaken reading theory in the name of “quick fix” interventions for children with reading disability? Scientific Studies of Reading, 18, 55–73. doi: 10.1080/10888438.2013.836200
- Connor CM, Morrison FJ, Fishman B, Crowe EC, Al Otaiba S, & Schatschneider C. (2013). A longitudinal cluster-randomized controlled study on the accumulating effects of individualized literacy instruction on students’ reading from first through third grade. Psychological Science, 24(8), 1408–1419. doi: 10.1177/0956797612472204
- Connor CM, Morrison FJ, Fishman B, Giuliani S, Luck M, Underwood PS, … Schatschneider C. (2011). Testing the impact of child characteristics x instruction interactions on third graders’ reading comprehension by differentiating literacy instruction. Reading Research Quarterly, 46(3), 189–221.
- Connor CM, Morrison FJ, Fishman BJ, Schatschneider C, & Underwood P. (2007). The early years: Algorithm-guided individualized reading instruction. Science, 315(5811), 464–465.
- Connor CM, Morrison FJ, Schatschneider C, Toste J, Lundblom E, Crowe EC, & Fishman B. (2011). Effective classroom instruction: Implications of child characteristics by reading instruction interactions on first graders’ word reading achievement. Journal of Research on Educational Effectiveness, 4(3), 173–207.
- Cutting LE, & Scarborough HS (2006). Prediction of reading comprehension: Relative contributions of word recognition, language proficiency, and other cognitive skills can depend on how comprehension is measured. Scientific Studies of Reading, 10(4), 277–299.
- Denton CA, Fletcher JM, Anthony JL, & Francis DJ (2006). An evaluation of intensive intervention for students with persistent reading difficulties. Journal of Learning Disabilities, 39, 447–466.
- Dunn LM, & Dunn LM (1997). Peabody Picture Vocabulary Test, 3rd ed. (PPVT-III). Circle Pines, MN: American Guidance Service.
- Edmonds MS, Vaughn S, Wexler J, Reutebuch CK, Cable A, Tackett KK, & Wick Schnakenberg J. (2009). A synthesis of reading interventions and effects on reading comprehension outcomes for older struggling readers. Review of Educational Research, 79(1), 262–300.
- Ehrhardt J, Huntington N, Molino J, & Barbaresi WJ (2013). Special education and later academic achievement. Journal of Developmental and Behavioral Pediatrics, 34(2), 111–119.
- Entwisle DR, & Astone NM (1994). Some practical guidelines for measuring youth’s race/ethnicity and socioeconomic status. Child Development, 65(6), 1521–1540.
- Fletcher JM, Lyon GR, Fuchs LS, & Barnes MA (2007). Learning Disabilities: From Identification to Intervention. New York, NY: Guilford Press.
- Fletcher JM, & Vaughn S. (2009). Response to intervention: Preventing and remediating academic difficulties. Child Development Perspectives, 3(1), 30–37.
- Foorman BR, & Al Otaiba S. (2009). Reading remediation: State of the art. In Pugh KR & McCardle P. (Eds.), How Children Learn To Read: Current Issues and New Directions in The Integration of Cognition, Neurobiology and Genetics of Reading and Dyslexia Research and Practice (Extraordinary Brain Series) (pp. 257–274). New York, NY: Psychology Press.
- Foorman BR, Francis DJ, Fletcher JM, Schatschneider C, & Mehta P. (1998). The role of instruction in learning to read: Preventing reading failure in at-risk children. Journal of Educational Psychology, 90(1), 37–55.
- Foorman BR, Francis DJ, Winikates D, Mehta P, Schatschneider C, & Fletcher JM (1997). Early interventions for children with reading disabilities. Scientific Studies of Reading, 1(3), 255–276.
- Frijters JC, Lovett MW, Sevcik RA, & Morris RD (2013). Four methods of identifying change in the context of a multiple component reading intervention for struggling middle school readers. Reading and Writing: An Interdisciplinary Journal, 26(4), 539–563. doi: 10.1007/s11145-012-9418-z
- Frijters JC, Lovett MW, Steinbach KA, Wolf MA, Sevcik RA, & Morris R. (2011). Neurocognitive predictors of reading outcomes for children with reading disabilities. Journal of Learning Disabilities, 44(2), 150–166.
- Fuchs D, & Fuchs LS (1998). Researchers and teachers working together to adapt instruction for diverse learners. Learning Disabilities Research and Practice, 13, 126–137.
- Fuchs D, & Fuchs LS (2005). Peer-assisted learning strategies: Promoting word recognition, fluency, and reading comprehension in young children. Journal of Special Education, 39(1), 34–44.
- Fuchs D, Fuchs LS, Mathes PG, & Simmons D. (1997). Peer-assisted learning strategies: Making classrooms more responsive to diversity. American Educational Research Journal, 34(1), 174–206.
- Gamse BC, Jacob RT, Horst M, Boulay B, & Unlu F. (2008). Reading First Impact Study Final Report Executive Summary (NCEE 2009–4039). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, US Department of Education.
- Gardner MF (1996). Test of Visual Perceptual Skills (Non-Motor)-Revised (TVPS-R). San Francisco, CA: Psychological and Educational Publications.
- Gliner JA, & Morgan GA (2000). Research methods in applied settings: An integrated approach to design and analysis. Mahwah, NJ: Lawrence Erlbaum.
- Glover TA, & Vaughn S. (2010). The Promise of Response to Intervention: Evaluating the Current Science and Practice. New York, NY: The Guilford Press.
- Hanushek EA, Kain JF, & Rivkin SG (1998). Does special education raise academic achievement for students with disabilities? Paper presented at the National Bureau of Economic Research, Working Paper No. 6690, Cambridge, MA.
- Harris AD, McGregor JC, Perencevich EN, Furuno JP, Zhu J, Peterson DE, & Finkelstein J. (2006). The use and interpretation of quasi-experimental studies in medical informatics. Journal of the American Medical Informatics Association, 13(1), 16–23.
- Hollingshead AB (1975). Four factor index of social status. Unpublished manuscript. Yale University, New Haven, CT.
- Hox JJ (2010). Multilevel Analysis: Techniques and Applications (2nd ed.). New York, NY: Routledge.
- Hunter JE, & Schmidt FL (2004). Methods of Meta-Analysis: Correcting Error and Bias in Research Findings (2nd ed.). Thousand Oaks, CA: Sage.
- Kamil ML (2004). The current state of quantitative research. Reading Research Quarterly, 39(1), 100–108.
- Kaufman AS, & Kaufman AL (1990). Kaufman Brief Intelligence Test. Circle Pines, MN: American Guidance Service, Inc.
- Kendeou P, Papadopoulos TC, & Spanoudis G. (2012). Processing demands of reading comprehension tests in young readers. Learning and Instruction, 22(5), 354–367.
- Kirby JR, Parrila R, & Pfeiffer S. (2003). Naming speed and phonological processing as predictors of reading development. Journal of Educational Psychology, 95, 453–464.
- Leach JM, Scarborough HS, & Rescorla L. (2003). Late-emerging reading disabilities. Journal of Educational Psychology, 95(2), 211–224.
- Lovett MW, Barron RW, & Frijters JC (2013). Word identification difficulties in children and adolescents with reading disabilities: Intervention research findings. In Swanson HL, Harris K. & Graham S. (Eds.), Handbook of Learning Disabilities (2nd ed., pp. 329–360). New York, NY: Guilford Press.
- Lovett MW, Borden SL, DeLuca T, Lacerenza L, Benson NJ, & Brackstone D. (1994). Treating the core deficits of developmental dyslexia: Evidence of transfer-of-learning following phonologically- and strategy-based reading training programs. Developmental Psychology, 30(6), 805–822.
- Lovett MW, Lacerenza L, & Borden SL (2000). Putting struggling readers on the PHAST track: A program to integrate phonological and strategy-based remedial reading instruction and maximize outcomes. Journal of Learning Disabilities, 33(5), 458–476.
- Lovett MW, Lacerenza L, Borden SL, Frijters JC, Steinbach KA, & De Palma M. (2000). Components of effective remediation for developmental reading disabilities: Combining phonological and strategy-based instruction to improve outcomes. Journal of Educational Psychology, 92(2), 263–283.
- Lovett MW, & Steinbach KA (1997). The effectiveness of remedial programs for reading disabled children of different ages: Does the benefit decrease for older children? Learning Disability Quarterly, 20(3), 189–210.
- Manis FR, Doi LM, & Bhadha B. (2000). Naming-speed, phonological awareness, and orthographic knowledge in second graders. Journal of Learning Disabilities, 33(4), 325–333, 374.
- Mason LH (2004). Explicit self-regulated strategy development versus reciprocal questioning: Effects on expository reading comprehension among struggling readers. Journal of Educational Psychology, 96(2), 283–296.
- Mathes PG, Denton CA, Fletcher JM, Anthony JL, Francis DJ, & Schatschneider C. (2005). The effects of theoretically different instruction and student characteristics on the skills of struggling readers. Reading Research Quarterly, 40, 148–182.
- Mathes PG, Howard JK, Allen SH, & Fuchs D. (1998). Peer-assisted learning strategies for first-grade readers: Responding to the needs of diverse learners. Reading Research Quarterly, 33(1), 62–94.
- McMaster KN, Fuchs D, Fuchs LS, & Compton DL (2005). Responding to nonresponders: An experimental field trial of identification and intervention methods. Exceptional Children, 71, 445–463.
- Morgan PL, Frisco M, Farkas G, & Hibel J. (2010). A propensity score matching analysis of the effects of special education services. Journal of Special Education, 43(4), 236–254.
- Morris RD, Lovett MW, Wolf MA, Sevcik RA, Steinbach KA, Frijters JC, & Shapiro M. (2012). Multiple-component remediation for developmental reading disabilities: IQ, socioeconomic status, and race as factors in remedial outcome. Journal of Learning Disabilities, 45(2), 99–127. doi: 10.1177/0022219409355472
- Morsy L, Kieffer M, & Snow CE (2010). Measure for Measure: A Critical Consumers’ Guide to Reading Comprehension Assessments for Adolescents. New York, NY: Carnegie Corporation of New York.
- Nagy WE (2007). Metalinguistic awareness and the vocabulary-comprehension connection. In Wagner RK, Muse AE & Tannenbaum KR (Eds.), Vocabulary Acquisition: Implications for Reading Comprehension (pp. 52–77). New York, NY: The Guilford Press.
- Nakao K, & Treas J. (1992). 1989 Socioeconomic Index of Occupations: Construction from the 1989 Occupational Prestige Scores. General Social Survey Methodological Report No 74. Chicago, IL: National Opinion Research Center.
- Nelson JR, Benner GJ, & Gonzalez J. (2005). An investigation of the effects of a prereading intervention on the early literacy skills of children at risk of emotional disturbance and reading problems. Journal of Emotional and Behavioral Disorders, 13(1), 3–12.
- Newcomer P. (1999). Standardized Reading Inventory-2 (SRI-2). Austin, TX: Pro-Ed.
- O’Connor RE (2000). Increasing the intensity of intervention in kindergarten and first grade. Learning Disabilities Research and Practice, 15, 43–54.
- O’Connor RE, Fulmer D, Harty K, & Bell K. (2005). Layers of reading intervention in kindergarten through third grade: Changes in teaching and child outcomes. Journal of Learning Disabilities, 38, 440–455.
- Parrila R, Kirby JR, & McQuarrie L. (2004). Articulation rate, naming speed, verbal short-term memory, and phonological awareness: Longitudinal predictors of early reading development? Scientific Studies of Reading, 8(1), 3–26.
- Pham AV, & Hasson RM (2014). Verbal and visuospatial working memory as predictors of children’s reading ability. Archives of Clinical Neuropsychology, 29(5), 467–477. doi: 10.1093/arclin/acu024
- Raudenbush SW, & Bryk AS (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). Thousand Oaks, CA: Sage Publications, Inc.
- Roberts C, & Roberts SA (2005). Design and analysis of clinical trials with clustering effects due to treatment. Clinical Trials, 2(2), 152–162.
- Scammacca N, Roberts G, Vaughn S, Edmonds M, Wexler J, Reutebuch CK, & Torgesen JK (2007). Interventions for adolescent struggling readers: A meta-analysis with implications for practice. Portsmouth, NH: RMC Research Corporation, Center on Instruction.
- Scanlon DM, & Vellutino FR (1997). A comparison of the instructional backgrounds and cognitive profiles of poor, average, and good readers who were initially identified as at risk for reading failure. Scientific Studies of Reading, 1(3), 191–216.
- Snijders TAB, & Bosker RJ (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). London: Sage Publishers.
- Snow CE (2002). Reading for Understanding: Toward an R&D Program in Reading Comprehension. Santa Monica, CA: RAND Reading Study Group, RAND Corporation.
- Suggate SP (2010). Why what we teach depends on when: Grade and reading intervention modality moderate effect size. Developmental Psychology, 46(6), 1556–1579. doi: 10.1037/a0020612
- Swanson HL (2000). Are working memory deficits in readers with learning disabilities hard to change? Journal of Learning Disabilities, 33(6), 551–566.
- Swanson HL (2010). Does the dynamic testing of working memory predict growth in nonword fluency and vocabulary in children with reading disabilities? Journal of Cognitive Education and Psychology, 9, 51–77.
- Swanson HL, Hoskyn M, & Lee C. (1999). Interventions for Students with Learning Disabilities: A Meta-Analysis of Treatment Outcomes. New York, NY: The Guilford Press.
- Swanson HL, & Saez L. (2003). Memory difficulties in children and adults with learning disabilities. In Swanson HL, Harris KR, & Graham S. (Eds.), Handbook of learning disabilities (pp. 182–198). New York: Guilford.
- Swanson HL, Saez L, & Gerber M. (2006). Growth in literacy and cognition in bilingual children at risk or not at risk for reading disabilities. Journal of Educational Psychology, 98(2), 247–264.
- Swanson HL, & Siegel LS (2001). Learning disabilities as a working memory deficit. Issues in Education: Contributions from Educational Psychology, 7(1), 1–48.
- Torgesen JK (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research and Practice, 15(1), 55–64.
- Torgesen JK, Alexander AW, Wagner RK, Rashotte CA, Voeller KKS, & Conway T. (2001). Intensive remedial instruction for children with severe reading disabilities: Immediate and long-term outcomes from two instructional approaches. Journal of Learning Disabilities, 34(1), 33–58.
- Torgesen JK, Wagner RK, & Rashotte CA (1997). Approaches to the prevention and remediation of phonologically-based reading disabilities. In Blachman BA (Ed.), Foundations of Reading Acquisition and Dyslexia: Implications for Early Intervention (pp. 287–304). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
- Torgesen JK, Wagner RK, & Rashotte CA (1997b). Prevention and remediation of severe reading disabilities: Keeping the end in mind. Scientific Studies of Reading, 1(3), 217–234.
- Torgesen JK, Wagner RK, & Rashotte CA (1999). Test of Word Reading Efficiency (TOWRE). Austin, TX: Pro-Ed Publishing, Inc.
- Torgesen JK, Wagner RK, Rashotte CA, Alexander AW, & Conway T. (1997). Preventive and remedial interventions for children with severe reading disabilities. Learning Disabilities: A Multidisciplinary Journal, 8, 51–62.
- Torgesen JK, Wagner RK, Rashotte CA, Rose E, Lindamood P, Conway T, & Garvan C. (1999). Preventing reading failure in young children with phonological processing disabilities: Group and individual responses to instruction. Journal of Educational Psychology, 91(4), 579–593.
- Vadasy PF, Sanders EA, Peyton JA, & Jenkins JR (2002). Timing and intensity of tutoring: A closer look at the conditions for effective early literacy tutoring. Learning Disabilities Research and Practice, 17(4), 227–241.
- Vaughn S, Chard DJ, Pedrotty-Bryant D, Coleman M, Tyler B-J, Linan-Thompson S, & Kouzekanani K. (2000). Fluency and comprehension interventions for third-grade students. Remedial and Special Education, 21(6), 325–335.
- Vaughn S, & Fuchs LS (2003). Redefining learning disabilities as inadequate response to instruction: The promise and potential problems. Learning Disabilities Research and Practice, 18(3), 137–146.
- Vaughn S, Levy S, Coleman M, & Bos CS (2002). Reading instruction for students with LD and EBD: A synthesis of observation studies. Journal of Special Education, 36, 2–15.
- Vaughn S, Linan-Thompson S, & Hickman P. (2003). Response to instruction as a means of identifying students with reading/learning disabilities. Exceptional Children, 69, 391–409.
- Vaughn S, Wexler J, Roberts G, Barth A, Cirino PT, Romain MA, … Denton CA (2011). Effects of individualized and standardized interventions on middle school students with reading disabilities. Exceptional Children, 77(4), 391–407.
- Vellutino FR, Scanlon DM, Sipay ER, Small SG, Pratt A, Chen R, & Denckla MB (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 88(4), 601–638.
- Wagner RK, Torgesen JK, & Rashotte CA (1999). Comprehensive Test of Phonological Processing (CTOPP). Austin, TX: Pro-Ed.
- Wanzek J, & Vaughn S. (2007). Research-based implications from extensive early reading interventions. School Psychology Review, 36(4), 541–561.
- Wechsler D. (1997). Wechsler Adult Intelligence Scale (3rd Ed.) (WAIS-III). San Antonio, TX: The Psychological Corporation.
- Wechsler D. (1999). Wechsler Abbreviated Scale of Intelligence (WASI). New York, NY: The Psychological Corporation.
- Wiederholt JL, & Bryant BR (2004). Gray Oral Reading Tests—4th Edition (GORT-4). Austin, TX: Pro-Ed.
- Wilkinson GS (1993). Wide Range Achievement Test—3. Wilmington, DE: Jastak Associates.
- Wolf M, Barzillai M, Gottwald S, Miller L, Spencer K, Norton E, Lovett M, & Morris R. (2009). The RAVE-O intervention: Connecting neuroscience to the classroom. Mind, Brain, and Education, 3(2), 84–93.
- Wolf MA, & Denckla M. (2005). The rapid automatized naming and rapid alternating stimulus tests. Austin, TX: Pro-Ed.
- Wolf M, & Katzir-Cohen T. (2001). Reading fluency and its intervention. Scientific Studies of Reading, 5(3), 211–239.
- Wolf M, Miller L, & Donnelly K. (2000). Retrieval, automaticity, vocabulary elaboration, orthography (RAVE-O): A comprehensive, fluency-based reading intervention program. Journal of Learning Disabilities, 33(4), 375–386.
- Woodcock RW (1987). Woodcock Reading Mastery Tests—Revised (WRMT-R). Circle Pines, MN: American Guidance Service.
Associated Data