Author manuscript; available in PMC: 2021 Oct 8.
Published in final edited form as: J Learn Disabil. 2020 Nov 30;54(3):170–186. doi: 10.1177/0022219420972184

A Synthesis of the Sustainability of Remedial Reading Intervention Effects for Struggling Adolescent Readers

Johny Daniel 1, Philip Capin 2, Paul Steinle 2
PMCID: PMC8500577  NIHMSID: NIHMS1744455  PMID: 33251955

Abstract

A majority of reading-related intervention studies aiming to remediate struggling readers’ reading outcomes assess student performance immediately following the conclusion of an intervention to determine intervention effects. Few studies collect follow-up data to measure the long-term sustainability of treatment effects. Hence, the aim of the current synthesis was to examine follow-up intervention effects of reading interventions involving adolescent struggling readers in Grades 6 to 12. Our literature search yielded only 10 studies that reported follow-up data for intervention participants, which highlights the dearth of intervention research that examines sustainability of intervention effects. Of the 10 included studies, the weighted mean effect size for all reading outcome measures was gw = 0.78 at immediate posttest and gw = 0.27 at follow-up, in favor of treatment group students. Although the magnitude of difference between treatment and control groups diminished at follow-up time, a comparison of treatment group students’ immediate posttest and follow-up scores showed that students mostly maintained gains made during intervention at follow-up time points.

Keywords: adolescent struggling readers, learning disabilities, follow-up, maintenance, reading interventions


Socially significant behavior changes are those that last over time, are used by the learner in all relevant settings and situations, and are accompanied by changes in other relevant responses … to perform below this standard is more than just regrettable; it is a clear indication that the initial instruction was not entirely successful.

J. O. Cooper et al. (2008, p. 623).

Reading intervention studies that aim to improve adolescent struggling readers’ reading outcomes generally measure and report the efficacy of an intervention based on students’ immediate posttest reading performance. However, there is inadequate research addressing the sustainability of intervention effects as a measure of intervention effectiveness (Suggate, 2016). In other words, little is known about adolescent struggling readers’ ability to maintain gains made due to interventions. Evaluating student performance at follow-up time points can further demonstrate a program’s effectiveness or, alternatively, detect programs that lead to only short-term gains (Keogh, 2004; Suggate, 2016). More importantly, performance on follow-up tests can add substantial scientific value to the evaluation of reading interventions for adolescent struggling readers.

Effectiveness of Reading Interventions for Adolescent Struggling Readers

A substantial body of research examines the effects of instructional methods for students who struggle to read and comprehend grade-level text in middle and high school (e.g., Boardman et al., 2008; Roberts et al., 2008; Scammacca et al., 2007; Vaughn, Roberts, Wexler, et al., 2015; Vaughn et al., 2019). However, these studies have generally yielded mixed findings. For instance, the Striving Readers Project (Boulay et al., 2015), funded by the Institute of Education Sciences (IES), summarized findings from 17 randomized controlled trials that evaluated the effects of 10 different reading interventions for struggling adolescent readers in Grades 6 through 10 (Cantrell et al., 2010, 2011, 2012; Deussen et al., 2012; Dimitrov et al., 2012; Faddis et al., 2011; Feldman et al., 2011; Hofstetter et al., 2011; Loadman et al., 2011; Meisch et al., 2011; Newman et al., 2012; Schenck et al., 2012; Schiller et al., 2012; Swanlund et al., 2012; The Education Alliance at Brown University, 2012; Tunik et al., 2011; Vaden-Kiernan et al., 2012). The IES report rated each intervention’s effect on student reading outcomes: of the 10 interventions, six showed no discernible effects, three showed positive or potentially positive effects, and one showed mixed effects on students’ reading outcomes.

Similarly, meta-analyses and syntheses that aggregate the results of multiple studies also provide a mixed picture of the effects of secondary reading interventions. Some past meta-analyses have reported moderate effects of reading interventions for struggling readers in upper elementary and later grades: g = 0.41 (Flynn et al., 2012), g = 0.47 (Edmonds et al., 2009), and g = 0.49 (Scammacca et al., 2015). In contrast, Wanzek et al. (2013), measuring the effects of extensive (i.e., comprising 100 or more sessions) reading interventions, reported small effects of interventions on various reading outcomes (g = 0.10–0.16). In addition, Scammacca et al. (2015) disaggregated results of interventions and reported much smaller effects for standardized reading outcome measures (g = 0.21), with multicomponent reading interventions demonstrating the largest positive effect on standardized reading comprehension measures (g = 0.46).

Past systematic reviews and meta-analyses also report on the types of interventions that are most effective in improving reading outcomes for struggling adolescent readers. For instance, Scammacca et al. (2007) reported large effects of comprehension strategy instruction (d = 1.23), vocabulary instruction (d = 1.62), and word study instruction (d = 1.60) on various researcher-developed and standardized reading measures. In a subsequent meta-analysis, Scammacca et al. (2015) reported large effects of vocabulary interventions (d = 1.58) and reading comprehension interventions (d = 0.74) on adolescent struggling readers’ reading outcomes. However, it is important to note that, across these meta-analyses, researchers have generally reported substantial differences in effects between researcher-developed and standardized measures, with greater effects observed on researcher-developed measures (e.g., Edmonds et al., 2009; Scammacca et al., 2015). For instance, Scammacca et al. (2015) reported that, whereas the overall effect size (ES) across all included studies and measures was 0.49, the average reported ES on standardized measures was 0.21.

In a more recent systematic review, Berkeley and Larsen (2018) reviewed the extent to which self-regulation of reading strategies benefited adolescent students with learning disabilities (LD). The researchers (Berkeley & Larsen, 2018) reported that the average effect across 18 studies, on predominantly researcher-developed reading measures, was large at posttest (ES = 1.35). In addition, eight of the 18 included studies reported follow-up data that showed treatment group students continued to exhibit improved performance on reading measures compared with their control group peers. The average follow-up effect was also large (ES = 0.95); however, most studies assessed maintenance effects using researcher-developed measures. This finding is vital in evaluating the benefits of embedding self-regulation elements to make a long-lasting impact on students’ reading performance.

In summary, several past studies have implemented a variety of reading interventions to improve reading outcomes for adolescent struggling readers. One challenge with interpreting the effects of past reading intervention studies is that a majority of interventions, and almost all past systematic reviews of these studies, focus on student performance at the end of the intervention period. Rarely do researchers follow study participants to analyze the long-term effects of interventions. Thus, the goal of this review is to examine the sustainability of reading intervention effects observed at immediate posttest compared with follow-up time points. The current review is also not limited to any one type of reading intervention (e.g., self-regulation strategy; see Berkeley & Larsen, 2018) but aims to evaluate the sustainable effects of a variety of reading interventions that target different components of reading (i.e., comprehension, vocabulary, word reading, and fluency). Follow-up is defined in this review as any data point collected two or more weeks after the end of the original intervention.

Importance of Follow-Up Data

Researchers have advocated for the collection of follow-up data to better assess the effectiveness of educational interventions (Keogh, 2004; Suggate, 2016). Those who collected follow-up data for early elementary reading intervention studies generally reported positive maintenance effects of phonological awareness and phonics instruction on reading outcomes for low performing students in Grades K–3 (e.g., Blachman et al., 2004, 2014; Ryder et al., 2008; Vadasy & Sanders, 2013). These studies have contributed to the growing body of evidence emphasizing the importance of code-oriented instruction in early elementary education, especially for low performing students. Results indicate that benefits of instruction extended from 1 to 10 years after the intervention concluded (e.g., Blachman et al., 2014; Byrne & Fielding-Barnsley, 1993, 1995; Ryder et al., 2008; Vadasy et al., 2006).

Indeed, a recent meta-analysis (Suggate, 2016) of the follow-up effects of reading interventions targeting students in Grades pre-K–6 corroborates the effectiveness of phonological awareness instruction for typical and low performing students. Suggate (2016) identified 71 reading intervention studies, with an average follow-up time of approximately 11 months. Among the 17 phonemic awareness intervention studies identified, the average effect across all reading measures at immediate posttest (dw = 0.43) was mostly sustained at follow-up time points (dw = 0.36). Similarly, averaging across all reading measures, positive effects of comprehension interventions were also sustained from immediate posttest (dw = 0.38) to follow-up (dw = 0.46) for treatment group students. In contrast, the effects of fluency interventions on students’ overall reading outcomes diminished from posttest (dw = 0.47) to follow-up (dw = 0.28). More surprisingly, effects of phonics interventions on reading outcomes were significant at immediate posttest (dw = 0.29) but were trivial at the follow-up time point (dw = 0.07); Suggate (2016) hypothesized that the diminished performance at follow-up may be due to a stronger counterfactual rather than loss of learning for the treatment group students.

Whereas the primary focus of early elementary reading instruction is to develop students’ word reading skills, in the upper elementary and later grades the focus of instruction shifts to extracting and constructing meaning from text (Chall & Jacobs, 2003). Results from previous early elementary intervention studies and Suggate’s (2016) meta-analysis establish the long-term benefits of implementing early reading interventions, especially for at-risk student populations. However, whereas considerable evidence supports the effectiveness and extended benefits of reading interventions in early elementary grades, there is much less evidence confirming the long-term benefits of effective middle and high school reading instructional practices for struggling adolescent readers.

Current Study

Although our understanding of the effects of reading interventions for adolescents is growing, no previous synthesis has examined the long-term effects of these interventions. This article serves as an upward extension of the Suggate (2016) meta-analysis, which examined the long-term effects of elementary reading interventions. However, unlike Suggate’s (2016) meta-analysis, which focused on the ways struggling readers’ response to interventions differed from that of typically developing peers, this synthesis focuses solely on the reading outcomes of struggling readers. Thus, the aim of this synthesis is to address the following research question:

Research Question 1 (RQ1):

What are the effects of reading interventions provided in small-group settings on reading outcomes for struggling readers in Grades 6 to 12 at immediate posttest and follow-up time points?

Method

Data Collection

A comprehensive search of the literature was conducted. First, an online search of four educational literature databases (Education Source, Education Resources Information Center [ERIC], PsycINFO, and ProQuest Dissertations and Theses Global) was conducted to locate unpublished and published studies between 1996 and August 2019. We searched abstracts using search terms for reading (reading OR vocabulary OR phon* OR fluency OR decod* OR comprehen*), study type (intervention OR strateg* OR curricul* OR approach* OR treatment OR teaching method* OR instruction* OR teaching aids OR program), sample (disabilit* OR disorder OR delay* OR struggling OR “reading problem*” OR dyslexi* OR “learning problem*” OR “special education” OR “special need*” OR “at risk” OR “high risk” OR “mild handicap*” OR reading difficult*), and follow-up data (“long-term” OR “medium-term” OR “follow-up” OR posttest OR longitudinal OR period OR maint*). Compared with this study’s screening process, Suggate’s (2016) literature search was limited to two databases (i.e., ERIC and PsycINFO). Our search terms for the follow-up data were the same, and we added “vocabulary” to the reading search terms. There was no overlap for the study type and sample search terms.

The second step in identifying articles relevant to the research question involved a hand search of 15 prominent educational journals spanning January 2017 through August 2019. This window ensured that the electronic search had captured all relevant articles. The hand-searched journals included Annals of Dyslexia, Cognition and Instruction, Exceptional Children, Journal of Educational Psychology, Journal of Learning Disabilities, Journal of Research on Educational Effectiveness, Journal of Research in Reading, Journal of Special Education, Learning Disability Quarterly, Learning Disabilities Research and Practice, Reading and Writing Quarterly, Reading Psychology, Reading Research Quarterly, Remedial and Special Education, and Scientific Studies of Reading. In addition, relevant articles were sourced through an ancestry search of articles that fit the inclusion criteria. Finally, we conducted an ancestry search of existing reviews that synthesized the effects of reading interventions for adolescent struggling readers (Berkeley & Larsen, 2018; Edmonds et al., 2009; Scammacca et al., 2007; Scammacca et al., 2015, 2016).

Figure 1 shows the process for including studies for this systematic review. The online database search revealed 22,770 potential articles. The first author screened abstracts and included any abstract related to reading interventions for full-text screening (n = 904). Table 1 provides examples of studies that were not included in this synthesis. A total of 10 studies (six peer-reviewed articles and four dissertations) met all inclusion criteria:

  1. Interventions involving participants identified with LD, dyslexia, or struggling readers in Grades 6 to 12;

  2. Studies that were randomized controlled trials or quasi-experimental designs;

  3. Intervention studies targeting English language reading-related skills, such as decoding, fluency, vocabulary, reading comprehension, or multicomponent reading interventions;

  4. Studies that reported immediate posttest and maintenance data for at least one dependent measure that assessed reading-related outcomes;

  5. Reading-related interventions conducted in school settings (i.e., no summer school or home-based literacy program);

  6. Reading-related interventions conducted in school settings outside the general education classroom;

  7. Studies published between January 1996 and August 2019;

  8. Studies published in a peer-reviewed journal or unpublished dissertations;

  9. Studies available in English.

The target sample comprised experimental or quasi-experimental reading interventions that reported reading-related outcome data at immediate posttest and for at least one reading measure at a follow-up time point. The authors made an a priori decision to exclude single-case design studies due to inconsistencies in the number of data points reported in the intervention and maintenance (i.e., follow-up) phases; typically, several data points are reported in the intervention phase, whereas only one or two data points are reported for the maintenance/follow-up phase. This imbalance would lead to skewed Tau-U ESs because the magnitude of difference depends on the length of the phases used in the ES calculation (J. E. Pustejovsky, personal communication, July 08, 2018). Thus, following Suggate’s (2016) lead, only group design studies were considered for this review.

Figure 1.


Flowchart for inclusion of studies.

Note. PI = primary investigator.

Table 1.

Examples of Studies Not Included in This Synthesis.

Study Reason for exclusion
Antoniou and Souvignier (2007) Intervention targeted German language reading–related skills. Current synthesis only included studies that targeted English language–related skills.
Borkowski et al. (1988) Study publication year did not meet our search time frame that included studies published on or after 1996.
M. K. Burns et al. (2008) Study reports 2-year follow-up data for sixth-grade students. However, students were in fourth grade when the intervention concluded.
Ellis and Grave (1990) Study publication year did not meet our search time frame that included studies published on or after 1996.
Graves (1986) Study publication year did not meet our search time frame that included studies published on or after 1996.
Hock et al. (2017) Treatment group received 2 years of intervention. Control group was wait-listed for Year 1 and received intervention in Year 2. No follow-up data were collected after the intervention ended in Year 2.
Johnson et al. (1997) The study included students in Grades 4–6. Johnson et al. were unable to confirm that the majority of students were in Grade 6 or provide disaggregated data for students in that grade.
Miranda et al. (1997) Intervention targeted Spanish language reading–related skills. Current synthesis only included studies that targeted English language–related skills.
Vaughn, Roberts, Swanson, et al. (2015) Study provides whole class instruction to a heterogeneous population of readers. No small-group instruction provided.

Study Coding

All included studies were coded using the Guide for Education-Related Intervention Study Syntheses (Vaughn et al., 2014). This codesheet has been used in numerous previous syntheses (e.g., Daniel & Williams, 2019; Hall et al., 2017; Scammacca et al., 2016) and includes all critical components identified in the systematic review process of the What Works Clearinghouse (WWC, 2017) Study Review Guide. Critical components in the codesheet include design information; sample description; sample sizes; baseline measures; description of each measure, including validity, reliability, and internal consistency information; data used for analysis; attrition information; description of treatment and control groups; and description of treatment and control group procedures. Furthermore, the codesheet was updated to include follow-up outcome measures and scores, as well as the times at which follow-up data were collected.

Data Analysis

Standardized mean difference ESs were calculated using Hedges’s g to adjust for the possibility of small sample bias. Treatment and comparison groups’ immediate posttest and follow-up means, standard deviations, and sample sizes were used to calculate Hedges’s g. In addition, we sought to examine the sustainability of effects for treatment group students by calculating an ES comparing treatment group students’ immediate posttest and follow-up mean outcome scores.
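The Hedges's g computation described above follows the standard formula: Cohen's d from the pooled standard deviation, multiplied by a small-sample correction factor. The sketch below (Python, with hypothetical summary statistics; not the authors' code) illustrates the calculation.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges's small-sample correction.

    m1/sd1/n1 describe the treatment group; m2/sd2/n2 the comparison group.
    """
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                    # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)       # small-sample correction factor
    return j * d

# Hypothetical posttest summary statistics (not from any included study)
g = hedges_g(52.0, 10.0, 25, 47.0, 11.0, 25)
```

Because the correction factor is slightly below 1, g is always a little smaller in magnitude than the uncorrected d, which matters most for the small samples common in this literature.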

All eligible ESs in each study that provided means and standard deviations or other relevant statistics, such as F test scores, were considered in calculating the weighted mean ES. Group design studies contributed multiple ESs when the sample for each ES was independent. For studies that reported multiple ESs from the same sample (e.g., two ESs based on two reading comprehension measures were calculated for treatment vs. control in one study), the analysis accounted for the statistical dependencies using the random-effects robust standard error estimation technique developed by Hedges et al. (2010). This analysis accommodates clustered data (i.e., ESs nested within samples) by correcting the study standard errors to take into account the correlations between ESs from the same sample. The robust standard error technique requires an estimate of the mean correlation (ρ) between all pairs of ESs within a cluster in order to calculate the between-study sampling variance estimate, τ². In all analyses, we estimated τ² with ρ = .80. Because this review included studies conducted in Grades 6 to 12, it was hypothesized that the research body reflects a distribution of ESs with significant between-studies variance, as opposed to a group of studies attempting to estimate one true ES (Lipsey & Wilson, 2001). Thus, a random-effects model was used for the current study. Robust variance estimation analysis was conducted in R, using the robumeta package (Fisher & Tipton, 2015).
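The core of the robust variance estimation step can be sketched as follows. This is a deliberately simplified, intercept-only version of the correlated-effects working model of Hedges et al. (2010): each ES in study j receives the weight 1 / (k_j(v̄_j + τ²)), and the robust (sandwich) standard error is computed from cluster-level residuals. The robumeta package additionally estimates τ² from ρ and applies small-sample corrections, which are omitted here; the data below are hypothetical.

```python
def rve_mean_effect(studies, tau2):
    """Weighted mean ES with a cluster-robust SE (intercept-only model).

    studies: list of (effect_sizes, sampling_variances), one entry per
    independent sample (cluster). Every ES within a cluster gets the same
    weight w_j = 1 / (k_j * (vbar_j + tau2)).
    """
    weights, means = [], []
    for es_list, var_list in studies:
        k = len(es_list)
        vbar = sum(var_list) / k              # mean within-study variance
        w = 1.0 / (k * (vbar + tau2))         # per-ES weight
        weights.append(k * w)                 # total weight for the cluster
        means.append(sum(es_list) / k)        # cluster mean ES
    total_w = sum(weights)
    beta = sum(w * m for w, m in zip(weights, means)) / total_w
    # Cluster-robust (sandwich) variance of the weighted mean
    var_r = sum(w**2 * (m - beta)**2 for w, m in zip(weights, means)) / total_w**2
    return beta, var_r**0.5                   # estimate and robust SE

# Hypothetical clusters: (list of ESs, list of their sampling variances)
studies = [([0.9, 0.7], [0.04, 0.05]),
           ([0.3], [0.06]),
           ([0.5, 0.6, 0.4], [0.03] * 3)]
beta, se = rve_mean_effect(studies, tau2=0.2)
```

Downweighting clusters by k_j prevents a study that reports many correlated measures from dominating the pooled estimate, which is why the approach suits syntheses like this one where studies report different numbers of reading outcomes.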

The WWC recommends interpreting ESs of 0.25 and larger as “substantially important” in educational research settings (WWC, 2017). This recommendation was considered when interpreting the magnitude and importance of the effects. Finally, descriptive statistical data were used to calculate 95% confidence intervals (CIs) to determine whether each individual ES was significant, that is, if a statistic is significantly different from zero at the 0.05 level, then the 95% CI will not contain zero.
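Concretely, the two decision rules above amount to building each ES's 95% CI and checking whether it excludes zero, and comparing the ES magnitude against the WWC 0.25 benchmark. The sketch below uses the standard large-sample SE approximation for a standardized mean difference; the numbers are hypothetical.

```python
import math

def es_summary(g, n1, n2):
    """95% CI for a standardized mean difference plus a WWC-style flag."""
    # Large-sample SE approximation for Hedges's g
    se = math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))
    lo, hi = g - 1.96 * se, g + 1.96 * se
    significant = lo > 0 or hi < 0              # CI excludes zero
    substantively_important = abs(g) >= 0.25    # WWC (2017) benchmark
    return (lo, hi), significant, substantively_important

# Hypothetical ES of 0.78 with 30 students per group
(ci_lo, ci_hi), sig, important = es_summary(0.78, 30, 30)
```

Note that with small samples an ES can exceed the 0.25 benchmark yet still be nonsignificant, which is exactly the pattern seen in several of the follow-up comparisons reported below.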

Results

Table 2 provides a description of the 10 studies that met all inclusion criteria and were included in this synthesis. Of the 10 studies, six were peer-reviewed journal articles and four were unpublished doctoral dissertations. Across the studies, immediate posttest and follow-up data were collected on 856 and 693 adolescent struggling readers, respectively. Of these, at posttest, 263 students were identified as having an LD while 593 were identified as struggling readers.

Table 2.

Study Information.

Columns: Study; Type; Participant disability; Grade(s); Reading level; Design; Dosage^a; Frequency/total sessions; Treatment (n = sample size); Comparison (n = sample size).
Berkeley et al. (2011) Experimental (treatment/comparison)
6 hr
NR/12
T1: Direct instruction in using comprehension strategies (n = 20).
T2: Direct instruction in using comprehension strategies plus attribution retraining to improve student self-belief (n = 19).
CO: Students read text and made predictions, practiced repeated reading, answered comprehension questions, and graphed their fluency scores (n = 20).
 Peer-reviewed article
 LD
 Grades 7, 8, and 9
 Participants’ reading level: 3.7 to 4.2 years below grade level
Clarke et al. (2017) Experimental (treatment/comparison)
35 hr
3 times a week/60
T1: Instruction in reading fluency, phonics, and writing (n = 95).
T2: Instruction in reading fluency, phonics, writing, vocabulary, listening comprehension, and strategy instruction (graphic organizer; n = 94).
CO: Control group students were wait-listed to receive intervention. While wait-listed, students received business-as-usual instruction (n = 89).
 Peer-reviewed article
 Struggling readers
 Grades 7 and 8
 <90 standard score on standardized reading measure
Esser (2001) Experimental (treatment/comparison)
5 hr
Twice a week/6
T1: Direct instruction in using comprehension strategies (n = 20).
T2: A combination of direct instruction in comprehension strategies and attribution retraining (n = 20).
CO: Students read text and answered comprehension questions (n = 20).
 Unpublished doctoral dissertation
 LD
 Grades 6 and 7
 NR
Haines et al. (2018) Experimental (treatment/comparison)
NR
5 times per wk/NR
T1: Implemented the Read 180 program (n = 7). CO: Business-as-usual instruction (n = 7).
 Peer-reviewed article
 At-risk students
 Grade 6
 Failed the AIMS test
Jitendra et al. (2000) Experimental (treatment/comparison)
7.5 to 10 hr
NR/15
T1: Direct instruction in generating the main idea after reading the text and self-monitoring during reading using cue cards (n = 18). CO: Systematic reading instruction that emphasized decoding and comprehension activities (n = 15).
 Peer-reviewed article
 LD
 Grades 6–8
 2 years below grade level
Kennedy et al. (2015) Experimental (treatment/comparison)
1 hr
3 times per week/3
Multimedia-based instruction on vocabulary words,
T1: that contained explicit instruction with text and images (n = 7)
T2: that contained keyword mnemonic strategy (n = 8)
T3: that contained explicit instruction and keyword mnemonic strategy (n = 7)
CO: Multimedia-based instruction on vocabulary words that contained text and no images. (n = 8)
 Peer-reviewed article
 LD
 Grade 10
 NR
Lane (1997) Experimental (treatment/comparison)
8.3 hr
5 times per week/10
T1 and T2: Students were taught a reading comprehension strategy and how to cope with failure or respond positively to teacher instruction (n = 138). CO: Business-as-usual instruction (n = 98).
 Unpublished doctoral dissertation
 Struggling readers
 Grade 6
 2 to 3 years below grade level
Newbern (1998) Experimental (treatment/comparison)
3 hr
Once per week/6
T1: Instruction in using the RAP strategy in a small-group setting (n = 16). CO: No strategy-related instruction was provided (n = 13).
 Unpublished doctoral dissertation
 LD
 Grades 7 and 8
 1 to 2 years below grade level
O’Connor et al. (2019) Experimental (treatment/comparison)
55 hr
5 times per week/60
T1: Direct instruction in phonics and vocabulary (n = 32). CO: Direct instruction in phonics and reading text fluently (n = 20).
 Peer-reviewed article
 LD
 Grade 6
 2.5 years below grade level
Vachon (1999) Experimental (treatment/comparison)
17.3 hr
5 times per week/25
All groups received instruction in multisyllabic word reading.
T1 and T2: When students achieved 90% mastery, they moved to the next set of words. They also read grade-level passages or sentences (n = 32).
T3 and T4: Students did not have to achieve mastery to move to next lesson. They also read grade-level passages or sentences (n = 33).
CO: There was no control group
 Unpublished doctoral dissertation
 Struggling readers
 Grades 6–8
 1 to 3 years below grade level

Note. LD = learning disabilities; NR = not reported; T = treatment; CO = control; AIMS = Arizona’s Instrument to Measure Standards test; RAP = read–ask–paraphrase.

^a Total hours per student.

On average, follow-up data collection took place 21.2 weeks after posttesting (range = 2 weeks–2 years). The analysis showed that the estimated average weighted ES on all reading outcome measures between treatment and control groups at immediate posttest was gw = 0.78, 95% CI = [0.25, 1.31] (τ² = .55), and at follow-up was gw = 0.27, 95% CI = [−0.23, 0.77] (τ² = .20). For researcher-developed reading measures, the weighted ES between the treatment and control groups at immediate posttest was gw = 0.86, 95% CI = [0.30, 1.42] (τ² = .59), and at the follow-up time point, it was gw = 0.35, 95% CI = [−0.20, 0.91] (τ² = .20). For standardized measures, the weighted ES at immediate posttest was gw = 0.05, 95% CI = [−0.15, 0.25] (τ² = .00). No mean effect on standardized measures at follow-up was calculated because only two of the four studies that administered standardized measures collected follow-up data on control group students. In addition, we conducted statistical significance tests for each treatment and control group comparison. Of the 44 immediate posttest ESs measured across 10 studies, 17 were significant and positive in favor of the treatment group; 27 were not significant. Similarly, of the 26 follow-up ESs measured, 13 were positive and significant; 13 were not significant.

An ES was also calculated to measure the magnitude of difference between each treatment group’s immediate posttest and follow-up reading scores. Of the 35 immediate posttest and follow-up reading outcome comparisons, 31 ESs were not significantly different from zero (i.e., the 95% CI contained zero; see Figure 2), indicating that treatment groups’ follow-up scores did not differ from their immediate posttest scores. In one study (Haines et al., 2018), treatment group students performed significantly higher on a standardized reading measure at the 2-year follow-up compared with posttest. In contrast, data from three studies showed that treatment group students performed significantly lower on certain reading measures at follow-up than at the immediate posttest (Clarke et al., 2017; Esser, 2001; Jitendra et al., 2000).

Figure 2.


Reading outcome comparison for treatment groups at follow-up and immediate posttest time points.

Note. (a) Effect size is significant when 95% confidence interval does not contain zero. (b) Positive effects (where the CI does not contain zero) indicate significantly greater performance at follow-up compared with immediate posttest. (c) Negative effects (where the CI does not contain zero) indicate significantly lower performance at follow-up compared with immediate posttest. (d) Kennedy et al. (2015) was not included because at follow-up researchers administered a truncated version of the posttest measure. (e) Lane (1997) and Newbern (1998) were not included because raw data or summary statistics for follow-up time point are not reported; the authors only report F-statistic for difference between treatment and control groups.

Study Participants

Participants in nine studies included in this synthesis were sixth, seventh, eighth, and/or ninth graders (Berkeley et al., 2011; Clarke et al., 2017; Esser, 2001; Haines et al., 2018; Jitendra et al., 2000; Lane, 1997; Newbern, 1998; O’Connor et al., 2019; Vachon, 1999). Participants in one study (Kennedy et al., 2015) were 10th graders. Participants in six studies were selected due to their school/district identification of LD (Berkeley et al., 2011; Esser, 2001; Jitendra et al., 2000; Kennedy et al., 2015; Newbern, 1998; O’Connor et al., 2019). Four studies included students who did not have an LD identification but were below grade level on a standardized reading measure (Clarke et al., 2017; Haines et al., 2018; Lane, 1997; Vachon, 1999). Clarke and colleagues (2017) included students from Grades 7 and 8 who scored below 90 on the Single Word Reading Test (SWRT; Foster & The National Foundation for Educational Research, 2008). Haines et al. (2018) selected participants in high-poverty schools who failed to pass the state test. Lane (1997) included sixth-grade students who scored between the 9th and 39th percentiles on a standardized reading measure (the author did not report which standardized measure was used). Similarly, Vachon (1999) included students from Grades 6 to 8 who scored between third- and fifth-grade equivalencies on the Woodcock Reading Mastery Test–Word Identification subtest (Woodcock, 1987), read 60 to 90 words correct per minute on grade-level text, and scored at or below the Grade 3–level equivalency on the Peabody Picture Vocabulary Test (Dunn et al., 1965).

Intervention Type

As shown in Table 2, five of the 10 studies used metacognitive strategy instruction to improve reading outcomes for struggling readers. However, results at posttest and follow-up time points varied across several factors, such as type of strategy, measurement instrument, and duration of the intervention. Two studies used a multicomponent framework to provide instruction in multiple areas of reading (Clarke et al., 2017; Haines et al., 2018). Across both studies, there was no clear trend in the benefits of intervention for treatment group students. Similarly, the effects of vocabulary and word reading instruction for adolescent struggling readers also showed no clear trend of benefits for treatment group students at follow-up time points (Kennedy et al., 2015; O’Connor et al., 2019; Vachon, 1999).

Comprehension.

The estimated average weighted ES between treatment and control groups on all reading comprehension measures was gw = 0.67, 95% CI = [0.10, 1.25], (τ2 = .43) at immediate posttest and gw = 0.33, 95% CI = [−0.19, 0.85], (τ2 = .10) at follow-up. A majority of studies (n = 5) included in this synthesis focused on improving students’ comprehension of expository (Berkeley et al., 2011; Esser, 2001; Lane, 1997) or narrative texts (Jitendra et al., 2000; Newbern, 1998). All five studies taught students to use various comprehension strategies; however, only three studies (Berkeley et al., 2011; Esser, 2001; Jitendra et al., 2000) reported employing a combination of direct instruction (i.e., modeling, guided, and independent practice) and strategy instruction.
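The weighted ESs reported throughout this synthesis combine each study's estimate with its precision. A minimal sketch of inverse-variance weighting, assuming a random-effects setup in which the between-study variance τ² is added to each within-study variance (the synthesis does not report the exact estimator or the per-study inputs, so the values below are purely illustrative):

```python
import math

def weighted_mean_es(gs, ses, tau2=0.0):
    """Inverse-variance weighted mean effect size with a 95% CI.

    gs   -- per-study effect sizes (e.g., Hedges' g)
    ses  -- their standard errors
    tau2 -- between-study variance; 0.0 gives a fixed-effect mean,
            a positive value gives random-effects weighting
    """
    weights = [1.0 / (se**2 + tau2) for se in ses]
    gw = sum(w * g for w, g in zip(weights, gs)) / sum(weights)
    se_gw = math.sqrt(1.0 / sum(weights))
    return gw, (gw - 1.96 * se_gw, gw + 1.96 * se_gw)

# Illustrative inputs only, not the studies in this synthesis
gw, ci = weighted_mean_es([0.5, 1.0], [0.2, 0.2], tau2=0.10)
```

Adding τ² to each study's variance both widens the pooled CI and pulls the weights toward equality, which is why random-effects summaries such as those above tend to have wider intervals than fixed-effect ones.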

As shown in Table 2, Esser (2001) and Berkeley et al. (2011) provided very similar reading interventions to adolescent struggling readers. These researchers provided a combination of direct instruction and reading comprehension strategy instruction (i.e., activating background knowledge, setting a purpose for reading, previewing text, generating questions, and summarization) to two treatment groups. In both studies, Treatment Group 2 received additional instruction after each session in attribution retraining to improve their self-belief. Berkeley et al. (2011) found positive effects of intervention on a researcher-developed summarization outcome measure for both treatment groups at immediate posttest (g = 1.39 and 0.92) and at the 6-week follow-up (g = 1.12 and 0.67). However, the treatment and control groups did not differ significantly at immediate posttest or follow-up time points on another researcher-developed measure, the passage test, which comprised multiple-choice and open-ended questions. In contrast, Esser (2001) administered only one researcher-developed reading measure and found positive effects of intervention at immediate posttest for both treatment groups (g = 0.58 and 1.23). Nonetheless, treatment and control groups did not differ significantly on the same test at the 6-week follow-up.

In Jitendra et al. (2000), tutors provided a combination of direct instruction and main idea generation instruction to treatment group students. Of the six researcher-developed measures administered at immediate posttest and follow-up time points, treatment group students outperformed control group participants on five of six measures at immediate posttest with significant ESs ranging from g = 0.93 to 2.65. However, on the 6-week follow-up test, the treatment group outperformed control group participants on only three of the six measures, with significant ESs ranging from g = 0.75 to 1.26.

Studies that did not provide direct instruction also reported mixed maintenance effects. Newbern (1998) taught students the mnemonic Read–Ask–Paraphrase (RAP) to generate the main idea of a passage. On a researcher-developed measure of reading comprehension, a large positive effect of intervention in favor of the treatment group was reported at immediate posttest (g = 1.45). However, this positive intervention effect was not maintained at the 2-week follow-up, where the F test was not significant.

Lane (1997) taught students to activate background knowledge, think about the most important who/what, and write a sentence describing the main idea after reading. Notably, a greater magnitude of difference was reported at follow-up than at immediate posttest. At posttest, the treatment group outperformed control group students on a researcher-developed main idea generation measure (g = 0.55). However, treatment and control groups were not significantly different on another researcher-developed multiple-choice comprehension measure or on the Gates–MacGinitie Reading Tests (GMRT; Gates & MacGinitie, 1964). At the 2-week follow-up, the treatment group outperformed control group participants on all three measures: multiple choice (g = 0.34), main idea generation (g = 0.49), and GMRT (g = 0.33).

The type of instruction that control group students received in all five studies varied slightly. In three studies (Esser, 2001; Lane, 1997; Newbern, 1998), control group students received no comprehension strategy instruction; students were required to read text and answer comprehension questions. In one study (Jitendra et al., 2000), control group students continued their business-as-usual activities that included decoding and comprehension activities. Finally, the control group students in Berkeley et al. (2011) practiced repeated reading, graphed their fluency scores, and made predictions before reading the text.

Vocabulary.

Our search located two vocabulary-related interventions that involved struggling readers and collected follow-up data. Kennedy et al. (2015) taught 10th-grade students vocabulary words from a grade-level history lesson on World War I using multimedia-based instructional videos. Of the three different treatment groups, Treatment Group 1 watched videos containing explicit instruction incorporating text and images, Treatment Group 2 watched videos on the usage of a mnemonic strategy, and Treatment Group 3 watched videos that combined explicit instruction with the mnemonic strategy. Control group participants were also taught the same set of vocabulary words through vocabulary videos that contained only text (in the absence of images, keyword mnemonic strategy, and direct instruction).

All three treatment groups outperformed control group students at immediate posttest (range g = 1.57–2.81) and at the 3-week follow-up (range g = 1.67–2.88) on a researcher-developed, open-ended vocabulary measure that asked students to write student-friendly definitions. However, on another researcher-developed multiple-choice vocabulary measure, only the participants in Treatment Group 3 (explicit instruction + mnemonic strategy) outperformed control group students at immediate posttest (g = 1.57). At the follow-up time point, both Treatment Group 2 (mnemonic strategy only) and Treatment Group 3 outperformed control group students (g = 1.41 and 1.33, respectively).

In the O’Connor et al. (2019) study, researchers provided daily supplemental vocabulary lessons spanning 15 min. These sessions were in addition to the school-provided instruction students were receiving in special education classrooms. In each session, students were taught four new vocabulary words. Lessons included word synonyms, student-friendly definitions, discussions about the words, and writing sentences with learned words. Treatment group students significantly outperformed control group students at immediate posttest on both researcher-developed measures (g = 1.88 and g = 2.31). Only treatment group students were administered the follow-up vocabulary measure. A comparison between the treatment group’s immediate posttest and follow-up scores showed that participants maintained gains made during the intervention and performed similarly on the researcher-developed measure at the 4-week follow-up test (g = 0.06).

Word reading.

The authors were unable to locate any studies for this student population that provided a reading fluency intervention and collected follow-up data. One study (Vachon, 1999) taught students to read multisyllabic words and collected follow-up data on their decoding and fluency outcomes. Although the study was a randomized controlled trial, it is important to note that the control group in this study received very similar word reading instruction. The difference between the treatment and control conditions related to the criteria students had to meet during instruction before receiving the next set of words. The researcher compared groups of students who had to achieve 90% mastery in word reading with students who did not have to achieve mastery before new sets of words were introduced. No differences were found at immediate posttest or follow-up between the mastery and non-mastery groups on standardized measures of decoding or a researcher-developed fluency measure.

Multicomponent reading interventions.

Two studies implemented multicomponent reading interventions and collected follow-up data for treatment group students. Clarke et al. (2017) randomized study participants to three groups. In Treatment Group 1, students read on- and below-grade-level passages to improve reading fluency, worked on improving their decoding skills through phonics instruction, and wrote sentences. In addition to receiving instruction in reading fluency, phonics, and writing, Treatment Group 2 also received instruction in new vocabulary, listening comprehension, and strategy use. The control group received business-as-usual instruction and was wait-listed to receive treatment. At posttest, on almost all reading measures, there was no significant difference between treatment and control group participants. Due to the study design, control group students received the 20-week treatment after posttesting. Follow-up data were only available for treatment group students. Treatment participants in both groups maintained their immediate posttest performance on all reading measures at the 20-week follow-up.

Haines et al. (2018) collected data on students who participated in the Read 180 program (Scholastic, 2015). Study participants attended daily 90-min sessions for one academic year. The program included instruction in phonemic awareness, phonics, fluency, comprehension, vocabulary, spelling, and writing. At the end of the intervention, treatment group students were matched to students who did not receive the Read 180 program instruction. Students were matched on their baseline Scholastic Reading Inventory scores (SRI; Scholastic, 2014). Treatment and control group students did not differ significantly on the SRI measure at immediate posttest, 1- and 2-year follow-up tests.

Treatment Dosage

On average, researchers provided 15.6 hr (range = 1–55 hr) of reading-related interventions across the nine studies; it was not possible to estimate the total hours of instruction for one study (Haines et al., 2018). Two studies that collected data on both treatment and control group students at immediate posttest and follow-up time points delivered relatively brief interventions: Newbern (1998) provided 3 hr and Esser (2001) provided 5 hr of comprehension instruction. Both studies reported no significant difference between treatment and control groups at the follow-up testing time point. In contrast, Berkeley et al. (2011), Lane (1997), and Jitendra et al. (2000) provided 6 to 10 hr of comprehension-related interventions, and follow-up results varied across measures. Berkeley et al. (2011) reported stable maintenance effects at the follow-up time point on a non-standardized measure of main idea summarization, whereas no significant differences were observed between treatment and control groups at immediate posttest or follow-up on another non-standardized measure of explicit and implicit questions related to the test passage. Jitendra et al. (2000) reported stable positive maintenance effects at follow-up on a researcher-developed near transfer measure of comprehension, but the difference on a researcher-developed far transfer measure was significant only at immediate posttest, not at follow-up. Conversely, Lane (1997) reported moderate positive effects of intervention on both researcher-developed and standardized measures at the follow-up time point.

Outcome Measures

Across the 10 included studies, students were assessed on 25 different reading measures, 10 of which were standardized norm-referenced reading measures (see Table 3). These included standardized measures of reading comprehension, reading fluency, word reading, and vocabulary. The 15 researcher-developed measures primarily required students to read text and either generate a main idea statement or answer open-ended or multiple-choice questions.

Table 3.

Study Measures and Outcomes.

Study Intervention type Dependent measure(s) Std Group PT FU
Sample size g 95% CI Sample size g 95% CI Week
Berkeley et al. (2011) Comprehension Summary test N T1-CO 59 1.39 [0.70, 2.08] 59 1.12 [0.45, 1.79] 6
T2-CO 0.92 [0.26, 1.58] 0.67 [0.03, 1.32]
Passage test N T1-CO 0.13 [−0.49, 0.75] 0.25 [−0.37, 0.88]
T2-CO −0.16 [−0.78, 0.47] 0.05 [−0.58, 0.68]
Clarke et al. (2017) Multicomponent NGRT Y T1-CO 278 0.21 [−0.21, 0.54] 145 *Due to the wait list control group study design, follow-up data were only available for treatment group students. 20
T2-CO 0.45 [0.12, 0.74]
TOWRE-sight word Y T1-CO 0.01 [−0.34, 0.32]
T2-CO 0.06 [−0.26, 0.39]
TOWRE-phonemic decoding Y T1-CO 0.11 [−0.22, 0.44]
T2-CO 0.22 [−0.11, 0.53]
SWRT Y T1-CO 0.17 [−0.15, 0.50]
T2-CO 0.03 [−0.30, 0.36]
WIAT III RC Y T1-CO −0.23 [−0.56, 0.10]
T2-CO 0.05 [−0.29, 0.37]
WASI II vocab Y T1-CO −0.03 [−0.37, 0.29]
T2-CO 0.08 [−0.24, 0.48]
Taught words N T1-CO −0.12 [−0.46, 0.20]
T2-CO 0.24 [−0.09, 0.57]
Nontaught words N T1-CO −0.04 [−0.37, 0.28]
T2-CO 0.18 [−0.15, 0.51]
Esser (2001) Comprehension Comprehension quiz N T1-CO 60 0.58 [−0.05, 1.21] 60 −0.19 [−0.82, 0.43] 6
T2-CO 1.23 [0.56, 1.91] −0.22 [−0.84, 0.41]
Haines et al. (2018) Multicomponent Scholastic reading inventory Y T1-CO 14 −0.15 [−1.20, 0.90] 14 −0.56 [−1.63, 0.51] 52
−0.99 [−2.03, 0.18] 104
Jitendra et al. (2000) Comprehension Main idea generation N 33 33 6
Traininga T1-CO 2.65 [1.71, 3.58] 1.26 [0.51, 2.01]
Near transfera 1.22 [0.47, 1.96] 1.02 [0.30, 1.75]
Far transfera 2.07 [1.22, 2.91] 0.58 [−0.12, 1.28]
Multiple choice N
Training T1-CO 0.93 [0.21, 1.65] 0.02 [−0.67, 0.71]
Near transfer 1.44 [0.67, 2.21] 0.75 [0.04, 1.45]
Far transfer 0.63 [−0.07, 1.33] 0.12 [−0.57, 0.80]
Kennedy et al. (2015) Vocabulary Multiple choice N T1-CO 30 0.96 [−0.11, 2.03] 30 0.73 [−0.32, 1.78] 3
T2-CO 0.86 [−0.16, 1.89] 1.41 [0.31, 2.50]
T3-CO 1.57 [0.41, 2.73] 2.33 to 3.64
Open-ended N T1-CO 1.74 [0.55, 2.93] 1.80 [0.60, 3.01]
T2-CO 1.57 [0.45, 2.70] 1.67 [0.53, 2.81]
T3-CO 2.81 [1.38, 4.23] 2.88 [1.43, 4.32]
Lane (1997) Comprehension Multiple choice N T1 and T2-CO 236 0.14 [−0.11, 0.40] 226 0.34 [0.07, 0.60] 2
Main idea generation N T1 and T2-CO 0.55 [0.28, 0.80] 0.49 [0.22, 0.76]
Gates-MacGinitie Y T1 and T2-CO 0.21 [−0.04, 0.47] 0.33 [0.06, 0.59]
Newbern (1998) Comprehension Reading comprehension test N T1-CO 29 1.45 [0.63, 2.27] NR NRb NRb 2
O’Connor et al. (2019) Vocabulary Vocabulary use N T1-CO 52 1.88 [1.21, 2.54] 32 *As control group students did not learn vocabulary words, follow-up tests were only administered to the treatment group. 4
Multiple choice-vocabulary N T1-CO 2.31 [1.60, 3.02]
Vachon (1999) Phonics WRMT-Word identification Y T1 and T2–T3 and T4 65 −0.02 [−0.51, 0.47] 65 0.00 [−0.49, 0.49] 7
WRMT-Word attack Y T1 and T2–T3 and T4 0.00 [−0.49, 0.49] 0.16 [−0.33, 0.65]
Passage reading error N T1 and T2–T3 and T4 −0.05 [−0.54, 0.43] −0.22 [−0.71, 0.27]

Note. Std = standardized measure; T = treatment; CO = control; PT = posttest; FU = follow-up; NR = not reported; SWRT = Single Word Reading Test; NGRT = New Group Reading Test Digital; TOWRE II = Test of Word Reading Efficiency–2; WIAT II = Wechsler Individual Achievement Test–Second Edition; WASI II = Wechsler Abbreviated Scale of Intelligence–Second Edition; WRMT = Woodcock Reading Mastery Tests.

a

Training: test items similar to items students were trained on; Near Transfer: items based on a similar narrative text not used in training; Far Transfer: items based on expository passages from a social studies text.

b

The author reported that the F test was not significant for the follow-up test; however, the F value was not reported.

Clarity of Causal Inference and Study Quality

In all studies except two (Haines et al., 2018; Newbern, 1998), participants were randomly assigned to treatment or comparison conditions. Haines et al. (2018) matched treatment group students to control group students who had similar pretest scores on the SRI. The matching was done after treatment group students had completed the intervention, and pretest data were not available to establish baseline equivalency between the two groups. Newbern (1998) selected participants based on students’ LD identification at their school/district and a score on a standardized reading measure indicating that the participant’s reading ability was one or more years below grade level. Due to scheduling issues, 13 students were assigned to the control group; the remaining 36 students were randomly assigned to one of two treatment groups. It is equally important to note that the treatment and control groups were not comparable on the pretest measure. According to WWC (2017) standards, baseline equivalence is satisfied when the absolute ES between treatment and control groups is ≤0.05; absolute ESs between 0.05 and 0.25 require statistical adjustment. In Newbern’s (1998) study, the absolute ES on the pretest measure between treatment and control groups was d = 0.65. Hence, neither study satisfies the baseline equivalence requirement.
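The WWC (2017) baseline-equivalence rule described above amounts to a simple threshold check on the absolute pretest ES. The sketch below restates that decision rule for illustration; it is not an official WWC tool, and the category labels are paraphrases:

```python
def baseline_equivalence(abs_pretest_es):
    """Classify a pretest |ES| under the WWC (2017) rule described above."""
    if abs_pretest_es <= 0.05:
        return "satisfied"
    if abs_pretest_es <= 0.25:
        return "satisfied only with statistical adjustment"
    return "not satisfied"

# Newbern (1998): pretest |d| = 0.65, well above the 0.25 ceiling
print(baseline_equivalence(0.65))
```

Applied to Newbern's (1998) pretest difference of d = 0.65, the check returns the third category, which is why the study cannot meet the baseline equivalence requirement even with statistical adjustment.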

For all 10 studies included in this synthesis, no differential attrition was reported that exceeded the acceptable level (WWC, 2017). Group sizes remained similar at the start of the study, during posttest, and at follow-up testing time points. Based on the WWC study ratings, eight of the 10 studies met WWC standards for group design studies without reservations and were rated high. Two studies (Haines et al., 2018; Newbern, 1998) did not meet the WWC group design standards and were rated low.

Discussion

The objective of this synthesis was to understand how well effects of reading interventions were sustained at follow-up time points for struggling adolescent readers in Grades 6 to 12. Ten studies met inclusion criteria, and analyses of data showed a large significant intervention effect of reading interventions at posttest, which, on average, reduced to a moderate effect at follow-up. Of the 25 reading measures students were assessed on, 15 were researcher-developed reading measures.

Across all studies, the effect of treatment was gw = 0.78 at posttest and gw = 0.27 at follow-up. The estimated posttest ES in this study (gw = 0.78) was high relative to past reviews of reading interventions for adolescent struggling readers, which yielded ESs ranging from g = 0.41 to 0.47 (Edmonds et al., 2009; Flynn et al., 2012; Scammacca et al., 2015). One explanation for the heightened posttest ES in this study may be the high number of researcher-developed measures in the included studies. Past studies have consistently reported that researcher-developed measures yield greater ESs than standardized reading measures (Cheung & Slavin, 2016; Slavin & Madden, 2011).

Of the 26 ESs measured to compare treatment and control groups’ performance at immediate posttest and follow-up, 14 were positive and significant in favor of the treatment group at posttest. The CIs of 12 of these 14 posttest ESs overlapped with the corresponding follow-up ES CIs, denoting sustainability and stability of intervention effects. Although there is no general consensus on the appropriate time to collect follow-up data, it is relevant to note that the average follow-up time frame was approximately 21 weeks. This average was skewed by one study that collected follow-up data 1 and 2 years after posttest; the median follow-up time was 6 weeks. Hence, it could be argued that additional research with greater time between immediate posttest and follow-up data collection is needed to build on the current review’s findings, which indicate that adolescent struggling readers generally maintain their reading-related gains over time.
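The skew in the follow-up intervals is easy to see from the week values in Table 3. The check below treats each reported follow-up point as one observation (Haines et al., 2018 contributes both a 52- and a 104-week follow-up); the authors' exact averaging may have differed slightly, so these summaries are illustrative rather than a reproduction of the reported mean:

```python
from statistics import mean, median

# Follow-up intervals in weeks, read from the final column of Table 3
followup_weeks = [6, 20, 6, 52, 104, 6, 3, 2, 2, 4, 7]

print(mean(followup_weeks))    # pulled upward by the 1- and 2-year follow-ups
print(median(followup_weeks))  # 6 weeks, matching the median reported above
```

The large gap between the mean and the median is exactly the skew noted above: a single long-horizon study dominates the average while most follow-ups cluster within a few weeks of posttest.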

Follow-Up Effects of Reading Interventions for Adolescent Struggling Readers

Interventions targeting students’ reading comprehension reported that middle and high school struggling readers, in general, performed better at immediate posttest and follow-up on measures of summarizing text and identifying the main idea than on measures requiring them to answer multiple-choice questions. However, it should be noted that almost all of these were researcher-developed measures, and it is difficult to estimate whether these tests were overaligned with the instructed content in ways that unfairly favored treatment group students. Only two comprehension-related intervention studies (Haines et al., 2018; Lane, 1997) administered a standardized measure of reading comprehension to both treatment and control groups at immediate posttest and follow-up time points.

Lane (1997) reported that, whereas treatment and control group students did not differ significantly at immediate posttest on the GMRT (Gates & MacGinitie, 1964), treatment group students outperformed controls at the 2-week follow-up test. This finding could imply sleeper effects of intervention, suggesting that students may take time to adopt strategies learned during the intervention and that positive effects may be documented only if follow-up data are collected and analyzed. Haines et al. (2018) observed that treatment and control group students did not differ significantly at posttest or at the 1- and 2-year follow-ups, but that both groups’ reading performance continued to improve over time. Conversely, two studies (Esser, 2001; Newbern, 1998) reported moderate to large positive effects of intervention on treatment group students’ reading outcomes compared with control group participants at posttest; however, treatment group students in both studies were not significantly different from control group students on reading measures at follow-up time points. In contrast to Lane’s (1997) findings, results from these two studies suggest fading effects once treatment is over. These examples provide preliminary evidence of the importance of collecting follow-up data to assess students’ response to intervention in a more nuanced manner.

Comprehension-related intervention studies that delivered instruction for a total of <6 hr (Esser, 2001; Newbern, 1998) reported no significant differences between treatment and control groups at follow-up time points. In contrast, studies that delivered reading comprehension–related interventions for six or more hours generally reported positive effects of intervention on reading comprehension measures at immediate posttest and follow-up time points. These findings align with current recommendations in the field that advocate for interventions spanning longer durations to allow students who struggle with reading to make substantial gains in their targeted area of reading difficulty (Vaughn et al., 2012).

Finally, Suggate’s (2016) analysis of the follow-up effects of early elementary reading intervention studies showed that providing reading interventions to at-risk student populations was beneficial in the long term. Treatment group students not only outperformed their peers at the end of treatment but also continued to show sustained positive effects of phonological awareness and comprehension-related interventions months, and sometimes years, after the intervention. In an attempt to understand the long-term effects of reading interventions for middle and high school students at risk of reading failure, we found that few studies follow adolescent struggling readers post-intervention. Over the past two decades, only a handful of studies collected follow-up data for this student population. Our analyses showed that, in general, providing intensive reading comprehension strategy instruction, either with or without direct instruction, was beneficial to students’ progress in reading. This finding indicates that when provided with targeted reading instruction in small-group settings, middle and high school students who struggle to read can still make gains and improve their reading outcomes. Although a majority of the studies included in this synthesis aimed to improve comprehension outcomes for struggling adolescent readers, a few studies focused on improving students’ vocabulary and word reading skills. Considering the paucity of such intervention studies reporting follow-up data, it is unclear how effective word reading and vocabulary interventions are in sustainably improving students’ word reading ability and vocabulary knowledge post-intervention.

Study Limitations

A key constraint of this synthesis is the limited number of studies included in this review. Although an exhaustive search process was used to locate studies with follow-up data, only 10 studies provided the data needed to measure effects. Although a previous synthesis (Berkeley & Larsen, 2018) found other reading intervention studies with follow-up data, some of these studies did not meet our inclusion criteria due to publication date, language of instruction, and/or lack of access to disaggregated data (see Table 1). In addition, the hand search of relevant journals to locate studies that fit the inclusion criteria was limited to 2017–2019, and it is possible that additional articles were missed. We may also have missed potential studies due to indexing issues highlighted in previous reviews (e.g., Lemons et al., 2016), which can lead to different search results depending on the search interface, vendor retrieval algorithms, and article indexing (C. S. Burns et al., 2019). However, it is worth noting that, to locate relevant articles, we followed H. M. Cooper’s (2017) recommendations: an online database search, ancestry and hand searches, and contacting primary investigators for disaggregated data.

Furthermore, due to the small number of studies included in this synthesis, it was not possible to conduct moderator analyses to analyze intervention elements that influenced the strength of association between treatment and follow-up effects. In addition, although this study reports findings for middle and high school students’ reading outcomes, the study’s findings are limited to struggling readers in Grades 6 to 10 because no studies were found that involved Grades 11 and 12 participants.

Another important limitation of this synthesis is that a majority of measures administered in the included studies were researcher developed. Past reviews of the literature have generally reported greater effects of treatment when measured on researcher-developed measures compared with distal or standardized reading measures (Cheung & Slavin, 2016; Edmonds et al., 2009; Scammacca et al., 2015). One implication of relying on researcher-developed measures to make inferences about study effectiveness is the risk of inflating estimates of program effectiveness. In addition, multiple exposures to the same researcher-developed reading measure could lead to testing effects and fatigue.

Implications

Findings from the current synthesis suggest that reading interventions can be effective methods to improve reading outcomes for struggling readers in middle and high school, and that these effects can be sustained at follow-up time points. For instance, two studies included in this synthesis delivered instruction to high school students. In one vocabulary intervention study (Kennedy et al., 2015), the large positive gains students made at immediate posttest were sustained at the 3-week follow-up testing time point. Similarly, one reading comprehension intervention study (Berkeley et al., 2011) reported that substantial gains made at immediate posttest on a researcher-developed comprehension measure were sustained at the 6-week follow-up time point. These findings accentuate the need for intensive reading interventions for students who continue to struggle in middle and high school, as this may be the final window of time within which their reading skills can be improved before they exit the school system.

In a review of a century of progress in reading interventions, Scammacca et al. (2016) noted that a majority of reading intervention studies since the year 2000 were designed to deliver multicomponent reading strategies. Two studies included in this synthesis delivered multicomponent reading interventions that targeted more than one reading component. Although differences between treatment and control group students were not significant on multiple reading measures at immediate posttest, treatment group students maintained the gains they made from baseline to immediate posttest for up to almost 2 years after the intervention concluded.

However, more studies are needed to substantiate the claim that effects of interventions are sustained at follow-up time points. For instance, only two studies in the current synthesis implemented multicomponent reading interventions, and only two implemented vocabulary interventions (only one of which administered follow-up measures to both treatment and control groups). Considering the paucity of studies, generally small sample sizes, and effectiveness of programs being measured predominantly on researcher-developed measures, there is less certainty about the long-term effectiveness of reading intervention approaches, especially in the areas of vocabulary and word reading development for adolescent struggling readers.

Conclusion

It is important to acknowledge the challenges researchers face in collecting follow-up data. One of the biggest challenges may be the need for continuous access to resources, including personnel, to collect data at follow-up time points. Other challenges relate to threats to internal validity that may arise when collecting follow-up data. For instance, follow-up designs are more susceptible to high rates of attrition due to participants leaving the school district, being homeschooled, or dropping out of the school system. Another threat to the internal validity of this study design is testing effects: students may become familiar with the testing instrument over multiple exposures, and their responsiveness to the tests could be mistaken for treatment effects.

However, we contend that the benefits to the field of collecting and measuring follow-up data may outweigh the inherent challenges. Studies that collect follow-up data after the completion of interventions can provide unique insights into the long-term efficacy of academic interventions. Collecting follow-up data can provide powerful evidence concerning students’ response to intervention, their reading development over time, and the extent to which their reading problems persist post-intervention. Furthermore, individual reading interventions differ in terms of their intensity, duration, and instructional techniques. The long-term impact of reading interventions that vary on these key variables also needs to be tested to improve our understanding of the components of interventions that yield long-term effects. Similar to the conclusion made by Suggate (2016), the authors hope that findings from the current study will encourage researchers to collect follow-up data for this student population to improve delivery methods that translate to sustained intervention effects.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grant P50 HD052117 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development at the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health and Human Development or the National Institutes of Health.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

References marked with an asterisk indicate studies included in this synthesis.

  1. Antoniou F, & Souvignier E (2007). Strategy instruction in reading comprehension: An intervention study for students with learning disabilities. Learning Disabilities: A Contemporary Journal, 5, 41–57.
  2. Berkeley S, & Larsen A (2018). Fostering self-regulation of students with learning disabilities: Insights from 30 years of reading comprehension intervention research. Learning Disabilities Research & Practice, 33(2), 75–86. 10.1111/ldrp.12165
  3. *Berkeley S, Mastropieri MA, & Scruggs TE (2011). Reading comprehension strategy instruction and attribution retraining for secondary students with learning and other mild disabilities. Journal of Learning Disabilities, 44(1), 18–32. 10.1177/0022219410371677
  4. Blachman BA, Schatschneider C, Fletcher JM, Francis DJ, Clonan SM, Shaywitz BA, & Shaywitz SE (2004). Effects of intensive reading remediation for second and third graders and a 1-year follow-up. Journal of Educational Psychology, 96(3), 444–461. 10.1037/0022-0663.96.3.444
  5. Blachman BA, Schatschneider C, Fletcher JM, Murray MS, Munger KA, & Vaughn MG (2014). Intensive reading remediation in Grade 2 or 3: Are there effects a decade later? Journal of Educational Psychology, 106(1), 46–57. 10.1037/a0033663
  6. Boardman AG, Roberts G, Vaughn S, Wexler J, Murray CS, & Kosanovich M (2008). Effective instruction for adolescent struggling readers: A practice brief. RMC Research Corporation, Center on Instruction.
  7. Borkowski JG, Weyhing RS, & Carr M (1988). Effects of attributional retraining on strategy-based reading comprehension in learning disabled students. Journal of Educational Psychology, 80, 46–53. 10.1037/0022-0663.80.1.46
  8. Boulay B, Goodson B, Frye M, Blocklin M, & Price C (2015). Summary of research generated by striving readers on the effectiveness of interventions for struggling adolescent readers (NCEE 2016–4001). National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
  9. Burns CS, Shapiro RM II, Nix T, & Huber JT (2019). Search results outliers among MEDLINE platforms. Journal of the Medical Library Association, 107(3), 364–373. 10.5195/jmla.2019.622
  10. Burns MK, Senesac BJ, & Silberglitt B (2008). Longitudinal effect of a volunteer tutoring program on reading skills of students identified as at-risk for reading failure: A two-year follow-up study. Literacy Research and Instruction, 47(1), 27–37. 10.1080/19388070701750171
  11. Byrne B, & Fielding-Barnsley R (1993). Evaluation of a program to teach phonemic awareness to young children: A 1-year follow-up. Journal of Educational Psychology, 85(1), 104–111. 10.1037/0022-0663.85.1.104
  12. Byrne B, & Fielding-Barnsley R (1995). Evaluation of a program to teach phonemic awareness to young children: A 2- and 3-year follow-up and a new preschool trial. Journal of Educational Psychology, 87(3), 488–503. 10.1037/0022-0663.87.3.488
  13. Cantrell SC, Almasi JF, Carter JC, & Rintamaa M (2011). Striving readers final evaluation report: Danville, Kentucky. University of Kentucky, Collaborative Center for Literacy Development. Presented to the Striving Readers Program, U.S. Department of Education.
  14. Cantrell SC, Almasi JF, Carter JC, Rintamaa M, & Madden A (2010). The impact of a strategy-based intervention on the comprehension and strategy use of struggling adolescent readers. Journal of Educational Psychology, 102(2), 257–280. 10.1037/a0018212
  15. Cantrell SC, Carter JC, & Rintamaa M (2012). Striving readers Cohort II evaluation report: Kentucky. University of Kentucky, Collaborative Center for Literacy Development.
  16. Chall JS, & Jacobs VA (2003). The classic study on poor children’s fourth-grade slump. American Educator, 27(1), 14–15.
  17. Cheung AC, & Slavin RE (2016). How methodological features affect effect sizes in education. Educational Researcher, 45(5), 283–292. 10.3102/0013189x16656615
  18. *Clarke PJ, Paul SAS, Smith G, Snowling MJ, & Hulme C (2017). Reading intervention for poor readers at the transition to secondary school. Scientific Studies of Reading, 21(5), 408–427. 10.1080/10888438.2017.1318393
  19. Cooper HM (2017). Research synthesis and meta-analysis: A step-by-step approach. SAGE.
  20. Cooper JO, Heron TE, & Heward WL (2008). Applied behavior analysis. Pearson/Merrill-Prentice Hall.
  21. Daniel J, & Williams KJ (2019). Self-questioning strategy for struggling readers: A synthesis. Remedial and Special Education. Advance online publication. 10.1177/0741932519880338
  22. Deussen T, Scott C, Nelsestuen K, Roccograndi A, & Davis A (2012). Washington striving readers: Year 1 evaluation report. Education Northwest.
  23. Dimitrov D, Jurich S, Frye M, Lammert J, Sayko S, & Taylor L (2012). Year one evaluation report/impact study: Illinois striving readers. RMC Research Corporation.
  24. Dunn LM, Dunn LM, Bulheller S, & Häcker H (1965). Peabody picture vocabulary test. American Guidance Service.
  25. Edmonds MS, Vaughn S, Wexler J, Reutebuch C, Cable A, Tackett KK, & Schnakenberg JW (2009). A synthesis of reading interventions and effects on reading comprehension outcomes for older struggling readers. Review of Educational Research, 79(1), 262–300. 10.3102/0034654308325998
  26. Ellis ES, & Graves AW (1990). Teaching rural students with learning disabilities: A paraphrasing strategy to increase comprehension of main ideas. Rural Special Education Quarterly, 10, 2–10. 10.1177/875687059001000201
  27. *Esser MMS (2001). The effects of metacognitive strategy training and attribution retraining on reading comprehension in African-American students with learning disabilities [Unpublished doctoral dissertation, University of Wisconsin-Milwaukee].
  28. Faddis BJ, Beam M, Maxim L, Gandhi EV, Hahn K, & Hale R (2011). Portland public school’s striving readers program: Year 5 evaluation report. Prepared for the U.S. Department of Education Office of Elementary and Secondary Education.
  29. Feldman J, Schenck A, Feighan K, Coffey D, & Rui N (2011). Memphis striving readers project: Evaluation report, year 4. Research for Better Schools. Presented to the Striving Readers Program, U.S. Department of Education.
  30. Fisher Z, & Tipton E (2015). Robumeta: An R-package for robust variance estimation in meta-analysis. arXiv preprint, arXiv:1503.02220.
  31. Flynn LJ, Zheng X, & Swanson HL (2012). Instructing struggling older readers: A selective meta-analysis of intervention research. Learning Disabilities Research & Practice, 27(1), 21–32. 10.1111/j.1540-5826.2011.00347.x
  32. Foster H, & the National Foundation for Educational Research. (2008). Single word reading test 6–16. GL Assessment.
  33. Gates AI, & MacGinitie WH (1964). Gates-MacGinitie reading tests. Teachers College Press, Columbia University.
  34. Graves AW (1986). Effects of direct instruction and meta-comprehension training on finding main ideas. Learning Disabilities Research & Practice, 1, 90–100.
  35. *Haines ML, Husk KL, Baca L, Wilcox B, & Morrison TG (2018). Longitudinal effects of reading intervention on below grade level, preadolescent, Title I students. Reading Psychology, 39(7), 690–710. 10.1080/02702711.2018.1515135
  36. Hall C, Roberts GJ, Cho E, McCulley LV, Carroll M, & Vaughn S (2017). Reading instruction for English learners in the middle grades: A meta-analysis. Educational Psychology Review, 29(4), 763–794. 10.1007/s10648-016-9372-4
  37. Hedges LV, Tipton E, & Johnson MC (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1(1), 39–65. 10.1002/jrsm.5
  38. Hock MF, Brasseur-Hock IF, Hock AJ, & Duvel B (2017). The effects of a comprehensive reading program on reading outcomes for middle school students with disabilities. Journal of Learning Disabilities, 50(2), 195–212. 10.1177/0022219415618495
  39. Hofstetter CH, Strick B, Der-Martirosian C, Ong-Dean C, Long P, & Lou Y (2011). Implementation and impact of the targeted and whole school interventions, Summary of Year 4 (2009–2010). San Diego (CA) Unified School District. Report submitted to the U.S. Department of Education, Office of Elementary and Secondary Education.
  40. *Jitendra AK, Kay Hoppes M, & Xin YP (2000). Enhancing main idea comprehension for students with learning problems: The role of a summarization strategy and self-monitoring instruction. The Journal of Special Education, 34(3), 127–139. 10.1177/002246690003400302
  41. Johnson L, Graham S, & Harris KR (1997). The effects of goal setting and self-instruction on learning a reading comprehension strategy. Journal of Learning Disabilities, 30(1), 80–91. 10.1177/002221949703000107
  42. *Kennedy MJ, Deshler DD, & Lloyd JW (2015). Effects of multimedia vocabulary instruction on adolescents with learning disabilities. Journal of Learning Disabilities, 48(1), 22–38. 10.1177/0022219413487406
  43. Keogh B (2004). The importance of longitudinal research for early intervention practices. In McCardle P & Chhabra V (Eds.), The voice of evidence in reading research (pp. 81–102). Brookes.
  44. *Lane SS (1997). Motivating inactive readers: The effects of video-modeled coping training on the metacognitive acquisition of a reading strategy by low-achieving readers [Unpublished doctoral dissertation, University of California, Los Angeles].
  45. Lemons CJ, King SA, Davidson KA, Berryessa TL, Gajjar SA, & Sacks LH (2016). An inadvertent concurrent replication: Same roadmap, different journey. Remedial and Special Education, 37(4), 213–222. 10.1177/0741932516631116
  46. Lipsey MW, & Wilson DB (2001). Practical meta-analysis. SAGE.
  47. Loadman WE, Moore RJ, Ren W, Zhu J, Zhao J, & Lomax R (2011). Striving Readers Year 5 Project evaluation report: Ohio. An addendum to the Year 4 report.
  48. Meisch A, Hamilton J, Chen E, Quintanilla P, Fong P, Gray-Adams K, et al. (2011). Striving readers study: Targeted and whole school interventions–Year 5. Westat.
  49. Miranda A, Villaescusa MI, & Vidal-Abarca E (1997). Is attribution retraining necessary? Use of self-regulation procedures for enhancing the reading comprehension strategies of children with learning disabilities. Journal of Learning Disabilities, 30(5), 503–512. 10.1177/002221949703000506
  50. *Newbern SL (1998). The effects of instructional settings on the efficacy of strategy instruction for students with learning disabilities [Unpublished doctoral dissertation, Johns Hopkins University].
  51. Newman DL, Kundert DK, Spaulding DT, White SP, & Gifford TA (2012). Striving readers project: New York State Department of Education/New York City Department of Education: Striving Readers local evaluation. State University of New York–Albany.
  52. *O’Connor RE, Beach KD, Sanchez VM, Kim JJ, Knight-Teague K, Orozco G, & Jones BT (2019). Teaching academic vocabulary to sixth-grade students with disabilities. Learning Disability Quarterly, 42(4), 231–243. 10.1177/0731948718821091
  53. Roberts G, Torgesen JK, Boardman A, & Scammacca N (2008). Evidence-based strategies for reading instruction of older students with learning disabilities. Learning Disabilities Research & Practice, 23(2), 63–69. 10.1111/j.1540-5826.2008.00264.x
  54. Ryder JF, Tunmer WE, & Greaney KT (2008). Explicit instruction in phonemic awareness and phonemically based decoding skills as an intervention strategy for struggling readers in whole language classrooms. Reading and Writing, 21(4), 349–369. 10.1007/s11145-007-9080-z
  55. Scammacca N, Roberts G, Vaughn S, Edmonds M, Wexler J, Reutebuch CK, & Torgesen JK (2007). Interventions for adolescent struggling readers: A meta-analysis with implications for practice. RMC Research Corporation, Center on Instruction.
  56. Scammacca NK, Roberts G, Vaughn S, & Stuebing KK (2015). A meta-analysis of interventions for struggling readers in grades 4–12: 1980–2011. Journal of Learning Disabilities, 48(4), 369–390. 10.1177/0022219413504995
  57. Scammacca NK, Roberts GJ, Cho E, Williams KJ, Roberts G, Vaughn SR, & Carroll M (2016). A century of progress: Reading interventions for students in grades 4–12, 1914–2014. Review of Educational Research, 86(3), 756–800. 10.3102/0034654316652942
  58. Schenck A, Jurich S, Frye M, Lammert J, & Sayko S (2012). Evaluation report/impact study: Virginia striving readers intervention initiative (VSRII). RMC Research Corporation.
  59. Schiller E, Wei X, Thayer S, Blackorby J, Javitz H, & Williamson C (2012, June). A randomized control trial of the impact of the fusion reading intervention on reading achievement and motivation for adolescent struggling readers. Submitted for publication review by SRI International.
  60. Scholastic. (2014). SRI college & career technical guide.
  61. Scholastic. (2015). READ 180 Reading Intervention Program: A comprehensive reading intervention solution.
  62. Slavin R, & Madden NA (2011). Measures inherent to treatments in program effectiveness reviews. Journal of Research on Educational Effectiveness, 4(4), 370–380. 10.1080/19345747.2011.558986
  63. Suggate SP (2016). A meta-analysis of the long-term effects of phonemic awareness, phonics, fluency, and reading comprehension interventions. Journal of Learning Disabilities, 49(1), 77–96. 10.1177/0022219414528540
  64. Swanlund A, Dahlke K, Tucker N, Kleidon B, Kregor J, Davidson-Gibbs D, & Hallberg K (2012). Striving readers: Impact study and project evaluation report: Wisconsin Department of Public Instruction (with Milwaukee Public Schools). American Institutes for Research.
  65. The Education Alliance at Brown University. (2012). Springfield-Chicopee School Districts Striving Readers (SR) Program: Final report Years 1–5: Evaluation of implementation and impact. Prepared for U.S. Department of Education, Institute of Education Sciences, Office of Elementary and Secondary Education.
  66. Tunik J, Simon A, Alemany J, Zhu J, Zacharia J, Ramsey L, Swann R, Bergman A, Fields A, & Mendes R (2011). Chicago Public Schools striving readers initiative: Year four evaluation report. Submitted to Elizabeth Cadenas-Lopez, Project Director, Striving Readers, U.S. Department of Education.
  67. *Vachon VL (1999). Effects of mastery of multisyllabic word reading component skills and of varying practice contexts on word and text reading skills of middle school students with reading deficiencies [Unpublished doctoral dissertation, University of Oregon].
  68. Vadasy PF, & Sanders EA (2013). Two-year follow-up of a code-oriented intervention for lower-skilled first-graders: The influence of language status and word reading skills on third-grade literacy outcomes. Reading and Writing, 26(6), 821–843. 10.1007/s11145-012-9393-4
  69. Vadasy PF, Sanders EA, & Peyton JA (2006). Code-oriented instruction for kindergarten students at risk for reading difficulties: A randomized field trial with paraeducator implementers. Journal of Educational Psychology, 98(3), 508–528. 10.1037/0022-0663.98.3.508
  70. Vaden-Kiernan M, Caverly S, Bell N, Sullivan K, Fong C, & Atwood E (2012). Louisiana striving readers: Final report. Southwest Educational Development Laboratory.
  71. Vaughn S, Elbaum BE, Wanzek J, Scammacca N, & Walker MA (2014). Code sheet and guide for education-related intervention study syntheses. The Meadows Center for Preventing Educational Risk.
  72. Vaughn S, Martinez LR, Williams KJ, Miciak J, Fall AM, & Roberts G (2019). Efficacy of a high school extensive reading intervention for English learners with reading difficulties. Journal of Educational Psychology, 111(3), 373–386. 10.1037/edu0000289
  73. Vaughn S, Roberts G, Swanson EA, Wanzek J, Fall AM, & Stillman-Spisak SJ (2015). Improving middle-school students’ knowledge and comprehension in social studies: A replication. Educational Psychology Review, 27(1), 31–50. 10.1007/s10648-014-9274-2
  74. Vaughn S, Roberts G, Wexler J, Vaughn MG, Fall AM, & Schnakenberg JB (2015). High school students with reading comprehension difficulties: Results of a randomized control trial of a two-year reading intervention. Journal of Learning Disabilities, 48(5), 546–558. 10.1177/0022219413515511
  75. Vaughn S, Wanzek J, Murray CS, & Roberts G (2012). Intensive interventions for students struggling in reading and mathematics: A practice guide. Center on Instruction.
  76. Wanzek J, Vaughn S, Scammacca NK, Metz K, Murray CS, Roberts G, & Danielson L (2013). Extensive reading interventions for students with reading difficulties after grade 3. Review of Educational Research, 83(2), 163–195. 10.3102/0034654313477212
  77. What Works Clearinghouse. (2017). Standards handbook (Version 4.0).
  78. Woodcock RW (1987). Woodcock reading mastery tests–Revised. American Guidance Service.
