Abstract
This study investigated the extent to which problem behaviors were factors associated with response to a year-long multicomponent reading intervention for fourth- and fifth-grade students with reading difficulties. Students scoring ≤85 standard score on the Test of Silent Reading Efficiency and Comprehension (n = 108), a reading fluency and comprehension screener measure, were randomized to the researcher-provided treatment condition (n = 55) or the business-as-usual comparison condition (n = 53). Results indicated that problem behaviors were associated with lower reading comprehension outcomes. Findings also suggested that students with higher levels of overall problem behaviors and externalizing behaviors in the treatment condition outperformed similar students in the comparison condition on the Gates-MacGinitie Reading Test (p < .05). Future research is needed on how to best identify, develop, and adapt effective interventions for students with reading difficulties and problem behaviors within school-wide response to intervention frameworks.
Keywords: reading intervention, behavior, elementary school, learning disability, response to intervention
Students with reading difficulties (SWRD) are more likely than students reading at grade level to demonstrate co-occurring problem behaviors such as externalizing behaviors (e.g., conduct disorder and oppositional defiant disorder; Lin, Morgan, Farkas, Hillemeier, & Cook, 2013; Morgan, Farkas, & Wu, 2009), internalizing behaviors (e.g., overanxious disorder and generalized anxiety disorder; Lin et al., 2013; Morgan et al., 2009), and hyperactive and inattentive behaviors (Carroll, Maughan, Goodman, & Meltzer, 2005; Pennington, 2006). Utilizing the Early Childhood Longitudinal Study–Kindergarten Cohort data set, Lin et al. (2013) and Morgan et al. (2009) identified Kindergarten and Grade 3 students who struggle academically as more likely to have later externalizing and internalizing behaviors, hyperactivity and inattention, poor self-control, and low task engagement in Grades 5 and 8. For students with co-occurring reading difficulties and problem behaviors (SWRD + PB), academic deficits also tend to be stable and persistent throughout their education (Algozzine, Wang, & Violette, 2011).
The relationship between problem behaviors and reading difficulties is not confined to students with emotional disturbance (ED) or attention-deficit/hyperactivity disorder (ADHD; Forness, Freeman, Paparella, Kauffman, & Walker, 2012). Data suggest many students have a range of unidentified behavioral needs that are not addressed in schools but still negatively impact reading outcomes (Forness et al., 2012; Maggin, Wehby, & Gilmour, 2016). Forness et al. (2012) estimated that at any given point in time, approximately 12% of students have an ED and 26% have a psychiatric disorder such as ADHD, depression, or anxiety. Therefore, to efficiently and effectively support SWRD across all tiers in a school-wide Response to Intervention (RTI) framework, it is necessary to consider that many students have unidentified co-occurring problem behaviors (Farmer, Gatzke-Kopp, Lee, Dawes, & Talbott, 2016; Maggin et al., 2016). It is also necessary to continue to identify, develop, and adapt efficacious and intensive reading programs for SWRD + PB, and to identify for whom and under what conditions these interventions are most likely to be effective (Benner, Nelson, Ralston, & Mooney, 2010; Farmer et al., 2016; Maggin et al., 2016; Miciak et al., 2017).
Intervention Research for Students With Reading and Behavioral Difficulties
SWRD + PB are at an increased risk of not adequately responding to an intensive reading intervention, yet there is a paucity of reading intervention research for SWRD + PB investigating reading comprehension outcomes (Jacobson, Ryan, Denckla, Mostofsky, & Mahone, 2013; Nelson, Benner, & Gonzalez, 2003). In a meta-analysis investigating the impact of reading interventions on reading outcomes for SWRD + PB, Benner et al. (2010) identified six group-design studies and 18 single-case design studies. Of these six group-design studies, no study included a norm-referenced reading comprehension measure. Of the 18 single-case design studies, three measured reading comprehension at posttest. Within-student prepost effect sizes ranged from 0.57 to 1.47 for these three studies. Since Benner and colleagues (2010), only Tamm et al. (2017) and Tannock et al. (2018) have used a group-design to investigate the impact of a reading intervention on reading outcomes for students with reading and behavior or attention difficulties. Tamm et al. found Grade 3 to 5 SWRD and co-occurring ADHD who received one of the two forms of reading treatment (with or without a behavioral component) outperformed students who received a behavior-only treatment (no reading component), on the Wechsler Individual Achievement Test–Third Edition (Wechsler, 2009) phoneme decoding (p < .01, g = 0.23–0.32) and word reading (p < .05, g = 0.39–0.39) subtests. Tannock et al. investigated the impact of two forms of reading treatments (phonological awareness-based or strategy-based), against a comparison condition, for 7- to 11-year-old SWRD and co-occurring ADHD. Findings from Tannock et al. suggested the reading treatments (collapsing the phonologically-based and strategy-based groups into a single group) significantly outperformed a researcher-delivered general academic instruction with social skills training comparison condition on the Woodcock Reading Mastery Tests–Revised Word Attack (p < .01) and Passage Comprehension (p = .02; Woodcock, 1987). Results across Tannock et al. and Tamm et al. suggested that a research-based reading intervention, with or without a behavioral component, can improve reading outcomes when compared with a comparison condition not receiving a reading intervention. To date, when investigating group-design reading interventions for SWRD + PB, only Tannock et al. (including SWRD and co-occurring ADHD) included a norm-referenced reading comprehension measure.
Overall, there is a lack of reading intervention research in the upper elementary grades for SWRD + PB. This lack of research is problematic, as the upper elementary grades are a time when the cognitive and instructional demands of reading shift from more malleable word and sentence reading tasks to more complex, and thus more difficult to remediate, reading comprehension tasks (Chall & Jacobs, 1983; Compton, Fuchs, Fuchs, Elleman, & Gilbert, 2008). In the upper elementary grades, there also continues to be a need to better understand how problem behaviors are associated with response to reading interventions (Durlak, Weissberg, Dymnicki, Taylor, & Schellinger, 2011; Maggin et al., 2016; Schonfeld et al., 2015).
Theoretical Underpinnings of Co-Occurring Reading Difficulties and Problem Behaviors
There are four prevailing theories on the causal mechanism underlying the high co-occurrence rate of reading difficulties and problem behaviors (Hinshaw, 1992). One theory states that reading difficulties cause later problem behaviors through negative reading experiences, lowered engagement, and negative feelings about oneself, which ultimately hinder reading growth (e.g., Griffiths & Snowling, 2002; Guthrie, Schafer, & Huang, 2001; Maughan, Rowe, Loeber, & Stouthamer-Loeber, 2003). A second theory states that problem behaviors cause later reading difficulties, because students with behavior problems exhibit poor self-regulation, inattention, and other factors, which interfere with access to instruction, and lead to poor outcomes (Roeser, van der Wolf, & Strobel, 2001). Third is the theory that a bi-directional relationship exists with a negative cycle of reading and behavioral problems leading to further reading and behavioral problems (Morgan, Farkas, Tufis, & Sperling, 2008). Finally, it is possible that a third variable (e.g., working memory and processing speed) is the cause of this relationship (McGrath et al., 2011; Roberts, Solis, Ciullo, McKenna, & Vaughn, 2015).
The current study does not evaluate the causal mechanism underlying the high co-occurrence rate between reading difficulties and problem behaviors. Instead, we aimed to better understand the extent to which problem behaviors were associated with reading outcomes. In addition, we investigated the treatment by problem behavior interaction to determine whether students with higher levels of problem behaviors in the treatment condition would outperform students with higher levels of problem behavior in the comparison condition. If this treatment by problem behavior interaction were to occur, it would suggest an intensive reading intervention, such as the one presented in this study, could help to mitigate the negative impact of problem behaviors. To test the treatment by problem behavior interaction effect, we measured behavior with the norm-referenced Social Skills Improvement System–Rating Scale, Teacher Report (SSIS-RS; Gresham & Elliott, 2008), in a subsample of schools from Vaughn, Roberts, Miciak, Taylor, and Fletcher (2019). In that study, Vaughn and colleagues (2019) investigated the impact of a year-long reading intervention, delivered without systematic behavior support, on reading outcomes for fourth- and fifth-grade SWRD.
Furthermore, students were not screened for behavioral difficulties. By not restricting the sample to students with behavioral difficulties, we could (a) identify and measure the extent to which behavior affected reading outcomes across the continuum of behavior profiles and (b) identify whether reading response differed for students with specific behavior profiles in the treatment condition compared with students with similar behavior profiles in the comparison condition. In addition, because the reading intervention did not include systematic behavioral supports, this study could identify the extent to which problem behaviors were a factor associated with response to an intensive reading-only treatment. We chose to have teachers complete the SSIS-RS for two reasons. First, the SSIS-RS has a problem behavior scale as well as subscales of interest (i.e., externalizing behavior, internalizing behavior, and hyperactivity/inattention). Second, it is an updated version of the Social Skills Rating System (SSRS; Gresham & Elliott, 1990), the problem behavior measure used in the Early Childhood Longitudinal Study–Kindergarten Cohort data set and other reading intervention research (e.g., Hagan-Burke et al., 2011).
Study Purpose and Research Questions
It is well recognized that a need exists to further support RTI frameworks through the development of comprehensive models with reliable screening and effective programs of targeted individualized interventions for SWRD + PB (Fuchs, Fuchs, & Vaughn, 2014; Maggin et al., 2016; Roberts et al., 2015). Previous research has found reading interventions (without a behavioral component) to be efficacious for students with reading and behavioral difficulties on measures of word reading. Yet, for these students, more research is needed on the efficacy of reading interventions on reading comprehension outcomes. Therefore, we aimed to investigate the extent to which levels of problem behavior were associated with response to a research-based reading treatment by randomizing Grade 4 and 5 SWRD to a researcher-delivered multicomponent reading intervention or a business-as-usual comparison condition.
In the current study, we examined outcomes from a single site (six schools from two school districts), in a two-site (nine schools total from three districts) randomized controlled trial presented in Vaughn and colleagues (2019). In Vaughn et al. (2019), the authors investigated the effects of a multicomponent reading intervention for fourth- and fifth-grade SWRD (n = 280). To identify fourth- and fifth-grade students as struggling in reading, and therefore eligible to participate in the study, Vaughn et al. (2019) used the Test of Silent Reading Efficiency and Comprehension (TOSREC; Wagner, Torgesen, Rashotte, & Pearson, 2010) as a reading comprehension and fluency screener measure. Vaughn et al. (2019) screened all fourth- and fifth-grade students at both sites (nine schools total from three districts). Students were eligible to participate if they scored at or below a standard score of 85 on the TOSREC, assented to participate, and returned the appropriate parent consent form. Eligible students were randomized, blocked by school and grade, to a treatment or comparison condition. Vaughn et al. (2019) found posttest group differences on the AIMSweb passage reading fluency measure (p < .001; Shinn & Shinn, 2002) and a researcher-designed word reading measure (p < .001). No posttest group differences were identified on norm-referenced measures of decoding, word reading fluency, or reading comprehension (p > .05).
In the current study, teachers at the previously described single site (six schools total from two districts) completed the SSIS-RS on the treatment and comparison condition students. Due to logistical concerns, the teachers from the other site (three schools total from one school district) did not complete the SSIS-RS. Intervention materials and components were identical across both sites (nine schools total from three districts).
Based on this single site (six schools total from two districts) subsample from Vaughn et al. (2019), we asked three research questions. The first and second research questions were: What is the efficacy of a multicomponent reading intervention on reading comprehension outcomes for Grade 4 and 5 students with reading difficulties? and To what extent were problem behaviors associated with these reading outcomes? We did not expect outcomes to vary from Vaughn et al. (2019), in that we did not anticipate finding significant posttest group differences on reading comprehension standardized measures. Across all students, we also expected higher levels of problem behavior to be associated with lower reading outcomes. Finally, we asked to what extent does response to a multicomponent reading intervention vary, based on levels of problem behavior and assignment to condition, following a year-long multicomponent reading intervention, for fourth- and fifth-grade students with reading difficulties? We hypothesized that students with higher levels of problem behaviors in the treatment condition would outperform students with high levels of problem behavior in the comparison condition.
Method
Characteristics of Subsample
School and student characteristics.
The subsample used in this study included six elementary schools from two school districts, as compared with the entire sample of nine schools across three school districts. Three schools were from a large urban district and three schools were from a near-urban school district. Both districts were in the Southwestern United States. These schools’ mean enrollment was 590 students (SD = 121.3, range: 472–768), with a mean of 25.4% (SD = 23.2%, range: 3.3%–55.5%) of the students qualifying for free or reduced-price lunch.
As reported in Vaughn and colleagues (2019), students were randomized to condition blocked on school. Because treatment-assigned students were then assigned a tutor for purposes of implementing the intervention, the data were partially nested. The data were also cross-classified because tutors were crossed with teachers. We return to this topic in a later section. There were 55 students assigned to the treatment condition and 53 to the comparison condition. Table 1 presents student demographics across the total sample and by condition. There were no pretreatment group differences on age, t(106) = 0.86, p > .05, gender, χ2(1) = 0.03, p > .05, English learner status, χ2(1) = 1.22, p > .05, special education status, χ2(1) = 0.01, p > .05, or race/ethnicity, χ2(3) = 0.63, p > .05. Furthermore, there were no differences at pretest between treatment and comparison groups on the TOSREC, t(106) = −0.04, p > .05, or on the Gates-MacGinitie Reading Test–Fourth Edition (GM-RT; MacGinitie, MacGinitie, Maria, Dreyer, & Hughes, 2000), t(104) = −0.27, p > .05.
Table 1.
Student Demographics.
Characteristic | Total sample | Treatment | Comparison |
---|---|---|---|
n | 108 | 55 | 53 |
Age | | | |
M (years) | 10.10 | 10.15 | 10.05 |
SD (years) | 0.60 | 0.66 | 0.54 |
% Female | 46 | 45 | 47 |
% SPED | 13 | 13 | 13 |
% EL | 8 | 5 | 11 |
Race/Ethnicity | | | |
% Hispanic | 19 | 18 | 19 |
% African American | 16 | 18 | 13 |
% Caucasian | 50 | 47 | 53 |
% Other | 14 | 16 | 15 |
Note. SPED = special education; EL = English learner.
Attrition
Six of the 55 treatment condition students withdrew from the study due to conflicts in school scheduling (n = 2), a move from the school (n = 1), or parent withdrawal (n = 3). Eight of the 53 comparison condition students withdrew from the study due to (a) school request (n = 1), (b) a move from the school (n = 6), or (c) parent withdrawal (n = 1). Based on TOSREC pretest scores, students who withdrew from the study, whether assigned to treatment, t(93) = 0.79, p > .05, or comparison, t(95) = 1.72, p > .05, did not differ from those who remained at posttest.
Two additional treatment condition students and two comparison condition students were excluded from the final analysis due to incomplete SSIS-RS forms. These four students did not differ from those remaining in the study based on posttest reading scores of the Woodcock–Johnson III Passage Comprehension subtest (WJ-III PC; Woodcock, McGrew, & Mather, 2001), t(92) = 0.84, p > .05, or the GM-RT, t(92) = 1.07, p > .05. These student scores were excluded from all posttest analysis, resulting in 47 students in the treatment condition and 43 students in the comparison condition.
The final sample of treated students had TOSREC and GM-RT (MacGinitie et al., 2000) pretest mean scores of 78.00 (SD = 7.72, range: 55–85) and 92.09 (SD = 7.93, range: 72–110), respectively. The comparison condition had TOSREC and GM-RT pretest mean scores of 77.47 (SD = 8.28, range: 55–85) and 93.47 (SD = 11.41, range: 74–116), respectively. The two groups (n = 90) did not differ statistically on TOSREC, t(88) = 0.32, p > .05, or GM-RT, t(88) = −0.67, p > .05, at pretest, suggesting that attrition-related bias was minimal.
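As an illustration of the pretest comparison above, the following minimal sketch recomputes the two t statistics from the reported means, standard deviations, and group sizes. It assumes a pooled-variance (Student's) independent-samples t test, which the text does not state explicitly; the function name is hypothetical.

```python
import math

def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Two-sample t statistic from summary statistics.

    Assumes equal variances (pooled SD); this is an assumption of the
    sketch, not something the source specifies.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Pretest values as reported in the text (treatment n = 47, comparison n = 43)
t_tosrec, df_tosrec = pooled_t_from_summary(78.00, 7.72, 47, 77.47, 8.28, 43)
t_gmrt, df_gmrt = pooled_t_from_summary(92.09, 7.93, 47, 93.47, 11.41, 43)

print(f"TOSREC: t({df_tosrec}) = {t_tosrec:.2f}")
print(f"GM-RT:  t({df_gmrt}) = {t_gmrt:.2f}")
```

Under these assumptions the GM-RT statistic matches the reported value of −0.67, and the TOSREC statistic comes out near 0.31, close to the reported 0.32; the small gap is plausibly due to rounding in the published means and standard deviations.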
Measures
Trained research team members, blinded to students’ assigned condition, administered all reading measures to students during a 2-week period prior to and following the treatment.
TOSREC.
The TOSREC is a brief, group-administered test of reading fluency and comprehension. Students were given 3 min to read and verify the truthfulness (by circling “yes” or “no”) of as many sentences as possible. Reliability coefficients with other reading measures (e.g., WJ-III; Woodcock et al., 2001) exceeded .70 (Wagner et al., 2010), and the TOSREC correlated with other standardized measures of reading fluency and comprehension, with coefficients ranging from .59 to .67 (Vaughn et al., 2019). The TOSREC was the screener measure.
WJ-III PC.
The WJ-III PC is an individually administered reading comprehension assessment. In this assessment, students first matched a picture symbol with an actual picture of an item. Next, students identified which of three pictures was best associated with a short phrase. Finally, students completed a cloze-based reading assessment in which they read a passage and filled in missing words. Test–retest reliability coefficients for children aged 8 to 13 years range from .89 to .96. For students aged 9 to 13 years, the Passage Comprehension subtest correlates with the WJ-III Reading Vocabulary (r = .59) and Story Recall (r = .46). This assessment was delivered at posttest only.
GM-RT.
The GM-RT reading comprehension subtest is a timed, group-administered assessment that measures reading comprehension with targeted inference making, summarization, literal understanding, and vocabulary. This assessment included expository and narrative passages ranging in length from three to 15 sentences, with three to six multiple-choice questions per passage. Internal consistency reliability ranges from .91 to .93, and the Kuder–Richardson reliability statistic ranges from .92 to .93. This assessment was delivered at pretest and posttest.
SSIS-RS.
The SSIS-RS Teacher Form, an updated version of the SSRS, was completed by each student’s English Language Arts (ELA) teacher. ELA teachers were blind to student assignment. The SSIS-RS includes scales of social skills, competing problem behavior, and academic competence. All questions include the frequency of behavior and importance of behavior. Frequency questions are answered on a 4-point rating of 0 to 3 meaning never, seldom, often, and almost always, respectively. Importance questions are answered on a 3-point rating of 0 to 2 meaning not important, important, and critical, respectively. The teacher form takes approximately 10 to 15 min to complete per student.
The Competing Problem Behavior Scale, Teacher Elementary Form, of the SSIS-RS was the only form used in the analysis. This Competing Problem Behavior Scale Teacher Form contains 49 items with 12 items on the externalizing subscale, 10 items on the internalizing subscale, five items on the bullying subscale, seven items on the hyperactivity/inattention subscale, and 15 items on the autism spectrum subscale. For the analysis, we utilized the Competing Problem Behavior Scale, and its externalizing, internalizing, and hyperactivity/inattention subscales, due to their potential association with reading outcomes (e.g., Lin et al., 2013; Morgan et al., 2009). We did not set out to investigate the extent to which the behavioral characteristics associated with autism spectrum disorder or bullying were related to reading outcomes, and therefore did not include the autism spectrum or bullying subscales. The Competing Problem Behavior Scale internal consistency correlation alpha estimate is .95. Test–retest reliability estimates exceed .80. The SSIS-RS Problem Behavior Scale moderately and negatively correlates with the SSIS-RS Social Skills Scale (−.42 to −.65). The SSIS-RS was delivered at posttest.
Procedures
Intervention.
The researcher-provided reading intervention was designed to improve the reading outcomes of SWRD and was not developed with a behavioral component, or to meet any specific needs associated with problem behaviors. The intervention occurred daily for 30 to 45 min, in groups of three to six students for an average of 68 lessons (mean hours of instruction = 44.4, SD = 11.2). Lessons were sequenced so that lessons 1 to 40 prioritized instruction and practice in word reading skills through systematic decoding (e.g., word patterns and word parts), automaticity (e.g., timed word lists) and sight words, and passage fluency. The remaining lessons, 41 to 110, prioritized reading comprehension instruction and also included practice in fluency instruction and systematic decoding. Throughout the 110 lessons, texts were science-based and included expository, narrative, and hybrid (i.e., a combination of expository and narrative) passages. Tutors were also provided the option to deliver student incentives (e.g., stickers and pencils) to increase engagement and motivation, but neither this nor any other specific behavioral or social skill component was systematically integrated into the lessons. A further description of the intervention components is provided in Vaughn et al. (2019). Sample lessons are available at www.texasldcenter.org/files/lesson-plans/TCLD_4-5_SelfReg.pdf
Lessons 1 to 40
Word study was the instructional focus of the first 40 lessons, with the following components introduced daily: (a) systematic decoding, (b) automaticity instruction, and (c) sight word instruction. During the systematic decoding routine, word cards with sound patterns (e.g., “ea,” “ai,” and “eigh”) were taught in isolation as well as in word lists, used in a spelling routine (i.e., students write words on a whiteboard), and incorporated in fluency passages. During automaticity instruction, tutors modeled word reading, and students read and re-read a word or sentence list until it was mastered based on time and accuracy. This activity addressed word reading skills within single-syllable, multisyllable, and high-frequency sight words. Feedback was provided to the student from both tutor and peers. Instruction was individualized and intensified through a systematic progression to more difficult word reading skills and formats (i.e., word, phrase, and sentence level) based on individual learner mastery of previous lists. Finally, during high-frequency sight word instruction, students read high-frequency sight words in isolation and in sentences and then completed a spelling activity and quiz.
Text-based reading during the first 40 lessons included fluency-text instruction occurring daily for 10 min. The text selection included QuickReads (Hiebert, 2003) and additional narrative texts, which were designed to facilitate comprehension by limiting unfamiliar words in the text and reducing the cognitive load placed on lower level reading tasks. Text selection prioritized the match between instructional level of student and text over the content in the text. The fluency-text instructional routine included: (a) introducing key words that represented the main idea of the text; (b) multiple readings of the text in different formats (e.g., adult models reading with expression and choral read) to build rate, accuracy, and expression; and (c) using key words to summarize the passage.
Lessons 41 to 110
Text-based reading was the instructional focus for lessons 41 to 110 and included two formats: (a) stretch-text with goal setting and (b) fluency text. The stretch-text with goal setting instruction occurred on three out of every five lessons for 25 to 30 min and included four lesson components: (a) introduction of the reading goals, (b) reading of the passage, (c) answering comprehension questions, and (d) reviewing the reading goals reflection. First, and prior to reading the text, students identified and reviewed reading goals that related to the main idea of the text. Next, students read the stretch-text, which included grade-level (adapted for readability) expository, narrative, and hybrid texts on science topics (e.g., minerals). Stopping points were embedded into the text to signal students to summarize previous section(s) of the text and answer reading comprehension questions. Following the reading of the text, students answered researcher-developed multiple choice comprehension questions. At the conclusion of the lesson, students responded to whether they met their goal and reflected on their ability to comprehend the text.
The fluency-text instruction occurred two out of every five lessons for 15 min and included two components: (a) fluency-text routine and (b) “Does it Make Sense” activity. The fluency-text routine used an identical format to the first 40 lessons, except that key words were not provided to students to cue them to the main idea, and students were asked literal and inferential comprehension questions. The “Does it Make Sense” activity had students read and evaluate the syntax and semantics of a sentence or group of sentences to determine whether the sentence made sense. When a student determined that the sentence or group of sentences did not make sense, the student underlined the word(s) in the sentence that disrupted the meaning. Feedback and discussion were provided during this activity.
Word study for lessons 41 to 110 included two parts: (a) systematic instruction in morphology and (b) automaticity instruction. The systematic instruction in morphology occurred two out of every five lessons for 10 to 15 min. This component was designed to complement the systematic decoding instruction in the first 40 lessons with students being taught to decode and derive meaning of root words, prefixes, and suffixes (as compared with word patterns). The morphology instruction included three parts. First, explicit instruction in the meanings of the word parts was delivered. Then, students hypothesized the meaning of the words through a practiced cognitive strategy (i.e., I know the meaning of the base word and prefix/suffix, so I think the word means). Finally, a confirmation (or rejection) of the student’s hypothesis was demonstrated through a sentence application worksheet. In the sentence application worksheet, students determined whether the hypothesized definition of the word made sense in the context of the sentence. The second part of word study was the automaticity instruction. During automaticity instruction, students continued to work on mastery of words and sentences.
Fidelity
Tutor lessons in the researcher-provided reading intervention were audio-recorded daily with a subset of recorded lessons being blocked by tutor and unit (i.e., lessons 1–40, lessons 41–80, and lessons > 80) and randomly selected and coded for fidelity. Four coders coded a total of 135 audio-recorded lessons. All coders were trained and independently reached 90% reliability in adherence to the gold standard method (Gwet, 2001). To protect against coder drift, at the half-way point of coding the audio-recorded lessons, coder reliability was reassessed based on the gold standard method previously described.
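The selection of recordings described above (blocked by tutor and unit, then randomly selected) can be sketched as a stratified random draw. This is only an illustration under stated assumptions: the number of recordings drawn per block, the tutor identifiers, the seed, and the function names are hypothetical and are not reported in the text.

```python
import random
from collections import defaultdict

def unit_for_lesson(lesson_number):
    """Map a lesson number to one of the three units named in the text."""
    if lesson_number <= 40:
        return "lessons 1-40"
    if lesson_number <= 80:
        return "lessons 41-80"
    return "lessons > 80"

def sample_for_fidelity(recordings, per_block, seed=0):
    """Randomly pick up to `per_block` recordings in each (tutor, unit) block.

    `per_block` and `seed` are hypothetical parameters for illustration.
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for tutor, lesson in recordings:
        blocks[(tutor, unit_for_lesson(lesson))].append((tutor, lesson))
    selected = []
    for key in sorted(blocks):
        pool = blocks[key]
        selected.extend(rng.sample(pool, min(per_block, len(pool))))
    return selected

# Hypothetical recordings as (tutor id, lesson number); not study data
recordings = [(t, n) for t in ("tutor_a", "tutor_b") for n in range(1, 111)]
picked = sample_for_fidelity(recordings, per_block=2)
print(len(picked))  # 2 tutors x 3 units x 2 recordings per block
```

Blocking before sampling ensures that every tutor and every unit of the lesson sequence is represented among the coded lessons, which a simple random draw over all recordings would not guarantee.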
Coding was based on a 4-point Likert-type rating scale (with high scores indicating higher fidelity). Researchers rated the implementation fidelity (i.e., adherence to the essential elements of each component), global quality (e.g., appropriate use of feedback and pacing), and global fidelity (e.g., holistic evaluation of implementation and success). The mean implementation score for the intervention components was 3.84 (SD = 0.65, range: 1–4), for global quality was 3.91 (SD = 0.29, range: 3–4), and for global fidelity was 3.99 (SD = 0.09, range: 3–4).
Comparison Group Reading Instruction
A lead teacher at each school completed an alternative reading inventory (ARI) to document the comparison condition’s reading instruction, whereas the treatment condition received the researcher-provided intervention. Results indicated that three students did not receive additional reading instruction outside of their core reading program, four students received computer-based learning supervised by a noncertified teacher, and 36 students received reading instruction by a certified teacher in groups of 11 or more (n = 1), 5 to 10 (n = 7), and 1 to 4 (n = 28). These reading sessions ranged from one 60 min session to five 30 min sessions per week.
Data Analysis
The research design of the original study (Vaughn et al., 2019) was partially nested and cross-classified with students nested in tutors in the treatment condition only and tutors and classroom teachers crossed. Nested data can bias standard errors and result in Type I errors. Cross-classified structures represent a departure from fully nested data, introducing additional assumptions and requiring estimation of additional model parameters. We fit unconditional models (i.e., empty means and random intercept) to estimate variance related to partial nesting and cross-classification at different levels of the model, starting with the partial nesting of students in tutors, under the assumption that trivial amounts of tutor-level partial nesting would preclude any consequential cross-classification of students from the same classrooms to different tutors.
We tested the hypothesis that intra-class correlations within the treatment-assigned condition do not differ statistically from 0 (within-group deviations were normally distributed) by calculating F-statistics as $F = \tilde{n}\,S^2_{\text{between}} / S^2_{\text{within}}$, where $\tilde{n}$ is the average cluster size, $S^2_{\text{between}}$ is the observed between-group variance, and $S^2_{\text{within}}$ is the observed within-group variance (Snijders & Bosker, 1999), with $n - 1$ and $m - n$ degrees of freedom, $n$ being the number of level 2 units and $m$ the number of level 1 units. Tutor-level clustering (ρ of .01–.03 across measures) within the treatment condition did not differ statistically from 0 at pretest or at posttest on any measure (p-values from .65 to .78). Thus, we ignored the design’s partial nesting. We also ignored the cross-classification of teacher and tutor for the reasons outlined above. Furthermore, because our research question involved student-level constructs and because the sample-wide nesting (independent of tutor) of students in classroom teachers was trivial (p = .86), we fit single-level regression models to estimate the simple main effects of treatment and problem behaviors on reading outcomes and to evaluate moderation by problem behaviors of treatment’s effect on reading outcomes (school-level clustering is trivial [ρ < .01], as well). The GM-RT was included as a grand-mean centered covariate in all models (we did not have WJ-III PC data at pretest). We standardized behavior outcomes as z-scores. We fit models using Mplus 8. We estimated effect sizes as Hedges g (Hedges, 1981). To control for Type I errors associated with multiple contrasts, we used the Benjamini–Hochberg (BH) correction for false discovery rates (FDR; Benjamini & Hochberg, 1995), a less conservative method than the Bonferroni method (Bonferroni, 1935), which controls for family-wise errors. Per What Works Clearinghouse recommendations (Institute of Education Sciences, 2017), we corrected for FDR on simple main effects only. We did not adjust p-values in the moderation analysis.
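The clustering test described above can be illustrated with a minimal sketch. The scores below are hypothetical (they are not study data), the function name is ours, and the sketch assumes equal-sized clusters, in which case the statistic coincides with the usual one-way ANOVA F ratio. The source does not specify how unequal cluster sizes were weighted, so this should be read as an illustration of the formula rather than a reproduction of the authors' computation.

```python
from statistics import mean, variance

def clustering_f_statistic(groups):
    """Return (F, (df1, df2)) with F = n_avg * S2_between / S2_within.

    groups: list of lists, one list of student scores per cluster (e.g., tutor).
    """
    n_clusters = len(groups)                 # number of level 2 units (n)
    m = sum(len(g) for g in groups)          # number of level 1 units (m)
    n_avg = m / n_clusters                   # average cluster size
    # Between-group variance: sample variance of the cluster means.
    s2_between = variance([mean(g) for g in groups])
    # Within-group variance: pooled squared deviations from each cluster mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    s2_within = ss_within / (m - n_clusters)
    return n_avg * s2_between / s2_within, (n_clusters - 1, m - n_clusters)

# Hypothetical scores for three tutors with three students each.
hypothetical_scores = [[90, 95, 100], [92, 97, 102], [88, 93, 98]]
f_stat, dfs = clustering_f_statistic(hypothetical_scores)
print(f_stat, dfs)
```

An F ratio near or below 1, as in this made-up example, indicates that cluster means vary no more than within-cluster noise would predict, which is the pattern the authors report for tutor-level clustering.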
Results
Overview of Results
Table 2 presents standard scores for reading outcomes by condition. Treated students scored 94.57 (SD = 9.04, range: 55–85) and 94.09 (SD = 4.75, range: 72–110) on average on the GM-RT and WJ-III PC, respectively. Students in the comparison condition had posttest mean values of 94.30 (SD = 10.06, range: 55–85) and 93.77 (SD = 8.61, range: 74–116) on the GM-RT and WJ-III PC, respectively. The effect sizes (Hedges g) were 0.11 for both the GM-RT and the WJ-III PC. The groups did not differ at posttest on subscales of the SSIS-RS (p > .05). On the problem behavior scale, 16 and 18 students scored one standard deviation above and below the sample mean, respectively, with higher scores indicating more problem behaviors.
Table 2.
Reading Pretest and Posttest Group Comparisons.
Measure | Construct | Group | n | Pre M (SD)^a | Post M (SD)^a | Adj. M (SE) | F | p | g |
---|---|---|---|---|---|---|---|---|---|
WJ-III PC | Comprehension | T | 47 | | 94.09 (4.75) | 94.28 (0.92) | 0.30 | .58 | 0.11 |
WJ-III PC | Comprehension | C | 43 | | 93.77 (8.61) | 93.55 (0.96) | | | |
GM-RT | Comprehension | T | 47 | 92.09 (7.93) | 94.57 (9.04) | 94.96 (1.12) | 0.45 | .50 | 0.11 |
GM-RT | Comprehension | C | 43 | 93.47 (11.41) | 94.30 (10.06) | 93.88 (1.17) | | | |
Note. WJ-III PC = Woodcock–Johnson III Tests of Achievement Passage Comprehension subtest (Woodcock, McGrew, & Mather, 2001); T = treatment; C = Comparison; GM-RT = Gates-MacGinitie Reading Test–Fourth Edition (MacGinitie, MacGinitie, Maria, Dreyer, & Hughes, 2000).
^a Standard scores.
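As a rough check on the reported effect sizes, the sketch below computes Hedges g from the values in Table 2. It assumes that g was based on the difference in adjusted posttest means divided by the pooled unadjusted posttest standard deviation, with the usual small-sample correction. The source does not state which means or standard deviations were used, so that choice and the function name are assumptions made for illustration only.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with the small-sample correction."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    # Approximate correction factor for small samples.
    correction = 1 - 3 / (4 * df - 1)
    return d * correction

# Adjusted posttest means and unadjusted posttest SDs taken from Table 2.
g_gm_rt = hedges_g(94.96, 93.88, 9.04, 10.06, 47, 43)
g_wj_pc = hedges_g(94.28, 93.55, 4.75, 8.61, 47, 43)
print(round(g_gm_rt, 2), round(g_wj_pc, 2))
```

Under these assumptions both values round to 0.11, which agrees with the effect sizes reported in the table.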
Main effects of intervention and problem behaviors.
Table 3 summarizes results for the eight regression models (four moderators [i.e., problem behavior, externalizing behavior, internalizing behavior, and hyperactivity/inattention] by two outcomes [i.e., GM-RT and WJ-III PC]). We report unstandardized coefficients. As expected and consistent with our prior work, there were no main effects associated with treatment (p = .69–.90 depending on outcome and moderator). For the SSIS-RS, there were main effects on the GM-RT for problem behaviors (b = −3.32, p = .003), for externalizing behaviors (b = −3.14, p = .005), and for hyperactivity/inattention (b = −2.91, p = .017). On the WJ-III PC, there were main effects for problem behaviors (b = −3.01, p = .001), for internalizing behaviors (b = −2.17, p = .015), for externalizing behaviors (b = −2.22, p = .016), and for hyperactivity/inattention (b = −3.45, p = .001). Because the SSIS-RS scores were z-scored and because higher values reflect more prevalent problem behaviors, the negatively signed coefficients suggest that reading outcomes were better for students with fewer problem behaviors, whether treated or not (i.e., independent of assignment to the reading treatment).
Table 3.
Tests of Treatment, Behavior, and Interaction Effects.
Model parameters | GM-RT Estimate | GM-RT SE | GM-RT p | WJ-III PC Estimate | WJ-III PC SE | WJ-III PC p |
---|---|---|---|---|---|---|
Intercept | 44.29 | 7.84 | .000 | 71.10 | 6.27 | .000 |
Pretest | 0.54 | 0.08 | .000 | 0.25 | 0.07 | .000 |
Condition | 0.34 | 1.60 | .833 | −0.16 | 1.28 | .901 |
Problem Behavior | −3.32 | 1.10 | .003 | −3.01 | 0.87 | .001 |
Problem Behavior × Condition | 3.31 | 1.60 | .042 | 1.93 | 1.29 | .136 |
Intercept | 42.27 | 7.78 | .000 | 68.99 | 6.43 | .000 |
Pretest | 0.56 | 0.08 | .000 | 0.27 | 0.07 | .000 |
Condition | 0.64 | 1.59 | .690 | 0.20 | 1.31 | .882 |
Externalizing Behavior | −3.14 | 1.09 | .005 | −2.22 | 0.99 | .016 |
Externalizing Behavior × Condition | 3.94 | 1.59 | .015 | 1.47 | 1.32 | .267 |
Intercept | 39.98 | 8.06 | .000 | 67.98 | 6.40 | .000 |
Pretest | 0.58 | 0.09 | .000 | 0.28 | 0.07 | .000 |
Condition | 0.86 | 1.63 | .600 | 0.41 | 1.30 | .752 |
Internalizing Behavior | −1.08 | 1.10 | .331 | −2.12 | 0.88 | .015 |
Internalizing Behavior × Condition | −0.10 | 1.67 | .954 | 1.37 | 1.32 | .304 |
Intercept | 46.50 | 8.28 | .000 | 74.46 | 6.46 | .000 |
Pretest | 0.52 | 0.09 | .000 | 0.21 | 0.07 | .002 |
Condition | 0.24 | 1.62 | .885 | −0.28 | 1.27 | .824 |
Hyperactivity/Inattention | −2.91 | 1.19 | .017 | −3.45 | 0.93 | .000 |
Hyperactivity/Inattention × Condition | 1.83 | 1.61 | .257 | 2.16 | 1.26 | .090 |
Note. Based on Social Skills Improvement System–Rating Scale (Gresham & Elliott, 2008). GM-RT = Gates-MacGinitie Reading Test–Fourth Edition (MacGinitie, MacGinitie, Maria, Dreyer, & Hughes, 2000); WJ-III PC = Woodcock–Johnson III Passage Comprehension subtest (Woodcock, McGrew, & Mather, 2001).
The GM-RT and the WJ-III PC both measured reading comprehension. However, they may measure different dimensions of the multidimensional comprehension construct, due in part to their very different formats. The What Works Clearinghouse (Institute of Education Sciences, 2017) recommends that comprehension be treated as a unitary domain for purposes of FDR adjustment. However, we argue that matters may be more nuanced when working in this population. The GM-RT presents test takers with relatively lengthy passages followed by a series of multiple-choice comprehension questions, whereas a typical WJ-III PC item asks students to read only a sentence or, in some cases, a very brief passage. For students with attention and other problem behaviors, these differences may be especially salient. Two hypothetical students who possess the same level of reading comprehension ability may nonetheless score differently across the GM-RT and the WJ-III PC because performance on the two measures may depend as much on attentional control and related problem behaviors as it does on reading ability. For this reason, we treated the two outcome measures as distinct, though related, domains for the purpose of estimating and adjusting for false discovery rates. On the GM-RT, the adjusted p-values for problem behaviors, externalizing behaviors, internalizing behaviors, and hyperactivity/inattention were .01, .01, .331, and .02, respectively. For the WJ-III PC, BH-adjusted p-values were .0002, .0002, .016, and .016 across the four subscales. (As an aside, the BH correction when applied to comprehension as a single domain yielded the same pattern of findings, though the actual p-values differed in magnitude somewhat.)
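The Benjamini–Hochberg adjustment applied within each outcome domain can be sketched as follows. This is a generic implementation of the step-up procedure, not the authors' code; the function name is ours, and the input is the set of unadjusted GM-RT main-effect p-values reported in the text (problem behaviors, externalizing, internalizing, hyperactivity/inattention).

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Unadjusted GM-RT p-values as reported in the text.
gm_rt_raw = [0.003, 0.005, 0.331, 0.017]
gm_rt_adjusted = benjamini_hochberg(gm_rt_raw)
print([round(p, 3) for p in gm_rt_adjusted])
```

Treating the four GM-RT tests as one family gives adjusted values of roughly .01, .01, .331, and .02, in line with the values reported above. Pooling all eight tests into a single comprehension domain would change the multiplier `m` and therefore the adjusted values.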
Problem behaviors by treatment interaction.
Table 3 also presents results for our tests of the interaction effects. On the GM-RT, problem behaviors (p = .042) and externalizing behaviors (p = .015) moderated the effect of treatment, meaning that students with different patterns of poor behavior and/or externalizing behaviors responded differently to the reading treatment, all else being equal. Interaction effects on the WJ-III PC did not differ from 0 (p-values from .09 to .304), nor did the effects for internalizing behaviors and hyperactivity/inattention on the GM-RT (p-values of .95 and .26, respectively). For continuous moderators, tests of interaction effects are modeled at the moderator’s mean. However, the interaction effect may exist at values other than (in addition to) the mean. To evaluate possibilities in this respect, we refit the regression of GM-RT on problem behaviors and on externalizing behaviors for conditional values of the moderators (i.e., simple slopes technique) corresponding to 1 or more standard deviations (SDs) above and below the variables’ respective mean values. For students with problem behaviors at or above 1SD from the mean, reading outcomes as measured by the GM-RT did not differ statistically for treated and untreated students (t = 1.05, p = .315). The sample was small (n = 16), with nine comparison students and seven in the treatment group. The effect size was small to moderate in magnitude (g = 0.28), favoring the treatment condition. For students at or below 1SD from the mean (n = 18; 11 in treatment and 7 in comparison), the reading intervention was not differentially effective (t = −0.64, p = .535). However, the effect size was large (g = 0.85), in favor of the comparison, suggesting that an adequately powered sample may have yielded a statistically significant difference, with students in comparison classes outperforming treated students.
For students with high levels of externalizing behavior (1SD above the sample mean; n = 12, with four in treatment and eight in comparison), the coefficient for treatment’s effect on GM-RT did not differ from 0 (t = 2.10, p = .069). However, the effect size was very large (g = 1.03), favoring the treated condition, meaning that students with more prevalent or severe externalizing behaviors may have benefited from the reading treatment more so than students in the comparison with similar externalizing behaviors. As before, the small samples imply very low statistical power and point to a need for future research with adequately powered subsamples (i.e., samples of students with high externalizing behaviors). A total of 17 students scored 1SD below the mean (z = −1) on externalizing behaviors, 11 in treatment and six in comparison. The effect of treatment did not differ from 0 (t = −0.45, p = .657). The effect size was 0.57 (Hedges g), indicating moderate-sized differences at posttest. The Appendix displays the interaction plots of problem behavior by condition on reading comprehension outcomes, with problem behavior outcomes displayed as −1, 0, and 1SD from the grand mean (Hayes & Montoya, 2017). Reading outcomes are displayed in standard scores.
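The logic of probing a continuous-moderator interaction can be illustrated with the unstandardized GM-RT coefficients in Table 3. The sketch assumes that condition was dummy coded (comparison = 0, treatment = 1), which the source does not state, and it returns only the model-implied treatment difference at a given moderator value. It does not reproduce the subsample t-tests or effect sizes reported above, and the function name is ours.

```python
def conditional_treatment_effect(b_condition, b_interaction, moderator_z):
    """Model-implied treatment effect at a given z-scored moderator value.

    For y = b0 + b1*pretest + b2*condition + b3*z + b4*(z * condition),
    the effect of condition at moderator value z is b2 + b4 * z.
    """
    return b_condition + b_interaction * moderator_z

# GM-RT coefficients from Table 3: (Condition, Behavior x Condition).
models = {
    "problem behavior": (0.34, 3.31),
    "externalizing behavior": (0.64, 3.94),
}

for name, (b_cond, b_int) in models.items():
    for z in (-1.0, 0.0, 1.0):
        effect = conditional_treatment_effect(b_cond, b_int, z)
        print(f"{name}, z = {z:+.0f}: {effect:.2f} standard score points")
```

Under the dummy-coding assumption, the implied treatment difference is positive at 1SD above the mean and negative at 1SD below it for both moderators, which matches the direction of the effect sizes described in the text.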
Discussion
We analyzed a subsample of an extant database (NICHD # HD052117-07) to better understand the relationship of problem behaviors and student response to reading intervention for upper elementary students who experience reading difficulties. The reading treatment was a year-long intervention with multiple components, although behavior was not an instructional focus. We estimated simple main effects for the reading treatment and for problem behaviors as measured by the SSIS-RS. We also evaluated the interaction effects across the two conditions (i.e., treatment and comparison conditions) and the four potential moderators (i.e., problem behavior, externalizing behavior, internalizing behavior, and hyperactivity/inattention).
There were no statistically significant differences at posttest between the treatment and comparison groups. This is consistent with the earlier study (Vaughn et al., 2019) where main treatment effects were estimated across two sites (nine schools total from three districts). In that earlier study, the effect of site (schools in city A versus schools in city B) was also not statistically significant, nor were interactions between site and condition. Therefore, the finding of no main treatment effect was unsurprising and wholly consistent with our own prior research, and with the work of colleagues working with older struggling readers. Challenges in this respect are well-documented in the research literature (e.g., Miciak et al., 2017), and the noted pattern of findings underscores the need to better understand conditions under which response to reading interventions is likely for different populations of students, particularly for students with co-occurring reading and behavior problems. In our sample, as in previous studies (Benner et al., 2010; Durlak et al., 2011; Maggin et al., 2016; Schonfeld et al., 2015) on reading and behavior, higher levels of problem behavior were associated with lower reading outcomes.
Tests of interaction effects suggested that higher levels of problem behavior and higher levels of externalizing behavior moderate the effect of reading treatment on the GM-RT (p < .05), though not on the WJ-III PC. When modeling with continuous moderators, tests of interaction are conditional on the sample mean, and our findings are most robust (due to sample size issues) for students who scored in the average range on subscales of the SSIS-RS. However, results for students who were more distal from the mean on problem behaviors and externalizing behaviors (i.e., more than 1SD above or below the mean) were intriguing, though underpowered, therefore preliminary at this point. Of particular interest was the large effect size (>1SD) for students with high levels of externalizing behavior.
The Co-Occurrence of Reading and Behavior Problems
In the introduction, we described four prevailing theories about the high rates of co-occurrence between reading and behavior problems. Our findings do not directly address the causal underpinnings of this relationship (that was not our intent); however, several inferences seem warranted. First, because these were older students with long standing reading problems, and because only a subset also had behavior problems as measured by the SSIS-RS, it is reasonable to assume that reading difficulties for most of the sample resulted from something other than behavior problems. Most struggling readers, at least in this sample, did not have problems with behavior simply because most of the sampled students, all of whom were struggling to read, did not have elevated levels of problem or externalizing behavior. Note that we implicitly assume that reading difficulty predates behavior problems because our sample consists entirely of poor readers. A study that sampled on poor behaviors and allowed reading ability to vary would require a different set of first assumptions. Of course, it would also represent a very helpful next step for untangling the underlying causal mechanism.
Nonetheless, in the context of our original sampling frame (i.e., all struggling readers), the more reasonable possibility, in our view, is that poor reading and poor behavior co-occur in the subset of the group of poor readers because early reading problems lead to poor behavior, which in turn can lead to lost opportunities to learn, suggesting the bi-directional relationship described in the third of the four theories (Hinshaw, 1992; Morgan et al., 2008). This does not answer the question “why do only a subset of poor readers first engage in poor behavior,” initiating the spiraling back and forth between the reading and behavior problems. It is possible that this subgroup of students, those who respond to early reading problems by engaging in externalizing and problem behaviors, are predisposed to such behaviors due to a third cause or due to multiple causes that correlate with both reading and behavior problems. This idea of correlated liabilities (the fourth theory outlined in the Introduction) is also prevalent in the research. According to the correlated liabilities theory, there exist specific shared risk factors for those with co-occurring reading difficulties and ADHD (e.g., processing speed; McGrath et al., 2011; Peterson et al., 2017; Willcutt et al., 2010) as well as independent risk factors for those with reading difficulties (e.g., phonological awareness) or ADHD (i.e., inhibition; Peterson et al., 2017; Willcutt et al., 2010). A similar phenomenon may be at play in this present context, though additional research across a range of disciplines (e.g., intervention, neuroscience, and clinical research) will be necessary to fully disentangle the underlying causes and observed effects (Hruby & Goswami, 2011; Peterson et al., 2017).
Limitations, Future Research, and Implications for Practice
Several features of the study should be considered when interpreting the findings. In particular, we describe limitations related to measurement and the study sample. First, the SSIS-RS measure was the only behavior measure, and it was administered at posttest only. Unfortunately, the timing of the study precluded the possibility of obtaining student pretest behavioral data, as the study began early in the school year and teachers were not yet able to provide accurate data on student behavior. Furthermore, when asking the ELA teachers to complete the SSIS-RS, we did not disclose student assignment to condition; however, we also did not conceal group assignment from the ELA teachers during the study. Therefore, teachers could have identified which students received the researcher-delivered treatment and which did not, inadvertently creating a potential bias. Finally, student behavior progress monitoring data were collected through neither direct observations nor surveys, so we could not monitor the extent to which behavior changed throughout the intervention. Collecting progress monitoring data could have allowed us to better understand the conditions under which a statistically significant interaction between behavior and assignment to condition on reading outcomes may occur.
The study would have also benefited from collecting data on the extent to which students received a school-delivered behavioral intervention during this study’s implementation. This would have allowed us to contextualize the findings for the small sample of students with higher levels of problem behaviors. Furthermore, due to large interaction effects, an adequately powered sample may have yielded a statistically significant difference not currently identified.
Future research is needed to address these measurement and sample limitations. For example, future studies could investigate whether outcomes of a similar study vary based on student, parent, or other teacher reports, either through the SSIS-RS or progress monitoring tools (e.g., daily reports and direct observations). The collecting of additional data, such as progress monitoring (e.g., behavior and attention), or the extent to which a student received a school-delivered behavioral intervention would support the ecological validity of future studies. As hyperactivity and inattention were associated with lower reading outcomes for both conditions, future research could also include more robust measures of ADHD. Considering that students with co-occurring reading difficulties and ADHD, as compared with peers without reading difficulties or ADHD, have deficits in reading, as well as cognitive and executive functioning domains, future research is also needed to identify targeted and individualized interventions or components to be embedded in reading interventions to meet specific learner needs.
Finally, this study provided evidence to support practitioners. Most notably, this study provided evidence of the importance of SWRD + PB receiving intensive small-group systematic and explicit research-based reading instruction. When participating students with elevated problem behaviors received such intensive instruction, they outperformed similar peers not receiving the reading treatment on reading comprehension outcomes. Therefore, when schools aim to further develop their school-wide RTI frameworks, it would be worthwhile for such frameworks to include both screening for problem behaviors and systematic and explicit small-group reading instruction as part of a comprehensive model of treatment for SWRD + PB. Findings also suggested that behavior negatively impacted reading outcomes, regardless of assignment to condition. Therefore, SWRD + PB may continue to need support beyond a reading treatment to make adequate gains. When additional support is needed, practitioners can turn to research to provide guidance from other independent behavioral interventions (e.g., daily report card) or components to embed within reading instruction (e.g., manipulate antecedents [e.g., provide choice], provide specific behavior-related praise, provide tangible reinforcement) to increase engagement and reduce disruptive behaviors (e.g., Epstein, Atkins, Cullinan, Kutash, & Weaver, 2008; Moore et al., 2017). Utilizing comprehensive models to support both reading and behavioral outcomes can aid both researchers and practitioners to (a) better develop RTI frameworks to predict which students will need further intensified interventions in reading and behavior, (b) deliver and adapt intensive interventions, and (c) determine intervention efficacy to make instructional changes based on progress monitoring data. Such models are needed to mitigate the risk of treatment resistance to reading intervention.
Conclusion
Study findings suggested that behavior should be further investigated as a factor associated with response to intervention, particularly for students in the upper elementary grades for whom less is known. The relationship between problem behavior and reading achievement is well-cited, yet additional research is needed to identify, develop, and adapt effective intervention for SWRD + PB, regardless of students meeting criteria for ED or ADHD. Finally, this study demonstrated that students with higher levels of problem behaviors benefited from an intensive reading intervention, as well as provided rigorous evidence that problem behavior was a factor related to a student’s response to intervention, and therefore needs to be considered within a school-wide RTI framework.
Acknowledgments
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grant P50 HD052117-07 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development.
Appendix
Interaction plot of condition by behavior on reading outcomes.
Note. Behavior scales were from the Social Skills Improvement System–Rating Scale (Gresham & Elliott, 2008). GM-RT = Gates MacGinitie Reading Test (MacGinitie, MacGinitie, Maria, Dreyer, & Hughes, 2000); WJ-III PC = Woodcock-Johnson III Passage Comprehension subtest (Woodcock, McGrew, & Mather, 2001).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
References
- Algozzine B, Wang C, & Violette AS (2011). Reexamining the relationship between academic achievement and social behavior. Journal of Positive Behavior Interventions, 13, 3–16. doi: 10.1177/1098300709359084 [DOI] [Google Scholar]
- Benjamini Y, & Hochberg Y (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57, 289–300. [Google Scholar]
- Benner GJ, Nelson JR, Ralston NC, & Mooney P (2010). A meta-analysis of the effects of reading instruction on the reading skills of students with or at risk of behavioral disorders. Behavioral Disorders, 35, 86–102. Retrieved from jstor.org/stable/43153810 [Google Scholar]
- Bonferroni CE (1935). Il calcolo delle assicurazioni su gruppi di teste. In Carboni SO (Ed.), Studi in onore del Professore Salvatore Ortu Carboni (pp. 13–60).Rome, Italy: Bardi. [Google Scholar]
- Carroll JM, Maughan B, Goodman R, & Meltzer H (2005). Literacy difficulties and psychiatric disorders: Evidence for comorbidity. Journal of Child Psychology and Psychiatry, 46, 524–532. doi: 10.1111/j.1469-7610.2004.00366.x [DOI] [PubMed] [Google Scholar]
- Chall JS, & Jacobs VA (1983). Writing and reading in the elementary grades: Developmental trends among low SES children. Language Arts, 60, 617–626. [Google Scholar]
- Compton DL, Fuchs D, Fuchs LS, Elleman AM, & Gilbert JK (2008). Tracking children who fly below the radar: Latent transition modeling of students with late-emerging reading disability. Learning and Individual Differences, 18, 329–337. doi: 10.1016/j.lindif.2008.04.003 [DOI] [Google Scholar]
- Durlak JA, Weissberg RP, Dymnicki AB, Taylor RD, & Schellinger KB (2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82, 405–432. doi: 10.1111/j.1467-8624.2010.0156 [DOI] [PubMed] [Google Scholar]
- Epstein M, Atkins M, Cullinan D, Kutash K, & Weaver R (2008). Reducing behavior problems in the elementary school classroom: A practice guide (NCEE #2008-012). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from https://ies.ed.gov/ncee/wwc/Docs/PracticeGuide/behavior_pg_092308.pdf [Google Scholar]
- Farmer TW, Gatzke-Kopp LM, Lee DL, Dawes M, & Talbott E (2016). Research and policy on disability: Linking special education to developmental science. Policy Insights from the Behavioral and Brain Sciences, 3, 138–145. doi: 10.1177/2372732215624217 [DOI] [Google Scholar]
- Forness SR, Freeman SFN, Paparella T, Kauffman JM, & Walker HM (2012). Special education implications of point and cumulative prevalence for children with emotional or behavioral disorders. Journal of Emotional and Behavioral Disorders, 20, 4–18. doi: 10.1177/1063426611401624 [DOI] [Google Scholar]
- Fuchs D, Fuchs LS, & Vaughn S (2014). What is intensive instruction and why is it important? Teaching Exceptional Children, 46, 13–18. doi: 10.1177/0040059914522966 [DOI] [Google Scholar]
- Gresham FM, & Elliot SN (1990). Social Skills Rating System (SSRS). Circle Pines, MN: American Guidance Service. [Google Scholar]
- Gresham FM, & Elliott SN (2008). Social Skills Improvement System (SSIS) Rating Scales. Bloomington, MN: Pearson Assessments. [Google Scholar]
- Griffiths YM, & Snowling MJ (2002). Predictors of exception word and nonword reading in dyslexic children: The severity hypothesis. Journal of Educational Psychology, 94, 34–43. doi: 10.1037//0022-0663.94.1.34 [DOI] [Google Scholar]
- Guthrie JT, Schafer WD, & Huang CW (2001). Benefits of opportunity to read and balanced instruction on the NAEP. The Journal of Educational Research, 94, 145–162. doi: 10.1080/00220670109599912 [DOI] [Google Scholar]
- Gwet K (2001). Handbook of inter-rater reliability: How to estimate the level of agreement between two or multiple raters. Gaithersburg, MD: STATAXIS. [Google Scholar]
- Hagan-Burke S, Kwok OM, Zou Y, Johnson C, Simmons D, & Coyne MD (2011). An examination of problem behaviors and reading outcomes in kindergarten students. The Journal of Special Education, 45, 131–148. doi: 10.1177/0022466909359425 [DOI] [Google Scholar]
- Hayes AF, & Montoya AK (2017). A tutorial on testing, visualizing, and probing an interaction involving a multicate-gorical variable in linear regression analysis. Communication Methods and Measures, 11, 1–30. doi: 10.1080/19312458.2016.1271116 [DOI] [Google Scholar]
- Hedges LV (1981). Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational and Behavioral Statistics, 6, 107–128. doi: 10.3102/10769986006002107 [DOI] [Google Scholar]
- Hiebert E (2003). QuickReads: A research-based fluency program. Parsippany, NJ: Pearson. [Google Scholar]
- Hinshaw SP (1992). Externalizing behavior problems and academic underachievement in childhood and adolescence: Causal relationships and underlying mechanisms. Psychological Bulletin, 111, 127–155. [DOI] [PubMed] [Google Scholar]
- Hruby GG, & Goswami U (2011). Neuroscience and reading: A review for reading education researchers. Reading Research Quarterly, 46, 156–172. doi: 10.1598/RRQ.46.2.4 [DOI] [Google Scholar]
- Institute of Education Sciences. (2017). What Works Clearinghouse procedures and standards handbook version 4.0. Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_procedures_handbook_v4.pdf
- Jacobson LA, Ryan M, Denckla MB, Mostofsky SH, & Mahone EM (2013). Performance lapses in children with attention-deficit/hyperactivity disorder contribute to poor reading fluency. Archives of Clinical Neuropsychology, 28, 672–683. doi: 10.1093/arclin/act048 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lin YC, Morgan PL, Farkas G, Hillemeier M, & Cook M (2013). Reading, mathematics, and behavioral difficulties interrelate: Evidence from a cross-lagged panel design and population-based sample of US upper elementary students. Behavioral Disorders, 38, 212–227. doi: 10.1177/019874291303800404 [DOI] [PMC free article] [PubMed] [Google Scholar]
- MacGinitie WH, MacGinitie RK, Maria K, Dreyer LG, & Hughes KE (2000). Gates-MacGinitie Reading Tests (4th ed.). Itasca, IL: Riverside. [Google Scholar]
- Maggin DM, Wehby JH, & Gilmour AF (2016). Intensive academic interventions for students with emotional and behavioral disorders: An experimental framework. Journal of Emotional and Behavioral Disorders, 24, 138–147. doi: 10.1177/1063426616649162 [DOI] [Google Scholar]
- Maughan B, Rowe R, Loeber R, & Stouthamer-Loeber M (2003). Reading problems and depressed mood. Journal of Abnormal Child Psychology, 31, 219–229. doi:0091-0627/03/0400-0219/0 [DOI] [PubMed] [Google Scholar]
- McGrath LM, Pennington BF, Shanahan MA, Santerre-Lemmon LE, Barnard HD, Willcutt EG, … Olson Richard K (2011). A multiple deficit model of reading disability and attention-deficit/hyperactivity disorder: Searching for shared cognitive deficits. Journal of Child Psychology and Psychiatry, 52, 547–557. doi: 10.1037/spq0000037 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miciak J, Roberts GJ, Taylor WP, Solis M, Vaughn S, & Fletcher JM (2017). The effects of a two-year, intensive reading intervention implemented with late elementary struggling readers. Learning Disabilities Research & Practice, 33, 24–36. doi: 10.1111/ldrp.12159 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Moore TC, Wehby JH, Oliver RM, Chow JC, Gordon JR, & Mahany LA (2017). Teachers’ reported knowledge and implementation of research-based classroom and behavior management strategies. Remedial and Special Education, 38, 222–232. doi: 10.1177/0741932516683631 [DOI] [Google Scholar]
- Morgan PL, Farkas G, Tufis PA, & Sperling RA (2008). Are reading and behavior problems risk factors for each other? Journal of Learning Disabilities, 41, 417–436. doi: 10.1177/0022219408321123 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morgan PL, Farkas G, & Wu Q (2009). Kindergarten predictors of recurring externalizing and internalizing psychopathology in the third and fifth grades. Journal of Emotional and Behavioral Disorders, 17, 67–79. doi: 10.1177/1063426608324724 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nelson JR, Benner GJ, & Gonzalez J (2003). Learner characteristics that influence the treatment effectiveness of early literacy interventions: A meta-analytic review. Learning Disabilities Research and Practice, 18, 255–267. doi: 10.1111/15405826.00080 [DOI] [Google Scholar]
- Pennington BF (2006). From single to multiple deficit models of developmental disorders. Cognition, 101, 385–413. doi: 10.1016/j.cognition.2006.04.008
- Peterson RL, Boada R, McGrath LM, Willcutt EG, Olson RK, & Pennington BF (2017). Cognitive prediction of reading, math, and attention: Shared and unique influences. Journal of Learning Disabilities, 50, 408–421. doi: 10.1111/j.1469-7610.2010.02346.x
- Roberts G, Rane S, Fall A-M, Denton CA, Fletcher JM, & Vaughn S (2015). The impact of intensive reading intervention on level of attention in middle school students. Journal of Clinical Child & Adolescent Psychology, 44, 942–953. doi: 10.1080/15374416.2014.913251
- Roberts GJ, Solis M, Ciullo S, McKenna JW, & Vaughn S (2015). Reading interventions with behavioral and social skill outcomes: A synthesis of research. Behavior Modification, 39, 8–42. doi: 10.1177/0145445514561318
- Roeser RW, van der Wolf K, & Strobel KR (2001). On the relation between social-emotional and school functioning during early adolescence: Preliminary findings from Dutch and American samples. Journal of School Psychology, 39, 111–139.
- Schonfeld DJ, Adams RE, Fredstrom BK, Weissberg RP, Gilman R, Voyce C, … Speese-Linehan D (2015). Cluster-randomized trial demonstrating impact on academic achievement of elementary social-emotional learning. School Psychology Quarterly, 30, 406–420. doi: 10.1037/spq0000099
- Shinn MR, & Shinn MM (2002). AIMSweb training workbook: Administration and scoring of reading maze for use in general outcome measurement. Eden Prairie, MN: Edformation.
- Snijders TAB, & Bosker RJ (1999). Testing and model specification. In Snijders TAB & Bosker RJ (Eds.), Multilevel analysis: An introduction to basic and advanced multilevel modeling (pp. 86–98). Thousand Oaks, CA: SAGE.
- Tamm L, Denton CA, Epstein JN, Schatschneider C, Taylor H, Arnold LE, … Maltinsky J (2017). Comparing treatments for children with ADHD and word reading difficulties: A randomized clinical trial. Journal of Consulting and Clinical Psychology, 85, 434–446. doi: 10.1037/ccp0000170
- Tannock R, Frijters JC, Martinussen R, White EJ, Ickowicz A, Benson NJ, & Lovett MW (2018). Combined modality intervention for ADHD with comorbid reading disorders: A proof of concept study. Journal of Learning Disabilities, 51, 55–72. doi: 10.1177/0022219416678409
- Vaughn S, Roberts GJ, Miciak J, Taylor PM, & Fletcher JM (2019). Randomized control trial of a word and text-based comprehension intervention with fourth- and fifth-grade students with significant reading comprehension problems. Journal of Learning Disabilities, 52, 31–44. doi: 10.1177/0022219418775113
- Wagner RK, Torgesen JK, Rashotte CA, & Pearson NA (2010). TOSREC: Test of silent reading efficiency and comprehension. Austin, TX: Pro-Ed.
- Wechsler D (2009). Wechsler Individual Achievement Test (3rd ed.). London, England: The Psychological.
- Willcutt EG, Betjemann RS, McGrath LM, Chhabildas NA, Olson RK, DeFries JC, & Pennington BF (2010). Etiology and neuropsychology of comorbidity between RD and ADHD: The case for multiple-deficit models. Cortex, 46, 1345–1361. doi: 10.1016/j.cortex.2010.06.009
- Woodcock RW (1987). Woodcock Reading Mastery Tests-Revised. Circle Pines, MN: American Guidance Service.
- Woodcock RW, McGrew KS, & Mather N (2001). Woodcock-Johnson III tests of achievement. Itasca, IL: Riverside.