Abstract
Objective
No previous empirical study has investigated whether the LD identification decisions of proposed methods to operationalize processing strengths and weaknesses (PSW) approaches for LD identification are associated with differential treatment response. We investigated whether the identification decisions of the concordance/discordance model (C/DM; Hale & Fiorello, 2004) and Cross Battery Assessment approach (XBA method; Flanagan, Ortiz, & Alfonso, 2007) were consistent and whether they predicted intervention response beyond that accounted for by pretest performance on measures of reading.
Method
Psychoeducational assessments were administered at pretest to 203 4th graders with low reading comprehension and individual results were utilized to identify students who met LD criteria according to the C/DM and XBA methods and students who did not. Resulting group status permitted an investigation of agreement for identification methods and whether group status at pretest (LD or not LD) was associated with differential treatment response to an intensive reading intervention.
Results
The LD identification decisions of the XBA and C/DM demonstrated poor agreement with one another (κ = −.10). Comparisons of posttest performance for students who met LD criteria and students who did not were largely null, with small effect sizes across all measures.
Conclusions
LD status, as identified through the C/DM and XBA approaches, was not associated with differential treatment response and did not contribute educationally meaningful information about how students would respond to intensive reading intervention. These results do not support the value of cognitive assessment utilized in this way as part of the LD identification process.
Keywords: learning disabilities, cognitive assessment, reading disabilities
The identification of a student with learning disabilities (LD) has important educational, legal, and scientific implications. A positive identification decision may lead to segregated instruction, assessment and curricular modifications, and legal protections unavailable to students who do not meet LD criteria. Students with LD also represent an important population for experimental and descriptive research. For these reasons, the LD identification process and its criteria for identification should demonstrate adequate reliability and validity.
Historically, researchers and practitioners concerned with LD have struggled to achieve consensus on issues of definition and the most valid and reliable process for the identification of LD; multiple alternatives have been proposed (for an historical review, see Fletcher, 2012). Yet, the extent to which proposed identification processes and criteria for identification are reliable and valid is an empirical question (Morris & Fletcher, 1998). Proposed processes and criteria can be applied and resulting subgroups can be compared on dimensions not utilized to form subgroups, such as response to treatment, psychoeducational measures not utilized for group formation, or comparisons utilizing neuroimaging techniques. The classification accrues validity if the resulting groups differ on these external dimensions.
In the present study, we investigated the reliability and validity of two proposed methods for LD identification: (a) the cross battery assessment method (XBA; Flanagan, Ortiz, & Alfonso, 2007) and (b) the concordance/discordance model (C/DM; Hale & Fiorello, 2004). Both proposed methods operationalize a cognitive discrepancy approach to LD identification based on the identification of an intraindividual pattern of processing strengths and weaknesses (PSW). A third common method, the Discrepancy/Consistency Method (D/CM; Naglieri, 1999), requires a specific assessment battery (Cognitive Assessment System; Naglieri & Das, 1997) and is therefore not included in the present study.
Patterns of Processing Strengths and Weaknesses for Learning Disabilities Identification
In 2010, the Learning Disabilities Association (LDA) issued a White Paper documenting findings from a nonrandom survey of experts in the fields of education, psychology, medicine, and law (Hale et al., 2010). The White Paper reached five conclusions: (1) the statutory definition of LD should be maintained; (2) neither an IQ-achievement discrepancy nor a failure to respond to intervention is sufficient as a means of LD identification; (3) PSW methods make the most empirical and clinical sense; (4) comprehensive evaluations should occur for the purposes of LD identification; and (5) the results of cognitive and neuropsychological assessments should be utilized for both LD identification and intervention planning (Hale et al., 2010, p. 223). However, these conclusions—particularly conclusions 3–5—were controversial and do not represent a consensus among experts in the surveyed fields (Consortium for Evidence-Based Early Intervention Practices [CEBEIP], 2010).
Despite this unsettled scientific controversy, PSW methods continue to receive attention within the school psychology community and among policy makers. PSW methods have been described and recommended at national conferences, and numerous articles and books have been published arguing for their adoption. Further, there is evidence that this growing literature is affecting practice. A recent review of LD identification procedures in the U.S. indicates that at least 25 states allow some use of PSW methods for LD identification, although specific guidance is noticeably lacking (Maki, Floyd, & Roberson, 2015). At least 14 of these states explicitly allow PSW methods to identify LD, and two states (Oregon and Tennessee) provide specific guidelines for the utilization of PSW methods for LD identification.
Proposed Methods to Operationalize PSW Methods
At least three approaches to operationalize PSW methods have been proposed and are commonly cited in training documents as examples of PSW methods (Hanson, Sharmon, & Esparza-Brown, 2008). Although often presented as equivalent (e.g., Hale et al., 2010), these proposed operationalizations demonstrate key differences in how they identify an intraindividual PSW pattern and in theoretical orientation. For example, the XBA approach utilizes a series of normative comparisons to identify deficient and normal academic and cognitive abilities. The criterion for LD identification is an intraindividual PSW marked by specific academic and cognitive deficits in an otherwise normal cognitive profile (Flanagan et al., 2007). In contrast, the C/DM calculates difference scores across a number of cognitive and achievement tests and compares these difference scores against computed critical values to determine whether there is a pattern of significant and nonsignificant differences between an achievement deficit, a cognitive deficit, and a cognitive strength (Hale & Fiorello, 2004). Within the C/DM, LD is marked by a nonsignificant difference between an achievement deficit and a related cognitive deficit, as well as significant differences between those deficit areas and a cognitive strength. The D/CM also relies on a series of intraindividual comparisons to identify LD. Within the D/CM approach, performance on highly related tests of cognitive abilities and achievement is hypothesized to be consistent, while cognitive abilities not related to the academic domain are discrepant. The D/CM also differs from the other approaches in its theoretical orientation (Naglieri, 1999): it utilizes a cognitive assessment battery based on the Planning, Attention, Simultaneous, and Successive (PASS) theory of intelligence. In contrast, the XBA approach is aligned with the Cattell-Horn-Carroll (CHC) theory of intelligence, and the C/DM is atheoretical. In the present study, both the XBA and C/DM are operationalized in a manner aligned with CHC theory.
CHC theory synthesizes the three-stratum theory of cognitive ability (Carroll, 1993) and extended Gf-Gc theory (Horn & Noll, 1997) into a single, hierarchical taxonomy of cognitive functioning (McGrew, 2009). The three strata progress from specific or narrow abilities to more general abilities. Stratum I consists of over 80 identified narrow abilities, which are organized within Stratum II broad ability domains or clusters. Most theorists identify nine clusters or broad abilities: (1) fluid intelligence (Gf); (2) crystallized intelligence (Gc); (3) visual processing (Gv); (4) auditory processing (Ga); (5) short-term memory (Gsm); (6) long-term storage and retrieval (Glr); (7) processing speed (Gs); (8) quantitative reasoning (Gq); and (9) reading and writing ability (Grw). Stratum III represents overall cognitive ability or intelligence (g). Within the XBA approach, the assessment battery typically includes tests across all seven “cognitive” broad ability domains and may include assessment of either or both academic domains (i.e., quantitative reasoning and reading and writing ability; Flanagan et al., 2007). The C/DM provides the diagnostician greater flexibility to develop an individualized test battery that assesses hypothesized areas of cognitive and academic weakness, as well as cognitive abilities unlikely to be affected (Hale & Fiorello, 2004).
Rationale for PSW Methods
Proponents of PSW methods often cite two related lines of research as evidence for the validity of PSW methods for LD identification. First, in recent years researchers in cognitive psychology have advanced understanding of the constitution and organization of cognitive abilities (Horn & Noll, 1997; McGrew, 2009) and the relation of those abilities to academic performance (Evans, Floyd, McGrew, & Leforgee, 2002; Flanagan, Fiorello, & Ortiz, 2010). For example, the Cattell-Horn-Carroll (CHC) theory represents a unified schema for understanding cognitive functions and provides a common language for the investigation of interrelations of different cognitive and academic abilities. Further, cognitive research has elucidated many of the critical cognitive functions underlying important academic skills and predictors. To illustrate, the importance of auditory processing (Ga), and specifically phonological processing, to early word reading is well established (Torgesen, Wagner, Rashotte, Burgess, & Hecht, 1997; Speece, Ritchey, Cooper, Roth, & Schatschneider, 2004). Crystallized intelligence (Gc), in the form of vocabulary and background knowledge, is strongly related to reading comprehension, particularly from late elementary school onward (Catts, Adlof, & Weismer, 2006; Evans et al., 2002).
A second line of research has investigated the cognitive characteristics of students with difficulties in reading (Al Otaiba & Fuchs, 2002; Nelson, Benner, & Gonzalez, 2003) and in math (Fuchs et al., 2008; Geary, 2011). Along these lines, Johnson et al. (2011) conducted a meta-analysis comparing the cognitive processing of students with LD to their typically achieving peers. The authors analyzed 213 effect sizes across eight areas of cognitive processing and executive functioning. Results indicated that students with LD demonstrated moderate to large comparative deficits across all areas (reading disabled vs. typical effect size range: −.595 to −1.276). Deficits were most pronounced in the areas of cognitive processing and verbal working memory. Results were similar when comparing students with math disabilities (math disabled vs. typical effect size range: −.594 to −2.656). The authors conclude that such pronounced deficits in cognitive functioning have important educational implications and therefore should be assessed as part of the LD identification process (Johnson et al., 2011, p. 15).
However, these two lines of research do not represent direct evidence for the reliability and validity of PSW methods for LD identification. Specifically, the studies above first identify LD on the basis of academic performance and then examine cognitive performance, rather than using cognitive performance as a criterion for initial identification. The reliability and validity of methods of identification based on cognitive performance should be directly evaluated by determining if the identification decisions that emerge are consistent and reflect educationally meaningful distinctions, including the capacity for reliably and validly differentiating students with academic deficits identified as demonstrating LD from students with similar academic deficits who are not identified as demonstrating LD.
Research on PSW Methods
Recent studies have raised questions about the reliability of proposed PSW methods. Stuebing et al. (2012) simulated data to evaluate the technical adequacy of proposed PSW methods. Latent variables were created that simulated the relations between commonly utilized tests of achievement and cognitive functioning. The relations between latent academic and cognitive variables were utilized to generate case-level values and permitted an investigation of identification rates, as well as overall agreement, positive predictive values (PPV), and negative predictive values (NPV). Across three proposed operationalizations of the PSW method (including the C/DM and XBA methods), identification rates were very low, resulting in high overall agreement and high NPV. However, PPVs were consistently low, suggesting that even if a PSW profile was educationally meaningful, it could not be reliably identified at the level of observation.
Subsequently, Miciak, Fletcher, et al. (2014) and Miciak, Taylor, et al. (2014) investigated the reliability of PSW methods utilizing empirical data from students in middle school and early elementary school. Miciak, Fletcher, et al. (2014) compared the identification decisions of two methods to operationalize the PSW approach: the XBA (Flanagan et al., 2007) and C/DM (Hale & Fiorello, 2004) methods. Results indicated poor agreement for LD identification decisions between the two methods (kappa range: −.04 to .31), which differed in their reliance on normative deficits or intraindividual deficits as inclusionary criteria for LD. These findings suggest that the two methods should be validated separately. Miciak, Taylor, et al. (2014) evaluated the effect of achievement test selection on the LD identification decisions of the C/DM. The study utilized two batteries that were equivalent at the latent level but differed in the achievement tests utilized to establish academic weaknesses. Across the two batteries, agreement was low (kappa = .29), suggesting that different identification decisions result from differences in test selection. Together, these studies illustrate that PSW methods do not overcome well-established reliability problems associated with IQ-achievement discrepancy criteria, previously the most common cognitive discrepancy method for the identification of LD (Macmann, Barnett, Lombard, Belton-Kocher, & Sharpe, 1997; Francis et al., 2005).
Cognitive Profiles and Intervention Response
Unfortunately, all psychometric methods for the identification of LD are unreliable at the individual level because of imperfect test reliability and validity (Fletcher, Lyon, Fuchs, & Barnes, 2007). Despite reliability challenges at the individual level, it is possible that PSW methods could demonstrate good validity and utility, making them suitable for widespread adoption. For example, proponents of PSW methods argue that the identification of intraindividual patterns of cognitive processing strengths and weaknesses can inform academic interventions (Fiorello, Hale, & Wycoff, 2012; Johnson, 2014; Hale et al., 2010; Flanagan, Ortiz, & Alfonso, 2008) and explain why a particular intervention was not effective (Fiorello et al., 2012; Johnson, 2014; Hale et al., 2010). These claims are based on an assumption of differential treatment response for students with different cognitive profiles (aptitude by treatment interactions). However, to our knowledge, no peer-reviewed study has directly investigated whether LD status derived from PSW methods is associated with differential treatment response, nor whether the inclusionary criteria specified by PSW methods are associated with differential treatment response. These are fundamental questions because much of the literature in support of PSW methods includes the assertion that students with a PSW profile need individualized interventions tailored to this profile to improve academic skills (Decker, Hale, & Flanagan, 2013; Fiorello, Hale, & Wycoff, 2012).
Stuebing et al. (2014) identified three analytic approaches to investigate the relation of students’ cognitive characteristics to treatment response: (a) unconditional gain models; (b) unconditional growth models; and (c) growth models conditioned on initial reading status. These approaches differ in the analytic methods used and in the research questions addressed. For example, unconditional models answer questions about the relation of specific cognitive processes to improvement in reading, without investigating the cause of this association. Such models are valuable for elaborating theoretical relations between variables and identifying potential targets for future intervention research. In contrast, conditional growth models include baseline achievement and cognitive performance in order to parse the “value added” effects of cognitive predictors of intervention response. In this model, the question is whether a baseline cognitive characteristic predicts posttest performance after controlling for pretest performance. This focus on growth unexplained by initial achievement is particularly useful in educational settings because academic testing is less expensive and less burdensome than cognitive testing, which must be completed individually by a school psychologist or trained diagnostician. Improved understanding of the cognitive characteristics or profiles of students who will respond differently to intensive interventions would be useful, educationally meaningful information. Conversely, if cognitive characteristics or profiles do not result in better predictive models of who will and who will not respond to subsequent intervention, they are of questionable value.
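In schematic form (notation ours, not the authors’), the conditional model regresses posttest performance on pretest performance plus the cognitive characteristic of interest:

```latex
% Growth conditioned on initial status: the "value added" question is whether
% \beta_2 differs from zero once the pretest term is in the model.
Y^{\text{post}}_i = \beta_0 + \beta_1 \, Y^{\text{pre}}_i + \beta_2 \, C_i + \varepsilon_i
```

Here C_i denotes the cognitive characteristic or profile indicator; the ANCOVA models reported in the Results section take exactly this form, with C_i dichotomized.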
The Present Study
To our knowledge, no empirical study has investigated whether students who meet LD criteria according to proposed PSW methods demonstrate differential intervention response. In the present study, we investigated whether the identification of a PSW profile was predictive of differential intervention response. Specifically, we investigated the application of PSW approaches as operationalized by the C/DM and XBA methods for the identification of LD in the context of an intensive reading intervention for 4th graders with reading comprehension deficits. We used cognitive and academic data with this group of struggling students at pretest to identify students who did and did not meet PSW criteria for LD. Comparisons of the resulting groups permitted an investigation of the agreement between the two PSW methods for LD identification and whether group status defined in that manner was associated with differential treatment response on measures of word reading, reading fluency, and reading comprehension, after controlling for pretest performance. Further analyses investigated whether the inclusionary criteria specified by the C/DM and XBA methods predicted differential intervention response. Three hypotheses guided the study:
Hypothesis 1
Based on prior work (e.g., Stuebing et al., 2014; Miciak, Fletcher, et al., 2014), we hypothesized that the C/DM and XBA methods would identify different individuals as meeting LD criteria.
Hypothesis 2
Although little evidence directly addresses this issue, to the extent that PSW identification methods are valid, students identified as LD by the C/DM and XBA methods should demonstrate differential response to reading intervention relative to students who do not meet PSW criteria.
Hypothesis 3
To the extent that the inclusionary criteria specified by PSW identification methods are valid, students identified as meeting these criteria should demonstrate differential response to reading intervention.
Methods
Overview
This study took place with the approval of the institutional review boards of all participating universities. All data were collected as part of two larger, overlapping studies investigating intervention for reading comprehension and the relations of executive functions (EFs) and reading comprehension among late elementary students. The intervention study evaluated the effectiveness of a small-group, explicit reading intervention for improving reading comprehension outcomes for fourth-grade students with significant reading comprehension difficulties (Vaughn, Solís, Miciak, Taylor, & Fletcher, 2015). Most students from the intervention study also participated in a large measurement study investigating the relations of EFs, cognitive and language abilities, and reading comprehension. Data collected in fall 2012 for 206 fourth graders who failed to respond to school-based reading interventions were utilized to empirically classify students who met or did not meet LD criteria using the C/DM and XBA methods for LD identification prior to receiving researcher-provided reading intervention. Resulting groups were then compared to evaluate agreement for LD decisions and whether LD status (and inclusionary criteria) predicted differential treatment response.
Participants and Setting
Schools
This study was conducted in three school districts in the Southwestern United States. Approximately half the sample was drawn from eight schools located in one large urban district. The remaining sample was drawn from nine schools located in two near-urban school districts. The mean enrollment of the 17 participating schools was 697 students (range 425–1,140 students). All schools included a significant population of students qualifying for free or reduced-price lunch (M = 81.6%; range 46.1%–98.4%).
Criteria for participation in the intervention
All fourth-grade students in participating schools (n = 1,695) were administered the Gates-MacGinitie Reading Test (GMRT; MacGinitie, MacGinitie, Maria, Dreyer, & Hughes, 2000). Students were excluded from screening if (a) they were enrolled in an alternative curriculum (i.e., a life skills class) or (b) they were identified as having a significant sensory disability that interfered with participating in the study (e.g., blindness, deafness). Students who scored more than 1 SD below the population mean (standard score < 85) were eligible for participation. A total of 488 students met eligibility criteria and represent the initial sampling frame.
Student participants
Three of the initially identified 488 students were not randomized: one parent refused participation; one student did not complete the screening test; and one student was identified with significant intellectual deficits unknown at the time of screening. The 485 eligible participants were randomly assigned within school 2:1 to treatment (n = 324) or to a business-as-usual control condition (BAU; n = 161). Because the goal of the present study concerned response to the explicit reading intervention, participation was limited to students assigned to the treatment condition. Among the initial 324 students assigned to treatment, 37 withdrew from the intervention study: 12 moved from a participating school; 24 were withdrawn from the study by the school because of scheduling conflicts or behavior problems (n = 19), by the parent (n = 3), or for unknown reasons (n = 2); and one additional student was excluded because of a visual impairment unknown at the time of randomization. Among the 287 students who received and completed the reading intervention, 81 were not included in this study because they did not complete the full cognitive assessment battery and therefore could not be classified according to PSW methods. This missing data pattern reflects the planned missingness design of the larger measurement study: students in that study were randomly assigned to one of six comprehensive test patterns or a seventh minimal pattern to evaluate EFs and reading comprehension in late elementary school. Most, but not all, of the intervention students completed one of the six comprehensive test patterns (for a more complete discussion of the measures and procedures of the measurement study, see http://www.texasldcenter.org/projects/measures). The final sample of 206 students did not differ from the initial sample of 324 on the GMRT administered at screening, F(1, 323) = .01, p > .05. Demographic information for the final sample is provided in Table 1.
Table 1.

| Variable | % |
|---|---|
| Male | 55.3 |
| Free or reduced-price lunch | 93.4 |
| Race and ethnicity | |
| African American | 23.8 |
| Latino/Hispanic | 28.2 |
| White, not Hispanic | 9.2 |
| Other/two or more races | 39.0 |

N = 206
Intervention
Intervention participants were provided instruction daily for 16 weeks (November through April) in groups of four to five students. Each intervention lesson consisted of three components: vocabulary instruction, word study, and text-based reading of expository text. Lesson content was aligned to the concepts taught in participants’ social studies classes. Eighteen female interventionists were hired and trained by the research team to provide the intervention. All interventionists but one held at least an undergraduate degree; six held a master’s degree, and one held a doctorate in education. Interventionists were provided 10 hours of professional development prior to teaching, an additional 8 hours of professional development throughout the year, and on-site coaching (approximately once every 2 to 3 weeks).
Intervention fidelity
All intervention sessions were audio recorded to evaluate intervention fidelity. For each tutor, eight lessons were randomly selected, blocking on reading group and school, to be coded for fidelity. Each instructional component was rated on a 4-point Likert-type scale with anchors of 1 (low), 2 (mid-low), 3 (mid-high), and 4 (high). Quality of implementation (e.g., active engagement, frequent opportunities for student response, appropriate use of feedback and pacing) was rated on the same 4-point scale for each instructional component. Mean coder reliability was 95%. The mean implementation score across components and interventionists was 3.71 (SD = 0.24, range 3.45 to 4.00). The mean quality score across components and interventionists was 3.71 (SD = 0.46, range 2.00 to 4.00). The mean total fidelity rating was 3.48 (SD = 0.55, range 2.00 to 3.82).
Intervention results
Results of the larger intervention study (Vaughn et al., 2015) indicated statistically significant differences at posttest between the treatment and control groups on a measure of spelling only (d = .38). Between-group comparisons on measures of decoding, fluency, and reading comprehension were not statistically significant (d range = −.06 to .12). The lack of statistically significant between-group differences is not necessarily indicative of an ineffective intervention: between-group comparisons test the effectiveness of both the researcher-implemented intervention and the counterfactual condition (Lemons, Fuchs, Gilbert, & Fuchs, 2014). In the larger study, there was evidence that both the researcher-provided intervention and the school-provided interventions in the BAU control were effective. Post hoc analyses revealed that within-group pretest-posttest effect sizes were moderate to large for both conditions (median d = .65, range: .19 to 1.52). These effect sizes compare favorably with estimates of mean effects on standardized measures of reading for one year of instruction in fourth grade (d = .40; Lipsey et al., 2012, p. 28). Furthermore, large between-group differences are not essential to the present study because inclusion was limited to the treatment group and because the research question concerns individual differences in response to the researcher-provided intervention. To the extent that there are individual differences in response to the intervention, it is possible to apply the PSW criteria and determine whether the proposed criteria predict differential treatment response.
Measures
The test battery for the larger measurement study was selected to incorporate established and experimental measures of reading, cognitive processing, language, and executive function. A full description of all measures from the measurement study is available at http://www.tcld.org/projects/measures. For the present study, we selected measures that (a) had good coverage within the intervention sample, (b) demonstrated good psychometric properties, and (c) assessed the full range of broad abilities specified by the CHC framework. In line with recommendations for implementation of the XBA approach and C/DM, we selected at least one cognitive assessment for each of the seven CHC broad cognitive abilities identified by Flanagan et al. (2007), in addition to tests of reading (Grw). As the focus of the present study was reading, quantitative reasoning (Gq) was not assessed. All testing was completed before participation in the researcher-provided intervention. Trained examiners completed all testing in a quiet area at participating students’ schools over three or four sessions.
Reading measures
Gates-MacGinitie Reading Test (MacGinitie et al., 2000)
The Reading Comprehension subtest of the GMRT is a nationally normed, group-administered test of reading comprehension. The Kuder-Richardson Formula 20 reliability coefficient for Reading Comprehension is .92. The GMRT was utilized as the screening measure for the intervention study and as the indicator of reading comprehension deficits in the present study.
Woodcock-Johnson III Test of Achievement (WJ-III; Woodcock et al., 2001)
The WJ-III is a nationally normed, individually administered test of academic achievement. Two subtests were administered: Letter-Word Identification and Passage Comprehension. Letter-Word Identification is a standardized assessment of decoding skills. Passage Comprehension is a cloze-type task that assesses reading comprehension. Reliability coefficients for students aged 8–10 range from .93 to .96 for Letter-Word Identification and from .89 to .92 for Passage Comprehension. We utilized age-based standard scores for all analyses. Both subtests were administered at pretest and posttest and are utilized as outcome measures.
Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner, & Rashotte, 1998)
The Sight Word Efficiency (SWE) subtest of the TOWRE is a nationally normed, individually administered test of single-word reading fluency. Alternate-forms reliability of the SWE subtest exceeds .90 (Torgesen et al., 1998). We utilized age-based standard scores for all analyses. The TOWRE SWE was administered at pretest and posttest and is utilized as an outcome measure.
Measures of cognitive processing
Woodcock Johnson III Tests of Cognitive Abilities (WJ-III; Woodcock et al., 2001)
The WJ-III Tests of Cognitive Abilities is a nationally normed, standardized assessment of cognitive abilities. We administered two subtests: Planning and Verbal Comprehension. Test-retest reliability coefficients for students aged 8–10 range from .69 to .77 for Planning and from .88 to .90 for Verbal Comprehension. We utilized age-based standard scores for both subtests. Both subtests were administered at pretest only. Planning was utilized as an indicator of fluid intelligence (Gf); factor analysis indicates that the Planning subtest loads on the fluid intelligence factor within the WJ-III norming sample (Woodcock et al., 2001), and measures of planning have also been associated with reading skill (Sesma et al., 2009). Verbal Comprehension was utilized as an indicator of crystallized intelligence (Gc) and loads on this factor in the WJ-III norming sample (Woodcock et al., 2001). Crystallized abilities, as indicated by verbal knowledge and listening comprehension, are associated with reading comprehension (Catts, Adlof, & Weismer, 2006).
Comprehensive Test of Phonological Processing (CTOPP; Wagner, Torgesen, & Rashotte, 1999)
The CTOPP is a nationally normed, individually administered test of phonological processing. We administered two subtests: Elision and Rapid Letter Naming. Test-retest reliability coefficients for students aged 8–17 are .79 for Elision and .72 for Rapid Letter Naming. We utilized age-based standard scores for both subtests; scale scores (M = 10, SD = 3) were converted to standard scores (M = 100, SD = 15) to facilitate comparisons (standard score = 100 + 15 × [scale score − 10] / 3). Elision was utilized as an indicator of auditory processing (Ga); it is a commonly utilized test of phonological processing, and similar measures load on the auditory processing factor within the WJ-III norming sample (Woodcock et al., 2001). Rapid Letter Naming was utilized as an indicator of long-term storage and retrieval (Glr), consistent with recommendations by Flanagan et al. (2007, p. 60); within this age range, letters are known and must be retrieved efficiently.
Working Memory Test Battery (WMTB; Pickering & Gathercole, 2001)
The WMTB is a standardized measure of working memory for children. The normative sample was 750 children aged 4–15 from the United Kingdom. We utilized one subtest: Listening Recall. One-year test-retest reliability for Listening Recall is .83. Age-based standard scores were utilized. Listening Recall was utilized as an indicator of working memory (Gsm). The Listening Recall task is a commonly utilized measure of working memory and requires both processing and storage, similar to classical span measures (e.g., Daneman & Carpenter, 1980).
N Back: 0 Back
The 0 Back task is a computer-based task in which students press the space bar each time a letter or shape appears on the monitor at varying time intervals. As such, it requires brief attention to and processing of the stimuli; the primary outcome for this task as used in this study was mean reaction time (in milliseconds). We derived standard scores (M = 100, SD = 15) from z scores calculated within the larger sample; the measurement study sample (n = 846) was weighted to represent a full distribution of reading proficiency in grades 3–5. The 0 Back task was utilized as a measure of basic processing speed (Gs).
Applying PSW Methods for LD Identification
XBA
Flanagan et al. (2007) specified that LD identification requires: (a) an achievement deficit, (b) a normal cognitive profile, and (c) a cognitive deficit that is theoretically related to the achievement deficit. Deficit scores are defined as scores falling > 1 SD below the population mean. All participants scored > 1 SD below the population mean on the GMRT and therefore all demonstrated an achievement deficit in reading comprehension.
Normal cognitive ability was determined by replicating formulae in the SLD Assistant version 1.0 (Flanagan et al., 2007). For each of the seven CHC cognitive clusters, we assigned a 0 (below the deficit cut point) or 1 (at or above the cut point). Each dichotomous value was then multiplied by the cluster’s g loading, and the weighted values were summed across the seven clusters. If the sum was ≥ 1, the student’s cognitive profile was considered normal.
Flanagan et al. (2007) identified a number of potential cognitive deficits that could be theoretically linked with reading comprehension deficits. To assist in specifying which cognitive abilities could serve as theoretically linked cognitive deficits, we consulted a guidance document on the identification of LD prepared for Portland Public Schools (2013). This document follows XBA recommendations and provides explicit guidance for implementing the XBA method; we are unaware of a more explicit practitioner guide. The document identifies four required assessment areas that could explain reading comprehension deficits: language (Gc), working memory (Gsm), long-term storage and retrieval (Glr), and fluid reasoning (Gf; Portland Public Schools, 2013, p. 21). We therefore specified that a student with a reading comprehension deficit (all participants), a normal ability profile as described above, and a cognitive deficit in any of these four areas would be identified as meeting XBA LD criteria. All others did not meet criteria.
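For illustration, the decision rule just described can be sketched in a few lines of code. This is a minimal sketch, not the SLD Assistant itself: the g loadings shown are placeholders (the actual values are supplied by the SLD Assistant v1.0), and the function and variable names are ours.

```python
DEFICIT_CUT = 85  # more than 1 SD below the population mean (M = 100, SD = 15)

# Placeholder g loadings for the seven CHC cognitive clusters; the actual
# values come from the SLD Assistant v1.0 (Flanagan et al., 2007).
G_LOADINGS = {
    "Gf": 0.95, "Gc": 0.80, "Gsm": 0.70, "Glr": 0.65,
    "Ga": 0.60, "Gv": 0.60, "Gs": 0.55,
}

# Areas that could explain a reading comprehension deficit
# (Portland Public Schools, 2013, p. 21)
LINKED_AREAS = ("Gc", "Gsm", "Glr", "Gf")

def meets_xba_ld_criteria(clusters: dict) -> bool:
    """Apply the XBA cognitive criteria to a dict of CHC cluster standard scores.

    The achievement-deficit criterion is omitted because all participants in
    this study already met it (GMRT more than 1 SD below the mean).
    """
    # "Normal" cognitive profile: score each cluster 0 (deficit) or 1
    # (non-deficit), weight by its g loading, and sum; a sum >= 1 indicates
    # an otherwise normal profile.
    weighted_sum = sum(
        loading * (0 if clusters[name] < DEFICIT_CUT else 1)
        for name, loading in G_LOADINGS.items()
    )
    if weighted_sum < 1:
        return False
    # A cognitive deficit in an area theoretically linked to the achievement
    # deficit completes the PSW pattern.
    return any(clusters[area] < DEFICIT_CUT for area in LINKED_AREAS)
```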
C/DM
Hale and Fiorello (2004) specified that LD identification requires three criteria: (a) an achievement deficit; (b) a theoretically linked cognitive deficit; and (c) a pattern of concordance/discordance based on significant and non-significant differences between cognitive abilities and academic achievement. Because a cut point for achievement deficits is not explicitly defined in this model, we specified that an achievement deficit would be a standard score 1 SD or more below the population mean (met by all participants in this study) to increase comparability with the XBA method. Additionally, for comparability reasons we specified that the same four theoretically related cognitive factors could represent a linked cognitive deficit (Gc, Gsm, Glr, Gf) as those utilized in the XBA method.
A pattern of concordance and discordance requires: (a) a significant difference between the achievement deficit and a cognitive strength, (b) a significant difference between the cognitive weakness and a cognitive strength, and (c) a non-significant difference between the achievement deficit and cognitive weakness. The significance of the difference scores was calculated following specified procedures based on the standard error of the difference at p < .05 (Hale & Fiorello, 2004, p. 102). Observed scores were compared and students who demonstrated a reading comprehension deficit (all students) and a pattern of concordance and discordance were identified as meeting C/DM LD criteria. All other students were identified as not meeting criteria.
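As an illustration of the concordance/discordance test, the sketch below assumes the conventional standard error of the difference formula; the reliabilities, scores, and names are placeholders, and Hale and Fiorello’s worked procedures (2004, p. 102) remain the authoritative source.

```python
import math

def critical_difference(r_xx: float, r_yy: float, sd: float = 15.0,
                        z: float = 1.96) -> float:
    """Critical value for the difference between two standard scores, based on
    the standard error of the difference (two-tailed p < .05 when z = 1.96)."""
    return z * sd * math.sqrt((1 - r_xx) + (1 - r_yy))

def meets_cdm_pattern(ach_deficit: float, cog_deficit: float,
                      cog_strength: float, r_ach: float, r_cogd: float,
                      r_cogs: float) -> bool:
    """C/DM pattern: the cognitive strength is discordant with both deficits,
    and the achievement deficit and cognitive deficit are concordant."""
    discordant_ach = abs(cog_strength - ach_deficit) > critical_difference(r_cogs, r_ach)
    discordant_cog = abs(cog_strength - cog_deficit) > critical_difference(r_cogs, r_cogd)
    concordant = abs(ach_deficit - cog_deficit) <= critical_difference(r_ach, r_cogd)
    return discordant_ach and discordant_cog and concordant
```

With reliabilities of .90 for both tests, for example, a difference must exceed 1.96 × 15 × √0.2 ≈ 13 standard score points to count as discordant.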
Results
Descriptive statistics for the full sample on all measures are provided in Table 2. On all academic measures, the sample scored below age or grade-based norms (range 77.2 to 92.8). On cognitive measures, the sample performance was more varied (range 83.6 to 102.8).
Table 2.

| Variable | M | SD | N |
|---|---|---|---|
| Academic measures | | | |
| Gates MacGinitie Pretest | 77.2 | 6.0 | 206 |
| Gates MacGinitie Posttest | 84.3 | 7.7 | 206 |
| WJ3 Passage Comprehension Pretest | 82.4 | 8.0 | 206 |
| WJ3 Passage Comprehension Posttest | 83.7 | 8.3 | 206 |
| TOWRE Sight Word Efficiency Pretest | 80.9 | 11.7 | 205 |
| TOWRE Sight Word Efficiency Posttest | 85.6 | 12.0 | 206 |
| WJ3 Letter Word Identification Pretest | 89.8 | 9.3 | 206 |
| WJ3 Letter Word Identification Posttest | 92.8 | 9.8 | 206 |
| Cognitive processing measures | | | |
| WJ3 Oral Comprehension (Gc) | 83.6 | 15.8 | 206 |
| WJ3 Planning (Gf) | 101.0 | 7.6 | 206 |
| WMTB Word Recall (Gsm) | 88.0 | 14.1 | 206 |
| CTOPP Rapid Naming (Glr) | 89.5 | 10.7 | 206 |
| CTOPP Elision (Ga) | 86.4 | 12.8 | 206 |
| Inquisit N Back: 0 Back (Gs) | 102.8 | 16.2 | 203 |

WJ3 = Woodcock Johnson Third Edition; TOWRE = Test of Word Reading Efficiency; WMTB = Working Memory Test Battery; CTOPP = Comprehensive Test of Phonological Processing.
Hypothesis I: Agreement for LD Identification
A total of 90 (43.7%) students met XBA LD criteria and 116 (56.3%) did not. A total of 123 (59.7%) students met C/DM LD criteria and 83 (40.3%) did not. Crosstabs for agreement are provided in Table 3. Cohen’s kappa is an index of agreement for categorical data; it represents the improvement over chance for independent raters or classification criteria and quantifies both positive (identified as LD by both methods) and negative (not identified as LD by either method) agreement. Agreement for the two methods was kappa = −.10, indicating no improvement over that which would be expected by chance. Of the 154 students identified by at least one method, only 59 were identified by both. Positive agreement (Cicchetti & Feinstein, 1989) was calculated as two times the number of agreements on positive identification decisions (LD) divided by the total number of positive identifications made by either method; negative agreement was calculated analogously for negative decisions (not LD). Positive agreement was .55 and negative agreement was .52 (a worked computation from the Table 3 cells follows the table).
Table 3.

| XBA Method | C/DM: LD | C/DM: Not LD | Total |
|---|---|---|---|
| LD | 59 | 31 | 90 |
| Not LD | 64 | 52 | 116 |
| Total | 123 | 83 | 206 |

Kappa = −.10. XBA = Cross Battery Assessment Method (Flanagan et al., 2007); C/DM = Concordance/Discordance Model (Hale & Fiorello, 2004).
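The agreement indices reported above can be reproduced directly from the Table 3 cells; the sketch below shows the computations for Cohen’s kappa and the positive and negative agreement indices.

```python
# Agreement indices computed from the Table 3 cross-tabulation.
a, b = 59, 31    # XBA LD:     C/DM LD, C/DM Not LD
c, d = 64, 52    # XBA Not LD: C/DM LD, C/DM Not LD
n = a + b + c + d                                          # 206

p_obs = (a + d) / n                                        # observed agreement
p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2   # chance agreement
kappa = (p_obs - p_exp) / (1 - p_exp)                      # Cohen's kappa

pos_agreement = 2 * a / (2 * a + b + c)                    # ≈ .55
neg_agreement = 2 * d / (2 * d + b + c)                    # ≈ .52
```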
Hypothesis II: LD Status as a Predictor of Treatment Response
The second research question addressed whether LD status identified through the XBA and C/DM methods predicted differential treatment response. Following the growth-conditioned-on-initial-status model identified by Stuebing et al. (2014), we conducted six analyses of covariance (ANCOVA), one for each combination of identification method and reading outcome, to evaluate whether XBA LD status or C/DM LD status predicted differential treatment response. The outcome for each ANCOVA was posttest performance on a measure of decoding (WJ3 Letter Word Identification), reading fluency (TOWRE SWE), or reading comprehension (WJ3 Passage Comprehension), with pretest performance on the respective outcome entered as a covariate. The significance of the F statistic for LD status was the main hypothesis test. We evaluated all comparisons at p < .05 to maximize statistical power to detect an effect and minimize Type II errors.
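Each of these ANCOVAs is equivalent to an ordinary least squares regression of posttest score on pretest score plus a dichotomous LD status indicator. The sketch below uses statsmodels; the column names and the choice of software are our assumptions, not the authors’ analysis code.

```python
import pandas as pd
import statsmodels.formula.api as smf

def ancova_ld_effect(df: pd.DataFrame):
    """ANCOVA of posttest performance on LD status, controlling for pretest.

    Expects one row per student with columns 'posttest' and 'pretest' (the
    same reading measure at both waves) and 'ld_status' (0 = not LD, 1 = LD).
    """
    model = smf.ols("posttest ~ pretest + C(ld_status)", data=df).fit()
    # The F test on the LD status term is the main hypothesis test.
    return model.f_test("C(ld_status)[T.1] = 0"), model
```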
The significance of XBA LD status as a predictor of treatment response in decoding, fluency, and reading comprehension was first assessed. Results are provided in Table 4. In each of these models, pretest performance was a significant predictor of posttest performance, with eta squared statistics ranging from .52 to .69. In this context, XBA LD status was not a significant predictor of posttest performance in decoding, F (1, 205) = .03, p > .05, η2 = .00, reading fluency F (1, 205) = .47, p > .05, η2 = .00, or reading comprehension F (1, 205) = .57, p > .05, η2 = .00.
Table 4.

Each model regresses posttest performance on pretest performance and the listed predictor. Columns report df, F, η², and p separately for the Reading Comprehension, Reading Fluency, and Word Reading outcomes.

| Model | Predictor | df | F | η² | p | df | F | η² | p | df | F | η² | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| XBA: LD Status | Pretest | 1 | 228.1 | .528 | < .0001 | 1 | 352.1 | .635 | < .0001 | 1 | 441.10 | .685 | < .0001 |
| | LD Status | 1 | 0.57 | .001 | .45 | 1 | 0.47 | .001 | .50 | 1 | 0.03 | .000 | .87 |
| XBA: Normal Ability | Pretest | 1 | 159.1 | .528 | < .0001 | 1 | 354.0 | .635 | < .0001 | 1 | 455.10 | .685 | < .0001 |
| | Normal Ability | 1 | 0.05 | .000 | .82 | 1 | 1.57 | .003 | .21 | 1 | 6.47 | .010 | **.01** |
| XBA: Deficit Area^a | Pretest | 1 | 227.6 | .528 | < .0001 | 1 | 352.5 | .635 | < .0001 | 1 | 467.67 | .685 | < .0001 |
| | Gc | 1 | 1.3 | .003 | .71 | 1 | 0.65 | .001 | .42 | 1 | 12.24 | .018 | **.0006** |
| | Gf | 1 | 1.49 | .006 | .11 | 1 | 0.06 | .000 | .81 | 1 | 0.08 | .000 | .78 |
| | Gsm | 1 | 0.32 | .001 | .57 | 1 | 0.00 | .000 | .96 | 1 | 5.55 | .008 | **.02** |
| | Glr | 1 | 0.14 | .000 | .71 | 1 | 1.76 | .003 | .19 | 1 | 0.82 | .001 | .37 |
| | Ga | 1 | 1.85 | .004 | .17 | 1 | 2.45 | .004 | .12 | 1 | 2.16 | .003 | .14 |
| | Gs | 1 | 0.95 | .002 | .33 | 1 | 0.22 | .000 | .63 | 1 | 0.04 | .000 | .84 |
| C/DM: LD Status | Pretest | 1 | 228.0 | .528 | < .0001 | 1 | 351.7 | .635 | < .0001 | 1 | 443.17 | .685 | < .0001 |
| | LD Status | 1 | 0.47 | .001 | .49 | 1 | 0.24 | .000 | .63 | 1 | 0.96 | .002 | .328 |
| C/DM: Deficit Area^a | Pretest | 1 | 230.6 | .528 | < .0001 | 1 | 351.5 | .635 | < .0001 | 1 | 441.34 | .685 | < .0001 |
| | Gc | 1 | 1.85 | .004 | .18 | 1 | 0.08 | .000 | .77 | 1 | 0.12 | .000 | .73 |
| | Gf | 1 | 1.23 | .003 | .27 | 1 | 0.4 | .001 | .53 | 1 | 0.04 | .000 | .83 |
| | Gsm | 1 | 0.08 | .000 | .78 | 1 | 0.33 | .001 | .57 | 1 | 5.4 | .008 | **.02** |
| | Glr | 1 | 0.06 | .000 | .81 | 1 | 0.95 | .002 | .33 | 1 | 0.01 | .000 | .93 |

N = 206. Note: Bold = significant at p < .05; XBA = Cross Battery Assessment Method (Flanagan et al., 2007); C/DM = Concordance/Discordance Model (Hale & Fiorello, 2004); Reading Comprehension = Woodcock-Johnson Third Edition (WJ3): Passage Comprehension; Reading Fluency = Test of Word Reading Efficiency: Sight Word Efficiency; Word Reading = WJ3 Letter Word Identification; Gc = WJ3 Oral Comprehension; Gf = WJ3 Planning; Gsm = Working Memory Test Battery: Word Recall; Glr = Comprehensive Test of Phonological Processing (CTOPP): Rapid Letter Naming; Ga = CTOPP: Elision; Gs = Inquisit N Back: 0 Back.

^a Models evaluating specific deficit areas represent repeated analyses with pretest and the deficit area of interest as the predictors.
Similar results were apparent for C/DM LD status as a predictor of treatment response (see Table 4). In each of the models, pretest performance was a significant predictor of posttest performance, with eta squared statistics ranging from .52 to .69. C/DM LD status was not a significant predictor of posttest performance in decoding, F (1, 205) = .96, p > .05, η2 = .00, reading fluency, F (1, 205) = .24, p > .05, η2 = .00, or reading comprehension, F (1, 205) = .47, p > .05, η2 = .00.
Hypothesis III: Inclusion Criteria as Predictors of Intervention Response
We next evaluated whether status on specified inclusionary criteria was a significant predictor of intervention response. For the XBA, the inclusionary criteria were a normal cognitive profile and a normative cognitive deficit in an area related to the student’s academic weakness. For the C/DM, the inclusionary criterion was a concordance between the area of academic weakness and a cognitive weakness. For each specified criterion, we dichotomized status on the criterion and evaluated whether it was a significant predictor of posttest performance in decoding, reading fluency, or reading comprehension, mirroring the previous analyses.
Results of all comparisons are provided in Table 4. Comparisons predicting posttest performance in reading fluency and reading comprehension were all non-significant. Comparisons predicting word reading posttest performance were mixed. For the XBA, “normal” ability status was a significant predictor after accounting for pretest performance. Deficits in crystallized intelligence and short-term memory were also significant predictors (η2 range .008 to .018). For the C/DM, only short-term memory deficit status was a significant predictor of posttest performance in word reading.
Improvement in Prediction
The real-world effects of these statistically significant improvements in prediction can be evaluated through data simulation (see the simulation sketch following Table 8). For pretest, eta squared was .685, yielding a pretest-posttest correlation of .828 (the square root of .685). If the predictors explain an additional .018 of posttest variance, the eta squared for the model becomes .703, which yields a correlation of .838 between the regression-weighted linear composite of the pretest and PSW dimension (in this case, a deficit in crystallized intelligence) and posttest performance. A simulation of two datasets of two correlated measures and 1,000 observations illustrates the implications of this improvement in prediction. For dataset one, with r = .828 and a dichotomized cut point at the mean (z < 0), we would expect a cross-tabulation similar to that simulated for Table 5, misclassifying approximately 192 students. If the dichotomized cut point is dropped to the 25th percentile (z < −.66), we would expect to misclassify approximately 152 students (Table 6). Following similar procedures, but utilizing r = .838 to reflect the additional variance explained by the PSW variable, we would expect a cross-tabulation similar to Table 7, with an expected misclassification of 182 students at a mean cut point. When the cut point is dropped to the 25th percentile (Table 8), the expected misclassification is approximately 147 students. Thus, to obtain a classification improvement of 5 to 10 students out of 1,000 would require administering a full cognitive battery to all 1,000 students and classifying them according to PSW methods. This scenario simulates results for the most optimistic of the 33 contrasts performed in the present study; the median improvement in eta squared was much smaller, at .001.
Table 5.

| | Pass | Fail |
|---|---|---|
| Pass | 403 | 95 |
| Fail | 97 | 405 |

Total number of misclassifications = 192

Table 6.

| | Pass | Fail |
|---|---|---|
| Pass | 670 | 76 |
| Fail | 76 | 178 |

Total number of misclassifications = 152

Table 7.

| | Pass | Fail |
|---|---|---|
| Pass | 409 | 92 |
| Fail | 90 | 409 |

Total number of misclassifications = 182

Table 8.

| | Pass | Fail |
|---|---|---|
| Pass | 672 | 73 |
| Fail | 74 | 181 |

Total number of misclassifications = 147
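The simulations behind Tables 5–8 can be approximated with a short script. This is a sketch under our assumptions (bivariate normal scores and an arbitrary seed); exact cell counts will vary from run to run.

```python
import numpy as np

rng = np.random.default_rng(2014)  # arbitrary seed for reproducibility

def misclassified(r: float, cut_z: float, n: int = 1000) -> int:
    """Count simulated cases falling on opposite sides of the cut point on
    two standardized measures correlated at r."""
    scores = rng.multivariate_normal([0.0, 0.0], [[1.0, r], [r, 1.0]], size=n)
    x, y = scores[:, 0], scores[:, 1]
    return int(np.sum((x < cut_z) != (y < cut_z)))

# r = sqrt(.685) ≈ .828 (pretest alone); r = sqrt(.703) ≈ .838 (pretest + PSW deficit)
for r in (0.828, 0.838):
    for cut_z in (0.0, -0.66):  # cut at the mean and near the 25th percentile
        print(f"r = {r}, cut z = {cut_z}: {misclassified(r, cut_z)} misclassified")
```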
Discussion
The present study evaluated the reliability and validity of proposed PSW methods for LD identification. We utilized a sample of 206 fourth graders with persistent reading comprehension difficulties who participated in an intensive reading intervention to compare the identification decisions of the XBA and C/DM methods for LD identification and evaluate whether resulting group status predicted differential treatment response. The first research question was addressed by comparing the resulting LD identification decisions of the XBA and C/DM methods to one another for individual students. Results indicated poor agreement. A kappa of −.10 indicates that agreement for LD identification decisions did not improve upon that which would be expected by chance. These results align with the results of Miciak, Fletcher, et al. (2014), which also demonstrated chance levels of agreement for most comparisons of the identification decisions of these two models for LD identification. These findings raise significant questions about the equivalence of the methods and have important implications for the validation of PSW models. To the extent that the models do not identify the same subset of students as LD, they cannot be jointly validated. The process of building evidence for the validity of these models may need to be undertaken separately. Further, a school district cannot expect that decisions made using different methods to operationalize the PSW approach will yield the same identification decisions. Identification decisions will depend, in part, on the specific PSW method utilized and on the cognitive and academic tests utilized within these methods (Miciak, Taylor, et al., 2014).
The second and third research questions were addressed by a series of contrasts evaluating whether posttest performance was predicted by LD status or status on inclusionary criteria such as specific cognitive deficits or the identification of a “normal” cognitive profile beyond that predicted by pretest. Results of these comparisons were largely null and not practically significant, with eta squared statistics ranging from .00 to .018 across all contrasts. No contrast predicting reading fluency or reading comprehension was statistically significant. A small number of contrasts predicting word reading achieved statistical significance, with eta squared statistics indicating a small increase in the amount of variance on posttest predicted by inclusionary criteria for the C/DM and XBA methods for LD identification. The largest eta squared was .018 for crystallized intelligence deficits predicting word reading, indicating that deficit status in crystallized intelligence uniquely explained 1.8% of the variance in the posttest outcome. For comparison, pretest performance explained 68.5% of the variance in posttest outcome.
This partition of posttest variance helps contextualize the practical significance of these findings. Stuebing et al. (2014) defined three approaches for studies investigating whether child cognitive characteristics predict intervention response. The first two approaches utilize unconditional predictions of growth or gain scores. Such studies investigate whether a child cognitive characteristic is associated with treatment response but do not address the cause of the association. Unconditional analyses of the relations between cognitive characteristics and treatment response are particularly meaningful in research settings in which identifying potential targets for intervention is the goal. In recent years, research of this type has led to novel intervention studies investigating academic interventions that incorporate working memory and executive function training (see, for example, Fuchs et al., 2014; Morris et al., 2012).
In the present study, we utilized the third approach identified by Stuebing et al. (2014), which investigates growth conditioned upon pretest performance. This analytic approach has clear implications for practice because it addresses the predictive improvement resulting from additional assessment data: it directly investigates whether the cognitive characteristic or profile of interest contributes to our understanding of which students will respond to an intervention, beyond what could be predicted from academic performance at pretest. This consideration is particularly important as students advance beyond early elementary school. Before significant formal reading instruction, assessments of underlying processes (e.g., phonological awareness) can be helpful in predicting how students will respond to instruction; once formal reading instruction has begun, tasks that involve print are more robust predictors of reading achievement (Scarborough, 1998).
Cognitive assessment of the type recommended by PSW proponents is time-consuming, costly, and requires considerable expertise on the part of the school psychologist. This expenditure of resources would be justified only if cognitive assessment contributed meaningful predictive information about who will and will not respond to intervention or provided clear direction for tailoring the intervention to the student’s cognitive strengths or weaknesses. However, the results of the present study raise significant doubts as to whether proposed methods to operationalize PSW approaches in their current form allow for adequate prediction of individual response to intervention. Data simulations indicated that the most optimistic scenario would result in only a small improvement in predictive accuracy (approximately 5–10 students for every 1,000 students assessed). Most comparisons resulted in no statistically significant or educationally meaningful improvement in prediction.
Limitations
The results of the present study are specific to the sample and measures utilized to form groups and measure outcomes. Different measures would yield different subgroups (Miciak, Taylor, et al., 2014). Another potential source of unreliability in group formation concerns the specific operationalization of the XBA and C/DM we employed in the present study. Although the specifications utilized in the present study follow the recommendations of each approach, the methods could be operationalized differently yielding different subgroups and potentially different findings. However, the consistency of the findings makes it unlikely that these group fluctuations would yield qualitatively different results.
The present study did not evaluate whether cognitive assessment data attained through PSW methods could be utilized to formulate better or more effective intervention plans. The design of the present study does not permit a test of this assertion. However, the fact that this assertion has not been disproven is not sufficient justification for adopting PSW methods, as is occasionally claimed (Decker, Hale, & Flanagan, 2013, p. 4; Flanagan, Fiorello, & Ortiz, 2010, p. 741). Two recent reviews have concluded that the evidence base for interventions tailored to individual cognitive characteristics or learning styles is not sufficient to recommend widespread adoption (Kearns & Fuchs, 2013; Pashler, McDaniel, Rohrer, & Bjork, 2008). Additionally, the results of this study and previous studies of PSW methods highlight one specific challenge to utilizing PSW assessment data for planning treatment: the inherent unreliability of classifications based on proposed methods to operationalize the PSW method (Miciak, Fletcher, et al., 2014; Miciak, Taylor, et al., 2014; Stuebing et al., 2012). Perhaps future approaches to the PSW method that rely on improved measures or procedures will be associated with higher reliability and improved validity, but as investigated in this study there is little evidence supporting the reliability of these methods.
Implications for Practice and Research
Despite strong recommendations for practice (e.g., Hale et al., 2010; Hanson et al., 2008), there is little empirical research investigating the implications of widespread adoption of PSW methods. Previous research has highlighted issues related to the technical adequacy of proposed methods to operationalize the PSW model (e.g., Miciak, Taylor, et al., 2014; Stuebing et al., 2012). These challenges are inherent to all psychometric methods for the identification of LD, but they are exacerbated in PSW methods by the complexity of the models, many of which involve cut points applied to a distribution of score discrepancies. Thus, justification for the models could derive only from improved prediction of treatment response or improved treatment itself. To our knowledge, no peer-reviewed study demonstrates that the application of PSW methods achieves either of these goals. The null results of the present investigation suggest that PSW status, as operationalized through the C/DM and XBA methods, may be only minimally related to treatment outcomes.
The results of the present study must also be understood in the context of ongoing tension concerning the goals of special education assessment for LD. At present, the decision emerging from the special education assessment process affects both the educational program of the individual child and resource allocation at the school or district level. The inherent unreliability of complex psychometric methods for the identification of LD would thus seem to require stronger justification for the resources needed to produce such identifications, a concern amplified by the high-stakes nature of the special education assessment process.
Conclusions
This study found little evidence that identification of a PSW profile through the proposed inclusionary criteria is associated with differential treatment response. Given that much of the literature supporting PSW methods recommends them on the assumption of differential treatment response for different cognitive profiles, these null results are particularly problematic. Additionally, the present study replicates previous investigations finding that the XBA and C/DM methods for LD identification are not interchangeable: agreement between the two models’ LD identification decisions did not exceed what would be expected by chance. Until PSW methods for LD identification are demonstrated to improve the effectiveness of treatment, the results of this and previous empirical research suggest that the proposed methods are at best superfluous, and may be detrimental to the goal of ensuring the availability of high-quality academic instruction for all struggling students.
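For readers less familiar with chance-corrected agreement, the sketch below shows how Cohen’s kappa, the statistic behind this conclusion, behaves when two methods identify similar proportions of students but largely different individuals. The decision vectors are hypothetical and are not the study’s data.

```python
# Cohen's kappa for two sets of binary identification decisions
# (hypothetical data; illustrates how overlap at or below chance
# yields kappa <= 0).
def cohens_kappa(a, b):
    """Chance-corrected agreement for two equal-length 0/1 decision lists."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n      # positive-decision base rates
    p_chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (p_observed - p_chance) / (1 - p_chance)

# Two methods that each identify 30% of ten students, but different ones:
xba = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
cdm = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
print(round(cohens_kappa(xba, cdm), 2))    # -0.43: worse than chance
```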
Acknowledgments
This research was supported by grant P50 HD052117, Texas Center for Learning Disabilities, from the Eunice Kennedy Shriver National Institute of Child Health and Human Development. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health and Human Development or the National Institutes of Health.
Contributor Information
Jeremy Miciak, University of Houston.
Jacob L. Williams, Education Northwest.
W. Pat Taylor, University of Houston.
Paul T. Cirino, University of Houston.
Jack M. Fletcher, University of Houston.
Sharon Vaughn, The University of Texas at Austin.
References
- Al Otaiba S, Fuchs D. Characteristics of children who are unresponsive to early literacy intervention: A review of the literature. Remedial and Special Education. 2002;23:300–316.
- Catts HW, Adlof SM, Weismer SE. Language deficits in poor comprehenders: A case for the simple view of reading. Journal of Speech, Language, and Hearing Research. 2006;49(2):278–293. doi: 10.1044/1092-4388(2006/023).
- Cicchetti DV, Feinstein AR. High agreement but low kappa: II. Resolving the paradoxes. Journal of Clinical Epidemiology. 1990;43(6):551–558. doi: 10.1016/0895-4356(90)90159-m.
- Consortium for Evidence-Based Early Intervention Practices. A response to the Learning Disabilities Association of America (LDA) white paper on specific learning disabilities (SLD) identification. 2010. Retrieved May 16, 2012, from www.isbe.state.il.us/spec-ed/LDA_SLD_white_paper_response.pdf.
- Daneman M, Carpenter PA. Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior. 1980;19(4):450–466.
- Decker SL, Hale JB, Flanagan DP. Professional practice issues in the assessment of cognitive functioning for educational applications. Psychology in the Schools. 2013;50(3):300–313.
- Evans JJ, Floyd RG, McGrew KS, Leforgee MH. The relations between measures of Cattell-Horn-Carroll (CHC) cognitive abilities and reading achievement during childhood and adolescence. School Psychology Review. 2001;31:246–262.
- Fiorello CA, Hale JB, Wycoff KL. Cognitive hypothesis testing: Linking test results in the real world. In: Flanagan DP, Harrison P, editors. Contemporary intellectual assessment: Theories, tests, and issues. 3rd ed. New York, NY: Guilford Press; 2012. pp. 484–496.
- Flanagan DP, Fiorello CA, Ortiz SO. Enhancing practice through application of Cattell–Horn–Carroll theory and research: A “third method” approach to specific learning disability identification. Psychology in the Schools. 2010;47(7):739–760.
- Flanagan D, Ortiz S, Alfonso VC, editors. Essentials of cross battery assessment. 2nd ed. Hoboken, NJ: John Wiley & Sons; 2007.
- Fletcher JM. Classification and identification of learning disabilities. In: Wong B, Butler D, editors. Learning about learning disabilities. 4th ed. New York, NY: Elsevier; 2012.
- Fletcher JM, Lyon GR, Fuchs LS, Barnes MA. Learning disabilities: From identification to intervention. New York, NY: Guilford Press; 2007.
- Francis DJ, Fletcher JM, Stuebing KK, Lyon GR, Shaywitz BA, Shaywitz SE. Psychometric approaches to the identification of LD: IQ and achievement scores are not sufficient. Journal of Learning Disabilities. 2005;38(2):98–108. doi: 10.1177/00222194050380020101.
- Fuchs LS, Fuchs D, Stuebing K, Fletcher JM, Hamlett CL, Lambert W. Problem solving and computational skill: Are they shared or distinct aspects of mathematical cognition? Journal of Educational Psychology. 2008;100(1):30. doi: 10.1037/0022-0663.100.1.30.
- Fuchs LS, Schumacher RF, Sterba SK, Long J, Namkung J, Malone A, Changas P. Does working memory moderate the effects of fraction intervention? An aptitude–treatment interaction. Journal of Educational Psychology. 2014;106(2):499.
- Geary DC. Cognitive predictors of achievement growth in mathematics: A 5-year longitudinal study. Developmental Psychology. 2011;47(6):1539. doi: 10.1037/a0025510.
- Hale JB, Alfonso V, Berninger V, Bracken B, Christo C, Clark E, Yalof J. Critical issues in response-to-intervention, comprehensive evaluation, and specific learning disabilities identification and intervention: An expert white paper consensus. Learning Disability Quarterly. 2010;33(3):223–236.
- Hale JB, Fiorello CA. School neuropsychology: A practitioner’s handbook. New York, NY: Guilford Press; 2004.
- Horn JL, Noll J. Human cognitive capabilities: Gf-Gc theory. In: Flanagan DP, Genshaft JL, Harrison PL, editors. Contemporary intellectual assessment: Theories, tests, and issues. New York, NY: Guilford Press; 1997. pp. 53–93.
- Johnson ES. Understanding why a child is struggling to learn: The role of cognitive processing evaluation in learning disability identification. Topics in Language Disorders. 2014;34(1):59–73.
- Johnson ES, Humphrey M, Mellard DF, Woods K, Swanson HL. Cognitive processing deficits and students with specific learning disabilities: A selective meta-analysis of the literature. Learning Disability Quarterly. 2010;33(1):3–18.
- Kearns DM, Fuchs D. Does cognitively focused instruction improve the academic performance of low-achieving students? Exceptional Children. 2013;79:263–290.
- Lemons CJ, Fuchs D, Gilbert JK, Fuchs LS. Evidence-based practices in a changing world: Reconsidering the counterfactual in education research. Educational Researcher. 2014;43(5):242–252.
- Lipsey MW, Puzio K, Yun C, Hebert MA, Steinka-Fry K, Cole MW, Roberts M, Busick MD. Translating the statistical representation of the effects of education interventions into more readily interpretable forms (NCSER 2013-3000). Washington, DC: National Center for Special Education Research, Institute of Education Sciences, U.S. Department of Education; 2012. Available at http://ies.ed.gov/ncser/.
- MacGinitie W, MacGinitie R, Maria K, Dreyer L, Hughes K. Gates-MacGinitie Reading Tests-4. Itasca, IL: Riverside; 2000.
- Macmann GM, Barnett DW, Lombard TJ, Belton-Kocher E, Sharpe MN. On the actuarial classification of children: Fundamental studies of classification agreement. Journal of Special Education. 1989;23:127–149.
- Maki KE, Floyd RG, Roberson T. State learning disability eligibility criteria: A comprehensive review. School Psychology Quarterly. 2015. doi: 10.1037/spq0000109. Advance online publication, January 12, 2015.
- McGrew KS. CHC theory and the human cognitive abilities project: Standing on the shoulders of the giants of psychometric intelligence research. Intelligence. 2009;37(1):1–10.
- Miciak J, Fletcher JM, Stuebing KK, Vaughn S, Tolar TD. Patterns of cognitive strengths and weaknesses: Identification rates, agreement, and validity for learning disabilities identification. School Psychology Quarterly. 2014;29(1):21. doi: 10.1037/spq0000037.
- Miciak J, Taylor WP, Fletcher JM, Denton CD. The effect of achievement test selection on identification of learning disabilities within a patterns of strengths and weaknesses framework. School Psychology Quarterly. 2014. doi: 10.1037/spq0000091. Advance online publication.
- Morris RD, Fletcher JM. Classification in neuropsychology: A theoretical framework and research paradigm. Journal of Clinical and Experimental Neuropsychology. 1998;10:640–658. doi: 10.1080/01688638808402801.
- Morris RD, Lovett MW, Wolf M, Sevcik RA, Steinbach KA, Frijters JC, Shapiro MB. Multiple-component remediation for developmental reading disabilities: IQ, socioeconomic status, and race as factors in remedial outcome. Journal of Learning Disabilities. 2012;45(2):99–127. doi: 10.1177/0022219409355472.
- Nelson RJ, Benner GJ, Gonzalez J. Learner characteristics that influence the treatment effectiveness of early literacy interventions: A meta-analytic review. Learning Disabilities Research & Practice. 2003;18(4):255–267.
- Pashler H, McDaniel M, Rohrer D, Bjork R. Learning styles: Concepts and evidence. Psychological Science in the Public Interest. 2008;9(3):105–119. doi: 10.1111/j.1539-6053.2009.01038.x.
- Pickering S, Gathercole SE. Working Memory Test Battery for Children (WMTB-C). Psychological Corporation; 2001.
- Portland Public Schools. Guidance for the identification of specific learning disabilities. 2013. Retrieved May 1, 2015, from http://www.pps.k12.or.us/files/special-education/PSW_Feb_2013_Guide.pdf.
- Sesma HW, Mahone EM, Levine T, Eason SH, Cutting LE. The contribution of executive skills to reading comprehension. Child Neuropsychology. 2009;15(3):232–246. doi: 10.1080/09297040802220029.
- Speece DL, Ritchey KD, Cooper DH, Roth FP, Schatschneider C. Growth in early reading skills from kindergarten to third grade. Contemporary Educational Psychology. 2004;29(3):312–332.
- Stuebing KK, Fletcher JM, Branum-Martin L, Francis DJ. Evaluation of the technical adequacy of three methods for identifying specific learning disabilities based on cognitive discrepancies. School Psychology Review. 2012;41:3–22.
- Stuebing KK, Barth AE, Trahan LH, Reddy RR, Miciak J, Fletcher JM. Are child cognitive characteristics strong predictors of response to intervention? A meta-analysis. Review of Educational Research. 2014. doi: 10.3102/0034654314555996. Advance online publication, November 12, 2014.
- Torgesen J, Wagner R, Rashotte C. Test of Word Reading Efficiency. Austin, TX: Pro-Ed; 1998.
- Torgesen JK, Wagner RK, Rashotte CA, Burgess S, Hecht S. Contributions of phonological awareness and rapid automatic naming ability to the growth of word-reading skills in second- to fifth-grade children. Scientific Studies of Reading. 1997;1:161–185.
- Vaughn S, Solís M, Miciak J, Taylor WP, Fletcher JM. Effects from a randomized control trial comparing researcher and school implemented treatments with fourth graders with significant reading difficulties. Journal of Research on Educational Effectiveness. 2015. doi: 10.1080/19345747.2015.1126386.
- Wagner RK, Torgesen JK, Rashotte CA. Comprehensive Test of Phonological Processing. Austin, TX: PRO-ED; 1999.
- Woodcock RW, McGrew KS, Mather N. Woodcock-Johnson III Tests of Achievement. Itasca, IL: Riverside; 2001.