Abstract
The main purpose of this study was to test the effects of word-problem intervention, with versus without embedded language comprehension instruction, on at-risk 1st graders’ word-problem performance. We also isolated the need for a structured approach to word-problem intervention and tested the efficacy of schema-based instruction at 1st grade. Children (n=391; mean age = 6.53, SD = 0.32) were randomly assigned to 4 conditions: schema-based word-problem intervention with embedded language instruction, the same word-problem intervention but without language comprehension instruction, structured number knowledge intervention without a structured word-problem component, and a control group. Each intervention included 45 30-min sessions. Multilevel models, accounting for classroom and school effects, revealed the efficacy of schema-based word-problem intervention at 1st grade, with both word-problem conditions outperforming the number knowledge condition and the control group. Yet, word-problem performance was significantly stronger for the schema-based condition with embedded language comprehension instruction compared to the schema-based condition without language comprehension instruction. Number knowledge intervention conveyed no word-problem advantage over the control group, even though all 3 intervention conditions outperformed the control group on arithmetic. Results demonstrate the importance of a structured approach to word-problem intervention; the efficacy of schema-based instruction at 1st grade; and the added value of language comprehension instruction within word-problem intervention. Results also provide causal evidence on the role of language comprehension in word-problem solving.
Keywords: math word problems, word-problem development, word-problem intervention, language comprehension
Children’s competence with word-problem solving (WPS) is a strong predictor of employment prospects and wages in adulthood (Batty, Kivimäki, & Deary, 2010; Every Child a Chance Trust, 2009). It reflects understanding of and the capacity to apply mathematical ideas in everyday life and in science, technology, and engineering. It may also support advanced mathematics learning: In a large national survey, U.S. Algebra I teachers rated WPS as most important, among 15 math skills, for success with algebra (Hoffer, Venkataraman, Hedberg, & Shagle, 2007). Unfortunately, WPS difficulty is widespread (Daroczy, Wolska, Meurers, & Nuerk, 2015).
Because WPS requires arithmetic, one might expect intervention that improves arithmetic to simultaneously build WPS. This, however, does not appear to be the case for at-risk learners. Fuchs et al. (2013) assessed the efficacy of a number knowledge intervention that produced dramatically superior arithmetic outcomes for at-risk first graders. Yet, the effect size (ES) on WPS was substantially lower than on arithmetic (0.22 vs. 0.87). Further, while intervention closed the arithmetic achievement gap between at-risk and typically-achieving classmates (ES for intervention vs. not-at-risk classmates = −0.39), the WPS achievement gap dramatically widened (ES = 0.73). This demonstrates that although arithmetic is foundational to WPS, it is not a sufficient pathway to word-problem competence. Thus, a deliberate focus on WPS within first-grade intervention appears necessary for at-risk learners. (In this paper, achievement gaps are expressed as ESs: the not-at-risk mean minus the at-risk mean, divided by the standard deviation [SD] of the not-at-risk group, such that positive values indicate stronger performance for not-at-risk students.)
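Written as a formula, the gap definition in the preceding parenthetical reads as follows. The symbols are labels assumed here for illustration only: M_NAR and M_AR are the not-at-risk and at-risk group means, and SD_NAR is the standard deviation of the not-at-risk group.

```latex
\[
\mathrm{ES}_{\text{gap}} = \frac{M_{\text{NAR}} - M_{\text{AR}}}{SD_{\text{NAR}}}
\]
```

A positive value means the not-at-risk group scored higher; a negative value means the at-risk group scored higher.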
One reason for a widening WPS achievement gap in the face of a closing arithmetic achievement gap is that WPS requires text processing to build problem models and generate number sentences before executing calculations, and text comprehension in turn relies on language (Catts, Hogan, & Adlof, 2005; Gough & Tunmer, 1986). It is not surprising, therefore, that although arithmetic is associated with language ability (Dehaene, Spelke, Pinel, Stanescu-Cosson, & Tsivkin, 1999; Powell, Driver, Roberts, & Fall, 2017; Purpura & Ganley, 2014), stronger relations are demonstrated between language comprehension and WPS than between language comprehension and arithmetic, as shown in studies that formally contrast pathways for both outcomes (Fuchs et al., 2008, 2016; Singer, Strasser, & Cuadro, 2019).
The main purpose of the present randomized controlled trial was to test the effects of word-problem intervention, with versus without embedded language comprehension instruction, on at-risk first graders’ word-problem performance. At the same time, we isolated the need for a structured approach to word-problem intervention and tested the efficacy of schema-based instruction at first grade. We randomly assigned at-risk first-grade children to four conditions: schema-based word-problem intervention with embedded language instruction, the same word-problem intervention but without language comprehension instruction, number knowledge intervention without a structured word-problem component, and a control group. Each intervention condition received the same amount of intervention time, and each condition included five min of speeded, strategic arithmetic practice in each 30-min session (45 sessions). The focus on arithmetic across conditions was to ensure adequate skill for computing answers.
We expected results to provide insight into the added value of language comprehension instruction in word-problem intervention, into whether a structured approach to word-problem intervention is needed, and into the efficacy of schema-based instruction at 1st grade. The larger goals were to identify strategies by which schools can reduce the first-grade WPS achievement gap and to deepen insight into language comprehension as a causal agent in WPS.
In the remaining sections of this introduction, we provide a framework for understanding WPS as a complex undertaking that taxes the cognitive resources of at-risk learners. Then we explain how schema-based instruction, an approach validated at higher grades, is designed to compensate for WPS’s cognitive demands, and we establish the need to test its effects at first grade. In the third section, we argue that research on schema-based and other forms of word-problem intervention has omitted a focus on language comprehension; we summarize previous research indicating a role of language comprehension in WPS; and we explain how the present study extends that work by experimentally contrasting effects of schema-based intervention with versus without embedded language comprehension. Fourth, we discuss the need for a structured intervention approach to WPS, which provides the rationale for this study’s contrast between number knowledge intervention and control on WPS. As we contextualize each purpose, we provide our hypothesis and, in the final section, we describe exploratory moderation analyses conducted to assess whether intervention effects on WPS benefit students in differential ways depending on students’ cognitive resources.
WPS Is a Complex Undertaking Involving Language Comprehension
Kintsch and colleagues hypothesized that WPS relies on a combination of language comprehension and problem-solving processes (Cummins, Kintsch, Reusser, & Weimer, 1988; Kintsch & Greeno, 1985; Nathan et al., 1992). Based on theories of discourse processing (van Dijk & Kintsch, 1983), the model assumes that word-problem representations have three components. The first involves constructing a coherent structure to capture the text’s essential ideas. The second, the situation model, requires supplementing the text with inferences based on the child’s world knowledge, including information about relations among quantities. The problem solver coordinates this knowledge with the third component, knowledge about problem models, to formalize relations among quantities and guide application of solution strategies within a schema. At first grade, the three main schemas (Riley et al., 1983) are combine or total word problems (quantities are combined to form a total), compare or difference word problems (quantities are compared to find a difference), and change word problems (an action triggers an increase or decrease in a starting amount).
Kintsch and Greeno (1985) posited that this process of building the propositional text structure, inferencing, identifying schema, and applying solution strategies makes strong demands on short-term memory. We reframed short-term memory as working memory, because WPS requires not only briefly storing information but also sequentially updating that information in memory as the problem solver processes segments of the word-problem statement (e.g., Fuchs, Fuchs, Seethaler, & Barnes, in press). This model revision is grounded in studies showing that working memory is involved in WPS (Lee et al., 2004; Peng et al., in press; Raghubar et al., 2010; Swanson, 2016; Swanson & Beebe-Frankenberger, 2004; Swanson, Jerman, & Zheng, 2008; Swanson, Moran, Lussier, & Fung, 2014).
Further, in light of correlational research also demonstrating a role for attentive behavior and reasoning in WPS (Fuchs et al., 2006, 2010a, 2010b; Swanson & Beebe-Frankenberger, 2004), we added these cognitive processes to the model. Attentive behavior reflects students’ capacity to stay engaged throughout the WPS process. Via careful attending and access to information stored in working memory, students apply reasoning to integrate newly encountered information into working memory, as they logically induce relations between objects and actions described in the word-problem narrative and distinguish between relevant and irrelevant information.
To illustrate the role of attentive behavior, working memory, and reasoning, we explain competent WPS for this combine problem (Part 1 plus Part 2 equals Total): Connie has 3 puppets. Tony has 5 puppets. Tony also has 10 blocks. How many puppets do the children have in all? The problem solver attends to the information presented in sentence 1, reasoning to identify that the object is puppets; the quantity is 3; the actor is Connie; but Connie’s role is TBD. These pieces of information are stored in working memory. Attentive behavior and reasoning are then applied to code and store the propositions in sentence 2 (object=puppets; quantity=5; actor=Tony; Tony’s role=TBD) in working memory. With attentive behavior and reasoning, the problem solver next processes sentence 3 to determine that blocks does not match the object code in the first two sentences, signaling that 10 may be irrelevant. This is stored in working memory. In the question, how many puppets and the phrase in all, the problem solver updates working memory with the new information, while relying on attentive behavior and reasoning to identify the combine schema; assign the role of superset (Total) to the question; assign subset roles (Parts 1 and 2) to the TBD information; and reject 10 blocks as irrelevant. Filling in these slots of the schema triggers a set of strategies to find the missing information (Total). Errors are viewed as failures in attentive behavior or reasoning or difficulty managing working memory demands, which results in compromised mental representations.
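The walk-through above can be sketched in code. The following Python is only an illustration of the described sequence: propositions are coded and held with their roles undetermined, and roles are assigned once the question reveals the schema and its object code. The `Proposition` class, the `solve_combine` function, and the hand-coded sentence values are hypothetical names introduced here; they are not part of Kintsch and Greeno's model or of this study's materials.

```python
from dataclasses import dataclass


@dataclass
class Proposition:
    actor: str
    quantity: int
    obj: str
    role: str = "TBD"  # role stays undetermined until the question is processed


# Sentences 1-3 of the example problem, coded by hand (assumption: no text parsing here).
working_memory = [
    Proposition("Connie", 3, "puppets"),
    Proposition("Tony", 5, "puppets"),
    Proposition("Tony", 10, "blocks"),
]


def solve_combine(propositions, question_object):
    """Fill the combine schema: matching objects become Parts, others are rejected."""
    for p in propositions:
        p.role = "Part" if p.obj == question_object else "irrelevant"
    # Part 1 + Part 2 = Total
    return sum(p.quantity for p in propositions if p.role == "Part")


# The question "How many puppets ... in all?" supplies the object code and the schema.
total = solve_combine(working_memory, "puppets")
print(total)  # 8
```

In this sketch the 10 blocks are rejected only when the question is processed, whereas the narrative flags them as possibly irrelevant already at sentence 3; the simplification keeps the example short.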
As Kintsch and Greeno (1985) discussed, however, WPS also relies on language comprehension. Children “understand important vocabulary and language constructions prior to school entry” and “through instruction in arithmetic and word problems, learn to treat these words in a task-specific way, including extensions to ordinary usage for terms (e.g., all or more) to more complicated constructions involving sets (in all and more than)” (p. 111).
To appreciate how WPS taxes language comprehension, consider this revised problem: Celia has 3 fish. Fernando has 5 turtles. Fernando also has 2 plants. How many pets do the children have in all? Objects in this text increase demands on language comprehension for assigning roles in the propositional text structure (despite similar demand for schema induction). This is due to more sophisticated representations of vocabulary involving taxonomic relations at a superordinate level and subtle distinctions among categories (fish + turtles = pets; plants are not pets).
Schema-Based Intervention Is Designed to Address WPS’s Cognitive Demands
Schema-based word-problem intervention is designed to address the cognitive demands specified in this WPS model. With schema-based intervention, students learn to conceptualize word problems as belonging to word-problem types (i.e., the problem model derived from text processing; Kintsch & Greeno, 1985). Students learn to represent each word-problem type with a diagram or equation that maps onto the word-problem type’s central mathematical event. Once students identify the word-problem type, they execute the step-by-step solution strategy for that problem type. This involves placing relevant information from the word problem into the problem type’s diagram or equation and transforming that representation into a number sentence with a missing quantity. Students then compute to solve for the missing quantity.
Research conducted by Jitendra and colleagues (e.g., Jitendra, Griffin, Haria, Leh, Adams, & Kaduvetoor, 2007; Jitendra & Hoff, 1996), Powell and colleagues (Powell, Berry, & Barnes, 2019; Powell & Fuchs, 2010; Powell et al., 2019), and Fuchs and colleagues (e.g., Fuchs et al., 2009; Fuchs, Zumeta, et al., 2010) provides corroborating evidence for the efficacy of schema-based word-problem intervention at grades 2–3. The median ES approximates 0.70. In all three lines of research, schema-based instruction is designed to provide children with strategies that reduce the attentional, reasoning, and working memory demands involved in WPS.
For example, Pirate Math (the program used in the two word-problem conditions in the present study) includes a systematic self-regulation component to foster on-task behavior, hard work, and correct answers, with the goal of supporting attentive behavior and perseverance through challenging tasks. To support reasoning, the teacher makes connections among the situation model, schema, and productive solution strategies transparent, while teaching the problem models (schema) in a structured way. Also, problem model equations provide an analytical framework to help students integrate relevant problem information into a structure that maps onto the problem model (the combine equation is Part 1 + Part 2 = Total; the compare equation is Bigger – smaller = Difference; the change equation is Start +/− Change = End).
In terms of working memory, the teacher models step-by-step strategies for identifying problem statements as combine, compare, or change schema and for building the propositional text structure. Students are taught to begin with an attack strategy, in which they read the problem, underline the word representing the problem’s object code (to anchor the problem’s central focus and provide a label for the numerical answer), and name the schema as they write the first letter of the problem type, rather than holding it in working memory (T for total, D for difference, C for change). Then, they write the problem-model equation and re-read the problem statement. While re-reading, they build a number sentence by replacing letters in the equation with relevant numerals, crossing out irrelevant numerals from the word-problem statement, and using an x to signify the missing quantity. This combination of strategies is designed to compensate for limitations in working memory.
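The final step of this strategy, turning a problem-model equation into a number sentence with one missing quantity and solving it, can be sketched as follows. This is a minimal illustration, not the Pirate Math materials: the function names and the `SOLVERS` lookup are hypothetical, the missing quantity (the student's x) is represented by `None`, and the change equation is read as Start +/− Change = End with a decrease entered as a negative change.

```python
def solve_total(part1=None, part2=None, total=None):
    """Combine schema: Part 1 + Part 2 = Total."""
    if total is None:
        return part1 + part2
    if part1 is None:
        return total - part2
    return total - part1


def solve_difference(bigger=None, smaller=None, difference=None):
    """Compare schema: Bigger - smaller = Difference."""
    if difference is None:
        return bigger - smaller
    if bigger is None:
        return smaller + difference
    return bigger - difference


def solve_change(start=None, change=None, end=None):
    """Change schema: Start + Change = End (a decrease is a negative change)."""
    if end is None:
        return start + change
    if start is None:
        return end - change
    return end - start


# The letter a student writes after naming the schema selects the equation.
SOLVERS = {"T": solve_total, "D": solve_difference, "C": solve_change}

print(SOLVERS["T"](part1=3, part2=5))          # 3 + 5 = x  -> 8
print(SOLVERS["D"](bigger=9, difference=4))    # 9 - x = 4  -> 5
print(SOLVERS["C"](start=7, end=4))            # 7 + x = 4  -> -3 (a decrease of 3)
```

Each solver expects exactly one slot to be left empty, mirroring a number sentence with a single x.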
Despite multiple demonstrations of efficacy across the three research programs, we identified no study assessing schema-based intervention’s effects at first grade, where the dominant intervention focus for at-risk learners has been number knowledge and arithmetic. In fact, we did not locate any word-problem intervention efficacy trial for first graders. We selected schema-based instruction for word-problem intervention in the present study because it is the word-problem intervention with the strongest evidence at other grades (U.S. Department of Education, 2009). We adapted Pirate Math schema-based intervention, previously validated at grades 2 and 3 (Fuchs et al., 2014; Fuchs et al., 2009; Powell et al., 2015), for first grade. In the present study, we used this adapted intervention to assess the efficacy of schema-based instruction at first grade. We hypothesized that both schema-based word-problem conditions would outperform the control group as well as the number knowledge intervention condition.
Schema-Based Intervention’s Missing Focus on Language Comprehension Demands
Another oversight in the schema-based intervention literature is that prior work fails to address the role of language comprehension in WPS. We located no such studies within schema-based intervention or any other established word-problem intervention (e.g., Bottge et al., 2007; Montague, 2007). As Kintsch and Greeno (1985) argued, embedding a language comprehension component may address a critical need in the at-risk population and boost WPS outcomes.
Two types of studies suggest a role for language comprehension. The first is small-scale experiments demonstrating that small changes in the wording of word problems alter problem-solution accuracy. Hudson (1983) contrasted two versions of a word problem with preschoolers: “There are 5 birds and 3 worms. How many more birds are there than worms?” versus “There are 5 birds and 3 worms. How many more birds don’t get worms?” The rewording increased performance by 83%. This effect has been demonstrated in older children when word problems are altered to more clearly reveal semantic relations between sets (Cummins, 1991; Davis-Dorsey et al., 1991; De Corte et al., 1985). Vicente, Orrantia, and Verschaffel (2007) showed that conceptual rewording improved performance for third graders more than for fifth graders and on more difficult problems.
The second type of study is correlational, demonstrating that individual differences in language comprehension help explain individual differences in WPS (Fuchs et al., 2010, 2016; Singer et al., 2019; Swanson & Beebe-Frankenberger, 2004; Van der Schoot, Bakker Arkema, Horsley, & Van Lieshout, 2009). For example, Fuchs, Gilbert, Fuchs, Seethaler, and Martin (2018) assessed second graders at the start of the year on text comprehension, general language comprehension, reasoning, working memory, word identification, and arithmetic skill and, at year-end, on WPS, word-problem language comprehension, and calculations. Multilevel path analysis indicated that start-of-year language comprehension was a stronger predictor of both year-end word-problem outcomes than of calculations. By contrast, start-of-year arithmetic was a stronger predictor of calculations than of WPS or word-problem language comprehension.
Across word-problem rewording manipulations and correlational individual difference studies, results suggest a role for language comprehension in word-problem intervention. What is still required, however, is an experiment to assess whether language comprehension instruction improves WPS outcomes. We designed the present study to isolate the effects of an embedded language comprehension instructional component on WPS. This extends the literature on schema-based instruction in an innovative direction and provides the strongest test to date of the hypothesis that language comprehension is a causal agent in WPS. We hypothesized that schema-based intervention with embedded language comprehension instruction would outperform schema-based intervention without language comprehension instruction.
To deepen insight into the processes by which intervention effects occur, we also assessed whether end-of-study word-problem language comprehension mediates intervention effects on WPS. We hypothesized that word-problem language comprehension would mediate the effect of word-problem intervention with embedded language comprehension versus word-problem intervention without embedded language, as well as the effect of word-problem intervention with embedded language comprehension versus the control group.
We embedded language instruction within word-problem intervention because studies suggest that stand-alone domain-general cognitive training often fails to transfer to academic skill (e.g., Melby-Lervag & Hulme, 2013). Research on language therapy in the area of reading comprehension (also associated with language comprehension) indicates a similar pattern of effects (Catts & Kamhi, 2017; Schleppegrell, 2007): Oral language comprehension improves, but transfer to reading comprehension is disappointing. Transfer difficulty for at-risk learners is well documented (Haskell, 2001; National Research Council, 2000). In the present study, we expected that embedding language comprehension instruction within schema-based instruction would facilitate transfer of improved language comprehension to WPS, even as schema-based intervention directly builds WPS competence, onto which improved language comprehension may be applied. We focused on first grade because structured language comprehension instruction may be more important for younger and at-risk children (Kintsch & Greeno, 1985; Koedinger & Nathan, 2004; Vicente et al., 2007).
The Need for a Structured Approach to WPS
At the same time, we deemed it important to assess, in the same study design, whether a structured approach to word-problem intervention is necessary. We expected students in each word-problem intervention condition to outperform students who received a structured, validated form of number knowledge intervention. Assessing the necessity of a deliberate, structured focus on WPS for at-risk learners is important because teachers often treat word-problem instruction as if word problems are arithmetic tasks (Daroczy et al., 2015). Assessing transfer from the arithmetic skill derived from number knowledge intervention to WPS may also deepen understanding of how different aspects of mathematical cognition relate to each other.
Do Cognitive Processes or Word-Reading Skill Moderate Intervention Effects on WPS?
To provide insight into the robustness of intervention effects, we explored potential moderators of this study’s intervention effects on WPS, including pretest measures of the cognitive processes associated with WPS as well as word-reading skill. The paucity of such analyses in the mathematics intervention literature makes it difficult to frame hypotheses about whether moderation may occur and, if so, what the pattern of effects might be. The hope was that results would provide insight into the robustness of first-grade mathematics intervention and into the role of cognitive processes and word reading in WPS.
Method
In the method and results sections, we use LC for language comprehension, WP for word problems, and NK for number knowledge.
At-Risk Participants
We conducted this study in accord with our university-approved IRB protocol. As is conventional in the intervention literature and school practice, we defined risk as low math performance at the start of the study. We screened 3,009 consented children in 186 classrooms in 21 schools to enroll 417 at-risk and 480 not-at-risk children in this study. Teachers administered two alternate-form practice screening tests (First-Grade Test of Mathematics Computation and First-Grade Test of Concepts and Applications; Fuchs, Hamlett, & Fuchs, 1990; see Measures for description of this and other screening measures) one week before the study’s actual screening session, which was conducted by research assistants. Established cut-scores below the 25th percentile were applied to identify at-risk children (combined score across screeners < 15, with scores < 4 on Computation and < 12 on Concepts and Applications). To avoid false positives, we excluded 14 students whose teachers identified them as entirely or almost entirely non-English speakers. Because the study’s interventions were not designed to address the needs of students with intellectual disability, we excluded 23 students with standard scores below 80 on both subtests of the 2-subtest Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 2011).
We randomly sampled 416 of the remaining children, ensuring no more than five at-risk participants in the same classroom. Children were in 186 classrooms in 21 schools. We randomly assigned participants to conditions individually (students in the same class participated in different conditions). The four conditions were number knowledge intervention (NK), WP intervention (WP), WP intervention with embedded WP-language instruction (WP[L]), and control. Over the course of participation, 25 at-risk children (10 NK, 6 WP, 3 WP[L], 6 control) moved beyond the study’s reach, leaving a final sample of 391 at-risk children.
Screening and other descriptive data are provided in Table 1. At-risk children in the NK, WP, WP[L], and control conditions, respectively, were 50%, 45%, 46%, and 48% male. They were 48%, 40%, 32%, and 35% African American, 11%, 21%, 23%, and 20% white non-Hispanic, 35%, 32%, 38%, and 38% white Hispanic, and 7%, 7%, 6%, and 7% other. They were 34%, 34%, 41%, and 38% English-learners, with 76%, 77%, 80%, and 74% receiving subsidized lunch (from economically disadvantaged households). There were no significant differences among at-risk conditions on any screening, demographic, or other descriptive measure.
Table 1.
Means¹ and Standard Deviations by Study Condition
| | At-Risk: Control (n=104) | | At-Risk: Number Knowledge (n=92) | | At-Risk: Word Problem (n=96) | | At-Risk: Word Problem [Lang] (n=99) | | Not-At-Risk (n=455) | |
|---|---|---|---|---|---|---|---|---|---|---|
| Variable | Mean | (SD) | Mean | (SD) | Mean | (SD) | Mean | (SD) | Mean | (SD) |
| Screening | | | | | | | | | | |
| Concepts, Applications, Calculations | 10.00 | (6.13) | 9.63 | (6.66) | 9.59 | (5.96) | 8.72 | (5.09) | 27.38 | (10.05) |
| WASI: Vocabulary | 37.89 | (9.51) | 36.88 | (9.84) | 37.08 | (9.76) | 36.68 | (9.95) | 47.33 | (11.42) |
| WASI: Matrix Reasoning² | 45.26 | (6.53) | 43.15 | (4.79) | 44.95 | (5.45) | 44.03 | (7.02) | 50.44 | (7.71) |
| Descriptive | | | | | | | | | | |
| Listening Comprehension² | 77.70 | (16.39) | 75.57 | (18.63) | 75.14 | (17.79) | 73.64 | (19.42) | 94.01 | (16.88) |
| Working Memory: Listen Recall² | 67.21 | (15.50) | 70.13 | (17.39) | 66.48 | (15.50) | 67.36 | (16.21) | 92.20 | (19.64) |
| Working Memory: Count Recall² | 80.55 | (15.40) | 78.15 | (13.47) | 80.16 | (14.79) | 78.04 | (13.73) | 93.33 | (15.73) |
| SWAN (attentive behavior)² | 33.57 | (10.62) | 33.84 | (10.89) | 31.95 | (10.62) | 31.59 | (8.55) | 44.82 | (10.81) |
| KeyMath-Problem Solving | 93.37 | (5.89) | 92.72 | (5.26) | 93.07 | (5.87) | 92.78 | (5.54) | 104.96 | (7.93) |
| WRAT-Arithmetic | 88.41 | (10.96) | 88.07 | (10.15) | 89.38 | (11.11) | 88.11 | (11.88) | 110.76 | (11.04) |
| Word Identification Fluency² | 14.05 | (17.13) | 15.56 | (17.22) | 12.71 | (15.65) | 13.70 | (15.34) | 43.74 | (23.10) |
| Outcomes | | | | | | | | | | |
| Arithmetic: Pre | 5.85 | (4.77) | 6.45 | (5.20) | 5.90 | (4.13) | 5.61 | (3.91) | 18.24 | (9.27) |
| Arithmetic: Post | 16.16 | (11.47) | 24.32 | (13.33) | 24.01 | (12.47) | 24.87 | (11.41) | 35.69 | (14.94) |
| Word Problems: Pre | 1.55 | (1.20) | 1.33 | (1.23) | 1.54 | (1.26) | 1.62 | (1.28) | 6.27 | (3.02) |
| Word Problems: Post | 3.36 | (2.38) | 3.47 | (2.98) | 7.40 | (5.12) | 9.81 | (4.94) | 7.82 | (3.50) |
| Word-Problem Language: Post | 13.41 | (3.79) | 13.07 | (4.02) | 13.07 | (3.69) | 15.55 | (3.82) | 18.24 | (3.35) |
¹ Concepts, Applications, Calculations are reported as raw scores, as are Word Identification Fluency, SWAN, and the outcomes. Standard scores are provided for both WASI measures (mean=50; SD=10) and for Listening Comprehension, Working Memory, KeyMath, and WRAT (mean=100; SD=15).
² Explored as potential moderators.
Not-at-Risk Classmates
We also included a sample of not-at-risk classmates to serve as a comparison group for judging the severity of at-risk students’ pretest performance gaps and posttest achievement gaps. (Not-at-risk classmates were not involved in the randomized controlled trial.) To identify not-at-risk classmates, we randomly sampled 480 children from among (a) those scoring above the cut-point on each math screener, with a combined score > 17, (b) those scoring above an established risk cut-point (19) on Word Identification Fluency (Fuchs, Fuchs, & Compton, 2004) to avoid the complication of not-at-risk students with reading disability, (c) those who were not entirely or almost entirely non-English speakers, and (d) those scoring at or above 80 on both WASI subtests. Over the course of participation, 25 not-at-risk children moved to schools outside the study’s reach, leaving a final sample of 455 not-at-risk classmates.
Not-at-risk classmates were assessed on the same measures completed by at-risk participants. See Table 1 for not-at-risk students’ screening and other descriptive data. Not-at-risk classmates outperformed at-risk students on each measure. Performance gaps between at-risk children and their not-at-risk classmates are provided in Table 2. Not-at-risk classmates were 50% male; 30% African American, 40% white non-Hispanic, 20% white Hispanic, and 20% other. Nineteen percent were English-learners; 60% received subsidized lunch.
Table 2.
Performance Gaps by Study Condition: Effect Sizes (ESs; Hedges g) for At-Risk Study Conditions versus Not-At-Risk Classmates
| Variable | Control (n=104): ES | Number Knowledge (n=92): ES | Word Problem (n=96): ES | Word Problem [Lang] (n=99): ES |
|---|---|---|---|---|
| Screening | | | | |
| Concepts, Applications, Calculations | 1.84 | 1.86 | 1.88 | 1.99 |
| WASI: Vocabulary | 0.85 | 0.93 | 0.92 | 0.95 |
| WASI: Matrix Reasoning | 0.69 | 1.00 | 0.74 | 0.84 |
| Descriptive | | | | |
| Listening Comprehension | 0.97 | 1.07 | 1.11 | 1.17 |
| Working Memory: Listen Recall | 1.32 | 1.15 | 1.36 | 1.30 |
| Working Memory: Count Recall | 0.82 | 0.99 | 0.85 | 0.99 |
| KeyMath-Problem Solving | 1.52 | 1.61 | 1.56 | 1.61 |
| WRAT-Arithmetic | 2.03 | 2.08 | 1.93 | 2.02 |
| Word Identification Fluency | 1.34 | 1.27 | 1.41 | 1.37 |
| Outcomes | | | | |
| Arithmetic: Pre | 1.44 | 1.35 | 1.43 | 1.48 |
| Arithmetic: Post | 1.36 | 0.77 | 0.80 | 0.75 |
| Word Problems: Pre | 1.70 | 1.76 | 1.69 | 1.65 |
| Word Problems: Post | 1.34 | 1.27 | 0.11 | −0.52 |
| Word-Problem Language: Post | 1.41 | 1.49 | 1.51 | 0.78 |
Note: The negative effect size indicates that at-risk children in the word-problem with embedded language condition performed higher than not-at-risk classmates at posttest.
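To show how the gaps in Table 2 relate to the means and SDs in Table 1, the following Python sketch computes Hedges g from summary statistics. It assumes the conventional pooled-SD formula with the small-sample correction factor 1 − 3/(4df − 1); the text does not state which variant was used, and the function and argument names are illustrative.

```python
import math


def hedges_g(m_ref, sd_ref, n_ref, m_grp, sd_grp, n_grp):
    """Hedges g for a reference group minus a comparison group (pooled SD; assumed variant)."""
    df = n_ref + n_grp - 2
    pooled_sd = math.sqrt(((n_ref - 1) * sd_ref**2 + (n_grp - 1) * sd_grp**2) / df)
    correction = 1 - 3 / (4 * df - 1)  # small-sample bias correction
    return correction * (m_ref - m_grp) / pooled_sd


# Posttest word problems from Table 1: not-at-risk classmates vs. two at-risk conditions.
gap_control = hedges_g(7.82, 3.50, 455, 3.36, 2.38, 104)
gap_wp_lang = hedges_g(7.82, 3.50, 455, 9.81, 4.94, 99)

print(round(gap_control, 2))  # 1.34
print(round(gap_wp_lang, 2))  # -0.52
```

Under these assumptions, the computed values match the posttest word-problem gaps reported in Table 2 for the control and WP[L] conditions.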
Screening Measures
Screening measures were administered once, before students were randomly assigned to intervention. (Word-Identification Fluency, which was used to exclude not-at-risk classmates with risk for reading disability, was also used as a moderator and is grouped in Tables 1 and 2 with other descriptive measures used as moderators.)
First-Grade Test of Computational Fluency (Fuchs et al., 1990) includes 25 items that sample the typical first-grade computation curriculum: adding two single-digit numbers (9 items), subtracting two single-digit numbers (10 items), adding three single-digit numbers (2 items), adding two 2-digit numbers without regrouping (2 items), and subtracting a 1-digit number from a 2-digit number (2 items). Students have 2 min to complete as many items as possible. Staff entered responses into a computerized program on an item-by-item basis, with 15% of tests re-entered by an independent scorer. Data-entry agreement was 99.6%. Predictive validity across first grade with respect to end-year WRAT (Wilkinson, 1993) was .73 (Fuchs et al., 2013). Sample-based α was .98.
First-Grade Test of Mathematics Concepts and Applications (Fuchs et al., 1990) includes 25 items sampling the typical first-grade concepts/applications curriculum (i.e., numeration, concepts, geometry, measurement, applied computation, money, charts/graphs, WPs). The tester reads the words in each item aloud. For 20 items, students have 15 s to respond; for 5 items, 30 s. Staff entered responses into a computerized program on an item-by-item basis, with 15% re-entered by an independent scorer. Data-entry agreement was 98.5%. Predictive validity across first grade with respect to KeyMath-Revised Problem Solving (Connolly, 1998) was .62 (Fuchs et al., 2013). Sample-based α was .93. The cut-scores on the two math screening measures, used to identify at-risk and not-at-risk students at the start of first grade, were derived from a longitudinal dataset (Bailey et al., in press; Fuchs et al., 2013) to demonstrate strong classification accuracy for end-of-second grade mathematics difficulty status on the Wide Range Achievement Test (WRAT; Wilkinson, 1993).
WASI (Wechsler, 2011) is a 2-subtest measure of general cognitive ability, comprising Vocabulary and Matrix Reasoning subtests (reliability > .92). Vocabulary assesses expressive vocabulary, verbal knowledge, memory, learning ability, and crystallized and general intelligence. Students identify pictures and define words. Matrix Reasoning measures nonverbal fluid reasoning and general intelligence. Students complete matrices with missing pieces.
Word Identification Fluency (Fuchs et al., 2004) provides students with 1 min to read a list of 50 words randomly sampled from 100 high-frequency pre-primer, primer, and first-grade words. If a student finishes before 1 min, the score is prorated. We administered two alternate forms and averaged scores. Alternate-form reliability/stability is .97; correlations with Woodcock Reading Mastery Test-Word Identification (Woodcock, 1997) are .77–.82. The cut-score on Word Identification Fluency, used to designate not-at-risk students as also low-risk for reading disability, was derived from a longitudinal dataset (Bailey et al., in press; Fuchs et al., 2013) to provide strong classification accuracy for adequate end-of-second grade reading performance on WRAT-Reading (Wilkinson, 1993).
Other Descriptive and Moderator Measures
These measures were administered once, before students were randomly assigned to intervention. They were used to describe the sample and were tested as potential moderators.
Language
Woodcock Diagnostic Reading Battery (WDRB) - Listening Comprehension (Woodcock, 1997) measures the ability to understand sentences or passages. With 38 items, students supply the word missing at the end of sentences or passages that progress from simple verbal analogies and associations to discerning implications. Reliability is .80.
Nonverbal reasoning
WASI Matrix Reasoning (Wechsler, 2011) measures nonverbal reasoning with pattern completion, classification, analogy, and serial reasoning tasks. For each item, students complete a matrix, from which one section is missing, by selecting from five response options. Reliability is .94.
Working memory
From the Working Memory Test Battery for Children (WMTB-C; Pickering & Gathercole, 2001), we administered Listening Recall and Counting Recall. Both are dual-task subtests, and each has six items at span levels from 1–6 to 1–9. Passing four items at a level moves the child to the next level. At each span level, the number of items to be remembered increases by one. Failing three items terminates the subtest. Subtest order is designed to avoid overtaxing any component area and is generally arranged from easiest to hardest. We used the trials correct score. For Listening Recall, children determine if a sentence is true; after making true/false determinations for a series of sentences, they recall the last word of each sentence. For Counting Recall, children count a set of 4, 5, 6, or 7 dots on a card; after counting a series of cards, they recall the number of counted dots on each card.
Attentive behavior
The Strengths and Weaknesses of ADHD-Symptoms and Normal-Behavior (SWAN; Swanson et al., 2004) samples items from the Diagnostic and Statistical Manual of Mental Disorders-IV criteria for Attention Deficit Hyperactivity Disorder for inattention (9 items) and hyperactivity-impulsivity (9 items), but scores are normally distributed. Teachers rate items on a 1–7 scale. We report data for the inattentive subscale as the average rating across the nine items. The SWAN correlates well with other dimensional assessments of behavior related to attention (www.adhd.net) and is widely used in the identification of students with attention deficit disorder. Sample-based α was .99.
Word-reading skill
We also used Word Identification Fluency (Fuchs et al., 2004) to describe the sample and as a potential moderator. See description under screening measures.
Mathematics Measures
The arithmetic measure was administered at pretest (within 3 weeks of random assignment) and posttest (within 3 weeks of intervention ending). The WPS measure differed at pre- and posttest. Pretest problems were uniformly simpler to ensure sensitivity to performance differences at the start of first grade; the posttest assessed the full set of WP skills expected at the end of first grade. The WP language measure was administered only at posttest, because we expected a floor effect in the at-risk sample at the start of first grade.
Arithmetic
From the First-Grade Mathematics Assessment Battery (Fuchs, Hamlett, & Powell, 2003), Arithmetic Combinations includes two subtests. Addition comprises 25 addition problems with sums from 5 to 12 (two items have an addend of 1; one has an addend of zero). Subtraction comprises 25 subtraction fact problems with minuends from 5 to 12 (one item has a subtrahend of 1; one has a subtrahend of zero). For each subtest, students have 1 min to write answers. Because the pattern of results was similar across subtests, we used the total number of correct answers across addition and subtraction. The correlation with WRAT at start and end of first grade was .68–.73 (Fuchs et al., 2013). Sample-based α was .96.
Word problems
At pretest, we administered Pennies Story Problems (Jordan & Hanich, 2000), which comprises 14 WPs representing combine, compare, or change schemas. The scenarios all involve pennies. Problems require addition and subtraction (sums and minuends to 12) for solution. The tester reads each item aloud; students have 30 s to write an answer and can ask for re-reading(s) as needed. The score is the number of correct number answers (pennies is the correct label for every problem). The correlation with KeyMath-3 Applied Problem Solving at start of first grade was .66 (Fuchs et al., 2013). Sample-based α was .86.
At posttest, we administered First-Grade Word Problems (Fuchs et al., 2009), which includes 12 WPs representing combine, compare, and change schemas, with/without irrelevant information, relevant quantities presented in graphs, superordinate terms in questions, challenging questions (no reference to what the problem is mostly about or to associated problem-type vocabulary), implicit change verbs, non-compare –er words, and unusual time passage conjunctions. One combine problem involves three parts. Solutions require addition and subtraction (sums and minuends to 12). We minimized use of intervention instructional vocabulary in words used in test items (some vocabulary [e.g., more] was impossible to avoid). Testers read a WP aloud; students follow along on paper, with up to 2 min to write the answer before testers read the next problem. Each problem is scored for correct math (1 point) and label (1 point) to reflect processing of the WP statement and understanding of the problem’s theme. The correlation with KeyMath-3 Applied Problem Solving at end of first grade was .59 (Fuchs et al., 2013). Sample-based α was .86.
Word-problem language
We modeled our WP language measure after Fuchs et al. (2015), who found that start-of-second-graders’ WP language, indexed via a similar task, mediated the effects of general language comprehension on WPS. The first-grade Word Problem-Language Assessment (Fuchs, Craddock, & Seethaler, 2013) was designed to tap understanding of language’s role in determining operation in the word-problem types included in the first-grade state standards. Items on the measure do not parallel any task used during intervention.
Testers read WPs aloud while the child follows along on paper. The child can request re-readings as needed. For each problem, the child decides whether addition or subtraction is needed to solve the problem and selects from three choices to indicate which words are most important in determining whether to add or subtract. Testing begins with a practice item (designed to be easier than the test items), with which testers explain what they want the child to do. Testers point to the problem as they read: “Lacy has 1 pink flower and 2 yellow flowers. How many flowers does she have altogether? Here’s my first question: To find the answer to this problem, do you add or subtract (tester points to the words, add and subtract)? Let’s think. If we want to know how many flowers she has altogether, we add the pink flowers plus the yellow flowers to find the answer. So you say, ‘add.’ Here’s my second question: Which word or words are the most important to help you figure out if you add or subtract: altogether (tester points) or How many (tester points) or 1 pink flower (tester points)? The word altogether is most important in telling us to add. So you say, ‘altogether.’” For each problem, testers pose the two questions used for the practice item, point to answers to help the child focus on the response options, and record answers on a score sheet. There are 12 problems representing combine, compare, and change schemas, with/without irrelevant information, superordinate terms in questions, implicit change verbs, non-compare –er words, and unusual time passage conjunctions. Each WP earns two possible points, for a maximum score of 24. Sample-based α was .74.
Intervention
When describing interventions, we use the present tense because these interventions are current and are used in other work. When describing other information about study conditions, we use the past tense to communicate that those procedures are completed.
Commonalities across conditions
The three intervention conditions shared five commonalities. First, each intervention comprised 45 30-min sessions conducted one-to-one over 15 weeks outside the classroom in the child’s school. Absences and snow days were made up to ensure each child received 45 sessions.
Second, instruction was structured to compensate for the domain-general cognitive and linguistic limitations associated with WP difficulty (Kintsch & Greeno, 1985; Fuchs et al., 2019). Research syntheses indicate the importance of structured instruction for improving at-risk children’s learning (Baker et al., 2002; Gersten et al., 2009). In the three interventions, structured instruction (a) ensures students have the foundational knowledge and skills to succeed with new content; (b) provides explanations in simple, direct language; (c) models efficient solution strategies instead of expecting students to discover strategies on their own; (d) gradually fades support for correct execution of taught strategies; (e) provides interleaved practice so students use knowledge and strategies to generate many correct responses and distinguish among problem types; and (f) incorporates systematic cumulative review.
Third, intervention includes an attention, motivation, and self-regulation system centered on four rules: use inside voice; stay in seat; follow directions; and try hard to answer problems correctly. Tutors set a timer to beep at 5-min intervals. Tutors award a checkmark if the child is following all four rules at the moment the timer beeps. Tutors keep track of checkmarks earned. At the end of the session, checkmarks are converted to stickers on the child’s chart. When the sticker chart is full (~weekly), the child picks a small prize.
Fourth, each session comprises three segments: speeded practice on arithmetic problems (5 min); the lesson, in which tutors introduce and review concepts and strategies (20 min); and practice (5 min). Fifth, throughout these segments, tutors require children to know the answer or use taught counting strategies to solve arithmetic problems.
With speeded practice, referred to as Meet or Beat Your Score, children have 60 s to answer flash cards. In the first six lessons, cards are restricted to n+/−1, n+/−0, and n+/−2; after efficient addition and subtraction counting strategies are taught, all combinations of addends and minuends up to 18 are included. Children are taught to “know the answer right off the bat” (retrieve from memory) if confident; otherwise, use the taught counting strategies. Children answer each presented problem correctly because, as soon as an error occurs, the tutor requires them to use the taught counting strategy to produce the correct response. To discourage guessing or careless application of counting strategies, seconds elapse as children execute the counting strategy as many times as needed to produce the correct answer. In this way, careful but quick responding increases the number of correct responses. Children have a chance to meet or beat the first score, and the day’s higher score is plotted.
To teach efficient counting strategies, tutors first address the conceptual bases using manipulatives and the number line and then teach how to use fingers to execute these strategies. For addition, children “count-in.” For 3 + 4 =, they hold 3 fingers up on one hand to represent the smaller addend; then they close their other fist to indicate holding the larger addend, gently tapping the closed fist against the other hand while saying the larger addend 4; then they count 5 (putting one finger down), 6 (putting another finger down), and 7 (putting the last finger down). The last number counted is the answer. For subtraction, they “count-up.” For 5 – 2 = 3, they count the difference between the numbers, saying 2 with a closed fist; then they count 3 (hold up a finger), 4 (hold up a finger), 5 (hold up a finger). The number of raised fingers is the answer.
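The two finger-counting strategies can be traced step by step. The sketch below is illustrative only: the function names `count_in` and `count_up` are hypothetical, and the code models the spoken counting sequence rather than anything in the intervention materials. It assumes non-negative whole numbers and, for subtraction, a minuend at least as large as the subtrahend.

```python
def count_in(addend_a: int, addend_b: int) -> tuple[int, list[int]]:
    """Addition by counting in: start at the larger addend, count on by the smaller."""
    larger, smaller = max(addend_a, addend_b), min(addend_a, addend_b)
    spoken = [larger]               # child says the larger addend with a closed fist
    for _ in range(smaller):        # one raised finger is put down per count
        spoken.append(spoken[-1] + 1)
    return spoken[-1], spoken       # the last number counted is the answer


def count_up(minuend: int, subtrahend: int) -> tuple[int, list[int]]:
    """Subtraction by counting up from the subtrahend to the minuend."""
    spoken = [subtrahend]           # child says the smaller number with a closed fist
    fingers = 0
    while spoken[-1] < minuend:     # one finger is raised per count
        spoken.append(spoken[-1] + 1)
        fingers += 1
    return fingers, spoken          # the number of raised fingers is the answer


print(count_in(3, 4))   # (7, [4, 5, 6, 7])
print(count_up(5, 2))   # (3, [2, 3, 4, 5])
```

Both calls reproduce the worked examples in the text: 3 + 4 is counted as 4, 5, 6, 7, and 5 – 2 is counted as 2, 3, 4, 5 with three fingers raised.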
NK intervention
To foster engagement, NK intervention (Galaxy Math, previously known as Number Rockets) incorporates a space theme (e.g., rocket ship Meet or Beat Your Score charts; moon beam manipulatives; Fuchs et al., 2013; see Fuchs, Fuchs, Craddock, & Seethaler, 2019 for a complete manual). After speeded practice, tutors conduct the lesson, relying on number lines and manipulatives to represent mathematical ideas (1–19 for Units 1–5; 1–100 for Unit 6). See Supplemental File Table S1 for an overview of lesson content.
Unit 1 (lessons 1–4) addresses basic number knowledge. Unit 2 (lessons 5–6) focuses on adding and subtracting concepts and principles (e.g., adding and subtracting as moving up and down the number line; +0, +1, and +2 problems as simple counting on; the meaning of the equal sign; commutative property of addition; inversion principle). Unit 3 (lessons 7–11) teaches counting strategies. Unit 4 (lessons 12–13) focuses on doubles concepts. Unit 5 (lessons 14–37) focuses on number sets 5–12. Unit 6 (lessons 38–45) focuses on writing, counting, and reading numbers 0–99, 3-addend sums, and double-digit adding and subtracting.
Units 4 and 5, which focus on partitioning numbers into constituent sets (for the 5 set: 0 + 5, 1 + 4, 2 + 3, 5 − 0, 5 − 1, 5 − 2, etc.) and number families (four number sentences with the same three numbers), comprise most of the program. Three lessons are allocated to each set. Four activities are conducted in each lesson. (1) Children use unifix cubes to explore how the target number in that set is partitioned in different ways to derive the adding and subtracting problems comprising the set. (2) Children use blocks and visual displays that group families in the set to strengthen part-whole understanding. (3) Children generate all addition and subtraction problems with answers in the target set, using manipulatives to represent problems. (4) Children answer problems in previous sets, with corrective feedback.
In the 5-min practice segment of each lesson, children and tutors play one of two games with a space theme using that day’s number set. The Space Alien game board shows cartoon aliens and footprints stretching to the end of a game board. For each arithmetic problem answered correctly, children move one footprint forward. With Bingo, children complete arithmetic problems, covering answers with rocket ships on a card displaying nine numbers. The winning Bingo rule varies (e.g., 3 in a row, 4 corners, cover all). In both games, tutors require counting strategies on errors to produce correct answers, and tutors and children compete against each other, with tutors occasionally responding incorrectly and children providing corrections.
WP intervention (with and without embedded LC instruction)
In this paper, WP intervention refers to Pirate Math schema-based intervention. To foster engagement, WP intervention incorporates a pirate theme (pirate-themed sticker charts; gold coin manipulatives, Find x!; see Fuchs, Fuchs, & Seethaler, 2019 for a complete manual). After speeded practice, tutors conduct the lesson. For an overview of lesson content, see Supplemental File Table S2.
Content is organized in five units. Unit 1 (lessons 1–9) addresses adding and subtracting concepts, addition and subtraction counting strategies, and solving for a missing number, represented by the letter x. In WP intervention, there is substantially less focus than in NK intervention on number knowledge, including adding/subtracting and arithmetic concepts and principles. Unit 2 (lessons 10–18) focuses on total problems (combining two or three quantities to make a total; e.g., There are 5 girls on the playground and 3 girls in the yard. How many girls are there?). Unit 2 also includes instruction on 3-addend addition and 3-part total problems. Unit 3 (lessons 19–27) focuses on difference problems (comparing a larger and a smaller quantity to find the difference; e.g., At the picnic, the kids ate 5 hot dogs. They ate 3 hamburgers. How many more hot dogs did they eat than hamburgers?). Unit 4 (lessons 28–36) focuses on change problems (increasing or decreasing a start quantity to produce an end quantity; e.g., Jamarius baked 6 cookies. Then, he gave 3 of them to his friend. How many cookies does Jamarius still have?). Unit 5 (lessons 37–45) introduces a sorting game where students decide whether a problem is total, difference, or change; it also provides review and practice.
Units 2–4 begin by teaching the mathematical structure of the focal WP type for that unit. This involves role playing the problem type’s central mathematical event using an intact number story (no missing quantity), concrete objects, and the child’s and tutor’s names. Tutors next use the intact story to connect the mathematical central event to (a) a visual schematic (into which story quantities are written) and (b) a hand gesture, which is used across lessons to quickly remind children of the problem type’s central event. Then tutors connect the problem’s central event to a problem-model number sentence: for combine (referred to as total): P1 + P2 = T (Part 1 plus Part 2 equals Total; also 3-part problems); for compare (referred to as difference): B – s = D (bigger quantity minus smaller quantity equals difference); for change: St +/− C = E (for change increase, start number plus change number equals end number; for change decrease, start number minus change number equals end number).
Tutors finally introduce a problem (with a missing quantity), using the same cover story with which the lesson is introduced. The problem is enacted via role playing with concrete objects and the child’s/tutor’s names; the problem type’s schematic and hand gesture are applied; and the problem model number sentence is introduced with x standing for the missing quantity. In each WP unit, the idea of irrelevant information is taught.
In Units 2–4 for both WP conditions, tutors explicitly teach step-by-step strategies to reduce demands on attentive behavior, reasoning, and working memory, as explained in the introduction. This includes strategies for understanding WPs as belonging to WP types (schemas or problem models) and for building the WP model. As children process and solve a WP, they use these strategies to name the problem type, to represent that problem model with the model number sentence, to enter relevant quantities from the WP statement into the problem model while crossing out “extra” (irrelevant) numbers, and to solve for the missing quantity.
Children RUN through the problem: Read it, Underline the segment in which the problem’s object code (which becomes the label) is revealed, and Name the problem type. They write T, D, or C next to the problem to help them remember the problem type, and they write the problem type’s model number sentence. Then, they re-read the problem as they enter known quantities and x to stand for the unknown quantity into the slots of the problem type’s number sentence. For example, given the total problem, There are 5 girls on the playground and 3 girls in the yard. There are also 4 boys in the classroom. How many girls are there?, the tutor reads the problem as the child follows along. The child underlines girls; identifies the problem as total and writes T and the problem type’s model number sentence, P1 + P2 = T; re-reads the problem while substituting 5 for P1, 3 for P2, and x for T; and crosses out 4. To solve 5 + 3 = x, the child retrieves the answer or uses counting strategies. See Supplemental File Figure SF1 for sample first-grade WP work using these Pirate Math strategies.
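To make the sequence concrete, the sketch below encodes the problem-model number sentences and solves for x once the known quantities are entered. It is a hypothetical illustration: the `solve` function, the type codes "C+" and "C-", and the dictionary layout are assumptions made for this example and are not part of Pirate Math. Three-part total problems are not covered.

```python
# Problem-model number sentences: (first slot, operator, second slot, result slot)
MODELS = {
    "T": ("P1", "+", "P2", "T"),     # total:           P1 + P2 = T
    "D": ("B", "-", "s", "D"),       # difference:      B - s = D
    "C+": ("St", "+", "C", "E"),     # change increase: St + C = E
    "C-": ("St", "-", "C", "E"),     # change decrease: St - C = E
}


def solve(problem_type: str, **slots) -> int:
    """Solve for x, the single slot passed as None."""
    first, op, second, result = MODELS[problem_type]
    a, b, r = slots[first], slots[second], slots[result]
    sign = 1 if op == "+" else -1
    if r is None:                    # a op b = x
        return a + sign * b
    if b is None:                    # a op x = r
        return sign * (r - a)
    return r - sign * b              # x op b = r


# Total problem from the text: 5 girls and 3 girls; the 4 boys are crossed out.
print(solve("T", P1=5, P2=3, T=None))     # 8
# Change-decrease example from Unit 4: 6 cookies baked, 3 given away.
print(solve("C-", St=6, C=3, E=None))     # 3
```

Keeping the model sentence separate from the arithmetic mirrors the taught sequence: the child first names the problem type and writes its number sentence, and only then fills the slots and solves.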
Embedded LC instruction
The embedded LC component addresses WP language relevant to the first-grade WP schemas. The content was derived via a thorough analysis of prior work on WP language in the primary grades and an analysis of the demands specific to first-grade problem types. In Unit 2, the meaning and application of vocabulary are taught in the context of combine problems: joining words (e.g., altogether, in all) and superordinate categories (e.g., animals = dogs + cats). In Unit 3, the meaning and application of vocabulary and syntax are taught in the context of compare problems: compare words (e.g., more, fewer, than, -er words) and adjective -er versus verb -er words (e.g., bigger vs. teacher). In Unit 4, the meaning and application of vocabulary and syntax are taught in the context of change problems: cause-effect conjunctions (e.g., then, because, so), implicit quantity change verbs (e.g., cost, ate, found), and time passage phrases (e.g., 3 hours later, the next day). Other focal points apply across problem types: for example, confusing cross-problem constructions (e.g., more than vs. then … more) and challenging answer labels (e.g., questions with superordinate category words, without a label, noun that’s the wrong label [as in money questions]). For an overview of lesson content, see Supplemental File Table S2, with embedded language content underlined.
It is important to note that embedded language instruction does not teach children to rely on key words. Instead, children are taught that processing words in isolation is an error-fraught strategy; are taught to read entire problems so they can meaningfully derive the problem’s central mathematical event; and are taught specifically how and why “grabbing numbers and key words” to identify a number sentence frequently produces wrong answers. To help children appreciate this, they check the work of “other children” (prepared worked problems). Students find errors and explain how and why errors occurred. Worked examples rely on key words to select the wrong operation or misuse irrelevant numbers or fail to recognize 3-part total problems.
The 5-min practice segment of each lesson provides students independent practice finding the unknown (x) in number sentences and solving a WP. WPs align with that day’s lesson focus or provide a review problem. Tutors provide corrective feedback as needed.
Training and support for tutors
Full or part-time project employees served as tutors. Almost all were university master’s students (7% in teacher licensure programs; 5% licensed before graduate school). Each tutor worked with 5–6 students, distributed across the three intervention conditions. (Tutoring materials were color coded to avoid contamination across conditions, with audio-recordings of sessions reviewed on a weekly basis to ensure that contamination across conditions did not occur.)
Tutors were introduced to the interventions in a 2-day workshop and then supported to implement the programs in weekly meetings throughout the 15-week intervention. The initial workshop provided an overview; explained the program’s focus and rationale; discussed distinctions between the intervention conditions and methods to ensure each student received the condition in line with random assignment; modeled key elements of each intervention; provided practice in implementing these elements; and explained methods for providing appropriate corrective feedback throughout lessons. After the workshop but prior to the first lesson with a student, tutors completed a quiz covering major components of the intervention program and demonstrated at least 90% accurate implementation with a project coordinator. To promote fidelity, tutors studied lesson guides to inform their teaching but were not allowed to read from or memorize them.
During implementation, tutors attended weekly meetings, in which tutors provided updates on their students, discussed learning and behavior challenges, and problem solved with each other, the two project coordinators, and the first author. Key information on upcoming topics was also reviewed and upcoming materials were distributed. Also, every intervention session was audiotaped. Staff listened to a randomly selected lesson for each tutor in each condition and performed live observations on a weekly basis. The purpose was to identify difficulties with or deviations from the protocol and provide corrective feedback as needed.
Control Condition and Mathematics Instructional Time
On a questionnaire completed in the spring, classroom teachers described their WP instruction (classroom and intervention), math instructional time, and at-risk students’ participation in school intervention. The mathematics block averaged 319.20 min per week (SD = 126.23); the focus on WPs averaged 87.96 min per week (SD = 55.94), or ~28% of the math program. The first-grade intervention (RTI) block was scheduled for 45–60 min.
On a 5-point scale, teachers reported how often the school program relied on various sources or methods to teach WPs (1 = never; 2 = rarely; 3 = every once in a while; 4 = sometimes; 5 = a lot). They did not teach WPs in terms of problem types (1.22, SD = 0.38) and rarely relied on the textbook (1.80, SD = 0.99). Reliance on graphic representations was occasional (2.98, SD = 0.46), and reliance on a meta-cognitive attack strategy fell between occasional and sometimes (3.80, SD = 1.41). Major instructional foci were labeling answers (4.37, SD = 0.75), keywords (4.40, SD = 1.40), drawing pictures (4.80, SD = 0.45), using objects (4.82, SD = 0.43), and using number sentences (4.84, SD = 0.42).
State standards guided instruction. For representing and solving WPs, the standards specify (a) add and subtract within 20 with unknowns in all positions involving total, compare, and change situations and (b) add three whole numbers whose sum is within 20 using objects, drawings, and equations with a symbol for the unknown number to represent the WP.
Thus, the school program addressed the same WP types, with similar focus on labeling and using objects and number sentences to represent WPs. In contrast to study conditions, it provided major emphasis on drawing pictures and keywords, which are not deemed productive instructional methods (Powell & Fuchs, 2018). Also, the school program did not rely on a structured approach; instead, school personnel individually designed instructional methods with minimal guidance from textbooks.
Of the 32% of students who missed classroom math instruction to participate in the study’s intervention, 68% missed core math instruction, 16% missed math centers, and 16% missed some other type of math instruction. Of the 68% who did not miss classroom math instruction, 90% instead missed the school’s intervention block (math or reading), 3% missed part of the reading block, and 7% missed another activity. Almost one-third (30%) of students in the study’s intervention conditions also received the school’s supplemental math intervention, for an average of 128.71 minutes (SD = 43.49) per week, while 49% of control students received the school’s supplemental math intervention, for an average of 146.11 min (SD = 59.07) per week.
Fidelity of Implementation
To quantify implementation fidelity, we sampled 15% of audio-recordings such that intervention conditions, tutors, and lesson types were sampled comparably. Research assistants independently listened to recordings while completing a checklist to identify the percentage of essential points addressed in that lesson. Coding agreement exceeded 96%. The mean percentage of points addressed was 97.70 (SD=2.96) in NK; 97.66 (SD=2.33) in WP; and 96.80 (SD=2.98) in WP[L]. Thus, fidelity was high and similar across conditions.
Procedure
In August, we screened students for study entry. In September–October, we administered descriptive measures individually and mathematics measures in small groups. In November, teachers completed the SWAN. Intervention began in late October and continued through March. Within 4 weeks of intervention ending, we administered mathematics measures in small groups and the WP language test individually. All test sessions were audio-recorded; 15% of recordings were randomly selected, stratifying by tester, for accuracy checks by an independent scorer. Agreement exceeded 99%. Testers were blind to study condition during test administration and scoring.
Data Analysis
The study’s data structure involved three levels of nesting: 391 students nested within 186 classrooms nested in 21 schools. Three-level multilevel models (e.g., Raudenbush & Bryk, 2002) were employed to account for nesting for the arithmetic and WP-language outcomes, but only two-level models were estimable for the WPS outcome. All models were fit in Mplus 7.4 (Muthén & Muthén, 1998–2019) using full-information robust maximum likelihood estimation.
Our first set of research questions involved the main effect of intervention condition on the arithmetic outcome, controlling for pretest arithmetic; the main effect of intervention condition on the WPS outcome, controlling for pretest story problems; and the main effect of intervention condition on the WP language outcome. These questions were evaluated using the multilevel models in Equations (1)–(3).
| (1) | 
| (2) | 
| (3) | 
Here, i indexes student, j indexes classroom, and k indexes school. eijk is a student-level residual with variance σ², and u0jk and u00k are, respectively, classroom-level and school-level deviations from the average intercept γ000, each with its own variance. prWPij is pretest story problems and prARIijk is pretest arithmetic. poWPij is the WPS outcome; poWP[L]ijk is WP language; poARIijk is posttest arithmetic. D1ijk, D2ijk, and D3ijk are dummy variables representing, respectively, NK intervention versus control, WP intervention versus control, and WP[L] intervention versus control. We tested the difference between WP[L] intervention and NK intervention by testing (γ003 − γ001) = 0; the difference between WP intervention and NK intervention by testing (γ002 − γ001) = 0; and the difference between the two WP conditions (WP[L] vs. WP) by testing (γ003 − γ002) = 0.
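For orientation, a three-level random-intercept model consistent with the definitions above can be written in the following general form. This is a sketch, not the authors' exact specification: Y stands in for a posttest outcome and X for its pretest covariate, and the coefficient label γ004 for the covariate is an assumption. Only γ000 through γ003 and the residual terms are named in the text.

```latex
\[
Y_{ijk} = \gamma_{000}
        + \gamma_{001}\, D1_{ijk}
        + \gamma_{002}\, D2_{ijk}
        + \gamma_{003}\, D3_{ijk}
        + \gamma_{004}\, X_{ijk}
        + u_{00k} + u_{0jk} + e_{ijk}
\]
```

Under this form, each dummy coefficient is a covariate-adjusted difference from control, so a contrast between two intervention conditions is a difference of two coefficients, such as γ003 − γ002 for WP[L] versus WP. A two-level version of the same form would drop the school-level term.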
An additional research question involved the indirect effect of intervention contrasts on the outcomes via WP language, controlling for pretest story problems and pretest arithmetic. This was investigated using the multilevel mediation model in Equation (4).
| (4) | 
Parameters and residuals in the equation for the mediator (WP language) are denoted with a superscript “p.” Parameters and residuals in the equation for the WPS outcome are denoted with a superscript “v.” C1ijk is a dummy code for a single comparison: NK intervention versus control, WP intervention versus control, WP[L] intervention versus control, or WP intervention versus WP[L] intervention. Hence, four versions of Equation (4) were fit with different definitions of C1ijk to evaluate four different indirect effects. With the a-path defined as the effect of C1ijk on the mediator and the b-path as the effect of the mediator on the WPS outcome, the point estimate for the indirect effect in each case was the product a-path × b-path, and its confidence interval (CI) was obtained using Preacher and Selig’s (2012) Monte Carlo method, as implemented at www.quantpsy.org.
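The Monte Carlo approach to the CI can be sketched in a few lines. The code below is an illustration under stated assumptions, not the www.quantpsy.org implementation: it treats the a-path and b-path estimates as independent normal variables (zero covariance between them), the function name `monte_carlo_ci` is hypothetical, and the estimates and standard errors in the usage lines are made-up placeholders rather than study results.

```python
import random


def monte_carlo_ci(a, se_a, b, se_b, reps=20000, level=0.95, seed=1):
    """Percentile CI for the indirect effect a*b from simulated draws."""
    rng = random.Random(seed)
    # Draw each path from its assumed sampling distribution and multiply the draws.
    products = sorted(rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(reps))
    tail = (1 - level) / 2
    lower = products[int(tail * reps)]
    upper = products[int((1 - tail) * reps) - 1]
    return a * b, lower, upper


# Placeholder inputs for illustration only.
estimate, lo, hi = monte_carlo_ci(a=0.40, se_a=0.10, b=0.30, se_b=0.08)
print(f"indirect = {estimate:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Because the product of two normal variables is not itself normal, the percentile interval is typically asymmetric around the point estimate. The indirect effect is judged significant when the interval excludes zero.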
An exploratory research question involved examining whether intervention effects on the WPS outcome were moderated by any of six pretest variables: language comprehension, two forms of complex working memory, reasoning, attentive behavior, or word reading. In these analyses, we controlled for pretest story problems. Moderation was investigated using the multilevel moderation model in Equation (5).
| (5) | 
The focal predictor, C1ijk, is a dummy code for one of the four focal intervention contrasts (NK intervention vs. control, WP intervention vs. control, WP[L] intervention vs. control, and WP intervention vs. WP[L] intervention). The moderatorij stands for one of the six moderators. Four possibilities for the focal predictor C1ijk and six possibilities for the moderatorij led to fitting 24 versions of the Equation (5) multilevel moderation model. For significant moderation relations, we followed Preacher, Curran, and Bauer (2006) and Preacher and Sterba (2019) to identify regions of significance in terms of the percentile score associated with the value of the moderator for which the effect of the focal intervention contrast was significant.
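A region of significance can be located by finding the moderator values at which the simple effect of the contrast sits exactly on the significance boundary. The sketch below solves the resulting quadratic for a generic interaction model. The function and argument names are illustrative assumptions, as are the coefficient, variance, and covariance values in the usage lines, and a fixed critical value of 1.96 is assumed.

```python
import math


def region_of_significance(b_focal, b_inter, var_focal, var_inter, cov, z_crit=1.96):
    """Moderator values m where (b_focal + b_inter*m) / SE(m) equals +/- z_crit.

    SE(m)^2 = var_focal + 2*m*cov + m^2*var_inter, so the boundary solves a
    quadratic in m. Returns a sorted list with 0, 1, or 2 boundary values.
    """
    z2 = z_crit ** 2
    qa = b_inter ** 2 - z2 * var_inter
    qb = 2.0 * (b_focal * b_inter - z2 * cov)
    qc = b_focal ** 2 - z2 * var_focal
    if abs(qa) < 1e-12:  # degenerate case: the equation is linear in m
        return [] if abs(qb) < 1e-12 else [-qc / qb]
    disc = qb ** 2 - 4.0 * qa * qc
    if disc < 0:  # no real boundary within this model
        return []
    root = math.sqrt(disc)
    return sorted([(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)])


def simple_effect_z(m, b_focal, b_inter, var_focal, var_inter, cov):
    """z statistic for the focal contrast at moderator value m."""
    effect = b_focal + b_inter * m
    se = math.sqrt(var_focal + 2.0 * m * cov + m ** 2 * var_inter)
    return effect / se


# Hypothetical estimates (not from the study).
bounds = region_of_significance(2.0, 0.5, 0.25, 0.04, 0.0)
print([round(m, 3) for m in bounds])
```

Each boundary value can then be compared with the observed moderator distribution and expressed as a percentile, which is how the regions are reported in this study.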
Results
For the three outcomes (arithmetic, WPS, WP language), intraclass correlations (ICCs) at the classroom level were .022, .010, and .036, respectively. ICCs at the school level were .062, .001, and .020, respectively. See Table 1 for means and SDs by at-risk conditions and for not-at-risk classmates. See Table 2 for at-risk children’s pre- and post-intervention performance gaps (against not-at-risk classmates; ESs) on the outcome measures. See Table 3 for results of main effects multilevel models in Equations (1)–(3), as well as Benjamini-Hochberg procedure critical values and Hedges’ g ESs between at-risk conditions (i.e., the covariate-adjusted mean difference divided by the unadjusted pooled within-group SD).
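The two summary statistics named above can be sketched in a few lines. The ICC functions assume the usual three-level decomposition, in which each level's share of the total variance is its ICC. The effect-size function follows the definition given in the text (covariate-adjusted difference over the unadjusted pooled within-group SD) and applies no small-sample correction. All function names and input values are hypothetical, and the reported ICCs need not be reproducible from the conditional variance components in Table 3.

```python
import math


def icc_three_level(var_student, var_classroom, var_school):
    """Return (classroom ICC, school ICC) as shares of total variance."""
    total = var_student + var_classroom + var_school
    return var_classroom / total, var_school / total


def pooled_sd(sd1, n1, sd2, n2):
    """Unadjusted pooled within-group standard deviation for two groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def effect_size(adjusted_diff, sd1, n1, sd2, n2):
    """Covariate-adjusted mean difference divided by the pooled SD."""
    return adjusted_diff / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical inputs (not from the study).
print(icc_three_level(8.0, 1.0, 1.0))
print(effect_size(2.0, 4.0, 50, 4.0, 50))
```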
Table 3.
Main Effects Multilevel Model Results (n=391)
| Model/Parameter | Estimate | SE | p-value | B-H critical | Hedges g ES | 
|---|---|---|---|---|---|
| Arithmetic | |||||
| Fixed Effects | |||||
| Intercept | 8.675 | 1.049 | <.0001 | ||
| Adjusted Mean Difference | |||||
| NK v. control | 7.419 | 1.528 | <.0001 | .0125 | 0.59 | 
| WP v. control | 7.985 | 1.475 | <.0001 | .0083 | 0.65 | 
| WP[L] v. control | 9.154 | 1.272 | <.0001 | .0042 | 0.79 | 
| WP v. NK | 0.566 | 1.882 | .764 | .0167 | 0.03 | 
| WP[L] v. NK | 1.735 | 1.496 | .246 | .0208 | 0.14 | 
| WP v. WP[L] | 1.169 | 1.345 | .385 | .0250 | 0.10 | 
| Pretest effect | 1.332 | 0.110 | <.0001 | ||
| Variance Components | |||||
| Student-level residual | 99.374 | 6.385 | |||
| Classroom-level intercept | 3.367 | 7.133 | |||
| School-level intercept | 6.426 | 6.071 | |||
| Word Problems | |||||
| Fixed Effects | |||||
| Intercept | 2.383 | 0.360 | <.0001 | ||
| Adjusted Mean Difference | |||||
| NK v. control | 0.260 | 0.401 | .517 | .0250 | 0.09 | 
| WP v. control | 4.045 | 0.526 | <.0001 | .0125 | 1.08 | 
| WP[L] v. control | 6.416 | 0.550 | <.0001 | .0042 | 1.75 | 
| WP v. NK | 3.785 | 0.576 | <.0001 | .0167 | 0.94 | 
| WP[L] v. NK | 6.156 | 0.563 | <.0001 | .0083 | 1.55 | 
| WP v. WP[L] | 2.371 | 0.684 | .0001 | .0208 | 0.47 | 
| Pretest effect | 0.632 | 0.171 | <.0001 | ||
| Variance Components | |||||
| Student-level residual | 14.746 | 1.280 | |||
| Classroom-level intercept | 0.715 | 1.395 | |||
| Word-Problem Language | |||||
| Fixed Effects | |||||
| Intercept | 13.511 | 0.355 | <.0001 | ||
| Mean Difference | |||||
| NK v. control | −0.386 | 0.413 | .349 | .0250 | 0.17 | 
| WP v. control | 0.573 | 0.504 | .255 | .0208 | 0.16 | 
| WP[L] v. control | 2.164 | 0.559 | <.0001 | .0125 | 0.56 | 
| WP v. NK | 0.960 | 0.386 | .013 | .0167 | 0.24 | 
| WP[L] v. NK | 2.550 | 0.331 | <.0001 | .0042 | 0.63 | 
| WP v. WP[L] | 1.590 | 0.444 | <.0001 | .0083 | 0.41 | 
| Variance Components | |||||
| Student-level residual | 13.064 | 1.084 | |||
| Classroom-level intercept | 1.180 | 1.034 | |||
| School-level intercept | 0.283 | 0.315 | |||
Note. p-values for z-tests of variance components are conservative and thus often not reported (divide est/SE and compare to ±1.96 to discern significance according to z-test). An adjustment Fitzmaurice et al. (2011, p. 209) suggest is to employ α = .10 rather than α = .05; it performs similarly to other correction methods (Ke & Wang, 2015). B-H is the Benjamini-Hochberg procedure critical value. ES is effect size.
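The Benjamini-Hochberg critical values in Table 3 can be illustrated with a short sketch of the step-up procedure. The six critical values printed for each outcome are consistent with rank × .025 / 6; that level (.025) and family size (6) are inferred from the table rather than stated in the text. The example p-values are the word-problem rows, with .0001 standing in as an upper bound wherever the table reports <.0001.

```python
def bh_critical_values(m, q):
    """Critical value for each rank 1..m: rank * q / m."""
    return [rank * q / m for rank in range(1, m + 1)]


def bh_reject(p_values, q):
    """Benjamini-Hochberg step-up: reject every test ranked at or below the
    largest rank whose p-value does not exceed its critical value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Word-problem contrasts in Table 3 order; .0001 is a stand-in for "<.0001".
labels = ["NK v. control", "WP v. control", "WP[L] v. control",
          "WP v. NK", "WP[L] v. NK", "WP v. WP[L]"]
p_values = [0.517, 0.0001, 0.0001, 0.0001, 0.0001, 0.0001]

print([round(c, 4) for c in bh_critical_values(6, 0.025)])
for label, keep in zip(labels, bh_reject(p_values, 0.025)):
    print(label, "significant" if keep else "not significant")
```

Under these assumptions, every word-problem contrast except NK versus control remains significant, which matches the pattern described in the text.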
On the arithmetic outcome, effects of all three active conditions, controlling for pretest arithmetic, were not significantly different from each other but were significantly stronger than for the control condition. For the main effect of intervention on WPS, controlling for pretest story problems, the effect of WP[L] intervention was significantly stronger than WP intervention, which was significantly stronger than NK intervention, which was not significantly different from control. On WP language, the effect of WP[L] intervention was significantly stronger than WP intervention, which was significantly stronger than NK intervention, although WP intervention and NK intervention were not significantly different from control. As shown in Table 3, significant effects were retained after the Benjamini-Hochberg procedure (Thissen, Steinberg, & Kuang, 2002) was applied to control for Type I error.
For the multilevel mediation models, the indirect effect for the contrast between NK intervention versus control on WPS outcome, via WP language, was not significant (a-path p = .254; b-path p = .001; direct effect or c’-path p = .417; and total effect or c-path = .060(.363) p = .869). The indirect effect for the contrast between WP intervention versus control on WPS outcome, via WP language, also was not significant (a-path p = .081; b-path p = .004; direct effect or c’-path p < .001; and total effect or c-path = 4.763(.490) p < .001).
By contrast, both indirect effects involving WP[L] intervention on WPS outcome via WP language were significant: for the contrast between WP[L] intervention versus control (a-path p < .001; b-path p < .001; direct effect or c’-path p < .001; and total effect or c-path = 7.246(.581) p < .001), and for the contrast between WP[L] intervention versus WP intervention, CI = {.138, 1.147} (a-path p < .001; b-path p = .002; direct effect or c’-path p = .003; and total effect or c-path = 2.470(.653) p < .001).
For multilevel moderation models, parameter estimates and standard errors for all 24 fitted models are provided in the Supplemental File. Three moderator effects were significant. Although two of them initially suggested that attentive behavior and working memory each moderated the effect between NK intervention and control on WPS, probing the computed regions of significance revealed that both moderation effects were driven by a single influential case. When that case was removed, both interaction effects became nonsignificant. With the remaining effect, attentive behavior significantly moderated the effect of WP[L] intervention versus control on WPS. Yet, computing the region of significance (Preacher et al., 2006) indicated the WP[L] intervention effect was significant for all observed values of attentive behavior. Because values outside the observed range are unlikely to occur in intervention, this moderation effect is of no practical consequence. Thus, our moderation analyses suggest the effects of NK intervention, WP intervention, and WP[L] intervention are robust across the full distribution of at-risk students’ pretest cognitive processes and word-reading skill.
Discussion
This study centered on three questions: whether a structured approach to WPS intervention is necessary, whether schema-based word-problem intervention improves at-risk first graders’ WPS, and whether embedding language comprehension instruction produces stronger outcomes.
Is A Structured Approach to WPS Intervention Necessary?
Results demonstrate the need for a structured approach to WPS intervention. Transfer from enhanced arithmetic skill, achieved via number knowledge intervention, proved insufficient for addressing the WPS difficulty of at-risk learners. Although number knowledge intervention conveyed an ES advantage of 0.59 over at-risk control group students on arithmetic, this did not translate into stronger WPS (ES on WPS = 0.09). Moderation analyses revealed that the disappointing effect of number knowledge intervention on WPS was robust across the spectrum of at-risk students’ cognitive processes and word-reading skill. Further, the posttest WPS achievement gap for number knowledge intervention students with respect to not-at-risk classmates was sizeable (1.38 SDs) and similar to that of control group students (1.41 SDs). These results indicate a need for a structured approach to WPS intervention.
Finding an absence of transfer from arithmetic to WPS echoes Fuchs et al. (2014), in which researcher-designed classroom calculations instruction combined with researcher-designed small-group intervention for at-risk second graders improved calculations but failed to transfer to WPS. This was the case across learner types (Fuchs et al., 2014) and specifically for at-risk learners (Powell, Cirino, Fuchs, Compton, & Changas, 2015). Those findings and the present study’s corroborating results provide the basis for recommending that teachers and curriculum developers not assume that arithmetic skill will translate to improved WPS performance. Instead, a structured approach to WPS intervention appears necessary.
Does Schema-Based Word-Problem Intervention Improve At-Risk First Graders’ WPS?
Results also indicate that schema-based word-problem intervention is a structured approach that boosts WPS at first grade. Prior research on schema-based word-problem intervention demonstrates positive effects at second and third grade (Fuchs et al., 2008; Fuchs et al., 2004; Jitendra et al., 2007; Jitendra & Hoff, 1996; Powell et al., 2019; Powell & Fuchs, 2010). Yet, prior work had not investigated efficacy at first grade. The present study reveals first-grade efficacy for both schema-based word-problem intervention conditions.
Even without the embedded language component, the effects of schema-based intervention were strong, with an ES advantage of 1.08 SDs over the control group and 0.94 SDs over number knowledge intervention. Additionally, the posttest WPS achievement gap was 0.20 SDs, indicating that at-risk intervention students were in range of not-at-risk peers, and moderation analyses for each of the schema-based intervention conditions suggested its effects on WPS were robust across the spectrum of children’s pretest cognitive competencies and word-reading skill. We therefore conclude that schema-based word-problem intervention provides schools with an efficacious approach for delivering structured WPS intervention, which substantially narrows or even closes the first-grade WPS achievement gap.
Does Embedding Language Comprehension Instruction Produce Stronger Outcomes?
That conclusion does, however, need to be qualified, because we found that the effects of schema-based word-problem intervention are further strengthened via instruction on word-problem language. Language comprehension is transparently involved in WPS (Kintsch & Greeno, 1985). Prior studies demonstrate that small changes to word problems can improve problem solution accuracy and document that individual differences in language comprehension are associated with individual differences in WPS.
Additionally, the severity of the language problems among students with mathematics difficulties is highlighted by our sample’s language scores at the start of the study. Although participants were selected due to poor mathematics (not poor language), their pretest language scores, as shown in Table 2, placed them nearly a full SD below not-at-risk classmates. For these reasons, it is surprising that prior work on word-problem intervention has not investigated the added value of instruction on the language inherent in WPS.
To create a stringent test of this effect, we contrasted the added value of word-problem language instruction embedded within schema-based word-problem intervention against the same schema-based word-problem intervention without language instruction. The ES advantage between these intervention conditions on the WPS outcome was almost one-half SD in favor of the embedded language condition. This is considerable, especially because, in order to hold intervention time constant, the condition with embedded language instruction received less direct word-problem instruction (~5 min less per session; ~225 min less across 45 sessions).
To understand how additional value accrued in the language-enriched word-problem intervention condition, it is instructive to think about effects on the study’s word-problem language outcome. In the main effects model, only the condition with word-problem language instruction significantly outperformed the control group. The mediation analysis indicates that the word-problem language-enriched condition not only improves WPS directly (as revealed in the significant c’ effect), via schema-based instruction, but also indirectly, by strengthening word-problem language. This mediation effect was also specific. It was not significant for the number knowledge versus control group contrast or for the contrast between word-problem intervention without enriched language instruction versus control group. (It was significant for the language embedded word-problem condition versus control group contrast.)
In terms of embedded language instruction’s added value, it is interesting to note at-risk children’s post-intervention WPS achievement gaps. At the end of intervention, the gap had narrowed substantially in the word-problem intervention without language instruction condition: from 1.56 SDs below not-at-risk classmates at pre-intervention to 0.20 at post-intervention. Even so, in the condition with embedded language instruction, the gap went from 1.54 SDs below not-at-risk classmates before intervention to 0.51 SDs above not-at-risk classmates. This reflects these students’ added boost of WPS competence, accomplished via improved word-problem language. This may afford these children an additional layer of protective cushion against fade out effects as they exit supplemental math intervention.
Our results thus substantiate a causal role for language comprehension in WPS and provide support for Kintsch and Greeno’s (1985) conceptualization of WPS as drawing on language competencies. Finding that language comprehension plays a causal role not only deepens insight into word-problem development but also provides implications for designing word-problem instruction, for supporting children’s language development, and perhaps for conceptualizing reading comprehension instruction. With respect to language development, results suggest that parents and preschool teachers be vigilant for and act on opportunities to extend children’s ordinary word usage to mathematical contexts. In terms of word-problem instruction, results indicate that it should incorporate a deliberate focus on language, including but not limited to word-problem specific vocabulary and syntactic knowledge (e.g., understanding the distinction between more than and then there were more; that the cause and effect in change word problems may be presented in either order within word problems).
With respect to reading comprehension, it is instructive to consider Kintsch and Greeno’s proposal (1985) that WPS be conceptualized as a form of text comprehension. Our results establish a causal connection between language comprehension and WPS, while the connection between language comprehension and reading comprehension is already well established (e.g., Catts et al., 2005; Gough & Tunmer, 1986). Other correlational data suggest connections among the three domains (Fuchs et al., 2015, 2018; Swanson, Cooney, & Brock, 1993; Vilenius-Tuohimaa, Aunola, & Nurmi, 2008).
Conceptualizing WPS as a form of text comprehension raises the possibility that a theoretically coordinated approach, via language comprehension instruction, may simultaneously improve performance across reading comprehension and WPS. This might include, for example, teaching cause-effect informational structure in reading passages in conjunction with change word problems, in which an event increases or decreases a starting amount to create a new ending amount, or connecting compare-contrast informational structure in reading passages with word problems that compare quantities. Such an approach is consistent with recent calls to target learners with language deficits for embedded language instruction within reading comprehension instruction (Catts & Kamhi, 2017; Ukrainetz, 2017). And such an approach may address the needs of an especially vulnerable subset of the population: students with co-occurring difficulty across reading comprehension and WPS, who perform lower in each domain than do students with difficulty in one area (Willcutt et al., 2013) and are at risk of poorer long-term outcomes in and out of school (Batty et al., 2010; Every Child a Chance Trust, 2009).
Three Study Conclusions, to Be Understood in the Context of Four Study Limitations
Our results yield three conclusions about effective word-problem intervention for at-risk first graders. First, number knowledge intervention (which improves arithmetic) does not improve word-problem performance; instead, a structured intervention focused on word problems is necessary. Second, structured schema-based word-problem intervention, which teaches children to understand word problems in terms of problem types and to execute strategies for solving each problem type, improves first-grade word-problem performance. Third, the effects of schema-based word-problem intervention are stronger when it embeds structured instruction on word-problem vocabulary and syntax (which also teaches children the importance of considering language in the context of the complete word problem, not as key words).
We note these conclusions generalize to schools of varying socioeconomic status (SES). Although approximately three-quarters of the at-risk sample received subsidized lunch, the percentage of at-risk and not-at-risk children with subsidized lunch varied widely across study schools, and we selected students as at risk based on low math performance, not low SES. As reflected in the language comprehension gaps between at-risk and not-at-risk samples, structured language comprehension instruction may be more important for children with low math performance (Kintsch & Greeno, 1985; Koedinger & Nathan, 2004; Vicente et al., 2007). This is the population that typically receives first-grade mathematics intervention in schools of varying SES.
Readers should, however, understand conclusions in the context of four study limitations. First, this was an efficacy study, designed to ensure fidelity of implementation. The question of generalizability of embedded language instruction, as well as schema-based intervention, to real-world school intervention awaits further study. Second, present results do not speak to the question of whether effects persist over time. As demonstrated in prior intervention work in the primary grades (e.g., Clarke et al., 2016; Clements, Sarama, Wolfe, & Spitler, 2013), mathematics intervention effects may fade over time, which would suggest that many at-risk learners require sustained intervention support (Bailey, Fuchs, Gilbert, Geary, & Fuchs, in press).
Third, methods for indexing word-problem language are not established. We relied on a strategy designed to be sensitive to components of our approach to word-problem intervention, centered on problem types and word-problem language. In the present study, this approach for assessing word-problem language proved reliable, sensitive to intervention effects (in the main effect model), and capable of indexing individual differences (in the mediation model), as has been the case for a similar approach in prior work (e.g., Fuchs et al., 2015, 2018). Yet, future research to develop comprehensive batteries for assessing word-problem language would create the basis for expanding related research questions and may contribute to improvements in clinical practice.
Finally, in the present study, we indexed word-problem language at the end of intervention, which constrains inference in the mediation models. Future studies should assess word-problem language performance far enough out from the start of intervention to ensure sensitivity to the intervention’s effect but prior to WPS posttesting. Even so, we remind readers that the conditions were comparable on other pretest language variables. Further, this study’s main effects between the two word-problem conditions (on WPS and on WP language) are a sufficient basis for causal inference and provide clear implications for education practice.
Supplementary Material
Figure 1.
Models assessing the indirect effect of WP language on the WPS outcome for four intervention contrasts: NK vs. control (Panel A), WP vs. control (Panel B), WP[L] vs. control (Panel C), and WP[L] vs. WP (Panel D).
Educational Impact and Implications Statement.
This study yields three conclusions about effective word-problem intervention for at-risk first graders. The first conclusion is that arithmetic intervention does not improve word-problem performance; instead, structured word-problem intervention is required. Second, structured schema-based word-problem intervention, which teaches children to understand word problems in terms of problem types and to align problem-solving strategies with the problem type, improves word-problem performance. Third, schema-based word-problem intervention is more effective when it embeds word-problem language instruction that teaches children the importance of understanding language in the context of the complete word problem (not as key words).
Acknowledgments
This research was supported by 2 R01 HD053714 and R01 HD097772 and Core Grant HD15052 from the Eunice Kennedy Shriver National Institute of Child Health & Human Development to Vanderbilt University in the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health & Human Development or the National Institutes of Health.
References
- Bailey D, Fuchs LS, Gilbert JK, Geary DC, & Fuchs D (in press). Prevention: Necessary but insufficient? A two-year follow-up of effective first-grade mathematics intervention. Child Development. doi: 10.1111/cdev.13175 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Baker S, Gersten R, & Lee D-S. (2002). A synthesis of empirical research on teaching mathematics to low-achieving students. The Elementary School Journal, 103, 51–73. https://www.jstor.org/stable/1002308 [Google Scholar]
 - Batty GD, Kivimäki M, & Deary IJ (2010). Intelligence, education, and mortality. BMJ: British Medical Journal, 340, 1–2. doi: 10.1136/bmj.c563 [DOI] [PubMed] [Google Scholar]
 - Bottge BA, Rueda E, Serlin RC, Hung Y-H, & Kwon JM (2007). Shrinking achievement differences with anchored math problems: Challenges and possibilities. The Journal of Special Education, 41, 31–49. https://files.eric.ed.gov/fulltext/EJ762921.pdf [Google Scholar]
 - Catts HW, Hogan TP, & Adolf SM (2005). Developmental changes in reading and reading disabilities. In Catts HW & Kamhi AG (Eds.), The connections between language and reading disabilities (pp. 25–40). Mahwah, NJ: Erlbaum. [Google Scholar]
 - Catts HW, & Kamhi AG (2017). Prologue: Reading comprehension is not a single ability. Language, Speech, and Hearing Services in Schools, 48(2), 73–76. doi: 1044/2017_LSHSS-16-0033 [DOI] [PubMed] [Google Scholar]
 - Clarke B, Doabler C, Smolkowski K, Nelson EK, Fien H, Baker SK, & Kosty D (2016). Testing the immediate and long-term efficacy of a tier 2 kindergarten mathematics intervention. Journal of Research on Educational Effectiveness, 9, 607–634. DOI: 10.1080/19345747.2015.1116034 [DOI] [Google Scholar]
 - Clements DH, Sarama J, Wolfe CB, & Spitler ME (2013). Longitudinal evaluation of a scale-up model for teaching mathematics with trajectories and technologies: Persistence of effects in the third year. American Educational Research Journal, 50, 812–850. [Google Scholar]
 - Connolly AJ (2007). KeyMath – 3 Diagnostic Assessment. London: Pearson. [Google Scholar]
 - Cummins DD (1991). Children’s interpretations of arithmetic word problems. Cognition and Instruction, 8, 261–289. doi: 10.1207/s1532690xci0803_2 [DOI] [Google Scholar]
 - Cummins DD, Kintsch W, Reusser K, & Weimer R (1988). The role of understanding in solving word problems. Cognitive Psychology, 20, 405–438. doi: 10.1016/0010-0285(88)90011-4 [DOI] [Google Scholar]
 - Daroczy G, Wolska M, Meurers WD, & Nuerk H-C (2015). Word problems: A review of linguistic and numerical factors contribution to their difficulty. Frontiers in Psychology, 6, 348. doi: 10.3389/fpsyg.2015.00348. [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Davis-Dorsey J, Ross SM, & Morrison GR (1991). The role of rewording and context personalization in the solving of mathematical word problems. Journal of Educational Psychology, 83(1), 61–68. Doi: 10.1037/0022-0663.83.1.61 [DOI] [Google Scholar]
 - De Corte E, Verschaffel L, & de Win L (1985). Influence of rewording verbal problems on children’s problem representations and solutions. Journal of Educational Psychology, 77(4), 460–470. doi: 10.1037/0022-0663.77.4.460 [DOI] [Google Scholar]
 - Dehaene S, Spelke ES, Pinel P, Stanescu-Cosson R, & Tsivkin S (1999). Sources of mathematical thinking: Behavioral and brain-imaging evidence. Science, 284(5416), 970–974. doi: 10.1126/science.284.5416.970 [DOI] [PubMed] [Google Scholar]
 - Every Child a Chance Trust (2009). The long-term costs of numeracy difficulties. Retrieved. August 14, 2009, http://www.everychildachancetrust.org/counts/index.cfm.
 - Fitzmaurice GM, Laird NM, &Ware JH (2011). Applied longitudinal analysis (2nd ed.). New York: Wiley. [Google Scholar]
 - Fuchs LS, Craddock C, & Seethaler PM (2013). Word-Problem Language Assessment. Available from L.S. Fuchs, Vanderbilt University, Nashville, TN 37203. [Google Scholar]
 - Fuchs LS, Fuchs D, Craddock C, & Seethaler PM (2019). Galaxy Math Teachers Manual. Available from L.S. Fuchs, Vanderbilt University, Nashville, TN 37203. [Google Scholar]
 - Fuchs LS, Fuchs D, & Compton DL (2004). Monitoring early reading development in first grade: Word identification fluency versus nonsense word fluency. Exceptional Children, 71, 7–21. doi: 10.1177/001440290407100101 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Fuchs D, Compton DL, Hamlett CL, & Wang AY (2015). Is word-problem solving a form of text comprehension? Scientific Studies of Reading, 19, 204–223. doi: 10.1080/10888438.1005745 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Fuchs D, Finelli R, Courey SJ, & Hamlett CL (2004). Expanding schema-based transfer instruction to help third graders solve real-life mathematical problems. American Educational Research Journal, 41, 419–445. doi: 10.3102/00028312041002419 [DOI] [Google Scholar]
 - Fuchs LS, Fuchs D, & Seethaler PM (2019). First-Grade Pirate Math Manual. Available from L.S. Fuchs, Vanderbilt University, Nashville, TN 37203. [Google Scholar]
 - Fuchs LS, Fuchs D, Seethaler PM, & Barnes MA (in press). Addressing the role of working memory in mathematical word-problem solving when designing intervention for struggling learners. ZDM Mathematics Education. doi: 10.1007/s11858-019-01070-8 [DOI] [Google Scholar]
 - Fuchs LS, Fuchs D, Stuebing K, Fletcher JM, Hamlett CL, & Lambert WE (2008). Problem-solving and computation skills: Are they shared or distinct aspects of mathematical cognition? Journal of Educational Psychology, 100, 30–47. 30–47. doi: 10.1037/0022-0663.100.1.30 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Geary DC, Compton DL, Fuchs D, Hamlett CL, Seethaler PM, Bryant JV, & Schatschneider C (2010). Do different types of school mathematics development depend on different constellations of numerical and general cognitive abilities? Developmental Psychology, 46, 1731–1746. doi: 10.1037/a0020662 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Geary DC, Compton DL, Fuchs D, Schatschneider C, Hamlett CL, DeSelms J, Seethaler PM, Wilson J, Craddock CF, Bryant JD, Luther K, & Changas P (2013). Effects of first-grade number knowledge tutoring with contrasting forms of practice. Journal of Educational Psychology, 105, 58–77. 8–77. doi: 10.1037/a0030127 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Gilbert JK, Fuchs D, Seethaler PM, & Martin B (2018). Text comprehension and oral language as predictors of word-problem solving: Insights into word-problem solving as a form of text comprehension. Scientific Studies of Reading, 22, 152–166. doi: 10.1080/10888438.2017.1398259 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Gilbert JK, Powell SR, Cirino PT, Fuchs D, Hamlett CL, Seethaler PM, & Tolar TM (2016). The role of cognitive processes, foundational math skill, and calculation accuracy and fluency in word-problem solving versus pre-algebraic knowledge. Developmental Psychology, 52, 2085–2098. doi: 10.1037/dev0000227 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Hamlett CL, & Fuchs D (1990). First Grade Test of Computational Fluency and First-Grade Test of Mathematics Concepts and Applications. Available from L.S. Fuchs, 228 Peabody, Vanderbilt University, Nashville, TN 37203. [Google Scholar]
 - Fuchs LS, Powell SR, Cirino PT, Schumacher RF, Marrin S, Hamlett CL, Fuchs D, Compton DL, & Changas PC (2014). Does calculation or word-problem instruction provide a stronger route to pre-algebraic knowledge? Journal of Educational Psychology, 106, 990–1006. doi 10.1037/a0036793 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Powell SR, Seethaler PM, Cirino PT, Fletcher JM, Fuchs D, Hamlett CL, & Zumeta RO (2009). Remediating number combination and word problem deficits among students with mathematics difficulties: A randomized control trial. Journal of Educational Psychology, 101, 561–576. doi: 10.1037/a0014701 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Fuchs LS, Seethaler PM, & Craddock C (2009). First-Grade Math Word Problems. Available from L.S. Fuchs, Vanderbilt University, Nashville, TN 37203. [Google Scholar]
 - Fuchs LS, Zumeta RO, Schumacher RF, Powell SR, Seethaler PM, Hamlett CL, & Fuchs D (2010). The effects of schema-broadening instruction on second graders’ word-problem performance and their ability to represent word problems with algebraic equations: A randomized control study. Elementary School Journal, 110, 440–463. doi: 10.1086/651191 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Gersten R, Chard DJ, Jayanthi M, Baker SK, Morphy P, & Flojo J (2009). Mathematics instruction for students with learning disabilities: A meta-analysis of instructional components. Review of Educational Research, 79, 1202–1242. doi: 10.3102/0034654309334431 [DOI] [Google Scholar]
 - Gough PB, & Tunmer WE (1986). Decoding, reading, and reading disability. Remedial and Special Education, 7, 6–10. doi: 10.1177/074193258600700104 [DOI] [Google Scholar]
 - Haskell RE (2001). Transfer of learning: Cognition, instruction, and reasoning. Academic. [Google Scholar]
 - Hedges LV (1982). Estimation of effect size from a series of independent experiments. Psychological Bulletin, 92, 490–499. doi: 10.1037/0033-2909.92.2.490 [DOI] [Google Scholar]
 - Hoffer TB, Venkataraman L, Hedberg EC, & Shagle S (2007). Final report on the National Survey of Algebra Teachers for the National Math Panel (retrieved 3/26/2019). https://www.researchgate.net/profile/Eric_Hedberg/publication/228513219_Final_report_on_the_national_survey_of_algebra_teachers_for_the_National_Math_Panel/links/55364e710cf20ea35f11ca6d/Final-report-on-the-national-survey-of-algebra-teachers-for-the-National-Math-Panel.pdf.
 - Hudson T (1983). Correspondences and numerical differences between disjoint sets. Child Development, 54, 84–90. doi: 10.2307/1129864 [DOI] [Google Scholar]
 - Jitendra AK, Griffin CC, Haria P, Leh J, Adams A, & Kaduvettoor A (2007). A comparison of single and multiple strategy instruction on third-grade students’ mathematical problem solving. Journal of Educational Psychology, 99, 115–127. doi: 10.1037/0022-0663.99.1.115 [DOI] [Google Scholar]
 - Jitendra AK, & Hoff K (1996). The effects of schema-based instruction on the mathematical word-problem-solving performance of students with learning disabilities. Journal of Learning Disabilities, 29, 422–431. doi: 10.1177/002221949602900410 [DOI] [PubMed] [Google Scholar]
 - Jordan NC, & Hanich L (2000). Mathematical thinking in second-grade children with different forms of LD. Journal of Learning Disabilities, 33, 567–578. doi: 10.1177/002221940003300605 [DOI] [PubMed] [Google Scholar]
 - Ke Z, & Wang L (2015). Detecting individual differences in change: Methods and comparisons. Structural Equation Modeling: A Multidisciplinary Journal, 22, 382–400. doi: 10.1080/10705511.2014.936096 [DOI] [Google Scholar]
 - Kintsch W, & Greeno JG (1985). Understanding and solving word arithmetic problems. Psychological Review, 92, 109–129. doi: 10.1037/0033-295X.92.1.109 [DOI] [PubMed] [Google Scholar]
 - Koedinger KR, & Nathan M (2004). The real story behind story problems: Effects of representations on quantitative reasoning. Journal of the Learning Sciences, 13, 129–164. doi: 10.1207/s15327809jls1302_1 [DOI] [Google Scholar]
 - Lee K, Ng SF, Ng EL, & Lim ZY (2004). Working memory and literacy as predictors of performance on algebraic word problems. Journal of Experimental Child Psychology, 89, 140–158. doi: 10.1016/j.jecp.2004.07.001 [DOI] [PubMed] [Google Scholar]
 - Melby-Lervag M, & Hulme C (2013). Is working memory training effective? A meta-analytic review. Developmental Psychology, 49, 270–291. doi: 10.1037/a0028228 [DOI] [PubMed] [Google Scholar]
 - Montague M (2007). Self‐regulation and mathematics instruction. Learning Disabilities Research and Practice, 22, 75–83. doi: 10.1111/j.1540-5826.2007.00232.x [DOI] [Google Scholar]
 - Muthén LK, & Muthén BO (1998–2019). Mplus User’s Guide. Los Angeles, CA: Muthén & Muthén. [Google Scholar]
 - Nathan MJ, Kintsch W, & Young E (1992). A theory of algebra-word-problem comprehension and its implications for the design of learning environments. Cognition and Instruction, 9, 329–389. https://www.jstor.org/stable/3233559 [Google Scholar]
 - National Research Council. (2000). How people learn: Brain, mind, experience, and school. The National Academies Press. [Google Scholar]
 - Peng P, Namkung J, Barnes M, & Sun C (in press). A meta-analysis of mathematics and working memory: Moderating effects of working memory domain, type of mathematics skill, and sample characteristics. Journal of Educational Psychology. Advance online publication. doi: 10.1037/edu0000079 [DOI] [Google Scholar]
 - Pickering S, & Gathercole S (2001). Working Memory Test Battery for Children. London: The Psychological Corporation. [Google Scholar]
 - Powell SR, Berry KA, & Barnes MA (2019). The role of algebraic reasoning with a word-problem intervention for third-grade students with mathematics difficulty. Manuscript submitted for review.
 - Powell SR, Driver MK, Roberts G, & Fall A-M (2017). An analysis of the mathematics vocabulary knowledge of third- and fifth-grade students: Connections to general vocabulary and mathematics computation. Learning and Individual Differences, 57, 22–32. doi: 10.1016/j.lindif.2017.05.011 [DOI] [Google Scholar]
 - Powell SR, & Fuchs LS (2010). Contribution of equal-sign instruction beyond word-problem tutoring for third-grade students with mathematics difficulty. Journal of Educational Psychology, 102, 381–394. doi: 10.1037/a0018447 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Powell SR, & Fuchs LS (2018). Effective word-problem instruction: Using schemas to facilitate mathematical reasoning. TEACHING Exceptional Children, 51(1), 31–42. doi: 10.1177/0040059918777250 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Powell SR, Fuchs LS, Cirino PT, Fuchs D, Compton DL, & Changas PE (2015). Effects of a multi-level support system on calculation, word-problem, and pre-algebraic learning among at-risk learners. Exceptional Children, 81, 443–470. doi: 10.1177/0014402914563702 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Preacher KJ, Curran PJ, & Bauer DJ (2006). Computational tools for probing interaction effects in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31, 437–448. doi: 10.3102/10769986031004437 [DOI] [Google Scholar]
 - Preacher KJ, & Selig JP (2012). Advantages of Monte Carlo confidence intervals for indirect effects. Communication Methods and Measures, 6(2), 77–98. doi: 10.1080/19312458.2012.679848 [DOI] [Google Scholar]
 - Preacher KJ, & Sterba SK (2019). Aptitude-by-treatment interactions in research on educational interventions. Exceptional Children, 85, 248–264. doi: 10.1177/0014402918802803 [DOI] [Google Scholar]
 - Purpura DJ, & Ganley CM (2014). Working memory and language: Skill-specific or domain-general relations to mathematics? Journal of Experimental Child Psychology, 122, 104–121. doi: 10.1016/j.jecp.2013.12.009 [DOI] [PubMed] [Google Scholar]
 - Raghubar KP, Barnes MA, & Hecht SA (2010). Working memory and mathematics: A review of developmental, individual difference, and cognitive approaches. Learning and Individual Differences, 20(2), 110–122. doi: 10.1016/j.lindif.2009.10.005 [DOI] [Google Scholar]
 - Raudenbush SW, & Bryk AS (2002). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage. [Google Scholar]
 - Riley MS, Greeno JG, & Heller JH (1983). Development of children’s problem-solving ability in arithmetic. In Ginsburg HP (Ed.), The development of mathematical thinking (pp. 153–196). San Diego, CA: Academic. [Google Scholar]
 - Schleppegrell MJ (2007). The linguistic challenges of mathematics teaching and learning: A research review. Reading & Writing Quarterly, 23(2), 139–159. doi: 10.1080/10573560601158461 [DOI] [Google Scholar]
 - Singer V, Strausser K, & Cuadro A (2019). Direct and indirect paths from arithmetic to school performance. Journal of Educational Psychology, 111, 434–445. doi: 10.1037/edu0000290 [DOI] [Google Scholar]
 - Swanson J et al. (2004). Categorical and dimensional definitions and evaluations of symptoms of ADHD: The SNAP and the SWAN rating scales. Downloaded from www.adhd.net on 12/20/2004. [PMC free article] [PubMed]
 - Swanson HL (2016). Word problem solving, working memory and serious math difficulties: Do cognitive strategies really make a difference? Journal of Applied Research in Memory and Cognition, 5, 368–383. doi: 10.1016/j.jarmac.2016.04.012 [DOI] [Google Scholar]
 - Swanson HL, & Beebe-Frankenberger M (2004). The relationship between working memory and mathematical problem solving in children at risk and not at risk for serious math difficulties. Journal of Educational Psychology, 96, 471–491. doi: 10.1037/0022-0663.96.3.471 [DOI] [Google Scholar]
 - Swanson HL, Jerman O, & Zheng X (2008). Growth in working memory and mathematical problem solving in children at risk and not at risk for serious math difficulties. Journal of Educational Psychology, 100(2), 343–379. doi: 10.1037/0022-0663.100.2.343 [DOI] [Google Scholar]
 - Thissen D, Steinberg L, & Kuang D (2002). Quick and easy implementation of the Benjamini-Hochberg procedures for controlling the false positive rate in multiple comparisons. Journal of Educational and Behavioral Statistics, 27, 77–83. doi: 10.3102/10769986027001077 [DOI] [Google Scholar]
 - U.S. Department of Education (2009). Assisting students struggling with mathematics: Response to intervention for elementary and middle schools. Washington DC: Institute of Education Sciences, NCEE; 2009–4060; https://ies.ed.gov/ncee/wwc/Docs/PracticeGuide/rti_math_pg_042109.pdf [Google Scholar]
 - Ukrainetz TA (2017). Commentary on “Reading Comprehension Is Not a Single Ability:” Implications for child language intervention. Language, Speech, and Hearing Services in Schools, 48(2), 92–97. doi: 10.1044/2017_LSHSS-16-0031 [DOI] [PubMed] [Google Scholar]
 - Van der Schoot M, Bakker Arkema BAH, Horsley TM, & Van Lieshout EDCM (2009). The consistency effect depends on markedness in less successful but not successful problem solvers: An eye movement study in primary school children. Contemporary Educational Psychology, 34, 58–66. doi: 10.1016/j.cedpsych.2008.07.002 [DOI] [Google Scholar]
 - van Dijk TA, & Kintsch W (1983). Strategies of discourse comprehension. New York: Academic. [Google Scholar]
 - Vicente S, Orrantia J, & Verschaffel L (2007). Influence of situational and conceptual rewording on word problem solving. British Journal of Educational Psychology, 77, 829–848. doi: 10.1348/000709907X178200 [DOI] [PubMed] [Google Scholar]
 - Vilenius-Tuohimaa PM, Aunola K, & Nurmi JE (2008). The association between mathematical word problems and reading comprehension. Educational Psychology, 28, 409–426. doi: 10.1080/01443410701708228 [DOI] [Google Scholar]
 - Wechsler D (2011). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX: Psychological Corporation. [Google Scholar]
 - Willcutt EG, Petrill SA, Wu S, Boada R, Defries JC, Olson RK, & Pennington BF (2013). Comorbidity between reading disability and math disability: Concurrent psychopathology, functional impairment, and neuropsychological functioning. Journal of Learning Disabilities, 46, 500–516. doi: 10.1177/0022219413477476 [DOI] [PMC free article] [PubMed] [Google Scholar]
 - Wilkinson GS, & Robertson GJ (2006). Wide Range Achievement Test (4th ed). Wilmington, DE: Wide Range. [Google Scholar]
 - Woodcock RW (1997). Woodcock Diagnostic Reading Battery. Itasca, IL: Riverside. [Google Scholar]
 

