Behavior Analysis in Practice. 2024 Oct 21;18(1):244–252. doi: 10.1007/s40617-024-00999-x

Efficacy of Strategic Incremental Rehearsal in a Word List

Taylor K Lewis, Tom Cariveau, Alexandria Brown, Paige Ellington, James Stocker
PMCID: PMC11904067  PMID: 40092324

Abstract

Strategic incremental rehearsal (SIR) involves the systematic introduction of targets during instruction. Specifically, SIR includes an incrementing set size such that correct responding to a subset of targets is required before additional targets are included during instructional sessions. Prior research has arranged SIR using flashcards, although the features of SIR that are likely responsible for its efficacy may not be restricted to flashcards. In the current study, we arranged SIR in a word list (SIR-WL), which includes the presentation of target words on a single page. Instruction using SIR-WL was effective across all evaluations during sight word instruction for children exhibiting reading deficits and resulted in durable responding during maintenance and generalization probes for most targets.

  • Several trial interspersal methods have been described in the extant literature and may confer unique benefits for skill acquisition interventions in applied practice.

  • SIR has been shown to be effective, likely due to the arrangement of an incrementing target set size and within-session prompt delay fading.

  • These features of SIR might also result in fewer errors than static set sizes and across-session prompt delay fading procedures.

  • Presentation modalities, such as word lists rather than flashcards, might improve the feasibility of effective instructional methods by reducing material management.

Keywords: Instruction, Reading, Sight words, Strategic incremental rehearsal


The most recent National Assessment of Educational Progress (NAEP) report found that 37% of fourth-grade students in the United States are unable to read at a basic level (NAEP, 2022). Targeted interventions provided in the early elementary grades have been shown to produce significant gains in a range of reading-related performances including reading fluency and high-frequency word identification (Harn et al., 2008; Vaughn et al., 2009). Methods to teach word identification typically include explicit instruction (e.g., prompting and reinforcement) and repeated presentation of the target word (Carnine et al., 2017). Additional research has evaluated other features of instructional arrangements that may improve the efficiency of sight word instruction, such as the methods by which unknown targets are introduced (e.g., Daly et al., 2000). As one example, Strategic Incremental Rehearsal (SIR) includes an incrementing set size such that only a subset of targets is initially presented during instruction. Once the participant accurately responds to the subset of targets, additional targets are added to the instructional set until all targets in a set are taught simultaneously. SIR may confer several benefits relative to other interspersal strategies (see Kupzyk et al., 2011; Lozy & Donaldson, 2019), although some features of SIR might be modified to improve acceptability.

Lozy and Donaldson (2019) suggested that SIR may be challenging to implement and result in more integrity errors relative to other instructional methods. The authors noted that “the experimenter must be vigilant in data collection to keep track of how many times each stimulus has been presented and the sequence of correct and incorrect responses to accurately complete the folding-in procedure” (p. 72). One feature of SIR that may impact its feasibility is the use of flashcards to present targets. Kennedy et al. (2023) found that instruction using flashcards resulted in longer session durations and was less preferred by instructors than alternative stimulus arrangements. Interestingly, SIR and related procedures (e.g., Incremental Rehearsal; Nist & Joseph, 2008) are commonly described as flashcard methods (e.g., Kupzyk et al., 2011), although their efficacy is likely not constrained to this presentation modality.

The current study sought to extend previous research by evaluating the effects of SIR when presented in a word list (SIR-WL) rather than flashcards. Word lists present targets equally spaced on a standard 21.6 cm by 27.9 cm sheet of paper. Given that all targets can be arranged on a single page, word lists might be more easily managed than flashcards. Moreover, careful arrangement of the word list would allow instructors to make instructional decisions based on performance in a single row, which may be more feasible in a classroom setting. The current study further extended previous research on SIR by including assessments of maintenance and generalization to whiteboard (i.e., handwritten) and tablet modalities. These generalization assessments were used to ensure that the participant’s performance would extend to common classroom conditions.

Method

Participants and Setting

Four elementary-aged students attending an urban elementary school with 99% of students eligible for free or reduced-price lunch participated. All participants identified as Black and were enrolled in kindergarten (Tiana), first (Trinity and Zaire), or second grade (Jayce). Each participant was referred by their teacher as being in the greatest need of reading support in their class, which was confirmed using curriculum-based measures (CBMs). Instructional sessions occurred in a 1:1 arrangement in a large classroom in the participants’ elementary school.

Materials

The experimenter identified target sight words from the Fry sight word list (Fry, 1980). For pre-assessment probes, the sight words appeared in black ink on a standard white sheet of paper (21.6 cm by 27.9 cm) in size-14 Futura font, a sans serif font with a single-story lowercase letter a. The experimenter prepared a single word list for each set of targets to be used during SIR-WL instruction (described below). The experimenter also prepared daily probe word lists, which showed all words from both target sets (i.e., eight words total) across two rows and in a randomized order.

Dependent Variable

Observers recorded unprompted correct responses, prompted correct responses, and incorrect responses. We defined unprompted correct responses as the participant emitting a predefined response within 5 s of the word being presented (e.g., saying “cat” when presented with the textual stimulus cat). A target was defined as having been presented once the instructor or participant placed their finger under the word. We defined prompted correct responses as the participant emitting the target response within 5 s of the experimenter-presented echoic model. Data collectors also recorded an incorrect response if the participant emitted any response that did not correspond to the target or did not respond within 5 s. Finally, observers recorded each presentation of a target stimulus as an exposure. We calculated percentage of unprompted correct responses by dividing the number of unprompted correct responses by the total number of exposures and multiplying by 100. We also calculated the total number of exposures to mastery separately for each target by summing the total number of exposures in each session for that target required to produce responding at the mastery criterion.
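The two calculations described above can be illustrated with a minimal Python sketch. The one-letter response codes, the `(target, code)` record format, and the example words are assumptions made for illustration only; the study does not describe a data format or list its targets.

```python
from collections import Counter

# Hypothetical one-letter codes for the three response categories.
UNPROMPTED, PROMPTED, INCORRECT = "U", "P", "I"


def percent_unprompted_correct(exposures):
    """Unprompted correct responses / total exposures * 100."""
    if not exposures:
        raise ValueError("no exposures recorded")
    counts = Counter(code for _, code in exposures)
    return 100 * counts[UNPROMPTED] / len(exposures)


def exposures_per_target(sessions):
    """Sum the exposures to each target across the sessions run up to mastery."""
    totals = Counter()
    for session in sessions:
        totals.update(target for target, _ in session)
    return dict(totals)


# Hypothetical sessions: each exposure is a (target, code) pair.
session_1 = [("cat", PROMPTED), ("dog", PROMPTED), ("dog", UNPROMPTED), ("cat", INCORRECT)]
session_2 = [("cat", PROMPTED), ("dog", PROMPTED), ("dog", UNPROMPTED), ("cat", UNPROMPTED)]

print(percent_unprompted_correct(session_1))         # 25.0
print(exposures_per_target([session_1, session_2]))  # {'cat': 4, 'dog': 4}
```

Note that prompted exposures count toward the denominator, so a session that opens with a prompt for every target cannot reach 100% unprompted correct responding under this formula.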

Design

We evaluated the effects of SIR-WL on acquisition of sight words using a concurrent multiple baseline design across two target sets. Following mastery of the target set (top panel), the experimenter conducted generalization probes to whiteboard and tablet-based modalities. The experimenter then introduced SIR-WL to any remaining target sets (i.e., bottom panel).

Procedure

Pre-Assessment

The experimenter presented sight words from the Fry sight word list (Fry, 1980). Responding produced no differential consequences. The experimenter compiled a list of unknown words following the pre-assessment and created two four-word target sets. We selected targets that included a similar number of letters and were similar in difficulty (i.e., word length) to words previously read correctly during CBM assessments.

Daily Probe

Each day began with a daily probe before any sessions of SIR-WL. The experimenter presented the daily probe word list, which showed the targets for both sets (i.e., eight targets total) in a random order across two rows of four words. Correct responses produced praise. Incorrect responses produced no differential consequences. We defined mastery as the participant emitting 100% unprompted correct responses to the targets in the instructional set during a single daily probe. An initial baseline phase consisted of at least two daily probes. In the bottom panel, this initial baseline phase continued until responding met the mastery criterion in the top panel to allow for verification of baseline rates.

Strategic Incremental Rehearsal – Word List

The experimenter placed the word list on the table covered by a blank sheet of paper in front of the participant, started a timer, and immediately moved the blank sheet to reveal the first row of two targets. The rest of the word sheet remained covered by the blank sheet of paper. Consistent with previous research on SIR, every instructional session included an immediate (i.e., 0-s) echoic prompt the first time a target appeared during the session (e.g., “this word is [target]”). The experimenter required that the participant echo the prompt before responding to any remaining targets in the row. After the participant emitted prompted correct responses to the first two targets, the experimenter moved the blank sheet to reveal the next row. The second row included the same two target words in reverse order (see Fig. 1, row 2) and the participant was given an independent opportunity to respond to the first target. If the participant responded correctly, the experimenter delivered praise and allowed the participant to independently respond to the remaining target in the row. If the participant responded incorrectly, the experimenter presented an echoic prompt. If an incorrect response was emitted to any of the targets in a row, the experimenter re-presented those same targets by moving the blank sheet to one of the other rows that included only those same targets (e.g., row 1 or 2). Once the participant responded correctly to all targets in a row, a new target was introduced. The new target always appeared in the first position of the target block and the experimenter immediately prompted the correct response. Instruction continued with the new target and previously introduced words (i.e., the two targets from the first block) until the participant responded correctly to all three targets in a row. The experimenter then presented the last target word by moving the blank page to reveal the last target and immediately presented an echoic prompt. The SIR-WL session ended once the participant emitted unprompted correct responses to all four targets in a row or 3 min had elapsed, whichever occurred first.
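The incrementing set size described above can be sketched as a short Python simulation. This is a simplified illustration, not the authors' protocol: the function name `run_sir_wl`, the `is_correct` callback standing in for the learner, the `max_rows` cap standing in for the 3-min limit, and the example words are all assumptions. The sketch also does not model the reversed word order across rows or the echoic prompt that follows an error.

```python
def run_sir_wl(targets, is_correct, max_rows=50):
    """Simulate one SIR-WL session with an incrementing set size.

    targets: ordered list of words; the first two form the initial block.
    is_correct: callable(word) -> bool, a stand-in for the learner's
        unprompted response (hypothetical).
    max_rows: hypothetical stand-in for the 3-min session limit.
    Returns (log, finished), where log is a list of (word, outcome) exposures.
    """
    active = list(targets[:2])
    remaining = list(targets[2:])
    # The first appearance of a target is paired with an immediate (0-s) prompt.
    log = [(word, "prompted") for word in active]
    for _ in range(max_rows):
        outcomes = []
        for word in active:
            correct = is_correct(word)
            log.append((word, "unprompted" if correct else "incorrect"))
            outcomes.append(correct)
        if not all(outcomes):
            continue  # re-present a row containing the same targets
        if not remaining:
            return log, True  # every target correct in one row: session ends
        # Add one new target in the first position and prompt it immediately.
        new_word = remaining.pop(0)
        active.insert(0, new_word)
        log.append((new_word, "prompted"))
    return log, False


words = ["the", "and", "was", "for"]  # hypothetical targets
log, finished = run_sir_wl(words, lambda word: True)
print(finished, len(log))  # True 13
```

With a learner who never errs, the session comprises four prompted exposures and rows of two, three, and four unprompted exposures (13 in total). Because earlier targets appear in every subsequent row, they accumulate more exposures than targets introduced later.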

Fig. 1.


Example strategic incremental rehearsal – word list template

Generalization

The experimenter conducted generalization probes to whiteboard and tablet-based modalities during the initial baseline phase (excluding Trinity’s set 1 due to experimenter error) and following mastery. Across both modalities, correct responses resulted in praise and incorrect responses produced no differential consequences. During whiteboard generalization probes, the experimenter used a dry-erase marker to write the four targets on a 22.9 cm by 31.8 cm whiteboard in a list. During tablet-based probes, the experimenter presented each target individually in a PowerPoint© slideshow on an iPad© in size-80 Futura font.

Maintenance

The experimenter conducted maintenance probes seven to twelve days following mastery using daily probe procedures. The timing of maintenance probes depended upon participant absences and school breaks.

Remedial Instruction

If a participant did not emit 100% unprompted correct responses during the maintenance probe, the experimenter re-introduced SIR-WL instruction for all targets in the set. Instruction continued until responding again met the mastery criterion. The experimenter conducted an additional maintenance probe at least seven days after the mastery criterion was met.

Interobserver Agreement and Procedural Fidelity

Two independent observers were present during at least 46.7% (M = 73.1%) of daily probes and 22.7% (M = 48.9%) of SIR-WL sessions across participants. Each presentation of a target word was recorded as an exposure. Thus, the number of exposures during SIR-WL sessions varied based on the learner’s performance. Exposure-by-exposure (i.e., trial-by-trial) interobserver agreement (IOA) was calculated by dividing the total number of exposures with an agreement by the total number of exposures and multiplying by 100. Mean IOA was 99.4% (range, 75% to 100%) during daily probes and 99.2% (range, 93.9% to 100%) during SIR-WL sessions. Only a single session fell below 80% IOA during daily probes, which occurred when the observers disagreed about a single response to a target in the second set. Thus, three of four exposures were scored as an agreement.
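The exposure-by-exposure IOA formula can be expressed as a small Python function. The function name and the one-letter response codes are assumptions for illustration; the example reproduces the case reported above in which observers agreed on three of four exposures.

```python
def exposure_by_exposure_ioa(primary, secondary):
    """Exposures with agreement / total exposures * 100."""
    if len(primary) != len(secondary):
        raise ValueError("observers must score the same number of exposures")
    if not primary:
        raise ValueError("no exposures recorded")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    return 100 * agreements / len(primary)


# Hypothetical codes: "U" = unprompted correct, "I" = incorrect.
primary = ["U", "U", "I", "U"]
secondary = ["U", "U", "U", "U"]  # disagreement on the third exposure only
print(exposure_by_exposure_ioa(primary, secondary))  # 75.0
```

With only four exposures in a probe of one target set, a single disagreement lowers IOA to 75%, which is why one discrepant response was enough to bring a session below 80%.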

Data collectors recorded exposure-by-exposure procedural fidelity during at least 46.7% (M = 74.9%) of daily probes and 47.1% (M = 74.3%) of SIR-WL sessions across participants. During daily probes, data collectors scored an exposure as being implemented with fidelity if the experimenter presented praise or no differential consequences following unprompted correct or incorrect responses, respectively. During SIR-WL, procedural fidelity was scored if the experimenter adhered to all procedural components for that exposure. The experimenter must have (a) presented an unprompted or prompted opportunity, (b) delivered praise, (c) introduced a new target after 100% unprompted correct responses were emitted in one row, and (d) terminated instruction after 100% unprompted correct responses were emitted in one row of the final block or 3 min elapsed. Procedural fidelity during daily probes was 100%. Mean procedural fidelity during SIR-WL was 99.7% (range, 88.2% to 100%).

Results

The findings for each participant are shown in Figs. 2, 3 and 4. No participant responded correctly to any target during the baseline phase. The findings for Trinity are shown in Fig. 2. Trinity emitted greater than 70% unprompted correct responses during all SIR-WL sessions. Her responding met the mastery criterion in four daily probes for target sets 1, 2, and 3 and a single daily probe for set 4. Generalization to the whiteboard and tablet modalities was observed for all but one target (set 2). Trinity’s performance did not maintain at 100% for targets in sets 1 and 2, so remedial SIR-WL instruction was introduced. Trinity’s responding met the mastery criterion and maintained for set 1. Set 2 required additional remedial instruction before responding maintained. Trinity emitted 100% unprompted correct responses during all generalization and maintenance probes for sets 3 and 4.

Fig. 2.


Trinity’s responding on daily, generalization, and maintenance probes. Note. SIR-WL = Strategic incremental rehearsal – word list. Closed circles represent daily probes; gray bars represent SIR-WL instructional sessions; black bars represent generalization and maintenance probes

Fig. 3.


Jayce’s responding on daily, generalization, and maintenance probes. Note. SIR-WL = Strategic incremental rehearsal – word list. Closed circles represent daily probes; gray bars represent SIR-WL instructional sessions; black bars represent generalization and maintenance probes. *Maintenance was not assessed due to a school break

Fig. 4.


Zaire’s (top panels) and Tiana’s (bottom panels) responding on daily, generalization, and maintenance probes. Note. SIR-WL = Strategic incremental rehearsal – word list. Closed circles represent daily probes; gray bars represent SIR-WL instructional sessions; black bars represent generalization and maintenance probes

The findings for Jayce are shown in Fig. 3. Jayce initially exhibited moderate levels of accurate responding during SIR-WL instruction for all target sets, which improved across successive instructional sessions. His responding met the mastery criterion in an average of 5.3 daily probes (range, 4 to 6 probes). Jayce responded correctly to 93.8% of all targets during generalization probes, yet his accurate responding decreased during all maintenance probes. SIR-WL instruction was reintroduced for sets 1 and 2 and responding maintained during all subsequent assessments. Remedial instruction for set 3 and a maintenance assessment for set 4 were not possible due to a school break.

The top two panels of Fig. 4 show Zaire’s evaluation. Zaire emitted moderate to high levels of unprompted correct responses during all SIR-WL instructional sessions. His responding met the mastery criterion in three and two daily probes for sets 1 and 2, respectively. For set 1, generalization was observed for three targets to both modalities, although his responding did not maintain. Three additional remedial SIR-WL sessions were required to produce responding at the mastery criterion, which resulted in 100% correct responding on the final maintenance probe. For set 2, 100% correct responding was observed during all generalization and maintenance probes.

The bottom two panels of Fig. 4 show the findings for Tiana. Across both sets, Tiana emitted moderate to high levels of correct responding during instruction. For set 1, responding met the mastery criterion in two daily probes and generalized to tablet and whiteboard modalities. Responding on the subsequent maintenance assessment fell to 50% correct. Remedial instruction was not possible due to a school break. For set 2, responding met the mastery criterion in five daily probes. Generalization was observed for two and three targets across the whiteboard and tablet modalities, respectively. Nevertheless, accurate responding to all targets was observed during the maintenance probe.

Figure 5 represents the total exposures to individual targets as a function of performance on the first maintenance probe. Jayce’s fourth target set is not represented as maintenance data were not available. A mean of 43.0 exposures were required to produce responding at the mastery criterion. Participants maintained 72.7% of all targets and we found no clear relation between the total number of exposures and response maintenance. Over half (56.3%) of targets that maintained required fewer exposures than average and 58.3% of targets that did not maintain required more than the average number of exposures. Response maintenance was observed for all targets that required fewer than 30 exposures to reach the mastery criterion.
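The summary statistics in this paragraph can be computed from per-target records, as in the following Python sketch. The record format, function name, and example values are hypothetical; they are not the study's data and are included only to show how each percentage is formed.

```python
from statistics import mean


def _percent(flags):
    """Percentage of True values, or None when there is nothing to count."""
    flags = list(flags)
    return 100 * sum(flags) / len(flags) if flags else None


def summarize_maintenance(records):
    """records: list of (exposures_to_mastery, maintained) pairs, one per target."""
    avg = mean(exposures for exposures, _ in records)
    kept = [exposures for exposures, maintained in records if maintained]
    lost = [exposures for exposures, maintained in records if not maintained]
    return {
        "mean_exposures": avg,
        "percent_maintained": _percent(maintained for _, maintained in records),
        # Share of maintained targets that needed fewer exposures than average.
        "percent_maintained_below_mean": _percent(e < avg for e in kept),
        # Share of non-maintained targets that needed more exposures than average.
        "percent_not_maintained_above_mean": _percent(e > avg for e in lost),
    }


# Hypothetical records, not the study's data.
records = [(20, True), (30, True), (60, True), (50, False)]
summary = summarize_maintenance(records)
print(summary["mean_exposures"], summary["percent_maintained"])  # 40 75.0
```

The last two percentages use different denominators (maintained targets and non-maintained targets, respectively), so they are not expected to sum to 100.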

Fig. 5.


Total exposures during SIR-WL and maintenance of individual targets. Note. Response maintenance during the first maintenance probe is shown for individual targets

Discussion

The current study evaluated the effects of SIR-WL on sight-word acquisition, generalization, and maintenance for four children exhibiting reading deficits. SIR-WL produced mastery-level responding in all cases. Participants’ responding also generalized to tablet and whiteboard modalities for the majority of targets and responding maintained for 72.7% of all targets. Following maintenance failures, remedial instruction using SIR-WL produced durable responding, although remediation was not possible for Tiana (set 1) or Jayce (set 3) due to a school break.

Previous research has emphasized the use of flashcards in SIR (Kupzyk et al., 2011; Lozy & Donaldson, 2019). In the current study, we extended previous research by arranging SIR in a word list. Our findings replicated those of earlier work on SIR, suggesting that its efficacy is not constrained to any particular stimulus modality. In an attempt to maximize the feasibility and ecological validity of SIR in applied settings, a single word list was prepared for each instructional set. Using a single word list rather than sets of flashcards may address some of the challenges with SIR instruction reported by Lozy and Donaldson (2019). Nevertheless, the inflexibility of word lists, at least relative to flashcards, might also present unique challenges.

In the current study, a single word list was used during SIR instruction for each target set. This arrangement may be socially valid as a single worksheet is needed for each student; however, it also resulted in unequal exposures to individual targets within a set. Although we had originally suspected this to be a flaw of the word list arrangement, differential exposure to the targets within a set is characteristic of previous research on SIR, even when flashcards were used. Specifically, a possible advantage of flashcards is that they could be shuffled each day before instruction, which would allow the experimenters to approximate the number of exposures to each target within an instructional set. Nevertheless, previous research on SIR commonly assigned targets to a position in a set (e.g., Kupzyk et al., 2011; Lozy & Donaldson, 2019). As a result, a common feature of SIR is that targets that are introduced first will receive a disproportionate number of exposures relative to those assigned to later positions in the set.

Despite differences in total exposures to individual targets, our findings suggest that a greater number of exposures was not predictive of maintenance (Fig. 5). Indeed, upon review of target-specific exposures, we found that the number of exposures required to produce correct responding during daily probes frequently varied. As one example, Tiana responded accurately to the third and fourth targets from set 2 following fewer than 14 exposures; however, she required 72 exposures to the first target before she ever emitted a correct response to that target during the daily probe. We observed similar findings across several target sets and participants, suggesting that the total number of exposures to individual targets may not be predictive of correct responding during daily probes.

The current study extended prior research on SIR by including an assessment of generalization to whiteboard and tablet modalities. Importantly, we included the tablet modality as an analogue to flashcards as each target appeared alone. During generalization probes, the participants made a total of seven errors out of 80 trials, five of which occurred when targets were handwritten on the whiteboard. It should be noted that we did not maintain strict guidelines regarding the preparation of the stimuli during these probes, so variation in the experimenter’s handwriting may have affected the participants’ performances. Nevertheless, errors during whiteboard generalization probes also predicted maintenance failures for the same targets in three of the five instances. This finding suggests that generalization assessments might provide a glimpse into the strength of stimulus control exerted by the target stimuli. In practice, these data could be used to make more immediate instructional modifications as an alternative to waiting for the findings of subsequent maintenance probes.

The current study extended past research on SIR by presenting targets in a word list format and including measures of generalization and maintenance. Additional research might attempt to further optimize SIR or related instructional strategies to promote the greatest gains for individuals exhibiting reading deficits. Researchers may not need to look far as several topics receiving recent attention in the behavior analytic literature may be uniquely implicated in the efficacy of these instructional procedures (e.g., target set size, mastery criteria, and interspersal procedures).

Funding

This study was not funded.

Data Availability

The data sets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Declarations

Ethical Approval

This study was approved by an Institutional Review Board and all procedures involving human participants were conducted in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed Consent

All participants provided consent before participating in the current study.

Conflict of interest

The authors declare that they have no conflict of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Carnine, D. W., Silbert, J., Kame’enui, E., Slocum, T. A., & Travers, P. A. (2017). Direct instruction reading (6th ed.). Pearson.
  2. Daly, E. J., Hintze, J. M., & Hamler, K. R. (2000). Improving practice by taking steps toward technological improvements in academic intervention in the new millennium. Psychology in the Schools, 37(1), 61–72. 10.1002/(SICI)1520-6807(200001)37:1<61::AID-PITS7>3.0.CO;2-#
  3. Fry, E. (1980). The new instant word list. The Reading Teacher, 34(3), 284–289.
  4. Harn, B. A., Linan-Thompson, S., & Roberts, G. (2008). Intensifying instruction: Does additional instructional time make a difference for the most at-risk first graders. Journal of Learning Disabilities, 41(2), 115–125. 10.1177/0022219407313586
  5. Kennedy, T., Cariveau, T., Grelck, K., Brown, A., Platt, D. F., & Ellington, P. (2023). An analysis of instructor- and tablet-presented conditional discriminations: Fidelity and rapidity. Advance online publication.
  6. Kupzyk, S., Daly, E. J., III, & Andersen, M. N. (2011). A comparison of two flash-card methods for improving sight-word reading. Journal of Applied Behavior Analysis, 44(4), 781–792. 10.1901/jaba.2011.44-781
  7. Lozy, E. D., & Donaldson, J. M. (2019). A comparison of traditional drill and strategic incremental rehearsal flashcard methods to teach letter-sound correspondence. Behavioral Development, 24(2), 58–73. 10.1037/bdb0000089
  8. National Assessment of Educational Progress [NAEP] (2022). NAEP report card: Reading. National Center for Education Statistics. https://www.nationsreportcard.gov/reading/nation/achievement/?grade=4
  9. Nist, L., & Joseph, L. M. (2008). Effectiveness and efficiency of flashcard drill instructional methods on urban first-graders’ word recognition, acquisition, maintenance, and generalization. School Psychology Review, 37(3), 294–308. 10.1080/02796015.2008.12087877
  10. Vaughn, S., Wanzek, J., Scammacca, N., Linan-Thompson, S., & Woodruff, A. L. (2009). Response to early reading intervention examining higher and lower responders. Exceptional Children, 75(2), 165–183. 10.1177/001440290907500203

Articles from Behavior Analysis in Practice are provided here courtesy of Association for Behavior Analysis International
