Journal of Applied Behavior Analysis. 2012 Winter;45(4):667–684. doi: 10.1901/jaba.2012.45-667

DISCRIMINATION ACQUISITION IN CHILDREN WITH DEVELOPMENTAL DISABILITIES UNDER IMMEDIATE AND DELAYED REINFORCEMENT

Jolene R Sy 1, Timothy R Vollmer 1
PMCID: PMC3545492  PMID: 23322925

Abstract

We evaluated the discrimination acquisition of individuals with developmental disabilities under immediate and delayed reinforcement. In Experiment 1, discrimination between two alternatives was examined when reinforcement was immediate or delayed by 20 s, 30 s, or 40 s. In Experiment 2, discrimination between 2 alternatives was compared across an immediate reinforcement condition and a delayed reinforcement condition in which subjects could respond during the delay. In Experiment 3, discrimination among 4 alternatives was compared across immediate and delayed reinforcement. In Experiment 4, discrimination between 2 alternatives was examined when reinforcement was immediate and 0-s or 30-s intertrial intervals (ITI) were programmed. For most subjects, discrimination acquisition occurred under immediate reinforcement. However, for some subjects, introducing delays slowed or prevented discrimination acquisition under some conditions. Results from Experiment 4 suggest that longer ITIs cannot account for the lack of discrimination under delayed reinforcement.

Key words: delayed reinforcement, translational research, discrimination training, acquisition, intertrial interval, developmental disabilities


Several notable textbooks recommend that reinforcement be provided immediately to establish or maintain responding (e.g., Skinner, 1953). However, practical constraints and educational considerations may limit the extent to which reinforcement can be provided immediately. For example, a teacher may be unable or unwilling to provide reinforcement immediately to one student while working with another. In addition, with some activity reinforcers (e.g., extra time on the playground), interrupting an educational task (e.g., circle time) to deliver the reinforcer may be contraindicated. In these cases, reinforcement necessarily must be delayed.

Previous studies with rats and pigeons have demonstrated that delayed reinforcement can lead to response acquisition and maintenance under controlled laboratory conditions (e.g., Lattal & Gleeson, 1990; Schaal & Branch, 1988). Researchers also have examined the effects of delayed reinforcement on discriminated responding by examining response rates on both reinforcement and no-consequence operandi (e.g., Escobar & Bruner, 2007; Keely, Feola, & Lattal, 2007; Sutphin, Byrne, & Poling, 1998). For example, Sutphin et al. (1998) examined the effects of unsignaled (i.e., no stimulus change associated with the onset of the delay) 8-s, 16-s, 32-s, and 64-s delays to reinforcement on the response distribution of eight rats. All subjects were more likely to respond on the reinforcement lever than on the no-consequence lever when the delay to reinforcement was small (i.e., 0 or 8 s). Discriminated responding was disrupted as delays to reinforcement increased.

Relatively few studies have examined the extent to which humans learn simple or conditional discriminations when reinforcement is delayed. Establishment of both simple and conditional discriminations is essential for building more sophisticated learning, especially in those who are diagnosed with developmental disabilities. Simple discriminations are formed when one response is reinforced while another response is not reinforced (e.g., de Rose, McIlvane, Dube, Galpin, & Stoddard, 1988). For example, natural reinforcement available for selecting yellow bananas, but not green or brown bananas, could lead to the formation of a simple discrimination: consistent selection of yellow bananas. Conditional discriminations are formed when each response is reinforced only in the presence of a particular discriminative stimulus (e.g., Saunders & Spradlin, 1989). For example, the response “banana” will produce reinforcement following the question “what is it?” if a picture of a banana is present, but not if a picture of an apple is present.

Hockman and Lipsitt (1961) provided an example of the effects of delayed reinforcement on acquisition during educational activities. They examined conditional discrimination acquisition with 60 fourth-grade children when (a) two or three alternatives were targeted and (b) reinforcers were delivered 0 s, 10 s, or 30 s following a correct response. They found that 0-s, 10-s, or 30-s delays to reinforcement did not prevent discrimination acquisition when two alternatives were targeted. However, in general, subjects did not acquire discriminations when three alternatives were targeted and reinforcers were delayed by 10 s or 30 s.

However, Hockman and Lipsitt (1961) reported several limitations that restrict the potential generality of the data. First, comparisons were made across groups of 10 children, so it is unclear how variations in task difficulty and delay to reinforcement affected the behavior of individual subjects. Second, the authors reported only the mean number of correct responses in each condition, which is problematic because the number of trials administered was not held constant across subjects and means may conceal variability across subjects. Third, subjects were exposed to a limited number of trials. Given extended exposure, discrimination acquisition may have occurred when three alternatives were targeted and reinforcement was delayed. Fourth, subjects were provided with a rule about the contingencies (i.e., they were informed about what consequences would be delivered and the relation between those consequences and their behavior). It is unclear whether delayed reinforcement could produce discrimination acquisition when rules are not provided.

To date, only one study has examined the extent to which delayed reinforcement will produce discrimination acquisition in humans when rules about the contingencies are not provided (Grindle & Remington, 2002). Thus, additional research in this area is warranted. The purpose of the present series of experiments was to extend research on the effects of delayed reinforcement on discrimination acquisition. We examined (a) a wider range of delay values, (b) whether the opportunity to respond during the delay affected conditional discrimination acquisition, (c) the effects of both delayed reinforcement and immediate reinforcement on conditional discrimination acquisition when a relatively large number of alternatives were targeted, and sessions continued until discrimination acquisition occurred or evaluation termination criteria were reached, and (d) the effects of delayed reinforcement when rules about the contingencies were not provided and signals that reinforcement was forthcoming were not presented after a response. In addition, we evaluated the extent to which any detrimental effects of delayed reinforcement were due to longer intertrial intervals (ITI), which are typically associated with delayed reinforcement conditions.

GENERAL METHOD

Subjects and Setting

Subjects were preschool- and school-aged children who had been diagnosed with developmental disabilities. All subjects had limited verbal repertoires, as measured by their below-average performance on the expressive and receptive communication components of the Battelle Developmental Inventory, administered by a school psychologist. Jorma was a 4-year-old boy who had been diagnosed with an autism spectrum disorder (ASD). Alice was a 5-year-old girl who had been diagnosed as developmentally delayed. Victor was a 4-year-old boy who had been diagnosed with ASD. Vlade was a 5-year-old boy who had been diagnosed with ASD. Jade was a 3-year-old girl who had been diagnosed as developmentally delayed. Morgan was a 7-year-old girl who had been diagnosed with Down syndrome. Amira was an 8-year-old girl who had been diagnosed with an intellectual disability. Mara was a 4-year-old girl who had been diagnosed as developmentally delayed. Alice, Amira, Jade, Jorma, Mara, Morgan, Victor, and Vlade served as subjects in Experiment 1. Jade, Mara, and Victor served as subjects in Experiment 2. Jade and Jorma served as subjects in Experiment 3. Amira, Jade, and Morgan served as subjects in Experiment 4. All sessions were conducted in a common area of the subjects' school.

Target Responses and Data Collection

The target responses differed for each subject, but generally consisted of arbitrarily selected academic conditional discrimination tasks. Subjects were required to receptively identify stimuli (e.g., numbers, letters, words) from a field of stimuli following a verbal instruction by placing the correct stimulus in the experimenter's hand. Table 1 displays the tasks during each condition of Experiments 1, 2, and 3, and Table 2 displays the tasks during each condition of Experiment 4.

Table 1.

Alternatives Targeted in Experiments 1, 2, and 3

| Subject | Immediate reinforcement: Set 1 | Delayed reinforcement: Set 1 | Immediate reinforcement: Set 2 | Delayed reinforcement: Set 2 |
| --- | --- | --- | --- | --- |
| Experiment 1: 0 s vs. 20 s | | | | |
| Jorma | s; d | h; f | cat; cow | dad; dog |
| Alice | four triangles; six triangles | five triangles; seven triangles | 2; 11 | 3; 10 |
| Mara | E; J | G; M | | |
| Morgan | 7; 12 | 8; 11 | | |
| Victor | Louisiana; Michigan | Oklahoma; Texas | | |
| Vlade | California; Oklahoma | Idaho; Louisiana | | |
| Jade | 1; 3 | 2; 4 | | |
| Experiment 1: 0 s vs. 30 s | | | | |
| Jorma | had; hot | bat; big | | |
| Vlade | Missouri; Pennsylvania | New Hampshire; North Dakota | | |
| Victor | Idaho; Maine | New Hampshire; North Dakota | | |
| Jade | 5; 8 | 6; 7 | | |
| Experiment 1: 0 s vs. 40 s | | | | |
| Jorma | ear; eye | mad; mom | melt; moon | dime; door |
| Victor | Montana; New York | Mississippi; Pennsylvania | | |
| Amira | up; us | of; on | | |
| Experiment 2 | | | | |
| Mara | blue; orange | gray; yellow | | |
| Jade | mad; mop | new; not | rock; roll | pull; push |
| Victor | Minnesota; Washington | Vermont; Mississippi | Iowa; Ohio | Massachusetts; Nevada |
| Experiment 3 | | | | |
| Jade | click; crack; cloud; crown | shape; smell; shock; smoke | | |
| Jack | lake; look; land; loud | fade; fish; fast; from | ball; blue; bike; book | call; cool; clap; corn |

Table 2.

Alternatives Targeted in Experiment 4

| Subject | 0-s ITI: Set 1 | 30-s ITI: Set 1 | 0-s ITI: Set 2 | 30-s ITI: Set 2 | 0-s ITI: Set 3 | 30-s ITI: Set 3 |
| --- | --- | --- | --- | --- | --- | --- |
| Jade | 13; 14 | 11; 12 | Louisiana; Maine | Idaho; Texas | bee; big | fit; fun |
| Amira | is; to | at; me | | | | |
| Morgan | 9; 15 | 10; 13 | | | | |

Independent observers used handheld computers to collect data on correct responses, incorrect responses, and the absence of responding. Responses made within 5 s of the instruction (e.g., “Give me Missouri”) that corresponded with the instruction were counted as correct. Responses that did not correspond with the instruction were counted as incorrect. Responses were counted as incorrect if the subject selected more than one alternative (i.e., placed more than one stimulus in the experimenter's hand). A “no response” was scored if the subject did not place stimuli in the experimenter's hand within 5 s of an instruction. Percentage correct was calculated by dividing the number of correct responses by the total number of trials during a session. Discrimination acquisition was defined as percentage correct equal to or greater than 85% across three consecutive sessions. Interobserver agreement data were collected during 29% of all sessions during Experiment 1, 21% of all sessions during Experiment 2, 54% of all sessions during Experiment 3, and 32% of all sessions during Experiment 4. Interobserver agreement was calculated by dividing each session into 10-s intervals, calculating agreement for each interval, and averaging across intervals. Across subjects, agreement averaged 95% (range, 78% to 100%) for Experiment 1, 97% (range, 80% to 100%) for Experiment 2, 98% (range, 89% to 100%) for Experiment 3, and 97% (range, 80% to 100%) for Experiment 4.
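The dependent measures described above can be illustrated with a minimal Python sketch. Percentage correct and the acquisition criterion (85% or greater across three consecutive sessions) follow the definitions in the text. The per-interval agreement formula is not specified in the text, so the sketch assumes proportional agreement (smaller count divided by larger count, with two matching counts scored as full agreement). All function names are hypothetical.

```python
from typing import Optional, Sequence

ACQUISITION_CRITERION = 85.0  # percent, as defined in the text
CONSECUTIVE_SESSIONS = 3      # as defined in the text


def percentage_correct(correct: int, total_trials: int) -> float:
    """Correct responses divided by all trials in the session, as a percentage."""
    if total_trials <= 0:
        raise ValueError("a session must contain at least one trial")
    return 100.0 * correct / total_trials


def sessions_to_acquisition(session_percentages: Sequence[float]) -> Optional[int]:
    """Return the 1-based session at which the criterion is first met, else None."""
    run = 0
    for index, pct in enumerate(session_percentages, start=1):
        run = run + 1 if pct >= ACQUISITION_CRITERION else 0
        if run == CONSECUTIVE_SESSIONS:
            return index
    return None


def interval_agreement(observer_a: Sequence[int], observer_b: Sequence[int]) -> float:
    """Mean agreement across 10-s intervals, as a percentage.

    ASSUMPTION: agreement within an interval is smaller count / larger count,
    and identical counts (including 0 and 0) score 1.0.
    """
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("observers need the same, nonzero number of intervals")
    scores = []
    for a, b in zip(observer_a, observer_b):
        scores.append(1.0 if a == b else min(a, b) / max(a, b))
    return 100.0 * sum(scores) / len(scores)
```

For example, a session with 17 correct responses in 20 trials yields 85%, which counts toward the three-session criterion.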

General Procedure

Preference assessment

Prior to each discrimination training session, subjects were allowed to choose from an array of edible items and toys. The item chosen first was subsequently used as a reinforcer during the remainder of the session, unless the subject indicated preference for a new item during the session by vocally requesting the item or physically moving towards the item.

Baseline assessment

Either two (Experiments 1, 2, and 4) or four (Experiment 3) unique instructions were targeted during each condition. The number of instructions targeted was equal to the number of stimuli (9 cm by 5 cm or 18 cm by 13 cm cards) presented. In most cases, stimuli were placed in the array on the table side by side approximately 15 cm apart and 15 cm in front of the subject. An exception was made midway through Experiment 1 for Mara. The experimenter held the two stimuli side by side approximately 15 cm apart and 0.9 m in front of Mara so that she had to stand up and walk to the stimuli to make a selection. This was done to increase the probability that she would attend to the stimuli prior to making a selection. During each session, the experimenter delivered a prespecified number of trials of each instruction. Across trials, the experimenter randomly alternated both the position of the stimuli in the horizontal array and the order of the instructions. Thus, on any given trial, the experimenter might deliver the instruction “Give me 50” when the number 50 was on the right side of the field, “Give me 50” when the number 50 was on the left side of the field, “Give me 70” when the number 70 was on the right side of the field, or “Give me 70” when the number 70 was on the left side of the field. During baseline, the experimenter did not provide consequences for correct or incorrect responses. The experimenter simply repositioned all the cards in the horizontal array before delivering the next instruction. Instructions associated with low percentages of correct responses during at least three consecutive sessions were targeted during conditional discrimination training.
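The trial arrangement described above (a fixed number of trials per instruction, with instruction order and stimulus positions varied randomly) could be generated along the following lines. This is only a sketch: the text does not state how randomization was carried out, so the use of a shuffle and the `build_session` name are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Sequence


def build_session(
    instructions: Sequence[str],
    trials_per_instruction: int,
    rng: Optional[random.Random] = None,
) -> List[Dict[str, object]]:
    """Return a randomized trial list for one session.

    Each instruction appears exactly `trials_per_instruction` times, in random
    order, and the left-to-right layout of the cards is reshuffled every trial.
    ASSUMPTION: simple shuffling stands in for the unspecified randomization.
    """
    rng = rng or random.Random()
    order = [inst for inst in instructions for _ in range(trials_per_instruction)]
    rng.shuffle(order)

    trials = []
    for target in order:
        layout = list(instructions)
        rng.shuffle(layout)  # position of each card in the horizontal array
        trials.append({"instruction": target, "layout": layout})
    return trials


# Two alternatives, 10 trials each, as in most two-alternative sessions.
session = build_session(["50", "70"], trials_per_instruction=10, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a session reproducible, which can help when auditing the trial order after the fact.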

Conditional discrimination training

Immediate and delayed reinforcement conditions (Experiments 1, 2, and 3) or 0-s and 30-s ITI conditions (Experiment 4) were alternated across sessions until discrimination acquisition occurred in one condition. Once this occurred, the experimenter continued to conduct sessions in the nonmastered condition while periodically conducting maintenance sessions in the mastered condition.

Sessions in the immediate reinforcement condition were identical to baseline, except that a correct response resulted in vocal praise and the immediate delivery of a small edible item or 30 s of access to a preferred toy on a fixed-ratio (FR) 1 schedule. Incorrect responses did not produce reinforcement or vocal feedback. Instead, the experimenter repositioned the stimuli in front of the subject and started the next trial.

During the delayed reinforcement condition, sessions were identical to those in baseline except a chain (Experiments 1 and 3) or tandem (Experiment 2) FR 1 fixed-time (FT) schedule of reinforcement was in effect for correct responses. During the chain schedule, a stimulus change (the removal of stimuli) was uniquely associated with the second link of the chain schedule. During the tandem schedule, a stimulus change did not occur at the onset of the delay (stimuli remained in front of the subject). Thus, each component of the schedule was associated with the same stimuli. During both chain and tandem schedules, reinforcers were delivered at the end of the delay (the FT component) if the response that initiated the delay had been correct. Incorrect responses did not produce reinforcement or vocal feedback. Instead, after the delay interval elapsed, the experimenter started a new trial.
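The difference between the chain and tandem arrangements can be summarized as a sequence of trial events. The sketch below is illustrative only: the event labels and the `delayed_trial_events` function are hypothetical, and reinforcer consumption time is omitted for simplicity.

```python
from typing import List, Tuple


def delayed_trial_events(
    initiating_correct: bool, delay_s: float, schedule: str
) -> List[Tuple[float, str]]:
    """List (time in seconds, event) pairs for one delayed-reinforcement trial.

    chain:  stimuli are removed when the delay begins (a stimulus change).
    tandem: stimuli stay in front of the subject (no stimulus change).
    In both cases the reinforcer arrives at the end of the delay only if the
    response that initiated the delay was correct.
    """
    if schedule not in ("chain", "tandem"):
        raise ValueError("schedule must be 'chain' or 'tandem'")

    events = [(0.0, "response")]
    if schedule == "chain":
        events.append((0.0, "stimuli removed"))
    else:
        events.append((0.0, "stimuli remain available"))

    if initiating_correct:
        events.append((float(delay_s), "reinforcer delivered"))
    # The delay elapses after incorrect responses too, before the next trial.
    events.append((float(delay_s), "next trial"))
    return events
```

Note that an incorrect response still starts the delay; it simply ends without a reinforcer.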

Sessions continued until (a) discrimination acquisition occurred in both conditions, (b) discrimination acquisition occurred in one condition and at least two additional sessions had been conducted in the other condition with mean percentage correct in that condition ≤50% across the last three sessions and no increasing trend, (c) subjects refused to respond in the presence of stimuli or left the school before the evaluation could be completed (Jorma, Experiment 3), or (d) discrimination acquisition was not evident in either condition for at least 15 sessions. In some cases, sessions continued past these criteria to evaluate the stability of the findings.
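The termination criteria above can be expressed as a decision rule. The following sketch makes several assumptions that the text does not specify: "no increasing trend" is operationalized as the last three sessions not being strictly increasing (trend was likely judged visually), criterion (d) is read as at least 15 sessions in each condition, and the number of additional sessions in the nonmastered condition is supplied by the caller. The function names are hypothetical.

```python
from typing import Optional, Sequence


def acquired(pcts: Sequence[float], criterion: float = 85.0, run_length: int = 3) -> bool:
    """True if `run_length` consecutive sessions are at or above the criterion."""
    run = 0
    for pct in pcts:
        run = run + 1 if pct >= criterion else 0
        if run >= run_length:
            return True
    return False


def stop_reason(
    cond_a: Sequence[float],
    cond_b: Sequence[float],
    additional_sessions: int = 0,
    withdrew: bool = False,
) -> Optional[str]:
    """Return 'a', 'b', 'c', or 'd' for the criterion met, or None to continue."""
    if withdrew:  # (c) refusal to respond or leaving the school
        return "c"
    a_met, b_met = acquired(cond_a), acquired(cond_b)
    if a_met and b_met:  # (a) acquisition in both conditions
        return "a"
    if a_met != b_met:  # (b) acquisition in one condition only
        other = list(cond_b if a_met else cond_a)
        last3 = other[-3:]
        if additional_sessions >= 2 and len(last3) == 3:
            low_mean = sum(last3) / 3 <= 50.0
            # ASSUMPTION: "increasing trend" means strictly increasing values.
            increasing = last3[0] < last3[1] < last3[2]
            if low_mean and not increasing:
                return "b"
        return None
    # (d) ASSUMPTION: 15 sessions in each condition without acquisition.
    if min(len(cond_a), len(cond_b)) >= 15:
        return "d"
    return None
```

The text notes that sessions sometimes continued past these criteria, so a rule like this would inform, not dictate, the decision to stop.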

EXPERIMENT 1

The purpose of Experiment 1 was to compare the occurrence and rate of conditional discrimination acquisition when two alternatives were targeted and reinforcers were delivered immediately or following unsignaled delays to reinforcement.

Procedure

A combined nonconcurrent multiple baseline multielement design was used to evaluate whether the delivery of a preferred item immediately on an FR 1 schedule or following an unsignaled delay on a chain FR 1 FT schedule would produce discrimination acquisition. Two alternatives were targeted during each condition. The duration of the unsignaled delay varied across evaluations and ranged from 20 s to 40 s. For all subjects except Amira, a 20-s delay was initially programmed in the delayed reinforcement condition. If this condition produced discrimination acquisition, a 30-s delay to reinforcement was evaluated. If the 30-s delay to reinforcement produced discrimination acquisition, a 40-s delay to reinforcement was evaluated. For Amira, a 40-s delay was initially programmed in the delayed reinforcement condition. Based on the performance of the other subjects, we hypothesized that discrimination acquisition was likely under the 20-s and 30-s delays. Immediate reinforcement sessions (one stimulus pair) were alternated with delayed reinforcement sessions (a different stimulus pair).
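The delay progression used for most subjects (20 s, then 30 s, then 40 s, each step contingent on acquisition at the previous delay) amounts to a simple rule, sketched below. The `next_delay` helper is hypothetical, and the sketch does not cover Amira, for whom a 40-s delay was programmed from the start.

```python
from typing import Optional

DELAYS_S = (20, 30, 40)  # delay values evaluated in Experiment 1


def next_delay(current_delay_s: int, acquisition_occurred: bool) -> Optional[int]:
    """Return the next delay to evaluate, or None if the progression ends.

    The progression ends when acquisition did not occur at the current delay
    or when the longest delay (40 s) has already been evaluated.
    """
    if not acquisition_occurred:
        return None
    position = DELAYS_S.index(current_delay_s)
    if position + 1 < len(DELAYS_S):
        return DELAYS_S[position + 1]
    return None
```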

For all subjects except Jade, a total of 20 trials (10 trials of each alternative) were administered per session. Within-session analyses suggested that Jade's performance decreased significantly during the second 10 trials. Thus, for Jade, with the exception of the 20-s delayed reinforcement versus immediate reinforcement comparison and the first 24 sessions of the 30-s delayed reinforcement versus immediate reinforcement comparison, 10 trials (five trials of each alternative) were administered per session.

Results and Discussion

Results from Experiment 1 are depicted in Figures 1, 2, and 3. All subjects' percentage correct was at or below levels expected by chance during baseline. Figure 1 displays the comparison between immediate reinforcement and 20-s delays to reinforcement. During the first reinforcement condition, immediate reinforcement led to discrimination acquisition for six of the seven subjects, and 20-s delays to reinforcement led to discrimination acquisition for four of the seven subjects. Results were replicated with new sets of stimuli for Jorma and Alice. For the four subjects who acquired conditional discriminations under delayed reinforcement, rates of discrimination acquisition were comparable across immediate reinforcement and 20-s delayed reinforcement conditions. For Mara, increasing the effort required to make a response resulted in further separation between the delayed reinforcement and immediate reinforcement conditions, but did not lead to discrimination acquisition (by our definition) in either condition.

Figure 1. Percentage correct during 20-s delayed reinforcement (filled circles) and immediate reinforcement (open circles) conditions in Experiment 1. The dotted horizontal line indicates the 85% acquisition criterion.

Figure 2. Percentage correct during 30-s delayed reinforcement (filled circles) and immediate reinforcement (open circles) conditions during Experiment 1. The dotted horizontal line indicates the 85% acquisition criterion.

Figure 3. Percentage correct during 40-s delayed reinforcement (filled circles) and immediate reinforcement (open circles) conditions during Experiment 1. The dotted horizontal line indicates the 85% acquisition criterion.

Figure 2 displays the comparison between immediate reinforcement and 30-s delays to reinforcement for the four subjects who had acquired discriminations under a 20-s delay. For all of these subjects, discrimination acquisition occurred under both conditions. However, for Jade, discrimination acquisition was slower under the 30-s delayed reinforcement condition than under the immediate reinforcement condition. Jade acquired discriminations under 30-s delays to reinforcement only when the total number of trials was reduced from 20 to 10.

Figure 3 displays the comparison between immediate reinforcement and 40-s delays to reinforcement for the three subjects who had acquired discriminations under a 30-s delay. For all subjects, both conditions led to discrimination acquisition. We replicated the effect with new stimuli for Jorma. During this evaluation, rates of discrimination acquisition were comparable across conditions for Jorma and Victor. However, Amira's rate of discrimination acquisition was slower under the 40-s delayed reinforcement condition than under the immediate reinforcement condition.

In summary, these findings extend the results of applied literature that examines discriminated responding and delayed reinforcement (e.g., Grindle & Remington, 2002) by demonstrating that delayed reinforcement can produce discrimination acquisition for at least some subjects (a) in the absence of a prompting procedure, (b) when a signal is not programmed between a response and a reinforcer, (c) when delays of up to 40 s are programmed, and (d) when more than one response alternative is targeted in each condition.

Discrimination acquisition may have occurred rapidly in both conditions for some participants because only two alternatives were targeted per condition. Specifically, reinforcement following a correct response to one instruction may have influenced selection of the correct response to the subsequent instruction. For example, the selection of the picture of California following the instruction “Give me California” produced reinforcement. On the next trial, the instruction may have been “Give me Idaho.” The correct selection of the picture of Idaho may have been influenced by the previous reinforcement of the other stimulus under a different instruction.

Another limitation of Experiment 1 was that subjects were unable to make either correct or incorrect responses during the delay. As noted by Lattal (2010), “the delay is not a period of behavioral emptiness through which time passes … responding invariably occurs during the delay” (p. 135). Responses that occur during the delay may increase the efficacy of delayed reinforcement by making obtained delays (e.g., the delay between the last correct response and reinforcer delivery) shorter than programmed delays (i.e., the delay between the correct response that initiated the delay and reinforcer delivery). Conversely, this arrangement may instead decrease the efficacy of delayed reinforcement if incorrect responses occur closer to reinforcer delivery than correct responses.

EXPERIMENT 2

The purpose of Experiment 2 was to examine whether the procedures used in Experiment 1 would produce discrimination acquisition when subjects were free to make either correct or incorrect responses during the delay. A trial-based procedure (that required subjects to sit at a table with limited access to stimuli) was programmed to limit the number and type of responses that might occur during the delay (i.e., subjects could not engage in responses that required them to leave their chairs or access other stimuli). Experimenters recorded only correct or incorrect responses (previously defined) during the delay.

Procedure

A combined nonconcurrent multiple baseline multielement design was used to evaluate whether the delivery of a preferred item immediately or after an unsignaled delay would produce discrimination acquisition when response alternatives were available during the delay. Immediate reinforcement sessions were alternated with delayed reinforcement sessions. The immediate reinforcement condition was identical to the one in Experiment 1. During the delayed reinforcement condition, a tandem FR 1 FT schedule of reinforcement was in effect. Each component of the tandem schedule was associated with the same stimuli, and stimuli were repositioned in front of the subjects following the response that initiated the delay and any responses that occurred during the delay. Thus, subjects could make either correct or incorrect responses during the delay. After either response, the experimenter replaced the stimuli in front of the subject. The duration of the unsignaled delay was 20 s (Mara), 30 s (Jade), or 40 s (Victor). For Jade and Victor, these delays had previously produced discrimination acquisition when response alternatives were not available during the delay.

For Mara and Victor, a total of 20 trials (10 trials of each alternative) were administered per session. For Jade, a total of 10 trials (five trials of each alternative) were administered per session. The experimenter randomly alternated between instructions during each session.

Results and Discussion

Figure 4 displays percentage correct across consecutive sessions during both delayed reinforcement and immediate reinforcement conditions of Experiment 2. For Mara, the combination of a delay to reinforcement and the availability of stimuli during the delay interfered with discrimination acquisition. However, the availability of stimuli during the delay did not interfere with discrimination acquisition for Jade or Victor. Jade's rate of discrimination acquisition was initially slower under the delayed reinforcement condition than under the immediate reinforcement condition. However, when a replication was conducted with new sets of stimuli, rates of discrimination acquisition were comparable across conditions. For Victor, discriminations were acquired rapidly across both conditions.

Figure 4. Percentage correct during delayed reinforcement (filled circles) and immediate reinforcement (open circles) conditions of Experiment 2. The dotted horizontal line indicates the 85% acquisition criterion.

Differences between subjects may have been due to the fact that Victor did not respond during the delay, Jade generally made correct responses during the delay, and Mara made both correct and incorrect responses during the delay. For Victor, reinforcers were never preceded by incorrect responses during the delay. On average, 81% of Jade's responses during the delay were correct (range, 53% to 100%), and only 17% (range, 0% to 71%) of her reinforcers were preceded by an incorrect response that occurred during the delay (i.e., when the response that initiated the delay was correct, but the last response to occur during the delay was incorrect). Overall, there was very little opportunity for adventitious reinforcement of incorrect responses. Finally, for Jade, in cases in which the last response that occurred prior to the end of the delay was correct, 53% (range, 0% to 100%) of these responses were followed by reinforcement. The other 47% of these responses were not followed by reinforcement because the response that initiated the delay was incorrect. Thus, adventitious reinforcement of correct responses that occurred close in time to reinforcement may have strengthened discriminated responding. On the other hand, on average, only 46% (range, 32% to 65%) of Mara's responses during the delay were correct. In addition, for Mara, 53% (range, 29% to 70%) of reinforcers were preceded by an incorrect response. In cases in which the last response that occurred prior to the end of the delay was correct, 56% (range, 14% to 92%) of these responses were followed by reinforcement.
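The delay-period measures reported above can be computed from trial-level records, as in the following sketch. It summarizes a single set of trials, whereas the values in the text are averages with ranges across sessions. The `Trial` record and `delay_summary` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Trial:
    initiating_correct: bool  # was the response that started the delay correct?
    delay_responses: List[bool] = field(default_factory=list)  # True = correct


def _pct(numerator: int, denominator: int) -> Optional[float]:
    """Percentage, or None when there is nothing to divide by."""
    return None if denominator == 0 else 100.0 * numerator / denominator


def delay_summary(trials: Sequence[Trial]) -> Dict[str, Optional[float]]:
    """Summarize responding during the delay for a set of trials."""
    all_delay = [r for t in trials for r in t.delay_responses]
    # A reinforcer is delivered whenever the initiating response was correct.
    reinforced = [t for t in trials if t.initiating_correct]
    preceded_by_incorrect = [
        t for t in reinforced if t.delay_responses and not t.delay_responses[-1]
    ]
    last_correct = [t for t in trials if t.delay_responses and t.delay_responses[-1]]
    last_correct_reinforced = [t for t in last_correct if t.initiating_correct]
    return {
        "delay_responses_correct": _pct(sum(all_delay), len(all_delay)),
        "reinforcers_preceded_by_incorrect": _pct(
            len(preceded_by_incorrect), len(reinforced)
        ),
        "last_correct_followed_by_reinforcer": _pct(
            len(last_correct_reinforced), len(last_correct)
        ),
    }
```

A subject who never responds during the delay, as described for Victor, would produce `None` for the first and third measures and 0% for the second.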

These results suggest that responding during the delay may prevent discrimination acquisition if subjects make incorrect responses during the delay and if reinforcers are often preceded by incorrect responses. Collectively, however, the results of Experiments 1 and 2 suggest that behavior is sensitive to delayed reinforcement when two response alternatives are presented in the context of conditional discriminations. Less is known about whether discrimination acquisition will occur when reinforcement is delayed and more than two alternatives are targeted at one time. Although Hockman and Lipsitt (1961) found that the number of alternatives targeted at one time covaried with discrimination acquisition under delayed reinforcement, the study had several limitations (noted previously). To date, no study has examined the combined effects of delayed reinforcement and an increase in the number of alternatives targeted on discrimination acquisition by school-aged children with developmental disabilities.

EXPERIMENT 3

The purpose of Experiment 3 was to evaluate whether discrimination acquisition would still occur under immediate and unsignaled delayed reinforcement when the number of alternatives was increased from two to four.

Procedure

The effects of immediate and delayed reinforcement on discrimination acquisition were evaluated in a combined nonconcurrent multiple baseline and multielement design. Procedures were identical to those used in Experiment 1, except that different sets of four (instead of two) alternatives were targeted during each condition. The experimenter randomly selected one of the four instructions during each trial, until five trials of each instruction were delivered, for a total of 20 trials per session. The duration of the unsignaled delay varied across subjects and evaluations, and ranged from 10 s to 40 s. The initial duration of the reinforcement delay equaled the highest delay duration previously associated with discrimination acquisition when two alternatives were targeted. Immediate reinforcement sessions were alternated with delayed reinforcement sessions.

Results and Discussion

Figure 5 displays results from Experiment 3. For Jade, a subject for whom 0-s, 20-s, and 30-s delays to reinforcement had previously produced discrimination acquisition when two alternatives were targeted, both 0-s and 30-s delays to reinforcement produced discrimination acquisition when four alternatives were targeted. The rate of discrimination acquisition was comparable across conditions. For Jorma, a subject for whom 0-s, 20-s, 30-s, and 40-s delays to reinforcement had previously produced discrimination acquisition when two alternatives were targeted, 40-s delays to reinforcement did not produce discrimination acquisition when four alternatives were targeted. However, immediate reinforcement produced discrimination acquisition when four alternatives were targeted. We replicated the evaluation using different stimuli and a shorter (i.e., 10-s) delay to reinforcement (third and fourth conditions). Again, immediate reinforcement produced discrimination acquisition when four alternatives were targeted. Although percentage correct was on an upward trend during the 10-s delay to reinforcement condition, discrimination acquisition was not conclusively demonstrated because Jorma left the school before the evaluation could be completed.

Figure 5. Percentage correct during delayed reinforcement (filled circles) and immediate reinforcement (open circles) conditions of Experiment 3. The dotted horizontal line indicates the 85% acquisition criterion.

In summary, these results suggest that, for some subjects, the efficacy of delayed reinforcement may depend on the number of alternatives targeted at one time: discrimination acquisition appears to be negatively related to both the number of alternatives targeted and the duration of the delay.

One question that remains unanswered after Experiments 1, 2, and 3 is which mechanism produced less robust acquisition during delayed reinforcement. At least two hypotheses are plausible. First, the delay to reinforcement and subsequent breakdown of the response–reinforcer relation may have slowed acquisition. Second, delays to reinforcement naturally produced longer ITIs (and correspondingly lower rates of reinforcement). Findings on the effects of ITI duration are mixed, with some research finding that longer ITIs increase the probability of acquisition relative to short ITIs (e.g., Holt & Shafer, 1973), some research finding that short ITIs increase the probability of acquisition relative to long ITIs (e.g., Koegel, Dunlap, & Dyer, 1980), and other research finding no effect of ITI length (e.g., Valcante, Roberson, Reid, & Wolking, 1989).

EXPERIMENT 4

The purpose of Experiment 4 was to compare the effects of different ITI durations on the occurrence and rate of discrimination acquisition.

Procedure

A combined nonconcurrent multiple baseline, multielement, and reversal (for Jade) design was used to evaluate discrimination acquisition during 0-s or 30-s ITI conditions. In both conditions, reinforcers were delivered immediately on an FR 1 schedule contingent on correct responses. During the 0-s ITI condition, the experimenter presented a new trial immediately after the subject consumed the reinforcer or immediately after an incorrect response. During the 30-s ITI condition, the experimenter presented a new trial 30 s after the subject consumed the reinforcer or 30 s after an incorrect response. Two response alternatives were targeted during each condition, and response alternatives differed across conditions. Either five (Jade) or 10 (all other subjects) trials of each of two instructions were delivered per session. The experimenter randomly alternated between the instructions.
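
As an illustration only, the session structure described above could be generated as in the sketch below. It assumes that "randomly alternated" can be approximated by shuffling a list that contains each instruction an equal number of times; the study does not specify the randomization method. The function name and the seed are hypothetical, and the two instructions are borrowed from one stimulus set mentioned in this article.

```python
import random


def build_session(instructions, trials_per_instruction, rng):
    """Return a randomly ordered trial list with each instruction repeated equally.

    Shuffling is an assumed stand-in for the experimenter's random alternation.
    """
    trials = [inst for inst in instructions for _ in range(trials_per_instruction)]
    rng.shuffle(trials)
    return trials


rng = random.Random(0)  # fixed seed so the example is repeatable
session = build_session(("Give me 13", "Give me 14"), 10, rng)
iti_s = 30  # 0 in the short-ITI condition, 30 in the long-ITI condition

for number, instruction in enumerate(session, start=1):
    print(f"Trial {number}: {instruction} (then wait {iti_s} s)")
```

Setting `trials_per_instruction` to 5 would give the shorter sessions described for Jade.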

Results and Discussion

Figure 6 depicts the percentage of correct responses made during both 0-s ITI and 30-s ITI conditions. Discrimination acquisition occurred in the 30-s ITI condition for all three subjects. For Jade (top panel), discrimination acquisition initially failed under the 0-s ITI condition even though discrimination acquisition had previously occurred under immediate reinforcement and 0-s ITIs (see bottom panels of Figures 1 and 2, middle panel of Figure 4, and top panel of Figure 5). We hypothesized that the auditory similarity of the two instructions “Give me 13” and “Give me 14” may have confounded the intended independent variable manipulation (i.e., ITI duration). Thus, we compared 0-s and 30-s ITI conditions with two new sets of stimuli. During the second and third evaluations, discrimination acquisition occurred in both 0-s and 30-s ITI conditions. For Amira, discrimination acquisition occurred slightly more rapidly in the 30-s ITI condition than in the 0-s ITI condition. For Morgan, the rate of discrimination acquisition was comparable across conditions. In addition, discrimination acquisition occurred rapidly in each condition. These results are contrary to those of previous studies (e.g., Bilodeau & Bilodeau, 1958) and suggest that longer ITIs may not always have detrimental effects on behavior. In addition, these results suggest that cases in which delayed reinforcement fails to produce discrimination acquisition likely cannot be accounted for by longer ITIs (or decreased reinforcement rate) typically associated with the delayed reinforcement condition.

Figure 6. Percentage correct during conditions in which either short 0-s (open circles) or long 30-s (filled circles) ITIs were programmed during Experiment 4. The dotted horizontal line indicates the 85% acquisition criterion.

GENERAL DISCUSSION

The purpose of Experiments 1 through 3 was to examine the effects of immediate and delayed reinforcement on discrimination acquisition of children with developmental disabilities and to evaluate some variables that may affect the efficacy of delayed reinforcement. A variety of factors can vary during teaching arrangements set up in complex environments, and these factors may preclude discrimination acquisition. In the current study, we examined the effects of a few of them (i.e., delays to reinforcement, adventitious reinforcement of incorrect responses, number of alternatives targeted, time between trials) and found that reinforcement is remarkably robust in the face of challenges. However, programming a delay to reinforcement either slowed down or prevented discrimination acquisition in some cases.

Nevertheless, for some subjects, discriminations were acquired even when reinforcers were delivered following 20-s to 40-s unsignaled delays (Experiment 1). Delayed reinforcement also produced discrimination acquisition when responses could occur during the delay (Experiment 2). In that experiment, responses either did not occur during the delay or occurred during the delay but were correct, so few incorrect responses were closely followed by reinforcement. The efficacy of delayed reinforcement decreased for one of two subjects when the number of alternatives targeted per condition increased (Experiment 3). However, it may be possible to increase the number of targeted alternatives by decreasing the delay to reinforcement. Finally, ITI duration (which was longer in delayed reinforcement conditions than in immediate reinforcement conditions in Experiments 1, 2, and 3) did not affect the occurrence or rate of discrimination acquisition (Experiment 4).

These experiments extend results of basic research to an arrangement commonly used to teach new skills to children with developmental disabilities. Although several researchers have examined the effects of delayed reinforcement on discriminated responding of nonhumans (e.g., Sutphin et al., 1998), relatively few studies have examined the effects of delayed reinforcement on discriminated responding of humans. Those who have (e.g., Grindle & Remington, 2002; Hockman & Lipsitt, 1961) did not conduct extended evaluations of the effects of unsignaled delays on conditional discrimination acquisition. The current study was designed to fill a gap in the research literature on delayed reinforcement and discrimination acquisition by individuals with developmental disabilities. We attempted to carefully control the number and type of responses that could occur. Further, we recorded behavior during the delay to reinforcement and examined the interactions among response difficulty (i.e., number of alternatives targeted), delayed reinforcement, and discrimination acquisition.

As noted by Pipkin, Vollmer, and Sloman (2010), translational models, such as the one used in this study, are beneficial because they allow researchers (a) to examine the effects of specific variables while holding all other variables constant, which is not always possible in the natural environment; (b) to identify which variables have the greatest effect on behavior so that those variables can then be examined in application; and (c) to rapidly examine the effects of several variables. In addition, the translational model used in the current study allowed us to examine the effects of delayed reinforcement on a relatively benign response. Results from this study can be used to inform discrimination training procedures that are designed for free-operant situations in which individuals are free to engage in both potentially dangerous problem behavior and appropriate behavior.

In addition, results have several potential implications for practitioners. First, findings suggest that, in many cases, practitioners can complete other necessary tasks (e.g., recording data, blocking dangerous behavior) prior to delivering a reinforcer as long as the reinforcer is delivered within a reasonable amount of time and a contingency is arranged between a correct response and reinforcement. The option to record a response immediately prior to delivering reinforcement, especially for new observers, may increase the accuracy of data collection. Second, findings suggest that it may be necessary to limit the number of alternatives targeted at one time when practical limitations or educational considerations necessitate delayed reinforcement. These situations could arise when another child is interacting with the preferred toy used as a reinforcer, when it is necessary that the learner transition to another area (e.g., playground) to access a reinforcer, or when providing a reinforcer immediately would interrupt a functional task.

An unexpected finding was Mara's results in Experiment 1. Mara was the only subject who failed to acquire discriminations under both delayed reinforcement and immediate reinforcement. It is possible that her indiscriminate responding (i.e., responding not under control of discriminative stimuli) was maintained by a variable-ratio (VR) 2 schedule of reinforcement. That is, by selecting any alternative, Mara had a 50% chance of selecting the correct response and earning reinforcement. McIlvane and Dube (2003) noted that such VR schedules may be sufficient to maintain undifferentiated responding during discrimination training.
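
The arithmetic behind this account can be sketched as follows, assuming (hypothetically) that each alternative is equally likely to be correct on a given trial, that responding is unrelated to the instruction, and that every correct response is reinforced. The function name is illustrative and is not taken from the study.

```python
from fractions import Fraction


def chance_schedule(n_alternatives):
    """Return (probability of a correct response by chance,
    mean number of responses per reinforcer) under the stated assumptions."""
    if n_alternatives < 1:
        raise ValueError("n_alternatives must be at least 1")
    p_correct = Fraction(1, n_alternatives)
    mean_responses_per_reinforcer = 1 / p_correct
    return p_correct, mean_responses_per_reinforcer


for n in (2, 4):
    p, ratio = chance_schedule(n)
    print(f"{n} alternatives: p(correct) = {p}, mean ratio = {ratio}")
```

With two alternatives, chance responding yields a correct response on half of the trials and therefore an average of two responses per reinforcer, consistent with the VR 2 description. Under the same assumptions, four alternatives would yield an average of four responses per reinforcer.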

Results of this study should be considered in light of several limitations. First, as mentioned previously, delays to reinforcement may rarely occur during discrimination training because behavior analysts who conduct discrimination training typically have extensive training and work with clients on a one-to-one basis (thus, limiting variables that may make it difficult to deliver reinforcement immediately, e.g., the presence of other children). Therefore, the extent to which the findings translate meaningfully to application remains to be seen.

A second limitation of this study was that the procedures used in Experiments 1 through 3 (alternation between immediate reinforcement and delayed reinforcement conditions) could have produced carryover effects. In other words, discrimination acquisition in the immediate reinforcement condition could have made discrimination acquisition in the delayed reinforcement condition more likely. However, the unique alternatives targeted in each condition (see Table 1) could have improved discrimination between conditions and helped protect against carryover effects. Nevertheless, future research should evaluate the effects of delayed reinforcement on conditional discrimination acquisition when delayed reinforcement is not alternated with immediate reinforcement. We have conducted such an evaluation with two subjects (data available from the first author) and obtained similar results when the delayed reinforcement condition was evaluated by itself or alternated with an immediate reinforcement condition: The discrimination was not acquired under either condition.

A third limitation was that the sequential evaluation of longer delays may have increased tolerance for delayed reinforcement and increased the probability of discrimination acquisition under 30-s and 40-s delays. However, it should be noted that, for Amira, discrimination acquisition occurred when a 40-s delay to reinforcement was programmed, despite having no previous experimental contact with shorter delays. This suggests that long delays may produce discrimination acquisition, even in the absence of incremental increases in delays. Nevertheless, in future research, experimenters should consider either starting with a long delay to reinforcement and then gradually decreasing the delay until acquisition occurs or evaluating only a single delay with each subject.

A fourth limitation was that it was unclear whether slight changes in experimenter behavior after a response (e.g., subtle muscle movements around the eyes or mouth indicating approval or dissatisfaction, eye contact or lack thereof, slight body orientation towards or away from the reinforcer) could have functioned as a signal that reinforcement was forthcoming. Although the experimenter made every effort to maintain consistent expressions and body posture following both correct and incorrect responses, it is unclear whether subtle changes in experimenter behavior differentially occurred after correct and incorrect responses and, if these changes did occur, whether they influenced subjects' behavior. Future research should evaluate experimenter behavior under this type of arrangement to determine whether experimenters are more likely to unintentionally react positively after correct responses, even if reinforcement is not immediately delivered. Alternatively, reinforcer delivery could be automated.

Future studies should continue to evaluate the effects of intervening responses after it has been determined that delayed reinforcement does not prevent discrimination acquisition. Only two of three subjects in Experiment 2 had previously acquired discriminations under delayed reinforcement. Thus, it is unclear whether Mara failed to acquire the discrimination because reinforcement was delayed or because she could make incorrect responses prior to reinforcer delivery. Although we evaluated how the opportunity to respond during the delay affected discrimination acquisition with two subjects who had previously acquired discriminations under delayed reinforcement (Victor and Jade), it should be noted that Victor never responded during the delay. Thus, it is still unclear whether the occurrence of target and nontarget responses during the delay would prevent discrimination acquisition. Future studies should use procedures similar to those programmed in Experiment 2 with subjects who have previously acquired discriminations under delayed reinforcement and who engage in target and nontarget responses during the delay.

In conclusion, results from this study provide evidence that delayed reinforcement can produce discrimination acquisition in individuals with developmental disabilities under some conditions. Methods from the current study may be used to design future studies that examine the effects of delayed reinforcement on discriminated responding during free-operant arrangements that involve problematic and appropriate behavior. Specifically, results suggest that a small number of alternatives should be targeted at a time (e.g., an individual should be taught only one or two appropriate responses at a time during differential reinforcement procedures that involve a delay to reinforcement) and that careful monitoring of responding during the delay may be necessary to determine whether temporal pairings between nontarget responses (e.g., problem behavior) and reinforcement are likely to occur, and whether contingencies designed to prevent such temporal pairings are necessary to avoid adventitious reinforcement of nontarget responses.

Footnotes

Jolene Sy is now affiliated with Saint Louis University. Portions of this research were presented at the 37th annual meeting of the Association for Behavior Analysis International in Denver, Colorado.

Action Editor, Michael Kelley

REFERENCES

  1. Bilodeau E. A, Bilodeau I. M. Variation of temporal intervals among critical events in five studies of knowledge of results. Journal of Experimental Psychology. (1958);55:603–612. doi: 10.1037/h0043070.
  2. de Rose J. C, McIlvane W. J, Dube W. V, Galpin V. C, Stoddard L. T. Emergent simple discrimination established by indirect relation to differential consequences. Journal of the Experimental Analysis of Behavior. (1988);50:1–20. doi: 10.1901/jeab.1988.50-1.
  3. Escobar R, Bruner C. A. Response induction during the acquisition and maintenance of lever pressing with delayed reinforcement. Journal of the Experimental Analysis of Behavior. (2007);88:29–49. doi: 10.1901/jeab.2007.122-04.
  4. Grindle C. F, Remington B. Discrete-trial training for autistic children when reward is delayed: A comparison of conditioned cue value and response marking. Journal of Applied Behavior Analysis. (2002);35:187–190. doi: 10.1901/jaba.2002.35-187.
  5. Hockman C. H, Lipsitt L. P. Delay-of-reward gradients in discrimination learning with children for two levels of difficulty. Journal of Comparative Physiological Psychology. (1961);54:24–27. doi: 10.1037/h0039874.
  6. Holt G. L, Shafer J. N. Function of intertrial interval in matching-to-sample. Journal of the Experimental Analysis of Behavior. (1973);19:181–186. doi: 10.1901/jeab.1973.19-181.
  7. Keely J, Feola T, Lattal K. A. Contingency tracking during unsignaled delayed reinforcement. Journal of the Experimental Analysis of Behavior. (2007);88:229–247. doi: 10.1901/jeab.2007.06-05.
  8. Koegel R. L, Dunlap G, Dyer K. Intertrial interval duration and learning in autistic children. Journal of Applied Behavior Analysis. (1980);13:91–99. doi: 10.1901/jaba.1980.13-91.
  9. Lattal K. A. Delayed reinforcement in operant behavior. Journal of the Experimental Analysis of Behavior. (2010);93:129–139. doi: 10.1901/jeab.2010.93-129.
  10. Lattal K. A, Gleeson S. Response acquisition with delayed reinforcement. Journal of Experimental Psychology. (1990);16:27–39. doi: 10.1037//0097-7403.16.1.27.
  11. McIlvane W. J, Dube W. V. Stimulus control topography coherence theory: Foundations and extensions. The Behavior Analyst. (2003);26:195–213. doi: 10.1007/BF03392076.
  12. Pipkin C. S, Vollmer T. R, Sloman K. N. Effects of treatment integrity failures during differential reinforcement of alternative behavior: A translational model. Journal of Applied Behavior Analysis. (2010);43:47–70. doi: 10.1901/jaba.2010.43-47.
  13. Saunders K. J, Spradlin J. E. Conditional discrimination in mentally retarded adults: The effect of training the component simple discriminations. Journal of the Experimental Analysis of Behavior. (1989);52:1–12. doi: 10.1901/jeab.1989.52-1.
  14. Schaal D. W, Branch M. N. Responding of pigeons under variable-interval schedules of unsignaled, briefly signaled, and completely signaled delays to reinforcement. Journal of the Experimental Analysis of Behavior. (1988);50:33–54. doi: 10.1901/jeab.1988.50-33.
  15. Skinner B. F. Science and human behavior. New York, NY: The Free Press; (1953).
  16. Sutphin G, Byrne T, Poling A. Response acquisition with delayed reinforcement: A comparison of two-lever procedures. Journal of the Experimental Analysis of Behavior. (1998);69:17–28. doi: 10.1901/jeab.1998.69-17.
  17. Valcante G, Roberson W, Reid W. R, Wolking W. D. Effects of wait-time and intertrial interval duration on learning by children with multiple handicaps. Journal of Applied Behavior Analysis. (1989);22:43–55. doi: 10.1901/jaba.1989.22-43.

Articles from Journal of Applied Behavior Analysis are provided here courtesy of Society for the Experimental Analysis of Behavior
