Contingency Tracking During Unsignaled Delayed Reinforcement

Josue Keely; Tyler Feola; Kennon A Lattal

doi:10.1901/jeab.2007.06-05

2007 Sep;88(2):229–247. doi: 10.1901/jeab.2007.06-05

PMCID: PMC1986436 PMID: 17970417

Abstract

Three experiments were conducted with rats in which responses on one lever (labeled the functional lever) produced reinforcers after an unsignaled delay period that reset with each response during the delay. Responses on a second, nonfunctional, lever did not initiate delays, but, in the first and third experiments, such responses during the last 10 s of a delay did postpone food delivery another 10 s. In the first experiment, the location of the two levers was reversed several times. Responding generally was higher on the functional lever, though the magnitude of the difference diminished with successive reversals. In the second experiment, once a delay was initiated by a response on the functional lever, in different conditions responses on the nonfunctional lever either had no effect or postponed food delivery by 30 s. The latter contingency typically lowered response rates on the nonfunctional lever. In the first two experiments, the functional and nonfunctional levers were identical except for their location; in the third experiment, initially, a vertically mounted, pole-push lever defined the functional response and a horizontally mounted lever defined the nonfunctional response. Higher response rates occurred on the functional lever. These results taken together suggest that responding generally tracked the response–reinforcer contingency. The results further show how nonfunctional-operanda responses are controlled by a prior history of direct reinforcement of such responses, by the temporal delay between such responses and food delivery, and by simple generalization between the two operanda.

Keywords: unsignaled delay of reinforcement, contingency tracking, discrimination, lever press, pole-push response, rats

New responses are learned in the absence of temporal contiguity between them and the reinforcers they produce (Byrne, LeSage, & Poling, 1997; Critchfield & Lattal, 1993; Lattal & Gleeson, 1990; LeSage, Byrne, & Poling, 1996; Wilkenfield, Nickel, Blakely, & Poling, 1992; Williams & Lattal, 1999). These responses are easily developed with 30-s delays and have been reported with delays of reinforcement of up to 60 s (Avila & Bruner, 1995); however, a claim that the responses are controlled by the temporally extended response–reinforcer relation requires that other potential sources of control over the response be excluded. Evidence in support of the claim comes from the findings that responses are neither established nor maintained in the absence of reinforcement or when otherwise identical reinforcers are delivered independently of the responses (Gleeson & Lattal, 1987; Lattal & Gleeson, 1990).

A further test of the limits of the differential sensitivity of responding to response–reinforcer relations involves the use of two operanda. Responding on one of them, hereafter labeled the functional lever, produces food after an unsignaled delay period as noted above. Responses on the other, hereafter labeled the nonfunctional lever, are recorded, but without other programmed effect. The distribution of responses between the two operanda ideally would index the extent to which responding tracks, that is, is sensitive to, the response–reinforcer relation, or, as it often is described, the contingency (cf. Williams & Lattal, 1999). The present experiments examined such tracking to further assess the control of operant responding by delayed consequences.

The results of two previous experiments bear on contingency tracking. Wilkenfield et al. (1992) reported higher response rates of rats on a functional lever than on a nonfunctional one when each functional response initiated unsignaled delays of 8 s or less. When Wilkenfield et al. used 16- or 32-s resetting delays to reinforcement, however, responding on the nonfunctional lever was more frequent than on the functional lever. By contrast to the latter finding, Critchfield and Lattal (1993) reported little or no responding by rats on a nonfunctional lever during the acquisition of a spatially defined operant with delayed reinforcement. Specifically, each break of a photocell beam across the rear of the experimental chamber initiated an unsignaled 30-s resetting delay that terminated with a food pellet delivery.

Wilkenfield et al.'s (1992) findings are theoretically important because they raise questions about observations (e.g., Avila & Bruner, 1995; Gleeson & Lattal, 1987; Lattal & Gleeson, 1990; Richards, 1981), noted above, that response acquisition and maintenance in experimentally naive and otherwise untrained animals are possible with relatively long temporal gaps between the reinforcer and the response that produces it. Specifically, their data could be taken to suggest that the response–reinforcer relation is effective in controlling responding only when reinforcers occur within about 8 s of a response. A different interpretation of the Wilkenfield et al. findings, however, is that nonfunctional-lever responding is maintained by variables independently of, or interdependently with, the response-delayed reinforcer relation in effect on the functional lever. At least three such variables might contribute to high rates of nonfunctional-lever responding. First, in the Critchfield and Lattal (1993) experiment, topographically different responses served as the functional and nonfunctional responses, whereas Wilkenfield et al. used identical levers for both responses. Thus, nonfunctional-lever responding might occur as a result of simple generalization from one operandum to the other. Second, responses on the nonfunctional lever may occur in closer proximity to food delivery than those on the functional lever and, therefore, may be maintained by less-delayed, but unprogrammed, reinforcement. Wilkenfield et al., for example, did not include any contingency to ensure temporal separation between responses on the nonfunctional lever and delivery of food for responding on the functional lever.

A third variable, not addressed by Wilkenfield et al. (1992), that may affect the control of nonfunctional-operandum responding is the prior history of reinforcement correlated with such responding. Williams and Lattal (1999), for example, reinforced pigeons' functional-key pecks according to a tandem variable-interval (VI) 15-s differential-reinforcement-of-other-behavior (DRO) 10-s schedule of reinforcement such that an unsignaled transition from the VI component to the DRO component occurred on the first response after the VI interval had lapsed. Pecks on a second, nonfunctional, key had no consequence unless they occurred during a delay initiated by a functional-key peck, in which case the upcoming reinforcer was delayed 10 s. Williams and Lattal reversed the position of the functional response every one, two, or three sessions. Because of the frequent alternation of the functional and nonfunctional keys, responding on the latter persisted at relatively high rates, but the ratio of functional key pecks to total key pecks increased the longer the functional response was fixed at the same location.

As already discussed, contingency tracking bears on an understanding of the temporal relations involved in reinforcement. The present experiments therefore were conducted to examine, first, the development of the tracking of a response-delayed reinforcer dependency or its absence. Second, the effects of prior histories of reinforcement, possible unprogrammed reinforcement of nonfunctional-operandum responses, and the physical similarity between the functional and nonfunctional operanda were examined to further isolate sources of control over the functional responses.

Experiment 1

This experiment examined the extent to which operant responding tracks the contingencies as the location of the lever correlated with an unsignaled, resetting delay of reinforcement procedure was reversed successively. As noted above, Williams and Lattal (1999) found increased tracking accuracy with increasing numbers of sessions in which the functional lever remained constant in one location. In the present experiment, the locations of functional and nonfunctional operanda remained fixed over repeated sessions, with a 30-s resetting delay operative on the functional lever.

Method

Subjects

Each of 3 female Sprague-Dawley rats, approximately 120 days old at the start of the experiment, was maintained at 75% of its ad libitum body weight by postsession feeding. Each was housed individually and had unrestricted access to water except during experimental sessions.

Apparatus

Two Ralph Gerbrands Co. Model G7010 rat conditioning chambers were enclosed in sound-attenuating chambers. The chambers were 20.5 cm wide by 19.5 cm high by 23.5 cm long. The aluminum work panel contained two horizontal rat levers, 5 cm long by 1.2 cm wide, operated by a force of 0.25 N. The levers were located 8.0 cm from the floor of the chamber and 6.0 cm from the left and right edges of the panel. The panel also contained on its midline a 4.4 cm square food aperture into which 45-mg food pellets could be delivered by a Gerbrands Model G5100 pellet dispenser. The bottom edge of the food aperture was located 0.75 cm from the floor. General illumination was provided by a white houselight (3 W, 28-V DC) located on the horizontal midpoint of the work panel 11 cm from the floor. White noise and a ventilation fan on the chamber enclosure masked extraneous noise. An IBM-compatible microcomputer using a Med-PC® interface and software was located in an adjacent room, and it controlled all experimental events and recorded data.

Procedure

Magazine training began when each rat reached its target weight. A food pellet was delivered immediately after the rat was placed in the experimental chamber and the houselight turned on. Pellets were delivered thereafter approximately every 15 s after the previous one had been consumed. When the latency from pellet delivery to consumption decreased to less than 1 s, as assessed by direct observation of the rat, pellets then were delivered according to a variable-time (VT) 30-s schedule. The levers were present during this magazine training but responses on them were without effect. Magazine training ended, and the first experimental session began, when the latency from pellet delivery to consumption was 2 s or less for 10 consecutive pellet deliveries (also assessed by direct observation). Magazine training sessions lasted no more than 30 min per session and required one or two sessions.

During each session of the first condition, responding on the right lever was reinforced according to a tandem fixed-ratio (FR) 1 DRO 30-s schedule of reinforcement, that is, each response initiated a 30-s unsignaled, resetting delay-to-reinforcement interval. This procedure will be described hereafter as an unsignaled delay. The lever correlated with the unsignaled delay was designated the functional lever. Responding on a second lever, located on the left side of the work panel during the first condition, did not initiate delays leading to reinforcement. If, however, a response occurred on this nonfunctional lever during the last 10 s of a delay initiated by a functional-lever response, the upcoming pellet delivery was postponed by 10 s. Subsequent responses on the nonfunctional lever during the delay period restarted this 10-s interval.
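The contingencies just described can be sketched as a small event-driven simulation. This is only an illustrative reading of the procedure, not the Med-PC program used in the experiment. The function name `simulate_session`, the lever codes "F" and "N", and the treatment of edge cases (a press at the exact moment a pellet is due, and a functional press during a postponement resetting the delay to 30 s) are assumptions made for the example.

```python
from typing import List, Optional, Tuple

DELAY_S = 30.0    # resetting delay started by each functional-lever press
PROTECT_S = 10.0  # window in which nonfunctional presses postpone food


def simulate_session(
    presses: List[Tuple[float, str]],
    delay_s: float = DELAY_S,
    protect_s: float = PROTECT_S,
) -> List[Tuple[float, str]]:
    """Return (delivery_time, lever_that_last_set_the_time) for each pellet.

    presses: (time_in_seconds, lever) pairs; lever is "F" (functional)
    or "N" (nonfunctional). Hypothetical data format for illustration.
    """
    deliveries: List[Tuple[float, str]] = []
    due: Optional[float] = None   # time the pending pellet is scheduled
    source = ""                   # lever that last set `due`
    for t, lever in sorted(presses):
        # Deliver a pending pellet that came due before (or at) this press.
        if due is not None and due <= t:
            deliveries.append((due, source))
            due = None
        if lever == "F":
            # Each functional press starts or resets the 30-s delay.
            due, source = t + delay_s, "F"
        elif lever == "N" and due is not None and due - t <= protect_s:
            # A nonfunctional press in the last 10 s of a delay moves
            # the pellet to 10 s after that press.
            due, source = t + protect_s, "N"
    # Assumes the session runs long enough for the last pellet to arrive.
    if due is not None:
        deliveries.append((due, source))
    return deliveries


log = [(0, "F"), (5, "N"), (25, "N"), (40, "F"), (50, "F")]
print(simulate_session(log))
```

In this hypothetical log, the nonfunctional press at 5 s has no effect because 25 s of the delay remain, whereas the press at 25 s falls in the last 10 s and moves the pellet from 30 s to 35 s. The session-termination rule (2 hr or 60 reinforcers) is not modeled.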

The first condition continued until response rates on the functional lever were consistently higher over at least 10 sessions than response rates on the nonfunctional lever. The positions of the functional and nonfunctional levers then were reversed. In subsequent conditions, the positions were reversed twice more for Rats JK2 and JK3 and once more for Rat JK4. The number of sessions that each condition was in effect for each rat is shown in Table 1. Sessions occurred 5 days a week at the same time and lasted for 2 hr or until 60 reinforcers were delivered, whichever occurred first. All 60 reinforcers were obtained within the allotted time in most sessions.

Table 1.

Number of sessions in each condition during Experiment 1.

Condition	1	2	3	4
Functional lever position	Right	Left	Right	Left
Rat JK2	89	34	58	53
Rat JK3	116	40	32	47
Rat JK4	81	97	15

Results

The leftmost panel of Figure 1 shows that responding on the functional and nonfunctional levers was differentiated for each of the 3 rats after 2 to approximately 40 sessions, with higher response rates on the functional lever. After varying numbers of sessions following the first and second reversal for Rat JK2 and following the first, second, and third reversal for Rat JK3, response rates reversed, with consistently higher rates occurring on the functional lever toward the end of these respective conditions. Rat JK4 failed to show reversals in response rates during either reversal, as did Rat JK2 during the third reversal. Each of these reversal failures reflected continued relatively high response rates on the nonfunctional lever.

Fig. 1. Responses per min on the functional and nonfunctional levers across conditions of Experiment 1. “Right” and “Left” describe the location of the functional lever in the conditions shown in the graphs.

Figure 2 shows discrimination ratios (the ratio of the number of functional-lever responses to total responses) across reversals. Ratios above 0.5 indicate more responses on the functional lever. The discrimination ratios show that in each condition for Rats JK2 and JK3, the ratios increased across successive sessions following the reversal, eventually rising and remaining above 0.5 in all but the last reversal for Rat JK2. In line with the data in Figure 1, the discrimination ratios for Rat JK4 mostly remained below 0.5 following both reversals and despite extended training during the first reversal. During both the first and second reversals with this rat, the discrimination ratios did increase across the first few sessions following the reversal.

Fig. 2. Discrimination ratios (functional-lever presses/total lever presses) across conditions of Experiment 1.
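The discrimination ratio defined above is simple to compute from per-session response counts. The following is a minimal sketch. The function name and the choice to return `None` for a session with no responses on either lever are assumptions for illustration, since the text does not say how such sessions were handled.

```python
from typing import Optional


def discrimination_ratio(functional: int, nonfunctional: int) -> Optional[float]:
    """Functional-lever responses divided by total responses on both levers."""
    total = functional + nonfunctional
    if total == 0:
        return None  # ratio is undefined when no responses occurred
    return functional / total


# Hypothetical session counts, not data from the experiment.
print(discrimination_ratio(30, 10))
```

A value above 0.5 means that more responses occurred on the functional lever than on the nonfunctional lever in that session.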

The number of reinforcers in each session (out of 60) that were initiated by a nonfunctional-lever press is shown in Figure 3. These reinforcers were those delivered 10 s after a nonfunctional-lever press. Thus, the sequence resulting in this circumstance would be as follows: A functional-lever response initiated the delay interval. Each response on the functional lever reset the interval to 30 s, but responses on the nonfunctional lever were without effect unless they occurred during the last 10 s of the delay. Any nonfunctional-lever response occurring during this period initiated a 10-s delay, at the end of which the reinforcers depicted in Figure 3 were delivered. During the first reversal, Rat JK4 was far more likely to have received a food pellet 10 s from the last response on the nonfunctional lever than were the other 2 rats in this condition. During the third and fourth conditions (the second and third reversals) for Rats JK2 and JK3, at least one or two pellets occurred 10 s after a response on the nonfunctional lever in most sessions.

Fig. 3. Total number of reinforcers delivered 10 s after a nonfunctional-lever press across conditions of Experiment 1.
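The count plotted in Figure 3 could, in principle, be recovered from a time-stamped session record. The sketch below is one way to do so and is not the authors' analysis code. The function name, the "F"/"N" lever codes, and the small timing tolerance are assumptions introduced for the example.

```python
import math
from typing import List, Tuple


def count_nonfunctional_reinforcers(
    deliveries: List[float],
    presses: List[Tuple[float, str]],
    gap_s: float = 10.0,
    tol: float = 1e-6,
) -> int:
    """Count pellets delivered gap_s seconds after a nonfunctional press.

    deliveries: pellet delivery times in seconds.
    presses: (time_in_seconds, lever) pairs; "F" functional, "N" nonfunctional.
    """
    ordered = sorted(presses)
    count = 0
    for d in deliveries:
        # Presses that occurred before this pellet was delivered.
        earlier = [(t, lever) for t, lever in ordered if t < d]
        if not earlier:
            continue
        t_last, lever = earlier[-1]
        # Count the pellet only if the last press was on the nonfunctional
        # lever and preceded delivery by the postponement interval.
        if lever == "N" and math.isclose(d - t_last, gap_s, abs_tol=tol):
            count += 1
    return count


# Hypothetical record: the pellet at 35 s follows a nonfunctional press at 25 s.
log = [(0, "F"), (5, "N"), (25, "N"), (40, "F"), (50, "F")]
print(count_nonfunctional_reinforcers([35.0, 80.0], log))
```

The check on the 10-s gap matters because a nonfunctional press early in a delay has no programmed effect, so a pellet that merely follows such a press by a longer interval is not counted.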

Discussion

The results from the initial condition of this experiment replicate and extend the findings of Critchfield and Lattal (1993) and Williams and Lattal (1999) in that responding was established and maintained at higher rates on the functional than the nonfunctional lever. These results do not replicate the findings of Wilkenfield et al. (1992) because they found higher response rates on the nonfunctional lever at 32-s delays to reinforcement programmed similarly to the delays in the present experiment. Two differences between Wilkenfield et al.'s procedure and the present experiment may have contributed to the different results. First, Wilkenfield et al.'s experiment lasted only a single 8-hr session. The data in Figures 1 and 2 show that, during the first condition, the development of consistently higher response rates on the functional lever required, in some cases at least, longer than 8 hr of exposure to the contingencies. Second, Wilkenfield et al. did not arrange for responses on the nonfunctional lever to postpone food delivery, but the present procedure did. This second difference was the topic of the next experiment.

The results of the reversals suggest that as a history of reinforced responding (albeit reinforced 30 s after the response) accumulates, responding on the now-nonfunctional lever becomes more resistant to change. The increased responding on the nonfunctional lever sets into play a complex contingency: Responses on the functional lever produce reinforcers reliably, but each time after a 30-s period of no further responding on the functional lever. No programmed relation existed between nonfunctional responses and 30-s delay initiation, but nonfunctional responses could occur no closer than 10 s to food delivery. Figure 3 shows that the number of reinforcers occurring 10 s after a nonfunctional-lever response typically was low, but was highest for Rat JK4, where responding failed to reverse. Lattal (1974) showed that only a few food deliveries need be contiguous with the response to sustain responding above the level maintained by all response-independent food deliveries. Interspersing a small number of relatively less delayed reinforcers in a stream of otherwise longer-delayed reinforcers may have similar effects. Other factors that likely influenced responding on the nonfunctional lever were simply the history of reinforcement of responding on that lever in preceding conditions, the similarity of the two levers to one another, and, in the case of Rat JK4, perhaps position bias as well.

Experiment 2

As noted in the Discussion of Experiment 1, a major difference between that experiment and Wilkenfield et al. (1992) was the presence of a delay in the former between any responses on the nonfunctional lever and food delivery. Experiment 2 was conducted to directly compare nonfunctional-lever responding in the presence and absence of such a delay.