Abstract
Prior research indicates that pigeons do not prefer an alternative that provides a sample (for matching-to-sample) over an alternative that does not provide a sample (i.e., there is no indication of which comparison stimulus is correct). However, Zentall and Stagner (2010) showed that when delay of reinforcement was controlled, pigeons had a strong preference for matching over pseudo-matching (there was a sample but it did not indicate which comparison stimulus was correct). Experiment 1 of the present study replicated and extended the results of the Zentall and Stagner study by including an identity relation between the sample and one of the comparison stimuli in both the matching and pseudo-matching tasks. In Experiment 2, when we asked if the pigeons would still prefer matching if we equated the two tasks for probability of reinforcement, we found no systematic preference for matching over pseudo-matching. Thus, it appears that in the absence of differential reinforcement, the information provided by a sample that signals which of the two comparison stimuli is correct is insufficient to produce a preference for that alternative.
Keywords: information, conditioned reinforcement, choice, matching, pseudo-matching, pigeons
For many years comparative psychologists have asked whether animals prefer information over its absence (Prokasy, 1956; Roper & Zentall, 1999; Wyckoff, 1952). In these procedures, animals can choose between an alternative that produces cues signaling either reinforcement or its absence and an alternative that produces no such cues. The general finding is that animals prefer to obtain the cues, even if obtaining them does not affect the probability or rate of reinforcement. However, there is little evidence that these preferences reflect a preference for information as defined by classical information theory (Berlyne, 1957; Hendry, 1969).
According to classical information theory, the amount of information transmitted is a function of the degree to which uncertainty is reduced. When, prior to an initial choice, the outcome is completely uncertain (a 50% probability of reinforcement) and, following the choice, the probability of reinforcement either increases to 100% or decreases to 0%, uncertainty reduction should be maximal. Any increase or decrease in the overall probability of reinforcement should reduce the amount of information transmitted because it should result in less uncertainty reduction. From this perspective, information should be symmetrical: a cue for reinforcement should provide just as much information as a cue for the absence of reinforcement.
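This prediction can be made concrete with Shannon entropy (an illustrative calculation of our own, not one reported in the cited sources). The uncertainty about the outcome before the terminal cue appears, given an overall reinforcement probability p, is

```latex
H(p) = -p\log_{2}p - (1-p)\log_{2}(1-p), \qquad
H(.5) = 1 \text{ bit}, \qquad H(.875) = H(.125) \approx 0.54 \text{ bit}
```

With p = .5, a perfectly predictive cue can transmit a full bit; with p = .875 or p = .125, even a perfect cue can transmit only about half as much. On this account, the predicted drop in preference for the informative alternative should be the same whether the overall probability of reinforcement is raised or lowered.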
Roper and Zentall (1999) tested this prediction of information theory by manipulating the overall probability of reinforcement. They started with an overall probability of 50% reinforcement (one alternative provided, on half of the trials, a cue that always predicted reinforcement and, on the other half, a cue that signaled the absence of reinforcement, whereas the other alternative provided one of two cues on each trial, each associated with 50% reinforcement) and they found a strong preference for the alternative that provided cues for reinforcement and its absence. Consistent with information theory, Roper and Zentall found that increasing the overall probability of reinforcement associated with the two alternatives to 87.5% (one alternative provided a cue for reinforcement 87.5% of the time and a cue for the absence of reinforcement 12.5% of the time, whereas the other alternative provided one of two cues on each trial, each associated with 87.5% reinforcement) decreased the preference for the alternative associated with the discriminative stimuli. On the other hand, contrary to information theory, lowering the overall probability of reinforcement associated with the two alternatives to 12.5% (one alternative provided a cue for reinforcement 12.5% of the time and a cue for the absence of reinforcement 87.5% of the time, whereas the other alternative provided one of two cues on each trial, each associated with 12.5% reinforcement), which also should have decreased the preference for the alternative followed by the discriminative stimuli, actually increased it.
An alternative theory was proposed by Dinsmoor (1983). He suggested that animals will prefer conditioned reinforcers. For example, a good conditioned reinforcer such as a signal for 100% reinforcement would be preferred over a signal for an uncertain 50% reinforcement, a poorer conditioned reinforcer. The asymmetry found by Roper and Zentall (1999) is consistent with a conditioned reinforcement interpretation because the signal for 100% reinforcement would be compared with the uncertain alternative. When the overall probability of reinforcement associated with the two alternatives increased (to 87.5%), the difference in the probability of reinforcement between the two conditioned reinforcers became relatively small (100% − 87.5% = 12.5%) and thus, the preference for the conditioned reinforcer decreased. However, when the overall probability of reinforcement associated with the two alternatives decreased (to 12.5%), the difference in the probability of reinforcement between the two conditioned reinforcers became relatively large (100% − 12.5% = 87.5%) and thus, the preference for the conditioned reinforcer increased. Thus, it appears that preference for discriminative stimuli may be controlled by the preference for a cue that reliably predicts reinforcement, a conditioned reinforcer.
To fully account for the results of Roper and Zentall (1999), one must also posit that the negative effect of the stimulus associated with the absence of food (inhibition) does not detract from the discriminative-stimulus alternative sufficiently to counteract the effects of conditioned reinforcement. In support of this hypothesis, Stagner and Zentall (2010) found that pigeons will prefer discriminative stimuli even when the ratio of stimuli associated with the absence of reinforcement to stimuli associated with reinforcement was 4 to 1 and the nondiscriminative-stimulus alternative provided 2.5 times as much food as the discriminative-stimulus alternative.
Recently, Roberts, Feeney, McMillan, MacPherson, Musolino, and Petter (2009, Exp. 4) modified the task used by Roper and Zentall (1999) by replacing the simple successive discrimination in the terminal link with a conditional discrimination. Pigeons were offered an initial-link choice between a conditional discrimination (matching-to-sample) in which they could obtain close to 100% reinforcement and the same task but without a sample, that is, with a simultaneous discrimination in which they could obtain only about 50% reinforcement because there was no cue to indicate which of the comparison stimuli was correct on each trial. Surprisingly, the pigeons showed little preference for the matching task that would have provided them with almost twice as much reinforcement.
Zentall and Stagner (2010) argued that differential delay of reinforcement may have played a role in the finding by Roberts et al. (2009). Typically, in matching-to-sample research, a fixed number of responses is required to the sample or the sample is presented for a fixed duration before the comparison stimuli are presented. However, to attempt to make the sample and no-sample tasks comparable, Roberts et al. presented the sample and the two comparison stimuli at the same time, so after choosing either task, the pigeons could immediately choose one of the comparison stimuli. For this reason, if a pigeon chose the sample-absent alternative, it could choose one of the comparison stimuli and could possibly receive reinforcement immediately, but of course only 50% of the time. However, if the pigeon chose the matching-to-sample task, to obtain a high probability of reinforcement it would have to attend to and identify the sample and then locate the correct (matching) comparison stimulus. This would have required additional time. As pigeons are known to have relatively steep delay-discounting functions (Green, Myerson, Holt, Slevin, & Estle, 2004), it may not be surprising that they would be relatively indifferent between the higher probability of delayed reinforcement (which they could obtain with the matching-to-sample alternative) and the immediate but only 50% reinforcement (which they could obtain with the sample-absent alternative).
To test this hypothesis, Zentall and Stagner (2010) equated the nominal delay to reinforcement associated with the matching task by using a pseudo sample, rather than no sample, with the alternative task. If a pigeon chose the matching task, it received a sample for 5 s followed by presentation of the comparison stimuli. Choice of the matching comparison stimulus was reinforced. However, if the pigeon chose the pseudo-matching task, it received a sample for 5 s and then comparison stimuli, but neither comparison stimulus matched the sample and the color of the sample did not indicate which comparison stimulus was correct. Thus, choice of the pseudo-matching alternative resulted in 50% reinforcement. Under those conditions, the pigeons showed a clear preference for the matching-to-sample alternative. Thus, when delay to reinforcement is controlled, pigeons do show a preference for the alternative that provides them with a sample stimulus that they can use to choose the correct comparison stimulus over one that does not.
The advantage of using a conditional discrimination (Zentall & Stagner, 2010) over a simple successive discrimination (Roper & Zentall, 1999) is that although the two conditional stimuli provide information in the form of a cue that signals which comparison stimulus is correct, the comparison stimuli are not differentially associated with reinforcement. Thus, they should not become differential conditioned reinforcers, as is the case with a simple successive discrimination in which one stimulus is associated with reinforcement and the other with the absence of reinforcement.
The purpose of the present experiments was to further explore pigeons' sensitivity to the relation between the sample and the comparison stimuli. In Experiment 1, we asked whether the results found by Zentall and Stagner (2010) were obtained because the pigeons preferred the task in which one of the comparison stimuli matched the sample (there was an identity relation between them) rather than because the matching task provided a conditional stimulus that allowed the pigeons to choose the correct comparison stimulus that was signaled by the sample. In Experiment 2, we asked if pigeons would prefer a matching task even if it was not associated with a higher rate of reinforcement. That is, would they prefer a task in which reinforced responding was contingent on choice of the stimulus that matched the sample over a pseudo-matching task in which the sample did not indicate which comparison was correct, when the two tasks were equated for the probability of reinforcement? This is another way of asking if pigeons prefer information (in the classic sense) over its absence. In the case of matching-to-sample, however, there should be no differential conditioned reinforcer as there was in the Roper and Zentall (1999) experiment, and the probability and delay of reinforcement between the two tasks should be the same. If pigeons prefer information, they should choose the alternative that provides them with a matching task over one that yields the same probability of reinforcement but does not provide information about which comparison stimulus is correct.
Experiment 1
In Experiment 1 we tested the hypothesis that preference for the alternative that provided a matching-to-sample task depended on the identity relation between the sample and one of the comparison stimuli. In this experiment, one alternative provided an identity matching task in which choice of the comparison stimulus that matched the sample was reinforced, whereas the other alternative provided a pseudo-matching task in which one of the comparison stimuli matched the sample but choice of either comparison stimulus was reinforced on a random 50% of the trials.
Method
Subjects
The subjects were eight unsexed White Carneau pigeons ranging from 5 to 8 years of age. They were retired breeders purchased from the Palmetto Pigeon Plant, Sumter, South Carolina. The pigeons had all served as subjects in a previous experiment involving daily reversals of a simultaneous hue discrimination. The pigeons were kept on a 12:12-h light/dark cycle and were maintained at 80-85% of their free-feeding body weight. The pigeons had free access to grit and water, and were cared for in accordance with the University of Kentucky's animal care guidelines.
Apparatus
A standard (LVE/BRS, Laurel, Md.) test chamber was used, with inside measurements 35 cm high, 30 cm long, and 35 cm across the response panel. The response panel in the chamber had a horizontal row of three response keys, 25 cm above the floor. The rectangular keys (2.5 cm high × 3.0 cm wide) were separated from each other by 1.0 cm, and behind each key was a 12-stimulus inline projector (Industrial Electronics Engineering, Van Nuys, Calif.) that projected a white plus and a white line-drawn circle on a black background as well as red, yellow, blue, green, and white hues (Kodak Wratten Filter Nos. 26, 9, 38, 60, and no filter, respectively). In the chamber, the bottom of the center-mounted feeder (filled with Purina Pro Grains) was 9.5 cm from the floor. When the feeder was raised, it was illuminated by a 28 V, 0.04 A lamp. A 28 V, 0.1 A houselight was centered above the response panel, and an exhaust fan was mounted on the outside of the chamber to mask extraneous noise. A microcomputer in the adjacent room controlled the experiment.
Procedure
Pretraining
All pigeons received two pretraining sessions in which they were required to peck once at each discriminative stimulus on each key (yellow, red, green, blue), as well as white on the center key and circle and plus on the side keys, for 1.5 s access to mixed-grain reinforcement. Each session consisted of five presentations of each stimulus in each position, for a total of 75 trials per session.
Training
All trials began with a white light presented on the center key. A single peck to the white stimulus illuminated one or both of the side keys. On forced matching trials, the plus, for example, appeared on one of the side keys. One peck extinguished the plus, and illuminated, for example, a red or green sample on the center key for 5.0 s, after which the sample was turned off and red and green comparison stimuli appeared on the side keys. Choice of the comparison color that matched the sample was reinforced and started the 10-s intertrial interval (ITI), whereas choice of the other comparison led directly to ITI. There were 32 forced matching trials per session.
On forced pseudo-matching trials, the circle, for example, appeared on the other side key. One peck extinguished the circle and illuminated a blue or yellow stimulus, for example, on the center key for 5.0 s, after which the stimulus was turned off and blue and yellow comparison stimuli appeared on the side keys. On a random half of these trials choice of either comparison stimulus was reinforced and started the ITI. On the remaining half of the pseudo-matching trials, the comparison choice was not reinforced and led directly to ITI. There were 32 forced pseudo-matching trials per session.
On choice trials, both the plus and the circle were presented on the side keys following the response to the white center key. A single peck to either shape stimulus was followed by the contingencies associated with that stimulus on forced trials. There were 32 choice trials in each session, randomly mixed among the forced trials. The design of Experiment 1 appears in Figure 1. For a given pigeon, the shape stimuli appeared on the same side keys throughout the experiment but the assignment of the shape and side, as well as the colors associated with the matching and pseudo-matching task, was counterbalanced over subjects. The pigeons received 16 training sessions.
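As a concrete illustration of these contingencies, the sketch below simulates the two terminal-link tasks on forced trials (a minimal Python sketch of our own; the names and the randomly responding "bird" are hypothetical, and this is not the authors' control code):

```python
import random

# Illustrative sketch of the Experiment 1 terminal-link contingencies.
# Only the reinforcement rules follow the description in the text.

MATCHING_COLORS = ("red", "green")    # e.g., colors assigned to the matching task
PSEUDO_COLORS = ("blue", "yellow")    # e.g., colors assigned to the pseudo-matching task

def matching_trial(rng):
    """Reinforced only if the chosen comparison matches the 5-s sample."""
    sample = rng.choice(MATCHING_COLORS)
    choice = rng.choice(MATCHING_COLORS)   # stand-in for the pigeon's comparison choice
    return choice == sample

def pseudo_matching_trial(rng):
    """A pseudo sample is shown, but reinforcement is assigned to a random
    half of trials regardless of which comparison is chosen."""
    _pseudo_sample = rng.choice(PSEUDO_COLORS)   # shown but uninformative
    rng.choice(PSEUDO_COLORS)                    # the pigeon's choice is irrelevant
    return rng.random() < 0.5

rng = random.Random(0)
forced_trials = ["matching"] * 32 + ["pseudo"] * 32   # forced trials only, for brevity
rng.shuffle(forced_trials)
outcomes = [matching_trial(rng) if t == "matching" else pseudo_matching_trial(rng)
            for t in forced_trials]
```

Under these rules, a bird that attends to the sample can approach 100% reinforcement on matching trials, whereas pseudo-matching trials pay off on about half of the trials no matter what the bird does.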
Results and Discussion
The pigeons acquired the matching task quickly (see Figure 2). By Session 15 they were performing at 90.6% correct. In parallel with their acquisition of matching, the pigeons showed a preference for the matching over the pseudo-matching alternative. By Session 15 they chose the matching alternative on 85.2% of the choice trials. A t-test performed on the matching accuracy scores pooled over the last 5 sessions of training indicated that the pigeons were performing significantly above chance, t(7) = 8.93, p < .0001. A t-test performed on the pigeons' choice of the matching alternative pooled over the last 5 sessions of training indicated that they had a strong preference for that alternative, t(7) = 2.72, p = .03. Finally, we asked if matching accuracy was greater than the preference for the matching alternative. When pooled over the last 5 sessions of training, the difference between matching accuracy and matching preference was not statistically significant, t < 1.
An examination of the comparison choice latencies, when pooled over all sessions, indicated that they were longer on matching trials (1.41 s) than on pseudo-matching trials (1.05 s) (see Figure 3). A correlated t-test performed on the choice latencies pooled over training sessions indicated that the difference in choice latencies between matching and pseudo-matching trials was significant, t(7) = 3.76, p = .007. To assess the terminal difference in comparison choice latency, the latency data were also pooled over the last 6 training sessions. A correlated t-test indicated that the difference in latencies between matching (1.49 s) and pseudo-matching (1.07 s) trials was also significant, t(7) = 2.75, p = .028. Thus, the pigeons took somewhat longer (about 0.4 s) to make their comparison choices on matching trials than on pseudo-matching trials. However, this difference in comparison choice latency cannot account for the preference for the matching alternative because the increase in delay to reinforcement on matching trials should have reduced the preference for matching trials over pseudo-matching trials.
These results were quite similar to those reported by Zentall and Stagner (2010), who reported matching accuracy of 81.3% at Session 15 and choice of the matching alternative over the pseudo-matching alternative of 82.8% at that point in training. Thus, the results reported by Zentall and Stagner did not result from the fact that in their pseudo-matching task there was no identity match between the sample and one of the comparison stimuli. When, in the present study, both the matching and pseudo-matching tasks involved a potential identity relation, preference for the matching alternative was at least as strong as it was in the Zentall and Stagner study.
The present results also confirm that when delay to reinforcement is carefully controlled, pigeons are sensitive to the probability of reinforcement and will choose to see a sample rather than no sample or a pseudo sample. These results, together with the results from Zentall and Stagner, indicate that Roberts et al.'s (2009, Experiment 4) finding that pigeons do not prefer a sample (that could result in near 100% reinforcement) over the absence of a sample (that could result in only 50% reinforcement) was very likely due to the simultaneous matching procedure that they used. Specifically, when the pigeons chose the matching-task alternative, they would have had to identify the sample color and then look for the matching comparison color on that trial (sometimes on the left, sometimes on the right), whereas when the pigeons chose the alternative that did not include a sample, they could have selected one of the comparison stimuli without regard to its identity. The difference in choice latency between these two chains could have been 1-2 s.
An interesting question, unrelated to the purpose of the present experiment, is what the pigeons chose on forced pseudo-matching trials. Recall that those trials were defined as either reinforced or not reinforced independent of comparison choice. One interesting possibility is that the pigeons would show evidence of generalized matching: because color matching was reinforced whenever the colors associated with the identity matching task appeared, the pigeons might have generalized choice of the matching comparison to pseudo-matching trials. However, no evidence of generalized matching was found on pseudo-matching trials. Over the last 10 sessions of training, on forced pseudo-matching trials, five of the eight pigeons showed a strong position bias (four preferring the right key, mean = 99.2%, and one preferring the left key, 100%), and three showed a strong color bias (one pigeon each preferring red, 100%, green, 100%, and blue, 96.5%). The position biases would have been expected because the pigeons could have anticipated pecking their preferred comparison location at the time of pseudo-sample presentation. However, the color preferences were not anticipated because pecking the preferred comparison color would have required locating the preferred color before choosing it, thereby incurring a small additional delay.
Experiment 2
Roper and Zentall (1999) found that pigeons preferred discriminative stimuli of equal frequency (one stimulus associated with 100% reinforcement or a different stimulus associated with 0% reinforcement) over nondiscriminative stimuli (both stimuli associated with 50% reinforcement) in spite of the fact that the alternatives were associated with an equal overall rate of reinforcement.
According to information theory (Berlyne, 1957; Hendry, 1969) this preference may result from a preference for information over the absence of information. However, Dinsmoor (1983) proposed that the preference resulted from the strong conditioned reinforcement associated with the stimulus associated with 100% reinforcement.
As already noted, Roper and Zentall (1999) offered support for the conditioned reinforcement hypothesis. Although the pigeons showed a decrease in preference for the discriminative-stimulus alternative when the probability of reinforcement associated with each alternative was increased (consistent with information theory), they showed an increase in preference for the discriminative-stimulus alternative when the probability of reinforcement associated with each alternative was decreased (contrary to information theory but consistent with conditioned reinforcement).
The design of Experiment 1 provides an alternative means of distinguishing between information theory and conditioned reinforcement: instead of arranging differential reinforcement for choice of matching over pseudo-matching, the two alternatives can be followed by similar probabilities of reinforcement. Because the initial choice is followed by a conditional discrimination, rather than sometimes by a simple conditioned reinforcer (a stimulus followed by 100% reinforcement) and at other times by a simple conditioned inhibitor (a stimulus followed by 0% reinforcement), there should be no differential conditioned reinforcement and thus no differential preference. On the other hand, if information theory is correct, the pigeons should prefer informative samples over uninformative samples, even if the probability of reinforcement is equated.
The purpose of Experiment 2 was to determine if pigeons will prefer an alternative that provides them with information in the form of a sample that indicates which comparison stimulus is correct, over a sample that provides no information but is associated with the same probability of reinforcement. If information has value for pigeons, one might expect them to prefer the matching alternative, especially during the acquisition of the matching task.
Method
Subjects
The subjects were eight unsexed White Carneau pigeons ranging from 5 to 8 years of age. They were retired breeders purchased from the Palmetto Pigeon Plant, Sumter, South Carolina. The pigeons were kept on a 12:12-h light/dark cycle and were maintained at 80-85% of their free-feeding body weight. They had free access to grit and water, and were cared for in accordance with the University of Kentucky's animal care guidelines. All of the pigeons had taken part in a previous experiment in which they had learned a simultaneous discrimination between red and yellow key lights and between blue and green key lights.
Apparatus
The apparatus was the same as that used for Experiment 1.
Procedure
Pretraining
All pigeons received two pretraining sessions in which they were required to peck once at each discriminative stimulus on each key (yellow, red, green, blue), as well as white on the center key and circle and plus on the side keys, for 1.5 s access to mixed-grain reinforcement. Each session consisted of five presentations of each stimulus in each position, for a total of 75 trials per session.
Equating for Reinforcement
All trials began with a white light presented on the center key. A single peck to the white stimulus illuminated one or both of the side keys. On forced matching trials, the plus appeared on one of the side keys. One peck extinguished the plus, and illuminated a red or green sample on the center key for 5 s, after which the sample was turned off and red and green comparison stimuli appeared on the side keys. Choice of the comparison hue that matched the sample was reinforced and started the 10-s ITI, whereas choice of the other comparison led directly to ITI. There were 32 forced matching trials per session.
On forced pseudo-matching trials, the circle appeared on the other side key. One peck extinguished the circle and illuminated a blue or yellow pseudo sample on the center key for 5 s, after which the pseudo sample was turned off and blue and yellow comparison stimuli appeared on the side keys. If that trial was designated as one with reinforcement, a response to either comparison key was reinforced (independent of the hue of the pseudo sample) and started the 10-s ITI. If that trial was designated as one without reinforcement, a response to either comparison key was not reinforced and led directly to the 10-s ITI. There were 32 forced pseudo-matching trials per session.
The total number of forced pseudo-matching trials for which choice was reinforced was contingent on how many correct forced matching trials there were on the previous session. On the first training session, the number of reinforced pseudo-matching trials was set at 16 (50%). To correct for the fact that the pigeons were acquiring the matching task and would show improvement from session to session, if the number of correct forced matching trials was odd, one additional reinforced trial was added to the number of reinforced pseudo-matching trials, and if it was even, two additional reinforced trials were added. For example, if a pigeon was correct on 24 of the forced matching trials on the previous session, it would receive reinforcement on 26 of the 32 pseudo-matching trials the following day. The same approximate ratio of reinforcement was applied to choice trials whenever the pseudo-matching alternative was chosen.
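A minimal sketch of this yoking rule follows (hypothetical Python of our own; the clamp at 32 trials is our assumption for sessions with near-perfect matching, a case the text does not address):

```python
def reinforced_pseudo_trials(prev_correct_matching=None, total_trials=32):
    """Number of forced pseudo-matching trials to reinforce in the next session,
    yoked to the previous session's correct forced matching trials (see text)."""
    if prev_correct_matching is None:               # first training session
        return total_trials // 2                    # 16 of 32 trials (50%)
    bonus = 1 if prev_correct_matching % 2 else 2   # +1 if odd, +2 if even
    # Assumption: the count cannot exceed the 32 available trials.
    return min(prev_correct_matching + bonus, total_trials)

# Worked example from the text: 24 correct matching trials -> 26 reinforced trials.
assert reinforced_pseudo_trials(24) == 26
assert reinforced_pseudo_trials() == 16
```

The small odd/even correction simply keeps the pseudo-matching payoff a trial or two ahead of the previous session's matching accuracy, anticipating session-to-session improvement.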
On choice trials, both the plus and the circle were presented on the side keys following the response to the white center key. A single peck to either shape stimulus was followed by the contingencies associated with that stimulus on forced trials. There were 32 choice trials in each session, randomly mixed among the forced trials. For a given pigeon, the shape stimuli appeared on the same side keys throughout the experiment but the assignment of the shape and side, as well as the colors associated with the matching and pseudo-matching task, was counterbalanced over subjects. The pigeons received 50 sessions of training.
Results
In an attempt to equate the probability of reinforcement on matching and pseudo-matching trials, the number of pseudo-matching trials for which choice of either comparison was reinforced on any session depended on the number of correct matching trials from the preceding session. To confirm that this procedure resulted in a comparable probability of reinforcement, the probability of reinforcement on forced matching and pseudo-matching trials was plotted over sessions (see Figure 4). As can be seen in the figure, the probability of reinforcement associated with the matching and the pseudo-matching task was approximately the same throughout training.
When the matching task and the pseudo-matching task were equated for probability of reinforcement, no consistent preference was found on choice trials (see Figure 5). Although individual pigeons showed large fluctuations in preference over sessions, if anything, the pigeons tended to favor the pseudo-matching alternative. The mean preference for the matching alternative when pooled over the 50 training sessions was 43.7%, a level that was not significantly different from 50%, t(7) < 1. The mean preference for the matching alternative pooled over the last 10 training sessions was 39.8%, a level that once again was not significantly different from chance, t(7) = 1.07.
To get a better idea of the individual differences and changes in preference for the matching alternative, the results for individual pigeons are presented in Figure 6. As can be seen in the figure, there was no consistent choice over training either within or between pigeons. Pigeon 10378 showed a somewhat consistent preference for the matching alternative but that preference was inconsistent towards the end of training. Pigeon 10053 initially was inconsistent but after a few sessions of training preferred the matching alternative. However, it then preferred the pseudo-matching alternative but returned to the matching alternative and then became inconsistent toward the end of training. Pigeon 19389 originally preferred the pseudo-matching alternative and after showing a preference for the matching alternative for a few sessions returned to the pseudo-matching alternative and then returned to the matching alternative at the end of training. Pigeon 18497 was quite inconsistent throughout training but overall showed a somewhat stronger preference for the pseudo-matching alternative. Pigeon 10896 preferred the matching alternative at the start of training, then showed a strong preference for the pseudo-matching alternative and then showed an erratic preference but mostly for the pseudo-matching alternative. Pigeon 15949 showed a mostly consistent preference for the pseudo-matching alternative especially towards the end of training. And Pigeon 9868 mostly showed a preference for the matching alternative early in training but it changed to mostly a preference for the pseudo-matching alternative later in training.
One could argue that because reinforcement for the pseudo-matching task was equated with that for the matching task, once the matching task was acquired to a high degree, whatever response bias the pigeons had developed in the pseudo-matching task would have been reinforced most of the time. Thus, it would be primarily during acquisition of the matching task that the samples in the matching task would be more informative than the samples in the pseudo-matching task. For this reason, in Figure 5, we placed a box around the matching acquisition sessions (the 15 sessions in which matching accuracy was < 90%) to indicate where the information provided by samples in the matching task should have been most useful relative to the samples in the pseudo-matching task. As can be seen in the figure, there is no indication of any reliable preference for the matching alternative over those acquisition sessions.
Matching accuracy on choice trials on which the matching alternative was chosen also appears in Figure 5. As one might expect, those data appear quite similar to the matching accuracy on forced matching trials (presented in Figure 4).
An examination of the comparison choice latencies, when pooled over all sessions in Experiment 2, indicated that they were longer on matching trials (1.42 s) than on pseudo-matching trials (1.26 s) (see Figure 7). A correlated t-test performed on the choice latencies indicated that the difference in choice latencies between matching and pseudo-matching trials was not significant, t < 1. When pooled over the last 6 sessions, a correlated t-test indicated that the difference in latencies between matching (1.49 s) and pseudo-matching (1.07 s) trials was also not significant, t < 1. However, it appears that choice latencies on matching trials were longer early in training than later in training; in fact, they appear to have been longer during the sessions in which matching was acquired. For this reason, a separate correlated t-test was conducted on the matching and pseudo-matching latencies from the first 15 training sessions. That analysis also indicated that the difference (1.89 s matching, 1.42 s pseudo-matching) was not statistically significant, t(3) = 2.0, p > .05. Furthermore, following matching acquisition, the comparison choice latencies on matching and pseudo-matching trials were virtually indistinguishable.
Discussion
The purpose of Experiment 2 was to determine if pigeons would prefer the matching task over the pseudo-matching task if the probability of reinforcement was (approximately) equal for the two tasks. We asked if pigeons would prefer an informative sample that consistently allowed them to choose the comparison stimulus that would provide reinforcement over an uninformative sample for which reinforcement did not depend on which comparison stimulus was chosen. We found that when the probability of reinforcement was equated, the pigeons showed no consistent preference for the matching alternative. Thus, under the present conditions, in the absence of a clear conditioned reinforcer, information per se appears to play little role for pigeons.
A surprising outcome of Experiment 2 was the relatively unstable preference for the matching alternative shown by most of the pigeons. It was our expectation that the pigeons would either show indifference by choosing each alternative about half of the time or show idiosyncratic but consistent preferences for one alternative or the other. The fact that most pigeons showed strong consistent preferences over blocks of sessions, but then often reversed, was somewhat surprising and is difficult to explain.
In Experiment 2, we found no difference between latencies on matching and pseudo-matching trials, whereas in Experiment 1 latencies were somewhat longer on matching trials than on pseudo-matching trials. It is not obvious why those differences did not appear in Experiment 2. Latencies on matching trials were somewhat longer in Experiment 1 than in Experiment 2, whereas latencies on pseudo-matching trials were somewhat shorter in Experiment 1 than in Experiment 2. It does appear that comparison choice latencies on matching trials were longer during acquisition (compare the early training sessions in the two experiments), and because the preference for the matching alternative appeared very quickly in Experiment 1, training was not extended there as it was in Experiment 2.
General Discussion
The results of Experiment 1 confirm and extend the results of Zentall and Stagner (2010). They indicate that the earlier preference for the matching task over the pseudo-matching task did not depend on the fact that one of the comparison stimuli in the matching task matched the sample, whereas neither comparison stimulus in the pseudo-matching task matched the pseudo sample. In both experiments, when delay to reinforcement was controlled, the pigeons showed a strong preference for matching over pseudo-matching.
The results of Experiment 2 indicate that in the absence of differential reinforcement, pigeons do not appear to prefer an alternative that provides a cue that informs them which comparison stimulus is correct. Thus, in contrast to the results of Roper and Zentall (1999), when the initial choice does not provide a clear conditioned reinforcer, but instead provides a conditional stimulus (or an occasion setter, a stimulus such as a sample that modulates responding to another stimulus; see Holland, 1983), that alternative is not preferred by pigeons. This finding offers additional support for the conditioned reinforcement account (Dinsmoor, 1983), which bases the pigeon's preference on the change in the predictive value of the stimuli that follow the choice.
According to the conditioned reinforcement account, a stimulus will be preferred if it serves as a better predictor of reinforcement when it is present than when it is absent. When the terminal link consists of a simple successive discrimination, as in Roper and Zentall (1999), the positive discriminative stimulus serves as a better predictor of reinforcement (100%) after the initial choice is made than before it is made (50%), whereas selection of the other alternative does not improve the prediction of reinforcement: it was 50% before that alternative was selected and it remained 50% after that alternative was selected.
However, if one adopts the conditioned reinforcement account one must assume that the effect of the negative stimulus (associated with the absence of reinforcement) plays less of a role than the positive stimulus. That is, the aversive properties of the conditioned inhibitor do not outweigh the conditioned reinforcing effects of the conditioned reinforcer.
According to conditioned reinforcement theory, the value of a signal for reinforcement is determined by the probability of reinforcement in the presence of the signal relative to the probability of reinforcement in its absence (Gibbon & Balsam, 1981; Jenkins, Barnes, & Barrera, 1981; Stagner & Zentall, 2010; Zentall & Stagner, 2011). Thus, if the overall probability of reinforcement goes down (i.e., the signal for reinforcement occurs less often) but the signal remains a good predictor of reinforcement, the conditioned reinforcer should increase in value. On the other hand, if the signal for reinforcement occurs frequently, it is not very informative because, relative to its absence, it does not do much better at predicting reinforcement.
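Expressed as a simple ratio (our paraphrase of this relative-probability idea, not an equation taken from the cited papers), the value of a signal S would scale with

```latex
V(S) \propto \frac{p(\text{reinforcement} \mid S)}{p(\text{reinforcement} \mid \text{no } S)}
```

Holding the numerator at 1.0 while lowering the background probability in the denominator (as when overall reinforcement dropped to 12.5%) increases the signal's relative value, whereas raising the background probability (to 87.5%) decreases it, which is the asymmetry Roper and Zentall (1999) observed.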
Classical information theory views information as symmetrical: a signal for the absence of food should have just as much information value as a signal for the presence of food. However, consistent with the results of Roper and Zentall (1999), Fantino and his colleagues (Case, Fantino, & Wixted, 1985; Fantino & Case, 1983; Fantino, Case, & Altus, 1983; Fantino & Silberberg, 2010) have shown that although humans prefer “good news” (conditioned reinforcement) over “no news,” they also prefer “no news” over “bad news” (a presumed conditioned inhibitor). That is, they do not prefer information if it is associated with a bad outcome. However, Lieberman, Cathro, Nichol, and Watson (1997) found that humans will prefer bad news over no news as long as the information is useful.
More recent models of information theory include a temporal parameter such that the information value of a conditioned stimulus or signal includes the degree to which the onset of the signal reduces the expected time to the next reinforcement (Balsam & Gallistel, 2009). This view of information theory is similar to delay-reduction theory (Fantino, 1969) because it considers not just the absolute delay to reinforcement but the rate of reinforcement predicted by the signal relative to its absence. It has even been suggested that the function of a conditioned reinforcer is not to strengthen behavior but to predict when and where the primary reinforcer is available, a role that Rachlin (1976) has referred to as the discriminative stimulus hypothesis of conditioned reinforcement (see also Shahan, 2010).
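In Balsam and Gallistel's (2009) treatment, as we understand it, the information conveyed by signal onset grows with the ratio of the expected time to reinforcement in the background to the expected time to reinforcement after the signal appears, roughly

```latex
I_{\text{signal}} \approx \log_{2}\!\left(\frac{C}{T}\right)
```

where C is the average interval between reinforcers in the situation as a whole and T is the average interval from signal onset to reinforcement; a signal that does not shorten the expected wait (C ≈ T) conveys essentially no information.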
How these theories would deal with the results of Experiment 2 of the present study is not clear. If they view the comparison stimuli as conditioned reinforcers, in the sense that they have been similarly associated with reinforcement, then the prediction would be that the pigeons should be indifferent between the two alternatives because each comparison stimulus in either matching or pseudo-matching was associated with the same probability of reinforcement. If, however, the conditioned reinforcers were considered to be the successive presentation of the sample and the correct comparison stimulus (i.e., a successive compound stimulus), then both theories would predict a preference for the matching alternative over the pseudo-matching alternative because in the matching task the two stimuli together would predict a higher probability of reinforcement than the pseudo sample and either comparison in the pseudo-matching task. Although these theories have not addressed conditional discrimination procedures, if, as these theories suggest, organisms learn what, where, and when events occur in the environment (Gallistel & Gibbon, 2002), then they should predict that matching would be preferred over pseudo-matching.
The present results add to the growing evidence that although pigeons prefer stimuli that differentially predict reinforcement and its absence, this preference does not reflect the formal information value of those stimuli: when the conditioned reinforcement value of the comparison stimuli is controlled, the information provided by the sample-comparison combination is insufficient to maintain a preference for the matching alternative.
Acknowledgments
This research was supported by Grant HD060996 from the National Institute of Child Health and Human Development.
References
- Balsam P, Gallistel CR. Temporal maps and informativeness in associative learning. Trends in Neurosciences. 2009;32:73–78. doi: 10.1016/j.tins.2008.10.004.
- Berlyne DE. Uncertainty and conflict: A point of contact between information-theory and behavior-theory concepts. The Psychological Review. 1957;64:329–339. doi: 10.1037/h0041135.
- Case DA, Fantino E, Wixted J. Human observing: Maintained by negative informative stimuli only if correlated with improvement in response efficiency. Journal of the Experimental Analysis of Behavior. 1985;43:289–300. doi: 10.1901/jeab.1985.43-289.
- Dinsmoor JA. Observing and conditioned reinforcement. The Behavioral and Brain Sciences. 1983;6:693–728.
- Dinsmoor JA, Browne MP, Lawrence CE. A test of the negative stimulus as a reinforcer of observing. Journal of the Experimental Analysis of Behavior. 1972;18:79–85. doi: 10.1901/jeab.1972.18-79.
- Fantino E. Choice and rate of reinforcement. Journal of the Experimental Analysis of Behavior. 1969;12:723–730. doi: 10.1901/jeab.1969.12-723.
- Fantino E, Case DA. Human observing: Maintained by stimuli correlated with reinforcement but not extinction. Journal of the Experimental Analysis of Behavior. 1983;40:193–210. doi: 10.1901/jeab.1983.40-193.
- Fantino E, Case DA, Altus D. Observing reward-informative and -uninformative stimuli by normal children of different ages. Journal of Experimental Child Psychology. 1983;36:437–452.
- Fantino E, Silberberg A. Revisiting the role of bad news in maintaining human observing behavior. Journal of the Experimental Analysis of Behavior. 2010;93:157–170. doi: 10.1901/jeab.2010.93-157.
- Gallistel CR, Gibbon J. The symbolic foundations of conditioned behavior. Mahwah, NJ: Lawrence Erlbaum Associates; 2002.
- Gibbon J, Balsam P. Spreading association in time. In: Locurto CM, Terrace HS, Gibbon J, editors. Autoshaping and conditioning theory. New York: Academic Press; 1981. pp. 219–253.
- Green L, Myerson J, Holt DD, Slevin JR, Estle SJ. Discounting of delayed food rewards in pigeons and rats: Is there a magnitude effect? Journal of the Experimental Analysis of Behavior. 2004;81:39–50. doi: 10.1901/jeab.2004.81-39.
- Hendry DP. Introduction. In: Hendry DP, editor. Conditioned reinforcement. Homewood, IL: Dorsey; 1969.
- Jenkins HM, Barnes RA, Barrera FJ. Why autoshaping depends on trial spacing. In: Locurto CM, Terrace HS, Gibbon J, editors. Autoshaping and conditioning theory. New York: Academic Press; 1981. pp. 255–284.
- Lieberman DA, Cathro JS, Nichol K, Watson E. The role of S2 in human observing behavior: Bad news is sometimes better than no news. Learning and Motivation. 1997;28:20–42.
- Prokasy WF. The acquisition of observing responses in the absence of differential external reinforcement. Journal of Comparative and Physiological Psychology. 1956;49:131–134. doi: 10.1037/h0046740.
- Rachlin H. Behavior and learning. San Francisco: W. H. Freeman; 1976.
- Roberts WA, Feeney MC, McMillan N, MacPherson K, Musolino E, Petter M. Do pigeons (Columba livia) study for a test? Journal of Experimental Psychology: Animal Behavior Processes. 2009;35:129–142. doi: 10.1037/a0013722.
- Roper KL, Zentall TR. Observing behavior in pigeons: The effect of reinforcement probability and response cost using a symmetrical choice procedure. Learning and Motivation. 1999;30:201–220.
- Shahan TA. Conditioned reinforcement and response strength. Journal of the Experimental Analysis of Behavior. 2010;93:269–289. doi: 10.1901/jeab.2010.93-269.
- Stagner JP, Zentall TR. Suboptimal choice behavior by pigeons. Psychonomic Bulletin & Review. 2010;17:412–416. doi: 10.3758/PBR.17.3.412.
- Wyckoff LB. The role of observing responses in discrimination learning: Part 1. Psychological Review. 1952;59:431–442. doi: 10.1037/h0053932.
- Zentall TR, Stagner JP. Pigeons prefer conditional stimuli over their absence: A comment on Roberts et al. (2009). Journal of Experimental Psychology: Animal Behavior Processes. 2010;36:506–509. doi: 10.1037/a0020202.
- Zentall TR, Stagner JP. Maladaptive choice behavior by pigeons: An animal analog of gambling (sub-optimal human decision making behavior). Proceedings of the Royal Society B: Biological Sciences. 2011;278:1203–1208. doi: 10.1098/rspb.2010.1607.