Abstract
Observing responses are those that produce stimuli correlated with the availability (S+) or non-availability (S−) of reinforcement but that have no influence on the actual delivery or timing of reinforcement. Prior research has shown that observing is maintained by the occasional production of the S+ (“good news”) and not by production of the equally informative S− (“bad news”). However, for both humans and rats the S− maintains observing when it is at least implicitly correlated with good news. In the present study, pigeons could obtain both good and bad news by responding during the appropriate key color. In one condition, the bad news was actually more informative about reinforcement than was the good news. Nevertheless, a preponderance of the birds’ responses was made on the nominally good-news option. The present results offer further support for the central role of good news in maintaining observing responses and are entirely consistent with the traditional conditioned-reinforcement (or classical conditioning) interpretation of observing.
Keywords: Observing, Information, Conditioned Reinforcement, Behavioral Ecology, Optimality, Classical Conditioning, Contiguity, Variable-time schedules, Pigeons
1. Introduction
Observing responses produce stimuli correlated with schedules of reinforcement, but do not affect the occurrence of reinforcement (Wyckoff, 1952). For example, two equally probable schedules of reinforcement differing only in frequency of reinforcement may alternate unpredictably. Effective observing responses would produce stimuli identifying the schedule in effect. Does a stimulus maintain observing because it is correlated with primary reinforcement (the “conditioned-reinforcement hypothesis”), or because it provides information about the availability of reinforcement (the “information hypothesis”)? The critical test for distinguishing between these views is whether a stimulus associated with extinction (EXT; an S−) is reinforcing. The evidence shows that it is not (e.g., Dinsmoor, 1983; Fantino and Case, 1983), a result consistent with the conditioned-reinforcement hypothesis.
It has also been shown that when production of “bad news” (the S−) is correlated with the opportunity to rest during an effortful task (e.g., Case et al., 1985; Perone and Baron, 1980) or to engage in another activity (e.g., Case et al., 1990), then the “bad news” is also a discriminative stimulus for reinforcement. What remains to be explored is the set of conditions under which an S− functions as an aversive, neutral, or reinforcing stimulus, depending on its correlation with reinforcement.
A recent paper showed that for humans the S− or “bad news” would be reinforcing when its absence was correlated with the proximity of reinforcement (Fantino and Silberberg, 2010; also see Escobar and Bruner, 2009, for a related demonstration in rats). These results suggest that for humans and rats an S− may be reinforcing (and maintain observing) when it is also correlated with positive reinforcement. One purpose of the present study was to ascertain if birds would show this effect. Birds in the present study were exposed to two conditions in which both positive information (“good news”, the S+), and negative information (“bad news”, the S−) were concurrently available. In one condition, however, the “bad news” was indirectly correlated with good news, while the “good news” was not well correlated with the presentation of reinforcement. Would good news be preferred to bad news in both conditions or would bad news be preferred in the latter condition in which good news could be inferred from the bad news? The latter outcome would extend to birds the recent finding of Fantino and Silberberg (2010) with humans.
2. Method
2.1. Subjects
Four adult pigeons (P3-P6) of unknown sex and breed served. They had previously served in a study in which key pecking produced grain reinforcers under progressive-ratio schedules. They were individually housed in a temperature-controlled, continuously illuminated vivarium where they had free access to water, and were occasionally fed to maintain them at 80% of their free-feeding weights.
2.2. Apparatus
Two identical chambers, with internal dimensions of 27.5 by 32.5 by 29 cm, housed birds individually during experimental sessions. With the exception of the stainless-steel response panel and wire-mesh floor, all surfaces were wood painted white. A 5.5- by 5-cm food aperture and a 24-V DC houselight were centered on the panel 5.5 and 26.2 cm, respectively, from the floor. Three Lehigh Valley Electronics response keys, 2.54 cm in diameter and spaced 6.5 cm apart, center-to-center, were located 21 cm from the floor. A force of approximately 0.15 N activated a microswitch behind the center key, the only one used in this study. An Industrial Electronics Engineers in-line display unit could illuminate each key from behind with various colors and shapes. A computer controlled all experimental events and data collection.
2.3. Procedure
Prior to beginning the experiment, all birds were exposed for two sessions to a variable-interval (VI) 4-min schedule that provided 4 s of access to mixed grain contingent upon responding to the white center key of the chamber. In this and all subsequently discussed variable schedules, the 12 intervals composing the schedule were based on the Fleshler-Hoffman sequence (Fleshler and Hoffman, 1962) and were sampled without replacement. The houselight and key were continuously illuminated during the session except when grain was presented. During grain presentation, a light was illuminated within the grain aperture. Sessions ended after 30 reinforcers.
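The interval lists described above can be illustrated with a short sketch. It assumes the commonly cited form of the Fleshler and Hoffman (1962) progression, in which the n-th of N intervals is the schedule mean multiplied by 1 + ln N + (N − n) ln(N − n) − (N − n + 1) ln(N − n + 1), with 0 ln 0 taken as 0. The function names, the use of seconds, and the seeded random generator are illustrative assumptions and are not taken from the control software used in the study.

```python
import math
import random


def fleshler_hoffman(mean_s, n):
    """Return n intervals (in seconds) whose arithmetic mean equals mean_s.

    Assumed form of the progression: the k-th interval is
    mean_s * (1 + ln n + (n - k) ln(n - k) - (n - k + 1) ln(n - k + 1)).
    """
    def xlogx(x):
        # Convention: 0 * ln(0) is treated as 0.
        return x * math.log(x) if x > 0 else 0.0

    return [
        mean_s * (1 + math.log(n) + xlogx(n - k) - xlogx(n - k + 1))
        for k in range(1, n + 1)
    ]


def interval_stream(intervals, rng):
    """Yield intervals sampled without replacement, reshuffling after each pass."""
    while True:
        order = list(intervals)
        rng.shuffle(order)
        yield from order


# Hypothetical example: a 12-interval list with a 4-min (240-s) mean.
vi_4min = fleshler_hoffman(240.0, 12)
stream = interval_stream(vi_4min, random.Random(0))
```

Sampling without replacement means that each block of 12 consecutive intervals contains every value exactly once, so the obtained mean interval matches the nominal schedule value after each complete pass.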
Following this pretraining, all birds were exposed to the main experimental procedure, a multiple variable-time (VT) 4-min VT 4-min schedule. For P3 and P4 in the first condition, one component was cued by the projection of green light through the transparent center key, while the other component was cued by red light. Termination of a component and its associated VT schedule was controlled by a VT 2-min schedule. When a component ended, all illumination was extinguished for 1 s, followed by the re-illumination of the center key with green or red light, the color determined by a probability gate set to 0.5. Any response during the green-key component caused the superimposition of three horizontal lines (good-news stimulus) if the next VT reinforcer was scheduled to be delivered in that component within 20 s. If the next VT reinforcer was not due for more than 20 s, then responses in the presence of the green key had no effect. In the red-key component, the response-dependent superimposed stimulus was a single vertical line (bad-news stimulus). It appeared if the next scheduled within-component VT reinforcer was 120 s or more away. The end of a component extinguished the green- or red-key color and, if present, its superimposed stimulus.
The procedure for the first condition for P5 and P6 was identical to that for the other birds except: (a) the multiple-schedule components were cued by yellow and green key colors, respectively, instead of green and red; (b) onset of the good-news stimulus occurred only if a response to the yellow key occurred when the next VT reinforcer was scheduled to be delivered in that component within 120 s; and (c) onset of the bad-news cue occurred only if a response to the green key occurred when the next VT reinforcer was more than 20 s away. Daily sessions ended after 48 reinforcers. Condition 1 and all subsequent conditions ended after 15 sessions. No alternative measure of stability was used.
The second condition was preceded by two sessions’ exposure to the pretraining regimen on a VI 4-min schedule that preceded the first condition of this experiment. Then all birds were exposed to the same experimental contingencies as in the first condition except that the stimuli cuing multiple-schedule components were reversed. The third condition of the experiment was not preceded by VI pretraining. In this condition, P3 and P4 were exposed to the multiple-schedule components previously used for P5 and P6, while P5 and P6 were exposed to the component colors previously in use by P3 and P4. For all birds, a response in the bad-news component produced the bad-news cue if VT reinforcement was more than 20 s away. In the good-news component, a response produced the good-news cue if it occurred within 120 s of VT reinforcement.
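The two cue contingencies described in this section can be summarized in a minimal sketch. The 20-s and 120-s criteria, and the wording "within", "120 s or more", and "more than 20 s", come from the text; the class, constant, and function names are hypothetical, and the sketch assumes that the time to the next reinforcer scheduled within the current component is known at the moment of each response.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Arrangement:
    good_within_s: float  # good-news cue if food is due within this many seconds
    bad_beyond_s: float   # bad-news cue if food is at least / more than this far away
    bad_inclusive: bool   # True for "120 s or more", False for "more than 20 s"


# Good news more informative (e.g., P3 and P4 in the first condition).
GOOD_MORE_INFORMATIVE = Arrangement(20.0, 120.0, True)
# Bad news more informative (e.g., all birds in the third condition).
BAD_MORE_INFORMATIVE = Arrangement(120.0, 20.0, False)


def peck_outcome(component: str, time_to_food_s: float,
                 arr: Arrangement) -> Optional[str]:
    """Return the cue produced by a key peck, or None if the peck has no effect.

    component is "good" or "bad" (hypothetical labels for the two key colors).
    Pecks never change when food is delivered; they only produce cues.
    """
    if component == "good":
        return "good-news cue" if time_to_food_s <= arr.good_within_s else None
    if component == "bad":
        if arr.bad_inclusive:
            far = time_to_food_s >= arr.bad_beyond_s
        else:
            far = time_to_food_s > arr.bad_beyond_s
        return "bad-news cue" if far else None
    raise ValueError("component must be 'good' or 'bad'")
```

Under `BAD_MORE_INFORMATIVE`, a peck in the bad-news component that produces no cue implies that food is due within 20 s, whereas the good-news cue only indicates food within 120 s. This is the sense in which the absence of bad news is the more precise temporal predictor in that arrangement.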
3. Results
Table 1 presents the results of this experiment based solely on performances during the last session of each condition. Except for P6 in Condition 1, a cue temporally proximal to the impending arrival of food (a good-news stimulus) supported more responding than a cue that signaled that food was temporally distant (a bad-news stimulus) across all conditions and birds. This difference was present when the good-news stimulus was more informative than the bad-news stimulus (Good news > Bad news). It was also present when the absence of the bad-news cue was the more effective predictor of impending reinforcement (Bad news more informative than Good news or Bad news > Good news). We use “informative” here to refer to temporal proximity to reinforcement, and not to probability of reinforcement. These results were immune to the order in which these conditions were tested (Condition 1 vs. Condition 2) and to the rates of responding particular component colors supported in prior conditions (Condition 3 vs. Conditions 1 and 2).
Table 1.
| Bird (1) | Condition (2) | Good news > Bad news: Good-news Component (3) | Good news > Bad news: Bad-news Component (4) | Bad news > Good news: Good-news Component (5) | Bad news > Good news: Bad-news Component (6) |
|---|---|---|---|---|---|
| P3 | 1 | 66 | 7.8 | | |
| | 2 | 36 | 5.4 | | |
| | 3 | 19.2 | 1.8 | | |
| P4 | 1 | 16.2 | 1.2 | | |
| | 2 | 68.4 | 1.2 | | |
| | 3 | 42 | 6 | | |
| P5 | 1 | 48 | 0 | | |
| | 2 | 40 | 0 | | |
| | 3 | 16.8 | 0 | | |
| P6 | 1 | 16.8 | 18 | | |
| | 2 | 62.4 | 4.8 | | |
| | 3 | 42 | 2.4 | | |
Column numbers are enclosed in parentheses. Columns 1 and 2, respectively, identify the bird and the experimental condition. Columns 3 and 4 present the response rate in responses/min during the green and red components for P3 and P4, respectively, and the yellow and green components for P5 and P6, respectively. These are the component rates prior to the arrival of the good-news or bad-news stimulus. Columns 5 and 6 are the same as 3 and 4 except for the predictive value of the two cuing stimuli. For columns 3 and 4, the designation “Good news > Bad news” means that the cue in the good-news component is more informative than the cue in the bad-news component. For columns 5 and 6, this designation is reversed.
4. Discussion
These results permit two unambiguous conclusions. First, as in many prior studies (e.g., Fantino and Silberberg, 2010; for reviews see Dinsmoor, 1983; Fantino, 1977), good news maintains observing and bad news does not. Even when the bad-news stimulus was more predictive of good news than the good-news stimulus (Bad news > Good news data in Table 1), each of the four birds showed a markedly higher rate for the good-news stimulus (the S+). In fact, the mean good-news-to-bad-news ratio when Bad news > Good news was 6.57 across birds. Second, these rates were largely unaffected by the fact that good news could, in principle, be more readily inferred from the S− than from the S+ (the mean ratio was 17.74 when good news was more informative and 6.57 when bad news was more informative). Despite this fact, birds did appear sensitive to the difference in conditions in that three of four showed higher ratios when Good news > Bad news (the fourth, P5, responded exclusively for the S+ in both conditions).
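The good-news-to-bad-news ratios discussed above are simple quotients of the component response rates in Table 1, and the arithmetic can be sketched as follows. The rates are copied from the table; treating a zero bad-news rate as an infinite ratio is an assumption made only for this illustration. Because the text does not say how such cases were handled when the ratios were averaged, the sketch does not attempt to reproduce the reported means.

```python
import math

# (bird, condition) -> (good-news component rate, bad-news component rate),
# in responses/min, copied from Table 1.
TABLE_1 = {
    ("P3", 1): (66.0, 7.8), ("P3", 2): (36.0, 5.4), ("P3", 3): (19.2, 1.8),
    ("P4", 1): (16.2, 1.2), ("P4", 2): (68.4, 1.2), ("P4", 3): (42.0, 6.0),
    ("P5", 1): (48.0, 0.0), ("P5", 2): (40.0, 0.0), ("P5", 3): (16.8, 0.0),
    ("P6", 1): (16.8, 18.0), ("P6", 2): (62.4, 4.8), ("P6", 3): (42.0, 2.4),
}


def good_to_bad_ratio(good_rate, bad_rate):
    """Ratio of good-news to bad-news responding.

    Assumption for illustration: exclusive good-news responding
    (bad_rate == 0) is reported as infinity.
    """
    return good_rate / bad_rate if bad_rate > 0 else math.inf


ratios = {key: good_to_bad_ratio(*rates) for key, rates in TABLE_1.items()}

# Rows in which the good-news component supported the higher rate.
good_news_higher = sorted(key for key, r in ratios.items() if r > 1)
```

On these numbers, the only row with a ratio below 1 is P6 in Condition 1, which matches the exception noted in the Results.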
A recent discussion in the behavioral ecology literature has parallels to the present research. McLinn and Stephens (2006) pitted the reliability of color on their “signal” key against the likelihood that a particular color was correct in a modified matching-to-sample procedure with blue jays similar to that used by Hartl and Fantino (1996) in their research on base-rate neglect with pigeons (see also Fantino et al., 2005). In both studies, jays’ and pigeons’ choices were controlled by whichever source of information was more reliable. In McLinn and Stephens’ terms the jays displayed environment tracking when the key color was more reliable (“base-rate sensitivity” for Hartl and Fantino) and signal tracking when the predictive value of the signal was higher (high “sample accuracy” for Hartl and Fantino). Their results are consistent with the present ones in suggesting that the efficacy of a stimulus, be it viewed as a conditioned reinforcer or as information in an optimal foraging task, requires it to be of predictive utility.
Despite this congruence we suggest that an account based on conditioned reinforcement is more comprehensive. For example, in McLinn and Stephens’ (2006) manipulations of predictive value between unreliable and reliable information (environment- and signal-tracking probabilities of 0.5 to 1.0), information, when it was provided, was always positive. What would have happened had the manipulations of information been cuing not the likelihood that a response leads to food, but rather that a response does not lead to food (probabilities from 0.0 to 0.5)? From an information perspective this change should not matter if the bad and good news are equally informative. But from the perspective of learning theory, whether the stimulus is paired with a positive or negative outcome is critical because stimuli that are paired with the absence of reward are typically not conditioned reinforcers. The predictive problem any optimality-based account of information confronts is that, at least for birds, there is considerable evidence, including the present results, that useful information is avoided when the message conveyed is negative. As noted earlier, this is not the case with humans (e.g., Fantino and Silberberg, 2010).
The results support the conclusions of Escobar and Bruner (2009) and of Fantino and Silberberg (2010) in underscoring the complexity of the role of the S− in the maintenance of the observing response, a complexity highlighted by the novel aspect of the results in the Bad news > Good news condition. Birds in this condition did not appear sensitive to the temporal information provided by the S−, perhaps because any conditioned reinforcement provided by the S− was less potent than the occasional temporal contiguity of the S+ and food. Owing to procedural variations across the studies, we cannot place great confidence in our suggestion of a species difference. In any event, the present results offer further support for the central role of good news in maintaining observing responses and are entirely consistent with the traditional conditioned-reinforcement interpretation of observing.
Acknowledgments
Research and manuscript preparation were supported by National Institute of Mental Health Grant MH57127 to the University of California, San Diego.
Contributor Information
Alan Silberberg, American University.
Edmund Fantino, University of California, San Diego.
References
- Case DA, Fantino E, Wixted J. Human observing: maintained by negative informative stimuli only if correlated with improvement in response efficiency. J Exp Anal Behav. 1985;43:289–300. doi: 10.1901/jeab.1985.43-289.
- Case DA, Ploog BO, Fantino E. Observing behavior in a computer game. J Exp Anal Behav. 1990;54:185–199. doi: 10.1901/jeab.1990.54-185.
- Dinsmoor JA. Observing and conditioned reinforcement. Behav Brain Sci. 1983;6:693–728.
- Escobar R, Bruner CA. Observing responses and serial stimuli: searching for the reinforcing properties of the S−. J Exp Anal Behav. 2009;92:215–231. doi: 10.1901/jeab.2009.92-215.
- Fantino E. Conditioned reinforcement: choice and information. In: Honig WK, Staddon JER, editors. Handbook of Operant Behavior. Prentice-Hall; Englewood Cliffs: 1977. pp. 313–339.
- Fantino E, Case DA. Human observing: maintained by stimuli correlated with reinforcement but not extinction. J Exp Anal Behav. 1983;40:193–210. doi: 10.1901/jeab.1983.40-193.
- Fantino E, Kanevsky IG, Charlton SR. Teaching pigeons to commit base-rate neglect. Psychol Sci. 2005;16:820–825. doi: 10.1111/j.1467-9280.2005.01620.x.
- Fantino E, Silberberg A. Revisiting the role of bad news in maintaining human observing behavior. J Exp Anal Behav. 2010;93:157–170. doi: 10.1901/jeab.2010.93-157.
- Fleshler M, Hoffman HS. A progression for generating variable-interval schedules. J Exp Anal Behav. 1962;5:529–530. doi: 10.1901/jeab.1962.5-529.
- Hartl J, Fantino E. Choice as a function of reinforcement ratios in delayed matching-to-sample. J Exp Anal Behav. 1996;66:11–27. doi: 10.1901/jeab.1996.66-11.
- McLinn CM, Stephens DW. What makes information valuable: signal reliability and environmental uncertainty. Anim Behav. 2006;71:1119–1129.
- Perone M, Baron A. Reinforcement of human observing behavior by a stimulus correlated with extinction or increased effort. J Exp Anal Behav. 1980;34:239–261. doi: 10.1901/jeab.1980.34-239.
- Wyckoff LB Jr. The role of observing responses in discrimination learning. Psychol Rev. 1952;59:68–78. doi: 10.1037/h0053932.