Author manuscript; available in PMC: 2014 Jan 1.
Published in final edited form as: Behav Processes. 2012 Nov 1;92:65–70. doi: 10.1016/j.beproc.2012.10.011

Pigeons Show Near-Optimal Win-Stay/Lose-Shift Performance on a Simultaneous-Discrimination, Midsession Reversal Task with Short Intertrial Intervals

Rebecca M Rayburn-Reeves 1, Jennifer R Laude 1, Thomas R Zentall 1
PMCID: PMC3601908  NIHMSID: NIHMS423517  PMID: 23123672

Abstract

Discrimination reversal tasks have been used as a measure of species flexibility in dealing with changes in reinforcement contingency. The simultaneous-discrimination, midsession reversal task is one in which one stimulus (S1) is correct for the first 40 trials of an 80-trial session and the other stimulus (S2) is correct for the remaining trials. After many sessions of training with this task, pigeons show a curious pattern of choices. They begin to respond to S2 well before the reversal point (they make anticipatory errors) and they continue to respond to S1 well after the reversal (they make perseverative errors). That is, they appear to be using the passage of time or number of trials into the session as a cue to reverse. We tested the hypothesis that these errors resulted in part from a memory deficit (the inability to remember over the intertrial interval, ITI, both the choice on the preceding trial and the outcome of that choice) by manipulating the duration of the ITI (1.5, 5, and 10 s). We found support for the hypothesis as pigeons with a short 1.5-s ITI showed close to optimal win-stay/lose-shift accuracy.

Keywords: discrimination learning, timing, memory, win-stay/lose-shift, midsession reversal, pigeons


Research on the speed with which animals learn to reverse an acquired discrimination has been used to study behavioral flexibility (Bitterman, 1975). In a serial reversal task, subjects are typically presented with two stimuli simultaneously, and choice of one stimulus is reinforced (S1+) whereas choice of the other stimulus is not (S2−). Subjects are trained on this discrimination until they reach a performance criterion or for a fixed number of trials, at which point the contingencies are reversed (S1−/S2+) and remain in effect until the subject again reaches criterion (or for a fixed number of trials). The measure of behavioral flexibility is the improvement in the number of trials it takes the animal to reach criterion (or improvement in accuracy with a fixed number of trials) with successive reversals. Serial reversal learning has been a common method of assessing behavioral flexibility because species tend to show varying levels of improvement in rate of acquisition with successive reversals.

Recently, a reversal learning procedure has been developed using a simultaneous discrimination in which a single reversal occurs consistently at the midpoint of each session (Cook & Rosen, 2010; Rayburn-Reeves, Molet, & Zentall, 2011; Rayburn-Reeves, Stagner, Kirk, & Zentall, in press). Behavioral flexibility studied with this paradigm allows for the assessment of strategies that animals can use to maximize reinforcement. The optimal strategy with this task has been referred to as win-stay/lose-shift. It can be described as an instruction to repeat the response from the last trial if it was correct but to switch to the alternative response if it was incorrect. If it is possible to adopt a win-stay/lose-shift rule with this task, it would result in only a single error (on the first reversal trial) in each session. Evidence suggests that pigeons are not able to use the feedback from the preceding trial efficiently, and instead they appear to use the reference memory for the location of the reversal in the session (the time or the number of trials into the session) to estimate the point of the reversal. This pattern of choice results in increasing anticipatory errors as the session midpoint approaches and then additional perseverative errors after the midpoint has been reached (Rayburn-Reeves et al., 2011).
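The claim that an ideal win-stay/lose-shift rule yields exactly one error per session can be checked with a minimal simulation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the agent starts the session by choosing S1 (as a well-trained subject might) and remembers the preceding choice and outcome perfectly.

```python
# Illustrative sketch, not the authors' procedure. All names are hypothetical.

def correct_stimulus(trial, reversal_after=40):
    """Return the reinforced stimulus on a 1-indexed trial."""
    return "S1" if trial <= reversal_after else "S2"


def simulate_wsls(n_trials=80, reversal_after=40, first_choice="S1"):
    """Simulate an ideal win-stay/lose-shift agent.

    Returns a list of (trial, choice, won) tuples.
    Assumption: the first choice of the session is S1 and memory is perfect.
    """
    other = {"S1": "S2", "S2": "S1"}
    choice = first_choice
    history = []
    for trial in range(1, n_trials + 1):
        won = choice == correct_stimulus(trial, reversal_after)
        history.append((trial, choice, won))
        if not won:
            # Lose-shift: switch to the alternative after nonreinforcement.
            choice = other[choice]
        # Win-stay: otherwise keep the same choice on the next trial.
    return history


history = simulate_wsls()
error_trials = [trial for trial, _, won in history if not won]
print(error_trials)  # [41]
```

Under these assumptions the only error falls on Trial 41, the first trial on which the outcome can signal that the contingencies have reversed.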

One hypothesis for the pigeons’ use of time into the session as a cue to reverse is that the predictability of the reversal encourages the pigeons to treat the passage of time or number of trials into the session as a reliable cue. However, pigeons also appear to use time or trial number as a cue even when the location of the reversal in the session is unpredictable (i.e., they tend to use the average time or trial number to the reversal as a cue; Rayburn-Reeves et al., 2011).

Although one might think that a win-stay/lose-shift rule would be relatively easy to acquire, careful analysis of the task may clarify why it is difficult for pigeons. To use a win-stay/lose-shift rule, the pigeon must remember not only the outcome of its last response but also the stimulus alternative to which it most recently responded. Remembering both may not appear to be too difficult either; however, in this research the typical intertrial interval (ITI) has been 5 s. Thus, the task can be viewed as a biconditional delayed-matching task. In a biconditional discrimination, the presence of one cue, such as a house light, signals the conditional discrimination that is in effect on a given trial. For example, Edwards, Miller, and Zentall (1985) trained pigeons on a task with four independent, 3-component rules: if the house light is on and the sample is red, choose the red comparison but if the sample is green, choose the green comparison; however, if the house light is off and the sample is red, choose the green comparison but if the sample is green, choose the red comparison.
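The four 3-component rules described for the Edwards, Miller, and Zentall (1985) task can be written as a small lookup. This is only a sketch of the rule structure as stated in the text; the function name and the string labels are hypothetical.

```python
# Sketch of the biconditional rule structure described above.
# Names and labels are hypothetical, chosen for illustration.

def correct_comparison(house_light_on, sample):
    """Return the reinforced comparison color for a given cue combination.

    House light on  -> matching (choose the comparison that matches the sample).
    House light off -> mismatching (choose the other comparison).
    """
    other = {"red": "green", "green": "red"}
    return sample if house_light_on else other[sample]


# Enumerate the four independent rules.
for light in (True, False):
    for sample in ("red", "green"):
        print(light, sample, "->", correct_comparison(light, sample))
```

The midsession reversal task has the same form if the location pecked plays the role of the sample and the outcome of that peck plays the role of the house light.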

A biconditional matching task even more similar in stimulus components to a spatial midsession reversal task (e.g., Rayburn-Reeves et al., 2011) was developed by Randall and Zentall (1997). In that study, pigeons were presented with a lit key on the left or on the right. Reinforcement for comparison choice depended on the combination of two cues, the location of the key that had been pecked and the outcome of that peck (in effect, a serial-compound sample). For example, one group of pigeons received reinforcement for choosing the comparison stimulus (left or right) if reinforcement had followed a peck to that location but reinforcement followed choice of the other comparison stimulus if reinforcement had not followed a peck to that location (i.e., a biconditional win-stay/lose-shift task).

Interestingly, when Randall and Zentall (1997) introduced a delay between the compound sample and the comparison choice in their biconditional win-stay/lose-shift task, matching accuracy rapidly declined from 90% at a 0-s delay to 64% at a 4-s delay. Thus, applying this finding to a further analysis of the difficulty that pigeons experience with the midsession-reversal task, when a 5-s delay (the ITI) is inserted between the serial-compound sample and comparison choice, the difficulty of the midsession-reversal task may be more apparent and the pigeons’ use of the passage of time into the session may seem less unreasonable. In fact, there is evidence from a study designed to test pigeons for their ability to acquire a win-stay/lose-shift response rule (Shimp, 1976) indicating that pigeons’ accuracy is strongly affected by the ITI (2.5, 4 or 6 s). Thus, if one views the midsession reversal as a biconditional delayed matching task, it should be possible to improve the accuracy of pigeons on this task by shortening the ITI. Furthermore, Ploog and Williams (2010) have found that in a simultaneous color discrimination with a reversal every 2 sessions, pigeons’ accuracy is significantly better when the ITI is 4 s as compared to 40 s long.

The purpose of the present experiment was to test the hypothesis that reducing the duration of the ITI would improve the ability of pigeons to use the location selected and the outcome of that choice as a basis for the subsequent choice instead of time or trial into the session. Specifically, if subjects are given a shorter delay between trials and fewer errors occur in the vicinity of the reversal, it would suggest that the ability to use the outcome of the previous trial is affected by the duration of the delay between trials (i.e., the ITI) and that timing may be used when it is difficult to remember the location pecked on the previous trial and the outcome of that peck.

Method

Subjects

Eleven White Carneau and one homing pigeon (Columba livia) ranging in age from 2 to 12 yrs served as subjects. All subjects had experience in previous unrelated studies involving simultaneous color discriminations, but had never been exposed to a reversal learning task. The pigeons were maintained at 85% of their free-feeding weight. They were individually housed in wire cages with free access to water and grit in a colony room that was maintained on a 12-hr/12-hr light/dark cycle. The pigeons were maintained in accordance with a protocol approved by the Institutional Animal Care and Use Committee at the University of Kentucky.

Apparatus

The experiment was conducted in a BRS/LVE (Laurel, MD) sound attenuating standard operant test chamber measuring 34 cm high, 30 cm from the response panel to the back wall, and 35 cm across the response panel. Three circular response keys (2.54 cm diameter) were aligned horizontally on the response panel and were separated from each other by 6.0 cm but only the left and right side keys were used in this experiment. The bottom edge of the response keys was 24 cm from the wire-mesh floor. A 12-stimulus in-line projector (Industrial Electronics Engineering, Van Nuys, CA) with 28-V, 0.1-A lamps (GE 1820), that projected a blue light (Kodak Wratten Filter No. 38) was mounted behind each of the two response keys. Mixed-grain reinforcement (Purina Pro Grains - a mixture of corn, wheat, peas, kafir and vetch) was provided from a raised and illuminated grain feeder, which was located behind a horizontally centered 5.1 × 5.7 cm aperture located vertically midway between the response keys and the floor of the chamber. Reinforcement consisted of 1.5-s access to the mixed grain. The experiment was controlled by a microcomputer and interface located in an adjacent room.

Procedure

Phase 1

Subjects were randomly assigned to one of three groups (n = 4, each), varying only in the duration of the ITI (1.5, 5.0, and 10.0 s): Group 1.5, Group 5, and Group 10, respectively. For all pigeons, at the start of each trial, both the left and right response keys were illuminated blue. For half of the subjects, a single peck to the left key (which served as S1 for these pigeons) turned off both keys and resulted in 1.5-s access to food, together with the ITI, whereas a response to the right key (which served as S2 for these pigeons) also turned off both keys but resulted in the ITI alone. For the other half of the pigeons, choice of the right key (S1) was reinforced and not the left key (S2). For the first 40 trials of each 80-trial session, subjects were trained with S1+/S2−. From Trial 41 to Trial 80, the contingencies were reversed (S2+/S1−). Subjects were trained for 40 sessions.

Phase 2

In Phase 2, two of the pigeons from Group 5 and Group 10 (randomly selected) were transferred to the 1.5-s ITI condition. The pigeons from Group 1.5 and the remaining pigeons from Group 5 and Group 10 continued on the procedure from Phase 1. The pigeons remained in Phase 2 for 80 sessions.

Phase 3

To obtain another measure of flexibility, all pigeons were transferred to a procedure in which the point of the reversal in the session was variable. On each session the reversal could occur randomly in any one of five locations (i.e., following Trial 10, Trial 25, Trial 40, Trial 55, or Trial 70) but only once per session. The ITI remained the same as it was during Phase 2. Thus, there were now pigeons that had always had ITIs of 1.5 s, pigeons that had been transferred from 5-s and 10-s ITIs to 1.5 s, and pigeons that had always had ITIs of 5 or 10 s. The pigeons remained in Phase 3 for 20 sessions at each point of reversal (a total of 100 sessions).
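The Phase 3 design (five possible reversal points, 20 sessions at each, 100 sessions in total) can be sketched as a session schedule. The text does not say how the order of reversal points was randomized, so the shuffle below is an assumption, as are the function names and the fixed seed.

```python
import random

# Reversal occurs after one of these trials (from the Phase 3 description).
REVERSAL_POINTS = (10, 25, 40, 55, 70)


def phase3_schedule(sessions_per_point=20, seed=0):
    """Return a randomized list of reversal points, one per session.

    Assumption: the order is a simple shuffle of 20 sessions per point;
    the actual randomization method is not given in the text.
    """
    rng = random.Random(seed)
    schedule = [p for p in REVERSAL_POINTS for _ in range(sessions_per_point)]
    rng.shuffle(schedule)
    return schedule


def session_contingencies(reversal_after, n_trials=80):
    """Return the reinforced stimulus for each trial of one session."""
    return ["S1" if t <= reversal_after else "S2" for t in range(1, n_trials + 1)]


schedule = phase3_schedule()
print(len(schedule))                            # 100
print(session_contingencies(10).count("S1"))    # 10
```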

Results

Phase 1

Group 5

As in previous experiments, pigeons reached a stable level of performance in about 20 sessions and improved minimally over the next 20 sessions. The percentage choice of the first correct stimulus (S1) as a function of trial number (in blocks of five trials) averaged over subjects for the last 10 training sessions (Sessions 31–40) appears in Figure 1. As can be seen in the figure, the pigeons chose S1 almost exclusively during early trials in each session; choice of S1 then declined prior to the reversal and continued to decline to almost exclusive choice of S2 following the reversal. Accuracy was at 70.0% correct (choice of S1) during the 5 trials immediately before the reversal (very similar to that of previous research; Rayburn-Reeves et al., 2011) and it was at 63.0% correct (37.0% choice of S1) during the 5 trials immediately following the reversal (also similar to that of previous research). Over the last 10 training sessions, pigeons in Group 5 made an average of 8.1% errors per session.

Figure 1.

Percentage choice of S1 as a function of trial number (in blocks of 5 trials) averaged over Sessions 31–40 for Group 1.5 (filled circles), Group 5 (open circles), and Group 10 (open triangles). Error bars = ±1 sem.

Group 1.5

Consistent with the results from Group 5, subjects that were exposed to the 1.5-s ITI procedure also stabilized in approximately 20 sessions; however, in contrast to subjects in Group 5, by Session 30, pigeons in Group 1.5 showed markedly different performance prior to the reversal and maintained that difference for the last 10 sessions of training. The percent choice of the first correct stimulus (S1) as a function of trial number (in blocks of five trials) averaged over subjects for the last 10 training sessions (Sessions 31–40) also appears in Figure 1. As can be seen in the figure, subjects consistently chose S1 almost exclusively during the first half of the session, showing little decline prior to the reversal. Percentage choice of S1 for the five trials immediately preceding the reversal was 96.5%. An independent samples t-test indicated that overall, the difference in errors between Group 5 and Group 1.5 over the last 10 sessions was significant, t(6) = 3.02, p = .02. Following the reversal, Group 1.5 showed similar levels of accuracy to Group 5, resulting in accuracy levels of 62.5% correct (37.5% choice of S1) for the five trials immediately following the reversal, and then responding almost exclusively to S2 on the remaining trials in the session. Thus, the drop in choice of S1 from Trials 36–40 to Trials 41–45 for Group 1.5 (59.0%) was considerably greater than for Group 5 (38.5%), t(6) = 3.14, p = .02. Over the last 10 training sessions, pigeons in Group 1.5 made an average of 4.2% errors per session or a little more than half as many errors as pigeons in Group 5.

Group 10

The performance of Group 10 was almost identical to that of Group 5, with accuracy for the five trials prior to the reversal at 67.5%, and 64.0% for the five trials immediately following the reversal (36% choice of S1). Data averaged over the last 10 sessions of training also can be seen in Figure 1. Over the last 10 training sessions, pigeons in Group 10 made an average of 7.2% errors per session, about the same as pigeons in Group 5 and considerably more than the pigeons in Group 1.5.

Trial by Trial Analyses

A more detailed depiction of the differences in performance of the three groups near the reversal appears in Figure 2. The figure shows the trial by trial plot of choice of the first correct stimulus over the block of trials immediately prior to the reversal (Trials 36–40) and immediately after the reversal (Trials 41–45) for Sessions 31–40. The average drop in choice of S1 from Trial 41 (the first trial that provided feedback that the reversal had occurred) to 42 was 50% for Group 1.5. This sharp drop in choice of S1 indicates that the pigeons in this group used the feedback from the choice and outcome of the previous trial as a cue to switch responding from S1 to S2.

Figure 2.

Percentage choice of S1 as a function of trial number for Trials 36–40 (immediately prior to the reversal) and Trials 41–45 (immediately after the reversal) averaged over Sessions 31–40 for Group 1.5 (filled circles), Group 5 (open circles), and Group 10 (open triangles). Error bars = ±1 sem.

To get a measure of the degree to which the pigeons came close to the ideal win-stay/lose-shift rule we examined the mean accuracy on Trials 36 to 45 with accuracy on Trial 41 (the outcome of which was the first cue to shift) omitted. Mean accuracy on those 9 trials pooled over Sessions 31–40 was 87.8%, 70.3%, and 68.9% for Groups 1.5, 5, and 10, respectively. An independent samples t-test indicated that percent accuracy was significantly greater for Group 1.5 than for Group 5, t(6) = 3.03, p = .007, and for Group 10, t(6) = 2.88, p = .01; however, there was not a statistically significant difference between Groups 5 and 10, t < 1.
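The near-reversal accuracy measure (Trials 36 to 45 with Trial 41 omitted) can be computed from a single session's choices as follows. This is a sketch under the assumption that choices are stored as a list of "S1"/"S2" labels in trial order; the function name and data layout are hypothetical.

```python
def near_reversal_accuracy(choices, reversal_after=40, window=5):
    """Percent correct on the `window` trials before and after the reversal,
    omitting the first post-reversal trial (whose outcome is the first cue
    to shift).

    `choices` is a list of "S1"/"S2" labels; index 0 is Trial 1.
    """
    first_reversal_trial = reversal_after + 1
    scored = 0
    correct = 0
    for trial in range(reversal_after - window + 1, reversal_after + window + 1):
        if trial == first_reversal_trial:
            continue  # Trial 41 is omitted from the measure
        target = "S1" if trial <= reversal_after else "S2"
        scored += 1
        correct += choices[trial - 1] == target
    return 100.0 * correct / scored


# An ideal win-stay/lose-shift session: S1 through Trial 41, then S2.
ideal = ["S1"] * 41 + ["S2"] * 39
# A hypothetical anticipating session: the subject switches to S2 on Trial 39.
anticipator = ["S1"] * 38 + ["S2"] * 42

print(near_reversal_accuracy(ideal))        # 100.0
print(round(near_reversal_accuracy(anticipator), 1))  # 77.8
```

With Trial 41 omitted, 9 trials are scored, so an ideal win-stay/lose-shift session scores 100% on this measure.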

If subjects were using a win-stay/lose-shift rule, this should be evidenced by sessions in which subjects produce only one error on the trial in which the reversal took place (i.e., an error on Trial 41). Therefore, an additional measure of accuracy is the number of one-error sessions (perfect scores). Over the 40 sessions, subjects in Group 1.5 averaged 5.2 one-error sessions, compared to Groups 5 and 10, which averaged 2.0 and 0.5 one-error sessions, respectively (see Figure 3). The difference in number of one-error sessions between Group 1.5 and the other two groups was statistically significant, F(1,10) = 4.95, p = .05, but the difference between Group 5 and Group 10 was not, t(6) = 1.34, p = .23.
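A one-error session, as defined above, is a session whose only error falls on the first reversal trial. A minimal sketch of that check, assuming the same hypothetical list-of-labels representation of a session, might look like this:

```python
def error_trials(choices, reversal_after=40):
    """Return the 1-indexed trials on which the nonreinforced stimulus was chosen."""
    return [
        trial
        for trial, choice in enumerate(choices, start=1)
        if choice != ("S1" if trial <= reversal_after else "S2")
    ]


def is_one_error_session(choices, reversal_after=40):
    """True only if the single error is on the first post-reversal trial."""
    return error_trials(choices, reversal_after) == [reversal_after + 1]


def count_one_error_sessions(sessions, reversal_after=40):
    """Count one-error sessions in a list of sessions."""
    return sum(is_one_error_session(s, reversal_after) for s in sessions)


# Hypothetical example sessions (not data from the experiment).
ideal = ["S1"] * 41 + ["S2"] * 39         # error on Trial 41 only
anticipator = ["S1"] * 38 + ["S2"] * 42   # errors on Trials 39 and 40
print(count_one_error_sessions([ideal, anticipator, ideal]))  # 2
```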

Figure 3.

Mean number of one-error sessions during training for Group 1.5, Group 5, and Group 10. Error bars = ±1 sem.

Phase 2

Pigeons Transferred from 5-s and 10-s ITIs to 1.5-s ITIs

The pigeons that were transferred to the 1.5-s ITIs gradually acquired the pattern of responding of the pigeons that were on 1.5-s ITIs from the start of training. The transfer data from Sessions 61–80 are presented in Figure 4. For comparison purposes the data from the pigeons that remained on 1.5-s ITIs and those that remained on 5- and 10-s ITIs also appear in Figure 4. As can be seen in the figure, not only did the transferred pigeons develop a pattern of choice much like the pigeons that had had 1.5-s ITIs from the start but they performed better on the block of 5 trials immediately before the reversal than the pigeons that remained on the longer ITIs. A t-test performed on the data from the block of 5 trials immediately before the reversal indicated that the pigeons on the 1.5-s ITIs were significantly more accurate than the pigeons that remained on the 5- and 10-s ITIs, t(10) = 2.28, p = .05.

Figure 4.

Percentage choice of S1 as a function of trial number (in blocks of 5 trials) averaged over Sessions 61–80. Half of the pigeons in Groups 5 and 10 were shifted to 1.5-s ITIs (from 5 and 10 s to 1.5 s) while the remaining pigeons in those groups stayed with 5- and 10-s ITIs, respectively, and the pigeons from Group 1.5 continued to have 1.5-s ITIs. Error bars = ±1 sem.

Trial by Trial Analyses

A more detailed depiction of the differences in performance of the three groups near the reversal appears in Figure 5. The figure shows the trial by trial plot of choice of the first correct stimulus over the block of trials immediately prior to the reversal (Trials 36–40) and immediately after the reversal (Trials 41–45) for Sessions 61–80. The average drop in choice of S1 from Trial 41 (the first trial that provided feedback that the reversal had occurred) to 42 was 80.0% for the group that was transferred to 1.5-s ITIs and the drop was 76.2% for Group 1.5 (that had been on 1.5-s ITIs from the start). This sharp drop in choice of S1 indicates that the pigeons in these groups used the feedback from the choice and outcome of the previous trial as a cue to switch responding from S1 to S2. The comparable drop in choice of S1 for the pigeons that remained on 5- and 10-s ITIs was only 23.8%, a statistically significant difference, t(10) = 3.35, p < .01.

Figure 5.

Percentage choice of S1 as a function of trial number for Trials 36–40 (immediately prior to the reversal) and Trials 41–45 (immediately after the reversal) averaged over Sessions 61–80 for Group 1.5, Groups 5 and 10 combined and for the pigeons transferred from Groups 5 and 10 to Group 1.5 (Trans to 1.5-s ITI). Error bars = ±1 sem.

Phase 3

The data from the variable point of reversal for the last 25 (of the 100) sessions of Phase 3 are presented in Figure 6. The data from Group 1.5 (that had been on 1.5-s ITIs from the start) are presented at the top panel of Figure 6. There it can be seen that the pigeons in Group 1.5 responded quickly to the reversal regardless of when it occurred during the session. However, even for this group there was a small number of perseverative errors when the reversal occurred early in the session (following Trial 10) but very few anticipatory errors when the reversal occurred late in the session (following Trial 55 and Trial 70).

Figure 6.

Percentage choice of S1 as a function of trial number plotted separately for each point of reversal in the session for Group 1.5 (top), pigeons from Groups 5 and 10 that were transferred to 1.5-s ITIs (middle), and the pigeons from Groups 5 and 10 combined (bottom) that remained in the same ITI condition.

The data from the pigeons that were transferred from the 5- and 10-s ITIs to the 1.5-s ITI are presented in the middle panel of Figure 6. There it can be seen that the transferred pigeons were almost as sensitive to the reversal as the pigeons in Group 1.5. They did make a few more perseverative errors at most of the reversal points and, overall, a few more anticipatory errors.

The data from the pigeons that remained on the 5-s and 10-s ITI are presented in the bottom panel of Figure 6. These pigeons continued to be relatively insensitive to the reversal no matter when it occurred in the session and the relative insensitivity was especially apparent in anticipatory errors late in the session.

Discussion

The results of the current experiment support our hypothesis that the duration of the ITI is important in the pigeon’s ability to use the cues provided on the most recent trial (the stimulus selected and the outcome) for the choice response on the following trial. However, when the ITI exceeds a few seconds, pigeons rely very little on those cues and instead rely on the reference memory for the location of the reversal in the session as a basis for choice. Using timing or trial-estimation cues results in errors in the vicinity of the reversal (midsession) and significantly reduces the overall level of reinforcement.

In effect, the results of the current experiment indicate that pigeons can use a win-stay/lose-shift rule as long as the ITI is quite short. This experiment is the first to provide evidence of near perfect win-stay/lose-shift behavior in pigeons in reversal learning. The results of the present experiment suggest that at shorter delays, the memory for the most recent response choice-outcome combination is sufficient for the pigeons to develop a win-stay/lose-shift rule. The fact that subjects in both Group 5 and 10 were not able to use those cues efficiently (evidenced by the number of errors on trials in the vicinity of the reversal), suggests that the delay between trials caused the pigeons to rely on the passage of time or trial number into the session as an additional cue.

That the deficits in correct choice are related to the use of timing (or trial number estimation) cues, perhaps combined with reduced memory for the most recent response-choice outcome, rather than merely less sensitivity to the feedback from the reversal, is suggested by the fact that most of the difference in accuracy on this task between Group 1.5 and the other two groups occurred prior to the reversal. Although subjects in Groups 5 and 10 made more errors prior to the reversal than subjects in Group 1.5, the additional use of timing or trial-number estimation cues may have allowed the pigeons in the long ITI groups to maintain higher levels of performance accuracy than otherwise, based solely on the most recent response-choice outcome.

One hypothesis for why pigeons in Groups 5 and 10 were unable to use a win-stay/lose-shift rule is that the long ITI made it more likely that the pigeons moved away from the key they had just pecked and thus lost a position cue that would have helped them to bridge their memory for their last pecked location. Evidence from video tapes of the sessions was consistent with the hypothesis that the pigeons’ movement may have contributed to the increase in errors by pigeons in Group 5 and Group 10. From the video tapes it could be seen that pigeons in Group 1.5 would stand in front of the feeder, peck the key most recently pecked (unless reinforcement did not follow that peck), eat from the feeder, and immediately return to the key they had just pecked, whereas pigeons in Groups 5 and 10 were generally more mobile during the ITIs. Thus, pigeons in the 1.5-s ITI condition may have used a form of procedural memory (repetitive movements from key to feeder and back to key) until the absence of the raised feeder disrupted this behavioral pattern. If this hypothesis is correct, it would suggest that if the task were changed such that the discriminative stimuli were red and green hues that varied in location from trial to trial, reduction in the duration of the ITI would have a more limited effect on reductions in the pigeons’ error rate because with a hue discrimination, the location of the positive discriminative stimulus would not be predictable from trial to trial. However, the results of such a manipulation are not obvious because Rayburn-Reeves et al. (2011) found that with the midsession reversal using a 5-s ITI there was virtually no difference in error rate between a spatial discrimination reversal task and a hue discrimination reversal task. Of course it may be that with shorter ITIs a difference might have emerged.

The hypothesis that the short ITIs allowed for more accurate pre-reversal than post-reversal performance is supported by the results presented in Figures 1 and 4. There it can be seen that virtually all of the reduction in errors with shorter ITIs occurred prior to the reversal (anticipatory errors). However, as can be seen in Figure 2 and especially in Figure 5, the absence of an effect of shorter ITIs on perseverative errors in Figures 1 and 4 may have been artifactually produced by inclusion of the results of Trial 41, the first trial on which feedback could have been obtained that the reversal had occurred. Furthermore, the mere fact that the longer ITIs resulted in so many anticipatory errors meant that there would very likely be a reduction in perseverative errors following the reversal because the response that would be considered an anticipatory error prior to the reversal would be considered as a correct response following the reversal. That is, on Trial 41, the groups with long ITIs were already making fewer perseverative errors than the group with short ITIs.

A consequence of the shortened ITI for pigeons in Group 1.5 was that their session was considerably shorter than for the other two groups. Thus, independently of their higher accuracy, they effectively received a higher rate of reinforcement than the pigeons in Groups 5 and 10. However, it is unlikely that this difference contributed to the more accurate performance by the pigeons in Group 1.5 because massed practice (1.5-s ITI) is generally less efficient than distributed practice (5- and 10-s ITI). Thus, it is not clear how the higher density of reinforcement would have resulted in more accurate choices by the pigeons in Group 1.5.

Relative to 5-s ITIs, 1.5-s ITIs had a large effect on the pigeons’ ability to use feedback cues from the preceding trial; however 10-s ITIs did not. The absence of a difference between the functions for Groups 5 and 10 in Figures 1, 2, and 3 suggests that the pigeons in those groups may have been using numerical estimation rather than a temporal cue to estimate the occurrence of the reversal. Had they been using a temporal cue, we should have seen a difference in accuracy between the two groups because there should have been more errors in estimating the midpoint of the session when the ITI was 10 s than when it was 5 s. In the case of the 10-s ITI, the session was almost twice as long. However, if they were estimating the number of trials completed, one might expect the pigeons in the two groups to be about equally accurate in estimating the midpoint of the session because in both cases there were always 40 trials before the reversal.
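The statement that the 10-s ITI session was almost twice as long as the 5-s ITI session can be illustrated with rough arithmetic. The sketch below is not based on recorded session times: it assumes that feeder access (1.5 s per reinforced trial) adds to the ITI rather than overlapping it, and the response latency and the number of reinforced trials are hypothetical placeholder values.

```python
def session_seconds(iti_s, n_trials=80, n_reinforced=0, feeder_s=1.5, latency_s=0.0):
    """Approximate session duration in seconds.

    Assumptions (not from the source): feeder time is additive to the ITI,
    and `latency_s` is a hypothetical per-trial response latency.
    """
    return n_trials * (iti_s + latency_s) + n_reinforced * feeder_s


# ITI time alone: 80 * 10 s versus 80 * 5 s, a ratio of exactly 2.
iti_only_ratio = session_seconds(10.0) / session_seconds(5.0)

# Adding components that do not scale with the ITI pulls the ratio below 2.
# 70 reinforced trials and a 1-s latency are placeholder values.
with_overhead_ratio = (
    session_seconds(10.0, n_reinforced=70, latency_s=1.0)
    / session_seconds(5.0, n_reinforced=70, latency_s=1.0)
)

print(iti_only_ratio)        # 2.0
print(with_overhead_ratio < 2.0)  # True
```

Because feeder access and response time are the same in both groups, the ratio of session durations falls somewhat below 2, which is consistent with "almost twice as long."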

The results of Phase 2 demonstrate that the pigeons transferred from longer to shorter ITIs reduced their tendency to make anticipatory errors and showed a rapid response to the outcome of the first reversal trial. The results of Phase 3 show that pigeons with short ITIs learned to use the feedback from the first nonreinforced response as a cue to reverse whenever it occurred in the session. However, pigeons with long ITIs continued to have a difficult time using nonreinforcement as a cue to reverse. Instead, they appeared to continue to use the midpoint in the session as a cue to reverse, and this strategy resulted in perseverative errors when the reversal came early in the session and especially anticipatory errors when the reversal occurred late in the session. Although in the present experiment the continued use of the midpoint in the session as a cue to reverse may have carried over from training in Phase 1 (where the reversal always came after the midpoint of the session), surprisingly, in earlier research we found that when pigeons were trained with 5-s ITIs with the variable point of reversal procedure from the start of training, they too made more perseverative errors when the reversal occurred early in the session and more anticipatory errors when the reversal occurred late in the session (Rayburn-Reeves et al., 2011).

The procedures used in the present experiment (as well as Rayburn-Reeves et al., 2011 and Rayburn-Reeves et al., in press), unlike those of earlier reversal experiments, allow one to assess the pigeon’s ability to anticipate a reversal. The results suggest that relatively simple tasks such as a predictable (or unpredictable) simultaneous discrimination reversal may involve different learning mechanisms including timing or number estimation, memory for the compound stimulus composed of the stimulus responded to and the outcome of that response from the preceding trial, and possibly a repetitive motor response. Furthermore, the time between trials can affect which of these mechanisms has the greater influence on task performance. Thus, the results of the present experiment extend our understanding of the mechanisms by which animals deal with anticipated reversals of simultaneous discriminations and provide additional means of assessing the flexibility of learning by animals.

Highlights.

  1. Reversal learning can provide a measure of an animal’s behavioral flexibility

  2. Midsession reversal of a simultaneous discrimination allows for a crude means of anticipating a reversal

  3. Pigeons tend to anticipate the reversal and perseverate following the reversal rather than adopting a win-stay/lose-shift ‘strategy’

  4. We tested the hypothesis that poor memory for the last response and its outcome were responsible

  5. When we shortened the intertrial interval, near-perfect win-stay/lose-shift performance resulted

  6. It appears that memory rather than insensitivity to local outcome constrains pigeons’ performance on this task

Acknowledgments

This research was supported by National Institute of Mental Health Grant 63726 and by National Institute of Child Health and Development Grant 60996.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Bitterman ME. The comparative analysis of learning. Science. 1975;188:699–709. doi: 10.1126/science.188.4189.699.
  2. Cook RG, Rosen HA. Temporal control of internal states in pigeons. Psychonomic Bulletin & Review. 2010;17:915–922. doi: 10.3758/PBR.17.6.915.
  3. Edwards CA, Miller JS, Zentall TR. Control of pigeons’ matching and mismatching performance by instructional cues. Animal Learning and Behavior. 1985;13:383–391.
  4. Ploog BO, Williams BA. Serial discrimination reversal learning in pigeons as a function of intertrial interval and delay of reinforcement. Learning & Behavior. 2010;38:96–102. doi: 10.3758/LB.38.1.96.
  5. Randall CK, Zentall TR. Win-stay/lose-shift and win-shift/lose-stay learning by pigeons in the absence of overt response mediation. Behavioural Processes. 1997;41:227–236. doi: 10.1016/s0376-6357(97)00048-x.
  6. Rayburn-Reeves RM, Molet M, Zentall TR. Simultaneous discrimination reversal learning in pigeons and humans: Anticipatory and Perseverative Errors. Learning & Behavior. 2011;39:125–137. doi: 10.3758/s13420-010-0011-5.
  7. Rayburn-Reeves RM, Stagner JP, Kirk CR, Zentall TR. Reversal learning in rats and pigeons: Qualitative differences in behavioral flexibility. Journal of Comparative Psychology. doi: 10.1037/a0026311. (in press)
  8. Shimp CP. Short-term memory in the pigeon: The previously reinforced response. Journal of the Experimental Analysis of Behavior. 1976;26:487–493. doi: 10.1901/jeab.1976.26-487.
