Abstract
In the Monty Hall dilemma, an individual chooses between three options, only one of which will deliver a prize. After the initial choice, one of the non-chosen options is revealed as a losing option, and the individual can choose to stay with the original choice or switch to the other remaining option. Previous studies have found that most adults stay with their initial choice, although the chances of winning are 2/3 for switching and 1/3 for staying. Pigeons, college students, and preschool children were given many trials on this task to examine how their choices might change with experience. The college students began to switch on a majority of trials much sooner than the pigeons, contrary to the findings by Herbranson and Schroeder (2010) that pigeons perform better than people on this task. In all three groups, some individuals approximated the optimal strategy of switching on every trial, but most did not. Many of the preschoolers immediately showed a pattern of always switching or always staying and continued this pattern throughout the experiment. In a condition where the probability of winning was 90% after a switch, all college students and all but one pigeon learned to switch on nearly every trial. The results suggest that one main impediment to learning the optimal strategy in the Monty Hall task, even after repeated trials, is the difficulty in discriminating the different reinforcement probabilities for switching versus staying.
Keywords: Monty Hall dilemma, reinforcement probability, pigeons, college students, preschool children
When the behaviors of non-humans and humans are studied in similar choice situations, similar behavior patterns are frequently found. For instance, numerous studies on choice with concurrent variable-interval schedules have found that the response proportions of both non-humans (e.g., Herrnstein, 1961; Schneider & Lickliter, 2010) and humans (e.g., Ecott & Critchfield, 2004; McDowell & Caron, 2010) tend to approximate the reinforcement proportions, such that their response distributions can be described by the generalized matching law (Baum, 1979; Davison & McCarthy, 1988). In choices between immediate and delayed reinforcers, both non-humans (e.g., Mazur & Biondi, 2009; Woolverton, Myerson, & Green, 2007) and humans (e.g., Green, Fry, & Myerson, 1994) display patterns of delay discounting that are well described by a hyperbolic equation (see Madden & Bickel, 2010; Odum, 2011). In some cases, as when choosing between fixed-ratio and progressive-ratio reinforcement schedules, the response patterns of both non-humans (Hineline & Sodetz, 1987) and humans (Wanchisen, Tatham, & Hineline, 1992) approximate the optimal solution, minimizing the number of responses needed per reinforcer. In other cases, both non-humans and humans appear to make the same logical errors in decision-making, such as base-rate neglect (Fantino, Kanevsky, & Charlton, 2005) or the sunk cost effect (Navarro & Fantino, 2005). Interestingly, young children perform better than older children and adults on base-rate problems under certain conditions (DeNeys & Vanderputte, 2011; Jacobs & Potenza, 1991), for example, when there is less familiarity with stereotypes that conflict with base-rate information (DeNeys & Vanderputte, 2011).
Because cross-species similarities are so common in research on choice, cases where non-humans and humans exhibit differences in choice behavior are of special interest, as are cases where there are differences between adult humans and children, particularly when the children demonstrate an advantage (Chi, 1978). A common element among and across species is that learning evolves from variation in production and choice of strategies (Siegler, 2004). Comparisons between species, as well as developmental comparisons within a species, provide information about “which aspects of problem solving derive from the task environment and which from the characteristics of the subject” (Klahr, 1978, p. 183). Recently, Herbranson and Schroeder (2010) reported one such case, involving a comparison of pigeons and adult humans on a task known as the Monty Hall dilemma.
The Monty Hall dilemma, named after the host of the game show Let’s Make a Deal, represents a choice illusion studied since the early 1980s. It has been variously called a “tenacious brain teaser” (Krauss & Wang, 2003), a choice anomaly (Friedman, 1998), and a cognitive illusion (Piattelli-Palmarini, 1994), and it has been summed up as “notoriously difficult” (De Neys & Verschueren, 2006). Notably, the psychological factors underlying non-optimal performance on the task (i.e., staying) were, in part, responsible for the financial success of the Let’s Make a Deal television game show (Friedman, 1998).
The Monty Hall dilemma involves a person choosing one of three doors. One door has a big prize (e.g., a new car), and the other two have much less desirable outcomes (e.g., a goat). After a person makes a choice, one of the two unchosen doors is opened, revealing an inferior prize. The person is then asked whether he/she would like to stay with the original choice or to switch to the other unopened door. Most people in this situation decide to stay with their original choice, which in this case is a non-optimal strategy. Although it is counterintuitive, the odds of winning are 1/3 for a stay response, but 2/3 for a switch response. Studies have shown staying rates (non-optimal performance) ranging from 79 to 91% (see Burns & Wieth, 2004; Friedman, 1998; Granberg, 1999; Krauss & Wang, 2003).
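The 1/3 versus 2/3 odds can be checked with a short simulation. The following Python sketch is illustrative only and is not part of any procedure reported here; the function names, the seed, and the number of simulated trials are assumptions chosen for the example. It assumes that the host always opens an unchosen door that does not hide the prize.

```python
import random

def play_trial(switch, rng):
    """Play one trial of the three-door game; return True if the prize is won."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    first_pick = rng.choice(doors)
    # Assumption: the host opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d != first_pick and d != prize])
    if switch:
        # Switching means taking the one door that is neither picked nor opened.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == prize

def win_rate(switch, n_trials=100_000, seed=0):
    """Estimate the proportion of wins for an always-stay or always-switch player."""
    rng = random.Random(seed)
    wins = sum(play_trial(switch, rng) for _ in range(n_trials))
    return wins / n_trials

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With a large number of simulated trials, the estimated winning proportion should fall close to 1/3 for staying and close to 2/3 for switching.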
Specific heuristics are believed to be involved in solving the Monty Hall task, and it is reliance on these heuristics that presumably leads to non-optimal performance. The first heuristic, sometimes called the equiprobability heuristic, is the belief that in the second round of the game, when one door has been removed, the chances of winning by staying are 50% (De Neys & Verschueren, 2006). College-aged students almost universally believed that staying and switching were equally likely to win (De Neys, 2007). Since there is no perceived reason to switch, the predominant response is to stay with the original choice. This heuristic does not take into consideration that Monty Hall would not have revealed a winning door (see the collider principle; Burns & Wieth, 2004). In addition to the equiprobability heuristic, there is a strong bias to stay with original choices, possibly reflecting anticipation of regret (see Gilovich, Medvec & Chen, 1995). Thus, the equiprobability heuristic, which leads to the perception that there is no advantage to switching, is combined with people’s reluctance to change from their original response.
Herbranson and Schroeder (2010) sought to examine how the performances of college students and pigeons might differ, given that the heuristics directing the adult responses would not be directing those of the pigeons. They also asked whether both groups would improve if they were given repeated trials, i.e., an iterative procedure, so that they could obtain direct and extensive experience with the consequences of staying or switching. This iterative procedure is contrasted with the standard Monty Hall procedure where only one trial is given (Burns & Wieth, 2004). In Herbranson and Schroeder, on each trial, the pigeons first chose between three response keys, and then between two of the keys (the one they had just pecked, plus one of the others). The task was the same for the college students, except that they chose between three squares, and then between two squares, on a computer touchscreen. The reinforcers were food for the pigeons and points for the college students. With the students, the experimenters were careful not to use the words “stay” or “switch”—they were simply told to choose among the squares and try to win as many points as possible. The students received 200 trials in a single session. The pigeons received 30 sessions, most of which included 100 trials.
On the early trials, the switching percentages for both the college students and the pigeons were below 50%, but as the trials progressed, the percentages increased for both groups. By the end of their 200 trials, the average switching percentage for the college students was roughly 67%. Herbranson and Schroeder (2010) noted that this performance was consistent with the theory of probability matching (where an individual’s response percentages match the reinforcement probabilities). In contrast, in the pigeons’ 30th session, the mean switching percentage was approximately 95%. This nearly exclusive preference for switching was close to the optimal strategy, which is to switch on every trial. Based on these results, Herbranson and Schroeder concluded that the performance of pigeons is better than that of humans on the Monty Hall task. They proposed that this was because humans, unlike pigeons, rely on heuristics, such as a tendency to stick to one’s initial decision (Gilovich et al., 1995), that interfered with their ability to learn the optimal strategy on this task. It should be noted, however, that the pigeons had received many more trials than the college students (almost 3000 trials versus 200 trials). Herbranson and Schroeder did not provide data on the performance of the pigeons and college students after an equivalent number of trials.
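The practical cost of probability matching relative to exclusive switching can be made concrete with a small calculation. The Python sketch below is illustrative only; the function name and its default arguments are assumptions introduced for the example. It takes the standard winning probabilities (2/3 for a switch, 1/3 for a stay) and computes the expected proportion of wins for a chooser who switches on a given proportion of trials.

```python
def expected_win_rate(p_switch, win_if_switch=2/3, win_if_stay=1/3):
    """Expected proportion of wins when switching on a fraction p_switch of trials."""
    return p_switch * win_if_switch + (1 - p_switch) * win_if_stay

# Probability matching: switch on 2/3 of trials, stay on 1/3.
matching = expected_win_rate(2/3)       # (2/3)(2/3) + (1/3)(1/3) = 5/9

# Optimal strategy: switch on every trial.
optimal = expected_win_rate(1.0)        # 2/3

if __name__ == "__main__":
    print(f"matching: {matching:.3f}")
    print(f"optimal:  {optimal:.3f}")
```

Under these assumptions, a chooser who matches wins on 5/9 (about 56%) of trials, compared with 2/3 (about 67%) for a chooser who always switches, so the difference in obtained reinforcement between the two patterns is modest.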
The three experiments reported here were designed to collect further data about potential differences between the choices of pigeons and humans (college undergraduates and preschoolers) in the Monty Hall task, and to examine some possible reasons why adult humans have not performed well on this task in previous studies. The first experiment, with pigeons as subjects, used a procedure that closely replicated the one used by Herbranson and Schroeder (2010). The pigeons were tested on the standard Monty Hall task for 30 sessions of 100 trials each, with three response keys as the three alternatives, and food as the reinforcer. As is usually the case in the Monty Hall task, the probability of earning a reinforcer was 1/3 if the pigeon stayed with its original choice, but 2/3 if it switched to the other remaining key on its second choice. Because our pigeons did not perform as well as those of Herbranson and Schroeder, we also included a condition in which the reinforcement probabilities were 90% for switching and 10% for staying with the original choice.
In addition, previous findings suggest that humans may perform poorly on this task, so one goal was to see if we could obtain better performance from our human participants. With this in mind, the second experiment, with college students as participants, included several differences from the procedure used by Herbranson and Schroeder (2010). To test the possibility that sub-optimal performance on this task found in previous research was due to a lack of motivation or to inadequate feedback, we offered the participants monetary rewards for good performance, and we provided them with continuous feedback on how many times they had won, their winning percentages, and the winning percentage that was theoretically possible if they used the best strategy. We also considered the possibility that, in Herbranson and Schroeder’s experiment, it might have been difficult for participants to discriminate the reinforcement probabilities and recognize that switching leads to a higher probability of winning than staying (especially because participants were never told that the two options were “switching” or “staying”; they were simply told to choose among three options, and then between the two remaining options). To examine this possibility, we included three conditions with different reinforcement percentages for stay and switch responses. One group received a series of trials with the reinforcement percentages that apply in the standard Monty Hall task (33% chance of winning for a stay response, and 67% chance of winning for a switch response). For a second group the stay and switch winning percentages were set at 20% and 80%, respectively, and for a third group the winning percentages were 10% and 90%. The purpose of including these three groups was to examine whether performance would become progressively closer to optimal as the consequences for staying versus switching became more extreme.
The third experiment used preschool children (ages 3–5 years) as participants. Models of decision making emphasize the role of heuristics (Kahneman, Slovic & Tversky, 1982), and the poor performance on the Monty Hall task for adults has been attributed to an overreliance on the equiprobability heuristic. According to Herbranson and Schroeder (2010), pigeons outperformed adults on the Monty Hall task presumably because they are not handicapped by these heuristics.
The performance of preschoolers may provide a different perspective on the Monty Hall dilemma. To our knowledge, the Monty Hall task has never been studied in young, preschool-aged children, and so our first question was very general: Will preschoolers’ responses to the Monty Hall task be strategic and rule based (Siegler, 1983, 2004)? If, when presented multiple trials of this choice task, young children identify a problem space with specific goals, then their responses will be patterned rather than random. Greer (2001) and Nikiforidou and Pange (2010) have suggested that preschoolers, when faced with apparent randomness, will adopt and maintain strategies that create predictable patterns. Based on these prior findings, we hypothesized that preschoolers would develop consistent patterns of responding that could be characterized as strategic, although exactly what response patterns they would adopt is unclear. Because of our uncertainty about how the children would perform on this task, and because of the possibility that large individual differences might arise, we chose to test all of the preschoolers on the standard Monty Hall task (with winning percentages of 67% for switching and 33% for staying) rather than divide them into groups with different winning percentages.
A second, more specific question is how preschoolers’ performance will compare to that of adults and pigeons. One possibility is that preschoolers will perform better than adults, as Herbranson and Schroeder found with pigeons. In one study (De Neys, 2007), 12-year-olds were slightly more likely to switch in a single trial of the Monty Hall task (10%) compared to 13–17 year-olds (0%), and the younger group had a less well developed sense of equiprobability than the older group. In our study it is unlikely that the preschoolers will have a well-developed equiprobability heuristic, and this could give them an advantage over adults. On the other hand, it is likely that preschoolers do have some intuitions regarding probability. For example, children between 4 and 6 years old were able to indicate the most likely outcome in a game with simple 5:1 or 3:1 comparisons (Nikiforidou & Pange, 2010), and preschoolers (3 to 5 years old) reacted more quickly to higher probability than lower probability events (Téglás, Girotto, Gonzalez & Bonatti, 2007). Even 12-month-old infants react to outcomes in ways congruent with the event probabilities associated with them (Téglás et al., 2007). Will these early intuitions of probability bias young children’s responses toward staying in the Monty Hall task as they presumably do for adults? If so, preschoolers will perform similarly to adults.
The Monty Hall dilemma offers an opportunity to compare nonhuman and human (adult and preschooler) choice performance. The present experiments had several purposes. First, we wished to reexamine the previous finding that pigeons did better than adults. Second, given the consistent finding that the Monty Hall dilemma is difficult for adults, is it possible to increase adult performance under conditions of greater motivation and feedback about performance? Third, what are the effects of manipulating the underlying probability structure of the task? If performance increases with the winning percentages of 20% versus 80%, and 10% versus 90%, then it may be that the 33% to 67% probabilities operating in the Monty Hall task are too difficult to detect and influence choice. Under slightly clearer probabilities, performance may improve. Fourth, will preschoolers approach the task strategically with identifiable response patterns, and how will their performance compare to adults and pigeons? If all three groups, pigeons, adults and preschoolers, perform in similar, non-optimal ways, this would support a view that performance on this task is driven more by the properties of the task than by the characteristics of the subject, or the use of particular heuristics.
Finally, these studies relate to a more general question about how individuals choose between probabilistic reinforcers. Some early studies found evidence for probability matching, in which response percentages match the relative reinforcement probabilities for the two alternatives (e.g., Bitterman, Wodinsky, & Candland, 1958). Bitterman and his colleagues suggested there may be species differences, because fish and pigeons often exhibited probability matching (Bitterman et al., 1958; Bullock & Bitterman, 1962) whereas monkeys and rats tended to display exclusive preference for the alternative with the higher probability of reinforcement (which is the optimal strategy; Bitterman et al., 1958; Meyer, 1960). Based on their data, Herbranson and Schroeder (2010) suggested that pigeons showed near-exclusive preference whereas people approximated probability matching. The present experiments will provide additional data on whether pigeons, adults, and preschoolers tend to exhibit probability matching, exclusive preference, or possibly some other pattern of responding when choosing between probabilistic reinforcers.
EXPERIMENT 1
Method
Subjects
The subjects were 12 male white Carneau pigeons maintained at about 80% of their free-feeding weights. All pigeons had previous experience with a variety of experimental procedures.
Apparatus
Three identical experimental chambers were used. Each chamber was 30 cm long, 30 cm wide and 31 cm high. The chambers had three response keys, each 2 cm in diameter, mounted on the front wall of the chamber, 24 cm above the floor and 8 cm apart. A force of approximately 0.15 N was required to operate each key. Each key could be transilluminated with lights of different colors. Every effective peck to a lit key was followed by a feedback click. A hopper below the center key delivered controlled access to grain (whole-grain wheat), and when the grain was available, the hopper was illuminated with a 2-W white light. Two 2-W white houselights were mounted above the Plexiglas ceiling toward the rear of the chamber. Each chamber was enclosed in a sound-attenuating box with a ventilation fan. All stimuli were controlled and responses were recorded using an IBM compatible computer using the Medstate programming language.
Procedure
Because of the pigeons’ extensive prior experience with choice procedures in these chambers, no pretraining was necessary. The experiment consisted of two conditions, each lasting for 30 sessions. All pigeons first received the 67% condition, in which the probability of reinforcement for a switch response was 2/3, and the probability of reinforcement for a stay response was 1/3. In the second condition (the 90% condition), the probability of reinforcement for a switch response was 90%, and the probability of reinforcement for a stay response was 10%.
Each session lasted for 100 trials or 60 min, whichever came first. The pigeons usually completed all 100 trials in much less than 60 min, and there was only one session in which one pigeon did not complete 100 trials. The white houselights were lit throughout each session except during reinforcement periods, when the white houselights above the grain hopper were lit instead. At the start of each trial, all three response keys were transilluminated with white light. When a pigeon made a single peck on any key, all three keys were darkened for 1 s. Then, the key that had been pecked and one of the other two keys (chosen at random) were transilluminated with green light. The third key remained dark, and any responses to this key had no consequence. If the pigeon pecked the same key as it had pecked previously, this was counted as a stay response; if it pecked the other green key, this was considered a switch response. When the pigeon pecked one of the two green keys, both keys were darkened, and this was followed either by reinforcement (a 3-s presentation of grain) or by the start of a 5-s inter-trial interval (ITI), during which only the white houselights were lit. If a trial ended without reinforcement, the 5-s ITI began immediately. After the ITI, the three keys were again transilluminated with white light and the next trial began.
For the first 4 pigeons tested in the 67% condition (Pigeons P9 through P12), whether a trial ended with or without reinforcement was determined by a purely random process, with a 1/3 probability of reinforcement for stay responses and a 2/3 probability for switch responses. This procedure is similar to contingencies in the game show (where the winning door was presumably selected at random), but when an individual receives a series of such trials, the actual winning probabilities can depart from these nominal probabilities due to chance runs of wins or losses. To minimize such departures from the nominal probabilities, the reinforcer deliveries for the other 8 pigeons in the 67% condition were determined by a pseudorandom schedule that ensured that the actual reinforcement percentages remained very close to their nominal values in each session. Eight out of every 12 switch responses were followed by reinforcement, and 4 out of every 12 stay responses were followed by reinforcement. In the 90% condition, a pseudorandom schedule was used for all 12 pigeons: 9 out of every 10 switch responses were followed by reinforcement, and 1 out of every 10 stay responses was followed by reinforcement.
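One way such a pseudorandom schedule could be arranged is sketched below in Python. This is a hypothetical illustration, not the program used in the experiment (which was written in Medstate); the class name, the function name, and the use of shuffled blocks are all assumptions. Each response type draws outcomes from its own shuffled block, so that exactly 8 of every 12 switch responses and 4 of every 12 stay responses are reinforced.

```python
import random

class BlockSchedule:
    """Reinforce exactly `wins` of every `block` responses, in shuffled order."""

    def __init__(self, wins, block, rng):
        self.wins = wins
        self.block = block
        self.rng = rng
        self.queue = []

    def next_outcome(self):
        # Refill with a freshly shuffled block once the current one is used up.
        if not self.queue:
            self.queue = [True] * self.wins + [False] * (self.block - self.wins)
            self.rng.shuffle(self.queue)
        return self.queue.pop()

# Assumed setup for the 67% condition: separate blocks for each response type.
rng = random.Random(1)
switch_schedule = BlockSchedule(wins=8, block=12, rng=rng)
stay_schedule = BlockSchedule(wins=4, block=12, rng=rng)

def reinforce(response):
    """Return True if this 'stay' or 'switch' response ends in reinforcement."""
    schedule = switch_schedule if response == "switch" else stay_schedule
    return schedule.next_outcome()
```

Under this arrangement the obtained reinforcement percentages cannot drift far from their nominal values, although short runs of reinforced or nonreinforced trials remain possible within a block. The 90% condition would correspond to blocks of 10 with 9 reinforced switch responses and 1 reinforced stay response.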
Results and Discussion
Figure 1 shows the mean switching percentage for all 60 sessions of the experiment. In the 67% condition, switching percentages began near 50%, and increased to about 60% by the end of the condition. In the 90% condition, mean switching percentages rose to almost 90% by the end of the condition. A repeated-measures analysis of variance (ANOVA) found a significant effect of condition, F(1, 11) = 14.28, p = .003, η² = 0.56, a significant effect of sessions, F(29, 319) = 7.00, p < .001, η² = 0.39, and no significant condition-by-session interaction, F(29, 319) = 0.33, p = .99.
Figure 1.
Mean switching percentages for the pigeons in each session of Experiment 1. The error bars are standard errors of the mean.
Taken as a group, the pigeons in this experiment did not perform as well in the 67% condition as the 6 pigeons in the comparable experiment of Herbranson and Schroeder (2010), where the mean switching percentage was about 95% in the 30th session. However, the group means shown in Figure 1 are misleading, because there were very large differences among individual pigeons. Figure 2 presents the switching percentages for each pigeon, plotted in 5-session blocks. The left panels show the results from the 6 pigeons that had the highest switching percentages by the end of the 67% condition, and the right panels show the results from the other 6 pigeons. As can be seen, the performance of the 6 pigeons in the top left panel was similar to the pigeons of Herbranson and Schroeder: By the end of the 67% condition, these pigeons were making switching responses on a high percentage of the trials. However, the performance of the 6 pigeons in the top right panel was very different. Three pigeons (P1, P4, and P9) made progressively fewer switching responses across the 30 sessions of the 67% condition, and 2 of these pigeons made almost no switching responses at the end of this condition. The switching percentages of the other 3 pigeons (P2, P6, and P10) did not show any obvious trends, and they remained close to 50% throughout the 67% condition. It should be noted that there were no discernible differences between those pigeons exposed to the random schedules of reinforcement and those exposed to the pseudorandom schedules. Of the 4 pigeons exposed to the random schedules, 2 displayed high percentages (P11 and P12) and 2 did not (P9 and P10). This pattern of individual differences is very similar to what was seen in the other 8 pigeons.
Figure 2.
Switching percentages are shown for each pigeon in Experiment 1 in 5-session blocks. For each condition, the left panel shows the results from the 6 pigeons that had the highest switching percentages at the end of the 67% Condition; the right panel shows the results from the other 6 pigeons.
With both the random and pseudorandom reinforcement schedules, departures from the nominal probabilities are possible in the short run (e.g., there could be several consecutive trials with food after a stay response, or several consecutive trials without food after a switch response). Could it be that differences in the actual reinforcement probabilities on the first few trials of the experiment were responsible for the large individual differences observed in the 67% condition? To assess this possibility, the actual reinforcement percentages were examined for the first 20 trials of the first session. On these 20 trials, the mean reinforcement percentages after a stay response were 34.8% for the 6 pigeons that ultimately developed the highest overall switching percentages (those in the left panel of Figure 2) and 32.2% for the 6 pigeons that developed the lowest overall switching percentages (right panel of Figure 2), t(10) = 0.14, p = .89. The corresponding reinforcement percentages after a switch response were 68.5% and 63.8%, respectively, t(10) = 0.63, p = .54. These results therefore offer little support for the notion that differences in the reinforcement percentages on the first few trials were related to individual differences in the pigeons’ long-term switching percentages.
The bottom half of Figure 2 shows the results from the 90% condition. The switching percentages for the 6 pigeons in the bottom left panel were all between 95% and 100%. For 5 of the 6 pigeons in the bottom right panel, switching percentages increased as the 90% condition progressed, and in the 30th session, the mean switching percentage for these 5 pigeons was 94.6%. The remaining pigeon (P1) continued to make stay responses on nearly every trial.
Why was there much more between-subject variability in this experiment than in the study of Herbranson and Schroeder (2010), and why did half of the pigeons in the present experiment fail to learn (at least in the 67% condition) that switching was the optimal strategy? The procedure was virtually identical to that of Herbranson and Schroeder, requiring a choice among three white keys and then between two green keys. The present experiment used the same timing of all trial events, and the same number of trials per session. One difference is that the pigeons used by Herbranson and Schroeder were experimentally naïve, whereas those in the present experiment had extensive previous experience in choice experiments in these test chambers. The 12 pigeons in this experiment had previously participated in three different experiments (4 pigeons per experiment), but there was no systematic relation between which previous experiment a pigeon had served in and its performance in the Monty Hall task. That is, some pigeons from each previous experiment reached a high switching percentage in the 67% condition, and some did not. In addition, one might have expected that prior experience in choice experiments would help, not hinder, their learning in the present experiment, because the birds had learned in the previous experiments that pecking different keys had different consequences. However, all of the pigeons had prior exposure to procedures that involved either long delays or large ratio schedules, with only occasional food deliveries. In comparison, in the 67% condition of the present experiment, the pigeons could obtain several reinforcers per minute by making only two responses per trial, even if they did not make the optimal choices. The suboptimal choices of some pigeons could therefore be instances of “satisficing”; that is, the 33% chance of obtaining food after a stay response may have provided little incentive to learn a better strategy.
Whatever the reason, their previous experience evidently did not help the pigeons in this experiment, and, if anything, it made them less likely to learn the optimal strategy.
A closer examination of the choices of the pigeons that did not learn the optimal strategy in the 67% condition gives some indication of what factors were controlling their behaviors. The 2 pigeons that developed exclusive preference for staying may have developed this pattern because it required less effort (i.e., less bodily movement) to peck the same key twice in succession than to switch keys. This behavior, though not optimal, still led to food on 33% of the trials. Once these pigeons were staying on almost every trial, they had no opportunity to learn that a pattern of switching led to a higher probability of food. The 4 pigeons that continued to produce a mixture of staying and switching responses each exhibited a tendency to respond on certain keys and avoid others. The response patterns of these 4 pigeons were examined in the last 5 sessions of the 67% Condition. Pigeon 1 almost never responded on the right key, making 99% of its responses on the left white key, then sometimes staying and sometimes switching to the center key. The behavior of Pigeon 4 was similar. Pigeon 10 almost never responded on the left key, making more than 99% of its responses on the center and right keys (sometimes switching between these two keys, sometimes staying). Pigeon 6 started 95% of its trials by pecking one of the side keys, then sometimes switching keys and sometimes staying. In summary, these 4 pigeons each displayed strong key preferences that may have interfered with their learning of the optimal strategy. However, when the different consequences for staying versus switching became more pronounced in the 90% condition, these 4 pigeons eventually started switching on nearly every trial.
These results are similar in some ways to those from a recent experiment by Mazur (2010), who used a discrete-trial procedure to study pigeons’ choices between red and green keys that had either different delays to reinforcement or different probabilities of reinforcement. When the differences between the two alternatives were relatively large (e.g., a fixed 4-s delay to food for the red key versus a fixed 8-s delay to food for the green key), all pigeons exhibited nearly exclusive preference for the key with the shorter delay. However, when the two alternatives were more difficult to discriminate (e.g., a variable delay to food averaging 10 s for the red key versus a variable delay to food averaging 8 s for the green key), a range of different response patterns was observed. In some cases, the pigeons showed exclusive preference for the key with the shorter delay, but in other cases they showed exclusive preference for the key with the longer delay (possibly because of a color or position bias, or because of a failure to discriminate which key had the shorter average delay). There were also some cases in which the pigeons distributed their choices between the two keys (e.g., choosing the 8-s delay on 80% of the trials and the 10-s delay on 20% of the trials). Similar results were obtained when the two keys had different probabilities of reinforcement instead of different delays. Based on these results, Mazur concluded that in discrete-trial choice procedures (where an animal must choose between two alternatives by making a single, brief response), exclusive preference for one alternative is the most frequent outcome—it is the animal’s “default option.” However, “distributed preference may occur when two factors (such as reinforcer delay and position bias) compete for the control of choice, or when the consequences for the two alternatives are similar and difficult to discriminate” (Mazur, 2010, p. 321).
The results of the present experiment are consistent with these conclusions. In the 67% condition, where the consequences of staying versus switching were presumably more difficult to discriminate, 6 pigeons exhibited predominant preference for switching, 2 pigeons showed exclusive preference for staying, and 4 pigeons showed distributed preference. As already noted, each of the latter 4 pigeons developed patterns of responding predominantly on certain keys and avoiding others, and these key position preferences evidently overrode the differential consequences of switching versus staying. However, in the 90% condition, where the different consequences of staying and switching were easier to discriminate, all 12 pigeons displayed exclusive or near-exclusive preference (although for one pigeon it was exclusive preference for staying rather than switching).
Finally, switching percentages were calculated for each pigeon from the last 25 trials of the second session in the 67% condition (i.e., after 175 trials on this task). These percentages were obtained so that they could be compared to the results from the college students in Experiment 2, who received a total of 200 trials on the Monty Hall task. For the pigeons, the mean switching percentage on these 25 trials was 40.3%, ranging from 16% to 72% for individual birds. One of the purposes of Experiment 2 was to evaluate the suggestion of Herbranson and Schroeder (2010) that pigeons tend to learn the optimal strategy on the Monty Hall task more readily than do adult humans.
EXPERIMENT 2
Method
Participants
The participants were 36 students from Introductory Psychology, all 18 years of age or older, who received research participation credit for serving in the experiment. Twelve participants were assigned to each of the three conditions in the experiment (with 67%, 80%, and 90% probabilities of reinforcement for switching responses, respectively).
Apparatus
The experiment was conducted on a laptop computer with a 15-in screen. During the experiment, the bottom portion of the screen displayed three “doors” (white rectangular boxes with a 4-cm green circle in the center of each box). Participants made their choices by using the laptop’s touchpad to move the cursor and click on the green circles. The top part of the screen had three boxes that were labeled “Wins,” “Trials,” and “Winning Percentage.” These numbers were updated after each trial as a participant played the game.
Procedure
For each participant, the experimenter began by reading a set of instructions. For the 67% condition, the instructions were as follows:
In this experiment, you will be asked to play a simple choice game on the computer. Each game will consist of 50 trials, and you will be asked to play four games. On each trial, you should click on one of the three green circles. One circle will disappear, and you should then click on one of the two remaining circles. The screen will then say either “Win” or “Lose.” Click on the word “Win” or “Lose,” and the next trial will start in about 2 seconds. It is not possible to win every trial in this game. However, your goal should be to win as many times as you can. Using the best possible strategy, a player can win about 67% of the trials (which is about 33 wins in each 50-trial game). At the top of the screen is a display that will show you how many wins you have obtained, how many trials you have played, and your winning percentage. When you have completed 50 trials, the bottom of the screen will tell you that the game is completed. Please call me after each game is over. You can then take a short break while I set up the next game for you. In summary, in each of the four times you play the game, you should try to win on as many trials as you can. The four participants with the most wins will receive a $25 prize.
The instructions were the same for the 80% and 90% conditions, except that participants were informed that with the optimal strategy, they could win on 40 or 45 trials, respectively, in each 50-trial game.
At the start of each trial, the screen displayed the three white doors with green circles. When the participant clicked on one of the green circles, one of the other two doors and its green circle disappeared. The two remaining white doors and green circles remained in their original locations on the screen. After a participant clicked on one of the two remaining green circles, the doors and green circles disappeared and were replaced with either the word “WIN” or “LOSE.” The boxes on the top half of the screen were updated to show the number of wins, the number of trials, and the winning percentage for that block of 50 trials. The participant had to click on the word “WIN” or “LOSE”, and 2 s later the word disappeared and the three boxes with green circles appeared for the next trial. As in Experiment 1, whether a trial was a win or loss was determined by a pseudorandom schedule that ensured that the actual reinforcement percentages remained very close to their nominal values in each session. In the 67% condition, 8 out of every 12 switch responses led to a win, and 4 out of every 12 stay responses led to a win. In the 80% condition, 8 out of every 10 switch responses led to a win, and 2 out of every 10 stay responses led to a win. In the 90% condition, 9 out of every 10 switch responses led to a win, and 1 out of every 10 stay responses led to a win.
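The win ratios described above can be illustrated with a short sketch. The text states only that a pseudorandom schedule kept the obtained percentages close to their nominal values (e.g., 8 wins in every 12 switch responses); the shuffled-block mechanism below, and the names `make_schedule`, `switch_schedule`, and `stay_schedule`, are assumptions made for illustration and are not taken from the actual experimental software.

```python
import random


def make_schedule(wins_per_block, block_size, n_blocks, rng):
    """Build a list of outcomes (True = win) from independently shuffled blocks.

    Assumption: each block contains exactly `wins_per_block` wins, so the
    obtained win percentage can never drift far from its nominal value.
    """
    outcomes = []
    for _ in range(n_blocks):
        block = [True] * wins_per_block + [False] * (block_size - wins_per_block)
        rng.shuffle(block)  # randomize the order of wins within the block
        outcomes.extend(block)
    return outcomes


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# 67% condition: 8 of every 12 switch responses win, 4 of every 12 stay responses win.
switch_schedule = make_schedule(8, 12, 5, rng)
stay_schedule = make_schedule(4, 12, 5, rng)

print(sum(switch_schedule) / len(switch_schedule))  # proportion of wins after a switch
print(sum(stay_schedule) / len(stay_schedule))      # proportion of wins after a stay
```

In a scheme of this kind, the outcome of a trial would be read from whichever list corresponds to the response just made, so the two response types are scheduled independently of each other.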
Results and Discussion
Figure 3 shows the mean switching percentage for each of the three groups, plotted in 25-trial blocks. In the 67% condition, switching percentages showed almost no increase across the 200 trials. The mean switching percentage was 66% in the first 25-trial block and 67.7% in the last block. In the 80% and 90% conditions, the mean switching percentages were higher in the first 25-trial block, and they increased rapidly, reaching levels above 90% by the fourth block in the 80% condition and by the third block in the 90% condition. An ANOVA found a significant effect of condition, F(2, 33) = 15.93, p < .001, η² = 0.56, and a significant effect of trial blocks, F(7, 231) = 12.93, p < .001, η² = 0.56, but the condition-by-block interaction failed to reach statistical significance, F(14, 231) = 1.68, p = .061. Tukey’s HSD test found that switching percentages were significantly lower in the 67% condition than in either the 80% or 90% conditions (ps < .001), but there was no significant difference between the 80% and 90% conditions.
Figure 3.
Mean switching percentages are shown for the college students in each of the three conditions of Experiment 2, plotted in blocks of 25 trials.
As in Experiment 1, the group means in Figure 3 do not provide a good indication of the performances of individual participants. Figure 4 presents the switching percentages from all the individual participants, plotted in 25-trial blocks. For each condition, the results from the 6 participants who had the highest switching percentages are shown in the left panel, and the results from the other 6 participants are shown in the right panel. In the 67% condition, 3 of the 12 participants were switching on nearly every trial in the last 25-trial block. Most of the other participants in the 67% condition showed no systematic increase in switching percentages across the experiment (see the upper right panel in Figure 4). However, they did not choose the switching and staying options equally often: Averaged across all 200 trials, the switching percentages of all 12 participants in the 67% condition were above 50% (ranging from 53.5% to 98%).
Figure 4.
Switching percentages are shown for each college student in Experiment 2 in 25-trial blocks. For each condition, the left panel shows the results from the 6 students who had the highest switching percentages, and the right panel shows the results from the other 6 students. The keys identify individual participants in the form “condition-participant” (e.g., “80-3” refers to the third participant in the 80% condition).
As was done for the pigeons in Experiment 1, the actual reinforcement percentages were examined for the first 20 trials of the experiment to determine whether different percentages on the first few trials might be responsible for the long-term individual differences in performance. On these 20 trials, the mean reinforcement percentages for a stay response were 36.1% for the 6 participants who ultimately developed the highest overall switching percentages (those in the top left panel of Figure 4) and 29.2% for the 6 participants who developed the lowest overall switching percentages (top right panel of Figure 4), t(10) = 1.55, p = .15. The corresponding reinforcement percentages after a switch response were 65.5% and 66.7%, respectively, t(10) = 0.54, p = .60. These results provide no support for the view that individual differences in the long-term performances of these participants were related to differences in their reinforcement percentages for staying or switching on the first few trials.
The middle and bottom rows of Figure 4 show that in the early trials, there was also considerable variability among participants in the 80% and 90% conditions. Some participants established a strong pattern of switching early in the experiment, whereas others did not. By the end of the experiment, however, most of these participants had developed a pattern of switching on nearly every trial. In the final 25-trial block, 7 of 12 participants in the 80% condition were switching on more than 90% of the trials, and all 12 participants in the 90% condition were doing so.
The results from the 67% condition were quite similar to those from the college students tested by Herbranson and Schroeder (2010). In their experiment, the switching percentage was 65.7% in the last 50 trials, compared to 69.5% in the present experiment. These similar results suggest that the changes we included in an attempt to obtain more optimal behavior from our participants (the opportunity to win money for good performance, the trial-by-trial feedback on the number of wins and the winning percentage, and the information about what winning percentages were possible using the best strategy) had little or no effect. What did make a big difference, both with the college students and with the pigeons in Experiment 1, were the relative reinforcement probabilities for switching versus staying. By the end of the 67% condition, only 3 of 12 students and 4 of 12 pigeons were switching on 90% or more of the trials. By the end of the 90% condition, all 12 students and 11 of 12 pigeons were switching on 90% or more of the trials. These results suggest that one of the main reasons why some students and some pigeons did not learn the optimal strategy in the 67% condition was that the different reinforcement probabilities for switching versus staying are difficult to discriminate in this task, even after exposure to many trials.
As already noted, Herbranson and Schroeder (2010) concluded that the performance of pigeons was better than that of humans on the Monty Hall task, but their human participants received only 200 trials on the task, compared to nearly 3000 trials for the pigeons (and the same was true in our experiments). To obtain a fairer comparison between pigeons and humans, we compared their switching percentages after 175 trials (i.e., the last 25 trials of the experiment for the college students, and the last 25 trials of the second session for the pigeons). The mean switching percentage for the students was 67.7% (ranging from 40% to 100%). The mean switching percentage for the pigeons was 40.3% (ranging from 16% to 72%). Therefore, after the same number of trials, the college students were performing better on this task, and the difference was statistically significant, t(22) = 3.73, p < .001, d = 1.52. This advantage for the college students is not surprising, especially considering that the instructions they received prior to the experiment gave them considerable information about the nature of the task, whereas the pigeons had to learn about the task entirely from their trial-by-trial experiences.
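The effect size reported above can be recovered from the t statistic, which may help readers who want to check the arithmetic. The sketch below uses the standard conversion for two independent groups, d = t × sqrt(1/n1 + 1/n2); it assumes a pooled-variance t test with 12 individuals per group (consistent with the 22 degrees of freedom), and the function name `cohens_d_from_t` is illustrative only.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d.

    Assumes a pooled-variance t test, for which d = t * sqrt(1/n1 + 1/n2).
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Values from the text: t(22) = 3.73, with 12 students and 12 pigeons.
d = cohens_d_from_t(3.73, 12, 12)
print(round(d, 2))  # prints 1.52
```

The result agrees with the reported value of d = 1.52.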
If we compare the performance of the college students and pigeons across the entire experiment (not just after 175 trials), two similarities are evident. First, in the 67% condition, a few college students and a few pigeons exhibited near-optimal performance (switching on nearly every trial), but most did not. Second, in the 90% condition, all college students and all but one pigeon eventually displayed near-optimal performance, although there were substantial individual differences in how long it took before they reached that level of performance. However, there was one notable difference between these two groups. In the 67% condition, all of the college students, even those whose performance was least optimal, made more switching responses than staying responses. In contrast, 5 of the 12 pigeons had switching percentages below 50% at the end of the 67% condition, and 2 of these pigeons had switching percentages at or near zero. Thus even when they had received many more trials on the task than the college students, several of the pigeons displayed response patterns that did not reflect the higher probability of reinforcement that was available for switching. This is another way in which the overall performance of the pigeons was worse than that of the college students.
As seen in these two experiments, neither adults nor pigeons performed optimally in the 67% condition. The next experiment extends this work to young, preschool-aged children, exploring whether young children will approach the Monty Hall task strategically. In addition, will their performance be similar to the non-optimal performance of the pigeons and adults in Experiments 1 and 2, or will their performance exceed the switching rates of these two groups?
EXPERIMENT 3
Method
Participants
Seventeen preschoolers (11 boys and 6 girls) between 37 and 57 months of age (mean age = 48.375 months, SD = 6.42) participated in this study. Preschoolers were recruited from two local daycare centers.
Apparatus
Preschoolers were presented with the Monty Hall dilemma on a laptop computer with a 15-in screen. During the experiment, the bottom portion of the screen was divided into three “doors” (white rectangular boxes of identical size and shape, with a 4-cm green circle in the center of each box). Children made their choices by touching a circle, and the experimenter then clicked on it. The top part of the screen had a box indicating the number of trials completed.
Procedure
The experimenter sat with the child and explained the game in the following way: “In this game you will see three doors. Behind one of the doors you will find a fun picture. If you find the fun picture, then you win. Do you want to play?” When the child said “yes”, the game began.
At the start of each trial, the screen displayed the three white doors with their green circles, and a child was asked to pick a door. Upon picking a door, one of the other two doors and its green circle disappeared, leaving the child’s original choice and another door. The two remaining white doors and green circles stayed in their original locations on the screen. The experimenter then asked the child “What would you like to do next?” At this point, the child either picked the original choice or moved to the other remaining door. Sometimes in the first trial, the child would be confused by the game, and the experimenter would say again “What would you like to do next?” and would leave the child to figure out that another choice could be made. The experimenter was careful never to say the words stay or switch. If the child did not pick a door, the experimenter prompted the child by indicating that something could be done next.
After the child chose a door, either a fun picture (e.g., Winnie the Pooh, Toy Story characters, or Dora the Explorer) or a red unhappy face would appear. If the fun picture appeared, the experimenter would talk about the picture with enthusiasm and say “you won”; if the unhappy face appeared, the experimenter would say “oh too bad.”
Each child participated in a minimum of 50 trials over three days. As in the two previous experiments, whether a trial ended with or without reinforcement was determined by a pseudorandom schedule that ensured that the actual reinforcement percentages remained very close to their nominal values.
Some children enjoyed the game and continued beyond 50 trials. Their later performance was similar to their performance on the first 50 trials, so we adopted a 50-trial cutoff in order to preserve the size of the sample. One participant was not able to complete the minimum 50 trials and was dropped from further analyses.
Two differences between the adult and child version of the task should be noted: First, preschoolers completed the task with an experimenter present, while adults completed the task alone, and second, preschoolers were not given feedback about whether they were “doing well” over the course of the session, while the adults had a counter which displayed their win/lose percentage. In every other way the tasks were identical.
Results and Discussion
The work with preschoolers focused on two basic questions: On this choice task, was preschoolers’ performance strategic, and how did their performance compare to that of pigeons and adults?
First, there is evidence that overall the preschoolers’ performance was strategic, although not optimal. Comparisons between preschoolers, pigeons, and adults show that, as a group, preschoolers’ performance across the 50 trials averaged around 67% switching (see Figure 5), which was nearly identical to the adult and pigeon data in the 67% condition (compare to Figures 1 and 3). The mean switching percentage was 68% in the first 10-trial block and 66% in the last 10-trial block. Thus, as a group, the preschoolers did not perform better or worse than either pigeons or adults on this task. Common explanations for suboptimal performance by adults on this task are 1) operation of the equiprobability heuristic and 2) preference for staying to avoid regret about switching choices. Although these two explanations seem plausible for adults, it is unlikely that they could account for the similar pattern across these three diverse groups. It is more likely that the 33% to 67% probability structure limits performance on this task more than the operation of heuristics.
Figure 5.
Mean switching percentages are shown for the preschool children in Experiment 3, plotted in blocks of 10 trials. The error bars are standard errors of the mean.
Moving beyond the preschoolers’ group averages, what were the patterns of strategies utilized in this task? As in the first two experiments, the performances of individual participants are informative. Figure 6 presents the switching percentages for all the individual participants, plotted in five 10-trial blocks. Three distinct patterns of responding can be identified. The first, seen in the left panel of Figure 6, was a switching strategy employed by 44% of the sample, or 7 of 16 children. Children using this strategy switched between 88% and 100% of the time over all 50 trials. This switching strategy was not bound by location (right, left, center) and was maintained even in the face of losses. For example, the first choices of Child C2 were distributed 20% left, 34% center, and 46% right, yet this child switched on 96% of the trials. Similarly, the other children who exhibited this pattern of responding also distributed their responses among the three choices. The switching strategy represents an abstracted rule rather than a fixation on a particular location. Of those children who switched, all but one adopted a 100% switching strategy from the very first trial, and therefore it is clear that this subset of children did not learn to switch during the course of the experiment. The one child (Child C8) who did not adopt a 100% switching strategy from the first trial did progressively move to a switching strategy over the 50 trials (i.e., 80%, 80%, 90%, 90%, and 100% switching in successive 10-trial blocks). Notably, this child had the lowest switching percentage in this group, at 88%.
Figure 6.
Mean switching percentages are shown for each child in Experiment 3, plotted in blocks of 10 trials. The results for the 7 children with the highest switching percentages are in the left panel, for those with intermediate switching percentages in the center panel, and for those with the lowest switching percentages in the right panel.
The second subset of children, seen in the far right panel of Figure 6, used a staying strategy, and, as with the switching strategy, a majority of these children adopted an almost exclusive staying strategy from the first trial. This staying strategy was employed by 25% of the sample, or 4 of 16 children. Children employing a staying strategy stayed with their first choice 70% to 100% of the time. This staying strategy was not bound by location (right, left, center) and was maintained in the face of losses. For example, the first choices of Child C5 were distributed 34% left, 38% center, and 28% right, yet this child stayed on 100% of the trials. The other 3 children distributed their first choices among the three options in similar ways. As with switching, the staying strategy represents an abstracted rule rather than a fixation on a particular location. Whereas Child C5 stayed on 100% of the trials, the other 3 children showed an increasing tendency toward staying as the trials progressed. These preschoolers who stayed were similar to the small subset of the pigeons that also stayed from the earliest trials and maintained this strategy over time; however, unlike these pigeons, there was no bias for or against a particular location. The performance of this subset of preschoolers was worse than that of any adult, because no adult participant stayed on more than 50% of the trials.
Finally, 31% of the sample, or 5 of 16 children, employed a middling strategy, with switching hovering between 46% and 80% over 50 trials (as shown in the middle panel of Figure 6). As might be expected, this group was variable; however, the averages of their performance over the trials gave some indication of a trend toward learning to switch. For these 5 children, the switching percentages were 60%, 50%, 64%, 64%, and 78% over 5 successive 10-trial blocks. Whether this subset of children would display progressively more switching behavior with more trials is an interesting question for future work with this age group. Siegler (2004) suggested that greater opportunities for learning are present when children moderately vary their strategies when faced with novel problems, rather than when variations of strategies are more extreme. Thus, the subsets of children with nearly exclusive switching and staying strategies would represent poorer conditions for learning than the intermediate group. There were no significant age differences between the three styles of responding (mean ages in months were 50, 48, and 46.4, respectively), F(2, 13) = 0.433, p = .66.
In summary, on average, the preschoolers’ performance was remarkably similar to the pigeons’ and adults’ performance when tested in the 67% probability structure. Although it seems unlikely that the preschoolers’ responses were influenced by the equiprobability heuristic, they did not show any advantage compared to adults. Looking at the individual patterns of responding, three styles emerged, with 11 of the 16 children (69%) approaching the task with a clear strategy of either switching or staying, in some cases from the very first trial. Therefore, we conclude that most preschoolers do behave strategically in this task, applying a rule of either switching or staying. Importantly, both of these strategies were adopted within the initial trials and were maintained over the course of the experiment. These strategies were not linked to a particular preference for a response location, but rather represent abstracted rules which were applied with striking resistance to win/loss feedback. However, some of the preschoolers who exhibited intermediate performance did show more switching by the final 10-trial block, which could reflect some learning about the higher probability of winning after a switch.
GENERAL DISCUSSION
Previous studies on the Monty Hall task with adults (Burns & Weith, 2004; Friedman, 1998) or adolescents (De Neys, 2007) have found in a one-trial procedure that a large majority of the participants choose to stay with their initial choice, whereas the optimal strategy is to switch choices. A variety of explanations for this suboptimal behavior have been proposed, many related to the heuristics that people use in choice situations (De Neys & Verschueren, 2006; Krauss & Wang, 2003). The experiments of Herbranson and Schroeder (2010) and those reported here were different in that they gave individuals (both people and pigeons) many trials of exposure to the Monty Hall task, to see how their choices might change with repeated exposure to the consequences of staying versus switching. Herbranson and Schroeder found that with repeated exposure, both college students and pigeons began to make switching choices more often than staying responses. However, the pigeons eventually reached switching percentages of about 95% (close to the optimal strategy of switching every time), whereas the college students reached switching percentages of only about 66%. Herbranson and Schroeder concluded that the performance of their pigeons was superior to that of the college students on the Monty Hall task, perhaps because the same heuristics that lead to staying responses on the first trial continued to interfere with the students’ ability to learn that switching is the better strategy.
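For readers who want to verify why switching is the optimal strategy in the standard task, the 2/3 versus 1/3 win probabilities can be checked with a brief simulation. This is a sketch of the classic three-door game with a hidden prize, not of the procedure used in the present experiments (where outcomes were set by a pseudorandom schedule); the names `play` and `win_rate`, the trial counts, and the seeds are arbitrary choices made for illustration.

```python
import random


def play(switch, rng):
    """Play one trial of the classic Monty Hall game; return True on a win."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    first = rng.choice(doors)
    # A losing door other than the initial choice is revealed.
    revealed = rng.choice([d for d in doors if d != first and d != prize])
    if switch:
        # Move to the one door that is neither the first choice nor revealed.
        final = next(d for d in doors if d != first and d != revealed)
    else:
        final = first
    return final == prize


def win_rate(switch, n_trials, seed):
    """Estimate the proportion of wins for an always-switch or always-stay player."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_trials)) / n_trials


switch_rate = win_rate(True, 100_000, seed=1)
stay_rate = win_rate(False, 100_000, seed=2)
print(switch_rate, stay_rate)  # approximately 2/3 and 1/3
```

Switching wins whenever the initial choice was wrong, which happens on 2/3 of trials, so the estimates should settle near 0.667 and 0.333.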
Our first two experiments were patterned after the Herbranson and Schroeder (2010) studies, with some modifications, and although our results were similar in some ways, we came to very different conclusions. First, we found little evidence to convince us that the performance of pigeons was superior to that of humans (college students or preschoolers); in some ways, the humans performed better than the pigeons. As in the Herbranson and Schroeder research, in the 67% conditions, our college students made switching choices on about 2/3 of the trials, but so did our pigeons. An examination of the results from individual subjects showed that some pigeons and some students (less than half in both cases) came close to the optimal strategy of switching on almost every trial, but most subjects were still making a substantial percentage of stay responses. Furthermore, in both the Herbranson and Schroeder experiments and in our experiments, the pigeons received many more trials of exposure to the task than did the students. When we compared the last 25 trials of our college students to the pigeons after the same number of trials, the performance of the students was significantly better (a switching percentage of 67.7% for the students and 40.3% for the pigeons). Although the pigeons eventually achieved about the same switching percentage as the students, it took them many more trials to reach that level. Finally, the behaviors of the pigeons and students with lower switching percentages in the 67% condition were markedly different: All the students had switching percentages above 50%, whereas switching percentages were below 50% for 5 of 12 pigeons, and close to 0% for 2 pigeons. Therefore, in several different ways, the performance of the students was better than that of the pigeons in the 67% condition.
In each of the three populations we examined (pigeons, students, and preschoolers), there were large individual differences in the standard, 67% condition. In each group, some individuals ended up switching on almost every trial, but some exhibited a mixture of stay and switch responses. Among the pigeons and preschoolers (but not the college students), a few individuals even developed a pattern of staying on every trial—the worst possible strategy. We hypothesized that one reason for these large individual differences, and for the sub-optimal performance, is that it was difficult to discriminate the different reinforcement percentages for switching and staying. When the difference in the reinforcement percentages for switching versus staying was made larger (the 80% and 90% conditions), the switching percentages for both pigeons and college students were much higher. In fact, in the 90% condition, all the adults and all but one of the pigeons arrived at the near-optimal strategy of switching on nearly every trial. It therefore seems that a major impediment to good performance on the standard Monty Hall task is the difficulty in discriminating the higher reinforcement percentage for switching as opposed to staying. Some individuals (both pigeons and people) manage to learn the optimal strategy, but many do not, even after many trials of exposure to the task.
We found no indication that two other factors that might make a difference in the standard task (the participants’ levels of motivation and the type of feedback they receive) had any effect on the performance of the college students. In Herbranson and Schroeder’s (2010) experiment, the reinforcers for the college students were simply points, and they were not given a running tally of how many points they had won. In an attempt to obtain better performance from our college students, we awarded monetary prizes to the best performers, and we provided trial-by-trial feedback on the number of wins and the winning percentage. Despite these changes, the mean switching percentages for the college students in the 67% condition were almost identical to those of Herbranson and Schroeder (about 2/3 in both experiments). Making comparisons across experiments is always risky, of course, but there was clearly no evidence for any better performance by our students.
A well-known hypothesis about choice in situations with probabilistic reinforcers is the theory of probability matching—the idea that response percentages will match the relative reinforcement probabilities (e.g., Behrend & Bitterman, 1966; Bullock & Bitterman, 1962). For instance, in the standard Monty Hall task, with reinforcement probabilities of 2/3 for switching and 1/3 for staying, probability matching would occur if a subject made 2/3 switching responses and 1/3 staying responses. Because their college students had a mean switching proportion of about 2/3, Herbranson and Schroeder (2010) suggested that they may have been displaying probability matching, as opposed to their pigeons, who were exhibiting an optimization strategy. In the 67% conditions of our experiments, the group means for the pigeons, college students, and preschoolers were all close to 2/3 switching (see Figures 1, 3, and 5). In spite of these group means, however, the results from individual subjects strongly suggested that there was no probability matching by the pigeons, the college students, or the preschoolers. As already discussed, in each group there were a variety of response patterns ranging from near-exclusive preference for switching to near-exclusive preference for staying. An examination of the performance of individual subjects in the 67% conditions in Figures 2, 4, and 6 shows a few cases where the final switching percentages were near 67%, but many more cases where the switching percentages were much higher or much lower. Similarly, in the 80% condition of Experiment 2, 11 of 12 college students reached switching percentages above 80%, and for 7 of 12 students they were above 90%. In the 90% conditions of Experiments 1 and 2, most of the pigeons and college students had final switching percentages closer to 100% than to 90%. None of these results are consistent with the principle of probability matching.
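The cost of probability matching relative to always switching can be made concrete with a small calculation. The sketch below computes the expected win rate for a subject who switches on a given proportion of trials, using the fact (from the schedules in Experiment 2) that the win probability for staying is 1 minus the win probability for switching in every condition; the function name `expected_win_rate` and the `results` dictionary are illustrative only.

```python
def expected_win_rate(p_switch_response, p_win_switch):
    """Expected proportion of wins when switching on a fraction of trials.

    The win probability after a stay is taken to be 1 - p_win_switch, as in
    the 67%, 80%, and 90% conditions described in the text.
    """
    p_win_stay = 1.0 - p_win_switch
    return (p_switch_response * p_win_switch
            + (1.0 - p_switch_response) * p_win_stay)


results = {}
for p in (2 / 3, 0.80, 0.90):
    matching = expected_win_rate(p, p)    # switch with probability p (matching)
    always = expected_win_rate(1.0, p)    # switch on every trial (optimal)
    results[round(p, 2)] = (round(matching, 3), round(always, 3))

print(results)
```

With switching reinforced on 2/3 of trials, a probability-matching subject would be expected to win on 5/9 (about 56%) of trials, compared with about 67% for a subject who always switches; in the 80% and 90% conditions the corresponding values are 68% versus 80% and 82% versus 90%.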
The results from all three groups are generally consistent, however, with Mazur’s (2010) proposal that animals tend towards exclusive preference in discrete-trial choice situations unless (1) the consequences for the different alternatives are hard to discriminate or (2) some other factors, such as a position bias, compete with the tendency toward exclusive preference. All of the college students and most of the pigeons displayed near-exclusive preference in the 90% conditions, where the different consequences for staying and switching were easiest to discriminate. Switching percentages were not quite as high for the students in the 80% condition, but about half of the students had switching percentages close to 100%, and the switching percentages of others were still increasing when the experiment ended. In the 67% conditions, where the reinforcement probabilities for staying and switching were most similar, some pigeons, some college students, and some preschoolers did exhibit nearly exclusive switching behavior, but most did not. A few of the pigeons and preschoolers displayed nearly exclusive preference for staying rather than switching.
We included preschool children in this research to explore the possibility that they might perform better on the Monty Hall task than adults because they had not learned the probability heuristics that ostensibly make the task difficult for adults (cf. De Neys, 2007). This was not the case. In two ways, their performance was similar to that of the college students. First, their mean switching percentage of 67% was almost identical to the adult mean. Second, there were large individual differences: Some preschoolers approximated the optimal behavior of switching on every trial, but others did not. However, unlike the college students, many of the preschoolers began with a strategy of always switching or always staying and continued with that strategy throughout the experiment. Only a few preschoolers showed evidence of a gradual shift toward switching as the trials progressed.
Previous studies have shown that when adults are given the Monty Hall dilemma for just one trial, a large majority of the participants choose to stay with their initial choice (e.g., Burns & Wieth, 2004; Granberg & Brown, 1995). These studies have pointed to a variety of heuristics used by adults that may be responsible for this suboptimal choice. When the Monty Hall dilemma is presented to humans as a one-trial game, participants are drawn into a situation where there is a “knowledgeable host” who always eliminates an inferior option after the player’s initial choice. Participants may be more likely to use heuristics when the Monty Hall dilemma is presented in this way, i.e., as a one-trial insight problem, than when it is presented in an iterative fashion as a learning problem. Thus, we are not suggesting that heuristics do not operate in the one-trial version of the Monty Hall dilemma, but only that in a multi-trial procedure, pigeons, adults, and preschoolers perform similarly, with similar patterns of individual differences, and thus it is unlikely that biasing heuristics are playing a role in that procedure.
In summary, these studies provided little convincing evidence that either pigeons or preschoolers perform better on the Monty Hall task than adults. The college students acquired the strategy of switching faster than pigeons, and even after many more trials on the task, the average performance of the pigeons was not better than that of the students. In all three groups—pigeons, college students, and preschoolers—there were large individual differences, with only a minority of the individuals learning the optimal strategy of switching on every trial. However, Herbranson and Schroeder (2010) found that even after many trials, adults do not usually learn the optimal strategy of switching every time. The present research confirmed this finding, and it suggested that the main impediment to better performance after repeated trials on the Monty Hall task is the difficulty in discriminating the consequences of staying versus switching. When the differences between staying and switching became more extreme, the choices of both college students and pigeons came close to the optimal strategy of switching on every trial.
Acknowledgments
The research with pigeons was supported by Grant R01MH38357 from the National Institute of Mental Health. The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Mental Health or the National Institutes of Health. We thank Christina Falkenstein, Michael Lejeune, Jay Morrissey, Kimberly Rakiec, and Michael Petrocelli for their help in conducting this research.
References
- Baum WM. Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior. 1979;32:269–281. doi: 10.1901/jeab.1979.32-269.
- Behrend ER, Bitterman ME. Probability-matching in the goldfish. Psychonomic Science. 1966;6:327–328.
- Bitterman ME, Wodinsky J, Candland DK. Some comparative psychology. American Journal of Psychology. 1958;71:94–110.
- Bullock DH, Bitterman ME. Probability matching in the pigeon. American Journal of Psychology. 1962;75:634–639.
- Burns BD, Wieth M. The collider principle in causal reasoning: Why the Monty Hall Dilemma is so hard. Journal of Experimental Psychology: General. 2004;133:434–449. doi: 10.1037/0096-3445.133.3.434.
- Chi MTH. Knowledge structure and memory development. In: Siegler RS, editor. Children’s thinking: What develops? Hillsdale, NJ: Erlbaum; 1978.
- Davison M, McCarthy D. The matching law: A research review. Hillsdale, NJ: Erlbaum; 1988.
- De Neys W. Developmental trends in decision making: The case of the Monty Hall Dilemma. In: Elsworth JA, editor. Psychology of decision making in education, business and high risk situations. Hauppauge, NY: Nova Science; 2007. pp. 271–281.
- De Neys W, Vanderputte K. When less is not always more: Stereotype knowledge and reasoning development. Developmental Psychology. 2011;47:432–441. doi: 10.1037/a0021313.
- De Neys W, Verschueren N. Working memory capacity and a notorious brain teaser: The case of the Monty Hall Dilemma. Experimental Psychology. 2006;53:123–131. doi: 10.1027/1618-3169.53.1.123.
- Ecott CL, Critchfield TS. Noncontingent reinforcement, alternative reinforcement, and the matching law: A laboratory demonstration. Journal of Applied Behavior Analysis. 2004;37:249–265. doi: 10.1901/jaba.2004.37-249.
- Fantino E, Kanevsky I, Charlton SR. Teaching pigeons to commit base-rate neglect. Psychological Science. 2005;16:820–825. doi: 10.1111/j.1467-9280.2005.01620.x.
- Friedman D. Monty Hall’s three doors: Construction and deconstruction of a choice anomaly. The American Economic Review. 1998;88:933–946.
- Gilovich T, Medvec VH, Chen S. Commission, omission, and dissonance reduction: Coping with regret in the “Monty Hall” problem. Personality and Social Psychology Bulletin. 1995;21:185–190.
- Granberg D, Brown TA. The Monty Hall dilemma. Personality and Social Psychology Bulletin. 1995;21:711–723.
- Granberg D. Cross-cultural comparison of responses to the Monty Hall Dilemma. Social Behavior and Personality. 1999;27:431–438.
- Green L, Fry AF, Myerson J. Discounting of delayed rewards: A life-span comparison. Psychological Science. 1994;5:33–36.
- Herbranson WT, Schroeder J. Are birds smarter than mathematicians? Pigeons (Columba livia) perform optimally on a version of the Monty Hall Dilemma. Journal of Comparative Psychology. 2010;124:1–13. doi: 10.1037/a0017703.
- Herrnstein RJ. Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior. 1961;4:267–272. doi: 10.1901/jeab.1961.4-267.
- Hineline PN, Sodetz FJ. Appetitive and aversive schedule preferences: Schedule transitions as intervening events. In: Commons ML, Mazur JE, Nevin JA, Rachlin H, editors. Quantitative analyses of behavior: Vol. 5. The effect of delay and of intervening events on reinforcement value. Hillsdale, NJ: Erlbaum; 1987.
- Jacobs JE, Potenza M. The use of judgement heuristics to make social and object decisions: A developmental perspective. Child Development. 1991;62:166–178.
- Kahneman D, Slovic P, Tversky A, editors. Judgment under uncertainty: Heuristics and biases. Cambridge, United Kingdom: Cambridge University Press; 1982.
- Klahr D. Goal formation, planning, and learning by pre-school problem solvers or: “My socks are in the dryer”. In: Siegler RS, editor. Children’s thinking: What develops? Hillsdale, NJ: Erlbaum; 1978. pp. 181–212.
- Krauss S, Wang XT. The psychology of the Monty Hall problem: Discovering psychological mechanisms for solving a tenacious brain teaser. Journal of Experimental Psychology: General. 2003;132:3–22. doi: 10.1037/0096-3445.132.1.3.
- Madden GJ, Bickel WK. Impulsivity: The behavioral and neurological science of discounting. Washington, DC: American Psychological Association; 2010.
- Mazur JE. Distributed versus exclusive preference in discrete-trial choice. Journal of Experimental Psychology: Animal Behavior Processes. 2010;36:321–333. doi: 10.1037/a0017588.
- Mazur JE, Biondi DR. Delay-amount tradeoffs in choices by pigeons and rats: Hyperbolic versus exponential discounting. Journal of the Experimental Analysis of Behavior. 2009;91:197–211. doi: 10.1901/jeab.2009.91-197.
- McDowell JJ, Caron ML. Matching in an undisturbed natural human environment. Journal of the Experimental Analysis of Behavior. 2010;93:415–433. doi: 10.1901/jeab.2010.93-415.
- Meyer DR. The effects of differential probabilities of reinforcement on discrimination learning by monkeys. Journal of Comparative and Physiological Psychology. 1960;53:173–175.
- Navarro AD, Fantino E. The sunk cost effect in pigeons and humans. Journal of the Experimental Analysis of Behavior. 2005;83:1–13. doi: 10.1901/jeab.2005.21-04.
- Nikiforidou Z, Pange J. The notions of chance and probabilities in preschoolers. Early Childhood Education Journal. 2010;38:305–311.
- Odum AL. Delay discounting: I’m a k, you’re a k. Journal of the Experimental Analysis of Behavior. 2011;96:427–439. doi: 10.1901/jeab.2011.96-423.
- Piattelli-Palmarini M. Inevitable illusions: How mistakes of reason rule our minds. New York: Wiley; 1994.
- Reynolds B. A review of delay-discounting research with humans: Relations to drug use and gambling. Behavioural Pharmacology. 2006;17:651–667. doi: 10.1097/FBP.0b013e3280115f99.
- Schneider SM, Lickliter R. Choice in quail neonates: The origins of generalized matching. Journal of the Experimental Analysis of Behavior. 2010;94:315–326. doi: 10.1901/jeab.2010.94-315.
- Siegler RS. Five generalizations about cognitive development. American Psychologist. 1983:263–277. (need to get volume)
- Siegler RS. Learning about learning. Merrill-Palmer Quarterly. 2004;50:353–368.
- Téglás E, Girotto V, Gonzalez M, Bonatti LL. Intuitions of probabilities shape expectations about the future at 12 months and beyond. Proceedings of the National Academy of Sciences. 2007;104(48):19156–19159. doi: 10.1073/pnas.0700271104.
- Wanchisen BA, Tatham TA, Hineline PN. Human choice in ‘counterintuitive’ situations: Fixed- versus progressive-ratio schedules. Journal of the Experimental Analysis of Behavior. 1992;58:67–85. doi: 10.1901/jeab.1992.58-67.
- Woolverton WL, Myerson J, Green L. Delay discounting of cocaine by rhesus monkeys. Experimental and Clinical Psychopharmacology. 2007;15:238–244. doi: 10.1037/1064-1297.15.3.238.






