Journal of the Experimental Analysis of Behavior. 2012 Jul;98(1):45–64. doi: 10.1901/jeab.2012.98-45

SAVING THE BEST FOR LAST? A CROSS-SPECIES ANALYSIS OF CHOICES BETWEEN REINFORCER SEQUENCES

Leonardo F Andrade 1, Timothy D Hackenberg 1
PMCID: PMC3408725  PMID: 22851791

Abstract

Two experiments were conducted to compare choices between sequences of reinforcers in pigeon (Experiment 1) and human (Experiment 2) subjects, using functionally analogous procedures. The subjects made pairwise choices among 3 sequence types, all of which provided the same overall reinforcement rate, but differed in their temporal patterning. Token reinforcement schedules were used in both experiments and the type of exchange schedule varied across blocks of sessions. Some conditions permitted immediate exchange of tokens for consumable reinforcers (food for pigeons, video access for humans); in other conditions, tokens accumulated and were exchanged for consumable reinforcers only at the end of the sequence. Choice patterns in the immediate-exchange conditions were generally similar across species, with both pigeons and humans preferring sequences with the shortest delay to the initial reinforcer in the series. The results are broadly consistent with models of temporal discounting expanded to include the impact of sequences of delayed reinforcers acting in parallel from the time of the choice. Preferences were less consistent with discounting models in the delayed exchange conditions. Questionnaire data gathered at the end of the experiment were consistent with prior results of questionnaire studies, but showed no straightforward relation to the observed choice patterns, urging caution in the extrapolation of results from one decision-making domain to the other.

Keywords: choice, reinforcer sequences, token reinforcement, behavioral economics, cross-species comparisons, humans, pigeons


Many patterns of adaptive choice—from foraging in a depleting patch of resources to saving money for retirement—involve sensitivity to patterns of consequences distributed over time. Most theories and models of choice, however, focus on choices with single outcomes. As a result, little is known about how sequences of outcomes and their possible interactive effects impact choice patterns (Ariely & Loewenstein, 2000; Kirby, 2006).

An influential study involving choices between sequences of outcomes was carried out by Loewenstein and Prelec (1993), in which human subjects answered questionnaires with hypothetical scenarios involving sequences of qualitatively different reinforcers. The number of outcomes in each sequence, as well as the interevent duration, was altered across question sets. In one set of questions, for example, subjects were first asked whether they preferred to have dinner at a fancy French restaurant or at a local Greek restaurant. Subsequently, the subjects who preferred the French restaurant were asked the following two questions: (1) Would you prefer to have dinner at the French restaurant on Friday in 1 month or on Friday in 2 months? and (2) Would you prefer to have dinner at the French restaurant on Friday in 1 month and dinner at the Greek restaurant on Friday in 2 months; or dinner at the Greek restaurant on Friday in 1 month and dinner at the French restaurant on Friday in 2 months? (Those who initially preferred the Greek restaurant received the converse choices.)

When the question involved a single outcome—Question (1) above—the majority of subjects preferred to have the French dinner sooner rather than later. When the question involved two outcomes, however, the Greek and the French restaurants—Question (2) above—most of these subjects chose to defer their favorite restaurant (French) and have the Greek dinner sooner. In other words, a majority of subjects favored sequences that improved in time. According to Loewenstein and Prelec (1993), when the sequential nature of events is highlighted, the gestalt properties of the sequence emerge. Said another way, the value of individual events is modulated by the overall temporal context in which those events are embedded. Including multiple events delineated as a sequence encourages sensitivity to the aggregate properties of the sequence.

Whatever the mechanism, the preference for improving sequences of outcomes found by Loewenstein and Prelec (1993) has been confirmed in numerous other studies with human subjects. Most of these studies involve some kind of hypothetical outcome, such as money (Guyse, Keller, & Eppel, 2002; Hsee, Abelson, & Salovey, 1991; Loewenstein & Sicherman, 1991; Matsumoto, Peecher, & Rich, 2000; Read & Powell, 2002), gambling (Read & Powell, 2002; Ross & Simonson, 1991), stock markets (Ariely & Zauberman, 2000; Matsumoto et al., 2000), grades (Hsee et al., 1991), vacations and meals in restaurants (Loewenstein & Prelec, 1993; Matsumoto et al., 2000; Montgomery & Unnava, 2009), quality of the environment (Guyse et al., 2002), health (Chapman, 2000; Guyse et al., 2002), and subjective experiences of discomfort (Varey & Kahneman, 1992). While fewer in number, studies using real stimuli, such as video games (Ross & Simonson, 1991), music (Montgomery & Unnava, 2009), aversive noise (Ariely & Loewenstein, 2000; Ariely & Zauberman, 2000; Schreiber & Kahneman, 2000), temperature (Ariely, 1998; Kahneman, Fredrickson, Schreiber, & Redelmeier, 1993), and mechanical pressure (Ariely, 1998), have also shown preference for improving sequences.

The experimental conditions in the studies cited above are often hypothetical situations described in sentences or depicted via some kind of graphic representation, in which the size of the geometric figures represents the intensity of some dimension of stimulus events. The dependent measure consists of choices, rankings, or ratings of these stimuli. When real stimuli are used, subjects are typically exposed to the stimulus first and then are asked to rank or rate it on some arbitrary scale after the exposure. When choosing between these real events, rarely do subjects gain access to the outcomes they select (e.g., Ariely & Loewenstein, 2000; Kahneman et al., 1993; Schreiber & Kahneman, 2000). In Ariely and Loewenstein's (2000, Experiment 4) study, for instance, subjects were instructed to listen to annoying sound sequences and then select which ones they would prefer to listen to in a subsequent phase of the experiment. Subjects were never actually exposed, however, to the choices to which they committed.

While such preferences for improving outcome sequences appear consistent with a sizeable body of research with human subjects, they run counter to dominant psychological and economic models based on value discounting. Most such discounting models have been formulated to deal with single outcomes only, but some have been expanded to include reinforcer sequences. According to one such model (Mazur, 1986), for example, the discounted value of a sequence of reinforcers is equal to the sum of the present discounted values of each individual reinforcer in the sequence:

$$V = \sum_{i=1}^{n} \frac{A}{1 + kD_i} \tag{1}$$

where V represents the value of a sequence of reinforcers, the ith of which is delivered after delay $D_i$; n represents the number of reinforcers in the sequence; A is the undiscounted value of a reinforcer; and k is a free parameter that determines how sharply V decreases as a function of D.

This way of calculating the nominal value of a sequence of reinforcers is very similar to one proposed earlier by McDiarmid and Rilling (1965), in which the value of a sequence was assumed to be equal to the sum of the immediacies of each reinforcer in the series (i.e., the reciprocals of the delays; see Brunner & Gibbon, 1995, for alternative delay-discounting models within sequences). Other models based on different methods for calculating reinforcer value make generally similar predictions, but Mazur's (1986) model will be used here, as it has been successfully applied to multiple-reinforcer sequences across species, including rats (Brunner, 1999; Brunner & Gibbon, 1995; Mazur, 2007), pigeons (Mazur, 1986; Shull, Mellon, & Sharp, 1990), and humans (Kirby, 2006).

When Equation 1 is applied to choice between reinforcer sequences, and A and k parameters are held constant, those sequences with shorter initial delays (but longer subsequent delays) should be preferred to those with longer initial delays (but shorter subsequent delays), owing to the disproportionate weighting of earlier reinforcers in the sequence. Therefore, according to this model, worsening sequences should be preferred to improving sequences.

Brunner (1999, Experiment 1) investigated rats' choices between sequences that improved or worsened in time. More specifically, rats were given repeated choices between a sequence in which the interpellet delay increased over time (worsening sequence) and an alternative sequence in which the interpellet delay decreased over time (improving sequence). Contrary to the results found with humans and qualitatively different reinforcers (e.g., Loewenstein & Prelec, 1993; Loewenstein & Sicherman, 1991), but consistent with delay-discounting models such as Equation 1, rats strongly preferred the worsening sequence.

The reasons for the discrepancy between the results of choice experiments with human and nonhuman subjects remain an important but unresolved question. Do such differences reflect qualitative differences between species (e.g., human verbal or rule-governed behavior that enhances sensitivity to global outcomes)? Or might the differences reflect methodological differences (e.g., differences in the ways humans and other animals are typically studied)? An important procedural difference is the format for presenting choices. In studies with humans, choices are typically presented once (so-called single-shot choices) and are based on hypothetical scenarios (i.e., subjects do not typically gain access to the chosen outcome). In studies with nonhumans, on the other hand, subjects are repeatedly exposed to choices and the contingent outcomes. This repeated exposure, then, allows for choices to be affected by experience with the task.

Even in choice studies with humans in which procedures include actual consequences presented repeatedly over time, there is mounting evidence that choices are affected by the economic context in which the choices are embedded. More specifically, when choices produce immediately consumable reinforcers, greater discounting of delayed outcomes is normally seen (Jimura, Myerson, Hilgard, Keighley, Braver, & Green, 2011; Lagorio & Hackenberg, 2010; Locey, Pietras, & Hackenberg, 2009). On the other hand, when choices produce tokens later exchangeable for consumable reinforcers, less discounting is typically seen. Indeed, many studies utilizing token reinforcement procedures are consistent with global maximizing of overall reinforcement rate, in which value is assumed to be equal to the nondiscounted arithmetic average of aggregate reinforcer per unit of time (e.g., Flora & Pavlik, 1992; Logue, Peña-Correal, Rodriguez, & Kabela, 1986).

Therefore, because consumable and token reinforcers are usually associated with procedures used with nonhumans and humans, respectively, cross-species comparisons have been confounded with these differences in methods. More meaningful comparisons between humans and nonhumans require experimental paradigms that are more methodologically equivalent. This methodological standardization is one of the broad aims of the present study. Here, we made the typical experimental procedures used with pigeons and humans more comparable by introducing token reinforcers exchangeable for consumable reinforcers at different times of the session. The consumable reinforcers consisted of food for pigeons and video clips from popular television shows for humans, a reinforcer that has proven effective in laboratory research with humans (Hackenberg & Pietras, 2000; Lagorio & Hackenberg, 2010; Locey et al., 2009; Navarick, 1996, 1998).

Two experiments were conducted to assess the pattern of choice in pigeons (Experiment 1) and humans (Experiment 2) between sequences of token and consumable reinforcers that provided the same overall rate, but different temporal patterning: a Worsening (WOR) sequence with increasing interreinforcement delays, an Improving (IMP) sequence with decreasing interreinforcement delays, and a Standard (STD) sequence composed of fixed interreinforcement delays. In addition, the sequences of choice outcomes were embedded in two different economic contexts to assess temporal sensitivity to the exchange periods: Immediate Exchange (IE), in which token exchange opportunities were made available immediately after each token delivery; and Delayed Exchange (DE), in which token exchange opportunities were made available after the delivery of the final reinforcer in a four-reinforcer sequence.

Value discounting models, such as Equation 1, predict differential preference in the IE conditions (after holding the other variables constant). Because the discounting function decreases more sharply in the short run compared to the long run, Equation 1 predicts that the first reinforcer in the series will exert stronger control over preference than the more delayed ones. Figure 1 shows the discounting functions of each sequence (WOR, STD, and IMP) under IE conditions based on Equation 1 (see Panels A, B, and C). Note that all four reinforcers are obtained within a 60-s trial and, therefore, the overall reinforcement rate is constant at four per minute. What differs between panels is whether the time to the next token increases over the trial (5 s to Token 1, 10 more s to Token 2, 15 more s to Token 3, and an additional 30 s to Token 4; WOR), decreases over the trial (30 s to Token 1, 15 more s to Token 2, 10 more s to Token 3, and an additional 5 s to Token 4; IMP), or is unchanged (STD). For simplicity, we set the free parameter k equal to 1 (a value that provided a reasonable description of pigeon and human choice in a recent study with similar procedures [Lagorio & Hackenberg, 2010]) and the parameter A equal to 1 (because reinforcer amount was not manipulated and remained constant throughout the current study). Panel D shows the summed value of each sequence, from which the following ordinal ranking emerges: WOR > STD > IMP. Thus, Equation 1 predicts differential preference among the three sequences. Conversely, under the DE conditions (predicted values not shown in Figure 1), this model predicts indifference among the different sequences because the consumable reinforcers are made available at the same time (at the end of the sequence), and thus the value of each sequence is the same.

Fig. 1.


Value of Worsening (Panel A), Standard (Panel B), and Improving (Panel C) sequences of four reinforcers, all timed from a single choice point, computed according to Equation 1. Panel D shows the summed values of each sequence from the Immediate Exchange conditions.
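The ordinal predictions described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: it evaluates Equation 1 with A = 1 and k = 1 for the three sequences, using the delays from the choice point given in the text. The function and variable names are illustrative, and the Delayed Exchange case rests on the simplifying assumption that all four food deliveries occur at the end of the 60-s sequence.

```python
def sequence_value(delays, amount=1.0, k=1.0):
    """Equation 1: sum of A / (1 + k * D_i) over the reinforcers in a sequence."""
    return sum(amount / (1.0 + k * d) for d in delays)

# Delays (s) from the choice point to each of the four reinforcers
# in the 60-s trial described in the text (Immediate Exchange).
IMMEDIATE_EXCHANGE = {
    "WOR": [5, 15, 30, 60],
    "STD": [15, 30, 45, 60],
    "IMP": [30, 45, 55, 60],
}

# Simplifying assumption for Delayed Exchange: every food delivery is
# treated as occurring at the end of the 60-s sequence.
DELAYED_EXCHANGE = {name: [60, 60, 60, 60] for name in IMMEDIATE_EXCHANGE}

ie_values = {name: sequence_value(d) for name, d in IMMEDIATE_EXCHANGE.items()}
de_values = {name: sequence_value(d) for name, d in DELAYED_EXCHANGE.items()}

# Rank the sequences from highest to lowest summed value under IE.
ranking = sorted(ie_values, key=ie_values.get, reverse=True)
for name in ranking:
    print(f"{name}: IE = {ie_values[name]:.3f}, DE = {de_values[name]:.3f}")
```

Under these assumptions the summed IE values are about 0.278 (WOR), 0.133 (STD), and 0.088 (IMP), reproducing the WOR > STD > IMP ranking, whereas the DE values are identical across sequences.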

In sum, the main goal of the current investigation was to compare choices between reinforcer sequences across species (pigeons and humans) and economic contexts (IE, permitting immediate exchange and consumption, and DE, permitting exchange and consumption at the end of the sequence), using procedures that were functionally similar.

EXPERIMENT 1

Method

Subjects

Four naïve male White Carneau pigeons (Columba livia) served as subjects. The pigeons were housed individually in a humidity- and temperature-controlled colony room where they had continuous access to water and grit. The lights in the colony room were on from 7:00 a.m. to 11:00 p.m. The pigeons were maintained at approximately 83% of their free-feeding weights via additional post-session feeding when necessary.

Apparatus

A standard operant conditioning chamber with a modified stimulus panel served as the experimental space. The working space measured 35 cm high by 35 cm wide by 30.5 cm long. The stimulus panel contained three horizontally aligned plastic keys located 8.7 cm from the top. Each key had a circumference of 7.85 cm, and the keys were spaced 5.7 cm apart. The minimum force required to operate each key was approximately 0.3 N. A row of 12 horizontally aligned and evenly spaced red stimulus lights was inserted in the panel 4.5 cm below the ceiling and served as tokens. The circumference of each token was approximately 4.71 cm. Only the four centermost token lights were used. A white houselight centrally located above the row of tokens remained on throughout the session, except when food was being delivered. A food hopper delivered mixed grain accessed through a centrally located 5.8 by 5.8 cm aperture, which was 10 cm above the floor. The experiment was controlled through a microcomputer and MED-PC IV software/interface located in an adjacent room.

Training

All pigeons were first exposed to 1 or 2 days of adaptation to the chamber, followed by sessions of magazine training, key-peck shaping, token-exchange training, and token-production training. When key pecking had been established, the pigeons were exposed to sessions in which the tokens were paired with food. This was accomplished by transilluminating the center red key and the four centermost tokens. Each peck on the center red (exchange) key produced 3-s access to food and a short (0.04 s) beep, and turned off the center key and a single token. Immediately after the food hopper was lowered, the red center key was re-illuminated and a new cycle began. Each subsequent exchange response produced the same events, until all four tokens had been exchanged for food. Immediately following the exchange of the fourth token, a new trial (composed of the presentation of four tokens) began. The tokens were always exchanged in sequence (either from left to right, or vice versa), with the starting position determined randomly each four-token cycle. Ten training sessions were required, each lasting 12 trials.

Following token-exchange training, the pigeons were trained to produce tokens by pecking the side keys. On each cycle, the left or right key was illuminated white, and a single peck produced four tokens simultaneously, a short beep, and the center (exchange) key. Each peck on the exchange key produced food, as described above, until all four tokens had been exchanged, after which a new cycle began with one of the side keys lit white. The position of the active key (left or right) was determined randomly in each cycle, with the restriction that each occurred six times in each session. Pigeons were exposed to 10 token-production training sessions, each lasting for 12 cycles.

Experimental Procedure

Following training, the pigeons were given choices between sequences of tokens and food using a concurrent-chains schedule with two links. At the beginning of each choice cycle, the center key was lit white, and a single peck on it produced the initial-link stimuli. This initiation response was implemented to reduce the likelihood of position biases by ensuring that the initial links began with a response that was equidistant to the side keys. The initial link consisted of a concurrent fixed-ratio 1 fixed-ratio 1 (Concurrent FR1 FR1) in the presence of white keys. Thus, a single peck on either white side (choice) key produced one of two terminal-link stimuli—a green or a yellow side key. (Within a condition, each color was associated with a specific reinforcer sequence.) The terminal link comprised a sequence of delays to each of four tokens and an exchange schedule. Each token presentation was accompanied by a brief beep. The tokens were presented from left to right (if the initial-link response had been on the left key) or from right to left (if the initial-link response had been on the right key). Each terminal link was followed by a 5-s intertrial interval (ITI) during which only the houselight remained on.

Figure 2 (Panel A) shows a schematic of the terminal-link events. Three different sequences providing the same overall rate of reinforcement, but different temporal patterning, were used. All sequences included four tokens in the terminal link delivered over the same overall time span, timed from terminal-link onset. More specifically, the tokens were presented response-independently in the terminal link and the total time over which all four were presented was held constant at 60 s. In the Standard (STD) sequence, four tokens were presented at equal intertemporal delays of 15 s. In the Improving (IMP) sequence, the delays between successive tokens decreased (30 s, 15 s, 10 s, and 5 s); whereas in the Worsening (WOR) sequence, the delays between successive tokens increased (5 s, 10 s, 15 s, and 30 s). Note that the intertemporal delay between token presentation, as well as overall duration of the terminal link, depicted in Figure 2, does not include the token exchange and food consumption periods.

Fig. 2.


Diagram of the terminal-links implemented for each sequence in Experiments 1 and 2. The horizontal lines show the terminal link with time going from left to right. Each vertical bar represents the temporal placement of tokens timed from terminal-link onset. In Experiment 1, tokens were presented at 15, 30, 45, and 60 s in the Standard (STD) sequence, 5, 15, 30, and 60 s in the Worsening (WOR) sequence, and 30, 45, 55, and 60 s in the Improving (IMP) sequence. In Experiment 2, tokens were presented at 30, 60, 90, and 120 s in the STD sequence; 10, 30, 60, and 120 s in the WOR sequence, and 60, 90, 110, and 120 s in the IMP sequence.
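The token times listed in the caption follow directly from the inter-token delays given in the text. As a minimal sketch (the names below are illustrative and not taken from the original control software), cumulative sums of the Experiment 1 delays give the token placements, and doubling them gives the Experiment 2 placements reported in the caption.

```python
from itertools import accumulate

# Inter-token delays (s) for Experiment 1, as described in the text.
INTER_TOKEN_DELAYS_EXP1 = {
    "STD": [15, 15, 15, 15],
    "WOR": [5, 10, 15, 30],
    "IMP": [30, 15, 10, 5],
}

def token_times(delays, scale=1):
    """Times of token delivery measured from terminal-link onset."""
    return [scale * t for t in accumulate(delays)]

exp1 = {name: token_times(d) for name, d in INTER_TOKEN_DELAYS_EXP1.items()}
# The Experiment 2 times in the caption are double the Experiment 1 times.
exp2 = {name: token_times(d, scale=2) for name, d in INTER_TOKEN_DELAYS_EXP1.items()}

for name in ("STD", "WOR", "IMP"):
    rate_per_min = len(exp1[name]) / exp1[name][-1] * 60
    print(f"{name}: Exp. 1 {exp1[name]}, Exp. 2 {exp2[name]}, "
          f"{rate_per_min:.0f} tokens/min in Exp. 1")
```

Every sequence ends at 60 s in Experiment 1, so the overall rate is four tokens per minute regardless of temporal patterning.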

Besides manipulation of the intertemporal delay of token presentation, the other major independent variable was the scheduling of the token-exchange period. Tokens were either exchanged for food after the delivery of the fourth token in the sequence—Delayed Exchange (DE)—or immediately after the delivery of each individual token within the sequence—Immediate Exchange (IE). The token-exchange schedule was signaled by the darkening of the side keys and the illumination of the center red key. During the IE conditions, when the token and exchange key were presented, the delay timer for the next token delivery in the sequence was stopped until the token earned was exchanged. After the completion of the token exchange, the timer that controlled the delay to the next token was reset and the side key pecked in the choice phase was re-illuminated until the presentation of the next token. The exchange schedule was identical to the one used during training, except that each token was exchanged for 2.5-s rather than 3-s access to food. Latencies to exchange the tokens, once available, were typically less than 1 s.

Sessions were scheduled once per day, 7 days per week. A session was composed of 12 cycles—2 forced and 10 free-choice cycles. On forced cycles, only one of the two initial-link keys was lit. Such cycles were implemented to ensure adequate exposure to both alternatives. The order of presentation of each forced alternative was randomly determined, but both alternatives were always presented once during the initial two cycles each session. The final 10 cycles of each session were free-choice cycles, in which both alternatives were available. Sessions lasted for approximately 16 min.
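The reported session length can be cross-checked against the scheduled events. The sketch below is a back-of-the-envelope calculation under stated assumptions: it counts only the programmed token delays, food access, and ITI, and ignores all response latencies (initiation, choice, and exchange pecks), so it yields a lower bound. The constant and function names are illustrative.

```python
CYCLES_PER_SESSION = 12      # 2 forced + 10 free-choice cycles
TOKEN_DELAY_TOTAL_S = 60.0   # summed delays to the four tokens in a terminal link
TOKENS_PER_CYCLE = 4
FOOD_ACCESS_S = 2.5          # food access per exchanged token
ITI_S = 5.0                  # intertrial interval after each terminal link

def min_session_minutes(cycles=CYCLES_PER_SESSION):
    """Lower bound on session duration, ignoring all response latencies."""
    cycle_s = TOKEN_DELAY_TOTAL_S + TOKENS_PER_CYCLE * FOOD_ACCESS_S + ITI_S
    return cycles * cycle_s / 60.0

print(f"Minimum session duration: {min_session_minutes():.1f} min")
```

The programmed events sum to 15 min per session, which leaves roughly 1 min for response latencies and is in line with the reported duration of approximately 16 min. Because the same amount of food access is scheduled in both exchange arrangements, the bound is the same for IE and DE conditions.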

Table 1 shows the order in which subjects experienced each condition and the number of sessions conducted in each. Some conditions were replicated to assess reliability of preferences. There were also frequent reversals of the contingencies to assess position and color biases. In the transition to a new condition, the higher-value sequence was always placed on the side opposite to the one preferred in the immediately preceding condition. The position of each alternative sequence was counterbalanced across subjects. Because Pigeon P894 developed a side bias during the experiment, it was exposed to two additional conditions with a higher ratio of forced-choice to free-choice trials, intended to rectify the bias.

Table 1.

Sequence of conditions and mean number of responses in the terminal-link for each alternative in Experiment 1.

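| Sub. | Cond. Order (a) | Total Sessions | Left Alternative: Sequence | Left Alternative: Resp. T-L | Right Alternative: Sequence | Right Alternative: Resp. T-L |
|---|---|---|---|---|---|---|
| P883 | 1 | 22 | STD | 5.4 | IMP | 0 |
| P883 | 2 | 19 | IMP | 0.4 | STD | 2.2 |
| P883 | 3 | 12 | WOR | 7.8 | STD | 0 |
| P883 | 4 | 16 | IMP | 0 | WOR | 0.6 |
| P883 | 5 | 31 | IMP(DE) | 8 | WOR(DE) | 4.2 |
| P883 | 6 | 12 | WOR | 0.2 | STD | 0 |
| P883 | 7 | 13 | STD | 0 | WOR | 2.2 |
| P883 | 8 | 19 | WOR | 0.6 | STD | 0 |
| P883 | 9 | 19 | WOR(DE) | 13.6 | STD(DE) | 1.2 |
| P883 | 10 | 18 | IMP | 0 | WOR | 0.2 |
| P702 | 1 | 12 | STD | 0 | IMP | 4.2 |
| P702 | 2 | 13 | IMP | 0 | STD | 2.2 |
| P702 | 3 | 17 | STD | 3 | IMP | 0 |
| P702 | 4 | 12 | STD | 0 | WOR | 0.2 |
| P702 | 5 | 16 | WOR | 5.8 | IMP | 0 |
| P702 | 6 | 41 | WOR(DE) | 3.4 | IMP(DE) | 3.6 |
| P702 | 7 | 23 | STD | 0 | WOR | 0 |
| P702 | 8 | 21 | WOR | 0 | STD | 0 |
| P702 | 9 | 36 | WOR(DE) | 0.8 | STD(DE) | 0.6 |
| P702 | 10 | 16 | IMP | 0 | WOR | 0.2 |
| P894 | 1 | 12 | IMP | 0 | STD | 9.6 |
| P894 | 2 | 5 | STD | 0 | IMP | 15.8 |
| P894 | 3 | 6 | STD | 30.4 | IMP | 1.4 |
| P894 | 4 | 18 | IMP | 0 | STD | 12 |
| P894 | 5 | 20 | WOR | 0.8 | STD | 0 |
| P894 | 6 | 18 | IMP | 0 | WOR | 2.2 |
| P894 | 7 | 16 | IMP(DE) | 49.2 | WOR(DE) | 5 |
| P894 | 8 | 14 | WOR | 26.4 | STD | 0.8 |
| P894 | 9 | 17 | STD | 0 | WOR | 13.8 |
| P894 | 10 | 20 | STD(DE) | 2.4 | WOR(DE) | 11.4 |
| P894 | 11 | 16 | WOR | 0.2 | IMP | 0 |
| P942 | 1 | 18 | IMP | 0 | STD | 694 |
| P942 | 2 | 26 | STD | 686 | IMP | 0 |
| P942 | 3 | 13 | STD | 0 | WOR | 306 |
| P942 | 4 | 12 | WOR | 471 | IMP | 0 |
| P942 | 5 | 25 | WOR(DE) | 21.6 | IMP(DE) | 518 |
| P942 | 6 | 18 | STD | 0 | WOR | 656 |
| P942 | 7 | 15 | WOR | 176 | STD | 10.6 |
| P942 | 8 | 18 | WOR(DE) | 2 | STD(DE) | 129 |
| P942 | 9 | 13 | IMP | 0 | WOR | 422 |

Note. All conditions are Immediate Exchange (IE) conditions, unless otherwise specified. DE = Delayed Exchange condition; Resp. T-L = mean number of responses in the terminal link.

(a) Order in which subjects experienced each experimental condition in Experiment 1.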

Conditions remained in effect for a minimum of 12 sessions and until the proportion of initial-link responses was deemed stable according to the following criteria: (a) absence of increasing or decreasing trend across five consecutive sessions; and (b) absence of the highest or lowest point in the condition.
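One way to operationalize these stability criteria is sketched below. The text does not specify the exact decision rule, so the sketch rests on two labeled assumptions: a "trend" is taken to mean a strictly monotonic run across the last five sessions, and criterion (b) is taken to mean that the last five sessions set no new high or low for the condition. The function name and the example data are hypothetical.

```python
def is_stable(proportions, min_sessions=12, window=5):
    """Check the stability criteria on per-session choice proportions.

    Assumed interpretation (not specified in the text):
    (a) no strictly increasing or decreasing run across the last `window` sessions;
    (b) the last `window` sessions contain no new high or low for the condition.
    """
    if len(proportions) < min_sessions:
        return False
    last = proportions[-window:]
    earlier = proportions[:-window]
    diffs = [b - a for a, b in zip(last, last[1:])]
    monotonic = all(d > 0 for d in diffs) or all(d < 0 for d in diffs)
    new_extreme = max(last) > max(earlier) or min(last) < min(earlier)
    return not monotonic and not new_extreme

# Hypothetical choice proportions, for illustration only.
settled = [0.5, 0.6, 0.4, 0.7, 0.8, 0.9, 1.0, 0.9, 0.8, 0.9, 0.9, 0.8, 0.9]
still_rising = [0.5] * 7 + [0.5, 0.6, 0.7, 0.8, 0.9]

print(is_stable(settled))       # True
print(is_stable(still_rising))  # False
```

Ties with earlier extremes are allowed under this reading, which matters when preference is near exclusive and proportions repeatedly sit at 0 or 1.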

Results and Discussion

Figure 3 (Panels A, B, and C) shows the mean proportion of choices for the last five (stable) sessions of each experimental condition during IE conditions. To facilitate the visual analysis, each pairwise comparison is grouped and displayed in a single panel of each figure. Bars on the left and right side of each graph represent the choices allocated to the left and to the right side alternatives, respectively. The labels on the left correspond to the sequence available on the left choice alternative, and the labels on the right indicate the sequence available on the right alternative. The labels in bold show which sequence of the pair had the highest value (computed as per Equation 1). Error bars indicate standard deviations. Note that the order of conditions shown in the figure is not the order in which the conditions were experienced, but is presented in this way to highlight the relevant comparisons. See Table 1 for the experienced order of conditions for each pigeon.

Fig. 3.


Mean (SD) proportion of choices for each alternative during the final five sessions of immediate-exchange conditions in Experiment 1. Bars on the left and on the right side depict choice proportions on the left and right side, respectively. Panel A: choice between the Standard (STD) and the Improving (IMP) sequences. Panel B: choice between the Worsening (WOR) and STD sequences. Panel C: choice between WOR and IMP sequences. See text for further details.

Panel A in Figure 3 shows choice proportions for each pigeon in the STD versus IMP comparison. Pigeons P883 and P942 strongly preferred (> .90 choice proportions) the STD sequence. Pigeon P702 preferred the IMP sequence during the first exposure, but the STD sequence in the subsequent two side-reversal conditions. Pigeon P894 showed a strong bias toward the right key in the first two experimental conditions, so it was exposed to two additional conditions with a higher ratio of forced-choice to free-choice trials to rectify the bias. After exposure to these training conditions, this pigeon also preferred the STD sequence in two subsequent conditions (including a side reversal).

Panel B shows choice proportions in the WOR versus STD comparison. All pigeons exhibited strong preferences (≥ .90 choice proportions) for the WOR sequence. For Pigeons P702, P894, and P942 such preferences were seen in every condition and replication; for Pigeon P883, preference for WOR was seen in three of four conditions. Specifically, the occasion on which P883 preferred STD over WOR was in the condition immediately following the DE condition (see Table 1).

Panel C shows choice proportions in the WOR versus IMP comparison. All pigeons showed a strong preference for the WOR sequence on both occasions in which they were exposed to these conditions. This was the critical comparison, as far as reconciling prior results is concerned. The results are clear and consistent with prior studies conducted with nonhuman subjects: strong preference for the sequence with the shorter initial delay to reinforcement.

Equation 1 predicts that preferences should be ranked in the following way: WOR > STD > IMP. We analyzed preference using a binomial test, entering in the model the proportion of cases in which preference for the higher-value sequence was observed. Results revealed a significant effect of the sequence type on preference, with greater preference for the higher-value sequence compared to the other sequences (ps < .001). Therefore, the results obtained in Experiment 1 provide strong support for the ordinal predictions of Equation 1 and are consistent with prior studies that tested the predictions of this model with multiple reinforcers in rats (Brunner, 1999; Brunner & Gibbon, 1995; Mazur, 2007) and pigeons (Mazur, 1986; Shull et al., 1990).
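For readers unfamiliar with the test, the following sketch shows how an exact binomial tail probability can be computed. It is an illustration only: the text does not report the counts entered in the test or whether it was one- or two-sided, so the one-sided form, the chance level of .5, and the example counts are all assumptions, and the function name is hypothetical.

```python
from math import comb

def binomial_tail(successes, trials, p_null=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p_null): a one-sided exact test."""
    return sum(
        comb(trials, i) * p_null**i * (1.0 - p_null) ** (trials - i)
        for i in range(successes, trials + 1)
    )

# Hypothetical counts, for illustration only: preference for the
# higher-value sequence observed in 19 of 20 cases.
p_value = binomial_tail(19, 20)
print(f"One-sided exact p = {p_value:.6f}")
```

With a null probability of .5 the distribution is symmetric, so a two-sided p value is simply twice the one-sided tail (capped at 1).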

Equation 1 specifies that reinforcer value is inversely related to relative reinforcer delays, but there are at least two potentially important delays: delays to tokens and delays to exchange periods when those tokens can be cashed in for food. Because tokens are exchanged immediately for food under IE conditions, these two delays are confounded. Under DE conditions, however, tokens are delivered in the same temporal sequence, but the exchange periods now occur at the end of the four-reinforcer terminal link. If value is calculated with respect to token delays, then Equation 1 predicts no difference in choice proportions under IE and DE conditions. If value is calculated with respect to the delays to exchange periods and food, however, Equation 1 predicts (1) indifference in the DE conditions because delays to exchange and food are constant across sequence types and (2) differential preference in IE conditions because delays to exchange and food vary across sequence types. The actual choice patterns fell somewhere in between these predictions.

Figure 4 shows choice proportions for the final five sessions under the DE conditions, in relation to two pairwise comparisons: WOR versus STD (Panel A) and WOR versus IMP (Panel B). Choice proportions were less extreme during DE conditions than during analogous IE conditions, though they usually stabilized at levels above indifference. When the choice proportions favored one alternative over another in DE conditions, they generally were in accord with the more favorable relative token delays, computed as in Equation 1 (WOR > STD > IMP). Although these results are broadly consistent with a view of the tokens as conditioned reinforcers—that is, with the view that tokens acquire reinforcing functions similar to the primary reinforcers with which they have been repeatedly paired—such choice patterns were confounded with order effects. Because the DE conditions functioned mainly as probes, these conditions were arranged to immediately follow adjacent and otherwise identical IE conditions. Preferences seen during these DE conditions were generally in the same direction as (albeit somewhat weaker than) those obtained in the immediately prior IE condition. To disentangle order and possible carryover effects from conditioned reinforcement effects, future research should include within-subject replications and reversals and/or between-subject counterbalancing of condition order.

Fig. 4. Mean (SD) proportion of choices for each alternative during the final five sessions of Delayed Exchange conditions in Experiment 1. Bars on the left and on the right side depict choice proportions from the left and right sides, respectively. Panel A: choice between Worsening (WOR) and Standard (STD). Panel B: choice between WOR and Improving (IMP). See text for further details.

Because DE conditions served mainly as probes, designed to reveal sensitivity to the economic context, the critical comparisons are between the choice patterns in the IE and adjacent DE conditions. Figure 5 shows the mean proportion of choice favoring the WOR sequence in the DE condition and the immediately preceding IE condition. Preferences were less extreme (moving toward indifference) in the transition from the IE to the DE conditions. When WOR was pitted against STD (Panel A), there was a trend toward indifference when moving from IE to DE conditions, albeit weaker compared to the transition to the WOR versus IMP (DE) condition (Panel B). When WOR was pitted against IMP, preference moved from near exclusive to near indifference (ranging from 42% to 56%) in three out of the four pigeons.

Fig. 5. Mean proportion of choices for the Worsening (WOR) sequence from the last five sessions of Experiment 1. A) Choice between WOR and Standard (STD) during Immediate Exchange and Delayed Exchange conditions; B) Choice between WOR and Improving (IMP) during Immediate Exchange and Delayed Exchange conditions. See text for further details.

In sum, preferences in IE conditions were ordered with respect to relative reinforcer delays, computed as per Equation 1, with multiple reinforcers all timed from a single choice point. The move toward indifference in choice proportions under DE conditions is broadly consistent with the predictions of Equation 1 with reinforcer delays computed with respect to exchange/food delays. That preferences generally stabilized above indifference, and in the direction of relatively more immediate token delays, is broadly consistent with the conditioned reinforcing functions of the tokens. In the absence of replications or reversals, however, alternative accounts of these preferences cannot be ruled out.

EXPERIMENT 2

Although preference for improving sequences has been consistently reported in the literature using humans as subjects, the vast majority of these studies used single-shot choice procedures rather than repeated choices that are more typical of research with nonhuman subjects (including our Experiment 1). One of the main goals of Experiment 2 was to analyze human preference using a procedure that allowed subjects to repeatedly choose between sequences of outcomes and be repeatedly exposed to the contingent outcomes of their choices. To facilitate comparisons across species, the procedure implemented in this experiment was analogous to Experiment 1: The subjects chose between sequences in which the inter-reinforcement delay increased (WOR), decreased (IMP), or remained fixed (STD) in the terminal link. As in Experiment 1, choices produced tokens and consumable reinforcers in the terminal links, but instead of food, popular TV shows were used as reinforcers. In addition, the economic context, defined in terms of token-exchange opportunities (IE, DE), was also manipulated as in Experiment 1.

Method

Participants

Eleven participants were recruited via newspaper advertisements or flyers posted on campus, and each signed an informed consent form before participating. None had prior experience with similar experiments and all expressed high interest in television shows. Prior to exposure to the main experimental conditions described below, all participants were exposed to a contingency aimed to assess delay sensitivity to the video reinforcer (see below). Only those participants who showed strong and unambiguous preference for the shorter delay were invited to continue in the experiment. Of the initial pool of 11 participants, 5 did not pass the delay-sensitivity test, and 2 chose voluntarily to leave prior to the main experimental conditions. Therefore, the results of the present experiment are based on the performance of the 4 remaining participants: 2 male and 2 female undergraduate college students between the ages of 18 and 23 years. They completed 24 to 54 sessions over the course of the study, and they earned approximately $6 per hour. Two participants completed all planned experimental conditions, and 2 left voluntarily after two or three conditions.

Setting, Apparatus, and Materials

The experiment was conducted in one of two small rooms, each containing a chair, a desk, a computer, a pair of speakers, a keyboard, and a mouse. Both rooms were used during the experiment, but a given participant was always studied in the same room. During sessions, participants remained seated in front of the computer monitor and responded to the visual stimuli presented on the screen by “clicking” with the computer mouse. The computers were IBM-compatible, and the visual interface displayed on the screen, as well as data collection, was controlled via a Visual Basic 6.0 software program. The monitor screen measured approximately 36.5 cm wide by 27.5 cm high, and was placed on the desk at the approximate eye-level height of the participant when seated.

The visual interface displayed on the screen of the computer was composed of four aligned red circles that served as tokens, and three aligned colored rectangles. The circles had a circumference of 7.85 cm and were aligned approximately 3.5 cm from the top of the screen. They were equally spaced from each other (2 cm) and centered on the screen. The three 7.7 cm wide by 7 cm high rectangles, representing the choice alternatives and token exchange response, were centered on the screen, and located approximately 4.5 cm below the tokens and 3.2 cm from each other. The color of the rectangles depended on the experimental condition or the specific link within the chained schedule (see below). The background screen color was gray. When inactive, the tokens and the rectangles were also colored gray, but remained slightly visible on the screen.

A variety of popular TV shows were used as reinforcers. The videos were converted to AVI format and stored on the hard disk of both computers. An episode was divided into segments of approximately 30 s, and a segment was played each time the participant exchanged a token. A total of 48 video segments, which corresponded to a full show episode, were played in chronological order during each session. The particular episode played during a session was selected by the participant prior to the beginning of that session from among 10 options: 1) Friends – Season 6; 2) Friends – Season 7; 3) Family Guy – Season 1; 4) Looney Tunes; 5) Seinfeld – Season 4; 6) The Simpsons – Season 2; 7) The Simpsons – Season 3; 8) Sports Bloopers; 9) Will and Grace – Season 1; and 10) Wallace and Gromit. To avoid repeated episodes, the program would automatically select the next available episode in the sequence. In the event a participant had watched all episodes available of a given show, the program would prompt the participant to choose a different show prior to the initiation of the session.

Experimental Procedure

During each session, participants remained alone in the room. Two sessions, lasting approximately 50 min each, were scheduled per day, usually 5 days a week. Sessions occurred successively, and a 5-min break was inserted between the sessions. No timing device of any sort was permitted in the experimental room. Participants received minimal instructions about the experimental contingencies during the experiment. More specifically, after choosing a TV show (as described above) and clicking on the “continue” button, two additional messages were displayed: “You will need to use only the mouse for this part of the experiment.” and “When you are ready to begin, click the “begin” button below.” Immediately after participants clicked on the buttons following the prompts, the experimental session started, and participants were exposed to the experimental choice contingency.
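As a rough consistency check on the reported session length, the sketch below combines figures stated elsewhere in this Method section (48 segments per session, four tokens per terminal link, a 120-s terminal link, and roughly 30-s segments). It assumes that a session ends after the 48th segment and that response latencies are negligible; neither assumption is stated in the source, and the constant names are illustrative.

```python
# Figures taken from the Method section; the session-ending rule is an assumption.
SEGMENTS_PER_SESSION = 48  # video segments played per session
TOKENS_PER_CYCLE = 4       # tokens (and segments) earned per terminal link
TERMINAL_LINK_S = 120      # summed token delays in one terminal link (s)
SEGMENT_S = 30             # approximate duration of one video segment (s)

# Number of choice cycles needed to play every segment.
cycles = SEGMENTS_PER_SESSION // TOKENS_PER_CYCLE

# One cycle = programmed delays plus the four video segments it produces.
cycle_s = TERMINAL_LINK_S + TOKENS_PER_CYCLE * SEGMENT_S

# Estimated session length in minutes, ignoring response latencies.
session_min = cycles * cycle_s / 60
print(cycles, cycle_s, session_min)
```

Under these assumptions the estimate falls just under the stated 50 min, leaving a small margin for trial-initiation, choice, and exchange responses.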

Delay-Sensitivity Test

Prior to exposure to the main experimental conditions described below, all participants were exposed to a contingency aimed to assess delay sensitivity to the video reinforcer. During this preexperimental phase, the contingency involved choices between alternatives that produced a single 30-s video segment delivered after different delays. One alternative (see description of the choice contingency below) produced a video clip segment after 5 s, while the other alternative produced the same outcome after a delay of 30 s. The duration of the video clip segment was the same (30 s); the only difference was the delay between the choice and the video clip. To maintain a constant reinforcement rate between both alternatives, a 25-s postreinforcer delay followed the video clip after the 5-s delay. Each session consisted of 8 forced and 40 free-choice trials. Participants completed a minimum of 10 sessions.
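The equal-rate arrangement described above can be checked with a few lines of arithmetic. This is a minimal sketch using only the delays given in the text; the function and variable names are illustrative and are not taken from the original software.

```python
VIDEO_S = 30  # duration of the video segment on both alternatives (s)

def trial_duration(pre_delay_s, post_delay_s, video_s=VIDEO_S):
    """Time from the choice to the end of the trial:
    delay to the video + video + postreinforcer delay."""
    return pre_delay_s + video_s + post_delay_s

# Short-delay alternative: 5-s delay, then video, then 25-s postreinforcer delay.
short_alt = trial_duration(pre_delay_s=5, post_delay_s=25)

# Long-delay alternative: 30-s delay, then video, no postreinforcer delay.
long_alt = trial_duration(pre_delay_s=30, post_delay_s=0)

print(short_alt, long_alt)
```

Both alternatives occupy the same total time per trial, so a preference for the 5-s alternative cannot be attributed to a higher overall rate of video access.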

Experimental Procedure

The experimental procedures implemented here were analogous to those used in Experiment 1. Participants were given repeated choices between sequences of tokens and video clips using a concurrent-chains schedule with two links. At the beginning of each cycle, the visual display showed a flashing white centered rectangle, and a click on the rectangle produced the initial-link stimuli (i.e., trial-initiation response). The initial link consisted of a concurrent FR 1 FR 1 schedule of reinforcement. In the initial link, the tokens and center rectangle were inactive, while the two side rectangles were colored yellow or green and remained flashing on the screen. A single response on either rectangle produced the terminal-link stimuli, signaled by the following events: (1) The clicked alternative stopped flashing and became inactive; (2) the other alternative also became inactive, but the rectangle color changed to gray. The terminal link consisted of a sequence of delays to each of four tokens and exchange periods associated with the chosen alternative. A brief beep accompanied each token presentation. Similar to Experiment 1, the tokens also were presented from left to right or from right to left, depending on whether the choice occurred on the left or right alternative, respectively. No ITI was implemented in this experiment.

Similar to Experiment 1, tokens were exchanged either after the delivery of the fourth token in the sequence (DE Condition) or immediately after the delivery of each individual token in the sequence (IE Condition). The timer that controlled the delivery of the tokens during DE and IE conditions worked in a similar manner as described in Experiment 1. Each token was exchanged for approximately 30 s of video segment, and the exchange schedule was signaled by flashing the red center rectangle (exchange rectangle) and the deactivation and darkening of the choice-rectangle alternatives. A single click on the exchange rectangle produced a brief beep and the video clip. Participants generally exchanged the tokens immediately (usually within 1–2 s) after they became available.

Figure 1 (Panel B) depicts the sequences of terminal-link events implemented in this experiment. The sequences followed the same rationale of the sequences used in Experiment 1, but the overall time span of the delivery of all tokens (i.e., the terminal-link duration) was 120 s rather than 60 s. (The duration of the terminal-link was extended in relation to Experiment 1 because previous unpublished work in our laboratory has shown that average interreinforcer delays of approximately 30 s were effective with video reinforcers.) In the Standard (STD) sequence, four tokens were presented at equal intertemporal delays of 30 s; in the Improving (IMP) sequence, the delays between successive tokens decreased (60 s, 30 s, 20 s, and 10 s); in the Worsening (WOR) sequence, the delays between successive tokens increased (10 s, 20 s, 30 s, and 60 s). Experimental conditions were in effect for a minimum of four sessions and until choice proportions were deemed stable via visual inspection. Table 2 shows the sequence of conditions in the order subjects experienced them and the number of sessions conducted at each.
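The three sequences can be compared by converting the inter-token delays into delays measured from the choice point and summing a discounted value for each token. The sketch below is illustrative only: it assumes a summed hyperbolic form, V = Σ A / (1 + k·D_i), with arbitrary values of A and k, which may differ in detail from the paper's Equation 1, and it counts only the programmed delays (time spent in exchange periods is ignored).

```python
from itertools import accumulate

# Inter-token delays (s) for the three terminal-link sequences in Experiment 2.
SEQUENCES = {
    "STD": [30, 30, 30, 30],
    "IMP": [60, 30, 20, 10],
    "WOR": [10, 20, 30, 60],
}

def delays_from_choice(inter_token_delays):
    """Cumulative delay from the choice point to each token."""
    return list(accumulate(inter_token_delays))

def parallel_hyperbolic_value(delays, k=1.0, amount=1.0):
    """Sum of hyperbolically discounted values, each timed from the choice.
    Assumed form (not necessarily the paper's Equation 1); k and amount
    are arbitrary illustrative parameters."""
    return sum(amount / (1.0 + k * d) for d in delays)

token_delays = {name: delays_from_choice(seq) for name, seq in SEQUENCES.items()}
values = {name: parallel_hyperbolic_value(d) for name, d in token_delays.items()}

# Sequences ordered from highest to lowest summed value.
ranking = sorted(values, key=values.get, reverse=True)
print(token_delays)
print(ranking)
```

Every sequence ends at 120 s, and at each earlier token position the WOR delay is shorter than the STD delay, which is shorter than the IMP delay. The ordering WOR > STD > IMP therefore holds for any positive k and, more generally, for any discount function that decreases with delay.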

Table 2. Sequence of conditions and number of sessions conducted in each in Experiment 2.

Sub.   Cond. Order^a   Total Sessions   Left Alternative   Right Alternative
H146   1               12               IMP                STD
H146   2               4                WOR                STD
H146   3               6                IMP                WOR
H146   4               4                IMP(DE)            WOR(DE)
H146   5               4                WOR                STD
H146   6               6                WOR(DE)            STD(DE)
H148   1               8                IMP                STD
H148   2               4                WOR                STD
H148   3               4                STD                WOR
H148   4               10               IMP                WOR
H148   5               4                IMP(DE)            WOR(DE)
H148   6               4                WOR                STD
H148   7               4                WOR(DE)            STD(DE)
H148   8               6                STD                WOR
H148   9               4                WOR                IMP
H148   10              4                IMP                WOR
H154   1               6                IMP                STD
H154   2               4                STD                IMP
H154   3               10               IMP                STD
H154   4               4                WOR                STD
H161   1               4                IMP                WOR
H161   2               4                WOR                STD
H161   3               6                IMP                STD
H161   4               4                IMP                WOR

Note. All conditions are Immediate Exchange conditions (IE), unless otherwise specified. Responses in the de-activated choice alternatives during the terminal-link were not collected in Experiment 2.

DE = Delayed Exchange condition.

a = Order in which subjects experienced each experimental condition in Experiment 2.
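The number of exposures to each pairwise comparison, which serves as the denominator for the case counts reported in the Results, can be tallied directly from Table 2. The sketch below transcribes the table rows; the tuple layout and function name are illustrative choices.

```python
from collections import Counter

# (participant, left alternative, right alternative, delayed exchange?) per Table 2.
CONDITIONS = [
    ("H146", "IMP", "STD", False), ("H146", "WOR", "STD", False),
    ("H146", "IMP", "WOR", False), ("H146", "IMP", "WOR", True),
    ("H146", "WOR", "STD", False), ("H146", "WOR", "STD", True),
    ("H148", "IMP", "STD", False), ("H148", "WOR", "STD", False),
    ("H148", "STD", "WOR", False), ("H148", "IMP", "WOR", False),
    ("H148", "IMP", "WOR", True),  ("H148", "WOR", "STD", False),
    ("H148", "WOR", "STD", True),  ("H148", "STD", "WOR", False),
    ("H148", "WOR", "IMP", False), ("H148", "IMP", "WOR", False),
    ("H154", "IMP", "STD", False), ("H154", "STD", "IMP", False),
    ("H154", "IMP", "STD", False), ("H154", "WOR", "STD", False),
    ("H161", "IMP", "WOR", False), ("H161", "WOR", "STD", False),
    ("H161", "IMP", "STD", False), ("H161", "IMP", "WOR", False),
]

def tally(conditions):
    """Count exposures per (comparison, exchange type), ignoring side assignment."""
    counts = Counter()
    for _, left, right, delayed in conditions:
        pair = " vs ".join(sorted((left, right)))
        counts[(pair, "DE" if delayed else "IE")] += 1
    return counts

counts = tally(CONDITIONS)
ie_total = sum(n for (_, kind), n in counts.items() if kind == "IE")
de_total = sum(n for (_, kind), n in counts.items() if kind == "DE")
print(counts)
print(ie_total, de_total)
```

The tallies give 8 WOR versus STD, 6 WOR versus IMP, and 6 STD versus IMP exposures under IE (20 in total), plus 4 DE probes.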

Questionnaire

At the end of the experiment, immediately after the last session was completed, participants were given a written questionnaire that included questions involving hypothetical sequences of two outcomes in which the delays to the outcomes were manipulated across questions. These questions, patterned after those used in the Loewenstein and Prelec (1993) study, are shown in the Appendix.

Results and Discussion

Figure 6 shows the mean proportion of choices over the last three sessions on the left and right side alternatives across IE conditions. When the STD was pitted against IMP (Panel A), participants H148 and H161 showed preference for IMP, whereas H146 and H154 showed preference for STD predominantly. In the WOR versus STD condition (Panel B), all participants showed strong preference for the WOR sequence (eight of eight cases). In the WOR versus IMP (Panel C), there was a clear and consistent preference for the WOR sequence (five of six cases).

Fig. 6. Mean (SD) proportion of choices for each alternative during Immediate Exchange conditions in Experiment 2. Bars on the left and on the right side depict the proportions from the last three sessions on the left and right side, respectively. A) Choice between Standard (STD) and Improving (IMP); B) Choice between Worsening (WOR) and STD; C) Choice between WOR and IMP. See text for further details.

Figure 7 shows choice proportions under DE conditions. Only participants H146 and H148 were exposed to DE conditions, and both showed consistent preference for the WOR sequence over the STD and IMP sequences. This is similar to the preference exhibited by pigeons under similar conditions in Experiment 1, and thus is also broadly in line with the view of tokens as conditioned reinforcers. As in Experiment 1, however, the DE conditions were designed as probes and were not replicated with reversals. It is therefore important to consider these choice patterns in relation to the immediately prior IE condition. In three of four exposures to the DE conditions, H146 and H148 preferred the alternative on the side consistent with the preferences in the immediately prior condition. As noted earlier (in Experiment 1), future studies are needed to disentangle order and possible carryover effects from conditioned reinforcement effects.

Fig. 7. Mean (SD) proportion of choices for each alternative during Delayed Exchange conditions in Experiment 2. Bars on the left and on the right side depict the proportions from the last three sessions on the left and right side, respectively. A) Choice between Worsening (WOR) and Standard (STD) for H146; B) Choice between WOR and STD for H148; C) Choice between WOR and Improving (IMP) for H146; D) Choice between WOR and IMP for H148. See text for further details.
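The two candidate predictions for DE conditions can be made concrete with the Experiment 2 delays. The sketch below again assumes a summed hyperbolic form with an arbitrary k, which may not match the paper's Equation 1 exactly. It also assumes that, under DE, the four clips play back to back once the exchange period begins at 120 s, which follows from the description of exchange but is not stated as a timing rule in the source.

```python
def hyperbolic_sum(delays, k=1.0):
    """Summed hyperbolic value with every delay timed from the choice.
    Assumed illustrative form; k is an arbitrary parameter."""
    return sum(1.0 / (1.0 + k * d) for d in delays)

# Delays from the choice point to each token (s), from the programmed sequences.
TOKEN_DELAYS = {
    "STD": [30, 60, 90, 120],
    "IMP": [60, 90, 110, 120],
    "WOR": [10, 30, 60, 120],
}

TERMINAL_LINK_S = 120  # exchange period begins after the fourth token
VIDEO_S = 30           # approximate clip duration

# Under DE, delays to the four clips are the same whichever sequence was chosen
# (assumes immediate exchange responses and back-to-back clips).
exchange_delays = [TERMINAL_LINK_S + i * VIDEO_S for i in range(4)]

value_by_tokens = {name: hyperbolic_sum(d) for name, d in TOKEN_DELAYS.items()}
value_by_exchange = {name: hyperbolic_sum(exchange_delays) for name in TOKEN_DELAYS}

print(value_by_tokens)
print(value_by_exchange)
```

Timed to tokens, the values remain ordered WOR > STD > IMP, whereas timed to exchange they are identical, which corresponds to the indifference prediction. The obtained DE preferences for WOR are closer to the token-based ordering, subject to the order and carryover caveats noted in the text.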

Considered as a whole, the human participants showed preference under IE conditions for the WOR sequence in 13 of 14 opportunities (p = .002, according to the binomial test) and showed preference for the higher-value sequence in 16 out of 20 occasions (p = .012). Therefore, these results are consistent with the ordinal predictions of Equation 1. They are also consistent with prior results obtained with rats (Brunner, 1999, Experiment 1) and pigeons (Experiment 1 of the present study). They are inconsistent, however, with results obtained in prior research with humans, that is, preference for improving sequences (Ariely, 1998; Ariely & Loewenstein, 2000; Chapman, 2000; Loewenstein & Prelec, 1993; Loewenstein & Sicherman, 1991; Schreiber & Kahneman, 2000).
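The reported p values can be reproduced with an exact binomial test. The source does not state the null probability or whether the test was one- or two-sided; the sketch below assumes a two-sided test against chance (0.5), which matches both reported values to three decimals. The function name is illustrative.

```python
from math import comb

def binomial_two_sided_p(successes, n):
    """Exact two-sided binomial p value against a null probability of 0.5.
    The null distribution is symmetric, so the more extreme tail is doubled
    and the result is capped at 1."""
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# WOR preferred in 13 of 14 IE comparisons that included WOR.
p_wor = binomial_two_sided_p(13, 14)

# Higher-value sequence preferred in 16 of 20 IE comparisons.
p_value_order = binomial_two_sided_p(16, 20)

print(round(p_wor, 3), round(p_value_order, 3))
```

The test treats each condition as an independent observation, although several conditions came from the same participant.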

A major procedural difference between the present and past studies with humans concerns the format for presenting choices. In past studies, questionnaire methods with hypothetical scenarios typically have been used, whereas in the present study repeated choices with actual outcomes were examined. To facilitate comparisons to prior research using questionnaire methods, the present participants were also asked a series of questions taken from the Loewenstein and Prelec (1993) study. The results are shown in Figure 8.

Fig. 8. Percentage of participants selecting Answer A. Bars above the dotted line indicate preference for Answer A and bars below the dotted line indicate preference for Answer B. A) Loewenstein and Prelec (1993), Questions 2 and 3 (n = 82); Question 4 (n = 48); B) Experiment 2 (n = 4); C) Experiment 2 and pilot study (n = 58).

The top panel (A) of Figure 8 summarizes the results of the Loewenstein and Prelec study (n = 82), the middle panel (B) the results from the 4 participants who completed Experiment 2, and the bottom panel (C) the results from Experiment 2 plus a similar pilot study (n = 58). (In this pilot study the choice procedures were similar and the questionnaire was identical.) The x-axis refers to the question number, and the y-axis refers to the percentage of participants who selected Answer A. Question 2 involved a single outcome delivered sooner (Answer A) or later (Answer B); Questions 3 and 4 involved sequences composed of two outcomes with worsening trend (Answer A) and improving trend (Answer B). The results of Question 1 are irrelevant to the present study and are not shown. To avoid redundant data, the proportion of choices for Answer B is not shown (as it equals 1.0 - proportion for A). Thus, percentage of Answer A above the dotted line indicates preference for Answer A, whereas below the dotted line indicates preference for Answer B.

Participants in all three samples opted for dinner at their preferred restaurant sooner rather than later (Question 2). This was akin to the delay-sensitivity test in Experiment 2, in which we asked whether more immediate presentation of a reinforcer was preferred to delayed presentation of the same reinforcer. Question 3 asked whether the inclusion of a less-preferred restaurant modified preference for the more-preferred alternative. In all three samples, some participants who chose to have the preferred outcome sooner rather than later when faced with choices with a single outcome chose to postpone the preferred outcome when a less preferred outcome was embedded in the series. In the Loewenstein and Prelec sample (Panel A), this manipulation resulted in a preference reversal, with participants preferring on average the improving sequence. In the other two samples, the results moved in the same direction, though the preferences were less extreme. As sample size increased (Panel C), the results more closely approximated the Loewenstein and Prelec results (Panel A).

The less extreme preference for the improving sequence is perhaps not surprising when one considers that the present questionnaires (Panels B and C) were administered following extended exposure to analogous choice procedures with actual outcomes. This may have made more salient the worsening sequences, or activated a general tendency for consistency across decision-making scenarios. However, when given an analogous but slightly different scenario (Question 4), participants in all three samples behaved similarly, preferring an improving sequence (upcoming weekend with an abrasive aunt and following weekend with friends) to a worsening sequence. The preferences of participants in Experiment 2 (Panel B) thus mirrored prior findings (Panel A), but were inconsistent with the preferences displayed in their repeated choices with actual outcomes. This suggests that different methods may occasion different decision-making tendencies—a point to be considered more fully in the next section.

The small number of participants in Experiment 2 is a limitation of this study, at least insofar as comparisons to prior large-n questionnaire results are concerned, and urges caution in the generalization of the findings. As noted earlier, only 6 of the 11 participants initially recruited showed sensitivity to the video delay, and 2 dropped out after a few experimental sessions. This high level of attrition may have been caused by the repeated exposure to long, back-to-back 50-min sessions or by some other characteristic of the present experiment.

GENERAL DISCUSSION

The main goal of this study was a comparative analysis of choices between reinforcer sequences in humans and pigeons. The experiments aimed to bring procedures used with pigeons and humans into better alignment through the use of token and consumable-type reinforcers and the manipulation of the economic context. In general, a similar pattern of choices was found across species.

This cross-species similarity was evident in two main ways. First, choice patterns in both species were sensitive to differences in economic context. This was assessed by comparing choice patterns under IE conditions, when tokens could be exchanged immediately after they were produced, to those under DE conditions, when tokens could not be exchanged until the end of the terminal link. For pigeons, preferences under DE conditions were generally less extreme than in the immediately prior IE conditions. Although generally in the direction of indifference—the prediction of Equation 1 applied to exchange/food delays—preferences tended to stabilize at levels above indifference, and often favored sequences with value computed in terms of token delays. For humans, preferences under DE conditions departed even more substantially from indifference in favor of WOR sequences, though the limited number of subjects and DE conditions urges caution in the interpretation of these data. The results from DE conditions in both experiments may suggest a reinforcing role for the tokens, but in the absence of replications and reversals a more precise characterization is not possible.

A second and more reliable cross-species similarity was in the choice patterns under IE conditions, in which tokens were immediately exchangeable for consumable-type reinforcers. Both pigeons and humans tended to prefer sequences with the lowest relative reinforcer delays. Preferences were consistent with the ordinal predictions of the hyperbolic discounting of reinforcer delay of Equation 1 (as well as other temporal discounting models, as noted above). The most consistent patterns seen across species were in the critical IMP versus WOR comparisons: in eight of eight cases with pigeons and in five of six conditions with humans, the WOR sequence was preferred to the IMP sequence. These results are consistent with prior data on temporal discounting with rats (Brunner, 1999) but differ from prior data with humans (e.g., Ariely, 1998; Chapman, 2000; Loewenstein & Prelec, 1993; Loewenstein & Sicherman, 1991; Schreiber & Kahneman, 2000).

What accounts for such differences between the present and prior results? There are a number of procedural differences that must be considered. To begin with, in the current study, the value of a reinforcer sequence was delay based, whereas other studies typically include magnitude manipulations, either in relation to some qualitative property of a hypothetical stimulus (Chapman, 2000; Loewenstein & Prelec, 1993; Loewenstein & Sicherman, 1991) or in relation to some actual sensorial experience, such as aversive noise or temperatures (Ariely, 1998; Ariely & Loewenstein, 2000; Schreiber & Kahneman, 2000). It is possible that delay-based changes in value are discounted differently than intensity/magnitude changes. This possibility could be readily explored within the context of the present procedures, by altering the duration of the video reinforcer across the sequences, while holding constant the interreinforcer delays. In addition to reinforcer magnitude, other reinforcer dimensions, such as quality or probability, as well as negative reinforcers, could be examined.

Second, it is possible that discounting depends on the nature of the reinforcer. Video reinforcers may differ from other more commonly used reinforcers in that their reinforcing value depends, at least in part, on their continuity through time. The value of a particular video clip, or a pattern of clips, likely depends on its timing relative to the episode as a whole (or in the case of a TV series, on its relation to previous episodes). Little is known at present about how arbitrarily segmenting continuous video into fixed reinforcer periods compares to other more conventional reinforcers used in laboratory research with humans. Future research might profitably be directed to a more systematic exploration of the use of this highly effective, but rarely studied reinforcer.

Third, the present study differed from most prior studies in the manner in which choices were presented. In previous experiments, subjects made a single choice, whereas in the present experiment, subjects were given repeated choices over extended time periods, which presumably allowed them to learn from direct experience with the task and its contingent consequences. One advantage of the repeated-choice format used here is the possibility of detecting continuous, or graded, measures of preference—to determine not just which sequence is preferred but how much one sequence is preferred to another. In the present study, however, near-exclusive preference patterns were obtained, which was likely due to the use of FR 1 initial links. Using longer interval-based schedules in future research may yield more continuous preference measures necessary in a more detailed quantitative analysis, bringing the study of reinforcer sequences more in line with the study of concurrent reinforcement schedules more generally.

Finally, the present study differed from most of the earlier studies in the use of actual as opposed to hypothetical outcomes. There is no a priori reason to suppose that hypothetical scenarios cannot in some circumstances substitute for actual outcomes, nor is there reason to suppose noncorrespondence between verbal–hypothetical and nonverbal–actual choice patterns. On the other hand, there is no a priori reason to suppose strong correspondence. Verbal statements about imagined behavior and outcomes and actual response allocation among directly experienced outcomes are two different classes of operant behavior, and the correspondence between them is not something to be taken for granted; rather it is to be worked out empirically.

In the present study, we observed a striking lack of correspondence in choice patterns obtained via different methods: The very same participants who preferred a worsening sequence with repeated choices and actual outcomes preferred an improving sequence with single-shot choices with verbal-hypothetical outcomes. This lack of correspondence suggests that different methods may recruit different decision-making strategies. For example, verbal scenarios involving sequences of outcomes unfolding on the order of days or weeks may encourage a longer time horizon than directly experienced outcomes unfolding on the order of seconds and minutes. Such differences in scale should not be taken to imply that one method is necessarily superior to the other—both yield valuable information on how people make decisions. It does urge caution, however, in extrapolating from one set of methods to another.

Thus, while the present procedures differed in some significant ways from previous methods used with humans, they are in greater alignment with methods typically used with nonhumans. This makes them more suitable in cross-species comparisons, a major aim of the present investigation. The general similarities across species are consistent with an expanding body of research showing that species differences often reflect differences in methods used to study different species. As procedures are made more similar, species differences often fade. This has been seen across several different experimental paradigms, including self-control (Hackenberg & Vaidya, 2003; Hyten, Madden, & Field, 1994; Jackson & Hackenberg, 1996), risky choice (Lagorio & Hackenberg, 2010; Locey et al., 2009; Pietras & Hackenberg, 2001; Pietras, Locey, & Hackenberg, 2003), sunk-cost choices (Macaskill & Hackenberg, 2012; Navarro & Fantino, 2005), and response-cost punishment (Pietras & Hackenberg, 2005; Raiff, Bullock, & Hackenberg, 2008). The present study adds to this list. The general comparability of the preferences for the two species suggests that previously reported differences between species may be less related to species differences than to methodological differences.

Acknowledgments

This research was supported by NSF Grant IBN 0420747 and was conducted in accordance with IACUC and IRB standards at the University of Florida. Manuscript preparation was supported by NIH T32-AA07290 and NIDA R01-026127. Portions of these data were presented at the 2007 and 2008 meetings of the Association for Behavior Analysis International. An earlier version of the paper was submitted by the first author to the Graduate School at the University of Florida in partial fulfillment of the Doctorate Degree at the University of Florida. We are indebted to Marc Branch, Timothy Vollmer, Jesse Dallery, David Smith, and Drake Morgan for comments on the paper, and to Anthony DeFulio, Carla Lagorio, and Rachelle Yankelevitz for assistance with data collection.

APPENDIX

Questionnaire

1) Which would you prefer if both were free?
   a. Dinner at a fancy French restaurant.
   b. Dinner at a local Greek restaurant.

If you prefer French:
(for those who preferred Greek, the restaurant order was reversed for Questions 2 and 3.)

2) Which would you prefer?
   a. Dinner at the French restaurant on Friday in 1 month.
   b. Dinner at the French restaurant on Friday in 2 months.

3) Which would you prefer?
   a. Dinner at the French restaurant on Friday in 1 month and dinner at the Greek restaurant on Friday in 2 months.
   b. Dinner at the Greek restaurant on Friday in 1 month and dinner at the French restaurant on Friday in 2 months.

4) Imagine you must schedule two weekend outings to a city where you once lived. Suppose one outing will take place this coming weekend, the other the weekend after. Which would you prefer?
   a. This weekend: friends, and next weekend: unpleasant Aunt.
   b. This weekend: unpleasant Aunt, and next weekend: friends.

REFERENCES

  1. Ariely D. Combining experiences over time: The effects of duration, intensity changes and on-line measurements on retrospective pain evaluations. Journal of Behavioral Decision Making. 1998;11:19–45. [Google Scholar]
  2. Ariely D, Loewenstein G. When does duration matter in judgment and decision making. Journal of Experimental Psychology: General. 2000;129:508–523. doi: 10.1037//0096-3445.129.4.508. [DOI] [PubMed] [Google Scholar]
  3. Ariely D, Zauberman G. On the making of an experience: The effects of breaking and combining experiences on their overall evaluation. Journal of Behavioral Decision Making. 2000;13:219–232. [Google Scholar]
  4. Brunner D. Preference for sequences of rewards: further tests of a parallel discounting model. Behavioural Processes. 1999;45:87–99. doi: 10.1016/s0376-6357(99)00011-x.
  5. Brunner D, Gibbon J. Value of food aggregates: parallel versus serial discounting. Animal Behaviour. 1995;50:1627–1634.
  6. Chapman G. B. Preferences for improving and declining sequences of health outcomes. Journal of Behavioral Decision Making. 2000;13:203–218.
  7. Flora S. R, Pavlik W. B. Human self-control and the density of reinforcement. Journal of the Experimental Analysis of Behavior. 1992;57:201–208. doi: 10.1901/jeab.1992.57-201.
  8. Guyse J, Keller L, Eppel T. Valuing environmental outcomes: Preferences for constant or improving sequences. Organizational Behavior and Human Decision Processes. 2002;87:253–277.
  9. Hackenberg T. D, Pietras C. J. Video access as a reinforcer in a self-control paradigm: A method and some data. Experimental Analysis of Human Behavior Bulletin. 2000;18:1–5.
  10. Hackenberg T. D, Vaidya M. Determinants of pigeons' choices in token-based self-control procedures. Journal of the Experimental Analysis of Behavior. 2003;79:207–218. doi: 10.1901/jeab.2003.79-207.
  11. Hsee C. K, Abelson R. P, Salovey P. The relative weighting of position and velocity in satisfaction. Psychological Science. 1991;2:263–266.
  12. Hyten C, Madden G. J, Field D. P. Exchange delays and impulsive choice in adult humans. Journal of the Experimental Analysis of Behavior. 1994;62:225–233. doi: 10.1901/jeab.1994.62-225.
  13. Jackson K, Hackenberg T. D. Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior. 1996;66:29–49. doi: 10.1901/jeab.1996.66-29.
  14. Jimura K, Myerson J, Hilgard J, Keighley J, Braver T. S, Green L. Domain independence and stability in young and older adults' discounting of delayed rewards. Behavioural Processes. 2011;87:253–259. doi: 10.1016/j.beproc.2011.04.006.
  15. Kahneman D, Fredrickson B. L, Schreiber C. A, Redelmeier D. A. When more pain is preferred to less: Adding a better end. Psychological Science. 1993;4:401–405.
  16. Kirby K. N. The present values of delayed rewards are approximately additive. Behavioural Processes. 2006;72:273–282. doi: 10.1016/j.beproc.2006.03.011.
  17. Lagorio C. H, Hackenberg T. D. Risky choice in pigeons and humans: A cross-species comparison. Journal of the Experimental Analysis of Behavior. 2010;93:27–44. doi: 10.1901/jeab.2010.93-27.
  18. Locey M. L, Pietras C. J, Hackenberg T. D. Human risky choice: Delay sensitivity depends on reinforcer type. Journal of Experimental Psychology: Animal Behavior Processes. 2009;35:15–22. doi: 10.1037/a0012378.
  19. Loewenstein G, Prelec D. Preferences for sequences of outcomes. Psychological Review. 1993;100:91–108.
  20. Loewenstein G, Sicherman N. Do workers prefer increasing wage profiles? Journal of Labor Economics. 1991;9:67–84.
  21. Logue A. W, Peña-Correal T. E, Rodriguez M. L, Kabela E. Self-control in adult humans: Variation in positive reinforcer amount and delay. Journal of the Experimental Analysis of Behavior. 1986;46:159–173. doi: 10.1901/jeab.1986.46-159.
  22. Macaskill A. C, Hackenberg T. D. The sunk-cost effect with pigeons: Some determinants of decisions about persistence. Journal of the Experimental Analysis of Behavior. 2012;97:85–100. doi: 10.1901/jeab.2012.97-85.
  23. Matsumoto D, Peecher M, Rich J. Evaluations of outcome sequences. Organizational Behavior and Human Decision Processes. 2000;84:331–352. doi: 10.1006/obhd.2000.2913.
  24. Mazur J. E. Choice between single and multiple delayed reinforcers. Journal of the Experimental Analysis of Behavior. 1986;46:67–77. doi: 10.1901/jeab.1986.46-67.
  25. Mazur J. E. Rats' choices between one and two delayed reinforcers. Learning & Behavior. 2007;35:169–176. doi: 10.3758/bf03193052.
  26. McDiarmid C. G, Rilling M. E. Reinforcement delay and reinforcement rate as determinants of schedule preference. Psychonomic Science. 1965;2:195–196.
  27. Montgomery N, Unnava H. Temporal sequence effects: A memory framework. Journal of Consumer Research. 2009;36:83–92.
  28. Navarick D. J. Choice in humans: Techniques for enhancing sensitivity to reinforcement immediacy. The Psychological Record. 1996;46:539–554.
  29. Navarick D. J. Impulsive choice in adults: How consistent are individual differences? The Psychological Record. 1998;48:665–674.
  30. Navarro A. D, Fantino E. The sunk cost effect in humans and pigeons. Journal of the Experimental Analysis of Behavior. 2005;83:1–13. doi: 10.1901/jeab.2005.21-04.
  31. Pietras C. J, Hackenberg T. D. Risk-sensitive choice in humans as a function of an earnings budget. Journal of the Experimental Analysis of Behavior. 2001;76:1–19. doi: 10.1901/jeab.2001.76-1.
  32. Pietras C. J, Hackenberg T. D. Response-cost punishment via token-loss with pigeons. Behavioural Processes. 2005;69:343–356. doi: 10.1016/j.beproc.2005.02.026.
  33. Pietras C. J, Locey M. L, Hackenberg T. D. Human risky choice under temporal constraints: Test of an energy-budget model. Journal of the Experimental Analysis of Behavior. 2003;80:59–75. doi: 10.1901/jeab.2003.80-59.
  34. Raiff B. R, Bullock C. E, Hackenberg T. D. Response-cost punishment with pigeons: Further evidence of response suppression via token loss. Learning & Behavior. 2008;36:29–41. doi: 10.3758/lb.36.1.29.
  35. Read D, Powell M. Reasons for sequence preferences. Journal of Behavioral Decision Making. 2002;15:433–460.
  36. Ross W. T, Simonson I. Evaluations of pairs of experiences: A preference for happy endings. Journal of Behavioral Decision Making. 1991;4:273–282.
  37. Schreiber C. A, Kahneman D. Determinants of the remembered utility of aversive sounds. Journal of Experimental Psychology: General. 2000;129:27–42. doi: 10.1037//0096-3445.129.1.27.
  38. Shull R. L, Mellon R, Sharp J. A. Delay and number of food reinforcers: Effects on choice and latencies. Journal of the Experimental Analysis of Behavior. 1990;53:235–246. doi: 10.1901/jeab.1990.53-235.
  39. Varey C, Kahneman D. Experiences extended across time: Evaluation of moments and episodes. Journal of Behavioral Decision Making. 1992;5:169–185.

Articles from Journal of the Experimental Analysis of Behavior are provided here courtesy of Society for the Experimental Analysis of Behavior
