Abstract
Stimuli that precede aversive events are typically less preferred than stimuli that precede nonaversive events. It has recently been demonstrated that stimuli that follow less preferred events may become favored over stimuli that follow more preferred events. This phenomenon has been investigated under a variety of names, most commonly within-trial contrast and state-dependent valuation. Although the effect has been replicated, there have also been several failures to replicate, and the phenomenon remains poorly understood. This paper reviews and summarizes the literature on within-trial contrast and state-dependent valuation. Procedural variations across studies are identified and discussed. The two current models that explain the phenomenon are then outlined, and the limitations of each are described. A third explanation is offered that incorporates the concept of motivating operations. Last, the predictions of all three models are compared.
Keywords: conditioned preference, within-trial contrast, contrast, motivating operations, state-dependent valuation
Stimuli that are followed by aversive events may become less preferred than stimuli that are followed by nonaversive events. It is also possible, however, for stimulus preference to be altered by the events that precede the presentation of the stimulus. This phenomenon has been investigated under a variety of names, most commonly, within-trial contrast (WTC) and state-dependent valuation (SDV). Both WTC and SDV are conceptual models that essentially describe an effect wherein exposure to a less preferred event increases preference for stimuli that follow that event. Although each model invokes different mechanisms to explain the phenomenon and thus makes different predictions (discussed later), in general both predict that organisms will demonstrate a preference for stimuli that follow less preferred events relative to stimuli that follow more preferred events.
In one of the first studies examining this phenomenon, Clement, Feltus, Kaiser, and Zentall (2000) exposed eight pigeons to two chain-schedule conditions that differed by the amount of effort required. In the first condition (see Figure 1), pigeons were exposed to a fixed-ratio (FR) 1 schedule with a lit center key (initial component). Completion of this component led to another FR 1 schedule (middle component) followed by a two-key discrimination task (terminal component). In the terminal component, a peck to the red key (S+FR1) resulted in food reinforcement whereas a peck to the yellow key (S−FR1) resulted in no food reinforcement. The second condition consisted of an identical initial component, followed by an FR 20 in the middle component, followed by a similar two-key discrimination task in which a peck to the green key (S+FR20) resulted in food reinforcement whereas a peck to the blue key (S−FR20) resulted in no food reinforcement. After repeated exposure to both conditions, preference for the various terminal component stimuli was assessed via a paired-stimulus preference assessment that compared S+FR1 against S+FR20 and S−FR1 against S−FR20. Results indicated a preference for the S+FR20 and the S−FR20 (both stimuli that had previously followed the high-effort condition).
Figure 1.
Training trials adapted from Clement et al. (2000). A key peck to Circ produced either an FR 1 or FR 20 schedule followed by discrimination trials. A key peck to S+ resulted in food reinforcement. A key peck to S− resulted in no food reinforcement.
Although there have been multiple replications, the finding is not entirely reliable, and there have been numerous failures to replicate. Further, there are still lingering questions as to the cause of the change in preference and whether any middle-component events can produce the effect. As a result, two competing models have been offered to explain the phenomenon: WTC and SDV.
The purpose of this paper is three-fold. First, the literature on WTC and SDV will be reviewed and the results will be discussed in terms of subject type and initial, middle, and terminal components. Differences in procedures will be delineated, as will inconsistencies in findings. Second, the current models that explain the phenomenon will be contrasted, and their limitations will be described. Third, an alternative explanation of the findings will be presented, and the predictions of all three models will be evaluated.
METHOD
Two literature searches were conducted using the databases PsycINFO and PubMed. The key words used were within-trial contrast for the first search and state-dependent valuation for the second. All resulting peer-reviewed articles were retained. The reference lists of these articles were then searched, and studies that investigated WTC or SDV or used similar procedures were added to the list of articles. Excluded were articles that focused only on cognitive dissonance or justification of effort, because these studies typically employed different procedures or did not directly measure preference but rather relied on preference rating scales. Also excluded was research on the Concorde fallacy and sunk-cost effects (see Arkes & Ayton, 1999, for discussion). Although these concepts are somewhat related to WTC and SDV, the procedures are quite dissimilar and may not examine the same behavior-change phenomenon.
RESULTS AND DISCUSSION
Generality of the Findings
In total, 23 articles were identified containing a total of 38 separate experiments. Of these 38 experiments, 22 (across 18 separate articles) demonstrated an increase in preference for a stimulus that followed a less rather than more preferred event, and 16 (across 8 separate articles) failed to replicate the phenomenon. Note that some articles contained both replications and failures to replicate. Of the 23 total articles, six appeared to interpret the phenomenon from the conceptual perspective of SDV (Aw, Holbrook, Burt de Perera, & Kacelnik, 2009; Kacelnik & Marsh, 2002; Marsh, Schuck-Paim, & Kacelnik, 2004; Pompilio & Kacelnik, 2005; Pompilio, Kacelnik, & Behmer, 2006; Waite & Passino, 2006), and 15 interpreted the phenomenon from the perspective of WTC (Alessandri, Darcheville, Delevoye-Turrell, & Zentall, 2008; Alessandri, Darcheville, & Zentall, 2008; Arantes & Grace, 2008; Clement et al., 2000; Clement & Zentall, 2002; DiGian, Friedrich, & Zentall, 2004; Friedrich, Clement, & Zentall, 2005; Friedrich & Zentall, 2004; Gipson, Miller, Alessandri, & Zentall, 2009; Klein, Bhatt, & Zentall, 2005; O'Daly, Meyer, & Fantino, 2005; Singer, Berry, & Zentall, 2007; Vasconcelos & Urcuioli, 2008, 2009; Vasconcelos, Urcuioli, & Lionello-DeNolf, 2007). One study directly compared empirical predictions from both conceptual frameworks (Aw, Vasconcelos, & Kacelnik, 2011). Finally, a clear conceptual perspective was not identifiable for one study (Armus, 2001). Because the procedures used in these studies were quite similar despite different conceptual underpinnings, the articles will be collapsed and discussed together. (See Appendix A for a breakdown of each article and of the effect by individual experiment.)
Subject types
Of the 22 experiments that successfully demonstrated the effect, 18 were conducted with various nonhumans, including pigeons, locusts, grasshoppers, rats, banded tetras, and starlings. The remaining four experiments demonstrated the effect with humans, both children and adults. Of the 16 experiments that were unable to replicate the findings, all were conducted with nonhuman organisms, including starlings, pigeons, and rats. These data suggest that the phenomenon is general and not a species-specific characteristic.
Middle components
Across the 22 experiments that successfully demonstrated the effect, a variety of events have been programmed for or required in the middle component. For example, the effect has been documented when the preceding events were high versus low effort (Alessandri, Darcheville, Delevoye-Turrell, & Zentall, 2008; Aw et al., 2011; Clement et al., 2000; Clement & Zentall, 2002; Friedrich & Zentall, 2004; Kacelnik & Marsh, 2002; Klein et al., 2005), high versus low probabilities of reinforcement (Clement & Zentall, 2002; Gipson et al., 2009), short versus long delays to the terminal component (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008; Alessandri, Darcheville, & Zentall, 2008; Clement et al., 2000; DiGian et al., 2004; O'Daly et al., 2005), preferred versus less preferred schedules of reinforcement (Singer et al., 2007), the absence or presence of reinforcement (Friedrich et al., 2005), and low versus high states of food deprivation (Aw et al., 2009; Marsh et al., 2004; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008). In each of the above cases, preference increased for stimuli that followed the less preferred event, although the effect has been most robustly documented with food deprivation. Thus, the effect does not appear to be limited to conditions that involve differential response requirements (or more generally, effort) alone.
Although the effect has been produced with a variety of events in middle components, there have also been many failures to replicate the effect. Studies that have failed to produce the effect have examined middle components such as high versus low effort (Arantes & Grace, 2008; Armus, 2001; Friedrich & Zentall, 2004; Vasconcelos & Urcuioli, 2009; Vasconcelos et al., 2007; Waite & Passino, 2006), long versus short delay (Aw et al., 2011), and high versus low states of food deprivation (Vasconcelos & Urcuioli, 2008).
Explanations for replication failures
Explanations for these failures to replicate the phenomenon have been numerous. Singer et al. (2007) suggested that the effect is slow to develop. Zentall (2008) suggested a variety of possible reasons for failure to replicate, including that overtraining is required, that the terminal component must be contiguous with the middle component, and that prior exposure to lean schedules influences the effect. Finally, Aw et al. (2011) suggested that the effect occurs only when the middle component requires energy expenditure.
For each suggested explanation, however, there is at least one study that demonstrates that explanation to be insufficient. First, Arantes and Grace (2008) provided an amount of training equal to that used in other successful studies and were unable to reproduce the results. Further, Vasconcelos and Urcuioli (2009) provided extensive overtraining, but although they noted a tendency towards preference change, this tendency was not statistically significant. Second, an effect has been found when using a delay as the less preferred event (e.g., Alessandri, Darcheville, & Zentall, 2008), which would seem to counter both the argument concerning contiguity between middle and terminal components and the argument that energy expenditure is required. Finally, researchers have used experimentally naive pigeons and employed overtraining but still failed to replicate previous findings (e.g., Vasconcelos & Urcuioli, 2009).
There are other possible explanations for some replication failures. Armus (2001), for example, conducted experiments using two differently flavored pellets (grape and bacon) as terminal components following different amounts of effort. Repeated exposure to different food items may increase preference (Wardle, Herrera, Cooke, & Gibson, 2003), and it is possible that this repeated exposure masked or influenced the results of the experiment.
Procedural Variations Across Experiments
There are a variety of procedural differences across the various experiments that have examined the phenomenon. It is possible that these variations contribute to the differences in experimental findings. The following sections describe several procedural variations regarding the presentation of the initial and terminal components across the reviewed studies.
Initial component
A stimulus that precedes a less preferred event may become a conditioned aversive stimulus and function as a punisher (e.g., Vorndran & Lerman, 2006). Because a change in preference for stimuli presented in the terminal component is presumed to be directly related to preference for the event in the middle component, it is necessary to be able to attribute the preference change to those two components alone. If the initial component is a less preferred or aversive event, this may confound interpretations by either adding or subtracting from the effect. It is therefore important to distinguish between studies that used distinct initial components (in which differential preference could develop) and those that used identical initial components.
Of the experiments that successfully demonstrated the phenomenon, 10 used distinct initial components (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Phases 1 and 2; Alessandri, Darcheville, & Zentall, 2008; Aw et al., 2011, Experiment 3; Clement & Zentall, 2002, Experiments 1, 2, and 3; Friedrich et al., 2005; O'Daly et al., 2005, Experiment 2; Singer et al., 2007), 4 used identical initial components (Clement et al., 2000; Friedrich & Zentall, 2004, Experiment 1; Gipson et al., 2009; Klein et al., 2005), 2 used both distinct and identical initial components for different groups (DiGian et al., 2004; O'Daly et al., 2005, Experiment 1), and 6 did not include initial components (Aw et al., 2009; Kacelnik & Marsh, 2002; Marsh et al., 2004; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008, Experiment 1). Those that did not include initial components investigated the effects of different levels of food deprivation, and an initial component was not feasible.
Of the 16 experiments that were unable to replicate the phenomenon, 3 used distinct initial components (Aw et al., 2011, Experiment 2; Vasconcelos & Urcuioli, 2009, Experiments 1 and 2), 9 used identical initial components (Arantes & Grace, 2008, Experiments 1 and 2; Aw et al., 2011, Experiment 1; Friedrich & Zentall, 2004, Experiment 2; Vasconcelos et al., 2007, Experiments 1 through 5), 1 used both distinct and identical initial components (Vasconcelos et al., 2007, Experiment 6), and 3 did not include initial components (Armus, 2001; Vasconcelos & Urcuioli, 2008, Experiment 2; Waite & Passino, 2006).
Of the 13 experiments that used identical stimuli in the initial components, four produced an effect and nine did not. Of the 13 experiments that used different stimuli in the initial components, 10 produced an effect and three did not. Thus, whether the stimuli used in the initial component influence the contrast effect is currently unclear, but the data suggest the possibility that they do. For example, DiGian et al. (2004) directly examined this possibility by using different initial components (vertical or horizontal lines) for one group of pigeons and identical initial components (white keys) for another. Although both groups significantly preferred the stimulus that followed longer delays, the first group (different initial components) displayed a greater degree of preference change than the second group (identical initial components). This may indicate that the initial component exerts some effect in the conditioning process, although such a conclusion is speculative at this point.
Terminal component
The presentation of terminal component stimuli varied among studies primarily based on whether the researchers presented a single stimulus after each condition or presented two stimuli together in a discrimination task. In 12 of the 22 experiments that demonstrated an effect, a discrimination task followed the middle component (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Phases 1 and 2; Alessandri, Darcheville, & Zentall, 2008; Clement et al., 2000; Clement & Zentall, 2002, Experiments 1 through 3; DiGian et al., 2004; Friedrich et al., 2005; Gipson et al., 2009; Klein et al., 2005; Singer et al., 2007, Experiment 2). The remaining 10 experiments presented a single stimulus after each condition. Of the 16 experiments that failed to replicate the effect, 11 employed a discrimination task following the middle component (Arantes & Grace, 2008, Experiments 1 and 2; Vasconcelos & Urcuioli, 2008, Experiment 2; Vasconcelos & Urcuioli, 2009, Experiments 1 and 2; Vasconcelos et al., 2007, Experiments 1 through 6). The remaining five experiments presented a single stimulus after each condition. In summary, of the 23 experiments that presented a discrimination task in the terminal component, 12 produced the effect; of the 15 experiments that presented a single stimulus in the terminal component, 10 produced the effect. Thus the phenomenon appears to occur somewhat more frequently when the terminal component involves a single-stimulus presentation rather than a discrimination task.
Experiments that employed a discrimination task in the terminal component have produced conflicting results. Of the studies that replicated the phenomenon, some have shown that both the S+ and S− following the less preferred event (S+LP; S−LP) are equally likely to become preferred relative to the S+ and S− following the more preferred event (Clement & Zentall, 2002; Gipson et al., 2009; Singer et al., 2007), and Clement et al. (2000) found an even greater preference for S−LP. These findings suggest that relative preference for the preceding event is responsible for the effect rather than the events that follow the terminal component (i.e., reinforcement or no reinforcement).
Other research, however, has demonstrated that the effect is weaker with the S−LP than with the S+LP (Friedrich et al., 2005) or has found no effect with the S−LP (Alessandri, Darcheville, & Zentall, 2008; Clement & Zentall, 2002; DiGian et al., 2004; Klein et al., 2005). These results are difficult to interpret and may suggest that the reinforcement that follows the S+ somehow influences the effect.
An alternative explanation, however, might be that in order for preference to change, the organism must respond to the stimulus that follows the middle component. Arantes and Grace (2008) and Vasconcelos et al. (2007) propose that because the S+ and S− were presented in a discrimination task, the organism learned to consistently select the S+ stimulus and avoid the S−. They suggest that the presentation of stimuli in a discrimination task may produce an effect with the S+ but may inhibit the effect with the other stimulus. Currently, it is unclear whether this is the case or whether the addition of reinforcement after a terminal component somehow influences the effect.
More straightforward examples of the effect are evident in the 10 successful experiments that did not impose a discrimination task in the terminal component (Aw et al., 2009; Aw et al., 2011, Experiment 3; Friedrich & Zentall, 2004, Experiment 1; Kacelnik & Marsh, 2002; Marsh et al., 2004; O'Daly et al., 2005, Experiments 1 and 2; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008, Experiment 1). In these studies the terminal component was a single stimulus (e.g., a differently colored key or a specific arm of a Y maze). After training, preference was measured by presenting both terminal components simultaneously and recording the organism's choice between components. In each of these studies, a significant preference was found for the stimulus that had previously followed less preferred events, with the exception of Aw et al. (2011), who found an effect only when the middle component involved effort rather than delay.
Procedural Problems with Reviewed Studies
Because changes in preference for terminal component stimuli are purportedly related to differential preference for the middle component stimuli, it seems important that baseline preference for both components be established prior to the onset of an experiment. Surprisingly, however, of the 38 experiments reviewed (including both replications and failures to replicate), only three (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Phases 1 and 2; Singer et al., 2007) measured preference for the middle components. Friedrich and Zentall (2004) did measure middle-component preference, but only after training was complete.
It may be tempting to assume that more effort or longer delays will always be less preferred; however, this may not always be true. For example, Alessandri, Darcheville, Delevoye-Turrell, et al. (2008) employed middle components that required participants to press a button with varying amounts of force and for varying lengths of time. For two participants, the less preferred condition was not the least effort/shortest delay condition. Without prior measurement, results for these two participants might have been erroneously interpreted.
Another concern is that only two of the 38 experiments assessed preference for terminal component stimuli prior to training (Friedrich & Zentall, 2004, Experiments 1 and 2). Without knowing initial preference, it is difficult to determine the impact of training. If preference for a stimulus was initially low and then shifted to high, this would indicate a very strong effect. If, on the other hand, preference for the stimulus was initially somewhat high and then shifted higher, this would indicate a fairly weak effect. Without measuring preference for middle and terminal components prior to training, it is difficult to draw strong conclusions regarding the phenomenon. If we are to suggest that preference for terminal components is influenced by preference for middle components, it is of utmost importance that we establish these preferences before exposure to training.
DIFFERENT MODELS TO EXPLAIN THE EFFECTS
The Within-Trial Contrast Model
The WTC model (see Figure 2) was first proposed by Zentall (2005) and generally assumes that when an organism is exposed to a less preferred event there is a negative change in the organism's hedonic state (H − ΔH). This negative change is directly proportional to the degree to which the event is less preferred. When a reinforcing stimulus is presented following this negative shift, there is a positive shift in the hedonic state of the organism. The contrast, then, is between the organism's hedonic state before and after the presentation of the reinforcing stimulus. The degree to which preference is changed depends on the amount of positive shift, and therefore also on the degree to which one event is preferred more or less than another (cf. relative values in Figure 2).
Figure 2.

“A model based on change in relative hedonic value, proposed to account for within-trial contrast effects. According to the model, trials begin with a relative hedonic state, H; key pecking results in a negative change in hedonic state, H − ΔH1 for FR 1 and H − ΔH20 for FR 20; obtaining a reinforcer results in a positive change in hedonic state, H + ΔHRf; the net change in hedonic state depends on the difference between H + ΔHRf and H − ΔH1 on an FR 1 and between H + ΔHRf and H − ΔH20 on an FR 20 trial.” (Zentall, 2005, p. 280).
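The arithmetic implicit in this caption can be written out explicitly. The symbols below are the same as in the quoted caption; the cancellation and final inequality are a restatement of the model's prediction rather than a quotation from Zentall (2005):

```latex
% Net hedonic change on each trial type (symbols as in Zentall, 2005)
\begin{align*}
\text{FR 1 trial:}  \quad & (H + \Delta H_{Rf}) - (H - \Delta H_{1})
                            = \Delta H_{Rf} + \Delta H_{1} \\
\text{FR 20 trial:} \quad & (H + \Delta H_{Rf}) - (H - \Delta H_{20})
                            = \Delta H_{Rf} + \Delta H_{20}
\end{align*}
% Because the FR 20 requirement produces the larger negative shift,
% \Delta H_{20} > \Delta H_{1}, the net positive shift (the contrast) is
% greater on FR 20 trials, predicting preference for the stimuli that
% follow the high-effort condition.
```

Note that the baseline state $H$ cancels out of both expressions, so the model's prediction depends only on the relative sizes of the negative shifts produced by the two middle components.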
Limitations of the within-trial contrast model
There are several problems with the WTC model. Aw et al. (2011) outlined many of these limitations; they are nevertheless worth restating here in order to contrast the various models. The first limitation is that the model assumes a hedonic state and uses changes in this state (both absolute and relative) as the primary events responsible for contrast. Hedonic state, however, is a term that only vaguely refers to an organism's well-being, and there is currently no agreed-upon definition (Pompilio & Kacelnik, 2005). Hedonic state functions much like a hypothetical construct, and there is no method of objectively determining an organism's hedonic state, much less of detecting changes in that state. Although the term, and the assumed changes in the state it denotes, may be convenient, it is unclear whether hedonic state refers to any measurable state of being.
A second problem with this model is that it describes an upward shift in hedonic state as the result of the presentation of a reinforcer after a less preferred event. Recall, however, that several researchers used a discrimination task as the terminal component and found an effect with both the S+ and S− (Clement et al., 2000; Clement & Zentall, 2002; Gipson et al., 2009; Singer et al., 2007). Because the S− was never associated with a reinforcer, there should be no increase in hedonic state and hence no shift in preference for the S− stimulus.
Given these limitations, the current WTC model appears inadequate. Although it affords some prediction, it falls short of providing a scientific explanation of the phenomenon.
The State-Dependent Valuation Model
Kacelnik and Marsh (2002) proposed the SDV as an alternative model that hypothesizes that the relative value of a reinforcer is directly related to the energetic state or fitness of the organism at the point of reinforcer delivery. A reinforcer delivered at a point of low energy reserves is presumed to be relatively more valuable than that same reinforcer delivered at a point of higher energy reserves. The model further assumes that with repeated exposures to a food item under states of low energy reserves, the utility value of the food item is somehow represented in memory (Pompilio & Kacelnik, 2005).
Limitations of the state-dependent valuation model
One limitation of the SDV model is that it assumes that the function of the middle components is to differentially depress energy reserves. It is unclear whether this is always the case, however, because the effect has been successfully demonstrated with middle components that would not be expected to differentially depress energy reserves, such as delay lengths (DiGian et al., 2004), different schedules of reinforcement (Singer et al., 2007), or anticipated or unanticipated effort (Clement & Zentall, 2002). If energy reserves were not differentially depressed, however, the terminal components would not have differential fitness value, and the effect would not be predicted.
A second limitation is that several studies have shown changes in preference when the terminal components were followed by stimuli with presumably no capacity to increase energy reserves. Alessandri, Darcheville, and Zentall (2008), for example, provided children with short song segments or segments of a cartoon after successful discrimination in the terminal component. Similarly, Klein et al. (2005) had a computer screen display the words “correct” or “incorrect” during discrimination. Furthermore, as noted in the limitations of the WTC model, an increase in preference has been observed with S− that was not associated with any reinforcement. Thus, it appears that for some of the successful demonstrations, the SDV model provides an insufficient explanation.
ALTERNATIVE EXPLANATION: MOTIVATING OPERATIONS AND THE FUNCTION-ALTERING EFFECT
As currently conceptualized, motivating operations (MOs) are stimulus conditions or events that produce a momentary change in the reinforcing effectiveness of other stimuli (the value-altering effect) and a momentary change in the frequency of behaviors (the behavior-altering effect) that have functioned to produce those stimuli (Michael, 2004). Thus, an MO is not associated with differential availability of consequences, but rather produces a temporary change in stimulus value and the probability of behavior that produces that stimulus.
The function-altering effect describes a conditioning process whereby, as a result of a particular learning history, the function of specific stimuli is altered in the presence of other stimuli (Schlinger & Blakely, 1994). If, for example, food is provided for key pecking under a state of food deprivation (an establishing operation), key pecking becomes more likely in the future under that state because the function of the establishing operation has been altered. In this case, the function-altering effect is that food deprivation has acquired an evocative effect: it now evokes key pecking.
Let us now apply the concept of MOs and function-altering effects to the typical presentation of the initial, middle, and terminal components in the reviewed studies. In a typical study on effort, for example, a pigeon might be exposed to both of the following conditions in an alternating fashion:
Condition 1: Initial Component 1 (FR 1) → Middle Component 1 (FR 5) → Terminal Component 1 (red key followed by food reinforcement)
Condition 2: Initial Component 2 (FR 1) → Middle Component 2 (FR 30) → Terminal Component 2 (blue key followed by food reinforcement)
In both conditions, completing the initial component produces one of two effort requirements: low effort (FR 5) or high effort (FR 30). Assuming that the pigeon would prefer to avoid effort expenditure, Middle Components 1 and 2 may both be considered MOs in that each functions to increase the value of stimuli that terminate the middle component. With repeated exposure to training, the function of these establishing operations is altered. However, because of the difference in the magnitude of effort required, Middle Component 2 (FR 30) is less preferred than Middle Component 1 (FR 5). Therefore, although the stimuli that follow both middle components should function as conditioned reinforcers, the stimuli associated with the termination of Middle Component 2 should be valued more than the stimuli associated with the termination of Middle Component 1. The entire procedure, therefore, has served to expose the pigeon to a less preferred stimulus condition (the middle component) and to associate the termination of that condition with the presentation of another stimulus (the terminal component).
It has previously been noted that the terminal component of a chain schedule may become a conditioned reinforcer because it is consistently contiguous with the delivery of a primary reinforcer at the end of the chain (Fantino, 1977). The explanation described here asserts that the same effect occurs in these chain schedules except that the terminal component becomes a conditioned reinforcer not through contiguity with the positive reinforcer but through contiguity with the termination of the middle component. In essence, a stimulus that terminates a less preferred event is a reinforcer, and the less preferred the event, the more valuable the stimulus.
The MO/function-altering explanation is more parsimonious than the SDV model because it does not assume that some events (e.g., 6-s delays; differential reinforcement of other behavior vs. fixed-interval schedules) function to decrease an organism's energy reserves. Nor does it assume that the presentation of cartoons or affirmations functions to increase an organism's biological fitness. The MO interpretation is also more parsimonious than the WTC model because it is able to specify the precise mechanism that is responsible for altering preference (i.e., MOs) as opposed to relying on unmeasured changes in a hypothetical and ill-defined construct (i.e., hedonic state).
In addition, the MO interpretation could help to explain some of the inconsistent findings regarding the effect. The most reliable demonstration of the effect has been when levels of food deprivation were manipulated, whereas the least reliable demonstrations were those that used events such as delay. The MO explanation would predict that changes in preference for the terminal components are directly related to the degree to which preference for the middle components differs. If one middle component was highly aversive and the other only mildly nonpreferred, a large preference change would be expected, and this preference change would diminish as the two middle components became equally aversive. It is possible that there is simply a larger discrepancy in preferences for levels of food deprivation compared to different lengths of delay.
Further, this explanation is able to account for findings regarding increased preference for the S−LP. As noted earlier, the WTC and SDV models predict an increase in preference when a reinforcer is delivered that differentially increases hedonic states or energetic reserves. Because the S− was never associated with reinforcement, neither model would predict preference change. On the other hand, the MO explanation would predict preference change because the S−LP was associated with the termination of a relatively less preferred event.
DIFFERENT PREDICTIONS YIELDED BY EACH MODEL
If an organism was exposed to two training conditions with different middle components (long delay and short delay, both followed by some stimulus), the different models would yield different predictions. The WTC model would predict preference for the stimulus that followed the long delay relative to the stimulus that followed the short delay. The SDV model would not predict differentiated preference because neither delay condition resulted in energetic expenditure. According to the MO interpretation, a prediction could only be made if the two delay conditions were demonstrated to be differentially preferred. That is, if the long delay was less preferred than the short delay, preference should shift towards the stimulus following the long delay. If the short delay was less preferred than the long delay, the opposite prediction would be made.
Furthermore, the MO interpretation can make novel predictions that the WTC and SDV models cannot as easily address. For example, suppose that instead of identical middle components differing along one dimension (e.g., a 5-s delay vs. a 20-s delay), the middle components were a less preferred condition involving energy expenditure (e.g., running) and a less preferred condition involving energy gain (e.g., eating an unpleasant-tasting food item). The predictions of the WTC and SDV models here are unclear. The WTC model cannot specify in advance which condition produces the greater decrease in hedonic state and could make this determination only after the fact. The SDV model is predicated on the expectation of energy expenditure, so the unpleasant-tasting food item does not fit within the model. In contrast, the MO explanation would merely require that one condition be shown to be less preferred than the other in order to predict a change in preference.
Finally, the MO explanation may be extended to situations in which one middle component is nonpreferred and the other is preferred. If, for example, one condition involved an unpleasant tone and the other involved watching a preferred movie clip, the MO explanation would suggest increased preference for the stimulus following the unpleasant tone and decreased preference for the stimulus following the preferred movie clip (because that stimulus is associated with the termination of a preferred event). It is unclear what outcome the WTC model would predict, and the SDV model would predict no preference change because no energy was expended in either condition.
CONCLUSION
The precise circumstances necessary to produce the preference-change phenomenon described in this paper are not fully understood at this time. The effect has been replicated multiple times across a variety of species and middle components, yet there have also been numerous failures to replicate. Some of the discrepancies in findings may be due to procedural differences across studies (e.g., the presentation of a discrimination task vs. single stimulus presentation; identical vs. distinct initial components). It is also possible that these conflicts are due to procedural problems seen across most of the studies reviewed (e.g., a failure to measure preference for middle and terminal components prior to training). Further complicating our understanding of the phenomenon is that the two models currently offered to explain the findings both make different predictions and are unable to account for all of the research findings. This paper offers an alternative explanation to the WTC and SDV models. This alternative explanation relies on the concept of MOs and function-altering effects and suggests that the terminal components are conditioned as reinforcers through contiguity with the termination of the middle components. Although this account remains speculative at this point, it is conceptually systematic, is able to account for inconsistent research findings, and is able to make novel predictions.
The studies reviewed in this paper, along with the alternative explanation, have important implications for behavior analysis. One implication concerns the design of behavioral interventions. When these programs are created, reinforcers are selected and then typically programmed to be delivered on some schedule. Although much attention is given to the effect a reinforcer has on an organism's engagement with a schedule, very little attention is given to the effect that the schedule, and its corresponding work requirements, has on the value of the reinforcer. The studies reviewed here indicate that the work requirement may alter the value of the corresponding reinforcer. Care should therefore be exercised when selecting and delivering reinforcers, because the schedule requirements may either increase or decrease the value of those stimuli.
A second implication is related to the current understanding of the interaction between MOs and the function-altering effect. The value-altering effect of an establishing operation is understood as “an increase in the current [italics added] effectiveness of some stimulus, object, or event as reinforcement” (Cooper, Heron, & Heward, 2007, p. 376). Food deprivation is presumed to momentarily increase the effectiveness of food as a reinforcer. When the level of food deprivation subsides, the effectiveness of food as a reinforcer diminishes. The research reviewed here, however, suggests that the function-altering effect interacts with MOs. That is, the food is made more valuable when it is presented during food deprivation, and that increase in value persists into the future.
The finding that less preferred events alter preference for subsequent stimuli warrants further inquiry. Future research into this phenomenon may help explain how stimulus preferences develop with greater precision than is currently available, and may allow a greater understanding of how the environment and behavior interact and affect one another.
Acknowledgments
I thank Jonathan Ivy, Nancy Neef, and Manish Vaidya for their thoughtful comments and suggestions regarding this paper. The contents of this article were developed under a grant from the U.S. Department of Education, OSEP (H325DO60032) (N. A. Neef, Principal Investigator). These contents do not necessarily represent the policy of the U.S. Department of Education, OSEP, and no endorsement by the federal government should be assumed.
APPENDIX
Footnotes
1. Throughout this paper, chain schedules will be described as consisting of three components (an initial component, middle component, and terminal component). The term initial component will refer to the beginning of the chain, middle component will refer to the event manipulated by the researchers (e.g., response requirement, effort, delay), and terminal component will refer to the conditions that follow the manipulated event. Although the usage is unconventional (cf. Clement et al., 2000), explicit identification of a middle component will allow discussion of a larger variety of events than is possible using conventional chain schedule descriptions.
REFERENCES
- Alessandri J., Darcheville J.-C., Delevoye-Turrell Y., Zentall T.R. Preference for rewards that follow greater effort and greater delay. Learning & Behavior. 2008;36:352–358. doi: 10.3758/LB.36.4.352.
- Alessandri J., Darcheville J.-C., Zentall T.R. Cognitive dissonance in children: Justification of effort or contrast? Psychonomic Bulletin & Review. 2008;15:673–677. doi: 10.3758/pbr.15.3.673.
- Arantes J., Grace R.C. Failure to obtain value enhancement by within-trial contrast in simultaneous and successive discriminations. Learning & Behavior. 2008;36:1–11. doi: 10.3758/lb.36.1.1.
- Arkes H.R., Ayton P. The sunk cost and Concorde effects: Are humans less rational than lower animals? Psychological Bulletin. 1999;125:591–600.
- Armus H.L. Effect of response effort on the reward value of distinctively flavored food pellets. Psychological Reports. 2001;88:1031–1034. doi: 10.2466/pr0.2001.88.3c.1031.
- Aw J.M., Holbrook R.I., Burt de Perera T., Kacelnik A. State-dependent valuation learning in fish: Banded tetras prefer stimuli associated with greater past deprivation. Behavioural Processes. 2009;81:333–336. doi: 10.1016/j.beproc.2008.09.002.
- Aw J.M., Vasconcelos M., Kacelnik A. How costs affect preferences: Experiments on state dependence, hedonic state and within-trial contrast in starlings. Animal Behaviour. 2011;81:1117–1128.
- Clement T.S., Feltus J.R., Kaiser D.H., Zentall T.R. “Work ethic” in pigeons: Reward value is directly related to the effort or time required to obtain the reward. Psychonomic Bulletin & Review. 2000;7:100–106. doi: 10.3758/bf03210727.
- Clement T.S., Zentall T.R. Second-order contrast based on the expectation of effort and reinforcement. Journal of Experimental Psychology: Animal Behavior Processes. 2002;28:64–74.
- Cooper J.O., Heron T.E., Heward W.L. Applied behavior analysis (2nd ed.). Upper Saddle River, NJ: Pearson Education; 2007.
- DiGian K.A., Friedrich A.M., Zentall T.R. Discriminative stimuli that follow a delay have added value for pigeons. Psychonomic Bulletin & Review. 2004;11:889–895. doi: 10.3758/bf03196717.
- Fantino E. Conditioned reinforcement: Choice and information. In: Honig W.K., Staddon J.E.R., editors. Handbook of operant behavior. Englewood Cliffs, NJ: Prentice Hall; 1977. pp. 313–363.
- Friedrich A.M., Clement T.S., Zentall T.R. Discriminative stimuli that follow the absence of reinforcement are preferred by pigeons over those that follow reinforcement. Learning & Behavior. 2005;33:337–342. doi: 10.3758/bf03192862.
- Friedrich A.M., Zentall T.R. Pigeons shift their preference toward locations of food that take more effort to obtain. Behavioural Processes. 2004;67:405–415. doi: 10.1016/j.beproc.2004.07.001.
- Gipson C.D., Miller H.C., Alessandri J., Zentall T.R. Within-trial contrast: The effect of probability of reinforcement in training. Behavioural Processes. 2009;82:126–132. doi: 10.1016/j.beproc.2009.05.006.
- Kacelnik A., Marsh B. Cost can increase preference in starlings. Animal Behaviour. 2002;63:245–250.
- Klein E.D., Bhatt R.S., Zentall T.R. Contrast and justification of effort. Psychonomic Bulletin & Review. 2005;12:335–339. doi: 10.3758/bf03196381.
- Marsh B., Schuck-Paim C., Kacelnik A. Energetic state during learning affects foraging choices in starlings. Behavioral Ecology. 2004;15:396–399.
- Michael J. Concepts and principles of behavior analysis. Kalamazoo, MI: Association for Behavior Analysis; 2004.
- O'Daly M., Meyer S., Fantino E. Value of conditioned reinforcers as a function of temporal context. Learning and Motivation. 2005;36:42–59.
- Pompilio L., Kacelnik A. State-dependent learning and suboptimal choice: When starlings prefer long over short delays to food. Animal Behaviour. 2005;70:571–578.
- Pompilio L., Kacelnik A., Behmer S.T. State-dependent learned valuation drives choice in an invertebrate. Science. 2006;311:1613–1615. doi: 10.1126/science.1123924.
- Schlinger H.D., Blakely E. A descriptive taxonomy of environmental operations and its implications for behavior analysis. The Behavior Analyst. 1994;17:43–57. doi: 10.1007/BF03392652.
- Singer R.A., Berry L.M., Zentall T.R. Preference for a stimulus that follows a relatively aversive event: Contrast or delay reduction? Journal of the Experimental Analysis of Behavior. 2007;87:275–285. doi: 10.1901/jeab.2007.39-06.
- Vasconcelos M., Urcuioli P.J. Deprivation level and choice in pigeons: A test of within-trial contrast. Learning & Behavior. 2008;36:12–18. doi: 10.3758/lb.36.1.12.
- Vasconcelos M., Urcuioli P.J. Extensive training is insufficient to produce the work-ethic effect in pigeons. Journal of the Experimental Analysis of Behavior. 2009;91:143–152. doi: 10.1901/jeab.2009.91-143.
- Vasconcelos M., Urcuioli P.J., Lionello-DeNolf K.M. Failure to replicate the “work ethic” effect in pigeons. Journal of the Experimental Analysis of Behavior. 2007;87:383–399. doi: 10.1901/jeab.2007.68-06.
- Vorndran C.M., Lerman D.C. Establishing and maintaining treatment effects with less intrusive consequences via a pairing procedure. Journal of Applied Behavior Analysis. 2006;39:35–48. doi: 10.1901/jaba.2006.57-05.
- Waite T.A., Passino K.M. Paradoxical preferences when options are identical. Behavioral Ecology and Sociobiology. 2006;59:777–785.
- Wardle J., Herrera M.L., Cooke L., Gibson E.L. Modifying children's food preferences: The effects of exposure and reward on acceptance of an unfamiliar vegetable. European Journal of Clinical Nutrition. 2003;57:341–348. doi: 10.1038/sj.ejcn.1601541.
- Zentall T.R. A within-trial contrast effect and its implications for several social psychological phenomena. International Journal of Comparative Psychology. 2005;18:273–297.
- Zentall T.R. Within-trial contrast: When you see it and when you don't. Learning & Behavior. 2008;36:19–22. doi: 10.3758/lb.36.1.19.


