Published in final edited form as: J Appl Behav Anal. 2023 May 31;56(3):498–519. doi: 10.1002/jaba.1004

Pavlovian learning and conditioned reinforcement

Gregory J. Madden, Saba Mahmoudi, Katherine Brown

Abstract

Conditioned reinforcers are widely used in applied behavior analysis. Basic-research evidence reveals that Pavlovian learning plays an important role in the acquisition and efficacy of new conditioned reinforcer functions. Thus, a better understanding of Pavlovian principles holds the promise of improving the efficacy of conditioned reinforcement in applied research and practice. This paper surveys how (and if) Pavlovian principles are presented in behavior-analytic textbooks; imprecisions and disconnects with contemporary Pavlovian empirical findings are highlighted. Thereafter, six practical principles of Pavlovian conditioning are presented along with empirical support and knowledge gaps that should be filled by applied and translational behavior-analytic researchers. Innovative applications of these principles are outlined for research in language acquisition, token reinforcement, and self-control.

Keywords: conditioned reinforcement, impulsivity, language acquisition, Pavlovian conditioning, praise, token economy


Two cornerstones of applied behavior analysis are (1) providing effective therapy that produces practical, noticeable improvements in client behavior and (2) building these therapies on a rigorous foundation of behavioral principles (Baer et al., 1968). These include both operant and Pavlovian principles.1 In the landscape of empirically supported principles used by applied behavior analysts, reinforcement is undoubtedly at the summit, but a nearby peak must be conditioned reinforcement (Vollmer & Hackenberg, 2001). Tokens, points, and praise are widely employed in applied settings to establish and maintain new, more adaptive behaviors (Donahoe & Vegas, 2021; Hackenberg, 2018). These conditioned-reinforcing consequences, when effectively arranged, render applications more practical (e.g., fewer tangible and edible reinforcers are needed), portable, and helpful in bridging the delay between the desired response and backup reinforcer (Russell et al., 2018). For this reason, conditioned reinforcement has long been recognized as an effective clinical practice (American Psychological Association, 1993).

But these points have been made for decades and are well known to readers of this journal. In addition, excellent reviews of the conditioned and token reinforcement literatures have been published in this journal and elsewhere (Bell & McDevitt, 2014; Cló & Dounavi, 2020; Fantino, 1977; Hackenberg, 2009, 2018; Williams, 1994). Thus, we seek to do something different. This paper will focus on the less often articulated Pavlovian-learning principles underlying conditioned reinforcement. What are the basic Pavlovian research findings, relevant to conditioned reinforcement, that rise to the level of a principle (i.e., systematic experimental manipulations producing replicable behavioral outcomes)? If these principles were better understood, would they inspire innovative research that develops more effective approaches to establishing and using conditioned reinforcers? Such developments require that knowledge gaps be identified and addressed.

Before diving into Pavlovian principles, a brief refresher on Pavlovian learning may be useful. To that end, imagine a hungry rat in an operant chamber that obtains free food (the unconditioned stimulus, US) once, on average, every 2 min (i.e., a variable-time, VT 120-s schedule). Before each of these free-food events, a cue light is illuminated for 3 s. With repeated exposure to this Pavlovian contingency (if light → then food), the light will acquire a conditioned-stimulus (CS) function. That is, when presented alone, it will evoke a variety of conditioned responses (e.g., salivation, amygdala activation, attentional bias toward and physically approaching the cue; Bermudez & Schultz, 2010; Bucker & Theeuwes, 2017; Pavlov, 1927; Robinson & Flagel, 2009).2 Importantly, these conditioned responses evoked by the CS are not operant behaviors maintained by consequences (see the negative auto-maintenance and sensory preconditioning literatures; Davey et al., 1981; Rescorla, 1980; Seidel, 1959; Thompson, 1972; Williams & Williams, 1969).
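
For readers who find a concrete timeline helpful, the following minimal sketch simulates the arrangement just described. The sketch is ours rather than a procedure from the cited studies; the function name, parameter values, and the exponential approximation of the VT schedule are all illustrative assumptions.

```python
import random

def simulate_vt_pavlovian(n_trials=5, mean_c=120.0, cue_s=3.0, seed=1):
    """Timeline sketch: free food (US) on a VT 120-s schedule, each
    delivery preceded by a 3-s cue light (the NS, eventually the CS)."""
    random.seed(seed)
    t, events = 0.0, []
    for _ in range(n_trials):
        # Approximate the VT schedule with exponential intervals (an
        # assumption; real VT schedules sample from a fixed interval list).
        t += random.expovariate(1.0 / mean_c)
        events.append((max(0.0, round(t - cue_s, 1)), "cue light on (NS)"))
        events.append((round(t, 1), "food delivered (US)"))
    return events

for time_s, label in simulate_vt_pavlovian():
    print(f"{time_s:8.1f} s  {label}")
```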

There is wide agreement that Pavlovian learning plays an important role in the acquisition of a conditioned reinforcement function. For example, when we reviewed six commonly used applied behavior-analytic textbooks and edited volumes3, Pavlovian learning was either explicitly identified as important (Chance, 1998; Cooper et al., 2020; Fisher et al., 2021) or implied as important by the use of terms like “pairing” and “associated” (Martin & Pear, 2019; Mayer et al., 2018; Miltenberger, 2016), two terms commonly used in the Pavlovian literature. This textbook linkage of conditioned reinforcement and Pavlovian learning is supported by the basic-research literature. As we will see throughout this paper, principles of effective Pavlovian conditioning align closely with the principles of effective conditioned reinforcement (Fantino, 1977; Mackintosh, 1974; Shahan, 2010; Skinner, 1938). Additionally, basic researchers working independently in the Pavlovian and operant domains arrived at very similar quantitative models of their respective phenomena (Gibbon & Balsam, 1981; Preston & Fantino, 1991; Shahan & Cunningham, 2015; Williams, 1994). Thus, a thorough understanding of Pavlovian learning principles should facilitate translational research and practice. But before those principles can be considered, it is important to clear the board of some commonly held misunderstandings about Pavlovian learning, misunderstandings rooted in imprecise descriptions of Pavlovian procedures.

THE IMPRECISION OF “PAIRING”

Five of the textbooks and chapters we reviewed (Cooper et al., 2020; Donahoe & Vegas, 2021; Martin & Pear, 2019; Mayer et al., 2018; Miltenberger, 2016) fairly uniformly describe the Pavlovian conditioning procedure as one in which a neutral stimulus (NS) is repeatedly paired with a US until the NS acquires a CS function. “Pairing” is a commonly used term in the Pavlovian literature and in applied behavior-analytic papers on conditioned reinforcement (e.g., Esch et al., 2009; Ivy et al., 2017). But what exactly does “pairing” mean?

Cooper et al. (2020) describe Pavlovian conditioning as a “stimulus-stimulus pairing procedure” which is “most effective when the NS is presented just before or simultaneous with the US” (p. 31). “Simultaneous” pairing might be read to suggest that presenting the NS and US at the same time is an effective training method (see Figure 1A; left and right edges of the boxes above the timeline indicate the onset and offset of each stimulus). Martin and Pear (2019) visually illustrate NS-US pairing with simultaneous onsets of these two stimuli (their Figures 5.1–5.4) but indicate in the text that the NS should precede the US by approximately 0.5 s. Mayer et al. (2018) provide several examples of conditioned reinforcers in which the NS and US are presented simultaneously, and Cooper et al.’s Figure 2.1 illustrates respondent pairing as “NS+US,” which is open to interpretation. In Pavlovian circles, this arrangement is referred to as a simultaneous conditioning procedure, and it is generally ineffective (for review, see Lattal, 2013). That is, when the NS onset and offset occur in tandem with US onset and offset, the NS will evoke minimal (if any) conditioned responding when the NS is presented alone (e.g., Fitzwater & Thrush, 1956; Pavlov, 1927; White & Schlosberg, 1952).

FIGURE 1.

Presentations of neutral and unconditioned stimuli in Pavlovian conditioning procedures. Four ways in which the neutral stimulus (NS) and unconditioned stimulus (US) might be presented in time in these procedures.

Early experiments on conditioned reinforcement evaluated this simultaneous NS+US Pavlovian pairing procedure as a means of establishing a new conditioned reinforcer function. During the test of conditioned reinforcement that followed (i.e., will the rat emit an operant response to produce the NS?), the NS failed to acquire conditioned reinforcing properties (Bersh, 1951; Marx & Knarr, 1963; Schoenfeld et al., 1950; Stubbs & Cohen, 1972). Thus, we may conclude that stimulus-stimulus pairing does not mean that the would-be conditioned reinforcer (the NS) should be presented simultaneously with the appetitive US (i.e., the back-up reinforcer) during the Pavlovian conditioning phase. Perhaps when Cooper et al. (2020) wrote “simultaneous,” they meant that the NS preceding the US may partially overlap with the US, which is common in effective Pavlovian conditioning procedures.

The remainder of the Cooper et al. (2020) quote from above indicates that a Pavlovian conditioning procedure will be more effective if the NS occurs “just before” the US (Figure 1B). What counts as “just before” is not specified but will be given further consideration in the next section. For now, we will note that “just before” must mean that an effective Pavlovian conditioning procedure will ensure that the NS precedes the US. A similar conclusion was reached by Pavlov (1927) when he found that a NS→US sequence readily established a CS function, whereas simultaneously presenting these stimuli (Figure 1A) did not. Likewise, the early conditioned-reinforcement studies reporting that simultaneous NS+US pairings did not work also reported that a Pavlovian NS→US sequence reliably established a new conditioned-reinforcer function (Bersh, 1951; Marx & Knarr, 1963; Stubbs & Cohen, 1972). In sum, “pairing” is a frequently used but imprecise term. When used as shorthand for a procedure designed to change behavior, it fails to specify how and when the NS and US should be arranged.

“Pairing” implies the centrality of temporal contiguity

When Cooper et al. (2020) note that effective Pavlovian procedures present the NS just before or simultaneous with the US, the reader may deduce that pairing involves temporal contiguity. All the textbooks and chapters we reviewed agree on this point and some suggest that the optimal Pavlovian procedure will present the NS 0.5 s before the US, the so-called “half-second rule” (Donahoe & Vegas, 2021; Martin & Pear, 2019; Miltenberger, 2016). However, this textbook advice has been consistently rejected in seminal review papers in the Pavlovian and operant literatures (Balsam et al., 2010; Fantino, 1977; Lattal, 2013; Rescorla, 1988; Shahan, 2010). These authors outline considerable empirical evidence at odds with the theory that repeatedly presenting the NS and US in close temporal proximity (e.g., within 0.5 s of the US) is either a necessary or a sufficient condition for Pavlovian learning or for establishing a conditioned-reinforcement function. One reason for this stance has already been discussed—simultaneous presentation of the NS and US is generally ineffective, despite the two stimuli being entirely paired in time.

Another category of evidence against the pairing hypothesis is this: if repeatedly presenting the NS just before the US were sufficient, then robust Pavlovian learning would occur whenever these stimuli co-occurred. In Figure 1C, the NS→US sequence occurs just as often as it did in 1B. However, the US also occurs just as often without a preceding NS event. If this happens, the NS is far less likely to acquire a CS function (e.g., Rescorla, 1968), and hence, will not function as a conditioned reinforcer (Hyde, 1976). This might inadvertently occur in applied settings if the client’s preferred edible (US) is provided just as often without the would-be conditioned reinforcer (NS) as with it. Worse still, if the US occurs more often when the NS is not present than when it is “paired” with the US, the NS can acquire an inhibitory function; that is, conditioned responding will be less frequent when the NS is present, despite the fact that it is always temporally paired with the US (Gottlieb & Begej, 2014; Rescorla, 1969).
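
One way to make the contiguity-versus-contingency distinction concrete is the simple ΔP contingency index, a standard summary of the kind of manipulation Rescorla (1968) arranged. The sketch below is our illustration; ΔP is not a formula from this paper, and the probability values are hypothetical.

```python
def delta_p(p_us_given_ns, p_us_given_no_ns):
    """Contingency index: P(US | NS) minus P(US | no NS). Positive values
    predict excitatory conditioning, values near zero predict little or
    none, and negative values predict an inhibitory NS function."""
    return p_us_given_ns - p_us_given_no_ns

print(delta_p(1.0, 0.0))  # Figure 1B: the US only ever follows the NS
print(delta_p(0.5, 0.5))  # Figure 1C: the US is just as likely without the NS
print(delta_p(0.2, 0.8))  # US more likely in the NS's absence -> inhibition
```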

Yet another category of evidence against the pairing hypothesis is that a contiguous NS→US sequence, although often helpful (all else being equal), is not a necessary condition for establishing a CS function. Many Pavlovian studies have demonstrated an evocative effect of the CS when, for example, a 12-s no-stimulus interval separates CS offset and US onset (Figure 1D; e.g., Kaplan, 1984; Lucas et al., 1981). In these trace-conditioning procedures, the conditioned response is typically not evoked during the CS. It occurs instead during the so-called “trace interval” (the no-stimulus interval between CS and US), and it increases in frequency as US delivery approaches (Marlin, 1981). Likewise, Pavlovian trace conditioning can be used to establish a new conditioned-reinforcer function (Jenkins, 1950; Thrailkill & Shahan, 2014; comparison of trace-only and random-SO groups in the latter study). Thus, the oft-quoted half-second rule of Pavlovian conditioning (and conditioned reinforcement) is, at minimum, incomplete. More broadly, the position that temporal contiguity between the NS (would-be conditioned reinforcer) and US (back-up reinforcer) is either necessary or sufficient may be rejected.

SIX PRINCIPLES OF PAVLOVIAN LEARNING AND CONDITIONED REINFORCEMENT

Having cleared the board of some common misunderstandings, we may now consider six principles of Pavlovian learning that applied conditioned-reinforcement researchers should be aware of as they translate principles into effective practice and fill remaining knowledge gaps in the literature. Some, but not all, of these principles are covered in the textbooks we reviewed. For the sake of comprehensiveness, we repeat those textbook principles here, though with language modifications that remove the imprecise “pairing” terminology. The sequence of principles presented here roughly follows the sequence in which they would be considered when designing a Pavlovian procedure for establishing a new CS/conditioned-reinforcer function. For each principle, we outline the extant empirical support, identify knowledge gaps, and discuss how the principle might be employed in applied research. These principles have been better established in the Pavlovian learning literature than in operant conditioned-reinforcement experiments. In the Supporting Information, readers may find a list of the knowledge gaps and applied research opportunities identified throughout this paper. By filling these gaps, translational researchers can evaluate the full potential of these principles in enhancing the technology of effective conditioned reinforcement.

Principle 1: Choose a salient and novel NS

Prior to appetitive Pavlovian conditioning, the would-be CS/conditioned-reinforcer is an NS. Our immediate environments are filled with neutral, functionless stimuli, many of which we do not attend to. Learning a new “if CS → then US” Pavlovian contingency is facilitated by ensuring that the NS (the would-be CS/conditioned-reinforcer) is salient (see Hall et al., 1977, for empirical support), which is to say that there is a high probability of attending to the stimulus (Catania, 2013). One way to increase the salience of a stimulus is to make it novel, something not previously encountered (Horstmann, 2015). If the screen you are reading this on right now went completely red for 1 s and then returned to normal, this would be simultaneously salient and novel. It would happen in a location where you were already looking (salient) and, presumably, nothing like it has happened before (novel). Another advantage of novel stimuli is that they have no unwanted/unknown experiential baggage that can negatively impact Pavlovian learning. For example, if the individual has previously encountered the NS without the US, then it will take longer for that stimulus to acquire a CS function when a Pavlovian contingency is introduced. This latent inhibition effect has been demonstrated in many species, including humans (Escobar et al., 2003; Lubow, 1973).

We are aware of no systematic experiments evaluating if these NS-history effects also slow the acquisition of a new conditioned-reinforcer function. Petursdottir et al. (2011) manipulated this variable, but because they could not reliably establish a CS/conditioned-reinforcer function, the manipulation did not shed light on the effect of latent inhibition for human subjects. Until this research gap is filled, it seems prudent to ensure that stimuli to be newly established as conditioned reinforcers are both salient and novel.

Counter to this advice is the common research practice of arranging a baseline session in which the NS is repeatedly presented without the US (e.g., Moher et al., 2008). The practice is designed to ensure that the putative NS does not already function as a conditioned reinforcer. However, latent inhibition predicts that this control procedure will have negative effects on subsequent Pavlovian learning. Other control procedures are available. For example, some applied researchers have adopted the practice of using two neutral stimuli (e.g., a red and a green token) and selecting one for NS→US presentations and the other for NS presentations at times uncorrelated with the US, with assignments counterbalanced across participants (e.g., Lepper & Petursdottir, 2017; Petursdottir et al., 2011). If, across participants, the only NS to acquire a CS/conditioned-reinforcer function is the one that occurred prior to the US, then confidence that the conditioned-reinforcer function was learned, rather than preexisting, increases with the sample size. For studies with just a single participant, this strategy will not be convincing, so a very brief pre-test of the conditioned-reinforcer efficacy of the NS would appear to be unavoidable.

The Principle-1 advice to use a novel NS is also inconsistent with current recommendations to find a conditioned reinforcer that is visually preferred or engaging to the individual (Hine et al., 2018). That recommendation is presumably given under the assumption (or clinical observations) that preferred objects are valued by the client, and this added value may aid in maintaining the desired operant behavior. Inconsistent with this assumption, Fernandez (2021) reported that “interest-based tokens” were no more effective in maintaining behavior than novel conditioned reinforcers. If the applied behavior-analytic researcher is interested in demonstrating conditioned reinforcement effects, it is important to control all aspects of the conditioned reinforcer, including past experience with that stimulus.

Principle 1 may also run counter to the previously mentioned half-second rule (i.e., the NS should be presented no more than 0.5 s before the US). This rule is generally ignored by Pavlovian researchers outside the eye-blink conditioning literature. Using such a brief NS may reduce the salience of that stimulus. Perhaps this explains why a common procedure in applied research is to present other stimuli along with the NS (e.g., calling the participant’s name; Petursdottir et al., 2011) so as to draw client attention to the imminent NS presentation. As discussed below in Principle 4, these additional attention-directing stimuli can inhibit or prevent the NS from acquiring a CS/conditioned-reinforcer function. Pavlovian experiments routinely use NS durations exceeding 0.5 s, which increases salience and may facilitate acquisition of the CS function.

Applying Principle 1

Before research begins, the applied behavior analyst should collaborate with stakeholders and members of the clinical team to identify a salient/novel NS. Salience increases with stimulus intensity, so whatever the dimension of the NS (visual, auditory, etc.), it should have greater intensity than other background stimuli in that same dimension. To promote further salience, the NS should be presented in a way that is hard to miss. Just as a red screen momentarily taking the place of this article would be highly salient to the reader, the salience of a visual NS will be enhanced if it is presented near the client and within their field of vision. Choosing a novel stimulus means the NS cannot be something the client sees, hears, smells, etc. daily (consider latent inhibition effects); an ideal NS will be a salient stimulus with which the client has no prior history.

In applied research settings, it may be useful to conduct a single test trial to evaluate if the NS is sufficiently salient/novel. The trial consists of the salient presentation of the NS (and nothing else); no warning stimuli should precede it (e.g., moving the client to a new location or calling their name). If the client orients toward the NS, even momentarily, the test is passed. If attention is not shifted, then a more salient NS should be selected. Such NS test trials should be conducted just once, as repeatedly presenting the NS without the US could produce a latent-inhibition effect.

Some clinical populations (e.g., individuals with autism spectrum disorders, ASD) may have deficits in shifting their attention between stimuli or shifting from a preferred stimulus. The NS test just described may not be appropriate with those populations, though surprisingly little research has evaluated Pavlovian learning among individuals with ASD who require substantial or very substantial support (American Psychiatric Association, 2013; Powell et al., 2016). If the attention test is failed with more than one NS, then the test may need to be conducted in a location with fewer potential distractor stimuli. If the test is passed in that setting, then the Pavlovian training that follows (see below) should also be conducted in that setting. Once that training is complete, training may be migrated out of that setting, continuing until the new CS/conditioned-reinforcer is established in the settings in which it will be used.

Principle 2: Arrange an effective US/backup-reinforcer

After choosing a salient and novel NS, one must select an appetitive US (i.e., a reinforcer) to follow the NS. Principle 2 is straightforward and familiar to readers of this journal. The principle is exemplified in the widely followed advice to use generalized conditioned reinforcers, which are associated with a variety of backup reinforcers, at least one of which should address currently operative motivating operations (e.g., Hackenberg, 2018; Martin & Pear, 2009).

Principle 2 applies to both Pavlovian learning and conditioned reinforcement. With respect to the former, if the US does not reliably elicit an unconditioned response, then the NS that precedes it will be similarly ineffective in acquiring a CS function. This has been evaluated in laboratory studies that manipulate the magnitude of the US or the motivating operation (e.g., animal subjects’ food deprivation level). The typical finding is that both (a) the speed of acquisition of a CS function and (b) the magnitude of the conditioned response are determined by the size/efficacy of the US (Annau & Kamin, 1961; Barry, 1959; Morris & Bouton, 2006; Sparber et al., 1991).

The same is true when attempting to establish a new conditioned reinforcer function. Laboratory experiments that manipulate the quality/size of the backup reinforcer, or the motivating operation relevant to backup reinforcer efficacy, find that a more effective backup reinforcer facilitates establishing a new conditioned reinforcer function (Burton et al., 2011; Tabbara et al., 2016; Wolfe, 1936). Likewise, if a client in an applied setting will not engage in operant behavior to acquire the backup reinforcer, then they will not work for a stimulus signaling that the backup reinforcer is coming (Moher et al., 2008).

Applying Principle 2

Applying Principle 2 is easier in principle than it is in practice. Readers will be familiar with the many preference-assessment techniques used to identify stimuli that may function as reinforcers (e.g., Fisher et al., 1992; Pace et al., 1985). Although some conditioned-reinforcement experiments have used preference assessments to select the US (e.g., Lepper & Petursdottir, 2017; Petursdottir et al., 2011), it is difficult to evaluate the utility of preference assessments in this context because Pavlovian training was often unsuccessful, perhaps because principles other than Principle 2 were not adhered to.

A potential problem with the preference-assessment methodology is that a simple preference for one stimulus over another may not be predictive of how much operant behavior the preferred stimulus will maintain when it is arranged as a reinforcer (see Madden et al., 2023, for review). Although it is often impractical to do more than a preference assessment in applied settings (Poling, 2010), establishing the reinforcing function of the US is critical to the success of applied research using Pavlovian procedures. Thus, at minimum, it must be shown that the US will maintain operant behavior at an above-baseline level. Better still would be to demonstrate that the reinforcer will maintain operant behavior when more than one response is required per reinforcer.

Principle 3: Large C/T ratios are better than small ones

So far, if the investigator is adhering to Principles 1 and 2, they will have identified a salient, novel, attention-attracting stimulus that will acquire a CS/conditioned-reinforcer function (provided the remaining principles are followed) and an effective US/backup reinforcer. Next, we must consider the temporal logistics of arranging these stimulus events. Our textbooks specify that the NS must precede the US, but precisely when and, just as importantly, how much time should elapse between US events are left unspecified. These logistics are addressed by Principle 3: large C/T ratios are better than small ones.

The C/T ratio may be new to some readers, but it is at the core of important quantitative models of Pavlovian learning (Balsam & Gallistel, 2009) and conditioned reinforcement (Fantino, 1977; Shahan & Cunningham, 2015). The latter models make predictions that are very similar to even more complex quantitative models of conditioned reinforcement (Christensen & Grace, 2010; Grace, 1994; Mazur, 2001) which, perhaps because of this quantitative complexity, have not had much influence on the use of conditioned reinforcers in applied behavior-analytic research. To address this, the C/T ratio is offered as a practical simplification of more complex theories.

Like all ratios, the C/T ratio specifies a division: C divided by T. The numerator of the ratio, C, stands for cycle time, which is the average time between US events. So, if in a Pavlovian conditioning experiment, free food pellets are provided to a rat once, on average, every 2 min, then C = 120 s. The denominator of the C/T ratio, T, refers to the interval from NS onset until US delivery; if the NS is a cue light that is turned on for 3 s before the US occurs, then T = 3 s. Note that after the NS acquires a CS function, T refers to the interval from CS onset until US delivery.

Having defined our terms, let us consider how C and T values influence Pavlovian learning. As previously noted, all else being equal, Pavlovian conditioning will often be more effective if there is temporal contiguity in the CS→US sequence. To illustrate how this is captured by C/T, we will keep C (the inter-US interval) constant at 120 s, and manipulate T. In Figure 2A, the cue-light (CS) is illuminated for 3 s prior to food delivery; therefore, T = 3. In Figure 2B, the light is illuminated for 60 s prior to food, so T = 60. Given these parameters, Principle 3 predicts that Pavlovian conditioning will work better (e.g., faster acquisition, higher conditioned-response rates) in Panel A, when T = 3. This common-knowledge temporal-contiguity principle is quantified by the C/T ratios to the right of the timelines. When the quotient of the C/T ratio is 40, it means that when the cue light comes on, food will be delivered 120/3 = 40 times sooner than normal (“normal” referring to the mean interval separating US events, C). In Panel B, when the C/T ratio is 2, food will be delivered 120/60 = 2 times sooner than normal—not bad, but not anywhere near as good as 40 times sooner. By shortening T, we increase the C/T ratio and, according to Principle 3, large C/T ratios are better (more effective) than small ones.

FIGURE 2.

Three Pavlovian conditioning procedures. Panels A and B illustrate the two durations in the C/T ratio. C is the “cycle time” or the average interval separating successive unconditioned stimulus (US) events; T is the interval from conditioned stimulus (CS) onset to US delivery. By dividing C by T, we obtain a C/T ratio value (the quotient). In Panel C, the CS→US temporal contiguity is the same as in Panel A but the average inter-US interval (C) is shorter. Hence, Principle 3 holds that Panel A is the more effective procedure because of its longer duration of C.

Of course, T is only half of the story. The numerator, C, is also important in determining the quotient, but it is often overlooked. In Figure 2C, temporal contiguity between the cue-light and food is as it was in Figure 2A (T = 3 s), but the inter-US interval (C) is, on average, 6 s instead of 120 s. Decreasing C reduces the quotient (C/T = 2), which, according to Principle 3, will reduce the efficacy of the Pavlovian conditioning procedure relative to Figure 2A (C/T = 40). If all we know is that CS-US pairing is important, we would predict that these Figure 2A and 2C procedures would produce comparable outcomes—there has been no change in temporal contiguity. By contrast, Principle 3 holds that, all else being equal, longer inter-US intervals facilitate Pavlovian learning and make for more effective conditioned reinforcers.
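
Because this C/T arithmetic recurs throughout the remainder of the paper, a small helper function may be useful. This is our sketch of the computation the figure walks through; nothing beyond the division itself comes from the article.

```python
def ct_ratio(c_seconds, t_seconds):
    """C/T ratio: mean inter-US interval (C) divided by the interval from
    CS onset to US delivery (T). Principle 3: larger is better."""
    return c_seconds / t_seconds

print(ct_ratio(120, 3))   # Panel A: 40.0 (most effective)
print(ct_ratio(120, 60))  # Panel B: 2.0 (long T erodes the ratio)
print(ct_ratio(6, 3))     # Panel C: 2.0 (same contiguity as A, but short C)
```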

A Principle 3 metaphor that may help readers remember and intuitively understand the C/T ratio is to think of an effective CS as a Paul Revere stimulus—its onset signals that “The US is coming! The US is coming!” (Madden et al., 2021).4 When Paul Revere shouted his famous warning that the British army was approaching the city, the townspeople learned that a normally infrequent US event (long-duration C) was imminent (brief T duration and large C/T ratio). This warning undoubtedly evoked a variety of emotional responses (e.g., fear, excitement) among the soldiers, tradespeople, and citizens nearby. Less useful would be if Paul’s warning mirrored Figure 2C; it would be like yelling, “The British are coming!” 3 s prior to each in a long line of troops entering the city. Because the C/T ratio value is so small, such “warnings” would evoke very little conditioned emotional responding; indeed, after the first in this series of warnings (the only one with a large C/T ratio), the rest are likely to be ignored. Thus, if we want our salient/novel NS to function as an effective CS/conditioned-reinforcer, Principle 3 holds that we will arrange a large C/T ratio.

The empirical support for Principle 3 will be separately evaluated in the Pavlovian and conditioned-reinforcement domains. First, there is a sizeable empirical literature showing that experimental manipulations of C and T, when they increase the C/T ratio, yield faster Pavlovian learning (i.e., fewer trials, or sessions, are needed before the NS acquires a CS function; e.g., Gibbon et al., 1977; Lattal, 1999; Perkins et al., 1975; Stein et al., 1958; Terrace et al., 1975; Ward et al., 2012). To our knowledge, only one human laboratory study has experimentally manipulated the duration of C. Consistent with Principle 3, Nelson et al. (2014) reported that increasing the interval between US events (alien spaceships appearing on the screen) decreased the number of Pavlovian training trials needed for an NS (light on the display panel) to acquire a CS function. To our knowledge, no human Pavlovian experiments have manipulated T. Clearly, further studies are needed in both basic and translational research.

Beyond acquisition speed, larger C/T ratios typically produce a CS that evokes more conditioned responding than is achieved with a small C/T ratio (Lattal, 1999; Nelson et al., 2014; Perkins et al., 1977). One form of that conditioned responding is physical attraction to the CS; that is, when a large C/T ratio is arranged with an appetitive US, nonhumans will often physically approach and interact with the CS (Burns & Domjan, 2001; Lee et al., 2018; Thomas & Papini, 2020; van Haaren et al., 1987). Similarly, humans will demonstrate a compulsive attentional bias to a Pavlovian CS in eye-tracking studies (Bucker & Theeuwes, 2017, 2018; Garofalo & di Pellegrino, 2015), although, to the best of our knowledge, no studies have evaluated whether this bias is more robust or emerges more quickly with larger C/T ratios.

The other category of empirical support for Principle 3 comes from conditioned reinforcement experiments. Larger C/T ratios generally produce larger conditioned reinforcement effects. This has most often been demonstrated in concurrent-chains procedures where conditioned reinforcer efficacy is gauged by preference for one conditioned reinforcer over another. In this choice procedure, conditioned reinforcers with larger C/T ratios are preferred over smaller ratios (Fantino, 1977; Fantino et al., 1993; Williams & Fantino, 1978; see Shahan & Cunningham, 2015, for review). Additional evidence comes from the observing-response literature. In a nonhuman study of observing, contingencies alternate between periods of time when food can be obtained and periods when food is unavailable; however, the subject cannot tell which period they are in at any given time. In that context, pressing a lever produces a stimulus (e.g., a light) that is correlated with reinforcer-availability periods. If the lever is pressed at an above-baseline level, that stimulus (the light) functions as a conditioned reinforcer. In these observing experiments, conditioned reinforcers with larger C/T ratios maintain more observing responses (Auge, 1974; Case & Fantino, 1981; Roper & Zentall, 1999).

Surprisingly few studies have examined the effects of the C/T ratio on single-operant responding, and, to our knowledge, none have evaluated its effects on human single-operant behavior. This is a significant research gap because conditioned reinforcers are most often used in applied settings to maintain human operant behavior. That is, we are not usually asking clients to choose between conditioned reinforcers (concurrent chains) or to work to produce a schedule-correlated stimulus (observing procedure); we are trying to establish and maintain adaptive behavior with a conditioned reinforcing consequence.

In the nonhuman lab, we are aware of just two studies that have evaluated the effects of manipulating C or T on single-operant behavior maintained by a conditioned reinforcer. In the first, Bersh (1951) had rats complete six appetitive Pavlovian training sessions to establish a light as a CS. Across groups, the light was illuminated between 0.5 and 10 s prior to the delivery of food, which occurred once every C = 35 s, regardless of group; this produced a C/T ratio range of 3.5 to 70. In post-training test sessions, pressing a lever turned on the light-CS for 1 s, but food was never delivered (Pavlovian extinction). Responding for the CS was lowest in the C/T = 3.5 group (T = 10 s) and rates in the remaining groups were significantly higher, though the latter were statistically undifferentiated from each other. The other experiment was conducted in our lab (Mahmoudi et al., 2022). Unlike Bersh, we manipulated the duration of C during Pavlovian training (14 s, 28 s, or 96 s), while holding T constant at 8 s. In a post-training test of conditioned reinforcement similar to the one arranged by Bersh, rats in the C/T = 12 group responded to produce the CS significantly more than rats in the C/T = 1.75 or 3.5 groups; the latter two groups were statistically undifferentiated. Thus, consistent with Principle 3, large C/T ratios maintained more responding than small ones. However, neither study showed a direct relation between C/T value and rate of responding for the conditioned reinforcer (e.g., doubling C/T from 1.75 to 3.5 in the Mahmoudi et al., 2022, study did not increase responding).

Clearly more research is needed to flesh out the parametric relation between C/T value and single-operant responding maintained by a conditioned reinforcer. Only two nonhuman parametric studies have been conducted and the relation has not been investigated at all with humans. The few human studies evaluating C/T ratio effects on conditioned reinforcer efficacy have used concurrent-chains schedules in which participants choose between conditioned reinforcers (Alessandri et al., 2010, 2011; Belke et al., 1989; Stockhorst, 1994)6. This approach does not assess how much behavior can be maintained by a conditioned reinforcer. Future studies should take care to manipulate C and T and arrange some conditions with the same C/T ratio but composed of different C and T component values (e.g., 100/10 = 10 = 500/50). Finding an optimal range of C, T, and C/T values—those that (a) speed Pavlovian learning, (b) quickly establish a new conditioned-reinforcer function, and (c) produce a conditioned reinforcer that maintains peak single-operant responding—has important translational implications.

Applying Principle 3: Token reinforcers

Having identified the empirical support for Principle 3 and some remaining gaps to be filled, we will consider how existing knowledge might be useful in designing and implementing token reinforcer programs. To that end, imagine an individual working under a token system in which eight tokens must be earned before they can be exchanged for a backup reinforcer (a fixed-ratio [FR] 8 exchange-production schedule, to use the terminology of Hackenberg, 2009, 2018). In this arrangement, C corresponds to the average time between backup reinforcers and T is the time separating the delivery of any given token and the acquisition of the backup reinforcer. Let’s assume that the inter-backup-reinforcer interval is C = 160 s. If a token is earned once every 20 s, then after the first token is obtained, there are seven tokens × 20 s = 140 s left until the backup reinforcer is available. Assuming a 5-s interval for the eventual token exchange (after all eight tokens are acquired), the C/T value for this first token is 160/(140+5) = 1.1, which should have minimal conditioned-reinforcing efficacy. Consistent with this prediction, long pauses prior to working for the first token are commonly reported in the token reinforcement literature (e.g., Bullock & Hackenberg, 2006; Kelleher, 1958). By contrast, the C/T value for the last token, assuming the same 5-s interval from receipt to exchange, is 160/5 = 32; this higher C/T value accords with the characteristically brief latencies to begin working for the final token (Foster et al., 2001).
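
The per-token arithmetic generalizes to every token in the sequence; the sketch below reproduces the worked example above for all eight tokens. The function name and the fixed 20-s inter-token interval are our assumptions, taken from that example.

```python
def token_ct(n_tokens=8, inter_token_s=20.0, exchange_s=5.0):
    """Per-token C/T values for the worked FR 8 exchange-production example:
    C is the time between backup reinforcers (160 s here); T for the k-th
    token is the time from its delivery until the exchange."""
    c = n_tokens * inter_token_s
    for k in range(1, n_tokens + 1):
        t = (n_tokens - k) * inter_token_s + exchange_s
        print(f"token {k}: T = {t:5.1f} s, C/T = {c / t:4.1f}")

token_ct()  # token 1 -> C/T = 1.1; token 8 -> C/T = 32.0
```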

The traditional behavior-analytic theory behind long pauses under FR exchange-production schedules is to conceptualize each response run, ending with a token reinforcer, as a single response-unit. Such response units should, the theory goes, behave like single responses under a standard schedule of reinforcement, and considerable evidence supports this account (see Hackenberg, 2018). If simple FR schedules produce long post-reinforcer pauses, then the same should be true of FR exchange-production schedules. If simple variable-ratio (VR) schedules have shorter pauses and maintain more behavior than FR schedules at large ratio values (Ferster & Skinner, 1957; Madden et al., 2005; Zeiler, 1979), then the same should be true of VR and FR exchange-production schedules. Very few empirical tests of the predicted differences in VR vs. FR exchange-production schedules have been conducted, and those that have been conducted have produced equivocal outcomes (Hackenberg, 2018).

Should readers wish to further test this hypothesis in applied research, we encourage them to also test a unique prediction of Principle 3: under a VR exchange-production schedule, a larger net C/T ratio may be achieved by using discriminably different “win” and “no win” tokens. For example, when the client completes a task and earns a token, it is randomly drawn from a bag containing one blue token (the CS/conditioned-reinforcer) and seven brown ones (a VR 8 exchange-production schedule). If the blue token uniquely signals that the backup reinforcer is 5-s away, then C/T = 160/5 = 32. The brown tokens, when encountered, do not signal when the backup reinforcer will be delivered. Their effect appears limited to increasing the duration of C (Cunningham & Shahan, 2018; Shahan & Cunningham, 2015).7
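
To see how the “win”/“no win” distinction raises the net C/T ratio, the sketch below simulates this hypothetical bag-draw arrangement. Drawing with replacement (a geometric run length) is our assumption; note that the simulated value lands near the worked figure of 32 above, the small difference depending on whether the 5-s exchange interval is counted in C.

```python
import random

def signaled_vr_ct(n_draws=100_000, p_blue=1 / 8, inter_token_s=20.0,
                   exchange_s=5.0, seed=1):
    """Each earned token is a random draw (with replacement, our assumption);
    only the blue token signals that the exchange follows exchange_s later."""
    random.seed(seed)
    cycles, run = [], 0
    for _ in range(n_draws):
        run += 1
        if random.random() < p_blue:  # blue token: backup reinforcer in 5 s
            cycles.append(run * inter_token_s + exchange_s)
            run = 0
    c = sum(cycles) / len(cycles)  # mean time between backup reinforcers
    print(f"mean C = {c:.0f} s; blue-token C/T = {c / exchange_s:.0f}")

signaled_vr_ct()  # roughly mean C = 165 s and C/T = 33 under these assumptions
```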

The efficacy of this differentially signaled VR exchange-production schedule has been demonstrated with nonhumans in single-operant experiments (e.g., Notterman, 1951) and in the suboptimal choice literature (Zentall, 2016). In the latter studies, pigeons and rats prefer a differentially signaled VR exchange-production schedule over a variety of alternatives that produce more food but with consequent stimuli that do not differentially signal upcoming “win” and “no win” events (Cunningham & Shahan, 2019; Molet et al., 2012; Stagner & Zentall, 2010; Zentall, 2016). In a comparable applied experiment, Lalli et al. (2000, Experiment 1) reported that two preschoolers with mild developmental delays similarly preferred a suboptimal alternative (differentially signaled VR-2 exchange-production schedule) over an optimal (FR 1) alternative.

To our knowledge, this differentially signaled VR exchange-production schedule has been evaluated only twice in applied research using response rate (rather than choice) as the dependent measure. In the first study, Van Houten and Nau (1980) arranged a VR-8 exchange-production schedule in an academic setting with hearing-disabled children. That is, if a child had been attending, on-task, and nondisruptive, they were allowed to randomly draw a token from a bag containing one blue token and seven brown ones. When a blue token was drawn, it was exchanged for a trinket toy just a few seconds later. This differentially signaled VR-8 exchange-production schedule maintained higher rates of academic behavior than an otherwise equivalent FR-8 schedule in which eight tokens had to be earned every time before the backup reinforcer could be obtained (see Bullock & Hackenberg, 2006; Foster et al., 2001; Notterman, 1951; Saltzman, 1949 for nonhuman systematic replications). Van Houten and Nau replicated the effect in the same project with VR-12 and FR-12 schedules, and teachers preferred the VR over the FR exchange-production schedule.

In the other applied study, Lalli et al. (2000; Experiment 3) reported that a 7-year-old with pervasive developmental delays made more functional communication responses than aggressive responses when the former were maintained by a differentially signaled VR-2 exchange-production schedule, and the latter were maintained on an FR-1 schedule for the same backup reinforcer. Although these findings are encouraging, more research is needed. Does the differentially signaled VR exchange-production schedule maintain more adaptive behavior than a simple VR exchange-production schedule? Can the Lalli et al. finding with one client be replicated with other clients of different ages, with different challenging behaviors and intellectual capacities, and when discrete-trial procedures are employed?

Those questions remaining, it appears that Principle 3 offers additional principled reasons to agree with Hackenberg (2018) that exchange-production schedules are “powerful variables” with “wide generality.” As we will see later in this paper (Principle 6), exchange-production schedules may yield operant behavior that is more resistant to Pavlovian extinction (i.e., token provided but never exchanged for a backup reinforcer) than behavior produced by manipulating token-production schedules. Where textbooks sometimes encourage applied behavior analysts to manipulate the response requirement for earning a token (e.g., Martin & Pear, 2019), we encourage readers to explore the largely untapped potential of differentially signaled VR exchange-production schedules.

A second application of Principle 3: Language acquisition

Because Principle 3 may be new to some readers, we provide a second area of application for your consideration, language acquisition among children with delayed speech (e.g., children diagnosed with ASD). One strategy for encouraging vocalizations is to alter the function of speech sounds, so they function as conditioned reinforcers. If this can be accomplished, then the child may increase their verbalizations to produce this auditory conditioned reinforcer (for review, see Petursdottir & Lepper, 2015; Shillingsburg et al., 2015). In the existing literature this is accomplished using a stimulus-stimulus pairing procedure in which the therapist makes an auditory vocalization (e.g., “bah”) and then immediately gives the client a highly preferred item (i.e., a NS→US sequence). When this sequence is repeated many times, from a textbook account of conditioned reinforcement, all the boxes have been checked—repeated pairing of a salient NS (“bah”) with an effective US, with close temporal contiguity between the two. Therefore, the sound of “bah” should (a) acquire CS properties (e.g., a positive emotional response should be evoked by CS onset) and (b) should acquire conditioned reinforcing properties (i.e., the client should say “bah” more often than before).

Looking at these procedures from a Principle-3 perspective, two intervals are of interest, C and T. From the reviews of this literature provided by Petursdottir and Lepper (2015) and Shillingsburg et al. (2015), it is clear that researchers have been conscientious about ensuring that “bah” is presented in close temporal proximity to the US, but the duration of T has received less systematic attention. Recalling that T is the interval from NS onset to US delivery, the duration of the NS can be an important contributor to T. The longer T is, the smaller the C/T ratio, all else being equal. Because smaller C/T ratios are less effective, Principle 3 provides a principled reason to advise against presenting the NS repeatedly before the US (e.g., “bah,” “bah,” “bah,” US; “bah,” “bah,” “bah”…), which is a common practice in this literature (e.g., Normand & Knoll, 2006). Although Miliotis et al. (2012) reported better outcomes with one NS presentation rather than three, insufficient research has systematically evaluated the effects of manipulating T in applied settings.

In this language-acquisition literature, even less attention has been paid to C (i.e., the inter-US interval). For example, it is common to report no procedural details from which C could be deduced (e.g., Moher et al., 2008; Smith et al., 1996) or to present “bah” immediately after the US is consumed (e.g., Carroll & Klatt, 2008). When the NS is presented right after the back-up reinforcer is consumed, the inter-US interval is nearly identical to the NS→US interval (i.e., C ≈ T), a condition known to produce little to no Pavlovian learning (Gibbon et al., 1975; Gibbon & Balsam, 1981). If the child says “bah” more often after these C ≈ T interventions (and they do for about 25% of the targeted vocalizations; Carroll & Klatt, 2008; Esch et al., 2005; Miguel et al., 2002), then the improvement cannot be attributed to Pavlovian learning. Without a theoretical account of these limited successes, it may prove difficult to replicate them (they may be due to an unidentified procedural confound) or to integrate them with behavior-analytic principles (Baer et al., 1968).
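
A quick calculation shows why the C ≈ T arrangement is predicted to fail. The durations below are our illustrative assumptions, not values reported in the reviewed studies.

```python
def ct_for_vocal_pairing(gap_after_us_s, ns_s=1.0):
    """C/T when a 1-s 'bah' (the NS) is presented gap_after_us_s seconds
    after the previous edible (US) is consumed."""
    t = ns_s                # NS onset to US delivery
    c = gap_after_us_s + t  # inter-US interval
    return c / t

print(ct_for_vocal_pairing(0.5))   # 1.5: C barely exceeds T; little learning
print(ct_for_vocal_pairing(60.0))  # 61.0: a long wait between trials helps
```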

Although the durations of C and T have been manipulated between subjects or studies in this language-acquisition literature, no study has been designed to systematically evaluate the efficacy of interventions arranging different values of C, T, and C/T ratios (while holding all other procedural variables constant). This is a significant research gap that should be filled. Principle 3 makes a clear prediction: when the principles of Pavlovian conditioning are adhered to, large C/T values will enhance (relative to small values) the CS and conditioned-reinforcer functions of “bah.” If research is conducted to evaluate this hypothesis, we recommend assessing the CS and conditioned-reinforcer functions independent of tests of increased participant vocalizations (e.g., Petursdottir et al., 2011); doing so provides an independent test of the efficacy of the Pavlovian procedures in establishing the core CS/conditioned-reinforcement functions.

Principle 4: The CS/conditioned-reinforcer should uniquely signal the sooner than normal arrival of the US/backup-reinforcer

With three principles under our belts, we have chosen our NS and our US, and we have ensured that the duration of C is much longer than the duration of T. Next, we must ensure that no other (non-target) stimulus signals that the US will arrive sooner (after T) than normal (again, “normal” is given by the inter-US interval, C), lest our target stimulus fail to acquire CS/conditioned-reinforcer properties. We will discuss two examples of this in the Pavlovian literature—overshadowing and blocking—both of which have important implications for the use of conditioned reinforcement in applied research.

Overshadowing

Figure 3 illustrates a Pavlovian overshadowing procedure. For individuals assigned to the Pavlovian-training group (top panel), a neutral stimulus (NS1) is presented alone 10 s prior to the US; it uniquely signals that the appetitive US will arrive 13 times sooner than normal. For the overshadowing group (middle panel), training is different. Two neutral stimuli, NS1 and NS2, are presented together 10 s before the US. Neither of these neutral stimuli has a behavioral function (they are both neutral), but NS2 is more salient than NS1. Because of this difference in salience, NS2 is more likely to acquire a CS function than NS1. This outcome is revealed in the test phase (bottom panel), when NS1 and NS2 are periodically presented alone, without the US. For Pavlovian-trained subjects, NS1 evokes more conditioned responding than NS2, and the opposite is true among those in the overshadowing group (e.g., Jennings et al., 2007; Mackintosh, 1971; Pavlov, 1927). This is a replicable finding (see Gottlieb & Begej, 2014, for review), and overshadowing effects have been documented in human laboratory experiments (e.g., Chamizo et al., 2003; Prados, 2011). To our knowledge, no published studies have systematically explored overshadowing effects in a test of conditioned reinforcement, not in the lab, not with humans, nor in applied settings; these are significant gaps that should be filled.

FIGURE 3.

Pavlovian overshadowing procedures. The top panel shows the training phase for the Pavlovian group—a neutral stimulus (NS1) is presented 10 s before the unconditioned stimulus (US), and US events happen, on average, every 130 s. For the overshadowing group, everything is the same except a second, highly salient neutral stimulus (NS2) is presented simultaneously with NS1. In the test phase, NS1 is presented alone to evaluate if it will evoke a conditioned response.

Blocking

The other example of Principle 4 in the Pavlovian literature, blocking, is illustrated in Figure 4; parameters are from the famous Kamin (1968) experiment. In the first phase, a 180-s stimulus acquired CS properties because it signaled that the US would arrive 10 times faster than normal. In Phase 2, the now-functional CS was presented simultaneously with an NS, both of which were followed by the US. In Kamin’s test phase, the NS from Phase 2 failed to acquire CS properties, whereas for a control group that did not complete Phase 1, the NS acquired a CS function. Although replications conducted in other labs, and with other CS and US stimuli, show that blocking is rarely complete, it is nonetheless clear that presenting an NS with a previously established CS hinders Pavlovian learning (Rescorla, 1999a; Soto, 2018), including with humans (e.g., Arcediano et al., 1997).

FIGURE 4.

Pavlovian blocking procedures involving a neutral stimulus (NS), conditioned stimulus (CS) and unconditioned stimulus (US). In Phase 1, the NS→US sequence at C/T = 10 establishes the CS function (C refers to the inter-US interval and T to the CS→US interval). In Phase 2, procedures are unchanged with the exception that an NS is presented simultaneously with the functional CS. In the test phase, the NS is presented alone to evaluate if it will evoke a conditioned response.

When it comes to the relevance of Principle 4 to conditioned reinforcement, several experiments have shown that presenting the would-be conditioned reinforcer with a previously established CS hinders acquisition of the conditioned-reinforcer function (Burke et al., 2007; Palmer, 1988; Panlilio et al., 2007; Vandbakk et al., 2020). For example, in the Burke et al. (2007) experiment, a CS function was established with rats using food as the US (C/T = 150/30 = 5). In the next phase, the established CS was presented simultaneously with an NS, both prior to US delivery (as in Phase 2 of Figure 4). In test sessions, rats made approximately 7.5 times more operant responses to produce the CS established in Phase 1, relative to the NS introduced in Phase 2, thereby suggesting at least partial blocking of the acquisition of the conditioned-reinforcer function.

Applying Principle 4

Applying Principle 4 means the applied behavior analyst should be on the lookout for other, non-target stimuli that are inadvertently correlated with the would-be conditioned reinforcer; the latter should be the only stimulus signaling the more imminent than normal arrival of the US. That non-target stimulus might be a salient NS (overshadowing) or a previously established CS (blocking). For example, presenting an NS with a highly salient praise event could produce an overshadowing effect; if praise already functions as a CS/conditioned-reinforcer, it may block the NS from acquiring those functions. Likewise, if we are trying to establish the conditioned-reinforcer function of an auditory stimulus like “bah,” the vocal stimulus should not be presented at the same time that the therapist reaches for the backup reinforcer. The latter is a visual non-target stimulus that presumably already has a CS function—it signals that a normally rare backup reinforcer is now imminent. If “bah” is blocked in this way, then it is less likely to acquire conditioned reinforcer properties (see Petursdottir et al., 2011, for discussion).

In reviewing the applied literature, we encountered several apparent violations of Principle 4, though most of these involved presenting a salient stimulus before the NS→US trial began. For example, when Smith et al. (1996) were trying to establish the conditioned reinforcer function of an auditory stimulus like “bah,” they always cleared away the toys before saying “bah.” If a non-target, toy-clearing event signals the shift from a long inter-US interval (C) to its relatively imminent arrival (T), then the highly salient toy-clearing stimulus could overshadow “bah” and prevent it from acquiring a conditioned-reinforcer function (see Egger & Smith, 1962, for a laboratory demonstration of such overshadowing of conditioned reinforcement). Similarly, it is a common practice in this same applied literature to call the child’s name just before presenting the would-be conditioned reinforcer (e.g., Esch et al., 2009; Petursdottir et al., 2011). Again, this seemingly innocuous but highly salient non-target stimulus could overshadow the acquisition of the CS/conditioned-reinforcer function if that stimulus has already signaled the more imminent than normal arrival of the US. If Paul Revere’s brother, Tom Revere, warned the townspeople of the imminent British invasion just before Paul rode into town, Paul’s warning would largely be ignored. Principle 4 reminds us that the CS/conditioned-reinforcer must uniquely signal that the US/backup reinforcer is coming.

This common practice of attracting participant attention before presenting the NS raises the concern that the participant is not otherwise attending to the presentation of the NS. Returning to Principle 1, it is important to choose a novel and salient NS. This may be challenging in some populations of participants, but it is a challenge that must be met if we are to optimize Pavlovian learning.

Principle 5: Pavlovian learning is facilitated by arranging fewer trials per session

When considering how many Pavlovian training trials to arrange per session, intuition tells us that arranging more trials—more learning opportunities—is a good strategy. Our textbooks imply the same when they indicate that establishing a new CS function requires repeated pairing of NS and US events (Chance, 1998; Cooper et al., 2020; Martin & Pear, 2019; Mayer et al., 2018; Miltenberger, 2016). However, there are two reasons that intuition may be a poor guide. The first reason will take us back to Principle 3 and the second will lead us to argue for a separate principle (Principle 5), which cannot be accounted for by Principle 3 alone. Either way you look at it, per-trial Pavlovian learning is facilitated by arranging fewer NS→US trials per session.

Let’s tackle the first reason first. If an applied behavior analyst has scheduled 10 minutes for a Pavlovian training session, then the number of trials per session will influence the C/T ratio, which will influence Pavlovian learning and performance outcomes (Principle 3). For example, if 40 trials are completed per session (e.g., Dozier et al., 2012), then the average US→US interval (C) is approximately 600/40 = 15 s. If the NS→US interval is always 5 s, then C/T = 15/5 = 3. Principle 3 tells us that arranging such a low C/T value is counter to the goal of Pavlovian training; many more training trials will be necessary for acquisition than would be required if just four trials were arranged in each 10-min session. If all else is held constant in these four-trial sessions, then the average inter-US interval is 600/4 = 150 s, and C/T = 150/5 = 30. When Gottlieb (2008) conducted experiments arranging these kinds of manipulations with rats and mice, he found that arranging 8-fold more trials per session (4 vs. 32 trials) did not improve learning outcomes; indeed, by some performance measures, superior behavioral outcomes were obtained with fewer trials per session (see Papini & Overmier, 1985, for similar outcomes with pigeons). Thus, when session duration is fixed and the applied behavior analyst’s decision is how many trials should be conducted per session, the basic nonhuman literature suggests that more than three or four trials is unnecessary and, perhaps, contraindicated.
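For readers who want to apply this arithmetic to their own session plans, the sketch below computes C/T from a planned session duration, trial count, and NS→US interval. It is a minimal Python illustration of the calculation worked above; the function name and signature are ours, not part of any published procedure.

```python
def ct_ratio(session_s: float, trials_per_session: int, t_s: float) -> float:
    """Return the C/T ratio for a fixed-duration Pavlovian training session.

    C is approximated as the mean inter-US interval (session duration
    divided by the number of US deliveries); T is the NS->US interval.
    """
    c = session_s / trials_per_session  # mean inter-US interval, in seconds
    return c / t_s

# The two cases worked in the text: a 10-min (600-s) session with T = 5 s
print(ct_ratio(600, 40, 5))  # 40 trials -> C = 15 s,  C/T = 3.0
print(ct_ratio(600, 4, 5))   # 4 trials  -> C = 150 s, C/T = 30.0
```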

Now that readers are gaining expertise with Principle 3 and have been reminded of its importance during Pavlovian training, we can ask how many trials should be programmed in a Pavlovian training session in which the experimenter is careful to hold C/T constant. Does the trials-per-session effect hold up when Principle 3 predicts no change in behavioral outcomes? If the effect still holds, then a new principle is needed.

Several laboratory experiments have examined the trials-per-session effect when C/T is held constant across groups of animals that complete different numbers of trials per session. The results of two of these experiments are shown in Figure 5. In the study conducted by Gallistel and Papachristos (2020; Figure 5A), mice were randomly assigned to groups that completed different numbers of trials per session (2.5 [on average], 10, or 40). As shown in Figure 5A, the new CS function was acquired in fewer trials when mice completed 2.5 trials per session; said another way, Pavlovian learning was facilitated (on a per-trial basis) by arranging fewer trials per session. Figure 5B shows the results of a similar experiment conducted with rats (C/T held constant across groups; Papini & Dudley, 1993). For rats assigned to the 1-trial per session group, the CS evoked conditioned responding after about 20 training trials (i.e., the fourth 5-trial block). By contrast, for rats in the 20-trials per session group, the NS failed to evoke conditioned responding after 40 training trials. Thus, as before, arranging fewer trials per session facilitated per-trial Pavlovian learning. Hence, Principle 5 holds that Pavlovian learning is facilitated on a per-trial basis by arranging fewer trials per session; this is true even when C/T is held constant.

FIGURE 5

Pavlovian acquisition data. Panel A: Number of Pavlovian training trials needed before mice met an acquisition criterion (i.e., the CS alone reliably evoked the conditioned response) plotted as a function of the number of NS→US trials completed per session (Gallistel & Papachristos, 2020). Panel B: Conditioned response rate in a rat autoshaping experiment (Papini & Dudley, 1993). In rats completing one trial per session, the NS evoked conditioned responding at above-baseline levels by the fourth trial block. For rats in the 20-trials-per-session group, the NS was not yet evoking conditioned responding after 40 training trials. Panel C: Sessions to acquisition across groups that acquired the CS function.

Before we get too excited about this, have a look at the data in Figure 5C. In that panel we have calculated the number of sessions to acquisition based on the trials-to-acquisition data in Figures 5A and 5B. Those data show that training goals are achieved more quickly (in fewer sessions) by arranging more trials per session. Importantly, this will be true only if the C/T ratio is held constant when increasing the number of trials per session, as in the Gallistel and Papachristos (2020) and Papini and Dudley (1993) experiments. Arranging more training trials in a fixed-duration session will decrease C, and hence the C/T ratio, which will slow acquisition (Principle 3). To the best of our knowledge, the trials-per-session effect has not been investigated either with humans or in applied settings.
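The trade-off depicted in Figures 5A and 5C can be expressed in a few lines. The trials-to-acquisition values below are hypothetical placeholders (the empirical values appear in Gallistel & Papachristos, 2020), chosen only to illustrate how fewer trials per session can mean fewer trials, yet more sessions, to acquisition.

```python
import math

# Hypothetical trials-to-acquisition values, chosen only to illustrate the
# Figure 5A/5C pattern; the empirical values are in Gallistel &
# Papachristos (2020).
trials_to_acquisition = {2.5: 40, 10: 120, 40: 280}  # trials/session -> trials needed

for per_session, trials_needed in trials_to_acquisition.items():
    sessions = math.ceil(trials_needed / per_session)
    print(f"{per_session:>4} trials/session: {trials_needed:>3} trials "
          f"= {sessions:>2} sessions to acquisition")
```

With these placeholder values, the 2.5-trial group needs the fewest trials (40) but the most sessions (16), whereas the 40-trial group needs the most trials (280) but the fewest sessions (7), reproducing the pattern across panels A and C.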

Shifting from Pavlovian to conditioned-reinforcement experiments, to our knowledge only one nonhuman laboratory study has explored this trials-per-session effect on conditioned-reinforcer outcomes (Bersh, 1951). In that study, different groups of rats completed 2, 4, 8, 16, or 20 NS→US trials in each of four training sessions. Speed of acquisition was not a dependent measure; instead, operant response rates were recorded in a post-training test in which rats could press a lever to gain brief access to the CS alone. The only significant effect was that rats receiving 20 trials per session pressed the lever more than rats trained with 4 trials per session. Although this would seem to counter Principle 5, none of the other between-group differences (e.g., rats completing 2 vs. 20 trials per session) were significant. Bersh (1951) concluded that the trials-per-session effect was unreliable in his experiment, perhaps because he did not control many of the variables that are routinely controlled in more contemporary research (e.g., session duration, C/T ratio). More research is needed on the isolated effects of trials per session in conditioned-reinforcement experiments.

Applying Principle 5

The empirical evidence informing Principle 5 suggests that Pavlovian learning is facilitated on a per-trial basis when fewer training trials are arranged per session. This effect is observed when C/T values are held constant but, as noted above, the effect can also occur if fewer trials are arranged in fixed-duration sessions; this increases C/T, which facilitates learning and behavioral outcomes. Below we discuss three implications of these observations for applied research.

First, it is worth noting that applied researchers often conduct several brief sessions in a row on a single day, unlike nonhuman studies in which just one session is conducted per day. Given that the findings supporting Principle 5 come from nonhuman studies, we should state the obvious: readers should not implement Principle 5 by simply redefining their sessions as composed of fewer trials (e.g., four NS→US trials instead of 40) and then conducting additional back-to-back sessions until the prior number of trials is reached (e.g., 10 four-trial sessions = 40 trials). This shift in the definition of a session introduces no functional changes that would be expected to facilitate Pavlovian learning on a per-trial basis.

Second, there is some complementarity between Principles 3 and 5. In an applied study in which a salient/novel 10-s NS is slated to acquire a CS/conditioned-reinforcer function, Principle 3 encourages us to program a long inter-US interval, perhaps C = 300 s (C/T = 300/10 = 30). In many applied research settings, scheduling 20 such trials per session will be impractical, as sessions would last one hour and 40 minutes (20 × 300 s = 6,000 s). Thankfully, Principle 5 tells us that arranging many trials per session is unnecessary; arranging fewer trials per session improves the per-trial efficiency of Pavlovian learning. Hence the complementarity of these principles.

The third implication of Principle 5 comes from the observation that, if the inter-US interval lasts an average of 300 s (5 min), there will be a good deal of downtime between Pavlovian training trials. For a rat in an operant chamber this is not a problem, but for a client who comes to the clinic for two hours a day, downtime is an inefficiency that runs counter to the goals of behavioral intervention. Perhaps Principle 5 can inspire a novel approach to conducting Pavlovian training in applied settings. That approach dispenses with the practice of holding Pavlovian training sessions that have a discriminable beginning and end. Instead, NS→US trials are conducted at randomly selected times during a clinic visit; that is, the trials are overlaid on top of normal clinic activities. To reduce disruptions to these activities, just a few trials would be scheduled per day, which has the added benefit of increasing the inter-US interval and thereby ensuring that C/T is large. For example, if a client visits the clinic from 10:30 a.m. until 12:30 p.m., the normally scheduled behavioral services would be provided throughout the visit; no dedicated time would be set aside for a Pavlovian training session. Instead, at three random, pre-selected times (e.g., 10:47 a.m., 11:29 a.m., and 12:21 p.m. on Monday, with different randomly selected times on Tuesday, and so forth), the salient/novel NS would be conspicuously presented for 10 s and the US provided thereafter. If the NS uniquely signals that the backup reinforcer will arrive more than 100 times faster than normal while at the clinic (e.g., on Monday C/T = an average of 1,460 s / 10 s = 146), then acquisition of the CS/conditioned-reinforcer function should proceed as in Figure 5B. As this Pavlovian training continues across trials and days, noncontingent presentations of the NS alone that reliably evoke positive emotional responses (a conditioned response) reveal that the NS has acquired a CS function and may subsequently function as a conditioned reinforcer.
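A scheduling sketch may make the incidental-training idea concrete. The code below picks a few random, well-separated trial times within a clinic visit; the minimum-gap rule and all names are our illustrative assumptions, not part of a validated protocol.

```python
import random

def incidental_trial_times(visit_start_min, visit_end_min,
                           n_trials=3, min_gap_min=20):
    """Pick n_trials random NS->US trial times (minutes from midnight)
    within a clinic visit, resampling until trials are well separated so
    that no other temporal regularity signals the US."""
    while True:
        times = sorted(random.uniform(visit_start_min, visit_end_min)
                       for _ in range(n_trials))
        if all(b - a >= min_gap_min for a, b in zip(times, times[1:])):
            return times

# A 10:30 a.m. to 12:30 p.m. visit (630 to 750 minutes from midnight)
for t in incidental_trial_times(630, 750):
    print(f"{int(t // 60)}:{int(t % 60):02d}")  # e.g., 10:47, 11:29, 12:21
```

Drawing a fresh schedule each day, as in this sketch, keeps US arrival unpredictable in the absence of the NS (Principle 4) while holding the number of daily trials low (Principle 5).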

This “incidental” Pavlovian training approach is seemingly a good fit with those clinical settings in which staff and session-time resources are limited. However, as previously discussed in the context of Principle 1, some clinical populations may not shift their attention toward the NS when it is first presented, which will inhibit learning the temporal relation between that stimulus and the US that follows. Should this happen, Pavlovian training may need to occur in a less-distracting setting until the NS acquires a CS function. Training could then be continued in the settings in which the CS/conditioned-reinforcer will be used therapeutically.

Principle 6: A reliable NS→US contingency facilitates acquisition; an intermittent contingency increases resistance to Pavlovian extinction

Principle 6 has two components; the first is concerned with acquisition. That component holds that, all else being equal, acquisition of a Pavlovian CS function will occur more quickly if the NS is always followed by the US (Gibbon et al., 1980; Humphreys, 1939, 1940; Jenkins & Stanley, 1950; Pavlov, 1927). That is, acquisition will be slowed if the US follows the NS only some of the time (e.g., NS→ [p = 0.25 US]).8 If the NS is encountered many times but is entirely uncorrelated with US events, then it will not acquire a CS function (Gottlieb & Begej, 2014).

Considering the implications of the first component of Principle 6 for conditioned reinforcement, new conditioned-reinforcer functions should be acquired faster if a reliable NS→US contingency is used; that is, if the would-be conditioned reinforcer is always, rather than only sometimes, followed by the backup reinforcer. Although this outcome seems obvious, to our knowledge this prediction has never been empirically tested.

The second component of Principle 6 specifies that resistance to Pavlovian extinction is enhanced by sometimes presenting the CS without the US, a claim for which there is considerable empirical support (Chan & Harris, 2017; Fitzgerald, 1963; Gibbs et al., 1978; Harris et al., 2019; Haselgrove et al., 2004; Humphreys, 1939; Rescorla, 1999b). For example, in a series of rat experiments conducted by Chan and Harris (2019), the programmed probability of the US given the CS systematically predicted resistance to extinction: the lower the probability, the greater the resistance. That is, if during training the US followed one in three (randomly selected) CS events (i.e., CS → [p = .33 US]), then in the subsequent (post-training) extinction phase, three times as many CS-alone trials were required to extinguish conditioned responding, relative to rats with a training history of CS → [p = 1 US]. If one in five CS events was followed by the US, extinction took five times as many trials.
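The Chan and Harris (2019) pattern amounts to a simple inverse-proportionality rule, sketched below. The baseline of 30 extinction trials is a hypothetical placeholder; only the 1/p scaling reflects the reported result.

```python
def trials_to_extinction(trials_at_p1, p_us):
    """Extinction trials scale inversely with the training probability of
    the US given the CS (our reading of Chan & Harris, 2019)."""
    return trials_at_p1 / p_us

base = 30  # hypothetical trials to extinction after p = 1 training
for p in (1.0, 1 / 3, 0.2):
    print(f"p = {p:.2f}: ~{trials_to_extinction(base, p):.0f} extinction trials")
    # prints ~30, ~90, and ~150 trials, respectively
```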

When this second component of Principle 6 is applied to conditioned reinforcement, it suggests that arranging a NS → [p < 1 US] contingency during Pavlovian training should produce a conditioned reinforcer that is more resistant to Pavlovian extinction; that is, more likely to maintain operant behavior when the conditioned reinforcer is no longer exchanged for a backup reinforcer. Empirical support for this prediction is mixed. One of the better-controlled studies was conducted by Knott and Clayton (1966). Two experimental groups of rats completed four equal-duration Pavlovian training sessions, with 100 NS events in each. One group was trained with a NS→ [p = 1 US] contingency and the other group with a NS→ [p = 0.5 US] contingency; for both groups the appetitive US was a brief pulse of electrical brain stimulation. In the conditioned-reinforcement test that followed, rats could press a lever to gain 1-s access to the newly acquired CS, but not the US (Pavlovian extinction). Both groups responded at above control-group levels to produce the CS (a conditioned-reinforcement effect), but the intermittent-US group's responding proved more durable across Pavlovian extinction sessions. Although results consistent with this finding have been reported in other nonhuman labs (James, 1968; Klein, 1959) and with human children (Fort, 1961, 1965), these systematic replications exerted less control over confounding variables, so the supporting evidence is not entirely convincing. Adding further to that skepticism, other studies have failed to systematically replicate these outcomes (e.g., Fox & King, 1961; Jacobs, 1968). Thus, Principle 6 enjoys robust empirical support in the Pavlovian literature, but its application to conditioned reinforcement is less clear and requires more systematic research.

Applying Principle 6

In our review of the applied behavior-analytic literature on acquisition of new conditioned-reinforcer functions, the first component of Principle 6 was uniformly followed (e.g., Dozier et al., 2012; Moher et al., 2008; Petursdottir et al., 2011). That is, researchers were careful to use a NS→ [p = 1 US] contingency during Pavlovian training. As for the second component, we could find no applied studies that either trained with or moved to a NS→ [p < 1 US] contingency at any point during Pavlovian training. Therefore, to the best of our knowledge, no one has tested whether an intermittent Pavlovian contingency produces a conditioned reinforcer that is more robust to lapses in conditioned-reinforcement treatment fidelity (i.e., Pavlovian extinction: tokens received but not exchanged for backup reinforcers).

Instead, it is common to establish a new CS/conditioned-reinforcer function using a NS→ [p = 1 US] contingency, use the CS as a conditioned reinforcer that is always exchanged for the backup reinforcer, and then switch to a gradually more intermittent exchange-production contingency (e.g., Argueta et al., 2019; Leon et al., 2016). The latter contingencies were previously discussed in the context of Principle 3, and those points will not be repeated here. Instead, we emphasize that Principle 6 predicts that training with a NS→ [p < 1.0 US] Pavlovian contingency should produce a conditioned reinforcer that is more resistant to treatment lapses that approximate Pavlovian extinction. Given the translational utility such a finding would have, we encourage readers to fill this research gap.

Finally, it is worth noting that the second component of Principle 6 is specific to the continued efficacy of a CS/conditioned-reinforcer in the face of Pavlovian extinction; it does not suggest that a CS→ [p < 1.0 US] contingency will produce a more preferred conditioned reinforcer than a CS→ [p = 1 US] contingency. It will not: if given a choice between the two, the more consistent CS will be preferred (Anselme, 2021).

TRANSLATING THE PRINCIPLES TO PROMOTE SELF-CONTROL

In this final section, we attempt to illustrate how the six practical principles of Pavlovian conditioning might prove useful in addressing a problem of interest to basic, translational, and applied researchers: reducing impulsive choice. Promoting self-control choice and delay tolerance has been a behavioral target for decades (e.g., Mazur & Logue, 1978; Schweitzer & Sulzer-Azaroff, 1988). In these studies, self-control is defined as preference for a larger-later over a smaller-sooner reward, and impulsive choice is defined as the opposite. Self-control can also involve adhering to an initial self-control choice (e.g., enrolling in substance-use treatment) when defections to impulsivity are possible (e.g., daily relapse opportunities). Because impulsive choice is robustly correlated with substance use, gambling, and other health-impacting behaviors (for meta-analyses, see Amlung et al., 2016; MacKillop et al., 2011; Weinsztok et al., 2021), there is good reason to target improvements in self-control. All else being equal, a child who cannot make or sustain a self-control choice is more likely to engage in disruptive behavior when asked to wait, relative to a child who patiently waits to obtain what they want (Brown et al., 2021; Ghaemmaghami et al., 2016).

The Pavlovian strategy for promoting self-control is simple—focus on the stimulus presented during the delay to the larger-later reward. To unpack this a bit, when the participant in a laboratory test of impulsive choice chooses the larger-later reward (the self-control choice), a delay-bridging stimulus is presented immediately, and remains in place until the large reward is obtained. The function of that delay-bridging stimulus may be a partial determinant of impulsive or self-control choice—it is, after all, a salient immediate consequence of making a self-control choice.

Peck et al. (2020) evaluated whether the function of the delay-bridging stimulus is aversive. In an impulsive-choice test phase, stable choices between smaller-sooner and larger-later rewards were assessed. When the latter reward was selected, a cue light bridged the delay to food. In the next phase, no food rewards were arranged; instead, the delay-bridging cue light was illuminated periodically during the session and rats could press an escape lever to turn it off. The prevalence of impulsive choice in the first phase was positively correlated with rates of escape-lever responding in the second phase. Said another way, rats that made many impulsive choices found the delay-bridging stimulus aversive (they turned that light off); "self-controlled" rats did not. If the immediate consequence of a self-control choice is the presentation of an aversive stimulus, then impulsive choice may be partially maintained by avoidance of that aversive stimulus.

A Pavlovian approach to self-control training would alter the function of the delay-bridging stimulus before the test of impulsive choice. Rather than allowing that stimulus to acquire aversive properties, it would first be established as a CS/conditioned reinforcer. In the subsequent test of impulsive choice, Pavlovian-trained subjects may make more self-control choices than control-group rats because doing so produces an immediate conditioned reinforcer (the delay-bridging stimulus). When Principles 1–6 are followed in the Pavlovian-training phase, the new CS/conditioned-reinforcer function can be established with rats in about 75 trials (Robinson & Flagel, 2009). By comparison, other learning-based interventions that reduce impulsive choice in rodents typically require thousands of training trials (see Rung & Madden, 2018; Smith et al., 2019, for reviews).

Delay-bridging stimuli, such as a clock-timer, have been used in translational and applied experiments. Although Vessells et al. (2018) and Vollmer et al. (1999) reported that these stimuli facilitate self-control choice among children with developmental disabilities, Newquist et al. (2012) reported that bridging the delay with a timer did not reduce impulsive choice among typically developing preschoolers (see Cardinal et al., 2000, for comparable results with rats). A seemingly critical difference between these studies is that Vessells et al. and Vollmer et al. arranged systematic training over many sessions (delay-fading; Mazur & Logue, 1978; Schweitzer & Sulzer-Azaroff, 1988), which may have reduced the aversive function of the delay-bridging stimulus. Newquist et al. and Cardinal et al. (2000) provided no such training when they found that delay-bridging stimuli were not helpful. The state of the literature, then, is that delay-bridging stimuli can promote self-control choice, but seemingly only when the non-aversive function of those stimuli is established through systematic training.

Pavlovian procedures offer one approach to that systematic training, an approach that may produce outcomes quickly but has yet to be evaluated in translational or applied settings. If one were to undertake this research in, for example, a preschool setting, then Principle 1 suggests choosing a salient/novel NS. For example, the preschool teacher, while already talking to the class about an activity, might hold up a large black-and-yellow striped card, a highly salient event. Then, 20 s later, a highly appetitive US (Principle 2) would occur (e.g., a special activity or snack that the whole class can enjoy). In accord with Principle 3, the Pavlovian-training phase will arrange a large C/T ratio; more on that in a moment. Principle 4 warns us that the striped card, and no other stimulus, should signal the faster-than-normal arrival of the US. This means that in the absence of the CS, the preschoolers must be unable to predict when the US will occur. Thus, the US cannot be an extra-special snack provided at snack time, or an extra-fun activity that predictably happens at circle time; such regularities can overshadow or block the NS from acquiring a CS/conditioned-reinforcer function. Moving on to Principle 5, a handful of "incidental" Pavlovian training trials can be conducted per day, each embedded in the usual classroom activities and occurring at unexpected times. Fewer trials per day means a longer inter-US interval (C) which, if T is held constant at 20 s, will produce a very large C/T ratio. Finally, Principle 6 reminds us to present the US following every NS until it has acquired a CS function, after which one could, if there was a therapeutic reason for doing so, present the US less often.
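To see how large the resulting C/T ratio can be, consider a worked version of the classroom arithmetic. The 6-hour school day is our assumption for illustration; T = 20 s comes from the scenario above.

```python
# Worked example of the classroom C/T arithmetic. The 6-hour school day is
# an assumption for illustration; T = 20 s comes from the scenario above.
day_s = 6 * 3600        # 21,600 s of school-day context
n_trials = 3            # a handful of incidental trials per day
t_s = 20                # NS -> US interval, in seconds

c = day_s / n_trials    # mean inter-US interval = 7,200 s
print(c / t_s)          # C/T = 360.0, a very large ratio
```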

When this training is complete, the preschool teacher could use the striped card as a tool for promoting self-control choice. For example, because subjects are often attracted to an effective CS, the card could be placed near the self-control alternative, which may increase selection of that alternative. Alternatively (or in addition), once a child makes a self-control choice (e.g., agreeing to let another child play with a toy first), the teacher could give the striped card to the child as they wait, which may reduce the probability of disruptive behavior during the delay to their turn. The child holding the card could also use it as a sanctioned way to remind the teacher that they are waiting patiently for the larger-later reward that was promised. The striped card could be used in these ways with (or by) any child in the classroom who has completed Pavlovian training along with the rest of the class. We hope readers will evaluate the efficacy of this potentially expedient method of teaching self-control and bring their own innovations to the fledgling ideas presented here.

CONCLUSIONS

This paper has sought to address the disconnect between findings in the basic Pavlovian and conditioned-reinforcement literatures and conditioned-reinforcement practices in translational and applied research (Fernandez, 2021). A key aim was to discuss the central role of Pavlovian learning in establishing a conditioned-reinforcer function, and we have attempted to convey the importance of Pavlovian principles with an extensive review of the relevant literature. Although many of these principles have been rigorously examined in basic laboratories, they have not been sufficiently evaluated with humans, and fewer studies still have explored their potential in translational or applied settings. As such, it will be important for researchers to conduct empirical evaluations of these principles with various clinical populations (e.g., children with ASD or intellectual disabilities, neurotypical individuals) in applied settings. We hope that readers will undertake the needed research to fill these knowledge gaps, and that the results will enhance the efficacy of one of the most frequently used technologies in behavior-analytic research and practice: conditioned reinforcement.

FUNDING INFORMATION

The research was supported by NIH grant DA052467-01, which was awarded to the first author.

Footnotes

CONFLICT OF INTEREST

The authors have no known conflict of interest to disclose.

1

Researchers in the field of Pavlovian learning are not fond of the term “respondent,” much like operant researchers bristle when their Pavlovian-trained colleagues refer to operant behavior as “instrumental behavior.” Out of an abundance of respect for our natural allies in the behavioral sciences, we use the term “Pavlovian” rather than “respondent.”

2

Domjan (2016) argues convincingly that the distinction between “elicit” and “emitted” is incompatible with a modern understanding of Pavlovian conditioning. He, like Michael (2004) before him, concludes that “evoke” equally applies to stimulus control by Pavlovian conditioned stimuli and operant discriminative stimuli. “Elicit” should be reserved for unconditioned stimulus control of unconditioned responses (reflexes).

3

The textbooks and chapters were selected because of their applied orientation, their frequent use in training applied behavior analysts, and because they were sufficiently comprehensive to include discussions of Pavlovian learning and conditioned reinforcement.

4

In the American war for independence, Paul Revere is remembered for alerting the American militia to the approaching British army prior to a key battle of the war. His warning call, yelled during his midnight ride toward Concord, MA, was "The Regulars are coming! The Regulars are coming!" The "Regulars" referred to the British army.

5

In Pavlovian studies like Bersh (1951), single-subject methodology is not easily employed. For example, once a CS function has been acquired at a particular C/T ratio, it cannot be extinguished and re-acquired under a different C/T due to the rapid reacquisition phenomenon (see Lattal, 2013).

6

Three of these four studies support Principle 3; Belke et al. (1989) is the exception, and those authors attributed their outlier finding to their use of non-consumable rewards (consumables were used in all of the other studies). Humans and pigeons are far less sensitive to delays to non-consumable rewards that are exchanged for something else at a later time (Hyten et al., 1994; Jackson & Hackenberg, 1996).

7

The C/T ratio predicts that identical outcomes would be obtained whether the "no-win" tokens are included in or omitted from the VR exchange-production schedule (i.e., only the "win" tokens are needed), a prediction that has yet to be put to an empirical test.

8

It is worth noting that intermittent Pavlovian conditioning involves occasionally presenting the NS without the US; it does not involve presenting the US without the NS. The latter substantially weakens Pavlovian conditioning (Gamzu & Williams, 1971; Rescorla, 1968), whereas the former does not.

REFERENCES

  1. Alessandri J, Molet M, & Fantino E (2010). Preference for a segmented schedule using a brief S+ stimulus correlated with a great delay reduction in humans. Behavioural Processes, 85(1), 72–76. 10.1016/J.BEPROC.2010.06.009 [DOI] [PubMed] [Google Scholar]
  2. Alessandri J, Stolarz-Fantino S, & Fantino E (2011). Psychological distance to reward: Effects of S+ duration and the delay reduction it signals. Learning and Motivation, 42(1), 26–32. 10.1016/J.LMOT.2010.06.001 [DOI] [Google Scholar]
  3. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). American Psychiatric Association. [Google Scholar]
  4. American Psychological Association Task Force on Promotion and Dissemination of Psychological Procedures. (1993). A report adopted by the Division 12 Board http://www.div12.org/sites/default/files/InitialReportOfTheChamblessTaskForce.pdf
  5. Amlung M, Petker T, Jackson J, Balodis I, & MacKillop J (2016). Steep discounting of delayed monetary and food rewards in obesity: A meta-analysis. Psychological Medicine, 46(11), 2423–2434. 10.1017/S0033291716000866 [DOI] [PubMed] [Google Scholar]
  6. Annau Z, & Kamin LJ (1961). The conditioned emotional response as a function of intensity of the US. Journal of Comparative and Physiological Psychology, 54(4), 428–432. 10.1037/H0042199 [DOI] [PubMed] [Google Scholar]
  7. Anselme P (2021). Effort-motivated behavior resolves paradoxes in appetitive conditioning. Behavioural Processes, 193, 104525. 10.1016/J.BEPROC.2021.104525 [DOI] [PubMed] [Google Scholar]
  8. Arcediano F, Matute H, & Miller RR (1997). Blocking of Pavlovian conditioning in humans. Learning and Motivation, 28(2), 188–199. 10.1006/LMOT.1996.0957 [DOI] [Google Scholar]
  9. Argueta T, Leon Y, & Brewer A (2019). Exchange schedules in token economies: A preliminary investigation of second-order schedule effects. Behavioral Interventions, 34(2), 280–292. 10.1002/BIN.1661 [DOI] [Google Scholar]
  10. Auge RJ (1974). Context, observing behavior, and conditioned reinforcement. Journal of the Experimental Analysis of Behavior, 22(3), 525–533. 10.1901/jeab.1974.22-525 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Baer DM, Wolf MM, & Risley TR (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1(1), 91–97. 10.1901/jaba.1968.1-91 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Balsam PD, Drew M, & Gallistel C (2010). Time and associative learning. Comparative Cognition & Behavior Reviews, 5, 1–22. 10.3819/ccbr.2010.50001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Balsam PD, & Gallistel CR (2009). Temporal maps and informativeness in associative learning. Trends in Neurosciences, 32(2), 73–78. 10.1016/J.TINS.2008.10.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Barry H (1959). Effects of strength of drive on learning and on extinction. Journal of Experimental Psychology, 55(5), 473. 10.1037/H0046904 [DOI] [PubMed] [Google Scholar]
  15. Belke TW, Pierce WD, & Powell RA (1989). Determinants of choice for pigeons and humans on concurrent-chains schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 52(2), 97–109. 10.1901/JEAB.1989.52-97 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Bell MC, & McDevitt MA (2014). Conditioned reinforcement. In McSweeney FK & Murphy ES (Eds.), The Wiley Blackwell Handbook of Operant and Classical Conditioning (pp. 221–248). John Wiley & Sons Ltd. [Google Scholar]
  17. Bermudez MA, & Schultz W (2010). Responses of amygdala neurons to positive reward-predicting stimuli depend on background reward (contingency) rather than stimulus-reward pairing (contiguity). Journal of Neurophysiology, 103(3), 1158–1170. 10.1152/jn.00933.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Bersh PJ (1951). The influence of two variables upon the establishment of a secondary reinforcer for operant responses. Journal of Experimental Psychology, 41(1), 62–73. 10.1037/h0059386 [DOI] [PubMed] [Google Scholar]
  19. Brown KR, Gaynor RN, Randall KR, & Zangrillo AN (2021). A comparative analysis of procedures to teach delay tolerance. Behavior Analysis: Research and Practice, 22(2), 195–211. 10.1037/BAR0000224 [DOI] [Google Scholar]
  20. Bucker B, & Theeuwes J (2017). Pavlovian reward learning underlies value driven attentional capture. Attention, Perception, and Psychophysics, 79(2), 415–428. 10.3758/s13414-016-1241-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Bucker B, & Theeuwes J (2018). Stimulus-driven and goal-driven effects on Pavlovian associative reward learning. Visual Cognition, 26(2), 131–148. 10.1080/13506285.2017.1399948 [DOI] [Google Scholar]
  22. Bullock CE, & Hackenberg TD (2006). Second-order schedules of token reinforcement with pigeons: Implications for unit price. Journal of the Experimental Analysis of Behavior, 85(1), 95–106. 10.1901/JEAB.2006.116-04 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Burke KA, Franz TM, Miller DN, & Schoenbaum G (2007). Conditioned reinforcement can be mediated by either outcome-specific or general affective representations. Frontiers in Integrative Neuroscience, 0(NOV), 2. 10.3389/NEURO.07.002.2007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Burns M, & Domjan M (2001). Topography of spatially directed conditioned responding: Effects of context and trial duration. Journal of Experimental Psychology: Animal Behavior Processes, 27(2), 269–278. 10.1037//0097-7403.27.3.269 [DOI] [PubMed] [Google Scholar]
  25. Burton CL, Noble K, & Fletcher PJ (2011). Enhanced incentive motivation for sucrose-paired cues in adolescent rats: Possible roles for dopamine and opioid systems. Neuropsychopharmacology, 36(8), 1631–1643. 10.1038/npp.2011.44 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Cardinal RN, Robbins TW, & Everitt BJ (2000). The effects of d-amphetamine, chlordiazepoxide, α-flupenthixol and behavioural manipulations on choice of signalled and unsignalled delayed reinforcement in rats. Psychopharmacology, 152(4), 362–375. 10.1007/S002130000536 [DOI] [PubMed] [Google Scholar]
  27. Carroll RA, & Klatt KP (2008). Using stimulus-stimulus pairing and direct reinforcement to teach vocal verbal behavior to young children with autism. The Analysis of Verbal Behavior, 24(1), 135–146. 10.1007/BF03393062 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Case DA, & Fantino E (1981). The delay-reduction hypothesis of conditioned reinforcement and punishment: Observing behavior. Journal of the Experimental Analysis of Behavior, 35(1), 93–108. 10.1901/JEAB.1981.35-93 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Catania AC (2013). Learning (5th ed.). Sloan Publishing. [Google Scholar]
  30. Chamizo VD, Aznar-Casanova JA, & Artigas AA (2003). Human overshadowing in a virtual pool: Simple guidance is a good competitor against locale learning. Learning and Motivation, 34(3), 262–281. 10.1016/S0023-9690(03)00020-1 [DOI] [Google Scholar]
  31. Chan CK, & Harris JA (2017). Extinction of Pavlovian conditioning: The influence of trial number and reinforcement history. Behavioural Processes, 141(Part 1), 19–25. 10.1016/J.BEPROC.2017.04.017 [DOI] [PubMed] [Google Scholar]
  32. Chan CK, & Harris JA (2019). The partial reinforcement extinction effect: The proportion of trials reinforced during conditioning predicts the number of trials to extinction. Journal of Experimental Psychology: Animal Learning and Cognition, 45(1), 43–58. 10.1037/XAN0000190 [DOI] [PubMed] [Google Scholar]
  33. Chance P (1998). First Course in Applied Behavior Analysis Waveland Press. [Google Scholar]
  34. Christensen DR, & Grace RC (2010). A decision model for steady-state choice in concurrent chains. Journal of the Experimental Analysis of Behavior, 94(2), 227–240. 10.1901/JEAB.2010.94-227 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Cló E, & Dounavi K (2020). A systematic review of behaviour analytic processes and procedures for conditioning reinforcers among individuals with autism, developmental or intellectual disability. European Journal of Behavior Analysis. 10.1080/15021149.2020.1847953 [DOI] [Google Scholar]
  36. Cooper JO, Heron TE, & Heward WL (2020). Applied Behavior Analysis (3rd ed.). Pearson Education. [Google Scholar]
  37. Cunningham PJ, & Shahan TA (2018). Suboptimal choice, reward-predictive signals, and temporal information. Journal of Experimental Psychology: Animal Learning and Cognition, 44(1), 1–22. 10.1037/xan0000160 [DOI] [PubMed] [Google Scholar]
  38. Cunningham PJ, & Shahan TA (2019). Rats engage in suboptimal choice when the delay to food is sufficiently long. Journal of Experimental Psychology: Animal Learning and Cognition, 45(3), 301–310. 10.1037/XAN0000211 [DOI] [PubMed] [Google Scholar]
  39. Davey GCL, Oakley D, & Cleland GG (1981). Autoshaping in the rat: Effects of omission on the form of the response. Journal of the Experimental Analysis of Behavior, 36(1), 75–91. 10.1901/JEAB.1981.36-75 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Domjan M (2016). Elicited versus emitted behavior: Time to abandon the distinction. Journal of the Experimental Analysis of Behavior, 105(2), 231–245. 10.1002/jeab.197 [DOI] [PubMed] [Google Scholar]
  41. Donahoe JW, & Vegas R (2021). Respondent (Pavlovian) conditioning. In Fisher WW, Piazza CC, & Roane HS (Eds.), Handbook of Applied Behavior Analysis (pp. 15–36). Guilford. [Google Scholar]
  42. Dozier CL, Iwata BA, Thomason-Sassi J, Worsdell AS, & Wilson DM (2012). A comparison of two pairing procedures to establish praise as a reinforcer. Journal of Applied Behavior Analysis, 45(4), 721–735. 10.1901/jaba.2012.45-721 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Egger MD, & Smith NE (1962). Secondary reinforcement in rats as a function of information value and reliability of the stimulus. Journal of Experimental Psychology, 64(2), 97–104. 10.1037/h0040364 [DOI] [PubMed] [Google Scholar]
  44. Esch BE, Carr JE, & Grow LL (2009). Evaluation of an enhanced stimulus-stimulus pairing procedure to increase early vocalizations of children with autism. Journal of Applied Behavior Analysis, 42(2), 225–241. 10.1901/jaba.2009.42-225 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Esch BE, Carr JE, & Michael J (2005). Evaluating stimulus-stimulus pairing and direct reinforcement in the establishment of an echoic repertoire of children diagnosed with autism. The Analysis of Verbal Behavior, 21(1), 43–58. 10.1007/BF03393009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Escobar M, Arcediano F, & Miller RR (2003). Latent inhibition in human adults without masking. Journal of Experimental Psychology: Learning Memory and Cognition, 29(5), 1028–1040. 10.1037/0278-7393.29.5.1028 [DOI] [PubMed] [Google Scholar]
  47. Fantino E (1977). Conditioned reinforcement: Choice and information. In Honig WK & Staddon JER (Eds.), Handbook of operant behavior (pp. 313–339). Prentice-Hall. [Google Scholar]
  48. Fantino E, Preston RA, & Dunn R (1993). Delay reduction: Current status. Journal of the Experimental Analysis of Behavior, 60(1), 159–169. 10.1901/jeab.1993.60-159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Fernandez N (2021). Evaluating token arrangements commonly used in applied settings (Publication No. 28650446). [Doctoral Dissertation, University of Florida]. ProQuest Dissertations Publishing. [Google Scholar]
  50. Ferster CB, & Skinner BF (1957). Schedules of reinforcement Appleton-Century-Crofts. [Google Scholar]
  51. Fisher WW, Piazza CC, Bowman LG, Hagopian LP, Owens JC, & Slevin I (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25(2), 491–498. 10.1901/jaba.1992.25-491 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Fisher WW, Piazza CC, & Roane HS (2021). Handbook of Applied Behavior Analysis Guilford. [Google Scholar]
  53. Fitzgerald RD (1963). Effects of partial reinforcement with acid on the classically conditioned salivary response in dogs. Journal of Comparative and Physiological Psychology, 56(6), 1056–1060. 10.1037/H0048204 [DOI] [PubMed] [Google Scholar]
  54. Fitzwater ME, & Thrush RS (1956). Acquisition of a conditioned response as a function of forward temporal contiguity. Journal of Experimental Psychology, 51(1), 59–61. 10.1037/H0041439 [DOI] [PubMed] [Google Scholar]
  55. Fort JG (1961). Secondary reinforcement with preschool children. Child Development, 32, 755–764. 10.2307/1126562 [DOI] [PubMed] [Google Scholar]
  56. Fort JG (1965). Discrimination based on secondary reinforcement. Child Development, 36(2), 481–490. [PubMed] [Google Scholar]
  57. Foster TA, Hackenberg TD, & Vaidya M (2001). Second-order schedules of token reinforcement with pigeons: Effects of fixed- and variable-ratio exchange schedules. Journal of the Experimental Analysis of Behavior, 76(2), 159–178. 10.1901/JEAB.2001.76-159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Fox RE, & King RA (1961). The effects of reinforcement scheduling on the strength of a secondary reinforcer. Journal of Comparative and Physiological Psychology, 54(3), 266–269. [DOI] [PubMed] [Google Scholar]
  59. Gallistel CR, & Papachristos EB (2020). Number and time in acquisition, extinction and recovery. Journal of the Experimental Analysis of Behavior, 113(1), 15–36. 10.1002/jeab.571 [DOI] [PubMed] [Google Scholar]
  60. Gamzu E, & Williams DR (1971). Classical conditioning of a complex skeletal response. Science, 171(3974), 923–925. 10.1126/SCIENCE.171.3974.923 [DOI] [PubMed] [Google Scholar]
  61. Garofalo S, & di Pellegrino G (2015). Individual differences in the influence of task-irrelevant Pavlovian cues on human behavior. Frontiers in Behavioral Neuroscience, 9. 10.3389/fnbeh.2015.00163 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Ghaemmaghami M, Hanley GP, & Jessel J (2016). Contingencies promote delay tolerance. Journal of Applied Behavior Analysis, 49(3), 548–575. 10.1002/JABA.333 [DOI] [PubMed] [Google Scholar]
  63. Gibbon J, Baldock MD, Locurto C, Gold L, & Terrace HS (1977). Trial and intertrial durations in autoshaping. Journal of Experimental Psychology: Animal Behavior Processes, 3(3), 264–284. 10.1037/0097-7403.3.3.264 [DOI] [Google Scholar]
  64. Gibbon J, & Balsam PD (1981). Spreading associations in time. In Locurto CM, Terrace HS, & Gibbon J (Eds.), Autoshaping and conditioning theory (pp. 219–253). Academic Press. [Google Scholar]
  65. Gibbon J, Farrell L, Locurto CM, Duncan HJ, & Terrace HS (1980). Partial reinforcement in autoshaping with pigeons. Animal Learning & Behavior, 8(1), 45–59. 10.3758/BF03209729 [DOI] [Google Scholar]
  66. Gibbon J, Locurto C, & Terrace HS (1975). Signal-food contingency and signal frequency in a continuous trials auto-shaping paradigm. Animal Learning & Behavior, 3(4), 317–324. 10.3758/BF03213453 [DOI] [Google Scholar]
  67. Gibbs CM, Latham SB, & Gormezano I (1978). Classical conditioning of the rabbit nictitating membrane response: Effects of reinforcement schedule on response maintenance and resistance to extinction. Animal Learning & Behavior, 6(2), 209–215. 10.3758/BF03209603 [DOI] [PubMed] [Google Scholar]
  68. Gottlieb DA, & Begej EL (2014). Principles of Pavlovian Conditioning. In McSweeney FK & Murphy ES (Eds.), The Wiley Blackwell Handbook of Operant and Classical Conditioning (pp. 1–25). John Wiley & Sons, Ltd. 10.1002/9781118468135.CH1 [DOI] [Google Scholar]
  69. Grace RC (1994). A contextual model of concurrent-chains choice. Journal of the Experimental Analysis of Behavior, 61(1), 113–129. 10.1901/JEAB.1994.61-113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Hackenberg TD (2009). Token reinforcement: A review and analysis. Journal of the Experimental Analysis of Behavior, 91(2), 257–286. 10.1901/jeab.2009.91-257 [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Hackenberg TD (2018). Token reinforcement: Translational research and application. Journal of Applied Behavior Analysis, 51, 393–435. 10.1002/jaba.439 [DOI] [PubMed] [Google Scholar]
  72. Hall G, Mackintosh NJ, Goodall G, & Martello MD (1977). Loss of control by a less valid or by a less salient stimulus compounded with a better predictor of reinforcement. Learning and Motivation, 8(2), 145–158. 10.1016/0023-9690(77)90001-7 [DOI] [Google Scholar]
  73. Harris JA, Kwok DWS, & Gottlieb DA (2019). The partial reinforcement extinction effect depends on learning about nonreinforced trials rather than reinforcement rate. Journal of Experimental Psychology: Animal Learning and Cognition, 45(4), 485. 10.1037/XAN0000220 [DOI] [PubMed] [Google Scholar]
  74. Haselgrove M, Aydin A, & Pearce JM (2004). A partial reinforcement extinction effect despite equal rates of Reinforcement during Pavlovian conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 30(3), 240–250. 10.1037/0097-7403.30.3.240 [DOI] [PubMed] [Google Scholar]
  75. Hine JF, Ardoin SP, & Call NA (2018). Token economies: Using basic experimental research to guide practical applications. Journal of Contemporary Psychotherapy, 48(3), 145–154. 10.1007/S10879-017-9376-5 [DOI] [Google Scholar]
  76. Horstmann G (2015). The surprise–attention link: A review. Annals of the New York Academy of Sciences, 1339(1), 106–115. 10.1111/NYAS.12679 [DOI] [PubMed] [Google Scholar]
  77. Humphreys LG (1939). The effect of random alternation of reinforcement on the acquisition and extinction of conditioned eyelid reactions. Journal of Experimental Psychology, 25(2), 141–158. 10.1037/H0058138 [DOI] [Google Scholar]
  78. Humphreys LG (1940). Extinction of conditioned psychogalvanic responses following two conditions of reinforcement. Journal of Experimental Psychology, 27(1), 71–75. [Google Scholar]
  79. Hyde TS (1976). The effect of Pavlovian stimuli on the acquisition of a new response. Learning and Motivation, 7(2), 223–239. 10.1016/0023-9690(76)90030-8 [DOI] [Google Scholar]
  80. Hyten C, Madden GJ, & Field DP (1994). Exchange delays and impulsive choice in adult humans. Journal of the Experimental Analysis of Behavior, 62(2), 225–233. 10.1901/jeab.1994.62-225 [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Ivy JW, Meindl JN, Overley E, & Robson KM (2017). Token economy: A systematic review of procedural descriptions. Behavior Modification, 41(5), 708–737. 10.1177/0145445517699559 [DOI] [PubMed] [Google Scholar]
  82. Jackson K, & Hackenberg TD (1996). Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior, 66(1), 29–49. 10.1901/jeab.1996.66-29 [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Jacobs BL (1968). Predictability and number of pairings in the establishment of a secondary reinforcer. Psychonomic Science, 10(7), 237–238. 10.3758/BF03331498 [DOI] [Google Scholar]
  84. James JP (1968). Effectiveness of learned reinforcement as a function of primary reward percentage and number of primary pairings. Canadian Journal of Psychology, 22(6), 465–473. [Google Scholar]
  85. Jenkins WO (1950). A temporal gradient of derived reinforcement. The American Journal of Psychology, 63(2), 237–243. [PubMed] [Google Scholar]
  86. Jenkins WO, & Stanley JC (1950). Partial reinforcement: A review and critique. Psychological Bulletin, 47(3), 193–234. 10.1037/h0060772 [DOI] [PubMed] [Google Scholar]
  87. Jennings DJ, Bonardi C, & Kirkpatrick K (2007). Overshadowing and stimulus duration. Journal of Experimental Psychology: Animal Behavior Processes, 33(4), 464–475. 10.1037/0097-7403.33.4.464 [DOI] [PubMed] [Google Scholar]
  88. Kamin LJ (1968). “Attention-like” processes in classical conditioning. In Jones MR (Ed.), Miami symposium on the prediction of behavior: Aversive stimulation (pp. 9–31). University of Miami Press. [Google Scholar]
  89. Kaplan PS (1984). Importance of relative temporal parameters in trace autoshaping: From excitation to inhibition. Journal of Experimental Psychology: Animal Behavior Processes, 10(2), 113–126. 10.1037/0097-7403.10.2.113 [DOI] [Google Scholar]
  90. Kelleher RT (1958). Stimulus-producing responses in chimpanzees. Journal of the Experimental Analysis of Behavior, 1(1), 87–102. 10.1901/jeab.1958.1-87 [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Klein RM (1959). Intermittent primary reinforcement as a parameter of secondary reinforcement. Journal of Experimental Psychology, 58(6), 423–427. 10.1037/H0047692 [DOI] [PubMed] [Google Scholar]
  92. Knott PD, & Clayton KN (1966). Durable secondary reinforcement using brain stimulation as the primary reinforcer. Journal of Comparative and Physiological Psychology, 61(1), 151–153. 10.1037/H0022879 [DOI] [PubMed] [Google Scholar]
  93. Lalli JS, Mauro BC, & Mace FC (2000). Preference for unreliable reinforcement in children with mental retardation: The role of conditioned reinforcement. Journal of Applied Behavior Analysis, 33(4), 533–544. 10.1901/JABA.2000.33-533 [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Lattal KM (1999). Trial and intertrial durations in Pavlovian conditioning: Issues of learning and performance. Journal of Experimental Psychology: Animal Behavior Processes, 25(4), 433–450. 10.1037/0097-7403.25.4.433 [DOI] [PubMed] [Google Scholar]
  95. Lattal KM (2013). Pavlovian conditioning. In Madden GJ (Ed.), APA handbook of behavior analysis: Vol. 1 methods and principles (pp. 283–306). American Psychological Association. [Google Scholar]
  96. Lee B, Gentry RN, Bissonette GB, Herman RJ, Mallon JJ, Bryden DW, Calu DJ, Schoenbaum G, Coutureau E, Marchand AR, Khamassi M, & Roesch MR (2018). Manipulating the revision of reward value during the intertrial interval increases sign tracking and dopamine release. PLOS Biology, 16(9), e2004015. 10.1371/journal.pbio.2004015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Leon Y, Borrero JC, & DeLeon IG (2016). Parametric analysis of delayed primary and conditioned reinforcers. Journal of Applied Behavior Analysis, 49(3), 639–655. 10.1002/JABA.311 [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Lepper TL, & Petursdottir AI (2017). Effects of response-contingent stimulus pairing on vocalizations of nonverbal children with autism. Journal of Applied Behavior Analysis, 50(4), 756–774. 10.1002/JABA.415 [DOI] [PubMed] [Google Scholar]
  99. Lubow RE (1973). Latent inhibition. Psychological Bulletin, 79, 398–407. [DOI] [PubMed] [Google Scholar]
  100. Lucas GA, Deich JD, & Wasserman EA (1981). Trace autoshaping: Acquisition, maintenance, and path dependence at long trace intervals. Journal of the Experimental Analysis of Behavior, 36(1), 61–74. 10.1901/JEAB.1981.36-61 [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. MacKillop J, Amlung MT, Few LR, Ray LA, Sweet LH, & Munafò MR (2011). Delayed reward discounting and addictive behavior: A meta-analysis. Psychopharmacology, 216(3), 305–321. 10.1007/s00213-011-2229-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Mackintosh NJ (1971). An analysis of overshadowing and blocking. Quarterly Journal of Experimental Psychology, 23(1), 118–125. 10.1080/00335557143000121 [DOI] [Google Scholar]
  103. Mackintosh NJ (1974). The psychology of animal learning Academic Press. [Google Scholar]
  104. Madden GJ, Dake JM, Mauel EC, & Rowe RR (2005). Labor supply and consumption of food in a closed economy under a range of fixed- and random-ratio schedules: Tests of unit price. Journal of the Experimental Analysis of Behavior, 83(2), 99–118. 10.1901/jeab.2005.32-04 [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Madden GJ, Reed DD, & DiGennaro Reed FD (2021). An Introduction to Behavior Analysis John Wiley and Sons Inc. [Google Scholar]
  106. Madden GJ, Reed DD, & Lerman DC (2023). Behavioral economics for applied behavior analysts. In Craig AR, Roane HS, Saine V, & Ringdahl JE (Eds.), Behavior Analysis: Translational Perspectives and Clinical Practice. Guilford. [Google Scholar]
  107. Mahmoudi S, Peck S, Jones JI, & Madden GJ (2022). Effect of C/T ratios on sign-tracking and conditioned reinforcement efficacy. Poster presented at the Annual Convention of the Association for Behavior Analysis International. [Google Scholar]
  108. Marlin NA (1981). Contextual associations in trace conditioning. Animal Learning & Behavior, 9(4), 519–523. 10.3758/BF03209784 [DOI] [Google Scholar]
  109. Martin G, & Pear J (2019). Behavior modification: What it is and how to do it (11th ed.). Routledge. [Google Scholar]
  110. Marx MH, & Knarr FA (1963). Long-term development of reinforcing properties of a stimulus as a function of temporal relationship to food reinforcement. Journal of Comparative and Physiological Psychology, 56(3), 546–550. 10.1037/h0046942 [DOI] [PubMed] [Google Scholar]
  111. Mayer GR, Sulzer-Azaroff B, & Wallace M (2018). Behavior analysis for lasting change (4th ed.). Sloan. [Google Scholar]
  112. Mazur JE (2001). Hyperbolic value addition and general models of animal choice. Psychological Review, 108(1), 96–112. 10.1037/0033-295X.108.1.96 [DOI] [PubMed] [Google Scholar]
  113. Mazur JE, & Logue AW (1978). Choice in a “self-control” paradigm: Effects of a fading procedure. Journal of the Experimental Analysis of Behavior, 30(1), 11–17. 10.1901/jeab.1978.30-11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Michael J (2004). Concepts and principles of behavior analysis (Revised Ed). Society for the Advancement of Behavior Analysis [Google Scholar]
  115. Miguel CF, Carr JE, & Michael J (2002). The effects of a stimulus-stimulus pairing procedure on vocal behavior of children diagnosed with autism. The Analysis of Verbal Behavior, 18, 3–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Miliotis A, Sidener TM, Reeve KF, Carbone V, Sidener DW, Rader L, & Delmolino L (2012). An evaluation of the number of presentations of target sounds during stimulus-stimulus pairing. Journal of Applied Behavior Analysis, 45(4), 809–813. 10.1901/jaba.2012.45-809 [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Miltenberger RG (2016). Behavior modification: Principles and Procedures (6th ed.). Cengage. [Google Scholar]
  118. Moher CA, Gould DD, Hegg E, & Mahoney AM (2008). Non-generalized and generalized conditioned reinforcers: Establishment and validation. Behavioral Interventions, 23(1), 13–38. 10.1002/bin.253 [DOI] [Google Scholar]
  119. Molet M, Miller HC, Laude JR, Kirk C, Manning B, & Zentall TR (2012). Decision making by humans in a behavioral task: Do humans, like pigeons, show suboptimal choice? Learning and Behavior, 40(4), 439–447. 10.3758/s13420-012-0065-7 [DOI] [PubMed] [Google Scholar]
  120. Morris RW, & Bouton ME (2006). Effect of unconditioned stimulus magnitude on the emergence of conditioned responding. Journal of Experimental Psychology: Animal Behavior Processes, 32(4), 371–385. 10.1037/0097-7403.32.4.371 [DOI] [PubMed] [Google Scholar]
  121. Nelson JB, Navarro A, & Sanjuan M del C (2014). Presentation and validation of "The Learning Game," a tool to study associative learning in humans. Behavior Research Methods, 46(4), 1068–1078. 10.3758/S13428-014-0446-2 [DOI] [PubMed] [Google Scholar]
  122. Newquist MH, Dozier CL, & Neidert PL (2012). A comparison of the effects of brief rules, a timer, and preferred toys on self-control. Journal of Applied Behavior Analysis, 45(3), 497–509. 10.1901/jaba.2012.45-497 [DOI] [PMC free article] [PubMed] [Google Scholar]
  123. Normand MP, & Knoll ML (2006). The effects of a stimulus-stimulus pairing procedure on the unprompted vocalizations of a young child diagnosed with autism. The Analysis of Verbal Behavior, 22, 81–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Notterman JM (1951). A study of some relations among aperiodic reinforcement, discrimination training, and secondary reinforcement. Journal of Experimental Psychology, 41(3), 161–169. 10.1037/h0061644
  125. Pace GM, Ivancic MT, Edwards GL, Iwata BA, & Page TJ (1985). Assessment of stimulus preference and reinforcer value with profoundly retarded individuals. Journal of Applied Behavior Analysis, 18(3), 249–255. 10.1901/jaba.1985.18-249
  126. Palmer DC (1988). The blocking of conditioned reinforcement (Doctoral dissertation, University of Massachusetts Amherst). https://scholarworks.umass.edu/dissertations_1/3743
  127. Panlilio LV, Thorndike EB, & Schindler CW (2007). Blocking of conditioning to a cocaine-paired stimulus: Testing the hypothesis that cocaine perpetually produces a signal of larger-than-expected reward. Pharmacology Biochemistry and Behavior, 86(4), 774–777. 10.1016/j.pbb.2007.03.005
  128. Papini MR, & Dudley RT (1993). Effects of the number of trials per session on autoshaping in rats. Learning and Motivation, 24(2), 175–193. 10.1006/lmot.1993.1011
  129. Papini MR, & Overmier JB (1985). Partial reinforcement and autoshaping of the pigeon’s key-peck behavior. Learning and Motivation, 16(1), 109–123. 10.1016/0023-9690(85)90007-4
  130. Pavlov IP (1927). Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex. Oxford University Press.
  131. Peck S, Rung JM, Hinnenkamp JE, & Madden GJ (2020). Reducing impulsive choice: VI. Delay-exposure training reduces aversion to delay-signaling stimuli. Psychology of Addictive Behaviors, 34(1), 147–155. 10.1037/adb0000495
  132. Perkins CC, Beavers WO, Hancock RA Jr., Hemmendinger PC, Hemmendinger D, & Ricci JA (1975). Some variables affecting rate of key pecking during response-independent procedures (autoshaping). Journal of the Experimental Analysis of Behavior, 24(1), 59. 10.1901/jeab.1975.24-59
  133. Petursdottir AI, Carp CL, Matthies DW, & Esch BE (2011). Analyzing stimulus-stimulus pairing effects on preferences for speech sounds. The Analysis of Verbal Behavior, 27(1), 45–60. 10.1007/bf03393091
  134. Petursdottir AI, & Lepper TL (2015). Inducing novel vocalizations by conditioning speech sounds as reinforcers. Behavior Analysis in Practice, 8(2), 223–232. 10.1007/s40617-015-0088-6
  135. Poling A (2010). Looking to the future: Will behavior analysis survive and prosper? Behavior Analyst, 33(1), 7–17. 10.1007/BF03392200
  136. Powell PS, Travers BG, Klinger LG, & Klinger MR (2016). Difficulties with multi-sensory fear conditioning in individuals with autism spectrum disorder. Research in Autism Spectrum Disorders, 25, 137–146. 10.1016/j.rasd.2016.02.008
  137. Prados J (2011). Blocking and overshadowing in human geometry learning. Journal of Experimental Psychology: Animal Behavior Processes. 10.1037/a0020715
  138. Preston RA, & Fantino E (1991). Conditioned reinforcement value and choice. Journal of the Experimental Analysis of Behavior, 55(2), 155–175. 10.1901/jeab.1991.55-155
  139. Rescorla RA (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66(1), 1–5. 10.1037/h0025984
  140. Rescorla RA (1969). Conditioned inhibition of fear resulting from negative CS-US contingencies. Journal of Comparative and Physiological Psychology, 67(4), 504–509. 10.1037/h0027313
  141. Rescorla RA (1980). Simultaneous and successive associations in sensory preconditioning. Journal of Experimental Psychology: Animal Behavior Processes, 6(3), 207–216. 10.1037/0097-7403.6.3.207
  142. Rescorla RA (1988). Pavlovian conditioning: It’s not what you think it is. American Psychologist, 43(3), 151–160. 10.1037/0003-066X.43.3.151
  143. Rescorla RA (1999a). Learning about qualitatively different outcomes during a blocking procedure. Animal Learning & Behavior, 27(2), 140–151. 10.3758/BF03199671
  144. Rescorla RA (1999b). Within-subject partial reinforcement extinction effect in autoshaping. Quarterly Journal of Experimental Psychology, 52(1b), 75–87. 10.1080/713932693
  145. Robinson TE, & Flagel SB (2009). Dissociating the predictive and incentive motivational properties of reward-related cues through the study of individual differences. Biological Psychiatry, 65(10), 869–873. 10.1016/j.biopsych.2008.09.006
  146. Roper KL, & Zentall TR (1999). Observing behavior in pigeons: The effect of reinforcement probability and response cost using a symmetrical choice procedure. Learning and Motivation, 30(3), 201–220. 10.1006/lmot.1999.1030
  147. Rung JM, & Madden GJ (2018). Experimental reductions of delay discounting and impulsive choice: A systematic review and meta-analysis. Journal of Experimental Psychology: General.
  148. Russell D, Ingvarsson ET, Haggar JL, & Jessel J (2018). Using progressive ratio schedules to evaluate tokens as generalized conditioned reinforcers. Journal of Applied Behavior Analysis, 51(1), 40–52. 10.1002/jaba.424
  149. Saltzman IJ (1949). Maze learning in the absence of primary reinforcement: A study of secondary reinforcement. Journal of Comparative and Physiological Psychology, 42(3), 161–173. 10.1037/h0059466
  150. Schoenfeld WN, Antonitis JJ, & Bersh PJ (1950). A preliminary study of training conditions necessary for secondary reinforcement. Journal of Experimental Psychology, 40(1), 40–45. 10.1037/h0062153
  151. Schweitzer JB, & Sulzer-Azaroff B (1988). Self-control: Teaching tolerance for delay in impulsive children. Journal of the Experimental Analysis of Behavior, 50(2), 173. 10.1901/jeab.1988.50-173
  152. Seidel RJ (1959). A review of sensory preconditioning. Psychological Bulletin, 56(1), 58–73. 10.1037/h0040776
  153. Shahan TA (2010). Conditioned reinforcement and response strength. Journal of the Experimental Analysis of Behavior, 93(2), 269–289. 10.1901/jeab.2010.93-269
  154. Shahan TA, & Cunningham P (2015). Conditioned reinforcement and information theory reconsidered. Journal of the Experimental Analysis of Behavior, 103(2), 405–418. 10.1002/jeab.142
  155. Shillingsburg MA, Hollander DL, Yosick RN, Bowen C, & Muskat LR (2015). Stimulus-stimulus pairing to increase vocalizations in children with language delays: A review. The Analysis of Verbal Behavior, 31(2), 215–235. 10.1007/s40616-015-0042-2
  156. Skinner BF (1938). The behavior of organisms: An experimental analysis. Appleton-Century.
  157. Smith R, Michael J, & Sundberg ML (1996). Automatic reinforcement and automatic punishment in infant vocal behavior. The Analysis of Verbal Behavior, 13, 39–48.
  158. Smith T, Panfil K, Bailey C, & Kirkpatrick K (2019). Cognitive and behavioral training interventions to promote self-control. Journal of Experimental Psychology: Animal Learning and Cognition, 45(3), 259–279. 10.1037/xan0000208
  159. Soto FA (2018). Contemporary associative learning theory predicts failures to obtain blocking: Comment on Maes et al. (2016). Journal of Experimental Psychology: General, 147(4), 597–602. 10.1037/xge0000341
  160. Sparber SB, Bollweg GL, & Messing RB (1991). Food deprivation enhances both autoshaping and autoshaping impairment by a latent inhibition procedure. Behavioural Processes, 23(1), 59–74. 10.1016/0376-6357(91)90106-A
  161. Stagner JP, & Zentall TR (2010). Suboptimal choice behavior by pigeons. Psychonomic Bulletin and Review, 17(3), 412–416. 10.3758/PBR.17.3.412
  162. Stein L, Sidman M, & Brady JV (1958). Some effects of two temporal variables on conditioned suppression. Journal of the Experimental Analysis of Behavior, 1(2), 153–162. 10.1901/jeab.1958.1-153
  163. Stockhorst U (1994). Effects of different accessibility of reinforcement schedules on choice in humans. Journal of the Experimental Analysis of Behavior, 62(2), 269–292. 10.1901/jeab.1994.62-269
  164. Stubbs DA, & Cohen SL (1972). Second-order schedules: Comparison of different procedures for scheduling paired and nonpaired brief stimuli. Journal of the Experimental Analysis of Behavior, 18(3), 403–413.
  165. Tabbara RI, Maddux JMN, Beharry PF, Iannuzzi J, & Chaudhri N (2016). Effects of sucrose concentration and water deprivation on Pavlovian conditioning and responding for conditioned reinforcement. Behavioral Neuroscience, 130(2), 231–242. 10.1037/bne0000138
  166. Terrace HS, Gibbon J, Farrell L, & Baldock MD (1975). Temporal factors influencing the acquisition and maintenance of an autoshaped keypeck. Animal Learning & Behavior, 3(1), 53–62. 10.3758/BF03209099
  167. Thomas BL, & Papini MR (2020). Shifts in intertrial interval duration in autoshaping with rats: Implications for path dependence. Learning and Motivation, 72, 101687. 10.1016/j.lmot.2020.101687
  168. Thompson RF (1972). Sensory preconditioning. In Thompson RF & Voss JS (Eds.), Topics in learning and performance. Academic Press.
  169. Thrailkill EA, & Shahan TA (2014). Temporal integration and instrumental conditioned reinforcement. Learning and Behavior, 42(3), 201–208. 10.3758/s13420-014-0138-x
  170. van Haaren F, van Hest A, & van de Poll NE (1987). Acquisition and reversal of a discriminated autoshaped response in male and female rats: Effects of long or short and fixed or variable intertrial interval durations. Learning and Motivation, 18(2), 220–233. 10.1016/0023-9690(87)90012-9
  171. Van Houten R, & Nau PA (1980). A comparison of the effects of fixed and variable ratio schedules of reinforcement on the behavior of deaf children. Journal of Applied Behavior Analysis, 13(1), 13–21. 10.1901/jaba.1980.13-13
  172. Vandbakk M, Olaff HS, & Holth P (2020). Blocking of stimulus control and conditioned reinforcement. The Psychological Record, 70(2), 279–292. 10.1007/s40732-020-00393-3
  173. Vessells J, Sy JR, Wilson A, & Green L (2018). Effects of delay fading and signals on self-control choices by children. Journal of Applied Behavior Analysis, 51(2), 374–381. 10.1002/jaba.454
  174. Vollmer TR, Borrero JC, Lalli JS, & Daniel D (1999). Evaluating self-control and impulsivity in children with severe behavior disorders. Journal of Applied Behavior Analysis, 32(4), 451–466. 10.1901/jaba.1999.32-451
  175. Vollmer TR, & Hackenberg TD (2001). Reinforcement contingencies and social reinforcement: Some reciprocal relations between basic and applied research. Journal of Applied Behavior Analysis, 34(2), 241–253. 10.1901/jaba.2001.34-241
  176. Ward RD, Gallistel CR, Jensen G, Richards VL, Fairhurst S, & Balsam PD (2012). Conditioned stimulus informativeness governs conditioned stimulus-unconditioned stimulus associability. Journal of Experimental Psychology: Animal Behavior Processes, 38(3), 217–232. 10.1037/a0027621
  177. Weinsztok S, Brassard S, Balodis I, Martin LE, & Amlung M (2021). Delay discounting in established and proposed behavioral addictions: A systematic review and meta-analysis. Frontiers in Behavioral Neuroscience, 15. 10.3389/fnbeh.2021.786358
  178. White CT, & Schlosberg H (1952). Degree of conditioning of the GSR as a function of the period of delay. Journal of Experimental Psychology, 43(5), 357–362. 10.1037/h0061520
  179. Williams BA (1994). Conditioned reinforcement: Experimental and theoretical issues. The Behavior Analyst, 17(2), 261–285. 10.1007/BF03392675
  180. Williams BA, & Fantino E (1978). Effects on choice of reinforcement delay and conditioned reinforcement. Journal of the Experimental Analysis of Behavior, 29(1), 77–86.
  181. Williams DR, & Williams H (1969). Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. Journal of the Experimental Analysis of Behavior, 12(4), 511–520. 10.1901/jeab.1969.12-511
  182. Wolfe JB (1936). Effectiveness of token rewards for chimpanzees. Comparative Psychology Monographs, 12, 1–72.
  183. Zeiler MD (1979). Output dynamics. In Zeiler MD & Harzem P (Eds.), Advances in analysis of behaviour: Reinforcement and the organization of behaviour (pp. 79–115). Wiley.
  184. Zentall TR (2016). Resolving the paradox of suboptimal choice. Journal of Experimental Psychology: Animal Learning and Cognition, 42(1), 1–14. 10.1037/xan0000085
