Abstract
Some researchers claim that variability is an operant dimension of behavior. The present paper reviews the concept of operant behavior and emphasizes that differentiation is the behavioral process that demonstrates an operant relation. Differentiation is conceived as change in the overlap between two probability distributions: the distribution of reinforcement probability as a function of some response property (S distribution) and the probability distribution of the response property itself (R distribution). This concept implies that the differentiation process can be measured only if the S distribution and the R distribution are both established on the same response property. To determine whether differentially reinforced behavioral variability fits the proposed concept of operant behavior, I examine the main procedures (lag n and threshold procedures) and the main dependent variable (U value) employed in studies of operant variability. Because lag n and threshold procedures establish their S distributions on properties distinct from U value, differentiation cannot be measured through the change in U value. I conclude that studies of operant variability have failed to provide a direct demonstration that variability is an operant dimension of behavior. Hence, studies in which measures of variability provide a basis for measuring differentiation can better support the claim that variability is an operant dimension of behavior.
Keywords: operant variability, lag n procedure, threshold procedure, U value
Neuringer (2002, 2003, 2004, 2009) claims that variability is an operant dimension of behavior. Like other dimensions of behavior (force, topography, duration, location), variability is, according to Neuringer, a dimension that may control consequences of behavior and may be controlled by them. Many studies have shown that some measures of behavioral variability are, in fact, precisely controlled by contingencies of reinforcement, and they have given rise to behavioral technologies that have been successfully applied in a variety of fields, such as autism, depression, skills training, problem solving, and creativity (Neuringer, 2002). The present article does not contest that, as a practical matter, variability in behavior can be modified by contingencies of reinforcement. However, despite the successful application of technologies derived from variability studies, an analytic view of operant variability might contribute to the enrichment of this area of study. Therefore, the present paper provides a conceptual analysis of Neuringer's claim in light of Catania's (1973, 1998) concept of operant behavior. To perform this analysis, the present work (a) reviews Catania's concept of operant behavior, emphasizing that differentiation is the behavioral process that demonstrates an operant relation; (b) examines the procedures and measures of variability often adopted in studies of operant variability; (c) shows that the most common measure of variability does not enable one to measure the differentiation process in these studies; and (d) claims that procedures in which measures of variability allow measurement of differentiation can offer better evidence of operant variability.
OPERANT BEHAVIOR
According to Skinner (1969, pp. 105–132) an operant relation is demonstrated when some responses produce environmental changes and these changes modify the rate at which the responses are emitted. Therefore, his definition of operant behavior involves a relation between some responses and their consequences (a contingency relation) and the effects of this contingency relation on the rate of responding. Consequences that increase the probability of responding are called reinforcers. Skinner pointed out that reinforcement is always contingent on some properties of responses. The delivery of a food pellet that reinforces lever presses is contingent on certain properties of the responses. To close the circuit, a lever press must have some minimal force, and must occur at some range of angles and location in space. Hence, reinforcement is contingent on responses that meet those criteria. “Thus a set of contingencies defines an operant,” according to Skinner (1969, p. 131). Each of these contingencies is established on particular properties of responses. Responses with the properties required by the contingencies, and whose frequency is affected by the reinforcement, constitute an operant response class.
Skinner (1969) also noted that “Contingencies cannot always be detected on a given occasion. Although a response is reinforced, we cannot be sure what property satisfied the contingencies and hence defines the operant” (p. 131). It should be noted, however, that the specific properties that satisfy contingencies and the properties that come to occur more often may not be the same ones. Reinforcement can affect the frequency of responses that do not produce it. This spread of the effects of reinforcement to responses that do not produce it is called induction (Catania, 1998). Catania (1973, 1998) proposed a concept of operant behavior that distinguishes these two response classes: the class of responses that produce some consequences (descriptive response class) and the class of responses whose frequency is affected by the contingency relation (functional response class). Catania (1973, 1998) claimed that the concept of the operant emerges from the relation between these two response classes.
When an experimenter arranges contingencies that differentially reinforce some response, he or she adopts a reinforcement criterion based on some response property. This reinforcement criterion defines a descriptive response class and provides the basis for differential reinforcement, that is, the reinforcement of some responses but not others (Catania, 1998). According to Catania (1973), a descriptive response class may also be defined in terms of a distribution of conditional probabilities of reinforcement. This distribution (a stimulus distribution or S distribution, as I will refer to it henceforth) functionally relates probabilities of reinforcement to the values some response property may assume. If the experimenter selects the response property force and requires a minimum amount of force (x) for a reinforcer to be presented, he or she establishes an S distribution that is shown in Figure 1.
Figure 1.

S distribution that is established when the experimenter selects the response property of force and requires a minimum amount of force (x) to reinforce the response. If the force exerted by response is lower than x, the reinforcement probability is 0. If the force is equal to or higher than x, the reinforcement probability is 1.
Before differential reinforcement takes place, the property force exhibits a particular probability distribution in the organism's repertoire. Responses with a force higher than x may be rare when differential reinforcement begins, but differential reinforcement can engender an increase in the frequency of responses with a force higher than x. Thus, the differential reinforcement of responses with a force higher than x can change the probability distribution of the property force in the organism's repertoire. Hence, it is possible to distinguish two probability distributions: one that prevails before differential reinforcement goes into effect (an initial, or baseline, distribution of some response property, hereafter the Ri distribution) and another distribution that prevails after differential reinforcement had an effect on the baseline distribution (a final distribution of the response property, the Rf distribution). According to Catania's (1973) proposal, the probabilities that define the Ri and Rf distributions are derived from the relative frequencies at which the values the property assumes are observed to occur. Hence, if responses with a force higher than x initially occur at a low relative frequency in the organism's repertoire (Ri), it may be said, alternatively, that such responses occur with a low probability. A continuous property like force may assume noninteger values. In such a case, the range of values the property may assume can be divided into equal intervals. The experimenter can therefore measure the relative frequency at which responses in each interval occur. These relative frequencies represent the probabilities that define the Ri and Rf distributions, according to Catania.
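To make the construction of an R distribution concrete, here is a minimal sketch in Python (the force values are hypothetical, invented for illustration): it divides the range of observed forces into 10 equal class intervals and takes the relative frequency of responses in each interval as the estimated probability, following Catania's (1973) proposal.

```python
import numpy as np

# Hypothetical force measurements (arbitrary units) for a set of responses.
forces = np.array([8.2, 9.1, 10.5, 7.8, 9.9, 11.3, 8.7, 9.4, 10.1, 8.9,
                   7.5, 8.1, 9.6, 10.8, 8.4, 9.0, 7.9, 9.2, 10.3, 8.6])

# Divide the range of the property into 10 equal class intervals.
edges = np.linspace(forces.min(), forces.max(), 11)
counts, _ = np.histogram(forces, bins=edges)

# Relative frequencies serve as the probabilities that define the R distribution.
r_distribution = counts / counts.sum()
print(r_distribution.sum())  # 1.0: a proper probability distribution
```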
The change in the probability distribution of the property force is an example of what Skinner called “quantitative differentiation” (1938, p. 338). Skinner's conceptual framework puts response differentiation at the center of operant conditioning, given that the latter process necessarily involves the former (Galbicka, 1988). It could be said, moreover, that “response differentiation is operant conditioning, and vice versa” (Galbicka, 1988, p. 343). A differentiation process, which is the usual effect of differential reinforcement, occurs when “the distribution of emitted responses [comes] to conform closely to the boundaries of the class of reinforced responses” (Catania, 1998, p. 117). In other words, a differentiation process occurs when the Rf distribution more closely conforms to the S distribution than does the Ri distribution. Thus, the differentiation process implies that the overlap between the S distribution and the Rf distribution is higher than the overlap between the S distribution and the Ri distribution. Catania (1973) proposed the standard Pearson product-moment correlation coefficient (r) as a measure of the degree of overlap between an S and an R distribution. Because a differentiation process entails an increase in the magnitude of the correlation, the variation of r constitutes the measure of this process. According to Catania (1973), “An operant relation is demonstrated when the distribution of stimulus probabilities along some dimension of responding increases the correlation between that distribution and the distribution of response probabilities along the same dimension” (p. 106). Thus, Catania's (1973, 1998) concept of the operant also implies that differentiation is the process that demonstrates an operant relation.
Therefore, the definition of an operant proposed by Catania (1973) involves three distributions that represent probabilities “as a function of some property of the response” (p. 105). These distributions (all with functional properties) are the following: an Ri distribution, an S distribution, and an Rf distribution. Figure 2 shows a hypothetical case in which the overlap between an S distribution and an Rf distribution is higher than the overlap between the S distribution and an Ri distribution. The change in overlap shows the usual effect of the differential reinforcement of responses with force higher than x.
Figure 2.
A hypothetical case in which the overlap between an S distribution and a final R distribution is higher than the overlap between the S distribution and an initial R distribution. The range of values that the property force may assume was divided into 10 class intervals to establish R distributions. Initially, responses with force higher than x occur with low probability (before the differential reinforcement of responses with force higher than x). The final R distribution represents a performance in which responses with force higher than x occur with high probability (after the differential reinforcement of responses with force higher than x).
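The overlap logic can be illustrated numerically. In the sketch below (hypothetical probabilities, loosely in the spirit of Figure 2), the Pearson r between the S distribution and an R distribution is computed before and after differential reinforcement; differentiation shows up as an increase in r.

```python
import numpy as np

# S distribution over 10 class intervals: reinforcement probability is 0
# below the criterion force x and 1 at or above it.
s_dist = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

# Hypothetical R distributions over the same 10 class intervals.
ri_dist = np.array([.20, .25, .20, .15, .10, .05, .03, .01, .01, .00])  # baseline
rf_dist = np.array([.01, .02, .03, .04, .05, .15, .25, .25, .15, .05])  # after reinforcement

r_initial = np.corrcoef(s_dist, ri_dist)[0, 1]  # strongly negative here
r_final = np.corrcoef(s_dist, rf_dist)[0, 1]    # strongly positive here
print(r_initial, r_final)  # differentiation: r_final > r_initial
```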
It should be noted that a change in correlation requires at least two values (an initial correlation and a final correlation), and these correlations should only be calculated for S and R distributions that are established on the same property. Thus, even if differential reinforcement changes R distributions of several response properties, only the change in the R distribution of that property on which the S distribution was established enables one to measure the differentiation process.
Correlated properties. In the example above, the S distribution was established on the property force and the differentiation process would have to be measured by a change in the Ri distribution established on that property. However, the differential reinforcement of responses with a force higher than x could also affect the probability distribution of other response properties if these properties were correlated with force. If higher forces required longer durations, for example, the procedure would also change the Ri distribution of the property duration. However, only the change in Ri distribution established on the property force would allow one to measure the differentiation process, because the S distribution was established on that property. When an experimenter differentially reinforces responses with forces higher than x, he or she manipulates an aspect of the organism's environment. Manipulation of a single environmental variable can change several response properties. Response differentiation is the behavioral process that links the manipulation (differential reinforcement) to the behavioral change that must be considered in order to show operant conditioning.
If variability is an operant dimension of behavior, as Neuringer (2002, 2003, 2009) claims, it should be possible to differentially reinforce variable behavior and to obtain a differentiation process as a result. That is, if Neuringer's supposition holds, it should be possible to establish an S distribution on the property variability and to measure the differentiation process by modifying an Ri distribution established on this same property. In fact, the differentiation process could only be measured by changes in distributions from Ri to Rf established on the property variability. The analysis of methods employed in studies of operant variability will show, however, that the effects of differential reinforcement of variable behavior have often been measured through changes in response properties on which the S distribution was not established. The present analysis focuses on the procedures and dependent variables that have been most common in studies of operant variability: the lag n and threshold procedures, and, as dependent variables, the percentage of reinforced sequences and the U value.
PROCEDURES USED IN STUDIES OF OPERANT VARIABILITY
The Lag n Procedure
A lag n contingency reinforces responses that differ from each of the previous n responses (differences being defined in terms of some response property). For example, when Schoenfeld, Harris, and Farmer (1966) tried to differentially reinforce variable behavior, they first conditioned a panel-press response in rats. In a second phase, a panel press was reinforced only if it ended an interresponse time (IRT), or a postreinforcement pause, that was different from the IRT (or postreinforcement pause) of the immediately preceding response. An IRT was considered different only if it fell within a temporal class interval different from that of the immediately preceding IRT (with temporal class intervals defined by the experimenters). Thus, Schoenfeld et al. used a lag n procedure with n = 1. The subjects in this study exhibited a pattern of alternating IRTs belonging to two class intervals. The rats that displayed this pattern earned all available reinforcers with the lowest possible level of variability.
Schwartz (1982) also tried to differentially reinforce variable behavior with a lag n procedure, but he conditioned key pecking in pigeons distributed across two keys, a left key (L) and a right key (R). Reinforcement was contingent on exactly four pecks on the left key and exactly four pecks on the right key, in any order. In addition, the contingency required pigeons to vary response sequences. The variability requirement was met only if the pattern of the current sequence differed from the immediately preceding n patterns. In an initial series of eight-peck sequences like LLLLRRRR, LLRRLLRR, LLRRLLRR, LLLLRRRR, LRLRLRLR (emitted in that order, with n = 1), only the third sequence would fail to be reinforced because it replicated the preceding sequence. Schwartz obtained low and decreasing variability during lag n sessions. In addition, the variability contingency involved some intermittency of reinforcement, because sequences that did not meet the variability requirement were not reinforced. Schwartz noted that this intermittency of reinforcement might have engendered part of the variability obtained under lag n contingency and concluded that the lag n contingency failed to engender operant variability.
Page and Neuringer (1985) used a similar procedure but eliminated the constraint that exactly four pecks occur on each key. Any eight-response sequence would be reinforced if it met the lag requirement of showing a different configuration of L and R responses from each of the preceding n sequences. Page and Neuringer conducted six experiments and demonstrated that the degree of sequence variability was a function of the lag requirements. The higher the value assigned to n, the higher the levels of obtained variability. Page and Neuringer also programmed a contingency that required birds to emit eight-response sequences, but without a variability requirement (a yoked variable-ratio [VR] schedule). In the yoke procedure, the reinforcer presentations were yoked to reinforced trials that occurred during sessions of the lag n contingency. Thus, the subjects in the yoke procedure were exposed to the same intermittency of reinforcement as in the lag n contingency, but in the former condition the pigeons were not required to vary, whereas in the latter the subjects were required to vary response sequences. Results showed that the lag n contingency generated greater variation in response sequences than did the yoked VR procedure. This finding demonstrated that variability engendered by the lag n contingency was not due simply to intermittency of reinforcement. Finally, Page and Neuringer demonstrated control of variable behavior by discriminative stimuli. These results led Page and Neuringer to conclude that variability is an operant dimension of behavior. The left column of Table 1 lists studies that adopted a lag n procedure and required subjects to emit sequences of responses across two or three operanda.
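Computationally, the lag n criterion amounts to checking that the current configuration matches none of the previous n configurations. The following helper is a sketch of that check (it is not code from any of the studies cited here):

```python
def meets_lag_n(current: str, history: list[str], n: int) -> bool:
    """True if the current sequence differs from each of the previous n sequences."""
    return current not in history[-n:]

# Schwartz's (1982) illustrative series under Lag 1:
series = ["LLLLRRRR", "LLRRLLRR", "LLRRLLRR", "LLLLRRRR", "LRLRLRLR"]
for i, seq in enumerate(series):
    print(seq, meets_lag_n(seq, series[:i], n=1))
# Only the third sequence fails: it repeats the immediately preceding one.
```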
Table 1.
Some studies of operant variability that required subjects to emit sequences of responses across two or three operanda and adopted the lag n procedure or the threshold procedure. All employed the U value as a measure of variability
Threshold Procedure
Denney and Neuringer (1998) differentially reinforced four-response sequences across two operanda, but they adopted a reinforcement criterion based on the relative frequency at which the current sequence had occurred in the recent past. A sequence was reinforced only if its relative frequency, computed on the set of sequences emitted during the current and the previous session, was less than or equal to a certain threshold value. After each trial, the relative frequency of each different sequence was recalculated and, when the sequence was reinforced, the calculations involved an adjustment by a weighting coefficient. The adjusted value was the weighted relative frequency of the sequence. Any current sequence was reinforced only if its weighted relative frequency was less than or equal to the threshold value. This procedure differentially reinforced sequences that occurred infrequently in the recent past. Denney and Neuringer's threshold procedure engendered more variability than did a control procedure in which variability was not required (but in which the reinforcement frequencies were equated to those in the variability condition).
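The sketch below paraphrases the threshold logic in Python. The update rule shown (a uniform decay of all counts after every trial) and the parameter values are illustrative assumptions; they are not Denney and Neuringer's (1998) exact weighting algorithm.

```python
from collections import defaultdict

WEIGHT = 0.95      # decay coefficient (assumed, for illustration)
THRESHOLD = 0.05   # maximum weighted relative frequency that still earns food

counts: defaultdict[str, float] = defaultdict(float)

def threshold_trial(sequence: str) -> bool:
    """Reinforce the sequence if its weighted relative frequency (WRF) is at or
    below the threshold, then update the weighted counts."""
    total = sum(counts.values())
    wrf = counts[sequence] / total if total > 0 else 0.0
    reinforced = wrf <= THRESHOLD
    for key in counts:          # decay old counts so the recent past dominates
        counts[key] *= WEIGHT
    counts[sequence] += 1.0
    return reinforced
```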
The lag n and threshold procedures share a common feature: In both cases, reinforcement depends on the frequency at which a sequence occurred in the recent past, the relative frequency in the threshold procedure and the absolute frequency in the lag n procedure. The right column of Table 1 lists studies that required subjects to emit sequences of responses across two or three operanda and adopted a threshold procedure.
Lag n Procedure, Threshold Procedure, and the S Distributions
Schwartz (1982) raised some problems with the response class definition as applied to the study of operant variability: “Suppose we wanted to use reinforcement to increase behavioral variability. What would define the operant class on which reinforcement depends? What objective property of responses would unite them into a class? It is clear that there is no such property” (p. 178). However, both the lag n and the threshold contingency provide reinforcement to some sequences and not to others. Thus, both procedures differentially reinforce the sequences subjects emit, and differential reinforcement implies a reinforcement criterion. Because these contingencies differentially reinforce sequences, reinforcement criteria must be established on sequence properties. The number of L responses, the number of R responses, and their duration are some sequence properties. If a subject shifts from one operandum (e.g., the left key) to another one (e.g., the right key) during the emission of a sequence, it executes a switch between the operanda. A switch involves, therefore, an intrasequential alternation between the operanda. The sequences LLRR, RRRR, LRRL, LRLR contain, respectively, 1, 0, 2, and 3 switches between the operanda. Some properties, such as the number of L responses, the number of R responses, the locations of L and R responses, and the number and locations of switches, are properties that define the sequence configuration. Some other properties, such as sequence duration or the force of each response, do not affect sequence configuration. All of these properties are intrinsic to the sequence, because it is possible to ascertain them by examining only the sequence itself.
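Because the switch count depends only on the sequence's own configuration, it can be computed without reference to any other sequence; a minimal sketch:

```python
def count_switches(sequence: str) -> int:
    """Number of changes of operandum between consecutive responses."""
    return sum(a != b for a, b in zip(sequence, sequence[1:]))

# Verifies the switch counts given in the text.
assert [count_switches(s) for s in ("LLRR", "RRRR", "LRRL", "LRLR")] == [1, 0, 2, 3]
```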
There are, however, some properties of each individual sequence that assume values only when the sequence is taken within a particular series of sequences. Suppose the following series of sequences is emitted: LLRR, RRRR, LRLR, RLRL. We can define relational properties for any sequence in the series. For the last sequence (RLRL), for example, consider the property absolute frequency at which a sequence containing the same number of switches occurred in the previous three sequences (one), the property relative frequency at which a sequence containing the same number of L responses occurred in the previous three sequences (0.67), or the property absolute frequency at which a sequence showing the same configuration occurred in the previous three sequences (zero). These are examples of relational properties of sequences, defined only by examining the series of sequences in which the current sequence is inserted.
A reinforcement criterion may be established on any sequence property. The lag n procedure establishes a reinforcement criterion based on the property absolute frequency at which the current sequence occurred in the previous n trials. The threshold procedure establishes a reinforcement criterion based on the property weighted relative frequency at which the current sequence occurred in the past.
Once a reinforcement criterion is defined, a descriptive class of responses is also defined. This descriptive class may in turn be expressed in terms of an S distribution. The threshold procedure reinforces sequences whose weighted relative frequency is less than or equal to a threshold value. Thus, the threshold procedure establishes its S distribution on the property weighted relative frequency at which the sequence occurred in the past (WRF). WRF is a relational property of sequences and ranges from 0 to 1. If a sequence presents a WRF value less than or equal to the threshold value, it produces reinforcement; otherwise, the sequence never produces reinforcement. The lag n procedure reinforces only sequences whose absolute frequency in the previous n trials is equal to zero. Thus, a lag n contingency establishes its S distribution on the sequence property absolute frequency at which the sequence occurred in the previous n trials (FPN). FPN is also a relational property of sequences and may assume integer values between 0 and n. Sequences whose FPN is equal to zero are reinforced with a probability equal to 1. Sequences whose FPN is positive are reinforced with a probability equal to 0.
R Distributions
It is also possible to establish an Ri distribution on the properties WRF and FPN. If a subject emits only two different sequences and alternates them systematically, it produces a series of sequences like this: A, B, A, B, A, B, …, in which A is one sequence (e.g., LLLL) and B is another (e.g., RRRR). This hypothetical stable pattern of sequence emission may be used to illustrate how one could establish R distributions on the properties FP1, FP2, FP3, and FP4. Table 2 shows how one could assign values to these four properties after each sequence emission. In the third column, the FP1 property assumes the value of 0 because the B sequence has not occurred in the immediately preceding emission. In the fourth column, the FP2 property assumes the value of 1 because the A sequence has occurred once in the two immediately preceding emissions. In the fifth column, the FP3 property assumes the value of 1 because the B sequence has occurred once in the three immediately preceding emissions. In the sixth column, the FP4 property assumes the value of 2 because the A sequence has occurred twice in the four immediately preceding emissions, and so on.
Table 2.
Values assigned to properties FP1, FP2, FP3, and FP4 after each sequence emission if the subject alternates systematically only two different sequences (A and B). FP1 corresponds to the property absolute frequency at which the sequence occurred in the immediately preceding trial, FP2 corresponds to the property absolute frequency at which the sequence occurred in the immediately preceding two trials, and so on
Table 3 shows the frequency calculations on the values taken from Table 2. It shows, for example, that in the pattern of alternation between two different sequences (A, B, A, B, A, B, …), (a) the FP1 property, which may assume the values of 0 or 1, assumes the value of 0 for any sequence considered (i.e., the relative frequency at which the FP1 property assumes the value of 0 is equal to 1); (b) the FP2 property, which may assume the values of 0, 1, or 2, never assumes the value of 2 (i.e., the relative frequency at which the FP2 property assumes the value of 2 is equal to 0); (c) the FP4 property, which may assume the values of 0, 1, 2, 3, or 4, assumes the value of 2 for any sequence considered (i.e., the relative frequency at which the FP4 property assumes the value of 2 is equal to 1). Figure 3 graphically represents the R distributions of each property.
Table 3.
The calculation of relative frequencies on the values taken from Table 2
Figure 3.
Graphic representations of R distributions established on the sequence properties of FP1, FP2, FP3, and FP4.
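The constant values shown in Table 3 and Figure 3 can be reproduced with a short computation. The fpn helper below is a hypothetical illustration of how the FPN property is assigned, applied to the alternating series A, B, A, B, …:

```python
def fpn(series: list[str], i: int, n: int) -> int:
    """Absolute frequency of sequence series[i] among its n preceding emissions."""
    return series[max(0, i - n):i].count(series[i])

series = ["A", "B"] * 50  # systematic alternation of two sequences
for n in (1, 2, 3, 4):
    values = {fpn(series, i, n) for i in range(n, len(series))}
    print(f"FP{n}:", values)
# FP1: {0}, FP2: {1}, FP3: {1}, FP4: {2}, matching Table 3.
```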
Despite Schwartz's (1982) objection to behavioral variability as an operant class, the lag n and the threshold procedures select objective, although relational, properties of responses (or response sequences) on which reinforcement depends. These contingencies define a reinforcement criterion (and, therefore, establish S distributions) based on those properties, which, in turn, unite some sequences into a class, at least into a descriptive class.
DEPENDENT VARIABLES IN STUDIES OF OPERANT VARIABILITY
According to Neuringer (2002), “U value has been the most commonly employed measure of operant variability” (p. 683). The U stands for uncertainty, and U value derives from information theory (Attneave, 1959). The U value essentially measures levels of uncertainty. Page and Neuringer (1985) calculated a U value at three levels: U1 was calculated from the individual relative frequencies at which the responses L and R occurred; U2 was calculated from the relative frequencies at which the pairs of responses (LL, LR, RL, RR) occurred; and U3 was calculated from the relative frequencies at which the triplets of responses (LLL, LLR, LRL, LRR, …) occurred. U values show how much the responding is biased, because a perfectly variable (or unbiased) sequence pattern of L and R responses contains equal frequencies of individual responses (L and R), equal frequencies of pairs of responses (LL, LR, RL, RR), and equal frequencies of triplets of responses (LLL, LLR, LRL, LRR, …).
U value may also be calculated from the relative frequencies at which each of all the different sequences occurs (Neuringer, 2002). The number of operanda employed (two or three) and the number of responses per sequence define a universe of all possible different sequence configurations. A procedure that requires four-response sequences across two operanda defines a universe of 16 (2^4) different sequences (different with respect to configuration). In this case, the U value is a function of the relative frequency at which each of the different sequences occurs and of the universe of all different sequences. Therefore, the change in U value reflects the change in the frequency distribution of all the different sequences. U value ranges from 0 to 1. When all different sequences occur equally often, the U value is equal to 1. When only a single sequence occurs, the U value is equal to 0. Thus, the more uniform the frequency distribution of all different sequences is, the more the U value approximates 1. Whenever U value is mentioned henceforth, it refers to the U value calculated from the relative frequencies at which each different sequence occurs.1 The percentage or proportion of reinforced sequences is another common dependent variable. It equals the ratio of the number of reinforced sequences to the number of trials.
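In information-theoretic terms, the U value computed this way is the Shannon entropy of the obtained frequency distribution, normalized by the entropy of a uniform distribution over the universe of possible configurations: U = -(sum over i of p_i log2 p_i) / log2 N, where N is the number of possible different sequences. A minimal sketch, assuming that standard normalization:

```python
import math
from collections import Counter

def u_value(sequences: list[str], universe_size: int) -> float:
    """Entropy of the obtained sequence distribution, normalized so that a
    uniform distribution over all possible sequences yields 1.0."""
    n = len(sequences)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(sequences).values())
    return entropy / math.log2(universe_size)

print(u_value(["LLRR"] * 20, universe_size=16))           # 0.0: one sequence only
print(u_value(["LLRR", "RRLL", "LRLR", "RLRL"] * 5, 16))  # 0.5: four sequences, equal use
```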
All of the studies listed in Table 1 employed U value as a measure of variability. All the studies in the left column calculated the percentage of reinforced sequences or the percentage of correct sequences (except Hunziker, Saldana, & Neuringer, 1996). Neuringer, Kornell, and Olufs (2001) also calculated percentage of correct sequences. Ross and Neuringer (2002) programmed a threshold contingency and calculated a U value, but they used human subjects and employed a procedure that required the participants to draw figures on a computer screen.
Note that neither the lag n nor the threshold procedure establishes its S distribution on U value, because neither specifies a single conditional probability of reinforcement for each U value.
Differentiation Process and Measures of Variability
As previously discussed, a differentiation process should be measured only through the change of an Ri distribution established on the same property as the S distribution. Thus, even if a contingency increases levels of behavioral variability but in fact differentially reinforces some sequences on the basis of properties other than their variability, the increase in variability levels, whatever it is, does not allow one to measure a differentiation process. In this case, the increase in variability levels represents a secondary effect of differential reinforcement. Machado (1997) demonstrated such an effect. He argued that the lag n contingency of Page and Neuringer (1985) might have differentially reinforced sequences that contain an intermediate number of switches, and the obtained variability therefore might have been a by-product of such differential reinforcement. If Machado's hypothesis is correct, Page and Neuringer's findings demonstrated that the number of switches per sequence is an operant dimension of behavior, and the variability generated is an indirect effect of the reinforcement of sequences that contain an intermediate number of switches. Machado's hypothesis implies the following prediction: If a contingency directly reinforces sequences that contain an intermediate number of switches, it will engender high levels of variability. Machado tested his hypothesis by establishing a reinforcement criterion based on the property number of switches of the sequence. He used pigeons and an experimental chamber equipped with two keys. An eight-response sequence was required in each trial. The procedures of Experiments 1 and 2 established a descriptive class on the sequence property number of switches. In Experiment 1, the subjects received food only if they emitted a sequence that contained at least one switch (Group 1) or at least two switches between the keys (Group 2). In Experiment 2, Machado programmed a probabilistic schedule of reinforcement on the property number of switches. Figure 4 shows the S distribution Machado established in Experiment 2. In Experiment 3, naive pigeons were explicitly required to vary their behavior (a Lag 25 contingency was used).
Figure 4.

S distribution Machado (1997) established in Experiment 2.
Machado (1997) calculated the proportion of different sequences per session (the ratio of the number of different sequences to the number of trials) and the proportion of reinforced sequences the birds emitted. The former was the variability measure. The differential reinforcement of switching behavior produced substantial levels of variability. Machado reported, however, that the Lag 25 contingency engendered variability levels slightly higher than did the differential reinforcement of an intermediate number of switches. Barba and Hunziker (2002) found similar results.
Note that the increases in proportions of different sequences in Machado's (1997) Experiments 1 and 2, although they actually occurred, do not enable one to measure a differentiation process, because the S distribution was not established on the property proportion of different sequences. Note, however, that the proportion of different sequences could itself serve as a reinforcement criterion. After the emission of each sequence, Machado could have calculated the proportion of different sequences emitted up to that point and reinforced the current sequence only if the current proportion were higher than a certain value. In this case, the variability measure would coincide with the sequence property on which the reinforcement criterion was defined. If the change in proportion were controlled by such a contingency of reinforcement, one could then hold that the proportion of different sequences is itself a relational sequence property that is sensitive to differential reinforcement. The proportion of different sequences can be updated after the emission of each sequence. Thus, the proportion is a sequence property in the sense that it is possible to assign to any emitted sequence a specific value (the proportion of different sequences calculated over the set of all emitted sequences).
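A sketch of this hypothetical contingency follows; the criterion value (0.5) is an arbitrary illustration, not a parameter from Machado (1997).

```python
emitted: list[str] = []

def proportion_trial(sequence: str, criterion: float = 0.5) -> bool:
    """Reinforce only if the running proportion of different sequences,
    updated after the current emission, exceeds the criterion."""
    emitted.append(sequence)
    proportion = len(set(emitted)) / len(emitted)  # different sequences / trials
    return proportion > criterion
```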
Hence, in Machado's (1997) Experiments 1 and 2, the obtained variability represented, indeed, an indirect effect of differential reinforcement. In these experiments, the measure of variability was clearly dissociated from the sequence property on which the S distribution was established. (It should be noted, in passing, that the subjects could have obtained all the available reinforcers with no variability at all.) Machado can, therefore, assert that his procedure (in Experiments 1 and 2) shaped switching between the keys as an operant and that the variability he obtained was a by-product of the differential reinforcement of switching behavior. This rationale holds because Machado's S distribution was established on the property number of switches (and not on the property variability).
A similar analysis seems to apply to situations in which the experimenter differentially reinforces sequences of responses with a lag n procedure and measures the effects of differential reinforcement through the change in U value. In this case, the experimenter establishes the S distribution on the FPN property (and not on U value). Therefore, the change in U value does not allow the experimenter to measure a differentiation process.
It might be argued, perhaps, that changes in U value and changes in the Ri distribution of the FPN property are correlated events (much as the properties force and duration were in my hypothetical example). However, this correlation is imperfect because it depends on the n value and on the number of different possible sequences. Suppose, for example, a subject emits only a single four-response sequence, with no switches, under a contingency that does not require variability. If the subject is exposed to a Lag 3 contingency and comes to emit four different sequences by alternating them with equal frequency, the Ri distribution of the property FP3 will undergo a substantial change (the largest possible change, indeed), but the U value (calculated on the universe of 16 different sequences) will reach only half of its highest value. If this hypothetical case involves eight-response sequences (with a universe of 256 different sequences), the change in U value will be even smaller. The same holds for the threshold procedure. A subject can perfectly meet the requirements of such a contingency and, at the same time, produce a small change in U value if the scheduled threshold value is high enough (or, in other words, if the contingency is very permissive).
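The arithmetic of this example is easy to verify. With four equiprobable sequences, the entropy is 2 bits, so the normalized U value is 2/log2(16) = 0.5 for four-response sequences and 2/log2(256) = 0.25 for eight-response sequences; a quick check:

```python
import math

def normalized_entropy(probs: list[float], universe_size: int) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0) / math.log2(universe_size)

print(normalized_entropy([0.25] * 4, 16))   # 0.5: four-response sequences
print(normalized_entropy([0.25] * 4, 256))  # 0.25: eight-response sequences
```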
Therefore, the studies that use lag n or threshold procedures and employ U value as the variability measure may be compared with Machado's (1997) Experiments 1 and 2 with respect to the following characteristic: All of them adopt a variability measure that is distinct from the sequence properties on which the S distributions are established. But this is not an inevitable feature of studies of operant variability. U value can itself be a property of individual sequences, and S distributions can be established on it, as will be seen below.
CUMULATIVE U VALUE AS A SEQUENCE PROPERTY
According to Neuringer (2002), different methods, including a lag n contingency, threshold procedures, and differential reinforcement of switches versus repetitions, “increase or maintain variability by reinforcing it” (p. 682). Neuringer did not mention a procedure that takes U value as a reinforcement criterion, but such a procedure can be arranged. In a series of sequences, it is possible to assign a cumulative U value to each of them. The cumulative U value of any current sequence is a function of the frequencies of each different sequence that had occurred up to that point and of the number of all possible different sequences. For example, after each emission of a complete sequence, a program might compute a cumulative U value and provide reinforcement only if the current cumulative U value reached a minimal level. Before any sequence is emitted, the program may assume that the subject had emitted every possible sequence exactly once (like the threshold procedure programmed by Denney & Neuringer, 1998). To start with, therefore, the subject might have a cumulative U value equal to 1. This contingency establishes its S distribution on the sequence relational property cumulative U value. This procedure provides an identity between the most common measure of operant variability (U value) and the sequence property on which the differential reinforcement is based. Like FPN or WRF, cumulative U value is a relational property of sequences. Only when U value is the property on which the S distribution is established does the change in U value provide a basis to measure the differentiation process. As far as I know, a contingency that establishes the S distribution on the sequence property cumulative U value has not yet been arranged.2
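One way such a contingency might be arranged is sketched below. This is a hypothetical implementation, not an existing procedure; the minimal cumulative U value (0.7) is an arbitrary illustration. As with Denney and Neuringer's (1998) threshold procedure, the counts start as if every possible sequence had occurred once, so the initial cumulative U value equals 1.

```python
import math
from collections import Counter
from itertools import product

UNIVERSE = ["".join(p) for p in product("LR", repeat=4)]  # 16 configurations
counts = Counter({seq: 1 for seq in UNIVERSE})            # start at cumulative U = 1

def cumulative_u_trial(sequence: str, minimum: float = 0.7) -> bool:
    """Update the counts with the current sequence and reinforce only if the
    cumulative U value remains at or above the minimal level."""
    counts[sequence] += 1
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(len(UNIVERSE)) >= minimum
```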
Arbitrariness of U Value as a General Measure of Operant Variability
If a contingency does not establish its S distribution based on the U value, the adoption of the U value as the main dependent variable is to a large extent arbitrary. Why, for example, does the U-value calculation take the entire universe of all different sequences? If a subject emits only 10 different sequences during a session, why is the U value calculated on the entire universe (e.g., 16 different sequences, if a four-response sequence is required) and not on the universe of 10 different sequences? Only if a contingency establishes its S distribution on the cumulative U value (calculated from the entire universe of all different sequences) can the experimenter satisfactorily justify the calculation of U value (as a dependent variable) based on the entire universe of all different sequences.
Local Variability and Molar Variability
As Odum, Ward, Barnes, and Burke (2006) pointed out, under a lag n contingency the U value is a "relatively molar measure of variability" (p. 162), because it is calculated from the frequencies at which each sequence was emitted during an entire session. The percentage of reinforced trials, on the other hand, is a measure affected by "more molecular aspects of variability" (p. 162). If n is low (relative to the universe of all possible different sequences), a lag n contingency requires subjects to emit only sequences that differ from the few most recent ones. This contingency does not require subjects to emit all the possible different sequences equally often. That is, it demands only local variability. Subjects that initially emit only a single sequence can meet the requirement of such a contingency (and earn all the available reinforcement) with small changes in U value, because the U value reflects the uniformity of the frequency distribution of all possible different sequences. Therefore, the change in U value constitutes a rough measure of the effects of differential reinforcement engendered by such a contingency. Indeed, a lag n contingency always requires local variability, because repetition of the most recent sequences is never reinforced. Only as n increases does the lag n contingency come to require the subject to emit a greater number of different sequences equally often.
On the other hand, a contingency that imposes a minimal cumulative U value for reinforcement requires mainly molar variability. This contingency does not necessarily require high levels of local variability. In fact, the same sequence could produce reinforcement even if it were emitted in consecutive trials, provided it was infrequent enough in the set of all sequences already emitted. Conversely, a sequence might fail to be reinforced, even if it did not repeat the more recent sequences, if the cumulative U value did not reach the minimal value. This contingency primarily requires the unbiased emission of different sequences over extended periods of time. Under this contingency, the change in U value enables one to measure the differentiation process, and the change in local variability levels represents, in turn, a secondary effect of differential reinforcement.
CONCLUSION
Neuringer (2002, 2003, 2004, 2009) has argued that variability is an operant dimension of behavior and has reported that U value has been the most common measure of operant variability. If differentiation is the process that demonstrates an operant relation, only the change in a dependent variable that provides the basis to measure a differentiation process can unequivocally demonstrate that variability is an operant dimension of behavior. If U value is not the property on which a reinforcement criterion is established, the change in U value does not provide the basis to measure a differentiation process. In such a case, the U value is at best a correlate of the property on which the S distribution is established. Thus, the change in U value, taken as a general measure of operant variability, seems to constitute a crude measure of the effects produced by operant variability contingencies. Hence, methods of studying operant variability in which the property that defines an S distribution and the property on which the effects of differential reinforcement are measured are the same might offer more convincing evidence that variability is an operant dimension of behavior.
It is possible, in addition, that contingencies that require only local variability and contingencies that require primarily molar variability will engender different patterns of behavior; that is, distinct contingencies may select variability at a molar level or at a local level. A prediction can be derived from this hypothesis: If it is correct, future research will be able to demonstrate (e.g., by employing yoked procedures) that contingencies that require only local variability and contingencies that require mainly molar variability may produce the same level of molar variability (e.g., the same U value) with different levels of local variability. This would be a tangible consequence of my analytic investigation. If future research shows that behavior is differentially sensitive to a requirement of local variability versus a requirement of molar variability, then Neuringer's claim will be strengthened. Such a result would refine the concept of operant variability.
As I pointed out in the title, this article proposes a conceptual analysis of operant variability. The paper can be thought of as a critique of the adoption of U value as a general measure of operant variability in studies designed to demonstrate experimentally that variability is an operant dimension of behavior. By general measure I mean that U value is often adopted as the main variability measure independently of the sequence property on which the reinforcement criterion is established. This article provides an analytic approach to the question of operant variability insofar as it deconstructs variability, as a complex behavioral dimension sensitive to environmental consequences, into more elementary components (or properties), each of which may be differentially sensitive to differential reinforcement. This analytic effort does not deny that behavioral technologies derived from operant variability research have been successfully applied in several fields.
Acknowledgments
I thank Patrick Diamond for reviewing the style of this paper. I also thank Armando Machado for his helpful comments on its content. Finally, I thank Maria de Lourdes Passos for her helpful comments on the content and style of this article.
Footnotes
According to Machado (1997), subjects exposed to a lag n contingency learn to emit sequences with an intermediate number of switches, and the change in variability levels is a by-product of this learning. That is, variability obtained under a lag n contingency might not be a basic behavioral process. The present work does not address this question. This article argues that, if variability is an operant dimension of behavior, then the methods that supposedly demonstrate it must allow one to measure the differentiation process through the change in the dependent variable that represents variability levels. It is claimed here that this measurement possibility is a necessary but not a sufficient condition to demonstrate that variability is an operant dimension of behavior.
REFERENCES
- Abreu-Rodrigues J., Lattal K.A., Dos Santos C.V., Matos R.A. Variation, repetition, and choice. Journal of the Experimental Analysis of Behavior. 2005;83:147–168. doi:10.1901/jeab.2005.33-03
- Attneave F. Applications of information theory to psychology: A summary of basic concepts. New York, NY: Holt, Rinehart & Winston; 1959.
- Barba L.S., Hunziker M.H.L. Variabilidade comportamental produzida por dois esquemas de reforçamento [Behavioral variability produced by two reinforcement schedules]. Acta Comportamentalia. 2002;10(1):5–22.
- Catania A.C. The concept of the operant in the analysis of behavior. Behaviorism. 1973;1:103–116.
- Catania A.C. Learning (4th ed.). Upper Saddle River, NJ: Prentice Hall; 1998.
- Cohen L., Neuringer A., Rhodes D. Effects of ethanol on reinforced variations and repetitions by rats under a multiple schedule. Journal of the Experimental Analysis of Behavior. 1990;54:1–12. doi:10.1901/jeab.1990.54-1
- Denney J., Neuringer A. Behavioral variability is controlled by discriminative stimuli. Animal Learning & Behavior. 1998;26:154–162. doi:10.3758/BF03199208
- Doughty A.H., Lattal K.A. Resistance to change of operant variation and repetition. Journal of the Experimental Analysis of Behavior. 2001;76:195–215. doi:10.1901/jeab.2001.76-195
- Galbicka G. Differentiating The Behavior of Organisms. Journal of the Experimental Analysis of Behavior. 1988;50:343–354. doi:10.1901/jeab.1988.50-343
- Grunow A., Neuringer A. Learning to vary and varying to learn. Psychonomic Bulletin & Review. 2002;9:250–258. doi:10.3758/BF03196279
- Hunziker M.H.L., Saldana L., Neuringer A. Behavioral variability in SHR and WKY rats as a function of rearing environment and reinforcement contingency. Journal of the Experimental Analysis of Behavior. 1996;65:129–144. doi:10.1901/jeab.1996.65-129
- Machado A. Increasing the variability of response sequences in pigeons by adjusting the frequency of switching between two keys. Journal of the Experimental Analysis of Behavior. 1997;68:1–25. doi:10.1901/jeab.1997.68-1
- Morgan L., Neuringer A. Behavioral variability as a function of response topography and reinforcement contingency. Animal Learning & Behavior. 1990;18:257–263. doi:10.3758/BF03205284
- Neuringer A. Operant variability and repetition as functions of interresponse time. Journal of Experimental Psychology: Animal Behavior Processes. 1991;17:3–12. doi:10.1037/0097-7403.17.1.3
- Neuringer A. Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review. 2002;9:672–705. doi:10.3758/BF03196324
- Neuringer A. Creativity and reinforced variability. In: Lattal K.A., Chase P.N., editors. Behavior theory and philosophy. New York, NY: Kluwer Academic/Plenum; 2003. pp. 323–338.
- Neuringer A. Reinforced variability in animals and people. American Psychologist. 2004;59:891–906. doi:10.1037/0003-066X.59.9.891
- Neuringer A. Operant variability and the power of reinforcement. The Behavior Analyst Today. 2009;10:319–343.
- Neuringer A., Deiss C., Olson G. Reinforced variability and operant learning. Journal of Experimental Psychology: Animal Behavior Processes. 2000;26:98–111. doi:10.1037/0097-7403.26.1.98
- Neuringer A., Huntley R.W. Reinforced variability in rats: Effects of gender, age and contingency. Physiology & Behavior. 1992;51:145–149. doi:10.1016/0031-9384(92)90216-O
- Neuringer A., Kornell N., Olufs M. Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes. 2001;27:79–94. doi:10.1037/0097-7403.27.1.79
- Odum A.L., Ward R.D., Barnes C.A., Burke K.A. The effects of delayed reinforcement on variability and repetition of response sequences. Journal of the Experimental Analysis of Behavior. 2006;86:159–179. doi:10.1901/jeab.2006.58-05
- Page S., Neuringer A. Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes. 1985;11:429–452. doi:10.1037/0097-7403.11.3.429
- Ross C., Neuringer A. Reinforcement of variations and repetitions along three independent response dimensions. Behavioural Processes. 2002;57:199–209. doi:10.1016/S0376-6357(02)00014-1
- Schoenfeld W.N., Harris A.H., Farmer J. Conditioning response variability. Psychological Reports. 1966;19:551–557. doi:10.2466/pr0.1966.19.2.551
- Schwartz B. Failure to produce response variability with reinforcement. Journal of the Experimental Analysis of Behavior. 1982;37:171–181. doi:10.1901/jeab.1982.37-171
- Skinner B.F. The behavior of organisms. New York, NY: Appleton-Century-Crofts; 1938.
- Skinner B.F. Contingencies of reinforcement. New York, NY: Appleton-Century-Crofts; 1969.
- Ward R.D., Kynaston A.D., Bailey E.M., Odum A.L. Discriminative control of variability: Effects of successive stimulus reversal. Behavioural Processes. 2008;78:17–24. doi:10.1016/j.beproc.2007.11.007