Journal of the Experimental Analysis of Behavior. 2006 Jul;86(1):11–30. doi: 10.1901/jeab.2006.11-05

Contingency Discriminability and Peak Shift in Concurrent Schedules

Christian U Krägeloh, Douglas M Elliffe, Michael Davison
PMCID: PMC1592349  PMID: 16903490

Abstract

We investigated the effects of discriminative stimuli on choice in a highly variable environment using a procedure in which multiple two-key concurrent VI VI components changed every 10 reinforcers and were signaled by differential flashes of red and yellow keylights. Across conditions, five pigeons were exposed to a number of different combinations of the following component reinforcer ratios: 27∶1, 9∶1, 3∶1, 1∶1, 1∶3, 1∶9, 1∶27. Overall, there was clear control by the component signals in that preference, early in components and particularly before any reinforcers had been delivered, was ordinally related to the signaled reinforcer ratios. In conditions in which only two components arranged unequal reinforcer ratios (e.g., 27∶1 and 1∶27) with the remaining components arranging 1∶1 reinforcer ratios, preference before the first reinforcer in a component showed peak shift in that the most extreme preference did not occur in the unequal reinforcer-ratio components, but in 1∶1 components further towards the ends of the stimulus dimension. The contingency-discriminability model (Davison & Nevin, 1999) was fitted to the data and provided an excellent description of the interactions between stimulus and reinforcer effects in a highly variable environment.

Keywords: choice, stimulus control, contingency discriminability, peak shift, multiple-concurrent schedules, key peck, pigeons


The generalized matching law (Baum, 1974) proposes a linear relation between log response ratios and log obtained reinforcer ratios in concurrent schedules:

$$\log\frac{B_1}{B_2} = a\,\log\frac{R_1}{R_2} + \log c \tag{1}$$

where a refers to sensitivity to reinforcer ratio (Lobb & Davison, 1975) and log c to a constant preference (i.e., bias) for one response alternative over the other. Typical sensitivities in standard concurrent variable-interval (VI) VI schedules are around 0.8 to 0.9 (Baum, 1979; Taylor & Davison, 1983; Wearden & Burgess, 1982).
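As a minimal sketch of how the generalized matching relation, log(B1/B2) = a log(R1/R2) + log c, is typically applied, the following Python fragment predicts log response ratios and recovers sensitivity and bias by ordinary least squares. The function names and the example parameter values (a = 0.8, log c = −0.1) are illustrative assumptions, not data from this study.

```python
import math

def gml_log_response_ratio(log_reinforcer_ratio, a, log_c):
    """Generalized matching: log(B1/B2) = a * log(R1/R2) + log c."""
    return a * log_reinforcer_ratio + log_c

def fit_gml(log_reinforcer_ratios, log_response_ratios):
    """Ordinary least-squares estimates of sensitivity (a) and bias (log c)."""
    n = len(log_reinforcer_ratios)
    mean_x = sum(log_reinforcer_ratios) / n
    mean_y = sum(log_response_ratios) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_reinforcer_ratios, log_response_ratios))
    sxx = sum((x - mean_x) ** 2 for x in log_reinforcer_ratios)
    a = sxy / sxx
    log_c = mean_y - a * mean_x
    return a, log_c

# The seven component reinforcer ratios used in this procedure (left:right).
ratios = [27, 9, 3, 1, 1 / 3, 1 / 9, 1 / 27]
log_r = [math.log10(r) for r in ratios]
# Hypothetical subject with a = 0.8 and log c = -0.1 (illustrative values only).
log_b = [gml_log_response_ratio(x, 0.8, -0.1) for x in log_r]
a_hat, log_c_hat = fit_gml(log_r, log_b)
```

Because the synthetic data lie exactly on a line, the fit returns the generating parameters; with real data the slope would be read as sensitivity and the intercept as bias.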

When the stimuli signalling the concurrent alternatives are not easily discriminable, sensitivity decreases. Miller, Saunders, and Bourland (1980) arranged concurrent schedules in which a switching key alternated stimuli and schedules on a main key (Findley, 1958) and varied the discriminative stimuli associated with the two concurrent alternatives across conditions. When the stimuli were line orientations that differed by 45°, pigeons strictly matched response ratios to the reinforcer ratios (a  =  1.0), but when the line orientations differed by only 15°, sensitivity decreased to 0.28 and 0.37. When identical stimuli signaled both alternatives, sensitivity was 0.17 for both pigeons. This effect also was confirmed by Bourland and Miller (1981). Alsop and Davison (1991) investigated whether the fact that some control by reinforcer ratios remained when the discriminative stimuli were identical could indicate a win/stay lose/shift strategy. To test this, pecks on the switching key did not strictly alternate the operative schedule and stimulus, but initiated one or the other concurrent alternative randomly. The schedule and stimulus in effect after a reinforcer was also determined randomly. When the discriminative stimuli were identical for both concurrent alternatives, there was no control by the reinforcer ratio and sensitivity to reinforcer ratio was zero.

A quantitative model of choice that is particularly suitable when the concurrent alternatives are less than perfectly discriminable is the contingency-discriminability model (Davison & Jenkins, 1985), which describes concurrent-schedule performance as follows:

$$\frac{B_1}{B_2} = c\,\frac{d_{br12}R_1 + R_2}{d_{br12}R_2 + R_1} \tag{2}$$

where c refers to any constant key bias, and dbr12 refers to the response–reinforcer contingency discriminability between Alternatives 1 and 2. That is, dbr12 describes the accuracy with which reinforcers are attributed by the animal to the response alternative on which they were obtained. The parameter dbr12 is constrained to fall between 1.0 and infinity, where a value of 1.0 indicates a total failure of response–reinforcer discriminability, thus predicting that B1/B2  =  c for all reinforcer ratios, and a value of infinity indicates perfect discriminability and strict matching between response and reinforcer ratios.
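The two limiting cases described above can be checked numerically. The sketch below assumes the Davison and Jenkins (1985) form B1/B2 = c(dbr12·R1 + R2)/(dbr12·R2 + R1); the function name and the example reinforcer counts are hypothetical.

```python
import math

def cd_response_ratio(r1, r2, d_br, c=1.0):
    """Predicted B1/B2 = c * (d_br*R1 + R2) / (d_br*R2 + R1).

    Numerator and denominator are both divided by d_br so that
    d_br = math.inf (perfect discriminability) is handled without inf/inf.
    """
    if d_br < 1.0:
        raise ValueError("d_br must lie between 1.0 and infinity")
    return c * (r1 + r2 / d_br) / (r2 + r1 / d_br)

no_discrimination = cd_response_ratio(27, 1, 1.0)   # reduces to c
perfect = cd_response_ratio(27, 1, math.inf)        # strict matching
intermediate = cd_response_ratio(9, 1, 10.0)        # between the two limits
```

Dividing through by the discriminability parameter is a numerical convenience only; it leaves the predicted ratio unchanged for every finite value.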

Alsop (1991) and Davison (1991) independently modified the contingency-discriminability model to account for performance in conditional discrimination (see review by Davison & Nevin, 1999). When the two choice alternatives in a conditional discrimination are signaled by different stimuli, the effect of stimulus control is described by an additional parameter dsb12, stimulus–response contingency discriminability. The parameter dsb12 measures the accuracy with which the response made on each trial is controlled by the sample stimulus presented on that trial. In a two-alternative multiple concurrent schedule with two signaled components, this parameter, combined with dbr12, determines the effective or perceived reinforcer ratio Ri1/Rj2 (j  =  1 or 2) to which the animal strictly matches its response ratio. Response ratios B11/B12 in the presence of S1 and B21/B22 in the presence of S2 are thus determined by:

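$$\frac{B_{11}}{B_{12}} = c\,\frac{R_{11} + \dfrac{R_{12}}{d_{br12}} + \dfrac{R_{21}}{d_{sb12}} + \dfrac{R_{22}}{d_{sb12}\,d_{br12}}}{R_{12} + \dfrac{R_{11}}{d_{br12}} + \dfrac{R_{22}}{d_{sb12}} + \dfrac{R_{21}}{d_{sb12}\,d_{br12}}} \tag{3}$$

and

$$\frac{B_{21}}{B_{22}} = c\,\frac{R_{21} + \dfrac{R_{22}}{d_{br12}} + \dfrac{R_{11}}{d_{sb12}} + \dfrac{R_{12}}{d_{sb12}\,d_{br12}}}{R_{22} + \dfrac{R_{21}}{d_{br12}} + \dfrac{R_{12}}{d_{sb12}} + \dfrac{R_{11}}{d_{sb12}\,d_{br12}}} \tag{4}$$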

The model can be applied to any number of responses and discriminative stimuli, but then requires dbr and dsb terms for every pairwise combination of responses and stimuli (Davison & Nevin, 1999).
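A minimal sketch of the two-stimulus, two-response case follows. It assumes the Davison and Nevin (1999) rule that each obtained reinforcer count contributes to a target cell after division by dsb12 if it was obtained in the other stimulus and by dbr12 if it was obtained for the other response. The function names and the reinforcer counts are hypothetical.

```python
import math

def effective_reinforcers(R, i, j, d_sb, d_br):
    """Effective reinforcers credited to response j in stimulus i.

    R[k][l] is the obtained reinforcer count for response l in stimulus k
    (two stimuli, two responses). Each count is divided by d_sb when it came
    from the other stimulus and by d_br when it came from the other response.
    """
    total = 0.0
    for k in range(2):
        for l in range(2):
            divisor = (d_sb if k != i else 1.0) * (d_br if l != j else 1.0)
            total += R[k][l] / divisor
    return total

def response_ratio(R, stimulus, d_sb, d_br, c=1.0):
    """Predicted B_i1 / B_i2 in the presence of the given stimulus (0 or 1)."""
    left = effective_reinforcers(R, stimulus, 0, d_sb, d_br)
    right = effective_reinforcers(R, stimulus, 1, d_sb, d_br)
    return c * left / right

# Hypothetical obtained reinforcers: S1 arranges 27:1, S2 arranges 1:27.
R = [[27.0, 1.0], [1.0, 27.0]]
perfect = response_ratio(R, 0, math.inf, math.inf)
no_stimulus_control = (response_ratio(R, 0, 1.0, 20.0),
                       response_ratio(R, 1, 1.0, 20.0))
no_reinforcer_control = response_ratio(R, 0, 5.0, 1.0)
```

With both parameters infinite the prediction is strict matching to the reinforcer ratio in that stimulus; with dsb12 = 1 the two stimuli produce the same response ratio; with dbr12 = 1 the response ratio equals the bias c.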

Krägeloh and Davison (2003) investigated the effects of discriminative stimuli on preference in a highly variable environment. Using a procedure developed by Davison and Baum (2000), sessions consisted of seven different components that each arranged one of the following concurrent VI VI reinforcer ratios: 27∶1, 9∶1, 3∶1, 1∶1, 1∶3, 1∶9, and 1∶27. The order of these components was determined randomly without replacement. Components were separated by a 10-s blackout period and were in effect until the pigeon had received 10 reinforcers. In some conditions, components were differentially signaled by red and yellow flashes on both response keys. The cycle length of these flashes was always 1.34 s, and the length of the flash duration of one color relative to the other was ordinally related to the component reinforcer ratios. When components were signaled, sensitivity to the component reinforcer ratio was around 0.5 before the first reinforcer in components when a 2-s changeover delay (COD; Herrnstein, 1961) was used, and slightly below 0.4 with no COD. With successive reinforcers in components, sensitivity increased to around 0.8 with a COD and to around 0.5 to 0.6 without a COD.

Krägeloh and Davison's (2003) study was mainly concerned with comparing performance in signaled components with performance in unsignaled components, with and without a COD. The different component reinforcer ratios were not varied sufficiently to allow a test of the contingency-discriminability model with the stimulus parameters (Davison & Nevin, 1999). One purpose of the present study was to provide such a variation in reinforcer ratios to investigate whether the model can accurately describe concurrent-schedule performance in a highly variable environment.

Another purpose of the present experiment was to investigate peak shift of preference in concurrent-schedule performance. In standard stimulus-control research, positive peak shift is said to occur when the maximum response rate in a generalization gradient is not at the stimulus that was associated with reinforcement during prior discrimination training, but at a stimulus further along the stimulus dimension, in the direction away from that previously associated with extinction or a lower reinforcer rate (e.g., Hanson, 1959).

Several studies have investigated the effects of differential discriminative stimuli on peak shift in concurrent schedules (e.g., Blough, 1973; Catania, Silverman, & Stubbs, 1974; Leigland, 1987; Winton & Beale, 1971). These studies trained pigeons on a concurrent discrimination procedure with subsequent generalization tests in extinction. Winton and Beale, for example, trained pigeons on a concurrent VI VI switching-key procedure where pecks on the switching key produced a change in the discriminative stimulus associated with the VI schedules that were available on the main response key. In the differential-training phase, two concurrent-schedule alternatives arranged different reinforcer rates and were differentially signaled by different line orientations on a blue background. In a subsequent generalization test in extinction, relative time allocation, relative response allocation, and absolute response rates were measured, and positive peak shift occurred for all three measures in that maximum levels were not obtained at the stimulus that had been associated with the richer schedule, but at a stimulus further along the line-orientation dimension, away from the stimulus that was associated with the leaner schedule during training. Negative peak shift also occurred: The lowest levels of the measures were obtained at a stimulus beyond the stimulus associated with the leaner schedule, away from the stimulus associated with the richer schedule.

Generalization gradients of concurrent-schedule preference were obtained by Migler and Millenson (1969). They trained two rats to press levers on a multiple concurrent VI Extinction (Ext) concurrent Ext VI schedule. Components were arranged pseudorandomly with a maximum of three identical components in sequence. Each component lasted until a reinforcer was obtained, and a press at an illuminated nose key then started the next component. One component of the multiple schedule arranged an auditory stimulus of 25 clicks per min during which reinforcement was available on the right lever on a VI 226-s schedule and no reinforcers were obtainable on the left lever. In the other component, the discriminative stimulus was 2.5 clicks per min, and the left lever provided reinforcement on a VI 30-s schedule with no reinforcers on the right lever. During the last 32 sessions of the experiment, stimulus generalization tests were conducted during probe trials in extinction. The discriminative stimulus during probe trials was one of eight click rates which included the two click rates from the two multiple-schedule components. One of the rats showed peak shift, with the most extreme preference for the right lever at 55 clicks per min.

The present experiment used Davison and Baum's (2000) procedure with seven components of signaled concurrent-schedule reinforcer ratios. These components were presented in random order (without replacement) within sessions, separated by 10-s blackouts. Components lasted until the pigeon had obtained 10 reinforcers, and the components were always signaled by differential red and yellow flashes on both response keys. A 2-s COD was arranged throughout, and the ratios of the red and yellow flash durations signalling the components were equally logarithmically spaced. Across conditions, we varied the reinforcer ratio associated with each stimulus to test the contingency-discriminability model (Davison & Nevin, 1999) and to investigate peak shift in preference in frequently changing concurrent schedules.

Method

Subjects

Five pigeons, numbered 112 to 116, were housed in individual cages and were maintained at 85% ± 15 g of their free-feeding body weights by postsession supplementary feeding of mixed grain at about 1000 hr every day. Water and grit were accessible at all times. The present experiment was a continuation of Krägeloh and Davison's (2003) study with the same pigeons. Pigeon 111 died early during the experiment, and its data are not reported.

Apparatus

The pigeons' home cages (375 mm high by 380 mm deep by 375 mm wide) served as experimental chambers. The top, floor, and front walls of each chamber were made of metal grids, and the remaining walls were constructed of sheet metal. Two wooden perches were situated at 90° and 75 mm above the floor. One perch ran in parallel 80 mm from the front wall, and the other perch ran in parallel 100 mm from the right wall, where the experimental panel was situated. The experimental panel consisted of three plastic response keys (20 mm in diameter) that were 114 mm apart center to center and 223 mm above perch level. The response keys could be transilluminated red or yellow by light-emitting diodes directly behind the keys. Only the left and center keys were used in this experiment and when illuminated, required a force exceeding 0.1 N to record an effective response. This was as in the study by Krägeloh and Davison (2003), who erroneously reported that, instead of the center key, the right response key was used. Situated on the right wall, 60 mm above perch level, was a 40 × 40 mm magazine aperture, behind which a hopper was mounted that could provide access to wheat when raised. During reinforcement, the keys were extinguished, the magazine aperture was illuminated, and the hopper was raised for 2.5 s. The pigeons' cages faced those of pigeons in other experiments. The room lights were illuminated at 0000 hr and were extinguished at 1600 hr. Experimental sessions started every day at 0100 hr, and were arranged in succession according to the pigeon number. No personnel entered the room while the experiments were being conducted. All experimental events were arranged and recorded by an IBM®-compatible PC running MED-PC® software located in a separate room.

Procedure

No pretraining was required because this experiment was a continuation of the study by Krägeloh and Davison (2003). Each session consisted of seven components that were arranged randomly without replacement and which terminated after the pigeon had obtained 10 reinforcers. Each component was followed by a 10-s blackout period before the next component started. Depending on the experimental condition, each component arranged one of the following concurrent VI VI reinforcer ratios: 27∶1, 9∶1, 3∶1, 1∶1, 1∶3, 1∶9, and 1∶27. Reinforcers were arranged by interrogating a probability gate set at .037 every 1 s, thus providing an overall reinforcer rate of 2.2 reinforcers per min, equivalent to VI 27 s. Reinforcers were scheduled dependently; once a reinforcer was set up on one alternative, no other reinforcer could be arranged until that reinforcer had been collected (Stubbs & Pliskoff, 1969). A 2-s COD arranged that a reinforcer could not be obtained for the first 2 s after the pigeon began pecking on an alternative after responding to the other alternative (Herrnstein, 1961). As in the study by Krägeloh and Davison, components were signaled by differential red and yellow flashes on both response keys, with the difference that in the present study the ratios of the red and yellow flash durations were equally logarithmically spaced. Figure 1 shows the durations of red and yellow flashes that signaled each component. For example, in Component 1, both response keys were lit red for 1.19 s and then yellow for 0.15 s. In Component 5, response keys were red for 0.45 s and then yellow for 0.89 s. The cycle length was always 1.34 s. These stimuli remained the same throughout the experiment.
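The scheduling arithmetic above can be sketched as follows. The gate probability and the 10-reinforcer component length come from the procedure; the rule for assigning an arranged reinforcer to a key (probability ratio/(ratio + 1) for the left key) and the assumption of immediate collection are simplifications introduced here for illustration.

```python
import random

GATE_PROBABILITY = 0.037   # probability of arranging a reinforcer each second

def overall_rate_per_min(p_per_second):
    """Expected reinforcers per minute when the gate is sampled every 1 s."""
    return p_per_second * 60.0

def mean_interval_s(p_per_second):
    """Mean time between arranged reinforcers (the equivalent VI value)."""
    return 1.0 / p_per_second

def arrange_component(left_to_right_ratio, n_reinforcers=10, seed=0):
    """Return (second, key) pairs for one component.

    Simplifying assumption: every arranged reinforcer is collected within the
    same second, so the dependent-scheduling hold never delays the next one.
    """
    rng = random.Random(seed)
    p_left = left_to_right_ratio / (left_to_right_ratio + 1.0)
    arranged, second = [], 0
    while len(arranged) < n_reinforcers:
        second += 1
        if rng.random() < GATE_PROBABILITY:
            key = "left" if rng.random() < p_left else "right"
            arranged.append((second, key))
    return arranged

component = arrange_component(27.0)   # a 27:1 component
```

A gate probability of .037 per second gives about 2.2 reinforcers per min and a mean interval of about 27 s, as stated in the text.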

Fig 1. Differential red and yellow flash durations arranged on both response keys for Components 1 to 7.
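The two reported examples (1.19 s red and 0.15 s yellow in Component 1; 0.45 s red and 0.89 s yellow in Component 5) are consistent with red-to-yellow duration ratios that halve from one component to the next, from 8∶1 to 1∶8. That spacing is an inference from those two examples, not a value stated in the text, and the sketch below should be read on that assumption.

```python
CYCLE_S = 1.34  # total red + yellow duration per cycle, from the procedure

def flash_durations(component, step_ratio=2.0):
    """Red and yellow durations (s) for Components 1 to 7.

    Assumes red:yellow = step_ratio ** (4 - component), i.e. 8:1 ... 1:8,
    which gives equal logarithmic spacing centred on Component 4.
    """
    red_to_yellow = step_ratio ** (4 - component)
    red = CYCLE_S * red_to_yellow / (red_to_yellow + 1.0)
    yellow = CYCLE_S - red
    return round(red, 2), round(yellow, 2)

durations = {k: flash_durations(k) for k in range(1, 8)}
```

Under this assumption the function reproduces both reported pairs of durations and gives equal red and yellow durations in Component 4.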


Across conditions, the component reinforcer ratios were varied (Table 1) so as to change the relationship between the stimulus dimension (relative red vs yellow flash duration) and the reinforcer ratios that it signaled. For instance, in Condition 19, Components 1 through 7 signaled reinforcer ratios of, in order, 1∶27, 1∶9, 1∶3, 1∶1, 3∶1, 9∶1, and 27∶1. This meant that the reinforcer ratio was monotonically related to the stimulus dimension, because longer red and shorter yellow flashes signaled more right than left reinforcers, while shorter red than yellow flashes signaled more left than right reinforcers. Condition 32 reversed this monotonic relationship between reinforcer ratio and stimulus dimension. Conditions 20 and 26 arranged 1∶1 reinforcer ratios in all components, so that the reinforcer ratio was unrelated to the stimulus dimension. The other conditions arranged more complex relationships between reinforcer ratio and stimulus dimension. The obtained reinforcer ratios were close to those programmed.

Table 1. Sequence of experimental conditions and the arranged reinforcer ratio (left∶right) in each component. The stimuli accompanying Components 1–7 were as shown in Figure 1. The overall probability of reinforcement was .037/s throughout.

Condition   Component number
            1       2       3       4       5       6       7
19          1∶27    1∶9     1∶3     1∶1     3∶1     9∶1     27∶1
20          1∶1     1∶1     1∶1     1∶1     1∶1     1∶1     1∶1
21          1∶1     1∶1     9∶1     1∶1     1∶9     1∶1     1∶1
22          1∶1     1∶1     1∶27    1∶1     27∶1    1∶1     1∶1
23          1∶1     1∶1     27∶1    1∶1     1∶27    1∶1     1∶1
24          1∶1     1∶27    1∶1     1∶1     1∶1     27∶1    1∶1
25          1∶1     27∶1    27∶1    1∶1     1∶27    1∶27    1∶1
26          1∶1     1∶1     1∶1     1∶1     1∶1     1∶1     1∶1
27          1∶27    9∶1     9∶1     9∶1     9∶1     9∶1     9∶1
28          1∶9     1∶9     1∶9     1∶9     1∶9     1∶9     27∶1
29          1∶27    1∶1     1∶1     1∶1     1∶1     1∶1     27∶1
30          27∶1    1∶1     1∶1     1∶1     1∶1     1∶1     1∶27
31          1∶1     1∶27    1∶27    1∶1     27∶1    27∶1    1∶1
32          27∶1    9∶1     3∶1     1∶1     1∶3     1∶9     1∶27

Daily sessions lasted until either all seven 10-reinforcer components had been completed or until 45 min had elapsed, whichever came first. In only 9 of the 2450 individual sessions used for data analyses were fewer than 68 reinforcers delivered. All experimental events were arranged by a remote computer running MED-PC® software, and the times at which they occurred were recorded at a resolution of 10 ms. Each condition was in effect for 50 daily sessions.

Results

The data from the last 35 sessions of each condition were analyzed. The data were too extensive to be shown in tabular form, and can be obtained from any of the authors on receipt of a blank CD-ROM.

The first analyses were carried out to determine whether mean results were representative of individual results. Figure 2 shows log response ratios in Condition 19 as a function of successive reinforcers in components for each individual pigeon and also for these log response ratios averaged across pigeons. Overall, log response ratios, including those measured before the first reinforcer had been obtained in each component, were ordinally related to the component reinforcer ratios. We examined whether performance was stable after the first 15 sessions. Analyses showed there were no systematic differences in performance between Sessions 16 to 30 and 31 to 50. With increasing numbers of reinforcers in components, log response ratios approached the arranged log reinforcer ratios and thus became more extreme for all components except Component 4, which had a 1∶1 reinforcer ratio. For the mean data, the log response ratios in Component 4 were around −0.10, thus reflecting a slight right-key bias (see also Krägeloh & Davison, 2003). The data of the individual pigeons were all similar and were well represented by the means. Henceforth, to save space, means will be used in most of the analyses reported here.
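The tally behind this kind of analysis can be sketched as below: pecks are counted separately for the period before the first reinforcer and for each period following a successive reinforcer, and a log ratio is taken for each period. The event encoding ("L", "R", "SR") and the function name are hypothetical, and the example stream is made up.

```python
import math

def log_ratios_by_reinforcer(events):
    """Log10 (left/right) response ratio in each inter-reinforcer period.

    `events` is a hypothetical encoding of one component: "L" and "R" are
    pecks and "SR" marks a reinforcer delivery. Index 0 of the result covers
    responding before the first reinforcer, index 1 after the first, etc.
    Periods with zero pecks on either key yield None (log ratio undefined).
    """
    counts = [[0, 0]]
    for event in events:
        if event == "SR":
            counts.append([0, 0])
        elif event == "L":
            counts[-1][0] += 1
        elif event == "R":
            counts[-1][1] += 1
    return [math.log10(left / right) if left and right else None
            for left, right in counts]

events = ["L", "R", "R", "SR", "R", "R", "R", "L", "SR", "L", "L"]
ratios = log_ratios_by_reinforcer(events)
```

Pooling counts over sessions before taking logs would reduce the number of undefined periods; the text does not state how pooling was done, so that step is left out of the sketch.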

Fig 2. Log (left/right) response ratios as a function of successive reinforcers in individual components for Condition 19 for Pigeons 112 to 116 and for mean data.


Figures 3 and 4 show the same analyses for the means of individual pigeons for Conditions 20 to 26 and 27 to 32, respectively. Usually, log response ratios followed component reinforcer ratios, although there was clearly some generalization between components: In Condition 29, for example, Components 1 and 7 arranged a 1∶27 and a 27∶1 reinforcer ratio respectively, while all other components had 1∶1 reinforcer ratios. In the 1∶1 reinforcer-ratio components, stimulus generalization was evident with log response ratios ordinally related to the degree of similarity between the discriminative stimuli in that component and in Components 1 and 7 (Figure 4). Thus, in Component 2, preference was more toward the right key, and in Component 6 preference was more toward the left key, than in any other 1∶1 reinforcer-ratio component.

Fig 3. Log (left/right) response ratios as a function of successive reinforcers in individual components for Conditions 20 to 26 for mean data.


Fig 4. Log (left/right) response ratios as a function of successive reinforcers in individual components for Conditions 27 to 32 for mean data.


Carryover effects of stimulus control from previous conditions were also detectable. In Condition 20 (Figure 3), all components arranged a 1∶1 reinforcer ratio. However, component log response ratios were in the direction of the reinforcer ratios in the same components as in Condition 19 (Figure 2). In Component 1, which had signaled a 1∶27 reinforcer ratio in Condition 19, preference was more biased to the right key than in Component 7, in which the stimulus had previously signaled a 27∶1 component. No such carryover effect was detectable from Condition 25 to 26 (Figure 3). Unlike Condition 19, Condition 25 did not arrange component reinforcer ratios ordinally, but arranged two pairs of 27∶1 and 1∶27 reinforcer ratios (Table 1). In Condition 26, where all components had a 1∶1 reinforcer ratio, log response ratios were similar in all components.

The next analyses investigated stimulus-generalization gradients, plotted as log response ratios, as a function of component number or stimulus dimension. Note that the component flash-duration ratios were equally logarithmically spaced. Figures 5 to 7 show, for Conditions 21 to 23 respectively, log response ratios before the first component reinforcer and after the first, third, fifth, seventh, and ninth successive component reinforcer, for individual pigeons and for the mean data. Log response ratios before the first reinforcer in a component provide a measure of the extent of control by component signals that is unaffected by the occurrence of reinforcers and their effects on preference. However, Krägeloh and Davison (2003) observed carryover effects from the previous component reinforcer ratio on preference early in the next component even when components were signaled. In Condition 21 of the present experiment, Components 3 and 5 arranged 9∶1 and 1∶9 reinforcer ratios, respectively, and the remaining components arranged a 1∶1 reinforcer ratio. In Component 3, therefore, carryover effects might be expected to shift preference either towards indifference (if the previous component was a 1∶1 component) or towards the right key (if the previous component was 1∶9). In 1∶1 components, in contrast, carryover effects would on average be at indifference. To account for such potential differential carryover effects, stimulus-generalization gradients also were plotted for log response ratios before the first reinforcer after discarding data from Components 3 and 5 when the immediately prior component was Component 5 or 3, respectively (shown with thicker lines than the other gradients and referred to in the figures as “no carryover”). This adjustment had almost no effect on the results; that is, the gradients showing behavior before the first reinforcer with (gray circles) and without (thick lines) potential carryover were almost identical in Figures 5 through 7.
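The exclusion rule used for the "no carryover" gradients can be sketched as a filter over the order in which components were presented within a session. The function name and the example session order are hypothetical.

```python
def keep_for_no_carryover(component_order, exclusions=None):
    """Flag which component presentations enter the "no carryover" gradient.

    `component_order` lists component numbers in the order presented within a
    session. A presentation is dropped when it is Component 3 immediately
    preceded by Component 5, or Component 5 immediately preceded by 3.
    """
    if exclusions is None:
        exclusions = {(5, 3), (3, 5)}   # (previous, current) pairs to discard
    keep = []
    previous = None
    for current in component_order:
        keep.append((previous, current) not in exclusions)
        previous = current
    return keep

# Hypothetical session order (each component once, random without replacement).
order = [2, 5, 3, 7, 1, 4, 6]
flags = keep_for_no_carryover(order)
```

In this example only the presentation of Component 3, which directly follows Component 5, is dropped; every other presentation is retained.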

Fig 5. Log (left/right) response ratios as a function of component numbers by Successive Reinforcers 1, 3, 5, 7, and 9 for Pigeons 112 to 116 and mean data in Condition 21. Also shown are response ratios before the first component reinforcer, as well as when data from Components 3 and 5 were discarded when preceded by Components 5 and 3, respectively.

Fig 6. Log (left/right) response ratios as a function of component numbers by Successive Reinforcers 1, 3, 5, 7, and 9 for Pigeons 112 to 116 and mean data in Condition 22. Also shown are response ratios before the first component reinforcer, as well as when data from Components 3 and 5 were discarded when preceded by Components 5 and 3, respectively.

Fig 7. Log (left/right) response ratios as a function of component numbers by Successive Reinforcers 1, 3, 5, 7, and 9 for Pigeons 112 to 116 and mean data in Condition 23. Also shown are response ratios before the first component reinforcer, as well as when data from Components 3 and 5 were discarded when preceded by Components 5 and 3, respectively.

Figures 5 through 7 often show peak shift in preference before the first reinforcer in each component, in that preference was not most extreme in the components that arranged unequal reinforcer ratios (9∶1, 1∶9, 27∶1, or 1∶27), but in a 1∶1 component at either end of the stimulus dimension. Discarding data from Components 3 and 5 when immediately preceded by Components 5 and 3 did not alter the finding of peak shift, indicating that nondifferential carryover was not the determining factor in these results. With five pigeons and two ends of the stimulus dimension, peak shift occurred on 8 out of 10 occasions in Condition 21 (binomial p  =  .055; Figure 5) and 9 out of 10 occasions in Conditions 22 and 23 (binomial p  =  .011; Figures 6 and 7, respectively). In some instances, such as for Pigeon 116 in Condition 21 (Figure 5) and Pigeon 113 in Condition 22 (Figure 6), log response ratios were distributed across the stimulus dimension in a strictly descending and ascending order, respectively, although the relationship between stimuli and reinforcer ratio was nonmonotonic.
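The reported binomial probabilities are reproduced by an upper-tail calculation with 10 occasions and a chance probability of .5 per occasion. The chance value and the one-tailed form are assumptions here; the text does not state them, but they yield the reported values.

```python
from math import comb

def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))

p_condition_21 = binomial_upper_tail(8, 10)        # 56/1024
p_conditions_22_23 = binomial_upper_tail(9, 10)    # 11/1024
```

Rounded to three decimals these are .055 and .011, matching the values given for Condition 21 and for Conditions 22 and 23.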

With increasing numbers of successive reinforcers, log response ratios increasingly approximated the arranged component log reinforcer ratios (Figures 5 through 7), with a strong relationship between response and reinforcer ratios emerging after about three successive reinforcers. For Pigeon 113 in Condition 21 (Figure 5), peak shift disappeared after the fifth successive reinforcer only to re-emerge after the seventh successive reinforcer. For all other pigeons, peak shift usually disappeared by the third reinforcer in a component.
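Whether a gradient shows peak shift at each end of the stimulus dimension can be checked mechanically. The sketch below treats peak shift as present when some component beyond an unequal-ratio component, toward the nearer end of the dimension, shows more extreme preference than that component itself. The function name and the gradient values are hypothetical.

```python
def peak_shift_at_ends(gradient, left_rich, right_rich):
    """Check each end of the stimulus dimension for peak shift.

    `gradient` holds log (left/right) response ratios for Components 1..7
    (index 0 = Component 1). `left_rich` and `right_rich` are the 0-based
    positions of the components arranging the left-favouring and
    right-favouring unequal ratios. Returns (shift_for_left, shift_for_right).
    """
    if left_rich < right_rich:
        beyond_left = gradient[:left_rich]
        beyond_right = gradient[right_rich + 1:]
    else:
        beyond_left = gradient[left_rich + 1:]
        beyond_right = gradient[:right_rich]
    shift_left = bool(beyond_left) and max(beyond_left) > gradient[left_rich]
    shift_right = bool(beyond_right) and min(beyond_right) < gradient[right_rich]
    return shift_left, shift_right

# Hypothetical gradient for a Condition 21 layout (9:1 in Component 3, 1:9 in 5).
gradient = [0.30, 0.35, 0.25, -0.05, -0.30, -0.40, -0.35]
result = peak_shift_at_ends(gradient, left_rich=2, right_rich=4)
```

Applying the same check to the gradient after each successive reinforcer would show the point at which peak shift disappears for a given pigeon.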

The contingency-discriminability model (Equations 3 and 4) was extended to accommodate seven discriminative stimuli and was fitted to the present data using eight free parameters: dsb12, dsb23, dsb34, dsb45, dsb56, dsb67, dbr12, and c. The estimates of dsbij for all other combinations of stimuli were obtained by multiplying the above estimates, so that, for example, dsb13 is the product of dsb12 and dsb23. This makes the assumption, discussed by Davison and Nevin (1999), that log discriminability estimates are additive within a stimulus dimension. In this fit, we used only data collected after three reinforcers had been obtained in a component, because the data from this point onward appeared stable (Figures 5 through 7). Conditions 20 and 26 were not included, because these conditions arranged 1∶1 reinforcer ratios in all components and therefore provided no additional systematic variance for the fit. A matrix of the obtained total numbers of reinforcers, Rxy, was created for each pigeon and for their mean, and the parameters were fitted to generate matrices of effective numbers of reinforcers, R′xy. Log Bi1/Bj2 response ratios were predicted for all 49 combinations in all conditions by adding the predicted log effective R′i1/R′j2 reinforcer ratio to the inherent bias. For the combination of Stimuli 3 and 5 (B31 and B52), for example:

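$$\log\frac{B_{31}}{B_{52}} = \log\frac{R'_{31}}{R'_{52}} + \log c \tag{5}$$

where

$$R'_{31} = \sum_{k=1}^{7} \frac{1}{d_{sb3k}}\left(R_{k1} + \frac{R_{k2}}{d_{br12}}\right) \tag{6}$$

and

$$R'_{52} = \sum_{k=1}^{7} \frac{1}{d_{sb5k}}\left(R_{k2} + \frac{R_{k1}}{d_{br12}}\right), \tag{7}$$

with $d_{sbkk} = 1$ and primes denoting effective numbers of reinforcers.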
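The seven-stimulus extension can be sketched in a few lines. The multiplication of adjacent dsb estimates follows the text; the summation used for effective reinforcers (each obtained count divided by the stimulus discriminability separating it from the target stimulus, and by dbr12 if it was obtained for the other response) is an assumed reading of the Davison and Nevin (1999) model. The function names and the reinforcer counts are hypothetical; the parameter values are the mean-data estimates reported in Table 2.

```python
import math

def stimulus_discriminabilities(adjacent):
    """Full d_sb matrix from adjacent-pair estimates (d_sb12 ... d_sb67).

    Non-adjacent values are products of the adjacent values between the two
    stimuli; a stimulus compared with itself has d_sb = 1.
    """
    n = len(adjacent) + 1
    return [[math.prod(adjacent[min(i, j):max(i, j)]) for j in range(n)]
            for i in range(n)]

def effective(R, d_sb, d_br, i, j):
    """Effective reinforcers for response j (0 = left, 1 = right) in stimulus i."""
    return sum(R[k][l] / (d_sb[i][k] * (1.0 if l == j else d_br))
               for k in range(len(R)) for l in range(2))

def predicted_log_ratio(R, d_sb, d_br, log_c, i, j):
    """Predicted log10(B_i1 / B_j2) for stimuli i and j (0-based)."""
    return math.log10(effective(R, d_sb, d_br, i, 0) /
                      effective(R, d_sb, d_br, j, 1)) + log_c

# Adjacent d_sb estimates and d_br12 for the mean data (Table 2).
d_sb = stimulus_discriminabilities([3.07, 4.18, 4.39, 3.26, 5.76, 3.07])
# Hypothetical obtained reinforcers (left, right) per component, Condition 19 layout.
R = [[1, 27], [3, 27], [7, 21], [14, 14], [21, 7], [27, 3], [27, 1]]
prediction = predicted_log_ratio(R, d_sb, 44.62, -0.06, i=2, j=4)  # B31 / B52
```

Setting every discriminability to infinity reduces the prediction to the obtained log reinforcer ratio plus bias, and setting every one to 1.0 reduces it to the bias alone, which gives two quick sanity checks on any implementation.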

The Quattro-Pro® Optimizer was used to minimize the mean error sums of squares of the differences between the obtained and predicted log response ratios (see Appendix B of Davison & Nevin, 1999, for a detailed description on how to fit the model). The results for individual pigeons and mean data are shown in Table 2.
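The quantity minimized and the fit statistic can be written compactly. The sketch assumes that %VAC is computed as 100 × (1 − SS_error/SS_total), a common definition that the text does not spell out; the function names and the example values are hypothetical.

```python
def mean_squared_error(obtained, predicted):
    """Mean of squared differences between obtained and predicted log ratios."""
    return sum((o - p) ** 2 for o, p in zip(obtained, predicted)) / len(obtained)

def percent_variance_accounted_for(obtained, predicted):
    """%VAC computed as 100 * (1 - SS_error / SS_total); an assumed definition."""
    mean_obtained = sum(obtained) / len(obtained)
    ss_total = sum((o - mean_obtained) ** 2 for o in obtained)
    ss_error = sum((o - p) ** 2 for o, p in zip(obtained, predicted))
    return 100.0 * (1.0 - ss_error / ss_total)

# Hypothetical obtained and predicted log response ratios.
obtained = [-1.0, -0.5, 0.0, 0.5, 1.0]
predicted = [-0.9, -0.6, 0.0, 0.6, 0.9]
mse = mean_squared_error(obtained, predicted)
vac = percent_variance_accounted_for(obtained, predicted)
```

Any general-purpose optimizer could stand in for the spreadsheet optimizer by adjusting the eight parameters to minimize the first function.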

Table 2. Least-squares estimates of parameters dsb12, dsb23, dsb34, dsb45, dsb56, dsb67, dbr12, and log c obtained fitting Equations 3 and 4 to individual and mean data, and the percentage of variance accounted for (%VAC) by each fit.

Parameter   Pigeon
            112      113           114      115      116      Mean
dsb12       2.23     4.14          3.14     2.75     4.74     3.07
dsb23       5.83     3.47          4.13     4.57     4.58     4.18
dsb34       3.91     4.40          5.06     4.27     4.48     4.39
dsb45       3.06     4.27          4.36     3.28     2.85     3.26
dsb56       6.11     3.85          6.33     8.68     4.96     5.76
dsb67       3.11     2.44          2.94     5.01     3.21     3.07
dbr12       35.76    1.09 × 10^5   36.50    17.07    72.31    44.62
log c       −0.13    −0.25         −0.04    −0.02    0.02     −0.06
%VAC        94       92            97       95       95       97

Overall, the model fitted the data very well, with variance accounted for (VAC) of at least 92% for all individual fits. All dsb estimates ranged between 2.23 and 8.68 and had both a mean and a median of 4.16. For all pigeons except for Pigeon 113, dsb56 estimates were larger than any other dsb estimate. The dbr12 estimate for Pigeon 113 was almost 110,000, and thus substantially larger than those of the other pigeons, whose estimates ranged between 17.07 and 72.31. Log c was negative for all fits, except for Pigeon 116, indicating an overall right-key bias.

The results of the fits for the mean data are shown in Figures 8 (Conditions 19 to 25) and 9 (Conditions 27 to 32). These figures show log obtained reinforcer ratios and obtained and predicted log response ratios plotted as a function of component number (i.e., across the stimulus dimension). Overall, log response ratios were usually less extreme than log obtained reinforcer ratios, and predicted log response ratios approximated obtained log response ratios closely. In the components of Conditions 21 to 24 that did not have a 1∶1 reinforcer ratio, the model predicted slightly more extreme preference than obtained (Figure 8). In Conditions 27 and 28, the model predicted more extreme preference in the 9∶1 and 1∶9 components, but underpredicted preference in the 1∶27 and 27∶1 components, respectively (Figure 9).

Fig 8. Log (left/right) obtained reinforcer ratios (filled squares), obtained log (left/right) response ratios (unfilled triangles) and predicted log (left/right) response ratios (unfilled circles) as a function of component numbers for mean data for Conditions 19 to 25.


Fig 9. Log (left/right) obtained reinforcer ratios (filled squares), obtained log (left/right) response ratios (unfilled triangles) and predicted log (left/right) response ratios (unfilled circles) as a function of component numbers for mean data for Conditions 26 to 32.


The following analysis investigated local preference changes after reinforcers. Figure 10 shows, for selected conditions and pigeons or for mean data, log response ratios as a function of successive 5-s time bins after left- (left panel) or right-key (right panel) reinforcers. When a reinforcer was delivered, no further responses were recorded for that particular time bin. Later time bins, therefore, contained fewer response numbers, and their log-ratio estimates became increasingly variable. Where no data are shown in Figure 10, no responses occurred during that particular time bin for more than one pigeon, or preference was exclusively to one alternative. Preference typically was strongly biased to the just-reinforced alternative, but moved towards the component reinforcer ratios and appeared to stabilize after about 30 s since reinforcers (Figure 10). These preference pulses (Davison & Baum, 2002) usually were larger on the richer alternative. In Condition 29, for instance, preference pulses following right-key reinforcers were largest in Component 1 (1∶27) and smallest in Component 7 (27∶1). The size of the pulses was ordinally related to the reinforcer ratios (e.g., Condition 32), and appeared to generalize across the stimulus dimension. In Conditions 29 and 30, for example, the size of the preference pulses in the 1∶1 reinforcer-ratio components was usually, but not always strictly, related to the similarity of the stimulus in that component to those that signaled the 1∶27 and 27∶1 components (Figure 10).
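The binning behind a preference-pulse analysis can be sketched as follows. The input format (time since the last reinforcer paired with the key pecked, already truncated at the next reinforcer) and the function name are hypothetical, and the example pecks are made up.

```python
import math

def preference_pulse(post_reinforcer_pecks, bin_s=5.0, n_bins=6):
    """Log10 (left/right) response ratio in successive bins after a reinforcer.

    `post_reinforcer_pecks` is a hypothetical list of (seconds_since_reinforcer,
    key) tuples pooled over many reinforcers of one type, with key "L" or "R".
    Bins with no pecks on one key return None, matching the missing points
    described for exclusive preference or empty bins.
    """
    counts = [[0, 0] for _ in range(n_bins)]
    for t, key in post_reinforcer_pecks:
        index = int(t // bin_s)
        if 0 <= index < n_bins:
            counts[index][0 if key == "L" else 1] += 1
    return [math.log10(l / r) if l and r else None for l, r in counts]

pecks = [(0.5, "L"), (1.2, "L"), (3.0, "L"), (4.1, "R"),
         (6.0, "L"), (7.5, "R"), (12.0, "R")]
pulse = preference_pulse(pecks, n_bins=3)
```

Running the function separately on pecks that follow left-key and right-key reinforcers gives the two sets of curves of the kind plotted in the left and right panels.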

Fig 10. Log (left/right) response ratios as a function of time since reinforcement (5-s time bins) by component number for selected pigeons or their means, and selected conditions.


The left panel shows preference after left-key reinforcers, and the right panel after right-key reinforcers.
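The preference-pulse analysis described above can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis code: the function name `preference_pulse`, the input format (pairs of time since the last reinforcer and key), and the example responses are all assumptions made for the sketch. Bins in which either key received no responses are returned as `None`, mirroring the convention that no data point is shown when preference was exclusive or no responses occurred.

```python
import math

def preference_pulse(responses, bin_width=5.0, n_bins=8):
    """Log10 (left/right) response ratio in successive time bins.

    responses: iterable of (seconds_since_last_reinforcer, key),
               where key is "L" or "R" (hypothetical input format).
    Returns a list with one entry per bin; None where the log ratio
    is undefined because one or both keys had no responses.
    """
    counts = [[0, 0] for _ in range(n_bins)]
    for t, key in responses:
        b = int(t // bin_width)
        if 0 <= b < n_bins:
            counts[b][0 if key == "L" else 1] += 1
    return [
        math.log10(left / right) if left > 0 and right > 0 else None
        for left, right in counts
    ]

# Hypothetical responses recorded after a left-key reinforcer.
example = [(0.5, "L"), (1.2, "L"), (3.0, "L"), (4.1, "R"),
           (6.0, "L"), (7.5, "R"), (12.0, "R")]
pulse = preference_pulse(example, bin_width=5.0, n_bins=3)
```

In this made-up example the first bin holds three left and one right response, the second bin one of each, and the third bin only a right response, so the third entry is undefined.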

Discussion

The results of the present study confirmed the findings of local preference changes reported by Krägeloh and Davison (2003). Immediately after reinforcement, there was a preference pulse, or period of bias in responding towards the just-reinforced alternative. After about 30 s, preference typically stabilized at a level that was controlled by the component reinforcer ratio (Figure 10). Generalization of preference, detected using more extended analyses such as overall preference as a function of successive reinforcers, also was detectable at this local level. In Condition 29, for example, Component 1 arranged a 1∶27 reinforcer ratio, Component 7 a 27∶1 reinforcer ratio, and remaining components all arranged a 1∶1 reinforcer ratio. Overall preference after successive reinforcers in these intermediate components was ordinally related to the similarity of the particular component stimulus to those that signaled the extreme reinforcer ratios (Figure 4). The preference-pulses analyses showed the same effect: The size of the preference pulses in the 1∶1 components was ordinally related to the degree of similarity of the stimuli to those of Components 1 and 7.

Clear control by component stimuli was evident early in components in that log response ratios were ordinally related to the signaled component reinforcer ratios before the delivery of the first component reinforcer (Figures 2 to 4). In the present study, as well as in that reported by Krägeloh and Davison (2003), preference in signaled components became more extreme with successive reinforcers. This indicates the presence of a reinforcer effect in addition to the stimulus effect. Krägeloh and Davison showed a lack of such control before the first reinforcer when components were nondifferentially signaled. In their experiment, component reinforcer ratios always ranged from 1∶27 to 27∶1. When components were nondifferentially signaled and a 2-s COD was arranged, overall sensitivity to reinforcement reached around 0.60, as compared to 0.90 when components were differentially signaled.

It might at first sight seem surprising that this training procedure produced less than complete stimulus control, and that successive reinforcers within components for the most part controlled behavior more strongly. That is, given the extended training in each condition, why were the gradients describing behavior before the first reinforcer in each component not more similar to those obtained later in components? We might have expected that the relationship between the stimulus dimension and reinforcer ratio would have been well learned after 50 daily sessions, and would therefore exert strong control. However, it needs to be remembered that our procedure was markedly different from previous experiments that have demonstrated stimulus control in concurrent schedules (e.g., Alsop & Davison, 1991; Miller et al., 1980). In those experiments, two discriminative stimuli signaled the two response alternatives in a switching-key concurrent schedule. In the present experiment, as in the study by Krägeloh and Davison (2003), the two response alternatives were the left and right keys, and so were easily discriminated. Instead, the discriminative stimuli signaled the reinforcer ratio on these alternatives, in a seven-component multiple-concurrent schedule. That is, strong stimulus control required the discrimination of seven different stimuli. When the relationship between the stimulus dimension and the reinforcer ratio was monotonic (Conditions 19 and 32), the dimension did acquire strong control, as seen by the similarity between behavior before the first reinforcer in a component and later in components (see Figure 2). When the relationship between stimuli and reinforcement was more complex, stimulus control was much weaker, but nevertheless detectable (Figures 3 through 7).

Figure 11 shows analyses of the present results using the generalized matching law (Equation 1). Data from all conditions were included in the least-squares linear regression. As in the analyses shown in Figures 8 and 9, data before the third reinforcer were discarded. If preference differences across components were solely attributable to reinforcer effects, log response ratios would be described by a linear function of log reinforcer ratios. Conditions 19 and 32, which both arranged reinforcer ratios ranging from 1∶27 to 27∶1, are shown using unfilled symbols to distinguish them from the other experimental conditions. The data from these conditions deviated clearly from the fitted line. The variance accounted for was 81% and thus considerably lower than typically obtained in concurrent-schedule procedures (e.g., Baum, 1979; Taylor & Davison, 1983; Wearden & Burgess, 1982).

Fig 11. Log (left/right) response ratios as a function of log (left/right) reinforcer ratios for mean data.


Data from Condition 19 are shown by unfilled circles and data from Condition 32 by unfilled squares, whereas data from Conditions 20 to 31 are shown by filled circles. The straight line is the line of best fit from a least-squares linear regression.
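A generalized-matching analysis of this kind (Equation 1) amounts to an ordinary least-squares regression of log response ratios on log reinforcer ratios. The sketch below shows one way to compute the slope a, the intercept log c, and the percentage of variance accounted for. The function name and the example data are hypothetical and are not taken from the present experiment.

```python
import math

def fit_generalized_matching(log_reinforcer_ratios, log_response_ratios):
    """Least-squares fit of log(B1/B2) = a * log(R1/R2) + log c.

    Returns (a, log_c, vac), where vac is the percentage of
    variance accounted for by the fitted line.
    """
    xs, ys = list(log_reinforcer_ratios), list(log_response_ratios)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                      # sensitivity to reinforcer ratio
    log_c = mean_y - a * mean_x        # bias
    ss_res = sum((y - (a * x + log_c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    vac = 100.0 * (1.0 - ss_res / ss_tot)
    return a, log_c, vac

# Hypothetical, noise-free data: sensitivity 0.8 and bias -0.05
# across the seven arranged ratios from 1:27 to 27:1.
log_r = [math.log10(r) for r in (1/27, 1/9, 1/3, 1, 3, 9, 27)]
log_b = [0.8 * x - 0.05 for x in log_r]
a, log_c, vac = fit_generalized_matching(log_r, log_b)
```

With real data, systematic departures of particular conditions from the fitted line lower the variance accounted for, which is the pattern the analysis above reports.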

The contingency-discriminability model (Equations 3 and 4) provided an accurate fit to the present results with VAC values above 92% for individual pigeons and the mean data (Table 2). Across these 12 conditions, the same component stimuli were arranged, but the combinations of arranged reinforcer ratios associated with the stimuli were varied. Conditions 19 and 32, for example, had ordinal arrangements from 27∶1 through to 1∶27, Condition 27 arranged one 1∶27 component and six 9∶1 components, and Condition 25 arranged two pairs of 27∶1 and 1∶27 components with the remaining components having a 1∶1 reinforcer ratio. Such a variety of combinations provided a stringent test of the fit of Equations 3 and 4.

In most instances, the predicted measures closely approximated the obtained ones (Figures 8 and 9), although there was some systematic overprediction. In Conditions 21 to 23, in which five components arranged a 1∶1 reinforcer ratio and two components unequal reinforcer ratios, the model predicted more extreme preference in the latter components than obtained (Figure 8). In Conditions 27 (1∶27, and six components of 9∶1) and 28 (27∶1, and six components of 1∶9), the model also overpredicted the log response ratios in the 9∶1 and 1∶9 components (Figure 9). In the remaining conditions, however, the predicted log response ratios were very close to those obtained.

The estimates of dbr12 ranged from 17.07 to almost 110,000, which is equivalent to generalized-matching sensitivity between 0.9 and 1.0 (Davison & Jenkins, 1985). Because the ratios of the differential flash lengths on the response keys were equally logarithmically spaced, the dsbij estimates for each pair of stimuli i and j were all expected to be similar. Identical estimates, however, are not necessarily to be expected, since equal distances along a stimulus dimension are not always equally discriminable (see Blough, 1961), and, of course, log flash-duration ratio may not be the effective stimulus dimension. Despite our attempt to arrange a set of stimuli that were equally spaced along a dimension, it is clear that some pairs of adjacent stimuli were more easily discriminated than others, and that these differences in discriminability were well captured by the model. A repeated-measures ANOVA showed that estimates of dsbij for different stimulus pairs did differ significantly (F(5, 20) = 4.38, p < .05). For comparison, Table 3 shows the results of the model fit with three free parameters, when it is assumed that all dsb values for pairs of adjacent values are identical. Restricting the number of free parameters resulted in slight changes in the values of dbr12 and log c, as expected. For each fit, values of the Akaike Information Criterion were smaller (more negative) when eight free parameters were used, which, given that the differences were larger than 10, means that the eight-parameter model is the better descriptor of the present data (Burnham & Anderson, 2002). Certainly, an even better description of the data could have been obtained with, say, five dsb values, but the logic of this model (Equations 5 to 7) requires that the dsb values for each pairwise set of discriminative stimuli be different. Our selection of stimuli was intended to equalize pairwise dsb values by selecting equal logarithmic steps in flash-duration ratios. Evidently the equalization failed, thereby fortuitously supporting the general eight-parameter model over the simplified three-parameter model.

Table 3. Least-squares estimates of parameters dsb, dbr12, and log c obtained by fitting Equations 3 and 4 to individual and mean data, and the percentage of variance accounted for (%VAC) by each fit. Also shown are the values of the Akaike Information Criterion for this model fit with 3 free parameters (AIC3) and those of the fit shown in Table 2 with 8 free parameters (AIC8).

Pigeon   112      113           114      115      116      Mean
dsb      4.01     3.85          4.82     4.87     3.75     4.00
dbr12    23.67    1.21 × 10^6   22.22    13.34    289.04   32.29
log c    −0.14    −0.25         −0.05    −0.02    0.03     −0.06
%VAC     91       91            96       93       94       97
AIC3     −1059    −962          −1222    −1131    −1112    −1304
AIC8     −1100    −999          −1283    −1207    −1146    −1341
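The model comparison can be checked directly from the AIC values in Table 3. The short sketch below computes, for each pigeon and for the mean data, the difference between the three-parameter and eight-parameter AIC values and tests it against the threshold of 10 used in the text. The variable names are arbitrary; the numbers are those reported in the table.

```python
# AIC values transcribed from Table 3.
pigeons = ["112", "113", "114", "115", "116", "Mean"]
aic3 = [-1059, -962, -1222, -1131, -1112, -1304]   # 3 free parameters
aic8 = [-1100, -999, -1283, -1207, -1146, -1341]   # 8 free parameters

# Positive differences mean the eight-parameter fit has the lower AIC.
delta = {p: a3 - a8 for p, a3, a8 in zip(pigeons, aic3, aic8)}

# Threshold of 10 as used in the text (Burnham & Anderson, 2002).
favors_eight = {p: d > 10 for p, d in delta.items()}

for p in pigeons:
    print(f"{p}: AIC3 - AIC8 = {delta[p]}, exceeds 10: {favors_eight[p]}")
```

The differences range from 34 to 76, so every fit clears the threshold by a wide margin.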

It might seem that the generalized-matching analysis shown in Figure 11 is naïve, and that a fairer comparison of the two models would be to allow sensitivity values to differ between different stimulus pairs. That is, the parameter a in Equation 1 could measure the effect of stimulus disparity on choice in the same way as does dsbij in Equations 3 and 4 (e.g., Alsop & Davison, 1991; Miller et al., 1980). However, such an approach would require a separate value of a for every pair of stimuli. This is because there is no obvious way to combine values of a from pairs of adjacent stimuli to predict a value for stimuli further apart on the dimension. For example, if a12 = 0.6, and a23 = 0.7, we cannot predict a value for a13. By contrast, the contingency-discriminability model explicitly assumes that dsb13 = dsb12 × dsb23 (Davison & Nevin, 1999). This means that we can predict the effect of arranging stimuli chosen from anywhere on the dimension from a limited set of estimates of discriminability. That is, the contingency-discriminability model offers a far more parsimonious and efficient description of the data than does the generalized matching law, with many fewer free parameters.
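The parsimony argument can be made concrete with a small sketch. Assuming, as stated above, that discriminability between nonadjacent stimuli is the product of the adjacent-pair values, six adjacent estimates suffice to generate all 21 pairwise values for seven stimuli. The function name and the adjacent-pair values below are hypothetical and chosen only for illustration; they are not the fitted estimates.

```python
from math import prod

def pairwise_discriminability(adjacent_d, i, j):
    """Discriminability between stimuli i and j (1-indexed).

    adjacent_d[k] is the value for the adjacent pair (k+1, k+2).
    Nonadjacent values are the product of the adjacent values in
    between; identical stimuli give 1 (no discriminability).
    """
    lo, hi = sorted((i, j))
    return prod(adjacent_d[lo - 1:hi - 1])

# Hypothetical adjacent-pair values for seven stimuli.
adjacent = [2.0, 3.0, 4.0, 2.5, 3.0, 2.0]

d13 = pairwise_discriminability(adjacent, 1, 3)   # d12 * d23
d17 = pairwise_discriminability(adjacent, 1, 7)   # product of all six

# Every distinct pair of the seven stimuli, built from six numbers.
all_pairs = {
    (i, j): pairwise_discriminability(adjacent, i, j)
    for i in range(1, 8) for j in range(i + 1, 8)
}
```

A generalized-matching account with a separate sensitivity for each stimulus pair would instead need one free parameter for each of the 21 entries in `all_pairs`.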

Before the first reinforcer in a component, stimulus-generalization gradients of log response ratios as a function of component number, hence as a function of flash duration ratios, showed consistent peak shift for individual pigeons (Figures 5 through 7). In Conditions 21 to 23, the most extreme levels of preference usually were not in the two components that arranged unequal reinforcer rates (Components 3 and 5), but in 1∶1 reinforcer-ratio components further towards the ends of the stimulus dimension (Components 1, 2, 6, and 7). Discarding data from Components 3 and 5 when immediately preceded by Component 5 and 3, respectively, did not alter these findings. Peak shift, therefore, does not appear to be a result of differential carryover from previous components (see Krägeloh & Davison, 2003). By the third successive reinforcer, log response ratios had begun to approximate the arranged reinforcer ratios (Figures 5 through 7).

Equations 3 and 4 define a steady-state model, and so cannot predict changes in preference during components, and specifically do not predict the disappearance of peak shift after as few as one or two component reinforcers (Figures 5 through 7). The matrix that determines the effective reinforcer ratios needs to be loaded with at least one reinforcer in each cell in order for it to be able to predict reinforcer ratios that are mathematically defined. This means that it could not start empty at the beginning of a new component, but would need to carry over reinforcers from previous components and sessions. The larger the number of reinforcers loaded into the matrix, the less impact a single reinforcer will have on the effective reinforcer ratio. Unless the model is modified to include a process by which accumulated reinforcers leak over time, such as suggested by Davison and Baum (2000), it cannot explain the rapid changes in the log response ratios that are observed at the beginning of components (Figures 5 through 7).
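The arithmetic behind this argument can be illustrated with a toy accumulator. The sketch below is not the contingency-discriminability model and not the specific mechanism proposed by Davison and Baum (2000). It assumes only two reinforcer counts, one per key, each starting at 1, and a hypothetical `retention` factor applied to both counts at every reinforcer. With `retention=1.0` nothing leaks, and three successive left reinforcers barely move the log ratio after a long balanced history. With a leak, the same three reinforcers shift it substantially.

```python
import math

def accumulate(events, retention=1.0, start=(1.0, 1.0)):
    """Accumulate left/right reinforcer counts with optional leak.

    events: string of "L" and "R" reinforcers in order.
    retention: hypothetical fraction of each count kept per event
               (1.0 means no leak).
    """
    left, right = start
    for e in events:
        left *= retention
        right *= retention
        if e == "L":
            left += 1.0
        else:
            right += 1.0
    return left, right

def log_ratio(counts):
    left, right = counts
    return math.log10(left / right)

# 200 alternating reinforcers, then three in a row on the left key.
history = "LR" * 100 + "LLL"

no_leak = log_ratio(accumulate(history, retention=1.0))
leaky = log_ratio(accumulate(history, retention=0.8))
```

Without leak the counts end at 104 and 101, so the log ratio stays close to zero, whereas the leaky accumulator is dominated by the most recent reinforcers.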

The model can predict peak shift in stable preferences. For that to be the case, however, the discriminative stimuli that are arranged for each component would need to be sufficiently difficult to discriminate to increase the amount of generalization across the stimulus dimension. In the present experiment, the differential flash durations that were arranged on both alternatives might have been too discrepant to produce peak shift in the stable state. Future experiments could look into more extensive variations of discriminative stimuli.

The results thus confirm and strengthen the findings by Migler and Millenson (1969) that peak shift can be observed in preference. In their study, only two rats were used and peak shift was only observed in one of them. Migler and Millenson noted that, because reinforcers were scheduled independently, switching was rare between the alternatives, which means that preference tended to be extreme. The present study showed that peak shift in terms of preference can occur reliably in this type of procedure.

This study has shown peak shift in a procedure that is novel in three ways. Firstly, although a number of previous studies (e.g., Blough, 1973; Migler & Millenson, 1969) provided differentially signaled concurrent-schedule training for their subjects, their procedures arranged reinforcers exclusively on one of the alternatives. The present experiment arranged a series of different reinforcer ratios in a variety of combinations. Secondly, peak shift in these previous studies was demonstrated using generalization tests that arranged stimuli different from those in training, whereas the present study obtained peak shift using training stimuli—that is, we used maintained-generalization testing (Blough, 1969) rather than a transient resistance-to-extinction test (Guttman & Kalish, 1956). Thirdly, previous concurrent-schedule discrimination procedures (e.g., Catania et al., 1974; Winton & Beale, 1971) arranged a different discriminative stimulus on each alternative, whereas the present study arranged the same differential flash durations on both alternatives.

The present results also show that the effects of stimulus control can be very long-lasting, especially when no additional variations in the reinforcer ratios occur. In Condition 20, all components arranged a 1∶1 reinforcer ratio, but preference in components was ordinally related to the component reinforcer ratios in Condition 19 (Figure 3). Condition 26, however, also arranged only 1∶1 reinforcer ratios, but preference was much less in accordance with reinforcer ratios in the previous condition. Note that in Condition 19, component preference was more extreme than in Condition 25, and therefore any carryover of preference into the subsequent condition likely would be higher in Condition 20 than in Condition 26. Neither Condition 20 nor Condition 26 was included in the fits of the contingency-discriminability model.

In summary, the present study extended Krägeloh and Davison's (2003) research on the effects of discriminative stimuli on choice in a variable environment. Component preference measured both at a local and at an extended level produced orderly gradients of stimulus generalization of preference. The novelty of this procedure was that peak shift could be demonstrated without novel (test) stimuli and with a procedure in which preference was used as a measure of stimulus control when a variety of different reinforcer ratios was arranged. By arranging a highly variable environment in which components changed every 10 reinforcers, and arranging a large range of conditions with different combinations of component reinforcer ratios, the present study provided a stringent test of the contingency-discriminability model. The model provided an accurate description of performance in this procedure, further highlighting its ability to provide a conceptually sound description of the combined effects of reinforcers and stimulus control.

Acknowledgments

We would like to thank the members of the Experimental Analysis of Behaviour Research Unit for their help in conducting this experiment, and Mick Sibley for looking after the pigeons.

References

  1. Alsop B. Behavioral models of signal detection and detection models of choice. In: Commons M.L, Nevin J.A, Davison M.C, editors. Signal detection: Mechanisms, models, and applications. Hillsdale, NJ: Erlbaum; 1991. pp. 39–55.
  2. Alsop B, Davison M. Effects of varying stimulus disparity and the reinforcer ratio in concurrent-schedule and signal-detection procedures. Journal of the Experimental Analysis of Behavior. 1991;56:67–80. doi: 10.1901/jeab.1991.56-67.
  3. Baum W.M. On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior. 1974;22:231–242. doi: 10.1901/jeab.1974.22-231.
  4. Baum W.M. Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior. 1979;32:269–281. doi: 10.1901/jeab.1979.32-269.
  5. Blough D.S. The shape of some wavelength generalization gradients. Journal of the Experimental Analysis of Behavior. 1961;4:31–40. doi: 10.1901/jeab.1961.4-31.
  6. Blough D.S. Generalization gradient shape and summation in steady-state tests. Journal of the Experimental Analysis of Behavior. 1969;12:91–104. doi: 10.1901/jeab.1969.12-91.
  7. Blough D.S. Two-way generalization peak shift after two-key training in the pigeon. Animal Learning & Behavior. 1973;1:171–174.
  8. Bourland G, Miller J.T. The role of discriminative stimuli in concurrent performances. Journal of the Experimental Analysis of Behavior. 1981;36:231–239. doi: 10.1901/jeab.1981.36-231.
  9. Burnham K.P, Anderson D.R. Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.) New York: Springer Verlag; 2002.
  10. Catania A.C, Silverman P.J, Stubbs D.A. Concurrent performances: Stimulus-control gradients during schedules of signaled and unsignaled concurrent reinforcement. Journal of the Experimental Analysis of Behavior. 1974;21:99–107. doi: 10.1901/jeab.1974.21-99.
  11. Davison M.C. Stimulus discriminability, contingency discriminability, and complex stimulus control. In: Commons M.L, Nevin J.A, Davison M.C, editors. Signal detection: Mechanisms, models, and applications. Hillsdale, NJ: Erlbaum; 1991. pp. 57–78.
  12. Davison M, Baum W.M. Choice in a variable environment: Every reinforcer counts. Journal of the Experimental Analysis of Behavior. 2000;74:1–24. doi: 10.1901/jeab.2000.74-1.
  13. Davison M, Baum W.M. Choice in a variable environment: Effects of blackout duration and extinction between components. Journal of the Experimental Analysis of Behavior. 2002;77:65–89. doi: 10.1901/jeab.2002.77-65.
  14. Davison M, Jenkins P.E. Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior. 1985;13:77–84.
  15. Davison M, Nevin J.A. Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior. 1999;71:439–482. doi: 10.1901/jeab.1999.71-439.
  16. Findley J.D. Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior. 1958;1:123–144. doi: 10.1901/jeab.1958.1-123.
  17. Guttman N, Kalish H.I. Discriminability and stimulus generalization. Journal of Experimental Psychology. 1956;51:79–88. doi: 10.1037/h0046219.
  18. Hanson H.M. Effects of discrimination training on stimulus generalization. Journal of Experimental Psychology. 1959;58:321–334. doi: 10.1037/h0042606.
  19. Herrnstein R.J. Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior. 1961;4:267–272. doi: 10.1901/jeab.1961.4-267.
  20. Krägeloh C.U, Davison M. Concurrent-schedule performance in transition: Changeover delays and signaled reinforcer ratios. Journal of the Experimental Analysis of Behavior. 2003;79:87–109. doi: 10.1901/jeab.2003.79-87.
  21. Leigland S. Discriminative stimulus control and the effects of concurrent operants. Journal of the Experimental Analysis of Behavior. 1987;47:213–223. doi: 10.1901/jeab.1987.47-213.
  22. Lobb B, Davison M.C. Performance in concurrent interval schedules: A systematic replication. Journal of the Experimental Analysis of Behavior. 1975;24:191–197. doi: 10.1901/jeab.1975.24-191.
  23. Migler B, Millenson J.R. Analysis of response rates during stimulus generalization. Journal of the Experimental Analysis of Behavior. 1969;12:81–87. doi: 10.1901/jeab.1969.12-81.
  24. Miller J.T, Saunders S.S, Bourland G. The role of stimulus disparity in concurrently available reinforcement schedules. Animal Learning & Behavior. 1980;8:635–641.
  25. Stubbs D.A, Pliskoff S.S. Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of Behavior. 1969;12:887–895. doi: 10.1901/jeab.1969.12-887.
  26. Taylor R, Davison M. Sensitivity to reinforcement in concurrent arithmetic and exponential schedules. Journal of the Experimental Analysis of Behavior. 1983;39:191–198. doi: 10.1901/jeab.1983.39-191.
  27. Wearden J.H, Burgess I.S. Matching since Baum (1979). Journal of the Experimental Analysis of Behavior. 1982;38:339–348. doi: 10.1901/jeab.1982.38-339.
  28. Winton A.S.W, Beale I.L. Peak shift in concurrent schedules. Journal of the Experimental Analysis of Behavior. 1971;15:73–81. doi: 10.1901/jeab.1971.15-73.

Articles from Journal of the Experimental Analysis of Behavior are provided here courtesy of Society for the Experimental Analysis of Behavior
