Abstract
This study examined how operant behavior adapted to an abrupt but regular change in the timing of reinforcement. Pigeons were trained on a fixed interval (FI) 15-s schedule of reinforcement during half of each experimental session, and on an FI 45-s (Experiment 1), FI 60-s (Experiment 2), or extinction schedule (Experiment 3) during the other half. FI performance was well characterized by a mixture of two gamma-shaped distributions of responses. When a longer FI schedule was in effect in the first half of the session (Experiment 1), a constant interference by the shorter FI was observed. When a shorter FI schedule was in effect in the first half of the session (Experiments 1, 2, and 3), the transition between schedules involved a decline in responding and a progressive rightward shift in the mode of the response distribution initially centered around the short FI. These findings are discussed in terms of the constraints they impose on quantitative models of timing, and in relation to their implications for information-based models of associative learning.
Keywords: Timing, time perception, behavioral dynamics, associative learning, pigeons, fixed-interval schedules of reinforcement
1. Introduction
The Darwinian fitness of animals involves the adjustment of their behavior to environmental regularities, such as the correlation between biologically significant stimuli (associative learning) and the timing of those stimuli. Whereas associative learning dynamics have been extensively studied for 40 years (Rescorla and Wagner 1972; Pearce and Bouton 2001; Killeen et al. 2009), research on timing has been primarily focused on steady-state performance (Church 2006; Grondin 2010; Machado et al. 2009). Few studies have examined how behavior adjusts to changes in the periodicity of biologically significant stimuli and, consequently, our understanding of timing dynamics is incipient. The purpose of the present study is to describe and account for trial-by-trial changes in operant performance when the timing of reinforcement changes abruptly but regularly.
Studies on timing dynamics rely primarily on laboratory rats and pigeons as experimental subjects (but see Rivière et al. 2000), and on the fixed-interval (FI) schedule of reinforcement (e.g., Guilhardi et al. 2006), or some variation of this schedule, to assess the temporal control of behavior. In the FI schedule, the first response following a criterion (a fixed time since the onset of the trial) is reinforced. Two typical variations of the FI schedule are the response-initiated-delay (RID) schedule, in which reinforcement is delivered at the criterion if a response is produced before the criterion elapses (e.g., Wynne and Kalish 1999), and the peak interval (PI) procedure, in which a portion of FI trials are substituted by longer non-reinforced trials (e.g., Roberts 1981; Rodríguez-Gironés and Kacelnik 1999).
FI and RID studies use the post-reinforcement pause (PRP), the interval between trial onset and the first response, as the primary measure of temporal control. These studies have shown that an unpredictable disruption of periodic reinforcement by a series of trials with short criteria yields a rapid reduction in PRP in the trials following each short-criterion trial; after the disruption, PRPs recover at a rate that is inversely proportional to the length of the disruption (Higa 1996a; Higa et al. 2002). If periodic reinforcement is disrupted by long-criterion trials, however, changes in PRP are negligible (Higa 1997; Ludvig and Staddon 2004) or so weak as to require very long disruptions (Higa and Tillou 2001). When changes in PRP are observed during a long-criterion disruption, recovery of shorter PRPs following the disruption is very rapid (Higa and Tillou 2001). Changes in PRP following long-criterion disruptions are made more visible in single-transition paradigms, in which the disruption is maintained for the remainder of the session (Higa 1996b; Higa et al. 1993). Even in single-transition paradigms, however, downshifts in PRP in response to downshifts in the criterion are faster than upshifts in PRP in response to upshifts in the criterion (Higa et al. 1993; Higa 1996b).
The use of PRPs allows for a trial-by-trial analysis of changes in the temporal control of behavior, but neglects evidence of temporal control that is visible in the responses that follow the first response in FI and RID. These latter responses are examined in the PI procedure. This procedure uses measures of central tendency of the distribution of responses in non-reinforced trials to establish the peak time, which typically coincides with the criterion and serves as an estimate of the time when the subject expects reinforcement (Roberts 1981). Studies on timing dynamics based on peak time estimates have yielded disparate results. Using rats, Meck et al. (1984) found that, when the criterion in the PI schedule was either increased or reduced permanently across sessions, the peak time was fixed for several sessions at an intermediate time before settling at the new criterion. In contrast, Rodríguez-Gironés and Kacelnik (1999) found that it took several intermediate steps for European starlings to complete a downward transition between PI schedules. Moreover, whereas the change in peak time in response to a criterial upshift in Meck et al. (1984) was symmetrical to the change in response to a criterial downshift, Lejeune et al. (1997) observed abrupt changes in peak time when rats transitioned from a short to a long criterion, but more progressive changes when transitioning from a long to a short criterion. Both Meck et al.'s (1984) and Lejeune et al.'s (1997) findings stand in contrast to the slower changes in PRPs in criterial upshifts relative to downshifts.
Despite their limited scope, the results obtained from PRP-based studies are sufficiently consistent to support informative quantitative models of timing dynamics (Luzardo, Ludvig, and Rivest 2013; Staddon, Chelaru, and Higa 2002, 2002b). In contrast, research focused on peak times provides a more comprehensive view of temporally controlled behavior, but the scant data it has produced support seemingly inconsistent findings. To develop a model of timing dynamics comparable to those of associative learning dynamics, it may be beneficial to implement the simpler behavioral paradigms that have shown consistent results in PRP-based studies, and to apply to those results a comprehensive analytic approach akin to that of peak-time-based research. To that end, the present study examined changes in response rate within individual FI trials using a single-transition paradigm similar to that used by Higa (1997). Unlike Higa's (1997) study, however, the present study implemented the transition regularly in the middle of each session. This modification allowed us to examine whether our experimental subjects anticipated the regular transition between schedules. Also, our analysis was not restricted to changes in PRP, but considered changes to the distribution of responses over time within each trial.
The present study involved the analysis of response-rate functions in trials around a single regular transition between two FI schedules (Experiments 1 and 2) and between an FI schedule and the discontinuation of reinforcement (Experiment 3). This analysis was focused on answering two questions: (1) Is the change in reinforcement time anticipated? (2) Following the change in reinforcement time, how does control from the previous reinforcement time wane?
2. Experiment 1
2.1 Methods
2.1.1 Subjects
Four experienced male pigeons (Columba livia) served as experimental subjects; they were housed individually in a room with a 12 h:12 h day:night cycle, with dawn at 0600 h. Each bird had free access to water and grit in its home cage. Experiments were conducted during the day. Running weights were monitored daily and were kept at about 80% of free-feeding weights. Each pigeon was weighed immediately prior to an experimental session and was excluded from a session if its weight exceeded its running weight by more than 8% (i.e., if it exceeded 86.4% of free-feeding weight). When required, a supplementary feeding of ACE-HI pigeon pellets (Star Milling Co.) was given at the end of each day, at least 12 h before experimental sessions were conducted.
2.1.2 Apparatus
Experimental sessions were conducted in four MED Associates modular test chambers. The test panel contained a plastic transparent response key (25 mm in diameter: MED Associates, ENV-123AM), centered horizontally 70 mm from the ceiling. The key could be illuminated by white light from two diodes. Activation of the key generated a 100-ms period in which no further activations were registered. A rectangular opening (52 mm wide, 57 mm high) located 20 mm above the floor and centered on the test panel provided access to milo when a grain hopper was activated (Coulbourn Instruments, H14-10R). A houselight (MED Associates, ENV-215M) was mounted 12 mm from the ceiling on the sidewall opposite the test panel. The houselight dimly illuminated the chamber throughout each experimental session.
2.1.3 Procedure
2.1.3.1 Fixed-interval 15-s pretraining
Pigeons were first introduced to a fixed-interval (FI) 15-s schedule of reinforcement. In this schedule, the onset of a trial was signaled by the illumination of the center key with white light. The first keypeck after 15 s from trial onset turned off the center key and activated the hopper for 2.5 s, which served as reinforcer. The next trial started immediately after turning off the hopper. Each session finished after 60 trials or 90 min, whichever happened first. Each pigeon was pretrained for three sessions.
2.1.3.2 Experimental training
Experimental sessions were similar to FI 15-s pretraining sessions, with a few exceptions. First, in each session, trials were divided into two halves of 30 trials each. Each bird experienced two different schedules of reinforcement during each session, one schedule in each half. One schedule was always FI 15-s; the other was either FI 45-s or variable interval (VI) 45-s. Sessions containing a VI 45-s schedule were conducted for purposes unrelated to this experiment and are not analyzed here. Each bird experienced 23–27 consecutive sessions with each schedule permutation: FI 15-s then FI 45-s (Short-First); FI 45-s then FI 15-s (Long-First); FI 15-s then VI 45-s; VI 45-s then FI 15-s (the latter two permutations were not analyzed). The order of presentation of each schedule permutation was counterbalanced across birds.
2.1.4 Data Analysis
The last 10 sessions of Short-First and Long-First sessions were analyzed. Key pecks within each trial in these sessions were averaged within 1-s bins. Only the first 15 and 45 bins were analyzed in FI 15-s and FI 45-s trials, respectively. Data were first averaged within each bin across blocks of 5 consecutive trials. This level of aggregation revealed the 5-trial blocks around the middle of the session where the transition in FI performance took place. We called these blocks of trials the transition period.
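The binning and blocking steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`bin_counts`, `block_means`) and the input layout (one list of key-peck times, in seconds since trial onset, per trial) are hypothetical and are not taken from the original analysis, and averaging across sessions is omitted.

```python
import math

def bin_counts(peck_times, n_bins):
    """Count key pecks in consecutive 1-s bins (bin i covers [i, i + 1) s)."""
    counts = [0] * n_bins
    for t in peck_times:
        b = math.floor(t)
        if 0 <= b < n_bins:  # pecks beyond the analyzed bins are ignored
            counts[b] += 1
    return counts

def block_means(trials, n_bins, block_size=5):
    """Average binned counts across blocks of consecutive trials."""
    blocks = []
    for start in range(0, len(trials), block_size):
        block = trials[start:start + block_size]
        binned = [bin_counts(trial, n_bins) for trial in block]
        # zip(*binned) groups the counts of each bin across the trials of the block
        blocks.append([sum(col) / len(block) for col in zip(*binned)])
    return blocks

# Hypothetical data: two short trials, three bins, one block of two trials.
example = block_means([[0.5, 1.2, 1.8], [1.1]], n_bins=3, block_size=2)
```

With 30 trials per schedule and `block_size=5`, this would yield six blocks per half-session, each holding 15 or 45 bin means depending on the schedule.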
A simple model of transition between FI schedules assumes that the FI requirement in the first half of the session interferes with FI performance in the first few trials of the second half of the session. This interference is likely to be expressed as responding under the control of the first FI schedule while the second FI schedule is in effect (Catania and Reynolds, 1968). The joint control of behavior by multiple timing requirements is well characterized by models that assume parallel processes controlling responding around each of the two requirements (Leak and Gibbon, 1995). Such parallel processes yield a bimodal distribution of responses over time (Whitaker et al. 2003, 2008). Thus, a bimodal distribution was used to characterize the average performance within each transition trial,
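$$B(t) = K_S \frac{\Gamma(t; c_S, \lambda_S)}{\Gamma(M_S; c_S, \lambda_S)} + K_L \frac{\Gamma(t; c_L, \lambda_L)}{\Gamma(45; c_L, \lambda_L)} \qquad (1)$$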
where B(t) is the number of key pecks emitted between t – 1 and t seconds since trial start (i.e., the tth 1-s bin); KD is a scale factor (in number of key pecks), where D signifies each of two gamma distributions, S(hort) and L(ong); Γ(X;cD, λD) is the integral of a gamma distribution between X – 1 s and X, with shape parameter cD, and scale parameter λD; MS is the mode of the Short distribution.
Equation 1 assumes that key pecks are bimodally distributed over time in each trial, with maximal response rates at MS and at 45 s. The FI 15-s schedule was expected to maintain the peak at MS, and the FI 45-s schedule was expected to maintain the second peak at 45 s (Whitaker et al. 2003, 2008). KS and KL are, respectively, the height of the peaks at MS and at 45 s. Response rates were expected to decline around each of these peaks approximately following a gamma distribution (S and L) (Leak and Gibbon 1995). The parameters of each distribution D (cD and λD) can be expressed in terms of its mode (MD) and standard deviation (σD):
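$$c_D = \left( \frac{M_D + \sqrt{M_D^2 + 4\sigma_D^2}}{2\sigma_D} \right)^2 \qquad (2)$$

$$\lambda_D = \frac{\sigma_D}{\sqrt{c_D}} \qquad (3)$$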
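A minimal Python sketch of the mixture model may help make these definitions concrete. It rests on several assumptions that should be checked against the original equations: each gamma component is divided by its own value at the mode bin, so that KS and KL are the peak heights; the gamma distribution is parameterized by shape and scale, so that mode = (c − 1)λ and standard deviation = √c·λ; and the 1-s integral is approximated with Simpson's rule. All function names are hypothetical.

```python
import math

def gamma_params(mode, sd):
    """Shape c and scale lam such that mode = (c - 1) * lam and sd = sqrt(c) * lam."""
    root_c = (mode + math.sqrt(mode ** 2 + 4 * sd ** 2)) / (2 * sd)
    return root_c ** 2, sd / root_c

def gamma_pdf(x, c, lam):
    """Gamma density with shape c > 1 and scale lam."""
    if x <= 0:
        return 0.0
    log_pdf = (c - 1) * math.log(x) - x / lam - math.lgamma(c) - c * math.log(lam)
    return math.exp(log_pdf)

def bin_mass(x, c, lam, steps=200):
    """Integral of the gamma density between x - 1 and x (composite Simpson's rule)."""
    a = x - 1.0
    h = 1.0 / steps
    total = gamma_pdf(a, c, lam) + gamma_pdf(x, c, lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(a + i * h, c, lam)
    return total * h / 3

def predicted_pecks(t, k_s, m_s, sd_s, k_l, sd_l, m_l=45.0):
    """Sum of a Short and a Long component, each scaled to height K at its mode."""
    c_s, lam_s = gamma_params(m_s, sd_s)
    c_l, lam_l = gamma_params(m_l, sd_l)
    short = k_s * bin_mass(t, c_s, lam_s) / bin_mass(m_s, c_s, lam_s)
    long_ = k_l * bin_mass(t, c_l, lam_l) / bin_mass(m_l, c_l, lam_l)
    return short + long_

# Predicted response-rate function over the 45 analyzed bins,
# using illustrative parameter values.
curve = [predicted_pecks(t, 2.6, 13.7, 7.4, 2.1, 29.6) for t in range(1, 46)]
```

The conversion in `gamma_params` follows from solving the mode and standard-deviation identities for the shape parameter; it requires a positive mode and therefore a shape greater than 1.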
Equation 1 was intended to describe, for each individual animal, the average performance (across sessions) in each consecutive trial. It was expected that the distribution parameters would change from trial to trial within the transition period. The distribution parameters that were expected to change were maximal response rate (KS, KL), standard deviation (σS, σL), and the mode of the Short distribution (MS). We considered three possible updating rules for these parameters, expressed as mathematical models: an Arithmetic Progression model, a Geometric Progression model, and a Negative Acceleration model. These models are expressed mathematically in the middle column of Table 1. The Arithmetic Progression model assumes that the difference between parameters in consecutive trials was constant. The Geometric Progression model assumes that the ratio of parameters in consecutive trials was constant. The Negative Acceleration model assumes that the difference between parameters in consecutive trials was proportional to the difference between its present value and an asymptote.
Table 1.
Rules for updating Equation 1 parameters
| Models | Updating Rule | Mean |
|---|---|---|
| Arithmetic Progression | $X_{n+1} = X_n + \zeta_X$ | $X_1 + \zeta_X (m - 1)/2$ |
| Geometric Progression | $X_{n+1} = X_n \cdot \zeta_X$ | $X_1 (1 - \zeta_X^m) / [m (1 - \zeta_X)]$ |
| Negative Acceleration | $X_{n+1} = X_n + \zeta_X \cdot (X_A - X_n)$ | $X_A + (X_1 - X_A) [1 - (1 - \zeta_X)^m] / (m \zeta_X)$ |

Note. Xn is a placeholder for parameter X in trial n. ζX is the updating parameter; for the Geometric Progression model, ζX ≥ 0; for the Negative Acceleration model, 1 ≥ ζX ≥ 0. XA is the asymptote of parameter X. The rightmost column shows the mean of X in trials n = 1, n = 2, n = 3, … n = m according to each updating rule.
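The three updating rules can be written out as short Python functions. This sketch is illustrative only: the function names are hypothetical, and the closed-form means are derived here by summing the corresponding arithmetic or geometric series over trials 1 to m, so they should be checked against the source table rather than treated as its content.

```python
def arithmetic(x1, zeta, m):
    """X[n+1] = X[n] + zeta, for trials 1..m."""
    xs = [x1]
    for _ in range(m - 1):
        xs.append(xs[-1] + zeta)
    return xs

def geometric(x1, zeta, m):
    """X[n+1] = X[n] * zeta, for trials 1..m."""
    xs = [x1]
    for _ in range(m - 1):
        xs.append(xs[-1] * zeta)
    return xs

def negative_acceleration(x1, zeta, x_a, m):
    """X[n+1] = X[n] + zeta * (X_A - X[n]), for trials 1..m."""
    xs = [x1]
    for _ in range(m - 1):
        xs.append(xs[-1] + zeta * (x_a - xs[-1]))
    return xs

def mean_arithmetic(x1, zeta, m):
    return x1 + zeta * (m - 1) / 2

def mean_geometric(x1, zeta, m):
    if zeta == 1:  # constant sequence
        return x1
    return x1 * (1 - zeta ** m) / (m * (1 - zeta))

def mean_negative_acceleration(x1, zeta, x_a, m):
    if zeta == 0:  # no updating
        return x1
    return x_a + (x1 - x_a) * (1 - (1 - zeta) ** m) / (m * zeta)

# Illustration with made-up values: a mode that starts at 13.7 s and
# approaches 24.9 s at a rate of .72 per trial over a 5-trial period.
modes = negative_acceleration(13.7, 0.72, 24.9, 5)
```

Iterating a rule and averaging the result should agree with the corresponding closed-form mean, which is a convenient check when fitting mean predictions to trials outside the transition period.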
The Arithmetic and Geometric Progression models each involve two higher-order free parameters for each (lower-order) distribution parameter: a baseline parameter (X1) and an updating parameter (ζX). Aside from these two parameters, the Negative Acceleration model involves a third higher-order free parameter, the asymptote parameter (XA). Thus, the Arithmetic and Geometric Progression models each involve a total of 10 free parameters (5 distribution parameters × 2 higher-order parameters); the Negative Acceleration model involves 15 free parameters (5 distribution parameters × 3 higher-order parameters).
After identifying the trials over which performance transitioned, the free parameters of each model were estimated using the method of least squares, with mean number of key pecks in each 1-s bin serving as data. To simplify the analysis, models were fit to individual trials only within the transition period, i.e., when the most noticeable changes in performance were taking place. Performance in all the other trials was averaged; mean model predictions (Table 1, rightmost column) were fit to these averages.
The Akaike Information Criterion, corrected for small samples (AICc; Burnham and Anderson 2002), was used to select among models. Briefly, AICc is an estimate of model likelihood after correction for free parameters, which promotes the selection of models that provide a better fit to the data with fewer free parameters. Conventionally, to select a model with more free parameters (here, the Negative Acceleration model), its AICc must be at least 4 units below the AICc of any competing model (Burnham and Anderson 2002). This difference in AICc is the ΔAICc of the competing model; the estimated difference in model likelihood, after correcting for free parameters, is e^(ΔAICc/2). Thus, the “4-unit difference” rule implies that the more complex model should be at least e^2 ≈ 7.4 times more likely than other models, even after correcting for free parameters, to be selected.
Once an updating model was selected, inferences were drawn on the direction of change of each distribution parameter over the transition period, based on higher-order parameter estimates. For instance, if the Negative Acceleration model were selected and K1,S > KA,S (i.e., the baseline maximal response rate maintained by the short FI is higher than its asymptote) it would suggest that the peak of the Short distribution declined over the transition period. Because the central aim of this study was to identify which distribution parameters reliably changed in a particular direction over the course of the transition period, we implemented a stringent criterion for identifying a reliable direction of change. The hypothesis that a distribution parameter changed in a particular direction was supported when the AICc-based estimates indicated the same direction of change in all four birds. When changes in distribution parameters were inconsistent between subjects (e.g., K1,S > KA,S for one pigeon, but K1,S < KA,S for the other three), we sought consistency by constraining the inconsistent parameter to be constant in one or more pigeons (e.g., ζKS = 0 for the deviating pigeon), and tested this constrained model using AICc. The hypothesis that distribution parameters changed in the postulated direction was supported only if AICc favored the constrained model.
2.2 Results
Figure 1 shows response rates in each block of 5 trials as a function of time in each trial, averaged across pigeons and across the last 10 sessions of the Short-First and Long-First conditions. In both conditions, responses were less dispersed over time and maximum response rate was higher in the FI 15-s than in the FI 45-s schedule. In the Short-First condition (a) response rates were higher during the first 5 s in the first block of the initial FI 15-s schedule, relative to the subsequent blocks; (b) response rates were generally higher during the first 35 s in the first block of the terminal FI 45-s schedule, relative to the subsequent blocks, and (c) response rates were lower during the last 15 s of the last block of the terminal FI 45-s schedule. Effect (b) identified the first 5 trials of the FI 45-s schedule as containing the transition period in the Short-First condition. Because performance during the last 5 trials of the FI 15-s schedule was not distinguishable from performance in preceding blocks of trials, there was no evidence of anticipation of the transition to the FI 45-s schedule during the FI 15-s schedule. Effect (c) suggested the exclusion from analysis of the last block of the terminal schedule in the Short-First condition.
Figure 1.
Mean response rate as a function of time in trial in the Short-First (top panel) and Long-First (bottom panel) conditions, averaged over blocks of 5 trials, in Experiment 1. Each curve represents a separate block of 5 trials.
In the Long-First condition, response rates were higher during the first 8 s and lower during the last 5 s in the first block of the terminal FI 15-s schedule, relative to the subsequent blocks of trials. This effect identified the first 5 trials of the FI 15-s schedule as containing the transition period in the Long-First condition. Because performance during the last 5 trials of the FI 45-s schedule was not distinguishable from performance in preceding trials, there was no evidence of anticipation of the transition to the FI 15-s schedule during the FI 45-s schedule.
Table 2 shows the summary of the AICc analysis conducted to select among updating models in the Short-First and Long-First conditions. Every model accounted for a higher percentage of variance (PVAF) in the Long-First than in the Short-First condition. In both conditions, the Negative Acceleration model accounted for more variance and had a lower AICc than the competing models. Even though the AICc of the second-best model in the Short-First condition, the Geometric Progression model, was more than 4 units greater than the AICc of the Negative Acceleration model, the estimated difference in model likelihood was low enough to justify exploring the Geometric Progression parameters to qualify inferences drawn from the Negative Acceleration model that was selected in the Short-First condition.
Table 2.
AICc analysis of updating rules for the Short-First and Long-First conditions
| Updating Rule | Free Parameters | RSS | PVAF | ΔAICc |
|---|---|---|---|---|
| Short-First Condition | | | | |
| Arithmetic Progression | 44 | 118.0 | 85.4% | 597.1 |
| Geometric Progression | 44 | 61.2 | 92.4% | 6.9 |
| Negative Acceleration | 64 | 57.8 | 92.8% | 0.0 |
| Long-First Condition | | | | |
| Arithmetic Progression | 44 | 38.0 | 95.4% | 88.0 |
| Geometric Progression | 44 | 48.0 | 94.2% | 298.7 |
| Negative Acceleration | 64 | 32.7 | 96.0% | 0.0 |
Note. Model evaluations are based on 5 trials × 45 1-s bins × 4 pigeons = 900 observations. The number of free parameters is the number of distribution parameters (3 for Short + 2 for Long = 5) multiplied by the number of updating-rule parameters (2 or 3), plus 1 error variance parameter, all multiplied by 4 pigeons. RSS is the residual sum of squares from curve fitting. PVAF is the percent variance accounted for by each model. ΔAICc is the difference between each model's AICc and the lowest AICc in each condition.
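The bookkeeping in this note can be sketched in Python. The source does not print the AICc formula, so the least-squares form used below, n·ln(RSS/n) + 2k + 2k(k + 1)/(n − k − 1), is an assumption based on the standard expression in Burnham and Anderson (2002); the function names are hypothetical. Because the RSS values in Table 2 are rounded to one decimal, recomputed ΔAICc values can only approximate the tabled ones.

```python
import math

def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit (assumed form, see lead-in)."""
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def free_parameters(n_rule_params, n_dist_params=5, n_pigeons=4):
    """(distribution params x updating-rule params + 1 error variance) x pigeons."""
    return (n_dist_params * n_rule_params + 1) * n_pigeons

n_obs = 5 * 45 * 4  # trials x 1-s bins x pigeons

# RSS values for the Short-First condition, copied from Table 2.
short_first = {
    "Arithmetic Progression": (118.0, free_parameters(2)),
    "Geometric Progression": (61.2, free_parameters(2)),
    "Negative Acceleration": (57.8, free_parameters(3)),
}

scores = {name: aicc_least_squares(rss, n_obs, k)
          for name, (rss, k) in short_first.items()}
best = min(scores.values())
delta = {name: score - best for name, score in scores.items()}

# Estimated likelihood ratio of the best model relative to each competitor.
evidence_ratio = {name: math.exp(d / 2) for name, d in delta.items()}
```

Under these assumptions, the Negative Acceleration model has the lowest AICc and the recomputed differences land close to the ΔAICc column, which suggests that the tabled values follow this form of AICc.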
When no constraints were imposed, the direction of change of Negative Acceleration estimates was consistent across all pigeons, except for one pigeon, for whom K1,L < KA,L and σ1,L < σA,L in the Short-First condition. That is, for this pigeon, unlike for the rest of the pigeons, the maximal response rate and standard deviation of the Long distribution (the one maintained by the FI 45-s schedule) appeared to increase over the course of the first few FI 45-s trials that followed the last FI 15-s trial (i.e., its baseline estimates were lower than its asymptotic estimates). Seeking consistency among subjects, we constrained the Negative Acceleration model, for this subject, such that K1,L = KA,L and σ1,L = σA,L. This meant the elimination of 4 free parameters (KA,L, σA,L, ζKL, and ζσL, for one pigeon) in the model, from 64 to 60 (see Table 2). These constraints increased RSS from 57.8 to 58.1. The AICc analysis, however, favored the constrained model, whose AICc was 4.42 units lower than the more complex, unconstrained model.
Mean (+/− SEM) Negative Acceleration parameter estimates, with constraints for consistency imposed, are shown in Table 3. These estimates suggest that, during the terminal schedule, (a) MS increased from a time close to the FI requirement toward a substantially longer time; (b) σS also increased, but only in the Short-First condition; (c) KS declined when FI 15-s was not the terminal schedule (Short First), but increased when it was; (d) baseline σL was very similar in both conditions (~30 s), but declined when FI 45-s was the terminal schedule, and (e) KL declined.
Table 3.
Mean (+/− SEM) Negative Acceleration parameter estimates.
| Parameter | Description | Exp. 1 Short-First | Exp. 1 Long-First | Exp. 2 | Exp. 3 |
|---|---|---|---|---|---|
| Short distribution (controlled by FI 15-s schedule) | | | | | |
| M1,S | Baseline mode | 13.7 (0.4) s | 16.2 (1.2) s | 16.0 (2.2) s | 17.0 (1.6) s |
| MA,S | Asymptotic mode | 24.9 (5.0) s | >100 (>100) s | 42.3 (7.8) s | 46.4 (11.1) s |
| σ1,S | Baseline standard deviation | 7.4 (0.6) s | 10.9 (2.6) s | 10.4 (0.9) s | 10.9 (3.1) s |
| σA,S | Asymptotic standard deviation | >100 (>100) s | 10.9 (2.6) s | >100 (>100) s | >100 (>100) s |
| K1,S | Baseline maximal resp. rate | 2.6 (0.4) rps | 0.8 (0.1) rps | 1.4 (0.2) rps | 1.7 (0.3) rps |
| KA,S | Asymptotic maximal resp. rate | 0.7 (0.3) rps | 3.0 (0.5) rps | 0.5 (0.2) rps | 0.1 (0.0) rps |
| ζMS | Updating rate of mode | .72 (.12) | .28 (.24) | .42 (.13) | .49 (.17) |
| ζσS | Updating rate of st. deviation | .08 (.08) | n.c. | .51 (.22) | .34 (.21) |
| ζKS | Updating rate of max. resp. rate | .52 (.16) | .36 (.08) | .50 (.12) | .65 (.12) |
| Long distribution | | | | | |
| σ1,L | Baseline standard deviation | 29.6 (2.9) s | 30.2 (2.6) s | 66.8 (18.8) s | 67.8 (14.8) s* |
| σA,L | Asymptotic standard deviation | 13.7 (5.6) s | >100 (>100) s | 26.1 (6.8) s | 67.8 (14.8) s* |
| K1,L | Baseline maximal resp. rate | 2.1 (0.5) rps | 1.7 (0.4) rps | 1.6 (0.3) rps | 0.6 (0.2) rps |
| KA,L | Asymptotic maximal resp. rate | 1.4 (0.4) rps | 0.0 (0.0) rps | 1.0 (0.2) rps | 0.0 (0.0) rps |
| ζσL | Updating rate of st. deviation | .46 (.25) | .78 (.16) | .45 (.13) | n.c. |
| ζKL | Updating rate of max. resp. rate | .45 (.28) | .49 (.18) | .46 (.18) | .67 (.16) |

Note. N.c. means “not computable” (baseline and asymptotic estimates were the same). Rps means “responses per second”.
* The standard deviation of the Long distribution in Experiment 3 was assumed constant.
Parameters of the Geometric Progression model in the Short-First condition were generally consistent with the inferences drawn from estimates of the Negative Acceleration model. In the Geometric Progression model, updating parameters greater/lower than 1 are indicative of an increase/decrease of the corresponding distribution parameter over the course of the transition period. According to estimated Geometric Progression parameters, ζMS > 1 and ζKS < 1 for all pigeons, with means of 1.29 (+/− 0.10) and 0.54 (+/− 0.06), respectively; other estimates were not consistent among pigeons. Thus, fits of the Negative Acceleration and Geometric Progression models support the notion that, when transitioning to a longer FI schedule, responses controlled by the shorter FI progressively recede in time and decline in frequency.
Besides the consistency across pigeons of estimates of Negative Acceleration and Geometric Progression parameters, two alternative models were considered to verify that the inference of a rightward-displaced Short distribution was not an artifact of the Negative Acceleration model fitting noise in performance. These models were fit to data from the Short-First transition period, where the rightward displacement of the Short distribution was more visible. First, we tested a constrained variation of the Negative Acceleration model, where ζMS = 0 and thus M1,S = MA,S (i.e., baseline and asymptotic mode of the Short distribution were equal). This model has 2 fewer parameters per pigeon than the unconstrained Negative Acceleration model, for a total of 56 free parameters (cf. Table 2). Second, we tested a variation of the Negative Acceleration model where M1,S = MA,S, but the mode of the Long distribution was free to vary, i.e., ML substituted for 45 s in Equation 1, and M1,L (baseline) and MA,L (asymptote) were free parameters. This model has the same number of free parameters as the unconstrained Negative Acceleration model of Table 1. The first model assumes that neither distribution shifts during transition; the second model assumes that only the Long distribution shifts during transition. After fitting these models using the method of least squares, the RSS were 87.2 and 76.0 for the “no shift” and “Long-only shift” models, respectively. These RSS yielded ΔAICc of 352.6 and 246.4, respectively, both substantially above the ΔAICc = 4.0 criterion for selection of the more complex model. The selection of the unconstrained Negative Acceleration model over variations of this model with a constant MS and a varying ML suggests that the rightward shift of MS over the transition period was reliable.
Finally, we considered whether a single-distribution variation of Equation 1, with parameters changing according to the unconstrained Negative Acceleration model, could provide a more parsimonious account of performance during the transition period in both the Short-First and Long-First conditions. In this variation of the Negative Acceleration model K1,L = KA,L = 0, and the number of free parameters is reduced by 24 (6 Long-distribution parameters per pigeon), from 64 to 40. The RSS for this model was 108.5 and 40.6 in the Short-First and Long-First conditions, respectively. These RSS yielded ΔAICc of 512.8 and 177.7, respectively, both substantially above the ΔAICc = 4.0 criterion for selection of the more complex model. The unconstrained Negative Acceleration model using the mixture distribution of Equation 1 was thus selected over a single-distribution variation. This selection supports the hypothesis that performance during the transition period was jointly controlled by the just-expired FI schedule and the current FI schedule.
Mean parameter estimates reported here should be taken with caution, regardless of the selected model, because (a) they make predictions about unobserved performance that are unlikely to be accurate, and (b) mean estimates were often not representative of individual estimates. For instance, according to Table 3, the mean asymptotic mode of the Short distribution (MA,S) in the Long-First condition is expected to increase rapidly toward intervals longer than 100 s when the FI 15-s schedule is in effect. This is unlikely to be accurate, because reinforcement is provided about every 15 s and not at longer intervals. Moreover, the estimated asymptotic mode varied widely between pigeons, from 16 to 595 s, and was negatively correlated with ζMS (in a log scale, r = −.97). Over the range of observed trials, however, the Negative Acceleration model provides an accurate description of the data, as suggested by the high percentage of variance in the data it accounts for (Table 2). Changes in distribution parameters are, therefore, better represented by the mean predictions at each transition trial than by the predictions of mean parameters. These mean predictions are drawn in Figure 2.
Figure 2.
Mean (+/− SEM) Short and Long distribution parameters over the terminal schedule, predicted by the Negative Acceleration model on the basis of individual parameters fitted to data from Experiment 1. M is the mode, σ the standard deviation, K the maximal response rate, and the coefficient of variation (CV) is the standard deviation divided by the mean. Black curves correspond to the Short-First condition (“SF”, terminal schedule is FI 45-s); gray curves correspond to the Long-First condition (“LF”, terminal schedule is FI 15-s). Circles are Short distribution parameters; triangles are Long distribution parameters. Predictions for Trial 1 served also as predictions for initial-schedule trials. “SS” in the x-axis means “subsequent trials”: 6 to 25 in SF, and 6 to 30 in LF.
The curves in Figure 2 are consistent with the inferences drawn from mean parameter estimates reported in Table 3. In addition, they show that, despite the very high mean estimate of MA,S in the Long-First condition, MS barely changed in that condition, particularly if compared to MS in the Short-First condition. Similarly, σS remained fairly low, despite very high asymptotic estimates. Figure 2 also includes estimates of the coefficient of variation (CV), which is the standard deviation of each distribution divided by its mean. The CV of a gamma distribution D is √(1/cD) (see Equation 2). The CV is generally constant over a fairly large range of timed intervals (Church and Meck 2003, but see Bizo et al. 2006). Such invariance was roughly replicated in all distributions except for the Long distribution in the Long-First condition. During the initial schedule and on Trial 1 of the terminal schedule, all CVs were about 0.4 to 0.5, a range that encompasses typical estimates from pigeons in similar procedures (e.g., Sanabria and Killeen 2007). In subsequent trials, CVs changed only slightly for most distributions, but increased substantially for the Long distribution when its reinforcement was discontinued.
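As a rough check on the baseline CVs, the shape parameter can be recovered from a mode and a standard deviation and then converted to a CV. The sketch below assumes a shape-scale gamma parameterization, in which mode = (c − 1)λ and standard deviation = √c·λ, fixes the mode of the Long distribution at 45 s, and plugs in the mean baseline estimates from Table 3 for Experiment 1. Because mean estimates are not necessarily representative of individual pigeons, the output is only indicative; the function names are hypothetical.

```python
import math

def gamma_shape(mode, sd):
    """Shape c of a gamma distribution with the given mode and standard deviation."""
    root_c = (mode + math.sqrt(mode ** 2 + 4 * sd ** 2)) / (2 * sd)
    return root_c ** 2

def gamma_cv(mode, sd):
    """Coefficient of variation (SD / mean) of the gamma distribution: 1 / sqrt(c)."""
    return 1 / math.sqrt(gamma_shape(mode, sd))

# (mode, SD) pairs in seconds: mean baseline estimates from Table 3, Experiment 1.
baseline = {
    "Short, Short-First": (13.7, 7.4),
    "Short, Long-First": (16.2, 10.9),
    "Long, Short-First": (45.0, 29.6),
    "Long, Long-First": (45.0, 30.2),
}

cvs = {label: gamma_cv(mode, sd) for label, (mode, sd) in baseline.items()}
for label, cv in cvs.items():
    print(f"{label}: CV = {cv:.2f}")
```

Under these assumptions, the four baseline CVs should come out near the 0.4 to 0.5 range described in the text.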
Figure 3 shows mean response rate as a function of time in trial in the Short-First condition, in trials 6–30 of the FI 15-s schedule, in trials 1–5 of the FI 45-s schedule (the transition period), and in trials 6–25 of the FI 45-s schedule. The thick continuous lines show mean fits of the Negative Acceleration model; the broken curves are the underlying Short and Long distributions. Consistent with its high PVAF (Table 2), the Negative Acceleration model provided a faithful description of performance data. Figure 3 depicts how, over trials without reinforcement at 15 s, the Short distribution shifted rightwards (increased MS), becoming flatter (increased σS) and smaller (reduced KS). The shift in the peak of the Short distribution over trials is traced by the “X” symbols.
Figure 3.
Mean response rate (circles) as a function of time in trial in the Short-First condition, in trials 6–30 of the FI 15-s schedule, in the first five trials of the FI 45-s schedule (the transition period), and trials 6–25 of the FI 45-s schedule, in Experiment 1. Thick continuous lines are mean traces of the Negative Acceleration model (see Equation 1 and Table 1) fit to individual pigeons; the broken curves are mean traces of each of the two underlying distributions. Thin continuous lines are mean traces of the Negative Acceleration model with the mode of the Short distribution (MS) constant across trials (the mode of the Long distribution, ML, was always constant). The “X” symbols trace the height and location of the peak of the mean Short distribution during the transition period and afterwards.
The thin continuous curve in Figure 3 is the mean best fit of a model that assumes a constant MS across trials. Such a constrained model underpredicts response rate around the peak of the response rate function during the 2nd and 3rd trials of the transition period into the FI 45-s schedule. These are critical trials, because they capture the first adjustments of the pigeons to a change in schedule. On all other trials, predictions of the constrained and unconstrained models (which imply no shift and a rightward shift in MS, respectively) were mostly indistinguishable.
Figure 4 shows mean response rate as a function of time in trial in the Long-First condition, in trials 6–30 of the FI 45-s schedule, in trials 1–5 of the FI 15-s schedule (the transition period), and in trials 6–30 of the FI 15-s schedule. The thick continuous lines show mean fits of the Negative Acceleration model; the broken curves are the underlying Short and Long distributions. Consistent with its high PVAF (Table 2), the Negative Acceleration model provided a faithful description of performance data. Figure 4 shows that, over trials reinforced at 15 s, the Short distribution became larger (increased KS) and the Long distribution became flatter and thus, within the 0–15 s range, contributed progressively less to the observed response rate. During the last 25 FI 15-s trials, response rate at 15-s was more than double the response rate at 15-s during FI 45-s, and the contribution of the Long distribution to response rate was negligible.
Figure 4.

Mean observed response rate (circles) as a function of time in trial in the Long-First condition, in trials 6–30 of the FI 45-s schedule, in the first five trials of the FI 15-s schedule (the transition period), and trials 6–30 of the FI 15-s schedule, in Experiment 1. Thick continuous lines are mean traces of the Negative Acceleration model (see Equation 1 and Table 1) fit to individual pigeons; the broken curves are mean traces of each of the two underlying distributions (at 15-s, highest curve is the Short distribution).
2.3 Discussion
Pigeons were repeatedly exposed to a shift in the timing of reinforcement in the middle of each training session. Despite this repeated exposure, FI performance revealed no evidence of trial-dependent anticipation of the shift in timing of reinforcement. Instead, it appears that the change in schedule prompted an adjustment in performance that was generally gradual but mostly complete by the 5th trial of the terminal schedule.
Performance in the initial and terminal schedules was generally consistent with well-established patterns of FI performance: maximal response rate was observed near the time of reinforcement, and was higher in the shorter than in the longer FI schedule (Guilhardi and Church 2005). On any given trial, performance was well described by the mixture of two response frequency distributions, one controlled by reinforcement at 15 s (Short) and the other by reinforcement at 45 s (Long; Equations 1–3). Stable asymptotic performance on the FI 15-s schedule was very similar regardless of whether it preceded or followed the FI 45-s schedule. In contrast, the asymptotic response rate gradient on the FI 45-s schedule appeared to be steeper when following, rather than when preceding, the FI 15-s schedule (Figure 1). The mixture-distribution model attributed this difference in performance not only to the dispersion of the Long distribution (in the initial FI 45-s schedule, mean σL = 30.2 +/− 2.6 s; in the last trials of the terminal FI 45-s schedule, mean σL = 21.8 +/− 4.8 s), but also to the mode of the Short distribution (in the initial FI 45-s schedule, mean MS = 16.2 +/− 1.2 s; in the last trials of the terminal FI 45-s schedule, mean MS = 24.8 +/− 4.9 s). In other parameters, the Short and Long distributions underlying initial and terminal FI 45-s performance were remarkably similar. Differences and similarities in asymptotic parameter estimates across initial and terminal FI 45-s schedules may be appreciated in Figure 2, by comparing the gray symbols when x = 1 (initial) and the corresponding black symbols when x = “SS” (terminal), and in Figures 3 and 4, by comparing the broken curves in the top panel of Figure 4 (initial) and in the bottom-right panel of Figure 3 (terminal).
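The mixture description above can be sketched in code. This is only an illustration under stated assumptions: Equations 1–3 are not reproduced in this section, so the sketch assumes that each component is a standard gamma density, specified by its mode M and standard deviation σ and rescaled so that its peak equals the maximal response rate K. The function names and the values of σS, KS, and KL are hypothetical; only MS = 16.2 s, ML = 45 s, and σL = 30.2 s come from the text.

```python
import math


def gamma_shape_scale(mode, sd):
    """Shape c and scale theta of a gamma distribution with a given mode and SD.

    Uses mode = (c - 1) * theta and sd**2 = c * theta**2, which together give
    theta**2 + mode * theta - sd**2 = 0 (positive root taken).
    """
    theta = (-mode + math.sqrt(mode ** 2 + 4.0 * sd ** 2)) / 2.0
    return mode / theta + 1.0, theta


def gamma_pdf(t, shape, scale):
    """Gamma density at time t, computed in log space for numerical stability."""
    if t <= 0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def component_rate(t, mode, sd, k):
    """Gamma-shaped response-rate curve whose peak, at t = mode, equals k."""
    shape, scale = gamma_shape_scale(mode, sd)
    return k * gamma_pdf(t, shape, scale) / gamma_pdf(mode, shape, scale)


def mixture_rate(t, short, long_):
    """Predicted response rate at time t: sum of the Short and Long components."""
    return component_rate(t, *short) + component_rate(t, *long_)


# (mode, sd, k) per component; sd of Short and both k values are hypothetical.
short = (16.2, 7.0, 60.0)
long_ = (45.0, 30.2, 40.0)
for t in (5, 15, 30, 45):
    print(t, round(mixture_rate(t, short, long_), 1))
```

Parameterizing each component by mode and standard deviation, rather than by shape and scale, keeps the sketch aligned with the parameters (M, σ, K) that the text reports.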
The difference in asymptotic MS between initial and terminal FI 45-s schedules suggests that reinforcement in the FI 15-s component of each session interfered with performance in the alternative FI 45-s component. When the FI 45-s schedule was presented first (Long First), reinforcement at 15 s appeared to be weakly anticipated, but such anticipation was independent of trials completed. This pattern of interference has been reported in alternating FI schedules, when the difference between schedules is not too large (Ludvig and Staddon 2005). When the FI 45-s schedule was presented second (Short First), past reinforcement at 15 s appeared to maintain some responses around times at which pecks were never reinforced, as if the time when reinforcement had been delivered was stretched in memory.
The adjustment in performance prompted by the change in schedule was well described by a negatively accelerated change in various distribution parameters (a kindred model is proposed by Guilhardi and Church 2005). The magnitude and direction of the change in parameters depended on the direction of the change in schedule. When transitioning from FI 45-s to FI 15-s (Long First), the Short distribution barely changed, except for a noticeable increase in maximal response rate; the Long distribution appeared to rapidly increase in dispersion and slowly decrease in maximal response rate. These inferences should be taken with caution, because predictions of performance were truncated at 15 s, and thus both theoretical distributions were, for the most part, verified against very limited data.
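A negatively accelerated change of this kind can be sketched as follows. The exact form of the Negative Acceleration model is given in Table 1, which is not reproduced in this section, so the sketch assumes a simple geometric (linear-operator) update from a Trial-1 value toward an asymptote at rate ζ. The function name and the numerical values are hypothetical.

```python
def negatively_accelerated(p1, pa, zeta, trial):
    """Value of a distribution parameter on a given terminal-schedule trial.

    Assumed form (not taken from Table 1): on each trial the parameter closes
    a fixed proportion zeta of its remaining distance to the asymptote pa,
    starting from p1 on trial 1.
    """
    return pa + (p1 - pa) * (1.0 - zeta) ** (trial - 1)


# Hypothetical example: a mode drifting from 15 s toward 25 s with zeta = 0.5.
for trial in range(1, 7):
    print(trial, round(negatively_accelerated(15.0, 25.0, 0.5, trial), 2))
```

Under this assumed form, the proportion of the total change completed by trial n is 1 − (1 − ζ)^(n − 1), so larger values of ζ imply that most of the adjustment occurs within the first few trials of the terminal schedule.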
When transitioning from FI 15-s to FI 45-s (Short First), the maximal response rate declined for both distributions, faster for the Short than for the Long distribution, whereas the standard deviations converged, with the Short distribution becoming flatter and the Long distribution steeper. Most importantly, the mode of the Short distribution gradually shifted rightwards. These results, along with those of the Long First condition, suggest that the discontinuation of reinforcement at a particular time (a) reduces the amount of behavior supported by such reinforcement, (b) displaces that behavior to a later time, and (c) increases the variability in time of that behavior. Inferences (a) and (c) are consistent with other findings in extinction performance (Guilhardi and Church 2006). Inferences (b) and (c) may imply each other if the CV is constant or nearly constant, as suggested by some of the data reported here (bottom-right panel in Figure 2).
The rightward shift in time of previously reinforced timed behavior is inconsistent with the notion that performance depends on samples of time (+/− error) encoded in memory, as suggested by cognitive models of timing (Church, Meck, and Gibbon 1994; Gallistel and Gibbon 2000). To keep this assumption, it would be necessary to assume also that, absent reinforcement, encoded time is gradually stretched in memory. A simpler assumption is that, consistent with behavioral theories of timing (Killeen and Fetterman 1988; Machado 1997; Machado et al. 2009), timed behavior is dependent on the differential activation of behavioral states, that transitions between behavioral states follow a Poisson process, and that the speed of such a process depends on the rate of reinforcement (see also Bizo and White 1994, 1994b). Consistent with the present data, this hypothesis predicts that a shift in FI requirement from 15 to 45 s would result in a slower transition between states (because of the reduction in rate of reinforcement), which implies a decline in response rate across states, including the state activated when reinforcement is delivered, but particularly in the state undergoing extinction. It also predicts that the state undergoing extinction would be activated later, and thus the modal time of behavior supported by this state would shift rightwards.
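The predicted rightward shift can be illustrated with a minimal sketch of a Poisson process over behavioral states. It assumes, for illustration only, that the transition rate is strictly proportional to the rate of reinforcement and that a single state n supports the behavior undergoing extinction; the function names, the choice n = 5, and the rates are hypothetical. The sketch addresses only the timing of state activation, not the predicted decline in response rate.

```python
import math


def state_activation(t, n, rate):
    """Probability that state n is active at time t: Poisson P(N(t) = n)."""
    if t <= 0:
        return 1.0 if n == 0 else 0.0
    x = rate * t
    return math.exp(n * math.log(x) - x - math.lgamma(n + 1))


def peak_time(n, rate):
    """Time at which the activation of state n (n >= 1) is maximal: n / rate."""
    return n / rate


n = 5                  # hypothetical state conditioned under FI 15-s
fast = n / 15.0        # rate that places the peak of state n at 15 s
slow = fast / 3.0      # assumed threefold slowing after the shift to FI 45-s
print(peak_time(n, fast), peak_time(n, slow))
```

With these assumptions, slowing the process stretches the activation curve of state n along the time axis, so its modal time moves later and its spread in time increases.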
Some alternative accounts of performance during the transition between FI 15-s and FI 45-s were tested and ruled out. These included a mixture of two distributions with fixed modes, a mixture of a Short distribution with fixed mode and a free-floating Long distribution, and a single distribution shifting rightwards. Other accounts are plausible, but they are substantially more complex than the rightward-shifting distribution model. For instance, it is also possible that both Short and Long distributions shift rightwards during the transition period. This possibility is consistent with a deceleration of the Poisson process controlling the behavioral states associated with reinforcement. In such a case, the mode of the Long distribution would be expected to shift, over the course of the transition period, from an intermediate time (e.g., 35 s) to the time of reinforcement (45 s). A test of this hypothesis would have to capture the whole Long distribution to visualize its change over the transition period, but the procedures conducted in the present study only captured the left half of the Long distribution. Finally, another possibility is that the Short distribution does not shift rightwards, but that instead an intermediate distribution rises and drops during the transition period. The intermediate-distribution model is substantially more complex than a rightward-shifting distribution model, because it involves specifying the parameters of a third distribution, the rate of change of those parameters as the distribution is revealed, and the rate of change of those parameters as the distribution wanes. Furthermore, because responses are not topographically tagged to one FI or the other, it is impossible to distinguish a rising-and-dropping intermediate distribution from a distribution that is shifting rightwards.
The results of Experiment 1 support behavioral theories of timing over memory-encoding models. In Experiment 2 we sought to replicate and magnify the effects observed in Experiment 1 by using an FI 60-s schedule, instead of an FI 45-s schedule, as the alternative component to the FI 15-s schedule. Also, because in Experiment 1 the Long First condition only supported very limited inferences, Experiments 2 and 3 are focused on an analogue of the Short First condition, exposing pigeons to the FI 15-s schedule first.
3. Experiment 2
3.1 Methods
3.1.1 Subjects
Six experienced male pigeons (Columba livia) served as experimental subjects; none of them participated in Experiment 1. Housing conditions and feeding schedules for these subjects were identical to those in Experiment 1.
3.1.2 Apparatus
Experimental sessions were conducted in six MED Associates modular test chambers that were identical to those described in Experiment 1.
3.1.3 Procedure
3.1.3.1 Pretraining
All birds were pretrained on FI 15-s as described in Experiment 1 for two sessions.
3.1.3.2 Experiment training
Experimental sessions were similar to the Short-First sessions in Experiment 1, except that the Long schedule was FI 60-s instead of FI 45-s, each session consisted of 80 trials (i.e., 2 blocks of 40 trials) instead of 60 trials, and reinforcement was 2 instead of 2.5 s of hopper activation. Briefly, in each session, pigeons were exposed to 40 FI 15-s trials followed by 40 FI 60-s trials. Twenty-six sessions were conducted.
3.1.4 Data Analysis
Data analysis was conducted as in Experiment 1, with a few modifications: (1) the transition period was identified a priori as the first 5 trials in the terminal FI 60-s schedule; (2) the first 5 trials of each session, and only those, were excluded from analysis a priori; (3) the upper limit of the asymptote of the Short-distribution mode, MS, was 60 s; (4) the mode of the Long distribution, ML, was fixed at 60 s instead of at 45 s (Equation 1); and (5) only parameters of the Negative Acceleration model (Table 1) were estimated, by fitting the model only to FI 60-s performance data.
3.2 Results
Mean (+/− SEM) Negative Acceleration parameter estimates are shown in Table 3. Because every distribution parameter varied in the same direction for all pigeons over the course of the transition period, no constraint for consistency was required. Estimates shown in Table 3 suggest that, during the terminal schedule, (a) MS increased from a time close to the FI requirement toward a substantially longer time, (b) σS also increased, and (c) σL, KS, and KL declined. The directions of these changes were consistent with those observed in the Short-First condition in Experiment 1.
Mean predictions of distribution parameters at each transition trial are shown in Figure 5. These curves are consistent with inferences drawn from mean parameter estimates reported in Table 3. They also show that, despite very high estimates of asymptotic σS, σS did not rise substantially beyond estimates of σL. Figure 5 also shows that predicted CVs of Short and Long distributions were very similar and that the Long-distribution CV generally declined over the transition period.
Figure 5.
Mean (+/− SEM) Short and Long distribution parameters over the terminal FI 60-s schedule, predicted by the Negative Acceleration model on the basis of individual parameters fitted to data from Experiment 2. M is the mode, σ the standard deviation, K the maximal response rate, and the coefficient of variation (CV) is the standard deviation divided by the mean. Circles are Short distribution parameters; triangles are Long distribution parameters. “SS” in the x-axis means “subsequent trials”, trials 6 to 40.
Figure 6 shows mean response rate as a function of time in trial, in each transition trial. The thick continuous lines show mean fits of the Negative Acceleration model; the broken curves are the underlying Short and Long distributions. As in Experiment 1, the Negative Acceleration model provided a faithful description of performance data. Figure 6 shows that, over trials without reinforcement at 15 s, the Short distribution shifted slightly rightwards (increased MS), becoming flatter (increased σS) and smaller (reduced KS). The shift in the peak of the Short distribution over FI 60-s trials is traced by the “X” symbols.
Figure 6.
Mean observed response rate (circles) as a function of time in the first 5 trials of the terminal FI 60-s schedule (the transition period), and trials 6–40 of the FI 60-s schedule, in Experiment 2. Thick continuous lines are mean traces of the Negative Acceleration model (see Equation 1 and Table 1) fit to individual pigeons; the broken curves are mean traces of each of the two underlying distributions. The “X” symbols trace the height and location of the peak of the mean Short distribution across FI 60-s trials.
3.3 Discussion
Experiment 2 replicated the effects observed in Experiment 1 of an abrupt but regular change in FI requirement on the performance of pigeons. Performance during the terminal FI 60-s schedule was well described by the mixture of two response frequency distributions. During the first 5 FI 60-s trials (the transition period), the Short distribution, which was controlled by reinforcement at 15 s, shifted rightwards and became flatter and smaller; the Long distribution, which was controlled by reinforcement at 60 s, became steeper but also smaller. All these qualitative effects were observed in Experiment 1, when the terminal FI was 45 s.
The aim of increasing the FI from 45 s in Experiment 1 to 60 s in Experiment 2 was to visualize more clearly the shift in the peak of the Short distribution during the transition period (cf. Figure 3). Despite the congruency between experiments in the dynamics of distribution parameters, the shift in the peak of the Short distribution was, if anything, less visible when FI 60 s served as terminal schedule (see slope of “X” symbols in Figures 3 and 6). Some differences in parameter estimates between experiments may be informative of this rather paradoxical effect of lengthening the FI requirement. In particular, note that in the Short-First condition of Experiment 1 (Table 3) the mode of the Short distribution (MS) shifted relatively fast (mean ζMS = .72 +/− .12) but its standard deviation (σS) increased slowly (mean ζσS = .08 +/− .08). In contrast, in Experiment 2 (Table 4), MS shifted relatively slowly (mean ζMS = .42 +/− .13) and σS increased fast (mean ζσS = .51 +/− .22). That is, in Experiment 2, the Short distribution became flatter too fast to allow for a clear visualization of the shift of its mode in Figure 6. Although it is not yet clear why a longer terminal schedule would yield this effect, it appears that this is a selective effect, because estimates of all other higher-order parameters (ζML, ζσL, ζKS, ζKL) were virtually identical in both experiments.
Further comparisons between parameter estimates in both experiments (Table 3) reveal potentially interesting regularities. First, the mean asymptotic mode of the Short distribution (MA,S) was longer when the terminal schedule was FI 60 s rather than FI 45 s, further suggesting that the migration of the Short distribution during the transition period is sensitive to the location of the Long distribution. Second, the mean maximal response rate (K) in both Short and Long distributions, both at baseline and asymptote, was lower when the terminal schedule was FI 60-s instead of FI 45-s. This effect suggests that the amount of behavior in both distributions depends on the global rate of reinforcement in the session.
In Experiment 3, we sought further replication of the effects observed in Experiments 1 and 2, and verification that the differences in parameter estimates between experiments extend to a more extreme terminal schedule. In Experiment 3, the initial block of FI 15-s trials was followed by a terminal block of extinction trials. This manipulation was also expected to isolate the changes in the Short distribution during the transition period from behavior controlled by the terminal FI.
4. Experiment 3
4.1 Methods
4.1.1 Subjects
The same 6 pigeons (Columba livia) used in Experiment 2 served as experimental subjects in Experiment 3. Housing conditions and feeding schedules for these subjects were identical to those in Experiments 1 and 2.
4.1.2 Apparatus
Experimental sessions were conducted in six MED Associates modular test chambers that were identical to those described in Experiments 1 and 2.
4.1.3 Procedure
4.1.3.1 Experiment training
Because the pigeons were already trained on FI schedules (Experiment 2), no pretraining was required. Experimental sessions were similar to those of Experiment 2, except that Long-schedule trials consisted of a 0.5-s non-contingent hopper activation 60 s after trial initiation. We have observed in our laboratory that 0.5-s non-contingent hopper activations do not allow retrieval of food by pigeons and do not support key pecking (see also Epstein 1981). Auditory and visual properties of such activations, however, appear to prevent roosting. Twenty-five sessions were conducted.
4.1.4 Data analysis
Data analysis was conducted as in Experiment 2, with a few modifications. The mode of the Long distribution was fixed at 75 s, which is the time from the beginning of a terminal trial to the next feeding, if the terminal trial were followed by an FI 15-s trial. The assumption of a Long distribution of responses even under extinction contingencies was necessary to account for the systematic rise in response rate within terminal trials. Although this pattern might have been controlled by the non-contingent hopper activation at 60 s, it was more likely due to initial-schedule reinforcement 15 s after the end of each trial (Sanabria and Killeen 2007). In any case, changing the mode of the Long distribution between 60 and 75 s made little difference to Short-distribution parameters, which are the focus of the analysis. Furthermore, the standard deviation of the Long distribution (σL) was presumed to be constant, and thus ζσL was not estimated.
Because these pigeons had a history of keypeck reinforcement during the second half of every session, there was some concern that extinction learning was still under way during the sessions under analysis (last 10). To verify that there were no substantial trends in performance during these sessions, a linear regression of total responses per session was conducted for each individual pigeon over the sessions under analysis; mean (+/− SEM) intercept and slope are reported.
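The trend check described above can be sketched as follows. The response counts are made-up placeholders, not data from the study, and the function names are hypothetical; the sketch assumes an ordinary least-squares line fitted to each pigeon's session totals, with sessions indexed from 0.

```python
import math
import statistics


def ols_line(y):
    """Least-squares (intercept, slope) of y regressed on session index 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (yi - y_mean) for i, yi in enumerate(y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def mean_sem(values):
    """Mean and standard error of the mean across subjects."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical total responses per session for three pigeons (placeholder data).
responses = {
    "bird_a": [1400, 1380, 1420, 1350, 1390],
    "bird_b": [900, 950, 870, 910, 880],
    "bird_c": [2000, 1900, 1950, 1980, 1890],
}
slopes = [ols_line(counts)[1] for counts in responses.values()]
print(mean_sem(slopes))
```

Comparing the mean slope with its standard error, as in this sketch, is a coarse check for a trend across sessions rather than a formal test.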
4.2 Results
The number of responses per session did not reveal a substantial trend over the analyzed sessions. The mean intercept was 1376 responses (+/− 276) and the mean slope was only slightly negative, -18.59 responses/session (+/− 27.67). That the mean slope was within one standard error of zero suggests that the negative slope was not substantial.
When no constraints were imposed, the direction of change of Negative Acceleration parameter estimates was inconsistent across pigeons in two parameters: M1,S < MA,S for all pigeons except one, for which M1,S > MA,S, and K1,L > KA,L for all pigeons except another, for which K1,L < KA,L. That is, for only one pigeon the mode of the Short distribution appeared to shift leftward during the terminal extinction schedule; for only one other pigeon, the maximal response rate of the Long distribution appeared to increase during this schedule. Seeking consistency among subjects, we constrained the Negative Acceleration model such that M1,S = MA,S, and K1,L = KA,L for the corresponding subjects. This constraint eliminated parameters MA,S and ζMS for the first subject and KA,L and ζKL for the second subject, reducing the number of free parameters from 84 [(4 changing distribution parameters × 3 updating parameters + σL + error variance) × 6 pigeons] to 80. These constraints increased RSS from 70.8 to 71.0, reducing PVAF from 82.5% to 82.4%. An AICc analysis, however, favored the constrained model, whose AICc was 3.60 units lower than that of the more complex, unconstrained model.
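The parameter count and the model comparison can be sketched as follows. The sketch assumes the common least-squares form of AICc; the number of data points behind each fit is not stated in this section, so N_OBS is a hypothetical placeholder and the resulting AICc difference is not expected to reproduce the reported 3.60 units. Function names are also hypothetical.

```python
import math


def free_parameters(changing=4, updating=3, other=2, pigeons=6, dropped=0):
    """Free parameters: (changing x updating + sigma_L + error variance) per pigeon,
    summed over pigeons, minus any parameters eliminated by constraints."""
    return (changing * updating + other) * pigeons - dropped


def aicc(rss, n_obs, k):
    """AICc for a least-squares fit (assumed form):
    n * ln(RSS / n) + 2k + 2k(k + 1) / (n - k - 1)."""
    return n_obs * math.log(rss / n_obs) + 2 * k + 2 * k * (k + 1) / (n_obs - k - 1)


k_full = free_parameters()                  # unconstrained model
k_constrained = free_parameters(dropped=4)  # two parameters dropped for each of two pigeons

N_OBS = 1000  # hypothetical number of data points; not reported in this section
delta = aicc(71.0, N_OBS, k_constrained) - aicc(70.8, N_OBS, k_full)
print(k_full, k_constrained, round(delta, 2))
```

A negative difference in this sketch means that the small increase in RSS is outweighed by the penalty saved by removing four parameters, which is the logic behind preferring the constrained model.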
Mean (+/− SEM) Negative Acceleration parameter estimates, with constraints for consistency imposed, are shown in Table 3. These estimates suggest that, during the terminal schedule, (a) MS increased from a time close to the FI requirement toward a substantially longer time, (b) σS also increased, and (c) KS and KL declined. The directions of these changes were consistent with those observed in Experiment 2 and in the Short-First condition of Experiment 1.
Mean predictions of distribution parameters at each transition trial are shown in Figure 7. Because σL was assumed constant, it appears as a constant function in the top-right panel; because ML was also assumed constant, the corresponding CV is also shown as a constant function in the bottom-right panel. These curves serve as reference for changes in Short-distribution parameters, which were consistent with inferences drawn from mean parameter estimates reported in Table 3. It is important to note that, as in prior experiments, despite the very high estimate of asymptotic σS, it did not rise substantially beyond the estimated σL. Figure 7 also shows that the predicted Short-distribution CV was very similar to the Long-distribution CV over the course of the transition period, but the Short-distribution CV became larger in subsequent trials.
Figure 7.
Mean (+/− SEM) Short and Long distribution parameters over the terminal extinction schedule, predicted by the Negative Acceleration model on the basis of individual parameters fitted to data from Experiment 3. M is the mode, σ the standard deviation, K the maximal response rate, and the coefficient of variation (CV) is the standard deviation divided by the mean. Circles are Short distribution parameters; triangles are Long distribution parameters. “SS” in the x-axis means “subsequent trials”, trials 6 to 40. The standard deviation of the Long distribution (σL) was constrained to be constant across trials.
Figure 8 shows mean response rate as a function of time in trial, in each transition trial. The thick continuous lines show mean fits of the Negative Acceleration model; the broken curves are the underlying Short and Long distributions. As in Experiments 1 and 2, the Negative Acceleration model provided a faithful description of performance data. Figure 8 shows that, over trials without reinforcement at 15 s, the Short distribution shifted slightly rightwards (increased MS), becoming flatter (increased σS) and smaller (reduced KS), and the Long distribution declined rapidly in size (reduced KL). The shift in the peak of the Short distribution over extinction trials is traced by the “X” symbols.
Figure 8.
Mean observed response rate (circles) as a function of time in the first 5 trials of the terminal extinction schedule (the transition period), and trials 6–40 of the extinction schedule, in Experiment 3. Thick continuous lines are mean traces of the Negative Acceleration model (see Equation 1 and Table 1) fit to individual pigeons; the broken curves are mean traces of each of the two underlying distributions. The “X” symbols trace the height and location of the peak of the mean Short distribution across extinction trials.
4.3 Discussion
Experiment 3 replicated the changes in the distribution of responses controlled by an FI 15-s schedule observed in Experiments 1 and 2 when reinforcement at 15 s was discontinued. Unlike in preceding experiments, however, changes in response frequency distribution in Experiment 3 were not confounded by reinforcement at a later time in the trial. Nonetheless, responsiveness on each trial to reinforcement in the following trial could not be discounted. Such responsiveness would explain the persisting rise (even at very low rates) of responding at the end of extinction trials (Sanabria and Killeen 2007).
As in previous experiments, performance during the terminal schedule was well described by the mixture of two response frequency distributions. During the first 5 extinction trials, the Short distribution, which was controlled by reinforcement at 15 s, shifted rightwards and became flatter and smaller; the Long distribution, which was presumed to be controlled by reinforcement at 75 s (15 s past the end of the trial), also became smaller. All these qualitative effects replicate those of Experiments 1 and 2.
The aim of removing reinforcement in the terminal schedule was to (1) isolate the changes in the Short distribution during the transition period from behavior controlled by reinforcement at other times, so as to better visualize those changes, or, alternatively, (2) verify whether extinction exacerbated the pattern of change in parameters over the transition period that obscured the shift in peak of the Short distribution in Experiment 2. It is not completely clear that extinction in the terminal schedule afforded a better visualization of the rightward shift in the Short distribution during the transition period (compare “X” symbols in Figures 3, 6, and 8). It is evident, however, that extinction did not exacerbate the pattern of change in Short-distribution parameters observed in Experiment 2. If anything, extinction slightly reversed such pattern: relative to Experiment 2, the mean updating rate of the Short mode (ζMS) and standard deviation (ζσS) in Experiment 3 were, respectively, slightly higher (by .07) and lower (by .17). These differences may just be due to sampling error, but they certainly do not support the notion, suggested by the comparison between Experiments 1 and 2, that a lower rate of reinforcement in the terminal schedule slows down the increase in MS and speeds up the increase in σS.
Short-distribution parameter estimates and their corresponding updating rates are remarkably similar between Experiments 2 and 3, with one notable exception: the rate of decline of the maximal response rate (ζKS). Relative to Experiment 3, where mean ζKS = .65 (+/− .12), this rate was lower in Experiments 1 (Short-First condition) and 2 [pooled mean ζKS = .52 (+/− .14)]. This means that responding controlled by FI 15-s may have declined faster when transitioning to extinction than when transitioning to a longer FI schedule. Although relative to sampling error this is not a large difference, it is the only one that would contribute to obscuring the rightward shift in the Short distribution observed in Experiment 3.
5. General Discussion
5.1 Anticipation of a Shift in Reinforcement Time
In 3 experiments, 10 pigeons were exposed to an abrupt but regular change in the timing of positive reinforcement. The change in the timing of reinforcement was abrupt because it took place in a single trial; it was regular because it took place on every session, always in the middle of each session. The distribution of responses over time within each trial was well described by the mixture of two gamma distributions (Equations 1–3). Similar to other models of mixed FI performance (Whitaker et al. 2003, 2008), this model suggests that timed behavior crowds around reinforcement times.
With this model in hand, the present study sought to answer two questions regarding the adaptation of behavior to regular changes in reinforcement time. The first question was whether or not the change in reinforcement time was anticipated. Regarding this question, Experiment 1 found evidence of interference by a shorter FI schedule (FI 15-s) on performance on a preceding, longer FI schedule (FI 45-s). This interference was evidenced by the heightened rate of responding near 15-s in FI 45-s trials in the Long-First condition, which was maintained for the length of the FI 45-s block (30 consecutive trials; Figure 1; see Ludvig and Staddon 2005 for similar data from alternating FI schedules). A similar pattern was observed only in the first 5 FI 45-s trials when it followed the FI 15-s schedule (Short-First condition). Furthermore, the distribution of responses in FI 15-s was remarkably similar regardless of whether it was effective before or after the FI 45-s. These data suggest that pigeons anticipate regular downshifts in reinforcement time (or increases in rate of reinforcement). Such anticipation is not expressed as a progressive adjustment of the distribution of responses over trials, but as a partial failure of schedule control. In other words, the richer schedule is anticipated, but the timing of its onset is not. It is important to note that other researchers have observed trial-dependent anticipation (“anticipatory errors”) in pigeons to regular changes in contingencies that occur in the middle of the experimental session (Cook and Rosen 2010; Rayburn-Reeves et al. 2013).
The anticipation of a downshift, but not an upshift, in FI requirement, may explain the asymmetric effect on PRP of downshifts vs. upshifts of the FI requirement. Downshifts result in a rapid reduction in PRP (Higa 1996a, Higa et al. 2002), whereas upshifts have little impact on PRP (Higa 1997; Ludvig and Staddon 2004), unless the longer FI is sustained for many trials (Higa 1996b, Higa and Tillou 2001, Higa et al. 1993). When one or few long FI trials are followed by many short FI trials, the anticipation of the reduction in delay of reinforcement may override control by the longer FI schedule. Sustaining the long FI for many trials (as was done in Experiment 1, Short-First condition) may reduce the anticipation of the reduction in delay of reinforcement, thus revealing the adjustment of behavior to the longer FI schedule.
5.2 Adjustment to a Shift in Reinforcement Time
The second question this study sought to answer was how, following a change in reinforcement time, control by the previous reinforcement time waned. Regarding this question, estimates of the mixture gamma model (Equations 1–3) obtained in all 3 experiments suggest that, following a regular rightward shift in reinforcement time or the extinction of reinforcement, responding controlled by the original reinforcement time declines in frequency, shifts rightward in time, and increases in variability over trials. The latter two changes imply that the coefficient of variation (CV) of the timing of extinguished responses is kept nearly constant. This transition takes about 5 trials to complete.
The rightward displacement of the temporal distribution of extinguished behavior is inconsistent with the notion that the mapping between objective and subjective time remains unaffected by extinction. Theories that imply such mapping, like Scalar Expectancy Theory (Church et al. 1994), Rate Estimation Theory (Gallistel and Gibbon 2000), Packet Theory (Guilhardi and Church 2005) and the Behavioral Economic Model (Jozefowiez et al. 2009), can only account for the rightward-displacement effect by assuming that the downward shift in FI induced a deceleration of the internal clock (consistent with the reinforcer-dependent pacemaker suggested by Killeen and Fetterman 1988), a loss of counts in the clock accumulator (consistent with the attention-gate model suggested by Zakay and Block 1996), or a lengthening of the representation of the original FI in memory. Changes in the response decision rule may explain the reduction in response rate and the increased dispersion of the displaced distribution (Sanabria et al. 2009), but not its displacement.
A model that accounts for the rightward-displacement effect may instead map objective time to sequential behavioral states (Killeen and Fetterman 1988; Machado et al. 2009; Machado 1997), and assume that transitions between states depend on the local rate of reinforcement and that responding in a state declines exponentially over extinction. The nearly constant CV observed during the transition period in all experiments implies that extinction affects the scale parameter of the displaced distribution (λS), not its shape parameter (cS; Equation 1). This suggests that the rightward displacement occurs because of a slower transition between states (indexed by λS), not because of an increase in the number of states preceding the previously reinforced state (indexed by cS).
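The link between a constant CV and a change in scale alone can be made explicit with the standard moments of the gamma distribution. The expressions below assume the common shape-scale parametrization, with shape c and scale λ; Equation 1 is not reproduced in this section, so the correspondence with its symbols is an assumption.

```latex
\begin{align*}
f(t) &= \frac{t^{c-1} e^{-t/\lambda}}{\Gamma(c)\,\lambda^{c}}, \qquad t > 0,\\
\text{mean} &= c\lambda, \qquad
\text{mode} = (c-1)\lambda \quad (c > 1), \qquad
\text{SD} = \sqrt{c}\,\lambda,\\
\text{CV} &= \frac{\text{SD}}{\text{mean}} = \frac{1}{\sqrt{c}}.
\end{align*}
```

Under this parametrization, an increase in λ moves the mean and the mode rightward and widens the distribution in the same proportion, leaving the CV unchanged, whereas any change in c would alter the CV.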
5.3 Implications for Models of Associative Learning
As we develop the framework for a dynamic theory of timing, it is important to consider its implications for a related domain of learning theory, associative learning. Contemporary theories of associative learning suggest that the effectiveness of the conditioned stimulus (CS) depends on the information it communicates about the timing of the unconditioned stimulus (US; Ward et al. 2012; Balsam, Drew, and Gallistel 2010; Balsam and Gallistel 2009). To the extent that the subjective representation of time is tied to behavioral states, the informativeness of the CS is reduced to its ability to maintain a consistent concomitance between subjective time (the behavioral state) and objective time (when the US occurs). The CS may facilitate such concomitance by resetting the cascade of behavioral states (i.e., by serving as a “time marker”) and by accelerating the rate at which states are produced. The resetting function implies that the interval to be timed, which typically coincides with modal responding at MD (Equation 2), is reduced; reducing MD while keeping constant the count of states (cD in Equation 2) implies a reduction in the mean interval between states (λD) and thus a reduction in the variance of responding over time, because, with the shape parameter held constant, the variance of a gamma distribution increases with λ. The latter relation also explains why the accelerating function of the CS reduces the temporal variability of responses. The accelerating function of the CS, in turn, implies a deceleration of the internal clock as the CS+ (trial onset) signaling imminent reinforcement in the first half of the Short-First condition turns into a CS− signaling more delayed reinforcement.
The model of associative learning supported by the displacement of extinguished behavior over time does not select between competing real-time theories of learning, but it imposes an informative constraint on them. For instance, there has been a persistent discussion on whether the timing of learned events is encoded on a linear or logarithmic scale (Gibbon and Church 1981; Machado and Vasconcelos 2006; Yi 2009). The model of timing advanced here does not resolve this dispute but suggests an important consideration so far neglected: that the scale on which time is encoded, whichever it may be, must be flexible. Such flexibility may be incorporated into any real-time theory of timing and associative learning, but it appears to be particularly congenial to theories that assume that the passage of time is tracked by sequential behavioral states (Killeen and Fetterman 1988; Machado et al. 2009; Machado 1997) or by memory traces decaying at multiple rates (Buhusi and Schmajuk 1999; Staddon and Higa 1999). In these theories, the speed of transition between states and the rate of decay of memory traces may be sensitive to changes in the local rate of reinforcement. To the extent that associative learning depends on the representation of stimuli on a subjective scale of time, the present data suggest that the acquisition and extinction of learned responses may depend largely on the reconfiguration of the temporal representation of stimuli, which is sensitive to changes in the local rate of reinforcement.
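The relation between the interval to be timed, the interval between states, and temporal variability can be illustrated numerically. The sketch below assumes the shape-scale parametrization of the gamma distribution, in which the mode is (c − 1)λ and the variance is cλ²; Equation 2 is not reproduced in this section and may differ. The function names and the parameter values are hypothetical.

```python
def scale_for_mode(mode, shape):
    """Scale lambda that places the gamma mode at `mode` (requires shape > 1)."""
    if shape <= 1:
        raise ValueError("the mode formula (c - 1) * lambda requires shape > 1")
    return mode / (shape - 1)

def gamma_variance(shape, scale):
    """Variance of a gamma distribution with the given shape and scale."""
    return shape * scale ** 2

shape = 16.0  # hypothetical count of states, held constant
for mode in (45.0, 30.0, 15.0):
    lam = scale_for_mode(mode, shape)
    # A shorter timed interval implies a smaller lambda and a smaller variance.
    print(f"mode={mode:.0f} s  lambda={lam:.2f} s  variance={gamma_variance(shape, lam):.1f}")
```

With the count of states held constant, shortening the timed interval shortens the interval between states and, with it, the spread of responding over time.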
Highlights
A sudden change in the timing of reinforcement initiated a transition period.
In this period, control of behavior was gradually transferred between schedules.
Control was expressed in distinct distributions of responses over time.
Schedule discontinuation displaced the response distribution down and to the right.
Such displacement supports a model of timing based on cascades of behavioral states.
Acknowledgments
This research was supported by seed funding from the College of Liberal Arts & Sciences, Arizona State University, and by the National Institutes of Health (MH094562, DA032632). We thank Brent Mueller for data collection and preliminary data analysis, and Peter Killeen for his support and advice.
7. References
- Balsam PD, Drew MR, Gallistel C. Time and associative learning. Comparative Cognition & Behavior Reviews. 2010;5:1–22. doi: 10.3819/ccbr.2010.50001.
- Balsam PD, Gallistel CR. Temporal maps and informativeness in associative learning. Trends in Neurosciences. 2009;32:73–78. doi: 10.1016/j.tins.2008.10.004.
- Bizo LA, Chu JYM, Sanabria F, Killeen PR. The failure of Weber's law in time perception and production. Behavioural Processes. 2006;71:201–210. doi: 10.1016/j.beproc.2005.11.006.
- Bizo LA, White KG. The behavioral theory of timing: Reinforcer rate determines pacemaker rate. Journal of the Experimental Analysis of Behavior. 1994;61:19–33. doi: 10.1901/jeab.1994.61-19.
- Bizo LA, White KG. Pacemaker rate in the behavioral theory of timing. Journal of Experimental Psychology: Animal Behavior Processes. 1994b;20:308–321.
- Buhusi, Schmajuk. Timing in simple conditioning and occasion setting: A neural network approach. Behavioural Processes. 1999;45:33–57. doi: 10.1016/s0376-6357(99)00008-x.
- Burnham K, Anderson D. Model selection and multimodel inference: a practical information-theoretic approach. Springer Verlag; New York: 2002.
- Catania AC, Reynolds GS. A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior. 1968;11:327–383. doi: 10.1901/jeab.1968.11-s327.
- Church RM. Behavioristic, cognitive, biological, and quantitative explanations of timing. In: Wasserman EA, Zentall TR, editors. Comparative cognition: Experimental explorations of animal intelligence. Oxford University Press; New York: 2006.
- Church RM, Meck WH. A concise introduction to scalar timing theory. In: Meck WH, editor. Functional and neural mechanisms of interval timing. CRC Press; Boca Raton, FL: 2003.
- Church RM, Meck WH, Gibbon J. Application of scalar timing theory to individual trials. Journal of Experimental Psychology: Animal Behavior Processes. 1994;20:135–155. doi: 10.1037//0097-7403.20.2.135.
- Cook RG, Rosen HA. Temporal control of internal states in pigeons. Psychonomic Bulletin & Review. 2010;17:915–922. doi: 10.3758/PBR.17.6.915.
- Epstein R. Amount consumed as a function of magazine-cycle duration. Behaviour Analysis Letters. 1981;1:63–66.
- Gallistel CR, Gibbon J. Time, rate, and conditioning. Psychological Review. 2000;107:289–344. doi: 10.1037/0033-295x.107.2.289.
- Gibbon J, Church RM. Time left: Linear versus logarithmic subjective time. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:87–108.
- Grondin S. Timing and time perception: A review of recent behavioral and neuroscience findings and theoretical directions. Attention, Perception, & Psychophysics. 2010;72:561–582. doi: 10.3758/APP.72.3.561.
- Guilhardi P, Church RM. Dynamics of temporal discrimination. Learning & Behavior. 2005;33:399–416. doi: 10.3758/bf03193179.
- Guilhardi P, Church RM. The pattern of responding after extensive extinction. Learning & Behavior. 2006;34:269–284. doi: 10.3758/bf03192883.
- Guilhardi P, Yi L, Church RM. Effects of repeated acquisitions and extinctions on response rate and pattern. Journal of Experimental Psychology: Animal Behavior Processes. 2006;32:322–328. doi: 10.1037/0097-7403.32.3.322.
- Higa JJ. Dynamics of time discrimination: II. The effects of multiple impulses. Journal of the Experimental Analysis of Behavior. 1996a;66:117–134. doi: 10.1901/jeab.1996.66-117.
- Higa JJ. Rapid timing of a single transition in interfood interval duration by rats. Animal Learning & Behavior. 1996b;25:177–184.
- Higa JJ. Dynamics of temporal control in rats: The effects of a brief transition in interval duration. Behavioural Processes. 1997;40:223–229. doi: 10.1016/s0376-6357(97)00021-1.
- Higa JJ, Moreno S, Sparkman N. Interval timing in rats: tracking unsignaled changes in the fixed interval schedule requirement. Behavioural Processes. 2002;58:167–176. doi: 10.1016/s0376-6357(02)00029-3.
- Higa JJ, Thaw JM, Staddon JE. Pigeons' wait-time responses to transitions in interfood-interval duration: Another look at cyclic schedule performance. Journal of the Experimental Analysis of Behavior. 1993;59:529–541. doi: 10.1901/jeab.1993.59-529.
- Higa JJ, Tillou P. Effects of increasing the time to reinforcement on interval timing in rats. International Journal of Comparative Psychology. 2001;14:64–75.
- Jozefowiez J, Staddon J, Cerutti D. The behavioral economics of choice and interval timing. Psychological Review. 2009;116:519. doi: 10.1037/a0016171.
- Killeen PR, Fetterman JG. A behavioral theory of timing. Psychological Review. 1988;95:274–295. doi: 10.1037/0033-295x.95.2.274.
- Killeen PR, Sanabria F, Dolgov I. The dynamics of conditioning and extinction. Journal of Experimental Psychology: Animal Behavior Processes. 2009;35:447–472. doi: 10.1037/a0015626.
- Leak TM, Gibbon J. Simultaneous timing of multiple intervals: Implications of the scalar property. Journal of Experimental Psychology: Animal Behavior Processes. 1995;21:3–19.
- Lejeune H, Ferrara A, Simons F, Wearden JH. Adjusting to changes in the time of reinforcement: Peak-interval transitions in rats. Journal of Experimental Psychology: Animal Behavior Processes. 1997;23:211–231. doi: 10.1037//0097-7403.23.2.211.
- Ludvig EA, Staddon JER. The conditions for temporal tracking under interval schedules of reinforcement. Journal of Experimental Psychology: Animal Behavior Processes. 2004;30:299–316. doi: 10.1037/0097-7403.30.4.299.
- Ludvig EA, Staddon JER. The effects of interval duration on temporal tracking and alternation learning. Journal of the Experimental Analysis of Behavior. 2005;83:243–262. doi: 10.1901/jeab.2005.88-04.
- Luzardo A, Ludvig EA, Rivest F. An adaptive drift-diffusion model of interval timing dynamics. Behavioural Processes. 2013; in press. doi: 10.1016/j.beproc.2013.02.003.
- Machado A. Learning the temporal dynamics of behavior. Psychological Review. 1997;104:241–265. doi: 10.1037/0033-295x.104.2.241.
- Machado A, Malheiro MT, Erlhagen W. Learning to time: A perspective. Journal of the Experimental Analysis of Behavior. 2009;92:423–458. doi: 10.1901/jeab.2009.92-423.
- Machado A, Vasconcelos M. Acquisition versus steady state in the time-left experiment. Behavioural Processes. 2006;71:172–187. doi: 10.1016/j.beproc.2005.11.004.
- Meck WH, Komeily-Zadeh FN, Church RM. Two-step acquisition: Modification of an internal clock's criterion. Journal of Experimental Psychology: Animal Behavior Processes. 1984;10:297–306.
- Pearce JM, Bouton ME. Theories of associative learning in animals. Annual Review of Psychology. 2001;52:111–139. doi: 10.1146/annurev.psych.52.1.111.
- Rayburn-Reeves RM, Laude JR, Zentall TR. Pigeons show near-optimal win-stay/lose-shift performance on a simultaneous-discrimination, midsession reversal task with short intertrial intervals. Behavioural Processes. 2013;92:65–70. doi: 10.1016/j.beproc.2012.10.011.
- Rescorla RA, Wagner AR. A theory of Pavlovian conditioning. In: Black A, Prokasy W, editors. Classical conditioning II. Appleton-Century-Crofts; New York: 1972.
- Rivière V, Darcheville J, Clément C. Rapid timing of transitions in inter-reinforcement interval duration in infants. Behavioural Processes. 2000;52:109–115. doi: 10.1016/s0376-6357(00)00130-3.
- Roberts S. Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:242–268.
- Rodríguez-Gironés MA, Kacelnik A. Behavioral adjustment to modifications in the temporal parameters of the environment. Behavioural Processes. 1999;45:173–191. doi: 10.1016/s0376-6357(99)00017-0.
- Sanabria F, Killeen PR. Temporal generalization accounts for response resurgence in the peak procedure. Behavioural Processes. 2007;74:126–141. doi: 10.1016/j.beproc.2006.10.012.
- Sanabria F, Thrailkill E, Killeen P. Timing with opportunity cost: Concurrent schedules of reinforcement improve peak timing. Learning & Behavior. 2009;37:217–229. doi: 10.3758/LB.37.3.217.
- Staddon JER, Chelaru IM, Higa JJ. Habituation, memory and the brain: The dynamics of interval timing. Behavioural Processes. 2002;57:71–88. doi: 10.1016/s0376-6357(02)00006-2.
- Staddon JER, Chelaru IM, Higa JJ. A tuned-trace theory of interval-timing dynamics. Journal of the Experimental Analysis of Behavior. 2002b;77:105–124. doi: 10.1901/jeab.2002.77-105.
- Staddon JER, Higa JJ. Time and memory: Towards a pacemaker-free theory of interval timing. Journal of the Experimental Analysis of Behavior. 1999;71:215–251. doi: 10.1901/jeab.1999.71-215.
- Ward R, Gallistel C, Jensen G, Richards V, Fairhurst S, Balsam P. Conditional stimulus informativeness governs conditioned stimulus-unconditioned stimulus associability. Journal of Experimental Psychology: Animal Behavior Processes. 2012;38:217–232. doi: 10.1037/a0027621.
- Whitaker S, Lowe CF, Wearden JH. Multiple-interval timing in rats: Performance on two-valued mixed fixed-interval schedules. Journal of Experimental Psychology: Animal Behavior Processes. 2003;29:277–291. doi: 10.1037/0097-7403.29.4.277.
- Whitaker S, Lowe CF, Wearden JH. When to respond? And how much?: Temporal control and response output on mixed-fixed-interval schedules with unequally probable components. Behavioural Processes. 2008;77:33–42. doi: 10.1016/j.beproc.2007.06.001.
- Wynne CD, Kalish ML. Effects of occasional short interfood intervals on temporal control in pigeons. Behavioural Processes. 1999;45:207–218. doi: 10.1016/s0376-6357(99)00019-4.
- Yi L. Do rats represent time logarithmically or linearly? Behavioural Processes. 2009;81:274–279. doi: 10.1016/j.beproc.2008.10.004.
- Zakay D, Block RA. The role of attention in time estimation processes. In: Pastor MA, Artieda J, editors. Time, internal clocks and movement. Elsevier; Amsterdam: 1996.







