Journal of the Experimental Analysis of Behavior, 2009 Nov; 92(3): 423–458. doi: 10.1901/jeab.2009.92-423

Learning to Time: A Perspective

Armando Machado, Maria Teresa Malheiro, Wolfram Erlhagen

Abstract

In recent decades, researchers have proposed a large number of theoretical models of timing. These models make different assumptions concerning how animals learn to time events and how such learning is represented in memory. However, few studies have examined these different assumptions either empirically or conceptually. For knowledge to accumulate, variation in theoretical models must be accompanied by selection of models and model ideas. To that end, we review two timing models, Scalar Expectancy Theory (SET), the dominant model in the field, and the Learning-to-Time (LeT) model, one of the few models dealing explicitly with learning. In the first part of this article, we describe how each model works in prototypical concurrent and retrospective timing tasks, identify their structural similarities, and classify their differences concerning temporal learning and memory. In the second part, we review a series of studies that examined these differences and conclude that both the memory structure postulated by SET and the state dynamics postulated by LeT are probably incorrect. In the third part, we propose a hybrid model that may improve on its parents. The hybrid model accounts for the typical findings in fixed-interval schedules, the peak procedure, mixed fixed-interval schedules, simple and double temporal bisection, and temporal generalization tasks. In the fourth and last part, we identify seven challenges that any timing model must meet.

Keywords: Learning-to-Time (LeT) model, Scalar Expectancy Theory (SET), mathematical models, temporal discrimination, timing


“Stated more generally the problem is how time as a dimension of nature enters into discriminative behavior and hence into human knowledge.”

(B. F. Skinner, 1938, p. 263)

The capacity to adjust behavior to temporal regularities in the environment in the range of seconds to minutes is called interval timing, or timing for short. This capacity is expressed in a variety of ways such as in anticipating an important event once a specific interval of time has elapsed, judging which of two events lasted longer, performing an action for a given duration, or choosing which of two cues signals a shorter delay to a reward. In each case, timing is said to take place because behavior is a function of one or more arbitrary intervals between events or durations of events. To say that an animal or person is timing is not to say simply that its behavior occurs in time, but that the best predictor of its behavior is an interval of time.

After several decades of research, scientists still debate the properties that characterize timing (e.g., Lejeune & Wearden, 2006; Staddon & Cerutti, 2003; Zeiler, 1998; Zeiler & Powell, 1994), the processes and neural mechanisms that underlie it (Buhusi & Meck, 2005; Ivry & Spencer, 2004; Matell & Meck, 2000; Meck, 1996), how the capacity is disrupted by pharmacological agents and disease (e.g., Cevik, 2003; Meck, 1983; McClure, Saulsgiver, & Wynne, 2005), and which quantitative models and theories best describe it (e.g., Lejeune, Richelle, & Wearden, 2006; Staddon & Higa, 1999, 2006). Although much remains to be discovered, it is also the case that during the last decades psychologists have made substantial progress in the study of timing. First, they have developed a rich set of procedures to study the different expressions of the timing capacity (e.g., Church, 1984, 2004; Gallistel, 1990; Richelle & Lejeune, 1980; Roberts, 1998). Some of these procedures, described in greater detail below, include the fixed-interval schedule and the peak procedure to study concurrent timing (i.e., the timing of ongoing events), or the temporal generalization and temporal bisection procedures to study retrospective timing (i.e., the timing of elapsed events). Second, they have collected a large amount of orderly data on timing and from them advanced a few empirical generalizations. One of them, perhaps the most important, is the scalar property, the fact that timing is relative to the standard being timed (Church, 2003; Gibbon, 1977, 1991; Lejeune & Wearden, 2006). To illustrate, timing performances R1(t) and R2(t) on intervals 30- and 90-s long, respectively, are scale transforms of each other—R2(t) is proportional to R1(t/3). Third, they have proposed a significant number of models and theories of timing. These models come from different theoretical perspectives (behavioral, cognitive, computational, and neurobiological), propose different processes and mechanisms, stress different subsets of research findings, and have different depths of analysis. A nonexhaustive list includes the Scalar Expectancy Theory (Gibbon, 1991; Gibbon, Church, & Meck, 1984), the Behavioral theory of Timing (Killeen & Fetterman, 1988), the Spectral Model (Grossberg & Schmajuk, 1989), the Diffusion model (Staddon & Higa, 1991), the Multiple Oscillator model (Church & Broadbent, 1990), the Learning-to-Time model (Machado, 1997), the Multiple Time Scales model (Staddon & Higa, 1999), the Packet Theory (Kirkpatrick, 2002) and its descendant, the Modular Theory of Learning (Guilhardi, Yi, & Church, 2007), the Active Time Model (Dragoi, Staddon, Palmer, & Buhusi, 2003), and the list could continue with the neurobiological models. And fourth, they have started the thorny process of comparing and contrasting these models with each other and with data (e.g., Bizo, Chu, Sanabria, & Killeen, 2006; Fetterman & Killeen, 1995; Leak & Gibbon, 1995; Lejeune et al., 2006; Staddon & Higa, 1999; Yi, 2007). If doing experiments (point 2 above) explores the empirical space of timing, and proposing models (point 3 above) explores the theoretical space of timing, comparing and contrasting models with data coordinates the two spaces in an attempt to design more informative experiments and build more powerful theories.

The present article fits in this last category, for it reviews a series of studies that compared and contrasted the Scalar Expectancy Theory (SET), the leading model in the field of animal and human timing, with the Learning-to-Time (LeT) model, a derivative of Killeen and Fetterman's (1988) behavioral theory of timing. As we shall see, these two models make different assumptions about the processes underlying timing in general and what animals learn in timing tasks in particular. For this reason, examining the two models jointly has proved to be a fruitful exercise because it has led us to identify not only serious problems with each model but also important but unknown properties of timing and temporal memory. It has also helped to clarify problems that future research should solve.

Other studies have designed experiments specifically to contrast timing models, but most of them have not addressed issues of learning or explored the models' distinct conceptions of learning. For example, one set of these studies contrasted SET and the behavioral theory of timing on the issue of whether the rate of a hypothetical internal clock is influenced by global and local reinforcement rates and how that influence might account for certain aspects of timing performance related to the scalar property (Fetterman & Killeen, 1991, 1995; Leak & Gibbon, 1995; Morgan, Killeen, & Fetterman, 1993; see also Bizo & White, 1994a, 1994b, 1995a, 1995b, 1997). By comparison, the issues examined in the present article have received less attention. They are, namely, how animals learn to time, how this learning affects their temporal memories, how temporal memories are accessed and their contents retrieved, and which experimental findings may help researchers choose among distinct conceptualizations of learning to time. But even the domain of learning to time is too broad to be covered in a single article. We further narrow our focus to issues related to memory, the precipitate of learning. We pay special attention to the contents of memory and to the (often tacit) rules used to form new memories, access them, and retrieve their contents. We will not discuss other important matters such as the nature of time markers (e.g., Staddon & Higa, 1999) or the timing of multiple signals (Meck & Church, 1984). And, for the most part, we restrict our remarks to timing in animals.

The article is divided into four parts. In the first part, we describe the structure of each model and how they work in two prototypical tasks, one of concurrent timing, the fixed-interval schedule, and the other of retrospective timing, the temporal bisection task. Describing how the models work in these simple tasks will reveal their similarities and differences. In the second part, we summarize experiments that exploited some of the differences between the learning assumptions of the two models, and from their results we draw some implications for our understanding of timing. In the third part, we propose a new model of timing that integrates features of SET and LeT and show how the hybrid model overcomes some of the shortcomings of its two parents. In the fourth and final part, we identify some of the challenges that any model of timing must meet and thereby hope to pave the way for a better account of timing.

I. TWO MODELS OF TIMING, SET AND LET

To introduce the two models, we consider the simplest time-based task, the fixed-interval (FI) reinforcement schedule. In FI T-s, a reinforcer becomes available T s after the trial onset. Responses emitted at times t ≤ T are recorded but not reinforced, whereas the first response emitted at t > T earns the reinforcer and, usually, starts a new trial. Of interest is how the animal distributes its responses during the trial. Typically, at the steady state, the animal pauses or responds at a low rate during the first half to two thirds of the trial and then it either responds at a constant but significantly higher rate until the end of the trial, yielding the break-and-run pattern (Schneider, 1969), or it accelerates until the end of the trial, yielding the FI scallop (Dews, 1978). Averaged across trials, response rate follows a smooth, monotonically increasing, sigmoid curve. Moreover, when the same animal is exposed to different FI schedules, the average rate curves superimpose when plotted with normalized axes (Dews, 1970). Superimposition means that relative response rate at time t into the trial is a function of the ratio t/T. How do SET and LeT explain this performance?

SET in FI Schedules

SET is an elegant information-processing model developed by John Gibbon, Russell Church, and their collaborators (for summaries, see Church, 2003; Gallistel, 1990; Gibbon, 1991). In its most basic form, the model postulates an internal clock composed of the three devices displayed in the top left panel of Figure 1: a pacemaker–accumulator unit, a memory, and a comparator. The pacemaker generates pulses at a high and variable rate (λ). The accumulator, which is reset to 0 at the beginning of each trial, adds the pacemaker pulses throughout the trial. When the reinforcer is delivered, the value in the accumulator is multiplied by a random factor (k*) and saved in a long-term memory store. Because the pacemaker rate λ and the memory factor k* are random variables (typically Gaussian), the value in the accumulator at the end of an interval and the value stored in memory also will be variable, even when the timed interval has constant duration. More important, both random variables induce scalar variability in the subject's representation of time, because each multiplies the duration of the physical interval.
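To make the storage step concrete, here is a minimal Python sketch of one trial of SET's clock. The Gaussian forms follow the text, but the parameter values (lam_mean, lam_cv, k_cv) are our illustrative choices, and treating the accumulated count as λ × T rather than as an explicit pulse-by-pulse count is a further simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

def store_interval(T, lam_mean=5.0, lam_cv=0.15, k_mean=1.0, k_cv=0.15):
    """One trial: accumulate pulses for T seconds, then scale the count by
    the memory factor k* and save the result in long-term memory."""
    lam = rng.normal(lam_mean, lam_cv * lam_mean)  # pacemaker rate for this trial
    x_T = lam * T                                  # accumulator value at reinforcement
    k = rng.normal(k_mean, k_cv * k_mean)          # memory translation factor k*
    return k * x_T                                 # value added to the memory store

# Both noise sources multiply T, so stored values for a 90-s interval are a
# scaled-up copy of those for a 30-s interval (scalar variability):
mem_30 = np.array([store_interval(30) for _ in range(2000)])
mem_90 = np.array([store_interval(90) for _ in range(2000)])
print(mem_30.std() / mem_30.mean(), mem_90.std() / mem_90.mean())  # similar CVs
```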

Fig 1.

The left panels show the structure of SET. A pacemaker generates pulses which are added in an accumulator and stored at the end of the to-be-timed interval in one or more long-term memories. To decide when to respond, the animal compares the number currently in the accumulator with samples extracted from the memories. In FI schedules (top left) only one memory is formed; in the bisection procedure (bottom left) two memories are formed. The right panels show the structure of LeT. After a time marker, a set of states (top circles) is activated in series. The states may be coupled to various degrees (associative links) with one or more operant responses (bottom circles). The strength of each response is determined by the dot product between the vectors of state activation and coupling. In FI schedules (top right) only one vector of couplings is formed; in the bisection procedure (bottom right) two vectors are formed.

Because each trial adds one value to the memory store, after a few trials the memory will contain a distribution of values representing the reinforcement times. According to SET, to decide whether or not to respond, the animal extracts a sample from its memory at trial onset and then compares the sample with the current value in the accumulator. The memory value, M, represents the reinforcement time; the accumulator value, xt, represents elapsed time during the trial. When the ratio between the accumulator value and the memory value crosses a threshold, Θ, responding changes from a low (or possibly zero) rate to a high rate. The threshold parameter Θ also is a random variable.

At the steady state, SET predicts on each trial a break-and-run response pattern, represented graphically by a step function. The moment of the break (graphically, the time when the step occurs) is a random variable with mean equal to a constant proportion of the FI and standard deviation proportional to the mean. The latter statement expresses Weber's law in the time domain. Averaging the individual trial step functions yields the session response rate curve with its typical sigmoid shape. SET also predicts that the average rate curves produced by the same animal on different FI schedules superimpose when plotted in relative time.
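Under the same illustrative assumptions (theta_mean, theta_cv, and mem_cv are our placeholders), the decision stage can be sketched as follows: sample a remembered time M, draw a threshold Θ, and begin responding when the accumulator-to-memory ratio first reaches Θ. Because the break point works out to Θ·M/λ, its mean is a fixed proportion of T and its standard deviation is proportional to its mean.

```python
import numpy as np

rng = np.random.default_rng(1)

def fi_break_point(T, lam_mean=5.0, lam_cv=0.15, mem_cv=0.2,
                   theta_mean=0.9, theta_cv=0.1):
    """One FI trial under SET: respond at a high rate from the moment
    (lam * t) / M first reaches the threshold, i.e., from t = theta*M/lam."""
    M = lam_mean * T * rng.normal(1.0, mem_cv)           # sample from temporal memory
    theta = rng.normal(theta_mean, theta_cv * theta_mean)
    lam = rng.normal(lam_mean, lam_cv * lam_mean)        # this trial's pacemaker rate
    return theta * M / lam

breaks_30 = np.array([fi_break_point(30) for _ in range(2000)])
breaks_90 = np.array([fi_break_point(90) for _ in range(2000)])
print(breaks_30.mean() / 30, breaks_90.mean() / 90)      # constant proportion of T
print(breaks_30.std() / breaks_30.mean(),
      breaks_90.std() / breaks_90.mean())                # Weber's law: equal relative spread
```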

LeT in FI Schedules

Like its ancestor, the behavioral theory of timing (Killeen, 1991; Killeen & Fetterman, 1988), LeT postulates three elements, a series of states, a vector of associative links connecting the states to the operant response, and the operant response itself (see the top right panel of Figure 1; Machado, 1997). The states embody the concepts of elicited, induced, adjunctive, interim, and terminal classes of behavior (Falk, 1977; Staddon, 1977; Staddon & Simmelhag, 1971; Timberlake & Lucas, 1985) and according to LeT they underlie the temporal organization of behavior. At present, we do not know how precisely the states relate to measurable behavior or what their neural basis is; they remain intervening variables (for further discussion of the role of the states in timing and their connection with mediating behaviors, see Fetterman, Killeen, & Hall, 1998; Killeen & Fetterman, 1988; Matthews & Lerer, 1987; Richelle & Lejeune, 1980).

The states are aroused or activated serially. Thus, when the trial begins only the first state is active, but, as time elapses, the activation of each state spreads with rate λ to the next state. Each state (n = 1, 2,…) is coupled with the operant response, and the degree of the coupling, represented by the variable W(n), changes in real time, decreasing to 0 at rate α during extinction and increasing to 1 at rate β during reinforcement. Thus, states that are strongly active when food is unavailable lose their coupling to, and eventually may not support, the operant response, whereas states strongly active when food is available increase their coupling and may therefore sustain the response. Finally, the strength of the operant response is obtained by summing, across all states, each state's associative link multiplied by its degree of activation (the dot product of the activation and coupling vectors). States that are both strongly active and strongly associated with the operant response exert more control over that response than less active or less strongly conditioned states.

According to LeT, in the FI schedule the couplings between the early states and the operant response decrease because food is not available when these states are maximally active, but the couplings between the later states and the operant response increase because later states are the most active when food occurs. At the steady state, as successive states become active during the trial, their stronger couplings sustain increasing response rates. To predict superimposition of the rate curves for different FI schedules, LeT further assumes that parameter λ (i.e., how fast the activation spreads across states) and the ratio of the learning parameters, α/β, are both inversely proportional to T. This assumption means that as the FI increases (and overall reinforcement rate decreases), the activation spreads more slowly across the states and extinction becomes relatively less effective than reinforcement, a sort of partial reinforcement extinction effect.
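The sketch below trains LeT on an FI schedule. The Poisson form of the state activation follows Machado (1997), but the specific parameter values, the choice λ = n/(2T), and the coarse time step are our illustrative assumptions.

```python
import numpy as np

def activation(t, lam, n_states):
    """Poisson-shaped state activation, X_n(t) = (lam*t)^n exp(-lam*t) / n!,
    the serial-activation form used in Machado (1997)."""
    m = lam * t
    p = np.empty(n_states)
    p[0] = np.exp(-m)
    for n in range(1, n_states):
        p[n] = p[n - 1] * m / n
    return p

def let_fi_curve(T, n_states=30, trials=300, beta=0.2, alpha=0.002, dt=0.5):
    """Train LeT on FI T s: couplings W weaken (rate alpha) while food is
    absent and strengthen (rate beta) at reinforcement, in proportion to
    each state's activation; return the steady-state response curve."""
    lam = n_states / (2.0 * T)              # illustrative: cascade spans the trial
    W = np.full(n_states, 0.5)
    for _ in range(trials):
        for t in np.arange(dt, T, dt):      # extinction during the interval
            W -= alpha * activation(t, lam, n_states) * W * dt
        W += beta * activation(T, lam, n_states) * (1.0 - W)   # reinforcement at T
    # response strength = dot product of activation and coupling vectors
    return [float(activation(t, lam, n_states) @ W)
            for t in np.arange(0, T + dt, dt)]
```

Because early states lose coupling and late states gain it, the returned curve increases through the interval, mirroring the average FI performance described above.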

We have described how SET and LeT handle the FI schedule, the prototypical concurrent timing task. Next, we describe how they handle temporal bisection, the prototypical retrospective timing task. Then we will have enough information about each model to identify their similarities and differences.

A temporal bisection task is a conditional discrimination task in which two sample stimuli differing only in duration are mapped to two comparison stimuli. A pigeon sees a center key lit for either 1 s or for 4 s and then chooses between two side keys, one red and the other green. Choices of Red are rewarded after 1-s samples and choices of Green are rewarded after 4-s samples. After the pigeon has learned to discriminate the two samples, stimulus generalization is examined by introducing samples with intermediate durations and measuring the subject's preference for one of the keys, say, Red. The psychometric function relating the proportion of Red choices to sample duration t, P(Red|t), has three features (Catania, 1970; Church & Deluty, 1977; Fetterman & Killeen, 1991; Killeen & Fetterman, 1988; Machado, 1997; Morgan, Killeen, & Fetterman, 1993; Platt & Davis, 1983; Stubbs, 1968). First, as t increases, P(Red|t) decreases monotonically and in a sigmoid way from about 1 to about 0. Second, the point of subjective equality, or PSE, is close to the geometric mean of the two training stimuli (i.e., the square root of their product). In the example at hand, P(Red|t)  =  0.5 when t  =  2 s. Third, individual subject psychometric functions obtained with samples holding the same ratio, for example, “1 vs. 4” and “4 vs. 16”, generally are scale transforms. This means that if the test durations from the “4 vs. 16” discrimination are divided by 4, bringing them into the same range as the “1 vs. 4” test durations, then the two psychometric functions will superimpose. Superimposition reveals Weber's law for timing in the sense that equal ratios yield equal discriminabilities. How do SET and LeT explain this performance?

SET in the Temporal Bisection Task

The extension of SET to the bisection task requires one additional memory store and a more complex decision rule (see the bottom left panel of Figure 1; Gibbon, 1981, 1991; also Church, 2003; Gallistel, 1990). Specifically, in the “1 vs. 4” bisection task, there will be two memories, one containing the numbers that are in the accumulator when a choice of Red is rewarded and another containing the numbers that are in the accumulator when a choice of Green is rewarded. We identify the two memory stores by MRed and MGreen to stress the fact that they are indexed by the choice alternatives. Because the pacemaker speed λ and the memory factor k* are random variables, the values stored in each memory will vary across trials. At the steady state, each memory will contain a distribution of values whose mean represents the corresponding sample duration, and whose standard deviation represents the uncertainty associated with the sample duration due to the noise inherent in the timing process.

According to SET, after a sample with duration t the pigeon's choice will depend on three numbers, xt, the number of pulses in the accumulator at the end of the sample, XS, a number extracted from MRed and representing the short stimulus, and XL, a number extracted from MGreen and representing the long stimulus. If (xt/XS) < (XL/xt), then the pigeon is more likely to choose the Red or “Short” response, but if (xt/XS) > (XL/xt), then the pigeon is more likely to choose the Green or “Long” response. SET predicts indifference when (xt/XS)  =  (XL/xt), which is equivalent to xt  =  √(XS × XL), the geometric mean of the (subjective) training durations. SET also predicts sigmoid-shaped psychometric functions and superimposition of functions obtained with samples holding the same ratio (Gibbon, 1981; Church, 2003).
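A minimal sketch of this ratio rule, with illustrative Gaussian noise (the parameters rate and cv are ours, not the theory's):

```python
import numpy as np

rng = np.random.default_rng(2)

def set_bisection_choice(t, short=1.0, long=4.0, rate=5.0, cv=0.2):
    """One trial: compare the count at the end of a t-s sample with samples
    drawn from the Red (short) and Green (long) memory stores."""
    x_t = rate * t * rng.normal(1.0, cv)        # accumulator at end of sample
    X_S = rate * short * rng.normal(1.0, cv)    # drawn from M_Red
    X_L = rate * long * rng.normal(1.0, cv)     # drawn from M_Green
    return "Red" if x_t / X_S < X_L / x_t else "Green"

# Indifference where x_t/X_S = X_L/x_t, i.e., x_t = sqrt(X_S * X_L); for a
# "1 vs. 4" task the point of subjective equality falls near 2 s:
for t in (1.0, 1.4, 2.0, 2.8, 4.0):
    p_red = np.mean([set_bisection_choice(t) == "Red" for _ in range(4000)])
    print(f"t = {t:>3} s   P(Red) = {p_red:.2f}")
```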

LeT in the Temporal Bisection Task

The model's extension to the bisection task requires one extra vector of associative links and a more complex decision rule (see bottom right panel of Figure 1; Machado, 1997). The states become active at sample onset, the time marker, and each state is now coupled with two responses. The strength of the link connecting state n with response r, Wr(n), changes only after the animal chooses. If the choice response is reinforced, the links between the states and that response increase, whereas the links between the states and the other response decrease, always in proportion to each state's activation. Conversely, if the choice response is extinguished, the links between the states and that response decrease, whereas the links between the states and the other response increase. In other words, the model assumes that, in bisection tasks, when the link between a state and one response changes, the link between the same state and the other response also changes, albeit in the opposite direction. Hence, the model's learning rule implements a strong form of response competition.

On each trial, choice depends on which states are most active at the end of the sample and on the strength of the links between those states and the two responses. To illustrate, in a “1 vs. 4” task, after the 1-s sample the initial states are the most active and because of the reinforcement contingencies their link with Red will be strong whereas their link with Green will be weak—hence the preference for Red after short samples. However, after the 4-s samples, later states will be the most active and because of the reinforcement contingencies their link with Red will be weak whereas their link with Green will be strong—hence the preference for Green after the long samples. LeT predicts that preference for Red decreases as sample duration ranges from 1 to 4 s. Moreover, it also predicts (see Machado, 1997) a PSE close to, but slightly greater than, the geometric mean of the training stimuli and, when λ is proportional to the overall reinforcement rate during the trials, that psychometric functions obtained with samples holding the same ratio will superimpose when plotted on a common axis.
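A sketch of LeT's learning and choice rules for bisection. The update form (changes proportional to state activation, with the competing response's links moving in the opposite direction) follows the text; the learning rate and the dictionary bookkeeping are our illustrative choices.

```python
import numpy as np

def let_update(W, X, chosen, other, reinforced, beta=0.1):
    """LeT's bisection learning rule: after a reinforced choice, the chosen
    response's links grow and the other response's links shrink; after an
    extinguished choice, the reverse. All changes are proportional to the
    state-activation vector X (response competition built in)."""
    if reinforced:
        W[chosen] += beta * X * (1.0 - W[chosen])
        W[other]  -= beta * X * W[other]
    else:
        W[chosen] -= beta * X * W[chosen]
        W[other]  += beta * X * (1.0 - W[other])
    return W

def p_first(X, W, a, b):
    """Probability of choosing a over b: relative response strength."""
    sa, sb = X @ W[a], X @ W[b]
    return sa / (sa + sb)

# Example bookkeeping: W = {"Red": np.full(30, 0.5), "Green": np.full(30, 0.5)}
```

Combined with an activation function like the one in the FI sketch above, repeated training on 1-s and 4-s samples drives the early links toward Red and the later links toward Green, so p_first then traces a decreasing psychometric function of sample duration.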

Similarities and Differences between SET and LeT

To analyze and test models by experiment, we need to understand first their similarities and differences. To that end, it is useful to compare the models' corresponding structures. Table 1 summarizes the information. The pacemaker–accumulator unit in SET corresponds to the serial organization of states in LeT: As the pacemaker emits pulses at rate λ (SET), the activation spreads across states at rate λ (LeT); at the beginning of each trial, as the accumulator is reset (SET), the first state in the series is aroused (LeT); during the trial, as the accumulator adds pulses (SET), successive states along the series become the most active states (LeT). And as the current number in the accumulator represents elapsed time (SET), the currently most active state also represents elapsed time (LeT). Furthermore, the memory store in SET corresponds to the vector of associative links in LeT. Both represent the subject's learning history, one as a distribution of subjective times of reinforcement (SET), the other as a vector of links with different strengths (LeT).

Table 1.

Similarities and differences between SET and LeT.


Scalar Expectancy Theory (SET) | Learning-to-Time (LeT) model
Parallel architecture | Serial architecture
Pacemaker–accumulator unit | Cascade of states
1. Pacemaker emits pulses at rate λ. | 1. States are activated at rate λ.
2. The accumulator is reset to 0 at trial onset. | 2. The first state is activated at trial onset.
3. The accumulator adds pulses. | 3. States are activated serially.
4. The number of pulses in the accumulator represents elapsed time. | 4. The most active state represents elapsed time.
Memory stores | Vectors of associative links
1. Temporal memories are concentrated. | 1. Temporal memories are distributed.
2. Each store represents important time moments, but not the reinforcement rate at those moments. | 2. Each vector represents important time moments and the reinforcement rate at those moments.
3. Temporal memories are indexed by situational elements, not by the accumulator; hence, not by time. | 3. Temporal memories are indexed directly by the states and therefore indirectly by time.
4. Temporal memories are context independent. | 4. Temporal memories are context dependent.

Despite the structural similarities between the two models, and the fact that they often predict similar outcomes, the models differ in how they conceptualize what animals learn in timing tasks. To exploit these differences empirically and conceptually, we classify them into four types. These types are interrelated and may be seen as different facets of the same underlying difference, but because each focuses on a slightly different issue related to temporal learning and memory, we present them separately (see Table 1).

Concentrated (SET) vs. distributed (LeT) memory

Perhaps the most obvious difference between the two models is that whereas in SET memory is concentrated in stores or bins (e.g., MRed and MGreen in the bisection task described above), in LeT it is distributed across links. Moreover, in SET the memory bins have no internal structure. Their contents are like numbered balls mixed in an urn, with the numbers representing subjective time moments. Regardless of when the memory is sampled, each ball has the same probability of being selected. In LeT, memories are distributed among the links that couple the states to the operant response. The states structure the memory. Metaphorically speaking, memory sampling takes place one link at a time: when the first state is the most active, the first link is sampled and may be expressed in behavior; when the second state is the most active, the second link is sampled, and so on.

Retrieval: time independent (SET) vs. time dependent (LeT)

As a consequence of the previous point, the role of time in memory retrieval also differs. In SET, memory receives numbers from the accumulator, but otherwise the two structures are not related. In particular, accessing the memory and retrieving its contents does not depend on the contents of the accumulator. Because the accumulator represents elapsed time, we conclude that memory access and retrieval is time independent. In contrast, in LeT a behavioral state must be active for its link to be expressed behaviorally. One could say that the most active state (the equivalent of the accumulator content) retrieves the associative link (the equivalent of the memory content). Pursuing the analogy, because the links are sampled by the states, which represent time, one could say that in LeT retrieval is time dependent. This difference epitomizes the parallel and serial architectures of SET and LeT, respectively.

What is represented in memory? Relative (SET) vs. absolute (LeT) local reinforcement rates

In SET, the memory contents in time-based schedules depend only on the moments of reinforcement; if a reinforcer is collected at time t, a count representing t is added to memory; but if no reinforcer is collected at t, memory does not change. Because extinction plays no role in the model, the memory in SET can represent only local relative rates of reinforcement. In contrast, in LeT, the associative vectors represent not only the moments of reinforcement (via which link is strengthened), but also the absolute reinforcement rates at those moments (via how strong each link is). In LeT, the local rate is, in a sense, part of what the animal effectively learns in timing tasks. Another way to see this difference is to realize that for all practical purposes the memory stores in SET are like a relative frequency histogram, or a probability distribution. From it one can determine whether reinforcement is more likely to occur at time t1 or time t2 into the trial, but not how frequent reinforcement is at time t1. In LeT, the strengths of the links are more like an absolute frequency histogram, not a probability distribution, and from it one can determine not only whether reinforcement is more likely to occur at time t1 or time t2, but also how frequent reinforcement is at time t1.

Context independent (SET) vs. context dependent (LeT) memories

This is perhaps the least obvious difference between the two models. To illustrate it, consider the bisection task described above. According to SET, the contents of the MRed and MGreen memory stores depend only on the duration of the two samples. The contents of MGreen, for example, depend on the duration of the long sample (4 s) and are not affected by the duration of the short sample (1 s). This means that if the pigeons were trained with a short sample of 2 s, instead of 1 s, the contents of MGreen would remain the same because the 4-s sample did not change. We refer to the assumption that the contents of a memory store depend exclusively on the duration of its associated sample and not on the duration of the alternative sample as “context-independent memories”.

In contrast, in LeT the strengths of the links are context dependent. To understand this point, consider the links between the states and the Green response (see Figure 1, bottom right panel). Their final values will depend on the duration of the long and short samples. Given the model's learning rule, the links with the Green response change not only after 4-s samples, when Green is reinforced and Red extinguished, but also after 1-s samples, when Green is extinguished and Red is reinforced. Hence, if the duration of the short sample changes, the final values of the links connecting the states to the Green response will also change. Because the link vectors in LeT correspond to the memory stores in SET, we conclude that temporal memory is context sensitive in LeT but not in SET.

Given these differences, researchers would naturally like to know whether they are sensitive to empirical test and, equally important, whether they have theoretical import. We address these issues next.

II. EMPIRICAL TESTS: SET VERSUS LET

The first and larger set of studies described below deals mainly with the issue of context sensitivity. The second set of studies deals with the issue of what is represented in memory. The conceptual analyses that follow them deal with the issue of concentrated versus distributed memories and how temporal memories are formed and accessed.

Is Temporal Memory Context Sensitive? The Double Bisection Studies

To examine empirically the difference between the two models regarding context sensitivity, we developed the double bisection task (e.g., Machado & Keen, 1999). Its key idea is to vary the context of a sample in two temporal discriminations and see if that variation affects the generalization tests. Figure 2 shows the details. In a matching to sample task, a pigeon initially learns to choose a Red key after 1-s samples and a Green key after 4-s samples. This discrimination may be represented by a mapping between the stimulus pair (S1, S4) and the response pair (Red, Green), {S1, S4} → {Red, Green}, where the subscripts identify the sample durations and the arrow means that the first response is rewarded following the first sample and the second response is rewarded following the second sample. The pigeon then learns a second discrimination, to choose a Blue key after 4-s samples and a Yellow key after 16-s samples, {S4, S16} → {Blue, Yellow}. Finally, the two discriminations are integrated in the same session. Half of the trials are the relatively short trials {S1, S4} → {Red, Green}, henceforth referred to as “Short” trials, and half of the trials are the “Long” trials {S4, S16} → {Blue, Yellow}.

Fig 2.

 A double bisection task is a conditional discrimination in which the animal learns two mappings, {S1, S4}→{Red, Green} on “Short” trials, and {S4, S16}→{Blue, Yellow} on “Long” trials. The subscripts indicate the sample duration; the arrow indicates that the first and second responses in each pair are correct following the first and second samples, respectively.

Having learned the two discriminations, what will the pigeon do in generalization tests in which the duration of the sample ranges from 1 to 16 s and the choice keys are Green and Blue? Both keys were associated with the same sample duration, 4 s, but their contexts differed. The context for the Green choices was the 1-s sample associated with Red, whereas the context for the Blue choices was the 16-s sample associated with Yellow. Will a sample be represented differently when it is embedded in different contexts?

SET is readily extended to the double bisection task. Instead of two, the animal forms four memories, each indexed by a different key (i.e., MRed, MGreen, MBlue and MYellow) and associated with one sample (1 s, 4 s, 4 s, and 16 s, respectively). Because the memories are context independent, the contents of MGreen and MBlue will be statistically identical. That is, the distributions of counts in the two stores will have the same mean and standard deviation. Hence, when the pigeon has a choice between the Green and Blue keys after a sample t-s long, it will compare the number in the accumulator with two samples extracted from identical distributions. The net result will be that preference will not change with sample duration. As the dotted line in the top left panel of Figure 3 shows, the function plotting the preference for Green over Blue, P(G|G vs. B), against t will be a horizontal line.

Fig 3.

The left panels show the predictions of SET and LeT for the generalization tests. In these tests, the sample ranges from 1 to 16 s and the comparison stimuli are one of the four pairs Green/Blue, Red/Yellow, Red/Blue, and Green/Yellow. The test with Green/Blue (top) is critical because the two keys were associated with the same 4-s sample duration. On these tests, SET predicts no effect of sample duration, whereas LeT predicts stronger preference for Green with longer samples, the context effect. The right panels show the data from five studies: Machado & Keen (1999), Machado & Pata (2005), Oliveira & Machado (2008), Arantes & Machado (2008) and Arantes (2008).

LeT also is readily extended to the double bisection task. Instead of two, there will be four link vectors (WRed, WGreen, WBlue and WYellow) coupling the states with the operant responses. Due to the contingencies of reinforcement and the model's learning rule, these vectors will change during training. Table 2 helps to understand how. We divide the states into three classes, those most active after 1-s samples (“Early”), 4-s samples (“Middle”), and 16-s samples (“Late”). Initially, they are all equally associated with the four responses (i.e., Wr(n)  =  0.5 for all r responses and n states). Then, during the “Short” trials, the “Early” states will become coupled strongly with Red and weakly with Green; the initial coupling of these states with Blue and Yellow will remain roughly unchanged because, when these states are the most active, rarely is the pigeon given a choice between Blue and Yellow. Hence, as the first column of Table 2 shows, at the end of training WRed(“Early”) ≈ 1, WGreen(“Early”) ≈ 0, and WBlue(“Early”) ≈ WYellow(“Early”) ≈ 0.5. The remaining columns show how the “Middle” and “Late” states become coupled with the responses. At the steady state, the “Early” states will be coupled more with Blue than with Green and therefore, after 1-s samples, the pigeon will prefer Blue to Green. Conversely, the “Late” states will be coupled more with Green than with Blue and therefore, after 16-s samples, the pigeon will prefer Green to Blue. More generally, as the solid line in the top left panel of Figure 3 shows, LeT predicts that preference for Green should increase with sample duration.

Table 2.

Strength of the links (W) between the states and the choice responses.


Responses | “Early” states | “Middle” states | “Late” states
Red | W → 1 | W → 0 | W ≈ 0.5
Green | W → 0 | W → 1 | W ≈ 0.5
Blue | W ≈ 0.5 | W → 1 | W → 0
Yellow | W ≈ 0.5 | W → 0 | W → 1

Note. “Early”, “Middle” and “Late” represent the states most active after 1-, 4-, and 16-s samples, respectively. Initially, all links equal 0.5. The arrows show the effects of training in the double discrimination {S1, S4} → {Red, Green} and {S4, S16} → {Blue, Yellow}.
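To see how the link values in Table 2 generate the predicted preference, here is a small numerical sketch. The coarse three-class activation vectors are our illustration, not the model's continuous Poisson activation.

```python
import numpy as np

# Link strengths at the end of training, read off Table 2
# (columns: "Early", "Middle", "Late" states):
W = {"Red":    np.array([1.0, 0.0, 0.5]),
     "Green":  np.array([0.0, 1.0, 0.5]),
     "Blue":   np.array([0.5, 1.0, 0.0]),
     "Yellow": np.array([0.5, 0.0, 1.0])}

def pref(X, a, b):
    """Relative strength of key a over key b for state activation X."""
    sa, sb = X @ W[a], X @ W[b]
    return sa / (sa + sb)

# Coarse activation of the three state classes after 1-, 4-, and 16-s samples:
samples = {"1 s":  np.array([0.8, 0.2, 0.0]),
           "4 s":  np.array([0.2, 0.6, 0.2]),
           "16 s": np.array([0.0, 0.2, 0.8])}
for label, X in samples.items():
    print(label, "P(Green | Green vs. Blue) =", round(pref(X, "Green", "Blue"), 2))
# The output rises from 0.25 through 0.50 to 0.75: preference for Green
# grows with sample duration, the context effect.
```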

Another way to understand LeT's predictions is in terms of approach and avoidance. During the “Short” trials, the pigeon learns to approach Red and avoid Green after 1-s samples, but it learns little if anything regarding Blue and Yellow. Hence, during the tests with 1-s samples and the Blue and Green keys, the pigeon, deprived of the opportunity to choose Red, avoids Green and therefore chooses Blue. By the same token, during the “Long” trials, the pigeon learns to approach Yellow and avoid Blue after 16-s samples, but it learns little if anything regarding Red and Green. Hence, during the tests with 16-s samples and the Blue and Green keys, the pigeon, deprived of the opportunity to choose Yellow, avoids Blue and therefore chooses Green. Preference for Green should increase with the sample duration, the context effect.

Although the tests with the Green and Blue keys are the most critical to examine the context sensitivity issue, three other tests may be run to further compare and contrast the models. After samples ranging from 1 to 16 s, the pigeon is given a choice between two other keys that have not been paired before, Red and Yellow, Red and Blue, or Green and Yellow. As the dotted lines in the left panels of Figure 3 show, SET predicts that the psychometric functions for the three tests will have the same shape—in fact, it predicts that they will be scale transforms. In contrast, LeT predicts a descending curve when the choice is between Red and Yellow, a U-shaped curve when the choice is between Red and Blue, and an inverted U-shaped curve when the choice is between Green and Yellow. The general trend of these predictions is readily understood by comparing the rows of the two responses in Table 2 (see Machado & Pata, 2005, for quantitative details).

The basic finding: The Context Effect

The right panels in Figure 3 show the average results of five studies. Machado and Keen's (1999) study used the basic procedure described above. The other four studies changed the basic procedure as follows: a) Arantes (2008) replaced the simultaneous discrimination task with its successive (or go/no-go) version; b) Arantes and Machado (2008) never integrated the “Short” and “Long” training trials in the same session; c) Oliveira and Machado (2008) used visually different sample stimuli during the “Short” and “Long” trials; and d) Machado and Pata (2005) ran the test trials under nondifferential reinforcement instead of extinction.

Despite marked procedural differences, the results were similar. When the choice was between the Green and Blue keys (top right panel), the keys associated with the same sample durations but in different contexts, the preference for Green increased with sample duration. The result has substantial generality and it is consistent with LeT but not with SET. In the test with Red and Yellow, the keys associated with the shortest and longest samples, respectively, the results show that preference for Red decreased with sample duration, a result consistent with both models. In the remaining two tests there was more variation across pigeons. In the Red/Blue case, the psychometric function was roughly U-shaped, whereas in the Green/Yellow case it was roughly inverted U-shaped. Again, this pattern of results is qualitatively closer to LeT than SET.

Quantifying the context effect

In addition to predicting the context effect, LeT can go one step further and quantify it. Suppose two groups of pigeons learn the double temporal bisection task. The “Short” trials are the same for both groups but the “Long” trials differ. For Group 16 they are {S4, S16} → {Blue, Yellow}, as in the previous experiments. For Group 8 they are {S4, S8} → {Blue, Yellow}. The only difference between the groups is the duration of the longest sample, 16 s or 8 s. Will both groups show the context effect? And if so, will the magnitude of the effect differ between them?

The left and middle panels of Figure 4 show the predictions of each model. For the critical test between Green and Blue, SET again predicts no effect of sample duration. LeT predicts that preference for Green should increase with sample duration in both groups (the context effect), and that preference for Green should increase faster in Group 8 than in Group 16. That is, Group 8 should show a stronger effect. The reason is that, according to the model, avoidance of Blue at 8 s will be stronger in Group 8 than in Group 16; hence, at t  =  8 s, preference for Green over Blue will be stronger in Group 8.

Fig 4.

Fig 4

The left and middle panels show the predictions of SET and LeT, respectively, for the test trials of two groups exposed to a double bisection task. Both groups learned the mapping {S1, S4}→{Red, Green} on “Short” trials, but, on “Long” trials, Group 8 learned the mapping {S4, S8}→{Blue, Yellow}, whereas Group 16 learned the mapping {S4, S16}→{Blue, Yellow}. The right panels show the data from Machado & Pata (2005).

For the remaining tests, both models predict that preference for Red over Yellow will decrease faster with stimulus duration for Group 8 than for Group 16. Given a choice between Red and Blue, SET predicts the same monotonic decreasing function for the two groups, whereas LeT predicts two distinct U-shaped functions, with the function for Group 16 wider than the function for Group 8. And given a choice between Green and Yellow, SET predicts that preference for Green should decrease with stimulus duration, but faster for Group 8 than for Group 16. LeT predicts two inverted U-shaped functions, with the function for Group 16 again wider than the function for Group 8.

The rightmost panels of Figure 4 show the experimental results (Machado & Pata, 2005). The top panel reveals the context effect in both groups—preference for Green over Blue increased with sample duration. It also reveals that preference for Green increased faster for Group 8 than Group 16. These results are consistent with LeT but not SET. The remaining panels show that the shape of LeT's predicted curves was roughly similar to the shape of the obtained curves. The major discrepancy between LeT and the data occurred for Group 16 in the two bottom panels, for in each case the model predicted curves considerably wider than the observed curves. In fact, LeT always predicts narrower curves for Group 8 than for Group 16, a prediction at odds with the data. Concerning SET, the shape of its predicted curves agreed with the data reasonably well when the choice was between Red and Yellow (second row), but in the other cases the shape of SET's curves was at odds with the shape of the obtained curves.

Converging evidence for the context effect

Machado and Arantes (2006) attempted to obtain the context effect in a different way. Their rationale was similar to using the retardation-of-acquisition test to determine whether a stimulus is a conditioned inhibitor (Rescorla, 1969). After a group of pigeons learned the prototypical double bisection task, it was divided into two and each new group learned a new temporal discrimination involving the 1-s and 16-s samples and the Green and Blue keys. The only difference between the two groups was that one learned the mapping {S1, S16} → {Blue, Green} and the other learned the alternative mapping {S1, S16} → {Green, Blue}. At issue was which group would learn the new discrimination faster.

SET predicts equal speeds of acquisition. Because memories are context independent, there is no reason for one of the discriminations to be easier than the other. LeT predicts sharply different results for the two groups. According to LeT, learning the double bisection task creates a tendency to prefer Blue to Green after 1-s samples, but Green to Blue after 16-s samples. Therefore, for group {S1, S16} → {Blue, Green} the new task will be easy because it is consistent with the tendency induced by the previous training. In contrast, for group {S1, S16} → {Green, Blue} the new task will be difficult because it is inconsistent with the tendency induced by the previous training. According to LeT, the acquisition of Group Inconsistent should be retarded compared to the acquisition of Group Consistent.

The top panels of Figure 5 show LeT's specific predictions. During the first session with the new discrimination, both groups will behave similarly despite opposite contingencies of reinforcement. Whereas Group Consistent will be close to the steady state from the first session, Group Inconsistent will need a few sessions to reach the steady state. The bottom panels show the results. For Group Consistent, preference for Green increased with sample duration and the psychometric functions did not change appreciably from the first to the last session. For Group Inconsistent, during the first session, preference for Green increased with sample duration despite the opposite contingencies of reinforcement! During the second session, preference did not change systematically with sample duration. By the last session, preference for Green decreased systematically with sample duration, in accord with the contingencies of reinforcement. This pattern of results is strongly consistent with LeT but not with SET.

Fig 5.

Fig 5

The top panels show the predictions of LeT for Groups Consistent and Inconsistent. Each curve shows the probability of choosing Green over Blue as a function of sample duration. The number on each curve identifies the session for which the curve applies (e.g., curve 0  =  immediately after double bisection training, curve 1  =  after one session with the new discrimination training, etc.). The bottom panels show the data from Machado & Arantes (2006). Choices following the 1-s and 16-s samples were reinforced provided they were correct, but choices following the 2-, 4- and 8-s samples were not reinforced. Green was correct following the 16-s samples for Group Consistent and the 1-s samples for Group Inconsistent.

Summary

The studies reviewed above (see also Oliveira & Machado, 2009) exploited one of the differences between the SET and LeT models, the context sensitivity of temporal memories. In the double bisection task, LeT predicted a context effect but SET did not. In all studies, the context effect was obtained—in simultaneous and successive discrimination tasks, directly on test trials and indirectly through its effects on the acquisition of a new discrimination, and with and without local or global cues signaling the forthcoming trial. Examining the two models has revealed a previously unknown property of timing, the context effect: Temporal memories are context dependent.

What is Represented in Memory? The Free-Operant Psychophysical Procedure Studies

The next studies examined another difference between SET and LeT, namely, what is represented in memory. Imagine this hypothetical situation. Two pigeons are exposed to 60-s trials. For pigeon A, one reinforcer is scheduled on each trial; for pigeon B, one reinforcer is scheduled every fourth trial on average. For both pigeons, scheduled reinforcers are delivered randomly at 15 s or 45 s after trial onset, never at other times. Hence, pigeons A and B receive food at the same moments within the trial, but the absolute reinforcement rate at those moments is four times higher for pigeon A than for pigeon B.

According to SET, the memory contents of pigeons A and B will be identical because memory represents only the moments of reinforcement. The 2 pigeons will learn that reinforcement occurs at 15 s and 45 s, but because extinction plays no direct role in timing, they will not learn how often reinforcement occurs at those moments. In contrast, according to LeT, the memory of the 2 pigeons (i.e., the associative links) will differ because memory represents both the moments of reinforcement (via which links are changed) and the rate of reinforcement at those moments (via how much the links change with reinforcement and extinction). Therefore, the 2 pigeons will learn not only that reinforcement occurs at 15 s and 45 s, but also how often it occurs at those times.
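The contrast can be made concrete with a toy computation (our own numbers throughout). For SET, memory is a bag of remembered reinforcement times, so samples drawn from pigeon A's larger bag are statistically identical to samples drawn from pigeon B's. For LeT, the fixed-point formula below is a heuristic reduction of the trial-by-trial updates, not the published model.

```python
import numpy as np

rng = np.random.default_rng(3)

# SET: both bags hold (noisy) copies of 15 and 45; A's bag is merely bigger.
bag_A = rng.choice([15.0, 45.0], 4000) * rng.normal(1.0, 0.15, 4000)
bag_B = rng.choice([15.0, 45.0], 1000) * rng.normal(1.0, 0.15, 1000)
print(bag_A.mean(), bag_B.mean())   # statistically identical samples

# LeT: if each trial applies W += beta*r*(1-W) - alpha*e*W, where r is the
# probability the state's moment is reinforced on a trial and e its exposure
# to extinction, the link settles at W* = beta*r / (beta*r + alpha*e):
def steady_link(r, e=1.0, beta=0.1, alpha=0.02):
    return beta * r / (beta * r + alpha * e)

print(steady_link(0.5))     # pigeon A: each moment reinforced on half the trials (~0.71)
print(steady_link(0.125))   # pigeon B: reinforced on one-eighth of the trials (~0.38)
```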

The basic finding

One way to examine this issue empirically is through the free-operant psychophysical procedure, FOPP (Bizo & White, 1994a, 1994b, 1995a, 1995b; Killeen, Hall, & Bizo, 1999; Stubbs, 1980). A 50-s trial starts with the illumination of two keylights, L and R. For the first 25 s only L choices are reinforceable; for the last 25 s only R choices are reinforceable. During a baseline condition, the reinforcers are scheduled by two independent Variable-Interval (VI) 60-s schedules. The results show that, as time into the trial elapses, the proportion of R pecks increases from 0 to 1 according to a sigmoid function, with indifference around the middle of the trial. This finding is illustrated by the empty squares in the top left panel of Figure 6 (Bizo & White, 1995a). When the experimenters made the L key richer by changing the VI schedules (e.g., VI 40 s for L and VI 120 s for R), the birds switched to the R key later than during the baseline and the psychometric function shifted to the right. Conversely, when the experimenters made the L key poorer (VI 120 s for L and VI 40 s for R), the animals switched to the R key earlier than in the baseline and the psychometric function shifted to the left.

Fig 6.

Fig 6

Psychometric functions obtained with the Free Operant Psychophysical Procedure—pecks to one key, say L, are reinforced (according to one or more VIs) only during the first half of the trial, and pecks to the other key, R, are reinforced (also according to one or more VIs) only during the second half of the trial. In the top panels, when the overall reinforcement rate favored the L key (the first of the two VI schedules), the psychometric function shifted to the right; when it favored the R key (the second of the two VI schedules), it shifted to the left. The middle panels show that when the overall reinforcement rates differ, the psychometric function shifts only if the local reinforcement rates differ around the middle of the trial; the bottom panels show that when the overall reinforcement rates are equal, the functions shift provided the local reinforcement rates differ in the middle of the trial. The data are from Bizo & White (1995a) (top left panel) and Machado & Guilhardi (2000) (remaining panels). The curves show the fit of the LeT model.

Machado & Guilhardi (2000) reproduced Bizo & White's (1995a) experiment, but, for reasons explained below, they divided the 60-s trial into four segments. Pecks to the L key were reinforced only during the first two segments; pecks to the R key were reinforced only during the last two segments. Reinforcers were scheduled by four independent VIs, each operating during one segment. The notation “120–120 / 40–40”, for example, means that L pecks were reinforced according to a VI 120s during the first segment and another VI 120s during the second segment, but R pecks were reinforced according to a VI 40s during the third segment and another VI 40s during the fourth segment. The results, displayed in the top right panel of Figure 6, show that when the pigeons experienced a threefold difference in reinforcement rate between the L and R keys, the psychometric functions shifted appreciably.

SET has not been applied to the FOPP. However, its usual rules of memory formation would suggest the following account. The animal would form two memory stores, one containing the times of reinforcement for L key pecks and the other the times of reinforcement for R key pecks. Given that reinforcers are set up according to a VI schedule, the reinforced times will be distributed uniformly across the interval and independently of the VI parameters (see Machado & Guilhardi, 2000). Hence, according to SET, the animal's memories will not change with variations in the VI schedules and therefore the psychometric functions should not shift. More generally, the memory contents of SET cannot predict the experimental findings because they are insensitive to changes in reinforcement rate that are not accompanied by changes in the distribution of reinforcement times.

For LeT, the shifts of the psychometric function depend on the link vectors. As before, divide the states into three classes: the states most active at the beginning (“Early”), around the middle (“Middle”), or at the end (“Late”) of the trial. Given the reinforcement contingencies, the “Early” states will be linked mostly with the L response, and therefore the pigeons will prefer the L key at trial onset; the “Late” states will be linked mostly with the R response, and therefore the pigeons will prefer the R key at the end of the trial. The “Middle” states will be linked differently across conditions. When the VIs are equal, these states will be linked equally with the two keys and therefore, around the middle of the trial, the pigeon will be indifferent; when the VIs favor the L key, the “Middle” states will be linked more with the L than the R key and therefore, around the middle of the trial, the pigeon will continue to prefer the L key and the psychometric function shifts to the right. Conversely, when the VI for the L key is poorer, those states will be linked more with the R key and the psychometric function shifts to the left. The lines in the top panels of Figure 6 show LeT's account (see Machado & Guilhardi, 2000, for a more detailed explanation and mathematical details).
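A coarse sketch of why the middle of the function moves: collapse the states most active near the trial's midpoint into a single state and apply LeT's competitive updates with the local probabilities that an L or R response is reinforced while that state is active. The probabilities and learning rate are our illustrative choices, and collapsing the middle states into one is a deliberate simplification of the model's full dynamics.

```python
import numpy as np

def middle_state_preference(p_L, p_R, trials=5000, beta=0.02, seed=4):
    """Preference for L around mid-trial after training, given the local
    probabilities p_L and p_R that an L or R response is reinforced while
    the 'Middle' states are active (response competition as in LeT)."""
    rng = np.random.default_rng(seed)
    W_L = W_R = 0.5
    for _ in range(trials):
        if rng.random() < p_L:                     # L reinforced mid-trial
            W_L += beta * (1 - W_L); W_R -= beta * W_R
        if rng.random() < p_R:                     # R reinforced mid-trial
            W_R += beta * (1 - W_R); W_L -= beta * W_L
    return W_L / (W_L + W_R)

print(middle_state_preference(0.3, 0.3))  # equal local rates: indifference (~0.5)
print(middle_state_preference(0.6, 0.2))  # richer L mid-trial: rightward shift (>0.5)
print(middle_state_preference(0.2, 0.6))  # richer R mid-trial: leftward shift (<0.5)
```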

Local rates at time t

LeT makes one finer prediction—the psychometric function will shift only when the differences in reinforcement rate between the two keys occur in the middle of the trial. That is, for the function to shift, it is neither sufficient nor necessary that one key delivers more rewards than the other. Two sets of results support this claim. The first (Machado & Guilhardi, 2000, Experiment 1) addressed the sufficiency condition by comparing the shifts in two groups of pigeons (see middle row in Figure 6). The difference in the overall reinforcement rate between the keys was similar in the two groups, but whereas the left panel group experienced different reinforcement rates around the middle of the trial and similar rates at the extremes of the trial, the right panel group experienced a difference at the extremes but not at the middle of the trial. According to LeT, only the former group should show a shift. As the middle panels show, the results were consistent with LeT. Hence, a difference in overall reinforcement rates between the two keys is not sufficient to move the psychometric function.

The second experiment (Machado & Guilhardi, 2000, Experiment 2) addressed the necessary condition. The L and R keys always delivered the same overall reinforcement rate (see bottom panels). However, for the left panel group the reinforcement rates around the middle of the trial differed, whereas for the right panel group they were equal. LeT predicted a shift in the former group only, and the results were consistent with the predictions. Similar shifts were obtained also with rats (Guilhardi, MacInnis, Church, & Machado, 2007). Hence, a difference in overall reinforcement rate between the two keys is not necessary to move the psychometric function.

Summary

The FOPP studies looked into another difference between SET and LeT, the contents of temporal memory. According to LeT, memory represents both the times of reinforcement and the reinforcement frequencies at those times; according to SET, memory represents only the times of reinforcement. However, it does not follow that SET cannot account for the empirical findings obtained with the FOPP. In fact, it is possible that by combining a) a threshold carefully biased by the difference in absolute reinforcement rates with b) memory stores that represent relative reinforcement rates, SET could predict the shifts of the psychometric function. If that proves to be the case, then more informative (and perhaps more complex) experiments will have to be designed to disentangle the two conceptions of what is represented in temporal memory.

Is Temporal Memory Concentrated or Distributed? The Challenge of Mixed-FI Schedules

SET is not a learning model. However, like any other model, to be able to work at all it must make minimal assumptions about learning—for example, that two memories are formed in the simple bisection task {S1, S4}→{Red, Green}. Minimal as they may be, these assumptions may have unanticipated consequences. Continuing with the example, if a theory assumes that an animal forms two memory stores (see, e.g., Gibbon, 1981, 1991; Gibbon et al., 1984; Gallistel, 1990), the theory must be reasonably clear about how the stores are accessed. In SET, this means answering the following question, “At the end of the trial, how does the timing system decide in which memory to save the current number in the accumulator?” The answer is straightforward: “If the reinforcer came from pecking the Red key, the number is saved in one memory store; if it came from pecking the Green key it is saved in another.” More generally, accumulator counts are saved to a particular memory store on the basis of the structural features of the task (e.g., choosing this or that distinctive key and getting a reward). Moreover, because reinforcement of the two types of pecks follows different sample durations, one memory store will come to represent the 1-s interval (MRed), and the other store will come to represent the 4-s interval (MGreen). The theory has no major difficulty accounting for the temporal discrimination.

The basic finding

Consider now a simpler task. A pigeon receives food for pecking a key after either 30 s or 240 s have elapsed since trial onset. There is only one key and one feeder in the situation and no cue signals whether the current trial will be short or long. The results of this mixed FI 30s-FI 240s experiment show that during the long trials average response rate increases from the beginning of the trial until approximately 30 s have elapsed, then it decreases, and then it increases again until the end of the trial. Figure 7 shows one example from Catania and Reynolds (1968; see also Ferster & Skinner, 1957, pp. 597–605; Leak & Gibbon, 1995; Whitaker, Lowe & Wearden, 2003, 2008). Leak and Gibbon showed that on most long trials the pigeons paused at the onset of the trial, then pecked until the shorter FI elapsed, paused again, and then pecked again until the end of the trial (break-run-break-run pattern). Early cumulative records from Ferster and Skinner also show, during the longer FIs, a significant pause or deceleration past the time of the shorter FI. As the authors put it, “a well-marked priming exists after the shorter interval, and a falling-off into a curvature appropriate to a longer interval” (p. 597).

Fig 7. Average data from one pigeon exposed to a mixed FI 30s–FI 120s schedule (points) and the fit of the LeT model (curve). Data from Catania & Reynolds (1968).

This performance could be derived from SET by assuming that the animal stored the counts obtained at 30 s and 240 s into distinct memory stores. As Leak and Gibbon (1995, p. 6) put it, "in SET, there is assumed to be a single clock but an independent memory distribution for each criterion time interval". Then at the beginning of the trial the bird sampled a number from the “short” store, compared that number with the current number in the accumulator, pecked the key when the two numbers were sufficiently close, stopped pecking when they became sufficiently different again, at which time it sampled a number from the “long” store, and then executed the same routine. The account predicts the break-run-break-run within-trial pattern, the two peaks in the average response rate curve, and the fact that the widths of the two peaks show the scalar property (see also Whitaker et al., 2003, 2008).

A logical problem

Unfortunately, the account begs the question because that which was supposed to be explained was assumed in the explanation. In contrast with the bisection task, the reinforcers in the mixed-FI schedules have the same source and no distinct signal cues the two trials. Hence, how does the timing system “direct” the counts to the appropriate memory store? To reply that when the count is small it is directed to one store and when large to another explains nothing, for the reply simply replaces one unexplained discrimination (short vs. long intervals) by another (small vs. large counts).

To be consistent and avoid begging the question, the current version of SET must assume that the animal's memories are indexed (formed, accessed, etc.) by structural features of the situation, by distinctive cues being timed, or by the source of the reinforcers, for example, and not by time itself. A coherent account would proceed by stating that when the reinforcers come from a single source and are not correlated with distinct stimuli, the counts in the accumulator are all lumped into one and the same memory store—the memory is concentrated. Therefore when the reinforcers are obtained at two distinct moments, as in mixed-FI schedules, the distribution of the counts in memory will be a mixture of two distributions, the one induced by the reinforcers delivered at short intervals (30 s) and the other by the reinforcers delivered at long intervals (240 s). The predicted pattern of behavior also will be a mixture across trials of two patterns, the break-and-run pattern associated with an FI 30 s and the break-and-run pattern associated with an FI 240 s. This prediction is incorrect because the observed pattern is break-run-break-run within most trials (Leak & Gibbon, 1995).

The same problem is present in another study (Mellon, Leak, Fairhurst, & Gibbon, 1995). Pigeons received reinforcers at 16, 32, or 48 s since trial onset, without external signals cueing the FI interval. To explain the data, the authors assumed three distinct memory stores representing the three reinforcement times, but they did not ask how the memories might be formed in the first place—how does the timing system decide where to save a particular accumulator count? In addition, to fit the data, the authors assumed that the three memories were sampled in the correct order (i.e., first the memory for the 16-s interval, second the memory for 32-s interval, and lastly the memory for the 48-s interval), which may be correct, but they did not explain how the system knows which memory is first, second, and third. Surprisingly, to account for changes in response rate across the 48-s intervals, the authors also assumed different absolute response rates at different moments into the trial. That is, a temporal discrimination was assumed when that temporal discrimination was part of the problem to be explained.

LeT does not face the same difficulties because its equivalents of the memory counts (the links) are not concentrated in a memory bin; they remain distinct, each accessed by its own state. In the mixed-FI schedule, the most active states around 30 s and 240 s will be linked with the operant response more strongly than the most active states at times t ≪ 30 s and 30 s ≪ t ≪ 240 s. Hence, average response rate around the moments of reinforcement will be higher than at other moments, which matches the obtained bimodal response curve (see Figure 7).

However, LeT has two main difficulties in dealing with mixed-FI schedules. First, because the local reinforcement rate is lower at 30 s than 240 s, LeT always predicts higher peak rates at the long than the short FI. And second, because the overall reinforcement rate remains the same, LeT predicts greater precision in the timing of the longer than the shorter FI. Although these predictions occasionally hold, as Figure 7 shows, most data sets from mixed-FIs contradict them (for further analyses see Whitaker et al., 2003, 2008).

Summary

The mixed-FI analysis questions SET's assumption that the representations of time intervals are lumped into a memory bin. In addition, it identifies a logical problem with SET that needs to be solved (see also Machado & Silva, 2007, and Gallistel, 2007). Because LeT generates bimodal response rate distributions without begging the question, it suggests that temporal memories may be distributed and accessed serially.

What is Learned in FI Schedules and the Peak Procedure?

Behavior in time-based schedules has both stochastic and nonlinear properties. For example, in FI schedules subjects typically pause after the reinforcer for a variable amount of time and then respond until the end of the trial. The variable length of the pause illustrates the stochastic property; the abrupt transition from no responding to a high rate of responding illustrates the nonlinear property. Another example comes from the peak procedure (Catania, 1970; Roberts, 1981). Here, FI trials are intermixed with significantly longer trials that end without reinforcement, the empty or peak-interval trials. On these longer trials, subjects pause for a variable interval, typically shorter than the FI, respond for another variable interval, typically until the FI elapses, and then pause again either until the end of the trial or until a new bout of responding begins (break-run-break or break-run-break-run patterns; Church, Meck, & Gibbon, 1994; Kirkpatrick-Steger, Miller, Betti, & Wasserman, 1996; Sanabria & Killeen, 2007).

SET was designed with the stochastic and nonlinear structure of behavior in mind. It accounts for the nonlinear properties by means of a threshold-based decision rule. In FI schedules, the animal starts to respond when the relative discrepancy between the number in the accumulator and a sample extracted from the memory of reinforced times falls below a threshold. In the peak procedure, the same start rule applies, but then the animal stops responding when the same relative discrepancy falls above either the same threshold or another threshold (Gibbon et al., 1984). LeT on the other hand was designed to deal with the average performance in time-based schedules and therefore it does not account for the trial-by-trial variability in behavior or for its nonlinear properties. This is one of LeT's major shortcomings.

The problem

Despite differences of conception and scope, the models share common ground in that both describe what animals learn when exposed to an FI T-s schedule or a corresponding peak procedure (i.e., a procedure comprising FI T-s trials and empty trials). According to both models, the animal learns that food occurs at a particular time since the beginning of the trial. In SET, the average of the counts stored in memory represents the time of food and their variability represents the uncertainty associated with that time. In LeT, the distribution of associative strength across the links also represents the average and the variability of the time of food. However, neither model accounts adequately for a well-known feature of responding in these two situations. In the peak procedure, a well-trained animal will stop responding shortly after T s elapse, but in an FI schedule a well-trained animal will not stop responding for a long interval if the reinforcer is omitted (Ferster & Skinner, 1957; Machado & Cevik, 1998; Monteiro & Machado, 2009). If in both situations the animals learned that food occurs at time T, then why do they pause in the peak procedure but continue to respond in the FI schedule?

Another way of framing the problem is in terms of temporal generalization: If the effects of reinforcement at T s generalize to neighboring times, both before and after T, and if this generalization explains why the animal starts to respond only when it is sufficiently close to T, then why does the animal not stop responding, when food is omitted in the FI, as soon as it is sufficiently far from T? Note that we are not talking about the effects of chronic exposure to reinforcement omission in the FI schedule (e.g., Staddon & Innis, 1969; see also Staddon & Cerutti, 2003), or the effects of prolonged extinction following FI training (e.g., Crystal & Baramidze, 2006; Machado & Cevik, 1998; Monteiro & Machado, 2009), but about the immediate effects of omitting the reinforcer.

In common presentations of SET (e.g., Church, 2003; Gibbon, 1991; Lejeune et al., 2006), only the start rule is invoked to explain performance in FI schedules, but both the start and stop rules are invoked to explain performance in the peak procedure. In both situations the start rule will determine when the animal starts responding, but only in the peak procedure will the stop rule determine when the animal stops. Hence, SET “solves” the problem by stating that responding in the FI schedule persists for a long while because the stop rule is not used. Unfortunately, though, the explanation omits a critical step: What determines when the stop rule is used? In other words, how do the empty trials, the only difference between the FI and the corresponding peak procedure, “activate” the stop rule? The question is pertinent because the empty trials are in every respect similar to other segments without reinforcement that the animal experiences during simple FI schedules. To answer that only empty trials give the animal the opportunity to learn to stop responding past the reinforcement time states an obvious fact, but it does not explain how that fact causes the behavioral difference. We believe this omission is not trivial because by stressing only the moments of reinforcement, which obviously remain the same in the FI schedule and the corresponding peak procedure, SET has no principled way to conceptualize the distinctive role of the empty trials in activating the stop rule.

LeT accounts reasonably well for the average rate curve in the peak procedure: The states most active around the reinforcement time, say, 40 s, will be strongly linked with the operant response, but the earlier and later states will extinguish their couplings with the operant response. This profile of couplings (earlier and later states uncoupled, “middle” states coupled) explains why average response rate increases from trial onset, peaks around 40 s, and then decreases. The problem for LeT is to explain why in the FI schedule response rate remains high past the reinforcement time. Because in the FI schedule the trials never lasted significantly longer than 40 s, the later states did not have a chance to become coupled with the operant response. Hence, when the reinforcer is omitted and these states become the most active, they should not sustain response rate for a long interval. The model predicts that response rate will decline shortly past the time of reinforcement. This prediction is incorrect (e.g., Machado & Cevik, 1998; Monteiro & Machado, 2009).

Summary

In an FI T-s schedule and its corresponding peak procedure, the reinforcement moments are the same, namely, about T s from trial onset. Why then do animals trained on an FI schedule continue to respond for a long period of time if the reinforcer is omitted, whereas in the peak procedure they stop responding shortly after the reinforcement time? A principled account of this straightforward and well known fact still challenges SET and LeT.

III. A HYBRID MODEL

Both models have strengths and weaknesses. SET's strengths are its ability to explain the stochastic, nonlinear structure of responding in concurrent timing tasks and the scalar property. The latter is no small feat given the ubiquity of the scalar property across a wide range of procedures and behavioral measures (but see Lejeune & Wearden, 2006). Its weaknesses seem to be its assumptions concerning memory—concentrated in bins, insensitive to context, one-dimensional, and not accessed by temporal cues. Curiously, LeT's strengths and weaknesses seem to be the opposite. On the positive side, LeT postulates distributed, two-dimensional, and context-sensitive memories accessed serially. On the negative side, LeT has serious difficulties handling the scalar property when two or more intervals are timed but the overall reinforcement rate does not change, as in mixed-FI schedules. The model predicts a clear violation of the scalar property that is contrary to the data (Machado, 1997; Whitaker et al., 2003, 2008). In addition, LeT simply does not deal with the stochastic, nonlinear structure of behavior (for other limitations see Machado & Cevik, 1998, and Rodríguez-Gironés & Kacelnik, 1999).

We have explored the possibility that a hybrid between SET and LeT could overcome at least some of the weaknesses, while retaining most of the strengths, of each model (see Church 1997 and Kirkpatrick & Church, 1998, on the virtues of hybridization). The new model preserves the overall learning structure of LeT but replaces its state-activation dynamics by a scalar-inducing dynamics equivalent to the pacemaker–accumulator structure of SET. A stochastic interpretation of the state dynamics plus a threshold-based decision rule enables the new model to deal with the stochastic and nonlinear structure of behavior and generate the scalar property without adjusting its parameters.

In what follows we explain how the new model works. Then we extend it to three concurrent timing tasks (FI schedule, peak procedure, mixed-FI schedules), and two retrospective timing tasks (temporal bisection and temporal generalization). Throughout we will focus mainly on the qualitative aspects of the model, but in the Appendix we present some mathematical analyses and an algorithm to simulate the model.

Model Assumptions

Killeen and Weiss (1987) proposed a general framework to understand pacemaker–accumulator systems, with scalar variance induced by counting errors in the accumulator, Poisson variance induced by random changes in the pacemaker's interpulse intervals, and constant variance induced by motor latencies or delays in starting the counting process, for example. Here, we assume only scalar variance to see how far the model can go with a minimal assumption.

On each trial, a set of states, numbered n  =  1, 2, …, is activated serially at a rate of λ states per s. That is, the first state will be active from time 0 to time 1/λ, the second state will be active from time 1/λ to time 2/λ, and so on. The activation of the states is like a wave travelling across them with velocity λ. This velocity is constant within a trial but varies randomly across trials according to a Gaussian distribution with mean µ and standard deviation σ.
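To make the state dynamics concrete, the following minimal Python sketch (ours; the function name and default values are chosen only for illustration) samples one trial's activation wave:

    import math
    import random

    def activation_wave(mu=1.0, sigma=0.2, trial_duration=60):
        """Return the active state N(t) for t = 1, ..., trial_duration.
        The wave speed lambda is sampled once per trial and held constant."""
        lam = max(random.gauss(mu, sigma), 1e-6)  # guard against nonpositive samples
        # the active state at time t is the smallest integer greater than lam * t
        return [math.floor(lam * t) + 1 for t in range(1, trial_duration + 1)]

With μ = 1, the state index tracks elapsed seconds on an average trial; trials with faster or slower waves reach a given state proportionally earlier or later.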

State n has an associative link with the operant response and the strength of the link changes at the end of each trial. Let n* denote the active state when the trial ends. Then the following rules apply:

1. Reinforcement rule. If the trial ends with reinforcement, then n* is a reinforced state and ΔW(n*) = β(1−W(n*)). If the trial ends without reinforcement, its link changes according to the extinction rule described next.

2. Extinction rule. The strength of the link of all extinguished states decreases by the amount ΔW(n) = −(α/n*)W(n), where n* is the active state at the end of the trial.

3. For all states that were not active during the trial, ΔW(n) = 0.

Finally, while state n is active, responses occur at rate A provided the link has strength greater than a threshold θ, that is, W(n) > θ, with 0 < θ < 1. Because we were not interested in absolute response rate, we let A  =  1 throughout the study.

In words, states become active in succession; if the associative link of the active state is greater than a threshold the animal responds; the link of the state active at the time of reinforcement increases, whereas the links of all its predecessors decrease. The new model has six free parameters: the state dynamics is governed by the mean, μ, and the standard deviation, σ, of the activation wave; learning is governed by the extinction parameter, α, the reinforcement parameter, β, and the initial value of the associative links, represented by W0; and the decision to respond is governed by the threshold parameter, θ. However, steady state performance depends effectively on three parameters only: the ratio σ/μ (i.e., the coefficient of variation of the activation wave), the ratio α/β (i.e., the relative effect of extinction), and θ.
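As an illustration of these rules, the following sketch (ours; names are hypothetical, and the default parameter values are those of the simulations reported below) updates the links at the end of one trial:

    def end_of_trial_update(W, n_star, reinforced, alpha=1.0, beta=0.2):
        """W is a list of link strengths indexed by state (index 0 unused);
        n_star is the state active when the trial ends."""
        if reinforced:
            W[n_star] += beta * (1.0 - W[n_star])   # reinforcement rule
            last = n_star - 1                        # only predecessors extinguish
        else:
            last = n_star                            # n* itself extinguishes too
        for n in range(1, last + 1):
            W[n] -= (alpha / n_star) * W[n]          # extinction rule
        # states never active this trial (n > n_star) are unchanged

    def responding(W, n, theta=0.1):
        """Threshold rule: the model responds (at rate A = 1) while W(n) > theta."""
        return W[n] > theta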

The new model differs from LeT in three major assumptions. First, whereas in LeT all states are active at t > 0, albeit in different degrees, and their activation is described by a Poisson distribution, in the new model only one state is active and the state activation is described by a Gaussian distribution (Gibbon, 1992). Second, in the extinction rule, parameter α is replaced by α/n*. The extra complexity brings a major benefit. In LeT, for the scalar property to hold, α had to be inversely proportional to the overall reinforcement rate, that is, the parameter had to be adjusted; the new rule yields the scalar property without adjusting its parameters (see below). And third, the new model can deal with the variability and nonlinearity of within-trial performance in concurrent timing tasks.

Concurrent Timing

The next three figures show the model's output in FI schedules, the peak procedure, and mixed-FI schedules. Throughout, the parameters were μ = 1, σ = 0.2, α = 1, β = 0.2, θ = 0.1, and W0(n) = 0.12, for all states. Concerning the last two parameters, what is important is not their specific values but the relation W0(n) > θ, that is, all links have initial strength greater than the threshold, ensuring that initially the animal responds regardless of which state is active. We simulated 10 stat-pigeons, each exposed to 20 sessions of 50 trials each, averaged the data from the last 5 sessions of each stat-pigeon, and then averaged the data across stat-pigeons. The time step equaled Δt = 0.1 s and the model's output (0 = no response, 1 = response) was collected every second.

FI schedules

Figure 8 shows the model's output in four FI schedules. Here and throughout, the noisy curves plot the simulation results and the smooth curves, when present, plot the approximate analytical solutions included in the Appendix. Panel A illustrates the distribution of the associative links, W(n). The horizontal dashed line shows the response threshold. Consider the FI 60-s schedule (third curve from left) and take into account that, because the speed of transition across states had a mean value of μ =  1, the state that was most likely to be active during reinforcement was state n*  = 60. During training, when the first states (n < 40) were active, reinforcement rarely followed and therefore the strength of their links decreased to 0. Subsequent states (40 < n < 100), however, overlapped with reinforcement and their links were strengthened. The states still further down the series (n > 100) were never active during the trial and therefore their links retained the initial strength of 0.12. The curves for the other FIs are interpreted similarly.

Fig 8. Model output for four FI schedules, 15, 30, 60, and 120 s. Panel A: Distribution of the strength of the associative links, W(n). The smooth lines plot approximate closed-form solutions derived in the Appendix. Panel B: response distribution for one stat-pigeon during the last trial (each row corresponds to one FI). For each second, a point is plotted if a response occurred. Panel C: Corresponding cumulative records. Panel D: Average response rate for each FI schedule. The smooth curves plot the approximations derived in the Appendix. Panel E: the scalar property.

The remaining panels deal with response output. Panel B shows the time of each response on the last trial of each simulation and panel C shows the corresponding cumulative records. The postreinforcement pause lasts approximately two-thirds of the FI. For short FIs the response pattern is clearly break-and-run; for the longest FI, the pattern is more scallop-like as responding goes through a period of acceleration and then stabilizes (cf. Dews, 1978; Schneider, 1969). The reason for these different patterns is that, as the FI increases, the W(n) curves (see panel A) become noisier and wider, and their left limbs have shallower slopes. Hence, the moment they cross the 0.1 threshold is more sharply defined for short FIs (break-and-run) than for long FIs (scallop). Panel D shows the average response rate curves based on the simulations (noisy curves) and the theoretical approximations predicted by the model (smooth curves). The model reproduces the typical sigmoid curve. Panel E illustrates that the curves for different FIs overlap when plotted in relative time—the scalar property.

The model predicts that if the reinforcer is omitted, responding will continue significantly beyond the reinforcement time. The reason is that the states that become active after T s have retained their initial associative strength and therefore sustain responding (see the W(n) curves in panel A). This result is predicted whenever it is assumed, as we did, that the initial weights are greater than the threshold (i.e., W0(n) > θ). Behaviorally, this assumption means that, by default, the animal responds and then, in an FI schedule, it learns to withhold its responses during the initial segment of the trial.

Peak procedure

Figure 9 shows the model's output in four peak procedures. The FIs were 15, 30, 60, and 120 s, and the empty trials were four times longer and occurred on half of the trials. There are two main differences in the W(n) curves between the peak procedure and FI schedules (compare with panel A of Figure 8). The heights of the W(n) curves are lower in the peak procedure because of extinction during the empty trials. And the right limbs of the W(n) curves decrease to 0 in the peak procedure because during the empty trials later states become active and have the opportunity to lose their initial strength. However, states further down the series, which have remained inactive even during the empty trials, retain their initial strength (the right end points of each W curve equal 0.12). These states may sustain responding when they become active past the end of the empty trial (e.g., Monteiro & Machado, 2009).

Fig 9. Model output for four peak procedures. The FIs were 15-, 30-, 60-, and 120-s long; the empty trials were always four times the FI length and occurred on half of the trials. Panel A: Distribution of W(n). The smooth lines plot approximate closed-form solutions derived in the Appendix. Panel B: response distribution for one stat-pigeon during the last trial of each peak procedure. Panel C: average response rate. The smooth curves plot the approximations derived in the Appendix. Panel D: the scalar property.

Panel B shows the response structure on the last empty trial of each peak procedure. The period of responding brackets the reinforcement time; the start and stop times, as well as the duration of the response period, increase directly with the FI. The average response rate curves, displayed in panel C, peak around the time of reinforcement and are slightly asymmetric. For the longer FIs, average response rate increases at the end of the trial because the number of empty trials was insufficient to extinguish the initial couplings of the late states. Finally, panel D shows that the scalar property holds also in the peak procedure.

Mixed-FI schedules

Figure 10 shows the model's output in two mixed-FI schedules, mixed FI 15s-FI 120s and mixed FI 30s–FI 240s. The W(n) curves in panel A reveal the two sets of reinforced states. The structure of the response output during the longer trials (see panel B) consists of two periods of responding, the first bracketing the shorter reinforcement time and the second filling the last trial segment (break-run-break-run pattern). The average rate curves (panel C) show the two corresponding peaks. The scalar property holds in two ways. First, the two average rate curves overlap when plotted in relative time (panel D—scalar property across mixed-FIs). And second, although not shown in the figure, the two ascending limbs of each response rate curve also overlap when plotted in relative time (scalar property within a mixed-FI).

Fig 10. Model output for two mixed-FI schedules, mixed FI 15s–FI 120s and mixed FI 30s–FI 240s. Panel A: distribution of W(n). The smooth lines plot approximate closed-form solutions derived in the Appendix. Panel B: response distribution for one stat-pigeon during the last trial. Panel C: average response rate. The smooth curves plot the approximations derived in the Appendix. Panel D: the scalar property.

The new model solves some of the problems with LeT. Thus, in all three procedures, it predicts the scalar property without having to adjust its parameters; it generates a behavioral stream similar to the behavioral stream of rats and pigeons (variable postreinforcement pauses; break-and-run patterns or FI scallops); and by conceiving of the animals in simple concurrent timing tasks as learning mainly when to stop responding, the model predicts that, if a reinforcer is omitted after FI training, responding will continue for a long interval after the reinforcement time. However, some potential problems with the new model are the (perhaps excessively) asymmetric curves predicted for the peak procedure (see Figure 9) and the fact that in mixed-FI schedules it cannot predict average response rates higher at the short than the long FIs (Whitaker et al., 2003, 2008). It remains to be seen whether these problems can be corrected by assuming, for example, a variable threshold, or different start and stop thresholds, in the peak procedure, and a decaying arousal function mapping the timing output to response rate in the mixed schedules.

Retrospective Timing

Simple and double temporal bisection

For the temporal bisection tasks, the state dynamics remains the same but the learning and decision rules change slightly to accommodate the specifics of the situation. According to the new model, at the onset of the sample the states are activated serially. At the end of the sample, one behavioral state, say, n, will be active. State n has links to the two comparison stimuli, the Red and Green keys. We represent the links by WR(n) and WG(n). The decision rule states that the animal will choose the Red key with probability WR(n)/(WR(n) + WG(n)) and the Green key with the complementary probability. This parameter-free decision rule is simpler than its equivalent in LeT.

The learning rules, however, remain the same: If the choice is rewarded, then the link of the reinforced response increases and that of the other response decreases, whereas if the choice is not rewarded the link of the extinguished response decreases and that of the other response increases. Specifically, assume the animal was in state n at the end of a sample and chose the Red key. Then the links between state n and the two responses would change as shown in Table 3.

Table 3. Learning rules in the bisection task (state n active at the end of the sample; Red key chosen).

                 Choice reinforced    Choice extinguished
    ΔWR(n)  =    +β(1−WR(n))          −αWR(n)
    ΔWG(n)  =    −βWG(n)              +α(1−WG(n))
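To make the decision and learning rules concrete, here is a minimal sketch (ours; names are hypothetical, defaults match the bisection simulations below) of one trial ending in state n, with the Table 3 updates generalized to either choice:

    import random

    def one_bisection_trial(WR, WG, n, correct_is_red, alpha=0.0, beta=0.1):
        """WR and WG are lists of link strengths to the Red and Green keys."""
        chose_red = random.random() < WR[n] / (WR[n] + WG[n])  # decision rule
        chosen, other = (WR, WG) if chose_red else (WG, WR)
        if chose_red == correct_is_red:          # choice reinforced
            chosen[n] += beta * (1.0 - chosen[n])
            other[n] -= beta * other[n]
        else:                                    # choice extinguished
            chosen[n] -= alpha * chosen[n]
            other[n] += alpha * (1.0 - other[n])
        return chose_red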

Figure 11 shows the model's output in simple and double temporal bisection tasks. Throughout, the parameter values were μ = 1, σ = 0.4, α = 0, β = 0.1, and W0 = 0.10. Because initial simulations showed more variability in the results than in concurrent timing tasks, we ran 100 stat-birds instead of 10. For each of the stat-birds, the simulation followed Machado and Keen's (1999) experimental protocol. Panel A illustrates the distribution of the links in a {S4, S16}→{Red, Green} discrimination. The initial states, more likely to be active after the Short sample, linked with Red; the later states, more likely to be active after the Long sample, linked with Green; states still further down the series (n > 40) retained their initial link strength.

Fig 11. Model output for the temporal bisection task. Panel A: distribution of the associative links after training with the simple bisection task {S4, S16}→{Red, Green}. The filled and empty circles show the associative links with the Red and Green responses, respectively. Panel B: psychometric functions plotting the probability of choosing “Short” during test trials in four simple bisection tasks. From left to right, the training sample durations were 1 vs. 4, 2 vs. 8, 4 vs. 16, and 8 vs. 32 s. Panel C: the scalar property with bisection at the geometric mean in the four simple bisection tasks. Panel D: results from the tests with novel key pairings after training on the double bisection task {S1, S4}→{Red, Green} and {S4, S16}→{Blue, Yellow}. Simulation details: training sessions comprised 400 trials with each sample; the generalization sessions comprised 64 trials for each test sample plus 384 trials for each training sample; in the double bisection task, the test sessions with the novel key pairings comprised 64 test trials for each test sample and 3200 trials for each training sample.

Panel B shows the psychometric function for each of four discriminations in which the ratio of the Long-to-Short durations always equaled 4. The sigmoid curves reproduce the three key properties of temporal bisection data: They decrease monotonically, have the PSE close to the geometric mean, and, as panel C shows, overlap when plotted in relative time. More extensive simulations and mathematical analyses (see Appendix) revealed that for larger Long-to-Short ratios (e.g., 16 to 1), the PSE is between the geometric and harmonic means (e.g., Siegel, 1986).

Panel D shows the results for the double bisection task {S1, S4}→{Red, Green} and {S4, S16}→{Blue, Yellow}. In the critical test with Green and Blue, the keys associated with the same sample duration, the model reproduces the context effect—preference for Green increases with sample duration. In the three other tests with novel key pairings, the model also reproduces the major trends in the data, namely, as the sample increases, a) preference for Red over Yellow decreases systematically; b) preference for Red over Blue first decreases until 4 s and then increases (U-shaped); and c) preference for Green over Yellow first increases until 4 s and then decreases (inverted U-shaped curve). Although not shown, the model also reproduces Machado and Pata's (2005) findings that preference for Green increases faster when the longest training duration is 8 s than when it is 16 s.

In the bisection procedure, the new model goes beyond LeT in that it generates the scalar property without parameter adjustments. As the preceding figure illustrates, the same set of parameters produces psychometric functions that superimpose whenever the ratio of Long-to-Short durations remains the same. The model also engenders the context effect and the other main patterns observed in the double bisection experiments reviewed above. However, some problems persist. Whereas LeT could generate PSEs slightly above the geometric mean, the new model generates PSEs at or below the geometric mean; for very large ratios, the predicted PSE is close to the harmonic mean. In addition, similar to LeT, the new model cannot accommodate the full set of results obtained with the double bisection task, in particular, the test results involving the “Red vs. Blue” and “Green vs. Yellow” keys in simultaneous and successive discriminations (compare Figures 3, 4 and 6). It remains to be seen whether adding a source of Poisson variance to the state dynamics (see Killeen & Weiss, 1987) corrects these shortcomings.

Temporal generalization

We conclude with a brief description of how the new model deals with some basic findings concerning another retrospective timing task, temporal generalization. The LeT model has not been applied to temporal generalization. Church and Gibbon (1982) performed the seminal experiments. Rats were reinforced following a T-s signal, but not following signals of shorter or longer durations. The results showed a generalization gradient with the maximum at T s. In addition, the authors found that a) linear and logarithmic spacing of the nonreinforced durations had no effect; b) the location of the maximum and the breadth of the gradient increased with the reinforced duration; c) the gradients obtained with different reinforced durations overlapped when plotted in relative time; d) reducing the probability of reinforcement following the target T-s signals decreased the height of the gradient; and e) reducing the probability of presenting the target T-s signals also decreased the height of the gradient.

The new model extends readily to the temporal generalization task: The signal activates the cascade of states. At the end of the signal, one state will be active, say, n, and the strength of its link, W(n), will increase with reinforcement [i.e., ΔW(n) = β(1−W(n))] and decrease with extinction [i.e., ΔW(n) = −αW(n)]. The probability of responding at the end of a signal equals the strength of the link of the active state, W(n). One additional assumption, also made by Church and Gibbon (1982), is that, on some trials, the animal's decision is not controlled by the signal. During these trials, the animal responds with some unconditional probability C. Hence, two factors determine behavior: If the animal did not pay attention to the sample (probability 1−π), it responds with probability C; if it did pay attention (probability π), it responds with the probability specified by the model, W(n). Thus, the overall probability of a response at the end of a t-s signal equals

P(R|t) = (1 − π)C + πW(n),

where n is the active state at the end of the signal. In the Appendix, we derive the steady state distributions of W(n) and P(R|t).
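A one-line sketch of this decision stage (ours; the π and C defaults are arbitrary illustrations):

    def p_response(W_n, pi=0.5, C=0.5):
        """Mixture of unattended trials (respond with probability C) and
        attended trials (respond with probability W(n))."""
        return (1.0 - pi) * C + pi * W_n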

Figure 12 summarizes the model's predictions for the temporal generalization task. To isolate the effect of the timing parameters, only the attention parameter, π, and the unconditional response probability parameter, C, were allowed to vary across panels. The others were fixed at the values μ = 1, σ = 0.2, α = 0.01, and β = 0.1. Notice that the free parameters π and C do not produce any temporal modulation; hence, in all panels, the effect of sample duration is mediated exclusively by W(n). In addition, because in each panel the two free parameters have the same value, the differences between the curves in the panel do not depend on them.

Fig 12. Model predictions for the temporal generalization task (see Eq. 22 in the Appendix). Top panels: Generalization gradients when the reinforced signal was 4-s long and the nonreinforced signals were spaced linearly or logarithmically for a shorter (A) or longer (D) range. Middle panels: the reinforced signal was 2-, 4-, or 8-s long; the right panel shows the scalar property. Bottom left panel: effect of changing the reinforcement probability following the 4-s target signal. Bottom right panel: effect of changing the probability of presenting the 4-s target signal on each trial.

In panels A and D, the 4-s signal was reinforced with probability 1, and the nonreinforced signals were spaced linearly or logarithmically over a shorter (A) or longer (D) range of durations. The curves show that the typical generalization gradient, with a maximum at 4 s, does not change appreciably with either the stimulus spacing or range (cf. Church & Gibbon, 1982, Experiments 1 and 2). In panels B and E, the reinforced signal varied across conditions (2, 4, or 8 s). The model predicts that the location of the maximum and the breadth of the gradient increase with the reinforced duration (B). In addition, it also predicts that the gradients obtained with different reinforced durations roughly overlap when plotted in relative time (E; cf. Church & Gibbon, Experiment 3). In panel C, the 4-s signal was reinforced either with probability 1 or 0.25 while the other signals were never reinforced. The model predicts that reducing the probability of reinforcement following the target signal decreases the height of the gradient (cf. Church & Gibbon, Experiment 4). Finally, in panel F, the 4-s reinforced signal occurred on either 50 or 25 percent of the trials. The model predicts that reducing the probability of presenting the target signal also decreases the height of the gradient (cf. Church & Gibbon, Experiment 5). We conclude that the new model accounts well for the major findings reported by Church and Gibbon (1982) concerning temporal generalization.

IV. CONCLUSION

The area of timing has witnessed a significant increase in the number of theoretical models. They differ in approach, domain of application, and generality. Variation in models is probably necessary to explore the theoretical domain of timing. But for knowledge to accumulate, variation in models must be followed by selection of models and model ideas. To that end, researchers may examine the successes and failures of each model and then attempt to identify the elements that deserve the credit for the former and the blame for the latter. They may also design experiments that by capitalizing on the differences between the models subject them to science's Supreme Court, the empirical test. Conceptual and mathematical analyses, on the one hand, and empirical research findings, on the other hand, are two complementary means of choosing among models and model ideas (Machado & Silva, 2007). Through variation and selection our models evolve and, we hope, come to depict reality a bit more accurately than before.

In this paper, we have engaged in some variation and selection concerning timing models. We analyzed two contemporary models, SET and LeT, identified the similarities and classified the differences between them, summarized experiments that have started to explore some of these differences and, in some cases, to put them to empirical test. Our conceptual analyses and the empirical findings we reviewed exposed some of the strengths and weaknesses of each model. To put these strengths and weaknesses into perspective, we review them in the light of seven challenges that any model or theory of timing must face. The first three were proposed by Church and Broadbent (1990):

1. “The first fact that any theory… must account for is the smooth peak function in which the mean probability of a response gradually increases to a maximum near the time of reinforcement and then decreases in a slightly asymmetrical fashion.” (p. 58). This challenge applies not only to the peak procedure, but also to the temporal generalization procedure.

2. “The second fact that any theory… must account for is that performance on individual trials, unlike the mean function, is characterized by an abrupt change from a state of low responding to a state of high responding and finally another state of low responding.” (p. 58). These abrupt changes in responding occur also in FI and mixed-FI schedules.

3. “The third fact that any theory… must account for is that the mean functions are very similar with time shown as a proportion of the time of reinforcement.” (p. 58). Perhaps the greatest constraint for any timing model, the scalar property is generally observed in the concurrent and retrospective timing tasks reviewed above.

The next four challenges are intimately related to temporal learning and memory:

4. Based on the double bisection experiments, we suggest that temporal memories are context dependent. Hence, a 4-s interval seems longer when discriminated from a shorter interval than from a longer interval.

5. Based on the FOPP experiments, we suggest that temporal memories register not only the various moments of reinforcement but also how often reinforcers occur at those various moments.

6. Based on the mixed-FI experiments, we suggest that temporal memories are not collapsed into stores or bins but remain separate, distributed, and indexed by time itself. Temporal generalization notwithstanding, the animal knows, as it were, what happens at different moments since a time marker.

7. Any theory of timing must account for the fact that whereas responding ceases shortly after the reinforcement time in the peak procedure, it continues for a long interval if the reinforcer is omitted following FI training.

SET meets reasonably well the first three, but not the last four challenges. Its strengths are its ability to deal with the scalar property and with the stochastic and nonlinear properties of responding in time-based reinforcement schedules. Its weaknesses seem to be its assumptions concerning memories, their contents, and how they are formed and accessed. In its turn, LeT meets reasonably well challenges 1, 4, 5, and 6, has difficulties handling the scalar property (3), and simply does not deal with the stochastic and nonlinear structure of time-based performance (2). Neither model meets convincingly challenge 7.

We have proposed a hybrid model that may be less in error than its two predecessors. By inheriting the pacemaker–accumulator unit from SET and the learning rules from LeT, the hybrid model meets all seven challenges, at least partly. It deals with the stochastic and nonlinear properties of responding in time-based schedules; it generates temporal generalization gradients that peak around the time of reinforcement; it obeys the scalar property; its temporal memories are context sensitive, two-dimensional, and accessed serially. And it meets the last challenge because it assumes that, in time-based schedules, animals respond by default and learn to stop responding at time t when they experience extinction at time t.

But problems remain. The new model generates asymmetric curves in the peak procedure; in mixed-FI schedules, the response rate at the first peak is equal to or lower than the response rate at the second peak, but never higher; and in temporal bisection tasks with large ratios (e.g., 16), the PSE is close to the harmonic mean of the Long and Short durations. Although each of these results has been observed occasionally, the model lacks flexibility to account for the different results obtained in other studies (e.g., Whitaker et al., 2003, 2008). At this time we do not know whether additional assumptions may correct these problems (e.g., adding sources of Poisson and constant variance to the state dynamics; or making arousal decay between reinforcers). Another cycle of variation and selection is needed.

In summary, SET has contributed to our understanding of timing by revealing the widespread presence of the scalar property and by providing a simple, intuitive means of understanding it, the clock metaphor (Church, 1984, 2003; Lejeune et al., 2006; Gibbon, 1991). Judging by the number of studies that have used the model, whether to investigate animal or human timing, and from a behavioral or neurobiological perspective, its influence has been enormous (see Allan, 1998). LeT has contributed to our understanding of timing by questioning the memory architecture postulated by SET and, following earlier work by Killeen and Fetterman (1988), by suggesting an explicit hypothesis concerning how animals might learn to time. More to the point, LeT has called our attention to memory structure in timing. Perhaps then a hybrid between the two models will preserve their strengths and eliminate their weaknesses. We have proposed one. It remains to be seen whether the new model will confirm the well known fact that most interspecies hybrids are sterile or the equally well known fact that most intraspecies hybrids have increased vigor. Time will tell.

Acknowledgments

The authors thank R. Church, P. Guilhardi, F. Lopez, M. Menez, J. Josefowiez, F. Silva, M.P. de Carvalho, and three anonymous reviewers for their helpful comments. Armando Machado was supported by a grant from the Portuguese Foundation for Science and Technology (FCT). He thanks the Department of Mathematics for Science and Technology of the University of Minho and the Department of Psychology of Brown University for their hospitality during parts of his sabbatical leave.

APPENDIX

FI schedule

We assume an FI schedule T-s long. Variables t and n represent the time into the trial and the states, respectively, with t ≥ 0 and n = 1, 2, …

State dynamics, N(t)

On each trial, the states are activated serially, starting with state 1 at trial onset. The states are activated at a rate of λ states per second, with λ > 0 a Gaussian random variable with mean μ and standard deviation σ. We represent by N(x, μ, σ) the density function, and by Φ(x, μ, σ) the distribution function, of that Gaussian variable evaluated at x.

Figure A1 illustrates the state dynamics on two trials of an FI 15-s schedule. On one trial, the sampled value of λ equalled 0.8 and consequently the active state changed at a rate of 0.8 states per sec (i.e., every 1.25 s). If we denote by N(t) the active state at time t—the ordinate in Figure A1, top panel—then N(t) equals the smallest integer greater than λt, which we represent by the symbol ⌈λt⌉. The last active state, that is, the state active at T = 15 s, the time of reinforcement, was state 13 (i.e., N(T) = 13); states n < 13 were active before T = 15 s, whereas states n > 13 were inactive during the trial. We may say that state n = 13 was the reinforced state, states n < 13 were extinguished states, and states n > 13 were inactive states. Note that in the FI the last active state is always a reinforced state, but in other procedures this may not be the case (e.g., on the empty trials of a peak procedure, the last active state is not a reinforced state). On the other trial, λ = 1.2, states n < 19 were extinguished states, state n = 19 was the reinforced state, and states n > 19 were inactive states. At the end of each trial then we may divide the states into three classes, extinguished states, the reinforced state, and inactive states. Clearly, in which class a state falls on any given trial is a random variable that depends on T and the sampled value of λ.

Fig A1. The top panel shows two sample paths of N(t), the active state at time t, in an FI T = 15 s. The activation speed λ came from a Gaussian distribution with mean μ = 1 and standard deviation σ = 0.2. When λ = 0.8, the last active (and reinforced) state, N(T), equalled 13; when λ = 1.2, N(T) = 19. The bottom panel illustrates, for n = 12, how q(n,T), p(n,T) and r(n,T) relate to the Gaussian density function for λ.

Let q(n,T) be the probability that state n is an extinguished state, p(n,T) the probability that state n is the reinforced state, and r(n,T) the probability that state n is an inactive state. Obviously, for any state n, q(n,T) + p(n,T) + r(n,T)  = 1. To derive an expression for q(n,T), note that state n is an extinguished state if and only if n is less than the reinforced state, N(T). Thus q(n,T) is the probability that n < N(T), which we represent by P{n < N(T)}. Because N(T)  =  ⌈λT⌉, we get

q(n,T) = P{n < ⌈λT⌉} = P{λ ≥ n/T} = 1 − Φ(n/T, μ, σ). (1)

Similarly, state n is an inactive state if and only if n > N(T). That is,

r(n,T) = P{n > N(T)} = P{λ < (n − 1)/T} = Φ((n − 1)/T, μ, σ). (2)

Finally, from p(n,T)  = 1−q(n,T) − r(n,T), we obtain the probability that state n is the reinforced state,

p(n,T) = 1 − q(n,T) − r(n,T) = Φ(n/T, μ, σ) − Φ((n − 1)/T, μ, σ). (3)

The bottom panel of Figure A1 illustrates for n  =  12 how q(n,T), p(n,T) and r(n,T) relate to the Gaussian density function for λ. The areas show r(n,T) (Eq. (2)), p(n,T) (Eq. (3)), and q(n,T) (Eq. (1)).
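The three probabilities are straightforward to compute; here is a sketch under the stated Gaussian assumption (our code, using only the Python standard library):

    from statistics import NormalDist

    def q_p_r(n, T, mu=1.0, sigma=0.2):
        """Probabilities that state n is extinguished (q), reinforced (p),
        or inactive (r) on an FI T-s trial; see Equations (1)-(3)."""
        Phi = NormalDist(mu, sigma).cdf
        q = 1.0 - Phi(n / T)        # the wave passed state n: lambda >= n/T
        r = Phi((n - 1) / T)        # the wave stopped before state n
        return q, 1.0 - q - r, r    # p = 1 - q - r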

Below we will use the approximation

p(n,T) ≈ (1/T)N(n/T, μ, σ), (4)

and from Equation (4) we derive two other relations used in the derivations below,

p(kn, kT) ≈ p(n,T)/k, for any positive integer k, (5)

and

E[N(T)] ≈ μT. (6)
W(n)

Let W(n, m) be the strength of the link of state n at the beginning of trial m and E[W(n, m)] its expected value. We seek an expression that relates E[W(n, m+1)] to E[W(n, m)]. To determine E[W(n, m+1)], we consider three cases on trial m: the reinforced state, n*, was less than, equal to, or greater than n. Therefore,

E[W(n, m+1)] = p(n,T)E[W(n, m) + β(1 − W(n, m))] + q(n,T)E[(1 − α/n*)W(n, m)] + r(n,T)E[W(n, m)].

Next, we approximate the value of n* by its expected value, E[n*] = μT (Equation (6)), and take expectations again to obtain,

E[W(n, m+1)] = (1 − a_n)E[W(n, m)] + βp(n,T), where a_n = βp(n,T) + (α/(μT))q(n,T).

The solution of this difference equation is

E[W(n, m)] = W∞(n) + [W0 − W∞(n)](1 − a_n)^(m−1), (7)

with

W∞(n) = βp(n,T)/a_n and a_n = βp(n,T) + (α/(μT))q(n,T).

Equation (7) was used to fit the simulation data in panel A of Figure 8.

At the steady state, and dropping the expectation symbol to ease the notation, we obtain

W(n) = βp(n,T)/[βp(n,T) + (α/(μT))q(n,T)]. (8)
Scalar property

To stress the fact that W(n) depends on T, we write it as W(n,T). The function W(n,T) shows the scalar property because

W(kn, kT) = βp(kn, kT)/[βp(kn, kT) + (α/(μkT))q(kn, kT)] ≈ β[p(n,T)/k]/{β[p(n,T)/k] + (α/(μkT))q(n,T)} = βp(n,T)/[βp(n,T) + (α/(μT))q(n,T)] = W(n,T),

where we have used Equation (5) and the fact that, for any integer k, q(kn, kT)  =  q(n,T).

R(t)

Responses occur when the active state has strength above the threshold, that is, when W(n) > θ. This inequality has no explicit solution for n. Hence, to predict the average response rate function, we determine, numerically, the first state n for which W(n) > θ; call this state n+. The response probability at time t may be approximated by the probability that state n+ or a subsequent state is active at time t, that is,

R(t) ≈ P{N(t) ≥ n+} = P{λ ≥ (n+ − 1)/t} = 1 − Φ((n+ − 1)/t, μ, σ). (9)

Equation (9) is plotted in panel D of Figure 8 for four different FIs. The approximation is reasonable.

Exploratory analyses showed that the R(t) curve is well fitted by a log-normal distribution, but we have not been able to derive the distribution from the model.
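For readers who want to reproduce panels A and D of Figure 8 numerically, the following sketch (ours) implements the steady state of Equation (8) and the response function of Equation (9), with the parameter values used in the simulations:

    from statistics import NormalDist

    def steady_W(n, T, mu=1.0, sigma=0.2, alpha=1.0, beta=0.2, W0=0.12):
        """Steady-state link strength, Equation (8)."""
        Phi = NormalDist(mu, sigma).cdf
        q = 1.0 - Phi(n / T)
        p = Phi(n / T) - Phi((n - 1) / T)
        denom = beta * p + (alpha / (mu * T)) * q
        # states the wave virtually never reaches keep their initial strength
        return W0 if denom < 1e-12 else beta * p / denom

    def response_prob(t, T, theta=0.1, mu=1.0, sigma=0.2):
        """Equation (9): find n+, the first state whose link exceeds the
        threshold, then P{state n+ or a later state is active at time t}."""
        n_plus = 1
        while steady_W(n_plus, T, mu=mu, sigma=sigma) <= theta:
            n_plus += 1
        return 1.0 - NormalDist(mu, sigma).cdf((n_plus - 1) / t)

With these parameter values, n+ falls at roughly two-thirds of μT, which matches the postreinforcement pause described in the main text.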

Simulation algorithm

Assuming Δt  =  1 s, the following steps would simulate the model:

1. Initialize the model parameters and set W(n) = W0 for all n.

2. For each trial:

   a. Sample the value of λ from a Gaussian distribution with mean μ and standard deviation σ;

   b. Then, for all t equal to 1, 2, …, T, do:

      i. Determine the state active at time t: N(t) = ⌈λt⌉, where ⌈x⌉ denotes the smallest integer greater than x;

      ii. Determine the response at time t. A response occurs if the strength of the state active at time t, W(N(t)), is above threshold; hence, R(t) = 1 if W(N(t)) > θ, and R(t) = 0 otherwise;

   c. Determine the reinforced state, n* = N(T) = ⌈λT⌉;

   d. Increase the link strength of the reinforced state: W(n*) → W(n*) + β(1−W(n*));

   e. Decrease the link strength of all extinguished states: W(n) → W(n) − (α/n*)W(n), for all n < n*;

   f. Save the relevant trial statistics and go to the next trial.
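A direct Python transcription of these steps (our sketch; the array size and the number of trials are arbitrary choices):

    import math
    import random

    def simulate_fi(T=60, trials=1000, mu=1.0, sigma=0.2,
                    alpha=1.0, beta=0.2, theta=0.1, W0=0.12):
        """Simulate one stat-pigeon on an FI T-s schedule with a 1-s time step.
        Returns the links W and the proportion of trials with a response at t."""
        n_max = int(3 * mu * T) + 2           # more states than the wave can reach
        W = [W0] * (n_max + 1)                # W[n]; index 0 unused
        rate = [0.0] * (T + 1)
        for _ in range(trials):
            lam = max(random.gauss(mu, sigma), 1e-6)       # step a
            for t in range(1, T + 1):                      # step b
                N_t = min(math.floor(lam * t) + 1, n_max)  # step b.i
                if W[N_t] > theta:                         # step b.ii
                    rate[t] += 1.0 / trials
            n_star = min(math.floor(lam * T) + 1, n_max)   # step c
            W[n_star] += beta * (1.0 - W[n_star])          # step d
            for n in range(1, n_star):                     # step e
                W[n] -= (alpha / n_star) * W[n]
        return W, rate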

Peak procedure

The peak procedure comprises FI T1-s trials intermixed with T2-s empty trials (T2 ≫ T1); the FI trials occur randomly with probability r1.

W(n)

A set of steps similar to those used for the FI schedule (see also Machado, 1997) yields the approximate solution for W(n). The expected value of W(n, m) equals

E[W(n, m)] = W∞(n) + [W0 − W∞(n)](1 − a_n)^(m−1), (10)

with

a_n = r1[βp(n,T1) + (α/(μT1))q(n,T1)] + (1 − r1)(α/(μT2))[q(n,T2) + p(n,T2)] and W∞(n) = r1βp(n,T1)/a_n.

Equation (10) was used to plot the four curves in panel A of Figure 9.

At the steady state,

W(n, T1, T2) = r1βp(n,T1)/{r1βp(n,T1) + r1(α/(μT1))q(n,T1) + (1 − r1)(α/(μT2))[q(n,T2) + p(n,T2)]}. (11)

If r1  =  1, then the peak procedure becomes a simple FI T1 s and Equation (11) reduces to Equation (8).

Scalar property

To emphasize that W(n) depends on T1 and T2, we write it as W(n, T1, T2). From Equation (11) and after using Equation (5) in both numerator and denominator, we get

W(kn, kT1, kT2) = p(n,T1)/{p(n,T1) + (α/(βμT1))q(n,T1) + [(1 − r1)/r1](α/(βμT2))[q(n,T2) + p(n,T2)/k]}.

Because q(n,T2) + p(n,T2)/k ≈ q(n,T2) for large k, it follows that

W(kn, kT1, kT2) ≈ W(n, T1, T2).
R(t)

To approximate the response probability function, we obtain numerically the first value of n such that W(n) > θ and the subsequent value of n such that W(n) < θ. We refer to them as n+ and n−, respectively. Then,

R(t) ≈ P{n+ ≤ N(t) < n−} = Φ((n− − 1)/t, μ, σ) − Φ((n+ − 1)/t, μ, σ). (12)

Equation (12) was used to plot the functions in panel C of Figure 9.

Mixed-FI schedules

The procedure comprises an FI T1 s and an FI T2 s (T1 < T2). The short FI occurs with probability r1.

W(n)

The expected value of W(n, m) equals

E[W(n, m)] = W∞(n) + [W0 − W∞(n)](1 − a_n)^(m−1), (13)

with

a_n = r1[βp(n,T1) + (α/(μT1))q(n,T1)] + (1 − r1)[βp(n,T2) + (α/(μT2))q(n,T2)] and W∞(n) = β[r1p(n,T1) + (1 − r1)p(n,T2)]/a_n.

Equation (13) was used to plot the four curves in panel A of Figure 10.

At the steady state,

W(n, T1, T2) = β[r1p(n,T1) + (1 − r1)p(n,T2)]/{β[r1p(n,T1) + (1 − r1)p(n,T2)] + r1(α/(μT1))q(n,T1) + (1 − r1)(α/(μT2))q(n,T2)}. (14)

In the cases r1 = 1 or r1 = 0, the mixed schedule becomes a simple FI schedule, and Equation (14) reduces to Equation (8).

Scalar property

Equation (14) satisfies the scalar property, that is, W(kn, kT1, kT2)  =  W(n, T1,T2).

R(t)

To approximate the response probability function, we obtain numerically the first value of n such that W(n) > θ, the subsequent value of n such that W(n) < θ, and the second value of n such that W(n) > θ again. We refer to them as n1+, n−, and n2+, respectively. Then,

R(t) = Σ p(n, t), with the sum taken over the states n1+ ≤ n < n− and n ≥ n2+.    (15)

Equation (15) was used to plot the functions in panel C of Figure 10.
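A companion sketch for Equation (15), under the same Gaussian assumptions and indexing conventions as the peak-procedure helper above. Rather than locating n1+, n−, and n2+ explicitly, it sums p(n, t) over every supra-threshold state, which is equivalent when the strength profile crosses the threshold exactly three times:

    import numpy as np
    from scipy.stats import norm

    def mixed_fi_response_probability(W, theta, t, mu, sigma):
        """Equation (15): sum of p(n, t) over the supra-threshold regions
        n1+ <= n < n- and n >= n2+ (W indexed from 1; W[0] unused)."""
        n_max = len(W) - 1
        n = np.arange(1, n_max + 1)
        cdf = lambda x: norm.cdf((x - mu * t) / (sigma * t))
        p = cdf(n) - cdf(n - 1)                  # p(n, t) for n = 1..n_max
        total = float(p[W[1:] > theta].sum())
        # States beyond n_max keep their initial strength (see Footnote 11);
        # if that strength is above threshold, add the tail P(N(t) > n_max).
        if W[-1] > theta:
            total += float(1 - cdf(n_max))
        return total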

Temporal bisection

Two samples, S and L (L > S), are paired with two responses, Red and Green, respectively (i.e., {S, L}→{Red, Green}). We assume the simplest task—the two samples are equally likely and each correct response is reinforced.

WR(n) and WG(n)

There are two vectors of link strengths, one for each response. We could not solve for the steady state values of WR(n) and WG(n) in the general case with α > 0 and β > 0. Therefore, we studied the simpler case with α = 0 and β > 0. The end result is similar to one of the four cases examined by Gibbon (1984; the case “Scalar timing, likelihood ratio”).

When extinction has no effect (α = 0), reinforcement of one response increases the link connecting the active state to that response and decreases the link connecting the active state to the competing response. An intuitive argument provides the key to the solution for WR(n) and WG(n). If state n is more likely to be active at the end of the short sample than at the end of the long sample, then it will tend to become linked with the “short” response (i.e., Red): choices of Red made while state n is active will be rewarded more often than choices of Green, which will strengthen WR(n) and weaken WG(n), which in turn will make Red even more likely when state n is active. This positive feedback loop will, on average, drive WR(n) to 1 and WG(n) to 0. Panel A of Figure 11 illustrates the effect.

The steady state values of the two vectors will be either 0 or 1. The transition will occur between the last state that is more likely to be active at the end of the Short sample and the first state that is more likely to be active at the end of the Long sample. That is, the transition is the solution of the equation

p(n, S) = p(n, L),

which may be approximated by the solution of the equation

(1/(σS)) φ((n − μS)/(σS)) = (1/(σL)) φ((n − μL)/(σL)), with φ the standard normal density,

obtained, once again, using Equation (4) to approximate p(n,T). The solution is

n1 = [μ SL/(S+L)] {1 + √[1 + 2(σ/μ)² ((L+S)/(L−S)) ln(L/S)]}.    (16)

The bisection point or PSE is therefore approximately equal to

PSE ≈ n1/μ = [SL/(S+L)] {1 + √[1 + 2(σ/μ)² ((L+S)/(L−S)) ln(L/S)]}.    (17)

To better understand the solution, we expand the square root in a Taylor series and retain only its first two terms. After some rearrangement we obtain

PSE ≈ HM + (σ/μ)² [SL/(L−S)] ln(L/S),

where HM = 2SL/(S+L) is the harmonic mean of S and L. The predicted PSE is greater than the HM; its deviation from the geometric mean is small for small ratios of L/S (e.g., 4) but increases with that ratio.
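As a numerical illustration (the values of S, L, and σ/μ below are ours, chosen for convenience), the following sketch evaluates Equation (17) and its two-term Taylor approximation and compares them with the harmonic and geometric means:

    import math

    def pse(S, L, gamma):
        """Bisection point from Equation (17); gamma = sigma/mu."""
        hm = 2 * S * L / (S + L)                       # harmonic mean of S and L
        root = math.sqrt(1 + 2 * gamma**2 * (L + S) / (L - S) * math.log(L / S))
        return (hm / 2) * (1 + root)

    def pse_approx(S, L, gamma):
        """Two-term Taylor approximation: HM + (sigma/mu)^2 * SL * ln(L/S) / (L - S)."""
        hm = 2 * S * L / (S + L)
        return hm + gamma**2 * S * L * math.log(L / S) / (L - S)

    S, L, gamma = 1.0, 4.0, 0.35
    print(pse(S, L, gamma), pse_approx(S, L, gamma))   # approx. 1.80 and 1.83
    print(2 * S * L / (S + L), math.sqrt(S * L))       # HM = 1.6, GM = 2.0

With these values the PSE falls above the harmonic mean and somewhat below the geometric mean, as the text describes for an L/S ratio of 4.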

P{“Short”|T}

Given the 0/1 distribution of the link strengths, the probability of a “Short” response following a sample T-s long is the probability that the active state is n ≤ n1. Simulations showed that a more accurate result is obtained by adding to n1 a small correction factor between −1 and 1. That is,

P{“Short”|T} = ∫_0^{n1+ε} p(x, T) dx, with the correction factor ε between −1 and 1.    (18)
Scalar property

The scalar property states that P(“Short” | T, S, L), the probability of choosing “Short” given a T-s sample in a discrimination task with S and L training samples, equals P(“Short” | kT, kS, kL), the probability of choosing “Short” given a kT-s sample in a discrimination task with kS and kL training samples. The scalar property is apparent from Equation (16): if L and S are replaced by kL and kS, respectively, then the new solution, n2, equals kn1 and, as a consequence, the upper limit of the integral in Equation (18) remains constant.
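Under the Gaussian form of p(n, T) used above, Equation (18) reduces to a normal ogive, which makes the scalar property easy to verify numerically. A minimal sketch (the parameter values and ε are illustrative assumptions):

    from scipy.stats import norm

    def p_short(T, n1, mu=1.0, sigma=0.35, eps=0.0):
        """Equation (18): P('Short' | T) = P(N(T) <= n1 + eps),
        with N(T) = ceil(lambda * T) and lambda ~ N(mu, sigma)."""
        return norm.cdf((n1 + eps - mu * T) / (sigma * T))

    # Scalar-property check: multiplying the sample and the solution n1 by k
    # leaves the choice probability unchanged (exactly so when eps = 0).
    k = 3.0
    print(p_short(2.0, n1=1.8), p_short(k * 2.0, n1=k * 1.8))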

Temporal Generalization

The task comprises a set of K samples with durations T1, T2, …, TK, occurring with frequencies f1, f2, …, fK, and reinforced with probabilities r1, r2, …, rK.

W(n)

Consider a trial with sample Tj. For W(n) to change, the animal must (a) pay attention (probability π); (b) be in state n at the end of the sample (probability p(n, Tj)); and (c) respond (probability W(n)). Given that all these events occur, the amount of change in W(n) depends on the outcome: reinforcement with probability rj and extinction with probability 1 − rj. Putting all the events together yields

E[ΔW(n) | Tj] = π p(n, Tj) W(n) [rj β (1 − W(n)) − (1 − rj) α W(n)],    (19)

where ΔW(n) = W(n, m+1) − W(n, m).

Rearranging and adding the effect of all samples yields

E[ΔW(n)] = π W(n) Σj fj p(n, Tj) [rj β (1 − W(n)) − (1 − rj) α W(n)].    (20)

Letting

a(n) = β Σj fj rj p(n, Tj)    and    b(n) = α Σj fj (1 − rj) p(n, Tj)

gives, after some rearrangement,

E[ΔW(n)] = π W(n) {a(n) − [a(n) + b(n)] W(n)}.

We approximate the steady state solution of W(n) by assuming that the variance of W(n) is small, such that E[W²(n, m)] ≈ E²[W(n, m)]. The result is the deterministic recursion

E[W(n, m+1)] ≈ E[W(n, m)] + π E[W(n, m)] {a(n) − [a(n) + b(n)] E[W(n, m)]}.

At the steady state,

W(n) = a(n) / [a(n) + b(n)].    (21)

The values of W(n) depend on the sample durations Tj, their frequencies of occurrence fj, and their reinforcement probabilities rj.

To illustrate how Equation (21) is used, assume the conditions of Experiment 1 in Church and Gibbon's (1982) study. The only reinforced sample was T = 4 s; it occurred on half of the trials, and each of the other eight samples, T1, T2, …, T8, occurred on 1/16th of the trials. Then,

W(n) = p(n, 4) / [p(n, 4) + (α/(8β)) Σj p(n, Tj)],

with the sum taken over the eight unreinforced samples.
R(t)

At the end of a sample of duration t, the probability that a response occurs, P{R(t) = 1}, is determined as follows: with probability 1 − π the animal is not paying attention and therefore responds with probability C; with probability π it is paying attention and responds with a probability that depends on the link of the active state:

P{R(t) = 1} = (1 − π) C + π Σn p(n, t) W(n),    (22)

with W(n) defined by Equation (21). Figure 12 shows plots of Equation (22) for various experiments from Church and Gibbon's (1982) study.
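To make Equations (21) and (22) concrete, here is a Python sketch for the Church and Gibbon (1982) example discussed above. The set of sample durations and all parameter values are illustrative assumptions on our part:

    import numpy as np
    from scipy.stats import norm

    mu, sigma = 1.0, 0.3       # rate parameters (illustrative)
    alpha, beta = 0.1, 0.05    # extinction and reinforcement rates (illustrative)
    pi_att, C = 0.9, 0.05      # attention probability and baseline rate (illustrative)

    def p(n, T):
        """p(n, T) = P(ceil(lambda * T) = n) with lambda ~ N(mu, sigma)."""
        return norm.cdf((n - mu * T) / (sigma * T)) - norm.cdf((n - 1 - mu * T) / (sigma * T))

    T_plus = 4.0                                        # reinforced sample, f = 1/2
    T_other = [0.8, 1.6, 2.4, 3.2, 4.8, 5.6, 6.4, 7.2]  # assumed eight unreinforced samples
    n = np.arange(1, 17)                                # states considered

    # Steady state link strengths, Equation (21), specialized as in the text:
    num = p(n, T_plus)
    den = (alpha / (8 * beta)) * sum(p(n, Tj) for Tj in T_other)
    W = num / (num + den + 1e-300)     # tiny floor guards 0/0 far from training

    def R(t):
        """Equation (22): response probability at the end of a t-s sample."""
        return (1 - pi_att) * C + pi_att * float(np.sum(p(n, t) * W))

    print([round(R(t), 3) for t in (1, 2, 4, 6, 8)])

With these illustrative values, the gradient peaks near the reinforced duration and falls toward the unattended baseline (1 − π)C on both sides, the general pattern plotted in Figure 12.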

Scalar property

Assume that only one sample, T+, is reinforced and it is presented on half of the trials. The other samples, T1, T2,…,T8, are never reinforced and are equally likely to occur on the other half of the trials. We introduce the notation W(n, T+, T) to stress the dependence of W(n) on the samples, and we approximate the sum in Equation (22) by an integral, that is,

R(t; T+, T) = (1 − π) C + π ∫ p(x, t) W(x, T+, T) dx.

Then, when all sample stimuli are multiplied by k, we obtain

R(t; kT+, kT) = (1 − π) C + π ∫ p(x, t) W(x, kT+, kT) dx.

We will show that

R(kt; kT+, kT) = R(t; T+, T),    (23)

which is the scalar property. Using Equation (21),

W(kn, kT+, kT) = p(kn, kT+) / [p(kn, kT+) + c Σj p(kn, kTj)]

for some constant c = α/(8β). Then, using Equation (6), we obtain

W(kn, kT+, kT) = W(n, T+, T).

Substituting this equality into the integral above and rescaling the variable of integration yields Equation (23).

Footnotes

1

We have excluded prospective timing tasks (e.g., Gibbon & Church, 1981) because it is not yet clear what a timing model must account for in these tasks (e.g., Preston, 1994; Cerutti & Staddon, 2004; Machado & Vasconcelos, 2006). We have also excluded temporal differentiation tasks (e.g., Platt, 1979) because the LeT model has not been applied to them.

2

If λ is a Gaussian variable with mean μ and standard deviation σ, then the value in the accumulator at the end of an interval of length t will be a Gaussian variable with mean μ×t and standard deviation σ×t. The effect of k* is similar.

3

Whereas the Behavioral theory of Timing assumes that only one state is active at any time, in LeT, at t > 0, all states are active albeit in different degrees. The function relating state number, n, to degree of activation at time t is equal to BeT’s probability function that state n is active at time t (see Machado, 1997).

4

Some indirect evidence supports the assumption: With training, pigeons take fewer trials to correct a mistake. Early in training, they may take 3 or more trials on the average before switching from the incorrect (i.e., unreinforced) choice to the correct choice; later in training, typically they switch on the next trial.

5

But not each number, for several balls may have the same number.

6

Here and elsewhere we assume that all pigeons had the same key color assignments. In the real experiment, color was counterbalanced.

7

The context effect may also be interpreted as a peak-shift-like phenomenon. Pecks on the Green and Blue keys are operants controlled by the sample duration. This control is maximal at 4 s, the SD, and, as with other stimulus dimensions, it may decrease as the signal duration departs from 4 s (Church & Gibbon, 1982). The 1-s sample (associated with Red) may be seen as an SΔ for pecking the Green key. If we assume that the effect of the SΔ is to shift the peak of the generalization gradient away from the SΔ (Elsmore, 1971; Guttman, 1959; Hanson, 1959), then the gradient for Green will have its peak above 4 s. By similar reasoning, the peak of the generalization gradient for Blue will shift to durations shorter than 4 s because its SΔ is at 16 s. The net effect of these two shifts is that the gradient for Blue will peak at a shorter duration than the gradient for Green. Hence, on tests with the Green and Blue keys, preference for Green will increase with sample duration, which is the context effect.

8

If preference for Green over Blue is due to a peak-shift-like effect, then that preference should be enhanced by shortening the distance between the SD and the SΔ (Guttman, 1959; Hanson, 1959). That distance is smaller in Group 8 (SΔ − SD = 4) than in Group 16 (SΔ − SD = 12).

9

The distinction between times of reinforcement and rates of reinforcement at those times is echoed in the dual memory structure, pattern memory and strength memory, respectively, of Guilhardi, Yi, and Church’s (2007) timing model.

10

An alternative extinction rule would be as follows. Instead of waiting for the end of the trial to determine the reinforced state, n*, and then decreasing the link strength of all extinguished states (n < n*), one could decrease the link strength of a state while it is the active state. In this case, to obtain the scalar property and preserve the linear operator model, the change in W(n) would need to be inversely proportional to n, that is, ΔW(n) = −(α/n)W(n). The implications of this alternative extinction rule remain to be worked out.

11

Because states further down the chain do not lose their initial couplings, the model predicts indifference if the test stimulus is significantly longer than the long training stimulus. SET cannot predict this effect (see Siegel, 1986).

REFERENCES

1. Allan L.G. The influence of the scalar timing model on human timing research. Behavioural Processes. 1998;44:101–117. doi: 10.1016/s0376-6357(98)00043-6.
2. Arantes J. Comparison of Scalar Expectancy Theory (SET) and the Learning-to-Time (LeT) model in a successive temporal bisection task. Behavioural Processes. 2008;78:269–278. doi: 10.1016/j.beproc.2007.12.008.
3. Arantes J, Machado A. Context effects in a temporal discrimination task: Further tests of the Scalar Expectancy Theory and Learning-to-Time models. Journal of the Experimental Analysis of Behavior. 2008;90:33–51. doi: 10.1901/jeab.2008.90-33.
4. Bizo L.A, White K.G. The behavioral theory of timing: Reinforcer rate determines pacemaker rate. Journal of the Experimental Analysis of Behavior. 1994a;61:19–34. doi: 10.1901/jeab.1994.61-19.
5. Bizo L.A, White K.G. Pacemaker rate in the behavioral theory of timing. Journal of Experimental Psychology: Animal Behavior Processes. 1994b;20:308–321.
6. Bizo L.A, White K.G. Biasing the pacemaker in the behavioral theory of timing. Journal of the Experimental Analysis of Behavior. 1995a;64:225–235. doi: 10.1901/jeab.1995.64-225.
7. Bizo L.A, White K.G. Reinforcement context and pacemaker rate in the behavioral theory of timing. Animal Learning & Behavior. 1995b;23:376–382.
8. Bizo L.A, White K.G. Timing with controlled reinforcer density: Implications for models of timing. Journal of Experimental Psychology: Animal Behavior Processes. 1997;23:44–55.
9. Bizo L.A, Chu J.Y, Sanabria F, Killeen P.R. The failure of Weber's law in time perception and production. Behavioural Processes. 2006;71:201–210. doi: 10.1016/j.beproc.2005.11.006.
10. Buhusi C.V, Meck W.H. What makes us tick? Functional and neural mechanisms of interval timing. Nature Reviews Neuroscience. 2005;6:755–765. doi: 10.1038/nrn1764.
11. Catania A.C. Reinforcement schedules and psychophysical judgments: A study of some temporal properties of behavior. In: Schoenfeld W.N, editor. The theory of reinforcement schedules. New York: Appleton-Century-Crofts; 1970. pp. 1–42.
12. Catania A.C, Reynolds G.S. A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior. 1968;11:327–383. doi: 10.1901/jeab.1968.11-s327.
13. Cerutti D.T, Staddon J.E.R. Immediacy vs. anticipated delay in the time-left experiment: A test of the cognitive hypothesis. Journal of Experimental Psychology: Animal Behavior Processes. 2004;30:45–57. doi: 10.1037/0097-7403.30.1.45.
14. Cevik M. Effects of methamphetamine on duration discrimination. Behavioral Neuroscience. 2003;117:774–784. doi: 10.1037/0735-7044.117.4.774.
15. Church R.M. Properties of the internal clock. In: Allan L, Gibbon J, editors. Timing and time perception. New York: Annals of the New York Academy of Sciences; 1984. pp. 566–582.
16. Church R.M. Quantitative models of animal learning and cognition. Journal of Experimental Psychology: Animal Behavior Processes. 1997;23:379–389. doi: 10.1037//0097-7403.23.4.379.
17. Church R.M. A concise introduction to scalar timing theory. In: Meck W.H, editor. Functional and neural mechanisms of interval timing. Boca Raton, FL: CRC Press; 2003. pp. 3–22.
18. Church R.M. Temporal learning. In: Pashler H, Gallistel C.R, editors. Stevens' handbook of experimental psychology, Vol. 3: Learning, motivation, and emotion. 3rd ed. New York: Wiley; 2004. pp. 365–393.
19. Church R.M, Broadbent H.A. Alternative representations of time, number, and rate. Cognition. 1990;37:55–81. doi: 10.1016/0010-0277(90)90018-f.
20. Church R.M, Deluty M.Z. Bisection of temporal intervals. Journal of Experimental Psychology: Animal Behavior Processes. 1977;3:216–228. doi: 10.1037//0097-7403.3.3.216.
21. Church R.M, Gibbon J. Temporal generalization. Journal of Experimental Psychology: Animal Behavior Processes. 1982;8:165–186.
22. Church R.M, Meck W.H, Gibbon J. Application of scalar timing theory to individual trials. Journal of Experimental Psychology: Animal Behavior Processes. 1994;20:135–155. doi: 10.1037//0097-7403.20.2.135.
23. Crystal J.D, Baramidze G.T. Endogenous oscillations in short-interval timing. Behavioural Processes. 2006;74:152–158. doi: 10.1016/j.beproc.2006.10.008.
24. Dews P.B. The theory of fixed-interval responding. In: Schoenfeld W.N, editor. The theory of reinforcement schedules. New York: Appleton-Century-Crofts; 1970. pp. 43–61.
25. Dews P.B. Studies on responding under fixed-interval schedules of reinforcement: II. The scalloped pattern of the cumulative record. Journal of the Experimental Analysis of Behavior. 1978;29:67–75. doi: 10.1901/jeab.1978.29-67.
26. Dragoi V, Staddon J.E.R, Palmer R.G, Buhusi C.V. Interval timing as an emergent learning property. Psychological Review. 2003;110:126–144. doi: 10.1037/0033-295x.110.1.126.
27. Elsmore T.F. Control of responding by stimulus duration. Journal of the Experimental Analysis of Behavior. 1971;16:81–87. doi: 10.1901/jeab.1971.16-81.
28. Falk J.L. The origin and functions of adjunctive behavior. Animal Learning & Behavior. 1977;5:325–335.
29. Ferster C.B, Skinner B.F. Schedules of reinforcement. New York: Appleton-Century-Crofts; 1957.
30. Fetterman J.G, Killeen P.R. Adjusting the pacemaker. Learning and Motivation. 1991;22:226–252.
31. Fetterman J.G, Killeen P.R. Categorical scaling of time: Implications for clock-counter models. Journal of Experimental Psychology: Animal Behavior Processes. 1995;21:43–63.
32. Fetterman J.G, Killeen P.R, Hall S. Watching the clock. Behavioural Processes. 1998;44:211–224. doi: 10.1016/S0376-6357(98)00050-3.
33. Gallistel C.R. The organization of learning. Cambridge, MA: Bradford Books/MIT Press; 1990.
34. Gallistel C.R. Flawed foundations of associationism? Comments on Machado and Silva (2007). American Psychologist. 2007;62:682–685. doi: 10.1037/0003-066X.62.7.682.
35. Gibbon J. Scalar expectancy theory and Weber's law in animal timing. Psychological Review. 1977;84:279–325.
36. Gibbon J. On the form and location of the psychometric bisection function for time. Journal of Mathematical Psychology. 1981;24:58–87.
37. Gibbon J. Origins of scalar timing theory. Learning and Motivation. 1991;22:3–38.
38. Gibbon J. Ubiquity of scalar timing with a Poisson clock. Journal of Mathematical Psychology. 1992;36:283–293.
39. Gibbon J, Church R.M. Time left: Linear versus logarithmic subjective time. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:87–108.
40. Gibbon J, Church R.M, Meck W.H. Scalar timing in memory. In: Allan L, Gibbon J, editors. Timing and time perception. New York: Annals of the New York Academy of Sciences; 1984. pp. 52–77.
41. Grossberg S, Schmajuk N.A. Neural dynamics of adaptive timing and temporal discrimination during associative learning. Neural Networks. 1989;2:79–102.
42. Guilhardi P, Yi L, Church R.M. A modular theory of learning and performance. Psychonomic Bulletin & Review. 2007;14:543–559. doi: 10.3758/bf03196805.
43. Guilhardi P, Macinnis M, Church R.M, Machado A. Shifts in the psychophysical function in rats. Behavioural Processes. 2007;75:167–175. doi: 10.1016/j.beproc.2007.02.002.
44. Guttman N. Generalization gradients around stimuli associated with different reinforcement schedules. Journal of Experimental Psychology. 1959;58:335–340. doi: 10.1037/h0045679.
45. Hanson H.M. Effects of discrimination training on stimulus generalization. Journal of Experimental Psychology. 1959;58:321–333. doi: 10.1037/h0042606.
46. Ivry R, Spencer R.M.C. The neural representation of time. Current Opinion in Neurobiology. 2004;14:225–232. doi: 10.1016/j.conb.2004.03.013.
47. Killeen P.R. Behavior's time. In: Bower G.H, editor. The psychology of learning and motivation. New York: Academic Press; 1991. pp. 295–334.
48. Killeen P.R, Fetterman J.G. A behavioral theory of timing. Psychological Review. 1988;95:274–285. doi: 10.1037/0033-295x.95.2.274.
49. Killeen P.R, Hall S, Bizo L.A. A clock not wound runs down. Behavioural Processes. 1999;45:129–139. doi: 10.1016/s0376-6357(99)00014-5.
50. Killeen P.R, Weiss N.A. Optimal timing and the Weber function. Psychological Review. 1987;94:455–468.
51. Kirkpatrick K. Packet theory of conditioning and timing. Behavioural Processes. 2002;57:89–106. doi: 10.1016/s0376-6357(02)00007-4.
52. Kirkpatrick K, Church R.M. Are separate theories of conditioning and timing necessary? Behavioural Processes. 1998;44:163–182. doi: 10.1016/s0376-6357(98)00047-3.
53. Kirkpatrick-Steger K, Miller S, Betti C, Wasserman E. Cyclic responding by pigeons on the peak timing procedure. Journal of Experimental Psychology: Animal Behavior Processes. 1996;22:447–460. doi: 10.1037//0097-7403.22.4.447.
54. Leak T.M, Gibbon J. Simultaneous timing of multiple intervals: Implications of the scalar property. Journal of Experimental Psychology: Animal Behavior Processes. 1995;21:3–19.
55. Lejeune H, Wearden J.H. Scalar properties in animal timing: Conformity and violations. The Quarterly Journal of Experimental Psychology. 2006;59:1875–1908. doi: 10.1080/17470210600784649.
56. Lejeune H, Richelle M, Wearden J.H. About Skinner and time: Behavior-analytic contributions to research on animal timing. Journal of the Experimental Analysis of Behavior. 2006;85:125–142. doi: 10.1901/jeab.2006.85.04.
57. Machado A. Learning the temporal dynamics of behavior. Psychological Review. 1997;104:241–265. doi: 10.1037/0033-295x.104.2.241.
58. Machado A, Arantes J. Further tests of the Scalar Expectancy Theory (SET) and the Learning-to-Time (LeT) model in a temporal bisection task. Behavioural Processes. 2006;72:195–206. doi: 10.1016/j.beproc.2006.03.001.
59. Machado A, Cevik M. Acquisition and extinction under periodic reinforcement. Behavioural Processes. 1998;44:237–262. doi: 10.1016/s0376-6357(98)00052-7.
60. Machado A, Guilhardi P. Shifts in the psychometric function and their implications for models of timing. Journal of the Experimental Analysis of Behavior. 2000;74:25–54. doi: 10.1901/jeab.2000.74-25.
61. Machado A, Keen R. Learning to Time (LET) or Scalar Expectancy Theory (SET)? A critical test of two models of timing. Psychological Science. 1999;10:285–290.
62. Machado A, Pata P. Testing the Scalar Expectancy Theory (SET) and the Learning to Time model (LeT) in a double bisection task. Learning & Behavior. 2005;33:111–122. doi: 10.3758/bf03196055.
63. Machado A, Silva F.J. Toward a richer view of the scientific method: The role of conceptual analysis. American Psychologist. 2007;62:671–681. doi: 10.1037/0003-066X.62.7.671.
64. Machado A, Vasconcelos M. Acquisition versus steady state in the time-left experiment. Behavioural Processes. 2006;71:172–187. doi: 10.1016/j.beproc.2005.11.004.
65. Matell M.S, Meck W.H. Neuropsychological mechanisms of interval timing behavior. BioEssays. 2000;22:94–103. doi: 10.1002/(SICI)1521-1878(200001)22:1<94::AID-BIES14>3.0.CO;2-E.
66. Matthews T.J, Lerer B.E. Behavior patterns in pigeons during autoshaping with an incremental conditioned stimulus. Animal Learning & Behavior. 1987;15:69–75.
67. McClure E.A, Saulsgiver K.A, Wynne C.D.L. Effects of d-amphetamine on temporal discrimination in pigeons. Behavioural Pharmacology. 2005;16:193–208. doi: 10.1097/01.fbp.0000171773.69292.bd.
68. Meck W.H. Selective adjustment of the speed of internal clock and memory processes. Journal of Experimental Psychology: Animal Behavior Processes. 1983;9:171–201.
69. Meck W.H. Neuropharmacology of timing and time perception. Cognitive Brain Research. 1996;3:227–242. doi: 10.1016/0926-6410(96)00009-2.
70. Meck W.H, Church R.M. Simultaneous temporal processing. Journal of Experimental Psychology: Animal Behavior Processes. 1984;10:1–29.
71. Mellon R.C, Leak T.M, Fairhurst S, Gibbon J. Timing processes in the reinforcement-omission effect. Animal Learning & Behavior. 1995;23:286–296.
72. Monteiro T, Machado A. Oscillations following periodic reinforcement. Behavioural Processes. 2009;81:170–188. doi: 10.1016/j.beproc.2008.10.003.
73. Morgan L, Killeen P.R, Fetterman J.G. Changing rates of reinforcement perturbs the flow of time. Behavioural Processes. 1993;30:259–272. doi: 10.1016/0376-6357(93)90138-H.
74. Oliveira L, Machado A. The effect of sample duration and cue on a double temporal discrimination. Learning and Motivation. 2008;39:71–94.
75. Oliveira L, Machado A. Context effect in a temporal bisection task with the choice keys available during the sample. Behavioural Processes. 2009;81:286–292. doi: 10.1016/j.beproc.2008.12.021.
76. Platt J.R. Temporal differentiation and the psychophysics of time. In: Zeiler M.D, Harzem P, editors. Reinforcement and the organization of behavior. New York: Wiley; 1979. pp. 1–29.
77. Platt J.R, Davis E.R. Bisection of temporal intervals by pigeons. Journal of Experimental Psychology: Animal Behavior Processes. 1983;9:160–170.
78. Preston R.A. Choice in the time-left procedure and in concurrent chains with a time-left terminal link. Journal of the Experimental Analysis of Behavior. 1994;61:349–373. doi: 10.1901/jeab.1994.61-349.
79. Rescorla R.A. Pavlovian conditioned inhibition. Psychological Bulletin. 1969;72:77–94.
80. Richelle M, Lejeune H. Time in animal behavior. New York: Pergamon; 1980.
81. Roberts S. Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:242–268.
82. Roberts W.A. Principles of animal cognition. New York: McGraw-Hill; 1998.
83. Rodriguez-Girones M.A, Kacelnik A. Behavioral adjustment to modifications in the temporal parameters of the environment. Behavioural Processes. 1999;44:173–191. doi: 10.1016/s0376-6357(99)00017-0.
84. Russell R, Kirkpatrick K. The role of temporal generalization in a temporal discrimination task. Behavioural Processes. 2007;74:115–125. doi: 10.1016/j.beproc.2006.08.004.
85. Sanabria F, Killeen P.R. Temporal generalization accounts for response resurgence in the peak procedure. Behavioural Processes. 2007;74:126–141. doi: 10.1016/j.beproc.2006.10.012.
86. Schneider B.A. A two-state analysis of fixed-interval responding in the pigeon. Journal of the Experimental Analysis of Behavior. 1969;12:677–687. doi: 10.1901/jeab.1969.12-677.
87. Siegel S.F. A test of the similarity rule model of temporal bisection. Learning and Motivation. 1986;17:59–75.
88. Skinner B.F. The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts; 1938.
89. Staddon J.E.R. Schedule-induced behavior. In: Honig W.K, Staddon J.E.R, editors. Handbook of operant behavior. Englewood Cliffs, NJ: Prentice-Hall; 1977. pp. 125–152.
90. Staddon J.E.R, Cerutti D.T. Operant conditioning. Annual Review of Psychology. 2003;54:115–144. doi: 10.1146/annurev.psych.54.101601.145124.
91. Staddon J.E.R, Higa J.J. Temporal learning. In: Bower G.H, editor. The psychology of learning and motivation, Vol. 27. New York: Academic Press; 1991. pp. 265–294.
92. Staddon J.E.R, Higa J.J. Time and memory: Towards a pacemaker-free theory of interval timing. Journal of the Experimental Analysis of Behavior. 1999;71:215–251. doi: 10.1901/jeab.1999.71-215.
93. Staddon J.E.R, Higa J.J. Interval timing. Nature Reviews Neuroscience. 2006 Aug 1;7.
94. Staddon J.E.R, Innis N.K. Reinforcement omission on fixed-interval schedules. Journal of the Experimental Analysis of Behavior. 1969;12:689–700. doi: 10.1901/jeab.1969.12-689.
95. Staddon J.E.R, Simmelhag V.L. The "superstition" experiment: A re-examination of its implications for the principles of adaptive behavior. Psychological Review. 1971;78:3–43.
96. Stubbs A. The discrimination of stimulus duration by pigeons. Journal of the Experimental Analysis of Behavior. 1968;11:223–238. doi: 10.1901/jeab.1968.11-223.
97. Stubbs A. Temporal discrimination and a free-operant psychophysical procedure. Journal of the Experimental Analysis of Behavior. 1980;33:167–185. doi: 10.1901/jeab.1980.33-167.
98. Timberlake W, Lucas G.A. The basis of superstitious behavior: Chance contingency, stimulus substitution, or appetitive behavior. Journal of the Experimental Analysis of Behavior. 1985;44:279–299. doi: 10.1901/jeab.1985.44-279.
99. Whitaker J.S, Lowe C.F, Wearden J.H. Multiple-interval timing in rats: Performance on two-valued mixed fixed-interval schedules. Journal of Experimental Psychology: Animal Behavior Processes. 2003;29:277–291. doi: 10.1037/0097-7403.29.4.277.
100. Whitaker J.S, Lowe C.F, Wearden J.H. When to respond? And how much? Temporal control and response output on mixed-fixed-interval schedules with unequally probable components. Behavioural Processes. 2008;77:33–42. doi: 10.1016/j.beproc.2007.06.001.
101. Yi L. Applications of timing theories to a peak procedure. Behavioural Processes. 2007;75:188–198. doi: 10.1016/j.beproc.2007.01.010.
102. Zeiler M.D. On sundials, springs, and atoms. Behavioural Processes. 1998;44:89–99. doi: 10.1016/s0376-6357(98)00042-4.
103. Zeiler M.D, Powell D.G. Temporal control in fixed-interval schedules. Journal of the Experimental Analysis of Behavior. 1994;61:1–9. doi: 10.1901/jeab.1994.61-1.
