Journal of the Experimental Analysis of Behavior. 2007 Sep;88(2):285–317. doi: 10.1901/jeab.2007.88-285

A Theory of Attending, Remembering, and Reinforcement in Delayed Matching to Sample

John A Nevin, Michael Davison, Amy L Odum, Timothy A Shahan
PMCID: PMC1986439  PMID: 17970420

Abstract

A theory of attending and reinforcement in conditional discriminations (Nevin, Davison, & Shahan, 2005) is extended to working memory in delayed matching to sample by adding terms for disruption of attending during the retention interval. Like its predecessor, the theory assumes that reinforcers and disruptors affect the independent probabilities of attending to sample and comparison stimuli in the same way as the rate of overt free-operant responding as suggested by Nevin and Grace (2000), and that attending is translated into discriminative performance by the model of Davison and Nevin (1999). The theory accounts for the effects of sample-stimulus discriminability and retention-interval disruption on the levels and slopes of forgetting functions, and for the diverse relations between accuracy and sensitivity to reinforcement reported in the literature. It also accounts for the effects of reinforcer probability in multiple schedules on the levels and resistance to change of forgetting functions; for the effects of reinforcer probabilities signaled within delayed-matching trials; and for the effects of reinforcer delay, sample duration, and intertrial-interval duration. The model accounts for some data that have been problematic for previous theories, and makes testably different predictions of the effects of reinforcer probabilities and disruptors on forgetting functions in multiple schedules and signaled trials.

Keywords: delayed matching to sample, attending, remembering, accuracy, sensitivity, reinforcer probability, reinforcer delay


Conditional discriminations arrange that the stimulus signaling availability of reinforcement for a response depends on the value of a separate stimulus. For example, in an early study, Lashley (1938) trained rats in alternating blocks of trials to jump toward an upright triangle if the background was black, and toward an inverted triangle if the background was striped. In a series of test trials, Lashley varied the properties of the stimuli and concluded that choices depended both on the figure and on the background so that “a genuine reversal of the sense of reaction was determined by the character of the ground” (1938, p. 317). In other words, his subjects attended to both components of the conditional discrimination.

In Lashley's (1938) work, it is not clear whether the ground (black versus striped) or the figure (upright versus inverted triangle) served as the conditional cue to reverse the sense of the reaction. The distinction is clear, however, in the conditional-discrimination paradigm known as delayed matching to sample (DMTS). In a standard example, a pigeon is presented with a red or green light (the sample) on the center key of a three-key array for a few seconds. Then, after a retention interval, red and green lights are presented on the side keys (the comparisons), and food reinforcement is contingent on pecks to the comparison with the same color as the sample. In this situation, the sample color can be identified as the conditional cue determining the “sense of the reaction” to red or green comparisons on the basis of the sequential order and temporal separation between stimulus presentations. In the DMTS paradigm, it is especially clear that the pigeon must attend to both samples and comparisons in order to perform correctly and obtain reinforcers.

Nevin, Davison, and Shahan (2005) proposed that attending to the components of a conditional discrimination could be construed as unmeasured or covert behavior that was related to reinforcement according to the same quantitative expression as overt behavior. They applied their model of attending to matching-to-sample and signal-detection paradigms in which conditional stimuli signal which of two responses (defined either by separate stimuli or by topography) is correct, and showed that it gave a good account of data in situations where aspects of the stimuli and the conditions of reinforcement were varied with no temporal separation between conditional stimuli and response opportunities. Here, we extend the model to short-term working memory in the DMTS paradigm where the conditional stimuli and the response-defining stimuli are separated by a retention interval.

In general terms, the DMTS paradigm involves presentation of one or the other of two sample stimuli (S1 or S2) in discrete trials. After offset of the sample, a retention interval of length t seconds intervenes before the simultaneous presentation of two comparison stimuli (C1 and C2), one of which is physically the same as the preceding sample. Choice responses (B1 and B2) are defined by the comparison stimuli to which they are directed. A response to the matching comparison (B1 given S1 or B2 given S2) may be reinforced; a response to the nonmatching comparison is not reinforced. In a variant of the paradigm known as symbolic DMTS (DSMTS), the physical relations between S1 or S2 and C1 and C2 (or B1 and B2) are arbitrary, but the same reinforcement contingencies are in effect. The matrix of stimuli, responses, and reinforcers is shown in Figure 1, where cells are denoted by row-column notation. Thus, R11 refers to reinforcers for B1 on S1 trials (Cell 11), and R22 refers to reinforcers for B2 on S2 trials (Cell 22); no reinforcers are scheduled in Cells 12 or 21 (errors) in the studies considered here.

Fig. 1.

The matrix of stimuli and responses defined by a conditional discrimination such as DMTS, where the samples are S1 or S2, and responses B1 and B2 are defined by the comparison stimuli C1 and C2. Cells are identified by row–column notation. Reinforcers for correct responses are designated R11 and R22; no consequences are arranged in the error Cells 12 and 21.

Nevin et al. (2005) assumed that on each trial, the probabilities of attending to the samples, p(As), and to the comparisons, p(Ac), were determined independently by a function derived from behavioral momentum theory (Nevin & Grace, 2000). Then, the probabilities of B1 and B2 given S1 or S2 were determined by the allocation of effective reinforcers to the cells of the matrix in Figure 1 as suggested by Davison and Nevin (1999). To bring the model of Nevin et al. to bear on DMTS, we make one additional assumption, namely that attending may be disrupted by distraction or interference during retention intervals. We begin by reviewing behavioral momentum theory and the Davison–Nevin model.

Behavioral Momentum Theory

Nevin and Grace (2000) suggested that the relation between response rate during short-term disruption and predisruption baseline response rate is given by

$$\log\left(\frac{B_x}{B_o}\right) = \frac{-x}{(r_s/r_a)^{b}} \tag{1}$$

where Bo and Bx are response rates measured during baseline and disruption, respectively; x is a dimensionless number representing the potency of the disruptor, with a negative sign indicating that the disruptor decreases response rate; rs is the reinforcer rate correlated with the stimulus situation in which responding is measured; ra is the overall average reinforcer rate in the experimental setting; and b is an exponent representing the sensitivity of resistance to change to the reinforcer ratio. Many studies of resistance to change have arranged multiple schedules, where rs is identified with the reinforcer rate in a target component and ra is the average reinforcer rate in a session. When variable-interval (VI) schedules are arranged in both components, the value of b is usually about 0.5 (Nevin, 2002), and we will use that value in all model applications below.

Equation 1 can be rewritten to describe the relation between steady-state response rate and reinforcer rate. Exponentiating,

$$B = B_o \exp\left[\frac{-x}{(r_s/r_a)^{b}}\right] \tag{2}$$

where Bo is now identified with the asymptotic response rate as rs/ra goes to infinity, and x is a general background disruptor that keeps measured response rate from achieving that asymptote when rs/ra is less than infinity.
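As a minimal sketch of how reinforcer rate modulates resistance to disruption, the following assumes Equation 2 has the exponential form B = Bo · exp(−x / (rs/ra)^b) implied by the definitions above. The function name, the two component reinforcer rates, and the equal-time session average are illustrative assumptions, not values from the studies cited.

```python
import math

def response_rate(b_o, x, r_s, r_a, b=0.5):
    """Response rate under disruptor x, assuming B = Bo * exp(-x / (rs/ra)**b)."""
    return b_o * math.exp(-x / (r_s / r_a) ** b)

# Hypothetical two-component multiple schedule (reinforcers per hour).
r_rich, r_lean = 120.0, 30.0
# Session average, assuming the components occupy equal time.
r_avg = (r_rich + r_lean) / 2.0

for x in (0.0, 0.5, 1.0):
    rich = response_rate(1.0, x, r_rich, r_avg)   # proportion of baseline
    lean = response_rate(1.0, x, r_lean, r_avg)
    print(f"x={x:.1f}  rich Bx/Bo={rich:.3f}  lean Bx/Bo={lean:.3f}")
```

Under these assumptions, the same disruptor x produces a smaller proportional decrease in the richer component, which is the sense in which resistance to change depends on rs/ra.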

Nevin et al. (2005) showed that Equation 2 provides a good description of Shahan's (2002) data relating the rate of observing behavior, which is often construed as an overt component of attending, to the reinforcer rate. Therefore, Equation 2 was used to characterize the probabilities of attending in the model of Nevin et al., and will be used similarly here. Specifically, we will assume that

$$p(A_s) = \exp\left[\frac{-(x + qt)}{(r_s/r_a)^{b}}\right] \tag{3}$$

and

$$p(A_c) = \exp\left[\frac{-(z + vt)}{(r_c/r_s)^{b}}\right] \tag{4}$$

Equation 3 states that the probability of attending to the samples, p(As), is an increasing function of the reinforcer rate correlated with the samples, rs (i.e., reinforcers per session divided by the time preceding, during, and following sample presentation until onset of the comparisons). Attending to the samples depends directly on the value of rs relative to the session average reinforcer rate ra, and inversely on a general background disruptor x and a separate disruptor qt that is specific to the retention interval, where q represents disruption per unit time and t is the duration of the retention interval. Similarly, Equation 4 states that the probability of attending to the comparisons, p(Ac), is an increasing function of the reinforcer rate correlated with the comparisons, rc (i.e., reinforcers per session divided by the time from sample offset to comparison offset). Attending to the comparisons depends directly on the value of rc relative to the reinforcer rate correlated with the samples, rs, and inversely on a general background disruptor z (which may differ from x) and on a separate disruptor vt, where v (which may differ from q) represents disruption per unit time within the retention interval and t is the duration of the retention interval. If t = 0, Equations 3 and 4 are exactly the same as Equations 5 and 6 in Nevin et al. (2005), which concentrated on conditional discriminations with no retention interval.
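The two attending probabilities can be sketched directly from this verbal description. The code below assumes the exponential forms p(As) = exp(−(x + qt) / (rs/ra)^b) and p(Ac) = exp(−(z + vt) / (rc/rs)^b); the function names are hypothetical, and the reinforcer rates are passed in rather than derived.

```python
import math

def p_attend_sample(x, q, t, r_s, r_a, b=0.5):
    """Equation 3 (assumed form): disruptors x + q*t in the numerator,
    relative reinforcer rate rs/ra raised to b in the denominator."""
    return math.exp(-(x + q * t) / (r_s / r_a) ** b)

def p_attend_comparison(z, v, t, r_c, r_s, b=0.5):
    """Equation 4 (assumed form): same structure with z, v, and rc/rs."""
    return math.exp(-(z + v * t) / (r_c / r_s) ** b)

# With t = 0 the retention-interval terms q*t and v*t vanish, so only the
# background disruptors x and z matter.
print(p_attend_sample(x=0.1, q=0.3, t=0.0, r_s=1.0, r_a=1.0))
print(p_attend_comparison(z=0.1, v=0.3, t=0.0, r_c=4.0, r_s=1.0))
```

Both functions return values in (0, 1] whenever the disruptors are non-negative, so they can be read as probabilities without further normalization.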

Calculating Reinforcer Rates

To calculate the reinforcer rates ra, rs, and rc in Equations 3 and 4, it is necessary to specify when attending to samples and comparisons is assumed to occur in order to establish the appropriate time bases. Figure 2 shows the sequence of events within a standard DMTS trial and suggests how the subject's activities, measured or unmeasured, are assumed to take place during a trial. Specifically, we assume that the subject may orient toward the sample location or otherwise engage in observing behavior before sample onset, attend to the sample while it is present, and then attend to the “sample-as-coded” for the duration of the retention interval. Attending to the sample-as-coded corresponds to the notion of rehearsal in more cognitive accounts of memory processes, and may best be conceptualized as attending to any sample-related behavior, measured or unmeasured, that the subject may emit during the retention interval. Despite their nominal differences, all of these activities are summarized by the term “attending to the sample” and take a single value of p(As) according to Equation 3. The time during which the subject is assumed to engage in sample-related attending is shown as a dotted line in Figure 2.

Fig. 2.

Time-line diagram of experimentally arranged events within a DMTS trial, and the times during which the subject is assumed to attend to the sample and comparisons. Times during which reinforcers and disruptors are assumed to operate on attending to samples or comparisons are also indicated. See text for explanation.

The reinforcer rate correlated with attending to the sample, rs, is given by dividing the number of reinforcers arranged for correct responses in an experimental condition by the total time during which the subject may orient toward, observe, or attend to the sample, including the intertrial interval (ITI), the sample duration, and the retention interval. This total time, which is coextensive with p(As), is indicated by line t(rs) in Figure 2. In the numerator of Equation 3, t is identified with the retention interval, which is a part of t(rs) as shown in Figure 2. The time base for rs is the same as in Nevin et al. (2005) with the addition of retention-interval time during which the subject may attend to the sample-as-coded.

We also assume that the subject may orient toward the comparison locations or engage in related observing behavior during the retention interval and then attend to the comparisons while they are present, as indicated by the dotted line. These activities are summarized by the term “attending to the comparisons” and take a single value of p(Ac) according to Equation 4. Thus, during the retention interval, the subject is assumed to engage in two concurrent, independent activities—colloquially, looking for the comparisons while rehearsing the way in which it has coded the samples—as suggested in Figure 2.

The reinforcer rate correlated with attending to the comparisons, rc, is given by dividing the number of reinforcers arranged for correct responses by the length of the retention interval during which the subject may orient toward or attend to the comparisons plus their duration indicated by line t(rc) in Figure 2, which is coextensive with p(Ac). Typically, comparison stimuli are turned off after a single response, so we assume 1-s latencies unless they are reported. The time base for rc is the same as in Nevin et al. (2005) with the addition of the retention interval.

The overall average reinforcer rate in a session, ra, is given by dividing all reinforcers available for correct responses by session duration. In Equation 3, rs is divided by ra because attending to the sample occurs within the overall session context. Likewise, in Equation 4, rc is divided by rs because attending to the comparisons occurs within the context provided by the samples.
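The time bases described above can be sketched as follows. This is an illustration under stated assumptions: every trial has the same retention interval t, the session consists only of trials (ITI, sample, retention interval, and choice latency), and time spent in reinforcer delivery is ignored. The function name and the example counts are hypothetical.

```python
def reinforcer_rates(reinforcers, trials, iti, sample, t, latency=1.0):
    """Return (rs, rc, ra) in reinforcers per second.

    rs: time base is ITI + sample duration + retention interval.
    rc: time base is retention interval + comparison duration (latency).
    ra: time base is the whole session (assumed here to be trials only).
    """
    t_rs = iti + sample + t
    t_rc = t + latency
    t_trial = iti + sample + t + latency
    r_s = reinforcers / (trials * t_rs)
    r_c = reinforcers / (trials * t_rc)
    r_a = reinforcers / (trials * t_trial)
    return r_s, r_c, r_a

# Illustrative session: 80 trials, 40 reinforcers, 4-s retention interval.
r_s, r_c, r_a = reinforcer_rates(40, 80, iti=20.0, sample=2.0, t=4.0)
print(f"rs/ra = {r_s / r_a:.3f}, rc/rs = {r_c / r_s:.3f}")
```

Because the same reinforcer count appears in all three rates, the ratios rs/ra and rc/rs used in Equations 3 and 4 reduce to ratios of time bases under these assumptions.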

As described above, attending to the samples may be reduced by general background disruption x plus disruptors that are specific to the retention interval, qt. Likewise, attending to the comparisons may be reduced by general background disruption z plus disruptors that are specific to the retention interval, vt. Figure 2 indicates the times during which disruptors x, z, and q or v are assumed to impact p(As) and p(Ac).

To give a sense of how the model operates, Figure 3 illustrates how model parameters affect p(As) and p(Ac) in DMTS with representative experimental parameters: 20-s ITI, 2-s sample duration, 1-s latency to respond to the comparisons, and retention intervals ranging from 0 to 10 s, mixed within a condition. (In terms of the model, an alternative procedure in which the retention interval is varied between conditions differs only in the calculation of rs.) The left panel shows that if q = 0 (i.e., no interference with attending to the sample-as-coded during the retention interval), p(As) is essentially constant over retention intervals, decreasing slightly because rs decreases as the retention interval lengthens. Increasing q to 0.3 leads to a steep decline in p(As) as the retention interval lengthens. The left panel also shows that the level of p(As) at t = 0 depends inversely on x. The right panel shows that if v = 0 (i.e., no disruption of observing behavior with respect to the upcoming comparisons), p(Ac) decreases slightly more than p(As) when q = 0 because the retention interval has a proportionally larger impact on rc than on rs in Equation 4. Increasing v to 0.3 leads to a steep decline in p(Ac) as the retention interval lengthens. The right panel also shows that the level of p(Ac) at t = 0 depends inversely on z. Thus, p(As) and p(Ac) are similarly affected by disruptors as suggested by Equations 3 and 4, but note that the level of p(Ac) is less affected than that of p(As) by the same changes in x and z. The reason is that rc/rs in Equation 4 is larger than rs/ra in Equation 3 unless the ITI is very short. Equations 3 and 4 can simulate a number of basic results in the DMTS literature with a fixed set of parameter values when p(As) and p(Ac) are translated into measured discrimination by the model of conditional discrimination performance proposed by Davison and Nevin (1999).

Fig. 3.

Illustrations of the ways in which probabilities of attending to the samples, p(As), and to the comparisons, p(Ac), depend on the length of the retention interval with several values of the disruptors x, z, q, and v, assuming mixed retention intervals and representative experimental parameters.
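The qualitative patterns described for Figure 3 can be reproduced with a short sketch. It uses the representative parameters from the text (20-s ITI, 2-s sample, 1-s latency, b = 0.5) and the assumed exponential forms of Equations 3 and 4. Two simplifications are assumptions of this sketch, not of the source: the time bases are computed separately at each retention interval, and the session is treated as trials only, so that rs/ra and rc/rs reduce to ratios of time bases.

```python
import math

ITI, SAMPLE, LATENCY, B = 20.0, 2.0, 1.0, 0.5   # seconds; b from the text

def attending(t, x=0.1, q=0.0, z=0.1, v=0.0):
    """Return (p(As), p(Ac)) at retention interval t."""
    t_rs = ITI + SAMPLE + t          # time base for rs
    t_rc = t + LATENCY               # time base for rc
    t_trial = t_rs + LATENCY         # assumed session time per trial
    rs_over_ra = t_trial / t_rs      # reinforcer counts cancel
    rc_over_rs = t_rs / t_rc
    p_as = math.exp(-(x + q * t) / rs_over_ra ** B)
    p_ac = math.exp(-(z + v * t) / rc_over_rs ** B)
    return p_as, p_ac

for t in (0, 2, 5, 10):
    flat_s, flat_c = attending(t)                 # q = v = 0
    steep_s, steep_c = attending(t, q=0.3, v=0.3)
    print(f"t={t:2d}  p(As): {flat_s:.3f} / {steep_s:.3f}   "
          f"p(Ac): {flat_c:.3f} / {steep_c:.3f}")
```

Under these assumptions p(As) is nearly flat when q = 0, p(Ac) declines somewhat more when v = 0 because t is a larger share of the rc time base, and both fall steeply when q or v is raised to 0.3.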

Discriminability and Choice in DMTS

With reference to the matrix of stimuli, responses, and reinforcers in Figure 1, Davison and Nevin (1999) suggested that the effects of R11 and R22 generalized to the other cells as a result of confusability (i.e., less-than-perfect discriminability) between stimulus–behavior and behavior–reinforcer relations in the discriminated operants S1B1R11 and S2B2R22. Their approach identified confusability with contingent relations between the terms of discriminated operants, and is complicated by the fact that changing the physical properties of C1 and C2, which define B1 and B2, would affect both stimulus–behavior and behavior–reinforcer relations. Here, we simplify the Davison–Nevin approach by concentrating on stimulus-specific sources of generalization. We will designate the discriminability of the samples S1 and S2 as ds, and the discriminability of the comparison stimuli C1 and C2 as dc. Both ds and dc are construed as structural limits on the subject's ability to distinguish the relevant stimuli that depend on its sensory system, the physical differences between the samples and the comparisons, and other constant features of the experimental setting.

Both ds and dc range from infinity, implying perfect discriminability, to 1.0, implying complete confusability. Discriminability is the inverse of confusability, so generalization due to confusability of the stimuli is given by multiplying scheduled reinforcers by 1/ds for samples and by 1/dc for comparisons. The resulting matrix of scheduled and generalized reinforcers that accumulate in the cells of the matrix of Figure 1 is shown in Figure 4, where responses B1 and B2 are defined by the stimuli C1 and C2. For example, reinforcers in Cell 11 generalize to Cell 21 to the extent that the samples are confusable, and therefore the effective reinforcers in Cell 21 include those in Cell 11 multiplied by 1/ds. Additionally, reinforcers in Cell 22 also generalize to Cell 21 to the extent that the comparisons are confusable, and therefore the effective reinforcers in Cell 21 include those in Cell 22 multiplied by 1/dc. Thus, although no reinforcers are scheduled in Cell 21, the effective number of reinforcers in Cell 21 is the sum of reinforcers generalized from Cells 11 and 22.

Fig. 4.

Effective reinforcer allocation in the cells of the stimulus–response matrix of Fig. 1. The discriminabilities of samples and comparisons are characterized as ds and dc, which may be conceptualized as distances between stimuli. Generalization between cells results from confusability of the samples and comparisons, 1/ds and 1/dc.
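The allocation of effective reinforcers can be sketched as a small function. The error cells follow the description in the text (for example, Cell 21 receives R11/ds plus R22/dc). The additional term in the correct cells, in which reinforcers from the opposite diagonal cell generalize through both stimulus dimensions and are divided by ds·dc, is an assumption of this sketch based on the Davison–Nevin form; it is what makes choice indifferent when either discriminability equals 1. The function name is hypothetical.

```python
def effective_reinforcers(r11, r22, ds, dc):
    """Effective reinforcers in Cells 11, 12, 21, 22 of the matrix."""
    return {
        "11": r11 + r22 / (ds * dc),   # scheduled + doubly generalized (assumed)
        "12": r11 / dc + r22 / ds,     # generalized from Cells 11 and 22
        "21": r11 / ds + r22 / dc,     # generalized from Cells 11 and 22
        "22": r22 + r11 / (ds * dc),   # scheduled + doubly generalized (assumed)
    }

cells = effective_reinforcers(r11=100.0, r22=100.0, ds=400.0, dc=400.0)
for name, value in cells.items():
    print(f"Cell {name}: {value:.4f}")
```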

Davison and Nevin (1999) assumed that choice between B1 and B2 matched the effective numbers of reinforcers in Cells 11 and 12 on S1 trials, and in Cells 21 and 22 on S2 trials. From this assumption, Davison and Nevin made a number of predictions, a few of which were not generally supported by data (those predictions are not altered by shifting the notation to ds and dc and identifying these terms with the discriminability of the samples and comparisons, respectively). Nevin et al. (2005) showed that these discrepancies resulted from the implicit assumption of the Davison–Nevin model that the subject always attended to the stimuli, and could be rectified by allowing p(As) and/or p(Ac) to take values less than 1.0. For example, Nevin et al. showed that the predicted forms of the functions relating the log ratio of B1 to B2 on S1 and S2 trials to the log ratio of R11 to R22 changed substantially when p(As) decreased from 1.0 to .7. As a result, Nevin et al. were able to account for several data sets that were problematic for the original Davison–Nevin model.

Structure of the DMTS Model

Because p(As) and p(Ac) can take values less than 1.0, there are four possible states for the model, as shown in Figure 5. First, if the subject attends to the samples and to the comparisons, its probabilities of responding are given by the Davison–Nevin (1999) model with values of dsb and dbr replaced by ds and dc to characterize discriminabilities of the stimuli. The equations for calculating p(B1|S1) and p(B1|S2) are given at the bottom of Figure 5 as State 1, “Attend to both” [note that p(B2|S1)  =  1− p(B1|S1) and p(B2|S2)  =  1− p(B1|S2)]. Second, if the subject attends to the samples but does not attend to the comparisons, dc is effectively 1.0 so that B1 and B2 are chosen equally often on both S1 and S2 trials (State 2, “Attend to samples only”). Third, if the subject does not attend to the samples but does attend to the comparisons, ds is effectively 1.0 and choice between B1 and B2 is governed by the values of R11 and R22 as modulated by dc identically on S1 and S2 trials (State 3, “Attend to comparisons only”). Fourth, if the subject attends neither to the samples nor to the comparisons, both ds and dc are effectively 1.0 and B1 and B2 are chosen equally often (State 4, “No attending”). The values of p(B1|S1) and p(B1|S2) pooled for a large number of trials are calculated by multiplying p(B1|S1) and p(B1|S2) in each of the four states by the probabilities of entering these four states, which in turn are given by the values of p(As) and p(Ac) from Equations 3 and 4. The pooled values of p(B1|S1) and p(B1|S2) are then used to predict DMTS performance.

Fig. 5.

The upper part of this figure shows how probabilities of attending or not attending to the samples and comparisons in DMTS lead to four states that determine response probabilities p(B1|S1) and p(B1|S2) according to the expressions for each state in the lower part of the figure; note that p(B2|S1)  =  1− p(B1|S1), and p(B2|S2)  =  1− p(B1|S2). Failures to attend are equivalent to discriminabilities equal to 1, so that ds is omitted from expressions for states 3 and 4, and dc is omitted from expressions for states 2 and 4. Overall performance is predicted by weighting the response probabilities in each state by the probability of entering that state.
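The weighting of the four states can be sketched as follows. The expressions in the figure are not reproduced in this text, so the sketch assumes that choice in each state matches effective reinforcers, with correct cells receiving R11 + R22/(ds·dc) or R22 + R11/(ds·dc) and error cells receiving the generalized reinforcers described earlier, and that a failure to attend is implemented by setting the corresponding discriminability to 1. Function names are hypothetical.

```python
def choice_probs(r11, r22, ds, dc):
    """Return (p(B1|S1), p(B1|S2)) by matching to effective reinforcers."""
    e11 = r11 + r22 / (ds * dc)
    e12 = r11 / dc + r22 / ds
    e21 = r11 / ds + r22 / dc
    e22 = r22 + r11 / (ds * dc)
    return e11 / (e11 + e12), e21 / (e21 + e22)

def pooled_choice_probs(r11, r22, ds, dc, p_as, p_ac):
    """Weight each state's choice probabilities by the probability of the state."""
    states = [
        (p_as * p_ac, ds, dc),                  # State 1: attend to both
        (p_as * (1 - p_ac), ds, 1.0),           # State 2: samples only
        ((1 - p_as) * p_ac, 1.0, dc),           # State 3: comparisons only
        ((1 - p_as) * (1 - p_ac), 1.0, 1.0),    # State 4: no attending
    ]
    p1 = p2 = 0.0
    for weight, ds_eff, dc_eff in states:
        s1, s2 = choice_probs(r11, r22, ds_eff, dc_eff)
        p1 += weight * s1
        p2 += weight * s2
    return p1, p2

print(pooled_choice_probs(30.0, 10.0, 400.0, 400.0, p_as=0.9, p_ac=0.95))
```

Because p(As) and p(Ac) are independent, the four state weights are simple products and sum to 1.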

We will predict two measures of performance, log d and log b, that have been employed in many studies of conditional discrimination performance since their introduction into the experimental analysis of behavior by Davison and Tustin (1978). Both measures range from zero to infinity. Log d is the logarithm of the geometric mean of the ratios of probabilities of correct to incorrect responses on S1 and S2 trials, and measures the accuracy of discrimination, which is given by the empirical allocation of responses and is not the same as discriminability, a theoretical term characterizing the stimuli and the organism's sensory system:

$$\log d = 0.5 \log\left[\frac{p(B_1|S_1)}{p(B_2|S_1)} \cdot \frac{p(B_2|S_2)}{p(B_1|S_2)}\right] \tag{5}$$

Log b is the logarithm of the geometric mean of the ratios of probabilities of B1 to B2 responses on S1 and S2 trials and measures response bias, the tendency to emit B1 rather than B2:

$$\log b = 0.5 \log\left[\frac{p(B_1|S_1)}{p(B_2|S_1)} \cdot \frac{p(B_1|S_2)}{p(B_2|S_2)}\right] \tag{6}$$

It is of special interest to examine the relation between log b and the ratio of reinforcers, R11/R22. Many studies (for review see Davison & McCarthy, 1988) have found that

$$\log b = a \log\left(\frac{R_{11}}{R_{22}}\right) \tag{7}$$

which is a version of the generalized matching law where a represents the sensitivity of response ratios to reinforcer ratios (Baum, 1974; for present purposes we ignore the possibility of biased responding when R11  =  R22).
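The two measures and the sensitivity term can be computed from the choice probabilities as in the following sketch. It takes p(B1|S1) and p(B1|S2) as inputs, uses base-10 logarithms, and treats a as the single-point ratio log b / log(R11/R22) with no bias term; the function names are hypothetical.

```python
import math

def log_d(p_b1_s1, p_b1_s2):
    """Accuracy: log of the geometric mean of correct/incorrect ratios."""
    ratio_s1 = p_b1_s1 / (1.0 - p_b1_s1)     # correct : incorrect on S1 trials
    ratio_s2 = (1.0 - p_b1_s2) / p_b1_s2     # correct : incorrect on S2 trials
    return 0.5 * math.log10(ratio_s1 * ratio_s2)

def log_b(p_b1_s1, p_b1_s2):
    """Bias: log of the geometric mean of B1/B2 ratios."""
    ratio_s1 = p_b1_s1 / (1.0 - p_b1_s1)     # B1 : B2 on S1 trials
    ratio_s2 = p_b1_s2 / (1.0 - p_b1_s2)     # B1 : B2 on S2 trials
    return 0.5 * math.log10(ratio_s1 * ratio_s2)

def sensitivity(log_b_value, r11, r22):
    """Sensitivity a from Equation 7; requires R11 != R22."""
    return log_b_value / math.log10(r11 / r22)

# Symmetric performance: high accuracy, no bias.
print(log_d(0.9, 0.1), log_b(0.9, 0.1))
```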

In summary, the proposed model of DMTS consists of Equations 3 and 4, which predict probabilities of attending to the samples and comparisons; the four states that result from those probabilities characterized in Figure 5, together with the calculation of response probabilities in each state and the pooled values of p(B1|S1) and p(B1|S2); and the conversion of pooled p(B1|S1) and p(B1|S2) into behavioral measures log d, log b, and a. The terms of the full model are listed and characterized briefly in Table 1.

Table 1.

Model terms and parameters.

Components of conditional discriminations
  S1, S2: Sample stimuli in matching to sample or signal detection
  C1, C2: Comparison stimuli in matching to sample
  B1, B2: Responses defined by comparison stimuli or topography, represented as counts in the conditional-discrimination matrix of Figure 1
  R11, R22: Numbers of reinforcers for B11, B22

Model structure
  ds: Discriminability of sample stimuli
      Depends on S1S2 difference, sensory capacity
      Does not depend on reinforcer rate or allocation
  dc: Discriminability of comparison stimuli
      Depends on B1B2 or C1C2 difference, sensory capacity
      Does not depend on reinforcer rate or allocation
  p(As): Probability of attending to S1 and S2
      Depends on reinforcer rate relative to session context
      Does not depend on ds, dc, or reinforcer allocation
  p(Ac): Probability of attending to C1 and C2
      Depends on within-trial reinforcer rate relative to sample context
      Does not depend on ds, dc, or reinforcer allocation

Momentum equations
  B: Measured response rate (B/min)
  rs: Component reinforcer rate in multiple free-operant schedules or reinforcer rate for attending to S1 and S2
  rc: Within-trial reinforcer rate after offset of sample stimuli for attending to C1 and C2
  ra: Overall average session reinforcer rate
  x: Background disruption or competition for observing or attending to sample stimuli
  z: Background disruption or competition for observing or attending to comparison stimuli
  q: Disruption of attending to samples-as-coded per unit time during retention interval
  v: Disruption of observing or attending to comparisons per unit time during retention interval
  f, c, d: Parameters representing the effects of experimentally arranged disrupters: ICI food, contingency termination, and generalization decrement
  b: Sensitivity of response rates or probabilities of attending to rs/ra or rc/rs

Before fitting the proposed model to relevant data, we show that it can account for the form of the forgetting function, the effects of variations in sample discriminability, and the relation between a and the length of the retention interval. For these predictions, we use the representative temporal parameters that served to illustrate the properties of Equations 3 and 4 in Figure 3.

Forgetting Functions

The relation between log d and the length of the retention interval t is known as the forgetting function. Empirically, its form is monotonic decreasing and negatively accelerated. White (2001) has reviewed a number of equations that have been proposed to describe forgetting functions, all of which perform quite well with data from a wide range of experiments with human and nonhuman subjects. He suggested that a simple exponential decay function may be most useful theoretically because it implies a constant probability of failure to remember throughout the retention interval (see also White, 1991). He also suggested that an exponential decay with time scaled as the square root of t may be appropriate if forgetting results from a diffusion-like process related to the standard deviation of subjective time. Another common alternative is the hyperbolic decay function, which features prominently in research on delay of reinforcement (see McCarthy & White, 1987, for discussion).

The upper left panel of Figure 6 shows that with x = z = q = v = 0.1, and with ds = dc = 400 (signifying an easy discrimination such as red vs. green for the pigeon; see Nevin et al., 2005), our model generates a forgetting function (filled circles) that lies between the exponential (unfilled squares) and the hyperbolic (unfilled circles) or exponential on t^0.5 (filled triangles) in its degree of curvature. (The parameters of the three descriptive equations were chosen to approximate the forgetting function predicted by the model, with the value of log d at t = 0 set equal for all four functions.) The differences in curvature are easier to appreciate in the upper right panel of Figure 6, which plots the logarithm of log d so that the exponential decay function is linear. The function predicted by the model is slightly concave up, and the hyperbolic and scaled exponential are more strongly concave up.

Fig. 6.

The upper left panel shows the forgetting function predicted by Equations 3 and 4 with ds = dc = 400, with x = z = q = v = 0.1, and with the representative experimental parameters used to generate the functions in Fig. 3. The predicted function (filled circles) is compared with some descriptive functions that have been used to characterize empirical forgetting functions: exponential (unfilled squares), hyperbolic (unfilled circles), and exponential with retention intervals scaled as t^0.5 (filled triangles). Parameters are indicated in the legend. The upper right panel presents the same functions (with the same legend) as logarithms of log d, so that exponential functions are rendered as linear. The lower left panel illustrates the predicted effect of reducing sample discriminability, ds, from 400 to 4, together with the predicted effects of increasing background disruptors x and z from 0.1 to 0.5 with ds = 400. The lower right panel presents the same functions (with the same legend) as logarithms of log d, showing that changes in sample discriminability or background disruption appear as changes in the intercept but not the slope of the forgetting function.
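The three descriptive functions compared in Figure 6 can be sketched in their usual forms: exponential, log d = d0 · exp(−kt); hyperbolic, log d = d0 / (1 + kt); and exponential on the square root of time, log d = d0 · exp(−k·t^0.5). The legend values are not given in this text, so d0 and k below are hypothetical. The sketch checks curvature on the logarithmic scale using second differences.

```python
import math

def exponential(t, d0, k):
    return d0 * math.exp(-k * t)

def hyperbolic(t, d0, k):
    return d0 / (1.0 + k * t)

def exponential_sqrt(t, d0, k):
    return d0 * math.exp(-k * math.sqrt(t))

def second_difference(f, t, d0, k, h=1.0):
    """Second difference of log(f) at t; zero means linear on the log scale."""
    logs = [math.log(f(t + i * h, d0, k)) for i in (0, 1, 2)]
    return logs[0] - 2.0 * logs[1] + logs[2]

D0, K = 1.2, 0.2   # hypothetical intercept and decay rate
for name, f in [("exponential", exponential),
                ("hyperbolic", hyperbolic),
                ("exponential on sqrt(t)", exponential_sqrt)]:
    print(f"{name}: second difference of log = "
          f"{second_difference(f, 1.0, D0, K):+.4f}")
```

A second difference of zero corresponds to a straight line in the logarithmic plot, and a positive value corresponds to a function that is concave up there.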

The lower left panel of Figure 6 compares forgetting functions with x  =  z  =  q  =  v  =  0.1 for ds  =  400 and for ds  =  4, the latter signifying a relatively difficult discrimination between samples. The lower right panel shows that the functions are essentially parallel when plotted as the logarithm of log d. White (1985; see also 1991, 2001) has shown that variables affecting discrimination of the samples, such as the physical disparity between S1 and S2, sample durations, or sample response requirements, lead to empirical forgetting functions with different values at t  =  0 but with similar decay rates. Thus, model predictions generally accord with many research findings.

The lower panels of Figure 6 also show that predicted forgetting functions are almost identical if ds is 4 with x  =  z  =  0.1, or if ds  =  400 with x  =  z  =  0.5. Thus, the major effect of increasing background disruptors x and z is to shift the function downward without significantly altering its slope. The similarity of the effects of reducing sample discriminability or of increasing disruption suggests that either discriminabilities ds and dc or disruptors x and z could be omitted from the model. Nevertheless, we retain both sets of model terms to distinguish between the effects of variations in the stimuli, which determine response probabilities via the four sets of Davison–Nevin (1999) equations given in Figure 5, and the effects of variations in disruption and reinforcement, which determine p(As) and p(Ac) via Equations 3 and 4 and thereby determine the relative frequencies of entry into the four states depicted in Figure 5. Thus, the model can distinguish cases with high discriminability, represented by large values of ds and dc, but low probability of attending, represented by large values of x and z, and vice versa.

White (1985, 1991, 2001) also has shown that variables such as houselight illumination during the retention interval, usually interpreted as interference, affect the decay rate but not the initial level of exponential functions that describe empirical forgetting functions. The left panel of Figure 7 shows that increasing q from 0.1 to 0.3 leads to a sharp increase in decay rate but no change in initial level, whereas increasing v from 0.1 to 0.3 leads to a lesser increase in decay rate and a small decrease in initial level. The right panel of Figure 7 plots the forgetting functions as the logarithm of log d, where their linearity makes their exponential form and differences in their slopes more obvious.

Fig 7.

Forgetting functions predicted by Equations 3 and 4 with ds  =  dc  =  400, x  =  z  =  0.1, and with different values of q and v. The left panel presents the functions in their standard form with log d as the measure of accuracy; the right panel presents the same functions (with the same legend) as logarithms of log d to show that changes in retention-interval disruptors appear as changes in the slope of the forgetting function, with little or no change in the intercept.

The functions presented in Figures 6 and 7 demonstrate that the present model can simulate the major findings concerning the level and slope of empirical forgetting functions that have been described by exponential functions.

Sensitivity to Reinforcer Ratios

Davison and Nevin (1999) modeled the usual monotonic decreasing forgetting function by allowing the matrix of effective reinforcers to change over time in the retention interval, approaching equal effective reinforcement in all four cells as the retention interval became very long. They also showed that when the ratio of reinforcers, R11/R22, was varied, their model predicted increasing sensitivity to reinforcement (a in Equation 7) as the retention interval increased and log d decreased. Thus, Davison and Nevin predicted an inverse relation between the two dependent variables, log d and a. A similar inverse relation between log d and a was predicted by a very different model proposed by White and Wixted (1999). Unfortunately, the data on the relation between the retention interval and a are mixed: Jones and White (1992) and White and Wixted (1999) found that sensitivity increased, whereas McCarthy and Davison (1991) and McCarthy and Voss (1995) found that sensitivity decreased, and Harnett, McCarthy, and Davison (1984) found a decreasing trend. The latter three studies obtained the usual monotonic decreasing forgetting functions; that is, both sensitivity and accuracy decreased with increasing retention intervals, implying a direct relation between sensitivity and accuracy.

The present model can simulate both increasing and decreasing relations between a and the length of the retention interval as shown in the left panel of Figure 8; dc  =  400 for all examples. With ds  =  400 and x  =  z  =  q  =  v  =  0.1, our model predicts a shallow decreasing relation between sensitivity and retention interval length (filled circles). The predicted function starts higher and the decrease is more pronounced if ds  =  4 (filled squares). If q is increased to 0.2 and v is decreased to 0, sensitivity to reinforcement is predicted to increase as the retention interval becomes longer (unfilled circles). The reason is that p(As) decreases as the retention interval increases, resulting in an increased probability of entry into State 3, where response probabilities depend only on the ratio of reinforcers, R11/R22 (see Figure 5). By contrast, sensitivity is predicted to decrease as the retention interval becomes longer if ds  =  4 (unfilled squares) because increased numbers of effective reinforcers in cells 12 and 21 reduce the relative importance of State 3 in the calculation of pooled p(B1|S1) and p(B1|S2).1

Fig 8.

The left panel shows that sensitivity to reinforcer ratios (a) is predicted to be roughly constant over the retention interval with ds  =  400 and with x  =  z  =  q  =  v  =  0.1 (filled circles). When q  =  0.2 and v  =  0 with ds  =  400, the function increases (unfilled circles). When ds is decreased to 4 with x  =  z  =  q  =  v  =  0.1 (filled squares), and with q  =  0.2 and v  =  0 (unfilled squares), the functions decrease. Thus, the slope of the predicted relation between a and the retention interval depends on sample discriminability and the values of parameters representing disruptors in Equations 3 and 4. The right panel shows that the predicted relation between log d and a for x  =  0.1, z  =  0.1, q  =  0.2, and v  =  0 decreases with ds  =  400 (unfilled circles) and increases with ds  =  4 (unfilled squares). The function with ds  =  400, x  =  0.2, z  =  0.5, q  =  0.1, and v  =  0.1 (unfilled triangles) mimics the effects of very short intertrial intervals reported by White and Wixted (1999).

The right panel of Figure 8 depicts the relations between the dependent variables a and log d for two cases displayed in the left panel. The unfilled circles exhibit an inverse relation between a and log d of the sort that was predicted by Davison and Nevin (1999) and by White and Wixted (1999), and obtained by Jones and White (1992). The unfilled squares depict a direct relation of the sort obtained by McCarthy and Davison (1991) and by McCarthy and Voss (1995). The unfilled triangles, with ds  =  400, x  =  0.2, z  =  0.5, and q  =  v  =  0.1, depict a case where a is relatively low and constant or slightly increasing with respect to log d. White and Wixted (1999, Experiment 3) obtained a negative relation when the ITI was 15 s and accuracy was generally high, as in the function coded by unfilled circles, and a shallow positive relation when the ITI was 1 s and accuracy was generally low, as in the function coded by unfilled triangles. White and Wixted interpreted their 1-s ITI results in terms of proactive interference. Here, the higher values of x and z represent increased general background interference with attending, which could be interpreted as proactive interference.

In summary, our model can simulate the full range of results that have been reported in the literature. It does not, however, identify experimental variables that might affect parameters x, z, q, and v. If such variables can be identified and manipulated systematically in conjunction with the discriminability of the samples, it should be possible for a single experiment to generate the full range of relations between sensitivity, retention interval length, and accuracy of discrimination that have been reported in the literature, as suggested by Figure 8.

Modeling Experimental Data

We now show that our model can be fitted to the data of studies where reinforcer rates and disruptors are varied within or between conditions. Equations 3 and 4 state that p(As) and p(Ac) are determined by the values of the relevant disruptors and by ratios of reinforcer rates, and implicitly assume that p(As) and p(Ac) are independent of the discriminabilities of samples and comparisons, ds and dc. In the limit, this assumption is implausible: Why should attending be directed to indistinguishable samples or comparisons? A study by Alsop (1988) exemplifies the uncertainties. He varied sample disparity between conditions, including zero disparity (ds = 1), in a signal-detection paradigm. Nevin et al. (2005) obtained a satisfactory fit to the data of all Alsop's conditions with a single value of p(As), but the fit would have been equally good if we had allowed p(As) to vary directly with sample disparity, including p(As) = 0 for the zero-disparity condition. The problem is that the parameters of Equations 3 and 4 become indeterminate when the model is fitted to data from conditions where ds or dc = 1, signifying complete confusability (see Figure 5). Accordingly, we consider only studies where the subjects were pigeons and the samples and comparisons were highly discriminable key colors, and set ds = dc = 400 (as in Nevin et al., 2005) in all model applications reported here. The model was fitted to data averaged over pigeons by a nonlinear curve-fitting program (Microsoft Excel Solver™).

Reinforcer Probabilities in Multiple Schedules

Odum, Shahan, and Nevin (2005) examined the effects of reinforcement probability on forgetting functions in the steady state and evaluated their resistance to change during disruption in a paradigm designated VI DMTS, which permits direct comparison with the effects of reinforcement and disruption on VI response rates. Specifically, pigeons produced DMTS trials according to variable-interval (VI) 20-s schedules with reinforcer probabilities of .9 or .1 for correct responses in the components of a multiple schedule. Components were signaled by lighting the center key red or green, and alternated after four DMTS trials were completed. Components were separated by a 15-s intercomponent interval (ICI). The samples and comparisons were yellow and blue. Samples remained on until the first center-key peck after 3 s. During the retention interval, the center key was lighted red or green according to the component currently in effect. Comparisons terminated with a single peck and were followed by 2-s food or blackout. Performance was disrupted by response-independent food during the ICI, and by extinction, both of which have been employed in many studies of the resistance to change of VI response rates.

The results showed that the VI response rate was higher and more resistant to change in the richer component, as in many multiple-schedule studies. Likewise, the level of the forgetting function was higher and more resistant to change in the richer component, although ICI food had relatively little effect at the shortest retention interval and both functions were flattened during extinction. Figure 9 displays DMTS accuracies, expressed as average values of log d during baseline and disruption tests.

Fig 9.

Forgetting functions reported by Odum, Shahan, & Nevin (2005) in multiple VI DMTS with reinforcer probabilities of .9 or .1 in the components. The top panel presents average forgetting functions in baseline, the middle panel presents forgetting functions pooled over 10 sessions with food presented during the ICI, and the bottom panel presents forgetting functions pooled over 10 sessions of extinction. Predictions based on Equations 8 and 9 are shown in each panel together with best-fitting parameter values; see text for explanation.

Nevin, Milo, Odum, and Shahan (2003) had used the same general procedure but with zero retention intervals; their discrimination data were modeled by Nevin et al. (2005) using the present Equations 3 and 4 with t  =  0, augmented to account for the effects of the different disruptors employed in their experiment. Here, we use exactly the same approach to model the data of Odum et al. (2005), with t varying from 0.1 to 8 s. The model equations are:

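[Equation 8: expression for p(As), extending Equation 3 with the disruptor terms f, c, and d; equation image not recovered]

and

[Equation 9: corresponding expression for p(Ac), extending Equation 4 with the same disruptor terms; equation image not recovered]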

where f represents the added disruptive effects of ICI food, and c and d represent the disruptive effects of suspending the reinforcer contingency and generalization decrement from reinforcer omission, respectively (see Nevin, McLean, & Grace, 2001, for the treatment of extinction).

To apply Equations 8 and 9 to the data of Odum et al. (2005), we began by fitting the pooled baseline data (N  =  8) with f, c, and d equal to 0, and with x  =  z and q  =  v in order to minimize the number of free parameters. With x  =  z  =  0.012 and q  =  v  =  0.023, the quality of the fit is excellent, as shown in the top panel of Figure 9; the proportion of variance accounted for (VAC) is .98. Keeping those values of x, z, q, and v constant, we then allowed f, c, and d to vary so as to fit the data for ICI food and extinction. Thus, in effect, five free parameters were fitted to 24 data points in all. As shown in the center and bottom panels of Figure 9, the general agreement between data and predictions for ICI food and for extinction is satisfactory. With f  =  0.034, c  =  0.181, and d  =  0.0001 (signifying minimal generalization decrement), VAC is .91 for the full data set.
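The goodness-of-fit measure used here can be sketched as follows. This assumes the common definition of proportion of variance accounted for, 1 − SS(residual)/SS(total); the text does not spell out the formula, and the function name and data values are hypothetical.

```python
def variance_accounted_for(observed, predicted):
    """Proportion of variance in `observed` accounted for by `predicted`.

    Assumed definition: 1 - (residual sum of squares / total sum of squares),
    with the total taken about the mean of the observed values.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total

# Hypothetical log d values at four retention intervals.
observed = [2.0, 1.6, 1.1, 0.7]
predicted = [1.9, 1.6, 1.2, 0.7]
print(round(variance_accounted_for(observed, predicted), 3))
```

Under this definition, a model that predicts only the mean of the data scores 0 and a perfect fit scores 1.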

Evidence for Stimulus Control of Attending During the Retention Interval

Our conception of attending during DMTS trials (Figure 2) suggests that some form of attending to the samples and comparisons occurs during the retention interval even though the stimuli themselves are not present. If probabilities of attending depend on reinforcer rates signaled by multiple-schedule component stimuli according to Equations 3 and 4, as suggested by our analyses of the data of Odum et al. (2005), then reversing those stimuli during the retention interval should reverse the probabilities of attending. Odum, Shahan, and Nevin (2006)2 replicated the baseline procedure described above and then reversed the retention-interval key colors, with all other aspects of the procedure unchanged. Thus, during cue reversal, the retention-interval key color incorrectly signaled a low reinforcer probability in the rich component and vice versa.

Method

Subjects

Five pigeons that had served in related VI DMTS studies were maintained at 80% of their free-feeding weights.

Apparatus

The apparatus was the same three-key chamber used by Odum et al. (2005).

Procedure

The baseline procedure used by Odum et al. (2005), described above, was in effect for 44 to 51 sessions, after which all 5 pigeons were performing at levels similar to those shown in Figure 9, top panel. Then, retention-interval key colors were reversed between rich (reinforcer probability .9) and lean (reinforcer probability .1) components for 10 sessions, with all other aspects of the procedure unchanged. Thus, reinforcer probability was signaled correctly during the VI segments of each component and incorrectly during retention intervals. Cue reversal was followed by 10 sessions of baseline recovery (i.e., with retention-interval key colors again corresponding to those in the VI segments of each component).

Results

VI response rates and log d at each retention interval were calculated for both components from data pooled over the final 10 sessions of baseline, all 10 sessions with reversed cues, and 10 sessions of baseline recovery. In addition, rates of key pecking during retention intervals were calculated from data pooled over the four retention intervals within each condition.

For all 5 pigeons, VI response rates during cue reversal increased relative to baseline in the lean component and then decreased during return to baseline. VI response rates were not consistently affected in the rich component. The average data are shown in Figure 10, left panel. For all 5 pigeons, rates of key pecking during retention intervals decreased in the rich component and increased in the lean component during cue reversal, and then increased in rich (with one exception) and decreased in lean during return to baseline. Average data are shown in Figure 10, right panel. Retention-interval rates varied substantially among pigeons, perhaps because retention-interval pecking was not controlled by explicit experimental contingencies.

Fig 10.

The left panel presents average response rates during the VI segment of rich (reinforcer probability .9) and lean (reinforcer probability .1) components of multiple VI DMTS before, during, and after cue reversal during the retention interval, and the right panel presents key pecking rates during retention intervals in the same format; vertical bars indicate the standard error.

Average forgetting functions are presented in Figure 11, top left panel. To a first approximation, it appears that forgetting functions are roughly parallel, and that cue reversal increased the level of the lean-component function and decreased the level of the rich-component function (except at the 0.1-s retention interval). Because the average functions were roughly parallel, their overall levels can be characterized by the mean value of log d across retention intervals, and the effects of cue reversal can be expressed as the difference between mean log d in the rich and lean components. These difference scores are presented for individual subjects in the top right panel of Figure 11. The rich–lean difference decreased during cue reversal for every pigeon, and increased during return to baseline for 4 of the 5 pigeons, although the differences were small for P1821 and P3060.
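The difference score described above reduces to a simple computation, sketched here. The function names and the log d values are hypothetical and serve only to illustrate the arithmetic.

```python
def mean_log_d(log_d_values):
    """Mean log d across retention intervals (overall level of the function)."""
    return sum(log_d_values) / len(log_d_values)

def rich_lean_difference(rich_log_d, lean_log_d):
    """Difference between mean log d in the rich and lean components."""
    return mean_log_d(rich_log_d) - mean_log_d(lean_log_d)

# Hypothetical log d values at four retention intervals for one pigeon.
baseline = rich_lean_difference([2.0, 1.6, 1.2, 0.8], [1.5, 1.1, 0.7, 0.3])
reversal = rich_lean_difference([1.9, 1.4, 0.9, 0.5], [1.6, 1.3, 0.9, 0.5])
print(round(baseline, 3), round(reversal, 3))
```

A single score per pigeon and condition is meaningful only because the functions are roughly parallel; if slopes differed, the mean would mix level and slope effects.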

Fig 11.

The upper left panel displays forgetting functions before and during retention-interval cue reversal averaged over subjects. The upper right panel shows the difference between mean log d in rich and lean components before, during, and after cue reversal. The bottom left panel presents forgetting functions predicted by the model with parameter values in the legend. The bottom right panel displays the forgetting functions during cue reversal as proportions of the prereversal baseline together with model predictions.

Although the effect of cue reversal might be expected to diminish with continued exposure because the reversed cues did not in fact signal reversed reinforcer probabilities, the average difference between log d values in rich and lean components was essentially constant over the 10 reversed-cue sessions (data not shown).

To model the cue-reversal data, we assumed that rs in each component was based on the weighted mean of reinforcer rates explicitly signaled by the consistent and reversed cues. For example, in the rich component, a reinforcer probability of .9 was signaled during the 20-s VI portion of each cycle preceding sample presentation and a reinforcer probability of .1 was signaled (incorrectly) during the retention interval. Thus, the weighted average reinforcer probability in the rich component at each retention interval t was calculated as (.9*20 + .1*t)/(20 + t). The signaled reinforcer probability for rc in the rich component during cue reversal was simply .1 rather than .9. Reinforcer rates rs and rc were calculated from these probabilities and entered into Equations 3 and 4 to predict DMTS accuracy during cue reversal. If x = z = 0.012 and q = v = 0.023, the values that were used for fitting the baseline data of Odum et al. (2005), the predicted mean difference between the baseline forgetting functions is much smaller than obtained and VAC is .68. When all four parameters were allowed to vary, the best fit to the data was given by x = 0.012, z = 0, q = 0.068, and v = 0.039, with VAC = .95. The difference in level between predicted functions shown in the lower left panel of Figure 11 is smaller than for the average data in the upper left panel. However, when data and predictions for cue reversal are reexpressed as proportions of the preceding baseline, the trends are well described by the model as shown in the lower right panel. Specifically, the obtained values of relative log d in the rich and lean components were decreasing and increasing functions of the retention interval, respectively, and the predicted functions agree closely with those obtained.
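The time-weighted averaging of signaled reinforcer probabilities can be sketched in Python. This is a minimal illustration rather than the authors' fitting procedure: the function name is introduced here, and the lean-component call assumes the mirror-image case (.1 signaled during the VI portion, .9 during the retention interval), which the text implies but does not write out.

```python
def weighted_reinforcer_probability(p_vi, p_retention, t, vi_duration=20.0):
    """Time-weighted mean of the reinforcer probabilities signaled during the
    VI portion (vi_duration seconds) and the retention interval (t seconds)."""
    return (p_vi * vi_duration + p_retention * t) / (vi_duration + t)

# Rich component during cue reversal: .9 signaled in the 20-s VI portion,
# .1 signaled (incorrectly) during the retention interval.
for t in (0.1, 8.0):  # shortest and longest retention intervals, in seconds
    rich = weighted_reinforcer_probability(0.9, 0.1, t)
    # Assumption: the lean component mirrors the rich component.
    lean = weighted_reinforcer_probability(0.1, 0.9, t)
    print(f"t = {t:4.1f} s  rich = {rich:.3f}  lean = {lean:.3f}")
```

Because the retention interval is short relative to the 20-s VI portion, the weighted probabilities move only part of the way toward the reversed values: at t = 8 s the rich-component value is about .67 and the lean-component value about .33.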

Discussion

The effects of cue reversal on forgetting functions suggest that during the retention interval, the pigeons were engaged in some sort of behavior that came under stimulus control by cues signaling high or low reinforcer probabilities. Whatever they were doing (what we have called “attending” here) affected discrimination accuracy in a way consistent with reinforcer probabilities signaled during the pigeons' extensive histories and during the VI segments of each component during reversal testing. The corresponding changes in retention-interval key pecking suggest that inferred attending may accompany, or be indexed by, overt behavior. The model proposed here describes the data quite well, suggesting that attending to the samples and comparisons during retention intervals is controlled by cues signaling different reinforcer rates in accordance with Equations 3 and 4.

Effects of Different VI Schedules in Multiple VI DMTS

The multiple VI DMTS paradigm was introduced in an earlier study by Schaal, Odum, and Shahan (2000). They arranged two components with different VI schedules but the same reinforcer probabilities, whereas Odum et al. (2005) used the same VI schedules with different reinforcer probabilities. In addition, Schaal et al. varied the length of the retention interval between conditions, whereas Odum et al. varied the length of the retention interval within components. Schaal et al. found that the forgetting function, with accuracy measured as proportion correct, was lower and somewhat steeper in the richer component (VI 20 s) than in the leaner component (VI 120 s). In view of the procedural differences between studies, together with the opposite ordering of forgetting functions, it is of more than passing interest that our model can account for the data of Schaal et al. as well as those of Odum et al.

To apply our model to the data of Schaal et al. (2000), we transformed average proportion correct (p) to logit p = log(p/(1 - p)), which is equivalent to log d when response bias is absent. As for the fits described above, we set ds = dc = 400 and b = 0.5, and fitted Equations 3 and 4, with four parameters (x, z, q, and v), to 14 data points. With x = z and q = v, as in our fits to the data of Odum et al. (2005), the predicted forgetting function for the VI 20-s component of Schaal et al. lies above that for the VI 120-s component, contrary to the data, and VAC is only .72. With all four parameters free to vary, however, the data are satisfactorily described as shown in Figure 12. With x = 0, z = 0.440, q = 0, and v = 0.347, VAC is .96. The values of x and q suggest maximal attending to the samples, whereas the values of z and v suggest a relatively low level of attending to the comparisons. The latter suggestion is plausible because Schaal et al. used a correction procedure that ensured receipt of reinforcement on every trial, regardless of which comparison was pecked at the first opportunity.
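The proportion-correct transformation can be written as a short function. The sketch assumes base-10 logarithms, the usual convention for log d; the function name and example proportions are hypothetical.

```python
import math

def logit10(p):
    """Base-10 log odds of a proportion correct p, with 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log10(p / (1.0 - p))

# Hypothetical proportions correct: chance performance maps to 0,
# and the scale stretches as p approaches 1.
for p in (0.5, 0.75, 0.9):
    print(p, round(logit10(p), 3))
```

The transform is undefined at p = 0 or p = 1, so perfect or zero accuracy in a cell would need a correction before conversion; how such cases were handled is not stated in the text.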

Fig 12.

Forgetting functions based on average data reported by Schaal, Odum, & Shahan (2000) in multiple VI DMTS with VI 20-s and VI 120-s schedules in the components, compared with predictions based on Equations 3 and 4 with parameter values in the legend. The data have been transformed from proportion correct to logit p, which is equivalent to log d. See text for explanation.

Our model for response rates (Equation 2) requires that response rate in the VI 20-s component be greater than that in the VI 120-s component at all retention intervals. However, Schaal et al. (2000) found that response rate in the VI 20-s component usually decreased to levels below those in the VI 120-s component as the retention interval increased across conditions. We address this problem in the Discussion; for now, we conclude that our model gives a good account of the effects of the length of the VI on forgetting functions in multiple VI DMTS.

Reinforcer Probabilities or Magnitudes in Signaled Trials

A different paradigm for examining reinforcer effects within sessions employs within-trial signals. For example, Nevin and Grosch (1990) arranged DMTS trials with long- or short-duration reinforcers in quasirandom order, with auditory stimuli that accompanied each trial signaling whether the reinforcer would be long or short on that trial. They found that the forgetting function in the long-reinforcer trials was higher than and parallel to the function in short-reinforcer trials. This “signaled magnitude effect” has been replicated by Jones, White, and Alsop (1995) and McCarthy and Voss (1995). Brown and White (2005, Experiment 2) replicated the signaled-magnitude effect under conditions where the signal was presented after sample offset, presumably precluding differential attending to the sample on the basis of signaled reinforcement. Here we will model the data of Experiment 1 by Brown and White because it arranged different reinforcer probabilities, rather than magnitudes, and thus is more nearly comparable to the study by Odum et al. (2005).

Brown and White's (2005) Experiment 1 arranged a 15-s ITI before each sample presentation. Then, after 10 responses, the color sample was extinguished and replaced with a geometric form signaling the reinforcer probability for that trial. The form remained on throughout the retention interval and until one of the comparison colors was chosen. As shown in Figure 13, the forgetting function on trials with reinforcer probability 1.0 was higher than and roughly parallel to the function on trials with reinforcer probability 0.2.

Fig 13.

Forgetting functions based on average data reported by Brown and White (2005) in DMTS trials with signaled reinforcer probabilities of 1.0 or .2, compared with predictions based on Equations 3 and 4 with parameter values in the legend. See text for explanation.

To apply our model, it is necessary to calculate rs separately for time preceding and following the signal indicating reinforcer probability on a given trial. Before and during sample presentation, rs must be the same for both trial types. During the retention interval, however, rs must differ between trial types, and because the subject is assumed to attend to the sample-as-coded after sample offset, this difference can provide a basis for differential accuracy. In modeling, we used a weighted average of rs preceding and following sample offset to summarize its value for each type of trial.

To fit Equations 3 and 4 to the data of Brown and White (2005, Experiment 1), we set ds  =  dc  =  400 and b  =  0.5 as for the multiple VI DMTS studies described above. Thus, the four parameters of Equations 3 and 4, x, z, q, and v, were used to fit 10 data points. With x  =  z and q  =  v, as in our fits to the data of Odum et al. (2005), VAC is .91 but the predicted forgetting functions are too close together. With all four parameters free to vary, however, VAC is .96 and the predicted functions are appropriately separated, as shown in Figure 13. The best-fitting values of x and q are 0, indicating maximal attending to the samples, perhaps because of the FR 10 contingency on pecks at the sample. The best-fitting values of z  =  0.030 and v  =  0.087 indicate moderate disruption of attending to the comparisons.

Brown and White (2005) did not examine resistance to change. Nevin and Grosch (1990) examined resistance to change in their signaled-magnitude study and found little consistent difference between large- and small-magnitude trials, contrary to the multiple-schedule results of Odum et al. (2005). Interestingly, our model predicts different results for signaled reinforcement trials and multiple schedule components. In signaled reinforcement trials, increasing x in Equation 3 has similar decremental effects on forgetting functions relative to baseline in both high- and low-probability trials; increasing z in Equation 4, in contrast, produces a smaller decrease relative to baseline in high- than in low-probability trials. The reason is that reinforcer probability is not signaled until sample onset (or offset, as in Brown & White) so that rs differs between high-probability and low-probability trials only after signal onset. As a consequence, rs/ra differs relatively little between trial types, whereas rc/rs is substantially greater on high-probability trials. By contrast, in multiple schedules, increasing x in Equation 3 produces a much smaller decrease, relative to baseline, in the high- than in the low-probability component, whereas increasing z in Equation 4 produces similar decreases relative to baseline in both components – just opposite from the predicted effects of increasing z in signaled reinforcement trials. The reason is that in multiple schedules, reinforcer probability is signaled throughout each component so that rs/ra differs between components in direct proportion to the component reinforcer probabilities, whereas rc/rs is the same for low- and high-probability components because both rc and rs are directly proportional to reinforcer probability. The divergent predictions for these two paradigms could be tested by comparing the effects of sample-specific and comparison-specific disruptors.3

Sensitivity to Reinforcer Ratios

We showed above that the sensitivity of log b to the ratio of reinforcers R11/R22 (a in Equation 7) could be either an increasing or a decreasing function of the retention interval, depending on the values of model parameters. Here, we fit the model to the data of Jones and White (1992), who examined the relation between a and the retention interval within sessions. The probabilities of reinforcement for B1 and B2 were varied between conditions so as to give different values of R11/R22 while the total reinforcer rate, R11+R22, was constant. The average values of log d are presented in the left panel of Figure 14 as functions of log(R11/R22) with retention-interval length as a parameter. Except at the shortest retention interval (0.01 s), the data suggest concave-up functions, especially at the longest retention interval (12 s). The predictions shown with the data are based on fits of Equations 3 and 4 with x  =  0.005, z  =  0.032, q  =  0, and v  =  0.303; as in other fits with colored lights as samples and comparisons, ds  =  dc  =  400, and b  =  0.5 as usual. One point at 0.01 s, which included seriously deviant data of Pigeon X3, was omitted from the fit; thus, VAC  =  .87 for 19 data points. If x  =  z and q  =  v, VAC decreases to .85 and the predicted functions in the left panel are flattened.

Fig 14.

The left panel shows how accuracy depended on the log ratio of reinforcers for correct responses at four retention intervals in the average data of Jones and White (1992), and the right panel shows how sensitivity to reinforcement (a) depended on the retention interval. In the left panel, data are coded for the retention interval and compared with predictions (designated p) based on Equations 3 and 4, with parameter values in the legend; the grey-filled circle includes seriously discrepant data from one pigeon at the 0.01-s retention interval, and was not used in model fits. See text for explanation.

The right panel of Figure 14 shows that with these parameters, the model predicts an increasing relation between sensitivity to reinforcement a and the retention interval that agrees with the data in direction but not in level. If x = z and q = v, the predicted function in the right panel goes down rather than up, as suggested by Figure 8 above. Note that if the generalized matching law, Equation 7, were an accurate descriptor of these data, the relations in the left panel would be horizontal rather than concave-up in form. Therefore, the use of Equation 7 to estimate a may be inappropriate.
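For orientation, estimating a under the generalized matching law amounts to a straight-line fit of log b against log(R11/R22). The sketch below assumes the usual linear form, log b = a · log(R11/R22) + intercept, with base-10 logarithms; the function name and data values are hypothetical, and the caveat above about the linearity assumption applies to any such estimate.

```python
import math

def matching_sensitivity(reinforcer_ratios, log_b_values):
    """Least-squares slope (a) and intercept of log b on log10(R11/R22)."""
    xs = [math.log10(r) for r in reinforcer_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(log_b_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, log_b_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    intercept = mean_y - a * mean_x
    return a, intercept

# Hypothetical reinforcer ratios (R11/R22) and log b values at one
# retention interval.
ratios = [0.1, 1.0, 10.0]
log_b = [-0.5, 0.0, 0.5]
a, intercept = matching_sensitivity(ratios, log_b)
print(round(a, 3), round(intercept, 3))
```

Repeating the fit separately at each retention interval gives a as a function of the retention interval, which is the relation plotted in the right panel of Figure 14.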

McCarthy and Voss (1995) arranged a DMTS procedure where reinforcer magnitude (4.5 s vs 1.5 s) was signaled within trials, and the ratio of reinforcers for correct responses, R11/R22, was varied between conditions. They obtained higher forgetting functions on large-magnitude than on small-magnitude trials, but overall, the levels were unusually low for red–green color matching by pigeons. They also obtained systematic decreases in sensitivity to differential reinforcement as functions of the retention interval, with higher sensitivity in large- than in small-magnitude trials. We took summary values of log d and a from their Figures 2 and 5 because the full data set is no longer available. To model these data, we multiplied the numbers of reinforcers used to calculate ra, rs, and rc by reinforcer durations, thus giving total time of access to food. Our model does not fit their summary values at all well unless we allow ds, dc, and b to take values substantially different from those we have used throughout this paper. The problem is illustrated in Figure 15: Although the orders and trends of predicted functions correspond to the reported summary values, predicted values of log d (upper panel) are much too high and predicted values of a (lower panel) are much too low. The problem may arise in part from uncertainties about how best to model reinforcer magnitude as opposed to reinforcer probability or rate in the Davison–Nevin (1999) model (see also Alsop & Porritt, 2006, and discussion below).

Fig 15.

The upper panel shows forgetting functions for the average data reported by McCarthy and Voss (1995) for DMTS trials with signaled reinforcer magnitudes, and the lower panel shows how sensitivity to reinforcement (a) depended on the retention interval. In each panel, data are compared with predictions based on Equations 3 and 4; parameter values for predicted functions are given in the legend. See text for explanation.

Delays to Reinforcement

Another parameter of reinforcement that affects the accuracy of DMTS performance is delay between choice of B1 or B2 and presentation of the reinforcer. In a parametric study of reinforcer delays in DMTS, Sargisson and White (2003) varied retention intervals within conditions, and varied reinforcer delays between conditions, both over the range from 0 to 8 s, with a 12-s ITI. They found that forgetting functions became lower and somewhat steeper as the reinforcer delay increased. The upper panel of Figure 16 presents their average forgetting functions (with successive functions displaced upward by 0.5 log units to avoid overlap). Sargisson and White described their data with exponential decay functions and found that the intercept at 0-s retention interval (log do) decreased and the slope became steeper (i.e., the decay rate, s, increased) as functions of reinforcer delay, as shown in the lower left and right panels of Figure 16, respectively.

Fig 16.

The upper panel presents average forgetting functions obtained in a study by Sargisson and White (2003) with reinforcer delay (indicated at the right of each function) varied across conditions. To avoid overlap, successive functions are displaced upward by 0.5 log units. The lower panels show how the intercept (log d at 0 retention interval) and slope (decay rate, s) of exponential decay functions fitted to the forgetting functions in the upper panel change as a function of reinforcer delay. In each panel, data are compared with predictions based on Equations 3 and 4 with the value of delayed reinforcers decreased according to a hyperbolic function, with model parameters in the legend of the upper panel. See text for explanation.

To bring our model to bear on the effects of reinforcer delays, we consider two alternatives. The first is to include the duration of the delay in the time base for calculating ra and rc. However, this approach suggests the need for attending to the chosen-comparison-as-coded during the delay to reinforcement, with yet another parameter for disruption—and the model then predicts that the slopes of forgetting functions decrease with increasing delays, contrary to the data. A second approach is to construe the reinforcer delay as degrading the value of the consequences of correct comparison-key responses according to a hyperbolic decay function of the sort proposed by Mazur (1987; for discussion of the pervasiveness of hyperbolic discounting functions, see Rachlin, 2006). If rs and rc are multiplied by h/(h+d), where h is the half-life of the hyperbolic decay function, our model predicts that the intercepts of forgetting functions decrease and their slopes become steeper with delay, as reported by Sargisson and White (2003). It also predicts that accuracy decreases less rapidly with increasing reinforcer delays at zero retention interval than with increasing retention intervals at zero delay, as found earlier by McCarthy and Davison (1991). Predicted functions with ds  =  dc  =  400 and the exponent b  =  0.5 as usual, and with x  =  0.010, z  =  0, q  =  0.015, v  =  0.008, and h  =  0.6, are shown together with the data in Figure 16. Predictions agree closely with the data, and VAC  =  .98 with 30 data points. If x  =  z and q  =  v, our nonlinear curve-fitting program was unable to arrive at a solution. Although our approach is admittedly ad hoc, it may be useful in permitting the model to deal with other aspects of reinforcer value such as quality that can be expressed as weights on the reinforcer terms rs and rc.
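The hyperbolic weighting described above can be sketched numerically. This is only an illustration of the multiplier h/(h+d) with the reported h  =  0.6; the function name and the grid of delays are hypothetical, and the sketch does not reproduce the fitted functions in Figure 16.

```python
def hyperbolic_weight(delay_s: float, half_life_s: float) -> float:
    """Return h / (h + d), the multiplier applied to rs and rc in the text."""
    if delay_s < 0 or half_life_s <= 0:
        raise ValueError("delay must be >= 0 and half-life must be > 0")
    return half_life_s / (half_life_s + delay_s)


H = 0.6  # half-life (s) reported in the text for the Figure 16 fit

# Hypothetical grid of reinforcer delays (s) within the 0 to 8 s range studied.
ILLUSTRATIVE_DELAYS = [0.0, 0.6, 2.0, 4.0, 8.0]

weights = {d: hyperbolic_weight(d, H) for d in ILLUSTRATIVE_DELAYS}

for d, w in weights.items():
    print(f"delay {d:4.1f} s -> weight {w:.3f}")
```

The weight is 1 with no delay and falls to one half when the delay equals h, which is the sense in which h acts as a half-life. Most of the loss in value therefore occurs within the first second or two of delay.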

Temporal Parameters

When experimental conditions differ in their temporal parameters such as the ITI, reinforcer rates are indirectly affected. It has long been recognized that DMTS accuracy depends directly on the length of the ITI (e.g., White, 1985). In the present model, increasing the ITI has the effect of reducing both ra and rs. If the ITI and retention interval are fixed within a condition, the time base for ra differs from that for rs only by the latency of response to the comparisons; therefore, in terms of the model, the increase in accuracy that accompanies a between-condition increase in the ITI results from the increase in rc relative to ra, with a consequent increase in p(Ac). If the retention interval is increased in proportion to the ITI, p(Ac) decreases to an extent that roughly offsets the ITI effect.

Roberts and Kraemer (1982) varied the ITI from 4 to 32 s and varied the retention interval t from 0.5 to 4 s between sessions. They found that accuracy, expressed as proportion correct (p), was similar at each value of the ratio ITI/t and increased as a linear function of ITI/t plotted on a logarithmic scale. The data of their Experiment 1, converted to logit p  =  log (p/(1-p)) as for the analysis of Schaal et al. (2000) and plotted as forgetting functions, are shown as unfilled symbols in Figure 17, with successive functions displaced upward by 0.5 log units to avoid overlap as in Figure 16. To apply Equations 3 and 4 to these data, we set ds  =  dc  =  400 and b  =  0.5 as for the studies described above. Model predictions of proportion correct are shown as lines corresponding to each ITI value. By inspection, the model gives a good account of the 16 data points with x  =  0, z  =  0.925, q  =  0.010, and v  =  0.077 (VAC  =  .97). If x  =  z and q  =  v, VAC decreases to .78.
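The conversion from proportion correct to logit p can be sketched as follows. The sketch assumes base-10 logarithms, in keeping with the "log units" used for the forgetting functions; the function name and the example proportions are hypothetical and are not values from Roberts and Kraemer (1982).

```python
import math


def logit10(p: float) -> float:
    """Base-10 log odds, log(p / (1 - p)), for a proportion correct p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log10(p / (1.0 - p))


# Hypothetical proportions correct, chosen only to show the transformation.
example_p = [0.55, 0.75, 0.90]
example_logits = [logit10(p) for p in example_p]

for p, value in zip(example_p, example_logits):
    print(f"p = {p:.2f} -> logit p = {value:.3f}")
```

Chance performance (p  =  .5) maps to 0, and the transformation is undefined at p  =  0 or 1, so perfect or zero accuracy would need a correction before plotting.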

Fig 17.

Forgetting functions reported by Roberts and Kraemer (1982, Experiment 1), transformed from proportion correct to logit p (unfilled symbols), plotted separately for the ITI lengths indicated at the right of each function. To avoid overlap, successive functions are displaced upward by 0.5 log units. The accompanying lines are predictions of Equations 3 and 4 with model parameters in the legend. See text for explanation.

Another temporal parameter that affects DMTS accuracy is sample duration. A number of studies have reported that accuracy is an increasing function of sample duration (e.g., Grant, 1976) or the fixed ratio required to terminate the sample, which necessarily increases sample duration (e.g., Roberts, 1972). These studies, like those of Roberts and Kraemer (1982), arranged reinforcers for every correct response. Under such conditions, sample duration and ITI length affect ra and rs in Equations 3 and 4 in exactly the same way, and predictions should be identical. We are aware of only one study that varied both sample duration and ITI, by Kojima (1985). That study employed auditory stimuli in a successive same/different go/no-go procedure with monkeys as subjects, and obtained similar ordinal effects on discrimination ratios when sample duration and ITI length were varied. However, retention intervals differed between sets of conditions, and quantitative predictions of go/no-go performance do not follow directly from our model. In any event, the prediction of identical effects on accuracy may fail because sample discriminability has been found to depend directly on sample duration in conditional wavelength discriminations (Blough, 1996).

To illustrate the problem, we fit our model to the data of Grant (1976), who varied sample duration from 1 to 14 s between sessions, with retention intervals varied from 0 to 60 s within sessions and a 120-s ITI. Figure 18 presents Grant's data, reexpressed as logit p, together with model predictions; successive forgetting functions have been displaced upward by 0.5 log units to avoid overlap, as in Figures 16 and 17. With ds allowed to take different values for each sample duration as indicated in the figure, and with x  =  0, z  =  0.034, q  =  0.030, and v  =  0, VAC  =  .97. If ds is set at 400 for all sample durations, VAC decreases to .71 and the four predicted forgetting functions are essentially identical. Thus, comparing the effects of varying sample durations and ITI lengths will entail the added complication of free parameters for sample discriminability.
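For readers who want to compute a comparable fit index, the sketch below assumes that VAC is the usual proportion of variance accounted for, 1 minus the ratio of residual to total sum of squares. That definition, the function name, and the toy numbers are assumptions made for illustration; they are not taken from the fitting procedure used for Figure 18.

```python
def variance_accounted_for(observed, predicted):
    """Return 1 - SS_residual / SS_total for paired observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_resid / ss_total


# Toy values (hypothetical logit p at four retention intervals).
toy_observed = [1.2, 0.9, 0.6, 0.3]
toy_predicted = [1.1, 0.9, 0.7, 0.3]

print(round(variance_accounted_for(toy_observed, toy_predicted), 3))
```

Under this definition a prediction that equals the mean of the data at every point scores 0, so near-identical predicted functions for all sample durations leave the between-duration variance unexplained.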

Fig 18.

Forgetting functions reported by Grant (1976), transformed from proportion correct to logit p (unfilled symbols), plotted separately for the sample durations indicated at the right of each function. To avoid overlap, successive functions are displaced upward by 0.5 log units. The accompanying lines are predictions of Equations 3 and 4 with model parameters in the legend; the values of ds required to fit each function are given at the right. See text for explanation.

Summary

All in all, our model gives a good account of the effects of reinforcer probability on forgetting functions in multiple VI DMTS, including their resistance to change; the effects of signaling reversed reinforcer probabilities during retention intervals; and the effects of length of the VI when reinforcer probabilities are the same. It gives a similar account of the effects of reinforcer probability on steady-state forgetting functions in signaled-reinforcement paradigms, but makes testably different predictions for resistance to change. It also can account for the varied effects of differential reinforcement for the two correct responses in relation to the retention interval, for the effects of reinforcer delays on levels and slopes of forgetting functions, and for the effects of intertrial intervals and sample durations on forgetting functions. It is readily extended to predict the effects of other determiners of DMTS performance that can be specified in terms of the model's structure and parameters, some of which we discuss below.

General Discussion

In qualitative terms, we view the subject as engaging in various unmeasured, possibly covert activities related to the sample and comparison stimuli collectively called “attending,” where attending is construed as operant behavior that is functionally similar to an overt free operant in its dependence on reinforcement. Our model formalizes these notions in order to make quantitative predictions of DMTS performance. The model has several separable components, each of which may be modified without altering the overall structure of the model.

  1. In a conditional discrimination, attending to the samples and to the comparisons are independent processes, separately determined by reinforcers and disruptors.

  2. Attending to the samples and comparisons may be reduced by background disruptors and by disruptors that are specific to the retention interval.

  3. Attending to the samples and comparisons both depend on reinforcer rates relative to their context of reinforcement according to an expression derived from behavioral momentum theory.

  4. The relevant reinforcer rates are based on times during which attending, including observing, discriminating, coding, and attending to stimuli-as-coded may plausibly occur.

  5. Probabilities of attending are translated into measured discrimination by way of the Davison–Nevin (1999) model of conditional discrimination performance.

Thus, the level of the predicted forgetting function depends on the discriminability of the samples and the comparisons (ds and dc) and the probabilities of attending to those stimuli, which in turn depend on the levels of general background disruption (x and z), and the rate of reinforcement relative to the context in which attending occurs, calculated according to Equations 3 and 4. The slope of the forgetting function also depends on the extent of disruption during the retention interval, where attending to the sample-as-coded and orienting toward the comparison locations are construed as concurrent independent activities with separate disruptors (q and v) in Equations 3 and 4.

Because the general form of the forgetting function was predicted directly by Davison and Nevin (1999) without invoking unmeasured or covert activities such as attending, it is important to review the reasons for bringing attending into the present model.

First, because the Davison–Nevin (1999) model was based on ratios of reinforcers, it predicted that overall reinforcer rate would have no effect on the level of the forgetting function. This prediction is contrary to the multiple-schedule data of Odum et al. (2005) and related studies reported here, the signaled-reinforcement data of Brown and White (2005), and the between-condition data of White and Wixted (1999, Experiment 1). By assuming that attending is related to reinforcer rates relative to their context in the same way as free-operant response rates, the present model naturally incorporates the effects of overall reinforcer rate on resistance to change as well as the levels of steady-state forgetting functions.

Second, the Davison–Nevin (1999) model predicted a concave-down relation between log d and the ratio of reinforcers for correct responses, R11/R22, contrary to the data of Jones and White (1992; see Figure 14) and to those of Harnett et al. (1984, as reanalyzed by McCarthy & Nevin, 1991). As Nevin et al. (2005) demonstrated, the form of the relation between log d and R11/R22 changes from concave down to concave up when probabilities of attending decrease below 1.0. Thus, the present model provides a mechanism that accounts for the form of this relation.

Third, in the Davison–Nevin (1999) model, the parameters dsb and dbr were identified with relations between structural aspects of the experiment such as the physical stimuli, the contingencies relating them, and the sensory capacities of the subject, so that the forgetting function was completely determined by their values. Therefore, their model could not account for observed variations in the level or slope of the forgetting function during disruption or interference if dsb and dbr remained constant. The identical difficulty arises in the discriminability component of the present model, which uses the structure of the Davison–Nevin model but with ds representing sample discriminability and dc representing comparison discriminability to translate attending into measured performance. However, the present model explicitly predicts changes in the slope and level of the forgetting function during disruption of p(As) or p(Ac) as given by Equations 3 and 4 with ds and dc constant.

Fourth, the Davison–Nevin (1999) model predicted that sensitivity to reinforcer ratios was an increasing function of the length of the retention interval, and therefore negatively correlated with accuracy. Jones and White (1992) and White and Wixted (1999) have reported data confirming this prediction, but Harnett et al. (1984), McCarthy and Davison (1991), and McCarthy and Voss (1995) have reported decreasing functions and therefore positive correlations. Because the present model explicitly allows p(As) and p(Ac) to vary independently, it can account at least ordinally for the full range of results, especially in conjunction with variations in ds (see Figure 8 and related text).

In summary, bringing variations in attending into the framework of the Davison–Nevin (1999) model allows it to account for a far wider range of experimental data.

Interpreting and Measuring Attending Within DMTS Trials

Our model of DMTS performance makes a series of assumptions about unmeasured, possibly covert, activities and when they occur in DMTS trials (Figure 2). It would be easy to challenge these assumptions or to dismiss unmeasured activities as superfluous to a behavioral analysis. From a formal theoretical perspective, it is possible to regard p(As) and p(Ac), the probabilities of attending to the samples and comparisons, simply as names for intervening variables having no properties other than to select among the states outlined in Figure 5 and thereby to generate predictions of the sort shown in Figures 6 to 8, or to organize and summarize empirical data. The advantage of identifying p(As) and p(Ac) with unmeasured or covert activities is that the effects of experimentally arranged disruptors and conditions of reinforcement may be predicted—and tested—by assuming these activities to be functionally the same as overt responding.

The activities assumed to comprise “attending to the sample” may be measured by identifying them with experimental events. For example, attending before sample onset may be identified with overt responding that produces the sample, as in Nevin et al. (2003) and Odum et al. (2005). Likewise, attending during the sample may be identified with sample-specific response contingencies as in Urcuioli (1985). Attending to the sample-as-coded may be identified with sample-specific behavior during the retention interval as reported by Blough (1959). When overt topographical variants are not evident, sample-specific coding may be inferred from the results of transfer tests (e.g., Cumming, Berryman, & Cohen, 1965). Other researchers (e.g., Honig & Wasserman, 1981; Roitblat, 1980) have varied the properties of the samples and comparisons to infer whether subjects were retrospectively coding the sample or prospectively coding the correct comparison. For our purposes, it suffices to assume that the subject is engaged in some activity with respect to the sample, either overt or covert, during the retention interval, and that activity may or may not be attended to when the comparisons are presented, depending on the reinforcer rate and the degree of disruption during the retention interval. Likewise, orienting toward or observing the comparisons may be identified with distinctive response topographies that are required to make contact with those stimuli (e.g., Wright & Sands, 1981). Attending to the comparisons might be identified with comparison-specific responses, but we are not aware of any studies that have explored such responses experimentally. The effects of reversing retention-interval cues in multiple schedules, described above, strongly suggest that attending can be controlled by stimuli correlated with different reinforcer rates.

Figure 2 also suggests that times during which the disruptors and reinforcer rates operate in Equations 3 and 4 overlap with respect to attending to samples and comparisons. Therefore, when generalized disruptors such as ICI food or extinction are brought to bear on DMTS performance, we cannot know a priori whether they affect p(As) via x, p(Ac) via z, or both. Likewise, attending to samples-as-coded and looking for the comparisons occur concurrently during the retention interval, so we cannot know a priori whether retention-interval disruptors such as houselight illumination affect p(As) via q, p(Ac) via v, or both. Because p(As) is affected only by x during sample presentations, and p(Ac) is affected only by z during comparison presentations, it may be possible to arrange disruptors that are specific to attending to the samples or comparisons only while they are present. Some predictions of their effects will be advanced below.

Testable Predictions

Our model simulates representative findings in the literature and gives respectable quantitative accounts of some archival data. Prediction of novel findings is, of course, a stronger test. Here are some predictions that follow from the model.

1) Resistance to Change in Multiple-schedule and Signaled-reinforcement Paradigms

As noted above, the predicted effects of different disruptors on DMTS performance depend on the paradigm used. Specifically, in multiple schedules, the model predicts that disrupting p(As) produces a greater decrement in accuracy in the leaner of two components, whereas disrupting p(Ac) has a slightly greater disruptive effect in the richer component. Conversely, in signaled-reinforcement trials, disrupting p(As) produces similar decrements in both high- and low-reinforcement trials, whereas disrupting p(Ac) has a larger decremental effect in low-reinforcement trials. Testing these predictions requires the development of effective sample-specific and comparison-specific disruptors. Data confirming the predicted patterns of resistance to change described above would provide strong support for the model, whereas systematic deviations from predicted patterns would force revision of some of the model's components.

2) Behavioral Contrast in Forgetting Functions

In multiple schedules, free-operant response rate in a constant component is inversely related to reinforcer rate in the alternated component. This result is explained by Equation 2 for response rate. For example, if the reinforcer rate rs is constant in one component, decreasing the reinforcer rate in an alternated component decreases ra, the overall session average reinforcer rate, so that rs/ra increases and constant-component response rate must increase (positive contrast). Conversely, if the reinforcer rate in the alternated component is increased, constant-component response rate must decrease (negative contrast). Because Equations 3 and 4 are simply reexpressions of Equation 2 for probabilities of attending, the same inverse relation between attending in a constant component and reinforcement in an alternated component must hold. And because DMTS accuracy depends directly on probabilities of attending if all free parameter values are constant, the level of the forgetting function in a constant component must be inversely related to the reinforcer rate in an alternated component.
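The direction of the predicted contrast effect follows from simple arithmetic on rs/ra, sketched below. The sketch assumes two components of equal duration, so that the session average ra is the mean of the two component rates; that simplification, the function names, and the reinforcer rates are hypothetical and ignore the time bases the model actually uses.

```python
def session_average_rate(constant_rate: float, alternated_rate: float) -> float:
    """Mean reinforcer rate over two equal-duration components (an assumption)."""
    return (constant_rate + alternated_rate) / 2.0


def relative_rate(constant_rate: float, alternated_rate: float) -> float:
    """Return rs / ra for the constant component."""
    return constant_rate / session_average_rate(constant_rate, alternated_rate)


RS = 60.0  # hypothetical constant-component rate, reinforcers per hour

lean = relative_rate(RS, 10.0)    # alternated component made leaner
equal = relative_rate(RS, 60.0)   # both components the same
rich = relative_rate(RS, 240.0)   # alternated component made richer

print(f"lean: {lean:.3f}  equal: {equal:.3f}  rich: {rich:.3f}")
```

With rs fixed, rs/ra rises above 1 when the alternated component is leaner and falls below 1 when it is richer, which is the ordinal pattern the contrast prediction relies on.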

This prediction was tested by Nevin, Shahan, and Odum (in press) in the multiple VI DMTS paradigm described above. In one experiment, constant-component reinforcer probability was .3 while alternated-component reinforcer probability was .9 or .1 across successive conditions. The forgetting functions in the alternated component in the .9 and .1 conditions were quite similar to those obtained in the .9 and .1 components of multiple VI DMTS by Odum et al. (2005), and were well fitted with the same parameter values: x  =  z  =  0.012 and q  =  v  =  0.023. With these parameters, constant-component forgetting functions are predicted to be parallel, with the forgetting function in the .1 condition 0.13 log units above the forgetting function in the .9 condition. However, there was no evidence of contrast in the constant component, and the average obtained difference between forgetting functions was −0.04. Also, contrast effects in VI response rate were weak and inconsistent.

In a second experiment designed to increase the likelihood of VI response-rate contrast, constant-component reinforcer probability was .3 while the alternated component was either VI 25-s with immediate food or extinction across successive conditions. Contrast effects were obtained in both VI response rates and in the level of the forgetting functions; however, the average obtained difference between forgetting functions was 0.11 log units whereas the predicted value with x  =  z  =  0.012 and q  =  v  =  0.023 was 0.25. Post-hoc adjustment of parameter values could, of course, bring the model into better agreement with the data, but with four free parameters and eight data points this would be little cause for satisfaction. Thus, the model correctly predicts contrast in forgetting function levels analogous to that in free-operant response rates, but the magnitude of the obtained effect is small and may occur only under special procedural conditions. More research over a wider range of reinforcement conditions is needed to ascertain the generality of contrast in forgetting functions.

3) Reversing the Correlation Between Sensitivity and Accuracy

As explained above, a reliable decrease in sensitivity to differential reinforcement as a function of the retention interval, or a positive relation between sensitivity and accuracy, cannot be explained by the Davison–Nevin (1999) model. Neither can it be explained by the models of Wixted (1989) or of White and Wixted (1999) without added assumptions, as described below. It can be explained, however, by the present model if p(As) is less than 1.0 and p(Ac) is disrupted during retention intervals. Therefore, if an effective method for disrupting p(Ac) can be identified and applied in combination with a method that maintains p(As) at a constant level less than 1.0, it should be possible to reverse the sensitivity–accuracy relation by imposing and removing that disruptor. Data confirming the predicted effect of such a disruptor on the sensitivity–accuracy correlation would provide strong support for the present model.

4) Isosensitivity Curves

In signal detection research and theory, a plot of the probability of correct detections in relation to the probability of false alarms with constant signal intensity is known as the isosensitivity curve, which has been used to characterize underlying discriminal or threshold processes (e.g., Egan, 1975; Green & Swets, 1966). A standard method for generating empirical isosensitivity curves is to vary the relative frequencies or values of the payoffs for correct detections and false alarms. This is exactly the method used in DMTS to evaluate the sensitivity of the ratio of comparison choices, B1/B2, to the ratio of reinforcers, R11/R22. Thus, a plot of p(B1|S1) in relation to p(B1|S2), the probabilities of correct and incorrect comparison choices with sample S1, would constitute an isosensitivity curve.

Figure 19 presents simulated isosensitivity curves for ds  =  400 with the R11/R22 ratio varied from 199:1 to 1:199, for several values of p(As) and p(Ac), and compares them with the isosensitivity curve predicted for ds  =  4, which represents a moderately difficult discrimination between samples that is analogous to a detection task with a moderate signal-to-noise ratio. We set dc  =  400 for all examples. The upper left panel presents the simulated functions on probability coordinates, showing that the curve for ds  =  4, p(As)  =  .8, p(Ac)  =  1.0 resembles standard detection results. However, with ds  =  400, the curves differ markedly from those reported in the detection literature.

Fig 19.

The upper left panel displays predicted isosensitivity curves relating p(B1|S1) to p(B1|S2) when the ratio of reinforcers for correct responses is varied, with values of ds, p(As), and p(Ac) indicated in the legend. The upper right panel displays the same functions with the axes transformed to logit p. The lower left panel presents the data of Jones (2003, Part 1) for variations in reinforcer probabilities over a wide range with zero retention interval to illustrate rough agreement with the predicted “improper” form of the isosensitivity curve (see also Nevin et al., 2005, for analysis and model fits). The lower right panel presents the data of Jones and White (1992, see Fig. 14) replotted as isosensitivity curves at four retention intervals. See text for explanation.

The differences are easier to appreciate when the functions are replotted as logit p(B1|S1) in relation to logit p(B1|S2), as shown in the upper right panel; if the functions conformed to detection-theory expectation, they would be roughly linear and parallel to the major diagonal (Egan, 1975). With ds  =  4, p(As)  =  .8, p(Ac)  =  1.0, the function is linear and parallel to the major diagonal over 3 log units, showing that our model can approximate the isosensitivity curves in classical detection theory. With ds  =  400, p(As)  =  p(Ac)  =  .8, the curve lies at approximately the same average distance from the major diagonal, signifying similar overall accuracy, but it is truncated in length and more sharply curved inward. With ds  =  400, p(As)  =  p(Ac)  =  1.0, the function is much further from the major diagonal and its inward curvature is even stronger. Finally, with ds  =  400, p(As)  =  .8, p(Ac)  =  1.0, the curve becomes wavy and dips downward in the middle—an admittedly bizarre form that has not, to our knowledge, appeared previously in the empirical signal-detection or recognition literature. It would be regarded as “improper” in detection theory because its slope changes from negative to positive acceleration and back again as p(B1|S2) increases (Egan, 1975).
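The detection-theory benchmark of a line parallel to the major diagonal can be illustrated with a deliberately simple model in which the B1:B2 choice ratio is the product of a discriminability term and a bias term. This is an assumption introduced for illustration only, not Equations 3 and 4 and not the simulation behind Figure 19; the function names and the grid of bias ratios are hypothetical.

```python
import math


def choice_probabilities(d: float, bias_ratio: float):
    """Return (p(B1|S1), p(B1|S2)) when the B1:B2 choice ratio is
    d * bias_ratio on S1 trials and bias_ratio / d on S2 trials."""
    odds_s1 = d * bias_ratio
    odds_s2 = bias_ratio / d
    return odds_s1 / (1.0 + odds_s1), odds_s2 / (1.0 + odds_s2)


def logit10(p: float) -> float:
    """Base-10 log odds of a probability strictly between 0 and 1."""
    return math.log10(p / (1.0 - p))


D = 4.0                                     # illustrative discriminability
BIAS_RATIOS = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical bias values

gaps = []
for r in BIAS_RATIOS:
    p_s1, p_s2 = choice_probabilities(D, r)
    gaps.append(logit10(p_s1) - logit10(p_s2))

print([round(g, 3) for g in gaps])
```

In this simplified model the vertical distance between the curve and the major diagonal is 2 log d at every bias value, so the curve is a straight line parallel to the diagonal. Curvature or waviness of the kind described above is a departure from that constant gap.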

Because p(As) and p(Ac) decrease systematically with the length of the retention interval (see Figure 3), it should be possible to generate the full range of curves exhibited in Figure 19 experimentally by arranging many different reinforcer ratios over a wide range at each of several retention intervals, with many trials at each ratio to permit accurate estimation of extreme response probabilities. A matching-to-sample (MTS) experiment by Jones (2003, Part 1), which was discussed by Nevin et al. (2005), met these requirements but arranged 0-s retention intervals only. As shown in the lower left panel of Figure 19, the data approximate the “improper” wavy form predicted by the model with ds  =  dc  =  400, p(As)  =  .69, p(Ac)  =  .99 (note that these values for Part 1 only differ slightly from those fitted to all of Jones's data for Parts 1 and 2 by Nevin et al.). Jones and White (1992), in a study discussed above, varied reinforcer ratios at each of four retention intervals, but with a narrower range of ratios and many fewer trials per condition than in Jones's study. Consequently, the Jones and White data, shown in the lower right panel of Figure 19, are far too irregular to evaluate against model predictions. More precise evaluation of response probabilities over wider ranges of reinforcer ratios in DMTS research could yield isosensitivity curves with properties that accord with or challenge model predictions.

Problems and Limitations of the Model

Our model explicitly assumes that attending is affected by reinforcers and disruptors in the same way as the rate of overt free-operant behavior, and uses Equation 2 for response rate as the basis for the predictions of attending via Equations 3 and 4. Because DMTS accuracy is directly related to attending, variables that affect response rate should affect accuracy similarly when both are measured in the same conditions, as in the VI DMTS paradigm. Consistent with this expectation, Odum et al. (2005) found that both response rates and DMTS accuracy were higher in baseline, and were more resistant to intercomponent food and to extinction, in the richer of two VI DMTS components. Likewise, Nevin et al. (in press) found contrast effects in response rate as well as average DMTS accuracy. Additional support for this expectation was provided by Wilkie, Summers, and Spetch (1981). They examined the effects of four different stimuli, presented during the retention interval, on symbolic DMTS accuracy in pigeons, and found that houselight and a geometric form projected on the center key disrupted DMTS accuracy whereas tones and chamber vibration had no effect. In a second experiment, they presented the same stimuli during free-operant key pecking maintained by VI reinforcement and found that response rates were reduced by houselight and geometric forms on the key, but that tones and chamber vibration had no effect. The effects on accuracy and response rate were correlated across pigeons (see also Nevin et al., 2003).

In the multiple VI DMTS paradigm, however, Schaal et al. (2000) did not find similar effects of increasing the retention interval in DMTS trials on accuracy and response rate. They found that accuracy was higher in the VI 120-s component than in the VI 20-s component at all retention intervals, with the difference approaching 0 as the retention interval increased. By contrast, response rate in the VI 20-s component was higher at short retention intervals, but decreased relatively more rapidly than in the VI 120-s component as the retention interval increased and was usually lower at long retention intervals. The functions relating VI response rates to retention-interval length in the VI 20-s and VI 120-s components cannot be fitted by Equation 2 unless the reinforcing value of DMTS trial onset depends inversely on the length of the retention interval relative to the length of the VI schedule. Although the assumption of temporal relativity in reinforcer value would be consistent with Fantino's (1977) delay reduction theory (see O'Daly, Angulo, Gipson, & Fantino, 2006, for application to multiple chain schedules), embedding this assumption into Equations 3 and 4 would lead to serious mispredictions of DMTS accuracy as reported by Schaal et al.

A related problem arises when reinforcer amount signaled for correct DMTS performance varies between trials. A number of studies have found that reinforcer magnitude and rate have functionally similar effects on response rate and resistance to change in multiple schedules (see Nevin & Grace, 2000, for review). Therefore, the same should be true for DMTS accuracy. However, when reinforcer duration was treated like reinforcer rate, the model performed poorly with the DMTS data for signaled reinforcer magnitudes reported by McCarthy and Voss (1995). Davison and Nevin (1999) considered various ways to treat reinforcer magnitude, none of which was entirely satisfactory, and Alsop and Porritt (2006) have reported some effects of reinforcer magnitude that raise further challenges to its treatment by the Davison–Nevin model. The general problem may be that the present model has no specific provision for incorporating variations in reinforcer value into predictions derived from Equations 3 and 4 via the Davison–Nevin model.

Another problem arises when reinforcers for correct responses on S1 and S2 trials differ in magnitude, probability, or quality. In DMTS, differential outcomes of this sort typically maintain higher and shallower forgetting functions than in otherwise comparable conditions where the same outcomes are arranged on both S1 and S2 trials. This result does not follow from our model in its present form. A comprehensive review by Urcuioli (2005) concluded that the effect of differential outcomes is mediated by learned differential expectancies that combine with S1 and S2 to form compound samples S1E1 and S2E2, where E1 and E2 represent the expectancies of the outcomes signaled by S1 and S2. In some cases, E1 and E2 may correspond to different overt behavior on S1 and S2 trials (e.g., Alling, Nickel, & Poling, 1991; see Urcuioli, 1985, 2005, for discussion). Our model can account for higher and shallower forgetting functions for compound samples if we assume independent attending to each element, so that p(ASE) = p(AS) + p(AE) − p(AS) × p(AE). Because p(ASE) for the compound is greater than for either element, accuracy is higher and may decrease less with increasing retention intervals than in an otherwise identical condition with nondifferential outcomes, depending on the values of parameters x, z, q, and v. Research on the level and resistance to disruption of forgetting functions with differential outcomes, together with parallel research using compound samples, may indicate the plausibility of this approach; we defer attempts to model the effects of differential outcomes until the results of such research become available.
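
The independence assumption for compound samples can be illustrated numerically. The following is a minimal sketch, not part of the model's worksheets: the function name and the example probabilities are hypothetical, chosen only to show that the probability of attending to the compound is at least as large as the probability of attending to either element.

```python
def p_attend_compound(p_as: float, p_ae: float) -> float:
    """Probability of attending to at least one element of a compound sample.

    Assumes attending to the sample (S) and to the expectancy (E) are
    independent events, so the result is p(AS) + p(AE) - p(AS) * p(AE).
    """
    for p in (p_as, p_ae):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_as + p_ae - p_as * p_ae


# Hypothetical values for illustration only: p(AS) falls as if the
# retention interval were lengthening, while p(AE) is held constant.
for p_as, p_ae in [(0.9, 0.5), (0.6, 0.5), (0.3, 0.5)]:
    compound = p_attend_compound(p_as, p_ae)
    print(f"p(AS)={p_as:.1f}  p(AE)={p_ae:.1f}  p(ASE)={compound:.3f}")
```

In this sketch the compound probability declines more slowly than p(AS) alone, which is the sense in which the forgetting function could be higher and shallower. Whether it is in practice depends on the model parameters named above.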

Relations to Other Theoretical Accounts of DMTS Performance

There are some interesting similarities between our approach and that of Wixted (1989). Wixted proposed that in DMTS, the discriminative strength (p) of the sample was given by a function derived from Fantino's (1977) delay-reduction theory, where the ratio of the ITI to the retention interval is related to the magnitude of delay reduction. As such, it involves the same terms as the reinforcer rate ratios rs/ra and rc/rs in our Equations 3 and 4. In Wixted's model, the effects of both durations are adjusted by separate additive sensitivity parameters. The effect of changing the values of these parameters is similar to the effect of changing x and z in Equations 3 and 4. Therefore, it is not surprising that Wixted's predictions of the Roberts and Kraemer (1982, Experiment 1) data are strikingly similar to ours, and have essentially identical VAC.

More generally, the proportion of correct responses predicted by Wixted's (1989) model is given by reinforcer proportions for correct matching responses weighted by p, the discriminative strength of the sample, plus reinforcer proportions that are independent of choice of comparison weighted by (1-p). Thus, as p decreases and accuracy decreases concomitantly, sensitivity to differential reinforcement increases. In this respect, Wixted's model is quite similar to ours as summarized in Figure 5, which shows how predicted response probabilities depend on the proportions of trials on which different effective reinforcer proportions are operative, which in turn depend on p(As) and p(Ac). Moreover, Wixted's p is derived from delay reduction theory, whereas in our model, p(As) and p(Ac) are derived from behavioral momentum theory. Thus, both models are built on functional relations developed in other domains and adapted to model DMTS performance. The major difference between Wixted's and our general approaches is that his model's predictions are based on proportions of reinforcers and therefore must be independent of the overall probability or rate of reinforcement (as in Davison & Nevin's, 1999, model), whereas the present model's predictions are based on absolute reinforcer rates relative to their context and therefore account naturally for the effects of different reinforcer probabilities or rates.
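
The weighting described for Wixted's (1989) model can be sketched as a simple mixture. This is one hypothetical reading of the verbal description above, not Wixted's published equations. It assumes that errors are never reinforced, so that given S1 all reinforcers for matching follow C1 and given S2 none do, and that the sample-independent term is the overall proportion of reinforcers obtained for C1. The function names and reinforcer values are illustrative.

```python
def p_choose_c1(p: float, matching_prop: float, independent_prop: float) -> float:
    """Mix sample-controlled and sample-independent reinforcer proportions.

    p is the discriminative strength of the sample (0 to 1).
    """
    return p * matching_prop + (1.0 - p) * independent_prop


def choice_probs(p: float, r1: float, r2: float) -> tuple[float, float]:
    """Return (P(C1 | S1), P(C1 | S2)) for reinforcer rates r1 and r2.

    Assumption: errors are never reinforced, so the matching proportion
    for C1 is 1.0 on S1 trials and 0.0 on S2 trials.
    """
    independent = r1 / (r1 + r2)
    return (
        p_choose_c1(p, 1.0, independent),
        p_choose_c1(p, 0.0, independent),
    )


# Hypothetical reinforcer ratios: compare 1:1 with 4:1 at three values of p.
for p in (0.9, 0.5, 0.1):
    equal_s1, _ = choice_probs(p, 1.0, 1.0)
    biased_s1, _ = choice_probs(p, 4.0, 1.0)
    shift = biased_s1 - equal_s1
    print(f"p={p:.1f}  P(C1|S1): {equal_s1:.3f} -> {biased_s1:.3f}  shift={shift:.3f}")
```

Under these assumptions, the shift in choice produced by the same change in reinforcer ratio grows as p shrinks, which corresponds to the statement that sensitivity to differential reinforcement increases as accuracy decreases.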

White and Wixted (1999) proposed an utterly different model of DMTS performance that derives from signal-detection theory in spirit but not in detail. The model assumes that during the retention interval, the sample stimuli are represented as normal distributions on a hypothetical effect axis, and that the distributions overlap to a greater extent as the retention interval lengthens. It also assumes that the height of each distribution depends on the probability of reinforcement for correct responses for each sample. Finally, it assumes that the choice of comparison depends directly on the ratio of the ordinate values of the distributions at a point sampled at random along the effect axis (likelihood ratio). The model predicts the relations between log d, log b, and the parameters of the sample distributions; it cannot predict forgetting functions without making assumptions about the ways in which the sample distributions change in time. In more general terms, White and Wixted have advanced a structural model that generates relations among dependent variables but does not specify their relations to independent variables. In particular, it predicts that sensitivity to differential reinforcement increases as the overlap of sample distributions increases, so that sensitivity and accuracy must be negatively related. As noted above, the same negative relation also follows from Wixted's (1989) model, from Davison and Nevin's (1999) model, and from the present model if ds is large and if v in Equation 4 is 0, which implies constant orienting, observing, or attending to the comparisons during the retention interval regardless of its length (see Figure 8). The predictive convergence of these very different models is remarkable.
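
The structure of the White and Wixted (1999) account can be illustrated with a small numerical sketch. It rests on several assumptions that go beyond the description above: the two sample distributions are given equal standard deviations and fixed means, a larger standard deviation stands in for a longer retention interval, and the likelihood-ratio rule is read as choosing C1 with probability h1 / (h1 + h2), where h1 and h2 are the reinforcer-scaled ordinates. All names and parameter values are hypothetical.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def p_c1_given_s1(sd: float, r1: float = 1.0, r2: float = 1.0,
                  mean1: float = -1.0, mean2: float = 1.0,
                  steps: int = 4000) -> float:
    """P(choose C1 | S1), by midpoint integration over the effect axis.

    At each point x sampled from the S1 distribution, C1 is chosen with
    probability h1 / (h1 + h2), where h1 = r1 * f1(x) and h2 = r2 * f2(x).
    """
    lo, hi = mean1 - 8.0 * sd, mean2 + 8.0 * sd
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        f1 = normal_pdf(x, mean1, sd)
        h1 = r1 * f1
        h2 = r2 * normal_pdf(x, mean2, sd)
        if h1 + h2 > 0.0:  # guard against underflow in the far tails
            total += f1 * h1 / (h1 + h2) * dx
    return total


# Greater overlap (larger sd) lowers accuracy with equal reinforcers and
# enlarges the shift produced by a hypothetical 4:1 reinforcer ratio.
for sd in (0.5, 1.0, 2.0, 4.0):
    equal = p_c1_given_s1(sd)
    biased = p_c1_given_s1(sd, r1=4.0, r2=1.0)
    print(f"sd={sd:.1f}  equal={equal:.3f}  4:1={biased:.3f}  shift={biased - equal:.3f}")
```

In this sketch, accuracy falls and the effect of the reinforcer ratio grows as the distributions overlap more, which is the negative relation between sensitivity and accuracy described above.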

In order to address findings of decreasing sensitivity to reinforcement with increases in the retention interval (and hence positive relations between sensitivity and accuracy), White and Wixted (1999) varied the ITI and showed that the negative relation between sensitivity and accuracy could be eliminated or reversed when the ITI was very short, so that the accuracy of discrimination was low. As shown in Figure 8, our model can mimic this result by increasing the values of x and z. More generally, our model can predict either positive or negative relations by varying the level of attending to the comparisons during the retention interval. Evidently, the inclusion of variables that affect attending to the comparisons is the critical difference between the present model and its predecessors. As a theoretical exercise, it would be interesting to introduce variations in the role of the comparison stimuli into other models. But it would be far more interesting to explore ways of modulating stimulus control by the comparisons in the laboratory.

Wright and Sands (1981) devised a method for evaluating attending to the comparisons in DMTS with wavelength stimuli by presenting them on surfaces behind the pecking keys so that their subjects could see only one at a time. In training, the comparisons were the same wavelengths as the samples. In test trials, different wavelengths served as comparisons. The pigeons almost always observed the comparison presented behind one side key. They then either pecked that key, suggesting that its wavelength met a subjective criterion for “matching,” or switched and observed the other comparison, sometimes switching back and forth several times within a trial before pecking a key. Clearly, this behavior is an overt equivalent of “attending to the comparisons” in our model, and the probability of pecking the first-observed key regardless of its wavelength could provide an inverse measure of p(Ac). Wright and Sands modeled their pigeons' patterns of observing by assuming that the comparisons were represented by normal distributions on a hypothetical continuum of wavelength effect, similar to that proposed by White and Wixted (1999) for the samples, and that the pigeons either pecked or switched depending on whether a particular observation fell to one side or the other of a criterion. The resulting probabilities were entered into a Markov model which predicted overall patterns of switching in relation to wavelength differences. The data were in close agreement with prediction. In effect, the distributions in the model of Wright and Sands play a role similar to dc in our model, and the location of the criterion for pecking or switching in their model might be related to the probability of reinforcement in our model: Specifically, decreasing the reinforcer probability might make the criterion for accepting the first-observed comparison as a match more lenient, with a concomitant decrease in accuracy. 
It would be of great interest to explore the effects of reinforcer probability in a version of their paradigm that permits measurement of observing or attending to both samples and comparisons in DMTS trials.

At the outset of this article, we suggested that in order for a subject to perform correctly and obtain reinforcers in conditional discrimination tasks, it must attend to both the conditional and choice cues. Our model makes that dual requirement explicit, and suggests that reinforcement determines attending in the same way as overt free-operant responding.

Acknowledgments

Preparation of this article was supported by NIMH Grant MH65949 to the University of New Hampshire.

Footnotes

1

Worksheets are available on the JEAB website: seab.envmed.rochester.edu/jeab/extensions/nevin.html

2

Paper presented at the meetings of the Association for Behavior Analysis, May 2006, Atlanta, GA. Copies of all individual data are available from the first author.

3

Worksheets are available on the JEAB website: seab.envmed.rochester.edu/jeab/extensions/nevin.html

References

  1. Alling K, Nickel M, Poling A. The effects of differential and nondifferential outcomes on response rates and accuracy under a delayed-matching-to-sample procedure. The Psychological Record. 1991;41:537–549.
  2. Alsop B. Detection and choice. New Zealand: University of Auckland; 1988.
  3. Alsop B, Porritt M. Discriminability and sensitivity to reinforcer magnitude in a detection task. Journal of the Experimental Analysis of Behavior. 2006;85:41–56. doi: 10.1901/jeab.2006.91-04.
  4. Baum W.M. On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior. 1974;22:231–242. doi: 10.1901/jeab.1974.22-231.
  5. Blough D.S. Delayed matching in the pigeon. Journal of the Experimental Analysis of Behavior. 1959;2:151–160. doi: 10.1901/jeab.1959.2-151.
  6. Blough D.S. Error factors in pigeon discrimination and delayed matching. Journal of Experimental Psychology: Animal Behavior Processes. 1996;22:118–131.
  7. Brown G.S, White K.G. On the effects of signaling reinforcer probability and magnitude in delayed matching to sample. Journal of the Experimental Analysis of Behavior. 2005;83:119–128. doi: 10.1901/jeab.2005.94-03.
  8. Cumming W.W, Berryman R, Cohen L.R. Acquisition and transfer of zero delay matching. Psychological Reports. 1965;17:435–445. doi: 10.2466/pr0.1965.17.2.435.
  9. Davison M, McCarthy D.C. The matching law: A research review. Hillsdale, NJ: Erlbaum; 1988.
  10. Davison M, Nevin J.A. Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior. 1999;71:439–482. doi: 10.1901/jeab.1999.71-439.
  11. Davison M.C, Tustin R.D. The relation between the generalized matching law and signal-detection theory. Journal of the Experimental Analysis of Behavior. 1978;29:331–336. doi: 10.1901/jeab.1978.29-331.
  12. Egan J.P. Signal detection theory and ROC analysis. New York: Academic Press; 1975.
  13. Fantino E. Conditioned reinforcement: Choice and information. In: Honig W.K, Staddon J.E.R, editors. Handbook of operant behavior. Englewood Cliffs, NJ: Prentice-Hall; 1977. pp. 313–339.
  14. Green D.M, Swets J.A. Signal detection theory and psychophysics. New York: Wiley; 1966.
  15. Grant D.S. Effect of sample presentation time on long-delay matching in the pigeon. Learning and Motivation. 1976;7:580–590.
  16. Harnett P, McCarthy D.C, Davison M. Delayed signal detection, differential reinforcement, and short-term memory in the pigeon. Journal of the Experimental Analysis of Behavior. 1984;42:87–111. doi: 10.1901/jeab.1984.42-87.
  17. Honig W.K, Wasserman E.A. Performance of pigeons on delayed simple and conditional discriminations under equivalent training procedures. Learning and Motivation. 1981;12:149–170.
  18. Jones B.M. Quantitative analyses of matching to sample performance. Journal of the Experimental Analysis of Behavior. 2003;79:323–350. doi: 10.1901/jeab.2003.79-323.
  19. Jones B.M, White K.G. Sample-stimulus discriminability and sensitivity to reinforcement in delayed matching to sample. Journal of the Experimental Analysis of Behavior. 1992;58:159–172. doi: 10.1901/jeab.1992.58-159.
  20. Jones B.M, White K.G, Alsop B.A. On two effects of signaling the consequences of remembering. Animal Learning and Behavior. 1995;23:256–272.
  21. Kojima S. Auditory short-term memory in the Japanese monkey. International Journal of Neuroscience. 1985;25:255–262. doi: 10.3109/00207458508985378.
  22. Lashley K.S. Conditional reactions in the rat. Journal of Psychology. 1938;6:311–324.
  23. Mazur J.E. An adjusting procedure for studying delayed reinforcement. In: Commons M.L, Mazur J.E, Nevin J.A, Rachlin H, editors. Quantitative analyses of behavior, Vol. V: Effects of delay and intervening events on reinforcement value. Hillsdale, NJ: Erlbaum; 1987. pp. 55–73.
  24. McCarthy D.C, Davison M. The interaction between stimulus and reinforcer control on remembering. Journal of the Experimental Analysis of Behavior. 1991;56:51–66. doi: 10.1901/jeab.1991.56-51.
  25. McCarthy D.C, Nevin J.A. The consequences of remembering. In: Abraham W.C, Corballis M.C, White K.G, editors. Memory mechanisms: A tribute to G. V. Goddard. Hillsdale, NJ: Erlbaum; 1991. pp. 275–290.
  26. McCarthy D.C, Voss P. Delayed matching-to-sample performance: Effects of relative reinforcer frequency and of signaled versus unsignaled reinforcer magnitudes. Journal of the Experimental Analysis of Behavior. 1995;63:33–51. doi: 10.1901/jeab.1995.63-33.
  27. McCarthy D.C, White K.G. Behavioral models of delayed detection and their application to memory. In: Commons M.L, Mazur J, Nevin J.A, Rachlin H.C, editors. Quantitative analyses of behavior: Vol. 5. The effects of delay and intervening events on reinforcement value. Hillsdale, NJ: Erlbaum; 1987. pp. 29–54.
  28. Nevin J.A. Measuring behavioral momentum. Behavioural Processes. 2002;57:187–198. doi: 10.1016/s0376-6357(02)00013-x.
  29. Nevin J.A, Davison M, Shahan T.A. A theory of attending and reinforcement in conditional discrimination. Journal of the Experimental Analysis of Behavior. 2005;84:281–303. doi: 10.1901/jeab.2005.97-04.
  30. Nevin J.A, Grace R.C. Behavioral momentum and the law of effect. Behavioral and Brain Sciences. 2000;23:73–130. doi: 10.1017/s0140525x00002405.
  31. Nevin J.A, Grosch J. Effects of signaled reinforcer magnitude on delayed matching-to-sample performance. Journal of Experimental Psychology: Animal Behavior Processes. 1990;16:298–305.
  32. Nevin J.A, McLean A.P, Grace R.C. Resistance to extinction: Contingency termination and generalization decrement. Animal Learning and Behavior. 2001;29:176–191.
  33. Nevin J.A, Milo J, Odum A.L, Shahan T.A. Accuracy of discrimination, rate of responding, and resistance to change. Journal of the Experimental Analysis of Behavior. 2003;79:307–321. doi: 10.1901/jeab.2003.79-307.
  34. Nevin J.A, Shahan T.A, Odum A.L. Contrast effects in response rate and accuracy of delayed matching to sample. Quarterly Journal of Experimental Psychology. In press. doi: 10.1080/17470210701557597.
  35. O'Daly M, Angulo S, Gipson P, Fantino E. Influence of temporal context on value in the multiple-chains and successive-encounters procedures. Journal of the Experimental Analysis of Behavior. 2006;85:309–328. doi: 10.1901/jeab.2006.68-05.
  36. Odum A.L, Shahan T.A, Nevin J.A. Resistance to change of forgetting functions and response rates. Journal of the Experimental Analysis of Behavior. 2005;84:65–75. doi: 10.1901/jeab.2005.112-04.
  37. Rachlin H. Notes on discounting. Journal of the Experimental Analysis of Behavior. 2006;85:425–435. doi: 10.1901/jeab.2006.85-05.
  38. Roberts W.A. Short-term memory in the pigeon: Effects of repetition and spacing. Journal of Experimental Psychology. 1972;94:74–83.
  39. Roberts W.A, Kraemer P.J. Some observations on the effects of intertrial interval and delay on delayed matching to sample in pigeons. Journal of Experimental Psychology: Animal Behavior Processes. 1982;8:342–353.
  40. Roitblat H.L. Codes and coding processes in pigeon short-term memory. Animal Learning and Behavior. 1980;8:341–351.
  41. Sargisson R.J, White K.G. The effect of reinforcer delays on the form of the forgetting function. Journal of the Experimental Analysis of Behavior. 2003;80:77–94. doi: 10.1901/jeab.2003.80-77.
  42. Schaal D.W, Odum A.L, Shahan T.A. Pigeons may not remember the stimuli that reinforced their behavior. Journal of the Experimental Analysis of Behavior. 2000;73:125–139. doi: 10.1901/jeab.2000.73-125.
  43. Shahan T.A. Observing behavior: Effects of rate and magnitude of primary reinforcement. Journal of the Experimental Analysis of Behavior. 2002;78:161–178. doi: 10.1901/jeab.2002.78-161.
  44. Urcuioli P.J. On the role of differential sample behaviors in matching-to-sample. Journal of Experimental Psychology: Animal Behavior Processes. 1985;11:502–519. doi: 10.1037//0097-7403.11.4.502.
  45. Urcuioli P.J. Behavioral and associative effects of differential outcomes in discrimination learning. Learning and Behavior. 2005;33:1–21. doi: 10.3758/bf03196047.
  46. White K.G. Characteristics of forgetting functions in delayed matching to sample. Journal of the Experimental Analysis of Behavior. 1985;44:15–34. doi: 10.1901/jeab.1985.44-15.
  47. White K.G. Psychophysics of direct remembering. In: Commons M.L, Nevin J.A, Davison M, editors. Signal detection: Mechanisms, models, and applications. Hillsdale, NJ: Erlbaum; 1991. pp. 221–237.
  48. White K.G. Forgetting functions. Animal Learning and Behavior. 2001;29:193–207.
  49. White K.G, Wixted J.D. Psychophysics of remembering. Journal of the Experimental Analysis of Behavior. 1999;71:91–113. doi: 10.1901/jeab.1999.71-91.
  50. Wilkie D.M, Summers R.J, Spetch M.L. Effect of delay-interval stimuli on delayed symbolic matching to sample in the pigeon. Journal of the Experimental Analysis of Behavior. 1981;35:153–160. doi: 10.1901/jeab.1981.35-153.
  51. Wixted J.D. Nonhuman short-term memory: A quantitative analysis of selected findings. Journal of the Experimental Analysis of Behavior. 1989;52:409–426. doi: 10.1901/jeab.1989.52-409.
  52. Wright A.A, Sands S.F. A model of detection and decision processes during matching to sample by pigeons: Performance with 88 different wavelengths in delayed and simultaneous matching tasks. Journal of Experimental Psychology: Animal Behavior Processes. 1981;7:191–216.

Articles from Journal of the Experimental Analysis of Behavior are provided here courtesy of Society for the Experimental Analysis of Behavior
