Philosophical Transactions of the Royal Society B: Biological Sciences. 2013 Oct 19;368(1628):20130054. doi: 10.1098/rstb.2013.0054

Temporal expectancy in the context of a theory of visual attention

Signe Vangkilde 1, Anders Petersen 1, Claus Bundesen 1
PMCID: PMC3758197  PMID: 24018716

Abstract

Temporal expectation is expectation with respect to the timing of an event such as the appearance of a certain stimulus. In this paper, temporal expectancy is investigated in the context of the theory of visual attention (TVA), and we begin by summarizing the foundations of this theoretical framework. Next, we present a parametric experiment exploring the effects of temporal expectation on perceptual processing speed in cued single-stimulus letter recognition with unspeeded motor responses. The length of the cue–stimulus foreperiod was exponentially distributed with one of six hazard rates varying between blocks. We hypothesized that this manipulation would result in a distinct temporal expectation in each hazard rate condition. Stimulus exposures were varied such that both the temporal threshold of conscious perception (t0 ms) and the perceptual processing speed (v letters s−1) could be estimated using TVA. We found that the temporal threshold t0 was unaffected by temporal expectation, but the perceptual processing speed v was a strikingly linear function of the logarithm of the hazard rate of the stimulus presentation. We argue that the effects on the v values were generated by changes in perceptual biases, suggesting that our perceptual biases are directly related to our temporal expectations.

Keywords: temporal expectation, perceptual processing speed, theory of visual attention

1. Introduction

Temporal expectation is expectation with respect to the timing of an event such as the appearance of a certain stimulus. In this paper, temporal expectancy is investigated in the context of the theory of visual attention (TVA) [1]. The general framework of TVA, which will be presented below, has been highly successful in accounting for a wide range of findings within the attention literature (for an overview, see [2,3]). Until recently, however, few TVA-based studies had investigated attentional effects within the temporal domain [4–7]. Here, we substantially extend one of these previous studies [4] and present the first parametric investigation of the relationship between temporal expectations and the way in which we process information.

2. Theory of visual attention

TVA seems to have been the first parallel-processing theory of attention based on the principle that Desimone & Duncan [8], in a later, highly influential article, called ‘biased competition’ (see [1,9]; also see [10]). In TVA, all possible visual categorizations ascribing features to objects compete (race) to become encoded into a limited visual short-term memory (VSTM) before it is filled up. Each possible categorization is supported by the sensory evidence that the categorization is true. However, the competition is biased by attentional weights and perceptual biases, so that certain objects and categorizations have higher probabilities of becoming encoded into VSTM and being consciously perceived. The interaction between sensory evidence and attentional biases is specified in two equations: the rate and weight equations of TVA. Thus, TVA represents a mathematical formalization of the biased competition principle.

(a). Basic assumptions

In TVA, both visual recognition and selection of objects in the visual field consist of making visual categorizations. A visual categorization has the form ‘object x has feature i’ or, equivalently, ‘object x belongs to category i’. The categorization is made (or, equivalently, selected) when the categorization is encoded into VSTM. When the categorization is made, object x is said both to be selected and to be recognized as a member of category i. In this way, selection and recognition are viewed as two aspects of the same process.

When a visual categorization of an object completes processing, the categorization enters VSTM given that storage space is available in this system. Usually, the capacity of VSTM is assumed to be limited to K different objects, where K is about four (cf. [11]). K varies between individuals and is one of the main parameters of TVA. Clearing VSTM starts a race among objects in the visual field to become encoded into VSTM. An object is encoded in VSTM if, and only if, some categorization of the object is encoded in VSTM. Each object x may be represented in the encoding race by all possible categorizations of that object.

(i). Rate equation

By the rate equation of TVA, the rate, v(x, i), at which a particular visual categorization, ‘x belongs to i’, is encoded into VSTM is given by a product of three terms:

$$v(x, i) = \eta(x, i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z} \tag{2.1}$$

The first term, η(x, i), is the strength of the sensory evidence that x belongs to category i. The second term, βi, is a perceptual decision bias associated with category i (0 ≤ βi ≤ 1). The third term is the relative attentional weight of object x, that is, the weight of object x, wx, divided by the sum of weights across all objects in the visual field, S.

The sum of all v values across all perceptual categorizations of all elements in the visual field defines the total visual processing speed (C),

$$C = \sum_{x \in S} \sum_{i \in R} v(x, i) \tag{2.2}$$

where R is the set of all visual categories.
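The rate equation and the definition of C can be illustrated with a small numerical sketch. The two-object display, the η and β values, and the function names `rate` and `total_speed` below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the experiment.

```python
def rate(eta, beta, weights, x):
    """v(x, i) = eta(x, i) * beta_i * w_x / (sum of weights over all objects)."""
    total_weight = sum(weights.values())
    relative_weight = weights[x] / total_weight
    return {i: eta[x][i] * beta[i] * relative_weight for i in eta[x]}


def total_speed(eta, beta, weights):
    """C = sum of v(x, i) over all objects x and all categories i."""
    return sum(sum(rate(eta, beta, weights, x).values()) for x in weights)


# Hypothetical display: two objects, two letter categories.
eta = {"obj1": {"A": 40.0, "B": 5.0}, "obj2": {"A": 5.0, "B": 40.0}}
beta = {"A": 0.8, "B": 0.8}          # perceptual biases, 0 <= beta <= 1
weights = {"obj1": 1.0, "obj2": 1.0}  # equal attentional weights

v_obj1 = rate(eta, beta, weights, "obj1")
C = total_speed(eta, beta, weights)
```

Raising `beta["A"]` speeds up every categorization of type A without changing the relative weights, which is the pigeonholing mechanism; raising `weights["obj1"]` speeds up all categorizations of that object at the expense of the other object, which is filtering.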

(ii). Weight equation

The attentional weights in the rate equation of TVA depend on pertinence values. Every visual category j is supposed to have a pertinence, πj, which is a real number. The pertinence of category j is a measure of the momentary importance of attending to objects that belong to category j. The attentional weight, wx, of an object x in the visual field is given by the weight equation of TVA. The traditional form of the weight equation is

$$w_x = \sum_{j \in R} \eta(x, j)\,\pi_j \tag{2.3}$$

where η(x, j) is the strength of the sensory evidence that object x belongs to category j and πj is the pertinence of category j. By equation (2.3), the attentional weight of an object is a weighted sum of pertinence values. The pertinence of a given category enters the sum with a weight equal to the strength of the sensory evidence that the object belongs to the category. Recently, however, evidence was obtained [12] that the attentional weight of an object depends not only on the pertinence of its features (i.e. goal relevance) but also on the feature contrast between the object and its surroundings (cf. [13]) such that

$$w_x = \kappa_x \sum_{j \in R} \eta(x, j)\,\pi_j \tag{2.3a}$$

where κx is the local contrast between object x and other objects in the visual field.

In some situations (e.g. partial report), the visual stimuli fall into two categories: targets (i.e. elements to be attended to) and distractors (elements to be ignored), such that any target has approximately the same attentional weight as any other target, and any distractor has approximately the same attentional weight as any other distractor. In such cases, the efficiency of top-down selection can be defined as the ratio, α, between the attentional weights of a distractor and a target:

$$\alpha = \frac{w_{\mathrm{distractor}}}{w_{\mathrm{target}}} \tag{2.4}$$
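As a rough illustration of the weight equation and of α, the sketch below computes attentional weights for one target and one distractor. It assumes that the local contrast κx enters as a multiplicative factor on the pertinence-weighted sum (so that a contrast of 1 gives the traditional form); the colour categories, the numerical values and the function name `attentional_weight` are hypothetical.

```python
def attentional_weight(eta_x, pertinence, contrast=1.0):
    """w_x = kappa_x * sum_j eta(x, j) * pi_j.

    contrast=1.0 reduces to the traditional weight equation;
    the multiplicative role of kappa_x is an assumption of this sketch.
    """
    return contrast * sum(eta_x[j] * pertinence[j] for j in pertinence)


# Hypothetical partial-report setting: red items are targets, green are distractors.
pertinence = {"red": 1.0, "green": 0.25}
target_eta = {"red": 10.0, "green": 0.0}
distractor_eta = {"red": 0.0, "green": 10.0}

w_target = attentional_weight(target_eta, pertinence)
w_distractor = attentional_weight(distractor_eta, pertinence)

# Efficiency of top-down selection: ratio of distractor weight to target weight.
alpha = w_distractor / w_target
```

An α near 0 corresponds to highly efficient selection (distractors receive almost no weight), whereas an α near 1 means that targets and distractors are weighted about equally.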

(b). Mechanisms of selection

Taken together, the rate and weight equations of TVA describe two mechanisms of selection: a mechanism for selection of objects (filtering) and a mechanism for selection of categories (pigeonholing). The filtering mechanism is represented by pertinence values and attentional weights. If selection of objects with feature j is desired, then the pertinence of feature j should be high. The weight equation implies that when feature j has a high pertinence, objects possessing feature j obtain high attentional weights. Accordingly, by the rate equation, processing of objects with feature j is quick, so objects with this feature are likely to win the processing race and be encoded into VSTM.

The pigeonholing mechanism is represented by perceptual decision bias parameters. Pertinence values determine which objects are selected (filtering), but perceptual biases determine how the objects are categorized (pigeonholing). If particular types of categorizations must be reported or otherwise responded to, then the bias values of the corresponding categories should be high, making the desired types of categorizations more likely (cf. equation (2.1)).

When the selection system is coupled to a sensory system supplying appropriate η values, and when pertinence and bias parameters have been set, both filtering and pigeonholing are accomplished by an encoding race between visual categorizations whose rate parameters are determined through the simple algebraic operations of the rate and weight equations. Thus, TVA yields a truly computational account of selective attention in vision.

(c). Applications

(i). Single-stimulus identification

TVA has been applied to findings from a broad range of paradigms concerned with single-stimulus identification and selection from multi-element displays. For single-stimulus identification, TVA provides a mathematical derivation [1,14] of Luce's [15] classical biased-choice model, which has been successful in explaining many experimental findings on effects of visual discriminability and bias [16,17].

TVA also provides predictions of the time taken to recognize a singly presented simple stimulus. The v value in equation (2.1) is a hazard rate. Specifically, at any time t, the instantaneous value of v(x, i) is the conditional probability density that the categorization that object x belongs to category i is encoded into VSTM at time t, given that the categorization has not been encoded before time t, and given that storage space for the categorization is available in VSTM. This assumption implies that, if v(x, i) = 0 until time t0, and v(x, i) is kept constant for times t > t0, the probability p(t) of making the categorization ‘x belongs to i’ at or before time t is given by

$$p(t) = 1 - \exp\left[-v(x, i)\,(t - t_0)\right], \quad t > t_0 \tag{2.5}$$

where p(t) equals the probability of recognizing x as a member of i from a postmasked presentation with an exposure duration of t, given that t0 equals the longest ineffective exposure duration.
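The exponential form follows from treating v(x, i) as a constant hazard rate after t0. A minimal sketch of the resulting psychometric function is given below; the function name `p_correct` and the example parameter values (50 letters per second, a 10 ms threshold) are illustrative assumptions, and guessing and perceptual confusions are ignored.

```python
import math


def p_correct(t, v, t0):
    """Probability that the categorization is encoded by exposure duration t.

    t and t0 are in seconds, v is in categorizations (letters) per second.
    Exposures at or below the threshold t0 are ineffective.
    """
    if t <= t0:
        return 0.0
    return 1.0 - math.exp(-v * (t - t0))


# Hypothetical observer: v = 50 letters per second, t0 = 10 ms.
example_curve = [(t, p_correct(t, 50.0, 0.010))
                 for t in (0.010, 0.020, 0.030, 0.040, 0.060, 0.090)]
```

With these values, the effective exposure at 30 ms is 20 ms, so the exponent equals v(t − t0) = 1 and the predicted accuracy is 1 − e⁻¹.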

Equation (2.5) presupposes that guessing and perceptual confusions among stimuli (i.e. incorrect perceptual categorizations) can be neglected, but Kyllingsbæk et al. [18] recently showed how the time course of visual recognition of mutually confusable stimuli may also be analysed using TVA (also see [19]). Specifically, they proposed and tested a Poisson counter model of visual recognition of briefly presented, mutually confusable single stimuli in pure accuracy tasks. The model implies that during stimulus analysis, tentative categorizations that stimulus x belongs to category i are made at a constant Poisson rate, v(x, i). The analysis is continued until the stimulus disappears, and the overt response is based on the categorization made the greatest number of times. The model provided a close fit to individual data on identification of digits and an apparently perfect fit to data on identification of Landolt rings.

(ii). Selection from multi-element displays

For selection from multi-element displays, TVA provides a mathematical derivation of the fixed-capacity independent race model (FIRM) of Shibuya & Bundesen [20]. FIRM describes the processing of a stimulus display as a two-stage process. At the first stage of processing, an attentional weight measuring the strength of the sensory evidence that the element is a target is computed for each element in the display. At the second stage of processing, the race between the elements takes place. The total processing capacity at this stage of the system is assumed to be a constant, C elements per second, which is a basic parameter of FIRM. The processing capacity is distributed across the elements in proportion to their weights. Thus, every element in the display is allocated a certain fraction of the total processing capacity; the fraction equals the attentional weight of the element divided by the sum of the attentional weights across all of the elements in the display.

The amount of processing capacity allocated to an element determines how fast the element can be encoded into VSTM. Specifically, the time taken to encode an element is assumed to be exponentially distributed with a rate parameter equal to the amount of processing capacity that is allocated to the element. Encoding times for different elements are stochastically independent, and the elements actually selected are those elements whose encoding processes complete before the stimulus presentation terminates and before VSTM has been filled up.
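The two-stage description of FIRM can be sketched as a small Monte Carlo simulation. This is a simplified illustration rather than the fitting procedure of Shibuya & Bundesen: it assumes an integer K (the reported estimate of 3.7 would require mixing K = 3 and K = 4), gives every target a weight of 1 and every distractor a weight of α, and uses the hypothetical function name `firm_trial`.

```python
import random


def firm_trial(n_targets, n_distractors, exposure, C, alpha, K, t0, rng):
    """Simulate one partial-report trial; return the number of targets encoded.

    exposure and t0 are in seconds, C is in elements per second.
    """
    weights = [1.0] * n_targets + [alpha] * n_distractors
    total_weight = sum(weights)
    effective = exposure - t0
    if effective <= 0:
        return 0  # exposure below the perception threshold
    # Independent exponential encoding times, rate = share of capacity C.
    finish_times = sorted(
        (rng.expovariate(C * w / total_weight), index)
        for index, w in enumerate(weights)
    )
    # Elements finishing before the mask, limited to the first K (VSTM capacity).
    encoded = [index for time, index in finish_times if time <= effective][:K]
    return sum(1 for index in encoded if index < n_targets)


rng = random.Random(1)
scores = [firm_trial(4, 4, 0.100, 49.0, 0.40, 4, 0.019, rng) for _ in range(2000)]
mean_score = sum(scores) / len(scores)
```

Repeating the simulation over exposure durations and over combinations of T and D would give cumulative score distributions of the kind plotted in figure 1.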

FIRM predicts effects of variations in the exposure duration of the stimuli. These predictions were tested in a comprehensive study of partial report of digits from mixtures of letters and digits [20]. Exposure durations ranged from 10 up to 200 ms, and each display was terminated by a pattern mask. Figure 1 shows the probability distribution of the number of correctly reported targets as a function of the exposure duration, the number of targets (T), and the number of distractors (D) in the display for one representative subject (i.e. subject MP). Each panel shows the results for a particular combination of T and D. The top curve in a panel shows the probability of reporting at least one element correct as a function of exposure duration, the second curve from the top shows the probability of reporting at least two elements correct, etc. The fit shown by the smooth curves was obtained with VSTM capacity K at a value of 3.7 elements, total processing capacity C at 49 elements per second, parameter α (the weight ratio of a distractor to a target) at 0.40, and threshold of perception t0 at 19 ms. As can be seen, the fit was strikingly close.

Figure 1.

Relative frequency of scores of j or more correctly reported targets as a function of exposure duration with j, number of targets T, and number of distractors D as parameters in partial report experiment of Shibuya & Bundesen [20]. Data are shown for a representative subject in the experiment, subject MP. Parameter j varies within panels; j is 1, open circles; 2, open squares; 3, solid squares; 4, solid circles or 5, triangle. T and D vary among panels. Smooth curves represent a theoretical fit to the data by the FIRM model. For clarity, observed frequencies less than 0.02 were omitted from the figure. (Adapted from Shibuya & Bundesen [20, p. 595]. Copyright © 1988 American Psychological Association.)

(d). Neural interpretation

The neural interpretation of TVA (NTVA) model by Bundesen et al. [2,21] provides a neural interpretation of the rate and weight equations, which together describe two mechanisms of attentional selection: one for selection of objects and one for selection of features. NTVA describes how these mechanisms work at the single-cell level: filtering changes the number of cortical neurons in which an object is represented, and pigeonholing changes the rate of firing in cortical neurons coding for particular features (figure 2). Specifically, the total neural activation representing a visual categorization of the form ‘object x has feature i’ is directly proportional to (i) the number of neurons representing the categorization, which is controlled by filtering, and (ii) the level of activation of the individual neurons representing the categorization, which is controlled by pigeonholing.

Figure 2.

Attentional selection in NTVA: combined effects of filtering (selection of objects) and pigeonholing (selection of features) on the set of cortical spike trains representing a particular visual categorization of the form ‘object x has feature i’. The four conditions (quadrants) correspond to the factorial combinations of two levels of filtering (weak versus strong support to object x) by two levels of pigeonholing (weak versus strong support to feature i). Filtering changes the number of cortical neurons in which an object is represented. Pigeonholing changes the rate of firing of cortical neurons coding for a particular feature. (Adapted from Bundesen et al. [21, p. 292]. Copyright © 2005 American Psychological Association.)

Filtering makes the number of cells in which an object is represented increase with the behavioural importance of the object. Thus, in NTVA, visual processing is assumed to occur in parallel and with differential allocation of resources, so that important objects are represented in many cells. More specifically, the probability that a cortical neuron represents a particular object in its classical receptive field equals the attentional weight of the object divided by the sum of the attentional weights across all objects in the neuron's receptive field.

The weight equation of TVA describes the computation of attentional weights. Logically, the weights must be computed before processing resources (cells) can be distributed in accordance with the weights. Therefore, in NTVA, a normal perceptual cycle consists of two waves: a wave of unselective processing, in which attentional weights are computed, and a wave of selective processing, in which processing resources have been allocated in accordance with the weights. During the first wave, cortical processing resources are distributed at random (unselectively) across the visual field, but at the end of the first wave, an attentional weight for each object in the visual field has been computed and stored in a priority map. The weights are used for reallocation of attention (visual processing capacity) by dynamic remapping of receptive fields of cortical neurons. The remapping of receptive fields makes the number of neurons allocated to an object increase with the attentional weight of the object. Thus, during the second wave, cortical processing is selective in the sense that the amount of processing resources allocated to an object (the number of neurons that represent the properties of the object) varies with the attentional weight of the object. Because more processing resources are devoted to behaviourally important objects than to less important ones, the important objects are processed faster, and are therefore more likely to become encoded into VSTM. The VSTM system is conceived as a feedback mechanism that sustains activity in the neurons that have won the attentional competition, making the information represented by these neurons available for other cognitive processes. Recently, it has been suggested that this feedback mechanism may underlie the attentional dwell time phenomenon [5] (i.e. the inability to correctly identify the second of two targets presented in close temporal proximity).

3. Temporal expectancy

Vangkilde et al. [4] recently extended TVA to temporal attention (also see [6,7]). In an ever-changing world, temporal expectations guide our attention in time, so that our limited attentional resources are used in an optimal way [22,23]. Behavioural benefits (e.g. speeding of motor responses) after temporal warning cues inducing valid temporal expectations have been demonstrated repeatedly over the past century [24–26]. Most investigations have used foreperiod paradigms, in which the interval (or ‘foreperiod’) between a warning cue and a subsequent target stimulus is manipulated to induce different temporal expectations about the target onset. When foreperiods vary from trial to trial within a block, reaction times are faster for longer foreperiods [26]. This variable foreperiod effect is typically explained by the relationship between conditional probabilities, expectations and performance; temporal expectancy is often conceptualized as reflecting the hazard function of the occurrence of an imminent target stimulus [27,28]. In classical variable foreperiod paradigms, every cue–stimulus foreperiod is equally likely such that the conditional probability that the stimulus is going to appear at the next possible time of presentation (i.e. the hazard rate of target presentation) given that it has not yet been presented increases over the course of a trial. However, in these paradigms, the subjective level of temporal expectation is confounded with the actual duration of the foreperiod, and hence changes in performance over the course of a trial may be attributed to the level of temporal expectation, to the passing of time per se, or to a combination of the two.
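The rising hazard in classical variable foreperiod paradigms can be made concrete with a short calculation. The sketch below assumes a hypothetical design with four equally likely foreperiods; the function name `discrete_hazard` is illustrative.

```python
def discrete_hazard(probabilities):
    """Hazard at each possible presentation time:
    P(stimulus now | stimulus not yet presented)."""
    remaining = 1.0
    hazards = []
    for p in probabilities:
        hazards.append(p / remaining)
        remaining -= p
    return hazards


# Four equally likely foreperiods (hypothetical example).
uniform_hazards = discrete_hazard([0.25, 0.25, 0.25, 0.25])
```

For four equally likely foreperiods the hazard rises from 1/4 to 1/3 to 1/2 and finally to 1, so expectation and elapsed time increase together, which is the confound described above.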

In order to disentangle expectancy effects from the effect of foreperiod duration, Vangkilde et al. [4] used a modified foreperiod paradigm, in which foreperiods were drawn from non-ageing distributions, i.e. distributions with constant hazard rates [29–31]. Specifically, they explored effects of temporal expectation on visual processing speed in a cued single-stimulus recognition paradigm with briefly presented, postmasked stimuli. Different hazard rate functions for the cue–stimulus foreperiod were used to manipulate temporal expectations. For example, in one experiment, the length of the foreperiod from the cue to the stimulus was distributed exponentially. For each block of trials, the participants knew which of two exponential distributions with different hazard rates the foreperiods would be drawn from. The hazard rate could be either high (1.33 s−1) or low (0.22 s−1) corresponding to mean foreperiods of 750 and 4500 ms, respectively. In either condition, the probability p(t) of correct report as a function of the stimulus duration (t) was well described by equation (2.5), t0 being the threshold of conscious perception (the longest ineffective exposure duration), and v(x, i) being the speed of encoding into VSTM at times t > t0. As manipulated by the hazard rate, temporal expectation had no effect on the threshold of conscious perception but a strong effect on the speed of subsequent encoding into VSTM. Averaged across participants, encoding was speeded up by 30% in the high hazard rate condition. This effect was found even though no general decrease in processing speed with time-on-task occurred. Thus, the effect was independent of the actual duration of the foreperiod on a given trial, but depended entirely on expectation. Vangkilde et al. [4] explained the effect in terms of TVA by assuming that temporal expectations affect perception by changing perceptual biases (values of β parameters). Specifically, a strong expectation that a stimulus letter will appear at the next moment should yield an increase in the β values of letter types, which, in turn, should speed up the recognition of the stimulus letter.

4. Parametric experiment

The current experiment was a cued single-letter recognition task akin to the experiments described above [4]. However, here, we parametrically manipulated the hazard rates defining the exponential distributions from which the waiting times were drawn, such that six different, constant hazard rate conditions were used. This manipulation allowed us to investigate the functional relation between changes in the temporal expectation that a target stimulus would appear in the next moment, and the perceptual speed with which the target would be processed. Based on the previous findings [4], we hypothesized that the use of exponentially distributed foreperiods should induce a distinct, constant temporal expectation independent of the actual waiting time within each experimental condition. Furthermore, we hypothesized that these different levels of expectation would be reflected by differences in the perceptual processing speeds for each condition, but that the temporal threshold of perception would be unaffected by temporal expectations.

(a). Method

(i). Participants

Six healthy, young university students (mean age = 25 years, s.d. = 1.09 years; four women) with normal or corrected-to-normal visual acuity participated in the experiment. They all completed 13 test sessions corresponding to approximately 10 h of testing for which they received £200.

(ii). Procedure and stimuli

Every trial in the experiment followed the same basic outline illustrated in figure 3, which began with a 1000 ms fixation screen featuring a central circle to be fixated. Then, a symbolic cue (one of the numbers 1, 2, 3, 4, 5 or 6 in red) indicating the hazard rate condition was flashed for 200 ms within the circle. After the cue and an additional 100 ms fixation screen, a variable foreperiod was initiated. After this a randomly chosen target letter from the set [ABDEFGHJKLMNOPRSTVXZ] was briefly presented 5° of visual angle to the right (probability 0.5) or left of the fixation point. Target letters were presented for varied durations (10, 20, 30, 40, 60 or 90 ms) after which they were terminated by pattern masks presented for 500 ms on both locations. Participants were instructed to fixate centrally at all times, and their task was to make a non-speeded report of the identity of the letter if they had clearly seen it, or if they had any information at all about the letter identity (i.e. to use all available information but refrain from pure guessing). Participants were informed of the accuracy of their reports after each trial. Stimulus letters were written in the font Arial (broad) with a letter point size of 68 corresponding to 2.7 × 2.3° of visual angle. The masks were made from letter fragments and measured 3.8 × 3.8° of visual angle to completely cover the letters. The experiment was run on a 19″ CRT monitor at 100 Hz using the E-prime v. 2 software with participants seated with their head in a chinrest approximately 65 cm from the monitor in a semi-darkened room. The participants' eye movements and blinks were tracked using a desktop mounted EyeLink1000 tracker from SR Research with a sampling rate of 1 kHz.

Figure 3.

Time course of a single trial in the parametric experiment. The initial fixation ring was followed by a brief symbolic red number cue indicating the hazard rate condition (1–6). After the cue and an additional 100 ms fixation, the variable waiting time (foreperiod) before the imperative letter stimulus was initiated. The letter was presented either to the left or right of fixation for 10, 20, 30, 40, 60 or 90 ms after which the possible stimulus positions were masked. Then, the subjects were free to make an unspeeded report of the presented letter. Note that the foreperiod distributions used were unique for each of the six hazard rate conditions.

The cue indicated from which of six exponential distributions, each with a distinct hazard rate (λ), the foreperiod would be drawn. The hazard rates ranged from very low (λ = 0.1 s−1, mean foreperiod 10 s) to very high (λ = 3.2 s−1, mean foreperiod 0.3125 s) and were chosen such that, when ordered by increasing hazard rate, the rate doubled with each new condition (see table 1 for exact hazard rates and corresponding mean foreperiod durations in the six conditions). Figure 4 shows the probability density functions for the six different distributions together with their mean waiting times. Note that the time scale has been transformed by the natural logarithm to display both short and long waiting times in the same plot, and further, that the mean waiting times are equally spaced on the transformed scale.
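The relation between the six hazard rates, their mean foreperiods and the constant-hazard (memoryless) property can be checked with a few lines of code. This is a sketch of the distributions only, not of the actual stimulus program; the function names are hypothetical.

```python
import math
import random

# Six hazard rates, doubling from 0.1 to 3.2 per second.
hazard_rates = [0.1 * 2 ** k for k in range(6)]
# For an exponential distribution the mean waiting time is 1 / lambda.
mean_foreperiods = [1.0 / lam for lam in hazard_rates]
# Natural-log means differ by ln 2, hence the equal spacing on the log scale.
log_means = [math.log(m) for m in mean_foreperiods]


def survival(t, lam):
    """P(foreperiod > t) for an exponential distribution with hazard rate lam."""
    return math.exp(-lam * t)


def conditional_survival(elapsed, extra, lam):
    """P(foreperiod > elapsed + extra | foreperiod > elapsed)."""
    return survival(elapsed + extra, lam) / survival(elapsed, lam)


def sample_foreperiod(lam, rng):
    """Draw one foreperiod (seconds) from the exponential distribution."""
    return rng.expovariate(lam)


rng = random.Random(0)
example_foreperiods = [sample_foreperiod(0.4, rng) for _ in range(5)]
```

Because `conditional_survival(elapsed, extra, lam)` equals `survival(extra, lam)` for any elapsed time, an ideal observer who has already waited 5 s has exactly the same expectation about the next second as at the start of the foreperiod.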

Table 1.

Mean foreperiods, estimated attentional parameters, error rate and goodness-of-fit across hazard rate conditions. v, perceptual processing speed (letters s−1); t0, threshold of conscious perception (ms); var%, percentage of variance in the observed individual mean scores accounted for by the maximum-likelihood fits; r.m.s.d., square root of the mean squared deviation between observed and theoretical mean scores.

hazard rate (s−1) | foreperiod (s), M | foreperiod (s), range | v, M | v, s.e. | t0, M | t0, s.e. | error rate, M | error rate, s.e. | var%, M | r.m.s.d., M
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
λ = 0.1 | 10 | 0.09–47.88 | 45.45 | 4.27 | 8.84 | 0.27 | 0.29 | 0.03 | 97.9 | 0.062
λ = 0.2 | 5 | 0.05–23.94 | 51.75 | 5.24 | 9.06 | 0.25 | 0.25 | 0.02 | 97.4 | 0.068
λ = 0.4 | 2.5 | 0.03–11.97 | 54.92 | 4.38 | 8.88 | 0.44 | 0.22 | 0.02 | 97.9 | 0.062
λ = 0.8 | 1.25 | 0.02–5.99 | 61.06 | 5.03 | 9.09 | 0.29 | 0.19 | 0.01 | 96.7 | 0.077
λ = 1.6 | 0.625 | 0.01–3.00 | 66.37 | 6.98 | 8.76 | 0.23 | 0.19 | 0.01 | 96.8 | 0.074
λ = 3.2 | 0.3125 | 0.01–1.50 | 72.65 | 7.47 | 9.05 | 0.20 | 0.20 | 0.01 | 97.7 | 0.062

Figure 4.

Probability density functions for the six exponential distributions, with different hazard rates (λ), from which the foreperiods in the experiment were drawn. The functions have been plotted on a logarithmic scale to display both short and long foreperiods in the same figure. The short foreperiods are much more frequent in the higher hazard rate distributions (red, orange and yellow, respectively) than in the lower hazard rate distributions (green, blue and violet, respectively). The diamonds on the x-axis represent the mean waiting time for the exponential distribution in the corresponding colour.

We used a blocked design, where each session of 360 trials was divided into six equally long blocks representing the six expectancy conditions. The order of the blocks in a session was controlled across both sessions and participants. Specifically, in a given session, the block order was drawn without replacement from a list of 720 possible combinations of block orders, so that the same order was only run once. A written description of the condition was shown on the computer screen at the beginning of each block, and participants were instructed to actively use the cue as a reminder of the condition they were currently running. Every participant completed a full session as practice and then 12 experimental sessions yielding a total of 4320 experimental trials, 720 trials in each of the six conditions.
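The design counts in this paragraph can be reproduced directly. The sketch below enumerates the 720 possible block orders and draws orders without replacement; how the draws were shared across participants is not specified in the text, so the number of draws (13 sessions for each of six participants) is an assumption, as are all names.

```python
import itertools
import random

conditions = [1, 2, 3, 4, 5, 6]
# All possible orders of the six condition blocks (6! = 720).
all_orders = list(itertools.permutations(conditions))


def draw_block_orders(n_orders, rng):
    """Draw block orders without replacement, so that no order is repeated."""
    return rng.sample(all_orders, n_orders)


trials_per_session = 360
trials_per_block = trials_per_session // len(conditions)
experimental_sessions = 12
experimental_trials = experimental_sessions * trials_per_session
trials_per_condition = experimental_sessions * trials_per_block

# Assumption: one draw per session and participant (13 sessions x 6 participants).
orders = draw_block_orders(13 * 6, random.Random(0))
```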

(b). Results and discussion

From the participants' performance (full datasets for each of the participants can be found in the electronic supplementary material) in the task described, we estimated two key components of visual attention using TVA: the temporal threshold of conscious perception, t0, and the perceptual processing speed, v. The probability p of correct report could be approximated as an exponential function of the stimulus duration t (cf. equation (2.5)), where t0 measured the temporal threshold in seconds, and v measured the perceptual processing speed in letters per second at times t > t0 [1]. For every participant, separate parameter estimates for each experimental condition were obtained by a standard TVA-based maximum-likelihood method [32,33]. Note, however, that the data could also have been fitted using the Poisson counter model [18] previously described. The obtained maximum-likelihood fits were excellent, accounting for almost all variation in the observed individual mean scores, and correspondingly, the difference between observed and theoretical mean scores was remarkably small across all conditions (both goodness-of-fit measures averaged across participants for the different experimental conditions are summarized in table 1).
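To illustrate how t0 and v can be estimated from accuracy at the six exposure durations, the sketch below fits the exponential function of equation (2.5) by maximizing a binomial likelihood over a coarse parameter grid. This is a simplified stand-in, not the standard TVA-based procedure cited above: it treats each report as simply correct or not correct, ignores erroneous reports as a separate outcome, and uses hypothetical function names, grid ranges and synthetic data.

```python
import math


def p_correct(t, v, t0):
    """Equation (2.5); t and t0 in seconds, v in letters per second."""
    return 0.0 if t <= t0 else 1.0 - math.exp(-v * (t - t0))


def log_likelihood(v, t0, durations, n_trials, n_correct):
    """Binomial log-likelihood of the observed counts given v and t0."""
    eps = 1e-9  # keeps log() finite when p is 0 or 1
    total = 0.0
    for t, n, k in zip(durations, n_trials, n_correct):
        p = min(max(p_correct(t, v, t0), eps), 1.0 - eps)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def fit_v_t0(durations, n_trials, n_correct):
    """Grid search: v from 10 to 150 letters/s, t0 from 0 to 20 ms (0.5 ms steps)."""
    best = None
    for v_step in range(10, 151):
        for t0_step in range(0, 41):
            v, t0 = float(v_step), t0_step * 0.0005
            ll = log_likelihood(v, t0, durations, n_trials, n_correct)
            if best is None or ll > best[0]:
                best = (ll, v, t0)
    return best[1], best[2]


# Synthetic counts generated from known parameters (not data from the experiment).
durations = [0.010, 0.020, 0.030, 0.040, 0.060, 0.090]
n_trials = [120] * 6
n_correct = [round(120 * p_correct(t, 60.0, 0.009)) for t in durations]
v_hat, t0_hat = fit_v_t0(durations, n_trials, n_correct)
```

A grid search is used here only because it is transparent and needs no external optimizer; a gradient-based or simplex optimizer would normally be preferred for real data.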

The effect of the hazard rate manipulation on the estimated attentional parameters is shown in figure 5. As we hypothesized, increasing the temporal expectation of the participants through increasing hazard rates led to a strong and systematic increase in the perceptual processing speed v. This was evident in a repeated measures ANOVA with hazard rate condition (1–6) as within-subject factor, F(5,25) = 17.89, p < 0.001, but also at the individual level where all participants showed a pronounced speeding up of target processing from the lowest to the highest expectancy condition. Averaged across participants, the speed increased by almost 40% (M = 37.2, s.d. = 2.9), t(5) = 7.98, p < 0.001, d = 0.69. By contrast, the threshold of conscious perception, t0, was unaffected by differences in expectancy, staying constant at approximately 9 ms across all experimental conditions, F(5,25) = 0.50, p = 0.77.
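The relation between processing speed and the logarithm of the hazard rate can be examined with an ordinary least-squares line through the six group means of v listed in table 1. The sketch below is an illustrative re-analysis of those published means only; it is not the analysis reported by the authors, and the use of base-2 logarithms (so that the slope is expressed per doubling of the hazard rate) is a choice made here.

```python
import math

hazard_rates = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]            # per second
mean_v = [45.45, 51.75, 54.92, 61.06, 66.37, 72.65]      # letters per second (table 1)

x = [math.log2(lam) for lam in hazard_rates]
x_mean = sum(x) / len(x)
v_mean = sum(mean_v) / len(mean_v)

# Ordinary least-squares slope and intercept of v on log2(hazard rate).
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (vi - v_mean) for xi, vi in zip(x, mean_v))
slope = sxy / sxx                      # letters per second per doubling of the rate
intercept = v_mean - slope * x_mean

# Proportion of variance in the six means explained by the line.
ss_total = sum((vi - v_mean) ** 2 for vi in mean_v)
ss_residual = sum((vi - (intercept + slope * xi)) ** 2 for xi, vi in zip(x, mean_v))
r_squared = 1.0 - ss_residual / ss_total
```

Because the hazard rate doubles from one condition to the next, the conditions are equally spaced on the log scale, and the slope can be read as the gain in processing speed per doubling.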

Figure 5.

Effects of temporal expectancy (hazard rate condition) on (a) the mean perceptual processing speed v (letters s−1) and (b) the mean temporal threshold of perception t0 (ms) across participants. Hazard rate conditions are λ = 0.1 s−1 (violet), λ = 0.2 s−1 (blue), λ = 0.4 s−1 (green), λ = 0.8 s−1 (yellow), λ = 1.6 s−1 (orange) and λ = 3.2 s−1 (red). Error bars show standard errors of the means.

Averaged across all trial types, the probability of an erroneous response was as high as 0.22, which might have been expected when using very brief target exposures and asking participants to use all available information for their reports. Nonetheless, the participants tended to make more erroneous reports in the three lowest hazard rate conditions, leading to a significant effect of condition in a repeated measures ANOVA with hazard rate condition (1–6) as within-subject factor, F(5, 25) = 14.87, p < 0.001.

(i). Expectations across waiting times

In an ideal observer, using exponentially distributed foreperiods (with constant hazard rates) should lead to constant expectations, independent of the actual waiting times, within each experimental condition. Such results have previously been found [4], but, in the present experiment, we used a much broader range of waiting times, which might result in notable changes in processing speed across waiting times. To investigate this, we divided the trials from each hazard rate condition into quartiles based on their actual waiting times. We then refitted trials from each of the quartiles to derive four new estimates of processing speed per hazard rate condition. The coloured dots connected by lines in figure 6 show the resulting processing speeds plotted as a function of the mean foreperiod in the relevant quartile (the coloured diamonds on the left show the estimated processing speeds based on the full dataset, cf. figure 5a). One should bear in mind that the estimates depicted here are based on a mere 25% of the trials in the full dataset and are at the limits of what can be modelled reliably. Even though the connecting lines are not completely flat across the quartiles, the processing speeds were fairly constant across waiting times, in accordance with previous experiments [4]. These observations were substantiated by a repeated measures ANOVA with hazard rate condition (1–6) and quartile (1–4) as within-subject factors, which showed a clear difference between the six hazard rate conditions, F(5, 25) = 7.63, p < 0.001, but no changes in processing speed across waiting time quartiles, F(3, 15) = 0.73, p = 0.55, and also no differential influence of waiting time quartile in the different hazard rate conditions, F(15, 75) = 1.125, p = 0.35.
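The reasoning behind the ideal-observer prediction can be sketched numerically. The code below is an idealized illustration, assuming untruncated, continuous exponential foreperiods; the foreperiods actually presented may have been bounded or discretized, and the quartiles in the analysis were empirical rather than theoretical. It shows that the hazard rate of an exponential distribution does not depend on elapsed time, and it computes the theoretical quartile boundaries and the mean waiting time within each quartile. All function names are hypothetical.

```python
import math

def hazard(t, lam):
    """Hazard rate f(t) / (1 - F(t)) of an exponential distribution with rate lam."""
    pdf = lam * math.exp(-lam * t)
    survival = math.exp(-lam * t)
    return pdf / survival

def quartile_bounds(lam):
    """Waiting times (s) below which 25%, 50% and 75% of foreperiods fall."""
    return [-math.log(1.0 - q) / lam for q in (0.25, 0.5, 0.75)]

def quartile_means(lam):
    """Theoretical mean waiting time (s) within each of the four quartiles."""
    edges = [0.0] + quartile_bounds(lam) + [math.inf]

    def survival(t):
        return 0.0 if math.isinf(t) else math.exp(-lam * t)

    def partial(t):
        # Negative antiderivative of t * lam * exp(-lam * t); tends to 0 as t grows.
        return 0.0 if math.isinf(t) else (t + 1.0 / lam) * math.exp(-lam * t)

    return [(partial(a) - partial(b)) / (survival(a) - survival(b))
            for a, b in zip(edges, edges[1:])]

for lam in (0.1, 0.2, 0.4, 0.8, 1.6, 3.2):
    print(lam, [round(m, 2) for m in quartile_means(lam)])
```

Because each quartile holds a quarter of the probability mass, the four quartile means average to the overall mean foreperiod 1/λ.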

Figure 6.

Perceptual processing speeds over foreperiods. Coloured circles depict mean processing speeds v (letters s−1) in four quartiles based on the waiting times in each hazard rate condition. Processing speeds are plotted as a function of the mean waiting time in the corresponding quartile. Coloured diamonds on the left show the estimated processing speeds based on the full dataset (cf. figure 5a, for reference).

In general, the observed effects of expectancy on parameters v and t0, respectively, seem very robust. Nonetheless, to ensure that our findings are not driven by factors other than expectancy, we have refitted our data under a variety of circumstances taking into consideration that participants may not distribute their attention equally on the two possible stimulus locations and that they may blink more at longer foreperiods. These analyses and results can be found in appendix A. As neither unequal distribution of attention nor blinking per se could explain our results, we felt confident using our findings to further investigate the functional relationship between expectancy and processing speed.

(ii). The relationship between level of expectancy and speed of perceptual processing

Perceptual processing is speeded up by increasing temporal expectation, but what is the exact relation between the perceptual processing speed and the level of expectancy specified by the hazard rate of stimulus presentation? Figure 5 indicated that the processing speed can be described as an increasing, negatively accelerated function of the hazard rate. Interestingly, a trend analysis based on the previously reported ANOVA on the effect of hazard rate condition on speed of processing (cf. figure 5a) revealed a very strong linear trend of the observations, F(1, 5) = 40.88, p = 0.001, without hints of any higher order trends (all ps > 0.45). The trend analysis presupposes that the hazard rate conditions represent equally spaced quantitative steps along an underlying continuum, and this condition was satisfied in terms of a logarithmic scale, as is apparent from the semi-logarithmic plot of group processing speeds shown in figure 7. Corroborating the result of the trend analysis, the best linear fit to the semi-logarithmic plot in figure 7 accounted for a striking 99.4% of the variance of the observed values. The best-fitting linear function relating the processing speed v (letters s−1) to the natural logarithm of the hazard rate λ (s−1) of stimulus presentation is

$$v = a \ln \lambda + b,$$

where a = 7.67 and b = 63 letters s−1, such that, for example, v = 63 letters s−1 for a hazard rate of stimulus presentation at 1 s−1, whereas—by extensive extrapolation downwards—v should reach a value of zero at a hazard rate of stimulus presentation equal to 0.0003 s−1.
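As a quick check of the arithmetic, the sketch below evaluates the fitted function with the reported coefficients a = 7.67 and b = 63 letters s−1. It assumes only the log-linear form described in the text; the function names are hypothetical.

```python
import math

A = 7.67   # slope, letters/s per unit of ln(hazard rate)
B = 63.0   # intercept, letters/s at a hazard rate of 1 per second

def predicted_speed(lam):
    """Processing speed v (letters/s) predicted for hazard rate lam (1/s)."""
    return A * math.log(lam) + B

def zero_crossing():
    """Hazard rate (1/s) at which the extrapolated line reaches v = 0."""
    return math.exp(-B / A)

for lam in (0.1, 0.2, 0.4, 0.8, 1.6, 3.2):
    print(lam, round(predicted_speed(lam), 1))

print(round(zero_crossing(), 4))  # hazard rate at which v would reach zero
```

Each doubling of the hazard rate adds a · ln 2, or about 5.3 letters s−1, to the predicted processing speed, which is why the six conditions are equally spaced on the logarithmic axis.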

Figure 7.

Best linear fit to the mean estimated processing speed v (letters s−1) for each hazard rate condition plotted on a logarithmic scale. Conditions are λ = 0.1 s−1 (violet), λ = 0.2 s−1 (blue), λ = 0.4 s−1 (green), λ = 0.8 s−1 (yellow), λ = 1.6 s−1 (orange) and λ = 3.2 s−1 (red). Error bars show standard errors of the means. The linear fit accounted for 99.4% of the variance of the estimated v values.

In summary, our results support an account in which the perceptual speed of processing to a very good approximation can be described as a linearly increasing function of the logarithm of the level of expectancy given by the hazard rate.

(iii). Expectations, alertness and arousal in theory of visual attention-based studies

The effects of expectations and alertness on performance are not modelled explicitly by TVA [1] or NTVA [2,21] in their present forms, but in recent years a few studies [4,6,7,34] have directly investigated the relationship between these intensity aspects of attention and TVA-based measures of visual selectivity and capacity.

Matthias et al. [7] investigated the effects of a phasically alerting cue on perceptual processing capacity and spatial attention in three whole report experiments. They found that the cue transiently enhanced processing speed (though only at the shortest cue–target interval of 80 ms) but not storage capacity, as measured by TVA. This phasic enhancement of processing speed by the cue is apparently similar to the increase in processing speeds with higher expectations observed in this study. However, the use of a non-informative alerting signal is inherently different from our manipulation of temporal expectancy; extrinsic, alerting cues seem to trigger a fast and automatic process which has a short-lasting phasic influence on processing speed [7]. By contrast, we could speculate that manipulating temporal expectancies by changing the hazard rates of stimulus appearance may rely on an intrinsically generated process, which effectively leads to changes in the tonic levels of alertness.

Using a whole report paradigm, Finke et al. [34] examined the effects of two arousal-enhancing psychostimulants on TVA parameters of capacity. They found that both stimulants enhanced visual processing capacity and that one enhanced VSTM capacity but only in participants with low processing and storage capacities (i.e. below the median score). Thus, we might argue that increasing the tonic level of arousal using stimulants leads to the same effect on processing speed that we observed when modulating the level of expectancy using hazard rate manipulations. However, in our experiment, the effect of the hazard rate manipulation was present in all participants and not only for a subset with low processing speed. Evidently, more investigations are required to reliably link individual effects of alertness and arousal manipulations to TVA-based parameters of attention.

(iv). Expectations at the neural level

The neural interpretation of TVA, NTVA [2,21], provides a description of attentional selection of both objects and features at the single-cell level. According to NTVA, the encoding speed (v value) of a categorization depends on both the number of cortical neurons representing that categorization and the firing rates of those neurons. Here, the bias parameter βi is conceptualized as a scale factor that multiplies activations (firing rates) of cortical neurons coding for feature i (e.g. a particular letter type). Thus, multiplicative scaling of firing rates by change in perceptual biases (β values) may be the neural mechanism by which temporal expectations corresponding to particular hazard rates influence information processing [4].

Interestingly, electrophysiological studies in monkeys have demonstrated that expectations based on the statistics of stimulus appearance may result in very specific changes in visual representations even at the cellular level [35]. Indeed, the suggested hypothesis linking temporal expectations to scaling of firing rates is directly supported by studies showing that neural firing in feature-specific cortical regions can vary in ways that appear to reflect the monkeys’ temporal expectations indexed by the hazard function of stimulus presentation [36,37].

5. Conclusion and discussion

In equation (2.1) of TVA [1], the rate v(x, i) at which a particular visual categorization ‘x has feature i’ (e.g. ‘x is a letter of type i’) is encoded into VSTM is given by a product of three terms: the strength η(x, i) of the sensory evidence that x has feature i, the perceptual bias βi associated with feature i, and the relative attentional weight of object x. For single-stimulus identification, v(x, i) is just the product of η(x, i) and βi. The sensory evidence parameter η(x, i) is independent of attention, but the perceptual bias βi is assumed to increase with the temporal expectation of stimulus presentation, the subjective prior probability of being presented with a visual object with feature i, and the subjective importance of processing feature i. It is tempting to assume that, when both the prior probability of feature i and the subjective importance of processing feature i are kept constant, βi is an increasing function of the temporal expectation of stimulus presentation. In this parametric experiment, we varied the temporal expectation of stimulus presentation by manipulating the hazard rate of stimulus presentation. The results showed that the v(x, i) values estimated from maximum-likelihood fits to the data were increasing, strikingly linear functions of the logarithm of the hazard rate of stimulus presentation. Presumably, the effects on the v values were generated by changes in perceptual biases. Thus, the results suggest that for all letter types i, the perceptual bias βi was a linearly increasing function of the logarithm of the hazard rate of letter presentation. Furthermore, the neural mechanism by which this is realized may be a multiplicative scaling of the firing rates reflecting the perceptual bias.
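The argument can be illustrated with a small sketch of the rate equation as described in words above: v(x, i) is the product of η(x, i), βi and the relative attentional weight of object x. The value of η used here is purely hypothetical, and the mapping from hazard rate to β simply divides the fitted v by η, which is valid only under the presumption stated above that η does not change with expectancy. Function names are likewise hypothetical.

```python
import math

def encoding_rate(eta, beta, w_x, weights):
    """v(x, i) = eta(x, i) * beta_i * w_x / sum of the weights of all objects."""
    return eta * beta * w_x / sum(weights)

def beta_from_hazard(lam, eta, a=7.67, b=63.0):
    """Bias implied by the fitted log-linear v if eta is constant (an assumption)."""
    return (a * math.log(lam) + b) / eta

ETA = 100.0  # hypothetical sensory evidence, letters/s

for lam in (0.1, 0.2, 0.4, 0.8, 1.6, 3.2):
    beta = beta_from_hazard(lam, ETA)
    # Single stimulus: the relative attentional weight is 1, so v = eta * beta.
    v = encoding_rate(ETA, beta, 1.0, [1.0])
    print(lam, round(beta, 3), round(v, 1))
```

With η fixed, β inherits the log-linear dependence on the hazard rate from v; a different hypothetical η would rescale β without changing that form.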

Acknowledgements

The authors thank Camilla Funch Staugaard for valuable help in collecting the data.

Appendix A

The effects of expectancy on parameters v and t0, respectively, seem very robust. We have taken several measures to ensure that our findings are not driven by other factors. These include refitting the data under a variety of circumstances taking unequal distribution of attention and blinking into consideration.

(a) Control for effects of spatial location

Although we instructed our participants to pay equal attention to the two possible stimulus locations, they might have favoured one of the locations, giving stimuli presented there a processing advantage. Thus, we fitted data from all six participants using a model that included separate v values for the right and left sides, respectively. The estimated processing speeds were very similar to the processing speeds derived from the single v value model, and a repeated measures ANOVA with hazard rate condition (1–6) and location (left versus right) as within-subject factors corroborated the strong increase in processing speed for both locations with increasing expectancy; for the main effect of hazard rate condition, F(5, 25) = 17.58, p < 0.001. The ANOVA did not suggest any differences between the two locations (main effect of location, F(1, 25) = 0.16, p = 0.71), nor any interaction between condition and side, F(5, 25) = 1.05, p = 0.41.

(b) Control for blinking

When participants had to wait for more than just a few seconds, we expected them to blink occasionally during the foreperiod, even though they were instructed to avoid blinking. To control for effects of blinking, we monitored participants' blinking during the experimental trials using eye-tracking apparatus. Specifically, we wanted to explore the possibility that more blinks during longer waiting times could explain the slowing down of processing speed with lower hazard rates. The eye-tracking data from the six participants were quality checked, and trials with signal outage for more than 500 ms at any point during the trial were excluded (table 2). For the remaining, included trials, the observed probability of making a blink (‘blink trials’) at some point during the presentation of a target increased slightly with decreasing hazard rate (table 2). However, in a repeated measures ANOVA with hazard rate condition (1–6) as within-subject factor, Mauchly's test indicated that the assumption of sphericity had been violated, χ²(14) = 64.76, p < 0.001, and the Greenhouse–Geisser (ɛ = 0.21) corrected main effect of hazard rate condition was not significant, F(1.05, 5.25) = 3.27, p = 0.13.
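The degrees of freedom reported above follow from the design, as the short sketch below shows. It assumes a one-way repeated measures design with six conditions and six participants, as described in the text; the function names are hypothetical.

```python
def mauchly_df(k):
    """Degrees of freedom of Mauchly's sphericity test for k repeated conditions."""
    return k * (k - 1) // 2 - 1

def greenhouse_geisser_df(epsilon, k, n):
    """Corrected (effect, error) degrees of freedom for k conditions, n participants."""
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    return epsilon * df_effect, epsilon * df_error

k, n, epsilon = 6, 6, 0.21
df_effect, df_error = greenhouse_geisser_df(epsilon, k, n)
print(mauchly_df(k), round(df_effect, 2), round(df_error, 2))
```

The estimate ɛ = 0.21 lies close to the lower bound 1/(k − 1) = 0.2, which is why the corrected degrees of freedom are so much smaller than the uncorrected 5 and 25.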

Table 2.

Mean processing speed, threshold of perception and number of blinks for included trials. v, perceptual processing speed (letters s−1); t0, threshold of conscious perception (ms); target blinks, probability of a blink during the presentation of the target; excluded trials, proportion of trials excluded due to insufficient eye-tracking data. Target blinks are based on included trials, whereas v and t0 are based on included trials without blinks.

hazard rate (s−1)   v, M (s.e.)     t0, M (s.e.)   target blinks, M (s.e.)   excluded trials, M (s.e.)
λ = 0.1             49.77 (3.76)    9.13 (0.19)    0.04 (0.02)               0.13 (0.09)
λ = 0.2             53.74 (4.83)    9.09 (0.26)    0.03 (0.02)               0.10 (0.06)
λ = 0.4             57.07 (3.78)    8.94 (0.46)    0.03 (0.02)               0.07 (0.04)
λ = 0.8             62.55 (4.67)    9.15 (0.30)    0.01 (0.00)               0.05 (0.04)
λ = 1.6             67.10 (6.82)    8.85 (0.26)    0.01 (0.00)               0.05 (0.03)
λ = 3.2             72.88 (7.36)    9.03 (0.21)    0.00 (0.00)               0.04 (0.03)

When refitting the data from all of the included ‘no-blink’ trials, estimates of processing speed and threshold of perception in each hazard rate condition (table 2) closely matched the estimates based on the original, full dataset (figure 5a). In line with this, repeated measures ANOVAs with hazard rate condition (1–6) as within-subject factor corroborated our previous findings of a strong and highly significant effect of expectancy on processing speed v, F(5, 25) = 11.90, p < 0.001, but no effect of expectancy on the threshold of perception t0, F(5, 25) = 0.38, p = 0.86.
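As an informal check that the log-linear relationship also holds for the no-blink estimates, the sketch below fits an ordinary least-squares line to the mean v values in table 2 as a function of ln λ. This is not an analysis reported by the authors, and because it uses the no-blink means rather than the full dataset, the coefficients need not equal a = 7.67 and b = 63. Function and variable names are hypothetical.

```python
import math

hazard_rates = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]              # 1/s
mean_speeds = [49.77, 53.74, 57.07, 62.55, 67.10, 72.88]   # letters/s, table 2

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

log_rates = [math.log(lam) for lam in hazard_rates]
slope, intercept, r_squared = fit_line(log_rates, mean_speeds)
print(round(slope, 2), round(intercept, 2), round(r_squared, 3))
```

On these means the fitted slope is roughly 6.6 and the intercept roughly 64 letters s−1, with the line accounting for about 99% of the variance in the six values.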

The study was carried out in accordance with the ethical principles of the World Medical Association (Declaration of Helsinki), and written informed consent was obtained for all participants before entering the study.

Funding statement

The experiments were supported by grants from the University of Copenhagen Programme of Excellence and the Danish Council for Independent Research (DFF) research career programme Sapere Aude.

References


Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
