Philosophical Transactions of the Royal Society B: Biological Sciences. 2013 Oct 19;368(1628):20130053. doi: 10.1098/rstb.2013.0053

Attentional selection in visual perception, memory and action: a quest for cross-domain integration

Werner X Schneider 1,2,3, Wolfgang Einhäuser 1,4, Gernot Horstmann 1,2,3
PMCID: PMC3758196  PMID: 24018715

Abstract

For decades, the cognitive and neural sciences have benefitted greatly from a separation of mind and brain into distinct functional domains. The tremendous success of this approach notwithstanding, it is self-evident that such a view is incomplete. Goal-directed behaviour of an organism requires the joint functioning of perception, memory and sensorimotor control. Attentional processes are a prime candidate for achieving integration across these functional domains. Consequently, this Theme Issue brings together studies of attentional selection from many fields, both experimental and theoretical, that are united in their quest to find overarching integrative principles of attention between perception, memory and action. In all domains, attention is understood as a combination of competition and priority control (‘bias’), with the task as a decisive driving factor to ensure coherent goal-directed behaviour and cognition. Using vision as the predominant model system for attentional selection, many studies of this Theme Issue place special emphasis on eye movements as a selection process that is both a fundamental action and serves a key function in perception. The Theme Issue spans a wide range of methods, from measuring human behaviour in the real world to recordings of single neurons in the non-human primate brain. We firmly believe that combining such a breadth of approaches is necessary not only for attentional selection, but also to take the next decisive step in all of the cognitive and neural sciences: to understand cognition and behaviour beyond isolated domains.

Keywords: attention, biased competition, task, real world, priority, vision

1. Functional domains of mind and brain: a brief historical sketch and the issue of integration

Since the earliest origins of mind and brain research, the decomposition of the mind into functional domains or ‘modules’ has been both advocated and debated. For instance, Aristotle carefully distinguishes modality-specific early sensation from integrated perception [1]. In modern times, the notion of distinct functional domains within the brain became popular with the work of Franz Josef Gall in the early nineteenth century. While his ‘phrenology’, the inference of mental function from skull shape, is justly discredited, the quest for localizing mental function in the brain continues to this day, in particular in the context of non-invasive imaging. The success of functional localization mainly resulted from loss-of-function studies in the mid-nineteenth century, such as the famous works by Broca and Wernicke for the case of language generation and understanding, respectively, as well as from detailed investigations by anatomists such as Flechsig, Vogt or Brodmann, whose cytoarchitectural map of the cortex is still in use today.

The rapid progress in brain research that started in the early nineteenth century was mirrored in the development of psychology during the second half of the twentieth century. During the 1960s and 1970s, the so-called cognitive revolution changed the study of the mind considerably [2]. Cognitive psychology gradually replaced behaviourism as the dominant intellectual paradigm in psychology. The mind as such and its functional domains, such as perception, memory, reasoning or motivation, again became respectable scientific topics. The language of information processing provided a new way to characterize mental operations [3]. Importantly, the cognitive revolution not only gave birth to a new psychological science, but also reshaped parts of linguistics, artificial intelligence and philosophy. The common language of information processing and the viewpoint that mind and brain can be studied independently (in analogy to the distinction between software and hardware in computer science) created the new interdisciplinary field of cognitive science. Cognitive neuroscience, which emerged as a new and truly interdisciplinary field in the 1980s and 1990s, presented the next decisive step towards understanding brain and mind [4]. Bringing together researchers from many disciplines, cognitive neuroscience started to break the doctrine of distinguishing the brain's ‘hardware’ from its ‘software’. The various functions and subfunctions of the mind are understood as properties of the brain. Given the new experimental methods of cognitive neuroscience (e.g. patient studies, single cell recordings and imaging methods), great progress has been made in discovering and revising functional domains at several levels. Nevertheless, one key issue that has haunted research on mind and brain from its earliest origins remains to be solved: how might these functional domains interact in generating behaviour and cognition?

Why is it important to address integration across domains explicitly and why now? A possible stance on the huge amount of scattered domain-specific knowledge could be that once we know enough about each functional domain, integration will be a natural and inevitable consequence. We firmly believe that this prediction will not turn out to be true. The history of experimental research on attention is an example of why we need an explicit approach to the issue of cross-domain integration. During the early phase of cognitive psychology, the unitary, ‘single-resource’ character of attentional processes was emphasized [5]. During the past decades, an opposite movement towards a diversification of attention has become dominant [6,7]. In the latter view, attention is considered a family label of different processes that are involved in competition and priority control [8]. This belief in the variety of attentional functions has dominated cognitive neuroscience since its beginnings [6], and resulted in a widespread lack of interest in overarching features of various attentional processes (for exceptions, see [7,8]). For instance, attentional processes in perception have mainly been studied without regard to the corresponding processes in action control or in memory. This strategy has resulted in detailed theories and successful models, for example, for attention during visual search, during rapid object detection and recognition, and for gaze control in free-viewing. Integrative approaches, however, have remained scarce [8] and are either restricted in terms of domains (e.g. perception and action only, see [9–11]) or described at a relatively abstract level [7]. In summary, the history of attention research so far demonstrates that across-domain integration is not an obligatory consequence but may require concerted actions by various research fields. This Theme Issue makes one attempt in this direction.

2. Attention as biased competition: an approach to cross-domain integration of visual perception, memory and action

This Theme Issue is guided by the fundamental notion that attentional processes are good candidates for linking functional domains. Issues of selectivity, competition and priority control, the hallmarks of attention [8], are present in every domain. Following the ‘biased competition approach’ to attention [8,12–16], competition means that domain-specific neural representations (e.g. ‘object’ representations in the visual modality) are characterized by limited capacity on the one hand and its counterpart, selectivity, on the other hand. Only a few of these representations can be simultaneously active and may thus become consciously available or control actions at any given point in time. Priority control (‘bias’ in Desimone & Duncan's terminology [12]) implies that selection among competing representations does not occur at random. Instead, selection is guided by current top-down factors such as task or intention and by bottom-up factors such as the ‘salience’ (intrinsic quality) of a stimulus representation. Top-down and bottom-up factors are combined through their shared input space (e.g. the location in the visual field), and their combination is typically referred to as ‘priority’ [17,18].
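The combination of top-down and bottom-up factors over a shared input space, followed by capacity-limited selection, can be illustrated with a minimal sketch. The weighted sum, the weight `w_top_down`, the example values and the function names `priority_map` and `select` are assumptions made for illustration; the text above does not commit to a particular combination rule.

```python
def priority_map(salience, relevance, w_top_down=0.7):
    """Combine bottom-up salience and top-down task relevance per location.

    Both inputs are lists indexed by the same locations (the shared input
    space). The weighted sum is an illustrative assumption, not a claim
    about how the brain combines the two signals.
    """
    if len(salience) != len(relevance):
        raise ValueError("salience and relevance must cover the same locations")
    return [
        w_top_down * r + (1.0 - w_top_down) * s
        for s, r in zip(salience, relevance)
    ]


def select(priority, capacity=1):
    """Return the indices of the `capacity` locations with highest priority.

    The hard cut-off stands in for limited capacity: only a few
    representations win the competition at any one time.
    """
    ranked = sorted(range(len(priority)), key=lambda i: priority[i], reverse=True)
    return ranked[:capacity]


# Hypothetical display with four locations.
salience = [0.9, 0.2, 0.4, 0.1]   # location 0 is the most conspicuous
relevance = [0.0, 1.0, 0.3, 0.0]  # location 1 matches the current task

task_driven = select(priority_map(salience, relevance, w_top_down=0.7))
stimulus_driven = select(priority_map(salience, relevance, w_top_down=0.0))
```

With these hypothetical values, a strong task bias favours the task-relevant location, whereas removing the bias lets the most salient location win.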

Moreover, in terms of measuring attentional processes in various domains, great progress has been made during the past decades. Besides the powerful new cognitive neuroscience techniques such as studying patients with brain lesions and diseases in highly controlled experimental settings [19], measuring brain activation by imaging [7] or single unit recordings in non-human primates that perform complex tasks [13,18], great progress has also been made at the behavioural level, namely by measuring eye movements as a proxy for covert visual attention [20]. Nowadays, the efficient and highly precise measurement of temporal and spatial eye-movement parameters [21] has become an increasingly popular way to study attention. In this Theme Issue, eye movements as an overt index of visual selection are prominently featured in various papers [22–28] and their relation to covert shifts of attention is explicitly addressed [29,30].

In the light of the large number of functional domains and the enormous amount of knowledge on selection, competition, selectivity and priority control in each of these individual domains, this Theme Issue deliberately restricts itself by imposing additional constraints beyond the fundamental notion of attentional across-domain integration by biased competition. We therefore focus on vision (integration of biased competition in visual perception, visual memory and visual action), on task-driven control of attentional integration and on real-world stimuli, scenes and behaviour as a paradigmatic case, in which attentional integration is evidently required.

(a). Integration of biased competition in visual perception, memory and action

Our focus on visual perception, memory and action has several reasons. First, during the past decades, our knowledge about visual information processing has grown tremendously [31,32]. This enormous progress has been based on experimental paradigms for studying specific functional domains such as visual attention [33], object recognition [34], visual short-term memory [35] or visual-based sensorimotor control [24,36]. The wealth of research paradigms and empirical data makes research within the visual modality an optimal vantage point for further analysis. Second, we choose vision as a model system because such a research focus allows us to tackle one of the greatest challenges of interdisciplinary research, namely variability in terms of methods, spatial and temporal scales, and theoretical languages. The complete aforementioned toolkit of research methods (electrophysiology, functional imaging, lesion studies and advanced behavioural methods) is accepted and used in all domains of vision research. Moreover, shared theoretical tools such as mathematical and computational modelling (e.g. neural networks, connectionist networks, image processing models) vastly ease communication across domains. To conclude, vision provides the tools and terminology on which an integrated view will be built, and in turn, vision serves as a prime test bed for experimental validation.

Third, a substantial and increasing body of experimental work exists already that investigates how attentional processes might link visual perception, memory and action. These studies have focused, for instance, on how attention and working memory processes could interact [30,37,38], on how covert visual attention in perception and motor action selection may be coupled [39–41] or on how retrieval from long-term memory and action selection might be linked [42].

(b). Task-driven control of attentional cross-domain competition

Biased competition in functional domains has to be integrated in order to achieve coherency in goal-directed behaviour and cognition. We suggest that common priority signals from the current task play a key role in this integration. At any given point in time, there should always be one unique task (or intention, action plan) at the highest level of control in mind and brain, and thus in control of attention [11,43–45]. Even when several tasks are seemingly being carried out in parallel, a common action plan (i.e. a common task) may be active at the highest control level [11].

What is a task? We are committed to a relatively broad working definition. A task consists, on the one hand, of goal states and on the other hand of ‘stimuli’, actions (responses) and connecting regularities [23]. Goal states, sometimes also called intentions, define a reference value or set point. If a particular reference value is selected as currently ‘being-in-charge’ for controlling mind and brain of an agent or organism, then the system will attempt to realize the corresponding state by (motor) activities [46–49]. To reach the reference value, task-specific information for priority control and biasing of competition is required, namely a description of relevant objects, events and actions [10,43,44]. References to stimuli and actions, as well as regularities connecting them, are also important ingredients of tasks [50]. Examples of such regularities may be ‘if-context-X-then-perform-operation-Y’ statements [45]. Which networks in the brain of primates may represent the ‘task-in-charge’? Experimental evidence points to a central role of the prefrontal cortex (PFC [45,50–52]). Large lesions of the PFC lead to the ‘environmental dependency syndrome’ [48,52], which is characterized by a control mode of mind and brain in which external events win the competition more often than internal and temporally extended goals.

The assumption of ‘one-task-in-charge’ has to be supplemented by the assumption that important environmental events outside the current task can control attentional selection by changing the current task [53]. For instance, if you are reading a book in your room and you smell a fire, then it is highly probable that the current task of reading will be immediately replaced by the new task of locating the fire and escaping from it. This capability may be called ‘behavioural flexibility’, the ability to quickly adapt to the changing demands of the environment. Again, the PFC is critical for this function [23].
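The working definition of a task, together with the ‘one-task-in-charge’ assumption and its replacement by important external events, can be sketched as a small data structure. Everything named here (`Task`, `Controller`, the rule tables and the example contexts) is hypothetical and chosen only to mirror the reading-and-fire example; it is a reading aid under these assumptions, not a model proposed in this Theme Issue.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task as a goal state plus relevant objects and if-then regularities."""
    goal: str
    relevant_objects: set = field(default_factory=set)
    # Regularities of the form 'if-context-X-then-perform-operation-Y'.
    rules: dict = field(default_factory=dict)

    def operation_for(self, context):
        """Return the operation linked to a context, or None if there is none."""
        return self.rules.get(context)


class Controller:
    """Holds exactly one task in charge at any point in time."""

    def __init__(self, task, urgent_events):
        self.task_in_charge = task
        # Hypothetical table of events that replace the current task.
        self.urgent_events = urgent_events

    def perceive(self, event):
        """Switch tasks on urgent events, then respond under the task in charge."""
        if event in self.urgent_events:
            self.task_in_charge = self.urgent_events[event]
        return self.task_in_charge.operation_for(event)


reading = Task("finish the chapter", {"book"}, {"end of page": "turn page"})
escaping = Task(
    "reach safety",
    {"fire", "exit"},
    {"smell of fire": "locate fire", "fire located": "leave room"},
)

controller = Controller(reading, urgent_events={"smell of fire": escaping})
first = controller.perceive("end of page")     # handled by the reading task
second = controller.perceive("smell of fire")  # replaces the task in charge
third = controller.perceive("end of page")     # no longer task-relevant
```

In this sketch the same event ("end of page") triggers an operation only while the reading task is in charge, which is one simple way to express that the current task determines what wins the competition.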

(c). Real-world stimuli, scenes and tasks: a paradigm case for integration of domains by competition and priority control

The need for integration across functional domains becomes most evident when dealing with natural settings and real-world tasks. Traditional psychophysics has, by contrast, emphasized well-controlled (i.e. simple) stimuli and thereby mirrored the reductionist approach in the structure of its paradigms. Frequently, such research implicitly or explicitly assumes that complex processing will eventually be understood by combining results from simple stimuli [54]; this implies, however, linear processes, which seem to be in sharp contrast to the highly nonlinear nature of perceptual and cognitive processing. In the context of attentional selection, dealing with more naturalistic settings, and in particular the question of to what extent results from ‘classical’ experiments under more constrained conditions transfer to the real world, has therefore become a research topic of increasing interest.

Given the tight link between gaze and covert attention [10,20], tracking eye movements and measuring gaze allocation present one of the most promising paradigms for studying attentional processes in real-world vision. For a long time, experiments in this area were restricted to constrained laboratory settings, introducing biases [55], often neglecting head and body movements, and providing limited information for real-world situations [56]. Only recently, with the advent of powerful wearable eye trackers and virtual reality technology, have real-world tasks and stimuli been combined with less and less constrained settings. Although pioneering experiments considered somewhat restricted domains, such as sports [30,57], driving [58] and food preparation [31,59], the most recent developments move towards free movement in real [60] and virtual [61] streets, or in similarly ‘natural’ environments for the observer. The key challenge for such endeavours is to allow the experimenter sufficient control over the experimental setting, without compromising the realism of the task and task set. Only when attention is understood, as proposed throughout this Theme Issue, as a combination of priority control and competition within and across domains, and on the basis of a solid theoretical and modelling framework [45,62–67], can quantitative hypotheses be formulated that eventually allow both a fully realistic scenario (task and environment) and sufficient experimental control. In this Theme Issue, we therefore use real-world tasks and natural stimuli as paradigmatic cases for integration of attentional selection across domains [22,26,28].

3. This Theme Issue at a glance

In this Theme Issue, we bring together data and models from a large variety of fields, including theoretical modelling, classical psychophysical experiments across the domains of perception, memory and action, real-world perception and action, as well as monkey electrophysiology.

The Theme Issue opens with a review by Humphreys et al. [68] on action-related attention, showing converging evidence from patients and healthy participants for a pre-attentive coding of action relations. This provides an important constraint on formal theories of attention, namely the requirement to include affordances, even if seemingly perceptually complex, in the guidance of attention. In this spirit of action-related attention, Flanagan et al. [24] test how eye movements depend on the task in action observation. They find that proactive gaze behaviour, similar to that which prepares one's own actions, is elicited if and only if the evaluation of a mechanical event (judging the weight of an object lifted by someone else) is required, when compared with observing the visually identical situation with the task of predicting the choice of an item. Rolfs et al. [69] investigate whether reach preparation affects visual processing in a way similar to the preparation of an eye movement. Indeed, they find that orientation-discrimination performance is better and apparent contrast higher at the reach target. However, unlike for eye movements, these effects show a distinct temporal evolution, suggesting two distinct mechanisms: one for performance benefits, which in this view are linked to movement preparation, and one for visual appearance, which is linked to priority. Together, these papers make a case for effects of action and action planning on attentional selection and/or the resulting perception.

Theeuwes [70] provides a detailed review of the literature on feature-based attention and argues that there is little evidence for endogenous, top-down control in feature-based attention. The review thus advocates the view that all feature-based attention could be explained fully by bottom-up priming, in contrast to the predominant role of top-down control in spatial attention, which is the focus of many other studies in this Theme Issue.

Vangkilde et al.'s [71] article first provides a compact review of recent developments around Bundesen's theory of visual attention (TVA [72]), which is the basis of several papers in this Theme Issue, as it provides a natural link between selection in perception and short-term memory. Vangkilde et al. include temporal expectations in the theory, and verify its predictions with new experimental data. Because parameters related to temporal expectancy turn out to be simple (linear) functions of the model's internal parameters, the article naturally extends the scope of TVA to relevant experimental parameters in the temporal domain. Finke et al. [73] apply TVA to disentangle the attention and memory processes impaired in Alzheimer's disease (AD) from one another. They find that competitive attentional selection is impaired very early in AD and, based on these data, suggest that initial phases of AD should be understood as an ‘attentional weighing deficit’ rather than a deficit in memory per se. Models of attention such as TVA typically focus on attention deployment within a single fixation, whereas there is little theoretical work on the relation between memory and attention across an eye movement. Rooted in TVA, Schneider [30] proposes a complementary approach to model effects of attention and working memory across competition episodes. The novel model (task-driven visual attention and working memory; TRAM) unifies in a single theory a series of experiments in attention and memory, in particular in the context of the attentional blink, which so far have been modelled largely in isolation. Together, these papers exemplify the potential of formal theories of attention, in particular TVA and its descendants, for explaining a large variety of phenomena across domains.
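For orientation, the two core equations of TVA in their conventional notation are reproduced below. They come from Bundesen's original formulation of the theory rather than from this editorial, and they say nothing about how the papers summarized above extend the model.

```latex
% Rate of the categorization "object x has feature i":
%   \eta(x,i) : sensory evidence that x has feature i
%   \beta_i   : perceptual decision bias for feature i
%   w_x       : attentional weight of object x; S is the set of objects in the display
v(x,i) = \eta(x,i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z}

% Attentional weight of object x:
%   \pi_j : pertinence (current importance) of category j; R is the set of categories
w_x = \sum_{j \in R} \eta(x,j)\,\pi_j
```

Read this way, the ratio of attentional weights expresses competition between objects, while the pertinence and bias parameters carry the priority signals.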

Hollingworth & Hwang [74] directly test the relationship between visual working memory and attention, and provide evidence for a dual state of working memory. Only items that are immediately relevant for a task are retained in an active memory state, whereas non-immediately relevant items are stored in a more passive form of representation. Only the active representation influences visual selection and search, even if retrieval performance for actively and passively stored memories is similar.

A series of articles in this Theme Issue address whether and how results and models from the laboratory transfer to more realistic scenarios and tasks. 't Hart et al. [29] address whether the link between covert and overt attention holds for natural stimuli. They find that the fixation probability on an object during prolonged viewing correlates with its probability of being detected in a rapid serial visual presentation sequence, thereby relating overt attention in space to covert attention in time. Zelinsky et al. [28] present a model of attentional guidance that uses categorical information, rather than information about a particular exemplar, as a target template. By combining machine learning techniques with a model of guidance, the model correctly predicts not only present/absent judgements, but also gaze shifts of human observers viewing the same displays. Thereby, the model extends models of template search to the more naturalistic search mode, in which no exact template, but only the category of the natural target object, is given. Diaz et al. [22] show that intercepting a ball after a bounce, and the smooth pursuit eye movement associated with this task, are not measurably affected by the occlusion of the ball's trajectory after the bounce event, providing evidence for a dominant role of memory in this task. Tatler et al. [26] address the link between memory and action through attention for a realistic task. They show that the priority given to natural objects in a real-world task (tea-making), both for gaze allocation and for memorization, is modulated by whether an observer is actually performing the task or merely watching it. Specifically, task-relevant items are fixated longer and their position is remembered better, if and only if the observer is actively engaged in the task.
The benefit for position memory still holds if the observer is moving through the real-world setting without manipulating the objects (when compared with watching head-centred recordings), whereas the gaze preference requires active object manipulation. Toscani et al. [27] investigate how the sampling strategy resulting from the scene affects the perception of its basic physical quantities, such as lightness. They show that the lightness perception of two physically identical stimuli influences sampling by eye movements, and in turn, this sampling strategy modulates lightness perception. Following segmentation of a stimulus into target and occluder, fixations preferentially land on the target and thereby modulate lightness perception, and both fixations and perception are similarly affected if segmentation is no longer possible. In summary, these papers exemplify the integration of selection across domains for naturalistic situations: gaze is a valid proxy for attention, models of gaze guidance transfer to natural scenarios, memory plays an important role for gaze, as does active engagement in a task, and finally, selection through eye movements is an integral part of the perception of basic physical properties of a scene.

The Theme Issue concludes with three electrophysiological studies that provide some of the neural basis for the theories, concepts and behavioural data discussed above. Mirpour & Bisley [25] address the issue of avoiding irrelevant distractors in visual search by recording local field potentials (LFPs) in the lateral intraparietal area of the macaque monkey. They find that a potential target that has previously been fixated (i.e. is now known to the animal to be a non-target) has greater LFP power in the alpha and low beta band, indicating an active top-down suppression of potential targets that have been identified as non-targets in the present trial. As such, the paper provides a substrate for the involvement of memory in modulating current perceptual and attentional processing. Everling & Johnston [23] address the role of the lateral PFC in modulating goal-directed behaviour. Contrary to the standard view, which presumes a role of PFC in suppressing unwanted behaviour, they review recent evidence in favour of PFC's role as a facilitator of goal-directed saccades. This evidence draws, in particular, on the primarily excitatory connections of the PFC to the superior colliculus in the macaque. In this view, PFC facilitates goal-directed behaviour and plays a decisive role in implementing and maintaining the task set. Heitz & Schall [75] provide a critical review of the stochastic accumulator framework to model speed–accuracy trade-offs. While they confirm that stochastic accumulator models provide a quantification of behaviour at large in the non-human primate, they provide compelling evidence that a one-to-one mapping of the model to neural activity falls short. Instead, they propose a multi-stage accumulator model that is consistent with both the behavioural and the currently available neuronal data.
Together, the three electrophysiological papers mirror the themes of the whole issue, cross-domain integration and the task as a control factor: memory of the value of previously visited potential targets, implementation and maintenance of task rules and the active control of action selection processes based on perceptual information.
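The speed–accuracy trade-off that stochastic accumulator models are meant to capture can be illustrated with a deliberately simple sketch. The code below is a single-stage random-walk accumulator with symmetric decision thresholds; it is not the multi-stage model proposed by Heitz & Schall [75], and the function names, drift, noise and threshold values are assumptions chosen for illustration only.

```python
import random


def run_trial(rng, drift, threshold, noise_sd=1.0, max_steps=100_000):
    """Accumulate noisy evidence until it crosses +threshold or -threshold.

    Returns (correct, steps): crossing the upper bound counts as the correct
    choice, and the number of steps stands in for response time.
    """
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += drift + rng.gauss(0.0, noise_sd)
        if evidence >= threshold:
            return True, step
        if evidence <= -threshold:
            return False, step
    return False, max_steps  # no decision within the step limit


def summarize(threshold, drift=0.1, n_trials=400, seed=1):
    """Return (accuracy, mean number of steps) for a given threshold."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    outcomes = [run_trial(rng, drift, threshold) for _ in range(n_trials)]
    accuracy = sum(correct for correct, _ in outcomes) / n_trials
    mean_steps = sum(steps for _, steps in outcomes) / n_trials
    return accuracy, mean_steps


fast = summarize(threshold=5.0)      # low threshold: emphasis on speed
careful = summarize(threshold=20.0)  # high threshold: emphasis on accuracy
```

Under these assumptions, raising the threshold should yield slower but more accurate decisions, which is the behavioural signature such models are used to quantify.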

Acknowledgements

This Theme Issue resulted from the research group ‘Competition and priority control in mind and brain: new perspectives from task-driven vision’ at the ‘Center for Interdisciplinary Research’ (ZiF) in Bielefeld, Germany. The ZiF is Bielefeld University's Institute of Advanced Studies. Working with this ZiF research group was rewarding in many respects, both scientifically and personally. We are very grateful to all members of the ZiF groups and the numerous other researchers who came to the ZiF and contributed to this great research year 2012–2013 (see http://www.uni-bielefeld.de/%28en%29/ZIF/FG/2012Priority/). We are very grateful to Britta Padberg and her team for the splendid hospitality during this year. The papers of this Theme Issue are based on invited contributions to the opening conference of the ZiF research group in October 2012. Additional support for hosting the opening conference was provided by the ‘Cluster of Excellence Cognitive Interaction Technology (CITEC)’. We especially thank the reviewers for this issue (Stefanie Becker, James Bisley, Eli Brenner, Claus Bundesen, John Duncan, Stefan Everling, John Findlay, Kathrin Finke, Randy Flanagan, Rebecca Förster, Karl Gegenfurtner, Marius 't Hart, Mary Hayhoe, Arvid Herwig, Steffen Klingenhöfer, Arni Kristjannson, Søren Kyllingsbæk, Abdeldjallil Naceri, Maria Nordfang, Antje Nuthmann, Christian Olivers, Marc Pomplun, Ian Robertson, Martin Rolfs, Jeff Schall, Thomas Schenk, Ben Tatler, Jan Theeuwes, Signe Vangkilde, Quasim Zaidi, and Greg Zelinsky) for their high-quality reviews and quick turn-around times, which were essential for the issue's quality and timeliness.

References


Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
