Author manuscript; available in PMC 2014 Sep 1. Published in final edited form as: Proc IEEE Inst Electr Electron Eng. 2013 Sep;101(9):2058–2067. doi: 10.1109/JPROC.2013.2265801

Visual Attention and Applications in Multimedia Technologies

Patrick Le Callet 1, Ernst Niebur 2
PMCID: PMC3902206  NIHMSID: NIHMS537311  PMID: 24489403

Abstract

Making technological advances in the field of human-machine interactions requires that the capabilities and limitations of the human perceptual system be taken into account. The focus of this report is an important mechanism of perception, visual selective attention, which is becoming increasingly important for multimedia applications. We introduce the concept of visual attention and describe its underlying mechanisms. In particular, we introduce the concepts of overt and covert visual attention, and of bottom-up and top-down processing. Challenges related to modeling visual attention and to validating these models against ad hoc ground truth are also discussed. Examples of the usage of visual attention models in image and video processing are presented. We emphasize multimedia delivery, retargeting and quality assessment of image and video, medical imaging, and applications to stereoscopic 3D images.

Index Terms: Visual system, video signal processing, multimedia systems, image analysis, image processing, image communication, image coding, stereo vision

I. INTRODUCTION

Selective attention is nature’s answer to a problem that is present in all but the simplest organisms and increasingly also in machines: information overload. To work efficiently in a variety of complex environments, animals and machines are equipped with an array of sensors, all of which are needed in one situation or another to assure survival of the animal or proper function of the machine. In any given situation, however, only a subset of the sensory input is needed, and it would be wasteful (and in many cases practically impossible) to process all sensory input at all times. Therefore, a selection has to be made of which sensors are relevant at a given time, and only information provided by those is allowed access to central processing resources. Frequently, even the input stream from one sensor may be overwhelmingly rich. For instance, all visual input to the human brain1 is provided by about 10^6 retinal ganglion cells per eye. Assuming a maximal firing rate of these neurons of about 100 Hz, and on the order of one bit per spike, results in a channel capacity of about 100 Mbits per second per eye. Indeed, analyses of spike train statistics of visual input to the brain in primates [1], carnivores [2] and insects [3] confirm that the rate of the transmitted information is within an order of magnitude of the channel capacity. This torrent of information cannot be, and does not have to be, processed in detail. Instead, only a fraction of the instantaneously available information is selected for detailed processing while the remainder is discarded.

The filtering process is called selective attention, and its mechanisms have been studied systematically for well over a century [4]–[6]. The first, parallel stages of sensory processing are followed by a bottleneck that restricts the amount of information allowed to proceed to more central processing stages [7], [8]. Information processing in these later stages occurs sequentially rather than in parallel. This allows the application of powerful algorithms to the selected parts of the input that would be too costly to apply to the entire sensory input.

For instance, search for a “singleton” target (one distinguished from distractors by a single feature, e.g., its color) is usually a parallel process (with search times nearly independent of the number of distractors), while search times for “conjunctive” targets (which can be distinguished from distractors only by considering more than one feature, e.g., color and orientation) increase linearly with the number of distractors, suggesting a serial search. Treisman and colleagues argue in their Feature Integration Theory [9] that identification of conjunctive targets requires binding their various features into a coherent object, a task that cannot be performed by elementary feature maps but requires the resources of a more powerful attentional mechanism. This mechanism is not available in parallel for the whole visual field but needs to be applied sequentially. A more differentiated view of visual search has emerged since Treisman’s original theory, e.g., refs. [10]–[13], but it is generally accepted that visual processing consists of a parallel stage that is fast but relatively simple, followed (if the task requires it) by the application of a more powerful mechanism that is applied sequentially to one (or possibly a few) parts of the visual input. Exploitation of this limitation of the human visual system is the basis for the multimedia applications that are the topic of this paper.

In Section II, we discuss mechanisms of selective attention in primate vision and existing computational models. In Section III, we focus on selected multimedia applications, without aiming for exhaustiveness, and we conclude in Section IV.

II. VISUAL ATTENTION MECHANISMS AND COMPUTATIONAL MODELS

In this section, we introduce detailed computational models of selective attention and some of their limitations. We define two dichotomies: overt vs. covert attention in Section II-A, and bottom-up vs. top-down attention in Section II-B. In Section II-C, we briefly discuss difficulties in obtaining ground truth for model predictions.

A. Overt versus covert visual attention

Due to the much higher resolution in the center of the retina compared to its more peripheral regions, humans and other primates usually direct their center of gaze towards the most relevant areas of the visual scene. This generates a series of fixations (and smooth eye movements, although the latter are not often discussed in the context of selective attention) called “overt attention,” since the allocation of the high-resolution resources of the fovea can be easily observed by following the person’s eyes, most conveniently and quantifiably with an eye tracker. It has been proposed that far-reaching conclusions can be drawn about the state of the human mind by analyzing the details of this so-called “scan path” [14], [15].

Primates, however, do not necessarily attend to objects in their center of gaze. As discovered early on both experimentally [4] and through introspection [6], humans are able to focus their attention on peripheral locations, away from their center of gaze. An illustration of this process is a car driver who fixates the road while simultaneously and covertly monitoring road signs and lights that appear in the retinal periphery. Since this redirection of attention is not immediately visible to an outside observer, it is referred to as covert attention.

There are many experimental paradigms that can determine the movements of the covert focus of attention, but none is as convenient, fast, and easy to understand as tracking the eyes of an observer, in other words, measuring his or her overt attentional state. Fortunately, although the locations of overt and covert attention can be dissociated, as discussed, psychophysical evidence shows that an eye movement to a new location is necessarily preceded by focal attention to this location [16]–[21]. This makes it possible to easily obtain a close correlate of covert attentional selections by recording eye movements, which thus serve as a proxy for shifts of covert attention. Of course, prediction of eye movements is also of immense practical interest in itself, including for the multimedia applications discussed in Section III. Frequently, models of covert attention are, explicitly or implicitly, used to predict eye movements.

B. Bottom-up versus top-down attention

Attentional selection is a central part of perception and cognition. As such, it is influenced by many factors, both internal and external to the observer. What is attended depends, for instance, on the observer’s motivation and the specific task he or she is performing. In a set of classic experiments, Yarbus [22] showed that eye movements (overt attention) of the same observer viewing the same visual scene differ dramatically depending on what information the observer is looking for in the scene. Attentional selection that depends on the internal state of the observer is referred to as “top-down attention.” It is very difficult to develop biologically realistic, detailed models of such mechanisms, which may include influences such as the personal history of the observer.

On the other hand, “bottom-up” selection depends only on the visual input provided instantaneously or in the very recent past (as in the immediately preceding frames of a movie). As such, it is not only much easier to control, but it is also easier to quantify the correlation between input and resulting behavior. For this reason, Koch and Ullman [23] proposed that bottom-up attention is a suitable candidate for detailed computational models of selective attention. Specifically, they proposed that bottom-up attention is directed to salient parts of the visual scene, and they introduced the concept of a saliency map. This is a topographic map of the visual field whose scalar value is the saliency at the respective location. Saliency is computed at multiple scales from local differences in visual submodalities (color, orientation, …). If both the basic premise that bottom-up attention is attracted by salience and their concept of how salience is computed are correct, attentional control is reduced to finding the local maxima in the saliency map and assigning the successively visited foci of attention to those maxima in order of decreasing peak value. This results in a “covert attentional scan path” (see Figure 1 for an illustrative example), in analogy to the sequence of eye movements in overt attention.

Fig. 1. Examples of a scan path (a), a region-of-interest map and corresponding content (b), and a fixation density map and corresponding content (c).
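To make the selection mechanism concrete, the following minimal sketch (assuming a precomputed, non-negative saliency map) visits saliency maxima in order of decreasing peak value. The disk-shaped suppression around each selected focus is a simplified stand-in for the winner-take-all and inhibition-of-return circuitry of the Koch-Ullman architecture, and the suppression radius is an illustrative parameter.

```python
import numpy as np

def covert_scan_path(saliency, n_foci=5, suppress_radius=32):
    """Derive a covert attentional scan path from a saliency map by
    visiting its maxima in order of decreasing peak value. A disk
    around each selected focus is zeroed (akin to inhibition of
    return) so the next-strongest location is found next."""
    s = saliency.astype(float).copy()
    yy, xx = np.mgrid[0:s.shape[0], 0:s.shape[1]]
    path = []
    for _ in range(n_foci):
        if s.max() <= 0:
            break  # no salient locations left
        y, x = np.unravel_index(np.argmax(s), s.shape)
        path.append((y, x))
        s[(yy - y) ** 2 + (xx - x) ** 2 <= suppress_radius ** 2] = 0.0
    return path
```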

This conceptual idea of attentional control by a saliency map was subsequently implemented in biologically realistic computational models [24]–[26]. Over the last decade and a half, these models have been refined, tested, and applied by a large number of groups. Borji and Itti [27] provide an excellent overview of the current state of the art in visual attention modeling, including a taxonomy of models (information-theoretic, cognitive, graphical, spectral, pattern classification, Bayesian, …).

The simplicity of the original saliency map model makes it attractive both conceptually and for applications, but it also engenders limitations. For instance, it has been found that eye movements are typically directed towards the centers of objects rather than their borders, which is where bottom-up saliency peaks [28]. Such deviations can be explained, at the cost of slightly higher complexity, by directing attention to proto-objects rather than purely spatially defined regions of the visual scene [28], [29].

While bottom-up influences are thus important, it is clear that in many situations top-down attention plays a role, too. One consequence of the saliency map model is that its first selections in a new scene should agree better with observed eye movements than later ones, since less top-down guidance is expected to exist for input never seen before; this was confirmed experimentally [30]. It is also important to distinguish the terms “salience” and “importance” (as in, e.g., Region of Interest/Importance, RoI), which are frequently treated as synonyms in the signal processing literature. While both visual salience and visual importance denote the most visually “relevant” parts of the scene, it is useful to reserve the term “salience” for strictly bottom-up influences, while “important” areas can be selected based on both bottom-up and top-down criteria. The two mechanisms are thus driven by different combinations of sources. The interplay between these mechanisms has been studied, showing that their relationship may vary over viewing time [31].

Even though top-down influences play an important role in attentional selection, we have already noted that developing computational models of top-down attention in as much detail as for bottom-up attention is virtually impossible. Some progress has been made on parts of the general problem, for instance on finding objects [32]. This field must be considered, however, as being in its infancy. For instance, it is known that not only the properties of objects and their immediate surrounds, but also the interactions between objects at the image scale, as well as the “gist” of the scene [33], [34], strongly influence search patterns and response times [35]. On the other hand, it was shown that low-level saliency is significantly predictive not only of eye movements but, surprisingly, even of conscious decisions about what observers consider interesting [36]. The fact that the very simple quantities computed in the original saliency map [24], [25] significantly influence human behavior after conscious deliberation and after many seconds of response time engenders hope that these easily and cheaply computed models and their derivatives can be useful for technical applications, even when humans are “in the loop,” as in multimedia applications.

C. Visual attention models and ground truth

Developing and testing computational models of visual attention depends on the availability of ground truth. Many studies rely on fixation density maps (FDMs) generated from eye-tracking experiments (see Figure 2 for an illustration of the process leading to the generation of an FDM). Consequently, most models can be assumed to address mainly overt visual attention. Nevertheless, recommendations for properly generating FDMs are still missing. Several eye-tracking FDM databases have been made publicly available, corresponding to experiments conducted independently under different conditions. The viewing time used in these experiments is particularly critical with respect to the competition between top-down and bottom-up influences, yet it is rarely considered. How differences between the various experimental setups used to obtain FDMs may impact image processing applications has recently been investigated [37].

Fig. 2. Steps for transforming eye-tracking data into a fixation density map. After gathering the raw data (top), saccades are identified and fixation locations are determined (center). The fixation map is then obtained by convolving fixation locations with a Gaussian whose size is determined by a combination of the mean eye-tracking error and the size of the human fovea (bottom).
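As a rough illustration of the final step shown in Figure 2, the sketch below accumulates fixation locations pooled over observers and convolves them with a Gaussian. The width sigma_px, standing in for the combined eye-tracking error and foveal extent converted to pixels, is an assumed parameter that depends on the viewing setup.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_density_map(fixations, shape, sigma_px=25):
    """Turn a pooled list of (row, col) fixation locations into a
    fixation density map by Gaussian convolution."""
    fdm = np.zeros(shape, dtype=float)
    for r, c in fixations:
        fdm[int(r), int(c)] += 1.0              # accumulate fixations
    fdm = gaussian_filter(fdm, sigma=sigma_px)  # convolve with Gaussian
    return fdm / fdm.max() if fdm.max() > 0 else fdm
```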

Computational models of attention produce very different predictions of FDMs (see examples in Figure 3). How to quantitatively compare the performance of different models given the ground truth is another topic of research; see ref. [38] for a recent study proposing several metrics to assess model performance. It should also be noted that using FDMs as ground truth may not be warranted for all models, since some are designed to explain aspects of visual attention mechanisms that are not reflected in FDMs.

Fig. 3. Examples of FDMs generated by visual attention models: original content (a), AIM model [?] (b), STB model [28] (c), SR model [39] (d).
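One widely used metric of the kind surveyed in ref. [38] is the normalized scanpath saliency (NSS): the model's map is z-scored and sampled at the human fixation locations, so that zero corresponds to chance and larger positive values indicate better fixation prediction. The minimal sketch below assumes the map and the fixations share the same pixel coordinate frame.

```python
import numpy as np

def nss(saliency, fixations):
    """Normalized scanpath saliency: mean of the z-scored model map
    sampled at human fixation locations (0 = chance level)."""
    z = (saliency - saliency.mean()) / (saliency.std() + 1e-12)
    return float(np.mean([z[int(r), int(c)] for r, c in fixations]))
```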

Given these caveats, a visual attention model should be used cautiously in image processing applications, taking into account the model type (e.g., top-down vs. bottom-up) as well as its performance in the given application context. Better characterization of models should lead to comprehensive recommendations for their proper usage.

III. APPLICATIONS OF VISUAL ATTENTION MODELS IN IMAGE AND VIDEO PROCESSING

In the following, we give an overview of applications of models of visual selective attention. In Sections III-A and III-B, we discuss multimedia delivery. Section III-C is devoted to retargeting, Section III-D to quality assessment, and Section III-E to applications in medical imaging. Finally, in Section III-F, we discuss stereoscopic 3D images.

A. Multimedia Delivery: improving source coding

Several stages of the media delivery chain can benefit from insights into visual attention mechanisms. The first attempts were applied to selective compression of image and video content. A survey on this topic can be found in [40]. The decrease in sensitivity and resolution of the human visual system as a function of eccentricity is one property that can be exploited to improve compression performance once salient locations have been identified [43]. Selective compression is based on two priors: a prior of selection, which defines the most informative areas of an image, and a prior of compression, which defines the nature of the coding and the bit rate allocation strategy. The compression rate (prior of coding), and consequently the visual quality, can be differentially adapted to different image areas depending on the level of attention devoted to them by human observers (prior of selection). The importance of a given image region can be computed based on the contribution of different features (contrast in color, orientation, intensity, …) [24], [41], [42] or, in a simplified version, under the assumption that human faces attract attention [43]. There are two principal approaches to prioritizing the coding of different image areas using saliency information. The first is the indirect approach [44], in which the image content is pre-processed: image areas are selectively encoded according to their saliency, e.g., by low-pass filtering less important regions. The choice of preprocessing method needs to be compatible with the coding scheme, especially with the quantization operator.
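A minimal sketch of the indirect approach, assuming a grayscale frame and a saliency map normalized to [0, 1]: the frame is blended with a low-pass filtered copy of itself so that less salient regions lose the high-frequency detail an encoder would otherwise spend bits on. A real foveation filter would vary the kernel continuously with eccentricity; the single-blur blend used here is a simplification.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_prefilter(frame, saliency, sigma=4.0):
    """Indirect selective compression: low-pass filter non-salient
    areas before handing the frame to a standard encoder."""
    blurred = gaussian_filter(frame.astype(float), sigma)
    # Salient pixels keep the original; non-salient pixels are blurred.
    return saliency * frame + (1.0 - saliency) * blurred
```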

The direct approach is applied in block-based coding methods. Bit rates are allocated to each macroblock separately according to a visual saliency criterion (see Figure 4). Most of the time, this is achieved by changing the quantization parameters, either using conventional rate-distortion optimization (RDO) techniques [45]–[48] or by providing a map based on a preceding analysis of the content [49].

Fig. 4. Distribution of the encoding cost of natural scenes (shown in a) for conventional H.264 coding (b) and a saliency-based approach (c) (from O. Le Meur, P. Le Callet, D. Barba, Selective H.264 video coding based on a saliency map, http://people.irisa.fr/Olivier.Le Meur). Color-coded pixels show the cost in the respective areas. The color scale at the bottom is common to all panels in rows b and c.
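The direct approach can be sketched as follows: a per-macroblock quantization parameter (QP) map is derived from the mean saliency of each 16x16 macroblock, with salient blocks quantized more finely. The base QP, the offset range, and the linear mapping are illustrative assumptions, not values prescribed by H.264 or by the cited RDO schemes; only the final clipping to the standard's 0-51 QP range is codec-mandated.

```python
import numpy as np

def qp_map_from_saliency(saliency, block=16, base_qp=30, qp_range=8):
    """Derive per-macroblock QPs from a saliency map in [0, 1]:
    salient blocks get lower QPs (finer quantization, more bits).
    Pixels beyond the last full macroblock are ignored for brevity."""
    rows, cols = saliency.shape[0] // block, saliency.shape[1] // block
    qp = np.empty((rows, cols), dtype=int)
    for by in range(rows):
        for bx in range(cols):
            s = saliency[by * block:(by + 1) * block,
                         bx * block:(bx + 1) * block].mean()
            qp[by, bx] = round(base_qp + qp_range * (1.0 - 2.0 * s))
    return np.clip(qp, 0, 51)  # H.264 QP range is 0..51
```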

With the recent availability of low-cost, consumer-grade eye trackers, visual attention-based bit allocation techniques for network video streaming have been introduced [50]. To improve the efficacy of such gaze-based networked systems, gaze prediction strategies can be used to anticipate future gaze locations and thus lower the end-to-end reaction delay due to the finite round-trip time (RTT) of transmission networks. Feng et al. [50] demonstrated that the bit rate can be reduced by slightly more than 20% without noticeable visual quality degradation, even when end-to-end network delays are as high as 200 ms.
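As an illustration of such RTT compensation, the sketch below extrapolates gaze linearly from the last two samples; it is a deliberately simple stand-in for the hidden Markov model predictor of Feng et al. [50], and the sample format and units are assumptions.

```python
def predict_gaze(samples, rtt_ms):
    """Constant-velocity gaze extrapolation one round trip ahead.
    `samples` is a time-ordered list of (t_ms, x, y) gaze samples."""
    (t0, x0, y0), (t1, x1, y1) = samples[-2], samples[-1]
    dt = max(t1 - t0, 1e-6)                 # guard against dt == 0
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return x1 + vx * rtt_ms, y1 + vy * rtt_ms
```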

In another approach [51], the audio component is also taken into account to improve RoI encoding based on the observation that sound-emitting regions in an audio-visual sequence typically draw a viewer’s attention.

B. Multimedia Delivery: Improving Resilience to Transmission Errors

Packets in a video bitstream contain data of different levels of importance from the visual information point of view. This results in unequal amounts of perceived image quality degradation when these packets are lost. Quality assessment experiments with observers have demonstrated that the effect of a lost packet depends on the spatio-temporal location of the visual information coded in the packet; perceived quality degradation is lowest when the loss affects regions of “non-interest” [52]–[54]. Visual attention-based error resilience, or RoI-based channel coding methods, are consequently good candidates to attenuate the perceptual quality loss resulting from packet loss. In highly prediction-based coding technologies such as H.264/AVC, good compression performance entails strong dependencies between many parts of the coded video sequence. However, these dependencies come with the drawback of allowing spatio-temporal propagation of the error resulting from a packet loss. RoI-based coding should therefore also attenuate the effect of this spatio-temporal dependency when important parts of the bitstream are lost. As part of the H.264/AVC video coding standard, error resilience features such as Flexible Macroblock Ordering (FMO) and Data Partitioning (DP) can be exploited to improve the resilience of salient regions of video content. DP partitions a coded slice into three separate NAL (Network Abstraction Layer) units, each containing a different part of the slice. FMO allows the ordering of macroblocks in slices according to a predefined map rather than the usual raster scan order. Coupled with RoI-based coding, FMO can be used to gather RoI macroblocks into a single slice [55]. An alternative approach [56] consists of confining the RoI to separate slices to prevent error propagation within a picture, and of constraining the prediction process so that loss distortion cannot propagate into the RoIs of other pictures.
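To make the FMO idea concrete, the sketch below builds an explicit macroblock-to-slice-group map from a binary per-pixel RoI mask, gathering RoI macroblocks into one slice group so that slice can be protected more strongly; the majority-vote threshold and the two-group layout are illustrative choices rather than anything mandated by the standard.

```python
import numpy as np

def fmo_slice_group_map(roi_mask, mb_size=16):
    """Assign each macroblock to slice group 0 (RoI) or 1 (rest),
    as in an explicit FMO macroblock-to-slice-group map. A block
    counts as RoI when most of its pixels are inside the mask."""
    rows = roi_mask.shape[0] // mb_size
    cols = roi_mask.shape[1] // mb_size
    groups = np.empty((rows, cols), dtype=int)
    for by in range(rows):
        for bx in range(cols):
            block = roi_mask[by * mb_size:(by + 1) * mb_size,
                             bx * mb_size:(bx + 1) * mb_size]
            groups[by, bx] = 0 if block.mean() > 0.5 else 1
    return groups
```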

C. Image and Video retargeting

With the recent explosion of commonly available device types (tablets, smart phones, large displays, …), formats (3D, HD, Ultra HD, …) and services (video streaming, image database browsing, …), the visual extent of multimedia content viewed by a human observer can vary enormously, resulting in the stimulation of very different fractions of his or her visual field. Depending on display capacity and the purpose of the application, content often needs to be repurposed to generate smaller versions with respect to image size, resolution, frame rate, … A common way to achieve this goal is to dramatically down-sample the picture homogeneously, as in thumbnail modes. This often yields poorly rendered pictures, since important objects of the scene may no longer be recognizable. Alternatively, content repurposing techniques perform content-aware image resizing, for example by seam carving [57]. Saliency-based image retargeting (also called content repurposing or reframing) algorithms follow this idea: identify important regions of interest and compute the reduced picture centered on these parts [58], [59] (see Figure 5 for an illustration). More recently, dynamic (i.e., time-varying) thumbnails have been introduced using a dynamic computational model of visual attention [60]. Rubinstein and colleagues [61] evaluated many image retargeting algorithms both objectively and subjectively and demonstrated the value of saliency-based cropping approaches.

Fig. 5. Process of saliency-based reframing [58]. The saliency-based thumbnail focuses on the most relevant image parts.
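A minimal cropping-based retargeting sketch in the spirit of [58]: starting from the saliency peak, the window grows one row or column at a time toward whichever side adds the most saliency, until a target fraction of the total saliency mass is enclosed; the resulting box is then cropped and scaled into the thumbnail. The coverage threshold and the greedy growth rule are assumptions for illustration, not the algorithm of the cited papers.

```python
import numpy as np

def saliency_crop(saliency, coverage=0.9):
    """Grow a window from the saliency peak until it encloses
    `coverage` of the total saliency mass; returns (top, bottom,
    left, right) bounds suitable for cropping a thumbnail."""
    h, w = saliency.shape
    total = saliency.sum()
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    top, bottom, left, right = y, y + 1, x, x + 1
    while saliency[top:bottom, left:right].sum() < coverage * total:
        gains = []  # saliency added by growing one step per side
        if top > 0:
            gains.append((saliency[top - 1, left:right].sum(), "top"))
        if bottom < h:
            gains.append((saliency[bottom, left:right].sum(), "bottom"))
        if left > 0:
            gains.append((saliency[top:bottom, left - 1].sum(), "left"))
        if right < w:
            gains.append((saliency[top:bottom, right].sum(), "right"))
        if not gains:
            break  # window already spans the whole image
        side = max(gains)[1]
        if side == "top":
            top -= 1
        elif side == "bottom":
            bottom += 1
        elif side == "left":
            left -= 1
        else:
            right += 1
    return top, bottom, left, right
```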

D. Image and Video quality assessment

Perceptual objective image quality assessment uses an algorithm that evaluates the quality of pictures or video as a human observer would, based on the properties of the human visual system. Visual attention is one of the features that can be considered, based on the rationale that an artifact is likely more annoying in a salient region than in other areas [62]. Most objective quality assessment methods can be decomposed into two steps. Image distortion is first evaluated locally (pixel-based, block-based, …), resulting in a distortion map. In the second step, a pooling function combines the distortion map values into a single quality score. An intuitive idea for improving quality assessment methods using visual attention information is to give greater weight at the pooling stage to degradations appearing in salient areas than to those in non-salient areas [63], [64]. Initial approaches consisted of weighting the distortion map with local saliency values before computing a linear or nonlinear mean. More recent studies, based on eye-tracking data, demonstrated that this simple weighting is not very effective for compression artifacts [65], [66]. Nevertheless, such approaches can lead to significantly improved performance in the case of non-uniformly located distortions, such as those due to transmission impairments [67]. Alternative weighting methods have been introduced for compression artifacts with varying success [68], [69]. In ref. [70], more complex combinations of saliency map and distortion are introduced, assuming that the weights should be a function of both saliency value and distortion level. You et al. [71], [72] revisit the problem at the distortion level for video content: distortion visibility can be balanced according to the human contrast sensitivity function, and, as the latter is spatially non-uniform, gaze estimation should be considered to apply it properly.
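The initial weighting approach described above can be sketched as follows, assuming a distortion map and a saliency map of the same size; the Minkowski exponent and the linear use of saliency as a weight are illustrative choices.

```python
import numpy as np

def saliency_weighted_quality(distortion, saliency, p=2.0):
    """Pool a local distortion map into a single score, weighting
    each location by normalized saliency (Minkowski pooling with
    exponent p). Larger scores mean more visible degradation."""
    w = saliency / (saliency.sum() + 1e-12)   # normalize weights
    return float((w * np.abs(distortion) ** p).sum() ** (1.0 / p))
```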

Another open issue is which parts of the original content and its distorted version should be used for estimating the saliency map. Artifacts themselves may affect the deployment of visual attention; they may, for instance, attract attention [73]. Moreover, objective quality measures are expected to correlate with the outcomes of quality assessment experiments performed by observers. To obtain comparison data, observers need to perform specific tasks, and such tasks are likely to affect the deployment of visual attention compared to free viewing [74]–[76].

E. Medical imaging

Over the past twenty years, digital medical imaging techniques (Computed Tomography, Magnetic Resonance Imaging, Ultrasound, Computed Radiography/Digital Radiography, Fluoroscopy, Positron Emission Tomography, Single Photon Emission Computed Tomography, …) have revolutionized healthcare practice, becoming a core source of information for clinicians to render diagnostic and treatment decisions. Practical analysis of medical images requires two basic processes: visually inspecting the image (involving visual perception processes, including detection and localization tasks) and performing an interpretation (requiring cognitive processes). Unfortunately, interpretation is not error-free and can be affected by the observer’s level of expertise and by technological aspects. Moreover, a side effect of the dramatic increase in the availability and use of medical images is a shortage of qualified image reading experts. It is likely that the time available for interpreting each image will continue to decrease in the future. Expertise in medical image reading therefore needs to be seen under two aspects: accuracy and speed [77]. Understanding how clinicians read images, how they develop expertise throughout their careers, and why some people are better at interpreting medical images than others are crucial questions related to visual attention.

Such knowledge offers great potential for developing better training programs and creating new tools that could enhance and speed up the learning process. A longitudinal study [77] of pathology residents during their development of expertise in reading slides of breast biopsies used eye-tracking experiments at the beginning of each of their three years of residency, documenting changes in their scan paths as their level of experience increased. The data showed that search patterns changed with each successive year of experience: over time, residents spent significantly less time per slide, made fewer fixations, and performed less examination of non-diagnostic areas. Similar findings have been obtained in radiology on multi-slice images such as cranial Computed Tomography (CCT) scans [78] and multi-sequence Magnetic Resonance Imaging (MRI) [79]. Figure 6 shows an example of a scan path and gaze fixations in the case of multiple MRI sequences.

Fig. 6. Scan path and gaze fixations on multiple MRI sequences. Shown are different MRI sequences (gray) taken from one patient’s head, overlaid with the eye movement data of a clinical expert (green). Lines are saccades and shaded circles signify fixations, with the diameter proportional to viewing time. The image reader uses several sequences, implying comparison of information from the different representations in the several panels.

F. Stereoscopic 3D: new opportunities for visual attention

A key factor required for the widespread adoption of services based on stereoscopic images will be the creation of a compelling visual experience for the end-user. Perceptual issues and the importance of considering 3D visual attention to improve the overall 3D viewing experience in 3DTV broadcasting have been discussed extensively [80]. Integrating visual attention at the source and channel coding levels requires only limited adaptation compared to the 2D case. More interestingly, content production offers original new opportunities to exploit insights into visual attention mechanisms, especially regarding perceptual concepts such as visual comfort. Comfortable viewing conditions for stereoscopic content (e.g., the zone of comfortable viewing) are linked to several factors such as the accommodation-vergence conflict, the range of depth of focus, and the range of fusion [81], [82]. A seminal study by Wopking [83] suggests that visual discomfort increases with high spatial frequencies and large disparities, partly because the limits of stereoscopic fusion widen as spatial frequency decreases. More generally, it appears that blurring can have a positive impact on visual comfort because it reduces the accommodation-vergence conflict, limiting both the need for accommodation and the effort to fuse [84], [85]. Simulating depth of field (DOF) is a way to take advantage of this retinal defocus property in order to improve visual comfort, by artificially blurring images to a degree that corresponds to the relative depth from fixated objects. As reported by Lambooij et al. [86], “three essential steps are required for proper implementation of a simulated DOF: localization of the eye positions, determination of the fixation point and implementation of blur filters to non-fixated layers.” This procedure has been applied in virtual reality environments but has drawbacks in more general contexts, since it affects depth cue integration between retinal disparity and areas with high amounts of blur [87]. Blurring effects can also be used in 3D content to direct the viewer’s attention towards a specific area of the image that lies within the comfortable viewing zone. In gaming and in the computer graphics community, visual attention modeling has attracted growing interest: visual attention models have been used to produce more realistic behavior of virtual characters, to improve interactivity in 3D virtual environments, and to improve visual comfort when viewing rendered 3D virtual environments [88]–[90].
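Following the three steps quoted from Lambooij et al. [86], a crude simulated-DOF sketch for a grayscale view is given below: depth is quantized into a handful of layers, and each non-fixated layer is blurred in proportion to its depth distance from the fixated plane. The number of layers and the blur gain are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulated_dof(image, depth, fixation, n_layers=5, blur_gain=1.5):
    """Blur non-fixated depth layers to simulate depth of field.
    `fixation` is a (row, col) pixel; the blur sigma grows with a
    layer's depth distance from the fixated depth."""
    img = image.astype(float)
    levels = np.linspace(depth.min(), depth.max(), n_layers)
    layer_of = np.argmin(np.abs(depth[..., None] - levels), axis=-1)
    d_fix = depth[fixation[0], fixation[1]]
    out = np.zeros_like(img)
    for i, d in enumerate(levels):
        sigma = blur_gain * abs(d - d_fix)
        layer = gaussian_filter(img, sigma) if sigma > 0 else img
        out[layer_of == i] = layer[layer_of == i]
    return out
```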

Due to geometry issues related to depth rendering, adaptation from the cinema environment to the home environment is far from being an automatic, straightforward process for 3D content production. Automated content-based post-production or post-processing tools that help adapt 3D content to television are expected to be developed. 3D visual attention models can be employed to provide the area of interest and the convergence plane that drive the repurposing of stereoscopic content. In addition, adaptation of the scene depth can be used to improve visual comfort. To reduce both visual discomfort and fatigue, the convergence plane is usually continuously set to the main area of interest as the latter moves across different depth levels. A way to reduce eye strain is to modify the convergence plane so that the main area of interest is placed on the display plane, i.e., by adapting the content disparity. Such visual attention-based adaptive rendering of 3D stereoscopic video has been proposed using a 2D visual attention model [91].
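A minimal sketch of this disparity adaptation, assuming rectified views, a pixel-disparity map for the left view, and a binary mask of the main area of interest: shifting one view horizontally by the RoI's mean disparity places that region on the display plane (zero disparity). Edge handling and view-synthesis refinements are ignored for brevity.

```python
import numpy as np

def adapt_convergence(left, right, disparity, roi_mask):
    """Shift the right view so the mean disparity of the region of
    interest becomes zero, i.e., the RoI lands on the display plane."""
    d_roi = int(round(float(disparity[roi_mask].mean())))
    # Uniform horizontal translation; np.roll wraps at the border,
    # which a real implementation would handle by padding/cropping.
    return left, np.roll(right, -d_roi, axis=1)
```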

IV. CONCLUSION

Visual attention is attracting a high level of interest in the vision science community. In this paper, we have demonstrated that this research interest is increasingly penetrating the Information and Communication Technology (ICT) field, with some successful outcomes, although challenges remain. One caveat is that, as in any trans-disciplinary approach, one has to ensure that concepts from one research field are properly used when appropriated by another. For instance, in the image processing community, the terms “salience” and “importance” (or Visual Salience and Region of Interest/Importance) have sometimes been considered synonymous while, as stated, they should be distinguished. Both denote the most visually “relevant” parts of the scene; however, the concepts differ, as they may refer to two different mechanisms of visual attention: bottom-up vs. top-down. While the interaction between ICT and vision science is intensifying, the ICT community needs to take care that the proper tools (models, validation protocols, databases, …) are used for the proper needs.

ACKNOWLEDGMENT

Work supported in part by Office of Naval Research grant N000141010278 and NIH grant R01EY016281.

Biographies


Patrick Le Callet received both an M.Sc. and a Ph.D. degree in image processing from Ecole polytechnique de l’Université de Nantes. He was also a student at the Ecole Normale Supérieure de Cachan, where he sat the “Agrégation” (credentialing exam) in electronics of the French National Education system. He worked as an Assistant Professor from 1997 to 1999 and as a full-time lecturer from 1999 to 2003 at the Department of Electrical Engineering of the Technical Institute of the University of Nantes (IUT). Since 2003 he has taught at Ecole polytechnique de l’Université de Nantes (Engineering School) in the Electrical Engineering and Computer Science departments, where he is now a Full Professor. Since 2006, he has been the head of the Image and Video Communication lab at CNRS IRCCyN, a group of more than 35 researchers. He is mostly engaged in research dealing with the application of human vision modeling in image and video processing. His current centers of interest are 3D image and video quality assessment, watermarking techniques, and visual attention modeling and applications. He is co-author of more than 140 publications and communications and co-inventor of 13 international patents on these topics. He also co-chairs the “Joint-Effort Group” and “3DTV” activities within VQEG (the Video Quality Expert Group). He currently serves as associate editor for IEEE Transactions on Circuits and Systems for Video Technology, the Springer EURASIP Journal on Image and Video Processing, and SPIE Electronic Imaging.


Ernst Niebur graduated with an M.S. degree (Diplom-Physiker) from the Universität Dortmund, West Germany. He received a Post-Graduate Diploma in Artificial Intelligence from the Swiss Federal Institute of Technology (EPFL), Switzerland, and the Ph.D. degree (Dr ès sciences) in physics from the Université de Lausanne, Switzerland. His dissertation topic was a detailed computational model of the motor nervous system of the nematode C. elegans.

Niebur was a Research Fellow and a Senior Research Fellow at the California Institute of Technology, Pasadena, and an Adjunct Professor at Queensland University of Technology, Brisbane, Australia. He joined the faculty of Johns Hopkins University in 1995, where he is currently a Professor of Neuroscience in the School of Medicine and of Psychological and Brain Sciences in the School of Arts and Sciences. He uses computational neuroscience to understand the function of the nervous system at many levels.

Niebur was the recipient of a Seymour Cray (Switzerland) Award in Scientific Computation in 1988, an Alfred P. Sloan Fellowship in 1997, and a National Science Foundation CAREER Award in 1998.

Footnotes

1. Although attention controls input from all senses, we focus on vision throughout this article.

Contributor Information

Patrick Le Callet, LUNAM Université, Université de Nantes, Institut de Recherche en Communications et Cybernétique de Nantes, Polytech Nantes, UMR CNRS 6597, France patrick.lecallet@univ-nantes.fr.

Ernst Niebur, Solomon Snyder Department of Neuroscience and the Zanvyl Krieger Mind Brain Institute, Johns Hopkins University, Baltimore MD 21218 USA niebur@jhu.edu.

REFERENCES

1. Reich DS, Mechler F, Purpura KP, Victor JD. Interspike intervals, receptive fields, and information encoding in primary visual cortex. J. Neurosci. 2000 Mar;20(5):1964–1974. doi: 10.1523/JNEUROSCI.20-05-01964.2000.
2. Reinagel P, Reid RC. Temporal coding of visual information in the thalamus. J. Neurosci. 2000 Jul;20(14):5392–5400. doi: 10.1523/JNEUROSCI.20-14-05392.2000.
3. Brenner N, Strong SP, Koberle R, Bialek W, de Ruyter van Steveninck RR. Synergy in a neural code. Neural Computation. 2000;12(7):1531–1552. doi: 10.1162/089976600300015259.
4. von Helmholtz H. Handbuch der physiologischen Optik. Leipzig: Voss; 1867.
5. Wundt WM. Grundzüge der physiologischen Psychologie. W. Engelmann; 1874.
6. James W. The Principles of Psychology. New York: Henry Holt; 1890.
7. Broadbent DE. Perception and Communication. London: Pergamon; 1958.
8. Neisser U. Cognitive Psychology. New York: Appleton-Century-Crofts; 1967.
9. Treisman A, Gelade G. A feature-integration theory of attention. Cognitive Psychology. 1980;12:97–136. doi: 10.1016/0010-0285(80)90005-5.
10. He ZJ, Nakayama K. Surfaces versus features in visual search. Nature. 1992;359:231–233. doi: 10.1038/359231a0.
11. Wolfe JM. Guided Search 2.0: a revised model of visual search. Psychonomic Bulletin & Review. 1994;1(2):202–238. doi: 10.3758/BF03200774.
12. Tsotsos JK, Culhane SM, Wai WYK, Lai YH, Davis N, Nuflo F. Modelling visual attention via selective tuning. Artificial Intelligence. 1995 Oct;78(1–2):507–545.
13. Wolfe J, Horowitz T. What attributes guide the deployment of visual attention and how do they do it? Nat. Rev. Neurosci. 2004 Jun;5:495–501. doi: 10.1038/nrn1411.
14. Noton D, Stark L. Scanpaths in eye movements. Science. 1971;171:308–311. doi: 10.1126/science.171.3968.308.
15. Zangemeister WH, Sherman K, Stark L. Evidence for global scanpath strategy in viewing abstract compared with realistic images. Neuropsychologia. 1995;33(8):1009–1025. doi: 10.1016/0028-3932(95)00014-t.
16. Shepherd M, Findlay JM, Hockey RJ. The relationship between eye movements and spatial attention. The Quarterly Journal of Experimental Psychology. 1986;38(3):475–491. doi: 10.1080/14640748608401609.
17. Schneider WX, Deubel H. Visual attention and saccadic eye movements: Evidence for obligatory and selective spatial coupling. Studies in Visual Information Processing. 1995;6:317–324.
18. Deubel H, Schneider WX. Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research. 1996;36(12):1827–1837. doi: 10.1016/0042-6989(95)00294-4.
19. Hoffman J, Subramaniam B. The role of visual attention in saccadic eye movements. Perception and Psychophysics. 1995;57(6):787–795. doi: 10.3758/bf03206794.
20. Kowler E, Anderson E, Dosher B, Blaser E. The role of attention in the programming of saccades. Vision Research. 1995;35(13):1897–1916. doi: 10.1016/0042-6989(94)00279-u.
21. McPeek RM, Maljkovic V, Nakayama K. Saccades require focal attention and are facilitated by a short-term memory system. Vision Research. 1999;39(8):1555–1566. doi: 10.1016/s0042-6989(98)00228-4.
22. Yarbus A. Eye Movements and Vision. New York: Plenum Press; 1967.
23. Koch C, Ullman S. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiol. 1985;4:219–227.
24. Niebur E, Koch C. Control of selective visual attention: Modeling the “where” pathway. In: Touretzky DS, Mozer MC, Hasselmo ME, editors. Advances in Neural Information Processing Systems, vol. 8. Cambridge, MA: MIT Press; 1996. pp. 802–808.
25. Itti L, Koch C, Niebur E. A model of saliency-based fast visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998 Nov;20(11):1254–1259.
26. Itti L, Koch C. Computational modelling of visual attention. Nature Reviews Neuroscience. 2001;2:194–203. doi: 10.1038/35058500.
27. Borji A, Itti L. State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013. doi: 10.1109/TPAMI.2012.89.
28. Walther D, Koch C. Modeling attention to salient proto-objects. Neural Networks. 2006 Nov;19:1395–1407. doi: 10.1016/j.neunet.2006.10.001.
29. Mihalas S, Dong Y, von der Heydt R, Niebur E. Mechanisms of perceptual organization provide auto-zoom and auto-localization for attention to objects. Proceedings of the National Academy of Sciences. 2011;108(18):7583–7588. doi: 10.1073/pnas.1014655108. PMCID: PMC3088583.
30. Parkhurst D, Law K, Niebur E. Modelling the role of salience in the allocation of visual selective attention. Vision Research. 2002;42(1):107–123. doi: 10.1016/s0042-6989(01)00250-4.
31. Wang J, Chandler DM, Le Callet P. Quantifying the relationship between visual salience and visual importance. Proceedings of SPIE. 2010 Feb;7527:75270K. Available: http://spiedigitallibrary.org/proceedings/resource/2/psisdg/7527/1/75270K_1?isAuthorized=no.
32. Navalpakkam V, Itti L. An integrated model of top-down and bottom-up attention for optimal object detection. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York, NY; 2006 Jun. pp. 2049–2056.
33. Torralba A, Oliva A. Statistics of natural image categories. Network: Computation in Neural Systems. 2003;14:391–412.
34. Torralba A, Oliva A, Castelhano M, Henderson JM. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object selection. Psychological Review. 2006;113(4):766–786. doi: 10.1037/0033-295X.113.4.766.
35. Wolfe JM, Alvarez GA, Rosenholtz R, Kuzmova YI, Sherman AM. Visual search for arbitrary objects in real scenes. Attention, Perception, & Psychophysics. 2011;73(6):1650–1671. doi: 10.3758/s13414-011-0153-3.
36. Masciocchi C, Mihalas S, Parkhurst D, Niebur E. Everyone knows what is interesting: Salient locations which should be fixated. Journal of Vision. 2009 Oct;9(11):1–22. doi: 10.1167/9.11.25.
37. Engelke U, Liu H, Wang J, Le Callet P, Heynderickx I, Zepernick H, Maeder A. A comparative study of fixation density maps. IEEE Transactions on Image Processing. 2012. doi: 10.1109/TIP.2012.2227767.
38. Le Meur O, Baccino T. Methods for comparing scanpaths and saliency maps: strengths and weaknesses. Behavior Research Methods. 2012:1–16. doi: 10.3758/s13428-012-0226-9. Available: http://link.springer.com/article/10.3758/s13428-012-0226-9.
39. Hou X, Zhang L. Saliency detection: A spectral residual approach. IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’07); 2007. pp. 1–8.
40. Lee J-S, Ebrahimi T. Perceptual video compression: A survey. IEEE Journal of Selected Topics in Signal Processing. 2012;6(6):684–697.
41. Maeder A, Diederich J, Niebur E. Limiting human perception for image sequences. Proceedings of the SPIE. 1996;2657:330–337.
42. Parkhurst D, Niebur E. Variable resolution displays: a theoretical, practical and behavioral evaluation. Human Factors. 2002;44(4):611–629. doi: 10.1518/0018720024497015.
43. Daly S, Matthews K, Ribas-Corbera J. As plain as the noise on your face: Adaptive video compression using face detection and visual eccentricity models. Journal of Electronic Imaging. 2001;10(1):30–46. Available: http://dx.doi.org/10.1117/1.1333679.
44. Itti L. Automatic foveation for video compression using a neurobiological model of visual attention. IEEE Transactions on Image Processing. 2004 Oct;13(10):1304–1318. doi: 10.1109/tip.2004.834657.
45. Eleftheriadis A, Jacquin A. Automatic face location detection and tracking for model-assisted coding of video teleconferencing sequences at low bit-rates. Signal Processing: Image Communication. 1995;7(3):231–248. Available: http://www.sciencedirect.com/science/article/pii/092359659500028U.
46. Doulamis N, Doulamis A, Kalogeras D, Kollias S. Low bit-rate coding of image sequences using adaptive regions of interest. IEEE Transactions on Circuits and Systems for Video Technology. 1998 Dec;8(8):928–934.
47. Chen Z, Guillemot C. Perceptually-friendly H.264/AVC video coding. 2009 16th IEEE International Conference on Image Processing (ICIP); 2009 Nov. pp. 3417–3420.
48. Liu Y, Li ZG, Soh YC. Region-of-interest based resource allocation for conversational video communication of H.264/AVC. IEEE Transactions on Circuits and Systems for Video Technology. 2008 Jan;18(1):134–139.
49. Tang C-W, Chen C-H, Yu Y-H, Tsai C-J. A novel visual distortion sensitivity analysis for video encoder bit allocation. 2004 International Conference on Image Processing (ICIP ’04). 2004 Oct;5:3225–3228.
50. Feng Y, Cheung G, Tan W-t, Ji Y. Hidden Markov model for eye gaze prediction in networked video streaming. 2011 IEEE International Conference on Multimedia and Expo (ICME); 2011 Jul. pp. 1–6.
51. Lee J-S, De Simone F, Ebrahimi T. Efficient video coding based on audio-visual focus of attention. Journal of Visual Communication and Image Representation. 2011 Nov;22(8):704–711. Available: http://www.sciencedirect.com/science/article/pii/S104732031000146X.
52. Boulos F, Parrein B, Le Callet P, Hands D, et al. Perceptual effects of packet loss on H.264/AVC encoded videos. Fourth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM); 2009.
53. Boujut H, Benois-Pineau J, Hadar O, Ahmed T, Bonnet P. Weighted-MSE based on saliency map for assessing video quality of H.264 video streams. Proc. SPIE 7867, Image Quality and System Performance VIII; 2011 Jan;78670X. Available: http://dx.doi.org/10.1117/12.876471.
54. Engelke U, Pepion R, Le Callet P, Zepernick H-J. Linking distortion perception and visual saliency in H.264/AVC coded video containing packet loss. Proceedings of SPIE. 2010 Jul;7744:774406. Available: http://spiedigitallibrary.org/proceedings/resource/2/psisdg/7744/1/774406_1?isAuthorized=no.
55. Dhondt Y, Lambert P, Van de Walle R. A flexible macroblock scheme for unequal error protection. 2006 IEEE International Conference on Image Processing; 2006 Oct. pp. 829–832.
56. Boulos F, Chen W, Parrein B, Le Callet P. Region-of-interest intra prediction for H.264/AVC error resilience. 2009 16th IEEE International Conference on Image Processing (ICIP); 2009. pp. 3109–3112.
57. Avidan S, Shamir A. Seam carving for content-aware image resizing. ACM Trans. Graph. 2007 Jul;26(3). Available: http://doi.acm.org/10.1145/1276377.1276390.
58. Le Meur O, Castellan X, Le Callet P, Barba D. Efficient saliency-based repurposing method. 2006 IEEE International Conference on Image Processing; 2006 Oct. pp. 421–424.
59. Setlur V, Lechner T, Nienhaus M, Gooch B. Retargeting images and video for preserving information saliency. IEEE Computer Graphics and Applications. 2007 Oct;27(5):80–88. doi: 10.1109/mcg.2007.133.
60. Da Silva MP, Courboulay V, Le Callet P. Real time dynamic image re-targeting based on a dynamic visual attention model. 2012 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2012 Jul. pp. 653–658.
61. Rubinstein M, Gutierrez D, Sorkine O, Shamir A. A comparative study of image retargeting. ACM SIGGRAPH Asia 2010 papers (SIGGRAPH ASIA ’10). New York, NY, USA: ACM; 2010. pp. 160:1–160:10. Available: http://doi.acm.org/10.1145/1866158.1866186.
62. Engelke U, Kaprykowsky H, Zepernick H-J, Ndjiki-Nya P. Visual attention in quality assessment. IEEE Signal Processing Magazine. 2011 Nov;28(6):50–59.
63. Osberger W, Bergmann N, Maeder A. An automatic image quality assessment technique incorporating higher level perceptual factors. 1998 International Conference on Image Processing (ICIP 98); 1998. pp. 414–418.
64. Barland R, Saadane A. Blind quality metric using a perceptual importance map for JPEG-2000 compressed images. 2006 IEEE International Conference on Image Processing; 2006 Oct. pp. 2941–2944.
65. Ninassi A, Le Meur O, Le Callet P, Barba D. Does where you gaze on an image affect your perception of quality? Applying visual attention to image quality metric. IEEE International Conference on Image Processing (ICIP 2007). 2007 Oct;2:II-169–II-172.
66. Liu H, Heynderickx I. Visual attention in objective image quality assessment: Based on eye-tracking data. IEEE Transactions on Circuits and Systems for Video Technology. 2011 Jul;21(7):971–982.
67. Engelke U, Barkowsky M, Le Callet P, Zepernick H-J. Modelling saliency awareness for objective video quality assessment. 2010 Second International Workshop on Quality of Multimedia Experience (QoMEX); 2010 Jun. pp. 212–217.
68. Larson E, Vu C, Chandler D. Can visual fixation patterns improve image fidelity assessment? 15th IEEE International Conference on Image Processing (ICIP 2008); 2008. pp. 2572–2575.
69. You J, Perkis A, Gabbouj M. Improving image quality assessment with modeling visual attention. 2010 2nd European Workshop on Visual Information Processing (EUVIP); 2010 Jul. pp. 177–182.
70. Redi J, Liu H, Gastaldo P, Zunino R, Heynderickx I. How to apply spatial saliency into objective metrics for JPEG compressed images? 2009 16th IEEE International Conference on Image Processing (ICIP); 2009 Nov. pp. 961–964.
71. You J, Xing L, Perkis A, Ebrahimi T. Visual contrast sensitivity guided video quality assessment. 2012 IEEE International Conference on Multimedia and Expo (ICME); 2012 Jul. pp. 824–829.
72. You J, Korhonen J, Perkis A. Attention modeling for video quality assessment: Balancing global quality and local quality. 2010 IEEE International Conference on Multimedia and Expo (ICME); 2010 Jul. pp. 914–919.
73. Le Meur O, Ninassi A, Le Callet P, Barba D. Do video coding impairments disturb the visual attention deployment? Signal Processing: Image Communication. 2010;25(8):597–609.
74. Ninassi A, Le Meur O, Le Callet P, Barba D, Tirel A. Task impact on the visual attention in subjective image quality assessment. Proceedings of the European Signal Processing Conference, France; 2006 Sep (invited paper). Available: http://hal.archives-ouvertes.fr/hal-00342685.
75. Le Meur O, Ninassi A, Le Callet P, Barba D. Overt visual attention for free-viewing and quality assessment tasks: Impact of the regions of interest on a video quality metric. Signal Processing: Image Communication. 2010;25(7):547–558.
76. Redi J, Liu H, Zunino R, Heynderickx I. Interactions of visual attention and quality perception. Proc. SPIE 7865, Human Vision and Electronic Imaging XVI; 2011 Feb;78650S. Available: http://dx.doi.org/10.1117/12.876712.
77. Krupinski EA. On the development of expertise in interpreting medical images. Proc. SPIE 8291, Human Vision and Electronic Imaging XVII; 2012 Feb;82910R. Available: http://dx.doi.org/10.1117/12.916454.
78. Venjakob A, Marnitz T, Mahler J, Sechelmann S, Rötting M. Radiologists’ eye gaze when reading cranial CT images. Proc. SPIE 8318, Medical Imaging 2012: Image Perception, Observer Performance, and Technology Assessment; 2012 Feb;83180B. Available: http://dx.doi.org/10.1117/12.913611.
79. Cavaro-Ménard C, Tanguy J-Y, Le Callet P. Eye-position recording during brain MRI examination to identify and characterize steps of glioma diagnosis. Proceedings of SPIE. 2010 Mar;7627:76270E. Available: http://spiedigitallibrary.org/proceedings/resource/2/psisdg/7627/1/76270E_1?isAuthorized=no.
80. Huynh-Thu Q, Barkowsky M, Le Callet P. The importance of visual attention in improving the 3D-TV viewing experience: Overview and new perspectives. IEEE Transactions on Broadcasting. 2011 Jun;57(2):421–431.
81. Pastoor S. Human factors of 3D displays in advanced image communications. Displays. 1993 Jul;14(3):150–157. Available: http://www.sciencedirect.com/science/article/pii/0141938293900365.
82. Nagata S. The binocular fusion of human vision on stereoscopic displays—field of view and environment effects. Ergonomics. 1996;39(11):1273–1284. doi: 10.1080/00140139608964547. PMID: 8888639. Available: http://www.tandfonline.com/doi/abs/10.1080/00140139608964547.
83. Wopking M. Viewing comfort with stereoscopic pictures: An experimental study on the subjective effects of disparity magnitude and depth of focus. Journal of the Society for Information Display. 1995;3(3):101–103.
84. Semmlow JL, Heerema D. The role of accommodative convergence at the limits of fusional vergence. Investigative Ophthalmology & Visual Science. 1979 Jan;18(9):970–976. Available: http://www.iovs.org/content/18/9/970.
85. Talmi K, Liu J. Eye and gaze tracking for visually controlled interactive stereoscopic displays. Signal Processing: Image Communication. 1999 Aug;14(10):799–810. Available: http://www.sciencedirect.com/science/article/pii/S0923596598000447.
86. Lambooij M, Fortuin M, Heynderickx I, IJsselsteijn W. Visual discomfort and visual fatigue of stereoscopic displays: A review. Journal of Imaging Science and Technology. 2009;53(3):30201-1–30201-14.
87. Mather G, Smith DR. Depth cue integration: stereopsis and image blur. Vision Research. 2000 Jan;40(25):3501–3506. doi: 10.1016/s0042-6989(00)00178-4. Available: http://www.sciencedirect.com/science/article/pii/S0042698900001784.
88. Hillaire S, Lecuyer A, Cozot R, Casiez G. Using an eye-tracking system to improve camera motions and depth-of-field blur effects in virtual environments. IEEE Virtual Reality Conference (VR ’08); 2008 Mar. pp. 47–50.
89. Hillaire S, Lécuyer A, Breton G, Corte TR. Gaze behavior and visual attention model when turning in virtual environments. Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology (VRST ’09). New York, NY, USA: ACM; 2009. pp. 43–50. Available: http://doi.acm.org/10.1145/1643928.1643941.
90. Hillaire S, Lecuyer A, Cozot R, Casiez G. Depth-of-field blur effects for first-person navigation in virtual environments. IEEE Computer Graphics and Applications. 2008 Dec;28(6):47–55. doi: 10.1109/MCG.2008.113.
91. Chamaret C, Godeffroy S, Lopez P, Le Meur O. Adaptive 3D rendering based on region-of-interest. SPIE Stereoscopic Displays and Applications XXI, vol. 7524; 2010 Feb. Available: http://dx.doi.org/10.1117/12.837532.
