Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2014 Dec 1.
Published in final edited form as: Curr Psychol. 2013 Sep 18;32(4):10.1007/s12144-013-9184-3. doi: 10.1007/s12144-013-9184-3

Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed

George M Slavich 1,, Philip G Zimbardo 2
PMCID: PMC3865713  NIHMSID: NIHMS525589  PMID: 24363542

Abstract

The human visual system employs a sophisticated set of strategies for scanning the environment and directing attention to stimuli that can be expected given the context and a person’s past experience. Although these strategies enable us to navigate a very complex physical and social environment, they can also cause highly salient, but unexpected stimuli to go completely unnoticed. To examine the generality of this phenomenon, we conducted eight studies that included 15 different experimental conditions and 1,577 participants in all. These studies revealed that a large majority of participants do not report having seen a woman in the center of an urban scene who was photographed in midair as she was committing suicide. Despite seeing the scene repeatedly, 46 % of all participants failed to report seeing a central figure and only 4.8 % reported seeing a falling person. Frequency of noticing the suicidal woman was highest for participants who read a narrative priming story that increased the extent to which she was schematically congruent with the scene. In contrast to this robust effect of inattentional blindness, a majority of participants reported seeing other peripheral objects in the visual scene that were equally difficult to detect, yet more consistent with the scene. Follow-up qualitative analyses revealed that participants reported seeing many elements that were not actually present, but which could have been expected given the overall context of the scene. Together, these findings demonstrate the robustness of inattentional blindness and highlight the specificity with which different visual primes may increase noticing behavior.

Keywords: Visual attention, Selective attention, Perception, Inattentional blindness, Figure-ground perception, Aschematic blindness, Priming, Expectations, Transformational teaching


Philosophers and psychologists have long questioned whether humans have access to an objective reality. One possibility is that we are capable of seeing the physical and social world as it actually exists. This perspective has been termed naïve realism, or direct realism (Ross and Ward 1996), and it posits that our eyes function like cameras or sensors that collect information and then send it back to our brains, where it is represented accurately and objectively. Underlying this approach to seeing the empirical world “as it is,” in terms of its database of information, is bottom-up processing. In this mode of processing, visual attention is deployed based largely on the inherent saliency of different objects in a visual scene (Itti 2005; Itti et al. 1998). Objects that are highly salient are readily noticed and perceived by observers, while less salient objects are not.

Another possibility is that our experience of the physical and social world is inevitably influenced by subjective mental processes. This perspective was described in Bruner’s (1957) classic treatise on perceptual readiness. In that landmark paper, Bruner argued that some people are readier to expect, and therefore quicker to perceive, the least desirable event among an array of expected events; for others, in contrast, it is the most desirable event that stands out. From this perspective, for some people at least some of the time, we see the world “as we think it is,” in terms of our expectations, values, goals, and personal history. Underlying the process of seeing the world in this way, as influenced by cognition, is top-down processing. In this mode of processing, detection and perception of visual information is inherently shaped by observers’ differing histories, motivations, and tendencies, leading people to perceive different aspects of the visual scenes they encounter.

A large body of research has recently attempted to evaluate the respective contributions of bottom-up and top-down processing to visual search, and this work has renewed the debate about how real-world scenes are perceived and remembered (e.g., Chun 2003; Henderson and Hollingworth 1999). Arising from this research is evidence that bottom-up processes affect attention at a localized level, especially in the absence of top-down or contextual cues (Chen and Zelinsky 2006). When contextual information is present, however, it appears to heavily guide visual search behavior (Hsieh et al. 2011). This means that under most natural viewing conditions, top-down processes likely play a prominent role in guiding visual search, insofar as these processes provide information about the likely presence and expected location of various elements within a given scene (Torralba et al. 2006; Wolfe et al. 2003).

Central to top-down processing are conceptual frameworks, or schemas, that encode complex generalizations about one’s experience of the structure of the surrounding environment. We have schemas for virtually all aspects of our experience, including objects, people, and situations. Ultimately, it is through these schemas that we perceive, attend to, evaluate, understand, and remember experiences of the physical and social world (Cantor and Michel 1979; Levy et al. 1999). Schemas simplify the extremely complex task of identifying and classifying the innumerable things in our visual world by grouping instances into more readily-accessible wholes, or gestalts. By doing so, schemas greatly enhance the speed and accuracy of processing the enormous amount of information to which we are exposed everyday (Stryker and Burke 2000).

The obvious utility of schematic perception is based on the swift processing of the familiar. However, what about the novel event, or unfamiliar or unexpected stimulus (Beiderman et al. 1982)? Do perceptual objects of this sort require greater time or mental energy to process? Or, are such objects simply not perceived if they are sufficiently incongruent, or aschematic, with our personal expectations, or with their general context or background (Wolfe et al. 2005)?

Early answers to these questions were offered by social philosopher Henry David Thoreau (1860/1906), who wrote:

A man receives only what he is ready to receive, whether physically or intellectually or morally, as animals conceive at certain seasons their own kind only. We hear and apprehend only what we already half know… The phenomenon or fact that cannot in any wise be linked with the rest which he has observed, he does not observe (pp. 77–78).

Thoreau argued that perception is guided in part by previously-established schemas that provide categorical assignments for new stimuli based on their inherent meaning. This process, in turn, influences our expectations for the probable appearance of different stimuli within different contexts. According to this view, stimuli or changes that violate our expectations, or that rarely occur, should be less likely to be seen than those that are schematic or expected given their context or background. A large body of research has documented effects that are consistent with this view, and this phenomenon has generally been referred to as inattentional blindness (Becklen and Cervone 1983; Mack 2003; Mack and Rock 1998; Most et al. 2001; Neisser and Becklen 1975; Simons 2000).

Updating Thoreau takes us to a social-constructionist view of perception. This perspective posits that we see only what we have learned to see and that we are psychologically prepared to see. This perceptual preparation comes as a consequence of a complex personal history of perceiving events, actions, and people in certain ways, within certain contexts that are socially and culturally appropriate. Each individual’s personal history creates a set of powerful perceptual expectations about the way the world is and ought to be. As a result of these expectations, we most readily see what we expect to see and interpret incoming visual information in ways that are highly congruent with preexisting schemas. This notion that perceptions of the external world are shaped in this way have long been recognized by cognitive and gestalt theorists (e.g., Beck 1967; Perls 1969), and have been elaborated on in many ways over the years (e.g., Attneave 1954; Axelrod 1973; Becker 1973; Greenberg et al. 1986; Henderson and Hollingworth 1999).

The present study was designed to examine this general perceptual phenomenon by testing how frequently individuals report seeing contextually expected and unexpected objects in a visual scene. The study idea originated from an experience of one the authors (PGZ). Specifically, while looking through photos in the book, LIFE: The First 50 Years, 19361986, PGZ expressed surprise and distress at a photo that depicted a woman in midair, who had been photographed seconds after having leapt from a hotel window (See Fig. 1). A companion who was also looking at the photo over his shoulder noticed nothing unusual. Indeed, his recognition of the reality of the shocking scene came only after seeing the photo caption, which read: “The camera caught the 1942 plunge toward the Buffalo pavement of despondent divorcee in the last dreadful split second before death” (Kunhardt 1986, p. 64).

Fig. 1.

Fig. 1

Original suicidal woman stimulus photograph, published in LIFE: The First 50 Years, 19361986

To examine whether this failure to perceive the suicidal woman was a unique experience or an instance of a more common phenomenon, we conducted a series of eight studies in which more than 1,550 students viewed the photograph in a controlled classroom setting under fifteen different experimental conditions. Each study was conducted under the auspices of an experiential activity on “visual perception,” which is consistent with a transformational teaching approach to classroom instruction (Slavich 2005, 2006, 2009; Slavich and Zimbardo 2012; see also Slavich and Toussaint 2013). First, we examined the generality of the phenomenon of failing to see an object that is unexpected, but which should otherwise be readily apparent given its dynamic centrality in a static stimulus array. Based on previous research demonstrating the failure of individuals to detect rare objects (Wolfe et al. 2005) and unlikely or unexpected events (Simons and Chabris 1999), we hypothesized that most viewers would not report having seen the suicidal woman. Next, to extend the existing literature on this topic, we examined the extent to which a series of very specific priming and structural manipulations of the photograph enhanced participants’ detection of the woman and other peripheral scene elements. Across these experimental manipulations, we predicted that reporting of particular scene elements (including the woman) would increase greatest for participants who received primes aimed at making these elements more schematic with the scene, and thus more expected given the visual context.

Method

Overview

The suicidal woman photograph previously described was shown to students in introductory psychology courses over eight years. The photograph was presented on a projection screen for multiple exposures (2, 3, or 4 exposures) of increasing duration (2–10 seconds per exposure). Before each viewing, participants were instructed to extract as much information as possible from the photograph. After each viewing, participants were asked to report on a standardized response form every scene element that they could remember from the photograph.

As described in Table 1, eight replications utilizing this basic study design were conducted in all. Each replication occurred in a classroom format with 106–305 students per class (M=197.1, SD=73.6). The default study instructions prompted participants to “notice all that you can in the photograph” or to “extract as many details as possible from the scene.” In some replications, participants were randomly assigned to different groups and then primed with a specific visual search instruction (e.g., notice as many animate, inanimate, or unusual aspects as possible). In other replications, the photograph was modified in some way in sequential presentations of the test scene (e.g., the suicidal woman was deleted but then added to the scene, or vice versa; see Table 1). Given that the methods, stimuli, and procedures were essentially the same across the eight different studies (i.e., with the exception of structural changes and focus primes in some studies), we report the results of the studies together below.

Table 1.

Overview of characteristics of each study

Study Trials Stimulus duration ISIa Experimental condition n
1 2 6 s/6 s 2 min Notice all that you can 243
2 4 2 s/4 s/6 s/8 s 2 min Extract as many details as possible 106
3 3 2 s/4 s/6 s 2 min Detect name of businesses and anything unusual 305
4 4 3 s/4 s/7 s/10 s 2 min Detect name of businesses and anything unusual 117
5 2 2 s/6 s 2 min Notice all that you can 109
Notice as many animate objects as possible, such as people 45
Notice as many inanimate objects as possible, such as stores or signs 50
Notice things that are unusual 42
6 2 2 s/6 s 2 min Notice as many animate objects as possible, such as people 83
Notice as many inanimate objects as possible, such as stores or signs 84
Notice as many things that are unusual 64
7 2 2 s/6 s 2 min Suicidal woman story prime 121
8 2 2 s/6 s 2 min Woman present→Woman absent 67
Woman absent→Woman present 68
Barber pole present→Barber pole absent 73
N=1,577
a

The ISI, or interstimulus interval, is the time between offset of one stimulus and onset of the next. During this time, participants were asked to list what they saw in the photograph.

Participants

A total of 1,577 students at Stanford University, the majority (74.9 %) of who were freshman, participated in the eight replications that comprised this overall program of research. Participants were of college age (M=19.2 years old; SD=1.03) and approximately equal in sex representation (56 % were female). Participants received research credit for their time. All participants provided written informed consent prior to participation, and all study procedures were approved in advance by the Institutional Review Board at Stanford University.

Stimulus Materials

The central element of the test scene is a woman in the act of committing suicide, who was photographed just seconds after she jumped from a top-story window in the Genesee Hotel in Buffalo, New York, in 1942 (See Fig. 1). The photograph was published in the oversized LIFE Magazine book called LIFE: The First 50 Years, 19361986 (Kunhardt 1986). The woman is located approximately in the center of the scene.

Priming Manipulations

The stimulus photograph was presented after a classroom lecture that discussed perceptual violations of expectations. This overall procedure thus constituted a “meta prime” for noticing scene elements that do not fit one’s schema for such scenes. Schematic perception and violations of expectation had also been defined and discussed prior to presenting the photograph. As described above, we primed participants in a number of different ways across the different studies. We also altered the structure of the test scene in several ways to examine the effects that these experimental manipulations had on participants’ reporting of the different scene elements. The number of trials, stimulus duration, interstimulus interval, and sample size for each study is summarized in Table 1, and a brief description of each of the experimental manipulations is provided below.

Focus Primes

In two separate replications, participants were given more specific instructions on what to focus on in the scene (i.e., “focus primes”). These instructions were printed on the standardized response forms, which were randomly assigned to students in the classroom. As summarized in Table 1, the focus prime in Study 3 and Study 4 was to “Detect the name of businesses in the scene and anything else that is unusual.” In Study 5 and Study 6, participants were asked to “Try to notice…” (a) “as many Animate Objects as possible, such as people,” or (b) “as many Inanimate Objects as possible, such as stores or signs,” or (c) “as many things that are Unusual.”

Despondent Woman Story Prime

The most direct schema-congruent prime with respect to the suicidal woman was a brief vignette that participants in this priming condition read immediately before viewing the test scene. This vignette was presented to students in Study 7. The vignette, introduced as “background information that might be useful to you,” read as follows:

Joan was usually a happy person until she suffered a series of setbacks following graduation. Her mother died in a car crash, and soon after she lost her job that she really liked because the company was downsizing. She has had difficulty getting another job. Her boyfriend recently told her he was breaking off their engagement because he found a woman better suited to his values. Joan was despondent enough to seek help, but her depression worsened anyway. She felt her future was bleak when she checked into that downtown hotel…

Structural Manipulations

Finally, in one replication of the study (Study 8), we systematically altered the structure of the scene to examine how this manipulation influenced participants’ reporting of the different scene elements. In two experimental conditions in this study, the suicidal woman was airbrushed out of the scene to compare perceptions of the photograph on sequential viewings in which the central figure’s presence was manipulated as either present or absent in counterbalanced presentations (i.e., woman present then absent, or woman absent then present; See Fig. 2). In a third experimental condition in Study 8, the barber pole was added to and removed from the scene in counterbalanced presentations to test for differences in noticing scene elements as a function of altering a peripheral detail of the scene.

Fig. 2.

Fig. 2

Altered suicidal woman stimulus photograph, with woman airbrushed out

Procedure

In each replication of the study, the test scene was projected onto a large screen using a Kodak CAROUSEL slide projector. The room was lit with low, ambient lighting. Participants were told that they were about to see a scene presented multiple times, for a few seconds each time. They were instructed to scan the scene as quickly and carefully as possible, and to then report every scene element that they could recall. When reporting on subsequent presentations of the scene, they were told to list elements that they had not noticed previously, as well as any modifications of previously reported items. Participants were given two minutes to record their responses after each viewing.

Prior to this testing procedure, students recorded their age, sex, college year, quality of corrected vision (excellent, good, fair, poor), and viewing/sitting position in the classroom (center or side, and front, middle, or rear) on their standardized response form. This enabled a comparison of the reports of participants with the best vision in ideal viewing positions (front, center) with the reports of those in other viewing conditions. Participants were also asked to write their names on the response form to increase their motivation to perform the task well, given that one of the experimenters was also their course instructor.1

Results

Effects of Visual Acuity and Viewing Position on Reporting of Scene Elements

Analyses were conducted on data from an early replication of the study (Study 1, N=243) to test whether quality of vision (excellent, good, fair, poor) or viewing position (center-front, center-rear, sides) were related to the frequency of reporting elements in the scene. Viewing position was unrelated to reporting frequency, and quality of vision was only associated with reporting of the small “Coffee Shop” sign, χ2(6, N=243)=23.61, p=.001. Among those with excellent/good vision, 85 % of participants saw “Coffee Shop” compared to 52 % of those with fair/poor vision. Because noticing the central figure was unrelated to quality of vision and viewing position, we collapsed the data across these categories for all subsequent analyses.

Perceiving the Central Figure

A summary of seeing the central figure and several peripheral scene elements across all studies is presented in Table 2. Across all priming and structural manipulations, and averaged across repeated exposures (of 2, 3, or 4 presentations) and total exposure times (of 8, 12, and 24 seconds), only 2.2 % of participants reported seeing the target stimulus as a suicidal woman and only 3.9 % of participants reported seeing a suicidal person in the central position of the scene. Nearly half of all viewers (46 %) failed to report seeing anything or anyone in or near the position of the woman. Additional viewing opportunities and increased total exposure time did not significantly increase reporting of the target stimulus.

Table 2.

Percentage of participants who reported seeing the target stimulus and comparison peripheral stimuli across all studies, trials, and experimental conditions

Scene element N % of participants
Central element
 Saw suicidal woman 34/1,577 2.2 %
 Saw suicidal person 62/1,577 3.9 %
 Saw falling person 76/1,577 4.8 %
 Saw target stimulus 852/1,577 54.0 %
Peripheral elements
 Saw “Sandwiches 10¢” 382/1,577 24.2 %
 Saw “Milk Shakes” 411/1,577 26.1 %
 Saw “Garage” 1,036/1,577 65.7 %
 Saw “Fountain” 1,145/1,577 72.6 %
 Saw “Hotel” 1,309/1,577 83.0 %
 Saw “Coffee Shop” 1,482/1,577 94.0 %

Perceiving Peripheral Elements of the Scene

These data may be contrasted with what participants did report. For example, a majority of participants (65.7 %) reported seeing “Garage,” “Hotel” (83.0 %), and “Coffee Shop” (94.0 %). Perhaps even more remarkable, 24.2 % and 26.1 % of participants accurately reported seeing the small signs announcing “Sandwiches 10¢” and “Milk Shakes,” respectively, in the hotel window.

Priming Effects

Focus Primes

In Study 5 (N = 246) and Study 6 (N = 231), participants were primed by having been asked to notice animate, inanimate, or unusual elements of the scene immediately prior to viewing the photograph. These manipulations did not significantly affect the likelihood of seeing the woman as a central figure in the scene. As summarized in Table 3, however, these primes did influence seeing peripheral elements in the scene in predictable ways. For example, participants who received the Inanimate focus prime were more likely than those in the Unusual and Animate conditions to report seeing “Hotel,” χ2(2, N=231)=7.07, p=.03, and “Coffee Shop,” χ2(2, N=231)=7.07, p=.03. Participants who received the Unusual focus prime were more likely than those in the Inanimate and Animate conditions to report seeing “1.00 Up,” χ2(2, N=231)=12.05, p=.002. Finally, participants who received the Animate focus prime were more likely than those in the Inanimate and Unusual conditions to report seeing people in the doorway of the hotel, χ2(2, N=231)=12.88, p=.002.

Table 3.

Percentage of participants who reported seeing scene elements after two seconds of exposure by type of prime

Scene element Type of prime
Neutral Animate Inanimate Unusual Story
Central element
 Saw suicidal woman 0.0 % 0.0 % 0.0 % 0.0 % 12.4 %
 Saw suicidal person 0.5 % 0.0 % 0.0 % 0.0 % 1.7 %
 Saw falling person 0.6 % 15.7 % 9.5 % 9.4 % 2.5 %
 Saw target stimulus 11.1 % 39.7 % 27.2 % 31.4 % 25.0 %
Peripheral elements
 Saw people in doorway 3.4 % 19.3 % 6.0 % 3.1 % 10.1 %
 Saw “1.00 Up” 14.0 % 15.7 % 23.8 % 40.6 % 26.7 %
 Saw “Garage” 18.2 % 13.3 % 20.2 % 20.3 % 33.1 %
 Saw “Hotel” 53.1 % 47.0 % 66.7 % 51.6 % 91.6 %
 Saw “Coffee Shop” 77.0 % 45.8 % 70.2 % 53.1 % 83.3 %

Despondent Woman Story Prime

We hypothesized that seeing the suicidal woman would be enhanced for participants who read an account of a despondent woman with reasons to commit suicide immediately prior to viewing the stimulus photograph. This prediction was validated in Study 7. Of the 121 participants who received the despondent woman story prime immediately prior to viewing the scene, 15 individuals (12.4 %) reported seeing a suicidal woman, compared to 2.2 % of individuals across all samplings and manipulations in this research program, χ2(1, N=1577)=36.34, p<.001.

Structural Manipulation Effects

Woman Present or Absent

In Study 8 (N=135), we altered the structure of the scene by either adding or removing the woman from the photograph in the second of two consecutive presentations of the stimulus. Participants in this study were asked to detect anything that changed in the scene between the first and second viewing. Participants in the condition where the woman was present and then removed differed significantly from those in the condition where the woman was absent and then inserted into the scene with respect to whether they noticed a change on the second viewing, χ2(1, N=135)=55.66, p<.001. Specifically, when the woman was absent and then added to the scene, no one noticed that something was added. However, when she was present and then removed from the scene in the second viewing of the photograph, 58.2 % of participants reported that something had been deleted.

Barber Pole Present or Absent

To test for differences in noticing central scene elements (i.e., those involving the woman) as a function of altering the scene’s peripheral details, the barber pole was added to and removed from the scene in counterbalanced presentations with a sample of 73 participants (Study 8). However, addition and removal of the barber pole did not affect the frequency of reporting any central scene elements.

Qualitative Aspects of Scene Perception

A content analysis was subsequently conducted on the written perceptual accounts to qualitatively examine how participants processed the photograph. When noticed, the central figure was described in 42 different ways – for example, as a woman, man, person, object, and figure. The animate individual was seen as flying, hanging, leaping, diving, and committing suicide. The static object was a statue, mannequin, sculpture, gargoyle, and fixture. When perceived as a figure, it was an angel, cherub, fairy, nymph, and mermaid. Remarkably, the hotel was frequently (and accurately) recalled as the “Genesee Hotel,” but it was also called the “Genesee Motel” and “Genesee Drugs.” The garage sign was described in 18 ways – for example, as free garage and free parking. The people in the window and the doorway were described as a bellboy, doorman, crossing guard, customer, waiter, and policeman (the person in the doorway was actually a policewoman who was responding to the scene).

Participants also frequently reported seeing things in the scene that were not actually present. For example, some participants reported seeing a marine kissing a girl, two people dancing, a fire hydrant, car, bench, tree, alley, overcast sky, and foggy weather. Some of the remembrances involved assimilation to the familiar, extensions of seen objects to related ones, and the mention of elements that were not present, but which would be schematic with the scene. Examples of this included seeing a “car parked out front” immediately after reporting “garage.” Given the setting, fire hydrants and colored neon signs could be expected and were reported, although none are actually present in the scene. In three instances a “red and white barber pole” was reported as being present in this black and white photograph. Finally, several participants dealt with the uncertain, perhaps distressing sight of the suicidal woman by transforming her into a static object, such as a “thing on the roof,” even though the roof is not visible in the photograph.

Discussion

Human visual recognition processes are usually quite robust and effective even when viewing conditions are degraded (Cox et al. 2004). When information about an element or figure in the visual field is insufficient or incomplete, though, we rely heavily on contextual cues to determine how to best understand and interpret the stimulus (Bar and Ullman 1996; Beiderman et al. 1982; Palmer 1975). In such instances, the personal expectations that we bring to a given perceptual task – for example, as a result of our histories, motivations, and disposition – dramatically influence what we see and remember. This is the essence of top-down processing: Humans see what they are psychologically prepared to see and are generally blind to what they are not prepared to see in a particular visual context, given their preexisting schemas and associated expectations for that context.

Data from the present study provide robust evidence in support of this phenomenon. Across eight studies, 15 different experimental conditions, and more than 1,550 highly motivated, educated viewers, a large majority of participants failed to see the most significant event in a scene: a woman in the act of committing suicide. This occurred despite giving participants specific instructions to detect as many elements as possible from the scene, which was presented repeatedly with increasing duration times. Frequency of detecting the suicidal woman did not improve with additional exposure to the photograph and, in fact, was increased (from 2.2 % to 12.4 %) only among participants who read a paragraph-long priming story just prior to viewing the test scene. The story told of a despondent woman entering a hotel and was intended to increase the woman’s schematic congruency with the scene. Although some participants reasonably interpreted the central figure to be something other than a suicidal woman, nearly half of all participants (46 %) across all study replications and experimental manipulations failed to see anything or anyone in the general location of the woman.

In contrast, participants exhibited remarkably accurate and detailed recognition of many other elements in the scene. Across all replications and experimental conditions, for example, 65.7 % of participants saw “Garage,” 83 % of participants saw “Hotel,” and 94 % of participants saw “Coffee Shop.” These scene elements are arguably easier to perceive and recall than the suicidal woman. Importantly, though, they are also more schematically congruent with the scene. Coffee shops are frequently seen on street corners and garage signs are often posted near hotel entrances. More remarkable is that 24.2 % and 26.1 % of participants reported seeing “Sandwiches 10¢” and “Milk Shakes,” respectively. These two elements are incomplete and remain difficult to decipher even after prolonged exposure to the photograph. They are similar to the woman in this way, but differ in that they are schematically congruent with major elements of the scene, like “Coffee Shop” and “Fountain,” which were reported by more than 70 % of all participants.

These two sets of results permit one of the more interesting comparisons of the study: Approximately one-fourth of all viewers made complete sense out of two highly degraded window signs, presumably by using their expectations to fill in the visual information they needed but which was not present; in the same amount of time, only 2.2 % of participants were able to make sense out of the woman given the visual information provided. Accurate detection of the window signs and not the suicidal woman is made more remarkable by the fact that the signs are approximately one-tenth the size of the woman.

These results can be interpreted in several ways. One interpretation is that given the limited amount of time participants had to view the photograph, they simply reported the scene elements that had the greatest bottom-up saliency, such as “Hotel,” “Garage,” and “Coffee Shop.” These elements could arguably require less time to interpret than the suicidal woman. Although the present data do not speak to this interpretation, if tenable, we believe more participants would have reported basic stimulus elements of the central figure as well, such as the woman’s arms or legs. However, frequency of describing the woman’s basic features was relatively low. Indeed, half of all participants across all eight studies did not perform a bottom-up analysis of the suicidal woman.

Taken together, these findings demonstrate that while visual schemas can enhance the detection of expected stimuli, they can also significantly hamper the perception of aschematic stimuli. We hypothesized that several different priming and structural manipulations would improve participants’ detection of the central figure and various peripheral elements, and this prediction was validated. Compared to a neutral condition, for example, participants primed to search for something animate were more likely to see a falling person (15.1 % more) and some central figure (28.6 % more), and they were less likely to report seeing “Hotel” (6.1 % less) and “Coffee Shop” (31.2 % less). Although neither the Inanimate nor the Unusual primes impacted reporting of a suicidal person, both increased the likelihood of seeing some central figure. These findings are novel insofar as they demonstrate the high degree of specificity with which a variety of priming manipulations may influence what individuals see in a complex visual scene.

As expected, the largest priming effect was the significant increase in seeing a suicidal woman among participants who had just read a brief vignette about a despondent woman entering a hotel. In this condition, 12.4 % of participants reported having seen the suicidal woman, compared to 2.2 % across all conditions and replications. Participants who had just read the priming story were also much more likely to report seeing “Hotel” (91.6 % in the story condition, compared to 83 % across all conditions and replications). We interpret these effects based on the formulation that the despondent woman story made the suicidal woman schematic with the photograph, at least temporarily. Interestingly, removing the woman from the scene in the second of two sequential presentations caused 58.2 % of participants to notice a change in the scene, but it did not increase reports of having seen the suicidal woman.

Without these primes, a large majority of viewers failed to notice the woman in any form, suicidal or not. At the same time, participants saw much that was not really there. Indeed, although schematic processing of the scene interfered with participants’ accurate recognition of the aschematic figure, it also contributed to their tendency to invent a series of very reasonable elements to accompany the composite. For example, the barber pole becomes colored, neon signs flash, cars are parked on the street, and fire hydrants, trees, benches, and chairs are all introduced in schematically-congruent fashion. The perceived overcast sky (which is not even visible in the photograph) is enlivened by noticing a Marine kissing a girl, a man pulling a woman by her hair, and two people dancing in the street in front of the hotel. The central figure, when seen, takes on more than forty different forms across various perceptual accounts that all make sense given the context of the scene. Such is the stuff of vivid imaginations at odds with a simpler, more accurate visual account of the scene.

A reasonable question is whether participants’ inability to detect the suicidal woman is simply a failure of proper perceptual grouping (i.e., seeing an arm or leg and the hotel sign, but not grouping the elements so as to perceive both the sign and woman). This would be akin to the classic R. C. James (1973) photograph of a Dalmatian, where viewers can stare at the photograph for minutes but not see the Dalmatian… until it is labeled as such, at which point it becomes obvious. The label, it seems, helps to organize the otherwise random dots into a meaningfully-grouped representation of a dog. Although somewhat similar, the present findings differ from this demonstration in at least two ways: first, participants viewing the suicidal woman photograph were not required to decipher the general context of the scene (in fact, they were told they were going to see an urban scene); and second, participants viewing the suicidal woman photograph were not asked to code a seemingly random visual array of elements (i.e., both the suicidal woman and the setting were “conceptually whole” to begin with).

Relation to “Inattentional Blindness”

Considered more broadly, these effects are consistent with a growing body of research on “inattentional blindness” (Becklen and Cervone 1983; Bredemeier and Simons 2012; Hyman et al. 2010; Mack 2003; Mack and Rock 1998; Most et al. 2001; Neisser and Becklen 1975; Simons 2000; see also Varakin and Levin 2008). This work has demonstrated, for example, that when an experimenter stops to ask a pedestrian for directions and is interrupted by workmen walking between them, one of whom then replaces the experimenter, only half of participants notice the change (Simons and Levin 1998). And when a person in a gorilla suit or a woman with an umbrella enters a visual scene, 46 % of participants fail to notice the event, despite its prominent placement in the scene (Simons and Chabris 1999). Remarkably, these effects occur even among “expert observers” who look directly at the target (e.g., the gorilla), as confirmed by eye tracking (Drew et al. 2013). The present findings extend the results of these studies by showing that many objects in complex scenes (e.g., “Fountain,” “Hotel,” and “Coffee Shop” in the suicidal woman photograph) are actually easy to see and readily reported by participants, assuming they can be reasonably expected given the general context of the scene. This suggests that inattentional blindness cannot simply be the result of poor visibility. In addition, the data from the eight studies reported here show a far lower incidence of noticing (2.2 % for the suicidal woman) than in earlier studies, in addition to high rates of misidentification and schema-congruent replacements.

In addition, the present results are consistent with research on figure-ground perception, including Davenport and Potter (2004), who found that objects and their settings are processed interactively, with general knowledge about an object’s appropriateness in a particular scene influencing the perception of that object (see also Beiderman et al. 1982; Boyce and Pollatsek 1992; Boyce et al. 1989). Our findings extend this work by highlighting the remarkable acuity that people exhibit when deciphering objects that are schematically-congruent with their context or setting. The present study also highlights the effects of a variety of priming and structural manipulations on reporting accuracy. In many cases, these effects are highly specific, improving recognition of prime-congruent elements while decreasing recognition of prime-incongruent elements.

Limitations and Future Directions

Several limitations of these studies should be noted. First, our main comparisons relied on frequencies of noticing the unexpected woman versus frequencies of noticing other more schematic elements of the scene (i.e., a within-scene comparison). Future research would benefit from utilizing scenes in which it is possible to systematically modify the extent to which elements are schematically congruent with the scene by, for example, moving or altering the target object so as to make it either congruent or incongruent with the context of the scene while keeping constant the scene’s other characteristics. In addition, given the design of the present study, it was not possible to determine why participants failed to report the woman. For example, the present data do not speak to whether the unexpected visual information was ignored or distorted at the sensory-perceptual level or at the encoding or retrieval stages of remembering and reporting. This begs the question, were elements not reported because they were not seen in the first place (i.e., “inattentional blindness”) or rather because they were seen but not remembered (i.e., “inattentional amnesia”; see Wolfe 1999)? Future work could employ a gaze-tracking device to identify participants who look at the woman but who do not report seeing her. In addition, researchers could monitor arousal levels in this context (e.g., using measures such as the EEG P-300 “surprise wave,” heart rate measures, or fMRI indices) to identify participants who respond emotionally, but who still do not report seeing the woman.

Concluding Remarks

In conclusion, the human visual system is remarkably good at helping us navigate a very complex physical and social world that, more often than not, is highly predictable. Partly because of the dynamics that underlie this efficiency, however, we are also surprisingly bad at noticing rare or unexpected events, even when they are central elements of a scene (Wolfe et al. 2005). When confronted with scene elements or situations for which we have no established (or active) schema, such information appears to fall into a visual “black hole” – a conceptual blind spot. This is the take-home message from the present data. When perceptual expectations are violated, even our most basic visual recognition processes may fail, as we look but do not see.

Acknowledgments

Preparation of this report was supported in part by National Institutes of Health grant R01 CA140933, by the Cousins Center for Psychoneuroimmunology, and by a Society in Science – Branco Weiss Fellowship to George Slavich. We thank Lauren Anas, Waki Gamez, and Connie Turcott for assisting with data management.

Footnotes

1

Asking students to identify themselves could have led to less complete reporting of elements about which students were uncertain. However, given that students were explicitly instructed to list as many elements from the scene as possible, reporting elements about which students were uncertain was associated with doing better, not worse. This, we believe, decreased the likelihood of omitting uncertain scene elements.

Declaration of Conflicting Interests

The authors declare that they have no conflicts of interest with respect to their authorship or the publication of this article.

Contributor Information

George M. Slavich, Email: gslavich@mednet.ucla.edu, Cousins Center for Psychoneuroimmunology and Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, UCLA Medical Plaza 300, Room 3156, Los Angeles, CA 90095-7076, USA

Philip G. Zimbardo, Department of Psychology, Stanford University, Stanford, CA 94305-2130, USA

References

  1. Attneave F. Some informational aspects of visual perception. Psychological Review. 1954;61:183–193. doi: 10.1037/h0054663. [DOI] [PubMed] [Google Scholar]
  2. Axelrod R. Schema theory: an information processing model of perception and cognition. The American Political Science Review. 1973;67:1248–1266. [Google Scholar]
  3. Bar M, Ullman S. Spatial context in recognition. Perception. 1996;25:343–352. doi: 10.1068/p250343. [DOI] [PubMed] [Google Scholar]
  4. Beck AT. Depression: Clinical, experimental, and theoretical aspects. New York: Harper and Row; 1967. [Google Scholar]
  5. Becker E. The denial of death. New York: Simon & Schuster; 1973. [Google Scholar]
  6. Becklen R, Cervone D. Selective looking and the noticing of unexpected events. Memory and Cognition. 1983;11:601–608. doi: 10.3758/bf03198284. [DOI] [PubMed] [Google Scholar]
  7. Beiderman I, Mezzanotte RJ, Rabinowitz JC. Scene perception: detecting and judging objects undergoing relational violations. Cognitive Psychology. 1982;14:143–177. doi: 10.1016/0010-0285(82)90007-x. [DOI] [PubMed] [Google Scholar]
  8. Boyce SJ, Pollatsek A. Identification of objects in scenes: the role of scene background in object naming. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1992;18:531–543. doi: 10.1037//0278-7393.18.3.531. [DOI] [PubMed] [Google Scholar]
  9. Boyce SJ, Pollatsek A, Rayner K. Effect of background information on object identification. Journal of Experimental Psychology: Human Perception and Performance. 1989;15:556–566. doi: 10.1037//0096-1523.15.3.556. [DOI] [PubMed] [Google Scholar]
  10. Bredemeier K, Simons DJ. Working memory and inattentional blindness. Psychonomic Bulletin and Review. 2012;19:239–244. doi: 10.3758/s13423-011-0204-8. [DOI] [PubMed] [Google Scholar]
  11. Bruner JS. On perceptual readiness. Psychological Review. 1957;64:123–152. doi: 10.1037/h0043805. [DOI] [PubMed] [Google Scholar]
  12. Cantor N, Michel W. Traits as prototypes: effects on recognition memory. Journal of Personality and Social Psychology. 1979;35:38–48. [Google Scholar]
  13. Chen X, Zelinsky GJ. Is visual search a top-down or bottom-up process? Journal of Vision. 2006;6:447. [Google Scholar]
  14. Chun MM. Scene perception and memory. In: Irwin D, Ross B, editors. Psychology of learning and motivation: Advances in research and theory: Cognitive vision. Vol. 42. San Diego: Academic Press; 2003. pp. 79–108. [Google Scholar]
  15. Cox D, Meyers E, Sinha P. Contextually evoked object-specific responses in human visual cortex. Science. 2004;304:115–117. doi: 10.1126/science.1093110. [DOI] [PubMed] [Google Scholar]
  16. Davenport JL, Potter MC. Scene consistency in object and background perception. Psychological Science. 2004;15:559–564. doi: 10.1111/j.0956-7976.2004.00719.x. [DOI] [PubMed] [Google Scholar]
  17. Drew T, Võ ML, Wolfe JM. The invisible gorilla strikes again: sustained inattentional blindness in expert observers. Psychological Science. 2013;24:1848–1853. doi: 10.1177/0956797613479386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Greenberg J, Pyszczynski T, Solomon S. The causes and consequences of the need for self-esteem: A terror management theory. In: Baumeister RF, editor. Public self and private self. New York: Springer; 1986. pp. 189–212. [Google Scholar]
  19. Henderson JM, Hollingworth A. High-level scene perception. Annual Review of Psychology. 1999;50:243–271. doi: 10.1146/annurev.psych.50.1.243. [DOI] [PubMed] [Google Scholar]
  20. Hsieh PJ, Colas JT, Kanwisher N. Pop-out without awareness: unseen feature singletons capture attention only when top-down attention is available. Psychological Science. 2011;22:1220–1226. doi: 10.1177/0956797611419302. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Hyman IE, Boss SM, Wise BM, McKenzie KE, Caggiano JM. Did you see the unicycling clown? Inattentional blindness while walking and talking on a cell phone. Applied Cognitive Psychology. 2010;24:597–607. [Google Scholar]
  22. Itti L. Models of bottom-up attention and saliency. In: Itti L, Rees G, Tsotsos JK, editors. Neurobiology of attention. San Diego: Elsevier; 2005. pp. 576–582. [Google Scholar]
  23. Itti L, Koch C, Niebur E. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998;20:1254–1259. [Google Scholar]
  24. James RC. Dalmatian photo. In: Gregory RL, editor. The intelligent eye. New York: McGraw-Hill; 1973. p. 14. [Google Scholar]
  25. Kunhardt PB. LIFE: The first 50 years, 1936–1986. Boston: Little, Brown and Company; 1986. [Google Scholar]
  26. Levy SR, Plaks JE, Dweck CS. Modes of social thought: Implicit theories and social understanding. In: Chaiken S, Trope Y, editors. Dual-process theories in social psychology. New York: Guilford Press; 1999. pp. 179–202. [Google Scholar]
  27. Mack A. Inattentional blindness: looking without seeing. Current Directions in Psychological Science. 2003;12:180–184. [Google Scholar]
  28. Mack A, Rock I. Inattentional blindness. Cambridge: MIT Press; 1998. [Google Scholar]
  29. Most SB, Simons DJ, Scholl BJ, Jimenez R, Clifford E, Chablis CF. How not to be seen: the contributions of similarity and selective ignoring to sustained inattentional blindness. Psychological Science. 2001;12:9–17. doi: 10.1111/1467-9280.00303. [DOI] [PubMed] [Google Scholar]
  30. Neisser U, Becklen R. Selective looking: attending to visually significant events. Cognitive Psychology. 1975;7:480–494. [Google Scholar]
  31. Palmer SE. The effects of contextual scenes on the identification of objects. Memory and Cognition. 1975;3:519–526. doi: 10.3758/BF03197524. [DOI] [PubMed] [Google Scholar]
  32. Perls FS. In: Gestalt therapy verbatim. Stevens JO, editor. Lafayette: Real People Press; 1969. [Google Scholar]
  33. Ross L, Ward A. Naive realism in everyday life: Implications for social conflict and misunderstanding. In: Brown T, Reed E, Turiel E, editors. Values and knowledge. Hillsdale: Lawrence Erlbaum Associates; 1996. pp. 103–135. [Google Scholar]
  34. Simons DJ. Attentional capture and inattentional blindness. Trends in Cognitive Sciences. 2000;4:147–155. doi: 10.1016/s1364-6613(00)01455-8. [DOI] [PubMed] [Google Scholar]
  35. Simons DJ, Chabris CF. Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception. 1999;28:1059–1074. doi: 10.1068/p281059. [DOI] [PubMed] [Google Scholar]
  36. Simons DJ, Levin DT. Failure to detect changes to people during a real-world interaction. Psychonomic Bulletin and Review. 1998;5:644–649. [Google Scholar]
  37. Slavich GM. Transformational teaching. Excellence in teaching. 2005;5 Retrieved from http://www.teachpsych.org/ebooks/eit2005/eit05-11.html. [Google Scholar]
  38. Slavich GM. On becoming a teacher of psychology. In: Irons JG, Beins BC, Burke C, Buskist B, Hevern V, Williams JE, editors. The teaching of psychology in autobiography: Perspectives from exemplary psychology teachers. Washington, DC: American Psychological Association; 2006. pp. 92–99. [Google Scholar]
  39. Slavich GM. On 50 years of giving psychology away: an interview with Philip Zimbardo. Teaching of Psychology. 2009;36:278–284. [Google Scholar]
  40. Slavich GM, Toussaint L. Using the Stress and Adversity Inventory (STRAIN) as a teaching tool leads to significant learning gains in two courses on stress and health. Stress and Health. 2013 doi: 10.1002/smi.2523. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Slavich GM, Zimbardo PG. Transformational teaching: theoretical underpinnings, basic principles, and core methods. Educational Psychology Review. 2012;24:569–608. doi: 10.1007/s10648-012-9199-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Stryker S, Burke PJ. The past, present, and future of an identity theory. Social Psychology Quarterly. 2000;63:284–297. [Google Scholar]
  43. Thoreau HD. The journal of Henry David Thoreau. In: Torrey B, editor. The writings of Henry David Thoreau: XIII. Boston: Houghton Mifflin; 1906. pp. 77–78. [Google Scholar]
  44. Torralba A, Oliva A, Castelhano M, Henderson JM. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychological Review. 2006;113:766–786. doi: 10.1037/0033-295X.113.4.766. [DOI] [PubMed] [Google Scholar]
  45. Varakin DA, Levin DT. Scene structure enhances change detection. Quarterly Journal of Experimental Psychology. 2008;61:543–551. doi: 10.1080/17470210701774176. [DOI] [PubMed] [Google Scholar]
  46. Wolfe JM. Inattentional amnesia. In: Coltheart V, editor. Fleeting memories: Cognition of brief visual stimuli. Cambridge: MIT Press; 1999. pp. 71–94. [Google Scholar]
  47. Wolfe JM, Butcher SJ, Lee C, Hyle M. Changing your mind: on the contributions of top-down and bottom-up guidance in visual search for feature singletons. Journal of Experimental Psychology: Human Perception and Performance. 2003;29:483–502. doi: 10.1037/0096-1523.29.2.483. [DOI] [PubMed] [Google Scholar]
  48. Wolfe JM, Horowitz TS, Kenner N. Rare items often missed in visual searches. Nature. 2005;435:439–440. doi: 10.1038/435439a. [DOI] [PMC free article] [PubMed] [Google Scholar]

RESOURCES