Abstract
When the proximal and distal elements of wire-frame cubes are conflated, observers perceive illusory structures that no longer behave veridically. These phenomena suggest that what we normally see depends on visual associations generated by experience. The necessity of such learning may explain why the mammalian visual system is subject to a prolonged period of plasticity in early life, when novel circuits are made in enormous numbers.
Keywords: vision, illusion, visual learning, association
Information generated by the eyes is ambiguous. Every day we have to make decisions (about the size and distance of objects, their form, and whether they are moving) based on retinal images that can have two or more meanings (1–4). Indeed, because the complexities of a three-dimensional world are projected onto a two-dimensional receptor sheet, the interpretation of most retinal images is equivocal. The ability to resolve these uncertainties, a talent directly relevant to survival, shows that normally we have little trouble reaching valid conclusions about potentially confusing visual stimuli. But how do we accomplish this? In the 19th century, students of vision were divided on this issue, the two opposing camps being represented by Hering and Helmholtz (1, 5). Hering maintained that the innate analytic abilities of the visual system enabled such determinations to be made more or less a priori (the “nativist” position). Conversely, Helmholtz maintained that the correct interpretation of visual stimuli is generally a matter of inferences based on visual experience (the “empiricist” position).
We have reexamined this long-standing debate using visual stimuli generated by wire-frame cubes. Because the proximal and distal elements of such structures are not easily distinguished, an illusory object can be perceived that behaves quite differently from the solid objects we are accustomed to seeing. In addition to their intriguing—and often amusing—nature, these alternative percepts raise the question of whether visual perception is based on the operation of a priori rules for processing information supplied by the retina (6) or is better explained in terms of a posteriori associations acquired by experience with objects in the real world (1, 4, 7, 8).
Altered Form of a Wire-Frame Cube.
When a transparent cube is viewed monocularly, the two most common interpretations of the stimulus alternate, much as when one views a two-dimensional representation (the familiar “Necker cube”; ref. 9). For a transparent cube positioned as in Fig. 1A, one interpretation—the correct one in this example—is looking down on the top of the cube; the other is as if looking up at its bottom (Fig. 1B). If the transparent cube is seen in its top-down presentation, all six faces appear to be approximately equal in area. When, however, the same retinal image is perceived as if viewed from the bottom, the structure is seen as a truncated pyramid, the proximal faces appearing smaller than the distal faces (Fig. 2). Moreover, when seen in the illusory bottom-up orientation, the cube appears to be balanced on its distal–inferior vertex, with the surface on which it actually rests rising from the balance point (see Figs. 1 and 2). (Illusory, in this case, means an interpretation of the stimulus that does not accord with the configuration of the object determined by direct measurement.) In short, the observer no longer judges the object to be a cube, despite the unchanged retinal image, knowledge of its actual structure, and the immediately preceding perception of a cube in top-down view.
A first-order explanation of these phenomena follows from the geometry of the situation. Because of their greater distance, the angles subtended on the retina by the distal elements of the cube are smaller than the angles subtended by the proximal ones. When the object is perceived in its top-down (actual) presentation, the visual system “compensates” for this asymmetry of the retinal image such that the structure is seen as a cube (Fig. 2). This adjustment presumably occurs because the visual system associates retinal images that are routinely distorted by the geometry of size and distance with percepts that better represent the actual object. However, when the illusory (bottom-up) interpretation prevails, the usual relationship of the front and back faces of the cube is reversed, such that a different form of the same retinal image is perceived. This alternative perception occurs because the usual compensatory mechanism is now applied inappropriately.
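The size and distance geometry described above can be made concrete with a small numerical sketch. It is an illustration only, not part of the observations reported here: the cube edge and viewing distances are hypothetical values, the faces are treated as lying perpendicular to the line of sight, and the function names are ours. The sketch pairs each retinal angle first with its true distance and then with the swapped distance that corresponds to the reversed interpretation.

```python
import math

def visual_angle(size, distance):
    """Angle (radians) subtended at the eye by an extent of `size`
    centred on the line of sight at `distance`."""
    return 2.0 * math.atan(size / (2.0 * distance))

def inferred_size(angle, assumed_distance):
    """Physical extent implied by a visual angle if the element is
    taken to lie at `assumed_distance`."""
    return 2.0 * assumed_distance * math.tan(angle / 2.0)

# Hypothetical values chosen only for illustration (metres).
EDGE = 0.10          # edge length of the cube
NEAR = 0.50          # distance to the proximal face
FAR = NEAR + EDGE    # distance to the distal face

angle_proximal = visual_angle(EDGE, NEAR)
angle_distal = visual_angle(EDGE, FAR)   # smaller than angle_proximal

# Veridical interpretation: each angle is paired with its true distance,
# so both faces are recovered with the same edge length (a cube).
veridical_near = inferred_size(angle_proximal, NEAR)
veridical_far = inferred_size(angle_distal, FAR)

# Reversed interpretation: the smaller image is taken to be the near face
# and the larger image the far face, so the same compensation now
# exaggerates the difference (a truncated pyramid).
illusory_near = inferred_size(angle_distal, NEAR)
illusory_far = inferred_size(angle_proximal, FAR)

print(f"veridical: near {veridical_near:.4f} m, far {veridical_far:.4f} m")
print(f"illusory:  near {illusory_near:.4f} m, far {illusory_far:.4f} m")
```

Under these assumptions the veridical pairing returns equal edge lengths, whereas the reversed pairing makes the apparently nearer face smaller than the apparently farther one, in keeping with the truncated pyramid that observers report.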
Altered Motion Parallax.
A second remarkable phenomenon is apparent if, while viewing a wire-frame cube, the head is moved from side to side. Normally, this strategy is used to ascertain the spatial relationships of objects by motion parallax (Fig. 3). As the head moves one way, objects in the foreground are perceived as shifting in the opposite direction with respect to the background, thus aiding judgments about depth (10) that also are informed by stereopsis, accommodation, vergence, and many other cues. As long as the observer perceives the transparent cube in its actual (top-down) orientation (see Fig. 1), motion parallax is generated by head movements (Fig. 3A). When, however, the same retinal image is seen in reversed perspective, motion parallax fails: the object no longer moves laterally in relation to the background but rotates in the direction of the head movement (Fig. 3B).
A first-order explanation again follows from the geometry of the situation. When the head is moved laterally, the proximal elements of the cube move a greater distance on the retina than the distal elements. Importantly, these changes of the retinal image generated by head movement are the same as those that occur when the cube rotates (see Fig. 3). The visual system normally appreciates that, when the head is moved to assess spatial relationships, the foreground objects are not in fact rotating but are most usefully perceived as shifting laterally with respect to the background. When, however, the observer sees the transparent cube in reversed perspective, the object elements perceived to be nearer (i.e., the distal elements of the cube) move less than the proximal elements; in this case, the visual system interprets the cube to be rotating in the direction of the head movement. The illusory perception occurs because this sequence of events signifies rotation when looking at objects that behave veridically. Thus, the visual system associates a particular sequence of changes in the retinal image with a particular perception (motion parallax), predicated on the behavior of conventional (solid) objects.
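The parallax geometry can likewise be sketched numerically. The following is a simplified illustration under stated assumptions: the head displacement and viewing distances are hypothetical, each element starts directly ahead of the eye, and shifts are expressed as changes in visual direction rather than as retinal distances. The helper names are ours.

```python
import math

def direction_shift(head_shift, distance):
    """Signed change in visual direction (radians, positive = rightward)
    of a point that starts straight ahead at `distance` when the eye
    moves sideways by `head_shift` (positive = rightward)."""
    return math.atan2(-head_shift, distance)

def relative_shift(perceived_near, perceived_far):
    """Shift of the elements taken to be near, measured relative to
    the elements taken to be far."""
    return perceived_near - perceived_far

# Hypothetical values chosen only for illustration (metres).
HEAD_SHIFT = 0.05    # rightward head movement
NEAR = 0.50          # distance to the proximal elements
FAR = 0.60           # distance to the distal elements

shift_proximal = direction_shift(HEAD_SHIFT, NEAR)
shift_distal = direction_shift(HEAD_SHIFT, FAR)

# Veridical depth assignment: the proximal elements are seen as near.
veridical = relative_shift(shift_proximal, shift_distal)

# Reversed depth assignment: the distal elements are seen as near.
illusory = relative_shift(shift_distal, shift_proximal)

print(f"veridical relative shift: {veridical:+.4f} rad")
print(f"illusory relative shift:  {illusory:+.4f} rad")
```

With a rightward head movement, the veridical value is negative (the apparent foreground shifts against the head movement, as in ordinary motion parallax), whereas the reversed assignment gives a positive value (the apparent foreground moves with the head), which is the pattern the text associates with perceived rotation.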
Altered Direction of Movement.
If the surface on which a wire-frame cube rests is made to rotate (see Fig. 1A), a third phenomenon becomes apparent. Although the cube seen in its actual (top-down) presentation appears to turn normally, when the observer interprets the image to be in the illusory (bottom-up) orientation (see Fig. 2), the direction of rotation immediately reverses. As a result, the cube is perceived to be tumbling in the opposite direction above the rotating surface (Fig. 4) (see also refs. 11 and 12). Moreover, the elements of a pattern on the portion of the rotating surface bounded by the bottom frame of the cube (see Fig. 1A) also rotate in the opposite direction. Despite knowledge of the actual arrangement of the object and the law of gravity, this illusory behavior looks every bit as “real” as the veridical rotation of the cube and the surface on which it rests.
The geometrical explanation of these subjectively amazing percepts is again straightforward. When the cube rotates, some elements move to the left while others move to the right. If the elements that move to the left in the actual (top-down) presentation are interpreted as being closer to the observer, the cube rotates in a clockwise direction. If, on the other hand, the contours that move to the right are perceived as being closer to the observer, the cube rotates in a counterclockwise direction. The visual system apparently determines the direction of rotation based on associating the proximal parts of an object moving to the left and the distal parts moving to the right with clockwise rotation and vice versa (compare Fig. 4 A and B; see also ref. 13). As a result, whenever the interpretation of the cube’s orientation changes, the perceived direction of rotation instantly reverses. The other aspects of the illusory behavior (i.e., tumbling above the surface, part of which is appropriated by one of the upright faces of the cube; the reversed rotation of the elements of a pattern bounded by the cube) follow from earlier explanations.
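The dependence of perceived rotation on depth assignment can be written out as a short sketch. It rests on assumptions that are ours rather than part of the observations: rotation is about a vertical axis and is described as seen from above, the image is treated as an orthographic projection, and the function and argument names are hypothetical. For a point at depth z relative to the axis (negative = nearer the observer), rotation at angular velocity omega (positive = counterclockwise) gives a horizontal image velocity of -omega * z.

```python
def perceived_rotation(image_velocity_x, assumed_depth):
    """Direction of rotation (as seen from above) implied by the
    horizontal image motion of one element and its assumed depth.

    image_velocity_x: positive = moving to the observer's right.
    assumed_depth: depth relative to the rotation axis;
                   negative = nearer than the axis, positive = farther.
    """
    if assumed_depth == 0 or image_velocity_x == 0:
        raise ValueError("direction is undefined for zero depth or zero velocity")
    # Invert v_x = -omega * z to recover the sign of omega.
    omega = -image_velocity_x / assumed_depth
    return "counterclockwise" if omega > 0 else "clockwise"

# One element of the cube drifting to the left in the image.
leftward = -1.0

# Interpreted as a near element, leftward motion implies clockwise rotation.
seen_as_near = perceived_rotation(leftward, assumed_depth=-1.0)

# The same image motion interpreted as a far element implies the opposite.
seen_as_far = perceived_rotation(leftward, assumed_depth=+1.0)

print(seen_as_near, seen_as_far)
```

Because only the sign of the assumed depth changes between the two calls, the sketch reproduces the point made above: the image motion is unchanged, yet the implied direction of rotation flips whenever the near and far elements are interchanged.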
Implications.
Each of these several observations shows that the perception of a transparent object can be dramatically altered by the observer’s interpretation of its arrangement in space. The ambiguous retinal stimulus generated by monocular viewing of a wire-frame cube is similar to that elicited by the familiar two-dimensional Necker cube. As a result, perception shifts between two equally plausible interpretations of the retinal image that interchange the object’s proximal and distal elements. The perceptions that arise from the two alternative interpretations of a two-dimensional Necker cube behave identically. In the case of the three-dimensional cube, however, conflating the front and back of the stimulus has dramatic consequences. The aspects of a wire-frame cube that change depending upon whether the stimulus is seen in its veridical or illusory presentation include such basic properties as the form of the object, its spatial relationship with other objects, and its direction of movement (Table 1). Beyond the geometrical explanations already offered, how can the same retinal image give rise to such different perceptual experiences?
Table 1.
Retinal image | Interpretation | Behavior of object |
---|---|---|
Stationary; proximal and distal elements differ in size | actual | cube appears as cube |
Stationary; proximal and distal elements differ in size | illusory | cube appears as truncated pyramid |
Changes as observer moves head from side to side | actual | cube appears to move in direction opposite to head movement against background (motion parallax) |
Changes as observer moves head from side to side | illusory | cube appears to rotate in the same direction as head movement |
Changes as object rotates | actual | cube appears to rest on surface and rotates in same direction as support |
Changes as object rotates | illusory | cube appears to tumble in direction opposite to that of support |
As suggested by Hering (1, 5) and more recently by others (e.g., ref. 6), one explanation might be that visual perception is based on a priori analytic processes that operate on retinal information. When confronted with an ambiguous image, these processes would continue to operate but could generate more than one perceptual outcome in response to a particular stimulus. This strategy, however, would preclude the routine resolution of visual ambiguity. This point may be best appreciated by considering the resolution of semantic ambiguity. Take, for example, the sentence “The house is on the lake.” Like the retinal image of a wire-frame cube, the sense is ambiguous in this case because of the multiple meanings of the preposition “on” (in particular, the statement could mean the house is floating on the lake or is simply near its shore). No a priori rule can, in principle, determine which of the possible meanings is intended because that information is not contained in the statement. The ambiguity could be resolved arbitrarily by limiting the meaning of the preposition but only at considerable cost to the richness of language. In fact, semantic ambiguity is retained, the correct meaning being sorted out by additional knowledge about context, usage, etc. Likewise, in the case of an ambiguous retinal image, the uncertainty is resolved by virtue of additional information. Although ancillary cues such as those provided by stereopsis or by feedback from vergence and/or accommodation may often indicate the correct meaning of a retinal image, they are of limited effectiveness in determining spatial relationships among objects that are more than a few meters away from the observer (1, 3). Indeed, our observations make plain that such ancillary cues cannot resolve the ambiguities presented by a transparent cube; if they could, we would never see the illusory perceptions we describe.
The most plausible source of the additional information needed to resolve visual ambiguity is prior experience. Such experience could be derived from cues associated with other aspects of the scene, information from other sensory modalities (e.g., tactile experience with objects), from motor feedback, or even from associations established during phylogeny (14–16). Although the associational consequences of visual stimulation may be quite predictable, the “rules” in this conception are empirical. The visual system must accumulate by experience the associations elicited by an ambiguous retinal image and by the same token must eventually learn which set of associations best represents the actual object (i.e., the veridical percept). Because the generation of such associations is deeply ingrained in the nervous system, we are usually unaware of the visual puzzles they routinely solve. The virtue of the transparent figures we have used here is to present a type of ambiguity with which we have had little or no experience, thus forcing the observer to be aware of a process that we normally take for granted.
The analogy between ambiguous visual and linguistic information is also helpful in thinking about the development of the ability to resolve uncertainties by prior experience. The basic circuitry for understanding and producing speech sounds is present very early (17, 18), presumably being “hard-wired,” much as the circuitry that subserves classical visual receptive field properties (19–21). Whether in the context of language or vision, such circuitry provides the wherewithal to trigger the associations that indicate the correct meaning of a stimulus. The importance of experience in resolving ambiguity in the visual world is underscored by well-documented clinical cases in which the proper interpretation of visual stimuli has to be learned again, often with great difficulty and limited success, when sight is restored in adults after blindness since childhood (22–25). (The etiology in such cases is typically bilateral destruction of the corneas by trauma or infection.) When vision is restored, these individuals invariably report difficulty understanding the visual world, despite the fact that they had been normally sighted in early life and that their postoperative visual acuity is reasonably good. Over weeks or months or in some cases longer, most of these patients learn to correctly interpret the meaning of various visual stimuli.
That experience has profound effects on the organization of the visual system in humans and other mammals is well known (26, 27). Moreover, a variety of evidence has shown that most brain circuitry is constructed postnatally (28–34) and is subject to the influence of neural activity (35–37). For example, in the developing rodent brain, different functional regions of cortex grow in proportion to their degree of metabolic and electrical activity (35, 36). Despite a wealth of information about neural development, the purpose of the large number of activity-dependent connections established postnatally has remained unclear. The observations we describe here suggest an answer to this puzzle. If the inherent uncertainty of many—perhaps most—retinal images can only be resolved by learning about actual objects, the influence of postnatal visual activity may serve primarily to establish the neuronal associations that enable appropriate interpretations of otherwise ambiguous information. In the context of receptive field properties alone, it is difficult to imagine why the visual system should remain plastic for a prolonged period in postnatal life, particularly because this malleability entails substantial jeopardy from the debilitating effects of visual deprivation (26, 27, 38). In the context of forming the neuronal associations needed to interpret inevitably ambiguous retinal images, such plasticity makes good sense.
Acknowledgments
We are especially grateful to Len White for his advice in the course of this work; David Fitzpatrick, Larry Katz, Greg Lockhead, and Ken Nakayama also provided helpful criticism. Support from a National Institutes of Health grant is gratefully acknowledged.
References
- 1. von Helmholtz H L F. Helmholtz’s Treatise on Physiological Optics, transl. Southall, J. P. C. I-III. Menasha, WI: George Banta Publishing; 1924.
- 2. Duncker K. In: Source Book of Gestalt Psychology. Ellis W H, editor. London: Routledge; 1938. pp. 161–172.
- 3. Rock I. Perception. New York: Freeman; 1984.
- 4. Gregory R L. Eye and Brain: The Psychology of Seeing. 4th Ed. Princeton: Princeton Univ. Press; 1990.
- 5. Turner R S. In the Eye’s Mind: Vision and the Helmholtz-Hering Controversy. Princeton: Princeton Univ. Press; 1994.
- 6. Marr D. Vision. San Francisco: Freeman; 1982.
- 7. Nakayama K, Shimojo S. Science. 1992;257:1357–1363. doi: 10.1126/science.1529336.
- 8. Nakayama K, Shimojo S. In: Visual Cognition: An Invitation to Cognitive Science. 2nd Ed. Kosslyn S M, Osherson D N, editors. Cambridge, MA: MIT Press; 1995. pp. 1–70.
- 9. Necker L A. Phil Mag J Sci. 1832;1:329–337.
- 10. Rogers B, Graham M. Perception. 1979;8:125–134. doi: 10.1068/p080125.
- 11. Peterson M A, Shyi G C-W. Percept Psychophys. 1988;44:31–42. doi: 10.3758/bf03207472.
- 12. Masin S C. Foundations of Perceptual Theory. Amsterdam: North–Holland; 1993.
- 13. Ittelson A, Ames W H., Jr J Psychol. 1950;30:43–62.
- 14. Tinbergen N. Curious Naturalists. Garden City, NY: Doubleday; 1969.
- 15. Alcock J. Animal Behavior: An Evolutionary Approach. Sunderland, MA: Sinauer; 1993.
- 16. Fantz R L. Science. 1963;140:296–297. doi: 10.1126/science.140.3564.296.
- 17. Eimas P D, Siqueland E R, Jusczyk P, Vigorito J. Science. 1971;171:303–306. doi: 10.1126/science.171.3968.303.
- 18. Miyawaki M, Strange W, Verbrugge R, Liberman A, Jenkins J J, Fujimura O. Percept Psychophys. 1975;18:331–340.
- 19. Hubel D H, Wiesel T N. J Neurophysiol. 1963;26:994–1002. doi: 10.1152/jn.1963.26.6.994.
- 20. Hubel D H, Wiesel T N. J Comp Neurol. 1974;158:267–294. doi: 10.1002/cne.901580304.
- 21. Stryker M P, Sherk H. Science. 1975;190:904–906. doi: 10.1126/science.1188372.
- 22. von Senden M. Space and Sight, transl. Heath, P. New York: Methuen; 1960.
- 23. Gregory R L, Wallace J G. Exp Psych Soc Monograph. 1963;2:1–46.
- 24. Valvo A. In: Sight Restoration After Long-Term Blindness: The Problems and Behavior Patterns of Visual Rehabilitation. Clark L L, Jastrzembska Z Z, editors. New York: American Foundation for the Blind; 1971. pp. 1–5.
- 25. Sacks O. An Anthropologist from Mars: Seven Paradoxical Tales. New York: Alfred A. Knopf; 1995.
- 26. Wiesel T N. Nature (London) 1982;299:583–591. doi: 10.1038/299583a0.
- 27. Hubel D H. Eye, Brain, and Vision. New York: Freeman; 1988.
- 28. Cragg B C R. J Comp Neurol. 1975;160:147–166. doi: 10.1002/cne.901600202.
- 29. Pomeroy S L, LaMantia A-S, Purves D. J Neurosci. 1990;10:1952–1966. doi: 10.1523/JNEUROSCI.10-06-01952.1990.
- 30. LaMantia A-S, Pomeroy S, Purves D. J Neurosci. 1992;12:976–988. doi: 10.1523/JNEUROSCI.12-03-00976.1992.
- 31. Riddle D, Richards A, Zsuppan F, Purves D. J Neurosci. 1992;12:3509–3524. doi: 10.1523/JNEUROSCI.12-09-03509.1992.
- 32. Bourgeois J-P, Rakic P. J Neurosci. 1993;13:2801–2820. doi: 10.1523/JNEUROSCI.13-07-02801.1993.
- 33. Purves D, Riddle D, White L, Gutierrez G. Curr Opin Neurobiol. 1994;4:120–123. doi: 10.1016/0959-4388(94)90041-8.
- 34. Purves D, White L, Zheng D, Andrews T, Riddle D. In: Individual Development Over the Lifespan: Biological and Psychosocial Perspectives. Magnusson D, editor. Cambridge, UK: Cambridge Univ. Press; 1995. pp. 162–178.
- 35. Riddle D R, Gutierrez G, Zheng D, White L, Richards A, Purves D. J Neurosci. 1993;13:4193–4213. doi: 10.1523/JNEUROSCI.13-10-04193.1993.
- 36. Zheng D, Purves D. Proc Natl Acad Sci USA. 1995;92:1802–1806. doi: 10.1073/pnas.92.6.1802.
- 37. Purves D. Neural Activity and the Growth of the Brain. Cambridge, UK: Cambridge Univ. Press; 1994.
- 38. Horton J C. In: Adler’s Physiology of the Eye. Hart W M, editor. St. Louis: Mosby; 1992. pp. 728–772.