Abstract
Scenes in the real world carry large amounts of information about color, texture, shading, illumination, and occlusion, giving rise to our perception of a rich and detailed environment. In contrast, line drawings contain only a sparse subset of scene contours. Nevertheless, they also trigger vivid three-dimensional impressions despite having no equivalent in the natural world. Here, we ask why line drawings work. We see that they exploit the underlying neural codes of vision, and that they show that artists’ intuitions go well beyond the understanding of vision found in current neuroscience and computer vision.
Keywords: visual perception, art, picture perception, painting, computer vision
Line drawings have fascinated artists and scientists in various fields for many centuries with the first line drawings dating back more than 30,000 years (Figure 1A). The ease and immediacy of recognizing scenes and objects in simple line drawings suggests that, for the visual system, line drawings have deep similarities to other more detailed visual representations as well as to the real scenes they depict. For example, line drawings of visual scenes are recognized as fast and accurately as photographs (e.g., Biederman and Ju, 1988). Sometimes, line drawings convey a stunningly vivid impression of depth and three-dimensional shape, even when not much more than the outlines of an object are drawn.
Line drawings are so common in our everyday life that we seldom ask why they work. Once we ask that question, however, we realize that line drawings really are exceptional. In particular, in the real world, there are no lines around objects (with rare exceptions; Figure 1B). During the eons over which biological vision systems evolved, there has been no experience that could have adapted our visual systems to understand line drawings. Instead, objects are usually segmented from the background by lightness, texture, or color differences. So why does the visual system understand line drawings?
We could imagine that line drawings are a convention of modern art that we have come to recognize through learning as we have the alphabet in which this paper is written (e.g., Gombrich, 1969; Goodman, 1976). This account has been controversial (Kennedy, 1974, 1975; Deregowski, 1989; see also Gibson, 1971, 1979) and there is strong evidence against it. For example, it has been shown that infants (Yonas and Arterberry, 1994; see also Hochberg and Brooks, 1962), stone-age tribe members (Kennedy and Ross, 1975), and even chimpanzees (Itakura, 1994; Tanaka, 2007) are able to recognize line drawings. We even see line representation used by insects in bio-mimicry (Figure 1C). These findings rule out any strong version of culture-based acquisition for understanding line drawings, although clearly there are many culturally based conventions used in line drawings (Figure 1D).
If cultural knowledge is not the key for understanding line drawings and line drawings are too recent an arrival to have triggered any special adaptation, what then is the mechanism that allows us to make sense of these drawings? The likely explanation is that lines trigger a neural response that has evolved to deal with natural scenes. This fortuitous co-activation lets lines stand in for solid edges. Once artists discovered this (Kennedy, 1975), they quickly adopted this format as an economical and powerful method for representing scenes and objects.
How does this “co-activation” work? The physiological investigation of the neural response to contours began with Hubel and Wiesel’s (1962, 1968) transformational discovery that neurons in the primary visual cortex are tuned to the orientation of contours, responding to edges and not to uniform areas. The part of cortex that analyzes visual information accounts for 30% or more of the cortex in primates and is located at the posterior pole of the brain. The visual cortex is further subdivided into several subregions that process the incoming images along parallel and serial streams. The first divisions of the visual cortex are labeled V1 through V4 and, in all of these, we see the orientation-tuned neurons that can respond to edges. In areas V1 and V2, the orientation-tuned detectors can be specific to the attribute defining the contour (color, contrast polarity, texture, etc.) but, starting in area V2 (Gegenfurtner et al., 1996), through V5 (Albright, 1992), and on to object recognition areas like IT (Sáry et al., 1995), many become indifferent to the attribute that defines the contour.
These orientation-tuned units evolved to efficiently detect the contours in the natural world (Olshausen and Field, 1996). Edges in the world are typically marked by a discrete change in surface attributes (lighter on one side than the other, for example), yet these units respond just as well to lines (lighter in the middle and dark on both sides, for example) and even to illusory contours that are suggested by context but not physically present (von der Heydt et al., 1984; Lee and Nguyen, 2001; see also Seghier and Vuilleumier, 2006). In other words, the receptive field structure that efficiently recovers edges also works well for lines, even though it was not designed to do so.
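As a rough illustration of this point (not a model taken from the studies cited above), the following minimal sketch uses a one-dimensional quadrature pair of Gabor filters, a common textbook stand-in for an orientation-tuned unit. The filter parameters (`sigma`, `wavelength`, `half_width`), the test signals, and the way the two filter outputs are combined into an "energy" are all assumptions chosen for illustration. In this sketch the odd-symmetric filter is best matched to a step edge and the even-symmetric filter to a thin line, so the combined energy is high at both kinds of feature and zero over uniform regions.

```python
import math

def gabor_pair(sigma=2.0, wavelength=8.0, half_width=8):
    """Return (even, odd) 1D Gabor kernels (hypothetical parameters)."""
    xs = range(-half_width, half_width + 1)
    env = [math.exp(-x * x / (2 * sigma ** 2)) for x in xs]
    even = [e * math.cos(2 * math.pi * x / wavelength) for e, x in zip(env, xs)]
    odd = [e * math.sin(2 * math.pi * x / wavelength) for e, x in zip(env, xs)]
    mean_even = sum(even) / len(even)
    even = [v - mean_even for v in even]  # zero mean: uniform areas give no response
    return even, odd

def filter_at(signal, kernel, center):
    """Response of one kernel centered on signal[center]."""
    half = len(kernel) // 2
    return sum(k * signal[center + i - half] for i, k in enumerate(kernel))

def energy(signal, center, even, odd):
    """Phase-insensitive response: squared even plus squared odd output."""
    e = filter_at(signal, even, center)
    o = filter_at(signal, odd, center)
    return e * e + o * o

EVEN, ODD = gabor_pair()

# Step edge: dark on the left, light on the right (boundary at sample 32).
edge = [0.0] * 32 + [1.0] * 32
# Line: a thin light bar (samples 31-33) on a dark background.
line = [0.0] * 64
for i in (31, 32, 33):
    line[i] = 1.0

print("edge, at the feature:", energy(edge, 32, EVEN, ODD))
print("line, at the feature:", energy(line, 32, EVEN, ODD))
print("edge, uniform region:", energy(edge, 10, EVEN, ODD))
```

Under these assumptions, the same unit signals a contour at sample 32 whether the contour is a step edge or a line, and stays silent where the input is uniform.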
Consider then the cortical pattern of response to an object like a cube (Figure 2). The contour-selective neurons with oriented receptive fields fire only along the contour and not within the uniform areas of the object. If we were to look at the visual cortex with voltage sensitive dyes (Cohen et al., 1968; Tasaki et al., 1968; Blasdel and Salama, 1986), the pattern of activity for the oriented units would resemble a sketch of the object (Marr, 1982), distorted by the cortical anatomy (Tootell et al., 1982). A set of lines that match the cube’s edges would trigger responses in the same pattern, indicating that, on a neural level, line representations are equivalent to the originals they depict. This notion is supported by a number of recent imaging studies that showed that the activation in response to line drawings was similar to that for other representations (e.g., Ishai et al., 2000; Walther et al., 2011).
Is that all there is to it? No, in fact, there is quite a lot remaining to explain, as can be seen in any line version of a natural scene in which every contour is drawn. Such line versions are typically uninterpretable, and the simple image in Figure 3 shows why. Many of the contours in a scene arise from accidental illumination edges at the borders of shadows and shading. These contours, when represented as lines, take on a reality that they should not have. Each line in a standard line drawing stands for depth or slant discontinuities between surfaces: these are “object contours.” When the borders of shadows are included in a line drawing, these contours also get promoted to the status of object contour – but for locations where there were none. As a result, the whole image is corrupted, deviating from the structure of the original objects. Figure 4 shows this even more dramatically because its original is a representation of a face that can only be recognized if the shadows are correctly processed. Rendered only as contours, the light and dark polarity required to interpret shadows is no longer available and the pattern becomes a meaningless set of lines. So, while contours are, of course, of prominent importance for visual perception (e.g., Koenderink, 1984), displaying all the contours of a scene in a line drawing will regularly fail to convey the essential parts of an image.
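The loss of information described above can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the luminance values of the scan line, the `threshold`, and the helper names `contour_positions` and `render_as_lines` are all hypothetical. It marks every luminance discontinuity along a single scan line and then renders the marks as lines, discarding the sign and size of each step.

```python
def contour_positions(luminance, threshold=0.1):
    """Indices i where luminance jumps between sample i and sample i + 1."""
    return [i for i in range(len(luminance) - 1)
            if abs(luminance[i + 1] - luminance[i]) > threshold]

def render_as_lines(length, positions):
    """Line version of the scan line: 1 at each contour, 0 elsewhere."""
    marks = [0] * length
    for p in positions:
        marks[p] = 1
    return marks

# Hypothetical scan line: lit ground, cast shadow on the ground,
# dark object, lit ground again.
scan = [0.8] * 5 + [0.4] * 5 + [0.2] * 5 + [0.8] * 5

contours = contour_positions(scan)
drawing = render_as_lines(len(scan), contours)

print(contours)  # positions of the shadow border and the two object borders
print(drawing)   # all three contours are drawn as identical marks
```

In this toy scan line the first contour is a shadow border on the ground and the other two bound the object, but once rendered as lines the three marks are identical. Nothing in the drawing itself says which of them should be read as an object contour.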
There is therefore a critical step between the extraction of edges and their assignment as “object contours.” The visual system understands how to determine which contours are the critical ones and beyond a certain level in the hierarchy of visual cortex, only those contours should remain. Shadow borders and other accidental contours must be removed in order to keep only the object contours. We do not know where or how this happens in the visual system. No imaging or physiological study has yet shown the absence of response to a shadow border at some level of cortex, and yet this must happen. Like the visual system, artists also understand which contours are the important ones. We see only the characteristic object contours in their line drawings – never shadow borders, no matter how prominent they are in the scene they are drawing. Artists appear to have access to a body of knowledge – what makes a characteristic contour – that scientists only dimly understand at present. Future studies of artists’ intuitive understanding of critical contours may lead to important insights for image understanding.
Following the initial critical choice of lines to include, what are the elements that are central to the information conveyed by line drawings? Particularly informative are those parts of an image where contours touch or intersect (Clowes, 1971; Huffman, 1971; Albert and Hoffman, 1995) and many authors have shown how these various junctions form a set of constraints that are often sufficient to specify the original object (cf. Barrow and Tenenbaum, 1981; Malik, 1987). For example, a T-junction is formed when one object interrupts the contours of another object behind it (the contours meet in a junction as in the letter T); a Y-junction is seen at the front corner of a cube; an X-junction is formed when the contours of a transparent material cross those of a background surface. Contours also may end on their own when smooth surfaces self-occlude, such as the top of a torus or donut (Koenderink, 1984). These local junction cues are clearly used by the visual system to make sense of line drawings and we can see their power when they are in conflict (Figure 5), as the impossibility of the global shape does not suppress the local interpretations they trigger – a loophole exploited to great effect by artists like Escher and Reutersvärd and scientists like Penrose. Interestingly, the way junctions are used in drawings has not changed very much over the recorded history of art (Biederman and Kim, 2008; see the T-junctions where the rhinoceros’s legs meet its body in Figure 1A), suggesting again that they are informative aspects of the world and not creations of our culture.
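To make the junction types concrete, the following minimal sketch labels a junction from the directions of the contour segments that leave it. It is an illustration only, not the procedure of any of the cited authors: the function names, the angular tolerance `tol`, and the simple geometric rules are assumptions. The "L" and "arrow" labels are not discussed in the text above; they come from the standard junction taxonomy of the line-labeling literature and are included so that every two- or three-segment case receives a label.

```python
def count_collinear_pairs(directions_deg, tol=5.0):
    """Number of segment pairs pointing in opposite directions (within tol)."""
    n = len(directions_deg)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if abs((directions_deg[i] - directions_deg[j]) % 360 - 180) <= tol
    )

def angular_gaps(directions_deg):
    """Angles between neighboring segments, going once around the vertex."""
    angles = sorted(a % 360 for a in directions_deg)
    return [(angles[(i + 1) % len(angles)] - angles[i]) % 360
            for i in range(len(angles))]

def classify_junction(directions_deg, tol=5.0):
    """Label a vertex from the directions (degrees) of its distinct segments."""
    n = len(directions_deg)
    pairs = count_collinear_pairs(directions_deg, tol)
    if n == 2:
        return "L" if pairs == 0 else "other"  # two collinear segments: no junction
    if n == 3:
        if pairs >= 1:
            return "T"  # one contour continues straight, the other stops at it
        # All gaps under 180 degrees: Y; one gap over 180 degrees: arrow.
        return "arrow" if max(angular_gaps(directions_deg)) > 180 else "Y"
    if n == 4 and pairs == 2:
        return "X"  # two straight contours crossing each other
    return "other"

print(classify_junction([0, 180, 270]))      # T
print(classify_junction([90, 210, 330]))     # Y
print(classify_junction([0, 90, 180, 270]))  # X
```

A real drawing would first require finding the vertices and segment directions, which this sketch takes as given. The example also shows how little local information is involved: each label depends only on a handful of angles at one point, which is why locally consistent junctions can still add up to a globally impossible figure.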
However, while junctions are certainly informative, they are not necessary for recognizing line drawings. The recognition of many sketches reveals an important contribution of memory. When a set of contours matches a familiar prototype, memory serves to fill in the missing details (Figure 6). These and many other line drawings show how artists are able to depict such various features as depth, folds, occlusion, texture, brightness and even odor, mental energy, or motion by choosing the right lines, revealing that artists (implicitly or explicitly) understand the code of the visual system. Scientists have yet to fully understand what artists have successfully been practicing for thousands of years and for some questions of the neural codes of vision, we may find that artists are a more immediate and better source of information than our most advanced scientific studies, whether of behavior, single cell recordings, or brain imaging.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This work was supported by a Chaire d’Excellence Grant from the ANR.
References
- Albert M. K., Hoffman D. D. (1995). “Genericity in spatial vision,” in Geometric Representations of Perceptual Phenomena: Papers in Honor of Tarow Indow on His 70th Birthday, eds Hoffman D., Luce R. D., D’Zmura M., Iverson G., Romney A. K. (New York: Lawrence Erlbaum), 95–112.
- Albright T. D. (1992). Form-cue invariant motion processing in primate visual cortex. Science 255, 1141–1143. doi: 10.1126/science.1546317
- Barrow H. G., Tenenbaum J. M. (1981). Interpreting line drawings as three dimensional surfaces. Artif. Intell. 17, 75–117. doi: 10.1016/0004-3702(81)90021-7
- Biederman I., Ju G. (1988). Surface versus edge-based determinants of visual recognition. Cogn. Psychol. 20, 38–64. doi: 10.1016/0010-0285(88)90024-2
- Biederman I., Kim J. G. (2008). 17000 years of depicting the junction of two smooth shapes. Perception 37, 161–164. doi: 10.1068/p5907
- Blasdel G. G., Salama G. (1986). Voltage-sensitive dyes reveal a modular organization in monkey striate cortex. Nature 321, 579–585. doi: 10.1038/321579a0
- Clowes M. B. (1971). On seeing things. Artif. Intell. 2, 79–116. doi: 10.1016/0004-3702(71)90005-1
- Cohen L. B., Keynes R. D., Hille B. (1968). Light scattering and birefringence changes during nerve activity. Nature 218, 438–441. doi: 10.1038/218271a0
- Deregowski J. B. (1989). Real space and represented space: cross-cultural perspectives. Behav. Brain Sci. 12, 51–119. doi: 10.1017/S0140525X00024559
- Gegenfurtner K. R., Kiper D. C., Fenstemaker S. B. (1996). Processing of color, form, and motion in macaque area V2. Vis. Neurosci. 13, 161–172. doi: 10.1017/S0952523800007203
- Gibson J. J. (1971). The information available in pictures. Leonardo 4, 27–35. doi: 10.2307/1572228
- Gibson J. J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
- Gombrich E. H. (1969). Art and Illusion. A Study in the Psychology of Pictorial Representation. Princeton, NJ: Bollingen Series/Princeton University Press.
- Goodman N. (1976). Languages of Art: An Approach to a Theory of Symbols, 2nd Edn. Indianapolis: Hackett Publishing Company.
- Hochberg J., Brooks V. (1962). Pictorial recognition as an unlearned ability: a study of one child’s performance. Am. J. Psychol. 75, 624–628. doi: 10.2307/1420286
- Hubel D. H., Wiesel T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160, 106–154.
- Hubel D. H., Wiesel T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195, 215–243.
- Huffman D. A. (1971). “Impossible objects as nonsense sentences,” in Machine Intelligence 6, eds Melzer B., Michie D. (Edinburgh: Edinburgh University Press), 295–323.
- Ishai A., Ungerleider L. G., Martin A., Haxby J. V. (2000). The representation of objects in the human occipital and temporal cortex. J. Cogn. Neurosci. 12, 35–51. doi: 10.1162/089892900564055
- Itakura S. (1994). Recognition of line-drawings representations by a chimpanzee (Pan troglodytes). J. Gen. Psychol. 121, 189–197. doi: 10.1080/00221309.1994.9711177
- Kennedy J. M. (1974). A Psychology of Picture Perception. San Francisco: Jossey-Bass.
- Kennedy J. M. (1975). Drawings were discovered, not invented. New Sci. 67, 523–527.
- Kennedy J. M. (1997). How the blind draw. Sci. Am. 276, 60–65. doi: 10.1038/scientificamerican0497-60
- Kennedy J. M., Ross A. S. (1975). Outline picture perception by the Songe of Papua. Perception 4, 391–406. doi: 10.1068/p040391
- Koenderink J. J. (1984). What does the occluding contour tell us about solid shape? Perception 13, 321–330. doi: 10.1068/p130321
- Lee T. S., Nguyen M. (2001). Dynamics of subjective contour formation in the early visual cortex. Proc. Natl. Acad. Sci. U.S.A. 98, 1907–1911. doi: 10.1073/pnas.121064698
- Malik J. (1987). Interpreting line drawings of curved objects. Int. J. Comput. Vis. 1, 73–103. doi: 10.1007/BF00128527
- Marr D. (1982). Vision. A Computational Investigation Into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman and Company.
- Olshausen B. A., Field D. J. (1996). Emergence of simple cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609. doi: 10.1038/381607a0
- Sáry G., Vogels R., Kovács G., Orban G. A. (1995). Responses of monkey inferior temporal neurons to luminance-, motion-, and texture-defined gratings. J. Neurophysiol. 73, 1341–1354.
- Seghier M. L., Vuilleumier P. (2006). Functional neuroimaging findings on the human perception of illusory contours. Neurosci. Biobehav. Rev. 30, 595–612. doi: 10.1016/j.neubiorev.2005.11.002
- Tanaka M. (2007). Recognition of pictorial representations by chimpanzees (Pan troglodytes). Anim. Cogn. 10, 169–179. doi: 10.1007/s10071-006-0056-1
- Tasaki I., Watanabe A., Sandlin R., Carnay L. (1968). Changes in fluorescence, turbidity, and birefringence associated with nerve excitation. Proc. Natl. Acad. Sci. U.S.A. 61, 883–888. doi: 10.1073/pnas.61.3.883
- Tootell R. B. H., Silverman M. S., Switkes E., De Valois R. S. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science 220, 737–739. doi: 10.1126/science.6301017
- von der Heydt R., Peterhans E., Baumgartner G. (1984). Illusory contours and cortical neuron responses. Science 224, 1260–1262. doi: 10.1126/science.6539501
- Walther D. B., Chai B., Caddigan E., Beck D. M., Fei-Fei L. (2011). Simple line drawings suffice for functional MRI decoding of natural scene categories. Proc. Natl. Acad. Sci. U.S.A. 108, 9661–9666. doi: 10.1073/pnas.1015666108
- Yonas A., Arterberry M. E. (1994). Infants perceive spatial structure specified by line junctions. Perception 23, 1427–1435. doi: 10.1068/p231427