Abstract
The color context effects referred to as color contrast, constancy, and assimilation underscore the fact that color percepts do not correspond to the spectral characteristics of the generative stimuli. Despite a variety of proposed theories, these phenomena have resisted explanation in a single principled framework. Using a hyperspectral image database of natural scenes, we here show that color contrast, constancy, and assimilation are all predicted by the statistical organization of spectral returns from natural visual environments.
It has long been appreciated that color percepts, like other perceived visual qualities, do not correspond in any simple way to the distribution of light energy in spectral returns. Thus, two visual targets generating identical spectra can, as a consequence of context, elicit different color percepts (Fig. 1 A and C), whereas different spectral returns can elicit more similar colors (Fig. 1 B and D). In color contrast and constancy stimuli, the colors generated by the targets are shifted away from the colors elicited by their respective surrounds (Fig. 1 A and B, respectively). On the other hand, in color assimilation stimuli (Fig. 1 C and D), highly repetitive spatial patterns cause the apparent colors of the patterned elements to be shifted toward each other.
The variety and complexity of these contextual effects have made it extraordinarily difficult to rationalize color perception in any of the several conceptual frameworks that have been proposed (see Discussion). Here, we examine the idea that color contrast, constancy, and assimilation are all generated by past experience with the statistical organization of spectral returns arising from natural visual environments. The biological rationale underlying this idea is that, because of the conflation of the generative sources of visual stimuli in the retinal image, spectral stimuli are inherently ambiguous (1, 2). Accordingly, if visually guided behavior is to be successful, visual percepts must in some way be determined by the statistical organization of spectral returns in natural environments that humans have typically encountered, rather than by the physical properties of the particular stimulus giving rise to the retinal image (2). If this supposition is correct, the perceived color of a target in any given stimulus should be predicted by the typical chromaticity of the target in the chromatic and spatial contexts that occur in natural scenes.
This hypothesis was tested by analyzing the co-occurrence of targets and contexts in a database of hyperspectral images of natural scenes. A large number of samples was collected by using criteria that mimicked the spatial complexity and contextual chromaticity of the standard color contrast, constancy, and assimilation stimuli illustrated in Fig. 1 (see Fig. 2 and Methods). In this way, we could evaluate the conditional probability distribution of the spectral characteristics of a target, given the spectral characteristics of the context and the spatial complexity of the sample. We then asked whether the perceived color of the target in such stimuli is predicted by these distributions.
Methods
Sampling the Hyperspectral Database. The database comprised 41 natural scenes obtained with hyperspectral techniques that measured the radiance of each image pixel as a function of wavelength (3–5). These images were obtained with the intention of representing typical natural environments and have been used previously to investigate the distributions of spectral returns, reflectances, and illuminants (4, 5).
Rectangular regions of fixed size (typically 10% of the overall image size) were randomly sampled in all of the images in the database and analyzed in terms of target and context (Fig. 2). To obtain samples pertinent to color contrast, constancy, and assimilation stimuli by using the same method, the target was defined by spatially isolated patches of the same size (1.5% of the region sampled) and value in the 1931 Commission Internationale de L'Eclairage (CIE) chromaticity diagram (6) (Fig. 2 B2 and C2); the context was defined as the remaining area. When the number of target and contextual elements was relatively small (e.g., <10), the samples approximated the configuration of color contrast and constancy stimuli (see Fig. 1 A and B); when, however, the number of the patches was relatively large (e.g., >20), the samples more nearly approximated the configuration of color assimilation stimuli (see Fig. 1 C and D).
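The chromaticity values used to define targets are positions in the 1931 CIE diagram. As a minimal sketch of that last step only, the function below converts tristimulus values to (x, y) chromaticity by using the standard relations x = X/(X + Y + Z) and y = Y/(X + Y + Z). The function name `xyz_to_xy` is hypothetical, and the prior integration of each pixel's radiance spectrum against the CIE color-matching functions is assumed to have been done already.

```python
import numpy as np

def xyz_to_xy(xyz):
    """Convert CIE XYZ tristimulus values to CIE 1931 (x, y) chromaticity.

    `xyz` may be a single triple or an array whose last axis has length 3.
    Pixels with X + Y + Z == 0 have no defined chromaticity and yield NaN.
    """
    xyz = np.asarray(xyz, dtype=float)
    total = xyz.sum(axis=-1, keepdims=True)
    # Avoid dividing by zero; the affected entries are replaced by NaN below.
    safe_total = np.where(total == 0, 1.0, total)
    xy = xyz[..., :2] / safe_total
    return np.where(total == 0, np.nan, xy)
```

Applied to an image array of shape (rows, columns, 3), the function returns an array of shape (rows, columns, 2).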
Separating each sampled region into target and context was carried out in two steps. First, a small patch with a relatively uniform spectrum (SD of the chromaticity of all of the pixels in the patch divided by their mean chromaticity < 0.03) was selected near the center of the region being analyzed (see boxed patch in Fig. 2 B2 and C2). Using this patch as a reference, the remaining area of the region was sampled to identify all of the other patches of similar chromaticity. To be accepted as a component of the target (see definition above), any given patch had to be at least one pixel away from any other patch, both horizontally and vertically. The set of all such patches was then defined as the target in the sampled region, as illustrated by the patches on the black background in Fig. 2 B2 and C2. In a second step, the context of the target was identified in each sample. If the chromaticity of the area remaining in the region sampled was relatively uniform, it was accepted as the context of the target (see Fig. 2 B3 and C3). If the remaining area failed to meet this criterion, the sample was excluded from further analysis and the procedure was repeated beginning with another region in the image. Although the set of target and context areas identified in this way in natural images is inevitably less uniform than in the “laboratory” stimuli illustrated in Fig. 1, this approach identifies the relevant stimulus categories as they occur in normal viewing (see Discussion for further explanation of the underlying rationale of this approach). In this way, we generated a total of ≈1.5 million samples that approximated, as closely as possible in natural images, the color contrast, constancy, and assimilation stimuli that have generally been used in psychophysical studies.
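The uniformity criterion for the reference patch can be sketched as follows. The text gives the ratio of SD to mean chromaticity but does not say how the two chromaticity coordinates were combined; the sketch assumes that the ratio is computed separately for x and y and that both must fall below the threshold. The function name `is_uniform_patch` is hypothetical.

```python
import numpy as np

def is_uniform_patch(xy_pixels, threshold=0.03):
    """Return True if a patch has a relatively uniform chromaticity.

    `xy_pixels` holds the (x, y) chromaticity of every pixel in the patch.
    Assumption: SD / mean is evaluated per coordinate, and both
    coordinates must be below `threshold` (0.03 in the text).
    """
    xy = np.asarray(xy_pixels, dtype=float).reshape(-1, 2)
    mean = xy.mean(axis=0)
    sd = xy.std(axis=0)
    return bool(np.all(sd / mean < threshold))
```

Whether the same threshold was applied when testing the uniformity of the context is not stated, so the sketch leaves `threshold` as a parameter.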
Statistical Analysis. We next computed the chromaticity of these components (denoted Ct and Cc) by averaging the values of all of the pixels in the target and context, respectively. The number of patches in the target of each sample (denoted Nt) was also determined as an index of the overall spatial complexity of the stimulus. Using this information, we analyzed the probability distribution of the spectral characteristics of the target in terms of the CIE chromaticity diagram, given the context and spatial complexity of the samples (i.e., P[Ct|Cc,Nt]). This determination of co-occurrence was made by accumulating and normalizing the frequency of occurrence of each possible combination of target chromaticity, context chromaticity, and number of iterated elements in the samples. We then asked how the probability distribution of the chromaticity of a target changes as a function of different chromatic contexts and spatial complexity. Finally, we computed the average difference between the chromaticity of the target and the context as a function of the number of iterated elements in the samples to provide a more general assessment of the statistical relationship among the relevant variables.
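The accumulation and normalization that yield P[Ct|Cc,Nt] can be illustrated with a small sketch. It assumes that chromaticities are quantized into square bins of the CIE diagram; the bin width of 0.02 and the names `chroma_bin` and `conditional_distribution` are hypothetical choices for illustration, not values taken from the analysis.

```python
import math
from collections import Counter, defaultdict

def chroma_bin(xy, bin_width=0.02):
    """Quantize an (x, y) chromaticity into integer bin indices."""
    return (math.floor(xy[0] / bin_width), math.floor(xy[1] / bin_width))

def conditional_distribution(samples, bin_width=0.02):
    """Estimate P[Ct | Cc, Nt] from (Ct, Cc, Nt) samples.

    Each sample is (ct, cc, nt): the mean target chromaticity, the mean
    context chromaticity, and the number of target patches.
    Returns {(cc_bin, nt): {ct_bin: probability}}.
    """
    counts = defaultdict(Counter)
    # Accumulate the frequency of each (target, context, complexity) combination.
    for ct, cc, nt in samples:
        counts[(chroma_bin(cc, bin_width), nt)][chroma_bin(ct, bin_width)] += 1
    # Normalize within each (context, complexity) condition.
    return {
        condition: {ct_bin: n / sum(hist.values()) for ct_bin, n in hist.items()}
        for condition, hist in counts.items()
    }
```

With ≈1.5 million samples, Nt would probably also need to be grouped into ranges so that each condition contains enough samples; the sketch keeps each value of Nt separate for simplicity.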
Results
Statistics of Target Chromaticities in Natural Scenes. Fig. 3 shows the means of the probability distribution of target chromaticity for two values of Cc and a set of different values of Nt, superimposed on the corresponding area of the 1931 CIE chromaticity diagram. As might be expected, the typical chromaticities of targets in different chromatic contexts in natural scenes are different. Thus, all of the target chromaticity values in a greenish (circles) context (Cc = [0.32, 0.36]; square) occupy systematically different positions in the CIE chromaticity diagram compared with the target values in a reddish (asterisks) context (Cc = [0.42, 0.38]; triangle). Furthermore, as the number of iterated elements increases (arrows), the mean values of the distributions for targets in a greenish context change from a yellowish green to a green that is very close to the chromaticity of the context. Conversely, the mean values of the distributions for targets in a reddish context change from a bluish-red to the reddish chromaticity of the context as the spatial complexity increases. Thus, as the complexity of the sample grows, the typical chromaticity of targets in natural scenes changes from a chromaticity opposite that of the context toward a value similar to that of the context.
Average Chromaticity as a Function of Sample Complexity. To examine more generally how the average chromaticity difference between target and context in natural images varies as a function of stimulus complexity, we determined the average chromaticity difference (|Ct – Cc|) as a function of the spatial complexity of the sample (Nt) (Fig. 4). As Nt increases, the difference |Ct – Cc| decreases monotonically. In other words, the chromaticities of target and context tend to become more similar as the samples comprise more iterated elements.
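A minimal sketch of this summary statistic follows. It assumes that |Ct – Cc| is the Euclidean distance between the two points in the CIE (x, y) plane, which the text does not state explicitly; the function name `mean_difference_by_complexity` is hypothetical.

```python
import numpy as np

def mean_difference_by_complexity(ct, cc, nt):
    """Average |Ct - Cc| for each value of Nt.

    `ct` and `cc` are arrays of shape (n_samples, 2) holding target and
    context chromaticities; `nt` holds the number of target patches.
    Assumption: |Ct - Cc| is the Euclidean distance in the (x, y) plane.
    """
    ct = np.asarray(ct, dtype=float)
    cc = np.asarray(cc, dtype=float)
    nt = np.asarray(nt)
    distance = np.linalg.norm(ct - cc, axis=1)
    return {int(n): float(distance[nt == n].mean()) for n in np.unique(nt)}
```

Plotting the returned values against Nt would give a curve of the kind described for Fig. 4.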
Probabilistic Explanation of Color Context Effects. If the perceptual effects of spectral stimuli are indeed generated by the statistical organization of spectral returns in the visual environments that humans have always experienced, the perceived colors of the targets in stimuli such as those in Fig. 1 should be affected by the typical spectral characteristics of targets in the relevant spectral contexts and spatial complexities encountered in natural scenes (see Figs. 3 and 4).
In color contrast stimuli (e.g., Fig. 1 A), the spectral returns of the two central targets are physically the same. In a spectrally neutral context, the central targets should therefore occupy the same position in subjective color space. In different spectral contexts, however, their positions in this space will, according to the hypothesis being tested here, be changed by the typical spectral characteristics of the target in the corresponding spectral context and spatial complexity found in natural scenes. The statistical analysis carried out shows that the characteristic chromaticity of a target embedded in a stimulus whose spatial complexity is similar to that of standard color contrast and constancy stimuli tends to be more reddish when the context is greenish and more greenish when the context is reddish (or, more generally, spectrally different targets and contexts tend to have opposing chromaticities in the CIE diagram) (see Fig. 3). Thus, on empirical grounds, a target in a stimulus with a red context should appear more greenish, whereas the same target in a green context will appear more reddish (see Fig. 1 A).
In color constancy stimuli, however, the spectral returns of the two central targets are physically different from one another (e.g., Fig. 1B). Thus, the targets in a neutral context should occupy different positions in subjective color space. In this case, the positions of the colors seen in different spectral contexts will be shifted toward each other by the typical characteristics of the target in the corresponding context and degree of spatial complexity in natural scenes (see Fig. 3). In Fig. 1B, for example, the greenish target in a green context appears more reddish because that is the typical chromaticity of the target in such contexts and spatial complexities in natural scenes. Conversely, the yellowish target in a red context appears more greenish because that is the typical chromaticity of the target in such contexts. As a result, the spectrally different central targets in Fig. 1B (and all related stimuli) appear more similar than expected on the basis of their spectral returns.
By the same token, in the color assimilation stimuli (e.g., Fig. 1 C and D), the positions of spectrally identical targets will be shifted in subjective color space by the typical spatial and chromatic configuration of the samples of such stimuli in natural scenes. Analysis of the relevant samples shows that, unlike the simpler patterns of the contrast and constancy samples, the chromaticities of highly iterated targets tend to be greenish in a green context and reddish in a red context (see Figs. 3 and 4). Thus, the perceived color of the spectrally identical yellowish targets in the stimulus in Fig. 1C would be expected to be greenish in a green context and reddish in a red context, as is the case. In Fig. 1D, however, the typical chromaticity of the target in the corresponding spatial and chromatic contexts in natural scenes would cause the positions of the spectrally different targets (yellowish and greenish) to shift toward each other, as is again evident in Fig. 1.
In summary, the perceived colors of the targets in color contrast, constancy, and assimilation stimuli are all predicted by the way that their positions are shifted in subjective color space by the typical co-occurrence in natural scenes of the spectral return of the targets together with the relevant spectral context and spatial configuration.
Discussion
In previous studies, color contrast and constancy have most often been attributed to the lateral interactions among neurons at some level of the visual system (7–12). In this conception, a neural integration of spectral contrast ratios in the relevant stimuli generates these contextual effects. This mechanism, however, should make the apparent colors of the elements in color assimilation stimuli shift away from each other, which is the opposite of what is seen (see Fig. 1 C and D). In addition, this sort of explanation cannot rationalize perceptual phenomena such as the Wertheimer–Benary illusion and related stimuli that depend on the spatial attributes of the stimulus (13–16). Indeed, it is not difficult to create stimuli that elicit color perceptions opposite those predicted by theories based on lateral integration (17, 18).
Another attempt to rationalize color context effects has appealed to multiple spatial frequency filtering mechanisms (19–23). In this scenario, color contrast is taken to be generated by neuronal responses tuned to low spatial frequency stimuli and assimilation by neuronal responses tuned to both low and high spatial frequency components. Although this further mechanism is helpful in accounting for some color context effects, color perceptual phenomena that depend on scene properties such as orientation and depth also resist this sort of explanatory framework.
Given these uncertainties and the implications of other recent work on color perception (2, 24), the hypothesis that color contrast, constancy, and assimilation effects (and by inference all color percepts) are generated by the statistical organization of the spectral returns arising from natural visual environments is an attractive one. The present analysis of target and context samples that approximate standard color contrast, constancy, and assimilation stimuli shows that the probability distribution of the spectral characteristics of targets in nature varies systematically as a function of the spectral characteristics of the context and its spatial complexity. Thus, the probability distributions of target chromaticity values in different chromatic contexts are different (see Fig. 3), and, as the spatial complexity of the sample increases, the chromaticity of the target tends to change from a value opposite that of the context to a value similar to that of the context (see Figs. 3 and 4).
An instantiation of these statistical facts by the visual system presumably serves to generate visual percepts that promote appropriate visually guided behavior in response to inherently ambiguous visual stimuli (2, 24). Accordingly, the perceived color of a target is determined by how the positions of the spectral returns of targets are shifted in subjective color space by the spectral context and spatial complexity that humans have previously encountered in the spectral patterns of light that have always fallen on the retina.
A remaining question is why, in broad terms, sampling the various configurations illustrated in Fig. 1 in a natural scene database generates these statistical biases. Recall that we obtained samples pertinent to color contrast, constancy, and assimilation stimuli by identifying regions in natural scenes in which target and context were relatively uniform. Because no other constraints were used, the samples effectively include all possible combinations underlying such stimuli in natural scenes. Thus, any statistical bias obtained in the probability distributions reflects the intrinsic characteristics of natural environments. The fact that the typical chromaticity of a target in natural scenes is opposite that of the context, and that it tends to change to a value similar to that of the context as the spatial complexity of the sample increases, can be understood intuitively. When only one or a few target elements are identified in a spectrally homogeneous surround (as in color contrast and constancy stimuli), the real-world sources of target and context tend to be physically different surfaces in the same illumination; on the other hand, when many target elements are found in a spectrally homogeneous surround, the sources of the target and context are likely to be physically similar surfaces. (This does not mean, of course, that the target and context in any stimulus with high spatial complexity will be perceived to be the same color; recall that in this framework the perceived color of a target is determined by how the position of a spectral return is shifted in subjective color space by the corresponding spectral contexts and spatial complexity that humans have typically encountered.)
Because any visual stimulus conflates its generative physical sources (e.g., reflectance and illumination), these systematic changes in the probable physical sources conveyed by the statistical organization of the spectral returns in natural scenes will systematically change the perceptual effects of inherently ambiguous stimuli.
Acknowledgments
We thank Catherine Howe, Shuro Nundy, David Schwartz, James Voyvodic, and Zhiyong Yang for helpful criticism.
Abbreviation: CIE, Commission Internationale de L'Eclairage.
References
1. Marr, D. (1982) Vision: A Computational Investigation into Human Representation and Processing of Visual Information (Freeman, San Francisco).
2. Purves, D. & Lotto, R. B. (2003) Why We See What We Do: An Empirical Theory of Vision (Sinauer, Sunderland, MA).
3. Brelstaff, G., Parraga, A., Troscianko, T. & Carr, D. (1995) in Proceedings of SPIE 2587: Geographic Information Systems, Photogrammetry, and Geological/Geophysical Remote Sensing, eds. Lurie, J. B., Pearson, J. J. & Zillioli, E. (Internatl. Soc. Opt. Eng., Bellingham, WA), pp. 150-159.
4. Parraga, C. A., Brelstaff, G. J., Troscianko, T. & Moorhead, I. (1998) J. Opt. Soc. Am. A 15, 563-569.
5. Chiao, C. C., Cronin, T. W. & Osorio, D. (2000) J. Opt. Soc. Am. A 17, 218-224.
6. Commission Internationale de L'Eclairage (1986) Colorimetry (Central Bureau of the Commission Internationale de L'Eclairage, Vienna), 2nd Ed., Publ. 15.2.
7. Hering, E. (1920) Grundzüge der Lehre vom Lichtsinn (Springer, Berlin), trans. Hurvich, L. M. & Jameson, D. J. (1964) Outlines of a Theory of the Light Sense (Harvard Univ. Press, Cambridge, MA).
8. Helson, H. (1938) J. Exp. Psychol. 23, 439-471.
9. Judd, D. B. (1940) J. Opt. Soc. Am. A 30, 2-32.
10. Land, E. H. & McCann, J. J. (1971) J. Opt. Soc. Am. A 61, 1-11.
11. Hurvich, L. M. (1981) Color Vision (Sinauer, Sunderland, MA).
12. Land, E. H. (1986) Vision Res. 26, 7-21.
13. Benary, W. (1924) Psychol. Forsch. 5, 131-142.
14. Knill, D. & Kersten, D. (1991) Nature 351, 228-230.
15. Adelson, E. H. (1993) Science 262, 2042-2044.
16. Brown, R. O. & MacLeod, D. I. (1997) Curr. Biol. 7, 844-849.
17. White, M. (1979) Perception 8, 413-416.
18. Todorovic, D. (1997) Perception 26, 379-394.
19. Blakeslee, B. & McCourt, M. E. (1997) Vision Res. 37, 2849-2869.
20. McCourt, M. E. (1982) Vision Res. 22, 119-134.
21. Zaidi, Q. (1989) Vision Res. 29, 691-697.
22. DeValois, R. L. & DeValois, K. K. (1988) Spatial Vision (Oxford Univ. Press, New York).
23. Moulden, B. & Kingdom, F. A. (1991) Vision Res. 31, 1999-2008.
24. Lotto, R. B. & Purves, D. (2000) Proc. Natl. Acad. Sci. USA 97, 12834-12839.