2021 Sep 23;118(39):e2109237118. doi: 10.1073/pnas.2109237118

Fig. 1.

Cross-linguistic patterns in color naming and the rate-distortion hypothesis. Berlin and Kay (B&K) (1) and the World Color Survey (WCS) (2) studied color vocabularies in 130 languages around the world (see WCS). (A) The 330 color chips named by native speakers in the WCS study. Colors shown here are best approximations in Standard RGB (sRGB) color space. (B) Empirical color vocabularies for two example languages in the WCS, each with six basic color terms. Color chips correspond to those in A, but each is colored according to the focal color of the term chosen by the majority of speakers surveyed (or by a mixture of the best-choice focal colors when there was more than one best choice). The languages Vagla and Martu-Wangka, although linguistically unrelated and separated by nearly 14,000 km, have remarkably similar partitions of colors into basic color terms (2). (C) Schematic diagram of rate-distortion theory applied to color naming. A speaker needs to refer to color x with probability p(x). The speaker uses a probabilistic rule p(x̂|x) to assign color terms, x̂, to colors, x. This rule depends on the perceptual distortion d(x||x̂) introduced by substituting x̂ for the true color, x, where each term x̂ is associated with a coordinate in color space. The speaker’s choice of the term x̂ reduces the listener’s uncertainty about the true color being referenced; the average reduction is measured by the mutual information I(X;X̂). While any probabilistic mapping from colors to terms, p(x̂|x), is possible, some mappings are more efficient than others. Rate-distortion theory provides optimal term mappings that allow a listener to glean as much information as possible, for a given level of tolerable distortion and distribution of communicative needs p(x).
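
The trade-off described in panel C can be explored numerically. One standard technique for computing rate-distortion-optimal mappings is the Blahut-Arimoto iteration. The caption does not say which method the authors used, so the following is only a minimal sketch under stated assumptions: colors lie on a hypothetical one-dimensional line rather than in a perceptual color space, the term coordinates are fixed rather than optimized, distortion is squared distance, and `beta` is a trade-off parameter introduced here for illustration. The function name and all variable names are likewise illustrative.

```python
import math

def blahut_arimoto(p_x, d, beta, n_iter=200):
    """Iterate toward a naming rule that minimizes I(X;X_hat) + beta * E[d].

    p_x  : list of need probabilities, one per color
    d    : d[i][j] = distortion when term j stands in for color i
    beta : trade-off weight (assumed name); larger beta tolerates less distortion
    Returns (rule, q, rate_bits, avg_distortion), with rule[i][j] = p(term j | color i).
    """
    n_colors, n_terms = len(d), len(d[0])
    q = [1.0 / n_terms] * n_terms  # marginal over terms, initialized uniform
    rule = []
    for _ in range(n_iter):
        # Naming rule update: p(term | color) proportional to q(term) * exp(-beta * d)
        rule = []
        for i in range(n_colors):
            w = [q[j] * math.exp(-beta * d[i][j]) for j in range(n_terms)]
            z = sum(w)
            rule.append([wj / z for wj in w])
        # Marginal update: q(term) = sum over colors of p(color) * p(term | color)
        q = [sum(p_x[i] * rule[i][j] for i in range(n_colors))
             for j in range(n_terms)]
    # Mutual information (in bits) and expected distortion of the final rule
    rate, avg_d = 0.0, 0.0
    for i in range(n_colors):
        for j in range(n_terms):
            joint = p_x[i] * rule[i][j]
            if joint > 0.0:
                rate += joint * math.log2(rule[i][j] / q[j])
                avg_d += joint * d[i][j]
    return rule, q, rate, avg_d

# Toy setup (all values hypothetical): six colors on a line, two terms.
colors = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
terms = [1.0, 4.0]
p_x = [1.0 / len(colors)] * len(colors)  # uniform communicative need
d = [[(x - t) ** 2 for t in terms] for x in colors]

for beta in (0.0, 0.5, 5.0):
    _, _, rate, avg_d = blahut_arimoto(p_x, d, beta)
    print(f"beta={beta}: rate={rate:.3f} bits, distortion={avg_d:.3f}")
```

Sweeping `beta` traces out the trade-off between rate and distortion. At `beta = 0` every color receives the same distribution over terms and the listener learns nothing about the color, while a large `beta` yields a nearly deterministic partition of the colors between the two terms.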