Abstract
Since Darwin’s seminal works, the universality of facial expressions of emotion has remained one of the longest-standing debates in the biological and social sciences. Briefly stated, the universality hypothesis claims that all humans communicate six basic internal emotional states (happy, surprise, fear, disgust, anger, and sad) using the same facial movements by virtue of their biological and evolutionary origins [Susskind JM, et al. (2008) Nat Neurosci 11:843–850]. Here, we refute this assumed universality. Using a unique computer graphics platform that combines generative grammars [Chomsky N (1965) MIT Press, Cambridge, MA] with visual perception, we accessed the mind’s eye of 30 individuals from Western and Eastern cultures and reconstructed their mental representations of the six basic facial expressions of emotion. Cross-cultural comparisons of the mental representations challenge universality on two separate counts. First, whereas Westerners represent each of the six basic emotions with a distinct set of facial movements common to the group, Easterners do not. Second, Easterners represent emotional intensity with distinctive dynamic eye activity. By refuting the long-standing universality hypothesis, our data highlight the powerful influence of culture on shaping basic behaviors once considered biologically hardwired. Consequently, our data open a unique nature–nurture debate across broad fields from evolutionary psychology and social neuroscience to social networking via digital avatars.
Keywords: modeling, reverse correlation, categorical perception, top-down processing, cultural specificity
As first noted by Darwin in The Expression of the Emotions in Man and Animals (1), some basic facial expressions originally served an adaptive, biological function such as regulating sensory exposure (2). By virtue of their biological origins (1–3), facial expressions have long been considered the universal language to signal internal emotional states, recognized across all cultures. Specifically, the universality hypothesis proposes that six basic internal human emotions (i.e., happy, surprise, fear, disgust, anger, and sad) are expressed using the same facial movements across all cultures (4–7), supporting universal recognition. However, consistent cross-cultural disagreement about the emotion (8–13) and intensity (8–10, 14–16) conveyed by gold standard universal facial expressions (17) now questions the universality hypothesis.
To test the universality hypothesis directly, we used a unique computer graphics platform (18) that combines the power of generative grammars (19, 20) with the subjectivity of visual perception to genuinely reconstruct the mental representations of basic facial expressions in individual observers (see also refs. 21, 22). Mental representations reflect the past visual experiences and the future expectations of the individual observer. A cross-cultural comparison of the mental representations of the six basic expressions therefore provides a direct test of their universality.
Fig. 1 illustrates our unique computer graphics platform (see Materials and Methods, Stimuli and Materials and Methods, Procedure for full details). Like a generative grammar (19, 20), we randomly generated all possible three-dimensional facial movements (see Movie S1 for an example). Observers categorized these random facial animations as expressive only when the random facial movements correlated with their subjective mental representations, i.e., when they perceived an emotion. Thus, we can capture the subsets of facial movements that correlate with the subjective, culture-specific representations of the six basic emotions in individual observers and compare them.
Fig. 1.
Random generative grammar of facial movements and the perceptual categorization of emotions. (Stimulus) On each experimental trial, the facial movement generator randomly selected a subset of facial movements, called action units (AUs) (here, AU9 color coded in red, AU10 Left in green, and AU17 in blue) and values specifying the AU temporal parameters (see color-coded temporal curves). On the basis of these parameters, the generator rendered a three-dimensional facial animation of random facial movements, illustrated here with four snapshots. The color-coded vector Below represents the 3 (of 41) randomly selected AUs comprising the stimulus on this illustrative experimental trial. (Mental representations) Observers categorized each random facial animation according to the six basic emotion categories (plus “don’t know”) and rated the emotional intensity on a five-point scale. Observers interpreted a random facial animation as a meaningful facial expression (here, “disgust,” “medium intensity”) when its facial movements corresponded to their mental representation of that facial expression. Each observer (15 Western Caucasian and 15 East Asian) categorized 4,800 such facial animations of same- and other-race faces.
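To make the trial structure concrete, the sketch below simulates the stimulus parameters of a single trial under the sampling scheme described in Materials and Methods (number of animated AUs drawn from a binomial distribution, six temporal parameters per AU drawn uniformly). The rendering step and the observer's response are represented only as placeholders, and all names (sample_trial_stimulus, the trial dictionary) are illustrative rather than taken from the original code.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AUS = 41  # core action units in the generative space
EMOTIONS = ("happy", "surprise", "fear", "disgust", "anger", "sad", "don't know")

def sample_trial_stimulus():
    """Draw the parameters of one random facial animation (rendering not shown).
    The number of animated AUs comes from a binomial distribution (n = 5, P = 0.6),
    and each AU receives six temporal parameters drawn uniformly, as in the Methods."""
    n_active = max(1, rng.binomial(n=5, p=0.6))
    active_aus = np.sort(rng.choice(N_AUS, size=n_active, replace=False))
    temporal_params = rng.uniform(size=(n_active, 6))
    return active_aus, temporal_params

# One illustrative trial record; in the experiment, the emotion label and the
# 1-5 intensity rating come from the observer's response to the rendered animation.
active_aus, temporal_params = sample_trial_stimulus()
trial = {"active_aus": active_aus.tolist(), "response": "disgust", "intensity": 3}
print(trial)
```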
Fifteen Western Caucasian (WC) and 15 East Asian (EA) observers (Materials and Methods, Observers) each categorized 4,800 such animations (evenly split between same- and other-race face stimuli) by emotion (i.e., one of the six basic emotions or “don’t know”) and intensity (on a five-point scale ranging from “very low” to “very high”).
To model the mental representation of each facial expression, we reverse correlated (23) the random facial movements with the emotion response (e.g., happy) that these random facial movements elicited (Materials and Methods, Model Fitting) (18). In total, we computed 180 models of facial expression representations per culture (15 observers × 6 emotions × 2 races of face). Each model comprised a 41-dimensional vector coding a composition of facial muscles (one dimension per muscle group), with six parameters coding the temporal dynamics of each muscle group and a set of intensity gradients coding how these dynamics change with perceived intensity (Materials and Methods, Model Fitting).
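As a rough illustration of this data structure, one fitted model (one observer × emotion × race of face) could be organized as follows; the field names and array layout are assumptions for exposition, not the authors' implementation.

```python
from dataclasses import dataclass
import numpy as np

N_AUS = 41       # one dimension per muscle group (action unit)
N_TEMPORAL = 6   # onset/peak/offset latency, peak amplitude, acceleration, deceleration

@dataclass
class ExpressionModel:
    """One reconstructed mental representation (observer x emotion x race of face).
    Field names are illustrative, not taken from the original code."""
    au_weights: np.ndarray           # shape (41,): correlation of each AU with the emotion response
    temporal_params: np.ndarray      # shape (41, 6): fitted temporal dynamics per AU
    intensity_gradients: np.ndarray  # shape (41, 6): change of each temporal parameter with rated intensity

model = ExpressionModel(np.zeros(N_AUS), np.zeros((N_AUS, N_TEMPORAL)), np.zeros((N_AUS, N_TEMPORAL)))
```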
The universality hypothesis predicts that, in each culture, these mental models will form six distinct clusters (one per basic emotion), because each emotion is expressed using a specific combination of facial movements common to all humans. In addition, the mental models should represent similar signaling of emotional intensity across cultures. Our data demonstrate cultural divergence on both counts.
Results
Six Basic Emotions Are Not Universal.
We clustered the 41-dimensional models of expression representation in each culture independently (Materials and Methods, Clustering Analysis and Mutual Information and Fig. S1). As predicted (24), WC models form six distinct and emotionally homogeneous clusters. However, EA models overlap considerably between emotion categories, demonstrating a different, culture-specific, and therefore not universal, representation of the basic emotions. Fig. 2 summarizes the results for each culture (WC, Left; EA, Right).
Fig. 2.
Cluster analysis and dissimilarity matrices of the Western Caucasian and East Asian models of facial expressions. In each panel, vertical color-coded bars show the k-means (k = 6) cluster membership of each model. Each 41-dimensional model (n = 180 per culture) corresponds to the emotion category labeled Above (30 models per emotion). The underlying gray-scale dissimilarity matrices represent the Euclidean distances between each pair of models, used as inputs to k-means clustering. Note that, in the Western Caucasian group, the lighter squares along the diagonal indicate higher model similarity within each of the six emotion categories compared with the East Asian models. Correspondingly, k-means cluster analysis shows that the Western Caucasian models form six emotionally homogeneous clusters (e.g., all 30 “happy” models belong to the same cluster, color-coded in purple). In contrast, the East Asian models show considerable model dissimilarity within each emotion category and overlap between categories, particularly for “surprise,” “fear,” “disgust,” “anger,” and “sad” (note the heterogeneous color coding of these models).
Representation of Emotional Intensity Varies Across Cultures.
To identify where and when in the face each culture represents emotional intensity, we compared the models of expression representation according to how facial movements covaried with perceived emotional intensity across time (Materials and Methods, Model Fitting). Fig. 3 summarizes the results (blue, WC; red, EA; P < 0.05). The temporal dynamics of the models revealed culture-specific representation of emotional intensity, as mirrored by popular EA emoticons: In EA usage, (^.^) is happy and (>.<) is angry (see also Movie S2 for examples of culture-specific use of the eyes and mouth). The red face regions in Fig. 3 show that EA models represent emotional intensity primarily with early movements of the eyes in happy, fear, disgust, and anger, whereas WC models represent emotional intensity with other parts of the face.
Fig. 3.
Spatiotemporal location of emotional intensity representation in Western Caucasian and East Asian culture. In each row, color-coded faces show the culture-specific spatiotemporal location of expressive features representing emotional intensity, for each of the six basic emotions. Color coding is as follows: blue, Western Caucasian; red, East Asian, where values reflect the t statistic. All color-coded regions show a significant (P < 0.05) cultural difference, as indicated by asterisks labeled on the color bar. Note that, for the EA models (i.e., red face regions), emotional intensity is represented with characteristic early activations.
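The culture comparison underlying these maps can be illustrated, at the level of the fitted intensity gradients, with an independent-samples t-test at each AU and time point. The arrays below are random stand-ins for the per-observer gradients, the uncorrected threshold is for illustration only, and the projection of AU statistics onto face regions (as in Fig. 3) is not reproduced.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical per-observer intensity gradients (observers x AUs x time frames);
# real values come from the model fitting described in Materials and Methods.
wc = rng.normal(size=(15, 41, 30))   # Western Caucasian observers
ea = rng.normal(size=(15, 41, 30))   # East Asian observers

# Independent-samples t-test at every AU x time point across the two cultures.
t, p = stats.ttest_ind(wc, ea, axis=0)
significant = p < 0.05               # uncorrected threshold, illustration only

print("Number of AU x time cells with a cultural difference:", int(significant.sum()))
```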
Discussion
Using a FACS-based random facial expression generator and reverse correlation, we reconstructed 3D dynamic models of the six basic facial expressions of emotion in Western Caucasian and East Asian cultures. Analysis of the models revealed clear cultural specificity both in the groups of facial muscles and in the temporal dynamics representing basic emotions. Specifically, cluster analysis showed that Western Caucasians represent each of the six basic emotions with a distinct set of facial muscles. In contrast, the East Asian models showed less distinction, characterized by considerable overlap between emotion categories, particularly for surprise, fear, disgust, and anger. Cross-cultural analysis of the temporal dynamics of the models showed cultural specificity in where (in the face) and when facial expressions convey emotional intensity. Together, our results show that facial expressions of emotion are culture specific, refuting the notion that human emotion is universally represented by the same set of six distinct facial expression signals.
To understand the implications of our results, it is important to first highlight the fundamental relationship between the perception and production of facial expressions. Specifically, the facial movements perceived by observers reflect those produced in their social environment because signals designed for communication (and therefore recognition) are those perceived by the observer. That is, one would question the logic and adaptive value of an expressive signal that the receiver could not or does not perceive. Thus, the models reconstructed here reflect the experiences of individual observers interacting with their social environment and provide predictive information to guide cognition and behavior. These dynamic mental representations, therefore, reflect both the past experiences and future expectations of basic facial expressions in each culture.
Cultural specificity in the facial expression models therefore likely reflects differences in the facial expression signals transmitted and encountered by observers in their social environment. For example, cultural differences in the communication of emotional intensity could reflect the operation of culture-specific display rules (25) on the transmission (and subsequent experience) of facial expressions in each cultural context. In particular, East Asian models of fear, disgust, and anger show characteristic early signs of emotional intensity with the eyes, which are under less voluntary control than the mouth (26), reflecting restrained facial behaviors as predicted by the literature (27). Similarly, culture-specific dialects (28) or accents (29) would diversify basic facial expression signals across cultures, giving rise to cultural hallmarks of facial behavior. Consider, for example, the “happy” models in Fig. 3: East Asian models show an early increased activation of the orbicularis oculi muscle, pars lateralis (action unit 6), which typifies “genuine” smiles (26, 30).
Are the six basic emotions universal? We show that six clusters are optimal to characterize the Western Caucasian facial expression models, thus supporting the view that human emotion is composed of six basic categories (24, 31–33). However, our data show that this organization of emotions does not extend to East Asians, questioning the notion that these six basic emotion categories are universal. Rather, our data suggest that the six basic emotions (i.e., happy, surprise, fear, disgust, anger, and sad) are inadequate to accurately represent the conceptual space of emotions in East Asian culture and likely neglect fundamental emotions such as shame (34), pride (35), or guilt (36). Although beyond the scope of the current paper, such questions can now be addressed with our platform by constructing a more diverse range of facial expression models that accurately reflect social communication in different cultures, beyond the six basic categories reported in the literature.
In sum, our data directly show that, across cultures, emotions are expressed using culture-specific facial signals. Although some basic facial expressions such as fear and disgust (2) originally served an adaptive function when humans “existed in a much lower and animal-like condition” (ref. 1, p. 19), facial expression signals have since evolved and diversified to serve the primary role of emotion communication during social interaction. As a result, these once biologically hardwired and universal signals have been molded by the diverse social ideologies and practices of the cultural groups who use them for social communication.
Materials and Methods
Observers.
We screened and recruited 15 Western Caucasian (European, six males, mean age 21.3 y, SD 1.2 y) and 15 East Asian (Chinese, seven males, mean age 22.9 y, SD 1.3 y) observers. All EA observers had newly arrived in a Western country (mean residence 5.2 mo, SD 0.94 mo) with an International English Language Testing System score ≥6.0 (competent user). All observers had minimal experience of other cultures (as assessed by questionnaire; SI Observer Questionnaire) and normal or corrected-to-normal vision, gave written informed consent, and were paid £6/h in an ethically approved experiment.
Stimuli.
On each experimental trial, a 4D photorealistic facial animation generator (18) randomly selected, from 41 core action units (AUs) (37), a subsample of AUs from a binomial distribution (n = 5, P = 0.6, median = 3). For each AU, the generator selected random values for each of the six temporal parameters (onset/peak/offset latency, peak amplitude, acceleration, and deceleration) from a uniform distribution. We generated time courses for each AU using a cubic Hermite spline interpolation (five control points, 30 time frames). To generate unique identities on each trial, we first obtained eight neutral expression identities per race (white WC: four female, mean age 23 y, SD 4.1 y; Chinese EA: four female, mean age 22.1 y, SD 0.99 y) under the same conditions of illumination (2,600 lx) and recording distance (143 cm; Dimensional Imaging) (38). Before recording, posers removed any makeup, facial hair, visible jewelry, and/or glasses, and concealed their head hair with a cap. We then created, for each race of face, two independent “identity spaces” for each sex using the corresponding subset of base identities and the shape and Red-Green-Blue (RGB) texture alignment procedures (18). We defined all points in the identity space by a [4 identities × 1] unit vector, where each entry corresponded to the weight assigned to each individual identity in a linear mixture. We then randomly selected each unit vector from a uniform distribution and constructed the neutral base shape and RGB texture accordingly. Finally, we retargeted the selected temporal dynamic parameters for each AU onto the identity created and rendered all facial animations using 3ds Max.
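The sketch below shows, under stated assumptions, how one AU's time course and one random identity mixture could be generated. A monotone cubic Hermite interpolant (scipy's PchipInterpolator) stands in for the paper's cubic Hermite spline, the placement of the five control points and the normalization of the identity weight vector are simplifying guesses, and no rendering is attempted.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

rng = np.random.default_rng(2)
N_FRAMES = 30

def random_au_time_course():
    """One AU's activation over 30 frames from five control points (cubic Hermite interpolation).
    The mapping from onset/peak/offset latency and peak amplitude to control points
    is an assumption; acceleration and deceleration terms are omitted for brevity."""
    onset, peak, offset = np.sort(rng.uniform(0.05, 0.95, size=3))
    amplitude = rng.uniform(0.2, 1.0)
    x = np.array([0.0, onset, peak, offset, 1.0]) + np.arange(5) * 1e-6  # strictly increasing knots
    y = np.array([0.0, 0.1 * amplitude, amplitude, 0.1 * amplitude, 0.0])
    spline = PchipInterpolator(x, y)          # monotone cubic Hermite interpolant
    t = np.linspace(0.0, 1.0, N_FRAMES)
    return np.clip(spline(t), 0.0, None)

def random_identity_weights(n_base_identities=4):
    """Random weights over the four base identities; normalized to a unit vector here,
    although the paper does not specify the exact normalization."""
    w = rng.uniform(size=n_base_identities)
    return w / np.linalg.norm(w)

print(random_au_time_course().round(2))
print(random_identity_weights().round(2))
```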
Procedure.
Observers viewed stimuli on a black background displayed on a 19-inch flat panel Dell monitor with a refresh rate of 60 Hz and resolution of 1,024 × 1,280. Stimuli appeared in the central visual field and remained visible until the observer responded. A chin rest ensured a constant viewing distance of 68 cm, with images subtending 14.25° (vertical) and 10.08° (horizontal) of visual angle, reflecting the average size of a human face (39) during natural social interaction (40). We randomized trials within each block and counterbalanced (race of face) blocks across observers in each cultural group. Before the experiment, we established familiarity with the emotion categories by asking observers to provide correct synonyms and descriptions of each emotion category. We controlled stimulus presentation using Matlab 2009b.
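As a quick check on the stated geometry, the stimulus sizes can be converted back to physical extents at the 68-cm viewing distance with size = 2d·tan(θ/2); the snippet below is a small worked example, and the resulting roughly 17 × 12 cm extent is consistent with an average adult face.

```python
import math

viewing_distance_cm = 68.0
vertical_deg, horizontal_deg = 14.25, 10.08

def extent_cm(angle_deg, distance_cm):
    """Physical size subtending a given visual angle at a given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

print(round(extent_cm(vertical_deg, viewing_distance_cm), 1))    # ~17.0 cm (stimulus height)
print(round(extent_cm(horizontal_deg, viewing_distance_cm), 1))  # ~12.0 cm (stimulus width)
```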
Model Fitting.
To construct the facial expression models for each observer, emotion, and intensity level, we followed established model fitting procedures (18). First, we performed a Pearson correlation between the binary activation parameter of each AU and the binary response variable for each of the observers’ emotion responses, thus producing a 41-dimensional vector detailing the composition of facial muscles. To model the dynamic component of the models, we then performed a linear regression between each of the binary emotion response variables and the six temporal parameters for each AU. To calculate the intensity gradients, we fitted a linear regression model of each AU’s temporal parameters to the observer’s intensity ratings. Finally, to generate movies of the dynamic models, we combined the significantly correlated AUs with the temporal parameters derived from the regression coefficients.
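A minimal sketch of this fitting pipeline for a single observer and a single emotion label is given below; the trial arrays are random placeholders, ordinary least squares stands in for whichever regression routine the authors used, and variable names are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
N_TRIALS, N_AUS, N_TEMP = 4800, 41, 6

# Hypothetical per-trial data for one observer (real data come from the experiment).
au_on = rng.integers(0, 2, size=(N_TRIALS, N_AUS))        # binary AU activation on each trial
temporal = rng.uniform(size=(N_TRIALS, N_AUS, N_TEMP))    # six temporal parameters per AU per trial
chose_happy = rng.integers(0, 2, size=N_TRIALS)           # binary response for one emotion category
intensity = rng.integers(1, 6, size=N_TRIALS)             # intensity rating on a 1-5 scale

def slopes(X, y):
    """Least-squares regression coefficients of y on X (intercept dropped)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0][1:]

# 1. Composition of facial movements: correlate each AU's activation with the response
#    and keep the significantly correlated AUs.
r_p = [stats.pearsonr(au_on[:, j], chose_happy) for j in range(N_AUS)]
au_weights = np.array([rp[0] for rp in r_p])
selected_aus = np.flatnonzero(np.array([rp[1] for rp in r_p]) < 0.05)

# 2. Temporal dynamics: regress the binary response on each AU's six temporal parameters.
temporal_coefs = np.stack([slopes(temporal[:, j, :], chose_happy) for j in range(N_AUS)])

# 3. Intensity gradients: regress each AU's temporal parameters on the intensity ratings.
intensity_grads = np.stack([slopes(intensity.reshape(-1, 1), temporal[:, j, :])[0] for j in range(N_AUS)])

print(au_weights.shape, temporal_coefs.shape, intensity_grads.shape)  # (41,) (41, 6) (41, 6)
```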
Clustering Analysis and Mutual Information.
To ascertain the optimal number of clusters required to map the distribution of the models in each culture, we applied k-means clustering analysis (41) (k = 2–40 inclusive) to the 180 WC and 180 EA models independently and calculated mutual information (MI) (42) for each value of k as follows. We randomly selected 90 models (15 per emotion) and applied k-means clustering analysis (Euclidean distance; 1,000 repetitions). Using the resulting k centroids, we then assigned the remaining 90 models to clusters on the basis of shortest Euclidean distance and calculated the MI between the models’ emotion labels and their cluster assignments. We repeated the computation 100 times, averaged the 100 MI values, and normalized by an ideal MI (i.e., perfect association between cluster and emotion label). Fig. S1 shows the averaged MI for each culture (WC, blue line; EA, red line).
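A compact version of this split-half clustering and MI procedure is sketched below using scikit-learn; the model matrix is a random placeholder, n_init and the number of splits are reduced for speed (the paper used 1,000 k-means repetitions and 100 splits), and normalization by the "ideal" MI is implemented as the MI of the emotion labels with themselves.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(4)

# Stand-ins for one culture's 180 x 41 model matrix and its emotion labels (30 per emotion).
models = rng.normal(size=(180, 41))
labels = np.repeat(np.arange(6), 30)

def split_half_normalized_mi(models, labels, k, n_splits=10):
    """Fit k-means on half the models (15 per emotion), assign the held-out half to the
    nearest centroid, and return the average MI with the emotion labels, normalized by
    the MI of the labels with themselves (perfect association)."""
    mis = []
    for _ in range(n_splits):
        train, test = [], []
        for emotion in range(6):
            idx = rng.permutation(np.flatnonzero(labels == emotion))
            train.extend(idx[:15])
            test.extend(idx[15:])
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(models[train])
        clusters = km.predict(models[test])      # shortest Euclidean distance to centroids
        mis.append(mutual_info_score(labels[test], clusters))
    ideal = mutual_info_score(labels, labels)
    return float(np.mean(mis)) / ideal

print({k: round(split_half_normalized_mi(models, labels, k), 3) for k in (2, 6, 12)})
```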
Acknowledgments
The Economic and Social Research Council and Medical Research Council (United Kingdom; ESRC/MRC-060-25-0010) and the British Academy (SG113332) supported this work.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1200155109/-/DCSupplemental.
References
- 1. Darwin C. The Expression of the Emotions in Man and Animals. 3rd Ed. London: Fontana Press; 1999.
- 2. Susskind JM, et al. Expressing fear enhances sensory acquisition. Nat Neurosci. 2008;11:843–850. doi: 10.1038/nn.2138.
- 3. Andrew RJ. Evolution of facial expression. Science. 1963;142:1034–1041. doi: 10.1126/science.142.3595.1034.
- 4. Ekman P, Sorenson ER, Friesen WV. Pan-cultural elements in facial displays of emotion. Science. 1969;164:86–88. doi: 10.1126/science.164.3875.86.
- 5. Tomkins SS. Affect, Imagery, and Consciousness. New York: Springer; 1962.
- 6. Tomkins SS. Affect, Imagery, and Consciousness. New York: Springer; 1963.
- 7. Ekman P, Friesen W, Hager JC. Facial Action Coding System Investigator's Guide. Salt Lake City, UT: Research Nexus; 1978.
- 8. Biehl M, et al. Matsumoto and Ekman's Japanese and Caucasian facial expressions of emotion (JACFEE): Reliability data and cross-national differences. J Nonverbal Behav. 1997;21:3–21.
- 9. Ekman P, et al. Universals and cultural differences in the judgments of facial expressions of emotion. J Pers Soc Psychol. 1987;53:712–717. doi: 10.1037//0022-3514.53.4.712.
- 10. Matsumoto D, et al. American-Japanese cultural differences in judgements of emotional expressions of different intensities. Cogn Emotion. 2002;16:721–747.
- 11. Matsumoto D. Cultural influences on the perception of emotion. J Cross Cult Psychol. 1989;20:92–105.
- 12. Jack RE, Blais C, Scheepers C, Schyns PG, Caldara R. Cultural confusions show that facial expressions are not universal. Curr Biol. 2009;19:1543–1548. doi: 10.1016/j.cub.2009.07.051.
- 13. Moriguchi Y, et al. Specific brain activation in Japanese and Caucasian people to fearful faces. Neuroreport. 2005;16:133–136. doi: 10.1097/00001756-200502080-00012.
- 14. Yrizarry N, Matsumoto D, Wilson-Cohn C. American-Japanese differences in multiscalar intensity ratings of universal facial expressions of emotion. Motiv Emot. 1998;22:315–327.
- 15. Matsumoto D, Ekman P. American-Japanese cultural differences in intensity ratings of facial expressions of emotion. Motiv Emot. 1989;13:143–157.
- 16. Matsumoto D. Cultural similarities and differences in display rules. Motiv Emot. 1990;14:195–214.
- 17. Matsumoto D, Ekman P. Japanese and Caucasian Facial Expressions of Emotion (JACFEE) and Neutral Faces (JACNeuF) [slides]. San Francisco: San Francisco State University; 1988.
- 18. Yu H, Garrod OGB, Schyns PG. Perception-driven facial expression synthesis. Comput Graph. 2012;36(3):152–162.
- 19. Chomsky N. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press; 1965.
- 20. Grenander U, Miller M. Pattern Theory: From Representation to Inference. Oxford, UK: Oxford Univ Press; 2007.
- 21. Jack RE, Caldara R, Schyns PG. Internal representations reveal cultural diversity in expectations of facial expressions of emotion. J Exp Psychol Gen. 2011;141:19–25. doi: 10.1037/a0023463.
- 22. Dotsch R, Wigboldus DH, Langner O, van Knippenberg A. Ethnic out-group faces are biased in the prejudiced mind. Psychol Sci. 2008;19:978–980. doi: 10.1111/j.1467-9280.2008.02186.x.
- 23. Ahumada A, Lovell J. Stimulus features in signal detection. J Acoust Soc Am. 1971;49:1751–1756.
- 24. Levenson RW. Basic emotion questions. Emotion Review. 2011;3:379–386.
- 25. Ekman P. Universals and cultural differences in facial expressions of emotion. In: Cole J, editor. Nebraska Symposium on Motivation. Lincoln, NE: Univ of Nebraska Press; 1972.
- 26. Duchenne GB. The Mechanism of Human Facial Expression, or an Electro-Physiological Analysis of the Expression of the Emotions. Cuthbertson R, editor and translator. New York: Cambridge Univ Press; 1990 (original work published 1862).
- 27. Matsumoto D, Takeuchi S, Andayani S, Kouznetsova N, Krupp D. The contribution of individualism vs. collectivism to cross-national differences in display rules. Asian J Soc Psychol. 1998;1:147–165.
- 28. Elfenbein HA, Beaupré M, Lévesque M, Hess U. Toward a dialect theory: Cultural differences in the expression and recognition of posed facial expressions. Emotion. 2007;7:131–146. doi: 10.1037/1528-3542.7.1.131.
- 29. Marsh AA, Elfenbein HA, Ambady N. Nonverbal “accents”: Cultural differences in facial expressions of emotion. Psychol Sci. 2003;14:373–376. doi: 10.1111/1467-9280.24461.
- 30. Niedenthal PM, Mermillod M, Maringer M, Hess U. The simulation of smiles (SIMS) model: Embodied simulation and the meaning of facial expression. Behav Brain Sci. 2010;33:417–433, discussion 433–480. doi: 10.1017/S0140525X10000865.
- 31. Ekman P. An argument for basic emotions. Cogn Emotion. 1992;6:169–200.
- 32. Ekman P. Are there basic emotions? Psychol Rev. 1992;99:550–553. doi: 10.1037/0033-295x.99.3.550.
- 33. Ekman P, Cordaro D. What is meant by calling emotions basic. Emotion Review. 2011;3:364–370.
- 34. Li J, Wang L, Fischer K. The organisation of Chinese shame concepts. Cogn Emotion. 2004;18:767–797.
- 35. Tracy JL, Robins RW. Show your pride: Evidence for a discrete emotion expression. Psychol Sci. 2004;15:194–197. doi: 10.1111/j.0956-7976.2004.01503008.x.
- 36. Bedford O, Hwang K-K. Guilt and shame in Chinese culture: A cross-cultural framework from the perspective of morality and identity. J Theory Soc Behav. 2003;33:127–144.
- 37. Ekman P, Friesen W. Facial Action Coding System: A Technique for the Measurement of Facial Movement. Washington, DC: Consulting Psychologists Press; 1978.
- 38. Urquhart CW, Green DS, Borland ED. 4D capture using passive stereo photogrammetry. In: Proceedings of the 3rd European Conference on Visual Media Production. Edison, NJ: Institution of Electrical Engineers; 2006. p. 196.
- 39. Ibrahimagić-Šeper L, Čelebić A, Petričević N, Selimović E. Anthropometric differences between males and females in face dimensions and dimensions of central maxillary incisors. Med Glas. 2006;3:58–62.
- 40. Hall E. The Hidden Dimension. Garden City, NY: Doubleday; 1966.
- 41. MacQueen J. Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol 1. Berkeley, CA: Univ of California Press; 1967. pp. 281–297.
- 42. Magri C, Whittingstall K, Singh V, Logothetis NK, Panzeri S. A toolbox for the fast information analysis of multiple-site LFP, EEG and spike train recordings. BMC Neurosci. 2009;10:81. doi: 10.1186/1471-2202-10-81.