Author manuscript; available in PMC 2014 Aug 6. Published in final edited form as: Curr Biol. 2011 Nov 8;21(21):R888–R890. doi:10.1016/j.cub.2011.09.020

Face Recognition: Vision and Emotions beyond the Bubble

Hanlin Tang 1, Gabriel Kreiman 1,2,3
PMCID: PMC4122972  NIHMSID: NIHMS353946  PMID: 22075428

Abstract

A new study of how neurons in the human amygdala represent faces and their component features argues for a holistic representation.


Visual input from the retina travels through a cascade of processes in the neocortex to the highest echelons of the brain, eventually feeding into areas that govern memory, emotion, cognition and action. An important step toward explaining these higher brain functions is to understand and quantitatively characterize the neuronal circuits behind the transformation of the pixel-like visual input into the complex, behaviorally relevant format found in higher brain centers.

As reported recently in Current Biology, Rutishauser et al. [1] courageously attacked this question by recording the activity of individual neurons in the human brain while subjects viewed and acted upon images of faces. The researchers focused their study on the amygdala, a region of the brain that receives direct visual input from the inferior temporal cortex and plays a central role in processing emotions [2]. Higher brain centers that govern complex behavior are typically difficult to study, and the amygdala is no exception. Studies in rodents and non-human primates can take advantage of electrophysiological techniques to monitor the activity of individual neurons, but it is not always trivial to design behavioral paradigms that tap into the rich repertoire of human emotions. Non-invasive studies of the human amygdala suffer from poor spatial and/or temporal resolution. Rutishauser et al. [1] combined the best of both worlds by examining neuronal activity in epileptic patients in whom electrodes had been implanted for clinical reasons [3,4]. This type of recording can provide insights into human cognition at the level of individual neurons and local circuits.

Previous single-unit studies have revealed that neurons in the primate amygdala (in humans and monkeys) respond to complex visual shapes, including faces and other stimuli [5–8]. However, it was not clear whether these responses require visual presentation of the whole stimulus, or whether certain parts or features of the stimulus are sufficient to elicit a selective response. Because the amygdala is involved in recognizing emotions, the integration of different features into a whole percept may provide clues about how emotions are processed. Rutishauser et al. [1] hypothesized that the representation in the amygdala may have ‘holistic’ characteristics: that is, that neurons might be particularly sensitive to whole stimuli as opposed to stimulus parts. The authors used an experimental paradigm in which face images are presented through ‘bubbles’, such that only partial information is available to the viewer, who has to make a categorical discrimination based on the input.
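To make the paradigm concrete, the following is a minimal sketch of how a ‘bubbles’ stimulus might be generated: Gaussian apertures at random locations reveal small portions of a face image against a neutral background, so that each trial exposes only a subset of the features. The function names and parameter values (number of bubbles, aperture width) are illustrative assumptions, not the settings used by Rutishauser et al. [1].

```python
import numpy as np

def bubbles_mask(shape, n_bubbles=10, sigma=12.0, rng=None):
    """Soft occlusion mask built from randomly placed Gaussian apertures.

    shape     : (height, width) of the stimulus image
    n_bubbles : number of apertures per trial (illustrative value)
    sigma     : width of each Gaussian bubble, in pixels (illustrative value)
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    mask = np.zeros(shape)
    for _ in range(n_bubbles):
        cy, cx = rng.uniform(0, h), rng.uniform(0, w)
        mask += np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
    return np.clip(mask, 0.0, 1.0)  # 1 = fully visible, 0 = fully occluded

def apply_bubbles(image, mask, background=0.5):
    """Blend the image with a uniform gray background according to the mask."""
    return mask * image + (1 - mask) * background
```

Varying the number of bubbles parametrically controls how much of the whole face is available on each trial, which is exactly the knob that allows responses to parts to be compared against responses to wholes.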

What do neurons in the amygdala say about all this holistic business? Rutishauser et al. [1] found that several amygdala neurons prefer whole stimuli as opposed to specific parts or features. These neurons show surprising sensitivity in their firing rate responses to small degrees of occlusion in the stimuli, suggesting a ‘holistic’ representation. The responses are not necessarily monotonic and often defy our intuitions. In fact, the firing activity in response to stimulus parts does not reveal any immediately obvious relationship to the responses to the whole stimuli: the authors argue that the former cannot predict the latter. Intriguingly, in many instances, more information leads to smaller responses.

Given these puzzling observations, it is worth pondering the visual inputs to the amygdala and the degree to which the incoming information reflects features, wholes, or both. Visual shape information is conveyed to the amygdala primarily through regions in inferior temporal cortex in monkeys [9] (less is known about the detailed neuroanatomical connections in humans). One possibility is that the input conveys information about features, which are combined in the amygdala in order to interpret the emotions conveyed by the whole stimulus. This notion is consistent with the results of several neurophysiological recordings in the macaque monkey inferior temporal cortex, where neurons seem to respond to complex shapes and features (for example, [10–13], among many others). Alternatively, the regions of inferior temporal cortex that feed into the amygdala may contain neurons that share some properties with the ones reported by Rutishauser et al. [1], such as enhanced responses to whole objects [14].

It is not easy to interpret the neurophysiological responses without the aid of clear theoretical and computational models. The problem of object completion from partial information has received significant attention in the computational neuroscience literature. Object completion is relevant to the current study because the images were seen through bubbles, making object recognition from partial information a necessary step for a putative ‘holistic’ representation. Attractor networks show a remarkable ability to complete patterns by driving activity according to well-specified dynamical rules that guide the system from arbitrary starting points towards stored memories [15].
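As a concrete illustration of this idea, below is a minimal Hopfield-style sketch of pattern completion in the spirit of [15]. The toy binary patterns, network size, and corruption level are arbitrary choices for demonstration; nothing here is meant as a model of amygdala circuitry.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: sum of outer products of the patterns, zero diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def complete(W, probe, n_sweeps=20, seed=0):
    """Asynchronous updates drive a partial pattern toward a stored attractor."""
    state = probe.copy()
    rng = np.random.default_rng(seed)
    for _ in range(n_sweeps):
        for i in rng.permutation(len(state)):
            state[i] = 1.0 if W[i] @ state >= 0 else -1.0
    return state

# Store two random +/-1 patterns, then recover one from a corrupted probe
rng = np.random.default_rng(1)
patterns = rng.choice([-1.0, 1.0], size=(2, 100))
W = train_hopfield(patterns)
probe = patterns[0].copy()
probe[:40] = rng.choice([-1.0, 1.0], size=40)  # scramble 40% of the entries
recovered = complete(W, probe)
print(np.mean(recovered == patterns[0]))       # typically close to 1.0
```

The dynamics descend an energy function, so even a heavily occluded probe is pulled toward the nearest stored memory; this is the sense in which such networks ‘complete’ objects from partial information.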

Some authors have speculated that the neuronal responses in the hippocampus are reminiscent of the dynamical patterns described by attractor networks [16]. Whether these similarities carry over to the amygdala is not clear. Attractor network models rely on massive recurrent connectivity and contrast with other computational architectures in which features are combined in a purely feed-forward, hierarchical fashion (for example, [17,18]). Several computational models of the ventral visual stream progressively build neurons that respond to more complex features using input from the parts represented in the previous layer [17–20]. It is conceivable (but far from clear) that hierarchical feature-based representations throughout the ventral visual stream encounter attractor network architectures at the highest echelons. It will be interesting and important for the field to reflect upon the type of computational principles that can give rise to the variety and non-monotonic nature of the responses reported by Rutishauser et al. [1].
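For contrast, the feed-forward alternative can be caricatured with two alternating operations, in the spirit of [17,18,20]: a tuning stage in which units respond to the match between an image patch and a stored template, followed by a pooling stage in which a max over neighboring positions confers tolerance to small shifts. The template shapes and layer sizes below are illustrative assumptions, not parameters from any published model.

```python
import numpy as np

def tuning_layer(image, templates):
    """Template-matching stage: one feature map per stored template."""
    th, tw = templates.shape[1:]
    h, w = image.shape
    maps = np.zeros((len(templates), h - th + 1, w - tw + 1))
    for k, t in enumerate(templates):
        for y in range(h - th + 1):
            for x in range(w - tw + 1):
                patch = image[y:y + th, x:x + tw]
                maps[k, y, x] = np.exp(-np.sum((patch - t) ** 2))
    return maps

def pooling_layer(maps, size=2):
    """Max-pooling stage: invariance to small translations."""
    k, h, w = maps.shape
    hh, ww = h // size, w // size
    cropped = maps[:, :hh * size, :ww * size]
    return cropped.reshape(k, hh, size, ww, size).max(axis=(2, 4))

# Stacking tuning/pooling pairs yields units that respond to progressively
# larger and more complex conjunctions of the parts represented below them.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
templates = rng.random((4, 3, 3))
c1 = pooling_layer(tuning_layer(image, templates))  # shape (4, 7, 7)
```

In such architectures the response to a whole is, by construction, a function of the responses to its parts, which is precisely the assumption that the non-monotonic responses reported by Rutishauser et al. [1] call into question.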

The computational models also highlight the difficulties inherent in defining wholes and parts. In the current study [1], as in many other studies, there is an anthropomorphic distinction between wholes and parts. Further inspection shows that these definitions are far from trivial. Isn't a face part of a whole individual? Or why not consider the eyes a separate whole? Is ‘F’ a whole letter, or is it part of the letter ‘E’? Perhaps the distinction between features and wholes can be accounted for, at least in part, by experience with particular combinations of features that tend to appear together in certain configurations. Simple null models may not know about ‘whole objects’, often work in feature spaces that are indifferent to the charm of faces, and may not necessarily be able to distinguish emotions in the images. Inasmuch as these null models fail to explain the bewildering complexity and beauty of the neurophysiology in the amygdala, the current study elegantly forces us to go back and build more elaborate theories and algorithms.

One of the nice aspects of doing science is that good work can lead to more work. Thus, several questions emerge from the work of Rutishauser et al. [1]. As outlined above, the definition of ‘wholes’ and ‘parts’ is not trivial. It seems important to further characterize the visual input to areas such as the amygdala, so that we can better describe which computational properties are unique to the amygdala and which ones are inherited from earlier processing. The authors focus on face images, but the study leaves open the possibility that the amygdala shows similar responses to non-face objects. Is the ‘holistic’ nature of the representation limited to faces [14]? The dynamics of the neuronal responses may provide further insights into the computational principles behind recognition and object completion. What type of computational model can give rise to the non-intuitive responses described in this study? The inspiring work of Rutishauser et al. [1] opens the door to a rich set of questions that deserve further investigation.

References

1. Rutishauser U, Tudusciuc O, Neumann D, Mamelak A, Heller C, Ross I, Philpott L, Sutherling W, Adolphs R. Single-unit responses selective for whole faces in the human amygdala. Curr Biol. 2011;21:1654–1660. doi:10.1016/j.cub.2011.08.035
2. Adolphs R, Tranel D, Damasio H, Damasio A. Impaired recognition of emotion in facial expressions following bilateral damage to the amygdala. Nature. 1994;372:669–672. doi:10.1038/372669a0
3. Engel AK, Moll CK, Fried I, Ojemann GA. Invasive recordings from the human brain: clinical insights and beyond. Nat Rev Neurosci. 2005;6:35–47. doi:10.1038/nrn1585
4. Kreiman G. Single neuron approaches to human vision and memories. Curr Opin Neurobiol. 2007;17:471–475. doi:10.1016/j.conb.2007.07.005
5. Fried I, MacDonald KA, Wilson C. Single neuron activity in human hippocampus and amygdala during recognition of faces and objects. Neuron. 1997;18:753–765. doi:10.1016/s0896-6273(00)80315-3
6. Gothard KM, Battaglia FP, Erickson CA, Spitler KM, Amaral DG. Neural responses to facial expression and face identity in the monkey amygdala. J Neurophysiol. 2007;97:1671–1683. doi:10.1152/jn.00714.2006
7. Leonard CM, Rolls ET, Wilson FAW, Baylis GC. Neurons in the amygdala of the monkey with responses selective for faces. Behav Brain Res. 1985;15:159–176. doi:10.1016/0166-4328(85)90062-2
8. Kreiman G, Koch C, Fried I. Category-specific visual responses of single neurons in the human medial temporal lobe. Nat Neurosci. 2000;3:946–953. doi:10.1038/78868
9. Cheng K, Saleem KS, Tanaka K. Organization of corticostriatal and corticoamygdalar projections arising from the anterior inferotemporal area TE of the macaque monkey: a Phaseolus vulgaris leucoagglutinin study. J Neurosci. 1997;17:7902–7925. doi:10.1523/JNEUROSCI.17-20-07902.1997
10. Kovacs G, Vogels R, Orban GA. Selectivity of macaque inferior temporal neurons for partially occluded shapes. J Neurosci. 1995;15:1984–1997. doi:10.1523/JNEUROSCI.15-03-01984.1995
11. Logothetis NK, Sheinberg DL. Visual object recognition. Annu Rev Neurosci. 1996;19:577–621. doi:10.1146/annurev.ne.19.030196.003045
12. Connor CE, Brincat SL, Pasupathy A. Transformation of shape information in the ventral pathway. Curr Opin Neurobiol. 2007;17:140–147. doi:10.1016/j.conb.2007.03.002
13. Nielsen K, Logothetis N, Rainer G. Dissociation between LFP and spiking activity in macaque inferior temporal cortex reveals diagnostic parts-based encoding of complex objects. J Neurosci. 2006;26:9639–9645. doi:10.1523/JNEUROSCI.2273-06.2006
14. Logothetis NK. Object recognition: holistic representations in the monkey brain. Spat Vis. 2000;13:165–178. doi:10.1163/156856800741180
15. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA. 1982;79:2554–2558. doi:10.1073/pnas.79.8.2554
16. Deco G, Rolls ET. Computational Neuroscience of Vision. Oxford: Oxford University Press; 2004.
17. Fukushima K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybernet. 1980;36:193–202. doi:10.1007/BF00344251
18. Riesenhuber M, Poggio T. Hierarchical models of object recognition in cortex. Nat Neurosci. 1999;2:1019–1025. doi:10.1038/14819
19. Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–791. doi:10.1038/44565
20. Serre T, Kreiman G, Kouh M, Cadieu C, Knoblich U, Poggio T. A quantitative theory of immediate visual recognition. Prog Brain Res. 2007;165:33–56. doi:10.1016/S0079-6123(06)65004-8
