Proceedings of the National Academy of Sciences of the United States of America. 2005 Apr 28;102(19):6996–7001. doi: 10.1073/pnas.0502605102

Representations of faces and body parts in macaque temporal cortex: A functional MRI study

Mark A Pinsk *,†, Kevin DeSimone *,†, Tirin Moore §, Charles G Gross *, Sabine Kastner *,†
PMCID: PMC1100800  PMID: 15860578

Abstract

Human neuroimaging studies suggest that areas in temporal cortex respond preferentially to certain biologically relevant stimulus categories such as faces and bodies. Single-cell studies in monkeys have reported cells in inferior temporal cortex that respond selectively to faces, hands, and bodies but provide little evidence of large clusters of category-specific cells that would form “areas.” We probed the category selectivity of macaque temporal cortex for representations of monkey faces and monkey body parts relative to man-made objects using functional MRI in animals trained to fixate. Two face-selective areas were activated bilaterally in the posterior and anterior superior temporal sulcus exhibiting different degrees of category selectivity. The posterior face area was more extensively activated in the right hemisphere than in the left hemisphere. Immediately adjacent to the face areas, regions were activated bilaterally responding preferentially to body parts. Our findings suggest a category-selective organization for faces and body parts in macaque temporal cortex.

Keywords: non-human primate, visual category representations


Human and non-human primates have a remarkable ability to recognize a large variety of different objects in their environments. In humans, neuroimaging studies have shown that object information is represented in a large swath of ventral temporal and lateral occipital cortex that is characterized by stronger responses to objects than to non-objects (1, 2). Within these object-selective activations, discrete regions have been identified that respond preferentially to some biologically relevant stimulus categories such as faces [the “fusiform face area,” FFA (3–6)] or bodies [the “extrastriate body area,” EBA (7)], suggesting a category-specific and anatomically segregated modular organization of neural representations related to certain classes of object stimuli. These studies are consistent with reports of patients with lesions of temporooccipital cortex, who show selective impairments in recognizing familiar faces [prosopagnosia (8)] or body parts (9) but not other objects.

In the macaque, much less is known about the large-scale representation of object information. Single-cell physiology studies have shown that neurons in inferior temporal (IT) cortex typically respond to complex stimuli with some selectivity for shape, color, and texture (10, 11). A small proportion of IT neurons were found to respond selectively to faces (10–16), hands (14, 17), or human bodies (18). These neurons were more common in the portion of cytoarchitectonic area TE on the ventral bank of the superior temporal sulcus (STS) and in the superior polysensory area on the dorsal bank of the STS. They were also found on the lateral and ventral surfaces of area TE. Even though face-selective neurons were found clustered together, or sometimes even formed columns (19, 20), there was no evidence for an organization into face- or body-selective areas in monkey IT cortex, similar to the human FFA or EBA.

However, this view has recently been challenged by the demonstration of discrete face-selective areas in the posterior and anterior STS by using functional MRI (fMRI) in anesthetized and awake macaques (21, 22). The face-selective areas were embedded in a large object-selective activation extending from area V4 to rostral TE (22). These studies were the first to provide evidence of category-selective areas in monkey IT cortex. Here, we probe the neural representations of two classes of biologically relevant stimuli, monkey faces and body parts, using fMRI in behaving macaques.

Methods

Subjects. Subjects were three adult, male macaque monkeys (Macaca fascicularis) weighing 4–9 kg. All procedures were approved by the Princeton University Animal Care and Use Committee and conformed to National Institutes of Health guidelines for the humane care and use of laboratory animals.

Details regarding surgery, experimental setup, data acquisition, and analysis are described by Pinsk et al. (23) and will only be briefly summarized here. Each animal was surgically implanted with a plastic head bolt by using ceramic screws and dental acrylic. Monkeys were placed in an MR-compatible primate chair prone with their heads erect and rigidly fixed in a head-holding apparatus. The animals were acclimated to the MRI environment through the use of a mock scanner. Monkeys were trained to fixate on a small dot at the center of a display screen by using an infrared eye-tracking system (Applied Science Laboratories, Bedford, MA). By providing the animals with regular juice rewards while they maintained fixation within a 4° square window, and systematically increasing the reward rate during the course of a trial, the animals were trained to fixate for as long as 4 min.

Visual Stimuli and Experimental Design. Color pictures of monkey faces, monkey body parts, and man-made objects were presented on a screen while the animals maintained fixation (see Fig. 4, which is published as supporting information on the PNAS web site). The stimuli subtended 12 × 12° and were presented for 1 s foveally behind the fixation point (0.5° diameter), followed by a 1-s blank interval during which only the fixation point was present. Several additional categories of stimuli (food, laboratory scenes) were presented during each trial, the results of which will be described in a separate report. Blocks of stimuli from each category were presented interleaved with blank periods, each lasting for 12 s. Each category block was repeated twice within a trial, resulting in trials of 240 s each. Monkey M3 was only tested for faces in relation to pictures of houses instead of man-made objects and not for body parts. Stimulus presentation and eye position recordings were synchronized to the beginning of each scan by using a trigger pulse from the scanner.

Data Acquisition. Structural and functional images were acquired with a 3-T head-dedicated scanner (Magnetom Allegra, Siemens, Erlangen, Germany), using a 12-cm transmit/receive surface coil (NMSC-023, Nova Medical, Wakefield, MA). For cortical surface reconstructions, a high-resolution (0.5 × 0.5 × 0.5 mm) structural scan was acquired in a separate session during which the animals were anesthetized with Telazol (tiletamine/zolazepam, 10 mg/kg i.m.) [MPRAGE sequence; field of view (FOV) = 128 × 128 mm; 256 × 256 matrix; TR = 2,500 ms; TE = 4.4 ms; TI = 1,100 ms; flip angle = 8°; 20 acquisitions]. All other scan sessions were performed with awake animals. Functional images were taken with a gradient echo, echo planar imaging sequence (FOV = 80 × 80 mm; 64 × 64 matrix; TR = 2,400 ms, TE = 32 ms, flip angle = 90°, bandwidth = 2,112 Hz per pixel). Twenty-seven contiguous coronal slices (thickness of 2 mm without gap; in-plane resolution of 1.25 × 1.25 mm) were acquired in six to eight series of 100 images each, starting from the posterior pole and covering the brain up to the region of the principal sulcus. An in-plane magnetic field map image was acquired to perform echo planar imaging undistortion (FOV = 80 × 80 mm; 64 × 64 matrix; TR = 600 ms, TE = 8.8/11.3 ms; flip angle = 45°). An anatomical scan was also acquired in the same session to facilitate cortical surface alignments (0.5 × 0.5 × 1.0 mm, MPRAGE sequence; FOV = 128 × 128 mm; 256 × 256 matrix; TR = 2,500 ms; TE = 4.4 ms; TI = 1,100 ms; flip angle = 8°; 1 acquisition).
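The timing figures given for the block design and for the functional sequence can be cross-checked with simple arithmetic. The sketch below assumes that every stimulus block is paired with exactly one blank period; under that assumption a 240-s trial holds ten stimulus blocks, which is consistent with five categories shown twice each (faces, body parts, objects, plus the two additional categories mentioned), and one series of 100 volumes at TR = 2,400 ms spans one trial.

```python
# Back-of-the-envelope check of the block-design timing (values from the text).
BLOCK_S = 12.0             # duration of each stimulus block and each blank period
REPEATS = 2                # each category block is shown twice per trial
TRIAL_S = 240.0            # reported trial length
TR_S = 2.4                 # repetition time of the functional sequence
VOLUMES_PER_SERIES = 100   # images per series

# Assumption: each stimulus block is followed by exactly one blank period.
n_stimulus_blocks = TRIAL_S / (2 * BLOCK_S)
n_categories = n_stimulus_blocks / REPEATS
series_duration_s = TR_S * VOLUMES_PER_SERIES
volumes_per_block = BLOCK_S / TR_S
stimuli_per_block = BLOCK_S / 2.0   # 1-s stimulus followed by a 1-s blank interval

print("stimulus blocks per trial:", n_stimulus_blocks)
print("categories per trial:", n_categories)
print("series duration (s):", series_duration_s)
print("volumes per block:", volumes_per_block)
print("stimuli per block:", stimuli_per_block)
```

Under these assumptions each 12-s block contributes about five functional volumes and six stimulus presentations.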

In total, 4,400 and 4,800 functional volumes were acquired in monkeys M1 and M2, respectively, across six scan sessions per animal. In monkey M3, 600 functional volumes were acquired in a single session. The reliability of activations was investigated in monkey M1, who repeated the experiment twice with 3,000 functional volumes acquired in five and four scan sessions, respectively.

Data Analysis. Data were analyzed by using AFNI (http://afni.nimh.nih.gov/afni), FreeSurfer (http://surfer.nmr.mgh.harvard.edu), and SUMA (http://afni.nimh.nih.gov/afni/suma). Scans during which the animal broke fixation >20 times and for longer than 500 ms each time were excluded from fMRI analysis. The functional images were motion-corrected to the image acquired closest to the anatomical scan, undistorted by using the field map scan (24), and spatially filtered with a 2-mm Gaussian kernel. Each time series was normalized to its mean so that all of the time series across scan sessions could be entered into a single multiple regression analysis. Square-wave functions matching the time course of the experimental design were convolved with a gamma-variate function (25) and used as regressors of interest in a multiple regression model in the framework of the general linear model (26). Additional regressors to account for variance due to baseline shifts between time series, linear drifts within time series, and head motion were included in the regression model. Brain regions responding more strongly to faces or body parts were identified by contrasting presentation blocks of faces with objects and body parts with objects, respectively, similar to definitions used in previous human fMRI studies (3, 4, 6, 27). The statistical maps were thresholded at a Z score of 2.33 (P < 0.01) and overlaid on anatomical scans or cortical surface reconstructions.
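The regression step can be illustrated with a minimal sketch: block regressors are built by convolving square waves with a gamma-variate function, combined with baseline and linear-drift terms, and fitted by least squares, after which a faces > objects contrast is formed. This is not the AFNI pipeline used in the study. The gamma-variate parameters (b = 8.6, c = 0.547), the block onsets, and the simulated voxel are assumptions made for illustration, and head-motion regressors are omitted.

```python
import numpy as np

TR = 2.4          # s, from the acquisition parameters
N_VOLS = 100      # volumes per series
BLOCK_S = 12.0    # s, duration of each stimulus block

def gamma_variate(t, b=8.6, c=0.547):
    """Gamma-variate response t**b * exp(-t/c), scaled to a peak of 1.
    b and c are assumed, commonly used values, not taken from the paper."""
    t = np.asarray(t, dtype=float)
    h = np.where(t > 0, t**b * np.exp(-t / c), 0.0)
    return h / h.max()

def block_regressor(onsets_s, dt=0.1):
    """Square wave for the given block onsets, convolved with the
    gamma-variate on a fine time grid and sampled once per TR."""
    n_fine = int(round(N_VOLS * TR / dt))
    t_fine = np.arange(n_fine) * dt
    boxcar = np.zeros(n_fine)
    for onset in onsets_s:
        boxcar[(t_fine >= onset) & (t_fine < onset + BLOCK_S)] = 1.0
    hrf = gamma_variate(np.arange(0, 20, dt))
    conv = np.convolve(boxcar, hrf)[:n_fine] * dt
    idx = np.round(np.arange(N_VOLS) * TR / dt).astype(int)
    return conv[idx]

# Hypothetical block onsets (s); the actual block order is not given in the text.
onsets = {"faces": [12, 132], "bodies": [36, 156], "objects": [60, 180]}
X = np.column_stack(
    [block_regressor(onsets[k]) for k in ("faces", "bodies", "objects")]
    + [np.ones(N_VOLS), np.linspace(-1, 1, N_VOLS)]   # baseline and linear drift
)

# Simulated voxel that responds twice as strongly to faces as to objects.
rng = np.random.default_rng(0)
true_beta = np.array([2.0, 1.0, 1.0, 100.0, 0.5])
y = X @ true_beta + rng.normal(0, 0.2, N_VOLS)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
contrast = np.array([1, 0, -1, 0, 0])   # faces > objects
print("faces - objects estimate:", float(contrast @ beta))
```

In a full analysis the contrast estimate would be divided by its standard error and converted to a Z score before thresholding.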

Regions of interest within temporal cortex were defined as clusters of 10 or more contiguous voxels. Clusters smaller than 10 voxels were included as regions of interest only if a larger cluster (>10 voxels) was found in the same anatomical location of the other hemisphere. The raw, unsmoothed fMRI signals were averaged across all activated voxels within a given region of interest and across scans, and normalized to the mean intensity obtained during the blank periods. Statistical significance was determined by two-sample t tests assuming unequal variances on the averaged peak intensities of fMRI signals for each condition obtained in a given scan session and in both monkeys. The selectivity of each area for the different stimulus categories was assessed by computing a category selectivity index [$\mathrm{CSI} = (R_{\mathrm{cat}} - R_{\mathrm{ctrl}})/(R_{\mathrm{cat}} + R_{\mathrm{ctrl}})$, where $R_{\mathrm{cat}}$ is the averaged response of peak MRI intensities obtained during face or body part conditions and $R_{\mathrm{ctrl}}$ is the averaged response obtained during the object condition].
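The category selectivity index is the difference between the category and control responses divided by their sum. A minimal sketch follows; the response values are hypothetical and are not taken from the paper.

```python
def category_selectivity_index(r_cat: float, r_ctrl: float) -> float:
    """CSI = (R_cat - R_ctrl) / (R_cat + R_ctrl)."""
    denom = r_cat + r_ctrl
    if denom == 0:
        raise ValueError("CSI is undefined when R_cat + R_ctrl is zero")
    return (r_cat - r_ctrl) / denom

# Hypothetical peak responses (percent signal change), for illustration only.
print(category_selectivity_index(1.5, 0.5))   # category response 3x control -> 0.5
print(category_selectivity_index(1.0, 1.0))   # no preference -> 0.0
print(category_selectivity_index(0.5, 1.5))   # control preferred -> -0.5
```

The index stays between −1 and 1 only when both responses are non-negative; if one response falls below baseline, the index can exceed that range.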

Eye-movement analysis was performed with ILAB software (28) and confirmed that the animals maintained fixation for almost all of the time during scanning sessions. For example, during a typical session, monkey M2 maintained fixation within the 4° window for 97 ± 1% of the time and made an average of 2.5 ± 1 eye movements outside the window that lasted for >500 ms before returning to the window.
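The two fixation measures reported here, the percentage of time spent inside the window and the number of excursions lasting >500 ms, can be computed from an eye-position trace as sketched below. This is not the ILAB implementation. The function name, the 60-Hz sampling rate, and the reading of the 4° square window as ±2° around the fixation point are assumptions.

```python
from typing import List, Tuple

def fixation_summary(
    samples: List[Tuple[float, float]],
    sample_rate_hz: float,
    window_deg: float = 4.0,
    min_break_ms: float = 500.0,
) -> Tuple[float, int]:
    """Return (fraction of samples inside the window, number of excursions
    outside the window that last longer than min_break_ms)."""
    half = window_deg / 2.0   # assumes the window is centred on the fixation point
    inside = [abs(x) <= half and abs(y) <= half for x, y in samples]
    min_run = min_break_ms / 1000.0 * sample_rate_hz
    long_breaks, run = 0, 0
    for ok in inside + [True]:   # sentinel closes an excursion at the end of the trace
        if ok:
            if run > min_run:
                long_breaks += 1
            run = 0
        else:
            run += 1
    return sum(inside) / len(inside), long_breaks

RATE = 60.0                           # Hz, hypothetical sampling rate
trace = [(0.0, 0.0)] * 600            # 10 s of perfect fixation
trace[100:140] = [(3.0, 0.0)] * 40    # 667-ms excursion: counted
trace[300:310] = [(0.0, -2.5)] * 10   # 167-ms excursion: too short to count
frac, breaks = fixation_summary(trace, RATE)
exclude_scan = breaks > 20            # exclusion criterion from Data Analysis
print(f"{100 * frac:.1f}% inside window, {breaks} long break(s), exclude={exclude_scan}")
```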

Results

To identify brain regions that responded more strongly to faces than man-made objects, we contrasted face and object conditions and compared the resulting activations in monkeys M1–M3. This contrast revealed two activated regions in temporal cortex (Z > 2.33, P < 0.01). A posterior face (pFace) area was activated bilaterally along the fundus and banks of the posterior STS (slice 1 in Fig. 1A), and an anterior face (aFace) area, also activated bilaterally, was found in the anterior STS and on the middle temporal gyrus (MTG) (slice 2 in Fig. 1A). There was some variability in the locations of these areas among the three animals. For example, M1's aFace area was located on both banks and fundus of the STS, whereas M2's aFace area was located only on the right MTG (Fig. 1A). Activations in similar locations were also found in M3 (Fig. 5, which is published as supporting information on the PNAS web site).

Fig. 1.

Category-selective representations of faces and body parts in monkey temporal cortex. (A) Coronal slices of monkeys M1 and M2 depicting voxels activated significantly more by faces compared to objects. (B) Same coronal slices depicting voxels activated significantly more by body parts compared with objects. Approximate locations of the coronal slices (1, 2) are indicated on a sagittal slice. Scale indicates Z-score values of functional activity in colored regions. R, right hemisphere.

Brain regions that responded more strongly to body parts than to man-made objects were identified by contrasting monkey body part and object conditions and comparing the resulting activity in monkeys M1 and M2. This contrast revealed two activated regions in the temporal cortex (Z > 2.33, P < 0.01). A posterior body part (pBody) area was found along the banks and fundus of the STS in M1 (slice 1 in Fig. 1B) but not in M2. A second region was found more anterior along the STS. This anterior body part (aBody) area was activated bilaterally in both monkeys and located on the banks and the fundus of the STS (slice 2 in Fig. 1B).

It is important to note that the differences in activation patterns could not be attributed to systematic differences in eye movements while viewing the three categories of stimuli. Fixation performance was similar for the three monkeys, and there were no significant differences in the amount of horizontal and vertical eye movements made while viewing stimuli of the three different categories [e.g., monkey M2: horizontal eye movements, F0.05(2,45) = 1.16, P = 0.32; vertical eye movements, F0.05(2,45) = 0.52, P = 0.59].

The anatomical relationship between the face and body part areas was examined by determining the responsiveness of each voxel shown in the activation maps of Fig. 1 for M1 and M2 to faces (color-coded in red), body parts (color-coded in yellow), or both stimuli (color-coded in blue). The color-coded voxels were then projected onto inflated and flattened cortical surface reconstructions (Fig. 2). With this presentation format, there appears to be a larger number of category-selective areas compared to those shown in Fig. 1. However, this is a result of the data-display procedures. When neighboring voxels touch cortically distant regions (e.g., upper and lower banks of a sulcus), they tend to separate when projected onto the cortical surface. Face-selective regions in M1 were located in the lower and upper banks of the posterior STS in the left and right hemispheres, respectively (Fig. 2A). Interestingly, the more anterior activations in M1 for faces and body parts appeared to be continuous and also partially overlapping. The body part-selective area in the anterior STS in M2 was located on both banks, and the aFace-selective area was found to be adjacent and ventral to it on the right MTG (Fig. 2B). Overall, this topography suggests a representation of the appearance of the monkey body in the anterior STS and MTG. It is possible that there is a second representation in the posterior STS, as suggested by the findings in M1, but this will need further study.

Fig. 2.

Topographic relationship of face and body part areas. Inflated and flattened cortical surface representations of the left and right hemispheres of monkeys M1 (A) and M2 (B). Activated voxels are color-coded according to their preferred category. Faces > objects, red; body parts > objects, yellow; overlap of the two, blue. sts, superior temporal sulcus; sf, sylvian fissure; ios, inferior occipital sulcus; ots, occipitotemporal sulcus; ls, lunate sulcus; ips, inferior parietal sulcus; cs, central sulcus; as, arcuate sulcus.

To examine the extent of the category-selective activations, the volumes of the activated regions in the left and right hemisphere were analyzed and compared in the three animals (Table 1). The size of the face activations ranged from 3 mm³ to 103 mm³, and the size of the body part activations ranged from 87 mm³ to 121 mm³. There were no significant differences in activated volumes between the right and left hemisphere in any of the category-selective areas except for the pFace area, which showed a larger activation volume in the right than in the left hemisphere in each of the three monkeys [t(2) = 13.85, P < 0.01].
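The reported statistic for the pFace asymmetry can be reproduced from the pFace volumes in Table 1 if the comparison is treated as a paired t test across the three animals. The text does not state which test was used, so the pairing is an assumption. Under it, the statistic comes out at about 13.86 with 2 degrees of freedom, in line with the reported t(2) = 13.85.

```python
import math
from statistics import mean, stdev

# pFace activated volumes (mm^3) from Table 1, ordered M1, M2, M3.
rh = [56.25, 56.25, 56.25]
lh = [12.5, 6.25, 0.0]

# Assumption: right and left hemispheres are paired within each animal.
diffs = [r - l for r, l in zip(rh, lh)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))   # paired t statistic
dof = n - 1
print(f"t({dof}) = {t_stat:.2f}")
```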

Table 1. Activated volumes in category-selective areas

Activated volume, mm³

Monkey      Area     RH         LH
M1          pFace    56.25      12.5
M1          aFace    18.75      103.13
M1          pBody    21.88      6.25
M1          aBody    121.88     87.5
M2          pFace    56.25      6.25
M2          aFace    53.13      0
M2          aBody    56.25      121.88
M3*         pFace    56.25      0
M3*         aFace    31.25      3.13
Mean ± SE   pFace    56 ± 0     6 ± 4
Mean ± SE   aFace    34 ± 10    35 ± 34
Mean ± SE   aBody    89 ± 33    105 ± 17

* Face-responsive regions in monkey M3 were defined in a single scan session by comparing faces to houses rather than to objects.
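The entries of Table 1 can be related to voxel counts and to the Mean ± SE rows with a short calculation. The sketch assumes that volumes were computed at the functional resolution of 1.25 × 1.25 × 2.0 mm given under Data Acquisition, so that one voxel corresponds to 3.125 mm³; the text does not state this explicitly. SE is taken as the sample standard deviation divided by the square root of the number of animals.

```python
import math
from statistics import mean, stdev

# Assumption: one functional voxel is 1.25 x 1.25 x 2.0 mm (Data Acquisition).
VOXEL_MM3 = 1.25 * 1.25 * 2.0

table1 = {  # (monkey, area): (RH, LH) activated volume in mm^3
    ("M1", "pFace"): (56.25, 12.5),
    ("M1", "aFace"): (18.75, 103.13),
    ("M1", "pBody"): (21.88, 6.25),
    ("M1", "aBody"): (121.88, 87.5),
    ("M2", "pFace"): (56.25, 6.25),
    ("M2", "aFace"): (53.13, 0.0),
    ("M2", "aBody"): (56.25, 121.88),
    ("M3", "pFace"): (56.25, 0.0),
    ("M3", "aFace"): (31.25, 3.13),
}

def n_voxels(volume_mm3):
    """Nearest whole number of voxels for a tabulated volume."""
    return round(volume_mm3 / VOXEL_MM3)

def mean_se(values):
    """Mean and standard error of the mean."""
    return mean(values), stdev(values) / math.sqrt(len(values))

for (monkey, area), (rh, lh) in table1.items():
    print(f"{monkey} {area}: RH {n_voxels(rh)} voxels, LH {n_voxels(lh)} voxels")

for area in ("pFace", "aFace", "aBody"):
    for side, name in enumerate(("RH", "LH")):
        vols = [v[side] for (m, a), v in table1.items() if a == area]
        m, se = mean_se(vols)
        print(f"{area} {name}: {m:.0f} ± {se:.0f} mm^3")
```

Every tabulated volume is within rounding of a whole number of 3.125-mm³ voxels, which supports the assumed voxel size.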

The reliability of activations for the face and body part areas in the STS was examined in monkey M1, who participated in two additional experiments (Exps. 2 and 3 in Fig. 6, which is published as supporting information on the PNAS web site) that were similar to our original study (Exp. 1 in Fig. 6). Exp. 2 was identical to Exp. 1 in terms of the experimental design but involved a higher-resolution echo planar imaging sequence (1.25 × 1.25 × 1.25 mm instead of 1.25 × 1.25 × 2.0 mm). In Exp. 3, we probed only the face and object, but not the body part condition. Both the pFace and the aFace areas were activated across the three experiments (Fig. 6A). Furthermore, the volume of the pFace area in the right hemisphere was larger than in the corresponding area of the left hemisphere in both additional experiments (Exp. 2: 97 mm³ vs. 0 mm³; Exp. 3: 53 mm³ vs. 16 mm³), thus replicating the right hemisphere asymmetry found across the three monkeys in Exp. 1. The aBody area, but not the pBody area, was activated in Exp. 2 (Fig. 6B), suggesting that activations of the aBody area were more robust and reliable than those of the pBody area.

The response properties and category selectivity of face- and body part-related activations were studied by performing a time course analysis of fMRI signals (Fig. 3A). fMRI signals were averaged across all activated voxels within each region, across hemispheres, scans, and monkeys M1 and M2 and normalized to the average signals obtained during blank periods. This analysis revealed that both face areas responded two to three times as strongly to faces as to objects [pFace: faces vs. objects, t(128) = 3.20, P < 0.01; aFace: faces vs. objects, t(142) = 4.22, P < 0.01]. However, the pFace area was as responsive to faces as to body parts [pFace: faces vs. body parts, t(119) = 0.68, not significant], whereas body parts did not evoke significantly stronger responses than objects in the aFace area [aFace: body parts vs. objects, t(141) = −0.62, not significant] (Fig. 3A). Both the pBody and aBody areas responded more strongly to body parts than to objects and faces [pBody: body parts vs. objects, t(61) = 2.48, P < 0.05; body parts vs. faces, t(61) = 2.91, P < 0.01; aBody: body parts vs. objects, t(138) = 4.76, P < 0.01; body parts vs. faces, t(138) = 4.65, P < 0.01]. However, objects and faces both evoked a considerable and similar response in the pBody area, whereas almost no activity was elicited by these stimuli in the aBody area (Fig. 3A). The differences in category selectivity were further quantified with a category selectivity index (CSI). Values close to 1 indicate strong selectivity for a given stimulus category (faces or body parts) relative to the control category (objects), values around 0 indicate no preference between the stimulus category and the control category, and negative values indicate a greater preference for the control category than for the probed stimulus category. It is striking that the aFace and aBody areas as well as the pBody area were highly selective for faces and body parts, respectively, whereas the pFace area did not discriminate between faces and body parts (Fig. 3B and Table 2, which is published as supporting information on the PNAS web site).
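The pairwise comparisons above rest on two-sample t tests assuming unequal variances (see Methods). A minimal sketch of such a test is given below, using the Welch-Satterthwaite approximation for the degrees of freedom. The function name and the sample values are hypothetical, and the text does not say how the reported degrees of freedom were obtained.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple

def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Two-sample t statistic assuming unequal variances, with
    Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, dof

# Hypothetical peak responses (percent signal change) per scan, for illustration only.
faces = [1.2, 1.5, 1.1, 1.4, 1.3]
objects = [0.5, 0.6, 0.4, 0.7, 0.5, 0.6]
t, dof = welch_t(faces, objects)
print(f"t({dof:.1f}) = {t:.2f}")
```

When the two samples have equal size and equal variance, the approximation reduces to the familiar n1 + n2 − 2 degrees of freedom.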

Fig. 3.

Time courses of fMRI signals from category-selective areas. (A) Signals from the pFace, pBody, aBody, and aFace areas averaged across two monkeys (M1 and M2). The duration of the visual stimulation epoch is indicated by the black bar. The pBody area was activated only in M1. Note the y-axis scale change for the time courses in the pFace and the aBody areas. (B) Selectivity of each area to faces (red) and body parts (black) relative to objects, computed with an index (see Methods and Results).

Discussion

Our results confirm previous fMRI studies reporting face-selective activations in apparently similar anatomical locations in anesthetized (21) and awake macaques (22). They extend these results by demonstrating an area responding selectively to body parts adjacent to the face-selective area in the anterior STS. In addition, our results suggest a more extensive activation of the pFace area in the right hemisphere than in the left hemisphere. The differences in activations that we obtained with face, body part, and object stimuli are unlikely to reflect differences in the attentional state of the animals performing the passive viewing task. On such an account, one would not predict reliable activations of the same regions across different experiments, high test-retest reliability in the same experiment, or consistent activations across individual monkeys, all of which were demonstrated in our study.

Neurons responding more strongly to faces than to other visual stimuli have been found on both banks and the floor of the STS, and less prominently on the lateral and ventral convexity of IT cortex (10–16). These STS “face cells” were often found clustered together in patches and sometimes formed columns (19, 20). However, their overall proportion was reported to be small, 20% of visually responsive cells at the most (15), and no evidence for a “face area” consisting of chiefly face responsive cells was found. Indeed, bilateral lesions of the STS did not induce specific impairments in face discrimination but led only to mild impairments in discrimination of eye gaze (29, 30). Our results, on the other hand, suggest that a large proportion of neurons that are regionally clustered together in the anterior and posterior STS must be activated selectively in response to faces to evoke sufficiently strong blood oxygenation level-dependent signals. Single-cell physiology studies in animals, in which the face-selective blood oxygenation level-dependent signals will guide the electrode placement, will be necessary to address this apparent discrepancy.

In humans, functional brain imaging studies have demonstrated a distributed neural system activated specifically by faces consisting of areas in the lateral fusiform gyrus (the FFA), lateral inferior occipital gyrus, and STS (3, 31). These different areas appear to be involved in different aspects of the perceptual analysis of faces. Perception of face identity was shown to be associated with regions in the inferior occipital and fusiform gyri, and perception of eye gaze, eye and mouth movements, and facial expressions was associated with the STS region (32–35). Our finding of several face-responsive areas exhibiting different degrees of category-selectivity raises the possibility that face perception in the macaque may also be mediated by a distributed network of areas as in humans. Interestingly, in single-cell studies, neurons responding more strongly to facial expression were found more frequently in the STS, whereas neurons responding to facial identity were found more frequently in the IT gyrus (36), suggesting such a dissociation of function with respect to different aspects of face perception. Neurons in the anterior STS have been shown to respond to cues that are important for social communication such as eye gaze or emotional face expression (35, 37, 38). Therefore, it is possible that the macaque anterior STS face area serves a similar function to that of the human STS face area. Tsao and colleagues (22) have proposed in a related monkey fMRI study that the face area in the posterior STS may be homologous to the human FFA. However, the human FFA has been shown to respond differentially to faces and human bodies (39), whereas the pFace area that we identified in the macaque did not discriminate monkey faces and body parts.

We found a more extensive activation of the pFace area in the right than in the left hemisphere, suggesting a hemispheric asymmetry in the processing of face information in the macaque. This finding is consistent with several studies reporting right hemisphere superiority in monkeys for processing facial information (40–42). In humans, there is converging evidence from neuropsychological and neuroimaging studies for a right hemispheric dominance in face perception. Damage to the posterior right hemisphere is often sufficient to produce prosopagnosia (43). Neuroimaging studies have shown that face-selective areas are often more strongly activated in the right hemisphere than in the left hemisphere (35). These findings are paralleled by behavioral studies demonstrating that human subjects process face information faster when presented to the left hemifield than to the right hemifield, the so-called “left field advantage” (44).

Neurons in the anterior STS have also been found to respond to body parts such as hands (14, 17) and to static views of human bodies with and without heads (18, 45). Our finding of an area in the anterior STS that responded more strongly to body parts than to faces and objects is in agreement with these physiology studies, although, as with faces, the proportion of neurons responding selectively to body parts or bodies was found to be rather small (<10% of visually responsive neurons) and did not suggest the existence of a category-selective area. Neurons in the anterior STS have also been shown to respond to complex body movements such as walking patterns (12, 46) or biological motion (45), suggesting that this high-level extrastriate area integrates form and motion aspects of biologically relevant visual stimuli (45). Tsao and colleagues (22) demonstrated an area responding more strongly to headless human bodies relative to human body parts such as faces and hands as well as man-made objects in posterior STS but not in anterior STS, using fMRI in awake macaques. Their body area is likely different from the pBody area of monkey M1 given that our stimulus set contained monkey body parts including hand stimuli.

Whereas several of our findings regarding face-selective activations suggest similarities in the functional organization of face representation in human and macaque, our findings regarding the representations of body parts in the two species indicate a number of differences. Although both species appear to have category-specific representations of body parts, the human EBA is located in Brodmann area 18, possibly in retinotopically organized visual cortex (7), whereas the monkey body part area is located in the anterior STS, a high-level visual processing area within the temporal cortex. The human EBA has been implicated not only in the processing of the appearance of the body, but also in the coding of goal-directed movements of the observer's body parts (47). This area does not, however, carry signals that can differentiate biological from nonbiological motion (48), whereas neurons in the anterior STS have been shown to respond to biological motion (45). Most importantly, the EBA and face-selective areas such as the FFA were found in widely separated locations of cortex in the human (ref. 7; but see ref. 39). In contrast, the body part-selective region in macaque anterior STS was found to be adjacent in M2 and partially overlapping with face-selective activations in M1, suggesting a more continuous representation of the appearance of the monkey body in the anterior STS.

In conclusion, we present evidence for category-selective neural representations in macaque IT cortex for object stimuli that are of particular biological significance to the animal, namely monkey faces and body parts, suggesting a modular organization of certain classes of object stimuli in the ventral visual pathway.

Supplementary Material

Supporting Information
pnas_102_19_6996__.html (4.9KB, html)

Acknowledgments

We thank Michael Benharrosh, Jonathan D. Cohen, Rhodri Cusack, Michael S. A. Graziano, James V. Haxby, and Kimberly J. Montgomery. This work was supported by National Institutes of Health Grants R01 EY-11347 (to C.G.G.), R01 MH-64043 (to S.K.), and P50 MH-62196 (to S.K.) and a grant from the Whitehall Foundation (to S.K.).

Author contributions: M.A.P., T.M., C.G.G., and S.K. designed research; M.A.P., S.K., and K.D. performed research; M.A.P., S.K., and K.D. analyzed data; and M.A.P., T.M., C.G.G., and S.K. wrote the paper.

Abbreviations: aBody, anterior body part; aFace, anterior face; EBA, extrastriate body area; FFA, fusiform face area; fMRI, functional MRI; IT, inferior temporal; pBody, posterior body part; pFace, posterior face; STS, superior temporal sulcus.



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
