The Journal of Neuroscience
. 2014 Sep 10;34(37):12457–12469. doi: 10.1523/JNEUROSCI.0485-14.2014

Face Inversion Decreased Information about Facial Identity and Expression in Face-Responsive Neurons in Macaque Area TE

Yasuko Sugase-Miyamoto 1,, Narihisa Matsumoto 1, Kaoru Ohyama 1,2, Kenji Kawano 3
PMCID: PMC6615495  PMID: 25209284

Abstract

To investigate the effect of face inversion and thatcherization (eye inversion) on the temporal processing stages of facial information, single-neuron activities in the temporal cortex (area TE) of two rhesus monkeys were recorded. Test stimuli were colored pictures of monkey faces (four models with four different expressions), human faces (three models with four different expressions), and geometric shapes. Modifications were made to each face picture, and its four variations were used as stimuli: upright original, inverted original, upright thatcherized, and inverted thatcherized faces. A total of 119 neurons responded to at least one of the upright original facial stimuli. A majority of the neurons (71%) showed activity modulations depending on upright versus inverted presentation, and fewer neurons (13%) showed activity modulations depending on the original versus thatcherized face condition. With face inversion, information about the fine category (facial identity and expression) decreased, whereas information about the global category (monkey vs human vs shape) was retained for both the original and thatcherized faces. Principal component analysis (PCA) of the neuronal population responses revealed that global categorization occurred regardless of face inversion and that the inverted faces were represented near the upright faces in the PCA space. By contrast, face inversion decreased the ability to represent human facial identity and monkey facial expression. Thus, the neuronal population represented inverted faces as faces but failed to represent the identity and expression of the inverted faces, indicating that the neuronal representation in area TE causes the perceptual effect of face inversion.

Keywords: electrophysiology, face perception, information analysis, cluster analysis, single units

Introduction

In our social life, we can immediately detect a face and easily recognize its individual identity and emotional state. This ability is assumed to be based on neural processing in areas participating in the analysis of visual characteristics of faces in the temporal lobe in both humans and macaque monkeys (Tsao et al., 2008; Bell et al., 2009; Pinsk et al., 2009). This assumption is supported by findings of electrophysiological studies showing that facial identity and expression are represented by face-responsive neurons in the macaque temporal cortex (Hasselmo et al., 1989; Sugase et al., 1999; Eifuku et al., 2004; Freiwald and Tsao, 2010; Sugase-Miyamoto et al., 2011).

In human psychological studies, picture-plane inversion of a face reduces recognition of both facial identity (e.g., Yin, 1969) and facial expression (McKelvie, 1995; Prkachin, 2003; Derntl et al., 2009), an effect known as the face inversion effect. However, such an inversion has only a minor effect on face detection (Valentine, 1991; Lewis and Edmonds, 2003). The configuration of facial parts reportedly becomes less perceptible after face inversion (Freire et al., 2000; for review, see Rossion, 2008; McKone and Yovel, 2009). For example, it is difficult to perceive the inverted orientation of the eyes and mouth in an inverted face, a phenomenon known as the “Thatcher effect” (Thompson, 1980). Recent comparative psychological studies have reported the Thatcher effect even in macaque monkeys (Adachi et al., 2009; Dahl et al., 2010). In the macaque temporal cortex, face-responsive/selective neurons (Perrett et al., 1984; Tsao et al., 2006) and face-selective areas (Bell et al., 2009; Pinsk et al., 2009) have been reported to respond more weakly to inverted faces than to upright faces. However, it is not clear whether face inversion decreases the amount of information about facial identity/expression represented by face-responsive neurons.

In our previous studies, we reported that face-responsive neurons in macaque area TE convey different kinds of information along the time axis, information about a global category (human faces vs monkey faces vs simple shapes) being conveyed earlier than information about more detailed facial category items (i.e., facial identity and expression) (Sugase et al., 1999; Matsumoto et al., 2005). In the present study, to understand the effect of face inversion on the information conveyed by neurons, single-unit activities in area TE were recorded during presentations of monkey and human faces with four variations (upright original, inverted original, upright thatcherized, and inverted thatcherized faces) and simple shapes. Principal component analysis (PCA) applied to the neuronal population responses revealed significant effects of face inversion on the information conveyed by TE neurons. Face inversion decreased the amount of fine information conveyed by the neurons and impaired their ability to represent human facial identity and monkey facial expression, although it did not affect the information about the global category. The results are consistent with the behavioral phenomena that have been reported in psychological studies.

Materials and Methods

Subjects.

Subjects were two male rhesus monkeys (Macaca mulatta) weighing 10 and 9 kg (Monkey C and Monkey T, respectively). Monkey T was used after another experiment on recording during conditional associations. All surgical and experimental procedures were approved by the Animal Care and Use Committee of the National Institute of Advanced Industrial Science and Technology (Japan) and were performed in accordance with the Guidelines for Care and Use of Animals of the National Institute of Advanced Industrial Science and Technology.

Experimental apparatus and behavioral task.

The monkeys performed a task while seated in a primate chair positioned in front of a monitor (FlexScan T966, EIZO) on which visual stimuli were displayed. The center of the monitor screen was located at eye level, 57 cm in front of the eyes. A touch-sensitive bar was mounted on the chair at the level of the monkey's hands. Liquid rewards were delivered from a drinking spout positioned in front of the monkey's mouth. Behavioral control and data acquisition were performed using the REX real-time data-acquisition program adapted to the QNX operating system (Hays et al., 1982). Spike times and task events were recorded at a 1 ms time resolution with REX. Presentation software (Neurobehavioral Systems) was used to display visual stimuli at a refresh rate of 75 Hz.

In a fixation task, a trial began when the monkey touched the bar, after which a yellow target (0.3° × 0.3°) appeared in the center of the screen. After the monkey had fixated on the yellow target for 200–300 ms, the target was removed for 100 ms, and then one of the test stimuli was presented for 400–500 ms. After the disappearance of the test stimuli, the yellow target reappeared for 200–300 ms and a drop of juice was delivered as a reward in every trial. If the monkey released the bar or broke eye fixation during the stimulus presentation, an error was registered and the monkey had to repeat the identical trial type from the beginning.

The test stimuli consisted of colored pictures of 28 faces, comprising 16 monkey faces (four models with four expressions each) and 12 human faces (three models with four expressions each); inverted images (180° rotation) of the 28 faces; thatcherized images (180° rotation of the eyes) of the 28 faces; inverted images of the 28 thatcherized faces; and 8 geometric shapes (rectangles and circles, each in one of four colors), resulting in 120 stimuli. The size of the stimuli was mostly within 20° × 20°. Only the eyes were rotated in the thatcherized images because 12 of the 16 monkey faces showed the mouth open (see Fig. 1A), whereas the Thatcher effect in macaque monkeys had been tested using faces with the mouth closed (Adachi et al., 2009; Dahl et al., 2010), and because the accuracy of detecting eye manipulation is similar to the accuracy of detecting eye-and-mouth manipulation in human psychological studies (Riby et al., 2009; Donnelly et al., 2011).
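The stimulus-set arithmetic above (28 faces × 4 variants + 8 shapes = 120) can be sketched as follows. This is a minimal illustration only; all label names are hypothetical placeholders (the text does not name the four shape colors or the models).

```python
# Enumerate the stimulus set described above. Label names are hypothetical
# placeholders, not the authors' labels.
from itertools import product

monkey_faces = [f"monkey{m}_expr{e}" for m, e in product(range(4), range(4))]  # 16
human_faces = [f"human{h}_expr{e}" for h, e in product(range(3), range(4))]    # 12
faces = monkey_faces + human_faces                                             # 28
variants = ["upright_original", "inverted_original",
            "upright_thatcherized", "inverted_thatcherized"]
shapes = [f"{s}_color{c}" for s, c in product(["rectangle", "circle"], range(4))]  # 8

stimuli = [f"{f}_{v}" for f, v in product(faces, variants)] + shapes
assert len(stimuli) == 28 * 4 + 8 == 120
```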

Figure 1.

Figure 1.

Activity of an example face-responsive neuron in area TE. A–C, Response diagrams for monkey, human, and shape stimuli, respectively. Each diagram consists of a stimulus image and a raster plot of the response with a spike-density function, in the top and bottom rows, respectively. Each horizontal row of dots represents the times of the spikes in each trial. For the spike-density function, spikes per millisecond over all trials were summed and smoothed with a Gaussian filter (SD = 20 ms). The neuronal activity was aligned at the time of the stimulus onset (0 ms). Gray vertical line indicates the minimum duration of stimulus presentation.

Eye positions were measured using an infrared pupil-position monitoring system (i_rec, Matsuda; http://staff.aist.go.jp/k.matsuda/eye/). The window size for eye fixation was 10° × 10° at the center of the screen.

Surgery.

A titanium head-fixation post was affixed to the skull under pentobarbiturate anesthesia (25 mg/kg). Monkeys were allowed to recover from surgery for 1 month. After the monkeys had learned the task, a recording chamber (Crist Instrument) was implanted above the dorsal surface of the hemisphere for each monkey. Chamber locations were determined using stereotaxic coordinates from magnetic resonance (MR) images of each animal's brain (Saunders et al., 1990). The monkeys were allowed a 2 week postoperative recovery period. They were then retrained in the fixation task. Body temperature, heart rate, blood pressure, and the percentage of blood oxygen saturation were monitored throughout all surgical procedures. The monkeys received antibiotics for 1 week after each surgery to reduce the risk of postoperative infections and received analgesics during and after surgery.

Unit recording.

Recording sessions began after the monkeys had been retrained on the fixation task. Single-unit recording was performed extracellularly with a tungsten electrode (Micro Probe or Frederic Haer) that was inserted vertically through a guide cannula (Crist Instrument) at the beginning of each recording session. The electrode was advanced toward area TE using a hydraulic microdrive (MO-97A-S, Narishige) while the 28 upright original facial stimuli and the eight geometric shapes were presented pseudo-randomly to the monkey in the fixation task. Single units were isolated online using a threshold and dual time-amplitude windows (DDIS-1, Bak Electronics) or using a spike sorter (Sankei) based on PCA (Abeles and Goldstein, 1977). Unit activity was converted to pulses and recorded at a 1 ms time resolution with REX. If a single unit showed an increase or decrease of activity in response to the presentation of at least 1 of the 36 test stimuli during the initial part of the recording, the remaining facial stimuli (i.e., the inverted faces, the thatcherized faces, and the inverted thatcherized faces) were added as stimuli. The recording site location was determined using MR images. Three-dimensional brain images were reconstructed using software (Brain Explorer©, National Institute of Advanced Industrial Science and Technology; http://riodb.ibase.aist.go.jp/brain/index.php?LANG=ENG). The distance (in millimeters) to the interaural line, the distance to the midline, and the distance from the bottom end of the grid in the recording chamber to the white matter dorsal to the inferior temporal gyrus were measured using MR images taken with an electrode placed in position through a guide tube. We determined the positions of the gray matter by the frequency of action potentials and by contact with the dura or cranial base at the ventral surface of the brain.
To examine the distribution of single-unit locations, the distance from a grid in the recording chamber to a recording location was measured after the experiment each day. All recording sites were located in the stereotaxic plane 19–24 mm anterior to the interaural line and 15–18 mm lateral to the midline, including the lower bank of the superior temporal sulcus and the fundus and the lip of the anterior middle temporal sulcus, which approximately corresponds to areas TEa and TEav (Seltzer and Pandya, 1978; Saleem and Tanaka, 1996). There was no consistent trend between the two monkeys regarding the distribution of neurons according to the types of information encoded by each neuron.

Data analysis.

All data analyses were performed using conventional statistical procedures with the R statistical computing environment (R Development Core Team, 2004) or MATLAB (MathWorks). Neurons tested with more than three trials per stimulus (the median number of trials was five per stimulus) were analyzed. If a neuron showed a significant increase or decrease in spike counts during a 400 ms period starting 50 ms after the test stimulus onset compared with the counts during a 400 ms period before the test stimulus onset (paired t tests, p < 0.05), it was considered to be a visually responsive neuron. Sliding window analysis was performed to examine the time course of changes in neuronal activity. For every neuron, each trial was divided into 50 ms bins that were slid across the trial in 8 ms steps, and the spike count was obtained for each bin. The window width was 50 ms, as in Sugase et al. (1999), following Heller et al. (1995), who demonstrated that information analysis in the inferior temporal (IT) cortex yields measurable information at a resolution no finer than approximately 50 ms. The midpoint of the bin was moved from 5 ms to 509 ms after the stimulus onset, as in Sugase et al. (1999): i.e., from the [−20, 30] ms bin to the [484, 534] ms bin (the numbers in brackets indicate the beginning and end times of each 50 ms bin, respectively). Then, for the data from each bin, statistical analysis (including two-way ANOVA, the Wilcoxon rank sum test, and the t test) was performed. The set of p values obtained across bins was adjusted for multiple comparisons using the false discovery rate (the “p.adjust” function with the Benjamini and Hochberg method in R) (Bouret et al., 2012).
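The sliding-window scheme and the false-discovery-rate adjustment above can be sketched as follows. The paper's analyses were performed in R and MATLAB; this is a minimal Python sketch under the stated parameters (50 ms bins, 8 ms steps, bin midpoints from 5 to 509 ms), and `bh_adjust` mirrors R's `p.adjust(method = "BH")`.

```python
import numpy as np

def sliding_counts(spike_times_ms, width=50, step=8, first_mid=5, last_mid=509):
    """Spike counts per trial in 50 ms bins slid in 8 ms steps.

    spike_times_ms: list of 1-D arrays, one per trial, spike times in ms
    relative to stimulus onset. Returns (bin midpoints, counts[trial, bin]).
    """
    mids = np.arange(first_mid, last_mid + 1, step)   # 5, 13, ..., 509 ms
    starts = mids - width / 2                         # bins [-20, 30] ... [484, 534]
    counts = np.array([[np.sum((t >= s) & (t < s + width)) for s in starts]
                       for t in spike_times_ms])
    return mids, counts

def bh_adjust(pvals):
    """Benjamini-Hochberg adjustment, as in R's p.adjust(method="BH")."""
    p = np.asarray(pvals, dtype=float)
    n = len(p)
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity from the largest p value downward
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adj, 1.0)
    return out
```

With these parameters, each trial yields 64 bins, and one per-bin test statistic can then be adjusted across bins with `bh_adjust`.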

Mutual information was calculated based on the method described by Sugase et al. (1999). Briefly, information about the test stimuli was divided into a global category (human faces vs monkey faces vs simple shapes) and a fine category with four items (identity of the human faces, expression on the human faces, identity of the monkey faces, and expression on the monkey faces). Each predictable piece of information associated with the occurrence of a neuronal response (I(S; R)) was quantified as the decrease in entropy of the stimulus occurrence (H(S)) as follows:

$$
I(S;R) \;=\; H(S) - \bigl\langle H(S \mid r) \bigr\rangle
\;=\; -\sum_{s \in S} p(s)\,\log_2 p(s)
\;+\; \Bigl\langle \sum_{s \in S} p(s \mid r)\,\log_2 p(s \mid r) \Bigr\rangle
$$

where S is the set of stimuli s, R is the set of signals r (the neuronal responses, i.e., spike counts), p(s|r) is the conditional probability of stimulus s given an observed spike count r, and p(s) is the a priori probability of stimulus s. The brackets indicate an average of the signal distribution p(r). We evaluated “significance” of information with the χ2 test (Kitazawa et al., 1998) in each bin, and the amount of nonsignificant information was treated as zero. The information latency was measured from the stimulus onset to the midpoint of the first significant time window. When a neuron coded more than two fine category items, the shortest latency was used as the latency of the fine category. Information with at least two consecutive significant time windows was analyzed in the measurements of the information latency and peak.
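The information measure above can be sketched with a plug-in estimator: I(S;R) is computed directly from the empirical joint distribution of stimulus labels and binned spike counts. This is a simplified sketch and omits the paper's χ² significance screening.

```python
import numpy as np

def mutual_information_bits(stim_labels, spike_counts):
    """Plug-in estimate of I(S;R) in bits from paired observations.

    stim_labels, spike_counts: equal-length sequences, one entry per trial
    (the spike count r observed when stimulus s was shown).
    """
    s_arr = np.asarray(stim_labels)
    r_arr = np.asarray(spike_counts)
    info = 0.0
    for r in np.unique(r_arr):
        p_r = np.mean(r_arr == r)                       # p(r)
        for s in np.unique(s_arr):
            p_s = np.mean(s_arr == s)                   # a priori p(s)
            p_joint = np.mean((r_arr == r) & (s_arr == s))
            if p_joint > 0:
                # p(s|r) = p_joint / p_r, so this term equals
                # p(r) * p(s|r) * log2(p(s|r) / p(s))
                info += p_joint * np.log2(p_joint / (p_r * p_s))
    return info
```

For perfectly discriminative responses to two equiprobable stimuli the estimate is 1 bit; for responses independent of the stimulus it is 0.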

Sparseness index (Rolls and Tovee, 1995), SI, was calculated as follows:

$$
\mathrm{SI} \;=\; \frac{\left( \sum_{i=1}^{n} r_i / n \right)^{2}}{\sum_{i=1}^{n} r_i^{2} / n}
$$

where ri is the firing rate in response to the ith stimulus in the set of n stimuli. The firing rate ri was calculated using averaged spike counts during a 300 ms period starting from 100 ms after the stimulus onset across trials. The sparseness has a maximal value of 1.0, indicating equally distributed firing rates across all the stimuli.
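The sparseness index above reduces to a few lines; a minimal sketch:

```python
import numpy as np

def sparseness_index(rates):
    """Rolls & Tovee (1995) sparseness: (sum r_i / n)^2 / (sum r_i^2 / n)."""
    r = np.asarray(rates, dtype=float)
    n = len(r)
    return (r.sum() / n) ** 2 / (np.sum(r ** 2) / n)
```

The index is 1.0 for firing rates equally distributed across stimuli and approaches 1/n when a single stimulus dominates the response.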

For population analysis, the method described by Matsumoto et al. (2005) was used. Briefly, a population activity vector for a test stimulus, consisting of the spike counts of individual neurons within a 50 ms time window, was calculated. The dimension of each population activity vector corresponded to the number of face-responsive neurons. The start time of the time window was incremented in 1 ms steps from 0 ms (the beginning of the presentation of the test stimuli) to 400 ms. To determine the time window in which the global or fine categorization was maximally represented, the Euclidean distances between population activity vectors were calculated. Because our previous study (Matsumoto et al., 2005) showed that the global category represented by face-responsive neurons is human versus monkey versus shape and that the fine category items are human identities and monkey expressions, we calculated the distances between human, monkey, and shape vectors for the global categorization, and the distances between human identity vectors and between monkey expression vectors for the fine categorization. The distance of the global category was estimated as follows. First, the centers of gravity of the 12 population activity vectors for the human upright original faces, of the 16 vectors for the monkey upright original faces, and of the 8 vectors for the shapes were calculated. Next, the distances between the human and monkey gravity centers, between the monkey and shape gravity centers, and between the shape and human gravity centers were calculated. The distance of the global category was then estimated by averaging the three obtained distances. The distance of the fine category was estimated as follows. First, the center of gravity of the four vectors for each monkey expression was calculated. Next, the six pairwise distances between the four gravity centers of the monkey expressions were calculated. The distance of the monkey expressions was then estimated by averaging the six obtained distances. Similarly, the center of gravity of the four vectors of each human identity was calculated, the three pairwise distances between the gravity centers of the human identities were computed, and the distance of the human identities was estimated by averaging them. The distance of the fine category was calculated by averaging the distance of the monkey expressions and the distance of the human identities. To visualize the behavior of the neuronal population, PCA was applied to the population activity vectors in each time window. A clustering method was also applied to the population activity vectors. In this method, population activity vectors were assumed to be generated from 119-dimensional Gaussian distributions (i.e., a mixture of Gaussians). A variational Bayes algorithm (Attias, 1999; Ghahramani and Beal, 2000) was used to estimate the parameters of the mixture of Gaussians (i.e., the means, variances, mixing ratios, and number of Gaussian distributions). The number of Gaussians, corresponding to the number of clusters, was estimated from the free energy, which indicated the distance between the estimated mixture of Gaussians and the most appropriate mixture of Gaussians. As the free energy increased, the estimated mixture of Gaussians approached the most appropriate one. The number of Gaussians was varied from 1 to 10, and the free energy was calculated 100 times for each number of Gaussians. The parameters and the number of Gaussians were then taken at the point where the free energy was maximal.
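The center-of-gravity distance computation described above can be sketched as follows, using random stand-in data (Poisson spike counts) in place of recorded activity; the PCA and variational-Bayes clustering steps are omitted.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_neurons = 119  # dimension of each population activity vector

# Toy population activity vectors (spike counts in one 50 ms window):
# 12 human-face, 16 monkey-face, and 8 shape stimuli.
groups = {"human": rng.poisson(5.0, (12, n_neurons)).astype(float),
          "monkey": rng.poisson(5.0, (16, n_neurons)).astype(float),
          "shape": rng.poisson(5.0, (8, n_neurons)).astype(float)}

# Centers of gravity per global category, then the mean pairwise
# Euclidean distance between them.
centroids = {k: v.mean(axis=0) for k, v in groups.items()}
pairwise = [np.linalg.norm(centroids[a] - centroids[b])
            for a, b in combinations(centroids, 2)]
global_distance = float(np.mean(pairwise))
```

The fine-category distances follow the same pattern, grouping the vectors by monkey expression or by human identity instead of by global category.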

A measure of firing rate variability (the Fano factor) was computed using the MATLAB codes that were developed by Churchland et al. (2010). Spike counts were computed in a 50 ms sliding window moving in 8 ms steps from [−76, −26] ms to [484, 534] ms after the stimulus onset for each neuron. The mean-matched Fano factor was then calculated using the MATLAB code VarVsMean. The statistical significance of the Fano factor decline was assessed by comparing the Fano factor before the stimulus onset (from [−76, −26] ms to [−28, 22] ms) and that after the onset (from [4, 54] ms to [108, 158] ms).
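A raw (not mean-matched) Fano factor can be sketched as below. The paper used the VarVsMean MATLAB code of Churchland et al. (2010), which additionally equalizes the distribution of mean counts across conditions; that mean-matching step is omitted in this sketch.

```python
import numpy as np

def fano_factor(counts):
    """Across-trial variance/mean of spike counts, per time bin.

    counts: array-like of shape (trials, bins). Bins with zero mean
    return NaN. Mean matching (Churchland et al., 2010) is omitted.
    """
    c = np.asarray(counts, dtype=float)
    mean = c.mean(axis=0)
    var = c.var(axis=0, ddof=1)  # unbiased across-trial variance
    safe_mean = np.where(mean > 0, mean, 1.0)  # avoid division by zero
    return np.where(mean > 0, var / safe_mean, np.nan)
```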

Results

Activity of 120 visually responsive neurons (28 and 92 from Monkey C and Monkey T, respectively; see Materials and Methods) in area TE was recorded during a fixation task while 120 test stimuli, including upright original faces, inverted original faces, upright thatcherized faces (eye inversion), inverted thatcherized faces, and geometric shapes, were presented. Of the 120 neurons, 119 (119 of 120, 99%; Monkey C = 28 and Monkey T = 91) showed a significant increase or decrease in neuronal activity during presentation of at least one of the 28 upright original facial stimuli (paired t test, p < 0.05) (e.g., Fig. 1), and the remaining neuron did so only during presentation of the geometric shapes. The 119 neurons were regarded as face-responsive neurons. Of the 119 neurons, 73 (Monkey C = 22, Monkey T = 51) also showed significant activity changes during the presentation of at least one of the geometric shapes.

The effect of face inversion and thatcherization was examined using two-way ANOVA against spike counts in 50 ms bins that were slid in 8 ms steps from [−20, 30] ms to [484, 534] ms after the stimulus onset. Many face-responsive neurons (84 of 119, 71%; Monkey C = 22, Monkey T = 62) showed a significant effect of face inversion, and fewer neurons (16 of 119, 13%; Monkey C = 7, Monkey T = 9) showed a significant effect of thatcherization (p < 0.05, p values adjusted for multiple comparison using the false discovery rate).

Figures 1 and 2 show a representative example of the activity of a face-responsive neuron. The neuron showed stronger responses to face images of Monkey model 3 than to images of the remaining monkey models in the upright original face condition (Fig. 1A, Monkey model 3 shown in the third row). Figure 2 shows diagrams of the responses to one facial expression (Fig. 1A, expression d) of the four monkey models. In the upright original face condition (the diagrams in the top row), the strength of the response to Monkey model 3 was higher than the responses to the other models in both the initial transient and later sustained responses. In the inverted original face condition (the diagrams in the second row), the strength of the response to Monkey model 3 was maintained but the responses to the remaining monkey models increased, resulting in decreased selectivity for Monkey model 3. A similar trend was observed in the discharge pattern of the same neuron in response to the upright and inverted thatcherized faces (Fig. 2, diagrams in the third and fourth rows). Thus, the selectivity of the neuron was modulated by the face inversion.

Figure 2.

Figure 2.

Responses to one facial expression of monkey across four experimental conditions for the example neuron shown in Figure 1. Responses to the upright original faces, the inverted original faces, the upright thatcherized faces, and the inverted thatcherized faces are shown in the first, second, third, and fourth row of the response diagrams, respectively. Fiducial marking as in Figure 1.

To detect the change due to the face inversion in the amount of stimulus information that was represented by the neuronal response, the information transmission rate was calculated (see Materials and Methods). We focused on information about one global category (monkey vs human vs shape) and four fine category items within the global category member (identity of monkey models, monkey facial expressions, identity of human models, and human facial expressions) to examine the temporal processing stages of facial information based on our previous study (Sugase et al., 1999). Figure 3A shows the results of the information analysis for the responses of the neuron in Figures 1 and 2. The neuron represented significant information about the global category, the monkey identity, and the human identity in the upright condition. In the case of face inversion, the information transmission rate of the global category increased (solid and broken red curves for upright and inverted conditions, respectively), whereas the information about the monkey identity decreased (solid and broken black curves, for upright and inverted conditions, respectively). A similar trend was observed in the information transmission rate of the thatcherized faces for the same neuron (Fig. 3B). Thus, during the time the monkey looked at the inverted faces, information about the monkey identity decreased, whereas information about the global category increased for both the original and thatcherized faces.

Figure 3.

Figure 3.

Information transmission rate curves of the example neuron in Figures 1 and 2 in the original face condition (A) and in the thatcherized face condition (B). Information transmission rate is plotted against the midpoint of the 50 ms sliding window moving in 8 ms steps and aligned at the stimulus onset. Only significant information was plotted. In the time axis (abscissa), the minimum duration of stimulus presentation is indicated as a thick gray line. In both the original and thatcherized face conditions, the information transmission rate of the global category in the upright condition (solid red curves) is smaller than that in the inverted condition (broken red curves), and the information transmission rate of the fine category (monkey identity) in the upright condition (solid black curves) is larger than that in the inverted condition (broken black curves).

Figure 4 shows the summed information transmission rate curves for the global category (red) and the fine category items of facial identity (black) and facial expression (green) for the 119 face-responsive neurons. Each line represents the successive sum of contributions from additional cases. In the upright condition of original faces (Fig. 4A), both the information latency (Table 1) and the timing of the information peak for the global category were significantly shorter than those for the fine category (peak: Kolmogorov–Smirnov test, p = 3.8 × 10^−5; median, 141 ms, IQR, 133–165 ms for the global category; median, 173 ms, IQR, 149–265 ms for the fine category), replicating the results of our previous study (Sugase et al., 1999). The information latency for the global category was shorter than that for the fine category in all four facial conditions (Table 1). Furthermore, the amount of information about the fine category items (facial identity and expression) was significantly smaller than the amount of information about the global category in all four facial conditions (paired Wilcoxon signed rank test for the face-responsive neurons in time windows from [−20, 30] ms to [484, 534] ms after the stimulus onset, one-sided, p < 0.05, p values adjusted for multiple comparisons using the false discovery rate). Thus, the firing rates of the neurons discriminated the global category faster and better than the fine category.

Figure 4.

Figure 4.

Cumulative sum of information transmission rate across 119 face-responsive neurons in the upright original face condition (A), in the inverted original face condition (B), in the upright thatcherized face condition (C), and in the inverted thatcherized face condition (D). Red curves represent the information transmission rate of the global category. Black curves represent the information transmission rate of fine category items of facial identity (human identity and monkey identity). Green curves represent the information transmission rate of fine category items of facial expression (human expression and monkey expression). Each line indicates the successive sum of contributions from additional cases. Numbers of cases in each condition are shown in Table 2. Nonsignificant information was excluded from the summation. In the time axis (abscissa), the minimum duration of stimulus presentation is indicated as a thick gray line. E, A comparison of the total summed information transmission rates for the fine category of facial identity between the upright (solid black curves) and inverted (broken black curves) condition of the original faces. F, A comparison of the total summed information transmission rates for the fine category of facial expression between the upright (solid green curves) and inverted (broken green curves) condition of the original faces. G, A comparison of the total summed information transmission rates for the fine category of facial identity between the upright (solid black curves) and inverted (broken black curves) condition in the thatcherized faces. H, A comparison of the total summed information transmission rates for the fine category of facial expression between the upright (solid green curves) and inverted (broken green curves) condition in the thatcherized faces. 
Red dots above the abscissa indicate time windows in which the amount of information in the inverted condition was significantly smaller than the amount in the upright condition (paired Wilcoxon signed rank test, one-sided, p < 0.05, p values adjusted for multiple comparison using the false discovery rate). Other fiducial markings are as in Figure 3.

Table 1.

Latency of information from stimulus onseta

Condition             | Global: median (IQR) | Fine: median (IQR) | p            | NGlobal (Monkey C, Monkey T) | NFine (Monkey C, Monkey T)
----------------------|----------------------|--------------------|--------------|------------------------------|---------------------------
Upright original      | 101 (85–117)         | 141 (125–197)      | 1.5 × 10^−10 | 109 (27, 82)                 | 59 (24, 35)
Inverted original     | 101 (85–117)         | 149 (117–213)      | 1.4 × 10^−6  | 103 (26, 77)                 | 37 (11, 26)
Upright thatcherized  | 101 (85–117)         | 149 (125–197)      | 2.6 × 10^−5  | 110 (27, 83)                 | 51 (22, 29)
Inverted thatcherized | 101 (85–117)         | 141 (111–189)      | 4.4 × 10^−6  | 106 (26, 80)                 | 39 (15, 24)

aLatency of information is shown in ms. p values of Kolmogorov–Smirnov test between the latency of the global information and that of the fine information; NGlobal and NFine, number of neurons with global and fine information, respectively. Numbers in parentheses indicate the number of neurons in Monkey C and Monkey T, respectively.

As shown in Figure 4, the amount of information about the global category in the inverted condition was similar to the amount in the upright condition for both the original and thatcherized faces. The summed amount of the global information along the time axis for each neuron was not significantly different across the four facial conditions (two-way ANOVA, information transmission rate was summed from [−20, 30] ms to [484, 534] ms after the stimulus onset, factor = upright vs inverted, F(1,442) = 0.35, p = 0.55; factor = original vs thatcherized, F(1,442) = 0.37, p = 0.54; interaction, F(1,442) = 0.04, p = 0.83). Thus, the firing rates of the neurons represented the information about the global category regardless of the inversion or thatcherization of the faces.

On the other hand, the amount of information about the fine category items of facial identity and expression in the inverted condition was smaller than the amount in the upright condition for both the original (Fig. 4A vs Fig. 4B, middle and bottom rows) and thatcherized faces (Fig. 4C vs Fig. 4D, middle and bottom rows). As shown in Figure 4E–H, the total summed amount of information about the facial identity (Fig. 4E,G) and expression (Fig. 4F,H) in the inverted condition (broken black and green curves, respectively) was lower than those in the upright condition (solid black and green curves for facial identity and expression, respectively). The time window in which the amount of information in the upright condition was significantly larger than that in the inverted condition is indicated as a red point above the abscissa (paired Wilcoxon signed rank test for the face responsive neurons in time windows from [−20, 30] ms to [484, 534] ms after the stimulus onset, one-sided, p < 0.05, p values adjusted for multiple comparison using the false discovery rate). The significant difference was detected in similar time windows for the facial identity in the original face condition (starting from the [132, 182] ms window), for the facial expression in the original face condition (starting from the [116, 166] ms window), and for the facial identity in the thatcherized face condition (starting from the [124, 174] ms window). There was no statistical significance between the amount of information in the upright condition and that in the inverted condition for the facial expression of the thatcherized face (Fig. 4H).

Table 2 shows the number of neurons (in total and for each monkey) among the 119 face-responsive neurons that represented significant information about either global or fine category items in the four conditions. The number of neurons representing global information was slightly larger in the upright condition than in the inverted condition (χ2 test, df = 1, p = 0.013 for original faces; p = 0.20, i.e., not significant, for thatcherized faces). For each of the four fine category items, fewer neurons represented a significant amount of information in the inverted condition than in the upright condition. Thus, the total number of neurons representing fine information in the upright condition was larger than that in the inverted condition, and the two proportions (i.e., the number of such neurons out of the 119 face-responsive neurons) differed significantly (χ2 test, df = 1, p = 0.004 and 0.038 for original and thatcherized faces, respectively). In the case of face inversion, neurons often lost but sometimes gained the ability to represent fine category information. For example, among the 76 neurons that represented fine information in the upright original face condition, 34 lost that information in the inverted condition, whereas 11 other neurons gained fine information in the inverted condition, for a total of 53 neurons representing fine information in the inverted condition. A similar trend was observed in the thatcherized face condition (among 70 neurons with fine information in the upright condition, 30 lost the information in the inverted condition, whereas 13 neurons gained fine information in the inverted condition).
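The proportion comparison above amounts to a chi-square test on a 2 × 2 contingency table. As a sketch, using the counts reported in the text for the original faces (76 of 119 neurons with fine information upright vs 53 of 119 inverted):

```python
# Sketch: chi-square test on a 2x2 table of neurons with vs without
# significant fine information in the upright and inverted conditions.
# Counts are taken from the text; scipy applies the Yates continuity
# correction by default for 2x2 tables.
from scipy.stats import chi2_contingency

table = [[76, 119 - 76],   # upright original: fine info vs none
         [53, 119 - 53]]   # inverted original
chi2, p, dof, expected = chi2_contingency(table)
# p comes out close to the reported 0.004 with the default correction
```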

Table 2.

Number of neurons representing global or fine information^a

Conditions | Global | Fine: monkey identity | Fine: monkey expression | Fine: human identity | Fine: human expression | Fine: total | None
Upright original | 117 | 32 | 37 | 45 | 14 | 76 (128) | 2
(Monkey C, Monkey T) | (27, 90) | (13, 19) | (15, 22) | (15, 30) | (8, 6) | (26, 50) | (1, 1)
Inverted original | 107 | 24 | 26 | 28 | 7 | 53 (85) | 6
(Monkey C, Monkey T) | (26, 81) | (10, 14) | (9, 17) | (7, 21) | (5, 2) | (16, 37) | (1, 5)
Upright thatcherized | 114 | 24 | 32 | 46 | 10 | 70 (112) | 5
(Monkey C, Monkey T) | (27, 87) | (11, 13) | (14, 18) | (14, 32) | (7, 3) | (24, 46) | (1, 4)
Inverted thatcherized | 108 | 18 | 27 | 26 | 7 | 53 (78) | 8
(Monkey C, Monkey T) | (26, 82) | (5, 13) | (11, 16) | (8, 18) | (5, 2) | (16, 37) | (2, 6)

^a Neurons that encoded information from [−20, 30] ms to [484, 534] ms after the stimulus onset were counted. Numbers in parentheses in the Fine: total column indicate the total number of cases of fine categories (up to four cases per neuron). Each pair of numbers in parentheses indicates the numbers of neurons in Monkey C and Monkey T, respectively.

To examine whether the decrease in fine information in the inverted condition could be detected as a decrease in the selectivity of facial-stimulus representation, a sparseness index of neuronal responses to the facial stimuli (Rolls and Tovee, 1995) (see Materials and Methods) was calculated using the absolute firing rates of each face-responsive neuron. The sparseness index in the upright condition (original faces: median, 0.81; IQR, 0.60–0.90; thatcherized faces: median, 0.83; IQR, 0.61–0.90) was slightly but significantly smaller than that in the inverted condition for both the original and thatcherized faces (original faces: median, 0.84; IQR, 0.65–0.91; thatcherized faces: median, 0.84; IQR, 0.64–0.91) (paired Wilcoxon signed rank tests, one-sided, p = 1.9 × 10−6 between the upright original and inverted original conditions, and p = 1.0 × 10−5 between the upright thatcherized and inverted thatcherized conditions), indicating that the neuronal representations of the facial stimuli in the upright condition were sparser (i.e., more selective) than those in the inverted condition. The values of the sparseness index were similar to those reported by Rolls and Tovee (1995) (mean ± SD: 0.82 ± 0.12).
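The Rolls and Tovee (1995) sparseness measure over a neuron's mean rates r_i to the n stimuli is a = (Σ r_i / n)² / (Σ r_i² / n). A minimal sketch (the paper's exact preprocessing is in its Materials and Methods):

```python
# Sketch of the sparseness index a = (sum(r_i)/n)^2 / (sum(r_i^2)/n)
# over one neuron's mean firing rates to n stimuli. Values near 1/n
# indicate a sparse (highly selective) neuron; values near 1 indicate
# broad, uniform tuning.
import numpy as np

def sparseness(rates):
    r = np.asarray(rates, dtype=float)
    n = len(r)
    return (r.sum() / n) ** 2 / ((r ** 2).sum() / n)
```

For example, a neuron responding equally to all stimuli gives a = 1, whereas one responding to a single stimulus out of four gives a = 0.25.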

Representation of test stimuli by neuronal population

To examine representation of the test stimuli by the population of neurons along the time axis, we calculated 120 population activity vectors consisting of the mean firing rates of the 119 face-responsive neurons (Monkey C = 28, Monkey T = 91) for the 120 test stimuli, including 28 upright original faces, 28 inverted original faces, 28 upright thatcherized faces, 28 inverted thatcherized faces, and 8 geometric shapes, within a 50 ms time window, moving in 1 ms steps. To determine the time window in which the global or fine categorization was maximally represented, the Euclidean distances between population activity vectors were calculated (see Materials and Methods). Based on the periods of the distance peaks of the global category and the fine category, we determined the [115, 165] ms window as the time window for global categorization, and the [140, 190] ms window as the time window for fine categorization.
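The construction of population activity vectors described above can be sketched as follows. Spike times here are simulated, and the helper is an assumption about the windowing; the shapes (stimuli × neurons) follow the text.

```python
# Hedged sketch: mean firing rate of each neuron to each stimulus within
# a 50 ms window (slid in 1 ms steps in the paper), collected into a
# stimulus x neuron matrix, then Euclidean distances between the
# resulting population vectors. Spike times are simulated.
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_stimuli = 119, 120
# spike_times[i][j]: spike times (ms) of neuron i in response to stimulus j
spike_times = [[rng.uniform(0, 500, rng.poisson(20)) for _ in range(n_stimuli)]
               for _ in range(n_neurons)]

def population_vectors(t_start, width=50):
    """stimulus x neuron firing rates (spikes/s) in [t_start, t_start + width] ms."""
    rates = np.zeros((n_stimuli, n_neurons))
    for i in range(n_neurons):
        for j in range(n_stimuli):
            st = spike_times[i][j]
            rates[j, i] = np.sum((st >= t_start) & (st < t_start + width)) / (width / 1000)
    return rates

vecs = population_vectors(115)            # e.g., the [115, 165] ms window
d = np.linalg.norm(vecs[0] - vecs[1])     # distance between two stimuli
```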

Because the amount of global information was retained in the inverted condition, the population of neurons presumably represents the global category in the inverted condition as well. However, it was not clear whether the inverted monkey faces were represented near the upright monkey faces, and the inverted human faces near the upright human faces, within the space of the population activity vectors. To address this issue, PCA was applied to the 64 population activity vectors, including the upright and inverted original faces and shapes, in the [115, 165] ms window. Consequently, the 64 population activity vectors in the 119-dimensional space were projected onto 64 vectors in a two-dimensional space. As shown in Figure 5, the vectors of the inverted monkey faces (blue X's) are represented near the vectors of the upright monkey faces (red X's), the vectors of the inverted human faces (blue plus signs) are represented near the upright human faces (red plus signs), and the vectors of the shapes (squares) are represented separately from the human and monkey face vectors. The Euclidean distances between the 28 upright faces and 28 inverted faces (median, 6.40; IQR, 5.79–7.10) were significantly smaller than the distances of the 192 combinations of 16 monkey faces and 12 human faces (median, 9.77; IQR, 9.21–10.3) (Wilcoxon rank sum test, p = 1.5 × 10−17). These results show that the inverted monkey faces evoked population responses similar to those evoked by the upright monkey faces and that the inverted human faces evoked population responses similar to those evoked by the upright human faces, indicating that the activity of the neuronal population represented the inverted monkey faces as monkey faces and the inverted human faces as human faces. The clusters obtained by using a cluster analysis are shown as circles in Figure 5.
There were two clusters of monkey faces: one containing 15 upright and three inverted monkey faces, and the other containing one upright and 13 inverted monkey faces. There were three clusters of human faces, containing all 12 upright faces, 11 inverted faces, and the remaining one inverted face, respectively. Thus, the cluster memberships also showed that the vectors of the upright and inverted faces were separated. To investigate the mechanism by which the neuronal population represented the upright versus inverted face separation within each member of the global category and by which the amount of information about the global category was retained in the case of face inversion, PCA was applied to the upright original monkey and human faces (Fig. 6A), to the upright and inverted original human faces (Fig. 6B), and to the upright and inverted original monkey faces (Fig. 6C). The first principal component separated the upright monkey faces from the upright human faces (Fig. 6A, left), the upright human faces from the inverted human faces (Fig. 6B, left), and the upright monkey faces from the inverted monkey faces (Fig. 6C, left), respectively. The contribution of each neuron to the first principal component in each PCA is shown on the right-hand side of each panel. The value of each neuron was not positively correlated between the first principal component separating the monkey faces from the human faces (Fig. 6A, right) and the first principal component separating the upright human faces from the inverted human faces (Fig. 6B, right) (Pearson's correlation coefficient of −0.33, p = 0.0002; thus, in this case, a negative correlation). The value of each neuron was also not correlated between the first principal component separating the monkey faces from the human faces (Fig. 6A, right) and the first principal component separating the upright monkey faces from the inverted monkey faces (Fig. 6C, right) (Pearson's correlation coefficient of 0.13, p = 0.16). However, the value of each neuron was positively correlated between the first principal component separating the upright and inverted human faces (Fig. 6B, right) and the first principal component separating the upright and inverted monkey faces (Fig. 6C, right) (Pearson's correlation coefficient of 0.44, p = 1.1 × 10−7). These results suggest that different members of the neuronal population contribute to the global categorization and to the upright versus inverted face separation, and that this is the mechanism by which the amount of global information remains constant in the case of face inversion. The results also suggest that the upright versus inverted separation of the monkey faces and that of the human faces are likely to share contributing members.

Figure 5.

Population activity vectors consisting of upright and inverted original faces and shapes in two-dimensional space rearranged using PCA in the [115, 165] ms window. The contribution ratio was 47.4%. The points represent the population activity vectors for the individual stimuli. Red X's (Monkey up) indicate upright original monkey faces; blue X's (Monkey inv) indicate inverted original monkey faces; red plus signs (Human up) indicate upright original human faces; blue plus signs (Human inv) indicate inverted original human faces; red squares represent shapes. The clusters obtained by using the cluster analysis are shown as circles. The center of each circle is located at the mean of the coordinates of the cluster's points, and its radius was calculated from the SD of those coordinates.

Figure 6.

Population activity vectors in two-dimensional space rearranged using PCA and eigenvectors of the first principal component. A, PCA on upright original human and monkey faces. B, PCA on upright and inverted original human faces. C, PCA on upright and inverted original monkey faces. The contribution ratios were 55.2%, 35.6%, and 34.0% in A–C, respectively. In each figure, the panel on the right represents the eigenvector of the first principal component in the [115, 165] ms time window. The horizontal axis indicates each face responsive neuron (from 1 to 119; 1–28 in Monkey C and 29–119 in Monkey T), whereas the vertical axis indicates the value of each element that constitutes the eigenvector. Fiducial markings are as in Figure 5.

We then looked at the differences in representation of the fine category items depending on the upright and inverted conditions. PCA was applied on the 120 population activity vectors consisting of all the test stimuli in the [115, 165] ms and [140, 190] ms time windows. Consequently, the 120 population activity vectors in the 119-dimensional space were projected onto 120 vectors in the two-dimensional space. Figure 7A shows the distributions of the 36 population activity vectors, consisting of 28 upright original faces and 8 shapes, taken from the 120 vectors in the two-dimensional space in the [115, 165] ms and [140, 190] ms time windows. The clusters obtained by using the cluster analysis in the [115, 165] ms and [140, 190] ms windows are shown as circles. There were four and nine clusters in the [115, 165] ms and [140, 190] ms windows, respectively. The [115, 165] ms window contained four clusters corresponding to human faces, monkey faces, seven shapes, and the remaining one shape. The members of each cluster in the [140, 190] ms window are shown in Figure 7A (bottom). In the [140, 190] ms window, some clusters that were found in the [115, 165] ms window were separated into subclusters, replicating the result of our previous study (Matsumoto et al., 2005). The human face cluster in the [115, 165] ms window was separated into six subclusters, differentiating three different human models. The monkey face cluster in the [115, 165] ms window was separated into two subclusters. One cluster contained four full-open-mouthed faces (Fig. 1A, monkey facial expression c) and two mid-open-mouthed faces (Fig. 1A, monkey facial expression d), and the other contained the remaining monkey faces. In the inverted original face condition shown in Figure 7B, however, there were four clusters both in the [115, 165] ms and [140, 190] ms windows. 
Both the [115, 165] ms window and the [140, 190] ms window contained clusters corresponding to the human faces, 15 monkey faces, the remaining one monkey face (a stimulus of Monkey model 1 with facial expression c in Fig. 1A), and the shapes (members of each cluster in the [115, 165] ms window not shown). Thus, the human face cluster in the [115, 165] ms window was not separated into subclusters in the [140, 190] ms window. The members of the two monkey face clusters in the [115, 165] ms window remained the same in the [140, 190] ms window. Comparing the members of each cluster in Figure 7A, B (bottom), it is clear that the population activity was impaired in its ability to represent fine category items in the inverted condition. A similar trend was observed in the upright and inverted thatcherized face conditions. Comparing the members of each cluster in Figure 7C, D (bottom), it is again evident that the clusters in the upright condition represented fine categorizations of human identity and monkey expression, whereas the clusters in the inverted condition did not. To evaluate the cluster separation of the global category between the upright and inverted conditions, the Euclidean distances between the mean vectors of the human and monkey clusters, between the mean vectors of the human and shape clusters, and between the mean vectors of the monkey and shape clusters were measured in the [115, 165] ms window. The obtained distances (human–monkey, human–shape, and monkey–shape) in the upright and inverted conditions were not significantly different for either the original or thatcherized faces (Wilcoxon signed rank test, p = 0.25 for original faces, p = 0.5 for thatcherized faces).
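The paper's clustering procedure is specified in its Materials and Methods; as an illustrative stand-in, the sketch below applies agglomerative (Ward) clustering to 2-D points and cuts the dendrogram at a distance threshold. The threshold, the Ward linkage, and the toy data are all assumptions.

```python
# Hedged sketch: agglomerative clustering of 2-D PCA scores, standing in
# for the cluster analysis of Figures 5 and 7. Two well-separated blobs
# play the role of, e.g., human vs monkey face vectors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
pts = np.vstack([rng.normal(0.0, 0.2, (14, 2)),   # blob 1
                 rng.normal(8.0, 0.2, (14, 2))])  # blob 2
Z = linkage(pts, method="ward")
labels = fcluster(Z, t=3.0, criterion="distance")  # cut height is assumed
n_clusters = len(np.unique(labels))
```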

Figure 7.

Population activity vectors in two-dimensional space rearranged using PCA in the [115, 165] ms and [140, 190] ms windows. A, Population vectors of upright original faces and shapes. B, Population vectors of inverted original faces and shapes. C, Population vectors of upright thatcherized faces and shapes. D, Population vectors of inverted thatcherized faces and shapes. The contribution ratio was 42.8% in the [115, 165] ms window and 39.9% in the [140, 190] ms window. The points represent the population activity vectors for the individual stimuli. The shape of each point indicates the global category (i.e., X's, plus signs, and squares for monkey faces, human faces, and shapes, respectively). The color of each point indicates the human model for human faces, the monkey model for monkey faces, and circle versus rectangle for the shapes. Other fiducial markings are as in Figure 5. Members of individual clusters in the [140, 190] ms window are shown at the bottom.

To evaluate the results of the cluster analysis, we performed a nonparametric bootstrap analysis to obtain histograms of the number of clusters. Bootstrap estimates of the number of clusters were computed as the number of clusters in 2000 bootstrap samples obtained by randomly resampling, with replacement, from the 119 neuronal activities for the upright original and inverted original face conditions in the [115, 165] ms and [140, 190] ms windows. The results are shown in Figure 8. The histogram for the upright original condition in the [140, 190] ms window (Fig. 8A, right) was significantly different from that for the inverted original condition in the same window (Fig. 8B, right) (χ2 test, df = 7, p < 2.2 × 10−16). The cluster numbers shown in Figure 7A (4 at [115, 165] ms and 9 at [140, 190] ms) and Figure 7B (4 at [115, 165] ms and 4 at [140, 190] ms) were within the range of each histogram in Figure 8A (median, 6; IQR, 5–8 at [115, 165] ms; median, 8; IQR, 7–9 at [140, 190] ms) and Figure 8B (median, 6; IQR, 5–7 at [115, 165] ms; median, 5; IQR, 4–6 at [140, 190] ms). Therefore, the results of the bootstrap sampling indicate that the inversion manipulation decreased the number of clusters in the [140, 190] ms window.
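The bootstrap above resamples neurons, not stimuli: each bootstrap sample redraws the 119 neurons with replacement (i.e., resamples the columns of the stimulus × neuron matrix), reruns the clustering, and records the cluster count. A hedged sketch, with an assumed Ward-linkage clustering step and toy data:

```python
# Hedged sketch of the neuron-resampling bootstrap of cluster counts.
# The clustering step (Ward linkage, fixed distance cut) and the data
# are assumptions; the paper's procedure is in its Materials and Methods.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)
X = rng.normal(size=(36, 119))            # 36 stimuli x 119 neurons (toy data)

def n_clusters(mat, cut=15.0):
    Z = linkage(mat, method="ward")
    return len(np.unique(fcluster(Z, t=cut, criterion="distance")))

counts = []
for _ in range(200):                      # 2000 samples in the paper
    cols = rng.integers(0, X.shape[1], X.shape[1])   # resample neurons
    counts.append(n_clusters(X[:, cols]))
median_count = float(np.median(counts))
```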

Figure 8.

Histograms of the number of clusters. A, The number of clusters estimated in 2000 bootstrap samples in the [115, 165] ms window (left-hand panel) and in the [140, 190] ms window (right-hand panel) for the upright original face condition. B, The number of clusters estimated in 2000 bootstrap samples in the [115, 165] ms window (left-hand panel) and in the [140, 190] ms window (right-hand panel) for the inverted original face condition. The vertical broken line in each panel indicates the number of clusters shown in Figure 7A, B.

To examine whether the difference in separability of the fine category between the upright and inverted conditions was related to characteristics of the neural responses, the firing-rate variability across trials (mean-matched Fano factor) (Churchland et al., 2010) was computed for each face-responsive neuron and averaged across the neurons. Consistent with previous findings (Churchland et al., 2010), the mean-matched Fano factor decreased after the stimulus onset (Wilcoxon rank sum test, p = 1.7 × 10−5 in the upright original face condition, p = 1.7 × 10−5 in the inverted original face condition, p = 6.9 × 10−5 in the upright thatcherized face condition, p = 0.0016 in the inverted thatcherized face condition). The Fano factor after the stimulus onset was similar across the four facial conditions (two-way ANOVA, factor = upright vs inverted, F(1,52) = 0.89, p = 0.35; factor = original vs thatcherized, F(1,52) = 0.0012, p = 0.97; interaction, F(1,52) = 0.0014, p = 0.97), and the minimum Fano factor was observed in the [68, 118] ms window in all four conditions. These results suggest that the decrease in firing-rate variability is unrelated to the decrease in the separability of the fine category in the inverted condition.
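The underlying variability measure is the Fano factor, the variance of the spike count across trials divided by its mean. The sketch below shows the raw quantity on simulated counts; the mean-matching step of Churchland et al. (2010), which equalizes the distribution of mean counts across conditions before averaging, is omitted for brevity.

```python
# Simplified sketch: raw Fano factor (trial-to-trial variance / mean of
# spike counts). Mean-matching (Churchland et al., 2010) is NOT
# implemented here; spike counts are simulated.
import numpy as np

rng = np.random.default_rng(5)
n_trials = 20
pre = rng.poisson(5.0, n_trials) + rng.poisson(2.0, n_trials)  # extra variance
post = rng.poisson(7.0, n_trials)                              # Poisson-like

def fano(counts):
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()
```

A Fano factor near 1 is what a Poisson process would give; values above 1 indicate super-Poisson variability, and the quenching after stimulus onset appears as a drop in this ratio.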

To examine whether the population of neurons differentiated the original face from the thatcherized version of the face, the Euclidean distance between population activity vectors was calculated for each original–thatcherized face pair with a normal facial expression. The time course of the averaged distance in the upright condition (black) was compared with that in the inverted condition (cyan) for monkey (Fig. 9A) and human (Fig. 9B) faces. If the face-responsive neurons represented differences between the original and thatcherized faces more strongly in the upright condition than in the inverted condition, the Euclidean distance between the original and thatcherized faces in the upright condition would be larger than that in the inverted condition. Statistical significance was examined from [4, 54] ms to [348, 398] ms after the stimulus onset in 8 ms steps (i.e., the time windows that were used in the information analysis). As shown in Figure 9A, the original–thatcherized face distance in the upright condition was significantly larger than that in the inverted condition in three time windows, [228, 278] ms, [252, 302] ms, and [260, 310] ms after the stimulus onset, for the monkey faces (black dots on the abscissa, paired t test, one-sided, df = 3, p < 0.05, p values adjusted for multiple comparisons using the false discovery rate). Although there was a difference between the two distances for the human faces (Fig. 9B), it was not significant. Thus, the neuronal population represented the original monkey face and its thatcherized version more separately in the upright condition than in the inverted condition.
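The per-window comparison above (df = 3 implies the four face pairs per species) can be sketched as a one-sided paired t test in each window, followed by a Benjamini-Hochberg adjustment across windows. Distances here are simulated, and the vectorized BH step is an assumption about the correction used.

```python
# Hedged sketch: per-window one-sided paired t test on four
# original-thatcherized pair distances (upright vs inverted), with
# Benjamini-Hochberg FDR adjustment across windows. Data are simulated.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(6)
n_windows, n_pairs = 44, 4               # [4, 54] to [348, 398] ms in 8 ms steps
upright_d = rng.normal(6.0, 0.5, (n_windows, n_pairs)) + 1.5  # larger distances
inverted_d = rng.normal(6.0, 0.5, (n_windows, n_pairs))
pvals = np.array([ttest_rel(u, v, alternative="greater").pvalue
                  for u, v in zip(upright_d, inverted_d)])

# Benjamini-Hochberg step-up, vectorized
m = len(pvals)
order = np.argsort(pvals)
adj_sorted = np.minimum.accumulate(
    (pvals[order] * m / np.arange(1, m + 1))[::-1])[::-1]
adj = np.empty(m)
adj[order] = np.minimum(adj_sorted, 1.0)
significant = adj < 0.05
```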

Figure 9.

Averaged Euclidean distance between population activity vectors for each original–thatcherized face pair in the upright and inverted conditions along the time axis. A, Averaged Euclidean distance between the original–thatcherized face pairs for monkey faces. B, Averaged Euclidean distance between the original–thatcherized face pairs for human faces. The distance was calculated only for faces with a normal facial expression. Black curves represent original–thatcherized pair distances in the upright condition; cyan curves represent those in the inverted condition. Black dots on the abscissa indicate windows in which the original–thatcherized face distances in the upright condition were significantly larger than those in the inverted condition.

Discussion

We investigated the effects of face inversion and thatcherization on the temporal processing stages of facial information in macaque area TE. Face inversion altered the responses of many neurons, and the altered responses differently affected the two levels of categorization of the facial stimuli. PCA on the neuronal population responses revealed that the global categorization occurred regardless of face inversion and thatcherization and that the inverted faces were represented near the upright faces in the population activity vector space. However, inversion decreased both the amount of fine information (facial identity and expression) and the number of neurons representing it, as well as the ability of the neuronal population to represent human identity and monkey expression.

Effect of face inversion on information represented by neuronal activity and population representation of facial information

In human behavioral studies, it has been found that the configuration of facial parts is less perceptible after face inversion (Freire et al., 2000; Rossion, 2008; McKone and Yovel, 2009). Previous studies in nonhuman primates have also shown that face-responsive/selective neurons show decreased response strength and increased response latency to inverted face images (Perrett et al., 1984; Tsao et al., 2006). Consistent with these findings, we observed that many neurons changed their responses to faces depending on the upright and inverted conditions. The present study extends the previous findings by showing that face inversion not only altered the neuronal responses but also partially impaired facial information processing: the neuronal representation of the fine categories was affected, although that of the global category was not. The distribution of the PCA eigenvectors showed that different members of the neuronal population contributed to the global categorization of monkey versus human faces and to the upright versus inverted categorization, indicating that face inversion probably altered the members that participated in different aspects of the categorization.

The finding that face inversion decreased fine information but did not affect global information suggests that different neural mechanisms underlie the encoding of these two levels of information. This is consistent with previous studies on the processing of different aspects of a face or a visual stimulus in the IT cortex. Freiwald and Tsao (2010) showed that information about facial identity across different facial views emerges gradually after stimulus onset in an anterior middle face patch of the inferior temporal gyrus, indicating that additional information processing is necessary to compute facial identity across views. This agrees with the results of the present study and of our previous report (Sugase et al., 1999) that the information latency for the global category was shorter than that for the fine category items. An earlier emergence of global information relative to local information has also been found in the responses of IT neurons to nonfacial geometric shapes (Sripati and Olson, 2009).

Because inversion preserves the low-level characteristics of the images, the sustained representation of the global category despite inversion is consistent with the suggestion by Baldassi et al. (2013) that the representation of visual objects in the IT cortex is accounted for by their shape similarity rather than their semantic similarity. Because neurons in the IT cortex are sensitive to the configuration of facial parts (Yamane et al., 1988; Freiwald et al., 2009) and the position of a facial part within a whole face is altered by face inversion (Freiwald et al., 2009), it is understandable that the neurons lost their selective responses to the configural (high-level) characteristics of the faces, which may have been acquired through repeated encounters in daily life.

The present finding of an impaired representation of the fine category items under inversion suggests that this impaired neuronal representation underlies the face inversion effect for face images of both species. Comparative psychological studies have shown that monkeys exhibit the face inversion effect at least for monkey faces, but findings have been inconsistent as to whether the effect is restricted to conspecific faces (Tomonaga, 1994; Dahl et al., 2007, 2009; Parr et al., 2008). Because behavioral effects were not observed in other studies (Rosenfeld and Van Hoesen, 1979; Bruce, 1982; Dittrich, 1994; Gothard et al., 2004), it remains to be elucidated whether the neuronal face inversion effect has a causal effect on behavior. The effect of inversion was observed in both the facial identity and expression categories, suggesting that area TE contributes to the face inversion effect on recognition of both facial identity and expression. Because the amygdala plays a role in activity modulations according to facial expressions in the monkey IT cortex (Hadj-Bouziane et al., 2008, 2012), it might also play a role in the decreased representation of facial expression under face inversion.

Human event-related potential studies have shown that face inversion affects an early negative component that peaks at ∼170 ms after stimulus onset (N170, starting at 130–150 ms) (Rousselet et al., 2008), delaying its peak latency and increasing its amplitude (Rossion and Gauthier, 2002). The timing of the face inversion effect on fine information (∼150 ms) and on the representation of the fine categories by the neuronal population ([140, 190] ms) seems close to that of the effect on the human N170. However, because local field potentials in monkeys show differences between faces and nonface objects (a candidate homolog of the human N170) at ∼100–130 ms (Kreiman et al., 2006; Freiwald and Tsao, 2010), the timing of the face inversion effect in our study might not be directly comparable to that of the human N170. It is still not clear whether additional mechanisms are recruited in the case of face inversion, as suggested by the N170 face inversion effect (Sadeh and Yovel, 2010).

Effect of thatcherization on responses of a neuronal population

The Euclidean distance between the population activity vectors of the original face and the thatcherized face in the upright condition was greater than that in the inverted condition for monkey faces (Fig. 9A), and the difference was observed starting ∼250 ms after the stimulus onset, much later than the peak of the cumulative sum of global and fine information across the neuronal population. This result suggests that the Thatcher effect can be observed for monkey faces and that the effect is represented later than the face inversion effect. The former result agrees with the behavioral observation in comparative psychology that monkeys show the Thatcher effect for monkey faces (Adachi et al., 2009; Dahl et al., 2010). It is not clear why the Thatcher effect was observed later than the face inversion effect. Human event-related potential studies, in contrast, have shown the effect both in the N170 (Milivojevic et al., 2003; Carbon et al., 2005; Boutsen et al., 2006; Gu et al., 2007) and in a later component (P250) (Milivojevic et al., 2003). Because the contributions of the temporal lobe and of emotion/social evaluation processing areas (amygdala, medial prefrontal, and subcallosal cortices) to the Thatcher effect have been documented in human functional magnetic resonance imaging (fMRI) studies (Rotshtein et al., 2001; Donnelly et al., 2011; Psalta et al., 2014), it is possible that these areas play a role in the Thatcher effect observed in area TE.

A relatively small number of neurons changed their activity upon presentation of thatcherized faces, suggesting that the contribution of TE neurons to the Thatcher effect may be small. One possible reason is that the thatcherized faces were presented in a pseudorandom order among the remaining facial stimuli; as a result, the monkeys had no opportunity to compare the original–thatcherized face pairs. The Thatcher effect in macaque monkeys has been measured as dishabituation (an increase in looking time) to a thatcherized face immediately after habituation to the original version of the face (Adachi et al., 2009; Dahl et al., 2010). The face inversion effect, in contrast, is known to be observed with randomized presentation of upright and inverted faces (e.g., Yin, 1969). Repeated presentation of a face is another task design to be considered in future studies, since a human fMRI study has shown the influence of thatcherized faces on high-order visual areas using fMRI adaptation (Rotshtein et al., 2001).

In conclusion, face inversion and thatcherization were found to affect the temporal processing stages of facial information in macaque area TE. The neuronal representation suggested that the ability to recognize global categorization is maintained but that the ability to recognize facial identities and facial expressions decreases in the case of face inversion and that, in the case of thatcherization, sensitivity to eye inversion in an inverted face is less than that in an upright face for monkey faces. Thus, the neuronal face representation in area TE may cause the face inversion effect and the Thatcher effect observed in psychological studies.

Footnotes

This work was supported by Grants-in-Aid for Scientific Research on Innovative Areas "Face perception and recognition" 21119528 and 23119732 to Y.S.-M., Grant-in-Aid for Scientific Research on Innovative Areas "Sparse Modeling" 26120535 to Y.S.-M., and KAKENHI 22700161 to N.M. and 24650105 to K.K. We thank Dr. Aya Takemura, Dr. Shigeru Yamane, Dr. Keiji Matsuda, Dr. Ichiro Takashima, Dr. Masato Okada, Dr. Shotaro Akaho, Dr. Barry J. Richmond, Ai Muramatsu, Toshiharu Takasu, and Mizuho Yamane for their assistance.

The authors declare no competing financial interests.

References

  1. Abeles M, Goldstein MH. Multispike train analysis. Proc IEEE. 1977;65:762–773. doi: 10.1109/PROC.1977.10559.
  2. Adachi I, Chou DP, Hampton RR. Thatcher effect in monkeys demonstrates conservation of face perception across primates. Curr Biol. 2009;19:1270–1273. doi: 10.1016/j.cub.2009.05.067.
  3. Attias H. Inferring parameters and structure of latent variable models by variational Bayes. Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence; San Mateo, CA: Morgan Kaufmann; 1999.
  4. Baldassi C, Alemi-Neissi A, Pagan M, DiCarlo JJ, Zecchina R, Zoccolan D. Shape similarity, better than semantic membership, accounts for the structure of visual object representations in a population of monkey inferotemporal neurons. PLoS Comput Biol. 2013;9:e1003167. doi: 10.1371/journal.pcbi.1003167.
  5. Bell AH, Hadj-Bouziane F, Frihauf JB, Tootell RB, Ungerleider LG. Object representations in the temporal cortex of monkeys and humans as revealed by functional magnetic resonance imaging. J Neurophysiol. 2009;101:688–700. doi: 10.1152/jn.90657.2008.
  6. Bouret S, Ravel S, Richmond BJ. Complementary neural correlates of motivation in dopaminergic and noradrenergic neurons of monkeys. Front Behav Neurosci. 2012;6:40. doi: 10.3389/fnbeh.2012.00040.
  7. Boutsen L, Humphreys GW, Praamstra P, Warbrick T. Comparing neural correlates of configural processing in faces and objects: an ERP study of the Thatcher illusion. Neuroimage. 2006;32:352–367. doi: 10.1016/j.neuroimage.2006.03.023.
  8. Bruce C. Face recognition by monkeys: absence of an inversion effect. Neuropsychologia. 1982;20:515–521. doi: 10.1016/0028-3932(82)90025-2.
  9. Carbon CC, Schweinberger SR, Kaufmann JM, Leder H. The Thatcher illusion seen by the brain: an event-related brain potentials study. Brain Res Cogn Brain Res. 2005;24:544–555. doi: 10.1016/j.cogbrainres.2005.03.008.
  10. Churchland MM, Yu BM, Cunningham JP, Sugrue LP, Cohen MR, Corrado GS, Newsome WT, Clark AM, Hosseini P, Scott BB, Bradley DC, Smith MA, Kohn A, Movshon JA, Armstrong KM, Moore T, Chang SW, Snyder LH, Lisberger SG, Priebe NJ, et al. Stimulus onset quenches neural variability: a widespread cortical phenomenon. Nat Neurosci. 2010;13:369–378. doi: 10.1038/nn.2501.
  11. Dahl CD, Logothetis NK, Hoffman KL. Individuation and holistic processing of faces in rhesus monkeys. Proc R Soc Lond B Biol Sci. 2007;274:2069–2076. doi: 10.1098/rspb.2007.0477.
  12. Dahl CD, Wallraven C, Bülthoff HH, Logothetis NK. Humans and macaques employ similar face-processing strategies. Curr Biol. 2009;19:509–513. doi: 10.1016/j.cub.2009.01.061.
  13. Dahl CD, Logothetis NK, Bülthoff HH, Wallraven C. The Thatcher illusion in humans and monkeys. Proc R Soc Lond B Biol Sci. 2010;277:2973–2981. doi: 10.1098/rspb.2010.0438.
  14. Derntl B, Seidel EM, Kainz E, Carbon CC. Recognition of emotional expressions is affected by inversion and presentation time. Perception. 2009;38:1849–1862. doi: 10.1068/p6448.
  15. Dittrich W. How monkeys see others: discrimination and recognition of monkeys' shape. Behav Process. 1994;33:139–154. doi: 10.1016/0376-6357(94)90063-9.
  16. Donnelly N, Zürcher NR, Cornes K, Snyder J, Naik P, Hadwin J, Hadjikhani N. Discriminating grotesque from typical faces: evidence from the Thatcher illusion. PLoS One. 2011;6:e23340. doi: 10.1371/journal.pone.0023340.
  17. Eifuku S, De Souza WC, Tamura R, Nishijo H, Ono T. Neuronal correlates of face identification in the monkey anterior temporal cortical areas. J Neurophysiol. 2004;91:358–371. doi: 10.1152/jn.00198.2003.
  18. Freire A, Lee K, Symons LA. The face inversion effect as a deficit in the encoding of configural information: direct evidence. Perception. 2000;29:159–170. doi: 10.1068/p3012.
  19. Freiwald WA, Tsao DY. Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science. 2010;330:845–851. doi: 10.1126/science.1194908.
  20. Freiwald WA, Tsao DY, Livingstone MS. A face feature space in the macaque temporal lobe. Nat Neurosci. 2009;12:1187–1196. doi: 10.1038/nn.2363.
  21. Ghahramani Z, Beal MJ. Variational inference for Bayesian mixtures of factor analyzers. In: Solla SA, Leen TK, Müller KR, editors. Advances in neural information processing systems. Vol 12. Cambridge, MA: MIT Press; 2000.
  22. Gothard KM, Erickson CA, Amaral DG. How do rhesus monkeys (Macaca mulatta) scan faces in a visual paired comparison task? Anim Cogn. 2004;7:25–36. doi: 10.1007/s10071-003-0179-6.
  23. Gu X, Li Y, Yang X, Zhu Y. Spatial–temporal analysis of face processing using an ERP study of the Thatcher illusion. Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2007. pp. 2496–2499.
  24. Hadj-Bouziane F, Bell AH, Knusten TA, Ungerleider LG, Tootell RB. Perception of emotional expressions is independent of face selectivity in monkey inferior temporal cortex. Proc Natl Acad Sci U S A. 2008;105:5591–5596. doi: 10.1073/pnas.0800489105.
  25. Hadj-Bouziane F, Liu N, Bell AH, Gothard KM, Luh WM, Tootell RB, Murray EA, Ungerleider LG. Amygdala lesions disrupt modulation of functional MRI activity evoked by facial expression in the monkey inferior temporal cortex. Proc Natl Acad Sci U S A. 2012;109:E3640–E3648. doi: 10.1073/pnas.1218406109.
  26. Hasselmo ME, Rolls ET, Baylis GC. The role of expression and identity in the face-selective responses of neurons in the temporal visual cortex of the monkey. Behav Brain Res. 1989;32:203–218. doi: 10.1016/S0166-4328(89)80054-3.
  27. Hays AV, Richmond BJ, Optican LM. A UNIX-based multiple process system for real-time data acquisition and control. WESCON Conf Proc. 1982;2:1–10.
  28. Heller J, Hertz JA, Kjaer TW, Richmond BJ. Information flow and temporal coding in primate pattern vision. J Comput Neurosci. 1995;2:175–193. doi: 10.1007/BF00961433.
  29. Kitazawa S, Kimura T, Yin PB. Cerebellar complex spikes encode both destinations and errors in arm movements. Nature. 1998;392:494–497. doi: 10.1038/33141.
  30. Kreiman G, Hung CP, Kraskov A, Quiroga RQ, Poggio T, DiCarlo JJ. Object selectivity of local field potentials and spikes in the macaque inferior temporal cortex. Neuron. 2006;49:433–445. doi: 10.1016/j.neuron.2005.12.019.
  31. Lewis MB, Edmonds AJ. Face detection: mapping human performance. Perception. 2003;32:903–920. doi: 10.1068/p5007.
  32. Matsumoto N, Okada M, Sugase-Miyamoto Y, Yamane S, Kawano K. Population dynamics of face-responsive neurons in the inferior temporal cortex. Cereb Cortex. 2005;15:1103–1112. doi: 10.1093/cercor/bhh209.
  33. McKelvie SJ. Emotional expression in upside-down faces: evidence for configurational and componential processing. Br J Soc Psychol. 1995;34:325–334. doi: 10.1111/j.2044-8309.1995.tb01067.x.
  34. McKone E, Yovel G. Why does picture-plane inversion sometimes dissociate perception of features and spacing in faces, and sometimes not? Toward a new theory of holistic processing. Psychon Bull Rev. 2009;16:778–797. doi: 10.3758/PBR.16.5.778.
  35. Milivojevic B, Clapp WC, Johnson BW, Corballis MC. Turn that frown upside down: ERP effects of thatcherization of misorientated faces. Psychophysiology. 2003;40:967–978. doi: 10.1111/1469-8986.00115.
  36. Parr LA, Heintz M, Pradhan G. Rhesus monkeys (Macaca mulatta) lack expertise in face processing. J Comp Psychol. 2008;122:390–402. doi: 10.1037/0735-7036.122.4.390.
  37. Perrett DI, Smith PA, Potter DD, Mistlin AJ, Head AS, Milner AD, Jeeves MA. Neurones responsive to faces in the temporal cortex: studies of functional organization, sensitivity to identity and relation to perception. Hum Neurobiol. 1984;3:197–208.
  38. Pinsk MA, Arcaro M, Weiner KS, Kalkus JF, Inati SJ, Gross CG, Kastner S. Neural representations of faces and body parts in macaque and human cortex: a comparative fMRI study. J Neurophysiol. 2009;101:2581–2600. doi: 10.1152/jn.91198.2008.
  39. Prkachin GC. The effects of orientation on detection and identification of facial expressions of emotion. Br J Psychol. 2003;94:45–62. doi: 10.1348/000712603762842093.
  40. Psalta L, Young AW, Thompson P, Andrews TJ. The Thatcher illusion reveals orientation dependence in brain regions involved in processing facial expressions. Psychol Sci. 2014;25:128–136. doi: 10.1177/0956797613501521.
  41. R Development Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2004.
  42. Riby DM, Riby LM, Reay JL. Differential sensitivity to rotations of facial features in the Thatcher illusion. Psychol Rep. 2009;105:721–726. doi: 10.2466/PR0.105.3.721-726.
  43. Rolls ET, Tovee MJ. Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J Neurophysiol. 1995;73:713–726. doi: 10.1152/jn.1995.73.2.713.
  44. Rosenfeld SA, Van Hoesen GW. Face recognition in the rhesus monkey. Neuropsychologia. 1979;17:503–509. doi: 10.1016/0028-3932(79)90057-5.
  45. Rossion B. Picture-plane inversion leads to qualitative changes of face perception. Acta Psychol. 2008;128:274–289. doi: 10.1016/j.actpsy.2008.02.003.
  46. Rossion B, Gauthier I. How does the brain process upright and inverted faces? Behav Cogn Neurosci Rev. 2002;1:63–75. doi: 10.1177/1534582302001001004.
  47. Rotshtein P, Malach R, Hadar U, Graif M, Hendler T. Feeling or features: different sensitivity to emotion in high-order visual cortex and amygdala. Neuron. 2001;32:747–757. doi: 10.1016/S0896-6273(01)00513-X.
  48. Rousselet GA, Husk JS, Bennett PJ, Sekuler AB. Time course and robustness of ERP object and face differences. J Vis. 2008;8:1–18. doi: 10.1167/8.12.3.
  49. Sadeh B, Yovel G. Why is the N170 enhanced for inverted faces? An ERP competition experiment. Neuroimage. 2010;53:782–789. doi: 10.1016/j.neuroimage.2010.06.029.
  50. Saleem KS, Tanaka K. Divergent projections from the anterior inferotemporal area TE to the perirhinal and entorhinal cortices in the macaque monkey. J Neurosci. 1996;16:4757–4775. doi: 10.1523/JNEUROSCI.16-15-04757.1996.
  51. Saunders RC, Aigner TG, Frank JA. Magnetic resonance imaging of the rhesus monkey brain: use for stereotactic neurosurgery. Exp Brain Res. 1990;81:443–446. doi: 10.1007/BF00228139.
  52. Seltzer B, Pandya DN. Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res. 1978;149:1–24. doi: 10.1016/0006-8993(78)90584-X.
  53. Sripati AP, Olson CR. Representing the forest before the trees: a global advantage effect in monkey inferotemporal cortex. J Neurosci. 2009;29:7788–7796. doi: 10.1523/JNEUROSCI.5766-08.2009.
  54. Sugase Y, Yamane S, Ueno S, Kawano K. Global and fine information coded by single neurons in the temporal visual cortex. Nature. 1999;400:869–873. doi: 10.1038/23703.
  55. Sugase-Miyamoto Y, Matsumoto N, Kawano K. Role of temporal processing stages by inferior temporal neurons in facial recognition. Front Psychol. 2011;2:141. doi: 10.3389/fpsyg.2011.00141.
  56. Thompson P. Margaret Thatcher: a new illusion. Perception. 1980;9:483–484. doi: 10.1068/p090483.
  57. Tomonaga M. How laboratory-raised Japanese monkeys (Macaca fuscata) perceive rotated photographs of monkeys: evidence for an inversion effect in face perception. Primates. 1994;35:155–165. doi: 10.1007/BF02382051.
  58. Tsao DY, Freiwald WA, Tootell RB, Livingstone MS. A cortical region consisting entirely of face-selective cells. Science. 2006;311:670–674. doi: 10.1126/science.1119983.
  59. Tsao DY, Moeller S, Freiwald WA. Comparing face patch systems in macaques and humans. Proc Natl Acad Sci U S A. 2008;105:19514–19519. doi: 10.1073/pnas.0809662105.
  60. Valentine T. A unified account of the effects of distinctiveness, inversion, and race in face recognition. Q J Exp Psychol A. 1991;43:161–204. doi: 10.1080/14640749108400966.
  61. Yamane S, Kaji S, Kawano K. What facial features activate face neurons in the inferotemporal cortex of the monkey? Exp Brain Res. 1988;73:209–214. doi: 10.1007/BF00279674.
  62. Yin RK. Looking at upside-down faces. J Exp Psychol. 1969;81:141–145. doi: 10.1037/h0027474.
