Proceedings of the National Academy of Sciences of the United States of America
2011 Apr 18;108(18):7635–7640. doi: 10.1073/pnas.1016213108

Humans and monkeys share visual representations

Denis Fize, Maxime Cauchoix, and Michèle Fabre-Thorpe
PMCID: PMC3088612  PMID: 21502509

Abstract

Conceptual abilities in animals have been shown at several levels of abstraction, but it is unclear whether the analogy with humans results from convergent evolution or from shared brain mechanisms inherited from a common origin. Macaque monkeys can access “non-similarity–based concepts,” such as when sorting pictures containing a superordinate target category (animal, tree, etc.) among other scenes. However, such performances could result from low-level visual processing based on learned regularities of the photographs, such as for scene categorization by artificial systems. By using pictures of man-made objects or animals embedded in man-made or natural contexts, the present study clearly establishes that macaque monkeys based their categorical decision on the presence of the animal targets regardless of the scene backgrounds. However, as is found with humans, monkeys performed better with categorically congruent object/context associations, especially when small object sizes favored background information. The accuracy improvements and the response-speed gains attributable to superordinate category congruency in monkeys were strikingly similar to those of human subjects tested with the same task and stimuli. These results suggest analogous processing of visual information during the activation of abstract representations in both humans and monkeys; they imply a large overlap between superordinate visual representations in humans and macaques as well as the implicit use of experienced associations between object and context.

Keywords: homology, natural scenes, non-human primate, visual categorization


In a demonstration of monkeys’ abstraction abilities, Bovet and Vauclair (1) reported that monkeys were able to classify new pairs of real objects as “same” when belonging to the same superordinate category (an apple and a banana, or a cup and a padlock), while classifying other combinations as “different.” The ability to perform a judgment of conceptual identity among categories such as Food or Tools corresponds to an abstract level of conceptualization (2–4), but the nature of the cerebral processes, the mental representations involved, and their similarity with those of humans remain unclear.

In monkeys, the ability to form and access perceptual classes is the most extensively studied abstraction level. Categorizing by perceptual similarity enables the formation of open-ended categories, which generalize to novel elements of the same kind (2, 3). This categorical behavior emerges spontaneously even in macaque monkeys with lesions of the lateral prefrontal cortex (5), a structure in which neurons show category-specific selectivity (6–8). In fact, building “perceptual concepts” (4), whether natural visual categories such as trees, cats, or dogs (6, 9) or artificial ones such as letter categories (10, 11), could rely on relatively simple mechanisms: a set of representative visual features involving particular shapes or typical shadings could be used as diagnostic information to categorize objects at this perceptual level (12). Such processing in humans and monkeys is likely to take place in the inferior temporal cortex, which has long been known to be critical for visual object recognition (13, 14) and for the generalization of object views (15). However, single-neuron activity within this region does not seem to reflect categorical information (9, 16–18).

At a more abstract level, conceptual behavior implies categorizing beyond the physical similarity between exemplars of a class (3, 4). The few behavioral studies that have investigated this abstraction level in monkeys dramatically increased stimulus variety by using pictures of natural scenes and superordinate categories such as Food or Animal to avoid diagnosticity from a restrained set of low-level visual cues (1, 19–22). In the Animal category, for example, the large variety of perceptually different instances of mammals, birds, insects, or fishes and the high performance reached on new exemplars suggest an abstract level of categorization and an ability for macaque monkeys to access superordinate representations (23).

However, the claim that abstract representations are used to perform the superordinate categorizations was challenged by the finding that global scene statistics can predict the presence of animals or objects in scene pictures (24). Indeed, to succeed in such tasks, monkeys had been extensively trained by using large image sets extracted from commercial databases (19–21). Because the animal images are much more likely to be pictured on natural scene backgrounds than in urban contexts, the scene statistics that support the distinction between natural and man-made environments (25, 26) and the presence of foreground objects could potentially allow performance above chance. Monkeys could have used such contextual regularities to reach high scores without any conceptual representation of the object category.

Here, we adapted for monkeys the approach used in earlier work investigating contextual effects on object processing during visual scene categorization by humans (27): animal target objects and man-made distractor objects were displayed randomly in either man-made or natural scene backgrounds. Using such stimuli should confuse the monkeys if they rely on global scene statistics to perform the task. Possible similarities between the mental representations used by humans and monkeys to solve such visual categorization tasks were also investigated by using the categorical object/context interference phenomenon. Indeed, in humans, categorization performance is affected in terms of both delayed reaction times and lower accuracies when the object and the background context belong to different superordinate categories (27–31), suggesting that the mental representations of object and background context overlapped during task performance and were “category-sensitive” at the superordinate level (27, 31).

Results show that monkeys’ categorization performance primarily relied on the processing of animal/object information, and that the scene background did not play a major diagnostic role. Furthermore, the categorical interference between foreground objects and background contexts had very similar effects on monkey and human performance over a wide range of parameters. These data suggest a strong analogy between monkeys and humans in how visual cues are used to access categorical representations, and in the underlying cerebral processing.

Results and Discussion

Two macaque monkeys (Dy and Rx) and 11 human subjects performed the Animal/Non-Animal rapid scene-categorization task developed by Thorpe et al. (32) and adapted for monkeys by Fabre-Thorpe et al. (20). The response consisted of a button release and a screen touch performed in under 1 s when a briefly flashed stimulus (50 ms) contained an animal (go/no-go task; Fig. 1A). The test stimuli comprised 192 associations of four achromatic scenes counterbalancing superordinate context categories (man-made and natural) and superordinate object categories (animal or man-made object): highly varied animal and man-made object exemplars were presented equally often in the natural and man-made background contexts. Each test stimulus was seen only once by each subject, and backgrounds and objects were controlled for various low-level visual characteristics (Materials and Methods and Fig. 1B).
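The paper does not publish its equalization code, but the kind of low-level control described here (matching mean luminance and RMS contrast across paired images) is straightforward to reproduce. The following is a minimal sketch, assuming grayscale images stored as float arrays in [0, 1]; the function name and the final clipping step are our own illustrative choices, not the authors' actual procedure.

import numpy as np

def match_luminance_contrast(img, target_mean, target_rms):
    """Rescale a grayscale image (float array in [0, 1]) so that its
    mean luminance and RMS contrast (SD of pixel values) match targets."""
    img = img.astype(float)
    mean, rms = img.mean(), img.std()
    if rms == 0:  # flat image: can only match the mean
        return np.full_like(img, target_mean)
    out = (img - mean) * (target_rms / rms) + target_mean
    # Clipping keeps values displayable but slightly perturbs the match.
    return np.clip(out, 0.0, 1.0)

# Example: equalize two backgrounds to their common average statistics.
# m = 0.5 * (bg_manmade.mean() + bg_natural.mean())
# c = 0.5 * (bg_manmade.std() + bg_natural.std())
# bg_manmade, bg_natural = (match_luminance_contrast(b, m, c)
#                           for b in (bg_manmade, bg_natural))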

Fig. 1.

Animal/Non-Animal categorization performance reached by human and monkey subjects on the first presentation of the test stimuli. (A) Animal/Non-Animal task. Stimuli were presented for three frames, unmasked, on a black screen. When a scene containing an animal was flashed, subjects had to release a button and touch the tactile screen within 1 s (target stimulus, go response); otherwise, they kept their hands on the button (distractor stimulus, no-go response). Correct (go and no-go) responses were rewarded by a noise (and a juice drop for monkeys); incorrect (go and no-go) responses were punished by the redisplay of the incorrectly categorized stimulus for 3 s. (B) Two examples of the 192 associations of four stimuli. For each association, the man-made and natural backgrounds had equal average luminance and RMS contrast, and the animal and man-made object vignettes had equal surface, center-of-mass location, luminance, and RMS contrast. (C) First-trial hit and false-alarm rates for the group of 11 human subjects and the macaque monkeys Rx and Dy. Bar histograms for hits and false alarms correspond to the stimulus illustrations in the left and right columns, respectively. (Humans: paired t test, t = 26, df = 10, P < 10 × 10⁻⁹; Rx: χ² test, χ² = 122, P < 10 × 10⁻⁹; Dy: χ² test, χ² = 86, P < 10 × 10⁻⁹.) (D) First-trial performance accuracy (pooling correct go and no-go responses) computed separately for natural and man-made background stimuli. Bar histograms for the accuracies on natural and man-made backgrounds correspond to the stimulus illustrations in the upper and lower rows, respectively. Dotted line represents chance level; error bars indicate SEM. (χ² tests, natural background: humans, contingency table, χ² = 269, n = 11, P < 10 × 10⁻⁵; Rx, χ² = 13, P < 0.0003; Dy, χ² = 9, P < 0.003. χ² tests, man-made background: humans, contingency table, χ² = 327, n = 11, P < 10 × 10⁻⁵; Rx, χ² = 11, P < 0.001; Dy, χ² = 6, P < 0.02.)

For monkeys, the sequence of stimuli randomly interleaved test stimuli with familiar scenes to ensure the stability of monkey motivation and performance (33); first-trial responses to the test stimuli were thus of crucial importance. Familiar scenes were taken from the commercial photographs on which monkeys initially learned the Animal/Non-Animal task and had been intermittently trained and tested (for 5–6 y) (Materials and Methods); both monkeys performed at 89% correct on these familiar stimuli during the days preceding the present experiment.

Non-Similarity–Based Concepts in Macaque.

For both monkeys, first-trial accuracy on the test stimuli was significantly above chance, regardless of the nature of the stimulus background (Fig. 1). The monkeys based their responses on the foreground objects, treating animals as targets and man-made objects as distractors: hits very significantly outnumbered false alarms in both monkeys (Fig. 1C). Monkeys were also able to ignore the scene-background category, because global task accuracy was well above chance level regardless of the environment category (Fig. 1D): natural (Rx, 71%; Dy, 67%) or man-made (Rx, 69%; Dy, 63%).

This task was particularly hard given the high proportion of test stimuli introduced among familiar pictures (1/3 test, 2/3 familiar stimuli). In addition, in the test stimulus set, objects averaged only 6% of the scene surface (range 0.2–22%), a much smaller object/scene surface range than in familiar stimuli. Object locations in the test stimuli were also extremely varied, at eccentricities ranging from 0.2° to 24° from fixation (12° on average), which probably accounts for the conservative strategy exhibited by both monkeys (there were more misses than false alarms; both monkeys, χ² tests, P < 10 × 10⁻⁵) and for the accuracy drop compared with previous studies from our group [usually 90% correct (20, 21)]. In fact, a similar drop in accuracy was also observed in the 11 human subjects performing the same task with the same test stimuli: they averaged only 80% correct, compared with the 94% correct usually reached by humans on novel scenes (19, 32, 34).

Thus, despite their long training with commercial photographs that mostly associate animals with natural backgrounds, the two monkeys were clearly able to use object information and to ignore background information when categorizing new scenes. From the very first presentation of the manipulated stimuli, the bias attributable to the scene background category only accounted for ∼2–5% of the global accuracy (Fig. 1D), despite the relatively small size of most objects within the scenes. Such immediate generalization for new man-made objects and animal exemplars presented in unusual scene contexts rules out the possibility that scene background regularities alone could explain performance.

Such results obtained in monkeys show that superordinate representations supporting abstract concepts do not necessarily require high-level functions such as linguistic abilities or even elaborate processing far from perceptual modalities (23, 35, 36). Recent results indicate that superordinate categories are the first to be accessed within the visual modality (37, 38). In fact, several processing schemes have been proposed that combine such coarse-to-fine visual processing (26, 39–41) with fast decision mechanisms. For example, coarse visual information could be rapidly conveyed within the ventral pathway (21) or through the dorsal visual pathway (39) to frontal cortices (42) in order to prime object representations and facilitate subsequent detailed analysis in the ventral visual pathway. Another possibility is that midlevel areas in the ventral visual pathway might use intermediate representations that could be sufficient for categorical judgments (43–45). In both cases, coarse visual information is presumed to be sufficient to trigger prefrontal cortex activity that reflects the precise delineation of categorical boundaries relevant for task performance, as proposed by Freedman, Miller, and collaborators (6, 8).

Object and Context Congruency.

Are superordinate visual representations similar in humans and macaques, and do they involve analogous mechanisms? To address these questions, we investigated the interference between object and background categories. Stimuli in which object and context belonged to matching superordinate categories (animals pasted on natural backgrounds, or man-made objects on man-made backgrounds) were defined as categorically congruent; conversely, stimuli that embedded animals in man-made backgrounds or man-made objects in natural backgrounds were considered noncongruent (31).

From the very first stimulus presentations, both monkeys and humans exhibited an 8% accuracy advantage for congruent compared with noncongruent test stimuli (Table S1). Similar object/context congruence biases have previously been described in humans for rapid scene categorization using color stimuli (27–31); such congruence effects were observed even in the earliest behavioral responses, leading to the suggestion that they could result from feed-forward facilitation between neuronal populations that are usually coactivated because of their selectivity to visual features that are highly likely to co-occur within our environment (27). Following this hypothesis, we predicted that this congruence bias would be robust to short-term practice, independently of any improvement in task performance.

Monkey and human performances were thus compared for three consecutive sessions in which all test stimuli were seen only once a day. The monkeys then performed ad libitum for several daily sessions on test stimuli only (9 sessions for Dy and 15 for Rx). As illustrated in Fig. 2A, we observed a significant accuracy increase with practice in both species, although it was less pronounced in monkey Dy. Such accuracy increase with scene familiarity could be due to a higher success rate on difficult stimuli (34), in particular for incongruent object/background associations. However, no interaction between congruence effect and task practice was observed in either species. In monkeys, a two-way ANOVA (congruence × session) paired by subjects yielded a main effect of categorical congruence [F(1, 43) = 90, P < 10 × 10⁻⁴, η² = 0.17] and session [F(10, 43) = 14, P < 10 × 10⁻⁴, η² = 0.26], without interaction between congruence and session [F(10, 43) = 2, P > 0.1, η² = 0.03]. A similar ANOVA performed over the group of human subjects showed significant effects of categorical congruence [F(1, 65) = 299, P < 10 × 10⁻⁴, η² = 0.39] and session [F(2, 65) = 90, P < 10 × 10⁻⁴, η² = 0.24] and no interaction [F(2, 65) = 3, P > 0.05, η² = 0.009]. Similar results were obtained on reaction times (SI Materials and Methods). Thus, although global performance could improve, the impairment observed with incongruent object/background associations was not reduced by practice in either species. This result reinforces the hypothesis, mentioned above, of hard-wired (dis)facilitation mechanisms between neurons selective for visual features (not) belonging to the same superordinate categories.
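For readers who want to reproduce this kind of analysis, here is a minimal sketch of a two-way ANOVA with eta-squared effect sizes using Python's statsmodels; the data-frame layout and file name are assumptions for illustration, not the authors' actual pipeline.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical layout: one row per subject x session x congruence cell,
# with columns 'accuracy' (percent), 'congruence', and 'session'.
df = pd.read_csv("session_accuracy.csv")

model = ols("accuracy ~ C(congruence) * C(session)", data=df).fit()
table = sm.stats.anova_lm(model, typ=2)
# Eta squared for each effect: SS_effect / SS_total.
table["eta_sq"] = table["sum_sq"] / table["sum_sq"].sum()
print(table)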

Fig. 2.

Global Animal/Non-Animal task performance computed separately for categorically congruent (blue) and noncongruent (orange) object/context associations, reached by human (triangle) and monkey (circle and square) subjects. (A) Performance accuracy for consecutive sessions (test stimuli, all trials). Both species exhibited an accuracy increase with practice (linear regressions: humans, coefficient = 3.2, R² = 0.39, df = 2, P < 10 × 10⁻⁴; Rx, coefficient = 0.8, R² = 0.79, df = 16, P < 10 × 10⁻⁶; Dy, coefficient = 0.4, R² = 0.24, df = 10, P = 0.06) but no interaction between categorical congruence and practice (see text). (B) Accuracy and mean reaction time as a function of object surface, assessed as a percentage of the whole image; stimuli were divided into four sets of equal size. The accuracy reached by computer simulations (dashed lines) was computed for the Animal vs. Non-Animal categorization task performed on the test stimulus set, after learning had been achieved by using the monkeys' familiar training image set. Dotted line represents chance level; error bars indicate SEM between object/background associations.

Are scene statistics the visual features that account for background interaction with object category? We tested this hypothesis using Oliva and Torralba's classification model (24) on the stimuli of the current experiment. This model efficiently implements the major principles of scene gist recognition, considering the global scene features (25) used in a number of recent scene-recognition models (e.g., refs. 46 and 47). First, the simulation was successful in selecting the animal scenes within a set of familiar images that monkeys had categorized over the years (768 photographs, model average accuracy 74%; note, however, that Rx and Dy scored 95% and 96%, respectively, on such familiar images). Second, it was successful in distinguishing between man-made and natural scene backgrounds by using the current 768 test stimuli (averaging 83% accuracy). These results stress the fact that processing image statistics could be a straightforward mechanism for using contextual information. However, the model failed completely at categorizing the manipulated test stimuli as containing animals: it only reached 53% accuracy on congruent and 50% on noncongruent object/context associations (average accuracy vs. chance level, χ² test, not significant; congruent vs. noncongruent, χ² test, P < 0.005).
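The simulations used Oliva and Torralba's own code (24). As a rough illustration of the underlying principle, the sketch below computes a crude gist-like descriptor (Gabor energy averaged over a 4 × 4 spatial grid) and feeds it to a linear classifier. It is a simplified stand-in under our own assumptions, not the actual spatial-envelope model, whose windowed Fourier features and training details differ.

import numpy as np
from skimage.filters import gabor
from skimage.transform import resize
from sklearn.linear_model import LogisticRegression

def gist_like(img, freqs=(0.1, 0.2, 0.3), n_orient=8, grid=4, size=128):
    """Crude gist descriptor: Gabor energy averaged on a grid x grid mesh."""
    img = resize(img, (size, size), anti_aliasing=True)
    feats = []
    for f in freqs:
        for k in range(n_orient):
            real, imag = gabor(img, frequency=f, theta=k * np.pi / n_orient)
            energy = np.hypot(real, imag)
            # Block-average the filter energy within each grid cell.
            cells = energy.reshape(grid, size // grid, grid, size // grid)
            feats.append(cells.mean(axis=(1, 3)).ravel())
    return np.concatenate(feats)

# X = np.stack([gist_like(im) for im in images]); y: 1 = animal present.
# clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
# print(clf.score(X[test], y[test]))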

This failure, however, could be accounted for by the smaller animal sizes used here relative to previous experiments. We tested this hypothesis by computing simulation results by quartiles of object size, corresponding to 2%, 4%, 7%, and 11% of the whole image. Although model accuracy increased mildly with object surface, computer simulations reached only 54% accuracy on the largest objects: the model based on scene statistics failed at object categorization.
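Binning results by object-size quartiles is a one-liner in standard analysis tools; a sketch follows, with hypothetical column and file names.

import pandas as pd

# Hypothetical trial table: object surface as a fraction of the image,
# and a 0/1 flag for correct model (or subject) classification.
trials = pd.read_csv("trials.csv")
trials["size_q"] = pd.qcut(trials["object_surface"], 4,
                           labels=["Q1", "Q2", "Q3", "Q4"])
print(trials.groupby("size_q", observed=True)["correct"].mean())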

In contrast, the same analysis performed on the subjects' behavioral data showed a marked size effect that was similar in both species: performance improved with increasing object size for both accuracy and mean reaction time (Fig. 2B; ANOVAs are detailed in SI Materials and Methods). Interestingly, the accuracy advantage for congruent compared with noncongruent stimuli was highest for small objects (both species, >9%), whereas the smallest accuracy advantage was observed for the biggest objects (both species, <2%; Table S2 and ANOVAs in SI Materials and Methods). The level of interference between context category and object processing was thus similarly related, in both species, to the background and object surfaces.

These results suggest that background statistics could play a key role in the context interference observed when humans and monkeys categorize objects in natural scenes. Even if background statistics are not the diagnostic cues used by either species for object categorization, they could play a role analogous to the contextual cueing reported by Chun and Jiang in visual search tasks (48): these authors showed that repeated associations between spatial configurations and target locations helped subjects in spatial attention tasks, even though subjects were not aware of these repeated associations and were not subsequently able to explicitly recognize them. In the present case, similar implicit learning could have occurred through repeated exposures to natural backgrounds associated with the presence of animals. Such implicit learning could thus involve low-level visual cues, including global scene statistics, consistent with their proposed role in triggering the fastest responses during context-categorization tasks (49).

These data suggest a high degree of analogy between monkeys and humans in the way they use visual cues to access categorical representations, in the underlying cerebral mechanisms, and in their impact on behavior. They suggest the existence of analogous context and object superordinate representations in both species. This proposition and its consequences in terms of processing speed are tested further below.

Fast Access to Abstract Representations and Their Local Interaction.

If the processing mechanisms responsible for object/context interactions are analogous in humans and monkeys, any advantage in terms of processing speed should be similar in the two species. More precisely, the underlying hypothesis of feed-forward facilitation implies that this processing-speed advantage should be significant from the fastest responses onward. Indeed, in humans, previous investigations of the influence of object/context congruence on rapid go/no-go scene categorization (27, 31) reported that scenes in which the object conflicted with the surrounding background required an additional processing time of ∼10–20 ms for the earliest responses. We thus took advantage of the severe time constraints imposed by the task to investigate whether the early response onset would be delayed when object and context categories are in conflict. We assumed that this delay, previously documented in humans, would also exist in macaques, despite the fact that the fastest reaction times were associated with the biggest objects and that the biggest objects were associated with the smallest congruence effect on average performance.

This analysis required a large number of trials and therefore could not be restricted to first-trial performance. For both species, all go responses over all sessions were binned into 10-ms time bins. Minimal reaction time (corresponding to the minimal input–output processing time) was then determined as the first time bin in which correct responses significantly exceeded false alarms, using binomial tests. This temporal analysis was performed for each subject individually.
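A minimal reconstruction of this minimal-reaction-time estimate is sketched below. The exact test parameters (alpha level, handling of unequal target/distractor counts) are not given in the text, so those choices here are assumptions.

import numpy as np
from scipy.stats import binomtest

def minimal_rt(hit_rts, fa_rts, n_targets, n_distractors,
               bin_ms=10, t_max=1000, alpha=0.05):
    """First 10-ms bin in which hits significantly exceed false alarms."""
    edges = np.arange(0, t_max + bin_ms, bin_ms)
    hits, _ = np.histogram(hit_rts, bins=edges)
    fas, _ = np.histogram(fa_rts, bins=edges)
    # Under H0 (no discrimination), a go response in a bin falls on a
    # target with probability proportional to the number of target trials.
    p0 = n_targets / (n_targets + n_distractors)
    for t, h, f in zip(edges[:-1], hits, fas):
        n = int(h + f)
        if n and binomtest(int(h), n, p0, alternative="greater").pvalue < alpha:
            return t  # lower edge of the first significant bin, in ms
    return None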

Individual results showed that monkeys Rx and Dy exhibited, respectively, 30-ms and 20-ms delays between minimal reaction time on congruent and noncongruent object/context associations (Table S3 and Fig. S1). In humans, similar results were observed despite a higher variability (0- to 40-ms individual delays), but the reverse effect was never observed; this intrinsic individual variability was likely enhanced by the fact that human subjects performed far fewer trials than the monkeys did. Interestingly, in all subjects, the earliest false alarms were produced when natural backgrounds were presented: scene backgrounds efficiently biased behavior from the fastest responses.

Pooled across subjects (Fig. 3A), the distributions of go responses showed a general 50-ms advantage for monkeys compared with humans. However, an equal delay of 30 ms in minimal reaction time was observed in both species between the responses to congruent and noncongruent stimuli (monkeys, 200 vs. 230 ms; humans, 250 vs. 280 ms): when the object category conflicts with the surrounding context, the additional processing time is similar in the two species from the earliest responses. This finding was further confirmed by a d′ analysis designed to evaluate how accuracy varies with response latency independently of the subjects' strategies. Stimulus detectability was computed over time by using cumulative d′ scores, computing d′ = z(hit) − z(fa) in 10-ms time bins (Fig. 3A, Inset). The data showed that performance with congruent object/context associations reached higher d′ values than with noncongruent ones, with a similar temporal shift toward shorter response latencies for humans and monkeys. Considered in the frame of signal detection theory, this result indicates that human and macaque cerebral mechanisms are equally sensitive to the congruence of the visual features that determine object and context categories.
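The cumulative d′ curve can be reconstructed directly from the definition given in the text; the rate-clipping rule below, needed to avoid infinite z-scores, is our own assumption.

import numpy as np
from scipy.stats import norm

def cumulative_dprime(hit_rts, fa_rts, n_targets, n_distractors,
                      bin_ms=10, t_max=1000):
    """Cumulative d' = z(hit rate) - z(false-alarm rate) in 10-ms steps."""
    hit_rts, fa_rts = np.asarray(hit_rts), np.asarray(fa_rts)
    times = np.arange(bin_ms, t_max + bin_ms, bin_ms)
    eps = 0.5 / max(n_targets, n_distractors)  # keep z-scores finite
    d = []
    for t in times:
        hr = np.clip((hit_rts <= t).sum() / n_targets, eps, 1 - eps)
        fr = np.clip((fa_rts <= t).sum() / n_distractors, eps, 1 - eps)
        d.append(norm.ppf(hr) - norm.ppf(fr))
    return times, np.array(d)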

Fig. 3.

Reaction-time distributions in the Animal/Non-Animal task computed separately for categorically congruent (blue) and noncongruent (orange) object/context associations. (A) Performance speed of humans (thin lines) and macaques (thick lines). Minimal reaction time (vertical dotted lines) was defined as the earliest 10-ms time bin in which hits (continuous lines) significantly outnumbered false alarms (dashed lines). (Inset) Cumulative d′ curves of human and monkey responses to congruent and noncongruent object/context associations. (B) Distribution of reaction times as a function of object surface, assessed as a percentage of the whole image. For both species, the object surface and associated minimal reaction times are indicated by the gray-level code. (C) Distribution of reaction times as a function of object surface (thick lines, large surfaces; thin lines, small surfaces; dashed lines, false alarms) when objects were categorically congruent (blue) or not (orange) with their background context. Associated minimal reaction times are indicated for both species.

To quantify how object size could affect the congruence effect from the fastest responses, we computed minimal reaction times for stimuli containing either the largest or the smallest objects (Fig. 3B). Between the earliest responses to small and large objects, a 30-ms delay was observed in both species, showing that "larger is faster" (50) holds not only for average performance but also for the fastest responses. For these large objects, object/context incongruence delayed the earliest responses by 10 ms in both species (Fig. 3C); this value can be compared with the 30-ms average delay for all object sizes in both species reported above.

Such similar delays in monkeys and humans are surprising. Direct behavioral comparisons have, until now, reported macaque response latencies at about two-thirds of human ones. This principle seems to hold for ocular and manual responses, both for absolute latencies and for the delays induced by task manipulations (21, 51, 52). A convincing explanation could be the slow speed of intracortical connections, which makes brain size a critical factor in determining the time of information transfer between cortical areas (20, 21). If that is the case, the similar delays in human and monkey responses reported here are the likely signatures of analogous local computations or, in other words, of analogous mechanisms embodied in restricted cortical regions all along the processing pathways. For example, the 30-ms delay related to object size observed during object categorization by humans and monkeys plausibly reflects similar neural integration mechanisms in the ventral pathways of the two species: indeed, Sripati and Olson (53) recently reported that the shape selectivity of macaque inferotemporal neurons can develop 30 ms earlier for large stimuli than for small ones. Here we propose that the highly similar temporal dynamics of object/context interactions observed behaviorally are the signature of analogous fast visual mechanisms that locally process features for object and scene category.

In summary, these results demonstrate, first, that macaque monkeys can genuinely perform Animal/Non-Animal categorization tasks based on animals as a category and that they can do so irrespective of scene-background content. Moreover, despite their faster reaction times, monkeys exhibited highly similar behavior to human subjects when facing object/context incongruence: a similar accuracy impairment in object categorization, similar reaction-time delays observed from the fastest responses, and similar sensitivity to the object/background surface ratio. In both species, the Animal representations used to perform the task are sensitive to background cues but are mostly related to animal feature information that must be generalized over a wide range of animal types and shapes. These superordinate representations presumably result from visual neuronal mechanisms that operate in a coarse-to-fine scheme, in which scene-background and object visual features are processed locally in parallel, with early interactions and competition for accumulating evidence in favor of a behaviorally relevant category (54).

Nevertheless, it might appear surprising that humans and macaques share these brain mechanisms and early representations. The extreme temporal constraints imposed by the task, in particular the short stimulus duration (50 ms) that restricts the time available for information uptake and the time allowed for response production (1 s), may have emphasized the similarity between humans and monkeys. There is little doubt that, with no time constraints, humans could use more sophisticated strategies that might allow them to perform the task at 100% correct; but when forced to rely on fast early processing of visual information, the cerebral mechanisms used by humans appear to be very similar to those used by monkeys, a "cognitive brick" that would be common to both species. Natural selection may have favored the development of facilitatory mechanisms between populations of object- and feature-selective neurons relatively early in visual processing. If so, one might expect these categorical abilities to be broadly shared across species.

Materials and Methods

Subjects and Initial Training.

Two male rhesus monkeys (Rx and Dy, both aged 14 y) performed a go/no-go task to categorize natural-scene pictures as containing (or not containing) animals. Initial training followed the procedure reported in Fabre-Thorpe et al. (20): learning was progressive, starting with 10 images and gradually introducing new scenes every day over a period of several weeks until both monkeys performed well on ∼300 stimuli. Although the monkeys' motivation and level of reward were kept stable by randomly interleaving familiar and new stimuli, the recurrent introduction of new stimuli (usually 10–20%) forced the monkeys to look for an underlying rule to produce the adequate response rather than to rely on stimulus memorization. Both monkeys had been trained for intermittent periods on the Animal/Non-Animal task since 2005, and monkey Rx had first been trained on a Food/Non-Food task beginning in 1996 (19–21, 55). For both monkeys, the set of familiar stimuli included at least 750 stimuli. All procedures conformed to French and European standards concerning the use of experimental animals; protocols were approved by the regional ethical committee for experimentation on animals (agreement ref. MP/05/05/01/05).

Eleven human subjects (aged 23–50 y, 37 y average, five females) performed the same categorization task using the same experimental setup. Stimulus size in pixels and display were identical, as were behavioral control (button release, screen touch, and the 3-s stimulus display that followed incorrect decisions). Correct decisions were indicated by a beeping noise only.

Stimuli.

In both familiar and test stimuli, the Animal and Man-Made Object superordinate categories included highly varied exemplars. The Animal category included mammals (57% and 65% in familiar and test stimuli, respectively), birds (23% and 15%), insects (2% and 6%), reptiles and amphibians (8%), and fish and crustaceans (10% and 6%). In the test stimuli, man-made objects included means of transport (16%), urban furniture (15%), house furniture (16%), kitchen and water containers (17%), tools and toys (13%), interior decoration (8%), and various other objects (15%). The familiar stimuli were commercial photographs in which animals were usually large (or even very large), centered, and focused by the photographer. In the test stimuli, a wide range of object sizes and locations were used. Views of all of the test stimuli with associated first-trial performances, along with a subset of familiar stimuli, are available at http://cerco.ups-tlse.fr/~denis/fizePNAS2011. For further details about test stimulus generation, see SI Materials and Methods.

Procedure.

In a session lasting about half an hour, human subjects performed the Animal/Non-Animal categorization task once, using the 768 test stimuli presented randomly. Each human subject performed three sessions on a daily basis to assess the robustness of the behavioral measures across three repetitions. For monkeys, test stimuli were introduced progressively, intermixed with familiar stimuli. Twelve daily sessions were needed to record monkey performance on the complete test stimulus set presented three times. Monkeys were further tested in daily sessions (9 for Dy and 15 for Rx) using the 768 test stimuli randomly presented ad libitum. Monkeys Dy and Rx performed, respectively, a total of 7,940 and 14,560 trials on test stimuli.

Computer Simulations.

Simulations used the code distributed by Oliva and Torralba (24) with the default software parameters. Each simulated task comprised 500 simulations; the performance accuracies indicated in the text are their average values (SEM ranged from 1.07% to 1.2%). Each simulation involved randomly shuffling the stimuli into two equal sets that were used for the subsequent learning and testing phases.
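A schematic version of this repeated split-half procedure is shown below; the synthetic data and the logistic-regression classifier are illustrative assumptions standing in for the model's own features and classifier.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in data: one descriptor per stimulus, binary labels.
X = rng.normal(size=(768, 512))
y = rng.integers(0, 2, size=768)

accs = []
for _ in range(500):  # 500 random split-half evaluations
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train, test = idx[:half], idx[half:]
    clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    accs.append(clf.score(X[test], y[test]))

sem = np.std(accs) / np.sqrt(len(accs))
print(f"mean accuracy {np.mean(accs):.3f} (SEM {sem:.3f})")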

More detailed descriptions of the methods can be found in SI Materials and Methods.

Supplementary Material

Supporting Information

Acknowledgments

We thank Simon Thorpe, Rufin VanRullen, Leila Reddy, and Thomas Serre for valuable comments on the manuscript. Computer simulations were possible thanks to the code freely provided by Aude Oliva and Antonio Torralba. This work was supported by the Centre National de la Recherche Scientifique; the Université de Toulouse, Université Paul Sabatier; and the Fondation pour la Recherche Médicale. The Délégation Générale de l'Armement provided financial support to M.C.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. R.T. is a guest editor invited by the Editorial Board.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1016213108/-/DCSupplemental.

References

1. Bovet D, Vauclair J. Judgment of conceptual identity in monkeys. Psychon Bull Rev. 2001;8:470–475. doi: 10.3758/bf03196181.
2. Herrnstein RJ, Loveland DH. Complex visual concept in the pigeon. Science. 1964;146:549–551. doi: 10.1126/science.146.3643.549.
3. Zayan R, Vauclair J. Categories as paradigms for comparative cognition. Behav Processes. 1998;42:87–99. doi: 10.1016/s0376-6357(97)00064-8.
4. Lazareva OF, Wasserman EA. In: Learning Theory and Behavior. Menzel R, editor. Oxford: Elsevier; 2008. pp. 197–226.
5. Minamimoto T, Saunders RC, Richmond BJ. Monkeys quickly learn and generalize visual categories without lateral prefrontal cortex. Neuron. 2010;66:501–507. doi: 10.1016/j.neuron.2010.04.010.
6. Freedman DJ, Riesenhuber M, Poggio T, Miller EK. Categorical representation of visual stimuli in the primate prefrontal cortex. Science. 2001;291:312–316. doi: 10.1126/science.291.5502.312.
7. Freedman DJ, Riesenhuber M, Poggio T, Miller EK. Visual categorization and the primate prefrontal cortex: Neurophysiology and behavior. J Neurophysiol. 2002;88:929–941. doi: 10.1152/jn.2002.88.2.929.
8. Cromer JA, Roy JE, Miller EK. Representation of multiple, independent categories in the primate prefrontal cortex. Neuron. 2010;66:796–807. doi: 10.1016/j.neuron.2010.05.005.
9. Vogels R. Categorization of complex visual images by rhesus monkeys. Part 1: Behavioural study. Eur J Neurosci. 1999;11:1223–1238. doi: 10.1046/j.1460-9568.1999.00530.x.
10. Schrier AM, Angarella R, Povar ML. Studies of concept formation by stumptailed monkeys: Concepts humans, monkeys, and letter A. J Exp Psychol Anim Behav Process. 1984;10:564–584.
11. Vauclair J, Fagot J. Categorization of alphanumeric characters by baboons (Papio papio): Within and between class stimulus discrimination. Curr Psychol Cogn. 1996;15:449–462.
12. Sigala N, Gabbiani F, Logothetis NK. Visual categorization and object representation in monkeys and humans. J Cogn Neurosci. 2002;14:187–198. doi: 10.1162/089892902317236830.
13. Logothetis NK, Pauls J, Poggio T. Shape representation in the inferior temporal cortex of monkeys. Curr Biol. 1995;5:552–563. doi: 10.1016/s0960-9822(95)00108-4.
14. Tanaka K. Inferotemporal cortex and object vision. Annu Rev Neurosci. 1996;19:109–139. doi: 10.1146/annurev.ne.19.030196.000545.
15. Weiskrantz L, Saunders RC. Impairments of visual object transforms in monkeys. Brain. 1984;107:1033–1072. doi: 10.1093/brain/107.4.1033.
16. Freedman DJ, Riesenhuber M, Poggio T, Miller EK. A comparison of primate prefrontal and inferior temporal cortices during visual categorization. J Neurosci. 2003;23:5235–5246. doi: 10.1523/JNEUROSCI.23-12-05235.2003.
17. Kiani R, Esteky H, Mirpour K, Tanaka K. Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol. 2007;97:4296–4309. doi: 10.1152/jn.00024.2007.
18. Sigala N, Logothetis NK. Visual categorization shapes feature selectivity in the primate temporal cortex. Nature. 2002;415:318–320. doi: 10.1038/415318a.
19. Delorme A, Richard G, Fabre-Thorpe M. Ultra-rapid categorisation of natural scenes does not rely on colour cues: A study in monkeys and humans. Vision Res. 2000;40:2187–2200. doi: 10.1016/s0042-6989(00)00083-3.
20. Fabre-Thorpe M, Richard G, Thorpe SJ. Rapid categorization of natural images by rhesus monkeys. Neuroreport. 1998;9:303–308. doi: 10.1097/00001756-199801260-00023.
21. Macé MJ, Richard G, Delorme A, Fabre-Thorpe M. Rapid categorization of natural scenes in monkeys: Target predictability and processing speed. Neuroreport. 2005;16:349–354. doi: 10.1097/00001756-200503150-00009.
22. Roberts WA, Mazmanian DS. Concept learning at different levels of abstraction by pigeons, monkeys, and people. J Exp Psychol Anim Behav Process. 1988;14:247–260.
23. Fabre-Thorpe M. Visual categorization: Accessing abstraction in non-human primates. Philos Trans R Soc Lond B Biol Sci. 2003;358:1215–1223. doi: 10.1098/rstb.2003.1310.
24. Oliva A, Torralba A. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int J Comput Vis. 2001;42:145–175.
25. Oliva A, Torralba A. Building the gist of a scene: The role of global image features in recognition. Prog Brain Res. 2006;155:23–36. doi: 10.1016/S0079-6123(06)55002-2.
26. Hughes HC, Nozawa G, Kitterle F. Global precedence, spatial frequency channels, and the statistics of natural images. J Cogn Neurosci. 1996;8:197–230. doi: 10.1162/jocn.1996.8.3.197.
27. Joubert OR, Fize D, Rousselet GA, Fabre-Thorpe M. Early interference of context congruence on object processing in rapid visual categorization of natural scenes. J Vis. 2008;8:11. doi: 10.1167/8.13.11.
28. Davenport JL, Potter MC. Scene consistency in object and background perception. Psychol Sci. 2004;15:559–564. doi: 10.1111/j.0956-7976.2004.00719.x.
29. Davenport JL. Consistency effects between objects in scenes. Mem Cognit. 2007;35:393–401. doi: 10.3758/bf03193280.
30. Fei-Fei L, Iyer A, Koch C, Perona P. What do we perceive in a glance of a real-world scene? J Vis. 2007;7:10. doi: 10.1167/7.1.10.
31. Joubert OR, Rousselet GA, Fize D, Fabre-Thorpe M. Processing scene context: Fast categorization and object interference. Vision Res. 2007;47:3286–3297. doi: 10.1016/j.visres.2007.09.013.
32. Thorpe S, Fize D, Marlot C. Speed of processing in the human visual system. Nature. 1996;381:520–522. doi: 10.1038/381520a0.
33. Minamimoto T, La Camera G, Richmond BJ. Measuring and modeling the interaction among reward size, delay to reward, and satiation level on motivation in monkeys. J Neurophysiol. 2009;101:437–447. doi: 10.1152/jn.90959.2008.
34. Fabre-Thorpe M, Delorme A, Marlot C, Thorpe S. A limit to the speed of processing in ultra-rapid visual categorization of novel natural scenes. J Cogn Neurosci. 2001;13:171–180. doi: 10.1162/089892901564234.
35. Barsalou LW, Kyle Simmons W, Barbey AK, Wilson CD. Grounding conceptual knowledge in modality-specific systems. Trends Cogn Sci. 2003;7:84–91. doi: 10.1016/s1364-6613(02)00029-3.
36. Humphreys GW, Forde EM. Hierarchies, similarity, and interactivity in object recognition: “Category-specific” neuropsychological deficits. Behav Brain Sci. 2001;24:453–476, discussion 476–509.
37. Macé MJ, Joubert OR, Nespoulous JL, Fabre-Thorpe M. The time-course of visual categorizations: You spot the animal faster than the bird. PLoS ONE. 2009;4:e5927. doi: 10.1371/journal.pone.0005927.
38. Fabre-Thorpe M. In: The Making of Human Concepts. Mareschal D, Quinn PC, Lea S, editors. Oxford: Oxford Univ Press.
39. Bullier J. Integrated model of visual processing. Brain Res Brain Res Rev. 2001;36:96–107. doi: 10.1016/s0165-0173(01)00085-6.
40. Schyns PG, Oliva A. From blobs to boundary edges: Evidence for time- and spatial-scale-dependent scene recognition. Psychol Sci. 1994;5:195–200.
41. Navon D. Forest before trees: The precedence of global features in visual perception. Cognit Psychol. 1977;9:353–383.
42. Bar M, et al. Top-down facilitation of visual recognition. Proc Natl Acad Sci USA. 2006;103:449–454. doi: 10.1073/pnas.0507062103.
43. Mirabella G, et al. Neurons in area V4 of the macaque translate attended visual features into behaviorally relevant categories. Neuron. 2007;54:303–318. doi: 10.1016/j.neuron.2007.04.007.
44. Ullman S, Vidal-Naquet M, Sali E. Visual features of intermediate complexity and their use in classification. Nat Neurosci. 2002;5:682–687. doi: 10.1038/nn870.
45. Delorme A, Richard G, Fabre-Thorpe M. Key visual features for rapid categorization of animals in natural scenes. Front Psychol. 2010;1:21. doi: 10.3389/fpsyg.2010.00021.
46. Grossberg S, Huang TR. ARTSCENE: A neural system for natural scene classification. J Vis. 2009;9:6. doi: 10.1167/9.4.6.
47. Siagian C, Itti L. Rapid biologically-inspired scene classification using features shared with visual attention. IEEE Trans Pattern Anal Mach Intell. 2007;29:300–312. doi: 10.1109/TPAMI.2007.40.
48. Chun MM, Jiang Y. Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognit Psychol. 1998;36:28–71. doi: 10.1006/cogp.1998.0681.
49. Joubert OR, Rousselet GA, Fabre-Thorpe M, Fize D. Rapid visual categorization of natural scene contexts with equalized amplitude spectrum and increasing phase noise. J Vis. 2009;9:2. doi: 10.1167/9.1.2.
50. Vogels R. Visual perception: Larger is faster. Curr Biol. 2009;19:R691–R693. doi: 10.1016/j.cub.2009.07.020.
51. Busettini C, Fitzgibbon EJ, Miles FA. Short-latency disparity vergence in humans. J Neurophysiol. 2001;85:1129–1152. doi: 10.1152/jn.2001.85.3.1129.
52. Fischer B, Weber H. Express saccades and visual attention. Behav Brain Sci. 1993;16:553–567.
53. Sripati AP, Olson CR. Representing the forest before the trees: A global advantage effect in monkey inferotemporal cortex. J Neurosci. 2009;29:7788–7796. doi: 10.1523/JNEUROSCI.5766-08.2009.
54. Perrett DI, Oram MW, Ashbridge E. Evidence accumulation in cell populations responsive to faces: An account of generalisation of recognition without mental transformations. Cognition. 1998;67:111–145. doi: 10.1016/s0010-0277(98)00015-8.
55. Macé M, Delorme A, Richard G, Fabre-Thorpe M. Spotting animals in natural scenes: Efficiency of humans and monkeys at very low contrasts. Anim Cogn. 2010;13:405–418. doi: 10.1007/s10071-009-0290-4.
