eLife. 2018 Oct 12;7:e36310. doi: 10.7554/eLife.36310

Perceptual processing in the ventral visual stream requires area TE but not rhinal cortex

Mark AG Eldridge 1, Narihisa Matsumoto 2, John H Wittig Jnr 3, Evan C Masseau 1, Richard C Saunders 1, Barry J Richmond 1
Editors: Lila Davachi 4, Eve Marder 5
PMCID: PMC6207425  PMID: 30311907

Abstract

There is an on-going debate over whether area TE, or the anatomically adjacent rhinal cortex, is the final stage of visual object processing. Both regions have been implicated in visual perception, but their involvement in non-perceptual functions, such as short-term memory, hinders clear-cut interpretation. Here, using a two-interval forced choice task without a short-term memory demand, we find that after bilateral removal of area TE, monkeys trained to categorize images based on perceptual similarity (morphs between dogs and cats), are, on the initial viewing, badly impaired when given a new set of images. They improve markedly with a small amount of practice but nonetheless remain moderately impaired indefinitely. The monkeys with bilateral removal of rhinal cortex are, under all conditions, indistinguishable from unoperated controls. We conclude that the final stage of the integration of visual perceptual information into object percepts in the ventral visual stream occurs in area TE.

Research organism: Rhesus macaque

Introduction

An intriguing property of the visual system is how easily and effortlessly we perceive objects (sensory processing), and discriminate among those of similar appearance, even when the particular exemplar has never been seen before. We quickly distinguish any red tomato from any red apple, regardless of the variety. A half-century of behavioral, anatomical and physiological research has revealed that this feat of visual perception is supported by a sequence of connected brain regions stretching from the occipital cortex to inferior temporal cortex (Figure 1A) (Gross et al., 1972; Ungerleider and Mishkin, 1982). Simple features, such as oriented edges or lines, are represented in caudal regions, beginning with area V1 (Hubel and Wiesel, 1959). Conjunctions of features defining whole objects are represented in rostral regions, culminating with area TE (Kobatake and Tanaka, 1994). Directly adjacent to area TE is rhinal cortex, an anatomically distinct region that receives dense projections from area TE (Suzuki and Amaral, 1994), and is thought to be important primarily for memory function (Meunier et al., 1993; Higuchi and Miyashita, 1996). It has more recently been suggested that rhinal cortex is important for visual perception of complex objects or objects with overlapping features, and hence might be considered the final stage of the visual perceptual processing hierarchy (Buckley et al., 2001; Bussey et al., 2003; Baxter, 2009; but see Hampton, 2005; Suzuki, 2009).

Figure 1. Background and task.

(A) Ventral visual stream - simple features represented in primary visual cortex (green). Increasing complexity of representations in intermediate areas, culminating in the representation of whole objects in area TE (yellow, bounded by dashed line). Immediately rostro-ventral to TE is rhinal cortex (Rh) (red, bounded by solid line - n.b. medial portion of rhinal cortex not visible from this angle). See figure supplement for rhinal cortex reconstructions. (B) A single trial from the perceptual categorization task (see supplemental methods for details). (C) Examples of the cat-dog morphed images presented as visual stimuli in Experiment 1.

Figure 1—figure supplement 1. Estimates of the extent of the aspiration lesions of the three monkeys in the TE-lesioned group (top), and rhinal-lesioned group (bottom) are plotted on coronal sections at the indicated levels, and reconstructed onto lateral/ventral views of the macaque brain, respectively; reconstructions for each case are shown at the bottom of each column.

Lesions were reconstructed using MR images (see Matsumoto et al., 2016, for details). Across the groups, lesions largely covered the areas of interest, and damage to adjacent structures was minimal and distributed idiosyncratically across monkeys. TE lesion reconstruction figure modified from Matsumoto et al., 2016, Figure 1; copyright remains with the authors.

To determine whether visual object processing is finalized in area TE, or whether it extends to rhinal cortex, categorization was tested at several levels of perceptual difficulty in a series of experiments using stimuli with overlapping features. All tasks required remembering visual perceptual categories. However, in every trial, the monkeys responded while the stimulus was present, thereby minimizing demands on short-term memory.

Results

We tested three groups of three monkeys: an unoperated control group, a group with a bilateral removal of area TE, and a group with a bilateral removal of rhinal cortex (including peri- and ento-rhinal cortex). Monkeys were trained to perform a cat vs. dog category discrimination in a visually cued two-interval forced choice (2-IFC) paradigm (Figure 1B) (Matsumoto et al., 2016). They were required to judge whether an image was more dog-like or cat-like when presented with stimuli drawn from a set of category-ambiguous ‘morphed’ images, created by blending and warping cats with dogs in different proportions. The monkeys responded while a stimulus was present, thereby minimizing demands on short-term memory.

Experiment 1 – learning to categorize morphed stimuli

In Experiment 1, monkeys were presented with a set of stimuli comprising cats, dogs, and intermediate morphed images spaced at 10% increments on an arbitrary scale (Abrosoft, Beijing, China) from 0% to 100% dog (Figure 1C). Monkeys could avoid an extended inter-trial delay by releasing the bar in the first interval (signaled by a red target) for stimuli that were less than 50% dog, and were rewarded for releasing the bar in the second interval (signaled by a green target) for stimuli that were more than 50% dog. This amounts to an asymmetrical reward structure. For 50:50 morphs, releases during the green interval were rewarded at random. The monkeys in all three groups classified most stimuli well the first time they were presented (Figure 2A,B,C). The control and rhinal-removal groups improved quickly with repetition, reaching asymptotic performance by the 10th repetition of the stimulus set (Figure 2D). The TE-removal group was slower to learn, only reaching the level of performance of controls by the 14th repetition of the stimulus set. Across presentations, the group with TE removals categorized less accurately than the control group (Linear Mixed Effects model, LME, p=0.0072, z = 2.69). The deficit seen in the TE-removal group might be attributable to slower learning, as there was a three-way interaction of number of presentations, treatment group (Ctl vs. TE), and morph level (LME, p=1.07×10⁻⁸, z = −5.72). The performance of the group with rhinal removals was indistinguishable from that of controls (LME, p=0.23, z = −1.19).
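As a reading aid for the statistics above, the following sketch converts a z statistic into a two-sided p value. It assumes that the reported z values are referred to a standard normal distribution, which the text does not state explicitly; the function name `two_sided_p` is illustrative only.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a z statistic under a standard normal null.

    Assumption: the LME z statistics are compared against N(0, 1);
    this is not stated in the text.
    """
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# The three z values reported for Experiment 1
for z in (2.69, -5.72, -1.19):
    print(f"z = {z:+.2f} -> p = {two_sided_p(z):.3g}")
```

Under this assumption, the three z values reported for Experiment 1 give p values close to those quoted in the text.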

Figure 2. Experiment 1.

(A, B, C) Categorization performance of control (n = 3), TE-lesioned (n = 3), and rhinal-lesioned (n = 3) groups, respectively, during the first 10 presentations of this stimulus set (10 of 16 total presentations plotted for clarity). A steeper gradient to the central portion of the sigmoid indicates higher classification accuracy. Data fit with the function: a + b/(1 + exp(c * x + d)), where a, b, c, and d are free parameters. (D) The maximum slope (±s.e.m.) of the fitted functions in A, B and C plotted across presentations, fitted with a quadratic function for each group.
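The relationship between the fitted function and its maximum slope can be made concrete with a short sketch. The caption does not say how the maximum slope was obtained; the version below assumes it is the analytic maximum of the derivative, which for a + b/(1 + exp(c * x + d)) occurs where c * x + d = 0 and has magnitude |b * c|/4. The parameter values in the example are hypothetical, not fitted values from the study.

```python
import math


def sigmoid(x, a, b, c, d):
    """Four-parameter logistic from the caption: a + b / (1 + exp(c*x + d))."""
    return a + b / (1.0 + math.exp(c * x + d))


def midpoint(c, d):
    """Morph level at which the curve is steepest (where c*x + d = 0)."""
    return -d / c


def max_slope(b, c):
    """Magnitude of the steepest slope of the curve.

    The derivative is -b*c*exp(u) / (1 + exp(u))**2 with u = c*x + d,
    and exp(u) / (1 + exp(u))**2 peaks at 1/4 when u = 0.
    """
    return abs(b * c) / 4.0


# Hypothetical parameters: curve rises from 0 to 1 with its midpoint at 50% dog
a, b, c, d = 0.0, 1.0, -0.2, 10.0
print(midpoint(c, d))   # morph level of the steepest point
print(max_slope(b, c))  # change in proportion 'dog' responses per 1% morph
```

A larger value of `max_slope` corresponds to the steeper central portion of the sigmoid that the caption associates with higher classification accuracy.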

Figure 2—source data 1. Experiment 1 - learning to categorize morphed images.
DOI: 10.7554/eLife.36310.005

Experiment 2 – area TE removal impairs perceptually difficult categorization

In Experiment 2, we examined sensitivity to perceptual ambiguity at the category boundary, where classification should be most difficult. The stimuli used in Experiment 2 were derived from the same cat and dog pairs used in Experiment 1. We continued to explore the full range of category space (from 0% to 100% dog), but biased the distribution of stimuli towards the category boundary (cat 50:50 dog) by presenting stimuli consisting of the following ratios of cat:dog: 100:0, 75:25, 65:35, 60:40, 55:45, 50:50, 45:55, 40:60, 35:65, 25:75, and 0:100 (Figure 3—figure supplement 1). The asymmetrical reward structure of the task design produced a bias in all treatment groups toward classifying ambiguous stimuli as dogs. At the 50% morph level, an unbiased subject would report the stimuli as dog-like 50% of the time; however, control monkeys reported the 50:50 morphs as more dog-like on 70% of presentations. The rhinal-lesioned group showed a similar bias, reporting 64% as more dog-like. The TE-lesioned group showed a bias in the same direction as the other two groups, reporting 50:50 morphs as more dog-like 61% of the time. For this experiment, we are interested in the ability to categorize visual stimuli accurately, which is measured by the steepness of the discrimination curve. To remove the confound of learning rate, and focus on perceptual processing, we continued testing the three different treatment groups until performance reached asymptotic levels. For Experiment 2, all groups had reached asymptotic performance by the tenth presentation of the stimuli. During presentations 11 to 20, the categorization accuracy of monkeys with bilateral aspiration removals of rhinal cortex was indistinguishable from that of controls (LME, p=0.49, z = −0.68). Monkeys with bilateral TE removals made significantly more incorrect assignments than the other two groups (LME, p=2.64×10⁻¹⁰, z = −6.32), but nonetheless categorized better than would be expected by chance (t test, p=0.00057, d.f. = 5, t = −7.49) (Figure 3A). It seems unlikely that the impairment in making visual category discriminations following TE removals could be attributed to a learning deficit, because performance of all three test groups was stable throughout the latter 10 presentations of the test stimuli (LME, effect of repetition, p=0.54, z = −0.61).

Figure 3. Experiment 2.

(A) Categorization performance of three groups of monkeys: controls (n = 3), TE-lesioned group (n = 3), and Rh-lesioned group (n = 3), mean (±s.e.m.) of presentations 11 to 20. Data fit with the function: a + b/(1 + exp(c * x + d)), where a, b, c, and d are free parameters. (B, C, D) Mean reaction times (±s.e.m.) of control, TE-lesioned, and rhinal-lesioned groups, respectively, for trials ending with a correct response. All trials at the 50% level are included. See figure supplement for stimulus examples.

Figure 3—source data 1. Experiment 2 - asymptotic categorization performance.
DOI: 10.7554/eLife.36310.009

Figure 3—figure supplement 1. The cat-dog morphed images presented as visual stimuli in Experiment 2.

A total of 20 morph series (one per row) were used for this experiment. Each stimulus was presented once per set, in pseudo-random order; the monkeys completed two sets each day.

Figure 3—figure supplement 2. (A) Categorization performance of individual monkeys performing Experiment 2.

Data fit with the function: a + b/(1 + exp(c * x + d)), where a, b, c, and d are free parameters. (B, C, D) Reaction times (±s.e.m.) of individual monkeys for trials ending with a correct response. All trials at the 50% level are included. Monkey-to-color mapping per the legend in ‘A’. Reaction times for bar releases during the presentation of the red target are represented by filled circles; reaction times for releases during the presentation of the green target are represented by open circles.

For all monkeys, the closer the stimulus was to the category boundary, the longer it took to release the lever on correct trials during the presence of the red target (LME, effect of morph level, p=0.0017, d.f. = 5, t = 5.23) (Figure 3B,C,D), supporting the inference that classification is more difficult near the category boundary. The latency to release following the green target was constant across difficulty levels, presumably because the monkeys had made the decision that the stimulus was dog-like earlier in the trial (LME, effect of morph level, p=0.40, d.f. = 5, t = 0.26). We interpret the response time to the green target as the basic visual-motor reaction time. Reaction times did not differ across groups, suggesting that TE removal slowed neither the decision process (reaction times during red target: LME, p=0.15, d.f. = 2, t = 1.35) nor the basic visual-motor reaction time (reaction times during green target: LME, p=0.48, d.f. = 2, t = −0.069). The deficits observed in the TE group are also unlikely to be due to a failure in basic visual acuity, because grating contrast sensitivity, a test designed to assess the visual acuity of human subjects (Blakemore and Campbell, 1969), was indistinguishable across all three groups (Ctl vs TE: LME, p=0.28, d.f. = 2, t = −0.69; Ctl vs Rh: LME, p=0.22, d.f. = 2, t = −0.95) (Figure 4), and similar to that of humans (Blakemore and Campbell, 1969).

Figure 4. Visual acuity testing.

Contrast sensitivity is plotted on a logarithmic scale against spatial frequency. Mean sensitivity (±s.e.m.) for each of the three groups of monkeys - controls (n = 3), TE-lesioned group (n = 3), and Rh-lesioned group (n = 3), (six technical replicates per monkey) - is fit with a quadratic function.

Figure 4—source data 1. Visual acuity testing.
DOI: 10.7554/eLife.36310.011

Motivation and attention did not seem to be altered in either lesion group because the reaction times of all groups were indistinguishable, and there were few late bar release errors (<1% on average) – the few late release errors that were recorded were not distributed significantly differently among the treatment groups (Kruskal-Wallis, χ2 = 2.49, p=0.29).

Experiment 3 – area TE impairment with visually degraded stimuli

Experiment 3 tested the possibility that monkeys with large cortical removals (TE/rhinal) had compensated for a deficit in object perception/categorization by memorizing one or more simple diagnostic features of each morph series (e.g. the ‘tail’ of the stimuli in the second row of Figure 1C). It has been demonstrated previously that the performance of monkeys with perirhinal cortex lesions is indistinguishable from controls in discriminating perceptually dissimilar stimuli obscured by masks (Hampton and Murray, 2002). The rationale for using the approach here, with perceptually similar stimuli, was that if the monkeys were relying on a single diagnostic feature of the morphed stimuli to establish category membership in Experiment 2, image degradation of this nature (Figure 5A) would have a detrimental effect on performance. Consistent with this hypothesis, monkeys with bilateral TE removals were severely impaired by the masks relative to their own overall performance in the inter-leaved unmasked trials (Figure 5D) (LME, p=2×10⁻¹⁶, z = 9.03). They categorized the masked stimuli significantly less accurately than controls (Figure 5B) (LME, p=0.0017, z = 3.13). The performance of controls and that of monkeys with rhinal removals was indistinguishable (LME, p=0.20, z = −1.27); both groups exhibited smaller reductions in classification accuracy than the TE group in the presence of the masks, relative to the inter-leaved unmasked trials (Figure 5C,E) (Ctl: LME, p=8.92×10⁻¹¹, z = 6.48; rhinal: LME, p=1.33×10⁻¹⁰, z = 6.42). Due to the asymmetrical reward structure of this and similar tasks, when monkeys fail to discriminate between the two offers they will usually resort to releasing the lever during the presence of the green target 100% of the time, as this is the only condition in which reward is available. This bias is visible in the impaired performance of the TE-lesioned monkeys following the foreground masking of the stimuli performed in Experiment 3.

Figure 5. Experiment 3.

(A) Examples of the visual stimuli presented. Four checker-board masks were placed over each of the stimuli used in Experiment 1, and presented inter-leaved with an unmasked version of each stimulus. (B) Categorization performance of the three test groups: mean (±s.e.m.) of responses to first presentation of all masked stimuli. (C, D, E) Categorization performance on masked (mean of all masks) vs. unmasked stimuli for each group, respectively (first presentation).

Figure 5—source data 1. Experiment 3 - categorization of visually degraded stimuli.
DOI: 10.7554/eLife.36310.014

Figure 5—figure supplement 1. (A, C, E) Categorization performance on unmasked stimuli presented in Experiment 3, for individual control (A), TE-lesioned (C), and rhinal-lesioned (E) monkeys.

(B, D, F) Performance on masked (mean of all masks) stimuli (first presentation) for each group: (B) control, (D) TE-lesioned, (F) rhinal-lesioned.

Experiment 4 – area TE impairment with a novel stimulus set

In Experiment 4, a large set of novel morphs was used as trial-unique stimuli in a single session to control for the possibility that the monkeys memorized individual stimulus-response outcomes in Experiments 1, 2 and 3. It has been demonstrated that visual discrimination of complex stimuli can be performed independently of area TE and of rhinal cortex if the stimuli have become sufficiently well learned (Eacott et al., 1994). We morphed a set of cat images each with two dog images, and vice versa (Figure 6A, Figure 6—figure supplement 1). This manipulation reduces the utility of a strategy focused on a single memorized feature (e.g. in Figure 6A, note the difference in appearance of the tail of ‘Cat A’ at the 50% morph level when paired with ‘Dog A’, versus the 50% level when paired with ‘Dog B’). Consistent with the results of Experiment 3, monkeys lacking area TE categorized stimuli significantly less accurately than controls (LME, p=2×10⁻¹⁶, z = 9.18), whereas those with rhinal cortex removals were indistinguishable from controls (Figure 6B) (LME, p=0.62, z = −0.50).

Figure 6. Experiment 4.

(A) Examples of the visual stimuli presented; each cat was morphed with two dogs, and vice versa, for example Cat A was morphed with Dog A (top row), and with Dog B (bottom row). Examples at the 0%, 50%, and 100% dog level are shown; the full set of stimuli used in Experiment 4 was distributed across the same morph levels as used in Experiment 1 (see figure supplement for a larger set of stimulus examples). (B) Mean categorization performance (±s.e.m.) of the three test groups with a single presentation of each stimulus.

Figure 6—source data 1. Experiment 4 - categorization of novel stimuli.
DOI: 10.7554/eLife.36310.018

Figure 6—figure supplement 1. Examples of the cat-dog morphed images presented as visual stimuli in Experiment 4.

Twelve morph series – one on each row – are shown. A total of 40 morph series were used for this experiment. Each stimulus was presented once, in pseudo-random order; the monkeys completed a single session.

Figure 6—figure supplement 2. Categorization performance of individual monkeys in Experiment 4, in which each stimulus was novel, and presented only once.

Discussion

According to the hierarchical model of object perception, area TE plays an important role in the integration of visual features into identifiable objects. The results from the experiments described above suggest that area TE is only important for perceiving and classifying objects with complex over-lapping features, especially when novel/unfamiliar. The monkeys with TE removals were surprisingly good at classifying all but the most ambiguous stimuli (i.e. those closest to the category boundary). Perception of simple, low-level features appears to be intact in the monkeys lacking TE as shown by their normal ability to detect the change in color of the target from red to green in Experiments 1 – 4, and by their normal contrast sensitivity when tested with black and white sine wave gratings.

There is a debate in both human and non-human primate literature about whether medial temporal lobe tissue - rhinal cortex - is important for the perception of feature-ambiguous stimuli (Lee et al., 2005; Levy et al., 2005; Buckley and Gaffan, 2006; Baxter, 2009; Suzuki, 2009; Lee and Rudebeck, 2010). The proposal that the visual object processing hierarchy includes perirhinal cortex arose from studies claiming a role for monkey perirhinal cortex in visual perception following tests of visual discrimination, oddity detection, or delayed-match-to-sample (DMS) at zero delay (Eacott et al., 1994; Buckley et al., 2001; Bussey et al., 2003). The aforementioned tasks required monkeys to compare an image with one or more specific images that must be maintained in memory (even in the case of the oddity task – in which stimuli appear simultaneously – the subject must maintain a representation of two or more features from three or more images to make a correct choice, a process that presumably requires the subjects to look at each image in turn and remember those already looked at). In ‘delayed matching’ tasks, monkeys with removals of rhinal cortex are only impaired when there is a delay of several seconds between the sample and test stimuli (Buffalo et al., 1999), and have impaired short-term memory but intact habit formation for the exact same images (Tu et al., 2011). In the present study, monkeys that had received lesions several years prior to testing (and hence compensatory reorganization in response to the testing series is unlikely) performed a task which only requires memory for a categorical exemplar or boundary, along with the category-response mapping. The anatomical extent of the rhinal removals was comparable to those described in the above-mentioned studies. Nonetheless, performance of rhinal-lesioned monkeys was indistinguishable from that of control monkeys by all measures. 
It could be argued that, because categorization tasks require generalization across perceptually similar stimuli, such tasks may be solved using features of intermediate complexity, rather than gestalt object representations (the latter being the type of perceptual function some have tried to ascribe to rhinal cortex (Bussey et al., 2003)). Although we cannot preclude this possibility, the large degree of feature overlap in our stimulus sets (cf. the 40% vs. 60% morphed stimuli in Figure 1C) means that there are no obvious intermediate features with which the task could be solved. When the discriminations are made more difficult (Experiments 3 and 4), there are no large changes in the response selections of the control and rhinal-lesioned groups. However, there are two changes in the response patterns of the TE-lesioned monkeys. First, their discrimination performance becomes considerably worse, as shown by the decrease in slope of the discrimination curve. Second, the monkeys change their criterion for classifying a stimulus as a dog, as shown by the shift in the 50% discrimination point to the left. Since these changes occur when the discrimination is made difficult, we can conclude that the monkeys have a discrimination deficit, and that when the discrimination is made more difficult they change their criterion by classifying more trials as dogs. The change in criterion is not surprising given the asymmetrical reward structure of the task. The most parsimonious interpretation of these observations is that monkeys assess each image in its entirety in order to make a categorical judgement, and hence it seems that the TE-rhinal boundary represents the end of perceptual processing in the ventral visual stream.

The lack of evidence for a role of rhinal cortex in perceptual processing reported here suggests that previous reports of deficits in visually guided behavior following rhinal removals may be attributable to memory deficits. This interpretation is supported by a number of studies demonstrating a role for rhinal cortex in mnemonic processes. In behavioral testing, monkeys lacking rhinal cortex are impaired in object recognition (Meunier et al., 1993), and learning stimulus-stimulus associations (Murray et al., 1993). They are also severely impaired in comparing reward values across time (Liu et al., 2000), even when vision plays no role in the task (Clark et al., 2012). Rhinal cortex neurons appear to be tuned for complex stimuli in a similar manner to those recorded in area TE. However, neuronal responses in TE generally exhibit task-invariant tuning for specific stimuli (McMahon et al., 2014), whereas neuronal responses in rhinal cortex are modulated by behavioral context (Yakovlev et al., 1998; Lehky and Tanaka, 2007; Naya and Suzuki, 2011; Eradath et al., 2015). For example, when the stimulus-mapping in a delay-discounting task is randomized, perirhinal neurons lose their selectivity on the next trial (Liu and Richmond, 2000). The loss of stimulus selectivity in perirhinal neurons following a contextual change implies that the apparent selectivity is derived from a learned association with predicted outcome.

Rhinal cortex neurons are also sensitive to learned stimulus-stimulus associations; they form a functional microcircuit that is dynamically modulated by task demand (Erickson and Desimone, 1999; Hirabayashi et al., 2013) and they convey information relevant to behavioral choice in a more accessible manner than those in TE (Pagan et al., 2013; Pagan and Rust, 2014). Furthermore, the delay between TE and perirhinal neural responses to visual stimuli is considerably longer (~60 ms) than is usually attributed to a direct feed-forward mechanism (10–15 ms) (Xiang and Brown, 1998; Liu and Richmond, 2000); this suggests that rhinal cortex is doing more than a simple monosynaptic transformation of visual input from TE neurons. In the current study, a variety of manipulations that increased the perceptual difficulty of a categorization task revealed significant, although far from catastrophic, deficits in monkeys with bilateral TE removals, and no deficits in monkeys with bilateral rhinal removals. Thus, the final stage of image processing in the ventral visual stream appears to occur in area TE. Rhinal cortex is critical for learning and remembering contextual or associative relations among stimuli or events, but appears to play no role in object percept formation.

Materials and methods

Subjects and surgeries

All experimental procedures conformed to the Institute of Medicine Guide for the Care and Use of Laboratory Animals and were performed under an Animal Study Proposal approved by the Animal Care and Use Committee of the National Institute of Mental Health. Subjects were nine adult male monkeys (Macaca mulatta). Three monkeys (5 – 6 years old, weighing 6.9 – 9.0 kg) had received bilateral aspiration removals of area TE 2 years prior to the commencement of this study; the reconstructions of these lesions have been published previously (Matsumoto et al., 2016). Three monkeys (7 years old, weighing 7.0 – 14.5 kg) received bilateral aspiration removals of rhinal cortex (Rh) (peri- and ento-rhinal cortex, comprising areas 28, 35, and 36 of Brodmann) 3 years prior to this study. Aspiration removals of rhinal cortex have been described previously (Meunier et al., 1993; Fritz et al., 2005); the Rh removals were largely as intended (Figure 1—figure supplement 1). Three monkeys (8 – 11 years old, weighing 7.8 – 9.5 kg) were unoperated controls.

Behavior

Monkeys sat in a primate chair inside a darkened, sound-attenuated testing chamber. They were positioned 57 cm from a computer monitor (Samsung 2233RZ) (Wang and Nikolić, 2011) subtending 40° × 30° of visual angle. Task timing and visual stimulus presentation were under the control of networked computers running, respectively, custom written (Real-time Experimentation and Control, REX (Hays et al., 1982)) and commercially available (Presentation, Neurobehavioral Systems) software for the design and control of behavioral experiments. Monkeys were initially trained to grasp and release a touch-sensitive bar to earn fluid rewards. After this initial shaping, a red/green color discrimination task was introduced (Bowman et al., 1996). Red/green trials began with a bar press; 100 ms later, a small red target square (0.5°) was presented at the center of the display (over-laying a white noise background). Animals were required to continue grasping the touch bar until the color of the target square changed from red to green. Color changes occurred randomly 500 – 1500 ms after bar touch. Rewards were delivered if the bar was released between 200 and 1000 ms after the color change; releases occurring either before or after this epoch were counted as errors. Thus, the color change occurs randomly within a 1000 ms time window, and the behavioral response must be made within 1000 ms of the color change; this design encourages the monkeys to use the color target to guide their responses. A strategy based on timing alone would result in a theoretical maximum of less than 50% correct. All correct responses were followed by visual feedback (target square color changed to blue) after bar release, and by reward delivery 200 – 400 ms after visual feedback. There was a 2-s inter-trial interval (ITI), regardless of the outcome of the previous trial.
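The response-window rule of the red/green task can be written out as a small function. This is a sketch for illustration, not the REX task code: the function and label names are hypothetical, times are measured in milliseconds from bar touch, and treating both edges of the 200 to 1000 ms window as inclusive is an assumption.

```python
import random

# Values taken from the task description (all in ms)
CHANGE_MIN_MS, CHANGE_MAX_MS = 500, 1500   # red-to-green change after bar touch
WINDOW_MIN_MS, WINDOW_MAX_MS = 200, 1000   # rewarded release latency after change


def draw_color_change_ms(rng):
    """Draw a color-change time, assuming a uniform distribution over the range."""
    return rng.uniform(CHANGE_MIN_MS, CHANGE_MAX_MS)


def classify_release(release_ms, color_change_ms):
    """Label a bar release as 'early', 'correct', or 'late'.

    Both arguments are times after bar touch. Window edges are treated
    as inclusive (an assumption; the text does not specify).
    """
    latency = release_ms - color_change_ms
    if latency < WINDOW_MIN_MS:
        return "early"
    if latency <= WINDOW_MAX_MS:
        return "correct"
    return "late"


rng = random.Random(1)
change = draw_color_change_ms(rng)
print(classify_release(change + 350, change))  # release 350 ms after the change
```

Only releases labeled "correct" would be followed by the blue feedback target and reward; "early" and "late" releases correspond to the two error types described above.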

After an animal reached criterion in the red/green task (2 consecutive days with >85% correct performance), a visual discrimination task was introduced. Each trial began when the animal grasped the touch bar; bar press was now initially followed (after 100 ms) by the presentation of a cue image. For training, the cues were two black and white block (‘Walsh’) patterns (13° × 13°). The cues signaled whether a release during the green target would result in the delivery of a drop of liquid reward, or a 4000 – 6000 ms ‘time-out’. The red target appeared 500 ms after the cue and changed color to green 2000 – 3000 ms later if the monkey continued to hold the bar. Monkeys could avoid the predicted outcome by releasing the lever before the red target transitioned to green; a new trial could then be initiated after the standard ITI. After the monkeys became acclimated to the incentive difference between the two black and white cues (2 consecutive days with >85% correct performance), they progressed to category training. For category training, the two black and white cues were replaced with two sets of cues, that is, 20 dogs as the rewarded set, and 20 cats as the unrewarded set. If the monkey released the lever during the green target when a dog was present, the monkey received one drop of liquid reward (Figure 1B). If the monkey released during the green target when a cat was present, no reward was delivered and there was a 4 – 6 s time-out. There was never a reward for releasing while the red target was present. The optimal behavior is to release during the presentation of the red target for the trials on which cats are presented, essentially skipping on to the next randomly selected trial, and release during the presentation of the green target for the trials on which dogs are presented to obtain the reward. This design is effectively a visually cued two-interval forced choice (2-IFC) task, with asymmetrical reward, in which the color of the central target indicates the current ‘choice window’. In the second phase of category training, monkeys were presented with four larger sets of trial-unique images (240 cats and 240 dogs), to confirm that the monkeys were able to classify stimuli based on visual perceptual categorization.

For the experiments with morphed stimuli, releasing the lever during the green target resulted in a 4 – 6 s time-out if the stimulus was more cat-like (i.e. <50% dog), and a reward if the stimulus was more dog-like (i.e. >50% dog). The outcome of trials on which a stimulus at the category boundary (i.e. 50% dog) was presented was determined probabilistically; 50% of trials resulted in reward delivery, 50% resulted in a time-out. In Experiments 1 and 2, the stimulus set was presented twice each day, over 8 and 10 days, respectively. In Experiment 3, the stimulus set was presented once over 2 days, with morph level and mask type counter-balanced across both days. The addition of the masked stimuli resulted in the set size for Experiment 3 being considerably larger than those used in the other experiments; testing was thus split over two sessions to ensure that the monkeys did not become sated or inattentive during performance. In Experiment 4, the stimulus set was presented once on a single day.
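
The outcome rule for morphed stimuli described above can be summarized in a short sketch. This is an illustration only, not the authors' control software; the function name `trial_outcome`, its argument names, and the returned labels are hypothetical.

```python
import random


def trial_outcome(percent_dog, released_on, rng=random):
    """Return 'reward', 'time-out', or 'none' for a single trial.

    percent_dog: morph level from 0 (pure cat) to 100 (pure dog).
    released_on: 'red' or 'green', the target color at lever release.
    rng: any object with a random() method (hypothetical hook for testing).
    """
    if released_on == "red":
        # Releasing on red is never rewarded; the trial is simply skipped.
        return "none"
    if released_on != "green":
        raise ValueError("released_on must be 'red' or 'green'")
    if percent_dog > 50:
        return "reward"      # more dog-like: one drop of liquid reward
    if percent_dog < 50:
        return "time-out"    # more cat-like: 4 - 6 s time-out
    # Category boundary (50% dog): outcome decided with probability 0.5.
    return "reward" if rng.random() < 0.5 else "time-out"
```

Under this rule, the policy the text describes as optimal is to release on red for stimuli below 50% dog and on green for stimuli above 50% dog.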

Visual cues

All visual cues were JPEG or PCX format photos (200 × 200 pixels). The training sets of dogs/cats used in this study are the same as in our previous report (Minamimoto et al., 2010). The images used in the four experiments were generated from a subset of the training images, in which pairs of cats and dogs were used to create cat-dog morph sequences using Fantamorph software (Abrosoft, Beijing, China). For Experiment 1, 20 dogs were morphed with 20 cats. From each cat-dog morph series, a set of stimuli comprising cats, dogs, and intermediate morphed images, spaced at 10% increments from 0% to 100% dog, was derived (Figure 1C). For Experiment 2, the same cat-dog morph series were used as in Experiment 1, but the distribution of stimuli was concentrated around the category boundary (cat 50:50 dog); stimuli consisted of the following ratios of cat:dog: 100:0, 75:25, 65:35, 60:40, 55:45, 50:50, 45:55, 40:60, 35:65, 25:75, and 0:100 (Figure 3, Figure 3—figure supplement 1). For Experiment 3, the stimuli from Experiment 2 were presented; on four fifths of trials, the stimuli were overlaid with one of four coarse black-block masks (Figure 5A). For Experiment 4, a new set of 20 cats and 20 dogs was used to create cat-dog morph series. Each cat was morphed to each of two dogs, and vice versa, for a total of 40 cat-dog morph series, to make it more difficult to discriminate among images from a morph series based on a single prominent feature (Figure 6A and 2).
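
As a rough illustration of how the morph levels in Experiments 1 and 2 are laid out, the sketch below enumerates them as percent dog. It assumes one morph series per cat-dog pair (20 series in Experiments 1 and 2), which the text implies but does not state outright; the names `EXP1_LEVELS`, `EXP2_LEVELS`, and `build_stimulus_set` are hypothetical.

```python
# Morph levels expressed as percent dog (0 = pure cat, 100 = pure dog).
# Experiment 1: 10% increments from 0% to 100% dog.
EXP1_LEVELS = list(range(0, 101, 10))

# Experiment 2: levels concentrated around the 50:50 category boundary,
# converted from the cat:dog ratios listed in the text.
EXP2_LEVELS = [0, 25, 35, 40, 45, 50, 55, 60, 65, 75, 100]


def build_stimulus_set(n_series, levels):
    """Enumerate (series_index, percent_dog) pairs for one experiment."""
    return [(series, level) for series in range(n_series) for level in levels]


# Assumption: 20 morph series, one per cat-dog pair.
exp1_stimuli = build_stimulus_set(20, EXP1_LEVELS)
exp2_stimuli = build_stimulus_set(20, EXP2_LEVELS)
```

Both experiments use 11 morph levels per series; they differ only in how those levels are spaced relative to the category boundary.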

Data analysis

Modeling and statistical methods were implemented in MATLAB (MathWorks) and R (R Core Team, 2017). Linear mixed effects analyses (lme4; Bates et al., 2015) were used to evaluate the relationship between categorization accuracy (recorded as the proportion of trials classified as ‘dog’) and treatment group (control, TE, or Rh). The fixed effects entered into the model were treatment group, morph level, and, where relevant, repetition (with interaction terms). As random effects, we included the intercept for subject. This approach assumes that variation in repeated measures data is due to both fixed (for example, treatment group) and random (for example, monkey) effects, allows independent variables to be treated as continuous (for example, morph level) or categorical (for example, treatment group), and allows for non-normal dependent measures (that is, categorization accuracy). The data followed a logistic distribution, so we used a binomial link function.
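
To make the structure of such a model concrete, the following sketch computes the predicted probability of a ‘dog’ classification from fixed effects of group and morph level plus a per-subject random intercept. It assumes a logit link (the default for a binomial family in lme4) and omits the repetition term; it does not fit anything and is not the authors' code. The function `p_dog` and all coefficient values are hypothetical, chosen only for illustration.

```python
import math


def p_dog(morph_level, group, subject_intercept, coef):
    """Predicted probability of a 'dog' classification under a logit link.

    morph_level: percent dog (0 to 100), centered at the 50% boundary below.
    group: 'control', 'TE', or 'Rh' (categorical fixed effect).
    subject_intercept: random intercept for one monkey.
    coef: dict of fixed-effect coefficients (hypothetical values).
    """
    slope = coef["morph"] + coef["morph_x_group"][group]
    eta = (coef["intercept"]
           + coef["group"][group]
           + slope * (morph_level - 50)
           + subject_intercept)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for illustration only (not fitted values).
example_coef = {
    "intercept": 0.0,
    "morph": 0.10,
    "group": {"control": 0.0, "TE": 0.0, "Rh": 0.0},
    "morph_x_group": {"control": 0.0, "TE": -0.05, "Rh": 0.0},
}
```

In lme4 terms, a model of this shape would correspond roughly to a formula such as `response ~ group * morph + (1 | subject)` with a binomial family, although the exact call used by the authors is not given in the text.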

Acknowledgements

We thank Alex Cummins for histology support, Grace Mammarella for lesion reconstructions, and Adin Horowitz for assistance with behavioral testing. We are grateful to Drs Alex Martin, Chris Baker and Christian Quaia for comments on the draft manuscript. This work was supported by the Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Department of Health and Human Services. The opinions expressed in this article are the authors’ own and do not necessarily reflect the views of the US National Institutes of Health, the Department of Health and Human Services, or the United States Government.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Mark AG Eldridge, Email: mark.a.g.eldridge@gmail.com.

Barry J Richmond, Email: barry.richmond@nih.gov.

Lila Davachi, New York University, United States.

Eve Marder, Brandeis University, United States.

Funding Information

This paper was supported by the following grant:

  • National Institute of Mental Health 1ZIAMH002032-41 to Mark A G Eldridge, Narihisa Matsumoto, John H Wittig, Evan C Masseau, Richard C Saunders, Barry J Richmond.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

Conceptualization, Resources, Data curation, Software, Supervision, Investigation, Methodology, Writing—original draft, Writing—review and editing.

Conceptualization, Resources, Data curation, Software, Investigation, Methodology, Project administration.

Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Ethics

Animal experimentation: All experimental procedures conformed to the Institute of Medicine Guide for the Care and Use of Laboratory Animals and were performed under an Animal Study Protocol approved by the Animal Care and Use Committee of the National Institute of Mental Health, covered by project number: MH002032.

Additional files

Transparent reporting form
DOI: 10.7554/eLife.36310.019

Data availability

All data generated or analysed during this study are included in the manuscript and supporting files.

References

  1. Bates D, Mächler M, Bolker B, Walker S. Fitting linear Mixed-Effects models using lme4. Journal of Statistical Software. 2015;67:v067i01.
  2. Baxter MG. Involvement of medial temporal lobe structures in memory and perception. Neuron. 2009;61:667–677. doi: 10.1016/j.neuron.2009.02.007.
  3. Blakemore C, Campbell FW. On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. The Journal of Physiology. 1969;203:237–260. doi: 10.1113/jphysiol.1969.sp008862.
  4. Bowman EM, Aigner TG, Richmond BJ. Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. Journal of Neurophysiology. 1996;75:1061–1073. doi: 10.1152/jn.1996.75.3.1061.
  5. Buckley MJ, Booth MC, Rolls ET, Gaffan D. Selective perceptual impairments after perirhinal cortex ablation. The Journal of Neuroscience. 2001;21:9824–9836. doi: 10.1523/JNEUROSCI.21-24-09824.2001.
  6. Buckley MJ, Gaffan D. Perirhinal cortical contributions to object perception. Trends in Cognitive Sciences. 2006;10:100–107. doi: 10.1016/j.tics.2006.01.008.
  7. Buffalo EA, Ramus SJ, Clark RE, Teng E, Squire LR, Zola SM. Dissociation between the effects of damage to perirhinal cortex and area TE. Learning & Memory. 1999;6:572–599. doi: 10.1101/lm.6.6.572.
  8. Bussey TJ, Saksida LM, Murray EA. Impairments in visual discrimination after perirhinal cortex lesions: testing 'declarative' vs. 'perceptual-mnemonic' views of perirhinal cortex function. European Journal of Neuroscience. 2003;17:649–660. doi: 10.1046/j.1460-9568.2003.02475.x.
  9. Clark AM, Bouret S, Young AM, Richmond BJ. Intersection of reward and memory in monkey rhinal cortex. Journal of Neuroscience. 2012;32:6869–6877. doi: 10.1523/JNEUROSCI.0887-12.2012.
  10. Eacott MJ, Gaffan D, Murray EA. Preserved recognition memory for small sets, and impaired stimulus identification for large sets, following rhinal cortex ablations in monkeys. European Journal of Neuroscience. 1994;6:1466–1478. doi: 10.1111/j.1460-9568.1994.tb01008.x.
  11. Eradath MK, Mogami T, Wang G, Tanaka K. Time context of cue-outcome associations represented by neurons in perirhinal cortex. Journal of Neuroscience. 2015;35:4350–4365. doi: 10.1523/JNEUROSCI.4730-14.2015.
  12. Erickson CA, Desimone R. Responses of macaque perirhinal neurons during and after visual stimulus association learning. The Journal of Neuroscience. 1999;19:10404–10416. doi: 10.1523/JNEUROSCI.19-23-10404.1999.
  13. Fritz J, Mishkin M, Saunders RC. In search of an auditory engram. PNAS. 2005;102:9359–9364. doi: 10.1073/pnas.0503998102.
  14. Gross CG, Rocha-Miranda CE, Bender DB. Visual properties of neurons in inferotemporal cortex of the Macaque. Journal of Neurophysiology. 1972;35:96–111. doi: 10.1152/jn.1972.35.1.96.
  15. Hampton RR, Murray EA. Learning of discriminations is impaired, but generalization to altered views is intact, in monkeys (Macaca mulatta) with perirhinal cortex removal. Behavioral Neuroscience. 2002;116:363–377. doi: 10.1037/0735-7044.116.3.363.
  16. Hampton RR. Monkey perirhinal cortex is critical for visual memory, but not for visual perception: reexamination of the behavioural evidence from monkeys. The Quarterly Journal of Experimental Psychology Section B. 2005;58:283–299. doi: 10.1080/02724990444000195.
  17. Hays AV, Richmond BJ, Optican LM. A UNIX-Based Multiple-Process System for Real-Time Data Acquisition and Control. WESCON Conference Proceedings. 1982;2:1–10.
  18. Higuchi S, Miyashita Y. Formation of mnemonic neuronal responses to visual paired associates in inferotemporal cortex is impaired by perirhinal and entorhinal lesions. PNAS. 1996;93:739–743. doi: 10.1073/pnas.93.2.739.
  19. Hirabayashi T, Takeuchi D, Tamura K, Miyashita Y. Microcircuits for hierarchical elaboration of object coding across primate temporal areas. Science. 2013;341:191–195. doi: 10.1126/science.1236927.
  20. Hubel DH, Wiesel TN. Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology. 1959;148:574–591. doi: 10.1113/jphysiol.1959.sp006308.
  21. Kobatake E, Tanaka K. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology. 1994;71:856–867. doi: 10.1152/jn.1994.71.3.856.
  22. Lee AC, Bussey TJ, Murray EA, Saksida LM, Epstein RA, Kapur N, Hodges JR, Graham KS. Perceptual deficits in amnesia: challenging the medial temporal lobe 'mnemonic' view. Neuropsychologia. 2005;43:1–11. doi: 10.1016/j.neuropsychologia.2004.07.017.
  23. Lee AC, Rudebeck SR. Human medial temporal lobe damage can disrupt the perception of single objects. Journal of Neuroscience. 2010;30:6588–6594. doi: 10.1523/JNEUROSCI.0116-10.2010.
  24. Lehky SR, Tanaka K. Enhancement of object representations in primate perirhinal cortex during a visual working-memory task. Journal of Neurophysiology. 2007;97:1298–1310. doi: 10.1152/jn.00167.2006.
  25. Levy DA, Shrager Y, Squire LR. Intact visual discrimination of complex and feature-ambiguous stimuli in the absence of perirhinal cortex. Learning & Memory. 2005;12:61–66. doi: 10.1101/lm.84405.
  26. Liu Z, Murray EA, Richmond BJ. Learning motivational significance of visual cues for reward schedules requires rhinal cortex. Nature Neuroscience. 2000;3:1307–1315. doi: 10.1038/81841.
  27. Liu Z, Richmond BJ. Response differences in monkey TE and perirhinal cortex: stimulus association related to reward schedules. Journal of Neurophysiology. 2000;83:1677–1692. doi: 10.1152/jn.2000.83.3.1677.
  28. Matsumoto N, Eldridge MA, Saunders RC, Reoli R, Richmond BJ. Mild perceptual categorization deficits follow bilateral removal of anterior inferior temporal cortex in rhesus monkeys. Journal of Neuroscience. 2016;36:43–53. doi: 10.1523/JNEUROSCI.2058-15.2016.
  29. McMahon DB, Jones AP, Bondar IV, Leopold DA. Face-selective neurons maintain consistent visual responses across months. PNAS. 2014;111:8251–8256. doi: 10.1073/pnas.1318331111.
  30. Meunier M, Bachevalier J, Mishkin M, Murray EA. Effects on visual recognition of combined and separate ablations of the entorhinal and perirhinal cortex in rhesus monkeys. The Journal of Neuroscience. 1993;13:5418–5432. doi: 10.1523/JNEUROSCI.13-12-05418.1993.
  31. Minamimoto T, Saunders RC, Richmond BJ. Monkeys quickly learn and generalize visual categories without lateral prefrontal cortex. Neuron. 2010;66:501–507. doi: 10.1016/j.neuron.2010.04.010.
  32. Murray EA, Gaffan D, Mishkin M. Neural substrates of visual stimulus-stimulus association in rhesus monkeys. The Journal of Neuroscience. 1993;13:4549–4561. doi: 10.1523/JNEUROSCI.13-10-04549.1993.
  33. Naya Y, Suzuki WA. Integrating what and when across the primate medial temporal lobe. Science. 2011;333:773–776. doi: 10.1126/science.1206773.
  34. Pagan M, Urban LS, Wohl MP, Rust NC. Signals in inferotemporal and perirhinal cortex suggest an untangling of visual target information. Nature Neuroscience. 2013;16:1132–1139. doi: 10.1038/nn.3433.
  35. Pagan M, Rust NC. Dynamic target match signals in perirhinal cortex can be explained by instantaneous computations that act on dynamic input from inferotemporal cortex. Journal of Neuroscience. 2014;34:11067–11084. doi: 10.1523/JNEUROSCI.4040-13.2014.
  36. Suzuki WA, Amaral DG. Perirhinal and parahippocampal cortices of the macaque monkey: cortical afferents. The Journal of Comparative Neurology. 1994;350:497–533. doi: 10.1002/cne.903500402.
  37. Suzuki WA. Perception and the medial temporal lobe: evaluating the current evidence. Neuron. 2009;61:657–666. doi: 10.1016/j.neuron.2009.02.008.
  38. Tu HW, Hampton RR, Murray EA. Perirhinal cortex removal dissociates two memory systems in matching-to-sample performance in rhesus monkeys. Journal of Neuroscience. 2011;31:16336–16343. doi: 10.1523/JNEUROSCI.2338-11.2011.
  39. Ungerleider LG, Mishkin M. In: Two Cortical Visual Systems. Analysis of Visual Behavior. Ingle D, Goodale M. A, Mansfield R. J. W, editors. Cambridge, Mass: MIT Press; 1982.
  40. Wang P, Nikolić D. An LCD monitor with sufficiently precise timing for research in vision. Frontiers in Human Neuroscience. 2011;5:85. doi: 10.3389/fnhum.2011.00085.
  41. Xiang JZ, Brown MW. Differential neuronal encoding of novelty, familiarity and recency in regions of the anterior temporal lobe. Neuropharmacology. 1998;37:657–676. doi: 10.1016/S0028-3908(98)00030-6.
  42. Yakovlev V, Fusi S, Berman E, Zohary E. Inter-trial neuronal activity in inferior temporal cortex: a putative vehicle to generate long-term visual associations. Nature Neuroscience. 1998;1:310–317. doi: 10.1038/1131.

Decision letter

Editor: Lila Davachi

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Where object perception ends in the ventral visual stream" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and David Van Essen as the Senior Editor. The following individual involved in review of your submission has agreed to reveal his identity: Robert Hampton (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

In this manuscript, Eldridge et al. investigate whether the rhinal cortex is involved in visual perception or whether upstream visual area TE is the "terminal" stage in visual processing in the ventral visual stream. To answer this question, the authors compared and contrasted behavioral performance of groups of monkeys with bilateral rhinal cortex lesions, bilateral TE lesions and unoperated controls. They employed a basic two-interval, forced choice, cat-dog categorization task in which monkeys were asymmetrically rewarded for releasing a lever during a "choice window" indicated by the color of the central target (red target for cat-like stimuli and green target for dog-like stimuli). The stimulus set consisted of cats, dogs, and intermediate morphs between the two. The authors find that while all groups of monkeys classified most stimuli well on the first repetition, the rates of improvement towards asymptotic performance were variable in different groups. While the TE lesioned monkeys improved the slowest over repetitions, the rhinal lesioned monkeys and the unoperated controls improved at a faster rate and did not differ from each other. Further, the authors used modified versions of the task to test the robustness of their results in the face of task difficulty. In sum, the authors argue that while area TE is required for visual perception, rhinal cortex is not necessary. This is taken to contribute to the ongoing debate about the role of perirhinal cortex in perception and suggest that area TE marks the end of purely visual processing in the ventral visual stream and does not extend to anatomically adjacent rhinal cortex.

Essential revisions:

Overall, this is an important study which is expected to be of interest to a broad array of visual, cognitive and behavioral neuroscientists. At the same time, however, there are substantial concerns about confounding factors in the task design and behavior that need to be discussed and addressed. In addition, improvements and additional analyses are needed for a more rigorous evaluation and a more complete picture of the issues under study.

1) Task-design:

1a) In the two-interval forced choice, cat/dog categorization task, is the first choice target always red and the second choice target always green? Or is the order of these two colored targets counterbalanced? In other words, is the first interval always correlated with the red target for cat stimuli and the second interval always correlated with green target for dog stimuli? Are monkeys using color to do the task? Or are they relying on an unknown combination of color and interval, so that it is not possible to know for sure what the monkeys are relying on since they are correlated?

1b) Clarification in text: In the subsection “Experiment 1 – learning to categorize morphed stimuli”, where the task is described, it is stated that monkeys were rewarded for responding with a lever release for cat-like stimuli in the red (first) interval and for dog-like stimuli in the green (second) interval. However, in the second paragraph of the subsection “Behavior”, it states that only responses for dog-like stimuli during the green (second) interval were rewarded and there was never a reward for releasing while the red target was present. There is a discrepancy here that needs to be resolved.

2) Psychophysical results (Figures 2, 5 and 6):

If the task contains an asymmetrical reward structure in which monkeys are rewarded only for the "dog" category, can the results in Figure 2B be explained by this reward bias? It looks like psychophysical performance for TE lesioned monkeys is impaired only for 0-20% dog stimuli (which are essentially cat-like stimuli). It is possible that monkeys are very good at the category that is rewarded and not good for the category that is unrewarded. Bias in performance is also present only for TE lesioned monkeys and not for controls or rhinal lesioned monkeys. Further, in Figures 5 (5B and 5D) and 6 (6B), the pattern of deficits in TE monkeys is biased towards the left of the psychophysical curve (just like in Figure 2). More insight is needed to understand whether this bias is a result of the asymmetrical reward structure. Does the bias switch if only cat stimuli were rewarded?

Also, for TE lesioned monkeys (Figure 2B), saturating performance (at 80-100% morph levels) decreases with repetition but not for controls or rhinal lesioned monkeys. Is this statistically significant?

3) Behavior and training for Figure 2:

What was the frequency of repetitions for the three groups of monkeys? Was it one repetition per day and was it similar across all the individual monkeys in the three groups? Could the impaired learning in TE lesioned monkeys be a result of more repetitions within the same time interval for control and rhinal lesioned monkeys but fewer repetitions within the same time interval for TE lesioned monkeys?

4) Analysis for Figure 3:

The text (subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”, first paragraph) mentions that the analysis used only sessions 11-20 in which asymptotic performance was reached to remove the effect of learning. However, in the subsection “Experiment 1 – learning to categorize morphed stimuli”, it states that while controls and rhinal lesioned monkeys reached asymptotic performance in 10 sessions, TE lesioned monkeys reached asymptotic performance in 14 sessions. Yet, to compare across groups for Figure 3, sessions 11-20 have been used for analysis, even though TE lesioned monkeys did not reach asymptotic performance by session 11. It might be cleaner to report analysis for Figure 3 using sessions in which all monkey groups reached asymptotic performance (perhaps session 14-20). The results in Figure 3A could be attributable to slower learning in TE.

5) Reaction-time distributions in Figure 3:

In Figure 2C, the variance in RTs for release on red is lower than the variance in RTs for release on red in Figure 2B. Also, in Figure 2C, the variance in RTs for release on green is higher than the variance in RTs for release on green in Figure 2B and 2D. In both the lesioned monkey groups (2C and 2D), RTs for release on red are faster on the hardest conditions (near boundary) as compared to controls. Is there an explanation for these effects?

6) Color vision related to Figure 4:

Previous studies (Buckley 1997) have shown that TE lesions cause deficits in color discrimination. Do the anatomical locations of TE lesions in this study match with other studies that show color deficits? Does the TE lesioned group have difficulty in discriminating color? And does this affect their performance on the task? In the first paragraph of the Discussion, it mentions that these monkeys can detect the change from red to green. Is this from a different task? This should be explained in the text or referenced in some form.

7) How does the size of the TE lesion compare to the size of the rhinal lesion? For example, if TE lesions are larger than rhinal lesions, this could potentially lead to more deficits in TE lesioned groups than rhinal lesioned groups.

8) In both groups of lesioned monkeys, is there a reduction in attention or motivation during task performance? For instance, are more trials aborted before completion or are there more fixation breaks – and do these differ between monkey groups?

9) Please include plots of the data from individual TE animals for all of the figures where a deficit is shown (in order to evaluate the heterogeneity of performance, and to see in which animals a reward bias could account for their behavior). Without seeing the TE lesions (including subcortical structures), we cannot know if they encroached upon reward circuitry or other brain regions to a greater extent than the Rh lesions, allowing non-perceptual explanations of the deficit to remain possible.

10) Please clarify how the present task differs from Task 1 of Lee et al., 2005, and how it has reduced memory demands relative to Lee and Rudebeck, 2010. There is a large prior literature showing Rh lesion-induced perceptual deficits that may be prematurely dismissed on the basis that those tasks did not eliminate memory demands adequately. But those tasks were described as having almost non-existent memory demands. Lee and Rudebeck, 2010, used a "possible/impossible" object judgement with no memory demands. Similarly, Lee et al., 2005 used a task almost identical to the present monkey task except that it displayed 2 stimuli simultaneously, with an identical category judgment (referring to a 'target' category defined at task outset); thus, in Lee et al., the greatest memory demand was comparing to a target category held in mind (as in the present study), not comparing between two adjacent stimuli. Second, it seems implausible that the extremely short memory demands imposed by prior oddity tasks (saccading between adjacent stimuli) are enough to render Rh involvement critical. This implies a role for Rh cortex in sensory/perceptual integration so extreme that patients with Rh damage should be unable to make sense of the dynamic visual world. Please provide further discussion of this important point. You may not be able to fully resolve this as it will likely remain hotly debated but describing your specific concerns in a convincing and detailed way will go much farther to increase the impact of this paper.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Perceptual processing in the ventral visual stream requires area TE but not rhinal cortex" for further consideration at eLife. Your revised article has been favorably evaluated by Eve Marder as the Senior Editor, a Reviewing Editor, and three reviewers.

This revision was very responsive in addressing several questions posed by the reviewers and providing further clarification of the design and interpretation of the reported results. There was not complete agreement across the reviewers as reviewer 1 suggests an intriguing alternative interpretation for the results that we hope will be the basis for further discussion. Their re-review is included below and we would ask that you consider their comments and make some final revisions to your paper to respond to their concerns.

Reviewer #1:

I remain unconvinced that the results and their interpretation are sufficiently compelling for publication in eLife (but I accepted the earlier, collective "letter to the authors" that was reasonably positive, because I see that I might be in the minority). My reservations stem from the ambiguous nature of the task (and the pattern of results in individual TE monkeys) and from weaknesses in the arguments for the key interpretations/conclusions. I respond to the authors' response letter below.

Authors: We have included plots of individual monkey performance for all experiments in which a deficit was reported (Figure 3—figure supplement 2; Figure 5—figure supplement 1; Figure 6—figure supplement 2).

Reviewer: This confirms what I ascertained by looking at the raw data: the TE lesion effect comes almost entirely from one monkey (K). In the other 2 TE monkeys, to the extent that they show any deficit, it is in classifying cats as dogs without a symmetrical tendency to classify dogs as cats. In other words, in these two monkeys, their small number of errors could be due to response bias (greater propensity to release the bar on green, because of increased reward-seeking/reduced risk-aversion, etc.). The authors state in the revised manuscript that "Due to the asymmetrical reward structure of this and similar tasks, when monkeys fail to discriminate between the two offers they will usually resort to releasing the lever during the presence of the green target 100% of the time, as this is the only condition in which reward is available." This illustrates the ambiguous nature of the task: if monkeys show a bias toward releasing on green (i.e. selecting "dog") it could be due to failure to discriminate or due to response bias (since the assignment of reward contingencies is not symmetric with respect to cat/dog category) and we cannot know which. This weakness is a key reason why I do not feel the present results are compelling enough to warrant publication in eLife.

Authors: Lee et al., 2005, 'task 1' is a classic visual discrimination task (simultaneous S+ vs. S-), where the categories offer no information that can help the subjects solve the task.… Lee 'task 2' is something of a hybrid between visual discrimination and oddity tasks; the subject can use one stimulus to determine which of the other two stimuli is most perceptually similar (i.e. the target), and thus must make real-time comparisons among the 3 stimuli presented (further discussion of this point below). 'Task 2' could also be solved using the same visual discrimination learning required in 'task 1', although we agree with the authors that it is parsimonious to assume that the adoption of the former strategy is more likely. In both tasks, the subjects are being asked to make a stimulus-reward association – task 1 over the longer timescale of multiple presentations; task 2 over the shorter timescale of within-trial presentations. Our task places no similar requirement for 'stimulus-reward' mapping on our subjects. The conclusion we draw from these comparisons is that Rh cortex is likely critical for stimulus-reward association memory (as others have demonstrated), but that memory for 'category' is supported elsewhere (earlier in the visual system).

Reviewer: The authors argue that the key distinction between the Lee et al. tasks and their task is the use of only a single item per learned discrimination (in Lee et al.) versus the use of sets of items constituting the to-be-discriminated categories (i.e. multiple cats and multiple dogs) in their task. In Lee et al., the requirement to associate each single item with reward thus renders the task a "memory" task. Whereas, in the authors' task, because monkeys had to generalize across multiple items within a category and correctly associate that category with reward, the task is not a memory task but a perceptual discrimination task. Note that "and correctly associate that with reward" was added by me – the authors make no mention of reward in describing their own task. Yet an association between the category and the reward is critical for correct performance in the authors' task, and in fact this category-to-reward mapping is more complex than in the Lee et al. tasks (monkeys must learn: if "dog", then release while green; if "cat", then do not release until red). To argue that a simpler reward contingency (in Lee et al.) makes the task more "mnemonic" than a complex reward contingency (in the present task) seems backwards. I agree that what makes the present task different from the Lee et al. tasks is the requirement for generalizing across category exemplars, but this implies a very different interpretation than the one offered by the authors. The most sensible interpretation is as follows. The present task is a categorization task that requires both generalization across diverse cats (or across diverse dogs) and discrimination of cats from dogs. Therefore, the optimal representations are not whole, unique objects (residing in rhinal cortex), but rather Shimon Ullman-esque intermediate complexity features (known to reside in IT) that allow generalization across distinct cats as well as discrimination of cats from dogs.
This accords with another interesting/complicating factor in the present task: owing to the distribution of perceptual features possessed by cats and dogs, most cats are a plausible subset of dogs, but most dogs are not a plausible subset of cats (Mareschal, French and Quinn, 2000; Mareschal, Quinn and French, 2002). Given this category asymmetry, having compromised IT representations (needed for generalization and discrimination) might lead to a bias toward classing cats as dogs, as seen in the data. One way to test this would be to replicate the study in a design without reward asymmetry, to see if the dog-bias (which could then only arise from inherent category asymmetry) still exists.

Authors: Lee and Rudebeck, 2010, implemented a task that required subjects to report whether drawings of stimuli were viable as 3D objects. The memory demands of that task are much more similar to those of the present study – the subjects classified stimuli into one of two categories. However, the task only tests 'perception' in an abstract sense; there remains a confound with cognitive load – the mental reconstruction of a 3-dimensional image from a 2-dimensional representation demands more than simple perception of the object as a whole.

Reviewer: Here, it is not clear how the authors' argument rebuts the reviewer's concern. Do the authors mean to equate "mental reconstruction of a 3-dimensional image from a 2-dimensional representation" with traditional conceptions of declarative memory (i.e., with the "memory" account of rhinal cortex function that is used to dismiss other findings of rhinal lesion-induced deficits)? This does not seem plausible. I would like to see either a different, more compelling reason for attributing the Lee et al. findings (both the 2005 and 2010 studies) to a "memory" deficit, or an alternative interpretation of the present results that can accommodate all of the data in a satisfying way.

Authors: The reviewers suggest that the short-term memory demands imposed by oddity tasks are equivalent to the sensory/perceptual demands of the dynamic visual world. In the Lee et al., 2005 study.… the sum of the saccadic intervals between the different objects, and among features within each object, will be on the order of 100s of ms, during which information has to be actively held in some form of short-term memory.

Reviewer: Again, it is not clear how the authors' argument rebuts the reviewer's concern. If rhinal cortex lesions impair the ability to hold information in memory for ~100ms, this would have serious deleterious effects on perception of the dynamic visual world. For example, when a prime stimulus disrupts perception of a subsequent target stimulus, its effects can either blend with (boost) or be "discounted" from (detract from) perception of the target stimulus, depending on for how long the prime appears. The prime duration at which our perceptual systems tend to switch from blending to discounting is approximately 100-300ms (e.g., Huber, 2008). In other words, basic mechanisms of dynamic perception would be massively altered if the ability to maintain information for 100ms were lost. This is not typically how the experience of individuals with rhinal cortex lesions is characterized.

Reviewer #2:

I am satisfied with the revision. The authors have responded adequately to reviewer comments.

Reviewer #3:

The authors have addressed the concerns raised in my review from the previous round, and the manuscript has been improved. I am satisfied with their revisions and response.

eLife. 2018 Oct 12;7:e36310. doi: 10.7554/eLife.36310.022

Author response


Essential revisions:

Overall, this is an important study which is expected to be of interest to a broad array of visual, cognitive and behavioral neuroscientists. At the same time, however, there are substantial concerns about confounding factors in the task design and behavior that need to be discussed and addressed. In addition, improvements and additional analyses are needed for a more rigorous evaluation and a more complete picture of the issues under study.

1) Task-design:

1a) In the two-interval forced choice, cat/dog categorization task, is the first choice target always red and the second choice target always green? Or is the order of these two colored targets counterbalanced? In other words, is the first interval always correlated with the red target for cat stimuli and the second interval always correlated with green target for dog stimuli? Are monkeys using color to do the task? Or are they relying on an unknown combination of color and interval, so that it is not possible to know for sure what the monkeys are relying on since they are correlated?

Yes, the first choice target is always red, and the second always green. The first interval is always correlated with the red target, and the second interval with green. This structure does not influence the monkey’s decision, as the trial type (more cat-like or more dog-like) is selected on a random basis. To illustrate this point: Minamimoto et al., 2010, used a similar task structure to that used in the present study; when they substituted the categorical cues for stimuli derived by randomly arranging black and white pixels, the monkey’s performance fell to chance.

It is extremely unlikely that monkeys use the timing of the interval to perform the task. The timing of the color change from red to green varies from 2000 – 3000 ms during the testing phase (500 – 1500 ms during the initial training phase), and the timing window for correctly releasing the bar after the target changes to green is 200 – 1000 ms. These timings encourage the monkeys to use the color cue to determine when to make their responses; a strategy based on timing alone would result in very low levels of success. We explain this, with further discussion, in the first paragraph of the subsection “Behavior”.

The dependence of the reaction time distributions on the difficulty of the cat-like options indicates that the monkeys are releasing the bar when they have presumably gathered enough information to make a decision. The low variance across morph levels for bar release responses after the color change from red-to-green (Figure 3B, C, D) confirms that the monkeys are attending to the color change as opposed to using a timing strategy.

1b) Clarification in text: In the subsection “Experiment 1 – learning to categorize morphed stimuli”, where the task is described, it is stated that monkeys were rewarded for responding with a lever release for cat-like stimuli in the red (first) interval and for dog-like stimuli in the green (second) interval. However, in the second paragraph of the subsection “Behavior”, it states that only responses for dog-like stimuli during the green (second) interval were rewarded and there was never a reward for releasing while the red target was present. This discrepancy needs to be resolved.

Thank you for bringing this error to our attention. The description in the subsection “Behavior” was correct. We have modified the earlier description of the task to reflect this.

2) Psychophysical results (Figures 2, 5 and 6):

If the task contains an asymmetrical reward structure in which monkeys are rewarded only for the "dog" category, can the results in Figure 2B be explained by this reward bias? It looks like psychophysical performance for TE lesioned monkeys is impaired only for 0-20% dog stimuli (which are essentially cat-like stimuli). It is possible that monkeys are very good at the category that is rewarded and not good for the category that is unrewarded. Bias in performance is also present only for TE lesioned monkeys and not for controls or rhinal lesioned monkeys. Further, in Figures 5 (5B and 5D) and 6 (6B), the pattern of deficits in TE monkeys is biased towards the left of the psychophysical curve (just like in Figure 2). More insight is needed to understand whether this bias is a result of the asymmetrical reward structure. Does the bias switch if only cat stimuli were rewarded?

The asymmetrical reward does produce a response bias. However, this bias is present for all groups, and hence cannot account for the deficit in the TE group. We have emphasized the universal effect of the bias in the first paragraph of the subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”.

We provide further insight into the reason for the direction of the bias (to the left of the psychophysical curve) in the subsection “Experiment 3 – area TE impairment with visually degraded stimuli”.

We did not test whether the bias switched if we switched reward contingency between the two categories; however, we would strongly expect the bias to favor the rewarded category.

Also, for TE lesioned monkeys (Figure 2B), saturating performance (at 80-100% morph levels) decreases with repetition but not for controls or rhinal lesioned monkeys. Is this statistically significant?

The decrease in saturating performance (at 80-100% morph levels) with repetition for the TE group does not reach statistical significance (RM-ANOVA on the parameter of the curve fit that corresponds to saturating performance – ‘b’ in the function: a + b / (1 + exp(c * x + d)): main effect of treatment group, p = 0.99; main effect of session, p = 0.61, interaction of group*session, p = 0.51).
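To make the roles of the parameters in the quoted function concrete, the following is a minimal sketch, not the fitting code used in the study. The function name `sigmoid4` and all parameter values are hypothetical and chosen only for illustration. It shows that the two asymptotes of the curve are `a` and `a + b`, so `b` sets the distance between the floor and the saturating level.

```python
import math

def sigmoid4(x, a, b, c, d):
    """Evaluate a + b / (1 + exp(c * x + d)), the function quoted in the response."""
    return a + b / (1.0 + math.exp(c * x + d))

# Hypothetical parameters for illustration only (not fitted values from the study).
a, b, c, d = 0.05, 0.90, -0.1, 5.0

# With c < 0 the curve rises with x: it approaches a for small x and a + b for large x.
low = sigmoid4(-1000.0, a, b, c, d)
high = sigmoid4(1000.0, a, b, c, d)

# At x = -d / c the exponent is zero, so the value lies halfway between the asymptotes.
mid = sigmoid4(-d / c, a, b, c, d)

print(low, high, mid)
```

Under these assumptions, a change in saturating performance at the 80-100% morph levels would appear as a change in `a + b`, which is why a test on the fitted `b` is a reasonable proxy when `a` is stable.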

3) Behavior and training for Figure 2:

What was the frequency of repetitions for the three groups of monkeys? Was it one repetition per day and was it similar across all the individual monkeys in the three groups? Could the impaired learning in TE lesioned monkeys be a result of more repetitions within the same time interval for control and rhinal lesioned monkeys but fewer repetitions within the same time interval for TE lesioned monkeys?

The frequency of repetitions was identical for all treatment groups. We describe this in greater detail in the last paragraph of the subsection “Behavior”. All monkeys received the same number of list presentations (i.e. number of trials), and there was no consistent difference between groups in how much time was required to complete the testing sessions.

4) Analysis for Figure 3:

The text (subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”, first paragraph) mentions that the analysis used only sessions 11-20 in which asymptotic performance was reached to remove the effect of learning. However, in the subsection “Experiment 1 – learning to categorize morphed stimuli”, it states that while controls and rhinal lesioned monkeys reached asymptotic performance in 10 sessions, TE lesioned monkeys reached asymptotic performance in 14 sessions. Yet, to compare across groups for Figure 3, sessions 11-20 have been used for analysis, even though TE lesioned monkeys did not reach asymptotic performance by session 11. It might be cleaner to report analysis for Figure 3 using sessions in which all monkey groups reached asymptotic performance (perhaps session 14-20). The results in Figure 3A could be attributable to slower learning in TE.

In Experiment 1, the TE-removal group was slower to learn, only reaching the level of performance of controls by the 14th repetition of the stimulus set; all sessions were included in the analysis. In Experiment 2 we wanted to exclude the potential confound of a learning effect by evaluating performance after a plateau had been reached. In Experiment 2, all monkeys had reached asymptotic performance by the tenth session, hence sessions 11 – 20 were used in the analysis. We have clarified this in the first paragraph of the subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”.

5) Reaction-time distributions in Figure 3:

In Figure 2C, the variance in RTs for release on red is lower than the variance in RTs for release on red in Figure 2B. Also, in Figure 2C, the variance in RTs for release on green is higher than the variance in RTs for release on green in Figure 2B and 2D. In both the lesioned monkey groups (2C and 2D), RTs for release on red are faster on the hardest conditions (near boundary) as compared to controls. Is there an explanation for these effects?

The plots of the individual monkeys’ reaction times (now provided in Figure 3—figure supplement 2) go some way to explaining the differences in variance noted by the reviewers; for example, one of the TE-lesioned monkeys (monkey ‘G’) was almost 200 ms slower than all of the other monkeys in all treatment groups to release the bar in response to the appearance of the green target. However, the success rate of this monkey in classifying the morphed images correctly was between those of the other two TE-lesioned monkeys (see panel A of Figure 3—figure supplement 2), suggesting that the slower reaction time cannot easily be interpreted to explain the deficit.

Reaction times were not statistically different among groups. The reviewers noticed a trend towards slower reaction times for the control group at the most ambiguous levels of category membership, but the statistical analysis indicates that this was not significant.

6) Color vision related to Figure 4:

Previous studies (Buckley 1997) have shown that TE lesions cause deficits in color discrimination. Do the anatomical locations of TE lesions in this study match with other studies that show color deficits? Does the TE lesioned group have difficulty in discriminating color? And does this affect their performance on the task? In the first paragraph of the Discussion, it mentions that these monkeys can detect the change from red to green. Is this from a different task? This should be explained in the text or referenced in some form.

The TE lesions in the present study are larger than those reported in Buckley, 1997, in that they extend medially to the lateral bank of the occipitotemporal sulcus. We have reproduced the figure illustrating TE lesion extent for simple reference (Figure 1—figure supplement 1). The reference to red-to-green color discrimination refers to the experiments described in the present study; we clarify this in the first paragraph of the Discussion. If the TE-lesioned group couldn’t discriminate color, they wouldn’t be able to complete the preliminary training phase in which monkeys are required to report the change of the target color from red to green.

TE-lesioned animals learned the red-green color discrimination at the same rate as controls and those with rhinal lesions. We did not test whether the monkeys were discriminating color or luminance. What is important is that their responses were governed by the test image.

7) How does the size of the TE lesion compare to the size of the rhinal lesion? For example, if TE lesions are larger than rhinal lesions, this could potentially lead to more deficits in TE lesioned groups than rhinal lesioned groups.

The TE lesions result in the removal of a larger volume of tissue than the rhinal removals. However, others have demonstrated that TE removals do not non-specifically produce greater impairments than rhinal cortex removals (Buckley et al., 1997, Buffalo et al., 1999). More importantly, the rhinal removals are equivalent to those previously reported – specifically in studies in which the authors reported a deficit in visual perceptual processing, such as the Bussey et al., 2003 study – which used exactly the same boundaries for rhinal cortex as were used in the present study. We have emphasized this in the second paragraph of the Discussion.

8) In both groups of lesioned monkeys, is there a reduction in attention or motivation during task performance? For instance, are more trials aborted before completion or are there more fixation breaks – and do these differ between monkey groups?

Motivation and attention do not appear to be affected by either lesion type. This statement and supporting evidence are presented in the last paragraph of the subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”. Monkeys were not head-posted, hence no fixation data is available. Sessions were limited to a maximum of 440 – 500 trials per day (depending on the experiment). In other studies in which we allow the monkeys to work to satiety, they will routinely work for 2000 – 3000 trials a day (depending on the individual), hence we had no difficulty keeping all three groups of monkeys motivated and attentive throughout testing.

9) Please include plots of the data from individual TE animals for all of the figures where a deficit is shown (in order to evaluate the heterogeneity of performance, and to see in which animals a reward bias could account for their behavior). Without seeing the TE lesions (including subcortical structures), we cannot know if they encroached upon reward circuitry or other brain regions to a greater extent than the Rh lesions, allowing non-perceptual explanations of the deficit to remain possible.

We have included plots of individual monkey performance for all experiments in which a deficit was reported (Figure 3—figure supplement 2; Figure 5—figure supplement 1; Figure 6—figure supplement 2). We have also reproduced the reconstructions of the TE lesions in Figure 1—figure supplement 1.

10) Please clarify how the present task differs from Task 1 of Lee et al., 2005, and how it has reduced memory demands relative to Lee and Rudebeck, 2010. There is a large prior literature showing Rh lesion-induced perceptual deficits that may be prematurely dismissed on the basis that those tasks did not eliminate memory demands adequately. But those tasks were described as having almost non-existent memory demands. Lee and Rudebeck, 2010, used a "possible/impossible" object judgement with no memory demands. Similarly, Lee et al., 2005 used a task almost identical to the present monkey task except that it displayed 2 stimuli simultaneously, with an identical category judgment (referring to a 'target' category defined at task outset); thus, in Lee et al., the greatest memory demand was comparing to a target category held in mind (as in the present study), not comparing between two adjacent stimuli. Second, it seems implausible that the extremely short memory demands imposed by prior oddity tasks (saccading between adjacent stimuli) are enough to render Rh involvement critical. This implies a role for Rh cortex in sensory/perceptual integration so extreme that patients with Rh damage should be unable to make sense of the dynamic visual world. Please provide further discussion of this important point. You may not be able to fully resolve this as it will likely remain hotly debated but describing your specific concerns in a convincing and detailed way will go much farther to increase the impact of this paper.

Both tasks performed by the subjects in the Lee et al. (2005) study are significantly different in design from those presented in the current study. Lee ‘task one’ is a classic visual discrimination task (simultaneous S+ vs. S-), where the categories offer no information that can help the subjects solve the task. The categories in Lee et al. are used to assess performance at different levels of perceptual difficulty, and to determine whether stimuli in different semantic groupings are processed in different brain regions. Lee ‘task 2’ is something of a hybrid between visual discrimination and oddity tasks; the subject can use one stimulus to determine which of the other two stimuli is most perceptually similar (i.e. the target), and thus must make real-time comparisons among the 3 stimuli presented (further discussion of this point below). ‘Task 2’ could also be solved using the same visual discrimination learning required in ‘task 1’, although we agree with the authors that it is parsimonious to assume that the adoption of the former strategy is more likely. In both tasks, the subjects are being asked to make a stimulus-reward association – task 1 over the longer timescale of multiple presentations; task 2 over the shorter timescale of within-trial presentations. Our task places no similar requirement for ‘stimulus-reward’ mapping on our subjects. The conclusion we draw from these comparisons is that Rh cortex is likely critical for stimulus-reward association memory (as others have demonstrated), but that memory for ‘category’ is supported elsewhere (earlier in the visual system). 
We feel that to elaborate this point in full in the Discussion of the present study would oblige us to engage in a similar depth of analysis for several of the other equally well-designed studies performed in human subjects, and hence would distract from the focus of the manuscript; we have included a reference to the Lee et al., 2005 paper in the section of the Discussion in which we acknowledge the debate surrounding the role of rhinal cortex in the human literature (Discussion, second paragraph).

The reviewers suggest that the short-term memory demands imposed by oddity tasks are equivalent to the sensory/perceptual demands of the dynamic visual world. In the Lee et al., 2005 study, and all other robust studies on this topic, the authors are careful to ensure that no single feature can be used to easily discriminate among the stimuli presented. That means that in an oddity task the subject must attend to a minimum of three images, and must attend to a minimum of two features in each of those three images (and probably more than two features to be confident). If this task requires foveal vision, as seems likely, then the sum of the saccadic intervals between the different objects, and among features within each object, will be on the order of 100s of ms, during which information has to be actively held in some form of short-term memory.

Lee and Rudebeck, 2010, implemented a task that required subjects to report whether drawings of stimuli were viable as 3D objects. The memory demands of that task are much more similar to those of the present study – the subjects classified stimuli into one of two categories. However, the task only tests ‘perception’ in an abstract sense; there remains a confound with cognitive load – the mental reconstruction of a 3-dimensional image from a 2-dimensional representation demands more than simple perception of the object as a whole.

We certainly agree with the reviewers’ comment that this topic will be hotly debated! We have modified the second paragraph of our Discussion to detail our argument (we hope) with greater clarity.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

This revision was very responsive in addressing several questions posed by the reviewers and providing further clarification of the design and interpretation of the reported results. There was not complete agreement across the reviewers as reviewer 1 suggests an intriguing alternative interpretation for the results that we hope will be the basis for further discussion. Their re-review is included below and we would ask that you consider their comments and make some final revisions to your paper to respond to their concerns.

We would like to thank reviewer 1 for the detailed constructive criticism, thoughtful arguments, and helpful suggestions.

As a general statement, we would like to note that the key observation of interest in this manuscript is the lack of impairment in performing perceptually-difficult categorizations in monkeys with rhinal removals. In this context, the modest impairment of the group with TE removals serves as a positive control.

Reviewer #1:

I remain unconvinced that the results and their interpretation are sufficiently compelling for publication in eLife (but I accepted the earlier, collective "letter to the authors" that was reasonably positive, because I see that I might be in the minority). My reservations stem from the ambiguous nature of the task (and the pattern of results in individual TE monkeys) and from weaknesses in the arguments for the key interpretations/conclusions. I respond to the authors' response letter below.

Authors: We have included plots of individual monkey performance for all experiments in which a deficit was reported (Figure 3—figure supplement 2; Figure 5—figure supplement 1; Figure 6—figure supplement 2).

Reviewer: This confirms what I ascertained by looking at the raw data: the TE lesion effect comes almost entirely from one monkey (K). In the other 2 TE monkeys, to the extent that they show any deficit, it is in classifying cats as dogs without a symmetrical tendency to classify dogs as cats. In other words, in these two monkeys, their small number of errors could be due to response bias (greater propensity to release the bar on green, because of increased reward-seeking/reduced risk-aversion, etc.). The authors state in the revised manuscript that "Due to the asymmetrical reward structure of this and similar tasks, when monkeys fail to discriminate between the two offers they will usually resort to releasing the lever during the presence of the green target 100% of the time, as this is the only condition in which reward is available." This illustrates the ambiguous nature of the task: if monkeys show a bias toward releasing on green (i.e. selecting "dog") it could be due to failure to discriminate or due to response bias (since the assignment of reward contingencies is not symmetric with respect to cat/dog category) and we cannot know which. This weakness is a key reason why I do not feel the present results are compelling enough to warrant publication in eLife.

We thank the reviewer for highlighting this issue for further clarification. If a response bias were introduced by a specific lesion (e.g. TE removal), that bias should be observed across all experimental conditions. The bias that exists (for all monkeys – see subsection “Experiment 2 – area TE removal impairs perceptually difficult categorization”, first paragraph) is only exacerbated in the TE removal group when the task is made perceptually more difficult in Experiment 3 (i.e. by partially obscuring the stimuli used in Experiment 2 with masks – Figure 5B). In this case, the bias (upward translation of the fitted function) is seen because of the increased perceptual difficulty. The increased perceptual difficulty is manifested as flattening of the slope of the discrimination function.

All three monkeys with TE removals (K, T and G) showed flattening of the discrimination curve relative to all control and all rhinal-lesioned monkeys in all of the morph discrimination tests. Critically, the gradients of the discrimination curves for the control monkeys and those with rhinal-removals were indistinguishable.

When the discriminations are made more difficult (Experiments 3 and 4) there are two changes in the discrimination functions. First, the TE monkeys’ discrimination performance becomes considerably worse as shown by the decrease in slope of the discrimination curve. Second, as correctly noticed by the referee, the monkeys have changed their criterion for classifying a stimulus as a dog as shown by the shift in the 50% discrimination point to the left. Since these changes occur when the discrimination is made difficult by the masks, we can conclude that the monkeys have a discrimination deficit, and in the face of the discrimination being made more difficult the monkeys have changed their criterion by classifying more trials as dogs. The change in criterion is not surprising in the face of the asymmetrical reward structure of the task.

Authors: Lee et al., 2005, 'task one' is a classic visual discrimination task (simultaneous S+ vs. S-), where the categories offer no information that can help the subjects solve the task.… Lee 'task 2' is something of a hybrid between visual discrimination and oddity tasks; the subject can use one stimulus to determine which of the other two stimuli is most perceptually similar (i.e. the target), and thus must make real-time comparisons among the 3 stimuli presented (further discussion of this point below). 'Task 2' could also be solved using the same visual discrimination learning required in 'task 1', although we agree with the authors that it is parsimonious to assume that the adoption of the former strategy is more likely. In both tasks, the subjects are being asked to make a stimulus-reward association – task 1 over the longer timescale of multiple presentations; task 2 over the shorter timescale of within-trial presentations. Our task places no similar requirement for 'stimulus-reward' mapping on our subjects. The conclusion we draw from these comparisons is that Rh cortex is likely critical for stimulus-reward association memory (as others have demonstrated), but that memory for 'category' is supported elsewhere (earlier in the visual system).

Reviewer: The authors argue that the key distinction between the Lee et al. tasks and their task is the use of only a single item per learned discrimination (in Lee et al.) versus the use of sets of items constituting the to-be-discriminated categories (i.e. multiple cats and multiple dogs) in their task. In Lee et al., the requirement to associate each single item with reward thus renders the task a "memory" task. Whereas, in the authors' task, because monkeys had to generalize across multiple items within a category and correctly associate that category with reward, the task is not a memory task but a perceptual discrimination task. Note that "and correctly associate that with reward" was added by me – the authors make no mention of reward in describing their own task.

We agree with the reviewer that it is important that we clearly convey that our task does include a memory demand for category boundary/prototype. We currently state that our task “requires memory for a categorical exemplar or boundary, along with the category-response mapping”. However, the key distinction we wish to make is that our task carries a requirement for category memory, and not short-term memory, or item-by-item stimulus-reward associations (of the type required for standard S+/S- visual object discriminations). We now state this explicitly in the summary of experimental design: “All tasks required remembering visual perceptual categories. However, in every trial the monkeys responded while the stimulus was present, thereby minimizing demands on short-term memory”.

Yet an association between the category and the reward is critical for correct performance in the authors' task, and in fact this category-to-reward mapping is more complex than in the Lee et al. tasks (monkeys must learn: if "dog", then release while green; if "cat", then do not release until red). To argue that a simpler reward contingency (in Lee et al.) makes the task more "mnemonic" than a complex reward contingency (in the present task) seems backwards.

The reviewer is correct that the basic rules of the two tasks are different. However, the category-action mapping in our task (if ‘dog’ release while green, etc.) has been learned by all groups, as evidenced by their near 100% accuracy at the extremes of the morph scale (i.e. 0% dog and 100% dog in Figure 3A). A similar type of response mapping to one of two intervals is required to perform the contrast sensitivity task, and all groups performed this without bias. In any case, the learning of the rule is not the ‘mnemonic’ component we refer to here. We agree that our task and that of Lee et al. place different memory demands on the subject. Our task requires the monkeys to learn a category boundary, using a large set of fixed stimuli. In contrast, Lee et al. required humans to learn specific stimulus-reward associations, with two stimuli at a time, over a small number of repetitions. In fact, these differing memory demands form the crux of our argument: if both tasks are perceptually challenging, and differ significantly only in their mnemonic requirements, then the presence of a Rh impairment in their task, and absence in ours, seems most parsimoniously interpreted as due to the different mnemonic demands. This leads to the conclusion that Rh cortex is only required for certain types of memory (e.g. for single items and not for categories).

I agree that what makes the present task different from the Lee et al. tasks is the requirement for generalizing across category exemplars, but this implies a very different interpretation than the one offered by the authors. The most sensible interpretation is as follows. The present task is a categorization task that requires both generalization across diverse cats (or across diverse dogs) and discrimination of cats from dogs. Therefore, the optimal representations are not whole, unique objects (residing in rhinal cortex), but rather Shimon Ullman-esque intermediate complexity features (known to reside in IT) that allow generalization across distinct cats as well as discrimination of cats from dogs.

This is indeed a thought-provoking alternative explanation of our results. We have added a section to the Discussion to cover this topic (Discussion, second paragraph).

This accords with another interesting/complicating factor in the present task: owing to the distribution of perceptual features possessed by cats and dogs, most cats are a plausible subset of dogs, but most dogs are not a plausible subset of cats (Mareschal, French and Quinn, 2000; Mareschal, Quinn and French, 2002). Given this category asymmetry, having compromised IT representations (needed for generalization and discrimination) might lead to a bias toward classing cats as dogs, as seen in the data. One way to test this would be to replicate the study in a design without reward asymmetry, to see if the dog-bias (which could then only arise from inherent category asymmetry) still exists.

It has been suggested that the observation that cats are a plausible subset of dogs but not vice versa is driven by the greater variability among dogs. The studies the reviewer references were performed on human infants, with very small sample sets (12 stimuli per category). The training sets used in the present study were much larger (960 stimuli per category [subsection “Behavior”, second paragraph]); thus, the corresponding reduction in variability makes the ‘plausible subset’ interpretation unlikely. With regard to the potential effects on discriminability associated with bias, please refer back to our response to reviewer 1.

Authors: Lee and Rudebeck, 2010, implemented a task that required subjects to report whether drawings of stimuli were viable as 3D objects. The memory demands of that task are much more similar to those of the present study – the subjects classified stimuli into one of two categories. However, the task only tests 'perception' in an abstract sense; there remains a confound with cognitive load – the mental reconstruction of a 3-dimensional image from a 2-dimensional representation demands more than simple perception of the object as a whole.

Reviewer: Here, it is not clear how the authors' argument rebuts the reviewer's concern. Do the authors mean to equate "mental reconstruction of a 3-dimensional image from a 2-dimensional representation" with traditional conceptions of declarative memory (i.e., with the "memory" account of rhinal cortex function that is used to dismiss other findings of rhinal lesion-induced deficits)? This does not seem plausible. I would like to see either a different, more compelling reason for attributing the Lee et al. findings (both the 2005 and 2010 studies) to a "memory" deficit, or an alternative interpretation of the present results that can accommodate all of the data in a satisfying way.

We do not wish to equate ‘mental reconstruction of a 3D image’ with memory. Neither task places explicit demands on short-term memory. We merely note that this type of reconstruction demands more than simple perception of the object as a whole.

Authors: The reviewers suggest that the short-term memory demands imposed by oddity tasks are equivalent to the sensory/perceptual demands of the dynamic visual world. In the Lee et al., 2005 study.… the sum of the saccadic intervals between the different objects, and among features within each object, will be on the order of 100s of ms, during which information has to be actively held in some form of short-term memory.

Reviewer: Again, it is not clear how the authors' argument rebuts the reviewer's concern. If rhinal cortex lesions impair the ability to hold information in memory for ~100ms, this would have serious deleterious effects on perception of the dynamic visual world. For example, when a prime stimulus disrupts perception of a subsequent target stimulus, its effects can either blend with (boost) or be "discounted" from (detract from) perception of the target stimulus, depending on for how long the prime appears. The prime duration at which our perceptual systems tend to switch from blending to discounting is approximately 100-300ms (e.g., Huber, 2008). In other words, basic mechanisms of dynamic perception would be massively altered if the ability to maintain information for 100ms were lost. This is not typically how the experience of individuals with rhinal cortex lesions is characterized.

The critical distinction we draw is between the short-term memory demands placed on subjects by the carefully controlled behavioral tasks claimed to test monkey perception and the types of ‘short-term memory’ required for seamlessly processing visual information in a dynamic visual world: the former is an active, effortful process, while the latter is passive and seemingly effortless. As such, they will likely have very different neural substrates (see Wittig et al., 2016, and Wittig and Richmond, 2014, for a discussion of selective working memory vs. passive recency/novelty memory and their roles in monkey short-term memory).

Associated Data


    Supplementary Materials

    Figure 2—source data 1. Experiment 1 - learning to categorize morphed images.
    DOI: 10.7554/eLife.36310.005
    Figure 3—source data 1. Experiment 2 - asymptotic categorization performance.
    DOI: 10.7554/eLife.36310.009
    Figure 4—source data 1. Visual acuity testing.
    DOI: 10.7554/eLife.36310.011
    Figure 5—source data 1. Experiment 3 - categorization of visually degraded stimuli.
    DOI: 10.7554/eLife.36310.014
    Figure 6—source data 1. Experiment 4 - categorization of novel stimuli.
    DOI: 10.7554/eLife.36310.018
    Transparent reporting form
    DOI: 10.7554/eLife.36310.019

    Data Availability Statement

    All data generated or analysed during this study are included in the manuscript and supporting files.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
