Abstract
Visual object categorization is a critical task in our daily life. Many studies have explored category representation in the inferior temporal (IT) cortex at the level of single neurons and population. However, it is not clear how behavioral demands modulate this category representation. Here, we recorded from the IT single neurons in monkeys performing two different tasks with identical visual stimuli: passive fixation and body/object categorization. We found that category selectivity of the IT neurons was improved in the categorization compared with the passive task where reward was not contingent on image category. The category improvement was the result of larger rate enhancement for the preferred category and smaller response variability for both preferred and nonpreferred categories. These specific modulations in the responses of IT category neurons enhanced signal-to-noise ratio of the neural responses to discriminate better between the preferred and nonpreferred categories. Our results provide new insight into the adaptable category representation in the IT cortex, which depends on behavioral demands.
Keywords: object category, behavioral demand, visual perception, inferior temporal cortex
Visual object categorization organizes the infinite number of visual stimuli into meaningful groups; therefore, it is crucial to understanding the visual world and interacting with it. Understanding the cortical representation of the visual object categories is one of the most fundamental questions in object recognition studies. The inferior temporal (IT) cortex lies at the final stage of the ventral visual pathway and is the first visual area that contains neurons with selective category responses (Desimone et al. 1984; Logothetis and Sheinberg 1996; Perrett et al. 1982; Tanaka 1996). It is heavily connected with the prefrontal cortex, memory-related cortices, and association areas (Axmacher et al. 2008; Miyashita 1993; Webster et al. 1994). These properties of the IT cortex have made it a suitable area where detailed bottom-up visual information could be transformed into more abstract category-related information (Logothetis and Sheinberg 1996). The IT cortex contains neurons with selective responses to visual object categories such as faces and bodies (Desimone et al. 1984; Kiani et al. 2007; Kobatake and Tanaka 1994; Logothetis and Sheinberg 1996; Perrett et al. 1982; Tanaka 1996; Tsao et al. 2006). Object categories are also represented in the activity of the population of IT neurons (Hung et al. 2005; Kiani et al. 2007; Meyers et al. 2008). Functional MRI (fMRI) studies of the IT lobe of monkeys and humans have shown cortical patches responsive to a specific category such as faces or bodies (Downing et al. 2001; Kanwisher et al. 1997; Pinsk et al. 2005; Rajimehr et al. 2009; Tsao et al. 2003). We have previously shown that the activity of the category-selective neural clusters in the IT is causally linked to the perceptual categorization performance (Afraz et al. 2006).
The position of the IT cortex at the top of the visual hierarchy and its connections with areas involved in cognitive functions such as the prefrontal cortex (Webster et al. 1994) have made it a good candidate to study the effects of top-down nonvisual cues on sensory visual processing. The strength of top-down signals increases along the ventral visual pathway toward the IT cortex. It has been shown that attentional enhancement of firing rate gets larger and occurs earlier along the ventral stream, from V1 toward V2 and then V4 (Buffalo et al. 2010). Task demand is one of the nonvisual cues that could potentially modulate the IT neural responses. It has been shown that in the IT cortex, the same stimulus elicits different responses depending on the context and the task demands (Anderson and Sheinberg 2008; Fuster and Jervey 1981; Vogels et al. 1995). Richmond and Sato (1987) showed that the IT neural responses are enhanced during visual discrimination compared with the stimulus attention tasks. Sato (1988) reported a suppression of the IT neural responses in pattern discrimination compared with a passive viewing task. Koida and Komatsu (2007) found that in the color categorization and discrimination tasks, IT neural activity increased and decreased, respectively, compared with the passive viewing task. In contrast, Suzuki et al. (2006) found no difference in the IT neural responses at different stimulus discrimination levels.
These studies have explored the task-related modulation of the amplitude of the IT neural response to the visual stimuli. However, the effect of the task demands on the object category selectivity of the IT neurons has not been explored yet. Selectivity of the IT neurons to complex visual objects is a specific property of the IT neurons in the visual hierarchy and underlies our visual categorization ability. By a microstimulation study, we have previously shown that activity of the category-selective neural clusters in the IT is causally linked to the perceptual categorization performance (Afraz et al. 2006). The effect of microstimulation is correlated with the level of category selectivity of the neurons. These results suggest that category-selective neurons such as face and body neurons are linked to the visual categorization performance. Given the importance of our understanding about the neural mechanisms of visual object categorization, we designed the current experiment to study how behavioral needs modulate the category representation in the IT cortex. The monkeys in the current study performed two tasks with the same visual stimuli but different behavioral demands: passive fixation and categorization. By analyzing the spiking activity of the IT single neurons recorded while monkeys were performing these two tasks, we explored how category selectivity of IT neurons was modulated by the behavioral demands.
METHODS
Subjects.
Two male adult macaque monkeys were used in this experiment. Before training, the monkeys were prepared with head restraints and recording chambers implanted stereotaxically on the dorsal surface of their skull under aseptic conditions while monkeys were anesthetized with sodium pentobarbital. All experimental procedures were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. They were also approved by the animal care and use committee of Institute for Research in Fundamental Sciences (IPM).
Stimuli.
The stimuli were 7 × 7° grayscale photographs of body (human, monkey, and 4-leg subcategories) and object categories (aircraft, car, and chair subcategories). There were 90 images in each category (30 images per subcategory; Emadi and Esteky 2013). Body images had no facial features. Each image was presented at 4 different noise levels. Each noise level was generated by assigning a uniformly distributed grayscale value to X% of image pixels, where X was the absolute noise level and took 1 of the values 10, 30, 45, or 60. These 720 noisy images [(2 categories) × (90 images in each category) × (4 noise levels)] and 90 full-noise images (100% noise) were randomly presented to the monkeys without repetition. The noise pattern was fixed for each image across days but differed among stimuli. The stimuli were presented on a 19-in. cathode-ray tube (CRT) monitor (resolution: 800 × 600 pixels) placed 57 cm in front of the monkey seated in a primate chair.
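The noise procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `add_pixel_noise`, the list-of-rows image format, the 0 to 255 gray range, and the use of a per-image seed to keep the noise pattern fixed across days are all assumptions made for illustration.

```python
import random


def add_pixel_noise(image, noise_percent, seed=0):
    """Return a copy of a grayscale image with noise_percent% of pixels replaced.

    image: list of rows, each a list of gray values (hypothetical format).
    seed: fixes the noise pattern, so the same image and seed always give
          the same noisy stimulus (assumed mechanism for a fixed pattern).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    n_pixels = height * width
    n_noisy = round(n_pixels * noise_percent / 100)
    # Pick the pixels to overwrite, without replacement.
    noisy_positions = rng.sample(range(n_pixels), n_noisy)
    noisy = [row[:] for row in image]  # copy so the original is untouched
    for pos in noisy_positions:
        r, c = divmod(pos, width)
        noisy[r][c] = rng.randint(0, 255)  # uniformly distributed gray value
    return noisy
```

With this sketch, the four noise levels of one image would be obtained by calling the function with `noise_percent` set to 10, 30, 45, and 60, and the full-noise condition with 100.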
Tasks.
In the passive fixation task, monkeys passively viewed a sequence of images (Fig. 1A). Following 400 ms of fixation on a white fixation point at the center of the screen, a randomly selected sequence of images was presented to the monkey. Each image was presented for 70 ms with a variable interstimulus interval (850–950 ms). The monkey was rewarded with a drop of apple juice every 1.5–2 s as long as its gaze was fixated within a 2.4 × 2.4° invisible fixation window at the center of the screen. The sequence of images stopped when the monkey broke the gaze fixation, and the fixation point reappeared after a 1,500-ms blank interval. Stimuli from the broken trials were later randomly selected for re-presentation.
In the two-alternative forced choice body/object categorization task (Fig. 1B), the monkeys started a trial by fixating on a fixation point within a 2.4 × 2.4° window at the center of the screen. The fixation duration was randomly chosen from three values (350, 400, or 450 ms). This prestimulus phase was followed by a brief (70 ms) presentation of an image chosen pseudorandomly from the set of noisy body and object images. After a 500-ms delay, green and red response targets were presented 10° to the left and right of the screen center, respectively. The monkey was required to make a saccadic eye movement to one of the targets to indicate whether the image was a body or an object. Monkeys were trained to make the saccade no later than 300 ms after the target onset and to keep their gaze fixed within a 2.4 × 2.4° window on the saccade target for 150 ms. The assignment of body and object choices to the left and right targets was reversed between the two monkeys. Each correct response (i.e., when the image was correctly categorized by the monkey) was rewarded with a drop of apple juice. For full-noise stimuli (0% visual signal), the monkey was rewarded randomly in 50% of the trials. Correct and wrong trials were followed by 750- and 1,500-ms intertrial intervals, respectively.
To examine the unlikely event of whether the monkeys simply memorized all the images in the image set, a behavioral experiment was conducted after the completion of the training for each monkey. In these experiments, 40 novel images (20 bodies and 20 objects) were intermixed with 40 familiar body and object images (randomly chosen from the learned images). The stimuli were presented to the monkeys without any visual noise. The body/object categorization performance of the monkeys was measured in several behavioral sessions. In each session, we used a new set of novel images. The monkeys' performance for novel stimuli was not different from familiar stimuli from the very beginning of the behavioral sessions (monkey 1: familiar stimuli: 90.00% ± 2.75, novel stimuli: 83.33% ± 3.42, t-test, P = 0.13; monkey 2: familiar stimuli: 92.50% ± 2.41, novel stimuli: 86.67% ± 3.12, t-test, P = 0.14).
In each recording session, monkeys performed both the passive and categorization tasks. At the beginning of each recording session, the 810 stimuli in the image set were randomly divided into 9 blocks of 90 images separately for each task. Monkeys were presented with an interleaved order of passive and categorization blocks starting randomly with either of them in each recording session (Fig. 1C). Only recording sessions in which the monkeys successfully completed at least half of the trials in both tasks were included in the analysis. Median number of trials per category in both tasks was 360 (mean ± SE; passive: bodies = 342 ± 10, objects = 340 ± 9.5; categorization: bodies = 338 ± 10.3, objects = 336 ± 9.9). The eye position was measured by an infrared eye-tracking system.
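The block structure of a session can be sketched as follows. This is an illustrative reading of the description above, not the authors' procedure: the function names, the integer stimulus identifiers, and the strict alternation of task blocks are assumptions.

```python
import random


def make_blocks(rng, n_stimuli=810, block_size=90):
    """Shuffle stimulus indices and cut them into consecutive blocks."""
    order = list(range(n_stimuli))
    rng.shuffle(order)
    return [order[i:i + block_size] for i in range(0, n_stimuli, block_size)]


def session_schedule(seed=0):
    """Return a list of (task, block) pairs for one recording session.

    Each task gets its own random division of the 810 stimuli into 9 blocks.
    Blocks of the two tasks alternate, and the starting task is random.
    """
    rng = random.Random(seed)
    blocks = {
        "passive": make_blocks(rng),
        "categorization": make_blocks(rng),
    }
    first, second = rng.sample(["passive", "categorization"], 2)
    schedule = []
    for block_a, block_b in zip(blocks[first], blocks[second]):
        schedule.append((first, block_a))
        schedule.append((second, block_b))
    return schedule
```

Under these assumptions a session consists of 18 blocks, and every stimulus appears exactly once per task.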
Recording.
In a preparatory surgery, head restraints and recording chambers were stereotaxically implanted under aseptic conditions on the dorsal surface of the skull of the monkeys while the animals were anesthetized with sodium pentobarbital. Craniotomy was performed to record from the IT cortex of the monkeys. After recovery, single units were recorded with tungsten electrodes (FHC). The position of the recording chamber and the recording sites were defined by the stereotactic measurements and MRIs acquired before surgery. Subdivisions of the IT cortex were defined using the location of cortical sulci as illustrated by Tanaka and colleagues (Saleem et al. 2000; Saleem and Tanaka 1996; Tamura and Tanaka 2001). Recordings were made over the lower bank of superior temporal sulcus and TE cortices (12–18 and 13–20 mm anterior to interauricular line in monkeys 1 and 2, respectively). During each recording session, a single tungsten electrode was inserted into the IT cortex. The electrode was advanced with an Evarts-type manipulator (Narishige) from the dorsal surface of the brain through a stainless steel guide tube inserted into the brain down to 10–15 mm above the recording sites. After isolating units in real-time by a template matching algorithm (Wörgötter et al. 1986) while inserting the electrode into IT cortex, the monkeys were required to perform a short passive fixation task. In this task, 10 randomly selected noiseless images from each subcategory were presented. Each image was presented once for 70 ms with 630-ms blank interval in between. The peristimulus time histograms (PSTHs) of the units were plotted real-time and visually inspected. If the unit was visually responsive, monkeys were presented with the interleaved categorization and passive tasks. Spiking activity of the isolated single neurons was recorded extracellularly while monkeys were performing the tasks. 
A total of 95 visually responsive single neurons were recorded from 2 monkeys (74 from monkey 1 and 21 from monkey 2). Visual responsiveness was defined as significantly larger responses (in any of the sliding 100-ms windows from 100 to 400 ms after the stimulus onset) relative to the baseline activity (−50 to 50 ms relative to the stimulus onset) following the presentation of body or object images in at least 1 of the noise levels (t-test, α = 0.01).
Data analysis.
Based on the similar trend of monkeys' behavior and a similar pattern in other results, data from two monkeys were combined in all of the analyses. In the categorization task, we only analyzed the correctly completed trials. Trials with different levels of noise (10, 30, 45, and 60%) are used in all of the analyses unless otherwise mentioned. Because of the lack of category information in the fully noisy images (100% noise), this condition was excluded from the analyses. In all of the analyses, a 300-ms window from 100 to 400 ms after the stimulus onset was used. PSTHs were smoothed by convolving with a 30-ms Gaussian kernel. Before running any parametric tests, the normality of the distributions was confirmed by Kolmogorov-Smirnov test (P < 0.01 in all of the distributions). All of the statistical tests were 1-tailed, 1-sample t-tests unless otherwise mentioned. P values less than 0.00001 are shown as P < 10−5.
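As an illustration of the PSTH smoothing step, the sketch below convolves a binned PSTH with a normalized Gaussian kernel. It rests on assumptions not stated in the text: the 30 ms is taken as the kernel standard deviation, bins are 1 ms wide, the kernel is truncated at ±3 SD, and edge bins are renormalized. All function names are hypothetical.

```python
import math


def gaussian_kernel(sigma_ms, bin_ms=1.0, n_sigmas=3):
    """Normalized Gaussian weights spanning +/- n_sigmas standard deviations."""
    half = int(round(n_sigmas * sigma_ms / bin_ms))
    weights = [math.exp(-0.5 * (k * bin_ms / sigma_ms) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_psth(psth, sigma_ms=30.0, bin_ms=1.0):
    """Smooth a PSTH (one value per time bin) with a Gaussian kernel.

    Near the edges only the part of the kernel overlapping the data is used,
    and the weights are renormalized so that a flat PSTH stays flat.
    """
    kernel = gaussian_kernel(sigma_ms, bin_ms)
    half = len(kernel) // 2
    smoothed = []
    for i in range(len(psth)):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(psth):
                acc += w * psth[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed
```

If the 30 ms instead denotes the full kernel width, `sigma_ms` would need to be scaled accordingly.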
Analysis of the response modulation index.
Response modulation index (RMI) was used as a normalized index to examine the rate modulation in categorization task compared with the passive task:
$$\mathrm{RMI} = \frac{\mu(C) - \mu(P)}{\mu(C) + \mu(P)}$$
μ(C) and μ(P) were the averaged evoked responses of each neuron in the categorization (only correct trials) and passive tasks, respectively. Positive and negative values represent response enhancement and suppression in categorization compared with passive, respectively.
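A minimal sketch of the index follows, assuming the common contrast form (μ(C) − μ(P)) / (μ(C) + μ(P)), which is consistent with the sign convention described above. The function name and the per-trial input format are hypothetical.

```python
from statistics import mean


def response_modulation_index(categorization_rates, passive_rates):
    """Contrast index between mean evoked responses in the two tasks.

    Inputs are per-trial firing rates of one neuron (assumed format).
    Positive values mean enhancement in categorization, negative values
    mean suppression. The contrast form is an assumption.
    """
    mu_c = mean(categorization_rates)
    mu_p = mean(passive_rates)
    return (mu_c - mu_p) / (mu_c + mu_p)
```

For example, mean rates of 12 and 10 spikes/s in the categorization and passive tasks give an index of 2/22. Multiplying by 100 expresses the index as a percentage.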
Analysis of the response variability (Fano factor).
Fano factor (FF) was used as the index of response variability:
$$\mathrm{FF} = \frac{\sigma^{2}}{\mu}$$
σ² and μ were the variance and mean of the spike count, respectively. The FF was measured in 100-ms windows with 1-ms steps for the responses evoked by each category. The number of spikes was calculated in each window in each trial. Then, the FF within each window was computed as the ratio of variance in spike counts to mean spike count across all trials. For rate matching between two tasks in each neuron, the most common firing rate value, across all windows in both tasks, was selected as the matching rate. The FFs of all windows with firing rates ±10% of the selected rate were averaged in each task.
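The sliding-window FF and the rate-matching step can be sketched as below. Several details are assumptions made for illustration: spike times are given in milliseconds relative to stimulus onset, the sample variance is used, and the "most common firing rate" is found by rounding rates into 1-Hz bins. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean, variance


def spike_count(spike_times_ms, start_ms, stop_ms):
    """Count the spikes of one trial with start_ms <= t < stop_ms."""
    return sum(1 for t in spike_times_ms if start_ms <= t < stop_ms)


def fano_factor(trials, start_ms, stop_ms):
    """Variance / mean of spike counts across trials in one window."""
    counts = [spike_count(trial, start_ms, stop_ms) for trial in trials]
    mu = mean(counts)
    if mu == 0:
        return None  # FF is undefined when no spikes occurred
    return variance(counts) / mu  # sample variance (an assumption)


def sliding_windows(trials, first_ms, last_ms, width_ms=100, step_ms=1):
    """Return (rate_hz, ff) for every window that fits in [first_ms, last_ms]."""
    results = []
    start = first_ms
    while start + width_ms <= last_ms:
        counts = [spike_count(t, start, start + width_ms) for t in trials]
        ff = fano_factor(trials, start, start + width_ms)
        if ff is not None:
            results.append((mean(counts) * 1000.0 / width_ms, ff))
        start += step_ms
    return results


def rate_matched_ff(windows_by_task, tolerance=0.10, rate_bin_hz=1.0):
    """Average FF per task over windows whose rate is near the modal rate.

    windows_by_task maps a task name to a list of (rate_hz, ff) pairs.
    The modal rate is taken over all windows of both tasks after rounding
    rates into bins of rate_bin_hz (binning is an assumption).
    """
    all_rates = [rate for windows in windows_by_task.values()
                 for rate, _ in windows]
    binned = Counter(round(rate / rate_bin_hz) for rate in all_rates)
    target = binned.most_common(1)[0][0] * rate_bin_hz
    lo, hi = target * (1 - tolerance), target * (1 + tolerance)
    matched = {}
    for task, windows in windows_by_task.items():
        ffs = [ff for rate, ff in windows if lo <= rate <= hi]
        matched[task] = mean(ffs) if ffs else None
    return target, matched
```

Rate matching of this kind is meant to compare variability between tasks at comparable firing rates, because the FF itself can change with rate.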
d′.
d′ was used as an index of the neural category selectivity for bodies vs. objects:
$$d' = \frac{\mu(B) - \mu(O)}{\sqrt{\dfrac{\sigma^{2}(B) + \sigma^{2}(O)}{2}}}$$
μ(B) and μ(O) were the mean responses of each neuron to body and object images, respectively. σ²(B) and σ²(O) were the variances of the distributions of neural responses to body and object images, respectively. Positive and negative values represent body and object category selectivity, respectively. We measured d′ in a 300-ms window from 100 to 400 ms after the stimulus onset. For plotting d′ in time, d′ was measured in sliding 100-ms windows with 1-ms steps.
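A minimal sketch of the selectivity index follows, assuming the standard form in which the difference of the mean responses is divided by the square root of the average of the two variances. The function name, the per-trial input format, and the use of the sample variance are assumptions.

```python
import math
from statistics import mean, variance


def d_prime(body_responses, object_responses):
    """Category selectivity of one neuron for bodies vs. objects.

    Inputs are per-trial responses (e.g., spike counts in the 100-400 ms
    window). Positive values indicate body selectivity, negative values
    indicate object selectivity.
    """
    pooled_sd = math.sqrt(
        (variance(body_responses) + variance(object_responses)) / 2
    )
    return (mean(body_responses) - mean(object_responses)) / pooled_sd
```

With this definition, d′ grows either when the two mean responses move apart or when the trial-to-trial variances shrink.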
Classification.
A linear “support vector machine” was used to assess the neural performance in body vs. object categorization. The neural performance was measured by using the support vector machine functions from the Statistics Toolbox of MATLAB software. We used the sequential minimal optimization method to find the separating hyperplane. The input for each neuron was the response of the neuron to body and object categories in a 300-ms window from 100 to 400 ms after stimulus onset. In each round of classification, we randomly selected 75% of the trials of every neuron in each task for training the classifier. The classification performance of the neural population was tested on the remaining 25% of the trials. In all neurons, equal numbers of trials were used in each task and image category. Trials were pseudorandomly selected from the pool of trials in each condition. This procedure was repeated 100 times for each step of increasing number of neurons in Fig. 2G. Percentage of task-dependent categorization improvement in each cell group was measured as:
RESULTS
We trained two monkeys to perform two different tasks: passive fixation (called passive) and two-alternative forced choice body/object categorization (called categorization). In these two tasks, the same images were presented to the monkeys in two different behavioral contexts. In the passive task, monkeys passively viewed the images to get a reward (Fig. 1A), whereas in the categorization task they were required to categorize the images correctly as bodies or objects to get a reward (Fig. 1B). Images were selected from different body and object subcategories and were presented to the monkeys with different levels of added noise (Fig. 1, C and D). Blocks of these two tasks were presented to the monkeys in an interleaved order (Fig. 1E). As expected, monkeys' performance in the categorization task decreased as noise level increased (Fig. 1F). Performance for fully noisy images was 48.48% ± 0.60, which was different from chance (t-test, P = 0.016), indicating a small bias (1.52%) in this noise level toward object choice.
Response selectivity in categorization vs. passive viewing.
To study the behavior-related modulation of the IT category responses, we recorded spiking activity of 95 single IT neurons in monkeys performing the passive and categorization tasks. We measured d′, which is an indication of the signal-to-noise ratio of the neural responses (see methods) and captures the separation between the distributions of responses to body and object images in each neuron. Larger d′ values mean less overlap between these two distributions or higher signal-to-noise ratio for category selectivity. By measuring d′, we found 51 body and 44 object neurons using the responses during the categorization task (see methods). Absolute d′ values were higher in body compared with object neurons (2-sample, 2-tailed t-test, P = 0.0168; body neurons: n = 51, object neurons: n = 44).
To study the neural signature of the behavioral demands, we plotted d′ over time in two different tasks for exemplar body (C52) and object (C83) neurons measured in 100-ms sliding windows (Fig. 2, A and B). We observed that category selectivity was improved during the evoked response in the categorization task compared with the passive task, making each group of neurons more selective to their preferred categories (1,000 bootstrap replicates, t-test, 100–400 ms, P < 10−5 in both neurons). We studied these changes at the population level and observed similar results for body and object neurons [Fig. 2, C and D; t-test, 100 to 400 ms, P < 10−5 in body neurons (n = 51), P = 0.000011 in object neurons (n = 44)].
To better explore this modulation, we measured the difference between the d′ values in categorization and passive tasks in all of the single neurons. Figure 2E shows a positive correlation between the d′ modulation and d′ in single neurons (monkey 1: r = 0.71, P = 0.0003, n = 74; monkey 2: r = 0.72, P < 10−5, n = 21). In neurons with positive d′ (body selective), d′ improvement in categorization compared with passive task was 0.0488 ± 0.0102 (mean ± SE). We observed a significant d′ increase in 38 out of 51 (74.5%) body neurons (t-test, α = 0.05, 1,000 bootstraps). In neurons with negative d′ (object selective), d′ modulation was −0.0552 ± 0.0105. We found a significant d′ decrease in 33 out of 44 (75%) object neurons (t-test, α = 0.05, 1,000 bootstraps). These findings indicate that the selectivity of 74.5% of body neurons and 75% of object neurons to their preferred categories was improved in the categorization task. Improvement of category selectivity of the single neurons during the categorization task is consistent with the monkeys' behavioral demand for a higher categorization performance to obtain more reward.
To explore the effect of task difficulty, we measured the d′ modulation in each noise level separately. The modulation was 0.0223 ± 0.0171 in 10%, 0.0444 ± 0.0150 in 30%, 0.0476 ± 0.0179 in 45%, and 0.0625 ± 0.0153 in 60% noise levels. Figure 2F shows a nonsignificant trend for larger d′ modulation in more ambiguous conditions (Pearson correlation, r = 0.0881, P = 0.0872; n = 95).
Category information at the population level in categorization vs. passive viewing.
To see how category representation at the population level was modulated between these two tasks, a linear classifier was trained on the population of neurons to discriminate body and object images in each task. Figure 2G shows the performance of the classifier as a function of the number of body and object neurons for categorizing images into bodies and objects in each task. Increasing the number of neurons enhanced the categorization performance from chance level (50%) to ∼70% for body neurons in categorization task. This performance was close to the monkeys' averaged behavioral performance (72.9%). Body neurons showed higher categorization performance compared with object neurons in both tasks (44 neurons of each group; t-test, P < 0.00001 in both tasks, 100 bootstraps). Performance in both groups of neurons was larger in categorization compared with passive task (significance line in Fig. 2G; t-test, α = 0.05, 100 bootstraps). To compare this task-dependent categorization improvement between body and object neurons, we measured the percentage of performance modulation for both cell groups (44 neurons of each group; see also methods). These measurements revealed that performance modulation in body neurons (11.67% ± 2.77) was significantly larger than object neurons (6.24% ± 2.89; t-test, P = 0.021, 100 bootstraps; n = 44).
Firing rate in categorization vs. passive viewing.
By definition, d′ takes both mean and variability of the responses into account. Improvement of d′ in the categorization task can be the result of signal increase (rate enhancement) and/or noise decrease (response variability decrease).
To explore the modulation of the firing rate, we plotted the response of the exemplar body neuron to the preferred and nonpreferred categories in passive (Fig. 3A, left) and categorization (Fig. 3A, right) tasks. We observed that the response difference of this neuron to the preferred vs. nonpreferred category was enhanced during the categorization task (1,000 bootstraps, t-test, P < 10−5). Figure 3B shows similar results for the exemplar object neuron (1,000 bootstraps, t-test, P < 10−5). We also plotted the averaged response of the population of body (Fig. 3C) and object (Fig. 3D) neurons to their preferred and nonpreferred categories in different tasks and observed similar results at the population level.
To quantify this observation, we measured an RMI in each neuron for different image categories (see methods). Positive RMI values were indicative of response enhancement in the categorization task compared with the passive task. We measured RMI values of the body neurons for body images (4.97% ± 1.09) and object images (1.93% ± 1.11). RMI in the population of body neurons for both preferred and nonpreferred categories was significantly larger than zero (t-test, preferred category: P = 0.000016, nonpreferred category: P = 0.043; n = 51). RMI of the population of object neurons for both body (3.92% ± 1.18) and object images (7.72% ± 1.40) was also significantly larger than zero (t-test, preferred category: P = 0.000001, nonpreferred category: P = 0.00095; n = 44).
We also compared the RMI values for the preferred and nonpreferred categories. RMI difference was measured as RMI for preferred category minus RMI for nonpreferred category. The results showed that RMI values were larger for preferred than nonpreferred categories in both cell groups at the population level (Fig. 3E; body neurons: ΔRMI = 3.05% ± 0.77; t-test; monkey 1: P = 0.0009, n = 74; monkey 2: P = 0.0295, n = 21; both monkeys: P = 0.00011, n = 95; object neurons: ΔRMI = 3.80% ± 0.84; monkey 1: P = 0.0003, n = 74; monkey 2: P = 0.0188, n = 21; both monkeys: P = 0.000024, n = 95). These findings indicate that during the categorization task, with its higher level of task engagement compared with the passive task, responses to both image categories were enhanced. However, this enhancement was specific in that it was larger for the preferred category, resulting in improved category selectivity at the level of single neurons. This effect was not significantly different between the two groups of body and object neurons (t-test, P = 0.74; body neurons: n = 51, object neurons: n = 44). At the level of single neurons, 38 body (74.51%) and 36 object (81.82%) neurons showed significantly larger rate modulation for their preferred than nonpreferred category (t-test, α = 0.05, 1,000 bootstraps).
Response variability in categorization vs. passive viewing.
The variability of the neural response to different members of each category is an important factor in measuring the d′ or how well a neuron could discriminate between two categories. We plotted the FF of the exemplar body (C52; Fig. 4A) and object (C83; Fig. 4B) neurons across time, measured in 100-ms sliding windows (see methods). We observed a decrease in FF after the stimulus presentation in both neurons, which was stronger in the categorization compared with the passive task. We also plotted the FF in time for the population of neurons and found lower FF in the categorization compared with the passive task later during the evoked response (Fig. 4C, significance line; t-test, α = 0.05). To better explore this effect, we measured the rate-matched FF in each single neuron during the evoked responses (see methods). Figure 4D shows FF of the single neurons in the categorization vs. passive task. Thirty-five body (68.6%) and twenty-five object (56.8%) neurons showed significantly lower FF during the categorization task (t-test, α = 0.05, 1,000 bootstraps). The modulation of the FF was 0.1044 ± 0.0297 in body and 0.0460 ± 0.0366 in object neurons. The decrease of FF in the categorization compared with the passive task was a small but significant effect in the population (t-test; monkey 1: P = 0.0275, n = 74; monkey 2: P = 0.0086, n = 21; both monkeys: P = 0.0013, n = 95). This effect was not significantly different between body and object neurons (2-tailed, 2-sample t-test, P = 0.15; body neurons: n = 51, object neurons: n = 44). We also compared the FF modulation (categorization FF minus passive FF) for body and object images separately and found no difference in the population for FF modulation in these 2 image groups (2-tailed, 1-sample t-test, P = 0.14; n = 95). This finding suggests that the variability of the response to different members of both the preferred and nonpreferred categories was decreased during the categorization compared with the passive task.
The results of RMI and FF analysis suggest that the larger d′ values in the categorization task are related to the larger rate enhancement for the preferred category besides smaller response variability for both preferred and nonpreferred categories in the categorization task. Figure 4E summarizes our findings and shows how category representation improves in the categorization task. Larger rate enhancement for the preferred category and the smaller response variability for both preferred and nonpreferred categories result in a greater separation of the signal and noise distributions. This enhances the category discriminability of the neural response during the categorization task, which is consistent with the monkeys' behavioral demand.
DISCUSSION
By comparing the activity of IT neurons in monkeys performing passive fixation and body/object categorization tasks, we found that behavioral demand modulates category representation in the IT cortex. Category selectivity was improved in the categorization task at the level of both the single neurons and the population. The category representation improvement resulted from two different mechanisms. First, the rate enhancement observed in the categorization task was larger for the preferred category. Second, the response variability to both preferred and nonpreferred categories decreased in categorization compared with the passive task.
The effect of the task demand on the response amplitude of the IT neurons has been explored previously. The results show increase, decrease, or no modulations in the responses of IT neurons depending on the task in hand (Anderson and Sheinberg 2008; Fuster and Jervey 1981; Koida and Komatsu 2007; Richmond and Sato 1987; Sato 1988; Suzuki et al. 2006; Vogels et al. 1995). We observed that during the categorization task with higher levels of task engagement compared with the passive task, firing rates were enhanced for both preferred and nonpreferred categories. More importantly, we found that this rate enhancement was larger for the preferred category compared with the nonpreferred category. This specific rate modulation results in category selectivity enhancement at the level of single neurons. It is possible that the general increase in the responses to the preferred and nonpreferred categories is related to the enhanced alertness in the categorization task, whereas the selective larger response enhancement to the preferred category is related to the task-specific visual attention.
Few studies have explored the effect of the top-down signals on the reliability of the neural responses in the ventral visual pathway. It was previously shown that response reliability of the V1 and V4 neurons is not affected by top-down signals (McAdams and Maunsell 1999; Wiener et al. 2001). However, more recent studies have reported that the response reliability of the V4 neurons is improved by attention-related top-down signals (Cohen and Maunsell 2009; Mitchell et al. 2007; Noudoost and Moore 2011). We found that the IT response reliability increased in the categorization compared with the passive task, which is consistent with the more recent reports on the effect of top-down signals on the response reliability of V4 neurons. However, our result extends those earlier findings in three respects. First, we report how response reliability changes with the behavioral demands of two different tasks, whereas previous studies reported attentional modulation within the same task (Cohen and Maunsell 2009; Mitchell et al. 2007). Second, we measured response reliability modulation in the IT cortex. Third, we measured the variability of the responses of IT neurons to a richer stimulus set in which each image exemplar of a category was presented only once. All of the previous studies have explored the variability of the responses to the repeated presentations of one single stimulus. More reliable response to different members of each category decreases the overlap between the response distributions to those categories and enhances the category discriminability.
We found that the amplitude and the variability of the responses were modulated in a direction that led to higher discriminability. There was a larger rate enhancement for the preferred category, with a similar reliability enhancement for both categories. A larger distance between the mean responses to the two categories (larger between-categories response difference) and more reliable responses to the members of each category (smaller within-category response difference) both enhance the discrimination between the two categories and improve the category selectivity of the neural responses. It has been shown that engagement in a spatial attention task decorrelates the ongoing fluctuations of neural population activity in macaque area V4 (Mitchell et al. 2009). This decorrelation reduces the response variability caused by task-irrelevant intrinsic cortical activity and thereby enhances the signal-to-noise ratio. Our finding of a similar decline in the variability of the responses to the preferred and nonpreferred categories could be related to a similar mechanism.
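The way these two modulations combine can be illustrated with a simple discriminability index. The following is a minimal sketch, not the analysis used in this study: it assumes a d′-style index (difference between the category means divided by the pooled standard deviation), and the function name `d_prime` and all firing rates are hypothetical values chosen only to mimic the pattern described above.

```python
import math
from statistics import mean, variance


def d_prime(preferred, nonpreferred):
    """d'-style index: mean difference over the pooled standard deviation.

    Each argument is a list of firing rates (spikes/s), one value per
    image exemplar of that category.
    """
    between = mean(preferred) - mean(nonpreferred)            # between-categories difference
    pooled_var = (variance(preferred) + variance(nonpreferred)) / 2.0  # within-category spread
    return between / math.sqrt(pooled_var)


# Hypothetical single-neuron responses (made-up numbers, not recorded data).
passive_pref = [18.0, 26.0, 14.0, 30.0, 22.0]
passive_nonpref = [6.0, 14.0, 10.0, 2.0, 18.0]

# Categorization task: both rates go up, more so for the preferred category,
# and the spread across exemplars shrinks for both categories.
categ_pref = [28.0, 32.0, 26.0, 34.0, 30.0]
categ_nonpref = [10.0, 14.0, 12.0, 8.0, 16.0]

print("passive d':", d_prime(passive_pref, passive_nonpref))
print("categorization d':", d_prime(categ_pref, categ_nonpref))
```

In this toy example the index is larger in the categorization condition for two reasons: the numerator grows because the preferred-category mean rises more than the nonpreferred mean, and the denominator shrinks because the within-category variance is smaller for both categories.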
It is important to note that the task-dependent modulation reported here is different from the change in the response selectivity of IT neurons that results from learning and training (De Baene et al. 2008; Freedman et al. 2006; Kobatake et al. 1998; Sigala 2004; Sigala and Logothetis 2002; Vogels 1999). Here, monkeys switched between the two tasks several times in a session, and the category representation changed flexibly from one pattern to the other, consistent with the changes in behavioral demands within the session.
The origin of the modulation in the category representation could be a “hierarchical” top-down feedback about the task rule from downstream areas like the prefrontal cortex (Muhammad et al. 2006). This modulation could be the result of changes in synaptic weights and functional connectivity within IT and also between IT and upstream/downstream areas. In our experiment, monkeys were well trained to discriminate bodies from other objects. It has been shown that there is a specialized neural system for the representation of bodies, which are among the most important visual stimuli that we perceive (Downing et al. 2001). Learning a task is a top-down guided process (Ahissar and Hochstein 2004). According to the “reverse hierarchy theory,” in a highly trained subject the ability to perform a task could progress backward to lower levels with a better signal-to-noise ratio (Ahissar and Hochstein 2004). In our well-trained monkeys, it is possible that two synaptic/connectivity states, one for each task, have been established in the IT cortex. The top-down signal might be necessary only to switch the IT from one state to the other when the task changes, with the IT remaining in that state until the next task switch. Understanding the origin of the task-related signals is a challenging topic that requires further studies specifically designed to address it.
In conclusion, we found that selective rate enhancement and lower response variability in the categorization task compared with the passive task improve category representation in the IT cortex. Our findings suggest that modulation of the category selectivity of the same population of IT neurons could account for the behavioral demands of different tasks. These results are supported by previous studies indicating that the information for two different tasks (categorization and identification) exists in the same neural population (Hung et al. 2005; Sugase et al. 1999). The present results provide new insight into the adaptable category representation in the IT cortex, which depends on behavioral demands, and into the neural mechanisms underlying visual object categorization.
GRANTS
This work was supported in part by a grant from the Iranian National Science Foundation (INSF).
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
AUTHOR CONTRIBUTIONS
N.E. and H.E. conception and design of research; N.E. performed experiments; N.E. analyzed data; N.E. and H.E. interpreted results of experiments; N.E. prepared figures; N.E. drafted manuscript; N.E. and H.E. edited and revised manuscript; N.E. and H.E. approved final version of manuscript.
ACKNOWLEDGMENTS
We thank Reza Rajimehr for comments on the manuscript.
REFERENCES
- Afraz SR, Kiani R, Esteky H. Microstimulation of inferotemporal cortex influences face categorization. Nature 442: 692–695, 2006.
- Ahissar M, Hochstein S. The reverse hierarchy theory of visual perceptual learning. Trends Cogn Sci 8: 457–464, 2004.
- Anderson B, Sheinberg DL. Effects of temporal context and temporal expectancy on neural activity in inferior temporal cortex. Neuropsychologia 46: 947–957, 2008.
- Axmacher N, Schmitz DP, Wagner T, Elger CE, Fell J. Interactions between medial temporal lobe, prefrontal cortex, and inferior temporal regions during visual working memory: a combined intracranial EEG and functional magnetic resonance imaging study. J Neurosci 28: 7304–7312, 2008.
- Buffalo EA, Fries P, Landman R, Liang H, Desimone R. A backward progression of attentional effects in the ventral stream. Proc Natl Acad Sci USA 107: 361–365, 2010.
- Cohen MR, Maunsell JH. Attention improves performance primarily by reducing interneuronal correlations. Nat Neurosci 12: 1594–1600, 2009.
- De Baene W, Ons B, Wagemans J, Vogels R. Effects of category learning on the stimulus selectivity of macaque inferior temporal neurons. Learn Mem 15: 717–727, 2008.
- Desimone R, Albright TD, Gross CG, Bruce C. Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci 4: 2051–2062, 1984.
- Downing PE, Jiang Y, Shuman M, Kanwisher N. A cortical area selective for visual processing of the human body. Science 293: 2470–2473, 2001.
- Emadi N, Esteky H. Neural representation of ambiguous visual objects in the inferior temporal cortex. PLoS One 8: e76856, 2013.
- Freedman DJ, Riesenhuber M, Poggio T, Miller EK. Experience-dependent sharpening of visual shape selectivity in inferior temporal cortex. Cereb Cortex 16: 1631–1644, 2006.
- Fuster JM, Jervey JP. Inferotemporal neurons distinguish and retain behaviorally relevant features of visual stimuli. Science 212: 952–955, 1981.
- Goodale MA, Milner AD. Separate visual pathways for perception and action. Trends Neurosci 15: 20–25, 1992.
- Hung CP, Kreiman G, Poggio T, DiCarlo JJ. Fast readout of object identity from macaque inferior temporal cortex. Science 310: 863–866, 2005.
- Kanwisher N, McDermott J, Chun MM. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci 17: 4302–4311, 1997.
- Kiani R, Esteky H, Mirpour K, Tanaka K. Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97: 4296–4309, 2007.
- Kobatake E, Tanaka K. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. J Neurophysiol 71: 856–867, 1994.
- Kobatake E, Wang G, Tanaka K. Effects of shape-discrimination training on the selectivity of inferotemporal cells in adult monkeys. J Neurophysiol 80: 324–330, 1998.
- Koida K, Komatsu H. Effects of task demands on the responses of color-selective neurons in the inferior temporal cortex. Nat Neurosci 10: 108–116, 2007.
- Logothetis NK, Sheinberg DL. Visual object recognition. Annu Rev Neurosci 19: 577–621, 1996.
- McAdams CJ, Maunsell JH. Effects of attention on the reliability of individual neurons in monkey visual cortex. Neuron 23: 765–773, 1999.
- Meyers EM, Freedman DJ, Kreiman G, Miller EK, Poggio T. Dynamic population coding of category information in inferior temporal and prefrontal cortex. J Neurophysiol 100: 1407–1419, 2008.
- Mitchell JF, Sundberg KA, Reynolds JH. Differential attention-dependent response modulation across cell classes in macaque visual area V4. Neuron 55: 131–141, 2007.
- Mitchell JF, Sundberg KA, Reynolds JH. Spatial attention decorrelates intrinsic activity fluctuations in macaque area V4. Neuron 63: 879–888, 2009.
- Miyashita Y. Inferior temporal cortex: where visual perception meets memory. Annu Rev Neurosci 16: 245–263, 1993.
- Muhammad R, Wallis JD, Miller EK. A comparison of abstract rules in the prefrontal cortex, premotor cortex, inferior temporal cortex, and striatum. J Cogn Neurosci 18: 974–989, 2006.
- Noudoost B, Moore T. Control of visual cortical signals by prefrontal dopamine. Nature 474: 372–375, 2011.
- Perrett DI, Rolls ET, Caan W. Visual neurones responsive to faces in the monkey temporal cortex. Exp Brain Res 47: 329–342, 1982.
- Pinsk MA, DeSimone K, Moore T, Gross CG, Kastner S. Representations of faces and body parts in macaque temporal cortex: a functional MRI study. Proc Natl Acad Sci USA 102: 6996–7001, 2005.
- Rajimehr R, Young JC, Tootell RB. An anterior temporal face patch in human cortex, predicted by macaque maps. Proc Natl Acad Sci USA 106: 1995–2000, 2009.
- Richmond BJ, Sato T. Enhancement of inferior temporal neurons during visual discrimination. J Neurophysiol 58: 1292–1306, 1987.
- Saleem KS, Suzuki W, Tanaka K, Hashikawa T. Connections between anterior inferotemporal cortex and superior temporal sulcus regions in the macaque monkey. J Neurosci 20: 5083–5101, 2000.
- Saleem KS, Tanaka K. Divergent projections from the anterior inferotemporal area TE to the perirhinal and entorhinal cortices in the macaque monkey. J Neurosci 16: 4757–4775, 1996.
- Sato T. Effects of attention and stimulus interaction on visual responses of inferior temporal neurons in macaque. J Neurophysiol 60: 344–364, 1988.
- Sigala N. Visual categorization and the inferior temporal cortex. Behav Brain Res 149: 1–7, 2004.
- Sigala N, Logothetis NK. Visual categorization shapes feature selectivity in the primate temporal cortex. Nature 415: 318–320, 2002.
- Sugase Y, Yamane S, Ueno S, Kawano K. Global and fine information coded by single neurons in the temporal visual cortex. Nature 400: 869–873, 1999.
- Suzuki W, Matsumoto K, Tanaka K. Neuronal responses to object images in the macaque inferotemporal cortex at different stimulus discrimination levels. J Neurosci 26: 10524–10535, 2006.
- Tamura H, Tanaka K. Visual response properties of cells in the ventral and dorsal parts of the macaque inferotemporal cortex. Cereb Cortex 11: 384–399, 2001.
- Tanaka K. Inferotemporal cortex and object vision. Annu Rev Neurosci 19: 109–139, 1996.
- Tsao DY, Freiwald WA, Knutsen TA, Mandeville JB, Tootell RB. Faces and objects in macaque cerebral cortex. Nat Neurosci 6: 989–995, 2003.
- Tsao DY, Freiwald WA, Tootell RB, Livingstone MS. A cortical region consisting entirely of face-selective cells. Science 311: 670–674, 2006.
- Vogels R. Categorization of complex visual images by rhesus monkeys. Part 2: single-cell study. Eur J Neurosci 11: 1239–1255, 1999.
- Vogels R, Sary G, Orban GA. How task-related are the responses of inferior temporal neurons? Vis Neurosci 12: 207–214, 1995.
- Webster MJ, Bachevalier J, Ungerleider LG. Connections of inferior temporal areas TEO and TE with parietal and frontal cortex in macaque monkeys. Cereb Cortex 4: 470–483, 1994.
- Wiener MC, Oram MW, Liu Z, Richmond BJ. Consistency of encoding in monkey visual cortex. J Neurosci 21: 8210–8221, 2001.
- Wörgötter F, Daunicht WJ, Eckmiller R. An on-line spike form discriminator for extracellular recordings based on an analog correlation technique. J Neurosci Methods 17: 141–151, 1986.