Abstract
Neurons in macaque inferotemporal cortex (ITC) respond less strongly to familiar than to novel images. It is commonly assumed that this effect arises within ITC because its neurons respond selectively to complex images and thus encode in an explicit form information sufficient for identifying a particular image as familiar. However, no prior study has examined whether neurons in low-order visual areas selective for local features also exhibit familiarity suppression. To address this issue, we recorded from neurons in macaque area V2 with semichronic microelectrode arrays while monkeys repeatedly viewed a set of large complex natural images. We report here that V2 neurons exhibit familiarity suppression. The effect develops over several days with a trajectory well fitted by an exponential function with a rate constant of ∼100 exposures. Suppression occurs in V2 at a latency following image onset shorter than its reported latency in ITC.
SIGNIFICANCE STATEMENT Familiarity suppression, the tendency for neurons to respond less strongly to familiar than novel images, is well known in monkey inferotemporal cortex. Suppression has been thought to arise in inferotemporal cortex because its neurons respond selectively to large complex images and thus explicitly encode information sufficient for identifying a particular image as familiar. No previous study has explored the possibility that familiarity suppression occurs even in early-stage visual areas where neurons are selective for simple features in confined receptive fields. We now report that neurons in area V2 exhibit familiarity suppression. This finding challenges our current understanding of information processing in V2 as well as our understanding of the mechanisms that underlie familiarity suppression.
Keywords: familiarity, macaque, suppression, V2
Introduction
In macaque inferotemporal cortex (ITC), the population response to an image rendered familiar by long-term experience begins at normal strength but is suppressed shortly after onset, a phenomenon termed familiarity suppression (Meyer et al., 2014). Studies of familiarity suppression typically employ complex natural images rendered familiar by hundreds (Fahy et al., 1993; Sobotka and Ringo, 1993; Xiang and Brown, 1998; Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson et al., 2008; Woloszyn and Sheinberg, 2012; Meyer et al., 2014) or thousands (Woloszyn and Sheinberg, 2012) of exposures imposed over the course of weeks (Freedman et al., 2006; Meyer et al., 2014) or months (Fahy et al., 1993; Sobotka and Ringo, 1993; Xiang and Brown, 1998; Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson et al., 2008; Woloszyn and Sheinberg, 2012). The effect is evident regardless of whether exposure involves active discrimination (Fahy et al., 1993; Sobotka and Ringo, 1993; Xiang and Brown, 1998; Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson et al., 2008; Woloszyn and Sheinberg, 2012) or passive viewing (Freedman et al., 2006; Meyer et al., 2014) and regardless of whether subsequent testing involves active discrimination (Fahy et al., 1993; Sobotka and Ringo, 1993; Xiang and Brown, 1998; Mruczek and Sheinberg, 2007) or passive viewing (Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson et al., 2008; Woloszyn and Sheinberg, 2012; Meyer et al., 2014).
Three ideas have been put forward with regard to behavioral or perceptual advantages that might arise from familiarity suppression. First, reduction of population response could serve as a signal allowing detection of an image as familiar. Support for this notion has come from experiments requiring monkeys to detect repetition of an image. Suppression is more pronounced when the image is detected as a repeat than when it is not (Meyer and Rust, 2016). Second, reduction of the population response could underlie better discrimination of the familiar image. This is consonant with the observation that familiarity suppression in ITC is especially pronounced for nonpreferred images, with the consequence that neuronal tuning is sharper and the population representation is sparser for familiar than for novel images (Freedman et al., 2006; Woloszyn and Sheinberg, 2012). However, behavioral evidence for improved processing has been obtained only under conditions of explicit training as distinct from passive viewing (Rainer and Miller, 2000; Rainer et al., 2004). Finally, familiarity suppression might underlie the reduced salience of familiar as compared with novel images. Monkeys, like humans, spend less time gazing at familiar than at novel images (Jutras and Buffalo, 2010; Ghazizadeh et al., 2016). Moreover, familiar distractors are less effective than novel distractors in a visual search task after extensive training requiring monkeys to ignore the distractors (Mruczek and Sheinberg, 2007).
Familiarity suppression commonly is assumed to originate in ITC because ITC neurons have large receptive fields capable of encompassing an entire image and exhibit selectivity for particular complex images (Tanaka et al., 1991). Thus they represent in explicit form information that would allow identifying an image as familiar. Familiarity suppression in high-order areas downstream from ITC, including perirhinal cortex (Fahy et al., 1993; Xiang and Brown, 1998), entorhinal cortex (Fahy et al., 1993; Xiang and Brown, 1998), dorsolateral prefrontal cortex (Rainer and Miller, 2000), and the hippocampus (Fahy et al., 1993; Xiang and Brown, 1998), could arise through propagation from ITC. The assumption that familiarity suppression is mediated by neurons selective for complex images, however, is not necessarily justified. Low-order areas upstream from ITC, such as V1 and V2, contain neurons that are individually selective for simple local features and yet, as a population, must uniquely encode the identity of each complex image. It is conceivable that population coding as embodied in these areas is sufficient to support familiarity suppression. To investigate this possibility, we monitored the activity of V2 neurons with semichronic electrode arrays while monkeys repeatedly viewed images representing complex artificial and natural objects.
Materials and Methods
Subjects.
Two adult rhesus macaques (Macaca mulatta) participated in the study (Monkey G, an 8.5 kg female; Monkey L, an 11.1 kg male). All experimental procedures were approved by the Carnegie Mellon University Institutional Animal Care and Use Committee and were in compliance with the guidelines set forth in the NIH Guide for the Care and Use of Laboratory Animals.
Images.
The images represented natural and man-made objects against a blank background with a resolution of 150 × 150 pixels. When presented on a CRT monitor at a viewing distance of 57 cm, each image subtended 6.5° of visual angle along whichever axis, vertical or horizontal, was longer.
Task.
Each trial began with attainment of fixation on a central spot. After a delay of 300 ms, an image appeared in superimposition on the aggregate receptive field of the recorded V2 neurons. The image was visible for 500–800 ms. After an additional 200 ms, the fixation spot jumped to one of four peripheral locations distributed around the clock at 90° intervals. Liquid reward was delivered upon completion of a saccade to the spot at its new location. Eye position was monitored continuously with an infrared optical eye tracking system sampling at 120 Hz (ISCAN). A trial was aborted without reward if, at any point before delivery of reward, the monkey failed to maintain fixation within a central window spanning 0.6°–0.8°. The sequence of images across trials was random except for the constraint that each image appear once in each block of trials. In the typical session using 25 familiar images and 25 session-specific novel images, each block of 50 trials contained one instance of each image.
Semichronic microelectrode recording.
Recording simultaneously from multiple neurons was critical to success of the study. It allowed us to average out noise due to the image selectivity of individual neurons recorded on a given day when comparing responses to familiar and novel images on that day. Averaging across days would likewise have eliminated noise but would have prevented tracking the trajectory with which familiarity suppression developed. We monitored neuronal activity through an SC32-1 array, a modular, replaceable micromanipulator system allowing independent bidirectional control of 32 microelectrodes arranged in a square array with 1.5 mm inter-electrode spacing (Gray Matter Research). The array was implanted over the intact dura above the occipital operculum with its center roughly at the border between areas V2 and V1. A screw-driven mechanism allowed independent bidirectional control of the depth of each electrode over a range of 16 mm with an accuracy of ∼15 μm. This provided sufficient control to isolate the spiking activity of individual neurons. The location of the tip of each electrode remained relatively stable across multiple days as evidenced by consistency in the pattern of neuronal selectivity for familiar images. However, the precise identity of the recorded neurons probably varied across successive days.
Sequence of sessions.
We performed six experiments. Each experiment consisted of multiple sessions occupying many but not all days of the full experimental period (Table 1, row 2). The experiments had in common two critical features: (1) during numerous “familiarization” sessions (Table 1, row 3), we exposed the monkey to the 25 images in the experiment-unique familiarization set and (2) during a subset of these sessions which we term “F–N” sessions (Table 1, row 6), we monitored neuronal responses while presenting, on interleaved trials, not only the 25 familiar images but also 25 session-unique novel images. These critical commonalities allowed us to combine data across experiments to analyze the dependence of familiarity suppression (as measured during each F–N session) on the total number of prior exposures to the familiar images (as received during all preceding familiarization sessions). Other aspects of design varied unavoidably from experiment to experiment. The variability arose from factors impossible to control in a multiday experiment. The monkey's level of motivation on a given day influenced the number of exposures to the familiar images that could be achieved on that day. Likewise, our estimate of the monkey's level of motivation determined whether, on a given day, we strove to complete a brief session involving exposure to only familiar images or a prolonged session involving interleaved presentation of familiar and novel images together with neuronal data collection. Having established, in early experiments, that familiarity suppression occurred robustly, we introduced, in late experiments, certain manipulations designed to elucidate the dependence of the phenomenon on the properties of the images. These included “aperture/full-view” tests (Table 1, row 11) and “repeated novel” tests (Table 1, row 12).
Table 1.
Experiment | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
1. Monkey | GRH | LRH | LRH | LRH | LRH | GLH |
2. Duration in days | 30 | 35 | 10 | 20 | 32 | 15 |
3. No. of familiarization days | 12 | 22 | 9 | 10 | 15 | 11 |
4. No. of familiarization exposures | 285 | 282 | 195 | 147 | 248 | 191 |
5. No. of V2 electrodes | 7 | 10 | 23 | 26 | 25 | 30 |
6. No. of F–N sessions | 6 | 17 | 8 | 7 | 10 | 9 |
7. No. of late F–N sessions | 5 | 15 | 6 | 5 | 8 | 7 |
8. No. of neurons in late F–N sessions | 21 | 110 | 125 | 133 | 194 | 197 |
9. Mean suppression index in late sessions | 0.068 | 0.13 | 0.073 | 0.050 | 0.056 | 0.028 |
10. Suppression significant at p < 0.0001 | Yes | Yes | Yes | Yes | Yes | Yes |
11. No. of aperture/full-view sessions | — | — | 3 | 4 | 1 | 5 |
12. No. of repeated novel sessions | — | 5 | 2 | 3 | 3 | 4 |
Row 1, This indicates in which of two monkeys the experiment was conducted (G or L) and in which hemisphere (LH: left or RH: right). Row 2, The duration of the entire period during which exposure and recording were carried out. Row 3, The number of daily sessions in which the monkey was given exposure to the familiar images up to and including the final F–N session (row 6). Row 4, The number of times the monkey saw each familiar image across all days indicated in row 3. Row 5, The number of electrodes yielding V2 activity at some point during the experiment. Row 6, The number of sessions in which neuronal activity was monitored during interleaved exposure to the familiar image set and a session-unique novel image set. Row 7, The number of F–N sessions (row 6) in the phase of each experiment consisting of the first session in which familiarity suppression was statistically significant (p < 0.05, signed rank test, with number of observations in each category equal to the number of neurons recorded during the session) and all subsequent sessions. Data from these blocks formed the database for the analysis of the latency of familiarity suppression. Row 8, The number of neurons recorded in late F–N sessions (row 7). Neurons recorded on the same electrode on successive days counted as different. Row 9, The population familiarity suppression index computed on the basis of all late F–N sessions according to the formula (N − F)/(N + F), where N and F were the mean firing rates elicited by novel and familiar images 120–540 ms after stimulus onset. Row 10, This indicates whether the tendency for the novel-image firing rate to exceed the familiar-image firing rate achieved significance at the level p < 0.0001 (Wilcoxon rank sum test with n equal to the number of neurons indicated in row 8).
Row 11, The number of sessions in which familiarity suppression was compared between a condition in which the full view was presented and a condition in which only that portion of the image visible through a 3° square aperture centered on the image frame was presented. These sessions do not contribute to counts in previous rows. Row 12, The number of sessions conducted late in the experiment, after familiarity suppression had developed, in which a novel image set used during an early session was used again. These sessions do not contribute to counts in previous rows.
Receptive field mapping.
At the outset of each multiday experiment, after having advanced the electrodes to the desired depth, we plotted the receptive fields of the newly isolated neurons. We first manually delineated the receptive fields of neurons recorded through each electrode while the monkey maintained fixation on a central spot. Having thus approximately located all receptive fields, we proceeded to plot them automatically by presenting long narrow horizontal and vertical bars for a duration of 250 ms at locations staggered to span the region of collective visual sensitivity. Each bar was 0.1° wide and was either 4° or 8° long as dictated by the need to span the region of collective visual sensitivity. The horizontal (or vertical) bar was presented at 12 vertical (or horizontal) locations evenly spaced at intervals of 0.33° (in the case of the 4° bar) or 0.5° (in the case of the 8° bar). Independently for vertical and horizontal bars, we determined the center of the receptive field and its diameter at half-height. In plots representing the receptive field as a circle, the diameter of the circle is the average of the horizontal and vertical diameters. These stimuli, although not matched to the preferences of neurons at any individual site, nevertheless did elicit responses from neurons at all V2 sites and so did allow receptive field mapping. It is possible that use of long bars, as required for automatic mapping of multiple receptive fields, led to a slight underestimation of receptive field size due to the fact that the bars extended into the receptive field surround. The dimensions of the plotted receptive fields are, however, consistent with results obtained by more precise mapping procedures. V2 neurons representing the portion of the visual field on which this study is focused (Fig. 1A) have receptive fields with an average diameter of 1.5° (Shushruth et al., 2009).
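The half-height measurement described above can be illustrated with a minimal sketch. It assumes a one-dimensional response profile (mean firing rate at each bar location), linear interpolation between adjacent locations, a half-height defined relative to a baseline rate, and a center taken as the midpoint of the two half-height crossings. None of these details is specified in the text, and the function names are hypothetical.

```python
def half_height_extent(positions, rates, baseline=0.0):
    """Return (center, diameter) of a 1-D response profile at half-height.

    positions: bar locations in degrees, in ascending order.
    rates:     mean firing rate evoked at each location.
    baseline:  rate treated as zero response (an assumption of this sketch).
    """
    peak = max(range(len(rates)), key=lambda i: rates[i])
    half = baseline + 0.5 * (rates[peak] - baseline)

    def crossing(i_in, i_out):
        # Linear interpolation between a sample at or above half-height
        # (i_in) and its neighbor below half-height (i_out).
        x0, x1 = positions[i_in], positions[i_out]
        y0, y1 = rates[i_in], rates[i_out]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    # Walk outward from the peak while the response stays at or above half-height.
    i = peak
    while i > 0 and rates[i - 1] >= half:
        i -= 1
    left = crossing(i, i - 1) if i > 0 else positions[0]

    j = peak
    while j < len(rates) - 1 and rates[j + 1] >= half:
        j += 1
    right = crossing(j, j + 1) if j < len(rates) - 1 else positions[-1]

    return (left + right) / 2.0, right - left


def rf_circle(x_positions, x_rates, y_positions, y_rates):
    """Combine horizontal and vertical profiles into a circle.

    The circle diameter is the average of the two half-height diameters,
    as described in the text.
    """
    cx, dx = half_height_extent(x_positions, x_rates)
    cy, dy = half_height_extent(y_positions, y_rates)
    return (cx, cy), (dx + dy) / 2.0
```

Here `x_rates` would come from the vertical bar stepped across horizontal locations and `y_rates` from the horizontal bar stepped across vertical locations. If the profile never falls below half-height on one side, the sketch falls back to the outermost tested location, which would underestimate the diameter.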
Decoding.
To decode image identity from single-trial population activity we used a support vector machine. We trained 300 binary classifiers on all possible pairwise discriminations of the 25 images in the set. To prevent training and testing on the same data, we used a 10-fold cross-validation design, running 10 sessions in each of which one tenth of the trials was held in reserve for testing. At voting time, the image that got the highest number of votes was taken as the output of the combined classifier. The reported accuracy scores are averages across all 10 sessions.
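The structure of this procedure (one binary classifier per image pair, majority voting, and 10-fold cross-validation) can be sketched as follows. To keep the example self-contained, each binary support vector machine is replaced by a nearest-centroid rule. That substitution is an assumption of this sketch and not the method used in the study, and all function names are hypothetical.

```python
import random
from itertools import combinations


def centroid(rows):
    """Mean population vector across trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(X, y):
    """Build one binary rule per unordered pair of image labels.

    Each rule is the pair of class centroids (stand-in for a binary SVM).
    With 25 images this yields 25 * 24 / 2 = 300 rules.
    """
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    cents = {label: centroid(rows) for label, rows in by_label.items()}
    return [(a, b, cents[a], cents[b]) for a, b in combinations(sorted(cents), 2)]


def predict(rules, row):
    """Each pairwise rule casts one vote; the most-voted image wins."""
    votes = {}
    for a, b, ca, cb in rules:
        winner = a if sq_dist(row, ca) <= sq_dist(row, cb) else b
        votes[winner] = votes.get(winner, 0) + 1
    return max(sorted(votes), key=lambda label: votes[label])


def cross_validated_accuracy(X, y, n_folds=10, seed=0):
    """Hold out one fold at a time for testing and average the accuracies."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        rules = train_pairwise([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(rules, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / n_folds
```

In this sketch `X` is a list of single-trial population vectors (one firing rate per neuron) and `y` is the list of corresponding image labels. Training and test trials never overlap within a fold, which is the property the cross-validation design is meant to guarantee.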
Experimental design and statistical analysis.
All statistical analyses were performed in MATLAB (MathWorks). Individual analyses are described in Results. The statistical tests used in these analyses, including the Wilcoxon signed rank test, the χ2 test, and linear regression with a large sample size, do not assume normality in the data.
Results
We monitored neuronal visual responses through multiple electrodes implanted semichronically in area V2 during six experiments in three hemispheres of two monkeys (Table 1, row 1). At the outset of each experiment, we advanced the electrodes so as to obtain well isolated neuronal activity. We then plotted the receptive fields of neurons at all recording sites. We identified recording sites as being in V2 on the basis of well established patterns of receptive field size and topography (Gattass et al., 1981). The number of electrodes yielding V2 data ranged from 7 to 30 across experiments (Table 1, row 5). The number of differentiable action potentials recorded from an electrode was typically one or two. All neurons had receptive fields centered in the lower contralateral quadrant of the visual field (Fig. 1A). At the beginning of each experiment, we selected 25 images to serve as the familiarization set and adjusted the location of the 6.5° × 6.5° image frame to encompass the receptive fields of the newly isolated neurons (Fig. 1B). Each experiment consisted of multiple familiarization sessions spread out over a period of 1–5 weeks (Table 1, rows 2–4). Each session was divided into trials during each of which the monkey maintained central fixation while a single image was presented for 500–800 ms. The number of exposures per familiar image per day ranged from 8 to 45 with a mean of 17. During most sessions, the monkey viewed not only the 25 images in the familiarization set but also, on an equal number of interleaved trials, 25 session-unique novel images (Fig. 1C).
To determine whether V2 neurons exhibited familiarity suppression, we compared population visual responses elicited by 25 familiar and 25 novel images presented during interleaved trials on the same day (Table 1, row 6). We averaged the visual responses of all neurons recorded on a given day so as to minimize the influence of interneuronal differences in image selectivity. We averaged the visual responses across all images in a given category so as to minimize the influence of inter-image differences in salience. We tested for a reduction in familiar-image response strength relative to novel-image response strength, rather than for a reduction in absolute familiar-image response strength, so as to factor out day-to-day fluctuations in the firing rates of the recorded neurons. On inspecting population histograms representing responses to familiar and novel images, we discovered that familiarity suppression emerged in V2 over the course of the first few familiarization sessions. For example, in Experiment 2, suppression was not evident during Sessions 1–2 whereas it was consistently present from Session 3 onward (Fig. 1D). The histograms representing “novel” and “familiar” responses on Day 1 provide an example of noise arising from stochastic variability in response strength and differential image efficacy because, on Day 1, both sets of images were being viewed for the first time.
To characterize the rate at which suppression developed, we considered data from all 57 sessions in which monkeys viewed interleaved familiar and novel images (Table 1, row 6). For each session, we computed an index of familiarity suppression: (N − F)/(N + F) where N (or F) was the mean across all recorded neurons of the spike rate elicited by novel (or familiar) images in a window 120–540 ms after stimulus onset. Upon plotting this index as a function of the number of times the monkey had viewed each familiar image before the session in question, we found that the index was positive, indicating the occurrence of familiarity suppression, in all sessions conducted after the monkey had viewed each image 50 or more times (Fig. 2A). The zero-intercept exponential function yielding the best fit to the data had an asymptote of 0.13 and a rate constant of 130 prior exposures. This function yielded a significantly better fit than a zero-intercept line (F test, p = 0.017, F = 6.07, n = 58). Basing the analysis on the number of prior training days rather than the number of prior exposures to the familiar images yielded qualitatively similar results. The zero-intercept exponential function yielding the best fit to the data had an asymptote of 0.11 and a rate constant of 8 d. This function yielded a significantly better fit than a zero-intercept line (F test, p = 0.00033, F = 14.61, n = 58). Thus there was a significant tendency, whether the analysis was based on exposures or days, for suppression not only to increase but also to saturate over the course of an experiment.
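The index and the two competing fits can be illustrated with a minimal sketch. It assumes that the zero-intercept exponential has the form y = A(1 − exp(−x/τ)), with A the asymptote and τ the rate constant in exposures or days, and that the F test is the standard extra-sum-of-squares comparison with 1 and n − 2 degrees of freedom. Neither assumption is stated explicitly in the text. The scan over candidate τ values and all function names are choices made for this sketch.

```python
import math


def suppression_index(novel_rate, familiar_rate):
    """(N - F) / (N + F); positive values indicate familiarity suppression."""
    return (novel_rate - familiar_rate) / (novel_rate + familiar_rate)


def fit_saturating_exponential(x, y, taus):
    """Fit y = A * (1 - exp(-x / tau)) by least squares.

    For a fixed tau the best A has a closed form, so only tau is scanned.
    Returns (A, tau, sum of squared errors) for the best candidate tau.
    """
    best = None
    for tau in taus:
        g = [1.0 - math.exp(-xi / tau) for xi in x]
        gg = sum(gi * gi for gi in g)
        if gg == 0.0:
            continue
        a = sum(gi * yi for gi, yi in zip(g, y)) / gg
        sse = sum((yi - a * gi) ** 2 for gi, yi in zip(g, y))
        if best is None or sse < best[2]:
            best = (a, tau, sse)
    return best


def fit_zero_intercept_line(x, y):
    """Fit y = slope * x by least squares; returns (slope, sse)."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    sse = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sse


def extra_sum_of_squares_f(sse_line, sse_exp, n):
    """F statistic for the 2-parameter exponential versus the 1-parameter line."""
    return (sse_line - sse_exp) / (sse_exp / (n - 2))
```

With one index per F–N session in `y` and the corresponding number of prior exposures in `x`, a call such as `fit_saturating_exponential(x, y, range(10, 500, 10))` returns the asymptote and rate constant of the best-fitting candidate. Converting the F statistic to a p value requires the F distribution, which is omitted here.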
The apparent increase in familiarity suppression over the course of the experiment might have been an artifact of our using more effective novel-image sets later in the experiment. To rule out this interpretation, we dedicated several late sessions to repeat presentation of images, both familiar and novel, presented during a session early in the experiment (Table 1, row 12). We found that familiarity suppression was stronger during the late sessions than during the early sessions even when the novel images used for comparison were physically identical (Fig. 2B). The tendency for familiarity suppression to be stronger during the late session, as revealed by the preponderance of points beneath the identity line, was statistically significant (Wilcoxon signed rank test, early mean = 0.017, late mean = 0.047, p = 0.0014, n = 17). This finding is especially striking because the repeated novel images, having been viewed during an early session, were no longer strictly speaking novel. We conclude that the familiarity suppression measured late in the main experiment was not an artifact of the accidental properties of the session-unique novel images selected for use late in the main experiment.
Familiarity suppression in V2 could have been a product of feedback from ITC. If so, then suppression in V2 should have appeared at relatively long latency after image onset. To measure the latency of suppression, we considered data from 46 sessions following establishment of the effect (Table 1, row 7). Upon plotting the difference between the novel-image response and the familiar-image response as a function of time following image onset, we found that suppression appeared at ∼100 ms following image onset (Fig. 3). To characterize the timing of the effect precisely, we smoothed the data from each session by convolution with a 5 ms SD half-Gaussian kernel encompassing past but not future time points. We then identified the first sequence of five consecutive bins in each of which the number of sessions with observations greater than zero significantly exceeded the number of sessions with observations less than zero (χ2 test with Yates correction, α = 0.05, n = 46). We took the first bin of this string as marking the time of onset of suppression. The latency as measured thus was 110 ms. Following its onset, suppression exhibits an intriguing dynamic pattern, first ramping up over the course of ∼100 ms and then declining somewhat (Fig. 3B). The slow onset of suppression (Fig. 3B) stands in contrast to the rapid onset of the population visual response (Fig. 3A). It suggests dependence on multisynaptic recurrent or feedback connections and involvement of attractor dynamics.
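The latency criterion can be sketched in a few lines. The sketch assumes 1 ms bins, a half-Gaussian kernel truncated at 4 SD, renormalization of the kernel at the start of each trace, and a critical χ2 value of 3.841 (1 degree of freedom, α = 0.05). These details and the function names are assumptions of the sketch and are not specified in the text.

```python
import math

CHI2_CRIT_1DF_05 = 3.841  # critical chi-square value, 1 df, alpha = 0.05


def half_gaussian_kernel(sd_bins, n_sd=4):
    """Causal weights for lags 0..n_sd*sd_bins (present and past bins only)."""
    n_lags = int(round(n_sd * sd_bins)) + 1
    return [math.exp(-0.5 * (lag / sd_bins) ** 2) for lag in range(n_lags)]


def causal_smooth(signal, kernel):
    """Weighted average of the current and preceding bins; no future bins."""
    out = []
    for t in range(len(signal)):
        acc, norm = 0.0, 0.0
        for lag, w in enumerate(kernel):
            if t - lag < 0:
                break
            acc += w * signal[t - lag]
            norm += w
        out.append(acc / norm)
    return out


def yates_chi2(n_pos, n_neg):
    """Chi-square with Yates correction against an even split of the counts."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0
    expected = total / 2.0
    dev = max(abs(n_pos - expected) - 0.5, 0.0)
    return 2.0 * dev * dev / expected


def suppression_onset(diff_by_session, sd_bins=5, run_length=5):
    """First bin of the first run of `run_length` significant bins, else None.

    diff_by_session: one novel-minus-familiar trace per session, binned
    identically and aligned to image onset.
    """
    kernel = half_gaussian_kernel(sd_bins)
    smoothed = [causal_smooth(trace, kernel) for trace in diff_by_session]
    run = 0
    for t in range(len(smoothed[0])):
        n_pos = sum(1 for s in smoothed if s[t] > 0)
        n_neg = sum(1 for s in smoothed if s[t] < 0)
        significant = n_pos > n_neg and yates_chi2(n_pos, n_neg) > CHI2_CRIT_1DF_05
        run = run + 1 if significant else 0
        if run == run_length:
            return t - run_length + 1
    return None
```

Because the kernel looks only backward in time, smoothing cannot move the detected onset earlier than the true onset. It can only leave it unchanged or delay it, which makes the estimate conservative with respect to the claim that suppression arises early.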
Because the images used in this experiment were larger than the receptive fields of the V2 neurons, it is natural to wonder whether V2 neurons were sensitive to the familiarity of the entire image or only that part of the image within their receptive fields. To resolve this issue, we dedicated 13 sessions during the late stage of data collection to testing whether familiarity suppression was diminished by blocking off parts of the image around the periphery of the frame and therefore outside the receptive fields of most of the recorded neurons (Table 1, row 11). We presented either the full image or only that part of the image visible through a 3° square aperture centered on the image frame. If only image content inside the neuronal receptive field mattered, then, for neurons with receptive fields confined to the aperture, familiarity suppression should have been of equal strength under the two conditions. We found instead that familiarity suppression was reduced under the aperture condition as compared with the full-view condition (Fig. 4A–C). The aperture manipulation reduced familiarity suppression in all 13 such sessions (Fig. 4D), with the collective effect attaining statistical significance (Wilcoxon signed rank test, full-view mean = 0.033, aperture mean = 0.011, p = 0.0039, n = 13). The reduction might have occurred because some neurons had receptive fields extending beyond the 3° aperture and so were deprived of visual stimulation when images were confined to the aperture. In accordance with this interpretation, the population firing rate was slightly reduced under the aperture condition (Fig. 4B) compared with the full-view condition (Fig. 4A). To resolve this issue, we repeated the analysis on subpopulations of sites selected to minimize the distance between the receptive-field center and the aperture center. 
As we confined analysis to sites with receptive fields closer and closer to the center of the aperture, the aperture-induced reduction in familiarity suppression persisted (Fig. 4E,F). We conclude that familiarity suppression depended not only on parts of the image within the classic receptive field but also on image content in the near or far surround.
In ITC, image familiarization has been reported to sharpen neuronal selectivity for the familiar images and possibly to make them more discriminable from each other on the basis of population activity (Freedman et al., 2006; Woloszyn and Sheinberg, 2012). To investigate whether sharpening occurred in V2, we performed an analysis based on responses to familiar and novel images presented during late sessions (Table 1, row 7). We ranked images from best to worst for each neuron, computed mean population firing rate as a function of image-rank and characterized the resulting population tuning curve with a standard sparseness index (Vinje and Gallant, 2000):
$$S = \frac{1 - \left(\sum_{i} r_i / n\right)^{2} \Big/ \sum_{i} \left(r_i^{2} / n\right)}{1 - 1/n}$$
where r_i is the firing rate elicited by image i and n is the number of images. The sparseness index was slightly greater for familiar images (0.27) than for novel images (0.25) but the difference was not significant (Kolmogorov–Smirnov test comparing curves normalized to rank 1 firing rate, p = 0.96, n = 25). To assess whether population activity encoded familiar image identity more efficiently than novel image identity, we performed a decoding analysis. This was based on data collected in Experiments 3–6 during sessions in which familiarity suppression was demonstrably present (Table 1, row 7). We focused on Experiments 3–6 because the average number of neurons per session (15 or higher) was sufficiently large to support meaningful decoding. For each of 26 sessions, independently for familiar and session-unique novel images, we trained a linear support vector machine to report image identity based on single-trial population activity. The mean classification accuracy was 42% for novel images and 39% for familiar images as compared with chance expectation of 4%. The difference between the accuracies achieved for the two image categories was statistically significant (signed rank test, p = 0.0027, n = 26). Thus decoding was actually less efficient for familiar than for novel images.
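The sparseness analysis can be sketched as follows. The sketch assumes the commonly used form of the Vinje and Gallant (2000) index, S = [1 − (Σ r_i/n)² / Σ(r_i²/n)] / (1 − 1/n), applied to a rank-ordered population tuning curve. The function names are hypothetical.

```python
def population_tuning_curve(rates_by_neuron):
    """Rank images from best to worst for each neuron, then average by rank.

    rates_by_neuron: one list per neuron holding the mean firing rate
    elicited by each image (same number of images for every neuron).
    """
    ranked = [sorted(rates, reverse=True) for rates in rates_by_neuron]
    n_neurons = len(ranked)
    return [sum(col) / n_neurons for col in zip(*ranked)]


def sparseness(rates):
    """Sparseness index: 0 for a flat tuning curve, 1 when a single image
    drives the entire response."""
    n = len(rates)
    mean = sum(rates) / n
    mean_sq = sum(r * r for r in rates) / n
    if mean_sq == 0.0:
        raise ValueError("sparseness is undefined when all rates are zero")
    return (1.0 - mean * mean / mean_sq) / (1.0 - 1.0 / n)
```

Computing `sparseness(population_tuning_curve(...))` separately for the 25 familiar and the 25 novel images of a session gives the two values being compared. A sharper tuning curve for familiar images would appear as a larger index.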
Discussion
The key finding of this study is that neurons of macaque area V2 exhibit familiarity suppression. Previous studies of visual plasticity in low-order visual areas of the adult monkey have concerned primarily subtle shifts of stimulus tuning that develop during the performance of tasks requiring difficult visual discriminations and that are evident specifically in the context of task performance (Schoups et al., 2001; Ghose et al., 2002; Lee et al., 2002; Li et al., 2004; Gilbert and Li, 2012, 2013; Liang et al., 2017). Familiarity suppression has been demonstrated previously only in ITC and areas of higher order to which it projects (Fahy et al., 1993; Sobotka and Ringo, 1993; Xiang and Brown, 1998; Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson and Sheinberg, 2008; Anderson et al., 2008; Woloszyn and Sheinberg, 2012; Meyer et al., 2014). In ITC, familiarity suppression could arise from fatigue of neurons selective for the particular complex images or from fatigue of synapses to which those neurons give rise. In V2, however, neurons are selective for local features (Hegdé and Van Essen, 2003; Freeman et al., 2013). Any given feature is unlikely to have been represented with excessive strength in the 25 images of the arbitrarily selected familiarization set. Thus the nature of the mechanism that underlies familiarity suppression in V2 is unclear.
One possibility is that familiarity suppression in V2 is fed back from ITC. This idea is concordant with the principle that top-down feedback plays a critical role in the control of neuronal visual responsiveness in V1 and V2 (Riesenhuber and Poggio, 1999; Lamme and Roelfsema, 2000; Hochstein and Ahissar, 2002; Lee and Mumford, 2003; Li et al., 2004; Friston, 2005; Gilbert and Li, 2013; Wokke et al., 2013) and fits with studies demonstrating that top-down effects appear in V1 and V2 at latencies of 100 ms or more following visual onset (Lamme and Roelfsema, 2000; Lee and Nguyen, 2001; Lee, 2002; Lee et al., 2002; Supèr et al., 2003; Poort et al., 2012; Chen et al., 2014). If the suppressive signal in V2 were simply a duplicate of the suppressive signal in ITC, conveyed through top-down transmission, then it would necessarily appear at a longer latency in V2 than in ITC. The only previous report explicitly describing suppression latency in ITC indicated relatively late onset, at 120, 118, and 158 ms, in three monkeys (Anderson et al., 2008). The reported values are, however, based on a statistical criterion different from ours. To level the playing field between studies and to allow for comparison to a broader range of studies, we took measurements directly from population histograms depicted in figures illustrating familiarity suppression (Freedman et al., 2006; Mruczek and Sheinberg, 2007; Anderson et al., 2008; Woloszyn and Sheinberg, 2012; Meyer et al., 2014). First, we measured the latency of the visual response itself. We found visual latency to be longer by ∼30 ms in ITC than in V2 (Table 2, visual latency) in general agreement with previous reports (Lamme and Roelfsema, 2000; Self et al., 2017). The difference in latencies presumably corresponds to the feedforward transmission delay between V2 and ITC. 
If feedback involves a comparable transmission delay, then familiarity suppression fed back from ITC to V2 should appear in V2 at a delay of ∼30 ms relative to its appearance in ITC. To assess whether this was so, we compared the latency of familiarity suppression in V2 in the present study to its latency in ITC in previous studies. We found that familiarity suppression, far from occurring later in V2 than in ITC, actually appeared earlier by ∼20 ms (Table 2, suppression latency). In both V2 and ITC, suppression of the familiar-image response accompanies a brief post-peak upward inflection of firing rate (Fig. 5A–C, arrows), but the inflection and the suppression alike are earlier in V2 than in ITC. These observations, however, do not absolutely rule out the idea that familiarity suppression in V2 depends on top-down input from areas of higher order. The measurements of latency in V2 and ITC were made in different animals. Even if they were replicated in the same animal, they might be reconciled with a mechanism whereby familiarity suppression is fed back to V2 from areas less hierarchically elevated than ITC. There are, indeed, preliminary indications that neurons in V4 do exhibit familiarity suppression (Guan et al., 2017). Finally, it is possible that familiarity suppression in V2 depends in some way on feedback from ITC during the earliest phase of the visual response, beginning at ∼70 ms, when ITC neurons encode image identity but do not yet exhibit familiarity suppression and when a few ITC neurons highly selective for the familiar image respond especially strongly to it (Woloszyn and Sheinberg, 2012).
Table 2.
Area: study (source figure) | Visual latency, ms | Suppression latency, ms | Suppression half-height, ms |
---|---|---|---|
ITC: Freedman et al., 2006 (their Fig. 8) | 76 | 109 | 152 |
ITC: Mruczek and Sheinberg, 2007 (their Fig. 9A) | 56 | 131 | 154 |
ITC: Anderson et al., 2008 (their Fig. 4M) | 82 | 106 (120) | 121 |
ITC: Anderson et al., 2008 (their Fig. 4J) | 55 | 116 (118) | 158 |
ITC: Anderson et al., 2008 (their Fig. 4S) | 63 | 133 (158) | 154 |
ITC: Woloszyn and Sheinberg, 2012 (their Fig. 4A) | 80 | 142 | 164 |
ITC: Meyer et al., 2014 (their Fig. 5A) | 57 | 110 | 182 (180) |
ITC: Average across studies | 67 | 121 | 155 |
V2: Current study (Fig. 2C) | 30 (45) | 100 (110) | 113 (116) |
The approach of taking measurements from population histograms was necessary as a means for including multiple studies (because most do not provide numeric latencies) and for equating the latency criterion across studies (because subtle variations in criterion can produce substantial changes in latency). Where a numeric estimate based on a statistical criterion is available, it is provided parenthetically after the estimate based on direct measurement. Note that attainment of statistical criterion is generally delayed relative to signal onset visible in population histograms.
Visual latency, time following image onset at which firing rate rose above baseline; suppression latency, time at which novel-image-minus-familiar-image difference rose above zero; suppression half-height, time of attainment of half-peak height by the novel-image-minus-familiar-image signal.
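The latency argument can be checked with a few lines of arithmetic on the directly measured entries of Table 2. The sketch below is illustrative only: the variable and function names are ours, the parenthetical statistical estimates are omitted, and the prediction assumes, as the text does, that feedback from ITC to V2 incurs the same delay as feedforward transmission from V2 to ITC.

```python
from statistics import mean

# Directly measured values from Table 2, in ms:
# (visual latency, suppression latency, suppression half-height).
# Parenthetical statistical estimates are deliberately left out.
ITC_ROWS = [
    (76, 109, 152),  # Freedman et al., 2006
    (56, 131, 154),  # Mruczek and Sheinberg, 2007
    (82, 106, 121),  # Anderson et al., 2008 (their Fig. 4M)
    (55, 116, 158),  # Anderson et al., 2008 (their Fig. 4J)
    (63, 133, 154),  # Anderson et al., 2008 (their Fig. 4S)
    (80, 142, 164),  # Woloszyn and Sheinberg, 2012
    (57, 110, 182),  # Meyer et al., 2014
]
V2_ROW = (30, 100, 113)  # current study


def column_means(rows):
    """Average each column across studies."""
    return tuple(mean(column) for column in zip(*rows))


itc_visual, itc_suppression, itc_half_height = column_means(ITC_ROWS)

# Feedforward delay estimated as the ITC-minus-V2 difference in visual latency.
feedforward_delay = itc_visual - V2_ROW[0]

# Assumption: feedback delay equals feedforward delay. Under pure feedback,
# suppression in V2 could not begin before this time.
predicted_v2_suppression = itc_suppression + feedforward_delay

# Observed lead of V2 suppression over ITC suppression.
observed_lead = itc_suppression - V2_ROW[1]

print("ITC averages (ms):", itc_visual, itc_suppression, itc_half_height)
print("Predicted earliest V2 suppression under feedback (ms):", predicted_v2_suppression)
print("Observed V2 suppression latency (ms):", V2_ROW[1])
print("V2 leads ITC by (ms):", observed_lead)
```

With these entries the column means reproduce the "Average across studies" row, and the observed V2 suppression latency falls well before the earliest time a signal fed back from ITC could arrive under the equal-delay assumption.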
An alternative possibility is that familiarity suppression in V2 arises at least in part from a mechanism intrinsic to the area. This raises the question: How could neurons selective for local features detect a global image as familiar? Our thoughts on this subject begin with the fact that a familiar image is represented in V2 by simultaneous activity of an ensemble of neurons selective for its local features. Familiarity suppression might occur in V2 at an ensemble-specific rather than a neuron-specific level. For example, if the late phase of the response to an image depended on lateral interactions among the neurons responsive to it, and if repeated exposure to the image induced weakening of excitatory interactions or strengthening of inhibitory interactions among coactive neurons (Barlow and Földiák, 1989; Lim et al., 2015), then the result would be ensemble-specific familiarity suppression. Such an effect would run counter to the classic idea that synapses between coactive neurons undergo Hebbian strengthening but would be consistent with a scheme in which efficient coding arises from redundancy reduction (Lewicki, 2002; Olshausen and Field, 2004; King et al., 2013). Two observations in the present study are compatible with this model. First, we have found that parts of the image outside the classic receptive field contribute to familiarity suppression. Lateral interactions among V2 neurons could explain the impact of these features. Second, we have found that the onset of familiarity suppression is coincident with a post-peak upward inflection in the population firing rate (Fig. 5C). This inflection could reflect the arrival of indirect inputs relayed from other V2 neurons via lateral connections. The possibility that familiarity suppression in V2 depends in part or in whole on a mechanism intrinsic to V2 has direct implications for our understanding of the phenomenon in all areas. 
It suggests regarding familiarity suppression as a general manifestation of principles of statistical learning operative at all levels of ventral stream processing rather than as a product of definitive recognition such as one might assume to occur only at a late stage of visual processing.
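The logic of ensemble-specific suppression can be illustrated with a toy calculation. The sketch below is not a model fitted to our data or taken from the cited studies: the network size, the uniform initial weights, the learning rate, and all function names are hypothetical choices made purely for illustration. It assumes that the late response is the rectified sum of feedforward drive and one step of lateral input, and that each exposure weakens lateral weights between coactive neurons.

```python
def make_weights(n, initial=0.05):
    """Uniform weak lateral excitation between every pair of distinct neurons."""
    return [[0.0 if i == j else initial for j in range(n)] for i in range(n)]


def late_response(pattern, weights):
    """Population late-phase response: feedforward drive plus lateral input, rectified."""
    n = len(pattern)
    total = 0.0
    for i in range(n):
        lateral = sum(weights[i][j] * pattern[j] for j in range(n))
        total += max(0.0, pattern[i] + lateral)
    return total


def expose(pattern, weights, rate=0.01):
    """One exposure: weaken lateral weights between coactive neurons (anti-Hebbian)."""
    active = [i for i, drive in enumerate(pattern) if drive > 0]
    for i in active:
        for j in active:
            if i != j:
                weights[i][j] -= rate


def ensemble(n, start, stop):
    """Binary feedforward pattern in which neurons start..stop-1 are driven."""
    return [1.0 if start <= i < stop else 0.0 for i in range(n)]


N_NEURONS = 40                             # hypothetical population size
familiar = ensemble(N_NEURONS, 0, 10)      # ensemble driven by the familiar image
novel = ensemble(N_NEURONS, 5, 15)         # novel image sharing half of those neurons

weights = make_weights(N_NEURONS)
baseline = late_response(familiar, weights)  # identical for both images before exposure

for _ in range(3):                         # a few exposures to the familiar image
    expose(familiar, weights)

familiar_after = late_response(familiar, weights)
novel_after = late_response(novel, weights)
print(baseline, familiar_after, novel_after)
```

In this toy setting the response to the familiar image falls more than the response to the novel image even though the two images drive overlapping sets of neurons, which is the sense in which suppression is specific to the ensemble rather than to the individual neuron.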
Familiarity suppression develops rapidly in V2. It is well established after the monkey has viewed each image as few as 50 times over the course of several days. The fact that familiarity suppression develops rapidly in V2 is in harmony with previous reports on ITC indicating that experience-dependent effects are evident after as little as a few hours (Li and DiCarlo, 2010) or a single day (Erickson et al., 2000). The rate at which familiarity suppression develops in ITC is not known. In addition to establishing that familiarity suppression develops rapidly in V2, we have also found that it tends to level out over the course of a few hundred exposures. This is indicated by the fact that an exponential function relating effect strength to exposure number affords a significantly better fit to the data than a linear function. We caution, however, that the asymptote of the best-fit exponential function, (N − F)/(N + F) = 0.13, may not represent a true limit on the process. In ITC, familiarity suppression appears to increase gradually over the course of thousands of exposures (Mohan and Freedman, 2017). The same could be true in V2. This is one possible explanation for the fact that familiarity suppression in V2 in our study is of relatively small magnitude as compared with familiarity suppression in ITC in previous studies involving more numerous exposures.
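The saturating trajectory can be made concrete with a short sketch. The index (N − F)/(N + F) and the asymptote of 0.13 come from the text, and the rate constant of ∼100 exposures from the Abstract. The specific functional form below, an exponential rising from zero, and the function names are assumptions made for illustration; the exact parameterization used in the fit is not restated here.

```python
import math


def familiarity_index(novel_rate, familiar_rate):
    """Suppression index (N - F)/(N + F) from novel and familiar firing rates."""
    return (novel_rate - familiar_rate) / (novel_rate + familiar_rate)


def saturating_exponential(exposures, asymptote=0.13, rate_constant=100.0):
    """Assumed form: index rises from 0 toward the asymptote with the given rate constant."""
    return asymptote * (1.0 - math.exp(-exposures / rate_constant))


# Hypothetical firing rates chosen so that the index equals the fitted asymptote.
print(familiarity_index(113.0, 87.0))

# Predicted index after increasing numbers of exposures under the assumed form.
for n in (0, 50, 100, 300, 1000):
    print(n, round(saturating_exponential(n), 3))
```

Under this form the index reaches roughly 63% of its asymptote after one rate constant's worth of exposures and changes little beyond a few hundred, which is the leveling out described above. A linear function, by contrast, would predict continued growth at a constant rate.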
Footnotes
This work was supported by NIH R01 EY024912, NIH P50 MH103204, NSF CISE1320651, and IARPA via Department of Interior Contract D16PC00007, with technical support from NIH P30 EY008098. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. We thank Karen McCracken for technical assistance, and Jason Samonds and Charles Gray for assisting in the implantation of the SC32 arrays.
The authors declare no competing financial interests.
References
- Anderson B, Sheinberg DL (2008) Effects of temporal context and temporal expectancy on neural activity in inferior temporal cortex. Neuropsychologia 46:947–957. 10.1016/j.neuropsychologia.2007.11.025
- Anderson B, Mruczek RE, Kawasaki K, Sheinberg D (2008) Effects of familiarity on neural activity in monkey inferior temporal lobe. Cereb Cortex 18:2540–2552. 10.1093/cercor/bhn015
- Barlow H, Földiák P (1989) Adaptation and decorrelation in the cortex. In: The computing neuron (Durbin R, Miall C, Mitchison G, eds), pp 54–72. Wokingham, England: Addison-Wesley.
- Chen M, Yan Y, Gong X, Gilbert CD, Liang H, Li W (2014) Incremental integration of global contours through interplay between visual cortical areas. Neuron 82:682–694. 10.1016/j.neuron.2014.03.023
- Erickson CA, Jagadeesh B, Desimone R (2000) Clustering of perirhinal neurons with similar properties following visual experience in adult monkeys. Nat Neurosci 3:1143–1148. 10.1038/80664
- Fahy FL, Riches IP, Brown MW (1993) Neuronal activity related to visual recognition memory: long-term memory and the encoding of recency and familiarity information in the primate anterior and medial inferior temporal and rhinal cortex. Exp Brain Res 96:457–472.
- Freedman DJ, Riesenhuber M, Poggio T, Miller EK (2006) Experience-dependent sharpening of visual shape selectivity in inferior temporal cortex. Cereb Cortex 16:1631–1644. 10.1093/cercor/bhj100
- Freeman J, Ziemba CM, Heeger DJ, Simoncelli EP, Movshon JA (2013) A functional and perceptual signature of the second visual area in primates. Nat Neurosci 16:974–981. 10.1038/nn.3402
- Friston K. (2005) A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci 360:815–836. 10.1098/rstb.2005.1622
- Gattass R, Gross CG, Sandell JH (1981) Visual topography of V2 in the macaque. J Comp Neurol 201:519–539. 10.1002/cne.902010405
- Ghazizadeh A, Griggs W, Hikosaka O (2016) Ecological origins of object salience: reward, uncertainty, aversiveness, and novelty. Front Neurosci 10:378. 10.3389/fnins.2016.00378
- Ghose GM, Yang T, Maunsell JH (2002) Physiological correlates of perceptual learning in monkey V1 and V2. J Neurophysiol 87:1867–1888. 10.1152/jn.00690.2001
- Gilbert CD, Li W (2012) Adult visual cortical plasticity. Neuron 75:250–264. 10.1016/j.neuron.2012.06.030
- Gilbert CD, Li W (2013) Top-down influences on visual processing. Nat Rev Neurosci 14:350–363. 10.1038/nrn3476
- Guan S, Xia R, Sheinberg D (2017) Bidirectional visual processing: distinct dynamics and interactions between V4 and inferior temporal cortex in challenging scenarios. Soc Neurosci Abstr 43:403.07.
- Hegdé J, Van Essen DC (2003) Strategies of shape representation in macaque visual area V2. Vis Neurosci 20:313–328. 10.1017/S0952523803203102
- Hochstein S, Ahissar M (2002) View from the top: hierarchies and reverse hierarchies in the visual system. Neuron 36:791–804. 10.1016/S0896-6273(02)01091-7
- Jutras MJ, Buffalo EA (2010) Recognition memory signals in the macaque hippocampus. Proc Natl Acad Sci U S A 107:401–406. 10.1073/pnas.0908378107
- King PD, Zylberberg J, DeWeese MR (2013) Inhibitory interneurons decorrelate excitatory cells to drive sparse code formation in a spiking model of V1. J Neurosci 33:5475–5485. 10.1523/JNEUROSCI.4188-12.2013
- Lamme VA, Roelfsema PR (2000) The distinct modes of vision offered by feedforward and recurrent processing. Trends Neurosci 23:571–579. 10.1016/S0166-2236(00)01657-X
- Lee TS. (2002) Top-down influence in early visual processing: a Bayesian perspective. Physiol Behav 77:645–650. 10.1016/S0031-9384(02)00903-4
- Lee TS, Mumford D (2003) Hierarchical Bayesian inference in the visual cortex. J Opt Soc Am A Opt Image Sci Vis 20:1434–1448. 10.1364/JOSAA.20.001434
- Lee TS, Nguyen M (2001) Dynamics of subjective contour formation in the early visual cortex. Proc Natl Acad Sci U S A 98:1907–1911. 10.1073/pnas.98.4.1907
- Lee TS, Yang CF, Romero RD, Mumford D (2002) Neural activity in early visual cortex reflects behavioral experience and higher-order perceptual saliency. Nat Neurosci 5:589–597. 10.1038/nn0602-860
- Lewicki MS. (2002) Efficient coding of natural sounds. Nat Neurosci 5:356–363. 10.1038/nn831
- Li N, DiCarlo JJ (2010) Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron 67:1062–1075. 10.1016/j.neuron.2010.08.029
- Li W, Piëch V, Gilbert CD (2004) Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci 7:651–657. 10.1038/nn1255
- Liang H, Gong X, Chen M, Yan Y, Li W, Gilbert CD (2017) Interactions between feedback and lateral connections in the primary visual cortex. Proc Natl Acad Sci U S A 114:8637–8642. 10.1073/pnas.1706183114
- Lim S, McKee JL, Woloszyn L, Amit Y, Freedman DJ, Sheinberg DL, Brunel N (2015) Inferring learning rules from distributions of firing rates in cortical neurons. Nat Neurosci 18:1804–1810. 10.1038/nn.4158
- Meyer T, Rust C (2016) Single trial familiarity judgments are reflected in the IT population response. Soc Neurosci Abstr 42:242.08.
- Meyer T, Walker C, Cho RY, Olson CR (2014) Image familiarization sharpens response dynamics of neurons in inferotemporal cortex. Nat Neurosci 17:1388–1394. 10.1038/nn.3794
- Mohan K, Freedman DJ (2017) Visual image familiarity learning at multiple timescales in the inferotemporal cortex. Soc Neurosci Abstr 590.13.
- Mruczek RE, Sheinberg DL (2007) Context familiarity enhances target processing by inferior temporal cortex neurons. J Neurosci 27:8533–8545. 10.1523/JNEUROSCI.2106-07.2007
- Olshausen BA, Field DJ (2004) Sparse coding of sensory inputs. Curr Opin Neurobiol 14:481–487. 10.1016/j.conb.2004.07.007
- Poort J, Raudies F, Wannig A, Lamme VA, Neumann H, Roelfsema PR (2012) The role of attention in figure-ground segregation in areas V1 and V4 of the visual cortex. Neuron 75:143–156. 10.1016/j.neuron.2012.04.032
- Rainer G, Lee H, Logothetis NK (2004) The effect of learning on the function of monkey extrastriate visual cortex. PLoS Biol 2:E44. 10.1371/journal.pbio.0020044
- Rainer G, Miller EK (2000) Effects of visual experience on the representation of objects in the prefrontal cortex. Neuron 27:179–189. 10.1016/S0896-6273(00)00019-2
- Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nat Neurosci 2:1019–1025. 10.1038/14819
- Schoups A, Vogels R, Qian N, Orban G (2001) Practising orientation identification improves orientation coding in V1 neurons. Nature 412:549–553. 10.1038/35087601
- Self MW, van Kerkoerle T, Goebel R, Roelfsema PR (2017) Benchmarking laminar fMRI: neuronal spiking and synaptic activity during top-down and bottom-up processing in the different layers of cortex. Neuroimage. Advance online publication. Retrieved June 23, 2017. doi: 10.1016/j.neuroimage.2017.06.045.
- Shushruth S, Ichida JM, Levitt JB, Angelucci A (2009) Comparison of spatial summation properties of neurons in macaque V1 and V2. J Neurophysiol 102:2069–2083. 10.1152/jn.00512.2009
- Sobotka S, Ringo JL (1993) Investigation of long-term recognition and association memory in unit responses from inferotemporal cortex. Exp Brain Res 96:28–38. 10.1007/BF00230436
- Supèr H, Spekreijse H, Lamme VA (2003) Figure-ground activity in primary visual cortex (V1) of the monkey matches the speed of behavioral response. Neurosci Lett 344:75–78. 10.1016/S0304-3940(03)00360-4
- Tanaka K, Saito H, Fukada Y, Moriya M (1991) Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J Neurophysiol 66:170–189. 10.1152/jn.1991.66.1.170
- Vinje WE, Gallant JL (2000) Sparse coding and decorrelation in primary visual cortex during natural vision. Science 287:1273–1276. 10.1126/science.287.5456.1273
- Wokke ME, Vandenbroucke AR, Scholte HS, Lamme VA (2013) Confuse your illusion: feedback to early visual cortex contributes to perceptual completion. Psychol Sci 24:63–71. 10.1177/0956797612449175
- Woloszyn L, Sheinberg DL (2012) Effects of long-term visual experience on responses of distinct classes of single units in inferior temporal cortex. Neuron 74:193–205. 10.1016/j.neuron.2012.01.032
- Xiang JZ, Brown MW (1998) Differential neuronal encoding of novelty, familiarity and recency in regions of the anterior temporal lobe. Neuropharmacology 37:657–676. 10.1016/S0028-3908(98)00030-6