eLife. 2016 Aug 22;5:e17256. doi: 10.7554/eLife.17256

Attention operates uniformly throughout the classical receptive field and the surround

Bram-Ernst Verhoef, John HR Maunsell
Editor: Doris Y Tsao
PMCID: PMC5021523  PMID: 27547989

Abstract

Shifting attention among visual stimuli at different locations modulates neuronal responses in heterogeneous ways, depending on where those stimuli lie within the receptive fields of neurons. Yet how attention interacts with the receptive-field structure of cortical neurons remains unclear. We measured neuronal responses in area V4 while monkeys shifted their attention among stimuli placed in different locations within and around neuronal receptive fields. We found that attention interacts uniformly with the spatially-varying excitation and suppression associated with the receptive field. This interaction explained the large variability in attention modulation across neurons, and a non-additive relationship among stimulus selectivity, stimulus-induced suppression and attention modulation that has not been previously described. A spatially-tuned normalization model precisely accounted for all observed attention modulations and for the spatial summation properties of neurons. These results provide a unified account of spatial summation and attention-related modulation across both the classical receptive field and the surround.

DOI: http://dx.doi.org/10.7554/eLife.17256.001

Research Organism: Rhesus macaque

eLife digest

At any moment, our brain receives an enormous amount of information from our senses. However, we are not aware of all of this information; only the information we decide to focus on is perceived in detail. This ability to focus our attention is important for survival.

The neurons involved in vision respond best to information that comes from a small ‘window’ in what is being seen. When something appears in this window (known as the neuron’s receptive field), the activity of the neuron either increases or decreases. How does focusing attention on an object change the neuron’s response? Verhoef and Maunsell investigated this question by recording electrical activity in an area of the brain called V4 in monkeys as they focused their attention on objects in different locations of the neuron’s receptive field.

The recordings show that a single rule determines when attention influences a neuron’s activity. If an object inside the neuron’s receptive field decreases the activity of the neuron, then attention can change that neuron’s activity. Attention then changes the activity of the neuron by either removing or further boosting the influence of these objects.

Verhoef and Maunsell then developed a mathematical model based on these results, and found that the model could explain why the activity of a neuron changes when attention is focused on objects at different locations in its receptive field. The next step is to understand exactly how the brain works to either remove or boost the influence of an object that causes a neuron’s activity to decrease.

DOI: http://dx.doi.org/10.7554/eLife.17256.002

Introduction

Our eyes are constantly bombarded by a welter of visual stimuli, only a small fraction of which can be processed thoroughly (Chun et al., 2011; Kastner and Ungerleider, 2000). Spatial attention sifts through the plethora of stimuli – enhancing perception at behaviorally-relevant locations – but the underlying neural principles of this process are not fully understood (Chun et al., 2011; Kastner and Ungerleider, 2000; Posner, 1980; Carrasco, 2011; Roelfsema et al., 1998; Anton-Erxleben and Carrasco, 2013).

Neuronal responses modulate as attention shifts among stimuli at different receptive-field locations (Moran and Desimone, 1985; Treue and Maunsell, 1996; Reynolds et al., 1999; Martínez-Trujillo and Treue, 2002; Ghose and Maunsell, 2008; Lee and Maunsell, 2010; Ni et al., 2012; Recanzone and Wurtz, 2000; Luck et al., 1997; Motter, 1993; Chelazzi et al., 1998; Zénon and Krauzlis, 2012). These response modulations can be complicated. For example, depending on the stimulus configuration, attending to a non-preferred stimulus can either increase or suppress activity (e.g. Treue and Maunsell, 1996). Normalization models of attention provide a succinct framework in which these complex response modulations can be understood (Reynolds et al., 1999; Ghose, 2009; Reynolds and Heeger, 2009; Lee, 2009; Boynton, 2009). However, only a few studies have directly tested these models against the responses of individual neurons to various stimulus configurations (Ni et al., 2012; Lee, 2009; Sanayei et al., 2015; Xiao et al., 2014).

Normalization models of attention assume that attention acts on stimulus-induced excitation and suppression to modulate neuronal responses. Importantly, both excitation and suppression vary spatially within the receptive field: excitation is largely restricted to the classical receptive field (cRF), while suppression extends far beyond into the surround (Cavanaugh et al., 2002a, 2002b; Sceniak et al., 1999; Desimone and Schein, 1987; Carandini et al., 1997). Crucially, how attention interacts with the receptive field structure of neurons remains unclear. For example, the way that attention acts on neuronal responses when shifted among stimuli inside the cRF versus when shifted to stimuli inside the surround has not been compared directly (Motter, 1993; Sanayei et al., 2015; Sundberg et al., 2009). Differences may occur because feedforward-, feedback- and intracortical circuitries are thought to contribute differentially to the suppressive and excitatory inputs associated with stimuli in either the cRF or the surround (Angelucci et al., 2014), and because the cRF and the surround presumably serve different functional roles (Angelucci et al., 2014; Schwartz and Simoncelli, 2001; Vinje and Gallant, 2000). More generally, it remains unknown if and how attention operates on the spatially-varying excitation and suppression of a neuron's receptive field. This is a pivotal open question because, as we will show below, the interaction between attention and the receptive field structure determines which neurons are most affected by attention and consequently are most likely to influence attentional behavior.

We measured how attention affects neuronal responses to various stimulus configurations both inside and outside the cRF of V4 neurons, and fitted normalization models to the responses of individual neurons. We find that the principles that drive attention modulation are remarkably similar within the classical receptive field and the surround. We show that stimuli induce excitation and suppression that varies spatially, and that attention interacts with this spatially-varying excitation and suppression. This interaction explained the large differences in attention modulations across neurons, and a non-additive relationship among stimulus selectivity, stimulus-induced suppression and attention modulation. A spatially-tuned normalization model, wherein attention multiplies both the excitatory and spatially-varying normalization term, precisely accounted for all neuronal responses to either single or multiple stimuli, either attended or unattended, presented inside either the cRF or the surround. The model relates stimulus selectivity, stimulus-induced suppression and attention-related modulation to each other, and unifies spatial summation and attention-related modulation across different regions of the receptive field.

Results

Task and behavioral performance

We trained two rhesus monkeys to perform a visual-detection task in which spatial attention was controlled and measured. In each trial, a sequence of stimuli was presented at four locations equidistant from the fixation point (Figure 1A). Stimuli were full-contrast static Gabor stimuli with one of two orthogonal orientations. The monkey's task was to detect a faint white spot (target; Figure 1A right) that appeared in the center of one Gabor during a randomly selected stimulus presentation. We manipulated attention in blocks of trials by cueing the monkey at the start of each block as to which stimulus location was most likely to contain the target (Materials and methods). In 91% of trials the target was presented at the cued location (valid cue; position of the black circle Figure 1A). On the remaining 9% of trials the target appeared with equal probability at one of the three uncued locations: either next to the cued location (invalid near; position of the yellow circle Figure 1A), or at one of two locations contralateral to the cued location (invalid far; position of the blue circles Figure 1A).

Figure 1. Task and performance.

(A) Every trial consisted of a sequence of stimulus presentations. On each stimulus presentation (200 ms duration; 200–1020 ms inter-stimulus interval), Gabor stimuli of two orthogonal orientations could be presented at four possible stimulus locations. The monkey was rewarded for detecting a faint white spot (target) in the center of one Gabor during one stimulus presentation. For 91% of trials the target was presented at the cued location (location of the black circle; valid trials). On the remaining 9% of trials the target was presented at one of three uncued locations: adjacent to the cued location (location of the yellow circle; invalid near), or at one of two locations on the opposite side of the fixation point (location of the blue circles; invalid far). Colored circles in (A) are shown for illustrative purposes only and were never presented during the task. (B) Average performance across recording sessions for monkey M1. Proportion correct (± SEM based on N = 52 sessions; proportion correct at equal target strength: Valid: 0.79; Invalid near: 0.42; Invalid far: 0.30) as a function of target strength for trials in which the target occurred at the cued (gray: valid) or uncued (yellow: invalid near; blue: invalid far) location. Target strength is defined as the opacity of the target. The pictograms below the target-strength axis illustrate the nature of the target-strength manipulation but do not represent actual target-strength values used during the recordings. (C) Average performance across recording sessions for monkey M2 (N = 78 sessions; proportion correct at equal target strength: Valid: 0.56; Invalid near: 0.24; Invalid far: 0.05).

DOI: http://dx.doi.org/10.7554/eLife.17256.003

The attention cue considerably affected behavioral performance in the task: targets were much more likely to be detected at a cued location than at an uncued location, even when the uncued location was adjacent to the cued location (Figure 1B,C; valid vs. invalid near: monkey M1: p=8 × 10^−27, M2: p=1 × 10^−26; valid vs. invalid far: M1: p=1 × 10^−36, M2: p=2 × 10^−58; paired t-test on the average proportion correct across sessions; M1: N = 52; M2: N = 78). The improved performance indicates that the monkeys preferentially attended to the cued stimulus location, which allowed us to compare neuronal responses among conditions in which attention was directed to different stimulus locations within neurons' cRF or surround.
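The paired comparison across sessions described above can be sketched as follows. This is a minimal illustration, assuming one average proportion correct per session and condition; the function name `paired_t_statistic` and all numbers are hypothetical and are not data from the study. Only the t statistic and degrees of freedom are computed, because the Python standard library has no t-distribution for the p-value.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a paired t-test of x against y."""
    diffs = [a - b for a, b in zip(x, y)]      # one difference per session
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sem_diff = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / sem_diff, n - 1


# Hypothetical per-session proportions correct (NOT data from the study).
valid = [0.80, 0.78, 0.82, 0.75, 0.79]
invalid_near = [0.45, 0.40, 0.44, 0.38, 0.43]

t_value, dof = paired_t_statistic(valid, invalid_near)
print(f"t = {t_value:.1f}, df = {dof}")
```

Pairing by session removes between-session variability in overall performance, so the test is sensitive to the within-session difference between cued and uncued locations.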

Experimental conditions and example neurons

We examined the principles by which attention affects neuronal responses to stimuli inside the classical receptive field (cRF) or within the surround (sRF). Using chronically implanted microelectrode arrays, we recorded from 728 neurons in visual area V4 in the left hemisphere of two monkeys (monkey M1: 264; M2: 464) while they performed the visual-detection task in which spatial attention was controlled. All results presented here are based on the activity of these 728 single neurons, but all findings were confirmed in the responses of 12,067 multi-unit clusters (M1: 4709; M2: 7358). During each session we simultaneously measured the activity of multiple neurons, and optimized the orientation and position of stimuli for a randomly selected unit. The neurons' receptive field centers were located in the lower right visual field (black dots in Figure 2A for an example session).

Figure 2. Stimulus conditions.

(A) Neurons' receptive field centers were located in the lower right visual field: black dots indicate receptive-field centers of 16 simultaneously recorded neurons from one recording session. White circles (1,2,3) indicate the three stimulus locations near the neurons' receptive field for this example session. Within a block of trials, only two stimulus locations were used: locations 1+2 or 1+3. (B) Nine possible stimulus combinations resulting from two stimulus locations and two orthogonal orientations. (C) Two receptive-field configurations: cRF-cRF stimulus configuration with two stimuli inside the neuron's classical receptive field. White dotted circle illustrates the cRF. (D) sRF-cRF stimulus configuration with one stimulus inside a neuron's cRF and an adjacent stimulus in its surround. Each stimulus location near the neurons' receptive fields (stimulus location 1,2,3 in 2A) had a corresponding stimulus location on the opposite side of the fixation point (stimuli near Away in C, D; see also Figure 2—figure supplement 1). (E) Pictograms illustrate for one Gabor pair the stimulus configurations used to calculate all indices. Cyan circles indicate the preferred Gabor (P), orange circles the non-preferred Gabor (N). Solid circles represent task conditions wherein attention was directed toward a stimulus location near the neurons' receptive field (PAttN, PNAtt). Dashed circles indicate that the stimulus was unattended and attention was directed toward another location.

DOI: http://dx.doi.org/10.7554/eLife.17256.004

Figure 2—figure supplement 1. Average PSTH for individual Gabor stimuli presented inside the classical receptive field (cRF) and within the surround.

Shown are the average responses from the same V4 neurons to a Gabor stimulus placed either inside the cRF (dashed line) or within the surround (solid line). Surround stimuli on average slightly suppressed the baseline response. Black vertical line indicates stimulus onset. Shading over the lines indicates ± SEM. Based on the responses from 558 neurons for which a surround position was examined.

In different blocks of trials, we measured neuronal responses to stimuli presented at three different receptive-field locations (stimulus locations 1, 2, 3 in Figure 2A). Within a block of trials, only two of these stimulus locations were used: e.g. location 1+2 or 1+3 in Figure 2A (Materials and methods). During each stimulus presentation within a trial, we presented one, two, or no stimuli at the two stimulus locations near the receptive field (Figure 2B).

Depending on the location of each neuron's receptive field, stimuli fell either inside the cRF or within the surround. We distinguished between two receptive-field configurations: one in which the two stimulus locations both lay inside the neuron's cRF (cRF-cRF, Figure 2C), and another in which one stimulus location was positioned inside the neuron's cRF while the other stimulus location was positioned inside its surround (sRF-cRF, Figure 2D). Because we tested the responses to stimuli shown at two location pairings (e.g. locations 1+2 in Figure 2C vs. 1+3 in Figure 2D), 309 neurons were tested in both a cRF-cRF and an sRF-cRF configuration (M1: 97; M2: 212). We classified locations as belonging to the cRF or sRF using stimulus presentations that included only one Gabor (Figure 2B; Materials and methods). Locations where either stimulus orientation generated a response were considered to lie within the cRF. Those where neither stimulus orientation generated a response were considered to lie within the surround (Figure 2—figure supplement 1).

In different blocks of trials, the monkeys directed their attention toward all possible stimulus locations, one attended location per block of trials, each time ignoring the other stimulus locations. Attention was directed toward stimulus locations near the neurons' receptive fields (e.g. locations 1, 2, or 3 in Figure 2A), or toward stimulus locations away from the receptive fields ('Away' in Figure 2C,D; attend away), i.e. to stimulus locations on the opposite side of the fixation point from the neuron's receptive field.

We quantified the stimulus selectivity of the neurons separately for each stimulus configuration. For each of four Gabor pairs (Figure 2B) at each pair of stimulus locations (i.e. location pairings 1+2 or 1+3, Figure 2A), we used a selectivity index ('Selectivity', Figure 2E): (P−N)/(P+N), that ranges from zero (unselective) to one (completely selective). Here P is the response to the component Gabor of a Gabor pair that generated the stronger average response when presented alone (preferred), and N is the response to the other component Gabor, which generated the weaker average response when presented alone (non-preferred). Note that the preferred and non-preferred Gabor within a pair were presented at two different receptive-field locations, and could have the same or a different orientation (Figure 2B). Thus stimulus selectivity between members of a Gabor pair could arise from a neuron's orientation selectivity and from its preference for spatial locations. In subsequent analyses, we will show that the relationship between attention modulation and stimulus selectivity does not depend on whether the stimulus feature is space or orientation. What is critical for attention modulation is a differential response to the component stimuli of a compound stimulus.

For both the cRF-cRF and sRF-cRF condition, we measured each neuron's stimulus-induced suppression for each Gabor pair at each pair of stimulus locations using a stimulus-induced suppression index: (P−PN)/(P+PN) (middle right pictogram Figure 2E). PN is the response to the Gabor pair (P and N defined as before). This index is negative when the neuronal response increases when a non-preferred stimulus is added to the preferred stimulus, and positive when the neuronal response is suppressed by the addition of a non-preferred stimulus to the preferred stimulus. By definition, neurons do not respond to a stimulus when it appears alone inside the surround, so the surround stimulus is invariably assigned as non-preferred (N).
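The selectivity and stimulus-induced suppression indices defined above can be illustrated with a short sketch. The function names and the firing rates are hypothetical and chosen only for illustration; the sketch also assumes that responses are positive firing rates with P ≥ N, which is what keeps the selectivity index between zero and one.

```python
def selectivity_index(p: float, n: float) -> float:
    """(P - N) / (P + N): 0 = unselective, 1 = completely selective."""
    return (p - n) / (p + n)


def suppression_index(p: float, pn: float) -> float:
    """(P - PN) / (P + PN): positive when adding N lowers the response to P."""
    return (p - pn) / (p + pn)


# Hypothetical attend-away firing rates in spikes/s (NOT data from the study).
p_alone = 40.0   # preferred Gabor presented alone
n_alone = 10.0   # non-preferred Gabor presented alone
pair = 30.0      # both Gabors presented together

sel = selectivity_index(p_alone, n_alone)   # (40 - 10) / (40 + 10)
sup = suppression_index(p_alone, pair)      # (40 - 30) / (40 + 30)
print(f"selectivity = {sel:.2f}, suppression = {sup:.2f}")
# prints: selectivity = 0.60, suppression = 0.14
```

Because both indices normalize a difference by a sum, they are insensitive to the overall firing rate of a neuron, which makes them comparable across neurons with very different response magnitudes.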

For both the selectivity index and the stimulus-induced suppression index, the responses to the preferred (P), non-preferred (N), and their combined presentation (PN) were measured in the same attention state: when attention was directed away from the neuron's receptive field (attend away). These responses are shown in the bar-plot insets in Figure 3A–D.

Figure 3. Example attention modulations.

Responses of four different neurons to a selected Gabor pair are shown (measured in different sessions). (A) Example 1: cRF-cRF configuration. Left panel shows this neuron's receptive-field map with the two stimulus locations at which the Gabors were presented overlaid (white-gray dots). Right panel PSTHs show the neuronal responses to the Gabor pair when attention was directed toward the preferred Gabor (cyan line; PAttN), the non-preferred Gabor (orange line; PNAtt), or a stimulus on the opposite side of the fixation point (green dashed line; PN; attend away). Bar-plot inset shows the responses of this neuron to a Gabor pair (PN) and its component Gabors (P, N), all measured in the attend away condition. This neuron's response was selective to the component Gabors of the Gabor pair (P vs. N), suppressed by the addition of a non-preferred Gabor to a preferred Gabor (P vs. PN), and strongly modulated when attention was shifted between the two component Gabors of the Gabor pair (PAttN vs. PNAtt). (B) Example 2: another neuron in the cRF-cRF configuration. This neuron showed weak selectivity, hardly any suppression, and little attention modulation. (C) Example 3: sRF-cRF configuration with one Gabor inside the neuron's cRF, and one Gabor inside its surround. By definition, the cRF Gabor is preferred (P) and the silent surround Gabor is non-preferred (N). The neuron responded highly selectively to the cRF and the surround Gabor when presented alone (P vs. N), showed surround suppression (P vs. PN), and was modulated by attention (PAttN vs. PNAtt). (D) Example 4: another neuron in the sRF-cRF configuration. This neuron was highly selective to the component Gabors of the Gabor pair, but only weakly suppressed by the surround Gabor, and showed little attention modulation. The insets show the average waveforms of the recorded neurons (blue) plus that of the multi-unit activity measured at the same electrode (grey). 
Shading around the mean represents ± 2 median absolute deviation (MAD). Scale bars indicate 50 μV and 0.1 ms. The receptive-field maps were normalized to the maximum response for each neuron during receptive-field mapping (RF max), dark blue shows the baseline response. Error bars represent ± SEM.

DOI: http://dx.doi.org/10.7554/eLife.17256.006

Figure 3A–D shows examples of attention-related response modulations of four different neurons to one selected Gabor pair: two neurons in the cRF-cRF configuration (A, B) and two neurons in the sRF-cRF configuration (C, D). The neuron in Figure 3A responded selectively to the two component Gabors of the Gabor pair shown inside the neuron's cRF (inset: P vs. N; selectivity index=0.44). Its response to the preferred Gabor was suppressed when the non-preferred Gabor was added to it (inset: P vs. PN; suppression index=0.22). The position of attention profoundly affected this neuron's responses: Compared to when attention was directed away from the neuron's receptive field (dashed green line; PN; attend away), attention to the preferred Gabor increased this neuron's response (cyan line; PAttN), whereas attention to the non-preferred Gabor suppressed its response (orange line; PNAtt).

Attention-related modulation was quantified using an attention-modulation index: (PAttN−PNAtt)/(PAttN+PNAtt) (lower pictogram, Figure 2E), which is positive when the neuronal response increases when attention is directed toward the preferred Gabor, compared to when attention is directed toward the non-preferred Gabor. The attention-modulation index for example 1 was 0.48.
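To give a feel for the scale of the attention-modulation index, the sketch below computes it from two firing rates and converts an index back into a response ratio. The function names are hypothetical, and the rates of 37 and 13 spikes/s are invented so that the index equals 0.48; they are not the measured rates of example 1.

```python
def attention_modulation_index(p_att_n: float, p_n_att: float) -> float:
    """(PAttN - PNAtt) / (PAttN + PNAtt) for the pair response under two attention states."""
    return (p_att_n - p_n_att) / (p_att_n + p_n_att)


def response_ratio_from_index(index: float) -> float:
    """Invert the index: PAttN / PNAtt = (1 + index) / (1 - index), valid for index < 1."""
    return (1.0 + index) / (1.0 - index)


# Hypothetical rates in spikes/s, chosen only to reproduce an index of 0.48.
attend_preferred = 37.0
attend_non_preferred = 13.0

ami = attention_modulation_index(attend_preferred, attend_non_preferred)
ratio = response_ratio_from_index(ami)
print(f"index = {ami:.2f}, response ratio = {ratio:.2f}")
```

Under this definition an index of 0.48 corresponds to a response that is almost three times larger when attention is on the preferred Gabor than when it is on the non-preferred Gabor.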

In contrast to example 1, the response of the neuron in Figure 3B was poorly selective to the component Gabors of the Gabor pair (P vs. N; selectivity index=0.066), showed little suppression when a non-preferred Gabor was placed alongside a preferred Gabor (P vs. PN; suppression index=0.04), and was only weakly modulated when attention shifted between the preferred and the non-preferred Gabor within the cRF (cyan vs. orange line; attention-modulation index=0.04).

Figure 3C shows the responses of a neuron to a Gabor pair in another stimulus configuration, in which one Gabor was placed inside the neuron's cRF and another Gabor inside its surround (sRF-cRF). As expected, the neuron responded much more to the cRF Gabor than to the surround Gabor (P vs. N; selectivity=0.963). When the surround Gabor was placed alongside the cRF Gabor, the neuron's response was greatly reduced, the hallmark of surround suppression (P vs. PN; suppression index=0.336). The neuron showed strong attention-related modulation: Compared to when attention was removed from both the cRF and the surround Gabor (dashed green line; attend away), attention to the cRF Gabor increased this neuron's response (cyan line), while attention to the surround Gabor sharply decreased its response (orange line; attention-modulation index=0.58).

The response of the fourth example neuron in Figure 3D was highly selective to the component Gabors of the Gabor pair (P vs. N), only slightly suppressed by the surround Gabor (P vs. PN), and its firing rate was barely modulated by attention (cyan vs. orange line; selectivity index=0.9; suppression index=0.05; attention-modulation index=0.08).

These examples illustrate the diverse stimulus selectivities, stimulus interactions (i.e. stimulus-induced suppression) and attention-related modulations in the neuronal responses in visual cortex. Next, we asked how variability in stimulus selectivity and stimulus-induced suppression relates to variability in attention modulation within the cRF and the surround across the sample of recorded neurons.

Relationship among selectivity, stimulus-induced suppression and attention modulation

We first examined the relationship between selectivity and attention modulation. Shifting attention between two Gabors inside the cRF was associated with larger response changes for neurons with more selective responses to the component Gabors of the Gabor pair (Figure 4A; cRF-cRF configuration; p=4 × 10^−109 for a non-zero slope; linear regression) (Reynolds et al., 1999). Attention-related modulation was also stronger for neurons that responded more selectively to the cRF and surround stimulus (Figure 4B; sRF-cRF configuration; p=3 × 10^−76 for a non-zero slope; linear regression). Low selectivity can occur in the sRF-cRF configuration when the cRF stimulus produces little response because it has a non-preferred orientation or is positioned at a weakly responsive cRF location. Comparing Figure 4A and B shows that attention-related modulation increases more with selectivity in the cRF-cRF than in the sRF-cRF configuration (p=5 × 10^−4 for different slopes in each receptive-field configuration; general linear model).

Figure 4. First-order analyses suggest that attention modulation follows different principles for stimuli inside the cRF and the surround.

(A, B) Average attention modulation as a function of the stimulus selectivity in the cRF-cRF and sRF-cRF configuration respectively. Low selectivity occurs in the sRF-cRF configuration when neurons respond weakly to the cRF stimulus, e.g. because of a non-preferred orientation or a weakly responsive cRF location, and have a baseline response to the surround stimulus. (C, D) Histogram of all stimulus-induced suppression indices measured in the cRF-cRF and sRF-cRF configuration respectively. The suppression index is negative when neurons increase their response when a non-preferred stimulus is added to the preferred stimulus (enhancing), and positive when neurons decrease their response when a non-preferred stimulus is added to the preferred stimulus (suppressing). Black bars indicate indices associated with Gabor pairs for which the suppression index differed significantly from zero (p<0.01; permutation t-test; see also Figure 4—figure supplement 1). Triangle points to the mean suppression index. (E, F) Average attention modulation as a function of stimulus-induced suppression in the cRF-cRF and sRF-cRF configuration respectively. Error bars represent ± SEM. (G, H) Stimulus-induced suppression versus stimulus selectivity for all Gabor pairs in the cRF-cRF (N = 1769) and sRF-cRF (N = 1768) configuration respectively.

DOI: http://dx.doi.org/10.7554/eLife.17256.007

Figure 4—figure supplement 1. Example neurons with strong surround suppression.

(A, B, C) Three neurons with significant surround suppression (p<0.01). The left panels show the average responses of the neurons to single stimuli presented inside the cRF (black; cRF) or the surround (light grey; Surround), and to the combined presentation of both the cRF and the surround stimulus (grey; cRF + surround). The insets show the average waveforms of the recorded neurons (blue) plus that of the multi-unit activity measured at the same electrode (grey). Shading around the mean represents ± 2 median absolute deviation (MAD). The right panels show each neuron's receptive-field map with the two stimulus locations at which the Gabors were presented overlaid (white-gray dots). The receptive-field maps were normalized to the maximum response for each neuron during receptive-field mapping (RF max); dark blue shows the baseline response. See Figure 3C for another example.

We next examined stimulus-induced suppression. V4 neuronal responses on average decrease when a non-preferred stimulus is added to a preferred stimulus inside their cRF (Figure 4C; average suppression index = 0.08, p=4 × 10^−104; t-test for a difference from zero) (Reynolds et al., 1999). Similarly, stimulating the surround decreases the average responses of V4 neurons (Figure 4D; average suppression index = 0.04, p=2 × 10^−28; t-test) (Schein and Desimone, 1990). However, stimuli inside the surround suppressed the neuronal responses less than stimuli inside the cRF: the average suppression index for the surround condition (sRF-cRF) was significantly smaller than the average suppression index for the cRF condition (M1: p=9 × 10^−6; M2: p=4 × 10^−15; t-test; see below and Figure 7 for further discussion). The black bars in Figure 4C and D represent neurons that were significantly (p<0.01) suppressed by the non-preferred (surround) stimulus. See Figure 4—figure supplement 1 for some example neurons with significant surround suppression (see also Figure 3C). Surround suppression was also weaker than cRF suppression when comparing only suppression indices that differed significantly from zero (p<0.001).

Extending previous findings in area MT (Ni et al., 2012; Lee, 2009), we find that V4 neurons with stronger stimulus-induced suppression by cRF stimuli also showed stronger attention modulation (Figure 4E; p=1 × 10^−31 for a non-zero slope; linear regression). Furthermore, and consistent with a previous study (Sundberg et al., 2009), attention modulation was also stronger for neurons whose responses were more suppressed by a surround stimulus (Figure 4F; p=5 × 10^−4 for a non-zero slope; linear regression). However, comparing Figure 4E and F shows that attention-related modulation increases more with stimulus-induced suppression in the sRF-cRF than in the cRF-cRF configuration (p=0.005 for different slopes in each receptive-field configuration; general linear model).

A previous study in V4 examining the relationship among stimulus selectivity, sensory interaction (akin to stimulus-induced suppression) and attention modulation found a strong correlation between stimulus selectivity and sensory interaction (Reynolds et al., 1999). In the present study, however, stimulus selectivity and stimulus-induced suppression were not significantly correlated with each other across neurons (Figure 4G,H; Pearson correlation = 0.02, p=0.32; see Discussion for further comments on the difference between studies). This finding shows that the correlations between stimulus selectivity and attention modulation, and that between stimulus-induced suppression and attention modulation, are not explained by an underlying correlation between selectivity and suppression. Furthermore, and in contrast to previous studies, the lack of a correlation between both indices allowed us to examine the separate contributions of selectivity and suppression to the magnitude of attention modulation.

The different relationships described above among stimulus selectivity, stimulus-induced suppression and attention-related modulation in the cRF-cRF and the sRF-cRF configuration suggest that the rules that govern attention modulation differ between the cRF and the surround. Next we falsify this suggestion and show how these different relationships in the two receptive-field configurations can be explained by a common rule.

Stimulus selectivity and stimulus-induced suppression interact in determining attention modulation and do so similarly inside the cRF and the surround

We used multiple linear regression to examine if attention-related modulation depends on the joint magnitude of stimulus selectivity and stimulus-induced suppression. For both receptive-field configurations, the regression model included a main effect of selectivity and a main effect of stimulus-induced suppression. Importantly, in each RF configuration the regression model also included an interactive product term, which measured the dependency of attention-related modulation on both selectivity and stimulus-induced suppression, i.e. this term measures whether the relationship among selectivity, suppression and attention modulation is non-additive (see Materials and methods for further information).
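A regression with two main effects and a product term can be sketched as below. This is an illustration under stated assumptions, not the authors' analysis code: the model form `att = b0 + b1*sel + b2*sup + b3*sel*sup`, the function names, the coefficients and the noise-free synthetic data are all hypothetical, and the exact specification used in the study is given in its Materials and methods.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back-substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_interaction_model(sel: Sequence[float], sup: Sequence[float],
                          att: Sequence[float]) -> List[float]:
    """Ordinary least squares for att = b0 + b1*sel + b2*sup + b3*sel*sup."""
    rows = [[1.0, s, u, s * u] for s, u in zip(sel, sup)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, att)) for i in range(k)]
    return solve_linear_system(xtx, xty)             # normal equations


# Noise-free synthetic data on a grid (hypothetical coefficients, NOT fitted values).
true_b = [0.02, 0.05, 0.10, 0.80]
grid = [(s, u) for s in (0.0, 0.25, 0.5, 0.75, 1.0) for u in (-0.2, 0.0, 0.2, 0.4)]
sel = [s for s, _ in grid]
sup = [u for _, u in grid]
att = [true_b[0] + true_b[1] * s + true_b[2] * u + true_b[3] * s * u for s, u in grid]

b0, b1, b2, b3 = fit_interaction_model(sel, sup, att)
print(f"b_sel = {b1:.3f}, b_sup = {b2:.3f}, b_interaction = {b3:.3f}")
```

In this parameterization `b1` is the slope of attention modulation on selectivity when suppression is zero, and a non-zero `b3` means that this slope changes with suppression, which is what a non-additive relationship refers to.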

Figure 5 shows how attention modulation varies with selectivity and stimulus-induced suppression (5A: cRF-cRF; 5B: sRF-cRF). For both configurations, the relationship is non-additive. Specifically, Figure 5A and B show that when stimulus-induced suppression is low, attention modulation is weak, even when attention is shifted between a strong and a weak stimulus (upper left corner in Figure 5A,B). That is, the effect of selectivity near zero stimulus-induced suppression is weak, although significant (main effect of selectivity at zero stimulus-induced suppression: cRF-cRF: p=2 × 10⁻⁶⁴; sRF-cRF: p=2 × 10⁻⁶⁰; M1: p=7 × 10⁻¹³⁶ across RF configurations; M2: p=5 × 10⁻³⁰ across RF configurations).

Figure 5. Selectivity and stimulus-induced suppression interact to control attention modulation.

(A, B) Average attention modulation as a function of stimulus-induced suppression (x-axis) and stimulus selectivity (y-axis) in the cRF-cRF and sRF-cRF configuration respectively. The magnitude of attention modulation is indicated by color (red = strong, blue = weak). Note that, although the data covered most of this space (see Figure 4G,H), a few regions, e.g. the lower right corner in (B), were not well sampled. (C) Model schematic. Every stimulus contributes an excitatory drive (L1 and L2) to the neuron's response (R1,2att) to a Gabor pair. Each stimulated receptive-field location, either inside the cRF or inside the surround, contributes divisive suppression (α1 and α2) to the neuron's response. The divisive suppression is fixed for each receptive-field location, independent of the stimulus presented at that location. A small amount of baseline suppression is further added (σ parameter; not shown). Directing attention toward a stimulus location has a multiplicative effect (β) on the parameters (L2 and α2) corresponding to the attended receptive-field location (location 2 in the schematic). (D, E) Average model-predicted attention modulation as a function of the observed stimulus-induced suppression (x-axis) and the observed stimulus selectivity (y-axis) in the cRF-cRF and sRF-cRF configuration respectively (see also Figure 5—figure supplement 1). Same conventions as in (A, B).

DOI: http://dx.doi.org/10.7554/eLife.17256.009

Figure 5—figure supplement 1. Example single-neuron responses and their corresponding model fits.

(A) Neuron with a preferred (P) and non-preferred (N) stimulus presented inside the cRF (cRF-cRF condition). Black: observed responses. Grey: modeled responses. P, Patt, N, Natt show the responses to the individually-presented preferred and non-preferred stimulus with attention away (P, N), or attention directed to the stimulus (Patt, Natt). PN shows the condition in which both stimuli were presented simultaneously with attention away (PN), attention directed toward the preferred stimulus (PattN), or directed toward the non-preferred stimulus (PNatt). The values of the model parameters for each example neuron are shown on the right. Note that these parameter values correspond to spike counts in a 250 ms window and should be multiplied by four to obtain spikes/s. This neuron's response is suppressed when a non-preferred stimulus is added to a preferred stimulus (P vs. PN). The model accounts for this difference because the non-preferred stimulus induces little excitation (small L2) but substantial suppression (α2), so suppression dominates over excitation. The model also captures the strong attention modulation (PattN vs. PNatt) through the β parameter, which multiplies the excitatory drive (L) and suppressive drive (α) of the attended stimulus. By increasing the weight of both drives, attention effectively focuses on the inputs related to the attended stimuli, as if the inputs from other stimuli were attenuated. So attention to a weak stimulus decreases the response, while attention to a strong stimulus increases the response (i.e. attention modulation). (B) Neuron with a cRF (P) and surround (N) stimulus (sRF-cRF condition). The model accounts for the observed suppression and attention modulation, which is similar to that of the neuron in A (cRF-cRF condition). (C) Neuron with a cRF (P) and surround (N) stimulus (sRF-cRF condition). The surround stimulus induces no surround suppression (low α2 and L2 values). As a result, shifting attention between the cRF and the surround stimulus leads to virtually no attention modulation.

Conversely, when selectivity is low, attention modulation will also be weak, even when attention is shifted between stimuli that strongly suppress each other’s response (bottom right corner in Figure 5A,B; effect of stimulus-induced suppression at zero selectivity: cRF-cRF: p=0.5; sRF-cRF: p=0.6; M1: p=0.4 across RF configurations; M2: p=0.3 across RF configurations).

Strong attention-related modulation occurs only when selectivity and suppression are both large, and this was true for both RF configurations (upper right corner in Figure 5A,B; interaction between stimulus-induced suppression and selectivity: cRF-cRF: p=2 × 10⁻⁹; sRF-cRF: p=3 × 10⁻⁶; M1: p=1 × 10⁻⁴ across RF configurations; M2: p=3 × 10⁻²⁰ across RF configurations).

The interaction between selectivity and stimulus-induced suppression did not differ significantly between the cRF-cRF and sRF-cRF configuration (M1: p=0.6; M2: p=0.85; 3-way interaction), nor did any other interaction with RF configuration. Because a non-significant effect does not indicate the absence of an effect, we performed a Bayesian regression analysis (Materials and methods). This analysis showed that the observed data are 347 times more likely to agree with a regression model that does not distinguish between the cRF-cRF and sRF-cRF configurations than with a model that does include RF-configuration as a predictor. Thus attention modulation is driven by similar mechanisms within the cRF and the surround.

Attention modulation depends on a general definition of stimulus selectivity

To examine response modulations associated with shifting attention between two receptive-field stimuli, previous studies used two different stimuli (e.g. stimuli of different orientations, colors, directions), each presented at a different but approximately equally-responsive cRF position (Moran and Desimone, 1985; Reynolds et al., 1999; Ghose and Maunsell, 2008; Lee and Maunsell, 2010; Ni et al., 2012). Similar to these previous studies, we reproduced the above findings using the data from conditions with low spatial selectivity. When attention shifted between stimuli at two approximately equally-responsive cRF positions (less than 2 spikes/s response difference when each of the two cRF positions is stimulated with an identical single stimulus), similar effects were observed (main effect of orientation selectivity: p<0.001; main effect of suppression: p=0.7; interaction between feature selectivity and suppression: p=0.02; multiple linear regression).

Next, we examined whether the converse situation, i.e. same stimuli at unequally-responsive cRF positions, would produce attention modulations comparable to those described earlier. We found similar attention-related modulations using Gabor pairs consisting of identical Gabor stimuli presented at various cRF positions. Note that in this situation selectivity to the component Gabors of a Gabor pair originates solely from a neuron's spatial preferences (receptive field), because the Gabors are identical. All of the above findings were replicated using only the data obtained with Gabor pairs consisting of identical Gabors (main effect spatial selectivity: p=4 × 10⁻²²; main effect suppression: p=0.29; interaction between spatial selectivity and suppression: p=6 × 10⁻⁶; multiple linear regression). This indicates that attention-related modulation depends on a differential response to the component stimuli of a compound stimulus, regardless of the origin of the response difference (feature or spatial). Accordingly, at their most abstract level, models of neuronal attention modulation only need to account for the responses arising from different component stimuli, whether they arise from preferences for specific stimulus features, preferences for certain parts of the receptive field, or both.

A spatially-tuned normalization model captures attention modulation inside the cRF and in the surround

Our findings reveal a striking uniformity in the rules that govern attention modulation inside the cRF and within the surround: the interaction between stimulus selectivity and stimulus-induced suppression strongly influences how much attention modulates neuronal responses. Hence, any model of neuronal attention modulation needs to embody this relationship. We found that a spatially-tuned normalization model can readily capture this interaction (Materials and methods).

We used a spatially-tuned normalization model, described as follows (Figure 5C):

$$R_{1,2} = \frac{L_1 + L_2}{\alpha_1 + \alpha_2 + \sigma} \tag{1}$$

where R1,2 is the neuronal response to a Gabor pair consisting of Gabors 1 and 2. L1 and L2 are the excitatory drives associated with each component Gabor. The α1 and α2 parameters control the suppressive drive of each stimulated cRF or surround location. In this model, α1 and α2 are each associated with one receptive-field location, and do not vary with the orientation of the stimuli shown at those locations. Because the suppression, or normalization, is free to vary across receptive-field locations, the normalization is spatially tuned. In fitting the data, we fixed α1 at one to constrain the model. The σ parameter adds baseline suppression. Directing attention toward the first (R1att,2; Equation (2)) or second (R1,2att; Equation (3)) receptive-field location has a multiplicative effect on the parameters corresponding to the attended receptive-field location. This is described by the β parameter in Equations (2) and (3), which are written with α1 fixed at one:

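$$R_{1att,2} = \frac{\beta L_1 + L_2}{\beta + \alpha_2 + \sigma} \tag{2}$$

$$R_{1,2att} = \frac{L_1 + \beta L_2}{1 + \beta \alpha_2 + \sigma} \tag{3}$$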

The model was fit to each neuron's responses in all stimulus conditions: including conditions with one stimulus or two stimuli near the receptive field, and conditions with attention directed toward stimulus locations near the receptive field or away from it (Materials and methods).
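Equations (1) to (3) can be written compactly as a single function, as in the following minimal sketch. The function names and all parameter values are hypothetical and chosen for illustration only, and the simple response difference used here as a measure of attention modulation is an assumption that stands in for the index defined in Materials and methods.

```python
def pair_response(L1, L2, alpha2, sigma, beta=1.0, attend=None, alpha1=1.0):
    """Response to a stimulus pair under the spatially-tuned normalization model.

    attend=None gives Equation (1), attend=1 gives Equation (2) and
    attend=2 gives Equation (3). alpha1 defaults to one, as in the fits.
    """
    b1 = beta if attend == 1 else 1.0  # attention gain at location 1
    b2 = beta if attend == 2 else 1.0  # attention gain at location 2
    excitation = b1 * L1 + b2 * L2
    suppression = b1 * alpha1 + b2 * alpha2 + sigma
    return excitation / suppression

def attention_modulation(L1, L2, alpha2, sigma, beta):
    """Response with attention on location 1 minus attention on location 2
    (an illustrative measure, not the index used in the paper)."""
    r_att1 = pair_response(L1, L2, alpha2, sigma, beta, attend=1)
    r_att2 = pair_response(L1, L2, alpha2, sigma, beta, attend=2)
    return r_att1 - r_att2

# Hypothetical parameter values for three regimes.
strong_both = attention_modulation(L1=20.0, L2=2.0, alpha2=1.0, sigma=0.1, beta=3.0)
weak_suppression = attention_modulation(L1=20.0, L2=2.0, alpha2=0.1, sigma=0.1, beta=3.0)
no_selectivity = attention_modulation(L1=20.0, L2=20.0, alpha2=1.0, sigma=0.1, beta=3.0)
print(strong_both, weak_suppression, no_selectivity)
```

With these illustrative values, modulation is large only when the second stimulus is both weakly excitatory and strongly suppressive. It shrinks when the second location contributes little suppression, and vanishes when the two stimuli drive the neuron equally.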

The spatially-tuned normalization model provided an accurate account of the neuronal data, giving a median two-fold cross-validated explained variance of 87% across neurons (M1: 86%; M2: 88%). For the 309 neurons (M1: 97; M2: 212) that were tested in both a cRF-cRF and an sRF-cRF configuration the responses were equally well explained (M1: 86%; M2: 87%).

The model captures the way attention modulates neuronal responses to stimuli inside the cRF or the surround. Figure 4A,B,E,F (light grey points) show that the model precisely accounts for attention modulation across the full range of observed stimulus selectivity and stimulus-induced suppression values, within both the cRF-cRF and sRF-cRF configuration. Figure 5D,E shows the average model predictions based on the model fits from all neurons in the cRF-cRF (D) and sRF-cRF (E) configurations (see Figure 5—figure supplement 1 for response fits of individual neurons). The model reproduces the way stimulus selectivity and stimulus-induced suppression interacted in both the cRF-cRF and the sRF-cRF configurations: predicting large attention modulation when both selectivity and suppression are strong, but little attention modulation when either selectivity or suppression is low. Thus this single model describes how attention modulates responses to stimuli inside the cRF or the surround.

The previous analyses were based on stimulus configurations with two stimuli inside the neurons' receptive field. Importantly, the model also accounts for the neuronal effects of attention to single stimuli at different receptive field locations. This is shown in Figure 6, which shows how the effect of attention varies when attention is directed to single stimuli at various distances from the receptive field center: for the observed (A) and the modeled (B) responses. Attention modulation to single stimuli is typically small compared to attention modulation with multiple stimuli. This is because in single stimulus conditions there are no suppressive influences from a flanking stimulus, and we have shown that these suppressive influences are necessary to induce strong attention modulation (Figure 5). The model accounts for these smaller attention modulations with single stimuli.

Figure 6. The spatially-tuned normalization model captures how attention modulates responses to single stimuli presented at various receptive field locations.

(A) Observed responses. Average response as a function of the distance between the single stimulus and the receptive field center, when the stimulus is attended (white) or unattended (black). (B) Same as (A) but for the modeled responses. The responses of each neuron were normalized to the maximum response across conditions in which a single stimulus was presented inside the receptive field. The receptive field distance is given by the Mahalanobis distance from the Gaussian receptive-field center. The Mahalanobis distance is akin to the number of standard deviations (σ) away from the receptive-field center. Only neurons whose receptive fields were well fitted with a two-dimensional Gaussian profile (>80% explained variance; 306 neurons; M1: 95; M2: 211) were included. Error bars represent ± SEM.

DOI: http://dx.doi.org/10.7554/eLife.17256.011
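The Mahalanobis distance used as the receptive-field distance in Figure 6 can be computed from a fitted two-dimensional Gaussian as in the sketch below. The function name and the example receptive-field parameters are hypothetical, and the covariance matrix is assumed to come from a Gaussian fit such as the one described in the caption.

```python
import numpy as np

def mahalanobis_distance(point, center, cov):
    """Distance of a stimulus position from a Gaussian receptive-field
    center, in units akin to standard deviations of the Gaussian."""
    diff = np.asarray(point, dtype=float) - np.asarray(center, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Solve cov @ x = diff rather than inverting cov explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Hypothetical receptive field: SD of 2 deg horizontally, 1 deg vertically.
rf_center = [0.0, 0.0]
rf_cov = [[4.0, 0.0], [0.0, 1.0]]
d_horizontal = mahalanobis_distance([2.0, 0.0], rf_center, rf_cov)
d_vertical = mahalanobis_distance([0.0, 2.0], rf_center, rf_cov)
print(d_horizontal, d_vertical)
```

In this example the same 2 deg offset corresponds to one standard deviation along the elongated axis but two along the narrow axis, which is why this measure can differ from distance in visual degrees.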

Stimulus-induced suppression depends on distance from the receptive-field center

What determines the spatial tuning of suppression? For each neuron we sorted the value of the suppression-parameter α, associated with each of the three measured receptive-field locations (Figure 2A), as a function of the distance of each receptive-field location from the neuron's receptive-field center (Figure 7A). Within neurons, locations closest to the receptive-field center induced on average greater suppression than locations furthest away from the receptive-field center (average α-parameter values of closest vs. furthest location: p=2 × 10⁻²³; M1: p=7 × 10⁻⁵; M2: p=0.005; paired permutation t-test). This is further illustrated in Figure 7B,C (gray), which shows the average normalized α-parameter value as a function of the distance from the receptive-field center. Lower α-values at greater distances are consistent with the observation that surround suppression is significantly weaker than cRF suppression (Figure 4C,D; see above).
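The sorting and normalization of the α values can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up parameter values, assuming each neuron contributes one α value and one distance per measured receptive-field location.

```python
import numpy as np

def rank_normalize_alpha(alphas, distances):
    """Order one neuron's alpha values from the closest to the furthest
    receptive-field location and normalize by that neuron's maximum alpha."""
    alphas = np.asarray(alphas, dtype=float)
    distances = np.asarray(distances, dtype=float)
    order = np.argsort(distances)  # index 0 = closest location
    return alphas[order] / alphas.max()

def average_by_rank(per_neuron_alphas, per_neuron_distances):
    """Average the ranked, normalized alpha values across neurons."""
    ranked = [rank_normalize_alpha(a, d)
              for a, d in zip(per_neuron_alphas, per_neuron_distances)]
    return np.mean(np.stack(ranked), axis=0)

# Two hypothetical neurons, three locations each.
example_alphas = [[1.0, 0.4, 0.8], [0.5, 1.0, 0.25]]
example_distances = [[0.5, 2.0, 1.2], [1.0, 0.3, 2.5]]
one_neuron = rank_normalize_alpha(example_alphas[0], example_distances[0])
mean_by_rank = average_by_rank(example_alphas, example_distances)
print(one_neuron, mean_by_rank)
```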

Figure 7. Spatially-tuned excitation and suppression decrease with distance from the receptive-field center, but at different rates.

(A) In each recording session, we measured neuronal responses to stimuli presented at three different receptive-field locations (Figure 2A). The responses of each neuron were fitted with the spatially-tuned normalization model. The values of the suppression parameter α associated with each of the three measured receptive-field locations were ranked according to the proximity of those receptive-field locations to the neuron's receptive-field center: 1 being closest, and 3 being furthest away from the receptive-field center. The suppression parameter values were then normalized by the maximum α-value for each neuron. For each ranking number, the normalized suppression parameter values were subsequently averaged across neurons. Stimulus locations closest to the receptive-field center contributed more suppression to the neurons' response than those furthest away. (B) Average normalized suppressive drive (α, gray) and excitatory drive (L, black) as a function of the distance (in visual degrees) of the corresponding receptive-field location from the receptive-field center. The values of the excitatory drive parameter L for stimuli of different orientations were averaged per receptive-field location, and normalized by the maximum excitatory drive across the three measured receptive-field locations of a neuron. (C) Same as (B) but with an alternative distance measure, namely the Mahalanobis distance, which is akin to the number of standard deviations (σ) away from the receptive-field center. In (B) and (C), each excitatory-drive value L (black) has a corresponding suppressive-drive value α (gray). Error bars represent ± SEM.

DOI: http://dx.doi.org/10.7554/eLife.17256.012

In addition to suppression, Figure 7B,C also shows the strength of the excitatory drive (measured by the average L-parameter values) as a function of the distance from the receptive-field center (black). Comparing the curves for the excitatory (black) and the suppressive drive (gray) reveals a striking similarity between the receptive field structure of V4 neurons and that previously observed in primary visual cortex (V1) (Cavanaugh et al., 2002b; Sceniak et al., 1999; DeAngelis et al., 1994): both excitation and suppression are maximal near the receptive-field center, but excitation is more spatially concentrated, with suppression stretching over larger distances.

The spatial pattern of suppression suggests that suppression in the cRF and the surround are continuous extensions of one another. These findings indicate that attention operates uniformly on the spatially-continuous excitation and suppression of a neuron's receptive field.

Spatial variability in excitation and suppression underlies differences in attention modulation across neurons

Is a variable top-down attention signal necessary? We fixed the attention parameter β at a constant value for all neurons, namely the mean β value across neurons when estimated as a free parameter. This constrained model accounted for the data almost as well as the model in which β was free to vary (median two-fold cross-validated percentage explained variance 86%, compared to 87% for the unconstrained model). This finding suggests that differences in attention modulation between neurons are only weakly related to differences in the top-down attention signal across neurons (see also Ni et al., 2012).

In contrast, spatial tuning is important. When we instead kept the α terms constant and allowed β to vary across neurons, the model's performance decreased significantly (77%; p=0.004, Sequential F-test), especially for neurons that were tested in both a cRF-cRF and an sRF-cRF configuration (73%; p=0.002).

Hence when a fixed stimulus is shown, differences in attention modulation across neurons appear to arise when a relatively uniform top-down attention signal interacts with the different amounts of excitation and suppression elicited by that stimulus in each neuron.

Discussion

We measured the dependency of neuronal attention modulation on stimulus selectivity and stimulus-induced suppression throughout different receptive field regions, including the surround, of V4 neurons. We found that stimulus selectivity and stimulus-induced suppression strongly interact to determine the magnitude of attention modulation in neurons. This interaction determined attention modulation within both the classical receptive field and the surround, indicating that remarkably similar principles drive attention modulation inside the center and surrounding regions of the receptive field. A spatially-tuned normalization model, fitted to the responses of individual neurons, captured the dependency of attention modulation on both stimulus selectivity and suppression, and provided an excellent account of how attention operates across different regions of the receptive field, with either single or multiple stimuli shown inside. Each stimulus configuration induced variable amounts of excitation and suppression in different neurons, depending on the receptive field position of the stimuli. Attention operates on this variable excitation and suppression, thereby explaining why the magnitude of attention-related modulations varies so widely across neurons.

Reynolds et al. (1999) observed a strong correlation between stimulus selectivity and their index of stimulus-induced suppression. It is important to note that this strong correlation is not a general property of spatial summation in visual cortex. Instead, the correlation they observed likely arose from the experimental design used in that study. Specifically, for each neuron Reynolds and colleagues presented different stimulus pairs that fell on two locations that were always chosen to lie well within the cRF at similar distances from the receptive field center. The authors kept the stimulus at one location fixed (reference stimulus), but varied the orientation and color of the stimulus at the second location. Varying the orientation and color of a stimulus while keeping its location fixed predominantly varies the excitation, and not the suppression, that the stimulus contributes to the neuron (see Appendix; we found no evidence for orientation-tuned suppression in our data). Consequently, both the selectivity and the stimulus-induced suppression index varied with a single variable, the strength of the second stimulus relative to the strength of the fixed stimulus. This explains the strong correlation between stimulus selectivity and suppression in that study. In contrast, we presented stimuli at locations across both the classical and surrounding receptive field, allowing both stimulus-induced excitation and suppression to vary across stimulus configurations. This is the primary reason for the difference between the studies. The variable reference stimulus across neurons in Reynolds et al., and the slightly different indices used in the two studies, will have further amplified the differences in findings between the two studies. Hence, under more general stimulus conditions wherein stimuli can fall on any receptive field location, stimulus selectivity and stimulus-induced suppression are not correlated.

Importantly, by avoiding a correlation between selectivity and suppression, we were able to examine their separate contributions to the magnitude of attention modulation. Shifting attention between a strong and a weak stimulus, each presented at different receptive-field locations, changes which neuronal inputs are emphasized, thereby causing neuronal-response modulations. However, we show that such clear-cut differential processing only occurs if the weaker stimulus also induces strong suppression: without suppression, attention has no leverage to amplify input differences. The spatially-tuned normalization model captures the dependency of attention-related modulation on both selectivity and suppression, and does so for neurons with stimuli in either the cRF or the surround.

Both Sundberg et al. (2009) and Sanayei et al. (2015) used conditions with one stimulus inside the classical receptive field and at least one other stimulus in the surround (sRF-cRF). However, neither of these studies used a condition with both stimuli positioned inside the cRF (cRF-cRF). Hence, a direct comparison of attention modulation within the cRF and the surround could not be performed in these studies. This comparison is crucial to determine whether the neuronal effects of attention differ between the cRF and the surround. Importantly, we found that seeing the similarity between attentional effects in both receptive field regions requires examination of the combined relationship, i.e. interaction, between attention modulation and both stimulus selectivity and stimulus-induced suppression (Figure 4 versus 5). The interaction between these variables was similar in both receptive field configurations.

Sanayei et al. fit different (normalization) models, but never measured the neuronal effects of attention when attention was directed to the surround stimulus; the authors only compared the effects of attention to the cRF stimulus, with or without surround stimulation, versus attention to a distant stimulus. Attention was never directed to the surround stimuli. Thus, Sanayei et al. lacked the crucial information needed to examine how surround suppression affects attention modulation and to test the efficacy of normalization models. We tested whether a single model could fit both the cRF-cRF and sRF-cRF data. The good fits of the spatially-tuned normalization model to the data obtained in both receptive field configurations provided further evidence that attention acts similarly inside the cRF and the surround.

Suppression and excitation may rely on distinct mechanisms in different regions of the receptive field (Angelucci et al., 2014). Our data do not pertain to these different mechanisms and we may have missed some small differences in attention modulation associated with these distinct mechanisms. Nonetheless, our findings show that the way attention interacts with excitation and suppression across different regions of the receptive field is remarkably similar.

In recent years, several normalization models of attention have been proposed (Reynolds et al., 1999; Ghose and Maunsell, 2008; Ni et al., 2012; Ghose, 2009; Reynolds and Heeger, 2009; Lee, 2009; Boynton, 2009; Lee et al., 1999). Two of the more elaborate models explicitly assumed that attention acts on a specific receptive-field structure, namely one that encompasses a relatively narrow excitatory field in addition to a wider suppressive field (Ghose, 2009; Reynolds and Heeger, 2009). This receptive-field structure is based on findings from primary visual cortex (V1) (Cavanaugh et al., 2002b; Sceniak et al., 1999; DeAngelis et al., 1994). It is important to note, however, that none of these studies empirically tested if attention actually operates on the spatially-varying excitation and suppression implied by such a receptive field structure. We started with a spatially-tuned normalization model that made no assumptions about the structure of excitation and suppression in the receptive field of V4 neurons. Furthermore, and in contrast to the earlier-mentioned studies (Ghose, 2009; Reynolds and Heeger, 2009), we explicitly fitted models to the responses of individual neurons to test relationships with the underlying receptive field structure. Interestingly, this naive model reveals that the receptive field organization of V4 neurons strongly resembles that of V1 neurons (Cavanaugh et al., 2002b; Sceniak et al., 1999; DeAngelis et al., 1994): both excitation and suppression are maximal near the receptive-field center, but excitation is more spatially concentrated, while suppression stretches over larger distances. These findings suggest that similar receptive field organizations can be found throughout different stages of visual cortex. 
Importantly, our findings show that attention operates uniformly across the spatially-varying excitation and suppression of a receptive field: throughout the receptive field, including the surround, attention-related modulations of neuronal responses are governed by very similar normalization rules.

The finding that the rules of neuronal attention modulation are similar across different regions of the receptive field simplifies our view of attentional operations in visual cortex, and provides strong support for normalization models of attention (Ghose, 2009; Reynolds and Heeger, 2009). We also show that the origin of the stimulus-induced excitation is not important for determining the magnitude of attention modulation: we found no distinction between excitation related to a neuron's feature tuning (i.e. orientation tuning) or spatial tuning (i.e. receptive field). What matters for neuronal attention modulation is stimulus-induced excitation, regardless of its origin, in conjunction with spatially-tuned suppression. It follows that when a particular stimulus configuration induces variable amounts of excitation and suppression in different neurons, attention-related modulations will vary across these neurons.

The fact that attention operates on the spatially-varying excitation and suppression of a receptive field has important implications, as it determines which neurons will be most influenced by attention. For instance, with a given number of stimuli presented inside the receptive field, attention to a preferred stimulus shown inside the center of the receptive field typically has the greatest potential to elevate neuronal responses. This is true not only because stimuli near the receptive field center generally elicit most excitation, but also because such stimuli are most likely to induce the greatest suppression. The elevated suppression by center stimuli gives them more weight in normalization mechanisms as it allows them to better discount the suppressive influences from other simultaneously presented stimuli. Similarly, attention to a weak stimulus inside the receptive field center will in general reduce responses more than attention to a weak stimulus elsewhere in the receptive field, including the surround. This does not mean that stimuli in the surround, which induce relatively less suppression, have little impact on attention modulation. Indeed, because the surround is so much larger than the cRF it can contribute considerable suppression. Such strong surround suppression likely occurs under natural viewing conditions where stimuli are shown throughout the visual field, many of them covering the surround (Vinje and Gallant, 2000; Ozeki et al., 2009; Haider et al., 2010; Coen-Cagli et al., 2015). The normalization model predicts that such strong surround suppression may robustly amplify attention modulation, much beyond the attention modulation observed without surround suppression.

This effect is illustrated in Figure 8 in which spatial attention was applied to model neurons with (upper panels Figure 8) or without a surround (lower panels Figure 8). The model neurons with a surround strongly modulated their responses by attention, but those without a surround much less (Figure 8F upper vs. lower panel). Hence, although the precise role of the surround is still unknown (Schwartz and Simoncelli, 2001; Vinje and Gallant, 2000; Sachdev et al., 2012), an important contribution of the surround may lie in its ability to amplify attention-related response modulations.

Figure 8. The surround may amplify spatial attention under natural viewing conditions.

(A) Original image. (B) Model neurons tiled the image. Each pixel contained one model neuron with its receptive field centered on that pixel. An example cRF (solid red) and surround (red dashed) for one neuron are shown. The radius of each neuron's surround was approximately five times larger than the radius of its cRF. The model neurons computed local contrast within the excitatory and suppressive component of their receptive field. The response maps show each neuron's response: neurons near high-contrast regions responded most, as indicated by the luminance of the pixels. (C) Original image scaled according to the response map in (B). (D) Attention was directed to the left eye. Attention weighed the excitatory and suppressive inputs with its Gaussian kernel, resulting in stronger responses of the neurons with receptive fields near the attended location relative to neurons with receptive fields outside the locus of attention. (E) Original image scaled according to the response map in (D), illustrating the way attention changes the visual representation. (F) Attention modulation of each neuron, defined as (responseAtt - response) / (responseAtt + response). Here, response is the response map without attention as in (B), while responseAtt is the response map with attention as in (D). Upper panels (B–F) are based on model neurons with a suppressive surround. Lower panels (B–F) are based on model neurons without a suppressive surround, but with the same amount of suppression inside the cRF as the neurons with a suppressive surround.

DOI: http://dx.doi.org/10.7554/eLife.17256.013
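The attention modulation index defined in the caption of Figure 8F can be computed per model neuron (pixel) as in the following sketch. The function name and the example response maps are hypothetical, and setting the index to zero where both responses are zero is an assumption made here to avoid division by zero.

```python
import numpy as np

def attention_modulation_index(response_att, response):
    """(responseAtt - response) / (responseAtt + response), element-wise.

    Entries where both responses are zero are assigned an index of zero
    (an assumption; the caption does not specify this case)."""
    response_att = np.asarray(response_att, dtype=float)
    response = np.asarray(response, dtype=float)
    total = response_att + response
    return np.divide(response_att - response, total,
                     out=np.zeros_like(total), where=total != 0)

# Hypothetical 2 x 2 response maps without and with attention.
response_map = [[10.0, 0.0], [5.0, 20.0]]
response_att_map = [[30.0, 0.0], [5.0, 10.0]]
index_map = attention_modulation_index(response_att_map, response_map)
print(index_map)
```

For non-negative responses the index is bounded between -1 and 1, with positive values indicating that attention increased the response.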

Materials and methods

Surgical procedures

Two male rhesus monkeys, M1 and M2 (Macaca mulatta, both 9 kg), were trained to perform a spatial attention task. Monkeys were pair-housed on a standard 12:12 light-dark cycle and given food ad libitum. Before training, each animal was implanted with a head post. After completion of the behavioral training (~7 months), we implanted a 10 × 10 array of microelectrodes into area V4 of the left cerebral hemisphere, between the lunate sulcus and the superior temporal sulcus. Before surgery, animals were given buprenorphine (0.005 mg/kg, intramuscular) and flunixin (1.0 mg/kg, intramuscular) as analgesics, and a prophylactic dose of an antibiotic (Baytril, 5 mg/kg, intramuscular). For surgery, animals were sedated with ketamine (15 mg/kg, intramuscular) and xylazine (2 mg/kg, intramuscular) and given atropine (0.05 mg/kg, intramuscular) to reduce salivation. Anesthesia was maintained with 1–2% isoflurane. Antibiotic was administered again 1.5 hr into surgery; buprenorphine and flunixin were given for 48 hr post-operatively. All procedures were approved by the Institutional Animal Care and Use Committee of Harvard Medical School (Boston, MA; protocol #04214).

Visual stimulation

Stimuli were presented on a gamma-corrected cathode-ray tube (CRT) display with a 100 Hz frame rate and a resolution of 1024 × 768 pixels. Monkeys were seated 57 cm from the center of the screen. Stimuli consisted of full-contrast achromatic odd-symmetric static Gabor stimuli (0.6–2.2 cycles per degree; one spatial frequency per daily session) presented on a gray background (42 cd/m²) and were rendered online using custom-written software (https://github.com/MaunsellLab/Lablib-Public-05-July-2016.git). The Gabor stimuli were truncated at three SD from their center.

Spatial attention task

We trained monkeys to perform a visual detection task in which spatial attention was manipulated (Figure 1A). The trial started when the monkey fixated a small spot in a virtual 1.5° square fixation window in the center of the video display for 240–700 ms. Eye movements were tracked using an infrared eye-tracking camera (EyeLink 1000) sampling binocularly at 500 Hz. The duration of the fixation period was randomly drawn from a uniform distribution. Following fixation a sequence of stimuli was presented, in which each stimulus presentation lasted 200 ms and was separated from other stimuli by 200–1020 ms interstimulus intervals (Figure 1B). The durations of the interstimulus intervals were randomly drawn from an exponential distribution (τ = 200 ms). During the interstimulus interval only a gray screen with the fixation dot was shown. The stimulus presentations were short to prevent animals from adjusting their attention within a stimulus presentation in response to the number of stimuli presented (Lee and Maunsell, 2010; Ni et al., 2012; Lee, 2009; Williford and Maunsell, 2006).
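The interstimulus-interval sampling described above can be illustrated with a short Python sketch. The text gives the range (200–1020 ms) and the time constant (τ = 200 ms) but not how the exponential distribution was bounded; the sketch assumes a 200 ms offset plus an exponential draw, with draws above 1020 ms rejected and redrawn. The function name and the rejection scheme are assumptions for illustration, not the authors' implementation.

```python
import random

ISI_MIN_MS = 200.0   # shortest interstimulus interval reported in the text
ISI_MAX_MS = 1020.0  # longest interstimulus interval reported in the text
TAU_MS = 200.0       # time constant of the exponential distribution

def draw_isi_ms(rng):
    """Draw one interval: offset plus exponential draw, redrawn if too long (assumed scheme)."""
    while True:
        isi = ISI_MIN_MS + rng.expovariate(1.0 / TAU_MS)
        if isi <= ISI_MAX_MS:
            return isi

rng = random.Random(0)  # fixed seed so the example is reproducible
isis = [draw_isi_ms(rng) for _ in range(1000)]
```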

On each trial, stimuli appeared at two locations near the receptive fields of neurons, but the two locations differed between blocks of trials. One stimulus location (the middle location: location 1 in Figure 2A) never varied, but in different blocks of trials the second stimulus location was shifted either clockwise (location 2 in Figure 2A) or counterclockwise (location 3 in Figure 2A). For the example session in Figure 2A, the possible stimulus-location pairings were 1+2 and 1+3. All stimulus locations were equidistant from the fixation point, and stimulus locations 2 and 3 were equidistant from stimulus location 1. The two different pairs of stimulus locations assured that many neurons were tested in both receptive-field configurations (cRF-cRF and sRF-cRF).

On each stimulus presentation within a trial, we presented one, two, or no stimuli at the two stimulus locations near the neurons' receptive fields. The stimuli could be of one of two orthogonal orientations. Each session, the stimulus orientation was optimized for a randomly selected unit, so that different orientations were used across sessions. A representative set of nine possible stimulus combinations (for a particular orientation pair) is shown in Figure 2B. Using these different stimulus combinations we could measure stimulus selectivity, stimulus-induced suppression and attention modulation.

Each stimulus location near the neurons' receptive fields (stimulus location 1, 2, 3 in Figure 2A) had a corresponding and equally eccentric stimulus location on the opposite side of the fixation point (e.g. stimuli near Away in Figure 2C,D). As outlined below, we instructed monkeys to direct their attention to one stimulus location, either near or away from the receptive fields. This way we could measure not only how attention modulated neuronal responses when directed to different stimuli near the neurons' receptive fields, but also measure stimulus selectivity and stimulus-induced suppression with attention directed away from the neurons' receptive fields.

On each stimulus presentation (of multiple in a trial), each stimulus location was equally likely to contain one orientation, the other orientation or no stimulus. When Gabor pairs were presented near the neurons' receptive fields, their centers were separated by a median of 2.3° (range: 1.6–4.8°), and always separated by at least six Gabor standard deviations (mean Gabor σ: 0.45°; range: 0.17–0.5°). With such inter-stimulus distances, two stimuli can be presented within the receptive fields of V4 neurons.

Subjects were required to detect a faint white spot, labeled Target in Figure 1A,B. The target appeared at one of the four stimulus locations (two near the neurons' receptive field and two counterparts on the opposite side of the fixation point; see above) during one stimulus presentation within a trial. The target never appeared on the first stimulus presentation of a trial, but could occur with equal probability on any other stimulus presentation (range: 2–8). Two to five percent of the trials contained no target and the monkey was rewarded for maintaining fixation. Targets were presented in the center of Gabor stimuli to encourage the monkeys to confine their attention to a restricted part of visual space, near the cued stimulus location.

Task difficulty was manipulated by varying the target strength, defined as the opacity of the target (range of alpha-transparency values: 0.06–0.28). Each session we used six different target strengths (Figure 1B,C). The monkey was rewarded with a drop of juice for making a saccade to the target location within 350 ms of its appearance.

Attention was cued to one location in blocks of ~150 trials. Before the start of each block the monkey performed three to five instruction trials in which stimuli were presented at a single (cued) location. The instruction trials cued the monkey to attend to that location during subsequent trials in which stimuli could occur at all four locations.

Within a block of trials, the target appeared at the cued location in 91% of the trials (valid trials; position of the black circle in Figure 1A). In the remaining 9% of the trials (invalid trials) the target appeared at one of the three other (uncued) stimulus locations, with equal probability (position of the yellow and blue circles in Figure 1A). We used a single target strength for the invalid trials, as this allowed us to obtain reliable estimates of behavior at the unattended locations despite the small number of invalid trials (Figure 1B,C) (Cohen and Maunsell, 2009). Using invalid trials, we could compare performance between attended and unattended locations.

Recordings

We recorded neuronal activity using a 10 × 10 array of microelectrodes (Blackrock Microsystems; impedances: 0.3–1.2 MΩ at 1 kHz; 1 mm long electrodes; 0.4 mm between adjacent electrodes), chronically implanted into area V4 of the left cerebral hemisphere of each monkey. The data presented here are from 130 daily sessions of recording (Monkey M1: 52; Monkey M2: 78).

At the beginning of each recording session, we mapped the receptive fields and optimized stimulus parameters (position, orientation) for a randomly selected unit. We first measured the orientation-tuning curve of each neuron using a large Gabor that covered the lower right visual field. Orientation tuning was measured using Gabors of 8 different orientations spanning 180°.

We then mapped the spatial receptive field of each neuron using a Gabor with the preferred orientation of the selected unit, and a Gabor with an orientation orthogonal to that neuron's preferred orientation. Using two orthogonal orientations assured that most neurons were responsive to at least one stimulus. The same two orientations were subsequently used during the attention task. Receptive-field mapping was computer controlled and used a full-contrast static Gabor with the two orthogonal orientations (spatial frequency: 1.1 cycles per degree; Gabor sigma: 0.3°) on an 8 by 8 grid of positions (~azimuth range: −1 to 8°; ~elevation range: 2.5 to −8°). The center-of-mass of the receptive field (unfitted data) was defined as the receptive-field center. The receptive field plots in Figure 3 are based on the linearly-interpolated spike counts measured at each grid location. The spike counts were obtained within a 200 ms window starting 50 ms after stimulus onset.
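As a minimal sketch of the center-of-mass definition, the receptive-field center can be computed as the spike-count-weighted mean of the grid positions. Whether the counts were baseline-subtracted before this step is not stated, so the sketch uses raw counts; the function name and the example values are hypothetical.

```python
def center_of_mass(azimuths, elevations, counts):
    """Spike-count-weighted mean position over the mapped grid locations."""
    total = sum(counts)
    az = sum(a * c for a, c in zip(azimuths, counts)) / total
    el = sum(e * c for e, c in zip(elevations, counts)) / total
    return az, el

# Hypothetical 2 x 2 patch of grid positions (degrees) and spike counts.
azimuths = [2.0, 4.0, 2.0, 4.0]
elevations = [-3.0, -3.0, -5.0, -5.0]
counts = [1.0, 3.0, 1.0, 3.0]
rf_center = center_of_mass(azimuths, elevations, counts)
```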

Action potential waveforms were sorted offline using spike-sorting software (Offline Sorter, Plexon Inc). Waveforms for which the first two principal-component scores formed a well-defined cluster, separate from other waveforms, were classified as single units. The receptive fields of the units were located in the lower right quadrant at an average eccentricity of 3° for monkey M1 and 4° for monkey M2.

Analyses

We included only neuronal data from stimulus presentations from correct, validly-cued trials. We excluded incorrect trials, invalidly-cued trials, instruction trials, trials with no target, the first stimulus presentation of a trial (on which no target could occur), and stimulus presentations with a target. Neurons were included in the analyses if they responded significantly above baseline to any single Gabor presented at any stimulus location in the attend away condition (ANOVA; α=0.05). Responses in the attend away condition were obtained by averaging the firing rates from the conditions in which attention was directed to either of the two stimulus locations furthest away from the receptive-field center of the neuron (Away in Figure 2C,D). The small subset of neurons whose responses were significantly suppressed below baseline by all stimuli (N=13) was not further analyzed. Neuronal responses were computed based on the spikes in the interval from 50 ms to 300 ms after stimulus onset. Similar results were obtained using different intervals.

A stimulus location was considered within the classical receptive field (cRF) if the neuron responded significantly to any single stimulus (of either orientation) presented at that location, measured with attention away from the neuron's receptive field (attend away). The median distance from the receptive-field center of a stimulus inside the cRF was 1.7° (interquartile range 1.2° to 2.5° or 0.7–1.5σ, where σ is the Mahalanobis distance from those neurons whose receptive fields were well fitted with a bivariate Gaussian function: >80% explained variance, N=306 neurons). A stimulus location was considered to be within the surround of a neuron if the neuron did not increase its firing rate significantly to any single stimulus (of either orientation; N>36 trials per stimulus) presented at that location, measured with attention away from the neuron's receptive field (attend away). Note that neurons for which a surround location was measured did respond significantly to at least one of the stimuli when it was presented inside the cRF instead of the surround (Figure 2—figure supplement 1). The median distance of a surround location from the receptive-field center was 3.5° (interquartile range 2.7° to 4.3°, or 1.9σ to 3.1σ). We obtained similar results when we additionally required surround positions to lie at more than 2.5σ from the receptive field center. We also recorded from neurons with two stimuli inside their surround, i.e. sRF-sRF configuration. These data were not further analyzed due to a lack of responses.

The peristimulus time histograms (PSTHs) in Figure 3A–D were computed by counting the number of spikes in 1 ms bins and smoothing with a Gaussian filter of σ = 5 ms.
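The PSTH computation can be sketched as follows. The bin width (1 ms) and filter width (σ = 5 ms) come from the text; the truncation of the kernel at ±4σ, the zero-padding at the edges, and the conversion to spikes per second per trial are assumptions made for this illustration.

```python
import math

def psth(spike_times_ms, n_trials, duration_ms, sigma_ms=5.0):
    """Return a Gaussian-smoothed firing rate (spikes/s) in 1 ms bins."""
    # Count spikes pooled across trials in 1 ms bins.
    counts = [0.0] * duration_ms
    for t in spike_times_ms:
        b = math.floor(t)
        if 0 <= b < duration_ms:
            counts[b] += 1.0
    # Convert counts to spikes per second per trial (1 ms bin -> factor 1000).
    rate = [c * 1000.0 / n_trials for c in counts]
    # Normalized Gaussian kernel, truncated at +/- 4 sigma (assumed).
    half = int(math.ceil(4 * sigma_ms))
    kernel = [math.exp(-0.5 * (k / sigma_ms) ** 2) for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    # Convolve, treating bins beyond the edges as zero (assumed).
    smoothed = []
    for i in range(duration_ms):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < duration_ms:
                acc += w * rate[idx]
        smoothed.append(acc)
    return smoothed

# Three hypothetical spike times (ms) from a single trial.
example = psth([100.2, 101.7, 150.0], n_trials=1, duration_ms=300)
```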

A selectivity index was computed based on the responses to the component Gabors of each Gabor pair (four pairs in Figure 2B). Selectivity indices were computed for each Gabor pair presented at each pair of stimulus locations (stimulus locations 1+2 or 1+3 in Figure 2A). We thus obtained eight selectivity indices per neuron. The selectivity index is defined as (P - N)/(P + N). Here, P (preferred) and N (non-preferred) are the neuronal responses to the strongest and weakest component Gabor of a Gabor pair when presented alone. P and N were measured with attention away from the neurons' receptive fields (attend away). The upper pictogram in Figure 2E illustrates the computation of the selectivity index for one Gabor pair. By definition the neuron does not respond to the stimulus when it appears alone inside the surround. It follows that for the sRF-cRF receptive-field configuration, the surround Gabor is always assigned as non-preferred (N) and the cRF Gabor as preferred (P).

Stimulus-suppression indices were similarly computed for each of eight possible Gabor pairs as (P - PN) / (P + PN), where P is the response to the preferred Gabor of a Gabor pair as described before, and PN is the response to the Gabor pair. Both P and PN were measured with attention away from the neurons' receptive fields (attend away). The middle pictogram in Figure 2E illustrates the computation of the stimulus-induced suppression index for one Gabor pair. We obtained similar results when defining a suppression index as (P+N-PN) / (P+N+PN). Note that the stimulus-induced suppression index is distinct from the α terms in the model. This is because the stimulus-induced suppression index is based on the observed neuronal responses, which comprise both an α and L term (i.e. response = L/(α+σ)). In terms of the model parameters, the stimulus-induced suppression index is given by:

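$$\text{Stimulus-induced suppression index} = \frac{\text{response}_{P} - \text{response}_{P+N}}{\text{response}_{P} + \text{response}_{P+N}} = \frac{\dfrac{L_P}{\alpha_P + \sigma} - \dfrac{L_P + L_N}{\alpha_P + \alpha_N + \sigma}}{\dfrac{L_P}{\alpha_P + \sigma} + \dfrac{L_P + L_N}{\alpha_P + \alpha_N + \sigma}},$$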

where LP is the excitatory drive from the preferred component Gabor of a Gabor pair, LN is the excitatory drive from the non-preferred component Gabor, αP is the suppressive drive from the preferred component Gabor, and αN is the suppressive drive from the non-preferred component Gabor. So the stimulus-induced suppression index depends on both the excitatory and suppressive drive from the stimulus.

Attention-modulation indices were computed for each of eight possible Gabor pairs as (PAttN - PNAtt) / (PAttN + PNAtt), where PAttN is the neuronal response to the Gabor pair with attention directed to P, PNAtt is the neuronal response to the Gabor pair with attention directed to N. The lower pictogram in Figure 2E illustrates the computation of the attention-modulation index for one Gabor pair.
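The three indices share the same contrast form, (a - b) / (a + b), which the following Python sketch makes explicit. The firing rates are hypothetical values chosen for illustration, not data from the study.

```python
def contrast_index(a, b):
    """Return (a - b) / (a + b), the form shared by all three indices."""
    return (a - b) / (a + b)

# Hypothetical mean firing rates (spikes/s), for illustration only.
P = 30.0        # preferred Gabor alone, attend away
N = 10.0        # non-preferred Gabor alone, attend away
PN = 24.0       # Gabor pair, attend away
P_att_N = 28.0  # Gabor pair, attention directed to the preferred Gabor
P_N_att = 20.0  # Gabor pair, attention directed to the non-preferred Gabor

selectivity = contrast_index(P, N)                # (P - N) / (P + N)
suppression = contrast_index(P, PN)               # (P - PN) / (P + PN)
attention_mod = contrast_index(P_att_N, P_N_att)  # (PAttN - PNAtt) / (PAttN + PNAtt)
```

All three indices are bounded between -1 and 1 for non-negative responses, which makes them comparable across neurons with different overall firing rates.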

All Gabor pairs for which a neuron responded on average with at least 1 spike (in the 250 ms window) in the attend away condition were further analyzed, but similar results were obtained using other criteria. This way neuronal data from 728 neurons were analyzed (monkey M1: 264; M2: 464). In Figures 4, 5 selectivity and stimulus-induced suppression indices are computed for each neuron and all Gabor pairs, so neurons contribute more than one index. Due to the chronic nature of our recordings, it is likely that some neurons were resampled across days. Because we adjusted the values of the stimulus orientations and locations each day for a randomly selected unit, resampling rarely involved identical stimulus configurations. Similar results were obtained for both monkeys (see Results).

We used multiple linear regression to examine if attention-related modulation depends on the interaction between stimulus selectivity and stimulus-induced suppression. For both RF configurations, the model included a main effect of selectivity, supplemented with a main effect of stimulus-induced suppression. The model also included an interactive product term, which measured the dependency of attention-related modulation on both selectivity and stimulus-induced suppression. The regression model is given by:

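$$\text{attention modulation} = \text{selectivity} \times \beta_1 + \text{suppression} \times \beta_2 + \text{selectivity} \times \text{suppression} \times \beta_3 + \text{error}$$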

In this model, the main effect of selectivity measures the contribution of selectivity to attention modulation given that suppression is zero. Similarly, the main effect of suppression measures the contribution of suppression to attention modulation given that selectivity is zero. For example, if suppression is zero, the suppression and the interaction term (selectivity × suppression × β3) both go to zero, leaving only the selectivity term β1, which specifies the contribution of selectivity to attention modulation. Conversely, if selectivity is zero, the only non-zero term is the β2 term, which specifies the contribution of suppression to attention modulation. Thus the main effects are not estimated from a particular selection or a subset of the dataset, but follow mathematically from the linear regression model with interaction term. For the Bayesian regression analysis, we compared the marginal likelihood of the data given a regression model that does not include receptive field configuration as a factor to the marginal likelihood of the data given a regression model that does include receptive field configuration as a factor (i.e. the Bayes factor, using the lmBF function from the BayesFactor package in R [Rouder et al., 2012]).
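A minimal sketch of a regression with an interaction term is given below, using ordinary least squares solved through the normal equations. It assumes no intercept, following the formula as written; the function names are hypothetical and the data are synthetic and noise-free, so the fit simply recovers the coefficients used to generate them.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_interaction_model(selectivity, suppression, attention_mod):
    """Least-squares estimates of beta1, beta2, beta3 (no intercept, as in the formula)."""
    X = [[s, u, s * u] for s, u in zip(selectivity, suppression)]
    # Normal equations: (X^T X) beta = X^T y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * y for row, y in zip(X, attention_mod)) for i in range(3)]
    return solve_linear(XtX, Xty)

# Synthetic, noise-free data generated from known coefficients (illustration only).
sel = [0.1, 0.4, 0.7, 0.2, 0.9, 0.5]
sup = [0.3, 0.1, 0.6, 0.8, 0.2, 0.5]
true_betas = (0.2, 0.1, 0.5)
att = [true_betas[0] * s + true_betas[1] * u + true_betas[2] * s * u
       for s, u in zip(sel, sup)]
betas = fit_interaction_model(sel, sup, att)
```

Setting suppression to zero in the fitted model leaves only the β1 × selectivity term, which is the sense in which β1 is the main effect of selectivity.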

The plots in Figure 5A,B,D,E were obtained using regularized bilinear interpolation on the observed or modeled attention-modulation indices from all Gabor pairs and all neurons (Surface Fitting using gridfit (http://www.mathworks.com/matlabcentral/fileexchange/8998), MATLAB Central File Exchange).

Model

Tuned normalization has been applied before (Carandini et al., 1997; Schwartz and Simoncelli, 2001; Lee et al., 1999; Rust et al., 2006) and has been used to explain neuronal-response modulation when attention is shifted between two stimuli with different motion directions inside the cRF of MT neurons (Ni et al., 2012). The spatially-tuned normalization model is described by Equation (1). This spatially-tuned normalization model was fitted to the neuronal responses of all 728 neurons used in the analyses. The model parameters are: L11, L12, L21, L22, L31, L32, α2, α3, σ, β. Specifically, Lij (adopted from linear response) is the excitatory drive from a stimulus of orientation j (j = 1, 2) at receptive-field location i (i = 1, 2, 3). αp is the suppression parameter associated with stimulated receptive-field location p (p = 1, 2, 3). β adds attention to the model and is multiplied with the parameters associated with the attended location (L and α; see Results and Figure 5C). In the conditions with attention directed away from the receptive fields (attend away) β = 1 (see Equation (1)). σ is the semi-saturation constant that is fixed across conditions and serves as a baseline suppression parameter. The σ parameter was introduced in Heeger's original divisive normalization model (Heeger, 1992) to model the shape of contrast-response functions of neurons in primary visual cortex. It also stabilizes the response when stimuli of low (or zero) contrast are presented by preventing division by zero. For our data, which involve only high-contrast stimuli, it represents baseline suppression, which may arise from spontaneously active inhibitory neurons or suppression caused by constant stimuli (e.g. the edge of the stimulus display) visible to the monkey. It is an important parameter to accommodate the neuronal effects of attention to single stimuli inside the receptive field (see Figure 5—figure supplement 1A vs. B: P vs. Patt).
The median value of the sigma parameter was 0.06 (median absolute deviation (MAD): 0.26). A model with no sigma term performs significantly worse at explaining responses to isolated attended stimuli (median two-fold cross-validated percentage explained variance 84%, compared to 87% for the model with sigma term; p=0.006; sequential F-test). We also fit for each neuron a model with only one free excitatory (L) term capturing excitation across all stimulus conditions. This model with one L term performed significantly worse at explaining neuronal responses than the full model with all L-terms (median two-fold cross-validated percentage explained variance 48%, compared to 87% for the model with all L terms; p<0.0001; sequential F-test).
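The model response can be sketched in Python as follows, assuming the form implied by the single-stimulus expression response = L/(α+σ) and the stimulus-pair expressions used in the Appendix, with β multiplying both L and α at the attended location. The function name, argument layout and parameter values are hypothetical and chosen only for illustration.

```python
def model_response(L, alpha, sigma, beta=1.0, attended=None):
    """Spatially-tuned normalization response for the stimuli currently shown.

    L and alpha map each stimulated location to its excitatory drive and
    suppression parameter. `attended` names the attended location, or is
    None for the attend-away condition (equivalent to beta = 1).
    """
    numerator = 0.0
    denominator = sigma
    for loc in L:
        gain = beta if loc == attended else 1.0
        numerator += gain * L[loc]
        denominator += gain * alpha[loc]
    return numerator / denominator

# Hypothetical parameters for two stimulated receptive-field locations.
L = {1: 10.0, 2: 4.0}
alpha = {1: 1.0, 2: 0.5}
sigma = 0.1

r_away = model_response(L, alpha, sigma)                        # attend away
r_att1 = model_response(L, alpha, sigma, beta=2.0, attended=1)  # attend location 1
r_att2 = model_response(L, alpha, sigma, beta=2.0, attended=2)  # attend location 2
```

With these values, attending to the location with the stronger excitatory drive raises the pair response and attending to the other location lowers it, relative to attend away.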

We tested two pairs of receptive-field locations (see Spatial attention task and Figure 2A). α1 is the suppression parameter related to the receptive-field location common to both of the receptive-field location pairs (stimulus location 1 for the example session in Figure 2A), and is set to one to constrain the model. All parameters were constrained to be nonnegative. For each neuron, the model was fitted to the neuronal responses of 36 distinct attention and stimulus combinations by minimizing the sum of squared error using a simplex optimization algorithm (MATLAB fminsearch.m; MathWorks). The goodness-of-fit of the model was calculated for each neuron as the percentage explained variance, which was determined by taking the square of the correlation coefficient between the model-predicted responses and the observed neuronal responses across all stimulus conditions. The explained variance was calculated using the average neuronal responses from trials not used to fit the model. For this purpose, we employed two-fold cross-validation, fitting the model based on a randomly chosen half of the trials, and using the remaining data to measure the goodness of fit of the model. This procedure was repeated five times, each time using a different random draw, and subsequently averaged across all cross-validations to produce the reported goodness-of-fit.
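The goodness-of-fit procedure can be sketched as below. The explained variance is the squared Pearson correlation, as stated; the sketch assumes that each random split is scored in both directions (fit on one half, test on the other, then swapped), which the text does not specify. `fit_and_predict` is a hypothetical callable standing in for the model fit: it takes the per-condition mean responses of the fitting half and returns one predicted response per condition.

```python
import random

def explained_variance(predicted, observed):
    """Squared Pearson correlation between predictions and observations, in percent."""
    n = len(observed)
    mp = sum(predicted) / n
    mo = sum(observed) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    vp = sum((p - mp) ** 2 for p in predicted)
    vo = sum((o - mo) ** 2 for o in observed)
    return 100.0 * cov * cov / (vp * vo)

def split_half(trials, rng):
    """Randomly split one condition's trials into two halves."""
    shuffled = trials[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def cross_validated_fit(trials_by_condition, fit_and_predict, n_repeats=5, seed=0):
    """Average held-out explained variance over repeated random half splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        half_a, half_b = [], []
        for trials in trials_by_condition:
            a, b = split_half(trials, rng)
            half_a.append(sum(a) / len(a))  # mean response per condition
            half_b.append(sum(b) / len(b))
        # Score both directions of the split (assumed reading of "two-fold").
        for fit_half, test_half in ((half_a, half_b), (half_b, half_a)):
            predicted = fit_and_predict(fit_half)
            scores.append(explained_variance(predicted, test_half))
    return sum(scores) / len(scores)
```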

In Figure 8, each image pixel contained one model neuron with its receptive field centered on that pixel. The neurons' receptive fields consisted of an excitatory and suppressive receptive field. These receptive fields were modeled as a circular two-dimensional Gaussian with a standard deviation of eight pixels for the excitatory field and 40 pixels for the suppressive field. The model neurons computed local contrast within their excitatory and suppressive receptive field. The excitatory input was divisively normalized by the suppressive input to generate the model neuron's response. Model neurons without a surround experienced no suppression from stimuli positioned outside their classical receptive field. The classical receptive field was defined as all pixels within 16 pixels, i.e. two standard deviations from the excitatory receptive field, from the receptive-field center. Attention was modeled as a circular two-dimensional Gaussian kernel with a standard deviation of five pixels and amplitude equal to six. Attention weighed the excitatory and suppressive inputs with its kernel, resulting in stronger responses of the model neurons near the locus of attention relative to model neurons outside the locus of attention.

Acknowledgements

We thank Thomas Zhihao Luo for good discussions and comments on the manuscript. We thank Jackson J Cone, Till S Hartmann, Mark H Histed, J Patrick Mayo and Amy M Ni for comments on an earlier version of the manuscript, and Steven J Sleboda for technical assistance. Bram-Ernst Verhoef is a postdoctoral research fellow of the Flemish Fund for Scientific Research (FWO).

Appendix

Selectivity and suppression are correlated when only one of two stimuli is varied

Equation 1 from the spatially-tuned normalization model dictates that suppression and selectivity will be correlated when only one of two stimuli is varied. In the spatially-tuned normalization model, the L and α terms are independent of each other. However, the α terms are fixed at each RF location. Consequently, varying only the stimulus at one RF location, while keeping the other stimulus fixed, corresponds to keeping L1, α1 and α2 fixed. Here, L1 is the excitatory drive from the fixed stimulus, α1 is the suppressive drive from the fixed stimulus, and α2 is the suppressive drive from stimulating the RF location of the variable stimulus. Thus L2, the excitatory drive from the variable stimulus, is the only term that varies when different stimuli are presented at that RF location. The variable L2 term changes both the selectivity and suppression indices in similar ways. In model terms, this can be seen as:

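$$\text{selectivity} = \frac{\text{response}_{1} - \text{response}_{2}}{\text{response}_{1} + \text{response}_{2}} = \frac{\dfrac{L_1}{\alpha_1 + \sigma} - \dfrac{L_2}{\alpha_2 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma} + \dfrac{L_2}{\alpha_2 + \sigma}},$$

where we assume (without loss of generality) that the fixed stimulus at location 1 is the preferred stimulus, i.e. L1/(α1+σ) > L2/(α2+σ). Also:

$$\text{Stimulus-induced suppression} = \frac{\text{response}_{1} - \text{response}_{1+2}}{\text{response}_{1} + \text{response}_{1+2}} = \frac{\dfrac{L_1}{\alpha_1 + \sigma} - \dfrac{L_1 + L_2}{\alpha_1 + \alpha_2 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma} + \dfrac{L_1 + L_2}{\alpha_1 + \alpha_2 + \sigma}},$$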

where the fixed stimulus at location 1 is again the preferred stimulus.

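If we present a weak stimulus at the variable location 2, i.e. L2 ≈ 0, the selectivity and stimulus-induced suppression index both become high:

$$\text{selectivity} = \frac{\text{response}_{1} - 0}{\text{response}_{1} + 0} = \frac{\dfrac{L_1}{\alpha_1 + \sigma} - \dfrac{0}{\alpha_2 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma} + \dfrac{0}{\alpha_2 + \sigma}} = \frac{\dfrac{L_1}{\alpha_1 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma}} = 1,$$

and

$$\text{Stimulus-induced suppression} = \frac{\text{response}_{1} - \text{response}_{1+2}}{\text{response}_{1} + \text{response}_{1+2}} = \frac{\dfrac{L_1}{\alpha_1 + \sigma} - \dfrac{L_1 + 0}{\alpha_1 + \alpha_2 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma} + \dfrac{L_1 + 0}{\alpha_1 + \alpha_2 + \sigma}} = \frac{\dfrac{L_1}{\alpha_1 + \sigma} - \dfrac{L_1}{\alpha_1 + \alpha_2 + \sigma}}{\dfrac{L_1}{\alpha_1 + \sigma} + \dfrac{L_1}{\alpha_1 + \alpha_2 + \sigma}}$$

The stimulus-induced suppression index is also high because response1+2 = (L1+L2)/(α1+α2+σ) is at its lowest (less subtraction in the numerator) when stimulus 2 adds no excitatory drive to response1+2.

Conversely, adding a more potent stimulus at variable location 2, i.e. increasing L2, decreases the selectivity, because response2 is then no longer zero and thus decreases the numerator. Similarly, the stimulus-induced suppression index will also be smaller, because response1+2 = (L1+L2)/(α1+α2+σ) increases with increasing L2, which in turn decreases the numerator, L1/(α1+σ) − (L1+L2)/(α1+α2+σ), of the stimulus-induced suppression index.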

Thus, both selectivity and stimulus-induced suppression increase with a weaker stimulus at variable position 2, and both selectivity and stimulus-induced suppression decrease with a more potent stimulus at variable position 2. This shows that, within cells, selectivity and stimulus-induced suppression will be correlated when only one of two stimuli is varied. This agrees with the single neuron examples in Reynolds et al.
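The argument above can be checked numerically with a short Python sketch that holds L1, α1, α2 and σ fixed and varies only L2. The parameter values are hypothetical and were chosen so that the stimulus at location 1 remains the preferred stimulus throughout.

```python
def index(a, b):
    """Contrast index (a - b) / (a + b)."""
    return (a - b) / (a + b)

# Hypothetical fixed parameters; only L2 is varied below.
L1, alpha1, alpha2, sigma = 10.0, 1.0, 1.0, 0.1
response1 = L1 / (alpha1 + sigma)

rows = []
for L2 in (0.0, 2.0, 4.0, 6.0):
    response2 = L2 / (alpha2 + sigma)                   # variable stimulus alone
    response12 = (L1 + L2) / (alpha1 + alpha2 + sigma)  # both stimuli together
    rows.append((L2, index(response1, response2), index(response1, response12)))

for L2, selectivity, suppression in rows:
    print(f"L2={L2:.0f}  selectivity={selectivity:.3f}  suppression={suppression:.3f}")
```

Both indices fall as L2 grows, so within this single model neuron they are positively related.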

Reynolds et al. repeated these measurements for several neurons and included all data points from each neuron in the population scatter plots. Note that Reynolds et al. always positioned their stimuli at approximately equally-responsive RF locations. The RFs of V4 neurons are approximately Gaussian, which means that equally-responsive RF positions are expected to lie at approximately equal (Mahalanobis) distances from the RF center. Following Figure 7, these equally-responsive RF positions, at equal distances from the RF center, are expected to have similar alphas. Because Reynolds et al. only sampled RF locations with similar alphas, similar positive relationships between selectivity and suppression would have been observed for each neuron. Given that all neurons had a similar positive relationship between selectivity and suppression, the population data will also display a similar positive relationship.

Our experimental design differed substantially from that of Reynolds et al., because when one stimulus was fixed there were few variants of the other stimulus. For example, when presenting a Gabor with one orientation at RF location 1, our design introduced only two different stimuli at RF location 2: a Gabor with the same orientation or a Gabor with an orthogonal orientation as that at location 1. In contrast, Reynolds et al. used 16 different stimulus conditions.

We compared the two stimulus conditions in which the stimulus at one location was kept constant while the stimulus at the other location varied between two orthogonal orientations. Across neurons and stimuli, we found that the average stimulus-induced suppression index of the stimulus condition with greater selectivity was significantly greater than that of the stimulus condition with lower selectivity (stimulus-induced suppression index of 0.07 vs. 0.05, p = 3 × 10⁻¹³, paired permutation t-test). Thus, despite the very limited number of stimulus conditions in our design, we could reproduce the correlation between suppression and selectivity reported by Reynolds et al.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • Fonds Wetenschappelijk Onderzoek to Bram-Ernst Verhoef.

  • National Institutes of Health R01EY005911 to John HR Maunsell.

  • National Institutes of Health R01EY021550 to John HR Maunsell.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

B-EV, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article.

JHRM, Conception and design, Analysis and interpretation of data, Drafting or revising the article.

Ethics

Animal experimentation: This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All procedures were approved by the Institutional Animal Care and Use Committee of Harvard Medical School (Boston, MA; protocol #04214).

References

  1. Angelucci A, Shushruth S. In: The New Visual Neurosciences. Werner J. S, Chalupa L. M, editors. The MIT Press; 2014. pp. 425–444.
  2. Anton-Erxleben K, Carrasco M. Attentional enhancement of spatial resolution: linking behavioural and neurophysiological evidence. Nature Reviews Neuroscience. 2013;14:188–200. doi: 10.1038/nrn3443.
  3. Boynton GM. A framework for describing the effects of attention on visual responses. Vision Research. 2009;49:1129–1143. doi: 10.1016/j.visres.2008.11.001.
  4. Carandini M, Heeger DJ, Movshon JA. Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience. 1997;17:8621–8644. doi: 10.1523/JNEUROSCI.17-21-08621.1997.
  5. Carrasco M. Visual attention: the past 25 years. Vision Research. 2011;51:1484–1525. doi: 10.1016/j.visres.2011.04.012.
  6. Cavanaugh JR, Bair W, Movshon JA. Selectivity and spatial distribution of signals from the receptive field surround in macaque V1 neurons. Journal of Neurophysiology. 2002a;88:2547–2556. doi: 10.1152/jn.00693.2001.
  7. Cavanaugh JR, Bair W, Movshon JA. Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology. 2002b;88:2530–2546. doi: 10.1152/jn.00692.2001.
  8. Chelazzi L, Duncan J, Miller EK, Desimone R. Responses of neurons in inferior temporal cortex during memory-guided visual search. Journal of Neurophysiology. 1998;80:2918–2940. doi: 10.1152/jn.1998.80.6.2918.
  9. Chun MM, Golomb JD, Turk-Browne NB. A taxonomy of external and internal attention. Annual Review of Psychology. 2011;62:73–101. doi: 10.1146/annurev.psych.093008.100427.
  10. Coen-Cagli R, Kohn A, Schwartz O. Flexible gating of contextual influences in natural vision. Nature Neuroscience. 2015;18:1648–1655. doi: 10.1038/nn.4128.
  11. Cohen MR, Maunsell JH. Attention improves performance primarily by reducing interneuronal correlations. Nature Neuroscience. 2009;12:1594–1600. doi: 10.1038/nn.2439.
  12. DeAngelis GC, Freeman RD, Ohzawa I. Length and width tuning of neurons in the cat's primary visual cortex. Journal of Neurophysiology. 1994;71:347–374. doi: 10.1152/jn.1994.71.1.347.
  13. Desimone R, Schein SJ. Visual properties of neurons in area V4 of the macaque: sensitivity to stimulus form. Journal of Neurophysiology. 1987;57:835–868. doi: 10.1152/jn.1987.57.3.835.
  14. Ghose GM, Maunsell JH. Spatial summation can explain the attentional modulation of neuronal responses to multiple stimuli in area V4. Journal of Neuroscience. 2008;28:5115–5126. doi: 10.1523/JNEUROSCI.0138-08.2008.
  15. Ghose GM. Attentional modulation of visual responses by flexible input gain. Journal of Neurophysiology. 2009;101:2089–2106. doi: 10.1152/jn.90654.2008.
  16. Haider B, Krause MR, Duque A, Yu Y, Touryan J, Mazer JA, McCormick DA. Synaptic and network mechanisms of sparse and reliable visual cortical activity during nonclassical receptive field stimulation. Neuron. 2010;65:107–121. doi: 10.1016/j.neuron.2009.12.005.
  17. Heeger DJ. Normalization of cell responses in cat striate cortex. Visual Neuroscience. 1992;9:181–197. doi: 10.1017/S0952523800009640.
  18. Kastner S, Ungerleider LG. Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience. 2000;23:315–341. doi: 10.1146/annurev.neuro.23.1.315.
  19. Lee DK, Itti L, Koch C, Braun J. Attention activates winner-take-all competition among visual filters. Nature Neuroscience. 1999;2:375–381. doi: 10.1038/7286.
  20. Lee J, Maunsell JH. A normalization model of attentional modulation of single unit responses. PLoS One. 2009;4:e4651. doi: 10.1371/journal.pone.0004651. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Lee J, Maunsell JH. Attentional modulation of MT neurons with single or multiple stimuli in their receptive fields. Journal of Neuroscience. 2010;30:3058–3066. doi: 10.1523/JNEUROSCI.3766-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Luck SJ, Chelazzi L, Hillyard SA, Desimone R. Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. Journal of Neurophysiology. 1997;77:24–42. doi: 10.1152/jn.1997.77.1.24. [DOI] [PubMed] [Google Scholar]
  23. Martínez-Trujillo J, Treue S. Attentional modulation strength in cortical area MT depends on stimulus contrast. Neuron. 2002;35:365–370. doi: 10.1016/S0896-6273(02)00778-X. [DOI] [PubMed] [Google Scholar]
  24. Moran J, Desimone R. Selective attention gates visual processing in the extrastriate cortex. Science. 1985;229:782–784. doi: 10.1126/science.4023713. [DOI] [PubMed] [Google Scholar]
  25. Motter BC. Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology. 1993;70:909–919. doi: 10.1152/jn.1993.70.3.909. [DOI] [PubMed] [Google Scholar]
  26. Ni AM, Ray S, Maunsell JH. Tuned normalization explains the size of attention modulations. Neuron. 2012;73:803–813. doi: 10.1016/j.neuron.2012.01.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Ozeki H, Finn IM, Schaffer ES, Miller KD, Ferster D. Inhibitory stabilization of the cortical network underlies visual surround suppression. Neuron. 2009;62:578–592. doi: 10.1016/j.neuron.2009.03.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Posner MI. Orienting of attention. Quarterly Journal of Experimental Psychology. 1980;32:3–25. doi: 10.1080/00335558008248231. [DOI] [PubMed] [Google Scholar]
  29. Recanzone GH, Wurtz RH. Effects of attention on MT and MST neuronal activity during pursuit initiation. Journal of Neurophysiology. 2000;83:777–790. doi: 10.1152/jn.2000.83.2.777. [DOI] [PubMed] [Google Scholar]
  30. Reynolds JH, Chelazzi L, Desimone R. Competitive mechanisms subserve attention in macaque areas V2 and V4. Journal of Neuroscience. 1999;19:1736–1753. doi: 10.1523/JNEUROSCI.19-05-01736.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Reynolds JH, Heeger DJ. The normalization model of attention. Neuron. 2009;61:168–185. doi: 10.1016/j.neuron.2009.01.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Roelfsema PR, Lamme VA, Spekreijse H. Object-based attention in the primary visual cortex of the macaque monkey. Nature. 1998;395:376–381. doi: 10.1038/26475. [DOI] [PubMed] [Google Scholar]
  33. Rouder JN, Morey RD, Speckman PL, Province JM. Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology. 2012;56:356–374. doi: 10.1016/j.jmp.2012.08.001. [DOI] [Google Scholar]
  34. Rust NC, Mante V, Simoncelli EP, Movshon JA. How MT cells analyze the motion of visual patterns. Nature Neuroscience. 2006;9:1421–1431. doi: 10.1038/nn1786. [DOI] [PubMed] [Google Scholar]
  35. Sachdev RN, Krause MR, Mazer JA. Surround suppression and sparse coding in visual and barrel cortices. Frontiers in Neural Circuits. 2012;6:1–14. doi: 10.3389/fncir.2012.00043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Sanayei M, Herrero JL, Distler C, Thiele A. Attention and normalization circuits in macaque V1. European Journal of Neuroscience. 2015;41:949–964. doi: 10.1111/ejn.12857. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Sceniak MP, Ringach DL, Hawken MJ, Shapley R. Contrast's effect on spatial summation by macaque V1 neurons. Nature Neuroscience. 1999;2:733–739. doi: 10.1038/11197. [DOI] [PubMed] [Google Scholar]
  38. Schein SJ, Desimone R. Spectral properties of V4 neurons in the macaque. Journal of Neuroscience. 1990;10:3369–3389. doi: 10.1523/JNEUROSCI.10-10-03369.1990. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Schwartz O, Simoncelli EP. Natural signal statistics and sensory gain control. Nature Neuroscience. 2001;4:819–825. doi: 10.1038/90526. [DOI] [PubMed] [Google Scholar]
  40. Sundberg KA, Mitchell JF, Reynolds JH. Spatial attention modulates center-surround interactions in macaque visual area v4. Neuron. 2009;61:952–963. doi: 10.1016/j.neuron.2009.02.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Treue S, Maunsell JH. Attentional modulation of visual motion processing in cortical areas MT and MST. Nature. 1996;382:539–541. doi: 10.1038/382539a0. [DOI] [PubMed] [Google Scholar]
  42. Vinje WE, Gallant JL. Sparse coding and decorrelation in primary visual cortex during natural vision. Science. 2000;287:1273–1276. doi: 10.1126/science.287.5456.1273. [DOI] [PubMed] [Google Scholar]
  43. Williford T, Maunsell JH. Effects of spatial attention on contrast response functions in macaque area V4. Journal of Neurophysiology. 2006;96:40–54. doi: 10.1152/jn.01207.2005. [DOI] [PubMed] [Google Scholar]
  44. Xiao J, Niu YQ, Wiesner S, Huang X. Normalization of neuronal responses in cortical area MT across signal strengths and motion directions. Journal of Neurophysiology. 2014;112:1291–1306. doi: 10.1152/jn.00700.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Zénon A, Krauzlis RJ. Attention deficits without cortical neuronal deficits. Nature. 2012;489:434–437. doi: 10.1038/nature11497. [DOI] [PMC free article] [PubMed] [Google Scholar]
eLife. 2016 Aug 22;5:e17256. doi: 10.7554/eLife.17256.014

Decision letter

Editor: Doris Y Tsao1

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

[Editors’ note: a previous version of this study was rejected after peer review, but the authors submitted for reconsideration. The first decision letter after peer review is shown below.]

Thank you for submitting your work entitled "Attention operates uniformly throughout the Classical Receptive Field and the Surround" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and David Van Essen as the Senior Editor. Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife in its present form due to significant concerns about the validity of the central claims, as detailed in the reviews. However, all three reviewers believe that the work has high potential to make a significant contribution; we therefore invite you to submit a new version of the manuscript if you are able to address the central comments of the reviewers.

Reviewer #1:

Summary:

I found this study to be very elegant in its linking of several complex phenomena (selectivity, suppression, and attentional modulation) with a single, simple model. The paper is also beautifully written. Verhoef and Maunsell performed array recordings in macaque to study the effects of attention on stimuli either inside the classical receptive field of a neuron or in the surround. They show that attention acts uniformly on the response of neurons by amplifying the effect of the attended stimulus, i.e. increasing the response for the preferred stimulus and exacerbating the suppression for a non-preferred stimulus, regardless of whether it is in the classical RF or the surround, and regardless of whether the preferred stimulus is preferred due to its location in the RF or its orientation. Importantly, they found that selectivity and suppression strength are not correlated, and an interaction between selectivity and suppression strength is needed to predict the attention modulation strength, which could be accounted for by fitting a model where attention has a multiplicative effect on both the excitatory input and the normalization strength of the attended stimulus. This showed that the effect of attention is the same for the classical receptive field as for the surround.

Major issues:

The authors claim that the attention effect solely depends on the differential response of the component stimuli, regardless of whether the difference arises from spatial selectivity or feature selectivity. They show the former, i.e. that their findings of interaction between selectivity and suppression etc. can be reproduced by looking only at conditions where the same stimulus was presented, so that only spatial selectivity can evoke the observed attention effects. But can you show from your data that feature selectivity alone can reproduce your results (e.g. look at conditions where a neuron has low spatial selectivity, i.e. the neuron's response is the same if you present the same stimulus at one of two locations; or by pairwise comparisons of different stimuli at the same locations)?

Reviewer #2:

Verhoef and Maunsell recorded the activity of V4 neurons in alert and behaving monkeys performing a dot-detection task requiring focused spatial attention. The authors were interested in comparing attentional modulation under conditions where two Gabor stimuli were placed within the classical receptive field versus conditions where one Gabor was placed within the classical receptive field and the second Gabor was placed in the adjacent region of the extraclassical surround. They observed a variety of attentional modulations in both conditions and they found that attentional modulation correlates with stimulus selectivity and also with stimulus-induced suppression in both conditions. Importantly, the relationships between attentional modulation and stimulus selectivity and induced suppression were complicated as stimulus selectivity and induced suppression were not themselves correlated among the recorded neuronal population. Instead, attentional modulation was greatest when selectivity was high and stimulus-induced suppression was also high. The authors then proposed a normalization model that nicely accounts for these relationships. Two aspects of this study were particularly novel and interesting.

First, in describing how variability in attentional modulation of neuronal activity relates to neuronal stimulus selectivity and stimulus-induced suppression, the authors add to general knowledge about attentional modulation of neuronal activity. The use of the normalization model to explain these relationships is elegant.

Second and related, the authors examine the relative contributions of stimulus selectivity and stimulus-induced suppression toward overall attentional modulation, which is also supported by the normalization model.

While these findings are important and advance the field of attention, there are some concerns about the context in which these results are presented and also some technical concerns that slightly dampen overall enthusiasm for the study.

One concern involves placing the results of this study in the context of prior, quite similar work. In the Introduction, the authors state that "the way that attention acts in the cRF versus the surround has not been compared directly" and they cite the work of Sundberg et al. (2009) and Sanayei et al. (2015), both of which examined attentional modulation of responses to visual stimuli within the classical receptive field and the extraclassical surround. In the Sundberg et al. study, the stimuli were similar and similarly placed in the receptive field relative to the current study. The authors' statement therefore seems inaccurate. But additionally, the authors never directly compare their findings with those of Sundberg et al. and Sanayei et al. If the authors are going to make the case that their work overturns or expands upon that of prior groups, then this needs to be explicitly described in the Introduction and/or Discussion.

A second concern involves the amount of suppression observed in the two Gabor stimulus configurations. In the Abstract and Introduction the authors relate their work to the rich literature on relative contributions of classical and extraclassical regions of the receptive field. The data in Figure 4 indicate that there is actually very little surround suppression observed in the sRF-cRF condition. The authors are aware of this and use careful wording (stimulus-induced suppression) and they include Figure 7 to illustrate that the stimulus-induced suppression they observe is qualitatively similar to surround suppression measured previously. But throughout the study, the results are consistent across the cRF-cRF and sRF-cRF conditions. The authors interpret this finding as evidence that attention acts uniformly across both the center and surround of the receptive field. But another possible explanation is that surround suppression was never in fact activated by the stimulus configurations used and thus contributions of the extraclassical surround were not accurately assessed.

There are two possible explanations for the lack of surround suppression in their data that the authors should address. First, the stimulus configuration of two adjacent Gabors may not be ideal for activating surround suppression in V4 neurons because the stimuli are not covering large enough extents of the center or surround. Second, and perhaps more importantly, if recordings are multi-unit rather than single-unit measurements, this will underestimate the extent of surround suppression. The electrodes used in this study are similar to those used previously by the Maunsell group and prior studies often reported data from multi-unit rather than single-unit recordings. Low impedance electrodes, such as those reported here, are not optimal for recording well-isolated single units. Quantities of well-isolated single-unit versus multi-unit recordings used for the analyses are not reported. Given that surround suppression will vary depending on the type of neuronal activity recorded, it would be helpful to see an example of well-isolated single units and to have a quantification of the number of single-unit versus multi-unit recordings that were utilized for each measurement involving stimulus-induced suppression.

Reviewer #3:

Verhoef and Maunsell measure responses of V4 neurons under conditions in which number of stimuli (2D Gabor patches), stimulus orientation, stimulus position, and spatial attention vary. Their main empirical finding is that attention modulation (change in response when attention is shifted between two stimuli in or near the receptive field) is a function not only of selectivity (difference in response to the two stimuli presented individually) but also suppression (reduction of response to the preferred stimulus when the non-preferred stimulus is added) (Figure 4). Moreover, they argue that selectivity and suppression interact to determine the strength of attention effects, and that attention modulation is weak or absent if either is low (Figure 5). They fit a model incorporating selectivity and suppressive effects to each neuron. Based on the fitting results, they argue that the strength of attentional modulation is entirely determined by selectivity combined with suppression, and that attention is a uniform multiplier interacting with these terms.

The dependency of attentional modulation on suppression looks clear, especially for interactions between one stimulus in the classical receptive field (CRF) and another stimulus in the surround. I believe this is a novel and important finding. The supra-additive interaction between selectivity and suppression is visually clear from Figure 5, but I had trouble understanding the analytical support for this. I think the authors performed analyses specific to cases with 0 selectivity and cases with 0 suppression, but this is not clear from the manuscript. In one case they report highly significant p values, in the other non-significant, so it is not clear how the analysis demonstrates supra-additivity.

The main claim of the paper is that attention acts in a uniform way on all stimuli and all CRF and surround positions, in that attentional modulation is entirely predictable from stimulus selectivity and location-specific suppression values. This claim is based on the cross-validation performance of models fit to each neuron with a single term for attentional modulation but multiple terms for stimulus responses and suppression effects. I think there are a number of problems with this analysis that undercut the main claim of the paper.

First, the design of the model seems arbitrary and inconsistent with the findings and the Discussion. One of the suppression values in the denominator, alpha1, is always set to 1. I assume that L1 and alpha1 pertain to the preferred or driving stimulus in a given stimulus pair, at least in most cases, although this is not clear. The Discussion mentions that attentional modulation depends in part on the suppression value at the location of the driving stimulus. How could the model capture that value if it is always fixed? Alpha2 can vary, and must be near 0 for cases where suppression is low (in order to produce equivalent responses when the non-preferred stimulus is added). But if alpha1 is always equal to 1, a low alpha2 value actually contributes to larger attentional modulation, i.e. a larger difference between response with attention at location 1 and response with attention at location 2. This would be the opposite of the empirical finding that low suppression results in low attentional modulation. Some of these issues might be clarified with a more complete presentation of the model and how it is applied to different cases.

Second, it is not clear from this manuscript that the solutions found by the model are reasonable. There are no examples or population plots to show how fitted parameters explain responses under different conditions and how they vary across neurons to explain different experimental results. The only information about fitted parameter values is in Figure 7, which shows that fitted α values are tightly clustered around 0.5. This seems incompatible with Figure 4C,D, showing that measured suppression values cluster near 0. The authors need to show how their model works to explain the different attention effects they measured.

Third, the results of the modeling analyses are presented only in terms of average variance explained in the response pattern, without any specific analysis of how well attentional modulations are predicted. The variance explained values are extremely high, but I would guess this is because most of the response variance depends on differences in stimulus type and location (rather than attention), and the model fits a separate excitatory value for each stimulus type/location combination and a separate suppression value for each location. With all these separate terms, high variance explained seems guaranteed, even with cross-validation, since fitting all these terms requires that the training set contains all stimulus type/location combinations and all suppression locations. (The manuscript does not specify how training and testing conditions were divided.) The variance due to attention is a small fraction of overall response variance, less than 10% based on Figure 6 (and on average modulation at median suppression shown in 4E,F), and variability in attention effects across stimulus conditions looks even smaller. Thus, in most cases the individual models could get attention effects completely backwards and still produce high average variance explained values. That seems quite possible, since the fitting procedure will be driven primarily by the large variance between stimulus conditions. The authors effectively confirm this by showing that the fitted attention parameter is unnecessary for the high variance explained results.

In summary, the demonstration that attention modulation depends on suppression seems clear and noteworthy. However, the main claim of the paper, that attentional modulation is uniform, depending only on selectivity and suppression, would need a different analysis directed specifically at attentional effects as opposed to overall response variance across many different stimulus conditions.

[Editors’ note: what now follows is the decision letter after the authors submitted for further consideration.]

Thank you for resubmitting your work entitled "Attention operates uniformly throughout the Classical Receptive Field and the Surround" for further consideration at eLife. Your revised article has been favorably evaluated by David Van Essen (Senior editor), a Reviewing editor, and three reviewers.

The manuscript has been improved, but a number of remaining issues need to be addressed before acceptance, as detailed below:

Reviewer #1:

"Attention operates uniformly throughout the classical receptive field and the surround."

In general, I think the authors have addressed the reviewer's comments clearly and compellingly. However, a few issues remain unclear:

1) The authors explain their results in relation to Sundberg as arising from the fact that the latter did not vary the position of the second stimulus: "Consequently, both the selectivity and the stimulus-induced suppression index varied with a single variable, the strength of the second stimulus relative to the strength of the fixed stimulus." This is not clear. Even at a fixed location, it seems L1, L2, alpha1, and alpha2 are independent in theory, and selectivity and suppression could be uncorrelated. Furthermore, as shown in Figure 7, excitation and suppression fall off at independent rates with distance from RF center, and presumably Reynolds sampled a variety of locations with respect to RF center. Thus the authors need to clarify and explain precisely the statement in the quotation marks above (preferably with reference to equations). Are the authors claiming that Equation 1 logically implies that suppression and selectivity should be correlated when measured in the manner of Reynolds? Related to this, it would be helpful if the authors actually analyzed the subset of their data corresponding to Reynolds' paper, and showed that they can reproduce the correlation between suppression and selectivity of that paper.

2) I did not see any response to Reviewer 3's point: "But if alpha1 is always equal to 1, a low alpha2 value actually contributes to larger attentional modulation, i.e. a larger difference between response with attention at location 1 and response with attention at location 2. This would be the opposite of the empirical finding that low suppression results in low attentional modulation." Please address.

Reviewer #3:

Original review: "The dependency of attentional modulation on suppression looks clear […] demonstrates supra-additivity."

The authors added an explanatory clause to the main text, but the supporting analyses remain unclear. The reported p-values are for "main effects at 0 selectivity/suppression". What does "at 0" mean? Is it an analysis performed on a subset of cells with no significant selectivity/suppression? Or is it an analysis performed on a row or column (how wide?) from the average plot? If so, how many neurons actually occupied those rows or columns?

The other confusing thing here is that p values for significance of main effects are given as though they support the statements about the weak effects, which they don't, in the sense that the p-values are actually most significant for main effects of selectivity, which are being discounted as weak. The discrepancy needs explanation. There must be a more appropriate numerical basis for the statement that both selectivity and suppression are required for strong attention effects.

In addition, it is hard to reconcile Figure 5A,B, where attention effects are strongly focused near selectivity = 1 (even if you collapse across suppression) with Figure 4A,B, where attention effects are equally strong at selectivity = 0.5. What explains the difference?

Original review: "The main claim of the paper is that attention acts in a uniform way on all stimuli […] claim of the paper."

The authors respond that, in addition to the model, the claim is based on the visual similarity of Figure 5A and 5B, and on "the statistical tests that we performed on these neuronal responses (see the discussion on the general linear model above)". 5A and B do not look entirely similar; in 5A the peak is elongated horizontally, showing that attention is more sensitive to suppression when the distractor is in the receptive field; in 5B the peak is elongated vertically, showing that attention is more sensitive to selectivity when the distractor is in the surround. A direct test of whether the attention effects are equivalent in 5A and 5B would be some kind of two-dimensional Kolmogorov-Smirnov analysis (see, e.g., Lopes, Reid, Hobson, Proceedings of Science, XI International Workshop on Advanced Computing and Analysis Techniques in Physics Research April 23-27 2007 Amsterdam, the Netherlands.) The linear regression is a less direct way of testing the question, and I don't think that it could capture the differences between peak shapes in 5A and 5B, in which case the new Bayes Factor analysis would not be meaningful.
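To make the suggested comparison concrete, the following is a minimal sketch of a two-sample, two-dimensional Kolmogorov-Smirnov-style test. It uses a Fasano-Franceschini-style quadrant statistic with a permutation p-value, which is not necessarily the procedure described in the reference cited above. The function names, and the assumption that each panel can be summarized as a set of 2D points (for example selectivity and suppression values), are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def quadrant_fractions(points, origin):
    """Fraction of points falling in each of the four quadrants around origin."""
    x, y = points[:, 0], points[:, 1]
    ox, oy = origin
    return np.array([
        np.mean((x > ox) & (y > oy)),
        np.mean((x <= ox) & (y > oy)),
        np.mean((x <= ox) & (y <= oy)),
        np.mean((x > ox) & (y <= oy)),
    ])

def ks2d_statistic(a, b):
    """Largest quadrant-fraction difference, with origins taken at the data
    points of each sample in turn and the two maxima averaged."""
    def max_diff(origins):
        return max(
            np.max(np.abs(quadrant_fractions(a, o) - quadrant_fractions(b, o)))
            for o in origins
        )
    return 0.5 * (max_diff(a) + max_diff(b))

def ks2d_permutation_test(a, b, n_perm=200, seed=0):
    """Permutation p-value: shuffle sample labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = ks2d_statistic(a, b)
    pooled = np.vstack([a, b])
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        d = ks2d_statistic(pooled[perm[:n_a]], pooled[perm[n_a:]])
        count += d >= observed
    return observed, (count + 1) / (n_perm + 1)
```

Calling `ks2d_permutation_test(a, b)` with two arrays of shape `(n, 2)` returns the statistic and a permutation p-value. How attention modulation itself would enter such a test (for instance as a weighting of the points) is left open here.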

Original review: "First, the design of the model seems arbitrary and inconsistent with the findings and the Discussion. […] Some of these issues might be clarified with a more complete presentation of the model and how it is applied to different cases."

The authors explain that it is only the relative values between parameters that matter, and support this by writing out a multiplication of all terms by 1/alpha1. This still doesn't make sense to me in that the α values need to be able to go to 0, as in Figure 5/Figure 5—figure supplement 1C, to represent absence of suppression. If suppression due to stimulus 1 is entirely absent (in which case the multiplier 1/alpha1 would be undefined), how would the model capture that? The addition of Figure 5/Figure 5—figure supplement 1 is certainly helpful in understanding the operation of the model, but it doesn't solve this confusion for me.
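The rescaling argument at issue here can be written out explicitly. The sketch below assumes a generic normalization form with excitatory terms L1 and L2, suppressive terms α1 and α2, attention gains β1 and β2, and a semi-saturation constant σ. The symbols β and σ and the exact form of the expression are assumptions, because the manuscript's equation is not reproduced in this letter.

```latex
% Assumed generic form of the response to a stimulus pair:
R = \frac{\beta_1 L_1 + \beta_2 L_2}{\beta_1 \alpha_1 + \beta_2 \alpha_2 + \sigma}

% Multiplying numerator and denominator by 1/\alpha_1 (valid only for \alpha_1 > 0):
R = \frac{\beta_1 \dfrac{L_1}{\alpha_1} + \beta_2 \dfrac{L_2}{\alpha_1}}
         {\beta_1 + \beta_2 \dfrac{\alpha_2}{\alpha_1} + \dfrac{\sigma}{\alpha_1}}

% With L_i' = L_i/\alpha_1, \alpha_2' = \alpha_2/\alpha_1, \sigma' = \sigma/\alpha_1:
R = \frac{\beta_1 L_1' + \beta_2 L_2'}{\beta_1 \cdot 1 + \beta_2 \alpha_2' + \sigma'}
```

Under this assumed form, fixing α1 = 1 loses no generality as long as α1 > 0. The step is undefined at α1 = 0, which is exactly the case raised in the comment above.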

Original review: "Second, it is not clear from this manuscript that the solutions found by the model are reasonable. […] The authors need to show how their model works to explain the different attention effects they measured."

In their response, the authors provide an extended explanation of how α values can be balanced out in either direction by L values, concluding that "in general, there exists no direct relationship between the suppression-index in Figure 4C,D and the α term." To the extent this is true, Figure 7 and the accompanying legend are highly confusing, since they equate α with suppression in the labels and in the text. The fact that the new supplementary example with 0 suppression also has a 0 α value will tend to add to this confusion. Since they are plotting α in Figure 7, the authors need to explain what it means and how it relates (if at all) to suppression, not just to reviewers but also to readers.

Original review: "Third, the results of the modeling […] The authors effectively confirm this by showing that the fitted attention parameter is unnecessary for the high variance explained results."

The authors correctly point out that the models must not be failing to capture attention effects given the close approximation to population level effects in Figure 4. And, they reiterate that removing the attention term reduces overall explained variance from 87% to 79%. My main point had simply been that these were both extremely high values, and this is due to the fact that the number of terms in the model is on the same order as the number of conditions, so a close fit is not surprising or informative in either case. The cross-validation does not change this because, given the low number of conditions and the description now given in methods, it amounts to simply splitting repetitions from identical trial types into two groups and proving they give the same result. I don't think the 87% explained variance with the full model or the 8% drop in explained variance by themselves establish that "attention acts uniformly across the cRF and the surround". (The exceedingly close fits are due to the similar number of terms and conditions; the 8% drop shows that the attention term mattered, but it doesn't address how exactly attention behaved.) Nor, as explained above, do I think this is established by visual similarity between Figure 5A and B or the linear regression analysis. As stated above, I think the direct test of this proposition would be a statistical comparison of 5A and 5B. Regardless of the result, the clear and more interesting result here is the criticality of the interaction between selectivity and suppression for attention.
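The concern about split-half cross-validation can be illustrated with a small simulation. The sketch below is not the authors' model. It takes the limiting case of one free term per condition, with hypothetical firing rates and noise levels, and evaluates explained variance on held-out repetitions of the same conditions.

```python
import numpy as np

def split_half_explained_variance(n_conditions=24, n_reps=40, noise_sd=5.0, seed=0):
    """Explained variance of a one-term-per-condition 'model' under split-half
    cross-validation. All numbers are hypothetical, for illustration only."""
    rng = np.random.default_rng(seed)
    # Hypothetical mean rates, one per stimulus/attention condition.
    true_means = rng.uniform(5.0, 50.0, size=n_conditions)
    # Simulated single-trial responses, shape (conditions, repetitions).
    trials = true_means[:, None] + rng.normal(0.0, noise_sd, size=(n_conditions, n_reps))
    half = n_reps // 2
    train_means = trials[:, :half].mean(axis=1)  # "fit": one free term per condition
    test_means = trials[:, half:].mean(axis=1)   # held-out repetitions, same conditions
    ss_res = np.sum((test_means - train_means) ** 2)
    ss_tot = np.sum((test_means - test_means.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

print(split_half_explained_variance())
```

Because the held-out half contains the same conditions as the training half, the explained variance in this toy case mostly reflects how reliably the condition means are measured, rather than how well any particular structure in the model describes attention.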

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Attention operates uniformly throughout the Classical Receptive Field and the Surround" for further consideration at eLife. Your revised article has been favorably evaluated by David Van Essen (Senior editor) and the Reviewing editor.

The manuscript has been improved and we greatly appreciate the detailed, equation-based clarifications in response to several of the points raised by the reviewers.

However, the reviewers remain concerned whether the authors have really provided evidence that attention acts uniformly throughout the receptive field. The reviewers would like clarification of the following points:

1) Is there any reason that 5A and 5B should look identical? In the initial reviewer response, the authors stated, "We would like to emphasize that the claim that attention acts in a uniform way on all cRF and surround positions is supported by more than just the model that accurately accounts for all neuronal responses, irrespective of the receptive field position and attention condition. In particular, this uniformity can be seen directly in the similarity of the neuronal responses in Figure 5A and B, which show that the dependency of attention modulation on selectivity and suppression does not depend on whether stimuli are presented in the cRF or the surround." The reviewers see a clear difference between 5A and B. This seems to provide evidence that attention is more critically dependent on selectivity in the CRF-CRF condition, and more critically dependent on suppression in the CRF-SRF condition; quite the opposite of operating uniformly. The reviewers are confused whether the authors have a hypothesis or explanation for why the dependency of attention on suppression and selectivity should differ between CRF (5A) and SRF (5B); it seems that the difference is contrary to the authors' conclusion. If they do have a hypothesis about the difference, please state it and test it, and modify the "uniform operation" conclusion accordingly.

2) In the most recent reviewer response, the authors seem to be backing away from the claim that 5A and B need to look similar, and are appealing to the similarity between the model fits (5D,E) and the data (5A, B). The reviewers remain concerned that the reason why the model fits assuming uniform attention are so good is that all of the non-attention related terms (separate excitatory term for each stimulus type/location combination and a separate suppression value for each location) are doing all the work. Since the number of terms being fitted is nearly equal to the number of measurements, this guarantees overfitting, and it can't be cured by cross-validation because every condition would need to be in both portions of the data to fit all the stimulus- and position-specific terms.

It would seem advisable to include an explicit caveat about this.

3) Most importantly, the obvious way to test their claim is some kind of direct comparison of attentional modulation between CRF and SRF. Can the authors do this?

4) Have the authors tried to fit other models with fewer non-attention related variables (e.g., keeping suppression constant)? How does this affect the attention term?

Overall, the reviewers would be most convinced by a new analysis directly showing that attentional modulation as a function of selectivity and suppression is the same in the cRF and surround, without appeal to goodness of model fits. Without such an analysis, it remains unclear whether the authors have identified a truth about the brain, that attention acts uniformly, or whether they have simply shown that a viable model of neurons can be constructed in which attention acts uniformly, but other models with a variable attention term are equally plausible.

eLife. 2016 Aug 22;5:e17256. doi: 10.7554/eLife.17256.015

Author response


[Editors’ note: the author responses to the first round of peer review follow.]

Major issues:

The authors claim that the attention effect solely depends on the differential response of the component stimuli, regardless of whether the difference arises from spatial selectivity or feature selectivity. They show the former, i.e. that their findings of interaction between selectivity and suppression etc. can be reproduced by looking only at conditions where the same stimulus was presented, so that only spatial selectivity can evoke the observed attention effects. But can you show from your data that feature selectivity alone can reproduce your results (e.g. look at conditions where the neuron has low spatial selectivity, i.e. the neuron's response is the same if you present the same stimulus at one of two locations; or by pairwise comparisons of different stimuli at the same locations)?

Thanks for the suggestions. We repeated the same analysis on the data from conditions with low spatial selectivity. This analysis confirmed that feature (orientation) selectivity alone can reproduce our results. When attention shifts between two approximately equally responsive cRF positions (less than 2 spikes/s response difference when each of two cRF positions is stimulated with an identical single stimulus), similar effects were observed (main effect of feature selectivity: p<0.001; main effect of suppression: p=0.7; interaction between feature selectivity and suppression: p=0.02; multiple linear regression).

This result is further corroborated by the modeling results, as the model makes no distinction between spatial and feature selectivity, yet it fits the data well.

This new result has been added to the Results section (subsection “A spatially-tuned normalization model captures attention modulation inside the cRF and in the surround”): "Similar to these previous studies, we reproduced the above findings using the data from conditions with low spatial selectivity. When attention shifts between stimuli at two approximately equally responsive cRF positions (less than 2 spikes/s response difference when each of two cRF positions is stimulated with an identical single stimulus), similar effects were observed (main effect of feature selectivity: p<0.001; main effect of suppression: p=0.7; interaction between feature selectivity and suppression: p=0.02; multiple linear regression). Next, we examined whether the converse situation, i.e. same stimuli at unequally-responsive cRF positions, would produce attention modulations comparable to those described earlier…"

Reviewer #2:

Verhoef and Maunsell recorded the activity of V4 neurons in alert and behaving monkeys performing a dot-detection task requiring focused spatial attention.

[…]

But additionally, the authors never directly compare their findings with those of Sundberg et al. and Sanayei et al. If the authors are going to make the case that their work overturns or expands upon that of prior groups, then this needs to be explicitly described in the Introduction and/or Discussion.

Both Sundberg et al. (2009) and Sanayei et al. (2015) used conditions with one stimulus inside the classical receptive field and at least one other stimulus in the surround (sRF-cRF condition in our study). However, neither of these studies compared responses to a condition in which both stimuli were positioned inside the cRF (cRF-cRF condition).

Hence, these investigators could not examine how attention affects neuronal responses when shifted between two stimuli inside the cRF (cRF-cRF condition) versus when shifted between one stimulus inside the cRF and another stimulus inside the surround (sRF-cRF condition). This comparison is crucial for determining whether attentional effects on neuronal responses differ between the cRF and the surround. Additionally, we show the similarity between attentional effects in the cRF and the surround can only be revealed by examining the interaction between stimulus selectivity and stimulus-induced suppression (Figure 4 versus 5). No previous study has ever done this.

Finally, Sundberg et al. did not test whether a single model could fit data from both the cRF and the surround. In our study, the good fits of the spatially-tuned normalization model to the data obtained in both receptive field configurations provided critical corroborating evidence for the conjecture that attention acts similarly inside the cRF and the surround. Sanayei et al. fit different (normalization) models, but never measured the neuronal effects of attention when attention was directed to the surround stimulus; they only compared the effects of attention to the cRF stimulus, with or without surround stimulation, versus attention to a distant stimulus. Attention was never directed to the surround stimuli. Thus, Sanayei et al. lacked the crucial information needed to examine how surround suppression affects attention modulation and to test the efficacy of normalization models.

We believe that most of the confusion comes from our wording, which did not mention that attention should shift between stimuli either presented inside the cRF or between a cRF and surround stimulus. We now write (third paragraph of the Introduction section): "For example, the way that attention acts on neuronal responses when shifted between stimuli inside the cRF versus when shifted to stimuli inside the surround has not been compared directly [16,23,29]."

We now also discuss both studies in the Discussion (fourth paragraph of the Discussion section): "Both Sundberg et al. (2009) and Sanayei et al. (2015) used conditions with one stimulus inside the classical receptive field and at least one other stimulus in the surround (sRF-cRF). However, neither of these studies used a condition with both stimuli positioned inside the cRF (cRF-cRF). Hence, a direct comparison of attention modulation within the cRF and the surround could not be performed in these studies. This comparison is crucial to determine whether the neuronal effects of attention differ between the cRF and the surround. Importantly, we found that seeing the similarity between attentional effects in both receptive field regions requires examination of the combined relationship, i.e. interaction, between attention modulation and both stimulus selectivity and stimulus-induced suppression (Figure 4 versus 5). The interaction between these variables was similar in both receptive field configurations.

Sanayei et al. fit different (normalization) models, but never measured the neuronal effects of attention when attention was directed to the surround stimulus; the authors only compared the effects of attention to the cRF stimulus, with or without surround stimulation, versus attention to a distant stimulus. Attention was never directed to the surround stimuli. Thus, Sanayei et al. lacked the crucial information needed to examine how surround suppression affects attention modulation and to test the efficacy of normalization models. We tested whether a single model could fit both the cRF-cRF and sRF-cRF data. The good fits of the spatially-tuned normalization model to the data obtained in both receptive field configurations provided further evidence that attention acts similarly inside the cRF and the surround."

A second concern involves the amount of suppression observed in the two Gabor stimulus configurations.

[...]

There are two possible explanations for the lack of surround suppression in their data that the authors should address. First, the stimulus configuration of two adjacent Gabors may not be ideal for activating surround suppression in V4 neurons because the stimuli are not covering large enough extents of the center or surround.

Our visual stimuli were not designed for maximal surround suppression, which would have required filling the surround with a high contrast stimulus. This was not possible in our experiment, which required attention that was comparably restricted in space in all conditions (cRF-cRF and sRF-cRF). It is likely that surround suppression was limited primarily because of the limited extent of the Gabor stimuli. However, we know of no reason to believe that the Gabors led to qualitative (rather than quantitative) differences from what would have been obtained had optimal surround activation been possible. The surround suppression we observed was very similar to that observed in previous studies on attention in the surround (Sundberg et al., 2009; Sanayei et al., 2015).

Although the mean surround suppression was modest, a substantial proportion of the neurons was significantly suppressed by the surround stimulus (see Figure 4D black bars, p<0.01).

Using Gabors made it possible for us to explore activation of the surround at varying offsets from the RF center. Figure 7 shows that surround suppression decreases with increasing distance from the RF center. Including data from near and far portions of the surround allowed us to show that attention acts comparably across the range of distances. As noted in a response to Reviewer 1 (above), limiting our analyses to sites with statistically significant suppression indices did not change the results. The consistency across the cRF-cRF and sRF-cRF conditions also holds when considering only neurons with strong surround suppression (Figure 5). We now also present some more single-neuron examples that show clear surround suppression (see below and also Figure 4—figure supplement 1 and Figure 3C).

Hence, although larger surround stimuli would have caused more suppression, the surround stimuli in our study produced clear suppression and attention interacted with this surround suppression as it did with suppression from the cRF.

Second, and perhaps more importantly, if recordings are multi-unit rather than single-unit measurements, this will underestimate the extent of surround suppression. The electrodes used in this study are similar to those used previously by the Maunsell group and prior studies often reported data from multi-unit rather than single-unit recordings. Low impedance electrodes, such as those reported here, are not optimal for recording well-isolated single units. Quantities of well-isolated single-unit versus multi-unit recordings used for the analyses are not reported. Given that surround suppression will vary depending on the type of neuronal activity recorded, it would be helpful to see an example of well-isolated single units and to have a quantification of the number of single-unit versus multi-unit recordings that were utilized for each measurement involving stimulus-induced suppression.

All results reported in this study were based on single units, not multi-unit sites.

However, we did observe very similar results when analyzing the multi-units. We apologize for this point not being clear in the previous version of the manuscript. We have added (third paragraph of the Introduction section): "All results presented here are based on the activity of these 728 single neurons, but all findings were confirmed in the responses of 12067 multi-unit clusters (M1: 4709; M2: 7358)." We also added the waveforms of the recorded neurons (blue) plus that of the multi-unit activity measured at the same electrode (grey) in Figure 3.

Finally, we added a new supplemental figure in which we present some well-isolated single units that demonstrate strong and significant surround suppression (see Figure 3C for another example). We have added to the Results section, subsection “Relationship between selectivity, stimulus-induced suppression and attention modulation”: "The black bars in Figure 4C and D represent neurons that were significantly (p<0.01) suppressed by the non-preferred (surround) stimulus. See Figure 4—figure supplement 1 for some example neurons with significant surround suppression (see also Figure 3C)."

Reviewer #3:

Verhoef and Maunsell measure responses of V4 neurons under conditions in which number of stimuli (2D Gabor patches), stimulus orientation, stimulus position, and spatial attention vary. Their main empirical finding is that attention modulation (change in response when attention is shifted between two stimuli in or near the receptive field) is a function not only of selectivity (difference in response to the two stimuli presented individually) but also suppression (reduction of response to the preferred stimulus when the non-preferred stimulus is added) (Figure 4). Moreover, they argue that selectivity and suppression interact to determine the strength of attention effects, and that attention modulation is weak or absent if either is low (Figure 5). They fit a model incorporating selectivity and suppressive effects to each neuron. Based on the fitting results, they argue that the strength of attentional modulation is entirely determined by selectivity combined with suppression, and that attention is a uniform multiplier interacting with these terms.

The dependency of attentional modulation on suppression looks clear, especially for interactions between one stimulus in the classical receptive field (CRF) and another stimulus in the surround. I believe this is a novel and important finding. The supra-additive interaction between selectivity and suppression is visually clear from Figure 5, but I had trouble understanding the analytical support for this. I think the authors performed analyses specific to cases with 0 selectivity and cases with 0 suppression, but this is not clear from the manuscript. In one case they report highly significant p values, in the other non-significant, so it is not clear how the analysis demonstrates supra-additivity.

We used a general linear regression model to examine whether selectivity and suppression interact to determine the magnitude of attention modulation. This regression model is explained in the Methods. The model includes two main effects, one for selectivity and one for suppression, in addition to an interaction term.

attention modulation = selectivity ⋅ β1 + suppression ⋅ β2 + selectivity ⋅ suppression ⋅ β3 + error

In this model, the main effect of selectivity measures the contribution of selectivity to attention modulation given that suppression is zero. Similarly, the main effect of suppression measures the contribution of suppression to attention modulation given that selectivity is zero. For example, if suppression is zero, the suppression term and the interaction term (selectivity ⋅ suppression ⋅ β3) both go to zero, leaving only the β1 term, which specifies the contribution of selectivity to attention modulation. Conversely, if selectivity is zero, the only non-zero term is the β2 term, which specifies the contribution of suppression to attention modulation.
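As a rough illustration of the regression just described, the sketch below fits the two main effects and the product term by ordinary least squares. The function name, the synthetic data, and the coefficient values are assumptions made for illustration only; they are not the authors' code or data, and the sketch does not compute the p-values reported in this response.

```python
import numpy as np

def fit_interaction_regression(selectivity, suppression, attention_modulation):
    """Least-squares fit of
    modulation = sel*b1 + sup*b2 + sel*sup*b3 + error
    (no intercept, matching the equation as written in the response)."""
    sel = np.asarray(selectivity, dtype=float)
    sup = np.asarray(suppression, dtype=float)
    y = np.asarray(attention_modulation, dtype=float)
    # Columns: main effect of selectivity, main effect of suppression, product term.
    design = np.column_stack([sel, sup, sel * sup])
    betas, *_ = np.linalg.lstsq(design, y, rcond=None)
    return betas

# Synthetic data (hypothetical values): weak main effects, strong interaction.
rng = np.random.default_rng(0)
sel = rng.uniform(0.0, 1.0, 500)
sup = rng.uniform(-0.5, 1.0, 500)
mod = 0.05 * sel + 0.0 * sup + 0.8 * sel * sup + rng.normal(0.0, 0.01, 500)

b1, b2, b3 = fit_interaction_regression(sel, sup, mod)
print(b1, b2, b3)
```

In this toy setting the product coefficient dominates, which is the pattern the response describes as non-additivity: modulation is near zero whenever either selectivity or suppression is near zero.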

The main effect of suppression was never significant. The main effect of selectivity was, but the effect was very small as can be seen in Figure 5. These main effects are not important for the message in this manuscript, but are provided for completeness. In striking contrast, all terms corresponding to the interaction between selectivity and suppression (β3) were highly significant in both the cRF-cRF and the sRF-cRF configuration, and for each monkey separately. This demonstrates non-additivity (Results section, subsection “Stimulus selectivity and stimulus-induced suppression interact in determining attention modulation and do so similarly inside the cRF and the surround”).

We show that this non-additivity is similar in each receptive field configuration (using a 3-way interaction between selectivity, suppression and receptive field configuration), meaning that β3 (but also β1 and β2) does not differ significantly between the cRF-cRF and sRF-cRF condition. Because a non-significant effect does not indicate the absence of an effect, we performed a new Bayesian general linear model analysis. This analysis showed that the observed data are 347 times more likely to agree with a regression model that does not distinguish between the cRF-cRF and sRF-cRF configurations than with a model that does include RF-configuration as a predictor. This means that attention modulation is driven by similar mechanisms within the cRF and the surround.

We now clarify this (same section): "Importantly, in each RF configuration the regression model also included an interactive product term, which measured the dependency of attention-related modulation on both selectivity and stimulus-induced suppression, i.e. this term measures whether the relationship between selectivity, suppression and attention modulation is non-additive (see Methods for further information)."

We also expanded the regression description in Analyses subsection of the Methods section:

"We used multiple linear regression to examine if attention-related modulation depends on the interaction between stimulus selectivity and stimulus-induced suppression.

[…]

Conversely, if selectivity is zero, the only non-zero term is the β2 term, which specifies the contribution of suppression to attention modulation."

The main claim of the paper is that attention acts in a uniform way on all stimuli and all CRF and surround positions, in that attentional modulation is entirely predictable from stimulus selectivity and location-specific suppression values. This claim is based on the cross-validation performance of models fit to each neuron with a single term for attentional modulation but multiple terms for stimulus responses and suppression effects. I think there are a number of problems with this analysis that undercut the main claim of the paper.

We would like to emphasize that the claim that attention acts in a uniform way on all cRF and surround positions is supported by more than just the model that accurately accounts for all neuronal responses, irrespective of the receptive field position and attention condition. In particular, this uniformity can be seen directly in the similarity of the neuronal responses in Figure 5A and B, which show that the dependency of attention modulation on selectivity and suppression does not depend on whether stimuli are presented in the cRF or the surround. In addition, the claim of uniformity is supported by the statistical tests that we performed on these neuronal responses (see the Discussion on the general linear model above). The good model fits provide additional support for this claim.

First, the design of the model seems arbitrary and inconsistent with the findings and the Discussion. One of the suppression values in the denominator, alpha1, is always set to 1. I assume that L1 and alpha1 pertain to the preferred or driving stimulus in a given stimulus pair, at least in most cases, although this is not clear. The Discussion mentions that attentional modulation depends in part on the suppression value at the location of the driving stimulus. How could the model capture that value if it is always fixed? Alpha2 can vary, and must be near 0 for cases where suppression is low (in order to produce equivalent responses when the non-preferred stimulus is added). But if alpha1 is always equal to 1, a low alpha2 value actually contributes to larger attentional modulation, i.e. a larger difference between response with attention at location 1 and response with attention at location 2. This would be the opposite of the empirical finding that low suppression results in low attentional modulation. Some of these issues might be clarified with a more complete presentation of the model and how it is applied to different cases.

We are sorry about the confusion and address these concerns here.

First, we should point out that in the cRF-cRF condition both stimuli are typically driving stimuli, meaning that both stimuli contribute a non-zero L-term.

Second, we used multi-electrode array recordings, recording simultaneously from many neurons. For some neurons the stimulus at position 1 (and L1 and α1) was the stronger driver (e.g., positioned closer to the receptive field center), while for other neurons the stimulus at position 2 drove the neuron more strongly (larger L2 term). For the sRF-cRF condition, position 1 was the surround position for some neurons, while position 2 was the surround position for other neurons. This depended on where the two stimulus positions fell relative to the receptive field of each neuron, which varied across days because we recorded different neurons and used different stimulus positions on different days (Materials and methods section, subsection “Analyses”).

Third, α1 was set to 1 only to simplify the model. Letting it vary has no effect: if it is allowed to vary, the resulting fit can always be converted to a fit with α1 = 1 by multiplying all terms by the appropriate factor. This can be seen as follows:

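$$R_{1,2} = \frac{L_1 + L_2}{\alpha_1 + \alpha_2 + \sigma} = \left(\frac{1/\alpha_1}{1/\alpha_1}\right)\frac{L_1 + L_2}{\alpha_1 + \alpha_2 + \sigma} = \frac{\frac{L_1}{\alpha_1} + \frac{L_2}{\alpha_1}}{1 + \frac{\alpha_2}{\alpha_1} + \frac{\sigma}{\alpha_1}} = \frac{L_1' + L_2'}{1 + \alpha_2' + \sigma'}$$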

Transforming α1 to 1 causes the other parameters to take different values (indicated by primes in the equation) but has no other consequence. In general, the absolute values of the model parameters have no meaning. Only the relative values of the parameters with respect to each other are important.
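The rescaling argument can be checked numerically. The short sketch below evaluates the two-stimulus response for one arbitrary parameter set and again after dividing every parameter by α1; the parameter values are made up for illustration and are not fitted values from the study.

```python
import math

def response(L1, L2, alpha1, alpha2, sigma):
    """Two-stimulus response: summed excitation over summed suppression plus sigma."""
    return (L1 + L2) / (alpha1 + alpha2 + sigma)

# Arbitrary illustrative parameters with alpha1 free to differ from 1.
L1, L2, alpha1, alpha2, sigma = 30.0, 12.0, 2.5, 1.5, 0.4

original = response(L1, L2, alpha1, alpha2, sigma)
# Divide every term by alpha1, which fixes the first suppression weight at 1.
rescaled = response(L1 / alpha1, L2 / alpha1, 1.0, alpha2 / alpha1, sigma / alpha1)

print(original, rescaled)
```

The two printed values agree, so only the ratios among the parameters are constrained by the data, which is why fixing α1 to 1 costs nothing.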

For the spatially-tuned normalization model, fixing α1 to 1 provides the benefit of placing all the other model parameters on a similar scale, where they can be more readily compared. For example, if receptive field position 2 has an α2 > 1, it means that stimuli presented at position 2 will contribute more suppression than stimuli presented at position 1. If α1 were free to vary, α2 could not be interpreted based on its absolute value alone. The crux of the model lies in the relative values of its parameters, which vary spatially (Figure 7). Below we also present some single-neuron examples to further explain the model.

Second, it is not clear from this manuscript that the solutions found by the model are reasonable. There are no examples or population plots to show how fitted parameters explain responses under different conditions and how they vary across neurons to explain different experimental results. The only information about fitted parameter values is in Figure 7, which shows that fitted α values are tightly clustered around 0.5. This seems incompatible with Figure 4C,D, showing that measured suppression values cluster near 0. The authors need to show how their model works to explain the different attention effects they measured.

Figure 7 shows how the key model parameters (the α- (grey) and L- (black) parameters) vary across the receptive field. As described in the manuscript, variation in the other model terms is less important. For example, we point out (Results section, subsection “Spatial variability in excitation and suppression underlies differences in attention modulation across neurons”) that variability in the attention parameter β is inconsequential because a model with a fixed β-parameter fits the data equally well. Note that this does not mean that the β-parameter is unimportant. In fact, the β-parameter is crucial to capture the neuronal effect of attention. It does mean that the β-parameter does not need to vary across different neurons to account for the data (see below). Finally, the values of the σ parameter tend to cluster around zero. We now give information on the σ-values (see also the response to reviewer 2).

It is important to realize that the suppression values in Figure 4C, D are not directly related to the α values in Figure 7. The suppression values in Figure 4C, D are a consequence of a balance of excitation and suppression. In the model, excitation is represented by the L-parameter and suppression is represented by the α-parameter.

For example, if one presents a new stimulus adjacent to an already presented stimulus, the neuronal response might decrease, and a positive stimulus-induced suppression index will be measured. The response decrease means that the added stimulus induced more suppression than excitation, but this can happen with different values of the α- and L-parameter.

For example, the L-term might be large and the α-term even larger, or the L-term might be minimal with an α-term that is small but large enough to suppress the response.

An α value of 0.5 will suppress neuronal responses when the corresponding L-term is small, but the same α value might lead to increased responses when the corresponding L-term is large. This is why stimuli in the surround often suppress neuronal responses: surround stimuli contribute little excitation relative to suppression (low L-term compared to α-term). However, those same stimuli might readily increase the response when presented inside the receptive field center (high L-term compared to α-term).
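A small numerical sketch makes this balance concrete. It assumes the response form used in the α1 rescaling argument in this response, and further assumes that an absent stimulus contributes neither an L-term nor an α-term; all parameter values are hypothetical.

```python
def model_response(L_terms, alpha_terms, sigma):
    """Normalized response: summed excitation over summed suppression plus sigma."""
    return sum(L_terms) / (sum(alpha_terms) + sigma)

def response_change(L1, L2, alpha2, alpha1=1.0, sigma=0.1):
    """Change in response when stimulus 2 is added to stimulus 1.
    Negative values mean the added stimulus suppressed the response."""
    alone = model_response([L1], [alpha1], sigma)
    paired = model_response([L1, L2], [alpha1, alpha2], sigma)
    return paired - alone

# Same alpha (0.5) for the added stimulus, different excitatory drive.
weak_drive = response_change(L1=20.0, L2=2.0, alpha2=0.5)     # surround-like: little excitation
strong_drive = response_change(L1=20.0, L2=15.0, alpha2=0.5)  # center-like: strong excitation
print(weak_drive, strong_drive)
```

With these made-up values the weakly driving stimulus lowers the response and the strongly driving one raises it, even though both carry the same α, so a suppression index near zero does not imply an α near zero.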

The near-zero values in Figure 4C, D show that a second stimulus often produces small changes in the balance between excitation and suppression. Nevertheless, as can be seen in Figure 4C, D, in many cases stimuli did tip the balance between excitation and suppression in favor of suppression (positive suppression indices) or excitation (negative suppression values).

Thus the stimulus-induced suppression indices depend on the balance between excitation and suppression, which is modeled by the relative values of the L- and α-terms.

In general, there exists no direct relationship between the suppression index in Figure 4C, D and the α-term. Below we also present some single-neuron examples to further explain the model.

Third, the results of the modeling analyses are presented only in terms of average variance explained in the response pattern, without any specific analysis of how well attentional modulations are predicted. The variance explained values are extremely high, but I would guess this is because most of the response variance depends on differences in stimulus type and location (rather than attention), and the model fits a separate excitatory value for each stimulus type/location combination and a separate suppression value for each location. With all these separate terms, high variance explained seems guaranteed, even with cross-validation, since fitting all these terms requires that the training set contains all stimulus type/location combinations and all suppression locations. (The manuscript does not specify how training and testing conditions were divided.) The variance due to attention is a small fraction of overall response variance, less than 10% based on Figure 6 (and on average modulation at median suppression shown in 4E,F), and variability in attention effects across stimulus conditions looks even smaller. Thus, in most cases the individual models could get attention effects completely backwards and still produce high average variance explained values. That seems quite possible, since the fitting procedure will be driven primarily by the large variance between stimulus conditions. The authors effectively confirm this by showing that the fitted attention parameter is unnecessary for the high variance explained results.

It is not true that the attention parameter (β) is unnecessary for the high variance explained. Although an attention parameter that varies across neurons is unnecessary, the attention parameter by itself, with a value fixed across all neurons, is crucial to model any attention modulation. What we show is that the attention parameter does not have to vary across neurons, but its fixed value is pivotal in order to predict any attention effects.

Without the attention parameter the model would not explain any curve in Figure 4A, B, E and F, and Figure 5D and E because it would predict zero attention modulation across the board, in clear contradiction to the data. A model with no β parameter does a significantly poorer job (median explained 2-fold cross-validated variance of 79% versus 87% for the model with β parameter; p<0.01).

As the reviewer notes, a substantial amount of variance remains explained because of the variance that arises from stimulus differences. This is also because the responses of some neurons were little affected by attention (see neurons with weak selectivity and suppression in Figure 5A and B), producing small decreases in explained variance when leaving out the β parameter. Other neurons' responses were strongly affected by attention (see neurons with strong selectivity and suppression in Figure 5A and B), and for these a model with no β parameter explained up to 50% less of the variance than a model with the β parameter. Our results show that the magnitude of attention modulation, and thus how much of the response variance is explained by attention and β, varies with suppression and selectivity. Thus the critical test of the model lies in its ability to explain the full range of attention modulations across the range of observed suppression and selectivity. We show that the model does an excellent job of fitting the observed attention modulations across the full range of selectivity and suppression values and in different stimulus and receptive field conditions. Figure 4A and B plot attention modulation vs. selectivity in both RF configurations. The light grey values in these plots represent the average attention modulations predicted by the model and show that the model precisely captures these trends in the population of neurons across the full range of selectivity values. Similarly, Figure 4E and F show (light grey values) that the model also precisely captures how attention modulation varies with stimulus-induced suppression in the population of neurons across the full range of suppression indices. In addition, Figure 5D and E show that the model captures attention modulation in the population of neurons within the entire space created by the selectivity and suppression indices, i.e. the model clearly predicts the interaction, and does so very similarly to the observed data in Figure 5A and B. Finally, Figure 6B shows that the model precisely reproduces the average attention modulation within the population of neurons in the conditions where single stimuli are presented at different RF locations. The close correspondence between the model predictions and the observed data is only possible if the model gets the attention effects right.

We now add (Results section, subsection “A spatially-tuned normalization model captures attention modulation inside the cRF and in the surround”): "Figure 4A, B, E and F (light grey points) show that the model precisely accounts for attention modulation across the full range of observed stimulus selectivity and stimulus-induced suppression values, within both the cRF-cRF and sRF-cRF configuration."

We also added a new figure with single-neuron examples to further explain the model and to demonstrate the model's ability to account for attention modulation in the responses of individual neurons (see below, Figure 5—figure supplement 1).

In addition, we now provide details on the cross-validation procedure in the Methods: "The goodness-of-fit of the model was calculated for each neuron as the percentage explained variance, which was determined by taking the square of the correlation coefficient between the model-predicted responses and the observed neuronal responses across all stimulus conditions. The explained variance was calculated using the average neuronal responses from trials not used to fit the model.

For this purpose we employed two-fold cross-validation, fitting the model based on half of the randomly chosen trials, and using the remaining data to measure the goodness of fit of the model. This procedure was repeated five times, each time using a different random draw, and subsequently averaged across all cross-validations to produce the reported goodness of fit."
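The procedure quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the trials are assumed to be split at random within each stimulus condition, both halves are assumed to serve once as training and once as test data in each draw, and `fit_condition_means` is a trivial stand-in for the actual normalization-model fit. The data at the end are synthetic.

```python
import numpy as np


def explained_variance(predicted, observed):
    """Percentage explained variance: squared Pearson correlation times 100."""
    r = np.corrcoef(predicted, observed)[0, 1]
    return 100.0 * r ** 2


def fit_condition_means(train_trials):
    """Hypothetical stand-in for the model fit (NOT the normalization model):
    predicts each condition's response as its mean over the training trials."""
    return np.array([np.mean(t) for t in train_trials])


def cross_validated_explained_variance(trials_per_condition,
                                       fit=fit_condition_means,
                                       n_repeats=5, seed=0):
    """Two-fold cross-validation repeated n_repeats times with fresh random splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_repeats):
        halves_a, halves_b = [], []
        for trials in trials_per_condition:
            trials = np.asarray(trials, dtype=float)
            order = rng.permutation(len(trials))  # random draw of trials
            half = len(trials) // 2
            halves_a.append(trials[order[:half]])
            halves_b.append(trials[order[half:]])
        # Assumption: each half is used once for fitting and once for testing.
        for train, test in ((halves_a, halves_b), (halves_b, halves_a)):
            predicted = fit(train)
            observed = np.array([np.mean(t) for t in test])  # held-out averages
            scores.append(explained_variance(predicted, observed))
    return float(np.mean(scores))


# Synthetic example: 36 conditions, 20 trials each (made-up numbers).
rng = np.random.default_rng(1)
true_means = np.linspace(5.0, 50.0, 36)
data = [m + rng.normal(0.0, 1.0, size=20) for m in true_means]
score = cross_validated_explained_variance(data)
print(f"cross-validated explained variance: {score:.1f}%")
```

Because the prediction is always scored against averages of trials that were not used for fitting, noise that is fitted in the training half does not carry over to the test half.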

In sum, the claim that attention acts uniformly across the cRF and the surround is supported by the observed neuronal responses (Figure 5A and B), a statistical analysis of these data (the general linear model, see above), and the spatially-tuned normalization model that accurately accounted for all neuronal responses, irrespective of the receptive field position of the stimuli and the attention condition. We show that this model predicts the observed attention modulation across conditions with one or two stimuli, with stimuli at different RF positions, and across the full range of the observed selectivity and stimulus-induced suppression values. We show this based on the population of neurons (Figure 4A, B, E and F, Figure 5D and E, and Figure 6B) and based on single-neuron examples (Figure 5—figure supplement 1).

[Editors' note: the author responses to the re-review follow.]

Thank you for resubmitting your work entitled "Attention operates uniformly throughout the Classical Receptive Field and the Surround" for further consideration at eLife. Your revised article has been favorably evaluated by David Van Essen (Senior editor), a Reviewing editor, and three reviewers.

The manuscript has been improved, but a number of remaining issues need to be addressed before acceptance, as detailed below:

Reviewer #1:

"Attention operates uniformly throughout the classical receptive field and the surround."

In general, I think the authors have addressed the reviewer's comments clearly and compellingly. However, a few issues remain unclear:

1) The authors explain their results in relation to Sundberg as arising from the fact that the latter did not vary the position of the second stimulus: "Consequently, both the selectivity and the stimulus-induced suppression index varied with a single variable, the strength of the second stimulus relative to the strength of the fixed stimulus." This is not clear. Even at a fixed location, it seems L1, L2, alpha1, and alpha2 are independent in theory, and selectivity and suppression could be uncorrelated. Furthermore, as shown in Figure 7, excitation and suppression fall off at independent rates with distance from RF center, and presumably Reynolds sampled a variety of locations wrt RF center. Thus the authors need to clarify and explain precisely the statement in the quotation marks above (preferably with reference to equations). Are the authors claiming that Eq1 logically implies that suppression and selectivity should be correlated when measured in the manner of Reynolds?

That is correct: Equation 1 from the spatially-tuned normalization model dictates that suppression and selectivity will be correlated when only one of two stimuli is varied. In the spatially-tuned normalization model, the L and α terms are indeed independent of each other. However, the α terms are fixed at each RF location. Consequently, varying only the stimulus at one RF location, while keeping the other stimulus fixed, corresponds to keeping L1, α1 and α2 fixed. Here, L1 is the excitatory drive from the fixed stimulus, α1 is the suppressive drive from the fixed stimulus, and α2 is the suppressive drive from stimulating the RF location of the variable stimulus. Thus L2, the excitatory drive from the variable stimulus, is the only term that varies when different stimuli are presented at that RF location. The variable L2 term changes both the selectivity and suppression indices in similar ways.

In model terms, this can be seen as:

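$$\text{selectivity} = \frac{\text{response}_1 - \text{response}_2}{\text{response}_1 + \text{response}_2} = \frac{\dfrac{L_1}{\alpha_1+\sigma} - \dfrac{L_2}{\alpha_2+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma} + \dfrac{L_2}{\alpha_2+\sigma}},$$

where we assume (without loss of generality) that the fixed stimulus at location 1 is the preferred stimulus, i.e. $\frac{L_1}{\alpha_1+\sigma} > \frac{L_2}{\alpha_2+\sigma}$. Also: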

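$$\text{stimulus-induced suppression} = \frac{\text{response}_1 - \text{response}_{1+2}}{\text{response}_1 + \text{response}_{1+2}} = \frac{\dfrac{L_1}{\alpha_1+\sigma} - \dfrac{L_1+L_2}{\alpha_1+\alpha_2+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma} + \dfrac{L_1+L_2}{\alpha_1+\alpha_2+\sigma}},$$

where the fixed stimulus at location 1 is again the preferred stimulus.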

If we present a weak stimulus at the variable location 2, i.e. L2≈ 0, the selectivity and stimulus-induced suppression index both become high:

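$$\text{selectivity} = \frac{\text{response}_1 - 0}{\text{response}_1 + 0} = \frac{\dfrac{L_1}{\alpha_1+\sigma} - \dfrac{0}{\alpha_2+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma} + \dfrac{0}{\alpha_2+\sigma}} = \frac{\dfrac{L_1}{\alpha_1+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma}} = 1,$$

and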

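$$\text{stimulus-induced suppression} = \frac{\text{response}_1 - \text{response}_{1+2}}{\text{response}_1 + \text{response}_{1+2}} = \frac{\dfrac{L_1}{\alpha_1+\sigma} - \dfrac{L_1+0}{\alpha_1+\alpha_2+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma} + \dfrac{L_1+0}{\alpha_1+\alpha_2+\sigma}} = \frac{\dfrac{L_1}{\alpha_1+\sigma} - \dfrac{L_1}{\alpha_1+\alpha_2+\sigma}}{\dfrac{L_1}{\alpha_1+\sigma} + \dfrac{L_1}{\alpha_1+\alpha_2+\sigma}}.$$

The stimulus-induced suppression index is also high because $\text{response}_{1+2} = \frac{L_1+L_2}{\alpha_1+\alpha_2+\sigma}$ is at its lowest (less subtraction in the numerator) when stimulus 2 adds no excitatory drive to $\text{response}_{1+2}$.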

Conversely, adding a more potent stimulus at variable location 2, i.e. increasing L2, decreases the selectivity, because response2 is no longer zero and thus decreases the numerator. Similarly, the stimulus-induced suppression index will also be smaller, because $\text{response}_{1+2} = \frac{L_1+L_2}{\alpha_1+\alpha_2+\sigma}$ increases with increasing L2, which in turn decreases the numerator, $\frac{L_1}{\alpha_1+\sigma} - \frac{L_1+L_2}{\alpha_1+\alpha_2+\sigma}$, of the stimulus-induced suppression index.

Thus both selectivity and stimulus-induced suppression increase with a weaker stimulus at variable position 2, and both selectivity and stimulus-induced suppression decrease with a more potent stimulus at variable position 2. This shows that, within cells, selectivity and stimulus-induced suppression will be correlated when only one of two stimuli is varied. This agrees with the single neuron examples in Reynolds et al.
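The argument above can be checked numerically with a small sketch. The two index functions follow the expressions given in this response; the parameter values (L1 = 1, α1 = α2 = 1, σ = 0.06, and the list of L2 values) are arbitrary illustrations, not fitted values from any recorded neuron.

```python
import numpy as np


def selectivity_index(L1, L2, a1, a2, sigma):
    """(response1 - response2) / (response1 + response2), single-stimulus responses."""
    r1 = L1 / (a1 + sigma)
    r2 = L2 / (a2 + sigma)
    return (r1 - r2) / (r1 + r2)


def suppression_index(L1, L2, a1, a2, sigma):
    """(response1 - response1+2) / (response1 + response1+2)."""
    r1 = L1 / (a1 + sigma)
    r12 = (L1 + L2) / (a1 + a2 + sigma)
    return (r1 - r12) / (r1 + r12)


# Illustrative values only: the stimulus at location 1 is fixed, only L2 varies.
L1, a1, a2, sigma = 1.0, 1.0, 1.0, 0.06
L2_values = np.array([0.0, 0.2, 0.4, 0.6, 0.8])

selectivity = selectivity_index(L1, L2_values, a1, a2, sigma)
suppression = suppression_index(L1, L2_values, a1, a2, sigma)

for L2, sel, sup in zip(L2_values, selectivity, suppression):
    print(f"L2={L2:.1f}  selectivity={sel:.3f}  suppression={sup:.3f}")
```

With L1, α1 and α2 held fixed, both indices fall as L2 grows, which is the within-cell correlation described above.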

Reynolds et al. repeated these measurements for several neurons and included all data points from each neuron in the population scatter plots. Note that Reynolds et al. always positioned their stimuli at approximately equally-responsive RF locations. The RFs of V4 neurons are approximately Gaussian, which means that equally-responsive RF positions are expected to lie at approximately equal (Mahalanobis) distances from the RF center. Following Figure 7, these equally-responsive RF positions, at equally distant RF positions, are expected to have similar alphas. Because Reynolds et al. only sampled RF locations with similar alphas, similar positive relationships between selectivity and suppression would have been observed for each neuron. Given that all neurons had a similar positive relationship between selectivity and suppression, the population data will also display a similar positive relationship.

This information has been added to the Appendix.

Related to this, it would be helpful if the authors actually analyzed the subset of their data corresponding to Reynolds' paper, and showed that they can reproduce the correlation between suppression and selectivity of that paper.

Our experimental design differed substantially from that of Reynolds et al., because when one stimulus was fixed there were few variants of the other stimulus. For example, when presenting a Gabor with one orientation at RF location 1, our design introduced only two different stimuli at RF location 2: a Gabor with the same orientation or a Gabor with an orthogonal orientation as that at location 1. In contrast, Reynolds et al. used 16 different stimulus conditions.

We compared the two stimulus conditions in which the stimulus at one location was kept constant while the stimulus at the other location varied between two orthogonal orientations. Across neurons and stimuli, we found that the average stimulus-induced suppression index of the stimulus condition with greater selectivity was significantly greater than that of the other stimulus condition with the lower selectivity (stimulus-induced suppression index of 0.07 vs. 0.05, p = 3 × 10⁻¹³, paired permutation t-test). Thus despite the limited number of stimulus conditions in our design, we could reproduce the correlation between suppression and selectivity of that paper.

This information has been added to the Appendix.

2) I did not see any response to Reviewer 3's point: "But if alpha1 is always equal to 1, a low alpha2 value actually contributes to larger attentional modulation, i.e. a larger difference between response with attention at location 1 and response with attention at location 2. This would be the opposite of the empirical finding that low suppression results in low attentional modulation." Please address.

We apologize for not responding to this point directly (but see Figure 5—figure supplement 1C) in the previous revision and will do so here. Our findings show that more suppression from the non-preferred stimulus causes more attention modulation. If α1 (= 1) corresponds to the non-preferred stimulus then a low α2 (i.e. the α corresponding to the preferred stimulus) will indeed cause more attention modulation. This is because the non-preferred stimulus has the greatest α, so that when the non-preferred stimulus is added to the preferred stimulus it will decrease the neuronal response (α1 is relatively high, but L1 is low), and attention can amplify or weaken this suppression. If the small α2 corresponds to the non-preferred stimulus, there will instead be little attention modulation. Under these circumstances directing attention to stimulus 1 or stimulus 2 does not modulate the response much, i.e. $R_{1\text{att},2} \approx R_{1,2\text{att}}$:

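$$R_{1\text{att},2} = \frac{\beta L_1 + L_2}{\beta + \alpha_2 + \sigma} \approx \frac{\beta L_1 + 0}{\beta + 0 + \sigma} = \frac{L_1}{1 + \sigma/\beta} \approx \frac{L_1}{1 + 0} = L_1$$

$$R_{1,2\text{att}} = \frac{L_1 + \beta L_2}{1 + \beta\alpha_2 + \sigma} \approx \frac{L_1 + \beta \cdot 0}{1 + \beta \cdot 0 + \sigma} = \frac{L_1}{1 + \sigma} \approx \frac{L_1}{1 + 0} = L_1$$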

Here, we used the fact that the stimulus at position 2 is non-preferred, so L2 ≈ 0. Furthermore, using the fact that alpha2 is small as assumed in the question, α2 ≈ 0. Finally, we used the fact that σ is usually small (median σ = 0.06; see Methods).
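A brief numerical sketch of the two cases discussed in this reply follows. The response expressions are the ones written out above with α1 = 1; the `alpha1` argument is a generalization assumed here for illustration, β = 3 and the other parameter values are made up, and `modulation_index` is a hypothetical normalized difference rather than the attention index used in the paper.

```python
def attended_responses(L1, L2, alpha2, sigma, beta, alpha1=1.0):
    """Responses to a stimulus pair with attention on stimulus 1 or on stimulus 2.

    With alpha1 = 1 these reduce to the expressions in the response letter.
    """
    r_att1 = (beta * L1 + L2) / (beta * alpha1 + alpha2 + sigma)
    r_att2 = (L1 + beta * L2) / (alpha1 + beta * alpha2 + sigma)
    return r_att1, r_att2


def modulation_index(r_a, r_b):
    """Hypothetical index: normalized absolute difference between two responses."""
    return abs(r_a - r_b) / (r_a + r_b)


beta, sigma = 3.0, 0.06  # illustrative values only

# Case 1: the low-alpha stimulus (position 2) is non-preferred: L2 ~ 0, alpha2 ~ 0.
weak = modulation_index(*attended_responses(L1=1.0, L2=0.01, alpha2=0.01,
                                            sigma=sigma, beta=beta))

# Case 2: the high-alpha stimulus (position 1, alpha1 = 1) is non-preferred.
strong = modulation_index(*attended_responses(L1=0.01, L2=1.0, alpha2=0.01,
                                              sigma=sigma, beta=beta))

print(f"low-alpha stimulus non-preferred:  modulation = {weak:.3f}")
print(f"high-alpha stimulus non-preferred: modulation = {strong:.3f}")
```

Under these assumed values, attention barely changes the response in the first case and changes it strongly in the second, in line with the two scenarios described above.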

Reviewer #3:

The authors added an explanatory clause to the main text, but the supporting analyses remain unclear. The reported p-values are for "main effects at 0 selectivity/suppression". What does "at 0" mean? Is it an analysis performed on a subset of cells with no significant selectivity/suppression? Or is it an analysis performed on a row or column (how wide?) from the average plot? If so, how many neurons actually occupied those rows or columns?

We apologize for the confusion. The analyses referred to are based on all data, not a particular selection or subset of the dataset. It is a mathematical property of the linear regression model (in contrast to an ANOVA) with an interaction term that each of the main effects measures the contribution of a variable to the dependent variable, given that the other variable (selectivity or suppression) in the model is zero. In our case, the main effect of selectivity gives the rate of change of attention modulation with selectivity, given that suppression is zero. The main effect of suppression gives the rate of change of attention modulation with suppression, given that selectivity is zero. Although the term "main effects" is typically used to denote these effects, "conditional effects" would probably be a better name.

We have added the following to the Methods (Analyses subsection): "Thus the main effects are not estimated from a particular selection or subset of the dataset, but follow mathematically from the linear regression model with interaction term."
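The property described here can be illustrated with a toy regression. The data below are synthetic and noiseless, the coefficient values are invented for the example, and the fit uses ordinary least squares via NumPy; none of this reproduces the authors' general linear model or their data.

```python
import numpy as np

rng = np.random.default_rng(0)
sel = rng.uniform(0.0, 1.0, 200)  # synthetic selectivity indices
sup = rng.uniform(0.0, 1.0, 200)  # synthetic suppression indices

# Invented ground truth: small main effects, large interaction.
y = 0.1 + 0.2 * sel + 0.3 * sup + 2.0 * sel * sup

# Design matrix: intercept, selectivity, suppression, interaction.
X = np.column_stack([np.ones_like(sel), sel, sup, sel * sup])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_sel, b_sup, b_int = coef


def slope_wrt_selectivity(suppression):
    """Rate of change of y with selectivity at a given suppression value."""
    return b_sel + b_int * suppression


print("main effect of selectivity:", round(b_sel, 3))
print("slope at suppression = 0.0:", round(slope_wrt_selectivity(0.0), 3))
print("slope at suppression = 0.5:", round(slope_wrt_selectivity(0.5), 3))
```

The fitted "main effect" of selectivity equals the slope only where suppression is zero; at any other suppression value the interaction coefficient contributes as well, which is why the main effects are computed from all data points rather than from a subset.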

The other confusing thing here is that p values for significance of main effects are given as though they support the statements about the weak effects, which they don't, in the sense that the p-values are actually most significant for main effects of selectivity, which are being discounted as weak. The discrepancy needs explanation. There must be a more appropriate numerical basis for the statement that both selectivity and suppression are required for strong attention effects.

P-values are of course a different measure than effect sizes and can be very low even for very small effects. As the reviewer notes, the main effect of selectivity in our dataset is a good example of this statement. We included the p-values of the main effects for completeness, although these effects are clearly small (Figure 5). One option would be to mention the regression coefficients in the main text. However, as is known from the statistical regression literature, regression coefficients of main effects become hard to interpret in the context of interactions. This is because the coefficients of the main-effects cannot be interpreted without considering the coefficient of the interaction. Furthermore, the interaction can completely dominate a main effect, because the dependent variable grows non-linearly due to the interaction, as is apparent in our data for high values of the selectivity and suppression indices (upper right corner Figure 5A,B). Thus including the regression coefficients is probably more confusing than elucidating.

We believe that Figure 5A,B gives the reader the best impression of the effect sizes. We support this impression with the regression analysis, which we believe is an appropriate statistical analysis for these purposes in the sense that it captures a statistically significant non-additive trend in the data.

We now write (Materials and methods section, subsection “Model”): "Specifically, Figure 5A and B show that when stimulus-induced suppression is low, attention modulation will be weak, even when attention is shifted between a strong and a weak stimulus (upper left corner in Figure 5A,B). That is, the plots show that the effect of selectivity near zero stimulus-induced suppression is weak, although significant (main effect of selectivity at zero stimulus-induced suppression: cRF-cRF: p = 2 × 10⁻⁶⁴; sRF-cRF: p = 2 × 10⁻⁶⁰; M1: p = 7 × 10⁻¹³⁶ across RF configurations; M2: p = 5 × 10⁻³⁰ across RF configurations)."

In addition, it is hard to reconcile Figure 5A,B, where attention effects are strongly focused near selectivity = 1 (even if you collapse across suppression) with Figure 4A,B, where attention effects are equally strong at selectivity = 0.5. What explains the difference?

Figure 4A,B shows the attention modulation as a function of selectivity, averaged across all data points with different suppression values. Note that there are more data points with a stimulus-induced suppression near zero (see Figure 4C,D,G,H). For a stimulus-induced suppression near zero, attention modulation does not differ much across selectivity (see Figure 5A,B). Because the average in Figure 4A,B is dominated by the majority of the data points, i.e. the data with stimulus-induced suppression near zero, attention modulation does not vary much as a function of selectivity above 0.5. Thus the average plots in Figure 4A,B hide important trends in the data, which only become apparent in Figure 5A,B. The very strong attention effects indeed cluster near selectivity = 1 in Figure 5A,B. This is because selectivity and stimulus-induced suppression interact with each other to determine attention modulation. To visualize this interaction, one needs to consider both variables (selectivity and suppression), as in Figure 5A,B. The interaction causes attention modulation to increase rapidly with increasing values of selectivity and suppression, but fewer points exist near high selectivity and high suppression values (as expected because we could not optimize our stimuli for the majority of the simultaneously recorded neurons).

The authors respond that, in addition to the model, the claim is based on the visual similarity of Figure 5A and 5B, and on "the statistical tests that we performed on these neuronal responses (see the discussion on the general linear model above)". 5A and B do not look entirely similar; in 5A the peak is elongated horizontally, showing that attention is more sensitive to suppression when the distractor is in the receptive field; in 5B the peak is elongated vertically, showing that attention is more sensitive to selectivity when the distractor is in the surround. A direct test of whether the attention effects are equivalent in 5A and 5B would be some kind of two-dimensional Kolmogorov-Smirnov analysis (see, e.g., Lopes, Reid, Hobson, Proceedings of Science, XI International Workshop on Advanced Computing and Analysis Techniques in Physics Research April 23-27 2007 Amsterdam, the Netherlands.) The linear regression is a less direct way of testing the question, and I don't think that it could capture the differences between peak shapes in 5A and 5B, in which case the new Bayes Factor analysis would not be meaningful.

We thank the reviewer for providing us with this interesting article about two-dimensional Kolmogorov-Smirnov tests. This test compares two bivariate distributions. However, it is important to note that Figure 5A and B are not bivariate densities. Instead, Figure 5A,B plots attention modulation, not probability mass, as a function of selectivity and suppression. Thus to compare Figure 5A and B one would require a three-dimensional Kolmogorov-Smirnov test, with selectivity, suppression and attention modulation as variables. Crucially, this test will not give the correct answer to the question of similarity of attention effects between the receptive field and the surround conditions. This is because the test compares the full empirical distributions of both conditions, including whether the distributions of selectivity and stimulus-induced suppression indices differ between both receptive field configurations. We showed that the surround condition has on average higher selectivity values and lower suppression values than the classical receptive field condition (Figure 4C,D,G,H). Thus, even if attention modulation follows the exact same principles in the classical receptive field and the surround, the test will detect differences due to different selectivities and suppressions between the two conditions.

Importantly, Figure 5D,E shows that the spatially-tuned normalization model does capture most of the differences in peak shapes in Figure 5A,B. This model has the same structure for the classical receptive field and the surround data, and only differs between these conditions by the numerical values of its parameters (see Figure 7). The single attention parameter (β) operates in exactly the same way in both receptive field configurations. So although the linear regression analysis is supportive of the claim, the strongest support comes from the very good fits of the attention modulations by the normalization model in both receptive-field configurations.

The authors explain that it is only the relative values between parameters that matter, and support this by writing out a multiplication of all terms by 1/alpha1. This still doesn't make sense to me in that the α values need to be able to go to 0, as in Figure 5/Figure 5—figure supplement 1C, to represent absence of suppression. If suppression due to stimulus 1 is entirely absent (in which case the multiplier 1/alpha1 would be undefined), how would the model capture that? The addition of Figure 5/Figure 5—figure supplement 1 is certainly helpful in understanding the operation of the model, but it doesn't solve this confusion for me.

Values of zero are well-approximated by very small values. In fact, the value of alpha2 in Figure 5/Figure 5—figure supplement 1C is not exactly zero, but 3 × 10⁻¹⁶. This is where the optimization algorithm stopped, because smaller values of alpha2 resulted in negligible improvements of the error function.

If suppression due to stimulus 1 is entirely absent, the algorithm increases the value of alpha2 so that the ratio alpha1/alpha2 becomes very small. So here, too, the relative values of the parameters matter. Specifically, the α of stimulus 1 is very small compared to that of stimulus 2.
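Both points can be illustrated with the pair-response expression used earlier in this response, (L1 + L2)/(α1 + α2 + σ). The sketch assumes that "all terms" includes σ when the parameters are rescaled by 1/α1, and the parameter values are arbitrary.

```python
import numpy as np


def pair_response(L1, L2, a1, a2, sigma):
    """Response to two stimuli presented together (no attention term)."""
    return (L1 + L2) / (a1 + a2 + sigma)


# Arbitrary illustrative parameters with alpha1 different from 1.
base = dict(L1=1.0, L2=0.5, a1=0.4, a2=0.8, sigma=0.06)

# Rescale every term by 1/alpha1 (assumption: sigma is rescaled too).
scale = 1.0 / base["a1"]
rescaled = {name: value * scale for name, value in base.items()}

r_base = pair_response(**base)
r_rescaled = pair_response(**rescaled)

# A suppressive drive of 3e-16 is numerically indistinguishable from zero.
r_tiny = pair_response(1.0, 0.5, 1.0, 3e-16, 0.06)
r_zero = pair_response(1.0, 0.5, 1.0, 0.0, 0.06)

print(r_base, r_rescaled)
print(r_tiny, r_zero)
```

The rescaled parameter set has α1 = 1 and yields the same response, so only the ratios among the terms are constrained by the data.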

In their response, the authors provide an extended explanation of how α values can be balanced out in either direction by L values, concluding that "in general, there exists no direct relationship between the suppression-index in Figure 4C,D and the α term." To the extent this is true, Figure 7 and the accompanying legend are highly confusing, since they equate α with suppression in the labels and in the text. The fact that the new supplementary example with 0 suppression also has a 0 α value will tend to add to this confusion. Since they are plotting α in Figure 7, the authors need to explain what it means and how it relates (if at all) to suppression, not just to reviewers but also to readers.

We thank the reviewer for pointing this out. We now write in the Methods: "Note that the stimulus-induced suppression index is distinct from the α terms in the model. This is because the stimulus-induced suppression index is based on the observed neuronal responses, which comprise both an α and L term (i.e. response = L/(α+σ)). In terms of the model parameters, the stimulus-induced suppression index is given by:

$$\text{stimulus-induced suppression index} = \frac{\text{response}_P - \text{response}_{P+N}}{\text{response}_P + \text{response}_{P+N}} = \frac{\dfrac{L_P}{\alpha_P+\sigma} - \dfrac{L_P+L_N}{\alpha_P+\alpha_N+\sigma}}{\dfrac{L_P}{\alpha_P+\sigma} + \dfrac{L_P+L_N}{\alpha_P+\alpha_N+\sigma}},$$

where $L_P$ is the excitatory drive from the preferred component Gabor of a Gabor pair, $L_N$ is the excitatory drive from the non-preferred component Gabor, $\alpha_P$ is the suppressive drive from the preferred component Gabor, and $\alpha_N$ is the suppressive drive from the non-preferred component Gabor. So the stimulus-induced suppression index depends on both the excitatory and suppressive drive from the stimulus."

We also changed the legend of Figure 7 and write "Suppressive drive" and "Excitatory drive" instead of "Suppression" and "Excitation".

The authors correctly point out that the models must not be failing to capture attention effects given the close approximation to population level effects in Figure 4. And, they reiterate that removing the attention term reduces overall explained variance from 87% to 79%. My main point had simply been that these were both extremely high values, and this is due to the fact that the number of terms in the model is on the same order as the number of conditions, so a close fit is not surprising or informative in either case. The cross-validation does not change this because, given the low number of conditions and the description now given in methods, it amounts to simply splitting repetitions from identical trial types into two groups and proving they give the same result. I don't think the 87% explained variance with the full model or the 8% drop in explained variance by themselves establish that "attention acts uniformly across the cRF and the surround". (The exceedingly close fits are due to the similar number of terms and conditions; the 8% drop shows that the attention term mattered, but it doesn't address how exactly attention behaved.) Nor, as explained above, do I think this is established by visual similarity between Figure 5A and B or the linear regression analysis. As stated above, I think the direct test of this proposition would be a statistical comparison of 5A and 5B. Regardless of the result, the clear and more interesting result here is the criticality of the interaction between selectivity and suppression for attention.

As the reviewer correctly points out, a substantial amount of variance remains explained because of the variance that arises from stimulus differences. This is also because the responses of many neurons were little affected by attention (see neurons with weak selectivity and suppression in Figure 5A and B), producing small decreases in explained variance when leaving out the β parameter. A smaller fraction of neurons were strongly affected by attention (see neurons with strong selectivity and suppression in Figure 5A and B) and a model with no β parameter explained up to 50% of the variance less than a model with β parameter. If one were to select only highly selective neurons that are strongly suppressed by nearby stimuli, the average percentage of explained variance by the β parameter would be much higher. However, we could not optimize our stimuli for each of the simultaneously recorded neurons.

Our results show that the magnitude of attention modulation, and thus how much of the response variance is explained by attention and β, varies with suppression and selectivity. Thus the critical test of the model lies in its ability to explain the full range of attention modulations across the range of observed suppression and selectivity. We show that the model does an excellent job of fitting the observed attention modulations across the full range of selectivity and suppression values and in different stimulus and receptive field conditions. The model explains all these attention effects using a single attention parameter, i.e. β, that interacts with the spatial summation or normalization mechanisms of neurons. This β parameter can even be fixed across neurons and receptive field configurations, without appreciably affecting the average percentage explained variance. We believe this is a remarkable accomplishment of the model.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Attention operates uniformly throughout the Classical Receptive Field and the Surround" for further consideration at eLife. Your revised article has been favorably evaluated by David Van Essen (Senior editor) and the Reviewing editor.

The manuscript has been improved and we greatly appreciate the detailed, equation-based clarifications in response to several of the points raised by the reviewers.

However, the reviewers remain concerned whether the authors have really provided evidence that attention acts uniformly throughout the receptive field. The reviewers would like clarification of the following points:

1) Is there any reason that 5A and 5B should look identical? In the initial reviewer response, the authors stated, "We would like to emphasize that the claim that attention acts in a uniform way on all cRF and surround positions is supported by more than just the model that accurately accounts for all neuronal responses, irrespective of the receptive field position and attention condition. In particular, this uniformity can be seen directly in the similarity of the neuronal responses in Figure 5A and B, which show that the dependency of attention modulation on selectivity and suppression does not depend on whether stimuli are presented in the cRF or the surround." The reviewers see a **clear difference between 5A and B**. This seems to provide evidence that attention is more critically dependent on selectivity in the CRF-CRF condition, and more critically dependent on suppression in the CRF-SRF condition; quite the opposite of operating uniformly. The reviewers are confused whether the authors have a hypothesis or explanation for why the dependency of attention on suppression and selectivity should differ between CRF (5A) and SRF (5B); it seems that the difference is contrary to the authors' conclusion. If they do have a hypothesis about the difference, please state it and test it, and modify the "uniform operation" conclusion accordingly.

In pointing out the striking similarity between Figure 5A and 5B, we did not mean to claim that receptive field and surround mechanisms were identical in every detail, or that CRF and surround phenomena depend on exactly the same circuitry (which seems unlikely), or that there is pixel-by-pixel identity between the two plots. We meant to emphasize that in both cases attention-related modulation depends critically on a combination of selectivity and suppression in a quantitatively similar way in both the CRF and surround. We stand by that claim, and emphasize that this powerful first-order similarity has never before been reported in the literature.

It is conceivable that there are some quantitative differences in the roles of selectivity and suppression between the receptive field and the surround, such as those described by the reviewers. Alternatively, the small second-order differences in Figure 5 might depend, at least partially, on sampling noise, as suggested by the statistical tests (see below). We will continue to work on refining our models as we collect data in the future, but differences of that sort will not change the first-order observations we describe here and are well beyond the scope of the current report.

We have modified the text to make the specific claims clearer (sixth paragraph of Discussion): "Suppression and excitation may rely on distinct mechanisms in different regions of the receptive field (31). Our data do not pertain to these different mechanisms and we may have missed some small differences in attention modulation associated with these distinct mechanisms. Nonetheless, our findings show that the way attention interacts with excitation and suppression across different regions of the receptive field is remarkably similar."

2) In the most recent reviewer response, the authors seem to be backing away from claim that 5A and B need to look similar, and are appealing to the similarity between the model fits (5D,E) and the data (5A, B). The reviewers remain concerned that the reason why the model fits assuming uniform attention are so good is that all of the non-attention related terms (separate excitatory term for each stimulus type/location combination and a separate suppression value for each location) are doing all the work. Since the number of terms being fitted is nearly equal to the number of measurements, this guarantees over fitting, and it can't be cured by cross-validation because every condition would need to be in both portions of the data to fit all the stimulus- and position-specific terms.

It would seem advisable to include an explicit caveat about this.

Please note that the non-attention related terms are not doing all the work in the model. Without the attention term (β), the model predicts no attention modulation. Consequently, without the attention term Figure 5D, E would be uniformly zero, in clear contradiction to the data. Hence the attention term is critical in explaining all attention modulations. Thus in addition to the non-attention terms, the attention term performs crucial work.

Please also note that our model has nine free parameters (constant β) that account for 36 measurements in each neuron (see Methods). Thus the number of terms (9) is not nearly equal to the number of measurements (36).

Finally, we point out that cross-validation does in fact counteract the adverse effects of over fitting. Cross validation gives a nearly unbiased estimate of a model's performance on the validating data. In particular, a model with too many terms is expected to perform worse on cross-validation because the fits on one data half would include fitted noise (i.e. over fitting) and this fitted noise is independent of the noise in the validating data, causing lower explained variances. Thus with too many model terms the cross validation would punish, not improve, the goodness-of-fit measures.

3) Most importantly, the obvious way to test their claim is some kind of direct comparison of attentional modulation between CRF and SRF. Can the authors do this?

We provide several such direct comparisons between attention modulations in the CRF and the surround.

First, there is no significant interaction between RF configuration (CRF and SRF) and the effects of selectivity and suppression on attention modulation (p ≥ 0.6; Results section, subsection “Stimulus selectivity and stimulus-induced suppression interact in determining attention modulation and do so similarly inside the cRF and the surround”). In other words, attention modulation as a function of selectivity and suppression does not differ significantly between the cRF and the surround. Note that this is a direct test of the hypothesis that attention modulation as a function of selectivity and suppression is the same in the cRF and surround, without appeal to goodness of model fits. We emphasize that classical statistical tests (i.e. direct tests) are not designed to test the veracity of a null-hypothesis such as, in our case, identical attention modulation in the CRF and the surround: a non-significant p-value provides little information about the truthfulness of the null hypothesis. This is why we also included a Bayesian test, which does not suffer from this drawback.

Second, the Bayesian analysis showed that the data are much (347 times) more likely to come from a model in which one does not distinguish between RF configurations. According to this Bayesian analysis, adding extra parameters to distinguish between the CRF and SRF would be spurious and would cause overfitting. Thus this Bayesian analysis indicates that attention modulation as a function of selectivity and suppression is the same in the cRF and surround.
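To give a sense of scale for the factor of 347, the sketch below converts a Bayes factor into a posterior model probability. It assumes equal prior odds for the two models, which is an assumption made here for illustration and is not stated in the text; the function name is hypothetical.

```python
def posterior_probability(bayes_factor, prior_odds=1.0):
    """Posterior probability of the favored model.

    posterior odds = Bayes factor * prior odds; prior_odds=1.0 means the
    two models are considered equally plausible beforehand (an assumption).
    """
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Bayes factor of 347 in favor of the model without separate
    # CRF/surround parameters, as reported above.
    print(f"{posterior_probability(347.0):.4f}")  # 347/348, prints 0.9971
```

Under that assumption, the model that does not distinguish between RF configurations would have a posterior probability of 347/348, or about 99.7%.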

Third, the spatially-tuned normalization model, which has no extra parameters to distinguish between RF configurations, successfully accounts for all data in the CRF and the surround.

In empirical sciences, such as neuroscience, one cannot prove statements; one can only provide evidence in favor of a hypothesis. The three above-mentioned analyses all point in the same direction, namely that it is most probable and parsimonious to conclude that attention modulation operates uniformly across the CRF and SRF.

4) Have the authors tried to fit other models with fewer non-attention related variables (e.g., keeping suppression constant)? How does this affect the attention term?

Overall, the reviewers would be most convinced by a new analysis directly showing that attentional modulation as a function of selectivity and suppression is the same in the cRF and surround, without appeal to goodness of model fits. Without such an analysis, it remains unclear whether the authors have identified a truth about the brain, that attention acts uniformly, or whether they have simply shown that a viable model of neurons can be constructed in which attention acts uniformly, but other models with a variable attention term are equally plausible.

We have explored other models. Two of these analyses were included in the previous versions of the manuscript. In the current version we have also added an analysis in which we fitted a model with constant excitatory terms.

In the Results section, subsection “Spatial variability in excitation and suppression underlies differences in attention modulation across neurons”, we describe the model fits with constant suppression terms. We repeat the conclusion here: a model with constant suppression terms, but with a free attention term, fits the data significantly worse. This analysis, together with the spatially-tuned suppression shown in Figure 7 and findings from previous studies, demonstrates that a separate suppression term for each spatial location was necessary to account for the observed neuronal responses.

In the “Model” subsection of the Materials and methods section, we describe the model fits with no σ term. We repeat the conclusion here: a model with no σ term fits the data significantly worse. We therefore included a σ term in the model.

We have added the results from a new analysis in which we fit for each neuron a model with only one free excitatory (L) term to capture excitation across all stimulus conditions. This model with one L term performed significantly worse at explaining neuronal responses (median two-fold cross-validated percentage explained variance 48%, compared to 87% for the model with all L terms; p < 0.0001; sequential F-test).
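For readers unfamiliar with the sequential F-test for nested models, the following is a minimal sketch of the statistic. It is not the analysis code used for the manuscript. The residual sums of squares and the parameter count of the reduced model are hypothetical values chosen for illustration; only the nine free parameters of the full model and the 36 measurements per neuron come from the text.

```python
def sequential_f_statistic(rss_reduced, rss_full, p_reduced, p_full, n_obs):
    """F statistic comparing a reduced model nested within a full model.

    F = ((RSS_reduced - RSS_full) / (p_full - p_reduced))
        / (RSS_full / (n_obs - p_full))

    Returns the statistic with its numerator and denominator degrees
    of freedom.
    """
    if not (0 < p_reduced < p_full < n_obs):
        raise ValueError("require 0 < p_reduced < p_full < n_obs")
    df_num = p_full - p_reduced
    df_den = n_obs - p_full
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den


if __name__ == "__main__":
    # Hypothetical residual sums of squares for one neuron; p_reduced=5
    # is an assumed parameter count for the model with a single L term.
    f_stat, df_num, df_den = sequential_f_statistic(
        rss_reduced=52.0, rss_full=13.0, p_reduced=5, p_full=9, n_obs=36
    )
    print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```

A p-value would follow from the upper tail of the F distribution with these degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available.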

Taken together, these analyses show that all non-attention terms were necessary to account for the observed neuronal responses.

When fitting models in which some of these non-attention terms are omitted, the attention term may stay unchanged in some cases, i.e. for neurons in which the omitted non-attention term happened to be relatively unimportant given the stimulus conditions. In other (most) cases it will change, as it tries to compensate for the detrimental effects of omitting necessary non-attention terms. In any case, the changes in the attention term depend on a complex interaction between the specific neuron, the stimulus conditions and the type of the omitted parameter. Importantly, these changes in the attention term are meaningless, because the attention term now tries to capture some of the effects of an omitted non-attention term and so loses its relationship to attention per se.


Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
