Author manuscript; available in PMC: 2019 Jul 9.
Published in final edited form as: Curr Biol. 2018 Jun 18;28(13):2058–2069.e4. doi: 10.1016/j.cub.2018.04.092

Encoding of target detection during visual search by single neurons in the human brain

Shuo Wang 1,2, Adam N Mamelak 3, Ralph Adolphs 2,5, Ueli Rutishauser 3,4,5
PMCID: PMC6445637  NIHMSID: NIHMS1516799  PMID: 29910078

Summary

Neurons in the primate medial temporal lobe (MTL) respond selectively to visual categories such as faces, contributing to how the brain represents stimulus meaning. However, it remains unknown whether MTL neurons continue to encode stimulus meaning when it changes flexibly as a function of variable task demands imposed by goal-directed behavior. While classically associated with long-term memory, recent lesion and neuroimaging studies show that the MTL also contributes critically to the online guidance of goal-directed behaviors such as visual search. Do such tasks modulate responses of neurons in the MTL, and if so, do their responses mirror bottom-up input from visual cortices or do they reflect more abstract goal-directed properties? To answer these questions, we performed concurrent recordings of eye movements and single neurons in the MTL and medial frontal cortex (MFC) in human neurosurgical patients performing a memory-guided visual search task. We identified a distinct population of target-selective neurons in both the MTL and MFC whose response signaled whether the currently fixated stimulus was a target or distractor. This target-selective response was invariant to visual category, and predicted whether a target was detected or missed behaviorally during a given fixation. The response latencies, relative to fixation onset, of MFC target-selective neurons preceded those in the MTL by ~200ms, suggesting a frontal origin for the target signal. The human MTL thus represents not only fixed stimulus identity, but also task-specified stimulus relevance due to top-down goal-relevance.

Keywords: Human single neuron, Visual search, Medial temporal lobe, Medial frontal cortex, Goal relevance

Introduction

Goal-directed behaviors such as visual search depend on holding in mind a desired outcome while deploying sequences of actions in order to reach the goal [1–4]. Search depends on top-down modulation of bottom-up visual processes to detect whether the current location contains the target or not [2, 5–12]. While classically investigated for its role in long-term memory [13], the MTL is also involved in mediating ongoing behaviors by facilitating visual and spatial working memory and scene integration processes [14–18]. For example, patients with MTL lesions are impaired in visual search due to an inability to organize exploration patterns and properly maintain a representation of the search target [17, 18]. Similarly, such patients are impaired in oddity-judgment tasks that require the inspection of complex scenes [19]. A key contribution of the MTL to ongoing behavior is through its support of visual and spatial working memory [15, 16, 20, 21], which is essential in visual search to maintain the representation of the target and to organize an efficient search without revisits. In the presence of distractors (which in our task are fixations on non-targets), patients without a functional MTL are impaired in some working memory tasks [14, 16, 22–24]. During search, this manifests in more revisits of the target [17]. Similarly, evidence from neuroimaging [20, 25] and intracranial recordings [21, 26] indicates that persistent activity within the MTL supports visual working memory in such tasks. Together, this body of work shows that the MTL, and in particular the hippocampus, supports goal-directed behavior. However, it remains largely unknown how such modulation by behaviorally relevant goals is implemented.

In higher-order visual cortex, the response of neurons is modulated by whether the preferred stimulus of a neuron is currently a target of a search or not [9, 12, 27]. This modulation in visual cortex is in addition to a neuron’s visual tuning, such that a cell’s response is informative about the presence or absence of only a specific target [2, 11, 12, 28]. Furthermore, this neuronal target response predicts whether a search target will be missed [2, 28]. The origin of such modulation is thought to be primarily the frontal cortex [4, 29–34]. However, higher-level visual responses in macaques are also modulated by the MTL in tasks requiring memory [35, 36]. Despite this observation, the role of the MTL in search has remained unclear because no previous study has obtained single-neuron recordings from this region during memory-guided visual search.

We used simultaneous single-neuron recordings and eye tracking performed in neurosurgical patients to investigate how signals related to target detection interact with visual sensory signals in the human MTL during active search. Subjects were instructed to locate a visual target shown among 23 other stimuli in a search array. The target changed trial-by-trial, requiring rapid implementation of changing goals. We used a range of stimuli, including objects and faces, as search targets [37]. This approach enabled us to identify category-tuned neurons [38–40] and to test whether the response of these neurons was modulated by target relevance.

Results

Task and behavior

Patients (n = 8, 11 sessions; Table S1) performed a visual search task. In each trial, subjects reported by button press whether the search target was present or absent (Figure 1A,B; 80% of all trials contained the target). The search target (cue) changed for every trial and was displayed for 1s immediately preceding the search array. Therefore, patients had to keep the target item in working memory throughout the search. Patients performed well with an average reporting accuracy of 92.0±7.52% (mean±SD across sessions; 91.4±8.47% for target-present trials and 94.5±6.02% for target-absent trials, two-tailed paired t-test, P = 0.18). Considering only correct trials, the average reaction time (RT, relative to search array onset) for target-present trials was significantly shorter than for target-absent trials (1.92±0.76s vs. 3.96±1.56s, two-tailed paired t-test, P < 10⁻⁴; Figure 1C). The accuracy and RT of the patients were comparable to those of healthy participants who had performed the same task in a previous study [37], confirming that our patients’ behavioral performance was not impaired (two-tailed two-sample t-test, Ps = 0.17 and 0.11 for accuracy and RT, respectively). Consistent with prior reports [41], dwell times of fixations on targets (384±88.4ms) were significantly longer than those on distractors (302±63.0ms; paired two-tailed t-test, t(10) = 2.72, P = 0.022; Figure 1D; see STAR Methods for more fixation analysis).

Figure 1. Task and Behavior.


(A) Task structure. The search cue is shown for 1s, immediately followed by the search array. Subjects are instructed to indicate by button press whether the target is present or absent (timeout 14s). Trial-by-trial feedback is given immediately after button press (‘Correct’, ‘Incorrect’, or ‘Time Out’), followed by a blank screen for 1–2s. (B-C) Example visual search arrays with fixations indicated. (B) Sample stimuli. Each circle represents a fixation. Green circle: first fixation. Magenta circle: last fixation. Yellow line: saccades. Blue dot: raw gaze position. Red box: target. (C) Boxplot of reaction time. TP: target-present. TA: target-absent. (D) Boxplot of fixation duration. T: fixations on targets. D: fixations on distractors. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points that are not considered outliers, and the outliers are plotted individually.

Electrophysiology

We recorded in total 317 single neurons in the human amygdala and hippocampus. 228 neurons had an average firing rate >0.2Hz and we restricted our analysis to this subset. In the analyses below, we pool neurons from both areas and commonly refer to them as “MTL neurons”. The words cell and neuron are used interchangeably.

Target-selective (TS) neurons in the MTL

To investigate the neural substrates of target detection, we aligned neuronal responses at fixation onset and compared the response of each neuron between fixations on distractors and targets. We first analyzed responses to all fixations onto targets, whether they were detected or not. As the response, we used the mean firing rate in a time window starting 200ms before fixation onset and ending 200ms after fixation offset (next saccade onset). The duration of this window was on average 690±45.0ms (mean±SD across sessions). 50 cells (21.9%; binomial P < 10⁻²⁰; 19 and 31 from the amygdala and hippocampus, respectively) had a response that differed significantly between fixations on targets vs. distractors in target-present trials (two-tailed t-test, P < 0.05). We identified two types of such “target-selective” (TS) neurons: one type had a greater response to targets relative to distractors (target-preferring; 27/50 cells, Figure 2A-C) and the second had a greater response to distractors relative to targets (distractor-preferring; 23/50, Figure 2D,E; note that these distractor-preferring cells still carried information about targets; see STAR Methods). The proportion of TS cells identified was highly significant (permutation test, P < 0.001): a control permutation test shuffling the labels of targets and distractors revealed 11.43±3.34 neurons expected by chance (mean±SD; 6.44±2.51 target-preferring and 5.00±2.17 distractor-preferring). This result demonstrates that a subset of MTL neurons encodes whether the present fixation landed on a target or not (see Figure S3 for example neuronal responses aligned at button-press).
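The label-shuffling control described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the per-neuron test uses a normal approximation to the two-sample statistic instead of an exact t-test, and the function names, the data layout (one list of per-fixation firing rates plus target/distractor labels per neuron), and the number of permutations are all assumptions made for the example. As a sanity check on the reported chance level, at α = 0.05 about 228 × 0.05 = 11.4 neurons would be expected to pass the selection by chance.

```python
import math
import random
from statistics import mean, variance

def two_sample_p(a, b):
    """Two-tailed p-value for a difference in means (Welch statistic with a
    normal approximation; an assumption that is adequate for many fixations)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = abs(mean(a) - mean(b)) / se
    return math.erfc(z / math.sqrt(2))

def count_selective(neurons, alpha=0.05):
    """neurons: list of (rates, is_target) pairs, one pair per neuron.
    Returns how many neurons differ between target and distractor fixations."""
    n_selective = 0
    for rates, is_target in neurons:
        targets = [r for r, lab in zip(rates, is_target) if lab]
        distractors = [r for r, lab in zip(rates, is_target) if not lab]
        if two_sample_p(targets, distractors) < alpha:
            n_selective += 1
    return n_selective

def chance_counts(neurons, n_perm=200, seed=0):
    """Null distribution of the selective-cell count under label shuffling."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        shuffled = []
        for rates, is_target in neurons:
            labels = list(is_target)
            rng.shuffle(labels)  # break the link between rate and label
            shuffled.append((rates, labels))
        counts.append(count_selective(shuffled))
    return counts

def permutation_p(observed, null_counts):
    """One-sided permutation p-value with the usual +1 correction."""
    exceed = sum(c >= observed for c in null_counts)
    return (exceed + 1) / (len(null_counts) + 1)
```

The observed count from `count_selective` on the real labels would be compared against `chance_counts` with `permutation_p`; the mean and SD of the null counts correspond to the "expected by chance" figures quoted in the text.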

Figure 2. Target-selective (TS) cells signal top-down goal relevance but are not visually tuned.


(A-C) Neurons that increase their firing rate when fixating on targets, but not distractors (selection by two-tailed t-test in a time window of −200ms before fixation onset to 200ms after fixation offset: all Ps < 10⁻¹¹). (D-E) Neurons that decrease their firing rate when fixating on targets but not distractors (all Ps < 10⁻³). Fixations are sorted by fixation duration (black line shows start of the next saccade). t=0 is fixation onset. Asterisk indicates a significant difference between fixations on targets and distractors in that bin (P < 0.05, two-tailed t-test, Bonferroni-corrected; bin size = 50 ms). (F-G) Mean response to fixations on each visual object as a function of whether the object is currently a target or a distractor (two example neurons are shown). Each data point is a different object. Gray bars (upper right) show the number of data points above and below the diagonal. In these two examples, the response to most objects was stronger when the object was a target (sign-test: (F) P = 1.40 × 10⁻⁵; (G) P = 5.35 × 10⁻⁴), indicating that the response was predominantly modulated trial-by-trial by target relevance rather than visual identity. Note that these two example neurons are the same as in (B and C). (H-K) Single-fixation analysis using the TSI. (H) Shown is the cumulative distribution of the single-fixation response of fixation-aligned target- and distractor-preferring neurons for fixations on targets and distractors (n = 50 neurons). (I) TSI summary. TS: target-selective cells. NTS: non-TS cells. Error bars denote one SEM across cells. Chance performance was calculated by permutation tests by shuffling the label of target and distractor. Error bars denote one SD across permutations. KS test was used to compare TS cells with non-TS cells and permutation test was used to compare against chance performance. Asterisk indicates a significant difference. ***: P < 0.001. n.s.: not significant. (J) Detected vs. missed targets. FDetect: detected targets. FMiss: missed targets. D: distractors. (K) Temporal response characteristics of TS neurons. FAll: all fixations on targets. FFirst: the first fixation on target. DBefore: fixations on distractors before the first fixation on target. DAfter: fixations on distractors after the first fixation on target. Asterisk indicates a significant difference between fixation types. ***: P < 0.001. See also Figure S1, S2, S3, and S4.

To assess whether the response of TS cells could be attributed to target detection alone, or in addition also encoded visual stimulus category, we next compared their response between fixations on identical objects as a function of whether the fixated object was a target or a distractor in that trial. This comparison revealed that the TS neurons that increased their activity for targets did so even when comparing target and distractor fixations that landed on visually identical objects (n = 27 neurons, two-tailed paired t-test on difference in firing rate against 0: P = 0.0019; Figure 2F,G show two example neurons, both P < 0.001, sign-test). Similarly, the distractor-preferring TS neurons increased their firing rate to distractors compared to targets when only considering fixations on identical visual objects (n = 23, two-tailed paired t-test on difference in firing rate against 0: P < 0.001). Because in this comparison the visual category of the stimuli is identical, this result indicates that the differential response of TS neurons must be driven by target detection alone.

TS neurons predict behavioral target detection

We next investigated the relationship between the response of TS cells and behavior by quantifying the response of TS cells during individual fixations using a target-selectivity index (“TSI”). The TSI is equal to the neuronal response during an on-target fixation, normalized by the average response during fixations on distractors (see Eq. 1,2 and STAR Methods). As expected, the TSI for TS cells was significantly larger during fixations on targets compared to fixations on distractors (n = 50, two-tailed two-sample Kolmogorov–Smirnov (KS) test, P < 10⁻¹²²; Figure 2H and Figure S4C; this finding is also true for target- and distractor-preferring neurons considered separately, see Figure S4D-I). This result confirms that the single-fixation response of TS cells is strong enough for single-fixation analysis (also see Figure S4A,B for ROC analysis). Permutation tests by shuffling the label of target and distractor further confirmed our results: TS cells (30.6±17.3%, mean±SD across cells; Figure 2I) had a significantly higher TSI compared to chance (8.68±1.09%, P < 0.001), whereas the TSI of all non-TS cells (12.1±10.8%; Figure 2I) did not differ from chance (12.3±0.83%, P = 0.60).
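To make the single-fixation index concrete, here is a hedged sketch of how such an index could be computed. Eq. 1,2 are given in STAR Methods and are not reproduced in this section, so the exact form below is an assumption: the deviation of one fixation's firing rate from the mean distractor rate is expressed as a percentage of a reference rate, and the sign is flipped for distractor-preferring cells so that larger values always mean a stronger response in the cell's preferred direction. The function name `tsi`, the `reference_rate` parameter, and the sign convention are illustrative choices, not the paper's definitions.

```python
from statistics import mean

def tsi(fix_rate, distractor_rates, reference_rate, target_preferring=True):
    """Single-fixation target-selectivity index in percent (assumed form).

    fix_rate          firing rate during one fixation (Hz)
    distractor_rates  firing rates during all distractor fixations (Hz)
    reference_rate    rate used for normalization (placeholder; which rate
                      Eq. 1,2 use is not stated in this section)
    """
    deviation = fix_rate - mean(distractor_rates)
    if not target_preferring:
        deviation = -deviation  # assumed sign flip for distractor-preferring cells
    return 100.0 * deviation / reference_rate
```

Under this form, the index averaged over all distractor fixations of a cell is zero by construction, which would be consistent with the distractor value of 0.0±0.0% that the text reports "by default".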

We first tested whether TS neurons responded differently to targets as a function of whether the target was detected or not by the subject (detected vs. missed, where misses were target fixations for which patients failed to press the target-present button immediately; see STAR Methods). TS neurons had a significantly higher response to targets that were detected relative to those that were missed (TSI: 31.1±18.6% vs. 16.9±79.5%, mean±SD across cells, KS-test, P < 0.001; Figure 2J; see [28] for a similar analysis in macaques). Nevertheless, the response of TS neurons to missed targets was significantly larger than that to distractors (TSI: 16.9±79.5% vs. 0.0±0.0% (by default); KS-test, P < 10⁻⁷; Figure 2J). Thus, TS cells distinguished targets from distractors even when targets were not behaviorally detected by button press. However, the maximal response of TS cells was only evoked if an on-target fixation coincided with behavioral detection of the target, suggesting that the target detection signal may be graded, from a strength insufficient for behavioral choice, to a strength sufficient for the behavioral choice and typically accompanied by conscious recognition of the target.

Temporal response characteristics of TS neurons

Above, we considered all fixations on distractors regardless of whether they occurred before or after the target had been fixated intermittently. However, in a subset of 25.3±19.1% of trials, subjects continued to fixate on at least one of the distractors after they first fixated on the target (note that above, fewer fixations were considered as misses because we required at least 3 fixations before button press). We next used these trials to explore how TS neurons responded to targets and distractors as a consequence of fixating on targets. As expected, TS neurons had a significantly higher TSI when comparing all target fixations (30.6±17.3%, mean±SD across cells) to the subset of distractor fixations that preceded the first target fixation (−0.34±4.56%; KS-test: P < 10⁻²⁰; Figure 2K). This was also true when only considering the first target fixation in each trial compared to the distractor fixations that preceded this first target fixation (26.2±16.0% vs. −0.34±4.56%; KS-test: P < 10⁻²⁰; Figure 2K). On the other hand, when only considering distractor fixations that occurred after the first fixation on a target (21.0±36.4%), TS neurons still showed a significantly higher TSI for all targets (30.6±17.3%; KS-test: P < 0.001; Figure 2K). Thus, the response of TS cells remained sensitive to the type of stimulus fixated (target or distractor) even after the target had already been fixated. However, the TSI for distractor fixations increased significantly for those that were fixated after the target was already fixated at least once compared to those before the first on-target fixation (KS-test: P < 10⁻⁹; Figure 2K). This indicates that despite subjects not stopping the search, the presence of the target did influence the response of TS cells immediately following the first fixation on the target, which here resulted in TS cells responding to some degree also to distractor fixations that followed the first on-target fixation.

Category cells are modulated by target detection

We next asked whether the previously described category-selective cells in the MTL [39] might also be modulated by detection of the target. We identified two types of category cells: face-selective and general visual category-selective responses. We tested separately for face-selective cells because of their prominence in the MTL [42]. We then proceeded to compare the response of both types of category cells between fixations on targets and distractors to determine whether their response discriminated targets from distractors, in addition to visual categories.

To identify face-selective neurons, we compared the response between face and non-face stimuli during cue presentation. 39 neurons (17.1%; P < 10⁻¹¹) showed a significant response difference (31/39 increased their firing rate for faces compared to non-faces, whereas 8/39 increased their firing rate for non-faces compared to faces but did not further distinguish non-face categories; see Figure 3A-E and Figure S5 for examples and Figure 3F,G for group average). To select category-selective neurons, we grouped the objects into 14 visual categories (see STAR Methods). The response of 32 neurons (14.0%; P < 10⁻⁷) co-varied significantly as a function of visual category during cue presentation (one-way ANOVA of 14 categories, P < 0.05). Figure 3H shows the ordered responses from the best to the worst stimulus for these neurons. Compared to the unselected neurons, category-selective neurons showed steeper changes from the best to the worst stimulus, a difference that was significant at all stimulus levels (two-tailed two-sample t-test, all Ps < 0.022; FDR corrected) except the best and worst stimuli (Figure 3H; similar results were obtained for amygdala and hippocampal neurons separately, see Figure 3I).

Figure 3. Category cells are fixation sensitive and visually tuned during search and are modulated by top-down goal relevance.


(A-C) Example neurons that increased firing rate to face cues (selection by two-tailed t-test during search cue presentation: P < 10⁻³). (D-E) Example neurons that increased firing rate for non-face cues (P < 10⁻⁴). Each raster (upper) and PSTH (lower) is shown with color coding as indicated. Trials are aligned to cue presentation (gray lines). Trials within each category are sorted according to reaction time (black line). Waveforms for each unit are shown in the raster plot. In the PSTH plot, asterisk indicates a significant difference between the response to face-target trials and non-face-target trials in that bin (P < 0.05, two-tailed t-test, Bonferroni-corrected; bin size = 250 ms). Shaded area denotes ±SEM across trials. (F-G) Cue-aligned average PSTH of all neurons that significantly increased firing rate for face cues (F) and non-face cues (G), respectively. (H-I) Object category selectivity tuning showing ordered average responses from the best to the worst stimulus. Neurons with category selectivity were identified by one-way ANOVA across 14 object categories (P < 0.05). Neurons that did not reach statistical significance (unselected neurons) were shown for comparison purposes. Responses were normalized by the response to the best stimulus. Asterisks in (H) indicate significant difference between selected and unselected neurons: +: P < 0.1, *: P < 0.05, **: P < 0.01, and ***: P < 0.001. (I) Amygdala (solid line) and hippocampal (dashed line) category neurons showed a similar category tuning. No significant differences in category selectivity were found between amygdala and hippocampal neurons at all stimulus levels (two-tailed two-sample t-test, all Ps > 0.16), although unselected amygdala neurons showed a stronger visual selectivity. (J-N) Summary of DOS. (J) DOS during cue presentation (left) and fixations on targets during search (right). Face: face-selective neurons. Category: category-selective neurons. Note that DOS values are dependent on the number of categories, making DOS values of face-selective and category-selective cells not comparable. Error bars for category neurons denote one SEM across cells. Chance performance was calculated by permutation tests by shuffling the label of categories. Error bars for chance values denote one SD across permutations. Asterisk indicates a significant difference by permutation test. ***: P < 0.001. (K) Correlation of DOS (evaluated with 14 object categories) between cue presentation and search. Each circle represents a neuron. Magenta circles denote the category-selective neurons (n = 32) and gray circles denote the unselected neurons (n = 196). The magenta line is the linear fit for category-selective neurons (r = 0.65, P < 10⁻⁴) and the gray line is the linear fit for all neurons (r = 0.65, P < 10⁻²⁸). (L) DOS for targets vs. distractors. T: fixations on targets. D: fixations on distractors. Error bars denote one SEM across cells. Asterisk indicates a significant difference by two-tailed paired t-test. ***: P < 0.001. (M) Correlation of DOS (evaluated with 14 object categories) between fixations on targets and distractors. Each circle represents a neuron. Magenta circles denote the category-selective neurons (n = 32) and gray circles denote the unselected neurons (n = 196). The magenta line is the linear fit for category-selective neurons (r = 0.59, P = 0.00039) and the gray line is the linear fit for all neurons (r = 0.63, P < 10⁻²⁵). (N) DOS for TS neurons did not differ between target and distractor fixations. Legend follows (J). n.s.: not significant. See also Figure S5.

To summarize the response of category neurons during search, we used the depth of selectivity (DOS) index (see STAR Methods and Eq. 3) [43]. DOS values range between 0–1, with 1 indicating tuning to only a single category (note the dependence of DOS values on the number of categories, making DOS values of face-selective and category-selective cells not comparable). During the cue period, the mean DOS of face-selective and category-selective cells was 0.42±0.14 (chance: 0.14; permutation test: P < 0.001; Figure 3J) and 0.68±0.16 (chance: 0.58; permutation test: P < 0.001; Figure 3J), respectively. We found that category cells identified during the cue period remained visually selective during search: using neuronal responses aligned at fixation onset, selectivity remained significantly above chance for both face-selective (mean DOS 0.36±0.22 vs. 0.22 expected by chance; permutation test: P < 0.001; Figure 3J) and category-selective neurons (mean DOS 0.76±0.16 vs. 0.69 expected by chance; P < 0.001; Figure 3J). Also, the DOS index of category cells computed during search and during cue presentation was correlated (face-selective: r = 0.44, P = 0.0047, category-selective: r = 0.65, P < 10⁻⁴, all: r = 0.65, P < 10⁻²⁸; Figure 3K), indicating that their tuning remained stable. Note that the analysis of category cell activity during search is statistically independent of cell selection, because category cells were selected based only on their response during the cue period. This result shows that category cells selected for tuning to the cue remained visually selective when considering fixation-onset triggered responses.
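For readers unfamiliar with the index, the sketch below computes a depth-of-selectivity value from a cell's mean firing rate per category. Eq. 3 is not reproduced in this section, so the code assumes the commonly used definition DOS = (n − Σ r_j / r_max) / (n − 1), where n is the number of categories, r_j the mean rate for category j, and r_max the largest of these. This form has the stated properties: it ranges from 0 to 1 and equals 1 when the cell responds to a single category only. The function name is hypothetical.

```python
def depth_of_selectivity(rates):
    """Depth-of-selectivity index from mean firing rates, one per category.

    Assumed form: DOS = (n - sum(r_j / r_max)) / (n - 1).
    Returns 0 for equal responses to all categories and 1 when only one
    category evokes a response.
    """
    n = len(rates)
    r_max = max(rates)
    if n < 2 or r_max <= 0:
        raise ValueError("need at least two categories and a positive peak rate")
    return (n - sum(r / r_max for r in rates)) / (n - 1)
```

For example, `depth_of_selectivity([4.0, 2.0])` gives 0.5, while a flat profile such as `[3.0, 3.0, 3.0]` gives 0. The value expected from noise alone changes with the number of categories, which is why the text compares each cell type against its own permutation-based chance level instead of comparing face-selective and category-selective values directly.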

We next compared the selectivity of category cells during search between fixations on targets vs. distractors. We found that category cells were significantly more tuned during fixations on targets compared to distractors (numbers are DOS index values; face-selective neurons: 0.36±0.22 (mean±SD) vs. 0.20±0.15 for target and distractor, respectively; two-tailed paired t-test: P = 0.00047; category-selective neurons: 0.76±0.16 vs. 0.62±0.17, P < 10⁻⁴; Figure 3L). Also, visual selectivity was significantly correlated between fixations on targets and distractors when considering all neurons regardless of whether they were category cells or not (DOS for face-selectivity: r = 0.39, P < 10⁻¹¹; DOS for category-selectivity: r = 0.63, P < 10⁻²⁵; Figure 3M). Together, this result shows that category cells are modulated both by sensory information about the visual object category, and by target detection during the search task.

TS and category cells are largely distinct

We next investigated whether target cells were also visual category-selective [39]. The DOS values of TS neurons were not larger than those expected by chance (mean DOS 0.49±0.21 vs. 0.49 expected by chance; evaluated during cue presentation period with 14 object categories; permutation test by scrambling object identity: P = 0.58; Figure 3N). This result shows that the response of TS cells was, on average, not visually tuned.

Were there cells that were both TS and category cells? We found a small number of such cells: only 12 (5.26%) of neurons were both TS and face-selective cells (Figure 4A) and only 7 (3.07%) of neurons were both TS and category-selective cells (Figure 4B). The proportion of cells that qualified as both was not greater than expected from independence of these two attributes, i.e., TS neurons had a similar percentage of category neurons as the entire population (χ²-test: P = 0.25 and P = 0.99, respectively; Figure 4A,B) and category neurons had a similar percentage of TS neurons as the entire population (P = 0.23 and P = 0.99, respectively). A scatter plot of TSI vs. DOS for all recorded neurons further showed that TS neurons and category neurons were largely separate populations (Figure 4C and Figure 4D show DOS values during cue period for face-selective and category-selective neurons, respectively). Furthermore, TSI and DOS were not correlated for all cells (r = 0.061, P = 0.36), TS cells (r = 0.062, P = 0.67), category cells (r = −0.057, P = 0.76), or cells that were qualified as both TS and category (r = −0.21, P = 0.66; similar results were found using face-selective DOS). Lastly, a two-way ANOVA (TS × category) of the entire population of neurons confirmed that neither face-selective cells (Figure 4C; TSI: P = 0.95; DOS: P = 0.81; main effects P < 10⁻⁶ as by selection) nor category-selective cells (Figure 4D; TSI: P = 0.60; DOS: P = 0.88; main effects P < 10⁻²⁰ as by selection) interacted with TS cells. Together with the preceding selectivity analysis, this result suggests that TS and category cells are largely distinct.
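The overlap comparison above can be illustrated with the cell counts given in the text (228 neurons in total, 50 TS, 39 face-selective, 32 category-selective, 12 and 7 in the respective overlaps). The sketch assumes the χ²-test compares two proportions as if they came from independent samples, with a pooled variance and no continuity correction; the exact variant the authors used is not stated in this section, so both the function and that choice are assumptions.

```python
import math

def two_proportion_chi2(x1, n1, x2, n2):
    """Chi-square test (1 df) that x1/n1 and x2/n2 are equal.

    Uses the pooled proportion and no continuity correction (an assumption).
    Returns (chi2, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    chi2 = (p1 - p2) ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Share of face-selective cells among TS cells vs. among all neurons.
_, p_face_in_ts = two_proportion_chi2(12, 50, 39, 228)
# Share of TS cells among face-selective cells vs. among all neurons.
_, p_ts_in_face = two_proportion_chi2(12, 39, 50, 228)
# Share of category-selective cells among TS cells vs. among all neurons.
_, p_cat_in_ts = two_proportion_chi2(7, 50, 32, 228)
```

Under these assumptions the three p-values come out at roughly 0.25, 0.23, and 0.99, close to the values reported in the text, which suggests (but does not confirm) that a test of this form was used.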

Figure 4. Population summary for MTL.


(A) Most selective neurons were either only TS (n=38) or only face-selective (n=27) neurons, with a minority qualifying as both (n=12). (B) Most selective neurons were either only TS (n=43) or category-selective (n=25) neurons, with a minority qualifying as both (n=7). (C-D) Response of all recorded neurons as a function of visual selectivity ((C) DOS index for face selectivity; (D) DOS index for category selectivity; chance DOS was subtracted to correct for baseline) and target selectivity (TSI; chance TSI was subtracted to correct for baseline). Each circle represents a neuron. Color of the circle indicates classification of the neuron (Red: target neurons. Brown: visually selective neurons. Blue: both. Gray: neither).

MTL TS neurons respond later than MFC TS neurons

Where do the MTL target detection signals originate? To explore the origins of the target detection signal, we also recorded from two medial frontal brain areas: the pre-supplementary motor area (pre-SMA) and the anterior mid-cingulate cortex (aMCC). Nine patients (10 sessions) performed the identical task. We recorded in total 182 single neurons in the pre-SMA, 129 of which had a spontaneous firing rate of >0.2Hz, and 211 neurons from the aMCC, 162 of which had a firing rate of >0.2Hz. We found 51 TS neurons in the pre-SMA (Figure 5A,B; 39.5%; binomial P < 10⁻⁵⁰; 31 cells had a greater response to targets relative to distractors) and 42 TS neurons in the aMCC (25.9%, binomial P < 10⁻⁵⁰; 18 cells had a greater response to targets relative to distractors). We next proceeded to compare the onset latency, relative to fixation onset, of the TS neurons between MFC and MTL. This revealed that pre-SMA TS neurons responded significantly earlier than MTL TS neurons (Figure 5E,F; pre-SMA: −163ms relative to fixation onset; MTL: 44ms; permutation test: P = 0.005; see Figure 5C,D for individual examples). This result was similar for aMCC TS neurons, which also responded significantly earlier than MTL TS neurons (Figure S6A,B; aMCC: −140ms relative to fixation onset; MTL: 44ms; permutation test: P = 0.042). This result held when restricting analysis to neurons recorded simultaneously (Figure S6C,D; permutation test: P = 0.023). Together, these data show that target-relevance signals are available earlier in MFC compared to MTL, suggesting that the MTL target signal might represent top-down input from the frontal cortex. In addition, TS neurons in MFC also differentiate between detected vs. missed targets like MTL TS neurons (see STAR Methods).

Figure 5. TS neurons in the pre-SMA respond before TS neurons in the MTL.


(A-B) Example TS neurons from pre-SMA. (C-D) Two TS neurons simultaneously recorded in the pre-SMA (C) and MTL (D). Legend conventions as in Figure 2A-E. (E) Cumulative firing rate for TS neurons from the pre-SMA (dotted lines; n = 31 neurons) and MTL (solid lines; n = 27 neurons). Note that only TS neurons that had a greater firing rate for targets than distractors are shown. Shaded area denotes ±SEM across neurons. Red: fixations on targets. Blue: fixations on distractors. Top bars show clusters of time points with a significant difference (one-tailed pairwise t-test; P < 0.01; FDR-corrected; cluster size > 10 time points). Arrows indicate the first time point of the significant cluster. Magenta: MTL neurons. Black: pre-SMA neurons. (F) Difference in cumulative firing rate (same data as shown in (E)). Shaded area denotes ±SEM across neurons. Arrows indicate the first time point of the significant cluster. Magenta: MTL neurons. Black: pre-SMA neurons. See also Figure S6.

MTL neurons encode category signals earlier than target detection signals

Did the response latencies of category and TS cells in the MTL differ? To answer this question, we selected a population of category neurons whose response differentiated between fixations on faces and non-faces during search (two-tailed two-sample t-test; n = 27; 17 neurons had a greater response for faces and we focused on this subset here). We found that category neurons with faces as the preferred stimulus responded significantly earlier than TS neurons (Figure S6E,F; category: −155ms relative to fixation onset; TS: 44ms; permutation test: P = 0.026). Similarly, we found that face-selective neurons selected during cue presentation (Figure 3A-G) also responded earlier than TS neurons during search (Figure S6G-I). Together, our results show that MTL neurons encode visual category information earlier than target-detection signals.

Interestingly, even though category cells became selective before TS cells, when we separately analyzed DOS in each consecutive 50ms time window, we found that DOS increased immediately after TS cells became selective (~44ms; Figure S6M). This result further supports the idea that category neurons are modulated by goal relevance signals in the MTL.

MTL TS neurons do not depend on explicitly defined targets

To further characterize TS cells, we conducted a comparison “pop-out” task while recording from the same neurons as the above “standard” task (9 sessions, 157 MTL neurons with an overall firing rate >0.2Hz). In this pop-out task (Figure 6A,B), no explicit search cue was given. Instead, the target was defined as an “oddball”: there was either one face among vehicles or one vehicle among faces. Patients were instructed to indicate by button press as soon as they determined whether there was an oddball (target). This allowed us to examine target or distractor responses in the absence of an explicitly provided search cue.

Figure 6. MTL TS neurons do not require an explicit search target.

(A) Task structure for the control pop-out task. The target was defined as an “oddball”: there was either one face among vehicles or one vehicle among faces. Subjects were instructed to indicate by button press as soon as they determined whether there was an oddball (the same button press as the standard task). Trial-by-trial feedback is given immediately after button press. (B) Sample stimuli of pop-out task. Each circle represents a fixation. Green circle: first fixation. Magenta circle: last fixation. Yellow line: saccades. Blue dot: raw gaze position. Red box: target. In this example, the target (“oddball”) is a face. (C) Scatter plot of the mean normalized firing rate between the standard and pop-out task. Each circle represents the mean normalized firing rate difference between fixations on targets and fixations on distractors for a neuron. Magenta circles denote the TS neurons (derived from the standard task; n = 28) and gray circles denote the unselected neurons. The magenta line is the linear fit for TS neurons and the gray line is the linear fit for all neurons that had an overall firing rate >0.2Hz for both tasks in comparison (n = 157).

We examined whether the subset of TS neurons selected from the standard task for which we also recorded the pop-out task (28 TS neurons; 9 target-preferring and 19 distractor-preferring) also signaled target detection in the pop-out task. We found that this was the case: TS cell responses were significantly correlated between the standard and pop-out tasks (Figure 6C; r(28) = 0.49, P = 0.0082 for TS neurons and r(157) = 0.25, P = 0.0015 for all neurons). The correlation results were further confirmed by comparison to chance performance (note that this test was independent from selection): neurons had above-baseline discrimination of targets vs. distractors in the pop-out task (two-tailed one-sample t-test: P = 0.037). Together, the consistency between the standard and pop-out tasks suggests that MTL neurons encode search goals irrespective of goal format.

Lastly, we compared the onset latency of MFC TS neurons (n = 29; 20 from the pre-SMA and 9 from the aMCC) and MTL TS neurons (n = 9) in the pop-out task. We found that MFC TS neurons responded significantly earlier than MTL TS neurons, just as had been the case in the standard task (Figure S6K,L; MFC: −160ms relative to fixation onset; MTL: 16ms; permutation test: P = 0.008). This result suggests that MFC neurons can signal task-relevance in at least two different ways: through instructions held in working memory, and by detection of oddball stimuli. It is also worth noting that MTL TS neurons responded earlier in the pop-out task (16ms) compared to the standard task (44ms; permutation test: P = 0.028), but MFC TS neurons responded with a similar latency (standard task: −171ms, pop-out task: −160ms; permutation test: P = 0.58), suggesting that MTL neurons receive top-down signals faster when less working memory is involved (i.e., Δt between MFC and MTL; standard task: 215ms, pop-out task: 176ms).

Discussion

Our results reveal two distinct visually responsive populations of neurons in the human MTL. One population shows response selectivity to visual object categories including faces as described before [39, 42, 44]. A second population, described here for the first time, shows response selectivity to targets and distractors in memory-based visual search. Our results suggest a population-level response in the MTL that represents the meaning of visual stimuli by encoding both their category membership and their task relevance. Differential latency analysis suggests that the target detection signal in the MTL likely originates from frontal cortex. In addition, in the MTL, visual category information was available earlier than the target-detection response. Lastly, a control experiment suggested that the MTL target response is independent of the format of the search goals.

TS cells constitute a single-neuron correlate of an aspect of MTL function that so far has received relatively little attention: the online guidance of behavior [45]. Despite an absence of demands on long-term memory, lesion studies have consistently shown that patients with MTL lesions exhibit deficits in sufficiently demanding visual search tasks [17, 18]. A critical component of visual search is working memory, which is required to keep in mind the current search goal and to keep track of visited locations. It has been suggested that the effects of MTL lesions are due to the increasingly recognized role of the MTL in supporting memory for even brief periods of time (seconds) when distractors and other competing demands are present [16, 26, 46]. In contrast, here we provide evidence for a second role of the MTL during search: the detection of goal relevance. This result is compatible with the finding that patients with MTL lesions miss targets more frequently and fixate distractors longer than controls [47]. Our subjects sometimes also missed targets despite directly looking at them, in which case the activity of TS cells was reduced. Together, our data provides direct evidence for a role of the human MTL in the implementation of behaviorally relevant goals.

We found that the response of TS cells during on-target fixations was modulated by whether the subject detected the target or not. In addition to goal relevance, their response was thus indicative of the choice made by the subject. Note that a similar relationship has been observed for visually selective neurons in the inferotemporal cortex of macaques [28]. Here, in contrast, we show that non-visually selective TS neurons exhibit this relationship. This finding adds to the increasing evidence that the activity of MTL neurons reflects choices made about visual stimuli rather than the sensory input [48] and that it may track visual awareness of the presented stimulus [49–52].

Modulation of neural activity by goal-relevance has been observed in a number of other brain areas. For example, in monkeys, in both single saccade tasks [2, 53, 54] and tasks with naturalistic free-viewing [9, 27, 28], temporal lobe cortical neurons show an enhanced response to visual stimuli presaccadically when the stimulus in the receptive field becomes the target. Similarly, during delayed match-to-sample tasks at fixation, visual responses to ‘match’ stimuli in IT and perirhinal cortex are modulated by goal relevance [11, 12]. Here, we provide four new pieces of data relative to this literature. First, visually-tuned cells in the human MTL are similarly modulated by goal relevance, a response which they might inherit from their input [2, 11, 12]. Although visually-tuned cells responded with a shorter latency than TS cells, their selectivity increased after TS cells became selective, further confirming the modulation of TS cells (cf. Figure S6M). Second, non-visually tuned target-selective cells were indicative of goal relevance alone, a kind of response that has not previously been documented in the MTL of either humans or macaques. Third, we identified TS cells in the MFC that responded significantly earlier than simultaneously recorded TS cells in the MTL. Fourth, MTL neurons encoded search goals irrespective of goal format. Responses indicating behavioral relevance have been investigated intensively in the frontal cortex, in particular the frontal eye fields [55–57]. This information is thought to modulate other areas in a top-down manner [34]. Our findings suggest that MTL TS cells reflect a top-down signal within the MTL that originates in the frontal cortex. Here, this conclusion rests on latency differences alone. Future experiments are needed to test the causality of this pathway more directly.

We found that MFC neurons responded to targets and MTL neurons responded to faces ~150 ms prior to fixation onset, suggesting that the neurons respond to what is currently attended but not yet fixated. Given the observed fixation duration (Figure 1D; ~300 ms), patients must have begun processing the next to-be-fixated item in the middle of the present fixation, consistent with prior reports that subjects can identify a search target at least one fixation in advance [58]. We here found neuronal responses supporting such look-ahead processing. Recordings from monkey superior colliculus have even shown that the process of target selection can encompass at least two future saccade targets [59].

Visually selective cells are a prominent feature of the human MTL [38–40, 60]. Category cells increase their firing rate shortly after the onset of only a set of preferred stimuli shown in isolation. The latency, amplitude and duration of this activity increase are modulated by a variety of factors that include anatomical location, tuning sparsity, speed of stimulus presentation, and visual awareness [46, 50, 61]. These properties of category cells can all be explained using a feedforward view of sensory processing. In contrast, here we now show that category cell activity is modulated by the top-down factor of search goal relevance. A second form of top-down modulation that affects category cell activity is spatial attention [42]. We also observed such modulation in our task, because category cell activity was sensitive to the currently fixated object. Our subjects actively searched the array, a situation where the current gaze position is an accurate indication of the location of spatial attention (except for brief moments of time preceding saccades; [28, 62, 63]). Together, our results reveal that during active search, category cell activity is jointly modulated by the two top-down factors of spatial attention and goal-relevance.

Searching for a face in a crowd is perhaps one of the most common visual search tasks that humans perform, making it important to independently investigate face cells. Here, we found that face cells were strongly modulated by the top-down factors of spatial attention and goal-relevance. This result was also true when restricting our analysis to face cells recorded in the amygdala, which reveals that the social inference processes that are thought to rely on face cells in the amygdala [64, 65] can be modulated by goal-relevance. In conclusion, our results provide important insights into how the brain detects goal-relevant targets in the environment. While behavioral work has long indicated that the MTL is important in the online control of behavior, it has so far remained unclear what specifically its contribution is. Here, we provide direct evidence for one such contribution: the detection of goal-relevant targets.

STAR Methods

Contact for Reagent and Resource Sharing

Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Ralph Adolphs (radolphs@hss.caltech.edu).

Experimental Model and Subject Details

There were 11 sessions from 8 patients in total. Three patients did two sessions. All sessions had simultaneous eye tracking. All participants provided written informed consent according to protocols approved by the institutional review boards of the Cedars-Sinai Medical Center and the California Institute of Technology. We also compared our patients’ behavioral task performance with that of 8 healthy individuals to confirm that our patients performed normally in the task. These healthy individuals have been characterized in a previous study [37].

Method Details

Stimuli

We used 20 distinct visual search arrays. In each array there were 24 items whose spatial locations were randomized between the 20 arrays. 12 items were faces (faces and people with different postures, emotions, ages, and genders, etc.) and 12 items did not contain faces (furniture, toys, food, etc.) (see Figure 1B for an example). These face and non-face items composing the array stimuli have been characterized and described previously [37, 66]. From each array stimulus, we randomly assigned 4 face items and 4 non-face items as targets (on 8 distinct trials). For each array, we also had 2 target-absent trials, i.e., the target was not among the objects in the search array (one target-absent trial with a face target, and one with a non-face target). Therefore, in total we had 100 trials with face targets and 100 trials with non-face targets, and 20% of trials were target-absent trials. The entire task was separated into two blocks. Each block had 100 trials. Patients finished at least one block. Importantly, low-level properties of face and non-face items were equalized within each search array. The face and non-face items did not differ in standard low-level saliency as quantified by the Itti-Koch model [67, 68], distance to center or size (all Ps > 0.79). In the analysis of visual tuning to object category, we further categorized the items into 14 finer categories: face, clock, vehicle, furniture, electronics, stationery, sign, plant, toy, sport, bag, comb, clothes, and food.
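The trial counts stated above follow directly from the array design. As a quick arithmetic check (the variable names are chosen here for illustration only and do not come from the original analysis code):

```python
# Trial bookkeeping for the standard task, using only numbers stated in the text.
n_arrays = 20
face_targets_per_array = 4       # target-present trials with a face target
nonface_targets_per_array = 4    # target-present trials with a non-face target
absent_per_array_per_type = 1    # one target-absent trial per target type

face_trials = n_arrays * (face_targets_per_array + absent_per_array_per_type)
nonface_trials = n_arrays * (nonface_targets_per_array + absent_per_array_per_type)
total_trials = face_trials + nonface_trials
absent_fraction = n_arrays * 2 * absent_per_array_per_type / total_trials

print(face_trials, nonface_trials, total_trials, absent_fraction)
# 100 100 200 0.2
```

The 200 trials split evenly into the two blocks of 100 trials described above.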

In the pop-out task, we only used face items and vehicle items from those in the standard task. There were still 24 items in total in each search array, but each array was generated online with a randomly selected subset of face and vehicle items. The spatial location of each item was also randomly decided online. Target-present arrays had one face among vehicles or one vehicle among faces, and target-absent arrays had homogeneously all faces or all vehicles.

Task

The task (Figure 1A,B) has been described in a previous study [37]. A target was presented for 1 second followed by the search array. Patients were instructed to find the item in the array that matched the target and explicitly told that the array might or might not contain the target. The search array stayed up for at most 14 seconds, or until the subject responded using a Cedrus™ button box, either by pushing the left button to indicate that the target was found in the array, or by pushing the right button to indicate that the target was absent in the array. A feedback message (‘Correct’ or ‘Incorrect’) was then displayed for 1 second. Subjects were instructed to respond as quickly and accurately as possible. If subjects did not respond within 14 seconds after array onset, a message ‘Time Out’ was displayed. The inter-trial interval (ITI) was jittered between 1 and 2 seconds. The array and target orders were completely randomized for each subject. Subjects practiced 5 trials before the experiment to familiarize themselves with the task. At the end, the overall percentage of correct answers was displayed to patients as motivation.

The control pop-out task used different search arrays (Figure 6B) but the same task as the standard task, except that targets were pre-defined and thus there was no search cue. Patients were still instructed to report target presence as quickly and accurately as possible. Each block had 100 trials.

Patients sat approximately 60 cm from an LCD display with a 17-inch screen (screen resolution: 1024 × 768). The refresh rate of the display was 60Hz and the stimuli occupied the center of the display (31.5° × 25.4° visual angle). Stimuli were presented using MATLAB with the Psychtoolbox 3 [69] (http://psychtoolbox.org).

Electrophysiology

We recorded bilaterally from implanted depth electrodes in the amygdala, hippocampus, and pre-SMA from patients with pharmacologically intractable epilepsy. Target locations were verified using post-implantation structural MRIs as shown in Figure S1. At each site, we recorded from eight 40 μm microwires inserted into a clinical electrode as described previously [70, 71]. Efforts were always made to avoid passing the electrode through a sulcus, and its attendant sulcal blood vessels, and thus the location varied but was always well within the body of the targeted area. Microwires projected medially out at the end of the depth electrode and examination of the microwires after removal suggests a spread of about 20–30 degrees. The amygdala electrodes were likely sampling neurons in the mid-medial part of the amygdala and the most likely microwire location is the basomedial nucleus or possibly the deepest part of the basolateral nucleus. Bipolar wide-band recordings (0.1 Hz–9 kHz), using one of the eight microwires as reference, were sampled at 32 kHz and stored continuously for off-line analysis with a Neuralynx system. The raw signal was filtered with a zero-phase-lag 300 Hz–3 kHz bandpass filter and spikes were sorted using a semi-automatic template matching algorithm as described previously [72]. Units were carefully isolated and recording and spike sorting quality were assessed quantitatively (Figure S2).

Eye tracking

Patients were recorded with a remote non-invasive infrared Eyelink 1000 system (SR Research, Canada). One of the eyes was tracked at 500Hz. The eye tracker was calibrated with the built-in 9-point grid method at the beginning of each block. Fixation extraction was carried out using software supplied with the Eyelink eye tracking system. Saccade detection required a deflection of greater than 0.1°, with a minimum velocity of 30°/s and a minimum acceleration of 8000°/s², maintained for at least 4 ms. Fixations were defined as the complement of a saccade, i.e. periods without saccades. Analysis of the eye movement record was carried out off-line after completion of the experiments.

Rectangular regions of interest (ROI) were used to define array items in the search array [37]. The ROI boundaries tightly encompassed the entire array item and varied between items, depending on the item size. Fixations within the ROIs were counted as falling on items. Each fixation was treated individually, and multiple consecutive fixations that fell on the same array item were counted as discrete samples.
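The ROI rule described above can be illustrated with a minimal sketch. The `ROI` type, the `item_at` helper, and the pixel coordinates are hypothetical names and values introduced for illustration; they are not taken from the original analysis scripts.

```python
from typing import NamedTuple, Optional, Sequence

class ROI(NamedTuple):
    """Axis-aligned rectangle around one array item (hypothetical layout)."""
    item_id: int
    left: float
    top: float
    right: float
    bottom: float

def item_at(x: float, y: float, rois: Sequence[ROI]) -> Optional[int]:
    """Return the id of the ROI containing (x, y), or None for the margins."""
    for roi in rois:
        if roi.left <= x <= roi.right and roi.top <= y <= roi.bottom:
            return roi.item_id
    return None

# ROI sizes differ between items, as in the text.
rois = [ROI(0, 0, 0, 100, 80), ROI(1, 150, 0, 230, 120)]
fixations = [(10, 10), (20, 30), (160, 50), (120, 40)]

# Each fixation is labeled on its own, so two consecutive fixations on
# item 0 remain two discrete samples.
labels = [item_at(x, y, rois) for x, y in fixations]
print(labels)  # [0, 0, 1, None]
```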

Patients occasionally failed to consciously detect targets, conditional on targets having been fixated. We defined “misses” as fixations that landed on the target even though the target was not detected. We excluded the last 3 fixations landing on the target for misses because they corresponded to target detection [37].

Quantification and Statistical Analysis

Fixation analysis

We included fixations from target-present trials only to analyze target response, and fixations from both target-present and target-absent trials to analyze visual selectivity. We included all trials, but qualitatively the same results were derived when analyzing correct trials only. We drew rectangular regions of interest (ROIs) that tightly encompassed the array items (Figure 1B). 64.4±3.63% (mean±SD across sessions) fixations were on items (the rest were in margins), and multiple consecutive fixations within the same array item were counted as discrete samples. 18.0±5.09% fixations from target-present trials fell onto targets. To analyze target response, we contrasted fixations on targets to all other fixations in target-present trials, including those in the margins, which were broadly defined as “distractors”. Qualitatively the same results were derived when only including fixations fully within the item ROIs. To analyze visual selectivity, we only considered fixations fully within the item ROIs.

Trials in which targets had ever been fixated (regardless of whether they had been detected by button press), had on average 5.09±1.35 and 0.65±0.63 fixations on distractors before and after the first fixation on target, respectively. Furthermore, patients occasionally failed to consciously detect targets even when they had been fixated. We defined “misses” as fixations that landed on the target even though the trial was not immediately terminated by button press from the subject, i.e., at least 3 fixations away from button press. We excluded the last 3 fixations landing on the target for misses because they corresponded to target detection [37]. 7.38±6.18% of target-present trials contained such misses.

Spikes

Only units with an average firing rate of at least 0.2Hz (entire task) were considered. Only single units were considered. Trials were aligned to stimulus onset or button press. Fixations were aligned to fixation onset. Average firing rates (PSTH) were computed by counting spikes across all trials in consecutive 250 ms bins and across all fixations in consecutive 50 ms bins. Pairwise comparisons were made using a two-tailed t-test at P < 0.05 and Bonferroni-corrected for multiple comparisons in the group PSTH.
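As an illustration of the binning step, the following sketch computes a fixation-aligned PSTH in consecutive 50 ms bins. It is a simplified stand-in, not the authors' MATLAB code: the function name `psth`, the integer-millisecond time base, and the toy spike times are assumptions made here.

```python
def psth(spike_times_ms, event_times_ms, t_start_ms, t_stop_ms, bin_ms):
    """Event-aligned mean firing rate (Hz) in consecutive bins.

    All times are integer milliseconds; bins cover [t_start_ms, t_stop_ms).
    """
    n_bins = (t_stop_ms - t_start_ms) // bin_ms
    counts = [0] * n_bins
    for event in event_times_ms:
        for spike in spike_times_ms:
            idx = (spike - event - t_start_ms) // bin_ms  # floor division
            if 0 <= idx < n_bins:
                counts[idx] += 1
    # Convert spikes per bin to spikes per second, averaged over events.
    return [c * 1000 / (len(event_times_ms) * bin_ms) for c in counts]

spikes = [105, 130, 260, 1120, 1180]   # toy spike times (ms)
fixation_onsets = [100, 1100]          # toy fixation onsets (ms)
rates = psth(spikes, fixation_onsets, t_start_ms=0, t_stop_ms=200, bin_ms=50)
print(rates)  # [30.0, 10.0, 0.0, 10.0]
```

The same function with `bin_ms=250` and trial onsets as events would correspond to the trial-wise PSTH.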

Single-neuron ROC analysis

Neuronal ROCs were constructed based on the spike counts in a time window −500 to 500 ms around button press for trial-wise analysis, and in a time window of 200 ms before fixation onset to 200 ms after fixation offset for fixation-wise analysis. We varied the detection threshold between the minimal and maximal spike count observed, linearly spaced in 20 steps. The AUC of the ROC was calculated by integrating the area under the ROC curve (trapezoid rule). The AUC value is an unbiased estimate for the sensitivity of an ideal observer that counts spikes and makes a binary decision based on whether the number of spikes is above or below a threshold. We defined the category with higher overall firing rate as ‘true positive’ and the category with lower overall firing rate as ‘false positive’. Therefore, the AUC value was always above 0.5 by definition.
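The ROC construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `roc_auc`, the strict `>` threshold criterion, and the explicit (0, 0) and (1, 1) end points are choices made here, and assigning the higher-mean group as "positive" makes an AUC of at least 0.5 the typical outcome.

```python
def roc_auc(counts_a, counts_b, n_steps=20):
    """AUC from spike counts, with the higher-mean group treated as 'positive'."""
    mean_a = sum(counts_a) / len(counts_a)
    mean_b = sum(counts_b) / len(counts_b)
    pos, neg = (counts_a, counts_b) if mean_a >= mean_b else (counts_b, counts_a)

    # Thresholds linearly spaced between the minimal and maximal spike count.
    lo = min(min(pos), min(neg))
    hi = max(max(pos), max(neg))
    thresholds = [lo + (hi - lo) * k / (n_steps - 1) for k in range(n_steps)]

    # Each threshold yields one (false-positive rate, true-positive rate) point.
    points = {(0.0, 0.0), (1.0, 1.0)}
    for thr in thresholds:
        tpr = sum(c > thr for c in pos) / len(pos)
        fpr = sum(c > thr for c in neg) / len(neg)
        points.add((fpr, tpr))
    curve = sorted(points)

    # Trapezoid rule over the sorted ROC points.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))

auc_separated = roc_auc([5, 6, 7, 8], [1, 2, 3, 4])   # fully separated counts
auc_identical = roc_auc([1, 2, 3], [1, 2, 3])         # no discrimination
print(auc_separated, round(auc_identical, 6))
```

Fully separated spike counts give an AUC of 1, and identical distributions give an AUC of 0.5.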

TSI

In target-preferring neurons (n = 27), the normalized firing rate (i.e., the firing rate was normalized by dividing by average baseline (the firing rate 1000 ms before cue onset) across all trials, separately for each unit) was 1.41±0.28 and 1.03±0.17 for fixations on targets and distractors, respectively (two-tailed paired t-test, P = 9.94×10⁻¹¹), whereas in distractor-preferring neurons (n = 23), the normalized firing rate was 0.71±0.24 and 0.92±0.26 for fixations on targets and distractors, respectively (two-tailed paired t-test, P = 4.83×10⁻⁷). This result shows that the subset of TS neurons that have a larger response to distractors than targets are decreasing their activity for targets rather than increasing their activity for distractors.

We quantified for each neuron whether its response differed between fixations on targets and fixations on distractors using a single-fixation TSI (Eq. 1). The TSI facilitates group analysis and comparisons between different types of cells (i.e., target- and distractor-preferring cells in this study), as motivated by previous studies [44, 73, 74]. The TSI quantifies the response during fixation i relative to the mean response to fixations on distractors and baseline (a 1 second interval of blank screen right before target cue onset). The mean response and baseline was calculated individually for each neuron.

$$\mathrm{TSI}_i = \frac{FR_i - \mathrm{mean}(FR_{\mathrm{Distractor}})}{\mathrm{mean}(FR_{\mathrm{Baseline}})} \cdot 100\% \qquad \text{(Eq. 1)}$$

For each fixation i, which can be either target or distractor, TSI_i is the baseline normalized mean firing rate (FR) during an interval from 200 ms before fixation-onset to 200 ms after fixation offset (the same time interval as cell selection). Different time intervals were tested as well, to ensure that results were qualitatively the same and not biased by particular spike bins.

If a neuron distinguishes fixations on target from fixations on distractors, the average value of TSI_i will be significantly different from 0. Since target-preferring neurons have more spikes in fixations on targets and distractor-preferring neurons have more spikes in fixations on distractors (the selection process is described above), on average TSI_i is positive for target-preferring neurons and negative for distractor-preferring neurons. To get an aggregate measure of activity that pools across neurons, TSI_i was multiplied by −1 if the neuron is classified as a distractor-preferring neuron (Eq. 2). This makes TSI_i on average positive for both types of target neurons. Notice that the factor −1 depends only on the neuron type, which is determined by t-tests on fixations as described above, but not fixation type. Thus, negative TSI_i values are still possible.

$$\mathrm{TSI}_i = -\frac{FR_i - \mathrm{mean}(FR_{\mathrm{Distractor}})}{\mathrm{mean}(FR_{\mathrm{Baseline}})} \cdot 100\% \qquad \text{(Eq. 2)}$$

After calculating TSI_i for every fixation, we subsequently averaged all TSI_i of fixations that belong to the same category. By definition, the average value of TSI_i for fixations on distractors will be equal to zero because the definition of TSI_i is relative to the response to fixations on distractors (see Eq. 2). The mean baseline firing rate was calculated across all trials. The same mean(FR_Distractor) was subtracted for both types of fixations.
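To make the index concrete, the sketch below computes the single-fixation TSI as described in the text: the firing rate of each fixation minus the mean distractor response, divided by the mean baseline rate, expressed in percent, with the sign flipped for distractor-preferring neurons. The function name, argument names, and toy firing rates are illustrative assumptions, not the authors' code.

```python
from statistics import mean

def tsi_per_fixation(fr, fr_distractor, fr_baseline, distractor_preferring=False):
    """Single-fixation TSI in percent (Eq. 1; sign-flipped as in Eq. 2)."""
    sign = -1.0 if distractor_preferring else 1.0
    d = mean(fr_distractor)   # mean response to fixations on distractors
    b = mean(fr_baseline)     # mean baseline firing rate across trials
    return [sign * (x - d) / b * 100.0 for x in fr]

# Toy target-preferring neuron (firing rates in Hz).
fr_target = [6.0, 8.0]
fr_distractor = [3.0, 5.0]
fr_baseline = [4.0, 4.0]

tsi_target = tsi_per_fixation(fr_target, fr_distractor, fr_baseline)
tsi_distractor = tsi_per_fixation(fr_distractor, fr_distractor, fr_baseline)
print(tsi_target, mean(tsi_distractor))  # [50.0, 100.0] 0.0

# Toy distractor-preferring neuron: lower rates on targets map to positive TSI.
tsi_flipped = tsi_per_fixation([1.0, 2.0], fr_distractor, fr_baseline,
                               distractor_preferring=True)
print(tsi_flipped)  # [75.0, 50.0]
```

The distractor fixations average to zero by construction, while individual fixations can still take negative values.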

The cumulative distribution function (CDF) was constructed by calculating, for each possible value x of the TSI, the fraction of examples that are less than or equal to x. That is, F(x) = P(X ≤ x), where X is a vector of all TSI values. The CDFs of fixations on targets and distractors were compared using two-tailed two-sample Kolmogorov–Smirnov tests. All error bars are ±SE unless indicated otherwise.
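A minimal sketch of the empirical CDF, F(x) = P(X ≤ x), and of the two-sample Kolmogorov–Smirnov statistic (the largest vertical distance between two empirical CDFs) is given below. Only the test statistic is computed; obtaining the p-value requires the KS distribution and is omitted. Function names and TSI values are invented for illustration.

```python
def ecdf(values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return sum(v <= x for v in values) / len(values)

def ks_statistic(a, b):
    """Two-sample KS statistic: maximal |F_a(x) - F_b(x)| over observed values."""
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))

tsi_targets = [10, 40, 60, 90]       # toy TSI values (%), fixations on targets
tsi_distractors = [-20, -5, 5, 20]   # toy TSI values (%), fixations on distractors

print(ecdf(tsi_distractors, 0))                    # 0.5
print(ks_statistic(tsi_targets, tsi_distractors))  # 0.75
```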

8.95±3.19% of all fixations in target-present trials that landed on an item (not in the blank) were consecutive (i.e., on the same item as the previous fixation), and 48.5±17.5% of these consecutive fixations were fixations on targets. Because there were fewer fixations on targets than on distractors overall, the percentage of consecutive fixations on targets (24.6±8.08%) was significantly higher than that on distractors (5.77±2.99%; two-tailed paired t-test: P < 10⁻⁵). However, even though there were multiple fixations on targets, the fixations were not always consecutive: we found that in 28.6±16.1% of trials with multiple fixations on targets, there were distractors in between fixations on targets, resulting in an average of 0.71±0.58 fixations (absolute count) on distractors between fixations on targets. Similar results could be derived when excluding consecutive fixations on targets and distractors: we could still identify 37 TS neurons (16.2%; binomial P < 10⁻¹⁰) with an average TSI of 32.4±19.8%. Lastly, when using only the first fixations on targets and distractors, we could still identify 30 TS neurons (13.2%; binomial P < 10⁻⁶) with an average TSI of 31.0±18.8%.

Moreover, we confirmed our results by excluding fixations in the margins (i.e., not on an item) of the search array (only including fixations fully within the item ROIs): we could identify 37 TS neurons (16.2%; binomial P < 10⁻¹⁰) with an average TSI of 45.6±29.0%. Notably, our TS neurons did not have a significant difference in firing rate between fixations on distractors and margins (two-tailed paired t-test: P = 0.74).

Lastly, similar to MTL neurons, aMCC TS neurons also had a significantly higher response to targets that were detected relative to those that were missed (TSI: 34.2±29.0% vs. 13.2±66.9%, mean±SD across cells, KS-test, P = 0.0081), and the response of aMCC TS neurons to missed targets was significantly larger than that to distractors (TSI: 13.2±66.9% vs. 0.0±0.0% (by default); KS-test, P < 10⁻⁶). Similarly, we found that pre-SMA TS neurons also had a significantly higher response to targets that were detected relative to those that were missed (TSI: 84.2±112.5% vs. −8.51±92.3%, mean±SD across cells, KS-test, P < 10⁻⁷). In addition, however, the response of pre-SMA TS neurons to missed targets was significantly lower than that to distractors (TSI: −8.51±92.3% vs. 0.0±0.0%; KS-test, P < 0.05), indicating that the response to missed targets was suppressed. Notably, when we compared between brain areas, we found that the TSI of neurons from the aMCC did not differ significantly from that of neurons from the MTL (P = 0.53 for detected targets, P = 0.84 for missed targets, and P = 0.91 for the difference between detected and missed targets), whereas the TSI of neurons from the pre-SMA was significantly higher for detected targets (P = 0.0014) and in particular for the difference between detected and missed targets (P = 0.012; but not missed targets only: P = 0.16). Together, these results show that TS neurons in MFC differentiate between detected vs. missed targets like MTL TS neurons. In addition, these data suggest that the pre-SMA was more strongly correlated with behavioral detection of targets relative to the MTL.

Depth of selectivity (DOS) index and visual selectivity

We quantified the DOS for each neuron:

$$\mathrm{DOS} = \frac{n - \left(\sum_{j=1}^{n} r_j\right) / r_{\max}}{n - 1} \qquad \text{(Eq. 3)}$$

where n is the number of categories (n = 2 and 14 for face-selective and category-selective neurons, respectively), r_j is the mean response to category j, and r_max is the maximal mean response. DOS varies from 0 to 1, with 0 indicating an equal response to all categories and 1 indicating an exclusive response to one category and no response to any other. Thus, a DOS value of 1 is equal to maximal sparseness.
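The DOS definition, (n minus the summed mean responses divided by the maximal mean response) divided by (n − 1), can be checked on a few limiting cases. The function name and the example responses below are illustrative, and the sketch assumes a positive maximal response.

```python
def depth_of_selectivity(mean_responses):
    """DOS from per-category mean responses (Eq. 3); assumes max response > 0."""
    n = len(mean_responses)
    r_max = max(mean_responses)
    return (n - sum(mean_responses) / r_max) / (n - 1)

print(depth_of_selectivity([5.0, 5.0, 5.0, 5.0]))  # 0.0, equal response to all categories
print(depth_of_selectivity([10.0, 0.0, 0.0]))      # 1.0, response to a single category
print(depth_of_selectivity([4.0, 2.0]))            # 0.5, intermediate selectivity
```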

When we selected visually selective neurons, we used a tighter fixation time window of 0 to 300 ms after fixation onset to ensure that visual inputs were restricted to fixations. However, using the identical time window for target neurons that included saccades before and after fixations, we derived qualitatively the same results.

To assess statistical significance, we estimated the null distribution by first randomly shuffling the category labels of fixations/target cues and then computing DOS. We used 1000 runs for the permutation analysis. We compared the observed DOS with this null distribution of DOS to obtain p-values.
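The label-shuffling procedure can be sketched as follows. This is a simplified illustration: the helper names, the fixed random seed, the toy firing rates, and the choice to report the p-value as the fraction of shuffled DOS values at least as large as the observed one are assumptions made here, since the text does not specify these details.

```python
import random
from statistics import mean

def depth_of_selectivity(mean_responses):
    """DOS from per-category mean responses; assumes max response > 0."""
    n = len(mean_responses)
    return (n - sum(mean_responses) / max(mean_responses)) / (n - 1)

def dos_from_labels(rates, labels):
    """Average rates within each category label, then compute DOS."""
    categories = sorted(set(labels))
    means = [mean([r for r, l in zip(rates, labels) if l == c]) for c in categories]
    return depth_of_selectivity(means)

def permutation_p_value(rates, labels, n_runs=1000, seed=0):
    """Compare the observed DOS with a null distribution from shuffled labels."""
    rng = random.Random(seed)
    observed = dos_from_labels(rates, labels)
    shuffled = list(labels)
    n_extreme = 0
    for _ in range(n_runs):
        rng.shuffle(shuffled)   # break the link between rate and category
        if dos_from_labels(rates, shuffled) >= observed:
            n_extreme += 1
    return observed, n_extreme / n_runs

# Toy data: 20 fixations on faces with high rates, 20 on non-faces with low rates.
rates = [9.0, 10.0, 11.0, 12.0] * 5 + [1.0, 2.0, 3.0, 4.0] * 5
labels = ["face"] * 20 + ["non-face"] * 20
observed_dos, p_value = permutation_p_value(rates, labels)
print(round(observed_dos, 3), p_value)
```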

Differential latency

We binned spike trains into 1-ms bins and computed the cumulative sum. We then averaged the cumulative sums for fixations on targets and distractors, respectively. We here only analyzed target-preferring neurons because distractor-preferring neurons had different temporal dynamics. We then compared, at every point of time, whether the cumulative sums of a group of neurons were different (P < 0.01, one-tailed pairwise t-test; FDR corrected). The first point of time of the significant cluster (cluster size > 10 time points) was used as the estimate of the differential latency. Note that this method is not sensitive to differences in baseline firing rate between neurons because the latency estimate is pairwise for each neuron individually.
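Two steps of this procedure lend themselves to a short sketch: building the cumulative sum of a spike train binned at 1 ms, and locating the first time point of the first significant cluster longer than 10 time points. The statistical test itself (one-tailed pairwise t-test with FDR correction) is not reproduced; the sketch takes a precomputed list of per-time-point significance flags as input. All names and toy values are illustrative assumptions.

```python
import math
from itertools import accumulate

def cumulative_spike_count(spike_times_ms, t_start_ms, t_stop_ms):
    """Bin spikes into 1-ms bins over [t_start_ms, t_stop_ms) and accumulate."""
    bins = [0] * (t_stop_ms - t_start_ms)
    for s in spike_times_ms:
        idx = math.floor(s) - t_start_ms
        if 0 <= idx < len(bins):
            bins[idx] += 1
    return list(accumulate(bins))

def first_cluster_onset(significant, min_size=10):
    """Index of the first run of True flags with more than min_size points."""
    run_start = None
    for i, flag in enumerate(list(significant) + [False]):  # sentinel closes last run
        if flag:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > min_size:
                return run_start
            run_start = None
    return None

# Toy spike train aligned to fixation onset (ms), window -200..100 ms.
cum = cumulative_spike_count([-150, -20, 5, 5, 30], -200, 100)
print(cum[49], cum[50], cum[-1])  # 0 1 5

# Toy significance mask: a short run of 3 points is ignored, a run of 12 counts.
mask = [False] * 5 + [True] * 3 + [False] * 2 + [True] * 12 + [False] * 3
print(first_cluster_onset(mask))  # 10
```

The returned index is relative to the start of the analysis window; adding the window start converts it to a latency relative to fixation onset.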

To compare latency between TS and category neurons, we selected a population of 27 neurons that differentiated fixations on faces vs. non-faces during search (two-tailed two-sample t-test, P < 0.05), so that they were comparable to TS neurons (i.e., both were selected based on fixations during search). The majority of this subset of neurons (17 neurons) had a greater response for faces than non-faces, and we focused on the neurons with faces as the preferred stimulus. In addition, we confirmed our results with the category neurons selected during cue presentation. We examined face-selective neurons, which had a clear preferred and non-preferred category. We conducted two analyses, both with fixations on targets. The first focused on face-selective neurons with faces as the preferred stimulus (i.e., neurons that increased their firing rate for faces). The second combined face-selective neurons with faces as the preferred stimulus and face-selective neurons with non-faces as the preferred stimulus by inverting the latter, so that the preferred response of all face-selective neurons was a firing rate increase.

To assess statistical significance, we estimated the null distribution by first randomly shuffling the labels for groups and then repeated the above latency analysis. We used 1000 runs for the permutation analysis. We compared the observed latency difference between groups with this null distribution of latency difference to obtain p-values.
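A group-level version of this shuffle can be sketched as below. The example is hypothetical. The latency estimator is passed in as `latency_fn` and is assumed to return a number for any group of neurons. The two-sided comparison, the fixed seed, and the (count + 1) / (runs + 1) estimator are also assumptions, because the text does not specify them.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

Neuron = TypeVar("Neuron")


def latency_difference_p(
    group_a: Sequence[Neuron],
    group_b: Sequence[Neuron],
    latency_fn: Callable[[Sequence[Neuron]], float],
    n_runs: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed latency difference b - a, permutation p-value)."""
    rng = random.Random(seed)
    observed = latency_fn(group_b) - latency_fn(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_at_least = 0
    for _ in range(n_runs):
        rng.shuffle(pooled)  # reassign neurons to groups, keeping group sizes
        diff = latency_fn(pooled[n_a:]) - latency_fn(pooled[:n_a])
        if abs(diff) >= abs(observed):  # assumed two-sided comparison
            n_at_least += 1
    return observed, (n_at_least + 1) / (n_runs + 1)
```

Here each element of a group stands for one neuron's data, and the whole latency analysis is rerun on every shuffled split, which is what makes the null distribution specific to the group-level estimate.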

Data and Software Availability

Data were processed with the spike detection and sorting toolbox OSort, which is available as open source. Data and custom MATLAB analysis scripts are available upon reasonable request.

Supplementary Material

Supplementary files 1 and 2.

  • A new class of neurons in the human MTL signals target detection during visual search
  • Response of target neurons is invariant to visual category
  • MFC target neurons precede MTL target neurons by ~200 ms
  • Target neurons do not depend on explicitly defined targets

Acknowledgements

We thank all patients for their participation, Juan Xu for creating the array item labels, and Ming Jiang for helping with the analyses. This research was supported by the Rockefeller Neuroscience Institute, the Autism Science Foundation and the Dana Foundation (to S.W.), the Simons Foundation (Simons Collaboration on the Global Brain Award 543015SPI to R.A.), an NSF CAREER award (1554105 to U.R.), and the NIMH (R01MH110831 to U.R. and Conte Center P50MH094258 to R.A.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Footnotes

Declaration of Interests

The authors declare no competing interests.

Wang et al. reveal a new class of neurons in the human medial temporal lobe that signal target detection during visual search. These neurons respond invariantly to visual category, predict whether targets are detected or missed, do not depend on explicitly defined targets, and likely have a frontal origin.


