Abstract
The ventral intraparietal area (VIP) of the macaque brain is a multimodal cortical region, with many cells tuned to both optic flow and vestibular stimuli. Responses of many VIP neurons also show robust correlations with perceptual judgments during a fine heading discrimination task. Previous studies have shown that heading tuning based on optic flow is represented in a clustered fashion in VIP. However, it is unknown whether vestibular self-motion selectivity is clustered in VIP. Moreover, it is not known whether stimulus- and choice-related signals in VIP show clustering in the context of a heading discrimination task. To address these issues, we compared the response characteristics of isolated single units (SUs) with those of the undifferentiated multiunit (MU) activity corresponding to several neighboring neurons recorded from the same microelectrode. We find that MU activity typically shows selectivity similar to that of simultaneously recorded SUs, for both the vestibular and visual stimulus conditions. In addition, the choice-related activity of MU signals, as quantified using choice probabilities, is correlated with the choice-related activity of SUs. Overall, these findings suggest that both sensory and choice-related signals regarding self-motion are clustered in VIP.
NEW & NOTEWORTHY We demonstrate, for the first time, that the vestibular tuning of ventral intraparietal area (VIP) neurons in response to both translational and rotational motion is clustered. In addition, heading discriminability and choice-related activity are also weakly clustered in VIP.
Keywords: clustering, heading perception, optic flow, vestibular, VIP
INTRODUCTION
A common organizational feature of the cerebral cortex is the grouping of neurons into functional clusters (typically 200–500 μm in size, also known as columns, domains, patches, puffs, etc.) in which nearby neurons have similar response properties (Mountcastle 1997). Because it may be necessary to pool responses of neurons with similar tuning to improve signal reliability (Darian-Smith et al. 1973; Zohary et al. 1994), a clustered organization may act as a framework to facilitate pooling and simplify connectivity (Chklovskii and Koulakov 2004; Darian-Smith et al. 1973; Zohary et al. 1994). Thus understanding which features are clustered in different brain areas may help to reveal the functional roles of these areas.
The ventral intraparietal area (VIP) in macaque monkeys is a multimodal cortical region in which neurons have responses to both optic flow and vestibular stimuli related to self-motion (Chen et al. 2011a; Maciokas and Britten 2010; Zhang and Britten 2010; Zhang et al. 2004). Previous studies have shown that heading tuning for optic flow stimuli in VIP is organized in a clustered manner (Zhang et al. 2004). Given that VIP also receives considerable input from area MSTd (Baizer et al. 1991; Boussaoud et al. 1990), which has a similar clustered organization for both vestibular and visual self-motion signals (Chen et al. 2008), we hypothesized that VIP may also contain a clustered representation of vestibular signals, which may facilitate decoding of self-motion from population activity.
Recent work has also demonstrated that some VIP neurons show improved discrimination thresholds during a multimodal heading discrimination task, in which both visual and vestibular self-motion cues are available for discriminating small variations in heading (Chen et al. 2013). In addition, responses of many VIP neurons show strong trial-by-trial correlations with perceptual decisions, suggesting that VIP carries robust signals that are related to what the animal actually perceives, rather than just the sensory stimulus (Chen et al. 2013). It is currently unknown whether VIP neurons are clustered according to their heading discrimination sensitivity and their choice-related activity.
To address these issues, we first examined the clustering of vestibular and visual self-motion selectivity by comparing the tuning of single units (SUs) with that of simultaneously recorded multiunit (MU) activity during a passive fixation task (Chen et al. 2011a). In addition, we examined the clustering of heading discriminability and choice-related signals obtained while monkeys performed a heading discrimination task (Chen et al. 2013). Our results indicate that both the sensory and choice-related signals regarding self-motion are clustered within area VIP.
METHODS
Subjects
Physiological experiments were performed in 7 hemispheres of 5 male rhesus monkeys (Macaca mulatta) weighing 6–10 kg. The surgical preparation, behavioral training, and electrophysiological recording procedures have been described in detail previously (Chen et al. 2011a, 2013). Briefly, each animal was chronically implanted with a lightweight plastic ring for head restraint and with a scleral search coil in at least one eye for monitoring eye movements inside a magnetic field (CNC Engineering, Seattle, WA). After recovery from surgery, animals were trained using standard operant conditioning procedures to perform a fixation task and a discrimination task, as described below. All animal surgeries and experimental procedures were approved by the Institutional Animal Care and Use Committee at Washington University and were in accordance with National Institutes of Health guidelines.
Apparatus and Motion Stimuli
During experiments, the monkey sat in a primate chair that was mounted on top of a six-degree-of-freedom motion platform (model 6DOF2000E; Moog, East Aurora, NY; Fig. 1A). Vestibular stimuli were delivered by movements of this platform, as described in detail previously (Chen et al. 2008; Gu et al. 2006). Visual stimuli were programmed in OpenGL and generated by an OpenGL accelerator board (Quadro FX 3000G; PNY Technologies). Stimuli were rear-projected (Mirage 2000; Christie Digital Systems, Cypress, CA) onto a tangent screen placed ~30 cm in front of the monkey, which subtended 90° × 90° of visual angle. Visual stimuli simulated self-motion through a 3-dimensional (3D) cloud of stars (100 cm wide, 100 cm tall, and 40 cm deep for the passive fixation task; 100 cm wide, 100 cm tall, and 50 cm deep for the heading discrimination task). A near clipping plane prevented stars from being rendered when they were closer than 5 cm to the animal’s eyes. Star density was 0.01/cm³, with each star being a 0.15-cm × 0.15-cm triangle. Stimuli were presented stereoscopically as red/green anaglyphs and were viewed through Kodak Wratten filters (red no. 29; green no. 61). The display contained a variety of depth cues, including horizontal disparity, motion parallax, and size information. The binocular disparity of the stars ranged from 32° crossed (nearest dots at the clipping plane distance of 5 cm) to 3° uncrossed. Thus the stimuli were well suited to activate VIP neurons, as reported previously (Bremmer et al. 2013; Colby et al. 1993; Yang et al. 2011).
Fig. 1.
Experimental setup, passive fixation task, and heading discrimination task. A: schematic illustration of the virtual reality apparatus. Monkeys were seated on a motion platform, with 6 degrees of freedom, that provided the vestibular stimulus. A projector mounted on the platform displayed images of 3D optic flow, and the field coil was used to monitor eye movements. B: passive fixation task for the translation and rotation protocols. Stimuli were delivered in 26 directions corresponding to all possible combinations of azimuth and elevation sampled, in 45° increments, from a sphere. The monkey fixated the central dot for 200 ms before the translational/rotational stimulus was presented and was required to maintain fixation during the entire stimulus period of 2-s duration. C: heading discrimination task. Stimuli were delivered in 9 logarithmically spaced headings within the horizontal plane, with 0° heading corresponding to straight forward motion. After the motion stimulus was delivered, the fixation point disappeared, 2 choice targets appeared, and the monkey was required to report his perceived heading (left vs. right) by making a saccade to one of two targets.
Experimental Protocol and Behavioral Tasks
We examined the 3D tuning of VIP neurons for both translation and rotation by recording neural responses to stimuli defined by optic flow or physical motion of the platform (Chen et al. 2011a). Once the 3D translation protocol was completed, and if good cell isolation was maintained, most VIP neurons from monkey J and a few cells from monkey U were then tested with a 3D rotation protocol. Other VIP neurons were instead tested using a heading discrimination task (monkey U and monkey C; details below) or a disparity task (monkey O and monkey P, as part of other studies). Thus the rotation protocol was only delivered to a subset (216 of 452) of the neurons for which we had completed the translation protocol.
In each real (vestibular) or simulated (visual) motion stimulus, the animal was translated along or rotated around 1 of 26 directions sampled evenly from a sphere (Fig. 1B). Each movement trajectory (either real or visually simulated) had a duration of 2 s and consisted of a Gaussian velocity profile. For the translation protocol, the amplitude was 13 cm (total displacement), with a peak acceleration of ~0.1 g (~0.98 m/s2) and a peak velocity of 30 cm/s. For the rotation protocol, the amplitude was 9°, with a peak angular velocity of ~20°/s (Chen et al. 2011a). These stimulus parameters far exceed vestibular detection thresholds (MacNeilage et al. 2010a, 2010b). All directions of motion of visual and vestibular stimuli were randomly interleaved within a single block of trials, and this was done in separate blocks for translation and rotation protocols. For each trial, the animal was required to fixate a central target (0.2° in diameter) for 200 ms before stimulus onset and was rewarded with a drop of juice at the end of each trial for maintaining fixation throughout stimulus presentation. Trials were aborted and data discarded when the monkey’s gaze deviated by >1° from the fixation target.
All five animals (monkeys C, J, O, P, U) were trained to perform the passive fixation task, and two of these animals (monkeys C and U) were also extensively trained to perform a heading discrimination task (Chen et al. 2013). The discrimination task involved probing a small range of headings (±9°, ±3.6°, ±1.44°, ±0.58°, 0°) that were chosen on the basis of psychophysical performance (for details, see Gu et al. 2007). In the discrimination task, monkeys were asked to report whether their perceived heading was leftward or rightward relative to straight ahead by making a saccade to one of two choice targets. For the discrimination task, the total displacement along the motion trajectory was 30 cm, with a peak acceleration of 0.1 g (~0.98 m/s2) and a peak velocity of 45 cm/s.
The data presented in this article include neurons tested with both the fixation and discrimination tasks. The passive fixation task during translation or rotation (Fig. 1B) was used to collect the data shown in Figs. 2–8. The heading discrimination task (Fig. 1C) was used to collect the data in Figs. 9–13. Note that data from the fixation task were available for each unit recorded in the discrimination task.
Fig. 2.
Example data from a VIP neuron tested in the vestibular (top row) and visual (bottom row) translation conditions during the passive fixation task. A: color contour maps show the 3D direction tuning profiles for SU (left column) and MU (right column) activity, plotted using the Lambert cylindrical equal-area projection (see methods). Mean firing rate is color coded. B: normalized (z scored) cross-correlation function between SU and MU responses. The abscissa represents the time lag between SU and MU spike trains, and the ordinate is the z-scored cross-correlation. Black and red curves represent the cross-correlation functions before (raw) and after SU spikes were excluded from the MU signal, respectively.
Fig. 8.

Population tuning curves (passive fixation task). Tuning curves show responses to 8 headings (azimuth angles) within the horizontal plane. For each SU with significant tuning, data were shifted horizontally to align the peaks. The corresponding MU data were aligned to the preferred azimuth of each SU before averaging. A: vestibular translation. B: visual translation. C: vestibular rotation. D: visual rotation. Open symbols represent the average SU responses with significant tuning, and filled symbols indicate the corresponding average MU responses (regardless of whether they were significantly tuned individually).
Fig. 9.

Tuning curves and neurometric functions for an example pair of SU and MU recordings during the heading discrimination task. A: heading tuning curves for an example SU (red, vestibular; blue, visual; green, combined). Zero heading corresponds to straight ahead, and positive/negative values correspond to rightward/leftward headings. B: responses of the simultaneously recorded MU activity during the discrimination task. C and D: neurometric functions computed from the SU (C) and MU (D) responses using ROC analysis. Smooth curves show best-fitting cumulative Gaussian functions.
Fig. 13.

A and B: comparison of heading preferences in the heading discrimination (abscissa) and passive fixation (ordinate) tasks for the vestibular (A; N = 29) and visual conditions (B; N = 49). For recordings in which SU and MU activity have heading preferences of the same sign (both leftward or both rightward), data are shown for the SU only. In cases where SU and MU activity have opposite heading preferences, data are shown for the one with the greater vector sum (fixation task data) or the greater magnitude of the regression coefficient (discrimination task data). C and D: comparison between CPs of SU and MU activity computed on the basis of heading preferences from tuning curves in the fixation task (C, vestibular; D, visual). Note that this analysis could not be performed for the combined condition.
Electrophysiological Recordings
We recorded extracellularly from single neurons using tungsten microelectrodes (Frederick Haer; tip diameter 3 μm, impedance 1–2 MΩ at 1 kHz). The microelectrode was advanced into the cortex through a transdural guide tube, using a hydraulic microdrive (Frederick Haer). Neural signals were amplified, bandpass filtered (400–5,000 Hz), and then isolated with a dual voltage-time window discriminator (Bak Electronics). Spike times and all behavioral events were recorded with 1-ms resolution. Raw neural signals were also digitized at a rate of 25 kHz using a CED Power 1401 (Cambridge Electronic Design) for offline spike sorting.
Area VIP was identified using a combination of magnetic resonance imaging scans, stereotaxic coordinates, white/gray matter transitions, and physiological response properties, as described in detail previously (Chen et al. 2011a). We first identified the medial tip of the intraparietal sulcus and then moved laterally until there was no longer a directionally selective visual response in the MU activity. At the anterior end of the VIP area, visually responsive neurons gave way to purely somatosensory cells in the fundus. At the posterior end, direction-selective neurons gave way to visual cells that were not selective for motion.
Data Analysis
SU and MU activity.
SUs were isolated online with a dual voltage-time window discriminator (Bak Electronics, Mount Airy, MD). MU activity was extracted offline from the digitized raw neural signals by setting a threshold such that the spontaneous firing rate of the MU activity was 50 spikes/s greater than that of the simultaneously recorded SU. To ensure the independence of the MU and SU signals, we subtracted one count from time bins of the MU signal that occurred within ±1 ms of each SU spike. The efficacy of this procedure was confirmed (see Figs. 2–4) by computing the cross-correlation function between simultaneous SU and MU recordings (Chen et al. 2008). The raw coincidence histograms were normalized by Z scoring so that estimates of correlation strength do not depend on firing rate (for details, see de Oliveira et al. 1997; Eggermont and Smith 1996). In cases in which more than one SU could be spike-sorted at a recording site, we included only the best-isolated SU for comparison with MU activity.
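The spike-removal and verification steps can be sketched as follows. This is a minimal numpy illustration, assuming spike trains stored as counts in 1-ms bins; the function names and the circular-correlation shortcut are ours, not the study's actual code.

```python
import numpy as np

def remove_su_from_mu(su_bins, mu_bins):
    """Subtract one count from MU time bins falling within +/-1 ms of each
    SU spike (1-ms bins assumed), clipping counts at zero."""
    mu_clean = np.asarray(mu_bins, int).copy()
    for t in np.flatnonzero(np.asarray(su_bins) > 0):
        lo, hi = max(t - 1, 0), min(t + 2, len(mu_clean))
        mu_clean[lo:hi] = np.maximum(mu_clean[lo:hi] - 1, 0)
    return mu_clean

def zscored_xcorr(su_bins, mu_bins, max_lag=50):
    """Z-scored cross-correlogram between SU and MU spike trains (circular
    shifts used here for brevity); z scoring keeps the correlation-strength
    estimate from scaling with firing rate."""
    lags = np.arange(-max_lag, max_lag + 1)
    raw = np.array([np.sum(su_bins * np.roll(mu_bins, lag)) for lag in lags],
                   float)
    return lags, (raw - raw.mean()) / raw.std()
```

After removal, a sharp zero-lag peak driven by SU spikes leaking into the MU signal should disappear, leaving only any broader peak due to common input.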
Fig. 4.

Average cross-correlation (Z scored) for all pairs of SU and MU recordings (passive fixation task, vestibular and visual data pooled), shown separately for translation (N = 425; A) and rotation (N = 134; B). Black and red curves show the average cross-correlation before and after SU spike removal from the MU signal, respectively.
Analysis of 3D tuning properties.
For each SU/MU recording, we first constructed peristimulus time histograms (PSTHs) for each direction of translation/rotation by using 25-ms time bins smoothed with a 400-ms boxcar filter. We then calculated the maximum response of the neuron across stimulus directions for each 25-ms time bin between 0.5 and 2 s after motion onset. This yields a peak response vector, R(t), which contains the maximal response across directions at each point in time. For each 400-ms time window, we performed ANOVA to assess the statistical significance of directional selectivity at each time point. This yields a vector, p(t), that summarizes the significance of direction tuning as a function of time.
We next identified the local maxima of R(t) using the following criteria: 1) a local maximum is the largest value within a given neighborhood, and 2) the ANOVA values from p(t) must be significant (P < 0.05) for five consecutive time bins centered on the putative local maximum. The set of local maxima so defined was then ranked by peak response value. We sought to identify a continuous temporal sequence of time bins for which direction tuning is significantly positively correlated with tuning at the local maximum (for details, see Chen et al. 2010). “Peak times” were defined as the times of local response maxima that met these criteria, corresponding to distinct epochs of directional tuning. Because the average peak times for SU (vestibular: 1.05 s; visual: 1.09 s) and MU (vestibular: 1.31 s; visual: 1.16 s) activity are similar, we chose the peak time with the largest SU response to define the analysis window for each pair of SU and MU recordings.
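The construction of R(t) and p(t) might be sketched as follows, assuming scipy is available, PSTHs stored as a directions × time-bins array, and single-trial rates grouped by direction; all names are illustrative.

```python
import numpy as np
from scipy import stats

def smoothed_psth(spike_counts, bin_ms=25, boxcar_ms=400):
    """Boxcar-smooth spike counts from 25-ms bins (one stimulus direction)."""
    w = boxcar_ms // bin_ms
    return np.convolve(spike_counts, np.ones(w) / w, mode="same")

def peak_response_vector(psths):
    """R(t): the maximal smoothed response across directions at each time
    bin. psths is an (n_directions, n_bins) array."""
    return psths.max(axis=0)

def direction_anova_p(trial_rates_by_direction):
    """One entry of p(t): one-way ANOVA across directions on single-trial
    firing rates within one 400-ms window."""
    return stats.f_oneway(*trial_rates_by_direction).pvalue
```

Local maxima of R(t) would then be screened by requiring direction_anova_p < 0.05 for five consecutive bins around each candidate peak.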
To visualize spatial tuning in 3D, mean firing rates measured during the 400-ms window centered at the peak time were transformed using the Lambert cylindrical equal-area projection (Snyder 1987) and then plotted as a function of azimuth and elevation in the form of color contour maps (Chen et al. 2011a). In these plots, the abscissa represents azimuth and the ordinate represents a cosine-transformed version of elevation.
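The projection itself is a one-line transform. In this sketch (our naming, not the authors' code), elevation is measured from the horizontal plane, so the equal-area ordinate is sin(elevation), equivalently the cosine of the angle from the vertical axis.

```python
import numpy as np

def lambert_point(azimuth_deg, elevation_deg):
    """Map an (azimuth, elevation) direction to the Lambert cylindrical
    equal-area plane: x is azimuth, y is sin(elevation), so equal areas on
    the sphere map to equal areas in the contour plot."""
    return azimuth_deg, np.sin(np.deg2rad(elevation_deg))
```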
The preferred direction of a neuron for each stimulus was described by the azimuth and elevation of the vector sum of the individual responses (after subtraction of spontaneous activity). In such a representation, the mean firing rate in each trial was considered to represent the magnitude of a 3D vector whose direction was defined by the azimuth and elevation angles of the particular stimulus (Gu et al. 2006). To plot the difference in 3D direction preference (|ΔPreferred direction|) between SU and MU responses (see Fig. 6, A and B), the data were again sinusoidally transformed such that random combinations of directions on a sphere would result in a flat distribution of |ΔPreferred direction|.
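The vector-sum preference and the |ΔPreferred direction| comparison can be sketched in numpy as follows; function names are ours, and mean_rates is assumed to be spontaneous-subtracted.

```python
import numpy as np

def preferred_direction(azim_deg, elev_deg, mean_rates):
    """Vector-sum direction preference: each stimulus direction contributes
    a 3D unit vector scaled by its (spontaneous-subtracted) mean response."""
    az = np.deg2rad(np.asarray(azim_deg, float))
    el = np.deg2rad(np.asarray(elev_deg, float))
    r = np.asarray(mean_rates, float)
    x = np.sum(r * np.cos(el) * np.cos(az))
    y = np.sum(r * np.cos(el) * np.sin(az))
    z = np.sum(r * np.sin(el))
    pref_az = np.degrees(np.arctan2(y, x)) % 360
    pref_el = np.degrees(np.arctan2(z, np.hypot(x, y)))
    return pref_az, pref_el

def angle_between(dir1, dir2):
    """|Delta preferred direction|: angle in degrees between two
    (azimuth, elevation) directions on the sphere."""
    def unit(az_el):
        az, el = np.deg2rad(az_el[0]), np.deg2rad(az_el[1])
        return np.array([np.cos(el) * np.cos(az),
                         np.cos(el) * np.sin(az),
                         np.sin(el)])
    c = np.clip(np.dot(unit(dir1), unit(dir2)), -1.0, 1.0)
    return np.degrees(np.arccos(c))
```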
Fig. 6.

Summary of differences in direction tuning between SU and MU activity (passive fixation task). A and B: histograms of the absolute difference in 3D direction preferences (|∆Preferred direction|) between SU and MU tuning for vestibular (N = 142/425; red) and visual (N = 202/425; blue) translation (A) and for vestibular (N = 41) and visual (N = 29) rotation (B) conditions (shown only for pairs with significant tuning). C and D: distributions of the correlation coefficient (RSU, MU) between SU and MU tuning curves (all neurons included) for vestibular and visual translation (C) and rotation (D) conditions. Solid and hatched bars indicate values of RSU, MU that are significantly and nonsignificantly different from zero, respectively.
The strength of directional tuning was quantified using a direction discrimination index (DDI; see Takahashi et al. 2007) given by

$$\mathrm{DDI} = \frac{R_{\max} - R_{\min}}{R_{\max} - R_{\min} + 2\sqrt{\mathrm{SSE}/(N - M)}}$$

where Rmax and Rmin are the maximum and minimum responses from the 3D tuning function, respectively. SSE is the sum squared error around the mean response, N is the total number of observations (trials), and M is the number of stimulus directions (M = 26). The DDI quantifies the reliability of a neuron for distinguishing between preferred and null motion directions, with a value that ranges from 0 to 1, corresponding to response modulations that range from poor to strong.
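A compact sketch of the DDI computation (illustrative naming; trial rates are assumed to be grouped by stimulus direction):

```python
import numpy as np

def ddi(responses_by_direction):
    """Direction discrimination index. responses_by_direction is a list of
    arrays of single-trial firing rates, one array per direction (M = 26
    in the protocol). Returns a value between 0 (flat) and 1 (reliable)."""
    means = np.array([r.mean() for r in responses_by_direction])
    r_max, r_min = means.max(), means.min()
    n = sum(len(r) for r in responses_by_direction)   # total trials
    m = len(responses_by_direction)                   # stimulus directions
    sse = sum(((r - r.mean()) ** 2).sum() for r in responses_by_direction)
    return (r_max - r_min) / (r_max - r_min + 2 * np.sqrt(sse / (n - m)))
```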
Analysis of heading sensitivity and choice-related activity.
To quantify the animal’s behavioral performance in the heading discrimination task, we plotted the proportion of “rightward” choices as a function of heading relative to straight ahead and then fitted the psychometric curve with a cumulative Gaussian function (for details, see Chen et al. 2013; Gu et al. 2007):

$$P(\text{rightward}) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{h - \mu}{\sqrt{2}\,\sigma}\right)\right]$$

where h denotes heading, µ denotes the mean of the underlying Gaussian distribution, and σ denotes the standard deviation (SD). The psychophysical threshold was defined as the SD of the cumulative Gaussian fit, which corresponds to 84% correct performance (assuming no bias).
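The fit can be sketched with scipy's nonlinear least squares; this is a minimal illustration (our function names), not the authors' fitting code.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def cum_gauss(h, mu, sigma):
    """Cumulative Gaussian psychometric function."""
    return 0.5 * (1 + erf((h - mu) / (np.sqrt(2) * sigma)))

def fit_psychometric(headings, p_rightward):
    """Fit heading vs. proportion-rightward data; returns (bias mu,
    threshold sigma), with sigma corresponding to 84% correct."""
    (mu, sigma), _ = curve_fit(cum_gauss, headings, p_rightward,
                               p0=[0.0, 1.0])
    return mu, abs(sigma)
```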
For the analyses of neural responses in the discrimination task, we first computed mean firing rates during the middle 400-ms interval of each stimulus presentation. Distributions of neuronal responses were tabulated for each heading, and receiver operating characteristic (ROC) analysis was used to compute the ability of an ideal observer to discriminate between each pair of headings (e.g., 4° leftward vs. 4° rightward) based solely on neuronal responses (Britten et al. 1992; Chen et al. 2013; Gu et al. 2008). ROC values were then plotted as a function of heading and were fit with a cumulative Gaussian function to compute the neuronal threshold for each SU or MU recording.
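The core of the ROC step, the probability that an ideal observer correctly orders a pair of responses drawn from the two heading distributions, can be sketched as follows (illustrative naming):

```python
import numpy as np

def roc_area(pref_responses, null_responses):
    """ROC area: probability that a randomly drawn preferred-direction
    response exceeds a randomly drawn null-direction response, with ties
    counted as half."""
    pref = np.asarray(pref_responses, float)[:, None]
    null = np.asarray(null_responses, float)[None, :]
    return float(np.mean(pref > null) + 0.5 * np.mean(pref == null))
```

ROC values for each heading pair would then be fit with the same cumulative Gaussian used for the psychometric data to yield a neuronal threshold.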
Predicted thresholds for the combined stimulus condition, assuming optimal (maximum likelihood) cue integration, were computed as (Ernst and Banks 2002)

$$\sigma_{\text{predicted}} = \sqrt{\frac{\sigma_{\text{vestibular}}^{2}\,\sigma_{\text{visual}}^{2}}{\sigma_{\text{vestibular}}^{2} + \sigma_{\text{visual}}^{2}}}$$

where σvestibular and σvisual represent neuronal thresholds for the vestibular and visual conditions, respectively.
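In code, the maximum-likelihood prediction is a one-liner; note that the predicted threshold is always at or below the smaller of the two single-cue thresholds.

```python
import numpy as np

def predicted_combined_threshold(sigma_vest, sigma_vis):
    """Maximum-likelihood (Ernst and Banks 2002) prediction for the
    combined-condition threshold from the two single-cue thresholds."""
    return np.sqrt((sigma_vest**2 * sigma_vis**2) /
                   (sigma_vest**2 + sigma_vis**2))
```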
To quantify the congruency between vestibular and visual heading tuning functions measured during the discrimination task, we calculated a congruency index (CI) (Chen et al. 2013; Fetsch et al. 2011; Gu et al. 2008) over the narrow range of headings tested. CI was defined as the product of Pearson correlation coefficients comparing firing rate with heading for the vestibular and visual conditions:

$$\mathrm{CI} = R_{\text{vestibular}} \times R_{\text{visual}}$$
CI ranges from −1 to 1, with values near 1 indicating that visual and vestibular tuning functions have a consistent slope and values near −1 indicating opposite slopes. CI values will be close to 0 when either tuning function is flat (or even-symmetric) over the range of headings tested. Note that CI reflects both the congruency of tuning and the steepness of the slopes of the tuning curves around straight ahead.
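A minimal sketch of the congruency index (our naming; inputs are mean rates at the tested headings for each modality):

```python
import numpy as np

def congruency_index(headings, rates_vest, rates_vis):
    """Product of the Pearson correlations between firing rate and heading
    for the vestibular and visual conditions; near +1 for congruent slopes,
    near -1 for opposite slopes, near 0 if either tuning curve is flat."""
    r_vest = np.corrcoef(headings, rates_vest)[0, 1]
    r_vis = np.corrcoef(headings, rates_vis)[0, 1]
    return r_vest * r_vis
```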
To determine whether neuronal responses covary with perceptual decisions, we also computed choice probabilities (CPs) using ROC analysis (for details, see Chen et al. 2013; Gu et al. 2007, 2008). For each heading, responses were sorted into two groups based on the choice that the animal made at the end of each trial: “preferred” choices were those made in favor of the neuron’s preferred heading, and “null” choices were those made in favor of the opposite direction. The preferred heading of each neuron was determined in two separate ways: from responses measured during the discrimination task (see Fig. 12) and from heading tuning curves obtained during passive fixation (see Fig. 13, C and D, and results for details).
Fig. 12.
Comparison between CPs computed from SU and MU activity during the heading discrimination task in the vestibular (A; N = 29), visual (B; N = 49), and combined conditions (C; N = 56). Filled symbols denote cases with CPs significantly (P < 0.05) different from chance (CP = 0.5) for both SU and MU activity, whereas open symbols show data sets with CPs that were not significantly different from 0.5 for either SU or MU. Arrowheads indicate geometric mean values. For this analysis, CPs were computed according to heading preferences in the discrimination task.
For each heading with at least three data points for each possible choice, responses were normalized (Z scored; see Kang and Maunsell 2012) by subtracting the mean response and dividing by the SD across stimulus repetitions. Z-scored responses were then pooled across headings into a single pair of distributions for preferred and null choices. ROC analysis on this pair of distributions yielded the grand CP. The statistical significance of CPs (whether they were significantly different from the chance level of 0.5, P < 0.05) was determined using a permutation test. To generate each permuted data set, each trial was randomly reassigned as either a preferred-choice or null-choice trial, according to the overall probability with which the monkey made that choice during the discrimination task. A CP value was then recomputed from each permuted data set using ROC analysis. For each neuron, this permutation process was repeated 1,000 times to construct a “random” choice probability distribution. A measured CP was considered to be significant if it fell outside the 95% confidence interval of the mean of this random choice probability distribution. For each stimulus condition (visual, vestibular, combined), CPs were computed on the basis of a neuron’s preferred sign of heading (leftward or rightward) for that particular stimulus condition (as determined from responses in the discrimination or fixation task). Hence, CPs are defined relative to the heading preference for each stimulus modality.
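The grand-CP computation can be sketched as follows. This is an illustrative numpy version (our naming); for brevity, the permutation test here shuffles choice labels within each heading, whereas the study instead reassigned labels randomly according to the monkey's overall choice probability.

```python
import numpy as np

def roc_area(pref, null):
    """ROC area between two response distributions, ties counted as half."""
    pref = np.asarray(pref, float)[:, None]
    null = np.asarray(null, float)[None, :]
    return float(np.mean(pref > null) + 0.5 * np.mean(pref == null))

def grand_cp(responses_by_heading, choices_by_heading):
    """Grand choice probability (z-scoring per Kang and Maunsell 2012).
    responses_by_heading: list of firing-rate arrays, one per heading;
    choices_by_heading: matching arrays of 1 (preferred) / 0 (null)."""
    pref, null = [], []
    for resp, choice in zip(responses_by_heading, choices_by_heading):
        resp, choice = np.asarray(resp, float), np.asarray(choice)
        # require at least 3 trials of each choice at this heading
        if (choice == 1).sum() < 3 or (choice == 0).sum() < 3:
            continue
        z = (resp - resp.mean()) / resp.std()  # z score across repetitions
        pref.extend(z[choice == 1])
        null.extend(z[choice == 0])
    return roc_area(pref, null)

def cp_permutation_test(responses_by_heading, choices_by_heading,
                        n_perm=1000, seed=0):
    """Build a null CP distribution by shuffling choice labels and ask
    whether the observed CP falls outside its central 95% interval."""
    rng = np.random.default_rng(seed)
    observed = grand_cp(responses_by_heading, choices_by_heading)
    null_cps = [grand_cp(responses_by_heading,
                         [rng.permutation(np.asarray(c))
                          for c in choices_by_heading])
                for _ in range(n_perm)]
    lo, hi = np.percentile(null_cps, [2.5, 97.5])
    return observed, not (lo <= observed <= hi)
```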
RESULTS
To test whether signals related to self-motion are clustered in VIP, we first analyzed the similarity of 3D direction selectivity between simultaneously recorded SU and MU activity. The SU data for this portion of the analysis were collected as part of previous studies (Chen et al. 2011a, 2011b). In addition, we measured the similarity of heading discriminability and choice-related activity between SU and MU signals during a heading discrimination task. The SU data for this task have also been described previously (Chen et al. 2013).
Clustering of Self-Motion Selectivity in VIP
In the 3D translation protocol, each VIP neuron was tested with 26 headings corresponding to all combinations of azimuth and elevation angles separated by 45° on a sphere (Fig. 1B). Figure 2A shows an example of 3D translation tuning for simultaneously recorded SU and MU activity in VIP. The data are shown as color contour maps in which mean firing rate is plotted as a function of azimuth and elevation (see Gu et al. 2006). These maps were computed in a 400-ms time window centered at the “peak time” of the SU response (see methods for details, as well as Chen et al. 2011a). Note that the peak MU responses are about twofold greater than the peak SU responses for this example recording.
In the vestibular condition, the SU (peak time: 1.01 s) shows clear spatial tuning for translation, with a preferred direction at 202° azimuth and −15° elevation (Fig. 2A, top left). A nearly opposite translation preference (measured at a peak time of 1.06 s) was seen in the visual condition for this SU (Fig. 2A, bottom left), with the direction preference occurring at 25° azimuth and −35° elevation. This pattern of results is typical of an “opposite” cell, as described previously (Chen et al. 2011a). The MU activity recorded simultaneously with this SU (Fig. 2A, right column) shows similar tuning for translational motion with a preferred heading of (187°, −7°) for the vestibular condition and (37°, −45°) for the visual condition. The absolute difference in 3D direction preference (|∆Preferred direction|) between SU and MU responses was 17° for the vestibular condition and 13° for the visual condition, indicating that SU and MU translation preferences are well matched.
To make sure that this similarity in heading preferences was not caused by SU spikes that also contributed to the MU activity, SU spikes were removed from the MU signal before 3D tuning profiles were computed (see methods for details). To confirm that this was successful, we computed a normalized (Z scored) cross-correlation (see de Oliveira et al. 1997; Eggermont and Smith 1996) between the SU and MU signals, as shown in Fig. 2B. Before SU spikes were removed, there was a sharp peak in the cross-correlogram centered at 0 ms. After SU spikes were removed, the cross-correlograms were much flatter, indicating that SU spikes were effectively excluded from the MU signal. Note, however, that cross-correlograms often are not completely flat following removal of SU spikes from the MU signal; rather, a portion of a broader correlogram peak sometimes remains. We presume that this broader peak results from the action of common inputs to the SU and MU signals, which cannot be removed completely by our procedure. Because SU spikes are effectively removed from the MU signal, the similarity in tuning between MU and SU activity shown in Fig. 2A suggests that nearby neurons in VIP have similar direction preferences for translational motion. An example data set with similar rotation tuning for SU and MU activity is shown in Fig. 3A.
Fig. 3.
Example data from a VIP neuron tested in the vestibular (top row) and visual (bottom row) rotation conditions during the passive fixation task. A: in the vestibular condition, the preferred directions computed as the vector sum of SU and MU responses were [azimuth, elevation] = [293, 20°] and [326, 21°], respectively. In the visual condition, the preferred direction was [139, 17°] for SU activity and [113, 36°] for the MU response. The |∆Preferred direction| between SU and MU responses was 30.4° for the vestibular condition and 29.3° for the visual condition. B: normalized (Z scored) cross-correlation function between SU and MU responses.
For the translation protocol, this analysis was performed for a total of 425 sites of simultaneous SU/MU recordings using single microelectrodes. The z-scored cross-correlation results did not differ significantly between the vestibular and visual conditions, either before or after SU spikes were removed (time bin: 1 ms; P = 0.59 before removal; P = 0.45 after removal; Wilcoxon rank sum test). Thus the average results across all data sets are summarized in Fig. 4A. Before SUs were excluded from the MU signals, there was a sharp correlation peak centered around 0 ms. After SUs were excluded, the cross-correlogram was relatively flat, suggesting that the artificial coupling between SU and MU signals was effectively removed (see also Chen et al. 2008; de Oliveira et al. 1997). Similar results were obtained for the rotation protocol (Fig. 4B).
The proportions of SU and MU responses with significant 3D heading tuning are summarized in Table 1. In the vestibular translation condition, 63% (268/425) of SUs and 42% (177/425) of MUs showed significant heading tuning (ANOVA, P < 0.05), and these proportions are slightly greater than those reported for area MSTd (Chen et al. 2008). In contrast, 76% (324/425) of SUs and 51% (218/425) of MUs were significantly tuned in the visual translation condition, proportions that are substantially lower than those seen in MSTd (Chen et al. 2008). These results are consistent with previous findings that SU responses to visual translation are less selective in VIP than in MSTd, whereas SU tuning for vestibular translation is stronger in VIP than MSTd (Chen et al. 2011b). When SU activity in VIP was significantly tuned for heading, the corresponding MU signal was also significantly tuned in 53% (142/268) of cases for the vestibular condition and 62% (202/324) of cases for the visual condition. When MU activity was significantly tuned, the SU response was also significantly tuned in 80% (142/177) and 93% (202/218) of cases for the vestibular and visual conditions, respectively. Thus both SU and MU selectivities for translational motion were slightly less prevalent in the vestibular condition.
Table 1.
Percentage of SU and MU with significant spatial tuning for translation
|  | MU: P ≤ 0.05 | MU: P > 0.05 |
|---|---|---|
| Vestibular | | |
| SU: P ≤ 0.05 | 142/425 (33%) | 126/425 (30%) |
| SU: P > 0.05 | 35/425 (8%) | 122/425 (29%) |
| Visual | | |
| SU: P ≤ 0.05 | 202/425 (47%) | 122/425 (29%) |
| SU: P > 0.05 | 16/425 (4%) | 85/425 (20%) |
Data are numbers and percentages of SU and MU with significant spatial tuning for translation (ANOVA, P ≤ 0.05) in vestibular (N = 425) and visual (N = 425) conditions.
For the rotation protocol, we analyzed data from a total of 134 SU/MU pairs for the vestibular and visual conditions (Table 2). Sixty-five percent (87/134) of SUs and 38% (51/134) of MUs were significantly tuned for vestibular rotation, compared with 55% (73/134) of SUs and 23% (31/134) of MUs for visual rotation (ANOVA, P < 0.05). When SU tuning was significant, MU tuning was also significant for 47% (41/87) of cases for the vestibular condition and 40% (29/73) of cases for the visual condition. When MU tuning was significant, SU tuning was also significant for 80% (41/51) of SUs in the vestibular condition and 94% (29/31) of cases in the visual condition. Thus the incidence of significant tuning in MU activity was similar for vestibular translation and rotation but was lower for visual rotation than for visual translation.
Table 2.
Percentage of SU and MU with significant spatial tuning for rotation
|  | MU: P ≤ 0.05 | MU: P > 0.05 |
|---|---|---|
| Vestibular | | |
| SU: P ≤ 0.05 | 41/134 (31%) | 46/134 (34%) |
| SU: P > 0.05 | 10/134 (7%) | 37/134 (28%) |
| Visual | | |
| SU: P ≤ 0.05 | 29/134 (22%) | 44/134 (33%) |
| SU: P > 0.05 | 2/134 (1%) | 59/134 (44%) |
Data are numbers and percentages of SU and MU with significant spatial tuning for rotation (ANOVA, P ≤ 0.05) in vestibular (N = 134) and visual (N = 134) conditions.
Next, we summarize the similarity of SU and MU heading tuning across the population of VIP neurons. To quantify response strength, we computed the difference between the peak response (across headings) and spontaneous activity (Rmax − spont) for both SU and MU signals. Figure 5A shows the comparison of this metric between SU and MU responses recorded simultaneously during the translation protocol. The vast majority of data points lie above the diagonal for both the vestibular (red) and visual (blue) conditions. The average MU/SU response ratios were 5.7 and 4.1 for the vestibular and visual conditions, respectively. In addition, response strength of MU activity is significantly correlated with that of SU activity for both vestibular (R = 0.29, P = 4.5 × 10−9, Spearman rank correlation) and visual conditions (R = 0.40, P = 2.2 × 10−16). Similarly, for the rotation protocol (Fig. 5B), the average ratios of MU/SU peak responses were 3.9 and 2.7 for the vestibular and visual conditions, respectively, and response strengths were significantly correlated between MU and SU signals for the visual (R = 0.42, P = 3.9 × 10−6) but not the vestibular conditions (R = 0.13, P = 0.14). Thus, for both the translation and rotation protocols, the average MU responses are substantially stronger than the average SU responses (paired t-tests, P < 0.001 for all 4 combinations of modality and protocol).
Fig. 5.

Quantitative summary of peak responses (Rmax − spont) and tuning strength (DDI) derived from SU and MU responses for the translation (N = 425) and rotation (N = 134) conditions (passive fixation task). A and B: comparison of SU and MU peak responses (spks/s, spikes per second) for translation (A) and rotation (B). C and D: comparison of SU and MU measures of tuning strength (DDI) for translation (C) and rotation (D). Red symbols denote vestibular condition; blue symbols denote visual condition. Filled and open symbols denote cells with and without significant 3D heading tuning, respectively (ANOVA, P < 0.05).
To quantify the strength of self-motion selectivity, we computed a signal-to-noise measure of tuning strength called DDI (see methods; see also Takahashi et al. 2007). Figure 5C compares DDI values computed from SU and MU responses for the translation protocol (N = 425). Filled symbols represent data sets with significant tuning for both SU and MU activity (ANOVA, P < 0.05), whereas open symbols indicate nonsignificant tuning (ANOVA, P > 0.05) for either SU or MU activity. Most data points tend to fall around or below the unity-slope diagonal. As a result, the mean DDI values for MU activity in the vestibular (0.54 ± 0.006, mean ± SE) and visual (0.57 ± 0.006) conditions were significantly less than the corresponding DDI values for SU activity in the vestibular (0.60 ± 0.005) and visual (0.64 ± 0.006) conditions (Wilcoxon signed-rank tests, P = 3.1 × 10−18 for vestibular, P = 1.4 × 10−29 for visual). MU and SU DDI values were significantly correlated for both the vestibular (R = 0.37, P = 3.5 × 10−15, Spearman rank correlation) and visual conditions (R = 0.53, P = 5.8 × 10−15), indicating that SUs tend to exhibit weaker tuning at sites where the MU response is poorly tuned. Thus cases of flat tuning in the MU signal do not result solely from a combination of SUs that are individually well tuned, but to different directions. Rather, it appears that weak MU tuning is associated, to some extent, with weaker SU tuning.
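The DDI can be computed as follows, assuming the standard definition from Takahashi et al. (2007): DDI = (Rmax − Rmin)/(Rmax − Rmin + 2·sqrt(SSE/(N − M))). This is a sketch; details (e.g., any square-root transform of firing rates) may differ from the authors' implementation.

```python
import numpy as np

def ddi(responses_by_direction):
    """Direction discrimination index (DDI), after Takahashi et al. (2007).
    SSE is the sum of squared deviations of single-trial responses from
    their per-direction means, N the total trial count, M the number of
    directions. Input: list of 1-D arrays, one per stimulus direction."""
    means = np.array([np.mean(r) for r in responses_by_direction])
    sse = sum(((np.asarray(r) - m) ** 2).sum()
              for r, m in zip(responses_by_direction, means))
    n = sum(len(r) for r in responses_by_direction)
    m = len(responses_by_direction)
    span = means.max() - means.min()  # Rmax - Rmin
    return span / (span + 2.0 * np.sqrt(sse / (n - m)))
```

Perfectly reliable, modulated tuning yields a DDI of 1; a flat tuning curve yields a DDI near 0.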
Figure 5D shows a similar comparison of SU and MU DDI values for the rotation protocol (N = 134). Mean DDI values for SU activity significantly exceeded those for MU activity in both the vestibular (0.59 ± 0.008 vs. 0.53 ± 0.008) and visual conditions (0.58 ± 0.011 vs. 0.51 ± 0.008; Wilcoxon signed-rank tests, P = 3.1 × 10−9 for vestibular, P = 4.0 × 10−11 for visual). Despite the smaller sample sizes, MU and SU DDI values were significantly correlated for both vestibular (R = 0.34, P = 4.9 × 10−5, Spearman rank correlation) and visual conditions (R = 0.45, P = 6.0 × 10−8). In addition, an analysis of covariance (ANCOVA) on the relationship between MU and SU DDI values revealed no significant interaction with translation/rotation protocol type (P = 0.98 for vestibular condition, P = 0.07 for visual condition, ANCOVA interaction effect), indicating that the extent of clustering appears to be comparable for rotation and translation.
We next consider the matching of 3D direction tuning preferences between SU and MU responses. For the translation protocol, 33% of data sets (142/425) had significant heading tuning for both SU and MU responses in the vestibular condition, whereas 48% (202/425) showed significant SU and MU tuning in the visual condition. For these data sets, we computed the smallest angle in 3D between the preferred direction vectors for SU and MU activity (|∆Preferred direction|), as described above. As shown in Fig. 6A, the distributions of |∆Preferred direction| were significantly nonuniform (vestibular: P < 0.001; visual: P < 0.001, uniformity test), with peaks close to 0° (vestibular: median value = 34°; visual: median value = 27°). A large majority of SU/MU pairs (vestibular: 71%; visual: 78%) have direction preferences within 60° of each other, indicating that vestibular and visual heading preferences are strongly clustered in VIP. For the rotation protocol, analogous data were available from 41 recordings with significantly tuned SU and MU activities in the vestibular condition, and 29 recordings with significantly tuned SU and MU signals in the visual condition. As shown in Fig. 6B, the distributions of |ΔPreferred direction| were again nonuniform (uniformity test, vestibular: P = 0.05; visual: P < 0.001), with peaks close to 0° (vestibular: median value = 62°; visual: median value = 31°). Thus, when both SU and MU activity in VIP are significantly tuned, the preferred direction vectors tend to be similar.
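The preferred-direction comparison rests on two standard computations: a vector-sum estimate of the 3D preference and the smallest 3D angle between two such vectors. The sketch below assumes the usual azimuth/elevation convention (azimuth in the horizontal plane, elevation positive upward); the function names are illustrative.

```python
import numpy as np

def preferred_direction(azimuths_deg, elevations_deg, rates):
    """3D preferred direction as the response-weighted vector sum of
    unit vectors over the stimulus grid (illustrative interface)."""
    az = np.radians(np.asarray(azimuths_deg, dtype=float))
    el = np.radians(np.asarray(elevations_deg, dtype=float))
    vecs = np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)], axis=1)
    v = (np.asarray(rates, dtype=float)[:, None] * vecs).sum(axis=0)
    return v / np.linalg.norm(v)

def angle_between_deg(u, v):
    """Smallest 3D angle (|dPreferred direction|) between two vectors."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))
```

Applied to simultaneously recorded SU and MU tuning, `angle_between_deg` gives the |∆Preferred direction| values plotted in Fig. 6, A and B.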
To further quantify the overall similarity of SU and MU tuning, we also computed the Pearson correlation coefficient between SU and MU tuning profiles (RSU,MU). Figure 6C illustrates distributions of RSU,MU for all simultaneously recorded SU and MU responses in the translation condition. For the vestibular condition, 48% (204/425) of RSU,MU values are significantly different from 0 (solid red bars, 188/204 are positive and 16/204 are negative), with an overall median value of 0.35. For the visual condition, the vast majority of RSU,MU values (279/425, 66%) are significantly different from zero (solid blue bars, 265/279 are positive and 14/279 are negative), with an overall median value of 0.53. This correlation analysis shows that clustering of selectivity is weaker for vestibular translation than for visual translation (Wilcoxon rank sum test, P = 1.2 × 10−7). Figure 6D shows analogous results for the rotation protocol. For the visual rotation condition, 82/134 data sets (61%) have RSU,MU values significantly different from 0, with an overall median value of 0.41. For the vestibular rotation condition, the values are smaller: 67/134 RSU,MU values are significantly different from 0 (50%), with an overall median value of 0.22.
It is worth emphasizing that imperfect spike sorting is highly unlikely to account for the strong similarities between SU and MU tuning that we have observed (Figs. 5 and 6). We estimate that imperfect spike sorting could account for, at most, a few percent of the MU events resulting from missed SU spikes, and typically much less (see discussion for details).
Clustering of the Congruency of Vestibular and Visual Selectivity in VIP
The analyses above consider the similarity of direction tuning between SU and MU responses separately for the visual and vestibular conditions. We now consider the congruency of vestibular and visual selectivity. Figure 7A shows the distribution of differences in heading preference between significantly tuned visual and vestibular responses for SUs recorded in the translation protocol. As reported previously (Chen et al. 2011a), this distribution is clearly bimodal (puni = 0.001; pbi = 0.30; modality test) such that 34% (77/226) of VIP neurons have visual and vestibular heading preferences that differ by less than 60° (“congruent” cells) and 38% (86/226) have preferences that differ by greater than 120° (“opposite” cells). A similar pattern of results can be seen for MU activity in Fig. 7B: 36% (70/197) of MUs were classified as congruent and 33% were classified as opposite, indicating that these two classes of visual/vestibular congruency are generally preserved in MU activity.
Fig. 7.
Comparison of multisensory congruency between SU and MU activity (passive fixation task). A and D: distribution of differences in 3D direction preferences (|∆Preferred direction|) between vestibular and visual translation (A) and rotation (D) conditions for SU responses. B and E: same as A and D, respectively, but for MU responses. C and F: comparison of |∆Preferred direction| between SU and MU activities for translation (C; R = 0.52, P = 2.6 × 10−11, N = 143) and rotation conditions (F; R = 0.40, P = 0.03, N = 30). Data are shown for recording sites with significant SU and MU tuning for both visual and vestibular conditions.
For the subset of recordings in which both SU and MU responses showed significant translation tuning in both the visual and vestibular conditions (N = 143), Fig. 7C compares the congruency of SU and MU tuning. Although there is considerable scatter, the congruency of MU responses is well correlated with that of SU responses (R = 0.52, P = 2.6 × 10−11, Spearman rank correlation). Thus both congruent and opposite cells appear to be clustered in area VIP. For the rotation protocol, the data are much sparser and there are more opposite cells than congruent cells, but the congruency of SU and MU tuning still tends to be similar (Fig. 7, D–F).
Population Tuning Curves
To provide a simple graphical summary of the results quantified in Figs. 5–7, we computed population tuning curves for SU and simultaneously recorded MU activity. To simplify the presentation, tuning curves were constructed only in the horizontal plane. This yields clear tuning for most neurons and allows us to more simply average across neurons (Chen et al. 2008; Fetsch et al. 2007; Gu et al. 2006). Figure 8A shows the average tuning curves for SU (open symbols) and MU (filled symbols) responses in the vestibular translation condition, with dashed and solid curves displaying the best fits of a wrapped Gaussian function (Yang and Maunsell 2004). For each SU with significant tuning (ANOVA, P < 0.05), spontaneous activity was subtracted and the data were shifted such that the maximal response occurs at zero azimuth. The MU data were then aligned to the azimuth preference of each SU before averaging. Thus, if there was no clustering of tuning in VIP, the MU population tuning curve should be flat.
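The alignment-and-fit procedure can be sketched as below. The wrapped Gaussian here is a minimal single-peaked variant (the function in Yang and Maunsell 2004 allows a second peak 180° away), and both helper names are illustrative.

```python
import numpy as np

def wrapped_gaussian(theta_deg, amp, mu, sigma, baseline):
    """Single-peaked wrapped Gaussian: Gaussians replicated every
    360 deg so the tuning curve is periodic in azimuth."""
    th = np.asarray(theta_deg, dtype=float)
    d = th[:, None] - mu + np.array([-360.0, 0.0, 360.0])
    return baseline + amp * np.exp(-0.5 * (d / sigma) ** 2).sum(axis=1)

def align_to_peak(theta_deg, rates):
    """Shift azimuths so the maximal response sits at 0 deg, as done
    before averaging SU (and yoked MU) tuning curves."""
    th = np.asarray(theta_deg, dtype=float)
    shift = th[int(np.argmax(rates))]
    return (th - shift + 180.0) % 360.0 - 180.0
```

After aligning each MU curve to its partner SU's azimuth preference, a flat average MU curve would indicate no clustering; a peaked one, as in Fig. 8, indicates clustering.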
Overall, the average MU responses show strong tuning that aligns well with SU responses for the translation condition (Fig. 8, A and B) such that the peaks of the wrapped Gaussian fits are closely aligned. The pattern of results is broadly similar, but weaker, for the rotation protocol, as shown in Fig. 8, C and D.
Clustering of Heading Perception-Related Signals in VIP
The analyses described above consider the similarity of sensory processing of self-motion signals between SU and MU activity. However, VIP responses do not depend solely on the self-motion stimuli; rather, many VIP neurons show responses that correlate strongly with heading percepts, independent of the stimulus (for details, see Chen et al. 2013). Thus we were also interested in whether signals related to heading discrimination and choice are clustered in VIP. For a subset of neurons with significant heading tuning in the horizontal plane, we further compared the similarity between SU and MU responses during a heading discrimination task in which monkeys reported whether their heading was right or left of straight forward. Heading stimuli were visual, vestibular, or a combination of both modalities.
Figure 9, A and B, shows an example of heading tuning measured during the discrimination task for simultaneously recorded SU and MU activity in VIP. Because only a narrow range of headings was tested in this task, the resulting heading tuning curves are approximately monotonic for all three stimulus conditions. Both SU and MU responses show congruent heading tuning, with stronger responses to leftward motion (negative headings) than to rightward motion (positive headings). With the use of signal detection theory (Chen et al. 2013; Gu et al. 2008), the tuning curves of Fig. 9, A and B, were transformed into neurometric functions, as shown in Fig. 9, C and D. The neuronal discrimination threshold for each stimulus condition was computed as the SD of the best-fitting cumulative Gaussian function. For the example SU (Fig. 9A), neuronal sensitivity is greater during combined visual/vestibular stimulation than when either single cue is presented alone, with a correspondingly smaller neuronal threshold for the combined condition (4.6°) than for the visual (13.5°) and vestibular (8.6°) conditions. Similar neuronal cue-integration effects were observed for the simultaneously recorded MU activity (Fig. 9B), with neuronal thresholds of 4.7° for the combined condition, 13.4° for the vestibular condition, and 23.2° for the visual condition. Thus both SU and MU activity could discriminate smaller variations in heading when both cues were presented together, indicating that nearby neurons in VIP can have similar heading discriminability.
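The neurometric transformation can be illustrated with a standard ROC computation and a cumulative-Gaussian fit. The crude grid-search fit below stands in for the maximum-likelihood fitting presumably used in the paper; all names are illustrative.

```python
import numpy as np
from math import erf

def roc_area(pref, null):
    """ROC area: probability that an ideal observer, given one trial
    from each distribution, identifies the preferred-direction one
    (Mann-Whitney U form; ties count 0.5)."""
    pref = np.asarray(pref, dtype=float)
    null = np.asarray(null, dtype=float)
    gt = (pref[:, None] > null[None, :]).mean()
    eq = (pref[:, None] == null[None, :]).mean()
    return gt + 0.5 * eq

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian neurometric function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * np.sqrt(2.0))))

def fit_threshold(headings, p_pref):
    """Coarse grid-search least-squares fit of a cumulative Gaussian;
    the neuronal threshold is the sigma of the best fit."""
    best_err, best_sigma = np.inf, None
    for mu in np.linspace(-5.0, 5.0, 41):
        for sigma in np.linspace(0.5, 40.0, 80):
            pred = np.array([cum_gauss(h, mu, sigma) for h in headings])
            err = ((pred - np.asarray(p_pref)) ** 2).sum()
            if err < best_err:
                best_err, best_sigma = err, sigma
    return best_sigma
```

A smaller fitted sigma corresponds to finer heading discriminability, as for the example SU's combined-condition threshold of 4.6°.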
The relative neuronal sensitivities of SU and MU responses are summarized in Fig. 10 for our sample of recordings during the heading discrimination task. For all three conditions, average neuronal thresholds for MU activity (geometric means ± SD: 28.43 ± 3.39 for vestibular, 15.48 ± 3.36 for visual, 16.32 ± 3.04 for combined) are greater than those for SU responses (13.95 ± 3.42 for vestibular, 14.09 ± 2.69 for visual, 10.26 ± 2.58 for combined). Neuronal thresholds for SU and MU activity are modestly but significantly correlated for the visual (Fig. 10B; R = 0.35, P = 0.02, Spearman rank correlation) and combined conditions (Fig. 10C; R = 0.29, P = 0.04) but only marginally correlated for the vestibular condition (Fig. 10A; R = 0.37, P = 0.06). These results indicate that SUs with above-average sensitivity are located in local regions of VIP where the MU activity is also more sensitive than average.
Fig. 10.
Comparison between SU and MU neuronal thresholds during the heading discrimination task for the vestibular (A; N = 27), visual (B; N = 45), and combined conditions (C; N = 50). Data are shown only for recordings in which both SU and MU responses showed significant heading tuning in the horizontal plane. Solid lines indicate type II linear regression fits to the data. Histograms along the margins show distributions of neuronal thresholds.
An important question to address at the population level is whether neuronal thresholds, like the monkey’s behavioral thresholds, are significantly lower for the combined condition than for the single-cue conditions, as would be expected from cue-integration theory. Given the examples in Fig. 9 and previous findings from SU recordings (Chen et al. 2013), we expect sensitivity in the combined condition to depend on the congruency of visual and vestibular heading tuning. Thus we computed a congruency index (CI) between visual and vestibular tuning functions obtained during the heading discrimination task (see methods). The ratio of the measured threshold in the combined condition to the optimal prediction is inversely correlated with the CI for SUs, as shown previously (Fig. 11, open symbols; data from Chen et al. 2013). Here, we show that the combined/predicted threshold ratio for MU activity has a similar dependence on CI (Fig. 11, filled symbols), with a significant negative correlation (R = −0.41, P = 0.02, Spearman rank correlation). This negative correlation means that neurons with large positive CIs have thresholds close to the optimal prediction (ratios near unity), whereas neurons with negative CIs generally have combined/predicted threshold ratios above unity. Thus MUs with highly congruent tuning, like SUs in VIP, show neuronal sensitivity in the combined condition that matches the theoretical prediction. Indeed, an ANCOVA performed on the logarithm of the combined/predicted threshold ratio revealed no significant interaction between SU/MU signal type and congruency index (P = 0.18, ANCOVA interaction effect), indicating that the negative correlation with CI is not significantly different between SU and MU activity. Thus the neural correlates of improved sensitivity during cue integration are clearly revealed in MU responses.
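The "optimal prediction" referenced here is the standard maximum-likelihood cue-combination rule, under which single-cue thresholds combine as reciprocals of variances; a one-line sketch (the function name is illustrative):

```python
import numpy as np

def predicted_combined_threshold(sigma_ves, sigma_vis):
    """Optimal (maximum-likelihood) cue-integration prediction:
    sigma_comb^2 = sigma_ves^2 * sigma_vis^2 / (sigma_ves^2 + sigma_vis^2).
    The combined/predicted threshold ratio in Fig. 11 divides the
    measured combined threshold by this value."""
    return np.sqrt((sigma_ves ** 2 * sigma_vis ** 2) /
                   (sigma_ves ** 2 + sigma_vis ** 2))
```

For example, under this rule single-cue thresholds of 8.6° and 13.5° (the example SU above) predict a combined threshold of about 7.3°, always below the smaller single-cue threshold.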
Fig. 11.

Dependence of cue-integration effects on the congruency of visual and vestibular tuning for both SU and MU responses during the heading discrimination task. The ratio of the threshold measured in the combined condition to the optimally predicted threshold is plotted as a function of the congruency index (see methods for details). Open and filled symbols represent SU (N = 56) and MU (N = 35) data, respectively. Bold lines indicate type II linear regressions (SU: dashed line; MU: solid line). The thin dashed horizontal line indicates a threshold ratio of unity. Note that only neurons with significant tuning under both the vestibular and visual conditions were included in this analysis.
The analyses of Figs. 9–11 quantify the sensitivity of SU and MU responses to small stimulus variations. We now consider the relationship between SU/MU activity and perceptual decisions, independent of the stimulus (Gu et al. 2007, 2008). For this purpose, we computed choice probabilities (CPs; Britten et al. 1996) to quantify how well neural activity can predict the monkey’s choices. A CP significantly greater than chance (0.5) indicates that the neuron fires more strongly when the monkey makes a choice in favor of the cell’s heading preference. For the example SU in Fig. 9, CP values are 0.76, 0.59, and 0.73 for the vestibular, visual, and combined conditions, respectively. These CP values are significantly different from 0.5 for the vestibular and combined conditions (vestibular: P = 0.002; combined: P < 0.001, permutation tests), but CP for the visual condition does not reach significance (P = 0.078). The simultaneously recorded MU activity in this experiment also shows highly significant choice-related activity, with CP values of 0.77, 0.71, and 0.85 for the vestibular, visual, and combined conditions, respectively (P < 0.001 for all 3 conditions, permutation tests). Thus nearby neurons in VIP can have similar correlations with choice.
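CP is the ROC area between choice-sorted response distributions, with significance assessed by permutation. The sketch below implements the basic Britten et al. (1996) metric for a single stimulus level; the paper's analysis additionally combines trials across stimulus values (e.g., by z-scoring), which is omitted here.

```python
import numpy as np

def choice_probability(resp, choices, n_perm=1000, seed=0):
    """CP as ROC area between responses on preferred-choice trials
    (choices == 1) and null-choice trials (choices == 0), plus a
    two-sided permutation p value. Illustrative single-level sketch."""
    resp = np.asarray(resp, dtype=float)
    choices = np.asarray(choices)

    def roc(r, c):
        a, b = r[c == 1], r[c == 0]
        return (a[:, None] > b).mean() + 0.5 * (a[:, None] == b).mean()

    cp = roc(resp, choices)
    rng = np.random.default_rng(seed)
    null = np.array([roc(resp, rng.permutation(choices))
                     for _ in range(n_perm)])
    p = np.mean(np.abs(null - 0.5) >= np.abs(cp - 0.5))
    return cp, p
```

A CP of 0.5 indicates no choice-related modulation; values near 1 (like the example MU's 0.85 in the combined condition) indicate strong covariation with choice.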
Figure 12 compares CP values between SU and MU activity across the population of recording sites in VIP. Filled symbols represent data sets with significant CPs for both SU and MU responses, whereas open symbols indicate a nonsignificant CP for either SU or MU. For the vestibular condition, the average CP for MU activity (0.58 ± 0.030, mean ± SE) is not significantly different from that for SU responses (0.63 ± 0.036; P = 0.13, Wilcoxon signed-rank test). MU and SU CP values are modestly but significantly correlated for the vestibular condition (R = 0.44, P = 0.02, Spearman rank correlation). For the visual and combined conditions, data points are distributed rather evenly around the unity-slope diagonal (Fig. 12, B and C). Accordingly, the average CPs for MU activity (0.58 ± 0.015 for the visual condition, 0.57 ± 0.024 for the combined condition) are not significantly different from the corresponding average CPs for SU responses (0.55 ± 0.020 for visual; 0.61 ± 0.023 for combined; P = 0.23 for visual; P = 0.18 for combined, Wilcoxon signed-rank tests). CP values for MU and SU responses are significantly correlated for the combined condition (R = 0.48, P = 1.66 × 10−4), but not for the visual condition (R = 0.21, P = 0.14).
In the analyses of Fig. 12, CP values were computed with respect to each neuron’s heading preferences measured during the discrimination task (e.g., Fig. 9, A and B). However, responses measured during the discrimination task are likely to reflect influences of both stimulus and choice. Recent work (Zaidel et al. 2017) has shown that CPs can be biased when choice-related signals are sufficiently strong to overwhelm stimulus-related responses. This confound affects CPs for many VIP neurons (Zaidel et al. 2017). To avoid this confound in our assessment of clustering, we also computed CPs based on heading preferences computed for each neuron from data measured during the fixation task (such data were available only for the vestibular and visual conditions).
The heading preference of each unit during the passive fixation task was computed as the vector sum of mean responses to each stimulus. In contrast, because the range of headings was narrow during the discrimination task, the heading preference (left vs. right of straight ahead) during the discrimination task was measured by regressing neuronal responses against heading such that a regression coefficient >0 indicates a rightward heading preference. Heading preferences for the fixation and discrimination tasks are compared in Fig. 13, A and B. All recordings with significant vestibular (Fig. 13A) or visual (Fig. 13B) heading tuning for both SU and MU responses in the horizontal plane were included in this comparison. Most cells have the same sign of heading preference relative to straight forward (Fig. 13, A and B, top right and bottom left quadrants); thus CP values for the two tasks would be the same for these neurons. For cases that fall in the top left and bottom right quadrants of Fig. 13, A and B, the CP value for the discrimination task would be given by the difference between unity and the CP value from the fixation task.
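The two ways of defining heading preference can be sketched as follows. The regression form follows the text (slope > 0 indicates a rightward preference), while the vector-sum form is the standard circular estimate assumed for the fixation data; both function names are illustrative.

```python
import numpy as np

def heading_pref_sign_discrimination(headings_deg, rates):
    """Left/right preference from the discrimination task: sign of the
    slope of firing rate regressed on heading (positive = rightward)."""
    slope = np.polyfit(np.asarray(headings_deg, dtype=float),
                       np.asarray(rates, dtype=float), 1)[0]
    return 'right' if slope > 0 else 'left'

def heading_pref_azimuth_fixation(azimuths_deg, rates):
    """Preferred azimuth during fixation: direction of the vector sum
    of mean responses across tested headings."""
    az = np.radians(np.asarray(azimuths_deg, dtype=float))
    r = np.asarray(rates, dtype=float)
    return np.degrees(np.arctan2((r * np.sin(az)).sum(),
                                 (r * np.cos(az)).sum()))
```

Units whose two estimates agree in sign (Fig. 13, A and B, top right and bottom left quadrants) keep the same CP under both definitions; for the discordant quadrants, the fixation-based CP equals one minus the discrimination-based CP.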
Based on heading preferences determined during the fixation task, CPs for SUs (0.61 ± 0.039 for vestibular, 0.52 ± 0.021 for visual, means ± SE) are not significantly different from the corresponding CPs for MU responses (0.54 ± 0.032 for vestibular, 0.55 ± 0.017 for visual; P = 0.07 for vestibular, P = 0.52 for visual, Wilcoxon signed-rank test). In addition, CPs for SU and MU responses are significantly correlated for both the vestibular (R = 0.54, P = 0.002, Spearman rank correlation) and visual conditions (R = 0.36, P = 0.01; Fig. 13, C and D). Overall, mean CP values tend to be somewhat lower when computed on the basis of tuning in the fixation condition (Fig. 13, C and D) compared with those in Fig. 12, consistent with the findings of Zaidel et al. (2017). Furthermore, there was no correlation between DDI and CPs (measured using 2 different methods) for SU/MU responses in either vestibular or visual conditions (Spearman rank correlation, P > 0.23). Together, the results of Figs. 10–13 show that measures of heading sensitivity and choice-related activity are weakly to moderately clustered in VIP.
DISCUSSION
By comparing the activity of isolated SUs with the aggregate MU activity of nearby neurons recorded on the same electrode, we have examined the clustering of self-motion tuning properties in area VIP. Our results confirm previous findings of clustered optic flow selectivity in area VIP (Zhang et al. 2004) and demonstrate, for the first time, that the vestibular tuning of VIP neurons in response to both translational and rotational motion is also clustered. In addition, we demonstrate that heading discriminability and choice-related activity are also clustered, at least weakly, in VIP. Although this analysis cannot establish whether the clustering reflects a columnar architecture, it does suggest that area VIP may contain a topographic map of translational and rotational motion based on both sensory and perception-related signals.
Clustering of Optic Flow and Vestibular Selectivity in Area VIP
Zhang and Britten (2004) assessed the clustering of VIP neurons for selectivity to translational optic flow by comparing the tuning of single units with that of the local multiunit activity; thus their data can be best compared with our visual translation condition. Zhang and Britten (2004) calculated the correlation coefficient between the heading tuning curves of single-unit and multiunit responses. The average correlation coefficient was 0.61 for 18 pairs of SU and MU responses, and 89% of cases were significantly correlated, although they did not report the percentage of significant visual tuning. In our data, for 202 paired recordings with significant heading tuning for both SU and MU responses (P < 0.05, ANOVA), 89% (179/202) of cases were significantly correlated, and the average correlation coefficient was 0.60. Thus the clustering of translational optic flow selectivity is very similar between our study and that of Zhang and Britten (2004). This similarity occurs despite the fact that our neurons were tested with a 3D stimulus set, whereas Zhang and Britten measured heading tuning curves over a relatively narrow range of headings around straight ahead in the horizontal plane (typically −30 to +30°) and during pursuit eye movements.
In addition to clustering of optic flow responses, we have shown that VIP neurons also show some clustering for vestibular signals. Furthermore, multisensory visual-vestibular response interactions also occurred in clusters. When an SU is significantly tuned to heading in both the visual and vestibular conditions, the MU activity is also tuned for both vestibular and visual stimuli for 40% of the sites. Since receptive fields in VIP for different modalities usually overlap (Duhamel et al. 1998), a clustered organization might facilitate the process of multisensory integration.
Interestingly, SUs tend to exhibit weaker tuning for both the translation and rotation protocols at sites where the MU response is poorly tuned (Fig. 5, C and D). These data argue strongly against the possibility that flat MU tuning results from a combination of SUs that are individually well tuned, but to different directions. This appears to be somewhat different from the case of orientation pinwheels in V1 (Maldonado et al. 1997), where the aggregate (MU) response is poorly tuned but SUs are generally well tuned to a variety of different orientations. In contrast, in VIP, SUs are much more likely to exhibit poor selectivity at sites where MU tuning is poor. Thus MU responses are a reasonably good predictor of SU preferences.
Independence of MU and SU Signals in Clustering Analysis
To minimize dependencies between MU and SU signals, we subtract one count from time bins of the MU signal that occurred within ±1 ms of each SU spike, thus effectively removing all SU spike events from the MU data stream. This approach should work well even when online spike sorting is imperfect. If the window discriminator threshold was set too low, such that some MU events erroneously entered the SU data stream, those events would still be removed by our procedure because they would be coincident in the SU and MU event records. On the other hand, if the window discriminator threshold was set too high, such that some SU spikes were missing from the SU signal and thereby included in the MU event record, these events would not be removed from the MU signal by our procedure. However, given that the fraction of SU spikes that would be missed this way is modest (probably at most ~5–10%) and that MU event rates are a few times greater than SU response rates, such a contribution of missed SU spikes to the MU signal would be very small (at most a few percent, and typically much less).
To further examine whether imperfect spike sorting could give rise to artifactual clustering, we used the raw analog data to re-sort both SU and MU activity offline; this was done for a subset of recordings in the 3D translation tuning protocol (N = 191) for which SU isolation was especially good. The DDI values of SU and MU responses were significantly correlated for both the vestibular (R = 0.45, P = 9.6 × 10−11, Spearman rank correlation) and visual conditions (R = 0.53, P = 6.0 × 10−15), similar to results for the entire population (Fig. 5, C and D). For the subset of these 191 data sets that showed significant SU and MU tuning (vestibular: 26%; visual: 69%), the distributions of |ΔPreferred direction| again had peaks close to 0° (median values: 42° for the vestibular condition, 34° for the visual condition). In addition, heading tuning profiles computed from SU and MU signals generally showed significant positive correlations (RSU,MU), with median values of 0.32 and 0.50 for the vestibular and visual conditions, respectively. Thus all of our main findings were confirmed for a subset of recordings with the best SU isolation.
Clustering of Neuronal Sensitivity and Choice-Related Signals in Area VIP
For many years, CPs have been interpreted as reflecting a causal contribution of neurons to performance of a particular sensory task. However, it has become increasingly clear that the interpretation of CPs is complex. Several studies have revealed that, in sensory cortex, neuronal activity can be modulated both by bottom-up sensory information and by top-down signals related to choice (Britten et al. 1996; Cumming and Nienborg 2016; Kwon et al. 2016; Nienborg et al. 2012; Shadlen et al. 1996; Yang et al. 2016), which makes it difficult to unravel their respective contributions. For example, if a neuron has a weak rightward heading preference and its response is strongly enhanced by leftward choice signals, then the tuning curve measured during the discrimination task may incorrectly suggest that the neuron prefers leftward headings. In such cases, if responses measured during the discrimination task are used to define heading preference in the computation of CP, then CP will be artificially inflated above 0.5 (Zaidel et al. 2017). To assess the impact of this problem, we computed CPs in two ways: one in which heading preference was determined from responses during the discrimination task (Fig. 12) and another in which heading preference was determined from a separate test conducted during passive visual fixation (Fig. 13, C and D). The relatively larger CPs obtained with the former method indicate that CP values were artificially inflated for some neurons, as also described recently by Zaidel et al. (2017). Intriguingly, the CPs of MU signals decreased significantly when computed on the basis of heading preferences from the fixation task (P = 0.004, Wilcoxon signed-rank test, visual and vestibular conditions pooled), whereas CPs for SUs did not change significantly (P = 0.21, Wilcoxon signed-rank test). This may reflect the fact that the heading tuning of MU signals was weaker than that of SUs, making it easier for the true sensory tuning to be overridden by top-down choice signals.
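The dependence of CP on how heading preference is defined can be illustrated with a minimal sketch. CP is conventionally computed as the area under the ROC curve comparing choice-conditioned firing-rate distributions; the firing rates below are hypothetical and this is not the study's analysis code, but it shows why labeling the "preferred" choice from task responses versus fixation responses flips the same data around 0.5:

```python
def roc_area(pref_choice_rates, null_choice_rates):
    """Area under the ROC curve comparing two firing-rate distributions.

    Equivalent to the probability that a rate drawn from trials ending in
    the neuron's preferred choice exceeds a rate from null-choice trials
    (ties count half) -- the standard choice-probability metric.
    """
    pairs = [(p, n) for p in pref_choice_rates for n in null_choice_rates]
    greater = sum(p > n for p, n in pairs)
    ties = sum(p == n for p, n in pairs)
    return (greater + 0.5 * ties) / len(pairs)

# Hypothetical trial-by-trial rates (spikes/s) for an ambiguous heading:
rates_right_choice = [12, 15, 11, 14, 13]
rates_left_choice = [9, 10, 8, 11, 10]

# If the neuron's measured preference is rightward, right-choice trials
# form the "preferred" distribution:
cp_right_pref = roc_area(rates_right_choice, rates_left_choice)

# If a separate fixation-task measurement instead labels the neuron
# leftward, the same trials yield the complementary value:
cp_left_pref = roc_area(rates_left_choice, rates_right_choice)

print(cp_right_pref, cp_left_pref)  # the two values sum to 1
```

A neuron whose weak sensory preference is overridden by choice signals during the task can thus land on opposite sides of 0.5 depending on which measurement defines its preference, which is the inflation effect described by Zaidel et al. (2017).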
Functional Significance of Clustering of Both Sensory and Choice-Related Signals in VIP
The VIP area of primates has been considered to be a large associative cortical region in which signals from different sensory modalities are integrated to support perceptual processes such as space perception and perception of the body schema (Avillac et al. 2005; Chen et al. 2013; Cooke et al. 2003; Schlack et al. 2002). However, multisensory integration is not limited to sensory inputs; it can be affected by a host of cognitive processes, including attention, memory, and prior experience. The clustering of both sensory and perception-related signals in VIP may provide a way for such top-down factors to influence the processing of multisensory signals; e.g., a stimulus presented in one modality can affect the processing of an accessory stimulus presented in another modality, either because of its task relevance (Busse et al. 2005; Donohue et al. 2013) or because of learned associations (Fiebelkorn et al. 2010; Molholm et al. 2007). One possible explanation is that multisensory integration involves higher-order networks that actively maintain a mental model of the environment and generate predictions about expected sensory input. Sensory processing itself may be adjusted on the basis of the (dis)agreement between the actual sensory input and the activity predicted by the mental model, and this prediction error may in turn drive an update of the mental model itself. However, the potential roles of VIP in such processes remain to be examined.
GRANTS
This work was supported by National Basic Research Program of China Grants 31371029 and 31571121 and Shanghai Education Committee of Scientific Research Innovation Grants 15JC1400104 and 16JC1400100. G. C. DeAngelis was supported by National Institutes of Health (NIH) Grant EY016178. D. E. Angelaki was supported by NIH Grant DC014678.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
A.C. conceived and designed research; A.C. performed experiments; M.S. and A.C. analyzed data; M.S., G.C.D., D.E.A., and A.C. interpreted results of experiments; M.S. prepared figures; M.S. drafted manuscript; M.S., G.C.D., D.E.A., and A.C. edited and revised manuscript; M.S., G.C.D., D.E.A., and A.C. approved final version of manuscript.
REFERENCES
- Avillac M, Denève S, Olivier E, Pouget A, Duhamel JR. Reference frames for representing visual and tactile locations in parietal cortex. Nat Neurosci 8: 941–949, 2005. doi: 10.1038/nn1480.
- Baizer JS, Ungerleider LG, Desimone R. Organization of visual inputs to the inferior temporal and posterior parietal cortex in macaques. J Neurosci 11: 168–190, 1991.
- Boussaoud D, Ungerleider LG, Desimone R. Pathways for motion analysis: cortical connections of the medial superior temporal and fundus of the superior temporal visual areas in the macaque. J Comp Neurol 296: 462–495, 1990. doi: 10.1002/cne.902960311.
- Bremmer F, Schlack A, Kaminiarz A, Hoffmann KP. Encoding of movement in near extrapersonal space in primate area VIP. Front Behav Neurosci 7: 8, 2013. doi: 10.3389/fnbeh.2013.00008.
- Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. A relationship between behavioral choice and the visual responses of neurons in macaque MT. Vis Neurosci 13: 87–100, 1996. doi: 10.1017/S095252380000715X.
- Britten KH, Shadlen MN, Newsome WT, Movshon JA. The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci 12: 4745–4765, 1992.
- Busse L, Roberts KC, Crist RE, Weissman DH, Woldorff MG. The spread of attention across modalities and space in a multisensory object. Proc Natl Acad Sci USA 102: 18751–18756, 2005. doi: 10.1073/pnas.0507704102.
- Chen A, DeAngelis GC, Angelaki DE. Macaque parieto-insular vestibular cortex: responses to self-motion and optic flow. J Neurosci 30: 3022–3042, 2010. doi: 10.1523/JNEUROSCI.4029-09.2010.
- Chen A, DeAngelis GC, Angelaki DE. Representation of vestibular and visual cues to self-motion in ventral intraparietal cortex. J Neurosci 31: 12036–12052, 2011a. doi: 10.1523/JNEUROSCI.0395-11.2011.
- Chen A, DeAngelis GC, Angelaki DE. A comparison of vestibular spatiotemporal tuning in macaque parietoinsular vestibular cortex, ventral intraparietal area, and medial superior temporal area. J Neurosci 31: 3082–3094, 2011b. doi: 10.1523/JNEUROSCI.4476-10.2011.
- Chen A, DeAngelis GC, Angelaki DE. Functional specializations of the ventral intraparietal area for multisensory heading discrimination. J Neurosci 33: 3567–3581, 2013. doi: 10.1523/JNEUROSCI.4522-12.2013.
- Chen A, Gu Y, Takahashi K, Angelaki DE, DeAngelis GC. Clustering of self-motion selectivity and visual response properties in macaque area MSTd. J Neurophysiol 100: 2669–2683, 2008. doi: 10.1152/jn.90705.2008.
- Chklovskii DB, Koulakov AA. Maps in the brain: what can we learn from them? Annu Rev Neurosci 27: 369–392, 2004. doi: 10.1146/annurev.neuro.27.070203.144226.
- Colby CL, Duhamel JR, Goldberg ME. Ventral intraparietal area of the macaque: anatomic location and visual response properties. J Neurophysiol 69: 902–914, 1993. doi: 10.1152/jn.1993.69.3.902.
- Cooke DF, Taylor CS, Moore T, Graziano MS. Complex movements evoked by microstimulation of the ventral intraparietal area. Proc Natl Acad Sci USA 100: 6163–6168, 2003. doi: 10.1073/pnas.1031751100.
- Cumming BG, Nienborg H. Feedforward and feedback sources of choice probability in neural population responses. Curr Opin Neurobiol 37: 126–132, 2016. doi: 10.1016/j.conb.2016.01.009.
- Darian-Smith I, Johnson KO, Dykes R. “Cold” fiber population innervating palmar and digital skin of the monkey: responses to cooling pulses. J Neurophysiol 36: 325–346, 1973. doi: 10.1152/jn.1973.36.2.325.
- de Oliveira SC, Thiele A, Hoffmann KP. Synchronization of neuronal activity during stimulus expectation in a direction discrimination task. J Neurosci 17: 9248–9260, 1997.
- Donohue SE, Todisco AE, Woldorff MG. The rapid distraction of attentional resources toward the source of incongruent stimulus input during multisensory conflict. J Cogn Neurosci 25: 623–635, 2013. doi: 10.1162/jocn_a_00336.
- Duhamel JR, Colby CL, Goldberg ME. Ventral intraparietal area of the macaque: congruent visual and somatic response properties. J Neurophysiol 79: 126–136, 1998. doi: 10.1152/jn.1998.79.1.126.
- Eggermont JJ, Smith GM. Neural connectivity only accounts for a small part of neural correlation in auditory cortex. Exp Brain Res 110: 379–391, 1996. doi: 10.1007/BF00229138.
- Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415: 429–433, 2002. doi: 10.1038/415429a.
- Fetsch CR, Pouget A, DeAngelis GC, Angelaki DE. Neural correlates of reliability-based cue weighting during multisensory integration. Nat Neurosci 15: 146–154, 2011. doi: 10.1038/nn.2983.
- Fetsch CR, Wang S, Gu Y, DeAngelis GC, Angelaki DE. Spatial reference frames of visual, vestibular, and multimodal heading signals in the dorsal subdivision of the medial superior temporal area. J Neurosci 27: 700–712, 2007. doi: 10.1523/JNEUROSCI.3553-06.2007.
- Fiebelkorn IC, Foxe JJ, Molholm S. Dual mechanisms for the cross-sensory spread of attention: how much do learned associations matter? Cereb Cortex 20: 109–120, 2010. doi: 10.1111/j.1460-9568.2010.07196.x.
- Gu Y, Angelaki DE, DeAngelis GC. Neural correlates of multisensory cue integration in macaque MSTd. Nat Neurosci 11: 1201–1210, 2008. doi: 10.1038/nn.2191.
- Gu Y, DeAngelis GC, Angelaki DE. A functional link between area MSTd and heading perception based on vestibular signals. Nat Neurosci 10: 1038–1047, 2007. doi: 10.1038/nn1935.
- Gu Y, Watkins PV, Angelaki DE, DeAngelis GC. Visual and nonvisual contributions to three-dimensional heading selectivity in the medial superior temporal area. J Neurosci 26: 73–85, 2006. doi: 10.1523/JNEUROSCI.2356-05.2006.
- Kang I, Maunsell JH. Potential confounds in estimating trial-to-trial correlations between neuronal response and behavior using choice probabilities. J Neurophysiol 108: 3403–3415, 2012. doi: 10.1152/jn.00471.2012.
- Kwon SE, Yang H, Minamisawa G, O’Connor DH. Sensory and decision-related activity propagate in a cortical feedback loop during touch perception. Nat Neurosci 19: 1243–1249, 2016. doi: 10.1038/nn.4356.
- Maciokas JB, Britten KH. Extrastriate area MST and parietal area VIP similarly represent forward headings. J Neurophysiol 104: 239–247, 2010. doi: 10.1152/jn.01083.2009.
- MacNeilage PR, Banks MS, DeAngelis GC, Angelaki DE. Vestibular heading discrimination and sensitivity to linear acceleration in head and world coordinates. J Neurosci 30: 9084–9094, 2010a. doi: 10.1523/JNEUROSCI.1304-10.2010.
- MacNeilage PR, Turner AH, Angelaki DE. Canal-otolith interactions and detection thresholds of linear and angular components during curved-path self-motion. J Neurophysiol 104: 765–773, 2010b. doi: 10.1152/jn.01067.2009.
- Maldonado PE, Gödecke I, Gray CM, Bonhoeffer T. Orientation selectivity in pinwheel centers in cat striate cortex. Science 276: 1551–1555, 1997. doi: 10.1126/science.276.5318.1551.
- Molholm S, Martinez A, Shpaner M, Foxe JJ. Object-based attention is multisensory: co-activation of an object’s representations in ignored sensory modalities. Eur J Neurosci 26: 499–509, 2007. doi: 10.1111/j.1460-9568.2007.05668.x.
- Mountcastle VB. The columnar organization of the neocortex. Brain 120: 701–722, 1997. doi: 10.1093/brain/120.4.701.
- Nienborg H, Cohen MR, Cumming BG. Decision-related activity in sensory neurons: correlations among neurons and with behavior. Annu Rev Neurosci 35: 463–483, 2012. doi: 10.1146/annurev-neuro-062111-150403.
- Schlack A, Hoffmann KP, Bremmer F. Interaction of linear vestibular and visual stimulation in the macaque ventral intraparietal area (VIP). Eur J Neurosci 16: 1877–1886, 2002. doi: 10.1046/j.1460-9568.2002.02251.x.
- Shadlen MN, Britten KH, Newsome WT, Movshon JA. A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J Neurosci 16: 1486–1510, 1996.
- Snyder JP. Map Projections: A Working Manual (U.S. Geological Survey Professional Paper 1395). Washington, DC: U.S. Geological Survey, 1987.
- Takahashi K, Gu Y, May PJ, Newlands SD, DeAngelis GC, Angelaki DE. Multimodal coding of three-dimensional rotation and translation in area MSTd: comparison of visual and vestibular selectivity. J Neurosci 27: 9742–9756, 2007. doi: 10.1523/JNEUROSCI.0817-07.2007.
- Yang H, Kwon SE, Severson KS, O’Connor DH. Origins of choice-related activity in mouse somatosensory cortex. Nat Neurosci 19: 127–134, 2016. doi: 10.1038/nn.4183.
- Yang T, Maunsell JH. The effect of perceptual learning on neuronal responses in monkey visual area V4. J Neurosci 24: 1617–1626, 2004. doi: 10.1523/JNEUROSCI.4442-03.2004.
- Yang Y, Liu S, Chowdhury SA, DeAngelis GC, Angelaki DE. Binocular disparity tuning and visual-vestibular congruency of multisensory neurons in macaque parietal cortex. J Neurosci 31: 17905–17916, 2011. doi: 10.1523/JNEUROSCI.4032-11.2011.
- Zaidel A, DeAngelis GC, Angelaki DE. Decoupled choice-driven and stimulus-related activity in parietal neurons may be misrepresented by choice probabilities. Nat Commun 8: 715, 2017. doi: 10.1038/s41467-017-00766-3.
- Zhang T, Britten KH. Clustering of selectivity for optic flow in the ventral intraparietal area. Neuroreport 15: 1941–1945, 2004.
- Zhang T, Britten KH. The responses of VIP neurons are sufficiently sensitive to support heading judgments. J Neurophysiol 103: 1865–1873, 2010. doi: 10.1152/jn.00401.2009.
- Zhang T, Heuer HW, Britten KH. Parietal area VIP neuronal responses to heading stimuli are encoded in head-centered coordinates. Neuron 42: 993–1001, 2004. doi: 10.1016/j.neuron.2004.06.008.
- Zohary E, Shadlen MN, Newsome WT. Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370: 140–143, 1994. [Erratum. Nature 371: 358, 1994.] doi: 10.1038/370140a0.