Understanding multivariate brain activity: Evaluating the effect of voxelwise noise correlations on population codes in functional magnetic resonance imaging

Ru-Yuan Zhang; Xue-Xin Wei; Kendrick Kay

doi:10.1371/journal.pcbi.1008153

PLoS Comput Biol. 2020 Aug 18;16(8):e1008153. doi: 10.1371/journal.pcbi.1008153

Editor: Saad Jbabdi⁵

PMCID: PMC7454976 PMID: 32810133

Abstract

Previous studies in neurophysiology have shown that neurons exhibit trial-by-trial correlated activity and that such noise correlations (NCs) greatly impact the accuracy of population codes. Meanwhile, multivariate pattern analysis (MVPA) has become a mainstream approach in functional magnetic resonance imaging (fMRI), but it remains unclear how NCs between voxels influence MVPA performance. Here, we tackle this issue by combining voxel-encoding modeling and MVPA. We focus on a well-established form of NC, tuning-compatible noise correlation (TCNC), whose sign and magnitude are systematically related to the tuning similarity between two units. We show that this form of voxelwise NCs can improve MVPA performance if NCs are sufficiently strong. We also confirm these results using standard information-theoretic analyses in computational neuroscience. In the same theoretical framework, we further demonstrate that the effects of noise correlations at both the neuronal level and the voxel level may manifest differently in typical fMRI data, and their effects are modulated by tuning heterogeneity. Our results provide a theoretical foundation to understand the effect of correlated activity on population codes in macroscopic fMRI data. Our results also suggest that future fMRI research could benefit from a closer examination of the correlational structure of multivariate responses, which is not directly revealed by conventional MVPA approaches.

Author summary

Noise correlation (NC) is a key component of multivariate response distributions, and characterizing its effects on population codes is therefore a cornerstone for understanding probabilistic computation in the brain. Despite extensive studies of NCs in neurophysiology, little is known about their role in functional magnetic resonance imaging (fMRI). We characterize the effect of voxelwise NC by building voxel-encoding models and directly quantifying the amount of information in simulated multivariate fMRI data. In contrast to the detrimental effects of NC implied in neurophysiological studies, we find that voxelwise NCs can enhance information codes if NC is sufficiently strong. Our work highlights the important role of noise correlations in deciphering population codes using fMRI.

Introduction

Understanding how neural populations encode information and guide behavior is a central question in modern neuroscience. In a neuronal population, many units exhibit correlated activity, and this likely reflects an important feature of information coding in the brain. In computational neuroscience, researchers have investigated the relationship between signal correlation (SC), referring to the similarity between the tuning functions of two neurons, and noise correlation (NC), referring to the correlation between two neurons’ trial-by-trial responses evoked by repetitive presentations of the same stimulus [1–3].

Previous studies in neurophysiology have discovered that neurons that share similar tuning functions (i.e., a positive SC) also tend to have a weak positive NC, a pervasive phenomenon across several brain regions [4–11]. In this paper, we denote this type of NC as tuning-compatible noise correlation (TCNC) because the sign and the magnitude of the NC are systematically related to the SC between a pair of neurons. A large body of theoretical and empirical work has shown that NCs have a substantial impact on population codes. For example, the seminal study by Zohary et al. [12] demonstrated that TCNCs limit the amount of information in a neural population because the noise is shared by neurons and cannot simply be averaged out. Later, researchers realized that this detrimental effect of TCNC is mediated by other factors, such as the form of NC, the heterogeneity of tuning functions, and its relevance to behavior [2, 13–16].

The study of NCs in the brain has been historically impeded by technical barriers to measuring simultaneously the activity of many neurons in neurophysiological experiments. In contrast, functional magnetic resonance imaging (fMRI) naturally measures the activity of many neural populations throughout the entire brain. Imaging scientists often use multivariate pattern analysis (MVPA) to assess the accuracy of population codes [17, 18]. However, above-chance decoding performance in MVPA does not specify the detailed representational structure underlying multivariate voxel responses. For example, Fig 1 illustrates a simple two-voxel scenario in multivariate decoding. The decoding accuracy in the original state (Fig 1A) can be improved (e.g., by attention, learning) via either the further separation of mean responses (Fig 1B) or the changes to the covariance geometry (Fig 1C). This example highlights the impact of the shape of the response distribution on population codes and these effects cannot be easily disentangled by the conventional MVPA approach [19].

Fig 1 — The pool consists of two responsive voxels and the two color disks represent the trial-by-trial response distributions evoked by two different stimuli. Panel A illustrates the original state of the population responses. Decoding performance can be improved via either a bigger separation of the mean population response (panel B) or changes in the covariance structure (panel C). Representational structures in panels B and C indicate improved population codes but have distinct underlying mechanisms. Panel D illustrates that certain covariance changes can worsen decoding.
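The two-voxel scenario can be made concrete with a small numerical sketch. For two Gaussian response distributions that share a covariance matrix, the discriminability of an optimal linear readout is d' = sqrt(Δμᵀ Σ⁻¹ Δμ). The means, variances, and correlations below are hypothetical values chosen only for illustration; they are not taken from the paper or from Fig 1.

```python
import numpy as np

def linear_discriminability(mu1, mu2, cov):
    """d' of the optimal linear readout for two Gaussian classes sharing `cov`."""
    delta = np.asarray(mu2, dtype=float) - np.asarray(mu1, dtype=float)
    return float(np.sqrt(delta @ np.linalg.solve(cov, delta)))

def cov2(rho, sd=1.0):
    """2x2 covariance with equal variances sd**2 and correlation rho."""
    return sd ** 2 * np.array([[1.0, rho], [rho, 1.0]])

# Hypothetical numbers for two voxels and two stimuli (illustration only).
d_original = linear_discriminability([0, 0], [1, 1], cov2(0.0))   # baseline state
d_means = linear_discriminability([0, 0], [2, 2], cov2(0.0))      # larger mean separation
d_cov = linear_discriminability([0, 0], [1, 1], cov2(-0.5))       # covariance change only
d_aligned = linear_discriminability([0, 0], [1, 1], cov2(0.5))    # noise aligned with signal

print(d_original, d_means, d_cov, d_aligned)
```

In this toy case, both a larger mean separation and a change in correlation alone raise d' above the baseline, whereas a correlation aligned with the signal direction lowers it. A decoding accuracy by itself does not reveal which of these changes occurred.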

The magnitude and the structure of NCs in fMRI data remain largely unknown. It has been shown that NCs influence MVPA accuracy and that certain types of classifiers can compensate for NCs [20], but the precise nature of NCs has not yet been thoroughly characterized. There have been a few recent investigations of NCs. A study by Ryu and Lee [21] evaluated the impact of three factors (retinotopic distance, cortical distance, and tuning similarity) on voxelwise NCs in early visual cortex, and found that tuning similarity is the major determinant of voxelwise NCs. Furthermore, van Bergen and Jehee [22] systematically evaluated voxelwise NCs in human V1 to V3 and showed that the magnitude of NCs monotonically increases as tuning similarity increases. In addition, one recent study found that a multivariate classifier can exploit voxelwise NCs to decode population information [23]. Our recent work showed that voxelwise noise correlations generally enhance the amount of information in a limited pool in human early visual cortex [24]. These results provide specific evidence supporting the existence of voxelwise TCNC, and suggest that a deeper understanding of how NC manifests in fMRI data is critical for studying probabilistic neural computation using multivariate fMRI data [22, 25].

In the present study, we combine MVPA and the voxel-encoding modeling approach to assess how the magnitude and form of NCs impact population codes in fMRI data. Similar to prior theoretical work in neurophysiology, we aim to derive the theoretical bound of the effects of voxelwise NCs on population codes in multivariate voxel responses. We assess the accuracy of population codes by MVPA and information-theoretic analyses. The voxel-encoding model used in this study allows us to systematically manipulate response parameters (i.e., voxel tuning) so as to examine NCs in different scenarios [26]. We first assess the quantitative relationship between decoding accuracy and the strength of NCs. We then directly calculate the amount of information as a function of NCs in a voxel population. Both methods demonstrate that the accuracy of population codes in fMRI data follows a U-shaped function as the strength of TCNC increases. Notably, all these analyses in voxel populations are compared against classical findings in neuronal populations. We show that the effects of NCs on population codes are strongly mediated by tuning heterogeneity in voxel populations.

Materials and methods

Previous endeavors of brain decoding generally fall into two broad categories: classification of stimuli into discrete categories [27] and estimation of a continuous stimulus variable [28]. We thus evaluated the effect of NC in brain decoding in two tasks—a stimulus-classification task and a stimulus-estimation task. We will first introduce the simulation on a neuronal population and then specify the voxel-encoding model used to generate simulated responses of a voxel population (see Fig 2).

Fig 2 — The neuron-encoding model (panel A) proposes a neuronal population with orientation-selective tuning curves. Each neuron has Poisson-like response variance and the noise correlation between two neurons can be specified with different structures and strength (see Materials and Methods). The voxel-encoding model proposes a similar neuronal population and the response of a single voxel is the linear combination of the responses of multiple neurons. The noise correlation between two voxels can be specified using similar methods (see Materials and Methods). Note that voxelwise NCs can come from the response variability at both neuronal and voxel levels (see Fig 6). Using the neuron- and the voxel-encoding models, we can generate many trials of neuronal and voxel population responses and perform conventional MVPA on the simulated data. The goal is to examine multivariate decoding performance as a function of the NC structure and strength between either neurons or voxels.

Assessment of effects of noise correlations in neuronal populations

Neuron-encoding model

The neuron-encoding model assumes a pool of orientation-selective neurons whose preferred orientations are equally spaced within [1°, 180°]. We manipulated the number of neurons in our simulations. Throughout the paper, all orientations are expressed as angles in degrees within [1°, 180°]. The tuning curves of the neurons can be described as:

$$g_k(s) = \alpha + \beta \, e^{\gamma \left( \cos\left( \frac{\pi}{90} (s - \varphi_k) \right) - 1 \right)} \tag{1}$$

where g_k(s) is the tuning function of the k-th neuron. s is the stimulus. φ_k indicates the preferred orientation of the k-th neuron. α is the baseline firing rate, β controls the response range, and γ controls the width of the tuning curve. We set the parameter values α = 1, β = 19, and γ = 2, resulting in a tuning curve with the maximum firing rate at 20 spikes per second. This tuning curve is consistent with previous theoretical work [29] and empirical measurements in the primary visual cortex in primates [5].
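As a quick check of Eq 1 and the stated parameter values, the following minimal NumPy sketch evaluates the tuning curves of a 180-neuron pool. The function names and the spacing rule for preferred orientations (multiples of 180/n degrees) are assumptions made for this illustration; the text only states that preferred orientations are equally spaced.

```python
import numpy as np

ALPHA, BETA, GAMMA = 1.0, 19.0, 2.0  # parameter values stated in the text

def preferred_orientations(n_neurons):
    """Equally spaced preferred orientations in degrees (assumed spacing: 180 / n)."""
    return np.arange(1, n_neurons + 1) * 180.0 / n_neurons

def tuning(s, phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Eq 1: mean responses to stimuli `s` for neurons with preferred orientations `phi`.

    Returns an array of shape (len(s), len(phi)).
    """
    s = np.atleast_1d(np.asarray(s, dtype=float))[:, None]
    phi = np.atleast_1d(np.asarray(phi, dtype=float))[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

phi = preferred_orientations(180)
G = tuning(np.arange(1, 181), phi)  # rows: stimuli 1..180 deg, columns: neurons

# The peak is alpha + beta; the trough, 90 deg away, is alpha + beta * exp(-2 * gamma).
print(G.max(), G.min())
```

The peak equals α + β = 20 spikes per second, matching the text, and the minimum response is reached for stimuli orthogonal to the preferred orientation.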

Based on this setting, the mean of neuronal population responses given stimulus s can be represented by G(s) = [g_k(s)]. However, empirically measured neuronal responses vary trial-by-trial. We posit that the mean of trial-by-trial population responses is G(s). We will detail the covariance in the following section.

Noise correlation and covariance

We proposed three types of NCs for neuronal data (see Table 1): angular-based tuning compatible noise correlation (aTCNC), curve-based tuning compatible noise correlation (cTCNC) and shuffled noise correlation (SFNC).

Table 1. List of symbols.

Symbol	Meaning
NC	Noise correlation
SC	Signal correlation
fMRI	Functional magnetic resonance imaging
MVPA	Multivariate pattern analysis
TCNC	Tuning-compatible noise correlation
aTCNC	Angular-based tuning-compatible noise correlation
cTCNC	Curve-based tuning-compatible noise correlation
SFNC	Shuffled noise correlation
R^aTCNC	Angular-based tuning-compatible noise correlation matrix
R^cTCNC	Curve-based tuning-compatible noise correlation matrix
R^SFNC	Shuffled noise correlation matrix
c_neuron	Noise correlation coefficient between neurons
c_vxs	Noise correlation coefficient between voxels
c_homo	Voxel tuning heterogeneity coefficient
W	Linear weighting matrix from neuronal to voxel responses


Several theoretical studies assume the NC between a pair of neurons is an exponential function of the angular difference between their preferred orientations, here defined as angular-based tuning compatible noise correlation (aTCNC):

$$r_{ij}^{aTCNC} = e^{-\frac{|\varphi_i - \varphi_j|}{L} \cdot \frac{90}{\pi}} \tag{2}$$

where $r_{ij}^{aTCNC}$ is the NC between the i-th and the j-th neurons, and φ_i and φ_j are their preferred orientations. This equation specifies that the NC between two neurons diminishes as their preferred orientations move farther apart. The parameter L controls the rate of this decay. We denote the correlation matrix as R^aTCNC. Here we set L = 1 for simplicity. Ecker et al. [29] have shown that the parametric form of NC and the value of L do not qualitatively change the result of the simulation, as long as the generated correlation matrix is positive definite. Note that by this definition aTCNCs are always positive (i.e., within the range 0 to 1; see also Fig 3A).

Fig 3 — Example noise correlation matrices simulated in a neuronal (panels A-C) and a voxel population (D, E). In the neuronal population (180 neurons), the angular-based TCNC matrix, the curve-based TCNC matrix, and the SFNC matrix are illustrated from left to right. Neurons are sorted according to their preferred orientation from 1 to 180°. In the voxel population (180 voxels), the curve-based TCNC matrix and the SFNC matrix are illustrated. Note that we do not sort the voxels according to their tuning preferences. The NC coefficients (c_neuron or c_vxs) are set to 1 in matrices from A-E. Panels F-H illustrate the cTCNC matrices with NC coefficient (c_neuron) values 0, 0.5 and 1, respectively. Note that panels B and H are identical.

The second type is the curve-based tuning compatible noise correlation (cTCNC). In this case, the NC between a pair of neurons is proportional to their SC (i.e., correlation of their orientation tuning curves):

$$r_{ij}^{cTCNC} = (1 - \delta_{ij}) \cdot \mathrm{corr}\left( g_i(S), g_j(S) \right) + \delta_{ij} \tag{3}$$

where δ_ij is the Kronecker delta (δ_ij = 1 if i = j and δ_ij = 0 otherwise), S indicates all possible orientations within [1°, 180°], and $r_{ij}^{cTCNC}$ is the NC between the i-th and the j-th neurons. g_i(S) and g_j(S) are their tuning curves (see Eq 1). We denote R^cTCNC as the correlation matrix. Note that unlike aTCNCs, cTCNCs can be negative (see Fig 3B). The key difference between cTCNC and aTCNC is that cTCNC does not rely on the functional form of tuning curves. In other words, cTCNC can be computed for irregular tuning curves, whereas aTCNC can only be computed for unimodal tuning curves with a well-defined preferred orientation. This is important for the specification of voxelwise NCs (see below).
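A minimal sketch of Eq 3, assuming the tuning curves of Eq 1 sampled at the 180 integer orientations, is given below. It uses `numpy.corrcoef` for the pairwise Pearson correlation; the helper names are chosen for this illustration only.

```python
import numpy as np

def tuning(s, phi, alpha=1.0, beta=19.0, gamma=2.0):
    """Eq 1; returns mean responses with shape (len(s), len(phi))."""
    s = np.asarray(s, dtype=float)[:, None]
    phi = np.asarray(phi, dtype=float)[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

def ctcnc_matrix(curves):
    """Eq 3: pairwise correlation of tuning curves (one curve per column).

    The diagonal is set to exactly 1, which is the role of the Kronecker delta terms.
    """
    R = np.corrcoef(curves, rowvar=False)
    np.fill_diagonal(R, 1.0)
    return R

S = np.arange(1, 181)      # all integer orientations, in degrees
phi = np.arange(1, 181)    # preferred orientations of 180 neurons
R_ctcnc = ctcnc_matrix(tuning(S, phi))

print(R_ctcnc.shape, R_ctcnc.min(), R_ctcnc.max())
```

Because the function only needs a matrix of tuning curves, the same call applies to irregular curves such as voxel tuning functions. Neurons with orthogonal preferred orientations receive a negative coefficient.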

In the third case, we shuffled the NCs between all pairs of neurons in R^cTCNC such that the rows and columns are rearranged in the same randomized order while the diagonal of the matrix is kept intact (Fig 3C). We term this type of NC shuffled noise correlation (SFNC) because the correlation is no longer necessarily related to the tuning similarity between neurons. We emphasize that shuffling here removes any systematic relationship between noise correlations and tuning similarity (i.e., signal correlation), such as the relationships defined for aTCNC (Eq 2) or cTCNC (Eq 3), but noise correlations still exist. This differs from studies in which multivariate response data are shuffled across trials to completely eliminate noise correlations between voxels (i.e., all off-diagonal elements in a covariance matrix are 0) [30, 31]. Our case is similar to randomly injecting noise correlations between voxels regardless of their tuning similarity. The correlation matrix of SFNCs is denoted as R^SFNC. R^SFNC can serve as a comparison for R^cTCNC because shuffling does not alter the overall distribution of NCs in a neuronal population.
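One way to read the shuffling step is as a single random permutation applied to both the rows and the columns of the correlation matrix while the order of the neurons' tuning curves is left unchanged. The sketch below follows that reading on a toy 4 × 4 matrix with hypothetical values; it is an interpretation of the description, not the authors' code.

```python
import numpy as np

def shuffle_nc(R, rng):
    """Apply one random permutation to both the rows and the columns of R.

    The unit diagonal, the symmetry, and the set of off-diagonal coefficients
    are preserved; only the assignment of coefficients to pairs changes.
    """
    perm = rng.permutation(R.shape[0])
    return R[np.ix_(perm, perm)]

rng = np.random.default_rng(0)

# Toy correlation matrix with hypothetical values (illustration only).
R = np.array([[1.0, 0.8, 0.1, -0.3],
              [0.8, 1.0, 0.5, 0.0],
              [0.1, 0.5, 1.0, 0.6],
              [-0.3, 0.0, 0.6, 1.0]])
R_sfnc = shuffle_nc(R, rng)
print(R_sfnc)
```

Using the same permutation for rows and columns keeps the matrix a valid correlation matrix with the same eigenvalues, which is why the shuffled matrix remains usable as a covariance structure.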

Furthermore, we assumed Poisson-like spiking noise such that the response variance of a neuron is equal to the mean activity evoked by a stimulus:

$$\tau_k^2(s) = g_k(s) \tag{4}$$

where $\tau_k^2(s)$ is the response variance of the k-th neuron evoked by the stimulus s. Note that in this case the response variance is stimulus-dependent. The covariance between neurons i and j ($q_{\mathrm{neuron},ij}$ below) can be expressed as:

$$q_{\mathrm{neuron},ij} = (1 - \delta_{ij}) \, c_{\mathrm{neuron}} \, r_{ij} \, \tau_i \tau_j + \delta_{ij} \, \tau_i \tau_j \tag{5}$$

where c_neuron is a parameter that controls the strength of the neuronal NC, r_ij is the NC between neurons i and j taken from one of the three correlation matrices defined above, τ_i and τ_j are the standard deviations of the responses of the two neurons (see Eq 4), and δ_ij is the Kronecker delta. Given the covariance matrix Q_neuron, we can express the population response noise distribution as:

$$e \sim N(0, Q_{\mathrm{neuron}}) \tag{6}$$
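Eqs 4 to 6 can be combined into a short simulation sketch. The code below builds the covariance matrix for one stimulus and draws trial-by-trial population responses around the mean G(s). The pool of 20 neurons, the stimulus of 92°, c_neuron = 0.5, and the number of trials are illustrative choices, and the helper names are assumptions of this sketch.

```python
import numpy as np

def tuning(s, phi, alpha=1.0, beta=19.0, gamma=2.0):
    """Eq 1; returns mean responses with shape (len(s), len(phi))."""
    s = np.asarray(s, dtype=float)[:, None]
    phi = np.asarray(phi, dtype=float)[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

def neuron_covariance(R, mean_resp, c_neuron):
    """Eq 5: off-diagonals are c_neuron * r_ij * tau_i * tau_j.

    Diagonals are the Poisson-like variances tau_i**2 = g_i(s) from Eq 4.
    """
    tau = np.sqrt(mean_resp)
    Q = c_neuron * R * np.outer(tau, tau)
    np.fill_diagonal(Q, tau ** 2)
    return Q

def sample_responses(mean_resp, Q, n_trials, rng):
    """Draw trials as mean response plus noise e ~ N(0, Q) (Eq 6)."""
    return rng.multivariate_normal(mean_resp, Q, size=n_trials)

rng = np.random.default_rng(0)
phi = np.arange(1, 21) * 9.0                        # 20 neurons, assumed 9 deg spacing
R = np.corrcoef(tuning(np.arange(1, 181), phi), rowvar=False)  # cTCNC (Eq 3)
g = tuning([92.0], phi)[0]                          # mean response to a 92 deg stimulus
Q = neuron_covariance(R, g, c_neuron=0.5)
trials = sample_responses(g, Q, n_trials=2000, rng=rng)
print(trials.shape)
```

With c_neuron = 0 the matrix reduces to a diagonal matrix of variances, so the neurons are independent. For c_neuron below 1 the matrix stays positive definite whenever the correlation matrix is positive semidefinite.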

Data simulation and multivariate pattern analysis

Stimulus-classification task

In the stimulus-classification task, we attempted to determine which of two stimuli was presented, based on the simulated neuronal population responses. We manipulated two independent variables: population size (i.e., the number of neurons) and NC strength (i.e., c_neuron in Eq 5). We built a linear discriminant using the MATLAB function classify. The linear discriminant assumes that the conditional probability density functions p(b | s = s₁) and p(b | s = s₂) are both normally distributed with the same covariance, and it estimates the means and covariance from the training data. Here b is the vector of a population response in one trial (see also Eq 7). The classifier was trained on half of the data and tested on the other half.

For the neuronal populations, we attempted to classify two stimuli: s₁ = 92°, s₂ = 88°. The two stimuli were chosen to control the overall task difficulty (i.e., avoid ceiling and floor effects in classification accuracy). We set six pool size levels (i.e., 10, 20, 50, 100, 200, and 400 neurons) and six NC strength levels (i.e., c_neuron = 0, 0.1, 0.3, 0.5, 0.8, and 0.99). For each combination of a pool size and a c_neuron value and for each form of NC, we performed 100 independent simulations and then averaged classification accuracy values across simulations. To compensate for potential overfitting as the pool size increases, we set the number of trials for each stimulus to be 100 times the pool size. All data were equally divided into two independent parts for training and testing.
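The classification procedure can be sketched in NumPy as follows. This is not the MATLAB `classify` function used in the paper; it is a hand-written two-class linear discriminant with a pooled covariance estimate and equal priors, which is an assumption of this sketch. The cTCNC structure, the spacing of preferred orientations, and the single simulation per condition are also illustrative choices.

```python
import numpy as np

def tuning(s, phi, alpha=1.0, beta=19.0, gamma=2.0):
    """Eq 1; returns mean responses with shape (len(s), len(phi))."""
    s = np.asarray(s, dtype=float)[:, None]
    phi = np.asarray(phi, dtype=float)[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

def covariance(R, g, c_neuron):
    """Eq 5 with Poisson-like variances (Eq 4)."""
    tau = np.sqrt(g)
    Q = c_neuron * R * np.outer(tau, tau)
    np.fill_diagonal(Q, tau ** 2)
    return Q

def lda_accuracy(train1, train2, test1, test2):
    """Two-class linear discriminant: pooled covariance, equal priors."""
    mu1, mu2 = train1.mean(axis=0), train2.mean(axis=0)
    centered = np.vstack([train1 - mu1, train2 - mu2])
    pooled = centered.T @ centered / (len(centered) - 2)
    w = np.linalg.solve(pooled, mu1 - mu2)      # discriminant direction
    threshold = w @ (mu1 + mu2) / 2.0           # midpoint between projected means
    hits1 = (test1 @ w > threshold).sum()
    hits2 = (test2 @ w <= threshold).sum()
    return (hits1 + hits2) / (len(test1) + len(test2))

def classification_accuracy(n_neurons, c_neuron, s1=92.0, s2=88.0, seed=0):
    """One simulation: sample trials for both stimuli, train on half, test on half."""
    rng = np.random.default_rng(seed)
    phi = np.arange(1, n_neurons + 1) * 180.0 / n_neurons    # assumed spacing
    R = np.corrcoef(tuning(np.arange(1, 181), phi), rowvar=False)  # cTCNC (Eq 3)
    n_trials = 100 * n_neurons                  # trials per stimulus, as in the text
    data = []
    for s in (s1, s2):
        g = tuning([s], phi)[0]
        data.append(rng.multivariate_normal(g, covariance(R, g, c_neuron), size=n_trials))
    half = n_trials // 2
    return lda_accuracy(data[0][:half], data[1][:half], data[0][half:], data[1][half:])

for c in (0.0, 0.5, 0.99):
    print(c, classification_accuracy(n_neurons=20, c_neuron=c))
```

The paper averages 100 such simulations for each combination of pool size, c_neuron, and NC form; wrapping `classification_accuracy` in a loop over seeds would reproduce that averaging step.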

Stimulus-estimation task

In the stimulus-estimation task, neuronal responses in a trial were simulated for an orientation randomly chosen within [1°, 180°], and then a maximum likelihood estimator (MLE) was used to reconstruct the orientation value. Formally, given a population response pattern b in a trial, we attempted to find the stimulus s that maximizes the likelihood:

$$\underset{s \in [1, 180]}{\arg\max} \; p(b \mid s) \tag{7}$$

Note that the likelihood function has been introduced above as the neuron-encoding model (see the noise distribution in Eqs 5 and 6). We numerically evaluated the likelihood of a response pattern b for each of the 180 integer stimulus orientations (i.e., 1°–180°) and chose the orientation that yielded the maximum likelihood value. It is worth noting that, in contrast to classification, the MLE method does not involve any model training, and estimation was performed directly on the basis of the known generative neuron-encoding model. We randomly sampled 1000 stimuli (i.e., 1000 trials) from [1°, 180°] for decoding. The same pool size and c_neuron settings as in the stimulus-classification task were used. For each combination of a pool size and a c_neuron value, we calculated the mean circular squared error (MSE_circ) between the estimated stimuli ($\hat{s}_i$) and the true stimuli (s_i) across all trials:

$$MSE_{circ} = \frac{1}{1000} \sum_{i=1}^{1000} \left( \hat{s}_i - s_i \right)^2 \tag{8}$$

where $\hat{s}_i$ is the estimated stimulus and s_i is the true stimulus in the i-th trial. We took the inverse of the MSE_circ as the estimation efficiency (see Figs 4 and 5). A higher estimation efficiency value indicates a more accurate estimation.
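A minimal sketch of the estimation pipeline (Eqs 7 and 8) is shown below. It evaluates the Gaussian log-likelihood with the stimulus-dependent covariance of Eq 5 at each of the 180 integer orientations. Several details are assumptions of this sketch rather than statements from the paper: stimuli are drawn as integers, the difference in Eq 8 is wrapped to [-90°, 90°) to make it circular, cTCNC is used as the NC structure, and all function names are hypothetical.

```python
import numpy as np

CANDIDATES = np.arange(1, 181)  # the 180 integer orientations evaluated in the text

def tuning(s, phi, alpha=1.0, beta=19.0, gamma=2.0):
    """Eq 1; returns mean responses with shape (len(s), len(phi))."""
    s = np.asarray(s, dtype=float)[:, None]
    phi = np.asarray(phi, dtype=float)[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

def covariance(R, g, c_neuron):
    """Eq 5 with Poisson-like variances (Eq 4)."""
    tau = np.sqrt(g)
    Q = c_neuron * R * np.outer(tau, tau)
    np.fill_diagonal(Q, tau ** 2)
    return Q

def likelihood_table(phi, R, c_neuron):
    """Precompute mean, inverse covariance, and log-determinant per candidate."""
    G = tuning(CANDIDATES, phi)
    Q_inv = np.empty((len(CANDIDATES), len(phi), len(phi)))
    logdet = np.empty(len(CANDIDATES))
    for i, g in enumerate(G):
        Q = covariance(R, g, c_neuron)
        Q_inv[i] = np.linalg.inv(Q)
        logdet[i] = np.linalg.slogdet(Q)[1]
    return G, Q_inv, logdet

def mle_decode(b, table):
    """Eq 7: return the candidate orientation with the highest log-likelihood."""
    G, Q_inv, logdet = table
    resid = b - G                                            # (candidates, neurons)
    quad = np.einsum("ij,ijk,ik->i", resid, Q_inv, resid)    # Mahalanobis terms
    return CANDIDATES[np.argmax(-0.5 * (quad + logdet))]

def circ_diff(a, b):
    """Orientation difference wrapped to [-90, 90) degrees (assumed circular form)."""
    return (np.asarray(a, dtype=float) - b + 90.0) % 180.0 - 90.0

def estimation_efficiency(n_neurons, c_neuron, n_trials=1000, seed=0):
    """Inverse of the mean circular squared error across simulated trials (Eq 8)."""
    rng = np.random.default_rng(seed)
    phi = np.arange(1, n_neurons + 1) * 180.0 / n_neurons    # assumed spacing
    R = np.corrcoef(tuning(CANDIDATES, phi), rowvar=False)   # cTCNC (Eq 3)
    table = likelihood_table(phi, R, c_neuron)
    s_true = rng.integers(1, 181, size=n_trials)             # integer stimuli (assumed)
    s_hat = np.empty(n_trials)
    for t, s in enumerate(s_true):
        g = table[0][s - 1]
        b = rng.multivariate_normal(g, covariance(R, g, c_neuron))
        s_hat[t] = mle_decode(b, table)
    return 1.0 / np.mean(circ_diff(s_hat, s_true) ** 2)

print(estimation_efficiency(n_neurons=20, c_neuron=0.3))
```

Because the covariance depends on the stimulus, the log-determinant term is kept in the log-likelihood rather than dropped as a constant. No training step is involved: the decoder uses the same generative model that produced the data.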

Fig 4 — The multivariate classification accuracy (panels A-C) and maximum likelihood estimation efficiency (panels D-F) are depicted as a function of the magnitude of the aTCNC (panels A, D), the cTCNC (panels B, E), and the SFNC (panels C, F). Both classification accuracy and estimation efficiency decline as the strength of aTCNC and cTCNC increases. Conversely, increasing the strength of SFNC improves decoding accuracy.

Fig 5 — The multivariate classification accuracy (panels A, B) and estimation efficiency (panels C, D) are depicted as a function of the magnitude of cTCNCs (panels A, C) and SFNCs (panels B, D). Decoding accuracy exhibits U-shaped functions as cTCNCs increase. Similar to a neuronal population, SFNCs always improve decoding accuracy.

Assessment of effects of noise correlations in voxel populations

Voxel-encoding model

The voxel-encoding model uses the same pool of orientation-selective neurons (i.e., 180 neurons with tuning curves defined in Eq 1) as in the neuron-encoding model. We further assume that the response of a voxel is the linear combination of all neurons in the neuronal population:

$$h_i(s) = \sum_{k=1}^{180} w_{ki} \, g_k(s) \tag{9}$$

where h_i(s) is the tuning function of the i-th voxel and w_ki is the connection weight from the k-th neuron to the i-th voxel. We sampled w_ki from a uniform distribution:

$$w_{ki} \sim \mathrm{Uniform}(0, 0.01) \tag{10}$$

This range was used so that generated fMRI responses typically range between 0 and 10, and can be viewed as approximating units of percent blood-oxygen-level-dependent (BOLD) change. This is also consistent with the range of empirically measured fMRI responses in most studies.
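Eqs 9 and 10 amount to a single matrix product between the neuronal tuning curves and a random weight matrix W. The sketch below illustrates this for a pool of 50 voxels; the number of voxels, the random seed, and the function names are arbitrary choices for illustration.

```python
import numpy as np

def tuning(s, phi, alpha=1.0, beta=19.0, gamma=2.0):
    """Eq 1; returns mean responses with shape (len(s), len(phi))."""
    s = np.asarray(s, dtype=float)[:, None]
    phi = np.asarray(phi, dtype=float)[None, :]
    return alpha + beta * np.exp(gamma * (np.cos(np.pi / 90.0 * (s - phi)) - 1.0))

def voxel_tuning(G, W):
    """Eq 9: voxel tuning curves as weighted sums of neuronal tuning curves.

    G has shape (n_stimuli, n_neurons); W has shape (n_neurons, n_voxels).
    """
    return G @ W

rng = np.random.default_rng(0)
n_neurons, n_voxels = 180, 50            # n_voxels is an illustrative choice
phi = np.arange(1, n_neurons + 1)
G = tuning(np.arange(1, 181), phi)
W = rng.uniform(0.0, 0.01, size=(n_neurons, n_voxels))   # Eq 10
H = voxel_tuning(G, W)                   # shape: (180 stimuli, 50 voxels)

print(H.min(), H.max())
```

Since every weight is non-negative and each voxel pools over all 180 neurons, the resulting voxel tuning curves are positive and only weakly modulated by orientation, and their values fall within the 0 to 10 range mentioned in the text.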

The mean of voxel population response given stimulus s can be represented by H(s) = [h_i(s)]. To express the trial-by-trial variation of voxel responses, we specify:

$$b = H(s) + e \tag{11}$$

Here, b represents the observed response across voxels on a trial (as might be obtained from a general linear model applied to fMRI data) and e represents the multivariate normal noise distribution:

$$e \sim N(0, Q_{vxs}) \tag{12}$$

where Q_vxs is the covariance matrix between voxels, which will be detailed in the following section. It is noteworthy that we only calculate the voxel tuning curves as the weighted sum of the neuronal pool (Eq 9), whereas the voxel response variability does not originate solely from neuronal response variability. If all voxel activity (including its variability) were completely determined by a weighted sum of neuronal activity, H(s) in Eq 11 would also be a random variable. However, in realistic fMRI data there are other sources of voxel-level noise (e.g., thermal noise and head motion; see Discussion) whose quantitative influences on voxel activity are difficult to delineate. Thus, we do not treat H(s) as a random variable and instead assume independent Gaussian noise (Eq 12).
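The generative step in Eqs 11 and 12 can be sketched as below. The mean responses and the covariance matrix are placeholders chosen only so that the code runs; they are not the paper's H(s) or Q_vxs, which is specified in the following section.

```python
import numpy as np

def sample_voxel_responses(h, Q_vxs, n_trials, rng):
    """Eqs 11-12: b = H(s) + e with e ~ N(0, Q_vxs); H(s) is treated as fixed."""
    e = rng.multivariate_normal(np.zeros(len(h)), Q_vxs, size=n_trials)
    return h + e

rng = np.random.default_rng(0)
h = np.array([6.0, 6.5, 5.8])            # hypothetical mean responses of 3 voxels
Q_placeholder = 0.25 * np.eye(3)         # placeholder only; not the paper's Q_vxs
b = sample_voxel_responses(h, Q_placeholder, n_trials=5000, rng=rng)

print(b.mean(axis=0))
```

The mean H(s) enters as a fixed vector and all trial-by-trial variability comes from the additive Gaussian term, which mirrors the modeling choice described above.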