[Preprint]. 2023 Jun 14:2023.06.14.544903. [Version 1] doi: 10.1101/2023.06.14.544903

An information-theoretic quantification of the content of communication between brain regions

Marco Celotto 1,2,3, Jan Bím 4, Alejandro Tlaie 2, Vito De Feo 5, Stefan Lemke 6, Daniel Chicharro 7, Hamed Nili 1, Malte Bieler 8, Ileana L Hanganu-Opatz 9, Tobias H Donner 10, Andrea Brovelli 11, Stefano Panzeri 1,2,*
PMCID: PMC10312682  PMID: 37398375

Abstract

Quantifying the amount, content and direction of communication between brain regions is key to understanding brain function. Traditional methods to analyze brain activity based on the Wiener-Granger causality principle quantify the overall information propagated by neural activity between simultaneously recorded brain regions, but do not reveal the information flow about specific features of interest (such as sensory stimuli). Here, we develop a new information theoretic measure termed Feature-specific Information Transfer (FIT), quantifying how much information about a specific feature flows between two regions. FIT merges the Wiener-Granger causality principle with information-content specificity. We first derive FIT and prove analytically its key properties. We then illustrate and test them with simulations of neural activity, demonstrating that FIT identifies, within the total information flowing between regions, the information that is transmitted about specific features. We then analyze three neural datasets obtained with different recording methods, magneto- and electro-encephalography, and spiking activity, to demonstrate the ability of FIT to uncover the content and direction of information flow between brain regions beyond what can be discerned with traditional analytical methods. FIT can improve our understanding of how brain regions communicate by uncovering previously hidden feature-specific information flow.

1. Introduction

Cognitive functions, such as perception and action, emerge from the processing and routing of information across brain regions [9; 44; 51; 52; 36]. Methods to study within-brain communication [10; 11; 47] are often based on the Wiener-Granger causality principle, which identifies propagation of information between simultaneously recorded brain regions as the ability to predict the current activity of a putative receiving region from the past activity of a putative sending region, discounting the self-prediction from the past activity of the receiving region [19; 55]. While early measures implementing this principle, such as Granger causality [47], capture only linear interactions, successive information theoretic measures (the closely-related Directed Information [30] and Transfer Entropy [46]) are capable of capturing both linear and nonlinear time-lagged interactions between brain regions [6; 53]. While using such measures has advanced our understanding of brain communication [6; 8; 11; 26; 49; 50; 54; 40], they are designed to capture only the overall information propagated by neural activity across regions, and are insensitive to the content of information flow. Assessing the content of information flow, not only its presence, would be invaluable to understand how complex brain functions, involving distributed processing and flow of different types of information, arise.

Here, we leverage recent progress in Partial Information Decomposition (PID; [56; 27]) to develop a new non-negative measure (Feature-specific Information Transfer; FIT) that quantifies the directed flow of information about a specific feature of interest between neural populations (Fig. 1A). The PID decomposes the total information that a set of source variables encodes about a specific target variable into components representing shared (redundant) encoding between the variables, unique encoding by some of the variables, or synergistic encoding in the combination of different variables. FIT isolates feature-specific information flowing from one region to another by identifying the part of the feature information encoded in the current activity of the receiving region that is shared (redundant) with information present in the past activity of the sending region (because if some piece of information is transmitted, it should be found first in the sender and then in the receiver) and that was not encoded before by the receiver (if it were, it would not have come from the sender).

Figure 1:

Sketch of FIT and TE. (A) TE is the established information-theoretic measure to quantify the overall information propagated by neural activity between two simultaneously recorded brain regions X (sender) and Y (receiver). Feature-specific Information Transfer (FIT) measures the information flowing from X to Y about the stimulus feature S. (B) TE and FIT incorporate the Wiener-Granger causality principle. TE is the mutual information between the past activity of X and the present activity of Y, conditioned on the past of Y. FIT is the feature information in the present of Y that is shared with the past information of X and unique with respect to the past information of Y, giving content-specificity to the Wiener-Granger principle.

We first mathematically derive a definition of FIT based on PID. We then use it to demonstrate, on simulated data, that it is specifically sensitive to the flow of information about specific features, correctly discarding feature-unrelated transmission. We then demonstrate that FIT is able to track the feature-specific content and direction of information flow using three different types of simultaneous multi-region brain recordings (electroencephalography - EEG, magnetoencephalography - MEG, and spiking activity). We also address how introducing appropriate null hypotheses and defining conditioned versions of FIT can deal with potential confounding factors, such as the time-lagged encoding of information in two regions without actual communication between them.

2. Defining Feature-specific Information Transfer (FIT)

We consider two time-series of neural activity X and Y simultaneously recorded from two brain regions over several experimental trials. X and Y might carry information about a feature S varying from trial to trial, e.g. a feature of a sensory stimulus or a certain action (Fig. 1A). The activity measured in each region, X and Y, may be any type of brain signal, e.g. the spiking activity of single or multiple neurons, or the aggregate activity of neural populations, such as EEG or MEG. We call Ypres the activity of Y at the present time point t, and Xpast and Ypast the past activity of X and Y, respectively (Fig. 1B). Established information theoretic measures such as Transfer Entropy (TE, Fig. 1A) [46] utilize the Wiener-Granger principle to quantify the overall information propagated by neural activity from a putative sender X to a putative receiver Y as the mutual information I between the present neural activity of the receiver Ypres and the past activity of the sender Xpast, conditioned on the past activity of the receiver Ypast (Fig. 1B, left):

TE(X→Y) = I(Xpast; Ypres | Ypast)    (1)

(see Supplementary Material, SM1.1 for how TE depends on probabilities of past and present activity).
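As an illustration of how Eq. (1) can be evaluated on discretized data, the following minimal Python sketch uses a plug-in (histogram) estimator and the standard identity I(X;Y|Z) = H(X,Z) + H(Y,Z) − H(Z) − H(X,Y,Z). This is not the authors' code; the simple estimator and variable names are illustrative assumptions, and real data would additionally require corrections for limited-sampling bias.

```python
import numpy as np

def entropy_bits(*vars_):
    """Plug-in joint entropy (bits) of one or more discrete 1-D arrays of equal length."""
    joint = np.stack(vars_, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def transfer_entropy(x_past, y_past, y_pres):
    """TE(X -> Y) = I(Xpast; Ypres | Ypast), Eq. (1), computed from joint entropies."""
    return (entropy_bits(x_past, y_past) + entropy_bits(y_pres, y_past)
            - entropy_bits(y_past) - entropy_bits(x_past, y_past, y_pres))
```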

TE captures the information propagated by neural activity across regions but lacks the ability to isolate information flow about specific external variables. To overcome this limitation, here we define FIT, which quantifies the flow of information specifically about a feature S from a putative sending area X to a putative receiving area Y (Fig. 1A). We define FIT using the recently developed PID [56]. PID decomposes the joint mutual information I(X¯;S) that a set of N source variables X¯ = (X1, X2, ..., XN) carries about a target variable S into non-negative components called information atoms (see SM1.2). For N=2, PID breaks down the joint mutual information I(X1,X2;S) into four atoms: the Shared (or redundant) Information about S, SI(S:{X1,X2}), that both X1 and X2 encode; the two pieces of Unique Information, UI(S:{X1\X2}) and UI(S:{X2\X1}), provided by one source variable but not by the other; and the Complementary (synergistic) Information, CI(S:{X1,X2}), encoded in the combination of X1 and X2. Several measures have been proposed to quantify information atoms [56; 5; 17; 25]. Here we use the original measure [56], called Imin, which guarantees non-negative values for information atoms for any number N of source variables (see SM1.2).
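To make the decomposition concrete, below is a minimal Python sketch of the original Williams-Beer construction for two discrete, integer-coded sources: the redundancy Imin is the expected minimum "specific information" that each source carries about each value of S, and the four atoms follow from Imin and the mutual information terms. This is an illustrative re-implementation for small discrete variables, not the authors' code, and it ignores sampling-bias issues.

```python
import numpy as np

def specific_info(s_val, S, A):
    """Specific information I(S = s_val; A) in bits (Williams & Beer construction)."""
    p_s = np.mean(S == s_val)
    info = 0.0
    for a_val in np.unique(A):
        p_a_given_s = np.mean(A[S == s_val] == a_val)
        if p_a_given_s == 0:
            continue
        p_s_given_a = np.mean(S[A == a_val] == s_val)
        info += p_a_given_s * np.log2(p_s_given_a / p_s)
    return info

def i_min(S, sources):
    """Imin redundancy: expected (over S) minimum specific information across the sources.
    With a single source this reduces to the mutual information I(S; A)."""
    return sum(np.mean(S == s_val) * min(specific_info(s_val, S, A) for A in sources)
               for s_val in np.unique(S))

def joint_code(*vars_):
    """Integer-encode the joint state of several discrete arrays (to treat them as one source)."""
    _, codes = np.unique(np.stack(vars_, axis=1), axis=0, return_inverse=True)
    return codes.ravel()

def pid_two_sources(S, X1, X2):
    """The four Williams-Beer atoms of I(X1,X2;S): shared, unique(X1), unique(X2), synergy."""
    si = i_min(S, [X1, X2])
    ui1 = i_min(S, [X1]) - si                              # I(S;X1) - SI
    ui2 = i_min(S, [X2]) - si                              # I(S;X2) - SI
    ci = i_min(S, [joint_code(X1, X2)]) - si - ui1 - ui2   # I(S;X1,X2) - SI - UI1 - UI2
    return si, ui1, ui2, ci
```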

We wanted FIT to measure the directed flow of information about S between X and Y, rather than the overall propagation of information measured by TE (Fig. 1A, bottom). To do this, we isolated the information in the past of the sender X that Y receives at time t about a feature S. To respect the Wiener-Granger definition of causality, such information should not have been present in the past activity of the receiver Y. Therefore, we performed the PID in the space of four variables S, Xpast, Ypast, and Ypres. In such a space, PID provides information atoms that combine Shared, Unique and Complementary Information carried by three sources about one target [56]. One natural candidate atom to measure FIT is the information about S that Xpast shares with Ypres and that is unique with respect to Ypast, SUI(S:{Xpast, Ypres\Ypast}) (Fig. 1B, right, Fig. S1B). This atom intuitively captures what we are interested in measuring, and satisfies two mathematical properties that we desire for FIT: it is upper bounded by the stimulus information encoded in the past of X, I(S;Xpast), and by that encoded in the present of Y, I(S;Ypres) (see SM1.3.1). However, the information atom SUI(S:{Xpast, Ypres\Ypast}) has two properties that would be undesirable for FIT. The first is that its value can exceed the total amount of information propagated from X to Y (TE), which is undesirable because the overall propagation of activity (Fig. 1A, bottom) must be an upper bound to the information that activity transmits about a specific feature. The second is that this atom depends on Xpast, Ypres, and S only through the pairwise marginal distributions P(Xpast,S) and P(Ypres,S) but not through the marginal distribution P(Xpast,Ypres), which implies that this atom by itself cannot rule out confounding scenarios where both sender and receiver encode feature information at different times but no transmission between them takes place (see SM1.3.1).

To address these limitations, following [37] we considered the alternative PID taking S, Ypast, and Xpast as source variables and Ypres as the target. In this second PID (Fig. S1B), the atom that intuitively relates to FIT is SUI(Ypres:{Xpast, S\Ypast}), the information about Ypres that Xpast shares with S and that is unique with respect to Ypast. While being intuitively similar to SUI(S:{Xpast, Ypres\Ypast}), SUI(Ypres:{Xpast, S\Ypast}) is upper bounded by TE (but not by I(S;Xpast)) and depends on the pairwise marginal distribution P(Xpast,Ypres) (see SM1.3.2). Thus, this second atom has useful properties expected of a good definition of FIT that complement those of the first atom. Importantly, Shannon information quantities impose constraints that relate PID atoms across decompositions with different targets. Previous work [38] demonstrated that, for PIDs with N = 2 sources, these constraints reveal the existence of finer information components shared between similar atoms of different decompositions. Here, we extended this approach (see SM1.3.3) to N = 3 sources and demonstrated that the second atom is the only one in the second PID that has a pairwise algebraic relationship with the first atom, indicating that these atoms share a common, finer information component. Therefore, following the logic of [37], we defined FIT by selecting this finer common component, taking the minimum between these two atoms:

FIT = min[ SUI(S:{Xpast, Ypres\Ypast}), SUI(Ypres:{Xpast, S\Ypast}) ]    (2)

This definition ensures that FIT is at the same time upper bounded by I(S;Xpast), by I(S;Ypres), and by TE(X→Y). Additionally, FIT depends on the joint distribution P(S,Xpast,Ypres) through all the pairwise marginals P(S,Xpast), P(S,Ypres), and P(Xpast,Ypres), implying that it can rule out, using appropriate permutation tests, false-communication scenarios in which X and Y encode the stimulus independently with a temporal lag, without any within-trial transmission (see SM1.3.4).
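Schematically, once a PID routine returning SUI atoms is available, Eq. (2) is just the minimum of two atoms. In the sketch below, `sui` is a hypothetical placeholder (not a real library function) standing for the atom SUI(target: {a, b\discounted}); computing it with Imin for three sources requires the full redundancy-lattice machinery described in SM1.2-1.3, which is not reproduced here.

```python
def sui(target, a, b, discounted):
    """HYPOTHETICAL placeholder for a PID routine returning SUI(target: {a, b \ discounted}),
    i.e. the information about `target` shared by a and b and unique with respect to
    `discounted`. A real implementation needs the Imin redundancy lattice (SM1.2-1.3)."""
    raise NotImplementedError

def feature_info_transfer(S, x_past, y_past, y_pres):
    """FIT, Eq. (2): the minimum of the two SUI atoms discussed above."""
    atom_1 = sui(S, x_past, y_pres, y_past)    # SUI(S: {Xpast, Ypres\Ypast}), first PID
    atom_2 = sui(y_pres, x_past, S, y_past)    # SUI(Ypres: {Xpast, S\Ypast}), second PID
    # By construction the minimum is bounded by I(S;Xpast), I(S;Ypres) and TE(X->Y).
    return min(atom_1, atom_2)
```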

Note that the definition of FIT holds when defining present and/or past activity as multidimensional variables spanning several time points. However, the use of multidimensional neural responses requires significantly more data for accurate computation of information. For this reason, following established procedures [6; 34; 33], in all numerical computations of TE and FIT reported in this study we computed the present of Y at a single time point t and the past of X and of Y at individual time points lagged by a delay δ: Xpast = Xt−δ and Ypast = Yt−δ.
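For concreteness, the sketch below shows how this single-time-point convention can be scanned over present times t and delays δ to build the time-delay maps used in the data analyses of later sections. It reuses the illustrative `transfer_entropy` and `feature_info_transfer` functions sketched above (the latter still needs a concrete PID backend); the array shapes and names are assumptions, not the authors' code.

```python
import numpy as np

def time_delay_maps(X, Y, S, delays):
    """X, Y: (n_trials, n_time) discretized activity of sender and receiver; S: (n_trials,)
    feature. Returns TE and FIT maps of shape (n_time, n_delays); entries are NaN where
    t - delta falls before the start of the trial."""
    n_trials, n_time = X.shape
    te_map = np.full((n_time, len(delays)), np.nan)
    fit_map = np.full((n_time, len(delays)), np.nan)
    for j, d in enumerate(delays):
        for t in range(d, n_time):
            x_past, y_past, y_pres = X[:, t - d], Y[:, t - d], Y[:, t]
            te_map[t, j] = transfer_entropy(x_past, y_past, y_pres)
            fit_map[t, j] = feature_info_transfer(S, x_past, y_past, y_pres)
    return te_map, fit_map
```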

3. Validation of FIT on simulated data

To test the ability of FIT to measure feature-specific information flow between brain regions, we performed simulations in scenarios of feature-specific and feature-unrelated information transfer.

We performed (Fig. 2) a simulation (details in SM2.1) in which the encoded and transmitted stimulus feature S was a stimulus-intensity integer value (1 to 4). The activity of the sender X was a two-dimensional variable with one stimulus-informative component Xstim and one stimulus-uninformative component Xnoise. The stimulus-informative dimension had a temporally-localized, stimulus-dependent bump in the activity (from 200 to 250ms) and multiplicative Gaussian noise (similar results were found with additive noise, see SM2.1 and Fig. S3). The stimulus-unrelated component was, at any time point, zero-mean Gaussian noise. The activity of the receiver Y was the weighted sum of Xstim and Xnoise with a delay δ, plus Gaussian noise. The delay δ was chosen randomly in each simulation repetition in the range 40-60ms. Simulated activity was discretized into 3 equipopulated bins to compute information [28]. Here and in all further simulations, we averaged information values across simulation repetitions and determined their significance via non-parametric permutation tests. For TE, we permuted X across all trials to test for the presence of significant within-trial transmission between X and Y. For FIT, we conducted two different permutation tests: one for the presence of stimulus information in X and Y (shuffling S across trials), and another for the contribution of within-trial correlations between X and Y to the transmission of S (shuffling X across trials at fixed stimulus). We set the threshold for FIT significance as the 99th percentile of the element-wise maximum between the two permuted distributions (see SM1.6).
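A simplified sketch of this kind of simulation is given below (the exact parameters are in SM2.1; the noise amplitudes, sampling step and bump shape used here are placeholder assumptions). It also illustrates the equipopulated binning used to discretize the simulated signals.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_repetition(n_trials=2000, n_time=70, dt_ms=10, w_stim=1.0, w_noise=1.0):
    """Toy version of the simulation described above (placeholder parameter values)."""
    S = rng.integers(1, 5, size=n_trials)                      # stimulus intensity, 1..4
    t_ms = np.arange(n_time) * dt_ms
    bump = ((t_ms >= 200) & (t_ms < 250)).astype(float)        # time-localized bump
    # Sender X: stimulus-informative component (multiplicative noise) + unrelated noise component
    x_stim = S[:, None] * bump[None, :] * (1 + 0.5 * rng.standard_normal((n_trials, n_time)))
    x_noise = rng.standard_normal((n_trials, n_time))
    # Receiver Y: delayed weighted sum of the two components plus additive Gaussian noise
    delay = rng.integers(4, 7)                                  # 40-60 ms at dt_ms = 10
    Y = 0.5 * rng.standard_normal((n_trials, n_time))
    Y[:, delay:] += w_stim * x_stim[:, :-delay] + w_noise * x_noise[:, :-delay]
    return S, x_stim, x_noise, Y

def equipopulated_bins(a, n_bins=3):
    """Discretize each time point (column) of a (trials x time) array into equally populated bins."""
    out = np.empty(a.shape, dtype=int)
    for t in range(a.shape[1]):
        edges = np.quantile(a[:, t], np.linspace(0, 1, n_bins + 1)[1:-1])
        out[:, t] = np.digitize(a[:, t], edges)
    return out
```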

Figure 2:

Testing FIT on simulated data. A) FIT and TE as a function of stimulus-related (wstim) and stimulus-unrelated (wnoise) transmission strength. Asterisks indicate significant values (p < 0.01, permutation test) for the considered parameter set. B) Dynamics of FIT and TE in a simulation with time-localized stimulus-information transmission. The red area shows the window of stimulus-related information transfer. Yellow dots show time points with significant information (p < 0.01, permutation test). Results plot mean (lines) and SEM (shaded area) across 50 simulations (2000 trials each).

We investigated how FIT and TE from X to Y depended on the amount of stimulus-related transmission (increased by increasing wstim) and of stimulus-unrelated transmission (increased by increasing wnoise). For brevity, we report values at the first time point in which information in Y was received from X, but similar results hold for later time points (not shown). Both FIT and TE increased when increasing the amount of stimulus-related transmission wstim (Fig. 2A). However, TE increased with wnoise (Fig. 2A), as expected from a measure that captures the overall propagation of activity, while FIT decreased when increasing wnoise, indicating that FIT specifically captures the transmission of information about the considered feature.

We then investigated how well TE and FIT temporally localize the stimulus-related information transmitted from X to Y (Fig. 2B). We simulated a case in which stimulus-related information was transmitted from X to Y only in a specific window ([240, 310]ms) and computed FIT and TE at each time point (see SM2.1 for details). FIT was significant only in the time window in which Y received the stimulus information from X. In contrast, TE was significant at every time point, reflecting that noise was transmitted from X to Y throughout the whole simulation time.

We performed further simulations to investigate whether the non-parametric permutation test described above can correctly rule out, as non-significant feature-specific transmission, the scenario in which X and Y independently encode S without actual communication occurring between them. We simulated a scenario in which stimulus information was encoded with a temporal lag in X and Y, with no transmission from X to Y. We found that the resulting FIT values were always non-significant (see SM2.2 and Fig. S4C) when tested against a surrogate null-hypothesis distribution (pairing X and Y in randomly permuted trials with the same stimulus) that destroys the within-trial communication between X and Y without changing the stimulus-information encoding in X and Y (see SM1.6).
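The sketch below illustrates this within-stimulus shuffling, using the illustrative `feature_info_transfer` function from Section 2: permuting the sender's trials among trials with the same stimulus destroys within-trial X-Y correlations while preserving stimulus coding in both signals. The number of permutations is an arbitrary choice here.

```python
import numpy as np

def fit_within_stimulus_null(S, x_past, y_past, y_pres, n_perm=200, rng=None):
    """Null distribution of FIT obtained by shuffling X across trials at fixed stimulus."""
    rng = np.random.default_rng() if rng is None else rng
    null = np.empty(n_perm)
    for i in range(n_perm):
        x_shuf = np.empty_like(x_past)
        for s_val in np.unique(S):
            idx = np.where(S == s_val)[0]
            x_shuf[idx] = x_past[rng.permutation(idx)]   # shuffle X only within this stimulus
        null[i] = feature_info_transfer(S, x_shuf, y_past, y_pres)
    return null

# e.g. declare the observed FIT significant if it exceeds np.percentile(null, 99)
```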

Finally, we addressed how to remove the confounding effect of transmission of feature information to Y not from X but from a third region Z. In Granger causality or TE analyses, this is addressed by conditioning the measures on Z [1; 32]. In an analogous way, we developed a conditioned version of FIT, called cFIT (see SM1.4), which measures the feature information transmitted from X to Y that is unique with respect to the past activity of a third region Z. We tested its performance in simulations in which both X and Z transmitted stimulus information to Y and found that cFIT reliably estimated the unique contribution of X in transmitting stimulus information to Y, beyond what was transmitted by Z (see SM2.2 and Fig. S4D).

4. Analysis of real neural data

We assessed how well FIT detects direction and specificity of information transfer in real neural data.

4.1. Flow of stimulus and choice information across the human visual system

We analyzed a previously published dataset ([57], see also SM3.1 for details) of source-level MEG data recorded while human participants performed a visual decision-making task. At the beginning of each trial, a reference stimulus was presented (contrast 50%), followed by a test stimulus that consisted of a sequence of 10 visual samples with variable contrasts (Fig. 3A). After the test stimulus sequence, participants reported their choice of whether the average contrast of the samples was greater or smaller than the reference contrast. The previous study on these data ([57]) analyzed the encoding of stimulus and choice signals in individual areas but did not study information transfer. We focused on gamma-band activity (defined as the instantaneous power of the 40-75Hz frequency band), because it is the most prominent band for visual information encoding [21; 41; 16] and information propagation [6; 3] in the visual system. Previous work has demonstrated that gamma-band transmission is stronger in the feedforward (from lower to higher in the visual cortical hierarchy) than in the feedback (from higher to lower in the visual cortical hierarchy) direction [50; 3; 31], suggesting that gamma is a privileged frequency channel for transmitting feedforward information. However, these previous studies did not determine the content of the information being transmitted.
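As a sketch of how such a band-limited power signal can be obtained (the study's actual spectral-estimation pipeline is described in SM3.1 and may differ), one common approach is band-pass filtering followed by a Hilbert transform. The filter order and the assumption that the sampling rate is well above 150 Hz are illustrative choices.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(x, fs, band=(40.0, 75.0), order=4):
    """Instantaneous power in a frequency band. x: (..., n_samples) sampled at fs (Hz)."""
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="bandpass")
    filtered = filtfilt(b, a, x, axis=-1)           # zero-phase band-pass filter
    return np.abs(hilbert(filtered, axis=-1)) ** 2  # squared analytic-signal amplitude
```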

Figure 3:

Information flow across the human visual hierarchy with MEG. A) Sketch of the task. B) Cortical surface map of the location of the four considered visual regions. C) Temporal profiles of stimulus information and time-delay stimulus FIT maps for an example region pair (V1, V3A). Top to bottom: stimulus information in V1; time-delay FIT map in the feedforward (V1 → V3A) direction; time-delay FIT map in the feedback (V3A → V1) direction; stimulus information in V3A. D) Comparison between FIT about the stimulus (FITS) and FIT about the choice (FITC) in the visual network in the feedforward (top) and feedback (bottom) directions. E) Graphs representing the strength of feedforward (yellow) and feedback (orange) information transmission in the network for TE (left) and FIT (right). Links are weighted proportionally to the communication strength between each pair. The arrows at the bottom point toward the dominant direction of overall transmission and are weighted proportionally to the difference between feedforward and feedback transmission. F) Comparison between feedforward and feedback transmission in the network for TE (left) and stimulus FIT (right). G) Same as E but for feedforward transmission in correct (green) vs error (gray) trials. H) Same as F but for feedforward transmission in correct vs error trials. In all panels, lines and image plots show averages, and error bars show SEM, across participants, experimental sessions and region pairs (in the case of FIT and TE) or regions (in the case of mutual information). *: p<0.05, **: p<0.01, ***: p<0.001. Information-theoretic quantities were computed from the gamma-band ([40-75]Hz) power of source-level MEG, first computed separately for the left and right hemispheres and then averaged.

To address this question, we quantified FIT in a network of four visual cortical areas (Fig. 3B) that we selected because they encoded high amounts of stimulus information and because they were sufficiently far apart to avoid leakage in source reconstruction [57]. The areas, listed in order of position in the cortical hierarchy from lower to higher, were: primary visual cortex (V1), area V3A (carrying maximal stimulus information in the dorsal stream visual cortex), area V3CD, and area LO3 (carrying maximal stimulus information in the MT+ complex). Because participants made errors (behavioral performance was 75% correct), in each trial the stimulus presented could differ from the participant's choice. We thus assessed the content of the information flow by computing FIT about either the sensory stimulus (FITS, using as feature the mean contrast across all 10 within-trial samples) or the reported choice (FITC), in each instant of time in the [−100, 500]ms peri-stimulus time window (because stimulus information was higher in the first 500ms post-stimulus, see SM3.1.4 and Fig. S6) and across a range of putative inter-area delays δ. In Fig. 3C we show the resulting FITS time-delay information maps for the example pair of regions V1 and V3A. A cluster-permutation analysis [29; 13] revealed significant feedforward stimulus-specific information transmission from V1 to V3A (but no significant feedback from V3A to V1), localized 200-400ms after stimulus onset, with an inter-area communication delay between 65 and 250ms (see SM3.1.4 and Fig. S6).

We compared properties of the overall information propagated by neural activity (computed with TE) and of the feature-specific information flow (computed with FIT) across all pairs of areas within the considered visual cortical network. To determine the prevalent content of information flow in the network, we compared the amount of FITS and FITC transmitted in the feedforward and in the feedback directions (Fig. 3D). Gamma-band activity transmitted more information about the stimulus than about the choice (i.e. FITS > FITC) in both the feedforward (p < 10−5, two-tailed paired t-test) and the feedback (p < 0.002, two-tailed paired t-test) direction, with a larger difference in the feedforward direction. Thus, we focused on stimulus-specific information flow in the following FIT analyses. We then studied the leading direction of information flow. Both the total amount of activity propagation (TE) and the stimulus-specific information flow (FITS) were larger in the feedforward than in the feedback direction (Fig. 3F), but with a larger effect for FITS (p < 10−8, two-tailed paired t-test) than for TE (p < 0.01, two-tailed paired t-test). Together, these results show that gamma-band activity in the visual system principally carries information about the stimulus (rather than the choice) and propagates it more in the feedforward than in the feedback direction.

We next assessed the behavioral relevance of the feedforward stimulus information transmitted in the gamma band. A previous study showed that the overall (feature-unspecific) strength of feedforward gamma-band information propagation negatively correlates with reaction times, indicating that stronger feedforward gamma activity propagation favours faster decisions [42]. However, no study has addressed whether the amount of stimulus information transmitted forward in the gamma band improves the accuracy of decision making. We addressed this question by comparing how FITS varied between trials in which participants made a correct or an incorrect choice (Fig. 3G). We matched the number of correct and error trials to avoid data-size confounds [35]. FITS in the feedforward direction was significantly lower in error than in correct trials (Fig. 3H, right, p < 0.001, two-tailed paired t-test), while TE did not vary between correct and error trials (Fig. 3H, left, p = 0.24, two-tailed paired t-test). Feedback information transmission (both the overall transmission by activity, TE, and the stimulus-specific information flow, FITS) did not vary between correct and incorrect trials. This indicates that the feedforward flow of stimulus information, rather than a general flow of activity, is key for forming correct choices based on sensory evidence.
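Because information estimates are biased by the number of trials, the comparison above requires equal trial counts per condition. A minimal sketch of one simple way to do this (random subsampling of the more numerous condition) is shown below; it is illustrative and not necessarily the exact matching procedure used in the study.

```python
import numpy as np

def match_trial_counts(correct_idx, error_idx, rng=None):
    """Randomly subsample the larger set so both conditions have the same number of trials."""
    rng = np.random.default_rng() if rng is None else rng
    n = min(len(correct_idx), len(error_idx))
    return (rng.choice(correct_idx, size=n, replace=False),
            rng.choice(error_idx, size=n, replace=False))
```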

These results provide the novel finding that the gamma band transmits behaviorally relevant feedforward stimulus information, and they highlight the power of FIT in revealing the content and direction of information flow between brain areas.

4.2. Eye-specific interhemispheric information flow during face detection

We next tested the ability of FIT to detect feature-specific information flow between brain hemispheres. We analyzed a published EEG dataset recorded from human participants detecting the presence of either a face or a random texture in an image covered by a bubble mask randomly generated in each trial ([43]; see SM3.2.1 for details). Previous analysis of these data [23] showed that eye visibility in the image (defined as the proportion of image pixels in the eye region visible through the mask) is the most relevant image feature for successful face discrimination. It also showed that eye-specific information appears first at ~120ms post-image presentation in the Occipito-Temporal (OT) region of the hemisphere contralateral to the position of the eye, and then appears ~20-40ms later in the ipsilateral OT region (Fig. 4A). However, this study did not determine whether the eye information in the ipsilateral hemisphere is received from the contralateral hemisphere. To address this issue, we computed FIT transmission of eye-specific information between the Left OT (LOT) and Right OT (ROT) regions (using the electrodes within these regions that carried most information, as in [23]; see SM3.2.2). Left Eye (LE) FIT from the contra- to the ipsi-lateral OT (ROT to LOT, Fig. 4C) peaked between 150 and 190ms after image onset, with transfer delays of 20-80ms (Fig. 4B). Right Eye (RE) FIT from the contra- to the ipsi-lateral OT (LOT to ROT) peaked at similar times and delays. Both contra-to-ipsilateral LE and RE had statistically significant FIT peaks in the time-delay maps (cluster-permutation analysis, p < 0.01, see SM3.2.2 and Fig. S7). Thus, FIT determined the communication window for the contralateral flow of eye-specific information with high precision.

Figure 4:

Inter-hemispheric eye-specific information flow during face detection using human EEG. (A) Schematic of the putative information flow. LOT (ROT) denotes the Left (Right) occipito-temporal region. LE (RE) denotes the Left (Right) Eye visibility feature. (B) Information (lines) carried by the EEG in each region, and FIT (image plot) about LE across regions. (C) Same as B for RE. (D) Contra- to ipsi-lateral vs ipsi- to contra-lateral transfers for TE and FIT, for both LE and RE. Dots and images plot averages and error bars plot SEM across participants (N=15).

To gain further insight into the directionality and feature-specificity of the information flow across hemispheres, we compared FIT and TE across transfer directions and/or eye-visibility conditions (Fig. 4D, middle and right). Right-to-left LE FIT was significantly larger than left-to-right LE FIT (p < 0.001, two-tailed paired t-test) and than right-to-left RE FIT (p < 0.01, two-tailed paired t-test). Left-to-right RE FIT was significantly larger than right-to-left RE FIT (p < 0.05, two-tailed paired t-test) and than left-to-right LE FIT (p < 0.05, two-tailed paired t-test). In contrast, when computing the overall propagated information with TE, we found no significant difference between directions (Fig. 4D, left). Thus, the use of FIT revealed a temporally localized flow of eye information across hemispheres that was feature-selective (i.e. mainly about the contralateral eye) and direction-specific (contra-to-ipsilateral), without direction specificity in the overall information propagated by neural activity across hemispheres (as revealed by TE).

Finally, to more tightly localize the origin of eye-specific contralateral information flow, we asked whether the contralateral OT electrodes selected in our analyses were the sole senders of inter-hemispheric eye-specific information. We used the conditioned version of FIT, cFIT, to compute the amount of transfer of eye information from the contra- to the ipsi-lateral OT after removing the effect of eye-specific information possibly routed through alternative sending locations (see SM3.4). We found (Fig. S9A) that the effect we measured with FIT was robust even when conservatively removing with cFIT the information that could have been routed through other locations.

4.3. Stimulus-specific information flow in a thalamocortical network

We finally used FIT to measure the feature- and direction-specificity of information flow in the thalamocortical somatosensory and visual pathways. We analysed a published dataset in which multi-unit spiking activity was simultaneously recorded in anaesthetized rats from the primary visual and somatosensory cortices, and from first-order visual and somatosensory thalamic nuclei ([7], see SM3.3.1 for details), during either unimodal visual, unimodal tactile, or bimodal (visual and tactile) stimulation. This analysis tests FIT on another major type of brain recordings (spiking activity). Moreover, due to the wealth of knowledge about the thalamocortical network [48; 15], we can validate FIT against the highly-credible predictions that information about basic sensory features flows from thalamus to cortex, and that somatosensory and visual pathways primarily transmit tactile and visual information, respectively. Using FIT on these data, we found (see SM3.3.3 and Fig. S8) that sensory information flowed primarily from thalamus to cortex, rather than from cortex to thalamus. We also found that the feedforward somatosensory pathway transmits more information about tactile- than about visually-discriminative features, and that the feedforward visual pathway transmits more information about visually- than tactile-discriminative features. Importantly, TE was similarly strong in both directions, and when considering tactile- or visually-discriminative features. This confirms the power of FIT for uncovering stimulus-specific information transfer, and indicates a partial dissociation between overall information propagation by neural activity and neural transfer of specific information.

5. Comparison with previously published measures

We finally examine how FIT differs from alternative algorithms for identifying components of the flow of information about specific features. We focus on measures that implement the Wiener-Granger discounting of the information present in the past activity of the receiver. Other methods that do not implement this discounting (and thus simply correlate past information of the sender with present information of the receiver) erroneously identify information already encoded in the past activity of the receiver as information transmitted from the sender (see SM4).

A possible simple proxy for identifying feature-specific information flow is to quantify how the total amount of transmitted information (TE) is modulated by the stimulus [6]. For the case of two stimuli, this amounts to the difference of TE computed separately for each stimulus. We show in SM4.1, using simulations, that this measure can fail to capture stimulus-related information flow even in simple scenarios of stimulus-information transmission. Additionally, when tested on MEG data, it could not assess the directionality of information transmission within brain networks (see SM4.1).
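For two stimuli s1 and s2, this proxy is simply the difference of the two within-stimulus TE values; a sketch using the illustrative `transfer_entropy` function from Section 2 is shown below.

```python
import numpy as np

def te_stimulus_modulation(S, x_past, y_past, y_pres, s1, s2):
    """Difference of TE computed separately within each of two stimulus conditions."""
    m1, m2 = (S == s1), (S == s2)
    return (transfer_entropy(x_past[m1], y_past[m1], y_pres[m1])
            - transfer_entropy(x_past[m2], y_past[m2], y_pres[m2]))
```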

A previous study [22] defined a measure, Directed Feature Information (DFI), which computes the feature-specific information redundant between the present activity of the receiver and the past activity of the sender, conditioned on the past activity of the receiver. However, DFI used a measure of redundancy that conflated the effects of redundancy and synergy (see SM1.5). Because of this, DFI is often negative and thus not interpretable as a measure of information flow (see SM4.2). In contrast, FIT is non-negative and uses PID to consider only the redundant information between sender and receiver, as appropriate to identify transmission of information. Moreover, because DFI discounts only the past activity of the receiver, rather than its feature-specific information, it was less precise and less conservative in localizing the direction and timing of feature-specific information flow (see SM4.2).

Finally, a study defined feature-specific information using PID in the space of four variables S, Xpast, Ypast, and Ypres [4]. However, this measure was not upper bounded by either feature information encoded in the past of the sending region or the total information flowing between regions.

6. Discussion

We developed and validated FIT, an information theoretic measure of the feature-specific information transfer between a sender X and a receiver Y. FIT combines the PID concepts of redundancy and uniqueness of information [56] with the Wiener-Granger causality principle [10] to isolate, within the overall transmitted information (TE), the flow of information specifically related to a feature S.

The strengths and limitations of FIT as a neural data analysis tool mostly reflect the general ones of information theory for studying neural information processing. Information theory has led to major advances in the study of neural coding, not only because the brain is an information-processing machine and information theory is a natural tool to quantify it, but because it captures linear and non-linear interactions at all orders while making few assumptions [39; 14]. This is important for a general neural analysis tool, because deviations from linearity and the order of interactions vary in often unknown ways between brain areas, stimulus types and recording modalities [12; 24; 18]. Using such a general formalism avoids potentially biasing results with wrong assumptions. However, the price to pay for the fact that information theory works with full probability distributions is that it is data hungry. While our definitions of FIT and cFIT remain valid for multivariate analyses, including conditioning on the information of multiple regions [32] (as in cFIT) or obtaining more conservative estimates of information transmission in which the information in the receiver Y is required to be unique with respect to the information of the sender and the receiver at multiple past time points [53], in practice and for data-sampling reasons these analyses are confined in real data to conditioning on one region or on a single past time point [6; 34; 33]. In future work, we aim to make FIT applicable to analyses of multiple regions or time points by coupling it with advanced non-parametric methods [45] to robustly estimate the required multivariate probability distributions.

The generality of our approach lends itself to further developments. Importantly, we defined FIT directly at the level of PID atoms. This means that, although here we implemented FIT using the original definition of redundancy in PID [56] because it has the advantage of being non-negative for all information atoms, FIT can also be easily implemented using other PID redundancy measures [20; 5; 2; 17; 25] with complementary advantages and disadvantages (see SM1.2).

To demonstrate the properties of FIT, we performed numerical simulations in different communication scenarios and compared FIT against TE (Fig. 2). These simulations confirmed that TE effectively detected the overall propagation of activity, but it did not detect the flow of feature-specific information. FIT, in contrast, reliably detected feature- and direction-specific information flow with high temporal sensitivity. We confirmed the utility of FIT in applications to neural data. In three brain datasets spanning the range of electrophysiological recordings (spiking activity, MEG and EEG), FIT credibly determined the directionality and feature specificity of information flow. Importantly, in most of these datasets this happened in the absence of variations in the overall flow of activity between the same brain regions (measured with TE). The partial dissociation between the overall flow of activity and the feature-specific flow of information, found consistently in simulations and data, has important implications. First, it highlights the need for a specific measure of feature-information transfer such as FIT, as it resolves questions unaddressed by content-unspecific measures. Second, it establishes that measuring feature-specific components of information flow between brain regions is critical to go beyond the measurement of overall neural activity propagation and to uncover aspects of cross-area communication relevant for ongoing behavior. Thus, as methods to record from multiple brain areas rapidly advance, FIT is well suited to uncover fundamental principles of how brain regions communicate.

Supplementary Material

Supplement 1

Acknowledgements

This research has received funding from the European Union’s Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 945539 (Human Brain Project SGA3), from the NIH Brain Initiative (grants U19 NS107464, R01 NS109961, R01 NS108410), and from the Simons Foundation (SFARI Human Cognitive and Behavioral Science grant 982347). We thank G. Bondanelli and G.M. Lorenz for useful discussion and feedback on the manuscript.

References

  • [1] Barnett L. and Seth A. K. The MVGC multivariate Granger causality toolbox: A new approach to Granger-causal inference. Journal of Neuroscience Methods, 223:50–68, 2014.
  • [2] Barrett A. B. Exploration of synergistic and redundant information sharing in static and dynamical Gaussian systems. Physical Review E, 91(5):052802, 2015.
  • [3] Bastos A. M., Vezoli J., Bosman C. A., Schoffelen J.-M., Oostenveld R., Dowdall J. R., Weerd P. D., Kennedy H., and Fries P. Visual areas exert feedforward and feedback influences through distinct frequency channels. Neuron, 85(2):390–401, 2015.
  • [4] Beer R. D. and Williams P. L. Information processing and dynamics in minimally cognitive agents. Cognitive Science, 39(1):1–38, 2014.
  • [5] Bertschinger N., Rauh J., Olbrich E., Jost J., and Ay N. Quantifying unique information. Entropy, 16(4):2161–2183, 2014.
  • [6] Besserve M., Lowe S. C., Logothetis N. K., Scholkopf B., and Panzeri S. Shifts of gamma phase across primary visual cortical sites reflect dynamic stimulus-modulated information transfer. PLOS Biology, 13(9):e1002257, 2015.
  • [7] Bieler M., Xu X., Marquardt A., and Hanganu-Opatz I. L. Multisensory integration in rodent tactile but not visual thalamus. Scientific Reports, 8(1):15684, 2018.
  • [8] Bosman C. A., Schoffelen J.-M., Brunet N., Oostenveld R., Bastos A. M., Womelsdorf T., Rubehn B., Stieglitz T., Weerd P. D., and Fries P. Attentional stimulus selection through selective synchronization between monkey visual areas. Neuron, 75(5):875–888, 2012.
  • [9] Bressler S. L. and Menon V. Large-scale brain networks in cognition: emerging methods and principles. Trends in Cognitive Sciences, 14(6):277–290, 2010.
  • [10] Bressler S. L. and Seth A. K. Wiener-Granger causality: A well-established methodology. NeuroImage, 58(2):323–329, 2011.
  • [11] Brovelli A., Ding M., Ledberg A., Chen Y.-h., Nakamura R., and Bressler S. L. Beta oscillations in a large-scale sensorimotor cortical network: directional influences revealed by Granger causality. Proceedings of the National Academy of Sciences, 101(26):9849–9854, 2004.
  • [12] Chelaru M. I., Eagleman S., Andrei A. R., Milton R., Kharas N., and Dragoi V. High-order interactions explain the collective behavior of cortical populations in executive but not sensory areas. Neuron, 109(24):3954–3961, 2021.
  • [13] Combrisson E., Allegra M., Basanisi R., Ince R. A., Giordano B. L., Bastin J., and Brovelli A. Group-level inference of information-based measures for the analyses of cognitive brain networks from neurophysiological data. NeuroImage, 258:119347, 2022.
  • [14] de Ruyter van Steveninck R. R., Lewen G. D., Strong S. P., Koberle R., and Bialek W. Reproducibility and variability in neural spike trains. Science, 275(5307):1805–1808, 1997.
  • [15] Diamond M. E., Armstrong-James M., and Ebner F. F. Somatic sensory responses in the rostral sector of the posterior group (POm) and in the ventral posterior medial nucleus (VPM) of the rat thalamus. The Journal of Comparative Neurology, 318(4):462–476, 1992.
  • [16] Donner T. H. and Siegel M. A framework for local cortical oscillation patterns. Trends in Cognitive Sciences, 15(5):191–199, 2011.
  • [17] Finn C. and Lizier J. Pointwise partial information decomposition using the specificity and ambiguity lattices. Entropy, 20(4):297, 2018.
  • [18] Ganmor E., Segev R., and Schneidman E. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proceedings of the National Academy of Sciences, 108(23):9679–9684, 2011.
  • [19] Granger C. W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424–438, 1969.
  • [20] Griffith V. and Koch C. Quantifying synergistic mutual information. In Guided Self-Organization: Inception, pages 159–190. Springer, Berlin, Heidelberg, 2014.
  • [21] Hadjipapas A., Lowet E., Roberts M., Peter A., and Weerd P. D. Parametric variation of gamma frequency and power with luminance contrast: A comparative study of human MEG and monkey LFP and spike responses. NeuroImage, 112:327–340, 2015.
  • [22] Ince R. A., Van Rijsbergen N. J., Thut G., Rousselet G. A., Gross J., Panzeri S., and Schyns P. G. Tracing the flow of perceptual features in an algorithmic brain network. Scientific Reports, 5:17681, 2015.
  • [23] Ince R. A., Jaworska K., Gross J., Panzeri S., Van Rijsbergen N. J., Rousselet G. A., and Schyns P. G. The deceptively simple N170 reflects network information processing mechanisms involving visual feature coding and transfer across hemispheres. Cerebral Cortex, 26(11):4123–4135, 2016.
  • [24] Kayser C., Körding K. P., and König P. Processing of complex stimuli and natural scenes in the visual cortex. Current Opinion in Neurobiology, 14(4):468–473, 2004.
  • [25] Kolchinsky A. A novel approach to the partial information decomposition. Entropy, 24(3):403, 2022.
  • [26] Li M., Han Y., Aburn M. J., Breakspear M., Poldrack R. A., Shine J. M., and Lizier J. T. Transitions in information processing dynamics at the whole-brain network level are driven by alterations in neural gain. PLoS Computational Biology, 15(5):e1006957, 2019.
  • [27] Lizier J., Bertschinger N., Jost J., and Wibral M. Information decomposition of target effects from multi-source interactions: Perspectives on previous, current and future work. Entropy, 20(4):307, 2018.
  • [28] Magri C., Whittingstall K., Singh V., Logothetis N. K., and Panzeri S. A toolbox for the fast information analysis of multiple-site LFP, EEG and spike train recordings. BMC Neuroscience, 10(1):1–24, 2009.
  • [29] Maris E. and Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1):177–190, 2007.
  • [30] Massey J. L. Causality, feedback and directed information. In International Symposium on Information Theory Applications, 1990.
  • [31] Michalareas G., Vezoli J., van Pelt S., Schoffelen J.-M., Kennedy H., and Fries P. Alpha-beta and gamma rhythms subserve feedback and feedforward influences among human visual cortical areas. Neuron, 89(2):384–397, 2016.
  • [32] Montalto A., Faes L., and Marinazzo D. MuTE: A MATLAB toolbox to compare established and novel estimators of the multivariate transfer entropy. PLoS ONE, 9(10):e109462, 2014.
  • [33] Oever S. T., Sack A. T., Oehrn C. R., and Axmacher N. An engram of intentionally forgotten information. Nature Communications, 12(1):6443, 2021.
  • [34] Palmigiano A., Geisel T., Wolf F., and Battaglia D. Flexible information routing by transient synchrony. Nature Neuroscience, 20(7):1014–1022, 2017.
  • [35] Panzeri S., Senatore R., Montemurro M. A., and Petersen R. S. Correcting for the sampling bias problem in spike train information measures. Journal of Neurophysiology, 98(3), 2007.
  • [36] Panzeri S., Moroni M., Safaai H., and Harvey C. D. The structures and functions of correlations in neural population codes. Nature Reviews Neuroscience, 23(9):551–567, 2022.
  • [37] Pica G., Piasini E., Chicharro D., and Panzeri S. Invariant components of synergy, redundancy, and unique information among three variables. Entropy, 19(9):451, 2017.
  • [38] Pica G., Piasini E., Safaai H., Runyan C., Harvey C., Diamond M., Kayser C., Fellin T., and Panzeri S. Quantifying how much sensory information in a neural code is relevant for behavior. In Advances in Neural Information Processing Systems 30, pages 3686–3696, 2017.
  • [39] Quiroga R. Q. and Panzeri S. Extracting information from neuronal populations: information theory and decoding approaches. Nature Reviews Neuroscience, 10(3):173–185, 2009.
  • [40] Ramirez-Villegas J. F., Besserve M., Murayama Y., Evrard H. C., Oeltermann A., and Logothetis N. K. Coupling of hippocampal theta and ripples with pontogeniculooccipital waves. Nature, 589(7840):96–102, 2020.
  • [41] Ray S. and Maunsell J. H. Differences in gamma frequencies across visual cortex restrict their possible use in computation. Neuron, 67(5):885–896, 2010.
  • [42] Rohenkohl G., Bosman C. A., and Fries P. Gamma synchronization between V1 and V4 improves behavioral performance. Neuron, 100(4):953–963, 2018.
  • [43] Rousselet G. A., Ince R. A., van Rijsbergen N. J., and Schyns P. G. Eye coding mechanisms in early human face event-specific potentials. Journal of Vision, 14(13):7, 2014.
  • [44] Runyan C. A., Piasini E., Panzeri S., and Harvey C. D. Distinct timescales of population coding across cortex. Nature, 548(7665):92–96, 2017.
  • [45] Safaai H., Onken A., Harvey C., and Panzeri S. Information estimation using nonparametric copulas. Physical Review E, 98(5):053302, 2018.
  • [46] Schreiber T. Measuring information transfer. Physical Review Letters, 85(2):461–464, 2000.
  • [47] Seth A. K., Barrett A. B., and Barnett L. Granger causality analysis in neuroscience and neuroimaging. Journal of Neuroscience, 35(6):3293–3297, 2015.
  • [48] Sherman S. M. and Guillery R. W. The role of the thalamus in the flow of information to the cortex. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 357(1428):1695–1708, 2002.
  • [49] Stramaglia S., Angelini L., Wu G., Cortes J. M., Faes L., and Marinazzo D. Synergetic and redundant information flow detected by unnormalized Granger causality: Application to resting state fMRI. IEEE Transactions on Biomedical Engineering, 63(12):2518–2524, 2016.
  • [50] Van Kerkoerle T., Self M. W., Dagnino B., Gariel-Mathis M.-A., Poort J., Van Der Togt C., and Roelfsema P. R. Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proceedings of the National Academy of Sciences, 111(38):14332–14341, 2014.
  • [51] Van Vugt B., Dagnino B., Vartak D., Safaai H., Panzeri S., Dehaene S., and Roelfsema P. R. The threshold for conscious report: Signal loss and response bias in visual and frontal cortex. Science, 360(6384):537–542, 2018.
  • [52] Varela F., Lachaux J.-P., Rodriguez E., and Martinerie J. The brainweb: Phase synchronization and large-scale integration. Nature Reviews Neuroscience, 2(4):229–239, 2001.
  • [53] Vicente R., Wibral M., Lindner M., and Pipa G. Transfer entropy - a model-free measure of effective connectivity for the neurosciences. Journal of Computational Neuroscience, 30(1):45–67, 2011.
  • [54] Vinck M., Huurdeman L., Bosman C. A., Fries P., Battaglia F. P., Pennartz C. M., and Tiesinga P. H. How to detect the Granger-causal flow direction in the presence of additive noise? NeuroImage, 108:301–318, 2015.
  • [55] Wiener N. The theory of prediction. In Beckenbach E. F. (ed.), Modern Mathematics for Engineers. New York: McGraw-Hill, 1956.
  • [56] Williams P. L. and Beer R. D. Nonnegative decomposition of multivariate information. arXiv, 2010.
  • [57] Wilming N., Murphy P. R., Meyniel F., and Donner T. H. Large-scale dynamics of perceptual decision information across human cortex. Nature Communications, 11(1):5109, 2020.
