Abstract
The transformation and transmission of brain stimuli reflect dynamical brain activity in space and time. Compared with functional magnetic resonance imaging (fMRI), magneto- or electroencephalography (M/EEG) couples rapidly to neural activity through the magnetic fields that the activity generates. However, the MEG signal is inhomogeneous across the brain because it is affected by the signal-to-noise ratio and by the location and distance of the sensors. Current non-invasive neuroimaging modalities such as fMRI and M/EEG offer high resolution in either space or time, but not in both. To overcome this main limitation of current techniques for recording brain activity, we propose a novel recurrent memory optimization approach to predict the internal behavioral states in space and time. The proposed method uses Optimal Polynomial Projections to capture long temporal histories with robust online compression. The training process takes pairs of fMRI and MEG data as inputs and predicts the recurrent brain states through a Siamese network. In the testing process, the framework uses only fMRI data to generate the corresponding neural response in space and time. Experimental results on the Human Connectome Project (HCP) show that the predicted signal reflects neural activity with the high spatial resolution of fMRI and the high temporal resolution of MEG. These results demonstrate for the first time that the proposed method is able to predict the brain response at both millisecond and millimeter scales using only the fMRI signal.
Keywords: Brain dynamics, Recurrent neural network, fMRI
1. Introduction
In computational neuroscience, brain state often refers to wakefulness, sleep, and anesthesia. However, a precise account of dynamical brain complexity is still missing. Dynamical neural representation is thought to arise from neurons, which do not fire in isolation [11]. The development of modern brain measuring techniques makes it possible to infer large-scale structural and functional connectivity and to characterize the anatomical and functional patterns of the human cortex. Electrophysiological methods provide a direct and non-invasive way to record brain activity with millisecond temporal resolution, unaffected by the problems that intermediate processes commonly introduce. fMRI uses the brain activity-related blood-oxygen-level-dependent (BOLD) signal to capture neural representations at the millimeter level; however, it is too sluggish in the time domain. None of the current non-invasive brain recording techniques can measure high-resolution dynamics in both space and time. This is a major challenge in neuroscience, and it has drawn researchers toward developing simultaneous simulation models of brain dynamics.
Since each brain region interacts with other regions, it is hard to measure intraregional brain dynamics with a single technique such as fMRI or MEG. As the strength of the interactions between brain regions increases, the measured dynamical brain signals decrease. Recent studies of complex biological systems provide powerful mathematical functions for modeling the coupling of structural and functional brain activity using multimodal brain measurements [19]. Previous studies use principal component analysis (PCA) [3, 12, 13] or linear models [9, 15] to quantify the inherent neural representations in either high or low dimensions. High dimensionality is used for information encoding, whereas low dimensionality is used to encode information in complex cognitive or motor tasks [8]. Modern methods such as representational similarity analysis (RSA) [7] and multivoxel pattern analysis (MVPA) [10] are used to quantify brain activity across task conditions. Further developments define states as the transmission between different task activities in order to characterize higher cognitive dynamics [2, 3, 14, 16, 18].
The question remains: how can the change and transmission between neurons be modeled with high spatial and temporal resolution? In this paper, we propose a novel method that uses fMRI to predict the corresponding neural activity with high resolution in space and time. Our main contributions are summarized as follows:
Current non-invasive brain measurement techniques cannot identify neural activity with high resolution in both the spatial and temporal domains. MEG/EEG resolves brain activity in milliseconds, but its spatial resolution is low. In contrast, fMRI represents neural dynamics at the millimeter scale, but it is too slow to capture the rapid transmission between cortical sites. To link brain regions and time points, we propose a general framework that projects the time points onto a polynomial basis. In the training step, the proposed method takes ROI-parcellated brain signals (fMRI and MEG) to learn the dynamical representation. In the testing step, the proposed method is the first to use the original 4D fMRI signal to predict the internal brain response in space and time.
Most existing recurrent neural networks (RNNs) suffer from the vanishing gradient problem when dealing with long time series, which makes it challenging to represent the brain response at millisecond resolution with current RNN models. The proposed method discretizes the time points and projects them onto the polynomial basis, so it can handle time series of any length through memory compression.
Two experiments demonstrate that the proposed method is able to resolve neural activity in space and time. Its feasibility is validated in the structural domain by comparison with fMRI, using brain networks obtained with independent component analysis (ICA). In the temporal domain, the predicted results show patterns similar to those of the MEG signal.
The proposed method is related to research on fMRI and MEG data fusion. The differences between the proposed method and existing data fusion methods are summarized as follows. 1) The proposed method does not use MEG data in the testing step. It is able to use 4D fMRI data alone to predict the internal brain states in space and time. The MEG data are used in the training step as a reference from which the proposed method learns the temporal pattern of brain states. The research problem can be considered as "super-resolution" in the temporal domain of fMRI data that preserves the structural details. In contrast, as far as we know, most previous methods require both fMRI and MEG data to predict the internal states, and simultaneous fMRI and MEG data are hard to obtain. 2) The proposed method can predict high-resolution temporal signals for each voxel of the fMRI data. In the testing step, the proposed method does not require preprocessing steps such as ROI averaging or beamforming, which could introduce noise into the preprocessed data. As far as we know, the proposed method is the first to predict the spatio-temporal internal brain states for each voxel using only fMRI data.
2. Spatio-Temporal Dynamical Modeling of Brain Response
Our approach benefits from the framework of Legendre Memory Unit (LMU) and online function approximation to learn the memory representation [6, 17]. The dynamical brain network with neurons could be modeled as
| (1) |
where represents the internal states of neuron nodes at time . denotes the nonlinear dynamical function of each node. And represents the external stimuli for neurons. is the observed time series of brain measurements (i.e. MEG or fMRI signals). and are time series noise of the internal states and measurements. is the output function which controls the output of the brain measurements at time . The nonlinear brain function is modeled using recurrent neural network (RNN).
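The typeset form of Eq. 1 did not survive in this copy of the text. As a reading aid only, a generic nonlinear state-space model that is consistent with the verbal definitions above can be written as below. All symbol names here ($x$, $f$, $u$, $y$, $h$, $w$, $v$, $N$) are assumptions chosen for illustration and may differ from the authors' notation.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t),\, u(t)\bigr) + w(t), \\
y(t)       &= h\bigl(x(t)\bigr) + v(t),
\end{aligned}
\qquad x(t) \in \mathbb{R}^{N}
```

In this sketch $x(t)$ collects the internal states of the $N$ nodes, $f$ is the nonlinear dynamical function, $u(t)$ the external stimuli, $y(t)$ the observed measurement (MEG or fMRI), $h$ the output function, and $w(t)$, $v(t)$ the state and measurement noise.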
The whole framework of the proposed method is shown in Fig. 1. In the training stage, the proposed method consists of two parts: the RNN model with Polynomial Projection, and the Siamese network that scores the agreement between the network prediction and the original time series of brain activity. Both region-of-interest (ROI) extracted fMRI and MEG signals are used to train the network. In the testing stage, however, only the fMRI signal is used to predict the internal states in space and time. The experiments demonstrate that the proposed method can handle both the ROI-extracted fMRI signal and the original 4D fMRI image. The predicted internal behavioral states show high resolution in both the spatial and temporal domains.
Fig. 1.

Illustration of the proposed spatio-temporal brain dynamical network (a) Training framework. Both ROI extracted fMRI and MEG signals are used in the training process. (b) Testing framework for generating the internal states which uses 4D fMRI as input.
We propose Polynomial Projection Operators together with recurrent memory to solve for the dynamical internal state . Given the input or , the proposed method aims to predict the future based on the cumulative history . However, because the temporal resolution of the MEG signal is very high, a vanishing gradient problem arises when the model evolves over all time states. To address it, we project the input state onto a subspace and maintain a compressed historical representation. Two problems must therefore be solved: how to quantify the approximation, and how to learn the subspace.
Function Approximation.
We introduce the probability measure to define the space of square-integrable functions . is defined on a subspace . The function is chosen to minimize the approximation error of with , where is the measure ranging over .
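The idea of approximating a function within a polynomial subspace under a measure can be illustrated with a small least-squares experiment. The sketch below assumes a uniform measure on [-1, 1] (approximated by an evenly spaced grid) and a Legendre basis; the test signal, the grid size and the helper name `legendre_projection_error` are hypothetical choices, not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_projection_error(signal, grid, degree):
    """Least-squares fit of `signal` on `grid` with Legendre polynomials.

    Returns the fitted coefficients and the root-mean-square residual,
    which plays the role of the approximation error under a uniform measure.
    """
    coeffs = legendre.legfit(grid, signal, degree)
    approx = legendre.legval(grid, coeffs)
    rmse = float(np.sqrt(np.mean((signal - approx) ** 2)))
    return coeffs, rmse


# Hypothetical smooth test signal sampled on an even grid over [-1, 1].
grid = np.linspace(-1.0, 1.0, 200)
signal = np.sin(3.0 * grid)

# A larger basis (more coefficients) gives a smaller approximation error.
_, err_low = legendre_projection_error(signal, grid, degree=2)
_, err_high = legendre_projection_error(signal, grid, degree=8)
print(f"degree 2 RMSE: {err_low:.4f}, degree 8 RMSE: {err_high:.6f}")
```

The number of coefficients corresponds to the size of the compression: a few coefficients summarize the whole sampled history, at the cost of a residual that shrinks as the basis grows.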
Polynomial Basis Expansion for Subspace Learning.
To learn the suitable subspace, we define the polynomial basis with parameter to represent the projected history, which means the historical dynamics could be represented using coefficients with the basis . represents the size of the compression. However, it is challenging to maintain the parameter when . We show more details of the suitable polynomial basis in Supplementary material.
The first step is to choose an appropriate basis for the projection operator. The projection operator takes the historical memory of and minimizes the approximation error of using . According to approximation theory, we use the orthogonal polynomials of as the orthogonal basis and represent the coefficients .
The second key step is to differentiate the projection using the inner product , which leads to a similar result that is expressed using and . Thus, satisfies the ODE form , where and . For each time , we can change to the ODE form,
| (2) |
The whole derivation of is shown in Supplementary material. Next we want to solve the ODE problem and obtain the coefficient by discretizing the continuous function and , which yields the recurrence using . The orthogonal basis is defined as follows,
| (3) |
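To make the online compression concrete, the following sketch implements one standard instance of the polynomial projection recurrence from [6]: the scaled Legendre (LegS) measure, whose coefficient ODE is $\dot{c}(t) = -\tfrac{1}{t} A\,c(t) + \tfrac{1}{t} B\,f(t)$. Whether the paper uses this exact measure, sign convention and discretization is an assumption; here a backward-Euler step is used for simplicity, and the function names are hypothetical.

```python
import numpy as np


def legs_matrices(n_coeffs):
    """Build the (A, B) pair of the scaled-Legendre (LegS) projection in [6].

    A[n, k] = sqrt(2n+1) * sqrt(2k+1) if n > k, n + 1 if n == k, 0 otherwise.
    B[n]    = sqrt(2n+1).
    """
    n = np.arange(n_coeffs)
    scale = np.sqrt(2 * n + 1)
    A = np.tril(np.outer(scale, scale), k=-1)  # strictly lower-triangular part
    A += np.diag(n + 1.0)                      # diagonal entries n + 1
    B = scale.copy()
    return A, B


def project_history(signal, n_coeffs=16):
    """Compress a 1-D signal online into `n_coeffs` polynomial coefficients.

    Backward-Euler step of dc/dt = -(1/t) A c + (1/t) B f(t) at step k:
        (I + A / k) c_k = c_{k-1} + (B / k) f_k
    The state size stays fixed no matter how long the signal is.
    """
    A, B = legs_matrices(n_coeffs)
    eye = np.eye(n_coeffs)
    c = np.zeros(n_coeffs)
    history = np.empty((len(signal), n_coeffs))
    for k, f_k in enumerate(signal, start=1):
        c = np.linalg.solve(eye + A / k, c + B * f_k / k)
        history[k - 1] = c
    return history


# Usage: 1000 samples are summarized by 8 coefficients at every step.
coeff_history = project_history(np.ones(1000), n_coeffs=8)
print(coeff_history.shape)
```

For a constant input the zeroth coefficient approaches the input value and the higher-order coefficients decay toward zero, which is what one expects when a flat history is projected onto an orthogonal polynomial basis.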
2.1. The Siamese Network for Behavioral Prediction in Space and Time
To score the agreement between the time series prediction and the original brain measurements, we introduce the Siamese network [1], a weight-sharing network for comparing two views. It should be noted that the Siamese network is used only in the training process, to help learn the structural and functional patterns of fMRI and MEG. We give a detailed description of the training and testing procedures in the Supplementary material. The Siamese network measures the similarity between fMRI and MEG signals. It consists of two views: the MEG signal predicted from fMRI and the corresponding ground-truth MEG signal. The two input views are processed by the encoder network . The encoder network shares the same weights between the two views. The encoder network of the predicted MEG-like signal is followed by an MLP head to match the output of the other view.
Encoder Network with Continuous Convolutions.
To predict the multimodal brain measurements using the orthogonal basis in Eq. 3, we use continuous convolutions to represent the output signals. Following previous work on the bilinear method, we convert the state and in Eq. 3 into an approximation and . Based on Eq. 3, the continuous convolution is defined as,
| (4) |
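The relationship between a discretized linear state-space recurrence and a convolution can be checked numerically. The sketch below assumes the usual bilinear discretization, $\bar{A} = (I - \tfrac{\Delta}{2}A)^{-1}(I + \tfrac{\Delta}{2}A)$ and $\bar{B} = (I - \tfrac{\Delta}{2}A)^{-1}\Delta B$, and a scalar input and output; the matrices, step size and function names are hypothetical and are not the paper's trained parameters.

```python
import numpy as np


def bilinear_discretize(A, B, step):
    """Bilinear (Tustin) discretization of dx/dt = A x + B u."""
    eye = np.eye(A.shape[0])
    left = np.linalg.inv(eye - step / 2.0 * A)
    return left @ (eye + step / 2.0 * A), left @ (step * B)


def ssm_kernel(A_bar, B_bar, C, length):
    """Convolution kernel K[j] = C @ A_bar^j @ B_bar for j = 0..length-1."""
    kernel = np.empty(length)
    state = B_bar.copy()
    for j in range(length):
        kernel[j] = C @ state
        state = A_bar @ state
    return kernel


def run_recurrence(A_bar, B_bar, C, u):
    """Step-by-step recurrence x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k."""
    x = np.zeros(A_bar.shape[0])
    y = np.empty(len(u))
    for k, u_k in enumerate(u):
        x = A_bar @ x + B_bar * u_k
        y[k] = C @ x
    return y


# Hypothetical small system with a seeded random input.
rng = np.random.default_rng(0)
A = -np.eye(4) + 0.1 * rng.standard_normal((4, 4))
B = rng.standard_normal(4)
C = rng.standard_normal(4)
u = rng.standard_normal(64)

A_bar, B_bar = bilinear_discretize(A, B, step=0.1)
y_rec = run_recurrence(A_bar, B_bar, C, u)
y_conv = np.convolve(u, ssm_kernel(A_bar, B_bar, C, len(u)))[: len(u)]
print(np.allclose(y_rec, y_conv))
```

The two outputs agree because unrolling the recurrence from a zero initial state gives $y_k = \sum_{j=0}^{k} C\bar{A}^{j}\bar{B}\,u_{k-j}$, which is a causal convolution of the input with the kernel.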
Similarity Measurement.
Consider two kinds of inputs and at time with . represents the predicted brain measurement of the output model in Eq. 1. is the original time series of brain activity, such as fMRI or MEG signals. The similarity of the two views is defined as
| (5) |
where and is -norm.
In the experiment, we adopt the stop-gradient operation, which treats the second input as a constant. The symmetrized loss is denoted by
| (6) |
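The similarity and the symmetrized loss can be sketched in the form used by the Siamese framework cited as [1], namely a negative cosine similarity between a predictor output and an encoder output, averaged over both pairings. That the paper follows this form exactly is an assumption, and the names `negative_cosine` and `symmetrized_loss` are hypothetical. NumPy has no automatic differentiation, so the stop-gradient is only indicated in comments; in an autodiff framework the second argument would be detached from the graph.

```python
import numpy as np


def negative_cosine(p, z, eps=1e-12):
    """Negative cosine similarity between rows of p and z, averaged over rows.

    p: output of the MLP head for one view (batch x features).
    z: encoder output of the other view, treated as a constant
       (this is where a stop-gradient would be applied during training).
    """
    p = p / (np.linalg.norm(p, axis=-1, keepdims=True) + eps)
    z = z / (np.linalg.norm(z, axis=-1, keepdims=True) + eps)
    return float(-np.mean(np.sum(p * z, axis=-1)))


def symmetrized_loss(p1, z1, p2, z2):
    """Average of the two cross-view terms, as in the formulation of [1]."""
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)


# Usage with hypothetical feature vectors: identical views give the minimum.
v = np.array([[1.0, 2.0, 3.0]])
print(symmetrized_loss(v, v, v, v))
```

The loss lies in [-1, 1]; it is lowest when the two views point in the same direction, and it is insensitive to the overall scale of either vector.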
3. Experimental Results
We tested the fidelity of the proposed method on the resting-state fMRI and MEG data from the HCP dataset. The resting-state fMRI was pre-processed following the minimal preprocessing pipeline [5]. The pre-processed data were then registered to a standard cortical surface using MSMAll [5]. Artefacts were removed using ICA-FIX. The cortical surface was parcellated into N=360 major ROIs [4], and the averaged time course of each ROI was normalized using the z-score. The resting-state MEG was pre-processed using ICA to remove artefacts related to head and eye movement. Sensor-space data were down-sampled to 300 Hz using an anti-aliasing filter. Next, the MEG data were source-reconstructed with a scalar beamformer and registered to the standard space of the Montreal Neurological Institute (MNI). MEG signals were then filtered to 1–30 Hz and beamformed onto a 6 mm grid. We used the theta (4–8 Hz), alpha (8–13 Hz) and beta (13–30 Hz) bands to filter the source-space data. The parcellation atlas and z-score normalization used for MEG were similar to those used for the resting-state fMRI.
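The ROI averaging and z-score normalization steps can be sketched as follows. This is a minimal illustration on synthetic data, not the HCP pipeline: the array layout (voxels by time), the integer label vector and the function names `roi_average` and `zscore` are assumptions made for the example.

```python
import numpy as np


def roi_average(voxel_series, labels, n_rois):
    """Average voxel time series (voxels x time) within each ROI.

    `labels[i]` is the ROI index (0..n_rois-1) of voxel i; every ROI is
    assumed to contain at least one voxel.
    """
    out = np.zeros((n_rois, voxel_series.shape[1]))
    for roi in range(n_rois):
        out[roi] = voxel_series[labels == roi].mean(axis=0)
    return out


def zscore(series, eps=1e-12):
    """Normalize each row (one ROI time course) to zero mean and unit variance."""
    mean = series.mean(axis=1, keepdims=True)
    std = series.std(axis=1, keepdims=True)
    return (series - mean) / (std + eps)


# Synthetic example: 40 voxels, 4 ROIs, 100 time points.
rng = np.random.default_rng(0)
voxels = rng.standard_normal((40, 100)) * 5.0 + 2.0
labels = np.arange(40) % 4
roi_series = zscore(roi_average(voxels, labels, n_rois=4))
print(roi_series.shape)
```

Normalizing each ROI time course separately removes differences in offset and scale between regions and modalities, so that the fMRI and MEG inputs are on a comparable footing.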
3.1. Spatio-Temporal Patterns of the Predicted Results Using 4D fMRI Image
We next applied the proposed method to acquire the dynamics of the behavioral representation for each voxel in the fMRI signal. We used the whole 4D fMRI image to generate the corresponding behavioral states in the spatio-temporal domain. Fig. 2 shows the spatial maps of the independent temporal signals obtained from the original MEG and from the predicted results. The spatial maps were generated using ICA. With the 25 ICA temporal components generated from the MEG and from the predicted results, we paired the spatial maps of 8 resting-state brain networks (RSNs) with those derived from resting-state fMRI. The DMN pattern is shown in Fig. 2a, where the nodes are highlighted in the medial frontal cortex and inferior parietal lobules. The patterns of the left-lateralized frontoparietal and sensorimotor networks are shown in Fig. 2b and Fig. 2c. Fig. 2 shows that the spatial pattern of the predicted results matches that of MEG in all of these networks, while the spatial resolution of the predicted results is much higher than that of the MEG signal.
Fig. 2.

Brain networks acquired with ICA shown in the order predicted results (top) and MEG (bottom). (a) DMN, (b) left frontoparietal network, (c) sensorimotor network.
The results show that the temporal ICA components originate from brain regions that correlate with the RSN spatial maps. We show more results in the Supplementary material. The proposed method can generate high-resolution behavioral states that closely match the spatial pattern of the observed fMRI. Thus, the proposed method could help characterize the role of brain dynamics in relation to behavioral measurements.
3.2. Temporal Pattern of Predicted Results
We finally evaluated the temporal pattern of the predicted internal states against the original 4D fMRI and MEG images. Fig. 3 shows the averaged neural time series of the predicted results, the fMRI signal and the MEG signal in the visual network. Fig. 3 shows that the proposed method inherits the high temporal resolution of the MEG signal in a dynamical system. The proposed method predicts the hidden observation of the dynamical neural transmission locally and globally. In Table 1, we use the mean squared error (MSE) to measure the similarity between the ROI-averaged predicted results and the ROI-averaged MEG signals. We also compare the proposed method with three baselines: LSTM, GRU-D and the proposed method without the Siamese network. Polynomial Projection Operators could be combined with The introduction of spatio-temporal constraints into the dynamical system is consistent with their fundamental role in biological systems. The proposed method provides an opportunity to understand how behavioral representations evolve over time.
Fig. 3.

Averaged neural time series in visual network (a) fMRI, (b) predicted result, (c) MEG.
Table 1.
Temporal pattern prediction results with different baseline models
| Methods | LSTM | GRU-D | Without Siamese network | Proposed method |
|---|---|---|---|---|
| MSE | 0.3199 | 0.5805 | 0.1134 | 0.0577 |
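For reference, the MSE between ROI-averaged predictions and ROI-averaged MEG signals can be computed as in the sketch below. The array layout (ROIs by time points) and the function name `roi_mse` are assumptions, and the numbers in the usage example are toy values unrelated to Table 1.

```python
import numpy as np


def roi_mse(predicted, reference):
    """Mean squared error over all ROIs and time points.

    Both inputs are arrays of shape (n_rois, n_timepoints) that have been
    normalized in the same way (for example, z-scored per ROI).
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")
    return float(np.mean((predicted - reference) ** 2))


# Toy usage: a constant offset of 1 gives an MSE of exactly 1.
print(roi_mse([[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]))
```

Because the signals are z-scored, an MSE well below 1 indicates that the prediction tracks the reference more closely than a constant zero signal would.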
4. Conclusions
To understand complex brain dynamics, we need to record brain activity in both space and time, which current non-invasive brain measurement techniques cannot do. We propose a novel computational model that combines the information from several techniques and predicts the internal brain activity with both high spatial and high temporal resolution. The proposed framework addresses the vanishing gradient problem by abstracting the long-term temporal relationship with function approximation. In the training step, we use both fMRI and MEG data to learn the dynamical brain representation with a Siamese network. In the testing step, for the first time, the proposed method solves the problem of predicting the internal behavioral states with high resolution in the spatial and temporal domains using only fMRI data. The potential of the proposed method to represent spatio-temporal dynamics has been demonstrated in two experiments with HCP data.
Acknowledgement.
This work was partially supported by NIH R01AG071243, R01MH125928, R01AG049371, U01AG068057, and NSF IIS 2045848, 1845666, 1852606, 1838627, 1837956, 1956002, IIA 2040588.
Footnotes
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-16431-6_32.
References
- 1. Chen X, He K: Exploring simple siamese representation learning. arXiv preprint arXiv:2011.10566 (2020)
- 2. Cornblath EJ, et al.: Temporal sequences of brain activity at rest are constrained by white matter structure and modulated by cognitive demands. Commun. Biol. 3(1), 1–12 (2020)
- 3. Gallego JA, Perich MG, Naufel SN, Ethier C, Solla SA, Miller LE: Cortical population activity within a preserved neural manifold underlies multiple motor behaviors. Nat. Commun. 9(1), 1–13 (2018)
- 4. Glasser MF, et al.: A multi-modal parcellation of human cerebral cortex. Nature 536(7615), 171–178 (2016)
- 5. Glasser MF, et al.: The minimal preprocessing pipelines for the human connectome project. Neuroimage 80, 105–124 (2013)
- 6. Gu A, Dao T, Ermon S, Rudra A, Ré C: HiPPO: Recurrent memory with optimal polynomial projections. arXiv preprint arXiv:2008.07669 (2020)
- 7. Kriegeskorte N, Mur M, Bandettini PA: Representational similarity analysis-connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4 (2008)
- 8. McIntosh AR, Mišić B: Multivariate statistical analyses for neuroimaging data. Annu. Rev. Psychol. 64, 499–525 (2013)
- 9. Musall S, Kaufman MT, Juavinett AL, Gluf S, Churchland AK: Single-trial neural dynamics are dominated by richly varied movements. Nat. Neurosci. 22(10), 1677–1686 (2019)
- 10. Norman KA, Polyn SM, Detre GJ, Haxby JV: Beyond mind-reading: multivoxel pattern analysis of fMRI data. Trends Cogn. Sci. 10(9), 424–430 (2006)
- 11. Saxena S, Cunningham JP: Towards the neural population doctrine. Curr. Opin. Neurobiol. 55, 103–111 (2019)
- 12. Shine JM, et al.: Human cognition involves the dynamic integration of neural activity and neuromodulatory systems. Nat. Neurosci. 22(2), 289–296 (2019)
- 13. Stringer C, Pachitariu M, Steinmetz N, Carandini M, Harris KD: High-dimensional geometry of population responses in visual cortex. Nature 571(7765), 361–365 (2019)
- 14. Taghia J, et al.: Uncovering hidden brain state dynamics that regulate performance and decision-making during cognition. Nat. Commun. 9(1), 1–19 (2018)
- 15. Tang E, Mattar MG, Giusti C, Lydon-Staley DM, Thompson-Schill SL, Bassett DS: Effective learning is accompanied by high-dimensional and efficient representations of neural activity. Nat. Neurosci. 22(6), 1000–1009 (2019)
- 16. Tavares RM, et al.: A map for social navigation in the human brain. Neuron 87(1), 231–243 (2015)
- 17. Voelker AR, Kajić I, Eliasmith C: Legendre memory units: Continuous-time representation in recurrent neural networks. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (2019)
- 18. Zhao C, Gao X, Emery WJ, Wang Y, Li J: An integrated spatio-spectral-temporal sparse representation method for fusing remote-sensing images with different resolutions. IEEE Trans. Geosci. Remote Sens. 56(6), 3358–3370 (2018)
- 19. Zhao C, Li H, Jiao Z, Du T, Fan Y: A 3D convolutional encapsulated long short-term memory (3DConv-LSTM) model for denoising fMRI data. In: Martel AL, Abolmaesumi P, Stoyanov D, Mateus D, Zuluaga MA, Zhou SK, Racoceanu D, Joskowicz L (eds.) MICCAI 2020. LNCS, vol. 12267, pp. 479–488. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59728-3_47