Abstract
For decades, task functional magnetic resonance imaging (tfMRI) has been a powerful noninvasive tool for exploring the organizational architecture of human brain function. Researchers have developed a variety of brain network analysis methods for task fMRI data, including the general linear model (GLM), independent component analysis (ICA) and sparse representation methods. However, a growing number of studies have shown that these shallow models are limited in faithfully reconstructing and modeling the hierarchical and temporal structures of brain networks. Recently, recurrent neural networks (RNNs) have exhibited a great ability to model hierarchical and temporal dependency features in machine learning, which might make them suitable for task fMRI data modeling. To explore these possible advantages of RNNs for task fMRI data, we propose a novel Deep Recurrent Neural Network (DRNN) framework to model functional brain networks from task fMRI data. Experimental results on the motor task fMRI data of the Human Connectome Project 900 subjects release demonstrate that the proposed DRNN can not only faithfully reconstruct functional brain networks, but also identify more meaningful brain networks at multiple time scales which are overlooked by traditional shallow models. In general, this work provides an effective and powerful approach to identifying functional brain networks at multiple time scales from task fMRI data.
Keywords: Task fMRI, Brain network, RNN, Deep learning
I. Introduction
Exploring the organizational architecture of human brain function has been an intense interest in the neuroscience community since the inception of neuroscience [1-5]. After decades of active research using noninvasive neuroimaging methods such as functional magnetic resonance imaging (fMRI), there has been mounting evidence that brain function is realized by the interaction of multiple concurrent neural processes or functional brain networks [6-10] and that these networks are spatially distributed across specific structural substrates of neuroanatomical areas [11, 12]. In these fMRI-based studies, researchers developed a variety of brain network reconstruction and modeling techniques, such as the general linear model (GLM) [13, 14], principal component analysis (PCA) [15], independent component analysis (ICA) [16-18] and sparse representation/dictionary learning methods [19-27]. Among these methods, GLM is among the most widely used for task-based fMRI (tfMRI) data analysis, and ICA is among the dominant methods for resting state fMRI (rsfMRI) data analysis. These methods have reconstructed many meaningful functional brain networks, characterized by both spatial maps and corresponding temporal time series, from both tfMRI and rsfMRI data sets, and have greatly advanced our understanding of the regularity and variability of brain function [14, 16, 20].
However, those existing approaches, which are based on shallow models, are limited in faithfully reconstructing and modeling the hierarchical and temporal structures of brain functional networks in tfMRI data [28, 29]. Recently, deep learning methods have attracted much attention in a variety of challenges [30, 31] and artificial intelligence applications [32-36]. The success of deep learning methods lies in their ability to automatically and hierarchically represent the raw data. Typically, deep neural networks consist of multiple layers in which higher-layer features are derived from lower-level features, thus yielding a hierarchical feature representation of the raw data. Inspired by this success, more and more researchers have applied deep learning methods in medical image analysis, such as image registration [37], image segmentation [38, 39], image fusion [40], computer-aided diagnosis and prognosis [41, 42], lesion/landmark detection [43-45], hemodynamic response function (HRF) estimation [46] and functional brain network analysis [42, 47-49]. For instance, Suk et al. [42] investigated functional connectivities in resting-state fMRI data using a Deep Auto-Encoder (DAE); Hjelm and Huang et al. [50, 51] identified brain functional networks using the restricted Boltzmann machine (RBM); Zhao et al. used 3D convolutional neural networks to classify fMRI-derived functional brain networks [49] and adopted a Deep Convolutional Autoencoder to construct fine-granularity functional brain network atlases. Although recent works demonstrate the superiority of deep learning methods for functional brain network analysis [47-49], information at multiple time scales is rarely taken into consideration in these models, even though it is known that brain activities span multiple time scales [52].
Recently, recurrent neural networks (RNNs) have gained more and more attention [53], especially in machine translation [54], speech recognition [55] and language modeling [56]. RNNs are feedforward neural networks augmented with edges that span adjacent time steps, introducing the notion of time to the neural network model [57]. Unlike traditional neural networks, RNNs can use their internal memory units to process arbitrary sequences of inputs and model sequential and time dependencies on multiple time scales [57]. That is, RNN models make their predictions based not only on the information available at a given time, but also on the information that was available in the past. In fact, brain activity is modulated by long temporal dependencies [46], which coincides well with the characteristics of RNN models. Therefore, it is quite natural and well justified to adopt RNNs to explore functional brain networks in tfMRI data. Although RNN-based fMRI data analysis frameworks have been proposed before [46, 58, 59], they focused on deriving representational stimulus features or HRF functions. It has rarely been explored whether RNNs can be utilized to infer functional brain networks from whole-brain tfMRI data.
In order to explore the possible advantages of RNN models, in this study, we propose a novel deep recurrent neural network (DRNN) framework for modeling functional brain networks from tfMRI data. An important characteristic of the DRNN framework is that the task stimulus information is sequentially processed through the model, which automatically generates the observed whole-brain voxel signals. In this way, the hierarchical and temporal structures of brain activities are captured, and brain networks at multiple time scales (especially time-dependency-sensitive brain networks) can be identified. We used the motor task tfMRI dataset of the HCP 900 subjects release as a test bed, and extensive experimental results demonstrated the superiority of the proposed method in identifying functional brain networks at multiple time scales in tfMRI data. Interestingly, the proposed DRNN framework can not only identify the well-shaped functional brain networks found by GLM, but also additional networks at multiple time scales which were overlooked by traditional shallow methods.
II. Materials and Methods
A. Overview
Fig. 1 summarizes the proposed deep recurrent neural network (DRNN) model. There are three major steps in modeling tfMRI functional brain networks using the DRNN. First, for each subject, the task design stimulus curves are gathered into a stimulus matrix X (k stimuli over t time points) as the input layer, and the whole-brain tfMRI signals are aggregated into a big signal matrix Y (m voxels' signals over t time points). Then the task stimulus patterns pass through two hidden layers, each with nh RNN units. Next, the response of the top hidden layer is connected to the whole-brain signal matrix via a fully connected layer ([nh, m]). Specifically, each hidden node's connection weight vector represents a typical functional brain network, and its corresponding hidden response to specific stimulus patterns represents the temporal activity pattern of that network.
Fig. 1.
Overview of the DRNN model.
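To make the data flow concrete, the three steps above can be sketched at the matrix level. This is a minimal NumPy illustration with basic tanh recurrence and small, arbitrary dimensions (the paper uses LSTM/GRU units and hundreds of thousands of voxels); all variable names and magnitudes below are ours, not the original implementation's.

```python
import numpy as np

def rnn_layer(X, Wx, Wh, b):
    """A basic tanh RNN layer: h(t) = tanh(Wx x(t) + Wh h(t-1) + b)."""
    h = np.zeros(Wh.shape[0])
    H = []
    for x_t in X:                          # iterate over the t time points
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        H.append(h)
    return np.array(H)                     # t x n_hidden hidden-state sequence

rng = np.random.default_rng(0)
t, k, nh, m = 284, 6, 30, 1000             # time points, stimuli, hidden units, voxels (illustrative)

X = rng.standard_normal((t, k))            # stimulus matrix: k task stimulus curves
H1 = rnn_layer(X, rng.standard_normal((nh, k)) * 0.1,
               rng.standard_normal((nh, nh)) * 0.1, np.zeros(nh))   # hidden layer 1
H2 = rnn_layer(H1, rng.standard_normal((nh, nh)) * 0.1,
               rng.standard_normal((nh, nh)) * 0.1, np.zeros(nh))   # hidden layer 2
W_fc = rng.standard_normal((nh, m))        # FC layer [nh, m]: row j is network j's spatial map
Y_hat = H2 @ W_fc                          # reconstructed whole-brain signal matrix (t x m)
```

Each row of `W_fc` plays the role of one hidden node's connection weight vector (the spatial distribution of one network), while the corresponding column of `H2` is its temporal activity pattern.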
B. Data Acquisition and Pre-processing
The Human Connectome Project (HCP) dataset is one of the most systematic and comprehensive neuroimaging datasets currently available; it aims to bring data from the major MRI neuroimaging modalities together into a cohesive framework to enable detailed comparisons of brain architecture, connectivity, and function across individual subjects. Importantly, this dataset is publicly available, which makes it a good test bed for different researchers. In this paper, we adopted the motor tfMRI dataset of the HCP 900 subjects release and all the tfMRI datasets of the HCP Q1 release to test our proposed method. There are 68 subjects in the Q1 release dataset and over 800 subjects in the HCP 900 subjects release dataset.
Seven categories of behavioral tasks are included in the HCP tfMRI dataset: working memory, gambling, motor, language, social, relational, and emotion tasks. In the working memory task, participants were presented with blocks of trials that consisted of pictures of places, tools, faces and body parts; a version of the N-back task was chosen to assess working memory [60]. The gambling task was adapted from Delgado et al. [61]: participants played a card guessing game, guessing the number on a mystery card in order to win or lose money. The motor task was adapted from Buckner and colleagues [62, 63]: participants were presented with visual cues asking them to tap fingers, squeeze toes, or move the tongue. The language task was developed by Binder et al. [64] and consists of two runs that each interleave blocks of a story task and a math task. In the social task, participants were presented with short video clips of objects (squares, circles, triangles) either interacting in some way or moving randomly, and were asked after each clip to judge whether the objects had a mental interaction [65-68]. The relational task was adapted from Smith et al. [69]: participants were given stimuli consisting of different shapes filled with different textures under relational or matching conditions. The emotion task was adapted from Hariri and colleagues [70, 71]: participants were presented with blocks of trials consisting of faces and shapes and were asked to match them. Table 1 lists several parameters of the HCP tfMRI datasets. The detailed parameters and design paradigms of the tasks are available in [72, 73].
Table 1.
Parameters for HCP tfMRI datasets.
| Task | Duration (min) | # of frames | # of task blocks |
|---|---|---|---|
| Working Memory | 5:01 | 405 | 8 |
| Gambling | 3:12 | 253 | 4 |
| Motor | 3:34 | 284 | 10 |
| Language | 3:57 | 316 | 8 |
| Social | 3:27 | 274 | 5 |
| Relational | 2:56 | 232 | 6 |
| Emotion | 2:16 | 176 | 6 |
The detailed acquisition parameters of these tfMRI data were as follows: 220 mm FOV, in-plane FOV: 208×180 mm, flip angle = 52°, BW = 2290 Hz/Px, 2×2×2 mm spatial resolution, 90×104 matrix, 72 slices, TR = 0.72 s, TE = 33.1 ms. The preprocessing of the task fMRI data sets includes skull removal, motion correction, slice time correction, spatial smoothing, and global drift removal (high-pass filtering). All these preprocessing steps were implemented in FSL FEAT. The data acquisition protocol and preprocessing procedures are detailed in the literature [20, 74]. All individual fMRI datasets were first registered to the MNI common space for further study. In addition, GLM-based activation results were derived using FSL FEAT for comparison.
C. Deep Recurrent Neural Network Model
RNNs are feedforward neural networks augmented with edges spanning adjacent time steps, in which connections between units form a directed cycle. These connections introduce a notion of time and provide memory of past states. In contrast to traditional neural networks, which only receive information at the bottom layer and output at the highest layer, RNNs receive input and produce output at each iteration step. However, a common RNN only processes information through one layer before producing output, which provides neither a hierarchical structure for processing the input information nor a clear temporal hierarchy of the input signals. In order to overcome these limitations, we propose a deep recurrent neural network (DRNN) framework for modeling functional brain networks from tfMRI data. The basic idea of the DRNN is to stack RNNs to construct a hierarchical network architecture [75]. Each hidden layer is a recurrent neural network, and the hidden state of each layer is the input of the next layer. In this way, new information propagates throughout the hierarchy during each network update and temporal context is added in each layer (Fig. 2). As demonstrated in character-based language modelling studies [75], stacking RNNs automatically creates different time scales across different levels and also forms a temporal hierarchical information processing structure.
Fig. 2.
Illustrative map of a DRNN. Blue circles represent input units, green circles represent hidden units and red circles represent output units.
We define a DRNN with L layers, where layer i has n_i hidden units. The input sequence is denoted as (x^(1), x^(2), ..., x^(t)), where each data point is a real-valued vector; the target sequence is denoted as (y^(1), y^(2), ..., y^(t)); and the hidden state of the i-th layer at time t is denoted as h_i^(t). In order to avoid confusion between the indices of nodes and sequence steps, we use superscripts for time and subscripts for layer index. The output of the DRNN model can be written as:
| ŷ^(t) = V h_L^(t) + b | (1) |
where ŷ^(t) is the estimated output from the top hidden layer h_L^(t), V is the weight matrix between the top hidden layer and the output, and b is the bias vector containing the offset of each output node.
There are different types of RNN architectures, and the long short-term memory (LSTM) unit [76] is the most popular specialized memory unit for RNNs, developed for long time series. Each LSTM unit has a cell state to store information from previous time points, so it can "remember" information over longer time series. Thus, the LSTM unit performs better than the basic RNN unit and other shallow models in modeling human brain activities [46], since information in the human brain is processed in a hierarchical and dynamic way and has various time dependencies [28, 29]. The hidden state of an LSTM unit is defined as:
| h_t = o_t ⊙ tanh(c_t) | (2) |
| o_t = σ(W_o x_t + U_o h_{t-1} + b_o) | (3) |
where c_t is the cell state, o_t are the output gate activities and ⊙ denotes elementwise multiplication. Information about the previous time points is stored in the cell state, and what information is retrieved from the cell state is controlled by the output gate. The cell state of an LSTM unit is defined as:
| f_t = σ(W_f x_t + U_f h_{t-1} + b_f) | (4) |
| i_t = σ(W_i x_t + U_i h_{t-1} + b_i) | (5) |
| c̃_t = tanh(W_c x_t + U_c h_{t-1} + b_c) | (6) |
| c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t | (7) |
where f_t are the forget gate activities and i_t are the input gate activities. Forget gates decide what old information will be thrown away from the cell state, while input gates control what new information will be kept in the cell state. c̃_t is an auxiliary variable created by a tanh layer that proposes new candidate values to be added to the state. Furthermore, U_f, U_i, U_c and W_f, W_i, W_c are the corresponding weights and b_f, b_i, b_c are the biases. These parameters are shared over time and determine the behavior of the gates.
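As a concrete reference, the LSTM equations above can be implemented directly. The sketch below is a single LSTM time step in plain NumPy following the standard formulation; the parameter dictionary and its key names are our own convention, not the paper's or TensorFlow's internals.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update. p maps names like 'Wf', 'Uf', 'bf' to arrays."""
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])    # forget gate, Eq. (4)
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])    # input gate, Eq. (5)
    c_hat = np.tanh(p["Wc"] @ x_t + p["Uc"] @ h_prev + p["bc"])  # candidate values, Eq. (6)
    c_t = f_t * c_prev + i_t * c_hat                             # cell state update, Eq. (7)
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])    # output gate, Eq. (3)
    h_t = o_t * np.tanh(c_t)                                     # hidden state, Eq. (2)
    return h_t, c_t
```

Calling `lstm_step` in a loop over the stimulus sequence, with the returned `(h_t, c_t)` fed back in, yields the hidden-state time series of one layer.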
In this paper, we also adopted another popular RNN unit, the Gated Recurrent Unit (GRU) [54], for a validation study. Compared with the LSTM, the GRU combines the hidden state with the cell state and the input gate with the forget gate. The hidden state of a GRU unit is defined as:
| z_t = σ(W_z x_t + U_z h_{t-1} + b_z) | (8) |
| r_t = σ(W_r x_t + U_r h_{t-1} + b_r) | (9) |
| h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t-1}) + b_h) | (10) |
| h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t | (11) |
where z_t are the update gate activities, r_t are the reset gate activities and h̃_t is an auxiliary variable. Similar to the gates in LSTM units, GRU units control the information flow between time points. As with the parameters in LSTM units, U_z, U_r, U_h and W_z, W_r, W_h are the corresponding weights and b_z, b_r, b_h are the biases which determine the behavior of the gates. A sketch map of basic RNN units, LSTM units and GRU units is illustrated in Fig. 3.
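The GRU equations admit an equally short sketch. As with the LSTM example, this is a standard single-step GRU in NumPy with a hypothetical parameter dictionary, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU update. p maps names like 'Wz', 'Uz', 'bz' to arrays."""
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])            # update gate, Eq. (8)
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])            # reset gate, Eq. (9)
    h_hat = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r_t * h_prev) + p["bh"])  # candidate, Eq. (10)
    return (1.0 - z_t) * h_prev + z_t * h_hat                            # hidden state, Eq. (11)
```

Because the GRU folds the cell state into the hidden state and merges the input and forget gates, it has fewer parameters per unit than the LSTM while playing the same memory role in the stacked architecture.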
Fig. 3.
The structures of (a) a basic RNN unit, (b) an LSTM unit and (c) a GRU unit.
D. Identification of Functional Brain Networks
In the DRNN model, the task design stimulus information is separated into different time points and fed into the model step by step at each iteration. In each network update, new information is propagated through the hierarchical structure and temporal context is added in each RNN layer. Each hidden layer in the DRNN is a recurrent neural network, and each upper layer receives the hidden state of the previous layer as input. Thus, the outputs of the stacked RNN structure carry information at different time scales. Finally, the top hidden layer's output is connected to the whole-brain signal matrix via a fully connected layer. Specifically, each hidden node's connection weight vector represents the spatial distribution of a typical functional brain network, and its corresponding hidden response to a specific stimulus represents the temporal pattern of that network.
In order to compare the derived brain networks with those by other methods, a spatial matching method [20, 22] is adopted to calculate the spatial similarity between the identified networks and the network templates derived from other methods. The spatial similarity is defined as the spatial pattern overlap rate R:
| R(S, T) = |S ∩ T| / |S ∪ T| | (12) |
where S and T are cortical spatial maps of a brain network component and the brain network template, respectively.
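A minimal sketch of this spatial matching step, assuming the overlap rate is the intersection-over-union of the two binarized maps; the thresholding choice below is illustrative, since the exact binarization used in [20, 22] is not restated here.

```python
import numpy as np

def overlap_rate(S, T, threshold=0.0):
    """Spatial overlap rate between a network map S and a template T.
    Both maps are binarized at `threshold` before comparison."""
    S_bin = np.asarray(S) > threshold
    T_bin = np.asarray(T) > threshold
    union = np.logical_or(S_bin, T_bin).sum()
    if union == 0:
        return 0.0
    return np.logical_and(S_bin, T_bin).sum() / union
```

For two flattened spatial maps with one overlapping voxel out of three active overall, this returns 1/3.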
E. Theoretical Brain Responses
In traditional activation detection studies, the hypothetical brain responses (regressors in the GLM method) are modeled as the convolution of the stimulus function and a fixed hemodynamic response function (HRF) (Fig. 4(a)). However, this has been shown to be limited in interpreting neural activities across a wide range of brain areas [47]. Instead, previous studies reported that information is processed in a hierarchical and dynamic way in the human brain, and thus the responses of the human brain should be rich and variable, even under the same stimulus condition. In order to better interpret the observed brain response activities, we propose a novel theoretical brain response model [21, 47] based on the idea that the variety of brain responses are transformed forms of the basic theoretical response (Fig. 4(b)). Specifically, the basic theoretical regressors are extended into regressor groups via delay, derivative, integral and anti (inversed) operations, which are the most common transforms in signal processing [47]. The theoretical brain responses can be modeled as follows:
| r(t) = [r_1(t), r_2(t), ..., r_n(t)] | (13) |
where r(t) is the extended regressor group and r_i(t) represents the i-th extended regressor obtained with one of the operations above.
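The regressor-group construction can be sketched as follows. The delay grid, TR and concrete operators (finite differences for the derivative, a cumulative sum for the integral, a sign flip for inversion) are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def extend_regressors(r, delays_s=(3, 6, 9), tr=0.72):
    """Extend a basic theoretical response r(t) into a regressor group:
    delayed, derivative, integral and inversed variants of r."""
    r = np.asarray(r, dtype=float)
    group = [r]
    for d in delays_s:                         # delayed copies (zero-padded shifts)
        shift = max(1, int(round(d / tr)))
        delayed = np.zeros_like(r)
        delayed[shift:] = r[:-shift]
        group.append(delayed)
    group.append(np.gradient(r))               # derivative form
    group.append(np.cumsum(r))                 # integral form
    group.extend([-g for g in list(group)])    # anti (inversed) counterparts
    return np.stack(group)                     # one regressor per row
```

With three delays, the base regressor thus expands into twelve: the original, three delayed, one derivative and one integral form, plus the inversed counterpart of each.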
Fig. 4.
The pipeline of modeling hypothesized regressors. (a) Traditional methods; (b) Our method.
We treat the derived theoretical brain responses as the benchmarks of underlying neural activities and compare them with the DRNN response output. The detailed comparisons and analysis will be discussed in the results section.
F. Model Training and Interpretation
The proposed DRNN framework consists of two RNN layers with nh units each. The task stimulus information of each subject and the corresponding whole-brain fMRI signals are used to train the DRNN model. Dropout [77] is adopted to regularize the hidden layers. The parameters of the DRNN framework are optimized to minimize the mean square error between the whole-brain signals and their reconstructions. The TensorFlow [78] system is adopted to implement the models.
To be more specific, Fig. 5 shows the overall pipeline of the training and identification stages of the DRNN model. During the training stage, task design stimuli are fed into the model and the whole-brain tfMRI signals are aggregated into a big signal matrix (Fig. 5(b)) against which the reconstructed whole-brain signals are optimized. After model convergence, each vector of the weight matrix in the FC layer naturally represents the spatial distribution of a typical functional brain network (Fig. 5(e)). In order to aid the interpretation of the identified temporal response patterns, we extend the basic task paradigm pattern into theoretical regressor groups by the method in Section II.E (Fig. 5(d)), where each regressor represents a corresponding time scale pattern. By comparing the similarities between the extended regressors and the temporal response patterns, highly correlated response patterns are picked out, each representing a corresponding time scale temporal pattern. It should be noted that the temporal patterns are obtained by keeping a specific stimulus active (setting the others to zero) and passing this stimulus information through the trained model.
Fig. 5.
The pipeline of training DRNN model and identification of functional brain networks at multiple time scales. (a) Input task design stimuli. (b) Whole brain tfMRI signals which are aggregated into a signal matrix for optimizing the reconstructed brain signals. (c) Stimuli for identification by keeping one specific stimulus active and setting the others to zeros. (d) Identified temporal patterns at multiple time scales by comparing hidden output series and extended regressor groups to specific stimulus. (e) Weight matrix of FC layer that each vector represents a spatial distribution of a functional brain network.
III. Experimental Results
In this study, the proposed DRNN model has been tested on the motor task tfMRI dataset of HCP 900 subjects release and the whole HCP Q1 release dataset.
A. Model Implementation
After preprocessing and signal extraction, we obtained 244,341 voxels' signals for the motor task dataset of the HCP 900 subjects release and 223,945 for the HCP Q1 release dataset. The LSTM cells were initialized by the default initializer of TensorFlow, which is the Xavier uniform initializer [79], and the initial state was set to the zero state. The weights and bias of the fully connected layer were initialized to zeros. The learning rate was set to 0.004 with a decay factor of 0.25 every five epochs. The Adam optimizer [80] with its default parameters (β1 = 0.9, β2 = 0.999, ϵ = 10−8) was applied to optimize the parameters to minimize the mean square error (MSE) between the whole-brain signals and their reconstructions. The training was performed on dual NVIDIA GTX 1080Ti GPU cards for 20 epochs. It took approximately 18 hours to train this model on the HCP 900 subjects release dataset and half an hour on the HCP Q1 release dataset.
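The step-decay schedule described above (base rate 0.004, multiplied by 0.25 every five epochs) can be written as a one-line function; the function name is ours, but the constants come from the text.

```python
def learning_rate(epoch, base_lr=0.004, decay=0.25, step=5):
    """Step decay: the rate is multiplied by `decay` once every `step` epochs."""
    return base_lr * decay ** (epoch // step)

# Over the 20 training epochs this yields four plateaus:
# epochs 0-4: 0.004, 5-9: 0.001, 10-14: 0.00025, 15-19: 0.0000625
```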
Specifically, there are 822 subjects in the HCP 900 subjects release dataset and 68 subjects in the HCP Q1 release dataset. The same model was trained independently on each dataset. For each training run, all subjects' signals were used during the training stage, since training on grouped subjects' data helps avoid overfitting. Both L1- and L2-norm regularization were tried during the training stage, but the training loss increased rapidly with either regularizer, so only the MSE was taken as the loss function. Furthermore, independent training runs on several split subsets were performed as a validation study, and we obtained almost the same loss convergence (Fig. 6) and similar results. In fact, training on half of the HCP Q1 release dataset (34 subjects) could obtain similar results. However, more training data (HCP 900 subjects release) improves the reliability and interpretability of the results, reaches loss convergence faster, and avoids overfitting better.
Fig. 6.
Training losses on the whole dataset of HCP Q1 release (68 subjects), HCP 900 subjects release (822 subjects), and two subsets of HCP 900 subjects release with half data (411 subjects).
B. Identified Typical Functional Brain Networks
After training the proposed DRNN model, we can obtain the outputs of the hidden layers and the weight matrix of the fully connected layer. Specifically, each vector of the weight matrix represents the spatial distribution of a typical brain network and each corresponding hidden output represents its temporal pattern. Fig. 7 illustrates a few typical brain networks identified on the motor tfMRI dataset of the HCP 900 subjects release using the DRNN model. For comparison, we also list the GLM group-wise activation maps in the right column. This figure clearly shows that part of our trained functional networks are quite similar to the corresponding GLM activation maps. In order to quantitatively measure the similarity, we adopt Equation (12) to calculate the spatial overlap rate between the identified DRNN networks and the corresponding GLM activation maps, which is listed in the first row of Table 2. In addition, the corresponding temporal patterns are also quite similar to the common HRF response patterns (the convolution of the task design paradigm and the HRF function). Fig. 8 shows the corresponding temporal response patterns, the task design patterns and the HRF response patterns. It is easy to see that the temporal patterns of the DRNN brain networks have high correlations with the HRF responses. Through this comparison, the high spatial overlap rates and close temporal correlations suggest that the proposed DRNN model can identify meaningful and reliable functional networks in an automatic way.
Fig. 7.
A few identified functional brain networks in the motor task tfMRI dataset of HCP 900 subjects release. The left is the networks identified using DRNN model with LSTM units and the right is the GLM-derived group-wise activation maps. M1-M6 represent different stimuli.
Table 2.
The first row shows the spatial overlap rate between the identified networks by DRNN and the corresponding GLM-derived group-wise activation maps. The second row shows the Pearson correlation between the temporal pattern and the common HRF response patterns.
| | M1 | M2 | M3 | M4 | M5 | M6 |
|---|---|---|---|---|---|---|
| Spatial overlap rate | 0.66 | 0.52 | 0.56 | 0.54 | 0.44 | 0.54 |
| Temporal correlation | 0.93 | 0.94 | 0.90 | 0.95 | 0.94 | 0.85 |
Fig. 8.
Temporal response patterns corresponding to the identified functional brain networks in Fig. 7.
C. Identified Functional Brain Networks at Multiple Time Scales
During the training stage, the task stimulus information propagates through the hierarchical and temporal model iteratively, and the final output naturally reflects the different time scale responses to the original stimulus information, which correspond to brain networks at multiple time scales. In order to better interpret the identified functional brain networks, we further calculated the correlations between the identified temporal brain activity patterns and the theoretical regressor groups adopted in previous studies [47]. Essentially, the theoretical regressor groups represent the possible brain responses at multiple time scales. Our basic idea is that if a specific temporal pattern is highly correlated with an extended theoretical regressor, the corresponding identified DRNN network should belong to a similar time scale network.
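This matching step amounts to a correlation matrix between the temporal patterns and the extended regressors. The sketch below uses NumPy's `corrcoef`; the paper does not state which correlation implementation was used.

```python
import numpy as np

def correlation_map(patterns, regressors):
    """Pearson correlation of every temporal pattern (row of `patterns`)
    with every extended regressor (row of `regressors`)."""
    patterns = np.atleast_2d(patterns)
    regressors = np.atleast_2d(regressors)
    n = patterns.shape[0]
    C = np.corrcoef(np.vstack([patterns, regressors]))
    return C[:n, n:]       # networks x regressors correlation map
```

A network whose row peaks against, say, a delayed regressor would then be assigned to that time scale.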
Fig. 10 shows the temporal correlation maps between the temporal response patterns of the 30 identified DRNN networks for stimulus M6 and the extended hypothetical regressor groups in [47]. Similarly, we also extended the basic HRF response patterns with multiple delay, derivative, integral and inverse operations. Specifically, in each subfigure, each row represents a DRNN network and the 7 columns represent 7 different time delays ranging from 3 s to 21 s with an interval of 3 seconds. From this figure, we can see that a few network temporal patterns are highly correlated with the extended hypothetical regressors, and they represent brain networks identified at different time scales. Fig. 9 illustrates a few typical brain networks identified at different time scales and their corresponding temporal patterns. From this result, we can see that a variety of time scales of theoretical response networks, including multiple delays, multiple inversed HRFs and delays, and different derivative and integral operations, could be identified. We further checked the spatial patterns of these networks, and it is interesting that these networks are similar but not identical. This is reasonable since these networks are evoked by the same stimulus but at different time scales. That these multiple time scale brain networks can be effectively identified with the DRNN framework is a major advantage of the proposed model.
Fig. 10.
Temporal correlation maps between temporal responses patterns of the identified 30 DRNN networks and the extended hypothetical regressor groups. (a) HRF delay group; (b) derivative form group; (c) integral form group; (d) inversed HRF group; (e) inversed derivative form group; (f) inversed integral form group. In each subfigure, each row represents a DRNN network and 7 columns represent 7 different time delays with an interval of 3 seconds.
Fig. 9.
The spatial and temporal patterns of a few identified brain networks at multiple time scales shown in Fig. 10.
D. Identified Typical Functional Brain Networks in Different HCP Release Datasets
In order to validate the proposed method, we also applied the DRNN framework to all seven tasks of the HCP Q1 release datasets. We again take the motor task as an example. Both the HCP Q1 release and the HCP 900 subjects release follow the same task design, and the Q1 release is the earlier dataset with just 68 subjects. As expected, the identified brain networks and corresponding temporal patterns are quite similar. Fig. 14 shows a few typical networks identified in both motor tfMRI datasets; the left part is from the HCP Q1 release dataset, and the right part shows the identified networks from the HCP 900 subjects release dataset. It is easy to see that the spatial distributions of the identified networks are quite similar. We also checked the corresponding temporal patterns, which are illustrated in Fig. 13. Similarly, the corresponding temporal patterns are similar to the task design's HRF response patterns. A minor difference is that the functional networks identified on the HCP 900 subjects release dataset have more concentrated foci, as it has far more subjects (822) than the Q1 release dataset (68), which suggests that more training data improve the power of the DRNN model.
Fig. 14.
A few functional brain networks identified by DRNN model with 30 LSTM units. The left is trained on the motor tfMRI dataset of HCP Q1 release and the right is of HCP 900 subjects release.
Fig. 13.
Temporal response patterns corresponding to the identified functional brain networks of HCP Q1 release dataset in Fig. 14.
Furthermore, the DRNN model was trained independently on the six other tasks of the HCP Q1 release tfMRI datasets to explore reproducibility. A few typical functional brain networks identified by DRNN on the HCP Q1 Social, Language, Relational, Emotion, Working memory and Gambling task datasets (represented as S1, S2, L1, L2, R1, R2, E1, E2, W1, W2, W3, W4, G1, G2) and their corresponding temporal patterns are shown in Fig. 11 and Fig. 12. Among the subfigures in Fig. 12, we found that the temporal patterns of the working memory task do not match the HRF responses as well as those of the other tasks. A possible explanation is that the proportion of impulses during the whole stimulus series is quite low for the working memory task, so the DRNN did not have enough information to work at its best performance. From these results, we can see that the proposed DRNN model can identify meaningful functional brain networks on different tfMRI datasets, and that the model may further improve its detection precision with more training data.
Fig. 11.
Identified functional networks by DRNN with 30 LSTM units and the corresponding GLM-derived group-wise activation maps in HCP Q1 release. S, L, R, E, W, G represent the Social, Language, Relational, Emotion, Working memory and Gambling tfMRI datasets, respectively.
Fig. 12.
Temporal response patterns corresponding to the identified functional brain networks in Fig. 11.
E. Parameters
The type and number of RNN units and the number of layers are important parameters in RNN-based studies. However, how to optimize these parameters in RNN models remains an open question. Therefore, these parameters were set empirically in this paper. In order to evaluate the effects of different parameter settings on DRNN models, we adopted different combinations of parameters to examine the reproducibility and stability of our method.
1). Effect of Types of RNN Units
LSTM and GRU are the most popular RNN structures, and most influential RNN works are based on these two models [75]. Moreover, LSTM/GRU units are good at modeling long-term temporal dependencies [57], which are overlooked by traditional methods. Interestingly, we obtained similar results using LSTM units and GRU units. We again take the common brain networks in the motor tfMRI dataset of the HCP 900 subjects release as examples. Fig. 15 illustrates typical brain networks identified using LSTM and GRU units. The spatial maps and corresponding temporal patterns are quite close to each other, suggesting that the proposed DRNN model is stable and robust across LSTM and GRU unit types.
Fig. 15.
Typical brain networks identified using LSTM and GRU units.
We further examined the DRNN model with basic RNN units. Fig. 16 shows several typical temporal responses to the stimuli with basic RNN units on the motor tfMRI dataset of the HCP 900 subjects release. Interestingly, one type of response pattern of the basic RNN unit follows the task design stimulus curve rather than the theoretical HRF curve (Fig. 16(a, b)), while another type is not smooth and changes sharply from time point to time point (Fig. 16(c, d)). Neither simulates the theoretical HRF responses well. A potential reason is that the basic RNN unit cannot "remember" long-term information: its current hidden state depends only on the state at the previous time point. This also suggests that fMRI signals are sequential outputs of neural activities, and that the temporal dependency information in fMRI signals is beneficial for understanding the structure of functional brain networks.
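The short memory of the basic unit can be illustrated with a minimal numpy sketch (not the authors' code; weight scales and sizes are arbitrary assumptions). The hidden state is a function of the current input and the immediately preceding state only, so the trace of an isolated impulse fades within a few time steps:

```python
import numpy as np

# Minimal vanilla RNN forward pass: h_t = tanh(x_t @ W_x + h_{t-1} @ W_h + b).
# W_h is scaled small so the recurrence is contractive, which makes the
# forgetting effect visible: with no gating mechanism, the influence of an
# early impulse decays through repeated squashing.
rng = np.random.default_rng(0)
n_in, n_hidden, T = 1, 16, 40
W_x = 0.5 * rng.standard_normal((n_in, n_hidden))
W_h = 0.1 * rng.standard_normal((n_hidden, n_hidden))
b = np.zeros(n_hidden)

x = np.zeros((T, n_in))
x[0] = 1.0                       # single impulse at t = 0, silence afterwards

h = np.zeros(n_hidden)
norms = []
for t in range(T):
    h = np.tanh(x[t] @ W_x + h @ W_h + b)
    norms.append(np.linalg.norm(h))
# norms[0] is large, norms[-1] is near zero: the basic unit has forgotten
# the impulse long before the end of the sequence.
```

LSTM/GRU units avoid this collapse via gated additive state updates, which is why they can track stimuli over the long intervals typical of task designs.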
Fig. 16.
Typical temporal response patterns to the stimuli using basic RNN units.
2). Effect of Numbers of RNN Units
We also examined the effect of the number of RNN units, alternating it among 15, 30, 40, 50 and 80, again on the motor tfMRI dataset of the HCP 900 subjects release. Fig. 17 shows the spatial and temporal patterns of typical identified brain networks whose temporal responses are highly correlated with the same stimulus. In general, different numbers of RNN units produce similar spatial patterns and temporal responses, although fewer RNN units result in spatial patterns with more concentrated foci and larger training loss. Overall, the model remains stable across a range of parameter settings, demonstrating its robustness.
Fig. 17.
Spatial and temporal patterns of typical identified brain networks with 15, 30, 40, 50 and 80 LSTM units.
3). Effect of Numbers of RNN Layers
Experiments with different numbers of RNN layers (1, 2, 4, and 8) were conducted to explore their effect, again on the motor tfMRI dataset of the HCP 900 subjects release. The number of RNN layers mainly affects the temporal responses. In general, as the task design stimuli propagate through more RNN layers, additional temporal hierarchical information is captured. However, more temporal hierarchies make it harder for the model to simulate the brain signals accurately, which leads to a slower convergence rate during training and produces less correlated temporal responses. Fig. 18(a, b) shows two temporal response patterns with a single RNN layer; the response curves are sensitive to the edges of the task design stimulus curves. To some extent, the DRNN model with a single RNN layer behaves similarly to one with basic RNN units, since not enough temporal hierarchical information propagates through the model. With 8 RNN layers, the temporal response curve either cannot simulate each impulse of the stimulus curve at the same amplitude (Fig. 18(c)) or is almost horizontal (Fig. 18(d)). The two-hidden-layer structure was therefore chosen empirically to produce smoother and more correlated temporal responses.
Fig. 18.
Several typical temporal response patterns to the stimuli with a single RNN layer (a, b) and 8 RNN layers (c, d).
IV. Discussion
In this work, we proposed a novel deep recurrent neural network (DRNN) for modeling functional brain networks in tfMRI data. The DRNN framework naturally combines common deep neural networks with RNNs. Each hidden layer of the DRNN is a recurrent neural network, and the output of each layer is the input time series of the layer above. This structure automatically creates different time scales across levels and thus forms a temporal hierarchy. After training with the task stimuli, the whole-brain voxel signals are reconstructed from the top hidden layer's output. Specifically, the weight vector between the hidden units and the whole-brain fMRI signals describes the spatial distribution of a network, and the top hidden layer's output under a specific stimulus naturally represents the corresponding temporal pattern of that network. The hierarchical and temporal information of brain activity is thereby captured, and brain networks at different time scales can be identified. A detailed interpretation of the proposed model is shown in Fig. 5.
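The structure described above can be sketched in a few lines of numpy (an illustrative version, not the released implementation: vanilla tanh units stand in for the paper's LSTM/GRU units, and all sizes are toy assumptions). Each layer's full output sequence is the input of the layer above, and a final linear readout maps the top hidden layer to voxel signals; the readout weights play the role of the spatial maps, while the top-layer outputs are the temporal patterns:

```python
import numpy as np

def drnn_forward(x, layer_params, W_out):
    """x: (T, n_in) task design series; layer_params: one (W_x, W_h, b) tuple
    per recurrent layer; W_out: (n_hidden, n_voxels) readout whose columns
    act as the spatial maps of the identified networks."""
    seq = x
    for W_x, W_h, b in layer_params:                 # stack: layer l's output
        h = np.zeros((seq.shape[0], W_h.shape[0]))   # sequence feeds layer l+1
        h_prev = np.zeros(W_h.shape[0])
        for t in range(seq.shape[0]):
            h_prev = np.tanh(seq[t] @ W_x + h_prev @ W_h + b)
            h[t] = h_prev
        seq = h
    # reconstructed voxel signals, and the top-layer temporal patterns
    return seq @ W_out, seq

rng = np.random.default_rng(1)
T, n_stim, n_hidden, n_vox = 30, 6, 8, 100           # toy sizes
params = [(0.3 * rng.standard_normal((n_stim, n_hidden)),
           0.3 * rng.standard_normal((n_hidden, n_hidden)),
           np.zeros(n_hidden)),
          (0.3 * rng.standard_normal((n_hidden, n_hidden)),
           0.3 * rng.standard_normal((n_hidden, n_hidden)),
           np.zeros(n_hidden))]                      # two recurrent layers, as in the paper
W_out = rng.standard_normal((n_hidden, n_vox))
signals, patterns = drnn_forward(rng.standard_normal((T, n_stim)), params, W_out)
```

Training would then fit the recurrent weights and `W_out` by minimizing the reconstruction error between `signals` and the measured voxel time series (e.g. with Adam, as the paper does).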
We adopted the motor tfMRI dataset of the HCP 900 subjects release and the whole HCP Q1 tfMRI dataset as test beds, and extensive experiments demonstrated the superiority of the proposed model. As demonstrated in results sections III.B, III.C, and III.D, the output of the proposed framework is robust and consistent across a range of parameter settings and numbers of subjects. Several interesting observations from results sections III.B and III.C are summarized as follows: 1) Typical functional brain networks found by traditional methods can be well reconstructed; as shown in results section III.B, the DRNN model is able to recover GLM activation results. 2) The DRNN model can identify brain networks at multiple time scales; as shown in results section III.C, multiple temporal-scale response regressors (including delayed, inverted, derivative, and integral forms of the HRF regressor) and their corresponding spatial networks can be identified simultaneously. These results bring a novel understanding of functional brain networks and provide evidence for the hierarchical information-processing structure of the human brain. 3) The DRNN model automatically generates the observed fMRI signals from the stimulus task design patterns and reshapes a few task stimulus patterns into theoretical HRF response patterns, which suggests that the DRNN model provides a novel way to simulate brain responses and an alternative method to predict brain activity. To the best of our knowledge, this is the first work using RNNs to model functional brain networks.
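The multiple-time-scale matching in observation 2) can be made concrete with a small sketch (assumptions flagged: a canonical double-gamma HRF, the HCP TR of 0.72 s, and an arbitrary block design; the paper's actual regressors may differ). An identified temporal pattern is scored by Pearson correlation against the HRF-convolved stimulus and its delayed, inverted, derivative, and integral forms:

```python
import math
import numpy as np

TR = 0.72                                   # HCP tfMRI repetition time (s)
t = np.arange(0, 30, TR)
# Canonical double-gamma HRF (illustrative SPM-style parameters)
hrf = t**5 * np.exp(-t) / math.gamma(6) - (1.0 / 6.0) * t**15 * np.exp(-t) / math.gamma(16)

stim = np.zeros(200)                        # toy block design
for onset in (10, 60, 110, 160):
    stim[onset:onset + 15] = 1.0
base = np.convolve(stim, hrf)[:len(stim)]   # canonical response regressor

# The multiple time-scale variants discussed in the text
regressors = {
    "canonical": base,
    "delayed": np.roll(base, 8),
    "inverted": -base,
    "derivative": np.gradient(base),
    "integral": np.cumsum(base),
}

def pearson(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# A network's temporal pattern is assigned to its best-matching form; here a
# synthetic noisy anti-correlated pattern stands in for a DRNN output.
pattern = -base + 0.05 * np.random.default_rng(2).standard_normal(len(base))
best = max(regressors, key=lambda k: pearson(pattern, regressors[k]))
```

For the synthetic pattern above, `best` comes out as the inverted regressor, mirroring how each identified network would be labeled with the HRF form its temporal response tracks most closely.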
In this paper, we have focused on describing the basic idea of the proposed DRNN model and on estimating brain networks at multiple time scales with DRNNs. The extensive results demonstrate the superiority of the proposed method. However, the model could still be improved in several directions. First, although the motor tfMRI dataset of the HCP 900 subjects release is a good test bed, more datasets and subjects should be used to test the model; as the results suggest, more training subjects will help improve the detection accuracy. Second, the spatial patterns identified by the DRNN model are group-wise; it will be interesting to examine such spatial patterns at the individual subject level in the future, and the DRNN model could potentially be improved by integrating other methods to better analyze individual spatial patterns. Third, the computational load of the DRNN model is still relatively high, so more advanced RNN models and training methods should be tried to further improve the results. Fourth, visualization methods should be developed to better interpret the details of the deep learning model. Besides, the structure of the DRNN suggests a natural way to simulate brain responses, and the details of this simulation process should be further explored. Finally, the proposed method should be further adopted to investigate brain diseases such as mild cognitive impairment (MCI), Alzheimer's disease (AD), attention deficit hyperactivity disorder (ADHD), autism spectrum disorder (ASD) and schizophrenia in the future.
V. Conclusions
In general, our work contributes a novel framework for modeling functional brain networks in tfMRI data. This framework can not only identify typical brain networks as traditional methods (e.g., GLM) do, but also simultaneously detect functional brain networks at multiple time scales. Furthermore, the DRNN model automatically reconstructs the whole-brain fMRI signals from the stimulus task design patterns, mimicking the way the brain responds to stimuli. This model helps us better understand brain activity and can help develop more meaningful fMRI data modeling tools in the future. We plan to release our trained models and source code to facilitate more deep learning studies of fMRI. Source code will be released at https://github.com/yan-cui/DRNN.
Acknowledgments
This work was supported by the National Key R&D Program of China under contract No. 2017YFB1002201 and NSF of China 61806167. Y. Cui was supported by the Fundamental Research Funds for the Central Universities under grant 2017FZA5021. S. Zhao was supported by the National Science Foundation of China under Grant 61806167, the Fundamental Research Funds for the Central Universities under grant 3102017zy030 and the China Postdoctoral Science Foundation under grant 2017M613206. J. Han was supported by the National Science Foundation of China under Grants 61473231 and 61522207. T. Liu was supported by NIH R01 DA-033393, NIH R01 AG-042599, NSF CAREER Award IIS-1149260, NSF CBET-1302089, NSF BCS-1439051 and NSF DBI-1564736. L. Guo was supported by the National Science Foundation of China under Grant 61333017.
Contributor Information
Yan Cui, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, 310027, China.
Shijie Zhao, School of Automation, Northwestern Polytechnical University, Xi’an, 710072, China.
Han Wang, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, 310027, China.
Li Xie, College of Biomedical Engineering & Instrument Science, and the State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, 310027, China.
Yaowu Chen, College of Biomedical Engineering & Instrument Science, and Zhejiang Provincial Key Laboratory for Network Multimedia Technologies, Zhejiang University, Hangzhou, 310027, China.
Junwei Han, School of Automation, Northwestern Polytechnical University, Xi’an, 710072, China.
Lei Guo, School of Automation, Northwestern Polytechnical University, Xi’an, 710072, China.
Fan Zhou, College of Biomedical Engineering & Instrument Science, Zhejiang University, and the Zhejiang University Embedded System Engineering Research Center, Ministry of Education of China, Hangzhou, 310027, China.
Tianming Liu, Cortical Architecture Imaging and Discovery Lab, Department of Computer Science and Bioimaging Research Center, The University of Georgia, Athens, GA, 30602, USA.
References
- [1].Logothetis NK, "What we can do and what we cannot do with fMRI," Nature, vol. 453, pp. 869–878, 2008. [DOI] [PubMed] [Google Scholar]
- [2].Fister I, Suganthan PN, Kamal SM, Al-Marzouki FM, Perc M, and Strnad D, "Artificial neural network regression as a local search heuristic for ensemble strategies in differential evolution," Nonlinear Dynamics, vol. 84, pp. 895–914, 2016. [Google Scholar]
- [3].Erkaymaz O, Ozer M, and Perc M, "Performance of small-world feedforward neural networks for the diagnosis of diabetes," Applied Mathematics and Computation, vol. 311, pp. 22–28, 2017. [Google Scholar]
- [4].Gosak M, Markovič R, Dolenšek J, Rupnik MS, Marhl M, Stožer A, et al. , "Network science of biological systems at different scales: a review," Physics of life reviews, 2017. [DOI] [PubMed] [Google Scholar]
- [5].Jalili M and Perc M, "Information cascades in complex networks," Journal of Complex Networks, vol. 5, pp. 665–693, 2017. [Google Scholar]
- [6].Dosenbach NU, Visscher KM, Palmer ED, Miezin FM, Wenger KK, Kang HC, et al. , "A core system for the implementation of task sets," Neuron, vol. 50, p. 799, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Duncan J, "The multiple-demand (MD) system of the primate brain: mental programs for intelligent behaviour," Trends in Cognitive Sciences, vol. 14, p. 172, 2010. [DOI] [PubMed] [Google Scholar]
- [8].Fedorenko E, Duncan J, and Kanwisher N, "Broad domain generality in focal regions of frontal and parietal cortex," Proceedings of the National Academy of Sciences of the United States of America, vol. 110, p. 16616, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [9].Fox MD, Snyder AZ, Vincent JL, Corbetta M, Van Essen DC, and Raichle ME, "The human brain is intrinsically organized into dynamic, anticorrelated functional networks," Proceedings of the National Academy of Sciences of the United States of America, vol. 102, p. 9673, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Pessoa L, "Beyond brain regions: network perspective of cognition-emotion interactions," Behavioral & Brain Sciences, vol. 35, pp. 158–159, 2012. [DOI] [PubMed] [Google Scholar]
- [11].Huettel SA, Song AW, and McCarthy G, Functional magnetic resonance imaging vol. 1: Sinauer Associates Sunderland, 2004. [Google Scholar]
- [12].Bullmore E and Sporns O, "Complex brain networks: graph theoretical analysis of structural and functional systems," Nature Reviews Neuroscience, vol. 10, pp. 186–198, 2009. [DOI] [PubMed] [Google Scholar]
- [13].Friston KJ, Fletcher P, Josephs O, Holmes A, Rugg M, and Turner R, "Event-related fMRI: characterizing differential responses," Neuroimage, vol. 7, pp. 30–40, 1998. [DOI] [PubMed] [Google Scholar]
- [14].Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, and Frackowiak RS, "Statistical parametric maps in functional imaging: a general linear approach," Human brain mapping, vol. 2, pp. 189–210, 1994. [Google Scholar]
- [15].Andersen AH, Gash DM and Avison MJ, "Principal component analysis of the dynamic response measured by fMRI: a generalized linear systems framework," Magnetic Resonance Imaging, vol. 17, pp. 795–815, 1999. [DOI] [PubMed] [Google Scholar]
- [16].McKeown MJ, Jung T-P, Makeig S, Brown G, Kindermann SS, Lee T-W, et al. , "Spatially independent activity patterns in functional MRI data during the Stroop color-naming task," Proceedings of the National Academy of Sciences, vol. 95, pp. 803–810, 1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Biswal BB and Ulmer JL, "Blind source separation of multiple signal sources of fMRI data sets using independent component analysis," Journal of computer assisted tomography, vol. 23, pp. 265–271, 1999. [DOI] [PubMed] [Google Scholar]
- [18].Beckmann CF, DeLuca M, Devlin JT, and Smith SM, "Investigations into resting-state connectivity using independent component analysis," Philosophical Transactions of the Royal Society of London B: Biological Sciences, vol. 360, pp. 1001–1013, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [19].Jiang X, Li X, Lv J, Zhao S, Zhang S, Zhang W, et al. , "Temporal Dynamics Assessment of Spatial Overlap Pattern of Functional Brain Networks Reveals Novel Functional Architecture of Cerebral Cortex," IEEE Transactions on Biomedical Engineering, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [20].Lv J, Jiang X, Li X, Zhu D, Zhang S, Zhao S, et al. , "Holistic atlases of functional networks and interactions reveal reciprocal organizational architecture of cortical function," IEEE Transactions on Biomedical Engineering, vol. 62, pp. 1120–1131, 2015. [DOI] [PubMed] [Google Scholar]
- [21].Zhao S, Han J, Hu X, Jiang X, Lv J, Zhang T, et al. , "Extendable supervised dictionary learning for exploring diverse and concurrent brain activities in task-based fMRI," Brain Imaging & Behavior, pp. 1–15, 2017. [DOI] [PubMed] [Google Scholar]
- [22].Zhao S, Han J, Lv J, Jiang X, Hu X, Zhao Y, et al. , "Supervised dictionary learning for inferring concurrent brain networks," IEEE transactions on medical imaging, vol. 34, pp. 2036–2045, 2015. [DOI] [PubMed] [Google Scholar]
- [23].Hu X, Lv C, Cheng G, Lv J, Guo L, Han J, et al. , "Sparsity-constrained fMRI decoding of visual saliency in naturalistic video streams," IEEE Transactions on Autonomous Mental Development, vol. 7, pp. 65–75, 2015. [Google Scholar]
- [24].Shu Z, Yu Z, Xi J, Shen D, and Liu T, "Joint representation of consistent structural and functional profiles for identification of common cortical landmarks," Brain Imaging & Behavior, pp. 1–15, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [25].Zhang W, Jiang X, Zhang S, Howell BR, Zhao Y, Zhang T, et al. , "Connectome-scale functional intrinsic connectivity networks in macaques," Neuroscience, vol. 364, pp. 1–14, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [26].Zhang W, Lv J, Zhang S, Zhao Y, and Liu T, "Modeling resting state fMRI data via longitudinal supervised stochastic coordinate coding," in Biomedical Imaging (ISBI 2018), 2018 IEEE 15th International Symposium on, 2018, pp. 127–131. [Google Scholar]
- [27].Zhang W, Lv J, Li X, Zhu D, Jiang X, Zhang S, et al. , "Experimental Comparisons of Sparse Dictionary Learning and Independent Component Analysis for Brain Network Inference from fMRI Data," IEEE Transactions on Biomedical Engineering, 2018. [DOI] [PubMed] [Google Scholar]
- [28].Ferrarini L and Veer IE, "Hierarchical functional modularity in the resting-state human brain," Human Brain Mapping, vol. 30, pp. 2220–2231, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [29].Meunier D, Lambiotte R, Fomito A, Ersche KD, and Bullmore ET, "Hierarchical modularity in human brain functional networks," Frontiers in Neuroinformatics, vol. 3, p. 37, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [30].Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S et al. , "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision, vol. 115, pp. 211–252, 2015. [Google Scholar]
- [31].Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, and Zisserman A, "The pascal visual object classes challenge: A retrospective," International journal of computer vision, vol. 111, pp. 98–136, 2015. [Google Scholar]
- [32].Collobert R and Weston J, "A unified architecture for natural language processing:deep neural networks with multitask learning," in International Conference, 2008, pp. 160–167. [Google Scholar]
- [33].Hinton G, Deng L, Yu D, Dahl GE, Mohamed A.-r., Jaitly N, et al. , "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups," IEEE Signal Processing Magazine, vol. 29, pp. 82–97, 2012. [Google Scholar]
- [34].Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Driessche GVD, et al. , "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, p. 484, 2016. [DOI] [PubMed] [Google Scholar]
- [35].Karpathy A and Li FF, "Deep visual-semantic alignments for generating image descriptions," in Computer Vision and Pattern Recognition, 2015, pp. 3128–3137. [DOI] [PubMed] [Google Scholar]
- [36].Zhang J and Zong C, "Deep Neural Networks in Machine Translation: An Overview," IEEE Intelligent Systems, vol. 30, pp. 16–25, 2015. [Google Scholar]
- [37].Wu G, Kim M, Wang Q, Munsell BC, and Shen D, "Scalable High Performance Image Registration Framework by Unsupervised Deep Feature Representations Learning," IEEE Transactions on Biomedical Engineering, vol. 63, pp. 1505–1516, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [38].Zhang W, Li R, Deng H, Wang L, Lin W, Ji S, et al. , "Deep convolutional neural networks for multi-modality isointense infant brain image segmentation," Neuroimage, vol. 108, p. 214, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [39].Kleesiek J, Urban G, Hubert A, Schwarz D, Maier-Hein K, Bendszus M, et al. , "Deep MRI brain extraction: A 3D convolutional neural network for skull stripping," Neuroimage, vol. 129, p. 460, 2016. [DOI] [PubMed] [Google Scholar]
- [40].Suk HI, Lee SW, and Shen D, "Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis," Neuroimage, vol. 101, pp. 569–582, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [41].Suk HI, Lee SW, and Shen D, "Latent feature representation with stacked auto-encoder for AD/MCI diagnosis," Brain Structure & Function, vol. 220, pp. 841–859, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [42].Suk HI, Wee CY, Lee SW, and Shen D, "State-space model with deep learning for functional dynamics estimation in resting-state fMRI," Neuroimage, vol. 129, p. 292, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [43].Pereira S, Pinto A, Alves V, and Silva CA, "Brain Tumor Segmentation Using Convolutional Neural Networks in MRI Images," IEEE Transactions on Medical Imaging, vol. 35, pp. 1240–1251, 2016. [DOI] [PubMed] [Google Scholar]
- [44].Tulder GV and Bruijne MD, "Combining Generative and Discriminative Representation Learning for Lung CT Analysis With Convolutional Restricted Boltzmann Machines," IEEE Transactions on Medical Imaging, vol. 35, pp. 1262–1272, 2016. [DOI] [PubMed] [Google Scholar]
- [45].Dou Q, Chen H, Yu L, Zhao L, Qin J, Wang D, et al. , "Automatic Detection of Cerebral Microbleeds From MR Images via 3D Convolutional Neural Networks," IEEE Transactions on Medical Imaging, vol. 35, p. 1182, 2016. [DOI] [PubMed] [Google Scholar]
- [46].Güçlü U and Gerven MAJV, "Modeling the Dynamics of Human Brain Activity with Recurrent Neural Networks," Frontiers in Computational Neuroscience, vol. 11, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [47].Huang H, Hu X, Makkie M, Dong Q, Zhao Y, Han J, et al. , "Modeling Task fMRI Data via Deep Convolutional Autoencoder," 2017. [DOI] [PubMed] [Google Scholar]
- [48].Zhao Y, Dong Q, Chen H, Iraji A, Li Y, Makkie M, et al. , "Constructing fine-granularity functional brain network atlases via deep convolutional autoencoder," Medical Image Analysis, vol. 42, p. 200, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [49].Zhao Y, Dong Q, Zhang S, Zhang W, Chen H, Jiang X, et al. , "Automatic Recognition of fMRI-derived Functional Networks using 3D Convolutional Neural Networks," IEEE Transactions on Biomedical Engineering, vol. PP, pp. 1–1, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [50].Hjelm RD, Calhoun VD, Salakhutdinov R, Allen EA, Adali T, and Plis SM, "Restricted Boltzmann Machines for Neuroimaging: an Application in Identifying Intrinsic Networks," Neuroimage, vol. 96, p. 245, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [51].Huang H, Hu X, Han J, Lv J, Liu N, Guo L, et al. , "Latent source mining in FMRI data via deep neural network," in IEEE International Symposium on Biomedical Imaging, 2016, pp. 638–641. [Google Scholar]
- [52].Buzsaki G and Draguhn A, "Neuronal oscillations in cortical networks," Science, vol. 304, pp. 1926–1929, June 25 2004. [DOI] [PubMed] [Google Scholar]
- [53].Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. , "A survey on deep learning in medical image analysis," Medical Image Analysis, vol. 42, p. 60, 2017. [DOI] [PubMed] [Google Scholar]
- [54].Cho K, Van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. , "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation," Computer Science, 2014. [Google Scholar]
- [55].Graves A, Mohamed A.-r., and Hinton G, "Speech recognition with deep recurrent neural networks," in Acoustics, speech and signal processing (icassp), 2013 ieee international conference on, 2013, pp. 6645–6649. [Google Scholar]
- [56].Sutskever I, Martens J, and Hinton GE, "Generating Text with Recurrent Neural Networks," in International Conference on Machine Learning, ICML 2011, Bellevue, Washington, Usa, June 28 - July, 2011, pp. 1017–1024. [Google Scholar]
- [57].Lipton ZC, Berkowitz J, and Elkan C, "A Critical Review of Recurrent Neural Networks for Sequence Learning," Computer Science, 2015. [Google Scholar]
- [58].Kriegeskorte N, "Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing," Annual Review of Vision Science, vol. 1, p. 417, 2015. [DOI] [PubMed] [Google Scholar]
- [59].Güçlü U and van Gerven MA, "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream," Journal of Neuroscience the Official Journal of the Society for Neuroscience, vol. 35, p. 10005, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [60].Drobyshevsky A, Baumann SB, and Schneider W, "A rapid fMRI task battery for mapping of visual, motor, cognitive, and emotional function," Neuroimage, vol. 31, pp. 732–744, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [61].Delgado MR, Nystrom LE, Fissell C, Noll D, and Fiez JA, "Tracking the hemodynamic responses to reward and punishment in the striatum," Journal of neurophysiology, vol. 84, pp. 3072–3077, 2000. [DOI] [PubMed] [Google Scholar]
- [62].Buckner RL, Krienen FM, Castellanos A, Diaz JC, and Yeo BTT, "The organization of the human cerebellum estimated by intrinsic functional connectivity," Journal Of Neurophysiology, vol. 106, pp. 2322–2345, November 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [63].Thomas Yeo B, Krienen FM, Sepulcre J, Sabuncu MR, Lashkari D, Hollinshead M, et al. , "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," Journal of neurophysiology, vol. 106, pp. 1125–1165, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [64].Binder JR, Gross WL, Allendorfer JB, Bonilha L, Chapin J, Edwards JC, et al. , "Mapping anterior temporal lobe language areas with fMRI: a multicenter normative study," Neuroimage, vol. 54, pp. 1465–1475, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [65].Castelli F, Happe F, Frith U, and Frith C, "Movement and mind: A functional imaging study of perception and interpretation of complex intentional movement patterns," Neuroimage, vol. 12, pp. 314–325, September 2000. [DOI] [PubMed] [Google Scholar]
- [66].Castelli F, Frith C, Happe F, and Frith U, "Autism, Asperger syndrome and brain mechanisms for the attribution of mental states to animated shapes," Brain, vol. 125, pp. 1839–1849, August 2002. [DOI] [PubMed] [Google Scholar]
- [67].Wheatley T, Milleville SC, and Martin A, "Understanding animate agents - Distinct roles for the social network and mirror system," Psychological Science, vol. 18, pp. 469–474, June 2007. [DOI] [PubMed] [Google Scholar]
- [68].White SJ, Coniston D, Rogers R, and Frith U, "Developing the Frith-Happe Animations: A Quick and Objective Test of Theory of Mind for Adults with Autism," Autism Research, vol. 4, pp. 149–154, April 2011. [DOI] [PubMed] [Google Scholar]
- [69].Smith R, Keramatian K, and Christoff K, "Localizing the rostrolateral prefrontal cortex at the individual level," Neuroimage, vol. 36, pp. 1387–1396, July 15 2007. [DOI] [PubMed] [Google Scholar]
- [70].Hariri AR, Tessitore A, Mattay VS, Fera F, and Weinberger DR, "The amygdala response to emotional stimuli: A comparison of faces and scenes," Neuroimage, vol. 17, pp. 317–323, September 2002. [DOI] [PubMed] [Google Scholar]
- [71].Manuck SB, Brown SM, Forbes EE, and Hariri AR, "Temporal stability of individual differences in amygdala reactivity," American Journal Of Psychiatry, vol. 164, pp. 1613–1614, October 2007. [DOI] [PubMed] [Google Scholar]
- [72].Barch DM, Burgess GC, Harms MP, Petersen SE, Schlaggar BL, Corbetta M, et al. , "Function in the human connectome: task-fMRI and individual differences in behavior," Neuroimage, vol. 80, pp. 169–189, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [73].Van Essen DC, Smith SM, Barch DM, Behrens TE, Yacoub E, and Ugurbil K, "The WU-Minn Human Connectome Project: an overview," Neuroimage, vol. 80, pp. 62–79, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [74].Glasser MF, Sotiropoulos SN, Wilson JA, Coalson TS, Fischl B, Andersson JL, et al. , "The Minimal Preprocessing Pipelines for the Human Connectome Project," Neuroimage, vol. 80, pp. 105–124, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [75].Hermans M and Schrauwen B, "Training and analysing deep recurrent neural networks," in Advances in neural information processing systems, 2013, pp. 190–198. [Google Scholar]
- [76].Hochreiter S and Schmidhuber J, "Long short-term memory," Neural computation, vol. 9, pp. 1735–1780, 1997. [DOI] [PubMed] [Google Scholar]
- [77].Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, and Salakhutdinov RR, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012. [Google Scholar]
- [78].Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, et al. , "TensorFlow: A System for Large-Scale Machine Learning," in OSDI, 2016, pp. 265–283. [Google Scholar]
- [79].Glorot X and Bengio Y, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the thirteenth international conference on artificial intelligence and statistics, 2010, pp. 249–256. [Google Scholar]
- [80].Kingma DP and Ba JL, "Adam: Amethod for stochastic optimization," in Proc. 3rd Int. Conf. Learn. Representations, 2014. [Google Scholar]