Abstract
A variety of exciting scientific achievements have been made in the last few decades in brain encoding and decoding via functional magnetic resonance imaging (fMRI). This trend continues to rise in recent years, as evidenced by the increasing number of published papers in this topic and several published survey papers addressing different aspects of research issues. Essentially, these survey articles were mainly from cognitive neuroscience and neuroimaging perspectives, although computational challenges were briefly discussed. To complement existing survey articles, this paper focuses on the survey of the variety of image analysis methodologies, such as neuroimage registration, fMRI signal analysis, ROI (regions of interest) selection, machine learning algorithms, reproducibility analysis, structural and functional connectivity, and natural image analysis, which were employed in previous brain encoding/decoding research works. This paper also provides discussions of potential limitations of those image analysis methodologies and possible future improvements. It is hoped that extensive discussions of image analysis issues could contribute to the advancements of the increasingly important brain encoding/decoding field.
Keywords: fMRI, encoding, decoding, vision, visual stimulus
Introduction
Encoding and decoding are two critical and complementary perspectives to understand the fundamental mechanisms of brain functions via neural codes (Dayan et al., 2001; Gerstner et al., 1997; Haynes and Rees, 2006; Trappenberg, 2010). Encoding models aim at understanding how brain activity varies according to the concurrent variation in external stimuli, e.g., natural visual stimulus discussed in this paper, and how well the brain activity can be predicted from the quantitatively modeled external stimuli. In contrast, decoding models aim at studying how much of the external stimuli can be learned by observing the brain activity (Haxby et al., 2001; Kay et al., 2008; Miyawaki et al., 2008; Naselaris et al., 2011). Essentially, encoding uses external stimuli to predict brain activity, while decoding uses brain activity to predict information about external stimuli. Thus, encoding and decoding are complementary rather than distinct operations (Naselaris et al., 2011), and they are largely overlapped in terms of computational methods including stimuli representation, functional brain activity measurement, and pattern analysis methodologies for exploring the relationship between stimuli and brain activities.
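The complementary relationship between the two directions can be made concrete with a small sketch. It assumes a linear mapping fitted by ridge regression on synthetic data; the helper names (`ridge_fit`, `mean_column_correlation`), the regularization strength, and the data sizes are illustrative assumptions and are not taken from any of the cited studies.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha*I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def mean_column_correlation(a, b):
    """Average Pearson correlation between matching columns of a and b."""
    a = (a - a.mean(0)) / a.std(0)
    b = (b - b.mean(0)) / b.std(0)
    return float((a * b).mean())

rng = np.random.default_rng(0)
n_trials, n_features, n_voxels = 200, 5, 30

# Synthetic data (hypothetical): stimulus features and noisy voxel responses.
stimuli = rng.standard_normal((n_trials, n_features))
true_weights = rng.standard_normal((n_features, n_voxels))
responses = stimuli @ true_weights + 0.1 * rng.standard_normal((n_trials, n_voxels))

train, test = slice(0, 150), slice(150, None)

# Encoding: predict voxel responses from stimulus features.
W_enc = ridge_fit(stimuli[train], responses[train])
pred_responses = stimuli[test] @ W_enc

# Decoding: predict stimulus features from voxel responses.
W_dec = ridge_fit(responses[train], stimuli[train])
pred_stimuli = responses[test] @ W_dec

encoding_score = mean_column_correlation(pred_responses, responses[test])
decoding_score = mean_column_correlation(pred_stimuli, stimuli[test])
print(f"encoding r = {encoding_score:.3f}, decoding r = {decoding_score:.3f}")
```

Both directions share the same stimulus representation, response measurements, and fitting routine; only the roles of predictor and target are swapped, which is the sense in which the computational methods overlap.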
Due to the fast-growing interest in and the significance of brain encoding/decoding research, several survey papers addressing different aspects of fMRI-based works have already been published (Hasson et al., 2010; Haynes and Rees, 2006; Kay and Gallant, 2009; Naselaris et al., 2011). For instance, Haynes and Rees (2006) discussed the general research problem of ‘brain reading’, which has been studied in the domain of visual perception and other types of mental state including covert attitudes and lie detection. The authors also covered technical challenges and important ethical issues concerning the privacy of personal thought in brain decoding. Kay and Gallant (2009) summarized several advancements of brain decoders of visual stimuli via fMRI including those in (Kay et al., 2008; Mitchell et al., 2008; Miyawaki et al., 2008; Thirion et al., 2007), and provided perspectives on the future research direction and potential application of brain decoding. Hasson et al. (2010) reviewed existing studies that examined the reliability of cortical activity within or between human subjects in response to natural visual stimulation (e.g., free viewing of movies), particularly, on the inter-subject and intra-subject correlations of fMRI responses to the same set of visual stimuli. Naselaris et al. (2011) offered a comprehensive survey of recent experimental methodology advancements in voxel-based decoding models of visual stimuli. The authors laid out a systematic modeling framework that includes estimating an encoding model for every voxel in an fMRI scan and using the estimated encoding models to perform decoding. Sugase-Miyamoto et al. (2011) focused on reviewing the role of temporal stages of encoded facial information in the visual system including the areas of V1, V2, V4, and the inferior temporal (IT) cortex.
In general, most previous survey articles on brain encoding/decoding of visual stimuli (Hasson et al., 2010; Haynes and Rees, 2006; Kay and Gallant, 2009; Naselaris et al., 2011; Sugase-Miyamoto et al., 2011) were written mainly from cognitive neuroscience and neuroimaging perspectives. Although computational challenges were briefly discussed in those articles, the impact of image analysis methodologies, such as neuroimage registration, fMRI signal analysis, brain ROI (region of interest) selection, machine learning algorithms, reproducibility analysis, functional connectivity, and natural image analysis, on brain encoding and decoding studies needs more extensive discussion. Therefore, this paper focuses on the image analysis methodologies that were employed in previous brain encoding/decoding research to model both the external stimuli and the brain responses, on the potential limitations of those methodologies, and on possible future improvements. In particular, this paper concentrates on the brain encoding/decoding of visual stimuli in the human visual system, which is used as a test-bed to discuss the image analysis methodologies.
The rest of this paper is organized as follows. We will first survey the major neuroimage analysis methodologies used in existing brain encoding/decoding applications. The major challenges of neuroimage analysis methods in brain encoding/decoding research and possible solutions are discussed in the following sections. Finally, we conclude this paper and provide perspectives of future applications of brain encoding and decoding.
Image analysis methodologies
Brain encoding and decoding research via fMRI involves a variety of neuroimage analysis techniques, in that most of these applications aim to infer meaningful information about brain responses from fMRI image data and correlate it with external visual stimuli. For instance, intra-subject neuroimage registration methods are typically used to align fMRI images with structural images, and inter-subject neuroimage registration methods are commonly used to warp different brains into the same template space for group-wise integration and comparison. To measure the functional brain responses to external visual stimuli, fMRI signal processing and analysis algorithms are widely used for information extraction. To construct brain decoders, machine learning algorithms are commonly used to correlate those measurements of brain responses with visual stimuli. Another prominent issue in brain encoding/decoding applications via fMRI is how to select the most relevant voxels or ROIs from fMRI volumes to construct and learn encoding/decoding models. Furthermore, how to establish the correspondences between those selected voxels/ROIs across individuals has been a long-standing open problem in the human brain mapping and neuroimage analysis fields in general. The following sub-sections survey these major neuroimage analysis issues and state-of-the-art methodologies.
Structural substrates for brain response modeling
Extraction of the most relevant fMRI signals from brain scans is the first step to infer meaningful information and to construct brain decoding models. That is, researchers need to determine the structural substrates of functional responses first, based on which fMRI signals can be extracted. Typically, two general methodologies are used in the literature. The first is to determine ROIs or voxels based on current neuroscience domain knowledge. For instance, neuroscientists can manually draw ROIs in the V1, V2 and V3 areas of the visual cortex (Haxby et al., 2001). The second category of methods is data-driven, and determines the location and size of ROIs from the fMRI data itself. For instance, activation detection results can be used to determine the brain areas relevant for functional response modeling (Haxby et al., 2001; Walther et al., 2009). In addition, most previous brain encoding/decoding studies can be classified as either voxel-based or ROI-based, according to how the fMRI signals were extracted from the volumetric fMRI images. Notably, this classification scheme is borrowed from other fMRI data analysis applications such as fMRI activation detection, functional connectivity modeling, and brain network modeling.
Voxel-based methods for brain decoding have been widely used in the literature due to their simplicity and effectiveness. In many voxel-based encoding models (Dumoulin and Wandell, 2008; Mitchell et al., 2008; Naselaris et al., 2011; Thirion et al., 2007), the authors aimed to predict the functional activity evoked by different stimuli in single voxels. Thus, those encoding models can provide a quantitative description of how external stimulus information is represented in the functional activity of individual voxels. For instance, several thousand voxels located in the V1, V2, and V3 areas of the visual cortex were used for learning the predictive receptive-field models (Kay et al., 2008). However, the difficulties of voxel-based methods include the lack of correspondences among voxels in different brains. The broad medical image registration field is working on establishing correspondences of voxels in different brains (Avants et al., 2008; Fischl et al., 2002; Liu et al., 2003; Shen and Davatzikos, 2002; Thompson and Toga, 1996; Zhang and Cootes, 2011). However, this is still an open and challenging problem (Liu, 2011). In the fMRI analysis field, researchers have to rely on spatial smoothing (Friston et al., 1996; Li et al., 2012c; Mikl et al., 2008; Tahmasebi, 2010; Tahmasebi et al., 2009; Yue et al., 2010) to deal with the misalignment caused by image registration errors. Essentially, if voxel-based encoding/decoding models do not possess correspondences across different brains, their reproducibility and generalizability to other subjects and populations are limited, and thus the validation of those models will be difficult.
Due to the intrinsic variability of brain anatomy and function (Liu, 2011), the variability of voxel-based models across different brains could be remarkable. For instance, in the brain encoding/decoding models in Kay et al., 2008, the receptive-field models derived from the V1, V2 and V3 cortical regions for two individual participants were quite different (Miyawaki et al., 2008). As mentioned before, the establishment of correspondences between these voxel-based models is challenging, given the lack of a common human brain architecture representation. As an alternative approach, in several previous studies (Mitchell et al., 2004; Mitchell et al., 2008), researchers had to rely on image registration algorithms to align different brains into the same template space, and then the fMRI signals or responses were measured by assuming that different brains had correspondences in the same template space. Notably, Haxby et al. (2001) mentioned that they showed that the topographic arrangement of the fMRI-derived pattern of response was consistent within subjects, but they were not able to perform similar analysis across subjects due to the lack of effective and accurate image warping algorithms that can register individual brains to a common atlas space.
Instead of extracting fMRI signals from single voxels, researchers have also tried extracting fMRI BOLD signals from ROIs, either manually or automatically determined. In an early effort, Cox and Savoy (2003) used data from voxels in predefined ROIs during a subset of trials for each subject individually and employed multivariate statistical pattern recognition methods to classify patterns of fMRI activation evoked by the visual presentation of various categories of objects such as baskets, birds, butterflies, chairs, cows, tropical fish, garden gnomes, horses, African masks, and teapots. The authors used two data-driven methods to identify ROIs, including a procedure that identified voxels that vary significantly across at least one of the categories of stimuli (Cox and Savoy, 2003; Hu et al., 2012; Ji et al., 2011). However, the authors noted that “the boundaries and exact functional roles of these areas are not well understood”. The authors in (Haxby et al., 2001) and (Walther et al., 2009) used fMRI activation detection methods to derive a small set of brain regions that were most responsive to external stimuli as ROIs, based on which classification algorithms were then employed to differentiate various patterns.
Though task-based fMRI is considered a benchmark approach to inferring functional ROIs, it also has limitations. First, large-scale brain networks, such as the visual, emotion, attention, working memory, language, and semantic networks, are typically involved in the brain’s responses to natural visual stimuli. Determining all of these functional networks with task-based fMRI is typically time-consuming and cost-prohibitive. Second, accurate activation detection from fMRI data is still a challenging and open problem due to a variety of technical difficulties. For instance, the spatial smoothing step, which is widely applied in individual or group-wise activation detection (Friston et al., 1996; Mikl et al., 2008; Tahmasebi et al., 2009; Tahmasebi, 2010; Yue et al., 2010; Li et al., 2012), could result in several adverse effects including border blurring, weakening of small-region activation (Tahmasebi et al., 2009) and shifting of activation centers (Li et al., 2012). Third, the variability of fMRI activation detection results for the same external stimulus across individuals and populations could be remarkable (Thirion et al., 2007). Thus, meta-analysis has been commonly used as a remedy to enhance the statistical power and reliability of individual fMRI studies (Derrfuss and Mar, 2009; Laird et al., 2009).
Voxel-based and ROI-based brain decoding models have been successfully applied in the various scenarios mentioned above. However, their reproducibility, generalizability and reliability have been limited due to the lack of a common and individualized representation of human brain architecture (Liu, 2011; Zhu et al., 2012a). Because of the remarkable structural and functional variation across individual brains, neuroimage registration algorithms are still insufficient to accurately establish correspondences across different brains (Liu, 2011). One possible solution is to discover and represent common structural and functional brain architectures by a dense set of reproducible and consistent brain landmarks that can be accurately and reliably localized in each individual brain. For instance, a promising recent development along this direction, called Dense Individualized and Common Connectivity-based Cortical Landmarks (DICCCOL), was reported in (Zhu et al., 2012b). The DICCCOL system was developed by a data-driven search strategy that discovered 358 consistent and corresponding functional ROIs, in which each identified functional ROI was optimized to possess maximal group-wise consistency of DTI-derived fiber shape patterns (Zhu et al., 2012a; Zhu et al., 2012b). The neuroscience foundation is that each cytoarchitectonic area of the brain possesses a unique set of extrinsic inputs/outputs, called the “connectional fingerprint” (Passingham et al., 2002), which principally determines the functions that each brain area might perform. Notably, Zhu et al. (2012b) examined the potential functional roles of those 358 DICCCOLs via six different datasets of multimodal task-based fMRI/DTI and resting state fMRI/DTI images, and demonstrated that DICCCOL ROIs not only possess consistent structural connection patterns, but also exhibit common functional activations (Zhu et al., 2012a). The DICCCOL system can potentially be used for brain encoding/decoding applications, in that it offers intrinsically established structural and functional correspondences across individuals and provides reliable structural substrates for fMRI signal extraction (Jiang et al., 2012).
Brain response modeling
fMRI has already revolutionized how researchers study human brain function (Friston et al., 1994; Heeger and Ress, 2002; Logothetis, 2008; Matthews and Jezzard, 2004). In the fMRI field, BOLD signals (Ogawa et al., 1990) have been widely used to measure the brain’s functional responses to external stimuli, and thus have naturally been widely used in brain encoding and decoding applications. For instance, in a pioneering effort in (Haxby et al., 2001), the authors used the weights for different regressors as estimates of the strengths of BOLD responses relative to rest. Then, volumes of interest (VOIs) were drawn on the structural MRI images to identify ventral temporal, lateral temporal, and ventrolateral occipital cortex. Afterwards, voxels within these delineated VOIs that were significantly object-selective were used for the subsequent analysis of within-category and between-category correlations. In this way, the correspondences of fMRI BOLD signals within VOIs across different brains were established via manual definitions on MRI images. Kay et al. (2008) used the basis-restricted separable (BRS) model to pre-process the fMRI BOLD time-series data for each voxel, and a set of basis functions was used to describe the shape of the response time course. Finally, the estimated model parameters were used to characterize the amplitude of the brain’s responses to visual stimuli. Similar approaches for quantification of brain responses have been used in other papers by the same group (Naselaris et al., 2011). Miyawaki et al. (2008) first normalized the fMRI BOLD amplitude relative to the mean amplitude of the first 20 s rest period in each run, in order to minimize the baseline difference across individual runs. Then, the fMRI BOLD signals of each voxel were averaged within each stimulus interval after shifting by 4 s to compensate for the delay of the hemodynamic response.
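The baseline normalization and delay-compensated block averaging described for Miyawaki et al. (2008) can be illustrated with a small sketch. It is not the authors' code: the function name `block_average_responses`, the percent-signal-change form of the normalization, the 2 s repetition time, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def block_average_responses(bold, tr, rest_duration, onsets, block_duration, delay=4.0):
    """Turn one run of BOLD data into one response sample per stimulus block.

    bold           : array (n_timepoints, n_voxels) for a single run
    tr             : repetition time in seconds (assumed value in the example)
    rest_duration  : length of the initial rest period in seconds
    onsets         : stimulus block onset times in seconds
    block_duration : length of each stimulus block in seconds
    delay          : shift in seconds compensating for the hemodynamic delay
    """
    # Normalize relative to the mean amplitude of the initial rest period
    # (expressed here as percent signal change, which is an assumption).
    n_rest = int(round(rest_duration / tr))
    baseline = bold[:n_rest].mean(axis=0)
    percent_change = 100.0 * (bold - baseline) / baseline

    # Average within each stimulus interval after shifting by the delay.
    samples = []
    for onset in onsets:
        start = int(round((onset + delay) / tr))
        stop = int(round((onset + delay + block_duration) / tr))
        samples.append(percent_change[start:stop].mean(axis=0))
    return np.vstack(samples)

# Toy run: 30 volumes, 2 voxels, baseline of 100 arbitrary units.
toy_bold = np.full((30, 2), 100.0)
toy_bold[12:18] = 102.0  # delayed response to the block starting at 20 s
toy_bold[18:24] = 101.0  # delayed response to the block starting at 32 s

block_samples = block_average_responses(
    toy_bold, tr=2.0, rest_duration=20.0, onsets=[20.0, 32.0], block_duration=12.0
)
print(block_samples)
```

Each row of `block_samples` is one stimulus block and each column one voxel, which is the form a decoder would take as input.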
However, it has been widely recognized that the fMRI BOLD signal can be subject to physiological motion effects and a variety of non-neuronal noise sources (Heeger and Ress, 2002; Logothetis, 2008). As a consequence, using the raw amplitudes of fMRI BOLD responses (even after normalization) for quantitative modeling of functional brain responses could be risky (Deng et al., 2012). Alternatively, a variety of other brain encoding/decoding studies have used brain activation patterns to describe the brain’s functional responses to external visual stimuli. In particular, the GLM (general linear model) (Friston et al., 1995) has been widely used to fit prior models to individual voxels’ fMRI BOLD time series, and the estimated parameters are used to describe the functional activities of voxels. In general, the GLMs themselves can be viewed as encoding models of brain responses (Naselaris et al., 2011). Thus, GLM-based measurements of brain responses have been widely adopted in brain decoding applications. For instance, Davatzikos et al. (2005) used fMRI activation patterns for lie classification and reported that 99% of the true and false responses were discriminated correctly. Mitchell et al. (2008) used fMRI activation patterns to predict brain activity associated with the meanings of nouns, based on the premise that different spatial patterns of brain activation are associated with thinking about different semantic categories of pictures and words (e.g., tools, buildings, and animals). The models in (Mitchell et al., 2008) were trained with a combination of data from a trillion-word text corpus and task-based fMRI data associated with viewing several dozen concrete nouns. Then, the learned models were used to predict fMRI activations for thousands of other concrete nouns in the text corpus, with promising results. In other application scenarios, Walther et al. (2009) used fMRI activation patterns to study which regions of the brain can separate natural scene categories (such as forests vs. mountains vs. beaches). It was reported that the visual area V1, the parahippocampal place area (PPA), retrosplenial cortex (RSC), and lateral occipital complex (LOC) contributed to distinguishing among six natural scene categories. Furthermore, a variety of relevant brain decoding works have shown that the full spatial patterns of brain activity, measured simultaneously at many locations (LaConte et al., 2005; LaConte et al., 2006; Mourão-Miranda et al., 2005), can significantly improve brain encoding/decoding models. Those fMRI activation pattern-based or multivariate analyses have been shown to be superior to univariate approaches that analyze only one location. For instance, LaConte et al. (2005) used a feature vector composed of the voxels’ fMRI data and applied the SVM to classify the brain states into task-performance and control periods.
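As a minimal sketch of how GLM parameter estimates can serve as activation measurements, the example below fits a task regressor and a constant term to each voxel’s time series by ordinary least squares. The gamma-shaped response function in `simple_hrf`, the block timing, the repetition time, and the synthetic data are simplifying assumptions chosen for illustration; they do not reproduce the models used in the cited studies.

```python
import numpy as np

def simple_hrf(tr, duration=30.0):
    """Gamma-shaped response function peaking near 5 s (a simplified assumption)."""
    t = np.arange(0.0, duration, tr)
    hrf = t ** 5 * np.exp(-t)
    return hrf / hrf.sum()

def fit_glm(bold, design):
    """Ordinary least squares fit; returns betas of shape (n_regressors, n_voxels)."""
    betas, *_ = np.linalg.lstsq(design, bold, rcond=None)
    return betas

tr, n_scans = 2.0, 120

# Block design: 10 scans of stimulation starting at scans 10, 50 and 90.
boxcar = np.zeros(n_scans)
for onset in range(10, n_scans, 40):
    boxcar[onset:onset + 10] = 1.0

# Design matrix: HRF-convolved task regressor plus a constant baseline term.
task_regressor = np.convolve(boxcar, simple_hrf(tr))[:n_scans]
design = np.column_stack([task_regressor, np.ones(n_scans)])

# Synthetic voxels: voxel 0 responds to the task, voxel 1 does not.
rng = np.random.default_rng(1)
true_betas = np.array([[2.0, 0.0],
                       [100.0, 100.0]])
bold = design @ true_betas + 0.2 * rng.standard_normal((n_scans, 2))

betas = fit_glm(bold, design)
print("estimated task betas:", betas[0])
```

The first row of `betas` plays the role of the activation estimate for each voxel; stacking such estimates across voxels gives the spatial activation pattern that multivariate decoders operate on.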
Growing evidence has suggested that multivariate patterns of fMRI activations can be more sensitive in encoding brain responses (Haynes and Rees, 2006; Norman et al., 2006; Schrouff and Phillips, 2012). It was argued in (Norman et al., 2006) that multivariate activation pattern analyses have better sensitivity than univariate methods. One of the underlying rationales is that large-scale brain regions and networks are typically involved in the perception and cognition of visual stimuli. Therefore, it is natural to hypothesize that functional connectivity within relevant brain networks could substantially contribute to brain encoding and decoding applications. In the computational neuroscience field (Trappenberg, 2010), it has been commonly recognized that the functional connectivities of individual neurons can be responsive to external stimuli. That is, functional connectivity, in addition to other individual neurons’ measurement like firing rates, can be used to represent the neurons’ responses to external stimulus as neural codes.
Interestingly, increasing experimental and computational evidence has shown that the connectivity between or among neurons or neuronal regions can be used to represent neural responses. For instance, it was shown in (Christopher deCharms and Merzenich, 1996) that the firing rates of two neurons are not strongly correlated with the external stimulus, but the correlation between the two neurons’ firing rates is much more closely related to the stimulus curve. In recent reviews (Engel et al., 2001; Singer, 1999), it was reported that neuronal synchrony or interaction could be a versatile neuronal code for the definition of neuron/stimulus relations. It was even hypothesized that neuronal communication between two neuronal groups mechanistically depends on the coherence between them, and that the absence of neuronal coherence prevents communication (Fries, 2005). These earlier theoretical works laid out the cellular foundation for connectivity-based measurements of brain response via fMRI. In recent years, several works have employed connectivity-based measurements to quantify the brain’s responses (Hu et al., 2010; Ji et al., 2011; Jiang et al., 2012; Pantazatos et al., 2012; Richiardi et al., 2011) via fMRI data. Richiardi et al. (2011) used multi-band functional connectivity graphs to decode brain states during rest and under natural movie stimuli. In (Hu et al., 2010; Hu et al., 2012; Ji et al., 2011), the authors used the functional connectivity matrices within four relevant brain networks as measurements of the brain’s responses to decode the categories of external visual stimuli. It was shown that the brain decoders can achieve relatively good classification performance on testing datasets that are separate from the training datasets with fMRI scans. Pantazatos et al. (2012) hypothesized that patterns of large-scale functional connectivity can decode the emotional expression of face stimuli within single individuals using training data from separate subjects. The authors used a brain atlas to define 270 nodes and constructed functional connectivity networks based on fMRI signals. The authors successfully used a linear-kernel SVM pattern classifier to differentiate between implicit fearful and neutral faces.
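The connectivity-based measurements described above share a common first step: turning ROI time series into a connectivity matrix and then into a feature vector for a classifier. A minimal sketch follows, assuming Pearson correlation as the connectivity measure and using the hypothetical helper name `connectivity_features`; the four synthetic ROIs are illustrative and do not correspond to any network or atlas used in the cited works.

```python
import numpy as np

def connectivity_features(roi_timeseries):
    """Vectorize the functional connectivity of one scan.

    roi_timeseries : array (n_timepoints, n_rois)
    Returns the upper triangle (diagonal excluded) of the Pearson
    correlation matrix, i.e. n_rois * (n_rois - 1) / 2 features.
    """
    corr = np.corrcoef(roi_timeseries, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)
    return corr[upper]

rng = np.random.default_rng(2)
n_timepoints = 200

# Synthetic ROIs: ROI 0 and ROI 1 share a common signal, ROI 2 and 3 are independent.
shared = rng.standard_normal(n_timepoints)
roi_ts = np.column_stack([
    shared + 0.3 * rng.standard_normal(n_timepoints),
    shared + 0.3 * rng.standard_normal(n_timepoints),
    rng.standard_normal(n_timepoints),
    rng.standard_normal(n_timepoints),
])

features = connectivity_features(roi_ts)
print(features.round(2))  # first entry is the ROI 0 / ROI 1 connection
```

With 270 nodes, as in the atlas-based setting mentioned above, the same function would yield 270 × 269 / 2 = 36,315 features per scan, which is why such studies typically pair connectivity features with a regularized or linear classifier.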
Notably, the results reported in several recent studies (Hu et al., 2010; Ji et al., 2011; Jiang et al., 2012; Pantazatos et al., 2012; Richiardi et al., 2011) suggest that functional connectivity or connectomes offer a new, alternative school of methodologies for quantitative measurement of functional brain responses that can potentially be used for brain decoders. The fundamental neuroscience basis for this school of methods is that brain function is realized via large-scale structural and functional networks (Dayan et al., 2001; Gerstner et al., 1997; Haynes and Rees, 2006; Trappenberg, 2010). The emerging field of connectomes (Hagmann et al., 2010; Kennedy, 2010; Li et al., 2012a; Van Dijk et al., 2010; Williams, 2010; Zhu et al., 2012b) and the newly available connectomics methods and tools should provide promising opportunities for the advancement of brain encoding and decoding applications in the near future.
In addition to the fMRI BOLD signal-based, activation-based, and connectivity-based methods mentioned above, a fourth category of methods for measuring the brain’s functional responses in brain encoding/decoding applications quantifies the inter-subject or intra-subject correlation. For instance, Hasson et al. (2004) analyzed the data by comparing the evoked fMRI response time courses across different subjects (called inter-subject correlation). In that work, the inter-subject correlation curves of BOLD signals in response to the same visual stimulus were compared. Golland et al. (2007) compared the intra-subject correlation of the fMRI response time courses evoked within the same subject by repeated presentations of the same visual stimulus. Though this category of methodology has been less frequently used in the literature, it has its own advantages, particularly in measuring the temporal responses to external stimuli. It is envisioned that, in the future, it might be desirable to use a combination of the four categories of methodologies described in this section for specific brain encoding/decoding applications by leveraging their strengths and avoiding their weaknesses.
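A simple way to compute an inter-subject correlation is sketched below: for each voxel, the response time courses of every pair of subjects are correlated and the pairwise values are averaged. The pairwise-averaging variant, the function name `intersubject_correlation`, and the synthetic data are assumptions for illustration; the cited studies may use different variants (for example, correlating each subject with the average of the others).

```python
import numpy as np
from itertools import combinations

def intersubject_correlation(data):
    """Mean pairwise Pearson correlation across subjects, per voxel.

    data : array (n_subjects, n_timepoints, n_voxels), with all subjects
           assumed to be aligned in time and in a common voxel space.
    """
    # z-score each subject's time course per voxel.
    z = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
    # The mean over time of the product of z-scores is the Pearson correlation.
    pair_corrs = [(z[i] * z[j]).mean(axis=0)
                  for i, j in combinations(range(len(z)), 2)]
    return np.mean(pair_corrs, axis=0)

rng = np.random.default_rng(3)
n_subjects, n_timepoints = 3, 300

# Voxel 0 follows a stimulus-driven signal shared by all subjects;
# voxel 1 contains only subject-specific noise.
stimulus_driven = rng.standard_normal(n_timepoints)
data = rng.standard_normal((n_subjects, n_timepoints, 2))
data[:, :, 0] = stimulus_driven + 0.5 * data[:, :, 0]

isc = intersubject_correlation(data)
print(isc.round(2))
```

The same function applied to repeated presentations within one subject (stacked along the first axis) gives the corresponding intra-subject correlation.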
External stimuli modeling
One goal of brain encoding and decoding applications is to derive a quantitative mapping between the external visual stimulus and the brain response. A variety of previous fMRI studies have shown that brain decoders can be used to reconstruct visual features, such as orientation and motion direction (Kamitani and Tong, 2005, 2006), visual object categories (Cox and Savoy, 2003; Haxby et al., 2001), semantic objects (Naselaris et al., 2011; O'Craven and Kanwisher, 2000), and video objects (Nishimoto et al., 2011), by learning the quantitative mapping between fMRI-derived brain activity patterns and visual stimulus based on training datasets. Since the previous sections have already discussed neuroimage analysis methods for quantitative measurements of the brain’s responses, this section will be devoted to visual stimuli and their quantitative measurements.
The external visual stimuli used in brain encoding/decoding applications range from very simple shapes to very complex video streams. In an early effort, Belliveau et al. (1991) used photic (luminance) stimulation as visual stimuli and demonstrated the brain regions involved in light perception. After the work in (Belliveau et al., 1991), line arrays in different orientations were used (Kamitani and Tong, 2005) to investigate the brain’s response to images with simple patterns. Similar stimuli were used in (Shibata et al., 2011) to explore the plastic learning ability of the early visual cortex. Subsequently, geometric patterns, letters and digits were used in visual encoding and decoding applications. Miyawaki et al. (2008) used geometric patterns and letter images to develop a decoding model that reconstructs the stimuli from brain activity. Fujiwara et al. (2009) improved the decoding model using the same dataset. Both of those works reconstructed legible images that captured some characteristics of the stimuli. Similar work was carried out by van Gerven et al. (2010), in which handwritten digits such as “6” and “9”, instead of geometric patterns or printed letters, were used. Engel et al. (1997) used color geometric pattern images, but they focused on the effect of color rather than of the geometric patterns, since the authors were interested in the brain responses evoked by colors. This work differed from the research of Brouwer and Heeger (2009), whose interest was in decoding and reconstruction. Simple external stimuli enable researchers to restrict brain responses to the pattern of interest and to eliminate disturbances introduced by other image elements, such as texture and color, which may evoke higher-level brain functions that suppress or enhance responses in the primary visual cortex.
In addition to geometric shapes and colors, researchers have used semantic visual objects as stimuli in brain encoding and decoding applications. Kanwisher et al. (1997) used gray-level images of faces and objects in different pre-processing conditions to study the responses of the fusiform face area (FFA). In other studies, Tsao et al. (2006) and Goffaux et al. (2011) focused on the difference between brain responses to faces and to other kinds of objects. Haxby et al. (2001) used more complicated gray-level images of faces, animals and small artificial objects as stimuli, aiming to model the patterns of brain responses to different kinds of gray-level objects. Other studies (Carlson et al., 2003; Cox and Savoy, 2003) adopted similar stimuli but with more categories. Sterzer et al. (2008) used gray-level images to investigate the effect of visual stimuli in early visual cortex that was suppressed by high-level brain function. Gray-level images are far from what human vision encounters in daily life. However, this type of visual stimulus provides control over content, which prevents brain responses from being disturbed by color rather than driven by object shape. Compared to simple photic or geometric images, gray-level images provide more complex stimuli.
Instead of using gray-level images with only foreground, Kay et al. (2008) and Naselaris et al. (2009) used gray-level images with both foreground and background in their decoding model studies, providing increased naturalness. However, gray-level images are limited when researchers try to study encoding and decoding models related to color information. Hence, color images have been widely used as visual stimuli as well. Eger et al. (2008) used images of synthetic objects in different colors as stimuli. MacEvoy and Epstein (2009) used color images as stimuli but with a more complicated representation form. Peelen et al. (2009) used pictures of humans and cars as stimuli, with and without background, and Walther et al. (2009) used more categories of natural scene pictures as stimuli to build a model for scene classification based on brain activation patterns. Compared to the simple and abstract stimuli used before, color images provide scenarios that are closer to natural viewing and evoke brain activations more naturally.
In addition to static images, researchers have used time-series video streams in visual stimulus encoding and decoding applications. For instance, Beauchamp et al. (2003) used videos of moving bodies and tools, as well as point-light displays of moving bodies and tools, to study the organization of brain responses to different types of complex visual motion via fMRI. Sekiyama et al. (2003) employed videos of a female broadcaster pronouncing three different syllables and asked subjects to distinguish what the broadcaster pronounced. Then, fMRI data were used to examine the brain’s responses and the cross-modal binding in auditory-visual speech perception. Werner and Noppeney (2010) used 15 tools and 15 instruments instead of videos of a broadcaster’s face. Malinen et al. (2007) used silent color natural videos as stimuli. Villarreal et al. (2012) studied the effect of context when gestures were presented to subjects with or without background in videos. The authors also used background-only videos to test whether the effect of context on brain responses was additive. In a series of recent studies, Hu et al. (2012) and Ji et al. (2011) used TV news programs, including sports, weather reports and commercials, as stimuli. There are many other fMRI-based encoding/decoding studies that used video streams as visual stimuli, e.g., those in (Bartels and Zeki, 2004; Hasson et al., 2008; Hasson et al., 2004; Nishimoto et al., 2011; Sabuncu et al., 2010; Whittingstall et al., 2010).
The image/video stimuli used in previous brain encoding and decoding applications can be simply quantified by the image grid intensities or colors (Miyawaki et al., 2008; Naselaris et al., 2011; Nishimoto et al., 2011), or by semantic categories of natural images (Haxby et al., 2001; Mitchell et al., 2008). An alternative approach is to describe natural images/videos by their low-level computer vision (CV) based visual features. Typically, low-level features can be computed directly by computer algorithms without human involvement, and they have been extensively studied in the image analysis and computer vision domains. Widely used low-level features include color, texture, gradient histogram, and bag-of-words, which represent the statistical information of whole images, or local information such as position, shape, and angles. For example, Bartels et al. (2008) used motion energy to model the stimuli during natural viewing of movies. It was computed from the differences in motion direction within the same areas between frames of a video clip. Gabor wavelet filters were used in (Kay et al., 2008; Naselaris et al., 2009) to model the shape features and the distribution patterns of gray-scale images. Hu et al. (2012) employed six types of features, including color histogram, color correlogram, color moments, co-occurrence textures, wavelet texture grid, and edge histogram, for brain decoding applications. These features were extracted from the key frames of each video clip, and the features from all key frames together represented the clip. Ji et al. (2011) used sets of SIFT features (Lowe, 2004) to represent and retrieve video shots. Then, the SIFT sets from all video clips were used to represent the whole stimulus comprising several video clips.
Essentially, there are many other image/video descriptors that have been developed in the image analysis and computer vision communities (Goferman et al., 2012; Li et al., 2005), which can be potentially used for the quantitative description of visual stimuli in the future. CV features are typically objective and can be automatically derived.
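As an illustration of how such a low-level descriptor can be derived automatically, the following minimal sketch computes a joint RGB color histogram per key frame and concatenates the per-frame histograms into one clip-level vector. The function names, the bin count, and the concatenation scheme are assumptions made for this example; they are not the exact features or settings of the cited studies.

```python
import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Normalized joint RGB histogram of an H x W x 3 uint8 image."""
    pixels = image.reshape(-1, 3).astype(np.int64)
    # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
    idx = (pixels * bins_per_channel) // 256
    # Combine the three per-channel indices into a single joint bin index.
    joint = (idx[:, 0] * bins_per_channel + idx[:, 1]) * bins_per_channel + idx[:, 2]
    hist = np.bincount(joint, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()

def clip_descriptor(key_frames, bins_per_channel=4):
    """Concatenate per-key-frame histograms (an assumed aggregation scheme)."""
    return np.concatenate([color_histogram(f, bins_per_channel) for f in key_frames])

# Hypothetical key frames: three random 32 x 32 color images.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(3)]
features = clip_descriptor(frames)

# A purely red frame puts all of its mass into a single bin.
red = np.zeros((8, 8, 3), dtype=np.uint8)
red[..., 0] = 255
red_hist = color_histogram(red)
```

Each frame contributes `bins_per_channel ** 3` values, so the descriptor length grows linearly with the number of key frames.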
External stimuli-brain response mapping
Generally speaking, brain encoders and decoders can be considered as predictive models of brain responses and visual stimuli. The GLM models (Friston et al., 1995) can thus be considered as the earliest brain encoders and decoders, which relate the brain’s hemodynamic responses measured by fMRI BOLD signals to external stimulus curves. The basic assumption of GLM models is that the brain’s functional response follows the stimulus curve, e.g., in block-based or event-related paradigms, after accounting for the hemodynamic response delay. Due to their simplicity and effectiveness, the GLM models and their derived measurements have been widely used in brain decoders (Davatzikos et al., 2005; Mitchell et al., 2008; Naselaris et al., 2011; Walther et al., 2009). GLM can be used to predict brain responses either at the voxel level (Mitchell et al., 2008) or at the ROI level (Beauchamp et al., 2003; Goffaux et al., 2011; Sterzer et al., 2008). The GLM method has also been used in decoding studies by converting an encoding model into a decoding model (Carlson et al., 2003; Eger et al., 2008; Haxby et al., 2001; Reddy et al., 2010; Walther et al., 2009).
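The GLM assumption described above can be sketched in a few lines: the stimulus curve is convolved with a hemodynamic response function (HRF) and the resulting regressor is fitted to each voxel's BOLD time series by ordinary least squares. The gamma-shaped HRF, the repetition time, and the synthetic data below are illustrative assumptions rather than the settings of any cited study.

```python
import numpy as np

def gamma_hrf(tr, duration=30.0):
    """Illustrative gamma-shaped hemodynamic response sampled every `tr` seconds."""
    t = np.arange(0.0, duration, tr)
    h = t ** 5 * np.exp(-t)
    return h / h.sum()

def design_matrix(stimulus, tr):
    """Columns: HRF-convolved stimulus curve and a constant baseline."""
    regressor = np.convolve(stimulus, gamma_hrf(tr))[: len(stimulus)]
    return np.column_stack([regressor, np.ones(len(stimulus))])

def fit_glm(X, bold):
    """Ordinary least-squares estimate of beta in bold = X @ beta + noise."""
    beta, _, _, _ = np.linalg.lstsq(X, bold, rcond=None)
    return beta

# Synthetic block design: 10 volumes on, 10 volumes off, repeated 5 times.
tr = 2.0
stimulus = np.tile(np.r_[np.ones(10), np.zeros(10)], 5)
X = design_matrix(stimulus, tr)

rng = np.random.default_rng(0)
true_beta = np.array([[2.0, 0.0],       # task effect: voxel 0 responds, voxel 1 does not
                      [100.0, 100.0]])  # baseline signal level of both voxels
bold = X @ true_beta + 0.1 * rng.standard_normal((len(stimulus), 2))
beta = fit_glm(X, bold)                 # row 0: task effect, row 1: baseline
```

The first row of `beta` plays the role of the activation estimate per voxel; averaging the time series within an ROI before fitting would give the ROI-level variant.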
Recently, a variety of researchers have applied multivariate pattern classification algorithms to distributed patterns of functional MRI data in order to decode the information represented in the subject’s brain. This type of multivariate pattern analysis methodology has resulted in impressive successes in mind reading (Haynes and Rees, 2006; Norman et al., 2006; Reddy et al., 2010). The major advantage of multivariate pattern analysis methods over voxel-based methods is considered to be their increased sensitivity (Norman et al., 2006). An early application of multivariate pattern classification in visual decoding was reported by Haxby et al. (2001). After that, researchers employed multivariate pattern classification in a variety of applications. For instance, many brain decoding techniques primarily relied on the widely used linear support vector machines (SVMs) (Craddock et al., 2009; LaConte et al., 2005; LaConte et al., 2006; Mourão-Miranda et al., 2005). Shirer et al. (2012) used a classifier to identify four states (resting, remembering the events of the day, subtracting numbers, or silently singing lyrics) with 84% accuracy. Importantly, the classifier achieved 85% accuracy when identifying these states in a second, independent dataset.
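To make the idea of classifying distributed activity patterns concrete, the sketch below assigns each test pattern to the class whose mean training pattern correlates best with it, and evaluates the result with leave-one-run-out cross-validation. This correlation-based classifier is a deliberately simple stand-in and is not the SVM used in the studies cited above; the data, sizes, and noise level are synthetic assumptions.

```python
import numpy as np

def _unit_rows(a):
    """Center each row and scale it to unit length (for Pearson correlation)."""
    a = a - a.mean(axis=1, keepdims=True)
    return a / np.linalg.norm(a, axis=1, keepdims=True)

def correlation_classify(train_X, train_y, test_X):
    """Label each test pattern with the class of the best-correlated class mean."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    corr = _unit_rows(test_X) @ _unit_rows(centroids).T
    return classes[np.argmax(corr, axis=1)]

def leave_one_run_out_accuracy(X, y, runs):
    """Hold out one scanning run at a time and test on it."""
    correct = 0
    for r in np.unique(runs):
        test = runs == r
        pred = correlation_classify(X[~test], y[~test], X[test])
        correct += int(np.sum(pred == y[test]))
    return correct / len(y)

# Synthetic example: 6 runs, 4 stimulus classes, 50 voxels.
rng = np.random.default_rng(0)
n_runs, n_classes, n_voxels = 6, 4, 50
templates = rng.standard_normal((n_classes, n_voxels))
y = np.tile(np.arange(n_classes), n_runs)
runs = np.repeat(np.arange(n_runs), n_classes)
X = templates[y] + 0.5 * rng.standard_normal((len(y), n_voxels))
accuracy = leave_one_run_out_accuracy(X, y, runs)
```

Keeping whole runs out of training avoids leaking within-run noise correlations into the test set, which is why the split is done by run rather than by sample.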
Alternatively, other classification or machine learning algorithms have been employed in brain encoding and decoding applications. Mitchell et al. (2004) explored several classifiers such as Gaussian Naive Bayes (GNB), SVM, and k Nearest Neighbor (kNN) to determine whether a human subject was looking at a picture or a sentence, or was viewing a word describing food, people, buildings, etc. In particular, the authors explored several approaches to feature selection, including selecting the most discriminating voxels, selecting the most active voxels, and selecting the most active voxels per ROI. Interestingly, the authors found that feature selection based on the most active voxels was superior to the others. Richiardi et al. (2011) formulated brain decoding as a tree decision process. Compared to SVMs, the functional trees offered the convenience of adaptively adjusting the model according to feature space complexity (Richiardi et al., 2011). That is, an ad-hoc switch between linear and non-linear decision boundaries can be effected during the training stage, and fewer parameters need to be optimized in the procedure. Finally, Bayesian decoding methods have also been used in several papers (Friston et al., 2008; Fujiwara et al., 2009; Naselaris et al., 2009).
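The combination of activity-based voxel selection with a Gaussian Naive Bayes classifier can be sketched as follows. Here "most active" is interpreted as the largest task-versus-rest t-like statistic, which is an assumption made for this example; the class name `GaussianNaiveBayes`, the data, and all sizes are hypothetical as well.

```python
import numpy as np

def select_most_active(X_task, X_rest, k):
    """Indices of the k voxels with the largest task-vs-rest t-like statistic."""
    diff = X_task.mean(axis=0) - X_rest.mean(axis=0)
    se = np.sqrt(X_task.var(axis=0, ddof=1) / len(X_task)
                 + X_rest.var(axis=0, ddof=1) / len(X_rest))
    return np.argsort(diff / se)[-k:]

class GaussianNaiveBayes:
    """Per-class, per-voxel Gaussian likelihoods with independent voxels."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mu_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.stack([X[y == c].var(axis=0) for c in self.classes_]) + 1e-6
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # Log-likelihood of every sample under every class, summed over voxels.
        sq = (X[:, None, :] - self.mu_[None]) ** 2 / self.var_[None]
        ll = -0.5 * (np.log(2 * np.pi * self.var_)[None] + sq).sum(axis=2)
        return self.classes_[np.argmax(ll + self.log_prior_, axis=1)]

# Synthetic data: 200 voxels, only the first 10 respond to the task.
rng = np.random.default_rng(0)
n_vox, n_info = 200, 10
pattern = np.zeros((2, n_vox))
pattern[0, :n_info] = 2.0   # class 0 drives the informative voxels strongly
pattern[1, :n_info] = 1.0   # class 1 drives them more weakly

def sample(label, n):
    return pattern[label] + rng.standard_normal((n, n_vox))

X_train = np.vstack([sample(0, 40), sample(1, 40)])
y_train = np.repeat([0, 1], 40)
X_rest = rng.standard_normal((40, n_vox))
X_test = np.vstack([sample(0, 20), sample(1, 20)])
y_test = np.repeat([0, 1], 20)

voxels = select_most_active(X_train, X_rest, k=n_info)
model = GaussianNaiveBayes().fit(X_train[:, voxels], y_train)
accuracy = float(np.mean(model.predict(X_test[:, voxels]) == y_test))
```

Note that the selection step uses only the task-versus-rest contrast and never the class labels, so it does not bias the subsequent classification.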
In recent studies (Hu et al., 2010; Hu et al., 2012), the authors proposed a feature projection model based on principal component analysis (PCA) and canonical correlation analysis (CCA) to explore the relationship between the visual feature representation of the external stimuli and the functional connectivity based representation of the brain responses, and to project the visual feature representation onto the canonical latent space shared by the visual features and the brain responses. The results in (Hu et al., 2010; Hu et al., 2012) suggested that the fMRI-derived features contributed to improving video classification. Recently, Ji et al. (2011) designed a mapping from the low-level visual features of the external stimuli to the fMRI-derived brain responses by Gaussian process regression (GPR) (Rasmussen and Williams, 2006). Experimental results showed that the method can significantly improve the performance of the traditional video retrieval method (Ji et al., 2011).
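A PCA step followed by CCA, as in the feature projection idea above, can be sketched with plain linear algebra. The regularization constant, the numbers of components, and the synthetic "visual" and "brain" feature matrices are assumptions for illustration; this is not the implementation of the cited studies.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T

def cca(X, Y, n_components, reg=1e-3):
    """Regularized CCA: returns projections A, B and canonical correlations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = len(Xc)
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(Xc.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Yc.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, :n_components], Wy @ Vt.T[:, :n_components], s[:n_components]

# Synthetic example: both views are noisy mixtures of a shared 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.standard_normal((400, 2))
visual = latent @ rng.standard_normal((2, 12)) + 0.5 * rng.standard_normal((400, 12))
brain = latent @ rng.standard_normal((2, 8)) + 0.5 * rng.standard_normal((400, 8))

visual_pcs = pca_reduce(visual, 5)
brain_pcs = pca_reduce(brain, 4)
A, B, corrs = cca(visual_pcs, brain_pcs, n_components=2)

# Visual features projected into the shared canonical space.
visual_canonical = (visual_pcs - visual_pcs.mean(axis=0)) @ A
brain_canonical = (brain_pcs - brain_pcs.mean(axis=0)) @ B
first_pair_corr = float(np.corrcoef(visual_canonical[:, 0], brain_canonical[:, 0])[0, 1])
```

Once `A` has been learned from stimuli with fMRI data, the same projection can be applied to the visual features of new stimuli for which no brain data exist.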
In the future, many advanced techniques may be introduced from the machine learning domain, e.g., sub-space learning methodologies (Tenenbaum et al., 2000) and multi-task learning approaches (Argyriou et al., 2008; Micchelli et al., 2010; Obozinski et al., 2010). Researchers have also introduced sparse learning methods for voxel selection (Yamashita et al., 2008) and whole-brain classification (Ryali et al., 2010). Sparse learning has been combined with existing methods to overcome their drawbacks in decoding (Lee et al., 2011). In general, there are many possible ways to map between fMRI-derived brain responses and visual features, and advancements in the machine learning field will substantially contribute to brain encoding and decoding studies.
Summarization
In brain encoding and decoding studies, researchers have already achieved remarkable results in investigating brain activities. For example, Kay et al. (2008) modeled visual stimuli by Gabor wavelet filters and indicated that the orientation of lines in the image stimuli was not a critical factor for the brain’s perception. Haxby et al. (2001) discovered different ROI response patterns when subjects were viewing various stimuli, e.g., gray-scale images of human faces, houses, cats, bottles, shoes and scissors. The authors indicated that the fusiform face area (FFA) was more active when face stimuli were presented, whereas the parahippocampal place area (PPA) was more active when object stimuli were presented. Zhang et al. (2012b) found that the saliency map was generated in early visual cortex V1. There are also results focusing on other areas of cortex, such as V3, MTG, RCS, and LOC (Bartels et al., 2008; Beauchamp et al., 2003; Kamitani and Tong, 2005; Sekiyama et al., 2003; Walther et al., 2009). The findings of existing visual encoding and decoding studies have largely broadened our understanding of complex brain functions. However, as we pointed out in this paper, the challenges in image analysis methodologies for visual encoding and decoding studies are still considerable. We briefly summarize existing studies from the perspectives of external stimuli modeling, structural substrates for functional brain responses modeling, functional brain responses modeling, and mapping strategies for linking external stimuli and brain responses, as illustrated in Tables 1 to 4, respectively.
Table 1.
Existing studies categorized by external stimuli modeling.
| Method | Reference | Advantages | Limitations |
|---|---|---|---|
| Qualitative description (conventional task-based paradigm, that is, the stimulus is presented or not) | Cox and Savoy (2003); Shinkareva et al. (2011); Haxby et al. (2001); Peelen et al. (2009) | Powerful in functional brain mapping for specific brain functions. | The neuronal responses to task-based paradigms are weaker than those associated with naturalistic stimuli (Mechler et al., 1998; Yao et al., 2007); difficult to explore the multipurpose properties of cortical areas associated with processing many attributes that have to be processed simultaneously and interactively (Bartels and Zeki, 2004). |
| Semi-quantitative description (labels or rating scores) | Eger et al. (2008); Bartels and Zeki (2004); Walther et al. (2009) | Partly alleviates the problems in conventional task-based paradigms. | The quantification is coarse, subjective or labor intensive. |
| Quantitative description | Kay et al. (2008); Ji et al. (2011); Hu et al. (2012); Nishimoto et al. (2011) | The quantification is derived automatically and is objective. | Induces difficulties in statistical analysis and inferences; means for efficient and precise quantitative modeling are still limited. |
Table 4.
Categories by mapping methods.
| Method | Reference | Advantages | Limitations |
|---|---|---|---|
| Hypothesis test | Cox and Savoy (2003); Redcay et al. (2010); Thirion et al. (2006); Kanwisher et al. (1997) | Activity in one area can be analyzed without disturbance from other regions. | Difficult to analyze combination of activities in different regions. |
| General linear model (GLM) | Villarreal et al. (2012); Goffaux et al. (2011); Peelen et al. (2009); MacEvoy and Epstein (2009); Beauchamp et al. (2003) | Ability to cooperate with different statistical tools; suitable for both encoding and decoding. | Lower accuracy than SVM and Bayesian methods. |
| MVPA | MacEvoy and Epstein (2009); Peelen et al. (2009); Stokes et al. (2009); Bartels and Zeki (2004); Clithero et al. (2011) | Multivariate methods; more sensitive than univariate analysis and thus improved accuracy. | Vulnerable to small sample size. |
| SVM | Eger et al. (2008); Reddy et al. (2010); Sterzer et al. (2008); Cox and Savoy (2003) | High accuracy; robust to small sample size. | Results are difficult to interpret. |
| Bayesian method | Naselaris et al. (2009); Shinkareva et al. (2011); Naselaris et al. (2012) | High accuracy; ability to reconstruct visual stimuli. | Results are difficult to interpret. |
In Table 1, we briefly categorize the methods for external stimuli modeling into three groups: qualitative, semi-quantitative and quantitative. For instance, the “0” or “1” representation of the stimuli in conventional task-paradigm based fMRI studies, that is, whether the stimulus was presented or not, is a typical example of qualitative description. Qualitative description of external stimuli is very powerful in functional brain mapping for specific brain functions. However, it has difficulty in exploring the multipurpose properties of cortical areas associated with processing many attributes that have to be processed simultaneously and interactively (Rasmussen et al., 2012). Furthermore, the neuronal responses to task-based paradigms are weaker than those associated with naturalistic stimuli that the brain has to process in everyday life (Chai et al., 2009; Logothetis et al., 2001). Semi-quantitative modeling of external stimuli, such as semantic labels of image stimuli or rating scores from participating subjects, provides more information about the external stimuli. However, that information is coarse, subjective and sometimes labor-intensive. Quantitative modeling methods provide automatic, precise and objective descriptions of external stimuli, but they also induce difficulties in statistical analysis and inference. Furthermore, efficient and precise quantitative modeling methods are still limited in current studies.
Table 2 focuses on the structural substrate on which the functional brain responses were measured. We group existing works into two categories: voxel based and ROI based methods. In voxel based methods, the brain responses were directly measured for each voxel. Voxel based methods provide a straightforward structural substrate for quantifying brain responses with high spatial resolution. However, their major limitation is the lack of precise inter-subject correspondence due to the limited performance of brain image registration methods. In ROI based methods, the brain responses were measured based on identified brain ROIs. According to the means of ROI identification, we further divide ROI based methods into two groups: manual and automatic ROI identification. In manual ROI identification, the ROIs are defined by experts with prior knowledge. Researchers manually delineate a specific group of ROIs and examine whether and how the activities related to the manually delineated ROIs are correlated with external stimuli. In automatic ROI identification, researchers usually adopt task-based fMRI (localization fMRI) to identify brain ROIs related to specific brain functions based on brain activation detection. The advantages and limitations of those methods are also summarized in Table 2.
Table 2.
Categories by structural substrates for functional brain responses modeling.
| Method | Reference | Advantages | Limitations |
|---|---|---|---|
| Voxel based | Haxby et al. (2001); Thirion et al. (2006); Kay et al. (2008); Miyawaki et al. (2008); Engel et al. (1997); Malinen et al. (2007) | Straightforward; high spatial resolution. | Lacks precise inter-subject correspondence; depends on image registration. |
| ROI based (pre-defined) | O'Craven and Kanwisher (2000) | Establishes coarse inter-subject correspondence. | Requires hypothesis based on neuroscience knowledge; subjective variations; labor-intensive. |
| ROI based (data-driven) | Walther et al. (2009); Ji et al. (2011) | Fully automatic; does not require prior neuroscience hypothesis; precise inter-subject correspondence. | Additional localization fMRI scans. |
Table 3 focuses on how to describe the brain response to external stimuli. The existing works are categorized into three groups: BOLD-signal based, activation based and brain connectivity based. BOLD-signal based models directly use the BOLD fMRI time series as the measurement of brain activity. Activation based methods focus on the functional activity patterns estimated by brain activation detection methods such as the general linear model. Connectivity based methods use functional interactions (such as pairwise functional connectivities among brain networks) to measure the brain’s responses. We summarize the advantages and disadvantages of those methods in Table 3.
Table 3.
Categories by brain response modeling.
| Method | Reference | Advantages | Limitations |
|---|---|---|---|
| BOLD-signal based | Haxby et al. (2001); Kay et al. (2008); Miyawaki et al. (2008) | Straightforward. | Vulnerable to noise; difficult to explore interactions between separated cortical areas. |
| Activation based | Friston et al. (1995); Mitchell et al. (2008); Walther et al. (2009) | Robust to noise. | Difficult to explore interactions between separated cortical areas. |
| Connectivity based | Bartels and Zeki (2005); Ji et al. (2011); Hu et al. (2012); Chai et al. (2009) | Enables the exploration of functional interactions between brain ROIs; robust to noise. | Challenges exist in brain ROIs identification. |
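As a concrete illustration of the connectivity based description in Table 3, the minimal sketch below turns a set of ROI time series into a feature vector of pairwise Pearson correlations. The function name, the number of ROIs, and the synthetic signals are assumptions made for this example.

```python
import numpy as np

def connectivity_features(roi_ts):
    """Map a T x R array of ROI time series to R*(R-1)/2 pairwise correlations."""
    corr = np.corrcoef(roi_ts, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)   # each unordered ROI pair once
    return corr[upper]

# Synthetic example: 5 ROIs, 150 time points; ROI 0 and ROI 1 share a signal.
rng = np.random.default_rng(0)
roi_ts = rng.standard_normal((150, 5))
shared = rng.standard_normal(150)
roi_ts[:, 0] += 2.0 * shared
roi_ts[:, 1] += 2.0 * shared

features = connectivity_features(roi_ts)      # first entry is the (0, 1) pair
```

The resulting vector can be used in place of raw BOLD values or activation estimates as the brain-response representation fed to a classifier or regression model.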
Table 4 focuses on the methods for mapping between external stimuli and brain responses. We list popular methods employed in existing encoding and decoding studies, as well as their advantages and limitations. It is notable that encoding and decoding are complementary (Naselaris et al., 2011) and that some studies employ one method for both encoding and decoding (Goffaux et al., 2011; Sterzer et al., 2008).
Other Challenges
In addition to the mentioned challenges in image analysis, there are several other challenges from neuroscience, neuroimaging, and computational methodology perspectives, which will be discussed in the following section.
Neuroscience challenges
Although some details of the basic representational architecture of early visual cortex are known (e.g., retinotopy, hypercolumns (Hubel and Wiesel, 1968, 1969)), very little is known about how the higher-order visual cortex represents complex real-world visual objects or stimuli and the conjunctions of features that comprise them (Cox and Savoy, 2003). Also, there have been debates on the relative modularity (Downing et al., 2001; Kanwisher et al., 1997) or distributedness (Haxby et al., 2001; Ishai et al., 1999) of activity in human ventral extrastriate visual cortex in visual object representations like human faces. It was stated in (Haynes and Rees, 2006) that “topographic organization of neuronal selectivities is clearly a systematic feature of sensory and motor processing, but the extent to which it might also be associated with higher cognitive processes and different cortical areas remains unknown.” Thus, future advancements of brain encoding and decoding applications heavily depend on the deeper understanding of the human vision systems. For instance, it was mentioned in (Kay and Gallant, 2009) that one possible way to improve reconstruction accuracy in (Miyawaki et al., 2008) was to use information conveyed by voxel responses in other higher visual areas, such as V4. However, given that only a rudimentary understanding of how visual areas beyond V1–V3 represent stimuli has been achieved so far, this is still a challenge.
It has been widely recognized that the number of possible perceptual or cognitive states is infinite (Haynes and Rees, 2006), whereas the number of training categories in brain encoding/decoding applications is typically very limited (Dumoulin and Wandell, 2008; Mitchell et al., 2008; Naselaris et al., 2011). As a consequence, brain encoding/decoding research might need to be restricted to simple cases with a fixed number of alternatives at the beginning, based on which training datasets are available. Also, it is not clear how many mental states simultaneously occur during brain encoding/decoding experiments, and it is not clear whether it is possible to independently model them concurrently. Thus, significant advancements of perceptual and cognitive neuroscience, in particular, deeper understanding of perceptual and cognitive processes, are warranted to move the field of brain encoding and decoding to the next level in the future.
Neuroimaging and human brain mapping challenges
The spatial and temporal resolutions of current neuroimaging techniques are still limited, and far from being capable of fully capturing the rich and complex dynamics of the functioning human brain. Also, the neural basis of the fMRI BOLD signal (despite its wide adoption and application) is not fully understood yet (Arthurs and Boniface, 2002; Logothetis et al., 2001). In the future, novel MRI techniques for increasing spatial resolution and signal-to-noise ratio, such as the use of ultra-high magnetic fields (Yacoub et al., 2008) and parallel imaging, can potentially substantially improve current brain encoding/decoding studies.
Besides spatial resolution and signal-to-noise ratio, another challenge for fMRI applications in brain encoding and decoding is the limited temporal resolution. This limitation does not allow fMRI to record real-time brain state changes. In contrast, electroencephalography (EEG) and magnetoencephalography (MEG) offer higher temporal resolution and can record brain activity on a millisecond scale. However, neither of them provides high spatial resolution, so they cannot precisely locate the regions where the cortex reacts to external stimuli. Thus it is natural for researchers to combine fMRI and EEG to observe brain activity with high resolution in both the spatial and temporal domains. This demand also raises a data fusion issue, namely how researchers can use fMRI data and EEG data together to infer the process of brain state shifting.
Some researchers have employed the combination of fMRI and EEG. Salek-Haddadi et al. (2003) studied the simultaneous acquisition of EEG and fMRI. This method was employed in cortical source imaging in (Liu and He, 2008; Vulliemoz et al., 2009). The combination overcomes the low temporal resolution of fMRI, which is considered a major disadvantage of this neuroimaging technique. Additionally, the combination helps to develop new brain models, such as directed graphs, by incorporating EEG data (Vulliemoz et al., 2010), which differ from the widely used undirected graph or connectivity models.
There are several fundamental problems that have not been solved in human brain mapping yet, including the representation of common structural and functional brain architecture (Zhu et al., 2012), the establishment of correspondences among brain regions or ROIs (Liu, 2011), and the lack of effective and reliable processing of neuroimaging data such as fMRI and DTI data.
The number of subjects scanned in previous brain decoding studies has been relatively small. For instance, the studies in (Kay et al., 2008) and (Miyawaki et al., 2008) used two subjects in the training of their decoders. In the study of (Haxby et al., 2001), six subjects were used in the fMRI data acquisition and analysis. Acquiring fMRI datasets for a large number of subjects is certainly challenging; however, the reproducibility of the learned decoder models remains an important issue to be investigated in the future. Essentially, the critical lack of algorithms and methods that can accurately warp different brain images into the same atlas space has seriously hampered the study of the reproducibility of brain decoders, as evidenced in a variety of prior studies (Haxby et al., 2001; Mitchell et al., 2004; Mitchell et al., 2008).
In the literature, an increasing number of studies have reported the dramatic dynamics of brain functions and interactions (Chang and Glover, 2010; Gao and Lin, 2012; Li et al., 2012b; Majeed et al., 2011; Smith et al., 2012; Spreng et al., 2010; Zhang et al., 2012a), that is, the brain undergoes remarkable functional dynamics both in resting state (Chang and Glover, 2010; Li et al., 2012b; Smith et al., 2012) and during task performance (Li et al., 2012b; Zhang et al., 2012a). Thus, neglecting those dynamics can substantially reduce the reliability and robustness of encoding/decoding models that are learned under the fundamental assumption of temporal stationarity. In the future, quantitative characterization of those time-dependent functional connectivity/connectome dynamics and their representative patterns is necessary to elucidate fundamentally important temporal attributes of functional connections that cannot be seen by traditional static functional activity analysis methods.
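One simple way to see why a stationarity assumption can hide such dynamics is a sliding-window correlation, sketched below on synthetic signals. The sliding-window approach, the window length, and the step size are assumptions chosen for illustration and are not prescribed by the studies cited above.

```python
import numpy as np

def sliding_window_connectivity(x, y, window, step=1):
    """Pearson correlation of x and y inside consecutive time windows."""
    starts = range(0, len(x) - window + 1, step)
    return np.array([np.corrcoef(x[s:s + window], y[s:s + window])[0, 1]
                     for s in starts])

# Two synthetic ROI signals: coupled in the first half, anti-coupled in the second.
rng = np.random.default_rng(0)
shared = rng.standard_normal(200)
x = shared + 0.3 * rng.standard_normal(200)
y = np.concatenate([shared[:100], -shared[100:]]) + 0.3 * rng.standard_normal(200)

static_corr = float(np.corrcoef(x, y)[0, 1])                    # whole-scan estimate
dynamic_corr = sliding_window_connectivity(x, y, window=40, step=10)
```

In this example the whole-scan correlation is close to zero although the two signals are strongly coupled at every moment, with a sign that flips halfway through; only the windowed estimate reveals this.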
Researchers are also concerned with the validity of their encoding models, which raises the issue of checking whether the encoding models are accurate. Kay et al. (2008) manually compared the brain responses computed by their encoding model with the actual responses recorded by fMRI to verify the validity of the encoding model. Other researchers converted encoding models to decoding models and adopted the prediction accuracy as the metric of validity of the encoding models. This method was first introduced by Haxby et al. (2001) and adopted in other works (Carlson et al., 2003; Eger et al., 2008; Naselaris et al., 2011).
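Converting an encoding model into a decoder and scoring it by prediction accuracy can be sketched as an identification test: the encoding model predicts the response pattern for every candidate stimulus, and the candidate whose prediction best matches the observed pattern is selected. The ridge-regularized linear encoding model, the feature and voxel counts, and the synthetic data are assumptions for this illustration.

```python
import numpy as np

def fit_encoding(F, R, ridge=1.0):
    """Ridge-regularized linear map from stimulus features (n x d) to responses (n x v)."""
    d = F.shape[1]
    return np.linalg.solve(F.T @ F + ridge * np.eye(d), F.T @ R)

def _unit_rows(a):
    """Center each row and scale it to unit length (for Pearson correlation)."""
    a = a - a.mean(axis=1, keepdims=True)
    return a / np.linalg.norm(a, axis=1, keepdims=True)

def identify(W, candidate_F, observed_R):
    """For each observed pattern, pick the candidate whose predicted pattern correlates best."""
    predicted = candidate_F @ W
    corr = _unit_rows(observed_R) @ _unit_rows(predicted).T
    return np.argmax(corr, axis=1)

# Synthetic data: 20 stimulus features, 100 voxels, unit-variance response noise.
rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 200, 30, 20, 100
W_true = rng.standard_normal((n_feat, n_vox))
F_train = rng.standard_normal((n_train, n_feat))
F_test = rng.standard_normal((n_test, n_feat))
R_train = F_train @ W_true + rng.standard_normal((n_train, n_vox))
R_test = F_test @ W_true + rng.standard_normal((n_test, n_vox))

W = fit_encoding(F_train, R_train)
picked = identify(W, F_test, R_test)
accuracy = float(np.mean(picked == np.arange(n_test)))   # chance level is 1 / n_test
```

An encoding model that captured nothing about the stimulus-response relationship would identify stimuli at chance level, so the identification accuracy serves as an indirect validity check.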
To overcome the limitations of brain encoding and decoding, several directions are worth considering. For stimulus description, researchers may introduce more CV features into this area. Existing works have employed CV features such as Gabor filter coefficients (Naselaris et al., 2009), SIFT (Ji et al., 2011) and motion energy (Bartels et al., 2008; Nishimoto et al., 2011). For brain response modeling, existing works employed either models without connectivity or pairwise functional connectivity models. One possible improvement is to introduce multivariate functional connectivity models in the future.
Computational methodology challenges
It is important to point out that brain decoding is essentially based on inverse inference (Haynes and Rees, 2006). That is, even if a specific functional brain response pattern was inferred to co-occur with a mental state under a specific visual stimulus context, the mental state and the visual stimulus pattern might not be necessarily or causally connected. For instance, if such a response-stimulus pattern is found under a different context, this relationship might not be indicative of the brain decoding of the visual stimulus. The problem of inverse inference has been pointed out in a number of domains of neuroimaging (Haynes and Rees, 2006; Poldrack, 2006). One way to improve inverse inference suggested in (Poldrack, 2006) is to improve the definition of ROIs, whose significance has already been extensively discussed in this paper.
In previous studies, while pattern recognition approaches such as SVM classifiers or Bayesian algorithms have shown great promise for extracting correlative relationships between external visual stimuli and brain activity patterns, one must remain cautious about the nature of the information that a machine learning algorithm uses to distinguish different classes of stimuli. The relationship that can be extracted by machine learning methods does not necessarily mean that this information is used by the brain (Cox and Savoy, 2003). Furthermore, in the statistics and machine learning fields, it is widely recognized that an over-fitted model generally has poor predictive performance (Dietterich, 1995), as it can exaggerate minor fluctuations in the data. That is, the model could describe random error or noise instead of the underlying true relationship. Overfitting can occur whenever a model is excessively complex, such as having too many parameters relative to the number of observations. Due to the relatively small number of training cases in previous brain encoding/decoding studies, caution is needed when interpreting the generalizability of the learned models. This limitation may be alleviated by introducing other machine learning methods, e.g., sparse learning (Lee et al., 2011; Ryali et al., 2010; Vu et al., 2011) or linear regression models (Ji et al., 2011), as potential directions that enable researchers to build more complex and meaningful mappings between stimuli and brain responses.
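The effect of having too many parameters relative to the number of observations can be reproduced with a small synthetic regression, sketched below. The sample sizes, the number of irrelevant features, and the noise level are illustrative assumptions; the point is only the qualitative gap between training and test error.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares weights for y = X @ w + noise (no intercept)."""
    w, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Only feature 0 carries signal; the remaining 18 features are pure noise.
rng = np.random.default_rng(0)
n_train, n_test, n_features = 20, 1000, 19
X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))
y_train = 2.0 * X_train[:, 0] + rng.standard_normal(n_train)
y_test = 2.0 * X_test[:, 0] + rng.standard_normal(n_test)

# Small model: 1 parameter. Large model: 19 parameters for 20 observations.
w_small = fit_ols(X_train[:, :1], y_train)
w_large = fit_ols(X_train, y_train)

train_small, test_small = mse(X_train[:, :1], y_train, w_small), mse(X_test[:, :1], y_test, w_small)
train_large, test_large = mse(X_train, y_train, w_large), mse(X_test, y_test, w_large)
```

The large model fits the training data better because it can absorb the noise, yet it predicts held-out data worse, which is why training accuracy alone says little about the generalizability of a decoder trained on few samples.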
Discussions and concluding remarks
Brain encoding and decoding methods can significantly advance the field of cognitive neuroscience itself. For instance, one possibility suggested by the results of (Miyawaki et al., 2008) is that people can use fMRI data to reconstruct the contents of visual imagery or perhaps even dreams in the future (Kay and Gallant, 2009). It was also mentioned in (Kay and Gallant, 2009) that whether or not brain decoding approaches can be extended to those subjective perceptual states relies on whether the neural processes that mediate these states are similar to those involved in normal perception (Kosslyn et al., 2001).
In the future, the successes of brain encoding/decoding approaches and methodologies can potentially be applied in a variety of fields including brain-computer interface (BCI), which aims to connect brain and machine (Nijholt and Tan, 2008; Van Gerven et al., 2009; Wolpaw et al., 2002). In the BCI field, functional brain activity can be measured and used to provide feedback to a computer or to modify one’s own patterns of brain activity (Lebedev and Nicolelis, 2006). The effective and accurate quantification of functional responses in brain decoders can potentially be translated into practical applications for healthy and disabled users (Bashashati et al., 2007; Christopher deCharms, 2008; Velliste et al., 2008), as well as into novel ways of analyzing neurophysiological data in cognitive neuroscience (Sitaram et al., 2007). Additionally, the combination of EEG and fMRI would also benefit BCI design. Due to the size and price of the device, fMRI might not be suitable for BCI instruments at the current stage. As a result, BCI equipment is mainly based on EEG or electrocorticography (ECoG), which offers the merits of small volume, light weight, low price and high temporal resolution. However, information from fMRI can assist EEG signal analysis by providing prior knowledge for EEG-based BCI design. Researchers may use the precise localization of cortical activation in fMRI images to decide where to place electrodes on the scalp or cortex. It is also helpful for BCI design to build accurate models by combining fMRI and EEG data recorded under certain external or internal stimuli and to eliminate unrelated cortical areas.
Visual encoding and decoding can also be used to advance image/video representation and analysis. For instance, visual encoding and decoding can help researchers in computer vision to extract and learn richer and more descriptive visual features that are more strongly correlated with the brain’s functional responses (Hu et al., 2010; Hu et al., 2012; Ji et al., 2011). These predictive models or brain decoders can be used to effectively bridge the semantic gaps between low-level visual stream representations and high-level semantics perceived by the human brain, which has been verified by a variety of neuroimaging-guided image/video analyses (Hu et al., 2012; Kapoor et al., 2008; Walther et al., 2009; Wang et al., 2009). Visual encoding and decoding has also been considered for artist training. Hasson et al. (2004) proposed to apply visual encoding and decoding in the training of movie makers, in which the students may study not only the movies themselves but also how these movies influence humans.
Finally, it should be pointed out that brain encoding and decoding is a highly interdisciplinary research field that interfaces with neuroscience, neuroimaging, image analysis, and machine learning. To accelerate the pace of innovation in this emerging new field in the future, large-scale collaborative efforts amongst the above-mentioned research disciplines are critically important, including sharing common experiment designs, computational algorithms and codes, and benchmark original and pre-processed datasets. Essentially, these collaborative efforts would stimulate new ideas, facilitate novel methodologies, and importantly, independently cross-validate scientific findings and computational algorithms/tools from different labs.
Acknowledgements
T. Liu was supported by the NIH Career Award (EB006878), NIH R01 HL087923-03S2, NIH R01 DA033393, NSF CAREER Award (IIS-1149260) and The University of Georgia start-up research funding. The author would like to thank the following collaborators for helpful discussions: Kaiming Li, Xiang Ji, Fan Deng, Dajiang Zhu, Tuo Zhang, Hanbo Chen, Xiang Li, Shu Zhang, Carlos Faraco, L. Stephen Miller, Heng Huang, Xian-Sheng Hua, and Lie Lu.
References
- Andersson J, Smith S, Jenkinson M. FNIRT-FMRIB’s non-linear image registration tool. Human Brain Mapping. 2008:15–19.
- Argyriou A, Micchelli CA, Pontil M, Ying Y. A spectral regularization framework for multi-task structure learning. Advances in Neural Information Processing Systems: NIPS. 2008:25–32.
- Arthurs OJ, Boniface S. How well do we understand the neural origins of the fMRI BOLD signal? Trends in Neurosciences. 2002;25:27–31. doi: 10.1016/s0166-2236(00)01995-0.
- Avants BB, Epstein C, Grossman M, Gee JC. Symmetric diffeomorphic image registration with cross-correlation: Evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis. 2008;12:26–41. doi: 10.1016/j.media.2007.06.004.
- Bartels A, Zeki S. Functional brain mapping during free viewing of natural scenes. Human Brain Mapping. 2004;21:75–85. doi: 10.1002/hbm.10153.
- Bartels A, Zeki S. Brain dynamics during natural viewing conditions—a new guide for mapping connectivity in vivo. Neuroimage. 2005;24:339–349. doi: 10.1016/j.neuroimage.2004.08.044.
- Bartels A, Zeki S, Logothetis N. Natural vision reveals regional specialization to local motion and to contrast-invariant, global flow in the human brain. Cerebral Cortex. 2008;18:705–717. doi: 10.1093/cercor/bhm107.
- Bashashati A, Fatourechi M, Ward RK, Birch GE. A survey of signal processing algorithms in brain–computer interfaces based on electrical brain signals. Journal of Neural Engineering. 2007;4:R32–R57. doi: 10.1088/1741-2560/4/2/R03.
- Beauchamp MS, Lee KE, Haxby JV, Martin A. FMRI responses to video and point-light displays of moving humans and manipulable objects. Journal of Cognitive Neuroscience. 2003;15:991–1001. doi: 10.1162/089892903770007380.
- Beer RD. Dynamical approaches to cognitive science. Trends in Cognitive Sciences. 2000;4:91–99. doi: 10.1016/s1364-6613(99)01440-0.
- Belliveau J, Kennedy D, Jr, McKinstry R, Buchbinder B, Weisskoff R, Cohen M, Vevea J, Brady T, Rosen B. Functional mapping of the human visual cortex by magnetic resonance imaging. Science. 1991;254:716–719. doi: 10.1126/science.1948051.
- Brouwer GJ, Heeger DJ. Decoding and reconstructing color from responses in human visual cortex. The Journal of Neuroscience. 2009;29:13992–14003. doi: 10.1523/JNEUROSCI.3577-09.2009.
- Carlson TA, Schrater P, He S. Patterns of activity in the categorical representations of objects. Journal of Cognitive Neuroscience. 2003;15:704–717. doi: 10.1162/089892903322307429.
- Chai B, Walther D, Beck D, Li F-F. Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis. Advances in Neural Information Processing Systems. 2009:270–278.
- Chang C, Glover GH. Time-frequency dynamics of resting-state brain connectivity measured with fMRI. Neuroimage. 2010;50:81–98. doi: 10.1016/j.neuroimage.2009.12.011.
- Christopher deCharms R. Applications of real-time fMRI. Nature Reviews Neuroscience. 2008;9:720–729. doi: 10.1038/nrn2414.
- Christopher deCharms R, Merzenich MM. Primary cortical representation of sounds by the coordination of action-potential timing. Nature. 1996;381:610–613. doi: 10.1038/381610a0.
- Clithero JA, Smith DV, Carter RM, Huettel SA. Within- and cross-participant classifiers reveal different neural coding of information. Neuroimage. 2011;56:699–708. doi: 10.1016/j.neuroimage.2010.03.057.
- Cox DD, Savoy RL. Functional magnetic resonance imaging (fMRI) “brain reading”: detecting and classifying distributed patterns of fMRI activity in human visual cortex. Neuroimage. 2003;19:261–270. doi: 10.1016/s1053-8119(03)00049-1.
- Craddock RC, Holtzheimer PE, III, Hu XP, Mayberg HS. Disease state prediction from resting state functional connectivity. Magnetic Resonance in Medicine. 2009;62:1619–1628. doi: 10.1002/mrm.22159.
- Dai W, Chen Y, Xue GR, Yang Q, Yu Y. Translated learning: Transfer learning across different feature spaces. Advances in Neural Information Processing Systems: NIPS. 2008:353–360.
- Dantone M, Gall J, Fanelli G, Van Gool L. Real-time facial feature detection using conditional regression forests. IEEE Conference on Computer Vision and Pattern Recognition: CVPR. 2012. pp. 2578–2585.
- Davatzikos C, Ruparel K, Fan Y, Shen D, Acharyya M, Loughead J, Gur R, Langleben DD. Classifying spatial patterns of brain activity with machine learning methods: application to lie detection. Neuroimage. 2005;28:663–668. doi: 10.1016/j.neuroimage.2005.08.009.
- Dayan P, Abbott L. Theoretical neuroscience: Computational and mathematical modeling of neural systems. Journal of Cognitive Neuroscience. 2003;15:154–155.
- Dayan P, Abbott LF, Abbott L. Theoretical neuroscience: Computational and mathematical modeling of neural systems. Philosophical Psychology. 2001:563–577.
- Deng F, Zhu D, Liu T. FMRI Signal Analysis Using Empirical Mean Curve Decomposition. IEEE Transactions on Biomedical Engineering. 2012. doi: 10.1109/TBME.2012.2221125. In press.
- Derrfuss J, Mar RA. Lost in localization: the need for a universal coordinate database. Neuroimage. 2009;48:1–7. doi: 10.1016/j.neuroimage.2009.01.053.
- Dietterich T. Overfitting and undercomputing in machine learning. ACM Computing Surveys. 1995;27:326–327.
- Downing PE, Jiang Y, Shuman M, Kanwisher N. A cortical area selective for visual processing of the human body. Science. 2001;293:2470–2473. doi: 10.1126/science.1063414.
- Dumoulin SO, Wandell BA. Population receptive field estimates in human visual cortex. Neuroimage. 2008;39:647–660. doi: 10.1016/j.neuroimage.2007.09.034.
- Eger E, Ashburner J, Haynes JD, Dolan RJ, Rees G. fMRI activity patterns in human LOC carry information about object exemplars within category. Journal of Cognitive Neuroscience. 2008;20:356–370. doi: 10.1162/jocn.2008.20019.
- Engel AK, Fries P, Singer W. Dynamic predictions: oscillations and synchrony in top-down processing. Nature Reviews Neuroscience. 2001;2:704–716. doi: 10.1038/35094565.
- Engel S, Zhang X, Wandell B. Colour tuning in human visual cortex measured with functional magnetic resonance imaging. Nature. 1997;388:68–71. doi: 10.1038/40398.
- Evgeniou AAT, Pontil M. Multi-task feature learning. MIT Press; 2007.
- Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron. 2002;33:341–355. doi: 10.1016/s0896-6273(02)00569-x.
- Fries P. A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends in Cognitive Sciences. 2005;9:474–480. doi: 10.1016/j.tics.2005.08.011.
- Friston K, Chu C, Mourão-Miranda J, Hulme O, Rees G, Penny W, Ashburner J. Bayesian decoding of brain images. Neuroimage. 2008;39:181–205. doi: 10.1016/j.neuroimage.2007.08.013.
- Friston KJ, Holmes A, Poline JB, Price CJ, Frith C. Detecting activations in PET and fMRI: levels of inference and power. Neuroimage. 1996;4:223–235. doi: 10.1006/nimg.1996.0074.
- Friston KJ, Holmes AP, Poline J, Grasby P, Williams S, Frackowiak RSJ, Turner R. Analysis of fMRI time-series revisited. Neuroimage. 1995;2:45–53. doi: 10.1006/nimg.1995.1007.
- Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, Frackowiak RSJ. Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping. 1994;2:189–210.
- Fujiwara Y, Miyawaki Y, Kamitani Y. Estimating image bases for visual image reconstruction from human brain activity. Advances in Neural Information Processing Systems: NIPS. 2009:576–584.
- Gao W, Lin W. Frontal parietal control network regulates the anti-correlated default and dorsal attention networks. Human Brain Mapping. 2012;33:192–202. doi: 10.1002/hbm.21204.
- Gerstner W, Kreiter AK, Markram H, Herz AVM. Neural codes: firing rates and beyond. Proceedings of the National Academy of Sciences. 1997;94:12740–12741. doi: 10.1073/pnas.94.24.12740.
- Goferman S, Zelnik-Manor L, Tal A. Context-aware saliency detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 2012;34:1915–1926. doi: 10.1109/TPAMI.2011.272.
- Goffaux V, Peters J, Haubrechts J, Schiltz C, Jansma B, Goebel R. From coarse to fine? Spatial and temporal dynamics of cortical face processing. Cerebral Cortex. 2011;21:467–476. doi: 10.1093/cercor/bhq112.
- Golland Y, Bentin S, Gelbard H, Benjamini Y, Heller R, Nir Y, Hasson U, Malach R. Extrinsic and intrinsic systems in the posterior cortex of the human brain revealed during natural sensory stimulation. Cerebral Cortex. 2007;17:766–777. doi: 10.1093/cercor/bhk030.
- Guger C, Ramoser H, Pfurtscheller G. Real-time EEG analysis with subject-specific spatial patterns for a brain-computer interface (BCI). Rehabilitation Engineering, IEEE Transactions on. 2000;8:447–456. doi: 10.1109/86.895947.
- Hagmann P, Cammoun L, Gigandet X, Gerhard S, Ellen Grant P, Wedeen V, Meuli R, Thiran JP, Honey CJ, Sporns O. MR connectomics: principles and challenges. Journal of Neuroscience Methods. 2010;194:34–45. doi: 10.1016/j.jneumeth.2010.01.014.
- Hasson U, Landesman O, Knappmeyer B, Vallines I, Rubin N, Heeger DJ. Neurocinematics: The neuroscience of film. Projections. 2008;2:1–26.
- Hasson U, Malach R, Heeger DJ. Reliability of cortical activity during natural stimulation. Trends in Cognitive Sciences. 2010;14:40–48. doi: 10.1016/j.tics.2009.10.011.
- Hasson U, Nir Y, Levy I, Fuhrmann G, Malach R. Intersubject synchronization of cortical activity during natural vision. Science. 2004;303:1634–1640. doi: 10.1126/science.1089506.
- Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, Pietrini P. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science. 2001;293:2425–2430. doi: 10.1126/science.1063736.
- Haynes JD, Rees G. Decoding mental states from brain activity in humans. Nature Reviews Neuroscience. 2006;7:523–534. doi: 10.1038/nrn1931.
- He X, Niyogi P. Locality preserving projections. Proceeding of Advances in Neural Information Processing Systems: NIPS. 2003:153–160.
- Heeger DJ, Ress D. What does fMRI tell us about neuronal activity? Nature Reviews Neuroscience. 2002;3:142–151. doi: 10.1038/nrn730.
- Hu X, Deng F, Li K, Zhang T, Chen H, Jiang X, Lv J, Zhu D, Faraco C, Zhang D. Bridging low-level features and high-level semantics via fMRI brain imaging for video classification. Proceedings of the International Conference on Multimedia: ICM. Firenze, Italy: ACM; 2010. pp. 451–460.
- Hu X, Li K, Han J, Hua X, Guo L, Liu T. Bridging the Semantic Gap via Functional Brain Imaging. Multimedia, IEEE Transactions on. 2012;14:314–325.
- Hubel DH, Wiesel TN. Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology. 1968;195:215–243. doi: 10.1113/jphysiol.1968.sp008455.
- Hubel DH, Wiesel TN. Anatomical demonstration of columns in the monkey striate cortex. Nature. 1969;221:747–750. doi: 10.1038/221747a0.
- Huettel S, Song A, McCarthy G. Functional Magnetic Resonance Imaging. Sinauer Associates; 2004.
- Hung CP, Kreiman G, Poggio T, DiCarlo JJ. Fast readout of object identity from macaque inferior temporal cortex. Science. 2005;310:863–866. doi: 10.1126/science.1117593.
- Ishai A, Ungerleider LG, Martin A, Schouten JL, Haxby JV. Distributed representation of objects in the human ventral visual pathway. Proceedings of the National Academy of Sciences. 1999;96:9379–9384. doi: 10.1073/pnas.96.16.9379.
- Ji X, Han J, Hu X, Li K, Deng F, Fang J, Guo L, Liu T. Retrieving video shots in semantic brain imaging space using manifold-ranking. International Conference on Image Processing: ICIP. IEEE; 2011. pp. 3633–3636.
- Jiang X, Zhang T, Hu X, Lu L, Han J, Guo L, Liu T. Music/Speech Classification Using High-level Features Derived from fMRI Brain Imaging. Proceedings of the 20th ACM International Conference on Multimedia: ACMMM. 2012. In press.
- Kamitani Y, Tong F. Decoding the visual and subjective contents of the human brain. Nature Neuroscience. 2005;8:679–685. doi: 10.1038/nn1444.
- Kamitani Y, Tong F. Decoding Seen and Attended Motion Directions from Activity in the Human Visual Cortex. Current Biology. 2006;16:1096–1102. doi: 10.1016/j.cub.2006.04.003.
- Kanwisher N, McDermott J, Chun MM. The fusiform face area: a module in human extrastriate cortex specialized for face perception. The Journal of Neuroscience. 1997;17:4302–4311. doi: 10.1523/JNEUROSCI.17-11-04302.1997.
- Kapoor A, Shenoy P, Tan D. Combining brain computer interfaces with vision for object categorization. IEEE Conference on Computer Vision and Pattern Recognition: CVPR. 2008. pp. 1–8.
- Kay KN, Gallant JL. I can see what you see. Nature Neuroscience. 2009;12:245–245. doi: 10.1038/nn0309-245.
- Kay KN, Naselaris T, Prenger RJ, Gallant JL. Identifying natural images from human brain activity. Nature. 2008;452:352–355. doi: 10.1038/nature06713.
- Kennedy DN. Making connections in the connectome era. Neuroinformatics. 2010;8:61–62. doi: 10.1007/s12021-010-9070-1.
- Kosslyn SM, Ganis G, Thompson WL. Neural foundations of imagery. Nature Reviews Neuroscience. 2001;2:635–642. doi: 10.1038/35090055.
- Kriegeskorte N, Goebel R, Bandettini P. Information-based functional brain mapping. Proceedings of the National Academy of Sciences of the United States of America. 2006;103:3863–3868. doi: 10.1073/pnas.0600244103.
- LaConte S, Strother S, Cherkassky V, Anderson J, Hu X. Support vector machines for temporal classification of block design fMRI data. Neuroimage. 2005;26:317–329. doi: 10.1016/j.neuroimage.2005.01.048.
- LaConte SM, Peltier SJ, Hu XP. Real-time fMRI using brain-state classification. Human Brain Mapping. 2006;28:1033–1044. doi: 10.1002/hbm.20326.
- Laird AR, Eickhoff SB, Kurth F, Fox PM, Uecker AM, Turner JA, Robinson JL, Lancaster JL, Fox PT. ALE meta-analysis workflows via the BrainMap database: progress towards a probabilistic functional brain atlas. Frontiers in Neuroinformatics. 2009;3:1–11. doi: 10.3389/neuro.11.023.2009.
- Lebedev MA, Nicolelis MAL. Brain–machine interfaces: past, present and future. Trends in Neurosciences. 2006;29:536–546. doi: 10.1016/j.tins.2006.07.004.
- Li J, Levine MD, An X, He H. Saliency detection based on frequency and spatial domain analysis. Neuroscience. 2005;8:975–977.
- Li K, Guo L, Faraco C, Zhu D, Chen H, Yuan Y, Lv J, Deng F, Jiang X, Zhang T, Hu X, Zhang D, Miller LS, Liu T. Visual analytics of brain networks. Neuroimage. 2012a;61:82–97. doi: 10.1016/j.neuroimage.2012.02.075.
- Li K, Guo L, Zhu D, Hu X, Han J, Liu T. Individual functional ROI optimization via maximization of group-wise consistency of structural and functional profiles. Neuroinformatics. 2012b;10:225–242. doi: 10.1007/s12021-012-9142-5.
- Li K, Zhu D, Guo L, Li Z, Lynch ME, Coles C, Hu X, Liu T. Connectomics signatures of prenatal cocaine exposure affected adolescent brains. Human Brain Mapping. 2012c. doi: 10.1002/hbm.22082.
- Li X, Lim C, Li K, Guo L, Liu T. Detecting Brain State Changes via Fiber-Centered Functional Connectivity Analysis. Neuroinformatics. 2012d. doi: 10.1007/s12021-012-9157-y.
- Li Y, Gilmore JH, Wang J, Styner M, Lin W, Zhu H. Twin MARM: two-stage multiscale adaptive regression methods for twin neuroimaging data. IEEE Trans Med Imaging. 2012e;31:1100–1112. doi: 10.1109/TMI.2012.2185830.
- Liu T. A few thoughts on brain ROIs. Brain Imaging and Behavior. 2011;5:189–202. doi: 10.1007/s11682-011-9123-6.
- Liu TM, Shen DG, Davatzikos C. Deformable registration of cortical structures via hybrid volumetric and surface warping. Medical Image Computing and Computer-Assisted Intervention: MICCAI. 2003;2879:780–787.
- Liu Z, He B. fMRI-EEG integrated cortical source imaging by use of time-variant spatial constraints. Neuroimage. 2008;3:1198. doi: 10.1016/j.neuroimage.2007.10.003.
- Livingstone MS, Hubel DH. Thalamic inputs to cytochrome oxidase-rich regions in monkey visual cortex. Proceedings of the National Academy of Sciences. 1982;79:6098–6101. doi: 10.1073/pnas.79.19.6098.
- Logothetis NK, Pauls J, Augath M, Trinath T, Oeltermann A. Neurophysiological investigation of the basis of the fMRI signal. Nature. 2001;412:150–157. doi: 10.1038/35084005.
- Logothetis NK. What we can do and what we cannot do with fMRI. Nature. 2008;453:869–878. doi: 10.1038/nature06976.
- Lowe DG. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision. 2004;60:91–110.
- MacEvoy SP, Epstein RA. Decoding the representation of multiple simultaneous objects in human occipitotemporal cortex. Current Biology. 2009;19:943–947. doi: 10.1016/j.cub.2009.04.020.
- Majeed W, Magnuson M, Hasenkamp W, Schwarb H, Schumacher EH, Barsalou L, Keilholz SD. Spatiotemporal dynamics of low frequency BOLD fluctuations in rats and humans. Neuroimage. 2011;54:1140–1150. doi: 10.1016/j.neuroimage.2010.08.030.
- Malinen S, Hlushchuk Y, Hari R. Towards natural stimulation in fMRI—issues of data analysis. Neuroimage. 2007;35:131–139. doi: 10.1016/j.neuroimage.2006.11.015.
- Matthews P, Jezzard P. Functional magnetic resonance imaging. Journal of Neurology, Neurosurgery & Psychiatry. 2004;75:6–12.
- Mechler F, Victor JD, Purpura KP, Shapley R. Robust temporal coding of contrast by V1 neurons for transient but not for steady-state stimuli. The Journal of Neuroscience. 1998;18:6583–6598. doi: 10.1523/JNEUROSCI.18-16-06583.1998.
- Micchelli CA, Morales JM, Pontil M. A family of penalty functions for structured sparsity. Advances in Neural Information Processing Systems: NIPS. 2010;23:1612–1623.
- Mikl M, Mareček R, Hluštík P, Pavlicová M, Drastich A, Chlebus P, Brázdil M, Krupa P. Effects of spatial smoothing on fMRI group inferences. Magnetic Resonance Imaging. 2008;26:490–503. doi: 10.1016/j.mri.2007.08.006.
- Mitchell TM, Hutchinson R, Niculescu RS, Pereira F, Wang X, Just M, Newman S. Learning to decode cognitive states from brain images. Machine Learning. 2004;57:145–175.
- Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA. Predicting human brain activity associated with the meanings of nouns. Science. 2008;320:1191–1195. doi: 10.1126/science.1152876.
- Miyawaki Y, Uchida H, Yamashita O, Sato M, Morito Y, Tanabe HC, Sadato N, Kamitani Y. Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron. 2008;60:915–929. doi: 10.1016/j.neuron.2008.11.004.
- Mourão-Miranda J, Bokde AL, Born C, Hampel H, Stetter M. Classifying brain states and determining the discriminating activation patterns: Support Vector Machine on functional MRI data. Neuroimage. 2005;28:980–995. doi: 10.1016/j.neuroimage.2005.06.070.
- Naselaris T, Kay KN, Nishimoto S, Gallant JL. Encoding and decoding in fMRI. Neuroimage. 2011;56:400–410. doi: 10.1016/j.neuroimage.2010.07.073.
- Naselaris T, Prenger RJ, Kay KN, Oliver M, Gallant JL. Bayesian reconstruction of natural images from human brain activity. Neuron. 2009;63:902–915. doi: 10.1016/j.neuron.2009.09.006.
- Naselaris T, Stansbury DE, Gallant JL. Cortical representation of animate and inanimate objects in complex natural scenes. Journal of Physiology-Paris. 2012;106:239–249. doi: 10.1016/j.jphysparis.2012.02.001.
- Nijholt A, Tan D. Brain-computer interfacing for intelligent systems. Intelligent Systems, IEEE. 2008;23:72–79.
- Nishimoto S, Vu AT, Naselaris T, Benjamini Y, Yu B, Gallant JL. Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies. Current Biology. 2011;21:1641–1646. doi: 10.1016/j.cub.2011.08.031.
- Norman KA, Polyn SM, Detre GJ, Haxby JV. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences. 2006;10:424–430. doi: 10.1016/j.tics.2006.07.005.
- O'Craven KM, Kanwisher N. Mental imagery of faces and places activates corresponding stimulus-specific brain regions. Journal of Cognitive Neuroscience. 2000;12:1013–1023. doi: 10.1162/08989290051137549.
- Obozinski G, Taskar B, Jordan MI. Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing. 2010;20:231–252.
- Ogawa S, Lee T, Kay A, Tank D. Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proceedings of the National Academy of Sciences. 1990;87:9868–9872. doi: 10.1073/pnas.87.24.9868.
- Pantazatos SP, Talati A, Pavlidis P, Hirsch J. Decoding Unattended Fearful Faces with Whole-Brain Correlations: An Approach to Identify Condition-Dependent Large-Scale Functional Connectivity. PLoS Comput Biol. 2012;8:e1002441. doi: 10.1371/journal.pcbi.1002441.
- Passingham RE, Stephan KE, Kotter R. The anatomical basis of functional localization in the cortex. Nature Reviews Neuroscience. 2002;3:606–616. doi: 10.1038/nrn893.
- Patric H, Leila C, Xavier G, Stephan G, Ellen GP, Van W, Reto M, Jean-Philippe T, J HC. MR connectomics: Principles and challenges. Elsevier, Kidlington: ROYAUME-UNI; 2010.
- Peelen MV, Fei-Fei L, Kastner S. Neural mechanisms of rapid natural scene categorization in human visual cortex. Nature. 2009;460:94–97. doi: 10.1038/nature08103.
- Poldrack RA. Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences. 2006;10:59–63. doi: 10.1016/j.tics.2005.12.004.
- Port RF, van Gelder T. Mind as motion: Explorations in the dynamics of cognition. MIT Press; 1995.
- Rasmussen CE, Williams C. Gaussian processes for machine learning. Vol. 38. Cambridge, MA, USA: The MIT Press; 2006. pp. 715–719.
- Rasmussen PM, Hansen LK, Madsen KH, Churchill NW, Strother SC. Model sparsity and brain pattern interpretation of classification models in neuroimaging. Pattern Recognit. 2012;45:2085–2100.
- Redcay E, Dodell-Feder D, Pearrow MJ, Mavros PL, Kleiner M, Gabrieli JD, Saxe R. Live face-to-face interaction during fMRI: a new tool for social cognitive neuroscience. Neuroimage. 2010;50:1639–1647. doi: 10.1016/j.neuroimage.2010.01.052.
- Reddy L, Tsuchiya N, Serre T. Reading the mind's eye: Decoding category information during mental imagery. Neuroimage. 2010;50:818–825. doi: 10.1016/j.neuroimage.2009.11.084.
- Richiardi J, Eryilmaz H, Schwartz S, Vuilleumier P, Van De Ville D. Decoding brain states from fMRI connectivity graphs. Neuroimage. 2011;56:616–626. doi: 10.1016/j.neuroimage.2010.05.081.
- Sabuncu MR, Singer BD, Conroy B, Bryan RE, Ramadge PJ, Haxby JV. Function-based intersubject alignment of human cortical anatomy. Cerebral Cortex. 2010;20:130–140. doi: 10.1093/cercor/bhp085.
- Salek-Haddadi A, Friston K, Lemieux L, Fish D. Studying spontaneous EEG activity with fMRI. Brain Res. Rev. 2003;43:110–133. doi: 10.1016/s0165-0173(03)00193-0.
- Schrouff J, Phillips CLM. Multivariate Pattern Recognition Analysis: Brain Decoding. In: Schnakers C, Laureys S, editors. Coma and Disorders of Consciousness. London: Springer; 2012. pp. 35–43.
- Sekiyama K, Kanno I, Miura S, Sugita Y. Auditory-visual speech perception examined by fMRI and PET. Neuroscience Research. 2003;47:277–287. doi: 10.1016/s0168-0102(03)00214-1.
- Shen D, Davatzikos C. HAMMER: hierarchical attribute matching mechanism for elastic registration. Medical Imaging, IEEE Transactions on. 2002;21:1421–1439. doi: 10.1109/TMI.2002.803111.
- Shibata K, Watanabe T, Sasaki Y, Kawato M. Perceptual learning incepted by decoded fMRI neurofeedback without stimulus presentation. Science. 2011;334:1413–1415. doi: 10.1126/science.1212003.
- Shinkareva SV, Malave VL, Mason RA, Mitchell TM, Just MA. Commonality of neural representations of words and pictures. Neuroimage. 2011;54:2418–2425. doi: 10.1016/j.neuroimage.2010.10.042.
- Shirer W, Ryali S, Rykhlevskaia E, Menon V, Greicius M. Decoding subject-driven cognitive states with whole-brain connectivity patterns. Cerebral Cortex. 2012;22:158–165. doi: 10.1093/cercor/bhr099.
- Singer W. Neuronal synchrony: A versatile code for the definition of relations? Neuron. 1999;24:49–65. doi: 10.1016/s0896-6273(00)80821-1.
- Sitaram R, Caria A, Veit R, Gaber T, Rota G, Kuebler A, Birbaumer N. fMRI Brain-Computer Interface: A Tool for Neuroscientific Research and Treatment. Computational Intelligence and Neuroscience. 2007. doi: 10.1155/2007/25487.
- Sitaram R, Weiskopf N, Caria A, Veit R, Erb M, Birbaumer N. fMRI Brain-Computer Interfaces. Signal Processing Magazine, IEEE. 2008;25:95–106.
- Smith SM, Miller KL, Moeller S, Xu J, Auerbach EJ, Woolrich MW, Beckmann CF, Jenkinson M, Andersson J, Glasser MF. Temporally-independent functional modes of spontaneous brain activity. Proceedings of the National Academy of Sciences. 2012;109:3131–3136. doi: 10.1073/pnas.1121329109.
- Spiridon M, Kanwisher N. How distributed is visual category information in human occipito-temporal cortex? An fMRI study. Neuron. 2002;35:1157–1166. doi: 10.1016/s0896-6273(02)00877-2.
- Spreng RN, Stevens WD, Chamberlain JP, Gilmore AW, Schacter DL. Default network activity, coupled with the frontoparietal control network, supports goal-directed cognition. Neuroimage. 2010;53:303–317. doi: 10.1016/j.neuroimage.2010.06.016.
- Sterzer P, Haynes J-D, Rees G. Fine-scale activity patterns in high-level visual areas encode the category of invisible objects. Journal of Vision. 2008;8:1–12. doi: 10.1167/8.15.10. [DOI] [PubMed] [Google Scholar]
- Stokes M, Thompson R, Cusack R, Duncan J. Top-down activation of shape-specific population codes in visual cortex during mental imagery. The Journal of Neuroscience. 2009;29:1565–1572. doi: 10.1523/JNEUROSCI.4657-08.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sugase-Miyamoto Y, Matsumoto N, Kawano K. Role of temporal processing stages by inferior temporal neurons in facial recognition. Frontiers in Psychology. 2011;2:1–8. doi: 10.3389/fpsyg.2011.00141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tahmasebi A. School of Computing. Kingston, Ontario, Canada: Queen's University; 2010. Quantification of Inter-subject Variability in Human Brain and Its Impact on Analysis of fMRI Data. [Google Scholar]
- Tahmasebi AM, Abolmaesumi P, Zheng ZZ, Munhall KG, Johnsrude IS. Reducing inter-subject anatomical variation: Effect of normalization method on sensitivity of functional magnetic resonance imaging data analysis in auditory cortex and the superior temporal region. Neuroimage. 2009;47:1522–1531. doi: 10.1016/j.neuroimage.2009.05.047.
- Tasker R, Tsuda T, Hawrylyshyn P. Clinical neurophysiological investigation of deafferentation pain. Advances in Pain Research and Therapy. 1983;5:713–738.
- Tenenbaum JB, De Silva V, Langford JC. A global geometric framework for nonlinear dimensionality reduction. Science. 2000;290:2319–2323. doi: 10.1126/science.290.5500.2319.
- Thirion B, Duchesnay E, Hubbard E, Dubois J, Poline J-B, Lebihan D, Dehaene S. Inverse retinotopy: inferring the visual content of images from brain activation patterns. Neuroimage. 2006;33:1104–1116. doi: 10.1016/j.neuroimage.2006.06.062.
- Thirion B, Pinel P, Mériaux S, Roche A, Dehaene S, Poline JB. Analysis of a large fMRI cohort: Statistical and methodological issues for group analyses. Neuroimage. 2007;35:105–120. doi: 10.1016/j.neuroimage.2006.11.054.
- Thompson P, Toga AW. A surface-based technique for warping three-dimensional images of the brain. Medical Imaging, IEEE Transactions on. 1996;15:402–417. doi: 10.1109/42.511745.
- Trappenberg TP. Fundamentals of computational neuroscience. Oxford University Press; 2010.
- Tsao DY, Freiwald WA, Knutsen TA, Mandeville JB, Tootell RBH. Faces and objects in macaque cerebral cortex. Nature Neuroscience. 2003;6:989–995. doi: 10.1038/nn1111.
- Tsao DY, Freiwald WA, Tootell RBH, Livingstone MS. A cortical region consisting entirely of face-selective cells. Science. 2006;311:670–674. doi: 10.1126/science.1119983.
- Van Dijk KRA, Hedden T, Venkataraman A, Evans KC, Lazar SW, Buckner RL. Intrinsic functional connectivity as a tool for human connectomics: theory, properties, and optimization. Journal of Neurophysiology. 2010;103:297–321. doi: 10.1152/jn.00783.2009.
- Van Gerven M, Farquhar J, Schaefer R, Vlek R, Geuze J, Nijholt A, Ramsey N, Haselager P, Vuurpijl L, Gielen S. The brain–computer interface cycle. Journal of Neural Engineering. 2009;6:041001. doi: 10.1088/1741-2560/6/4/041001.
- van Gerven MAJ, de Lange FP, Heskes T. Neural decoding with hierarchical generative models. Neural Computation. 2010;22:3127–3142. doi: 10.1162/NECO_a_00047.
- Velliste M, Perel S, Spalding MC, Whitford AS, Schwartz AB. Cortical control of a prosthetic arm for self-feeding. Nature. 2008;453:1098–1101. doi: 10.1038/nature06996.
- Villarreal MF, Fridman EA, Leiguarda RC. The Effect of the Visual Context in the Recognition of Symbolic Gestures. PLoS ONE. 2012. doi: 10.1371/journal.pone.0029644.
- Vulliemoz S, Lemieux L, Daunizeau J, Michel CM, Duncan JS. The combination of EEG Source Imaging and EEG-correlated functional MRI to map epileptic networks. Epilepsia. 2010;51:491–505. doi: 10.1111/j.1528-1167.2009.02342.x.
- Vulliemoz S, Thornton R, Rodionov R, Carmichael D, Guye M, Lhatoo S, McEvoy A, Spinelli L, Michel C, Duncan J. The spatio-temporal mapping of epileptic networks: combination of EEG–fMRI and EEG source imaging. Neuroimage. 2009;46:834–843. doi: 10.1016/j.neuroimage.2009.01.070.
- Vu VQ, Ravikumar P, Naselaris T, Kay KN, Gallant JL, Yu B. Encoding and Decoding V1 fMRI Responses to Natural Images with Sparse Nonparametric Models. Ann Appl Stat. 2011;5:1159–1182. doi: 10.1214/11-AOAS476.
- Walther DB, Caddigan E, Fei-Fei L, Beck DM. Natural scene categories revealed in distributed patterns of activity in the human brain. The Journal of Neuroscience. 2009;29:10573–10581. doi: 10.1523/JNEUROSCI.0559-09.2009.
- Wang J, Pohlmeyer E, Hanna B, Jiang Y-G, Sajda P, Chang S-F. Brain state decoding for rapid image retrieval. Proceedings of the 17th ACM International Conference on Multimedia (ACMMM); 2009. pp. 945–954.
- Werner S, Noppeney U. Distinct functional contributions of primary sensory and association areas to audiovisual integration in object categorization. The Journal of Neuroscience. 2010;30:2662–2675. doi: 10.1523/JNEUROSCI.5091-09.2010.
- Whittingstall K, Bartels A, Singh V, Kwon S, Logothetis NK. Integration of EEG source imaging and fMRI during continuous viewing of natural movies. Magnetic Resonance Imaging. 2010;28:1135–1142. doi: 10.1016/j.mri.2010.03.042.
- Williams R. The human connectome: just another 'ome? The Lancet Neurology. 2010;9:238–239. doi: 10.1016/S1474-4422(10)70046-6.
- Wolpaw JR, Birbaumer N, McFarland DJ, Pfurtscheller G, Vaughan TM. Brain-computer interfaces for communication and control. Clinical Neurophysiology. 2002;113:767–791. doi: 10.1016/s1388-2457(02)00057-3.
- Wu MCK, David SV, Gallant JL. Complete functional characterization of sensory neurons by system identification. Annu. Rev. Neurosci. 2006;29:477–505. doi: 10.1146/annurev.neuro.29.051605.113024.
- Yacoub E, Harel N, Uğurbil K. High-field fMRI unveils orientation columns in humans. Proceedings of the National Academy of Sciences. 2008;105:10607–10612. doi: 10.1073/pnas.0804110105.
- Yang Q, Chen Y, Xue G-R, Dai W, Yu Y. Heterogeneous transfer learning for image clustering via the social web. Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. 2009;1:1–9.
- Yao H, Shi L, Han F, Gao H, Dan Y. Rapid learning in cortical coding of visual scenes. Nature Neuroscience. 2007;10:772–778. doi: 10.1038/nn1895.
- Yue Y, Loh JM, Lindquist MA. Adaptive spatial smoothing of fMRI images. Statistics and its Interface. 2010;3:3–13.
- Zhang P, Cootes T. Automatic Part Selection for Groupwise Registration. In: Székely G, Hahn H, editors. Information Processing in Medical Imaging. Berlin / Heidelberg: Springer; 2011. pp. 636–647.
- Zhang T, Guo L, Li K, Jing C, Yin Y, Zhu D, Cui G, Li L, Liu T. Predicting functional cortical ROIs via DTI-derived fiber shape models. Cerebral Cortex. 2012a;22:854–864. doi: 10.1093/cercor/bhr152.
- Zhang X, Guo L, Li X, Zhu D, Li K, Sun Z, Jin C, Hu X, Han J, Zhao Q, Li L, Liu T. Characterization of Task-free/Task-performance Brain States. Medical Image Computing and Computer-Assisted Intervention: MICCAI. 2012b. In press. doi: 10.1007/978-3-642-33418-4_30.
- Zhu D, Li K, Faraco CC, Deng F, Zhang D, Guo L, Miller LS, Liu T. Optimization of functional brain ROIs via maximization of consistency of structural connectivity profiles. Neuroimage. 2012a;59:1382–1393. doi: 10.1016/j.neuroimage.2011.08.037.
- Zhu D, Li K, Guo L, Jiang X, Zhang T, Zhang D, Chen H, Deng F, Faraco C, Jin C, Wee C-Y, Yuan Y, Lv P, Yin Y, Hu X, Duan L, Hu X, Han J, Wang L, Shen D, Miller LS, Li L, Liu T. DICCCOL: Dense Individualized and Common Connectivity-Based Cortical Landmarks. Cerebral Cortex. 2012b. doi: 10.1093/cercor/bhs072.
