Author manuscript; available in PMC: 2022 Sep 1.
Published in final edited form as: Curr Opin Biomed Eng. 2021 Jun 2;19:100298. doi: 10.1016/j.cobme.2021.100298

Naturalistic Stimuli: A Paradigm for Multi-Scale Functional Characterization of the Human Brain

Yizhen Zhang 1, Jung-Hoon Kim 2,3, David Brang 4, Zhongming Liu 1,2,*
PMCID: PMC8376216  NIHMSID: NIHMS1710725  PMID: 34423178

Abstract

Movies, audio stories, and virtual reality are increasingly used as stimuli for functional brain imaging. Such naturalistic paradigms are in sharp contrast to the tradition of experimental reductionism in neuroscience research. Being complex, dynamic, and diverse, naturalistic stimuli set up a more ecologically relevant condition and induce highly reproducible brain responses across a wide range of spatiotemporal scales. Here, we review recent technical advances and scientific findings on imaging the brain under naturalistic stimuli. Then we elaborate on the premise of using naturalistic paradigms for multi-scale, multi-modal, and high-throughput functional characterization of the human brain. We further highlight the growing potential of using deep learning models to infer neural information processing from brain responses to naturalistic stimuli. Lastly, we advocate large-scale collaborations to combine brain imaging and recording data across experiments, subjects, and labs that use the same set of naturalistic stimuli.

Introduction

Progress in functional brain imaging has been remarkable. While technological advancements push the limits of imaging resolution and speed, experimental paradigms also evolve in parallel. Traditional paradigms use simple stimuli or tasks designed to control for extraneous variables, thereby isolating a variable of interest and specifically testing how that variable changes brain activity. Such paradigms set up specific hypotheses for mapping specific functions onto specific brain regions in line with the long-standing focus on functional specialization, e.g., face perception in the right fusiform gyrus [1]. Findings resulting from these paradigms, however, tend to be piecemeal and sometimes incohesive, and largely ignore the distributed and flexible nature of many perceptual and cognitive processes [2]. Since the early 2000s, many functional brain imaging studies have shifted their focus to the task-free resting state [3]. This paradigm shift redirects the focus from mapping functions to mapping the brain connectome as a collection of resting state networks. The resting state is not as simple as its name implies but varies across visceral, arousal, and mental states [4–6]. Nevertheless, the resting state is not explicitly biased by any overt task and is thus useful for mapping the brain’s intrinsic functional architecture while being convenient to leverage big data [7]. Although resting state networks are relatively straightforward to map, their functional interpretation is often speculative due to the lack of functional context in a task-free condition.

In this paper, we advocate naturalistic stimuli as a paradigm complementary to traditional task paradigms, which are overly specific, and resting-state paradigms, which are entirely non-specific. Naturalistic stimuli, e.g., movies, speech, virtual reality, set up a more ecologically relevant condition for brain research and involve rich spatial and temporal contexts to engage a wide range of neural processes and mechanisms [8]. Computational models are increasingly available to decompose seemingly complex stimuli into explainable features and to relate the decomposed features to specific neurons, voxels, regions, and networks observable with brain imaging and recording techniques. Naturalistic paradigms present an opportunity to combine data across different spatiotemporal scales, functional systems, imaging modalities, and individuals or laboratories. In the following, we review the recent technical innovations and neuroscientific findings that support this perspective. Then we outline future directions for utilizing naturalistic stimuli to curate and leverage “big data” for high-throughput and multi-scale functional characterization of the human brain.

Reproducible brain responses to naturalistic stimuli

The brain enables humans to explore and interact with a complex and dynamic world. Stimuli are considered to be naturalistic if they mimic the real world while being diverse and dynamic. Typical examples include movies, speech, and music, which involve perceptual, cognitive, and emotional experiences common in daily life. A defining feature of naturalistic stimuli is their ability to evoke brain responses that are highly reproducible within and across subjects, as first demonstrated by Hasson and colleagues [9]. Specifically, different subjects viewing the same movie exhibit consistent responses in the stimulus-evoked cortical locations [10] and even in the white matter [11]. Inter-subject functional correlations further separate stimulus-related network interactions from stimulus-unrelated ongoing activity [12, 13]. Being naturalistic is necessary for stimuli to induce highly reliable cortical responses. Natural stimuli tend to generate more reliable responses than artificial stimuli [14]. For example, an artificial video that preserves the spectral magnitude but randomizes the spectral phase of a natural movie does not elicit reproducible fMRI responses even in early visual areas [15].
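To make the notion of inter-subject reproducibility concrete, the following is a minimal sketch of a leave-one-out inter-subject correlation for a single voxel or region. It is not the exact procedure of the cited studies. The function name `leave_one_out_isc` and the synthetic "movie" and "rest" data are assumptions introduced for illustration only.

```python
import numpy as np

def leave_one_out_isc(data):
    """Leave-one-out inter-subject correlation for one voxel or region.

    data: array of shape (n_subjects, n_timepoints).
    Returns one Pearson correlation per subject, computed between that
    subject's time series and the average time series of all other subjects.
    """
    data = np.asarray(data, dtype=float)
    n_subjects = data.shape[0]
    total = data.sum(axis=0)
    isc = np.empty(n_subjects)
    for s in range(n_subjects):
        others_mean = (total - data[s]) / (n_subjects - 1)
        isc[s] = np.corrcoef(data[s], others_mean)[0, 1]
    return isc

# Synthetic data (an assumption, not real fMRI): during the "movie" every
# subject shares one stimulus-driven component plus private noise, whereas
# at "rest" the subjects share nothing that is time-locked.
rng = np.random.default_rng(0)
n_subjects, n_timepoints = 8, 300
shared = rng.standard_normal(n_timepoints)
movie = shared + rng.standard_normal((n_subjects, n_timepoints))
rest = rng.standard_normal((n_subjects, n_timepoints))

isc_movie = leave_one_out_isc(movie)
isc_rest = leave_one_out_isc(rest)
print("mean ISC, movie:", round(float(isc_movie.mean()), 3))
print("mean ISC, rest: ", round(float(isc_rest.mean()), 3))
```

Because only the stimulus-driven component is time-locked across subjects, averaging over the other subjects suppresses their private activity, which is why the correlation is high in the synthetic movie condition and near zero in the synthetic rest condition.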

Different movies may further engage different regions across attention, limbic, and cognitive-control cortical networks [14]. Other than videos, speech and music are increasingly used to investigate neural processes for speech recognition [16], language comprehension [17, 18], and music perception [19]. Each type of naturalistic stimuli engages a broader set of brain regions and more diverse modes of network interactions than its artificial counterparts. Arguably, a relatively smaller battery of different types of naturalistic stimuli may offer more comprehensive and ecologically relevant perspectives to investigate the entire brain than is possible with many more traditional paradigms with strict experimental reductionism. The high throughput is a compelling advantage of using naturalistic stimuli in brain research. We also refer readers to a recent review by Leopold and colleagues [20].

Naturalistic stimuli can induce reliable brain responses observable with different techniques, e.g., functional magnetic resonance imaging (fMRI) [14], magnetoencephalography (MEG) [21] and electroencephalography (EEG) [22], electrocorticography (ECoG) [23], local field potential (LFP) or even single-unit activity [24]. As such, naturalistic stimuli engage reliable neural processes in a wide range of spatiotemporal scales to give rise to rich experiences common across individuals. While most studies focus on one type of neural signal, relating different types of signals under the same natural stimuli is a compelling approach to compare and integrate findings obtained with different techniques [25–27].

In daily-life experiences, sensory information is embedded in rich spatial and temporal contexts. For biological vision, humans have different perceptual understandings of the same input placed in different spatial contexts (e.g., through contextual modulation [28]) or take different actions upon the same object depending on its temporal context (e.g., through process memory [29] and recurrent computation). It is impractical to design or synthesize artificial stimuli to adequately exemplify the diverse contexts that occur in a dynamic and realistic environment. As a nonlinear and dynamical system, the brain does not follow the superposition principle. Brain responses to stereotypical stimuli, while useful for descriptive understanding, do not amount to a computational and generalizable understanding of neural information processing under naturalistic conditions. From the perspective of nonlinear system identification, the process of modeling and understanding the brain should benefit from choosing the system’s input signal that follows the statistics of natural stimuli, which drive continuous learning and inference in the brain.

Integrating brain data across different scales and modalities

Modern imaging and recording techniques are complementary to one another. No single modality by itself offers sufficiently high spatial and temporal resolution or specificity to reveal a complete picture of brain function [30]. The use of the same naturalistic stimuli across different experiments, subjects, and labs sets up a common context to bridge reproducible brain responses observed with various modalities across different spatiotemporal scales (Figure 1).

Figure 1.


Integrating brain and behavioral data across different scales and modalities given the use of a common set of naturalistic stimuli. a) Different subjects undergoing the same naturalistic stimuli exhibit consistent brain responses such that data collected from different subjects or even different labs can be integrated. b) Different types of naturalistic stimuli can be used to probe the interactions within and across neural and cognitive systems responsible for perception, action, memory, emotion, and cognition. c) Given the same naturalistic stimuli, brain responses observed with various imaging or recording techniques can be linked to one another, d) in order to bridge neural processes across different scales relevant to neurons, cortical layers or columns, circuits, regions, networks, and systems. Color-coded arrows in the illustration of cortical layers indicate directed functional connectivity in the feedforward (red) and feedback (blue) directions.

One strategy is to combine low-resolution and high-resolution fMRI signals to bridge cortical regions to layers. Typical fMRI data with blood oxygen level dependent (BOLD) contrast are collected at 3 Tesla with >2mm isotropic resolution and ~1s intervals. Sub-millimeter fMRI with contrast sensitized to small vessels is attainable at ≥7 Tesla to resolve activity by layers within a 2-4mm cortical depth [31, 32]. Since its resolution and contrast are hard to attain at a sufficiently fast speed for the whole brain, acquisition of layer-specific fMRI signals is often restricted to a limited brain slab. Given the same naturalistic stimuli, the fMRI signals measured with different resolutions and fields of view are presumably reproducible across subjects and experiments. It is thus reasonable to correlate whole-brain fMRI signals with layer-specific fMRI signals, to resolve feedforward and feedback input that terminates at different cortical layers (Figure 2c) [33]. The correlations are expected to bridge layer-specific circuitry to large-scale networks and add directionality to otherwise undirected functional connectivity [34]. To measure correlations or other types of statistical dependence alike, brain signals are viewed as random variables.

Figure 2.


Spectral and laminar features of feedforward and feedback cortical processes. a) On a log-log scale, the power spectrum of a neural signal exhibits a broadband 1/f-like pattern plus narrowband components at different frequency bands (delta, theta, alpha, beta, and gamma). The narrowband and broadband components can be separated by using algorithms such as the Irregular Resampling Auto-Spectral Analysis (IRASA) [36]. A plausible interpretation of the separated spectral components is to take the broadband power as a surrogate of population spiking output from the recorded neurons, the narrowband power in a lower frequency band (e.g., alpha) as a surrogate of the feedback synaptic input to the neuronal population, and the narrowband power in a higher frequency band (e.g., gamma) as a surrogate of the feedforward input. b) The feedforward and feedback connections between two neuronal populations at different hierarchical levels synchronize their output-input signals. The spiking output at the higher level is synchronized with the feedback input at the lower level. The spiking output at the lower level is synchronized with the feedforward input at the higher level. c) The feedforward and feedback connections originate from and terminate at different cortical layers. The feedforward input terminates at the granular layer and the feedback input terminates at the supragranular and infragranular layers. Such distinctive laminar profiles are likely resolvable with high-resolution fMRI that reports synaptic input into different cortical laminae within each cortical column.

This practice is more reasonable under naturalistic stimuli (rather than traditional tasks), which are inherently irregular and complex. Correlational patterns under naturalistic stimuli also appear to be more similar to the patterns spontaneously emerging in the resting state [35]. It is plausible that the intrinsic functional architecture is more functionally relevant to neural processes given naturalistic stimuli rather than non-naturalistic stimuli.
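As a toy illustration of correlating layer-specific signals with whole-brain signals acquired in separate sessions under the same stimulus, the sketch below computes a cross-dataset correlation matrix. It assumes that each dataset has already been averaged across subjects and resampled onto a common time axis. The data are purely synthetic, and the laminar assignment simply follows the description in Figure 2c; all names are hypothetical.

```python
import numpy as np

def cross_dataset_correlation(a, b):
    """Pearson r between each row of a (n_a, T) and each row of b (n_b, T)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return az @ bz.T / a.shape[1]

rng = np.random.default_rng(1)
n_timepoints = 400

def noise():
    """Independent measurement noise for one synthetic time series."""
    return rng.standard_normal(n_timepoints)

# Dataset 1 (assumed): whole-brain signals from a lower-level and a
# higher-level region, relative to the imaged slab.
lower_region = rng.standard_normal(n_timepoints)
higher_region = rng.standard_normal(n_timepoints)
whole_brain = np.vstack([lower_region, higher_region])

# Dataset 2 (assumed): three cortical depths in the slab. In this toy setup
# the granular layer receives feedforward input from the lower region, and
# the supragranular and infragranular layers receive feedback input from
# the higher region.
layers = np.vstack([
    higher_region + noise(),  # supragranular
    lower_region + noise(),   # granular
    higher_region + noise(),  # infragranular
])

r = cross_dataset_correlation(layers, whole_brain)
for name, row in zip(["supragranular", "granular", "infragranular"], r):
    print(f"{name:14s} r(lower)={row[0]:+.2f}  r(higher)={row[1]:+.2f}")
```

In this construction, the depth-dependent pattern of correlations is what would add directionality to an otherwise undirected region-to-region correlation.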

Such a strategy can be taken further to bridge multimodal signals observed with fMRI and electrophysiology [26, 27]. Unlike fMRI, electrophysiology (e.g., EEG, MEG, ECoG, LFP) captures a broader spectrum of neural dynamics and separates neural components by frequency: broadband scale-free activity and narrowband oscillations [36–38]. Different frequencies are associated with distinctive cortical mechanisms [39]. Increasing evidence suggests that fast oscillations in the gamma band (30–70 Hz) reflect feedforward input, slow oscillations in the alpha (8–13 Hz) or beta (13–30 Hz) band reflect feedback input [40–42], and broadband activity reflects spiking activity [43] (Figure 2). The spectral signatures of directed network communications presumably correspond to distinct laminar profiles observed with layer fMRI [44, 45] (Figure 2c). Therefore, naturalistic stimuli set up a convenient condition for integrating distinct neural and anatomical features into a converging understanding of directed functional connectivity at the circuit level.
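The separation of broadband and narrowband components can be illustrated with a deliberately simplified sketch. Note that this is not IRASA [36]: it swaps in a plain straight-line fit to the log-log spectrum, which is only a rough stand-in. The signal is synthetic (1/f noise plus a 10 Hz oscillation), and the function names, fit range, and excluded band are assumptions chosen for the example.

```python
import numpy as np

def averaged_periodogram(x, fs, seg_len):
    """Average Hann-windowed periodograms over non-overlapping segments.

    The overall scaling constant does not affect the slope or the residuals
    computed below.
    """
    n_seg = len(x) // seg_len
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    segs = segs - segs.mean(axis=1, keepdims=True)
    window = np.hanning(seg_len)
    spec = np.abs(np.fft.rfft(segs * window, axis=1)) ** 2
    psd = spec.mean(axis=0) / (fs * np.sum(window ** 2))
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, psd

def split_broadband_narrowband(freqs, psd, fit_range=(2.0, 100.0),
                               exclude=((7.0, 14.0),)):
    """Fit a line to the log-log spectrum outside the excluded bands.

    Returns the fitted slope (broadband 1/f-like trend) and the residual
    log10 power above that trend (narrowband components), NaN outside
    fit_range.
    """
    in_range = (freqs >= fit_range[0]) & (freqs <= fit_range[1])
    keep = in_range.copy()
    for lo, hi in exclude:
        keep &= ~((freqs >= lo) & (freqs <= hi))
    slope, intercept = np.polyfit(np.log10(freqs[keep]), np.log10(psd[keep]), 1)
    residual = np.full_like(psd, np.nan)
    trend = slope * np.log10(freqs[in_range]) + intercept
    residual[in_range] = np.log10(psd[in_range]) - trend
    return slope, residual

# Synthetic signal: white noise shaped to a 1/f power spectrum, plus a
# 10 Hz "alpha" oscillation.
rng = np.random.default_rng(0)
fs, n = 250.0, 15000
spectrum = np.fft.rfft(rng.standard_normal(n))
f = np.fft.rfftfreq(n, d=1.0 / fs)
spectrum[1:] /= np.sqrt(f[1:])   # amplitude ~ f^(-1/2), so power ~ 1/f
spectrum[0] = 0.0
pink = np.fft.irfft(spectrum, n=n)
pink /= pink.std()
t = np.arange(n) / fs
signal = pink + np.sin(2 * np.pi * 10.0 * t)

freqs, psd = averaged_periodogram(signal, fs, seg_len=500)
slope, residual = split_broadband_narrowband(freqs, psd)
alpha_peak = residual[(freqs >= 8) & (freqs <= 13)].max()
gamma_mean = residual[(freqs >= 30) & (freqs <= 70)].mean()
print("broadband slope:", round(float(slope), 2))
print("alpha peak above trend (log10):", round(float(alpha_peak), 2))
print("gamma mean above trend (log10):", round(float(gamma_mean), 2))
```

A line fit requires choosing in advance which bands to exclude, which is one reason dedicated methods such as IRASA are preferred for real recordings.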

Relating electrophysiology to fMRI during observations of the same naturalistic stimuli can also help elucidate the mechanisms of neurovascular coupling and lay a stronger foundation for the interpretation of fMRI-observed activity and connectivity [26, 46]. Spectrally or temporally separable neural components all contribute to the fMRI signal [47, 48]. Understanding their differential contributions to the fMRI signal is critical to the interpretation of fMRI in terms of neuronal input vs. output and excitation vs. inhibition [49]. Whereas simultaneous acquisition of neural and fMRI signals is technically challenging and ethically constrained, the use of the same naturalistic stimuli makes it more feasible to compare neural and fMRI signals collected from different sessions and subjects, especially when neural signals are recorded invasively from human brains [46]. In this regard, Park and colleagues [27] addressed the relationship between single-unit activity within a voxel and the fMRI activity throughout the brain, attempting to bridge perhaps the largest gap in both spatial and temporal scales between electrophysiology and fMRI. Comparing or integrating data across different scales is not trivial. At finer spatial scales, a one-to-one correspondence between units or voxels across different datasets becomes less likely. It is perhaps more reasonable to expect a correspondence in representation: a virtual readout of a multivariate pattern through either a linear or shallow nonlinear function that analytically resembles how a neuron reads out the pattern of its input neurons through synaptic connectivity. To define such a readout function, it is also necessary to separate data into independent sets for training, cross-validation, and testing to avoid circular analysis.
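A linear readout with independent training, validation, and test sets might be sketched as follows. This is a minimal example under stated assumptions: the "source" and "target" patterns are synthetic, ridge regression is one possible choice of linear readout rather than the method of any cited study, and the names `ridge_fit` and `mean_column_correlation` are hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, lam):
    """Closed-form ridge regression weights, shape (n_features, n_targets).

    No intercept is fitted; the synthetic data below are zero-mean.
    """
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

def mean_column_correlation(Y_true, Y_pred):
    """Average Pearson correlation across target columns."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    r = (yt * yp).sum(axis=0) / np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return float(r.mean())

# Synthetic data: a source pattern (e.g., many units in one dataset) and a
# target pattern (e.g., a few signals in another dataset) that is a noisy
# linear readout of the source.
rng = np.random.default_rng(2)
n_time, n_source, n_target = 1000, 50, 5
X = rng.standard_normal((n_time, n_source))
W_true = rng.standard_normal((n_source, n_target))
Y = X @ W_true + 5.0 * rng.standard_normal((n_time, n_target))

# Contiguous, non-overlapping time blocks for training, validation, testing.
train, val, test = slice(0, 600), slice(600, 800), slice(800, 1000)

# The regularization strength is chosen on the validation block only.
lambdas = [0.1, 1.0, 10.0, 100.0, 1000.0]
val_scores = [
    mean_column_correlation(Y[val], X[val] @ ridge_fit(X[train], Y[train], lam))
    for lam in lambdas
]
best_lam = lambdas[int(np.argmax(val_scores))]

# The final score uses the test block, which played no role in fitting
# the weights or in choosing the regularization.
W = ridge_fit(X[train], Y[train], best_lam)
test_r = mean_column_correlation(Y[test], X[test] @ W)
print("selected lambda:", best_lam, " held-out correlation:", round(test_r, 3))
```

Splitting by contiguous time blocks, rather than by randomly shuffled samples, is a deliberate choice here: real neural and hemodynamic signals are temporally autocorrelated, so neighboring samples placed in different sets would leak information.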

Naturalistic stimuli usually involve neural processing within and across multiple functional systems in the human brain (Figure 1b). For example, reading vs. listening to a story activates different unimodal sensory areas but engages the same transmodal areas selective to the linguistic content regardless of the sensory pathway used to convey it [50, 51]. Likewise, music imagery and perception activate a common set of cortical regions underlying the same mental processes even in the absence of bottom-up sensory input [19, 52]. With naturalistic paradigms, it is plausible to evaluate the time-locked synchrony of brain responses between different cortical locations or systems across different perceptual or cognitive conditions.

Bridging brain data to human behavior through computational models

Brain responses to naturalistic stimuli are often analyzed with non-parametric and model-free methods [9, 53]. Despite their simplicity, such analyses can only lead to observational findings. One can answer which brain regions are activated by the stimuli but cannot address how the stimuli are processed through the activated regions in order to arrive at the resulting perceptual or behavioral outcomes. An emerging approach that addresses these issues is to use deep neural networks to model neural information processing that bridges stimuli to behaviors and explains brain responses given naturalistic stimuli.

The past decade has witnessed great strides in deep neural networks for artificial intelligence [54], allowing machines to perform perceptual and cognitive tasks sometimes even better than humans do. Such networks are inspired by biological systems and are increasingly used as brain models [55, 56]. These models stack many layers of neurons into a hierarchical system and are usually trained with a large number of natural images, speech recordings, videos, texts, etc. The trained models can unpack naturalistic stimuli down to hierarchical features and predict human-level behaviors.

For vision, the most commonly used neural networks are feed-forward only, passing information bottom-up and layer by layer until the representation at the top layer is readily usable to define a perceptually driven concept. For example, convolutional neural networks (CNN) have been used to model brain responses to naturalistic videos [10, 57] (Figure 3). From videos, CNNs extract features and organize them by layers at an increasing level of abstraction. The model-extracted hierarchical feature representations can explain the brain’s hierarchical cortical representations of the same videos [10, 58, 59] (Figure 3b). As a “digital mirror” of the brain, such models can be applied to decode fMRI scans and reconstruct the video input and/or describe its semantic content [10, 60, 61] (Figure 3d). These models are, however, imperfect and notably different from the brain, especially in their lack of a mechanism for feedback, lateral, or recurrent computation. As such, off-the-shelf CNNs are unable to fully encode or decode brain activity. More biologically plausible models are emerging [62–65] to incorporate top-down prediction and recurrent computation. Promising directions are, for example, to build network models that include the ventral and dorsal visual streams, account for the selectivity and bias in retinal sampling, separate inference for foveal vs. peripheral input, predict eye movement, and yield time-varying percepts in degraded, multi-stable, or illusory conditions.

Figure 3.


Using deep neural networks to encode and decode fMRI signals from human subjects watching natural videos. a) A deep convolutional neural network (CNN) serves as a model of feedforward processing in the human visual system [10]. For brain encoding (blue arrows), the CNN extracts multiple layers of visual features from every video frame and a linear encoding model uses the feature representations to explain or predict the fMRI response at every cortical location. For brain decoding (purple arrows), the decoding model combines the fMRI responses across locations to reconstruct the low-level visual feature and categorize the semantic content in the video. b) The encoding models reveal the hierarchical organization of the visual cortex. Different layers of visual features differentially explain the fMRI responses at individual locations. The lower to higher layers in the CNN progressively map onto the striate to extrastriate cortex. c) Both the CNN and the visual cortex encode selective representations specific to human faces. The FFA in the brain (purple curve) and the ‘Face’ unit in the CNN (blue curve) exhibit the same response dynamics during the video, showing peak responses occurring when the video shows human faces. d) This figure shows the visual reconstruction from cortical responses through the decoder of a variational autoencoder model [60]. The top row shows the original video frames in the natural movie stimuli. The second row shows the visual reconstruction based on the decoding model trained and tested with the same subject’s brain. The third row shows the reconstruction performance, where the decoding model is trained on one subject but tested on a different subject. Images in this figure are modified from prior publications with permission.
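The layer-wise encoding analysis described in Figure 3a and 3b can be sketched in a few steps: convolve each feature time series with a hemodynamic response function, sample at the fMRI repetition time, fit a linear model per layer, and ask which layer best predicts a held-out response. The sketch below makes several loud assumptions. Random numbers stand in for CNN features, the double-gamma response shape is a commonly used form that may differ from what the cited studies used, ridge regression is one possible linear encoding model, and all function names are hypothetical.

```python
import math
import numpy as np

def canonical_hrf(dt, duration=30.0):
    """Double-gamma response shape (an assumed, commonly used form)."""
    t = np.arange(0.0, duration, dt)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()

def features_to_bold(features, dt, tr):
    """Convolve each feature column with the HRF, then sample every TR."""
    hrf = canonical_hrf(dt)
    n = features.shape[0]
    convolved = np.column_stack(
        [np.convolve(features[:, j], hrf)[:n] for j in range(features.shape[1])]
    )
    return convolved[:: int(round(tr / dt))]

def ridge_fit(X, y, lam):
    """Closed-form ridge regression weights (no intercept)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def layer_scores(layer_regressors, bold, n_train, lam=1.0):
    """Held-out prediction accuracy of each layer for one voxel."""
    scores = []
    for X in layer_regressors:
        w = ridge_fit(X[:n_train], bold[:n_train], lam)
        pred = X[n_train:] @ w
        scores.append(np.corrcoef(pred, bold[n_train:])[0, 1])
    return np.array(scores)

# Synthetic stand-ins for features from a low and a high CNN layer,
# sampled every 0.5 s over a 20-minute video.
rng = np.random.default_rng(3)
dt, tr, n_frames = 0.5, 2.0, 2400
low_layer = rng.standard_normal((n_frames, 10))
high_layer = rng.standard_normal((n_frames, 10))
regressors = [features_to_bold(low_layer, dt, tr),
              features_to_bold(high_layer, dt, tr)]
n_volumes = regressors[0].shape[0]

# Two synthetic voxels: one driven by the low layer, one by the high layer.
early_voxel = regressors[0] @ rng.standard_normal(10) + 0.3 * rng.standard_normal(n_volumes)
late_voxel = regressors[1] @ rng.standard_normal(10) + 0.3 * rng.standard_normal(n_volumes)

n_train = int(0.75 * n_volumes)
early_scores = layer_scores(regressors, early_voxel, n_train)
late_scores = layer_scores(regressors, late_voxel, n_train)
print("early voxel, best layer index:", int(np.argmax(early_scores)))
print("late voxel, best layer index: ", int(np.argmax(late_scores)))
```

Assigning each cortical location the layer with the highest held-out accuracy is one simple way to visualize a hierarchy; with real data, the number of features per layer is far larger and typically requires dimensionality reduction and careful regularization.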

Beyond vision, neural network models are also beginning to demonstrate their ability to explain cortical responses during natural language processing. For example, shallow or deep neural networks trained with a large corpus of texts are able to convert thousands of words into vectors embedded in a continuous feature space [66, 67]. This echoes the notion of a continuous semantic space that the brain encodes in order to organize or create concepts [18, 68]. Using word embedding models to explain fMRI responses during narrative story comprehension, findings from a recent study shed light on distributed cortical networks that not only store individual concepts but also infer relations between concepts [18]. Language models are also demonstrated to be able to decode sentence meanings from brain scans [69] or decode text from ECoG recordings [70].
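The idea of a continuous semantic space can be illustrated with a toy example. The four-dimensional vectors below are hand-made for illustration and are not the output of any trained model. Expressing a relation as the difference between two word vectors is one common convention in the word-embedding literature and is used here as an assumption, not as a description of the cited studies.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hand-made toy embeddings (hypothetical values, not from a trained model).
embedding = {
    "hand":   np.array([1.0, 0.2, 0.0, 0.1]),
    "finger": np.array([1.0, 0.2, 0.9, 0.2]),
    "tree":   np.array([0.1, 1.0, 0.0, 0.0]),
    "leaf":   np.array([0.2, 1.0, 0.8, 0.1]),
    "dog":    np.array([0.0, 0.3, 0.1, 1.0]),
}

def relation(word_a, word_b):
    """Represent the relation from word_a to word_b as a vector difference."""
    return embedding[word_b] - embedding[word_a]

# Two whole-to-part pairs should share a similar relation vector, whereas an
# unrelated pair should not.
same_relation = cosine(relation("hand", "finger"), relation("tree", "leaf"))
different_relation = cosine(relation("hand", "finger"), relation("tree", "dog"))
print("whole-part vs whole-part:", round(same_relation, 2))
print("whole-part vs unrelated: ", round(different_relation, 2))
```

In an encoding analysis, word vectors or relation vectors of this kind would serve as the regressors that a linear model maps onto the fMRI response at each cortical location.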

Whereas models of neural computation mostly separate sensory perception and semantic cognition, the brain uses continuous neural processing to ground cognition in perception and action [71, 72]. This gap remains to be filled with multimodal neural network models that first use lower layers to perform neural information processing for sensory and/or motor modalities and then fuse information at higher layers for conceptual abstraction. Such models are expected to be closer to how embodied cognition emerges in the brain.

Brain-inspired models alongside functional brain imaging or recording with naturalistic video and speech stimuli will advance artificial intelligence and neuroscience in synergy. On one hand, computational models are used to substantiate and test neuroscientific hypotheses against brain responses and human behaviors, and thus support progress in neuroscience. On the other hand, brain and behavioral data during naturalistic stimuli can test models for biological plausibility and thus advance artificial intelligence towards human intelligence. With the increasing popularity of naturalistic paradigms, the synergy between artificial intelligence and neuroscience is booming at an unprecedented speed.

Challenges and opportunities

A unique opportunity is to use a common set of naturalistic stimuli for collaborative efforts in functional imaging and recording across different modalities, subjects, experiments, and labs. The notion of big data has been increasingly embraced and leveraged in large-scale projects such as the Human Connectome Project [73]. The HCP has collected human fMRI data with natural audiovisual stimuli in addition to a much larger set of resting state and task fMRI data. There is a lack of large-scale efforts to extend the naturalistic paradigm to other spatial or temporal scales inaccessible with standardized fMRI protocols.

Of particular interest are intracranial neural recordings attainable from patients with electrodes implanted for clinical diagnosis or surgical planning. The electrode implantation requires invasive surgical procedures and is only justifiable by clinical needs. In each patient, the implanted electrodes often sample limited regions and are unable to cover the whole brain. To optimally utilize intracranial neural recordings, it is desirable to use naturalistic stimuli for high-throughput characterization of many functional domains. The use of the same naturalistic stimuli induces reproducible neural responses and allows for multiplicative combinations of the signals recorded at different brain locations from different patients (Figure 1).

Paradigms with naturalistic stimuli are well suited to assemble “big data” resources [74, 75] and promote interdisciplinary collaborations. Psychologists and neuroscientists design the stimuli to diversely and inclusively sample functional domains. Imaging scientists push the resolution to image the whole brain at a fast speed or selected regions for fine spatial details. Neurophysiologists record neural activity with implanted electrodes or non-invasive sensors. Cognitive scientists evaluate perceptual, behavioral, emotional, and cognitive outcomes. Computational scientists design models of neural information processing to link stimuli to behaviors. Brought together in a collaborative ecosystem, these multidisciplinary efforts can be synergized to bridge brain activity across mesoscopic and macroscopic scales, bridge the gap between neural and vascular measurements, link neural signals to neural computation, and connect the brain to behavior. Such combinatorial power will be the unique premise for using naturalistic paradigms for experimental and computational neuroscience.

Challenges also come along with this opportunity. The neuroinformatic infrastructure and open science culture are rapidly growing [76], but remain inadequate to fully support multi-modal and multi-site data sharing and harmonization, as well as advanced analytical methods such as deep learning. Software for data standardization, analysis, and integration awaits community-wide coordination and adoption to ensure reproducibility and scalability. Although highly reliable and reproducible, brain responses to the same stimuli are not strictly identical across subjects. Individual variations are inevitable and of interest for basic science and clinical research. Initial progress has been made to characterize individual variation during naturalistic stimuli [77], while new techniques show initial promise to align data, representations, or models across individuals or datasets [78–80]. Despite these challenges, it is foreseeable that experimental paradigms with naturalistic stimuli are at the next frontier of imaging, recording, and modeling the brain in a realistic, complex, dynamic environment.
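One well-known way to align representations across individuals is an orthogonal Procrustes transformation estimated from responses to a shared stimulus. It is named here only as an illustration and is not claimed to be the technique used in the cited work. The sketch below uses synthetic data in which two "subjects" carry the same response pattern expressed in differently rotated feature spaces; all names are hypothetical.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Orthogonal matrix R that minimizes ||source @ R - target||_F.

    source, target: arrays of shape (n_timepoints, n_dimensions).
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# Synthetic data: subject B carries the same stimulus-driven pattern as
# subject A, but in a rotated feature space and with added noise.
rng = np.random.default_rng(4)
n_time, n_dim = 400, 10
shared = rng.standard_normal((n_time, n_dim))
true_rotation, _ = np.linalg.qr(rng.standard_normal((n_dim, n_dim)))
subject_a = shared
subject_b = shared @ true_rotation + 0.1 * rng.standard_normal((n_time, n_dim))

# Estimate the alignment on one half of the data and evaluate it on the
# other half, so that the evaluation is not circular.
fit, held_out = slice(0, 200), slice(200, 400)
R = procrustes_rotation(subject_b[fit], subject_a[fit])

error_before = np.linalg.norm(subject_b[held_out] - subject_a[held_out])
error_after = np.linalg.norm(subject_b[held_out] @ R - subject_a[held_out])
print("mismatch before alignment:", round(float(error_before), 1))
print("mismatch after alignment: ", round(float(error_after), 1))
```

Restricting the transformation to a rotation preserves the geometry of each subject's representation, which is a design choice; less constrained linear maps fit more flexibly but risk overfitting.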

Acknowledgements

The work reported in this article was supported in part by NIH grants (R01 MH104402, R00 DC013828) and the University of Michigan. We are grateful for the constructive comments from two anonymous reviewers.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Conflicts of Interest

The authors declare no conflict of interest.

Given his role as a guest editor, Zhongming Liu had no involvement in the peer review of this article and has no access to information regarding its peer review. Full responsibility for the editorial process for this article was delegated to Bin He.

References

  • [1].Kanwisher N, McDermott J, and Chun MM, “The fusiform face area: a module in human extrastriate cortex specialized for face perception,” Journal of neuroscience, vol. 17, no. 11, pp. 4302–4311, 1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [2].Meyers EM, Freedman DJ, Kreiman G, Miller EK, and Poggio T, “Dynamic population coding of category information in inferior temporal and prefrontal cortex,” Journal of neurophysiology, vol. 100, no. 3, pp. 1407–1419, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].Fox MD and Raichle ME, “Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging,” Nature reviews neuroscience, vol. 8, no. 9, pp. 700–711,2007. [DOI] [PubMed] [Google Scholar]
  • [4].Chang C et al. , “Tracking brain arousal fluctuations with fMRI,” Proceedings of the National Academy of Sciences, vol. 113, no. 16, pp. 4518–4523, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [5].Özbay PS et al. , “Sympathetic activity contributes to the fMRI signal,” Communications biology, vol. 2, no. 1, pp. 1–9, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [6].Gonzalez-Castillo J, Kam JW, Hoy CW, and Bandettini PA, “How to Interpret Resting-State fMRI: Ask Your Participants,” Journal of Neuroscience, vol. 41, no. 6, pp. 1130–1141,2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].Biswal BB et al. , “Toward discovery science of human brain function,” Proceedings of the National Academy of Sciences, vol. 107, no. 10, pp. 4734–4739, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Sonkusare S, Breakspear M, and Guo C, “Naturalistic stimuli in neuroscience: critically acclaimed,” Trends in cognitive sciences, vol. 23, no. 8, pp. 699–714, 2019. [DOI] [PubMed] [Google Scholar]; * This is a review paper on the naturalistic paradigm, summarizing various forms of naturalistic stimuli, common analysis methods, and new insights into brain functions. The authors also discuss the potential clinical application with naturalistic paradigms.
  • [9].Hasson U, Nir Y, Levy I, Fuhrmann G, and Malach R, “Intersubject synchronization of cortical activity during natural vision,” science, vol. 303, no. 5664, pp. 1634–1640, 2004. [DOI] [PubMed] [Google Scholar]
  • [10].Wen H, Shi J, Zhang Y, Lu K-H, Cao J, and Liu Z, “Neural encoding and decoding with deep learning for dynamic natural vision,” Cerebral Cortex, vol. 28, no. 12, pp. 4136–4160, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Marussich L, Lu K-H, Wen H, and Liu Z, “Mapping white-matter functional organization at rest and during naturalistic visual perception,” Neuroimage, vol. 146, pp. 1128–1141,2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [12].Simony E et al. , “Dynamic reconfiguration of the default mode network during narrative comprehension,” Nature communications, vol. 7, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [13].Lynch LK, Lu KH, Wen H, Zhang Y, Saykin AJ, and Liu Z, “Task- evoked functional connectivity does not explain functional connectivity differences between rest and task conditions,” Human brain mapping, vol. 39, no. 12, pp. 4939–4948, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [14].Hasson U, Malach R, and Heeger DJ, “Reliability of cortical activity during natural stimulation,” Trends in cognitive sciences, vol. 14, no. 1, pp. 40–48, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [15].Lu K-H, Hung S-C, Wen H, Marussich L, and Liu Z, “Influences of High-Level Features, Gaze, and Scene Transitions on the Reliability of BOLD Responses to Natural Movie Stimuli,” PloS one, vol. 11, no. 8, p. e0161797, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [16].de Heer WA, Huth AG, Griffiths TL, Gallant JL, and Theunissen FE, “The hierarchical cortical organization of human speech processing,” Journal of Neuroscience, vol. 37, no. 27, pp. 6539–6557, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [17].Bhattasali S et al., “Localising memory retrieval and syntactic composition: an fMRI study of naturalistic language comprehension,” Language, Cognition and Neuroscience, vol. 34, no. 4, pp. 491–510, 2019. [Google Scholar]
  • [18].Zhang Y, Han K, Worth R, and Liu Z, “Connecting concepts in the brain by mapping cortical representations of semantic relations,” Nature communications, vol. 11, no. 1, pp. 1–13, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; ** Using hours of fMRI data collected while subjects listened to natural stories, this study shows that both the concepts themselves and the relationships between them are represented by distributed cortical networks.
  • [19].Zhang Y, Chen G, Wen H, Lu K-H, and Liu Z, “Musical imagery involves Wernicke’s area in bilateral and anti-correlated network interactions in musicians,” Scientific reports, vol. 7, no. 1, pp. 1–13, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [20].Leopold DA and Park SH, “Studying the visual brain in its natural rhythm,” Neuroimage, vol. 216, p. 116790, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; ** This review article highlights that studies involving movies and free viewing provide more opportunities than the traditional paradigms with strict experimental reductionism for investigating higher-order functions in the visual system that supports our daily visual experience.
  • [21].Puschmann S, Regev M, Baillet S, and Zatorre RJ, “MEG inter-subject phase-locking of stimulus-driven activity during naturalistic speech listening correlates with musical training,” Journal of Neuroscience, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]; * Using naturalistic speech stimuli, the authors demonstrate robust bilateral patterns of stimulus-driven phase synchronization between auditory sensory areas and higher-order processing networks, where the level of musical training has a positive relationship with the information propagation across these regions.
  • [22].Chang W-T et al., “Combined MEG and EEG show reliable patterns of electromagnetic brain activity during natural viewing,” Neuroimage, vol. 114, pp. 49–56, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [23].Honey CJ et al., “Slow cortical dynamics and the accumulation of information over long timescales,” Neuron, vol. 76, no. 2, pp. 423–434, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [24].McMahon DB, Russ BE, Elnaiem HD, Kurnikova AI, and Leopold DA, “Single-unit activity during natural vision: diversity, consistency, and spatial sensitivity among AF face patch neurons,” Journal of Neuroscience, vol. 35, no. 14, pp. 5537–5548, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [25].Bhattasali S, Brennan J, Luh W-M, Franzluebbers B, and Hale J, “The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension,” in Proceedings of The 12th Language Resources and Evaluation Conference, 2020, pp. 120–125. [Google Scholar]; * This article describes the Alice Datasets, which are ecologically valid datasets with fMRI data (29 subjects) and EEG data (52 subjects) collected while subjects listened to the same naturalistic story.
  • [26].Haufe S et al., “Elucidating relations between fMRI, ECoG, and EEG through a common natural stimulus,” Neuroimage, vol. 179, pp. 79–91, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [27].Park SH, Russ BE, McMahon DB, Koyano KW, Berman RA, and Leopold DA, “Functional subpopulations of neurons in a macaque face patch revealed by single-unit fMRI mapping,” Neuron, vol. 95, no. 4, pp. 971–981.e5, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [28].Coen-Cagli R, Kohn A, and Schwartz O, “Flexible gating of contextual influences in natural vision,” Nature neuroscience, vol. 18, no. 11, pp. 1648–1655, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [29].Hasson U, Chen J, and Honey CJ, “Hierarchical process memory: memory as an integral component of information processing,” Trends in cognitive sciences, vol. 19, no. 6, pp. 304–313, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [30].He B, Sohrabpour A, Brown E, and Liu Z, “Electrophysiological source imaging: a noninvasive window to brain dynamics,” Annual review of biomedical engineering, vol. 20, pp. 171–196, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [31].De Martino F, Moerel M, Ugurbil K, Goebel R, Yacoub E, and Formisano E, “Frequency preference and attention effects across cortical depths in the human primary auditory cortex,” Proceedings of the National Academy of Sciences, vol. 112, no. 52, pp. 16036–16041, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [32].Huber L et al., “High-resolution CBV-fMRI allows mapping of laminar activity and connectivity of cortical input and output in human M1,” Neuron, vol. 96, no. 6, pp. 1253–1263.e7, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [33].Felleman DJ and Van Essen DC, “Distributed hierarchical processing in the primate cerebral cortex,” Cerebral cortex (New York, NY: 1991), vol. 1, no. 1, pp. 1–47, 1991. [DOI] [PubMed] [Google Scholar]
  • [34].Huber L et al., “Layer-dependent functional connectivity methods,” Progress in Neurobiology, p. 101835, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; ** This article provides an overview of analysis procedures for mapping layer-specific directional (feedforward and feedback) functional connectivity in hierarchical brain networks. The authors also discuss the prospect of using layer fMRI to bridge brain activity between circuits and networks.
  • [35].Wilf M, Strappini F, Golan T, Hahamy A, Harel M, and Malach R, “Spontaneously emerging patterns in human visual cortex reflect responses to naturalistic sensory stimuli,” Cerebral cortex, vol. 27, no. 1, pp. 750–763, 2017. [DOI] [PubMed] [Google Scholar]
  • [36].Wen H and Liu Z, “Separating fractal and oscillatory components in the power spectrum of neurophysiological signal,” Brain topography, vol. 29, no. 1, pp. 13–26, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [37].Wen H and Liu Z, “Broadband electrophysiological dynamics contribute to global resting-state fMRI signal,” Journal of Neuroscience, vol. 36, no. 22, pp. 6030–6040, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [38].Donoghue T et al., “Parameterizing neural power spectra into periodic and aperiodic components,” Nature neuroscience, vol. 23, no. 12, pp. 1655–1665, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [39].Buzsáki G and Draguhn A, “Neuronal oscillations in cortical networks,” Science, vol. 304, no. 5679, pp. 1926–1929, 2004. [DOI] [PubMed] [Google Scholar]
  • [40].Van Kerkoerle T et al., “Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex,” Proceedings of the National Academy of Sciences, vol. 111, no. 40, pp. 14332–14341, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [41].Bastos AM et al., “Visual areas exert feedforward and feedback influences through distinct frequency channels,” Neuron, vol. 85, no. 2, pp. 390–401, 2015. [DOI] [PubMed] [Google Scholar]
  • [42].Michalareas G, Vezoli J, Van Pelt S, Schoffelen J-M, Kennedy H, and Fries P, “Alpha-beta and gamma rhythms subserve feedback and feedforward influences among human visual cortical areas,” Neuron, vol. 89, no. 2, pp. 384–397, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [43].Manning JR, Jacobs J, Fried I, and Kahana MJ, “Broadband shifts in local field potential power spectra are correlated with single-neuron spiking in humans,” Journal of Neuroscience, vol. 29, no. 43, pp. 13613–13620, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [44].Bastos AM, Usrey WM, Adams RA, Mangun GR, Fries P, and Friston KJ, “Canonical microcircuits for predictive coding,” Neuron, vol. 76, no. 4, pp. 695–711, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [45].Scheeringa R, Koopmans PJ, van Mourik T, Jensen O, and Norris DG, “The relationship between oscillatory EEG activity and the laminar-specific BOLD signal,” Proceedings of the National Academy of Sciences, vol. 113, no. 24, pp. 6761–6766, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [46].Mukamel R, Gelbard H, Arieli A, Hasson U, Fried I, and Malach R, “Coupling between neuronal firing, field potentials, and FMRI in human auditory cortex,” Science, vol. 309, no. 5736, pp. 951–954, 2005. [DOI] [PubMed] [Google Scholar]
  • [47].Mantini D, Perrucci MG, Del Gratta C, Romani GL, and Corbetta M, “Electrophysiological signatures of resting state networks in the human brain,” Proceedings of the National Academy of Sciences, vol. 104, no. 32, pp. 13170–13175, 2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [48].Liu Z, de Zwart JA, Chang C, Duan Q, van Gelderen P, and Duyn JH, “Neuroelectrical decomposition of spontaneous brain activity measured with functional magnetic resonance imaging,” Cerebral Cortex, vol. 24, no. 11, pp. 3080–3089, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [49].Logothetis NK, “What we can do and what we cannot do with fMRI,” Nature, vol. 453, no. 7197, pp. 869–878, 2008. [DOI] [PubMed] [Google Scholar]
  • [50].Regev M, Honey CJ, Simony E, and Hasson U, “Selective and invariant neural responses to spoken and written narratives,” Journal of Neuroscience, vol. 33, no. 40, pp. 15978–15988, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [51].Deniz F, Nunez-Elizalde AO, Huth AG, and Gallant JL, “The representation of semantic information across human cerebral cortex during listening versus reading is invariant to stimulus modality,” Journal of Neuroscience, vol. 39, no. 39, pp. 7722–7736, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This study compares the representation of semantic information when subjects listen to or read the same story. It suggests that, under the naturalistic paradigm, high-order information processing is to some extent invariant to the sensory modality of the stimuli.
  • [52].Dijkstra N, Bosch SE, and van Gerven MA, “Shared neural mechanisms of visual perception and imagery,” Trends in cognitive sciences, vol. 23, no. 5, pp. 423–434, 2019. [DOI] [PubMed] [Google Scholar]
  • [53].Simony E et al., “Dynamic reconfiguration of the default mode network during narrative comprehension,” Nature communications, vol. 7, p. 12141, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [54].LeCun Y, Bengio Y, and Hinton G, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015. [DOI] [PubMed] [Google Scholar]
  • [55].Yamins DL and DiCarlo JJ, “Using goal-driven deep learning models to understand sensory cortex,” Nature neuroscience, vol. 19, no. 3, pp. 356–365, 2016. [DOI] [PubMed] [Google Scholar]
  • [56].Richards BA et al., “A deep learning framework for neuroscience,” Nature neuroscience, vol. 22, no. 11, pp. 1761–1770, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This article discusses how the learning goal, learning rules, and network architecture in the optimization-based deep learning framework can provide theoretical insights and drive experimental progress in neuroscience.
  • [57].Güçlü U and van Gerven MA, “Increasingly complex representations of natural movies across the dorsal stream are shared between subjects,” NeuroImage, vol. 145, pp. 329–336, 2017. [DOI] [PubMed] [Google Scholar]
  • [58].Shi J, Wen H, Zhang Y, Han K, and Liu Z, “Deep recurrent neural network reveals a hierarchy of process memory during dynamic natural vision,” Human brain mapping, vol. 39, no. 5, pp. 2269–2282, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [59].Wen H, Shi J, Chen W, and Liu Z, “Deep residual network predicts cortical representation and organization of visual features for rapid categorization,” Scientific reports, vol. 8, no. 1, pp. 1–17, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [60].Han K et al., “Variational autoencoder: An unsupervised model for encoding and decoding fMRI activity in visual cortex,” NeuroImage, vol. 198, pp. 125–136, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This study highlights a variational autoencoder as an unsupervised model of the “Bayesian brain” for learning hierarchical representations in the human visual system. It shows that this model can both predict fMRI responses to natural videos and reconstruct naturalistic visual experiences from brain scans.
  • [61].Nishida S and Nishimoto S, “Decoding naturalistic experiences from human brain activity via distributed representations of words,” Neuroimage, vol. 180, pp. 232–242, 2018. [DOI] [PubMed] [Google Scholar]
  • [62].Lotter W, Kreiman G, and Cox D, “Deep predictive coding networks for video prediction and unsupervised learning,” arXiv preprint arXiv:1605.08104, 2016. [Google Scholar]
  • [63].Wen H, Han K, Shi J, Zhang Y, Culurciello E, and Liu Z, “Deep predictive coding network for object recognition,” in International Conference on Machine Learning, 2018: PMLR, pp. 5266–5275. [Google Scholar]
  • [64].Kubilius J et al., “Brain-like object recognition with high-performing shallow recurrent ANNs,” arXiv preprint arXiv:1909.06161, 2019. [Google Scholar]
  • [65].van Bergen RS and Kriegeskorte N, “Going in circles is the way forward: the role of recurrence in visual inference,” Current Opinion in Neurobiology, vol. 65, pp. 176–193, 2020. [DOI] [PubMed] [Google Scholar]; * This article emphasizes the insight that deep feed-forward neural networks are a special case of recurrent neural networks unrolled in time. By discussing the role of recurrence in visual inference, the authors suggest that introducing recurrence into computational models can help us better understand biological vision.
  • [66].Mikolov T, Sutskever I, Chen K, Corrado GS, and Dean J, “Distributed representations of words and phrases and their compositionality,” in Advances in neural information processing systems, 2013, pp. 3111–3119. [Google Scholar]
  • [67].Devlin J, Chang M-W, Lee K, and Toutanova K, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018. [Google Scholar]
  • [68].Huth AG, de Heer WA, Griffiths TL, Theunissen FE, and Gallant JL, “Natural speech reveals the semantic maps that tile human cerebral cortex,” Nature, vol. 532, no. 7600, p. 453, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [69].Pereira F et al., “Toward a universal decoder of linguistic meaning from brain activation,” Nature communications, vol. 9, no. 1, p. 963, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [70].Sun P, Anumanchipalli GK, and Chang EF, “Brain2Char: a deep architecture for decoding text from brain recordings,” Journal of Neural Engineering, vol. 17, no. 6, p. 066015, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; ** This study develops a novel deep network architecture for directly decoding text from electrocorticography (ECoG) recordings when subjects read sentences on a screen. It shows potential for high-performance brain-computer interfaces for natural language communication.
  • [71].Pulvermüller F, “How neurons make meaning: brain mechanisms for embodied and abstract-symbolic semantics,” Trends in cognitive sciences, vol. 17, no. 9, pp. 458–470, 2013. [DOI] [PubMed] [Google Scholar]
  • [72].Martin A, “GRAPES—Grounding representations in action, perception, and emotion systems: How object properties and categories are represented in the human brain,” Psychonomic bulletin & review, vol. 23, no. 4, pp. 979–990, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [73].Van Essen DC et al., “The WU-Minn human connectome project: an overview,” Neuroimage, vol. 80, pp. 62–79, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [74].DuPre E, Hanke M, and Poline J-B, “Nature abhors a paywall: How open science can realize the potential of naturalistic stimuli,” Neuroimage, vol. 216, p. 116330, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This article highlights the need and value of data sharing in the field of neuroscience studies with naturalistic stimuli.
  • [75].Aliko S, Huang J, Gheorghiu F, Meliss S, and Skipper JI, “A naturalistic neuroimaging database for understanding the brain using ecological stimuli,” Scientific Data, vol. 7, no. 1, pp. 1–21, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This paper introduces a public dataset with high-quality fMRI data and behavioral assessment from 86 participants watching 10 full-length movies.
  • [76].Poldrack RA and Gorgolewski KJ, “Making big data open: data sharing in neuroimaging,” Nature neuroscience, vol. 17, no. 11, pp. 1510–1517, 2014. [DOI] [PubMed] [Google Scholar]
  • [77].Finn ES et al., “Idiosynchrony: From shared responses to individual differences during naturalistic neuroimaging,” NeuroImage, vol. 215, p. 116828, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [78].Haxby JV, Guntupalli JS, Nastase SA, and Feilong M, “Hyperalignment: Modeling shared information encoded in idiosyncratic cortical topographies,” Elife, vol. 9, p. e56601, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]; * This article provides an overview of hyperalignment, which models information shared across brains by aligning individual data to a common, high-dimensional space. It highlights the usage of rich, naturalistic stimuli in estimating the transformation matrices in this framework.
  • [79].Wen H, Shi J, Chen W, and Liu Z, “Transferring and generalizing deep-learning-based neural encoding models across subjects,” NeuroImage, vol. 176, pp. 152–163, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [80].Khosla M, Ngo GH, Jamison K, Kuceyeski A, and Sabuncu MR, “A shared neural encoding model for the prediction of subject-specific fMRI response,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020: Springer, pp. 539–548. [Google Scholar]
