Author manuscript; available in PMC: 2022 Feb 1.
Published in final edited form as: Stat Sci. 2021 Feb;36(1):89–108. doi: 10.1214/20-sts792

Network Modeling in Biology: Statistical Methods for Gene and Brain Networks

Y X Rachel Wang, Lexin Li, Jingyi Jessica Li, Haiyan Huang
PMCID: PMC8296984  NIHMSID: NIHMS1636819  PMID: 34305304

Abstract

The rise of network data in many different domains has offered researchers new insight into the problem of modeling complex systems and propelled the development of numerous innovative statistical methodologies and computational tools. In this paper, we primarily focus on two types of biological networks, gene networks and brain networks, where statistical network modeling has found both fruitful and challenging applications. Unlike other network examples such as social networks where network edges can be directly observed, both gene and brain networks require careful estimation of edges using covariates as a first step. We provide a discussion on existing statistical and computational methods for edge estimation and subsequent statistical inference problems in these two types of biological networks.

Keywords: gene regulatory networks, brain connectivity networks, network reconstruction, network inference

1. INTRODUCTION

Network structures exist everywhere in biology as many biological systems function via complex interactions among their individual components. In ecosystems, species interact in a number of different forms which are central to maintaining biodiversity, the most common being predator-prey relationships. In the human brain, neurons communicate by passing electric and chemical signals through synapses. At the cellular level, DNA, RNA, proteins and other molecules participate in a variety of biochemical reactions that determine the inner workings of a cell. Networks offer a succinct mathematical representation of these systems, with “sets of items, which we will call vertices or sometimes nodes, with connections between them, called edges” [150].

Network modeling has been successfully applied in many settings where the biological questions of interest have their counterparts in graph theory. For example, many biochemical networks have a scale-free topology with a few highly connected nodes [14], known as hubs in network analysis, which may correspond to key enzymes in biochemical processes. Another key goal in network analysis is to detect communities, which are groups of tightly connected nodes. These could be genes with related functionalities, or regions of the brain with coordinated actions.

From the statistical point of view, another reason why biological systems are particularly amenable to network analysis lies in the data collection process. Rapid advances in biological technologies mean that measurements of variables are not limited to observational settings, and extensive experimental studies can be performed to examine how variables respond under different conditions. One prominent example can be found in genomic studies, where numerous high-throughput technologies have generated a staggering amount of data measuring gene expression levels and epigenetic interactions. Another example is the study of the brain, where numerous imaging technologies have collected a wide variety of brain images measuring distinct brain characteristics, ranging from brain structure and function to numerous chemical constituents. For this reason, here we choose to discuss statistical methods for gene and brain networks, with more focus on the former.

The enormous wealth of data provides both opportunities and challenges for the analysis of the above two classes of networks. Unlike physical or social networks, interactions among genes are much harder to observe. Although experiments can be performed to search for and verify each gene-gene interaction, it is much more cost effective to infer these interactions and reconstruct network edges using statistical and computational tools on high-throughput gene expression data (more recently single cell expression data). The computational results can help narrow down possible candidates for further experimental validation. The computationally inferred networks may contain up to tens of thousands of nodes requiring efficient methods for network inference. In this article, we focus on a few specific problems involved in gene network analysis and mention other relevant applications in genomics beyond gene networks when appropriate. As another example in biology, we will also review statistical methods for constructing and analyzing brain connectivity networks.

Without claiming to be exhaustive, we will discuss the challenges in these networks considering the type and quality of data available, relevant biological questions to be addressed, and statistical and computational concerns. We highlight the success and limitations of current network modeling paradigms and statistical methodologies, and propose possible directions for future development.

2. GENE NETWORKS

Gene regulatory networks play a fundamental role in defining cell structure and function. In such a network, transcription factors (TFs), RNA and other small molecules act as regulators to activate or repress the expression levels of genes, which in turn increase or decrease the production of proteins. Thus gene-gene interactions can occur in the form of direct physical binding of proteins (TFs) to their target sequences, which can be represented as directed graphs with causal relationships. In a broader sense, gene-gene interactions may also include indirect interactions, when the expression of a gene influences that of others through one or more intermediaries, or when two genes are co-regulated and thus show similar expression profiles; these associations are generally reported as undirected graphs.

Despite all being the focus of studies in the network literature, biological networks such as gene networks differ from social networks in a few important aspects, which give rise to challenging situations for statistical modeling. Compared to relationship networks obtained from popular social media, gene networks are typically smaller in size. The former can additionally grow in size by including more users, whereas the size of gene networks is limited by the number of genes that exist in an organism and can be measured in an experiment. As will be explained in detail in Section 2.1, edges in gene networks need to be estimated from covariates. Since the measurements of these covariates rely on specific technologies, the number of samples one can take is often restricted by cost considerations and other practical constraints. Finally, since edges in these networks represent interactions between nodes, they are directly affected by the underlying dynamics in gene regulation. These biological processes are complicated in nature; gene regulatory mechanisms depend on tissue types, cellular environment, and their activities can be changed by disease state. All of these factors can give rise to challenging situations for estimating and interpreting network structures. In the following sections, we will review existing approaches in the relevant literature with these limitations on biological data in mind.

2.1. Inferring gene-gene relationships using expression data

In the past two decades, estimating gene-gene interactions has primarily relied on gene expression data, which have been made readily available in the form of microarray or RNA-seq data. Coexpression is one of the earliest concepts proposed to infer edges in a gene network and is based on the principle of “guilt by association”: genes that have similar expression profiles under different experimental conditions are likely to be co-regulated and hence functionally related. However, despite the extensive literature, many open questions remain due to the complex nature of gene interactions: in a broader sense, these coexpression relationships can be nonlinear, transient, and subject to changes depending on the cellular environment. In this section, without claiming to be exhaustive, we discuss a few main approaches for inferring gene networks that differ in their considerations of how genes behave across the given samples. More detailed reviews can be found in e.g. [226].

Pairwise coexpression measure

Given an expression matrix with p genes arranged in rows and their expression levels measured under n experimental conditions in columns, computing the coexpression between genes i and j involves choosing a suitable similarity measure for estimating the association between two vectors. The choice of the measure crucially depends on a number of factors including the nature of the interaction, experimental design, the number of samples available, and other computational concerns.

Correlation measures based on Pearson’s or Spearman’s correlation are among the most popular methods used in the literature [52, 194, 199, 233, 109]. Either hard [25] or soft thresholding [112] is then applied to produce a binary or weighted network. These correlations are easy to compute and interpret but limited in the type of pairwise association they can detect, which is linear or monotonic. When the relationship between expression vectors is more complex, one commonly used class of methods is based on mutual information (MI). MI measures the general statistical dependence between gene expression levels and is thus able to capture nonlinear relationships. In the calculation of MI, marginal and joint entropies of the expression levels can be approximated via discretization [20] or using a smoothing kernel [39, 15, 139]. Other variants including MI with background, maximal MI, and maximal information coefficient (MIC, [173]) have also been used in practice. For time-course data, techniques in time series analysis (e.g. time-frequency analysis) have been applied to improve the sensitivity of similarity measures [170, 55], often assuming explicit models for generating the observed data.
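As an illustration of the correlation-based approach, the following minimal sketch computes Pearson's and Spearman's correlations between gene rows and applies either hard or soft thresholding. The function names, the default cutoff `tau`, and the power transform `|r| ** beta` used for soft thresholding are assumptions made for this example rather than the procedure of any cited method.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors.
    Assumes neither vector is constant (otherwise the denominator is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(x):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def coexpression_network(expr, tau=0.8, beta=None, corr=pearson):
    """expr: list of p gene rows, each holding n expression values.
    beta is None -> hard thresholding: edge weight 1 iff |corr| >= tau.
    beta given   -> soft thresholding: edge weight |corr| ** beta (assumed form)."""
    p = len(expr)
    adj = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            c = abs(corr(expr[i], expr[j]))
            w = (1.0 if c >= tau else 0.0) if beta is None else c ** beta
            adj[i][j] = adj[j][i] = w
    return adj

# Toy data: gene 1 is linear in gene 0, gene 2 is monotone but nonlinear in gene 0.
expr = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 4, 9, 16], [4, 1, 3, 2]]
hard = coexpression_network(expr, tau=0.9)
soft = coexpression_network(expr, beta=6)
```

On the toy data, Spearman's correlation between genes 0 and 2 equals 1 while Pearson's is slightly below 1, which reflects the linear versus monotonic distinction made above.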

Other features intrinsic to the nature of gene interactions may create more complex situations. For instance, gene interactions may change as the intrinsic cellular state varies or only exist under a specific cellular condition [27, 254]. To detect local correlation patterns, spline regression models [159] and nonparametric methods based on comparing local expression patterns [176, 229] have been proposed. For time series data, another prevalent feature is the presence of time shifts between association patterns, reflecting the fact that regulation may take effect after a time delay. Methods for handling the time lag issue include time-shifted Pearson’s correlation [100], time-shifted expression rank pattern analysis [228], and time sequence alignment algorithms [111, 1, 70, 249].
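The idea behind a time-shifted Pearson's correlation can be sketched as follows, assuming equally spaced time points and considering only non-negative shifts (the second gene responds after the first). The function name `best_time_shift` and the rule of keeping the shift with the largest absolute correlation are illustrative assumptions, not the exact procedure of [100].

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_time_shift(x, y, max_lag):
    """Compare x[t] with y[t + s] for s = 0..max_lag and return (s, corr)
    for the shift with the largest absolute correlation."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    best_lag, best_corr = 0, pearson(x, y)
    # Keep at least three overlapping time points for each shift.
    for s in range(1, min(max_lag, len(x) - 3) + 1):
        c = pearson(x[:-s], y[s:])
        if abs(c) > abs(best_corr):
            best_lag, best_corr = s, c
    return best_lag, best_corr

# Toy series: y reproduces x with a delay of two time points.
x = [1, 3, 2, 5, 4, 7, 3, 8, 6, 9]
y = [0, 0] + x[:8]
lag, corr = best_time_shift(x, y, max_lag=3)
```

Because the maximum is taken over several shifts, the resulting correlation is optimistically biased, so any significance assessment would need to account for the search over lags.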

Partial correlation for group interactions

In a real biological pathway, a gene may interact with a group of genes but not possess a strong marginal relationship with any individual member of the group. Gaussian graphical models (GGM) offer a more realistic way to model these higher-level interactions. Assuming a multivariate normal distribution for the expression vectors for a set of genes W, this approach aims to estimate the partial correlation between genes i and j, that is, their correlation conditioning on W\{i, j}. Since the partial correlations are proportional to their corresponding entries in the inverse covariance matrix Σ⁻¹, the inference problem amounts to estimating Σ⁻¹, or the precision matrix. The major difficulty of such an estimation arises from the high dimensional nature of gene expression data, which naturally requires in-built sparsity in inference methods. A rich wealth of literature exists both in the context of gene expression analysis [119, 142, 160] and general high dimensional inference [248, 62, 256, 81] to tackle this problem.
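The relationship between the precision matrix and partial correlations can be made concrete with a small sketch. It uses the standard identity that the partial correlation between i and j given all other variables equals −Ω_ij / sqrt(Ω_ii Ω_jj), where Ω is the inverse covariance matrix. The plain matrix inversion below is an assumption suitable only for a small, well-conditioned covariance matrix; it is not one of the sparse high-dimensional estimators cited above, and the helper names are hypothetical.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[piv][col]) < 1e-12:
            raise ValueError("matrix is singular (e.g. sample covariance with p > n)")
        a[col], a[piv] = a[piv], a[col]
        d = a[col][col]
        a[col] = [v / d for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]

def partial_correlations(cov):
    """Partial correlation of each pair given all remaining variables."""
    omega = invert(cov)  # precision matrix
    p = len(omega)
    return [[1.0 if i == j else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(p)] for i in range(p)]

# Covariance of a chain X1 -> X2 -> X3 with unit-variance noise at each step.
cov = [[1, 1, 1],
       [1, 2, 2],
       [1, 2, 3]]
pc = partial_correlations(cov)
```

In this chain, X1 and X3 have marginal correlation 1/√3 ≈ 0.58 but zero partial correlation given X2, so a GGM would place no edge between them.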

One limitation of this approach lies in the choice of the conditional set in the partial correlation calculation. As pointed out in [41] and [103], the inclusion of noisy genes in the set W\{i, j} may introduce spurious dependencies and consequently false edges in the estimated network. Instead of conditioning on the entire set W\{i, j}, there have been efforts on using lower order partial correlations [41, 135, 232, 231, 120, 121], which condition on one or two other genes. Beyond lower order interactions, [103] proposed a semi-supervised approach to screen for conditionally correlated genes with a small set of known pathway genes. An unsupervised approach involving applying sparse canonical correlation analysis coupled with repeated random partition and subsampling can be found in [227].

Adding causality and dynamics

A deeper understanding of the gene regulation system requires going beyond undirected relationships between genes and knowing the causal drivers behind them. Bayesian networks (BNs) use directed acyclic graphs (DAGs) to represent the joint distribution of nodes (genes) as a series of local probability distributions. In a BN, given its parents, each node is independent of its non-descendants. In this sense, each directed edge can be interpreted as a causal link. The difficulty of inferring BNs lies in the computational cost required to search through all possible graph structures, which is NP-hard [34]. In addition to greedy search [247], various heuristics have been proposed to increase the search efficiency [82, 148, 2, 128]. One can gain further information from gene perturbation experiments (by knockout or RNA interference), with methods that provide causal bounds for direct and indirect effects based on skeleton graphs obtained from the path consistency algorithm [133], and estimate the posterior distribution of a causal ordering of genes with MCMC techniques [172].

BNs can be extended to capture temporal relationships between the variables [104, 262, 258, 218]. In a dynamic BN, the joint probability factorizes into local probabilities of each node associated with every time point, where the parents of a node can include nodes from previous time points. Another class of methods is based on differential equations (DEs), which model the rate of change in the expression level of a gene as a function of the expression of other genes with different functional forms [32, 206, 12]. In addition to the issue of computational complexity, another drawback of these methods lies in the nature of data required to perform extensive inference. More sample measurements need to be taken on a slowly changing system or finely spaced in time in order to capture the underlying dynamics. Attempts to capture these causal and dynamic relationships in gene networks for higher organisms using expression data alone have had limited success; auxiliary information from other data sources (e.g. protein-protein interaction, ChIP-seq) can increase our chances of characterizing the complexity in a more realistic and accurate way [224, 240].

Beyond traditional studies: recent advancements and future trends

Most of the methods discussed above focus on analyzing a single gene expression dataset, which suffers from the curse of dimensionality (the p ≫ n problem). To obtain a more complete picture, one step further is to perform integrated analysis on gene expression data generated by different groups [223, 155] to increase n, and even on other types of data such as TF binding and protein-protein interaction (PPI), which provide direct physical evidence of regulatory interactions [11, 117, 138].

It is worth noting that recent advances in single-cell sequencing technology are offering a new perspective on studying gene pathways with gene expression information. For instance, single-cell RNA-seq (scRNA-seq) data can measure gene functional activities at individual cell resolution and thus have the potential to characterize gene regulatory actions with cell-to-cell variability [28, 140, 57]. However, despite these attractive and promising features, the high noise level in typical single-cell experiments as well as the dynamics of individual cells also present new challenges for developing statistical methods for data preprocessing and network / pathway inference. We will discuss single-cell data in more detail in Section 3.

In addition to computationally inferred gene-gene interactions, extensive experiments have also provided sets of “true” interactions in some species. For example, genome-wide experimental screening of gene-gene interactions has been carried out in yeast with high-throughput techniques (SGA, [209]; E-MAP, [185]), whereas screening these interactions in higher organisms requires more restrictive techniques and has achieved much less coverage [116, 47, 124]. These experimentally validated interactions can be found in databases such as RegulonDB and KEGG [181, 153]. Comparing results from various computational methods against these validated interactions allows us to assess the performance of each method. Some of these comparisons also suggest that different computational methods can lead to quite disparate sets of predicted interactions [42, 137]. As a result, the use of ensemble methods [86, 136] has been proposed to achieve more stable and accurate outcomes via bootstrapping and aggregation. Overall, constructing a complete catalog of gene interactions remains challenging due to the extensive scale of the problem. On the other hand, more specific biological goals and prior knowledge can help us narrow down possible approaches and lead to plausible simplifications.

2.2. Understanding Network Structures

Having reconstructed gene networks, it is now possible to systematically study the topological features of these graphs using graph-theoretical tools to understand and predict the underlying biological functions. While many local and global features can be extracted from the graphs, in this section we will focus on discussing node-level connectivity measures and how they reflect the functional importance of the nodes, and higher-level connectivity patterns including motifs and communities. Other extensive reviews on using graph-based methods for analyzing biological networks can be found in e.g. [6, 84, 158]. In this section, we will consider computationally reconstructed gene networks, which contain noise arising from estimation errors, and also known networks in genomics where edges are directly measured with biological assays, such as PPI networks.

Node-level connectivity

In gene and PPI networks, how a node is connected to the rest of the network can be an important indication of its biological role. Removing nodes with high connectivity or nodes between highly connected components can significantly affect the overall topology. In biological systems, this may correspond to malfunctioning of key genes or proteins which can cause serious perturbations. Different measures of node connectivity, or centrality, exist. In the simplest form, nodes with high degrees, which are also known as hubs, have long been studied in gene and PPI networks for model organisms, especially yeast [89, 261]. These studies have shown that hub genes and the proteins encoded by them are essential to survival, and these genes tend to be older and evolve more slowly [61]. In humans, hubs have been associated with cancer and other types of disease [13] - the protein products of disease-related genes tend to have high degrees [220, 93, 242]. Identifying hubs requires measuring node centrality, the simplest kind of which is node degree. Reweighting each neighbor by its own degree gives rise to another measure called eigenvector centrality. Similar to the PageRank algorithm, eigenvector centrality gives more weight to nodes connected to important neighbors and has been used to distinguish essential proteins in yeast PPI networks [54] and mine gene-disease associations [156].
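The difference between degree and eigenvector centrality can be seen in a small sketch. Eigenvector centrality is computed here by power iteration on A + I, where the identity shift is an implementation choice assumed for this example: it leaves the leading eigenvector unchanged while avoiding oscillation on bipartite graphs. The function names and toy graph are illustrative.

```python
import math

def degree_centrality(adj):
    """Node degrees from a symmetric 0/1 adjacency matrix."""
    return [sum(row) for row in adj]

def eigenvector_centrality(adj, max_iter=500, tol=1e-10):
    """Leading eigenvector of the adjacency matrix (unit Euclidean norm),
    obtained by power iteration on A + I. Assumes a connected graph."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(max_iter):
        new = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        new = [v / norm for v in new]
        converged = max(abs(a - b) for a, b in zip(new, x)) < tol
        x = new
        if converged:
            break
    return x

# Toy graph: node 0 is a hub; node 1 hangs off the hub, node 4 hangs off node 3.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
adj = [[0] * 5 for _ in range(5)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

deg = degree_centrality(adj)
ev = eigenvector_centrality(adj)
```

Nodes 1 and 4 both have degree 1, but node 1 receives a higher eigenvector centrality because its only neighbor is the hub.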

Another measure of connectivity, termed betweenness centrality, considers the number of times a node lies on the shortest path between two other nodes. Since nodes with high betweenness centrality act as bridges connecting subgraphs, they are also called bottlenecks. It has been observed that many bottleneck nodes correspond to essential connector proteins and genes in directed regulatory networks [94, 246]. Further centrality measures and comparison of their performance in identifying essential or disease-causing genes / proteins can be found in [157, 29].
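Betweenness centrality for an unweighted, undirected graph can be computed with Brandes' algorithm, sketched below. The algorithm is a standard choice swapped in for illustration and is not attributed to the studies cited above; the toy graph of two triangles joined through a single connector node is hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes' algorithm).
    graph: dict mapping each node to a list of neighbours (undirected, unweighted)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Two triangles {0, 1, 2} and {4, 5, 6} joined through connector node 3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4],
         4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
bc = betweenness(graph)
```

Node 3 has only two neighbors, yet all nine shortest paths between the two triangles pass through it, which makes it the bottleneck of this graph despite its low degree.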

Higher-order structures – Motifs

At a higher level, biological networks are often decomposed into smaller functional modules in which individual nodes perform coordinated actions. The concept of motifs was introduced by [143] as simple building blocks of complex networks. A motif is a small connected subgraph, which occurs significantly more frequently in the given network than expected by chance. Commonly occurring motifs include positive and negative feedback loops, oscillators and bifans, and these have been associated with optimized biological functions in regulatory networks [13]. The statistical analysis of motifs amounts to a problem of subgraph counting: for a given subgraph, one needs to first obtain the frequency of all subgraphs which are topologically equivalent, then determine its statistical significance. The challenge of the first part lies in the computational cost when the network is large. Since exhaustive enumeration is usually infeasible, sampling methods [99, 230] are needed for estimation. The second part depends on the random graph model used to determine the background frequency. [17] took a local graph alignment approach, which is conceptually similar to sequence alignment, with a scoring function measuring the significance of individual subgraphs and their similarity so that the aligned subgraphs are characterized by a consensus motif that has a high number of internal connections. [92] proposed a finite mixture model for random networks and used an EM algorithm to estimate the parameters and background probabilities. Other motif algorithms can be found in [36, 235].
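The two steps of motif analysis, counting and significance assessment, can be sketched on a toy directed network. The example counts feed-forward loops by exhaustive enumeration, which is feasible only for small graphs, and compares the count with random directed graphs having the same numbers of nodes and edges. This null model, the function names, and the Monte Carlo p-value are assumptions chosen for simplicity; they do not reproduce the sampling or mixture-model methods cited above.

```python
import random
from itertools import permutations

def count_ffl(edges):
    """Count feed-forward loops: ordered triples (a, b, c) of distinct nodes
    with edges a->b, b->c and a->c. Exhaustive O(n^3) enumeration."""
    e = set(edges)
    nodes = {u for u, _ in e} | {v for _, v in e}
    return sum((a, b) in e and (b, c) in e and (a, c) in e
               for a, b, c in permutations(nodes, 3))

def random_digraph(n, m, rng):
    """Uniformly sample m distinct directed edges (no self-loops) on n nodes."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    return rng.sample(pairs, m)

def motif_pvalue(edges, n, n_null=200, seed=0):
    """Observed count and Monte Carlo p-value under the assumed null model."""
    rng = random.Random(seed)
    obs = count_ffl(edges)
    null = [count_ffl(random_digraph(n, len(edges), rng)) for _ in range(n_null)]
    p = (1 + sum(c >= obs for c in null)) / (1 + n_null)
    return obs, p

# Toy regulatory network containing two feed-forward loops.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
obs, p = motif_pvalue(edges, n=4, n_null=50)
```

A null model that also preserves the degree sequence would usually be preferable in practice, since hubs alone can inflate motif counts.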

Clustering and community detection

Since motifs tend to be small in size, another approach is to identify densely connected clusters of nodes which can correspond to genes with related functions or proteins involved in the same complex. Clustering can be applied to gene expression vectors directly using heuristic algorithms (Self Organizing Maps [202]), genetic algorithms [45] or model based approaches (Expectation Maximization [245, 147]; variational Bayes [205]). Alternatively, noting that most methods in Section 2.1 give rise to a similarity matrix, k-means and hierarchical clustering have been widely used in gene expression studies [204, 52]. Taking into account that one gene can participate in multiple pathways, fuzzy versions of k-means have also been developed [43, 66]. One of the difficulties of these methods lies in the choice of number of clusters or where to cut the tree structure to produce the final clusters. This is usually done by considering the within cluster dispersion, or statistics derived from it including the gap statistic [207] and the silhouette measure [175].
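As an example of a statistic derived from within-cluster dispersion, the mean silhouette width of a candidate clustering can be computed from a distance matrix as sketched below. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the width is (b − a) / max(a, b). The function name and the convention of assigning width 0 to singleton clusters are assumptions of this sketch.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette width. dist: full symmetric distance matrix;
    labels: cluster label per point. Requires at least two clusters."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    widths = []
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != c)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)

# Four genes summarised by a single expression score each (toy values).
points = [0, 1, 10, 11]
dist = [[abs(p - q) for q in points] for p in points]
good = mean_silhouette(dist, [0, 0, 1, 1])
bad = mean_silhouette(dist, [0, 1, 0, 1])
```

Comparing this value across candidate numbers of clusters, or across cut heights of a dendrogram, gives one heuristic for choosing the final partition.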

Other methods operate on the given networks directly. Many heuristic algorithms have been developed for gene networks [16, 189] and PPI networks [146, 68] to identify tightly connected components in the graphs. In PPI networks, Markov Clustering (MCL, [216]) has been particularly popular. The algorithm simulates random walks on a given graph by iteratively taking powers of the underlying stochastic matrix and inflating each entry until the graph is partitioned into subsets. MCL has been widely applied to discover protein complexes and cluster protein sequences into families [53, 219, 183], but still lacks theoretical justification.
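The expansion and inflation steps of MCL can be sketched in a few lines. The sketch assumes a column-stochastic matrix with added self-loops, uses matrix squaring for expansion and an entrywise power followed by column renormalisation for inflation, and reads clusters off the nonzero rows of the limiting matrix. Parameter names and defaults are illustrative and do not correspond to the reference implementation of [216].

```python
def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Markov Clustering sketch on a symmetric adjacency matrix.
    Returns a sorted list of clusters, each a sorted list of node indices."""
    n = len(adj)

    def normalise_columns(m):
        sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

    # Add self-loops, then make each column a probability distribution.
    m = normalise_columns([[float(adj[i][j]) + (i == j) for j in range(n)]
                           for i in range(n)])
    for _ in range(max_iter):
        # Expansion: two steps of the random walk (matrix squaring).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: entrywise power, then renormalise columns.
        new = normalise_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(new[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = new
        if change < tol:
            break
    # Each row with surviving mass is an attractor; its nonzero columns form a cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1
clusters = mcl(adj)
```

Larger inflation values tend to produce finer partitions, so the inflation parameter plays a role similar to the number of clusters in other methods.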

In statistical network analysis, identifying tightly connected clusters corresponds to the problem of community detection. Model-based community detection requires a generative probabilistic model for random graphs, one of the most popular being the Stochastic Block Model (SBM) [79]. In an SBM with K classes, for each node i, a latent class variable Z_i ∈ {1, …, K} is assigned according to some categorical distribution. Then the probability of an edge between nodes i and j is given by P(A_ij = 1 | Z_i = k, Z_j = l) = H_kl, where A is the adjacency matrix and H is the K × K connectivity probability matrix. The inference problems for SBMs involve both node classification and parameter estimation, and a block with a high internal edge probability can be considered as a potential functional module. An extensive literature on inference methods for SBM exists. On the other hand, although SBM and community detection have been applied to gene networks and PPI networks [74, 40], the vanilla model is too simplistic to account for real network features such as degree variation within blocks and overlapping blocks. These can be addressed to some extent using a degree-corrected SBM [98] and mixed membership SBM [5].
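The generative mechanism of the SBM can be illustrated by simulating a network and recovering H from block-wise edge densities. The sketch assumes the class labels are known when estimating H, which is not the case in real applications where the labels are latent and must be inferred jointly; the function names, class proportions and connectivity values are hypothetical.

```python
import random

def sample_sbm(n, pi, H, seed=0):
    """Draw labels z_i from a categorical distribution pi (0-based classes),
    then each undirected edge (i, j) independently with probability H[z_i][z_j]."""
    rng = random.Random(seed)
    z = rng.choices(range(len(pi)), weights=pi, k=n)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < H[z[i]][z[j]]:
                A[i][j] = A[j][i] = 1
    return z, A

def estimate_H(A, z, K):
    """Plug-in estimate of H given known labels: observed edge density per block pair."""
    edges = [[0] * K for _ in range(K)]
    pairs = [[0] * K for _ in range(K)]
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            # Update both (k, l) and (l, k) so the estimate is symmetric.
            for k, l in {(z[i], z[j]), (z[j], z[i])}:
                edges[k][l] += A[i][j]
                pairs[k][l] += 1
    return [[edges[k][l] / pairs[k][l] if pairs[k][l] else float("nan")
             for l in range(K)] for k in range(K)]

# Two communities with dense within-block and sparse between-block connectivity.
H = [[0.5, 0.05], [0.05, 0.4]]
z, A = sample_sbm(200, [0.5, 0.5], H, seed=1)
H_hat = estimate_H(A, z, 2)
```

Under this vanilla model, all nodes in a block share the same expected degree, which is the limitation that the degree-corrected variant is designed to relax.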

Going beyond gene and PPI networks, recent advances in chromatin conformation capture techniques open up new ground for applying community detection algorithms. Chromatin conformation capture experiments like Hi-C measure the frequency of interaction between pairs of genome loci in 3D space, thus providing insights into the spatial organization of genomes. One specific feature of the 3D organization is known as topologically associating domains (TADs), which are densely interacting, contiguous chromatin regions playing important roles in regulating gene expression [46, 187, 114]. Treating genome loci as nodes and their interactions as edges, one can consider the structure of chromatin as an interaction network with TADs corresponding to dense communities. Methods based on mixed membership SBM [21] and modularity maximization [244, 151] have been proposed but do not enforce the constraint that the communities in this case have to be contiguous. [225] proposed a network model that accounts for non-exchangeability of nodes (genome loci) and is capable of incorporating biological covariates at the TAD boundaries.

Gene prioritization – semi-supervised clustering

When a specific biological process or pathway is of interest and partial knowledge of it is available, a relevant question that has been considered extensively under a supervised or semi-supervised setting in the literature is known as “gene prioritization” [145]. In general, gene prioritization refers to a computational analysis for ranking genes by their relevance to a disease or biomedical condition through a set of seed (bait) genes and some chosen relevance measure or criteria.

When attempting a whole genome analysis, a major challenge in gene prioritization analysis is how to extract sparse, true signals from large, heterogeneous, noisy data. For instance, when a particular pathway is targeted, the considered data would likely have a low signal-to-noise ratio since the great majority of genes may have no relation to the pathway of interest and the sheer number of pairs of such genes outweighs those that show patterned relations in data. Among many existing approaches, GIANT [71, 234] and ENDEAVOUR [210] have been widely used for a genome-wide gene prioritization analysis. They can accept a group of seed (or bait) genes that are believed to be related to the same biological process as input, and return a list of genes that have been ranked according to computed functional relevance by incorporating multiple sources of data. In particular, GIANT uses a data-driven Bayesian methodology to integrate diverse experiments and information such as genome-wide association study (GWAS) P values and tissue-specific networks; ENDEAVOUR obtains a single global ranking of candidate genes by integrating their rankings associated with each data source using order statistics. These approaches have been found successful in many applications. However, the incorporation of multiple sources of information may bring both positive and negative effects to the analysis. On one side, more sources of information would allow assessing the interactions between candidate and seed (bait) genes from different perspectives and so may offer a more comprehensive portrait on the considered biological process. But on the other hand, information from multiple, heterogeneous sources could reflect different biology with diverse noise and so may dilute the strength in studying a specific biological process under a certain condition.

Given gene expression data, many available methods for gene prioritization analysis have pointed to the general “guilt by association” principle and its extensions by generating hypotheses about potential interactions between candidate genes and seed (bait) genes (e.g., [211, 73, 31]). For instance, GeneFishing [129] uses this strategy and identifies novel genes relevant to a biological process of interest under the guidance of seed (bait) genes utilizing a semi-supervised, non-parametric clustering procedure coupled with a bagging-like majority voting approach. GeneFishing shares the same input-output schema as GIANT and ENDEAVOUR, but differs from them in key aspects. In particular, GeneFishing only uses gene-expression data. In brief, the key features of GeneFishing include: (i) repeatedly, randomly splitting a large search-space into smaller ones and aggregating the results from all the sub-search-spaces (i.e., the bagging idea); (ii) adding known pathway genes into each sub-search-space to provide a focus for the search (making the method semi-supervised). Consequently, in some applications GeneFishing has been found to be robust against noise in the seed (bait) genes and effective in handling a large noisy dataset with sparse signal.

As should be clear, false discoveries and missed discoveries are key issues with all three methods mentioned above. One way to handle these issues is to exploit the irrelevance of most genes together with replicability to control type I error, and to use cross validation and stability to address type II error. Most importantly, if possible, it would be ideal to have results be guided by experimental validation.

3. SINGLE-CELL RNA SEQUENCING (SCRNA-SEQ) DATA

The recent advances in single-cell RNA-sequencing (scRNA-seq) technologies have revolutionized biomedical sciences by revealing genome-wide gene expression levels at an unprecedented individual-cell resolution [188, 179, 88, 51, 77]. Most of the methods discussed in Section 2 were developed primarily for microarray and bulk RNA-seq technologies, which measure average gene expression levels across a collection of (from thousands to millions) cells and provide “coarse” tissue-level gene expression profiles. New scRNA-seq technologies have led to expression measurements at finer resolution and enabled researchers to confirm previously known cell types, to identify new cell types, and to characterize gene-gene interactions within each cell type. Given that scRNA-seq data have revealed widespread heterogeneity among various cell types of the same tissue [18], gene networks inferred at the cell-type level are expected to uncover gene-gene relationships masked in tissue-level gene networks constructed using microarray and bulk RNA-seq data.

Conceptually, the aforementioned computational approaches for inferring gene networks from bulk gene expression data should still be relevant to scRNA-seq data if the data structure is compatible. The distinct characteristics of scRNA-seq data, however, have posed new computational challenges for gene network inference. Below we summarize the challenges and the state-of-the-art methodological development in three subsections. In Section 3.1, we describe several computational issues in scRNA-seq data pre-processing, including the detection of “problematic cells”, normalization of gene expression levels across cells, and imputation of missing gene expression levels in individual cells. In Section 3.2, we discuss identification of cell types from scRNA-seq data. In Section 3.3, we review existing studies on inferring cell-type-specific gene networks and discuss some open challenges and future research directions for network analysis using scRNA-seq data.

3.1. scRNA-seq data pre-processing

Because both are high-throughput sequencing technologies for measuring gene expression, the pre-processing of scRNA-seq data shares some conceptual similarity with that of bulk RNA-seq data, but it also presents unique challenges. While many well-studied pre-processing techniques are available for bulk RNA-seq, developing relevant methods for scRNA-seq data is still a very active research area. For this reason, we present here a discussion of issues arising from scRNA-seq pre-processing.

Similar to bulk RNA-seq, the existence of a variety of scRNA-seq platforms and protocols presents a hurdle for computational method development and cross-validation across datasets. Several published reviews have compared a portion of these platforms [77, 108, 162, 260, 30]. Certain data pre-processing issues are specific to particular types of platforms. For example, several platforms use unique molecular identifiers (UMIs) to remove polymerase chain reaction (PCR) amplification bias [134]. Pre-processing data generated by these platforms requires a step called UMI deduplication, which corrects UMI errors that occur during amplification and sequencing. Multiple methods have been developed for this task [192, 161, 195]. Another issue is the detection of “problematic cells”, including empty droplets (droplets that contain no actual cell) and doublets (two cells mistaken for one cell) in droplet-based platforms [134, 107, 255], and damaged cells in all platforms. Accordingly, multiple computational and experimental solutions have been proposed [85, 96, 197].

In bulk RNA-seq, the number of sequenced reads can vary widely among different samples. Analogously, individual cells may have vastly different numbers of sequenced reads in scRNA-seq. The reason is a combination of biological phenomena (e.g., some cells indeed have more mRNA transcripts than others) and technical artifacts (e.g., variations in cell capture efficiency). It is important to normalize scRNA-seq data so that gene expression levels are comparable across cells, a condition necessary for any downstream analyses. Existing scRNA-seq normalization methods belong to two major categories: spike-in dependent methods and direct normalization methods. In the former, spike-in RNA molecules with the same concentration are added to each cell prior to library preparation [196], and normalization is done through scaling so that spike-in read counts are equalized across cells. However, the addition of spike-ins is only possible for plate-based platforms such as STRT-seq, SMART-seq, and SMART-seq2 [203, 87, 171, 163], and it does not apply to the more recently developed droplet-based platforms, which have advantages including a lower per-cell cost and a larger number of cells to sequence in parallel [134, 107]. The second and more dominant category, direct normalization methods, in contrast, does not require modification to experimental procedures and is thus more generally applicable. Direct normalization methods for scRNA-seq data are either adaptations of existing normalization methods for bulk RNA-seq data (e.g., DESeq [9], trimmed mean of M values (TMM) normalization [174], and the simple library size normalization so that all cells have the same total number of reads) or new methods that specifically account for distinct features of scRNA-seq data (e.g., excess zero counts [101, 58], more details below). 
Examples of new methods include scran, which uses cell pooling and subsequent deconvolution to estimate scale factors of individual cells [130], and SCnorm, which groups genes whose counts have similar dependence on sequencing depths and estimates a scaling factor for each gene group [10]. For a comprehensive review of scRNA-seq normalization methods, we refer interested readers to [214].
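As a minimal illustration of the simple library size normalization mentioned above, the following Python sketch rescales each cell so that all cells share the same total count. The function name, the default target (the median library size), and the toy count matrix are our own assumptions for illustration; the sketch is not the procedure of any specific package cited here.

```python
def library_size_normalize(counts, target=None):
    """Scale each cell so that all cells share the same total count.

    counts: list of cells, each a list of raw read counts per gene.
    target: common total per cell; defaults to the median library size
            (an assumption made for this sketch).
    """
    sizes = [sum(cell) for cell in counts]
    if any(s == 0 for s in sizes):
        raise ValueError("a cell with zero total reads cannot be normalized")
    if target is None:
        ordered = sorted(sizes)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            target = ordered[mid]
        else:
            target = (ordered[mid - 1] + ordered[mid]) / 2
    # Each count is multiplied by target / (library size of its cell).
    return [[x * target / s for x in cell] for cell, s in zip(counts, sizes)]


# Hypothetical toy data: 3 cells x 4 genes. The second cell has the same
# expression profile as the first but was sequenced twice as deeply.
raw = [
    [10, 0, 5, 5],
    [20, 0, 10, 10],
    [1, 3, 0, 6],
]
normalized = library_size_normalize(raw, target=100)
```

After scaling, the first two cells have identical profiles, which is the sense in which expression levels become comparable across cells. Note that zero counts remain zero, so this type of scaling does not address the excess zeros discussed next.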

Finally, a concern unique to scRNA-seq data analysis is the presence of excess zero counts. This can be caused by a technical artifact, known as the “dropout” phenomenon, in which a gene is observed at a moderate expression level in one cell but is undetected in another cell of the same type [108, 101]. Dropouts occur because the current technologies do not reliably and consistently detect low levels of RNA, and consequently, genes may incorrectly appear to be inactive. Dropouts appear as excess zero or low counts in scRNA-seq data, obscuring downstream analyses such as the identification of differentially expressed genes between cell types and the inference of gene networks. To address this issue, multiple methods have been developed to improve the quality of scRNA-seq data from various perspectives. Examples of imputation or recovery methods include scImpute, which first identifies likely false zero and low counts and then imputes them by borrowing information from similar cells [123]; SAVER, which estimates unobserved true gene expression levels in a Bayesian model by borrowing information across genes [80]; and MAGIC, which alters gene expression levels by sharing information across similar cells based on the idea of heat diffusion [215]. A recent review of existing imputation methods is available in [251]. Alternatively, the presence of zero counts can be due to natural fluctuations in gene expression levels as cells go through different stages of the cell cycle [237].

3.2. Identification of cell types

After appropriate pre-processing, scRNA-seq data offer a new opportunity for inferring gene networks at the cell-type level. In order to do this, a key task is the identification of cell types, also known as cell subpopulations or cell states. There are two major approaches to identifying cell types from scRNA-seq data. The first approach leverages prior knowledge on cell-type marker genes. However, it cannot lead to the discovery of new cell types or subtypes. The second approach is based on unsupervised cell clustering. While it is useful for de novo discovery of new cell types and subtypes, unsupervised learning depends on many user-specified inputs, including which clustering algorithm to use (e.g., K-means clustering, hierarchical clustering, density-based clustering, or graph-based clustering), the type of similarity or distance metric between two cells, and the number of clusters, which is a key parameter needed for many clustering algorithms. Taking into account the distinct features of scRNA-seq data, multiple cell clustering algorithms have been developed, including SNN-Cliq, which does not use conventional similarity measures but leverages the ranking of cells to construct a cell-cell graph for identifying cell clusters [241]; BiSNN-Walk, which extends SNN-Cliq and uses an iterative biclustering approach to return a ranked list of cell clusters, each associated with a set of ranked genes based on their levels of affiliation with the cluster [190]; CIDR, the first clustering method that incorporates imputation of dropout gene expression levels [125]; SC3, a widely-used ensemble method that combines multiple clustering algorithms [106]; and Seurat, which identifies cell clusters based on a shared nearest neighbor (SNN) clustering algorithm [182]. In addition to commonly used similarity metrics, including the Pearson correlation, Spearman correlation, and Euclidean distance, other cell similarity measures can be found in, e.g., [184, 91]. 
An evaluation study that compares multiple clustering methods is available in [49]. For a recent review of methods and challenges in unsupervised clustering of scRNA-seq data, please refer to [105].
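To make the idea of graph-based clustering with a shared nearest neighbor (SNN) similarity concrete, the following Python sketch builds k-nearest-neighbor sets, scores each pair of cells by the overlap of their neighbor sets, and then groups cells. It is a deliberately simplified illustration under our own assumptions: the Jaccard overlap, the fixed threshold, and the use of connected components in place of a community detection step are illustrative choices, and the sketch does not reproduce SNN-Cliq, Seurat, or any other cited method.

```python
import math


def knn_sets(cells, k):
    """For each cell, return the set of its k nearest other cells plus itself."""
    neighbors = []
    for i, a in enumerate(cells):
        others = [j for j in range(len(cells)) if j != i]
        others.sort(key=lambda j: math.dist(a, cells[j]))  # Euclidean distance
        neighbors.append(set(others[:k]) | {i})
    return neighbors


def snn_similarity(neighbors):
    """Jaccard overlap of neighbor sets (an assumed SNN-style similarity)."""
    n = len(neighbors)
    return [
        [len(neighbors[i] & neighbors[j]) / len(neighbors[i] | neighbors[j])
         for j in range(n)]
        for i in range(n)
    ]


def connected_components(sim, threshold):
    """Label cells by connected components of the thresholded SNN graph."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u][v] >= threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Hypothetical toy data: 6 cells measured on 2 genes, forming two tight groups.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = connected_components(snn_similarity(knn_sets(cells, k=2)), threshold=0.5)
```

Even in this toy setting the result depends on the choice of k, the distance metric, and the threshold, which mirrors the dependence on user-specified inputs noted above.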

3.3. Inference of cell-type-specific gene networks and its challenges

Having identified cell types, one possible approach to gene network inference is to use existing inference or construction methods within each cell type (e.g., SINCERA [75]). Several other methods have been developed to incorporate scRNA-seq data characteristics. One study inferred gene co-expression networks by identifying significant pairwise gene associations using both continuous and binary components of linearly transformed scRNA-seq gene expression data [164]. Another study used Boolean regulatory network models with discretized single-cell expression profiles to construct a network of 20 transcription factors (TFs), which predicts direct regulation of the TF Erg in early blood development of mouse embryos [144]. More recently, SCENIC was developed as a computational method that simultaneously reconstructs gene regulatory networks and identifies cell types from scRNA-seq data [4]. SCENIC defines regulons as TF-target gene (TG) co-expression modules with TF motif enrichment, and it then calculates regulon activity scores, which are robust against dropouts, for downstream analyses. PIDC is another computational method that infers gene regulatory networks using a multivariate information measure based on partial information decomposition, which captures higher-order information than pairwise mutual information [28]. PIDC is enabled by the large sample size (i.e., number of cells) in scRNA-seq data. These network inference methods together facilitate the investigation of gene regulatory relationships at the cell-type level.

Despite the existence of many gene network inference methods, scRNA-seq data still call for new computational methods for specific network analysis tasks. Given the high level of noise and excess zeros in scRNA-seq data, one limitation of many inferred cell-type-specific gene networks is that they are typically small in size and require TF-TG pairing information. Another reasonable approach is to consider joint network inference for multiple cell types, which borrows information across cell types to achieve more accurate inference. Relevant statistical approaches include a method that infers multiple Gaussian graphical models (GGMs) with a joint sparsity constraint [35] and a Bayesian nonparametric dynamic Poisson graphical model that combines information across biological conditions for joint inference of TF co-activation networks [131]. In some cases, there is a known or inferred temporal or spatial structure of cell types, such as a cell lineage or pseudotime reconstructed by computational methods including Monocle [212], Waterfall [191], Wishbone [186], TSCAN [90], Monocle2 [169], Slingshot [198], and CellRouter [38]. To incorporate such a cell type structure into network inference, one may leverage existing statistical methods in the network analysis literature, e.g., a Bayesian neighborhood selection method that jointly estimates multiple GGMs with a spatial and/or temporal structure among these GGMs [126] and a group-fused graphical Lasso method for estimating piecewise constant time-evolving GGMs.

The availability of cell-type-specific gene networks has opened up new grounds for applications of statistical network inference. For example, an important statistical question is how to test for the differences between two networks or among multiple networks along a spatial or temporal trajectory. For this task, there are multiple differential network analysis approaches that have been applied to studying protein-protein interaction networks and gene networks constructed from bulk gene expression data [250, 69, 83, 67, 8, 132, 76]. A new statistical challenge in extending these existing methods to scRNA-seq data is how to incorporate the uncertainty in cell type identification and/or cell type trajectory reconstruction.

As we discussed in Section 2.1, how to fairly evaluate network inference methods remains a critical challenge for computational biologists. Multiple steps can affect the network inference results, including the aforementioned complex data pre-processing, the choice of nodes (what genes to include), and the definition of edges (marginal or conditional associations, linear or non-linear associations, directed or undirected associations, etc.). The lack of proper benchmarking data is the key reason behind this challenge, and it is necessary to have joint efforts with experimentalists to design reasonable benchmark standards.

4. BRAIN NETWORKS

The human brain is a complex, interconnected network. The study of the brain is another important area where network analysis tools have proven extremely useful. One popular type of analysis is brain connectivity analysis, which aims to provide an accurate and informative mapping and signal extraction of the human brain by analyzing connectivities between different neurons or brain regions [19]. Results from such analysis can lead to crucial insights into the pathologies of neurological disorders. For example, an increasing amount of evidence suggests that, compared to a healthy brain, a connectivity network changes in the presence of numerous neurological disorders [60]. There has been a fast development of brain connectivity analysis using graph theoretical tools [59]. At the heart of this endeavor is the notion that brain connectivity can be abstracted to a graph, with nodes representing neural elements, e.g., neurons or brain regions, and links representing some measure of structural, functional or causal interaction between nodes. Such a representation brings the rich repository of graph theory and tools to the realm of brain connectivity analysis to characterize diverse anatomical, functional and dynamical properties of brain networks.

4.1. Basics

A graphical analysis of brain connectivity starts with defining nodes. This step is crucial and nontrivial, with different ways of defining nodes at different resolutions. At the microscopic level, nodes are neurons, with the number of neurons on the order of 10^13 to 10^14. At the macroscopic level, nodes can be individual image voxels, with the number of voxels on the order of 10^5 to 10^7, or can be spatial brain regions-of-interest (ROIs), with the number of ROIs on the order of 10^2. The ROIs can be defined anatomically, according to a brain atlas that is built on prior anatomical information such as sulcal and gyral landmarks [44, 213], or can be defined functionally, based on prior functional information such as coordinates of peak activation [48]. More recently, there have been proposals to parcellate the brain and define the ROIs according to data-driven clustering of resting-state functional or diffusion-weighted imaging measures [165].

The next step is to determine the edges between nodes, and we discuss network edge estimation in Section 4.2. It is useful to recognize that there are three broad classes of brain connectivity one can consider to define edges: functional connectivity, structural connectivity, and effective connectivity [63, 64]. Different classes lead to different ways of defining network edges. Simply speaking, functional connectivity refers to statistical correlations and dependencies between spatially distinct neurophysiological recordings of brain activities. Structural connectivity refers to the anatomical connections and physical wirings between brain regions. Effective connectivity refers to the causal influence exerted amongst neural systems.

In this review, we primarily focus on statistical methods for functional connectivity analysis where the nodes are pre-defined brain regions. This is the area that has probably been most intensively studied in both neuroscience and statistics. See [193] for a recent review on functional connectivity analysis, and a discussion on blind spots and breakthroughs. We only briefly discuss structural connectivity analysis and effective connectivity analysis. Even though we attempt to cover a wide range of methods, we are sure to miss some important papers. See [217] and [59] for more discussion on node and edge definitions in brain connectivity analysis.

4.2. Network estimation

Functional connectivity analysis

Two mainstream imaging modalities to study brain functional connectivity are electroencephalography (EEG) [152, 221, 154] and resting-state functional magnetic resonance imaging (fMRI) [127, 113]. For each study subject, EEG records the voltage values of multiple electrodes placed at various scalp locations over time, producing a spatial by temporal data matrix that can be used for downstream analysis. Resting-state fMRI measures changes in blood flow and oxygenation at individual voxels of the brain over time, yielding a 4-way data array, which needs pre-processing first. Following a pre-specified brain region parcellation, the time course data of the voxels within the same region are summarized, most often averaged, to represent the signal of that region. Alternatively, instead of using simple averages, [97] proposed to use the kernel canonical correlation coefficient between all the voxels from the two regions to define the strength of connectivity. For both modalities, the resulting data is a spatial (location/region) by temporal (time) matrix for each individual subject, from which a functional connectivity network is estimated. The most commonly used approach to construct a connectivity network is to treat the time series data of each spatial location as repeated measures to compute marginal correlations between every pair of nodes/locations. Some of these methods, e.g., the Pearson correlation coefficient [72] and partial correlation [178, 22, 222], have been discussed in Section 2.1 in the context of gene networks. In addition to various connectivity network estimation solutions, [238] developed a formal inference approach to explicitly quantify the significance of individual links in a connectivity network. They adopted the matrix normal distribution, formulated the problem as precision matrix testing, and controlled the false discovery of multiple testing.
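The difference between marginal and partial correlation as edge measures can be illustrated with a small Python sketch. Each row of the region by time matrix is treated as repeated measures, pairwise Pearson correlations are computed, and a first-order partial correlation is obtained by adjusting for a third region. The function names and the toy time series are hypothetical; for more than three regions one would typically work with an estimated precision matrix, as in the methods cited above, rather than with this three-variable formula.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series):
    """Marginal correlation network: one entry per pair of regions."""
    p = len(series)
    return [[pearson(series[i], series[j]) for j in range(p)] for i in range(p)]


def partial_correlation(r, i, j, k):
    """First-order partial correlation of regions i and j given region k."""
    num = r[i][j] - r[i][k] * r[j][k]
    den = math.sqrt((1 - r[i][k] ** 2) * (1 - r[j][k] ** 2))
    return num / den


# Hypothetical toy data: region 2 drives regions 0 and 1, which otherwise
# only carry unrelated small fluctuations.
driver = [1, 2, 3, 4, 5, 6, 7, 8]
region0 = [z + e for z, e in zip(driver, [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5])]
region1 = [z + e for z, e in zip(driver, [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5])]

r = correlation_matrix([region0, region1, driver])
pc = partial_correlation(r, 0, 1, 2)
```

In this example the marginal correlation between regions 0 and 1 is close to one, while their partial correlation given region 2 is close to zero, so a network built from partial correlations would not place a direct edge between them.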

Alternatively, [166] treated the time-course data as continuous random functions, and developed a functional graphical model to estimate the connectivity network, based on functional conditional independence under a functional normal assumption. [118] further relaxed the normality assumption, and proposed the notion of functional additive conditional independence as a criterion for constructing functional graphical models. Their method requires neither parametric assumptions nor high-dimensional kernels, and thus avoids the curse of dimensionality and is able to scale to large networks.

In addition to measuring the correlation of two time series in the time domain, which shows how the signal changes over time, one can also measure the correlation in the frequency domain, which shows the signal within each given frequency band over a range of frequencies. Such frequency domain analyses help address two problems existing in time domain analyses: temporal inconsistency and noise sensitivity. Coherence is one correlation measure in the frequency domain, which is the analog of cross correlation in the time domain, and is a temporally invariant frequency-specific measure of linear association between signals. [56] studied EEG data and used partial coherence as the measure of functional connectivity, which identifies the frequency bands that drive the direct linear association between any pair of nodes. They developed a generalized shrinkage estimator, a weighted average of a parametric and a nonparametric estimator, of the partial coherence matrix. Moreover, [122] employed time-series, clustering and functional data analysis to study spectral synchronicity and functional connectivity, also using EEG data. [180] discussed using mutual information and partial mutual information to estimate functional connectivity networks, and [26] further extended the method. Similar ideas apply to fMRI data as well; see [3].
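For reference, the standard textbook definitions of coherence and partial coherence can be written as follows. The notation is our own and is not taken from [56]: S(f) denotes the spectral density matrix of the multivariate signal at frequency f, with cross-spectral entries S_{jk}(f), and G(f) denotes its inverse.

```latex
% Coherence between signals j and k at frequency f
C_{jk}(f) = \frac{|S_{jk}(f)|^{2}}{S_{jj}(f)\, S_{kk}(f)}, \qquad 0 \le C_{jk}(f) \le 1 .

% Partial coherence, based on the inverse spectral matrix G(f) = S(f)^{-1}
PC_{jk}(f) = \frac{|G_{jk}(f)|^{2}}{G_{jj}(f)\, G_{kk}(f)} .
```

The second expression parallels the relationship between partial correlation and the precision matrix in the time domain: it measures the linear association between signals j and k at frequency f after removing the linear effects of all other signals.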

Dynamic functional connectivity

Traditionally, functional connectivity analysis based on resting-state fMRI assumes that the functional connectivity network is static. Consequently, one often aggregates the time-course data over the entire duration of the scanning session and obtains a single estimate of the connectivity network. In recent years, emerging evidence has suggested that the network very likely changes dynamically over the scan [23]. [7] proposed to assess the functional connectivity dynamics based on spatial independent component analysis, sliding time window correlation, and k-means clustering of the windowed correlation matrices. [37] developed a dynamic connectivity regression approach to detect temporal change points in functional connectivity. [243] further extended this approach to handle large networks. [201] proposed a structured tensor factorization approach that encourages sparsity and smoothness in parameters along the specified tensor modes. They then built a dynamic tensor clustering method, and applied it to brain dynamic functional connectivity analysis.
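The sliding time window correlation step can be sketched in a few lines of Python. This is only an illustration of the windowing idea under our own assumptions (window width, step size, and toy signals are hypothetical); it omits the independent component analysis and the k-means clustering of windowed correlation matrices used in [7].

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def sliding_window_correlation(x, y, width, step=1):
    """Correlation of x and y within each window of the given width."""
    starts = range(0, len(x) - width + 1, step)
    return [pearson(x[s:s + width], y[s:s + width]) for s in starts]


# Hypothetical toy signals: the two regions are in phase during the first
# half of the scan and in anti-phase during the second half.
x = [0, 1] * 8
y = [0, 1] * 4 + [1, 0] * 4

static = pearson(x, y)                                   # single whole-scan estimate
windows = sliding_window_correlation(x, y, width=4, step=4)
```

Here the whole-scan correlation is zero, whereas the windowed estimates switch from +1 to -1 halfway through, which is the kind of change a static analysis averages away.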

[168] developed a method to estimate an individual graph given an external variable, e.g., age, and proposed a multi-step procedure. They first obtained the sample covariance matrix estimates at the observed values of the external variable. They then constructed a smoothed covariance estimate through kernel smoothing for any value of the external variable in between. Finally, they plugged the covariance estimate into a sparse precision matrix estimation method such as CLIME in [22].
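The kernel smoothing step of such a procedure can be sketched as a weighted average of the observed sample covariance matrices. The Python sketch below assumes a Gaussian kernel and a Nadaraya-Watson type average; these choices, the function names, and the toy matrices are our own assumptions and may differ from the exact estimator in [168]. The final sparse precision matrix estimation step is not shown.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (the normalizing constant cancels below)."""
    return math.exp(-0.5 * u * u)


def smoothed_covariance(z0, z_values, covariances, bandwidth):
    """Kernel-weighted average of covariance matrices at external value z0."""
    weights = [gaussian_kernel((z0 - z) / bandwidth) for z in z_values]
    total = sum(weights)
    p = len(covariances[0])
    return [
        [sum(w * s[i][j] for w, s in zip(weights, covariances)) / total
         for j in range(p)]
        for i in range(p)
    ]


# Hypothetical toy data: sample covariance matrices of two regions observed
# for subjects aged 20 and 40.
ages = [20, 40]
sample_covs = [
    [[1.0, 0.2], [0.2, 1.0]],
    [[1.0, 0.6], [0.6, 1.0]],
]
sigma_30 = smoothed_covariance(30, ages, sample_covs, bandwidth=10)
```

Because the weights are nonnegative, the smoothed estimate is a convex combination of the input matrices and therefore remains positive semidefinite whenever the inputs are.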

Beyond functional connectivity

As we have mentioned previously, aside from functional connectivity, structural connectivity and effective connectivity have also been considered in the literature. While a vast number of papers on each of these topics is available, due to space limits, we only briefly discuss a few. More details can be found in the references therein.

Structural connectivity analysis aims to reconstruct white matter fiber tracts, which are large axonal bundles with similar destinations, in the brain. Such a white fiber structure serves as a proxy to brain anatomical structure. It is indicative of brain abnormality in white matter due to axonal loss or deformation, and is thought to be related to many neurodegenerative diseases. The white fiber structure can be deduced from the diffusion characteristics of water molecules in the brain, as water tends to diffuse faster along the fiber bundles. Diffusion tensor imaging (DTI) is an in vivo and non-invasive medical imaging technology that measures water diffusion in the brain. [236] developed a method for fiber direction estimation, smoothing and tracking. [253] developed a way to utilize multiple white matter features to construct structural connectivity across subjects.

Effective connectivity aims to model causal relationships between brain neurons or regions, and refers explicitly to the influence that one neural system exerts over another, either at a synaptic or population level. See [64] for a review. While functional connectivity is often encoded by an undirected graph, effective connectivity is encoded by a directed graph. Two common classes of effective connectivity modeling approaches are dynamic causal modeling (DCM, [65]), and structural equation modeling [141]. Notably, the DCM approach utilizes ordinary differential equation models for the neural dynamics and hemodynamic response. However, it is often computationally expensive and is often restricted to a relatively small number of nodes. [95] proposed a spatio-spectral mixed-effects model for effective connectivity analysis using task-based fMRI. [252] developed a dynamic directional model with block structure for effective connectivity using electrocorticographic (ECoG) data. [24] developed a causal dynamic network model to estimate brain activation and connection also using task-based fMRI data.

4.3. Network comparison

Accumulating evidence indicates that, compared to a healthy brain, a connectivity network is altered in the presence of numerous neurological disorders, including Alzheimer’s disease, attention deficit hyperactivity disorder, autism spectrum disorder, and many others [78, 208, 177]. Such alterations in brain connectivity are associated with cognitive and behavioral functions, and hold crucial insights into the pathologies of neurological disorders [60]. As such, it is of paramount importance to compare brain connectivity networks under different physiological conditions, e.g., the disorder diagnostic status.

The first question is how to estimate multiple brain connectivity networks jointly under different conditions. [259] modeled the spatial-temporal data as matrix-valued normal, then proposed a nonconvex penalization to simultaneously estimate multiple networks coded by precision matrices under different conditions. They assumed that not only each individual precision matrix is sparse, but also the difference of the precision matrices across the conditions is sparse. Both types of sparsity are biologically sensible. [3] approached the problem in the frequency domain, and developed a sparse reduced rank modeling framework for functional connectivity analysis across multiple groups.

The second question is how to carry out formal statistical inference to compare brain connectivity networks under different conditions. [102] tackled this problem by first summarizing the network through a set of network metrics, then employing a standard two-sample test. This strategy is commonly employed in the neuroscience literature and is easy to implement. However, it remains unclear to what extent each network metric provides a meaningful representation of brain function and structure [59]. [33] developed a method to detect differentially expressed connectivity subnetworks under different conditions, by searching clusters of the graph, and resorting to a permutation test to obtain the p-value of the selected subnetwork. [50] developed a fully Bayesian solution for network comparison, under a series of prior distributions, and the solution is very flexible. [149] turned the matrix data into vector normal by whitening, and used a bootstrap resampling method for inference. [239] adopted the matrix normal distribution, and developed an inferential procedure for testing equality of individual entries of partial correlation matrices across multiple groups while controlling for false discovery.
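The strategy of summarizing each subject's network by a scalar metric and then comparing two groups can be illustrated with a small Python sketch. Here we swap in an exact two-sample permutation test on the difference in group means in place of the "standard two-sample test" mentioned above; the metric values are hypothetical, and the sketch does not implement the subnetwork search of [33] or any other cited procedure.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    indices = range(len(pooled))

    def abs_mean_diff(idx_a):
        chosen = set(idx_a)
        a = [pooled[i] for i in indices if i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = abs_mean_diff(range(n_a))
    # Enumerate every way of relabeling subjects into groups of the same sizes.
    splits = list(combinations(indices, n_a))
    # Small tolerance guards against floating-point ties.
    extreme = sum(1 for idx in splits if abs_mean_diff(idx) >= observed - 1e-12)
    return extreme / len(splits)


# Hypothetical per-subject values of one network metric (e.g., edge density)
# for 4 patients and 4 healthy controls.
patients = [0.30, 0.32, 0.35, 0.31]
controls = [0.20, 0.22, 0.19, 0.25]
p_value = exact_permutation_pvalue(patients, controls)
```

With four subjects per group there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70, which is what this perfectly separated toy example yields. For realistic sample sizes one would sample random relabelings instead of enumerating them all.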

In addition to the element-wise comparison of multiple networks, there is another family of methods that use persistent homology and are built to take into account the network topology. Homology is an algebraic formalism to associate a sequence of objects with a topological space. Persistent homology is a technique of computational topology that charts the changes in topological network features over multiple resolutions and scales. In doing so, it reveals the most persistent topological features that are robust to noise. See [115, 167, 193] for more details.
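The idea of charting topological features across scales can be illustrated in its simplest form by counting the connected components of a weighted network as the edge threshold is lowered. The Python sketch below uses a union-find structure for this purpose; it covers only the zero-dimensional case, whereas persistent homology methods also track higher-dimensional features such as loops, and the toy network and thresholds are hypothetical.

```python
def component_counts(n_nodes, weighted_edges, thresholds):
    """Number of connected components when keeping edges with weight >= t."""
    counts = []
    for t in thresholds:
        parent = list(range(n_nodes))

        def find(u):
            # Follow parent pointers to the root, compressing the path.
            while parent[u] != u:
                parent[u] = parent[parent[u]]
                u = parent[u]
            return u

        for u, v, w in weighted_edges:
            if w >= t:
                parent[find(u)] = find(v)  # merge the two components
        counts.append(len({find(u) for u in range(n_nodes)}))
    return counts


# Hypothetical toy network: 4 regions with connectivity strengths as weights.
edges = [(0, 1, 0.9), (1, 2, 0.6), (2, 3, 0.8), (0, 3, 0.2)]
thresholds = [1.0, 0.85, 0.7, 0.5, 0.1]
counts = component_counts(4, edges, thresholds)
```

Components that merge only at low thresholds correspond to features that persist over a wide range of scales, and comparing such profiles between networks avoids committing to a single threshold.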

5. CONCLUDING REMARKS

In this review, using gene networks and brain networks as primary examples, we have discussed statistical methods for constructing networks and how biological knowledge can be extracted from network topology. It is also possible to bring other biological covariates into the network analysis; one popular such example is to associate the estimated brain connectivity network with external phenotypes. [110] proposed a semiparametric Bayesian conditional graphical model for joint selection of important neuroimaging biomarkers such as the brain functional connectivity, as well as significant genetic biomarkers. [200] developed a class of tensor response regression models that associate a symmetric correlation matrix with a set of covariates such as age and sex. [257] developed intrinsic regression models that associate the diffusion tensor from structural connectivity analysis with the covariates.

It is useful to note that gene coexpression networks and brain connectivity networks share some conceptual similarities, both using some correlation measure to represent edges. However, they differ in how repeated measures are taken. In the latter, the replications are repeated measures of the time series, and a single correlation network can be constructed for every single subject/sample. Thus typically in a brain network study, multiple networks are available for statistical analysis. In the case of gene networks with microarray or bulk RNA-seq data, the replications are individual samples, and usually only a single correlation network is constructed across all the samples. As discussed in Section 3, this perspective is changing with the availability of scRNA-seq, which allows for a network to be constructed for each cell type.

Acknowledgments

§Supported by the ARC DECRA Fellowship.

Supported by NSF grant DMS-1613137; NIH grants R01AG034570 and R01AG061303.

Supported by NSF grant DBI-1846216; NIH/NIGMS grant R01GM120507; Johnson & Johnson WiSTEM2D Award; Sloan Research Fellowship.

REFERENCES

  • 1.Aach John and George M Church. Aligning gene expression time series with time warping algorithms. Bioinformatics, 17(6):495–508, 2001. [DOI] [PubMed] [Google Scholar]
  • 2.Aghdam Rosa, Ganjali Mojtaba, Zhang Xiujun, and Eslahchi Changiz. Cn: a consensus algorithm for inferring gene regulatory networks using the sorder algorithm and conditional mutual information test. Molecular BioSystems, 11(3):942–949, 2015. [DOI] [PubMed] [Google Scholar]
  • 3.Ahn Mihye, Shen Haipeng, Lin Weili, and Zhu Hongtu. A sparse reduced rank framework for group analysis of functional neuroimaging data. Statistica Sinica, 25(1):295, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Aibar Sara, Carmen Bravo González-Blas Thomas Moerman, Imrichova Hana, Hulselmans Gert, Rambow Florian, Marine Jean-Christophe, Geurts Pierre, Aerts Jan, Oord Joost van den, et al. Scenic: single-cell regulatory network inference and clustering. Nature methods, 14(11):1083, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Airoldi Edoardo M, Blei David M, Fienberg Stephen E, and Xing Eric P. Mixed membership stochastic blockmodels. Journal of machine learning research, 9(Sep):1981–2014, 2008. [PMC free article] [PubMed] [Google Scholar]
  • 6.Aittokallio Tero and Schwikowski Benno. Graph-based methods for analysing networks in cell biology. Briefings in bioinformatics, 7(3):243–255, 2006. [DOI] [PubMed] [Google Scholar]
  • 7.Allen Elena A, Damaraju Eswar, Plis Sergey M, Erhardt Erik B, Eichele Tom, and Calhoun Vince D. Tracking whole-brain connectivity dynamics in the resting state. Cerebral cortex, 24(3):663–676, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Amar David, Safer Hershel, and Shamir Ron. Dissection of regulatory networks that are altered in disease via differential co-expression. PLoS computational biology, 9(3):e1002955, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Anders Simon and Huber Wolfgang. Differential expression analysis for sequence count data. Genome biology, 11(10):R106, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Bacher Rhonda, Chu Li-Fang, Leng Ning, Gasch Audrey P, Thomson James A, Stewart Ron M, Newton Michael, and Kendziorski Christina. Scnorm: A quantile-regression based approach for robust normalization of single-cell rna-seq data. bioRxiv, page 090167, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Bar-Joseph Ziv, Gerber Georg K, Lee Tong Ihn, Rinaldi Nicola J, Yoo Jane Y, Robert François, Gordon D Benjamin, Fraenkel Ernest, Jaakkola Tommi S, Young Richard A, et al. Computational discovery of gene modules and regulatory networks. Nature biotechnology, 21(11):1337, 2003. [DOI] [PubMed] [Google Scholar]
  • 12.Bar-Joseph Ziv, Gitter Anthony, and Simon Itamar. Studying and modelling dynamic biological processes using time-series gene expression data. Nature Reviews Genetics, 13(8):552, 2012. [DOI] [PubMed] [Google Scholar]
  • 13.Barabási Albert-László, Gulbahce Natali, and Loscalzo Joseph. Network medicine: a network-based approach to human disease. Nature reviews genetics, 12(1):56, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Barabasi Albert-Laszlo and Oltvai Zoltan N. Network biology: understanding the cell’s functional organization. Nature reviews genetics, 5(2):101, 2004. [DOI] [PubMed] [Google Scholar]
  • 15.Basso Katia, Margolin Adam A, Stolovitzky Gustavo, Klein Ulf, Dalla-Favera Riccardo, and Califano Andrea. Reverse engineering of regulatory networks in human b cells. Nature genetics, 37(4):382, 2005. [DOI] [PubMed] [Google Scholar]
  • 16.Ben-Dor Amir, Shamir Ron, and Yakhini Zohar. Clustering gene expression patterns. Journal of computational biology, 6(3–4):281–297, 1999. [DOI] [PubMed] [Google Scholar]
  • 17.Berg Johannes and Lässig Michael. Local graph alignment and motif search in biological networks. Proceedings of the National Academy of Sciences, 101(41):14689–14694, 2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Buettner Florian, Natarajan Kedar N, Casale F Paolo, Proserpio Valentina, Scialdone Antonio, Theis Fabian J, Teichmann Sarah A, Marioni John C, and Stegle Oliver. Computational analysis of cell-to-cell heterogeneity in single-cell rna-sequencing data reveals hidden subpopulations of cells. Nature biotechnology, 33(2):155, 2015. [DOI] [PubMed] [Google Scholar]
  • 19.Bullmore Ed and Sporns Olaf. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews neuroscience, 10(3):186, 2009. [DOI] [PubMed] [Google Scholar]
  • 20.Butte Atul J and Kohane Isaac S. Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Biocomputing 2000, pages 418–429. World Scientific, 1999. [DOI] [PubMed] [Google Scholar]
  • 21.Cabreros Irineo, Abbe Emmanuel, and Tsirigos Aristotelis. Detecting community structures in hi-c genomic data. In 2016 Annual Conference on Information Science and Systems (CISS), pages 584–589. IEEE, 2016. [Google Scholar]
  • 22.Cai Tony, Liu Weidong, and Luo Xi. A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494):594–607, 2011. [Google Scholar]
  • 23.Calhoun Vince D, Miller Robyn, Pearlson Godfrey, and Adalı Tulay. The chronnectome: time-varying connectivity networks as the next frontier in fmri data discovery. Neuron, 84(2):262–274, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Cao Xuefei, Sandstede Björn, and Luo Xi. A functional data method for causal dynamic network modeling of task-related fmri. Frontiers in neuroscience, 13, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Carter Scott L, Brechbühler Christian M, Griffin Michael, and Bond Andrew T. Gene co-expression network topology provides a framework for molecular characterization of cellular state. Bioinformatics, 20(14):2242–2250, 2004. [DOI] [PubMed] [Google Scholar]
  • 26.Cassidy Ben, Rae Caroline, and Solo Victor. Brain activity: Connectivity, sparsity, and mutual information. IEEE transactions on medical imaging, 34(4):846–860, 2014. [DOI] [PubMed] [Google Scholar]
  • 27.Chahrour Maria, Jung Sung Yun, Shaw Chad, Zhou Xiaobo, Wong Stephen TC, Qin Jun, and Zoghbi Huda Y. Mecp2, a key contributor to neurological disease, activates and represses transcription. Science, 320(5880):1224–1229, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Chan Thalia E, Stumpf Michael PH, and Babtie Ann C. Gene regulatory network inference from single-cell data using multivariate information measures. Cell systems, 5(3):251–267, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Chavali Sreenivas, Barrenas Fredrik, Kanduri Kartiek, and Benson Mikael. Network properties of human disease genes with pleiotropic effects. BMC systems biology, 4(1):78, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Chen Geng and Shi Tieliu. Single-cell rna-seq technologies and related computational data analysis. Frontiers in Genetics, 10:317, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Chen Jing, Bardes Eric E, Aronow Bruce J, and Jegga Anil G. Toppgene suite for gene list enrichment analysis and candidate gene prioritization. Nucleic acids research, 37(suppl 2):W305–W311, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Chen Kuang-Chi, Wang Tse-Yi, Tseng Huei-Hun, Huang Chi-Ying F, and Kao Cheng-Yan. A stochastic differential equation model for quantifying transcriptional regulatory network in saccharomyces cerevisiae. Bioinformatics, 21(12):2883–2890, 2005. [DOI] [PubMed] [Google Scholar]
  • 33.Chen Shuo, Kang Jian, Xing Yishi, and Wang Guoqing. A parsimonious statistical method to detect groupwise differentially expressed functional connectivity networks. Human brain mapping, 36(12):5196–5206, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Chickering David Maxwell, Heckerman David, and Meek Christopher. Large-sample learning of bayesian networks is np-hard. Journal of Machine Learning Research, 5(Oct):1287–1330, 2004. [Google Scholar]
  • 35.Chun Hyonho, Zhang Xianghua, and Zhao Hongyu. Gene regulation network inference with joint sparse gaussian graphical models. Journal of Computational and Graphical Statistics, 24(4):954–974, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Ciriello Giovanni and Guerra Concettina. A review on models and algorithms for motif discovery in protein-protein interaction networks. Briefings in Functional Genomics and Proteomics, 7(2):147–156, 2008. [DOI] [PubMed] [Google Scholar]
  • 37.Cribben Ivor, Haraldsdottir Ragnheidur, Atlas Lauren Y, Wager Tor D, and Lindquist Martin A. Dynamic connectivity regression: determining state-related changes in brain connectivity. Neuroimage, 61(4):907–920, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Rocha Edroaldo Lummertz Da, Rowe R Grant, Lundin Vanessa, Malleshaiah Mohan, Jha Deepak Kumar, Rambo Carlos R, Li Hu, North Trista E, Collins James J, and Daley George Q. Reconstruction of complex single-cell trajectories using cellrouter. Nature communications, 9(1):892, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Daub Carsten O, Steuer Ralf, Selbig Joachim, and Kloska Sebastian. Estimating mutual information using b-spline functions – an improved similarity measure for analysing gene expression data. BMC bioinformatics, 5(1):118, 2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Daudin J-J, Picard Franck, and Robin Stéphane. A mixture model for random graphs. Statistics and computing, 18(2):173–183, 2008. [Google Scholar]
  • 41.Fuente Alberto De La, Bing Nan, Hoeschele Ina, and Mendes Pedro. Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics, 20(18):3565–3574, 2004. [DOI] [PubMed] [Google Scholar]
  • 42.Smet Riet De and Marchal Kathleen. Advantages and limitations of current network inference methods. Nature Reviews Microbiology, 8(10):717, 2010. [DOI] [PubMed] [Google Scholar]
  • 43.Dembélé Doulaye and Kastner Philippe. Fuzzy c-means method for clustering microarray data. Bioinformatics, 19(8):973–980, 2003. [DOI] [PubMed] [Google Scholar]
  • 44.Desikan Rahul S, Ségonne Florent, Fischl Bruce, Quinn Brian T, Dickerson Bradford C, Blacker Deborah, Buckner Randy L, Dale Anders M, Maguire R Paul, Hyman Bradley T, et al. An automated labeling system for subdividing the human cerebral cortex on mri scans into gyral based regions of interest. Neuroimage, 31(3):968–980, 2006. [DOI] [PubMed] [Google Scholar]
  • 45.Gesú Vito Di, Giancarlo Raffaele, Bosco Giosué Lo, Raimondi Alessandra, and Scaturro Davide. Genclust: A genetic algorithm for clustering gene expression data. BMC bioinformatics, 6(1):289, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Dixon Jesse R, Selvaraj Siddarth, Yue Feng, Kim Audrey, Li Yan, Shen Yin, Hu Ming, Liu Jun S, and Ren Bing. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485(7398):376, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Dixon Scott J, Costanzo Michael, Baryshnikova Anastasia, Andrews Brenda, and Boone Charles. Systematic mapping of genetic interaction networks. Annual review of genetics, 43:601–625, 2009. [DOI] [PubMed] [Google Scholar]
  • 48.Dosenbach Nico UF, Nardos Binyam, Cohen Alexander L, Fair Damien A, Power Jonathan D, Church Jessica A, Nelson Steven M, Wig Gagan S, Vogel Alecia C, Lessov-Schlaggar Christina N, et al. Prediction of individual brain maturity using fmri. Science, 329(5997):1358–1361, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Duò Angelo, Robinson Mark D, and Soneson Charlotte. A systematic performance evaluation of clustering methods for single-cell rna-seq data. F1000Research, 7, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Durante Daniele and Dunson David B. Bayesian inference and testing of group differences in brain networks. Bayesian Analysis, 13(1):29–58, 2018. [Google Scholar]
  • 51.Eberwine James, Sul Jai-Yoon, Bartfai Tamas, and Kim Junhyong. The promise of single-cell sequencing. Nature methods, 11(1):25, 2014. [DOI] [PubMed] [Google Scholar]
  • 52.Eisen MB, Spellman PT, Brown PO, and Botstein D. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences, 95:14863–14868, 1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Enright Anton J, Dongen Stijn Van, and Ouzounis Christos A. An efficient algorithm for large-scale detection of protein families. Nucleic acids research, 30(7):1575–1584, 2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Estrada Ernesto. Virtual identification of essential proteins within the protein interaction network of yeast. Proteomics, 6(1):35–40, 2006. [DOI] [PubMed] [Google Scholar]
  • 55.Feng Jiawu, Barbano Paolo Emilio, and Mishra Bud. Time-frequency feature detection for time-course microarray data. In Proceedings of the 2004 ACM symposium on Applied computing, pages 128–132. ACM, 2004. [Google Scholar]
  • 56.Fiecas Mark and Ombao Hernando. The generalized shrinkage estimator for the analysis of functional connectivity of brain signals. The Annals of Applied Statistics, 5(2A):1102–1125, 2011. [Google Scholar]
  • 57.Fiers Mark WEJ, Minnoye Liesbeth, Aibar Sara, González-Blas Carmen Bravo, Atak Zeynep Kalender, and Aerts Stein. Mapping gene regulatory networks from single-cell omics data. Briefings in functional genomics, 17(4):246–254, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Finak Greg, McDavid Andrew, Yajima Masanao, Deng Jingyuan, Gersuk Vivian, Shalek Alex K, Slichter Chloe K, Miller Hannah W, McElrath M Juliana, Prlic Martin, et al. Mast: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell rna sequencing data. Genome biology, 16(1):278, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Fornito Alex, Zalesky Andrew, and Breakspear Michael. Graph analysis of the human connectome: promise, progress, and pitfalls. Neuroimage, 80:426–444, 2013. [DOI] [PubMed] [Google Scholar]
  • 60.Fox Michael D and Greicius Michael. Clinical applications of resting state functional connectivity. Frontiers in systems neuroscience, 4:19, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Fraser Hunter B, Hirsh Aaron E, Steinmetz Lars M, Scharfe Curt, and Feldman Marcus W. Evolutionary rate in the protein interaction network. Science, 296(5568):750–752, 2002. [DOI] [PubMed] [Google Scholar]
  • 62.Friedman Jerome, Hastie Trevor, and Tibshirani Robert. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Friston Karl J. Functional and effective connectivity in neuroimaging: a synthesis. Human brain mapping, 2(1–2):56–78, 1994. [Google Scholar]
  • 64.Friston Karl J. Functional and effective connectivity: a review. Brain connectivity, 1(1):13–36, 2011. [DOI] [PubMed] [Google Scholar]
  • 65.Friston Karl J, Harrison Lee, and Penny Will. Dynamic causal modelling. Neuroimage, 19(4):1273–1302, 2003. [DOI] [PubMed] [Google Scholar]
  • 66.Fu Limin and Medico Enzo. Flame, a novel fuzzy clustering method for the analysis of dna microarray data. BMC bioinformatics, 8(1):3, 2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Gambardella Gennaro, Moretti Maria Nicoletta, Cegli Rossella De, Cardone Luca, Peron Adriano, and Bernardo Diego Di. Differential network analysis for the identification of condition-specific pathway activity and regulation. Bioinformatics, 29(14):1776–1785, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Gao Lin, Sun Peng-Gang, and Song Jia. Clustering algorithms for detecting functional modules in protein interaction networks. Journal of Bioinformatics and Computational Biology, 7(01):217–242, 2009. [DOI] [PubMed] [Google Scholar]
  • 69.Gill Ryan, Datta Somnath, and Datta Susmita. A statistical framework for differential network analysis from microarray data. BMC bioinformatics, 11(1):95, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Goltsev Yury and Papatsenko Dmitri. Time warping of evolutionary distant temporal gene expression data based on noise suppression. BMC bioinformatics, 10(1):353, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Greene Casey S, Krishnan Arjun, Wong Aaron K, Ricciotti Emanuela, Zelaya Rene A, Himmelstein Daniel S, Zhang Ran, Hartmann Boris M, Zaslavsky Elena, Sealfon Stuart C, et al. Understanding multicellular function and disease with human tissue-specific networks. Nature genetics, 47(6):569, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Greicius Michael D, Krasnow Ben, Reiss Allan L, and Menon Vinod. Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proceedings of the National Academy of Sciences, 100(1):253–258, 2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Guala Dimitri and Sonnhammer Erik LL. A large-scale benchmark of gene prioritization methods. Scientific reports, 7:46598, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Guimera Roger and Amaral Luis A Nunes. Functional cartography of complex metabolic networks. Nature, 433(7028):895, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Guo Minzhe, Wang Hui, Potter S Steven, Whitsett Jeffrey A, and Xu Yan. Sincera: a pipeline for single-cell rna-seq profiling analysis. PLoS computational biology, 11(11):e1004575, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Ha Min Jin, Baladandayuthapani Veerabhadran, and Do Kim-Anh. Dingo: differential network analysis in genomics. Bioinformatics, 31(21):3413–3420, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Haque Ashraful, Engel Jessica, Teichmann Sarah A, and Lönnberg Tapio. A practical guide to single-cell rna-sequencing for biomedical research and clinical applications. Genome medicine, 9(1):75, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Hedden Trey, Dijk Koene RA Van, Becker J Alex, Mehta Angel, Sperling Reisa A, Johnson Keith A, and Buckner Randy L. Disruption of functional connectivity in clinically normal older adults harboring amyloid burden. Journal of Neuroscience, 29(40):12686–12694, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Holland Paul W, Laskey Kathryn Blackmond, and Leinhardt Samuel. Stochastic block-models: First steps. Social networks, 5(2):109–137, 1983. [Google Scholar]
  • 80.Huang Mo, Wang Jingshu, Torre Eduardo, Dueck Hannah, Shaffer Sydney, Bonasio Roberto, Murray John I, Raj Arjun, Li Mingyao, and Zhang Nancy R. Saver: gene expression recovery for single-cell rna sequencing. Nature methods, 15(7):539, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Huang Shiqiong, Jin Jiashun, and Yao Zhigang. Partial correlation screening for estimating large precision matrices, with applications to classification. The Annals of Statistics, 44(5):2018–2057, 2016. [Google Scholar]
  • 82.Husmeier Dirk. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic bayesian networks. Bioinformatics, 19(17):2271–2282, 2003. [DOI] [PubMed] [Google Scholar]
  • 83.Ideker Trey and Krogan Nevan J. Differential network biology. Molecular systems biology, 8(1):565, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Ideker Trey and Sharan Roded. Protein networks in disease. Genome research, 18(4):644–652, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Ilicic Tomislav, Kim Jong Kyoung, Kolodziejczyk Aleksandra A, Bagger Frederik Otzen, McCarthy Davis James, Marioni John C, and Teichmann Sarah A. Classification of low quality cells from single-cell rna-seq data. Genome biology, 17(1):29, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Irrthum Alexandre, Wehenkel Louis, Geurts Pierre, et al. Inferring regulatory networks from expression data using tree-based methods. PloS one, 5(9):e12776, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Islam Saiful, Kjällquist Una, Moliner Annalena, Zajac Pawel, Fan Jian-Bing, Lönnerberg Peter, and Linnarsson Sten. Characterization of the single-cell transcriptional landscape by highly multiplex rna-seq. Genome research, 21(7):1160–1167, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Islam Saiful, Zeisel Amit, Joost Simon, La Manno Gioele, Zajac Pawel, Kasper Maria, Lönnerberg Peter, and Linnarsson Sten. Quantitative single-cell rna-seq with unique molecular identifiers. Nature methods, 11(2):163, 2014. [DOI] [PubMed] [Google Scholar]
  • 89.Jeong Hawoong, Mason Sean P, Barabási A-L, and Oltvai Zoltan N. Lethality and centrality in protein networks. Nature, 411(6833):41, 2001. [DOI] [PubMed] [Google Scholar]
  • 90.Ji Zhicheng and Ji Hongkai. Tscan: Pseudo-time reconstruction and evaluation in single-cell rna-seq analysis. Nucleic acids research, 44(13):e117–e117, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Jiang Hao, Sohn Lydia L, Huang Haiyan, and Chen Luonan. Single cell clustering based on cell-pair differentiability correlation and variance analysis. Bioinformatics, 34(21):3684–3694, 2018. [DOI] [PubMed] [Google Scholar]
  • 92.Jiang Rui, Tu Zhidong, Chen Ting, and Sun Fengzhu. Network motif identification in stochastic networks. Proceedings of the National Academy of Sciences, 103(25):9404–9409, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Jonsson Pall F and Bates Paul A. Global topological features of cancer proteins in the human interactome. Bioinformatics, 22(18):2291–2297, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Joy Maliackal Poulo, Brock Amy, Ingber Donald E, and Huang Sui. High-betweenness proteins in the yeast protein interaction network. BioMed Research International, 2005(2):96–103, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Kang Hakmook, Ombao Hernando, Linkletter Crystal, Long Nicole, and Badre David. Spatio-spectral mixed-effects model for functional magnetic resonance imaging data. Journal of the American Statistical Association, 107(498):568–577, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Kang Hyun Min, Subramaniam Meena, Targ Sasha, Nguyen Michelle, Maliskova Lenka, McCarthy Elizabeth, Wan Eunice, Wong Simon, Byrnes Lauren, Lanata Cristina M, et al. Multiplexed droplet single-cell rna-sequencing using natural genetic variation. Nature biotechnology, 36(1):89, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Kang Jian, Bowman F DuBois, Mayberg Helen, and Liu Han. A depression network of functionally connected regions discovered via multi-attribute canonical correlation graphs. NeuroImage, 141:431–441, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Karrer Brian and Newman Mark EJ. Stochastic blockmodels and community structure in networks. Physical review E, 83(1):016107, 2011. [DOI] [PubMed] [Google Scholar]
  • 99.Kashtan Nadav, Itzkovitz Shalev, Milo Ron, and Alon Uri. Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics, 20(11):1746–1758, 2004. [DOI] [PubMed] [Google Scholar]
  • 100.Kato Mamoru, Tsunoda Tatsuhiko, and Takagi Toshihisa. Lag analysis of genetic networks in the cell cycle of budding yeast. Genome Informatics, 12:266–267, 2001. [Google Scholar]
  • 101.Kharchenko Peter V, Silberstein Lev, and Scadden David T. Bayesian approach to single-cell differential expression analysis. Nature methods, 11(7):740, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Kim Junghi, Wozniak Jeffrey R, Mueller Bryon A, Shen Xiaotong, and Pan Wei. Comparison of statistical tests for group differences in brain functional networks. NeuroImage, 101:681–694, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Kim Kyungpil, Jiang Keni, Teng Siew Leng, Feldman Lewis J, and Huang Haiyan. Using biologically interrelated experiments to identify pathway genes in arabidopsis. Bioinformatics, 28(6):815–822, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Kim Sunyong, Imoto Seiya, and Miyano Satoru. Dynamic bayesian network and nonparametric regression for nonlinear modeling of gene networks from time series gene expression data. Biosystems, 75(1–3):57–65, 2004. [DOI] [PubMed] [Google Scholar]
  • 105.Kiselev Vladimir Yu, Andrews Tallulah S, and Hemberg Martin. Challenges in unsupervised clustering of single-cell rna-seq data. Nature Reviews Genetics, page 1, 2019. [DOI] [PubMed] [Google Scholar]
  • 106.Kiselev Vladimir Yu, Kirschner Kristina, Schaub Michael T, Andrews Tallulah, Yiu Andrew, Chandra Tamir, Natarajan Kedar N, Reik Wolf, Barahona Mauricio, Green Anthony R, et al. Sc3: consensus clustering of single-cell rna-seq data. Nature methods, 14(5):483, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Klein Allon M, Mazutis Linas, Akartuna Ilke, Tallapragada Naren, Veres Adrian, Li Victor, Peshkin Leonid, Weitz David A, and Kirschner Marc W. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell, 161(5):1187–1201, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108.Kolodziejczyk Aleksandra A, Kim Jong Kyoung, Svensson Valentine, Marioni John C, and Teichmann Sarah A. The technology and biology of single-cell rna sequencing. Molecular cell, 58(4):610–620, 2015. [DOI] [PubMed] [Google Scholar]
  • 109.Kumari Sapna, Nie Jeff, Chen Huann-Sheng, Ma Hao, Stewart Ron, Li Xiang, Lu Meng-Zhu, Taylor William M, and Wei Hairong. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PloS one, 7(11):e50411, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Kundu Suprateek and Kang Jian. Semiparametric bayes conditional graphical models for imaging genetics applications. Stat, 5(1):322–337, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.Kwon Andrew T, Hoos Holger H, and Ng Raymond. Inference of transcriptional regulation relationships from gene expression data. In Proceedings of the 2003 ACM symposium on Applied computing, pages 135–140. ACM, 2003. [DOI] [PubMed] [Google Scholar]
  • 112.Langfelder Peter and Horvath Steve. Wgcna: an r package for weighted correlation network analysis. BMC bioinformatics, 9(1):559, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113.Lazar Nicole. The statistical analysis of functional MRI data. Springer Science & Business Media, 2008. [Google Scholar]
  • 114.Dily François Le, Baù Davide, Pohl Andy, Vicent Guillermo P, Serra François, Soronellas Daniel, Castellano Giancarlo, Wright Roni HG, Ballare Cecilia, Filion Guillaume, et al. Distinct structural transitions of chromatin topological domains correlate with coordinated hormone-induced gene regulation. Genes & development, 28(19):2151–2162, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Lee Hyekyoung, Chung Moo K, Kang Hyejin, Kim Bung-Nyun, and Lee Dong Soo. Discriminative persistent homology of brain networks. In 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pages 841–844. IEEE, 2011. [Google Scholar]
  • 116.Lehner Ben, Crombie Catriona, Tischler Julia, Fortunato Angelo, and Fraser Andrew G. Systematic mapping of genetic interactions in caenorhabditis elegans identifies common modifiers of diverse signaling pathways. Nature genetics, 38(8):896, 2006. [DOI] [PubMed] [Google Scholar]
  • 117.Lemmens Karen, Bie Tijl De, Dhollander Thomas, Keersmaecker Sigrid C De, Thijs Inge M, Schoofs Geert, Weerdt Ami De, Moor Bart De, Vanderleyden Jos, Collado-Vides Julio, et al. Distiller: a data integration framework to reveal condition dependency of complex regulons in escherichia coli. Genome biology, 10(3):R27, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Li Bing and Solea Eftychia. A nonparametric graphical model for functional data with application to brain networks based on fmri. Journal of the American Statistical Association, 113(524):1637–1655, 2018. [Google Scholar]
  • 119.Li Hongzhe and Gui Jiang. Gradient directed regularization for sparse gaussian concentration graphs, with applications to inference of genetic networks. Biostatistics, 7(2):302–317, 2005. [DOI] [PubMed] [Google Scholar]
  • 120.Li Ker-Chau. Genome-wide coexpression dynamics: theory and application. Proceedings of the National Academy of Sciences, 99(26):16875–16880, 2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Li Ker-Chau, Palotie Aarno, Yuan Shinsheng, Bronnikov Denis, Chen Daniel, Wei Xuelian, Choi Oi-Wa, Saarela Janna, and Peltonen Leena. Finding disease candidate genes by liquid association. Genome biology, 8(10):R205, 2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.Li Qian, Şentürk Damla, Sugar Catherine A, Jeste Shafali, DiStefano Charlotte, Frohlich Joel, and Telesca Donatello. Inferring brain signals synchronicity from a sample of eeg readings. Journal of the American Statistical Association, pages 1–18, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Li Wei Vivian and Li Jingyi Jessica. An accurate and robust imputation method scimpute for single-cell rna-seq data. Nature communications, 9(1):997, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Lin Andy, Wang Richard T, Ahn Sangtae, Park Christopher C, and Smith Desmond J. A genome-wide map of human genetic interactions inferred from radiation hybrid genotypes. Genome research, 20(8):1122–1132, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125.Lin Peijie, Troup Michael, and Ho Joshua WK. Cidr: Ultrafast and accurate clustering through imputation for single-cell rna-seq data. Genome biology, 18(1):59, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Lin Zhixiang, Wang Tao, Yang Can, and Zhao Hongyu. On joint estimation of gaussian graphical models for spatial and temporal data. Biometrics, 73(3):769, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Lindquist Martin A. The statistical analysis of fmri data. Statistical science, 23(4):439–464, 2008. [Google Scholar]
  • 128.Liu Fei, Zhang Shao-Wu, Guo Wei-Feng, Wei Ze-Gang, and Chen Luonan. Inference of gene regulatory network based on local bayesian networks. PLoS computational biology, 12(8):e1005024, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.Liu K, Theusch E, Zhou Y, Ashuach T, Dose A, Bickel PJ, Medina MW, and Huang H. Genefishing: a method to reconstruct context-specific portraits of biological processes and its application to cholesterol metabolism. Proceedings of the National Academy of Sciences, 2019. to appear. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Lun Aaron TL, Bach Karsten, and Marioni John C. Pooling across cells to normalize single-cell rna sequencing data with many zero counts. Genome biology, 17(1):75, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 131.Luo Xiangyu and Wei Yingying. Nonparametric bayesian learning of heterogeneous dynamic transcription factor networks. The Annals of Applied Statistics, 12(3):1749–1772, 2018. [Google Scholar]
  • 132.Ma Chuang, Xin Mingming, Feldmann Kenneth A, and Wang Xiangfeng. Machine learning-based differential network analysis: A study of stress-responsive transcriptomes in arabidopsis. The Plant Cell, 26(2):520–537, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133.Maathuis Marloes H, Kalisch Markus, and Bühlmann Peter. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics, 37(6A):3133–3164, 2009. [Google Scholar]
  • 134.Macosko Evan Z, Basu Anindita, Satija Rahul, Nemesh James, Shekhar Karthik, Goldman Melissa, Tirosh Itay, Bialas Allison R, Kamitaki Nolan, Martersteck Emily M, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell, 161(5):1202–1214, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135.Magwene Paul M and Kim Junhyong. Estimating genomic coexpression networks using first-order conditional independence. Genome biology, 5(12):R100, 2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136.Marbach Daniel, Costello James C, Küffner Robert, Vega Nicole M, Prill Robert J, Camacho Diogo M, Allison Kyle R, Aderhold Andrej, Bonneau Richard, Chen Yukun, et al. Wisdom of crowds for robust gene network inference. Nature methods, 9(8):796, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Marbach Daniel, Prill Robert J, Schaffter Thomas, Mattiussi Claudio, Floreano Dario, and Stolovitzky Gustavo. Revealing strengths and weaknesses of methods for gene network inference. Proceedings of the National Academy of Sciences, 107(14):6286–6291, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138.Marbach Daniel, Roy Sushmita, Ay Ferhat, Meyer Patrick E, Candeias Rogerio, Kahveci Tamer, Bristow Christopher A, and Kellis Manolis. Predictive regulatory models in drosophila melanogaster by integrative inference of transcriptional networks. Genome research, 22(7):1334–1349, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139.Margolin Adam A, Nemenman Ilya, Basso Katia, Wiggins Chris, Stolovitzky Gustavo, Favera Riccardo Dalla, and Califano Andrea. Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. In BMC bioinformatics, volume 7, page S7. BioMed Central, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 140.Matsumoto Hirotaka, Kiryu Hisanori, Furusawa Chikara, Ko Minoru SH, Ko Shigeru BH, Gouda Norio, Hayashi Tetsutaro, and Nikaido Itoshi. Scode: an efficient regulatory network inference algorithm from single-cell rna-seq during differentiation. Bioinformatics, 33(15):2314–2321, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141.McIntosh AR and Gonzalez-Lima Francisco. Structural equation modeling and its application to network analysis in functional brain imaging. Human brain mapping, 2(1–2):2–22, 1994. [Google Scholar]
  • 142.Meinshausen Nicolai and Bühlmann Peter. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3):1436–1462, 2006. [Google Scholar]
  • 143.Milo Ron, Shen-Orr Shai, Itzkovitz Shalev, Kashtan Nadav, Chklovskii Dmitri, and Alon Uri. Network motifs: simple building blocks of complex networks. Science, 298(5594):824–827, 2002. [DOI] [PubMed] [Google Scholar]
  • 144.Moignard Victoria, Woodhouse Steven, Haghverdi Laleh, Lilly Andrew J, Tanaka Yosuke, Wilkinson Adam C, Buettner Florian, Macaulay Iain C, Jawaid Wajid, Diamanti Evangelia, et al. Decoding the regulatory network of early blood development from single-cell gene expression measurements. Nature biotechnology, 33(3):269, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145.Moreau Yves and Tranchevent Léon-Charles. Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nature Reviews Genetics, 13(8):523, 2012. [DOI] [PubMed] [Google Scholar]
  • 146.Moschopoulos Charalampos N, Pavlopoulos Georgios A, Schneider Reinhard, Likothanassis Spiridon D, and Kossida Sophia. Giba: a clustering tool for detecting protein complexes. In BMC bioinformatics, volume 10, page S11. BioMed Central, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147.Muro Shizuko, Takemasa Ichiro, Oba Shigeyuki, Matoba Ryo, Ueno Noriko, Maruyama Chiyuri, Yamashita Riu, Sekimoto Mitsugu, Yamamoto Hirofumi, Nakamori Shoji, et al. Identification of expressed genes linked to malignancy of human colorectal carcinoma by parametric clustering of quantitative expression data. Genome biology, 4(3):R21, 2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Nair Ajay, Chetty Madhu, and Wangikar Pramod P. Improving gene regulatory network inference using network topology information. Molecular BioSystems, 11(9):2449–2463, 2015. [DOI] [PubMed] [Google Scholar]
  • 149.Narayan Manjari, Allen Genevera I, and Tomson Steffie. Two sample inference for populations of graphical models with applications to functional connectivity. arXiv preprint arXiv:1502.03853, 2015. [Google Scholar]
  • 150.Newman MEJ. Networks: An introduction. Oxford University Press, 2010. [Google Scholar]
  • 151.Norton Heidi K, Emerson Daniel J, Huang Harvey, Kim Jesi, Titus Katelyn R, Gu Shi, Bassett Danielle S, and Phillips-Cremins Jennifer E. Detecting hierarchical genome folding with network modularity. Nature methods, 15(2):119, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152.Nunez Michael D, Nunez Paul L, and Srinivasan Ramesh. Electroencephalography (eeg): neurophysics, experimental methods, and signal processing. In Ombao H, Lindquist M, Thompson W, and Aston J, editors, Handbook of Neuroimaging Data Analysis, pages 175–197. Chapman & Hall/CRC, 2016. [Google Scholar]
  • 153.Okuda Shujiro, Yamada Takuji, Hamajima Masami, Itoh Masumi, Katayama Toshiaki, Bork Peer, Goto Susumu, and Kanehisa Minoru. Kegg atlas mapping for global analysis of metabolic pathways. Nucleic acids research, 36(suppl 2):W423–W426, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 154.Ombao H, Schroder AL, Euan C, Ting CM, and Samdin B. Advanced topics for modeling electroencephalograms. In Ombao H, Lindquist M, Thompson W, and Aston J, editors, Handbook of Neuroimaging Data Analysis, pages 567–626. Chapman & Hall/CRC, 2016. [Google Scholar]
  • 155.Omranian Nooshin, Eloundou-Mbebi Jeanne MO, Mueller-Roeber Bernd, and Nikoloski Zoran. Gene regulatory network inference using fused lasso on multiple data sets. Scientific reports, 6:20533, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156.Özgür Arzucan, Vu Thuy, Erkan Güneş, and Radev Dragomir R. Identifying gene-disease associations using centrality on a literature mined gene-interaction network. Bioinformatics, 24(13):i277–i285, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157.Paladugu Sri R, Zhao Shan, Ray Animesh, and Raval Alpan. Mining protein networks for synthetic genetic interactions. BMC bioinformatics, 9(1):426, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Pavlopoulos Georgios A, Secrier Maria, Moschopoulos Charalampos N, Soldatos Theodoros G, Kossida Sophia, Aerts Jan, Schneider Reinhard, and Bagos Pantelis G. Using graph theory to analyze biological networks. BioData mining, 4(1):10, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159.Pei Yonggang, Gao Qinghui, Li Juntao, and Zhao Xiting. Identifying local co-regulation relationships in gene expression data. Journal of theoretical biology, 360:200–207, 2014. [DOI] [PubMed] [Google Scholar]
  • 160.Peng Jie, Wang Pei, Zhou Nengfeng, and Zhu Ji. Partial correlation estimation by joint sparse regression models. Journal of the American Statistical Association, 104(486):735–746, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 161.Petukhov Viktor, Guo Jimin, Baryawno Ninib, Severe Nicolas, Scadden David T, Samsonova Maria G, and Kharchenko Peter V. dropest: pipeline for accurate estimation of molecular counts in droplet-based single-cell rna-seq experiments. Genome biology, 19(1):78, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162.Picelli Simone. Single-cell rna-sequencing: the future of genome biology is now. RNA biology, 14(5):637–650, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 163.Picelli Simone, Björklund Åsa K, Faridani Omid R, Sagasser Sven, Winberg Gösta, and Sandberg Rickard. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nature methods, 10(11):1096, 2013. [DOI] [PubMed] [Google Scholar]
  • 164.Pina Cristina, Teles José, Fugazza Cristina, May Gillian, Wang Dapeng, Guo Yanping, Soneji Shamit, Brown John, Edén Patrik, Ohlsson Mattias, et al. Single-cell network analysis identifies ddit3 as a nodal lineage regulator in hematopoiesis. Cell reports, 11(10):1503–1510, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165.Power Jonathan D, Cohen Alexander L, Nelson Steven M, Wig Gagan S, Barnes Kelly Anne, Church Jessica A, Vogel Alecia C, Laumann Timothy O, Miezin Fran M, Schlaggar Bradley L, et al. Functional network organization of the human brain. Neuron, 72(4):665–678, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 166.Qiao Xinghao, Guo Shaojun, and James Gareth M. Functional graphical models. Journal of the American Statistical Association, 114(525):211–222, 2019. [Google Scholar]
  • 167.Qiu Anqi, Lee Annie, Tan Mingzhen, and Chung Moo K. Manifold learning on brain functional networks in aging. Medical image analysis, 20(1):52–60, 2015. [DOI] [PubMed] [Google Scholar]
  • 168.Qiu Huitong, Han Fang, Liu Han, and Caffo Brian. Joint estimation of multiple graphical models from high dimensional time series. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2):487–504, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169.Qiu Xiaojie, Hill Andrew, Packer Jonathan, Lin Dejun, Ma Yi-An, and Trapnell Cole. Single-cell mrna quantification and differential analysis with census. Nature methods, 14(3):309, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 170.Ramoni Marco F, Sebastiani Paola, and Kohane Isaac S. Cluster analysis of gene expression dynamics. Proceedings of the National Academy of Sciences, 99(14):9121–9126, 2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 171.Ramsköld Daniel, Luo Shujun, Wang Yu-Chieh, Li Robin, Deng Qiaolin, Faridani Omid R, Daniels Gregory A, Khrebtukova Irina, Loring Jeanne F, Laurent Louise C, et al. Full-length mrna-seq from single-cell levels of rna and individual circulating tumor cells. Nature biotechnology, 30(8):777, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 172.Rau Andrea, Jaffrézic Florence, and Nuel Grégory. Joint estimation of causal effects from observational and intervention gene expression data. BMC systems biology, 7(1):111, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173.Reshef David N, Reshef Yakir A, Finucane Hilary K, Grossman Sharon R, McVean Gilean, Turnbaugh Peter J, Lander Eric S, Mitzenmacher Michael, and Sabeti Pardis C. Detecting novel associations in large data sets. Science, 334(6062):1518–1524, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 174.Robinson Mark D and Oshlack Alicia. A scaling normalization method for differential expression analysis of rna-seq data. Genome biology, 11(3):R25, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 175.Rousseeuw Peter J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of computational and applied mathematics, 20:53–65, 1987. [Google Scholar]
  • 176.Roy Swarup, Bhattacharyya Dhruba K, and Kalita Jugal K. Reconstruction of gene co-expression network from microarray data using local expression patterns. BMC bioinformatics, 15(7):S10, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 177.Rudie Jeffrey D, Brown JA, Beck-Pancer D, Hernandez LM, Dennis EL, Thompson PM, Bookheimer SY, and Dapretto M. Altered functional and structural brain network organization in autism. NeuroImage: clinical, 2:79–94, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 178.Ryali Srikanth, Chen Tianwen, Supekar Kaustubh, and Menon Vinod. Estimation of functional connectivity in fmri data using stability selection-based sparse partial correlation with elastic net penalty. NeuroImage, 59(4):3852–3861, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 179.Saliba Antoine-Emmanuel, Westermann Alexander J, Gorski Stanislaw A, and Vogel Jörg. Single-cell rna-seq: advances and future challenges. Nucleic acids research, 42(14):8845–8860, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 180.Salvador Raymond, Suckling John, Schwarzbauer Christian, and Bullmore Ed. Undirected graphs of frequency-dependent functional connectivity in whole brain networks. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1457):937–946, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 181.Santos-Zavaleta Alberto, Salgado Heladia, Gama-Castro Socorro, Sánchez-Pérez Mishael, Gómez-Romero Laura, Ledezma-Tejeida Daniela, García-Sotelo Jair Santiago, Alquicira-Hernández Kevin, Muñiz-Rascado Luis José, Peña-Loredo Pablo, et al. Regulondb v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in e. coli k-12. Nucleic acids research, 47(D1):D212–D220, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 182.Satija Rahul, Farrell Jeffrey A, Gennert David, Schier Alexander F, and Regev Aviv. Spatial reconstruction of single-cell gene expression data. Nature biotechnology, 33(5):495, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 183.Satuluri Venu, Parthasarathy Srinivasan, and Ucar Duygu. Markov clustering of protein interaction networks with improved balance and scalability. In Proceedings of the first ACM international conference on bioinformatics and computational biology, pages 247–256. ACM, 2010. [Google Scholar]
  • 184.Schiffman Courtney, Lin Christina, Shi Funan, Chen Luonan, Sohn Lydia, and Huang Haiyan. Sideseq: a cell similarity measure defined by shared identified differentially expressed genes for single-cell rna sequencing data. Statistics in biosciences, 9(1):200–216, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 185.Schuldiner Maya, Collins Sean R, Thompson Natalie J, Denic Vladimir, Bhamidipati Arunashree, Punna Thanuja, Ihmels Jan, Andrews Brenda, Boone Charles, Greenblatt Jack F, et al. Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell, 123(3):507–519, 2005. [DOI] [PubMed] [Google Scholar]
  • 186.Setty Manu, Tadmor Michelle D, Reich-Zeliger Shlomit, Angel Omer, Salame Tomer Meir, Kathail Pooja, Choi Kristy, Bendall Sean, Friedman Nir, and Pe’er Dana. Wishbone identifies bifurcating developmental trajectories from single-cell data. Nature biotechnology, 34(6):637, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 187.Sexton Tom, Yaffe Eitan, Kenigsberg Ephraim, Bantignies Frédéric, Leblanc Benjamin, Hoichman Michael, Parrinello Hugues, Tanay Amos, and Cavalli Giacomo. Three-dimensional folding and functional organization principles of the drosophila genome. Cell, 148(3):458–472, 2012. [DOI] [PubMed] [Google Scholar]
  • 188.Shapiro Ehud, Biezuner Tamir, and Linnarsson Sten. Single-cell sequencing-based technologies will revolutionize whole-organism science. Nature Reviews Genetics, 14(9):618, 2013. [DOI] [PubMed] [Google Scholar]
  • 189.Sharan Roded, Maron-Katz Adi, and Shamir Ron. Click and expander: a system for clustering and visualizing gene expression data. Bioinformatics, 19(14):1787–1799, 2003. [DOI] [PubMed] [Google Scholar]
  • 190.Shi Funan and Huang Haiyan. Identifying cell subpopulations and their genetic drivers from single-cell rna-seq data using a biclustering approach. Journal of Computational Biology, 24(7):663–674, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 191.Shin Jaehoon, Berg Daniel A, Zhu Yunhua, Shin Joseph Y, Song Juan, Bonaguidi Michael A, Enikolopov Grigori, Nauen David W, Christian Kimberly M, Ming Guo-li, et al. Single-cell rna-seq with waterfall reveals molecular cascades underlying adult neurogenesis. Cell stem cell, 17(3):360–372, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 192.Smith Tom, Heger Andreas, and Sudbery Ian. Umi-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome research, 27(3):491–499, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 193.Solo Victor, Poline Jean-Baptiste, Lindquist Martin A, Simpson Sean L, Bowman F DuBois, Chung Moo K, and Cassidy Ben. Connectivity in fmri: blind spots and break-throughs. IEEE transactions on medical imaging, 37(7):1537–1550, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 194.Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, and Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9:3273–3297, 1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 195.Srivastava Avi, Malik Laraib, Smith Tom, Sudbery Ian, and Patro Rob. Alevin efficiently estimates accurate gene abundances from dscrna-seq data. Genome biology, 20(1):65, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 196.Stegle Oliver, Teichmann Sarah A, and Marioni John C. Computational and analytical challenges in single-cell transcriptomics. Nature Reviews Genetics, 16(3):133, 2015. [DOI] [PubMed] [Google Scholar]
  • 197.Stoeckius Marlon, Zheng Shiwei, Houck-Loomis Brian, Hao Stephanie, Yeung Bertrand Z, Mauck William M, Smibert Peter, and Satija Rahul. Cell hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome biology, 19(1):224, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 198.Street Kelly, Risso Davide, Fletcher Russell B, Das Diya, Ngai John, Yosef Nir, Purdom Elizabeth, and Dudoit Sandrine. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC genomics, 19(1):477, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 199.Stuart JM, Segal E, Koller D, and Kim SK. A gene-coexpression network for global discovery of conserved genetic modules. Science, 302:249–255, 2003. [DOI] [PubMed] [Google Scholar]
  • 200.Sun Will Wei and Li Lexin. Store: sparse tensor response regression and neuroimaging analysis. The Journal of Machine Learning Research, 18(1):4908–4944, 2017. [Google Scholar]
  • 201.Sun Will Wei and Li Lexin. Dynamic tensor clustering. Journal of the American Statistical Association, pages 1–28, 2019. [Google Scholar]
  • 202.Tamayo Pablo, Slonim Donna, Mesirov Jill, Zhu Qing, Kitareewan Sutisak, Dmitrovsky Ethan, Lander Eric S, and Golub Todd R. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proceedings of the National Academy of Sciences, 96(6):2907–2912, 1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 203.Tang Fuchou, Barbacioru Catalin, Wang Yangzhou, Nordman Ellen, Lee Clarence, Xu Nanlan, Wang Xiaohui, Bodeau John, Tuch Brian B, Siddiqui Asim, et al. mrna-seq whole-transcriptome analysis of a single cell. Nature methods, 6(5):377, 2009. [DOI] [PubMed] [Google Scholar]
  • 204.Tavazoie Saeed, Hughes Jason D, Campbell Michael J, Cho Raymond J, and Church George M. Systematic determination of genetic network architecture. Nature genetics, 22(3):281, 1999. [DOI] [PubMed] [Google Scholar]
  • 205.Teschendorff Andrew E, Wang Yanzhong, Barbosa-Morais Nuno L, Brenton James D, and Caldas Carlos. A variational bayesian mixture modelling framework for cluster analysis of gene-expression data. Bioinformatics, 21(13):3025–3033, 2005. [DOI] [PubMed] [Google Scholar]
  • 206.Tian Tianhai and Burrage Kevin. Stochastic models for regulatory networks of the genetic toggle switch. Proceedings of the National Academy of Sciences, 103(22):8372–8377, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 207.Tibshirani Robert, Walther Guenther, and Hastie Trevor. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):411–423, 2001. [Google Scholar]
  • 208.Tomasi Dardo and Volkow Nora D. Abnormal functional connectivity in children with attention-deficit/hyperactivity disorder. Biological psychiatry, 71(5):443–450, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 209.Tong Amy Hin Yan, Lesage Guillaume, Bader Gary D, Ding Huiming, Xu Hong, Xin Xiaofeng, Young James, Berriz Gabriel F, Brost Renee L, Chang Michael, et al. Global mapping of the yeast genetic interaction network. Science, 303(5659):808–813, 2004. [DOI] [PubMed] [Google Scholar]
  • 210.Tranchevent Léon-Charles, Ardeshirdavani Amin, ElShal Sarah, Alcaide Daniel, Aerts Jan, Auboeuf Didier, and Moreau Yves. Candidate gene prioritization with endeavour. Nucleic acids research, 44(W1):W117–W121, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 211.Tranchevent Léon-Charles, Barriot R, Yu S, and Vooren SV. Gene prioritization through genomic data fusion. Nature Biotechnology, 24(5):537–544, 2006. [DOI] [PubMed] [Google Scholar]
  • 212.Trapnell Cole, Cacchiarelli Davide, Grimsby Jonna, Pokharel Prapti, Li Shuqiang, Morse Michael, Lennon Niall J, Livak Kenneth J, Mikkelsen Tarjei S, and Rinn John L. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature biotechnology, 32(4):381, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 213.Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, et al. Automated anatomical labeling of activations in. 2002. [DOI] [PubMed]
  • 214.Vallejos Catalina A, Risso Davide, Scialdone Antonio, Dudoit Sandrine, and Marioni John C. Normalizing single-cell rna sequencing data: challenges and opportunities. Nature methods, 14(6):565, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 215.Van Dijk David, Sharma Roshan, Nainys Juozas, Yim Kristina, Kathail Pooja, Carr Ambrose J, Burdziak Cassandra, Moon Kevin R, Chaffer Christine L, Pattabiraman Diwakar, et al. Recovering gene interactions from single-cell data using data diffusion. Cell, 174(3):716–729, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 216.Van Dongen Stijn Marinus. Graph clustering by flow simulation. PhD thesis, 2000.
  • 217.Varoquaux Gaël and Craddock R Cameron. Learning and comparing functional connectomes across subjects. NeuroImage, 80:405–415, 2013. [DOI] [PubMed] [Google Scholar]
  • 218.Vinh Nguyen Xuan, Chetty Madhu, Coppel Ross, and Wangikar Pramod P. Globalmit: learning globally optimal dynamic bayesian network with the mutual information test criterion. Bioinformatics, 27(19):2765–2766, 2011. [DOI] [PubMed] [Google Scholar]
  • 219.Vlasblom James and Wodak Shoshana J. Markov clustering versus affinity propagation for the partitioning of protein interaction graphs. BMC bioinformatics, 10(1):99, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 220.Wachi Shinichiro, Yoneda Ken, and Wu Reen. Interactome-transcriptome analysis reveals the high centrality of genes differentially expressed in lung cancer tissues. Bioinformatics, 21(23):4205–4208, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 221.Wang Y, Hu L, and Ombao H. Statistical analysis of electroencephalograms. In Ombao H, Lindquist M, Thompson W, and Aston J, editors, Handbook of Neuroimaging Data Analysis, pages 523–565. Chapman & Hall/CRC, 2016. [Google Scholar]
  • 222.Wang Yikai, Kang Jian, Kemmer Phebe B, and Guo Ying. An efficient and reliable statistical method for estimating functional connectivity in large scale brain networks using partial correlation. Frontiers in neuroscience, 10:123, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 223.Wang Yong, Joshi Trupti, Zhang Xiang-Sun, Xu Dong, and Chen Luonan. Inferring gene regulatory networks from multiple microarray datasets. Bioinformatics, 22(19):2413–2420, 2006. [DOI] [PubMed] [Google Scholar]
  • 224.Wang Yong, Zhang Xiang-Sun, and Xia Yu. Predicting eukaryotic transcriptional cooperativity by bayesian network integration of genome-wide data. Nucleic acids research, 37(18):5943–5958, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 225.Wang YX, Sarkar Purnamrita, Ursu Oana, Kundaje Anshul, and Bickel Peter J. Network modelling of topological domains using hi-c data. Annals of Applied Statistics, 2019. to appear. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 226.Wang YX Rachel and Huang Haiyan. Review on statistical methods for gene network reconstruction using expression data. Journal of theoretical biology, 362:53–61, 2014. [DOI] [PubMed] [Google Scholar]
  • 227.Wang YX Rachel, Jiang Keni, Feldman Lewis J, Bickel Peter J, and Huang Haiyan. Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. The Annals of Applied Statistics, 9(1):300–323, 2015. [Google Scholar]
  • 228.Wang YX Rachel, Liu Ke, Theusch Elizabeth, Rotter Jerome I, Medina Marisa W, Waterman Michael S, and Huang Haiyan. Generalized correlation measure using count statistics for gene expression data with ordered samples. Bioinformatics, 34(4):617–624, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 229.Wang YX Rachel, Waterman Michael S, and Huang Haiyan. Gene coexpression measures in large heterogeneous samples using count statistics. Proceedings of the National Academy of Sciences, 111(46):16371–16376, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 230.Wernicke Sebastian. Efficient detection of network motifs. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 3(4):347–359, 2006. [DOI] [PubMed] [Google Scholar]
  • 231.Wille Anja and Bühlmann Peter. Low-order conditional independence graphs for inferring genetic networks. Statistical applications in genetics and molecular biology, 5(1), 2006. [DOI] [PubMed] [Google Scholar]
  • 232.Wille Anja, Zimmermann Philip, Vranová Eva, Fürholz Andreas, Laule Oliver, Bleuler Stefan, Hennig Lars, Prelić Amela, von Rohr Peter, Thiele Lothar, et al. Sparse graphical gaussian modeling of the isoprenoid gene network in arabidopsis thaliana. Genome biology, 5(11):R92, 2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 233.Wolfe CJ, Kohane IS, and Butte AJ. Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks. BMC Bioinformatics, 6:227, 2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 234.Wong Aaron K, Krishnan Arjun, and Troyanskaya Olga G. Giant 2.0: genome-scale integrated analysis of gene networks in tissues. Nucleic acids research, 46(W1):W65–W70, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 235.Wong Elisabeth, Baur Brittany, Quader Saad, and Huang Chun-Hsi. Biological network motif detection: principles and practice. Briefings in bioinformatics, 13(2):202–215, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 236.Wong Raymond KW, Lee Thomas CM, Paul Debashis, and Peng Jie. Fiber direction estimation, smoothing and tracking in diffusion mri. The annals of applied statistics, 10(3):1137, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 237.Wu Zhijin, Zhang Yi, Stitzel Michael L, and Wu Hao. Two-phase differential expression analysis for single cell rna-seq. Bioinformatics, 34(19):3340–3348, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 238.Xia Yin and Li Lexin. Hypothesis testing of matrix graph model with application to brain connectivity analysis. Biometrics, 73(3):780–791, 2017. [DOI] [PubMed] [Google Scholar]
  • 239.Xia Yin and Li Lexin. Matrix graph hypothesis testing and application in brain connectivity alternation detection. Statistica Sinica, to appear, 2018. [Google Scholar]
  • 240.Xiong Qing, Ancona Nicola, Hauser Elizabeth R, Mukherjee Sayan, and Furey Terrence S. Integrating genetic and gene expression evidence into genome-wide association analysis of gene sets. Genome research, 22(2):386–397, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 241.Xu Chen and Su Zhengchang. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics, 31(12):1974–1980, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 242.Xu Jianzhen and Li Yongjin. Discovering disease-genes by topological features in human protein-protein interaction network. Bioinformatics, 22(22):2800–2805, 2006. [DOI] [PubMed] [Google Scholar]
  • 243.Xu Yuting and Lindquist Martin A. Dynamic connectivity detection: an algorithm for determining functional connectivity change points in fmri data. Frontiers in neuroscience, 9:285, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 244.Yan Koon-Kiu, Lou Shaoke, and Gerstein Mark. Mrtadfinder: A network modularity based approach to identify topologically associating domains in multiple resolutions. PLoS computational biology, 13(7):e1005647, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 245.Yeung Ka Yee, Fraley Chris, Murua Alejandro, Raftery Adrian E, and Ruzzo Walter L. Model-based clustering and data transformations for gene expression data. Bioinformatics, 17(10):977–987, 2001. [DOI] [PubMed] [Google Scholar]
  • 246.Yu Haiyuan, Kim Philip M, Sprecher Emmett, Trifonov Valery, and Gerstein Mark. The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS computational biology, 3(4):e59, 2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 247.Yu Jing, Smith V Anne, Wang Paul P, Hartemink Alexander J, and Jarvis Erich D. Using bayesian network inference algorithms to recover molecular genetic regulatory networks. In International Conference on Systems Biology, volume 2002, 2002. [Google Scholar]
  • 248.Yuan Ming and Lin Yi. Model selection and estimation in the gaussian graphical model. Biometrika, 94(1):19–35, 2007. [Google Scholar]
  • 249.Yuan Yuan, Chen Yi-Ping Phoebe, Ni Shengyu, Xu Augix Guohua, Tang Lin, Vingron Martin, Somel Mehmet, and Khaitovich Philipp. Development and application of a modified dynamic time warping algorithm (dtw-s) to analyses of primate brain expression time series. BMC bioinformatics, 12(1):347, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 250.Zhang Bai, Li Huai, Riggins Rebecca B, Zhan Ming, Xuan Jianhua, Zhang Zhen, Hoffman Eric P, Clarke Robert, and Wang Yue. Differential dependency network analysis to identify condition-specific topological changes in biological networks. Bioinformatics, 25(4):526–532, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 251.Zhang Lihua and Zhang Shihua. Comparison of computational methods for imputing single-cell rna-sequencing data. IEEE/ACM transactions on computational biology and bioinformatics, 2018. [DOI] [PubMed] [Google Scholar]
  • 252.Zhang Tingting, Wu Jingwei, Li Fan, Caffo Brian, and Boatman-Reich Dana. A dynamic directional model for effective brain connectivity using electrocorticographic (ecog) time series. Journal of the American Statistical Association, 110(509):93–106, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 253.Zhang Zhengwu, Descoteaux Maxime, Zhang Jingwen, Girard Gabriel, Chamberland Maxime, Dunson David, Srivastava Anuj, and Zhu Hongtu. Mapping population-based structural connectomes. NeuroImage, 172:130–145, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 254.Zhao Fang, McCarrick-Walmsley Ruth, Åkerblad Peter, Sigvardsson Mikael, and Kadesch Tom. Inhibition of p300/cbp by early b-cell factor. Molecular and cellular biology, 23(11):3837–3846, 2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 255.Zheng Grace XY, Terry Jessica M, Belgrader Phillip, Ryvkin Paul, Bent Zachary W, Wilson Ryan, Ziraldo Solongo B, Wheeler Tobias D, McDermott Geoff P, Zhu Junjie, et al. Massively parallel digital transcriptional profiling of single cells. Nature communications, 8:14049, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 256.Zhou Shuheng, Rütimann Philipp, Xu Min, and Bühlmann Peter. High-dimensional covariance estimation based on gaussian graphical models. Journal of Machine Learning Research, 12(Oct):2975–3026, 2011. [Google Scholar]
  • 257.Zhu Hongtu, Chen Yasheng, Ibrahim Joseph G, Li Yimei, Hall Colin, and Lin Weili. Intrinsic regression models for positive-definite matrices with applications to diffusion tensor imaging. Journal of the American Statistical Association, 104(487):1203–1212, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 258.Zhu Jun, Chen Yanqing, Leonardson Amy S, Wang Kai, Lamb John R, Emilsson Valur, and Schadt Eric E. Characterizing dynamic changes in the human blood transcriptional network. PLoS computational biology, 6(2):e1000671, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 259.Zhu Yunzhang and Li Lexin. Multiple matrix gaussian graphs estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80(5):927–950, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 260.Ziegenhain Christoph, Vieth Beate, Parekh Swati, Reinius Björn, Guillaumet-Adkins Amy, Smets Martha, Leonhardt Heinrich, Heyn Holger, Hellmann Ines, and Enard Wolfgang. Comparative analysis of single-cell rna sequencing methods. Molecular cell, 65(4):631–643, 2017. [DOI] [PubMed] [Google Scholar]
  • 261.Zotenko Elena, Mestre Julian, O’Leary Dianne P, and Przytycka Teresa M. Why do hubs in the yeast protein interaction network tend to be essential: reexamining the connection between the network topology and essentiality. PLoS computational biology, 4(8):e1000140, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 262.Zou Min and Conzen Suzanne D. A new dynamic bayesian network (dbn) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics, 21(1):71–79, 2004. [DOI] [PubMed] [Google Scholar]
